Hey guys, welcome back, skits own episode 13. As usual, we're implementing stuff from scratch in assembly. The topic today is an extremely boring one, and that is floating point IO. The motivation is that this whole series is about engineering software. That means you're doing math, which means you need numbers, so you have to be able to get numbers into the software. That's where floating point IO comes into play. This will include string to float conversions as well as the basics of character delimited files like CSVs.
I asked Mandy what some options are for getting floating point data into our programs, and she had a few ideas, all of them good. Option A: we could just hard code all the data into the binary. This is actually not a bad idea, because our code "compiles" really quickly. It assembles basically instantly, so you could even write some kind of wrapper script that writes our assembly code with the numbers embedded inside. That's not a terrible idea, to be honest. But it would get unwieldy if you had a bunch of different inputs you wanted to run your software on. You'd have to constantly be changing the binary, which is not terribly hard to do, but it's not the easiest way either.
Option B would be to prompt the user to enter all the data with the keyboard. Again, that's not a terrible idea, because you could automate it if you really wanted to. As she says, you could even break apart the integer and fraction parts: enter the integer part of this number, enter the fraction part of this number, and then repeat that over and over again.
That's a good idea, but again it becomes unwieldy when you have something like thousand by thousand element matrices to work with. So it's not really the best, but it's a workable solution.
Option C is what we're going to do in this video, and that is to parse floats from an input file like a CSV. What that entails is looking at strings and figuring out what those strings mean. Remember, when you open up a file and see 123.456, that's not a number. That's a sequence of ASCII bytes, seven ASCII bytes in order. It does not mean a hundred twenty three point four five six. It's just a string of characters, so we need a function, a routine, whatever you want to call it, that can turn the string into an actual quantity. And remember, floating point numbers look like this in our heads, but in the computer they look like this. So how do you get from a string of ASCII bytes to a hexadecimal string like this? It almost seems crazy, but ideally we'll do that in this video.
We want to be able to do that for all of these different kinds of inputs. I have a couple here. Take a number like 123: we have to be able to parse that even though it's an integer.
We have to be able to pull that into a float. Then 123.456 is an example of an integer part and a fractional part combined. Negative numbers have to work. Explicitly positive numbers have to work. Scientific notation has to work, so the e has to be parsed, and so do negative decimals with positive exponents. The whole gamut of scientific, regular, and even whole-looking numbers like 123 has to be parsed by our parse float function.
That parse float function looks like this, or at least this is how I implemented it. It's a function called parse float. It takes one input, which is a pointer to the first character of your string, and it returns in XMM0 the value of the parsed character array, which starts at the address in register RDI and terminates at any non-numerical character. Here I say "besides decimal point", but it's also besides e, besides plus, besides minus: any character you could consider to be numeric in nature.
This function has a lot of steps, and a lot of them are optional, because not all numbers have all these pieces. The number 123 doesn't have a negative sign, a decimal point, a fractional part, an exponent, or a sign for the exponent. So you constantly have to be checking what you even need to parse, let alone what you're parsing.
The steps go like this. The first step is to check for a leading sign. Is it a negative number? Is it an explicitly positive number, like +5? Or is it a regular 5, which is the same thing but written differently? Next is to parse the whole number part of the number.
In this case that's 123. Then comes checking for a decimal point, parsing the fractional part, checking for an exponent, checking for a sign on the exponent, parsing the exponent, and then ending at the first invalid byte, meaning a byte besides digits, decimal point, e, plus, and minus. And actually you can't always have a plus. You can't have 123.456e-3+. That's not a valid number. So you have to think about what is invalid at this point in the number, and that changes, because if the number ends here, different bytes would be considered invalid. Anyway, it's a pretty simple process.
Let me first specify all the relevant ASCII values. Remember, these characters are just ASCII values, and maybe two dozen of them are relevant here. You have the digits zero through nine, which are ASCII values 48 through 57. You also have lowercase e, which is ASCII value 101. Plus is 43, minus is 45, and decimal point is 46. For a CSV you'd care to know that a comma is 44. Then you have other things like slash, which we're not going to use in this video.
The first thing you do is check for that leading sign. Remember, the value in RDI is an address that points to the first byte of our number, and the question is: what is this byte? Is it a digit like six, is it a negative sign, or is it a positive sign?
Ideally it should be one of those three things. If it's not, if it's a W, you have a problem.
First you can see I've defined R8. This is a flag we use to track whether the number is negative: if R8 is zero it's a positive number, and if R8 is one it's negative. When you XOR R8 with itself, that sets R8 to zero, so at the outset of this program we assume it's a positive number.
Then we check whether this character is a plus. You can see here we compare the byte at the address in register RDI with the value 43, which corresponds to a plus symbol. So this comparison asks: is our first character a plus? Jump equal means that if it is a plus, we jump to this address here, explicitly positive number. In that case all we do is increment our pointer with this increment RDI instruction, so it no longer points to the plus sign. It points to what we hope is the first actual digit, which in this case would be the one here. At that point you go on to the loop and start parsing the integer part of the number, hopefully.
However, if it's not a plus sign, this jump doesn't execute, and you go to the next instruction, which compares the byte at the address in register RDI with 45, a minus sign. Here I say jump not equal, so if it is not a minus sign, we just start the integer parsing. We know at this point that it's not a sign, and we hope it's a valid digit like six or seven or eight. If it is a negative sign, the jump won't execute and the computer moves on to this instruction, inc R8, which sets our negative number flag to one. We increment our pointer to the next byte and go on with the code.
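To make the branch logic concrete, here is a minimal sketch of the same leading-sign check, written in Python rather than assembly so it can be run directly. It is not the code from the video: `read_sign`, `PLUS`, and `MINUS` are hypothetical names, `negative` stands in for the R8 flag, and `pos` stands in for the pointer held in RDI.

```python
PLUS, MINUS = 43, 45  # ASCII codes for '+' and '-'


def read_sign(buf: bytes, pos: int = 0):
    """Return (negative_flag, new_pos) after an optional leading sign.

    Hypothetical helper: `negative` mirrors the R8 flag and `pos`
    mirrors the pointer in RDI.
    """
    negative = 0                      # assume positive, like xor r8, r8
    if pos < len(buf) and buf[pos] == PLUS:
        pos += 1                      # explicit plus: just skip it
    elif pos < len(buf) and buf[pos] == MINUS:
        negative = 1                  # like inc r8
        pos += 1                      # skip the minus sign
    return negative, pos
```

For example, `read_sign(b"-123")` reports a negative number and leaves the position on the `1`, while `read_sign(b"123")` leaves both the flag and the position untouched.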
So this is a very detailed look at a very simple operation. We're just checking whether the first character was a negative sign, a positive sign, or just a regular digit.
The next step is to parse the whole number part. At this point our pointer points to the first, hopefully numerical, character. I'm not going to cover error handling, like what if it's a Q, or a null byte, or something. I'm assuming everything we pass in is valid, and if not, it's going to crash anyway, so I don't handle errors specifically.
We have a loop for this, and outside the loop we define some values. RAX is our accumulator, our tracker of the number. It starts out at zero and will eventually be 123. RCX is the multiplier for our base, which is called the radix. In base 10 the radix is 10. We're tracking how our number changes when we add digits to it. For example, say the first digit was a one. You have to multiply that one by 10 before you add the next digit, which gives you 12. Multiply that by 10 and you have 120 before you add the third digit, which is three. That's how it works.
So the loop goes like this. First you grab the byte at the address in register RDI and zero extend it, which means you make all the upper bits zero, and put the value in RBX. That's what I'm doing, at least. Then I subtract 48 from that number. What does that do?
Well, look at the ASCII values. Say that character was a one, which it was. Its ASCII value is 49, so taking 48 from that gives you the true numerical quantity of one.
Next is imul RCX. When you don't name a second register to multiply by, this just multiplies RAX by it, so by 10. At this point RAX was zero. We scale zero by 10, which is still zero, and then we add the new digit, one, to our accumulator, so now RAX has a value of one. Then we increment our pointer so it moves from the one to the two.
Then we check some conditions. Is there an exponent coming? Is there a non-numerical character coming, meaning our number is over? If a Q is coming, or a W, or a comma, it will break out in that case. We also check for non-valid characters like a slash, and we check for a decimal point so we can go to that part of the code. Every time we cycle through one of these bytes, we have to check what's coming next to know what we're even looking for.
But say it's still a digit. If the next character is a two, as it is here, this jump executes and we come back to the top. The two is 50 in ASCII, and taking 48 from that gives two. Multiply RAX, which was one, by 10 and now we have 10 in RAX. Add two and that's 12. Check the conditions once again and come back to the top. Our next byte is a three, which is 51 in ASCII. Take 48 from that and it's three. Multiply our accumulator value of 12 by 10 to get 120, add three, and that's 123. At this point RAX contains 123, which is our integer part.
Now one of these conditions will be satisfied. In this case it'll be the decimal point check, because the next character is a decimal point. What is a decimal point in ASCII?
It's 46. So comparing the byte at the address in RDI with 46 checks for a decimal point, which is the next step of our process. That check will be satisfied and we fall out of this loop. At this point we've done this step, this step, and this step.
Most of the next steps are just the same thing repeated, because parsing the fraction is no different from parsing the integer. In this case we're parsing three characters, but it could be n characters, and it's just an integer. It's the same algorithm as before, except this time your pointer starts at the four as opposed to the one. The only difference is that you also have to track the denominator. It starts at one, and every time you advance one byte you increase it by a factor of 10, because the first quantity you have here, four, is not actually four. It's four tenths. When you go to two bytes, you have 45 hundredths, and at three bytes, 456 thousandths. So you have to keep track of the implied denominator as well, but it's a simple matter of multiplying by 10 every time.
Then checking for an exponent and its sign is the same as for the whole number: you're just checking for these bytes. Parsing the integer quantity of the exponent is this same process repeated once again. However, you can't have a decimal point in your exponent, so you have to be smart about what counts as a valid character in each situation. But at this point all the boxes have been checked.
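The digit loop and its reuse for the fraction and the exponent can be sketched as follows. This is a rough Python rendering of the idea, not the assembly from the video: `read_digits` and `scan_magnitude` are hypothetical names, `value` plays the role of the RAX accumulator, and `scale` is the implied denominator that grows by a factor of 10 per digit. Like the video, it assumes valid input and does no error handling, and it expects `pos` to already be past any leading sign.

```python
DOT, LOWER_E, PLUS, MINUS = 46, 101, 43, 45  # ASCII codes


def read_digits(buf: bytes, pos: int):
    """Accumulate ASCII digits; return (value, 10**digit_count, new_pos)."""
    value, scale = 0, 1
    while pos < len(buf) and 48 <= buf[pos] <= 57:
        value = value * 10 + (buf[pos] - 48)   # scale by the radix, add the digit
        scale *= 10                             # implied denominator
        pos += 1
    return value, scale, pos


def scan_magnitude(buf: bytes, pos: int):
    """Return (whole, frac, denom, exp_negative, exp, new_pos)."""
    whole, _, pos = read_digits(buf, pos)
    frac, denom = 0, 1
    if pos < len(buf) and buf[pos] == DOT:      # optional fractional part
        frac, denom, pos = read_digits(buf, pos + 1)
    exp_negative, exp = 0, 0
    if pos < len(buf) and buf[pos] == LOWER_E:  # optional exponent
        pos += 1
        if pos < len(buf) and buf[pos] in (PLUS, MINUS):
            exp_negative = 1 if buf[pos] == MINUS else 0
            pos += 1
        exp, _, pos = read_digits(buf, pos)     # no decimal point allowed here
    return whole, frac, denom, exp_negative, exp, pos
```

Called as `scan_magnitude(b"-123.456e-3,", 1)`, it walks the digits exactly as described: 123 for the whole part, 456 over an implied denominator of 1000, and an exponent of 3 with its negative flag set, stopping at the comma.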
We've now been able to parse everything, from the sign, to the whole number, to the decimal point or lack of one, to the fractional part, to the exponent and the sign of the exponent. Hopefully we've found all the constituent parts of this quantity, and now it's a matter of putting everything together.
Let me move myself. You have a sign, a whole number, a fractional part, an implied denominator, a flag for the sign of the exponent, and the exponent itself. Take the sign, for example. The negative sign in our number corresponds to a float value of negative one. Our whole number corresponds to a float of that number. You can use the convert instruction, cvtsi2sd, to get from a whole number to a float of that value. For the fractional part, we've tracked the fraction and the implied denominator, so you divide the two and that's your fraction. Add the whole number and the fraction together, multiply by your leading sign, and you have a quantity that corresponds to your sign, your whole number part, and your fractional part.
Now the exponent comes in. If you have an exponent, you have to work out the multiplier it stands for.
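As a rough illustration of putting the pieces together, here is a small Python sketch, including the power-of-ten scaling that the exponent implies. It is an assumption-laden stand-in for the assembly: `combine` and its parameter names are hypothetical, and the `float(...)` conversions play the part that cvtsi2sd plays in the video.

```python
def combine(negative, whole, frac, denom, exp_negative, exp):
    """Build a float from the parsed integer pieces (hypothetical helper)."""
    sign = -1.0 if negative else 1.0               # sign flag as a float
    value = float(whole) + float(frac) / float(denom)
    value *= sign
    scale = 10.0 ** exp                            # multiplier from the exponent
    # A negative exponent divides by the multiplier, a positive one multiplies.
    return value / scale if exp_negative else value * scale
```

For the running example, `combine(1, 123, 456, 1000, 1, 3)` gives approximately -0.123456. Because the fraction is formed by a floating point division, the last bit may differ from what a correctly rounded library conversion would produce.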
In our case the exponent was three, so you have to get a multiplier from that: 10 to the power of three. That's a thousand, and because the sign flag on our exponent was negative, you're dividing by a thousand. If it were positive, you'd be multiplying by a thousand. Either way, based on all this data you now know you're taking your combined number and dividing it by a thousand, and here's your result.
If you've done this using the floating point unit on the computer, it will automatically be using the hexadecimal format for that number in all its computations. So this is what the computer will see, and this is what you'll understand that number to mean. Even though your original number was a series of ASCII bytes, you've now successfully converted it to a floating point number on your computer. Great. It's actually a lot simpler than it sounds.
But there is one more step, and that is to pull those numbers from a character delimited file of floats, like a CSV or a TSV. Here's an example of a CSV where you have numbers like 1.01, -2.02, 3.03, and the delimiter here obviously is the comma. So we have to implement a function that can parse these floats from a file using the function above. It's a bit more convoluted because you have to go through a buffer to do this.
Here's how the function is set up, the way I've written it. It's called parse delimited float file. It doesn't return anything. It reads RDX double precision floating point values from the file descriptor in register RSI, which hopefully you've already opened, and places them in an array that begins at address RDI.
That's why RDI is a pointer to double format. We use a buffer for this, because we're going to be reading bytes from the file into a buffer of length RCX bytes starting at address R8. The delimiter between the floats is variable. You can make it a semicolon, a comma, a newline, whatever. Just pass it in the low byte of register R9. That's a lot of stuff, but put simply, this takes an opened file that has a delimiter between floats and parses it into an array.
The only caveat, the only complexity, is that because you're parsing from a file and not from memory, if you pull 128 bytes or 1,024 bytes from the file, it's not going to line up with your delimiters. Say your buffer is size 64, and say the 64th byte lands, I don't know, somewhere in this 11. So this is byte zero and this is byte 63. You can parse this number and this number and this number, all the way around to the 10. But the instant you try to parse the 11, it's not a full number anymore, because there's no comma after it. So you have to know: hey, this is not a full number, I have to refill the buffer starting at this one and get 64 bytes from there.
That's the only complexity in this whole situation. You could read from a file byte by byte, I guess, but it's not very efficient. We're using a buffer to make the read operation more efficient, and because of that you need some extra logic in there to make sure that everything you're reading is what you intend to be reading the whole time. Hope that makes sense.
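The buffer bookkeeping is easier to see in a few lines of code. The following Python sketch shows the carry-over idea under some assumptions: `parse_delimited_floats` is a hypothetical name, a single delimiter byte separates every value (no mixed commas and newlines), Python's built-in `float` stands in for the parse float routine, and an in-memory `io.BytesIO` stands in for an opened file descriptor.

```python
import io


def parse_delimited_floats(stream, count, buf_size=64, delimiter=b","):
    """Read up to `count` floats from `stream` through a fixed-size buffer."""
    values = []
    carry = b""                                   # partial number from the last pass
    while len(values) < count:
        if len(carry) >= buf_size:
            raise ValueError("a single value does not fit in the buffer")
        chunk = stream.read(buf_size - len(carry))  # refill the rest of the buffer
        buffer = carry + chunk
        if chunk:
            # Everything after the last delimiter may be cut off: keep it for later.
            complete, _, carry = buffer.rpartition(delimiter)
        else:
            complete, carry = buffer, b""         # end of file: the rest is final
        for token in complete.split(delimiter):
            if token.strip() and len(values) < count:
                values.append(float(token.decode("ascii")))
        if not chunk:
            break                                 # out of bytes to read
    return values


data = io.BytesIO(b"1.01,-2.02,3.03,4.04e2,5,6.5,7,8,9,10,11,12.125")
parsed = parse_delimited_floats(data, 12, buf_size=16)
```

With a 16-byte buffer, the last value in the sample lands partly in one read and partly in the next, so the sketch exercises the same situation as the 11 in the slide: the incomplete tail is moved to the front and the buffer is topped up before parsing continues.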
Here's a flow chart of it, basically. You open the file descriptor and read some number of bytes from that file into a buffer, which we call the read buffer. Then we iterate through that buffer parsing floats. In this case we parse 1.01, then -2.02, then 3.03, and we repeat that process until there is no longer a delimiter between our current location and the end of the buffer. When that's true, when there's no longer a comma ahead of us in the buffer, we copy the end of the buffer to the beginning and read more bytes from the input file to fill the rest of the buffer. Then we keep looping until we're out of bytes to read. It's a lot of steps, but it's pretty straightforward, and if you look at the code, everything there is commented pretty well.
With that in mind, let's check out the code. I have four different examples today. One is called parse floats, which just parses floats from strings. Say the user typed in 1.23 and hit enter: you could parse that string into a float. Then I have two examples that take CSVs or similar files and parse floats from them. The fourth example just writes a CSV file, which is very straightforward, so I'm not even going to talk about how it works. You basically write floats the way we covered in our previous video, with a comma between them.
All of these are in the soy hub repository under example 13, and there's a readme you can check out. Let's go through them really quickly, with just a very brief explanation of what each one is doing. In example A we're reading strings. Here you can see I've got a bunch of float strings, which are, as you can see, quite simple numbers.
We have 123, negative 123, then we add decimal points, negative signs, exponents, positives, and so on. This covers the gamut of different types of floats that you'd want to pass into your program.
So what's happening in this program? We're including a couple of functions. One is the parse float function, which I described in this video, plus print float and print characters, just to dump out our results as we read them. Here's how the process works. We pass the pointer to the start address of the string, which represents a float, and then we call parse float. That parses the float starting at the address we specified and puts the value in register XMM0, and then it gets printed out with a newline. It does that for all 10 floats that we specified. So if I assemble and run this, you can see that it successfully parses all those various formats, so the code functions, hopefully, as intended.
With that out of the way, let's open up example B, which parses floats from an input file. I have a matrix.csv file here, and opening it you can see I've got those numbers from the slide, 1.01, -2.02, etc. The idea is to read these into an array. If I open up the code, you can see we have a read buffer of size 64 bytes, so we're going to be reading 64 bytes at a time from that matrix.csv file, using the parse delimited float file function I just explained. Then we print it out with this print array float function, which we implemented many weeks and months ago. So it's very straightforward. All we do is open the input file, which is very simple.
We did this in one of our first videos on opening files. Then we read the floats from that CSV into an array. I'll show you the array down here at the bottom. It's a 6 by 4 element float matrix, so we have to set aside 24 quad words of space for it. We have our file name described here as a null terminated string. We open that file and read the floats in from the CSV into that array using the function I described. The delimiter here is a comma. We could change that to a semicolon if it were a semicolon delimited file, but it's not, it's a CSV. We call that function and then we print out the array. If I execute this, you can see that the matrix.csv file is successfully parsed, all the data is placed into an array, and that array is then printed to the screen. So we can successfully parse a matrix from an input file. Great.
Let's go to example C. This example is a little bit different. Let me move the binary really quick. I have two input files here, multiply.this and by.this. Let me open those up. What's this, one, two, three, four, five, six... hold on, one, two, three, four, five, six, seven, eight, nine. Nine elements, so I guess they're both three by three matrices, and they're just numbers to multiply together. This one you can see is a CSV file, whereas this one is a semicolon delimited file, just to show that you can do whatever you want. You can pass in files in whatever format you'd like and the function still works the same way. So the idea is we're multiplying those two files by each other, which doesn't quite make sense, but really we're parsing numbers from the files and multiplying those numbers together.
Let's open up the code and check it out. Again, we have a read buffer of size 64 bytes. How does this work?
Let me go to the bottom first. We have two different file names, multiply.this and by.this, and three different arrays: one to store the first input file's data, one for the second input file's data, and one for the product. We're including the file open function, the parse delimited float file function, the print array function to print out the results, and our matrix multiply function to multiply the matrices.
The first thing we do is open input file one and parse the values from that CSV into array one at that memory location. We do the same thing for array two from file name two. Then we multiply them together with the matrix multiply function that we wrote a few weeks ago, and print the results to the screen. If I run that, you can see it parses those numbers, and we've got numbers like 420, 1337, 1776, pi, is that e? What is that? I don't even know what that is. Square root of two, square root of three, whatever. But yeah, this is the intended result, so the function works as expected.
The last example is example 13d. That's not a directory. In this case we're just writing to a file. Here you can see we've already written to the matrix.out file, so I'm going to delete that so we can do it again. Let's open up the code and see what's going on. I've got a 3x3 element float matrix in memory here, just 1.11, 2.22, etc. The idea is simple: we're going to write this to this file name. We have a function that we wrote called print delimited floats. It's very straightforward, and you can check out the online repository for the details. We open the file and write that array out to the CSV with a comma delimiter. Hey, you know what? Let's change the delimiter just for fun.
Let's make it an ampersand. Run the code, and that gives us a matrix.out file. Open matrix.out and you can see we have all of our data written out in scientific notation with an ampersand in between each floating point number. Why would I do that? I don't know, but we did it anyway.
With that, I'll end the video. I think we covered everything in pretty great detail, perhaps too much detail. It's a very, very boring topic, and I'm sorry you had to go through this. If you want to hang out, we have a Discord server linked in the description. Hope you guys enjoyed. See you in the next video.