Welcome to this course on computer programming. In this session, we shall see how to create a direct access file from the data available in a text file that we created in the last session. You will recall that we wrote a program which handled CSV files and created a marks data text file.

We note that a file on disk is like an array of bytes. So, just as an array element can be accessed through an index value, one or more bytes can be directly accessed by giving the position of the starting byte. There are functions in C++ which can read or write a number of bytes at a specified position in the file. In this session, we are studying a program to create a binary file in which there are fixed length records. In the next session, we will see how these records can be directly accessed, read and updated.

First, we recall the format of the marks data text file which we created through a program last time. We had the roll number, the name, the batch and the marks for every student written in a line, and there were as many lines as there were students. Note that we had actually created text data with values separated by blank spaces. Each line contains the record of one student. Each line has four pieces of information, which are called fields or attributes. These fields are roll number, name, batch number and marks.

Now, in the binary file that we are about to create, each field and thus each record has a fixed length in bytes. What is important is this fixed length. If we know the record size, that is, the number of bytes in a record, say s, and the data for, say, 10,000 students is written in a disk file, then we know that the file will contain 10,000 × s bytes.
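To make the idea of a file as an array of bytes concrete, here is a minimal sketch of reading a few bytes at a given position, together with the file size arithmetic for fixed length records. The function names read_bytes_at and file_size_bytes are made up for this illustration and do not come from the lecture's program; fseek and fread are the standard library calls assumed to do the positioning and reading.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Read up to `count` bytes starting at byte position `pos` (counted from 0)
// of an already opened file. Returns the bytes actually read, which may be
// fewer than `count` if the file ends earlier.
// Hypothetical helper, not part of the lecture's program.
std::string read_bytes_at(std::FILE* fp, long pos, std::size_t count) {
    assert(fp != nullptr && pos >= 0);
    std::string buf(count, '\0');
    if (std::fseek(fp, pos, SEEK_SET) != 0) {
        return std::string();               // could not move to that position
    }
    std::size_t got = std::fread(&buf[0], 1, count, fp);
    buf.resize(got);
    return buf;
}

// Size in bytes of a file holding n fixed length records of s bytes each.
long file_size_bytes(long n, long s) {
    return n * s;
}
```

Calling read_bytes_at(fp, 3, 4) on a file would return the four bytes at positions 3, 4, 5 and 6, just as an index selects elements of an array.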
More importantly, if we know the relative position of a particular record which we want to access, let us say that position is r, then we can directly read the data for that student by simply going to byte position (r - 1) × s, counting records from 1 and bytes from 0, because this will be the starting byte position for the record of that student. The next s bytes will contain the record itself. So, this is the great advantage of using fixed length records in direct access files. Of course, in this particular session, we are merely going to create a binary file amenable to direct access.

To begin with, we define a record structure so that we can collect all the different attributes of a record in one place and refer to all of them by a single name. We also note that there is no need to store all values in a text format in the output file. We can as well use the internal format to directly store values.

Here is one good way of defining a structure. We say struct student info and define these four elements: int roll, char name[30], int batch and float marks. Notice that this structure, struct student info, acts like a new type. It is now possible to define a structure variable s by saying struct student info s. The moment we define this variable, the individual elements of s can be accessed easily, as we know, by s.roll, s.name, s.batch and s.marks.

The size in bytes of a structure can be found by a special operator called sizeof. So, if we say sizeof of struct student info, it will return the number of bytes allocated by the compiler in internal memory. This can be assigned to an integer variable, say rec size. Notice that roll is an integer, so it will take 4 bytes; batch is an integer, so it will take 4 bytes; marks is a float, so it will take 4 bytes. These three fields together account for 12 bytes. The name is 30 bytes. So, we would expect the length to be 42 bytes for this record.
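The structure and the two pieces of arithmetic described above can be sketched as follows. The identifier student_info is an assumed spelling of the spoken name "student info", and the helpers sum_of_member_sizes and record_offset are hypothetical names introduced only for this illustration. The sketch assumes records are counted from 1 and byte positions from 0.

```cpp
#include <cassert>
#include <cstddef>

// Record layout as described in the lecture (identifier spelling assumed).
struct student_info {
    int   roll;
    char  name[30];
    int   batch;
    float marks;
};

// Sum of the sizes of the four members taken one by one, ignoring anything
// extra the compiler may add. With 4-byte int and float this is 42.
std::size_t sum_of_member_sizes() {
    return sizeof(int) + sizeof(char[30]) + sizeof(int) + sizeof(float);
}

// Starting byte position of the r-th record (r counted from 1) in a file
// whose records are all s bytes long.
long record_offset(long r, long s) {
    assert(r >= 1 && s > 0);
    return (r - 1) * s;
}
```

For example, with records of 42 bytes, record_offset(1, 42) is 0 and record_offset(3, 42) is 84, so the third record occupies the 42 bytes starting at position 84.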
Unfortunately, most compilers will report the size of our record as 44 bytes, because elements need to be allocated at what is called a word boundary. That means each integer or float element must begin at a byte whose number is divisible by 4. That is why, although s.name is only 30 bytes, 2 bytes will be added as padding bytes and s.batch will be allocated at the next word boundary.

The program logic for creating a database file is very simple. We open the input text file and the output binary file. Now we read, as before, one line from the input text file into 4 variables. We set up a while loop. We again test for not end of file on the input file. If the file has not ended, I have got a valid line. All I do is assign the values to the elements of the structure variable and write the entire structure variable to the output file. Then I read the next line and go back to the test for the next iteration. So, please note that in each iteration, I need to do only two things: from the line that I have read, assign values to the elements of the structure variable, and write the entire structure to the output file. To begin with, I read a line before entering the loop, and then, before going to the next iteration, I read another line. At the end of the input file, I have to close the files. That is the simple program logic.

Let us look at the program itself. After the include statements, we look at the structure definition, struct student info. We have already seen this definition. Please note that this defines a new, user-defined data type called student info. In the main program, we define a variable s of the type struct student info. We define rec size. As mentioned earlier, we find out the value of rec size by using sizeof, and we output it for our own information: the size of the record is so and so. We define two file pointers, fp input and fp output.
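The word boundary padding described above can be inspected directly. The following is a small sketch that uses the standard offsetof macro to show where each member begins; the identifier student_info and the helper names print_layout and padding_bytes are assumptions made for this illustration. The exact numbers depend on the compiler and platform; 4-byte int and float with 4-byte alignment is the common case the lecture has in mind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Record layout as described in the lecture (identifier spelling assumed).
struct student_info {
    int   roll;
    char  name[30];
    int   batch;
    float marks;
};

// Print the starting byte of every member and the total record size.
void print_layout() {
    std::printf("roll  starts at byte %zu\n", offsetof(student_info, roll));
    std::printf("name  starts at byte %zu\n", offsetof(student_info, name));
    std::printf("batch starts at byte %zu\n", offsetof(student_info, batch));
    std::printf("marks starts at byte %zu\n", offsetof(student_info, marks));
    std::printf("size of record is    %zu\n", sizeof(student_info));
}

// Number of padding bytes: total size minus the sizes of the members.
std::size_t padding_bytes() {
    const std::size_t members =
        sizeof(int) + sizeof(char[30]) + sizeof(int) + sizeof(float);
    assert(sizeof(student_info) >= members);
    return sizeof(student_info) - members;
}
```

On a platform of the kind assumed here, name ends at byte 33, so batch is moved up to byte 36, the next position divisible by 4, and the two skipped bytes are the padding.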
As we already know, to begin with, we have to do some necessary housekeeping, namely, open these files so that they are associated with the appropriate external files. We know that fp input is to be associated with the markdata.txt file, which we created last time, but this time the file is opened in read mode, that is, as an input file. Of course, if the file pointer is null, we give an error message and return. In exactly the same fashion, we associate the fp output file pointer by opening another file called student db. Note the parameter wb. W means for writing, so it is an output file. B means binary file. By default, a file is opened as a text file, but when we write b, it means a binary file. So, in short, a binary file will be opened with this name. Of course, if for some reason the file cannot be opened, I will get a null pointer. I will put out an error message and get out with a return of minus one.

Having done this housekeeping, I also define four variables: int r, int b, char n[30] and float m. These are temporary internal variables to hold the values which I read from the input file. So, these are the attribute values for a record. I start with an initial value of 0 for a count and do the following. I first read from the fp input file the values for r, n, b and m. Notice that I pass the addresses of the variables, except for n, which is a character array and therefore is passed by its name alone. Notice the format specifiers %d, %s, %d, %f. These will suffice because I have exactly four values separated by blank spaces.

So, having read these values, I will start the while iteration, which continues as long as the fp input file has not ended. That is, as long as feof of fp input is not true, I will continue to iterate like this. And what do I do in every iteration? Simple. I have got the values of r, n, b and m, which I assign to the elements of my structure variable s. s.roll, s.batch and s.marks are assigned the values of r, b and m. As far as s.name is concerned, I use the simple string copy function, strcpy, which will copy from n into s.name.
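The reading step and the assignment into the structure can be sketched as below. This is an illustration under assumptions rather than the lecture's actual code: the identifier student_info and the helper name read_record are invented here, and two small guards not mentioned in the lecture are added, namely a width limit of 29 characters on the name and a check of the number of values that fscanf reports as read.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Record layout as described in the lecture (identifier spelling assumed).
struct student_info {
    int   roll;
    char  name[30];
    int   batch;
    float marks;
};

// Read one line of the form "roll name batch marks" (blank separated) from
// fp and copy the values into s. Returns true only if all four values were
// read. Hypothetical helper, not part of the lecture's program.
bool read_record(std::FILE* fp, student_info& s) {
    assert(fp != nullptr);
    int   r, b;
    char  n[30];
    float m;
    // Addresses are passed for r, b and m; the array name n already
    // stands for an address. %29s keeps the name within char n[30]
    // (a guard added here, assumed rather than taken from the lecture).
    if (std::fscanf(fp, "%d %29s %d %f", &r, n, &b, &m) != 4) {
        return false;
    }
    s.roll  = r;
    std::strcpy(s.name, n);   // string copy from n into s.name
    s.batch = b;
    s.marks = m;
    return true;
}
```

Because %s stops at the first blank, this form of reading assumes, as the lecture's input does, that a name contains no spaces.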
Having constructed the values for the elements of the variable s, I simply wish to write this structure as is onto the output file. I use the fwrite function. Please note that I have passed the pointer to the structure, &s, as the first parameter. fp output is the last parameter; it is the file to which I have to write. Notice the two parameters in between: rec size, 1. The first indicates the number of bytes to be written, which is the size of the record, and the second indicates the number of records to be written. So, number of bytes in a record and number of records. Usually, I will be writing one record at a time.

Please note that I am writing the output file sequentially. So, after this write, I of course print that record for my own benefit, read the next input record and go back to the while statement. So, as I read records one by one from the input file, they will get written as the fixed length structure s into the output file. At the end, I do my standard housekeeping. I print out messages like marks data file read, database created. I write the total number of records written and, more importantly, I do not forget to close the fp input and fp output files. This ends my program.

Let us look at the execution of this program. You will recall the cout and printf statements that we have used in our program. So, first it outputs the size of the record, that is 44. Next, it outputs each record. Please note that the count is also printed there. So, we see the record numbers 1, 2, 3, 4, 5, up to 10. We knew that there were 10 records. So, we know all 10 records have been read and have been written. At the end, the messages say that the marks data file has been read and printed and that the database has been created for students, and give the total number of records written. In short, our program has worked correctly and has created a database file.

We call it a database file. Please note that the word database actually refers to a completely independent body and field of study. In plain English, a database is any base or any collection of data.
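Putting the steps of the program together, a minimal sketch of the whole conversion might look like this. It is not the lecture's actual code: the identifiers student_info, create_database, fp_input, fp_output and recsize are assumed spellings, the file names are passed as parameters instead of being fixed, and the loop is driven by the number of values fscanf reports rather than by the read-ahead and feof test described in the lecture. The structure is also set to zero before filling, so that the unused bytes of the name are defined; that is an addition made here.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Record layout as described in the lecture (identifier spelling assumed).
struct student_info {
    int   roll;
    char  name[30];
    int   batch;
    float marks;
};

// Read blank separated lines "roll name batch marks" from the text file and
// write one fixed length binary record per line to the output file.
// Returns the number of records written, or -1 if a file cannot be opened.
int create_database(const char* text_path, const char* db_path) {
    assert(text_path != nullptr && db_path != nullptr);

    std::FILE* fp_input = std::fopen(text_path, "r");    // text, read mode
    if (fp_input == nullptr) {
        return -1;
    }
    std::FILE* fp_output = std::fopen(db_path, "wb");    // binary, write mode
    if (fp_output == nullptr) {
        std::fclose(fp_input);
        return -1;
    }

    const std::size_t recsize = sizeof(student_info);
    int   r, b, count = 0;
    char  n[30];
    float m;

    while (std::fscanf(fp_input, "%d %29s %d %f", &r, n, &b, &m) == 4) {
        student_info s = {};          // start from an all-zero record
        s.roll  = r;
        std::strcpy(s.name, n);
        s.batch = b;
        s.marks = m;
        // recsize bytes per record, 1 record per call
        if (std::fwrite(&s, recsize, 1, fp_output) != 1) {
            break;                    // stop on a write error
        }
        ++count;
    }

    std::fclose(fp_input);
    std::fclose(fp_output);
    return count;
}
```

After a successful run, the output file holds count × sizeof(student_info) bytes, which is what makes each record reachable by position later.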
So, we call this a database file. In summary, we have studied how to create a binary file and how to write fixed length records containing students' data. In the next session, we will see how to access and update these records using direct access to our binary files. Thank you.