Good morning everybody. We now go ahead to discuss character strings and pointers. First, in this session, I will describe some reading material, or what I call C resources. Some of these have been kept on the Moodle, so all of you can download them. Incidentally, all of them are open source or freely distributed material. We will then discuss handling characters in C and string handling using arrays. We will discuss pointers not separately but as part of the whole session because, as most of you would recollect, strings are not very effectively handled in C without pointers, and perhaps this is the right time to introduce pointers. We will also utilize this opportunity to indicate passing values by reference in functions, so that the use of pointers is extended to that area as well. As I mentioned earlier, we will not discuss basic input output today. We shall discuss this along with file I/O and other material on Monday morning.
So first, the open source contents. There was a book written by Banahan, Brady and Doran. This was published originally by Addison Wesley in 1991. Now this version has been made freely available. There is a website which contains the HTML version: publications.gbdirect.co.uk/c_book/copyright.html. Well, that is the copyright notice, but this is the book; if you remove the last item you will get to see the HTML book. The PDF version of this book has a funny file name, and this file has been uploaded on the workshop Moodle, so all of you can access it. Although it is a very old book, it is a very useful resource, and since it is freely distributable you can give copies to all your students. They can see the soft copy on the net. They can print it and use the hard copy. They may even want to copy some of the programs from this book and run them as exercises. You may want to do the same thing and extend some of the programs as problems for your students.
It will not be inappropriate to mention here that one of the objectives of this workshop is to lay the foundation for a very large scale collaborative effort across the country, whereby appropriate learning and teaching resources are made available to every teacher and student of programming in this country. In fact, all those 650 or so teachers who are actually participating in the workshop shall be forming the foundation group for this activity. Towards this end we are making a small beginning by making this material available. This material has not been composed at IIT, as you can see. It is a book written a long time ago by experts.
Now there is one problem with old books. The language definition itself keeps evolving. Fortunately the C language standard has not evolved further beyond 1999. The most current publication of the standard was given by a committee report in a document published in 2005. This document is also available on the net. The name of this document is N1124.pdf. I have downloaded this and it will also be uploaded onto your Moodle, so all of you can check it. What is the purpose? The purpose is that if there are some features of the C programming language which have been extended since 1991, this open source material will not contain a description of those features. During the course of this session I will mention a couple of these features which have changed over the years. However, it is important that when you teach your students, the students are familiar with the current state of the art standards, and that is the reason why this document would be very useful. This is a very exhaustive document, and in fact I will encourage my colleague teachers participating in this workshop to read it.
Although a huge amount of programming might not currently be done, as you can see, the notion of reading an established international standard, understanding the features and the implications, is a useful practice. In fact I would suggest to teachers participating in this workshop who will be teaching other courses involving programming languages, or for that matter any technology which has been standardized, that it is an extremely beneficial practice to keep referring to international standards, most of which are available on the web.
Last but not the least, there is a set of notes which have been compiled by my colleague Mr. Avinash Aute. I do not know whether I mentioned it or not. Avinash Aute is an IIT alumnus. He worked in Tata Consultancy Services for several decades, and after retirement he is engaged in social service, training people from the economically backward sections of society. I requested him a couple of years ago to join this Eklavya project, and he currently works here as part time program director of the project. He has written notes on C programming and these notes are being open sourced by him. This material, which is in the form of concise notes, examples and a large number of exercises, is being made available at this workshop, and of course it will be open source.
Currently on your Moodle, I think what has been uploaded is the PDF version of the C book. The other two resources will be uploaded by noon today, so you will have access to all of these. Incidentally, the CD which was sent to all coordinators for setting up the Unix environment, the Ubuntu environment, contains the material compiled by Mr. Avinash Aute. His notes are there as well. So with this backdrop on resources I will add one more thing. Many of you would have come across such useful resources on the net. I would request all such colleagues to send mails to us pointing out the internet links where such material is available and accessible.
Kindly make sure that the material that you make reference to is open source. Open source means that it should be available not just for anyone to read, but also for anyone to copy. It should be available for anyone to distribute. It should be available for anyone to modify. It should in fact be available for anyone to do exactly what one pleases to do with it. That is the full and complete definition of open source. At an appropriate time I will describe the Creative Commons Attribution license under which we will be releasing all of our material. And this, you will agree, will encourage both students and teachers in the entire community in the country to further contribute to these contents and make them richer for everybody to use.
With this introduction we now move over to the discussion of how to teach certain concepts in C programming, and we begin first with the list of reserved words. The reason I bring this list up is that, you will recall, yesterday there was a query from a colleague who asked whether main is a reserved word or not. As I admitted yesterday, I did not remember whether main was a reserved word or not. It appeared to me that it should be a reserved word, because I have not seen the word main used anywhere else in any C program except while defining the main function. However, it so transpires that the word main is not a reserved word. What I show here in this slide is the list of reserved words which I call the old list. This is the list which is presented in the open source PDF book that I mentioned, which was compiled in 1991. There is another list, which I have not put here. That is a slightly longer list than this, and it is now part of the C programming standard. Neither the old one nor the new one contains the word main. As a matter of fact, a colleague of mine who is helping me in organizing this workshop, Mr.
Nagesh Karmari, who is an engineer from Goa Engineering College and currently works as a research associate here, pointed out to me that when he was a student, he and some other enthusiastic people used to do funny experiments on C, doing various things out of curiosity, and they used to use the word main in different ways. In fact he mentioned that they even tried to use main recursively. When I asked him what the purpose was, he said it was just out of curiosity. This trait, by the way, you will find in a large number of your students. I would suggest that as long as the trait does not result in any destructive activity, this curiosity should be encouraged as much as possible. In fact our students are far more capable of experimenting and finding out new things than us, because we are relatively older people.
So this is the list. It is also instructive to go through the list because it tells us the kind of instructions that we have in the C programming language. Notice for example that for is a reserved keyword, because for implements an iteration for us. Similarly, while is a reserved keyword, because while is another way of implementing iteration, as we saw in the previous session. Notice also that goto is a reserved keyword; the goto statement exists in C. In fact it exists in most programming languages, although it is rarely used. So while we tell our students not to use it, we should perhaps have one example somewhere in the discussion on control structures to indicate how exactly goto works. In the same fashion there is the word break. We have not explicitly discussed it, but break is used to get out of any iteration block, such as for or while, without completing the entire body of execution. There are occasions when you suddenly detect a condition which is not connected with the for condition or the while condition, but that condition is such that you have to quit whatever repetition you are doing.
For such purposes the break statement is used. Anyway, there are interesting implications of reading such a list and trying to recall what each word indicates. For example, double and int are reserved keywords. You will notice a peculiar keyword called register. This is again a legacy of old C. As I had mentioned, the C programming language is capable of writing programs which are very close to the machine. That is the reason why, as someone had said, it is still archaically called a middle level language. Of course that term is no longer used. However, it is possible to request that certain values be stored in the registers of a computer. Such features are indeed used even today when people write C programs for embedded systems, for example a program which will govern the behavior of a washing machine (remember that washing machines have microcontrollers in them), or a C program which will govern, let us say, a car engine. A modern car has as many as 50 to 70 microcomputers. Each of them contains embedded software, much of which is still written in C. So you would require certain features like this when you program for esoteric applications. To conclude, there are reserved words in C which must be used only for the purpose for which they are indicated. This is an old list, and the current list is available in the standard. I had actually put up a slide with the new list, but then I thought I would like to encourage participating teachers to actually open the standard, if for nothing else then just to locate the updated list, and therefore I removed that slide.
We now discuss the character data type in C. A character constant is a one character value. It contains a single character, and it is written within quotes in this fashion. Notice that single quotes are used, not double quotes.
A double quote invariably represents a string. We shall deal separately with how a C program handles strings, but suffice it to say that individual characters, which are basically symbols that we would like to store in our memory locations, are indicated by enclosing a single character in single apostrophes. It is useful to note certain visible differences and similarities about which we should caution our students. For example, when in a program, in a printout or in a textbook they see small a or capital P, it is very easy for them to distinguish these. But look at these symbols. This is actually small o. This is capital O, and this is the digit 0. It is useful to remind all of our students that these three symbols look very similar, and therefore one should be very careful. Notice that small o and capital O are typically circular, whereas 0 is elliptical in its appearance. This small distinction is worth mentioning explicitly to our students in the class.
We should tell our students that almost all the symbols that we use, not just in the English language but in any other language, will have to be represented as character data or character constants. Consequently the representation of these symbols inside the computer is of some consequence. Traditionally each symbol is represented internally as an integer value equivalent to the ASCII code of the character. ASCII, the American Standard Code for Information Interchange, has been the standard for a very long time. Before the advent of this standard, the standard used to be EBCDIC. The EBCDIC representation is still used. It originated with IBM mainframe computers, and since IBM machines were the first ones to be commercially utilized, the EBCDIC representation was very common and popular in the early days of computing. However, ASCII is now a universal representation. Now, although an ASCII character is stored in a byte, the original ASCII code is not a full one byte code.
A full byte unsigned integer can represent 256 values, 0 to 255, as we had seen earlier. The original ASCII code represents only 128 symbols, 0 to 127, because the English language has far fewer symbols than even 127. However, when you look at modern natural languages which need to be represented inside the machine, the characters of the alphabets of such languages need to be stored, and people find it very difficult to store these characters in a single byte, particularly when ASCII has already occupied those positions. There are mechanisms to store what we call local language characters. We will not go into the details, but it is common to use two byte characters to represent the complex symbols of languages which have more than 128 symbols. However, traditionally, when we say char in C, char always means a one byte character. So it occupies one byte, which is the smallest addressable unit of memory.
Perhaps it is most useful at this stage, although you can do it for your students at any point in the course, to talk about the sizes of various data types in the internal representation in memory. For a C programmer it is absolutely important to understand the size of the memory location allocated to any particular data type. I have written a small program called sizes.c. Look at what this program does. It defines a lot of variables: unsigned int a, int b, long c, float x, double y, char ch, short int si. Some of you may not remember that integers are not only of the types int and long; there is also a type called short int. Traditionally short int is two bytes long. Int could be either two bytes or four bytes, and long could be either four bytes or eight bytes, as I had briefly mentioned yesterday when discussing the value limits. Similarly, float is typically four bytes long, double is eight bytes long, and so on. The unsigned int a is exactly the same as int in length.
However, the values it will store will be different. For example, if you have a four byte long int, then the values which you can store, as I mentioned, are between -2^31 and 2^31 - 1. However, if you have an unsigned int, then the range changes completely to only non-negative numbers, and the range becomes 0 to 2^32 - 1 for a 32 bit number. What this small program does is print the sizes of the internal allocation of memory to various data types. So I have tried to cover int, long int, unsigned int, float, double, char, short, etc. And I have also included, for good measure, the pointers int *p and float *q. I will also ask my colleagues to upload this small program so that any of you who are interested in looking at it can actually run it. As a matter of fact, it is very useful to advise your students pretty early to run this program, or you yourself as teacher should run this program on the different machines that are available in your labs and tell your students the implementation limitations of the particular compiler that is being used. Remember that we discussed that a long int should ordinarily be 8 bytes. However, in most of the implementations that we see in our labs, namely the PCs and small Intel servers, the representation of long is just 4 bytes. Consequently long is not longer than int in terms of the number of bytes that it has, and this has an implication on the values that can be stored in such data types. This program helps us print out all the sizes. Let us see what we are using here. We are printing the size of unsigned int. This %d is just to print a numerical value, because what we are printing is a size. We are not printing a character, we are not printing a floating point value, we are printing a size. A size is always an integer number, and how do we determine the size?
Well, there is an operator in C, written like a function call, which says sizeof(a), sizeof(b), sizeof anything as a matter of fact. Whatever that anything is, it determines what type it is, and then, from the repertoire of sizes available to the compiler, it determines the size of the memory location which is allocated to that variable. Observe that a is an int, so when I say sizeof(a) it will tell me the size of an integer value as represented internally. Similarly, when I say sizeof(x), for example, it will tell me the internal number of bytes required for a floating point value. Notice here that I am also putting sizeof(p), where p is actually a pointer to integer type. In fact one of the colleague participants, as I had mentioned, had raised the issue of why he was not getting the size of the pointer correctly printed. I will indicate the small error in the program that he was trying, although he was correctly trying to print the sizes by putting sizeof(p), sizeof(a), etc.
So let us just quickly see what this program produces. This program will produce an output of this kind. For the machine on which I executed this program, it will say: size of unsigned int is 4, size of int is 4, size of long is 4, size of float is 4. Notice therefore that int, unsigned int and long int are all 4 bytes on my machine, and this by the way is typical. I have tried it on a PC and on a laptop, and these are the results that you get. The double, however, is 8 bytes. Remember that 8 bytes here does not mean an 8 byte integer but an 8 byte floating point representation. Therefore the mantissa part has a much larger precision in double than in float. Notice that short int is 2 bytes. Notice also that a character is 1 byte. Although the program sizes.c contains the size of every pointer and every data type, what I have shown here is the size of the integer pointer and the size of the floating point pointer. It would perhaps have been more interesting to show the size of a double pointer.
Some might suspect that if the double is 8 bytes long then the pointer should also be longer. As I shall explain in a short while, pointers are addresses, and pointers have absolutely nothing to do with the type of the value that is stored inside the memory location to which they point. We shall see that in a moment, and therefore almost all pointers that you encounter will be 4 bytes long. They represent what is known as the addressability of the basic machine, or the address space. Consequently, machines which have pointer lengths of 4 bytes are often referred to as 32 bit address space machines, since 4 bytes is 32 bits. In fact almost all intrinsic native addresses in such machines will be 32 bits long. We shall of course discuss the notion of pointers a little more later in the course of our discussion.
We now discuss the actual representation of data. Any data within the machine is actually represented using only two symbols, 0 and 1. Recall that we mentioned that some of you might be teaching these concepts very early in the subject to your students. Some of you may be teaching them later. I personally prefer teaching these concepts slightly later in the course, because the first portion of the teaching could very well be occupied by the principles of programming rather than the details of how the internal representation is handled. However, whichever way you do it, somewhere or the other you will have to explain it, and perhaps it is useful to show some examples of this kind: if I represent integers by binary digits, or bits, this is typically the representation that I will get. I can represent values up to 7 by 3 binary digits, and that is why I have shown here 000, 001, 010, etc. However, 8 requires 4 bits to represent. 4 bits can continue to represent values up to 15, but 16 will require 5 bits, and so on.
Now we tell our students that 8 such bits are grouped together to form a byte, and an unsigned integer value stored inside a byte will range from 0 to 255, or 2^8 - 1. And the machine memory actually consists of a very large number of bytes, each of which has a unique address. This is important. In the old days the machines could not address individual bytes. They were word addressable machines, and a computer word is often formed by combining several bytes together. It is not uncommon to see a 2 byte word in smaller microprocessors and older PCs, but most words today are 4 byte words, and that is the reason why you will find that int and float and such commonly used data types are all 4 bytes long, because the machine finds it extremely easy to address 4 bytes as a group. However, all modern computers can address individual bytes, and therefore these are called byte addressable machines. Since a byte is too small to represent useful numerical values, a contiguous group of bytes is used, which is called a word.
So the way we could explain it to our students is this: the basic internal representation is in terms of binary digits, and this is an example of how integer numerical values can be represented by bits. 8 such bits form a byte, and the machine memory actually consists of a large number of bytes, each of which has a unique address. Since a byte is too small a unit to represent numerical values, a contiguous group of bytes is used. This is where we can actually indicate the use of pointers in programming languages without even using the word pointer. So if we are, for example, describing binary representation and machine memory very early in our teaching, then we can say that 2 or 4 bytes may be used to represent integers, 4 or 8 bytes may be used to represent floating point numbers, etc. So here is an example: int m, n; float a[3]; this is an array of 3 elements, 0, 1 and 2.
So the memory in Dumbo's drawers will typically be allocated sequentially and in the order of appearance. So m may have the value 573, n may have the value -1234567, and a[0], a[1], a[2] could have these floating point values. This is merely to tell our students that consecutive memory locations, not individual bytes but groups of bytes depending upon the type of data that we talk about, will generally be allocated in accordance with the order in which these variables or arrays are declared. Immediately after this, since we are talking about Dumbo's drawers, we will say each drawer has a name like m and a[0]. We could have explained it through an appropriate diagram or picture, but now we can state the notion of a location address. As I mentioned, a location address is nothing but a pointer. But rather than introducing the notion of location address as a meaning of pointer, I would suggest we do it the other way around. Since students would have to be taught about memory locations, and since any physical location like a building or a house always has an address, I notice that students find it much easier to understand the concept of a location address than the word pointer. Therefore it is useful to first introduce the notion of a location address. As I said, we can do that very early on if you are starting your teaching of programming concepts using binary representation, machine architecture and so on, or you can do it sometime after having introduced the basic programming concepts and after having done the basic manipulation of numerical values, including for example arrays, if you so wish.
So here is the explanation: the addresses of consecutive locations of the data objects declared as above in memory drawers will differ by four. Why? Because although we show one contiguous drawer containing a single value, we have explained that the drawer actually represents a group of bytes, and in most cases, particularly for the items that we have declared with the
associated data types, there will be four bytes representing a value. Consequently, if the first location has the address 10000, then drawer number 10000, holding the value 573, will be designated as m, but the next location address will be 10004, where n is stored, the next location address is 10008, and so on. What are these numbers? We can explain to people that since the electronic memory of the digital computer is formed in terms of bytes and groups of bytes, and since each byte is addressable, this number is nothing but the physical consecutive location of a byte in the total memory of the computer. So if we say that the computer has a memory of 512 megabytes or 2 gigabytes or whatever, then every byte in that 2 gigabytes or 512 megabytes will have a unique address, and that address is what is shown here.
Observe that if I am using 32 bit addresses, which is what this particular example shows, then I cannot have the memory of the computer exceed 2^32 bytes. Why? Because with 32 bits I have only 2^32 unique numbers. In actual practice the physical memory of a computer can be much larger. I will leave it to you to think about how to explain to your students how a larger memory, which has more than 2^32 bytes, can actually be addressed at all, because you need unique addresses to address any location. One way of explaining it is that if the memory requirements of my program are within the limits of 32 bit addressing, then as far as that program is concerned it could be compiled and then mapped to any segment of the larger memory, and that segment can be completely addressed, assuming a zero base, by 32 bit addresses. However, most machines are now migrating to 64 bit addressing.
Notice that the address length has nothing to do with the length of the actual data values which we have stored. For example, if I have a character type, as we have seen, a character internally occupies only 1 byte, so there will be a single
byte value here, a letter a, b, p, q or whatever. A simple unsigned integer representation of 1 byte or 2 bytes would likewise hold a smaller value. However, the length of the address will be determined by the addressing mechanism that the machine uses. This is where the pointers come in. If you print the size of a pointer, irrespective of whether it is pointing to a single character or to a float or to a double, the pointer will always have the same size, and that is because the addressing mechanism inside the computer is a constant-size mechanism, which, typically on the machines that we use, is 4 bytes. That is the reason why the output of the previous program will show that all pointers, irrespective of what they point to, are 4 bytes long.
The programming language C++ actually supports a Boolean or bool data type. This _Bool was not an original data type in the C programming language, but the current standard defines the data type _Bool. Those of you who want to teach the notion of a Boolean variable, which has a truth value true or false, might find it useful. However, in most C programs a Boolean data type is not explicitly used. Instead you depend upon the numerical values zero and non-zero to describe the truth values false and true. _Bool, however, is available as a data type, which typically occupies one byte. If you wish to use the type bool, which is used to represent a truth value, then any comparison operation can be used as an expression resulting in a Boolean value, which can be assigned to a variable of Boolean type. For example, if answer2 is defined as a Boolean type, then it can be assigned a value resulting from x < 25. One may wonder whether x < 25 is a numerical expression at all. In C, if you calculate x < 25 as a value, you will find that this expression, depending upon whether the expression results in a true comparison or a false comparison, will actually
return a numerical value. It is that numerical value which is considered to represent the Boolean values true and false in C. In C++ the actual representations true and false are in fact two explicit data values. They are not numerical data values, although they may be represented internally by numerical values, and they can in fact be assigned to variables of Boolean type. Once you do these assignments, the following usage will be valid and meaningful: if (answer1), which means that if answer1 is true you do this, otherwise something else; or while (answer2), where a repetition is continued as long as answer2 is true. Of course the body of the iteration must modify certain values such that the condition represented by answer2 is appropriately changed.
Let us continue our discussion of the character data type. Each character which we see on our terminal... There is a question from c o p pune: what is the size of a function or an operator? I would like to submit that sizes are of memory locations, and whether it is a function or an operator, it does not explicitly occupy memory after the program is compiled. Both the function and the operator are actually part of our computer program. They are not data values, and therefore there is no notion of the size of either a function or an operator. That is a very good point that was raised, because some students may ask these questions. They may relate size to anything and everything that they see in a C program, whereas it is useful to explain to them that size pertains to the actual size of the memory location which is allocated to hold some value, and since values are contained only in variables or array elements, it is appropriate to talk of sizes only of variables and array elements, but not of other things that are part of the C programming language.
So every character that we see on our terminal, which we input using our keyboard, is represented by an internal numerical
code. The ASCII code, which is 1 byte long, is used. Here is an example: small a has a code of 97, small z has a code of 122, capital A is 65, capital Z is 90. A space, which means we see nothing either on a printer or on a monitor but a space gets inserted when we press the space bar, has the ASCII code 32. Backslash n ('\n'), which is the new line character, is 10. One must distinguish between the character 0 and backslash 0. A backslash 0 ('\0'), or null, is an actual zero value; its ASCII code is 0. The character '0', on the other hand, has a different code, because the character 0 is part of the visible or graphical symbols, as we call them.
One can declare variables of type char in a program, for example char letter1; char letter2 = 'y';. Just as we can initialize any variable when we declare it, such as int q = 28, similarly we can say char letter2 = 'y'. Observe that a char variable can hold only one character at a time. That is a dilemma, because we would like to handle larger values as a unit, just as we handle larger numerical values and not just one digit of a numerical value. Imagine what would have happened if the storage locations in the computer's memory could store only one digit at a time. Then to deal with a simple number such as 237, for example, I would need three memory locations. I would have 2 in one location, 3 in the second location and 7 in the third location, and handling this number as a single unit would become very clumsy. In fact that is what we have to resort to whenever we want to deal with higher precision arithmetic than what is provided by the basic capability of the machine or the language compiler. As I promised you, next week we shall see an example of high precision representation of numbers of up to 99 digits, and how to add them, how to multiply them, etc. These are typically the questions that we ask in mid semester or in semester examinations. But with characters we have no choice. The C programming language permits storing of individual characters only. If I want to store names of people, names of streets, the name of a subject, any
character string for that matter, the C programming language does not have any intrinsic data type representing such a value. So if a name is, for example, 35 characters long, there is no data type in C which permits me to put a 35-character-long value. Consequently, you will observe that although we use strings in our programs, particularly in printf statements and such places, there is no mechanism to address a string of characters as a single entity directly inside a C program.

So how does C handle character strings? Well, it handles them by saying: look, if you have a string, then put the different characters of the string in array elements, and I will give you facilities for doing some special things with such arrays, which are called character arrays. In short, this is how we can introduce the way C stores a string inside its memory. So C permits us to define character type variables.

There is a question from AC Amruta Puri: what is the difference between the word size in the C language and the processor word size? OK, that is a good question. Word size does not relate to sizes in any programming language; word size always represents a feature of the native computer. So the notion of a word is actually computer terminology. What you have in the C programming language, or for that matter in any other programming language such as C++, Java, Fortran, what have you, are sizes of data value storage. So how exactly an integer value is stored, how exactly a floating point value is stored, that is what size refers to in a programming language. I think it is a good point, because students will get confused between the word size and the size that we use inside a programming language. These are completely different notions. As I mentioned earlier, the notion of the word was very important in the case of the computer processor because these were word addressable machines. However, almost all computers today are byte addressable, and therefore there is no intrinsic
notion of a word. However, the digital electronic circuitry, the logic processing for example, quite often permits you to store 4 bytes into memory or to load 4 bytes from memory into a register in one step, and to that extent the notion of the word is still relevant for the computer processor. But there is nothing like a word as far as the programming language is concerned. In a programming language you talk only of size and not of word size, and this size is associated with variable types: size of int, size of char, size of long int, size of double, size of short int, and so on. Of course these two are related, because if the computer has an implicit word size such as 4 bytes, as we saw in most commonly used processors, then language compilers will naturally use 4 bytes to represent the most commonly used data. That is the reason why, if you run that sizes.c program, you will get 4 as the size of int, of float, and of most such commodity data types which are used very commonly. I hope that answers the question.

So let us go ahead with handling the character data type and see how exactly character strings can be handled. This is where we say that the traditional way of representing character strings in C, and therefore also in C++, is to use a char array. Here is a declaration of a character array: char employee_first_name[60]; This array permits us to store up to 59 symbols, or 59 characters, in this string. Why 59? After all, the employee first name array, or any other array that I define with size 60, should have 60 elements. It indeed has 60 elements, with the index running from 0 to 59. But the fact that the last index is 59 is not the reason why we say that the string can have 59 characters. The reason is completely different. The reason is that these strings are stored in an extremely artificial manner. If I have a string of symbols to be stored in an array, the first symbol of that string is stored in the first location, namely location 0, the second in location 1,
and so on. However, wherever the symbols end, there is an artificial character called a null value, backslash 0 (\0), which is an absolute zero, and it is inserted in the byte immediately following the meaningful characters which constitute that string. So, to indicate that such an array contains a string, a null value is stored in the location immediately after the last character of the string.

It is useful to look at a string representation, because students will always have these doubts. Suppose the string hello has to be stored in a char array. I have declared a character array called message with 40 elements: char message[40]; These elements can be pictured as occupying consecutive memory locations, and these are all byte locations, by the way, one byte long, because char has a size of one byte. So what will happen if we want to push the string hello inside this char array message? The string will be stored as follows: I will have h stored in the first location, small e in the second location, l in the third location, another l in the fourth location, and o in the fifth. When I say first, second, third, fourth, I actually mean element 0, element 1, and so on; the element at index 4 in this particular case will be o.

Notice that the string has ended. However, there are other locations here, and it is important to emphasize to our students that whenever I declare an array, if the array has 40 elements, then all elements, including the last element which will actually have index 39, will always contain something or the other. Remember, we have already told our students that memory locations are never empty. Every location, and therefore every byte in the memory of the actual computer, will always contain a value. So there is no notion of a byte being blank. Now, since a byte always contains a value, it is important for us to ensure that, for meaningful execution of our program, we refer to a memory location for using a value only when
either we have put the value in through an input statement, or some value has been computed and put inside that location by our dumbo or the C compiler.

Now observe what would happen if I have an array in which I have not stored anything to begin with. It will have some 40 junk byte values, each byte holding a value between 0 and 255. When I start putting characters inside this array to store a string, I put h here, e here, l here, l here, and a small o here. But the next element contains some unknown value, shown as a question mark, and so do the elements after it. Consequently, if later on we examine the contents of this array, we will find h e l l o, and imagine that the next value incidentally happens to be, say, z. After all, we do not know what value is there. So how will we know that our string is hello and not helloz? Because z also looks like a valid symbol. As a matter of fact, since we are talking about one-byte contents, every byte of this array will actually contain a valid symbol, because every byte represents some character code. Some may be printable, some may not be printable, but they are all valid codes any which way.

This is the reason why the C programming language decided: look, if I want to give you facilities to represent a string, then I had better tell you that if you have a string of symbols which is smaller in size than the total number of elements in the array, then I will give you a special mechanism to indicate where your symbols have ended. Consequently, if we wish to put the string hello inside this character array, the mechanism that C uses, and that C asks us to use, is this: you put your symbols h e l l o in locations 0, 1, 2, 3, 4 of this array, but in location 5 you put a special symbol, which is backslash 0 (\0), or null. So this question mark will be replaced by \0, and therefore any other program, including the C compiler and the library functions, examining this string for contents
will immediately find out that the string that has been stored by us by any previous operation contains only five meaningful symbols, because it will stop examining when it locates backslash 0 (\0) and it will say: ah, the string has ended here. So string representation in C requires, by convention, an extra artificial character, which is \0. Of course it may so happen that, either by design or by accident, all these question marks are \0. It does not really matter what the subsequent byte positions contain; they may have \0 or they may have something else. But when a C program examines an array, either through a function call or through our own logic, the convention is mandatory: whenever I come across a \0, the meaningful portion of the string has ended. That is exactly how C permits us to handle character strings.

There is one query, let me quickly look at that. This query is from Jargaon. What is the query? Whether the message array size is 40 or, I guess, 41. OK, the question is answered as follows. If I have declared the char array message with size 40, then it has exactly 40 elements. The elements have indices 0, 1, 2, 3, 4 up to 39. The last element of this array is element 39, and the total size of this array is 40. However, the actual size of a string that I can store inside this array is not 40 symbols but only 39 symbols, because for any string to be meaningful, as we have seen, the element immediately following the meaningful symbols must be a \0. So in the worst case, if I have a very large message to store inside this array, I must have at least the last element holding \0. And if the last element has \0, that leaves me only 39 meaningful positions to contain the symbols of my message. Ordinarily, however, my message will be much smaller than this size. In fact, I will always use an array size which is much larger than the size of the actual strings that I wish to store, so that I never have an overflow problem. A message such
as hello, which is actually a 5-letter message, H E L L O, will occupy 6 locations in this array: H E L L O followed by a backslash 0 (\0). So 6 locations together define the string in C for me, out of which 5 are meaningful symbols constituting my string and the last one is a \0.

OK, we will stop here. We will continue with more interaction later as we see an example. By the way, we will just have a 5-minute break, and exactly 5 minutes later we will assemble back here. Thank you.