Welcome to the session. I am Mr. Praveen Yalapa Kumbar. Today we will look at a lossless compression algorithm. The learning outcome of this topic is that, at the end of the session, the student will be able to explain the concept of the Shannon-Fano algorithm in multimedia communication techniques. The contents of this topic are: introduction, basics of information theory, variable-length coding, the Shannon-Fano algorithm, compression ratio, and entropy.

First, the introduction. This is the basic diagram of a general data compression scheme. The diagram has three main parts: the first is the encoder, the second is the storage or network, and the third is the decoder. The input data is first fed to the encoder, which is also called the compressor; this is where the data is compressed. The compressed data is then stored, or transmitted over the network, and afterwards passed to the decoder, which is also called the decompressor; here decompression takes place and we obtain the output data. From this diagram we can conclude that there are two types of compression: lossless data compression and lossy data compression.

Data compression is also known as source coding. Data compression is the process of encoding data in such a way that it consumes less memory space; it reduces the resources required to store and transmit the data. Compression means a process of coding that effectively reduces the total number of bits needed to represent certain information, and it can be done in two ways: lossless compression and lossy compression. Lossless compression is a class of data compression algorithms that allows the original data to be perfectly reconstructed from the compressed data; because of this, the achievable reduction of the data is smaller. If the compression and decompression process induces some information loss, the compression scheme is called lossy compression; here a much larger amount of the data can be reduced compared with lossless compression.

Now we see the compression ratio. The compression ratio is defined as

    compression ratio = B0 / B1    (1)

where B0 is the number of bits before compression and B1 is the number of bits after compression. That is, the compression ratio is the ratio of the number of bits before compression to the number of bits after compression.

Now, the basics of information theory. Before going into information theory proper, we need the meaning of entropy. The entropy H of an information source with alphabet S = {s1, s2, ..., sn} is

    H = Σ_{i=1}^{n} p_i log2(1/p_i)    (2)

When the 1/p_i inside the logarithm is inverted, a minus sign appears, so the same formula can be written as

    H = - Σ_{i=1}^{n} p_i log2(p_i)    (3)

This is the formula for the entropy, where p_i is the probability that symbol s_i in S will occur. The term log2(1/p_i) indicates the amount of information contained in s_i, which corresponds to the number of bits needed to encode s_i.
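To make equation (2) concrete, here is a minimal sketch in Python, not from the lecture itself, assuming the probabilities p_i are simply estimated from symbol frequency counts in a short string; the function name `entropy` is my own illustrative choice.

```python
import math
from collections import Counter

def entropy(text: str) -> float:
    """Equation (2): H = sum of p_i * log2(1/p_i), with p_i estimated
    as count(s_i) / total number of symbols in the text."""
    counts = Counter(text)
    total = len(text)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

print(round(entropy("SPEAKER"), 2))  # 2.52 bits per symbol
```

Running this on the word SPEAKER already gives the value 2.52 bits per symbol that we will derive by hand at the end of the session.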
Now we see variable-length coding. The entropy indicates the amount of information contained in an information source S. It leads to a family of coding methods commonly known as entropy coding methods, and variable-length coding is one of the best known such methods.

Now we want to see the Shannon-Fano algorithm. The Shannon-Fano algorithm was developed by Claude Shannon at Bell Labs and by Robert Fano at MIT; hence the name Shannon-Fano. The encoding steps of the Shannon-Fano algorithm can be presented in the following top-down manner. There are basically two steps: the first step is to sort the symbols according to their frequency counts; the second is to recursively divide the symbols into two parts, each with approximately the same total count, until every part contains only one symbol.

A natural way of implementing the above procedure is to build a binary tree. As a convention, let us assign bit 0 to the left branch and bit 1 to the right branch. This gives a method of constructing a prefix code based on a set of symbols and their probabilities, estimated or measured. The technique was proposed in Shannon's "A Mathematical Theory of Communication", his 1948 article introducing the field of information theory; in the field of data compression, Shannon-Fano coding is named after Claude Shannon and Robert Fano.

Now we want to see an example. The symbols to be coded are the characters of the word SPEAKER. First of all we calculate the frequency count of each symbol. If you observe, S occurs once, P occurs once, E occurs twice, A occurs once, K occurs once, and R occurs once, so the total count is 7:

    Symbol   S   P   E   A   K   R
    Count    1   1   2   1   1   1     (total: 7)

Now please pause the video here and recall the first step of the Shannon-Fano algorithm. The first step is to sort the symbols according to their frequency counts, with the largest count first, so the previous table rearranged in descending order becomes:

    Symbol   E   S   P   A   K   R
    Count    2   1   1   1   1   1     (total: 7)

The next step is to recursively divide the symbols into two parts, each with approximately the same total count, until every part contains only one symbol; a small code sketch of both steps is given next, and the slide-by-slide walkthrough comes after it.
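Here is one possible Python sketch of the two steps, assuming a particular tie-breaking rule for the split: take the earliest division point that minimizes the difference between the two halves' total counts. The helper `best_split` is my own; the lecture does not specify how ties are broken, but with this rule the word SPEAKER reproduces exactly the codes derived in the walkthrough below.

```python
from collections import Counter

def best_split(items):
    """Hypothetical helper: index i minimizing |sum(left) - sum(right)|,
    preferring the earliest such split (an assumed tie-breaking rule)."""
    total = sum(c for _, c in items)
    running, best_i, best_diff = 0, 1, float("inf")
    for i in range(1, len(items)):
        running += items[i - 1][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_i, best_diff = i, diff
    return best_i

def shannon_fano(items, prefix=""):
    """Step 2: recursively divide the sorted (symbol, count) list into two
    parts of approximately equal total count, appending bit 0 for the left
    part and bit 1 for the right part."""
    if len(items) == 1:
        return {items[0][0]: prefix or "0"}
    i = best_split(items)
    return {**shannon_fano(items[:i], prefix + "0"),
            **shannon_fano(items[i:], prefix + "1")}

# Step 1: sort symbols by frequency count, largest first.
sorted_counts = sorted(Counter("SPEAKER").items(), key=lambda kv: -kv[1])
print(shannon_fano(sorted_counts))
# {'E': '00', 'S': '01', 'P': '100', 'A': '101', 'K': '110', 'R': '111'}
```

Note that a different tie-break can produce a different, but still valid, prefix code; the division rule is a design choice within the algorithm.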
Now observe the division. I want to divide the sorted symbols into two parts with nearly equal counts. On the left-hand side I take E and S, whose total count is 2 + 1 = 3, so I write 3 there; the remaining part P, A, K, R has total count 4, and I take it as the right-hand side. The left branch is labelled 0 and the right branch is labelled 1. In the next step each part is divided again, until every part contains only one symbol. The part {E, S} with count 3 is divided into E with count 2 and S with count 1. Similarly, {P, A, K, R} with count 4 is divided into the nearly equal parts {P, A} with count 2 and {K, R} with count 2. These parts still contain two symbols each, not a single symbol, so I must divide them again: {P, A} becomes P with count 1 and A with count 1, and {K, R} becomes K with count 1 and R with count 1. In this way I divide until every leaf contains a single symbol.

From this tree I build a table whose columns are symbol, count, code, number of bits used, and probability. Reading the codes off the tree: E is reached by 0 then 0, so its code is 00; S is reached by 0 then 1, so its code is 01; and in the same way P is 100, A is 101, K is 110, and R is 111.

    Symbol   Count   Code   Bits used   Probability
    E        2       00     4           2/7 ≈ 0.29
    S        1       01     2           1/7 ≈ 0.14
    P        1       100    3           1/7 ≈ 0.14
    A        1       101    3           1/7 ≈ 0.14
    K        1       110    3           1/7 ≈ 0.14
    R        1       111    3           1/7 ≈ 0.14

The number of bits used for a symbol is its code length multiplied by its count: for E the code length is 2 and the count is 2, giving 2 × 2 = 4 bits; for S the code length is 2 and the count is 1, giving 2 bits; and in this way I count the number of bits used for the remaining symbols. The probability is the count divided by the total count: for E it is 2/7 = 0.29, and in this way I compute it for the different symbols. Summing the bits-used column, the total number of bits is 18.

Now I want to go for the compression ratio. If the total number of bits required to represent the data before compression is B0, and the total number of bits required after compression is B1, the compression ratio is B0/B1. Here B0 = 8 × 7 = 56 bits, because there are 7 characters and I take 8 bits for each character, and B1 is the 18 bits we calculated above. Therefore the compression ratio becomes 56/18 ≈ 3.11, and the average number of bits used for the 7 symbols is 18/7 ≈ 2.57.

Now the entropy. I calculate it from the table using formula (2):

    H = 0.29 × log2(1/0.29) + 5 × 0.14 × log2(1/0.14)

The factor 5 appears because the remaining five symbols S, P, A, K, and R all share the same probability 1/7 ≈ 0.14. Following these steps with the exact probabilities 2/7 and 1/7, we get the value 2.52. This suggests that the minimum average number of bits needed to code each character of the word SPEAKER is at least 2.52, so the Shannon-Fano algorithm delivers satisfactory coding results for data compression. Thank you.
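As a quick check of the arithmetic in this example, the following sketch (my own, reusing the codes derived above) recomputes the total bits, compression ratio, average code length, and entropy for the word SPEAKER.

```python
import math
from collections import Counter

word = "SPEAKER"
codes = {"E": "00", "S": "01", "P": "100", "A": "101", "K": "110", "R": "111"}
counts = Counter(word)

b1 = sum(len(codes[s]) * c for s, c in counts.items())  # bits after compression
b0 = 8 * len(word)                                      # 8 bits per character before compression
h = sum((c / len(word)) * math.log2(len(word) / c) for c in counts.values())

print(f"B1 = {b1} bits")                              # 18
print(f"compression ratio = {b0 / b1:.2f}")           # 56/18 = 3.11
print(f"average bits per symbol = {b1 / len(word):.2f}")  # 18/7 = 2.57
print(f"entropy = {h:.2f}")                           # 2.52
```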