Good morning. I am Dr. Madhuri Savan, Senior Research Scientist at IIT Bombay, working on the Aakash project. The main aim of my team is to address the issues we are facing from the technical perspective as well as from the language perspective. As we have seen in the previous sessions, videos are used extensively for knowledge dissemination, and Blender animations are used to show how the mechanisms of objects function. Considering all this, we think we also have to take into account the power consumed by the tablet.
My primary focus is on five fundamental areas: digitization of books, transcripts of the video lectures, content development in regional languages, development of e-learning training programs, and development of a hardware device to project video lectures from Aakash.
The first challenge we are facing is the cost of licenses for proprietary tools. They are pretty expensive, and we are talking about affordable solutions using Aakash tablets. So the primary goal is to use open source software to prepare our applications and our solutions. Another challenge is that the Aakash tablet does not support Flash, and as all of you know, about 80 percent of the content currently developed is in Flash. The Aakash tablet does not support Java either, so most web-based Java applets are not supported by the tablet. We therefore face the challenge of developing content without Flash and without Java. There are also power issues in recharging the Aakash tablet in rural areas. As we are all aware, because of load shedding and power shortages, the children or the students might face these problems.
So we are working on how to optimize the use of the Aakash tablet, and therefore on alternatives to videos: how to create transcripts that can give almost the same amount of knowledge. The last challenge is the shortage of content in regional languages. 80 to 90 percent of people in India speak only the regional languages, and there is an acute shortage of content in those languages. So our emphasis is on how we can provide content in the regional languages. The language of learning might be English, but for a better understanding of the concepts the same content should be available in the regional languages, and the efforts are on.
Now, these are a few of the solutions we are working on. The first is to explore freely available open source software. As I mentioned earlier, we are thinking only about using open source software, so that it will be freely available and can be downloaded freely from the Aakash websites. The second is to explore HTML5 features to develop interactive content. As mentioned by the previous team, there are Blender animations, which are 3D animations, but HTML5 can partially replace Flash because it also supports video and audio, and tagging can be done in HTML5. So we are exploring HTML5 options extensively. Again, as we said, software like Blender can be used to develop animated applets. The next is to develop textual content, which will consume less battery power. Here we are talking about transcripts, which can partially replace the video content: if video consumes a lot of battery, the user can take in the same content through the transcripts. And the last is to take the help of natural language processing technology to develop content in the regional languages.
Extensive research on this has been going on at IIT Bombay, and we are taking the help of those departments to convert the transcripts we are developing in English into other regional languages. Very soon we should be able to offer them in 5 to 12 different Indian languages.
Now let us look at the digitization of books. The first thing, before digitization, is to settle the copyright issue, because we cannot digitize a book unless permitted by the publisher, or by the government if it is the publisher. If there is no issue, the process is very simple. We have seen that a lot of our school textbooks are available as PDFs, a non-editable format that you can read only as scanned images. So the process is: take any lesson or text you want to digitize and scan it. After scanning, use an OCR tool to convert it into editable text. OCR is optical character recognition; the tool converts the non-editable PDF file into editable text. This editable text is then converted into HTML pages. To help with the understanding of concepts or difficult portions, we have added tooltips, which are a feature of HTML. Just as a help menu on the web gives you additional information, the tooltip also provides additional information. Once the lesson is ready, you can put it on the tablet, and it becomes an interactive lesson.
Now let us take an example. First, a little recap: scan the lesson, then convert it into editable text using an open source OCR engine.
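The step from editable text to an HTML page with tooltips can be sketched as follows. This is only a minimal illustration, not the team's actual tooling: the function names, the glossary entry, and the sample sentence are hypothetical, and the sketch assumes the OCR step has already produced plain text. It uses the standard HTML title attribute for the tooltip; how the tooltip is triggered on a touch screen is not covered, and the word-boundary matching is only reliable for English text.

```python
import html
import re

def add_tooltips(text, glossary):
    """Escape text for HTML and wrap glossary terms in a span with a title tooltip."""
    if not glossary:
        return html.escape(text)
    lookup = {term.lower(): note for term, note in glossary.items()}
    # Longest terms first, so a longer phrase wins over a shorter one inside it.
    terms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )
    pieces = []
    last = 0
    for match in pattern.finditer(text):
        pieces.append(html.escape(text[last:match.start()]))
        note = lookup[match.group(0).lower()]
        pieces.append(
            '<span title="{}">{}</span>'.format(
                html.escape(note, quote=True), html.escape(match.group(0))
            )
        )
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)

def lesson_to_html(title, paragraphs, glossary):
    """Build a complete HTML page from a title and a list of plain-text paragraphs."""
    body = "\n".join(
        "<p>{}</p>".format(add_tooltips(p, glossary)) for p in paragraphs
    )
    return "\n".join([
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8"><title>{}</title></head>'.format(
            html.escape(title)
        ),
        "<body>",
        "<h1>{}</h1>".format(html.escape(title)),
        body,
        "</body></html>",
    ])

# Hypothetical glossary entry and lesson sentence, for illustration only.
glossary = {"propagation": "The way sound travels from one place to another."}
page = lesson_to_html("Sound", ["Propagation of sound needs a medium."], glossary)
```

Writing `page` to a file and opening it in a browser shows the glossary note when the pointer rests on the marked word.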
We have actually done research on a lot of engines. Unfortunately, we found that the proprietary tools do not do proper OCR for Devanagari or the Indian languages. We did find a free online OCR engine, at the link you see here, that could convert our non-editable Marathi text into editable Marathi text, and so I am giving the link; you might want to try it for any regional language. Then convert the text into HTML format, add the tooltips, which I already mentioned, and put the lesson on Aakash.
This is one example: we have converted a ninth-standard science lesson on sound. This is the English lesson, which we scanned, and similarly we scanned the same lesson in Marathi. These are the PDF versions of the two lessons. Then we added the tooltips, for example for the propagation of sound. When I went to a school, we found that many students for whom English is a second language have difficulty understanding some of the jargon and terminology, and therefore we decided to give tooltips in the lesson that provide additional information for those difficult words, terms or definitions.
So now you can see the lesson. This is the HTML lesson ported to the Aakash tablet. The advantage here is that the lesson has links for the subtopics or subsections. When you touch "Nature of sound", it takes you directly to that subsection of the lesson, and so on. If you touch "Marathi", you can switch between Marathi and English. The advantage is this: if you are in an English-medium school but English is not your first language, it is often difficult to understand the concept well.
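The subsection links and the language switch described above can be sketched in Python as below. The talk does not say how the actual lessons were built, so everything here is an assumption for illustration: the function name build_lesson_page, the section-N ids, the file name sound_mr.html, and the placeholder lesson text are all hypothetical.

```python
import html

def build_lesson_page(title, sections, other_lang_href, other_lang_label):
    """Return an HTML lesson with a contents list and a language-switch link.

    sections is a list of (heading, text) pairs.
    """
    nav_items = []
    blocks = []
    for number, (heading, text) in enumerate(sections, start=1):
        anchor = "section-{}".format(number)  # hypothetical id scheme
        safe_heading = html.escape(heading)
        # Contents entry that jumps to the subsection with the matching id.
        nav_items.append(
            '<li><a href="#{}">{}</a></li>'.format(anchor, safe_heading)
        )
        blocks.append(
            '<h2 id="{}">{}</h2>\n<p>{}</p>'.format(
                anchor, safe_heading, html.escape(text)
            )
        )
    # Link to the same lesson in the other language (a separate page here).
    switch = '<a href="{}">{}</a>'.format(
        html.escape(other_lang_href, quote=True), html.escape(other_lang_label)
    )
    return "\n".join([
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8"><title>{}</title></head>'.format(
            html.escape(title)
        ),
        "<body>",
        "<h1>{}</h1>".format(html.escape(title)),
        "<p>{}</p>".format(switch),
        "<ul>",
        "\n".join(nav_items),
        "</ul>",
        "\n".join(blocks),
        "</body></html>",
    ])

# Placeholder headings and text, not taken from the actual textbook.
page = build_lesson_page(
    "Sound",
    [("Nature of sound", "Placeholder text for the first subsection."),
     ("Propagation of sound", "Placeholder text for the second subsection.")],
    "sound_mr.html",
    "Marathi",
)
```

Touching a contents entry scrolls to the heading with the matching id, and the Marathi page would carry the mirror-image link back to the English one.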
So we are providing the additional information through your mother tongue, and gradually you will have lessons in multiple languages. This is just a demonstration: we have converted the lesson from English to Marathi and vice versa. You will find the same thing in Marathi. This is the Marathi lesson; if you want to see the English part, you can touch the English button on the Aakash tablet and view the English portion of the lesson.
Our next focus is converting videos into transcripts. You have seen an extensive session on Proximity, where the videos are showcased through the Proximity tool. Here at IIT we have loads of content from renowned professors: thousands and thousands of hours of videos in their respective fields. The problem is that the battery life of our Aakash is just 3 hours, and this much content is very difficult to fit on the Aakash. So we have come up with an alternative: if we convert all these videos into transcripts, it becomes easy to put them on the Aakash tablet in the form of text. This is the basic necessity, and because of it we have been doing extensive research on transcripts. The advantages are that a transcript occupies less memory, needs less battery power to view than a video, and helps in understanding the concept. Lastly, any time you are travelling, or want to conserve battery life, you can still read the transcripts on your Aakash tablet.
We are currently using the Dragon tool, which is a proprietary tool, to convert speech into text. How does Dragon work? Dragon takes two inputs. One is the acoustic input, that is, the way you speak, your speech directly. The second is text: the words, jargon and terminology. So how do you give this input?
What we do is take a lot of transcripts or conversations of the speaker whose lectures we will be converting into transcripts. For example, we are currently doing Professor Phatak's lectures. We have taken the transcripts of about 6 to 7 lectures in textual format, and that becomes the input for the Dragon tool. The Dragon tool takes the language input, from which it creates a dictionary, and the acoustic input, which comes through the speech, and then gives you the transcribed text.
The tool itself is a very complex thing. It works through mathematical and statistical models; one of the models used is the HMM, the hidden Markov model. This computation naturally needs a lot of memory. So if you intend to do transcript conversion, your PC will need at least 4 GB of memory. More is better, because it gives better output.
The computation takes place using the HMM, where the spoken words are broken into phonemes. What are phonemes? A phoneme is the smallest sound unit of a spoken word. The phonemes are then matched against words, which is called n-gram sequencing. I do not want to go into details, but the phonemes come from the acoustic model and the n-gram sequencing comes from the language model. What is n? n is any integer between 1 and 3. Take Aakash, for example, spelled a, a, k, a, s, h: a, a, k is my first group of three letters, and a, s, h is my next group of three letters. During the computation these two models are matched against each other to find the probable word and the probable sentence. So if you train the tool continuously, it becomes intelligent and starts providing better output.
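The n-gram idea can be illustrated with a small Python sketch. This is not how Dragon is implemented, which the talk does not describe in detail; it only shows, under simplifying assumptions, what letter trigrams of "aakash" look like and how counts of word pairs from a speaker's earlier transcripts could favour one candidate word over another. The training sentences, the candidate words, and all function names are hypothetical.

```python
from collections import Counter

def char_ngrams(word, n):
    """Return the overlapping character n-grams of a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

def train_bigrams(sentences):
    """Count adjacent word pairs (bigrams) in a list of training sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts

def pick_candidate(previous_word, candidates, bigram_counts):
    """Pick the candidate seen most often after previous_word in training."""
    return max(candidates, key=lambda w: bigram_counts[(previous_word, w)])

# Hypothetical sentences standing in for a speaker's earlier transcripts.
training = [
    "the sound wave travels through air",
    "a sound wave needs a medium",
    "the wave travels through water",
]
counts = train_bigrams(training)

# Sliding a window of three letters over the word gives four trigrams;
# the "aak" and "ash" groups from the talk are the first and last of them.
trigrams = char_ngrams("aakash", 3)

# Suppose the acoustic side cannot decide between similar-sounding words
# after "sound"; the word-pair counts from training break the tie.
best = pick_candidate("sound", ["wave", "wafe", "save"], counts)
```

The more transcripts go into `training`, the more word pairs have non-zero counts, which is one simple way to read the remark that the tool improves as it is trained.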
The minimum requirement is that you have to train the tool. For example, we are now training the tool on Professor Phatak's lectures. You cannot then use that training for any other professor, because the tool understands only the jargon, the way of speaking, and the way words and sentences are broken by that particular speaker. If you immediately try it with another person without erasing the profile created for the first person, the tool gets confused, and therefore one has to use the tool extremely carefully.
So, to recap, there are two parts. The language model: create a dictionary of the words, jargon and terminology used by the speaker, which the tool saves. The dictionary keeps growing as you create more and more transcripts, and the tool becomes more and more intelligent in understanding how the speaker speaks. The acoustic model: train the tool with the voice of the speaker for at least 10 minutes, and then convert either direct speech or the voice extracted from a video into a transcript.
So there are two ways to produce transcripts. One is to convert speech directly into a transcript. The second applies if you have videos, as we do: we have thousands of hours of videos, so we extract the voice from the videos, that becomes the input to the Dragon tool, and the output is the transcript. The transcript can then be used in whatever way you want for knowledge dissemination.
As we saw initially when we talked about proprietary tools, they are becoming extremely expensive for many schools and colleges. Therefore, in parallel, we have started research to build our own tool using the Sphinx 4 framework.
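As an aside on the second route mentioned above, extracting the voice from a video, one possible way to do it is sketched below. The talk does not name the extraction tool the team uses; this sketch assumes the open source ffmpeg command-line program is installed, and the file name lecture01.mp4, the function names, and the choice of mono audio at 16 kHz are illustrative assumptions only.

```python
import shutil
import subprocess
from pathlib import Path

def audio_extraction_command(video_path, sample_rate=16000):
    """Build an ffmpeg command that writes the audio track of a video to a WAV file."""
    video = Path(video_path)
    wav = video.with_suffix(".wav")
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video),         # input video
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # one audio channel (mono); an assumption
        "-ar", str(sample_rate),  # audio sampling rate in Hz; an assumption
        str(wav),
    ]

def extract_audio(video_path):
    """Run ffmpeg on one video and return the path of the WAV file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg is not installed or not on PATH")
    command = audio_extraction_command(video_path)
    subprocess.run(command, check=True)
    return command[-1]

# Building the command does not need ffmpeg; running extract_audio() does.
command = audio_extraction_command("lecture01.mp4")
```

The resulting WAV file would then be the input to whichever speech-to-text tool is in use.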
The Sphinx 4 framework has been developed by CMU, Carnegie Mellon University, and we have found it extremely useful for creating different types of tools. Research is currently on in our department, and we are taking the help of other departments as well, to develop the speech-to-text tool. Once the tool is ready, we might bundle it with other applications and provide it on the Aakash. Whoever wants to convert their videos or entire lectures into transcripts might then do so using our indigenous tool, which will be developed within a year's time.
Another focus is developing e-learning content for Aakash, and currently we are focusing on training modules for LaTeX. LaTeX is open source document creation software. It is not directly part of your education, but ultimately you have to document; knowledge dissemination happens through some kind of documentation. LaTeX being open source, we have created training modules for it, which will soon be ported to Aakash. Everyone can use them, and they will be a simple way to learn to use LaTeX to develop your own documents. The second focus is training modules for Android. Android is becoming the most popular operating system for tablets and mobile devices, so training on Android on the Aakash is becoming absolutely imperative. Again, our focus is to develop this Android training program and provide it on the Aakash tablet. One of our future plans is to develop Vedic mathematics modules, primarily for school children, to enhance their ability to do calculations through easy methods.
Our next focus is content development in the regional languages. As I said earlier, 80 to 90 percent of our population either do not speak English or have a very low understanding of it.
So our primary focus is how we can take the content currently available in English on the net and provide it in the regional languages, so that people have a better understanding and knowledge dissemination reaches every nook and corner of India. At IIT there is a big group working on natural language processing, and they have very recently developed Sandhan, a search engine developed by IIT in collaboration with other IITs. It is primarily for tourism and is based on natural language processing. You might want to use it, and maybe your parents, your colleagues and your friends might too. The holidays are approaching, so you might also want to see where you would like to go. The website is given here, it is H-T-T-P-W-W-W; you might note it down and do the search in your own regional language. It supports five different languages.
So we are taking their help, and discussions are currently on about how to convert the transcripts we are producing in English into these regional languages, not through direct translation but using this technology. In a year or two you will get the transcripts in multiple languages, between 5 and 12, and that is our primary goal.
With this I would like to conclude my session. We really feel that with the help of Aakash we can reach out to every nook and corner of India. It is the dream of India to become a powerful country by 2020, so let us fulfill that dream. Thank you.