Harish: Good afternoon, Dr. Anant. Before I start: I'm Harish Kashyap, a speaker here at ODSC, and Dr. Anant was the keynote speaker this morning. I'll let Anant speak for himself, but first, how I know him: my master's thesis was based on a paper of his, a single-author paper from Nuance Communications, from around 2005. It was a very fundamental paper that, I would say, revolutionized the art of system combination in speech recognition, a way to improve recognition accuracy. My advisor at BBN at the time, Spiros, came to me and said, "You should read this paper and start implementing it." I worked on it, and it became part of the basis for my later work on confusion network combination. So he is a very renowned figure in speech, and we are very lucky to have him here. I'd love to have him say a few words about himself: his journey, what made him pick speech recognition as his forte, and how his transition to AI more broadly happened.

Anant: Thanks, Harish, you're very kind. That was a piece of work I was quite proud of, but I wouldn't think about it in such grand terms as you described. So, my journey: I did my B.Tech at IIT Kanpur and graduated in '86. I remember that in 1985 we had the opportunity to do summer jobs, and I got to do mine at TIFR, the Tata Institute of Fundamental Research, in Bombay. There I worked with Professor Kuldip Paliwal, who was a speech recognition researcher. Thinking about him gives me goosebumps even now, because he was a mentor to me and somebody who made me very interested in this area of research, and in speech research in particular.
Anant: Now, Dr. Paliwal was an Indian researcher working at TIFR at the time, but he was world-renowned, and that made me feel that it doesn't matter where you're from. You don't need to be in America or England or wherever all this research is going on; here at TIFR was this guy doing fantastic work. He really inspired me. I then went to the US to do my PhD, at Rutgers. I got lucky in that Rutgers University was very close to Bell Labs, and I worked with a person from Bell Labs who came to give a talk at Rutgers. I was interested in what he was doing, he liked what I did for my PhD thesis, and I got to work at Bell Labs right after my PhD. So, a right-place-at-the-right-time kind of thing: I got an opportunity to work at Bell Labs, and that's how I really started working in speech.

Harish: Very interesting. Today you beautifully showed how speech recognition has evolved over time, how the word error rates have come down, with deep learning changing the game. In fact, I graduated at a time when the improvements were incremental, and now we're at a stage where we have very good word error rates. Do you see the future of speech as continuing that curve, reducing word error rates further? There are still many challenges that speech recognition systems face, from what we see.

Anant: I think in terms of word error rates, we have achieved fairly low rates on some tasks where in the past we didn't have such low rates, like the conversational speech recognition task we talked about. So one can argue that on some of these tasks we're already at word error rates where we can start thinking of interesting applications. But I think making these systems more real-time, making them more streaming, and also building speech recognition for human-to-human communication is the next area.
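As an aside for readers unfamiliar with the metric discussed here: word error rate (WER) is the Levenshtein (edit) distance between the reference transcript and the recognizer's hypothesis, counted in words and normalized by the reference length. A minimal sketch (the function and sample sentences are illustrative, not from any system mentioned in this conversation):

```python
# Word error rate (WER): Levenshtein (edit) distance between the
# reference and hypothesis word sequences, normalized by reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1   # match or substitute
            dp[i][j] = min(dp[i - 1][j - 1] + cost,        # substitution
                           dp[i - 1][j] + 1,               # deletion
                           dp[i][j - 1] + 1)               # insertion
    return dp[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words gives a WER of 1/6.
example = wer("the cat sat on the mat", "the cat sat on mat")
```

In practice, evaluation toolkits also report the substitution, deletion, and insertion counts separately, which this sketch collapses into a single distance.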
Anant: A lot of speech recognition systems today deal with humans talking to a machine. Many of the research problems people worked on in the past were about human-to-human speech, but we don't have many applications on that sort of data. Voice search, voice calling, navigation: it's all a human talking to a machine. But what about humans talking to humans, like recognizing meetings, recognizing conversations, and so on? I think there are lots of new dimensions there, and the problem can also become much more difficult from an acoustic and language standpoint.

Harish: That's really nice to hear. Do you see wins in AI helping speech? What I mean is this: speech algorithms are mostly machine learning methods, like Gaussian mixture models, acoustic models, hidden Markov models. But much of the time the question is whether context is taken into account when recognizing speech, the way humans do. When we recognize speech, we look at the context, the background, where we are, and we consolidate information beyond just the spoken text or the training methods. Do you see any effort in that direction? Do you see speech going that way?
Anant: Yes, I think it certainly will, and to an extent some of it has already gone in that direction. For example, in the Microsoft research I talked about this morning, where they brought the word error rate down to human parity, the language model they used actually takes into account not just the current sentence but the entire dialogue: what happened earlier in the conversation, what the context of the conversation itself was. That enables the system to predict the current word based on more than just the previous words in the sentence. So that is an example of already using long context to determine what the output should be.

Harish: And the last question: in your own journey from speech to larger non-speech problems, (a) were you involved in a lot of non-speech problems, I think you were, and (b) what would your advice be for people who have such transitions happening in their careers?

Anant: I have largely worked in speech recognition, but I did make forays into other areas, and right now at LinkedIn we're interested in speech, though we're not working on it as deeply as we did at other companies. I think of machine learning as an applied area: speech is one application, image recognition is another, fraud detection is another. When you think in terms of the mathematical underpinnings of these models, if you abstract the application away from the underlying technology, the LSTMs, RNNs, DNNs, whatever it is, then you can see how you can move from one to the other. Of course there's a lot of domain knowledge required in each domain, but you pick that up. I would just say: be confident about doing that, continue to learn, be willing to jump into new areas, and just try things.

Harish: Perfect, thank you so much. It's an honor to meet you, and I wanted to have you here.
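The long-context idea Anant describes, conditioning predictions on the whole conversation rather than a single sentence, can be illustrated with a toy cache-style language model. This is a simplified sketch, not the actual Microsoft system; the vocabulary and probabilities below are made up for illustration:

```python
from collections import Counter

# Toy "cache" language model: interpolate a fixed unigram model with the
# empirical distribution of words seen earlier in the dialogue, so words
# already used in the conversation become more likely.
def cache_lm_prob(word, dialogue_history, base_unigram, lam=0.5):
    """P(word) = (1 - lam) * base + lam * cache, cache from history counts."""
    if dialogue_history:
        cache = Counter(dialogue_history)[word] / len(dialogue_history)
    else:
        cache = 0.0
    return (1 - lam) * base_unigram.get(word, 1e-6) + lam * cache

# Made-up base probabilities and dialogue history.
base = {"weather": 0.01, "forecast": 0.005}
history = "what is the weather like today".split()

# With dialogue context, "weather" becomes much more probable than without.
p_with_context = cache_lm_prob("weather", history, base)
p_no_context = cache_lm_prob("weather", [], base)
```

Modern systems achieve this effect with neural language models whose input spans multiple previous turns, but the interpolation above captures the basic intuition: earlier parts of the conversation shift the probability mass toward contextually relevant words.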
Anant: Thanks, thank you so much, and thanks for inviting me to this conference.