Can we have Niharika with us? She'll be talking about supply chain bots for retailers. Yes, can you also add Shreedhar on live? Sure. Yeah, thank you. And we also have Shreedhar with us. Hi. Yes, Shreedhar, so we can get started. I'll just move on to the next slide.

Yeah, this is a quick intro about Niharika and myself. We both work for Tata Consultancy Services, and this is our brief profile.

This is going to be the agenda for the next 25 minutes. We will give a brief introduction to chatbots and how AI is playing a big role in retail these days, and to the specific industry use case we have taken, which is supply chain; we'll talk about that a little bit. Then we will cover the tech stack that has been used to build chatbots for retailers. The next section will be about all the ML components that go into building these bots: training data, data preprocessing and wrangling, intent recognition, entity recognition, annotation of the training data, enabling context awareness, how to wrap all of this as a service so it can be used by different users through different channels, and then how to retrain the models and encourage user feedback to improve the bot. That, in brief, is what is coming. Niharika, go ahead.

Yeah, the first question is: what are chatbots? Chatbots are artificial intelligence software, a piece of code that helps the computer understand human interaction, or human language, so that over a period of time it can become a virtual assistant helping human beings with various tasks. If you look at the evolution, it all started in the last few decades with scripted bots. These are nothing but rules engines.
We started with rules engines, and slowly, as Python and R came into the mainstream and quite a few algorithms became mainstream, we moved into intent recognition. At some point the bots were able to understand the intent of the user's question or query. Currently we are in the phase of virtual agents, where we can not only understand the intent of the question but also find the entities involved in it; we can connect to ecosystems of enterprise applications and understand enterprise context. So currently we are in the virtual agent space, and the next phase of chatbots is going to exhibit truly artificial intelligence, where they can respond like human beings, understand the context of the environment in which they operate, do self-training, and do self-healing as well. That is the ultimate phase of a chatbot. We are not there yet, but we are trying to get into that space.

So what drives the bot market? All of a sudden there is a bot phenomenon everywhere. We see these chatbots on e-commerce sites, on banking sites, on mutual fund sites; wherever you go, there is a bot popping up. It's due to the advances in AI and ML over the last decade or so, the democratization of cloud computing, with resources becoming available at a cheaper cost, and bot frameworks becoming available and mainstream. All of these are fueling the growth of bots. If you look at how to build bots, these are the top three players in the market: Amazon Lex, Google Dialogflow, and Azure Bot Service. Apart from these, there are a lot of open source frameworks and a lot of startups in this field, so it's quite an interesting space to watch.

Now, coming to the retail industry in particular: what is the AI addressable market?
In 2018 it was close to 10 billion; by 2025 it is projected to be 118 billion. How and why is retail leading the pack? Because of ultra-high competition among retailers for the mindshare and wallet share of customers, more and more differentiation is required to attract them, and a bot is one way to enable that: customer engagement, customer delight, an enhanced shopping experience. There is also the advent of omnichannel, where a customer can touch base with the retailer through the e-commerce website, a store, or a mobile app; so many options are available today. To enable seamless transactions and give the customer a better experience, bots are going to play a vital role. And there are quite a few business processes that can be leveraged through a bot; some examples are inventory tracking, item tracking, and order tracking.

Next slide, please. Here is a typical supply chain, with the raw material on the left-hand side and the customer on the right-hand side. In between those two there are any number of steps, and lots of business processes are involved to convert that raw material into a finished product, ship that product to the distribution center, and finally get it to a retail aisle where the customer is going to purchase the item, or to an online store; either way, it goes all the way from raw material through the distribution center. So the supply chain is a rich repository of the business processes and information available in any enterprise, and that is the reason we have chosen this particular business process for our bot. There are a lot of sub-processes within the supply chain, and some of the key ones are listed here: customer engagement, customer order processing, warehouse management, store replenishment. These are some of the key processes through which the item moves, and there are a lot of moving parts.
Invariably, at an enterprise level, there is a whole lot of information available in all of these sub-business areas.

So now we have seen the importance of the supply chain, and we have seen the bot; these are the building blocks of the bot. The bedrock is going to be natural language processing. NLP is a domain of artificial intelligence that helps computers understand the natural language used for interaction between human beings; in this case it is going to be predominantly English, though of course multilingual support is becoming available these days. The other building blocks are intent recognition — what is the intent of a user utterance? Is he talking about subject A or subject B? — and then named entity extraction. Entities are nothing but values embedded in an intent. Take "I want to book a ticket to go to New Delhi": we understood this is a travel-related intent, and the entity is Delhi — the user wants to go to Delhi. When he wants to go, and how he wants to get there — all of these are entities. The next building block is guided conversation, or conversational UI. This is about contextual awareness: you have to hand-hold the conversation with the user, and for a period of time the subject should be held and not lost. For example, if the user is talking about travel, asks a lot of interesting questions about the destination, and finally says "book a ticket," we should still be able to hold on to the topic and understand that it is the same destination and the same intent. That is the conversational part: guiding the conversation. And then we have voice recognition. The utterance need not always be text.
We do support voice: the customer can speak, the speech is converted into text and fed into the back-end algorithms, and we can return a response as well. Finally, we can enable multilingual support, either in text or in voice. All of these put together give you a holistic chatbot solution.

Next we have the tech stack; these are quite common components that have been used to build the bot. We have Python 3.x as the base, with Flask, spaCy, and fastText among the key packages, and on the right side you have cryptography, SMTP, JSON, and so on. Nowadays open source technologies are gaining ground. You do have commercial software to build bots, but open source technologies, run in-house, are being aggressively adopted these days. The prime reason is that enterprise data must be protected: quite a few retailers are not comfortable sharing their enterprise data and knowledge outside the organization, so there is some stigma when it comes to transferring that data and knowledge to the cloud. A lot of in-house hardware is available, and open source technologies are leveraged widely to build these chatbots.

Here is the architecture. This is a high-level view: on the left-hand side.
You can see the utterances coming in as questions, and the next component is a query-and-response parser. That is an intelligent parser which can parse both the question and the response and route it to the inference engine. The inference engine is where the intent of the utterance is found, and once it is found, it hits the relevant knowledge-base model. These are the models that give the answer to the user query whenever it is fact-driven. And if there are queries that need to be routed to enterprise systems to get answers — for example, "do we have stock on hand for this particular item?" — for that you have to reach the relevant application database and get the answer back. In those scenarios you have to find the entities embedded in the question, hit the relevant web service, reach the database, and bring the answer back. The response parser in turn processes the data and constructs a suitable response for the user based on his access levels, and that is delivered as the response to the user.

So this is about the training data. We have domain-specific data: these days any retailer's bot needs to understand enterprise-specific data. It is not a generic set of questions or generic pieces of information that you will find here. From here we start the actual chatbot-specific subtopics, including the internals of the chatbot. I will hand over to Niharika, and she will take us through the remainder. Right, thank you.
Thank you, Shreedhar. What you've seen in the last 10 slides is the general architecture: what the chatbot is all about and how it fits into this use case. And now that we've seen what the training data is: these are the different types. When you have an item, you have a set of metrics which you need to process, so all of these are valid questions which the user can ask and the chatbot needs to respond to.

In any machine learning pipeline, the first step is to collect data, which is what we just saw. The second is to preprocess it: expanding acronyms, stripping punctuation, converting everything to lowercase. Some of the common packages we use for this are just string-based packages. But one thing we also need to look at is spell checking. It is human practice to make a small spelling mistake, but you would want the chatbot to understand that spelling mistake, correct it, and give back the right answer. There is an open source tool called SymSpell, which loads a dictionary and helps you get the top, or closest, recommendation; it works on the concept of edit distance and the Levenshtein distance algorithm. Everything you're going to see in the next few slides is open source tools and packages, which worked for us.

What SymSpell has is a frequency dictionary, which holds words and their frequencies. Why this helps: let's say you have domain-specific jargon, words that are not regular English dictionary words. This is where a frequency dictionary that you create from your own corpus helps. What you see here is a small snippet in which the word "remember" has been misspelled.
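This edit-distance lookup is easy to sketch in plain Python. The snippet below is a minimal illustration of the idea, not the speakers' actual code: a Levenshtein distance function plus a made-up frequency dictionary, picking the closest word and breaking ties by corpus frequency (a real deployment would use the symspellpy package for speed).

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: each insert/delete/substitute costs 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

# Toy frequency dictionary built from a domain corpus (counts are made up);
# jargon like "replenishment" lives alongside everyday words.
FREQ = {"remember": 500, "replenishment": 120, "shipment": 300, "supplier": 250}

def correct(word: str, max_edits: int = 2) -> str:
    """Return the closest dictionary word; ties go to the more frequent one."""
    best = min(FREQ, key=lambda w: (edit_distance(word, w), -FREQ[w]))
    return best if edit_distance(word, best) <= max_edits else word

print(correct("remeber"))         # -> remember
print(correct("replenishmant"))   # -> replenishment
```

Because the dictionary is built from your own corpus, domain jargon is corrected toward domain words rather than toward generic English.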
There's an extra E in there. What the tool does is search through the dictionary on the basis of edit distance, which means that anything I do — adding a character, deleting a character — counts as one operation with a cost equal to one. On top of that edit-distance concept, the verbosity setting lets you list the closest recommendation, or all recommendations, from the spell check; you take the closest one and change the user query so that your chatbot can handle the question and not throw an error.

The second part we saw is intent recognition. How should I explain it — it's about recognizing the purpose of a user's query. You can have different types: you want the chatbot to respond to greetings; you want it to answer what a metric means, which is your metric definitions; it can fetch values from a web service or a DB — those are domain-specific questions — and it can help you resolve your level-one tickets, where a user is facing some error and says, "I want a solution to this." These are all different intents, the different queries which the user can throw at the chatbot. Intent recognition maps each query to a particular intent, and each intent then leads to a particular action from the chatbot.

One open source package you can leverage here is fastText. fastText was built by Facebook AI Research and is a library for efficient text classification. What you see here is basic sample training data of the kind fastText requires. You need to give the different intents — those are your labels — and all kinds of
So those are basically your labels and you also need to give all kinds of Inputs and all kinds of which are which need to map to that particular intent So all readings all different types need to map to the same intent. So once you do that Using fast text is pretty easy There are it's basically a supervised learning which you can leverage and you can give your file name as your input You can tune your hyperparameters based on your epoch based on the n-grams Based on your learning rate and then so what this really outputs is you save your model and you load your model and you Send your input text. So what this helps is it gives you the greeting and once you have the reading you can then customize your fast text or you can customize your chat board to handle those labels and those intents, right? So this is how you get started with fast text and Once we have an intent in place We then move on to actually extracting entities from them. So some of them Let's say a metric definition don't need to you know, you just have a Pointed question and a pointed answer But what if you need to extract certain entities be it a supplier's name or be it a certain metric like units dollars or Percentages. So this is the whole concept where named entity recognition Comes into picture. 
What you see on the right here is spaCy, again an open source package that helps you extract these named entities. It has pre-trained models which will extract person, organization, country, date, time, percentage — all of these are built in. But since we are talking specifically about a chatbot for retailers, what you can do is build your own custom model, giving your own training data, created in such a way that you provide the positional indexes of the entities. What you see here is the kind of input data required by spaCy specifically: you enter your statements and give the positional indexes. If you want to extract "shirt," the position of "shirt" is from the 21st character to the 26th character, and the entity label is your product name. After you give, say, thousands and thousands of inputs like this, you train your NER model using the spaCy open source tool, and it will extract those particular values so you can use them as parameters when fetching answers from your web services — when you have an item number or a DC, this is what you have to do. This whole thing — creating the training data, annotating it, splitting it into training, validation, and test sets, and loading the model for prediction — has been automated. So now our training data is ready. This is just a brief overview of how spaCy works: you have your different labels.
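The positional indexes in that annotation format are easy to get wrong by hand, so one common trick is to compute the character offsets programmatically. A small sketch in the spaCy-style `(text, {"entities": [(start, end, label)]})` format — the sentence and the PRODUCT/DC labels are made-up examples; note that "shirt" lands at characters 21 to 26, as in the slide:

```python
def annotate(sentence: str, spans: dict) -> tuple:
    """Build one spaCy-style training example.
    spans maps an entity label to the exact substring to tag."""
    entities = []
    for label, substring in spans.items():
        start = sentence.find(substring)        # compute offsets, don't hand-count
        if start == -1:
            raise ValueError(f"{substring!r} not found in sentence")
        entities.append((start, start + len(substring), label))
    return sentence, {"entities": entities}

example = annotate("Do we have stock for shirt at DC 42",
                   {"PRODUCT": "shirt", "DC": "DC 42"})
print(example)
# -> ('Do we have stock for shirt at DC 42',
#     {'entities': [(21, 26, 'PRODUCT'), (30, 35, 'DC')]})
```

Generating the offsets this way is what makes it practical to produce thousands of annotated examples from templates rather than labeling each one manually.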
You give your epochs, learning rate, and so on, and you train your model in such a way that those entities are extracted.

The next part: now that we have our intent and entities extracted, and we know our chatbot is able to pull the relevant information out of the query the user has entered, we come to context awareness. For example, if the user asks, "Who is the president of India?" and then asks, "How long is his term?", the chatbot has to understand that we are still talking about the president of India. Retaining the user's queries and entities is where we bring in context awareness. One way to implement this is with Flask sessions. These are basically key-value pairs: you use your session ID as the key and map all of your session data and user input to it. It is a slot-based logic through which you can check the mandatory entities and respond back to the user if certain entities for the question have not been filled. This context is kept until the user logs out of the application.

Right, so we come to the last stage of the whole thing. Your machine learning model is ready, your query handling is done, you have your intent and NER in place, and you have context awareness in place. Now we come to the part where you deploy and expose your chatbot as a web service.
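That slot-based, session-keyed logic can be sketched without any framework as a plain dictionary keyed by session ID — in the real service, Flask's `session` object holds this state. The slot names and prompts below are invented for illustration:

```python
# session_id -> collected slots; Flask's session object plays this role in the app.
SESSIONS = {}

MANDATORY = ("item", "dc")   # slots required before we can answer a stock query

def handle(session_id: str, new_slots: dict) -> str:
    """Merge newly extracted entities into the session; answer or re-prompt."""
    slots = SESSIONS.setdefault(session_id, {})
    slots.update(new_slots)                      # context carries across turns
    missing = [s for s in MANDATORY if s not in slots]
    if missing:
        return f"Which {missing[0]}?"            # slot-filling prompt
    return f"Checking stock for {slots['item']} at {slots['dc']}."

print(handle("u1", {"item": "shirt"}))   # -> Which dc?
print(handle("u1", {"dc": "DC 42"}))     # -> Checking stock for shirt at DC 42.
SESSIONS.pop("u1")                       # logout clears the context
```

Because the slots persist between calls for the same session ID, the second turn can omit the item entirely and the bot still knows what is being asked about.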
This is where you can leverage Flask. It interacts with your UI — an Angular or React UI — through a basic JSON request and response, and it helps you get the input from the user, hit the relevant web services to extract the information, maintain your session and context awareness, and then give the response back to the user.

The last part: every solution always has scope for improvement. What you can do, and what we've also done, is collect user feedback at different levels: a thumbs-up/thumbs-down at every reply, feedback at the user level, and feedback at the whole-solution level. This helps us understand which responses were right and which were wrong, understand the user and business requirements, and it leads us to keep enhancing our chatbot model. There are different prediction accuracies, based on whether your intents are right and on your NER models, and this is what drives model retraining. You can choose whether you want to add data, bring in new intents, or bring in new entities.
You may want to do some hyperparameter tuning and improve your model as the business requirements evolve and as users use your chatbot. So we come to the last part. Shreedhar, would you like to take over the conclusion?

Yeah. So now we have seen all the building blocks and how we can build the bot. If you want to replicate this, here are certain tips. First and foremost, leverage NLP and Python. Then you need to integrate with your own enterprise supply chain systems — the ecosystem of supply chain applications. That is very much required, because the information is available in a lot of databases and the relevant tables there. One can also leverage a lot of pre-trained ML classifiers, be it a traditional classifier or a deep learning model, in the form of a TensorFlow or a PyTorch model; now we also have a lot of transformer-based classifiers for a high level of accuracy, especially when it comes to text processing and context awareness. As for the prime purpose of a bot: in addition to all the application systems — reporting systems, dashboards, weekly and monthly reports, the many transaction systems that come with embedded reporting — the user is able to get to that information, but it takes a lot of time to sift through and reach a particular piece of it. The prime use of a bot is to support a pointed question and return a pointed answer; how quickly and seamlessly you are able to get there will determine the success of the bot. Then, enable the capability for the model to relearn from its mistakes so that it continuously improves — that is what Niharika spoke about in the previous slide on model retraining and the reasons a model goes stale. And finally, it has to exhibit context awareness: it has to hold the pieces of information in its memory as long as the customer is staying on a particular subject or domain; it cannot drop the ball abruptly. And then,
finally: in this digital world everybody is on mobile, and it takes a lot of clicks to reach the information they need. The other prime objective of the bot should be to remove that number of clicks and reach the piece of information the user is looking for seamlessly. If you can build a bot that exhibits these things, the prospects of that implementation are quite bright. With that, we will stop our presentation, and we are open for questions.

Hey, thank you, Niharika and Shreedhar. That was a wonderful presentation, and I certainly learned a couple of things about chatbots, about intents, slots, and a whole lot more. I do have questions for you. In the conclusion you touched on how transformer models can be used in NLP. I see that there are quite a few of them coming up — BERT, XLNet, and many more. What are your thoughts on all of these models?

Yeah, see, these transformer models are all very complicated neural network architectures, and they come with millions of weights and biases, so there will always be a technical challenge. The common example would be BERT, and under the transformer category we have a lot of variations of BERT, like ALBERT and DistilBERT, and many variations with different improvements. All of these help improve the contextual mechanism by which a bot can correlate different entities, or different parts of the intent, and seamlessly understand. As a human being, one can understand a piece of conversation from the examples Niharika or I gave; for software to understand it, that is where these attention mechanisms come in, and they will certainly aid in improving the contextual capability of the bots. Yes, but it comes with a cost.
You need the relevant hardware to run all those transformers and realize their potential. Yeah, and I think attention algorithms are another area where the NLP community is building a lot. Yes.

We do have a couple more questions, so let's move on to them. Do you recommend any open source parsers — ones with custom parsing specific to each domain?

Yeah. The parsers we have built for the retail domain are all custom, because we don't have generic questions like a travel app or a visa-booking app, where the subject is quite restricted. When you get into a retailer's enterprise, there is a lot of technical jargon, a lot of acronyms, and a lot of company-specific terminology which is used extensively to communicate. So predominantly we use custom-built parsers; a predefined or readily available parser may not be able to support that. That has been our experience.

Just to add one thing to that: in the slides I showed you a few training data examples, with recommended shipments and certain very specific suppliers and DCs. These are, like I said, not part of your regular vocabulary. So a custom-built parser, like Shreedhar said, helps us get what the user wants. It's a very specific chatbot, not something all retailers can leverage as-is.

Right, right. So SME help is essential. Absolutely.

Moving on to the next one: where do we get domain-specific data for training? I think you kind of answered this question already. Yeah — that is the question for any machine learning use case; it is always the billion-dollar question. Nobody gives you data. If you're implementing for a particular client, be it a retailer, a banking customer, or a travel customer, you have to rely on the enterprise data.
That is where a ton of data is available, and that is also where it comes with more scrutiny: they are not ready to part with that data so that you can take it to the cloud and do something with it. That is where their enterprise knowledge lives, so it has to be the enterprise data; there is no enterprise data available open source for anyone to play around with.

One thing you can do — remember the initial training data slide we had — is this: you don't get thousands and thousands of examples from the client; you only get a few specific questions, let's say 20 varieties. What you can do is leverage Python to augment that. An automated Python script is what we used, and you can make the dataset big enough that it actually has an impact on your machine learning models. That's one way you can explore it.

Yeah, moving on to the next one — we actually have a whole line of questions; let me just pick one or two of them. The next one is: how will the feedback be used to improve model performance? I think this is a very important one, because to improve your model you need to be able to capture feedback. Could you shed some light on this? Yeah, yeah — Niharika, shall I go ahead with this?
Yes. Yeah, so we can enable feedback either at the individual response level of the bot or at the entire session level — those are the two levels at which you can enable feedback. Once you get that feedback, you can put it in a Flask dashboard, drill down, and come up with various metrics showing which sorts of questions are getting thumbs up, which are getting thumbs down, and how frequently — all sorts of analysis you can do with this information. Then the onus is on the machine learning team, or the data science team, to revisit the model and see which aspects of it are weak, so that the bot's responses are better appreciated. It enables a feedback loop in which you revisit your model and fix certain aspects of the entire pipeline where you can improve the response. It is a continuous process, not one-time work, until the model stabilizes.

Thank you. So let me move on to the last one. This one is again quite an important one: in production, what is the storage strategy for the models? Do we use pickling, or do we use any other tools for storing the models?

Yeah, see, as we get into a deep learning world with all these transformer models, pickling is no longer in the picture. Pickling is all for smaller-size models; when we talk about gigabyte-sized models, there is always a challenge.
So in production, I think, models are currently not generally published to Git — only your code gets published to Git. For the models you should have a separate repository, other than Git, somewhere you can move them; or there are instances where certain clients enable model building in production itself. Those are the two strategies available as of now, especially when the model sizes are big. When the models are small, pickling can certainly be done, and then they can go through your regular CI/CD pipeline. But when the model sizes are big, your CI/CD pipeline doesn't work, especially for the model deployment.

So thanks a lot, Niharika and Shreedhar, again, for the wonderful talk — quite insightful. Yeah, thank you.
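A footnote on that last answer: for the small-model case, the pickle round trip mentioned above is only a few lines. This sketch uses a stand-in dictionary rather than a real trained model, and serializes to an in-memory buffer; in practice you would write a `.pkl` file that ships with the release artifact through CI/CD.

```python
import io
import pickle

# Stand-in for a small trained model (a real one might be a scikit-learn estimator).
model = {"intent_labels": ["greeting", "order_status"], "threshold": 0.7}

# Serialize; in production this buffer would be a .pkl file in the release artifact.
buf = io.BytesIO()
pickle.dump(model, buf)

# Deserialize at service startup.
buf.seek(0)
restored = pickle.load(buf)
print(restored == model)   # -> True
```

For gigabyte-scale transformer weights, as noted in the talk, this approach gives way to a dedicated model store or registry outside the code repository.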