Okay, it's time. So welcome everyone to the last interactive session, delivered by Alex and Dvir. This is probably the longest title of the entire EuroPython 2021: "The Pattern: Machine Learning, Natural Language Processing meets VR/AR", with the subtitle "deploy large NLP models, create knowledge graphs and build new types of interfaces". In this session, Alex and Dvir will walk us through three machine learning NLP pipelines within one hour. That is a lot to cover, which means I should not eat into this one-hour session. Take it away, Alex and Dvir. Thank you, Raquel. Thank you everyone for joining. The Pattern is an open-source project, so everything we are going to show today is open on GitHub; if you can't follow the slides, there are code references. On the agenda, we are not going to talk about the virtual reality and augmented reality part of the project, because it's JavaScript-based and we are at a Python conference. But there is a talk at RedisConf 2021 which covers how to build virtual or augmented reality using Three.js on top of RedisGraph. As you can see in the agenda, we are going to cover Redis, RedisAI, RedisGears, all of this. Then we'll review a simple natural language processing pipeline based on RedisGears, and then we'll deep-dive into one of the most sophisticated pipelines you can build for natural language processing, on RedisAI and RedisGears. We do expect to overrun, and we have booked an open-space room for the next hour, so we will be sticking around after the session to answer any questions or queries. About myself: my name is Dr. Alexander Mikhailov. I'm a geek on the data architecture team at Nationwide Building Society. This is my fun project, which I'm doing outside of my employment, with the full permission of Nationwide Building Society. During the daytime I work on synthetic data, data privacy and digital twins, but natural language processing, search engines and AI are my fun and passion.
Over to you, Dvir. Hi, I'm Dvir. For the last two, almost two and a half years, I've been working for Redis Labs in the CTO team, called the innovation or incubation team of Redis Labs. Currently I'm working on machine learning and AI related projects, including RedisAI. Previously, during my first year and a half at Redis Labs, I worked on RedisGears. And before that I did my Master's at the Technion. Thank you. So, what is the problem? At the beginning of 2020, the medical profession met a new challenge: where previously the number of publications about COVID — caused by the virus now known as SARS-CoV-2 — was about two per month, they were suddenly facing more than 300 articles per day. What I'm trying to do is build better tools to help medics and other knowledge-management professionals navigate such a flow of information. And so the project Pattern was born. Pipelines of ever-increasing complexity help the community find relevant knowledge using artificial intelligence and novel user-experience elements. It's all powered by Redis; there are no other databases in the project at all. It's all on Redis and Redis modules. To visualize what we mean and put the initial natural language processing and machine learning pipeline in context, I'm going to do a very quick demo. Yeah. So this is the demo server, the demo page. The first thought is that different roles in the medical profession have different requirements for working with information. A medical student can be comfortable using a mouse and keyboard, and has a nice PC for a three-dimensional visualization — or we can even make a step into virtual reality, where you will be hand-waving information around, similar to Minority Report or Johnny Mnemonic. So that's the interface we propose for the medical student.
It takes a long natural language processing query, and the output is a three-dimensional graph, built on the premise that you want to explore the visualization space. The whole purpose of this visualization is to highlight hidden constructs. So for example, 'temperature' is one of the super-nodes. The nodes are medical terms, mapped to a unified medical dictionary; the edges are observations that two of those medical terms met in an article. I'll go into slightly more detail later. The important point to highlight here is that 'affect' and 'falls' came out of the Unified Medical Language System. This is the title, and this is a summary built by the T5 model, so you don't have to read the whole article — it does a summary. And as you can see, I've populated new articles for which no summaries have been built yet; if we have enough time, we can cover the actual building of the summary — live coding, very dangerous at a conference. So, let me go back to my slides. The foundation of all of it is Redis, which stands for REmote DIctionary Server, and this is where Dvir will take us through what Redis and Redis modules are. Okay, so we're going to lay the foundations of the common knowledge we need for the further slides in this talk. First we need to understand what Redis is: Redis is an open-source, in-memory data structure store. We can refer to it as a key-value database, and it's used as either a primary database, a cache layer or a message broker. It provides several data structures out of the box, such as strings, hashes, lists, sets and sorted sets, and you can execute bitmap, HyperLogLog and geospatial-index operations, as well as use streams. So, as I said before, Redis is basically a key-value database, and the majority of its commands take a command name, a key, and a value or multiple values.
In this simple example over here, the command is the first argument, which is SET; the name of the key is 'key' and the value is the string 'value'. The next command we're going to execute is GET on the key, and we get back the string value. This is important because Alex is going to show you some Redis commands being used in the next few slides. The other thing we need to discuss — which will also be demonstrated on the next slide — is sharding, which is how Redis scales out. Data is split, or sharded, according to hash slots. For a single Redis instance we get 16,384 hash slots, to which keys are mapped according to a CRC16 hash function. When we want to scale up or out, we start using multiple instances of Redis in a cluster, where each instance is responsible for a continuous range of hash slots, and the ranges do not overlap. In this example, we see that we can get a key's hash slot by running the command CLUSTER KEYSLOT, and that we can tag different keys with the same hash tag, so that we get the same hash slot for both keys. Okay, that was Redis in one minute. The next thing we're going to discuss is Redis modules. We can think of modules as extensions or plugins for Redis that extend its functionality: they can expose their own data structures and their own commands, but they still run on Redis and still enjoy Redis's in-memory capabilities, using Redis's own infrastructure for communication, consistency, etc. Now we're going to describe some of the modules Alex has used in his project. The first one is RedisGraph. RedisGraph is a property graph database, where the entities are nodes and relationships, and each entity can have its own properties.
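The slot mapping described above can be sketched in a few lines of Python. This is a minimal, illustrative re-implementation of the CRC16 (CCITT/XModem) checksum Redis Cluster uses, plus the hash-tag rule — not the actual Redis source:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem), the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 hash slots, honouring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag: hash only the tag
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot (and therefore on the same shard):
print(key_slot("{user1}.name") == key_slot("{user1}.email"))  # True
```

This is why, later in the talk, Alex can co-locate an article's tensors and its cached answers: tagging both keys with the same `{tag}` pins them to one shard.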
It implements the openCypher query language, and we have an example over here: we want to return every person that the person named Tom knows within a distance of 1, 2 or 3 hops from him. We get the illustration on the right-hand side, which is the query output presented in a graphical manner. Under the hood, those kinds of traversals use GraphBLAS, which is a linear algebra library for sparse matrix multiplication. And of course it is again a Redis module, a plugin for Redis. Then we have RedisAI. RedisAI is a Redis module that exposes a new data structure, the tensor data type, and enables deep learning and machine learning model execution on CPU and GPU. It effectively makes Redis an inference server for several machine learning frameworks. What is going on inside RedisAI is that it embeds the PyTorch, TensorFlow and ONNX Runtime frameworks for model execution. It also supports TorchScript for script execution with PyTorch. It supports multiple devices, CPU and GPU, and it takes tensors as input and returns tensors as output for each of the frameworks we support. Next, RedisGears. RedisGears is a module that exposes a serverless-engine capability for multi-model and cluster operations on Redis. It supports event-driven and batch operations, and it is agnostic to how you run Redis — whether on a standalone instance, a cluster, or what we offer at Redis Labs, which is an enterprise cluster. It has a built-in coordinator for cluster support; under the hood, as we will see on the next slide, it runs map-reduce operations, and it moves the data wherever it is needed for the execution. In this illustration, in the red rectangle, we see the underlying core of RedisGears, which handles the cluster management.
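The query text from the slide isn't captured in the transcript, but an openCypher query matching the description — every person Tom knows within 1 to 3 hops — could look like the following. The `Person` label and `KNOWS` relationship type are assumptions; in RedisGraph it would be sent as `GRAPH.QUERY social "<query>"`:

```cypher
MATCH (tom:Person {name: 'Tom'})-[:KNOWS*1..3]->(p:Person)
RETURN DISTINCT p.name
```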
It knows where and how to run the functions that you register — 'recipes', as we call them. It has the execution manager, which actually schedules function execution. All of this functionality is exposed through a C API, so we support registering functions and executing them from C, and we have a Python binding on top of that. So you can actually submit Python scripts with RedisGears capabilities via the Redis API — including whatever libraries you need in Python. Both myself and Meir, the architect of our team, had the pleasure of supporting Alex in his project, and we are happy about that. And with that I will turn it back to Alex. Thank you. As you can see, saying I "just used Redis" is a bit tongue-in-cheek, because I used half of the modules of the Redis ecosystem. One bit worth mentioning is that RedisGears supports Python, it supports Java, and there may be other languages coming. RedisGears is an absolutely amazing module for data scientists: it's in-memory storage, pretty much infinitely scalable if it's a Redis cluster, it has data sharding built in, and you process data where the data lives, without the need to move it in and out. And it's actually quite small: Redis in cluster mode with RedisGears and Python consumes about 20 megabytes of RAM — think how much more data you can process if your underlying infrastructure is memory-efficient. To explain what you saw in the demo — where those nodes came from — I have to explain a bit how to build a knowledge graph, how to turn text into a knowledge graph. This is the foundation of the first pipeline, which uses one of the oldest natural language processing algorithms: the Aho-Corasick automaton. I assume that when you're writing a text, two concepts appearing in the same sentence are related to each other.
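To make the recipe idea concrete: a RedisGears recipe chains steps like `map` and `countby` onto a reader, and is shipped to the shards with `RG.PYEXECUTE`. The sketch below mimics that builder shape with a tiny local stand-in class — the real `GB` (GearsBuilder) is provided by the RedisGears runtime and reads from the keyspace; everything here, including the article records, is illustrative:

```python
class GB:
    """Local stand-in for RedisGears' GearsBuilder: same chaining shape,
    but it runs over an in-memory list instead of the Redis keyspace."""
    def __init__(self, reader="KeysReader"):
        self.reader, self.steps = reader, []

    def map(self, f):
        self.steps.append(("map", f)); return self

    def countby(self, f):
        self.steps.append(("countby", f)); return self

    def run(self, records):
        out = list(records)
        for kind, f in self.steps:
            if kind == "map":
                out = [f(r) for r in out]
            else:  # countby: group and count, as gears would do cluster-wide
                counts = {}
                for r in out:
                    counts[f(r)] = counts.get(f(r), 0) + 1
                out = [{"key": k, "value": v} for k, v in sorted(counts.items())]
        return out

# Hypothetical article records; a real recipe would read hashes from Redis.
articles = [{"lang": "en"}, {"lang": "en"}, {"lang": "de"}]
result = GB("KeysReader").map(lambda r: r["lang"]).countby(lambda lang: lang).run(articles)
print(result)  # [{'key': 'de', 'value': 1}, {'key': 'en', 'value': 2}]
```

The point of the builder shape is that each step is a small pure function, which is what lets RedisGears distribute the same recipe across every shard.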
The point of all this processing is that I want not strings in my search results, but things, as the medical community understands them. That's where I map concepts to a Concept Unique Identifier (CUI) in an external medical dictionary. And that's step two: as you can see, there's a concept and its concept identifier, and then two concept identifiers are related through a sentence. Here is an example at the lowest level of how it looks: there is a concept identifier for 'transmission', there is another concept identifier, and there is a sentence which connects them together ('a rate of transmission in…'); as you can see, stemming and the rest of it happens automagically. And then it's all stored in RedisGraph. So this is the way of turning a bunch of text into a knowledge graph, and this is the simplest pipeline in the Pattern project. It's the first pipeline I built, about a year ago, as part of the Redis hackathon 2020. The input is the CORD-19 documents — a Kaggle competition dataset released at the beginning of 2020. It contains about 50,000 JSON documents in different languages, some of them not spell-checked; some of them are OCR'd into JSON. That's why the steps are: we create streams, we pass them through language detection and spell check, then we split into paragraphs, and then we feed it all into RedisGraph. On top, I have a Flask API which allows you to query the graph. As I mentioned, it all runs inside RedisGears, so all those steps are fairly small, tiny Python scripts, which then scale using RedisGears across all shards. Obviously there is a pre-step where I read all the UMLS tables and build the automaton; then the matcher runs Aho-Corasick automaton matching inside RedisGears and produces streams, which are then saved into RedisGraph. So that's the simplest pipeline.
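As a rough illustration of the matcher step, here is a compact, dependency-free Aho-Corasick automaton that scans a sentence for dictionary phrases in a single pass. The phrases and the CUI-style labels are made up for the example; the real pipeline builds its automaton from the full UMLS tables:

```python
from collections import deque

class Automaton:
    """Minimal Aho-Corasick: match many dictionary phrases in one pass."""
    def __init__(self):
        self.goto = [{}]  # trie edges per node
        self.out = [[]]   # labels of phrases ending at each node
        self.fail = [0]   # fallback links for mismatches

    def add(self, phrase, label):
        node = 0
        for ch in phrase:
            nxt = self.goto[node].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto[node][ch] = nxt
                self.goto.append({})
                self.out.append([])
                self.fail.append(0)
            node = nxt
        self.out[node].append(label)

    def build(self):
        q = deque(self.goto[0].values())  # depth-1 nodes fail to the root
        while q:
            node = q.popleft()
            for ch, nxt in self.goto[node].items():
                q.append(nxt)
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                self.out[nxt] += self.out[self.fail[nxt]]
        return self

    def match(self, text):
        node, hits = 0, []
        for ch in text:
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            hits.extend(self.out[node])
        return hits

# Hypothetical concept dictionary; real CUIs come from UMLS.
ac = Automaton()
ac.add("transmission", "CUI:C_TRANSMISSION")
ac.add("fomites", "CUI:C_FOMITES")
ac.build()
print(ac.match("a rate of transmission via fomites"))
# ['CUI:C_TRANSMISSION', 'CUI:C_FOMITES']
```

The automaton is built once (the "pre-step" above) and reused for every paragraph, which is what makes it cheap enough to run inside a RedisGears recipe on each shard.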
Now we're going to talk a bit about modern, state-of-the-art machine learning models. If you're a data scientist or in the natural language processing community, you've obviously heard about BERT — Bidirectional Encoder Representations from Transformers. It's a machine learning model created by researchers at Google AI Language, and it's state of the art in a variety of NLP tasks, including question answering, sentence prediction, text summarization, text classification and many more; there is a link to a good write-up on the Towards Data Science blog. What we are going to use the model for today is question answering. There are other machine learning models which have already superseded it, but this is the one I could deploy. The query processing and the first API call are exactly the same as for the Aho-Corasick matcher that runs on RedisGears. Overall, the query processing — the one you saw me doing on the demo server, where I typed "impact transmission" — goes via the Flask API, through the same Aho-Corasick matcher, which turns text into nodes; then it fetches nodes and edges from RedisGraph, then it fetches the list of titles and edges and kicks off two more pipelines. One of them is pipeline number three, the summary pipeline, which can very easily be calculated offline. The other, the most complicated one, is question answering: it fetches an answer, for a role which I haven't demonstrated yet. The question-answering pipeline has a completely different interface, because the assumption is that nurses — medical professionals who interact with patients — prefer not to touch the keyboard, and prefer to listen to the answer rather than read it. The question-answering pipeline manifests itself for nurses not only in an advanced interface using a hand-tracking sensor, but also in text-to-speech, where the answer is read back to the nurse. And that's where we use a lot of the modern RedisGears and RedisAI capabilities.
The way the pipeline works is: if the key is not present — if we haven't already searched for that particular question — it will run the whole BERT question-answering machine learning model on each preloaded shard, then return an answer and cache that answer. Just to reiterate: we have a question, and we have a Flask API which queries RedisGraph; it produces results and checks if we already have the key on a particular shard. If the key is missing, Redis fires a key-miss event, and that starts the pipeline: it tokenizes the question, appends it to the list of potential answers, runs the question-answering inference, and then stores the answer on the Redis shard. So the next hit — if the user hits the refresh button — is returned in under a millisecond, because that's how the Redis Labs team likes their Redis to perform. The first run will probably take slightly over a second, thanks to another optimization I apply, where all potential answers are pre-tokenized using RedisGears. So the most advanced natural language processing pipeline in the code looks like this: it queries a carefully constructed key in Redis, and then it returns the answer. That's the most complicated code in the pipeline. Any questions? To be fair, that's what actually happens in the background, and you can follow the code on GitHub — it will probably be easier than in the presentation. So: GET, then this is the shard ID, this is the article ID, this is another shard ID, this is the sentence ID, and this is the question. That queries BERT question answering on the shard, where the key points at the pre-tokenized tensors of potential answers. Going back to how the machine learning model for question answering works: it can return a potential answer if you give it a list of candidates.
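The key layout matters because the hash tag decides which shard answers. The helper below is a hypothetical reconstruction of such a scheme (the project's exact key format may differ): it tags the key with the shard, so the cached answer, the pre-tokenized answer tensors, and the key-miss handler all live on the same shard.

```python
def qa_cache_key(shard_tag: str, article_id: str, sentence_id: int, question: str) -> str:
    """Build a per-shard cache key for a question-answering lookup.

    The {shard_tag} hash tag forces every such key onto one shard, next to
    the pre-tokenized answer tensors for that article. Format is illustrative.
    """
    q = question.strip().lower().replace(" ", "_")
    return f"bertqa{{{shard_tag}}}:{article_id}:{sentence_id}:{q}"

# Hypothetical shard tag and article ID for illustration:
key = qa_cache_key("5GZ", "PMC123456", 44, "What affects transmission")
print(key)  # bertqa{5GZ}:PMC123456:44:what_affects_transmission
```

On a cache hit, the Flask API does nothing more than a single GET on a key built exactly like this.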
So we have to fetch the articles, fetch the sentences which may be related to the answer, feed them into RedisAI for question-answering inference, and then cache the response. It uses a lot of RedisGears and RedisAI magic. The actual key-miss function is registered in RedisGears like this: you can see it is listening for 'keymiss' events on the question-answering prefix, the command is GET, and the important point is mode async_local. What that means is that this function is distributed to all shards of the Redis cluster. If you run it locally, it will run perfectly; if you run it on a high-performance server, as I'm doing right now, it will distribute across all 20 CPUs, or as many shards as you specified, and then it runs: get tensor from key, create model runner, create tensor from blob, model runner add input, model runner run async — all of it done in a non-blocking mode, off the main thread. What that means is that if you take your machine learning models into production, you can be sure your Redis instance continues serving other customers while the heavy machine learning model performs its heavy lifting and then returns the result, which we cache instantly. And this is how simple the key-miss function is: it effectively just constructs the cache key, then — again using the async/await construct — kicks off the query, and then sends the reply to make sure we return the result to the same client. The next pipeline is way simpler. The summarization pipeline uses Google's T5 — the Text-To-Text Transfer Transformer — model for summarization, and the whole thing runs just using RedisGears. The difficulty with the question-answering model was that we could not pre-tokenize and pre-calculate questions: we pre-tokenized answers, but we could not pre-calculate questions, because we don't know them up front. Summarization is actually way easier, because we can tokenize the whole text.
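The control flow described above — serve from cache, and only run BERT inference on a key miss — is the classic cache-aside pattern. Here is a server-free sketch of just that logic; the real version registers a RedisGears function on 'keymiss' events and calls RedisAI asynchronously, and `answer_fn` here stands in for that whole inference step:

```python
class QACache:
    """Cache-aside: run the expensive answer function only on a key miss."""
    def __init__(self, answer_fn):
        self.store = {}             # stands in for the Redis keyspace
        self.answer_fn = answer_fn  # stands in for the RedisAI BERT inference
        self.inferences = 0

    def get(self, question):
        if question not in self.store:       # this is the "keymiss" event
            self.inferences += 1
            self.store[question] = self.answer_fn(question)  # heavy lifting, once
        return self.store[question]          # every later hit is a plain read

cache = QACache(lambda q: f"answer to: {q}")
first = cache.get("what affects transmission")
second = cache.get("what affects transmission")  # served from cache
print(first == second, cache.inferences)  # True 1
```

This is why the first run takes about a second while every refresh afterwards returns in under a millisecond: the model runs exactly once per question.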
We can then store the output in a Redis set, and then run T5 inference on the instance's GPU or CPU. Pierre has a question, Alex. Go on — this was my last meaningful slide. Right, I can read the question: does RedisAI use Transformers under the hood, or does it have its own similar implementation? So, RedisAI is an inference server; I put Transformers models into it. One of the things we mentioned before the talk, which I forgot to mention here, is that I preload the machine learning models into RedisAI as part of the startup script. I think there is a bit which is apparent if you run the code, but not apparent if you just talk about it: RedisGears is good for data scientists, but you need to deploy a specific configuration. I'm going slightly off on a tangent here, because for RedisGears to be productive you have to deploy Redis in cluster mode with high availability, so each master has its own replica. That's because when I populate the RedisGears functions, I attach libraries like PyTorch and Transformers; those libraries get downloaded, the models are converted for RedisAI — Dvir can enlighten you more on that — and then I push it all onto the shards. Those are not small libraries — two gigabytes and more — so while they are being populated, the cluster has to remain stable, and that's why you need high availability. So, Dvir, do you want to talk about preloading machine learning models into RedisAI? Yeah. RedisAI is just an inference server; it's not holding any predefined models or capabilities, so you just need to freeze, export or trace the model that you want to deploy — it depends on the platform and the framework you want to use — and then store it in RedisAI. From there, you just execute it on the inputs that you want. Thank you. Thank you for the question. Any more questions? When I'm presenting, I don't see them, sorry.
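For reference, storing and executing a model in RedisAI is done with a handful of commands. This is a hedged sketch: the key names, tensor shape and the `<serialized-model-bytes>` placeholder are illustrative, not taken from the project:

```
AI.MODELSTORE qa:model ONNX CPU BLOB <serialized-model-bytes>
AI.TENSORSET qa:in FLOAT 1 384 VALUES ...
AI.MODELEXECUTE qa:model INPUTS 1 qa:in OUTPUTS 1 qa:out
AI.TENSORGET qa:out VALUES
```

As Dvir describes, the model must be exported first — a frozen TensorFlow graph, a traced TorchScript module, or an ONNX file; RedisAI just stores the blob and runs it on the chosen device.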
I don't see any at the moment, but if anybody else has questions, feel free to put them in the chat. Yeah, and as I said, it's all open source — join us on GitHub. We have a live demo server, you can join our team on Discord, and we have collateral which I can point you to, to talk more about it. And if you want, we can actually do a live coding demo. Raquel, do you want us to code? Are we already done for time? We were really quick. No, no, you have almost half an hour left. That's what I mean — I think we went so quickly. So go ahead with the live demo. Right. Okay, so this is a Redis instance — a sharded Redis instance — and this is exactly what I showed. It has quite a few keys; we can check that. You can see that I tested a few queries before. What we can do is run the BERT question-answering model, or we can populate more data. I think one of the really nice things about RedisGears on RedisAI is that you can see the load is actually pretty even — at no point does one shard spike; it balances the load of the server pretty nicely. Before the talk, I submitted about 10,000 articles into the pipeline, and it was still processing them, but at the same time I managed to demonstrate the system. So this is the RedisGraph instance — Flask plus RedisGraph — which acts as the front end, and this one is the RedisGears/RedisAI instance, which acts as the computational cluster. Currently they are single instances on my demo server, but you can scale RedisGears out, for example using Redis Enterprise. Let me continue sharing so you can see what I'm talking about. So, Raquel, do you want to come up with a query? It will not necessarily produce results, but it will show the calculation. Let's go with your favorite query — what is that? My favorite query is already pre-computed, so it's not interesting. I have a list of queries which were tested, like "effectiveness of international travel restrictions".
The challenge is, if I hit any of those queries, they're already pre-computed, so we'll just see them appearing from the cache. I would rather hand-craft one. I think we had a really humorous one when we were testing — "horse". It will not produce any meaningful answer, because I'm feeding a completely wrong article into it, but it will run the machine learning model — and it actually produced an answer, which I couldn't believe. So it runs; as you can see, it just downcases the response and returns it. To be fair, that's an empty return from the question-answering model, but we know it actually worked in the background and cached the response. So you can trigger the whole, really complicated RedisGears/RedisGraph/RedisAI pipeline just by producing a query like that. And think about your APIs: your API complexity becomes very simple, because the only thing it does is go into Redis. Yeah — this is an interactive session; is there anything else we can show, anything else people would like to see? I will type that into the chat, but if there is anything people want to see, please just comment now if you don't want to join the Zoom. Yeah, I see we have a Matrix chat. Oops, shall we crash it? Just for fun. Right, we have a question from Francesco: could you please clarify again how you chose the context for running BERT QA and retrieving answers to your questions? Okay, that's a good one. Right, let me go back to the slide. The context is chosen by running this pipeline: we go through the matcher, we get the nodes — and it will probably be equally easy to just point at it on the demo. Right. So, let's run a query — it doesn't have any answer… no, it does, it's just a bit sluggish. The context for BERT QA is literally the same as what you can see in front of you on the pop-up — except when we're in exploration mode, you see a lot more nodes than for question answering.
And I go for the top five most highly ranked nodes, then I fetch the corresponding sentences for them, and those sentences are fed into BERT QA as potential answers. Can you do it out of the box using the Transformers library? Yes, you can. The challenge of using the Transformers library directly is that you have to tokenize both question and answer in one go, and then run inference on that — that's the difficult part. If you want to take any step towards production-level user experience, there need to be optimizations. That's where I tried to leverage Redis as much as possible to produce a lot of optimizations: pre-tokenizing answers, then using RedisAI and RedisGears to tokenize questions separately, effectively concatenate them, and only then run inference. That's the reason why it's closer to one second, not two or 15 seconds: if you run an off-the-shelf Transformers model on, say, five articles, it will take you at least seven seconds. And the funny thing is, even if you have a GPU-enabled instance, your GPU will not be loaded at more than 30%. That's why it's important to do steps like data sharding, data distribution and pre-tokenization of the text. For images, you don't need that step — images are already numbers, so you can use RedisAI directly. For natural language processing, you have to think about what steps are required to pre-tokenize the text. And if people are interested, I can go into how the Aho-Corasick algorithm actually works, or into the challenge of relevance. So — does that answer the question first? I do believe Francesco was pretty happy with the answer, but you can go ahead. So there are a few things we can talk a bit more about. Why do we need roles? My purpose in building the first pipeline was to leverage industry knowledge.
It's the Unified Medical Language System, which is maintained by the US government, and it's a huge dictionary mapping all terms and all their variations to what they actually mean. But if you open the tables and look under the hood, 'bleeding' — the non-normalized word 'bleeding' — means 11,000 different things to medics. That's where I think roles and relevance should be dynamic. And that's why I assume that a general practitioner or a nurse is interested in particular semantic layers of the UMLS taxonomy: anatomical structure, disease or syndrome, body part, organ or organ component, or diagnostic procedure. So I try to limit the knowledge graph output only to concepts found in those semantic layers. Another thing I'm trying to do is give users the ability to mark a node — a concept — as not important, so that it gets filtered out of their output. The difference from standard TF-IDF or BM25 (term frequency–inverse document frequency) ranking is that with those you will never see that 'bleeding' means 11,000 different things; you will see the top five ranked, and that's where all current natural language processing tools and search engines drive you towards the most frequent, most "desired" interpretation. What I wanted was for people to be able to actually go and check the evolution of concepts. One of the features which I think is quite cool is that there is a slider which shows you how the terms in your search query evolved over time. For me that was very important, although it's a niche feature, because it allows me to show that our language and our communication evolve over time. The dream for the next step would be to turn it all into a mixed-reality experience, where you can travel forward and backward in time — but the software is not there yet. I managed to get some of the hardware working.
But getting the hand-tracking sensors all working together proved to be quite a challenging experience. Another thing I wanted to touch on: we talk about knowledge graphs, graph databases, GraphQL, RDF, REST, SPARQL and other things, and we constantly mix different concepts. One of the messages I would like to highlight is: a knowledge graph is what you store — where you write things, not strings, if you'd like to put it that way; a graph database is where you write it to and read it from; and SPARQL, for example, is how you query it. These don't have to be interrelated, but I talk to people who constantly mix those concepts, and I'd like to spread this bit of knowledge. Right. A traditional open-domain question-answering architecture would look very similar to the one we presented to you, except it would use a different retrieval mechanism, like BM25, which is a derivative of term frequency–inverse document frequency. As I said, I wanted to craft my own relevance, and that's why I built this knowledge graph pipeline, where we map text to things and then store it in RedisGraph. Any more questions or deep-dive requests? Pierre says he doesn't necessarily have a query, but a follow-up question on the results: how would you interpret the resulting graph? Well, I'm not a medic; the resulting graph should be interpreted by a medic. And keep in mind this is a hackathon project, not a finished product. I wanted to present results in three dimensions and virtual reality; I understand a lot of people would like a standard Google-spreadsheet-like output or a two-dimensional graph. The presentation of this graph requires additional work, and to make it practically useful you need to work with a stakeholder who will be interpreting the graph — no medical professionals were engaged during the hackathon. I'm open to suggestions on how to improve it.
How to interpret it right now: it is a technology demo. Think about it like Engelbart's "Mother of All Demos" in 1968 — these things can be done using current technology. So, this node is out of UMLS; if I click on a node, it gives me the Concept Unique Identifier, which we can go and fetch from the US government's website. An edge is a list of titles, which is probably similar to what most people would like to see as output. And obviously there is work in progress: you should potentially be able to move through roles, because at different times you can have a different role — but that's a whole other topic of exploration. Just to illustrate: 'event' is a generic word, yet here it has a very specific meaning. I'll have to log in — eventually, if I'm lucky, we'll be able to see it. 'Event' here is related to disease or syndrome, rather than the very generic 'event', because I narrowed down the dictionary from high-level concepts to something medical professionals can relate to. Yeah — relevance, particularly for a specialized field, is not a currently solved problem; that goes for all current tools. And there's the question of usefulness, because if you put the same terms into Google, you will probably not reach meaningful results. Does that answer the question, or shall we go deeper into any topic? Yes — thank you for the answer about how to interpret the results, and congratulations on the work. I would like to highlight one thing in terms of visualization. We are all quite familiar with the word cloud; we have had three-dimensional visualizations of text. What I want to show — if I manage not to break it — is that in two dimensions it looks like a soup, but when you add the third dimension, you can actually see some patterns. Our hardware has supported three-dimensional visualization for the last 10 years, except we've given it up to the gaming community rather than using it for ourselves. So, it's nearly there.
When you're using a word cloud, you probably have this type of view, and then you slice it using different filters. But what is actually behind it is a completely different structure, one in which you can see different patterns — because of how temperature and humidity affect things, you can see that 'temperature' is more visible, as a pattern and a super-node. I'm sure there are other layouts — we employed force-directed graphs in two dimensions. My other ask of people is: try to think about what your multidimensional representation is, and let's try to keep it at least in 3D. And if we manage to get it into virtual reality, we will not be limited by the desktop space — you can just turn your head around to look at other things. Right. There are a lot of things I wanted to say; I thought we would be talking longer. As it's an interactive session, we are quite happy to continue the interaction in the open space. I think quite a lot of people seem to be joining on the Matrix chat, so they're only viewing the stream, and I think it's probably better for us to move there. I will signpost the session that you have already registered in the open space, and I will post that over here so that everybody can be there and chat in a more informal fashion. Okay — and I will also post all of these links, because the slides are already online; you've uploaded them, so I'll post the slides there as well, and all of the links that you mentioned, so people can have easy access. Brilliant. Thank you. All right, great. That was a great session. Thank you so much. Okay, so that's the end of our interactive session, and thank you very much.