OK, we are good to go. All right, so hi, everyone. Hey, Trevor, great to see you again. Hi, everyone, I'm Bing. Great to have you here during lunchtime. I know it's not quite the right timing, but I'll try to keep it short and sweet so you can get back to lunch before you get too hungry.

Let me introduce myself a little before I jump into today's content around neural search. I'm the co-founder and COO of Jina AI. We're a startup that turns three years old this year, and we just closed our Series A fundraising. We're a team of 50, and we've been building around a fairly new concept: the neural search pipeline. We're trying to solve the unstructured data problem, which I'll explain a little later in my sharing. I'm not from the engineering side of things, so this won't be too technical, but I do want to share what we've seen over the past two years since we started. We coined the term "neural search" back when nobody knew what the hell it was, and since then we've been showing people how it's actually used, and we've seen a lot of interesting cases popping up in our community. So let's get started; I hope you enjoy the content, and you can try things out afterwards.

First, a bit about ourselves. Jina AI is a startup doing neural search, and we're an open-source company. We have a team of over 50 across the globe; we're very distributed, but we have a very large community. We have seven main open-source repositories on GitHub, two of which exceed 10,000 stars; I'll explain more later. We also have four offices: two in China, the headquarters in Berlin, where we have a big engineering team, and one we're setting up here in San Jose, since North America is where we found most of our community. We now have over 3,000 people in our community Slack, and we're seeing around a million downloads per quarter of our neural search tooling, with people trying it out; we're very happy about that. We've raised over 38 million US dollars so far, and we've been very lucky to be recognized quite a lot over the past few years: we've been listed consecutively on the CB Insights top 100 AI companies list for two years, and we appreciate being recognized on Forbes as well.

That's a brief intro of us, but more important is what we do and why we believe neural search is something people need to pay attention to. First, before I jump into the talk, I want to know: how many of you sitting here today are developers? Can you raise your hands? Oh, that's great. Second question: how many of you have heard about neural search before? That's even greater. So I guess half the room has experience with neural search already; this might not be too new for you, but some folks here are still not too familiar with it.

The fundamental problem we're trying to solve targets data that is not stored in a structured way. There is a lot of data nowadays: text messages, media, and more. Most search tools today are a search bar where you type information in as a query.
That's a very conventional way of searching, but neural search targets data that isn't formed in a structured way, like an Excel sheet or a SQL table. That's the unstructured data problem we're trying to solve: making the pipeline for searching that kind of data a lot easier. There are estimates, including one from MongoDB, that roughly 80 to 90% of the data in the world is stored in an unstructured way. That can be media data, metadata, or even long free-text messages. Jina and our neural search ecosystem are trying to solve that problem by bootstrapping a pipeline that makes it easier.

So what is neural search? I want to show some cases. A very common one nowadays: you want to find a fuzzy sentence in a book. For example, in Pride and Prejudice, you want to find content related to "she smiled too much." In the traditional way, if the book contains a sentence literally written as "she smiled too much," you can pin it down, because exact letter matching is easy. But if it's not phrased that way, it's hard to pull out information that expresses something similar in different words, for example, "she might have fancied it too much," which could still be relevant to her putting on a smiling face. It's relevant information, but not an exact match. This is fuzzy search. There are, for sure, a lot of NLP models trying to solve this, but actually putting them into production to solve a problem like this goes way beyond just NLP. Neural search is one way to solve fuzzy search.

Another example is image search, which is pretty common; on Amazon or even Google, you can search with images. You can upload a t-shirt, for example, like the one I'm wearing today, you want to find similar t-shirts, and the top-k results pop up. But the traditional way of searching requires a database already tagged with meta information: it's a black t-shirt, it has a drawing on the front. That tagged information is what a symbolic search system matches against. What neural search does instead is similarity: I upload an image and find other images in the database that look alike. I can even combine it with other information, for example, this t-shirt, but I want it in red. You can combine an image and text into one query to find the similar things you actually want. That extends the whole horizon of what search can cover. This is what we're doing, and what we've seen our community building. It already seems like a pretty standard thing in e-commerce across different consumer platforms, but it's not easy to build, and I'll tell you why later.
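To make that image-plus-text query concrete, here is a minimal sketch using Jina's clip_client against the public CLIP-as-service demo endpoint; the endpoint, the example image URL, and the naive averaging of the two embeddings are my assumptions for illustration, not a production recipe:

```python
import numpy as np
from clip_client import Client  # pip install clip-client

# Public demo endpoint from the CLIP-as-service docs (may change over time)
c = Client('grpcs://demo-cas.jina.ai:2096')

# Encode an image URI and a text refinement into the same vector space;
# clip_client auto-detects which inputs are URIs and which are sentences
img_emb, txt_emb = c.encode(
    ['https://example.com/black-tshirt.jpg',  # hypothetical product image
     'the same t-shirt but in red']
)

# One naive way to form a combined cross-modal query: average and renormalize
query = (img_emb + txt_emb) / 2
query /= np.linalg.norm(query)

# `catalog_embs` would be precomputed, row-normalized embeddings of your
# catalog images; cosine similarity is then one matrix-vector product:
# scores = catalog_embs @ query
```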
One more interesting thing we figured out over the past year is that neural search can also search 3D models. I'm not sure how many of you have looked into game studios. Game studios make a ton of money, but have you ever seen how they actually make a game? They have a lot of artists who draw a huge amount of content: all of these elements, the 3D models and meshes, that a scene is composed from. But in a traditional production line, it's not easy to get everything tagged properly, which makes it hard for the game creators or engineers to pull out the relevant assets to compose a scene. Imagine you're building a game and you need different trees, maybe 100 different trees. You might already have those elements in your asset pool, but they might not be named well; one is just "tree," another is tagged "green." So people can't find the relevant assets. Neural search makes all of that content searchable, so creators can pull out what they need: click on a character in the scene, find similar characters, and just drag and drop them in. That speeds up the whole production process a lot. This is also something we've seen neural search empowering.
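As a taste of what that asset search can look like, here is a minimal sketch with DocArray, which can load a mesh file as a sampled point cloud; the folder layout is invented, and the embedding function is a crude stand-in for a trained point-cloud model:

```python
from docarray import Document, DocumentArray

# Load every mesh in the asset pool as a point cloud
# (docarray samples points from the mesh surface via trimesh)
assets = DocumentArray.from_files('assets/**/*.glb')
for d in assets:
    d.load_uri_to_point_cloud_tensor(2048)  # 2048 sampled surface points

def embed(docs: DocumentArray) -> None:
    # Stand-in encoder: the centroid of the sampled points. A real setup
    # would plug in a trained point-cloud embedding model here instead.
    for d in docs:
        d.embedding = d.tensor.mean(axis=0)

embed(assets)

# Query with one mesh and find look-alike assets, however they are named
query = DocumentArray(
    [Document(uri='some-tree.glb').load_uri_to_point_cloud_tensor(2048)]
)
embed(query)
query.match(assets, limit=5)
for m in query[0].matches:
    print(m.uri)
```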
And even a chatbot, for example: a chatbot is a chat experience, but it can actually be explained as a kind of search. So these are all things we've seen that neural search as a framework can do. It's very diverse; it's not just a search bar. We want to push our imagination beyond the search bar.

So neural search, in our definition, is deep-learning-powered information retrieval for multimodal and cross-modal data. This is the concept we put out. Multimodal means you have different modalities of data, for example image, audio, video, or 3D meshes. We group those as multimodal because they're not stored in the traditional text-search way. You upload an image and you retrieve images: that's retrieval within one modality. Cross-modal is what I mentioned just now around e-commerce search: an image plus a textual description, combined into one query to fetch a different modality of data. That's a cross-modal search application, and it's a bit more complicated than same-modality retrieval.

All of this is basically solving one problem: the data is there, and you want to find the relationships in it. That's what a search system solves. And why do we keep talking about neural search? It's important for unlocking all of this, because the traditional way is to represent all of the data symbolically, which basically means giving everything tags or meta information to make it approachable. Leveraging deep learning models, as many of you have experienced, makes this so much more accessible. What we focus on, and what we're trying to unlock, is not just the model itself, but how to bring the most state-of-the-art models into production, because solving a search problem end to end is complicated.

Let me give an example. Take a news article. It may contain an image with a lot of information in it: there are dogs in the window, there are cats sitting there. There is also a title, an author, and other meta information. Now think about the task: we want to understand a document formed this way. It contains an image, it contains text, it contains meta information like the title of the article and the author. And within the text there are passages and sentences, many different levels of granularity: it could be a paragraph, it could be a sentence. How do you adjust the granularity of the information you want? It needs to be very flexible.
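This flexible granularity is what DocArray's nested Document type is meant to model; here is a minimal sketch of that news article as a Document tree, with invented field values for illustration:

```python
from docarray import Document

# A news article as one Document: meta information lives in `tags`,
# the image and the body text become nested chunk Documents
article = Document(
    tags={'title': 'Dogs in the Window', 'author': 'Jane Doe'},
    chunks=[
        Document(uri='hero-image.jpg'),  # the article's image
        Document(
            text='A long passage of the article body...',
            # chunks of chunks: sentence-level granularity under a passage
            chunks=[
                Document(text='Two dogs sat in the window.'),
                Document(text='A cat watched them from the sill.'),
            ],
        ),
    ],
)

# Traverse whichever granularity a query needs
print(article.tags['title'])
for chunk in article.chunks:
    for sentence in chunk.chunks:
        print(sentence.text)
```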
But in the traditional way, can engineers solve this information-finding problem, where you type a description of similar stuff and retrieve the relevant content? Yes, definitely. But what a normal engineer needs is probably a team of engineers. You need object storage, you need a vector database, you need data type definitions; basically, you define everything yourself. Then you need model inference, for example to run or train the models, and later you need to serve it all through a web framework. There are a lot of different pieces of the tech stack that you need to glue together just to solve the basic problem of "I want to find information in these news articles," or even a recommendation in another format.

And we know developers are specialized in certain areas: some are back-end engineers, some are DevOps, some are front-end engineers. You can't have one person who solves it all, so for a lot of businesses you might need a team of three, basically, to wire all of the stacks together and get it up and running. That is a headache, and not just for the development process; it takes a lot of time and resources even for a try-out. And it's hard to maintain, because you've glued everything together and have to take care of a lot of communication between components. My co-founders experienced this over their more than ten years of careers in the search domain: it is a headache. That's why we thought about finding a way to make it a lot easier, and that's why we built the Jina ecosystem.

Looking at how Jina solves this in our way of neural search: you just need one stack at the bottom to solve them all. That's it; that's Jina and its ecosystem. Define the data, define the flow, and it's up and running. Our ecosystem is built to power universal data types, with a Pythonic experience; we built everything in Python, which is, I'd say, a relatively up-to-date and very human-friendly programming language. We also empower data scientists to easily bootstrap the pipeline right on their laptop with Docker. I'll explain the ecosystem a little later. And it's designed for modern apps because it's cloud-native: it integrates very well with Kubernetes and the whole Docker experience, which makes it very accessible.

So people can deploy it in a distributed environment, which matters especially in AI. We know training is a headache because you need computing resources, and you can't keep those resources running all the time, because it doesn't make business sense. You want the heaviest part of the computing done in the most cost-efficient way, for sure; that's how a distributed, cloud-native system needs to be built, so you can spread the cost across different nodes. That's how Jina is designed. And when you want to get the service up to the cloud, Jina covers what you need there too: performance in the cloud, connections, nodes talking to each other, that sort of thing. This is how we try to solve it, and this is also why neural search is important: it shortens your time, saves your cost, and makes the system easier to maintain and to scale.

So let's look at what we have: an ecosystem of open-source repositories that you can check out on our GitHub. There is DocArray, the data type for defining unstructured data. Even though we call it unstructured, and it's hard to define, we still need a way to express it in a programming language; that's DocArray. Jina is the cloud-native neural search framework that levels the whole pipeline up into production, to the cloud, and to scale.

One more question: how many of you have worked in the search domain? One or two? Then you must have experienced this: bootstrapping the pipeline is not that difficult, but if you want it running in actual production, the result matters. You need good retrieval quality, otherwise people won't use it. That's also part of the headache, and that's why we developed a tool called Finetuner, built specifically to fine-tune the search results of the pipeline. Everything we build in the Jina ecosystem is plug-and-play: you have different components, you have executors packed with state-of-the-art models you can use directly, and Finetuner is the component that solves the last-mile problem of getting quality up to production needs. And we have CLIP-as-service, which embeds images and sentences into fixed-length vectors with CLIP. So we've built an ecosystem, and people are playing with it; it makes building this kind of system a lot easier.

Why? Let me give you an example: fuzzily finding a sentence in a book. Previously, as I said before, there were different stacks you'd have to pull together. Now, with Jina, you need just 14 lines of code. You could never imagine an AI solution built in just 14 lines of code. It's easy to define and easy to run on a local machine, and that's how a lot of people bootstrap it to solve a simple problem in their local environment.
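Those 14 lines look roughly like the following sketch with a recent Jina 3 setup; TransformerTorchEncoder and SimpleIndexer are real executors on Jina Hub, but treat the exact choices and the file name as illustrative:

```python
from docarray import Document, DocumentArray
from jina import Flow

# One Document per line of the book
book = Document(uri='pride-and-prejudice.txt').load_uri_to_text()
sentences = DocumentArray(
    Document(text=line.strip()) for line in book.text.split('\n') if line.strip()
)

# Encoder + indexer pulled straight from Jina Hub
f = (
    Flow()
    .add(uses='jinahub://TransformerTorchEncoder')
    .add(uses='jinahub://SimpleIndexer')
)

with f:
    f.index(sentences)
    results = f.search(Document(text='she smiled too much'))
    for match in results[0].matches[:5]:
        print(match.text)  # fuzzy hits, not just exact string matches
```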
Another example is image search. It seems difficult: defining all of the images with their meta information, and also all of the structure, how you define the granularity; it's troublesome. Now, with Jina, 25 lines of code actually solves that. That makes it so accessible for a lot of engineers to make things happen, especially for trying out a new idea you want to validate. This is a great moment for you to actually do that.
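Those 25 lines look roughly like this sketch, in the spirit of the DocArray image-search hello-world; the model choice and folder paths are illustrative:

```python
import torchvision
from docarray import Document, DocumentArray

# Off-the-shelf CNN as the feature extractor (choice is illustrative)
model = torchvision.models.resnet50(pretrained=True)

def preproc(d: Document) -> Document:
    return (
        d.load_uri_to_image_tensor(200, 200)   # read and resize the image
        .set_image_tensor_normalization()      # normalize for the CNN
        .set_image_tensor_channel_axis(-1, 0)  # HWC -> CHW for torch
    )

# Index: embed every image in a local folder
index = DocumentArray.from_files('images/*.jpg').apply(preproc)
index.embed(model)

# Query: embed one image and find its nearest neighbors
query = DocumentArray([Document(uri='query.jpg')]).apply(preproc)
query.embed(model)
query.match(index, limit=5)

for m in query[0].matches:
    print(m.uri, m.scores['cosine'].value)
```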
And it's also with the ecosystem that we help people scale, by leveraging the cloud-native framework that people can play with. The fundamental concepts people need to know are: a Document defines the data; an Executor is a component you can pull from our Hub, where there are around 100 different executors processing all kinds of data; and a Flow is the streamline that distributes the executors and gets everything up and running.

That's why, for example, a metaverse company is using Jina in production to solve 3D mesh search, which unlocks the potential of creators to build things much faster. And there is also a startup in the education industry that built AI tutoring on top of it. This is a very interesting case. Especially since COVID, a lot of things have moved onto the internet, including education. But a lot of online education comes in the form of videos: you're given a video clip, you watch the course, and you get the information. What's missing, compared to taking a course offline like the talk we're having right now, is interaction. You cannot ask a video clip a question, right? What is that? What is this? How do I use this, how do I use that? What they did with Jina was unlock interactive content during an education course. They built a bot that goes through all of the content of the video clips, or PDFs, or whatever format, and that bot learns to explain the problems that come up during the course. It acts as a tutor alongside the video course, so the students get a continuous tutoring experience throughout their online study. It's fascinating what these community companies are building.

So there are a lot of different scenarios that can be unlocked by neural search. Text-to-image is pretty common by now, not too exciting. But there is question answering; duplicate detection, for example insurance companies doing anti-fraud checks on case filings; classification at large scale; 3D model search; semantic content recommendation, and so on. There is a lot of potential, and we hope to see more people unlocking it with more interesting cases. We'd be more than glad to see that.

Also, apart from search, our observation over the past year is that what Jina is capable of is not just search. Some of the use cases I mentioned, like the chatbot or the tutoring bot, already sit outside the traditional search experience, and it unlocks other opportunities too. Lately, I'm not sure if you've noticed, DALL·E has been a very popular model released by OpenAI, and this year they released DALL·E 2 as well. It generates content: it generates an image from a text description. For example, "ocean beach view in Van Gogh style": that's what DALL·E generated. Or think of the Statue of Liberty wearing a VR headset. It didn't exist before, but it can be generated; with AI, that's already unlocked. This is great, but leveraging DALL·E in a lot of use cases can be a bit troublesome, because it's not that open; it's not easy to get things in and out and adjust. So we asked: OK, this is cool, but how can we make it accessible for people to actually use? Even, for example, this painting of a couple kissing in a nebula: that's something only an artist would normally attempt, not something a program could do. We've seen DALL·E 2 as well, which is more advanced and generates higher-resolution art.

What we realized is that this type of generative art is deep learning model content generation. It's not that you have existing data; it's something that doesn't exist, and you are creating it. Those are the models serving that purpose, but actually using them in real scenes seemed troublesome. So this is what we've seen: the data is not there, but the relationship is there, because the relationship between a descriptive text and an image representation exists. How do you put that to work? You're basically using the relationship to make data. That is very different from search, because in search the data is there and you use search to find the relationships; it's the opposite direction.

So we tried it out with Jina, with our Jina Flow, and we built a project called DALL·E Flow. It's easy to play with: people just describe what they want, and DALL·E Flow gets it running. These are artworks DALL·E Flow actually generated, and it's already gained around 1,000 stars on GitHub; you can check it out there. It's a client-server architecture, and it combines DALL·E and diffusion models using Jina. A fun fact: it's 288 lines of Python and 42 lines of YAML. These are big models, but considering their size, the number of lines of code it took to realize this doesn't seem like much, and it was a weekend of work; our CEO built it.
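Using it from the client side is only a few lines; this sketch follows the pattern in the DALL·E Flow README, though the public demo endpoint may have changed since:

```python
from docarray import Document

# Public demo gRPC endpoint from the DALL·E Flow README at the time of writing
server_url = 'grpc://dalle-flow.jina.ai:51005'

prompt = 'an oil painting of a couple kissing in a nebula'

# The server generates candidates and returns them as matches on the Document
doc = Document(text=prompt).post(server_url, parameters={'num_images': 2})
doc.matches.plot_image_sprites(fig_size=(10, 10), show_index=True)
```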
So this is how Jina is able to unlock potential in generative art, which wasn't something we imagined originally. Combining all of this together is the potential we see in a neural search framework: it can unlock not just neural search as multimodal and cross-modal search applications, but also generative art in multimodal and cross-modal applications, which is far beyond just search. I hope everything I talked about inspires the teams and people sitting here in the room, and virtually, to see what you can build, because there is a lot of potential you can unlock with neural search.

That's pretty much it for the sharing. I tried to keep it quick for lunch so I don't drag out the time too much. You can always find us through our website and especially our GitHub repositories, and feel free to join our Slack; we have over 3,000 people across the globe there already, discussing very interesting stuff. So, any questions? Yeah, go ahead.

[Audience question about GPT-3.] Yes, we are actually working on GPT-3 with the Flow, to give an example of that. But in general, the chatbot experiences we've seen sit more in the neural search applications rather than pure cross-modal generation, because there are a lot of fuzzy ways you can play with it. But we are working on that; it's going to be something we want to show as a demo very soon.

[Audience question.] So the question is a comparison of the performance of neural search versus traditional search methods. That's an interesting one. It really depends on the scenes or scenarios you're looking at. Traditional search, especially symbolic search, has been improved over decades; it's very fast now for systematic, symbolic search over text data, which is great. What we're seeing with neural search is that it unlocks the opportunity to search things that the symbolic way simply cannot do. For example, you have a multimodal dataset of 3D meshes you want to get up and running quickly; previously you couldn't do that without a proper tagging system and a proper database, and now you can. So we're not comparing performance apples to apples in general, but we do have some benchmarks in our documentation that you can look into. Especially in multimodality scenarios with multimedia data, it's way better than traditional symbolic search. But if you're asking how it compares on text search at the moment, Elasticsearch is probably the best; we have no doubt about it. As things evolve toward the potential of other kinds of data, that's where neural search will take off. I hope that answers the question.

All right, any more questions? OK, I guess we're good. Thanks, everybody; I won't take any more of your time. Enjoy your lunch and enjoy the event. By the way, we've got stickers. Thank you for coming.