All right, I suppose we can get started. Thanks, everyone, for coming to this talk. This talk is about Milvus, which is what we call a cloud-native vector database; I'll get into what exactly a vector database is during the presentation. Before that, I want to introduce myself very briefly. My name is Frank, and I work at a company called Zilliz, a startup based out of San Francisco. If you want to get in touch with me (LinkedIn, Twitter, email, anything of that sort), feel free to do so. My door is always open, so to speak. So, coming up: I'll first talk a bit about unstructured data. What is it? Why do we care about it? And why does it make sense to store it in a database? I'll follow up with embeddings for semantic search and how we can use embeddings to understand unstructured data. Then I'll talk about vector databases as well as Milvus itself. Milvus, again, is an open-source vector database, entirely cloud-native and free to use. I'll end with some real-world use cases for Milvus and talk about why Milvus is exciting not just for the present, but for the next 5, 10, 15, 20 years. All right, so what is unstructured data? Does anybody want to take a guess? Yeah, precisely. And what I'm going to do is give you a t-shirt, so thank you for answering that. If it doesn't fit well, just find me later and I'll give you a different one. I was at another conference last week, and one of the folks there said I had a voice like a radio talk-show host; apparently I was putting her to sleep. I don't want the same to happen to my beloved audience here, so I'll try to keep things a little more active and involve everyone, including anyone who's joining remotely.
So that's exactly right, the gentleman in the blue shirt. Unstructured data is, as in his examples, images, video, audio, text: any data that does not conform to a predefined data model. That will become clearer in this upcoming slide, where I want to talk first about the evolution of data. In the 1960s, '70s, and '80s, as computers became ubiquitous, a major application for them was storage, indexing, and search across massive quantities of data. But very early on, most of that data was structured (tabular databases) or, in the middle, semi-structured: JSON and NoSQL databases. So early on you have databases like MySQL and PostgreSQL, and in the early 2010s, document stores like MongoDB, and so forth. But as the mobile-computing and internet age arrived, with more and more data being generated daily, we find there's a ton of unstructured data out there: images, video, audio, text, and also some lesser-known kinds, such as graphs, protein structures, and geospatial data, which is a big one. All of this can be considered unstructured data. Yet there was no real database to help us understand what this unstructured data is, no database to help us understand the semantics behind it. On top of that, you'll read reports from, say, Seagate or IDC estimating that over 70 or 80% of data generated today is unstructured. That's quite a large amount. So how do we represent unstructured data? How do we take a piece of it and turn it into something a computer can store and index? We do it through the power of machine learning, through deep learning models.
I know this is the LF AI & Data forum, but for folks who aren't as familiar with machine learning, or deep learning in particular, I'll explain this very briefly. A lot of these deep learning models are built from layers, and an intermediate layer often ends up being a very good representation of your input data. The outputs of those intermediate layers are called embeddings. The idea is that you can take some input data, embed it into a very large vector space, and get very strong semantics there. The graph you see up here was made with the TensorFlow Embedding Projector, and you'll see that similar concepts are grouped together. Up top you have science concepts, from, say, biology, chemistry, physics. Toward the bottom (I know it might be a little hard to see from the back, I apologize) you see a lot of names, and over on the right, a lot of methodologies or processes. All right, this leads me into the second section of the presentation: what embeddings actually are. An embedding is a vector representation that captures the underlying meaning of a piece of unstructured data. I touched on this in the previous section, and I'll dive a little deeper here: how can we use embeddings to really understand all this unstructured data? In this particular example I have two different pieces of unstructured data. The first is a sentence and the second is an image, and the two correspond to each other. This type of bird is called a towhee, one of my favorites, which I'll get to later. The sentence "a towhee perched on a branch" can be embedded into the same space as the image using modern machine learning.
There are a lot of multimodal models and algorithms out there, and I can leverage that power to relate two individual pieces of unstructured data. These embeddings are inherently very powerful: not only do semantically similar objects have close embeddings, but directionality matters as well. You can see in this case that there's directionality inherently embedded in the concept of male-female, and there's directionality in verb tense as well as country-capital. Embeddings are a very powerful tool from machine learning models for understanding all of this unstructured data. Looking ahead a little, one of my favorite go-to examples is reverse image search. In this particular example, you see some images on the left and, on the center-right, their nearest neighbors in a pretty small dataset of, I want to say, about a thousand images, from the Unsplash dataset. All of the neighbors are semantically similar to the query: for the top image, the nearest neighbors in this embedding space mostly correspond to plants; for the middle image, to mountains; and for the bottom image, to forests, trees, and so forth. As the last portion of this section, I'll talk a bit about vector indexes and how we can leverage these embeddings, these long floating-point vectors, to search and do a lot of this semantic analysis. It's really done through vector indexes. In a traditional database, maybe you have a bitmap index, a clustered or unclustered index, or an index based on B-trees, and so forth. The same goes for the embedding space in a vector database.
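The directionality point can be made concrete with a tiny sketch. The 2D vectors below are hand-constructed for illustration (real embeddings would come from a trained model), arranged so the male-to-female offset is consistent; vector arithmetic then recovers the classic analogy.

```python
import numpy as np

# Toy 2D "embeddings", constructed so the male->female offset is the same
# for both pairs. Illustrative only -- not from a real model.
vocab = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([2.0, 0.1]),
    "queen": np.array([2.0, 1.1]),
}

def nearest(vec, exclude=()):
    """Return the vocab word whose embedding is closest (Euclidean) to vec."""
    return min(
        (w for w in vocab if w not in exclude),
        key=lambda w: np.linalg.norm(vocab[w] - vec),
    )

# king - man + woman should land near queen.
result = nearest(vocab["king"] - vocab["man"] + vocab["woman"],
                 exclude=("king", "man", "woman"))
print(result)  # queen
```

The same arithmetic applied to real word embeddings (word2vec, GloVe) is where the famous king/queen example comes from.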
We have these indexes, which we can use to search across these large embedding vectors. Again, I won't go too deep into each of them; Annoy alone could be an entire 30-minute presentation. Just know that there are various types of vector indexes out there. Okay, so here's the fun part: what is a vector database? Pop quiz number two. Does anybody want to answer that? Come on now, don't be shy. Yeah, the gentleman in the... "Do nearest-neighbor lookups and similarity search." There we go. Were you familiar with vector databases before coming to this presentation? You were, okay, there we go. But no, that's exactly right: a vector database is a database that is purpose-built to store, index, and query large quantities of embeddings. And these embeddings, again, correspond to pieces of unstructured data; they allow us to analyze all that data semantically. Two images of, let's say, German Shepherds would be very close to each other in embedding space. So that's great. Again, if this doesn't fit you well, I'm happy to get you a new one after the presentation. Vector databases are now forming an increasingly core part of a lot of production machine learning and AI systems, and I want to emphasize in production. At the bottom there we have data flow, data pipelining, and model serving, and a lot of the results of these models go into a vector database, either as metadata or, more powerfully, as the embeddings themselves. Okay, I'm going to talk about the Milvus architecture a little later, but first I want to cover some Milvus features. We support a lot of hardware accelerators.
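A toy version of "store, index, and query embeddings" can be sketched in a few lines. This is a brute-force, in-memory stand-in for illustration only; a real vector database adds approximate indexes, persistence, and scale-out on top of this basic idea.

```python
import numpy as np

class ToyVectorStore:
    """Minimal in-memory vector store: insert embeddings, query top-k by cosine."""

    def __init__(self, dim):
        self.dim = dim
        self.ids, self.vecs = [], []

    def insert(self, item_id, embedding):
        v = np.asarray(embedding, dtype=float)
        self.ids.append(item_id)
        self.vecs.append(v / np.linalg.norm(v))  # store unit-normalized

    def search(self, query, k=3):
        q = np.asarray(query, dtype=float)
        q = q / np.linalg.norm(q)
        sims = np.stack(self.vecs) @ q  # dot product == cosine (all normalized)
        top = np.argsort(-sims)[:k]
        return [(self.ids[i], float(sims[i])) for i in top]

# Hypothetical image embeddings (values invented for illustration).
store = ToyVectorStore(dim=3)
store.insert("german_shepherd_1", [0.9, 0.1, 0.0])
store.insert("german_shepherd_2", [0.88, 0.15, 0.02])
store.insert("tabby_cat", [0.0, 0.2, 0.95])
hits = store.search([0.9, 0.12, 0.0], k=2)
print(hits)  # the two German Shepherd images rank first
```

The whole point of a purpose-built vector database is that this linear scan does not survive past a few million vectors, which is where the index structures below come in.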
Machine learning, and AI in general, is a very math-heavy field. Think of backpropagation, or even inference: all of it requires lots of floating-point math, or, if you quantize, lots of integer math. So we support SIMD out of the box, and we also have GPU support coming in a version very soon. We're also going to support other accelerators (FPGAs, NPUs, TPUs) to do that fast indexing and fast querying on top of all this hardware. We also support key database functions: like any good database, we have data partitioning and data sharding, as well as filtered, hybrid queries and searches, that is, the ability to search based on an embedding and then filter based on metadata as well. We have multiple options for vector indexes and similarity metrics. There's Faiss (Facebook AI Similarity Search), which gives us flat indexes, an HNSW implementation, and product quantization; again, I won't go too deep into those. We also have Annoy, and we're working on implementing Google's ScaNN (Scalable Nearest Neighbors) as well. In terms of similarity metrics, we have standard Euclidean distance: given two embeddings, I can compute the L2 distance between them. We also have a dot-product similarity metric, which, if you have normalized embedding vectors, ends up being cosine similarity. I'd say cosine and Euclidean are probably the two most common; I really haven't seen many applications use similarity metrics other than these two, though we also have Boolean metrics on top of that, which are much less used. And we provide a number of SDKs. Like any well-known, powerful database today, we have a lot of these APIs, SDKs, and connectors that go along with it.
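That point about normalized vectors is easy to check numerically: for unit-length vectors the dot product equals the cosine similarity, and the squared Euclidean distance is just 2 - 2(a.b), so all three metrics induce the same ranking. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=128), rng.normal(size=128)
a /= np.linalg.norm(a)  # normalize to unit length
b /= np.linalg.norm(b)

dot = float(a @ b)
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
sq_euclid = float(np.sum((a - b) ** 2))

# For unit vectors: dot == cosine, and ||a - b||^2 == 2 - 2 * dot
assert abs(dot - cosine) < 1e-12
assert abs(sq_euclid - (2 - 2 * dot)) < 1e-9
print(round(dot, 4), round(sq_euclid, 4))
```

This is why normalizing embeddings before insertion is such common advice: it lets you use the cheap dot product while still getting cosine-similarity semantics.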
We have Python, Go, Node, and Java SDKs, and I believe the team is working on a C++ one as well. But a big plus, and a big selling point for Milvus as an open-source project in particular, is that we are entirely cloud-native: we're Kubernetes-native, with deployment through Helm. We have a natively S3-based, or S3-like, design. I know a lot of cloud providers (GCP, Azure, and so on) now have S3-compatible endpoints, but if you want to do an on-prem deployment, we also support MinIO, which allows for pretty easy on-prem-to-cloud conversion. We're fully distributed, which means we are highly elastic and horizontally scalable; again, I'll dive more into the Milvus architecture in the upcoming section. We have disaggregated storage and compute, a concept called shared storage. If we look at traditional relational databases, or databases for structured data, a lot of them in the past were based on a shared-nothing architecture, where each individual machine would store a portion of the data and be responsible for querying and indexing that portion. Then Snowflake came around and gave us a shared-storage, or "shared-something," architecture, and really revolutionized the entire data-processing pipeline. We're very much the same: disaggregating storage and compute, and getting to that shared-storage model, is very important for us to be able to scale well into the future. And then we have separate read, write, and background indexing services as well. If a lot of this is confusing right now, just be patient; it should become clearer as we talk about the architecture in the next section.
I also want to briefly touch on the broader vector database ecosystem, including ETL for vector databases. If I go back a few slides to the data pipeline and data flow, that's a very important part of the ecosystem, and for that we have a project called Towhee. If you remember from an earlier slide, this is why I said I like towhees a lot: it's a type of bird. Towhee is essentially ETL (extract, transform, load) for unstructured data; it does the transformation that turns unstructured data into embeddings. So users who don't already have these machine learning models in their production pipeline can leverage this ETL tool to generate embeddings for Milvus, for their vector database. I won't go too deep into the details of Towhee, but I'm happy to chat about it further with anybody during the Q&A. We also have a suite of administrative and visualization tools: a management GUI called Attu, and something called Feder, which helps you visualize your vector indexes. I think this particular visualization is HNSW. Yeah, there it is. HNSW is a graph-based indexing algorithm, and I think it's the most commonly used one today. All right, so this is the fun part, or the least fun part, depending on how you look at things: the Milvus architecture. This is the architecture from a mid-level or high-level perspective. You'll see a lot of layers: the coordinator layer, the access layer, the worker layer, and the storage layer. I'm going to go over these one by one and show how they allow Milvus to be cloud-native and scalable, and again, I'm always happy to take follow-up questions about how this works in the Q&A.
Before we begin, I think it's important to define some of this data-related language in the context of SQL; hopefully that will help explain why we made some of these architectural design choices later on. DDL, Data Definition Language, is used to define or modify the database schema itself. DML, Data Manipulation Language, is used for the actual data: inserting, modifying, deleting, and retrieving it. And DCL, Data Control Language, is there to manage user rights and permissions. Now, the access layer itself is, I'd say, not too exciting: it's essentially an outward-facing layer used to route the proper commands to the proper places. It's made up of a bunch of proxy nodes, which lets us scale horizontally depending on the number of users hitting the database. We have a coordinator layer as well, and again, I won't spend too much time on this, but we have four different types of coordinators. With the exception of the root coordinator, each is responsible for its own cluster of workers: a query coordinator managing the query cluster, a data coordinator managing the data cluster, and an index coordinator managing the index cluster. There's essentially one node for each of these coordinator types, and that allows us to scale the cluster, the computation itself, horizontally. This ties in very nicely with the worker layer: workers are stateless entities, which lets us spin up new workers or shut down workers as we like, and these workers handle the DML requests.
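Since the slide frames these terms in SQL, here's a minimal illustration using Python's built-in sqlite3. One caveat: sqlite has no user-permission system, so the DCL statement is shown only as a comment with generic SQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL -- Data Definition Language: defines or modifies the schema itself.
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, label TEXT)")

# DML -- Data Manipulation Language: inserts, modifies, deletes, retrieves data.
conn.execute("INSERT INTO embeddings (id, label) VALUES (1, 'towhee')")
conn.execute("UPDATE embeddings SET label = 'spotted towhee' WHERE id = 1")
row = conn.execute("SELECT label FROM embeddings WHERE id = 1").fetchone()

# DCL -- Data Control Language: manages rights and permissions, e.g. in
# engines that support it:
#   GRANT SELECT ON embeddings TO analyst;
# (sqlite has no access-control statements, so this line is illustrative only.)

print(row[0])  # spotted towhee
```

In Milvus the same three concerns exist but are routed differently: DDL goes through the root coordinator, while DML fans out to the worker clusters described next.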
The data node in particular retrieves log data; inside Milvus we have this concept called "log as data," where most of the data, most of the embeddings stored within Milvus, live in the log. And here, that's actually a mistake on the slide: the index and query nodes are reversed, I apologize. The index nodes are there to build indexes on inserted data, and the query nodes are there to run searches and queries over the data you've already indexed. The storage layer itself is composed of metadata storage, the log broker, and object storage. I touched on this in a previous slide, but the log broker is there for streaming-data persistence: given a particular timestamp, I can always travel back to that time, or keep tailing the log itself. Metadata storage is done using etcd, and object storage is done using S3 or MinIO, which allows easy on-prem-to-cloud conversion. All right, some key takeaways; I know I went through the architecture pretty quickly. First, we have a single coordinator type per service, per worker type: one query coordinator for the query nodes, for example. Second, data itself is stored in collections. If any of you have ever used MongoDB, you should be familiar with the concept of collections, and Milvus really builds on top of that as well: we have collections of these embeddings, collections with these vector indexes.
We also disaggregate querying, indexing, and data, and that means that, depending on the application, if my workload involves very few inserts, maybe I only need, say, one index node or one data node; but if there's a lot of querying, a lot of users hitting the database simultaneously, I can expand my query nodes to fit that particular application. And vice versa: if a lot of data is being inserted or deleted and I don't have many users hitting the database itself, I can scale out the index nodes and the data nodes and keep the query cluster small. This gives us a lot of flexibility to target the right types of applications, similar in a way to what you see in a lot of modern cloud database architectures, such as Snowflake. And then there's "log as data," which I know I didn't touch on too much, but essentially all the data we have is stored in a log-based format, and that gives us pretty good flexibility in what we can do with the data itself: going back to a previous snapshot, for example, and we have a pub/sub scheme for a lot of this data as well. Okay, moving on to some real-world use cases. I know the gentleman in the back over there has used or knows a bit about vector databases, but does anybody want to talk about their own application built on top of vector databases? Anybody want to share? I have a t-shirt with your name on it if you do, so I'm all ears. [Audience] We use embeddings, natural-language-processing embeddings, but no database.
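The "log as data" idea can be sketched very simply: treat an append-only log as the source of truth, and reconstruct the state as of any timestamp by replaying entries up to it. This is a toy model of the behavior described above, not Milvus's actual implementation.

```python
class ToyLog:
    """Append-only log as source of truth; state is a replay of the log."""

    def __init__(self):
        self.entries = []  # (timestamp, op, key, value), appended in ts order

    def append(self, ts, op, key, value=None):
        self.entries.append((ts, op, key, value))

    def state_at(self, ts):
        """Replay every entry with timestamp <= ts ('time travel')."""
        state = {}
        for t, op, key, value in self.entries:
            if t > ts:
                break
            if op == "insert":
                state[key] = value
            elif op == "delete":
                state.pop(key, None)
        return state

log = ToyLog()
log.append(1, "insert", "vec_a", [0.1, 0.2])
log.append(2, "insert", "vec_b", [0.3, 0.4])
log.append(3, "delete", "vec_a")

print(log.state_at(2))  # both vectors present
print(log.state_at(3))  # vec_a gone after replaying the delete
```

The pub/sub aspect mentioned above is the same log viewed from the other end: downstream consumers (query nodes, index builders) subscribe to the stream of entries instead of replaying from scratch.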
[Audience] It's mostly running on top of OpenAI's GPT models. So we use embeddings, but somebody would have trained them somewhere; even if it's just Wikipedia, somebody would have trained it. But that was my follow-up question as well: what is the use case for smaller companies who can't generate those kinds of embeddings because they don't have enough compute and memory, for lack of resources? [Frank] Sure, absolutely, and I'd love to answer that as a follow-up in the Q&A as well, but I think you're absolutely right. Vector databases are there to search, sort, and index across all of this unstructured data that we have in the world today. A big use case you mentioned is textual search. If I have "electrical engineering" and "computer science," for example, these two are very related, similar fields, but a traditional database model would probably put "computer science" closer to "political science" or "social science," simply because they share a common word. Do we have any electrical engineering majors in the crowd, by the way? I'm just curious. Ah, in the back. Electrical engineering? Take the "electrical" off? Okay, okay. Well, that's a shame; I guess I'm part of a dying breed, since I majored in electrical engineering in college. But getting back to the point at hand: I've selected four real-world use cases that are really salient for me, that I really like. The first one is reverse image search. Going back all the way to the beginning of the presentation, you'll remember we talked about reverse image search and how Milvus can help you implement such a system. For the Cleveland Museum of Art in particular, this was a very interesting use case.
They wanted to improve user engagement, and they did it by building this reverse image search system called ArtLens AI. I believe it's still online, and there's a blog post about it if you search for it. It really got a lot of their museum-goers more interested in the art and in the gallery as well. Threat detection is another big one, and probably one most folks don't think of. The company here is Trend Micro, and the idea is that we can take, say, an Android APK and turn that into an embedding as well. Now, there's no real deep neural network used in this particular application, but they do have a handcrafted algorithm for turning a new APK, or any APK for that matter, into an embedding vector. Where that becomes really interesting is that you can take known malware, embed it into a common space, and then search for it inside Milvus. So if I have a new, unknown APK, a good way to tell whether it's malware is to turn it into an embedding and see if it's close to other pieces of malware I've seen before. It's a very interesting use case. Prior to using Milvus, I believe Trend Micro was relying pretty much entirely on MySQL; I'm not sure exactly how that worked there, but they saw significant gains in scale and scope after switching to Milvus. There's property search as well, something we did with Compass, which is in a way similar to Zillow or Redfin, not exactly, but in the sense that you want to be able to search for new kinds of property listings.
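A stripped-down sketch of that threat-detection pattern looks like this. The embedding values and the distance threshold here are invented for illustration; the real system uses Trend Micro's handcrafted APK features and a production vector database rather than a linear scan.

```python
import numpy as np

# Hypothetical embeddings of known malware samples (values invented).
known_malware = {
    "mal_apk_1": np.array([0.9, 0.1, 0.3]),
    "mal_apk_2": np.array([0.85, 0.2, 0.25]),
}

def looks_like_malware(apk_embedding, threshold=0.3):
    """Flag an APK if its embedding sits within `threshold` (L2) of known malware."""
    q = np.asarray(apk_embedding, dtype=float)
    nearest_id, nearest_dist = min(
        ((mid, float(np.linalg.norm(vec - q))) for mid, vec in known_malware.items()),
        key=lambda item: item[1],
    )
    return nearest_dist <= threshold, nearest_id, nearest_dist

suspicious = looks_like_malware([0.88, 0.15, 0.28])  # near known samples
benign = looks_like_malware([0.0, 0.9, 0.9])         # far from all of them
print(suspicious[0], benign[0])  # True False
```

The nearest-neighbor query in the middle is exactly what the vector database handles at scale; everything else is application logic around the threshold.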
Again, not your traditional NLP or computer-vision use case, but it's a good use case for vector databases, just not one people would think of off the top of their head. And then product recommendation; I left this one for last. I think the majority of users of vector indexes and vector databases are in the product space: they want to recommend products based on, say, textual data or image data. Imagine I'm Yelp, for example; I want to analyze the pictures that users upload in addition to tags and other pieces of human-generated metadata. So, that's it. I know we're getting into lunchtime, so I didn't want to take up too much time. I'll take any questions now, if anybody has them; we probably have about another 10 minutes or so. [Audience] Embeddings mostly come with a cost, right? The compute and memory required to generate an embedding in a deep learning environment is cost-intensive. So with something like Milvus as a vector database, there's a lot that needs to be done. Based on your experience, what is the cost associated with something like product recommendation? Is it comparable to, maybe, Elastic? [Frank] That's a great question, and it's good that you bring up Elastic as well. A bit of an aside before I answer: I know Elasticsearch also has ANN search endpoints now. But I've tried them myself, and they're not really built from the ground up to support a lot of these embeddings. That's really where I think Milvus shines in comparison to Elasticsearch's ANN endpoints.
Going back to the question at hand, where perhaps I'm a small company and I don't have enough compute resources to generate these embeddings: this is a big reason we have that ETL-for-vector-databases project. It lets us integrate a variety of different models and methods for turning unstructured data into embeddings. Some of them are very cheap and require very little compute. Obviously, if you do have a GPU, we'll automatically leverage it. But if you don't, or if you want to run on a single machine or a very small cluster, being a small company or startup with limited resources, that is something we can do. As an alternative, if I want to turn an image into an embedding and I don't want to use a large model like ViT-Huge, I can select an older model, or a heavily scaled-down version, to generate my embeddings faster with fewer compute resources. So it's not exactly a Milvus question, but it is a problem a lot of Milvus users have, and that's really why we want this ETL tool: to let users generate embeddings quickly and smoothly. I hope that answers your question. Any other questions? Yeah. So the question was, can I provide a demo of query speed? We actually have notebooks, and I'm happy to share them. I don't have them on the slides right now, but with these notebooks you can spin up a Milvus cluster on, say, a Mac or on Linux, and run queries against it.
We have an entire notebook that shows you how to do that. Going back to your question about how quickly we can do queries: we have some benchmarks on the milvus.io website where, given the size of your cluster, the type of index you're using, and the length of your vectors, you can see how each of those factors affects your query rate. But I'd say for, let's say, 50 million vectors in your vector database, depending on the index and the size of your cluster, we should be able to do around 500 QPS pretty easily without too many resources. Hope that answers your question. Yes? [Audience] Can you talk a little about the scope of queries that are currently well supported versus those that might be more on the roadmap going forward? For example, I think kNN is supported right now; I'm not sure if you support searching based on, say, a cosine-similarity threshold, or how flexible that is. [Frank] Absolutely. Let me see if I can find the slide here. This also ties in with what I was saying about Elasticsearch and its ANN index. We strive to provide a lot of flexibility for the user, not only in the index type they want but also in the similarity metric. If I want to use HNSW, which is, again, I believe the most commonly used index, for my particular application, I can do so. But if I don't have too much data, say I only have a million vectors, and I want very precise search, I can use a flat index. So I can tailor the index to my particular application.
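The flat-versus-approximate trade-off just described can be sketched with a toy IVF-style index (a simplified illustration of the idea, not Milvus's implementation): vectors are bucketed by their nearest centroid, and a query scans only the `nprobe` closest buckets. Fewer buckets means faster but possibly approximate answers; scanning every bucket degenerates to exact flat search.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(size=(1000, 16))

# Pick a few "centroids" (a real IVF index trains these with k-means).
centroids = data[rng.choice(len(data), size=8, replace=False)]
buckets = {i: [] for i in range(len(centroids))}
for idx, v in enumerate(data):
    nearest_c = int(np.argmin(np.linalg.norm(centroids - v, axis=1)))
    buckets[nearest_c].append(idx)

def ivf_search(q, nprobe):
    """Scan only the `nprobe` buckets whose centroids are closest to the query."""
    order = np.argsort(np.linalg.norm(centroids - q, axis=1))[:nprobe]
    candidates = [i for c in order for i in buckets[int(c)]]
    return min(candidates, key=lambda i: np.linalg.norm(data[i] - q))

q = rng.normal(size=16)
exact = int(np.argmin(np.linalg.norm(data - q, axis=1)))  # flat, exhaustive scan
approx = ivf_search(q, nprobe=2)  # fast: may miss the true nearest neighbor
full = ivf_search(q, nprobe=8)    # scans every bucket, so exact again
assert full == exact
```

Tuning `nprobe` (and its analogues in HNSW and other indexes) is exactly the precision-versus-speed dial referred to above.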
On top of that, for floating-point embeddings, for these real-valued vectors, we have Euclidean distance and the dot product as well. And if you normalize your embeddings, which I encourage everybody to do (don't use unnormalized embeddings), the dot product ends up being cosine similarity. That's really a key feature of Milvus that I think is awesome: we provide this type of flexibility for our users, integrated directly into our open-source project. Again, I hope that answers your question, but if not, I'm happy to chat a little more offline as well. [Moderator] We have a question from a virtual attendee. Viacheslav asks: is Milvus capable of improving the performance of data processing compared with other databases? [Frank] Is Milvus able to improve the speed of data processing when compared with other databases, right. Okay. I would say a lot of your traditional databases are not meant to store embeddings; they're not really meant to store unstructured data. I won't say Milvus improves the speed of data processing in that sense, but it is targeted at a different type of data: the data you see here on the right-hand side, which there is more and more of. Short videos, long videos, text, images, audio, graphs, geospatial data. It's meant to analyze and search across this kind of unstructured data, rather than traditional structured or semi-structured data. So in that sense, I wouldn't say Milvus improves upon the speed in any sense of the word; it's really meant to work hand-in-hand with a lot of these traditional databases. Hand-in-hand with, say, MongoDB, or with CockroachDB, databases for traditional relational or semi-structured data.
These really, I would say, are two different domains, and Milvus is not looking to replace any of these traditional databases, but rather to complement them. Great question, by the way. So is there any domain-specific optimization done in Milvus? Because if you're storing embeddings, it's mostly numerical data, and you mentioned floating points. Numbers can be stored in a compressed format, unlike, say, raw video blobs, so what optimization do you do when you store them? Do you have slides for that? Absolutely. For the very specific embedding-based optimizations, I'll have to double-check with someone on the Milvus team, but I know we do have optimizations based on the fact that we store only embeddings. So there is compression that's done automatically to save object storage space. This also ties in with SIMD support as well as other forms of parallelization. A lot of modern CPUs have SSE, AVX2, or AVX-512, and we leverage these compute capabilities to let our users speed up querying, indexing, and so on. Is there any database platform that integrates best with Milvus when you're working with, say, thousands of images of lung X-rays or cells? Could you repeat that question real quick? Is there any database platform that works best with Milvus when you're working with images of lung X-rays or cells? Okay, I see, so you're asking about domain-specific adaptation for the unstructured data itself, right? That's actually an interesting question, and I won't say there's any particular domain-specific adaptation that we do for the input data. In your case, you're talking about what is really a sub-field within computer vision, right?
Again, we're not really experts on all the different use cases for Milvus, and I think a lot of that would come down to tuning some of the Milvus parameters to see where the best performance comes from. I know that probably doesn't fully answer your question, or isn't the answer you were looking for, and I'm not really in the biosciences area you mentioned, but I imagine there are optimizations that can be done to Milvus itself, and also to our ETL and visualization tools, to improve performance on your particular type of unstructured data. Is there any notion of versioning built into Milvus, in terms of different model versions generating different embeddings over time, and being able to time travel almost like you would in, say, Iceberg? That's a great question, and it touches on an area of research that I myself am particularly interested in, which is: how do we correlate embeddings? Let's say we have two image recognition models, and they each generate embeddings. How do we correlate these two different types of embeddings together? I think that's more or less the question you're asking, right? We don't have a great tool for that right now; I'll be very straightforward about that. I know CVPR is happening right now as well, and really I think this is an open question, both for me and for a lot of other folks in machine learning: how are we able to map two different versions of embeddings, or two different types of embeddings, into the same space? It's a question that, again, I don't have an answer to.
I'm not sure anybody knows what the best way to do that is. With that being said, in terms of just versioning, you can put these different embeddings in different collections. They can all correspond to the same object, and you're able to search through them that way. Yeah, these are pretty challenging questions. I feel like we're playing a game of "stump Frank" at this point, but if you come across any research or any information on that, I'd love to chat further. I guess that's pretty much it then. I want to thank everybody for coming to the talk today. I know it's during lunchtime, so it's probably tougher than, say, a morning or afternoon session, but that's pretty much all I had. Thank you for coming.
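The pattern described above, one collection per embedding version, all keyed by a shared object ID, can be sketched in plain Python. This is a toy in-memory stand-in for Milvus collections; the names and helper functions are purely illustrative, not Milvus APIs:

```python
# Toy illustration of versioned embeddings: one "collection" per model
# version, each keyed by the same object ID. In Milvus these would be
# separate collections sharing a primary-key field.
collections: dict[str, dict[int, list[float]]] = {
    "img_model_v1": {},   # object_id -> embedding from model v1
    "img_model_v2": {},   # object_id -> embedding from model v2
}

def insert(version: str, object_id: int, embedding: list[float]) -> None:
    """Store one object's embedding under a given model version."""
    collections[version][object_id] = embedding

def lookup(object_id: int) -> dict[str, list[float]]:
    """Gather every embedding version for one underlying object."""
    return {v: c[object_id] for v, c in collections.items() if object_id in c}

insert("img_model_v1", 42, [0.1, 0.2])
insert("img_model_v2", 42, [0.3, 0.1, 0.4])  # new model, new dimensionality

# Both versions still point back to the same object.
versions = lookup(42)
```

Nothing here aligns the two embedding spaces, which is exactly the open research question from the answer above; it only shows the bookkeeping side of keeping versions side by side.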