First of all, thank you everybody for joining my talk. Some of you might have come to my other talk this morning, which was a joint talk with RisingWave, so there may be a little overlap in the introductory portion on the ChatGPT side. My name is Mary Grygleski, and I'm a senior developer advocate at DataStax, the primary sponsor of this conference. This talk is about entering the brave new world of GenAI with vector search.

May I ask: how many of you are already working with generative AI in some form? OK, great. Are any of you Cassandra users? Yes? OK, so you're familiar. Are you using our Astra database by any chance? Not yet, OK. So this will be more introductory, and I'll spend a bit of time setting the stage on AI in general. For those of you who are new to generative AI, this should be the right level. I know you're all experienced people; it's introductory only from the perspective of getting to know GenAI.

So this is the agenda. I'll give a quick introduction, a brief background of AI and GenAI, and the players in the new GenAI era we're living in. Then I'll introduce some terminology: GPTs, NLP, LLMs, and then vector databases, vector search, and vector embeddings. And a small, quick demo, given that this is only a 30-minute talk, just to show you how you can leverage Astra, our managed Cassandra in the cloud. On the DataStax Astra platform you get $25 of free-tier access every month, and the nice thing is that you don't need to give your credit card.
So you can just play around with it, and if you need more, you can always talk to any of us at DataStax. That's the agenda for today.

First of all, who am I? I'm a senior developer advocate at DataStax; I assume most people know who DataStax is by now. I came from a Java background and was a Java engineer for a number of years. About five years ago I started doing advocacy: going out and speaking at conferences, hosting community meetups, and things like that. I'm also the president of the Chicago Java Users Group, which is part of a worldwide family of Java user groups; there are also groups in San Francisco, Silicon Valley, and the East Bay. Out of curiosity, how many of you are Java developers? Quite a few, very cool.

Now we're moving into this GenAI era, working with new things, and we should all embrace it. Interestingly, when I speak at conferences with some deeply experienced Java folks, they say this GenAI thing is just a fad, that it could be like blockchain. I don't think so. It serves a really different purpose, with a much wider application area, and I find it exciting. I'll share my contact information and this slide deck later, so no need to worry about taking notes.

OK, so a brief background of AI. When I first joined DataStax in March of 2022, that was before the GenAI wave, and I actually joined the event streaming team. On the Astra platform we also have
a managed streaming offering, powered by Apache Pulsar; that's the team I joined. But I have experience and interest in a lot of different areas, so about five months ago, in July, when DataStax said let's all work on GenAI, I took the opportunity, embraced it, and started learning.

A bit of background, then. There's a lot of confusion, because so much has been said about GenAI since the birth of ChatGPT a year ago, on November 30th. People say it's going to take our jobs away, because it's true that it can write code on our behalf. But I think that's trusting it too much. It's just a machine, and it still relies on human beings, essentially.

To go back to AI: when I started doing research to understand this field, the first documented case of artificial intelligence, in the sense of relying on a machine or some other mechanism to do the work for us, goes back to around 400-500 BC, to the Greek philosopher Archytas, who built a steam-powered pigeon. That's mechanical, but it's still a form of artificial intelligence, I suppose, because you're letting a machine do the work.

Fast forward to the 20th century, which is the most interesting period in terms of how we got to the current generative AI phase of this whole computing world. Around the 1930s, all this science fiction came out, and then there was Alan Turing, the father of modern computing from the UK, who asked: can machines think?
All of this got everybody interested in research into making machines operate beyond just doing what we tell them, to be truly intelligent. Lots of experiments were done. I also want to point out, since I was a developer advocate at IBM before, that IBM had the Deep Blue chess machine, which beat the chess champion of the time, Garry Kasparov. That's an interesting story if you want to look into it. It was by no means doing artificial intelligence the way generative AI does now, but think of it as a milestone on the road toward more and more sophistication.

So that's a quick touch on what led to AI; again, it's all about automation. I also want to describe how this whole period fits together, using a popular set-theory picture of AI. Think of it as peeling an onion. Artificial intelligence is the whole onion: mimicking the intelligence and behavior of human beings or other living entities. Dig deeper and there's the machine learning layer, where instead of explicitly programming the computer to think, we provide it data so it can learn from that data. But that alone isn't sufficient, so you go deeper still, to how the brain works, and you get to neural networks.
That's really the core of it: the deep learning side, the neural network side. That's where all the current LLMs, generative AI, and NLP live, in those neural network layers modeled loosely on how the brain's neurons work.

OK, so let's take a look at this fascinating GenAI era. What is generative AI? It's a disruptive field within AI. Disruptive in the sense that, with any traditional computer program, we need structured input to tell it what to do, but with generative AI we can talk to the bot, to the machine, as though it were another human being. Never before was there a machine that could interpret how we talk as humans. Think about it: we're all engineers, and when we first learn to program we have to follow strict rules about how we give input. Even in Java, say, you have to pass specific parameters of specific types, which is not very convenient and is not how human beings talk. We don't say, "I want this equals two, that equals five." So it's amazing that generative AI can take human language as input. Those are the prompts: the prompt comes in, and the model interprets it and produces the answers we want. That's a very disruptive, very innovative way of doing things. And generative AI makes use of machine learning, learning from data, as well as the deep learning side, in order to produce content.
Generative AI also tends to be more creative: it can write poems and essays, generate code, find images, design a new house or a new dress. That's very different from predictive AI, which we were already doing before this: the traditional way of making business forecasts from data, or weather forecasts from patterns in the atmosphere. Predictive AI makes a prediction, but the predicted result expires; after the five-day forecast window it's gone. Generative output tends to be creative and lasts longer.

OK, now a bit of history. (Let me check my time; sorry, I dropped my cell phone.) This is a quick step through history since the new millennium, when neural network research started gearing toward where we are today. Back then they may not have known exactly where it was heading, but the highlights of this period are interesting. Only about 20 years ago came the first feed-forward neural network language models. Then in 2011 Siri came out with the iPhone, which was quite innovative because Apple built an assistant on natural language processing. And 2013 brought word2vec, which really leads toward where we are today: a neural network learning word associations from a large corpus of text.
Then in 2017, if you've been paying attention, there was the paper called "Attention Is All You Need," research done by a group of Google researchers on the transformer. So that's a bit of the history that came before.

Now let's look at the players, so to speak, in this new generative AI era. Generative AI is based on models: that's where the training happens, so the models become capable of performing all the cognitive types of processing needed. These are just a small handful of all the models that have been developed, but essentially OpenAI came out with GPT-3.5, which shocked the world, or rather brought the world into a new age of computing. GPT-3.5 came out a year ago. There are also others: Stable Diffusion, DALL-E, and Midjourney for images; GPT-4, which came out in March; Codex, the model behind GitHub Copilot; and Whisper, which deals with audio, for example.

Then there are generative apps that make use of those models; again, just a small handful here. There's MonkeyLearn, which basically learns from text and searches for answers in it, without you needing to code much. There's ChatGPT, GitHub Copilot as I mentioned, Salesforce AI, Bing AI, Notion AI, WordTune, and so on; these are examples of apps that make use of the models.

One more thing I want to point out: we probably think mostly about text input coming in, and then from text you search for what you're looking for. That's all one mode, so to speak.
But to make generative AI more useful, the output should be able to differ from the input: you ask a question in text, and it comes back with images, for example. You have different kinds of inputs and get responses as images, video, or audio; all of this is pretty advanced. A lot of research is being done on this multimodality, and these are the six main modes we're trying to work with. They have cross-modal capability: based on one mode, you look for results in another.

Now let's look at the people players. I just want to point out that data scientists and, say, computer vision engineers are more concerned with the what, the subject matter of generative AI, and may not be as concerned with how things get implemented. The ones implementing things are the data engineers, AI engineers, MLOps and DevOps engineers. So we should be working together, helping each other out, and we have to understand the different roles different people play.

OK, now let's get into some of the terminology. GPT stands for Generative Pre-trained Transformer, as in ChatGPT. As the name suggests, it's a transformer: it takes simple prompts, natural human language, as input. And then, say you're searching for responses that are in text; it goes and does similarity searches, for example.
For example, here at DataStax we have Astra, which is based on the Cassandra database, and we've added a vector data type that helps with similarity searches; I'll explain that in a little bit. So that's GPT: it transforms, and essentially answers questions based on the prompts, the input. It produces content such as a new essay, a blog post, a piece of music, a new dress design, that type of thing. Pretty nice technology.

GPT came out around 2018. It was Alec Radford's paper, only five years ago, that described GPT as a language model, and OpenAI first published it then. A year later came GPT-2, another language model, trained on a dataset with more documents, which expanded its scope. A lot has happened within those five years. In 2022, Stability AI developed Stable Diffusion, a deep learning text-to-image model that generates images from text descriptions; that's in the same family as DALL-E and Midjourney. Then exactly a year ago, on November 30th, OpenAI released ChatGPT, built on GPT-3.5, an AI tool that reached one million users within about five days. It clearly took the world by storm. And in 2023, as we can see, all the big vendor players came out: Microsoft built ChatGPT into Bing, Google has Bard, there's Claude, and many more in the market now.
In 2023 there are also newer versions of ChatGPT; needless to say, the wave keeps going.

So let's take a look at NLP, natural language processing. It's an interdisciplinary subfield of linguistics and computer science. We need to process natural languages, and this is the discipline we use for it. It applies rule-based and probabilistic machine learning techniques to process language data, and it enables the computer to learn from the content: not just producing "A equals B," but building a context around what you're looking for, going beyond the obvious. The idea is that it should also be able to generate context and draw insights from the documents you're asking about. That's where the magic comes from.

Then there are large language models, LLMs, which are what we as engineers and developers mostly work with these days. An LLM is a type of machine learning model, and it's what's called a foundation model: it has to be pre-trained, and pre-training uses a lot of computing resources because of the large amounts of data it needs to go through. As individuals, there's just no way we can train an LLM ourselves; it relies on big companies and vendors funding these LLMs. It takes a lot of computing power, not just one or two GPUs but thousands of them, and even then it takes a lot of time. So they are not cheap.
However, bear in mind that LLMs are fairly static: even though a model was trained on a large amount of data, that data might be from a year ago, so everything between then and now is missing. That's why other techniques are being talked about, for example an architectural pattern called Retrieval Augmented Generation, RAG, which you might have heard about here too. It's a commonly used pattern, just an architectural pattern that can be implemented in thousands of ways, but it helps fill in the gaps for that missing data in the LLMs.

So what does an LLM do? It performs all the NLP tasks: it processes the natural language that comes in as prompts, answers questions, analyzes sentiment, holds chatbot conversations, and so on. If I want to draw a picture of where the LLM sits, look back at the original onion diagram: artificial intelligence, then machine learning, then deep learning going inward, and very deep inside is where GenAI comes in. The transformers, image generation, and LLMs all sit inside that deep learning layer. And it makes use of the NLP behind it; that's really the rule set it goes by.

OK, some examples of APIs and frameworks, now that we're getting to how to actually start doing things. These are some of the common APIs and frameworks you've probably all heard of by now: LangChain, Llama 2, LlamaIndex (there are speakers from those companies at our conference here too), and then PaLM and Hugging Face. Hugging Face is interesting: it's an open-source AI hub.
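Before moving on, here is a toy sketch of the RAG pattern mentioned above. This is purely illustrative, not how Astra or any particular framework implements it: a simple word-overlap score stands in for real vector similarity, and "building the prompt" stands in for the actual LLM call. All function names and documents here are made up for the example.

```python
# Toy sketch of the Retrieval Augmented Generation (RAG) pattern.
# A real system would embed documents into vectors, store them in a
# vector database, and send the augmented prompt to an LLM API; here
# retrieval is a word-overlap score so the shape of the pattern is
# visible end to end without any external services.

def score(query: str, doc: str) -> int:
    """Overlap between query words and doc words (stand-in for vector similarity)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user's question with retrieved context before the LLM call."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

docs = [
    "Cassandra 5.0 adds a vector data type for similarity search.",
    "Apache Pulsar is a distributed messaging and streaming platform.",
    "Astra is DataStax's managed Cassandra service in the cloud.",
]
prompt = build_prompt("What is the vector data type in Cassandra?",
                      retrieve("vector data type in Cassandra", docs))
print(prompt)
```

The key idea is that fresh or private data reaches the model through the prompt at query time, so the LLM itself never needs retraining.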
Think of it like a GitHub repository, but for AI instead of code: you can search Hugging Face for models and use them for free.

OK, so now let me get into vector databases and vector search; there's so much to say about this, and within 30 minutes I have to go fast. So, vector databases: why are they important? A vector database is purpose-built to handle the kinds of complex operations machine learning needs. With machine learning we need to do searches that go beyond what we're used to with scalar data. Scalar data is single-dimensional; we now need data that is multi-dimensional, so that we can handle it, draw conclusions from it, and build out a context. A vector database makes use of a lot of math under the hood; if you're keen on math, this is linear algebra and matrix math driving all of these vector operations behind the scenes. That's why it's called a vector database: it's built for machine-learning-style search over more complex data, because you want to identify patterns and how things relate to each other.

To use a vector database, the input data gets translated. If you look at a string, you parse your data and break it down into words, and each word is then mapped to a numeric representation.
Those numbers are stored as what are called dimensions in the database. Say you have a table with a vector data type: if you do a select, you'll see the column is actually an array of floating-point numbers, and each floating-point number represents one dimension of the embedding for the string you're working with. Selecting from the table directly isn't very interesting; it's all decimal numbers, and you don't really know what they mean, but that's what's stored. The vector database then has the capability of doing similarity searches, using a technique called approximate nearest neighbor (ANN) to speed up the search, and it all depends on vector math: cosine similarity, Euclidean distance, or dot product, different ways of measuring closeness. So if you're interested, you can always go back and brush up on your geometry.

I was actually a math major alongside computer science in college, many years ago, and I remember thinking: how would people ever use this? Then you enter the workforce, end up at some IT shop fixing spaghetti code, and you say, I never get to use math. Now, as an advocate working with vector search and vector databases, I've gone back to some of my math and said: oh, this is how it's used. Pretty interesting. OK, so that's the vector database.
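The three closeness measures just mentioned can be sketched in a few lines of plain Python. The vectors here are tiny made-up "embeddings," just to show how the arithmetic behaves; real embeddings have hundreds or thousands of dimensions.

```python
import math

# The three similarity measures used in vector search, on toy embeddings.
def dot(a, b):
    """Dot product; larger means more aligned (for comparable magnitudes)."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine similarity; 1.0 means same direction, 0.0 means orthogonal."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

cat = [0.9, 0.8, 0.1]   # hypothetical embedding of "cat"
dog = [0.8, 0.9, 0.2]   # hypothetical embedding of "dog"
car = [0.1, 0.2, 0.9]   # hypothetical embedding of "car"

print(cosine(cat, dog))  # close to 1.0: semantically similar
print(cosine(cat, car))  # much lower: different concepts
```

Whichever measure the database uses, the query boils down to: compute this score between the query vector and the stored vectors, and return the best-scoring rows.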
OK, I realize I'm running short on time, but I want to quickly show you this in diagrams. Vector data can be pictured in a two-dimensional X-Y coordinate space: each vector has a particular place in the graph, and when you search, you want to see how close the stored vectors are to what you're searching for. Another picture makes it even clearer: say you're searching for words like "cat," "dog," and "house." They're laid out in a conceptual space, and a search looks for the points closest to your query, based on how the vector embeddings are spread out in that graphical representation.

I won't get into all the details here, but there are a number of vector databases. At DataStax, we first implemented the vector data type and vector search in our managed cloud version of Cassandra, and now the feature is going into Cassandra 5.0 itself; I think it's already out in beta. One more thing I want to point out: our co-founder Jonathan Ellis wrote JVector, a pure Java library that implements the approximate nearest neighbor search, using a technique called DiskANN that came out of Microsoft. It's really high-performance.
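The "cat, dog, house in a conceptual space" picture can be made concrete with an exact nearest-neighbor search over toy 2D vectors. Note this brute-force scan is O(n) per query; the whole point of ANN indexes like the DiskANN-style graph in JVector is to avoid scanning everything at scale, at the cost of approximate results. The coordinates below are invented for illustration.

```python
import math

# Exact nearest-neighbor search over a toy 2D "conceptual space".
# "cat" and "dog" sit near each other; "house" is far away.
words = {
    "cat":   (1.0, 1.2),
    "dog":   (1.1, 1.0),
    "house": (5.0, 4.8),
}

def nearest(query, table):
    """Return the word whose vector is closest (Euclidean) to the query."""
    return min(table, key=lambda w: math.dist(query, table[w]))

print(nearest((1.05, 1.15), words))  # a point near the cat/dog cluster -> "cat"
```

An ANN index gives up this guarantee of the true closest match in exchange for sub-linear query time, which is the trade-off vector databases make.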
That's why you'll find our vector database to be very solid: because of this mechanism underneath. If you're interested, I can give you more information, and if needed I can reach out to Jonathan; he's really nice, and I can ask him if you need more detail.

Now, vector embeddings, real quickly: what are they used for? Besides vector searches, you can use them for clustering, recommendations (think product recommendations in shopping), multi-dimensional search, anomaly detection, diversity measurement, and classification of data. You can use these techniques to do a lot of things.

One more word: you can still use a traditional database too, nobody is stopping you, but you'll find that it just cannot handle the complex data types required for machine-learning-style searches. So it's probably not a good idea, even though you can do it.

This is a quick diagram of a typical GenAI RAG-based application, showing where the vector data fits and why the database becomes so important. You do need storage for this data, so it's essential to have a database that can store it very efficiently and let it be queried very efficiently too. The DataStax database can handle that job very well.

OK, a really quick demo; I don't know if I still have time, but let me show you very quickly. I'll be sharing a short link with you; you basically go to DataStax to sign up.
As I mentioned, you can sign up with GitHub or Google for easy single sign-on, and without giving your credit card, which is really nice: just your email, and then you can create the account. Let's say I already have this up here; let me quickly sign in. I already have an account, so it's as easy as signing in, and then you see this console. It's pretty self-guided, so I'll let you explore, but I do want to point out that to create a vector database you just go here, click Create Database, pick the Vector option, and then give the database a name, a keyspace (which is essentially a namespace), and a region: AWS, Azure, or GCP. I don't know why GCP is... oh, something's wrong with Google today, sorry, but only some days, not all the time. So we support the three major cloud platforms; you click, create the database, and from there, I already have one spun up, actually on Azure, so you can see everything in here. Again, I'm running out of time, so I won't step through everything, but you can also use the CQL console to quickly interact with the database, or the CLI from your command line.

That's pretty much it, but in my slide deck I also give you the link to these examples, which are very much self-guided, so I'll let you experiment with them. Let me get back in here and quickly share this with you again: this is vector search on the Astra platform. I invite you to sign up; again, no obligation, and you get something to use for free. OK, and then some benefits.
I just want to say: as we've seen, with GenAI you can ask and you shall receive, but make sure you ask wisely, because there are challenges too. There are hallucinations: if the data isn't there, it can give you back a wrong answer, so you have to be careful. There are also ethical concerns when you work with GenAI, and the data-freshness lag I talked about, which we can try to make up for with the RAG pattern for LLMs, for example.

With that, I want to say thank you and share some resources. This is my slide deck, the short bitly link, and the QR code if you want to grab the deck and get all the information from there. Let's see, I see some of you are ready to take a picture, OK. All right. And here are some of the resources, the links; if you get my slide deck you'll be able to get to all of them. And one more thing: I also have a Twitch stream, if you want to follow me, and I promise in 2024 I'm going to restart doing more live coding and live streaming. You're invited to join and interact with me; I'll put you on the video too. And with that, oh, these are just DataStax offerings, but you can go up to our booth. Oh, the booth is closed, I'm sorry. Anyway, I'll give you those cards; please sign up and we can talk. So with that, thank you very much, I really appreciate you coming to my talk, and please let me know what you think. Thank you.

[In response to an audience question:] You'd be using our account in that case, yes. But you can also go through from the AWS side; I think our Astra database is on the AWS Marketplace, so you should be able to pick it from there.
I know for sure Azure can do it, because I did a talk where someone showed that on the Azure portal you can also set it up from their side and then access it from our console too. So yeah, let's give it a try. Thank you, thank you so much.