I'm Robin, welcome back to Whiteboard Friday. About a year ago, tools like ChatGPT, tools based on large language models, came out and got us all thinking about how we can use them better as individuals and marketers. In this video, I'm not going to make you a machine learning expert, and I don't think I'm going to be tearing up what people have been saying so far, but I am going to break down some of the technology and some of the limitations that go into it, which I think gives us a lot of clues about how we can best use these tools as individuals and marketers. And I think you'll be pleasantly surprised that a lot of it has parallels with the kinds of things we're used to.

So let's start with the steps a tool like ChatGPT has to go through to give you an answer to a question. Like search engines, they have to first gather the data, then they need to save the data in a format they're able to access, and then they need to give you your answer at the end, which is a bit like ranking.

If we start with gathering the data, this is the part that's closest to the search engines we know and love. They're basically accessing web pages, crawling the internet, and if they haven't visited a web page or found another source for a piece of information, they simply don't know that answer. And they're at a disadvantage here, because search engines have been recording this information for decades, whereas these tools have only just started. So they've got a lot of catching up to do, and there are a lot of corners of the internet they haven't really been able to visit.

One piece of information they can gather that search engines can't access is chat data. When you're using these platforms, they are gathering data about what you're putting in and how you're interacting with it, and that feeds into their training.
So that's one thing to be aware of when you're working with platforms like ChatGPT: if you're putting private data in there, it's not necessarily private after you've done that. You might want to look at your settings, or look at using the APIs, because they tend to promise they don't train on API data.

If we move on to the second stage, saving that information, this is what we'd refer to as indexing in search. And this is where things diverge a little, but there are still quite a lot of parallels. In the early days of search engines, the index, the data they had saved, wasn't updated live the way we're used to. It wasn't that as soon as something appeared on the internet, we could be sure it would show up in a search engine somewhere. It was more that they would update once every few months, because those index updates were costly in terms of time and money. We're in a similar situation with large language models at the moment. You may have noticed that every so often they say, "Okay, we've updated things. The information the model has is now live up to April," or something like that. That's because when they want to put more information into the models, they actually have to retrain the whole thing, which, again, is very costly for them to do.

Both of those limitations feed into the answers you're getting at the end. I'm sure you've seen this: you might be working with ChatGPT and it hasn't happened to see the information you're asking about, or the information it does have is out of date. And the way you've probably managed that is by either copying and pasting a bunch of context into the chat window, which is something quite specific to these tools that we can make the most of, or maybe you've asked Bing to go away and look for something.
And this is a dynamic we're getting very used to with these tools, but it is quite specific to large language models: the fact that we can shore up the gaps in the long-term memory by dumping material into the short-term memory and asking the model to work with that.

The interesting thing is, even when we're not doing this deliberately, it's what's happening in the background anyway. When you have a conversation with ChatGPT, the model actually has no memory, even of what it last sent you. Whenever you send a message, there's a script running that copies the entire conversation, and the message actually sent to the model is, in effect: "Here is a conversation between a machine and a person, and here is a question at the very end. What would you say next?" So it has none of that memory. That's useful for us to know, because in all of these interactions we're relying on the ability to dump as much context as possible into short-term memory.

And there can be some limitations there. It used to be that the main limitation was simply how much information you could fit in, and a lot of the conversation came down to context windows; you couldn't really paste that much information in. We've now reached the point where you can put quite a lot in. With GPT-4 you can paste about 300 pages' worth of text, and with key competitor Anthropic, about 500 pages. So we've really reached the point where most of us don't want to be copying and pasting that much information into each prompt anyway. That's not really the limitation anymore, but there are still a couple of problems these companies and developers need to overcome. One is that 500 pages is a lot, but it's still not the entire internet, so the gaps in their knowledge can't all be filled by us pasting everything into short-term memory.
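That conversation-replay mechanic can be sketched in a few lines of Python. This is only an illustration, not any vendor's actual implementation; the message shape loosely mirrors the familiar chat-completions format, and `build_request` is a made-up name:

```python
# Sketch of the "no memory" mechanic: the model never remembers the
# conversation, so a wrapper script replays the whole transcript with
# every single request. No real API call is made here.

def build_request(history, new_user_message):
    """Return the full message list sent to the model for this turn."""
    messages = [{"role": "system",
                 "content": "Here is a conversation between a machine and a person."}]
    messages.extend(history)  # the ENTIRE prior conversation, every turn
    messages.append({"role": "user", "content": new_user_message})
    return messages           # in effect: "what would you say next?"

history = [
    {"role": "user", "content": "What's a context window?"},
    {"role": "assistant", "content": "It's the model's short-term memory."},
]
request = build_request(history, "How big is it these days?")
# The request grows with every turn: system message + all history + new question.
```

Notice that nothing is stored between calls; the "memory" lives entirely in the payload we rebuild each time, which is exactly why long chats eat into the context window.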
The other problem: there's research by Liu et al. (the "Lost in the Middle" paper) and a couple of other groups showing that when you're working with this short-term context, putting a load of information into prompts, and the key information the model needs is buried somewhere in the middle, the accuracy of it pulling out that information quite consistently drops off. In fact, you can get a bigger performance boost by taking the relevant information and moving it towards the start of your prompt than you can by paying for the next most expensive model.

That's interesting for us to think about, because when we're working with these models, we know we're regularly dumping lots of information into the context; that's how the model pays attention to our conversations and how we fill the gaps in its knowledge. On the one hand, we don't want to have to search through all of these documents and find exactly where the right answer is ourselves; that's what we're asking ChatGPT to do. But on the other hand, if we're putting loads of irrelevant stuff in there, it becomes more and more likely that it's going to miss the thing we actually wanted it to handle.

So there are a couple of things we can do about that. One, as individuals, we can remember that long conversations are essentially all being dumped into this context window. So if we're talking about one thing and then want to switch to something else, we can simply start a new chat, which is very easy in these interfaces. Another thing we can do is be mindful of things like mega prompts, those really condensed blocks of instructions you often find people putting at the top of these interactions. Again, that can mean things get shifted into the middle portion of the context, where they often get lost.
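One way to act on that finding is to do the ordering ourselves before prompting. The sketch below assumes you already have rough relevance scores for your context chunks, however you estimated them; `assemble_prompt` and the sample chunks are invented for illustration:

```python
# Illustrative workaround for "lost in the middle": given context chunks
# with rough relevance scores, place the most relevant material at the
# START of the prompt rather than letting it sink into the middle.

def assemble_prompt(question, scored_chunks):
    """Order context so the highest-relevance chunks come first."""
    ordered = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    context = "\n\n".join(chunk for chunk, _score in ordered)
    return f"{context}\n\nQuestion: {question}"

chunks = [
    ("Our brand guidelines were updated in 2019.", 0.2),
    ("Q3 organic traffic grew 14% after the migration.", 0.9),
    ("The office coffee machine manual.", 0.1),
]
prompt = assemble_prompt("How did organic traffic change in Q3?", chunks)
# The traffic figure now leads the prompt instead of hiding mid-context.
```

The same idea applies to mega prompts: key instructions near the start (or the very end) are far more likely to be honoured than the same instructions buried in the middle.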
So we're trying to keep things as focused as possible, and if we're dumping big blocks of information in there, we're also trying to weed out the stuff we know is less relevant.

The other solution to this kind of problem is more technological and more enterprise, but it's definitely something for us all to be aware of as a growing industry, and that's something called RAG, which stands for Retrieval Augmented Generation. The way that changes things is that, instead of our question going straight to ChatGPT, for example, with nothing else, we keep a separate database of the important information we want the model to draw on. That could be all of our company's internal documentation that we don't want to go external, or it could just be a bunch of things we know are relevant but couldn't all be copied and pasted straight into the context. There are tools called vector databases, which are designed to work very well with large language models and share a very similar kind of logic. They can pull out the most relevant documents and add them naturally into the context as part of the prompt.

This automates the step of you going off to search, copy and paste things, and weed out what's less relevant. It makes things much more streamlined, it gives you a more private but much more contextually aware version of these tools, and it starts to solve some of the problems we have in terms of missing knowledge. So in terms of things to watch out for, and potentially work you might want to get involved in, it's worth bearing in mind that this is what people mean when they talk about RAG. But what does that mean for us as individuals? What does it mean for us as search marketers?
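Before we get to that, the retrieval step just described can be sketched in miniature. A real system would use an embedding model and a proper vector database; here the "embeddings" are tiny made-up vectors, and the whole pipeline is just nearest-neighbour search plus prompt assembly:

```python
# Minimal sketch of the retrieval step in RAG: find the stored documents
# nearest the query vector and prepend them to the prompt. The 3-d vectors
# below are toy stand-ins for real embeddings.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Pretend these vectors came from an embedding model.
docs = {
    "refund-policy": ([0.9, 0.1, 0.0], "Refunds are processed within 14 days."),
    "style-guide":   ([0.1, 0.9, 0.1], "Always write headings in sentence case."),
    "crawl-budget":  ([0.0, 0.2, 0.9], "Googlebot visits the site roughly daily."),
}

def retrieve(query_vec, k=1):
    """Return the text of the k documents most similar to the query."""
    ranked = sorted(docs.values(), key=lambda d: cosine(query_vec, d[0]), reverse=True)
    return [text for _vec, text in ranked[:k]]

context = retrieve([0.85, 0.15, 0.05])  # a query vector "about refunds"
prompt = "\n".join(context) + "\n\nQuestion: How long do refunds take?"
```

The point of a vector database is simply to do that `sorted(... cosine ...)` step efficiently over millions of documents, so the model only ever sees the handful of chunks most likely to contain the answer.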
Well, as I said before, these companies have a couple of fairly big problems that make it quite hard, or at the very least quite expensive, to fill all the gaps in their knowledge. At the moment that's not too much of a problem; these tools are so new that we don't mind a bit of copying and pasting. But I think it's going to get to the point where we start comparing these tools based on how easy they are to use. Do I have to keep reminding this one of the stuff I told it last week? If so, I might switch to the other one where I don't have to. So they're going to have to solve that problem somehow.

One way they could solve it is by baking RAG in from the start. Essentially, when you're using the tool, there would be three tiers of memory: the long-term memory, the model itself, which is how things work at the moment; the short-term memory, the stuff you're dumping into the chat; and a kind of medium-term memory, information they've been able to gather but haven't had the time or resource to bake into the model yet.

Now, if you try to do RAG yourself, and if you've got a bit of time and a bit of coding knowledge I recommend you give it a try, you'll find that a lot of the conclusions you come to are very familiar. For example, HTML is a fantastic, flexible way to communicate information. If someone has designed a page well and put the right headings in, it becomes much easier to pull out the most relevant material, and that chunk of information is much more likely to work its way into your answer. Likewise, if you've got one page that's full of irrelevant stuff with a couple of good nuggets of information, and another page that's very well targeted to the topic you're asking about, the second page is probably the one that shows up.
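To make that concrete, here's a rough, standard-library-only sketch of heading-based chunking, the kind of thing you quickly end up writing when you try RAG yourself. A production pipeline would handle nested sections, navigation, and boilerplate; `HeadingChunker` is invented for this example:

```python
# Why well-structured HTML helps retrieval: if a page uses proper headings,
# you can split it into self-contained chunks, each labelled with its
# heading, and those chunks index cleanly in a retrieval system.
from html.parser import HTMLParser

class HeadingChunker(HTMLParser):
    """Collect page text into chunks keyed by the nearest h1/h2/h3 above it."""

    def __init__(self):
        super().__init__()
        self.chunks = {}      # heading text -> body text beneath it
        self.current = None   # heading we're currently collecting under
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self.in_heading = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.in_heading:
            self.current = text
            self.chunks[text] = ""
        elif self.current:
            self.chunks[self.current] += text + " "

page = ("<h2>Crawl budget</h2><p>Googlebot revisits fast pages more often.</p>"
        "<h2>Redirects</h2><p>Chains waste crawl budget.</p>")
chunker = HeadingChunker()
chunker.feed(page)
# chunker.chunks now maps each heading to the copy that sits beneath it.
```

A page that's one undifferentiated wall of markup gives you nothing to key those chunks on, which is exactly why the well-structured page tends to win.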
So the things we've been thinking about when optimizing for search engines are very relevant when we start thinking about RAG. And even if these companies don't go down the RAG route, we know they still have to crawl and find all of this information. So the idea of having pages that are well worth a crawler's time to visit, well structured, easy to understand, and very relevant continues to be a good way to think about preparing ourselves for potentially optimizing for large language models in the future.

So, in summary: as individuals, if we're working with something like ChatGPT, we need to bear in mind the data's not private, so you might want to look at your settings or check out using the APIs. We also want to keep conversations as focused as possible on the things we want, and if we're pasting in a bunch of data, do some checks to weed out the stuff we know definitely won't have the answer. As search marketers, if you're optimizing your website for search engines at the moment and someone asks, "But what are you doing to make sure we're ready for large language models?", we don't know exactly what's going to happen, but it's a really, really good bet that the work you're doing now to make things good for search engines is going to be very relevant for the large language models to come.

I hope you've enjoyed listening as much as I enjoyed talking about all of this. If you have any thoughts, please feel free to reach out, and thanks a lot for listening.