Hello, everybody, and welcome to the November 2023 episode of law.mit.edu's Idea Flow, a relatively informal set of discussion and demo sessions where we highlight what we think are some of the most interesting, important, relevant, and timely happenings in the area of computational law generally, and nowadays that specifically means applications of generative AI to help solve legal use cases or to achieve goals that we may have for law and legal processes. We've got two really interesting examples of just those types of projects and initiatives, and to get us started today I've invited some friends and colleagues at the Consumer Reports Innovation Lab to show us some really interesting and useful things that they're starting to do with privacy policies, as part of an innovative application that puts more data control back in the hands of consumers. And so with that, I'd love to introduce Ginny, and also Dan. If you could introduce yourselves and your roles, and then dive right into this extremely cool application of generative AI that you've been devising and starting to deploy to get on top of privacy policies on behalf of consumers. Take it away.

Thanks, Dazza, and thanks for inviting us to be here. It's really awesome to be with all of you. I've attended this many times and seen a lot of great presentations here, so it's an honor to get to present to this group. My name is Ginny Foss and I lead the product and R&D group within the Consumer Reports Innovation Lab. My team is responsible for a lot of research and fast prototyping around how we use technologies to power consumer protection use cases. And I'm here with my colleague Dan Leininger.

Hi, yes, I am Dan Leininger. I'm the head of experimental engineering in the Innovation Lab at Consumer Reports. Very nice to meet you all.

So today the two of us are going to let you all peek under the hood of what we have brewing here in our Innovation Lab. All of this work is very early-stage, experimental work; none of it has been deployed to production. However, the way that our lab works is that we are building products that are serving consumers every day, and we use those products to help us figure out which problems we want to prioritize solving with new technologies. Specifically, our group has done a lot of work for a few years now in the area of privacy and consumer data rights, and this has been an area where we've found some of our closest partners and co-conspirators. Consumer data rights are a new kind of fundamental right that we have as consumers in this country. These are rights enshrined under new privacy laws in states like California and about a dozen others, laws that give you a right to your data in various forms: the right to tell a company to stop selling your data, the right to tell a company to send you a copy of your data, as well as the right to tell a company to delete your data entirely. These are rights that we've been really excited about at Consumer Reports, because we think they give consumers a whole new kind of agency in the digital marketplace. We're doing a lot of research, as well as experimentation and even product development, around this idea of consumer data rights. I'm going to start by showing you a product that is our flagship data rights product, which we just released, and that will give the context for some of the experimentation we've been doing with large language models and privacy policies.
So I'm going to share my screen and give you all a glimpse of our new product, which is called Permission Slip. Confirming that this screen is visible to the group?

Looks great.

Permission Slip is a mobile app that helps you take back control of the data companies have about you. It's available on iOS and Android. What we did with Permission Slip is we wanted to make it really easy and seamless for consumers to use their new rights to their data, their rights to privacy, across the country. So what this is is an interface that allows you to find out what kinds of data a company collects. We have a whole library of companies in the app, and we read the companies' privacy policies and interpret them for you, so you understand what data a company like Home Depot has about you. Then we give you, as a consumer, easy ways to manage that data by activating your data rights. We offer two rights right now in Permission Slip: one is that we let you send a request that tells a company to stop selling your data, and the second is that we let you send a request that tells a company to delete your data entirely. What's important to understand from these screens is that a lot of what our team is doing under the hood every day is reading privacy policies from companies and interpreting them so that consumers can make an informed choice about how to manage their data.

So let's segue into exactly what that process used to look like in a manual sense. I'm hoping you all are seeing a slide right now about our team's process for educating consumers about the kinds of data companies collect. What that process has been is that we collect copies of a company's official privacy policy, and we have a consistent way of coding these privacy policies: we look for the kinds of data companies collect and we try to articulate that data in one of 17 different buckets, which are the 17 types of data that different laws stipulate exist. Then we put that into this hopefully nice and easy-to-interpret interface in the Permission Slip app. So for a company like McDonald's, you can see what identifiers they collect about you, what account information they collect about you, etc. This is something that our team has been doing manually for many months; we've hired people who are pursuing their PhDs in usable privacy to help us interpret these privacy policies on behalf of consumers. But what we found, especially as ChatGPT was announced and there was all of this interest in large language models and how they might be applied, is that we had a really clear-cut use case for ways that a large language model could help us take in large numbers of privacy policies, interpret them consistently, and then show information to consumers in a way that helps them make an informed choice. So I'll show you all what this has looked like, and then Dan will do a live demo after we show you some of the pre-recorded behavior of this research app, which Dan has primarily developed in partnership with my team that runs the Permission Slip app. We had 24 steps, or 24 things we would look at, when we looked at a company's privacy policy and tried to interpret it for consumers. And what's been cool is that we've been able to use large language models to automate all 24 steps of interpreting a privacy policy and then spitting out a distilled version of it that a consumer can use.
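As an aside for readers who want to see the shape of what Ginny is describing, here is a minimal sketch, in TypeScript, of what one "coded" policy record might look like once the 17-bucket review is done. The field names and category labels are hypothetical placeholders, not Consumer Reports' actual schema.

```typescript
// A minimal sketch (not CR's actual schema) of a "coded" privacy policy record.
// Category labels here are hypothetical; the real app uses 17 categories drawn
// from state privacy laws.

interface CodedDataCategory {
  category: string;    // e.g. "Identifiers" (hypothetical label)
  collected: boolean;  // does the policy say this kind of data is collected?
  sourceText: string;  // the policy passage the answer is based on
}

interface CodedPolicy {
  company: string;
  policyUrl: string;
  lastUpdated?: string; // the policy's effective date, if it states one
  categories: CodedDataCategory[];
}

// The distilled output a consumer-facing card could be built from.
const example: CodedPolicy = {
  company: "McDonald's",
  policyUrl: "https://example.com/privacy", // placeholder URL
  categories: [
    {
      category: "Identifiers",
      collected: true,
      sourceText: "We collect your name and email address when you create an account.",
    },
  ],
};

console.log(JSON.stringify(example, null, 2));
```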
So I'll just show a pre-recorded demo and then hand it over to Dan to show you, a little more in depth, how the research app works. Let's say we want to add a new company. This is a company that we don't have in Permission Slip right now, but it's a company that we think consumers will want to manage their data with. In this example we have Tinder; the dating app has a lot of sensitive data about people. What you're seeing here is that we're able to get a description of Tinder, we're able to quickly fetch a logo, and then we're able to put in a URL for Tinder's privacy policy and add that document to our database. Then we go to the section of the privacy policy that is about the information Tinder is collecting, and we grab that section and use it as the context. Then Dan presses this "create company" button in the demo. It takes a few minutes, so I'll move on to the next video to show you all what happens when this data is generated. What you see here is what we've been able to glean from the privacy policy using large language models: we've been able to answer lots of different questions about the privacy policy, and we even provide the source text for the parts of the privacy policy that answer the questions. All of this information is then fed into the reasoning behind how we advise the consumer on how to manage their data. And what you see here is an experimental branch of the Permission Slip app, and this is an experimental Tinder card that we just generated in this demo. You're able to see the 17 categories of data, and you're able to see some advice on what you might consider if you're going to tell Tinder to stop selling your data or tell Tinder to delete it. So I'll stop sharing my screen there and hand it over to Dan to give you all a more in-depth summary of the process we went through and how this all works. But before handing over to Dan: any questions on the problem we're trying to solve, or the general approach, before we dive into some of the specifics?

Firstly, let the record reflect that in chat one of the completely random, unbidden comments was simply one word: "Wow." And I concur. Having said that, it does raise so many questions, and we've got a great group here, so if anyone has questions, feel free to pipe up at this point. We can also hold them until after Dan's part. Ginny gave a really great overview, and please, if anyone has any questions and wants to have a discussion of this, go ahead.

I'll go a little bit more deeply into some of the specific questions we were trying to answer, maybe some of the problems we ran into, and some of the work we're thinking about going into the future. But yeah, please feel free to ask questions along the way.

I think I'll just get us started. In the back of my mind, I'm almost always asking the same question when I see this type of application, which is: what actually was the prompt? And were you, if it was OpenAI, using a system prompt in conjunction with specific prompts? And then the other half of that question about prompts is the output: how did you control the output to basically regularize its shape and format so that it would fit into this very structured output that you need, so that it doesn't start adding homespun things like "hey, thanks, that's a great question,"
"here are my answers," or other extraneous verbiage that we don't want.

Yeah, exactly. And here, give me a second, I'll bring up some of the prompts so you can check them out, and yes, you're right. In my experience working with LLMs generally, and working with frameworks built on LLMs, just diving down until you can figure out what the prompt is and what it's actually saying is the best way to step back and see what the system is trying to do. So I love that question, Dazza. Just give me one second here and I'll see if I can bring up a good example of one.

And while you're doing that, just for those who are here who may not be completely steeped in the technology or what we're even talking about: one of the remarkable things about this generative AI (the applications you would have heard of, ChatGPT being the big one, though there are quite a few) is that it's natural language. What that means is that the prompt isn't a SQL query or a Boolean or something like that. It's almost conversational in the way it works, and that provides all kinds of interesting opportunities to apply it to legal use cases, where it could hardly be more of a word- and language-based field. So, with that little patter, back to you, Dan.

Great, and thank you. It turns out that I can't bring that up; I have a newer version of this app that goes into a little more detail about what these prompts look like, so instead I'm going to show you some code and we'll go through it here. One second. Let's see. Can you all see that now?

Looks good.

Great. So a lot of this logic: the goal of this was to try to get these things to export JSON. There are a lot of ways to do that; even in the releases OpenAI announced yesterday, you can just specify JSON coming out of this stuff. But when I first started digging into this, and this is still where it exists in our app, there are a bunch of ways you can do it, with a JSON schema and other approaches. I saw one example of someone using TypeScript schemas to identify the things you want to get out of it, and then using comments in the code to specify more precisely what you want out of those definitions, and this app is basically based on that. Like I said, you can do this with JSON schema and other ways of defining the structure of the output, but this is an example of a query where we're trying to get all of the data rights methods out of a privacy policy. Those data rights methods are how a consumer, or an authorized agent, can exercise their right to, you know, opt out of sale, exercise their right to deletion, and so on. So this is the prompt that goes in and does it, and you can see it's a regular sort of system prompt instruction. It sets up this schema, and the schema says to go through and find anything relevant. We use a RAG approach: basically, we load these privacy policies into a vector database, we use a context query to find the relevant sections of the document, and we use these prompts to answer these questions based on those documents. And so this one is complicated; what we're trying to figure out is an array of submission methods.
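For readers who want to see the pattern Dan is describing, here is a minimal sketch of the "TypeScript schema in the prompt" approach, assuming the OpenAI Node SDK and an OPENAI_API_KEY in the environment. The SubmissionMethod fields, model name, and prompt wording are illustrative assumptions, not the actual Permission Slip prompts; the reasoning and sourceTextIds fields echo techniques Dan mentions a little later.

```typescript
// A sketch of extracting structured "submission methods" from a policy excerpt by
// putting a commented TypeScript interface into the prompt and asking for JSON back.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The schema is handed to the model as plain text; the comments tell it what each
// field means. All field names here are illustrative.
const schema = `
interface SubmissionMethod {
  actionType: "phone" | "email" | "webform" | "postal"; // how the request is submitted
  value: string;            // e.g. the phone number, email address, or form URL
  rightsCovered: string[];  // e.g. ["opt out of sale", "deletion"]
  provider?: string;        // privacy infrastructure vendor, e.g. "OneTrust", if named
  reasoning: string;        // the model's explanation, useful for human reviewers
  sourceTextIds: number[];  // IDs of the policy passages the answer is based on
}
`;

async function extractSubmissionMethods(policyExcerpt: string) {
  const response = await client.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You extract structured data from privacy policies. Respond ONLY with a " +
          "JSON array whose items match this TypeScript schema:\n" + schema,
      },
      { role: "user", content: policyExcerpt },
    ],
  });
  // The reply should be bare JSON; production code would validate and repair it.
  return JSON.parse(response.choices[0].message.content ?? "[]");
}

extractSubmissionMethods(
  "To opt out of the sale of your data, call 1-800-555-0100 or visit our webform..."
).then(console.log);
```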
And so what we're returning is basically a list, and each item is: what is the method, what is the type of method, how can it be used. So, for example, the action type would be phone, and here's the phone number. And here is, let's see, here is the privacy infrastructure provider, if it's something like OneTrust or the like. All of this information actually helps on our back end when we're setting up companies to do these requests. It's a really good first pass. None of this stuff is ever 100% correct; it always comes out somewhere around 90% correct. But it's a really good first pass at doing this research, so we go in and pull out this information, and our researchers use this specific query to go and start verifying this information and setting up how we're going to do these requests.

Walking through one more, just a simple one. Let's see. This one here is very similar, but what we're asking the AI to do here: as part of this we pull down a Wikipedia extract, basically the first paragraph of a Wikipedia article, and then we ask it to answer a few different questions from that Wikipedia article, based on this schema here. And one of the things I want to highlight is something we found interesting for validating (I'm not sure if it actually improves the results or not), which is that we ask it to provide its reasoning along the way. When you're going through and validating these things, and for everything that we put in the app there's a human in the loop validating the outputs, having this reasoning here means you can pick up really easily where an LLM is failing, just by seeing how wrong its reasoning is. So that's an interesting thing to add to this.

So let me share the app real quick again. Okay. We'll go into a little more detail about what these different things look like and what some of the output is. These are the documents that we add in; when we pull them back we count the tokens in the documents, and we try to parse out the date the document was last updated, its effective date. Then the overview: we ask a bunch of questions on that. We try to determine whether or not the company is a data broker, given basically a bunch of CSV downloads from the California and Vermont data broker registries. This is an output of what those data categories are, and you can see this is all the data that the company collects, or what they said they collect, in the privacy policy.

And that score I see on the right: it seems to go from one to 10 or something?

That's actually, because we were initially using models with small context windows, the IDs of the source text, which is down here, where that is pulled out. That was just the way we went about it: instead of it repeating the source text for each one of these things and filling up the context window, we mapped it to a set of IDs and then had it generate just one version of each source text.

Got it. You're saying so much in every sentence here, but let me go back one more. Did I hear you correctly that you pull, basically, structured data on every company that's listed in the California registry of data brokers,
and then at some point in here you're doing a match to see whether the company whose privacy policy you're looking at is also listed in the data broker registry, and you're doing that through an LLM, like through a prompt?

Yeah. And whether or not to do that, it's easily something that could be done with a fuzzy search too; it's super easy. But the idea was that we'd have an LLM do it. Honestly, I think a fuzzy search is probably better at doing that, but I was curious to see how well an LLM would do, so we added that in.

What's the answer? How was the performance when you spot-checked it?

There are some things it gets wrong. It will say that a company is a data broker when it isn't. Like Zoom, what we're on right now: there is a company called, I think, ZoomInfo in the data broker registry, and when zoom.us is put in, it goes, "Oh, ZoomInfo, this must be a data broker." So yeah, I would say, like all this stuff, it's 90 to 95% great, but you've got to watch out for things like that.

Very informative, thank you. And sorry, I didn't mean to break your flow, but I just want to make sure we're hitting some of the tops of the waves as you cover this.

Yeah. So this is that first prompt I showed you, and this is the output of it. You can see the different data rights; we ask, do they explicitly say who can use them, etc. We'll try to find some more interesting ones: third-party sharing and selling. We're basically asking: do they say that they sell data, do they say that they share it, do they say that they sell it specifically as defined by the CCPA, who do they share and sell it to, who do they buy data from, who do they collect data from. This is relevant to seeing whether they even have a user's data for us. This blurb: we fine-tuned the model that generates these little blurbs about a company given a Wikipedia extract. And down at the bottom, these are the final output sentences that we use in the app.

Then, at the end, and again, this summer we went through a process of figuring out a way to validate this information, so we created a framework to validate all of these questions and sub-questions in here, and created (this is a pretty hacky interface) what we use to validate things. We have a set of validation instructions that were generated by a really brilliant fellow who was with us this summer, and we would go through and manually validate these things based on this rubric on the right and basically mark each one as good, wrong, or could be better. So in our database we have the value, which is the answer, and then we have the edited value for each one of these. If they're wrong, we fix them, and with the fixed answers, hopefully somewhere down the road we could use those as a way to fine-tune a better model to do these things in the future. So the validation process is not just about validation but also about creating a really good gold data set. So with that, please let me know if anyone has questions.
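As an aside, here is a minimal sketch of the non-LLM alternative Dan mentions for the registry check: fuzzy-matching a company name against the downloaded data broker registries. The registry entries, the token heuristic, and the helper names are illustrative, not Consumer Reports' code; the point is that comparing normalized tokens exactly avoids the Zoom versus ZoomInfo false positive.

```typescript
// A sketch of checking a company against a data broker registry with simple
// normalized-token matching instead of an LLM call. Registry entries are placeholders
// standing in for the California and Vermont CSV downloads.

const registry = ["ZoomInfo Technologies LLC", "Acme Data Partners Inc."];

// Strip casing, common web suffixes, punctuation, and legal suffixes so that
// "Zoom.us" and "ZoomInfo Technologies LLC" reduce to comparable tokens.
function normalize(name: string): string[] {
  return name
    .toLowerCase()
    .replace(/\.(com|us|io|net)\b/g, "")
    .replace(/[^a-z0-9 ]/g, " ")
    .split(/\s+/)
    .filter((t) => t && !["inc", "llc", "corp", "technologies"].includes(t));
}

// Require every token of the shorter name to appear, exactly, in the longer one.
function looksLikeSameCompany(a: string, b: string): boolean {
  const ta = normalize(a);
  const tb = normalize(b);
  const [short, long] = ta.length <= tb.length ? [ta, tb] : [tb, ta];
  return short.length > 0 && short.every((t) => long.includes(t));
}

function isListedBroker(company: string): boolean {
  return registry.some((entry) => looksLikeSameCompany(company, entry));
}

console.log(isListedBroker("Zoom.us"));   // false: "zoom" is not the token "zoominfo"
console.log(isListedBroker("ZoomInfo"));  // true
```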
Outstanding. Dr. Lance Eliot, you're on active alert that I may call upon you, just so you know; I see you're in the audience. One quick thing I just have to say is that this is great work, and thank you for taking the time to present it. Everybody can see how you're coming at this. It's really a time of true innovation and creativity, as people see these powerful tools and try to think, "How could this apply to something I'm actually trying to do?" And you've really taken it all the way, and it's well documented and clear.

One basic question I have is: have you been able to get your head around the average time it would take to process these policies manually? Number one, is it a comparable amount of time, or, as I would imagine, significantly less time for similar information? And number two, what about the quality? I have been in the position of looking at terms and conditions and policies, hundreds and thousands at a time, for various projects, and I know my quality starts to dip a little after the 10th and 50th and 1,000th one of those. So how is the quality compared to people? On time and quality, LLM versus people for this task, what's your assessment?

On time, this is way faster. I think we calculated it was something like 1,500% faster. The first 80 companies we put into the app took months of researching privacy policies, and in that process we had a brilliant researcher who created basically a structure and protocol for doing this research, and that really helped; that protocol became the way we talk to the LLM and get it to go answer these things. It's way faster. We could run hundreds of companies through this thing in a day, and then we have to go through and validate them, right? So regardless, it's way faster. Quality-wise, like I said, it's still in that 90 to 95% range, and things are wrong and you have to go find them and fix them, but it's still way faster. And to your point, yeah: even with our own researchers, when we went through at the very end of this, as part of the validation process, and hand-coded about 20 of these companies to use as a gut check, even in our hand-coding process we missed a ton of stuff. That was illuminating. And also, hand-coding these things is terrible; it's about the least fun thing in the world to do. So, yeah, hope that answers it.

It does, thank you so much. We've got a few questions in chat and through DMs here, but we can get to at least one more before we need to go to the next segment, and it's from Nolan. The question is: do you think generative-AI-based legal classification and extraction will make the current generation of legal due diligence and document discovery AI models obsolete?

Yeah, I don't know; that might be a question more for you. My sense of this project is that it was a directed project: the goal was to produce the content we needed in order to add infinitely more companies into this app. My general sense of working with LLMs is that they're great at transforming information. So I'll leave it at that, and maybe you have answers to that.

I have one perspective, which is: obsolete's a big word.
So no, not obsolete. But what I'm already seeing (and I was just involved with a friend of mine who is general counsel of a company being acquired, so there was a lot of due diligence) is that the current tool set still does some things that are non-overlapping, especially when you're going through financials. Due diligence and document discovery is kind of a big umbrella; there are a lot of different things you're doing based on the facet of the task, some of which don't overlap with LLMs, some of which LLMs seem to be comparable or better at, and, really interestingly, there's a whole class of things that these LLMs can do that we could never do before at all. So I imagine we're going to see a period of co-evolution, you know, like humans existing alongside Neanderthals for a long time, a kind of period of dual evolution. And there are going to be some things that other technologies continue to do that, at least as long as we have the transformer model and the probabilistic approach of LLMs, this technology just doesn't do. But overall, I tend to see a lot of promise here; it just keeps over-delivering on things that I think are interesting and good. And, very relevantly, I just can't stress enough what Dan said several times, which is that it is not perfect. It has some flaws, some limits, some biases. So don't worry, y'all, there's a continuing role for humans who have judgment and can look over the outputs and be the final say.

Oh, and it looks like Lance is here, so can I just do one quick deputization, and then we'll move to the next segment. We're actually joined by another colleague, Lance Eliot, who is a real expert on the advent of this technology in the legal field. Lance, I just want to give you a chance to come off mute, since you're in the audience; I want to deputize you for a hot second to see if you have any perspectives on that question.

Sure, can you hear me?

You sound good.

Oh, great. So, first of all, I apologize, I was on another Zoom call that finally ended. Secondly, it sounded like at the tail end there, there was a bit of a discussion (and stop me right away if I'm off target) about the idea that large language models, machine learning, and deep learning are all focused on what in the AI field are called sub-symbolic approaches: you look at data, you find patterns in the data, and based on those patterns you try to make predictions. That's really what generative AI is all about. Now, you might also know there was an earlier era of the AI field, of expert systems and knowledge-based systems, where you explicitly wrote out rules that you would then have the computer system execute. It almost sounded like at the tail end, and I might be off target, there was a bit of a discussion about where we're headed with generative AI and LLMs, and what that future looks like. If that is the question, then my answer is that it's what many refer to as neuro-symbolic AI. The idea is that you combine the sub-symbolic pattern matching with the rules-based approach, together in combination. Even Sam Altman, the CEO of OpenAI, has come out and said that LLMs and machine learning as we conceive of them today can only go so far.
Something else has to break through, because so far we've gotten away with just building larger and larger models and using greater and greater computational processing to get where we are now. The belief is that, yes, that will continue to take us forward, but if we really want to make it all the way to an AGI, artificial general intelligence, kind of capacity, we have to come up with something else, and some believe that something else is this neuro-symbolic AI. So, Dazza, I'll just pause there. I may be off target from what the group was talking about, but maybe that is nonetheless of interest to the group.

That was of interest, thank you, and it was not overlapping with what I had said, either. I had sort of assumed we keep the same kind of LLMs, or more powerful versions of the same kind, in saying it won't be a complete obsoleting of the prior tech for due diligence and discovery. But that's a really good point: one thing we can count on is change. I don't think this is the ceiling, and there will be further evolution of the technology. In addition to the symbolic-plus-generative-AI mash-up, I think people at Microsoft and other places are looking at planning and other modalities that they can start to mush together, all of which could change the whole playing field again. So thank you, Lance, for letting me tap you.

We've got one more comment from Brian, and then Kara and Rich, you're on deck, so warm up your engines and we'll go to you as soon as Brian has made his contribution.

I'll keep this short, but to what Dazza was saying: obsolete is a huge word and a huge concept, and I don't think traditional due diligence will become obsolete. It's always been in a pattern of continually evolving with the technology that's used to do it, and I think that trend will continue. An example that I think demonstrates how this could happen is some of these note-taking apps that use generative AI. I know a lot of people here probably use them: the app makes a recording and a transcript and lets you query the notes from what was said in the call. One of the things I've started doing personally in these calls is signaling, like, "so the action items from this call are this." You start learning to use the tool in a way that allows you to leverage large language models and generative AI to the fullest extent they can be leveraged, and I think that's the area where due diligence will transform. People are going to find the sweet spots of using LLMs, generative AI, and all of these cool new applications in ways that reduce the total amount of time it takes, increase the effectiveness of the due diligence process, drive the operational expenses of that process down, and allow us to get through due diligence quicker with less of a headache. So that was all I had.

Awesome, thank you. And incidentally, since I'm saying all these names, I should have mentioned: hi, I'm Dazza Greenwood. I head up law.mit.edu, of which this Idea Flow is one little thread, and Brian, who just spoke, Brian Wilson, is editor in chief of the MIT Computational Law Report, which is the flagship publication and public-facing facet of law.mit.edu. So there you go, and thanks, Brian. Now, let's move forward to the next course in this meal.
We have with us the co-founders of a new initiative called descrybe.ai, who are doing some, I think, really impressive work applying generative AI to case law. And so, with that, Kara and Rich, I'm just going to ask if you could introduce yourselves and tell us more about how you're applying generative AI to case law: what does descrybe do, and in particular, what can we do now that we could never do before? And then we'll go right to discussion.

Excellent. Thank you, Dazza, and thank you so much for having us here today. We're very excited to be joining you from Newton, Massachusetts, which is right outside of Boston, so certainly close to MIT. I'm Kara Peterson, she/her, and I am the co-founder of descrybe.ai, as you mentioned, and I am not the technologist. That would be Rich, my husband. I am a marketing and biz dev person, so I think for this group Rich will have the more interesting things to say, and I will hand it off to him.

Thank you. I come from a traditional computer science and software engineering background in the corporate world, so I was able to bring some of that experience in architecture, computer architecture and everything else, over to this field. I think the best way to talk about this is to show you a quick demo, because it's the type of product where, once you see it, that's really the magic of it. So let me do this. Okay, you've got it.

Basically, what we've done is let you search for any type of legal concept or term or case facts, whatever you want, in here. I'll just do a quick one so you can see it; I was making things up earlier, like "landlord invasion," so I'll do "child was injured at a playground." This is good because, if you're a lawyer, you can search for specific legal terms that are relevant to what you want to look for, and if you're not a lawyer, you can search for something simple like this and find cases that might be relevant to you, so that you can learn more about the law. You look through the results, and you can narrow by state here. But the way it works is, we've read in all of the original legal opinions, split them up, summarized them, and made them searchable. So, for example, here's a case: "this opinion describes a case involving a child who was injured while playing on a playground." Good. Well, not good for the kid, but good for our search. If I click through, and we'll work a little backwards here, here's the original opinion, so you can always refer back to it. These opinions are super long, as you all know; there's one I ran into that initially crashed the system because it was something like 977 pages. So there are all kinds of opinions out there. What this has done is broken the opinions up and then summarized the pieces, which makes the search more precise. For example, in this case you can see Clyde Wright brought a claim in negligence; then the second part of the opinion describes a child named Jordan who was injured while hanging on the roller gate. This is another part. And you can see this fourth one down is actually the one it found as a match: it describes a case involving a child who was injured while playing on a playground. So, by breaking the opinions up; I mean, it was done out of necessity, because the language models at the beginning of the year, when we started coding, could only handle small context windows.
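For readers curious about the mechanics Rich is describing, here is a minimal sketch of the split-then-summarize-then-index idea: break a long opinion into chunks that fit a small context window, summarize each chunk once, and store the summaries with embeddings so the search never has to summarize on the fly. It assumes the OpenAI Node SDK; the chunk size, model names, and function names are illustrative guesses, not descrybe.ai's actual pipeline.

```typescript
// A sketch of pre-processing one long opinion: chunk, summarize each chunk,
// and embed the summaries so they can be stored and searched later.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Naive fixed-size chunking; a real pipeline would split on section boundaries.
function chunkOpinion(text: string, maxChars = 3000): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

async function summarizeChunk(chunk: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content: "Summarize this portion of a legal opinion in two plain-English sentences.",
      },
      { role: "user", content: chunk },
    ],
  });
  return res.choices[0].message.content ?? "";
}

// Run once per opinion; since cases don't change, the summaries and vectors can be
// written to the database and reused for every search.
async function indexOpinion(opinionText: string) {
  const summaries = await Promise.all(chunkOpinion(opinionText).map(summarizeChunk));
  const embeddings = await client.embeddings.create({
    model: "text-embedding-ada-002",
    input: summaries,
  });
  return summaries.map((summary, i) => ({
    summary,
    vector: embeddings.data[i].embedding,
  }));
}

indexOpinion("...full text of a long opinion...").then((rows) => console.log(rows.length));
```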
And now that they're able to be bigger, I still think this was the best approach for what we're doing, because the search engine searches every single one of these pieces and then, in the results, shows you the one that matched the best. So that's one part of it, and we're working on other things too. For example, if I click on this one: we're doing something like what they mentioned at Consumer Reports, on a different scale, but we're actually deep-reading into the opinions and having it tell us, here are the parties, here's the introduction, background, procedural history, issues presented, analysis, holding, and conclusion. So you can go right in here, if you're an attorney or someone else interested in the law, and see exactly what this opinion was, and then down below you still get the summaries and the original. It's a good way to get a quick glance, and this was all AI-generated; as Dan could attest, it's a big iterative process figuring out these prompts to get this stuff out of there. We're going through and making these precise summaries for every single opinion. They're not generated on the fly; as you use the search engine you'll see everything's instant, because the cases don't change, so there's no reason not to summarize them in advance. We just summarize them, keep them in the database, match them up, and everything's super fast; you don't have to wait 20 seconds for it to come back and summarize.

So, there you go, that's the quick demo of it.

Amazing. A couple of things. First of all, thank you for taking the time to walk us through it; I agree, nothing's better than a demo. Could you speak a little bit about this: part of the reason I wanted to bring you forward to show this, so people could know about it and we could discuss it, was to demonstrate what's different about generative AI when it comes to searching for case law. One of the things that to me is kind of profoundly different, and almost mind-blowing in its potential, is that we're not searching for words or phrases anymore; in the high-dimensional vector space that this data has been atomized into, and through the models, we can see, as it were, the concepts or ideas behind the words. That's what happens with the cosine similarity and the semantic searches that are going on. And if I understand correctly, that match score (like, in this case, for Real Linda Unified School District versus Superior Court, we've got a 0.89 out of 1, which is pretty high when I do these things): could you speak in a little more detail about what that score is and what it means? It isn't the same as saying "we have three matches"; it's not a yes or no. It's a score that relates to relevance, of a new type. What is the score, and what does it mean?

Sure. So people on the call may have heard of vector databases, or seen them in an article or something they've read. The way they work, and we don't have to get too deep into it, is, as I said, they store these numbers that represent concepts and language. So if I had stored in that database something like "the dog was sleeping in his doghouse," and then I typed in "the dog was sleeping in his doghouse," that would give a 100% match. If I said, "the cat was sleeping on the cat bed,"
that's similar: it knows it's an animal, it knows it's sleeping, it matches it up, so that would give maybe a 0.90. In this particular case, we're going to have a hard time getting over, say, a 0.90; 0.89 is actually quite high, because we're searching these entire paragraphs, and unless you type in this exact paragraph almost word for word, you're not going to get a 100% match. But it finds the relevance because the paragraph talks about it: there's a child, there's a playground, there's an injury. The more you put in, or the less you put in, the more it can match up to what's here, and that will make the score a little higher. Now, in this particular case, when I click through each one of these: this first one doesn't mention a playground at all, I don't think, just glancing at it, so it probably would be a lower score. This one looks like a slide onto concrete, so that might be a middle score. And this one is the actual one it found. So the system grades each one, and what I've done is just return the highest-matching one without repeating; you don't want to send back, like, five from the same case, so it pulls out the highest one that matches, based on that vector matching score. Does that answer your question?

Amazing. And just to drill down a little bit more on this.

Sure.

Let's say you have the word "child." We see the word "child" here, so in some ways it's not a great example of what I was trying to get at, but let's say you're just some person, you don't really know the cases, or maybe the word "child" has resonance and is used in specific statutes and regs and case law, and you're using other words to talk about it, like "young people," or "my children," or "minors," or maybe a different facet of it, like people who have extra rights under the law. It can start to see, across these years of cases, the underlying meaning, and it's assessing relevance based on that. FYI, everybody: this is huge. We could never see the shape of the law before; we could only do very brittle, literal word-and-phrase searches and, you know, regex kinds of things. This is pretty important, because it might tell us, for the first time, what the law actually is: what the underlying rules are, and how they could apply to my unique circumstances, even if I'm using different words or the facts and circumstances are playing out a little differently. It comes down to relevance, and relevance is a deep, deep rabbit hole, and the kind of semantic inference engines available through this technology can help us begin to plumb the depths of these quintessential concepts for law.

Dazza, do you want to, on our pre-call you mentioned the thing about Wyoming, do you want to mention that, where you were using it and what you found? That's partly why you're here today.

So I was helping the state of Wyoming, and that's a good anecdote, which was looking at some legislation to advance its policy agenda of giving individuals and companies more rights over their digital information. In a sense, this is compatible with the Consumer Reports initiative we just heard about: giving people control over their digital information.
But one of the things that's different is that Consumer Reports was working from the California Consumer Privacy Act. What they were trying to do in Wyoming was look more generally at what the relationship is between people and our data, and they were following a state-law framework of property. They're saying, among other things, it's also property: a kind of personal property, but not physical; it's intangible or digital. When we create it, or when we perfect rights to it, it's still a type of property, and we should be able to have at least basic property rights over it, and property is very much an area of state law. So while this is a somewhat unusual approach, I think logically it applies perfectly. They were looking to clarify some of that; intellectual property has kind of confused it a little bit, so now we have this other statutory framework that names things in certain ways, but that's a little off to the side from the basic concept of property that we've had in law for millennia. And so I wanted to see cases that stood for that principle. I asked some very basic questions; I'd have to go back and find the exact wording, but it was to the effect of: what are the legal principles that stand for the rule that people have property rights over their digital information? And when I asked it that way, it pulled up a bunch of cases I'd never seen before, because I guess I hadn't been searching with the right words. There was this whole rich thread in tax law about intangible property. There was a bunch of stuff about the kind of interest people have in their customer lists and in database information, which uses very different words. There was a lot about people creating digital art, but it wasn't copyright, it was something else. So anyway, it was really interesting that it got underneath the words. And raise your hand if you want to say something; I think I saw a finger. But anyway, I was really impressed that I was able to ask a question one way and it identified a lot of case law that stood for the idea. I've never been able to do that before: all the onus was on me to think of all the right words to use, or I had to hit pay dirt on a law review article by somebody who had spent years tracking down every case. And even then we miss cases. How do we know there's a case out there that would be relevant to what we're trying to do if we don't know what words that case uses, and if somebody hasn't gone through and taxonomized it in a way that relates to what we're searching for? All of that is very brittle and very superficial, and it doesn't let us know what the law is. This technology at least has the prospect of allowing us to unleash and reveal what the law is, and to connect it to the questions we have about how it would apply to given facts and circumstances. So that's my anecdote about how I started using descrybe and was kind of blown away, basically. It really was helpful to the legislature: I sent them the material we turned out, and it helped inform their ideas about how to frame things in some statutory reforms they've been considering. So it's very useful, it's very mind-blowing, and it's very practical. That's my anecdote.

Cool.
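Circling back to the match score Rich explained: here is a minimal sketch of how that kind of score is typically computed, as the cosine similarity between a query's embedding vector and each stored summary's vector. It assumes the OpenAI Node SDK and its ada-002 embedding model; the example sentences simply echo Rich's dog-and-cat illustration, and the exact scores will vary.

```typescript
// A sketch of semantic scoring: embed the query and the stored summaries, then rank
// the summaries by cosine similarity to the query.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function scoreAgainstQuery(query: string, summaries: string[]) {
  const { data } = await client.embeddings.create({
    model: "text-embedding-ada-002",
    input: [query, ...summaries],
  });
  const queryVector = data[0].embedding;
  return summaries
    .map((text, i) => ({ text, score: cosineSimilarity(queryVector, data[i + 1].embedding) }))
    .sort((x, y) => y.score - x.score);
}

// The identical sentence should score at or extremely near 1.0; the cat sentence
// should score high but lower; the playground sentence lower still.
scoreAgainstQuery("the dog was sleeping in his doghouse", [
  "the dog was sleeping in his doghouse",
  "the cat was sleeping on the cat bed",
  "a child was injured on a playground",
]).then(console.log);
```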
Yeah, and along those lines, I know we're wrapping up in a minute: people out there, if you're a pro se litigant or you're just involved in the law for some reason, you don't know all those words either, so you can just type in what happened to you. If you type about how somebody coerced your mother into changing her will, suddenly you'll learn, okay, that's called undue influence; this thing will tell you it's undue influence, and then you know what to research.

So, if people want to play with this: we are out of time, but if we had a little more time I would suggest that everybody go to descrybe.ai, try some searches, try it for yourself, see how it plays out, and then share your results so we can talk about them. I can't do that last part today, but I can do the first part, which is: go and check it out. And to the extent that you're on LinkedIn, go to the MIT Computational Law Report post about today and make comments, and let's continue the conversation that way. Thank you, Rich; thank you, Kara. Everybody, we have some really interesting things coming up. Rachel, if you could go on mute: the 2024 MIT computational law course is coming back at you; it's free and open to all in January, and we're going to go much deeper into many of these topics and give you all an opportunity to use some tools and get even more deeply engaged. And thank you all for joining us for this resurgence of Idea Flow. I know we've taken a long hiatus, but we're back, baby. So thank you all, and we look forward to seeing you next time. Bye bye. Thank you.