Welcome, everybody, to the law.MIT.edu Idea Flow. This session is specifically focused on the emerging art and science of legal prompt engineering: the way that you can get the most out of generative AI by prompting it. If you prompt it in a really good way, you get much better results. And if we prompt in a really good way within the domain of law, we can get even better results for law and legal processes. So with that, I'm going to just make a few initial remarks. Is this screen coming through? Yep, looks great. Great. As I mentioned, this Idea Flow is part of the media arm of the MIT Computational Law Report, and this is our core team, although we're about to make some interesting announcements with some additions to the core team. And I want to thank everybody for keeping law.MIT.edu alive and robust and growing. This session, I think you're going to see, is part of a theme that is going to deepen over 2023 and possibly beyond, and it's the theme of generative AI for law and legal processes. This slide shows that we already started reflecting it in January at the eighth annual MIT Computational Law Workshop, and we're just going deeper and deeper, because the capabilities of this technology, particularly for law and legal processes, seem profound. We're just scratching the surface. Having said that, I hasten to also caution with a few warnings and disclaimers, namely that we must always remember, especially in professional contexts, that this technology is prone to producing false and inaccurate content. It also contains any variety of biases and prejudices baked into the data set and into the way that it's been trained and configured.
In other words, as Sam Altman, the CEO of OpenAI, which brings us ChatGPT, put it: ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness; it's a mistake to be relying on it for anything important right now; it's a preview of progress. Okay, so just keep that in mind as we're exploring what it's good for and how to use it responsibly. Recently, Legaltech Hub released a sort of initial compendium of places where LLMs are being used in law. I mention this by way of setting the table for our special guest Damien, who's coming up in a moment (don't worry, I'm not going to keep blathering forever), but this is partly to answer what should be a question in your mind: why am I here, is this important, does this matter? Well, yeah, it does matter. In addition to our kind of issue-spotting in early January, here we are in mid-to-late February of 2023 and there's already a raft of companies in each of these domains releasing products and services that use and apply large language models in the legal area. And there are many, many more where those came from, I might add. E-discovery was not on the list, but I know that's already happening. And Damien had a good comment on the LinkedIn post where I first saw this, that classification and tagging should probably also be an area, and I know, because we coordinated on this, that we're going to hear more about that in a bit; I can't wait. If you all don't know about SALI, that's probably the second hottest thing in law right now along with generative AI, in my view. So, is prompt engineering really a thing? Well, let's take a look at what's happening out there in the wild.
This is an amazing prompt engineer position that is available at Anthropic; they're the people that bring us Claude, which is a comparable product to ChatGPT. In my view it's actually better than ChatGPT at several things. It's different; ChatGPT has some advantages too. If you look at my recent talk to DC Legal Hackers, I do a head-to-head comparison between ChatGPT and Claude on a long prompt chain related to fiduciary duties. Anyway, OpenAI is not the only game in town. There's Google, there are others, and Anthropic is one; they have a prompt engineer and librarian position. Well, that's interesting. That's a technology company. What about us in the law? Well, Allen & Overy (have you heard of them?), one of the largest law firms in the world, has recently announced a partnership with an OpenAI-funded company that has created a product tailored to the implementation of GPT-3 and beyond for the legal domain, and it's called Harvey. So they're rolling this out to their lawyers, their partners, their associates, their paralegals, their whole empire of legal work, and how they're using it is going to include prompts. Okay, so this will become part of the job description for people in that firm, or already is part of it. Well, what about an individual role of legal prompt engineer? Yeah, that's also starting to happen. There's a job posting that hit about a week ago from, and I don't know how to pronounce this, is it Mishcon? Does anybody know? Okay, I apologize if I didn't say it right, but according to my internet research it's a serious firm. They're in Europe, and they have a role specifically for a GPT legal prompt engineer. It's a really cool position. This is just the tip of the iceberg, so I'd say this is an early indication.
It's validation that this is real and this matters. A month from now, two months from now, a quarter or two from now, this is going to be prevalent in the economy. Anyway, that's just an indication, a proof of reality. The only other thing I'll say (I've already updated the program page for this chat with some links) is that here's the head-to-head comparison between ChatGPT and Claude that you can and should look at. There's also a really nice prompt engineering tutorial, which I encourage you all to look at. Did my screen change when I clicked the link, or is it still on the old slides? No, I'm seeing OpenAI right now. Okay, great. So this is amazing, especially for ChatGPT: a great, authoritative, high-level example of how to do prompt engineering with these large models. It's written for the API specifically, but a lot of this works perfectly well through the chat interface of ChatGPT: be as specific and descriptive as possible, articulate the desired output, start with zero-shot. We can get into more of that a little later, but you'll see it play through beautifully, in my view, in Damien's example. There's a lot of other stuff too. People have already been putting tips out there for how to do legal prompt engineering, so I've linked to some of those I've been following with great interest on LinkedIn. And there are some academic papers; Megan Ma pointed one out to me on teaching a language model to think like a lawyer, and there's some great, very deep stuff there. So, okay, without further ado, it is my great pleasure and honor to introduce our special invited guest, Damien Riehl. He is a big deal in legal tech, in my view; I'm really comfortable just saying he's a big deal. He's done tremendous stuff, not just with his background deep in law as a litigator, but really interesting technical stuff as well.
If we have time, you should tell people about what you did with music melodies, sort of flooding the field algorithmically. He's basically doing jujitsu on intellectual property to prevent others from exclusively owning cool melodies. And he's done a lot of great work now with Fastcase, including standards-setting work with SALI, which is critically needed to make the law solvable as data, which is necessary, and we're way behind in our field. But what really caught my eye not long ago was a really cool legal engineering hack, I would say a legal prompt engineering hack, that Damien did, where he was able to tickle the algorithm in a certain way to get it to create a pretty good brief in a litigation context, a context that he knows all too well. So I wanted to invite you, first off, to share more and maybe walk us through how you did that awesome hack, and also maybe highlight the cool complaint that you started with, because at a meta level it's particularly relevant. And then I know you have some broader views on the meaning of this and how it breaks down for all of us in terms of our roles and the capabilities of the technology; I'd be happy if you could say a few words on that, and then we'll open it up to questions. We're going to ask people to use chat for questions and comments. Damien, you've got it. So, thank you for that very kind intro, I really appreciate it. A bit of my background that might help the talk: I've been a litigator since 2002. I litigated with a large firm, Robins Kaplan, for about a decade. I represented Best Buy in most of their commercial litigation. I sued JPMorgan over the mortgage-backed securities crisis. I represented victims of Bernie Madoff. So I have a deep litigation background, but I've also been a coder since 1985. I'm a horrible coder, as anyone who has worked with me will know, but I know enough about code to make me dangerous.
And that bit of code knowledge led me to pitch Thomson Reuters in 2015. I said, hey, here's some legal tech that can change the practice of law; you should build it and hire me. And they were dumb enough to do that, so I worked for TR for a couple of years with 100 programmers and 50 lawyers building this really big thing. I left that really cool job to do another cool job in cybersecurity, where my biggest engagement was that Facebook hired me and my company to investigate Cambridge Analytica. So I spent a year of my life on Facebook's campus with Facebook's data scientists and my former FBI, CIA, and NSA colleagues, figuring out how bad guys were using Facebook data, Monday through Friday, for about 52 weeks in a row. That was my life for a while. I left that really cool job to join my current really cool job at Fastcase and Docket Alarm, where there are 750 million judicial opinions and lawyer-filed documents that are just waiting to be mined, to be able to say what are the things that matter, and how can I generate new documents based on these old documents. The only reason I left the cool Facebook cybersecurity thing is to do the really cool thing I'm going to be talking to you about today: how can we take this treasure trove of law, which is all language, right? That's all we're doing. And we are having a revolution in large language models that can manipulate that text. So my background as a lawyer for 15-plus years, plus my background as a really crappy coder, is enough to say that maybe prompt engineering, the topic for this session, is the best combination of those skills. If you're like me, a really good lawyer and a crappy coder, prompt engineering is maybe the thing for you.
Because then you're going to be able to do really outsized things that a coder alone would not be able to do, because the coder does not know the legal context in which it's operating. So that's thing number one. Thing number two: while I was on the Facebook engagement, we were having drinks in the hotel lounge, and I said, hey, Noah, you know how you can brute-force a password by going A, B, C? And he said yeah. I said, what if we did that with music? What if we mathematically exhausted every melody that's ever been and every melody that ever can be? And he said, F yeah, let's do that. That night he created a prototype of 3,000 brute-forced melodies. We have now gone from 3,000 to 418 billion, with a B, melodies. Those 418 billion have exhausted every melody that's ever been and every melody that ever can be, mathematically; we've just exhausted the data set and written them all to disk. Once they're written to disk, they're copyrighted automatically. And then I put everything in the public domain under Creative Commons Zero, to keep space open for songwriters so they don't get eaten up by copyright. Call it another hack, in the sense of the law-plus-technology-plus-music hacking that we've been doing. But enough about these other hacks; let's talk about legal prompt engineering hacks, and I'm going to share my screen now to show you. The idea is that I know, as a lawyer, that I want to get things done. As a lawyer in a law firm, I would have to write briefs, or I'd have to assign a younger associate to write briefs. So this exercise is much like what you'd give a first-year associate: I want you to draft the first draft of this thing. I know that you're not very smart, first-year associate, so I'm going to tell you in very simple terms what I want you to do.
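The brute-force melody enumeration Damien describes can be sketched in a few lines. This is a toy illustration only: it assumes 8-note melodies over a single 12-pitch octave, whereas the real project used different note counts and pitch ranges and wrote the results out as files.

```python
import itertools

# Toy parameters: one octave of MIDI pitches, 8 notes per melody.
# These are illustrative; the real project used different ranges and lengths.
PITCHES = tuple(range(60, 72))  # C4..B4 as MIDI note numbers
MELODY_LENGTH = 8

def all_melodies(pitches=PITCHES, length=MELODY_LENGTH):
    """Lazily yield every possible melody: the Cartesian product of
    `pitches` taken `length` times, i.e. a mathematically exhaustive set."""
    return itertools.product(pitches, repeat=length)

# 12 ** 8 = 429,981,696 melodies under these toy parameters.
total = len(PITCHES) ** MELODY_LENGTH

# Peek at the first three melodies in the enumeration.
first_three = list(itertools.islice(all_melodies(), 3))
```

Because `itertools.product` is lazy, the full set is enumerated without ever holding it in memory; writing each tuple to disk as it streams by is what "fixes" it in a tangible medium.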
I'm going to use short sentences, because those are easy to understand, and with short sentences it's less likely that you're going to go off on a tangent, right? Everything you would say to a first-year associate are things you would say to GPT, or to any large language model, as a prompt. This is a LinkedIn article that maybe we can link somewhere at some point so you can read it at your leisure. What I thought about is using the table of contents from a brief. And this isn't just any motion to dismiss; this is actually the motion to dismiss in the GitHub case. You know this lawsuit: a bunch of Does, a bunch of anonymous coders, are suing GitHub over GPT, and specifically Codex, the implementation of GPT for code. In this lawsuit you have the coders saying: hey, GPT and OpenAI, you ingested all the code that I put into GitHub. The code that I put into GitHub was licensed under the MIT license or some other license. So by ingesting all of my code, and by the way everyone else's code on GitHub, you have infringed that license. And now they're fighting over this. I have some strong feelings about this, because they're essentially shoehorning a copyright claim into a licensing claim, which I think is not the right way to go about it. And that's actually the context in which I framed my prompt for large language models. So I took this lawsuit, which is being pulled up right here. Of course, the page isn't loading. Anyway: we as lawyers already do what programmers call parsing, and the things that we parse are called tables of contents. Because we as humans know that if I have a 50-page brief, it's going to be hard for my reader, that is, the judge, to ingest all 50 pages. So I'm going to give them a cheat sheet of all the things that matter.
That cheat sheet is called the table of contents, where I, in very easy terms, provide every single argument I'm going to be making in this brief. That's easy for the judge to grok: okay, the complaint fails to state a cause of action, it also fails to state a state-law cause of action, each of these things. I can then translate that into arguments, and the judge can say, okay, they win on argument A, but they lose on argument B, and I'm not sure about argument C. So this is essentially a way to simplify 25 pages. So I thought: what if we take this human-curated simplification called a table of contents from the motion to dismiss in the GitHub case, and then create a bullet-pointed list of counterarguments to it? Essentially taking this human-curated argument list and making the counterarguments. I said: below is the table of contents from a motion to dismiss in federal court. This context is important; first you want to set the table. This isn't just any text; this is a table of contents from a motion to dismiss in federal court. So now the large language model realizes: okay, now I know the context, that's what this thing is. Please create a bullet-pointed list of counterarguments. Not only did it parse this thing, you can see it also handled items 15, 16, 17, 18. It created this bullet-pointed list. And where the first argument says that plaintiffs lack standing to assert their claims, the output says plaintiffs have standing to assert their claims. So essentially it's flipping each argument to make a counterargument. With one prompt it's done two things: it's created a bullet-pointed list, jettisoning all the other cruft that is in here, and it's also created sub-lists for these things. So now I have bullet points.
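The context-setting move described above (first say what the text is, then say what you want, then paste the text) can be captured in a small template. The wording below is a paraphrase for illustration, not Damien's exact prompt:

```python
def counterargument_prompt(table_of_contents: str) -> str:
    """Build a prompt that sets the table before asking for work:
    identify the document type, state the task, then supply the text."""
    return (
        "Below is the table of contents from a motion to dismiss "
        "in federal court.\n"
        "Please create a bullet-pointed list of counterarguments "
        "to each argument it contains.\n\n"
        + table_of_contents
    )

toc = "I. Plaintiffs lack standing to assert their claims.\n"
prompt = counterargument_prompt(toc)
```

The ordering matters: telling the model what kind of document follows, before the document itself, is the "set the table" step.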
Next is: okay, now I want to say what the elements of each claim are. Because, as a lawyer, I know, and many of you know as lawyers, that you have, say, breach of contract, which is what they're alleging about the license. What are the elements of breach of contract? Element one is that there is a contract. Element two is that there's a breach. Element three is that there are damages as a result of that breach. Those are the elements of the claim. So my prompt is: for each bullet point above, provide sub-bullets that give the elements of each claim. So here is the bullet saying the plaintiffs have standing to assert their claims. What are the elements? You have to prove that they suffered an injury in fact (accurate), that is traceable to the defendant's conduct (accurate), and that a favorable court decision is likely to redress the injury (accurate), right? It's accurate across the 50 states; there might be a jurisdiction or two with an additional element, or maybe a jurisdiction that doesn't require element two, but pretty good, right? A pretty good head start. And it did that for each one of these. They have, you know, shown good cause for anonymity, such as fear of retaliation; that was not anywhere in my prompt. And the anonymity will not prejudice the defendants. Good start, right? So now I have a bullet list, and then sub-bullets with each of the elements. Now, okay, cool, I want to flesh out facts that would prove this. So as a prompt I said: okay, now for each level-2 sub-bullet, which GPT knew what that was, for some reason, that is called an element, please provide level-3 sub-sub-bullets. I figured, just in case it doesn't know what level 2 is, I'm going to say level 3, and just in case it doesn't know what a sub-bullet is, I'm going to say sub-sub-bullet.
And then it says: now provide examples of what could potentially be relevant facts which show that the plaintiff satisfied each element. You can see that this is actually a later prompt, because the first version included facts involving a lot of medical injury. So I said: okay, now exclude facts that involve medical injury; instead, focus on facts related to commercial injuries and contractual injuries, which is what these plaintiffs are actually alleging. So again: would a coder have known to exclude those things? They would if they had a legal background, right? Because I'm a lawyer, I'm able to make a better prompt. So it says, okay, here's injury in fact: I have shown standing through an injury in fact; I have suffered economic harm as a result of the defendant's actions; I've lost revenue; I've incurred costs. And then, two, causation: the defendant's conduct is a direct cause of my injuries; but for it (and "but for" is a very legal term) I would not have suffered harm. And then here's redressability: if you rule in my favor, it's going to redress my injuries; monetary damages would compensate me; et cetera. We could go on. But now you could say: okay, now that I have the claims, the counterarguments, the elements of each claim, and examples of facts that could prove them out, I want to gather and flesh out those relevant facts from the real world. So, one quick thing, because you did it for the previous prompt results: would you agree, as an experienced litigator, that these are good examples of facts that would tend to prove the elements needed? Oh, 100%. Yeah, for each one of these sub-bullets I would ask my client, what kind of economic harm have you suffered, and then I would get an affidavit from them, or I'd get deposition testimony from them, right? And that's how we build cases.
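The prompts above form a chain in which each step builds on the model's previous answer. With a chat-style API, that means resending the growing conversation each turn. Here is a minimal sketch of that chaining pattern, with the model call abstracted into a `complete` callable (so the logic is independent of any particular SDK, and runnable without an API key) and the prompt wording paraphrased:

```python
def run_chain(complete, prompts):
    """Run a sequence of dependent prompts, feeding the full conversation
    back in on each turn so later prompts can refer to earlier outputs.

    `complete` is any callable taking a list of {"role", "content"} messages
    and returning the assistant's reply as a string; in practice it would
    wrap a chat-completion API call.
    """
    messages = []
    replies = []
    for prompt in prompts:
        messages.append({"role": "user", "content": prompt})
        reply = complete(messages)
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies

# The four levels walked through above, paraphrased:
chain = [
    "Below is the table of contents from a motion to dismiss in federal "
    "court. Please create a bullet-pointed list of counterarguments.",
    "For each bullet point above, provide sub-bullets with the elements "
    "of each claim.",
    "For each level-2 sub-bullet (each element), provide level-3 "
    "sub-sub-bullets with examples of potentially relevant facts that "
    "would satisfy that element. Exclude facts involving medical injury; "
    "focus on commercial and contractual injuries.",
    "For the first fact example, provide factual examples of how training "
    "a large language model on a text could cause that text's author to "
    "lose money.",
]

# Stand-in model that just reports how much context it was given.
replies = run_chain(lambda msgs: f"(answer given {len(msgs)} prior messages)", chain)
```

Each later prompt ("for each bullet point above...") only works because the earlier exchanges are resent as context.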
So when you think about whether these things are going to eat lawyers' jobs: well, let's think about how I would have done this as a litigator. I would have thought in my brain about all of these things, made a list, and then asked my client to work through that list; if I'd looked through all the cases, it would have taken me maybe an hour or two to think through it. This did it in exactly 20 seconds, and it gave me a head start. I might say, okay, maybe it missed a few, but I'm not starting from whole cloth; I'm just adding to and augmenting what the machine does. This is the idea of centaur lawyering: I'm a centaur, faster, better, stronger. And could a first-year associate have done this? Probably not. So if I'm a senior partner charging $1,000 an hour, do I assign this to a junior associate who would have to turn it around in two or three days? Or do I spend the 20 seconds using GPT to get something as good as a junior associate would produce, and then augment it and get it out the door? So did that cost anybody their job? Well, as a senior partner I'm not using that junior associate, right? So it's going to cost that work if I choose to do this. Similarly, if I'm a client, say a corporate client: do I call outside counsel, who will charge me $700 an hour, have to find a junior associate, and take a week or so? Or do I just throw this into GPT and try to get a fast answer and get out the door? The real question isn't whether it's better than humans; it's whether it's perceived as better, faster, stronger. And even if it's not perceived that way, I just put out another thing on LinkedIn about this: there is efficacy, that is, how effective the thing is, and then there is cost. So think of cost as an x-axis and efficacy as a y-axis.
If humans are really effective but expensive, in money and in time, then maybe I as in-house counsel want something faster. Even if it's less efficacious, if it's cheaper in cost and cheaper in time, maybe I go with something a little less effective in favor of expediency. That's all I'm saying, to Dazza's short question. Yeah, I think this is a lot of what I would be doing as a lawyer anyway, and it's just giving me a faster, better, stronger start. Perfect. Thanks. So, you mentioned for the first couple of prompts that you thought the output was pretty good, and I just wanted to double-check that it was still holding up. Okay, now let's go forward. What did you do next? So next you say: okay, for any one factual claim, for instance that OpenAI's actions were the direct cause of the injuries, provide factual examples of how a large language model training on a text will cause the author of that text to lose money. So here I'm essentially saying: here's the plaintiff's case; for each claim, give me examples of how it is proved out. And it says, here's example one: OpenAI used the author's copyrighted work as training data for its large language model without obtaining permission from the author. As a result (causation), the author lost potential revenue from licensing their work to other companies for similar uses. Really f-ing good. I go to the client and I say, hey, has this happened? And they say, yeah, let me show you all the ways this has happened. Here's number two: OpenAI created a product known as Codex, a writing-assistant tool that used the author's copyrighted work, my MIT-licensed GitHub data, as training data and competed directly with my own writing services. I'm a coder, dude; you competed with me, causing me, the coder, to lose clients and revenue. Pretty f-ing good argument. I would use that as a lawyer.
Matthew Butterick is their lawyer, right? If I'm Matthew Butterick, I'm going to use that. Example three: OpenAI used an author's copyrighted work as training data and subsequently created an AI-generated version of the work, essentially taking my code and making a version of it that was similar enough to the original to cause confusion in the marketplace, leading to lost sales for me. Pretty good, right? If I have a unique coding style, maybe it's stealing my unique coding style and creating confusion in the marketplace. This one is kind of a stretch; maybe true, maybe not. Then the last one: the training data was sold and distributed without providing any compensation to me as the author. Also accurate. Anyway, this is where I stopped. You could keep going, right? You could say: okay, for this example one, give me some examples of how this happened. The beauty of data science is not question number one; the beauty of data science is what question one leads you to, which is question two, which leads to question three. I've gone, I guess, four levels deep on this, but you could imagine building out your entire case strategy. And this literally took me less than a minute. So if an associate charges $500 an hour, would they spend an hour to do all this? Probably. Would I as a senior lawyer charge for 45 seconds? Probably not; that's just part of my day. But now I'm going to flesh out the facts: I'm going to ask my clients, hey, did this thing happen? Did that thing happen? And maybe someday, with the LLM, I could input my discovery and say: now, from that discovery and from this deposition testimony, go ahead and pull out all the things that matter. Maybe we could do that, but that's down the road a bit.
For right now, I'm, you know, charging the client for validating this against the case law and doing those things. Anyway, you can imagine how this could fit into a pipeline. Say, for every motion I'm responding to: input the table of contents of that motion, output all the elements of all the claims, and output all the facts that could be used to prove those things. And then maybe faster, better, stronger. You can imagine a company like mine, Docket Alarm and Fastcase, where we have 725 million judicial opinions and lawyer-filed documents; you can imagine that might be interesting to us. Say I hit a button on this motion to dismiss: make counterarguments and facts. It could push it through GPT with some good prompting and output counterarguments and facts to prove those counterarguments. That might be interesting. If it's a motion to dismiss, do the prompt for the MTD, right? If it's a complaint, do the prompt for an answer: respond to each of the allegations in the complaint, and for each of those make a counterargument, admit, deny, et cetera. Anyway, this is just one example of prompt engineering; there are lots and lots of others. So I guess, any questions on this first example before I move on to others? Bravo is the first thing. Actually, one technical thing: I can't see the chat. I'm not sure what's going on with my implementation, so, Damien, if you don't mind tracking the chat in case there are amazing or important issues or whatever. Yep. Yeah, so I'm showing the chat on the screen right now. Question number one is: how will associates be trained to become seniors, if a big portion of their work will be done by machines? That's a fine question. So, in my example, I as a senior lawyer essentially did this work. You could imagine that I as a senior lawyer would instead assign the junior lawyer to do this work.
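The "if motion to dismiss, run the MTD prompt; if complaint, run the answer prompt" pipeline imagined above is essentially a dispatch table over document types. A hypothetical sketch, with illustrative prompt text and made-up type keys:

```python
# Map each incoming filing type to the prompt template for responding to it.
# Both the keys and the prompt wording here are illustrative assumptions.
PROMPTS_BY_DOC_TYPE = {
    "motion_to_dismiss": (
        "Below is the table of contents from a motion to dismiss. "
        "Create counterarguments, and facts that would prove those "
        "counterarguments."
    ),
    "complaint": (
        "Below are the allegations of a complaint. For each allegation, "
        "draft a response for an answer: admit, deny, or deny for lack "
        "of knowledge."
    ),
}

def prompt_for(doc_type: str, text: str) -> str:
    """Pick the response-drafting prompt for a filing type and attach the filing."""
    if doc_type not in PROMPTS_BY_DOC_TYPE:
        raise ValueError(f"No prompt template for document type: {doc_type}")
    return PROMPTS_BY_DOC_TYPE[doc_type] + "\n\n" + text

p = prompt_for("complaint", "1. Defendant ingested plaintiff's code...")
```

In a real product the `doc_type` would come from a classifier over the filing (the classification-and-tagging use case mentioned earlier), and the assembled prompt would be sent on to the model.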
You can almost think of this like a calculator. We teach people to add and subtract and multiply longhand, and after they figure that out, we give them a calculator, right? In the same way with lawyers: we teach them how to do this kind of work in law school, but then, as a first-year associate, maybe we give them the calculator that is called a large language model, and say: okay, junior associate, use this as a head start, and then flesh out all the facts that I want to show. Based on these fact examples, one through four, create for each of our clients a list of questions to ask them, and then follow up, as a junior associate, with each of those people to get an affidavit drafted. And you can imagine even GPT helping with that. For example one, the author's copyrighted work being used, I could input example one and say: okay, for example one, give me a list of questions somebody might ask an author about how this might have happened. And it could output those questions. Anyway, that's all a way of saying that the junior lawyer would use this as a calculator on the way to becoming a senior lawyer, not in the way that the senior lawyers were trained, because that was yesterday's calculator; we're training them using GPT, today's calculator, to practice law in a better way than the juniors and the seniors did in their day. So I hope that answers your question. I think that today's associates, if they use these as a calculator, not as a competitor, will be able to run faster, better, stronger than their seniors. Hear, hear.
I'll make a quick interjection here. There are a couple of things that have fallen between the cracks that I want to put some light on. One of them is, I think there's more to say about the role of the junior associate going forward. The first thing to say is, we don't have crystal balls, and yet this technology is profound; that means we can anticipate there'll be a lot of changes, most of which are going to be hard to predict. Okay, that's not going to stop me from making predictions, though. So one thing that I conjecture might happen is that there's going to be an important need for what I would call due diligence and legal assurance with respect to the outputs of these models. One of the things that you mentioned, Damien, when you were doing a quick, on-the-fly quality assurance, was: well, yeah, this looks pretty good. Which, by the way, I agree with; I think it's amazing. I'm flabbergasted personally by the results you got from this, and by your very clever and expert legal prompt engineering, which is where most of the value was, in how you prompted it. I'm flabbergasted that the thing works so well. And yet you also said, and this is the part I'm focusing on for this facet of the roles of tomorrow: well, maybe there are some jurisdictions with just two out of these three elements, and maybe, who knows, there's been recent case law in some other jurisdiction where there's a fourth. Okay, that sounds like a really good time to have a licensed attorney who's expert in the law doing some due diligence and legal assurance as part of the regular workflow of utilizing these outputs in the practice of law, not to mention other legal processes. There's going to be a lot more on this soon at law.MIT.edu.
Why? Because we're going to convene one of our famous task forces. The task force will bring together people who will help generate a set of principles and guidelines — at least as a hot first take — for what we think some of the due diligence and legal assurance types of actions ought to be when utilizing this technology for law and legal processes. Then we'll circulate it and get a lot of feedback. We hope to be constructive in starting to look at that more carefully, because we all have the lingering knowledge that, as interesting as this is, we can't totally rely on everything it produces. That breaks down into several different types of things, some of which could be put on a checklist and some of which involve other types of processes.

Yeah — a really good example of that is that you can imagine the large language model saying Roe v. Wade is the law of the land. Right. But of course it's not. I've talked with Ed Walters, my CEO, and he said that a citator is going to be the most important tool for what Dazza was just talking about, because the citator will say: oh wait, Roe v. Wade, upon which the large language model has been trained, is no longer the law of the land. Plessy v. Ferguson is no longer the law of the land. So there's this idea of what is good law and what is bad law, which of course changes daily. The large language model takes in the entire longitude of human history — or at least as much as it pulls in — and it's not able to say that this is now bad law. So maybe we can build systems — maybe we are currently building systems — to create a citator aspect of it: maybe this portion of the statute is bad law today, and therefore, if the output is trained on or matches too closely this statute, maybe we put a flag up.

Those are exactly the things we should be thinking about. Outstanding.
Can I give some other prompts that might be useful to the audience?

Yeah — there was one other quick thing I wanted to mention, in that open area of where the role is left for attorneys. This really is riffing off your observation that the law changes. The other consequence of a large language model is that it's very backward-looking in some ways, because its training set is all historical information. Not only might it not cover things that just happened, it definitely doesn't cover ideas you have — where, if you asserted the idea correctly in litigation and you won, now we've got new precedent. It definitely doesn't know about that. So there's this incredibly important role for experienced, creative litigators like you and others out there. It's a great opportunity to look at this stuff and come up with novel approaches. In this case, I thought what you got from your prompts — which, again, were excellent — was kind of a recipe of the typical things you would do for this general kind of case. What's going to happen in this case, I think, is that we're going to be looking at new horizons in the application of intellectual property, partly because we're not looking at a literal pastiche of sections of the language. It's been atomized and vectorized into billions of parameters — almost like the Star Trek transporter: beam me up, it's disappeared, but it's come back in a completely different form. This isn't Captain Kirk anymore. So there's doubtless intellectual property around it, but we're going to need novel theories and new ideas. This is another excellent place for human beings.
Okay, with that — yeah, I would love to hear more of your prompts. That is, in fact, the order of the day.

Before I do the other prompts, I want to augment what you were just saying. If you think about the role of the human in all this, one role is to count the things that machines cannot count. For example, I as a human read the news, and I know that all the senators are pounding the drums on this particular thing that maybe GPT hasn't ingested at this point. And maybe I've talked to the CEOs of the large companies, who said, yeah, we're going to have to be regulated. Those are not in the model. When I was working at Facebook, they had posters that said: not everything that counts can be counted, and not everything that can be counted counts. That's an example of what Dazza was just saying — the machine can count things, but it can't count the kind of side-room conversations I had with a CEO; it can't count the way the wind is turning in one direction or the other. Machines can't do that well, but humans can. And that's part of the prompt engineering: to say, as a prompt engineer, if the winds are blowing this way, what will be the output? I can take the human insight that was whispered to me in the halls and inject it into the prompt, so the machine says: oh, okay, now I know what was whispered in the halls, and I'm able to provide even better insights. So with that, let's go to some other cool prompts. As I start sharing my screen: one cool thing is that you can extract things. There are generative models — generative AI: tell me a story, write me a poem, right?
And then there's what I call generative extractive models, where I give the model some text, and then I essentially prompt based on that text. All of this is generative extractive, which is much better than pure generative. And as a Gen Xer, I like to say "Gen X": generative extractor. Anyway: "Above is some legal text. From the text above, please extract verbatim text snippets" — essentially saying, don't hallucinate. "Place each snippet into a Markdown table." You might not know this, but you can tell it to put the output in Markdown. "Extract only verbatim snippets. If you're not sure, do not answer." Again saying: GPT, don't effin' hallucinate on me. Okay. "Below is the syntax for creating this Markdown table: make a column" — and then I give it some examples. "Within any column, if a cell has multiple snippets, make a new row. Using this syntax, below are the instructions for creating this Markdown table. Here are some areas of law" — by the way, these are SALI areas of law, one of many SALI lists. "Here are industries" — these are SALI industries. "Here are legal concepts" — SALI legal concepts. Here are actors, here are assets — SALI assets, SALI forums, SALI legal entities, SALI authorities, SALI locations, SALI governmental bodies. So essentially saying: here is marketing text from a law firm website; from this marketing text, why don't you pull out all the things in SALI that matter, in Markdown. Here are the legal concepts: trademark, trade dress, unfair competition — each of these is verbatim a SALI tag; copyright, a SALI tag. Then actors: claimants (SALI tag), respondents (SALI tag). IP disputes, check. Trademark assets, yep. Copyright assets, yep. Worldwide, etc. So then here's another one, and you do the same thing there. So this is not generative — it's not telling me a story — it's extractive: give me the tags of the things that matter, and give them to me in structured text.
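To make the shape of a "generative extractive" prompt concrete, here is a minimal sketch of how one might be assembled programmatically. The tag lists and column names are invented placeholders, not the actual SALI taxonomy or Damien's actual prompt:

```python
# A minimal sketch of a "generative extractive" prompt, using made-up
# stand-ins for SALI-style tag lists (the real SALI standard has ~13,000 tags).
AREAS_OF_LAW = ["Trademark Law", "Copyright Law", "Banking Law"]
LEGAL_CONCEPTS = ["Trade Dress", "Unfair Competition", "Copyright"]

def build_extraction_prompt(source_text: str) -> str:
    """Assemble a prompt that asks the model to extract ONLY verbatim
    snippets from source_text and return them as a Markdown table."""
    return "\n".join([
        source_text,
        "",
        "From the text above, please extract verbatim text snippets.",
        "Extract ONLY verbatim snippets. If you are not sure, do not answer.",
        "Place each snippet into a Markdown table with these columns:",
        "| Area of Law | Legal Concept |",
        "|---|---|",
        "Here are the allowed areas of law: " + "; ".join(AREAS_OF_LAW),
        "Here are the allowed legal concepts: " + "; ".join(LEGAL_CONCEPTS),
        "Within any column, if a cell has multiple snippets, make a new row.",
    ])

prompt = build_extraction_prompt(
    "We defend trademark and trade dress claims worldwide."
)
```

The point is that the constraints ("verbatim only", "if not sure, do not answer", a fixed Markdown schema, a closed list of allowed tags) are what push the model away from hallucinating and toward extraction.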
And then you can imagine running this across all of your data to say: here are all the areas of law, here are all the industries for which we're doing that area of law, here are the concepts. And because it's Markdown, you could use it in your pipeline: okay, now that it's Markdown, I'm going to take all these columns and push them into our other systems. All that's to say that this generative extractive approach is a big deal.

Another is: create a decision tree. This is something that — of course, OpenAI has decided that ChatGPT is just too busy right now — so I'll see if I can log in quickly; if not, I'll jump over to the production side and use the Playground. One of the benefits I gave to my clients in my 15 years of litigation was to create decision trees for them: if this, then that — these are the things that matter. I said: give me a decision tree on whether to bring a breach of contract lawsuit under New York law. And here's the output. The decision tree says: Is there a valid contract? Seems like a good element of the claim. If yes, go to step two; if there's not a valid contract, do not pass go, do not collect $200. Has it been breached? Yep. Is it material? Yep. Is the injured party ready to perform? Yep. Has too much time passed? Right — these are all really good questions. Whether they're accurate under New York law or not, I don't know. Maybe they are — that's the kind of validation a lawyer still needs to do. But it gives us a good start, something to edit before giving it to the client. Now, this is the cool part: "express that in Python." And here's a Python script that implements this decision tree. I mean — think about that. It's important to note it's only a general framework, etc.
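As a hedged illustration of the kind of Python script GPT produced — the elements mirror the questions in the transcript, but the function names are invented, and each element would need to be verified against actual New York law before anyone relied on it:

```python
# A sketch of the breach-of-contract decision tree expressed as code.
# The elements follow the transcript's questions; they are NOT verified
# statements of New York law.
def breach_of_contract_decision(
    valid_contract: bool,
    breached: bool,
    material: bool,
    plaintiff_ready_to_perform: bool,
    within_limitations_period: bool,
) -> str:
    """Walk the decision tree top to bottom; the first failed element
    short-circuits with a 'do not sue' recommendation."""
    if not valid_contract:
        return "Do not sue: no valid contract."
    if not breached:
        return "Do not sue: no breach."
    if not material:
        return "Do not sue: breach is not material."
    if not plaintiff_ready_to_perform:
        return "Do not sue: plaintiff not ready, willing, and able to perform."
    if not within_limitations_period:
        return "Do not sue: limitations period has run."
    return "Consider bringing a breach of contract claim."
```

Once the tree is code, it can be edited, tested, and handed to a client as an interactive checklist rather than a static memo.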
So anyway, that's use case number two: create a decision tree, then maybe create code out of it. Another is to simplify arguments — say, "summarize this text in three bullet points." We input a bunch of text from a document, and it outputs three bullet points. Then, if it's still too hard — you can see this is kind of dense — you can say "make it simpler," and now here's a simple output that takes that long, dense text and simplifies it. And if you want to be accurate, you can change the temperature. In the Playground — OpenAI's interface where you can put in a prompt; think of it like ChatGPT on steroids — the temperature slider, if it's over to the right, is more creative; to the left, less creative and more restrictive. So you can say: I don't want you to guess. It's essentially a guess meter: creative to the right, don't be creative to the left. What I've done here is take a count from a complaint — a complaint alleging unfair competition. The prompt I give is: "Below is an area-of-law set," and then I give the SALI data set of all the areas of law. Then the prompt is: assign zero or more area-of-law tags to the source — I've already defined the source as this text — and return your answer as JSON. JSON, for those who don't know, is a good way to express this in structured form. "Here's an example of how I want you to express the output." I already ran this, but let's run it again — I'm going to submit this. And you can see: for trade dress and copyright infringement, it returns trademark and trade dress law and copyright law. It got it exactly right.
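The other half of that workflow is what happens after the model answers: because the response is JSON, it can be parsed and validated in a pipeline. A minimal sketch, with a hypothetical model response and made-up tag names standing in for real SALI labels:

```python
import json

# Validate a model's JSON answer against a closed tag list before using it
# downstream -- a cheap guard against hallucinated tags. The response string
# and allowed list are illustrative, not real SALI data.
ALLOWED_AREAS = {"Trademark and Trade Dress Law", "Copyright Law", "Banking Law"}

model_response = '{"areas_of_law": ["Trademark and Trade Dress Law", "Copyright Law"]}'

def parse_area_tags(response: str) -> list:
    """Parse the JSON answer and keep only tags on the allowed list."""
    data = json.loads(response)
    return [tag for tag in data.get("areas_of_law", []) if tag in ALLOWED_AREAS]

tags = parse_area_tags(model_response)
```

Anything the model invents that is not on the allowed list simply falls out, which is the structured-output analogue of "if you're not sure, do not answer."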
So now I can use this as part of my pipeline: okay, cool, now I'm going to tag these things with the SALI tags — which, by the way, are right here. What are the areas of law? Banking law, trade secret law — and intellectual property law is right here: trademark and trade dress law, etc. So I'm tagging these things up, and as part of the pipeline, here's the identifier for trademark and trade dress law. The idea with all of this is that we're being adopted by Thomson Reuters, by Lexis, by iManage, and by NetDocuments. So if you say, "Hey, Thomson Reuters, give me all your trademark stuff," you would send this identifier, and they would send you all the trademark things. You'd send the same identifier to Lexis, and they'll give you all of theirs; the same identifier to iManage, and they'll send you all of theirs; the same identifier to NetDocuments, and they'll send you all of theirs. The idea is that these 13,000 tags will essentially let the entire industry speak literally the same data language. The thing that is called negligent misrepresentation — which you could sue me for if I say something false about Dazza, Dazza could sue me for negligent misrepresentation — we can all use the same literal term for it, because it has a unique identifier. Here's the identifier, and all of us — Docket Alarm, Fastcase, Thomson Reuters, Lexis — everyone is now able to tag these things up. And here, to give an example of the companies that are using SALI: there's Lexis, Bloomberg, Litera, NetDocuments, iManage, a bunch of law firms, Microsoft — Jason Barnwell is using it — Intel, my employer Fastcase, Docket Alarm, NextChapter.
The point, mostly, is this: if it's hard to extract these things by hand, large language models can take a prompt, extract these tags, and tag things up in a way that's interoperable between TR and Lexis and Litera and NetDocuments, etc. So that's example number two, and I know we have only 10 minutes left — maybe it makes sense to jump into the chat questions.

Well, it does, but I'm so curious what your next example was going to be. I hate to leave some amazing thing on the table — was the next one going to knock it out of the park like the first two?

Here's one, and they're kind of interesting: you could say, "to the extent the above is a snippet, extract statutes or regulations and causes of action." So it's saying, you know, "I want punitive damages." These are largely SALI claims. But all of these, I would say, are not nearly as interesting as the others, so I think it makes sense to jump to the chat.

All right, good. I appreciate that. And of course, as usual, I can actually even see the chat, so — good.

Sure. So one question is: how are you standardizing the protocols of stuff in and out? This is a discussion-slash-argument I just had publicly today with iManage — with Sam Grange, whom I admire greatly. He asks: what's the advantage of going to SALI? And the advantage is that there are lots of ways to express the thing that is negligent misrepresentation; some jurisdictions call it "negligent misrepresentation causing harm." So this person's question is, how do I standardize the protocols? And the answer is, kind of, SALI. Whether you call it negligent misrepresentation causing harm or negligent misrepresentation, here's the unique identifier that relates to the thing that is negligent misrepresentation.
And in the same way, you can imagine that a motion to dismiss is a thing named "motion to dismiss" everywhere except California, where a motion to dismiss is actually called a demurrer. So here, motion to dismiss has "demurrer" as an alternative label, and in some jurisdictions it's called a motion to terminate. But each concept has only one identifier. So that's how large language models and SALI work together: you can use an LLM to extract motions to dismiss and motions to terminate and demurrers, but then you map onto this to say all of those are actually this one thing.

What's the potential long-term impact of IT companies entering the legal market — like prompt engineering software companies? I would say that if I were an IT company, I would be chasing lawyers who could do the kind of prompt engineering we've been talking about today. But the question is whether Google and Apple and Amazon and Facebook see the legal market as big enough to chase. One could imagine that what I've been describing in all these prompt engineering tasks is really good for, say, the US legal market. But that's maybe $2 billion or something, and they're chasing trillions of dollars of market. So do they want to chase this? If the answer is yes, then I'd be worried. But I don't know that they want to chase it yet.

Of course, we have some early indicators, if I may. I mentioned Allen & Overy's recent announcement that they'll be deploying OpenAI LLM technology. That's not because OpenAI is, in your words, chasing the legal market. What OpenAI is doing is starting to segment, with deals, into what are almost like resellers and customizers. One of those companies has come up with something called Harvey, which is domain-specific to the legal profession.
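That "many labels, one identifier" normalization step can be sketched in a few lines. The identifier strings below are invented placeholders, not actual SALI IRIs:

```python
# A toy sketch of SALI-style normalization: many surface labels map to one
# canonical identifier per legal concept. Identifiers here are made up.
ALT_LABELS = {
    "motion to dismiss": "SALI:MotionToDismiss",
    "demurrer": "SALI:MotionToDismiss",            # California's name for it
    "motion to terminate": "SALI:MotionToDismiss", # some other jurisdictions
    "negligent misrepresentation": "SALI:NegligentMisrep",
    "negligent misrepresentation causing harm": "SALI:NegligentMisrep",
}

def normalize(label: str):
    """Map a label extracted by an LLM to its single canonical identifier,
    or None if the label is unknown."""
    return ALT_LABELS.get(label.strip().lower())
```

The LLM handles the fuzzy extraction from free text; the lookup table handles the deterministic collapse onto one shared identifier that every vendor can exchange.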
So I would guess this is like a deluge, and the water is going to find its mark across every industry, even if through a variety of channels.

There's one other thing: I've invited Matthew Waddington — whom I've actually never spoken to in person; we've just talked on LinkedIn — to come off mute if he wants to. That's because he showed a legal prompt on LinkedIn, I think a day or two ago, that I was intrigued by. It was just very creative. He works on legislation in his jurisdiction, and he found a way to feed in parts of the law and then ask ChatGPT to turn it into basically a decision-tree kind of thing. Matthew, I don't want to mischaracterize you, but if you're willing, could you say what that cool hack was? Because it was a really fresh take on what to do with this technology and legislation.

It wasn't my idea — it was Régis Riveret's. He had done it on some French text, an international treaty, so I thought I'd do it on English legislation. I'm very happy with the results. I was asking it to turn legislative provisions into if-then statements. It got the first one right, which was a very simple if-then. But as soon as there was an "or" in it — so it was "if A or B then C" — it lost itself. Maybe it was just having a bad day, I don't know.

Hey, it's great to meet you in person — well, in Zoom reality. Thank you for sharing that. That's another thing we should keep in mind: I think some of the coolest, possibly most powerful applications of this technology to law haven't even been thought of yet. The stuff we're doing now is very important. But a lot of people want to express rules as code, and there are a lot of good reasons to do that.
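The target output Matthew describes is simple to show. The provision below is invented for illustration — it is not from any real statute — and the second function shows the "if A or B then C" pattern that tripped the model up:

```python
# A hypothetical legislative provision rendered as an if-then statement:
# "A person who is over 18 and resident in the jurisdiction may apply
# for a licence." (Invented provision, for illustration only.)
def may_apply_for_licence(age: int, resident: bool) -> bool:
    """IF the person is over 18 AND resident, THEN they may apply."""
    return age > 18 and resident

# The disjunctive pattern the model stumbled on: "if A or B then C".
def provision_with_or(a: bool, b: bool) -> bool:
    return a or b
```

Once a provision is in this form, it can be unit-tested against hypothetical fact patterns, which is exactly the appeal of rules as code.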
And getting a head start on that is another latent capability of the technology, which will improve over time. Okay — and thank you, Matthew. Back to you.

Yeah — to build on what Dazza just said: people might know Dan Katz and Michael Bommarito. They did a really cool thing with GPT. Rules as code is something Lawrence Lessig has been talking about since the 1990s — it's one of the reasons I went to law school, because I read his book about how code is law. Dan Katz and Michael Bommarito did a variation on Lessig: here is a whole portion of the US Code. GPT knows what a knowledge graph is; it knows what RDFS is, it knows what OWL is, it knows what SKOS is — it knows those concepts. What Michael did was extract from the US Code all the ideas that matter, and how all those ideas relate to each other. He put this out with Dan Katz about a month ago, and I talked to him a few weeks ago. I said, how'd you do it? He said, how do you think I did it? I said I would think of it almost like a word cloud: take the most common words, bubble them up to the top, and then see how those words connect to each other. He said, yeah, that's how I would have done it too. But instead — because GPT knows the US Code, and because GPT knows knowledge graphs — he said to it: extract all of the things that matter, all the ideas, from this text, and express them in SKOS and RDFS. And it created this beautiful graph of how all the ideas in the US Code relate to each other. That is mind-blowing. And now the other mind-blowing thing we're doing: having connected all the ideas in the US Code to each other, he's connecting that to SALI — that is, what are all the SALI identifiers that relate to this tax thing, or to this other thing.
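To make "express it in SKOS and RDFS" concrete, here is a toy representation of that kind of output as plain subject-predicate-object triples. The concepts and prefixed names are invented for illustration — they are not actual output from the US Code experiment:

```python
# Toy SKOS-style triples: (subject, predicate, object). The "usc:" names
# are hypothetical stand-ins, not real identifiers from the experiment.
triples = [
    ("usc:TaxEvasion", "rdf:type", "skos:Concept"),
    ("usc:TaxEvasion", "skos:prefLabel", "tax evasion"),
    ("usc:TaxEvasion", "skos:broader", "usc:TaxLaw"),
    ("usc:TaxLaw", "skos:related", "usc:Penalties"),
]

def broader_of(concept: str) -> list:
    """Find the concepts this one rolls up to via skos:broader."""
    return [o for s, p, o in triples if s == concept and p == "skos:broader"]
```

The same triple structure is what makes the follow-on step possible: linking a node like the hypothetical `usc:TaxEvasion` to a corresponding SALI identifier is just adding another triple.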
And now we're doing the mapping of the US Code to tax — tax evasion, tax law, etc. So anyway, this idea of extracting SKOS elements and knowledge-graph elements from natural text is pretty mind-blowing.

There are some chats — in the three minutes we have left — unless, Dazza, you'd rather do something else?

No, I think that's good. We can at least get one more Q&A. That would be terrific.

Cool. So there's a young associate essentially asking how to become a rainmaker. One can imagine that prompting could help with that: I want to chase this industry — what are some particular roles in this industry that I, as a lawyer, might want to chase, because they have decision-making capability over legal spend? That's maybe rainmaking.

From Quebec: our bar association is very concerned about using these tools — it's not actually using AI in any common way. Are there sophisticated tools that are legal-only? It sounds like the concern about AI is largely just not knowing how AI works. I would say: for any one of these prompts I've been doing here, how much of it is releasing client-confidential data? Not much, right? Because I'm extracting from it — give me the things that I should argue — and the facts that I'm going to ask my client about stay out of GPT. So as to these kinds of concerns: if I'm putting all my client information into the tool, and OpenAI might be reading it, then I might be worried. But using it as an extractive tool, I'd be less worried.

Legal script kiddies — before that, I invite everybody to glean from what Damien just said. This is a good practice.
When you're doing your legal prompts, do it this way, so that you're not infusing confidential information into the prompt, but you're at a level of abstraction where you're still eliciting the right information. I just wanted to put that more affirmatively. Thanks — back to you.

Indeed — the question about legal script kiddies. A script kiddie, for those who don't know, is someone who doesn't really know how to code but can essentially copy and paste code and use tools. That's what a prompt engineer is, right? I'm a — I almost said the S word — I'm a crappy coder, but I'm a really good script kiddie. And that actually makes me a good prompt engineer, because I can express in legal terms the things that the machine is then able to do.

Knowing we have one minute left: working in jurisdictions outside the US? I'd love to do some experiments — yes, definitely. TAM in legal — I don't know what TAM is. Devil's advocate: it's a fool's errand to do prompt engineering; all this is an ad hoc expert system. Oh, nice. And Thorne — I love Thorne in so many ways — says we have limited access to the model, OpenAI is a black box (which is accurate), there's no control over it, and tomorrow's models are probably going to be completely different.

Knowing that we've reached the end of our time: I agree with you 100%, Thorne. I think this might be a good way to end. There is no explainability, and the things we've just described either work or they don't — how they work, we have no effin' idea. It might work today and be totally broken tomorrow, because the OpenAI model is closed; we have no idea whether it will work tomorrow or not. So maybe a good closing, for Dazza and everybody else, is: trust but verify — maybe don't even trust, just verify. Say: yeah, maybe this is wrong, maybe it's BS.
And if it's BS, we want to be sure it's BS — don't trust it. Thank you.

Thank you. And that is a perfect segue. First of all, thank you for taking your incredibly valuable time and sharing what I think are some awesome — what would I call them — pioneering, creative, and highly competent examples of what this technology can do and how to use it elegantly and effectively. I'm really grateful for that. And by way of that last question, I guess we ought to end where at least I started, which is with the cautionary warnings: it is not appropriate to simply rely upon this technology. I think the biggest risk — precisely because it does so much, because it is in fact so good — is that we over-rely on it. So for due diligence and legal assurance — what I'm trying to do in my own practices, and what I think will be reflected in the guidelines that come out of the task force I mentioned — it would be just what you said, Damien. Let's start from the assumption that it's untrustworthy, that it's inaccurate, maybe even that its priorities and interests and biases are hostile to the interests of your client. Start with that, and then work up from there, so it becomes an input. And I think that does not eliminate the role of attorneys and licensed professionals, who are fiduciaries — it actually emphasizes the critical need for our judgment and our expertise. So with that, I want to thank you all for joining us again for this episode of law.mit.edu's Ideaflow, and we look forward to continuing this series with you later this year. Bye-bye.