And we're just starting. Hi, guys, welcome back to Future Law, day two. Today we have a couple of outstanding speakers. But before we get into the actual discussion and interaction, I want to introduce one other member of the teaching community and this community, Professor Sandy Pentland.

Hi, everybody. So welcome. We just got back from China, so I'm not entirely rational over here. And I'm probably not going to be able to stay the whole class, because I'm going to actually have to go pick it up. My apologies. It's really exciting to see this happen. Looking forward to things that happen. So please don't take it as anything negative if I need to leave. Well, actually, if you guys start snoring, I'm going to lose it. It's awful. Thanks.

Thank you. So today we have two terrific speakers on a topic that should be near and dear to all of us: artificial intelligence and its role in law, the privatization of AI, and the way that it's going to change the legal future. And I want to just briefly introduce our two speakers. But before I do that, I want to thank one of them in particular, Brian Kuhn, who's from IBM, and Dazza is just going to talk for a second about an event involving IBM.

Thanks. It is very fortuitous that Brian should be here, because IBM Watson for Law is one of our collaborators for the MIT Legal Forum event on October 30th and 31st that I mentioned last class, here on the sixth floor, where we'll be doing a deep dive into law and AI and law and blockchain. So thank you for partnering on that.

Thank you for the privilege.

What can we look forward to at a high level?

Understanding a rational approach to use case identification and delivering value, rather than "I want to do something with AI." A practical, rational approach to delivering business value. And also change management; we'll discuss that a great deal. And we'll focus a lot on the intersection of blockchain for law and AI for law.
These have been presented as two separate critical models really until recently. And of course, they're not. It's such a little bit of a method. There's a computational overlay. And some of your projects, I think, will touch on it too.

All right, well, thanks very much for that. And on with our regular programming.

Thanks. So I'm back in the MC role here. Here's what we're going to do. Today we're going to have two speakers, and we're going to have them speak back to back. They both flew across the country to be with us today. And I want to give a very, very brief intro to both of them before I turn it over and let them say their really interesting stuff.

Our first speaker today will be Pablo Arredondo. Pablo is the vice president of Casetext. Casetext is a legal research tool that uses artificial intelligence to do some really remarkable things. Back in my day job at Suffolk Law, I taught legal research and writing for a decade. What they're doing at Casetext absolutely changes the dynamic of how legal research is done, in a way that these multi-billion dollar companies just can't keep up with. So I'm really interested for you guys to interact with Pablo and to hear a bit about his insights into AI and the work of Casetext.

After that, Pablo is going to join us in the audience, and Brian Kuhn from IBM is going to step up. Brian is the co-founder of IBM Watson's legal group. He's done remarkable things. He speaks all around the world on the use of AI in the legal industry. And he's got some projects underway that are going to change things in a major way, including a project and a use case that he personally headed up, which I hope he'll say a few words about.

After Pablo and Brian have both had a chance to talk, we'll invite everyone to join the conversation and think together about how their work might map onto your work and the ideas that you might have. And we'll get their feedback. So, have we got anything further?
Brian and Pablo, thanks for being with us, and let me hand it to you, Pablo.

All right, thank you very much, Gabe. And thank you all. What a true honor to get to be here at MIT to talk about artificial intelligence and the law. Let me preface this by saying that when it comes to artificial intelligence, you guys are a stone's throw away from the folks that are creating much of it and have built much of it. And so, while I'm happy to talk about some of the issues in legal informatics and some of the things that are nuances for law, I would be remiss if I didn't let you know that when it comes to pure, real, hardcore AI, certainly you would want to get input from better men and women than I. But the truth is that law is a field where AI can have a lot of implications.

A little bit about myself. I graduated from law school in 2005, and I worked as a patent litigator at a firm in San Francisco and then a firm in New York. And I realized there was a discrepancy between the tools. We were representing companies like Apple and Google, and we were fighting over their technology. And the technology that we were using to do it was terrible, right? So I noticed that there was this disconnect. Now, you won't be surprised that the partners at the firm were not overly concerned with my thoughts about the technology we were using. But I found that there was a place where they do care about these things, and that was the Stanford Center for Legal Informatics, also known as Stanford CodeX. CodeX basically has one foot in the computer science department at Stanford and one in the law school. And I've been there sort of putting my shoulder towards building better legal technology, and in particular focusing on legal research. Because legal research is a crucial aspect of law. In fact, it's not just a tool of law. Legal research is, in essence, the practice of law, right?
Finding the right precedent among the eight million or so judicial opinions that are out there is one of the key skills that you learn in law school. It's one of the mandatory classes that you have to take. And depending on where you're at, you can spend 30% to 50% of your time as a first year doing it. And it's one of these situations where it reminds me of the Jim Morrison line, the song where he says, "I've been down so long that it looks like up to me." I think with law, a little bit, you have folks that are so used to these tools, Westlaw and Lexis, that they sort of think everything's going swimmingly. And that's all well and good until anyone stops to actually look at it. And then you find, in fact, that these long Boolean queries that they're writing, these long strings that they create to sort of convey context, are both over- and under-inclusive. They're getting results they didn't want, which makes them inefficient. And they're missing cases that they did want, which of course means they miss cases.

And part of the problem, I think, is that these traditional systems are basically relying on a form of the wisdom of the crowd. Which is to say that they'll tend to put at the top of your results the cases that have been cited by a lot of other cases, right? And you've seen how well that works with the internet, right? Who would argue with the wisdom of the crowd? The problem, though, when you try to apply that to professional search, is that your context is as unique as your client, right? So imagine if you went to the doctor's office and said, "Oh, I kind of have a headache," and he said, "You know what? This is our best-selling pill, just take it." You'd say, "No, doctor, I don't care. I want what's best for me," right?
And so these tools that rely on the wisdom of the crowd, that try to be generic, one-size-fits-all systems, are sort of good for showing you the big seminal cases sometimes, but they're very bad at finding that needle in the haystack. So then the question is, okay, well, if it's not the wisdom of the crowd, then what wisdom might we use, right? And you might think, well, let's just use the wisdom of the user, right? This is a patent lawyer; we've seen that normally he tends to go to patent cases. But the problem with that is that you can have a violent shift in your context. I would be doing patent litigation, and then suddenly I would be doing a pro bono immigration case. And believe me, when you're doing pro bono immigration, patent law is very rarely brought in. Like, there's never an IP issue in it, right? I wish they'd ask, "Do you have any patents? No, you don't? Okay, well, we'll keep you here." So the wisdom of the user doesn't quite work, right? And it doesn't even have to be that big a shift. Take biotech versus computer science: if you were doing a semiconductor patent, you don't want to see the same cases as if you were doing a gene patent, okay?

And so what I realized very early on in my efforts here is that there is a correct wisdom out there, and that wisdom is the wisdom of the litigation record itself, which is specific to that matter. So when I file a lawsuit, I file a complaint, and it's in digitized format, and that begins a very long string of data that encodes a context. Just as really as our genomes encode a biological organism, this record encodes who the parties are, what the jurisdiction is, what the underlying facts are, and it grows as the litigation progresses. And once you couple a litigation research system to that record, you can have amazingly intuitive results, and things open up which weren't possible when you were ignoring that record.
So when I first thought of this, it was in terms of, okay, well, I'm going to do a search for damages, and it's going to know by looking at the complaint that I don't want antitrust damages. I don't want slip-on-a-banana damages. I want my patent damages, right? But then when we built the prototype, something else occurred to us, which is that you don't actually need the keyword, right? Because once you're listening to the litigation record, before you type a thing, the software starts to say, "I've got an idea of what you might want to read," right? And so here we are with CARA.

So let me show an example here. I used to be able to say, "This is a brief from the Uber litigation," but now they're being sued so much that I have to tell you which litigation I mean. Uber drivers brought a class action lawsuit saying, "We should be considered employees, not independent contractors, and therefore you owe us gas money." Understandably, Uber said, "We'd prefer that that not be the case," right? And so there was a big litigation. For those of you that are lawyers, you've seen this a million times. For those of you who aren't, this is basically a long, 30-page document that the lawyers for Uber filed telling the judge, "I don't want to go to trial, we don't need a trial, don't even involve the jury," right? "There's no real issue."

And so, true to a common law system like ours, this table of authorities is all of the earlier decisions that they're citing, because that's how we argue things in our system, right? We look at all the earlier cases and all the earlier opinions. And so you can think of that as just a very complex network of unidirectional citation, right? They don't cite back and forth, because they have to cite back in time. And then a bunch of text, right? A bunch of words where they fight and fight. And we haven't doctored this in any way, right?
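As an aside, the "network of unidirectional citation" can be pictured as a directed graph whose edges always point back in time. The sketch below is only an illustration under stated assumptions: the case names, years, and the `build_cited_by` helper are invented for this example and are not Casetext's data model.

```python
from collections import defaultdict

# Hypothetical opinions and decision years (invented for illustration).
decided = {"A v. B": 1989, "C v. D": 2004, "E v. F": 2014, "G v. H": 2015}

# Hypothetical citations as (citing opinion, cited opinion) pairs.
citations = [
    ("C v. D", "A v. B"),
    ("E v. F", "A v. B"),
    ("E v. F", "C v. D"),
    ("G v. H", "E v. F"),
]


def build_cited_by(edges, years):
    """Index each opinion by the later opinions that cite it."""
    cited_by = defaultdict(list)
    for citing, cited in edges:
        # An opinion can only cite something decided no later than itself.
        if years[cited] > years[citing]:
            raise ValueError(f"{citing} cannot cite the later case {cited}")
        cited_by[cited].append(citing)
    return dict(cited_by)


cited_by = build_cited_by(citations, decided)
print(cited_by["A v. B"])
```

Looking up an opinion in `cited_by` walks the direct citation path forward in time, which is the only direction the edges can be followed from an older case.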
So traditionally what would happen, and what is happening all over the country right now for the 95% of attorneys who don't have us yet, is they read this brief and then they try to write their long, long Boolean query. "Independent contractor" and "driver" and "chauffeur," and you try to guess at what the right words are, and they're not always getting it right. This is what you do with CARA: drop in the brief. Probably no metadata. No OCR. It'll take a second longer if you have those here. And it returns cases that are not in the brief. And that's key. It doesn't return cases that you already have. It says, "These are cases not in the brief."

So we'll scroll through a few. This first one is about whether or not drivers are independent contractors. Again, here, whether classic drivers were employees. It turns out FedEx drivers brought the same claim a while back. Actually, they tried twice, the poor guys. Bus drivers at a terminal. And then number six, my favorite, holding that whether Lyft drivers were employees or independent contractors was a matter for the jury. What it brings you, then, is case law that's related to what you're working on, not just legally but factually, right? And we've designed it from the ground up to find the things that you might miss using traditional tools.

And since this is MIT, we're going to get deep in the weeds on this, because we're here. All right. How many of you guys know about Shepard's or KeyCite? Those of you who are law students know about this. Okay, so one of the big tools that lawyers use is a tool that lets you follow that direct citation path that I mentioned. So you're saying, "I'm reading this case; which cases cite to it directly?" and "I'm going to walk that path," right? But now think about a different situation. Imagine a case that doesn't cite directly to another case, but you see those two cases together a lot in paragraphs, right?
Just like you see Friday the 13th and Nightmare on Elm Street on the same shelf, even though they don't talk to each other. I mean, by the sequels, they might have actually started to talk about each other. But those early ones, right? That proximity, that co-reference to each other, is a very important relationship, because what it means is that a judge read both those cases and said, "I know they're related even though they don't cite to each other, and that's why I'm going to put them on the same shelf." And so that's one of the things that we do: we use that important relationship.

And if there's a theme to all of the work that I've been doing over the last eight years, it's that where judges have put in a lot of effort to do something, harness it, right? Information is work, right? Choosing among many things, choosing one and selecting it, is information, and there's always work there, and it can almost always be extracted, especially in legal documents, where you tend to have uniform patterns.

All right, let's see. So now we also do suggested terms. I'm only going to bring this up because the core technology we're using here was built with genetic algorithms. How many of you guys have heard of genetic algorithms? I'm beginning to sense lawyer versus non-lawyer here. So neural nets get a lot of play, and everyone kind of talks about neural nets, but I think genetic algorithms are the coolest thing, right? With these genetic algorithms, we're mimicking not the structure of neurons but the process of natural selection, the very thing that led to our beautiful neural nets, right, over millions of years. And so what you do is you basically tell the algorithms, plural, like you have a little population of them, and you say, "Okay, kiddies, success is to have a high score at Mario Brothers. That's what success means. Now go."
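As a sidebar on the "same shelf" relationship described earlier in this passage, co-citation can be sketched as counting how often two cases are mentioned in the same paragraph, whether or not either cites the other. This is a minimal sketch under stated assumptions: the paragraphs and case names are invented, the `CASE` pattern is a deliberately simplified stand-in for real citation parsing, and nothing here is claimed to be how CARA computes it.

```python
import re
from collections import Counter
from itertools import combinations

# Toy paragraphs from hypothetical opinions; the case names are invented.
paragraphs = [
    "See Alpha v. Beta and Gamma v. Delta on the right-to-control test.",
    "Compare Alpha v. Beta with Gamma v. Delta; see also Epsilon v. Zeta.",
    "Epsilon v. Zeta addressed damages only.",
]

# Simplified pattern for 'Name v. Name'; real citations are far messier.
CASE = re.compile(r"[A-Z][a-z]+ v\. [A-Z][a-z]+")


def co_citation_counts(paras):
    """Count how often each pair of cases appears in the same paragraph."""
    counts = Counter()
    for text in paras:
        cases = sorted(set(CASE.findall(text)))
        for pair in combinations(cases, 2):
            counts[pair] += 1
    return counts


counts = co_citation_counts(paragraphs)
print(counts.most_common(1))
```

Pairs with high counts are candidates for "related even though they don't cite each other"; a real system would presumably weight by proximity and by how common each case is.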
And they all try their best, and then you literally take the two that do the best and you say, "Well, why don't you guys sort of mate and produce a new group?" And you literally just replicate that over and over, and you sort of optimize it, right? And again, this is something that there are people here who will tell you about, probably people who invented it, I wouldn't be surprised to learn, a stone's throw away from here. But genetic algorithms are just one of the many types in the AI family of algorithms that you can use to start to make things more intuitive.

So how am I doing on time? I think what I want to show you, let me see if this opens correctly. So this article in the Stanford Law Review talks a lot about the different aspects of artificial intelligence and law. And it was written in 1970. And half of the paragraphs in this thing could be things that I could say to you today and sound like I'm in the zeitgeist of people talking about law and artificial intelligence, right? What seems cutting edge to us right now are actually things that I think are often decades old, right? And the same sort of issues that this guy brings up. Can a computer do analogous reasoning? No, right now, right? Will computers replace lawyers? Things like that, right? So all of the conversations that we're having right now have actually been occurring off and on, I think, for decades.

And so one of the other things I'd advise you guys when you look at this space: don't be afraid to go look at what people were saying in the 80s or the 70s or the 90s even, right, when they tried to talk about this stuff, because you'll see a lot of it's pertinent and can actually help inform your thinking about this. And the problems that existed in the 70s that are still here are obviously thorny ones that, if you solve them, you'll probably be well off.

All right, so I think lastly I want to show you. All right. So how do I know that this is the holding of the Lyft case?
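As a sidebar, the genetic-algorithm loop described above (a little population, a score that defines success, the two best breeding the next group, repeated over and over) can be sketched as follows. This is a toy under stated assumptions, not Casetext's implementation: the `fitness` function simply counts 1s in a bit list as a stand-in for "a high score at Mario Brothers," and every name and parameter is hypothetical.

```python
import random


def fitness(genome):
    """Toy score: the number of 1s (a stand-in for a game score)."""
    return sum(genome)


def evolve(pop_size=20, genome_len=16, generations=40,
           mutation_rate=0.05, seed=0):
    """Run a tiny genetic algorithm; return the best genome and score history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank by score and keep the two best as parents.
        population.sort(key=fitness, reverse=True)
        mom, dad = population[0], population[1]
        history.append(fitness(mom))
        children = []
        while len(children) < pop_size - 2:
            # One-point crossover: head from one parent, tail from the other.
            cut = rng.randrange(1, genome_len)
            child = mom[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        # Carry the parents over unchanged so the best score never drops.
        population = [mom, dad] + children
    return max(population, key=fitness), history


best, history = evolve()
print(fitness(best), history[0])
```

Keeping the two parents in each new generation is a design choice (often called elitism) made here so that the best score cannot get worse from one generation to the next; the speaker does not say whether their system does this.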
Right?

How about, can you define "holding" for folks that don't know?

Sure, that's a good point. Okay, so in a case, the judge will talk about a lot of things. He'll talk about the background facts. He'll talk about the overriding tests. He'll talk about all these things, and then he'll make a holding, which is sort of the core legal point that he wants to be taken away from the case, right? It's sort of one of the key decisions that's being made. And very often you'll say to somebody, "Just tell me what the holding of the case is," or in court you'll say, "Your Honor, this case held that," and off you go. And the problem is that there's no AI out there that you can feed a case and it will spit out the holding for you. Right? And, you know, IBM can confirm this.

So we had to accept that there's no AI out there that does it. What Westlaw and Lexis, these big companies, do is hire an army of editors, and these editors may or may not love their jobs, right, but they're out there churning out these little summaries, writing sort of one-size-fits-all summaries. But what we realized is that there's another place to get concise summaries. And that's in these special things that are unique to law called explanatory parentheticals. I mean, parentheticals aren't unique to law, but what is unique to law is that very often, and I'll tell you exactly how many times in a second, judges will cite to a case and then, in a parenthetical, give you the summary. And because of this thing called the Bluebook, which is a uniform system of citation, which I hated in law school, right? It's every stupid comma, period, and dash that you have to do, and you have to capitalize it. But boy, when you get into legal informatics, do you love the Bluebook, because the Bluebook is what lets you use regular expressions, right?
And people get mad at me, like, "Oh, we should be doing Python parsing grammars," but I like the old-school regular expressions that you just define for this. So you can write regular expressions that say, "Pull out any parenthetical that follows these specific case cites." And at first I just pulled out ones that began with "holding," but then I was like, what about "finding"? You know, "finding" is a good word, or "concluding." And then I started to just get endorphins every time I pulled out more. So I just pulled out everything that wasn't "quoting." So I pulled out 400,000 of these summaries. And then our VP of data science did one of these sort of, "Step aside, man, let me show you what this is supposed to look like." And then he pulled out 2.3 million of them.

So think about that for a second. That's 2.3 million summaries written by judges, right? I mean, talk about having a good editorial board. And it was there this whole time, just waiting, just saying, "Please just take me, I'm here," right? Because judges had done the work, there was an identifiable pattern, and we pulled them out. And so now I submit, Westlaw and Lexis notwithstanding, that ours is the crown jewel of case summaries, right? And in fact, you'll see multiple summaries, because different courts will summarize the case, looking at it from different angles. Actually, no, those are key passages. Sorry, I need to go back to one of the summaries. This case is too new. That's the one downside with it. Different cases will look at it from different angles. And sometimes, depending on your context, one of those sub-angles is actually what you're interested in, right? So it's kind of, in a sense, free, and it's more powerful, and it's from a more reliable source.

So I guess what I want to stress there is that where AI can't take you, it doesn't mean you give up, right? It means asking: are there places where humans have done that work in a way that can be relied on, right?
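The parenthetical-mining idea described above can be sketched with a single regular expression. This is a minimal sketch under stated assumptions, not the pattern Casetext actually used: the sample text and case names are invented, the `PATTERN` handles only short party names and simple reporter cites, and real Bluebook citations need far more cases covered.

```python
import re

# Invented sample text in roughly Bluebook form.
text = (
    "See Smith v. Jones, 123 F.3d 456, 460 (9th Cir. 1997) "
    "(holding that the drivers were employees); "
    "Adams v. Baker, 45 F.2d 67 (2d Cir. 1930) (quoting Smith v. Jones); "
    "Clark v. Davis, 789 U.S. 12 (2001) "
    "(distinguishing Smith on the ground that the workers set their own hours)."
)

PATTERN = re.compile(
    r"(?P<case>[A-Z]\w+ v\. [A-Z]\w+)"      # short party names
    r", \d+ [A-Za-z0-9.]+ \d+"              # volume, reporter, first page
    r"(?:, \d+)?"                           # optional pin cite
    r" \([^()]*\d{4}\)"                     # court and year
    r" \((?P<summary>(?P<verb>[a-z]+ing) [^()]*)\)"  # explanatory parenthetical
)


def extract_summaries(source):
    """Return (case, leading word, parenthetical) for every non-quoting match."""
    results = []
    for match in PATTERN.finditer(source):
        if match.group("verb") == "quoting":
            continue  # a quotation is not a summary of the cited case
        results.append((match.group("case"), match.group("verb"),
                        match.group("summary")))
    return results


for case, verb, summary in extract_summaries(text):
    print(case, "->", summary)
```

Keeping the leading word ("holding", "distinguishing", and so on) alongside the summary costs nothing and makes it possible to group the extracted parentheticals by type later.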
And that also goes for creating training sets to do sort of sophisticated machine learning stuff, right? So one of the big problems is we don't have a citator that's editorial, right? So we can't tell you, "This case was distinguished by this case," right? Again, Westlaw and Lexis: an army of editors, just reading and saying, "I think this is distinguished," right? And they disagree with each other 15% of the time, so it's not like they're perfect, right? But a lot of these parentheticals don't begin with the word "holding"; they begin with the word "distinguishing." So it'll say John v. Doe, paren, distinguishing, you know, Penny v. Tiger, whatever, right?

Could you maybe just define "distinguishing," because it's not the Merriam-Webster definition here?

It's not that different from Merriam-Webster.

Yeah, you're right. Distinguishing is when the court is saying, "It's not that I find this case to be bad law. It's that it doesn't apply here, for various reasons on the facts." So I cite a case that says, "Your Honor, it says here you need a warrant to board the guy's Uber." And they say, "Well, he did have a warrant, so that doesn't have anything to do with this." And you go, "Sorry," right? And that's where you want that citator, so you don't look stupid like that, right?

But my point is, I guess, that you can also use the judges' descriptions of cases as though you had hired them to do your machine learning labeling, right? As though they had been hired to create the training sets that you can use. And then you can feed that through and come up with algorithms that can, at least to varying degrees of confidence, tell you, "I think this relationship is distinguishing." So those are some of the things that we're working on and looking at. And the last thing I would say about this is that right now, the interface is drag and drop. But the nature of our court system is that when you file a brief, it's filed publicly.
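As a sidebar, the idea of treating judges' parentheticals as ready-made training labels can be sketched like this. Everything in it is an assumption made for illustration: the sentences are invented, using the sentence around a citation as the input text is a guess at one possible feature, and the word-overlap scorer is a deliberately crude stand-in for whatever model one would really train.

```python
from collections import Counter, defaultdict

# Invented (sentence around the citation, judge's parenthetical) pairs.
raw = [
    ("Unlike here, the officer had a warrant.", "distinguishing on the facts"),
    ("Unlike that case, the workers here set their own hours.",
     "distinguishing on control"),
    ("The same rule applies here.", "holding that drivers were employees"),
    ("We apply the same test.", "finding the test satisfied"),
]


def tokenize(text):
    """Lowercase and split on whitespace after dropping commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def label_examples(pairs):
    """Use the parenthetical's first word as a label the judge already wrote."""
    rows = []
    for sentence, paren in pairs:
        first_word = paren.split()[0]
        label = "distinguishing" if first_word == "distinguishing" else "other"
        rows.append((sentence, label))
    return rows


def train(rows):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    for sentence, label in rows:
        word_counts[label].update(tokenize(sentence))
    return word_counts


def predict(word_counts, sentence):
    """Return the best-scoring label and its share of the total score."""
    words = tokenize(sentence)
    scores = {label: sum(counts[w] for w in words)
              for label, counts in word_counts.items()}
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, (scores[best] / total if total else 0.0)


model = train(label_examples(raw))
print(predict(model, "Unlike here, the facts differ."))
```

The second value returned by `predict` is only a rough share of the score, included to mirror the "varying degrees of confidence" point; it is not a calibrated probability.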
And so we're now starting to listen to thousands of dockets automatically. And as soon as something's filed, we're running it through our tools and sending it to the attorney. So the attorney gets a note from the court that says, "This was filed," and then two minutes later we're in their face like, "And you should read these cases about it, by the way," right? And we have confused a lot of attorneys like that. But I see a system going forward where this can also help pro se litigants, folks that are representing themselves, folks that can't afford an attorney who can hire three associates to do research. This ability to use the litigation record itself to retrieve useful case law, to sort of assemble these things, I submit to you guys, is going to be the future of how law is done. Thank you so much for having me.

Thanks, Pablo. Please hold your questions for Pablo. I'm sure people have them, as do I. We're going to welcome Brian Kuhn up next. And while Brian comes on up to the lectern, let me just say a couple of words about him. Brian is the person who created and co-founded the IBM Watson legal group. And he is also a proud alumnus of Suffolk University Law School. And he's going to present today some of the work that IBM has been doing in the legal space using Watson.

No problem.