I think it'll be a little easier to see this if I set it up as follows. I'd like to start off by providing in this talk an overview of what we're about. Then I'm going to highlight three specific social science / empirical legal studies examples of ongoing projects. I think these projects have pedagogical value and research value, but also potential payoff for end users of legal information. So even for the folks at, say, LexisNexis, some of the information visualization we're going to show is a way to help end users navigate the enormous amount of content they're confronted with. Then I'm going to hand the floor over to Mike, who's going to walk through various components which might be enhanced using law.gov-style information. That'll culminate with a list of possible questions that Carl asked us to compose, which are ideas for research projects that would aid folks in the academy, but also potentially people who are interested in doing things like business analytics with a law flavor. So with that, I want to start by providing a little orientation on where we come in on all of this. We're in this big data era, and for those of you not familiar, it basically goes like this: increasing computing power and decreasing data storage costs are fundamentally changing the scope of scientific inquiry. This point has been highlighted in a number of recent publications: a special edition of Nature, a paper in Science, and just a few months ago The Economist ran a 16-page special report on the "data deluge," as they call it. I think this is a bit of a game changer for those of us who are interested in studying law and studying political and legal systems.
So here's where we come in: we're interested in using computational techniques to study these legal systems at a very high level, but also to use these techniques to drill down on particular things we're interested in. So it's not the traditional distinction in the social sciences between qualitative and quantitative; it's both qualitative and quantitative, but using data at a scope that hasn't really been used in the past. Legal systems contain a tremendous amount of information. We just saw examples from Texas alone. Now, if we think of the 50 states in the union, the federal government, all the municipalities and so forth, it becomes a very large corpus. And the question is, do we have methods that scale to the size and scope of this information? Very quickly you ask: is it possible in human time to actually read through all of this? And that takes you right into computation and computing. So we run a blog called Computational Legal Studies, and what we try to do on the blog is highlight the various people around the country and around the world working on components associated with applying computational methods to the study of legal systems, and political systems more generally. So it's about large-N data analysis. But I want to highlight, just so I'm on the record: theory (theoretical models, mathematical models, computer simulations, and things like this) is still part of the puzzle too. One reason this is the case is that you only have one run of history, so from the data generated by a particular process you can't ask counterfactual questions. What if the world had been different in this way? You can't answer that from the data stream alone, because the world was the way it was, and the data was generated on that state of the world.
So I want to be clear that theory still has a place in all of this, even though theoretical offerings have not changed in the way that data has changed in, say, the last three to five years. On this point, some of you might be familiar with Thomas Schelling's social segregation model; Schelling won the 2005 Nobel Prize, and the point is that it's a theoretical model that makes empirical predictions we can test using data. And out of Michigan, Mike, for example, works with Robert Axelrod, who has this evolution of cooperation model, a game-theoretic model, and perhaps a Nobel will be in his future very soon. So for law.gov, we see this as a very exciting possibility, because high-quality, authenticated data is what we need to do this type of analysis. There are a lot of social science and legal studies questions we can evaluate, but the key is that we need the data. So I'm going to walk through, as I mentioned, a few projects, and then we're going to culminate with this master list. Here are just a few things, pitched more at the federal level, but this could just as easily be in line with the discussion that was had earlier today. The first project we're working on is a study of the United States Code. I think most of you are familiar with it; here's a picture of it on the shelf. I did my best in law school to never crack a spine on one of those, but here I am studying it, so the joke's on me, I guess. For those of you who are familiar with it, it's the compiled version of federal statutory law, and it's compiled in a way maybe similar to the way a computer program is compiled: there are changes being made all the time. The Statutes at Large are the chronological ordering of bills passed by Congress, and they're then compiled into this object, the United States Code.
It's important to note that the Code does not include administrative regulations, but notwithstanding that, some of the titles in the Code are pretty familiar to folks. One going on right now is the census. Another one that hopefully you've made some arrangements on is Title 26, the tax code, the Internal Revenue Code. Others include labor law in the United States, public health and welfare, and so forth. For us, this is a very interesting way to get a perspective on the scope of law in a modern society. So we can ask some very basic questions, like: how big is this thing? We can see how many pages it is; we can see how many books it fills. Those are basic starting points. Another question is: how complex is it? And how has that complexity and scope changed over time? There are a lot of anecdotal accounts about things like this, but not a lot of science, I have to say. That's what we're interested in doing: we want to study these objects in a way that would stand up in a scientific, peer-reviewed publication. Here's another question we could ask: are changes in the Code coupled with changes in the administrative state? How have the regulations, which supplement and integrate with the Code, been changing over time? Now, in case you don't accept my premise about computation, let me give you a sense of the scope of what we're talking about. Here's the first page of the United States Code. Page one, open the book, here's what it looks like. Now take that page and make it yellow, just so we can keep track of it. Then take it, shrink it, move it up into the corner, and fill the entire screen with pages.
If you did that, reading it all is already getting to be quite a serious project, and that would get you something roughly equivalent to federal labor law in the United States. But that's not how large the Code is, right? So I take this and push it up into the corner again (this right here is our page; it's getting pretty hard to see it) and I fill the whole screen. Now, that's the United States Code. The visual demonstration is meant to reinforce the principle that if you want to study this at the population level, you're already moving into a space where, and I think a lot of folks here may already agree on this, but just in case there's a holdout: you can start reading, and we'll see you in quite a while if that's the direction you choose to go. So how do we measure these objects? I just showed you a visual demonstration of how large this is. What's the approach to measuring an object like this in terms of complexity? First, you need to generate a mathematical representation of the object in question. That's the jumping-off point for anything like this. Then you need to come up with a well-justified manner of measuring that representation. So let me start with how you would think about this as a mathematical object. What are the features of the object in question? One thing that obviously jumps out when you look at it is its hierarchical structure. That's not news to most people in here, but a 501(c)(3) exempt organization, for example, is a specific claim that traverses the hierarchy from Title 26 all the way down to (c) and then to (3). If you drew that out, it would look something like this. And actually, this dramatically understates the amount of hierarchy, because it only goes down to the section level.
If you went deeper, there would be more breadth and greater depth in different places, but this gives you the idea. Now, in terms of an end user, perhaps this would be an easier way to navigate than paging through screen after screen. This sort of visualization, if it were interactive, might be a better way for them to get a handle on how the content is structured. I just put that out there as a possibility. But hierarchy is only the first component of an object like this; it's not the only one. The Code also has citations, and those have to be dealt with. Just as an example, take tax evasion. The tax code may reach over and reference criminal procedure when talking about tax evasion. So it's a tax question, but we need to rely on principles developed in Title 18. There's a dependency between components of the object, and that has to be represented as well. When you take this example and iterate over the whole Code, it looks like that. That's actually what the citation network of the United States Code looks like. If you go to our blog, this is fully zoomable, and you can go in and look at it. This component right here is fairly modular: that's the tax code. The tax code has references, but most of them are internal to the tax code. Some are not, and so this component on the boundary spans across. Again, if you want to take a look at this on your own, it's available. And again, visualization is a way to help deal with the complexity that's out there, to help end users. Even if you don't care about the research side of this, for an end user this is perhaps a much better, more intuitive way to experience something like this.
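To make the two structural components concrete (the hierarchy plus the cross-reference citations), here is a minimal Python sketch. The provision labels and edges are hypothetical stand-ins for illustration, not drawn from the actual Code:

```python
# Toy fragment of the U.S. Code's hierarchy (hypothetical labels, not real
# data): each node is a path of nested levels, e.g. Title 26 -> Sec. 501 ->
# subsection (c) -> paragraph (3).
hierarchy = {
    "26": ["26/501"],
    "26/501": ["26/501/c"],
    "26/501/c": ["26/501/c/3"],
    "18": ["18/3571"],
}

def path_to(node):
    """Traverse the hierarchy from the root title down to a node,
    the way '501(c)(3)' names a walk from Title 26 down to (3)."""
    parts = node.split("/")
    return ["/".join(parts[:i + 1]) for i in range(len(parts))]

def subtree_size(node):
    """Number of nodes at or below `node`: a crude size measure."""
    return 1 + sum(subtree_size(child) for child in hierarchy.get(node, []))

# Cross-reference (citation) edges between provisions, citing -> cited.
citations = [("26/501/c/3", "26/501"), ("26/501", "18/3571")]

def internal_share(title, edges):
    """Fraction of a title's outgoing citations that stay inside the
    title: a rough proxy for the modularity seen in the tax code."""
    out = [(u, v) for u, v in edges if u.split("/")[0] == title]
    internal = [(u, v) for u, v in out if v.split("/")[0] == title]
    return len(internal) / len(out) if out else 0.0

print(path_to("26/501/c/3"))            # ['26', '26/501', '26/501/c', '26/501/c/3']
print(subtree_size("26"))               # 4
print(internal_share("26", citations))  # 0.5
```

The real representation in the papers is of course far richer; the point of the sketch is only that the tree and the citation edges are two separate mathematical layers of the same object.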
The last component (and there are some components under these, but I'm not going to go into too much depth) is the linguistic content. The Code has about 24 million words as of fairly recently, and that's represented on those pages I showed you earlier. So we have a couple of papers where we begin to sketch out how to do this representation. The first one is under review at Physica A, and it describes this in a formalized way. The other paper, which explores the properties of the citation network, is under review at the European Physical Journal B. Both are available on the physics arXiv, and our blog has a portal to the arXiv at the top of the page. Those papers set the stage, in some sense, for this question of complexity. One of the things we're interested in is the complexity of law and the complexity of society, and the way those things move together. So we have a paper in progress, pitched at a more general audience (the other two are pretty technical), where we try to measure the object and provide some sort of in-depth explanation for these things. When we do that, these three factors keep coming up, and I see legal studies going in this direction: to deal with something like this, you need at least some familiarity with a lot of different things, because it really calls on you to think about applied math, and law, and computer science, and linguistics, and psych, and political economy, and lots of other things. So that's our perspective, and this is one example of a project we're currently undertaking. We can ask questions like: how are the complexity of law and the complexity of society co-evolving with one another?
And we can do it in a way that really has measurements involved, and if people don't like the measurements, they can write a follow-up. That's science, right? And that, I think, would really be exciting. So we're putting that out there to the world. Another project, probably just slightly on the boundary of things law.gov might be about, is an example of how, when you make data available, things become possible. One of the exciting things is the things we can't even think of now. If you put the information out there, there are a lot of creative people, and they can come up with stuff you would never contemplate, either with that data stream alone or by mashing it up with other data streams. They can say, well, I'm going to use this over here and this over here and do some novel recombination. That's a whole other avenue, and one of the reasons having that information could be really exciting. So, a project I've been working on involves network analysis of federal judges. The basic idea (and I probably don't need to elaborate for this group) is that we're going to use these colors throughout: that's the hierarchy of the courts, and these colors will be consistent. The question I'm interested in is social topology. Even among judges, there are superstars, and then there are other judges who just come and go. So we're interested in: why did this person become a star and not that one? Why is this person the one who shows up in every casebook and not that one? Why is this person always discussed and not that one? Can we get a measurement of something like that? So what we did is go out and collect law clerk information, for 1995 to 2005, from the staff directories that are produced. Now, one thing we can imagine is that for every judge, that basic information was made available.
That's not currently in scope, I think; this is just a proposal, nothing serious: just make the names available, and we can use that information to show the social connections between individuals. The basic premise of the paper is this: in aggregate (not at the individual level, not at one particular transaction), when you study law clerk movements like a physical system, just watching them all move around, that reveals social and professional relationships between people. In any given year, maybe one person moves for idiosyncratic reasons, but over a wide window, like 10 years, those patterns bear out in a particular way, and there's actually something going on under the hood. That's the core claim of the paper. If you don't buy the core claim, that's fine, but let me show you what it looks like; maybe you'll buy it then. This is something that people at the Career Services Office here at Texas, and I know at Michigan, spend a lot of time thinking about: we want to help our students get these clerkships, and what we'd really like is to get our students a clerkship at the Supreme Court. Well, how can we do that? We have these two justices, which might be the end point, but they don't just hire people straight away. Well, one thing we might learn over time is that if you go to Judge A, you can get to Justice Y; that's the gateway to Justice Y. Another thing you might learn as you collect information over time (and this is constantly changing, because judges are going in and out) is that at any given moment, Judge B will get you to Justice Y and to Justice Z. Someone might say, well, I don't know if I can get to Judge B directly. But if I go to Judge C, you see, I can get to Judge B, and then maybe I can get to Justice Y or Z.
Maybe I will, maybe I won't, but at least I know there's a path. Maybe I'll take that path, maybe I won't. Same thing here: you might learn these paths, and more and more of them as you go on. And maybe there are regional aspects to this; there are, in fact, other aspects like that. But this is information. Now, a lot has been written about this, but I'm not interested in the market for clerks per se. I'm interested in using all of that to learn something about the relationships between the judges. I'm not so invested in the clerks; I'm not trying to help them get their clerkships. That's for other folks to do, but I think we do learn something in the aggregate. So let me show you what that looks like. That's what it looks like when you take the 10 years and visualize them as a network. What you have is this really dense core of individuals here, and then this sort of periphery. One thing I think is interesting is that there are components within the periphery which are still interesting. Like right here, you see this? There's all this flow into this individual. From there, there's only this one path that gets you onward, but there are all these people trying to flow into this person. And you see a couple more of those on the boundary. As you get to the center, though, there's this really dense core of individuals. There's a visualization algorithm that generates this; it tries to minimize an energy configuration, and I can talk about that offline. But I think what you want to see is to zoom in and ask: who specifically are we talking about? So here's just an example. Now, I want to highlight that this data ends at the end of the natural Rehnquist Court. The point was: let's hold the Supreme Court constant for 10 years, no changes at the top, and see what's going on down below. So one thing that jumps out is somebody like Sotomayor being one of the key feeder judges. And you see this here.
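As a sketch of how one might operationalize this from clerk-movement data: build a directed graph of placements, count flows into the justices to find feeders, and search for placement paths. The judges, justices, and moves below are hypothetical, not the actual 1995-2005 data set:

```python
from collections import Counter, deque

# Hypothetical clerk placements (clerk moved from the first judge's chambers
# to the second's); the actual study used 1995-2005 staff directories.
moves = [
    ("Judge A", "Justice Y"),
    ("Judge A", "Justice Y"),
    ("Judge B", "Justice Y"),
    ("Judge B", "Justice Z"),
    ("Judge C", "Judge B"),
]

# "Feeder" score: how many clerks each lower-court judge sends to a justice.
feeder = Counter(src for src, dst in moves if dst.startswith("Justice"))

def reachable(start, goal, edges):
    """Breadth-first search: is there any placement path from start to goal
    (e.g. Judge C -> Judge B -> Justice Z)?"""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

print(feeder.most_common())                      # [('Judge A', 2), ('Judge B', 2)]
print(reachable("Judge C", "Justice Z", moves))  # True
```

The paper's actual graph statistics are more sophisticated than a raw in-degree count, but the idea is the same: aggregate movements reveal the feeder structure.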
And if you do some graph statistics on this, her name jumps out. We had her name on a list; people generate these lists anyway. But this is not just a "just-so," because-I-say-so story. If you just look at the behavior, she's exactly where you might find her: sort of to the left, but very close to the Court. Here's Merrick Garland, a name that's been mentioned as a possible Supreme Court nominee. There's Alex Kozinski, the Ninth Circuit judge, Judge Posner, Diarmuid O'Scannlain, and then Sam Alito right there. He's green because he had yet to be elevated to the Supreme Court. So even if you didn't buy the claim, I think this lays out pretty much like people's intuitions, but it does so in what I would argue is a more neutral framework. We're just using this data to try to get a handle on who's who in the federal judiciary. And if I asked people to write down how they thought it would lay out, I think this would look very familiar. So again, we can learn something about the system just from this kind of information. This probably wasn't what people thought would happen when they put those names in that directory, but these are the types of things that are possible. If you're interested, this paper is online; it's a working paper at Michigan Law School, and it's coming out in the Ohio State Law Journal. All this stuff is available online if you want to check it out. The last thing I want to talk about is exploring the path of precedent: judicial citations. One place we can study those, an obvious starting point, is the United States Reports. Here's a page: Bush v. Gore, page one. From a case like this, there will be citations in the case. It's a similar approach, a network approach, but in this instance, instead of judges, the nodes are cases.
And the edges are citations, and they're directed: from case one to case two. I'll show you a very concrete example. A case decided in 2000, Dickerson, cites lots of things. One of the things it cites is Oregon v. Elstad. So a case has one to n citations; it might have tons of them, might have a few. This is one of the cases it cites. And Oregon v. Elstad cites Miranda v. Arizona. Not surprisingly (and this is why there's clustering in the graph), Dickerson also cites Miranda v. Arizona. This is a trivial example, but the point is to highlight what the individual components of those great big graphs are actually made of. One point of differentiation between a social network and a citation network: these are acyclic graphs, meaning time runs in one direction, and the triangles don't close the way they do in social networks. You already knew that, in the sense that Dickerson, decided in 2000, can't cite a case from 2005, right? Time has to run strictly in one direction. Maybe two things can be decided on the same day and then you get into an ordering problem, but as a general matter, that's a fundamental property of these graphs. And I'll show you in a minute that this produces a lot of problems for us as researchers, because most of the good methods have been developed for social networks, not for these types of graphs. So I want to show you what the potential payoff would be of studying the path of precedent, the way it develops over time. I'm going to show you a dynamic visualization of the early years of the Court. I think you get a very different perspective when you see things in their own time. We see the cases in our time, looking back, thinking about what's meaningful now. But if you see the decisions in their time, you get a very different perspective about what was important.
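The Dickerson example, and the acyclicity property, fit in a few lines. The decision years below are the real ones; the check is just a sanity test that every citation edge respects time's arrow:

```python
# The Dickerson example as a tiny citation graph. Edges run from the citing
# case to the cited case; decision years are the real ones.
years = {"Dickerson": 2000, "Oregon v. Elstad": 1985, "Miranda": 1966}
cites = [
    ("Dickerson", "Oregon v. Elstad"),
    ("Dickerson", "Miranda"),
    ("Oregon v. Elstad", "Miranda"),
]

# A citation edge must point backward (or at worst same-day) in time, which
# is exactly why the graph is acyclic: checking every edge is a cheap test.
assert all(years[citing] >= years[cited] for citing, cited in cites)

print(sorted(years, key=years.get))  # ['Miranda', 'Oregon v. Elstad', 'Dickerson']
```

Note the Dickerson-Elstad-Miranda triangle here is fine; what can never appear is a directed cycle, and that is what breaks many off-the-shelf social-network methods.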
Okay, we'll have to do that one more time. Bonus opportunity. Okay, so we have this movie here; hopefully it will show up. Can you see it in the back? What's going to happen is that cases turn green when they're cited, the year is at the top and will be changing, and it's rotating; it's in 3D. What you'll see is that in the early years of the Court, there are very few citations in the largest weakly connected component of the graph. They're citing things, but not their own decisions. They're citing English law, French law, legal commentators like Blackstone. So in the early years there's very little structure to the graph. I'm labeling major cases just to give you anchor points, like Marbury and the Charming Betsy. You'll see the graph doesn't have much structure, and these cases are out at the boundary. Now, this is all going to change. The first cluster of cases that cite their own precedent in a heavy way happens in the 1816-1817 window, and it involves a set of maritime cases. So in just a second, this graph is really going to start to take off, right there. These cases here are maritime cases. They build off ideas in the Charming Betsy; other cases begin to link to them; they start to become a central cluster in the graph. And again, these cases aren't cited very much today, but in their time, the year at the top, they were core cases. If you look at the citation network at that moment, you get a very good perspective on what was important then. By now, Marbury is starting to become a central case, but it wasn't a central case at the beginning. It was just on the boundary. And as time goes forward, you're starting to see this graph get very dense, very fast.
This is like watching history, a two-minute version of 35 years or so, but now you see major pieces like McCulloch and so forth. This is available online; you can watch it again if you want. The idea is that things like this are a way to demonstrate what's important and when it's important, and to strip away our biases. Our bias is to look from here back to then and say, oh, this is the important case; but if you look at the cases in their own time, you get a very different perspective on what was important. So that's another possible project, something one can do if the data's available. Now, as I mentioned, we have a problem: we have good measures for studying networks, but mainly social networks. Graphs like this are much trickier because of the lack of cycles. So we're developing a method, which I'll talk about in one second, called the sink method, which tries to work on these problems. This is a visualization of Marbury up to right now; call it the six degrees of Marbury v. Madison. Marbury is in the center. The first ring are cases that directly cite Marbury. The next ring are cases that cite cases that cite Marbury. The next ring are cases that cite cases that cite cases that cite Marbury, and so forth. So this is analogous to, but importantly distinct from, a social network, as I mentioned; it's similar to your second- and third-degree friends on Facebook, or something like that. We call this structure a sink, and this paper is also at Physica A, on revision; I think we're hopefully going to get through on that one. The idea of a sink is this: we want to understand the origin of particular legal ideas, to see the cases in which they begin, and to trace where they go.
So the idea of a sink is that you turn the flow: instead of flowing away from the case, it flows back toward the case. If you started at the outer ring and asked, where would this all flow, where would it all drain: it drains to the sink. It drains to the center, and the center is Marbury. With that sink, you can create a distance measure between cases. Using the citation network alone (I'll mention later that we want to bring in linguistic analysis and marry those up), you can ask: how similar is the citation profile of this case to that case? We have lots of cases here, so that's going to need to be done in an automated framework. But we can create a dendrogram which shows the nature of the relationships between things. So what's the payoff of all this? The payoff is that you can color the graph like this. You can color the network and tell the difference between components of the graph. So here, these components, when you look at the network, look like they belong together. But if you use this method, you realize that they actually share a higher-order relationship, and when you drill down, they aren't the same. They're all related to this maritime admiralty cluster, but they're distinct in this respect: one is about private international law, private law, and a little bit about criminal law. It's about the torts and contracts and criminal law associated with prizes, taking ships on the high seas. Under the prize statute of 1812 and subsequent prize statutes, you could take a prize; you could take a ship on the high seas under certain conditions. But if you violated those conditions, you may have committed a tort.
You may have impaired somebody's contract, and now you owe damages to the person to whom those goods were to be delivered; and you may have committed piracy. That's what one of those components is. The other component is also a prize-related matter, but it's about the commander-in-chief power, the president, and the scope of Congress's authority to write the prize statute. So again, they're all about prizes; they have a higher-order relationship. But when you drill down, if you read the cases, you'd say, no, these belong in separate clusters, right? And this is what's exciting: this method allows us to distinguish between things like that in an automated framework. That's why we submitted something like this; we think it's a bit of a breakthrough, though time will tell. One question with something like this is how it scales to a bigger universe of things, but the idea is that maybe we have a bit of a breakthrough here. And it's non-trivial to go to the whole set: there are roughly 30,000 cases, and the graph we showed you is much smaller than that. This is what the whole thing looks like, actually, arranged by era. All the citations go backwards in time, right? So where do you cite? Do you cite yourself? Take the Rehnquist Court: do its decisions cite other Rehnquist Court decisions? Do they cite one term back? Do they cite the Burger Court, or one of the other courts going backwards in time? This is just a way to represent and think about that. But this is what I mean about scaling. You've got to ask yourself, with these methods: do they scale to huge graphs, to large bodies of data? Something that works on 100, 500, or 1,000 cases may not work on 30,000; the answer's not always clear. We'll have more documentation on this graph online soon. Here's just some other fun stuff.
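A rough sketch of the sink idea described above: reverse the citation edges and do a breadth-first search outward from the sink case to recover the rings, then compare cases by the overlap of their outgoing citation profiles. The case names other than Marbury are hypothetical stand-ins, and Jaccard overlap is just one simple choice of citation-only similarity:

```python
from collections import deque

# Toy citation edges, citing -> cited; case names other than Marbury are
# hypothetical stand-ins.
cites = [("B", "Marbury"), ("C", "B"), ("D", "C"), ("E", "Marbury")]

def rings(sink, edges):
    """BFS over reversed edges: ring 1 cites the sink directly, ring 2
    cites ring 1, and so on (the 'six degrees of Marbury' picture)."""
    rev = {}
    for citing, cited in edges:
        rev.setdefault(cited, []).append(citing)
    dist, queue = {sink: 0}, deque([sink])
    while queue:
        u = queue.popleft()
        for v in rev.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def jaccard(a, b, edges):
    """Overlap of two cases' outgoing citation profiles: one crude
    citation-only similarity measure between cases."""
    out = lambda x: {cited for citing, cited in edges if citing == x}
    sa, sb = out(a), out(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

print(rings("Marbury", cites))   # ring number keyed by case name
print(jaccard("B", "E", cites))  # 1.0: both cite only Marbury
```

Feeding pairwise similarities like this into a hierarchical clustering routine is what produces the dendrogram mentioned above; the scaling question is whether that pipeline survives contact with all 30,000 cases.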
This is the direction we'd really like to go in the future: linguistics, marrying up language and citations to do this sort of detection. Here's an example, and this is not a serious study or anything: you take all the cases and you just look for tokens, like "abortion." These are the word frequencies over time. There's a little spike here, but basically almost no mention of it, and then there's tons of it, right? It goes from nothing to critical. And then you see "property." Again, this isn't something we're super serious about, but the idea is that this tracks westward expansion and railroad cases and things like that, and maybe there's a second moment here associated with intellectual property. But don't quote me on that second moment in particular; I'm not sure. The idea is that this would be the place to jump off, to drill down and look very carefully. So, I've offered you several examples of possibilities. These are only meant to be emblematic; maybe you're interested in something completely different, but these are the types of things that are possible. It's really only the beginning. If we unlock the vault, a lot of things can happen. But the key is the data. We've got to get more high-quality data, and that's why we're so excited about all this. I'm going to hand it over to Mike now. Yeah, go ahead. Sorry, it's maybe not so interesting: are you all going to go international too, looking over everything? It seems like a good way to predict the growth areas for practices, since firms are trying to make a balanced offering. Yes, one thing is that we've been very interested in applying some of these methods to other jurisdictions, maybe starting with common law countries.
There have been similar analyses done in some countries using ideas that we are also drawing upon. There are methods being developed in civil law countries and in common law countries by researchers. I think there's a group at Sciences Po, I'm not positive about that, I think at Sciences Po in France, trying to use these sorts of ideas. I know that there's a group in Austria. There's a group in the UK. There's a group, I believe, in Australia. But again, I'm not positive about all this, but the idea would be, yeah. One thing is, if we want to talk about general patterns or universality, we need to get a comparison to other jurisdictions. If something holds across jurisdictions, then you might be able to make a claim about universality or something. I don't know if these patterns are universal, or whether anything here holds outside of this case. But the only way to know is to look. Are these citation analyses built off public data, or are you borrowing a private data set? No, these are based off of public information. But it's public information that we had to spend a lot of time gathering. Is it public now, or is it the kind of... It is public now. It is public now. You can try something different and figure it out. Yeah, I mean, this stuff is public. For example, one place is our co-author on the one paper, James Fowler, on his website. He's a fairly prominent network scientist. You'll find some of this data available off of his site towards the bottom of the page. He does lots of other things; that's just one place. But in other instances, this has been non-trivial. I mean, Mike can speak to this. Mike is basically a professional programmer, among other things. And it shouldn't be the case that we have to bring in a professional programmer just to do this type of analysis. In an ideal world, this stuff would be made available in a usable way.
Hopefully there would be a lot of stuff where we could really just start to actually do the underlying research and not spend so much time trying to assemble data sets and get them into reasonable formats that we can actually work with. So that's the hope. On a theoretical level, what is the difference, or is there a difference, between what you call computational legal studies and empirical legal studies? Yeah, I think there are a lot of analogies to empirical legal studies. But this is at a different scope. Empirical legal studies will generally use an out-of-the-box statistical software program or something like that. This is about going another level up from that, sort of two orders of magnitude in terms of scope above that. But you could think of this as empirical legal studies just at a wider level. They tend to draw their influences from political science and from economics, and I would say we draw our influences more from computer science and physics. But I'm in political science as well, so I don't know if that's a distinction even worth imposing. It's really about leveraging developments in computing very seriously to do these things, and less about just using out-of-the-box software. But again, maybe that's not a meaningful distinction. I don't know. Sure. I'll give you an example of a mechanized search. I think one of my colleagues at Howard a few years ago looked for the word "democracy" in the Supreme Court database. The first instance was in the 20th century. Yeah, and that's not what your intuition would be. Much earlier would be your guess. I guess one of the things when you look at the data is you find out sometimes that your gut instinct is not even close to accurate, right? And sometimes it is; it tracks very much what you already think. But in some instances you couldn't be more wrong, right? And this might be one of those cases, I think.
So maybe I'll pass it over to Mike, who's going to take us through these other elements. Right. So these are things that we've gone into some detail on; these are projects that we've carried through some stage of publication. And we want to step back a little bit and synthesize the broad categories in which we really think we can bring empirical methods to bear on these problems. So again, here are our affiliations. The way we think about these problems is as two categories of objects or dynamics that we're interested in. There's law on the left there, with statutory objects like the code or the Federal Register and their analogs at the state level. And there's also the whole set of judicial objects, like cases, or maybe more granular than that, precedents, or judges and their clerks and justices. And these are somewhat distinct from the study of the dynamics of disputes, right? So you have mediation. Maybe you have some databases of labor mediation and arbitration that you can analyze. People do this. There's also litigation more generally. You could think about how you might analyze the dynamics of litigation to make better decisions in the process of either deciding whether to litigate, or how to litigate, or when to stop litigating. And so we're just going to present a few examples from each of these categories other than mediation, because we really just don't have enough on that front at this point to be very worthwhile. But again, statutory dynamics. Dynamics are what science cares about; it's a basic conception, right? You need to think about dynamics in order to understand problems. So what might we be interested in? There's the US Code, which we showed some existing research we've done on. There's the Code of Federal Regulations. Both of these have only recently been provided for bulk-access download.
And unfortunately, there are 200 years of these that we still need in order to answer a lot of questions. So you can get the CFR right now in XML. You can get the US Code, for instance, from Cornell's Legal Information Institute in bulk XML snapshots. They only have three snapshots at this point, though. So if you wanted to answer questions longitudinally about law as represented by the code or the Code of Federal Regulations, you can only do this over very narrow windows. We've looked at how we could actually digitize these resources ourselves, historically. But then you have all of these issues that come up. Clearly, now that we have authoring platforms that are digitized going forward, we're OK. But if we're ever to answer these questions going back, we need to really think about these problems. There are also various other statutory objects, such as state codes or things like the UCC or the EPA, that we might be interested in looking at either over time or in terms of differences between states. And right, so we have these. What questions can we ask about them? The first is, as we saw last time about the code. And again, legislation at the federal level is a nightmare. We've done some analysis that's been in the Times on, for instance, the health care bill and all of the coverage of its length. If you try to actually understand what a bill does, it's an incredibly hard undertaking, because the process of drafting is very incremental instead of producing whole snapshots. This, of course, is in part due to the fact that some of these bills modify such a wide array of pieces of the code that it would add hundreds of pages to these bills by the time they were done. But here are three concrete questions you could ask if you had all of this data, the U.S. Code and legislation, available. First: is the tax code getting longer?
This question's been answered by page counts or by the CCH reporter, right? People give these statistics. Another more interesting question is: are other parts of the code, or other agency rules, using the tax code (Title 26 or 26 CFR) to effect policies, for whatever reason people use tax, right? The health care bill is a great example of how Title 26 is being used as a policy lever to actually effect some kind of change in incentives. And this is a question that we're obviously very interested in. Are bills, or just statutes in general, getting more complex? This has ramifications if we continue to have humans draft law, right? Because humans have cognitive limits, and at some point throwing more humans at problems doesn't really scale to the complexity of drafting or understanding law. Many of the same questions go for the Code of Federal Regulations, although there are some other interesting questions we can ask there, like: are agencies creating more law than Congress itself? And has this changed over time? Is this something we, as a democracy, really understand? Should we accept this, or should we change this trend, if there is a trend? There are also more fine-grained questions that are clearly of interest at this point in time. Does the SEC act proactively or reactively to market trends, for instance credit default swaps? Should they have issued rules faster, or were they actually on time in adjusting to changes in practices in the industry? Then a question that we've really thought about, because tax policy is a very interesting thing in general, is: can we actually look at the cost of compliance for businesses across different states, just roughly speaking, through the analysis of state codes? This is something we've discussed with some individuals from Brookings, and they seem to think it's an interesting way to look at this problem.
And again, if there were a central location to download standardized and normalized forms of state codes, this is a project we could have already done. We have all of the tools and concepts and software necessary to attack this problem. But getting all the state codes and trying to analyze them simultaneously is a real pain. So we talked a little earlier about judicial dynamics as well. We have a couple of different things, like precedent, which is in some sense different from cases, because cases might contain multiple precedents. And there are also judges and justices. Dan previously showed some study of judges and justices and their clerks. But we can ask other questions, like citation patterns. Does a particular justice cite, or not cite, another particular justice's opinions? Or how do intra-court citations reflect actual underlying social networks in the court? Maybe we can tell which justices are really friends or not based on their citation practice. Some of this is clearly more academic. But some of this is definitely useful, either for legal history or for practitioners if they're looking for, perhaps, holes in precedent or places where it might be weak. And so the next category that we're going to deal with is litigation. Mediation we don't really know as much about. We know this is clearly a large part of what many lawyers do in some parts of the market. And so if there were more mediation data available, privacy concerns aside, obviously, you could make much better decisions about mediating in some cases. Litigation is similar, though, right? So we might want to ask: when are settlements most likely to occur in the process of litigation? Say we just had the docket sheet, and didn't even have any of the pages behind the docket sheet, just the docket sheet with the list of events for each case. What could we say? Now, settlements aren't really recorded, right?
But you could sometimes infer when settlements happen, and you could try to answer this question. You could also ask: what paths do cases take to get to the Supreme Court? If your goal is to get a case to the Supreme Court, there are obviously strategic decisions about where to file, how to file, how to generate your case. And people have good intuition about this sometimes, where they have experience, but not always. So maybe we can help here. This is something that we've actually worked on, though we don't have any publications forthcoming on it yet. We have every docket from the tax court. This isn't easy to do. They don't allow bulk access. They don't even allow spiders anymore. But you can obviously ask yourself: what are the courts that real American citizens are most likely to interact with? The tax court is probably one of those courts where an average Joe off the street might actually end up. They might actually file a petition. So you could ask: given your covariates, what's wrong with your filing, what kind of person you are, whether you're representing yourself pro se, are you going to win, or should you just settle? These are actually useful things to real people in the world. And generalizing all of these: can we really build a prediction model for litigation broadly? Could we actually give you probabilities of transition from state to state, from some kind of motion to some kind of success or failure in the case? Because we live, as Dan said, in an era where we have so much more data that we really should at least try to make more reasoned decisions instead of just relying on gut instincts or experience. Not that those aren't useful, but if we have the data, why don't we at least see if we can do better? So these are just a few of the ideas that we've come across in our few years here in empirical or computational legal studies. And this is just a simple summary of them.
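As a sketch of the settlement-inference idea: settlements rarely appear as such on a docket sheet, but certain entry types tend to signal them. The patterns and docket entries below are hypothetical stand-ins for whatever language a given court actually uses.

```python
# Hypothetical docket-entry phrases that tend to signal a settlement;
# real docket language varies court by court.
SETTLEMENT_SIGNALS = [
    "stipulation of dismissal",
    "voluntary dismissal",
    "notice of settlement",
]

def infer_settlement(docket_entries):
    """Return the index of the first docket entry suggesting a settlement,
    or None if nothing on the sheet looks like one."""
    for i, entry in enumerate(docket_entries):
        text = entry.lower()
        if any(signal in text for signal in SETTLEMENT_SIGNALS):
            return i
    return None

# Invented docket sheet:
docket = [
    "Complaint filed",
    "Motion to dismiss denied",
    "Stipulation of Dismissal with prejudice filed by both parties",
]
first_signal = infer_settlement(docket)  # index 2
```

Run over many dockets, this gives a (noisy) label for when in the event sequence settlements occur, which is exactly the longitudinal question posed above.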
I think it'd be best if we just go to a discussion right now about any of these ideas. Interesting. The thing about citation to the same court: when I was on the Court of Appeals, there were nine judges, three panels, and the Court at that time would cite any Court of Appeals opinion in Texas, any other Court of Appeals, before its own. And I made the argument that the Court should first look, in the interest of stare decisis, to precedent, to an opinion we had written, and we should always follow what we had written before we followed what some other court said. And it was difficult to convince, but ultimately I managed to convince the people to first cite our own court. Because if you don't have that discipline, then any opinion can go in any direction no matter what that court has done, if they can find some other court somewhere that's done what they want to do. Right, so you really would like more certainty in the jurisprudence that somebody might have to consider in order to make decisions about their litigation. Clearly that's a good principle if you'd like to make it easier for litigants. That study is interesting, but how? Yeah, I don't know if justices or judges always invoke stare decisis in the opinion, but there are ways, just by categorizing opinions and then looking at the citation networks, to see which courts maybe do follow this principle and which don't. This is clearly something that might go into a litigation model for prediction: if you know that somebody really holds this principle dear, then you can guess which cases or which precedents they're going to use in their opinion. So these are important things. Have you thought about looking at who gets overturned? Right, so this is a little bit harder, because just purely traversing citation networks won't get you the answer. You need to actually have something like treatment data from Shepard's, for instance, or KeyCite, in order to say: is this a negative or positive citation?
Is it to a dissent? No, right. So I think the coding at Lexis has changed over time in some sense, but these are things, again, that other firms have added value on. Theoretically you could train text-classification methods to detect positive or negative sentiment in paragraphs or sentences with citations. This is not easy; obviously Shepard's and Lexis have put a lot of time into doing this, with lawyers on board too. But for the public or for academics, this is perhaps a good training dataset, and also a useful thing. I mean, when justices are appointed, right, we always talk about when they were overturned, so people now have to just look, right? Maybe you can go to Lexis, you can type in the name of the justice, you can then go through Shepard's for something like negative-treatment cases. But Lexis doesn't have Shepard's for every case, and if you had access to every case, you could automate the detection of overturnings; today that's not the case. I assume you can do a bibliographic citation analysis. I think it's easy to identify the citations, and you then classify them by whether it's a journal, or whether it's a newspaper article, or whether it's Wikipedia, or whether it's... I think that the answer there is... You've done that manually. Yeah, the answer there is you can train a wide variety of classifiers, or just set up a couple of regular-expression patterns, or build some kind of tokenizer or classification trees. You can do pretty well with simple tools, but to get good accuracy you really need better features, or you need to put a lot more time into it. One of the things that we're interested in, at least inside the academy, is that there are lots of claims about something like "the evolution of the law." That's a phrase that gets used a lot.
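The regular-expression approach mentioned above can be sketched quickly. The rules below are rough, illustrative approximations of common citation formats (reporter cites, law reviews, newspapers, web sources), not a production classifier; as the talk says, simple patterns get you pretty far and better features get you the rest.

```python
import re

# Illustrative rules mapping a citation string to a coarse source type.
RULES = [
    ("case",      re.compile(r"\d+\s+(U\.S\.|F\.\d?d|S\. ?Ct\.)\s+\d+")),
    ("journal",   re.compile(r"\d+\s+[A-Z][\w. ]*L\. ?Rev\.\s+\d+")),
    ("newspaper", re.compile(r"N\.?Y\.? Times|Wash(ington)? Post")),
    ("web",       re.compile(r"https?://|Wikipedia")),
]

def classify_citation(cite):
    """Return the first matching source label, or 'unknown'."""
    for label, pattern in RULES:
        if pattern.search(cite):
            return label
    return "unknown"
```

For example, `classify_citation("410 U.S. 113")` yields `"case"`, while an unrecognized treatise falls through to `"unknown"`, which is where the hand-coding or a trained classifier would take over.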
If you want to take that phrase seriously, it seems to me, in a way that an evolutionary biologist would actually take seriously, then we need to specify a fitness function: what are the things that go forth? And this is where the application is actually very useful. We need to know what things... So here's a question you might ask. Is a bad cite better than no cite? Is all news good news? What's the worst thing for a precedent: to never get cited, or to be cited negatively? I'd like to know the answer to that. I mean, if we want to talk about the evolution of the law, about what's fit and what survives, I don't even know the answer to that question. My intuition would say getting any cite is better than getting no cite over some period. In some sense, you might say, if you haven't been cited for long enough, are you functionally dead? Maybe you'll be revived, right? Maybe this sort of principle only comes around so often, and it just doesn't get cited, and maybe there's some exogenous event in the world that makes the case important again. But then there are cases in a space that are constantly being treated while this case is ignored, as if it's dead. But maybe it's never been given a negative cite; maybe worse than negative is just not being cited at all. I mean, if you look at the Supreme Court's decisions, just as an example, something like half of them have never been cited a single time, not once, by another Supreme Court decision. Now, I'm not here to say they're dead, but that doesn't look like life to me, you know? Half of them, never cited once. So that's just an example, and that's from a paper that somebody else wrote, called The Web of Law. But the interesting thing is, all these citations have these flavors to them. They have a particular sort of distribution associated with them.
And again, that informs, I think, even our judgments from a pedagogical standpoint about how to train lawyers: to say, look, you need to understand the landscape in which you're working. Most things aren't going to get cited at all. Why do some things become the things that get all the cites versus other things? To me that's something that is useful for practitioners; it's not just some academic point, although the academic point is about evolution. We're going to talk about evolution; I'll put that out there. But you're only talking about judicial opinions. That's right, just within that space. Say 10 years ago there was a judicial opinion and it hasn't been cited since in subsequent judicial opinions. Right. It could be because it was so clear and absolute that, bang, the matter was settled, and there's no point litigating that. Right, no, I mean... It wasn't cited at the Supreme Court level. It could have been cited by other courts. That's right. And see, this is the thing: if we had all that data, I could assess that at all levels. I could say it's not only not been cited at the Supreme Court, it's never been cited by any court in any jurisdiction anywhere. Then it gets to be hard. Now it may be that this thing is bulletproof, it's such a good opinion. But that's not what we see often. We see burstiness, right, we see patterns of burstiness. Almost no opinion seems to have that flavor, generally. What happens is an opinion starts to elaborate a principle, and people want to test the sub-component principles in some hierarchy. So even a case that's really unassailable in some sense still accumulates citations, in the sense that they say, well, in this prior case we developed this principle, and now we need to question this sub-principle or what have you. So even in that space, it may not be the case. Or maybe it is; maybe it's just bulletproof and it never gets cited again. So, go back to the judge and clerk graph. Yes.
And everything. The graph. Yeah. This one or the other? On the other one, actually. Okay. So we've got a few outliers there that really don't connect to anybody. Right. So maybe there are two judges back there. Yeah. Are these all federal judges? All federal. So the blue is district court, the green is circuit court, and the yellow are Supreme Court justices. The general pattern is that green tends to be in the center, not a huge surprise, and blue tends to be on the boundary, but it's not uniformly the case. Now, one thing is, I recognize right off the bat that there are limitations. This is just clerks. One of the things we want to do is study citations also. Do the citation patterns mimic this sort of pattern, or are they different in some ways? A lot of times people say, if you do citation analysis, I don't know if I really learn anything, because I might just cite you because you wrote this really good case, or you wrote about a case that's controversial or important, and it's nothing about you in particular. It's just that you happened to catch this case and now I'm going to cite it. You just ended up getting cited. So one of the reasons we did this is we wanted to create some alternative way to measure the same sort of idea. And maybe there are other ways too. The point is, the more data you make available, the more we can test propositions like this. So in that picture, the isolated thing at the bottom center... Oh, that little... You've got a green going up to a blue going up to a green. Yeah. What does the placement mean? The placement there doesn't mean much. The placement is kind of heuristic and shouldn't be over-interpreted. The fact that this isn't connected to the rest of the stuff makes the layout algorithm just throw it to the boundary, and the distances between these nodes no longer mean quite as much.
So, if they're not connected, the algorithm doesn't really have a way to deal with that, but almost everything is connected. So the point is that little cluster there could be over here, over there, anywhere. The one that's unconnected could go anywhere. The things that are connected are placed for a reason. The green one is below the blue one is below the green one. What is that, by the green one? Those are actually two; that's a sort of random property, because actually there's the two and then there's the three, and they're not actually connected. What the algorithm does is basically randomly place things that are unconnected. Oh, they're overlaid. They're just overlaid, and there's no meaning to that except that by random chance they happened to land there. But for everything that is connected, there is a meaning. Those distances are based upon this sort of energy-minimization configuration. Unfortunately, I didn't include the slide, but I can show you the process by which that graph is actually generated. It starts as a big cluster, and what the algorithm does is it sort of zaps it with energy: it pretends that each of the nodes carries a charge, so they try to repel one another, and then the connections between them are spring coils that hold them together. The thicker the spring coil, the more likely you are to stay close. The spring coil that's in your mechanical pencil isn't going to hold anything together; the one that's on the back of an 18-wheeler is going to hold you together. So the more connections you have, the more you're likely to stay in the center. So anyway, that's a little background on that. I'd be really interested in hearing about what questions of policy, or problems, you might have.
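The charges-and-springs procedure described here can be sketched directly: every pair of nodes repels, every edge pulls like a spring, and positions relax toward an energy minimum. This is a toy version for intuition; real layout algorithms (Fruchterman-Reingold and relatives, e.g. as implemented in graph libraries) add cooling schedules and better scaling.

```python
import math
import random

def spring_layout(nodes, edges, steps=200, k=0.1):
    """Toy force-directed layout: all pairs repel (the 'charges'),
    edges attract like springs (the 'spring coils')."""
    random.seed(0)  # deterministic starting positions
    pos = {n: [random.random(), random.random()] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for a in nodes:
            for b in nodes:
                if a == b:
                    continue
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d2 = dx * dx + dy * dy + 1e-9
                force[a][0] += k * dx / d2
                force[a][1] += k * dy / d2
        # Attraction along edges.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += dx; force[a][1] += dy
            force[b][0] -= dx; force[b][1] -= dy
        # Small step toward lower energy.
        for n in nodes:
            pos[n][0] += 0.01 * force[n][0]
            pos[n][1] += 0.01 * force[n][1]
    return pos

# Two connected nodes and one isolated node: the connected pair ends up
# close together, the isolated node gets pushed toward the boundary.
pos = spring_layout(["A", "B", "C"], [("A", "B")])
```

This is exactly the behavior described for the clerk graph: heavily connected nodes are held in the center, and unconnected components get thrown to the edge, where their placement carries no meaning.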
Like when you started to put this up, I thought you were also going to talk about judge performance, or whether or not a judge is really effective in moving things along. And I wanted to share a story from the National Institutes of Health. They wanted to know which of the professors they gave grants to were more effective at using those monies to develop something that ended up having social benefit. And they traced the citations of their patents all the way through. But they also used other sources of data. They used news data. They used company financial data. And so I would posit that you probably do need other sources to really answer your question, because there are so many impacts, right? What causes someone to make a decision is not just going to be the legal part of it. No, absolutely. The point is, we couldn't agree with that more, I guess, and I'm sure Mike would support that. One of the really fantastic things, as I mentioned earlier, is that people can come and novelly recombine data sets in ways that you would not have contemplated. They take a thing from over here and over here. You would say, oh, that makes sense for that space and that makes sense for that space, but you might not see that recombination of two different data streams in a way that's meaningful. So yeah, I couldn't agree more. This is what I mean when I say this is only the beginning of this type of stuff. But the key is that we get that data out there in a way such that each one of these projects isn't an incredible undertaking in terms of just the data collection. And that's what it's been in some of these instances; just to get the data we spent months and months and months. And if it weren't that way, we'd probably have a lot more things to show you. I do want to make a point for Carl's benefit.
I think, again, briefing is an important component to this, particularly if you consider burstiness and the social network. You guys are looking at citations, which in some instances may take years before they ever show up in an opinion. But briefing, particularly at the district court level and then at the appellate level, really shows a social dynamic that's going on below the surface. And so if citations are appearing, if statutes are appearing in particular, and a lot of cases already come pre-tagged with classifications of what they are, you could possibly see something developing in terms of business intelligence. Where do we see areas developing in particular regions, simply because of a citation network, and what's going on at that lower level, before anything even reaches a publisher? I was very interested in the thing you said earlier about using the briefs. Now, I realize that there are these ethical concerns, but just even from a pedagogical standpoint: what is it that makes a brief good? Well, if we had every brief over a huge window of litigants, over a large class of cases, we might be able to get a handle on that question, and we may think we already have a good handle on it, but the answer might be a little bit surprising. The point is, we may be able to think about the properties of things that are very successful, and this is sort of the business-analytics piece. And there's a very interesting paper that I would just highlight to people, using Supreme Court briefs, amicus briefs as well, as an emblematic example of instances where the court not only cites the brief, but, without quoting, uses language that's specifically in those briefs. You can automate the detection of these sorts of things using plagiarism-type software. And they found instances where the court has just lifted whole strings of text and imported them.
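The plagiarism-style detection just mentioned can be approximated by comparing word n-gram "shingles" between a brief and an opinion: any shingle that appears verbatim in both is a candidate lifted passage. The two sentences below are invented for illustration.

```python
def shingles(text, n=6):
    """Set of n-word shingles, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def lifted_passages(brief, opinion, n=6):
    """n-word sequences appearing verbatim in both documents."""
    return shingles(brief, n) & shingles(opinion, n)

# Invented example text:
brief = "the statute plainly exceeds the commerce power of congress in this case"
opinion = "we agree that the statute plainly exceeds the commerce power of congress"
shared = lifted_passages(brief, opinion)
```

A long run of overlapping shingles corresponds to a whole string of text imported from the brief, which is the kind of evidence of direct incorporation the talk describes; real systems also normalize punctuation and allow small edits.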
Now, I'm not here to send them to the hall monitor or whatever, but I'm just pointing out that if you had to justify the production of this brief to the person who's paying for it, you could say, look, the Supreme Court took this component and directly incorporated it. And in a more everyday sense, in the district court or the local court, you could see these instances where my argument was successful, because you can see this direct incorporation. But imagine you do that on a huge scale. You could start to think about, well, what are the things that actually are successful, that are persuasive? Maybe my Martindale-Hubbell rating could go up. Yeah, you could be a superstar lawyer, right? Be in the magazine, I don't know. A point that I would make, coming from financial engineering and algorithmic trading, is that even small improvements in ability can have large impacts in high-volume areas of business, right? So we might think that we're not going to do that much better than an experienced partner will do on this case. But every little percentage can add up if this really is a core part of your business. And so it's at least worth it to try to look at some of this stuff. That kind of segues to what I was going to ask you. How far are we from having sort of off-the-shelf tools that a savvy practitioner or a lawyer could use to do some of the kinds of things that you all are doing with lots of data scrubbing and lots of analytics? If I had access to a lot of this sort of data, I could have it in two weeks. So it's really the data, not the computational methods, that is the hard work. Most of it is the data. If you assume that you have stable systems and you can back-test on history, or at least recent history, then you can do this stuff. That's easy for you to say. I'm talking about somebody down the line here, or me, being able to do at least some analytics with decent data.
How far are we from that? I mean, at least for specialized parts of law, what I'm saying is I could back-test this and develop some software that would provide an interface for you to answer a few questions, maybe go through a decision tree, and then tell you: should you litigate or not? Or, depending on where you are in the process, should you settle or should you continue? But over time that would be developed, and then there's always this question of using backward-looking data. This is my point about runs of history and why you need theoretical models too. You can't always just use data if you want to do forward prediction, right? And the question is, over what window can you do prediction? For certain questions there may not be enough information to make the inference. But again, if we could make people 5% better at what they do, that could be very meaningful over a large volume of transactions, I'd say. Would your predictions take into account, though, the impact that your projections would create, right? Because if everyone became aware of the structure of judges, the best way to funnel to them... This is Robert Lucas, the Nobel economist's, point about these types of analyses: that these arbitrage opportunities become recaptured in markets. Well, the question is, do you want to be the guy who did that, or do you want to be the guy who was following? And in a sense we're arbitraging away deadweight loss too: even if the arbitrage opportunity goes away, we're still a more efficient society and we all have higher welfare, and maybe you can just charge for that kind of thing. It will change the predictive model at that point, so you'll have to adapt, but, you know, we'll be around then, I guess. You won't waste time litigating something you shouldn't litigate. Even if everybody's using it, people will be wasting less time, right? We'll be able to go home on a Friday night.
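The state-to-state transition probabilities mentioned earlier can be sketched as a simple Markov-style model estimated from docket histories: count how often each event follows each other event, then normalize. The event labels and the four toy histories below are made up; a real model would be estimated from thousands of dockets and would condition on covariates.

```python
from collections import Counter, defaultdict

def estimate_transitions(histories):
    """Estimate P(next_event | current_event) from observed event sequences."""
    counts = defaultdict(Counter)
    for history in histories:
        for current, nxt in zip(history, history[1:]):
            counts[current][nxt] += 1
    return {event: {nxt: c / sum(following.values())
                    for nxt, c in following.items()}
            for event, following in counts.items()}

# Invented docket histories with hypothetical event labels:
histories = [
    ["filed", "motion_to_dismiss", "settled"],
    ["filed", "motion_to_dismiss", "trial", "judgment"],
    ["filed", "settled"],
    ["filed", "motion_to_dismiss", "settled"],
]
P = estimate_transitions(histories)
# e.g. P["motion_to_dismiss"]["settled"] == 2/3
```

Given where a case currently sits, such a table is the raw material for the kind of litigate-settle-or-continue decision aid discussed above, subject to all the back-testing caveats about relying on one run of history.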
I also think, in terms of the business development perspective, there's good and then there's valuable. So your good briefs may have stood the test of time in their argument, but perhaps your valuable briefs, certainly in large law, are going to be the rare ones that are discussing something brand new. The people who see a deep pocket coming are going to grab on to those rare ones you have found that have their own little spike in a certain court, and they're saying, that's my next avenue, that's what I want to pursue. So I think there is a difference between the good and the valuable when it comes to looking at words. And I would suggest my statement wasn't about making value judgments; it was simply about the volume in which certain opinions may be showing up. And don't even make it an opinion, because particularly in trial court, many judges don't write one. Right, well, it's hard to look at litigation outcomes without looking at verdicts and settlements. Is that a part of the data? A lot of settlement data is not available. Not to anyone? No, it's not available to anyone. And I had a recent conversation about arbitration proceedings with Carl Verak, and it was just an issue of how valuable arbitration information would be if it were publicly available. It would actually open the door to something that's very mysterious to many people. It would make settlement decisions a lot easier, and it would spread around information that right now is held by very few. Very siloed, encapsulated. Absolutely. So I had a two-part question for you. If we were in a business school and you gave this presentation and I asked, are your peers in the industry doing this?, you'd be saying, oh yeah, the guys at Goldman Sachs are way ahead of us. So one part of the question is: are folks in the big law firms, or at LexisNexis, or in other places applying these kinds of techniques already?
And then the second part of the question is: are you alone in the academy, or do you have peers in the law schools? Again, if this data were available, is there a large population of researchers just waiting for better data? I'll start with the second one first. I'd say there are people spread around. Paul would be one of the folks I would identify. But there are people in various spaces: some in information schools, some in law schools, some in political science, and even in economics and computer science. So they are spread around; there's not a heavy concentration. I mean, there are a lot of people who do empirical legal studies. But I would also note that IBM seems to have some interest in moving into this space. One thing they produced, an interface you might want to take a look at, is something called Many Bills, which is sort of the follow-up to their Many Eyes information visualization. What it basically does is show you the structure of a bill; it's mainly graphical, and then it pulls up the text. Now, part of the thing is we don't have a server that can feed that type of thing. I don't have those resources personally, but they do. So I've noticed that IBM, just as one example, is interested in this as a potential area. Is IBM working more broadly, or is it just visualization? I've noticed it on the visualization side, but also on the business analytics side it seems like this is somewhere they're moving in. Now, I'm not privy to what they're saying in house, I have no idea; that's just an outsider's view. And I think other firms as well. I mean, the folks from Lexis are here, so I'll let them speak for themselves. I don't know what people do in large law firms on this question.
I think they farm out the work to other people; that's what they tend to do. I will say, if you look, IBM also has patents from the late 90s on algorithmic analysis of the tax code and things like this. If you're willing to look through the patent databases, you can see many ideas that many of these companies have at least laid claim to; whether or not they've actually implemented them profitably is a different question. I don't know what Lexis is doing on this front. The one thing I would say is that Lexis and West seem uniquely positioned to leverage the data streams they have and then provide consulting services. Maybe you're not at liberty to even speak to that, but it seems like something that would be worthy of investigation. Sure, well, we certainly will work with large law firms in particular if they want to use their internal data. But we do have visualization and analytic products that are well known, or public knowledge. If you look at the CourtLink profiles, you can look at all of the federal docket information and pull up a profile of a judge: what kinds of cases they hear, how they've decided things over time. You can find the profile of an attorney who's argued at the federal level, for similar types of questions, and it all makes nice pie charts; it's graphical. Where we've had the most demand, interestingly enough, is around our company data. There are often issues of fact in a legal case, and knowing the corporate hierarchy of an entity, knowing the assets that someone owns or hasn't disclosed that they own: those are the places where we've applied most of this, because that's where people will pay for it. I haven't really found that there's demand for this on the legal side. Now, we're playing with a lot of this stuff in how you look at legal data.
Particularly if we think about the folks coming through law school now and the way they learn, differently from how anyone else in this room may have learned. So, you know, that may come to pass, but right now what people want to buy is not legal data in this visual way; they like the rest of our data that way. Do you have any plans to produce in-house consulting services? That would be more of a "we will do the data mining for you": we will mine the data for you and just provide you results, using our data, not yours. Using our data, to answer a question? Not just a question, but to provide one of these prediction models, predictions about your likelihood of success, let's say, in a space. Sure, particularly around litigation. That's where people spend money, on litigation, so they want to know, before they take a case, the likelihood that they'll make money on that case. For those kinds of services, the more data that we can pull into them, the better our prediction capability. I'd say they're nascent, right? But sometimes too, with what you're saying about intuition, it depends on the scope of your practice. If you're practicing in one county in Texas, you know all the players. You know how that judge rules; you know you don't want to be in his courtroom because you'll have a hard time there. So it's really a question of who you're aiming at. And I would say it's probably large law firms with a national practice, and kind of deep pockets, that might be interested in paying. I will point out that, as a small publisher, I feel like the key problem is dirty data. For a lot of it, even if we can free it up, finding a way to clean it up is the problem, which has certainly been where the large publishers have spent the money, making sure that their data is clean. But for good reason, they're not about to share it with us.
And that's been one of the limitations, certainly, with our businesses: how much time and effort we've had to put in just to clean up the data that we do use. This is actually a two-part question that builds off of some things that have already been asked. Just from my assessment, given the publication I'm coming from, and trying to get a definitional sense of this field you're in, a lot of it seems to be coming out of Europe. Is that because, and I guess this is the second part of the question, data is just more available over there? And if so, is there something we can emulate on this side? Or is it just... I don't know if it's any more available. I don't know, Mike, if you have any thoughts on this. I think the data is more available. It's a different kind of legal system, though, for the most part. A lot of what I've seen over there, having been to a couple of conferences, is a much more abstract kind of research. I might argue that in the short to medium term, it's likely less useful to follow the paths that a lot of European research centers have followed. Maybe 50 years from now those will pay off, but in the short term I think the kinds of discussions we've just had about business analytics are probably much more feasible. I'll just give an example: there's a group in Amsterdam, I think it's the Leibniz Center, that is attached to an artificial intelligence and law group. I would definitely second Mike's point: these are fairly abstract projects, developing automated systems of reasoning and things like this, which, again, could be very, very useful. Maybe the machines will really be in charge then, I don't know. But the more short-term, help-with-my-case-right-now work, I see a little less of there.
I do think that part of it is the training there: they have PhDs in law that are centered on law, and then one of the places they tend to go, it seems, with some of these groups, is computer science first, maybe economics, maybe some of these others. So they have more training in this type of space and are, in some senses, ahead on some of these questions. I don't know. Okay, well, thank you very much.