OK, so I guess it's time for me to start. I was a bit worried that I would have a huge audience. It would have been the largest, but this is manageable; this is what I'm used to. My name is Daniel German. I'm a university professor; I work at the University of Victoria, on the other side of the country, in Canada. This is work that I've been doing with Kate Stewart from the Linux Foundation and Bram Adams, who is also a professor, at Polytechnique Montréal. The work that I'm going to describe today is this combination of research plus hacking, and it has been quite fascinating and interesting.

So, many of us use Git. Git is one of the big contributions of the Linux kernel. It also seems to be having a huge impact, and not only in software development; there are many, many other areas where people are using Git for version control. So it's a fascinating technology. I don't think I have to tell you why it's good and what the benefits of using Git are. The fact that it's pervasive and people are using it is proof that it's an excellent tool.

One of the things that's fascinating about Git is that it's a great archival framework for historical information. The way that Git works, essentially, is that you give it files, blobs in Git lingo, and it will just put them away and attach metadata to them. Then you can, at the very least, extract them back. Or you can do some comparisons with them, which is actually quite useful. That's the notion of being able to use diff: if you have a diff that is able to understand the data that you have, then you'll be able to see some interesting differences between one version and another version of that file. The typical use case is: what are the changes between the previous version and the one today? For that reason, Git is extremely good at operating at the line level, and all its infrastructure works at the line level. It's a little bit aware of some program constructs, particularly with C, but in general it's agnostic to the programming language that you use. And that's actually why people use it for many different things.

One aspect that people love about Git is the ability to do blame: as you can see, you wrote this, or this is the commit that introduced this change. This is a screenshot of git blame using the GitHub interface, and it comes from a file in the Linux kernel. It can tell you, for each line of code, who the authors are, or the commit IDs that introduced that change. But the main restriction is that it's basically line-based. And it's sufficiently good for most of the tasks that we have.

Because of that language-agnosticism, people are using it in other domains. It's actually very, very common now for authors of text, of documents, to use version control. With my collaborators, with my students, we write papers using Git. And it's great, because you can do the same kinds of activities that you would do when you're doing software development: you can roll back, you can blame, you can branch, you can merge, et cetera, et cetera.

But one interesting aspect that makes Git very different from other version control systems is its malleability of what history is. Essentially, Git allows you to write the clean history that you want others to see, which might or might not be very true, depending on the practices of your particular software development team. And so there are lots of tools dedicated to rewriting history.
And it's extremely powerful. We're actually using some of those tools to do some of the work that we are going to present today. So, with the proper motivation, you can essentially change the history to be whatever you want it to be. Of course, in the process you might annoy some of your collaborators: try git rebase and then push to master, and people will actually give you some comments. Or you might just do it privately, which is the typical use case. You clone your project, you do your work, maybe for one week, two months, three months, and you make your commits; some of those might be more like backups of your work. And then you say, now I'm finished, I'm done, and you rebase some of those commits to make them look cleaner and nicer for the people who will be maintaining them in the future, or, in the case of the kernel, for those who will review them.

So, I love this quote from Indiana Jones, because part of my work is finding interesting research data to work on, and I see a lot of the work I do as similar to that of an archaeologist. Software archaeology is one of my research fields. And Indiana Jones says that archaeology is the search for fact, not truth. All we can know is the facts: the history that the developers have left recorded for us. What really happened, we have no idea. The more facts we have, the better we can interpret that history. But unless developers record every single train of thought, every single activity that they do, we will have a history that is not misguided, but incomplete. So the history in Git, no matter what project you have, is likely to be incomplete.

So what can we do with it? With these constraints, is there information that we can extract from the history in Git that is a little more fine-grained and complete than what blame by line gives us?

And so we had a dream. Kate and I, who had not talked for years, met at a Linux event in Japan; it was December last year. We talked about some of these interesting needs we had for being able to go a little more fine-grained on the kernel. So let me tell you a little about that dream.

This is a function from the Git project. When we tried to understand how to extract history from Git, we decided to use Git itself, because if Git uses Git, it's probably its own best user. I just like that kind of recursiveness, a bit like Richard Stallman's recursive acronyms. So that's the typical blame that you get. How accurate is the history of that function? If you trace the history back, you can go to every single commit, as mentioned: you can look at the commits before, and the commits before those, and so on. But it's a little bit tricky, and there is no real infrastructure for traversing that history at a more fine-grained level.

So let's look at this particular line of the function. Ramsay Jones is the one who added it. Let's assume for a moment that we're able to take this source code, this function, and pass it through some sort of filter, and we create this tokenized version. Essentially, every single token of the source code goes on its own line. Most of you are programmers, so think of a token as whatever the definition in the programming language is. So a brace will be a token; == will be one token, even though it's two characters; and we consider a string to be one token.
That raised an interesting question: what about comments? Are comments just a bunch of words, or are they one token? Well, for our purposes, we decided that a comment is just one token. In other words, we don't get into tracing the evolution inside comments, because comments are also huge.

So let's assume we have this. Would it be possible to go from history per line to history per token? The one below is the history per token. We're able to see, in far more fine-grained detail, the actual commits behind each token. Let me run you through an example. We have our original line, and those are the tokens that correspond to that line. But instead of one change, we're actually observing several changes here; there are actually three commits involved. Let me colorize them. The pink part was originally authored by Linus Torvalds. Well, we don't really know whether it was him or a patch from somewhere else, but it's so far in the past that we don't know anything else about it. Then the actual word "commit" comes from Ramsay Jones. And the blue part, the "struct commit", and the parentheses and the semicolon, also come from Linus Torvalds. So when you trace the history of this, what happened is that the original line was authored by Linus, the pink part. Then came the blue part, and at the end came the yellow part. But who is the line attributed to? The last modifier of the line. That's why Ramsay appears to be the one who modified that line last, and in this case the other commits are ignored. In this particular function, the green parts are the parts that are wrongly attributed by line-level blame: they're attributed to different people than the true authors you can see when you blame by token. So that was a problem.

So this was the goal; we wanted to be able to do this. Whether it's useful or not, that's a different story; first it's about the capability. Maybe at the end we would find that there's no difference, that the token view and the line view are mostly identical. That could potentially have happened.

So we started thinking, and we came up with what we are calling evolutionary views of VCS repositories. Let me run you through the basic idea; it's relatively simple. We have a file, and a filter applied to that file creates another file. It's very simple; even cat would be a trivial filter. In this case, the filter that we're using is the one responsible for tokenizing the file. Now, when you have different programming languages, you have to deal with a different tokenization for each one of them. I think at this point we have been able to deal with Java and Python; C++ is a little trickier, since C++ is harder to parse than C. And of course, we don't want a full-blown parser; that's overkill, and then you would have to be able to compile the code at the same time. So we're using what researchers call island parsers. In fact, we're using a tool called srcML, from Kent State University. So this is what we do: we take the original source code, we pass it through this filter, built on that srcML parser, and we generate the tokenized version.
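Just to make the idea concrete, here is a minimal sketch of a token-per-line filter for C. This is not our actual filter (that one is built on srcML); it is a toy that reads C source on stdin and writes one token per line on stdout, treating strings and comments as single tokens as described above. Multi-character operators such as == and character literals are not special-cased here, for brevity.

    #include <stdio.h>
    #include <ctype.h>

    int main(void)
    {
        int c = getchar();

        while (c != EOF) {
            if (isspace(c)) {                        /* whitespace only separates tokens */
                c = getchar();
                continue;
            }
            if (isalnum(c) || c == '_') {            /* identifier or number */
                while (c != EOF && (isalnum(c) || c == '_')) {
                    putchar(c);
                    c = getchar();
                }
            } else if (c == '"') {                   /* string literal: one token */
                int prev = 0;
                putchar(c);
                while ((c = getchar()) != EOF) {
                    putchar(c);
                    if (c == '"' && prev != '\\')
                        break;
                    prev = (prev == '\\') ? 0 : c;   /* track escapes, including \\ */
                }
                c = getchar();
            } else if (c == '/') {
                int d = getchar();
                if (d == '*') {                      /* block comment: one token */
                    int prev = 0;
                    printf("/*");
                    while ((c = getchar()) != EOF) {
                        putchar(c == '\n' ? ' ' : c); /* keep the token on one line */
                        if (prev == '*' && c == '/')
                            break;
                        prev = c;
                    }
                    c = getchar();
                } else if (d == '/') {               /* line comment: one token */
                    printf("//");
                    while ((c = getchar()) != EOF && c != '\n')
                        putchar(c);
                } else {                             /* plain '/' operator */
                    putchar('/');
                    c = d;
                }
            } else {                                 /* any other punctuation character */
                putchar(c);
                c = getchar();
            }
            putchar('\n');                           /* one token per output line */
        }
        return 0;
    }

To lift such a filter from single files to a whole history with stock Git, something like git filter-branch --tree-filter could re-create every commit from the rewritten tree while keeping the author, date, and log message intact, which is essentially the per-commit view described next.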
So that's our fundamental view. Think of it: for every version of every file that ever appeared in the Linux kernel, we do this. That's not difficult; any student can do it within a couple of days. It will take some time to tokenize the kernel, because it's massive, but it's not actually difficult. But then we have to bring it to the next level, which is the commit. So for every file that is in every commit, we create this view; we're just recursively going up one level. And you see where I'm going? Our purpose is to create a repository in Git. If we have an original repository in Git, we want to create a view repository. We're using Git to exploit, at the token level, the history already stored by Git. So our goal is this: if you give me a repository, I can create another repository where you have the history by token, and then you can use all the infrastructure that is already in Git to traverse that history; for us, most importantly, blame.

So this is what we have been able to do. This is a file in Git. Look at the blame: this is just a traditional blame, the last commits of a particular file. And these are the corresponding commits in the token version of the file. Notice that the metadata is the same, except that the committer is different, because I'm the one making those commits. But that's the goal. Essentially, we have a topologically identical repository, where each commit is a view of the original commit, and all the metadata matches except for the committer.

And then we're able to do blame. This is using the GitHub infrastructure again. The slides I showed you for our dream, I created by extracting the data from Git using the GitHub infrastructure, and that is actually the beauty of this: by staying within Git as the storage, we're able to use all the tools that exist around it to exploit it. We don't have to build a special database with a special web server to let you browse that history. So, for example, here is a traditional commit to a file. Notice that GitHub already gives you a bit of an indication: even though the diff is by line, it highlights the token. Sorry, git diff allows you to do that; it's relatively trivial when you're dealing with a line. When we do the tokenized version, we actually see that there is this removal and this addition, the replacement of that token. And notice that the metadata is the same. On top of that, we link to the commit ID of the original commit, so when you're browsing the history in the tokenized version, you can quickly jump to the original one and see what that commit is.

So let me now come to the Linux history part. The history is stored in three different eras. Of course, September 1991 is the very first version, 0.01; that's why we are celebrating the 25th anniversary of the Linux kernel. And until 2001, when BitKeeper started to be used, there's no version control of the kernel. It's interesting that some subteams started to use version control; you can still find archives of CVS repositories where subteams used version control to do their development. But Linus never liked CVS; he would never accept it as a version control system.
He would just receive patches and apply them, and Usenet became the repository, the way the code was distributed in that era. A lot of work was done to retrieve every single version that could be found and commit it into a Git repository. So we have a prehistoric Git repository that has just snapshots. Of course, all the commits are associated with one person, and there is a very long period between each of these commits, so the history is not very fine-grained. We call it the pre-git era, and the committer is always Linus Torvalds. Then in 2001, BitKeeper started to be used by Linus Torvalds, until 2005. And you know there's a very interesting story behind why he stopped using BitKeeper, and that is the moment Git was born; of course, we're here, so we know that Git was born specifically to satisfy the needs of Linus Torvalds for version control of the kernel. So between 2001 and 2005, the history is stored in BitKeeper logs, and Thomas Gleixner did the work to extract that and put it into a Git repository. And in 2005, Git starts to be used and becomes the version control system that we have today.

Because of that, we start with really, really low granularity, then it starts to improve, and then here we have something far more fine-grained. And as more teams started to move to Git in the last 10 years, we start to have a more fine-grained view of the code. It's quite interesting, as a researcher, to see the different uses of Git by different teams. In the kernel, every commit is kept through the merging process until it arrives in Linus's tree; the commits, as they happen, are maintained. There are projects, like PostgreSQL, that do not have a single merge. They say: you finish a feature, you squash it, and we will commit the squashed version at the head of master. So all of that micro-history you have? I don't care about it, because I just care about the feature. The Linux kernel is different; the kernel actually maintains all of that. Of course, we have also observed cases where people develop drivers, for example, outside, and then a person comes in and imports all of that in one single mega-commit. That still affects the granularity a lot. But that is at least observable.

The other nice thing is that Git actually allows you to concatenate repositories with common history, using grafts. So we were able to use these repositories and create one single big repository that is able to track the history of the files all the way back to 1991. And there are some interesting features of Git that also make it very, very good for history analysis. One of them is the ability to detect renames. It's configurable; you can configure how hard it has to look to decide that a file was renamed. Even if the file was slightly modified in that rename, it's able to say: oh, this file moved from here to there, and it's only 97% the same file. That is quite valuable for the kind of work we do.

However, there are some big warnings that I want to address. One of them is that in the nomenclature of Git we have the concept of the author, and "author" is a very, very strong term. I prefer to use the term contributor of the commit: the person who is contributing the commit, who is providing the code. It might have been the author, or it might be somebody else who is just a conduit for putting it into the version control system.
Or, in other cases, this person might be doing some refactoring, taking code that was authored by somebody else and putting it in a different location, or making some extra modifications. In that case, the true author of that code is a combination of different individuals. What we are tracking here is the contributor of the code, the one who is actually providing it. And of course, git blame is not really able to deal with refactoring. In fact, it becomes a very interesting issue and even a topic of research: when you refactor, who is the true author of the code? It means that any chunk of code might actually come from many different individuals, with different proportions of participation. And I don't think that even lawyers are sure about what exactly that means.

So here are some statistics about this. This is up to version 4.7. The left-hand side is the SLOC counts, and the right-hand side is the size in tokens. You can see that the number of tokens is around six or seven times the number of lines. The general reports just use plain lines; they don't even use the concept of SLOC, they just count with wc how many lines there are. And when you break it down, the number of blank lines in the kernel is actually relatively significant, which makes sense, because we use blank lines to divide code. And, as I mentioned before, we keep comments separate, as just one single token, no matter how long they are. That means we're not able to trace the provenance of the tokens, sorry, of the comments. But if necessary, we could actually do that.

One of the things that interested me was: can I see what code has survived from that very, very first version, 0.01? It happens that this function, skip_atoi, is the function that contains the most code from its very original version. There are not many functions that survive, but this is one of them. And it kind of makes sense: atoi, converting strings to integers. This is that function in the original kernel. And if I show you the version that we have today, you will see that there is the static, sorry, the noinline_for_stack. But there's something very fascinating happening here. Notice this and this. Let me show you the original code. A while ago, in 2015, somebody discovered that every time this function is called, the first character is always a digit, and therefore the condition was not necessary. I assert that, for maintainability, that's a bad decision, okay? But for the sake of pure speed, this person decided to submit the patch. So that's the patch: Rasmus, in 2015, decided to change the while loop into a do-while. And of course, this is the tokenized version, and this is what I was mentioning: at this level we are not able to trace the move, because this change just moved the while from the top to the bottom and moved the brace from one side to the other; in terms of tokens, it's still the same. It's actually something interesting, because now that we have this level of granularity, we can go through every single one of these changes, see when these kinds of moves happen, and create an extra link that Git doesn't provide. I'm tempted to revert this change. Some of you know how, in a church, you find a cornerstone that is still from the original, very first church that was built.
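To make that change concrete, here is roughly its shape, paraphrased from the kernel's lib/vsprintf.c; the details are simplified, and noinline_for_stack (a kernel stack-usage annotation) is stubbed out so the sketch compiles on its own.

    #include <ctype.h>
    #include <stdio.h>

    #define noinline_for_stack   /* kernel annotation; a no-op outside the kernel */

    /* before: essentially the logic from the very first kernel versions */
    static int skip_atoi_old(const char **s)
    {
        int i = 0;

        while (isdigit(**s))
            i = i * 10 + *((*s)++) - '0';
        return i;
    }

    /* after the 2015 patch: callers guarantee the first character is a digit */
    static noinline_for_stack int skip_atoi(const char **s)
    {
        int i = 0;

        do {
            i = i * 10 + *((*s)++) - '0';
        } while (isdigit(**s));

        return i;
    }

    int main(void)
    {
        const char *a = "123abc", *b = "123abc";

        printf("old: %d, new: %d\n", skip_atoi_old(&a), skip_atoi(&b));
        return 0;
    }

A line-based diff of the two bodies touches several lines, while a token-based diff sees almost no change; this is exactly the kind of move described above.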
So undoing this change would bring this function back to how it was in the very, very original version of the kernel. And this is the closest we have to that.

There's another file that's actually quite interesting. This is the file that remains the closest to how it was in the very first version of the kernel, okay? If you program in C, it's essentially just a data structure to implement the macros for determining the type of a character. Notice who the owner of this code appears to be: Andrea Rosa appears to be the author of that big data structure, although the comments appear as Linus's. One interesting thing: this view is using GitHub, and in GitHub the history is not concatenated. Remember I talked about the three eras? Here they are not joined, and that's why Linux v2.6.12 appears to be the moment that comment was added, even though it comes from way before.

Let me show you a little of what we are able to do now. This is the output we're able to generate. Providing blame per line is very, very nice from a user-interface point of view: you put the line on one side and the blame on the other. Providing blame by token is a nightmare, okay? Because how do I show it? Well, in this case I decided to colorize according to who the author is, and if you hover long enough over a character, it will tell you who that is, okay? So look at that: this actually comes from the pre-git times and dates from the very first version of the kernel, okay? And that's the only code from this person, from Andre, who actually added the real code to this file. If I just scroll down, you'll see that there's nothing else. We also add a summary. In this case you can see: pre-git, 595 tokens; Arnaldo, who added one macro at the top and, at the bottom, the export symbol; he added seven. Sorry, at the top he added an include. So he has an include; probably he added the include with some file name, and a person later came along and modified the actual file name. And then Andre has one. Andre has one token in this file, which was almost the original one, okay?

So why? What happened? Does anybody have a hypothesis? This is the original file, okay? Notice that at the bottom, all of this has zeros. Let me go back to the new version: the new version has all this piece down here, and the original file has zeros, okay? But if I search for tabs: when he committed the code, he replaced the tabs with spaces. Who now becomes the owner of those lines? He does, okay? So this process actually allows us to discover these situations. And in this case, as I mentioned before, pre-git, 585; in fact, you can see the commits here. The original one is 494 tokens. Then at some point in 1996, these extended characters were replaced. One interesting aspect is that those pieces were added, but these little commas are still from the original one. And that raises a very interesting question. When we look at the history of code, if you have a semicolon that was left there from the original function, and nothing else of yours remains but your semicolons, are you still an author of that function's earlier work? Semicolons might be an exaggeration, but what if some of those names are still there? So there are some very interesting aspects to that. Anyway, that's kind of an exaggerated case.
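As an aside, the tabs-to-spaces story is easy to reproduce in miniature. The toy program below (an illustration, not one of our tools) compares two versions of a line that differ only in whitespace: a line-based comparison sees a changed line, so line-level blame reassigns its ownership, while a whitespace-insensitive, token-style comparison sees no change at all, which is why the token view keeps the original author.

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    /* copy src to dst, skipping all whitespace: a crude stand-in for
     * comparing token streams instead of raw lines */
    static void strip_ws(const char *src, char *dst)
    {
        while (*src) {
            if (!isspace((unsigned char)*src))
                *dst++ = *src;
            src++;
        }
        *dst = '\0';
    }

    int main(void)
    {
        const char *v1 = "\tif (x == 0)\treturn;";  /* indented with tabs   */
        const char *v2 = "    if (x == 0) return;"; /* indented with spaces */
        char a[64], b[64];

        strip_ws(v1, a);
        strip_ws(v2, b);
        printf("same line?   %s\n", strcmp(v1, v2) == 0 ? "yes" : "no");
        printf("same tokens? %s\n", strcmp(a, b) == 0 ? "yes" : "no");
        return 0;
    }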
Of course, the question I get asked all the time is: does it matter? This is the proportion of code according to the year it was created, as of version 4.7. For example, this means that 50% of the code was written around 2010 or before. Notice that lines and tokens are almost the same. When you start to aggregate data, some people win, some people lose, and in the end it looks almost identical. Yes, the token curve is a little higher towards the past, so we have more surviving code from the past, but overall it's almost identical. I was surprised by this. So I thought it might be because there are a lot of people who add drivers and architectures into the kernel, and those probably have very few authors. So I thought: let's go to an area where I know there's far more activity, the kernel/ directory. Yeah, it's a bit better, but not that much. It also tells us very interesting stories, because here it tells you that most of the code from 2007 and before still survives, but just as remnants. And then code really survives over a very, very long period of time. Of course, code gets replaced. But it's not surprising, because the kernel keeps growing and growing and growing, okay?

These are some tables of people. As I mentioned before, pre-git is everything that happened in that prehistoric part of the kernel. There are still 5.1% of tokens that predate the use of BitKeeper. And if we do it by line, with the histories concatenated, it's 3.81%; so you recover a little more data from the past this way. Essentially, the numbers are the same, with two exceptions: Aaron and Joe. What happened, particularly with Joe, is that Joe is a person who has been very surgical, doing global replacements or adding macros in front of the declarations of functions. He doesn't have that much code added, and the modifications he makes are small, yet he gains in the line-based counts. He also makes changes to functions that are already essentially finished and don't require modifications later. And on the other side there's Aaron: his changes have slowly been superseded by other people modifying the same lines. It's quite fascinating, because it tells you that it doesn't really matter which method you use: the top 20 contribute around 20% of the kernel, according to the Git logs. So what about the kernel/ directory of the kernel? The numbers are very similar there too; in fact, all 20 are the same in both, they just shift positions by two or three places. They're not that different. Which, I have to say, surprised me. I was not expecting that; I was expecting a little more variability, particularly because the files are heavily modified.

In fact, let me go back. This file is not very interesting, but let me open the other file. This is vsprintf. Let me just scroll quickly through it. You can start to see all the colors that we have. For example, you can see the aggregates for each function, how many tokens each one of them has. And of course, you have functions that have fewer authors. Interestingly enough, as you start moving down: this is a function with a lot of different authors, and when you start moving towards the present, let me just move towards the bottom, you start to see functions with fewer authors. So here, for example... and Joe got 83? It should be the other way around. I have something wrong here.
Oh no, Joe actually is 83. So why are the colors wrong? Anyway, my interface is a little iffy in these cases. And notice that this function also has two: we have a huge number of tokens by the same author, okay, Daniel Boardman. That's kind of expected as new code comes in. So this is a recent one. As the file grows, code gets added at the bottom, so the functions at the top will have far more people involved, and the ones at the bottom will have fewer.

There are other aspects. There are many, many small changes; this surprised me too, particularly as a researcher. Here I'm counting non-merge commits that modify .c and .h files, not assembly. In fact, something I should have warned about before: I'm not counting assembly. Assembly is already fine-grained enough. And 9.5% of the commits added three or fewer C tokens and removed three or fewer C tokens. From my experience, that basically tells me these are bug fixes, or some specific things to deal with compilers. And 3.8% added one token and removed one token. It's like a surgical operation: I didn't touch anything but one single token that I renamed. But at the same time, we also have these huge additions, as I mentioned before, from people who say: here it is, here's all this code, with no extra history. One of them is a file system, I don't remember the name, more than one million tokens. Sorry, sorry: this is the other case. Two commits added more than one million tokens and removed more than one million tokens, and it's just code moving that Git is not able to detect with the default parameters. They decided to move files from one location to another, and Git, without proper fine-tuning of the rename detection, just reports a huge removal on one side and a huge addition on the other.

Churn, a definition we tend to use a lot in academia, is tokens added minus tokens removed. In this case, there are two commits that have a churn of more than one million tokens. One of those is a file system; the file system was added in its entirety. Another one is a driver. There's no history on those parts; they were just added in one single commit. But still, you can see that 48% of commits have a churn of less than 10 tokens. You can see churn as the absolute growth, in number of tokens, after that commit happens, okay? And 26% have negative churn, which fits very well with Manny Lehman, who passed away recently. He said that for systems to keep satisfying the needs of their users, the team has to put in continuous effort to make sure the system is maintained. I think this basically says there's a lot of cleanup, a lot of activity, that isn't even reflected: when you have negative churn, you don't even show up in the line counts; if you're removing code, you don't count in the numbers I showed in the previous slide. And that's part of the reason why, in the Linux kernel reports, I presume, commits are used as the metric of effort, because counting tokens as a metric also has huge potential threats to validity.

So, in conclusion: in the large, tokens and lines are equivalent; at least for the kernel, there doesn't seem to be much difference. In the small, they provide a very interesting, fine-grained view of the evolution of the code.
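A side note on how numbers like these can be computed: because the view repository stores one token per line, Git's ordinary line counts become token counts. Here is a rough sketch (my reconstruction for illustration, not our published tooling) that computes per-commit token churn from git log --numstat over a token-level repository:

    #include <stdio.h>

    int main(void)
    {
        /* one "@<hash>" line per commit, then "added<TAB>removed<TAB>path"
         * per file; in a token-per-line repo those counts are tokens */
        FILE *p = popen("git log --no-merges --numstat --format=@%h", "r");
        char line[4096], id[64] = "";
        long added, removed, churn = 0;

        if (!p)
            return 1;
        while (fgets(line, sizeof line, p)) {
            if (line[0] == '@') {                /* start of the next commit */
                if (id[0])
                    printf("%s churn=%ld\n", id, churn);
                sscanf(line + 1, "%63s", id);
                churn = 0;
            } else if (sscanf(line, "%ld %ld", &added, &removed) == 2) {
                churn += added - removed;        /* tokens added minus removed */
            }
        }
        if (id[0])
            printf("%s churn=%ld\n", id, churn); /* flush the last commit */
        pclose(p);
        return 0;
    }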
Of course, there will be newer code that has fewer changes, but as the code gets older, you get a clearer view of how the code got to be where it is. And just to recap: our proposal is essentially the ability to create these view repositories. If you give me a Git repo, I can create a view repo for you that maps the files. We have two particular examples of how to use them. One of them is tokenization; the other one is just tracing the declarations in a file, so that you are only concerned about who adds functions or global variables, and the commit will only tell you that and nothing else. We have been able to use it for that. Then I showed you how we can use it to get better traceability of who is actually authoring code, with a more fine-grained method. And finally, the conclusion: in the large, both counts are equivalent, but in the small, you really get a much better history of what's happening. And that's basically all. Thank you. Questions?

Yeah, it can be done, because everything can be done in Git. Originally, the way we developed it, we started from scratch, creating another repository. As we came to understand what we were doing, we became able to go directly into the storage. So you can essentially add all of that.

Well, one thing about Git, for example, Git itself as a project: they import repositories with their history when they add new features. gitk, for example. gitk started as its own project with its own author, and at some point Junio, I suspect, said: oh, I like it, can we have it? Rather than taking just one single commit and putting it all in, he merged the repositories, so all the history from before got added at the same time. I think that's a practice that development teams don't even think about when they are accepting contributions, and I think many people don't even know it's possible. I didn't know until I started looking at this history. So essentially, you can bring somebody else's history and make it part of your own. And if more projects did that, we would have better granularity. The other alternative is what many teams in the Linux kernel have done, which is to archive the previous repositories. The developers might not care for that history, but some people might. And it's good to be able to say: this is the commit that added the code; what do I know about that commit? And then you go to the commit log, and it says "imported code from somewhere", and then you go to that somewhere and do the digging yourself. That's something I'm also very thankful to the Linux developers for: the commit logs, in the kernel and also in Git, are huge, so they give a lot of context and information. Not every project is that good. So that's an answer to your question.

Say it again. Java? Yeah. Well, anything that we can parse into that tokenized version, and Java is the easiest language to deal with.

So it just adds information to whatever the next step is. This is kind of what I was mentioning about the facts; the job of the courts is then to find the truth, whatever that truth is, and this is just adding more information. So I think there's a lot of analysis that needs to be done on top. That's right. We're just moving layers, right? We're just moving layers and saying: here there's more information.
Or here there's more information. Or: look at this code; you thought it came from there? It didn't come from there, it came from all the way down there.

Yeah, and when you start looking at the aggregations, those people start to become interesting. Larry Finger is one of them. Larry Finger has a lot of commits in a very short period of time. So I wondered: what happened? My hypothesis, because I haven't talked to him, is that he's a proxy for a company, and he is the one who takes whatever development they have and puts it into the Linux kernel.

Yeah, so I think what we're basically saying is: this is what Git gives you, but there's a lot more that you can add, and then it's a matter of connecting all that information. So I think there's a need for models and techniques. But yes, we're basically saying that because there's more data, there's more potential.

So our goal is to publish this in the next three or four months; for us, that's the peer review. And as we publish it, the tools become open source. That's our goal, that everything I have here can be run by anybody. And if any of you right now has a project that doesn't have the scale of the Linux kernel, because the Linux kernel is mind-boggling, I'll be happy to process it and give you your tokenized history, to a certain point, okay? Any other questions? Good. Well, thank you very much.