I'm going to talk about the FLOSSMetrics project, which is trying to extract data from free software, open source, libre software development. The ideas behind the project are quite simple. As you may know, in the libre software community there are a lot of opinions. Everybody has an opinion about everything: how we could improve development, the quality of a project, how many people are working on it, things like that. But there are very few facts. On the other hand, there is a lot of data publicly available about free software development. You know that most players are using public forges, so all the development history is available. You can track every developer and every single file through the whole history of development, going back, in many cases, 15 years. This is very rich information which, by the way, was not available to the software engineering community before, I mean the academic software engineering community. Obviously, in the proprietary software case, it's very, very hard to access not only the source code, but the history of the source code, or the bug reports about the code, or even the design documents of the software. For researchers, this basically means that we are in heaven: we have a lot of public data, we can reproduce the studies of others, we can validate results with real data, and we can have really large samples. We can have thousands of projects when we want to make an analysis. In addition, there are a lot of people interested in this kind of data, because there are a lot of people involved in the development community. We have developers, of course; they could be interested in how the project is performing, which kind of things are happening in the project right now, especially in large projects.
This is very difficult to know if you are not really, really deep into the core of the project. But also companies: in many cases companies are either running libre software projects themselves, or are interested in some kind of libre software development. So, with these ideas in mind, we can ask some questions. These are very high-level things like: can we somehow improve libre software development, maybe by identifying practices that lead to better software, or to software being developed more quickly, or to better productivity? Or can traditional software engineering learn something from libre software, by analyzing the procedures and practices in the libre software community and trying to understand, from a quantitative point of view, why they are better? Because it's obvious at this point that the libre software community is using new development methods that traditional companies are not using. The main question is why they are better, and in which aspects companies could benefit from them. And of course, could projects understand better what they are doing, and how they are doing it, by analyzing the patterns in their repositories, for instance? So, the main goals of the project: the first slide is more about what we academics find in libre software development, the second slide is more about the specific goals of FLOSSMetrics. Basically, we want to retrieve data from thousands of projects, mainly from well-known forges like SourceForge, but also from big projects like Mozilla, or GNOME, or KDE, or Apache. We are making an analysis about actors (so mainly developers, bug reporters, people collaborating in mailing lists), about artifacts (the kind of software being produced, but also messages, or individual files in the source code management system, for instance), and of course about processes. How do people fix bugs, for instance?
What's the whole process for fixing a bug? Somebody reports it, somebody assigns it, somebody maybe fixes it, somebody reopens it because it was in fact not fixed, somebody fixes it again. And what's the timing for that? For instance, the usual pattern for closing a bug in different projects, how that maps to project policies, etc. On top of that, we're doing some higher-level studies. We deal with issues like software evolution: how is libre software evolving in different projects over time, not only in terms of lines of code, but also in terms of complexity, for instance, or, quite interestingly, in terms of community. So, how large is the core group who is developing the software over time? What happens when the core group changes, for instance? With respect to human resources, we're interested in productivity, in raw estimation of the effort of building free software, things like that. With respect to quality, we're using some quality metrics. Of course, quality metrics don't tell the whole story of the quality of a product, but you can get a first idea of what's happening if you apply those metrics either to the source code or to the processes in the development cycle. We are basically putting together a database with all of this and making it available to others. Anyone can come; usually researchers are interested, some developers could also be interested, but in general you can download the database and do the kind of analysis you are interested in. And we are also trying to provide some tools for following up development. So the idea, in the last phase of the project, is to provide some simple tools so that if you want to follow such and such project, you just run the tool and you get something like a web page with the main characteristics of the project.
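As a rough illustration of the timing analysis described above, here is a minimal sketch of a time-to-close computation. The bug records and their layout are invented for the example; the real FLOSSMetrics data lives in SQL databases with a different structure:

```python
from datetime import datetime
from statistics import median

# Hypothetical bug records: (reported, closed) timestamps for three bugs.
bugs = [
    (datetime(2008, 1, 1), datetime(2008, 1, 4)),
    (datetime(2008, 1, 2), datetime(2008, 1, 10)),
    (datetime(2008, 2, 1), datetime(2008, 2, 2)),
]

# Time to close, in days, for each bug.
days_to_close = [(closed - reported).days for reported, closed in bugs]

# The median is a robust summary of a project's responsiveness to bugs.
print(median(days_to_close))  # → 3
```

Computing this per project, or per time window, is the kind of thing that lets you compare closing patterns across projects and relate them to project policies.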
Of course, we are quite interested in engaging with the developers of a community, and we are already working with some projects; usually they help us in getting the data more easily. We'll talk about that later. So, summarizing the main results: this database I talked about, with factual data. The "factual" part is quite important, because the things we have in the database are facts, so nobody can dispute them. It means: somebody made a commit on such and such a file, and the date is this one. Things like that, but millions of them. We are doing some higher-level studies and analyses, which are basically reports that will be public when we finish them, a couple of months from now. And a sustainable platform for benchmarking and analysis: all this data is going to be kept on servers, so that anyone wanting to do a benchmarking analysis, for instance, can come and look for projects similar to theirs, or analyze patterns, or whatever. Our main focus is on providing data and information. We are of course interested in digesting and understanding this data by doing some analysis ourselves, but our main interest is in providing the data so that anyone can come and do his or her own analysis. Well, we have a bunch of partners all over Europe, including universities and companies. And the current status: our project is due to finish by August, so we are close to finishing, but right now we are more or less entering into production. At the end of the project we expect to have something like 5,000 projects analyzed, and the machinery in place for growing after the end of the project. Right now, on the website of the project, melquiades.flossmetrics.org, you can find database dumps for something like 2,300 projects, last time I checked. Basically what we offer is full MySQL dumps of CVS and Subversion commit records.
So, all the history of the project, and metrics on size and complexity of the source code. We still only have around 100 projects with that kind of information, but it's quite interesting: that's every release of every file of the whole project through all its history, which means you can have the complexity of the project over time, for instance, or the size of the project over time. We also have mailing list archives for many of these, which are very useful for tracking the interrelationships and the communication patterns in a project, and bug tracking data for something like 700 projects right now. We are working on aggregating all of this: hopefully next week, or a couple of weeks from now, we are going to offer a complete dump of all the projects in the same database, which allows for cross-querying things across different projects. We are working on those focused studies, and we have that Melquiades web-based repository. We're also working on an API, because right now it's a bit manual to get the dumps; the API should be ready in something like one month from now. I'm going to skip this; you can connect to the website and look at it. It's basically what we are offering, but if you go to the projects section, you can see the whole listing of the projects and then download the dumps if you want. This is the kind of thing you find if you go to a project: these are CVS dumps, for instance, or Subversion dumps, these are mailing list dumps, etc. One of the most interesting things about the project are the tools that are driving all of this. We have developed tools for analyzing Subversion, CVS, Git and Bazaar repositories, and we have also developed tools for analyzing bug reporting systems and mailing lists. The most interesting one is probably CVSAnalY.
CVSAnalY is basically capable of going through a repository, downloading every version of every file and every commit record, which means you can retrieve the whole history of the repository and put it into a nice SQL database, where you can make queries much more simply than with the usual development tools. As I said, there are similar tools for bug tracking systems and mailing lists (Bicho and MLStats), but they are much simpler, because the only thing they do is mine the repository and basically dump it into a database. All of this is integrated into Melquiades. Which kind of studies are we doing with this? I talked a bit about this before. We study the evolution of projects, where we are interested in the evolution of the code over time, the evolution of communities over time, and also things like the responsiveness of the project, for instance in bug reporting, or in releasing, or things like that. We are trying to detect deviations: usually you see a common pattern, and at some point you see that the project is performing very well, or very badly, for some reason. We are quite interested in those, let's say, disruption points, because usually they reflect some core change in the project, and it's quite interesting to understand what was happening before and after that point, and what happened exactly there. We are quite interested in the human resources side, such as what happens when the core team in a project changes. Does the project notice it? How long does it take to recover if there is some problem? You can find very detailed information about that, because you can detect very precisely when the core change happens, and you can look at the parameters of the project and see what's happening after that point. As I said before, we are also working on effort and value estimation with a group of economists. In the end, what we are looking for is finding parameters that can characterize the status of a project.
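To give an idea of why having the history in SQL matters, here is a toy example of the kind of query it enables. The table and column names below are invented for illustration, not the actual CVSAnalY schema:

```python
import sqlite3

# Toy schema loosely inspired by what a history-mining tool produces:
# one row per commit, with author and date.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scmlog (id INTEGER, author TEXT, date TEXT)")
db.executemany(
    "INSERT INTO scmlog VALUES (?, ?, ?)",
    [
        (1, "alice", "2007-01-10"),
        (2, "alice", "2007-02-03"),
        (3, "bob", "2007-02-20"),
    ],
)

# Commits per author: one line of SQL, versus tediously parsing raw
# CVS or Subversion logs with the usual development tools.
rows = db.execute(
    "SELECT author, COUNT(*) FROM scmlog GROUP BY author ORDER BY COUNT(*) DESC"
).fetchall()
print(rows)  # → [('alice', 2), ('bob', 1)]
```

The same pattern scales up to questions like commits per month, files touched per developer, or activity before and after a given date.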
Of course, this is never going to be like having an expert on the project who knows everything about it and is following the mailing lists and the CVS commits every day. But if you just go there and see some parameters, you could get a quite detailed idea of what's happening in this project. How many people are working here? What's the trend? Do they have a community or not? What's the situation with quality and the evolution of the code? Things like that. And we're quite interested in detecting sustainability conditions for projects, so whether a project is headed for success or not. At some point it seems that you can find out, because the project needs to enter a certain growth pattern; if it doesn't get enough resources, I mean human resources, it's very likely that the project is not going to succeed. Things like that. Of course, since what we are providing is data so that others can do the analysis, imagination is, in the end, the limit of the kind of studies you can do. I'm going to skip all this part about the kind of problems we have; you have it in the slides if you want to look at it. And I'm jumping to the final slide, which is: if you are interested, we are very happy to have your feedback, or even your collaboration if you want. The full detail of the description of work of the project is available on the website, so you can look there; there's something like 100 pages about what we are doing, the schedule, things like that. All the software we're using is libre software, so if you are interested in the software and not in the data, you can of course use the software; it's available here. And if you are interested in the progress of the project, have a look at the website. In a couple of months from now we are entering full production, but I guess that right now it is already pretty usable. We are also interested in knowing how this could be useful for you as a developer, or maybe as somebody involved in the libre software community somehow.
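The core group mentioned throughout is often operationalized as the smallest set of developers accounting for a large share of the commits. Here is a minimal sketch, assuming the common 80% convention; this is an illustration, not necessarily the exact definition used in FLOSSMetrics:

```python
def core_group(commit_counts, fraction=0.8):
    """Smallest set of developers accounting for `fraction` of all commits.

    `commit_counts` maps developer name to number of commits. The 80%
    threshold is a common convention in repository-mining studies.
    """
    total = sum(commit_counts.values())
    core, covered = [], 0
    for dev, n in sorted(commit_counts.items(), key=lambda kv: -kv[1]):
        core.append(dev)
        covered += n
        if covered >= fraction * total:
            break
    return core

# One busy developer plus a long tail of occasional contributors.
counts = {"alice": 50, "bob": 30, "carol": 10, "dave": 5, "eve": 5}
print(core_group(counts))  # → ['alice', 'bob']
```

Running this per time window and comparing successive core sets is one simple way to flag the core-change points discussed above.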
So please give us feedback. We are willing to collaborate, and very interested in feedback from developers, specifically in the area of privacy, because we are handling a lot of data that could be considered a privacy problem, but also in the uses of this in different projects. And this is all from my side. Okay, just in time. Thank you.