This graph shows, for each stable release and each language, the ratio between the number of source lines of code in that release and the number in Debian Hamm. So for Debian Hamm, Debian 2.0, the ratio is one in every case.
Then we can use the number of MSLOC we measured in Debian to estimate the effort of building a Debian distribution from scratch. You can see that Debian 4.0 is about 10 times more than Debian Hamm.
Now, some words about the estimation technique. It was designed by Boehm in 1981, so it is old and its results must be considered with caution. In summary, the method takes the number of MSLOC, applies a set of formulas, and obtains the effort in man-years and the estimated development time in years. Then, with a trivial computation using the average salary of a programmer, we obtain the cost in millions of dollars.
It is interesting to compare the size of Debian with other operating systems, both free/libre and proprietary. There is a similar study of the Red Hat distribution done by David Wheeler, and we have been using that data until now. But now we have a modern version of Red Hat, which is Fedora Core. You can see that, comparing Debian with the contemporary Fedora Core, Debian is much larger. The third column is the number of MSLOC of each system.
We can also compare the Debian releases with proprietary systems. We can find numbers for proprietary systems in other bibliography and in computer magazines. These numbers must also be used with caution, because the source code is unavailable and because the scope of those systems is different. In Debian you can find everything the user needs, whereas in Microsoft Windows, for example, you find only the operating system with a small number of user applications. So these numbers, in general, must be considered with caution.
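To make the estimation step described above concrete, here is a minimal sketch. It assumes the 1981 technique is Basic COCOMO in organic mode (coefficients 2.4, 1.05, 2.5 and 0.38); the talk does not say which mode or coefficients were used. The salary and overhead values are hypothetical parameters, not figures from the talk.

```python
def basic_cocomo(sloc, salary_per_year, overhead=1.0,
                 a=2.4, b=1.05, c=2.5, d=0.38):
    """Estimate effort, schedule and cost from a source line count.

    Assumption: Basic COCOMO, organic mode. The coefficients a, b, c, d
    are the textbook organic-mode values, not values confirmed by the talk.
    salary_per_year and overhead are illustrative inputs chosen by the caller.
    """
    if sloc < 0:
        raise ValueError("sloc must be non-negative")
    ksloc = sloc / 1000.0                    # the model works in thousands of lines
    effort_pm = a * ksloc ** b               # effort in person-months
    schedule_m = c * effort_pm ** d          # development time in months
    effort_py = effort_pm / 12.0             # effort in person-years
    cost = effort_py * salary_per_year * overhead
    return {
        "effort_person_months": effort_pm,
        "effort_person_years": effort_py,
        "schedule_months": schedule_m,
        "schedule_years": schedule_m / 12.0,
        "cost": cost,
    }


if __name__ == "__main__":
    # Hypothetical input: 1 MSLOC and an illustrative salary of 50,000 per year.
    estimate = basic_cocomo(1_000_000, salary_per_year=50_000.0)
    for key, value in estimate.items():
        print(f"{key}: {value:,.1f}")
```

Because the exponent on the line count is greater than one, doubling the code size more than doubles the estimated effort. That makes the result sensitive to whether the formula is applied to a whole distribution at once or package by package.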
However, you can see that the latest release of Windows XP is also much smaller than the Debian releases since Potato. Also, as of a month ago, we have the MSLOC of OpenSolaris, and you can see that OpenSolaris is smaller. Note that OpenSolaris only includes the kernel source and some of the Unix user commands, so this number should not be compared directly.
Finally, we have another line of research: copyright analysis of the source code of Debian and other software. The methodology is again to download and select the packages and analyze their files. Using the Pyternity tool, we can search for authorship patterns. This tool is not published yet; it is a research project of our group. The results are still at the earliest stage of research, so they must also be taken with caution. We can offer you two slides. The first shows the most frequent copyright holders found in Debian Hamm, and the next shows those found in Debian Sarge. As a first approximation, we see that contributions from companies are growing more than those of other groups, and that about half of the code comes from individual developers.
Well, these are the addresses of the tools we used for these measurements. First, you can find the SLOCCount tool in Debian itself since Debian Hamm, or Debian Potato, I think. The copyright analysis, as we said, can be done with Pyternity, and at these addresses you will be able to find the tool. It is not published yet; I think it will be in a month or two. Also, on our main web page you can find more tools for what we call libre software engineering.
Finally, I wish to show you this address, where you can find the concrete numbers for this talk. On the page you can select any Debian distribution from Hamm to Sarge. Then, for example, for Sarge, you can see the statistics we have measured.
For example, under the statistics link there are the global statistics: the number of measured packages, the number of files, the number of MSLOC, the number of SLOC, mean values, and the language distribution. You can also see these results for each package. For example, you can select the first one, the largest package, which is OpenOffice. And now our MySQL is running the query. I can't call the system administrator. Well, for each package you can see the language distribution, the global numbers and the estimated cost. Yes, the last line is the estimated cost for each package in dollars, with an average salary from 2010. And now, I don't know about the graphs; these graphs, for example, are not very good.
Well, the conclusions of this talk about the evolution of the Debian distribution are these. First, Debian doubles in size every two years. Second, the mean package size remains almost constant. C is the main language, but other languages grow more quickly, mainly the interpreted ones. The most important conclusion is that Debian GNU/Linux is probably the biggest software system created to date by a coordinated group of people. As for the copyright analysis, about half of the copyrights belong to individual developers, and contributions from companies are growing more than those of individual developers. Once again, here are our Internet addresses, and the book about COCOMO if you wish to know more about this old estimation technique.
We still have a bit of time. Are there any questions for Juan?
Actually, I have a question. Do you have any statistics about localization?
I don't understand, sorry.
Yes, the question is: have you got any statistics about localization issues? For instance, the number of languages covered by free software versus commercial operating systems. Is that planned? Are there plans to do that? I guess my universal translator isn't working well today.
Babel fish to the rescue.
Not at the moment, but perhaps in the next phase of the project, for Etch. Well, the problem is that usually we can only analyze source code and obtain statistics about programming languages and software engineering. However, we have another tool, CVSAnalY. In this tool we have the identification of translations in CVS changes, but not for each language.
Because I was thinking that one interesting statistic to gather, starting with the next Debian release, would be the number of PO files. I mean, by counting the PO files that exist for a given piece of software and making statistics from that, you would be able to compute how many languages are covered, proportionally, across what the main repository in Debian contains. English is obviously covered for everything because it's the default, but then you could say what percentage of the whole main repository supports French, Spanish and so forth. It's just a suggestion. I was wondering if that was planned, but I think it could be an interesting development for the next Debian release.
For CVS, it is more than a percentage, and it needs to be completed.
I mean, if the idea is interesting, it's a suggestion.
It better be. Could anyone try to phone him, please, and see if he's on his way? Unfortunately, I don't have his number myself. Yes, unless there are other questions, we still have time until we find our next speaker.
Okay. How do you handle situations where the same source code is actually counted twice? For instance, openoffice.org and openoffice.org Ximian edition.
How do you account for duplicated source code in free software? For example, when the same code is counted twice in OpenOffice, did you take that into account or not?
Yes. OpenOffice is not repeated. For most packages, we have developed a statistic to identify the packages with shared code.
For example, Mozilla. But for OpenOffice there is only one source package, as I remember, no?
The same situation can also be found within a single package.
Is there duplicated source code inside the package? Is that what you mean?
The SLOCCount tool identifies identical files with an MD5 hash.
What I meant was that SLOCCount cannot handle the case where the same source code appears twice. It only counts physical source lines of code, but it doesn't handle the same line appearing twice.
Whether the same line is counted twice, that kind of thing?
No, no, no. Two files that differ in one line are always two files. There is another code similar to MD5, which is Nilsimsa, where two very similar files get very similar codes. But SLOCCount doesn't use this code.
Thank you.
Okay, so we will take a short five-minute break for those who... Yes, well, we just switch laptops during that short five minutes. So we'll start the next presentation in five minutes.
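The duplicate handling discussed in the questions can be sketched as follows. This is an illustration only, not SLOCCount's actual implementation: the function names are hypothetical, and counting non-blank lines is a simplification of physical SLOC, which also excludes comments. It shows why byte-identical files are counted once while files that differ in a single line are still counted separately.

```python
import hashlib
from collections import defaultdict


def group_by_md5(files):
    """Group file names by the MD5 digest of their content.

    `files` maps a file name to its content as bytes (a hypothetical
    in-memory stand-in for reading a source package from disk).
    """
    groups = defaultdict(list)
    for name, content in files.items():
        groups[hashlib.md5(content).hexdigest()].append(name)
    return {digest: sorted(names) for digest, names in groups.items()}


def count_lines_once(files):
    """Count non-blank lines, counting each distinct file content only once."""
    total = 0
    for names in group_by_md5(files).values():
        content = files[names[0]]  # one representative per identical group
        total += sum(1 for line in content.splitlines() if line.strip())
    return total


if __name__ == "__main__":
    sample = {
        "a.c": b"int x;\n\nint y;\n",
        "copy_of_a.c": b"int x;\n\nint y;\n",      # byte-identical to a.c
        "almost_a.c": b"int x;\nint y;\nint z;\n",  # differs, so counted again
    }
    print(group_by_md5(sample))
    print(count_lines_once(sample))
```

An exact hash like MD5 only detects whole-file duplicates. Near-duplicates, such as a forked copy with a few changed lines, need a similarity-preserving code of the kind mentioned in the answer.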