Thank you, everyone, for joining today. I have the opportunity to be here with Yehui. We are going to discuss building SaaS services with CHAOSS technology. There are two, three, four products out there, and we would like to share with you this concept of innovation through open source communities.

My name is Daniel Izquierdo. I'm one of the founders of the CHAOSS community, and a founder as well of Bitergia, a software analytics company. I'm part of the InnerSource Commons; I'm the president of the InnerSource Commons Foundation, so if you have any thoughts on InnerSource, I'm happy to discuss them. Then we have Yehui with us.

Hello, everyone. My name is Yehui. I come from China, and I'm from Huawei, by the way. I'm a co-founder of OSS Compass, which is software based on CHAOSS technology and on the key components provided by Bitergia. I'm also a member of the CHAOSS community. I'm very happy to be here and to share some experience we've gained over the past few years. Thank you all.

Is this my turn? Oh, yeah, I remember. We just launched a survey for our CHAOSS community. If you already use some of the metrics models or tools provided by the CHAOSS community, tell us what barriers you have met, and tell us what we can do to improve. The deadline is very close, so please help us. Give us some feedback, and we will try our best to make improvements. Thank you.

By the way, if you have any thoughts on software and data in general, we have our director of data science from the CHAOSS community, Dawn Foster. She's here.

Yeah, thank you. By the way, we have a QR code here. You can scan it at any time, because we always keep it in the right corner.

Yeah, so pretty quick, what is CHAOSS? The acronym stands for Community Health Analytics for Open Source Software, right? So the first question we have on the table is: what is community health?
Yeah, so many of our users are people who care about community health. Here you can see individual contributors, community managers, maintainers, and companies. Besides those, we also have many people from universities, such as researchers. They care about everything that happens in open source, because it's a key part of how people collaborate. So it's very useful for those people, and it's also very useful for people who want to start a company that provides some background support for open source projects.

Indeed, the concept of health is something subjective that each of us has in mind. We try to define what health means with metrics. It's something that may take 10 to 15 metrics, right? This is what we are trying to define in CHAOSS. So far, there are different working groups. We have, for instance, diversity and inclusion. We have a risk working group. We have a working group focused on OSPOs, open source program offices. We have a working group that has started on OSPOs at universities, on what open source means for universities. And there are some others as well. This is the place where we are defining metrics from, let's say, a technology-agnostic point of view. But now we also have the software, which is part of the conversation we are having today.

So the goals of the community, as you can see here: we are defining metrics, and we are defining metrics models, of which we'll see some examples later, with the goal of measuring this concept of open source health. We want to produce open source software, which is part of, let's say, the DNA of the community as well. We want to bring in industrial partners, universities, individuals, and open source foundations to help us define this. We will share some references with you later.

Yeah. Okay. Here we show the main milestones we've had, what happened over the past six years.
Actually, I only joined this community two years ago. Some people from CHAOSS, the long-time people, co-founders like Daniel, Dawn, and Sean, told me that when they were talking about how to set up a community to measure the health of open source, it simply happened in a conversation in 2017, at the Open Source Summit North America, just like today. They had a conversation, and it was, okay, go. And so the community was set up. Our first release of CHAOSS metrics happened about four years ago, in 2019. At the very beginning, I think we had just 30 metrics released, but right now we have almost 100 metrics released, plus 11 metrics models. We joined the community two years ago and started the translation work, which translates from English to Chinese and to some other languages. Our first Chinese version was released just six months later. We are very happy to keep moving forward with the community. We call ourselves, I mean the members of the community, Chaotics. I'm proud of that name.

Yep. Thank you for the great work. Moving forward. So again, this discussion about metrics. We have software, so we are now bringing software into the discussion. Are all the metrics implemented in software? The answer is no, they are not. Is all the software covered by the metrics? Again, the answer is no. There is a certain intersection, and we are working on that. It happens that there is so much data available out there that it is really hard to, let's say, formalize all of that information. So we are moving forward. We are discovering the metrics that make sense for different personas in the community, and we are slowly covering that whole spectrum of metrics.

Yep. So here's the website where we provide all the metrics and all the metrics models we have.
Some metrics relate to events; they are all released here. We also have the governance and leadership part and the contribution part. So we divide them into different domains. And in the bottom right corner we have all the metrics models. Later on, we will show you some examples that we use in a real case.

And now let's move to technology. For the record, there is another project called Augur. Augur and GrimoireLab: one started a bit later, if I'm right, but both projects started at roughly the same time in the CHAOSS community. They do slightly different things, but GrimoireLab is the project I know, so that is what we are discussing. And then OSS Compass started and was built on top of GrimoireLab, which is part of the conversation today.

So really quick, some key features. If you think about open source communities, what you have is a variety of data sources. And when you take part in those data sources, GitHub, Jira, Slack, Gitter, some others, basically all the CI/CD, you have an identity in each of them. So there is a tool that specifically takes care of all the identities and affiliations, so you can have an overview of who is who in the community. That's one thing. Then, GrimoireLab covers more than 30 different data sources, which is about 90% of the useful data sources we see in open source work. And there are 70-plus different dashboards and use cases available out of the box. The tool is really flexible: think of it as a NoSQL database full of different databases, with information coming from those sources. And what GrimoireLab helps you with is reducing the complexity of gathering all the data in the first place: forget about APIs, logs, et cetera. You don't have to take care of incremental updates. It gives you all the historical information from the very beginning of that data source, and so on and so forth.
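To make the "who is who" idea concrete, here is a minimal sketch of what unifying identities and tracking affiliations can look like. It is not SortingHat's API or data model: the `Identity` and `Enrollment` classes, the function names, and the rule of matching on email address are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Identity:
    source: str                  # data source, e.g. "git", "github", "slack"
    username: str
    email: Optional[str] = None


@dataclass(frozen=True)
class Enrollment:
    organization: str
    start: date                  # first day at the organization
    end: date                    # last day at the organization


def unify_by_email(identities: List[Identity]) -> List[List[Identity]]:
    """Group identities that share an email (case-insensitive) into one person."""
    groups: Dict[str, List[Identity]] = defaultdict(list)
    for ident in identities:
        # Identities without an email cannot be matched, so they stay alone.
        key = ident.email.lower() if ident.email else f"{ident.source}:{ident.username}"
        groups[key].append(ident)
    return list(groups.values())


def affiliation_on(enrollments: List[Enrollment], day: date) -> str:
    """Return the organization a person belonged to on a given day."""
    for enrollment in enrollments:
        if enrollment.start <= day <= enrollment.end:
            return enrollment.organization
    return "Unknown"
```

With something like this, each event (a commit, an issue comment) can be attributed to the organization its author belonged to on the day of the event, which is what keeps per-organization numbers accurate when people change employers.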
A bit about the architecture, really quick. From left to right it is the usual data mining process, right? On the left, we have all the data sources I already mentioned, Stack Overflow and perhaps some others. There is a piece that gathers all the information and stores it in JSON format, with a JSON schema. This is later moved into SortingHat, which is the tool in charge of all the identities and affiliations. Remember, in Europe we already have GDPR, so we need to take care of all of this; it's part of the law. And then the rest of the information goes into two types of indexes or databases: raw information, as it comes from the original data source, and enriched information, which is one of the advantages of using GrimoireLab, because you have information much closer to the business layer, let's say, of software development, right? So this is more or less the thing. And then at the end, of course, you can have reports, visualizations, et cetera. All the technology runs on top of OpenSearch, another open source project, which came out of Elasticsearch; they decided to change the license, so Elasticsearch is not open source anymore.

Okay, services born from GrimoireLab, really quick. We are good with time. So Cauldron.io is a service run by Bitergia. We have GrimoireLab, which is the open source project, and Cauldron.io, which is one of the SaaS services. Then we have Bitergia Analytics, which is another tool, in this case provided by Bitergia and covered by ISO standards, quality, security, all of this, you know, to provide SaaS services to large corporations with certain warranties, let's say. There are other dashboards: Mighty Community, Mystic by Rochester Institute of Technology, and LFX Insights. By the way, an interesting aspect here is that LFX Insights is a proprietary solution by the Linux Foundation.
Initially built on GrimoireLab, they decided to evolve the technology, so they are now using other things that were initially inspired by CHAOSS technology, which is good. And then OSS Compass, finally. Moving forward.

If you remember this chart, now we go into some of the numbers. A bit more historical background. All of this started at the university. In 2000, the research group LibreSoft, which worked on free software engineering, started operations. In 2006, we started the MetricsGrimoire tools. That was when I joined the research group. And then 2012 was when Bitergia was founded. So we have about 12 years of research. It was not in the DNA of the group, let's say, to industrialize, to make a product out of existing research. So let's say it just happened, right? There was a certain alignment of all the planets, and we said, okay, this may happen at some point. And then we go to 2015. We decided to migrate the technology to something new, and GrimoireLab was born. Then in 2017, we have CHAOSS. And from 2017 until today, we have all of these projects that I mentioned: Cauldron, LFX Insights, Mystic, and some others.

Let me represent this in a more colorful way. Okay, we have this line here. On the left are the 12 years of research and development. Then we have research and production, where what we were doing was basically testing the market. Does it make sense? Is someone putting money on the table to buy the service? Then CHAOSS started in 2017. And finally, we have all of these services. So if you think about it, we have 16 years of research plus testing in production, somehow. And this is what I would say, and again this is anecdotal data for today; we would probably need proper research to look into it.
After all of those 16 years, in only four years we have four services built on top of the existing technology. Why is that happening? My view is that it's because of CHAOSS. CHAOSS brought so much visibility to the technology that we started to have adopters. So I would say, first, you can build on top of 16 years of work something that takes a few months, perhaps a couple of years. We'll see some results from OSS Compass as an example. But at the same time, thanks to open source, you are basically getting all of that innovation. And then, ideally, you are contributing back, so you are bringing innovation back to the community. So it's not only that we did all of this research and production, but we are also getting innovation back from all the community members, and so on. So, yeah. And then we go into OSS Compass, so, all yours.

Yeah, it's my part. First, I will use this slide to introduce the relationship between OSS Compass and CHAOSS. OSS Compass is an open source community. It provides two main artifacts. The first is a SaaS platform, which is provided for everyone to use for free; I will use the next few slides to introduce it. The other artifact we call the OSS ecosystem evaluation system. This is the theoretical framework. As you know, CHAOSS provides metrics, which are atomic indicators, and it also provides metrics models, which combine a bunch of metrics into one model to tell a specific story. Our OSS ecosystem evaluation system, this theoretical framework, combines those metrics and metrics models into a system. But it's not just a simple list. I did research in this area, I mean evaluation in open source, covering the academic research output of the past 30 years. I found that there are two main streams in open source evaluation.
The first started in the last century. It cares more about code quality, and it is no different from a commercial product process. The second stream comes from the beginning of this century. It cares more about the collaboration that happens in the community itself: the responsiveness of users and of contributors. Then, ten years later, more and more commercial companies, like Bitergia, Huawei, and many others, started joining open source work and leading open source communities. They care more about the ecosystem built around the open source community, whether upstream or downstream. So this third stream cares more about the collaboration that happens across an ecosystem. Our theoretical framework follows the third stream, so we call it the ecosystem evaluation system. It has three dimensions: first, productivity; second, robustness; and third, niche creation. In the next few slides, I will introduce these things.

Okay. Internally, in our ecosystem evaluation system, we have four metrics models in total. That's all so far, but we will continue to produce more metrics models. What we should focus on right now, the first priority, I think should be collaboration, because we believe collaboration is the key part of open source. So whether it's internal or external collaboration, we should focus on that first. We start with the first one, the collaboration development index. It evaluates the code development process inside the community, to see how smoothly collaboration happens in that community. We use another metrics model called community service and support. We use this model to say, okay, if you want the contributors in your community to collaborate smoothly, what kind of service and support can you provide to them? We measure the capacity provided by the community.
Next is organization activity. We use this metrics model to measure ecosystem collaboration from upstream and downstream, because we believe that an open source community, if treated as a platform, must have some downstream and upstream collaboration, and that means collaboration between organizations. So we monitor and evaluate their activities in the community. And we also have the fourth one, which we call community activity. It's a relatively comprehensive metrics model. We use it to give an overall evaluation of the whole community. So currently we have four metrics models in total. In the bottom box, we show the algorithm used in these metrics models. As you know, you can treat every metrics model as a method in a programming language: metrics are the inputs, and the metrics model score is the output, or return value. For each metric, we use AHP to calculate its weight, and for the output of the metrics model, we chose this algorithm. Actually, it was created by Rob Pike, a very respected person; I remember he retired several years ago, but he contributed a lot in this area. So we decided to use his algorithm as the algorithm of our metrics models. We made some improvements, but anyway we use this as our default algorithm for calculating the metrics models. I think in the near future we will provide more flexible algorithms coming from different people and from students. All solutions are welcome.

Anyway, this is the home page of OSS Compass. At the top we show the URL, and in the bottom right corner we provide a QR code to help you find the URL easily. As you can see, the home page is quite simple. We just provide a search box. All you need to do is type in the project you're interested in, hosted on GitHub or Gitee.
Find it, and check out the insight report based on the four metrics models I just introduced. In case you don't find it, it doesn't matter, because in the right corner of the home page you can submit the project or community you're interested in. So far we already cover almost 50,000 GitHub projects, which span almost 300 technical domains. So you may well find something you are interested in there. You can enter a single repository URL, or you can enter a GitHub organization URL, because we provide repository-level and organization-level monitoring on GitHub and Gitee.

So what kind of services do we provide for all of you? The first is the insight dashboard. This is the detailed model result for each single repository or community. It provides value exploration and risk perception. Meanwhile, we also provide the Compass Lab, because we think innovation cannot be initiated by one person or one team. We think there will be more innovation from all over the world. So we provide this space to let you invent your own metrics model, which could focus on a specific technical area or be common to all users. And this is how we collaborate on the CHAOSS metrics models: when CHAOSS sets up a new model, we can try it out in the Compass Lab, because there we have, as I mentioned, a data set of 50,000 projects, and we can co-create with each other; you can create your own metrics models. In the near future we will also launch a new project management data dashboard. When you have the output of the four metrics models and you find some problem, as a project manager or maintainer you want to improve it; how do you do that? So we provide this for a single project. For example, I find my issue response time is too long, so what are the top 10 issues causing this problem? We will provide that data so you can manage it yourself.
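The idea of a metrics model as a function, with metrics as inputs, a weight per metric, and a single score as the return value, can be sketched as follows. The formula is the one published with the open source criticality score project attributed to Rob Pike: each metric value S_i is log-scaled against a threshold T_i, multiplied by its weight alpha_i, and the sum is normalized by the total weight. This is only a sketch under assumptions: the talk says OSS Compass made its own improvements, which are not shown here, and the function name and input layout are hypothetical. In practice the weights would come from a method such as AHP rather than being picked by hand.

```python
import math
from typing import List, Tuple

# One signal = (value S_i, weight alpha_i, threshold T_i).
Signal = Tuple[float, float, float]


def metrics_model_score(signals: List[Signal]) -> float:
    """Weighted, log-scaled score in [0, 1] for a list of metric signals.

    score = (1 / sum(alpha_i)) * sum(alpha_i * log(1 + S_i) / log(1 + max(S_i, T_i)))

    Thresholds must be positive, otherwise a zero value divides by zero.
    """
    total_weight = sum(weight for _, weight, _ in signals)
    weighted_sum = sum(
        weight * math.log(1 + value) / math.log(1 + max(value, threshold))
        for value, weight, threshold in signals
    )
    return weighted_sum / total_weight
```

The log scaling keeps one very large metric, for example a huge commit count, from dominating the score, and any value at or above its threshold contributes its full weight.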
In the last two slides I will give you a real case. Do you remember this keynote slide shared by Jim yesterday? It's called "Neutral homes beget broad investment". In this slide, investment in PyTorch surpasses TensorFlow at the time the foundation was established, last September, just one year ago. And the interesting thing is that in our next slides we also provide an analysis of these two projects. I swear we did nothing about that slide; ours was prepared before yesterday.

It's true. We were discussing it; we had indeed hidden the names of the projects in the next slides, and then we said, oh, Jim did it, okay, so then we go for it.

Actually, we only finished our draft last weekend, right? So, as Daniel mentioned, we were going to cover up these two projects' names, but coincidentally, in terms of color, Jim even used the same colors we use, so we have no choice. So here TensorFlow is, I mean, the blue color, TensorFlow and PyTorch. Okay. First of all, we have to say that both TensorFlow and PyTorch play a very important role in the open source world, especially as deep learning frameworks. We are grateful for their great contributions, not only in industry but also in academia. They are so important for everyone, and we appreciate their contributions. So today we just mention some of our insights based on the metrics models we have in Compass. First, you can see I marked the date mentioned by Jim here, September 2022, last year. That is when PyTorch surpassed TensorFlow in terms of investment. Of these three metrics models, we can first go through the comprehensive one, community activity: you can see that PyTorch surpasses TensorFlow at the beginning of last year. If we push the time back another seven months and use organization activity to describe the ecosystem collaboration in the community, we find that at that point PyTorch surpasses TensorFlow. And again, if we push the time back another nine months, when
we start monitoring the collaboration development index, in short CDI, we find that CDI had already anticipated what happened later. So what can we do if we are the community manager in this case? When I notice that CDI shows such an obvious change, I need to find the root cause: what happened in my community, and what can I do to stop the decline? And there is one important note: even CDI, the collaboration development index, is still a proxy indicator. It has some latency compared with what we found in our recent research: the migration of core contributors in a community has a very important influence on the ecosystem and the sustainability of the community. So in the near future we may release this new metrics model in CHAOSS, and we will also show the results. Here I just used the community manager's point of view, but what if you are an investor? What if you are a researcher or a user of this software? What kind of value or insights can you get from this? I think these are very open questions that we can discuss later.

So this is a very powerful chart, the image and everything. Now, summarizing, because we are running out of time and we would love to have some questions from you. First of all, CHAOSS brought great visibility to GrimoireLab, and as a project it started to see increased adoption. I assume that if we went for the charts you were showing before, we could probably see similar growth after joining the CHAOSS community, in the community index, the collaboration index, and others. So I would say a community is a really great amplifier of the mission and the vision, in this case of GrimoireLab, or of any piece of software that is part of this, OSS Compass and others. So being part of the community has been incredibly useful from a community perspective and, I would say, from a business perspective as well. And then there are the numbers that we have here. Before the numbers, just to say that community growth and adopters and everything is a
challenge as well. It is a good challenge to have, but it is a challenge in terms of how to make sense of all the people's needs, feature requests, and so on. It is about scaling with the community somehow. By the way, data is really great for scaling yourself, for understanding what is going on in the community, so have a look at CHAOSS. And then, just as a summary: it took, again, 16 years to go from research to production, or to testing the market, and then we have seen that in only four years people have been building on top of this, forking the project and creating on top of existing things. So you can take advantage of all of this history. This is basically open source. Then, if you can share some more thoughts on OSS Compass.

It only took one year to build OSS Compass from scratch into a SaaS platform plus the theoretical framework. While we were creating the whole thing, we got a lot of support from the CHAOSS community, and we also got a lot of support from Bitergia engineers. So we never think of each other as competitors. We think CHAOSS and GrimoireLab give us a platform, and OSS Compass is a niche based on this platform. We are very happy to share the value produced by CHAOSS, and to produce more value together moving forward. That's all I can say.

I think this is all. I think we step into the last part. You have links to everything, all the pieces of software that we've been working on. And with that, it's Q&A time. Do you have questions or comments?

Thank you for sharing the analytics. It's always nice to see practical examples too. For organizations doing open source, a lot of times it's interesting to see how much contribution comes from the inside and how much comes from outside contributors to their open source project. Do you have metrics and analytics for this?
Exactly. As you can see, we provide organization activity, and in this metrics model we show how many organizations have joined your community and how many contributions they have made. So you can distinguish whether an organization stands for your own company or for outside collaborators from other companies. We do have metrics for that.

It's basically part of the information in some of the repositories: the developer who produces that event, a commit or something else. And then it's what we call counting potatoes. It's basically that this developer belongs to this company. After a while you decide where that person belongs, and then you can start having those numbers, either for this company or for this other company, and then you can compare and see trends. In GrimoireLab and SortingHat, which is the tool, you can specify the time when that person left the company, so you are accurately attributing the activity to each of the organizations.

If a company were to adopt some of the metrics, just starting with a very basic setup of some metrics, for example what Susie just mentioned, do you have a rough idea of what the time investment would be, from scratch to some basic metrics?

So that depends. If you are willing to use some of the existing SaaS services, you can go to OSS Compass, you can go to Cauldron, you can go to LFX Insights. There are many places, and then you will have insights there. If you want to run this internally, so it's, okay, I want to take the technology and do this on-prem on my side, maybe you can say how much time it takes to deploy OSS Compass, because I am biased here.

If you only care about several metrics... I didn't do anything, I just took my microphone. If it's just several metrics, it's quite simple. It took us something like one month, provided that you only care about two or
three projects. But if you are considering a very big community, which contains something like 10,000 repositories, then I don't think a quick setup will happen. So what we recommend: CHAOSS and Bitergia have some analysis services that provide data directly to you, and OSS Compass also provides a REST API for some metrics and metrics models. You can fetch that data directly and only care about the communities you care about.

The usual deployment time is probably less than a day to have everything up and running. The only problem is the gathering process. If you are gathering information from Bugzilla, that's really heavy and intensive, so that may take longer. If you are going to get the commits of the Linux kernel, you just need git log, sorry, git clone, and then you have all the history there. So it depends as well on the size and the number of projects you are gathering. It's mainly about data and the restrictions on the other side.

We are out of time, I guess, so that's all. Well, thank you all. Thank you.