Ready? Okay. Thanks for joining. This is going to be a session about metrics — collecting metrics about OpenStack, what we're doing, and the state of the art. We'll present in this first part, and then we'll have the next session to discuss what we want to do next and what kind of feedback you have. If you cannot stay for the next session, write it down on the note cards I gave you: your questions, your comments, your ideas and suggestions. We will collect them afterwards and go through them in the next part. So without further ado, I'll call on Dan Stengel from HP to describe what HP is doing to analyze OpenStack contributions, and then we'll have Jesús González-Barahona and Sanjiva Nath on the other project, activity.openstack.org. Dan?

Thanks. I'm Dan Stengel, from HP's open source and cloud organization, and Stefano was kind enough to invite me here to talk briefly about the GitDM tool set, which until recently has been more or less the state of the art for measuring contributions to the OpenStack community. I'm going to give you a quick rundown of how it works and what we found, at least for Grizzly. GitDM comes to us originally from the Linux kernel community: Jonathan Corbet and Greg Kroah-Hartman, heavy-duty Linux kernel developers, wrote the tool a number of years ago to get a better idea of who builds the Linux kernel. The OpenStack community — in particular Mark McLoughlin — adapted GitDM, forked it and ported it, if you will, to OpenStack, to let us gather and publish some of the same metrics. Mark has been doing that work for a couple of years now, at least since the Essex release and now through Grizzly, so we have a fair bit of experience with the tool set. It's not a very complicated tool set; there's really no magic to it. Basically, GitDM can measure and report on three key areas of community contribution: change sets in the Git source code management system, defects in Launchpad, and Gerrit code reviews, which many OpenStack projects use. Tying those all together is not a database but a set of flat files that map multiple email addresses to single individual contributors, as well as to organizations — who's who, who works for whom, and how the domains and individuals relate to one another.

Just to prove there's really no magic, this is all you need to do to run GitDM yourself: you clone a working copy of all the projects you want to look at, grab a copy of Mark's fork of GitDM, and run a simple shell script, doit.sh. It's pretty self-explanatory, and doit.sh is just a recipe for building the metrics we've been running for the last several releases. I have to say I can't take credit for much of this: a couple of months ago my manager at HP asked me if I could generate these metrics inside HP on a regular, daily basis, so that the various organizations within HP that want this kind of information can have access to it. So I really am standing on the shoulders of giants — I'm not presenting much work that I've personally done. Credit where credit is due.
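[Editor's note: doit.sh itself isn't reproduced in the transcript. As a rough, hedged illustration of the kind of pipeline it wraps — feeding `git log --numstat` into gitdm per project — here is a minimal Python sketch. The repository list, date range, and gitdm flags are assumptions for illustration, not the script's actual contents.]

```python
# Minimal sketch of the kind of pipeline doit.sh wraps: pipe `git log`
# into gitdm for each project. Repo list, date range, and flags are
# illustrative assumptions, not the actual contents of doit.sh.
import subprocess

REPOS = ["nova", "swift", "cinder"]          # hypothetical checkout dirs
SINCE, UNTIL = "2012-09-27", "2013-04-04"    # e.g. the Grizzly cycle

for repo in REPOS:
    # gitdm reads `git log --numstat` on stdin and aggregates
    # lines changed / change sets per author and per employer.
    log = subprocess.Popen(
        ["git", "log", "--numstat", "-M",
         f"--since={SINCE}", f"--until={UNTIL}"],
        cwd=repo, stdout=subprocess.PIPE)
    subprocess.run(
        ["gitdm", "-u", "-b", "gitdm-config", "-o", f"{repo}-stats.txt"],
        stdin=log.stdout, check=True)
    log.stdout.close()
```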
Briefly, I wanted to give you a flavor of what GitDM generates and the state of the measurements we've taken for the Grizzly release. I work for HP, so all we do is measure stuff, right? At least that's what we used to do. The data GitDM generates is really just columnar text data; I cut and pasted it and generated some pretty pictures for the sake of this presentation. This isn't literally the output of GitDM, but I wanted something that would show nicely on a slide. As I said, GitDM measures change sets in Git. This slide shows a rundown of the top lines changed in Grizzly by employer; it also generates data per individual, but I didn't have enough avatars or headshots of everybody to show that level of contribution. Then there's the number of change sets contributed by employer — those are the two main Git metrics that GitDM looks at. None of this should be surprising, first because Mark McLoughlin already published it, and second because it's stuff we already know: we pretty much know who the big players in OpenStack are right now. If you walk through the exhibit hall, all these logos show up on the wall. So the significance and the relative ranking of contributors and of their employers is not really that surprising to anybody. What is interesting, though, is to see how this data evolves during the course of a release — the time series of the data. What we've been doing at HP, and what I think others are hoping to do, is set up a regular update of this data so that you can watch how organizations' and individuals' contributions change over time: which projects are trending, which are growing, which are shrinking, where the work is going. Then there are the employers with the most hackers, or the most developers if you will. As I mentioned, GitDM also tracks Launchpad defects, so this gives an idea of who's most active in resolving defects in the Grizzly release, by assignee or by bug owner. And finally, the Gerrit code reviews — another significant component of how we're active and contributing in the community.

Really quickly, I'll skim through these slides because they're not that interesting, but if anybody wants to run GitDM I want to make sure you have enough of the basics to run the tool. The configuration is dead simple. GitDM basically just needs to know what's in a release — what am I looking at? So you define the projects you're looking at, a date range of commits you're interested in, and optionally a list of exclusions to omit. The reason you would do that is, for example, to leave out automated continuous integration commits by Jenkins. And forgive me if I'm rushing; I want to make sure I give everybody time here. GitDM also has a flat-file database of who's who in a project. The aliases file lets you map multiple contributor addresses to a single canonical address. The same is true for Launchpad: we keep track of who's who in Launchpad by mapping a Launchpad ID to the contributor's email address. And then again, who works for whom — which company or organization is that person associated with, if there is an affiliation that is public or obvious.
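[Editor's note: a hedged sketch of resolving a commit author against gitdm-style flat files. The two-column "alias canonical" and "email employer" formats here are assumptions for illustration; the real files have more features, such as comments, date-bounded affiliations, and pattern matching.]

```python
# Hedged sketch of the who's-who lookup gitdm performs from its flat
# files. The two-column file formats are illustrative assumptions.
def load_map(path):
    mapping = {}
    with open(path) as f:
        for line in f:
            line = line.split("#")[0].strip()   # drop comments
            if line:
                key, value = line.split(None, 1)
                mapping[key] = value
    return mapping

aliases = load_map("gitdm.aliases")      # alias email -> canonical email
employers = load_map("gitdm.emailmap")   # canonical email -> employer

def who_is(email):
    canonical = aliases.get(email, email)
    return canonical, employers.get(canonical, "(Unknown)")

print(who_is("jdoe@example.org"))  # hypothetical contributor
```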
The other thing GitDM offers is the ability to track by dates of service. If somebody worked for Red Hat until December of 2012 and then moved to Rackspace, or whatever, you can keep that information in the configuration, so that when GitDM generates the analysis it takes it into account. You can do a similar mapping for organizations, so in the case — maybe not that rare — that an organization's affiliations change, say a subsidiary or an acquisition comes or goes, GitDM can also track that sort of thing.

Just briefly, to wrap up: having been doing this for a couple of months now, I've run into a few issues. Some of these have also come up on the openstack-dev mailing list, and none of them are particularly surprising, but I'll highlight a couple of things we found. One thing we've noticed, which is fairly self-evident, is that we really need to be measuring all of OpenStack, across the board, and making metrics available in combinations that make sense to the consumer of that information. The GitDM tool as Mark McLoughlin developed it covers the core, or integrated, projects — Nova, Swift, Cinder, etc. That's obviously a huge and central part of OpenStack, but there are some 50 projects for Grizzly that make up OpenStack, and depending on who you talk to — test and development, continuous integration, the library projects, documentation — I don't think OpenStack would be what it is today without all of those pieces. So I think it's critical for us as a community, going forward, to make sure we track and give credit for all of those components, not just a selected subset. Then, a couple of questions that come up when I describe what GitDM is trying to do. How do we track defects? This came up on the mailing list too, and everybody has a slightly different take on it, but something we should figure out in the long run is: when is a defect done? Who gets the credit for it? Which one or more entities can claim it as something they did? That's difficult, and it's not obvious, so I think we should pay more attention to it. Then there's having a consistent, centralized database of who's who in the project. I showed you how GitDM handles that: a flat file associating email addresses and employers. That works pretty well, it's open source, and everybody can examine it, but it's not necessarily the best solution, and it's hard for other tools to consume that information. And then, lines of code or the number of change sets tell part of the story of code contributions, but not everything; it's not the complete story. A lot of people have asked me whether we can weight those — some types of contributions may be more significant than others. Or, even if you committed a bunch of code — say you committed 800 lines, then realized you made a mistake and had to back them out and do it again — you don't want to count both the first commit and the second; you want to be a little smarter about it. I put a couple of ideas up there just as brainstorming, but something we might want to consider is how to weight commits.
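[Editor's note: one hedged sketch of the weighting brainstorm Dan mentions — discounting code that a later commit reverts, so an 800-line mistake plus its revert doesn't count twice. The net-lines heuristic and the inputs are assumptions for illustration, not an agreed community metric.]

```python
# Toy commit-weighting heuristic: count only surviving lines, so a
# commit that is later backed out doesn't inflate the totals. The
# inputs are hypothetical; a real tool would derive them from
# `git log` plus revert detection.
from collections import defaultdict

commits = [  # (author, lines_added, lines_reverted_later)
    ("alice", 800, 800),   # committed 800 lines, later backed out
    ("alice", 820, 0),     # the redone, surviving version
    ("bob", 120, 0),
]

net = defaultdict(int)
for author, added, reverted in commits:
    net[author] += added - reverted   # only surviving lines count

print(dict(net))  # {'alice': 820, 'bob': 120}
```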
My final thought is really just this: there's been some discussion about how we measure OpenStack contributions and why, who it benefits and who it doesn't, and so on. I really think that if we can have community-generated and community-vetted metrics that everybody can see and anybody can reproduce on their own, it's only going to do us good as a community. It will make us better and stronger and help us tell our story better. People will use the numbers however they want to, but data is data and facts are facts. So anyway, thanks, and I'll pass it on to Jesús, who's going to talk a little bit about what Bitergia has been doing.

Okay, good afternoon. Thank you very much. I'm Jesús González-Barahona, from the Bitergia company and from the Universidad Rey Juan Carlos in Madrid, and I'm going to talk about something some of you may already know: some studies we have been doing on OpenStack. One is the study on companies, which in terms of the information produced is quite similar to the one Dan showed a moment ago, though with a different methodology — which means the results are not always the same, but it's interesting to compare. We are also working with the OpenStack Foundation on a specific dashboard, with information not only related to companies — in fact, not really related to companies at all, but to how OpenStack is being developed these days. It's online: if you want to have a look, it's at activity.openstack.org/dash. Part of this dashboard, something we're presenting today, is what we would call the interactive dashboard, where you can point and click and select: I want to see the contributions by such-and-such company, for such-and-such projects, over such-and-such time period, things like that. It's still beta, so it's not working perfectly; please report any bugs you may find. But it's a starting point for dealing with exactly what Dan raised a moment ago: how do we deal with the different kinds of metrics people want? Some people want to look at all the metrics we calculate; others are interested in what's happening in a specific project; others in what's happening with a specific company. And the way of aggregating them also differs from person to person. So what we try to do here is let the user select: this is a starting point, but the idea is that users decide what they want to see.

The main idea behind all of this is that in OpenStack, as in any other free, open source software project with an open development model, the information is available out there. There's a lot of data; it can be retrieved, organized, and analyzed, and anyone can do that. That means you no longer depend on the metrics a vendor gives you. You can go there and measure. If you don't like my metrics, you can get your own, as Dan said. This is very, very important: from my point of view, it's part of the transparency of open source projects. And it matters because data is data, as Dan also said, and you can interpret it one way or another. Sorry, what? Okay, great. Thank you very much. Sorry — well, let's go on. As I said, this is basically about transparency. It's also about knowledge extraction.
OpenStack is large and complex; it's very difficult to really understand what's happening everywhere. Maybe an expert who follows it all day really understands what's going on, but most of us can't. In any case, getting that kind of knowledge into your brain is not easy, so metrics are here to help. We also need objective information: there are a lot of talks, a lot of discussions on mailing lists, a lot of opinions. Data is data, and you can use that data in the discussion. Again, it can be interpreted in different ways, but at least you have something objective to work with — and if you don't like this data, you can produce your own. And metrics help with decision tracking. Say you decide to put into practice such-and-such policies for reducing time-to-commit or time-to-fix or whatever: how do you really know the policies are working? The only way is to go there and measure what's happening. And that's it.

To do all of this, the data has to be extracted; it has to be mined. And the data lives in repositories that were not designed for getting this kind of information out. Git is designed for managing commits and so on, but it's not really designed for producing stats — you can get stats, but that's not its thing. The same goes for Launchpad, or whichever. So we need tools built specifically for retrieving that information. The data also has a lot of complexities, and you again need tools for dealing with all of them: filtering the data, managing the data, and so on. In the end, we are also quite interested in the idea that if everything in OpenStack is free software, let's analyze it with free software too, so that anyone can come and use exactly the same methodology if they want, because they have the source code and can look at how it was done. They can use the data sets, produce their own data sets from the same data, and of course contribute to fix bugs or improve the system.

For this we're using the MetricsGrimoire set of tools. I'm not going to go into details, but there are basically three main tools: CVSAnalY for mining source code management repositories; Bicho for ticketing systems like Bugzilla, Jira, or those of the most common forges; and MLStats for mailing lists and some forums too. And then there's vizGrimoire. What the MetricsGrimoire tools do is go to a repository and store the information in a database. Once you have it in an SQL database, you can do a lot of things with it, and for that we have basically two parts: one for analyzing the information and one for visualizing it. For visualizing, right now we produce JSON files plus some JavaScript — basically Flotr, Envision, and things like that — to build the visualizations. For querying, we use Python and R; both are quite simple to use with an SQL database and both can produce very interesting information. The R part is where we produce those JSON files, but we also do filtering and massaging of data, and we produce many other kinds of information, so we can produce charts, plots, things like that.
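[Editor's note: a hedged sketch of the query step Jesús describes — pulling monthly commit counts from a MetricsGrimoire-style SQL database and dumping JSON for the JavaScript front end. The table and column names (scmlog, people) follow my reading of the CVSAnalY schema and should be treated as assumptions; the real databases are MySQL, while sqlite3 keeps this self-contained.]

```python
# Query a MetricsGrimoire-style database from Python and emit JSON
# for the Flotr/Envision charts. Schema names are assumptions.
import json
import sqlite3  # stand-in for the real MySQL databases

conn = sqlite3.connect("openstack_scm.db")  # hypothetical database dump
query = """
    SELECT strftime('%Y-%m', s.date) AS month, COUNT(*) AS commits
    FROM scmlog s JOIN people p ON s.author_id = p.id
    GROUP BY month ORDER BY month
"""
rows = [{"month": m, "commits": c} for m, c in conn.execute(query)]

with open("commits-per-month.json", "w") as f:
    json.dump(rows, f, indent=2)   # consumed by the JavaScript front end
```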
And vizGrimoireJS is, in summary, the JavaScript part: it basically produces an HTML5 application for dealing with the data — the interactive dashboard, for instance, is done with this. So in the end, what we did was run MetricsGrimoire on the OpenStack repositories and produce the database, and the database is online, so if you want you can just download it and use your own tools. There are many issues here. For instance, in the mining: as Dan also said, there are a lot of repositories and a lot of trackers, but only some of the information is strictly related to OpenStack; other information relates to somewhere else. For instance, you have tickets in Launchpad that came from Canonical, and at some point they were also assigned to the OpenStack tracker — what kind of activity, exactly, are you interested in tracking? The same goes for the many repositories that are in fact forked from repositories outside OpenStack. Then you have to produce queries specific to OpenStack. One example, raised on the mailing list over the last few days: how do you decide that a bug is closed, and who closed it? The same goes for running the Python and other scripts — for example, to produce information by sub-project, you first have to identify which repositories belong to that sub-project. Then there's customizing vizGrimoireJS to produce useful information — for that you have to do things like removing bots from the stats — and finally you have to export the results via HTTP, where we have some performance issues.

As for the future, we're trying to do some other things more related to development. For instance, this is a preliminary study on how people close tickets — this one is not for OpenStack but for another project — looking at how quickly people fix bugs. This is tricky: think about what happens when a developer comes in and fixes old bugs. The median time-to-fix for that developer is going to go up a lot, yet they're doing a very good job, because they're fixing bugs that may have been open for a year. So it's not as simple as counting the mean or the median time-to-fix; that's why we produce quantiles, and the same information over time. This is another kind of study: how people enter and leave the project. This one is for Linux, but we're producing the same thing for OpenStack right now. Basically we have the generations — I won't go into the details — every three months, of people entering Linux, and how they leave. The first one, for instance, is the earliest hackers in the Linux kernel and how they left, year after year: in the front row you have the people still there this year, and at the back the earliest generations. Again, we're producing this kind of thing for OpenStack.

In summary — and this is the last slide — there is a lot of information out there. What we're trying to do is extract the interesting parts, summarize them, and do it with free software tools, so that anyone can come and produce their own analysis, or analyze how we are doing ours. Which brings us to the question: what does the OpenStack community really want to know about OpenStack? And this last bit is marketing for the company, so I'll just skip it.
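[Editor's note: a minimal sketch of the quantile idea Jesús describes — reporting several quantiles of time-to-fix instead of a single mean or median, so one developer closing very old bugs doesn't distort the picture. The ticket data is hypothetical.]

```python
# Time-to-fix quantiles over hypothetical Launchpad-style tickets.
from datetime import datetime
from statistics import quantiles

tickets = [  # (opened, closed)
    ("2012-10-01", "2012-10-03"),
    ("2012-10-05", "2012-11-20"),
    ("2012-01-15", "2013-01-10"),   # an old bug finally fixed
    ("2012-12-01", "2012-12-02"),
]

days_to_fix = sorted(
    (datetime.fromisoformat(c) - datetime.fromisoformat(o)).days
    for o, c in tickets)

# Quartiles: the 25%, 50% (median), and 75% points of time-to-fix.
q25, q50, q75 = quantiles(days_to_fix, n=4)
print(f"time to fix (days): 25%={q25}, median={q50}, 75%={q75}")
```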
These are the credits, and any feedback is more than welcome. The slides are online — I tweeted about that a while ago, so you can find them. That's it. Thank you very much.

Good afternoon. My name is Sanjiva Nath, and I want to talk about what we are doing for OpenStack: zAgile and Wikidsmart. We were approached by the OpenStack Foundation to try to bring together information across a number of different tools. Wikidsmart is a platform for integration, primarily focused on application lifecycle management. The idea is that distributed teams have heterogeneous tool sets and processes across different organizations and teams; how do you bring it all together, quite literally under a single pane of glass, or on a single page? We did this for OpenStack because, as the projects evolve — when you have tens of projects, hundreds of developers, and lots of repositories — even if all of that information were in a single tool, it would be very difficult to manage. But now look at Launchpad, which has one set of information; then you've got Gerrit and GitHub, and maybe other tools that come into play, for example forums and wikis. How will information be unified across all of these? And beyond that, what happens when you want to bring it into your own organization and tie it to whatever is happening internally? That's what I want to talk about today.

The first thing is looking at the silos of content and concepts. Launchpad, GitHub, and Gerrit are just some of them; there's also, for example, the organization and member database that the OpenStack Foundation itself maintains. So we pull that and combine it with all the activities happening across those three tools. The problem is that insights are not easily attainable when you have these silos. For example, when you have a developer — say Vish — what organization does Vish belong to or represent? Who reviewed and merged a fix, and what project, repository, and branches were used to apply it? And so on and on, because some of these questions cut across the domain of any particular tool. But these are the questions you really care to ask — Dan already brought up a few of them earlier. And, as my last bullet point says, how do you not only answer those questions about community activity, but then start to tie them to what's happening internally on your own project?

So let's talk about what Wikidsmart does in this context. In terms of what is live out there, we're taking information from a number of tools you've already seen: Launchpad, Gerrit, and GitHub. Wikidsmart works on a hub-and-spoke model. It's model-driven, which means all the concepts are formally defined — the concept of a bug, a project, a review, a commit, and so on — using a very rich formal definition of all these concepts across the full application lifecycle. So it's actually not just about bugs and commits.
It can ultimately tie a customer all the way to the bugs and cases they report, with builds, deployments, and the deployment platform somewhere in between, so you can extend it all the way from the beginning to the end. In this particular case — and this is just a demonstration of how things can be pulled together — we look at all the core concepts represented across these tools. You've got the organization and member database that the OpenStack Foundation has, and the projects it has; underneath, projects, releases, and milestones live in Launchpad, patch sets and reviews live in Gerrit, and of course you've got the repositories and branches. We pull all of that and then map and link the pieces to each other using this metamodel, so you end up with a repository that is not only pulling the data but also interrelating it. What you end up with is a very rich graph, because it no longer just holds information like X number of bugs versus developers; it has all the interrelationships between them, so you can traverse from anywhere to anywhere. You can start with an organization: who are all the contributors, what bugs have they fixed, what have they reviewed, and so on, all the way to builds and releases. And you can work backwards, so you can slice the information along pretty much any dimension: start with a project, a release, or a milestone, and run queries that tie them all together. This is really the richness of the graph and the capability that comes with Wikidsmart.

Let me go back quickly to where I began. You had information related to a bug here — "Python client library for Nova, Nova Actions is broken." In Gerrit there's a review, done and merged, and in GitHub there's the commit, obviously. So all of these different pages are referencing a single set of activities related to one bug. After integration, you literally get everything on a single page. Here's Vish, for example, and his commit activity over the past six months: information about what organization he belongs to, what he's done in terms of check-ins, reviews, and so on. Down here there's a specific line referencing the commit we just pointed out in GitHub, and the bug it's associated with, which came from Launchpad. You can click through to the actual review and the full review history of everyone involved in that activity in Gerrit. And here, on the same page, is the information pertaining to the bug itself: the commit relating to the bug, the bug, and its status — who worked on it, when, and so on. And this is only a sample. You can go to activity.openstack.org/data and you'll see lots of different pages: graphs and charts showing the activity in the last 30 days, who has done check-ins and how many times; pages by contributor, pages by organization, pages by project and repository.
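[Editor's note: to make the hub-and-spoke idea concrete, here is a hedged, toy illustration of the same pattern — an RDF graph linking an organization, a contributor, a bug, a review, and a commit, traversed with SPARQL. All class and property names are invented for illustration; this is not Wikidsmart's actual metamodel or vocabulary.]

```python
# Toy hub-and-spoke graph in the spirit of the platform's metamodel.
# The ex: vocabulary is invented, NOT the product's actual schema.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/alm#")
g = Graph()

g.add((EX.vish, EX.worksFor, EX.acme))            # contributor -> org
g.add((EX.commit42, EX.author, EX.vish))          # commit -> contributor
g.add((EX.commit42, EX.fixes, EX.bug1098))        # commit -> Launchpad bug
g.add((EX.review77, EX.approves, EX.commit42))    # Gerrit review -> commit
g.add((EX.bug1098, EX.title,
       Literal("Python client library for Nova: Nova Actions is broken")))

# Traverse anywhere-to-anywhere: which bugs were fixed, in reviewed
# commits, by contributors of a given organization?
results = g.query("""
    SELECT ?bug ?commit WHERE {
        ?person  <http://example.org/alm#worksFor> <http://example.org/alm#acme> .
        ?commit  <http://example.org/alm#author>   ?person .
        ?commit  <http://example.org/alm#fixes>    ?bug .
        ?review  <http://example.org/alm#approves> ?commit .
    }
""")
for bug, commit in results:
    print(bug, commit)
```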
So, going back to this idea: you can start from any node and find anything interrelated with it. And finally, there are lots of other aspects to this. The fact that it's a hub-and-spoke model means the information captured in this formal metamodel is accessible to any tool. We're actually building a plugin for Eclipse that will be able to access the same information you saw earlier from within Confluence, and you can access it from any other application too. It uses SPARQL, the W3C-standard RDF query language, to retrieve the information, so it's very easy to get at. And then there's faceted search. As an example, here I'm looking for a bug, and I'm searching on a specific topic, not just anything that matches a particular character string. I enter the search terms and it returns only information related to bugs — any bug in which that string matches. That's the idea of precise information. And again, if you tie this together with your internal tools and whatever is happening inside your development teams, you can see the value of having access to community content and internal content all at the same time. Anyway, I think that was it for me. Thank you, Stefano, for the opportunity.

Now, is this on? Okay, cool. So we have some extra time. Instead of you just writing things down, if you have questions or want to see more, we have plenty of time now and in the following session to discuss this. The way I envisioned it is this: since so many companies and so many people seem to be interested in, and are collecting data from, the OpenStack community, I wanted to provide one service to the members and one service to the public — access to that data in a ready-to-use, ready-to-consume form, a kind of self-service board where people can go and get the data. The main ingredient still missing: right now there is GitDM, which is out there and public, and there is activity.openstack.org, which has charts, a search engine, and all the integration we've just seen. What I think is still missing, as the next step, is to build reports and queries about you, the users — the companies and the people interested in seeing this data. So that we not only provide the raw data, but also a point-and-click place where you can get the monthly report for your boss or your manager, or the data you need for your executive report, things like that. That's why I gave you those pieces of paper to write things down — and if you have any comments, I'm very happy to take them now. Actually, Sean: it was a visit to Yahoo that sparked my interest in prioritizing this activity report. I don't know if you were there — Neil? No, not Neil. Well, somebody here asked me, one of the open source people. Yeah, I know. Anyway — have you played with activity.openstack.org? Any comments? Sorry to put you on the spot.

Yeah, you know, what I started doing when I saw that was comparing it to what Ohloh has — I forget who purchased them — oh yeah, they're under Black Duck now. And then I was looking at the report that one guy published to the community list.
I can't remember his name. Anyway, I started comparing them, and they're all slightly different; they're all pulling the data in different ways, and to be honest, I don't remember exactly the differences. What I intended to do was start building out from that what I would like to see, so I'll have to defer and say it's homework for me. Yes, I've been holding my teams responsible — not responsible, I'm calling them out for, or giving them kudos for, this kind of data. It's not currently the way it works, but I'd actually like to start having this kind of data drive annual reviews of employees and make it part of them. It's not that way now, but I'd like to see that. So I foresee this being important — it's important now, because I want to judge how active we are and what kind of contributions we're making, in code and other things — and I'd like to make it even more important in the future.

Great. With respect to what you say about differences between the different kinds of analysis — Ohloh, activity.openstack.org, and others — my vision is that what we need are ways of measuring things agreed on by the community. The discussion we had on the mailing list about how to decide who closed a ticket is one instance of that. In the end, we probably don't know exactly what Ohloh counts as a commit, but we can know exactly what, for instance, activity.openstack.org counts as a commit, or what GitDM counts as a commit, or as time-to-fix, or as who fixed what, or whichever. So my impression is that we need at least one community definition for each of these things. Once we have the definition, we can just go to the data and make the correct query, and that's it — that's not the difficult part. The difficult part is agreeing on a definition. But we could even have several definitions, in the sense that maybe your company is interested in who set the ticket to closed, some researcher is interested in who opened it, and someone else is interested in everyone who made changes to the ticket. We could have all of them if needed, and we could even have customized versions of the statistics. Because my impression, after having been in several different projects, especially large projects with companies involved, is that it's very, very difficult to agree on one way of measuring things — some companies are going to look better in one measure, or are simply interested in such-and-such kinds of things. But diversity is not that bad. What I mean is: we can have commits with this definition, and commits with that other definition, and that's it. The important thing is to know what you mean by "commit," for instance. That's my impression.

Yeah — sorry, that's a little loud. So, I've looked at activity.openstack.org; it's pretty cool. Obviously JSON exports would be great, but this ties back to something I've been working on for the talk I'm giving on Thursday: I started pulling down vulnerability information. We do OSSAs, the OpenStack Security Advisories, but we don't have a specific format — we've slowly grown into a format that changes over time. What we're starting to collect is information like who's reporting things. So I built a database off of that, as well as off the external databases that exist, like the CVEs. And the benefit there is that if we can deliver these metrics in an API format, people can use them as they need to, and areas of the projects can use them as they need to. So I think, first and foremost, when we're collecting these statistics we should think of providing them as a service API that other things can pull from and other people can tie into.
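[Editor's note: a minimal sketch of the metrics-as-a-service-API idea, using only the standard library. The endpoint path, payload fields, and numbers are hypothetical, not an actual OpenStack metrics API.]

```python
# Tiny HTTP server exposing one JSON metrics resource. Everything
# here (path, fields, values) is a hypothetical illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

METRICS = {  # hypothetical pre-computed numbers
    "release": "grizzly",
    "commits": 17847,
    "reviews": 29063,
    "bugs_fixed": 5674,
}

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics/grizzly":
            body = json.dumps(METRICS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8000), MetricsHandler).serve_forever()
```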
Measure first, and then move to the formats — that would be the way to go, because first of all you have to know which things are interesting to measure. For instance, is it interesting for your project to measure every week, every day, every month, or all of them together? Those kinds of things are important, and then you decide on the format. And I agree with you: having a well-documented API would be great.

There we go. I'm Mark Atwood from Hewlett-Packard. I've been playing with the GitDM stuff that we ourselves worked on, and with activity.openstack.org, and it's pretty awesome. The tool that I've discovered I need — before I go haring off and write it on my own — is something like Gerrit, but for entire organizations. Instead of looking at one commit, one proposed commit, and seeing how good it is, I've discovered I need a way to sit down once a month and say: show me all of the proposed commits, in reviewable format, proposed by HP employees. We've gone the rounds about the dangers of metrics and management by objective, and managers being tempted to drive the metrics; my own way of trying to prevent that from happening is that I don't want to report metrics to my managers, but I would like to be able to sit down once a month and say: here are some of the cool things that people inside my organization have done. And there's no good tooling for that right now — activity.openstack.org is close; it would need to cover both reviewed and merged changes. Where this came from is the HP OpenStack keynote coming up in a couple of days: I was given the task by the speakers to say, here are all the things that all of the teams said they did — you can figure out the next phrase. GitDM has some of that data. And for HP there actually is an authoritative source: all the HP people are supposed to be in a particular team in Launchpad. Well, "supposed to be," right? The way we fix it is that every time we find somebody who's not in it, I add them to the team and redo it again.

But looking across all the tools, each tool generates or manages its own list of affiliations, and people like Mark or me, or the others who've contributed to GitDM, will — either through just knowing somebody is changing jobs, or by looking at their activity in Launchpad or something — say: oh yeah, they moved to Company X last year, and we'll update the data. And then, of course, half the time somebody comes back and says no, no, they moved to a different company, or at a different time. So that's really hard, because there isn't one centralized repository — and even if there were one centralized repository, you could drill down really deep and nobody would care about that level of detail. So some of it, I think, is incumbent on the consumer to say: for us, for HP, we want to slice and dice this ten ways from Sunday.
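[Editor's note: a hedged sketch of the monthly "Gerrit for organizations" report Mark describes, against Gerrit's REST changes endpoint. The team roster is hypothetical and the query parameters are illustrative; Gerrit prefixes its JSON responses with a `)]}'` line that must be stripped.]

```python
# Monthly "show me my organization's proposed commits" report via
# Gerrit's REST API. Roster and query details are illustrative;
# `-age:4week` asks for changes touched in the last four weeks.
import json
import urllib.request

GERRIT = "https://review.openstack.org"      # OpenStack's Gerrit
HP_TEAM = ["dev1@hp.com", "dev2@hp.com"]     # hypothetical roster

def gerrit_changes(query):
    with urllib.request.urlopen(f"{GERRIT}/changes/?q={query}") as resp:
        raw = resp.read().decode()
    return json.loads(raw.split("\n", 1)[1])  # drop the )]}' prefix

for email in HP_TEAM:
    for change in gerrit_changes(f"owner:{email}+status:open+-age:4week"):
        print(change["project"], change["subject"])
```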
Yeah, just some comments, first in terms of interface: we are actually looking into REST and JSON for this, so you can basically get all the data. And there are a couple of different options. One is the repository hosted by the OpenStack Foundation; you can also just download Wikidsmart, install it in your own environment, and integrate it. Then, second, to your point about tooling: the stuff you're talking about, the specific questions, is exactly what this is all about. There are two aspects to it. One is that the question you asked can be answered here with just a query, so you don't really need a separate tool for it. But even beyond that, when you talk about these sub-segments, if you will, that's really the fundamental aspect of the semantic metamodel: it's set manipulation. All you need to do is define what the sets are, and there is no fixed taxonomy — it's not a one-dimensional structure. You just define the sets. It could be: I want to know contributions by people employed by HP, or by people in a particular geography, or by people with a certain education. As long as you have the information, these are simple queries. So I would definitely urge you to either throw out the questions you want answered, or play with this and see how far it can go, because it's collecting everything that's out there from these tools, so you ought to be able to mine it for a lot of things.

I think that to move forward, we should have a wider conversation online, in public, with the rest of the community. We were talking about this, and I'll throw it out there to hear what you think: we might have either a new mailing list, or we can create a topic on the openstack-dev mailing list, so we can keep talking with everybody as this grows. Yeah, I agree with that; I would go with it. And the other thing is that we're going to publish all the source code for all the tools we have right now as a project — from Bitergia, and for Wikidsmart there is already a bunch of new documentation on the Confluence page. So you can start playing: you can download the Wikidsmart-for-OpenStack version, and we're going to put that in a public repository with all the source code, so you can replicate it, take it home, and start playing with it. Because, as you said, GitDM is wonderful — it's very immediate, very simple to grasp, and very fast — but it's also limited: when you get to history, you need to run it all the time and then store the results somewhere. With this stuff, I think we have gone past the occasional use of GitDM once every release cycle, so you might want to look into that. And if there are no more comments, we might want to wrap up and either catch the last session or relax a little before the drinks.

One of our big questions — and I don't know how to answer it yet — is how do you quantify, in terms of velocity and business value and feedback, what your development teams are doing within OpenStack? From the business point of view, a lot of what I've done over the last year is just explain that giving a date is a great thing, but there's all of this process when you're using an open source development model.
And we still want to be able to quantify it the way we would any other software development — where you own everything and do it all in-house — because it makes people feel good when they can see: I got 30 points done, or whatever model or methodology you're using. Really, our management isn't that interested in it; it's just me. It's for me, to help explain the story of what it means to be in an open source development model. I work for Rackspace; I'm the Deploy Infrastructure Software Manager over there, so I do a lot of what Monty Taylor and Jim Blair do, but just for the Rackspace people, and then we feed everything back. My name is Rainia. One of the things I've been really, really working toward is measuring how long it takes from the time a bug is submitted, to the time someone picks it up, to the time someone approves it, to the time it's actually available to be deployed. Because really, when we measure "done," that's what we're talking about: I found it, I have this idea — whether it's a bug or a feature. It gets really hard with blueprints, when they're spread so far across so many different patches and pieces, to know when it is done. And for us there's also internal project tracking, because some of these items might be a bug in Launchpad or a blueprint, but they're also a story for us. So it's about finding a way to link the two — whatever you're using, be it Jira or Bugzilla; we have the unfortunate situation of using VersionOne — finding ways to get those to talk, so that we can start quantifying.

On one of those I would hand over to Sanjiva, because I know he has answers for you. The first one is about the metrics you're interested in that we already have in the public-facing tools: Launchpad, Gerrit, and Git. What I think is worth doing is to sit down with you — since you have very specific, already thought-out ideas — and write a user story, or a few user stories; let's write an epic and understand exactly what you want, because we can build it. I'm sure we already have all the tools; we just need to find the queries that will satisfy your needs. It's already done. Sure, absolutely. Of course. And that's where you and I — or Sanjiva and Jesús and Dan, if you want to join in — write down this epic. It will also make you a little familiar with the tooling we use on this side, and since it's all public, you may even want to devote an engineer, or half an engineer, every now and then to contribute: your team can learn the SPARQL query language and write your own queries against the database that is already out there, okay?

The second part is the connection between that vision and your internal tools, and that's where Wikidsmart comes in. On that one — and this is the thing we covered earlier — the whole point of Wikidsmart as an integration platform is that you can put tools on top of it. If they support REST, for example, it's actually very easy to pull information: for Launchpad, GitHub, and Gerrit we have a single generic REST importer; you configure it and start pulling. However, for other tools like Jira we have a very sophisticated plugin, and we have connectors for Salesforce, Selenium, and so on — all these different tools.
So even in a situation where you have two VersionOnes, one Jira, and one Bugzilla, you can pull them all together and map the concepts. If you want to map a Jira epic to something else, or a story to something else, you can do that — underneath, that's the semantic reconciliation you're doing when you do the mapping. And it's very easy to configure; once it's configured, the same dashboards and reporting come into play. Basically, what Wikidsmart can do is run on your end, so that when you write an epic, that epic becomes a blueprint at the click of a button; and the comments on the epic, or the breakdown of the epic into user stories, can also be automated and pushed into Launchpad. This is something you two can talk about — they have a booth right there, you know? Yeah.

I would like to come back and try to specialize that a bit, because the cases are not that different from each other. With the security stuff, for instance, we deal with dependencies, right? We work in other Python libraries that we import, and we also do work there — there are OpenStack developers who go in and fix things like Python eventlet, because it needs to be fixed. The point is to identify what you want to measure, and you may have to measure many different things. For instance, you can compare your branch with the upstream branch and see where commits were inserted, and so on (a sketch of that comparison follows at the end). Or you can track information by person: some people related to OpenStack are working in such-and-such Python library, but that work is for OpenStack, so you can measure that too. Again, it's a matter of first learning what you want to know; once you know, we can compare it with the upstream. And that's quite important, because if you look at all the repositories, about half of them are really clones of other repositories. Right — so I say we can call it a day and go get drinks after the talk.
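[Editor's note: a hedged sketch of the fork-versus-upstream comparison mentioned in the closing discussion — listing the commits on a branch that upstream doesn't have. The remote and branch names are illustrative.]

```python
# List commits present on a fork but not upstream. `git cherry`
# marks with "+" the commits not yet upstream (and "-" those whose
# changes already exist upstream).
import subprocess

def commits_not_upstream(repo, upstream="origin/master", branch="fork/master"):
    out = subprocess.run(
        ["git", "cherry", upstream, branch],
        cwd=repo, capture_output=True, text=True, check=True).stdout
    return [line[2:] for line in out.splitlines() if line.startswith("+")]

for sha in commits_not_upstream("oslo-fork"):   # hypothetical checkout
    print(sha)
```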