Alright. Welcome, everyone, and thanks for coming to my session. My name is Ildikó Váncsa. I recently joined the OpenStack Foundation as Ecosystem Technical Lead, and one of my responsibilities in this role is to help keep the ecosystem healthy. If you look at these numbers, our community is quite large. We have people from many countries, so we have many cultures; it is a great mixture of people, mindsets and ways of thinking. We have many companies around, and the codebase just keeps growing. These numbers are nice and large, and we would like to keep them stable or growing. But how do you operate in an environment that is this dynamic at this size?
OpenStack is a software package, an open source cloud platform, and we can look at it from multiple perspectives. One of these is the business view. If you take care of a company's business, you might be interested in the user surveys that we run continuously, because from a business perspective it is really important to have the latest data on what the market looks like and how the adoption of our platform is going. It is important for us as the Foundation and as the community to know how successful the software we are developing is, and it is just as important for the companies that use it and build their business on top of it. As you can see on the slide, we reach more than 2,500 users in the timeframe of one year. But I need to remind you that this is not a comprehensive market study. The user surveys are run with a community mindset, so participation is voluntary.
Therefore, when you find information on which projects are used the most and which new and upcoming projects people are interested in, those are good numbers, but you need to know that they should not drive your business or technical decisions.
On the other hand, OpenStack is not just the software package but also the more than 60,000 people who are working on it. So it is very important that we are all aware of how the community operates and how we as individuals operate within it. When you join us and start to participate, you start to recognize that it is a little bit chaotic. There are so many moving parts that the ways you used to track how efficient you are, the deadlines you set as targets, and the way you collaborated with your team within your own company look almost the same, but they no longer work the same way, because of all these moving parts and because people keep coming and going. It is really not a stable environment, so we cannot treat it as stable from a metrics perspective either.
One very important thing right at the beginning: the numbers you can find as publicly available data are not there just to be increased or to play a metrics game with. They are there for your information, so that you can improve and help the community improve by being a good community citizen. What we provide is really just a basic set of metrics. We as the Foundation, or the community itself, do not combine any of the data that is available. We are trying to give you a package from which you can build your own system to track yourself or how the community is doing.
I would also like to draw attention to the fact that no matter how simple a metric is and how straightforward its definition, it still depends on who looks at the number, who tries to interpret it, and what they are trying to use it for. It will not mean the same thing when it is used in a different context. What we also find important is that the data we have can be collected automatically, so what you see online is usually the latest value.
As I see, there is still not that good a mixture of women and men in this room. This is quite a new activity, and this is the first diagram, which I personally only saw this morning: the Women of OpenStack group, in collaboration with [inaudible], started a diversity program that collects metrics to check the trends in this area and see how we are doing with different types of diversity, such as gender diversity. As you can see on the figure, we are doing better, so I was hoping to see more women in the room. If you or your company are interested in this number or in participating in the activity, please reach out either to me or to the Women of OpenStack group, or just follow these activities, because the data is really interesting. It shows that we are on track and on the journey, but we still have homework to do and room to improve further.
So what tools do we have that offer you data? Basically there are two publicly available ones. One is Stackalytics; I would guess that many of you have at least heard about it or are maybe using it already. The other one is a Bitergia-based activity dashboard. You can find different metrics in both of them. I just pasted screenshots here to give you an idea of what we have.
The community view is really about the numbers of collaboration and contribution. In this sense we have, for example, commits, emails, lines of code (which I personally totally hate) and reviews. In the other tool you can see that we try to follow how the tracking systems are used, and how busy IRC, our text-based communication channel, is and how active and responsive people are on that medium.
Looking at these metrics, I would like to talk a little more about the ones that are most commonly used and sometimes also misused. I have past experience of people thinking that if a number grows rapidly, that is simply good and will keep their managers happy, or of a manager thinking that he or she will be happy if the team operates like this. That is not really true. In most cases it is not true, especially if you are looking at just one simple number.
For example, if you just type stackalytics.com into the browser, the first metric you will see is code reviews, because all code changes are reviewed by the project teams. It is a very important number in the sense that it shows how much the code is double and triple checked before it becomes part of the software package that we are offering. It is also something that developers would rather always do tomorrow, or next week, or next month, or only if their managers ask for it. I would like to draw attention to this one because it is really important to do valuable reviews. Don't be afraid of a growing number of minus ones (a minus one means "I think you should work more on this code"), because those are the most valuable reviews in most cases. Blindly giving plus ones without any comments, where a plus one means that you agree with the code change and think it is good but don't have the power to merge it, is just not beneficial for the project teams, for the community, or for the quality of the code base.
So when you as a developer, or as a manager whose team starts to work upstream in open source, think about this, it is really important to avoid practices like "what if we plus one 10 or 20 patches every single day". I have seen something like 400 reviews in 10 days, and you can see that something is really wrong there. The unfortunate thing is that in many of these cases it is not really about growing the number; people do it by mistake, without thinking about what it really means. And from the community perspective it is interpreted badly, because you can be seen as a bad rather than a good
community citizen, and this is something that you would really like to avoid. So code reviews are something you need to do in a clever way. For example, when you as a developer look at your numbers, because sometimes you just check how you are operating and progressing within the community, and you see that those numbers are too high, you might consider that the balance between reviews and your other activities is not okay. Especially if you grew the number too quickly and gave too many minus ones, that usually means you need to check back on all those changes later, so you can easily overload yourself. These numbers can be used to find the balance in your own work and a good way to do your everyday tasks, whether you are trying to become part of a developer team or are already part of one.
We have lovely numbers like commits and lines of code. These are favorites, and they are usually handled as a competition; at least people often look at them as who has the most commits and who has the most lines changed in the code. To clarify the two terms in case any of you don't know them: a commit is a chunk of code or documentation change that got merged and became part of the software package that we are offering. Lines of code means the number of lines of code or documentation that got changed. When we try to rationalize things, a high number can come either from many commits that each change only a few lines, or from a few larger changes.
What would you usually use these for? For example, to make sure that your strategy is followed by your activities. When you work for a company and there is a product that you are developing, you know which projects you need, and
most probably those are the projects where you would like to see more and more commits and lines of code in these metrics. If you just focus on being part of the community by growing the numbers overall, without checking where you are active and where you are not, this usually means that you get distracted, and the effort that you put into community contribution and collaboration might be wasted. One important thing about open source and how we work together is that you need to have influence within the team and project where you have business or personal interest, because there is really no other way to influence the direction in which the projects and the code are going. The only way is to be active in those places that are really important for you and your business, as opposed to randomly trying to generate a large number of changes here and there. It is also my personal experience that people sometimes misinterpret this whole thing and try to use it more as a marketing opportunity than as really strategic work.
We have one interesting metric; I think I figured out what it really means a year ago, and I have been in OpenStack for more than three years now. The person-day effort number counts the days on which you had any sort of upstream activity, for example you sent out a mail, uploaded a new patch set for review, or did a code review. This can be really useful just to check how much time and effort you put into upstream and open source work. Again, I don't think that this one number alone will mean that much to you, but used as
part of a group of numbers, it can be a really good indicator for comparing the effort that you put into the job with the outcome that you have at the end of the day.
Beyond this, we are also tracking other things like blueprints and bug reports. These show more how busy a project is and how its maintenance work is going. As an individual I don't think you will use them much, but from the perspective of the project you are working in, this is again an interesting and somewhat important set of numbers. If you want to use it to track your own work, an interesting number can be the ratio between your submitted blueprints and your implemented blueprints. If you are submitting many blueprints and none of them gets to the implemented stage, this is feedback that you are not doing something the right way. There can be multiple reasons for this, but most probably the root cause is communication. One of the things that is really important in a community like OpenStack is communicating what you would like to do and how you would like to solve your problems, and, if someone has a comment on your idea or on your code change, how you agree on the best solution with which you can move forward. So if you are not successful with bringing new feature ideas through, you might revisit and rethink how you describe your ideas, how you discuss them with other people within each project, and how you handle the whole process from introducing the idea until it gets implemented and fully functional.
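The person-day effort and the blueprint ratio described in the talk can be illustrated with a minimal Python sketch. This is not how Stackalytics computes its numbers; the function names, the choice of UTC calendar days, and the sample events are assumptions made purely for illustration.

```python
from datetime import datetime, timezone


def person_day_effort(event_times):
    """Count distinct UTC calendar days with at least one upstream event.

    Assumption: an "event" is any mail, patch set upload or code review,
    given as a timezone-aware datetime.
    """
    return len({t.astimezone(timezone.utc).date() for t in event_times})


def blueprint_success_ratio(submitted, implemented):
    """Share of submitted blueprints that reached the implemented stage."""
    if submitted == 0:
        return None  # nothing submitted, so the ratio is undefined
    return implemented / submitted


# Hypothetical activity: two events on one day, one event on another day.
events = [
    datetime(2017, 5, 8, 9, 30, tzinfo=timezone.utc),   # mail sent
    datetime(2017, 5, 8, 16, 5, tzinfo=timezone.utc),   # patch set uploaded
    datetime(2017, 5, 10, 11, 0, tzinfo=timezone.utc),  # code review done
]
print(person_day_effort(events))                             # 2
print(blueprint_success_ratio(submitted=8, implemented=2))   # 0.25
```

A low ratio here is only a prompt to revisit how ideas are communicated, in line with the point made in the talk; it says nothing on its own about the quality of the blueprints.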
Regarding the tools, one of the most useful things, I think, is that in Stackalytics you can find a people view, or rather a person view, with activity reports for each and every individual who contributes to the OpenStack code base. The question often comes up: "I got stuck with something and don't know the solution. I tried to reach out to people on IRC, but I wasn't successful. I also have problems with time zones, because people are from all around the globe, so it is really hard to make sure that many people are around when I am awake and active." In this activity view you can find a diagram with smaller and larger dots that shows the UTC hours in which the person is active. This one is mine. As homework, you can figure out from it which time zone I might be in, although it may be a little misleading, because I mostly stay up late at night as opposed to getting up early in the morning. You may not be able to locate the exact time zone, but you will be really close. If you are interested in the project and area that I am working on and your active hours overlap with mine, you can reach out to me on IRC directly, because you will have a better chance of getting a response. So if you are struggling with this, you can use this graph, which is really useful, because many people struggle with how to communicate effectively and how to deal with time zone differences, which are really inconvenient.
Beyond this, you can also find all the numbers that a person produced, some information like Launchpad ID, GitHub and Gerrit ID, and the projects in which the person participates as a core reviewer or core developer. This is also important information if you would like to reach out to the person and to identify
what the person is working on, so you find common ground for your discussions.
The other tool that I mentioned is the one based on Bitergia, which is activity.openstack.org. I think it is a rather lonely site, because most of the people I know use Stackalytics. This one gives more of an overview, as opposed to going into details like per-project metrics. You can see the overall activity regarding commits, for example, and you can also find data for companies, but you will not find which OpenStack project produced the different chunks of the data. You can also find information here about IRC, for example, which we don't have on Stackalytics, or at least not yet. It can be an interesting and important metric in the sense that you will see who our people persons are, who will be more likely to answer you or be available on IRC. I would still not recommend starting with the person at the very top of the list, but at least you have an idea of who might be your friend and who might help you with the process if you need information.
Okay, so as I mentioned in the beginning, it is really important with what mindset you look at the numbers, because they will have a totally different meaning for you depending on it.
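The idea of using the UTC-hour activity diagram from the person view to find a good time to reach someone on IRC can be sketched in a few lines of Python. The input format (a plain list of UTC hours, one per recorded event), the `min_events` threshold and the sample data are all hypothetical; the tools themselves only show the diagram.

```python
from collections import Counter


def active_hours(event_hours_utc, min_events=2):
    """Return the set of UTC hours with at least `min_events` recorded events."""
    counts = Counter(event_hours_utc)
    return {hour for hour, n in counts.items() if n >= min_events}


def overlap(mine, theirs):
    """Sorted UTC hours in which both contributors tend to be active."""
    return sorted(mine & theirs)


# Hypothetical event hours read off two activity diagrams.
my_hours = active_hours([7, 7, 8, 8, 9, 9, 14, 14, 15])
their_hours = active_hours([13, 14, 14, 15, 15, 20, 20, 21, 21])
print(overlap(my_hours, their_hours))  # [14]
```

With this sample data, 14:00 UTC is the only hour in which both people are regularly active, so that would be the most promising time to try IRC.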
For example, when you are in the management layer and more interested in the business and product strategy view of the development process, you will be interested in seeing some numbers stay stable or grow as a trend. On the other hand, you also need to ensure that all the effort your team puts into open source and upstream work is focused on the areas that are important for your product strategy. These do not necessarily have to be only the projects that you are using today. If you are planning to bring more projects into your production environment or into your product, you might start to invest in those areas, build a team, and gain influence within the projects that you will need in the future. Then, by the time you actually get to the point where you need bugs fixed or new features introduced, you will already have stable ground under your feet, you will be much better able to keep your timelines, and you will be successful with the contributions and product plans that you have.
I think it is true for multiple aspects that it is usually better to put together a metrics package that combines the numbers you find available, as opposed to looking at them as single values. If you just want a very high-level view of whether something is moving or not, you can use one number. But if you would also like to check the process that you have internally and the ways of working that you are trying to follow with the community, then you need to compare how much effort went into the work with how much outcome you have, and how successful and productive the team is.
As an individual, I was personally far more interested in these numbers when I was a newcomer starting my journey with OpenStack, because they were a really good way to see how I as an individual was succeeding or not succeeding in the community, even though I was contributing on behalf of a company. It can even be a self-motivation thing when you see that you have commits merged and reviews ongoing, and you start to see all the positive aspects of your community involvement. What you usually focus on here is whether the way you are doing things is efficient enough. For example, you can track your person-day effort along with your reviews and the commits that you had, and treat the combination as an overall number that you would like to stabilize or grow. This way you can see that the work you are doing has a certain outcome in the timeframe in which you planned to get it done.
Basically this was the information I wanted to share with you. We have quite some time for questions, which I am happy to answer if you have any. Can you come up to the mic, because we are recording?
Hello. Can these metrics be used for improving the code quality and for improving the education of newcomers to the community?
Regarding code quality, I would say that if the review numbers, with the balance of minus ones and plus ones, compared to the project size, are a good set of numbers and are going up rather than down, that means that people are actively reviewing the code. I think it is more for you as an individual who participates and who tries to ensure that the code you are working on will be stable and easy to maintain. The review number is the one that you can concentrate on.
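The balance of minus ones and plus ones, together with the review volume mentioned earlier in the talk (such as 400 reviews in 10 days), can be looked at with a small Python sketch. The function names and both thresholds are hypothetical assumptions, and the result is meant only as a prompt for a closer human look, not as a verdict on anyone's reviewing.

```python
def review_summary(votes, days):
    """Summarise review votes (+1 or -1) cast over a period of `days` days."""
    total = len(votes)
    negative = sum(1 for v in votes if v < 0)
    return {
        "reviews_per_day": total / days if days > 0 else 0.0,
        "negative_share": negative / total if total else 0.0,
    }


def worth_a_closer_look(summary, max_per_day=20.0, min_negative_share=0.05):
    """Flag very high volume combined with almost no critical votes.

    Both thresholds are hypothetical illustration values.
    """
    return (summary["reviews_per_day"] > max_per_day
            and summary["negative_share"] < min_negative_share)


blind = review_summary([+1] * 400, days=10)                # 400 bare +1s in 10 days
balanced = review_summary([+1] * 30 + [-1] * 10, days=10)  # 40 reviews, a quarter critical
print(blind, worth_a_closer_look(blind))
print(balanced, worth_a_closer_look(balanced))
```

In this made-up data the first reviewer averages 40 reviews a day without a single minus one and is flagged, while the second one, with 4 reviews a day and a quarter of them critical, is not.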
Beyond this, I think we do not track many things that would explicitly tell you about code quality, although I know about activities where, for example, universities have data and work on code quality improvements based on it. I think in most cases they also make the data public, so it is available. But so far we as a community do not really track those things, and by data I mean, for example, code complexity or code duplication. We do not have these available as a community, but we have teams and people around the community who are trying to keep an eye on this aspect as well. I can only encourage you, if you are interested in these areas and it is something that you are already doing, to please join and provide us information, because addressing these aspects is really important from the point of view of maintainability and efficient code development.
Regarding newcomers, I would say that we are offering trainings, and we are trying to teach them how to use these numbers better and not to misread them. This is mostly what we have in connection with onboarding new people and using these numbers. Any further questions? Yes?
Hi, thank you for the presentation. Just to mention for the audience: we are part of the Activity Board, but we haven't had proper support for it for the last year or so. We ended, let's say, the specific relationship with the Foundation for that product, so that may be the reason why you may miss some information there. Having said all of this, here is my question.
My personal perception is that having specific metrics, like the top people contributing in terms of commits and so on, is gamification, and once you have a metric, people may try to cheat on it. Individual metrics are maybe, as you say, something you should use to improve yourself. But from the Foundation's perspective, maybe you should look for general trends, and at how to improve performance or remove bottlenecks, and probably not go for individuals. I would like to know your opinion about this.
You mean not track individuals, but also try to...
I mean, the problem with tracking individuals is that if you say this is the metric we are using, someone can look better than others just by adding more commits, or smaller commits, you know. So how do you fix this?
That is a really good question, and to be honest I don't think we have the answer to it. This is a discussion that pops up every now and then. I think it was half a year ago that I read a quite long mail thread about it, and I think it was about code reviews in particular. My personal view is that it would be better to provide some education along with the metrics, because I really don't think that you can correlate these metrics, or do anything else with them, in a way that guarantees people will never try to game them. But we can provide examples of the ways in which they are really useful, and also let people know that we follow these numbers and reach out, at the community level and community-wide, not really directly as the Foundation, to people who follow these patterns. I think it is better to provide the information beforehand than to go after people and tell them that whatever their reason for doing what they are doing, it is wrong, that it is recognized by individuals in the community, and that it gets a negative
response. Beyond that, I don't really know how to fix this, but it is definitely an interesting problem. Part of the reasoning behind this talk is to share information, spread the word, and give people the idea that if they put effort into using these numbers the right way, they can be far more successful in the community than if they just try to compete number by number. Any further questions?
Hi, I work on a community team, and we have goals to be number X and number Y in reviews and such. This encourages poor behavior, like quantity over quality, as you mentioned before, and we see things like drive-by reviews, drive-by commits, and even co-authorship. What kind of goals would you recommend for a team instead?
I would say that the number of commits is at least something that gets merged, so if you are using the number of commits, that activity was still useful for the community. But if you do this with reviews or patch sets, because we even track how many versions of a patch you upload, it can easily end up in behaviors like rebasing everything every second day, because that increases these numbers. What we used when I was part of a team of contributors, I think, was that we multiplied the person-day effort number, the number of negative reviews and the number of commits, and checked this overall number. It was feedback for us on how we were operating. For example, if you don't do reviews, or you never thoroughly go through a patch to see whether it is good or not, and you never had a comment on any patch, then there is something wrong. When that number is zero, your overall number will be zero. You can have, let's say, 2,000 commits, but your overall number will still be zero. That is feedback that what you're doing is
not the best community citizenship that you can show. The same is true for commits: you might be really successful with reviewing, which is really important, but if you are part of a development team, then your goal is also to get your code changes, be it a bug fix, a feature or part of a feature, into the code base.
Something else that I pay attention to in metrics is disagreements in reviews. Would you consider using those as part of your goals? Do you know what disagreements are in terms of Stackalytics? It is when a core contributor gives a minus one after you gave a plus one, or vice versa.
Yeah. I wouldn't recommend putting a minus one there just because you want to get that number up. But if you have disagreements, I usually consider that a positive sign, in the sense that no matter who gave a plus two or plus one to a patch, you went through it, and if you found even the smallest thing that does not look okay, you still put a minus one on it. So in general, having that number larger than zero is definitely positive. But if it grows too rapidly, I think it should be an indication that you might be disagreeing not for a reason but just to get that number up, so gaming the system again. I think it works both ways.
But anyway, we are four minutes over time, so thank you, everyone, for attending, and happy participation and summit.
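The multiplied team score described in the Q&A (person-day effort times negative reviews times commits) can be written down as a minimal Python sketch. The function name and the sample figures other than the 2,000 commits are hypothetical, and the talk itself only says "I think we multiplied", so this is an illustration of the idea rather than the team's exact formula.

```python
def composite_score(person_days, negative_reviews, commits):
    """Multiply the three numbers so that a zero in any one zeroes the total."""
    return person_days * negative_reviews * commits


# Many commits but never a critical review: the overall number stays zero.
print(composite_score(person_days=120, negative_reviews=0, commits=2000))  # 0

# A more balanced contributor gets a non-zero overall number.
print(composite_score(person_days=30, negative_reviews=12, commits=8))     # 2880
```

Multiplying rather than adding is what makes the score unforgiving of a missing activity: a contributor cannot compensate for never reviewing critically by piling up commits.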