 Is there some event that's about to finish that I should wait for? We're just finishing lunchtime, I guess. So, should I wait another 10 minutes for people to go before coming here? I have a question. 5 minutes? 5 minutes. If you wait, I'll take a jump and tell them to come back in. Okay. What's the list of events that you can't set up in the lunchtime? I mean, I've seen exactly what you said on video several times already, so it feels really good. I think it was particularly interesting. It was shocking, but it's obviously been a great success. I don't know where this comes from, but somebody may get hurt. Okay, I guess I can begin. Good afternoon, and welcome to Debian Contributors, one year later. Raise your hand if you already know about Debian Contributors. Basically all. Okay. Almost all. Okay. Anyway, we are in the field of the Debian Diversity Statement. Who hasn't read the Debian Diversity Statement yet? It's nice to read it again. The Debian Project welcomes and encourages participation by everyone. No matter how you identify yourself or how others perceive you: we welcome you. We welcome contributions from everyone as long as they interact constructively with our community. While much of the work for our project is technical in nature, we value and encourage contributions from those with expertise in other areas. That's the context in which Debian Contributors moves. The executive summary of Debian Contributors is that everyone who contributes to Debian is now officially a Debian Contributor, with uppercase initials. We did not have a name for people doing stuff in Debian that was not package maintenance, unless they were Debian developers. Before they were Debian developers, they were nobody. And that had to be fixed. That has been fixed. They are now Debian Contributors, with uppercase initials. There is no process or approval to become a Debian contributor. It's a do-ocracy thing. 
The moment you do something for Debian, you are a Debian contributor. And what you gain from being a Debian contributor is to have your name on a list of Debian contributors. So at least that means that the Debian project officially recognizes that you exist, which I think is very important. And it's important to say that Debian developers are not automatically Debian contributors. If a Debian developer does not contribute to Debian for some time, then they haven't been Debian contributors in that time. Right? So it's about what people do, not about what people are. Then, of course, there are implementation details. You can choose not to have your name on the list, for example for privacy reasons, and you can see a summary of what you do, get thanked, get reputation. The basic things that should be granted to people doing work with us. And let's see. So there's a website with lists of Debian contributors, and there's some infrastructure for collecting data. Let's see the website. So there are people, and there are teams. And the website has contributors grouped by year. So we know that since the beginning of 2014, we have 1,642 known people as Debian contributors. These are the one thousand and something people. And these are teams that reported any activity. There's a microphone. Who was first? Does that number include people who have chosen not to be listed? That number does not include people who have chosen not to be listed. Barring bugs in the website. You have a bug that lists me twice. We can fix that right now. If you scroll down, I have a noodles at Alioth and a noodles at Debian. Let's look. Not intuitive if I search for Jonathan. One and two. So that's noodles with a fingerprint and that's the email. So we remove your email from here, and we add your email here. Is it noodles? Yes. Oh, dear. 
Also it's clear to fall. So have you rate all of the noodles contributions? Sad face. Do you also want noodles, Debian, wiki? To Enrico. Okay, we can fix that. Is this still new? That's not me. But it could be. Or it could be Enrico. Debian Contributors sees us now as one person. Okay, and again. So we all identify ourselves as noodles, no matter how we see ourselves. Yeah. Okay. Good. It should be fine now, and tomorrow it will merge. What's the point of the teams? So the question was, what's the point of teams? The point of teams is that teams are responsible for sending contributor data. I am not going to maintain all the information about everyone that contributes to Debian in any possible way, anywhere in the world, myself, by hand. So teams report data, which also gives us a nice way to track team health. This way we can see if teams are still contributing to Debian or not. So teams... No, people are contributors. That sentence says that this amount of people and 15 teams contribute to Debian. Yeah, well... In 15 teams today. And some of those outside the teams. Okay, can you send me a patch? Okay, so there's the site. We can click on a person. I'm Enrico. Dammit. I see too many things. I'll look at myself so I don't violate people's privacy. Flowing thing. So there's my personal page. I can see all the identifiers by which I'm known. And for each of them I see what contribution in what team is known, and over what time span. So it tells me that, as far as the site knows, I've been uploading packages from 2006 to 2013. Which is not correct. Good, this is the old key. The new key will be associated as we speak. So there's a list of all my identifiers. I can see what the site knows about me. There's a log of what happened. I think this log should include the unassociated noodles. So the idea is that the modifications in the site are auditable. 
So everything that happens to personal information, every change, is auditable. This one is very auditable. Every nightly maintenance seems to bug. I can change my name. And I can hide activity from all my identifiers, or from any of them, if for some reason I don't want some of them to be seen. It is possible to see just the teams, to see contributions broken down by team. So the DebConf subtitling team seems to have become active again since the beginning of DebConf, or something like that. And there are statistics. So we currently have more than 5,000 emails, 1,000 key fingerprints and 2,600 login names known as contributors in the site. There is redundancy. So people will have their Alioth and their Debian login name in here. People have multiple email addresses. There's no redundancy. Well, fingerprints are just what's in the Debian keyring. I need to have a look. Those are the ones that are shown by the website. Emails are not shown. The only thing that is shown about you is the person; the identifiers you manage individually. You are the only person that can see them. And these are the numbers of identifiers that are known to the site, but are not shown in the site. So we have 33,000 emails coming from the BTS that people have not claimed and those will be obtained. But the point of showing these numbers is that tracking contributors hints that Debian is bigger than we are used to thinking. Generally we think that Debian is about 1,000 developers, one third of which are missing in action. It could be that there are one or two orders of magnitude more contributors that we don't know about. The idea is to find out. I had slides once. I had slides. Since it says contributors until 2013, so it splits by last known year of contribution, it automatically cleans out people who become inactive. While still crediting them, because one can see who was active until 2013, so if somebody becomes inactive, they will still be credited in the website. 
They will just not show as current contributors. So yes, self-managing in that sense. A design choice for the whole thing is that we do not need to be perfect. We credit people the best that we can. But crediting everyone is something that I believe is impossible, due to the size of Debian and the fact that there is no clear line between who contributes and who does not contribute to Debian. We could say that Debian users are Debian contributors in a way, and we wouldn't know. If somebody gives a talk at the local Linux user group about how cool it is to do something with Debian, we may not know, but they would be in a way Debian contributors because, well, that's Debian marketing. So we know that we cannot credit everyone. So there's no point in trying to be perfect in the website. The intention is to make it doable. So yeah, it's lossy: the time granularity is a month instead of a second. There's no tracking of gaps. I'm not particularly motivated to add it. There's no guarantee of latency. As I told Jonathan, that change will show in the website maybe tomorrow. And there's no guarantee of catching everyone. But the point is to catch most people. To submit data to the website, there is a simple protocol which is documented here. And there is a data mining tool that is documented here, which is both in Debian and on Debian service machines. The lists host has it installed so it can mine mailing lists. Alioth has it installed so it can mine git repositories. And it's on any machine where it makes sense. If it's needed on more machines, we can add it to more. And the data submission is designed so that teams do not need to go out of their way to track people. If a team only has a text file saying these are the members today, the site is able to deal with it. There's no need to remember the initial date or the end date of a contribution. Let's see. Will it in the right browser? Yes. So yeah, protocol documented here. 
I won't go into details, but it's simple. And there's a data mining tool with which you could do something like this. This is a configuration file. And this will look at all the git repositories in collab-maint, look at the ownership of files in .git/refs and .git/objects, and build a contributor list from that using the timestamps of the files. And submit it to the website. To submit it to the website there needs to be a bit at the top giving the data source name, which is taken from the file name if you want. There are other data mining methods. This one just scans directories and looks at file ownership. So if you have some random file area where people put stuff, you can see when people modified files. This one will run git log on git repositories and look at the emails in the commit logs. There's a difference between looking at file ownership in .git and looking at emails in the git log. If I package a big upstream project, say Samba, and I clone the Samba repository, add my branch, and push it to collab-maint, then if I scan git logs for contributions, all Samba upstreams become Debian contributors. Which would not be desirable, I guess. Well, it would be desirable if it really were so, but not if we misrepresent them. Looking at file ownership, instead, will actually track who did the push, but it may lose information when the git repository is packed, or if I merge patches that were sent to me via email. So it's up to each project, each team in Debian, to figure out their workflow and what's the best way to mine information out of their repositories. There's another data mining method that takes From addresses from a mailing list archive. So if you have a team mailing list, you can use this to get everyone who ever posted to the list on the site. And you can blacklist and whitelist emails. 
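As a rough illustration of the mailing list method just described, here is a minimal Python sketch. It is not the actual data mining tool: the function name `mine_from_addresses`, its parameters, and the sample messages are all assumptions made up for this example. It only shows the idea of reading From addresses, applying a whitelist and a blacklist, and keeping a lossy first month and last month per address.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def mine_from_addresses(messages, whitelist=None, blacklist=frozenset()):
    """Map each accepted From address to a (first_month, last_month) pair.

    Hypothetical helper: month granularity, no tracking of gaps.
    """
    spans = {}
    for msg in messages:
        address = parseaddr(msg.get("From", ""))[1].lower()
        if not address or address in blacklist:
            continue
        if whitelist is not None and address not in whitelist:
            continue
        raw_date = msg.get("Date")
        if raw_date is None:
            continue
        try:
            # Truncate to the first day of the month: a month is precise enough.
            month = parsedate_to_datetime(raw_date).date().replace(day=1)
        except (TypeError, ValueError):
            continue  # unparseable Date header: skip it, the data is lossy anyway
        first, last = spans.get(address, (month, month))
        spans[address] = (min(first, month), max(last, month))
    return spans


# Made-up sample archive, for illustration only.
SAMPLE = [
    "From: Alice <alice@example.org>\nDate: Mon, 03 Mar 2014 10:00:00 +0000\n\nhello",
    "From: Alice <alice@example.org>\nDate: Mon, 25 Aug 2014 09:30:00 +0000\n\nhello again",
    "From: Spammer <spam@example.net>\nDate: Tue, 04 Mar 2014 11:00:00 +0000\n\nbuy now",
]

demo = mine_from_addresses(
    (message_from_string(text) for text in SAMPLE),
    blacklist={"spam@example.net"},
)
for address, (first, last) in sorted(demo.items()):
    print(address, first.isoformat(), last.isoformat())
```

For a real archive, the messages could come from iterating over `mailbox.mbox(path)` from the standard library, with a whitelist holding a team's known membership.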
So one nice thing that can be done with this email scanning is: if I have, say, the Debian publicity team, which has a known membership, I can put the known membership in the whitelist and have the data mining script figure out contribution time spans by looking at the list archives. There's another one that does arbitrary SQL queries to Postgres, which is pretty good to get data out of UDD or projectb. All the examples are functional data sources. That is actually what is used to get uploads, unless it's been tweaked by the FTP team. There's one that scans subversion directories, and that's it. More can be added. It's a generic tool which documents itself. It generates its own documentation. So as a new data source is added, we automatically get it documented because, yes. There is also a to-do list page, with a to-do list of simple things that can be implemented on the website if somebody wants to get acquainted with the code. And the to-do list page also has a list of data sources that could be added, and pointers to all the relevant development mailing lists. And it's linked from the bottom of every page in the website. So the source code is here, and how to help is here. But back to the slides, maybe. During the website demo, we've seen that there's identity management. So there are people and there are identifiers. I have a bunch of email addresses, and two GPG keys, one of which is old and replaced by the 4096 one. I have an Alioth login and a Debian login. There are people that have two Debian logins for historical reasons. It's possible to merge them all and say they belong to me. So an identifier can be claimed. As we've seen earlier, I claimed Noodles' email address. And I unclaimed it and claimed it for him. And one can manage visibility. We have not yet... Well, the person that promised me they would do it hasn't yet implemented claiming email addresses for everyone. It would have been like: this email address is me, send a challenge. If one replies to the challenge, then it's confirmed. 
That has not been implemented yet, but I'm hoping that it gets done at some point. But a couple of days ago, I implemented the possibility for any DD to arbitrarily claim and unclaim any identifier for anyone. So if you are a member of a team and somebody in the team complains that they don't show up in the site, you can see what email, what GPG key they use to contribute, and go and update it. Do not do it without the consent of the person. So don't go and add everyone's known email address for anyone in your team, because maybe they don't want that to happen. This identity management is something that was missing in Debian. Well, there was the MIA team doing something like this. That is the Missing In Action team. They track maintainers that are not active anymore. So they need to distinguish whether somebody is inactive or whether somebody has just changed email address. So they did something like this. This is identity management done by design, intentionally, from day one, and allowing anyone to manage everything related to them, including visibility. So there's been some talk about merging efforts with the MIA team, because this could also be the master database for them. But yeah, it's been just some corridor discussion at this DebConf. It also means that this could be a database of Debian people that could be used by anyone that would like to do anything cool with Debian people. If somebody wanted to give away badges... I'm not going to do it myself. I don't know how much I like the idea. But if somebody wanted to do a Debian social networking thingy of any kind, here we have people. The concept of people is new to Debian. We had email addresses, GPG keys. We had names. We still have the full name only, like in Debian changelogs, which fails because we have people with the same name and family name. But this works. So it could be the building block for something else. 
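The people and identifiers model described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names `Person` and `Registry`, the method names, and the shape of the audit log are invented for the example and are not taken from the site's code. It shows identifiers of several kinds being claimed for a person, unclaimed, and every change being logged.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity: two people may share the same full name
class Person:
    name: str


@dataclass
class Registry:
    owners: dict = field(default_factory=dict)     # (kind, value) -> Person
    audit_log: list = field(default_factory=list)  # (actor, action, kind, value)

    def claim(self, actor, kind, value, person):
        """Associate an identifier with a person and log who did it."""
        key = (kind, value)
        if key in self.owners and self.owners[key] is not person:
            raise ValueError(f"{kind}:{value} is already claimed by someone else")
        self.owners[key] = person
        self.audit_log.append((actor.name, "claim", kind, value))

    def unclaim(self, actor, kind, value):
        """Remove the association and log who did it."""
        del self.owners[(kind, value)]
        self.audit_log.append((actor.name, "unclaim", kind, value))

    def identifiers_of(self, person):
        return sorted(key for key, owner in self.owners.items() if owner is person)


# Hypothetical people and identifiers, for illustration only.
alice = Person("Alice Example")
bob = Person("Bob Example")
registry = Registry()

registry.claim(alice, "email", "bob@example.org", alice)  # claimed by mistake
registry.unclaim(alice, "email", "bob@example.org")       # undo it
registry.claim(alice, "email", "bob@example.org", bob)    # claim it for Bob
registry.claim(alice, "login", "bob", bob)

print(registry.identifiers_of(bob))
print(len(registry.audit_log), "audited changes")
```

Comparing people by identity rather than by name is the design point: a record keyed on the full name alone breaks as soon as two contributors share a name.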
I had an idea for a thankyou.debian.org website, where I could put the URL of an email in the list archive and say thank you for that email. And the site would track, like, the best emails of the month. And then every time that someone's email is thanked by someone, that counts as a Debian contribution. So that also feeds back into the site. And then I thought that could also be done for packages, like thank you for this package. And again, it shows in the site and it counts as a thanked contribution in Debian Contributors. And then I thought there could be an icon in the Debian mailing list archives and in the PTS or the package lists to say thank you, like a thumbs up icon. Or maybe something else. But yeah, it's the building block for things that involve people. I think it's an exciting starting point. And I would like to see more feedback being implemented in Debian besides bug reports. So I think we've seen pretty much all of it. The takeaway message from the talk is that every Debian contributor should be on the site. Sorry, that's a spelling mistake. If you are a Debian contributor who is not on the site, or know someone who is not on the site, you can fix it by either claiming email addresses or implementing data mining for your team. I have a friend who's not yet a Debian developer, who works with the security team, and who was frustrated because they did not show up. And so we fixed it by creating a data source for this. So now they show up together with others. So you could make yourself known to Debian by implementing the right thing. To create a new data source, you just log in to the website as a Debian developer and say create a new data source. You put the name, you put the description, a URL, an authentication token, implementation notes. And now there is a DebConf data source in the site, with me listed as an admin. I can configure it and add contribution types. 
So, speaker: there's a description from two different points of view, which is needed to build phrases like "this is a speaker at DebConf" and "this is the list of people giving a talk at DebConf". So there's a bit of redundancy here. And that's a contribution type. I can add another one. These strings end up, in a way, being translatable. And yeah, now anyone that has the authentication token can submit all these contribution types for this data source. Setting it up is this simple. And I can also add other people as administrators for this data source; they do not need to be Debian developers. So I could create a data source and then hand it over to any Debian contributor. We have a word now. I could say any Debian contributor instead of any random person, which is better. Then I can still access it, because I am an admin of the site. So if there's a person that would like to set up a data source, but they're not a DD, any DD can create it for them and hand it over. And then they can set up data mining on Alioth. And yeah, done. That's it, I think. Any questions? What about the mailing lists that are private, like the security team's? Sorry, security@debian.org, that's a private mailing list. Well, what about that one? Are you mining that one too? No, those security people are the security tracker people, which... That's public information, but the mailing list is private. So at the moment it's only the Debian security tracker thing. There's a more info link that sends you to the team. So it's an excellent way to learn about teams in Debian. So they track CVE list commits. If somebody wants to track the private mailing list and can work it out with the team, maybe for the team it's acceptable to leak who sends email to it. Maybe it's not acceptable. I don't think so. Then don't set it up. I was just asking, maybe you did it somehow, I don't know. This is managed by Federico. So in the list of data sources you can see who's the administrator for it. And you can also see implementation details. 
So, okay, it's not in here. I'm not going into the configure page because, since I'm an admin in the website, it will show the authentication token. So, okay, I won't do that. But any admin can see how it's deployed. There's a text field with implementation details. So if an admin becomes inactive, somebody else can take over and go and... Okay, it's scheduled with cron on that machine. So take over that cron job, talk with the admins and adopt it somehow. When I say, I don't know what the Debian security tracker is, I'll have a look: that's an interesting use case for the website. So the usual question, I'd like to contribute to Debian but I don't know what I can do, could be answered by: have a look at this. In the list of teams, there is a link from each of these to the team itself. If any of these teams is something that you like, you can see the people that are in it. You can find a link to where they hang out, and you can join them. It becomes a bit of a shopping list for getting more involved in Debian. Is there a way to programmatically query this site? In particular, I can think that it would be very useful for keyring-maint to be able to take a fingerprint and have it tell us if this person is still active. In the same way, I can currently query MIA by logging in and saying MIA query username. It would be useful to be able to query this site, which seems to have a more complete set of information, and ask: is this person active, whether by an email address or a fingerprint or a login? It's easy to implement. I did not implement a public API for it because of privacy issues. But I'm happy to work out a way to give you access to this, which could just be a public API that wants some kind of token to be activated, and that token is on Debian machines. So yeah, absolutely. I mean, having to log in to a Debian host to do the query is not a problem. I do that currently with MIA, which is then authenticated on my Debian login. So that sort of level is fine. 
Yeah, doable. We can discuss it, since you have a good use case for it. Lucas? So I missed the beginning of the talk, so if you had that question earlier just tell me to watch the video. Last year there was the idea of a Debian welcome team floating around. Did you ask about it already? No. Okay. So the idea was to have a team that would welcome new contributors by sending a personalized email. So one could, for example, watch for people from a particular country and establish some kind of social link with new contributors. That sounds like a perfect platform to build on. Is there anything that you can see that is not suitable for that, or particularly difficult? I think it's suitable. I don't think it's difficult. The site is opt-in for new people: it's opt-out for Debian developers, at least for the identifiers that obviously belong to them, but it's opt-in for any activity that is not directly trackable to an Alioth or Debian account. But it would be nice to send an email to people that just show up on the site, or that have been showing up for a month, saying: oh, hi, we would like to say thank you, and here's how you opt in to this website. The same email could be part of a more interesting Debian welcoming thing, about which I guess Asheesh has something to say. Yeah. So from my perspective, the Debian welcoming team idea was based mostly on the work that Daniel Holbach and also Andrew here did years ago in Ubuntu, called the Ubuntu developer advisory team. For them it was opt-out, in that the developer advisory team would see information about all Ubuntu contributors who were doing packages that got sponsored. And that's also true today of things like qa.debian.org. You can see who did an upload. It's not like we keep that private. I think that building something like a welcoming team on top of this would be fine, if we had a reasonable belief that most people would opt in. 
Otherwise, we would probably go build our own data set. But given that I was excited about that last year and didn't manage to pull it off in the past year, it's super clear that I will in the next year. Petter contacted me on IRC. He said he was testing logging in with his Alioth username and password, and it did not give him the same result as logging in with the SSO password. Is this surprising? Is this expected? It's not surprising. The plan for this is to say that Debian developers should log in with their Debian account and to start... So, in sso.debian.org, I would like to say that if your username does not end in -guest, I will refuse to log you in via Alioth. Because I look at the domain to see what the credentials of people are. And if the domain ends in alioth.debian.org, then I know you are not a developer. Therefore, I would not allow you to create a new data source, for example. We do not... It's probably not a good idea to trust Alioth for granting Debian developer level permissions. Because the security of Alioth is different from the security of debian.org machines. And most Debian developer accounts on Alioth are probably unused since... Like, -guest, I have no idea. But I would like the two things not to mix, and to get a unique, flat namespace of people with one and only one accepted credential into the single sign-on. Petter had another question as well. Which features from the Fedora badge system, badges.fedoraproject.org, make sense to implement in the Debian contributor system? I don't know the Fedora... Well, I officially don't know the Fedora badge system. I had a conversation about it a couple of years ago at FOSDEM and that's all I know, so it's as if I don't know. But I'm personally not interested in badge systems. So my position is that at the contributors.debian.org level, I would like things to be as objective as possible. But if somebody wants to implement badges.debian.org, I'm very happy if they do. 
And I'm happy to give them some kind of access to the Debian Contributors user database, so that they can build on an actual user database instead of building their own. So it could feed other websites. But I don't want to say this contributor is more than that contributor. It's a statement that I don't like to make in this website. If somebody wants to set up something like that and take care of it, yeah, fine. So I forgot to say earlier that I think this is completely amazing and completely great. Oh, I guess I should say it standing, not just try to sit down and then forget in the middle of sitting down what I'm doing. So, yeah, this is actually great. I remember this was an idea a year ago, basically, and it's fantastically useful and has such amazing amounts of data, and it's phenomenal. I wanted to ask about, I guess I theorize that we should... I guess what I mean is, I suggest that we should include, on an opt-out basis, package uploaders into this, even if they're not DDs. And I wonder what you think of that. We already include that information elsewhere. Package uploaders... Sorry, I mean the sponsorees, not the sponsors. Yeah, the sponsors should already... I don't remember. I don't remember if they're already there. It makes sense for that to be opt-out instead of opt-in. I'm not entirely sure about the implementation details of it, because it means that I need to add... Okay, I can add a flag to a data source saying this data source sends data that is opt-out by default instead of opt-in. So it can be on a per data source basis. Yeah, doable. To the list. And then at some point, when the Debian welcoming team gets our act back together, we could also use the automated data thing that Noodles was talking about. Second? At some point, when the Debian welcoming team gets our act back together, we could use the same hypothetical future machine-readable API for the service that Noodles might use to replace the MIA process tools. There's one minute left. Who will ask a question? 
There's minus five minutes left. Who will ask a question if it's minus five? Okay, so thank you Arion and add data sources. The biggest missing bit of this site is data sources, and they're quite simple to add. And at the bottom of the site, in the how to help link, there's a link to a mailing list where you can ask for help setting up data sources as well. Thanks, yes.