Okay, thank you. Hello everyone. Is that better? I will talk loudly. Thank you. Awesome. How many of you have used Gource in the past? How many of you have seen Gource? I took a little break from working to go back to school to get a PhD. So right now I'm in London working on my PhD, which is based on the Linux kernel, and I'm looking at how people collaborate in the kernel. I'm going to talk about some basic options and how to use Gource for repositories, which is what it's typically used for. So I'll talk about some options you can use to make it look a little better. I'll also talk about the custom log format, where you can visualize all sorts of different things with Gource, including things like mailing lists for open-source projects, bug trackers, that sort of thing. Those are the two examples that I'll use in the presentation. And then I'll talk about some additional options, video generation, that sort of thing. The presentation is probably 75% demo.
The first thing I'm going to show you is not something that I created, but of course Gource is used in a lot of different ways. I thought this one was interesting because the person you can see here was, unfortunately, tragically killed in a cycling accident, and they did this video as a tribute to him after he was gone. He worked on yum, so they used Gource to visualize all of his contributions to yum over the years. He collaborated with other people, so you can see other people coming in. You can see that people who work at Red Hat have a little Red Hat logo. And there are avatars, which is a nice touch. So I like this video just because of the topic; I think this is a really excellent tribute to someone. But I also think they did a very nice job using some of the various options in Gource.
So I'll talk about some of the things that they did and show you how to do a bit of this. I'm not going to play the whole thing through, but if you Google this, you can find it on YouTube.
And then I'm just going to show you a really basic example, and I'll talk more about how this was generated. This is the Config Management Camp website repository. So you can see that Kris is creating some initial files and fixing a typo. You see Richard flying in, updating the style sheets and the index files to make it look nice. This is our date, right at the top, and you'll start to see a lot more activity as we get closer and closer to the event. This is an event that's held two days after FOSDEM. So you can see that now we're hitting November, we have a call for papers open. We've decided on different rooms: for Chef, for Puppet, and so on. You can see all of that activity going on. So Gource allows you to really visualize what's going on in any particular repository. Now, it's a small repository. But you can see now we're nearing the event. So now we're in the event. You can see the flurry of activity. And you can see people adding their presentations, adding their content to the website. And then you'll see a drop-off where nothing happens, and it skips ahead. And then they update the site to say this will happen again next year.
So that was just a quick example. Now that you've seen this, how many of you have seen something made with Gource? A lot more. So a lot of you have seen Gource. And it's incredibly easy to use. You can see the command that I used. There are a couple of optional options that I'll show you. But basically, you install Gource, you type gource, you point it at a Git repository, and you hit enter. So it's incredibly simple to use.
Does it support submodules? I'm sorry? Does it support submodules, and can you get an aggregate view with submodules as well? I don't think that Gource natively supports that. But there is a project called Gorse Sympress, which allows you to aggregate multiple repositories together. There are links to that here. I won't talk about that today.
We can see here that I have a GitHub repository, and the README in the GitHub repository has all kinds of details. There are links to Gorse Sympress. It has detailed descriptions of everything that I'm going to talk about. This is the talk I did at LinuxCon in 2016, and the contents are pretty much the same; this is a slightly shorter version of it. The README has lots of useful information, and the repository also has all of the examples that I'm going to talk about. So it's really useful.
The basic version is what we just ran, pointed at a repository. And there are lots and lots of options you can use to make things work a little bit better, depending on how big your repository is and how active it is. I'll show you some of those. If you type gource -H, with a capital H, you can see all of the options.
So if you have a really big repository, or if you're just interested in looking at, say, a release window, or a period of time right before or after a conference, you can limit the dates. Limiting the dates is incredibly useful if you're visualizing something as large as the Linux kernel. The way Gource works is that it reads the whole Git log, pulls it in, and only starts the visualization after it has read the Git log. So for something like the Linux kernel, which has 10 years of Git history and a gazillion commits, it takes forever even to start the visualization.
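The date limiting described above might be scripted roughly like this. It is only a sketch: `--start-date` and `--stop-date` are Gource options, but the repository path `linux` and the one-month window are placeholders chosen for illustration, not values from the talk.

```python
import shlex
from datetime import date


def gource_date_range_args(repo_path, start, stop):
    """Build a gource command line limited to the window [start, stop]."""
    if stop < start:
        raise ValueError("stop date must not be before start date")
    return [
        "gource",
        "--start-date", start.isoformat(),  # formatted as YYYY-MM-DD
        "--stop-date", stop.isoformat(),
        repo_path,
    ]


# Hypothetical example: one month of a local kernel checkout.
cmd = gource_date_range_args("linux", date(2017, 1, 1), date(2017, 1, 31))
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` on a machine where Gource is installed; printing it first makes it easy to check the window before waiting on a large log.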
So you sit there and watch it think while it reads the Git log. One way to speed that up is to give it a smaller time range. Obviously that doesn't help if you actually want to look at the whole thing, but if you just want to look at a small piece of it, you can do things like that. You can also auto-skip. That was what you saw when it jumped ahead in the last example: it auto-skips when nothing is happening, which is what happened right after the conference. And then you can also speed things up or slow things down with seconds per day. Some of the examples we're going to use are very, very small repositories without a lot of activity, because those are easy to use as examples for presentations and demos. But you can tweak the seconds per day. So 0.5 works well for some small repositories. If you're looking at something like the Linux kernel and you set that to 0.5 seconds per day, you can't even see what's going on. So you can jack that up and make it 10 seconds per day, depending on the activity.
And then the other thing you can do, and we saw an example of this in the yum video, is add user avatars. The way you do that is you give it an image directory, where the images are located, with the --user-image-dir option. So in the example I'm going to show you in a minute, we have a start date and a stop date. You can see here that we auto-skip after one second. And then seconds per day is 0.1, so I sped that one way up. And then I gave it a user image dir. Since it's a very small repository, I really just pulled their images from GitHub and saved them in this repository. And then I'm going to point it at a repository called MLStats. Now, I have not done this yet, but there's a link to it in the README: there's also a way to use Gravatar to pull the images.
Which is probably more useful if you're doing it for a project that has a lot of people. On to another demo. This is exactly what I just showed you on that last slide. So you can see here we have our date. You can see the avatars. You can see the activity in the MLStats repository. The avatars give it a nice touch, I think. Any other questions so far? Like I said, the GitHub repo does literally have everything that I talk about.
So there are a lot of other ways that you can make it look a lot nicer. I talked about how to speed things up and slow things down, but you can also make things look nice. One of the ways to do that is to change the date format. You might want to use a simpler format that's easier to read. You might want to give it a title. For example, you might want to give it a title such as MLStats, so that's the name of the repository. I changed the font size and font color. So I made the font a little bit bigger, and I made it orange to match the Bitergia logo. The downside to the font color and font size is that they control both the title and the date, which is here, so you can't control those separately. But I'll talk a bit about how you can get around that. And then you can see here that I have the Bitergia logo, just because the guys from Bitergia have done a lot of work on MLStats.
The other thing you can do if you want it to look a little bit nicer, and this is what they did with the yum video, is that instead of giving it a logo, you can give it a whole banner image that covers the entire bottom part of the visualization. So instead of giving it a title, you just give it a banner, and the banner essentially becomes the title. That's actually what I'll show you in the example. So I have these two examples.
The first one has the date format, font size, font color, and it has the title that's generated by Gource, so you'll see what that looks like. And it has the Bitergia logo. The bottom one is the exact same thing, but instead I'm using a banner image to replace the title. And so you can see the difference between what those look like. This one. Sorry. Okay. You can see that we have the logo on the bottom. We have the title down here. We have the date at the top. So it looks a little bit nicer. Again, it's sort of weird that the title is always the same font, size, and color as the date. But I'll show you what it looks like when you replace this with a banner. Which is probably going to look a little bit weird on this projector. Yeah, so it's cut off, because I did this for a bigger screen. But you can see the banner at the bottom. It was made by someone a little more graphically inclined than I am. So you can use the same font as the logo, or you can use a nice image to get something fancier than just the title and the logo. The only thing you have to be careful of is to make sure that the image you create is the size of the window that you're going to be using.
The other thing you can do that's really useful, especially if you want to create a video and have a bunch of people watch it, is that you can give it captions. So, just like I sort of narrated what was going on in that repository for Config Management Camp, you can also create your own captions. And it's really simple to do. It's just a pipe-separated file, and there are examples in my GitHub repository. The only tricky bit is that you have to give it Unix time. So you have to convert whatever time you want to use to a Unix timestamp, and then a pipe, and then the text that you want to use.
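As a rough sketch of the caption file just described, the following turns dates into Unix timestamps and emits one `timestamp|text` line per caption. The caption texts and dates are invented for illustration, and requiring timezone-aware datetimes is a choice made here to avoid surprises from local time.

```python
from datetime import datetime, timezone


def caption_line(when, text):
    """Format one caption as 'unix_timestamp|caption text'."""
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    if "\n" in text:
        raise ValueError("a caption must fit on a single line")
    return f"{int(when.timestamp())}|{text}"


def render_captions(captions):
    """Return caption-file content with entries in chronological order."""
    ordered = sorted(captions, key=lambda item: item[0])
    return "\n".join(caption_line(when, text) for when, text in ordered) + "\n"


# Hypothetical captions for a conference website repository.
captions = [
    (datetime(2017, 2, 6, tzinfo=timezone.utc), "Event starts"),
    (datetime(2016, 11, 1, tzinfo=timezone.utc), "Call for papers opens"),
]
print(render_captions(captions), end="")
```

Saving that output to a file and passing the file to Gource with `--caption-file`, along with `--caption-duration`, is how the captions would then be displayed.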
And the way you do that is you just give it a --caption-file option. You give it a caption duration, which in this case is three seconds, and you give it a color and a size. And then one of the other things you can do to display some additional information is display the key. You may have noticed before that the files have different colors, and those correspond to file types, so files with one extension will be a different color than files with another. I'll show you what that looks like. It's the exact same example, with captions. So this is particularly useful if you wanted to, you know, narrate what was going on during a release window, or maybe somebody was in the process of refactoring something important and you want to display what was going on there. So this is a really nice way to do it. And then you can also see in the key which files are HTML; the blue ones are PNGs and JPEGs. So it gives you a bit more context. The other thing you can do is, with the K key, make the key disappear over here. And if you want to pause something and take a closer look at it, you can just hit the space bar, and that pauses the visualization where it is. So maybe you want to mouse over and see what some of these HTML files are. Space bar again to resume. Questions?
We're transitioning over to the custom log format. Yes. So you can use this on source repositories and on mailing lists, so that sort of implies we can use different types of logs? Yes, that's actually what I'm about to talk about. Perfect. A question here first. Is there a way to influence what the nodes are, or how it clusters the files? The way it clusters the files is based on the file tree, so where a file is in the directory structure. And I don't believe that there's a way to change that. However, you could possibly do that by running it through the custom log format.
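One way the re-clustering idea could be tried is sketched below. It assumes the repository history has already been exported as a pipe-separated custom log (Gource has an `--output-custom-log` option for that) and rewrites each path so files cluster by extension instead of by directory. Grouping by extension is just an illustrative choice, and the sample line is made up.

```python
import posixpath


def regroup_by_extension(line):
    """Rewrite the path field so files cluster by extension, not directory."""
    fields = line.rstrip("\n").split("|")
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 pipe-separated fields: {line!r}")
    path = fields[3]
    # Files without an extension (README, dotfiles) share one bucket.
    ext = posixpath.splitext(path)[1].lstrip(".") or "no-extension"
    # Keep the original path under the new top-level bucket so names stay unique.
    fields[3] = f"{ext}/{path.lstrip('/')}"
    return "|".join(fields)


# Hypothetical exported log line: timestamp|user|type|path
print(regroup_by_extension("1477958400|Kris|A|/site/css/style.css"))
```

Because Gource lays files out by directory, putting the extension first in the path is what changes the clustering; any other grouping key could be substituted in the same place.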
And that's what I'll talk about next, the custom log format. What you could do is take the data out of the Git repository, reformat it in the way that you want to see it, and visualize it using the custom log format. Is it possible to specify the timestamp in ISO format instead of Unix time? So the question was, is there a way to use an ISO date format for the custom log format? I believe it only takes Unix timestamps, but I've never tried it. So, try it.
Okay, so the custom log format is similar to the caption file I showed you: it's a pipe-separated file. The first bit, again, is a Unix timestamp. The second bit is the username of the person doing whatever activity it is that you're tracking. And then there are three different update types, which correspond to things that are typically done to files, but you can map them to other things, which you'll see in the mailing list and bug examples. So the three types are A for added, M for modified, and D for deleted. Then you have the file, or some other information that you want to display. And optionally, you can also have another field that is a hex-formatted color, which lets you set the color, and you'll see that in the bug example that I use later.
This first little logo: I'm a huge fan of the MetricsGrimoire tools. These are tools that people typically use to pull data out of open source projects. In the case of the mailing list example that I'm going to show you, I use MLStats to get the data into a database, where I can work with it to get it into the custom log format.
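A minimal sketch of a formatter for those fields follows. The field order (timestamp, username, type, file, optional hex color) and the A/M/D types follow Gource's documented custom log format; the function name, the validation rules, and the sample values are assumptions made for the example.

```python
import re

UPDATE_TYPES = {"A", "M", "D"}  # added, modified, deleted
_HEX_COLOUR = re.compile(r"^[0-9A-Fa-f]{6}$")


def format_entry(timestamp, username, update_type, path, colour=None):
    """Return one custom log line: timestamp|username|type|file[|colour]."""
    if update_type not in UPDATE_TYPES:
        raise ValueError(f"update type must be one of {sorted(UPDATE_TYPES)}")
    if "|" in username or "|" in path:
        raise ValueError("fields must not contain the '|' separator")
    fields = [str(int(timestamp)), username, update_type, path]
    if colour is not None:
        if not _HEX_COLOUR.match(colour):
            raise ValueError("colour must be six hex digits, e.g. FF0000")
        fields.append(colour)
    return "|".join(fields)


# Hypothetical entries: a plain one and one with an explicit colour.
print(format_entry(1483228800, "alice", "A", "docs/readme.txt"))
print(format_entry(1483232400, "bob", "M", "build/bug-42", colour="FF0000"))
```

Rejecting the separator inside a field is a defensive choice here, since a stray pipe would shift every later field.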
And then for the bug data, I borrowed the data set from my supervisor at the university, who has done a bunch of visualizations and other analysis on bug trackers, and Bicho was used to pull the data out of the bug tracker database. And then there are a whole bunch of other tools that are part of this, like SortingHat. These are really excellent tools if you're interested in getting data out of open source projects and being able to do something with it. These are the tools that I use.
So, the mailing list example. The way it works is you use something like MLStats, which pulls everything out of your mailing list archive and dumps it into a database, and then you extract the information that you're interested in via a relatively simple database query. Then you generate the custom log format that I talked about earlier. So I have a Unix timestamp, I have the person who sent the email, and whether it was a new email. The way I'm looking at this is as if they've added a thread to the mailing list, so I'm marking a new message using the A for added. And then the other thing that I look at is an email that's in response to someone else. I think of that as sort of modifying a thread, and then I track who they're responding to. You could do other things here: you could track the thread they're responding to and do something interesting with that as well.
And then you run Gource. The way it works is pretty similar. These are some common options. The max user speed here equals 500. Again, there's auto-skip, and -i is the file idle time, so it keeps the files, which in this case are actually people, around for a few extra seconds. So, it's a little bit of magic, right?
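The mailing list mapping described above could be sketched like this. The `Message` record is a hypothetical stand-in for whatever the database query returns, and using the sender's own name as the "file" for a new thread is an assumption made for the example; replies use the person being replied to, as in the talk.

```python
from typing import NamedTuple, Optional


class Message(NamedTuple):
    """Hypothetical row extracted from a mailing list database."""
    timestamp: int             # Unix time the message was sent
    sender: str
    replied_to: Optional[str]  # None when the message starts a new thread


def message_to_entry(msg):
    """Map a message to a custom log line: A = new thread, M = reply."""
    if msg.replied_to is None:
        # Assumption: a new thread is "added" under the sender's own name.
        return f"{msg.timestamp}|{msg.sender}|A|{msg.sender}"
    # A reply "modifies" the person being replied to.
    return f"{msg.timestamp}|{msg.sender}|M|{msg.replied_to}"


def mailing_list_log(messages):
    """Return log lines in chronological order."""
    ordered = sorted(messages, key=lambda m: m.timestamp)
    return [message_to_entry(m) for m in ordered]


# Made-up messages: a reply listed before the thread it answers.
sample = [
    Message(1483315200, "bob", "alice"),
    Message(1483228800, "alice", None),
]
print("\n".join(mailing_list_log(sample)))
```

Sorting by timestamp before writing keeps the log chronological, which is what a time-based replay needs. Swapping the reply target for a thread identifier would give the thread-based view raised in the questions.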
You extract all the data and generate a custom format file. I have the code that I use to do all of this in my repository. There's a Python script that you can point at an MLStats database, so as long as you can run MLStats, you can get everything into a database and then point the Python script at that database. That Python script is terrible Python, but it works for me. Right? I'm not going to go through this now. This is the database query that I used to pull out of MLStats the data that I needed.
So now let's see what it looks like when you visualize the mailing list. So we have here people creating new threads, so they're sending new messages, and now you can start to see people replying to other people on the mailing list. So you can see Alex Williams replying to another person, for example. And it looks a little bit weird because people don't have a tree structure, right? You could put some kind of structure behind it if you were interested in doing that. Maybe for people who typically work on a certain project or something, you could put them in some category. You'll see how this was done with the bug data, where they did actually put it in a tree structure. But this is kind of interesting, right? Because you can see who is responding to whom on the mailing list. This is one of the Linux kernel mailing lists. This is the IOMMU mailing list, which is a memory management mailing list. And it's only one month of data, a fairly manageable time frame, just January. And it's interesting how you can use Gource to do other things. I've also seen people use Gource to visualize things that have nothing to do with software projects.
Sorry? A question? Have you tried using the subject for the file name instead? You could. Yeah, you absolutely could. This came out of my PhD.
I'm actually looking at how people work together within the Linux kernel. So for me, the interesting bit was looking at who was responding to another person. But yeah, you could absolutely put the thread in there instead. And you could do something with the way the thread is structured to create a file structure. And that would probably be a better example. Any other questions on the mailing list data?
So again, this is a little bit of magic. I don't have a script to generate this because it's the data that my supervisor put together. But it's a very similar thing. You extract the data. You generate a Gource custom format file, which in this case has the person who submitted the bug, and if it was a new bug, they used the A for an added bug. And then they did put in a tree structure, so they put these in a module structure. And then the same thing for a modified bug: it's in there with the M for modified, and a module for the modified bug. So you can see this is very similar to the one that I just did for the mailing list.
You can see here that, unfortunately, people are numbers. But you can see that there's a bit of a structure here. So if I pause this, I can see that these are different modules. These are the bugs for the build system. So you can see that they put the bugs in various modules, so you're getting a better tree structure. So that's really nice. Lots of activity. Any questions on that? I know I talk really fast.
There are some additional options. I showed you the space bar to pause. You can also use plus and minus to speed it up or slow it down. You can click on a specific point in the time frame to go to a specific date. And you can use your arrow keys to move the camera around. So maybe you're just interested in this little bit down here; you can use your arrow keys to sort of move things around. Which is pretty handy.
And then as I mentioned, you can use the key. You can also put it on a loop. This is really useful if you're running an event and you want to just run this on a loop while people are having their drinks. And people get really excited about this, right? When it's their own project and they're watching a visualization of it. So it's like, oh, I remember when so-and-so joined the project, and oh yeah, that's when we refactored the whatever. People get really excited about it. And then -f is for full screen.
Now, you can also generate videos. It's good if you want to upload this to YouTube. It's great if you want to visualize a very, very large repository and you want to show it in a presentation and you don't want to wait for the Git log to load. I'm not going to go through exactly how it works, because it depends on how you do the visualization, the specific codecs that you have installed, and the type of video that you want. So it's fairly detailed. But there's a link in the README of the repository to the instructions for generating videos. There's a whole page that tells you how to do that.
Could it follow a repository live? I don't believe so. I don't believe that there's a way to do that.
What do you think about getting stats on patches? Would you be interested in, say, the kernel patch history, where you get how many patches, how many iterations, and metrics like that? So the group that's doing the development of MLStats is Bitergia. They're based out of Madrid, and they're affiliated with a university there.
And they are actually doing research on patches, looking at things like, you know, time to merge, things like that. I can't remember exactly which tools they're working on. So there's a link to MetricsGrimoire and some of the tools. And there's an IRC channel which you can pop into, with all the people who work on it. Oh! Yeah! Sorry, they have a stand. They have a stand in the K building. But they're also hanging out in the cafeteria wearing orange shirts. Look for the guys in the orange shirts. So, yeah. Is there another question? Yeah.
Can you combine data from multiple sources, so you can get data from bug tracker issues combined with the Git commits in one view? You could do that using the custom log format. You would have to do that manually. So you'd have to take the data from the Git repository, and you'd have to bring in the data from the bug tracker or the mailing list, and merge them together somehow into a custom log format. It might be a lot to visualize, but it might be interesting to see what's going on at the same time. I think that would actually be a really useful tool to have when you're trying to get a view of the community as a whole.
You'd have to have some way to map all the individual accounts to one person. There are tools to do that. One of the tools within MetricsGrimoire is called SortingHat, and I spend a lot of time using that tool. It has some automated ways of matching identities across all those data sources, and then there are some manual ways to do that, so you can do manual merges. So I've spent the last two weeks on SortingHat. Right?
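The manual merge of several sources into one custom log could look something like the sketch below. It assumes each source has already been converted to pipe-separated custom log lines sorted by timestamp, and it puts each source under its own top-level directory so they show up as separate branches. The source names and sample lines are invented, and matching one person's accounts across sources (the SortingHat step) is not attempted here.

```python
import heapq


def entry_timestamp(line):
    """Extract the leading Unix timestamp from a custom log line."""
    return int(line.split("|", 1)[0])


def prefix_path(line, prefix):
    """Move an entry's path under a per-source top-level directory."""
    fields = line.split("|")
    fields[3] = f"{prefix}/{fields[3].lstrip('/')}"
    return "|".join(fields)


def merge_custom_logs(sources):
    """Merge already-sorted logs, keyed by source name, into one timeline."""
    streams = [
        [prefix_path(line, name) for line in lines]
        for name, lines in sources.items()
    ]
    # heapq.merge preserves chronological order across the sorted inputs.
    return list(heapq.merge(*streams, key=entry_timestamp))


# Hypothetical inputs: a few commits and one bug report.
merged = merge_custom_logs({
    "git": ["100|alice|A|/src/main.c", "300|bob|M|/src/main.c"],
    "bugs": ["200|alice|A|/build/bug-1"],
})
print("\n".join(merged))
```

Here "alice" only lines up across the two sources because the sample data uses the same name in both; real data would need the identity mapping done first.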
What's an example of an insight that you gained by using the tool to visualize something? Okay, so I always get asked this. There might be some insights to gain from using a visualization tool like this. For the most part, I think the insights come from other sources of data. It's a little bit hard to get a lot of insight based on the tool. People love it. It's eye candy. I think it's fun. But, you know, I don't know that you can get any deep insights from the visualization tool that you wouldn't already get by participating in that community.
Just to let you know, it can show you some systemic movement in your community. And it was impressive to see, because in a project where I'm a contributor, we have some areas where we get fewer contributions. And with tools like that, you really see it: where there's movement, it pops off the screen, and where there isn't, the graphic is not so busy at all.