Okay, we are recording. So, hi everyone. If you are watching this recording, this is a special Google Summer of Code session about external fingerprint storage. It's a project idea created by Jenkins contributors in 2019, but we moved it to this year. The main objective of this project is to contribute to the Pluggable Storage ecosystem in Jenkins. Originally, it was part of the Cloud Native Special Interest Group, a group whose target is making Jenkins cloud native, or at least cloud friendly. One of its projects was Pluggable Storage, which has a wider scope covering a number of areas, and fingerprints is just one of those areas. But fingerprints has a lot of use cases for Pluggable Storage. It's not just externalizing storage; it also enables additional analytics, because if you put the data into a database like Elasticsearch or whatever, you can query the data and visualize it outside Jenkins. The same may apply to Postgres or whatever solution. We're interested in having this project in GSoC because the code base is relatively isolated compared to other Pluggable Storage stories, and it can be delivered separately within the GSoC timeframe. So, that's why we have this project idea. Last year, we didn't start the project. We had several applications, but we didn't accept them during the final project selection. But I hope that we will be able to run it this year. So, there is a description. It comes from Google Docs, so I'll open it. So, yeah, the project is quite straightforward. There is a fingerprint storage implementation in Jenkins right now. Historically, Jenkins does everything through XML storage on disk within the Jenkins home, and fingerprint storage is no exception. So, right now, everything is stored in XML files. Before 2018 or 2019, we had performance issues related to that, because each fingerprint was stored as a separate file. The storage has since been optimized, but it's still a problem for maintainers.
It's a problem for backup management, and there are also a lot of issues related to data access. So, by moving fingerprint storage to a database, we could improve the setup. We could also reduce the size of Jenkins homes, which is important for containers: even if you use storage coming from a volume, you would rather have a dedicated database for that. And yeah, that's the main idea. It also opens new ways of accessing this data. If somebody is interested in creating new REST APIs or GraphQL support for accessing fingerprint data, for example to trace artifacts, that's something you could do. And if you're interested in web UI, or in particular security-related features like encryption, etc., that could also be on the table, because there are a lot of potential improvements we could make. So, this project is wide open, and you can submit your own ideas based on the top-level description or your own interests. My objective for today is to discuss the project, answer your questions, and provide as much information as possible. If you want, we can do code dives and whatever, and we can schedule more sessions if needed. So, that's my goal for today, and I hope it provides a top-level overview. If you want, I can show you some examples of what fingerprint engines look like. One of the good examples is actually the Docker Traceability plugin. It's not a good example in terms of status, because the plugin is rather abandoned now. I created it in 2015, but due to reasons beyond my control, I had to move to other, more important projects. Still, it allows tracking usage of containers within Jenkins. You can see that there is container info, and basically the plugin allowed tracing image usages and container usages within the Jenkins interface. The more common use case is actually artifact storage, because when you publish artifacts, you can enable fingerprints, and it's Jenkins core functionality.
So, many plugins in Jenkins, like Copy Artifact, actually use the fingerprint engine under the hood, and you can extend it for more use cases. So, that's it from me; let's discuss your project ideas and any questions you have. Over to you. Oh, sorry. So, the first question I have is, I just wanted to confirm two things first. In the project idea, it is mentioned that we re-implement the lazy local storage. Is the interpretation of this correct in the sense that we need to have the file storage abstracted away? Is this the same thing? Yeah, that's totally correct. So, the way Jenkins handles that is through extensions. Yeah, right. Everything in Jenkins is extensible, and we try to provide this extensibility. When you create a new storage implementation, you can hard code it, or you can create extension points. For Jenkins, the default way is to create extension points, and you can see that there are quite a number of them. This list is just for the Jenkins core, and plugins can also define their own extension points. So, you can have something like, let's say, fingerprint storage. For example, here you can take a look at this settings provider, which provides some settings, I believe; it's historical functionality, but it's available in the Jenkins core. More relevant to the current story: for example, we have fingerprint storage. I proposed it in the proposal: we have a fingerprint storage, and then we have a file fingerprint storage for ensuring backwards compatibility. Yeah, right. So, yeah, I was actually looking for an external storage implementation, not for fingerprints, sorry for the confusion. Yeah, if I'm not mistaken, the Workflow API plugin provides a log storage implementation; we can get to it. So, one example, for example, is the credentials provider.
So, for the credentials provider here, you can see that this extension point has several implementations, including the Kubernetes Credentials plugin, et cetera. And if you take a look at the Javadoc, you can see that this is an extension point which provides a number of methods; some of the methods are defaults, and some of them should be extended by implementations. The Jenkins core and plugins are separated by these extension points using APIs. So, you provide an abstraction layer where you hide the functionality. In the case of fingerprint storage, this abstraction layer is yet to be created, because originally it was just a storage implementation, and then additional functionality was added, for example fingerprint cleanup and other bits. If you look at the current fingerprint code in Jenkins, it's a bit scattered, so refactoring the code to create an abstraction layer would be one step. Then you can use the file system fingerprint storage as a first reference implementation. If you're interested, we had several projects, which I've already referenced, about log storage. If we go to the previous page, you can see that there were a number of JEPs submitted. We implemented it for Pipeline first; that was just a phase, a JEP about external log storage for Pipeline. That one is delivered, and there are plugins using it. But we also wanted to expand it into the Jenkins core, and this is the JEP for that. This JEP also has a reference implementation, which can show you approximately how it would be done. Here you can see that there are actually multiple components. The main product for me was the Elasticsearch log storage plugin. We also created a special API plugin for the logging API, to simplify specific tasks related to external logging. For example, currently Jenkins stores logs as plain files. But let's say we use JSON storage.
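To make the extension-point idea above concrete, here is a minimal, self-contained sketch of what a fingerprint storage extension point could look like. The class and method names (`FingerprintStorage`, `save`, `load`, `delete`) are illustrative stand-ins rather than the actual Jenkins core API, and the in-memory implementation plays the role the file-based implementation would play for backwards compatibility.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical extension point: the core codes against this abstraction
// and never touches the concrete storage directly.
abstract class FingerprintStorage {
    abstract void save(String md5sum, String serializedFingerprint);
    abstract Optional<String> load(String md5sum);
    abstract boolean delete(String md5sum);
}

// Stand-in reference implementation, playing the role the file-based
// storage would play in the real core.
class InMemoryFingerprintStorage extends FingerprintStorage {
    private final Map<String, String> store = new HashMap<>();

    @Override void save(String md5sum, String serializedFingerprint) {
        store.put(md5sum, serializedFingerprint);
    }

    @Override Optional<String> load(String md5sum) {
        return Optional.ofNullable(store.get(md5sum));
    }

    @Override boolean delete(String md5sum) {
        return store.remove(md5sum) != null;
    }
}

public class FingerprintStorageDemo {
    public static void main(String[] args) {
        FingerprintStorage storage = new InMemoryFingerprintStorage();
        storage.save("2c8b...", "<fingerprint/>");
        System.out.println(storage.load("2c8b...").isPresent());
    }
}
```

The point of the sketch is the shape, not the details: external implementations (Postgres, Elasticsearch, ...) would live in plugins that subclass the abstract class, while the core only ever calls the abstract methods.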
And in such a case, you can have common functionality for converting data to JSON, and we could put it at the API level. There was also core functionality which provided these extension points. So, here, if you take a look, you can find, yeah, this pull request is big. And actually, this pull request was feature complete; it wasn't delivered just because there were some design concerns, and then we had to move on to other tasks. I say it too often, but that's open source: not everything gets integrated. And here you can see that basically the approach is the same. I'll probably go to the file view, so let me change the interface a bit. It should be here. Whatever, I'll search for the term. So, one of the things. Log storage. Yeah, one of the things we had there is log storage. Log storage is in the Workflow API plugin. But this log storage is different, because Pipeline was the first step, and Pipeline is just a single job type. Pipeline doesn't address use cases for freestyle jobs, matrix configurations, etc. It doesn't address system logging, agent provisioning, and other bits. So, of course, the second phase was to provide a more complete solution right inside the Jenkins core and integrate it with Pipeline. You can take a look at the log storage implementation in Pipeline, but I was showing you this code because it shows what changes in the Jenkins core if you want to do that. For example, there was API for builds. So, there is Run: a common entity which represents any Jenkins build. Here you can see that there are some APIs we needed to adjust. There are some common cases, like just discovering storages, etc. And we also needed to customize log storage, because our design decision there was to support multiple log storages on a per-build basis.
So, for example, you could keep historical builds on the file system and keep new builds in a new, arbitrary system if you wish. That was our approach to addressing the migration concern, because if you decide in your project proposal to have a single storage, then you will have to solve data migration. Here we approached it differently: we decided that, okay, we will just support arbitrary storages per build, and it won't be our problem to migrate after that. There were also some changes in APIs. You can see that there are changes here and there, because the implementation used to rely on a file system approach internally. There were methods like getLogFile in Run, which presume that there is a file system. So there we added some compatibility layers, but for most of the implementations, we introduced new efficient APIs, which don't require the creation of temporary files when they are not needed. So, that's what was done for the implementation, and all the abstraction there happens through two components. One is LogStorage. Here, LogStorage is just an abstract class which provides common methods: for example, a TaskListener, which provides streaming of log data in Jenkins, and some serialization logic, which would allow externalizing logging, because for logging we wanted to do submissions right from the agents. But yeah, these are details. This is the main abstraction there, and we said that all the details just go to implementations; the Jenkins core doesn't have any implementations for them except file storage. And you can also see that there is another class, LogStorageFactory. In this case, we made a design decision to make the log storage factory an extension point. So, there is a producer class, which can be configured on a build, which produces your log storage. We made the producer the extension point just because it was easier from the API perspective.
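The two-component design described above can be sketched in a few lines. The names mirror the real `LogStorage`/`LogStorageFactory` classes discussed, but the signatures are heavily simplified stand-ins: the abstract class carries the common methods, and the factory is the extension point that decides, per build, which storage applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified stand-in for the abstraction: common streaming/retrieval
// methods live here, all details go to implementations.
abstract class LogStorage {
    abstract void writeLine(String line);   // streaming append
    abstract List<String> readAll();        // retrieval for the UI
}

// The factory is the extension point: configured per build, it decides
// which storage (file, Elasticsearch, ...) serves that build, or declines.
interface LogStorageFactory {
    Optional<LogStorage> forBuild(String buildId);
}

class InMemoryLogStorage extends LogStorage {
    private final List<String> lines = new ArrayList<>();
    @Override void writeLine(String line) { lines.add(line); }
    @Override List<String> readAll() { return lines; }
}

public class LogStorageDemo {
    public static void main(String[] args) {
        // A factory that serves new builds from memory and declines old
        // ones, illustrating the per-build choice of storage.
        LogStorageFactory factory = buildId ->
            buildId.startsWith("new-") ? Optional.of(new InMemoryLogStorage())
                                       : Optional.empty();
        LogStorage storage = factory.forBuild("new-42").orElseThrow();
        storage.writeLine("Started build");
        System.out.println(storage.readAll());
    }
}
```

Returning `Optional.empty()` is how a factory declines a build, which is what makes the "historical builds stay on the file system, new builds go elsewhere" split possible without a forced migration.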
For fingerprint storage, the design is up to you, and it can be iterated during the first phases. So, you don't have to finalize the design right now. But keep in mind that the factory abstraction and other common abstractions from object-oriented programming could also be applied here. So, yeah, this is log storage, and fingerprint storage could be quite similar in terms of implementation if you do it in the Jenkins core. You can just take a look at the JEP I was referencing; it provides some information about how it was done. You can also take a look at the API plugins. These plugins are rather prototypes, just reference implementations, which show how it could be used. Do not consider them a final solution, but if you want, you can read them. I hope that summarizes how it looks architecturally. And by phases, if we return to the proposal, my expectation is that there is a new extension point for sure, and everything else can be external. We don't look forward to adding more dependencies to the Jenkins core; we try to avoid it as much as possible. So, we hope that all the new implementations can be done as plugins, and creating these plugins will basically happen in parallel: you can design APIs and extension points in parallel. We introduced some tooling for that. For example, APIs in Jenkins can be marked as beta, so you don't commit to binary compatibility, even if the code is merged early. So, you will be able to deliver these plugins in parallel with the core changes. Plugin development and changes in the core can run in parallel? Yeah, they can run in parallel. Regarding the approach for our GSoC projects: we want components to be continuously integrated into the main branch. We don't want to have a GSoC project which basically has a long-standing pull request for three or four months, and then something happens.
If it's possible to break down the project and deliver things incrementally to the main code base, we recommend this approach. It has shown to be much more efficient, and that's how we develop Jenkins. It also changes the approach a bit: it requires certain approaches to testing, for example, because we expect each change to be more or less complete. It means that when you propose a feature, we expect you to propose documentation and tests along with it. But yeah, from what we've seen, this is common software practice now: if you break down your big projects into smaller changes, you can actually deliver them quicker. That's what we will practice here. That answers my question, actually. My follow-up question was that in my current proposal, I've broken down core and plugin development into two parts, and then I was thinking that becomes a very short time frame for both an alpha and a beta release accordingly. That's what I thought. So, yeah, that sorts it out. It's difficult to design a good API on the first attempt. That's why, when we started doing major changes around 2014-15, we added some tooling, and then this tooling was extended. So, now it's possible to create beta APIs and integrate them without committing to binary compatibility, which is a big barrier once an API becomes productized. And you can also deliver changes incrementally. For example, if you propose a pull request to the Jenkins core, let's say you deliver a feature. Here it doesn't really matter; let's just open the first open pull request. It has nothing to do with these plugin APIs, but still. Here you can see some automation: for almost every pull request build, we publish an incrementals version. It means that there is a new version published to a repository for each pull request, and if you're developing a plugin, you can consume this version. Okay.
I'm not sure what happened with this one. You can see that there are URLs, and basically it should be published using the ID, but the notification didn't come through. Maybe something went wrong with this deployment, but in principle, for every pull request and for every build in a branch, you should get a version which you can consume in your plugin. So, you can really do development in parallel using the Jenkins CI infrastructure. For me, yes, all the development would happen in parallel. It's actually more convenient, because you deliver a feature and you don't worry about which component it lands in. Also, in the project idea, it mentioned the storage of the data structures currently supported. Is this a reference to everything that fingerprints support, like FingerprintMap, Fingerprint, all those classes? Is this talking about that? So, not exactly. The Jenkins fingerprint structure is basically not formalized. Let's take a look at the structure. Okay. Here you can see the structure of the fingerprint, and the main thing here is that a fingerprint consists of facets: it contains a list of facets, and each facet is basically an object structure which is provided by plugins or by the Jenkins core. It doesn't have a fixed structure; it doesn't have a schema. So, for example, if you just wanted to create a table in an SQL database, it might be a challenge for you, because this structure is not normalized in any way. It's just an object, and we use a library called XStream to serialize this object to disk. So, one of the potential challenges in this project is how to serialize it to a database, if you use a database, especially if you still want to access this data for whatever analytics. It's a nice challenge, and you can come up with a solution. But yeah, what I mean here is that you cannot really rely on a regular data structure, which is usually expected for databases. Yes.
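To illustrate why the facet list is hard to map to a fixed SQL schema, here is a self-contained sketch. The class names echo Jenkins' `Fingerprint` and `FingerprintFacet`, but the fields are simplified stand-ins, and the two concrete facets are hypothetical examples of what different plugins might contribute.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified echo of Jenkins' Fingerprint: an identity plus a list of
// facets whose concrete shape is decided by whichever plugin adds them.
class Fingerprint {
    final String md5sum;
    final List<FingerprintFacet> facets = new ArrayList<>();
    Fingerprint(String md5sum) { this.md5sum = md5sum; }
}

abstract class FingerprintFacet {
    final long timestamp;
    FingerprintFacet(long timestamp) { this.timestamp = timestamp; }
}

// Two hypothetical plugin-provided facets with unrelated fields -- the
// reason a single fixed table cannot hold "the" facet structure.
class DockerRunFacet extends FingerprintFacet {
    final String containerId;
    DockerRunFacet(long ts, String containerId) { super(ts); this.containerId = containerId; }
}

class DeploymentFacet extends FingerprintFacet {
    final String environment;
    DeploymentFacet(long ts, String environment) { super(ts); this.environment = environment; }
}

public class FacetDemo {
    public static void main(String[] args) {
        Fingerprint fp = new Fingerprint("2c8bff...");
        fp.facets.add(new DockerRunFacet(1L, "abc123"));
        fp.facets.add(new DeploymentFacet(2L, "staging"));
        // A storage implementation sees only the abstract type and must
        // decide how to serialize each concrete facet (XML blob, JSON, ...).
        for (FingerprintFacet f : fp.facets) {
            System.out.println(f.getClass().getSimpleName());
        }
    }
}
```

This is exactly the tension described above: XStream can serialize any of these objects to XML without a schema, but a relational mapping has to decide what to do with fields it has never seen.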
Actually, that's what I've been thinking is a challenge in my proposal, because if I'm storing an XML file in a storage, I mean, if I store it directly as XML, not only do I have a memory overhead, but I also have the problem of maintaining consistency, because if some other Jenkins instance reads it after I have read it and tries to access it, either I can put a lock on the entire database, or have it as a transaction, but then I have other problems: I don't know whether it's actually just trying to read it, or whether it's going to edit it later on. Yeah, regarding multiple Jenkins instances accessing the same database: it's a use case for your GSoC application. You basically have a choice whether you want to support it or not, because there are definitely benefits, especially when we talk about large-scale Jenkins installations, which include hundreds of masters; putting all the data into the same database really makes sense there. But yeah, data handling will be a challenge for you. Modern databases offer a lot of functionality for resolving such conflicts. As long as instances do not try to trace the same artifact, you are fine; otherwise, yes, you will need proper APIs. You could add some logic on the database side, but you would still need to figure out how to handle it in the Jenkins API, because you need to implement the fingerprint storage API in such a way that it's not a problem for the Jenkins core; it's a problem for the plugin, which talks to a database or whatever and uses its specific functionality. It's a challenge even for how the reference implementation is going to do it; I'm still working on it. The good news is that you don't have to provide a production-ready implementation with all the features at the first step. Again, we could start from a prototype or from an alpha version of the plugin, which provides some features, and our advice is to have alpha releases early to get some community feedback. Then you can build on top of it.
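One common way to handle the read-then-update race described above, without locking the whole database, is optimistic concurrency: each record carries a version, and a writer's update only succeeds if the version it read is still current. This is a generic sketch, not a prescribed design for the plugin; an in-memory compare-and-set stands in for a versioned SQL `UPDATE ... WHERE version = ?` whose affected-row count tells you whether you lost the race.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable value + version, standing in for a database row.
class VersionedValue {
    final long version;
    final String data;
    VersionedValue(long version, String data) { this.version = version; this.data = data; }
}

public class OptimisticUpdateDemo {
    // Stand-in for "UPDATE fingerprints SET ... WHERE version = :readVersion":
    // retry until our compare-and-set wins against concurrent writers.
    static void appendFacet(AtomicReference<VersionedValue> row, String facet) {
        while (true) {
            VersionedValue current = row.get();
            VersionedValue next =
                new VersionedValue(current.version + 1, current.data + "," + facet);
            if (row.compareAndSet(current, next)) {
                return; // our read was still current; update applied
            }
            // Another instance updated the row first: re-read and retry.
        }
    }

    static String runDemo() {
        AtomicReference<VersionedValue> row =
            new AtomicReference<>(new VersionedValue(0, "base"));
        appendFacet(row, "docker");
        appendFacet(row, "deploy");
        return row.get().version + ":" + row.get().data;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints "2:base,docker,deploy"
    }
}
```

The nice property is that readers never block and writers only pay for retries when there is an actual conflict on the same fingerprint, which matches the "it's only a problem when two instances trace the same artifact" observation above.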
And for me, even if the result of your project is an API in the Jenkins core and a reference implementation which works, but which has some documented limitations, it's a good result. Having a full production-ready implementation would be nice, of course, but it's not a strict goal for the project, because there are also security concerns and other things. When you try to create something production-grade, you realize that you need a team for several years, and that's where projects get stuck. For GSoC, you don't want that: we deliver something, and then we expand. Speaking of facets and other bits, if you're concerned about storing the data as text, there is actually another approach: you could just use blobs, because Jenkins includes its own data serialization engine. There is the Remoting protocol, basically master-to-agent communication, and it uses object serialization; there is an engine for that in Jenkins. So if you want to have binary storage, it could be more efficient, and you can just reuse components. I can show Remoting, but Remoting is the implementation; the data serialization logic is mostly in the Jenkins core, so you can use those APIs, and if you want, I can show them to you later. The other option was obviously that I go for a relational approach, break the data down, and have it in tables. But even that is not a straightforward solution, because even that requires that I know what changes have happened; I can't make an update operation without knowing that. So what I was trying was, I didn't want to touch the Fingerprint class much, but I can't think of a way to achieve this without touching it. So yeah. Yeah, it's technically possible. A fingerprint is just an abstraction layer right now in Jenkins. Well, to be honest, this abstraction is leaky, because it references the file system and vice versa.
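The blob idea mentioned above can be sketched with plain Java object serialization. This is only the shape of the approach: the real Jenkins machinery is more involved, the `FingerprintRecord` class and its fields are hypothetical, and an in-memory map stands in for a database BLOB column.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// A fingerprint-like object stored as an opaque binary blob. Plain
// java.io serialization stands in for Jenkins' serialization engine.
class FingerprintRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    final String md5sum;
    final String fileName;
    FingerprintRecord(String md5sum, String fileName) {
        this.md5sum = md5sum;
        this.fileName = fileName;
    }
}

public class BlobStorageDemo {
    static byte[] toBlob(FingerprintRecord r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(r);
        }
        return bytes.toByteArray();
    }

    static FingerprintRecord fromBlob(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return (FingerprintRecord) in.readObject();
        }
    }

    // Serialize, "store", "load", and deserialize in one round trip.
    static String roundTrip(String md5sum, String fileName) {
        try {
            Map<String, byte[]> table = new HashMap<>(); // stand-in for a BLOB column
            table.put(md5sum, toBlob(new FingerprintRecord(md5sum, fileName)));
            return fromBlob(table.get(md5sum)).fileName;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("2c8b...", "app.war"));
    }
}
```

The trade-off is the one raised in the discussion: blobs sidestep the schema problem entirely, but the stored data becomes opaque to the database, so you lose the ability to query or index facets for analytics.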
But still, there is an abstraction layer we could use. And if you want to build a relational structure for some components, it's possible, because for artifacts and other deployment facets there are classes right inside the Jenkins core. So you could try to build a relational structure for those. If you build that, it may definitely be helpful for getting information about these artifacts, for tracing, for querying the data from the storage or from the Jenkins side. If you do that, it will definitely be a nice addition. At the same time, it might be challenging when you go into details. For example, take the Docker Traceability plugin I presented. Here, for example, is the Docker fingerprint facet, which is basically just an abstraction layer, but the storage implementations actually include a lot of data in the Docker Traceability plugin, because we just invoke commands like docker inspect and translate their output to the data structure. So in the end, there is a lot of data being stored, and just handling this data, indexing this data in Postgres, might be challenging. Speaking of that, if somebody wants to revive the Docker Traceability plugin as a part of this project, it's also possible. I already did a couple of pull requests to the Jenkins core which could stabilize this plugin, so if you want to work specifically on the Docker side, I believe it would also be possible. Anything else? Any questions? Yeah, so if I want to propose any UI improvements, how do I go about that? I'm not sure if there's a UI SIG, so I'm not sure how to propose a UI change. Should I just add design templates? So, in your proposal, you're welcome to put it however you prefer, for example as a separate section with documentation, etc. Just explain what you want to do; it will be enough. If you want to get feedback, you can go to, for example, the Platform special interest group or Cloud Native.
There are special interest groups in Jenkins, like User Experience. What you could do, if you have a particular section, is just join the Gitter chat or send a message to the mailing lists and say, hey, I have a proposal, what do you think? You can also just go to the Jenkins developer mailing list, because we are perfectly fine discussing such topics there. So if you have a proposal which is, let's say, a part of your project, but a part which can be discussed separately, feel free to start a thread about it, and you will probably get feedback. Personally, I'm not an expert in UI, especially in terms of user experience; I can write some code, but I rely on feedback a lot when we develop stuff. And yesterday, we just had a blog post by Ulli Hafner about some UI components. I'm not sure whether you have already seen it; it targets the user interface of reporting plugins, but basically the same approaches could be applied to fingerprints, et cetera, because this area could be improved a lot in Jenkins. There is a lot of code, but basically it's about additional controls for analytics and data browsing, and for fingerprints it might also be important, because again, you would be handling a lot of data. Okay. Any other questions? Would it also make sense to file a JEP right now, or not? My proposal would be to postpone the JEP until the community bonding and coding phases. Right now, JEP is a pretty heavy process, and going through it might be time consuming, so my recommendation is to do it during the project. At the same time, if you have a proposal, you can just submit it as a Google Doc for initial review, and I can give you an example. I'm working on a public roadmap for the Jenkins project, and before submitting the JEP, I just started a developer mailing list thread, and there you can see a link to the Google Doc.
So, instead of doing a formal JEP, I basically used the Google Doc to put the information out and made it publicly available for comments. I've got some comments there, and I'm processing them. So even before submitting the JEP, I get some feedback from the community and use it. If you have something in mind, you could do it this way: for example, just send some parts of your draft, or maybe a link to your project draft, to the developer mailing list, highlighting the particular components. I believe that would be the best approach for now. Later, we can convert it to a JEP, that's for sure. For me, if you have a final implementation, having a JEP for it would be nice, but it's rather a follow-up. Any other questions, especially from the other students, who have stayed silent during the call? Don't hesitate to ask questions, and if I miss something, just ask. Or if you're not comfortable, you can ask in the chat. I still have difficulties opening it; just a second. Yeah, no questions in the chat right now. Apparently, you cannot open the chat when you screen share; I'll try to check the chat after the meeting. But in general, feel free to add whatever you have in mind to the Google Doc I shared, because these meeting calls are mostly for you, and they may help you with the next steps. Really interesting project, though. Well, we have many projects which are interesting, and you can find something in any project idea and expand it; they are just project ideas. For example, in your case, Slavin, since you work on the Docker plugins proposal, maybe you would be interested in taking a look at Docker fingerprints and Docker Traceability, because it's in the same area. For you, it might be an interesting addition to your project, if you just look at it. Yeah, do you have any links for that? Can you post them in the chat or something? I could definitely take a look. Yeah.
So this is basically Docker Traceability. Just to explain the history of this plugin: it was started in 2015, and I spent just two weeks, or maybe a bit more, on it. It was working, but it has some downsides right now. For example, it has no client library, so you'll have to use the REST API to submit the data from external locations. But basically, it allows end-to-end image and container usage tracking within Jenkins. For example, you build a Docker image using the Docker Pipeline plugin, and then you use it in a test environment; you can use this plugin to trace those usages. There are some summaries with deployment events, and everything is powered by the fingerprints engine. So, for example, once there is a newly created external fingerprint storage, you will still get good benefits from it in the Docker plugins project. Okay. Yeah. I'll definitely take a look. Thanks. Yeah. It's just an interesting use case for fingerprints. Anything else for today? If not, thanks a lot for your time. Again, this was just an overview meeting plus Q&A, so if you have any questions, we can follow up later in the chats. Right now, the project idea uses the Cloud Native SIG. Did we just lose the connection? I think so. He's the host, so we have to wait. So anyway, I went through a proposal. Really interesting; very nice proposal. Yeah, people are interested. We lost you, I think, for the entire time. Good to have you back. Oh, sorry. Yeah, I work remotely, and my internet quality is not very good, especially in the mornings, for reasons I cannot explain. Okay. So what I wanted to say is that right now the project uses the Cloud Native special interest group channels; depending on the state, we might keep it there, or we might move it to the Platform special interest group, which is much more active.
So I will make sure that there is a co-mentor on this project if it happens; if there are multiple projects, that's potentially possible, and I believe we can find mentors in the same way. Just make sure to submit your proposals early, so we can provide feedback and do the logistics on our side. It's fine to just get feedback on the draft as a Google Doc, right? I mean, the official GSoC interface isn't really necessary, right? Yeah, my recommendation is to do both. For the Jenkins community, yes, we operate through Google Docs and through mailing lists, because we want the discussions to be as public as possible. Only mentors would have access to your application on the GSoC website, so basically we don't use the GSoC website for discussions; we use other channels. But still, submitting a project right there makes sense, because, firstly, we can see who is going to apply to this project or who is considering it, and we can do some initial planning. Plus, we can get metrics, because we have two or three dozen active students across the projects right now. Likely we will get more applications; usually there are many last-minute applications, etc. But when we look at the proposal dashboards, we can already start pre-planning for the next phases, which is important, because April is a relaxed period for students, but it's not like that for the org admins and mentors. Yeah, I did that. I guess I submitted all of my proposals via the GSoC dashboard as drafts, so I'll probably upload the final PDF at the last minute. Yeah, keep editing your drafts; don't submit them as final proposals for now, just keep them as drafts. It's perfectly fine with us. Yeah, of course. Okay, anything else for today? I ask that too often. Yeah, thanks a lot for your time, and good luck with the project. Okay, bye all. Bye, see you.