Welcome, everybody, to this presentation on the Jenkins External Fingerprint Storage project. This was one of the Jenkins GSoC projects this year. I had some amazing mentors: Oleg, Andrey and Mike helped me all the way along.

For today's presentation, I'll start with a short personal introduction. Then I'll talk about what the Jenkins fingerprinting engine is and what fingerprints are, followed by the external fingerprint storage API that we developed during this project. After that I'll cover the plugins that are ready for the community to use. Then I'll show a demo of how to use these plugins, and we'll finish with a short Q&A.

I'll start with the personal introduction. I'm Sumit. I was the GSoC student for this project. I'm currently pursuing a bachelor's in engineering. I started contributing to Jenkins in December 2019, and I made some contributions around the Jenkins fingerprint engine, which ultimately led me to take an interest in, and take part in, this wonderful project.

So, the Jenkins fingerprinting engine and what exactly fingerprints are. Fingerprints inside Jenkins are a way to track the usage of artifacts or files across different jobs and builds, across the entire CI/CD flow, which makes dependency tracking easy. To take a small example: say team A builds an artifact, a.jar, and team B builds b.jar, which consumes a.jar. Team B finds an issue in a.jar and reports it to team A. Now team A needs to find out exactly which version of a.jar team B is using.
This is one of the use cases for fingerprints: using the Jenkins fingerprinting engine, team A can easily find out which version of a.jar team B is using, so they can fix the bug.

I'll show a live example of this inside Jenkins. Here I have a Jenkins instance configured with two jobs, job A and job B, just like in the example. If I trigger a build for job A, I can open the build that was created and see its fingerprints. It has recorded the fingerprint for one file, a.jar, and I can see its usage across Jenkins: so far it has only been used in build 8 of job A. Now if I go back and start a build for job B, which consumes a.jar, and look at which fingerprints were recorded, I can see that it recorded the fingerprint for a.jar, whose original owner was job A's build 8. In the usage history I can see that this particular artifact is used in job A's build 8 and job B's build 8. That's how the fingerprinting engine in Jenkins lets me view the usage of a.jar across the entire CI/CD flow.

So that was a small live example of the Jenkins fingerprinting engine, and I've shown the fingerprints UI inside Jenkins. Now I'll talk about the motivation behind what we did in this project. Before this project, the Jenkins fingerprint metadata was stored on the local disk, inside Jenkins home, as XML files. That had its own set of disadvantages. As the number of fingerprints increases, Jenkins takes up more disk space.
With local disk storage you cannot configure pay-as-you-use cloud storage, which is much cheaper these days. You also cannot configure, say, replica sets to provide better reliability and availability, and backup management is harder. Lastly, each Jenkins instance stores the fingerprint metadata only for its own fingerprints. So if you have multiple Jenkins instances and you want to track usage across them, there is no common store for all these fingerprints.

With these concerns in mind, we built the external fingerprint storage API in Jenkins core. The idea behind this pluggable storage architecture is that Jenkins core provides the external fingerprint storage API to plugin developers, and plugin developers can then build storage-specific plugins: say, a Redis fingerprint storage plugin or a Postgres fingerprint storage plugin. Users just install the plugin and configure their own storage instance, and after that all fingerprints start getting stored inside the external storage. A single such storage instance can also act as the store for fingerprints from multiple Jenkins instances.

This idea of externalizing a component of Jenkins is part of a wider initiative to make Jenkins cloud native. There are other stories too, for example externalizing the storage of configuration files, logs, etc. You can view all of these on the Cloud Native SIG page.

I'll briefly touch upon the external fingerprint storage API. If you'd like to know the various design decisions we took, there's a Jenkins Enhancement Proposal, JEP-226, which covers those in detail. We introduced this API in Jenkins core 2.242.
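To make the pluggable storage idea concrete, here is a minimal Java sketch of what such an extension point might look like. It is not the actual Jenkins core API: the names `FingerprintStore` and `InMemoryStore` are hypothetical, and fingerprints are simplified to serialized strings keyed by id.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: these names are hypothetical, not the Jenkins core API.
class PluggableStorageSketch {

    /** What the core would ask of any storage backend, keyed by fingerprint id. */
    interface FingerprintStore {
        void save(String id, String serializedFingerprint);
        Optional<String> load(String id);
        void delete(String id);
    }

    /** Stand-in backend; a real plugin would talk to Redis, Postgres, etc. */
    static final class InMemoryStore implements FingerprintStore {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public void save(String id, String serializedFingerprint) {
            data.put(id, serializedFingerprint);
        }

        @Override
        public Optional<String> load(String id) {
            return Optional.ofNullable(data.get(id));
        }

        @Override
        public void delete(String id) {
            data.remove(id);
        }
    }

    public static void main(String[] args) {
        // The caller only sees the interface, so backends are interchangeable.
        FingerprintStore store = new InMemoryStore();
        store.save("abc123", "<fingerprint><fileName>a.jar</fileName></fingerprint>");
        System.out.println("stored: " + store.load("abc123").isPresent());
        store.delete("abc123");
        System.out.println("after delete: " + store.load("abc123").isPresent());
    }
}
```

Because the caller depends only on the interface, swapping the in-memory map for a database client would not change the calling code, which is the point of keeping the storage pluggable.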
Since then we have been consistently improving it; up to Jenkins 2.253, something related to fingerprints has gone out in the releases. Apart from the basic methods that this API offers, like loading, saving and deleting fingerprints, it offers some other features, which I'll discuss in the following slides. If you'd like to know more about the methods the API offers, feel free to refer to the Javadoc.

One of those features is fingerprint cleanup, which we introduced in Jenkins core 2.248. As builds inside Jenkins get deleted, fingerprint metadata can become obsolete, and those fingerprints need to be cleaned up or they'll keep consuming extra space. That's the cleanup facility: there's a job that runs in Jenkins daily to clean up these fingerprints. We extended this functionality to the external fingerprint storage API so that external storage plugins can perform and configure their own fingerprint cleanup strategies. The API offers methods that plugin developers can use to let users configure cleanup, and because the API is generic enough, they can implement it in the most efficient way for their storage. For example, the Redis fingerprint storage plugin uses cursors to traverse all the fingerprints and clean them up.

We also understand that this performance overhead may not make sense if storage is very cheap. So in the fingerprints section of the system configuration, you can go ahead and disable fingerprint cleanup. Users can do that if they feel it's not worth the extra performance overhead.
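The cleanup idea described above can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the Redis plugin's code: the method name `cleanup`, the map layout (fingerprint id to the builds that used it) and the `cleanupEnabled` flag are all hypothetical, and a plain iterator stands in for a database cursor.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical names, not the Redis plugin's implementation.
class CleanupSketch {

    /**
     * Walks all fingerprints once, drops usages that point at deleted builds,
     * and removes fingerprints left with no usages.
     * Returns how many fingerprints were removed.
     */
    static int cleanup(Map<String, Set<String>> usagesById,
                       Set<String> existingBuilds,
                       boolean cleanupEnabled) {
        if (!cleanupEnabled) {
            return 0; // users may switch cleanup off when storage is cheap
        }
        int removed = 0;
        // The iterator plays the role of a storage cursor over all fingerprints.
        Iterator<Map.Entry<String, Set<String>>> cursor = usagesById.entrySet().iterator();
        while (cursor.hasNext()) {
            Set<String> usages = cursor.next().getValue();
            usages.retainAll(existingBuilds); // forget usages in deleted builds
            if (usages.isEmpty()) {
                cursor.remove();              // obsolete fingerprint
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> store = new HashMap<>();
        store.put("fp-1", new HashSet<>(Set.of("job-A #1")));
        store.put("fp-2", new HashSet<>(Set.of("job-A #2", "job-B #1")));
        Set<String> existing = Set.of("job-B #1"); // both job-A builds were deleted
        System.out.println("removed: " + cleanup(store, existing, true));
        System.out.println("remaining: " + store.keySet());
    }
}
```

Traversing with a cursor rather than loading everything at once is what lets a backend keep memory use flat while cleaning a large store.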
Whether to run cleanup at all is something we left up to the users.

Another feature that this API offers is fingerprint migration, which we introduced in Jenkins 2.251. For users who already have fingerprints in their local storage, Jenkins core automatically takes on the task of transferring these fingerprints to the newly configured external fingerprint storage. But we do not do that in one go: there might be a large number of fingerprints, and from a performance standpoint it doesn't make sense to move them all at once. Instead we implement a lazy migration strategy: as and when fingerprints get used, they are migrated to the newly configured external storage. This allows gradual migration of all fingerprints from local storage to the new external storage.

Lastly, I'll talk about the external fingerprint storage plugins that are available for the community to use. The first one we developed during this project is the Redis fingerprint storage plugin. Its latest release is 1.0 RC3, and it can be installed directly via the update center. The instructions for how to use it are all available, and you can even use JCasC to configure it. The GitHub page is also linked. We'd appreciate everybody giving it a try and letting us know their feedback on this plugin. It supports both migration and cleanup.

After building the Redis plugin, we realized that fingerprints were getting stored in Redis as blobs, so the fingerprint metadata was not queryable. With that in mind, to define a relational structure for this data and allow powerful querying, and also because Redis is an in-memory database, we decided to build the Postgres fingerprint storage plugin.
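As an aside on the lazy migration strategy mentioned earlier, the following Java sketch shows one way "migrate on use" could work. It is an assumption-laden illustration rather than the Jenkins core implementation: the class `LazyMigrationSketch` is hypothetical, and two maps stand in for the local XML files and the external storage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative only: hypothetical names, not the Jenkins core implementation.
class LazyMigrationSketch {
    private final Map<String, String> local;    // stand-in for XML files on local disk
    private final Map<String, String> external; // stand-in for Redis or Postgres

    LazyMigrationSketch(Map<String, String> local, Map<String, String> external) {
        this.local = local;
        this.external = external;
    }

    /** Loads a fingerprint, moving it to external storage the first time it is used. */
    Optional<String> load(String id) {
        String data = external.get(id);
        if (data != null) {
            return Optional.of(data);  // already migrated, or created externally
        }
        String legacy = local.get(id);
        if (legacy == null) {
            return Optional.empty();   // unknown fingerprint
        }
        external.put(id, legacy);      // copy to the new storage first
        local.remove(id);              // then drop the local copy
        return Optional.of(legacy);
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("fp-used", "demo.jar");
        local.put("fp-idle", "old.jar");
        Map<String, String> external = new HashMap<>();

        LazyMigrationSketch storage = new LazyMigrationSketch(local, external);
        storage.load("fp-used"); // only the fingerprint that gets used is moved
        System.out.println("external: " + external.keySet());
        System.out.println("local: " + local.keySet());
    }
}
```

Fingerprints that are never touched simply stay on local disk, so the cost of migration is spread over normal usage instead of being paid up front.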
The latest release of the Postgres plugin is 0.1 alpha 1. At the moment it supports migration but not cleanup. It allows powerful querying of the fingerprint metadata. In particular, there's something called a fingerprint facet: plugins can add extra information to fingerprints using facets, and that is also queryable. For example, the Docker Traceability plugin tracks the usage of Docker images, and you can trace these across multiple Jenkins instances via the Postgres plugin. At the moment you need knowledge of the database to write these queries, but this opens up huge potential for other plugins to consume this facility and offer it right out of the box to users. We'd also appreciate everybody trying it out. At the moment you can only install it via the experimental update center, because there has only been an alpha release, but give it a shot and let us know.

Finally, I'll move on to the demo. I have my Jenkins instance here. I'll create a new job and add a build step. It's a very simple job: it echoes a string and creates an artifact, say demo.jar, and I'll add a post-build action to record fingerprints. This way I can record the fingerprint of demo.jar. I'll hit apply and save. At this moment, this Jenkins instance has the fingerprint storage plugins installed, but I don't have them configured. So I'll trigger a build for the Redis demo job first and see what fingerprints got created. It's recording the usage of demo.jar in build one. You can see this editor also, right? Just to make sure. Yes. Awesome. You can see that it's actually storing this particular fingerprint as an XML file inside Jenkins.
I also have a Redis server here, and at the moment it's empty. Now I'll show how to configure the Redis fingerprint storage plugin. I'll go to Manage Jenkins. I don't have to install the plugin because I already have it installed, so I'll just go to Configure System and scroll down to the fingerprints section. As I discussed, here's the checkbox for enabling and disabling fingerprint cleanup, and here's the fingerprint storage engine drop-down. I'll choose the Redis fingerprint storage. Here I can configure the host, port, SSL, database, connection timeout, socket timeout and credentials, if any. Once I have all that in place, I can hit Test Redis Connection, and it gives me a success. That means it's able to connect to my Redis instance. I'll hit apply and save, and now the Redis fingerprint storage should store the fingerprints.

So I'll trigger a build for this job, and let's see what fingerprint it created. The usage of demo.jar has been recorded in build one and build two. Now if we go here, you can see this fingerprint got automatically deleted from the local storage. There is another fingerprint, but it's not the Redis demo one; it's the one from the live example earlier. And if I query my Redis instance, I can see that this fingerprint is now saved inside it. This is the Redis demo fingerprint, stored as a blob inside Redis. This is what I meant when I was talking about fingerprint migration: it automatically migrated this fingerprint from the local storage to the external storage.

Lastly, I'll also show how to use the Postgres plugin. I go to Configure System again, and in the fingerprints section I choose Postgres this time and configure the details.
I have a database named demo already, and I can add credentials for it. Just a disclaimer: users have to create the database on their own, and the Postgres plugin can use it after that. If I hit Test Connection, I get a success. But I also need to hit the button that performs the Postgres schema initialization so that the schema gets created inside my database. I got a success, so now I can hit apply and save.

Another side note: ideally we should not be configuring multiple storages at the same time. Just configure one external storage and then use it. But for the demo, I'm changing it to the Postgres one, and I'll trigger a build. Let's check that the fingerprint was recorded properly. Yes. And if I go to my Postgres interface, I can see that inside the demo database there is a fingerprint schema that has a table for fingerprints. I can see that it's recorded the fingerprint I asked it to, and it's storing information like which instance created this fingerprint, what the file name is, and which job originally created it. The job build relation table stores which jobs and builds have used this fingerprint, and the facet relation table stores the facets.

There is a phase 3 blog post that I've also linked in the presentation. It highlights some of the common queries that you can use on this database. Essentially facets, jobs, builds, anything can be queried. And if you have multiple Jenkins instances configured, you get all this information across Jenkins instances. So that's a great use case, and maybe in the future plugins can use it and offer it right out of the box to users.

Finally, next steps. Contributions are always welcome. We hope that more plugin developers will come in and implement plugins for more storages.
Cleanup is currently not offered by the Postgres plugin. Once we implement that, we can go ahead with the RC release for it. We also discussed some optimizations that can be made to the API, so there's some work we could do there. Otherwise, we'd appreciate everybody trying these plugins out. Feedback is always valuable to us, and if you have certain use cases that we could help with, we'd be happy to look into them.

Before I move into Q&A, there are some links here as well. I'll share this slide deck on the project page too. And I would like to thank my mentors, Oleg and Mike. They put a lot of effort into helping me out. We had long design discussions, and ultimately I think that made me a better programmer. It was a great summer. And the org admins for Jenkins have definitely done a great job with this entire program; I'm really thankful to them. I look forward to contributing more in the years to come. So thank you, and I'll open it up for Q&A.

Thank you. So yeah, we are going slightly over time. We have another presentation and 20 minutes left. But let's first answer a question from Ulia: are there plans to pull the external storage API up to Jenkins core, so that not every plugin that wants to use external storage needs to create individual plugins for Postgres, Redis, et cetera? So I guess the question is about external storage in general, not about external fingerprint storage. Just correct me if I'm wrong. I need to grant voice permissions first. Okay.

Yes, it's in general. It would make sense to have it not only for fingerprints. If I want to use it in the Warnings plugin, I currently need to write my own plugin for Postgres and so on, which seems like a few too many plugins.

Could you repeat the question? I can actually find the question. Sorry. Yeah.
So Ulia asked whether there is a plan to have a generic external storage API, not just for fingerprints but basically for any type of data.

As far as my understanding goes, there is a Database plugin for Jenkins, which you can use to configure databases in one place. But there is also JCasC and others, for example the Amazon S3 artifact manager. All these plugins and all these JEPs are looking into externalizing different components of Jenkins, and each of them needs a very different design. So I think this is happening at the moment in the Cloud Native SIG on more of a per-feature basis. But yeah, if we can actually make it generic, that would also be amazing.

Also, if you take a look at the pluggable storage page of the Cloud Native SIG, you can see that there are stories for generic build result storage, basically for run metadata, including actions such as Warnings NG plugin reports. So there is interest in having such a generic layer, and the work done by Sumit basically opens a way for us, I would say, to use Redis, because the same implementation could be used to serialize the data and to extract the data. We could start building that soon; it really depends on contributions. But for that we need to have enough information, and fingerprint storage is an additional source of information for us so that we can do it right.

Are there any comments from the mentors of this project? Okay. Mike is currently not able to speak. He said that Sumit did a great job and the demo went really well. So thanks to Sumit. And now a comment from me. It was a really great project, and Sumit did a lot of things which we originally didn't plan, so we were way ahead of schedule. I would say that the key functionality was delivered during the first coding phase. We had a reference implementation which was fully operational by the middle of the second coding phase.
And it's not an easy project: fingerprints are part of the Jenkins core, which means much stricter delivery cycles, and also a Jenkins Enhancement Proposal, a process specific to Jenkins core and its evolution. I would say that Sumit did a really good job preparing the plan, documenting it, preparing a formal specification for the Jenkins Enhancement Proposal, and communicating with the Jenkins core maintainers team and with other plugin maintainers. So there were a lot of people involved. These implementations are also quite complicated on the technical side, and still the plugins have, for example, integration testing, which involves tools like Testcontainers.

So for me, it's a completed project with a lot of value delivered to the Jenkins community. As I said before, it also builds a foundation for future pluggable storage development in the project. Personally, I think this project is a total success. Hopefully we'll be able to finalize it, to finalize the status of the JEP, and to make the API public with the next LTS baseline. I believe everything is in place, so it's just a matter of a discussion on the Jenkins developer mailing list. So thanks a lot, Sumit. Thank you.