Well, thank you for joining this session today. My name is Márk Sági-Kazár. Don't bother trying to pronounce it, just call me Marc. I work for Outshift by Cisco, which is Cisco's innovation lab. I'm also a CNCF ambassador as of this March, and I'm an open source tech lead within Cisco, so I work with open source technologies, which is partly why I'm delivering this talk today. But I'd like to start by telling a story about how we got into working with container registries and why we started looking at this. As I mentioned, Outshift is an innovation lab, and innovation, most of the time, starts by trying to validate some sort of idea. What I didn't mention about Outshift is that right now we are focusing on the cloud native space, so we have a bunch of open source software and products in that space. And cloud native applications and solutions are often delivered as container images. Putting these two together: we have ideas to validate, and we need container images to be deployed, potentially in design partner environments. That's the third part of the story of why we started looking at running and using container registries. From past experience, we have learned that distributing container images is not necessarily a trivial problem. You have to think about the kinds of environments you want to distribute those images to: developer machines and CI need to be able to pull them, and you have to be able to pull them from container orchestrators like Kubernetes, obviously. So we needed some sort of flexible authentication and authorization solution alongside the container registry that we wanted to use for distributing images to design partner and customer environments. The other thing we wanted, or rather didn't want, was to spend a lot of time and resources on operations.
We wanted to minimize the operational burden. This was the beginning of a project of distributing images to design partner environments, and operations was not what we really wanted to spend time on, so we wanted a solution that gives us backups, monitoring, and everything else necessary for running a service like this. Now, obviously, we didn't want to build it from scratch. So where do people go these days when they need a cloud native solution for something? The CNCF landscape, of course. So we took a look at the CNCF landscape to see what options are out there, and there are quite a few, actually; there are many more now than before. The CNCF landscape has an actual container registry category, with quite a few solutions in it, and if you look closely at those solutions, you can identify a couple of categories they fit into. We came up with these four categories. Cloud hosted solutions are the most common one; basically every cloud provider has its own container registry. Then there is a special category which doesn't really fit in with the rest, which is peer-to-peer registries. Peer-to-peer registries focus on efficiently distributing images within a system; they are not really for distributing images outside of a system, for example for sharing images with customers. Then there are the all-in-one solutions, like JFrog and Harbor, the artifact repository kind of solutions. And the last category is what we call plain old registries, which are basically just container registries and nothing else. Now, as I said, we needed to find a simple solution at first.
And it is kind of an obvious choice to take a look at cloud hosted registries, because they are really easy to set up; basically, there is no operational cost you have to think about. But obviously, they have downsides. For example, you need to manually set up IAM and cross-account access with other customer accounts if you want to share images outside of your company, or you need to share IAM credentials from your own cloud provider account, which is kind of weird. And the other thing is, surprisingly, the companies we worked with were not really eager to register for cloud provider accounts they weren't already using. If we asked them to register on a specific cloud provider, they were not happy about it. I don't really know why, because it's basically free, and you don't really need to do anything other than set up some IAM credentials, but this was the feedback we received fairly regularly. So as time passed, the number of design partners grew, the number of customers grew, more and more projects were onboarded to the solution, and new requirements came up. As I mentioned, companies didn't really like using cloud providers they weren't already using, so the new requirement was that no specific cloud provider registration would be required for the next iteration of our container registry solution. The other thing that came up, as more and more projects started to use the solution, was a need for more flexible authorization. If a project has dozens of container images, manually granting access to each of those images in a cloud provider hosted registry is tedious, so projects wanted a more flexible way of granting design partners access to their images. In product management there is this notion of entitlement-based access management, which is basically just role-based access management in our world.
So going back to research and looking at the previous options: obviously, cloud hosted registries were no longer an option. Peer-to-peer registries, as I mentioned, have a very specific use case, so they were not really an option for us either. The next thing we looked at was trying to use one of the all-in-one solutions available on the market. Again, going back to the CNCF landscape, there are quite a few all-in-one solutions in the container registry category. I'm not going to detail why, but we pretty quickly crossed off JFrog. JFrog is awesome, but the kind of integration we needed would have taken a lot of work. Quay wasn't open source at the time, so it wasn't really an option; it's in the CNCF landscape now, but it wasn't open source back then. Portus was basically unmaintained by then as well. So the only thing that remained on the list from the CNCF landscape was Harbor. And Harbor ticked off a lot of boxes. Harbor structures container images into projects, which makes the authorization setup a little easier, because products can get their own projects, and even if they have multiple image repositories or container images, we can grant access to those specific projects. So authorization became simpler. Harbor also has this concept of robot accounts, where you can create special credentials for service-to-service authentication, so you no longer need cloud provider IAM to use the registry. And it comes with a lot of other features, like image replication, which was really useful for us, because we didn't actually have to change our CI processes; we just needed to replicate images from our existing registries into Harbor, which made the whole operation a lot easier. Interestingly, Harbor actually uses distribution under the hood, which falls into the plain old registries category.
I will talk about that a little bit later. Now, Harbor is awesome, but as we started to use it, we noticed a few things that made it a little harder to use. We overcame those challenges, but it was at the back of our mind the whole time that these were not ideal solutions. For example, group-based access is not available for robot accounts, so we had to manage access for robot accounts to those projects individually, which we did: we built a little service next to Harbor, and we were able to manage the entitlements and roles within our system, but it was a little inconvenient. The other problem with Harbor, specifically around API integrations, is that API authentication works a little strangely: as a user you can use specific parts of the API, but if you want to control everything, you need to use the so-called admin credentials, which is not really a great idea if you want to limit the access of a sidecar service, for example. We also needed admin access to create cross-project robot accounts, because robot accounts in Harbor either belong to a specific project or they are system-level robot accounts. So we had these kinds of issues, but Harbor basically ticked off all the requirements we had at the time, and it's still running, so I guess it was good. Now, some time later, the company started to adopt product-led growth, which often comes with requirements like being able to subscribe to a software product on a self-serve portal without any kind of manual sales process. So building a self-serve portal was kind of the next milestone for us. In addition, at this point we were already looking at onboarding customers to the solution as well, not just design partners, so closer integration with sales and licensing, and using those systems for granting access to specific products, became a new requirement.
This is the point where we decided maybe we should take a look at building something of our own. Harbor was great, but it had its limitations, and things like external authorization are not an option in Harbor. Now, before telling you what solution we came up with, I need to talk a little bit about how container registries work. How many of you are familiar with OCI? The Open Container Initiative, cool. If you're familiar with OCI, you know that OCI publishes three specifications today. One is for the runtime: how you can run a container from a filesystem bundle and configuration. One is the OCI image specification: how the image is formatted. And the interesting one in our case is the OCI distribution specification, which is basically an HTTP interface for pushing and pulling images. Now, the thing is, and it's probably not obvious from this image, the OCI distribution spec actually doesn't define any kind of authentication or authorization solution. I guess that's a conscious decision, because authentication and authorization are hard and everyone has different needs, so they wanted to publish something that works for everyone. So that's the OCI distribution specification. The spec itself defines an HTTP interface, which means that technically you could build any kind of HTTP-based authentication for your registry, if the client supports it. And in fact, basic authentication is something that most clients support today. So if you wanted to put basic auth in front of your registry, you could do that, and Docker and Skopeo would probably work. I'm not sure all of the clients support it, but most of them do. Now, the problem is that this still doesn't solve the authorization problem. You've authenticated the user, fine. How do you make sure that users can only pull the images they are supposed to pull, or that they can't push anything to the registry?
And the thing is, there is no real formal specification for that today, but if you have ever used docker login, you've already used the one available to everyone. It's not a formal specification, which means there are issues with it, but it's documented under the distribution project, and it's basically what Docker came up with to solve the authorization problem. It's a token-based authorization scheme, which means that when you authenticate, you get a token, and with that token you are able to access specific resources within the registry. Now, this probably looks a bit more complicated than it really is; if you want to take a photo of the slide, feel free, and we can talk about it at the end. Basically, the specification requires you to run a so-called authorization service, and this authorization service will be the one issuing these tokens. The process starts with the client, which is Docker or Skopeo or whatever you use, talking to the registry, because by default you know where the registry is, but you don't know where the authorization service is. So the first step is talking to the registry; the client in fact always goes to the registry as a first step, and initially the registry will reply with an unauthorized response: if you don't have a token, you can't access the resource. But in that same response, the registry also tells you where you can get a token that will allow you to access that specific resource. It returns a challenge header that contains the authorization service location, the resource you are trying to access, and the type of action you are trying to perform. So in the third step you can go to the authorization service, present your credentials, your username and password, and ask for a token that you can give to the registry.
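To make the challenge step concrete, here is a small Go sketch that parses the Bearer challenge a registry sends back in its WWW-Authenticate header. The realm, service, and scope values in the example are invented; a real registry returns the URL of its own authorization service.

```go
package main

import (
	"fmt"
	"regexp"
)

// Challenge holds the fields a registry returns in its WWW-Authenticate
// header: where to get a token (realm), which service to ask it for,
// and which resource/action the client was trying to reach (scope).
type Challenge struct {
	Realm, Service, Scope string
}

// paramRe matches key="value" pairs inside the challenge header.
var paramRe = regexp.MustCompile(`(\w+)="([^"]*)"`)

// parseChallenge extracts realm, service and scope from a Bearer challenge.
func parseChallenge(header string) Challenge {
	var c Challenge
	for _, m := range paramRe.FindAllStringSubmatch(header, -1) {
		switch m[1] {
		case "realm":
			c.Realm = m[2]
		case "service":
			c.Service = m[2]
		case "scope":
			c.Scope = m[2]
		}
	}
	return c
}

func main() {
	// An example challenge, with made-up hostnames.
	h := `Bearer realm="https://auth.example.com/token",service="registry.example.com",scope="repository:myimage:pull"`
	c := parseChallenge(h)
	fmt.Println(c.Realm)   // https://auth.example.com/token
	fmt.Println(c.Service) // registry.example.com
	fmt.Println(c.Scope)   // repository:myimage:pull
}
```

The client then sends the service and scope values, along with its credentials, to the realm URL to obtain a token.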
Now, the authorization service will verify your credentials and check your authorization, but it doesn't actually enforce the authorization; it just returns the list of actions you can do. So if you try to push to a repository that you're not supposed to, the request to the authorization service will not fail, because the service does not actually know about that repository in the registry; it will just return an empty list of actions. In the fifth step, the client goes back to the registry with the same request, but this time it presents the token, and the registry is the one that actually performs the authorization, seeing in the token the kinds of actions you can do. This token is generally a JSON Web Token, although that's an implementation detail; different registries may implement different strategies, but it's usually a JWT, and within it you usually have the action scopes for the resources you are trying to access. And in the final step, the registry begins the operation you wanted in the first place. So that's how the Docker registry authorization spec works. Now let's try to put all this together, because we now have the building blocks we can use to actually build our own private container registry, and the first step is choosing the registry you want to use. From the CNCF landscape list, there are two registries falling into the plain old registries category. The first one is distribution, which I already mentioned. It's basically the reference Docker registry implementation that Docker originally wrote. Most providers out there that have some sort of registry rely on this project, like Docker Hub, GitHub Container Registry, and so on. Harbor uses it. So most providers either use it directly or use components of it, because it can be used as a library as well.
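The "return a list of actions instead of failing" behaviour described above can be sketched like this. AccessEntry mirrors the typical shape of the access claim inside these tokens; the allowed map is a stand-in for whatever backend a real authorization service would consult, and all names are made up.

```go
package main

import "fmt"

// AccessEntry mirrors the usual shape of an entry in a registry bearer
// token's access claim: a resource type, its name, and granted actions.
type AccessEntry struct {
	Type    string   `json:"type"`
	Name    string   `json:"name"`
	Actions []string `json:"actions"`
}

// grant filters a requested scope down to the actions the user is allowed.
// allowed maps repository name -> permitted actions; an unknown repository
// simply yields an empty action list — the token request itself never fails.
func grant(requested AccessEntry, allowed map[string][]string) AccessEntry {
	granted := AccessEntry{Type: requested.Type, Name: requested.Name, Actions: []string{}}
	permitted := map[string]bool{}
	for _, a := range allowed[requested.Name] {
		permitted[a] = true
	}
	for _, a := range requested.Actions {
		if permitted[a] {
			granted.Actions = append(granted.Actions, a)
		}
	}
	return granted
}

func main() {
	allowed := map[string][]string{"user1/app": {"pull", "push"}}

	// Everything requested is permitted: full scope comes back.
	ok := grant(AccessEntry{Type: "repository", Name: "user1/app", Actions: []string{"pull", "push"}}, allowed)
	fmt.Println(ok.Actions) // [pull push]

	// Unknown repository: empty action list, but no error.
	denied := grant(AccessEntry{Type: "repository", Name: "product2/app", Actions: []string{"push"}}, allowed)
	fmt.Println(denied.Actions) // []
}
```

The registry, not this service, is then the component that compares the granted actions in the token against the operation the client is attempting.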
And it has a ton of other features, like CDN support, so you can distribute images more efficiently. So it is kind of the de facto implementation. It implements OCI, and I believe it's also backwards compatible with the old Docker v2 registry spec. Now, distribution is usually a good default choice. The team behind it is currently focused on delivering the v3 version of distribution, so maintenance is a bit slow at the moment; the last minor version was from 2022, and it does have a few missing features. For example, if you want to run the registry on a cloud provider, it doesn't really support the newer kind of workload identity authentication. So if you want to run it on AWS and use S3 as a backend, you need to use an actual IAM user today to authenticate with the AWS APIs. It doesn't support workload identities, which has been requested for a while now, but obviously the team is working on v3, so they are kind of swamped right now. The other option we have in the CNCF landscape is called zot. It's actually a project that Cisco developed initially and donated to the CNCF. Both distribution and zot are CNCF sandbox projects, by the way. Now, the problem with zot is that its support for the registry auth spec is kind of broken at the moment. I do have a fix in a PR, and we are in communication with the zot maintainers, but if you want to use the registry auth spec right now, you can't really. On the other hand, zot is much smaller and more lightweight; it's basically a pure OCI distribution implementation, so it doesn't have any of the CDN and other features. Now, the other thing you need to run your private container registry is an authorization service, and based on the documentation the distribution project has, you can absolutely build your own authorization service. In fact, for a while the distribution project had a reference implementation of the authorization service in its repository.
They removed it a couple of months ago because it wasn't maintained and was rather simple. But there is another library that I've been working on for a while now, and I published it on GitHub. It's basically a library that implements the specification, so if you just want a library you can use to build your own service, you can; but it's also a service, so you can run it as it is right now, and it comes with a bunch of different integrations that let you run it as a ready-made service. It's still a work in progress, but it's actually working, so if we have time I can do a little demonstration here. Let's see, I think we have some time. Can you read it? So this is a quick start project that you can find on GitHub. If I go into the Docker Compose setup, it's basically the authorization service itself, the distribution container registry, the zot container registry, and Cerbos, which is an open source authorization solution; it's also a CNCF sandbox project, as far as I know. I use Cerbos to integrate a rather flexible authorization solution into the service. We'll take a look at the solution itself in a moment, but first I wanted to look at the authorization service configuration. It has a couple of different providers: you can use different authentication solutions, different token issuers, and different authorizer solutions. Cerbos is one, but you can easily integrate your own, and the same goes for user management. This one is a static list of users, and this example actually implements three different use cases. One is a simple user use case, where you are able to push into your own namespace but not anywhere else. The second is the admin use case, where you can push everywhere. And the third is the customer or design partner use case, where you have a number of entitlements.
It's kind of a role-based access control model, where you have a number of entitlements, and those translate to a list of repositories you have access to. Now let's take a look at the authorization policy itself, and then we can check the example as well. So this is the Cerbos authorization model. It allows you to write rather complex authorization rules, which is kind of why I like it. Basically, the use cases we have here are: everyone is able to pull from the default namespace, whether they're a user or an administrator; the admin user can push and pull everything; the admin user can push into the default namespace, which is kind of redundant; the users are able to push into their own namespaces, and this is the interesting part, it basically checks the username against the repository path; and finally, customers are able to pull from namespaces where the repository name starts with one of their entitlements. These are rather simple examples, but you might be able to see that using an authorization solution like this, combined with any kind of user management or licensing system, you can basically build whatever solution you need. So let's actually start everything, and again, this is available on GitHub, so if you want to follow it later, you absolutely can. I'm just going to go through some of the examples here. First we need the configuration, but I think we already have one. We already have a registry; I'm going to use distribution in this case, but both distribution and zot should work with these examples. So the first thing to do is logging in to the registry, and the credentials are just admin and password in this case. What's going to happen here is that Skopeo will try to go to the registry first, and the registry will say you are not authorized.
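To show the same rules outside of Cerbos, here is a rough Go equivalent of that policy. The role names, usernames, and entitlements are all invented for illustration, and the real demo delegates these decisions to Cerbos rather than hard-coding them like this.

```go
package main

import (
	"fmt"
	"strings"
)

// Principal is a simplified view of the demo's users: a role plus,
// for customers, a list of entitlements (all names are made up).
type Principal struct {
	Name         string
	Role         string // "admin", "user" or "customer"
	Entitlements []string
}

// allowed reproduces the demo policy in plain Go:
//   - admins can push and pull everything
//   - everyone can pull from the default (root) namespace
//   - users can push/pull inside their own namespace ("<name>/...")
//   - customers can pull repositories prefixed by one of their entitlements
func allowed(p Principal, repo, action string) bool {
	inDefault := !strings.Contains(repo, "/")
	switch {
	case p.Role == "admin":
		return true
	case action == "pull" && inDefault:
		return true
	case p.Role == "user":
		return strings.HasPrefix(repo, p.Name+"/")
	case p.Role == "customer" && action == "pull":
		for _, e := range p.Entitlements {
			if strings.HasPrefix(repo, e+"/") {
				return true
			}
		}
	}
	return false
}

func main() {
	user := Principal{Name: "user1", Role: "user"}
	customer := Principal{Name: "acme", Role: "customer", Entitlements: []string{"product1"}}

	fmt.Println(allowed(user, "alpine", "pull"))           // true: default namespace
	fmt.Println(allowed(user, "user1/app", "push"))        // true: own namespace
	fmt.Println(allowed(user, "alpine", "push"))           // false
	fmt.Println(allowed(customer, "product1/app", "pull")) // true: entitlement
	fmt.Println(allowed(customer, "product2/app", "pull")) // false
}
```

The point of using an external policy engine instead of code like this is that the rules can evolve, and be fed from licensing or sales systems, without touching the authorization service itself.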
Skopeo will go to the authorization service and get the token, and in the next step it's actually going to repeat that, because it needs a separate token for every single repository. So login succeeded, and now I can try to push some images. I'm going to push an image to just the root namespace as alpine right here, and then I'm going to push an image called product1, which is going to be one of the entitlements assigned to a customer. Let's see. I kind of hope it didn't break, because I changed a few things last minute, of course. So we were able to push the images as an admin. Now let's log out from the registry and try the user mode, which is again quite simple: the users come from this configuration file. There is the user, the admin, and the customer user. Then let's try to pull some images. I should be able to pull an image from the root namespace, which is fine, because that's just a collection of images I want to share with everyone. Now, the user should not be able to push a product image, and indeed it doesn't get the scopes necessary to push the image to the registry as a product image. But the user should be able to push to its own namespace. This is the one that broke before, so I'm kind of happy that it worked now. And the user, once again, should not be able to push anything to the root namespace, because that's the place where you just want to share images with everyone, and indeed it can't. All right. And obviously the same works for the customer as well: the customer should be able to pull the product from the entitled repository, but not from the rest. Okay. Let's go back to the slides. So you can find it on GitHub, and you can give it a try. It's still a work in progress, so feel free to share any feedback you have. Now, I've told you the good parts; obviously there are bad and ugly parts as well. As I said, this registry auth specification is not a formal specification, which means there are several gaps in it.
There are several open questions that are unanswered, and even the documentation itself is outdated. All that unfortunately leads to quite a few issues. For example, partial implementations: registries and even clients don't always implement the specification right, which means there will be incompatibilities between registries and clients. And there are even competing, but not completely compatible, specifications; for example, the ChartMuseum auth specification. This is actually the reason why zot is broken right now: the ChartMuseum auth specification is based on the Docker registry auth spec, but it's not completely compatible with it, and zot uses the ChartMuseum auth library, so it's kind of broken at the moment. Many people probably don't see these issues, because they don't write registries or clients, and the mainstream clients probably work just fine. But there are people who do write registries and clients, and those people started chatting a while ago about how they could fix this situation. Fortunately, we now have a formal OCI auth working group as of August the first, I believe; that's the date of the first working group meeting they had. They are working on coming up with an official specification. Obviously, that's not going to happen overnight, and it's not going to be adopted overnight, so for some time we will still need to work with the current solution, the current Docker registry authentication. That's what most clients implement anyway, so it's going to be around; it's not going to disappear for a while. But hopefully we will have an OCI proposal soon, and we can work towards a better solution. So that's what I wanted to share with you today. Thank you very much for your attention, and I'm happy to answer any questions if you have any. I believe we have a microphone here, so if you have a question, you can grab it, or I can just repeat the question. I think so. I mean, I absolutely want to work with the working group.
I'm not part of the working group, but I've already shared it with them. I received some feedback, and it's actually going to be interesting to see how the new specification will either change or add something to the existing one. They're not promising that it will be backward compatible, but there is a chance that the new spec is not going to break the old one, so it's going to be interesting to see how they work side by side. I think there is a chance, a strong chance, that it's actually not going to break: the old specification will work for a while, and then there will be a completely new one that we can use in addition to the old one. So I think once there is a proposal on the table, we'll probably implement it in Portward. Any other questions? All right, if there are no more questions, you can find me on Twitter, you can drop me an email, or you can find me in the hallway. Thank you for listening to my presentation.