Welcome everyone. My name is Simo Sorce. I'm a senior principal engineer at Red Hat and I work as the team lead for the crypto team. Today I have this presentation about building an OpenSSL provider, in this case specifically for PKCS#11. I'll give a high-level overview and then some tips and some lessons learned. Feel free to ask questions anytime; I welcome them. So I'm going through: what's the problem? What is a provider? What is PKCS#11? What is the PKCS#11 provider? And then what happened when I tried to write one.
The problem is very simple, and it's not actually a new problem. In previous OpenSSL versions there was a thing called engines, and there was an engine that allowed you to use PKCS#11. The problem is specifically with the new version, OpenSSL 3, which deprecated engines and introduced this concept of providers. And the problem is making an application that uses the OpenSSL API able to use a hardware token, or even a software token, that does some cryptographic operation.
So what is a provider, and why is there this change? OpenSSL had these engines, and there were limitations. Sometimes they were awkward to use in applications. Providers are kind of better engines, if you want, from an application point of view. They can be transparent to applications, unlike engines, where there was a requirement that the application know how to use them and call specific APIs. There are in fact already multiple providers within OpenSSL 3 that applications don't really see, like the default one, FIPS, legacy and so on. So it's not required for an application to explicitly select a provider. It can be, for example, configured in the OpenSSL configuration file and loaded transparently.
You can also set properties for a provider, which can change how OpenSSL selects a provider or change the behavior of a provider. So it basically allows a good degree of configurability, or agility if you want, in the way cryptography is used once an application uses OpenSSL, which was not really available before, I would say.
At the code level, it's just a loadable module, like a shared library. In the OpenSSL configuration file you define: I have this provider, this is the name. OpenSSL can then load this shared object, find its functions and use various management functions. You generally have to implement at least key management operations because, when you're using cryptography, you're usually using keys. Technically there are some operations OpenSSL can do that don't involve keys, but those are not very interesting, to me at least. You also need to support at least a way to reference these keys later in other operations through OpenSSL. But that's it: it's just a self-contained shared object that can do operations that OpenSSL can call into.
This is taken directly from the old OpenSSL 3.0 design, just to show you the difference in terms of where these things live. The engine is a sort of legacy thing that has its own world around it, while the provider sits below the core of the library today. So it's really hidden and very low level, if you want. All of the providers are basically on the same level, while the engine was a special thing, together with many other special things. That's why a provider can be used somewhat transparently, while engines really had to be coded for specifically.
All right. So what is the difference between when an application had to use engines before and now, when it needs to use providers?
As you can see on the left side, with engines you see a lot of this word "engine": you have to load the engine, you need to know what the engine is and what its name is. Then you need to use a special operation to load a key from the engine. Then you do the normal operations, close and go away. The critical part is that you really have to change your application to use these functions.
On the provider side, if you use the modern OpenSSL API, you just open a store, which can be anything, even a normal file. So you can have an application that doesn't know anything about your provider. Then you load a key from the store. Again, there is no special knowledge of where this key is coming from or what it is. Then you do your operation and close.
So in terms of the amount of code, it's comparable. But the good thing is that, if you look at the URI, which is the only identifier that differs, if all goes well it usually doesn't even come from the application directly. It comes from some configuration file, or maybe some prompting if it's a command-line utility. So it can be totally transparent to the application itself. If the application is well written, you can test with PEM files for keys, and then you bring in your provider, which has the keys in a completely different place, and the application will just keep working fine without knowing anything.
So for a well-architected application, it makes it much easier to plug in modules that do new things without having to constantly write patches and file bugs against many applications. Of course, this doesn't work backwards in time; it's for new applications that use the new API. But that's okay, I think.
Okay, so what providers are available? Besides the internal ones, there are a few good pointers in this GitHub project, and there are two that I consider notable because I used them a lot to learn how an external provider should behave. One is the tpm2-openssl provider. It's a provider that gives you access to a TPM, so you can basically use the TPM directly from OpenSSL. The other is the OQS provider, which uses liboqs, which implements quantum-safe algorithms. Both of those providers are already quite complete, I would say, in terms of the number of operations they implement. I'm not making a code quality statement, just a statement about the extent of the APIs they use. They were very useful for learning more about providers before I ventured into writing this other one.
So let's digress a little bit into what PKCS#11 is before we go there. You need to understand a little of what it is to figure out what I did and why. PKCS#11 is fundamentally an API, a C API specifically. What it does is mediate access from a more generic application to a generic hardware token like an HSM, a smart card, a YubiKey, whatever you have. So it's very simple in concept. The standard is managed by OASIS, and it really only defines how an application talks to a shared module. It doesn't implement or define any hardware protocol. How that driver talks to the actual hardware is completely out of scope. It basically creates an abstraction layer over potential hardware communication, but it doesn't have anything to do with the hardware itself. That's important because it means you really need to have yet another piece of software underneath. It's not just calling and talking directly to some hardware. And the other thing is that it's not a monolithic thing.
There are many APIs defined in PKCS#11, but not all of the drivers implement all of them. That also means you need to be a little careful about what you're going to use, because maybe one hardware token with its driver will support an API and another will not. So it is more like a collection of APIs, in some sense, rather than a solid thing where you can just write something, test it, and it will work everywhere. It really inherits the hardware limitations, in some sense. And it continues to evolve as new cryptographic primitives, protocols or algorithms are created.
So now we can say what the PKCS#11 provider is: it just takes this API and sticks it into OpenSSL, with a middle layer that translates between the way OpenSSL defined the provider API and the driver you want to use for whatever hardware you have. So it's just another abstraction layer somewhere in the middle, and eventually you go and talk to hardware, or a software token, or whatever it is. And it is at that address there on GitHub.
So what are the goals for the PKCS#11 provider? The main goal for me was, of course, to make it possible to use PKCS#11 tokens with OpenSSL 3, because the engine interface is rapidly degrading in its ability to work well, due to the deprecation. We wanted to look forward and use the native OpenSSL APIs, so that eventually we can stop using these deprecated interfaces. But the thing that really intrigued me was the fact that, if used correctly, it can be completely transparent to the application.
One of the things that, in my opinion, made it very hard to use hardware tokens was that whenever you wanted to use one, you had to go to the application developer and ask: can you start using this engine API, as a special case, so I can use this token? So you had to go and change applications. With providers, there is a good chance that you can send patches to change how the application works, but it's not a special case, it's just the common case. And depending on how you configure OpenSSL, you will use either the standard cryptography within OpenSSL or the provider.
The other thing is that you can use the standard PKCS#11 URIs to define how to get access to a key. It's a very well done standard, I think. It gives you all the tools and tweaks you need to identify a key in the token, so that you can uniquely identify the key and use it when needed. I haven't felt any need for anything else, so I think it's complete from that point of view.
The other thing I wanted is to get to a point where we can configure the PKCS#11 provider in a system like Fedora or RHEL as a standard provider in the default configuration, and yet not have undesirable side effects in applications, unless the application is actually using the provider, in which case it is not a side effect, it is wanted. And of course there is also the case where applications want to explicitly force the use of the provider. In that case they will use a property like provider=pkcs11, and that just needs to work.
And the final goal was to make it work with as many actual hardware tokens as possible.
Because, as I said, many of these tokens support just a subset of algorithms or a subset of functions. Sometimes they have quirks where they kind of support something but not in the correct way, and many other strange things, because hardware is kind of hard sometimes. And the same goes for software tokens. So these are the goals for the project.
So how can you use it? I'm not going into details or giving you an exact configuration; that can easily be found online. But fundamentally it's just a little snippet of configuration in the main OpenSSL configuration file. You need to define where the module is, that is, where you installed what you compiled. You need to define which PKCS#11 driver you want to use. You might want to define a PIN, because normally with hardware tokens you need a PIN to unlock the token and permit operations. It's not required to set the PIN in the configuration. There is support for prompting if, for example, you are using OpenSSL commands in an interactive session, so you can avoid that. But generally, for a service running on some machine, you're okay setting the PIN in the configuration so that the machine can access the token directly, without any dance around entering the PIN and keeping it in memory. But that's up to the users.
Then you can run a command like the one you see there, openssl storeutl -keys -text pkcs11:, to test the configuration. You should see something like this: it connects to the token; if a PIN is needed and it's configured, it uses it, otherwise it prompts you for a PIN. Then it fetches the objects from the token and prints them out, because of the -text option. For printing, we implemented encoders and decoders explicitly so that we print a little more information than OpenSSL normally prints when it pulls keys.
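As a rough illustration of the configuration snippet described above, the fragment below shows its general shape. The section names, both file paths, the PIN file location, and the choice of a SoftHSM driver are assumptions made for illustration. The option names reflect the pkcs11-provider documentation as best understood here and should be checked against the project's own docs before use.

```ini
# Sketch of an openssl.cnf fragment; paths and section names are assumptions.
openssl_conf = openssl_init

[openssl_init]
providers = provider_sect

[provider_sect]
default = default_sect
pkcs11 = pkcs11_sect

[default_sect]
activate = 1

[pkcs11_sect]
# Where the compiled provider module lives (assumed path)
module = /usr/lib64/ossl-modules/pkcs11.so
# Which PKCS#11 driver the provider should load (assumed path)
pkcs11-module-path = /usr/lib64/pkcs11/libsofthsm2.so
# Optional: PIN for unattended services; leave out to be prompted interactively
pkcs11-module-token-pin = file:/etc/pki/pkcs11-pin.txt
activate = 1
```

With something along these lines in place, listing the keys through the store utility is a quick way to confirm that the module, the driver and the PIN handling all work together.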
And what we print is, for example, the URI you could use if you want to use the key you just found. Say you have a token and you generated some key, maybe with some YubiKey utility or whatever, and you're wondering: how do I find this key that I want to use in an application? Just use this command. It lists all the keys you have, you find the key you like, and you can use that URI in the configuration. In this case I found 12 keys, and I elided some output to fit on the screen, but for every one it prints the URI, the public key and a bunch of information.
All right. In order to get there, I had to learn how to write a provider. I have to be honest, I like doing that, I really loved having to dig into the code. But it is a bit of a daunting proposition because the documentation is really sparse. If you do man 7 provider, you get some information, but you won't get very far in terms of what it really means to actually write a provider. It's mostly about how you use one. It's very generic.
So you end up reading a lot of sources, and there are two things to know. There is the OpenSSL source, of course. Reading that will give you a lot of hints, although internal providers take shortcuts. Even just in the initialization code, when I started, I was looking at those and thinking: this won't work in mine. So you will probably end up also reading external provider sources to figure out the differences, the things you have to do when the module is built outside of OpenSSL.
The thing that makes it really hard, more than reading the source per se, is that OpenSSL has an extreme level of indirection everywhere, and it is very, very hard to read. Multiple nested macros everywhere. So sometimes reading the source code leaves you more confused than not reading it.
And when that happens, I fire up GDB, set my breakpoints and just go and see what happens. You get backtraces, then you go back and try to find the functions, now understanding where things are coming from and where they're going, and you have a better idea. But sometimes GDB also gets confused because of the extreme level of macro indirection. So it's fun in a way. Sometimes you swear a little bit, but in the end it works.
The first hurdle I had was initialization. Actually, once you know how it works, it's kind of easy. The only thing your shared object has to expose is basically one function, called OSSL_provider_init. That makes it easy. It's all you need. Done. We can go.
Well, this function returns a thing called a dispatch table, which is a function table that points to multiple functions. One of them, kind of the most important one, is the one that allows OpenSSL to query for operations, which basically tells it what operations the provider can actually support. Each of those returns another function table, which OpenSSL will cache and use at some point to do whatever operation you need.
So, for example, you can implement this OSSL_OP_SIGNATURE function table, which has a bunch of things. But if you do that, you will also need to do a few more, because for some operations OpenSSL expects other operations to be working too, although I know that technically you could implement a single operation by itself. So you end up having to build other operations, but that's okay. So what are operations, just to be clear?
For example, say you want to implement your own RSA version for whatever reason: for fun, because you have a special hardware module or a special CPU instruction you want to use, or you just want different code. You will have a name, which may or may not be recognized by OpenSSL. If you use a name that is recognized, like RSA, it means that OpenSSL can use your provider also when the application just generically asks for RSA. If you implement something like a quantum-safe algorithm, you will have different names that OpenSSL doesn't necessarily know ahead of time. Anyway, there is a name that identifies, beyond the operation type, what kind of cryptographic algorithm you are implementing.
Then you will have to provide constructors, destructors, and get and set parameters. Almost all of these operations have something like that. And then you implement your specific functions. For example, if you want to implement signatures, you will have something like: I'm implementing OSSL_FUNC_SIGNATURE_DIGEST_SIGN_INIT. In this function table you have this define and then the function that implements the operation. And then of course you will have update and final, and the same for verify: verify init, verify update, verify final.
So this is what a function table looks like, at least in my code. Here you can see that I fell into the same trap of using macros, so it's not actually a real C structure in this form, but this makes it more readable. So in the end it was okay. I picked this as an example because it's one of the shortest signature structures, since EdDSA doesn't support various things. You basically have three functions that create, free or duplicate a context. A context keeps various information about what kind of operation you're going to do. Then at the bottom you have a way to get and set parameters.
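The shape of these tables can be sketched in plain C. This is a simplified model, not the real API: the `demo_*` types and `DEMO_*` constants are hypothetical stand-ins that mirror the general layout of OpenSSL's `OSSL_DISPATCH` and `OSSL_ALGORITHM` structures (a numeric function id paired with a type-erased function pointer, and an algorithm name paired with a property string and a function table). The signing function is a placeholder where a real provider would talk to the token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for OpenSSL's OSSL_DISPATCH and OSSL_ALGORITHM.
 * All demo_* and DEMO_* names are hypothetical and exist only in this sketch. */
typedef void (*demo_generic_fn)(void);

typedef struct {
    int function_id;          /* which slot this entry fills */
    demo_generic_fn function; /* type-erased function pointer */
} demo_dispatch;

typedef struct {
    const char *names;         /* algorithm name the core looks up, e.g. "RSA" */
    const char *properties;    /* e.g. "provider=demo" */
    const demo_dispatch *impl; /* function table for this algorithm */
} demo_algorithm;

enum { DEMO_OP_SIGNATURE = 1, DEMO_OP_KEYMGMT = 2 };
enum { DEMO_FUNC_SIGN = 1, DEMO_FUNC_VERIFY = 2 };

typedef int (*demo_sign_fn)(const unsigned char *msg, size_t len);

/* Placeholder: a real provider would hand the data to the token here. */
static int demo_rsa_sign(const unsigned char *msg, size_t len)
{
    return msg != NULL && len > 0;
}

static int demo_rsa_verify(const unsigned char *msg, size_t len)
{
    return msg != NULL && len > 0;
}

/* One function table per algorithm, terminated by a zero entry. */
static const demo_dispatch demo_rsa_signature_functions[] = {
    { DEMO_FUNC_SIGN,   (demo_generic_fn)demo_rsa_sign },
    { DEMO_FUNC_VERIFY, (demo_generic_fn)demo_rsa_verify },
    { 0, NULL }
};

static const demo_algorithm demo_signature_algs[] = {
    { "RSA", "provider=demo", demo_rsa_signature_functions },
    { NULL, NULL, NULL }
};

/* The "query operation" entry: which algorithms exist for an operation type? */
const demo_algorithm *demo_query_operation(int operation_id)
{
    switch (operation_id) {
    case DEMO_OP_SIGNATURE:
        return demo_signature_algs;
    default:
        return NULL; /* operation not offered by this provider */
    }
}

/* What the core does with a table: look up a function by its id. */
demo_generic_fn demo_find_function(const demo_dispatch *table, int function_id)
{
    assert(function_id != 0); /* 0 is reserved for the table terminator */
    for (; table != NULL && table->function_id != 0; table++) {
        if (table->function_id == function_id)
            return table->function;
    }
    return NULL;
}
```

A caller would first ask `demo_query_operation(DEMO_OP_SIGNATURE)` for the algorithm list, pick the entry named "RSA", then cast the result of `demo_find_function(..., DEMO_FUNC_SIGN)` back to `demo_sign_fn` before calling it. The type erasure through a generic function pointer is the reason the real code leans on macros so heavily.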
You could be setting the digest that you want to use for a signature, or other parameters that are needed. And then you have the actual sign and verify functions. This is all you need to implement to basically do EdDSA.
All right. Then there is key retrieval and key management, which you usually need to do anything useful, unless you're just re-implementing an existing thing in a slightly different way. Specifically in the PKCS#11 provider, keys are not directly available, because the whole point of hardware modules is that the key is stored safely in hardware and cannot be extracted by software. So you need a way to go and find these keys, or reference them somehow, so that later, when you do an operation, you use the key you want.
The API used to find or load keys is the OSSL_OP_STORE API. It just defines ways to load keys, find keys, unlock keys through passwords and callbacks, and things like that. There is also importing and exporting keys, for which you need another one, the encoder operations, in case you actually need to use ASN.1 to export them in some form, or import them, and so on.
And finally, you also need key management. These three kind of always work together somehow. Key management does things like generating keys or preparing keys for operations. I'm not going into details because we would never finish; this is just to give you an overview of the individual areas you need to learn over time to build a complete provider.
So this is how, once you've built enough things, things finally start to work. If you have an application and you want to do a signature operation, there are basically two operations you have to do. First you have to get a key, and then you put up your data and say: I want to sign this data.
So to get the key in the PKCS#11 case, you have this URI. You call OpenSSL with the store API and say: I want to find a key with this URI. OpenSSL will see that it starts with pkcs11: and say: oh, I have a provider that can handle this kind of URI. So it goes into the PKCS#11 provider. There are various steps I'm not going into, but it will find which operations the provider offers, eventually find that there is a store facility there, and call it. The provider then knows there is a URI we're looking for, so it goes down into the PKCS#11 driver and tries to find a key that fulfills all the filters you have in this URI. If it finds one, it caches something in an in-memory object store and returns a reference to OpenSSL. OpenSSL finally creates this EVP_PKEY structure that it gives back to the application. That's the abstraction that allows the application to reference a key, whether it was stored in a file, in PKCS#11, in a TPM or whatever it is.
The next thing the application does is: I finally have a key, I have my data, I want to sign it. It sends all this down to OpenSSL, which sends it down to the PKCS#11 provider. The PKCS#11 provider says: I got this key, there's a reference, I know what the handle is. At the PKCS#11 layer, I can set up the operation with the hardware. I can tell the hardware: this is the operation I want to do, this is the key object I want to use, this is the digest I want to use, and so on. It calls the driver, and the driver does its own set of things. Eventually you get back a binary blob, which is the signature, and you send it up.
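One concrete detail on the way back up is the signature encoding. For ECDSA, PKCS#11 returns the signature as the two fixed-length integers r and s concatenated, while OpenSSL works with a DER-encoded SEQUENCE of two INTEGERs, so the blob has to be re-encoded. The sketch below shows that conversion in plain C; the function names are hypothetical and it is not taken from the provider's source. It assumes curve sizes up to P-521 (66 bytes per integer).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write one big-endian unsigned integer as a DER INTEGER.
 * Returns the number of bytes written, or 0 on error. */
static size_t der_put_integer(const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_cap)
{
    size_t pad, content;

    if (in_len == 0)
        return 0;
    /* DER wants the minimal encoding: drop leading zero bytes. */
    while (in_len > 1 && in[0] == 0) {
        in++;
        in_len--;
    }
    /* A set top bit would read as negative, so prepend one zero byte. */
    pad = (in[0] & 0x80) ? 1 : 0;
    content = in_len + pad;
    if (content > 127 || out_cap < 2 + content)
        return 0; /* only short-form lengths are handled here */
    out[0] = 0x02; /* INTEGER tag */
    out[1] = (unsigned char)content;
    if (pad)
        out[2] = 0x00;
    memcpy(out + 2 + pad, in, in_len);
    return 2 + content;
}

/* Convert a raw r||s ECDSA signature into DER SEQUENCE { r, s }.
 * Returns the DER length, or 0 on error. */
size_t ecdsa_raw_to_der(const unsigned char *raw, size_t raw_len,
                        unsigned char *der, size_t der_cap)
{
    unsigned char body[2 * (2 + 1 + 66)]; /* two INTEGERs, up to P-521 */
    size_t half = raw_len / 2, n_r, n_s, body_len, hdr;

    assert(raw != NULL && der != NULL);
    if (raw_len == 0 || raw_len % 2 != 0 || half > 66)
        return 0;
    n_r = der_put_integer(raw, half, body, sizeof(body));
    if (n_r == 0)
        return 0;
    n_s = der_put_integer(raw + half, half, body + n_r, sizeof(body) - n_r);
    if (n_s == 0)
        return 0;
    body_len = n_r + n_s;
    hdr = body_len < 128 ? 2 : 3; /* long-form length needs one extra byte */
    if (der_cap < hdr + body_len)
        return 0;
    der[0] = 0x30; /* SEQUENCE tag */
    if (hdr == 2) {
        der[1] = (unsigned char)body_len;
    } else {
        der[1] = 0x81;
        der[2] = (unsigned char)body_len;
    }
    memcpy(der + hdr, body, body_len);
    return hdr + body_len;
}
```

The output length varies with the values of r and s, which is why the DER form cannot simply be treated as a fixed-size buffer the way the raw form can.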
In some cases the PKCS#11 provider might have to massage this data, because for some signature types, like ECC I think, PKCS#11 returns the data in one format and OpenSSL expects a slightly different one, so it may have to convert it. But eventually it comes back to the application and you have your signature. From the point of view of the application, it just called OpenSSL to do the signature, and everything under that first layer is potentially completely unknown to the application. I think that's the nice part.
So what are the hard areas? Where I banged my head multiple times was encoders and decoders. The concept is very simple, but you have to know about ASN.1, for example. All of the providers, the internal ones and the external ones, use at least five layers of nested macros to implement the functions that implement these encoders. So when I was trying to understand what was happening, it was really, really hard. GDB was completely useless for following what was going on, and I just went by trial and error until things worked. Eventually I understood enough to actually correct the first few mistakes I had made, and it kind of works, but it is really difficult code.
Then there is naming and caching resolution within OpenSSL itself. Whenever OpenSSL tries to do an operation and tries to find which providers can complete it, for example RSA, it has a whole world where it looks up names, tries to find them, caches things and finds function pointers. I gave up on that one, because every time I tried to look into it, the code was really complicated, and I was also trying to solve another problem, and the other problem was more important. So maybe it's just me, but I say take it on faith. I mean, it works most of the time.
Sometimes, when it wasn't working, I ended up just looking at what other code was doing and thinking: okay, maybe I'm doing something wrong. You can mostly ignore it, but it is really complicated code. Maybe it would be nice to have a guide on that from the OpenSSL side. It's not a showstopper, but those are areas where it's really hard to understand what you're supposed to do and how things work.
There's still one thing that I don't understand. Maybe I'll ask the OpenSSL developers about it. In the query operation function, you can return a result that says: don't cache this. And if you do that, what happens is kind of the opposite: OpenSSL will only use that thing from then on. And I was like, what? So I stopped doing that. But yeah, it was fun.
So what are the next steps for me, or for the project in general? Integrating with applications, because, as I said, you have to use the OpenSSL 3 API fully to make use of providers. We did some testing within the team. We believe the SSH stuff mostly works. Some bugs were found, but we should be able to get that working. I want to work on ISC BIND, OpenSSH and mod_ssl to make them use this instead of engines. That's where we'll go next.
A lot of bugs were found in the making of these slides, but nobody was hurt. There were PRs opened and there were discussions upstream. The OpenSSL developers were mostly very gracious, helpful, understanding and accommodating. We fixed things, we had PRs going in, changes. There are still some things under discussion, and there are still some areas where the provider API can be improved to make things easier or possible. But overall, it has been a very fun experience for me. So, yeah, thank you all. And if you have any questions... Yep, go ahead.
It's about FIPS mode.
When using FIPS, the OpenSSL config file for FIPS has two providers: one is named base and the other is named fips. I was told that, with that config file, base is used for decoding and fips is used for encryption. I wonder why that's necessary, and how it is decided which provider is used for decoding and which for encryption.
Yeah, so the question is: in FIPS mode, when you want to use the FIPS provider, there are two providers you actually have to configure. One is called the base provider and the other is the FIPS provider. The base provider does encoding and decoding, the FIPS provider provides the encryption primitives, and how does that work?
One thing that maybe I didn't say is that, although there is this impression that a provider is monolithic, that it is one thing, from the point of view of OpenSSL each provider fundamentally gives you a palette of providers to choose from. Each operation is kind of its own provider within the provider, and each one of them has a name and a function table associated with it. So when OpenSSL needs to do an operation, like "I need to decode something", it queries all of the providers. Well, it's more complicated than that, but let's say it queries all of the providers and finds which ones tell OpenSSL: I can handle this. Then OpenSSL has some logic to choose which of those providers it will use.
In the case of the base versus FIPS provider, the FIPS provider doesn't provide any encoder or decoder functions. So whenever OpenSSL asks "who can decode this PEM file?", the FIPS provider will not respond, but the base provider will say: I can decode this. And that's how OpenSSL knows that it will use the base provider for that. What is more complicated is what happens when one provider, for example, decodes the key and then you want to use it with another provider. But I won't go there unless you ask for it.
It seems like the important thing we need to tell applications is to switch to using this OSSL_STORE. How do we then get the automatic association with a specific provider? Say I've got a PKCS#11 URI: how does it know to use your provider for that URI versus some other provider which might be on the system?
Yeah, so the question is: if applications switch to this new OpenSSL store API, which is what is needed to use a provider, and pass in the URI, how does OpenSSL know that it has to use the PKCS#11 provider versus another provider? There are multiple mechanisms implemented to influence which provider is used.
First of all, providers can register handlers for URI schemes with OpenSSL. The PKCS#11 provider registers with OpenSSL: I can handle anything that starts with pkcs11:. So whenever OpenSSL is given a URI in the store case, it sees: oh, it's not file: or a plain path (there is a shorthand there), it's pkcs11:, so which provider handles this? It finds that the PKCS#11 provider handles it and calls the store operations in that provider.
What happens after that is that the PKCS#11 provider returns an object reference for a key, for example, and that is embedded into the EVP_PKEY structure. When you do an operation with the EVP_PKEY structure, OpenSSL looks into it and says: oh, this key is owned by this provider, let's see if this specific provider can handle the operation I'm being asked to do. If it can, OpenSSL prefers to use that provider. If that provider does not support the operation, OpenSSL asks the provider: can you export this key, so I can import it into another provider and use it there?
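The two selection steps just described can be modeled in a few lines of plain C. This is a toy model, not OpenSSL code: `struct demo_provider` and both functions are hypothetical, and the URI handling is deliberately naive (anything without a scheme, or starting with a slash, is treated as a file path).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a loaded provider; not an OpenSSL type. */
struct demo_provider {
    const char *name;
    const char *store_scheme; /* URI scheme its store handles, or NULL */
    int can_sign;             /* 1 if it offers the signature operation */
};

/* Step 1: choose the store implementation from the URI scheme.
 * A string with no scheme, or starting with '/', is treated as a file path. */
const struct demo_provider *demo_pick_store(const struct demo_provider *provs,
                                            size_t count, const char *uri)
{
    const char *scheme = "file";
    size_t scheme_len = 4;
    const char *colon;
    size_t i;

    assert(uri != NULL);
    colon = strchr(uri, ':');
    if (uri[0] != '/' && colon != NULL && colon != uri) {
        scheme = uri;
        scheme_len = (size_t)(colon - uri);
    }
    for (i = 0; i < count; i++) {
        const char *s = provs[i].store_scheme;
        if (s != NULL && strlen(s) == scheme_len
            && strncmp(s, scheme, scheme_len) == 0)
            return &provs[i];
    }
    return NULL;
}

/* Step 2: choose who performs an operation on a key.
 * Prefer the provider that owns the key; otherwise fall back to another
 * provider, which only works if the key can be exported. */
const struct demo_provider *demo_pick_signer(const struct demo_provider *provs,
                                             size_t count,
                                             const struct demo_provider *owner,
                                             int *needs_export)
{
    size_t i;

    *needs_export = 0;
    if (owner != NULL && owner->can_sign)
        return owner;
    for (i = 0; i < count; i++) {
        if (&provs[i] != owner && provs[i].can_sign) {
            *needs_export = 1;
            return &provs[i];
        }
    }
    return NULL;
}
```

In this model a `pkcs11:` URI resolves to the provider that registered that scheme, and a later signing request stays with the same provider. The `needs_export` flag marks the fallback path, which for a hardware-backed private key would normally fail, since the key cannot leave the token.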
So this is the general mechanism, but you also have things like properties, which tell OpenSSL: you should only ever use providers that expose this property and never anything else, even if they say they support the operation. This is how, for example, the FIPS provider gets used, because the FIPS provider is fundamentally the same code base as the default provider. The difference is that when you set the property fips=yes, OpenSSL will only use the provider that exposes that property, which is the FIPS provider. So there is this selection mechanism within OpenSSL, which either the application can pass in or the configuration file can set, that allows you to select specific providers when several of them offer the same functionality. That's how it works internally.
I wonder about loading more than one PKCS#11 module. Can I do that with stanzas, or will the PKCS#11 provider have to know that I'm going to load more than one module, and manage that?
So the question is: can I load more than one PKCS#11 module? My question to you: do you mean more PKCS#11 drivers?
Yes.
Okay. The PKCS#11 provider is itself a module, which gets loaded dynamically into OpenSSL if there is configuration for it, and the provider itself currently loads one driver. So if you want to use multiple tokens, there are multiple strategies you can go for. One is that different applications use different tokens, so you have a custom openssl.cnf file for each application and in each one you set the driver. Or, instead of setting the driver in the configuration, you set an environment variable when starting the application. That's another way we support to load different drivers.
And that's what we use, for example, in CI when we test different drivers. Or you use something like p11-kit, which can aggregate multiple drivers underneath. It's basically a proxy implementation. So there isn't a single way to go. It depends on the situation, what you need to use and how you need to use it. I have never tested whether I can load the same provider in OpenSSL multiple times. I haven't looked at that.
[inaudible]
Actually, before that: by default it loads the p11-kit proxy, which basically aggregates all the modules you have installed in p11-kit. So if you don't provide the driver in the configuration file, it loads the modules from the p11-kit proxy, which will basically do what you want.
Yeah, Jakub commented that we have a build option where we load the p11-kit proxy by default, which can collate multiple drivers available on the system by doing some discovery, so it makes multiple drivers available. And this works just fine, because through the PKCS#11 URI you can always select the specific module you want to use, even if multiple modules have keys that are named the same way. So there's no ambiguity unless you forget to set those parts. So yes, in the end it is possible to have multiple modules loaded at the same time.
We're out of time. Thank you.