Hey folks. Okay, it's 10:02. Should we dive into things?

Sorry, Justin, did you say something?

Yeah, I was asking if we should dive in and start to talk about things.

Well, let me turn up my speakers; I couldn't hear you. I'm guessing we have a small group because, while we talked last week about going to 10 to 11, and Amy did update the calendar, people are forgetting that we have 10 to 11 now. So I'm happy to start, because I was just reading Marina's TUF document, and there are a couple of epiphanies in there that I think help with some of the confusion we've had in some of the conversations.

Yeah, one thing I do want to mention is that even though TUF has "update" in the name (that's the U in TUF), it is also used for initial software installation in lots and lots of places. So regardless of what you're doing with it, whether you're doing an actual update of something where you already have an earlier version, or whether you want to install something new, TUF is meant to handle that case securely and well.

It was one of the things I called out, because I did pull up a couple of things, and there's this one section for non-goals that talks about not providing a means to bootstrap.

What that means is that we're not going to prescribe whether you do some form of trust on first use, or how you get a bundle of things inside your client. In practice, if you know which repo, which registry, you're going to communicate with, which maybe you do, then it's always best to just ship some version, even an old version, of at least the root metadata for it. That's much smaller than, for instance, your CA bundle for your root CAs; it's maybe a kilobyte or something. So our way of saying "non-goals" isn't to say that this isn't a concern at all. It's to say that it's not something where we're doing something new and interesting, and that different people who deploy this in different ways could very well elect to use different models, as is true for the different communities that use TUF today.

Okay. All right, I'm just looking at the agenda. And yeah, thanks, we should add ourselves to the attendees; let me just post that here as well. So the first thing on the agenda was the key management. I'm wondering if we flip it, because by the attendance I'm guessing most folks didn't recognize the change from a 10:30 to a 10 a.m. start. Gened out until late, which I'm trying to get better about. Sorry, what were you saying?

No, that makes sense to me. I mostly just had announcements for people to look at, so it makes sense to wait till 10:30.

Okay. So Justin, Marina, do you want to do the metadata overhead conversation first? Is that less important to have people here for? It's recorded, and we at least have some people here, so I'm open to suggestions. I know that, you know, this is the only person for the key stuff, which I know others are also participants in, whereas I see you guys being the ones driving this conversation. So at least it's recorded. Up to you guys.

Hold on, I'm trying to figure out what else is on the agenda here. I guess we can really quickly talk about it. Maybe with the metadata discussion there's not a lot of debate about it.
I don't know that we want to go through the numbers in any massive detail, but basically the spreadsheet we sent out shows the overheads and other things. The comparison page is really the only page you need to look at if you want a bird's-eye view of what's going on; if you really want to dig into exactly how everything was computed, then the other sheets make some amount of difference. We also look at what the sizes of different things are for (I think there's a typo here) a registry with a single image and a registry with 200 million images, which I think is the size of Docker's registry. It's the second bit. And in either case, the actual overhead numbers we show there, we color-coded to try to say what we thought would make sense, but of course we just left the numbers there if you have a large...

Can you back up and explain the methodology of what you're trying to do? I'm just looking at this for the first time and I'm trying to follow.

Oh, well, we're trying to see what the overhead would be, in essence, for producing appropriately signed metadata that would allow you to verify that the images, excuse me, that the metadata hasn't been tampered with, that you don't have old data being replayed, and so on. So what we did is we got some numbers from Justin Cormack that were very helpful. We also went and looked at the sizes of things that existing systems would have or use in different scenarios, and how much overhead that would be. And then we took the proposal that we, I guess, called the last year proposal. I don't know that that's perfectly accurate, but it is, I think, the solution we've seen you present and talk about the most, so we named it that because it made it easier, for us at least, to keep track of which was which.
And we compared the overheads of all of these in all the different scenarios to see what the cost would be. We talk in more detail about how these costs were derived, but basically, in the way we propose to set this up, the green options are the ones we think are good, the yellow options are the ones we think are okay, and the red options are the ones we wouldn't recommend.

The red option, when you're saying... that's what I'm trying to understand. Obviously green is good and red is bad, but I'm not sure what you've done differently between them.

So the numbers here... okay, there are two things we looked at, but the main thing this looks at is the actual size overhead of transmitting the data. I can't highlight this well, but the fresh client option for TUF design option two is what a client that has no previous state is going to get, which is okay, but it's not great. The reason it's colored yellow is that, from a security standpoint, you're not as well off as you would otherwise be. That's also why this column here is red: because of the security concerns we've talked about over the past many meetings, related to not having some kind of snapshot metadata. What we're really trying to do here is look at the overhead of doing the signing and producing the metadata you would need to allow the stronger security verification. In both cases we think the overheads are small, both for a single image and for a large public repo. And because we view these overheads as negligible, we would recommend the option that has the best security for the users, because something like a 0.002 percent overhead or a 0.04 percent overhead feels like a drop in the bucket to pay for improved security.

A couple of comments here. One, the clarification definitely helps. It was a little confusing to me because, looking at it from a signature overhead perspective, it's not clear why some of these options are not recommended. I would make that a callout in the doc that says these options are not recommended for x, y, z reasons. Maybe it's in the doc; I'm just looking at this really quickly. But it definitely helps to know that there's something in addition to the metadata overhead that we're looking at. The other question I have, and this is something where I'm not familiar enough with registries: are there other overheads we should also look at, in terms of the time to pull down signatures and compute them, that should be included in this overhead?

Yeah, we can certainly add things like that. In practice, we haven't found that to be problematic.
I mean, even for the little embedded systems that people use in automobiles, doing a couple of signature verifications isn't a big deal for an ECU. The really expensive thing here would be something like private key signing, and even then you can do tens of thousands per second. Doing verifications is substantially faster than that. And unless you're doing anything more complicated, which I don't think you'll be doing, the actual time overhead here is, you know, think of it like TLS or something like that. In practice it has always been lost in the sea of other things.

Yeah, I think that makes sense. Sorry...

Yes, I was going to say, what we've usually done in the past is just compare the time it would take to deploy an artifact without a signature versus the time to deploy with one. If we consider other options down the road, I think the distinction here would help call out why TUF might be a better solution, right? If the implementation on the TUF side adds very little, whereas some other signature validation has significantly more validation time, I think that's a good data point to have in the future as well.

Yeah, we can take a look at this. But in practice it's going to be extremely minimal, and even just the signatures on the metadata from the last year proposal will already be similar to what we have, which in both cases is going to be lost in the noise. Unless you're on 10-gig Ethernet downloading full bore or something; then maybe it does make some very minor percentage-wise difference, but in terms of absolute time it just shouldn't really matter.

Yeah, one of the things I'm curious about: when I think about signature verification, you're recomputing the hash of the bits that you just downloaded, right? And that takes some time. So I'm curious why there isn't an impact here. Maybe there's something I'm missing in how the signature validation is happening.

I don't have the numbers in front of me, but, number one, as you download it you'd usually compute that. But even if you don't, in terms of actual crypto operations you can do many, many... I'll run the numbers; you can run openssl speed on this. But secure hashing over data is extraordinarily fast on a modern computer. It's not something we've really had a concern with from anybody who's deployed or tried it in practice. But we can get some numbers for you. You do all these things too when you do a download over HTTPS, so you're going to see something like that slowdown, which is imperceptible. So, yeah, we can look at it, but in practice what we've seen is that latency and bandwidth are much bigger contributors to this. Now, if you're in a data center and you have huge bandwidth and you're serving things from the same rack, then you're right that that may not matter as much, but it's still not going to be an expensive operation in terms of absolute time.

I'm still trying to figure out what exactly you're comparing here, because it's not clear to me what methodology, or even what problem, you're trying to solve. We've done a great job saying yes, this one is red.
It must be bad, and this one's green, it must be good, and there are numbers. But there's nothing really explaining what it is you're trying to accomplish. If I'm taking an image and trying to get its signature back, then I'm not sure what you're trying to say by how much content is in a registry and how it affects that performance.

So the other design document we have has outlined some of the security problems and shortcomings with the proposal of just retrieving a signature. This notion of "something is signed, therefore I trust it" is problematic. So this document is focused on the overhead you would have if you actually provide protection that's greater than just having something that has a signature somewhere, which may or may not be out of date.

So you're making a leap that the problem you're trying to solve is a problem we need to solve. That's what I'm trying to track, and that's what I was referring to earlier. A registry has two categories of content. There's content that is software and base images that other people depend on, and those intentionally have a tag that references a version. We do want to get updates to that, and we've built stuff in Azure specifically to enable getting updates to that; not from a security perspective, just a notification that an update has happened, that the digest has changed for a given tag. That's an important problem for base images: operating systems like Linux and Windows and the various permutations of each, and the runtimes sitting on top of them. But when customers build something from that, they don't want an update to that thing, because the thing they're building is a specific deployment. They want a secure supply chain, a supply chain that says something changed: it might be the source itself that changed, where I made an update and I want to push that, or the thing I depend on changed and I want an update from that. So I think we have to tease out the two, because the apps that people build should not get updated tags. So I think that just isn't really the problem.

I think the people who want what the updated tag points to should get what the updated tag points to, and people who want a specific version should get that specific version. I think we agree on that. And I think we also agree that it would be bad if a potentially malicious actor could give somebody who said they wanted the latest version an old version that has security vulnerabilities or other things like that. The problem is that the design that just does the signatures with no context and no snapshot, as we call it in this document, doesn't provide that set of protections, for all the reasons we go through in the other document: the Google Doc, not the spreadsheet. We also wrote up the other Google Doc that walks through all the scenarios and shows how the design that has these security properties actually works in all of them. And now we've shown that it's actually efficient, or at least what we feel is efficient, in these scenarios. So we're trying to show that the added security you get from doing things this way doesn't come at the cost of usability, doesn't come at the cost of overhead, and so on. We're trying to present this in as clean a way as we can so that others can look at this and get the protection they should have for their users. The last thing we want is for there to be a security incident and to have users at substantial risk because we decided not to add a protection for whatever reason.

So help me with the concern. If the registry is what you're trying to protect from being hacked to roll it back to an older version, why wouldn't they hack this metadata store as well?

They would hack what?

This metadata store that you're saying has this authoritative, complete view. I think one of the things you're trying to protect from is: I have Monday's build. On Monday it was fine, but on Wednesday we discover it was insecure, and on Friday a new version is pushed. So if that tag, 1404 or whatever, is hacked, and there's a 1405 released on Friday, how do we keep somebody from pointing it back to 1404, which is still valid but is known to be less secure than the one that was shipped on Friday?

Yeah, so there are a couple of ways this is handled. First of all, for clients that have any prior state, who have seen anything in the past, the client has historical information from the last time they've seen things.

But you're assuming the client is in a steady state. We're targeting serverless environments, where the clients don't have any knowledge of the previous state, every time. It's a restart.

So I prefaced what I was saying by saying "in the case that the clients have state". Okay, then there's this protection. In the case that the clients do not have state, there are a few means to deal with this.
So presumably they have the root metadata, or something else from the registry they're going to, that tells them what the correct snapshot key is. Presuming they have this, what you, or anyone acting as a registry operator, do is securely sign a new snapshot file once a day. So basically, in design option two here, you take the targets files updated over that period, you generate a new Merkle tree for them, and you push the resulting metadata out.

Okay, so this is something where we have one of everything in that one repo?

Yeah, it's a Merkle tree of what's in the repo. And what we show here with the numbers is that even though the absolute data is modestly large, it's not actually that large in registry terms, or in real terms. And the data that clients download is small, because they only need to download their path in the Merkle tree.

What I'm confused about, though, is the 200 million. Are you saying that's the entire registry, or just that one repo, like the Ubuntu repo?

We can do the entire registry. You can do all the public parts of the registry. You can even intermingle private and public. It scales very well.

But it's not about scaling. It's a matter of each repo being itself different, right? Like Windows and Ubuntu, or, well, they're not, but Alpine and Debian are in the same registry, and there's no correlation between the two.

Yeah, but the idea is that you want versioning information and other things like that to be updated consistently, if you have state. So it doesn't hurt you to have it all in one... "repo" is not the right term, but to have all the repositories inside the registry, which I think is your way of thinking about it, all share a snapshot file.

Yes, that's what I'm trying to understand. Is that what you're suggesting, or...?
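As a rough illustration of why the per-client download stays small, here is a back-of-the-envelope sketch of the size of one leaf-to-root path. It assumes a balanced binary Merkle tree and 32-byte (SHA-256) hashes; neither detail is stated in the meeting, and the function name is made up for this example.

```python
def merkle_path_bytes(num_leaves: int, hash_bytes: int = 32) -> int:
    """Approximate bytes of sibling hashes a client needs for one leaf.

    Assumes a balanced binary tree: one sibling hash per level between
    the leaf and the root. Both the tree shape and the hash size are
    assumptions for illustration, not figures from the spreadsheet.
    """
    if num_leaves < 1:
        raise ValueError("tree must contain at least one leaf")
    depth = (num_leaves - 1).bit_length()  # ceil(log2(num_leaves))
    return depth * hash_bytes


# 200 million leaves -> 28 levels -> 28 * 32 = 896 bytes of hashes.
print(merkle_path_bytes(200_000_000))
```

Under these assumptions the path grows with the logarithm of the registry size, so doubling the number of images adds only one more hash to what a client downloads.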
Yeah, that's what design option two does. Design option one shows what happens if you just make private registries that are all small, or you segment things up some way.

And people do, because there are different owners of each one. That's why I'm struggling: one of the options we keep talking about is putting everything together, but we haven't addressed any of the security issues where those are fundamentally different owners and teams, and depending on the registry and how it's implemented, we have challenges.

Yeah, we talk about that in the document quite a bit. If there are parts in there that aren't clear or need more exposition, we can always do more. But the way to think about it is: imagine for a moment you have all these separate repos, which is fine. They're all totally separate. Now think for a moment: what if you could link everybody's snapshot to everybody else's snapshot? Not targets, not linking other things. Then that would strictly be better than having all of the views...

That's an assumption you're making. That's what I'm trying to understand: why you're making that leap.

Because the way in which you do... Okay, let's walk through the example in the document that we wrote before this, in the Notary v2 signature design proposal. I'm trying to find it. Hold on a second.

Just really quickly, before we move to that: I think what we're essentially having is a conversation about the design itself, right? The metadata doc could easily sidestep that by just calling out what makes sense from a metadata perspective, because the conversation we're having is more about whether this design actually works or not, and there are a lot of other considerations in there. But this doc on its own, from a size perspective, can just call out which options make sense and which don't, right?
I think it's saying a lot more about options without really covering the content of what those decisions are, which is called out in a different doc. Just separating out the recommendation here based on size from what's in the other doc would, I think, make more sense for keeping context. Thanks.

Yeah, here's the thing, Justin. What you've got labeled as mine is some "snapshot" versus "no snapshot" proposal. All I'm saying there is that there is a way to go find content using this reverse lookup, so it's not excluding any metadata conversation. Labeling one bad versus the other, that's what I'm trying to understand. The concern we've had is that a registry, especially Docker Hub, which is meant for the world to put content on, is intentionally segmented. And then you have private registries that, even though they're private to a particular customer, still have multiple teams sharing content in them. So how do we account for a set of content that needs to be secured? The first level is just "is it signed?", and then you have a second level of security, and we're not pushing back that it's not valid. It's just: what is the trade-off of it, so that an update can be somehow verified? So I think it goes back to the scenarios. Which scenarios are we trying to account for here?

Yeah. The main goal of this is just to show the feasibility of adding, in addition to just signatures, this second level of repository security. The main idea is that it doesn't add so much extra metadata for the client to download that it's infeasible for them to do it. And then you can dig into the numbers and figure out exactly how much extra it is and what that means.
I think that's the big takeaway.

I was reading through the TUF doc and didn't get to this other one, Diplomat, using delegations, so I hadn't finished reading that one. Is this theoretical in how you did the measurement, or did you actually perform this operation on Docker?

It's partly theoretical. I generated some metadata locally to test sizes, but I didn't actually upload things.

Yeah, we'd have to get all the signed data from everywhere on Docker Hub to do it for real. But we took actual numbers, like image sizes, from Docker Hub, and looked at actual stuff there to do the estimates, which we expect to be right within a factor of maybe two, something like that. It could be slightly larger, it could be slightly smaller, but it's not going to be substantially off, and the growth rate is about right. I may be giving Marina not enough credit; I think her numbers are a lot more right than that. But as new interesting cases emerge, and I'm trying to think of what this would even be, if somebody is scattering massive amounts of small OCI metadata or something like that, then maybe that changes one of the terms in the spreadsheet slightly. But in general we tried to be very thorough about this, so that it was a really fair comparison and really factually based on the information we got.

Okay, right. One of the things we can look at: the prototyping that we're looking at doing should give us the opportunity to put this information... I think the biggest concern, and if you're saying this doc talks about it, then we can start to address it, is how do we as registry operators secure any content across different repos? By definition, team A should not have access to team B's information, especially if it's not public. So any shared information between them is a concern from a security perspective.

So let me just talk about that real quick, because that's come up a couple of times. The TL;DR is that you already have mechanisms that protect the different repositories on the same registry from this. You already have all kinds of access control. You can view this snapshot as just being extra metadata that happens to be the same for every repository; none of the other files change. And there's one other property of this metadata that is important: if I give you your path to the root of the Merkle tree, you have no idea, and no way to prove, that anything else is in that Merkle tree. You can't just go poking around and looking at things, because you're given the secure hash of things, and you'd have to be able to go backwards from the secure hash to get any meaningful information about what anybody else is doing. And the whole reason secure hashes are secure is that you can't go backwards from the hash to find the original thing.

No, so are you saying that if team B had access to team A's secure hashes, they couldn't recreate the actual content? I think that's what you're saying. But the question is how team B even got access to team A's metadata in the first place, and why.

Okay. It isn't that they get access to their metadata. They don't have their metadata. What they have is, when you sign something...
Okay, so think of it like you're doing one signature on the registry, and that one signature has a whole bunch of secure hashes listed under it. You don't know who those hashes are for, what those hashes mean, or anything. What you're told is that if your hash is there, then the thing was signed by the registry. So it doesn't matter what those other things are; they're just hashes. That's effectively what a completely flat Merkle tree would look like, which would be really inefficient, and there wouldn't be a reason to do it, but I think you can see conceptually why you get the privacy property there. And in fact, inside the document there's a description of how even... Oh, sorry, did somebody else want to say something? Okay. Even if you somehow knew who you were next to in the tree, say you were at a company, you had the previous metadata, and you knew that this hash corresponded somehow to this party, when you generate the new thing you can just put the timestamp of the thing you're doing into that secure hash, and it makes everything move randomly in the tree anyway, because you're adding an extra piece of information into every secure hash as you go. It's one of these things that sounds more complicated than it is.
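A minimal sketch of the idea described here: a client verifies its own entry against a single root using only sibling hashes, and a timestamp mixed into each leaf reshuffles positions between snapshots. The byte encoding, the sort-by-hash ordering, the duplicate-last-node rule, and every function name below are assumptions for illustration, not the format in the design document. In a real deployment the root would be covered by the registry's signature, which is omitted here.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(name: str, digest: str, timestamp: str) -> bytes:
    # Assumed encoding: the snapshot timestamp is mixed into every leaf,
    # so the same entry hashes differently in each snapshot.
    return h(b"leaf\x00" + "\x00".join([timestamp, name, digest]).encode())


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All tree levels, leaves first. Leaves are sorted by hash, so the
    timestamp salt moves entries around between snapshots. An unpaired
    last node is hashed with itself (an assumption of this sketch)."""
    levels = [sorted(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = []
        for i in range(0, len(cur), 2):
            right = cur[i + 1] if i + 1 < len(cur) else cur[i]
            nxt.append(h(b"node\x00" + cur[i] + right))
        levels.append(nxt)
    return levels


def prove(levels: list[list[bytes]], leaf: bytes) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True when the
    sibling sits on the right-hand side."""
    idx = levels[0].index(leaf)
    path = []
    for level in levels[:-1]:
        if idx % 2 == 0:
            sibling = level[idx + 1] if idx + 1 < len(level) else level[idx]
            path.append((sibling, True))
        else:
            path.append((level[idx - 1], False))
        idx //= 2
    return path


def verify(leaf: bytes, path: list[tuple[bytes, bool]], root: bytes) -> bool:
    node = leaf
    for sibling, sibling_is_right in path:
        pair = node + sibling if sibling_is_right else sibling + node
        node = h(b"node\x00" + pair)
    return node == root
```

With this shape, a client that holds only its own leaf and the path sees other parties' entries purely as opaque hashes. That is the privacy property being argued for, subject to the name-guessing concern raised later in the discussion.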
It's actually really simple. It's just a way to store this information so that there isn't a privacy leak between parties.

Yeah. Go ahead.

Yeah, sorry, if it helps, Steve: I was also pressing Justin and Marina for a bit more detail on how this would preserve privacy, so let me see if I can use different words to try to assuage your concerns. What Justin is saying is correct. The Merkle tree has a whole bunch of leaves under it, billions of leaves, whatever it is, and the point is that unless you know the key you're looking for in the tree, you're not even going to know that it exists, right? So the thing you should be worried about is...

What if I knew the key, is the question.

Yeah, so what if you knew the key? What if you could guess it? For example: I know that Microsoft is working on something, maybe I know the code name of the project, and I'm going to try to query this key and see if it exists in the tree, right? That's the larger concern here, and I ran through this with them. The snapshot has got all these leaves in there, that's for sure, and if you knew the key you could try to guess and query the snapshot for it. So what you need to hide is the key names; that's the more important thing. And the way you would do this is by using the existing access control mechanisms you already have for private images. If you don't have permission to see the key, you won't be able to query it in the first place. Now, if you could somehow guess the key, you could query the snapshot for it, and the way to solve that problem is by obfuscating the true names you don't want to give away. You don't use the direct code name, for example; maybe you use something else. Does that make sense?
Well, maybe I'm just not hearing it right, but what I'm hearing you say is that if I've gotten this secure information, there's no way to decode what the source was. Is that the argument?

If I give you a secure hash, and you don't know what the answer is for some other reason, then it's computationally infeasible to find something that matches that secure hash.

No, I got it. Okay, so that is the proposal. What I'm asking is: what is it you're expecting? Two teams using the same registry, especially if they're private... The argument you're making is that even if team B got team A's secure metadata, they can't reverse engineer anything, so it doesn't expose anything. But the fact that there is any metadata from team A that you got access to is the security boundary problem. So I get that what you get is a secure hash of something. But why do you have access to anything, a secure hash or anything else, across two private entities?

Because if you want to link everything together in a way that gives you timely information... What's the way to say this? If you want the ability to say that things weren't modified, you have to have some notion of a snapshot of information on the repository. You have to have some notion of time moving forward. Otherwise you're always in this perpetual state where, like a baby bird, you imprint on whatever random thing you happen to be given, which is bad.

I think you keep saying that, but I'm still trying to follow the use case. If an entity is trying to pull content from both team A and team B, then I could see that there's an efficiency here, that when I try to validate both A's and B's content, I want to be efficient about it. Is that the argument?
Not necessarily, no.

So if I'm just team A, or team B, and team B never looks at team A, what is the benefit of this disclosure of information between the two, just from the security aspect?

What you want to have is effectively a historical record of what's occurred at different time periods. And the more that you link into this historical record, the better it is, because the problem with having separate historical records for every party is that it's easier for an attacker to replay whatever things they want out of there in a different way. So, to go to the example in the document, which I think you have up now: figure one looks at what happens when you have per-image index manifest signatures. The example it's giving is one where the repository happens to contain things from both foo and bar.

Okay. I'm trying to figure out which diagram you're pointing at.

Figure one in the Notary v2 signature design document.

Oh, there's... okay.

Right. So the idea here is that an attacker breaks in and gets access to, basically, the ability to manipulate what a party that goes to this registry sees. If you look here, you'll see that there's a version colored in red for each of foo and bar.

I must be looking at the wrong one. Hold on a second. Which document are you referring to? Because I'm looking at your diplomat use delegations. You're talking about a different document. Okay, I think I found the one that Marina put in the Excel spreadsheet now.

Yeah. So if you look here at figure one, you have per-image index manifest signatures. You basically have no snapshot information here.
So what happens is, here you have this repository. On Tuesday a bad version of foo was pushed. It's not known that this was a bad version at the time, of course. On Wednesday this was maybe discovered, that hey, there's some security issue or something with this, and so now, on Wednesday, foo 1.latest points to foo 1.2. Right?

Okay.

Thursday, similarly, bar 1.2 is released. It has problems, and then on Friday bar 1.3 is released to fix that. Okay, so now skip down a little bit here, and there's a table. You can just look at the top.

Figure two, you're saying? There's a table. I'll come back. Oh, I see. Okay. Yeah, I don't know what page this is, but okay.

Yeah. So at the top here: if an attacker compromises the registry, then the question is what damage they can do to a specific client. What metadata can they give the client to make it think a certain version is the latest version? This is for a client that's trying to retrieve foo 1.latest or bar 1.latest. Because the registry gets to pick what foo or bar points to, if the client is just retrieving the per-image information, then the registry gets to pick whatever vulnerable version it wants to have installed. And the client, because it doesn't have any notion of what the history of any of these tags is, or how they happened, or what's going on, has no way to protect itself in that case. So now if we go back... well, first of all, is that clear?

The client, it's a new client every time.
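The rollback scenario walked through above can be sketched in Python. This is a toy model under stated assumptions, not the design document's format: an HMAC with a shared demo key stands in for a real publisher signature, the snapshot carries a simple integer version, and names such as `make_snapshot` and `client_accepts` are hypothetical.

```python
import hashlib
import hmac
import json

PUBLISHER_KEY = b"demo-key"  # assumption: stand-in for a real signing key


def sign(payload: dict) -> str:
    """Toy signature over a canonical JSON encoding of the payload."""
    msg = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(PUBLISHER_KEY, msg, hashlib.sha256).hexdigest()


def verify(payload: dict, sig: str) -> bool:
    return hmac.compare_digest(sign(payload), sig)


# Per-image signatures: each tag statement is signed on its own.
tuesday = {"tag": "foo:1.latest", "points_to": "foo:1.1"}    # bad version
wednesday = {"tag": "foo:1.latest", "points_to": "foo:1.2"}  # fixed version
signed_tags = [(p, sign(p)) for p in (tuesday, wednesday)]

# A compromised registry replays Tuesday's statement. The signature is
# still valid, and the client has no history to compare it against.
replayed, replayed_sig = signed_tags[0]
accepted_without_snapshot = verify(replayed, replayed_sig)


def make_snapshot(version: int, tags: dict):
    """Sign the state of all tags together with a version number."""
    payload = {"version": version, "tags": tags}
    return payload, sign(payload)


snap_tue = make_snapshot(1, {"foo:1.latest": "foo:1.1"})
snap_wed = make_snapshot(2, {"foo:1.latest": "foo:1.2"})


def client_accepts(snapshot: dict, sig: str, last_seen_version: int) -> bool:
    """Reject validly signed snapshots that are older than one already seen."""
    return verify(snapshot, sig) and snapshot["version"] >= last_seen_version
```

In this sketch a client that has already seen version 2 refuses the replayed Tuesday snapshot, while a brand-new client with no prior state still depends on whatever it was bootstrapped with, which is the point the conversation turns to next.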
So the client was simply told, I need foo one...

Well, in this case, once again, the client is going to get some metadata. Now this, as I said, could be loaded onto the client; the client somehow gets the root metadata or whatever else.

That's not a feasible thing in any kind of ephemeral environment, where every node, every compute instance that's given out, is in this serverless environment that has no knowledge of what customer it's being used for or what registry it might be pulling from. So it is as vanilla as it possibly can be.

Right, but it at least has information.

It has no information about any registry it might pull from.

Correct. It does today, because it has to know at least, like, here's the CA bundle for Docker, you know, that is going to let me establish trust on first use with Docker Hub or whatever it is.

I think we're mixing a couple of scenarios, and yes, I just noticed the time. But the scenario you have to account for is a serverless environment, right?
There's just the Azure, AWS, Google, whatever; we've got VMs that are sitting around, compute instances that are just ready to give out to any one customer at any one time. The majority of customers don't actually pull the content that they're deploying from Docker Hub. They pull it from their private registry. And as we give that VM to any one of the hundreds of thousands to millions of customers that we have, we have no idea what registry they're going to pull from. So that compute instance comes to them with no knowledge of which registry it's going to pull from.

On first use, do they use the X.509 certificate? Or how do they get the keys that let them say it's valid?

Yeah, there is a bundling process that happens prior to the environment bootstrapping; at least, we need to do that. So there is only a valid set of hosts that you will be able to pull from, not any host on the internet, unless you have some kind of bundling or exclusion or inclusion paths that have been set up from the CA as well.

Yeah, not to mention that... sorry, I was just going to say that this bundling process is where you effectively put this information in. You put the root file in, just like, you know, where you get your stuff from; you put it in there.

Yeah, so the current thinking is that you would obviously give it the client keys, the public keys that say these are the ones that I trust, and then it pulls. So is your proposal that, in addition to the client keys, the public keys rather, you give it some metadata as well?

Yeah. Basically, if you're able to give it the keys: what TUF root metadata is, is basically just keys, keys with a little bit of information about what those keys do. So you give it those keys and you give it a snapshot file, and you're off to the races.

Yeah, and in practice... go ahead, go ahead, Trishank.

And in practice, you probably want to add a
caching policy anyway. You don't want to keep pulling the same thing over and over again. So this is not the biggest concern, I think: adding a caching layer where you can keep the previous snapshot that you saw.

I mean, there is also another approach to address this: if you manage revocations for artifacts that you no longer trust. Is that also a design option we've looked at?

Well, TUF tends to handle that for you. When you generate targets metadata, if you no longer want to trust an item, you just remove it from your targets metadata, and then it's no longer signed and available once you upload the new version. So this deletion of trust is also a big problem, although not one that we called out really specifically, with having detached signatures, like a separate...

Right, wouldn't that be a better solution? Because not only do you want someone that's getting foo 1.latest not to get foo 1.1, but you also want someone that's directly trying to get foo 1.1 to know that that signature itself is also bad.

Yeah, you can do that. In fact, in the example here, imagine that you want foo 1.1 to no longer be addressable even as foo 1.1, right? Then on Wednesday or Thursday, whenever you realize that, you just remove that entry from the file as well in figure two, and it's gone. If it's in figure one, then the malicious registry can still just re-add that file, or just keep it around, and you don't have any way to actually invalidate it in a good way. So the thing with figure two is that as long as you've seen a snapshot from after the change was made... Sorry, sorry. I can go through this, but I don't have to go through this now, and I do realize that, yeah, you had something you wanted to mention too, and we've kind of co-opted this whole meeting.

Do we want to spend an hour going through this?
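The revocation-by-removal idea described above can be sketched as follows. This is a deliberately minimal Python model, assuming targets metadata behaves like a mapping from artifact name to digest; the digests and the helper `is_trusted` are hypothetical, and signing is omitted.

```python
from typing import Dict


def is_trusted(targets: Dict[str, str], name: str, digest: str) -> bool:
    """An artifact is trusted only if the current targets file lists it."""
    return targets.get(name) == digest


# Targets metadata before the problem with foo:1.1 was discovered.
targets_before = {"foo:1.1": "sha256:aaaa", "foo:1.2": "sha256:bbbb"}

# Revocation: publish a new targets file without the bad entry.
targets_after = {
    name: digest for name, digest in targets_before.items() if name != "foo:1.1"
}

still_listed = is_trusted(targets_before, "foo:1.1", "sha256:aaaa")
revoked = not is_trusted(targets_after, "foo:1.1", "sha256:aaaa")
```

Removal only protects a client that cannot be handed the older targets file again, which is why the discussion ties it to having seen a snapshot from after the change.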
Stock plan this weekend.

Yeah, I think that... let me say this, and then let's figure out what we want to do. I think part of what we're doing here is saying there's almost like two parties. It's the two-sub-key thing, where both have to be done at the same time to agree. And in fact the original Notary design actually had a separate store as well; maybe that's why this was... But if you take an environment that has to be clean every time, because we don't know who it's for, you still have to put something on it so that it knows what to pull. The thought is, and I don't know the details yet, because that's what I'm looking for, yes, and the other key folks to help with, that the client pulls the client keys from some other entity, not from the registry: some secure key management store. And then it has the ability to pull from the registry, because it says, okay, I have these keys. It sounds like, unless the metadata is also stored in a separate entity that is separately trusted, then if that client pulls from the registry at that point, it's not clear to me why whatever got hacked to push it back to a previous tag, the 1.1 tag instead of 1.2... basically, the hacker is going to hack that as well. So there's almost this two-phase commit thing that we're trying to achieve, and where does that second piece of information come from?

Yeah, if you're able to push keys, you're able to push TUF root metadata or snapshot metadata. The thing you need for snapshot in design option two is just a... what would it be, a 32-byte value?

You know, I think it's not the size or anything. It's just, where does it come from? There's an assumption around what's around for how long, and where does that information come from? I think that's the one other conversation.
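To make the "32-byte value" concrete, here is a minimal Python sketch of a Merkle root computed with SHA-256. It assumes, since the transcript only mentions the size as a question, that the snapshot in design option two is such a root; the leaf encoding, the domain-separation prefixes, and the odd-level duplication rule are illustrative choices, not the document's specification.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold leaf hashes pairwise until a single 32-byte root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Prefix bytes keep leaf hashes and interior hashes distinct.
    level = [h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical leaves: one entry per tag, after the fixes were published.
current = [b"foo:1.latest=foo:1.2", b"bar:1.latest=bar:1.3"]
rolled_back = [b"foo:1.latest=foo:1.1", b"bar:1.latest=bar:1.3"]

root = merkle_root(current)
old_root = merkle_root(rolled_back)
```

However many leaves the tree holds, the value a client has to be given stays the same size, and any change to a leaf produces a different root. The open question raised in the meeting, where a fresh client gets that value from, is not answered by the sketch.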
We can walk you through all this. It's not a problem. We can walk anybody who's interested through this. It's totally fine. And if we want to do this sometime later this week, that's great. I know other people probably have stuff coming up in a moment too, and I actually have to drop in a moment; I just realized I do have a hard stop. But yeah, I apologize, I feel like I've talked an awful lot during this meeting. Is there anything else we want to cover?

Yes, you want to do your quick summary of what you had proposed? I saw your PR. I gave you some comments on it. Was there more that you wanted to talk about?

Yeah, I can address those. What I put in is questions that I expect we'll get use cases for, to clarify how these questions should be answered. So the feedback that you have should be addressed after the next meeting, where we go through the different use cases and see what we need to answer there. I did want to say, I'm not sure what time zones everyone is in. I went with early morning availability, because I figured that would be easier for the East Coast and for people dialing in from Europe. But if the slots that I have put in don't work for anyone, just message me on Slack and I'll update it.

Did you post something on Slack for when you...

I put it in the meeting agenda. There's a Doodle link there to pick meeting times. Once I get feedback from everyone, I'll go ahead and schedule some time.

Okay, maybe it was just me, but I didn't realize it was... Yeah, I think it's a conversation. Yeah, if you could post it to Slack, I think that'll be very helpful too.

I'll go ahead and paste it into Slack right now.

Thank you. Oh, the Doodle link was in the agenda. I got you.
Yeah, just doing both would help people. I think people look at the agenda for what's current, and a conversation thread in Slack would be helpful for everybody.

All right, so we're at the hour. I know there's a lot of stuff there, but I think we are getting, at least I'm trying to understand, a little bit better what we're trying to accomplish. I don't know if the new environment was new information that helped explain where some of the concerns are coming from, in addition to the cross-repo thing. So we'll keep on iterating. I've started putting together a scope, I haven't really put it out broadly yet, for a framework for doing some prototyping so we can flesh out some of these things. I got some of it done last week. I'll be working on it this week, and I'll get it out before the next meeting so people can take a look at it. That hopefully will give us the framework to start experimenting with these things and fleshing out the details. So with that, I encourage you to take a look at neos schedule for whatever he's doing around key management, which I know we need. I'll forward it to our internal key folks as well, so they know that some specific content on that topic is coming up. We'll keep the conversation going for the week, and we'll meet next week. It is an hour now every week; I did update that. Thanks, folks.