All righty everybody, we're about ready to kick off our next talk. We're going from Google to the other leader in the public cloud, AWS, and his shirt pretty much says it all: keep calm and launch Debian. Without further ado, this is James Bromberger.

Good morning everybody. Can everyone hear me? Yay. Feel free to move closer. Feel free to move closer. So my name is James Bromberger, and technically my day job is as a solutions architect for Amazon Web Services, which is something I've been doing for just on a year now. Let me take you through what I'm going to cover today. Firstly a bit more about who I am, so you've got some background; then what AWS is, which I'm going to be pretty quick on, because I'm thinking most people here have probably come across it at some stage; and then generally what Debian is doing with AWS. Most of this talk is actually going to be given from my perspective as a Debian developer for the last 12 years or so, and obviously the things I've been trying to do from the Amazon perspective to help Debian, and what Amazon is getting out of it. We've got a BoF that I'm doing tomorrow on the future of the Debian AMIs — that will probably overlap a bit with some of the stuff we're doing with Jimmy and the guys — and I've got a session coming up tomorrow on things that AWS can actually do for Debian. So if you've got an idea of something you want to do, come along and I'll see if I can help you or find the right people, and you can take advantage of whatever AWS pieces you want to play with.

So anyway, who is Jeb? That's obviously a TLA — most people here, I would imagine, have had TLAs over time. I've been a Debian user since about '95, and a developer since 2000. I ran linux.conf.au — anyone here been to linux.conf.au, the Australian national Linux conference? Zack.
Yes. Okay, so the next one is coming up in January. If you're looking for some warm weather in January, come to Australia — it's in Perth, my home city. I am not running it, because I ran it in 2003; once bitten. But please come on down. I went to DebConf 1 — I've got some proof of that coming up — and I went to Edinburgh. So yeah, the proof. Who here is in there? One, okay, where are you? Where are you, Zack — in there somewhere? Okay. I mean, there are a bunch of people still here who were there. I do this for my own embarrassment more than anyone else's. We all know and love Colin Watson on the right, and Thomas Lange is somewhere around here as well, and Martin Michlmayr — he's not here this year — Christoph Lameter, and a bunch of other people. Is that Andreas Schuldei? Yes. So a long, long time ago; many memories.

Anyway, let me skip through the bit about what Amazon Web Services is, because hopefully a lot of people have already come across it. It's a collection of remote computing services. So there's EC2 — anyone here actually used EC2? Yeah, about a quarter of the room, brilliant — a bit less. It's a virtual server: you install whatever you want on it, you do whatever you want with it, you pay for it by the hour, and you shut it down when you don't want it. Then storage: S3 is a storage service — object storage — whereby the amount of data that you upload is the exact amount that you pay for. Unlike block storage: if you go and buy a SAN, you typically pay for the entire SAN; you put five megabytes onto it, you're still paying for the entire SAN. With object storage you're only paying for the amount of storage you're actually consuming, pro-rated — I think it's done over the hour or day or so — so it's pretty accurate to what you're using. And networking: obviously storage and compute are pretty useless without a bit of networking. You want to be able to connect to stuff.
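The pro-rated object-storage billing just described can be sketched in a few lines. This is a toy model, not AWS's actual billing code, and the price per GB-month here is hypothetical:

```python
def storage_cost(gb_hours, price_per_gb_month, hours_per_month=720):
    """Pro-rated object storage: you pay only for the gigabytes you
    actually stored, metered over the hours they were stored."""
    return gb_hours * price_per_gb_month / hours_per_month

# 5 GB stored for half a month (360 hours) at a hypothetical $0.10/GB-month
# costs a quarter of the full-month price for 5 GB — unlike a SAN, where
# you'd pay for the whole array regardless.
cost = storage_cost(5 * 360, 0.10)
```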
So we do a hell of a lot of networking, and one of the more recent things we've been working on has been Virtual Private Cloud. Public cloud has been around since — what was it, late 2006, when Amazon started this? Virtual Private Cloud is an extension to that, letting you define your own private network — potentially with no public internet access — that you connect to using IPsec, or potentially OpenVPN or any other VPN technology you want, or private fibre if you so wish, to reach your private resources. And there's load balancing and other stuff in there as well. There are over 30 services that AWS offers: EC2 compute is just one of them, S3 is one, load balancing is another — over 30, including things like hardware security modules, one of my favourites over the last nine months or so. Anyone ever used a hardware security module? Yeah, nice. So we have these available in AWS now. For those who aren't familiar with them: if you've seen Mission Impossible — "this message will self-destruct in five seconds" — these are the devices that, if you try to attack them with a screwdriver, will zero themselves. They're for managing your crypto keys securely, and I think it's the SafeNet Luna SA devices that we've got. You basically rent them by the month as you want them. Because, you know, security is an important thing — everyone here agrees security is important, yeah? Brilliant.

But on top of all those services, it's also supply chain, logistics, operational excellence, certifications and audits, operating at massive scale, trying to drop those prices all the time, living up to what we feel is going to be a good experience for our customers, and making compute work for people as effectively as possible. More importantly, on EC2, customers get to choose the software they're going to run. There's a whole bunch of Amazon Machine Images, or AMIs, that you can choose from, and for a long time there have not been official Debian images among them. It's been pretty straightforward to make your own — customers can master their own. But this was one of the things where, as a Debian developer, I've been sitting there going: we should answer this. We should have a Debian image which people can trust and run, as opposed to finding some unknown AMI that they may not know of or may not trust, or having to make their own — which, you know, I've spoken to a couple of people who said they don't know how to do, and that's fine; it's not straightforward your first time using EC2, though it does get pretty easy, and I'll demonstrate how we do it in a second.

So let's start with some of the other things that Debian is doing with AWS. Firstly — and I don't want to steal his thunder — massive distributed Debian package recompilation. This is not my project; this is Lucas's project. Lucas has been recompiling the Debian archive on EC2. I organised a grant for him since I've been at AWS, and there was one given the year before. Effectively it's just recompiling the archive — I'm sure Lucas could talk about this at length. But what's been interesting for me is that Lucas chose to use our spot market for provisioning the compute power. The spot market is a very interesting place where you can get very cheap compute.
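A toy sketch of the spot model: you name a bid, and your instance keeps running only while the market price stays at or below that bid. The prices below are made up for illustration:

```python
def spot_runtime(hourly_prices, bid):
    """Return how many consecutive hours a spot instance survives:
    it runs until the market price first exceeds the bid."""
    hours = 0
    for price in hourly_prices:
        if price > bid:
            break  # outbid: the instance is terminated
        hours += 1
    return hours

# Hypothetical spot prices in cents/hour for one availability zone.
market = [1.8, 2.0, 2.4, 3.0, 3.2, 2.1]
hours = spot_runtime(market, bid=3.0)  # survives the first four hours
```

Losing the instance mid-job is exactly why this suits restartable work like recompilation and not, say, a database.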
It's basically putting in a bid on the unused capacity of the cloud. We have a couple of pricing models. The on-demand model is: it's this much per hour, turn it on for as many hours as you want, turn it off and you stop paying. But of course, if we've got all this equipment up and running, why not give us a bid and pay a price for what you think is a fair amount? If nobody else is using it, it's yours. If the market price goes up, then you lose your instance — you lose your compute. So obviously you wouldn't do this for all kinds of processing, but for things like recompilation of binaries, you can restart that pretty easily, and so it becomes a very interesting and very cheap way of getting large amounts of compute. Lucas has a post on his blog about what he's done here, and it's a great example of using that spot market for compute.

The price does change over time, and you can actually see the history per availability zone and per instance size. There are two interesting concepts we need to talk about. Amazon has a whole bunch of global infrastructure. We generally talk about a region as being a geographic location where customers can use compute and storage, and we have nine of these worldwide. One is reserved exclusively for the US government, but the others are Dublin, Sydney, Singapore, Tokyo, São Paulo in South America, US East — our original one — plus California and Oregon. Each of those regions has multiple availability zones, at least two, and you can think of an availability zone as a cluster of data centres. We transparently add and remove data centre halls from those availability zones over time — generally add. And then the demand for each size of instance — a size of instance being an amount of compute power and memory allocated to your host — changes over time, and so we've graphed it, so you can see the current spot bid. As you can see from this graph, there's the on-demand price for a medium, which I think is about seven and a half gig of memory and an amount of compute power. We have our own metric for compute power, because — has anyone here bought a CPU before? Yeah, everyone's bought a CPU before. You probably can't buy the same CPUs today that you were buying in 2005 or 2000, so we've come up with a standardised metric to give a kind of quantitative feel for how much power an instance has. A medium has got two compute units' worth of power, a small has got one, a large has got four — so it's kind of doubling each time. The spot pricing on here was between 1.8 cents and 3 cents an hour, with the yellow line dropping around, so that's between 80% and 88% cheaper than the standard pricing. So this is a perfect use for Lucas's project.

The second thing we've been doing — and this has been an experiment of mine for about five or six months, working with Raphael on http.debian.net — is to give us a little bit of help on getting the
package archives available within AWS's regions, but also making them available to customers outside of those regions. We've been using CloudFront, which I've set up as cloudfront.debian.net. You're more than welcome to hit that web page, and it'll tell you: hey, why don't you try adding this to your sources? And http.debian.net will, for certain regions, actually redirect you to cloudfront.debian.net. You can use it directly if you want. CloudFront is our global CDN; it currently has 42 points of presence — the last two opened about two weeks ago in Chennai and Mumbai in India — and we're constantly expanding that. And it looks like — can anyone see the Mac behind that? Yay. Never use grey. Excellent. So the idea is that it will accelerate things — it's a caching network, so it'll accelerate everything after the first hit for all of those files that are out there, and, as Jimmy said, reduce the load on the Debian archive infrastructure that we have. You can use it in your apt sources, or you can use it via the redirector.

One of the things I found is that most objects are fine with the default caching of 24 hours. That's CloudFront's default: if there are no preferences set via HTTP headers — and HTTP headers are the correct way to do this — then we'll try to cache stuff for 24 hours, using effectively an LRU cache. But obviously some of those files are quite volatile — index files, translation files, trace files — and we want to force them to refresh a little bit quicker. I put a call out to say, hey, can we get some of these Cache-Control headers added to ftp.debian.org over HTTP? And given the timing, I went, okay, I'll actually have to try and implement something myself. So for certain paths I've got an interstitial web server, which is effectively intercepting some paths and adding the relevant cache headers to ensure that we stay pretty fresh on those files — I'm talking seconds instead of 24 hours.

So I was going to give you a bit of a view of how this works, and it'll give you some idea of what you can access via cloudfront.debian.net today — and this may actually expand. debian-cd: if you're after CD images and you hit cloudfront.debian.net/debian-cd, you'll be getting a cached copy of the CD distribution, including the Blu-ray images. Backports — the old backports, obviously now deprecated given that backports has moved into the main archive — is there. But the interesting bit is the project trace files and the dists, which go off to this little server of mine that I've set up on EC2 — that interstitial server, just adding on those extra Cache-Control headers. And the default page — the page that some of you may have just seen — is coming from an S3 storage bucket; it's about four kilobytes worth of data. So what happens is, from each of our edge locations for the CDN, the path mapping looks like this: if you ask for anything under /, it goes straight to that storage bucket, which is why you get that web page with the little logos on it. If it's anything under /debian, it goes directly off to ftp.debian.org, cached for 24 hours by default. Everyone with me so far? Beyond that, if it's looking at debian/dists, that goes off to my EC2 instance, which adds these extra headers on, saying: instead of the default 24 hours — because there are no headers being returned from our upstream — let's whack on, I think I made it, a default of 15 minutes' expiry, so the cache is pretty fresh. Then certain other paths like debian-cd and debian-archive go directly off to their origins, because the archive generally doesn't change that much. And of course this happens times 42 locations worldwide.

These locations do collapsed forwarding — everyone familiar with collapsed forwarding? Anyone run Squid before? Squid 2.6 had a brilliant feature: if 20-odd requests for the same URL hit your Squid server at the same time, it would pause all 20 requests, do one request to the origin, and then service all the requests simultaneously — as opposed to "I want you, I want you, I want you" and just funnelling all 20 requests through in parallel. CloudFront edge locations do that same kind of collapsed forwarding, so if we do get a stampede of traffic, it's not going to be propagated to the upstream origin servers. But each edge location does operate independently, so we will get 42 times the number of requests, times — what — a 24-hour TTL on those objects. Which is perfect for packages: packages don't change once they've been uploaded, do they? Cool.

So that interstitial server is basically running Apache with that kind of config: certain paths that I've decided on — I put this on the cloud list and asked around, and this is what we've ended up with. I'll go through the individual paths and the timeouts I've whacked on there, and this is not set in stone: if there's a path that you see that you want fresher through the CDN, let me know. If you want to jump onto that box, give me your SSH key — it's being run from the Debian account on Amazon, and I will give access to any Debian developer who wants access; just come and talk to me. So that's the Apache config. Summarising it into a table, we've got those paths and those timeouts — things like the Debian project trace files with 10-second timeouts.
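The per-path header logic can be sketched as a longest-prefix lookup. The 10-second trace and 15-minute dists values come from the talk; the path spellings and the remaining defaults are my assumptions, not the actual Apache config:

```python
# Map request-path prefixes to a Cache-Control max-age, mimicking the
# per-path rules the interstitial server adds. Most specific first.
CACHE_RULES = [
    ("/debian/project/trace/", 10),       # trace files: 10 seconds
    ("/debian/dists/", 15 * 60),          # index files: 15 minutes
    ("/debian/", 24 * 60 * 60),           # packages: 24-hour default
]

def cache_header(path, default_ttl=24 * 60 * 60):
    """Return the Cache-Control header for a request path."""
    for prefix, ttl in CACHE_RULES:
        if path.startswith(prefix):
            return "Cache-Control: max-age=%d" % ttl
    return "Cache-Control: max-age=%d" % default_ttl
```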
You can see that they're pretty fresh. That's what I was thinking would be an appropriate level of caching versus an appropriate load on the upstream servers. Any questions on any of that? Feel free to jump in at any time if anyone's got anything.

Cool. So the next thing — and this is, I guess, the bigger thing — the official images on EC2. I'll start by saying that Anders Ingemann is absolutely the legend who, with the HP guys and everyone else, has been putting all of our patches into one place, so there's a uniform place for us to go and get our image generation done. So we have official EC2 AMIs for Debian, generated by Debian developers — and that was the key thing: it is this community that has generated them. On EC2 they're available in two ways. Firstly, through our Marketplace: Amazon has a marketplace where independent software vendors put their software, potentially adding a cost of how much it is per month — and obviously ours is at dollar zero. And also from our AWS account directly: if you don't want to go through the Marketplace, you've got two routes to get to it. It's available in all regions, and it's even now available in GovCloud. However, I don't have access to GovCloud, because GovCloud requires you to be a US citizen, etc. etc. So I have to go with my software and say, would someone please bless this — and it goes into GovCloud and it is available. In fact, you'll find that most of this is documented on the Debian wiki, with all of the AMIs. Every AMI has an identifier, and it's different in each region, because each region operates independently. The list is all there, and there are templates to go and launch stuff, and all of that.

This is what the Marketplace front page looks like, so you can see all those vendors that are on there. Some have a very high reputation.
I'd say the one with the swirl has a very high reputation. It's great that I get to see that being presented at conferences with, you know, four, five, seven thousand people — they'll show the Marketplace, and I'm personally very glad that we've managed to get our Debian logo showing up on there. And that's what it looks like when you go to the product page, if you're going through the Marketplace route of getting your AMI. And directly from our account: if you search for our account number, which we've got there, you'll find everything we've got — Debian 6.0.7, 7.1a. We had a slight re-release after we noticed that the new SSH in wheezy has elliptic curve cryptography keys in it — come and talk to me if you want to find out about that.

So, as I said, they're all built from Anders Ingemann's initial script. This was a shell script which basically did everything; there was very little that I had to do — I found it and went, hey, that looks like a nice model. As Jimmy said, Anders is now refactoring this into a Python script, supporting basically all of the cloud vendors, with all of our extensions and the pieces that we need. So there are some scripts that it does inject into your EC2 image if you were to use it — and you're more than welcome to generate your own Debian AMI, or you can use the ones that myself and others have prepared. Here are the things it does. One: grab the SSH keys — you'd previously have uploaded an SSH public key into your account, and obviously it needs to be put onto that instance when you launch it. Two: potentially resize the root file system — if you decided to launch your EC2 instance with a 10 gig file system instead of the default 8 gig, then on boot it'll resize it to the right size. And three: execute any initialisation or bootstrap code that you may have given to your instance to run on launch. And it's very, very simple.
It looks at a private web server, gets this user-data blob of text, and if the blob of text starts with hash-bang — smells like a script, we should probably execute it. So it's a nice way that you can do all of your bootstrapping at launch time and start from a base AMI. Your base AMI might then start to do a few interesting things, like unattended upgrades and so on, which you might want to set up optionally. Again, the build scripts are all under the Apache software licence, and the build uses the Eucalyptus tools, so everything in there tries to be as open source as possible where we can.

There are no call-homes or updates — nothing. Currently the Debian AMIs don't fall back to or use cloudfront.debian.net; they actually use http.debian.net directly. The image doesn't give out any other information that it's being used. So you can launch it — and in fact, if you've launched it from the Debian AMI account, not through the Marketplace, I have no idea how many people are using it, and that's great for our users' privacy.

That said, to access the instance, you SSH in as the default user, admin. When we launched the Debian cloud mailing list about nine months ago, one of the first things discussed was: what username are we going to SSH in as? On some EC2 images — the default Amazon Linux — the user is called ec2-user; on the Ubuntu images the user that's initially created is called ubuntu. Every distribution seemed to have their own name, and we thought, well, should we make it debian? And everyone went no, because we've got blends and derivatives and things like that. Well, let's make it something that's generically useful.
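Going back to that launch-time user-data check for a moment, here's a minimal sketch. It assumes the standard EC2 instance metadata endpoint at 169.254.169.254; the function names are mine, not the actual script's:

```python
import os
import subprocess
import tempfile
import urllib.request

# The instance metadata service, reachable only from inside the instance.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"

def fetch_user_data(url=USER_DATA_URL):
    """Fetch the user-data blob the instance was launched with."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def maybe_run(blob):
    """If the blob starts with '#!', it smells like a script:
    write it out, mark it executable, and run it at launch time."""
    if not blob.startswith(b"#!"):
        return False
    with tempfile.NamedTemporaryFile("wb", suffix=".sh", delete=False) as f:
        f.write(blob)
        path = f.name
    os.chmod(path, 0o755)
    subprocess.run([path], check=True)
    return True
```

Anything else in the blob — plain configuration text, say — is simply left for other tools to interpret.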
So: admin, and then sudo to get root. I am repeating this because it's an FAQ: once you start your Debian image on EC2, you SSH in as admin, not as root. Of course, once you've got onto that machine you can change that — you're free to enable root SSH, you're free to enable passwords if you want. This is completely your choice, because it's your machine once it's started. So: no remote root SSH, no password authentication, but you can change it. You have full root privileges; Amazon does not. We can't tell how much memory is in use on your instances — it's all completely a black box to us — and if you wanted to track any of that, you'd have to instrument ways to push it into our monitoring systems. Obviously you run whatever you want, subject, as we all are, to acceptable use policies. And one of the interesting things I found: no broadcast, multicast or promiscuous mode on your network interfaces. Cool.

So, the snapshots for the Debian AMIs are marked as public, which means anyone else with an AWS account can go and inspect them. The idea of trust in our images was a very important concept to me — and I think to all of us — so you can go and inspect them before you launch any of them. You might want to create your own AMI, which you can do by launching one of the original AMIs, making your modifications, and then saving it back as your golden master. There's no problem with that, or you can use the script — it's quite straightforward.

One of the questions that came up was: why did we put this into the Marketplace? That sounds rather commercial for a non-profit distribution like Debian. Well, discoverability was one of the ideas that came up: a lot of people go looking for software, and they go looking in a marketplace. Why would you go direct from our account? Because you can. You can choose exactly the same software; it'll have different AMI identifiers, because it's cloned into the Marketplace account, but the software is just a clone.
There's no difference between them. So what are customers doing with them? As I said before, from the AWS account I don't have any idea — it's not tracked, it's public; people can see it and launch it, and there's no log of how many launches have happened. But from the Marketplace I can see some stats. Now, we did this starting at the end of last year with squeeze, and basically it's been a top-10 product across all the independent software vendors — of the thousands, I think, that we have in there — and it's seen 5% week-on-week growth, which, if you track it over time, is roughly a graph like that. So we're seeing a lot of popularity for the Debian AMIs on EC2.

But then there are other questions that come up: why should we care about having AMIs on there? This comes back to — well, we didn't have official AMIs for a long time. For some people, it's the first place they're going to find Debian. Obviously not for anyone in this room, but a lot of people will be looking for an operating system, they want Linux, and there's a whole bunch out there. Having a presence there means we've lowered that barrier to entry for a lot of people. And yes, they can, if they want, sign up for a new account on AWS and use a micro instance for free for the first year — details are on that URL. But obviously existing Debian users want to use Debian because, well, it's a pretty good operating system, I've found. We have a lot of users who want our vast number of packages, which might not be there in every other distribution, so we're trying to lower that barrier for experienced users as well.

These are some of the comments that have come from some of our users through the Marketplace: "Pleased to see the team has created official AMIs that you can trust." "Upgraded to wheezy servers. It was in the Marketplace."
"It was all there." "I used this image and it did everything I needed it to." And there's an underlying message that comes back from all of the comments, and it's basically that people trust it. I think that is a core thing for us: to provide that level of trust and reliability.

One of the questions that has come up is about the life cycle of images. We've generated point releases since 6.0.6, and I have just deprecated 6.0.6; obviously we've had 6.0.7 since the day it was released. One of the things that we as Debian need to decide is: what's our policy on keeping old releases? I personally have felt that we should keep the last point release of every release that we have. And it'd be great to go back and see if we can get — I don't know — bo, hamm, slink; they might be a little bit too old, but whatever we can keep on there. I think it's a great place for us to archive them, because you never know: in six years' time, ten years' time, twelve years' time, we might come back and go, well, let's fire up a squeeze — where will I find one? Archives may have moved on, and time may have bit-rotted things. So I think if we do keep those, it'll be nice. But obviously this is our choice as Debian, so please come to the BoF and give us your five cents — or two cents — of what you think is going to be appropriate for us to do over time.

Creating the images. To create these images, we start with an existing AMI, and we launch an instance.
Yay — we have a machine we can work with. We then create an EBS volume — a block storage volume — of whatever size; typically I've chosen 8 gig. 8 gig seems to give everybody enough room; it's kind of become a default for EC2 instance sizes on the cloud. Then of course you debootstrap into it, do all of your software installs and anything else you want to put in there. Once the volume has got data on it, then — calling the API — you unmount it and snapshot it, so you have a snapshot that is stored into S3. A really important distinction between EBS and S3: S3 is replicated into at least three locations, so it's incredibly durable, whereas EBS volumes only exist within the same availability zone your EC2 instance is running in. From that snapshot, we've then got a nice little API call, basically called RegisterImage, and that call will result in an AMI entry — it's basically a configuration setting referencing that snapshot. Then you can get rid of your volume and instance, and you've just kept a snapshot and the AMI entry. Any questions on that so far?

So, one of the interesting things that's recently happened is that the API for doing that RegisterImage call has just been improved. A new parameter was added back on the 2nd of August — so what's that, about a week and a half ago? — called virtualization type. Up until now, this business of having an EBS volume, debootstrapping into it and registering it meant you could only register paravirtual machines — PV — which for Amazon has typically been the smaller instance sizes. This new parameter — which obviously we'll be looking to Eucalyptus to see about getting support for as well — will let us register the image as hardware virtualisation, HVM, which covers the larger cluster compute instances, the instances with the NVIDIA Tesla graphics cards on them, and other larger stuff which is coming down the pipeline as well.

So here's a quick summary I put together the other night of some of the smaller instances: you can see the amount of compute power they've got, the number of virtual cores that are presented, the amount of memory, and whether it's paravirtualisation or hardware virtualisation. At the moment the images we've currently got are EBS-backed — block storage backed — and they're using paravirtualisation. Our goal is to bring up these HVM images as quickly as we can, and also to look at things like S3-backed AMIs, which are the ones which don't use block storage for the root file system but use temporary or ephemeral storage instead.

Now, the images that we currently have I consider to be base images: basically a standard Debian install, as you would get without choosing any tasks or anything like that. But one of the things that we as a community should probably think about is: which blends do we want to add, potentially as their own AMIs? I was talking to Andreas this morning — he's giving a talk right now, which unfortunately
I'm missing, because he's in the next room — about getting Debian Med images up there, or the biology images or scientific images, ready to roll. Obviously there are large numbers of people who want to be able to run those, and maybe they don't want to wait for the base image to install 400 scientific packages, so we could master those images for them. So yeah, please come to the BoF on Tuesday and talk about whatever you want to see.

Future directions. Summarising those root file systems: we've got 32-bit and 64-bit images at the moment — I don't know how long 32-bit is going to persist, but we've got both. These are all questions for us to answer. We have those instance images, times the number of flavours, times the nine regions worldwide, so obviously there's a vast amount of storage that we're starting to accumulate across all of this — and I'm pleased to say that Amazon is picking all of that up, because it's obviously good for Amazon. More information is on that wiki — I've been documenting stuff, potentially not well enough, but attempting to, on the wiki. Contributions are gladly welcomed on there; it's a great little resource, and it's the Debian resource to talk about all this kind of stuff.

One of the interesting things I was passed about a week ago was this white paper, from someone who has made their own AMIs for doing protein sequencing on EC2. And there's more and more of this: as my name gets known — "hey, you're the Debian guy in Amazon" — people keep saying, have you seen this? Have you seen this? And it's great stuff that we, as the Debian community, are enabling people to do pretty quickly.

It obviously goes without saying that we are hiring. We've got a whole bunch of roles — I've spoken to the managers in most of these teams in the last couple of weeks. So, does anyone here care about operating system design? Come on.
Everyone here cares about operating system design. Brilliant. We've got jobs going in Seattle with our kernel and operating system team — that's not a Get Smart reference, that's the kernel and operating system team. If you care about packaging in general — yeah, we want you. We've already spoken, I guess, about packaging cloud-specific stuff, but come work with us and help us get that done. If you care about CDNs and HTTP and that kind of stuff: the manager of that team also reached out to me and said, yes, anyone who's likely to be here is likely to know their stuff — so that's another area. And even outside of Seattle there are roles like the one I'm in: if you like talking to people about technology and cloud, then all languages are clearly welcome.

Other resources we've got: Anders's script is obviously one, and the cloudfront.debian.net site for reference — feel free to use it as much as you want. Here at the Debian conference there are a bunch of interesting sessions. Today, Jimmy's session on public clouds and official Debian image status — I'm definitely going to be there for that. Also this afternoon, "How can AWS help Debian?" — if you've got an idea for something you want to do, come along and talk to me; I can give you some credits and you can be on your way, do whatever you want, abuse it as much as you want, and try to make something useful and interesting out of it. Tomorrow, challenges and questions — David's talk at 9:30 — and then the BoF tomorrow for me.

And that's it. There are a bunch of people I do want to thank. Obviously Anders Ingemann needs so much thanks, because he's basically put up with all of us coming along and bothering him, merging very, very quickly, and then reinventing, and basically being awesome.
So, obviously, a huge amount of thanks to Charles, Miguel, Julian and Thomas, because the cloud-init packages are finally coming together. cloud-init is a way of structuring your initialization data for an EC2, or in fact any general cloud image, I should say. We haven't had that package in Debian; it basically was a mastermind that came out of Ubuntu, but now we have it, and I believe it has made backports. I think it has. Yes.

So now there's another thing that we as Debian need to think about: do we have a base image which pulls from backports, as opposed to just pulling from main? These are things I'm not going to sit here and say we should do; I think this is something that we as a community need to decide, because I don't want to put any software in there that not a single person in this room believes should be there. It should be stuff that you trust.

So thanks also to [unclear], Lucas and Stefano, for the support you give me: "yes, you look like you're in the right position, just put this together and make it happen". Thank you very much. That's basically me.
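To make the cloud-init discussion concrete, here is a minimal sketch of what instance user-data can look like in its simplest, shell-script form, which cloud-init fetches from the metadata service and runs once on first boot. The package name and file path are purely illustrative, not anything from the talk:

```shell
#!/bin/sh
# Minimal first-boot user-data sketch. Assumes the image ships cloud-init,
# which retrieves this script at launch and executes it exactly once.
# The package installed and the motd text are illustrative examples.
apt-get update
apt-get install -y ntp                     # example package: keep clocks sane
echo "configured by user-data at launch" > /etc/motd
```

cloud-init also accepts the declarative `#cloud-config` YAML format, which is the more structured alternative being alluded to here; a plain shell script is just the lowest common denominator.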
So let me throw it open to questions and see what I can answer. Steve? Microphones are there; you can stand and dance if you want.

So you talked quite a bit about the cache headers regarding HTTP and the apt sources. One of the things the Ubuntu server team has found is... so, work has been done in the past on apt to make it more reliable. We have the InRelease thing now, where the GPG signatures are inline in the Release file, to reduce inconsistencies there. Yes. But one of the problems still is that even if you're caching the individual debs, your indices are mutable over time, and you do not have an atomic update of your server. And so when you're talking about cloud-scale kinds of things, where tens of thousands or millions of machines are using apt, the fact that there's a few-second window where one file is updated on the server and not the other becomes a problem in terms of reliability. If you want your 10,000 machines to reliably apt-get update every time, without any manual intervention, the fact that you don't have a consistent mirror becomes a problem.

Absolutely. And that's one of the things that Rafael and I have seen and have been looking at, which is why I haven't made many public statements about it so far. I've been actively using this for about six or nine months on my half a dozen instances (in fact, no, a couple of hundred actually, when I've been doing a couple of fun jobs with some customers), and since I've tuned this down to the 10-second mark I've not seen that problem. But 10 seconds was just an arbitrary number. Yeah, we could make it zero and every request would be live, but obviously that wouldn't be taking advantage of any of that cacheability, right?
Well, the problem is that no matter where you draw that line, at some point during the day you're updating your files, and because your Packages file and your Release file cannot be updated atomically, at some point you have inconsistency on the mirror. So the Ubuntu server team have worked on a solution for this, which is basically, using an S3 mirroring within AWS, to address this by having unique identifiers for each Packages file, which is what you have to do in order to make sure you're always getting the right index file. Support for this has not landed upstream in apt yet. So I'm just throwing it out there that if people are interested in this problem of mirror consistency in the cloud, there are potential patches out there, and it would be great to have some help getting that upstreamed into apt in Debian, which, as I understand it, has stalled out over the past year or so.

Yeah, no, that sounds really good. Obviously I'm interested in seeing if we can improve that over time as well, and if we want to put up S3 mirrors then we can do that. I actually started by creating eight independent S3-based Debian archive mirrors, and then, when I was looking at the complexity and the georouting going on, I thought, well, it would just be much easier to set this up as a CDN, and I think within about 15 minutes I was done. But either way, more than happy. In fact, there are talks now, since I've spoken to DSA in the last 24 hours, about putting some more stuff through CloudFront and S3 to further help. So yeah, whatever we can do.

Anyone else? Yes?

Yeah, you said there's no multicast and no promiscuous mode in AWS. Why?

Sorry, no... multicast? Multicast, yes, correct.
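The "unique identifier for each Packages file" idea can be sketched in a few lines of shell: publish each index under a content-addressed (checksum) path, so a client that reads the hash from the signed Release file always fetches exactly the index that Release described, even while the mirror is mid-update. This is essentially what later shipped in apt as the by-hash scheme; the directory layout and file contents below are illustrative, not the exact Ubuntu patches being discussed:

```shell
# Content-addressed index publication: each Packages file is also stored
# under its own SHA-256, so a client never sees a half-updated index.
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p dists/main/by-hash/SHA256

# A toy Packages index.
printf 'Package: hello\nVersion: 2.10\n' > Packages
sum=$(sha256sum Packages | awk '{print $1}')

# Publish an immutable copy under its hash; the signed Release file would
# record this hash, and clients would fetch by it.
cp Packages "dists/main/by-hash/SHA256/$sum"

# What a client does after fetching: verify it got exactly that content.
echo "$sum  dists/main/by-hash/SHA256/$sum" | sha256sum -c -
```

The key property is that the hashed copy is never overwritten: a mirror update adds a new hashed file and then flips the Release file, so every client sees either the old consistent pair or the new one.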
I mean, you can't do multicast. One thing some people have done is build overlay networks on top of our internal IPv4 network, and then do multicast, broadcast and even IPv6 on top. But by default our network doesn't allow you to do broadcast, multicast or promiscuous mode on network interfaces.

Why? Well, going back historically, when it was all public cloud, that would basically cut off, I believe, a large number of potential routes for exploit, and the number of workloads that natively required it was quite low, so that was the choice that was taken then. Now, with the advent of Virtual Private Cloud, where you basically carve out your own IP address space and may or may not connect it to public gateways, that becomes less of an issue. I don't know if that's going to change in future, but those are the current rules. If it's something that you're passionate about, come see me. I've actually been rather intrigued that I can put in feature requests, so if you've got a feature that you'd like to request, or something's been bugging you, come and see me, and I can put that on the list; these things go into heat maps and get addressed in the fullness of time.

All right, we've got about five minutes. One more question here.

So I was going to ask the question Steve asked, but he's dealt with that. There's another, similar problem that Linaro has run into a lot. They've been using EC2 for ages to build lots of stuff, and you hit the problem if you try to run more than one job on a particular instance: apt assumed serialized access, and basically the second one explodes because apt has locked its database. We hit that a lot as a problem, and I think that's something else to think about. If you only ever do one thing on each instance, then it's okay, but as soon as you try to do more than one thing on an instance, then it all explodes.

So you're saying apt needs some more attention?

Yeah, it's another problem to think about in the context of running loads of stuff. apt wasn't designed for all this, and we need to think about that as well. I don't know if anyone's thought of how to fix it, or if "just don't do that" is in fact the only answer. Or, well, do stuff in chroots, which also works.

Yeah, yeah. Cool.

Steve again. Yeah, I guess nobody else has any questions, so I'll go ahead and burn the rest of your time talking about... let's see, what was I going to say? Oh, right. So, have you looked into using Juju at all for Debian?

Are you... what? Juju? I keep getting confused, because to me Juju was always the FireWire stack that replaced the old FireWire stack. No? What is the new Juju that's packaged?

Sorry; Juju is service orchestration in the cloud.

Yes. As opposed to putting Debian images together from separate pieces? Wasn't that a juju as well?

That was jigdo.

Jigdo. Jigdo. Right. Yes. No, Juju. Sorry. Go ahead: Juju.

So Juju is Canonical's solution for providing a friendly client front-end for declaring service relationships between nodes in the cloud and automating the orchestration of the deployment of services and everything. I'd be interested to know if Debian is interested in having that capability: both having the client packaged in Debian, which I don't think has been done yet, as well as having Juju able to deploy Debian instances and not just Ubuntu instances.
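Coming back to the apt locking problem raised a moment ago: besides running each job in its own chroot, one workaround is to serialize just the package-manager steps of concurrent jobs behind a file lock, since apt and dpkg fail rather than wait when the lock is already held. A minimal sketch with flock(1); the echo commands standing in for apt-get invocations, and the temp-file paths, are illustrative:

```shell
# Serialize the "package manager" step of concurrent jobs behind one lock.
# flock blocks until the lock is free, instead of exploding like a second
# concurrent apt run does. The echo commands stand in for apt-get calls.
set -e
LOCK=$(mktemp)
LOG=$(mktemp)

serialized() {
    flock "$LOCK" sh -c "$1"    # held for the duration of the command
}

# Two concurrent "build jobs", each doing its package step under the lock.
serialized "echo 'job 1: install build-deps' >> $LOG" &
serialized "echo 'job 2: install build-deps' >> $LOG" &
wait
cat "$LOG"
```

Each job still runs its actual build in parallel; only the apt-touching section is funnelled through the lock, which is usually a small fraction of the job.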
I'm given to understand that currently you can deploy Ubuntu and CentOS instances using Juju, but not Debian. So there's a gap there that it would be nice if somebody would fill.

And some of that gap that has been there in orchestration is the fact that you guys in Ubuntu have had cloud-init, and a lot of these pieces have plugged into that, as opposed to just "well, it looks like a shell script, it smells like a shell script, so let's just execute it". And so I think one of the next things that we need to decide, for our next revision, 7.2, is: do we pull in cloud-init from backports as it currently stands for Debian, or do we wait until it's in the main archive? These are policies, not technical limitations, and I obviously want to get everyone here to a consensus on what we're going to do about using cloud-init.

I've asked the release team about having cloud-init and cloud-initramfs-tools in stable, and they still haven't replied, but I guess it's easier to have it in backports. Yes.

Yeah, the only problem is that these are the only packages we need from backports, just these three, and that's it. So it's a bit of a shame to have just the one exception in a sources list.

Yeah, we have no choice for the moment, I guess. You mentioned actually the very important point that we talked about on the mailing list, which was: yes, we could go and grab cloud-init from backports, but how are we going to get security updates for that? Does that mean that we then need to add backports to the apt sources? Do we agree with having apt sources containing backports for the default distribution? Anyone have a feeling on that? Those in favour, say yes. Those not in favour, say no. Great, I have one "no". Thank you, Wookey. Cool, any other questions? Steve again. Oh, Steve, go on, take my time. Two minutes.
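The "one exception in a sources list" setup being debated could look something like the following. This is a sketch of one possible policy, not anything agreed in the session: the suite name matches the wheezy era of the talk, and the exact package list (in particular cloud-utils as the third package) is my guess:

```shell
# Hypothetical image-build step: enable backports, but only opt in for the
# handful of cloud packages, so everything else still comes from stable.
# Paths are the standard apt ones; suite and package names are assumptions.
cat > /etc/apt/sources.list.d/backports.list <<'EOF'
deb http://http.debian.net/debian wheezy-backports main
EOF

cat > /etc/apt/preferences.d/cloud-packages <<'EOF'
Package: cloud-init cloud-utils cloud-initramfs-tools
Pin: release n=wheezy-backports
Pin-Priority: 500
EOF
```

Because backports is marked NotAutomatic, everything else in it stays at low priority by default; only the explicitly pinned packages (and their security updates published to backports) would be pulled in.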
See, I think we're going to find a bar somewhere.

So this is actually just a comment rather than a question, and hopefully somebody has a question after this, because it's kind of a sour note to end on, but I did want to point out: it's great that we have the archive rebuild stuff going in the cloud, and I think that's a great use for EC2. But although you asserted in your talk that Amazon has no visibility into the instances, as a consumer of the cloud we will not rely on that, and I think that's probably the Debian policy: that we cannot trust the results for anything that goes into the archive. So, you know, fair enough; just something to note.

In fact, that's one of the things that we always say to our customers: feel free to encrypt your data before you give it to us. We're more than happy with everything being pre-encrypted, and if you want us to encrypt it again, we can; but to AWS it's just a bunch of bits. Yes, there are a lot of interesting discussions that Jimmy and I have had on this so far, but yeah, more than happy to talk to anyone that wants to go on with that.

All right. Well, thank you, James. That about does it for our time. Thank you, everyone.