I'm Mark Atwood, and I'm the Community Manager for Eucalyptus Systems. I'm the one behind the little Eucalyptus cards I've been handing out. Stuart asked me three days ago, three days ago, to fill in for Monty Taylor, who had a family emergency and so could not make it all the way around the planet to speak to you about Swift. That's too bad; I was looking forward to seeing his talk myself. So instead I'm going to talk about Eucalyptus.

Eucalyptus originally started as a research project out of the University of California, Santa Barbara, by these crazy people. The guy at the top is Rich, and these are five of his seven graduate students. They were working in supercomputing theory, grid computing networks, and that sort of stuff, all very exciting for the theoretical supercomputing world. That world looked at the whole cloud thing going on, looked at the primitive technology, and saw nothing really interesting research-wise, so the two worlds were diverging. But Rich had a contract with one of the national laboratories in the United States to build something, anything, that let them start doing some real-world data analytics on the large clusters of cheap computers that the national labs had. Several other groups put together sophisticated, complicated grid computing systems that required from-scratch rewrites of everybody's Fortran analytics. Rich and his team decided instead to re-implement Amazon AWS, and they wrote this paper. It was an interesting research project for them: it was going to get seven people their master's degrees and another line in the CV for Rich. Then suddenly everyone went nuts, people were coming out of the woodwork from large dot-coms trying to hire all of them away for very large salaries to build this just for them, and they realized there might be a business opportunity here.
They came up with this incredibly awful backronym, by the way: the Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems. The real reason for the name is that Santa Barbara is completely overrun by eucalyptus trees. Some nut job brought one back from Australia about 100 years ago, and eucalyptus trees have no natural predators in North America. It's a good thing that they're climatically constrained to just Southern California, or we'd have another kudzu invasion on our hands. Koalas were actually tried, and it doesn't work; there's some natural pathogen in North America that kills them, so that might be it. Because Southern California routinely burns to the ground every two years, except for the fucking eucalyptus trees. Sorry.

But anyway, doing what any smart tech people do in Southern California, they went and talked to companies with names like this. Benchmark is somewhat infamous for being one of the two companies in the bidding war for this little search company that no one thought would go anywhere, so they decided not to take it all for themselves, and only gave them ten million dollars if someone else gave them ten million dollars. Benchmark made, at last count, something like over 100 billion dollars on that investment in Google. So anyway, they gave us some money to start a company.

Jumping forward: in addition to a company, we have this architecture, where every single one of these parts is a web service speaking WSDL. You can point web browsers or SOAP clients or REST clients at it all. The two things at the top are the cloud controller and Walrus: Walrus is our S3 clone, and the cloud controller is our EC2 clone. Every component is a web service, and WS-Security is used between every component. You could run every one of these components on one box, or you could run all of them on separate boxes, geographically separated. As long as you can get secure HTTP through it all, it all comes together as one Eucalyptus node.
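As a rough illustration of the "one box or many boxes" point, the two layouts can be written down as maps from component name to service URL. This is only a sketch: the hostnames, URL paths, and the helper names `hosts_used` and `insecure_components` are made up for illustration and are not Eucalyptus configuration. Only the two components named in the talk are shown.

```python
from urllib.parse import urlparse

# Hypothetical layouts; every hostname and path here is illustrative only.
SINGLE_BOX = {
    "cloud-controller": "https://cloud.example.internal/cloud-controller",
    "walrus": "https://cloud.example.internal/walrus",
}

SPLIT = {
    "cloud-controller": "https://clc.site-a.example.internal/cloud-controller",
    "walrus": "https://walrus.site-b.example.internal/walrus",
}


def hosts_used(layout):
    """Return the distinct hostnames a layout spreads its components over."""
    return sorted({urlparse(url).hostname for url in layout.values()})


def insecure_components(layout):
    """Return the components whose endpoint is not reachable over HTTPS."""
    return sorted(
        name for name, url in layout.items()
        if urlparse(url).scheme != "https"
    )
```

Both layouts describe the same set of services; the only difference is how many hosts the URLs resolve to, which is the property the talk is describing.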
Some quick meta features of our project. It is all open source; it's GPLv3. There has been some FUD, and unfortunately our own CEO didn't help very much with it, about parts that are closed and parts that are open. It turns out that actually everything is open; it's all GPLv3, as I said. It's all written in C and Java, and it's hosted on Linux boxes. We don't emulate the entire AWS API, just two critical parts, and you can do a little bit of your own configuration to it.

Now, before I go into this or skip past it: how many people have used S3? Quite a few. Do I have to give a quick summary? Okay, I'll give a fast 60-second description of how S3 works.

S3 is a storage API and architecture, backed by whatever implementation; Amazon has actually re-implemented it themselves three times, and no one on the outside has ever noticed. It has a flat namespace, which is very similar to the DNS; in fact, it works hand in glove with the DNS. You have a flat namespace of things called buckets. If the bucket name looks like a domain name, and you have your DNS pointers pointed at the S3 or Walrus server, then you can access it by going to that domain name, and Walrus or S3 will remap it. Objects are stored in buckets. Objects have names that look a lot like path names, but they are also flat: the slashes are there for human readability, and the system doesn't take advantage of them. You have certain maximum sizes. Items in buckets are immutable: you can write to it once, you can read from it, you can read ranges from it, you can delete it, you can move it, but once it's written it can't be changed. There's a bunch of APIs to it; the best known are the SOAP and the RESTful ones. A great deal of the web browsing that you do right now is actually coming out of S3, and you don't realize it.

Walrus or S3 is used to keep boot images. How many of you have used Amazon EC2? Amazon keeps your execution images in S3; in Eucalyptus, we keep your execution images in Walrus.

Now we get to the limitations. Our first implementation of Walrus is using the
file system as the back end, so there's not actually anything interesting happening from a storage architecture point of view, unlike Amazon's. It's that way in part because, first, it's a first run, and second, we are primarily targeting enterprise data centers, where people are already used to running large RAID SANs. We are working on projects to make it distributed and make it use something other than the file system, but not yet.

What's coming out of Eucalyptus: the other Amazon APIs, the query service, the queuing service. We're working on projects to make it HA, to make it more distributed, to make it multi-data-center.

Like I said before, we are open source, and we're not open source the way, say, a large social networking company I could name is open source, where they write things and kick them over the wall. All of our work is done publicly. We use Launchpad to keep our code in, and we use Launchpad to track our public bugs. When our engineers are working on something and they do a merge, the merge runs through an internal QA and is pushed up to Launchpad every night, so you will see, at worst, 12 hours behind what our engineers are hacking on at any given time.

We have a website for our community. It has a wiki, forums, documentation, and something we call the Concourse. One of our Italian employees gave it that name because we didn't want to call it a contest or a race. With the Concourse, we're asking anybody who's done anything interesting with Eucalyptus, whether it be a fork, or an interesting application running on it, or an interesting legal case, or something, to write a wiki page about it. When we have collected enough of them, we'll have everybody look at them and vote on the 10 best ones, and give those 10 best a free year's support contract, which basically gets you the same thing that the people who are giving us money get.

To be a contributor, you have to sign the community license agreement (CLA), which is almost identical to the one you
would sign for Fedora, for OpenStack, or for Apache. You submit a patch against a Launchpad tree and send the issue to our issue tracker; you get an account on our issue tracker when you sign the CLA. If you have that, you can read and check our issue tracker, and you can write comments on any of the bugs in it. Other people who have issues they want to send in, or things they want to talk about, can use our public forum, or they can send us email.

That is what I have to say about Eucalyptus. I've been handing out little cards inviting people to join the project, or at the very least check it out. One of the slides that's not here: if any of you are Ubuntu users, in the Ubuntu world this is known as Ubuntu Enterprise Cloud. If you put together a couple of machines, install them from scratch, and tell one of them it is the UEC node controller and the rest that they are UEC node clients, you will get a Eucalyptus stack. Any questions?

So for the back end, you said there's no replication of the data yet. If someone just sets up, say, something like Ceph, are any changes required in the rest of the stack to take advantage of that and spread the workload to other nodes?

We actually have a patch that we have integrated to use Ceph. One of the cool things about that patch is that it was pretty minimal: it was only a patch against one particular WSDL component in the Walrus system. The rest of the Eucalyptus system needs no changes to do that. Anything else?
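The object model from the S3 summary earlier in the talk (a flat namespace of buckets, flat keys where slashes are only for humans, write-once objects, range reads, and bucket names that double as domain names) can be sketched as a small in-memory toy. All names here (`FlatStore`, `ObjectExists`, `bucket_from_host`) are hypothetical; this is not Walrus's or Amazon's code or API. The overwrite check follows the talk's write-once description; a real service may instead replace the whole object on a second put.

```python
class ObjectExists(Exception):
    """Raised when a caller tries to overwrite an already-written object."""


class FlatStore:
    """Toy in-memory model of the bucket/object semantics from the talk."""

    def __init__(self):
        # One flat dict keyed by (bucket, key); there is no directory tree.
        self._objects = {}

    def put(self, bucket, key, data):
        # Write-once, as described in the talk: no in-place modification.
        if (bucket, key) in self._objects:
            raise ObjectExists(f"{bucket}/{key} is already written")
        self._objects[(bucket, key)] = bytes(data)

    def get(self, bucket, key):
        return self._objects[(bucket, key)]

    def get_range(self, bucket, key, first, last):
        # Inclusive byte range, in the style of an HTTP "Range: bytes=first-last".
        return self._objects[(bucket, key)][first:last + 1]

    def delete(self, bucket, key):
        del self._objects[(bucket, key)]

    def list_keys(self, bucket, prefix=""):
        # "Directories" are nothing more than a string-prefix match on flat keys.
        return sorted(
            k for (b, k) in self._objects
            if b == bucket and k.startswith(prefix)
        )


def bucket_from_host(host_header):
    """Map an HTTP Host header to a bucket name (drop any port, lowercase)."""
    return host_header.split(":", 1)[0].lower()
```

For example, after `store.put("images.example.com", "boot/kernel.img", data)`, listing with the prefix `"boot/"` finds the object even though no `boot` directory was ever created, and a request arriving with `Host: images.example.com` would be mapped to that same bucket.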
Amazon has a system, sort of like an Akamai-type system, so that you don't actually have to serve things from your instance; you can just serve them...

CloudFront, is that the one you're thinking about? That would be kind of awesome. What would be more likely, and this is me hand-waving with no knowledge of any plans, and what would be awesome, is if we set up an agreement or wrote some code to interface with, say, Akamai's API. Amazon can do CloudFront because Amazon needs a content distribution network for their own service, and they own the data centers they're running in. We run only a very small public Eucalyptus cloud; we don't run a service like Amazon does. We instead expect people to have their own data centers and their own agreements with their own internet service providers. People have Akamai-ized content coming off of Walrus. It's reasonably straightforward, because from the outside Walrus just looks like a web server, so if you can Akamai-ize a web server, you can Akamai-ize Walrus. But we don't have anything like CloudFront ourselves.

Does the CLA at this point involve copyright assignment of any kind?

It does, to us, and then we license it back to you. We have our own lawyers looking at how this is similar to and different from all the others. I myself am not a huge fan of the way it's written right now. I would much rather the CLA be that you assign us rights that match the open source license and you keep the ownership yourself, and that may be the way it ends up.

One of the issues with S3 right now is that if you have large files, you can't really use an rsync-like mechanism; you have to put the whole file up there. Is your system any different, or do you have any plans to update parts of a file, so to speak?

Ours is not different. I know of no plans to extend the API. If you have to do something like that to S3, speaking for myself, what I would tell you to do is go look at a product called Jungle Disk, and
Jungle Disk writes to both S3 and Walrus. Jungle Disk presents a FUSE interface and uses an interesting way of mapping names in a local file system to S3 or Walrus objects, so you can actually copy to or rsync to a Jungle Disk volume. That is that product layered on top of it.

I think my question has just been answered. I was going to ask about migrating between services. Are you aware of any large-scale migrations that have been done to or from Eucalyptus?

I know of a couple, and unfortunately I can't tell you who they are, and this is very frustrating to us. One of the corporate goals we have at Eucalyptus right now is to get customers who are willing to announce publicly that that's what they're using. To do large migrations to and from, there is a really awesome Python library called boto, which someone started working on a few years ago as a Python API on top of all the Amazon Web Services stuff, and we thought it was so awesome we hired the guy who writes it. So there are people who have written scripts in Python using boto to pull objects out of S3 and put them into Walrus, or to move them from one Walrus server to another.

I'm just wondering if you guys had any plans to look at addressing data distribution across multiple machines, or at least multiple disks?

We are, and it's one of the things where the engineers are arguing back and forth about how to do it. We know it's something we have to do. There are plans, but there is as of yet no code. If you have any awesome ideas for doing it yourself, like I said, we take patches.

All right, I think that's it. Thank you very much. Thank you.
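The migration scripts mentioned in the Q&A were written with boto. The sketch below deliberately does not guess at boto calls; it shows only the shape of such a copy loop against a hypothetical minimal store interface (`list_keys`, `has`, `read`, `write`), with an in-memory `DictStore` standing in for an S3 or Walrus endpoint. A real script would replace `DictStore` with thin wrappers around whatever client library is in use.

```python
import hashlib


class DictStore:
    """Hypothetical in-memory stand-in for an S3 or Walrus endpoint."""

    def __init__(self):
        self._data = {}

    def list_keys(self, bucket):
        return sorted(k for (b, k) in self._data if b == bucket)

    def has(self, bucket, key):
        return (bucket, key) in self._data

    def read(self, bucket, key):
        return self._data[(bucket, key)]

    def write(self, bucket, key, data):
        self._data[(bucket, key)] = bytes(data)


def copy_bucket(src, dst, bucket):
    """Copy every object in `bucket` from src to dst; return the keys copied.

    Objects already present at the destination are skipped, so an
    interrupted run can simply be started again.
    """
    copied = []
    for key in src.list_keys(bucket):
        if dst.has(bucket, key):
            continue  # already migrated on an earlier run
        data = src.read(bucket, key)
        dst.write(bucket, key, data)
        # Read the object back and compare digests to catch a bad transfer.
        sent = hashlib.sha256(data).hexdigest()
        stored = hashlib.sha256(dst.read(bucket, key)).hexdigest()
        if sent != stored:
            raise IOError(f"checksum mismatch for {bucket}/{key}")
        copied.append(key)
    return copied
```

Because both services expose the same bucket and object model, the loop is the same whether the direction is S3 to Walrus, Walrus to S3, or one Walrus server to another; only the two store objects passed in change.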