Good afternoon everyone, and thank you for being here. So this is officially a corporate presentation, which it is not going to be for more than five minutes. You over there, you already know this presentation. It's the same one I did at the OpenStack meetup a couple of months ago. I'm afraid you'll be bored, but anyway. So today, after a short introduction, I'll dive slightly into the best practice that we've developed for delivering clouds at eNovance, which tends to apply common sense so that we can deliver and maintain a working cloud for people.

So who am I? I'm Nick Barcet. I'm 46 years old, I'm single, I've been working on OpenStack for quite a bit, and currently I'm working as VP of Products and Pre-sales at eNovance. I happen to work on OpenStack Telemetry, also known as Ceilometer; I founded the project together with a few other people. I've been traveling the world and enjoying myself talking and doing OpenStack in many ways. And if you want to complain about my bad French accent on Twitter, it's @nijaba. Well, the details are over there.

Who is eNovance? And that's a corporate presentation slide. So eNovance is a French company started in 2008, growing like crazy since it started doing OpenStack. When I joined eNovance, that was a year and a half ago, we were about 20 people. We are now 120; this is outdated. We've got 150-plus customers. We make quite a bit of revenue, hopefully, otherwise we would have trouble paying salaries. And, ha, yeah, now this slide is really wrong. I don't like saying which position we are in in the score list of the day. We are in the top 10, and we've been reliably in the top 10 for the past four distinct releases of OpenStack. Number four was reached temporarily, once, during the previous cycle.
So I don't like this number. Forgive me for not fixing that before doing this presentation.

So eNovance, that's two things. We help people build a cloud, whether you want it to run internal workloads or to provide resources to external customers. And we help people use a cloud, and that is not limited to OpenStack. We can help people deploy their applications, and maintain and operate their applications for them, on various clouds. Currently we support any OpenStack cloud; that includes the French provider called Cloudwatt, but that would work as well on HP Cloud, Rackspace, etc. We also do Amazon Web Services and Google.

Enough for the corporate presentation. As I told you, I think it's much more fun to talk about how we deliver clouds. So to start this, you need to set up the field. OpenStack, deployed and producing any value for anyone, represents more than 40 components that you need to align so that they fall into working order. If you add to that all the other components that are needed to instrument OpenStack, whether it is underneath, things like storage, things like, I don't know, hypervisors, or things like network equipment and network software, and stuff that you put on top, stuff like "oh, maybe I need a little application to do a self-registration portal" or "maybe I need a little something to provide a software catalog", well, the number of components is huge.

If you deliver something like this (this is a slide of our reference architecture for service provider components), as you can see we add to OpenStack quite a few things, like billing, capacity planning, graphing, monitoring, BI tools, integration with CRM, etc. Or whether you're looking at other use cases, like building HPC, or building a cloud for a single application, or a test and dev cloud, well, there are many, many use cases in which you can use OpenStack. But in all these cases you're just growing on top of these 40 components that need to be delivered for your solution to be working.

So in general, when you get an
OpenStack distribution, you will get this: you will get access to technical support, which is great, and you will get access to maintenance updates of packages. You have a question?

So is it that somebody has a support contract and then they can ask as many questions as they want, or is it charging per hour or per question, and does that work out?

So right now I was talking about the general benefits of technical support and maintenance. The way we do support at eNovance is we offer a contract which is only limited by the time of response, not by the number of questions or the duration it takes to solve a problem. So basically it's a flat fee that is based not on the number of questions but on the number of servers, or other parameters that may fit your business case better. We don't sell this without selling the consulting services to help you deploy your cloud, so in general we answer all these questions as part of the engagement, and what we do in terms of technical support concentrates more on the break-fix issues that you may have once we've trained you on how to use your cloud.

So in general a subscription is just maintenance, access to updated packages, and technical support. But this leaves open a few questions. How do you safely upgrade your environment? How do you know in advance what the impact of this upgrade is going to be? And if the best upgrade course is to rip and replace everything, this is not going to work. Of course, you could say: hey, I'm going to stay on my version of the cloud. It's working. I won't touch it anymore.
I'm going to do like I was doing with all my software for ages: I'll leave the same version of the software running for five years. This is how I get the best bang for my buck. Unfortunately, cloud is not really a standard application, for many reasons.

There is the risk of not staying up to date. OpenStack is a piece of software that is being used by people to get to the very heart of your data center. Actually, you're selling pieces of your data center. So it's the actual purpose of the cloud to give root access to people you don't really know. Whether they are your internal customers or your external customers, these people could be malicious, and yet you need to provide them access to the heart of your data center. So of course there are a few security measures here and there, but the cost of not keeping your cloud environment up to the latest security update is potentially very high. So that's one very important reason why you don't want to stay with the same version forever.

And when we say up to date, what does it mean? I mean, when we consider that a stable release of OpenStack lasts, according to upstream, for a year, can you afford not making major upgrades? I don't think so.

So we found out that within the OpenStack project there is a great tool that is being used every day by the developers, which is the CI. The CI is based on Jenkins and validates every patch before it is merged into the actual branch of OpenStack. And we found out that maybe the best way for you to maintain your cloud would be to set up a child CI of this CI, one that would be able to reproduce the tests that OpenStack does, using your specific configuration, with additional tests that you would provide that would match your use cases. Something like this.
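
To make the idea of a child CI a little more concrete, here is a minimal Python sketch. It only illustrates the principle described above: replay the upstream tests against a site-specific configuration, add the site's own use-case tests, and report what fails. The class, the test functions and the configuration keys are all hypothetical; none of this is Jenkins, the OpenStack CI, or the speaker's tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a test takes the site configuration and returns True on success.
Test = Callable[[Dict[str, str]], bool]


@dataclass
class ChildCI:
    """Replays upstream tests against a site configuration, plus site-specific tests."""
    site_config: Dict[str, str]
    upstream_tests: List[Test] = field(default_factory=list)
    site_tests: List[Test] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return the names of the failing tests; an empty list means the update passed."""
        failures = []
        for test in self.upstream_tests + self.site_tests:
            if not test(self.site_config):
                failures.append(test.__name__)
        return failures


# Stand-in for a test inherited from upstream (hypothetical check).
def api_responds(config: Dict[str, str]) -> bool:
    return config.get("api_version") == "v2"


# Stand-in for a test added to cover a site-specific use case (hypothetical check).
def billing_export_works(config: Dict[str, str]) -> bool:
    return config.get("billing") == "enabled"


ci = ChildCI(
    site_config={"api_version": "v2", "billing": "disabled"},
    upstream_tests=[api_responds],
    site_tests=[billing_export_works],
)
failures = ci.validate()
update_is_safe = not failures  # False here: the site-specific test fails
```

In this toy run the upstream test passes but the site-specific one does not, which is exactly the kind of breakage that upstream testing alone would not reveal.
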
So what I'm talking about here is a chaining of the OpenStack CI to eNovance's CI, so that we can internally validate that what is being done upstream is not breaking what we've delivered downstream to our customers, and then have a customer-specific CI on the site of the customer, reproducing again, maybe with additional sets of tests, the tests that OpenStack has done and that we've enriched through our knowledge during our engagement with you, and linked to the CI that you use. And this, I mean, of course, it's never foolproof. What we are going to achieve with that is to give you, we hope, a high enough level of confidence that the update you just received is not going to break what is in production. Enough confidence so that you will push the button easily to deploy the updates, knowing that it shouldn't break things.

This is nice, but sometimes we miss. I mean, with the best will in the world, the test coverage of OpenStack is not yet at 100% for itself, and when we are talking about your projects, you've generally been adding multiple components, multiple modifications or configuration options that are specific to you, which may or may not be tested as part of the OpenStack environment. So we are going to be adding our own use cases, in the form of tests, into the CI to enrich that. But sometimes problems fall through, so the other thing that we need is to provide you with the ability to do a progressive rolling update and the ability to roll back. If something bad happens, you need to be able to go back to the previous stage, right?

So what exactly is it that we test? We divided our tests into multiple levels. First of all, we want to test the packages. Whether we are getting the packages from another trusted source like Red Hat or Debian, or whether we are building the packages for you, we still need to make sure that the packages themselves provide the requested service without breaking the environment.
So that's a very first level of tests that we do. We take the upstream source, which we combine with the package source. We may add a few cherry-picked fixes that we found to be useful here and there; hopefully there are as few as possible in there. And we retest the whole packages independently, and once we're done with the tests, we publish that in an internal repository, which (oops, only one slide, please) is then one of the sources for what we call our product CI.

The product CI combines the packages with everything that we have developed to deploy OpenStack, which includes Puppet modules, Puppet manifests, the binary packages we just produced earlier, and eDeploy roles. eDeploy is our deployment server; you can find a lot more about eDeploy on our tech blog. And we actually deploy a small cloud, and we have various scenarios depending on what type of deployment you have. We are going to reproduce, at a smaller scale, on VMs, an environment similar to the one that is running in production in your enterprise.

And we are not only going to test that what has just been delivered works. We are also going to test one thing that we include within our elements, which is how to upgrade. We want to make sure that the upgrade doesn't create an interruption of service, and we want to check that the service after the upgrade is functionally the same as, or better than, the one before. And we want to make sure that the key applications that are running on your cloud, if we've identified them and written test cases for them, are still performing as expected. Because it's not only a matter of availability, it's also a matter of performance. Yes?

Every two weeks, they try and stay close to trunk, roughly.
I have a hard time imagining something. I mean, I can picture, if you have a few servers, being able to upgrade and say "oops, there was a problem" and roll back. Do you have any idea whether, even at the scale of Rackspace, if it turned out that they put out the wrong thing and had to roll back, things can roll back at that level? Would they need to have, like, double the hardware?

Generally you don't want to upgrade all your servers at once, because for one thing, sometimes you need to reboot a server. So you will need to migrate the VMs that are running on the server prior to rebooting it, and you don't want to reboot all your servers at once, otherwise you won't know where to put your VMs. So generally what you do is upgrade a rack, then an aisle, and validate that it works. And if it works well, then you'll go for a few more and you'll test again. You don't want to do a big-bang upgrade and be like "oops, that was the wrong one". That's not the way to go.

Now, a little difference, because you mentioned Rackspace: we don't run off trunk. We decided to stay on stable, provide updates as they are provided to stable, and provide upgrades once a new stable release has been produced. We have never seen a case where our customer was in such a hurry that we needed to run from trunk, and the hassle of running from trunk is not negligible.
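
As a rough illustration of the rack-then-aisle approach described above, here is a minimal Python sketch of a batched upgrade that validates after each batch and stops at the first failure instead of touching the remaining hosts. The function, the host names and the validation rule are hypothetical placeholders; a real `upgrade` callback would also have to migrate VMs away before any reboot, which is not modeled here.

```python
from typing import Callable, List, Sequence


def rolling_upgrade(
    batches: Sequence[Sequence[str]],
    upgrade: Callable[[str], None],
    validate: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Upgrade hosts one batch at a time and stop at the first batch that fails validation.

    Returns the hosts that were upgraded, including those in the failing batch,
    so the caller knows exactly which hosts need a rollback.
    """
    upgraded: List[str] = []
    for batch in batches:
        for host in batch:
            upgrade(host)
            upgraded.append(host)
        if not validate(batch):
            break  # later batches are left untouched
    return upgraded


# Hypothetical layout: one rack first, then larger groups.
batches = [["r1-a", "r1-b"], ["r2-a", "r2-bad"], ["r3-a"]]
upgrade_log: List[str] = []

touched = rolling_upgrade(
    batches,
    upgrade=upgrade_log.append,  # stand-in for the real upgrade action
    validate=lambda batch: all("bad" not in host for host in batch),
)
```

Here the second batch fails validation, so the third batch is never upgraded and keeps serving on the previous version.
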
I mean, I can understand that when your business is to deliver a public cloud, you have tons of engineers ready to support that, and that's okay, but most of our customers don't have that. So we'd rather stay on stable.

Every six months, but then you've got minor releases within the cycle, every two months. So for example, we just released Icehouse, and there will be Icehouse dot one and dot two. Okay.

So once we've done the test of the product, we archive the result, because we want eventually to be able to go back to this archive later in time. The output of that, when the previous test is passing, is then re-injected into a customer-specific CI that is running at eNovance. This customer-specific CI is going to be even closer to the setup that you have. It is actually going to use an exact copy of your configuration files, at a smaller scale, and again we will run the tests in something that really, really looks like what you have. And then, when the tests pass, we send that information to the customer: hey, there is a new update. It has passed all our tests. It is now time for you to perform your own tests.

And the customer has the ability to add its own tests into the CI, because sometimes there is information that they don't want to share even with us, and because sometimes they have lots of development teams that can provide additional tests for the applications that are running on top of the cloud. And once this is validated, the customer has the choice to push the big red button (yeah, we deliver a big red button; no, we don't) to launch the upgrade into prod.

So all this we build using a 100% open source solution. I mean, the CI is nothing magic.
We're using what OpenStack Infra is delivering. All the components are standard open source components like Jenkins, Zuul and a few others. And this allows us to add the missing link to what a subscription should be, which is providing a channel for you to continuously receive updates, upgrades and scripts that allow your cloud to stay alive. You know, there's this saying in the security world that security is not a state, it's a process. Well, OpenStack is just the same.

So far your questions have been enlightening, so please go ahead.

I mean, does that work?

So we don't provide, we cannot provide, any kind of SLA for somebody else running the cloud. But there are cases where we are managing the cloud on behalf of the customer, and in this case we handle the SLA. So obviously... oh, a question?

We have a specific CI for each customer that we have, yes. It doesn't mean that we have a set of hardware dedicated per customer; the same set of hardware can be shared between customers. So basically we are testing OpenStack on OpenStack. We've got Jenkins instances that pilot a global pool of OpenStack instances, on which we deploy OpenStack to do the testing.

So I won't be able right now to give you the complete detailed answer, because it would take a little bit more time than I have. However, all the details of how upgrades are operated are published as part of a blog post that was published Friday, which provides links to every component that we use to do these deployments, and that integrates as well an Ansible engine that we use to do the orchestration of the deployment. Because upgrade scripts have a very strong need for being properly orchestrated: you need to do things in the right order, and configuration management was just not enough to solve that problem.

You don't need to have a global maintenance window where your APIs won't work for half an hour. Or at least we hope so. And if such a thing ever happens, we hope to catch it by running the tests beforehand.
We should be able to warn you that such a downtime will be needed, but so far we've never had to do that. So far what we have is downtime on specific hosts, but since in our reference architecture every management server is at least triplicated, as three copies, we can upgrade one and then move on to the next. For Nova hosts, it's very simple: we do it by groups of hosts and not all at once, so that we can maintain service.

Yes, every time there is a reboot needed we will need to do that, in order not to interrupt services for the end user. The toughest part is the database, but this is something that has been worked on for at least a year now, and for which people within the community have provided some pretty nifty solutions that allow us to maintain service while the upgrade is being done on another copy of the database.

Yes: techs, that's T-E-C-H-S, dot enovance dot com. blog dot enovance dot com works as well. Thanks. She maintains that blog environment, so she knows that by heart.

Sorry? Yes, it's a bit more often, because we provide updates as they come to stable. We don't wait for the dot releases.

That's correct, and that's the key. When you've got so many components, you really can't test the components independently and hope that everything will work once put together. Integration is more an art than a science until you've got such a tool.

Actually, yes. What's funny is there is a company that does exactly that for SAP. SAP is as complex a beast as OpenStack, and integration testing and continuous deployment of SAP is at least as complex as for OpenStack.

I'm completely falling in love with this model, so I'd like to see it everywhere, you know. I'm like the guy that has got a hammer and thinks that everything is a nail. But I don't know. That's my current belief; maybe ask me the same question a year from now. Maybe I'll have changed.

Logging, as in... everything's good? Do you bother?
Basically, a customer's been running for a while, they have a whole bunch of logs. Based on your reaction, I'm assuming that means you don't dig deep into their logs to see if there are problems there, or if things make sense there, beyond the normal tests?

The thing is, you'll get logs whether everything goes well or not. Detecting a real failure is actually much easier if you know what the expectations are from the result of a test than by digging through logs. Logs can be very useful to understand where the problem is coming from, and this is why centralized logging is a key component of a real-life production OpenStack installation. You don't want to have to chase from controller to controller which one has handled this request, and which is the Nova host on which the VM is in the end. And this is why one of the features, I could say, of our installation and upgrade methodology is to stop as soon as a problem is detected, so that you don't have to parse through hundreds of kilometers of logs before you find the line that you need. But still, sometimes debugging can be a little complex, and centralized logging really helps in that case.

No, in the testing. Right, okay: in the testing you want to interrupt your process so that you can look, fix, and resume from there. You're welcome.

That's okay. Yeah, I was just checking how much time I still had. I've got 10 minutes, so we are fine. I think that's about the last slide.

No, we didn't invent anything, and there are plenty of great things for that. Logstash is pretty nice, and then you can plug in whatever search tool you want, and you know there are plenty of very cool ways of implementing that on the web, and we didn't reinvent it from scratch.

Every time. When we start, then we fix from scratch.

Oh, there's someone over there. Sorry.
I have big lights right in front of my eyes, so I may not see you right away. So sorry, you're a little too far. There's a microphone in the middle, if you don't mind, and it will have the advantage of letting other people hear the question you are asking.

So you mentioned that you have a framework where the customer can verify a new update in their environment.

Mm-hmm.

So what is that framework?

That's Jenkins.

I see.

I mean, again, even though we are absolutely fabulous people at eNovance and we love inventing stuff, when the stuff has already been invented we reuse what exists.

But Jenkins lets you run a bunch of jobs, right? So the customers will have to define the jobs and write their own test cases. Is that correct?

Exactly. The difference in what we're doing is in the chaining of the CIs. If there was something unique in this presentation, that's the thing, right? The rest is standard practice that everybody is implementing.

Great. Oh, thanks.

So again, I'm going to ask you to speak into the microphone, because my hearing might be a little impaired.

How do you recover from a failed upgrade, eventually, on customer sites?

So failed upgrades tend to be stuff where we apply rollback. If it's on the production environment, we want to make sure we go back. And in order to do the rollback, eDeploy does something that is very similar to what TripleO will do soon, which is to deploy images of the file system and differentiate, in this image, what is data from what is system. So when we roll back we just push back the old image, and we're using the exact same data as before. So we don't need to back up and restore; we just provide the restore immediately. Once this is sorted, then we've got to fix something, and one of the features of this chaining is that when issues are encountered, eNovance is notified, so we can start working on the fix before you can call us. Does this answer your question? Did I understand this correctly?
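
The image-based rollback described in this answer can be pictured with a toy Python model. This is only a sketch under stated assumptions, not the deployment tool's actual implementation: a file system is modeled as a dictionary from path to content, and data is assumed to be identified by path prefix. The function names and the example paths are hypothetical.

```python
from typing import Dict, Iterable


def is_data(path: str, data_prefixes: Iterable[str]) -> bool:
    """True if the path is one of the data directories or lives underneath one."""
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in data_prefixes
    )


def rollback(
    current: Dict[str, str],
    old_image: Dict[str, str],
    data_prefixes: Iterable[str],
) -> Dict[str, str]:
    """Push back the old system image while keeping the data that is currently on the host."""
    prefixes = list(data_prefixes)
    # System files come from the old image only.
    restored = {p: c for p, c in old_image.items() if not is_data(p, prefixes)}
    # Data files are kept exactly as they are on the running host.
    restored.update({p: c for p, c in current.items() if is_data(p, prefixes)})
    return restored


# Hypothetical host state after a failed upgrade.
current_fs = {
    "/usr/bin/nova-compute": "v2",
    "/etc/new-option.conf": "added by the upgrade",
    "/var/lib/nova/instances/vm1": "disk contents",
}
old_image = {"/usr/bin/nova-compute": "v1"}

after_rollback = rollback(current_fs, old_image, ["/var/lib/nova"])
```

After the rollback the binary is back at its previous version, the file introduced by the upgrade is gone, and the instance data is untouched, which is why no separate backup and restore step is needed in this model.
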
How do we identify the... so do we have... No, we don't. Maybe, you know...

The question was how do you identify what is data from what is system within the images. So I'm going to repeat it in case it was not heard very well. We actually manually specify which are the paths where data is being stored. So in eDeploy we identify precisely: okay, this directory and all its children is data, and the rest is system. You had a question?

Yes. Yeah, exactly. We are cloning from the infra CI. So actually there are two models, because we support three distributions, Ubuntu, Debian and Red Hat. For Ubuntu and Debian we pretty much are the upstream for the packages we use, so it's very easy; we can do a direct link. For Red Hat we use what is available through the Red Hat Network distribution mechanism, so in that case we copy the tests but we don't copy the branches. We use the packages as provided by Red Hat.

Yes, we are. Okay, another question.

We don't have access to the data on the customer side. Okay? And again, we know it in a very generic fashion, because we are not talking about, I don't know, "the credit card information for such-and-such customer is over there". I know that in general on my Nova hosts data will be stored in those paths. That doesn't mean that I have access to those hosts; that doesn't mean that I have the ability to identify which data is which on those hosts.

Now, the big security risk in that model is that if somebody is able to access our publication servers and introduce some malicious code in there, it could propagate to our customers and have some devastating effects. And for this we reuse the standard practice for packaging, which is to ensure that every time we build something we produce an MD5 checksum of it that the customer can verify independently, so that they know nobody has tampered with what we've built. It's not foolproof. But I mean, that's true of anybody that provides software to you: any kind of malicious software could be embedded in their software
without you knowing it. The only advantage that you have is that you could eventually rebuild everything that we are providing to you from source, and hopefully the issue would not be there.

Up to the big red button, there is a manual decision. That's because CD is only applicable once people have total confidence in the system, and we don't want to replace that confidence by being overly confident ourselves. So we tell them: hey, from our point of view, this is okay. If you feel the same, go ahead and press it.

Yeah, that's correct.

Okay, the end user benefits. I bet you've read better than me what's on the slide over there, given the time I've left this slide up, so I'm going to skip it. Anybody against that? No? Okay, so thank you very much. If you have other questions... I think I'm out of time, but we can talk about it later on. Thank you so much.