Okay, let's start. Welcome to our session, "Volkswagen Group IT Operations team's adventurous journey to the land of OpenStack". My name is Gerd Prüßmann. I'm with Mirantis.

My name is Timon Schultz. I'm with Volkswagen.

The Volkswagen Group consists of 12 brands plus Financial Services, which is not a brand but part of the group. It includes well-known names like Volkswagen, Audi, Skoda, and others. The Volkswagen Group manufactures cars in 31 countries at 121 production sites and employs 610,000 people in manufacturing cars or in services for car owners, financial services, and things like that. They build 42,000 cars per day and ship them to 153 countries, and they make a revenue of 230 billion euros per year out of it.

Obviously, an organization of this size needs IT. We are from infrastructure operations, one part of the Group IT department of Volkswagen. The department manages four data centers and 7,000 servers and employs 80-plus people, and here we are talking about the traditional part of IT, not about the cloud. This department manages more than 250 business-critical applications. That's what we in the cloud world would call the legacy stuff, but these are the applications that currently create the revenue, so it's a very important part. And there are multiple IT departments at other brands as well, so this is not the only IT department.
There are more or less independent IT departments at the other brands as well.

The project we talk about today is the VW Group IT Cloud, abbreviated VWG ITC, and we have two major goals in this area. The first goal is to support Volkswagen's business objectives with respect to digitization and new markets like connected car, IoT, car sharing, and things like that. The second one is that we follow a slightly different paradigm than in the past: we don't try to press people into using the technology from central IT. We try to convince our internal customers that the offering we have is a very good one and competitive with other offerings on the market. So we have a product, and we want to sell it to our internal customers and convince them with our offering.

Besides that, there are more specific goals for this project. On the one hand, it should in the future be the global integration platform for all brands, which means they could participate in building the cloud, in operating the cloud, and in running services or their applications on top of this cloud. Even if it is currently connected to OpenStack, the project also covers hybrid cloud approaches, so public cloud as well.
It's not only private cloud.

Alongside the project and the technologies, one aim is to foster methodologies connected to the cloud model, like DevOps or agile development and agile processes, and innovation in IT in general. The major value that we would like to deliver to our customers is to help them shorten the time to market for their products and applications, for example for connected car or digitization products, so they can bring their functionality to their markets as fast as possible. But because it's a Volkswagen product for internal departments, we have to embrace policies from governance and compliance as well, to deliver a secure product to the customers and safeguard the security of their data.

Using open source in big corporate organizations is very challenging, by the way. By using open source, we also want to use the model behind open source, the collaborative model, to create, run, and enhance software in the future. So VW also aims to introduce a kind of corporate open source model into the company to help the various departments develop software, so that one department creating software can benefit from what was already created in other departments and use the same stuff, for example for config management: Puppet manifests, Heat templates, whatever it is. The idea is to join forces in creating these things. And one target in the background, for sure, is cost saving. The cloud has a lot of advantages over our traditional systems with respect to cost reduction, and that's one of the targets as well.

Currently the cloud is deployed at only one location, in Wolfsburg, at the headquarters of Volkswagen. But there is already planning to deploy it in various locations around the world. This is a global approach.
So the cloud will be operated and deployed in multiple locations in the world. The next one will probably be deployed in the Czech Republic, and after that the next locations will be somewhere around the world, so APAC, or North America, Latin America, or China or Asia in general. As already mentioned, there is a part for public cloud as well, and there we work together with the usual suspects in this area, the big public cloud providers around the world.

That's one example of the workloads running on the cloud. It's the car configurator for the Spanish market. That was the first application installed and running on the cloud. Actually, if I clicked this link, we could see this application. Ricardo, I saw you a few minutes ago. There's Ricardo. He invested a lot of work in this application and suffered a lot from our problems at the beginning. If you want to know how it feels to deploy on this cloud and consume this cloud, you should talk to him. He's one of our dearest customers now.

This is the timeline we followed in creating the cloud. The initial planning took place in the middle of last year. VW was already consuming virtualization technologies, for example, and the next logical step was to deploy a cloud and benefit from it. After that we started a proof-of-concept phase, selected the vendor, and started pre-production. By the end of last year we had built up the first system, put it into pre-production, and presented it to potential customers so that they could help us pivot on the services that would be required in the first setup. Then the system went into production. For example, the NGW website started implementing on this production environment, and we added platform as a service as well, so there is Pivotal Cloud Foundry running on this platform too. And now we are in phase four.
We are currently working in the phase we call the rise of the Titan, because we added, or plan to add, a lot of stuff, and we are currently developing all of these services. From this timeline we picked out two, well, we could call them incidents, but really major decisions to change our way. The first one is the samurai versus ninja incident, and the second one is project versus product management.

Samurai versus ninjas. I already mentioned that the project was born in the traditional department. In this traditional department the approach is to plan things in a waterfall model: plan releases and ship them after six months, for example, deploying on golden machines, very reliable and very expensive, and doing approvals or provisioning for parts of the resources manually. It takes months or years to deploy into the system. It's very reliable, high quality, and it works, but it takes a long time, and if you have problems with your planning or issues with your technology, you have to start over.

On the other hand, in the cloud area you have an agile approach. You have continuous integration, continuous deployment, things like this. You try to leverage the benefits of commodity hardware, for example, you have automated workflows, and your aim is to deploy in days or weeks, not in months or years. But we want to benefit from both areas: if we introduce cloud, we want to have the same kind of quality as in the old world. So we took one decision. We staffed the team with resources from the traditional team and said: okay, we want to achieve all the benefits from the right-hand side, but we want to do it with the specialists, the subject matter experts, from the traditional world, and try to infect the old world with these methodologies as well.

The second important change was to move from project management to product management.
So the traditional world is organized with very precise project planning. It's all about scope, cost, quality, and schedule, but it was done at the department level. The storage department plans something, the operations department plans something, the server operations plan something, and in the end that doesn't necessarily fit together, especially if you are under time pressure and want to deliver a benefit to the business. For digitization workloads, for example, you have to be very fast. We suffered a lot from this. From the silo point of view it was very understandable why the departments did this, but in the end there was a problem bringing all the things together.

So we switched to a product management approach, which means: think about your service as a service for a customer. Who are your internal customers? What is the product? What is the value for the business? What is your market? Do you have a market for database as a service, for example? That was very important, especially because we required the product owners who create the products to incorporate operations as well. They have to deliver an operational and support model for their services. Otherwise, like in the old world, there would be a gap, for operations especially. And one of the benefits is that we have roadmaps for our products. We have minimum viable products. So if there is something wrong in our planning, or we have misunderstood the requirements of our customers, we are able to pivot on the features of these products very fast. It's a model for fail fast and fail safe.

This is the phase we are currently working in. We call it the rise of the Titan because it's huge. These are the products, the 60 products that we are currently developing. There is nearly everything that you can imagine in a private cloud model.
Some of this is hybrid cloud model as well. But we also have products like cross services, for example. That's automation, all the methodologies that we need, guidelines for the developers of all the workloads to be integrated on the cloud, as well as workload onboarding. This is very, very important. You have to consult your future customers. These customers may come from a more legacy model as well. You have to help them leverage all the new technologies and all the things that they want to use and consume on the cloud, and make them understand the model that you're working in.

So now it's my turn. Thanks, Gerd. Thanks for this part of the talk and for all the support you and Mirantis are giving us. This is the second part. I'm responsible, among other things, for the operations of the Group IT Cloud, of the on-site Group IT Cloud. What everybody tells you is that you have to change your organization to be successful with cloud. So far, nobody has pointed out to me exactly how I should do this. I haven't got a recipe either, but I would like to share some experiences and some recommendations we draw from them, and how we want to proceed on this matter. If you have any questions, don't hesitate to ask straight away. We will have a Q&A afterwards as well.

Gerd pointed out that not all of you will be aware of how traditional IT is done, and I would like to start with the RACI matrix for server operations. If I have a service, it might be a Linux service, it will involve a lot of teams, like server ops, server planning, storage, and facilities, so air conditioning and power supply. We could go on forever. It will involve around 150 people in total to deliver a service of that quality, with all its drawbacks. So imagine: in this direction, below, and in this direction we could virtually go on forever. We could describe more tasks.
We could describe more teams. It doesn't even involve applications and so on. This model changes quite a lot. We made a rough sketch of which parts, the green ones, the cloud operations team will be responsible for. It's infrastructure lifecycle management, for instance, which was another department before. It will be the hardware for the Ceph storage as well. It will be the controllers. It will contain serious parts of the network. Obviously, a quite small thing, at least here, is the OpenStack control plane. But there are lots of other things, infrastructure monitoring, infrastructure backup and recovery, and so on, which were a shared responsibility before. So it's quite a change from this one to quite a green field, in both meanings of the word, in this diagram.

To make clear what changes and what challenges we had, we did a couple of interviews. We have some co-authors, so to speak, and recorded a couple of different points of view, like customer, security team, and so on. We will share some quotes and expectations from these stakeholders, and we will start with the customer.

The customer always challenges you with the public cloud supplier. In this case we chose AWS, because it's by far the largest and by far the one with the most services. They say: AWS has it, why don't you? Well, because we just started. So the recommendation is that you have to be very, very clear about your roadmap. You have to tell them plainly.
It's not there yet, but it might be there at a given point in time. You can collaborate by giving us requirements, and so on. Most people are quite happy if they know: okay, it's not there, but it will be there in half a year, or a year, or whatever. We are working together with the product team on this.

Then, the cloud ops team is always the last resort for a problem. In an organization like ours, people just call the cloud ops team and get informed that while setting up the tenant they forgot a router, and that's why there is no network connectivity, for instance. The recommendation we draw from this is that you have to provide proper customer training and onboarding for the whole organization. It's not like providing a website where they can fill in a form and that's it. That's just not the way it is.

What happens a lot is that people have problems with IT in general and use the cloud as an enabler to solve those problems. The most prominent one is that people call me and say: yeah, but I just can't develop software with a Windows client, which is the only client we provide as Group IT in Wolfsburg. I don't have a good answer for this. My recommendation is to stay very polite when you come across these kinds of requests.

Then there's a special thing in a large organization that has had IT for a very, very long time and has mainframes and Oracle databases and Microsoft SQL databases. Name any technology, we will have it. Obviously, people want to connect to back-end systems and legacy systems, and this is kind of a Pandora's box. If you start on this one, you'll end up in trouble in a way. You have to be very strict that you require statelessness, because it will enable you to do operations afterwards. But do it in a way that helps the customer create a migration plan. It might be that you first put the front end into the cloud and other parts afterwards, like Ricardo did. Don't just say: okay, it's not possible.

So, obviously, my team has some
point of view as well. What struck me a lot, having been part of IT for 15 years or so, is that about 60% of the accepted incidents led to serious improvements of the platform. If something did not work, it wasn't just fixed in one location while the other locations were forgotten. Which brings us to the recommendation: deploy immutable infrastructure, so do everything as code, and do CI/CD. Be quite modest: start with a very, very small test set and extend it. From a project point of view, if you start huge, it will be stripped from the project plan. If you have testing in place, you can quite easily enlarge it, but you can't inject test cases afterwards.

What struck me most in this whole journey was when Hannes called me: yeah, we had an incident with load balancer as a service. It restarted itself automatically and reinstalled all the VMs. I come from a point of view where such a decision would usually take a meeting of approximately five managers to decide whether to restart, because some of the services are running and others aren't. So be confident. A lot of things really work. People just don't tell you when they work, but they really do.

What we have experienced is that there are two kinds of users. We call the first cloud citizens. They are familiar with AWS or Azure or whatever, and they are critical and very demanding. The others are cloud immigrants. They need a lot of help, but after that they are quite happy. What helps us there is a concept we call tenant for free. They get a small tenant, free for one year, where they can just play around, and after that people know what to expect and what not. This was said a lot.

Of course, my management has a point of view as well. Actually, part of my management is on-site.
So I have to be a bit polite on this. One expectation, and this is a quote, is that cloud is just a new platform. This heavily underestimates the level of disruptiveness it has, and I think we are quite successful in telling people how disruptive this model is and how huge the shift in responsibilities is, and all these things.

Another expectation is that any new workload can run on the cloud without any modification. A quote for this is: OpenStack is a cheap replacement for VMware. Which is not quite true. What helps there is that you should have a good checklist of what cloud ready or cloud native means, and by all means stick to these conditions. Otherwise you will end up in a lot of trouble.

Another thing management always says, not only about cloud, is that you can do it on top, because you have too much spare time. I won't comment on this general problem.

Being a very European, very German organization, we have specific employer expectations. One is that we stick with our staff for a very, very long time.
It's quite common that somebody starts his or her career at VW and ends it there 35 or 45 years later, which means that the technologies are always in a place where the people are not. Our recommendation is that you have to train and qualify the staff a lot for cloud operations.

Which brings me to the next point: experienced staff is hard to find, at least in Germany. But I get the impression that every other booth here has a "we are hiring" sign, so it doesn't seem to be a German problem alone. See above: you have to train, even if it takes longer, and it does take longer.

Another thing which I embrace a lot is that my management, the general management, sees a lot of synergies in this. They want us to push the fitting technologies from cloud, like lots of testing, like CI/CD, like config management, you name it, back to traditional IT, to have synergies and more advancement in traditional IT. I find this a very good idea. We try to take people part-time, so to speak, from their traditional IT job, being a network guy, a storage guy, a server guy, or doing config management in traditional IT, and deploy them in the cloud. In particular, we make them responsible for a new product. And people are not such that they just forget about it when they do the other job. They see: okay, this is the same. I can deploy the same software, the same principle, here and there.

The security team: they are not on-site, but I have to give them my high regards.
They started off on this journey with us, they were very collaborative, and both of us learned a lot. Their initial point of view was: okay, we have a set of rules, and we apply it to this technology as well. Which does not work too well. So the recommendation from our side is to involve them as soon as possible. Basically, invite them to the first meeting: we want to do this and this and this. Help them evolve their policies, and train them in the technology as you train your own staff. There will be much more understanding, and they did a very good job on this.

Then one thing which is always, so to speak, unfair to the infrastructure: the platform has to compensate for any deficiency the application has. That's why we have so many firewalls, which the platform is not really responsible for. Our solution, which we will implement in the next phase, is to have user-specific tenant types. We will have managed types, where users can't fiddle around with the network too much, and unmanaged types, where they need a tight security approval and have to address risks and countermeasures and all these things.

One quote from them is: no programmer or developer cares about security. I can understand that point of view. Our recommendation is to embrace security concerns. Tell them: yes, okay, we know there is a loophole.
There is some risk, and we did this and this about it. From a technical point of view, you should introduce security screening as part of your application CI/CD. What we do is screen the images we generate for known mistakes and all these things, which is a very good idea.

Last but not least, the developer's point of view. For instance: GitHub, Bitbucket, any mirror you can name is a reliable, secure source of software to be used in the cloud. This is not quite true. As Gerd pointed out, the private cloud is required to offer a surplus over AWS, and our surplus is in enforcing security rules, for instance. Hence we restrict access to these software sources. It's hard work, but you have to explain to your developers why you do this, why you don't want them to use Bitbucket or whatever directly, that there are things they can use in a controlled manner, and that it has consequences: legal consequences, IT security consequences, and it might even have financial consequences if somebody decides to install Oracle in the cloud, because you will need a software license for that.

What they expect as well is that bug fixing has to occur instantly. We learned this the hard way, and we suggest that you do the testing together with your developers, so that they really test end to end, the infrastructure as well as the application, and so on. Another point of view they take is that they want non-invasive bug fixing, which is hard. So the recommendation is, again, communication: be precise and strict about your service levels and about the state of the product. When will they have multiple clouds, when will it be available globally, and so on.

We do struggle. Gerd added "sometimes" without me knowing. My point of view is that we struggle a lot. I don't want to lie to you. It's just another point of view. We have, and I addressed this already, too few resources and skills. This is a general problem.
It's worsened in our situation, and I think in every cloud start-up project, by the fact that new products come into production so fast that there is no relaxation time. After you finish the first one, you get straight to the next one.

What I personally find very, very hard is troubleshooting deep network issues. From my point of view this is by far the hardest. We do struggle with configuration management, in particular with orchestration. We struggle with applying updates to the environment, and we struggle with automating configuration changes to the environment. This kind of works, but having subsequent actions in other systems, and automating them, is hard. If you change a parameter on some log file collection thing, you might have to restart a couple of services on other computers as well, which is hard. And the next-level art, so to speak, is to do this without disruption.

So, to summarize the recommendations. Be very transparent about your capabilities, about your restrictions, about your roadmap. Be patient and kind to all. I myself hadn't got a clue 12 or 18 months ago what the cloud is about, and most of my customers are in this situation now. Give them time to learn this as well. Be aware: it's not just another product. It changes the game. Walk in their shoes.
What we learned a lot from is that we have certain services, let's say, which support the cloud, like mirrors, repositories, and the customer portal, and it's very beneficial to build them as cloud-native applications where possible. It teaches you a lot. Be communicative and manage expectations, which is the same as the first one. And prepare in advance for overwhelming demand. People do want this. They actually do. They might use it, and they might come up with incidents. If you don't do this, and we are struggling a bit there as well, you might become a victim of your own success. You opened a box of candies, and now the whole school is gathering around you, so to speak. Surprisingly few problems are technical. Almost all are about skills, resources, and the shift of responsibilities.

We mentioned two contributors already, Hannes and Ricardo. There were two others, Fabio and Nardeshtar, who helped us by being interviewed and sharing their experience with us. Thanks a lot again for this. The next slide is Q&A. Do we have questions?

So, there doesn't seem to be a microphone here. Can you go to the mic? Thanks.

Thanks. Do you have plans to extend the OpenStack usage beyond cloud applications, let's say to VDI, or to classical servers like SAP servers? Or do you think that's not possible for a company like yours?

I don't know if I get it correctly. Could you repeat?

At my company we are exploring different usages, for example VDI on top of OpenStack, or SAP on top of OpenStack. This is not what you have talked about. Do you think this is possible?

It's not yet on the roadmap. I personally think it's possible, but very demanding. So it's a challenge, but I think it's possible. We do the low-hanging fruit first, and this is not a low-hanging fruit from our perspective.

We have a cloud-first strategy.
So our aim is preferably to move cloud-ready applications to the cloud first, because it's easier. That's why we said: don't open Pandora's box by providing a lot of connections to the back end, because then you have the problem of providing these back-end connections at other locations as well. It's a huge set of dependencies. In the future it might be possible to move other workloads over there as well. We have a lot of discussions in this workload onboarding team, for example about moving more demanding or challenging applications to the cloud. Whether it will be VDI, for example, in the end, I don't know. It could be the case. It depends on the requirements for the underlying infrastructure.

Could you explain something about the balance between the application centers or groups that welcome what you're doing, that really want it and pull, and maybe something that's ordained at a high level of management, where it's actually required to be pushed?

Okay, I got the question. There is no real resistance to this project, because we don't really force anybody to use it. We said we want to have a very good offer, and if it's not good enough, or not good for a specific project, both can go their own way, so to speak. So there is no real resistance to the project. We do have, as Volkswagen as a whole, a strategy to develop new applications cloud ready. Who likes this most? I would guess developers first. They don't have to fiddle around with processes to get a server just to test something. After that, I would guess web front-end applications are obviously the most happy to use this. It's not about OpenStack and IaaS as such, but about the whole set of techniques.
They have techniques at hand there which they didn't have before.

I think there is another demanding group. With some workloads there is the expectation that they would save a lot of money if they moved to the cloud. But that's not necessarily true. It depends. So the reason why these workloads want to move to the cloud might be different from the other ones, and you have to explain clearly and in detail why it might not be a good idea to move this specific workload to the cloud, for example because it consumes non-HA, non-cloud-ready back-end servers, and you don't have this connection and don't want to provide it because it interferes with your cloud model.

Thanks for the presentation. I see that you have a lot of requirements as one of the biggest enterprises in the world, and I know the internals of OpenStack, so I'm interested: in how many OpenStack projects did you do patching, development, et cetera? I saw Sean's presentation about Keystone policies, and as I understand it, Keystone was patched to meet your requirements. Did you patch Nova, Neutron, or other services? How many developers are working on this?

Unfortunately, I haven't seen that presentation, so I'm lacking information here. It's vanilla OpenStack.

I would like to add a customer's perspective to this. I'm not aware of any patch to the Mirantis OpenStack distribution. I'm aware of quite substantial configuration because of network restrictions and because of the way we provide mirrors, for instance, and so on. So nothing specific, but by configuration. What I'm aware of, though I don't know whether this counts, is policy.conf, which is always a pain, you know this.

I do think we are running over our time, as far as I'm reading the signals. So thanks for your patience. Thanks a lot. Thank you for your time.