Hi, my name is Robert Wostrom, and I work as an application server platform architect at Volvo Cars. Today I'm going to take you through how we transformed our old Java EE service into a new, shiny one based on containers instead.

So, we deliver Java EE application servers. A couple of months ago we had 785 applications on 560 application servers, all running on 80 physical servers, so a fairly large environment. Our environment has to be stable all of the time. Our manufacturing plants use applications on the platform, as does our maintenance organization all over the world, and we also use it to service cars. We have been delivering this service for roughly 15, 16 years, something like that. Everybody knew how to operate it, everybody knew how to use it; it was well known within the organization.

But we started running into problems. We had built our whole infrastructure on physical servers. All of a sudden IBM would announce a deadline for one of their earlier versions. For us that meant getting new servers into place, maybe getting a guy out to drill a hole between the data centres to run a fibre cable between them. It caused a fair amount of panic: everybody has to change Java version, go up to the next Java EE version. It wasn't that good. We couldn't go to the cloud — we tried. And we had problems getting service windows for our platform as well. It had to be up and running; if we were running late on delivering cars to our customers, then the plant had to stay open. Our platform had to be up and running on our Sunday evening, which is Japan's Monday morning, when our Japanese customers were in their cars on the way to the dealership to get their Volvo serviced. It was a huge issue.

Our service was also built on physical servers. That means that after about six to seven years we could no longer guarantee that our test environment looked exactly like our QA environment, which looked exactly like our production environment. It was hard, almost impossible, to replicate production issues, because people would have been logged into the servers doing changes, doing maintenance work, fixing stuff. It was a huge issue. We also had a problem with calendar time: it took us a couple of weeks to get new environments into place, and on top of that we were a couple of Java EE versions behind. That started to become a big issue for our development organization, because they couldn't use the latest Java frameworks.

So we started to design the new platform. Our first requirement was to always offer the latest version of Java and Java EE. Our developers had to be able to get the latest version. If a Java and Java EE version was still supported, we would support it; however, if you're starting a new project, we don't want you to use a Java version that's three years old. We had to be able to go to multiple locations worldwide, to every plant in the world, and also into the cloud. Our platform had to be modern, fast, and able to adapt to changing requirements. There are always going to be new services out there, always new frameworks coming along that our development organization wants to use, and we, as a platform provider, need to provide that for them.

So when we started the platform work, we wanted to isolate everything: a misbehaving application shouldn't be able to take down the whole server. And everything should be immutable.
That means it's going to be the same image that you build in your development environment that goes to test, that goes to QA, that goes to production — and even to the cloud, it's going to be the same image. There can't be any difference between the different environments that you're in.

Idempotent. Everything that we can't make immutable, such as servers and so on, has to be idempotent. That means that we describe a wanted state; Ansible is what we use to write our idempotent scripts. We describe an end state. For instance: I want this chair to lie down. There's a chair over there, and the only thing wrong with it is that it's in the wrong state. With a normal script we would have to go out to IKEA, buy a new chair, build it and put it there. But if I use an idempotent language and describe that all I want is for that chair to be down on the floor, the script will work out what needs to change and do it for me. I'll show a small example of what that looks like in Ansible in a minute.

So we started off by looking at how we work. Our platform is stable and our customers trust us and the platform — that's really important. If our internal customers don't trust us, then we have a problem. We approach everything in a scientific way: we test things, we like to experiment, and if it works, then it works; if it fails, then we know it fails. We are transparent about what we do, what we deliver and how. Everything as code, even infrastructure, is a really important principle, because my end goal somewhere is to have a document describing the whole environment, with everything that is included. So if there's a major incident, I can bring up your environment and show it to you in code, and if I want to replicate your environment anywhere in the world, I just use the same bit of code and run it in a different place. That's really important. And we communicate through APIs, mainly RESTful APIs.

So this was how we started. Our first draft was virtual machines. With virtual machines we can automate everything, we get isolated environments, and we can run different versions of Java at the same time. However, our 80 physical servers became 850 virtual servers. That's a hefty price tag, because we pay per operating system instance. And the configuration is only known right after provisioning: after we've been running your application for a while, things might have changed inside your virtual machine. We can't have that.

We also had to look ahead. We deliver environments for monolithic Java applications, so they're huge — really huge — and not really intended to run in a container. We're putting them into containers anyway, because we want an environment where we can put our huge monolithic applications and then have a way of slowly breaking them up into microservices, if that's needed. And I think microservices is one of the cornerstones of DevOps as well. So our second draft was containers. With containers we have the possibility to automate everything, we get isolated environments — well, you guys know all about this — we use less hardware, and the configuration is known at all times.

So we settled on three products. The first one is OpenShift, for the build, distribution and runtime environment; we actually use it to distribute to the cloud as well. It's been designed with the developer in mind, and this is really important. It's got really nice APIs that we can use, because we want to create a self-service portal.
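Here is that small example of the idempotent style — a minimal, made-up Ansible playbook, not our real one; the host group, NAS address and paths are just placeholders:

```yaml
# Describe the wanted state, not the steps to get there: running this once
# or ten times gives the same end result, which is the idempotency point.
- name: Make sure the deploy share is available on the application servers
  hosts: appservers
  become: true
  tasks:
    - name: Ensure the mount point exists
      file:
        path: /mnt/deploy
        state: directory
        mode: "0755"

    - name: Ensure the NFS share is mounted and in fstab
      mount:
        path: /mnt/deploy
        src: nas.example.internal:/export/deploy
        fstype: nfs
        state: mounted
```

If the share is already mounted, the playbook simply reports that nothing changed — the chair was already lying on the floor.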
The platform doesn't only deliver an OpenShift environment to our end customers. It also includes things like load balancers, and possibly more services in the future. So we wrote a bit of a wrapper around it, to provide a GUI for our development and operations guys.

The next product we chose was Ansible Tower. With that we can automate everything, it's idempotent, and it also has nice RESTful APIs that we can build automation against. We use it to create and manage components outside OpenShift, in some cases we actually use it to build container images, and we also use it to manage OpenShift itself.

So this is what our new environment looks like. There's not a lot of difference for a developer. We changed the operating system version, the clustering capability has been taken over by Kubernetes and OpenShift, and the virtualization is on Docker. We changed the application server from Network Deployment to Liberty Profile; Liberty Profile was the first Java EE 7 certified enterprise application server. We put that in there, and on top of that we put an EAR file. The difference for a Java EE shop is that the new application server, on Java EE 7, doesn't support JAX-RPC and a couple of other frameworks, so we had to get rid of some of the old technologies. But that's just a sanity thing, removing stuff that has become optional in the standards. For operations personnel it was a bit of a change, but for the developers it wasn't that much.

We also had to take a look at our application deployment process. It used to look like this. The developer checks HelloWorld.java in to our Subversion or Git, it goes to Jenkins, gets built, and goes into Artifactory — Artifactory is a glorified FTP server — and back down to the developer. The developer then takes the artifact and creates three different deploy packages. Three. They're all different; they all have different configuration. They point out different databases, and there might be a properties file somewhere deep inside the application that they've changed. They put those on three different file storages, NAS drives. For test and QA they're automatically deployed, but for production it has to go through ServiceNow, which is our incident and request handling system, to an ops guy — and we need to inform them four days ahead that we're going to do a production deploy. Then it goes to the actual environment. This doesn't really support continuous deployment, so we had to redo it.

This is how it looks now. Everything is the same until we get down to OpenShift. In OpenShift we build an image and put it into the test environment. We don't deploy to QA, we don't deploy to production — we promote. It's the same image that goes from test to production.

But what about resources? In the test environment I want to go to my test databases, test queue managers and so on, and in production I want to go to my production queue managers and production databases. So we introduced a templating language based on Mustache syntax that our development teams use when they're configuring their files. Anything that's environment specific, they handle with placeholders in the configuration files, and before we start up the application we scan for those files and automatically change them. So we have the same image throughout the whole process; it's only the environment configuration that's been changed.
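To give a feel for it, here is a hedged, simplified sketch of what such an environment-neutral configuration could look like, written as YAML purely for illustration — our real files are application server configs and properties files, and all the placeholder and host names below are made up:

```yaml
# Environment-specific values are Mustache-style placeholders; the image is
# identical everywhere, only these values are filled in before start-up.
datasource:
  url: "jdbc:db2://{{db_host}}:{{db_port}}/{{db_name}}"
  user: "{{db_user}}"
queue_manager:
  host: "{{mq_host}}"
  channel: "APP.SVRCONN"

# At container start-up, a scan replaces the placeholders with the values
# for the environment the container runs in, for example:
#   test:        db_host: db2-test.internal    mq_host: mq-test.internal
#   production:  db_host: db2-prod.internal    mq_host: mq-prod.internal
```

The point is that promotion never rebuilds anything: the image stays bit-for-bit the same, and only the values behind the placeholders differ per environment.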
So our build process is basically this. We take the latest version of Liberty Profile and put it into a Docker image. Our development teams take an EAR file and a configuration file — the configuration file basically tells us where the databases and those kinds of things are, using the Mustache syntax — and put them into Artifactory. Then we just combine the two and put the result into the Docker registry. I'm just going to check the time.

We're doing cloud deployment as well. We had automated everything here at VCC in Torslanda, and what we did next was really easy: we basically took the same automation scripts, rewrote them a bit so that they provision Azure infrastructure, and then ran the scripts there. All of a sudden we had the same environment, the same OpenShift installation, up in the cloud as we had on premise. And the Docker registry was just connected — well, we didn't actually connect it; we use an Ansible playbook to push the images out to the different cloud locations. When we started this, Red Hat wasn't supported on Microsoft's cloud; about two months before we actually went into production they started supporting it, which was really good. But this is basically how it worked, and it was really easy because we virtualize on the OpenShift layer. We define our infrastructure inside OpenShift, so we can replicate the environment anywhere in the world, with any cloud provider, in any plant. We could be behind the parking lot at the receiving-goods area in Shanghai if we wanted to. We can deploy our solution pretty much anywhere.

Looking ahead: we have automated everything in our platform — everything is there. But we have a problem, because the rest of our organization hasn't been automated. So we're driving a hard agenda towards the rest of the organization, because the business cases are there and you can do the calculations. We need to move roughly 80 application portfolios onto our solution before Q1 next year, and I did a small calculation on that: just getting the load balancing configured for all of them using the normal process would take us roughly 30 years. That is a good business case. So we just have to tell the right people, and we'll also help them automate their services. The same goes for authentication. We started with the application platform, and now we're expanding to automate everything, so that I can get my end result, which is a document that describes your whole application environment. I think that's really, really important. When you've done that, when you've automated everything, you've moved into a position where your ops guys don't have to sit and press next, next, next, finish. What they can do instead is deliver value higher up in the stack: go out, talk to the application projects, be more involved with the part that actually creates something.

So, to summarize: we built our platform on OpenShift, with Ansible tying everything together. It's deployed on-prem and in Azure, and we're using WebSphere Liberty Profile as our application server engine.
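As a hedged sketch — not our actual playbook — this is roughly how an Ansible playbook can promote the tested image and push the very same image out to a cloud-side registry. The project, image and registry names are made up, and the promotion step assumes the OpenShift `oc` client is available:

```yaml
# Promote by re-tagging (no rebuild), then distribute the identical image.
- name: Promote helloworld to production and distribute it to the cloud
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Promote inside the on-prem cluster by moving the prod tag
      command: oc tag helloworld/helloworld:test helloworld/helloworld:prod

    - name: Pull the promoted image from the on-prem registry
      command: docker pull registry.onprem.internal:5000/helloworld/helloworld:prod

    - name: Re-tag the image for the Azure-side registry
      command: >
        docker tag registry.onprem.internal:5000/helloworld/helloworld:prod
        registry.azure.internal:5000/helloworld/helloworld:prod

    - name: Push the image to the Azure-side registry
      command: docker push registry.azure.internal:5000/helloworld/helloworld:prod
```

Inside a cluster, promotion is just moving a tag; between sites, the same image is pulled and pushed, which matches what comes up in the questions below.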
So, are there any questions?

Yeah, hi, this is Saitam Nani from Deutsche Bank. It was an excellent presentation, and it's always interesting when people talk about something that is actually in production. So you mentioned, I think, 787 applications, 700 app servers, 80 hosts. I know we discussed lots of interesting things — are you saying that all these 780 applications are actually running in production on OpenShift now?

No, not all of them. But the WebSphere application server version that we're running on our 80 physical servers is going out of support in Q1 2018, and by then this is the main platform that we have for our development teams. However, we've been running in production since August last year, with a global application on this. So we know it works, and I think we're running six applications in production at this time, and a number of other ones in test and QA that haven't made it up to production level yet.

Yes, thanks a lot.

Hello, I'm Björkter Karlsson from the Norwegian Tax Authority, and I'm wondering how you transition the image from test to QA to production.

Using oc tags if it's an internal environment. If it goes out to the cloud, we're doing Docker pull, Docker push.

Okay, so it's a manual process?

Yes — well, in our first iteration it was Ansible scripts running inside Tower, so we had a basic UI doing it. But now we've actually developed a portal. Our platform is called MAS, the Modernized Application Server platform, and for that we obviously have an operations portal called MOPs, based on the latest technology: RESTful services, Node.js, Angular Material, those kinds of cool things. We built this little interface, and now our internal customers are using that to handle the environment.

Thank you.

I wanted to know if you use one OpenShift cluster or more than one.

Multiple. We have even started doing release-based clusters, so we have an A side and a B side for our next release. This is the first release we're doing this with, because we have so many customers and so much volume that one mistake in an upgrade script for the environment would have disastrous consequences, so we're not risking that. What we do is have an A side and a B side, and an application can be on both sides at the same time in different environments, so you can run your test and QA in the A environment and production in the B environment. They have different versions of OpenShift, so that's at least two clusters; then we have multiple cloud sites — four cloud sites at this time — and then one based on network zones. I think we're up to eight clusters right now.

Okay, so you have eight clusters for different failure domains. But do you, for instance, deploy test, QA and production in the same cluster?

Yes, we do.

And that's to make sure that you have the same environment, I presume?

Yeah, the same environment, because we're doing releases of the platform, and a release includes the latest version of OpenShift — often the latest version of OpenShift, the latest version of the operating system, monitoring agents and so on. Sorry, what was the question?

That you run everything — production, QA — in the same cluster.

Yeah, we run everything in the same cluster. However, we do have dedicated VMs for the production load, so we can guarantee a minimum capacity for those applications.

Yeah, okay, thank you.

Hi. I have a question about operations. How was it for them to adapt from bare metal and the old application platform way of operating to the cloud with OpenShift and containers?
We still have almost weekly meetings with our operations team just to get that transition started. But it's partly our own problem as well, because we haven't had the volumes until now; it's just been something they barely touched, but now they're getting more and more involved with it. Thank you.

Are there any other questions? If not, thank you Robert. Very inspirational. Well done.