Hi everyone, welcome to my talk. It's really amazing to see so many people here. So, are you ready to get into the dirty details of migrating legacy applications onto shiny new Kubernetes? Yeah? Okay, cool.

I'm Josef, CTO and co-founder of QAware. We're a custom software development company from Munich, Germany, focusing on developing cloud native applications and on migrating legacy applications onto a cloud native stack. Today I want to walk you through what we have learned by migrating hundreds of J2EE legacy systems onto Kubernetes. I'm the technical lead of this large-scale migration program at one of our customers.

It all started with a big bang, when Kubernetes and Docker were released as open source to mankind and it suddenly became possible for literally everybody to develop cloud native applications. Those applications have common properties: they are hyperscalable, they are really resilient to failures, they often provide OpEx savings due to higher resource utilization and more ops automation, and you can develop them at a higher speed with shorter release cycles. These are huge advantages of cloud native applications, and the question is: what about legacy applications? Can they benefit from cloud native technology and gain these advantages too?

To answer that question, I want to take you along on the journey we had answering it for our customer. It all began with a brave decision by the CIO of a major German insurance company. He decided to move literally every web application onto a cloud native stack. Phew. And to make it a little more challenging, he also cancelled the contract for the existing data center, an on-premises data center.

The chief architect took over, eager to explore and learn how to migrate legacy applications, which are not microservices, which are not containerized and so on, onto a cloud native platform. But that needs time, so the CIO was very generous and scheduled one year to finish the migration. I'll tell you later how many applications that covers and what their properties are. The CIO also set some priorities for the migration project. The top priority was to increase the security level, because the applications are migrated from a very trusted on-premises data center onto a, let's say, somewhat less trusted public cloud infrastructure. So security was the major concern, shortly followed by time: it was important to hit the schedule and migrate those applications within one year. And he also wanted to save money, so the migration costs shouldn't be too high. These were his priorities.

That's how we started eight months ago. So today I can talk a little bit about what we have learned in the last eight months and what will probably happen in the next four. Doubts are the death of every ambitious project.
So we decided to be as brave as the decision itself. We also had some impediments to overcome, but we learned a lot, and in the following slides I want to present what we've learned about the good, the bad and the ugly of migrating legacy applications onto Kubernetes.

Let's start with the good things, completely ignoring the principle of not putting the lowlights at the end. So I'll start with the highlights and work my way down from there. By the way, this picture is proudly presented by the Bavarian Ministry of Economics, Tourism and Beer.

So what were the good parts? The first task was to gain visibility, and the outcome was quite good. We had started with a very naive approach: we wanted to architect the cloud. We thought about the target architecture first, about how to slice and mesh microservices, how to package and distribute those microservices as containers, and how to execute them dynamically in the cloud, the cloud native principles. But then it became clear that we were not facing a greenfield situation. There are a lot of existing legacy applications, and we knew pretty much nothing about the applications we had to migrate. So our first task was to gain visibility.

We started by sending out questionnaires for all relevant applications to get first answers to our fundamental questions. Which technology stack do they use? For us this was important to know which base images we should provide. What resources do they require, and how much of them? This answers how many applications will be hard to schedule onto Kubernetes nodes. Also very interesting: do the applications write to storage, local or remote, and how much data do they write? That decides which storage solutions we have to provide. Are there special requirements like native libraries or special hardware such as printers, which is typically not virtualized in a Kubernetes environment? And which inbound and outbound protocols are used: are they plain TCP or HTTP, are they secured with TLS, do they use multicast, which is a bit of a problem on plain vanilla Kubernetes, or dynamic ports, which is also a bit of a problem?

Then we asked about the ability to execute for each application: are there any tests, are there responsible persons, and is the application maybe reaching its end of life during this year? This was a very important question, because these questions alone removed 20% of the applications from our migration duty. And last but not least, also an important question: which client authentication mechanisms are used and have to be ported onto our cloud solution?
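Just to show how unglamorous this visibility work is: a lot of it boils down to collecting answers and counting them. Here is a minimal sketch of that kind of aggregation; the CSV layout and the column names are invented for this slide, they are not our real questionnaire format.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical questionnaire aggregation: count how many applications have
// properties that make them hard to schedule on Kubernetes. The column layout
// (app;usesMulticast;usesDynamicPorts;writesLocalFiles;heapGb) is made up.
public class QuestionnaireStats {

    public static void main(String[] args) throws Exception {
        List<String[]> rows = Files.readAllLines(Paths.get("questionnaire.csv")).stream()
                .skip(1)                          // skip the header line
                .map(line -> line.split(";"))
                .collect(Collectors.toList());

        long multicast    = rows.stream().filter(r -> r[1].equals("yes")).count();
        long dynamicPorts = rows.stream().filter(r -> r[2].equals("yes")).count();
        long localFiles   = rows.stream().filter(r -> r[3].equals("yes")).count();
        long bigHeap      = rows.stream().filter(r -> Integer.parseInt(r[4]) > 8).count();

        System.out.printf("multicast: %d, dynamic ports: %d, local file writes: %d, heap > 8 GB: %d%n",
                multicast, dynamicPorts, localFiles, bigHeap);
    }
}
```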
But the questionnaires were only the first step. Afterwards we wrote a small program we called Cloud Eliza, which extracted data from a lot of sources and helped us make better decisions. Those data sources were the questionnaires, the Jira tasks of the migration project, an EAM tool (enterprise application management), certain Excel sheets, and, a very important source for our decisions, static analysis. We performed static analysis on all application binaries: with the IBM migration tool you can see which libraries and which Java EE APIs are used, we used our own QAvalidator to check certain dependencies, and we used SonarQube for quality metrics. All of this is integrated into something we call the migration database, and we perform analytics on this database using Tableau to explore the data and answer architectural questions. We also built a dashboard that provides the project and the management with basic migration information. That's the visibility part.

The next good thing was our approach of doing emergent design of the target software landscape. A big design up front, analyzing every nitty-gritty detail, would have been very hard and time-consuming. So we had to start with little knowledge and improve step by step; we took an emergent design approach.

We started by playing the divide-and-conquer game. We had to migrate 400 applications, and we divided them into two parts. One part is the more or less old applications, between 10 and 15 years of age, all of them J2EE applications by the way, and the other part is the more modern ones. We decided to migrate the older ones by re-architecting them and making them runnable on Kubernetes on Amazon Web Services, and to lift and shift the more modern ones with their virtual machines from the existing data center onto EC2. Why only the old applications to Kubernetes in the first step? We wanted to do risk front-loading, so we are doing the risky parts at the very beginning, and those applications benefit the most from being re-architected. The others we put into the lift-and-shift step on EC2 first, and the second step will be to migrate them to Kubernetes as well.

After analyzing the migration database we had a pretty clear view of the source architecture we had to migrate. We are coming from about 200 monolithic applications looking a little bit like this. They have a broad range of backend and infrastructure systems integrated: they use the host, batch processing, file shares, LDAP, message queuing, all of that. The stack is based on Java 6 and a J2EE 1.4 application server; on top of that an almighty legacy framework has been built which hosts the application, plus a basic httpd web layer. The inbound traffic is TLS-encrypted, and the outbound traffic to the backends was in most cases not TLS-encrypted.

So we came together with the cloud operations team and the information security officers and designed a target architecture for all those monolithic applications. The target architecture runs on Amazon Web Services on a Kubernetes cluster; actually we were using OpenShift, but only its container-as-a-service part.
So it's pretty plain Kubernetes. It's based on Docker, a JVM 8 and a lightweight Java EE 7 application server onto which we ported this almighty framework; we took the legacy out, so now it's the almighty framework next generation.

And then there is an important architecture decision up here: to split each application into two parts, an outer layer part and an inner layer part. The outer layer part of the application is the part which provides user interfaces and external APIs and which, for security reasons, also has to authenticate and authorize the users. The inner layer application does all the business logic and backend integration. And in between there is something like a firewall: an API gateway. This API gateway checks requests going from the outer layer into the inner layer for a valid token.

Which token? Each and every communication throughout this stack should be TLS 1.2 two-way encrypted and carry an OpenID Connect identity token as payload. So you have your user context in the OpenID Connect identity token and your application context in the two-way TLS client certificate, a double layer of protection, and the API gateway checks this OpenID Connect token. And all inbound and all outbound traffic should be TLS 1.2.

So now, with a little bit of wizardry and some secret sauce, it should be easy to get from A to B for all those applications. For us, the fundamental decision up front was whether B, the target on the right-hand side, should be a cloud friendly or a cloud native application. Our decision was that B, our target architecture, should only be cloud friendly: containerized of course, and we decided to put the monolith into the container, and it should respect the twelve-factor app principles. This can also be a good first step if some of those applications later strive towards cloud nativeness. But with our migration we targeted cloud friendliness.

But if we put the monolith into a container, what runs in the outer layer of our target architecture? Our answer, and this answer has proven to be right, or at least one potentially right answer, is to place an edge service in the outer layer. So we built an edge service, and this is the application part residing in the outer layer; the monolith is in the inner layer, and the API gateway sits in between. The edge service is responsible for accepting incoming requests which carry diverse user contexts: a user context cookie set here, a header set there, client certificates, all of that. The edge service is then responsible for exchanging those diverse user contexts into an OpenID Connect identity token. For that it uses a token provider, a central service which is integrated with all the IAM systems at the customer, and there are more than ten of them. The edge service exchanges the diverse user contexts for identity tokens and caches them, and if a request arrives with no user context set at all, it is also responsible for redirecting the user to the single sign-on system to sign in. With the edge service in place, all authentication and authorization tasks are shifted into the outer layer, and the rest, the user interface, is still in the inner layer; in our architecture it is of course served through the edge service at the outer layer.
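To make this a bit more tangible, here is a very stripped-down sketch of such a token-exchanging edge filter, written as a Zuul pre-filter (you'll see in a second that our edge service is built on Spring Boot and Netflix Zuul). The token provider client, the cookie name and the single sign-on URL are placeholders, not our actual implementation.

```java
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;

import com.netflix.zuul.ZuulFilter;
import com.netflix.zuul.context.RequestContext;

// Sketch of an edge-service pre-filter: take whatever user context arrives
// (here: a session cookie, as one example), exchange it at the central token
// provider for an OpenID Connect identity token, and forward that token to
// the inner layer. If no user context is present, redirect to single sign-on.
public class IdentityTokenPreFilter extends ZuulFilter {

    /** Hypothetical facade for the central token provider; caches tokens internally. */
    public interface TokenProviderClient {
        String exchange(String userContext);
    }

    private static final String SSO_URL = "https://sso.example.com/login"; // placeholder
    private static final String SESSION_COOKIE = "SSO_SESSION";            // made-up cookie name

    private final TokenProviderClient tokenProvider;

    public IdentityTokenPreFilter(TokenProviderClient tokenProvider) {
        this.tokenProvider = tokenProvider;
    }

    @Override public String filterType() { return "pre"; }
    @Override public int filterOrder() { return 10; }
    @Override public boolean shouldFilter() { return true; }

    @Override
    public Object run() {
        RequestContext ctx = RequestContext.getCurrentContext();
        HttpServletRequest request = ctx.getRequest();

        String sessionId = readSessionCookie(request);
        if (sessionId == null) {
            // no user context at all: stop routing and send the user to single sign-on
            ctx.setSendZuulResponse(false);
            ctx.setResponseStatusCode(302);
            ctx.addZuulResponseHeader("Location", SSO_URL);
            return null;
        }

        // exchange the diverse user context for an OpenID Connect identity token
        String identityToken = tokenProvider.exchange(sessionId);
        ctx.addZuulRequestHeader("Authorization", "Bearer " + identityToken);
        return null;
    }

    private String readSessionCookie(HttpServletRequest request) {
        if (request.getCookies() == null) {
            return null;
        }
        for (Cookie cookie : request.getCookies()) {
            if (SESSION_COOKIE.equals(cookie.getName())) {
                return cookie.getValue();
            }
        }
        return null;
    }
}
```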
Here is just a short glimpse into the internal structure of our edge service: it is based on Spring Boot and Netflix Zuul and is basically integrated with the token provider.

The next thing is how to enhance our applications so that they respect the twelve-factor app principles, or at least some of them, and further architecture requirements on security like TLS 1.2 encryption all the way. Here we learned to love container patterns. The basic ones are well described in the paper below by Brendan Burns and David Oppenheimer.

We used the sidecar pattern, which enhances the behavior of the application container through a sidecar container in the same pod. We used it for log extraction and log reformatting based on fluentd, and also for scheduling, to trigger HTTP requests on the applications for time-scheduled tasks; we are using Quartz there.

We used the ambassador pattern, which is more or less a proxy within the communication paths. We used it for tunneling traffic over TLS when a neighbor system or a backend system does not provide TLS encryption, with stunnel or Ghostunnel as the ambassador, and we used it for circuit breaking and request monitoring based on Linkerd. That way we could add all those features without changing the application itself, which was important to save time.

And last but not least the adapter pattern, to provide a homogeneous interface of an application to the outside world. We used it to adapt the configuration mechanism of this almighty legacy framework to Kubernetes ConfigMaps and Secrets: the ConfigMaps and Secrets are written to files, converted into the framework's format and injected into the other container. I'll show a tiny sketch of that adapter at the end of this part.

The last good thing I want to mention is about Kubernetes constraints, because initially we thought we would run into a lot of Kubernetes restrictions on our infrastructure, since those legacy systems behave strangely. For example, our target infrastructure didn't support multicast, there was no overlay network in place supporting it, and we had no ReadWriteMany persistent volume claims available. So we were a little scared of hitting these restrictions, and we did hit them for quite a lot of the applications. But cutting down those application requirements and re-architecting the applications so that they no longer depend on multicast or on ReadWriteMany persistent volume claims actually led to a better architecture. So we were okay with it, and it didn't take that much effort to refactor the apps accordingly.
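Before we leave the good parts, here is the promised little sketch of the configuration adapter idea. Everything specific in it, the mount paths and the target file name, is invented for illustration; the real adapter of course writes the legacy framework's actual format.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Adapter container sketch: Kubernetes mounts every ConfigMap/Secret entry as a
// single file in a directory; this adapter rewrites them into the one properties
// file the legacy framework expects, on a volume shared within the pod.
public class ConfigMapAdapter {

    public static void main(String[] args) throws IOException {
        Properties legacyConfig = new Properties();
        readEntries(Paths.get("/etc/app-config"), legacyConfig);   // mounted ConfigMap (example path)
        readEntries(Paths.get("/etc/app-secrets"), legacyConfig);  // mounted Secret (example path)

        try (OutputStream out = Files.newOutputStream(Paths.get("/shared/almighty-framework.properties"))) {
            legacyConfig.store(out, "generated from Kubernetes ConfigMaps and Secrets");
        }
    }

    private static void readEntries(Path dir, Properties target) throws IOException {
        if (!Files.isDirectory(dir)) {
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                // skip Kubernetes' internal "..data" symlink bookkeeping
                if (name.startsWith("..") || !Files.isRegularFile(file)) {
                    continue;
                }
                // the file name is the config key, the file content is the value
                String value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
                target.setProperty(name, value);
            }
        }
    }
}
```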
So those were the good parts; now let's get into the bad side of cloud migration life. One bad side is state, state within the cloud. Microservices talk sometimes suggests there is no state in applications, but there is state, and in legacy applications there is state in a lot of different representations.

The first representation is state within databases. Here we made our lives a little easier by leaving those databases in the on-premises data center. We didn't migrate the databases to Amazon Web Services; they are still running in the on-premises data center, and we use them from our monoliths in the Amazon cloud directly. A potential disadvantage could be the additional latency, but this wasn't a problem for us, because the cloud interconnect between Amazon Web Services and the on-premises data center was quite good and there was nearly no latency impact. And there were two advantages. First, the application versions on-prem and on the new cloud native platform can run in parallel, which is very good when you launch those applications, because you can fall back smoothly to the old version. Second, it also helps with privacy, as there is some data in those databases which should not be in a public cloud, at least according to German law.

The next kind of state is files. File persistence was used in about 10% of those applications, and we restricted it very heavily. We didn't provide ReadWriteMany persistent volume claims, we allowed no file writes into containers, there was the rule that files with sensitive application data within persistent volume claims have to be deleted after 15 minutes, and of course there were no NAS mounts available from the on-premises data center into the Kubernetes platform. So the migration tasks for the affected applications were to store those files in the database as BLOBs, to use FTPS to store the files, or to re-architect the application so that it doesn't use files anymore.

Now, a very interesting part was how we tackled session state, because nearly all of those applications have session state. The first attempt was to use session stickiness; OpenShift does provide session stickiness, but in my opinion, in our opinion, in the cloud, where everything fails all the time and is rescheduled all the time, session stickiness is not the way to go. The next step was to consider session persistence, writing the session data of the application into the existing application database, but the performance impact was high: there were many writes and many reads, and the applications slowed down immensely. Another try was using Redis as the session persistence mechanism, but Redis didn't provide us with encryption out of the box, and we also found nobody who wanted to run the separate Redis infrastructure that would have been required.

So I went for session synchronization. We are not using the application server's own mechanism here, because the application server we used wasn't able to look up its peers within Kubernetes. The next try was to use Hazelcast as an in-memory data grid to synchronize session state, but Hazelcast costs a lot of money if you want to use TLS for the session synchronization, and paying a six-digit amount of euros was not the way to go in our opinion. So our final solution was Apache Ignite. We are using Apache Ignite right now to do session synchronization between those applications. You can hook it into a Java EE server very easily, and Apache Ignite provides peer lookup within Kubernetes automatically; it's a little bit cumbersome, but it works fine. And just to mention it, because this took three or four engineering days to fix: Ignite hit a just-in-time compiler bug on the IBM JDK, which we had to use, so some classes have to be excluded from JIT compilation there. So much for session state.
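For those who want to try this at home, the wiring is roughly the following. This is a sketch assuming the ignite-kubernetes and ignite-web modules; the namespace, the headless service and the cache name are just examples, not our production setup.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder;

// Start an Ignite node whose peers are discovered through a Kubernetes headless
// service instead of multicast; the session cache is then used by Ignite's
// WebSessionFilter, which is registered in the application's web.xml.
public class SessionGridStartup {

    public static void main(String[] args) {
        TcpDiscoveryKubernetesIpFinder ipFinder = new TcpDiscoveryKubernetesIpFinder();
        ipFinder.setNamespace("my-app");          // example namespace
        ipFinder.setServiceName("session-grid");  // example headless service selecting the app pods

        TcpDiscoverySpi discovery = new TcpDiscoverySpi();
        discovery.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discovery);
        cfg.setCacheConfiguration(new CacheConfiguration<>("web-session-cache"));

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Ignite session grid node started: " + ignite.cluster().localNode().id());
    }
}
```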
So the next thing is diagnosability. This guy is not really relevant here, I just want to place the superhero of my childhood on some of my slides. Okay, so: what about diagnosability? Diagnosability is the ability to observe applications and to diagnose whether they are running in a normal state, and it's all about integrating traces, metrics and events. Our first solution was to go for Prometheus for metrics, fluentd with the EFK stack behind it for events and logs, and an OpenTracing implementation, Zipkin in the first step, to analyze traces. But then we ran into problems, because we were not sure whether we would be able to set up a central solution that scales for all applications. It's feasible, but we were not sure we had enough time, and one instance of Prometheus, Zipkin and fluentd or the EFK stack per application was too much effort for the application teams. So we decided to make an easy move, and now the whole cluster is instrumented with Dynatrace as the application performance monitoring solution. It is a placeholder for any commercial performance management tool which is pre-integrated with Kubernetes and serves the needs here. So we're not using those fancy cloud native projects here; the cluster is instrumented with Dynatrace.

Next is security. Once again, my friend. We came far: we have the edge service in place, TLS 1.2 two-way and those identity tokens all the way, and additional security filters at the edge service level. At the application level we enhanced the applications by performing client certificate ACL checks and token checks, and we have egress rules towards the backends to prevent potentially malicious containers from accessing all those backend systems. But there is one thing remaining, and it's a huge problem: certificate management. Right now three full-time employees are doing nothing more than scanning where certificates are about to expire or where new certificates are needed, issuing those certificates and delivering them to be provisioned into the containers. That's the current state, but we want those three full-time employees to be doing something else by the end of next year. So we're looking into solutions like SPIRE or Istio for managing certificates on the service mesh side, or into network-level policies based on products like Tigera, Cilium, Twistlock or Aqua to handle this application-to-application trust on the networking level. I'll come back to the certificate topic with a tiny sketch in a minute.

And finally, the ugly parts. This line of code is very representative of the ugly parts, and so is the host hocus-pocus, the host integration; we saw very ugly things there, how hosts are integrated and so on. It's mostly about cloud-enabling cloud aliens built with toxic technology, that is, technology which is not supported and maintained anymore. Especially our almighty legacy framework: it was developed in the early 2000s, about five hundred thousand lines of code, and zero test coverage. But we managed to do all the migration tasks we had to do on this almighty legacy framework to turn it into the almighty framework next generation: the migration to the new technology stack, token checks, token relays, the modified configuration mechanism and so on. The strategies we applied: 70% of the migration was done the hard way, migrating manually and increasing the test coverage. The other parts were done by decorating the applications with ambassadors, sidecars and adapters, and in one or two cases we decided not to migrate certain APIs of the legacy framework and had the applications migrate to another API instead. Yeah, that's the ugly side.
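Before I come to where we are now, here is the certificate sketch I promised. The kind of expiry scan our three colleagues are currently doing by hand is essentially a loop over keystores like this one; the keystore path and password are placeholders. The hard part is not the scan but issuing and provisioning the certificates, and that is exactly what we want tooling like SPIRE or Istio to take over.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Enumeration;

// Walk a Java keystore and report certificates that expire within the next 30 days.
public class CertificateExpiryScan {

    public static void main(String[] args) throws Exception {
        Instant deadline = Instant.now().plus(30, ChronoUnit.DAYS);

        KeyStore keyStore = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream("/etc/pki/app-keystore.jks")) { // placeholder path
            keyStore.load(in, "changeit".toCharArray());                          // placeholder password
        }

        Enumeration<String> aliases = keyStore.aliases();
        while (aliases.hasMoreElements()) {
            String alias = aliases.nextElement();
            Certificate cert = keyStore.getCertificate(alias);
            if (cert instanceof X509Certificate) {
                X509Certificate x509 = (X509Certificate) cert;
                if (x509.getNotAfter().toInstant().isBefore(deadline)) {
                    System.out.printf("certificate '%s' expires on %s, reissue needed%n",
                            alias, x509.getNotAfter());
                }
            }
        }
    }
}
```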
So, where are we now? Right now about 100 systems are live on the target, on OpenShift on Amazon Web Services. Well, a little bit less, because 100 will be reached at the end of the year; right now it's 86 or so. 200 systems will be live on the target by the end of the first quarter of 2018, and we think this is possible because, after some up-front work, we are now launching tens of applications each week. The other 200 systems will also be migrated by the end of the first quarter of 2018 with the virtual lift-and-shift approach, and we will migrate them to Kubernetes by mid-2019 or so. So we will meet the requirement of our CIO that it's done within a year.

What have we learned from our migration? It's not a stupid "just put monoliths into containers" approach. We really tried hard to come as close as possible to cloud friendly application principles, the twelve factors, and in our opinion we also really increased the security level, by an order of magnitude, with the target architecture.

So that's all from my side. I think the counter is at zero now, but I'll be out there for the next half hour, and you can ask me any question you want. Thanks for your attention.