I think maybe we can get started. So how many of you attended my talk at the last Cloud Foundry Summit in Santa Clara? OK, good. There will be some repeated messages, but it looks like most of you are brand new here and were not there. So again, my name is Surya Duggirala. I'm from IBM, mainly responsible for IBM Cloud Performance Engineering; I chair that guild. Unfortunately, my colleague couldn't join me on stage today because of a conflict. Today we're going to talk about how Cloud Foundry works at scale in the banking sector. I'm fortunate to work very closely with the Royal Bank of Canada on their journey to bring some of their key workloads to Cloud Foundry. The journey was not all fun; we had our fair share of problems because of the scalability requirements, and we really put Cloud Foundry to the test. So I'm going to share our observations: where we started, how the journey went, and some of our successes, including things we identified that are actually making Cloud Foundry better as well. First of all, how many of you know RBC, Royal Bank of Canada, as a bank? A few, yeah. So I'll talk a little bit about RBC and RBC's goals for cloud adoption. I'm the cloud advocate from IBM working closely with all lines of business at RBC, so I work closely with both the technical and the management folks there. I'll share some of the goals for cloud adoption, some of the key applications that RBC has deployed and is running in production right now, on Cloud Foundry with Bluemix of course, and the benefits RBC got. I will also talk a little bit about some of the engineering work that went in. In fact, as you might have seen this morning with the GoRouter changes, I can show you those design changes in action here with some of RBC's online banking applications.
So RBC, Royal Bank of Canada, is Canada's largest bank, with a global footprint beyond Canada, and one of North America's leading diversified financial companies. It spans personal and commercial banking, wealth management, insurance, and corporate and investment banking. They have almost 80,000 employees and serve more than 16 million customers: personal, commercial, and small business. Again, they operate not only in Canada but in around 42 countries around the world. When we started this journey with Cloud Foundry at RBC back in 2015, from a strategic point of view there were multiple lines of business, and each business unit has its own strategic goal, whether in personal and commercial banking or wealth management. A group called Technology and Operations (T&O) looks at each LOB's goals from a business standpoint and translates them into technology. From the T&O strategic point of view, the main thing is accelerating capability delivery. Previously, even a small change to one of the panels or one of the functions used to take months. So how can you accelerate that? That was one of the main things. And also innovation from a cloud and big data point of view: how can we embrace new technology platforms? They're already running lots and lots of applications in traditional data centers with middleware like WebSphere, and of course the data is on the mainframe. How can we get to the cloud without disrupting that at the same time, and make the digital transformation seamless? That was the main goal.
When you look at the financial services portfolio from the client and user segment, you can clearly see the commercial clients, the small business clients, and of course RBC employees as well. That is the client and user segment perspective: how can we impact them? From a digital channels and services point of view, we divided this whole work into three different things. One is online banking, the early adopter of cloud. I'll be talking about the POC that ultimately went into production: if you're an RBC customer and you go and look at your account summary, you are using Cloud Foundry right now. And if you're doing bill payments, for instance multi-bill pay, you are using Cloud Foundry right now. That is the retail online banking perspective. But from a digital business channels point of view, in commercial banking and wealth management, we also have some of the existing applications. That has been the tough part, because it's a typical industry problem, one all of you may be experiencing: how can I take my large monolithic application to microservices? And how can I go from an on-premise data center to cloud? Those are the typical problems you all have from a customer point of view. We went through that journey here, and that's what I'm going to talk about. From the digital business channel cloud strategy point of view, the main challenge was that some of the applications running on traditional WebSphere were already up for migration, because they were running on an older version of the middleware. You have two options: either migrate to the latest version and stay on-premise, or take the opportunity and migrate some of the applications to Liberty.
And then transition that into a build pack, the Java and Liberty build packs in Cloud Foundry. That is one of the routes we took here. On the enablement roadmap, we are doing this in three stages. We wanted to avoid a full big-bang migration, which of course would be setting yourself up for failure. I'll give you an example of a commercial banking application where you have one monolithic application doing almost 19 different business functions. In an ideal world, you would divide those 19 up and have 19 different microservices. So what we started doing was take one of the edges, one of the scenarios, migrate that to a microservice, and again get that to the cloud. One of the main things we did when we started was the online banking POC application. But before I get into the details of the retail online banking POC: right now there are 30 applications in production on the Bluemix Cloud Foundry platform, and the retail online banking pieces have rolled out to around 6 million retail customers. Some of the other key applications running on Cloud Foundry include portfolio risk management, RBC Rewards mobile, merchant banking, finance core technology, and MyAdvisor, some of the wealth management functions. As you can see, this cuts across many lines of business. It clearly shows how Cloud Foundry is being used in the banking sector, with RBC almost at the top of the big banks. And the journey, as I said: when we started, the challenge was, can Cloud Foundry meet the bank's requirement of around 300 to 400 transactions per second at sub-second response time? That was the ask for the retail online banking.
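One way to carve a single scenario out of a monolith like that, while the rest of the monolith keeps serving traffic, is Cloud Foundry's path-based routing. The sketch below uses hypothetical app, domain, and path names (the talk does not name them); the `cf map-route --path` flag is standard CF CLI.

```shell
# Hypothetical names: "olb-monolith" is the existing application,
# "billpay-service" is the first function carved out as a microservice.

# The monolith keeps serving the root of the shared route.
cf map-route olb-monolith apps.example.com --hostname banking

# Carve out one scenario: requests under /billpay go to the new
# microservice, while everything else still reaches the monolith,
# because Cloud Foundry prefers the more specific path-based route.
cf map-route billpay-service apps.example.com --hostname banking --path billpay
```

Repeating this edge by edge is the staged (strangler-style) migration described above, as opposed to a big-bang rewrite.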
When we started, we were at around 10 to 12 transactions per second with almost 15 to 20 second response times. We were nowhere close, and we were not even able to scale. By the end of that journey, before we went to production, we were able to showcase almost 800 to 900 transactions per second at around 700 to 800 millisecond response time. That journey was an arduous one; it was not easy, and we found many issues. But before I get into the issues, what exactly were we trying to do in this POC? An online retail banking application has multiple pieces. It has a GUI piece, which we redesigned and re-architected with AngularJS. Then we have a security piece and a services piece. For security, we were using a trust association interceptor (TAI) for authentication before; we brought that same authentication in and packaged it up. We divided the whole application into two Java applications: an orchestrator piece and a stub. The orchestrator simulates the GUI and the security pieces, and the stub application is a simulation of the backend service interaction, because the data has to come from the mainframe, so we were simulating that. These are the two Java applications we deployed on Liberty with its build pack in Cloud Foundry. When we tried to scale that, that's where we saw many issues. The first one: we learned that the GoRouter has a scalability limitation as you try to scale. I'm glad, because I have been pushing on that for a long time, and I'm glad that change happened; I'll be showing some data later today. That's one. We also found that some of the autonomic algorithms in the runtimes may not be agile enough when you push a large-scale workload at them. They all worked great before, in the on-premise world.
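To make the orchestrator/stub split concrete, here is a minimal sketch of what a backend stub like that could look like, using only the JDK's built-in HTTP server. The path, latency value, and payload are made-up stand-ins, not RBC's actual service; the real stub simulated the DataPower/mainframe round trip.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class BackendStub {
    // Simulated mainframe latency in milliseconds (illustrative value only).
    static final int LATENCY_MS = 200;

    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/accounts/summary", exchange -> {
            try {
                // Stand-in for the DataPower/mainframe round trip.
                Thread.sleep(LATENCY_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "{\"balance\": 1234.56}".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        start(8080);
        System.out.println("stub listening on :8080");
    }
}
```

Driving a load generator against a stub like this lets you measure the platform (router, cells, build pack) separately from real backend variability, which is what the POC needed.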
But when we went to the cloud, where the latency of the backend services can be really high, some of these autonomic algorithms started failing; they were not expanding the pool to accommodate the large-scale workload. Some of those things we identified and fixed. Not only that, we also had some issues with security vulnerability scanning, which we run to make sure no rogue application has crept in; we had to work through and optimize that a bit as well. Another thing we have seen is the build pack mechanism. I'm glad I'm working with Julz on some of this. What happens on the build pack side, especially for Java, is that when you push an app with a lot of artifacts into a droplet, there is a significant CPU spike. That's because of the zip and tar operations, which are very heavy and CPU-intensive. So we are working to change the current flat file system to a layered file system, like Docker's, which will reduce the droplet size and in turn reduce the CPU usage required for those zip and tar operations. Those changes are going to be useful for everybody, and they will help Cloud Foundry as a whole. The bank also wanted to make sure that, running at this scale, the Cloud Foundry platform can stay up under a massive workload for 12 to 24 hours. That was another requirement, so we looked at the endurance and resiliency of the platform with auto-scaling: can the application scale automatically with the auto-scaler? We demonstrated that as well.
So as I was saying, we reached 800 to 900 transactions per second, but we were stuck there, mainly because once you reach that level, the GoRouter not having keep-alives gets in the way of scalability. I will share some data, but with that change now made in Cloud Foundry, we are able to cross that 800-to-900 boundary and go to 2,000 transactions per second. That's a very good change: the keynote talked about a three-times improvement, and we can actually see that three-times improvement in a real application. The way it is used: RBC has two different environments. One is Bluemix Local with Cloud Foundry, where production runs. For dev and test, they have a dedicated instance of Cloud Foundry where developers have access and develop their applications, then promote them to Local for deployment to production. They have two Local instances for HA, connected to the backend mainframe via DataPower over an encrypted channel. And one thing they wanted to go further with, when we talk about next steps now that RBC is working and scaling on Cloud Foundry in production: they took RBC Express, a commercial banking application on traditional WebSphere and traditional middleware, and they want to move that to the Liberty build pack. We have completed and demonstrated the first two phases, and next in line will be RBC Express. So what benefits did RBC realize from this cloud journey with Cloud Foundry? Accelerated time to market, including return on investment. Any small change, in the GUI or elsewhere, used to take a lot of time; now we are actually able to get it out quickly.
And again, it is directly aligned with RBC's scaled agile operating model. With the adoption of Cloud Foundry, they are able to directly align the cloud goals and strategic direction to that operating model. Of course, it also enables common services. They have almost 400 or 500 different business-level APIs, all in the business layer, and the goal is to exploit those 400 to 500 APIs: any application, any LOB, will be able to get to them, and they can expose them as microservices. That is the direction. There is also improved efficiency through QE and DevOps operations: a DevOps pipeline, plus a monitoring layer integrating some of the existing monitoring with the Cloud Foundry dashboard. And of course reduced cost of operations, because of the elastic provisioning and auto-scaling. From a metrics point of view, you can see the quality and production resiliency numbers. It used to be around 18 months between releases; now it has come down to days, if not hours for some of them. That's the big one. There is a 41% improvement in quality in development projects, and the defect arrival rate is reduced by almost 90%. On the digital business banking achievements, you can clearly see there are eight microservices: from the digital business channel LOB's point of view, they got eight microservices into stable operation in production. (Retail online banking is one LOB, and the digital business channel with commercial banking is a separate one.) And we have successfully migrated from existing legacy monolithic applications and got some of them into production as well; some are still at the POC level.
Again, we identified many, many areas on the application architecture and development side as well, this being the first time going from on-premise; things like sticky sessions. We learned that the hard way. When you bring an application from the on-premise world, with traditional WebSphere and middleware, you inherit some of those problems. We were trying to find out why we were not able to scale, why the autoscaler was not working: when we enabled autoscaling, the traffic was not getting spread across multiple instances. Based on the policy, the autoscaler would provision a new instance, but the traffic kept sticking to the first instance. That's because of the sticky session: with a sticky session, the traffic cannot be sprayed across the dynamically provisioned instances. Some of those things we found the hard way. From a Cloud Foundry engineering point of view, as I mentioned, the GoRouter work was a big one. I will show you the chart, a Grafana chart. The top one is before the GoRouter design change, and the bottom one is with the GoRouter keep-alives enabled and configured correctly. You can see that once you reach 700 or 800, you can't really push anything more: even though you have enough resources in the Diego cells, you are stuck at the front door pushing your traffic, and the only things that go up are the latency and the response time. But here you can see 1,500 to 2,000, and that is not a limit; we are in the process of scaling further out. You need to make sure you have the GoRouter support; it's there in CF 253 and above, and if you go to 270, you will get it.
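The sticky-session failure mode is easy to see in a toy simulation: once a session cookie pins a client to one backend, autoscaling adds instances that never receive that client's traffic. This is a deliberately simplified model of router behavior, not Cloud Foundry's actual routing code:

```java
import java.util.Arrays;

public class StickyRouterDemo {
    // Toy round-robin router with optional cookie-based stickiness.
    // Returns how many of the requests each instance served.
    static int[] route(int requests, int instances, boolean sticky) {
        int[] hits = new int[instances];
        Integer pinned = null;   // instance recorded in the session cookie
        int rr = 0;              // round-robin counter
        for (int i = 0; i < requests; i++) {
            int target;
            if (sticky && pinned != null) {
                target = pinned; // cookie wins: same backend every time
            } else {
                target = rr++ % instances;
                if (sticky) pinned = target;
            }
            hits[target]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        // One client session against an app autoscaled to 4 instances.
        System.out.println(Arrays.toString(route(100, 4, true)));   // [100, 0, 0, 0]
        System.out.println(Arrays.toString(route(100, 4, false)));  // [25, 25, 25, 25]
    }
}
```

The usual fix is exactly what the team did: make the application stateless (or externalize session state) so stickiness is unnecessary and new instances immediately share the load.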
By default, keep-alive for the GoRouter is disabled. If you want to enable it, you need to change the manifest and redeploy your GoRouters, and you need to make sure you have the right number of keep-alive connections. Keep-alive has a side effect: if you increase the number of connections, you may end up with fewer available file descriptors, which can cause problems and impact stability. So as a cloud provider, we have to make sure we have the right configuration set for the GoRouter to get the scalability, while at the same time not destabilizing the Cloud Foundry platform. That's why you can see we have gone up to 1,700; we are in the process of finding that sweet spot. I think the maximum allowed is around 50,000 file descriptors, depending on the OS, but you need to keep some file descriptors for other things, so you can't go all the way to 50,000. The numbers you are seeing are with our keep-alive connections set to 5,000. So that's one thing. Another thing, hopefully coming pretty soon, which will impact the whole Cloud Foundry platform, is mainly for Java. We compared Java with other runtimes too, and the CPU spikes happen mainly in the Java runtimes, whether Liberty or Tomcat. You can clearly see five-times, sometimes 200% to 300%, CPU spikes when you push an app. If you have many applications on a dense cell, you can understand how that impacts the other applications by taking CPU away from them. That work is with Julz, and the CF team is actively working to redesign the build pack mechanism. Once we get that, it will be another step in the right direction to stabilize Cloud Foundry and reduce these CPU spikes. So with that, I do have the case study in some detail.
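The manifest change mentioned above is a GoRouter property in the routing release; to the best of my understanding it is `router.max_idle_connections`, where 0 (the default) disables backend keep-alive, but verify the property name against the release notes for your CF version. A sketch of the deployment-manifest excerpt:

```yaml
# BOSH deployment manifest excerpt (hedged: property name per routing-release
# documentation; confirm against your CF version before redeploying routers).
properties:
  router:
    # 0 (the default) disables keep-alive connections to backends.
    # The talk's throughput numbers were taken with this set to 5,000;
    # size it against the router VM's file descriptor limit.
    max_idle_connections: 5000
```

After editing the manifest, the GoRouter jobs have to be redeployed for the setting to take effect, which is why this is an operator-level change rather than an application-level one.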
I have written a blog, and if you Google it, there is also a nice article from a third party that describes RBC's whole journey with Cloud Foundry. So with that, I can take any questions. OK, the question is: how did we manage the backend service integration? For online banking and the other digital channels, we went directly to the mainframe to get the data, using DataPower over a tunnel. We haven't used any other public services, for instance Compose or anything else, yet; we interacted directly with the mainframe through DataPower. OK, the next question is: how long did sprint zero take, and then from that point to production? I got engaged last year, sometime in the April time frame, and sprint zero took around two to three months. From that point, we slowly got into production. Those 30 applications I mentioned took some time; we haven't promoted all of them into production yet, and it may be a few more months to get all 30 there. Any more questions? If not, thank you.