Hello everyone, this is Simon Cashmore from Barclays. And I'm Anthony Kesterton from Red Hat, and what we wanted to do today was really give you some ideas about how Barclays is scaling up its OpenShift deployments. It's a very large deployment, and it's been in production for some time now, not only on V2 but also on V3. So just to give you some context: Barclays itself is a large bank headquartered in the UK, with about 80,000 people in about 40 countries, over 20 billion in revenue, and about 48 million customers. Interestingly enough, for every pound spent on a credit card in the UK, a third is spent on a Barclays credit card. And it's quite an old bank: it's been going for 327 years. So the process debt and the technical debt that's accumulated here is quite large. You're going to be hearing quite a lot of interesting ideas over the next few days, and new announcements. I have one other announcement to make: I think I've worked out how to beat jet lag. Anyone a bit jet-lagged today? I know Diane and a few others are, okay? Anyone wake up at half past one this morning, half past two, half past three and half past four? Yeah, it's quite bad. So what you have to do is think of the talk you're going to give the next day. Last night I got as far as "hello, this is Simon Cashmore and I'm Anthony Kesterton" and I was asleep. So try that tonight if you're struggling with jet lag. Okay, not you, Simon. So with that, what I want to do is hand over to Simon and let him talk a bit more about, first of all, the architecture, and then the dos and don'ts of a very large-scale implementation. Simon?

So just to set a bit more context as well: it's a highly regulated environment, we're working with multiple regulators, and it's quite risk-averse. So that also sets context.
But I think, for me, there's a big change in the way Barclays is approaching technology, and some very senior executives are starting to talk about Barclays as a technology company that does banking rather than a bank that has a big technology department. So it's a very big shift in mindset about how Barclays is trying to position itself going forward. APAS is our platform-as-a-service offering. It's based on OpenShift Container Platform, the enterprise version. When we thought about setting up this service, we thought about all the service elements that make up that offering, not just the technology. There are surrounding pieces, and we spent as much time creating the service as we did thinking about the technology. It also enabled us to create a brand that we could carry through all the iterations. Our journey so far: we started a while ago with version 2 of the OpenShift platform. We went very early, and I think that was a very good decision for us in terms of being able to learn from that deployment, get people used to using containers and platform as a service, and then iterate on that. We started slowly, with slow take-up during 2015, but then it grew exponentially. The slide says we're at thousands of pods and containers now; actually, it's tens of thousands that we've grown to. It's growing at a rate we can barely keep up with. The other interesting thing for me about this slide is that, in a relatively short space of time in Barclays' history, we're already creating more technical debt in the form of V2 and the migration off V2. So that's brought challenges as well in terms of how we do all of this work. Hopefully that's given you a bit of context. I'm now going to talk about how we've architected this. Some of it may be a little bit of an anti-pattern, so I'll take you through why we've done things as we have.
So in terms of how we've deployed it, we've deployed in paired data centers, and we build separate OpenShift clusters in each data center. At the OpenShift level they're completely separate in those two data centers, but we join them at the global load balancing layer, the DNS layer, and then we have local load balancers within each data center. The diagram isn't quite accurate: the external registry is actually shared between those environments. There's a slight overhead in that we mandate that everybody deploys their applications into both data centers, so it's active-active, which is also very timely in the banking world in terms of resilience and the ability to maintain service. The bank has traditionally thought about active and passive environments: build a disaster recovery environment and an active environment, and fail over at some point. This was the first platform where we really tried to run active-active across two data centers, although it is really at the application layer. That brings some problems when you think about the different workloads you might put on there. With this architecture, we can drain a data center gracefully over a couple of hours and then patch that data center. That's very valuable for us in meeting regulatory requirements around vulnerability management, patching relatively quickly and easily, while maintaining service stability. I think service stability is the key mantra we drive with: we've got to provide a service that is always up at the platform level. As an example of that, we also separate non-production and production environments, so we actually have four clusters in each data center, and we're deployed across the globe, in the US and the UK. We had a storage problem in some of our non-production environments.
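The active-active pattern described above can be sketched as a tiny routing function: both data centers take traffic normally, and draining one shifts everything to the other so it can be patched. This is purely illustrative; the names (`DataCenter`, `route_weights`) are invented, and the real setup uses global load balancers and DNS, not application code.

```python
# Illustrative sketch of active-active routing across paired data centers.
# All names are hypothetical; the real mechanism is GSLB/DNS, not this code.
from dataclasses import dataclass

@dataclass
class DataCenter:
    name: str
    draining: bool = False  # set True before patching this data center

def route_weights(dcs):
    """Return the share of traffic each data center should receive.

    Non-draining data centers split traffic evenly; a draining data
    center receives nothing, so it can be patched without impact.
    """
    active = [dc for dc in dcs if not dc.draining]
    if not active:
        raise RuntimeError("refusing to drain the last active data center")
    share = 1.0 / len(active)
    return {dc.name: (0.0 if dc.draining else share) for dc in dcs}

dc1, dc2 = DataCenter("dc1"), DataCenter("dc2")
print(route_weights([dc1, dc2]))   # both active: traffic split 50/50
dc1.draining = True                # drain dc1 gracefully before patching
print(route_weights([dc1, dc2]))   # all traffic now flows to dc2
```

Because applications are mandated into both data centers, the drain is invisible to users, which is what makes the patching story work.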
We monitor storage and try to make sure we don't run out. We were rolling out upgrades and provisioning more storage, and that worked fine in one data center; it was all up and running. Then in the other data center, after we'd done that, we hit a problem. It was a technical problem around the way the storage was configured. So we were down in one data center, but it didn't really impact anybody, because applications are deployed in both data centers. That service stability was still there, and that resilience and ability to maintain service is really important. A couple of other things to mention briefly. We deploy using Chef, so we build in an automated fashion with Chef. If I had my time again, I might not use Chef, but that's the bank's mandated way of managing environments at scale. We sometimes struggle with the cadence of versions and being able to update, and I think some of that is slowed by us not being able to just take Ansible, and using Chef instead. An interesting choice. But Chef is very good for our infrastructure service offerings, and across the bank as a whole we use it to manage those environments. So it works well for the bank as a whole, but it does give us a few issues in the way we manage our environment. Also, when we first went in, we shared the masters, etcd and metrics all on the same nodes, and we soon hit problems as we upgraded. So we split those out, and they're all running on separate nodes with dedicated resources. We're constantly iterating, constantly learning, and I think that ability to do things in one data center, drain it and try it, has been really helpful for us. As I said, it doesn't necessarily help in terms of deploying your application so it spans across both. We're also taking a slightly different approach in our public cloud offerings, where we run a single cluster across multiple availability zones.
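Barclays manages this split with Chef, but the separation described above (masters, etcd and metrics moved off shared nodes onto dedicated ones) maps onto the host grouping used by the community openshift-ansible inventory format, shown here as an illustration only; all hostnames are invented.

```ini
; Illustrative openshift-ansible-style inventory (hostnames invented).
; The point is the separation: masters, etcd and infrastructure nodes
; (metrics, router) each on dedicated hosts, not shared.
[masters]
master1.example.com
master2.example.com
master3.example.com

[etcd]
etcd1.example.com
etcd2.example.com
etcd3.example.com

[nodes]
infra1.example.com openshift_node_labels="{'region': 'infra'}"
app1.example.com   openshift_node_labels="{'region': 'primary'}"
app2.example.com   openshift_node_labels="{'region': 'primary'}"
```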
So a slightly different approach in that world, where we've got a non-production environment at the moment. I'll move on for time. Next, I want to talk about a slightly more complicated environment we've had to build into, which is our DMZ environment. This has given us additional challenges: we've taken a nice platform that's fairly holistic, and now we've had to work in different zones, with different functional requirements and different security requirements. So we've almost broken it up, split it out, and moved different elements into different areas. Quite a challenging environment to build in, and also quite challenging from a security and governance perspective. We've had to iterate on governance and keep changing the way we operate to make sure it's secure. We meet all of our regulatory security requirements, but we're also trying to give users a reasonable journey that's not too dissimilar from what they're used to. As proof that we're constantly iterating, the diagram isn't quite 100% accurate any more. We've disabled builds in that environment, so you can't build, and we don't actually run the OpenShift node agents on those masters. So it's already slightly different from that diagram, and that, again, is driven by security requirements. Traditionally we just hook into AD, so we integrate with AD for our authentication and authorization. In this environment we've had to move to token-based authentication. That's been quite a change, but it's been quite interesting, and we may retrofit it into our campus cluster environment. We've also started using Gluster for the internal PVs. In the campus environment we tend to use NFS and some dedicated block storage, et cetera, but within this environment we've started to use Gluster. That's our first experience of it, so we're going to see how that goes.
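For readers unfamiliar with Gluster-backed PVs, a minimal sketch of what a GlusterFS PersistentVolume looks like in OpenShift 3-era Kubernetes is below. The names, sizes and endpoint references are invented for illustration; nothing here describes Barclays' actual configuration.

```yaml
# Minimal, illustrative GlusterFS-backed PersistentVolume.
# Endpoint and volume names are invented for the example.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-example-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany            # Gluster supports shared read-write, unlike most block storage
  glusterfs:
    endpoints: glusterfs-cluster   # Endpoints object listing the Gluster servers
    path: example-volume           # name of the Gluster volume
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
```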
Again, we may take that back into the campus environment. Some additional features around security: we remove the Docker command across the whole estate, we don't have any direct root access, and there are quite a few other security measures. We do have an onboarding solution, rather than a broker that manages this environment for us. We're trying to make it feel like users are using the real OpenShift console and API, so we expose those, but the onboarding solution means people register, and we set up security groups, AD domain groups, et cetera. It also gives us somebody to charge; that's where we get our charging details from. I'll speed on a bit. I just wanted to talk briefly about the other elements, and I'll go through these quite quickly. As I mentioned before, a service focus has changed accountability, so we've had to change who's accountable for what in this environment. Developers want speed, self-service and access, but they've traditionally, especially in the bank, relied on infrastructure teams to manage those services for them. So there's a change in mindset, and a change in terms and conditions. We've been patching from the off, and it's really good for us that we drain data centers: we give warning, drain gracefully, and then patch. That means we've already proven that we can manage in a fully resilient manner, and users have had to get used to that. Also, with bring-your-own-image, users need to take on responsibility for taking those products through governance, et cetera, within the bank. The support model: we've got a dedicated team that supports this across three global locations, the UK, India and the US. For us, a relatively small team managing thousands of applications is a really good place to be. It's a dedicated, multi-skilled team.
Rather than the bank's traditional siloed approach, where if you had a problem you'd get a network guy in, then a Linux guy, then a middleware person, this is one team that manages it all on behalf of the users. Another key thing: we went 24/7 and treat non-production as though it's production. It's a really important environment, and quite a different mindset. In terms of collaboration, we work closely with our end users. When we launched, we launched with a pilot environment, so we got pathfinder applications and people to use the environment quickly, experience it and help us shape it. Key for us is that feedback loop. We have to prove ourselves and earn trust and buy-in from our developers, so we have to keep iterating the platform so that they believe we are going to take feedback. Charging: we try to do consumption-based charging. This has been one of our biggest challenges. Anthony mentioned process debt, and I think the financial processes and the financial structure at Barclays are among the hardest things we've come across in trying to introduce a cloud platform where you can charge for what people use. Very problematic and quite difficult for us to do, but we have nevertheless implemented a version of consumption-based charging from the off. I did want to talk about some of the challenges. In summary, the biggest challenge is probably cultural as much as technical. There are technical issues with putting this platform in, but the cultural mindset, and getting people to change given the debt that exists in the bank, is quite a challenge. We've tried to overcome that in a number of ways, including working with the developers as I mentioned.
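Consumption-based charging can be sketched roughly as: meter what each team's pods consume over time and multiply by a unit rate. The rates, the metering model and the function names below are all invented for illustration; the talk does not describe how Barclays actually implements this.

```python
# Rough sketch of consumption-based chargeback. The unit rates and the
# (cpu, memory, hours) metering model are invented for illustration.
CPU_RATE_PER_CORE_HOUR = 0.05   # hypothetical internal rate
MEM_RATE_PER_GIB_HOUR = 0.01    # hypothetical internal rate

def charge(samples):
    """Total charge for one team's usage.

    samples: iterable of (cpu_cores, mem_gib, hours) records, one per
    pod (or per metering interval).
    """
    total = 0.0
    for cpu_cores, mem_gib, hours in samples:
        total += cpu_cores * hours * CPU_RATE_PER_CORE_HOUR
        total += mem_gib * hours * MEM_RATE_PER_GIB_HOUR
    return round(total, 2)

# One pod running 2 cores and 4 GiB for 24 hours:
# 2*24*0.05 + 4*24*0.01 = 2.40 + 0.96 = 3.36
print(charge([(2, 4, 24)]))
```

The hard part in practice, as the talk notes, is not the arithmetic but wiring a usage-based model into the bank's existing financial processes.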
I think also, for me, I've seen a change now that people are accepting and using the platform. We've gone from me having to do loads of presentations and meetings, explaining how it does multi-tenancy, reassuring people that this application isn't going to impact that other application, and trying to persuade people to use the system, to smart users who are out there asking why we aren't on the latest version that does this or that. So we've seen a very different dynamic over the journey, and that's come with the acceptance of the platform. I do think cadence, being able to take new versions, is something we need to get quicker at, and it comes back to managing that technical debt and trying to move to an evergreen environment if we at all can. And I think it really links back to service stability. The biggest thing is we've delivered a platform that is available all the time, we've proven it's there all the time, we've drained data centers, and we continually prove it. That trust is a big win for us in terms of people adopting the platform. So the benefits for us are massive. I'm not going to say too much more; just to finish off, I hope that's given you some insight into the complexities of trying to build a platform like this in a very large, quite regulated environment, and one that has been around for a long, long time with lots of debt. For me, we've really tried to create an environment where app teams can focus on developing applications and new functions and not worry about infrastructure, and that for me is a real bonus. I always think back to 2015, when I knew we'd done something right: a team from Portugal went from non-production to production within two months, and nobody in the infrastructure teams knew about it. It just happened.
It went live. And that, for me, was a real indication that we'd done something really powerful and strong at Barclays. So thank you.