I'm Colin Humphreys. I'm the CEO of CloudCredo. I'm presenting with Marco from Swisscom.
Hi guys. I'm Marco, CEO of Swisscom Cloud Labs.
So, we're going to be talking about Service Foundry. And I think I can encapsulate Service Foundry in this very simple slide. So, I love Cloud Foundry. I love cf push for applications. I think that's a great journey. And I want to raise the question: why can't I do the same thing for services? Why can't I cf push MySQL and give it a disk? So, this is based on the idea that if something is good and it works, do more of it. So, I love cf push. So, let's explore the idea of being able to cf push services and data.
This is the agenda for the talk. Marco is going to take us through the current state of statefulness in PaaS and a brief state of the nation. That's the best joke in the entire talk, by the way, state of the nation. So, it doesn't get any better than that. You can leave now if you wanted to see the best joke. Then, I'm going to talk a little bit about the challenges of solving stateful problems. Then, we're going to talk about the solution where we change the data services so they work with Service Foundry. And then, we're going to talk about the solution part deux: we change the platform so we have the right kind of primitives and the right ideas to enable this way of delivering services. So, I'm going to hand over to Marco to talk about statefulness in PaaS.
Cool. Thanks, Colin. So, Colin and I started this discussion a couple of months ago, went through it, and really checked what the problem is.
I want to explain a little bit about statefulness in PaaS: what it really means, and why do we need state in PaaS? It's very simple. It's clear that Cloud Foundry, and PaaS in general, handles stateless applications very well. 12-factor applications, microservices, or what everybody says today, cloud-native applications, are all handled well by platform-as-a-service platforms. CI/CD processes, you can do all of that with PaaS. But what about underneath? What about the stateful services? How can you do that? And where do you do that?
So, today we usually answer this question with monolithic databases. We do it with what we already know. We take our existing database infrastructures, our existing database vendors, of which we know a couple, deploy them at scale, and try to somehow integrate them with the PaaS. But the fact is that you cannot run an Oracle cluster in a Cloud Foundry container. Nor can you run, or actually you can run, a MongoDB in a Cloud Foundry container, but it's not stateful. If the container somehow goes down, you lose your data. So what is the point here? The point is that databases are not yet ready to run in a PaaS environment. So we take them away and run them somewhere else.
So, the challenges we have at Swisscom, and we are a service provider in Europe: we have existing service landscapes. We have huge clusters of MSSQL, Oracle databases and others. And we try to attach them to the PaaS environment, which works. We can use them. We can make shared databases out of them. We can leverage these services and the know-how we have in existing databases and offer them as a service. But the problem is that they do not scale like they should when you think about how PaaS scales. So if you scale your application from one instance to 1,000 immediately, your Oracle cluster just does not react to that, because it doesn't know.
So either you go there, call someone and talk about it, and try to automate that scalability, or you change how you interact with your Oracle cluster, for instance.
The second thing we see is the operability and compatibility of existing services. It's very hard. We have a couple of services out there and they all run smoothly, but they all have their own teams. We have thousands of people just operating services today. So they operate their MSSQL database. They are professionals in that, but that's it. So can we find a standardized way to operate services more smoothly and to add services very fast?
This brings me to the next point, which is that we have to deliver new stateful services very fast, as we do with middleware. If after MongoDB there is Redis coming, and then the next one, we as a service provider need to be able to deliver that to developers very quickly. So do we just ramp up another team of Redis experts, build that and ship that? Those are the questions and challenges we have as a service provider.
Now let's look back at Cloud Foundry and the services within Cloud Foundry. What do we have today and what is the state of the nation, so to say? Well, services are actually covered, and the talk in front of this audience 10 minutes ago was exactly about services, or service brokerage. We have two things which are very well captured within Cloud Foundry. The first is the provisioning of services. So you can provision services easily within Cloud Foundry, actually with cf-services-contrib, so there is a way to do that. The second is the management of services, or let's say the creation and binding of a service, which is done with a service broker.
So let's go one step further on that. What is a service broker? A service broker allows you to get your database, bind it and add it to your app. What a service broker in fact does behind that, you don't know.
Either it's a shared database or you get a dedicated database. You get a service which is somehow provisioned and somehow bound. So it's basically an externalization of the issue. You take your issue and put a nice layer on top of it, which is perfectly fine and works. But it's not a complete solution.
So what's behind the service broker? And I'm talking here about our own production system and also about other people we talk to: it usually looks like this. You have a somewhat weird way of doing it, some unfinished walls and some open doors, because you somehow have to provision that. And as you grow, developers approach you: hey, we need another service, and today it's this one, tomorrow it's that one. You just somehow need to wrap them and provision them through service brokers. That's what we see today. That's not the Swisscom office.
What we also see today is a lot of services out of the box. All these distributions in the expo over there provide you services out of the box. They provide specific services in their complete model. Sometimes they're proprietary services, sometimes common open-source services, but they try to bring their USP in there. So their USP is: you get Cloud Foundry with all the services around it and you don't have to care about it. They just ship it. But the point here is that we think, as a community, and that's basically where we started the discussion, we should maybe think about how we can go further than just deploying middleware and also deploy services in a generic way. So that not every distribution or vendor solves the problem on its own, but that we have a common layer for how we can solve this and onboard new services faster, as a community.
But now everybody says: wait, we have something called BOSH. BOSH solves all the problems. And I don't know if Dr Nic is here. I guess he will agree. BOSH can solve all the problems, but yeah, we had a different experience.
So BOSH can be very slow. BOSH is not really easy to deploy or to learn, and lifecycle management with BOSH is very hard. So if you have an existing database and you upgrade to a new one, or you have data migrations into other data centers or whatever, BOSH is a challenge and was a challenge; that has been our experience so far. So BOSH could be a solution, but it needs some big steps to get there.
And what about the v1 services? We had v1 services, right? They were perfectly fine, and as I look into the audience, there are some Swisscom folks. They run in production, we somehow run them, but we would never do it again, and I would encourage you not to try that out. It is fine for piloting, and you can try it out while ramping up your Cloud Foundry environment, but it's really not for production use.
And when I talk about all these things: as a service provider you have a different perspective on other topics, because you think about operational readiness. When is a service operable? You think about lifecycle management. How do I upgrade XYZ when customers are on it and I have no clue what they do? You have no clue which database schema they're using. You just want to upgrade the service. So you have completely new challenges in there.
And obviously the multi-tenancy concept. Not every database has a multi-tenancy concept per se. Let's take Redis as an example. Redis per se has no multi-tenancy concept, so you have to deploy each Redis on its own. On the other side, MariaDB or MySQL have a very nice multi-tenant concept. You can just give the user a password and a schema, and that's fine, and you can run one cluster. So you have to look at each of these various services at this level and figure out how you can run them. And obviously billing as well: how you want to charge for your various services, by quota, by message, by whatever you can think of.
And this actually brings me to the challenges of solving stateful problems.
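The multi-tenancy point above (one Redis deployment per tenant, versus one shared MariaDB or MySQL cluster with a schema and password per tenant) could be sketched roughly as follows. This is a minimal illustration, not how any real service broker is implemented: the `provision` function, the `SHARED_CAPABLE` set and the hostnames are all hypothetical names invented for this example.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

# Assumption for illustration: services whose own user/schema model lets
# many tenants share one cluster. Everything else gets a dedicated instance.
SHARED_CAPABLE = {"mariadb", "mysql"}


@dataclass
class ServiceInstance:
    service: str
    tenant: str
    dedicated: bool
    host: str               # hypothetical hostname, not a real endpoint
    schema: Optional[str]   # only set when the tenant lives on a shared cluster
    password: str


def provision(service: str, tenant: str) -> ServiceInstance:
    """Decide how a tenant's instance of a service is isolated."""
    password = secrets.token_urlsafe(16)
    if service in SHARED_CAPABLE:
        # One shared cluster; isolation comes from a per-tenant schema and login.
        return ServiceInstance(service, tenant, False,
                               f"{service}-shared.example.internal",
                               f"tenant_{tenant}", password)
    # No built-in multi-tenancy (for example Redis): one deployment per tenant.
    return ServiceInstance(service, tenant, True,
                           f"{service}-{tenant}.example.internal",
                           None, password)
```

Calling `provision("redis", "acme")` yields a dedicated instance with its own host, while two tenants asking for `"mysql"` land on the same host with different schemas. The operational cost differs accordingly: every dedicated instance is one more thing to upgrade, monitor and bill.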
And that's the point where Colin needs to take over, because I have a challenge.
Thank you very much. So I gave this talk a run-through last week, just to a small group of people. And they said to me, Colin, you've got to say something constructive. You can't just stand on the stage and complain for, like, 20 minutes about how difficult this is to do. And I said, no, I'm just going to stand here and complain. So I'm just going to preach about why I think this is so difficult, why it's so challenging, and why I've tried to do this so much and had so much difficulty making this scale and making this work. So why is it so difficult to solve stateful problems when Cloud Foundry makes it so easy to solve the stateless problems?
So firstly, CAP. Can I get a show of hands: who here knows what CAP is to a reasonable degree? OK, that's not everyone, so I'm going to very briefly run through this. CAP: consistency, availability, partition tolerance. You can have two of those three in a distributed system. You cannot have all three. And that makes life difficult. So you have to pick two. In reality, what this means is that if you have a distributed system and you have a network partition, you can choose to do one of two things, both of which are wrong. So say there's a network partition and the servers have been split in half. You can either stop serving data and stop mutating state, because you could become inconsistent; so you maintain your consistency, but your service is unavailable. Or alternatively, you carry on mutating data with your cluster split into two halves, at which point you are inconsistent, but you are available. So you can either be consistent and partition tolerant, CP, or available and partition tolerant, AP. You can't be both. And this makes life very, very difficult. So what does this mean for Cloud Foundry and for Service Foundry?
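The CP versus AP choice described above can be sketched in a few lines. This is a toy model under stated assumptions, not the behaviour of any particular database: `Node`, `Mode`, `Unavailable` and `try_write` are hypothetical names, and "quorum" here simply means that a node can see a strict majority of the cluster.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    CP = "consistent and partition tolerant"
    AP = "available and partition tolerant"


class Unavailable(Exception):
    """Raised when a CP node refuses a write in order to stay consistent."""


@dataclass
class Node:
    mode: Mode
    cluster_size: int   # total number of nodes in the cluster
    reachable: int      # nodes this node can currently see, itself included
    data: dict = field(default_factory=dict)

    def has_quorum(self) -> bool:
        # A strict majority of the whole cluster must be reachable.
        return self.reachable > self.cluster_size // 2

    def write(self, key: str, value: str) -> None:
        if self.mode is Mode.CP and not self.has_quorum():
            # CP: give up availability rather than risk diverging state.
            raise Unavailable("no quorum, refusing write")
        # AP (or CP with quorum): accept the write locally.
        self.data[key] = value


def try_write(node: Node, key: str, value: str) -> bool:
    """Return True if the node accepted the write, False if it refused."""
    try:
        node.write(key, value)
        return True
    except Unavailable:
        return False
```

With a four-node cluster split exactly in half, each side sees two of four nodes, which is not a majority. Two `Mode.CP` nodes on opposite sides both refuse writes (consistent, unavailable); two `Mode.AP` nodes both accept them and can end up holding different values for the same key (available, inconsistent).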
So if you've got your Cloud Foundry app that's stateless, and you push it and it's working, then if the Cloud Foundry cluster is split in two, it can just fire up some more instances of your app. It doesn't matter. That's fine. If you've got, you know, my web app, having two of my web app is fine. If you've got MySQL running there and you split it in half, what should happen? Do you run two MySQLs, both serving your data independently? You've got a split-brain situation. Do you run no MySQLs? What's the right thing to do? There is no right thing. So life gets very difficult.
As we know, because there have been a few talks about it, Cloud Foundry is focused on 12-factor apps. This is a set of patterns that came out of the guys at Heroku. It enables applications to be effectively PaaS-compatible, cloud-native. But we know that some of the 12 factors dictate that we should externalize our state. So do we move to 10-factor data services? The two factors we're going to violate here: firstly, we are not going to externalize state, we're going to choose to internalize our state. And secondly, our processes are going to have to stop being ephemeral. We can't just throw them away, because there may be important data there. So we're going to drop down from 12-factor to 10-factor. And there's a talk that's going to extend that idea tomorrow. Ted and Caleb are talking about persistence in Diego. So that's like a continuation of this talk, if you will. That's a far more positive talk. I'm just going to stand up here and complain.
Automation. So this is really, really difficult. Now, a while back, I wrote some automation for Oracle RAC clusters. Some scripting, some Chef cookbooks, so you could automate an Oracle RAC cluster. And some Oracle engineers said to me, you can't do that. You can't automate an Oracle RAC cluster. Each Oracle installation is organic. It needs to be grown and tended to.
So I think somewhere there is, like, a DBA union that is making data services that need to be looked after by humans. But why is this? I mean, maybe because it's difficult, I don't know. But if your apps are going to be run inside of Cloud Foundry or Service Foundry, they need to be automatable. The app can't put its hand up halfway through the night and say, can an administrator come along and run these commands, please? OK, it needs to be automated. And this isn't how data services currently are.
Scaling. So we know with Cloud Foundry we can scale easily. Nice horizontal scaling. What does scaling even mean for data services? Do we have more instances of the data service? Do we have a larger disk? Do we have more IOPS? What does scaling mean for data services? Well, this is just ambiguous. I mean, this could mean anything for different data services. It's not clear.
And what about durability? How do we store the data? How do we persist the data? And what is persistence in this context? Also known as not storing your data in MongoDB. So how do we store data in an environment where containers are coming and going? And how do we give them the persistent disks and persistent volumes that they need? So again, there's a talk tomorrow which is going to look at how we can do this with Diego. So I would recommend going to that talk.
What I think is interesting here, why I think this talk is pertinent now and maybe wasn't before, is that if you look at version two of Cloud Foundry, and by that I mean DEAs and NATS for orchestration, we couldn't really do consistent data services. For those of you who have played around with it and poked at NATS and the Health Manager and the Cloud Controller and that entire kind of loop: crazy things can happen. You can go from no instances of an app to too many instances of an app and back down again, and it's very difficult to reason about.
So with Diego, we have etcd as a backing store and we can start making consistent decisions about running applications. So Cloud Foundry version two was not fit for building Service Foundry on top of, but Cloud Foundry version three, a.k.a. Diego, gives us what we need to be able to deliver consistent data services.
So there are two parts to Service Foundry, two elements to the solution. The first one, as I alluded to with the challenges, is that we change the data services themselves. What do I mean by this? So I've been fortunate enough with CloudCredo to work with some of the teams that are bringing data services to Cloud Foundry, and what's been fantastic there was to work with the actual developers of the data services and kind of move them along this journey, to help them make databases and data storage solutions that are better suited to running in a cloud-like environment. So in particular Cassandra: we're working with DataStax to help them make that a better journey. We worked with the RabbitMQ team. We changed their clustering to be better suited to automation in this way. And I think if we as platform specialists work with the vendors of the data solutions, we can help them make their data solutions more cloud-native, but I think there's still a way to go.
An interesting point here is that there are potentially data solutions in the future where, if you run enough containers in enough instances, you don't need data volumes behind them. There's an interesting solution called Crate, at crate.io. They're looking at potentially running huge numbers of containers, and in essence your data is replicated across all the containers, so you don't need backing volumes, because so long as enough containers stay alive you still have all your data. So if data services move down that path, all your data could be ephemeral, but that scares me a little, I have to say. Would I put financial data into something that could lose it all at any point in time?
Probably not right now.
So that leads us on to micro data services. So just imagine a Cloud Foundry where your apps can have a persistent disk behind them and can do networking between them. Can we then start to run small data services within Cloud Foundry that deliver data to microservices, and thus have micro data services? So I'm going to hand over to Marco to talk about the second part of this.
Thanks. So changing the services is interesting, and this may be a way to go. But it takes a long time to get there. Before that, we should maybe look at changing the platform, or adding some things to the platform that allow us to be faster anyhow, and not wait for those changes to come. So what do you really need? You really need a cf push for a real service kind of thing. So how do you do that, what do we need, and what is already out there? Because you shouldn't reinvent the wheel if you already know there is something out there.
So let's talk about what is already out there. So, for instance, we need storage. We need persistence. We mentioned that a couple of times. Are there orchestrators or managers that can give us fixed persistent storage in containers? Yes, there are. And here are just some of the projects, like Arrowhead, Flocker, or you should go to Ted's talk tomorrow. He is also talking about a solution for how you could do that. So there are solutions out there that can give us persistence for a service or in a container.
What else do we need? What do we need in addition? We need network. Sure. We need software-defined networking. When you spin up your Cloud Foundry the first time, you won't need software-defined networking. You just spin it up and hope it's running. But when you go further into the use case and think about the abstraction between your services and the middleware, and even security boundaries between them, then you think about software-defined networking, because you need to grow when customers come.
But here too there are solutions out there for container networking, like Calico, Weave, SocketPlane, just to name a few. There are hundreds. So there is a solution for that somehow.
The next topic you need when you have a lot of services is discovery and monitoring, because you need to know what is there, you need to know how the services are doing, when you need to scale them, and how you need to manage them. But here too there are a lot of tools already out there, most of them open source. And to mention just some of them: Consul, ZooKeeper, Eureka. They're all capable of somehow handling containers and somehow letting you know when the state changes, or when some assistance or some automation is needed.
And last but not least, provisioning. Provisioning is what you do with a CF service broker, right? You saw there needs to be something behind it at the end. But there too we have solutions. There are open-source solutions out there, like Brooklyn or Terraform, just to name two of them. There are hundreds more; you can even puppetize it and just run Puppet behind it. But the question is, are we able to build something which gives us an overlay for all these solutions and provides us an easy way to drive persistent cloud-based data microservices, or micro data services, sorry.
So to summarize: we talked about solution one, which is basically change the service, which Colin is working on. We talked about solution two, which is enable the platform, which we maybe can work on together. But both sides could, will, and somehow have to move towards a concept which we today call Service Foundry. And let's build this as a community. That's basically our message here. I think we need something so that we don't have the USPs in each distribution, with each one building on their own, but instead find a common ground for growing and delivering these services fast.
Great. Good. Do we have any questions? No one wants to ask, where is it?
We haven't built it yet. But if anyone wants to collaborate with us on building this, please do come and talk to us. We're issuing a set of demands. We want this to happen. Please do come and talk to us if you want to help build this. Any questions? Come in.