Hi everyone — just quickly checking there's volume on this one. That's good. Okay, so as James mentioned, I'll skip past this. It's actually the second birthday of Kubernetes. Well, it's the second birthday of the 1.0 release, which makes it a little confusing, because Kubernetes actually came out a year before that — so technically Kubernetes has been announced and around for three years now. The momentum, though, really came about when it hit 1.0 and with the continuing releases afterwards. So happy second birthday, Kubernetes — we've got a cake and everything.

Just to run through the milestones: it was open sourced back in July of 2014. It caused a bit of interest, especially because it was coming out of Google, but it wasn't really until the 1.0 release that things started picking up. Around that time, the code was donated to the Linux Foundation, and that's when they opened up the Cloud Native Computing Foundation, which is the body that oversees Kubernetes and a number of other projects related to cloud native technologies. Then there was the first birthday — sorry, I've got mixed up a little here: the 1.0 release was 2015, and the first birthday came last year with the Kubernetes 1.3 release. That was an iterative improvement — in the time since 1.0, there were really two major releases in between — and then in 2017, only a few weeks ago, came Kubernetes 1.7. So let me quickly summarise what the different releases brought over the last year.
1.4 came about as a release that simplified deployments, and the tool kubeadm, if anyone's familiar with it, came out there; it has continued to be refined since, to make it even easier to deploy on different environments — cloud platforms, bare metal, everything else. The improvement of stateful apps is an important one, and that's one I'll be talking a bit about tonight, especially on the storage side of things. And the expansion of Federation. Federation is a way of connecting different Kubernetes clusters together across bare metal, cloud providers and various environments, so that a larger environment can run highly available workloads and so on — a pretty important feature that has continued to mature over the releases.

Then the 1.5 release came out about six months ago, in which StatefulSets replaced what were called PetSets at the time. There were a few people in the community who didn't really like the idea of PetSets — it seemed to be the vegans and vegetarians who actually put their hands up and said we'd rather something else. So the community had a lot of thought and discussion and renamed it to StatefulSets as it matured through the release pipeline and gained new features. Probably an important part of that release as well is the Container Runtime Interface, which lets different container runtimes be switched out underneath Kubernetes. So instead of using Docker, you can use the rkt runtime, or virtual-machine-backed runtimes — KVM, or the platform called Hyper (not the Microsoft Hyper-V, but other technology around that). Alongside that, they added Windows support, which became a pretty important part for Microsoft in its push with the Azure platform — but that's not really what we're here for.
We're talking about GCP. Moving forward with 1.6, the previous release: they expanded support up to 5,000 nodes. That's something they do testing on, to make sure Kubernetes is certified to run at that scale. Along with that came the stabilising of what they call dynamic storage provisioning. I'll go through that in a bit more detail, but prior to that release, if you wanted a particular piece of storage, you typically had to get an administrator to go into the back end and create an EBS volume or a Google persistent disk or anything else to provide it. This enabled storage to be provisioned automatically. Alongside that, role-based access control became beta — an important step for securing workloads and providing access controls to the various parts of the Kubernetes environment.

So that brings us to 1.7. This was a release with an interesting mix of things in it. It actually looks like a release that's come about to expand the pluggability of Kubernetes. A number of these things have been around for a few releases as minor features. One of them was called third-party resources, which has now been renamed to custom resource definitions. The name itself isn't particularly interesting, but it lets third parties provide extended functionality in the Kubernetes ecosystem. If anyone's familiar with what they call operators — operators are a way of encapsulating best practice around running technologies — the operators use what are now called custom resource definitions, or CRDs, to provide that expanded functionality. For example, if you're running etcd — the data store that sits behind Kubernetes, and a number of other projects — the operator lets you do things like backups and off-site storage and a number of other things, to make it easier to manage.
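To make the CRD idea concrete, here's a rough sketch of what a custom resource definition looked like in the 1.7-era API (the group and names here follow the etcd operator's convention, but treat the specifics as illustrative):

```yaml
apiVersion: apiextensions.k8s.io/v1beta1   # the CRD API as of Kubernetes 1.7
kind: CustomResourceDefinition
metadata:
  # the name must be <plural>.<group>
  name: etcdclusters.etcd.database.coreos.com
spec:
  group: etcd.database.coreos.com
  version: v1beta2
  scope: Namespaced
  names:
    plural: etcdclusters
    singular: etcdcluster
    kind: EtcdCluster
    shortNames: ["etcd"]
```

Once this is registered, users can create `EtcdCluster` objects like any built-in resource, and an operator watching the API acts on them.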
Alongside that, they added what they call aggregated API servers — a bit of a tongue twister, but it's a way of providing more control over API resources. Whereas a custom resource definition is really just a piece of YAML that gets stored in the back end — you can query it, but it doesn't do anything by itself once it's in there; in that kind of environment you'd have something else that queries the API and performs actions on it — an aggregated API server is an actual new endpoint that sits in the API, which you can create as a pluggable endpoint. So you can build things that, say, talk to security services or other systems. It's still very early days in terms of what people are doing there, but it will be a very powerful extension point to build on top of, to extend Kubernetes going forward — without having to wait for a new release, because these things can be deployed as containers. So it's a bit of an inception: running on top of Kubernetes, adding plugin features that get built in and deployed.

Alongside that, network policy went to general availability. It's a way of being able to contain a particular namespace — to say this pod can talk to that namespace — and lock things down at the network level, not just at the user level as with role-based access control. This is about pods being able to talk to each other. These typically work with a network plugin of some sort, which then makes sure traffic is encapsulated — in a VXLAN, for example — or otherwise controls access to resources across the network.

Another interesting thing, which goes together with something that's probably coming in the next releases as it matures, is kubelet TLS bootstrapping.
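A NetworkPolicy of the kind described above, using the `networking.k8s.io/v1` API that went GA in 1.7, might look like this (the labels and port are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: backend
spec:
  # applies to pods labelled app=api in the backend namespace
  podSelector:
    matchLabels:
      app: api
  ingress:
  - from:
    # only pods in namespaces labelled name=frontend may connect
    - namespaceSelector:
        matchLabels:
          name: frontend
    ports:
    - protocol: TCP
      port: 8080
```

As the talk notes, this only has an effect if the cluster's network plugin (Calico, Weave Net, and so on) enforces policy.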
That's about being able, when you set up an environment, to have secure TLS certificates for each node, so nodes can identify who they are when they talk to the Kubernetes master. If you're running on a mixed network, or for security generally, it means the traffic is both encrypted and attributable to the correct node, so there aren't particular security issues on the network. To go along with that — and this will come as part of a future Kubernetes release — is being able to encrypt secrets and store them securely, so that they're only given to the correct node. If you've got a workload living on node X, then only node X can see its security tokens. So it's an important step for tightening the security perimeter and maturing the product offering.

Then two smaller things: StatefulSets and DaemonSets. StatefulSets got a rolling update mechanism. Prior to that, you'd have to do it by hand — you'd take down a pod and it would be replaced as you went. Now there's a proper rolling update, with clear semantics for how it goes through and updates things, making sure everything is updated one after the other without taking down the whole set. DaemonSets — a resource that runs a pod on every server in an environment — got the ability to keep history and do rollbacks. So if you update something and realise there's a problem with it, you can downgrade it again.

So that covers what's in 1.7. It doesn't seem all that exciting, but in terms of the base it's setting for the future of Kubernetes, it's an important step for being able to expand things going forward.
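The StatefulSet rolling update is driven by the `updateStrategy` field added in 1.7. A minimal sketch, using the `apps/v1beta1` API of that era (names and image are illustrative):

```yaml
apiVersion: apps/v1beta1   # StatefulSet API group as of Kubernetes 1.7
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  updateStrategy:
    type: RollingUpdate    # replace pods one at a time instead of by hand
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.13
```

The DaemonSet history and rollback mentioned above is surfaced through the same `kubectl rollout history` / `kubectl rollout undo` workflow used for Deployments.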
So I think as we watch over the next twelve months, by the next birthday we'll see a lot of other things plugging into the Kubernetes ecosystem. To recap a little about the project as a whole, it's good to look at Google Trends and see the interest over time. In the last month or so, for the first time, searches for Kubernetes globally have exceeded those for OpenStack. So in terms of technologies — private cloud, platform-as-a-service, and other container platforms — Kubernetes is now the leader in search terms. I'm sure if you compared it against searches for AWS or GCP it would be minuscule in comparison, but as a metric against similar projects, it's an interesting trend to watch over the next twelve months.

In terms of Singapore, it's also an interesting sign: Kubernetes has now started to overtake OpenStack here as well. I haven't seen a lot of interest in OpenStack in my time in Singapore, but it may be a sign that what Kubernetes is doing is gaining traction, partly because it works across public cloud, private cloud and a number of other environments. This is a graph over the last four years, and it's good to see an increasing trend.

A little bit about the code — I won't go through all the details, but you can see that the jump in code contributions between year one and year two is much bigger. In terms of contributions on GitHub, Kubernetes is close to being number one, comparing with the Linux kernel and a number of other projects. So it's interesting to see how the project is scaling. There are a lot more special interest groups forming — there have been eight new ones in the last year.
I can't think off the top of my head what they all were, but one of them is certainly around storage. And it's an important one: storage has existed in Kubernetes since day one, but it's only now really starting to have serious thought put into it. There are now specifications emerging around the Container Storage Interface, and storage is maturing into a serious consideration as things go forward and bigger vendors come into play.

So just a quick hands up: how many people are familiar with Kubernetes? How many people are complete newbies? Excellent, all right. So, a quick introduction to Kubernetes for everyone who is unfamiliar — apologies to the people who already know it back to front. This is really just a quick overview, and feel free to ask questions; put your hands up if anything isn't clear.

So, what is Kubernetes? It's a container orchestration platform. You're probably familiar with that, and you're here for that reason. For the most part, it runs Docker containers, on a variety of different environments: cloud, bare metal, Raspberry Pis all the way through to big iron. The concept around it is that it manages applications, not machines. Really, at the end of the day, as an end user you shouldn't care where your workload sits — it could sit on any number of nodes in a cluster, and Kubernetes manages that for you. As an operator doing your cluster ops, or whatever you like to call it, you do care about nodes and how they're performing, and there's a layer in Kubernetes that lets you manage those nodes, see how healthy they are, take them out, upgrade them and so on. But for the most part, for the end users and the target users, it's very much about managing applications and providing a common framework and platform that lets you deploy things easily and repeatably.
So, quite a few benefits. To go into the architecture a little, this is a very high-level diagram showing users accessing the cluster either through the API, through the kubectl CLI — "kube control", "kube cuddle", there's always a bit of disagreement about how it's pronounced, and I don't think it will ever really be settled — or through the user interface, the Kubernetes dashboard.

The main part of Kubernetes sits on the master, either as a single node or as a high-availability cluster, and has a few different parts. There's the API server, which everything talks to and which does all the work. There's etcd, which sits in the back end, provides the data store, and persists the resources that get put in there. The scheduler is responsible for making sure workloads get placed evenly; it talks to the kubelets to make sure things aren't overloaded and handles any problems with how things get placed. And finally there are the controllers, which handle the specifics of particular workload types. I mentioned DaemonSets before; replication controllers are another. The controllers are responsible for handling the particular types of workloads you're deploying — be it one pod on every node in the cluster, or something that handles failover or high availability, or anything else.

Then down on each node — the things that actually do the work — sits a kubelet. The kubelet itself is very, very simple: it sits there and talks to, in a standard cluster, Docker. You can drop a resource into the kubelet's manifest directory and it'll launch it for you; it reports how healthy it is; but that's about it. The smarts come from the master. Alongside it, there's a proxy that handles routing requests to the particular workloads you're running.
So, fairly simple. Is anyone confused? Any questions about those pieces? Okay, so quickly skipping on, to go down into a little more detail. The pod is the fundamental unit in Kubernetes. Most people are familiar with Docker: you run containers on Docker. A pod is a collection of containers that are co-located together to perform a particular function. The best example I have is a database — MySQL, Postgres, anything like that. Another container in the pod might be something that runs a migration, or something that takes a backup and ships it off-site for you. The containers inside a pod aren't the unit of scaling — the pod is the thing that gets scaled. It's a unit that goes together.

As part of that, a number of things are defined in a resource specification — usually YAML or JSON files. They're a bit verbose, but once you get the hang of it, you find a way of deploying things, and we can use other tools to help. Each pod gets a distinct IP address; all the containers in the pod share that IP address, and the pod exposes particular ports. So in a way you can think of it like a virtual machine, but without the overhead of running a full operating system inside it. Alongside that — and this is what we'll get into next — are the volumes: storage that gets attached alongside the pod.

So that's it for those slides. How are we doing for time, James? I haven't been keeping track. All right, so a little bit about me. My name is Hunter — James has given you an introduction — and I'm the Cloud Native Computing Foundation's ambassador here in Singapore. What does that mean? If you have any questions about the technologies that are part of the foundation, feel free to get in touch with me and ask.
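To make the pod description above concrete, here's a minimal sketch of a pod with a database container plus a backup sidecar sharing a volume — the image names and paths are purely illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-with-backup
spec:
  containers:
  - name: mysql                 # the main database container
    image: mysql:5.7
    ports:
    - containerPort: 3306
    volumeMounts:
    - name: data
      mountPath: /var/lib/mysql
  - name: backup                # sidecar: shares the pod's IP and volumes
    image: example/backup-agent:1.0   # hypothetical image
    volumeMounts:
    - name: data
      mountPath: /data
      readOnly: true
  volumes:
  - name: data
    emptyDir: {}                # scratch space; persistent options come later
```

Both containers see the same `data` volume and the same network identity, which is exactly why the pod, not the individual container, is the unit that gets scheduled and scaled.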
If you need any information, I can point you in the right direction or help out with those sorts of things. I run a business that provides storage and Kubernetes for businesses on premises, and as part of that I've been deploying Ceph and Kubernetes for a number of years. So just a bit of a pitch — I usually don't do this, but it relates to what I'm actually talking about. We build a really interesting little storage platform that runs Kubernetes: 24 terabytes of storage, two terabytes of RAM and a whole bunch of vCPUs, all sitting in a nice little 3U box in a rack. And it's available as a managed service: pay us a monthly fee, we put it in your data centre, and you can run Kubernetes and do whatever you want with it. So it's a fun technology to play with, as well as a useful one for business.

Going further, the question — and hopefully why a few people are here — is: how can I run stateful apps on Kubernetes? It's the one question that always seems to come up at these sessions. Can I run a database? Can I run a stateful app in Kubernetes? The answer is yes — otherwise I'd finish the talk now and go home. But like everything, there's always a but; there's always an asterisk. And the thing that really jumps out is something Kelsey Hightower said a little while ago. If anyone's not familiar with Kelsey Hightower, he's something of a guru who now works at Google, talking about cloud platforms, Kubernetes and related technologies. His opinion, at least, is that if you want to run traditional databases on Kubernetes, you should strongly consider using a managed service. Now, I think that actually applies to anything, really.
If you're running a database on a virtual machine in public cloud, you should really consider a managed service too, because at the end of the day you need to understand the requirements of your stateful service: how do you operate it, how do you back it up, all of those things. But you also need to understand your storage provider — be it EBS or Google persistent disks or anything else — and the trade-offs between performance, durability, scaling and everything else. So if you can cheat and get a managed service to look after it, by all means do; why take on that responsibility when you don't have to? But if you do want to do it yourself with the latest tooling, let's go into a little more detail.

So, Kubernetes volumes. As I showed before, this is a rough diagram of a pod. A pod can have various volumes attached to it — local or network-attached, with a few caveats here and there that I'll talk about as well. The purpose of the various persistent volume types in Kubernetes is to abstract the underlying storage layers. You shouldn't have to know whether you're using an EBS disk or a Ceph disk or an NFS share; they should all operate in exactly the same way: give me a place I can put something, store it for me, with a particular durability, a particular performance, and so on.

To go into a little more detail: administrators are the ones who provision what are called persistent volumes. You'll find that, unfortunately, the distinctions between all of these things in Kubernetes are a little confusing. There are persistent volumes, persistent volume claims, storage classes, volumes, local disks and everything else. So to spell it out: a persistent volume is a thing that's managed by an admin.
The admin will go behind the scenes with their NAS or with their cloud provider, provision a volume, and then associate a persistent volume with it. That's the manual way, and it was the original way of doing things in Kubernetes. Some of the cloud providers had a few hooks behind the scenes — Google being one of them, because they built the thing — so it would actually create the volumes behind the scenes via the provider APIs. But with a provider like Ceph, when these things first launched, you actually had to create the block device by hand and then associate it.

Over the past few releases — I briefly touched on this before — they've added what they call storage classes. A storage class is a resource that's specific to the storage layer. It can be associated with a particular performance characteristic, or a variety of different attributes of the storage. There can be multiple storage classes in a cluster: you could have one using your cloud provider's disks, and another using, say, a Gluster environment running in that same public cloud, and use them all in the same way. There are different characteristics and reasons for doing that — when I give the demo later, I'll show why there are legitimate reasons for not just using, say, EBS or Google Cloud disks, because there are performance characteristics around those things for high availability and switching nodes.

The next step down is what they call persistent volume claims. It's a confusing name, but it essentially means: I am creating a resource saying I need a particular size with particular characteristics. Kubernetes will then look for a matching persistent volume and bind the two together. If there's nothing there, it will tell you.
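The "manual way" described above — an admin creating the disk out-of-band and then registering it — looks roughly like this (the disk name is a placeholder for whatever the admin created):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-gce-0001
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce              # attachable by a single node at a time
  persistentVolumeReclaimPolicy: Retain   # keep the disk when the claim goes away
  gcePersistentDisk:
    pdName: my-disk            # a GCE disk the admin created beforehand
    fsType: ext4
```

With dynamic provisioning and storage classes, this object is created for you instead of by hand.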
It'll say: well, I can't find what you're looking for. So it errors out, and either you or your storage platform needs to create a resource that satisfies it.

Storage and persistent volumes are independent of the pod. If I destroy my pod, the storage will typically stay around, and I can then associate it with another pod. There are also ways to say: actually, when I delete my claim, I'd like the volume to be reclaimed instead of deleted — there are a number of policies like that. In certain cases, volumes can be shared or handed between pods. If you're using NFS, you can assign a shared volume that's mounted across a number of different pods. Or, with a block storage device, it can be attached to one pod, the pod destroyed, and then attached to another pod, which carries on where the first left off.

So, quickly looking at the basics: the most fundamental part is volume mounts, and I'll go into more detail and show you the resources used for these. A volume mount is really just saying: I need something attached at a particular point in the file system of my pod. Alongside that, there are a few special cases. emptyDir is one of them: essentially a temporary file system you can attach. You can put stuff in it, and when you kill your pod, it gets deleted. Fairly straightforward. Then there's hostPath. hostPath is becoming a bit of an anti-pattern in Kubernetes. It means you can store stuff on the local file system of a particular node, but the problem is that if your node dies, you lose everything on that disk, and if your pod gets rescheduled — say the node dies or something else happens — the data doesn't follow it. There's no way of keeping them together. It's useful, and it's used, for particular system services that need to be pinned to a particular node.
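Putting the claim and the mount together, a typical pairing looks something like this (names and sizes are illustrative; the storage class name depends on the cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
  - ReadWriteOnce              # one node at a time, as with block storage
  storageClassName: standard   # whichever class the cluster offers
  resources:
    requests:
      storage: 8Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mysql
spec:
  containers:
  - name: mysql
    image: mysql:5.7
    volumeMounts:
    - name: data
      mountPath: /var/lib/mysql   # where the volume appears in the pod
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: mysql-data       # binds the pod to the claim above
```

If the pod is destroyed, the claim — and the volume bound to it — survives and can be mounted by a replacement pod.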
But otherwise, it's not recommended for running highly available services. Alongside those, secrets and config maps are also a type of volume. Secrets are a way of handling sensitive data and attaching it to your pod. They can be mounted at a particular path, but they can also be exposed as environment variables. Config maps are quite a similar concept, but instead of sensitive data, they're used for configuration files and the like. So secrets and config maps aren't really used for storage — at least not writable storage over the lifecycle of a pod — but they're used for supplying configuration as you launch things into a particular environment.

Moving on to the more interesting part: storage platforms. Kubernetes obviously supports the big cloud providers — it supports GCE, it supports AWS and EBS — but it also gets into the other types of network storage and software-defined storage: GlusterFS, Ceph, NFS, all the sorts of things that might be used outside a public cloud environment. The one thing that's on its way, and is involved in a fairly deep discussion about how to extend Kubernetes to the myriad of storage providers out there, is FlexVolume. FlexVolume is a plugin mechanism that lets companies build their own storage drivers without having to go into the Kubernetes source tree and have everything provided in-tree. There are a few limitations that are leading to a little frustration in the community at the moment, and they're being resolved. There's also a very interesting project called Rook, which is a platform on top of Ceph designed to be Kubernetes-native, and they're looking at ways of integrating storage automatically, without having to manage credentials.
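A secret consumed both ways — mounted at a path and injected as an environment variable — might be sketched like this (secret names, keys and the image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: example/app:1.0          # hypothetical image
    volumeMounts:
    - name: tls
      mountPath: /etc/tls           # secret files appear here, read-only
      readOnly: true
    env:
    - name: DB_PASSWORD             # same mechanism, as an env var
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: password
  volumes:
  - name: tls
    secret:
      secretName: app-tls
```

A ConfigMap is consumed the same way, with `configMap:` in place of `secret:` and `configMapKeyRef` in place of `secretKeyRef`.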
So in the current world, with certain plugins, if they're not included in the tree, then you can't necessarily attach secrets to them to access particular devices with a username and password, say, or security keys, or anything else like that. At some point it'll mean there's an easy setup and almost a drop-in replacement for having storage in any sort of environment.

So let's go to a demo instead of me just talking the whole time. What I'd like to do is go through what's a fairly typical workload. I assume everyone's familiar with WordPress — it's a blog, for anyone who's not. It's a good example of a stateful app: typically a PHP front end with a database back end. And obviously, we want to put something up there that will survive the database pod or container being destroyed, come back up again, and keep all the data. So it's a fairly simple example, but feel free to ask questions if anything comes up that you're curious about.

So let's go across here. I'm running this environment on GKE — it's actually a shared environment we have running as part of the company. The first thing we'll look at is storage classes. By default in a GKE environment, there's a single storage class. This is the definition used to say: I would like a GCE disk attached to my pods. Behind the scenes, this uses the service accounts in the GCE environment, which have the credentials to create the disks. You can see that what it's creating is "pd-standard" — a standard persistent disk, so not an SSD in this case. You can add SSD classes if you wish, but by default it creates a standard GCE persistent disk, and that's the default storage provider for all of this. Can everyone see that, by the way?
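The default GKE storage class being shown here looks roughly like this (the annotation key shown is the beta form common in the 1.7 era; later clusters use `storageclass.kubernetes.io/is-default-class`):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # marks this class as the default for claims that don't name one
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard   # swap for pd-ssd to provision SSD-backed disks
```

Any PersistentVolumeClaim that doesn't specify a class gets a `pd-standard` disk provisioned through this definition.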
Is it too small, too large? No? Easy to read. Okay. So what we're going to do — I'm going to lean on Helm a little here. For those who don't know Helm, it's essentially a package manager for Kubernetes. It lets you create a templated bundle of Kubernetes resources that's used to instantiate different releases. In this case, I'm going to launch WordPress, setting a few configuration options: I'm creating it in a namespace called kubesg; I'm giving it a password — not exactly secure in this case, but this is just a demo, so I recommend you don't use it in a production environment — and I'm giving the blog a customised name.

So behind the scenes, this is creating the pods. You can see here — it's actually behind me, so it's probably not the best place to stand. I'll cancel out of that one and we'll run kubectl get pods. In that, we can see a MariaDB database coming up along with WordPress. You'll notice there's a zero against WordPress — it's not up yet. What it's doing is making sure the database is happy behind the scenes and running any database migrations. And now you can see it's come up: WordPress is available. And we can check whether our service is available. We're adding a load balancer in this case, so it'll be publicly accessible, and you can see that WordPress has been assigned an external IP on the Google load balancer. So it's accessible publicly — which is why we put a password on it, so it doesn't get accessed. Now, we can see that the configuration I put in there — Kubernetes SG Blog — is there, and we have a very, very basic WordPress blog. What we're going to do is log in and customise it, with the username and password we set.
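The configuration options being passed to Helm here could equally be written as a values file — the keys below follow the conventions of the stable WordPress chart of that era, but treat the exact names as illustrative:

```yaml
# values.yaml — illustrative overrides for a WordPress Helm chart
wordpressUsername: user
wordpressPassword: not-a-secure-password   # demo only, as noted in the talk
wordpressBlogName: "Kubernetes SG Blog"
mariadb:
  mariadbRootPassword: not-a-secure-password
persistence:
  size: 10Gi        # the WordPress uploads volume mentioned later
```

A file like this would be supplied with `-f values.yaml` instead of repeating `--set` flags on the command line.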
So this is the good old WordPress dashboard. What do we write? "Happy birthday, Kubernetes." Fairly simple, and we'll publish. We can go back and see that it's actually visible there.

Now, what we're going to do is the worst of all possible things — but first, actually, let's go and have a look at the resources; let's see what's been created behind the scenes and how these things are configured. Let's have a look at MariaDB. I'm describing it as a pod, in the namespace kubesg. What you see here is a whole load of fairly unreadable output, at least at this screen size, but if we scroll back up, we can see a number of things in the description of the MariaDB pod. It's running an init container, which sets things up initially — in this case, the configuration used for MySQL, or MariaDB. And you can see it's got a bunch of things here called mounts. Mounts are particular locations on the file system, as I mentioned, that are exposed for storage. If anyone's familiar with Docker, they're exactly the same as the container volumes available in the configuration there. There are a few other things as well: there's a service account that's always injected, at least into default containers, which provides a way of accessing the Kubernetes API. But in this case, the most important thing is the mount for data from "bitnami-mariadb". The container itself is made by Bitnami, because they make some really handy little containers we can use for this stuff, and that's where the MariaDB data is actually stored. As we scroll down, we can see a variety of different volumes. "data" is the interesting one — that's where things are being stored, via a persistent volume claim.
So that's referencing a persistent volume claim, and the YAML in that claim is what's used for creating a persistent disk on Google Cloud Platform. Alongside that, there's a config map, and there's also a token associated with it. Now, if we go in and have a look at the YAML file, for those people who are curious, we can actually see what it's doing there. This is a little bit harder to read, but this is where the rubber hits the road, and this is actually where you specify things. Here you can see that it's really just specifying a persistent volume claim, and it's referencing a claim called kubesg-wp-mariadb. Now, if we go and have a look at kubectl get pvc in our namespace, we see that there are actually two persistent volume claims here. Both of them are bound. One's used on the WordPress side of things, for if you're uploading HTML content or images or anything else like that, alongside one which is the MariaDB database. Now, that's pointing towards a PV volume; you can see here a really nice GUID that's referencing something else. It's also telling us the capacity. I'm sorry, I actually realise that it's all the way down there. Sorry, Hunter, so did you set up any of this beforehand, or did you just run the Helm script? Yeah, so I just ran the Helm script, and it created all of these things. Including the storage? Including the storage. So it's actually gone and provisioned eight gigabytes of storage in the case of the top one there, and also ten gig; I don't know why the configuration by default is ten gigs for the WordPress side of things. But as for the database itself, it's eight gig capacity. It's ReadWriteOnce, so there's no shared access to that. And it's using the standard storage class. So from that perspective, in the case of, for example, Docker, you manually mount your volume. Yes.
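Before the question continues, here is roughly what that rendered PersistentVolumeClaim looks like. This is a sketch reconstructed from the description in the talk (8Gi, ReadWriteOnce, the `standard` storage class on GKE); the claim name is an assumption based on the release and namespace names.

```yaml
# Approximate PVC as rendered by the chart; a controller sees this,
# dynamically provisions a GCE persistent disk, and binds it to the claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubesg-wp-mariadb     # illustrative; matches the claim named in the demo
  namespace: kubesg
spec:
  accessModes:
    - ReadWriteOnce           # single-node read/write, no shared access
  resources:
    requests:
      storage: 8Gi            # the 8 GB provisioned for the database
  storageClassName: standard  # GKE's default GCE-PD storage class
```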
You know which volume you want to mount inside the container. Yes. So in this case, if I'm running on the hard drive of one of the boxes, for example, how do I specify which volume actually persists in the cluster? Which node has this volume mounted and which doesn't? Yeah, so that's where I talked a little bit about host paths. That's the equivalent of taking a Docker container and running it with a volume mount onto the host file system. Now, as I mentioned, the problem is that it doesn't consider anything about scheduling. So if the pod gets deleted and launched again, it doesn't necessarily come back on the same host. Now, what's come as part of Kubernetes 1.7 is a new feature called local storage. That's a way of specifying a persistent volume that only lives on a local disk, but that also takes scheduling into account, to make sure that the next time the pod comes back, it gets scheduled on the same node again. So that's useful for cases where you're actually launching your storage cluster inside your Kubernetes cluster. In the case of what we do with Ceph, just as an aside, we have the actual storage daemons that store the data running inside Kubernetes, so it's kind of a bit like Inception there as well. They talk to a local host path, and the scheduler makes sure that the next time they come up, they get rescheduled correctly. In this case, even if you run Ceph or NFS or any other file system, it's still going to be mounted inside the physical or VM box, right? So how does Kubernetes know what is already mounted there if you don't specify it up front? If you don't specify it up front, you as an administrator need to know that. Now, in the case of storage providers like NFS, Kubernetes can actually reach out to an NFS server, get the exports, and attach those sorts of things there.
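The local-storage feature just mentioned can be sketched as a PersistentVolume pinned to one node. Caveat: local volumes were alpha in 1.7 and were configured through an annotation at the time; the fields below follow the later stable API, so treat this as an illustration of the idea rather than the exact 1.7 syntax. Paths and node names are made up.

```yaml
# A PersistentVolume backed by a disk that exists only on one node.
# The nodeAffinity stanza is what makes the scheduler bring a pod
# using this volume back onto the same node after a restart.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /mnt/disks/ssd1        # present only on node-1
  nodeAffinity:                  # unlike hostPath, scheduling is volume-aware
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: [node-1]
```

This is the difference from a plain hostPath volume, which mounts a host directory but gives the scheduler no reason to place the pod back on the node that actually holds the data.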
If there's no API interface for your storage layer, then it becomes a bit more difficult, because there's no way of programmatically saying, at least for Kubernetes to say: give me an eight-gig storage volume, I want to use it, I want to attach it, here's a credential to access it so that no one else is accessing the same data, all those sorts of things. So it does become a little tricky, especially in the case of Ceph or other environments where you have to manage locking, because you need the central master controller saying: well, this volume is not being used anymore, the node crashed, I need to remove the lock, I need to associate it with another environment, all those sorts of things. Now, the built-in providers will typically handle those things, but if you're using something that is not natively supported, you can create a flex volume to do those sorts of things; it's just up to you to manage your storage provider layer. Make sense? Yeah? OK, so to quickly go back to James's point, these volumes didn't exist before, and you can actually have a look. I'll quickly go into gcloud, if I can remember what it is. You can see we've got a few different disks stored here, but one of these things, or two of these things, were magically created at some point when I ran that Helm chart behind the scenes. So there's probably one there; there's eight gigs at the top there that's for WordPress. But going forward, let's actually cause some mayhem. I don't want to do that; I want to go and get the name of the pod and do the nasty thing of kubectl delete pod, in the namespace that we're working in. So that's actually deleted there. There we go. And watch it come up again: get pods in the namespace, and we'll watch it. You can see here that there are a few things going on. Apologies for it being so low on the screen there.
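The "mayhem" step above comes down to a couple of commands, sketched here with an illustrative pod name; the `gcloud` call is the one used to confirm that the Helm chart had dynamically provisioned real GCE disks.

```shell
# List the GCE disks; two of these were created by the chart's PVCs.
gcloud compute disks list

# Delete the MariaDB pod (pod name is illustrative) and watch the
# controller schedule a replacement that re-attaches the same disk.
kubectl delete pod kubesg-wp-mariadb-0 --namespace kubesg
kubectl get pods --namespace kubesg --watch
```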
But in the case here, there's actually a pod that's terminating: that one's the old MariaDB. There's a new one being launched that's coming up here; it hasn't become ready yet. But you can see that the old MariaDB is still terminating, and WordPress has also gone down. Now, there's a reason for that, because of what the liveness and readiness checks in Kubernetes are doing: they ping the front page, or the login page, of WordPress. Our database has gone down, so the pod gets marked out of the load balancer, and it's not going to be routable in your environment. Now, if I was running multiple WordPress instances, I'd need to be running multiple MariaDB back ends, or more of a high-availability database. In those sorts of cases, these things would still remain running, rather than a two-node or two-pod environment running one front end and one back end, where in this case it goes down. But in the time that I've been talking, in the 40 seconds, MariaDB has come back up again, and WordPress is now healthy again. So we can go back and see if I'm telling lies about the data being stored correctly and safely. I've just refreshed the blog, and we're back again. So I can go in there and create a new blog post and publish it. So it's not a complete fib: there it is on the front page. So that's essentially safe data, stored in this case on Google's persistent disk, and it's the way that things are typically done for stateful applications in Kubernetes. So fairly simple, fairly straightforward. Does anyone have any questions about this? Yeah, I have a lot of questions. In this case, everything was represented as EBS-style block storage or a mounted volume, right? So for object storage, let's say we have, for example, AWS S3, right?
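The health-check behaviour described a moment ago (WordPress dropping out of the load balancer while its database is down) comes from probes like the following. This is a hedged sketch: the path, port name, and timings are assumptions in the style of the Bitnami WordPress chart, not values taken from the demo.

```yaml
# Probes on the WordPress container. When the readiness probe fails
# (e.g. because MariaDB is unreachable), the pod is removed from the
# Service endpoints and so from the load balancer, without being killed.
readinessProbe:
  httpGet:
    path: /wp-login.php   # illustrative; the talk says "the login page"
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10
livenessProbe:            # a failing liveness probe restarts the container
  httpGet:
    path: /wp-login.php
    port: http
  initialDelaySeconds: 120
  periodSeconds: 10
```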
So if my data is in the object store, I'll give you an example: a use case would be machine learning, where all the training data is stored in S3 and I want to reuse it. Is there any driver, for example, that can handle that over the network? So it's typically not done that way. Object storage is a bit of a funny case. Because it's an HTTP interface, unless you really want to mount it as a FUSE file system or find another way of putting it into the operating system itself, the way you'd typically access your object storage is via the application itself, making requests over the network and pulling data down as needed. Now, if you want to be running HDFS or one of those sorts of things for doing big data workloads, you can use it in a Kubernetes environment: you can put it on local disk, you can put it on persistent disk storage or anything else, and create yourself a high-availability HDFS cluster. But in terms of object storage, it's a better use case to have the application doing all of its own work, because then you can have caching and a number of other things that go into whichever particular machine learning tools you're using. So, anyone else? Yeah? Hello, my name is Drogesh. So when we're talking about that, say we're going to have multiple database servers, MySQL servers: what happens if you have two servers and one disk goes down? How, in that case, will our data sync up? So that's really dependent on your database. For MySQL or for Postgres or anything else, if you've set up master-slave replication, that would keep working the same way. The disks themselves will keep the data safe; one of the servers will go down, and you'll need to make sure that your logic for promoting a new master is in place, with a proxy in front. But you don't have anything like that built in? We have to manage it all on our own? That's right.
And that's why I said: whether you're running in Kubernetes, or in a virtual machine, or anything else, you need to understand how your database scales, and you need to understand how to operate it for high availability. That doesn't go away. You just need to make sure that however it fails over, you're handling that failover as you would in any other environment. So, yeah. Any other questions? Thank you, Hunter. I didn't know that.
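As a footnote to the object-storage question earlier: the pattern suggested in the answer, having the application fetch objects over the network instead of mounting the store, looks like this in its simplest form. The bucket and key names are invented for illustration.

```shell
# Pull training data from S3 at application start-up, rather than
# trying to expose the bucket to the pod as a mounted file system.
# (Bucket, key, and local path are all hypothetical.)
aws s3 cp s3://example-training-data/dataset.tar.gz /tmp/dataset.tar.gz
tar -xzf /tmp/dataset.tar.gz -C /tmp/dataset
```

In a Kubernetes pod this would typically run as an init container or in the application's entrypoint, with credentials supplied via a Secret.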