All right, it's on the hour, so take it away. — Yay. All right, thanks guys. I'm going to share my screen and go into present mode on my laptop, so I can't see everybody. I think it's best to wait for questions at the end, otherwise it might get confusing given the nature of webinars, so I'll take questions then; some of them I probably won't be able to answer. I've been working on our big event coming up in March, Cloud Next, so I haven't spent much time on Kubernetes in the last couple of months, but I have about three years of experience working with Kubernetes, since it first came out. This talk is basically an end-to-end view of Kubernetes, specifically in the context of building cloud native applications. If you're building with something like Kubernetes you're almost automatically building cloud native applications — not quite necessarily; the idea of cloud native is building specifically for the cloud, as opposed to taking traditional infrastructure and lifting and shifting it into the cloud. That's what we're going to be talking about today. If any issues come your way, please stop me and I'll address them; otherwise I'm just going to plough on. As I say, it's going to be hard to be interactive because I can't see anybody and I'll be talking to my screen — I often put up a picture of a crowd of people that I can see and talk to instead, which helps a little bit, but I won't be able to do that today.

Okay, so I always like to provide a little bit of background — quite a bit of background, actually — into why we're doing this thing: why Kubernetes, why is it a thing, and why do we care. I have to do that in the context of Google, because that's really where it came out of, and I think other companies have been waking up to the same realisation over the last decade or so: traditional infrastructure just doesn't cut it when it comes to scale. Back in April 1999 we found ourselves growing very, very rapidly, up to about 500,000 queries per day in just five months, and that kind of rate of growth is really hard to sustain with traditional infra. So we went through various different iterations of things, including starting to run our own data centres, and at some point we ended up with plans for something like this — this is one of our bigger data centres, in Council Bluffs, Iowa. We hadn't quite reached this scale yet, but we knew that when we got to this point it would be really impossible to deal with this kind of thing the way you normally would. There are thousands, tens of thousands, of machines in our clusters, and for a software engineer to have to deal with that kind of complexity — deciding where to run stuff — is really hard. So there were some things going on at the time, and what we wanted to be able to do, ultimately, was make it very simple for software engineers to access all of that infrastructure: stick a simple control plane in front of it, have software engineers deploy stuff to the control plane, and have it take care of all of the details. That's what we came up with, but everything wasn't in place to begin with. Containerization had been kicking off — chroots initially, and BSD jails; I used to work for Sun Microsystems when Solaris Zones were really popular, back in 2004 or so.
So containerization was really kicking in; hypervisors were around, virtualization and VMs and such were already there, but the full story for containers wasn't really there yet. The kinds of things that pulled all of that together into the modern application container were chroots, capabilities, and then ultimately namespaces and cgroups. Those last two particularly help very much with — do you guys know the Mr. Men? They're very popular in the UK; I'm not sure if you've seen them before, but I love them. This is Mr. Noisy. Cgroups and namespaces can help with noisy neighbours; they can also help with nosy neighbours, and with messy neighbours. All of this is about isolation — about making sure that you're not impacted by the work other people are doing — and that's important both in a multi-tenant environment and in a single-tenant environment.

Another part of the puzzle was the idea of images, and again this goes back a long time: statically linked binaries, things that carried all of their dependencies with them. By that time we were riding the crest of the wave of dynamic linking, and dynamic linking is great to a degree, but it has some real issues when it comes to dependencies and managing dependencies. The idea of a statically linked bundle allowed us to encapsulate all of the application environment into something we could move around anywhere. Combine the two — static, self-contained images plus Linux containers — and you now have the ability to isolate applications across the many, many different environments in which they run. That's where we got to: we did a lot of work on namespaces and cgroups, and that's ultimately what made the modern Linux container possible.

Then there's the way we actually run the containers. At this point we had two systems: we had the Global Work Queue and we had something called Babysitter. We would run batch workloads on the Global Work Queue, and Babysitter would run all of our production workloads — this is where terms like "nanny" came along: we have a nanny for each of our running processes that makes sure it runs and continues to run. There are some problems with this model as well, and I'm not going to go into too much detail, but I'll give you some very quick comparisons. What we decided to do was combine those two together into what is known as a Borg cell. Borg is our cluster manager — effectively the orchestration component, and our initial equivalent of what Kubernetes is today. So we combined all of this together, and we now run our batch workloads and our production workloads on the same machines, in the same cells, in the same clusters, in the same data centres. That's interesting, because it doesn't necessarily sound particularly optimal, but it is, and there are various reasons for that. The first is that — and this seems like a no-brainer for the most part — we run multiple applications per machine. We're not just carving out dedicated virtual machine slices for specific applications; we're running multiple applications per machine, and the sweet spot, around the 50th percentile on this CDF here, is about nine applications per machine. And how does sharing cells between prod and batch help? Well, if we look at our non-prod workloads, which is the light blue bar at the top, and our production workloads, we can compress them
and compact them down to the minimum number of machines we need to run them. If we did that with them kept separate, it would look like this: we'd have that grey bar at the top, with a little bit of overhead left over after we compressed them. (By the way, if you want to reach me on Twitter — I'm at techgirl — you can ask me questions that way; you can also ask questions via Stan directly.) So we have these non-prod-only and prod-only workloads compacted. If we were to combine them and run them both on a shared cell, we get much, much better compaction: now we're running non-production and production workloads in the same cell, and you can see we've compacted things down significantly. This effectively saves us around 25 percent of the machines we'd otherwise run, and that's an important differentiator for why we run prod and batch together on the same machines. The next graphic — I haven't updated this slide for a while — shows exactly what the difference in overhead is between the separated model and this shared model: around 25 percent. So that's the benefit we get from it, and this is all managed by Borg, our cluster scheduler; the ability to make this work comes from that particular piece of technology we created.

Another thing is that we overcommit significantly. If you look at the squiggly line at the bottom, that's the actual resource consumption of running processes. When our engineers schedule work onto our machines via Borg, they have to specify how much resource they believe that process is going to need — that's what's called the limit — and as you can see, the limit is fairly optimistic: it's much higher than the resource actually used, because obviously they want significant headroom in case of spikes in traffic. So the limit is always going to be higher than the amount of resource being used. What we've been able to do over the last ten years or so with Borg is very, very accurately predict, or estimate, the future usage of all of the running tasks within a cell, so we calculate a thing called a reservation on an ongoing basis — that's the blue line there. As you can see, it encompasses all of the running workloads; it may sometimes be off, and we can deal with that — we're overcommitting generally — but that line represents what we think we're going to use. Now, the gap between the dark blue line at the top and the light blue reservation line is potentially reusable: we can reuse that resource for other things, and what we use it for is batch jobs — jobs of very low priority, things that can be preempted.

That's represented in this graphic here, which shows the reasons why tasks have been evicted from a machine within a cell. I'm not going to go into too much detail about some of these things, but it does show exactly what's going on. If you look at prod, the top bar, we have a very, very low eviction rate, and most of those evictions are scheduled machine shutdowns — we're going to be rescheduling those tasks anyway because we're taking the machine down for maintenance. Sometimes they get preempted — obviously there are production tasks that are lower priority than other production tasks, and sometimes they may get taken out — but it's very rare. If you look at the bottom bar, you'll see the
non-production workloads, where we have a significant amount of preemption, and that's because they're all running in that green area, which may be needed at any time by the actual production workloads. There are also the other bars at the end — I won't go into those details — but you can see that we preempt non-production workloads quite frequently.

The other factor we really care about is the ability to bin pack. This means that when we take those nine tasks, we're effectively using all of the resources on a specific cluster node. If you look at this picture, you'll see running virtual machines within our cluster — these are virtual machines on Google Cloud Platform that have been made available to users, but they appear just as tasks running on Borg. At the top you can see CPU usage, and at the bottom memory usage. The big black bar in the middle is one machine. If you look to the left, at the orange, you'll see that for one machine, where we have spare CPU we also have spare memory, which is great because we can schedule stuff there. If you look at the purple — I'm not sure why it's purple, it should really be red — you can see machines where CPU or memory is being used with no corresponding amount of the other: where we have CPU available we have no memory available, and where we have memory available we have no CPU available, which means we can't schedule any work there. So what we need to do is effectively bin pack to make those bars match on both sides, so that where we have CPU available we also have memory available, and we have advanced bin-packing algorithms that calculate that. All of this information I've just given you is available in our Borg white paper; the link is on the next page.

So for Borg, this is why we do all this stuff and why we have Kubernetes. Efficiency comes from the various different things we care about: scavenging unused allocations, overcommitting, prioritization, sharing resources, very, very smart scheduling, and bin packing. We take an application-centric view of the world, not a machine-centric one — it just works better for us. We launch over two billion containers per week at Google on our infrastructure. And that's the Borg white paper; it's a long read — I think I've summarized it reasonably well, but there's a lot of good stuff in there.

All right, so the conclusion is that we saw we would need something like Borg — to go back to that picture where we have a control plane in front of our data centres — and we would need something like containers to make Borg possible, and so we helped build Linux containers. This is a bit of an inversion — my lights have gone off — because really you need something like Kubernetes in order to make containers practical: containers have taken off in a big way, and at some point you get into a situation, which I'll show in a minute, where you have many containers, and having something like Kubernetes to manage your containers for you is vital.

So let's do a quick recap of containers. I'm not going to spend much time on this; you're probably all aware of what containers are by now. Pardon me, my voice is going — I'm not sure where you all are in the world; I'm in San Francisco and it's very
early in the morning here. All right. So containers: generally lightweight; hermetically sealed — not a phrase I particularly like, but it does match the idea of a container; isolated; very easy to deploy and move around; introspectable — you can look inside them and see what they're doing; and runnable, runnable anywhere, in many different environments, so they effectively look like Linux processes. The impact is that they improve the overall developer experience, they very much help with reuse, and they simplify operations for what today we're calling cloud native applications. This is represented by things like LXC, rkt, and Docker — and I'm not really allowed to use the Docker logo anymore, so that's my version of it.

Okay, so where we get to is that containers are awesome and we want to run lots of them, and that's when the inversion problem occurs: you need something like Kubernetes to manage them for you — the exact inversion of what we had initially. You want lots of containers, and Kubernetes is there to provide this abstraction on top of something. Now, what that something is can be different depending on your needs. Again we have this notion of a control plane, in this case an API, which is represented by Kubernetes, and on the right-hand side you have things like data centres — like our data centre in Council Bluffs, Iowa — or potentially a Kubernetes cluster on a bunch of Raspberry Pis all stacked together (that one's just there for fun; we helped build it). It could be Google Cloud Platform, it could be Amazon Web Services, it could also be HP, Dell, or IBM on-premise machines, that kind of stuff. The abstraction works on pretty much all of those environments, which is extremely powerful, and with cluster federation, which we'll mention briefly at the end, you can potentially mix and match those together — effectively a control plane in front of the control planes, to create a federated cluster.

So, Kubernetes. I'm just going to check the time — okay, I hope I'm not going too fast; I'm going to take a breather. Right. Kubernetes — I've always pronounced it my own way while everybody else says it differently; I learned it that way and I've always said it that way, only to find out I was wrong, which is pretty scary considering I've been using it for three years. I like this graphic — it's from the docs — because I like the notion of an ocean of user containers, and the idea that we take that ocean of user containers and schedule it, packing it dynamically onto nodes, very much like Borg, which we've just described. I think that's an important summary of what Kubernetes gives us. We're not going to talk much about the architecture of Kubernetes in this slide deck — I think I've left that out — but we're basically looking at a master-and-nodes system, where nodes can run anywhere in a cluster: on-premise, in the cloud, or ultimately on a pile of Raspberry Pis. So I'm not going to use this graphic to explain Kubernetes; let's get into the details.

So what does Kubernetes give us? It helps decide where our containers should run: when you have a container, all you have to do is tell Kubernetes "I want to run this, please run it for me" — and maybe you'll provide other information about it and hook it up to other components and other abstractions that are running within Kubernetes as well. It manages the lifecycle and health
of my containers — we'll talk about the actual runtime artifacts that come out of scheduling shortly — and it will keep containers running despite failures, for some notion of "containers": containers are generally ephemeral and replaceable, and making sure that we have the right number of containers running at any given time is what Kubernetes will do for us. Also scaling: we want to be able to make sets of containers larger or smaller depending on traffic or anticipated traffic. We want to be able to find containers — whether they represent services, components of a microservices architecture, or just a monolithic service in a container — so we need to be able to name them and to discover them, and believe me, people still build monoliths in containers today. Load balancing: we want to distribute traffic across some set of containers running somewhere on our cluster infrastructure, and potentially even across a federated cluster today. We also want to provide storage volumes to our containers, because everything needs to store data; we want to log and monitor what's happening within containers; and Kubernetes also provides debugging and introspection, so we can look at containers, see what they're doing, attach to them, and debug them as needed. The final part is making sure that only the people we want to be able to do things with our containers can do those things, via identity and authorization. Excuse me, I'm coughing into the microphone.

So, that last part of the quick recap: Kubernetes is one of — at the time — three legs of cloud native; I haven't updated that, and there's much more to cloud native now. You can go to cncf.io to find out more about the Cloud Native Computing Foundation, but basically we've talked about the architectures here: container-packaged apps and microservice architectures. It's been around for almost three years now, GA since 2015; 1.6 is due at the end of March, and we're currently at version 1.5. Less than half the code is now written by Google — I couldn't find the latest version of the graphic, but you can see there it's 44 percent for Google — and it's stewarded by the Cloud Native Computing Foundation, not by Google, which I think is important as well if you care about independence. Kubernetes is also very stable. It comes from all of those learnings we talked about earlier: over ten years now of production experience — our mistakes, the things we've learned over the years with Borg — are represented in Kubernetes. I often describe Borg as something we've built up over time — imagine a house with lots of extensions all over the place, going backwards and forwards and up high. It's very solid, it does what it needs to do and it does it extremely well, but at times you want to tear the whole thing down and start again, and Kubernetes is kind of a reimagining of what Borg does, externally from Google. We're unlikely to port to Kubernetes anytime soon, because we have so much dependency on Borg, but if we were going to build something from scratch outside of Borg, this is effectively what Kubernetes is. With v1 we're guaranteeing that no breaking changes will be introduced until version two, so that's important for everybody that uses it. We have different tracks for introducing new features, both alpha and beta; we have significant feature sets currently in beta, and obviously the GA track guarantees certain
levels of stability. We do a lot of testing and hardening through the community — we help with that at Google as well — and there's lots of new work taking place outside of the core commit process of Kubernetes; most of it is accessible from the GitHub project, where you can find out more about all of this ongoing work. Kubernetes also has a very solid core, and we're going to talk about some of these primitives soon — not all of them. PetSets is mentioned there; sorry, that should now be StatefulSets, more on that shortly. My voice is definitely struggling here. So these are the core concepts. I don't like talking through bullet points about individual things that may be complicated to understand, but we also have the Kubernetes ecosystem, which is extremely healthy and is supported by very large companies: different distros, different cloud providers, different platform-as-a-service offerings, continuous deployment offerings, various package managers, monitoring, networking, storage and appliances. This may not be completely up to date, but it's going to be more rather than less. We also have great momentum: as of this morning, 43,765 commits to the GitHub repository and 1,078 contributors, including many important companies listed below that number.

So with Kubernetes we go back to a picture that looks remarkably similar to what we saw earlier with Borg, and to our initial abstraction of what Kubernetes would look like. Our developers access a Kubernetes cluster via a control plane, which is the API. Somewhere within the cluster we have machines; as a developer we don't care what those machines, those nodes, are — that's really only something the control plane, which in this case is the Kubernetes master and scheduler, needs to know about. So we have these servers, we have the control plane, and developers can access it either via the API, which is extremely common and can be made use of by many third-party tools as well; via the CLI — the kubectl command, and other commands now; or via the user interface, which is still very much a work in progress. Many people like to build their own user interfaces in front of Kubernetes; that gives them a lot of flexibility and allows them to integrate Kubernetes, via CI/CD, into their own workflows, so the dashboard, the user interface, has never really been a huge priority for Kubernetes.

Ultimately, to get started with Kubernetes you start with a cluster — you need some machines to run this on. That could be as simple as a laptop; you can run Kubernetes today on a laptop, or you can run it in a high-availability, multi-node cluster. It can be hosted, or you can manage it yourself; an example of a hosted offering is Google Container Engine, and there are other distros that provide hosted offerings of Kubernetes as well. If you manage it yourself it could be on-premise or in the cloud, as we've already established, and it could be on bare metal or on virtual machines, supported by most operating systems — or you can run it on just a bunch of Raspberry Pis, which is what we did at Devoxx last year with Quintor, from Amersfoort near Amsterdam. There are many, many different options; there's a matrix down there, a link — I've not actually checked whether it's still updated, but I'm sure it probably is. I'll share these slides with you afterwards.
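Whichever way the cluster is set up, all of those access paths ultimately go through the API. Here is a minimal sketch of talking to a cluster programmatically, assuming the official Kubernetes Python client and an existing kubeconfig (for example, one written by Container Engine or kubeadm); the equivalent kubectl commands are noted in the comments.

```python
# Minimal sketch: talk to the Kubernetes API the same way kubectl does.
# Assumes the official Python client (`pip install kubernetes`) and a
# kubeconfig pointing at an existing cluster.
from kubernetes import client, config

def main():
    # Load the cluster address and credentials from ~/.kube/config.
    config.load_kube_config()

    v1 = client.CoreV1Api()

    # Roughly equivalent to `kubectl get nodes`.
    for node in v1.list_node().items:
        print("node:", node.metadata.name)

    # Roughly equivalent to `kubectl get pods --all-namespaces`.
    for pod in v1.list_pod_for_all_namespaces().items:
        print("pod:", pod.metadata.namespace, pod.metadata.name, pod.status.phase)

if __name__ == "__main__":
    main()
```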
Setting up a cluster: there are various different ways of getting started. If you want on-prem, we've recently introduced the kubeadm command, which can be used to configure masters and nodes in an on-premise environment. There's also a thing called kops — I haven't looked at it for a while to see what kind of traction it's getting — which was designed specifically to run Kubernetes on AWS; the plan was for it not to be limited to AWS, but it simplifies deployment and getting up and running there, because on some environments the networking is all taken care of for you, whereas on AWS you have to run things like Flannel to run Kubernetes, and kops takes care of that for you. You can choose a platform — Google Cloud Platform, Azure, Ubuntu Juju — and just run it from the command line: you pull down the script (I wouldn't recommend the curl-the-script approach anymore; I don't think people do this anymore, though it may still be in the docs like this), specify your Kubernetes provider, run the script, and that will install Kubernetes for you in your environment of choice. You can choose a distro like Red Hat Atomic, Tectonic, Mirantis Murano, or Mesos, or you can build it from scratch — there's a resource called "Kubernetes the Hard Way"; one of our guys, Kelsey Hightower, has a walkthrough of how to build Kubernetes completely from scratch.

Let's get into the abstractions of Kubernetes, and the first one is a pod — a pod of containers. We don't actually schedule containers directly; we have this notion of a thing called a pod, and it's probably the hardest thing to get your head around. It's fairly simple, but the biggest question is normally: why a pod? Why do we need this thing, why can't we just schedule containers? This is really a mismatch between the way Docker works and the way we need the infrastructure to work — when we look at networking shortly, there are some things we require that were not available in Docker when Docker came along — and so we have this abstraction called a pod. A pod looks pretty much like a virtual representation of a logical host — not a VM, but something that looks like a host and can run things like a host. It has its own IP address, it can run containers and mount volumes, and ultimately we can run multiple containers together inside a pod when there's a need to. The idea isn't to facilitate building monoliths, but often you'll find containers need to run side by side, and we have various different patterns for that, often called sidecar containers or adapter containers — things that offer value to the main running container. If we look at App Engine today, the App Engine flexible environment has an application container and a memcache container that runs side by side with it, and those containers live and die together. Pods also support liveness and readiness checks, and they have post-start and pre-stop lifecycle hooks that can help with creation and teardown.

Here's an example of where we'd run multiple containers within a pod. In this case we have a Node.js application container serving traffic to consumers via an abstraction we'll talk about shortly called a service, and we have another container running side by side within the pod that's monitoring a Git repo, and whenever pushes are made to the Git repo, the git-sync container sees those
pushes, pulls down the content, and stores it on a volume — which we'll talk about shortly as well — and that content can now be updated and served by the Node.js application container. So here we have a lifecycle dependency: those things need to live together, they're matched together, and when one goes away the other can go away as well, so it makes sense to run them inside a pod. Pods simplify networking for us too: the containers in a pod effectively share port namespaces and IPC namespaces, and they can talk to each other through localhost, just as if they were running on a host machine. You can imagine building this machine physically, mapping it to a pod, and then pushing it out to Kubernetes to say "this is what that machine will look like" — a virtual representation of it, now running as a pod.

Pod networking is an important concept for us, because there are some requirements we have — and this is where some of the complexity of Kubernetes comes in, which is generally going away with things like kops — whereby we need a network layer that gives pods their own routable IP addresses, unique within the cluster. That means we can access each pod individually by IP address, or ultimately by a DNS lookup, which is extremely powerful. These blue boxes here are nodes, and we have other dark blue services running within the node, but the pod is scheduled on the node and has its own IP address, and pods can expose the same port. In the middle example we have 10.1.1.211 and 10.1.1.2: they could both expose the same port, they're both running on the same host, and because of the networking overlay that we use, that's perfectly okay — that's not a problem. We can schedule multiple pods of the same type, each exposing the same port, on a single node, which is extremely powerful — no brokering of port numbers at all — and pods can reach each other without NAT, even when they're on different nodes. There are many different solutions for this, and again I'll probably need to update this list a little: you can use things like Flannel, Weave, Calico, or Open vSwitch, or you can make use of your cloud provider, which is what Google Container Engine does — and when you're running Kubernetes outside of Container Engine on Google Cloud Platform virtual machines you get this for free, because it's supported by our networking infrastructure, which makes it very simple to use. So that's how pods work, and the networking is one of the reasons why we have pods.

Ultimately, pods are built from some kind of spec template — something that can be recreated. They're completely fungible, they can be replaced, they're completely ephemeral, but they're functionally identical, and we just build a pod from a spec template that we created; we can have as many copies, as many replicas, of that as we like across the cluster.
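As a rough sketch of the two-container pod just described — an app container plus a "git sync"-style sidecar sharing a volume — here is what that pod spec might look like with the Python client. The image names, repo URL, paths, and sidecar flags are illustrative assumptions, not the exact demo from the talk.

```python
# Sketch: a pod with an app container and a git-sync-style sidecar that share
# an emptyDir volume. All names, images, and flags are illustrative.
from kubernetes import client, config

config.load_kube_config()

shared = client.V1Volume(
    name="content",
    empty_dir=client.V1EmptyDirVolumeSource(),  # ephemeral; lives and dies with the pod
)

app = client.V1Container(
    name="web",
    image="example/node-app:1.0",               # hypothetical app image
    ports=[client.V1ContainerPort(container_port=8080)],
    volume_mounts=[client.V1VolumeMount(name="content", mount_path="/srv/content", read_only=True)],
)

sidecar = client.V1Container(
    name="git-sync",
    image="example/git-sync:latest",            # hypothetical sync image
    args=["--repo=https://github.com/example/site", "--dest=/srv/content"],
    volume_mounts=[client.V1VolumeMount(name="content", mount_path="/srv/content")],
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="web-with-sync", labels={"type": "fe"}),
    spec=client.V1PodSpec(containers=[app, sidecar], volumes=[shared]),
)

# Both containers share the pod's IP, localhost, and the "content" volume.
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```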
We'll get into more details and see some more examples, but one of the powerful, much-needed mechanisms in Kubernetes is a grouping mechanism called labels. We can group pods — and other abstractions as well — using these labels; effectively every resource within a Kubernetes cluster can be labelled in some way, including nodes, but the most powerful use of them is with pods, where we group pods together. Now, we mentioned that pods are created from a spec template: in this case these two pods are functionally identical, they're fungible, they're replaceable, and they both have the same label, which means we can now build other abstractions, or tooling, that say "I care about all of the pods with the label type=fe". In this case — in all cases — labels are key-value pairs; they have semantic meaning to you, but not to Kubernetes. For the most part, anyway: there are some labels that may have semantic meaning to Kubernetes, but mostly not. Here you'd say type=fe because you care about pods that are front-end pods, and you define that type; now we can build a dashboard of some kind that cares about all of those pods, making API calls to Kubernetes that say "give me all of the pods with label type=fe". Or we can have another dashboard, in this case for version=v2, and we've built another abstraction that cares about version=v2. Pods can have multiple labels, similar to this structure here — maybe there's some difference between the pod in the middle and the pod on the left: both have type=fe, but one is the later version; we'll look at canarying and other abstractions shortly. These things are queryable by selectors, and you can do that via the API, or Kubernetes can do it for you as well.

Then we have the notion of a replica set. A replica set manages a number of running pods that are functionally identical, built from the same spec template we mentioned earlier. (Hmm, there's something wrong with my picture there — anyway.) Basically, when we create pods we give them labels, and we also have this ability to create a replica set: it's an abstraction we bring up, we tell it how many pods it needs, and we provide it with a template, so it knows how to build pods from that template. In this case we're saying we want three of this template. When the replica set comes up, it looks out into the cluster to see if there are pods that match that label; if it finds them, it effectively starts managing their lifecycle for us, and if it doesn't find them, it creates them for us from the template. In this case we have three different pods with effectively two different label configurations — the replica set probably didn't create all three of them; it probably created the first two and is managing the last one, which has a slightly different label configuration — but effectively what we're saying to the replica set is "I want three of these, this is my template, please make sure there are always three of these running at any given time". It's effectively a control loop: it continues to monitor, it looks to see if there are three of these running; if it finds four running it takes one away, and if it finds two running it adds one, using the template to create it. It's also possible to update the replica set and then effectively tear down the pods and recreate new ones — that's a thing called a deployment, which we'll look at shortly — but ultimately these things make sure we have a desired state. We use that term quite regularly: desired state. This is the state of the pods that we want to have running, and the replica set manages it for us.
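To make the label and desired-state ideas concrete, here is a small sketch, again with the Python client and illustrative names: it queries pods by a label selector, the way a dashboard might, and then declares a ReplicaSet that keeps three copies of a template running. (In practice you would usually create a Deployment instead, as described next.)

```python
# Sketch: query by label selector, then declare desired state with a ReplicaSet.
# Names, images, and labels are illustrative.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
apps = client.AppsV1Api()

# "Give me all of the pods with label type=fe" -- what a dashboard might do.
for pod in core.list_namespaced_pod(namespace="default", label_selector="type=fe").items:
    print(pod.metadata.name, pod.metadata.labels)

# Desired state: always keep three pods matching type=fe running.
template = client.V1PodTemplateSpec(
    metadata=client.V1ObjectMeta(labels={"type": "fe", "version": "v1"}),
    spec=client.V1PodSpec(containers=[
        client.V1Container(name="web", image="example/frontend:v1",
                           ports=[client.V1ContainerPort(container_port=8080)])
    ]),
)

rs = client.V1ReplicaSet(
    metadata=client.V1ObjectMeta(name="fe-v1"),
    spec=client.V1ReplicaSetSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"type": "fe"}),
        template=template,
    ),
)

apps.create_namespaced_replica_set(namespace="default", body=rs)
```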
Deployments effectively take that one step further. What we used to be able to do was provide some kind of rolling update via a controller: we would have a replica set managing a bunch of pods — in those days it was called a replication controller — and when we wanted to update each of those pods to a new version, that involved creating a new replica set and basically tearing down one pod on the old version and bringing up one pod on the new version until we had replaced them completely. But that process was driven entirely from the client side, from the kubectl command-line tool: it did all of that work by issuing API calls, and it was completely interruptible, which means that if you Ctrl-C'd it or something went wrong, it would leave the entire configuration in an unknown state — we might have some number of new pods and some number of old pods; we might even have the wrong number of pods. So what we did was move that inside the cluster, via a deployment. Deployments effectively offer updates as a service: we say how many pods we want to run, we provide a template, and we ask the deployment to manage that via a replica set. Now we can roll out updates to the deployment — "hey, I've updated my configuration, here's a new template", or "I want more of these, please do this for me" — and the deployment, inside the cluster, will take care of those changes. I think I have a demo of this, but I'm not sure we'll get to it. Deployments can manage changes such as rolling updates, they can do all the scaling for us, and we can actually edit a configuration on the command line — we can issue a command like kubectl edit, which pulls down the configuration of the deployment and lets us edit it; once we apply the changes they're pushed out to the cluster, and it enacts them by updating the pods within the deployment. We can manage rollouts and rollbacks as well. It's a much more powerful mechanism, and it's completely managed on the cluster.

Then we have the abstraction of services. (Okay, my slide's animation is a bit broken.) Pods can run anywhere within the cluster, pods are ephemeral, their IPs are not stable, and we may have many replicas of the same pod running within the cluster. That's interesting, but how does it help us? How do we get access to them, and how do pods find each other? We may have multiple microservices deployed within the cluster, and we may need to come in externally to find those pods — how does that work, and how do we route traffic to them, ultimately? That's done through something called a service. We take the pod abstraction — the pods have labels, in this case app=backend — and we stick another abstraction in front of them called a service, which cares about pods with the label app=backend (sorry, the slide says something slightly different; I need to update it). Then, via a virtual IP provided by that service, other pods can find the service and access those pods, external services can do the same, and we can route traffic to them, all via that virtual IP. The virtual IP gets a DNS entry as well, so we can discover it via DNS. This effectively aggregates all those pods into a service — it may be a microservice, it may be a one-off service that we're running across the cluster — but this is the mechanism we use to communicate with running pods within the cluster.
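Here is a sketch of such a service with the Python client: a stable virtual IP and DNS name in front of all pods labelled app=backend. The names and ports are illustrative assumptions.

```python
# Sketch: a Service that aggregates pods by label. Names and ports are illustrative.
from kubernetes import client, config

config.load_kube_config()

svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="backend"),
    spec=client.V1ServiceSpec(
        selector={"app": "backend"},                        # aggregate pods by label
        ports=[client.V1ServicePort(port=80, target_port=8080)],
        # type defaults to ClusterIP: a stable virtual IP inside the cluster,
        # discoverable through cluster DNS as "backend" within the namespace.
    ),
)

client.CoreV1Api().create_namespaced_service(namespace="default", body=svc)
```

Setting the spec's `type` to `"NodePort"` or `"LoadBalancer"` gives the externally reachable variants discussed next.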
The choice of pod for routing is currently random — it used to be round robin — but we do support session affinity via client IP, and the virtual IP address is stable, as is the DNS name. Okay, so that's what a service does. We also have another version of a service called a NodePort, which, rather than having to worry about where the service is running and routing traffic to it, lets us expose a port on every single node within the cluster; that then becomes our service, and it effectively routes traffic to any running pod within the cluster. So we don't need to do discovery, we just need to know where a node is — that may just be an IP address — we point at a node, hit the port, and that incoming request will be routed to a running pod within the cluster that exposes that service. There are also load balancer options — the built-in, load-balanced one is what we just saw — and DIY load balancer solutions such as socat, HAProxy, and nginx.

And we have this thing called Ingress. Ingress is provided — maybe it's in beta, I'm not sure; it's really hard to keep track of what's in beta and what's in alpha with Kubernetes nowadays, I should probably look into that — but the basic idea is this: we have two services, and we'd have to access them via their own DNS names or their own IP addresses, independently of each other, but maybe we want to route traffic through a single endpoint, and Ingress gives us that ability. In this case we have two services, representing pods labelled name=foo and name=bar. Rather than going to each service's virtual IP or DNS name, we can stick an Ingress in front, which gives us a single external DNS entry, or a single IP, that routes traffic based on URL: here, service foo and service bar have traffic routed to them via the URL path /foo and the URL path /bar. This is extremely powerful as well; it was introduced a while back.

I also want to talk about scaling. We do this via a replica set, but also via a deployment, which manages the whole thing for us. We start off with one pod, with a service in front of it. The pod has labels version=v1 and type=fe; the service cares about labels with type=fe, and the replica set cares about the label version=v1 — that's what the diagram there shows. They're using different labels, and that's important when we look at canarying shortly. We say to the replica set "I want two of these", and the replica set spins its control loop again: it says "oh, I should have two of these but I've only got one", so it issues an API call and creates a new copy of that pod from the template it has, with the same labels as before. Given that it has the same labels, which are defined as part of the template, the service picks up the new pod and automatically routes traffic to it once it's online. And if we say four pods, exactly the same thing happens: we get an extra two pods, again with the same labels, and now the service is routing traffic across all four of those pods.

The next thing is a rollout — I just want to walk through this; my graphics are broken again. In this case we have a replica set with two pods labelled version=v1, and a service with type=be; the service is the green box at the bottom.
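Here is the same desired-state idea expressed as a Deployment and then scaled, as a minimal sketch with the Python client; the deployment owns the replica set and handles rollouts of template changes, which is what the next part walks through. Names, images, and replica counts are illustrative.

```python
# Sketch: declare a Deployment, then scale it from 2 to 4 replicas.
# Names and images are illustrative.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="frontend"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"type": "fe"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"type": "fe", "version": "v1"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(name="web", image="example/frontend:v1",
                                   ports=[client.V1ContainerPort(container_port=8080)])
            ]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)

# Scale up, roughly `kubectl scale deployment frontend --replicas=4`.
apps.patch_namespaced_deployment(
    name="frontend", namespace="default", body={"spec": {"replicas": 4}}
)
```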
We're not going to worry about the service too much here, but we want to be able to update the pods — maybe we have a new version, version=v2, that we want to roll out — and we can do this using something like kubectl edit deployment. We issue that command and edit some part of the pod template; in this case, let's imagine we're changing the image. We've built a new container image and pushed it out to a repository — it could be Docker Hub, or Google Container Registry if we're running on Google Container Engine — and now we want to update our pods with that new image. So we edit the deployment (we could be doing this via CI/CD), and once the deployment sees those changes it's responsible for updating the pods. The first thing it does is create a new replica set — the replica set will have a slightly different name, hashed and created dynamically — which cares about pods with version=v2, and initially it has no pods. Then we scale that up to one: the replica set, spinning its loop, says "oh, I should have one of these pods from my new template", and creates the pod with version=v2 and type=be, which again is aggregated by the service. The next thing is to downscale the existing replica set, and we repeat until both of the old pods are gone.

We had a situation here where only one pod was serving at any given time; you can control that. There are settings called max surge and, I think, max unavailable — I can't remember the exact names — and basically you can define how many pods you can have over the number you really require and how many below it. In this case the default was to allow one fewer, so we had a situation where only one pod was serving at a time; we could make sure we always have two running by changing those parameters. Effectively, what's happened now is that we've replaced the running pods with two new pods built from the new image we set when we edited the deployment. The old replica set continues to exist — it doesn't take up much in the way of resources, and it has no pods attached to it — and if we ever need to do a rollback, we reuse that replica set: we can roll back the update and go back to the previous configuration just by issuing one command. That's really powerful.

Canarying is another situation we find ourselves in. We may have two running pods with version=v1, managed by a replica set and abstracted by a service, and we want to test an update to our container image. What we can do in that case is create a new replica set, again with one label being type=be, so it's still aggregated by the service, and one label being version=v2, which means it's managed by this different replica set. Now we're routing traffic between two different versions of our pod, v1 and v2: effectively 66 percent of our traffic goes to the older pods and 33 percent goes to the newer pod. At some point we can decide that the change we rolled out is good and update the deployment for the first replica set to use the new image, or we can roll back, tear down the new replica set, and try a different version. That's something we do often at Google: canarying, where a very small percentage of a deployed service is on the newer version; we monitor how it performs, and if it performs well we roll the change out across the entire application, or we roll it back and go back to where we were before. That's very, very easy to do within Kubernetes.
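As a hedged sketch of the rolling update just described, here is the programmatic equivalent of editing the deployment's image; the image tag, labels, and surge settings are illustrative assumptions.

```python
# Sketch of a rolling update: change the image in the Deployment's pod template
# and let the cluster perform the rollout. Values are illustrative.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# In spirit, `kubectl edit deployment frontend` and changing the image, or
# `kubectl set image deployment/frontend web=example/frontend:v2`.
apps.patch_namespaced_deployment(
    name="frontend",
    namespace="default",
    body={
        "spec": {
            "strategy": {"rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0}},
            "template": {
                "metadata": {"labels": {"type": "fe", "version": "v2"}},
                "spec": {"containers": [{"name": "web", "image": "example/frontend:v2"}]},
            },
        }
    },
)
# The Deployment creates a new replica set for v2, scales it up while scaling
# the v1 replica set down, and keeps the old replica set around so a rollback
# (`kubectl rollout undo deployment/frontend`) is a single command.
```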
Autoscaling is also extremely powerful. In this case we have a replica set that manages two pods — this is an example I used at Devoxx, which is why it's got strange labels — and we're monitoring the CPU utilization of those pods via a thing called Heapster, which runs as a pod within the cluster; this is all automatic, you get it for free. We set a scale target of CPU percentage equals 50: if the CPU percentage gets greater than 50, we want to add new pods on demand. Initially the replica set is set to two pods, and we have this loop monitoring traffic; if at some point the CPU goes above 50 percent, which it has in this case, then via a scale resource we ask the replica set to add new pods, so the replica set's pod count is changed to four and it takes care of running the new pods. Ultimately the CPU utilization drops back below 50 and we have a stable environment. Whenever we need to, we can tear pods down as well using that same mechanism.

Scheduling — we've mentioned scheduling before in the context of Borg, and I'm not going to spend too much time on it; we've only got ten minutes left, oh wow. Kubernetes without a scheduler does work: you can specify, via a node name field in your pod spec, that you want a pod to run on a specific node. You could do that if you wanted to — it's not how you should do it, but it is possible, and it's how Kubernetes would schedule a pod if it didn't have a scheduler. What we do have is a scheduler component that lets us just say "run this for me" and makes the scheduling decisions for us: it looks at the pod, looks at the cluster, and decides where to schedule it, all dynamically and handled for us. Ultimately the bin-packing components of that will come along within Kubernetes in the future; we don't quite have all of that yet — it doesn't quite match Borg — but it's going to get there.

I just want to mention volumes very quickly; I don't have much time. We need to be able to share state, or have state, within the cluster, and this is provided by volumes, which are mounted by the containers running inside a pod — containers within a pod can share the same mounted volume. We have multiple different representations of volumes, including an empty directory, which lives within the pod: it's backed by storage but it's completely ephemeral, so when the pod goes away the storage goes away, and it's used for anything transient, temporary storage, that kind of thing. We have a host path, where we can actually map a directory on the node itself into a container as a volume — use this with caution: nodes are often different, and the underlying operating system environment may change in such a way that what we see at that host path differs from node to node, so it isn't necessarily recommended. You can use NFS or GlusterFS and others nowadays to mount a volume, and you can also use cloud-provider block or file storage — AWS and Google Cloud Platform both provide this facility — so this maps to a Google Cloud Platform persistent disk or to an AWS block storage device.
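Going back to the autoscaling described a moment ago, here is a hedged sketch of the equivalent object: a HorizontalPodAutoscaler targeting 50% CPU. It targets a Deployment rather than a bare replica set, the metrics come from the cluster's metrics pipeline rather than Heapster specifically in newer versions, and the names and bounds are illustrative assumptions.

```python
# Sketch: scale the "frontend" Deployment between 2 and 4 replicas, aiming to
# keep average CPU utilization around 50%. Names are illustrative.
from kubernetes import client, config

config.load_kube_config()

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="frontend"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="frontend"
        ),
        min_replicas=2,
        max_replicas=4,
        target_cpu_utilization_percentage=50,  # add pods above 50% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```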
We can also use a thing called a persistent volume claim, and these change one thing about the volumes we've talked about so far, which live only as long as the pod does: these can live beyond the life of a pod. We use them to ask the underlying cluster to provide a volume for us — we set out a requirement in terms of size, and maybe in terms of identity as well, and it gives us that volume where possible. It may not be able to; it may have to schedule that — maybe we're asking for a terabyte of disk storage when we don't have a terabyte available — so the request is queued and fulfilled when capacity becomes available. Persistent volumes effectively isolate us from any specific cloud or on-premise environment. Administrators generally provision them and users claim them, via the templates and specs we've talked about before, and we can also do dynamic provisioning, so once we ask for a claim the underlying cluster can provision the volume for us. They have a lifetime independent of the things that use them, so they live until the user is done with them; you can control the lifecycle and determine whether they should be recycled or reused and handed off between different pods, and they can be dynamically scheduled and managed just like nodes and pods.

Clustered applications are a problem that Kubernetes is now trying to address through a thing called stateful sets. With a clustered application such as Cassandra, nodes have a unique identity which is coupled to data on disk, and nodes may have to be configured and initialized in a certain way or in a certain order — as you can probably imagine, that's a little more complicated for Kubernetes. So we've introduced this thing called a StatefulSet. StatefulSets do what we've done with replica sets in the past — they stamp out identical copies of a pod — but they make sure that each one has a unique identity, and they allow each one to have its own reusable persistent storage. That looks something like this: this is a pod called db-0 — they have ordinality, so part of the name is an ordinal number — and it's matched to a persistent volume called pvdb-0. The naming will obviously differ depending on what you're deploying, but it's consistent. When you create the next one — maybe you scale from one to two, or maybe you asked for ten initially and it does this automatically — the next one is called db-1, the same with its persistent volume, and the next is db-2, and so on. That ordinality is important, and that sense of identity is important as well. So basically you say to the master "I want three of these", and it creates three for you: the first one is db-0, it creates a persistent volume claim, and that volume is assigned either from a new volume created specifically for this deployment or from one that already exists — one with that identity, which it already understands completely. For the next one, db-1, the same thing happens: the persistent volume claim is created and matched with either an existing volume or a new one, and db-2 is the same. At any point we can take the pods away, but the persistent volumes continue to exist, and when we add the pods back again they map to the same volumes, so they have this sense of identity and they have their own state, their own storage.
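Here is a hedged sketch of a StatefulSet like the db-0/db-1/db-2 example, where each replica gets a stable ordinal name and its own persistent volume claim. The names, image, storage size, and the assumed headless Service called "db" are illustrative.

```python
# Sketch: a StatefulSet with per-replica persistent volume claims.
# Names, image, and sizes are illustrative; a headless Service "db" is assumed.
from kubernetes import client, config

config.load_kube_config()

claim = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)

sts = client.V1StatefulSet(
    metadata=client.V1ObjectMeta(name="db"),
    spec=client.V1StatefulSetSpec(
        service_name="db",                      # headless service for stable DNS names
        replicas=3,                             # pods db-0, db-1, db-2
        selector=client.V1LabelSelector(match_labels={"app": "db"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "db"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="db",
                    image="example/cassandra:3",   # hypothetical image
                    volume_mounts=[client.V1VolumeMount(name="data", mount_path="/var/lib/db")],
                )
            ]),
        ),
        volume_claim_templates=[claim],         # one claim per ordinal; it survives the pod
    ),
)

client.AppsV1Api().create_namespaced_stateful_set(namespace="default", body=sts)
```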
So now we can do things like the clustered applications we talked about earlier, things like Cassandra. Another component of that is something called initialization containers. When we spin up a pod we can also spin up init containers, which perform different functions; they basically run until they're finished and can be used for bootstrapping. Normally the containers in a pod are built entirely from an image — what would be called baking — but you can also bootstrap these things by providing configuration based on the environment: your init container can come up, look at the rest of the environment, make decisions based on that, and then configure the other containers that will be running. We can use this for discovery of peers, using peer-finder scripts and sidecars; for startup and teardown ordering, which is absolutely vital for something like Cassandra; for master elections; and ultimately for implicit ordering. Those things are available today, and all of that is in beta currently: init containers were introduced in 1.3 and are beta in 1.4, and StatefulSets are now beta in 1.5. I'm going to skip over the rest — I'll provide the slides for you — but you can take configuration from the environment, provide configuration via a resource, and provide secrets via a resource.

The last part is multi-cluster federation. Ultimately we could have our control plane span multiple clusters in different availability zones — this could be something running on AWS with different clusters, or parts of a federated cluster, in different availability zones, or we could have different clusters that are part of that federation running in different places: on Google Cloud Platform and Amazon, or on-premise as well. Cluster federation is still a work in progress. — Sorry to interrupt, but we've got about 30 seconds left. — Yeah, I'm just about to finish. Okay, cool. So I will share these slides with you. We have a very, very large community, and it's extremely powerful — normally these closing slides are for people who've never heard of any of this before — and I'll also mention Niantic: we run Pokémon Go on Kubernetes, in case you didn't know. And that's it; that's the entire talk. I've not left time for questions, but you can send them to me via Stan. Stan, is that okay? — Perfectly. Well, thank you very much, that was a great presentation. — I'm sorry I didn't leave time for questions, but feel free to just throw them at me and I will do my best to answer them. Some of the questions you've already sent to Stan need me to do a bit of investigation, but I will do my best to answer them over the next few days; if you have any more, send them to Stan and he'll send them to me. Thank you very much. — Thank you, thanks a lot, Jess. Bye bye.