So welcome everyone. I'm quite glad to be here; I enjoy this conference very much, and I hope you enjoy it at least as much as I do. It's been a while since my last time. My name is Georgi, I'm a developer at SAP, I'm Bulgarian, and I'm a member of the Garden team, so if I had to use only one word: I'm a Bul-gardener. My name is Tom, I'm also a software engineer on the Garden team, and I work at Pivotal. Cool. Today we're going to talk about how our garden grew during the past year or so. I'm going to risk using this clicker, everybody says it's not working, let's see... oh, it is working. So today we're going to tell you a bit of history, then we're going to talk about the cool new stuff we've been building during the last year, and we're going to finish off with a short glimpse of our near-term and future plans. And we know you normally expect juice at a Garden talk, right? We don't plan on letting you down on that one. There you go: juice.
A bit of history: the history of containers in the context of Garden and Cloud Foundry. Where did it all begin? I guess it all began with the ideas of portability and isolation. These are quite hot topics in today's cloud world, but the ideas are relatively old; a short Google search reveals that they date back to at least the 1970s, with languages like Smalltalk running on different platforms in different environments. Fast-forward forty years and there is Cloud Foundry, the platform we know and love. Cloud Foundry is all about running multi-tenant applications, and doing so securely, so it has a genuine interest in the topics of portability and isolation. When it got released, the hottest technology around isolation was these little Linux kernel features called namespaces and cgroups, and what they gave you was the ability to unshare different aspects of your processes, so that they can run without knowing about one another and without stepping on each other's toes. Cloud Foundry loved that, so it created project Warden. This is the great-grandparent of Garden: Warden was a small Ruby project, and it orchestrated the low-level details, giving the platform the ability to run apps securely and so on. Then we decided to rewrite Warden, and we built Garden. You probably know that any name that doesn't start with a G is not a valid name in Go land, so Warden became Garden, but it was actually more than just swapping one letter for another: we extracted the Garden API, a small API describing what the platform needs to be able to do, and then you could have swappable backends, swappable implementations of that API. This was a really good move; it really paid off in the future, as you'll see. And then, what's next? Oh my god, Docker.
Then Docker was released, and it changed the world, the world of containers. Why was it such a big deal? Well, mainly because it gave you this killer UX, this really revolutionary way of working with containers. Container technology is complex, and it's really hard for real users to work with these things. It would almost be fair to say that Docker invented containers, because before them we had this bunch of syscalls and mounts and primitives that we could combine to build different sandboxes, but Docker really made containers a commodity, really standardized them, so it was immediately very easy for normal people to just work with containers: there's this little thing you can run and move around. It was a very big deal, such a big deal that everybody was using it; everybody was calling it the standard. And Cloud Foundry thought: well, everyone's using Docker, maybe we should too. So we did a little experiment: we built a backend, to be swapped in for our default backend, that would use Docker. This experiment didn't quite work, for reasons. For example, Docker was a very big monolithic project at the time, and we only needed a little container runtime, which is what Garden was; it was not really practical to run all of Docker, which came with a UX we didn't need in the platform, and it was very opinionated, so we would have been fighting with it all the time. So this plan was basically dropped on the floor.

Then, luckily, a few years later the people behind containers, Docker, CoreOS, came together and started coming up with real open standards around containers, and so the OCI was born. The OCI, the Open Container Initiative, is a set of standards: for example, the OCI runtime spec is a standard describing how you run containers, and the OCI image spec focuses on layered filesystems, like the Docker images we all know. And there was this cool little thing called runc. This was the first implementation of the OCI runtime spec, and it was nothing more than Docker's core engine extracted into a small binary. It was not opinionated, it was highly configurable, it was very lightweight, and it was actually released under an open foundation. This was exactly what we needed, so we immediately started building our next backend for the Garden API. This marked the beginning of the so-called "year of glue" for the team: a year we spent thinking about all those cool standards as they came along, and how we could bring them to the platform. The result was that we built the next backend, garden-runc, which obviously wraps around runc and uses runc to run the containers. Needless to say, we were able to delete a lot of in-house code, so we now owned a lot less code and used better community code. We also did another project called GrootFS, a replacement for our image manager, the component that prepared the root filesystems. GrootFS replaced the old one, which was called the garden shed, and to say the least we didn't like that project very much, so we really wanted to kill it. So yes, we were able to delete a lot of code. At this point you're probably asking yourselves: well, what is this Garden project doing? It's basically steadily deleting parts of itself, so what's the future, where are we going, what are the Garden team's goals, what is Garden for? The short answer is that Garden's main goal is to bring container technology to the Cloud Foundry platform, and to do so securely. To expand a little bit on that, we basically have three goals. First, we need to be the glue that glues Cloud Foundry to containerization technology: we expose everything the platform needs and hide all the complexity, because container tech is complex. Second, we should deliver secure defaults, because in the multi-tenant
world that we live in, this is really important: we need all the security there is, and we don't want to leave it to the user. This is a cloud platform; the user shouldn't have to care. So we basically turn all the knobs that we possibly can. And another important goal of the team is to help the platform leverage, make maximum use of, the container tech. The platform is abstracted from it, we're hiding the complexity, but we don't want to hide it too well; this container tech is still being developed, it's a whole thing, so if we spot something that could be put to good use, we should propagate that knowledge upwards in the stack. And now Tom is going to tell us more about what we did, what cool new things we built to achieve those goals.

Great. So, clicker... sure, I probably should have practiced using this before I started talking. There we go. So let's talk about glue. There are a few things we've been working on in the glue category, and the first thing I want to talk about is these things called Garden peas. Before I explain what a Garden pea is, you really have to understand what a Kubernetes pod is. I'm sure a lot of you here are familiar with Kubernetes pods, but let's give ourselves a reminder: they're a collection of containers that share some stuff with each other. For example, they might share a network namespace, so they can see each other on their own local network. So in summary, a Kubernetes pod is like a cluster of container images that are shippable together but not entirely isolated from each other; you're making some security trade-offs for some nice-to-haves. It turns out that Cloud Foundry had use cases for such a thing. Here are a few of them. You may have heard of Envoy. Envoy is a process that does some networking stuff; there are plenty of talks here about Envoy that I'm sure you've seen. Essentially, a requirement of Envoy is that it needs access
to the network namespace of the container it happens to be serving. So Envoy is just a process that runs partially inside the Garden container, sharing only the network namespace. Another example I can give is the health check, and this is exactly what it says on the tin: a process that checks the health of another process. The requirement here, with something analogous to a Kubernetes pod, was that we didn't want this health check to eat into your application's memory limit. Your application might have a memory limit of, say, 60 meg, and the health check might be written in Go and consume six meg just for the runtime; it seems unfair that you'd pay for that. So health checks live outside of the memory limits of your application, but they can still see some things about the application. There are a few other examples, like cf ssh and so on, but we won't really go into those. Now I want to explain a lame joke about why we called this thing Garden peas. If you think about a Kubernetes pod: a pod is a collection of containers, and in Docker world a container is represented by this fun little whale that someone drew. Now think about a garden: what's in a pod there? A collection of... right, peas. So we're trying a thing here: you might have heard of test-driven development; we're trying name-driven development, and it seems to be working out so far. There are some differences between a Kubernetes pod and a Garden pea, and that's that Garden peas are not user-serviceable. We don't want people using a Cloud Foundry to have to think too deeply about them; really, we want people to just push their code and get a bunch of cool features for free. So all that Garden is really doing here with this Garden peas thing is gluing the Cloud Foundry experience onto the low-level implementation details of a concept called sidecar containers. On the topic of glue, we've also been working on some more self-deletion. So
Garden's historical decision to replace the hand-rolled container runtime with runc made a lot of engineers very happy and was widely regarded as a good move. It was so good, in fact, that we thought: well, let's do it again. A lot of smart people that worked at Docker yet again extracted more patterns out of Docker, and we ended up with a thing called containerd. And containerd, it turns out, had a lot of overlap with what Garden was doing, with the features we had built around runc within Garden. So we thought: that's pretty cool, we should probably use that. We might end up owning less code; we might end up with more eyes on the code that we do use, since it's an open source project; and it really gives us a chance to give back to the community if we adopt this thing and start using more of these standards. And that's been going pretty well: we've had a flavor of Garden running in production, a large Cloud Foundry production deployment, for a couple of months now, and it's actually been surprisingly stable, relatively bug-free. It has exposed a few issues with Garden and a few issues with containerd itself, and this has really given us a chance to contribute back to the community more. In the past couple of years we've made efforts to contribute back to runc, to fix things like getting it to run rootlessly, that kind of thing, but until quite recently we never really had a dedicated effort to give back to the communities whose products we use, like containerd and runc. So in the past couple of months we've actually had a pair dedicated to just working on the open source technology we happen to use: improving it, fixing bugs, adding features. It's been a really cool experience.

Next, let's talk about secure defaults. Georgi briefly mentioned this earlier. So we use this thing called runc, and runc lets you configure a container in such a way that you can add additional layers of security to it: things like user namespaces, AppArmor, seccomp, and so on. But by default runc won't necessarily turn many of these things on for you, and in Garden deployments in Cloud Foundry we want to be as paranoid as possible, as secure as possible; not using those features should be a conscious decision rather than the other way around. So by default Garden looks at this runc configuration and just turns on every single security feature we feasibly can. But there's a problem with that. We've secured the container with this big heavy padlock, but the thing around the container, the gate, has some pretty big gaps in it: you could quite easily climb through that hole in the fence and get inside. The analogy I'm trying to draw here is that the Garden server is still running as the root user, which seems dangerous. It turns out that you do need to be root to spin up containers in some way, but you don't necessarily have to be root forever. So we've been doing some work on getting the Garden server running rootlessly, and what that means is doing a bunch of rootful setup early in Garden's lifecycle and then quickly dropping to a non-root user. I won't dig too much into the details of that, because there was another talk given yesterday by two other Garden team members, Claudia Beresford and Ed King; if you haven't already, watch their video on YouTube, it's super interesting, it's about Garden's journey to running the server rootlessly.

Let's talk about the cool container tech we've been working on. There's this thing called OCI buildpacks. But let's step back from OCI buildpacks for a second and talk about what happens today when you cf push something. Garden will be asked by Diego to spin up a
container, and what it's going to do is orchestrate with GrootFS to create the filesystem for your container. It's going to do that by laying down this root filesystem, and then it's going to take the app bits for the thing you pushed, which might contain your Python code and the Python runtime; that data is going to be streamed into the container, untarred, and layered on top of the root filesystem. And that really sucks. It sucks for a few reasons: untarring something is quite CPU-intensive, downloading something is quite slow, and it really doesn't cache all that well the way we represent it on disk. So we wanted to address this, and if you look at the diagram I drew here and forget the container around it, it looks awfully familiar: it looks a lot like a Docker image, a layered filesystem that a Docker image might give you. So we thought: why don't we take some ideas from the way Docker images are constructed and try to apply them to Cloud Foundry? We've dipped our toe in the water and tried to treat the app bits more like an OCI-image-compatible layer. We have a deployment of Cloud Foundry with this mode enabled, and we've actually seen a 33 percent performance increase on cf scale by enabling it, so it's pretty good. There are future plans to fully turn our images into OCI-compliant images. Another thing that happened, almost as a side effect, I want to say, though we had it in mind when we were spinning out the Garden API that Georgi mentioned earlier, is that because we detached the Garden runtime, runc, from the API that's used to invoke commands against it, we kind of got Windows support for free. It turns out that if you take this runc thing, unplug it, and plug in winc, or
wink, or wince, it's pronounced several different ways and I don't know which one's correct, which is a Windows implementation of the runc container runtime, it just works: Garden just works on Windows out of the box by plugging that other thing in instead. So next, Georgi is going to talk about what's exciting and upcoming in Garden.

All right, so what's next? First, we've spent a lot of time recently thinking about providing better ways of CPU sharing by providing better CPU metrics, and this is to suggest that we don't think the current CPU metrics we're emitting are very good. This is true, and I'm going to try to explain why. So, CPU sharing is no piece of cake, right? Well, maybe it is, in a way, if you think from the app's perspective: the amount of CPU your app gets at a given moment in time is exactly that, a piece of the whole CPU cake on the cell. Let's play with an example. If you push an app to Cloud Foundry today which is, say, 64 megabytes big, 64 megabytes of memory, suppose it lands on an empty or idle cell where there are no other tenants. How much CPU is this application going to be able to get? Well, it's basically going to be able to eat up the whole cake, because there's nothing there to prevent that, and that's fine. And when you list your app's CPU usage, you're going to see a report of 100% CPU, provided your app, for example, does an infinite loop, or mines bitcoin, or whatever. So you're going to get 100% CPU usage, and this is fine; there's nobody else there to use that CPU. But then tomorrow, if I push another app of the same size, it would be fair to split the cake in two, so that you get 50% and I get 50%, and this is exactly what is going to happen. But if you then list your app metrics, you're going to see 50% CPU usage, and this sucks, right? Because this is not really a metric. You need metrics as
developers so that you can make decisions based on them, and this metric just dropped by half without me doing anything, so it's not a good metric. To make sense of it, I need to understand that I'm running in a multi-tenant environment and that there are other people there: how big are their apps, how big is the cell, how big is the whole cake, and so on. And this is none of the app developer's business; as an app developer, you shouldn't have to care about this, by definition. So how do we plan to address this? Well, instead of today's metric, which is more or less what the machine reports as CPU usage, not a containerized one, we plan on emitting two new metrics. One is absolute CPU usage: the CPU time as reported by the kernel, from the moment the container was created up to the present moment. The other is entitlement, and the entitlement is an interesting one: it has the semantics of what you paid for. It is going to be calculated by the system, because the system knows all those details that you shouldn't care about: it knows what cell you landed on, so it knows how much memory and how much CPU there is, and it can divide all the CPU into proportional entitlements. The entitlement will again be expressed as CPU time, from container creation up to the present moment: what you paid for. When you get these two metrics back, it becomes trivial to calculate a more stable percentage, which has the semantics of "how much am I using of what I paid for", and that makes a lot more sense. Again, if you land on an empty cell, you're going to see values larger than a hundred percent, but that's fine; then you know: cool, I'm getting CPU for free, I didn't pay for this. But you can't count on it, because tomorrow you might land on a busy cell and then you won't have that CPU. So this is, we feel, a much better way of emitting CPU
metrics. And this is just the beginning; this is not going to solve the problem, it's just going to expose metrics that describe the problem, more or less. What would the next steps be? Well, after having these metrics, we will provide some operator tooling, so that an operator can list all the apps that are consistently using more than they paid for. This is important because we should consider limiting those apps, enforcing the entitlement, so that other people's pushes won't fail, for example, so that other people will be able to spike their CPUs when they need to. Then we're going to collect feedback from those operators: well, we gave you this list; do you think this is a correct classification, do you think it would be good to enforce the entitlement on these apps? And if the answer is yes, we're going to automate all of that, which means that if you consistently use more than your entitlement, you're going to be throttled, the entitlement is going to be enforced, so that other people get a chance to have their CPU spikes. And that's how we achieve a better CPU sharing scheme, we believe. In terms of future plans: rootless container d, although if I'm true to team conventions I should be calling it rootless containerd, you know. So what is this all about? Well, given that we're bringing this containerd thing to the platform, we should make sure it can run and create containers as a non-root user, because we need all the security. And speaking of containerd, there's this thing called snapshotters, which is containerd's model of dealing with layered filesystems, so think Docker images. And it's pretty cool: it's one level higher than Docker images, it's not bound to any particular hashing algorithms or formats like tar and so on. Given that GrootFS, our root filesystem component, is dealing with exactly the same thing, we see even more overlap here, so we should probably take this thing and replace
GrootFS with it. And I guess this is more or less it. Just a short recap: what is Garden for? Garden is glue, it's secure defaults, and it's leveraging container tech for the platform. What did we do? We gave you peas, kind of like pods in the Cloud Foundry world; we are bringing containerd to the platform; we are progressing on the rootless track of work, I know we've been saying this for ages, but we really are getting closer; and we are converting buildpack apps to OCI images. And we hope to come up with a better, more fair CPU sharing scheme. With that said, we thank you, and we welcome your questions. Any questions?

The question was: what is the performance gain if we enable the OCI mode? Well, we're noticing, as Tom mentioned, about a 30 percent performance win on cf scale. I think pushes are roughly the same, but scales seem to benefit the most. For some exact numbers: on our testing environment, it takes six seconds to perform a cf scale going from one instance to ten instances, and with OCI mode enabled it takes four seconds.

The next question, the suggestion basically, was that we provide burst capacity for cf pushes, so that we don't immediately limit. We know about this; it's on our radar. We think these two metrics might solve that, because we don't intend to immediately enforce the entitlement, so you'll be able to burst, just not consistently, not for a long time. We also spoke about maybe reserving a portion of the CPU that isn't allocated to containers' entitlements, which would be a portion that containers can spike into when they need to. We're over time, good. Any more questions? If not, we thank you for coming, and thank you.