So, hi. Let me start with a question: how many of you know about Docker or Kubernetes? Raise your hands. Many. And how many of you are using it in production? A few.

So let me first introduce Quantlane. We are a technology company; we trade stocks on different stock exchanges around Europe. We are based in Prague and we have a relatively small team of developers, 14 of us. As I said, we are developing a trading platform which we also use ourselves; we are not selling it outside. We mostly use open source projects to run our infrastructure, and even most of our own code builds on open source projects.

Our trading platform is written in Python, obviously, and it uses React.js for the front end. In our applications, even in the trading platform itself, we heavily use asyncio, because there is a lot of data and we want to process it in parallel. Concurrently, excuse me. We store data in Redis and TimescaleDB. We also use third-party libraries, integrating them with Cython; those libraries are written in C, C++, Java, or even Scala.

There is a lot of data, so we had to split our application into multiple processes, and there are also applications around it which help us: reporting, some graphs, and other tools. We also use messaging to integrate all these applications together and pass data from one to another.

When I first started working for Quantlane, it was kind of a chaos. All the applications were deployed on physical servers, managed by Circus (circusd).
That is a process management system similar to supervisord, if you know it. Packages were installed in virtualenvs, so each application had a separate virtualenv with its packages pre-installed, and everything ran under a single user.

This setup had the advantage that it was simple: when a new person came to the project, all they had to do was clone the project, create a virtualenv, and install the packages, and they could run the application. It was the same for deployment.

There were some disadvantages, though, like package version hell: we had different packages across applications, and whenever we updated one of our own packages it might pull in newer third-party libraries, and we had to somehow migrate those changes into our other applications. And there was no failover: when a server died, everything on it died, and the workload was not automatically migrated somewhere else.

So what happened next was Docker. When I started, we were already looking into it, as it had promising features. It was able to unify our environment, so that we could run local development with the same kind of packages.
Let's say the same image as in staging, CI, and production. Deployment was also simple: all you had to do was build the image and run one command, and you had a running application. Migrations were also simplified, because the image contains everything; you don't have to install anything else, you just pull the image and you have it.

It sped up our CI: once the image is built, it contains everything, so we can use this image through all the other stages of CI. And we had atomic releases, because the built image has a tag and even a hash, so we have an image which is atomic and unique in our registry.

There were some challenges we had to overcome first to introduce Docker into our infrastructure. The first was: where do we store the images? We decided to use the GitLab registry, as we already had a GitLab instance in our infrastructure and it has this feature. The next thing was image caching, because, it's kind of sad, but we have a pretty slow uplink to the internet. Our internal network is fast, it has a gigabit, but the uplink is around 20 megabits.
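The build-once, run-everywhere idea above can be sketched as a single Dockerfile that bakes everything in. This is an illustrative sketch, not Quantlane's actual file; the base image and names are assumptions:

```dockerfile
# Bake everything into one image: application code, pinned
# requirements, and any environment specifics.
FROM python:3.6-slim

WORKDIR /app

# Install pinned dependencies first, so this layer is cached
# between builds (which helps a lot on a slow uplink).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "-m", "trading_app"]
```

Tagging the build with the commit hash, for example `docker build -t registry.example.com/trading-app:<commit-hash> .`, is what makes each release atomic and unique in the registry.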
Maybe. So, as I said, we are using GitLab, and GitLab has a CI whose steps are defined in the git repository, so anybody who has access to the repository can update the pipeline definition and modify it to do whatever they want. We wanted to have a build stage in CI. A simple CI build is just this: you run `docker build` and it builds the image and prepares everything. But since a user has access to the CI definition, they can modify it, maybe into something like this, and by doing so they can effectively get access to the server on which the Docker daemon is running. This means you should have a dedicated build environment which you can just throw out and replace with a new, clean one. Then whenever somebody gets access to this environment and does something harmful to it, you can just clean it and carry on without any problem.

The next thing was CI pipeline design, because you want fast CI; you don't want to spend 20 minutes on building, then testing, then maybe integration tests, publishing, and so on.

And the last thing was cleanup of old images. The standard Docker registry, which you can download from Docker Hub, does not have automatic cleanup of images. Our images are around 500 megabytes and we have hundreds of them, so we had to implement some kind of cleanup. We are not running in AWS, we are not running in the cloud, so we don't have infinite storage.

So what did Docker bring us?
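The risk being described might look like this in a hypothetical `.gitlab-ci.yml` (the job name and registry host are invented); the point is that anyone who can push to the repository controls what runs next to the Docker daemon:

```yaml
# .gitlab-ci.yml: anyone with push access can change this file.
build:
  stage: build
  script:
    - docker build -t registry.example.com/trading-app:$CI_COMMIT_SHA .
    # A malicious edit could just as well talk to the host's Docker
    # daemon directly, e.g. start a container with the host filesystem
    # mounted. Hence the dedicated, disposable build machine.
```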
As I said, a unified and stable environment, meaning we had the same image for local development, CI, staging, and then production, and everything was baked in. When a developer built the image, they could be sure about the packages in it, and I don't mean just those specified in the requirements, but also the third-party requirements: the full chain has the specified versions, and it is the same in CI, staging, and production. That is Docker's nature; when you build an image, it has everything packed in. The basic idea is to create one image and bake everything in: all the requirements, all the development requirements, the application, and whatever environment specifics you need, and you can run everything in this image, as I will show you on the next slide.

It also brought us isolated environments, meaning that when an application runs, it cannot access other applications running on the system, and it cannot take control of the other processes running there. That's some kind of security feature. And faster CI.
This is our pipeline. First we build the image; as I said, it contains everything: all the packages, the application, and some environment definitions. Next we run code quality checks, unit tests, and packaging. These run inside this image, in parallel. Each of those jobs may run maybe two minutes, and this speeds up the entire process, because at this stage none of the jobs have to install packages, which used to be the bottleneck of our CI. Next we optionally release a bleeding-edge version and deploy to staging, and then we run integration tests and publish the documentation. Note that the bleeding-edge release and the staging deployment are optional, so we can run integration tests immediately after the unit tests are complete.

Docker also has some disadvantages. There are known bugs, and every day you can find a new one. For example, there are memory leaks, and there are race conditions which lead to deadlocks. It has no failover if you don't use Docker Swarm, and I don't know what state Docker Swarm is in right now. Also, when the Docker daemon dies, you cannot manipulate the running containers: they may still be running, but you cannot stop them, restart them, or create new ones.

There are a few gotchas we found out when we started using Docker. For example, there is the PID 1 pitfall. Who knows about the PID 1 pitfall? Yeah, something you don't know. The PID 1 pitfall is basically a problem, or maybe a feature, let's say. When you run an application in Docker, it is started with PID 1. PID 1 has a special meaning in Linux, because it is the init process which starts everything: SSH, your UI, whatever. PID 1 doesn't inherit the default signal handlers, which means that you have to implement them yourself. Who knows about signal handlers in Linux?
Okay. So you have to implement those signal handlers, because you usually want a graceful shutdown of your application. When you run `docker run`, the process is started with PID 1, and when you run `docker stop` on it, which should terminate the process, the first thing Docker does is send SIGTERM to it. If the application doesn't shut down within 10 seconds after this, it sends SIGKILL, effectively killing everything, and you may lose your state. So it's a good idea to implement the signal handlers. This also applies to processes which run outside of Python or outside of Docker.

There is one other implication: when a process runs in Docker, you have to take care of the subprocesses you run. If you are using the subprocess module and running other processes, you have to terminate them and clean up after them, because if you don't, they will remain there, and the kernel will probably somehow take care of them, or kill them at the end.

You also have to keep in mind the user within the container, because if somebody gets access to your container and runs a shell in it, let's say some kind of attacker, they can get root access if you run your application as root. That means they can modify the filesystem within the container and even run other applications. You should avoid this, because you don't want them to modify the entire container; they could even run some kind of spam bot. You don't want that.

After we migrated to Docker, which took around a month, we started looking at where to move next, and we found Kubernetes. Docker has a whale, Kubernetes has a wheel, I don't know. So what Kubernetes is, is basically cluster orchestration.
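Coming back to the graceful-shutdown point above: for an asyncio application, a SIGTERM handler might look like this minimal sketch. It is illustrative only, not Quantlane's actual code; `main` stands in for the real application, and the self-sent SIGTERM just simulates `docker stop`:

```python
import asyncio
import os
import signal

async def main() -> str:
    loop = asyncio.get_running_loop()
    stop_event = asyncio.Event()

    # `docker stop` sends SIGTERM first; register handlers so we can
    # shut down gracefully instead of being SIGKILLed 10 seconds later.
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop_event.set)

    # For demonstration only: send ourselves a SIGTERM shortly after
    # startup, the way `docker stop` would.
    loop.call_later(0.1, os.kill, os.getpid(), signal.SIGTERM)

    try:
        # Stands in for the real application work: run until asked to stop.
        await stop_event.wait()
    finally:
        # Real clean-up would go here: close connections, save state,
        # terminate any subprocesses we started.
        pass
    return "shut down gracefully"

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Without the registered handler, a PID 1 process would simply ignore the SIGTERM and be SIGKILLed 10 seconds later, skipping the `finally` clean-up entirely.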
This means that you have a bunch of servers, you install Kubernetes on them, Kubernetes somehow manages all of them, and you just tell Kubernetes to run the application somewhere in the cluster. You don't care where; you just want to have it up and running, and maybe accessible on some kind of address.

Kubernetes was interesting for us because it solved failover when a server fails. When a server in a Kubernetes cluster dies, Kubernetes migrates the workload from that server somewhere else, so you don't have to care about it, and you can sleep at 3 a.m.; it won't wake you up.

Configuration can be stored in namespaces. Namespaces are logical dividers: you can have a namespace for production, for staging, even for different applications, maybe a namespace for monitoring, for logging, and for your applications. For each of these namespaces you can store a global configuration, which can then be propagated from the namespace to the services running in it.

It also supports some kind of basic service discovery: you can access other services by DNS. `my-service.my-namespace.svc.cluster.local` is the standard address; you just have to fill in the service name and the namespace.

It supports an ingress controller. Ingress is a way to expose the services and applications running in Kubernetes to the outside world, so the outside world can make requests and, for example, retrieve some website which is running in Kubernetes. The other way around works anyway: services inside Kubernetes can still access the outside world. But for the outside world to access Kubernetes services, you have to have an ingress controller.

Kubernetes also has one fancy feature, and that's deployment history, which means that when you deploy a new version of a service and it doesn't behave, you can revert it to the previous version.
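The DNS names above come from ordinary Service objects. A minimal sketch (all names invented) could be:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service          # together these make the address
  namespace: my-namespace   # my-service.my-namespace.svc.cluster.local
spec:
  selector:
    app: my-app             # route to pods carrying this label
  ports:
    - port: 80              # port the service exposes
      targetPort: 8080      # port the application listens on
```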
So you can call `kubectl rollout undo`, and it will deploy the previous version. It's a really handy utility when you simply want to revert something and you don't know which version was running before.

Right now we are in the process of migrating to Kubernetes. Our main trading platform was already migrated, a week ago. There are still other services which are running in Docker on other hosts, but we are planning to migrate them in maybe two weeks, and then we can join all of the other servers into the Kubernetes cluster.

We have the environment configured by namespace variables, so we have a production namespace with a configuration which specifies where you can find services: the messaging, some kind of data storage, access to databases, and so on. And we are deploying our services using plain YAML with Jinja2, that is, Jinja2 files containing YAML with our variables. This allows us to have conditionals in templates and a single deployment file which is adaptable for different processes, or maybe profiles or configurations. In this example you can see that we have a profile specified, and if a particular profile is set, other environment variables are added to the deployment.

Some notable features of Kubernetes are probes. Imagine that you have a web application and deploy it to Kubernetes. You want to check that it is running, and if it takes maybe 30 seconds to start, you just don't want to have to care about it.
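A templated deployment file of the kind described might look roughly like this hypothetical fragment (the variables, profile name, and probe settings are all invented); it also shows where the probe and update-strategy fields the talk mentions would live:

```yaml
# deployment.yaml.j2: rendered with Jinja2, then sent to kubectl
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ app_name }}
  namespace: production
spec:
  strategy:
    type: RollingUpdate              # or Recreate for exclusive resources
  template:
    spec:
      containers:
        - name: {{ app_name }}
          image: registry.example.com/{{ app_name }}:{{ version }}
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30  # the slow-starting app case
          env:
            - name: REDIS_URL
              value: "{{ redis_url }}"
{% if profile == "special" %}
            # Extra variables only added for this profile
            - name: EXTRA_FLAG
              value: "1"
{% endif %}
```

A real Deployment would also need `spec.selector` and matching labels; they are omitted here to keep the sketch short.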
You just deploy it and want to see it running as fast as possible. What probes do is check whether the application is running, by accessing some port you specify or by running some internal command within the container. This allows Kubernetes to check if the service is running and if it's healthy; if it's not, Kubernetes automatically restarts it.

Then there are update strategies. There are two major ones. One is the rolling update. What this strategy does is that when you deploy a new version, the old version is still running while a new deployment with the new version is started, and until the deployment with the new version is running, available, and stable, the deployment with the old version keeps running. When the new version is available, the old one gets terminated and all the traffic is forwarded to the new deployment.

The other update strategy is recreate. What it does is first shut down the previous version and then start the new one, so you can have some non-zero downtime. This update strategy is good when you have some kind of resource which needs a unique lock: maybe you have data stored in files, and you don't want two applications accessing the file simultaneously.

So those were the fancy, or interesting, features of Kubernetes. As I mentioned, Docker has a whale and Kubernetes has a wheel, and beyond that there are many wheels. We are looking into Kubernetes Federation next, and that's beyond this talk. So thank you.

Thank you, Peter. Please raise your hand if you have any questions.

Q: Are these Jinja templates integrated somehow with Kubernetes ConfigMaps and Secrets for the deployment?

A: You can send only pure YAML files, and JSON files, to Kubernetes; you can't send the Jinja2 YAML files.
So you first have to fill in the variables, then you can send it. The way our deployment is designed is that we take the Jinja file and fill in everything we have, so we also have Secrets and ConfigMaps within the deployment.

Q: Can I have one more? Which deployment strategy should I use if I have database migrations between versions? Should I use the recreate strategy, or are there better solutions?

A: If you don't have any way to take care of the difference between those two versions, you should use recreate. But maybe there's a way, I don't know.

Q: How do you monitor all this?

A: That's an excellent question. We are monitoring the internal pods with Heapster, and we are monitoring the entire cluster with Prometheus.

Q: Previously you mentioned the problem of PID 1 within Docker. How has Kubernetes helped you with that?

A: Kubernetes doesn't solve that; you have to solve it yourself within your application.

Q: Okay, and have you solved it, and if so, how?

A: We are solving it by registering a signal handler. As I mentioned, we are using asyncio, so what we basically do is stop the loop, which terminates all the running futures, I think it's called futures, and then we have a finally block where we close all the handles, save the state, and clean up everything we need. You can also add the signal handler to the loop directly.

Q: Hi. Do you have some persistent data on the hard drive, and how do you manage this data between all the services?

A: Yeah, we had persistence on the file system. We are migrating it to Redis right now, but until now, what we had to do when we migrated between hosts was shut down the service, migrate the data, and start the service again.
We have no shared storage.

We have time for one last question.

Q: At the beginning of the talk you said that you were running into problems with the GitLab runner. How did you solve the problem of a clean environment for each pipeline?

A: We have a dedicated dind, that is, a docker-in-docker service, running, which handles the pipeline. We have a shared dind, but when we shut it down, we clean it. So basically, whatever gets into it doesn't affect the entire host on which everything else is running.

Q: Actually, we have the same problem, and the problem with dind is that you need a privileged container to run docker-in-docker, and essentially you have root if you do that. So I wondered if you had another solution.

A: Maybe we didn't solve this the right way. We are planning to migrate it to a different, separate host. So maybe that's it.

Okay, fantastic. Thank you, Peter. Let's give him a hand.