Hello everyone. My name is Kumar Rishabh. I'm a software engineer at PayPal, part of the web platform engineering team. This talk is about Dockerizing applications, from development to production. Over the past two years PayPal has invested a lot in Dockerizing our development and production environments, so in this talk I will share the best practices we followed and what I learned on that journey of building images and writing better Dockerfiles.

Why this topic? From small startups to enterprise-level companies, Docker is everywhere. In the near future it may be serverless or edge computing, but currently it's Docker. At PayPal we have around a hundred thousand hosts and around three hundred thousand containers running in production, and we have a hundred-plus Node.js applications. When you are working at this scale, even a small improvement makes a huge impact. All the best practices I'm going to talk about might not be equally important for you; it depends on your use case, your company, your organization.

This is the agenda for today. First, the Docker cache: how you can use it effectively. Second, what you should look out for when choosing the base image for building your Docker image. Next, multi-stage builds and how you can leverage them to optimize your image and its size. Last, handling signals and health checks.

Before I start, let's do a poll. How many of you have heard of Docker? How many of you are just playing around, doing a POC? Okay, and using it in dev and prod? Nice. I think most people here know about Docker, so let's just do a quick refresher. In a Dockerfile, we have a set of instructions.
We write them in a declarative fashion, and when you build the Docker image from this Dockerfile, you get an image. So what is an image? It's a blueprint of a running container.

Let's clear up this confusion. On the right side is a typical Node app and its image layers. At the bottom you have the host and your Docker engine; above that you have the base image, which can be whatever OS you're using; and on top we have different statements like RUN, WORKDIR, COPY. Each statement makes a layer, an image layer. What is an image layer? Basically a file-system snapshot. When you run docker build, it executes each statement, and for each statement it creates a digest, a hash, and it uses this hash to maintain the cache. If there's any change in a layer — say you changed some file or added a new statement — it creates a new hash; each hash is unique. This part of the image is read-only. When you execute the image — basically when your container is running — you get one more layer on top: the readable and writable layer. Any data or file you create inside the container goes into that writable container layer.

Okay, effective use of the Docker cache. The basic tip is this: you should defer cache invalidation as long as you can. Let's start with an example — is it visible at the back? Order matters: the order of all the statements you write in your Dockerfile matters. On the left side you have the FROM statement, basically the base image — the Alpine Node image, version 10 — then the working directory; then you're copying everything currently in the build context into the app folder, running npm install, and starting your Node application.
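The "left side" Dockerfile being described is presumably along these lines (a reconstruction from the talk; the entry file name is an assumption):

```dockerfile
# naive version: any source-file change invalidates the COPY layer,
# so npm install reruns on every build
FROM node:10-alpine
WORKDIR /app
COPY . /app          # copies everything in the build context
RUN npm install
CMD ["node", "server.js"]
```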
Now I make a change, and on the right side you will see two COPY commands. The COPY of everything is on the second-to-last line, and before that, on the third line, you will see COPY package.json and package-lock.json. In a Node.js application, package.json holds all your dependencies, and package-lock.json locks down your dependency versions.

Look at the left side again: you are copying everything. Say you make a small change to a CSS or JS file. What happens when you build again? Docker finds that something changed, and it blows the cache — and because of that one small change, all the modules get reinstalled; everything runs again. So what have I done on the right side? I've moved copying package.json and package-lock.json up to the third line. Now Docker checks whether there is any change in the lock file; if there isn't, it reaches npm install with the cache intact — as there's no change in the lock file, there will obviously be no change in the installation step — and only after that does the COPY of everything happen. That's the difference.

A few more things here. Instead of npm install, use npm ci. From npm 5.7 and above you have something called npm ci. npm 5 creates the lock file for you, and if a lock file is present and you use npm ci, it will use the lock file to install the dependencies. It's faster because the lock file already has all dependencies resolved, so npm doesn't have to waste time resolving them. Also, while installing, use the flag --production. What does it do? It won't install the dev dependencies.
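The "right side" version might look like this (file names are assumptions):

```dockerfile
# improved: the dependency layers stay cached until the manifests change
FROM node:10-alpine
WORKDIR /app
# copy only the manifests first, so source edits don't invalidate this layer
COPY package.json package-lock.json /app/
# npm ci installs exactly what the lock file resolved; --production skips dev deps
RUN npm ci --production
# source changes invalidate only the layers from here down
COPY . /app
CMD ["node", "server.js"]
```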
Right — in production, why do you need dev dependencies? And cache your build dependencies: that's true for any stack, whether Go or Java; you should cache your dependencies.

Next one: remove unnecessary tools. Dev dependencies and dev code are not required in production. If you look at the second-to-last line, I'm copying the source and some executables. Say you built some binaries in the lines above: you don't need all the dependencies you used to create those binaries. Just take the final binaries and copy only those.

Apart from that, you can use a .dockerignore file. .dockerignore is just like your .gitignore: you mention directories and files there, and docker build won't touch those files while building. And do a specific COPY — copy exactly what you need in production.

Also, avoid installing dependencies and tools you don't need in production. For example, you don't need SSH or Vim or any other dev tools. The more dev tools and dependencies you install, the more attack surface and the more maintenance you take on. And if you're running a Node.js application, once all the production packages are installed, you don't need npm anymore — so just remove npm.

There's also a small but common confusion between COPY and ADD. COPY is transparent: it just copies from the build context when you're building the image. ADD does more than copy: say you have a tar file — ADD will untar it at the destination. So see what your use case is.

Next: what not to cache. Look at the second and third lines — what will happen if you run them separately, the update on one line and the install on another?
What happens the next time you build: the second line, the update, gets cached, so it won't fetch the new package lists, and when you install a dependency — say Python on the third line — you won't get the updated version. So always do update and install in the same RUN statement; treat them as one unit. Apart from that, use the flag --no-install-recommends so it won't install recommended packages — basically, install only what you need. And you can remove the package manager cache; it's not required anymore once installation is done.

Next one: what you should look out for when choosing the base image. There's no hard and fast rule for choosing a base image; it depends on your org, your company, your project.

First of all, avoid the latest tag. I find it very confusing — in the Docker world, the latest tag is very confusing. When you build without mentioning a tag, it automatically takes the latest tag. So when you are building the image, try to pin the version. It's good to know exactly what version of your software is going into your image, so it's always better to pin the version.

Next one: use a minimal, light base image. Again, there's no rule that you have to go with Alpine Linux or some specific distribution — see what your use case is. Say in your company or organization you have some base version, and on top of it you add five to ten dependencies as per your use case: package everything, build a new image, then distribute that among your team or other people. Define your own base image based on what's required for you.

And wherever possible, use official images — not just popular public ones. Not always, but try to go with official images, because there you have checks for authenticity and integrity; you can trust those images. Next one: share base images where possible.
Say you are running two, three, four containers for different services. Try to have the same base image where possible, because if containers share an image, Docker won't pull it every time; it will share those image layers.

This is from a recent blog post by Snyk: each of the top ten most popular Docker images has at least 30 vulnerabilities. What I'm saying is: know what you are putting in your Dockerfile, know the image you are building. Don't just blindly pull an image because it looks cool and has lots of things inside — at least not in production.

Next one is multi-stage builds. What is multi-stage? Generally, in any Dockerfile you have one FROM statement, right? Multi-stage builds came with Docker 17.05, about two years back. With this feature you can have multiple stages — basically multiple FROM statements, and each FROM statement makes one stage. The first stage will be 0, the second will be 1, the third will be 2.

How can you leverage multiple stages to optimize the image? Basically, you use the end result of a previous stage in your current stage — that's the simple version of it. Say you are doing a bunch of stuff in an intermediate stage, you get a final result, and you use only that in the final stage. You are discarding the intermediate stages, so the final image you get at the end won't have all the dependencies you needed while building — and again, that reduces the size.

Here's a small example — is it visible? If you look at line number one and line number nine, there are two FROM statements.
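The example being walked through is roughly like this (a reconstruction; the build command, artifact name, and paths are assumptions):

```dockerfile
# stage 0, aliased "builder": has node, npm, and all build-time dependencies
FROM node:10-alpine AS builder
WORKDIR /app
COPY . /app
RUN npm ci && npm run build   # assume this produces a binary/bundle in /app

# final stage: a minimal image; only the built artifact is carried over
FROM alpine:3.10
WORKDIR /app
# copy just the end result of the builder stage, nothing else
COPY --from=builder /app/server /app/server
CMD ["/app/server"]
```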
The first one will be stage 0 and the second one will be stage 1. But that ordering is implicit, right? To avoid relying on it, you can use aliasing — you can give a stage a name. For example, here I wrote FROM the base image AS builder, so the stage is named builder. That stage builds some binary into the app folder. Then look at line number nine and line number fourteen: there you use COPY --from and give it builder, because that's the name of the stage, and I'm just copying that single binary file. I don't need the other things — I don't need npm, I don't need the other dependencies that come with the Alpine Node base image. I need only the binary executable, so I copy just that. Once you build it, you will see a huge reduction in the final image size.

Next, BuildKit. What is BuildKit? BuildKit is a project under Moby — and Moby is a project Docker started a few years back. It's a build tool: it uses your Dockerfile to create your image. How is it different from the classic Docker build? Its performance is better; I'll show you one example so you see how it's different. To enable it, just set the DOCKER_BUILDKIT=1 environment variable, and if you want it enabled by default, you can set that in your daemon config.

Here's a Dockerfile built the normal way, without BuildKit: it does stage 0, stage 1, stage 2, all one by one. But with BuildKit, you will see it does things in parallel: it fetches all the files for stage 0, stage 1, and stage 2, and after that it runs the statements for the different stages in parallel. It will improve the performance of your build step.

Next, handling persistent data. As I told you, at the top you have the container layer, and below it the image layers are read-only. To handle data, always go with volumes.
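The named-volume approach being recommended can be sketched with the Docker CLI (the volume, image, and path names here are illustrative):

```shell
# create a named volume; Docker manages where it lives on the host
docker volume create app-data

# mount it into a container; data written to /var/lib/app outlives the container
docker run -d --name myapp -v app-data:/var/lib/app myapp:1.0

# a bind mount, by contrast, ties you to a host path you must manage yourself
docker run -d -v /srv/app-data:/var/lib/app myapp:1.0
```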
There are different ways to handle data in containers. One is a bind mount, one is a volume, and the other is a tmpfs mount. A bind mount depends heavily on the directory structure of your host operating system, but with a volume, everything is managed by Docker. Both keep the files on the host, but with a bind mount you have to manage everything — you have to provide the path and so on — while a volume is managed by Docker for you. In this example I'm using a named volume; I'm giving it a name.

Handling signals: if you want graceful restart, reload, or stop of a process, always make sure you intercept the Linux signals, like SIGTERM and SIGKILL. When the Linux kernel sends these signals, Docker receives them and forwards them to your process running as PID 1. So always have something in your script to trap those signals and do the graceful work — say, before shutting down, restarting, or reloading, close all the database connections, stop running processes, or flush some logs or files.

Next one: you should know what permissions and which user you are running in the container. First of all, avoid running as root. And define your user permissions: don't blindly go with root and give everything 777 — don't do that. So, say you have your application.
You also have some framework-level logic that maintains the life cycle of the script. Just create one user for this and one user for that, and you should know what each user can and can't do, and give permissions accordingly.

Next one: when you're starting a Node.js application, instead of doing npm start, just go with node and the file name. Why? First, npm start simply spawns a new node process underneath. Second, npm swallows your signals — the Linux exit signals — so it becomes difficult to trap them, because npm is not designed to be a process manager.

Also, an application container should always be stateless, ephemeral, and immutable. Don't store any data in your containers; they don't have a long life. And if you have to make a change, don't patch it in production — it's better to rebuild and then deploy again.

And this one: always have, or at least strive to have, one application per container. Instead of what we used to do with VM images — running lots of services on one host: the DB, the web server, and the application all in one — if you do the same thing in a container, you're not making the best use of containers. Decouple your apps into different containers. Once you decouple, you're binding the life cycle of the application to the life cycle of the Docker container: if your app is down, your container is down. It's easier to maintain those life cycles, and it's easier to maintain health checks too.

That's all. Thank you. Sorry, I had to go fast in the middle.
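A minimal sketch of the signal trapping discussed in the talk, assuming a plain Node.js process started as PID 1 with `node server.js` (the cleanup steps are placeholders):

```javascript
// Trap termination signals so the container can shut down gracefully.
// Starting with "node server.js" (not "npm start") lets these handlers
// actually receive the signals Docker forwards to PID 1.
function shutdown(signal) {
  console.log(`received ${signal}, cleaning up`);
  // placeholder cleanup: close DB connections, stop accepting requests, etc.
  process.exit(0);
}

['SIGTERM', 'SIGINT'].forEach((sig) => {
  process.on(sig, () => shutdown(sig));
});
```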
Thank you, Rishabh, for such a nice talk. One question: you said in one of the previous slides that a container is stateless and that we shouldn't put many things into it. Does Docker ensure security by itself, or is it something we have to handle explicitly by integrating some third-party tools?

Docker does take care of security — I'm not sure of every specific thing it does on the security side, but yes, it does handle a lot of it for you. Basically, Docker makes use of namespaces and control groups to provide isolation. Thank you.

Another question: if I want to share my Docker image with someone, do I need to put that image on Docker Hub? Is there any way to share my image openly, without any email/password access — can I share my image publicly with anyone?

You're asking about sharing images outside Docker Hub, right? There are various package managers — I'm not sure whether Docker Hub works without credentials — but there are artifact managers, like Artifactory, that can serve Docker images. They also have authentication, though; I'm not sure whether it's available without authentication. You can try, but I'm not sure on that. Thank you very much. Thank you.