All right. Well, good morning, ApacheCon, and thank you all for turning out for the first talk of the day. About me: I'm David North. I work for a UK-based company called CoreFiling. We do our development mostly in Java, and we work for people like financial regulators, banks, insurers, and governments: sort of big, conservative financial institutions, which, as some of you might know, tends to put some constraints on what you can and can't sell them when it comes to IT. In the last year or so we've ventured into microservices and containers and Docker and those sorts of things, so this talk is based on what we've learned and worked out along that journey, and what we're still working out. So, a statement, not a question: we all love Docker. Well, maybe. I'm not going to go into a huge amount of detail about why developing applications as lots of little services can work well. It has its pros and cons, but I'm doing it, and I'm assuming at least some of you are doing it as well. Quick show of hands: who's using Docker for anything? Quite a few of you. Anyone using rkt? No? Me neither, but I thought it deserved a mention, since Kubernetes, which I'll be talking about quite a lot, does support rkt containers as well as Docker. How many of you are running a containerized application and selling it to people as a software-as-a-service thing? Okay, well, we are. And how many of you are running containers on-premises, for yourselves or for customers? More of you, so perhaps you've already gone in the same direction as our thinking. And how do people orchestrate their containers? Who just runs them using Docker Compose and nothing more complicated than that? Anyone using Kubernetes? One or two hands. Anyone doing something else? Mesos, of course. Yes, we're at ApacheCon, how could I forget. And what about the stuff built into Docker? Anyone using their latest bits and pieces?
Okay, well, there are a whole bunch of ways that you can do these things, and for a lot of them there's brilliant support on the big cloud providers. If you want a Kubernetes cluster on AWS, there are tools shipped for this: you give them your AWS API key, say you want a cluster with about this much hardware, type one command, and it goes off and makes you one. And if you want one on Azure, it's even easier. Microsoft have got native support: I want a Kubernetes cluster on my Azure cloud, click click, there you go. I don't know if that's out of beta yet, but given that I heard about it at ApacheCon last time around, it might well be. And maybe you get your ops team to do deployment, or maybe you've embraced the joys of DevOps and the idea that people who build things should run them. But what people don't seem to talk about, at least not online, is what happens if your customer doesn't want, or can't have, software as a service. What if they need your application to come to them? It needs to be on their premises. So the situation we often find, given that our clients are people like the UK government, who have lots of sensitive tax data that they're virtually prohibited by law from putting in a public cloud, is that they say: right, this is our production environment; there's no access to the public internet from these machines, and there never will be. Usually that's only a back-end restriction, but just occasionally you get someone really paranoid whose front-end machines can't access the public internet, at which point your front-end developers need to think a bit hard, because they've probably pulled in all sorts of fonts and things that are hosted by Google, and you don't want to see what your application looks like when you cut those off. But focusing on the back end, what's the next challenge? These sorts of big enterprise clients often go for Windows as their first-choice operating system.
But we've found that if we ask nicely, smile sweetly, and give a bit of a push, they've got some Linux in there somewhere, and it's probably Red Hat. As long as it's Red Hat 7, this conversation can continue; if you're trying to support Docker on Red Hat 6 it's a bit of a nightmare, and it's not officially supported. But thankfully we've found Red Hat 7 in the cases where we've been asked to do something. So here's a rundown of what we had to find answers to in order to get our containerized application running on one of these big enterprise client sites with no public internet. Question one: how do we get the Docker images there? Obviously they can't pull them from some sort of registry over the public internet. Given that we're using Kubernetes, how do we install a cluster that functions without the public internet? Databases: our application uses some of them; should we stick them inside the cluster and treat them as a black box, or not? How do upgrades take place? How do we stay on top of security updates and patching? These are often things that you have to have answers to in the RFP before you get anywhere near actually doing it. How does the on-site IT team manage all this? They might not have been exposed to containers before, so that can be a challenge. And then logging, and backup and recovery. So let's take these one at a time. First of all, how do we get our Docker images to the client site? Let's pretend for a moment that they've got their cluster running: how do we actually get our application there? Internally, or on your cloud, it's easy: you have a local registry full of your Docker images, or maybe you push them into Amazon's or Azure's registry service, and then you pull them down onto your production system from there. In this case, we have to get a bit more creative, and we have to reach into a dusty corner of Docker that you may not have seen before. I think I'll switch to mirrored monitors to make this possible at all.
Just give me a moment. Yes, you can see the same thing I can. Right. So there are two lesser-known Docker commands called load and save. Save does kind of what the name suggests: it allows you to take some Docker images that you've got locally and dump them to a tar file. And the mirror-image command is load. You can see what Docker images you've got on your local machine by doing this, and if I do a quick save, run it through gzip, because these things do get quite sizable... This reveals a bit of the innards of how Docker works: each image is essentially a stack of layers, so it's easy to capture as a tar, and in order to save this image it's got to produce a tar that potentially contains all of the parent images that build up into Elasticsearch; if it extends from some Linux image, it's got to save all that as well. Okay, you get the picture; let's not sit and watch that complete. But just to show that this really does work, if I turn off my Wi-Fi for a moment... Now, one of the first things you do when you install Docker is this. Normally, of course, if I'd been connected to the public internet, you would have seen it go: ah, you haven't got hello-world locally, so I'll go and talk to Docker Hub, pull it down, and then we're up and running. It can't do that; there's no internet. However, earlier on, you can see that I saved one. As per the command, we haven't got the hello-world image locally, but it looks like I saved a tar of it earlier. That loaded nice and quickly, and if we do docker images, we can see: look, the hello-world image is there. And now if I try that again, it works, still with no internet access, by the way.
To prove I'm not cheating: there really isn't any. So there you are: you can dump Docker images, take them, probably offline, to the customer, and get them to load them in at the other end. Never underestimate the bandwidth of a USB drive that you've sent by FedEx. Let's just go back to mirrored screens so I can see my notes. Just bear with me for a moment... ah, technology. There's a wonderful bug where, when I get this right, my laptop screen goes blank because it sets the brightness to zero, so I'm just going to unplug and re-plug. Sorry, this is not the smoothest of experiences. That's better. Right, so that's docker load and save, and that's one of the major hurdles taken care of. If I could just start my slides again, then we'd be good to continue... there we go. Okay, so we've got the images there; we've solved that problem. Now, how do we actually get a Kubernetes cluster onto this client site where there's no internet access? What do we need to get the client to give us? Well, follow the system requirements.
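Before the requirements, a quick recap: the whole round trip from that demo condenses into a couple of commands. Here's a minimal sketch as shell functions; the bundle filename and the image names in the comments are made up for illustration, not from the talk:

```shell
# Bundle up images on a machine that has them (and internet access).
save_bundle() {
  # docker save dumps each image's full layer stack to a tar on stdout;
  # gzip it, because these bundles get sizable.
  docker save "$@" | gzip > app-images.tar.gz
}

# Restore the bundle into the local Docker daemon on the air-gapped machine.
load_bundle() {
  gunzip -c app-images.tar.gz | docker load
}

# On the build machine:
#   save_bundle myapp/frontend:1.2 myapp/backend:1.2 hello-world
# Courier app-images.tar.gz to the client, then on their machine:
#   load_bundle
```

That's the whole trick: docker load needs no network access at all. Now, back to the cluster requirements.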
We'll need at least two machines. Let's say they're running CentOS, which is what I used for my little trial run of this; very similar to Red Hat, as you all know. The first thing you need to do is get the client to make sure that, by the time you get there, the machines have Docker installed and Kubernetes installed. You need to tell them to follow the instructions on the website and install those using yum. There's no way around needing some internet access, at least to those repositories, to make that happen. But we tend to find that as long as we say, well, you'll need that during the installation, you can prepare the machine images and then cut the internet off, the clients have always been happy with that as a workaround. And of course you're going to need networking in place between your two or more machines; just a private network, not the public internet. So once you've got them past this bit, there's no internet access in anything that follows. And the final thing to make it all work?
Yes, you need to take the Docker images with you. We've found that Kubernetes needs to pull some Docker images of its own in order to work, and annoyingly, the only way to work out which ones, at the moment, is to do the installation somewhere with internet access and then look at what images have appeared. So the best thing to do is run through the install while you do have internet access, then do docker images to see what it's downloaded onto the machine. You save all of those with docker save, and once you've loaded them into the target machines at the other end, it will all spring into life and start working, and shouldn't need the public internet. Then you can, for example, use kubeadm to initialize your cluster. And of course you need to make sure that whatever you're deploying into the cluster doesn't need public internet access either. Fortunately, it's quite a lot harder to slip up in back-end development than in front-end when it comes to accidentally relying on somebody else's URLs. A few tricks we've found over the years: when we run our tests, unit or functional, we tend to install a security manager so that we spot and block any tests that rely on the public internet. It's obviously a reliability problem anyway, so it's much better if we can make sure tests and code don't do that. And for the bits that really do need to, you want to have some kind of option to work from an offline cache. Now, one of the elephants in the room: you get a nice message printed out by kubeadm saying, this is beta, don't use it in production. And don't. But it was alpha in Kubernetes 1.5, it went to beta in 1.6, and we're hoping the forthcoming 1.7 will make it production-ready. And once you've initialized the cluster, you're not actually relying on this tool at all. So it's a question of saying to yourself:
well, where's the risk? You can look at what shape of cluster it sets up, and once you've used the beta tool to do the install, you're not reliant on it after that. And it is by far the simplest option for doing this. So for the moment our strategy is: we use it, and we hope to see it go from beta to the real thing very soon. Now, what about upgrading? Unfortunately, this is one of the things that kubeadm doesn't do yet, or doesn't make easy. However, think back to the situation I described, where you've got your big enterprise client, you're deploying your cluster for them, and you're putting your containerized application on it. Most of the time, what you're deploying is something they expect to run in production for a number of years, probably taking only security-critical fixes and occasional updates. And given that there's no public internet in this scenario, one of the biggest attack vectors for older software is taken away straight away, so it's not too much of a concern. We don't have a brilliant answer on that yet. But the other thing to think about, which I'll cover later, is that if you put all the important state on volumes or in databases outside the cluster, then the cluster itself is a transient thing, and blowing it away and creating a fresh one isn't such a big job. That's something I alluded to: what lives inside the cluster, and what lives outside? If you've gone to all the trouble of making a cluster of containers, your first instinct would presumably be: well, let's put absolutely everything in there, because then it can be any old Docker container, it doesn't matter what's in it, and I don't need to impose any system requirements on the client that they don't already have. But again, this is where reality comes up. Often in these kinds of enterprises, the database admins are quite protective of their little corner of the world, and the idea that you can bypass the need for them by just sticking some databases in a container inside your
cluster doesn't go down well with them at all. And there are practical reasons as well: if your database is one of the most I/O-intensive parts of the application, you don't necessarily want to be running it inside a bunch of Docker containers with volumes attached to them; sometimes running it directly on physical hardware might be better. Sometimes the client has a preferred database, in which case they'll have a cluster of it, and they'll have it backed up and highly available, so let's take advantage of that. So often we do say: all right, we use Hibernate, so we can support anything you like for the relational part of this; you might as well use your existing database cluster, and we'll just configure our containers to point at it. The only thing you then need to watch out for is that, obviously, there needs to be some way of getting, network-wise, from the cluster to wherever they keep their databases, and there need to be no firewalls in the way of that particular access. That's not difficult to arrange, but in this kind of environment it may be, in the most extreme case, that you have to submit your written request to the firewall team two weeks in advance, and not discover when you're on site that you haven't done so. So what about upgrading the application running in these containers itself? Well, why don't we just do another docker save and ship them a nice big bundle of images?
You'll find if you do this that it gets quite fat quite quickly, because the total file system of the container, building all the way back up to whatever it extended from, can be quite big. But as I mentioned before, the bandwidth of FedEx and a USB drive is pretty good, and if there's no internet access at the other end, that's often the best way to get things there anyway. So what we actually do in-house is, as well as having continuous deployment of our containerized applications, we also build an actual build artifact for each version, which is all of our different microservices as a Docker image bundle that you can download and walk away with as a self-contained thing. The other possibility is that you could run a Docker registry and say to the client: well, look, if you just allowed one little bit of outbound internet access to our private registry over here, then you could download the images from that, and wouldn't life be simple? But it's a bit messy running a Docker registry that requires authentication. And if you're in a full-on commercial setting anyway, it might not be such a bad thing to have to explicitly ship the binaries to people; maybe you already have a system for doing that with other software. So, security updates. Here's another one that people never seem to talk about. The answer, if you're just running containers on AWS, or on one deployment wherever you like, is: well, you rebuild them nice and often, and by doing so you make sure that whatever they're based on is always the newest, latest, and greatest, and that just naturally rolls in all the patches and makes sure there are no security holes lurking in there. But what if they're going to be running on a client site for some time?
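One mechanism that helps here is auditing every Dockerfile against a short approved list of base images, since FROM has to be the first instruction. Here's a minimal sketch of such a check; the allowlist contents are illustrative, not our real list:

```shell
# Approved base images (illustrative; substitute your own short list).
APPROVED="openjdk debian"

# Return 0 if the Dockerfile's base image is on the approved list, 1 otherwise.
check_dockerfile() {
  # FROM must be the first instruction, so the first FROM line is the base;
  # strip any tag (openjdk:8-jre -> openjdk).
  base=$(grep -m1 '^FROM ' "$1" | awk '{print $2}' | cut -d: -f1)
  for ok in $APPROVED; do
    [ "$base" = "$ok" ] && return 0
  done
  echo "ERROR: $1 extends from unapproved base image '$base'" >&2
  return 1
}
```

Run something like that over every Dockerfile in CI and the build fails the moment someone sneaks in a new base image, which keeps the list you have to monitor short.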
So we experimented with a few things on this. At one point we actually got as far as having someone spend half a day working out whether you could run yum update or apt-get update inside your Docker containers and then run docker commit. But that's really messy, and it also suffers from the fact that if there's no public internet access, you can't do it at all. So we took a compromise approach: our build process monitors every Dockerfile. Fortunately, you can only extend from one base image, and it has to be the first line in the file, so it's quite easy to run an audit over all of them and say: this is our approved list of base images. There are only about five of them: openjdk, Debian, and one or two others. You must extend from one of these. And then we monitor: we use something like Black Duck to monitor the software inside them, for vulnerabilities coming up in the Java world, and we monitor the upstream mailing lists for security-critical problems in the software we're taking in from the base images. So in the relatively rare event, but it does happen, of a serious security hole affecting something that we're using, we say to the client: okay, here's your image bundle that contains the updated versions. So what about the on-site IT team in this scenario? Have they met containers before? Quite possibly not, if they're in a more conservative or older enterprise, so we may need to give them a bit of education. This is where Kubernetes can actually really help, because one of the things it provides is a dashboard. So it's quite nice to be able to say to your on-site IT team, particularly if they're more used to Windows: here's this nice HTTP dashboard.
It shows you the health of the application; you can go in and click through and see the logs; you can see whether everything's working; and you can do all this from a browser without getting your hands dirty with the command line. That often goes over quite well, and in some ways better than our previous iteration of things, where we'd say: well, you need to install Tomcat, and then you'll need to stick a few more files on it, and you'll need to manage it from the command line. So that was an unexpected bonus. Here's a little picture of the Kubernetes dashboard: you get some nice metrics for free, and you get to see the various different things that are deployed. So talking people through this isn't too difficult, and we provide a little manual giving the basic steps for troubleshooting and the most likely scenarios. Logging. One of the downsides, or one of the things that needs to be managed, about microservices development, as you know, is that instead of having one monolithic application, which can pump everything through a handful of carefully chosen log channels to appear on disk or in a database somewhere, you've now got zillions of little services, or, you know, maybe a hundred of them, all writing their own log output. So it's got to be brought together and managed somehow. ELK stands for Elasticsearch, Logstash, and Kibana, and it's a way of aggregating and working with logs in this kind of environment. So you install an Elasticsearch index in your cluster, you use Logstash to feed all the logs through into there, and Kibana gives you a nice web interface to search through them. Now, there's one annoying thing we've found about this, which relates to one of the common use cases: if on-site IT can't solve a problem themselves, then you want to be able to say to them:
well, right, just export the logs for the last 12 hours, email them to us, and we'll have a dig through and work out what's happened. Annoyingly, as you can just see here if I show you for a moment, this is the Kibana web interface, and it's quite nice: you can search through things, you can run queries, you can do visualizations. This is just a demo that I found online somewhere, because I'm not running one locally, but you can see here that we're aggregating the log statements from various different services. What's missing from this picture is a button labeled Export. So you can identify the logs you want them to send you, but there isn't a handy button to dump them out as CSV or dump them out as Excel. We're hoping upstream Kibana will solve that for us; for now, the best solution we've come up with is that they need to work out which container is causing the problem, and then the Kubernetes dashboard lets them zero in on exactly what we want to see, and they can copy and paste it from there. So it's not very elegant, and if upstream doesn't do anything about it, then in the spirit of open source we might try to do something about it ourselves. Backup and recovery. So you've got all these containers running, doing stuff. If you're running them in AWS, which we do for our own deployments, then it can be pretty straightforward: you take snapshots of the AWS volumes involved, and if you're running databases on AWS, you can schedule automatic backups; I'm sure Azure does something similar. It all kind of works; you don't have to think too hard about it. Although it's worth remembering in these things that it's not whether you run backups that's the question you should be asking: it's whether you run restores. Have you actually tried blowing away the system and reconstructing it from what you backed up last night?
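For what it's worth, a restore drill is easy to script. This is only an illustrative sketch: the database name, the backup layout, and the health URL are all made up, and a real one would follow whatever the client's DBAs and backup tooling dictate:

```shell
# Rebuild application state from last night's backup, then smoke-test it.
restore_drill() {
  backup_dir=$1   # where the nightly backup landed
  data_dir=$2     # where file-store volumes get restored to
  # Restore the relational dump...
  psql myapp < "$backup_dir/db.sql"
  # ...then the file-store volumes...
  tar -xzf "$backup_dir/volumes.tar.gz" -C "$data_dir"
  # ...then check the application actually comes up against the restored state.
  curl -fs http://localhost:8080/health
}
```

The point is less the script than the habit: run it on a schedule, against real backups, before a client ever has to.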
If you haven't, don't let the moment when something has gone wrong be the first time. What you need is a documented and tested process for the client to run through on site. So what do we actually do to make that possible? Well, thinking back to what I said about how your cluster should be as stateless as possible: if your databases are an external thing, then that takes care of that; if the on-site DBAs are running them, they know how to back those up and what to work with there. The only quirk we've had to work out on that one is that if you've got multiple different stores in your application, maybe some of them on disk and some of them relational, it may be that the order in which things are backed up starts to matter. So, for example, if you've got some metadata in your database and some files on disk, then you need to do the backups in a certain order to avoid capturing a state where you might have some files that don't have associated metadata, or some metadata that doesn't have a file. That's something your architect needs to think about hard. Maybe the solution is to just put everything in as few data stores as possible, but if you start trying to stuff huge binary files into a database, then that has downsides of its own. But apart from that one quirk, yes: you tell the client to back up the databases in the usual way; if you've got something like Cassandra in the mix, you follow the advice on how to back that up; and then file systems: if you have everything mounted into your cluster as external volumes, then you say, well, back those up in the same way as you back up any file system. It's just a matter of making sure that you really can restore from these things to a working state. So what else? If we're trying to develop software that wants to be deployable as microservices in a cloudy setup, but also on site like this, it does limit architectural choices a bit. There are all sorts of APIs in AWS and in Azure for doing everything from
sending email to machine learning, and sometimes, you know, on a rainy Tuesday when you've got to get something done, you look at them and go: oh, wouldn't it be nice if life was simple and I could use what Microsoft or Amazon have already done? But we can't, or at least we can't tightly integrate those as the only way to do something. Although we've also found that's a good stick to beat our customers with, because obviously we'd love to get to a world where our customers get rid of this unhealthy obsession with having everything on site and just use our cloud; wouldn't life be great? So one way of encouraging movement in that direction is to say: well, here's some nice, exciting functionality over and above the basic product. We'd love you to be able to use it, but you'll need either to use our cloud deployment or to allow access to this bit of cloud over here to make it work. Now, there is an interim state which I haven't talked about much. When we go and have a conversation with a potential customer about our software, what do we actually say to them? We say: well, you've got a few possibilities. You could buy it from us in the cloud, and we'll sell it to you multi-tenant;
that's the preferred option, because it's cheaper for you, and it reduces resource overhead for us. If they've got deeper pockets, but they're still willing to take it in the cloud, we'll do them their own sort of isolated cloud deployment. And we've found a middle ground with at least a few big customers, which is quite encouraging: they're heavily invested in AWS, or heavily invested in Azure, themselves, and they want us to take our stuff and put it in their AWS account or their Azure account. That works really well because, as I mentioned earlier, there are some really good tools for working with these things. So we say to them: if you just give us a few API keys, we'll do the install, we'll talk your IT people through managing it, and then, when we've done the install, you can revoke the keys we used to do it, and it's all yours. We've done that in probably more cases than we've done the on-site thing, and it's definitely our second-favorite option. So what do we leave to the customer? What do we explicitly leave out of scope for our install guide, or say is really up to them? SSL: if it's anything other than our public cloud, then it's probably got the customer's domain name involved in access to it. That means they need to get hold of the SSL certificates, they need to install them, and they need to work out what they're doing with various settings like Strict Transport Security. What about Windows?
So I mentioned earlier that our big enterprise customers love their Windows. So far, it turns out that if people are big enough, they can almost always be talked into having some sort of Linux, because it won't be the only Linux in their IT estate. But what if somebody really digs their heels in and says: I really, really want this to work on Windows? Well, so far we've been lucky and we haven't encountered that, but our line is very much that Microsoft are working very hard to solve that for us. We've already got, today, the ability to run Linux Docker containers on Windows. All right, it's not recommended for production just yet, but there's a lot of work being done with hypervisors and things to make that possible, and there's a lot of work being done on Kubernetes to make it possible to run estates of Windows Docker containers. Given that most of the content of our containers is just Java, we can envision a future where we build all of them twice: once as a Linux container with Java running inside it, and once as a Windows container. Right now, if someone absolutely insisted on Windows, the best we could do is say: well, keep all the databases and data stores outside of the cluster, and run a Linux VM or two to contain the cluster. So, I realize I haven't quite managed to fill as much time as I should, but I guess that gives us longer to go over any questions, if anybody has them. So, the question was: how do we get the customer to send the logs to us?
There's a service called Logspout, I think, which you can install, and you can tell Docker to send all of the logs through that, and then that feeds them into Logstash. And I think we've got a couple of deployments where we send everything to syslog, for example; it depends a bit on what people want to do. Any others? [Inaudible audience question] So, the eventual destination for the logs is Elasticsearch, which is then backed by some volumes, so that's where they end up; and I think it depends what you've done with the Docker log drivers as to how long they're kept by Docker itself. Any others? [Inaudible audience question] Sometimes, and you're right, there has to be a better way than doing it completely by hand or copying the files around. I think one of the things we've not done, and should do, given how many of our customers have it, is get some Red Hat for ourselves and properly play around with all of the value-add in this space, and indeed with the fact that Red Hat are doing a lot of work with Kubernetes and with this kind of thing directly. Any more for any more? Well, thank you for listening.