Ladies and gentlemen, it is now 3:35 p.m. Fortunately, this year we don't have any exit instructions, but in case you need them, it's the same door you came in through. So welcome, ladies and gentlemen, to the pre-flight platform checklist presentation. If you're here for the Doomsday presentation, that's next door. Go see that one. Hey, don't exhale.

Myself, I'm Chris Weible. This is Mr. Kevin Rutten. We both work for a little company called Stark & Wayne. We're the idiots running around in the blue shirts in the main... what is that thing called? The hub? The Foundry, thank you. It's down there. Come visit us with any questions after this.

Our talk today is primarily about the lessons learned from deploying BOSH and Cloud Foundry. This is my sixth CF Summit talk. Not talk: the sixth time I've come to CF Summit. In that time we have seen a lot of ways of breaking BOSH and breaking Cloud Foundry. Fixing, breaking, fixing, breaking, usually in production.

Back when we started all of this, the documentation around Cloud Foundry and getting BOSH going was not exactly the most thorough thing in the world. There was a small treasure trove of carefully curated manifests to get Cloud Foundry going. There were some Google Docs that even until recently were still getting a lot of additions, and there were READMEs that quickly became outdated with every new version of Cloud Foundry and BOSH that came out. Let's face it, it was a small miracle back then just to get Cloud Foundry running for the very first time.

Yes, sir. This is me trying to go through my speaker notes, which are a little longer than I was hoping they were. I'm getting to this, I promise.

In the subsequent years, a great number of resources have become available to us that weren't always there. I want to put out a special thank-you to the docs.cloudfoundry.org folks, who have done a phenomenal job. So, to anybody that has contributed to that.
I'm hoping some of you are here. If not, that's still been a great value to us. Kevin, please click to the next slide.

Which brings us to the talk we have here today. The docs out there are great, but there are still a lot of sharp edge cases. So, for all of the bits and pieces that may have fallen off the end, I give you Mr. Kevin Rutten. Thank you very much.

Thank you, Chris. We're going to break this into three sections. We're going to start with the people, then move on to the IaaS, and save the fun networking for last. In all of them there are a lot of big gotchas that we're going to go through. Hopefully we'll highlight the common pitfalls that may stall your project.

So why people first? In the first draft we wrote of this talk, we had IaaS and networking first, because that's kind of the obvious thing. But in real life there are a lot of conversations: you're going to be talking to the people on your team and making these decisions, and that's where we're going to start our story.

You need to set expectations early on. We're going to start with one that seems to be avoided in every conversation we have: when, where, and how are you going to maintain Cloud Foundry once it's up and running? When you're talking to the stakeholders, the managers, all the teams, please have this talk and figure out: when can you have a maintenance window? When can we do the rolling stemcell upgrades?
If you don't have this talk early, and get buy-in and agreement, your foundation is going to be Jurassic. Sometimes you just need to make sure you have some quiet time. Most of the rolling restarts are painless and go unnoticed, so that might give them a little bit of a warm, fuzzy feeling when you have this talk.

So, unlike the photo: backups are good. You need to decide what you want to back up. Normally you're backing up the database and you're backing up a blob store. It doesn't have to be that complicated. For some foundations you can simply use the API to scrape all the users, orgs, spaces, and quotas. And if you have proper pipelining (all us good developers do that, right?), then if you have to restore Cloud Foundry you boot it back up again, deploy it, create all the orgs and spaces, and push all your apps again.

But without a backup plan, your production environment is basically a proof of concept. The first time you run into trouble, you're kind of hosed. You don't have any history or audit trail. Someone's going to ask, "Hey, what was the app that was deployed before it went down?" or "Where are the assets?" and you're stuck. So decide what you want to back up, where, and for how long. This is something you have to talk about ahead of time. If you don't have these discussions before you even start, then once you've deployed you may not have a place to put the backups, so you're not doing backups, and then you're going to run into problems.

Get your backups running on day one. When you deploy, do a backup on an empty foundation. It is really fast. Once you've got the backups started, then as you start adding stuff you don't have to worry about, you know, a month down the road, going: oh, geez.
We have to get backups going. There are other talks going on about the SHIELD project and BOSH Backup and Restore; there are lots of ways to do it.

And while I said get your backups done on day one, that's also a good time to consider doing a restore. Do a backup, push a few apps in, do a restore. If you've never tested your backups, you have binary data. You might not actually have backups.

When you're deploying your Cloud Foundry, you've got some choices to make. Do you have an internal or external database? Do you have an internal or external blob store? There are pros and cons to each. The internal blob store and database are built in to Cloud Foundry, very well tested, very well integrated. When you tear down Cloud Foundry, everything's deleted: persistent disks are gone, everything's cleaned up. The external ones are a little bit more work, but you've got some huge advantages. If all your data is stored in external blob stores and actual databases, you can tear down your Cloud Foundry and deploy it back up again. The database is still intact, the blob store is still intact, and everything will come back up again.

So that makes a big difference in your backup story, whenever you're doing maintenance or disaster recovery. Do you have to back up your blob store manually? What if you're using, say, S3, and it's replicated across regions?
Well, then you may not have to back up your blob store, which could be gigabytes or terabytes of data.

One of the things to also remember, and it's going to be a little bit different: back up your credential store, your CredHub, your Vault. Backing it up is something you have to talk to your security team about, because it's going to be done a little bit differently. But I like to back it up and take incrementals, because sometimes when you're doing a restore you need the credentials as of a point in time. You can go to your backup list, restore that backup, and boom, you have the state at that time and know which credentials you were using.

Kevin, real quick, one thing I wanted to point out. I forget whether it's the UAA or the CredHub database... sorry, UAA or CCDB: the encryption key for that particular database is stored in CredHub. So if you're backing up your Cloud Controller and UAA databases and you go to restore them to a new environment, but you've regenerated your CredHub variables, you're not going to be able to read that old data. So make sure you're backing up all of the databases you've got there.

Thank you. The common practice is to try to back up everything at the same points in time. So don't back up your credential databases once a month but your infrastructure once a day, because you're going to run into that problem.

So now we're saying do backups, do backups. One you can run into trouble with: never back up the Locket database. Actually, you can back it up.
Just never restore it. The database gets put back, the locks are put back into place, but now there's nothing to release them, so those locks will never be released. That will cause you lots of fun in the future.

So, with the people: you are going to have a firewall team, a security team, an infrastructure team. Weeks ahead of time, get to know them. Maybe take them out for lunch. Maybe make sure they're working in the same area. Because when you're doing any sort of deploy, you're going to be talking to them often. The day before you deploy is not the day to start talking to them. That's the day you're going to find out that the team is going on vacation the next day. So talk to them often, include them in the conversations, and keep them in the loop.

You also want to test the assets. These teams are going to give you stuff like SSL certificates, which I'm going to pick on because they're fun. Do you have the SSL certificates, and do they have the correct DNS wildcards and aliases? Did they give you the intermediate certs? Have you checked that they match? Do you have the DNS set up, and does it match these certificates? I've been at a few places where the certificates showed up with the wrong domain name. Now you've got that two-week wait: oh, I've got to reprovision, I've got to reorder.
It has to go through purchasing. Fun to go through. And finally: install Doomsday.

I made a joke a little earlier about a Doomsday talk going on right now. Stark & Wayne is also giving a talk about Doomsday, which is a great little application that goes out and looks at your certificates and when they expire. You don't want to learn that your certificates have expired after they have; it makes cleaning up your production environment a whole heck of a lot harder. So I encourage you, when you get back home (I'm assuming these are being recorded), to go back and check out that presentation.

Yeah, it turns out that if you let your NATS certificate expire, then when you deploy and BOSH tries to talk to the agents to update the certificates, the certificates have expired and the agents don't want to talk. Then you have to go through all the fun of manually fixing all the instances, which can be kind of scripted but is still very, very painful. Tools like Doomsday will give you a bit of a heads-up.

Yes, so what's on the screen here is a screenshot of the CLI version of the tool. There will be a GUI version that you can deploy straight to Cloud Foundry. In this case you can see that, whatever our cool site dot com is here, within the next two weeks, if nobody makes any changes, we're going to have a production issue, because that certificate is going to expire. That's bad.

Yeah, this is actually set up in my profile.
So every time I log into the jumpbox, I get the list.

One other very critical thing when you get the SSL certificates: check that the key and the certificate match. Check the modulus. If they don't match, then when you go to deploy, HAProxy or whatever load balancer you're using is going to complain and possibly blow up. And then you have to go to your security department, start the requisition, wait two weeks, and so on and so on.

Something else you should try to set up is pairing with someone on your deploy. We are very big proponents of pairing and of knowledge transfer. I work at an amazing company, all the people I work with are amazing, and my co-pilot here is fantastically funny and, you know, a moderately smart guy. Pairing is quite critical because it helps save you from fat fingers. It's the knowledge transfer. It's making sure you're not the only person who knows how the foundation is set up. So when the phone call comes Sunday night saying, "Hey, can you come in?", now there's a 50-50 chance that he'll get called.

So all of you have been in rooms together and meetings together and knocked out all the discussions, and you're going to make your choice of infrastructure. You've got a couple of choices.

You've got the classic vSphere. You're going to need the vCenter IP, username, and password. Choose whether you're using the internal or an external database. Are you going to be using WebDAV, NAS, MinIO, or some other blob store? HAProxy or another load balancer? Where are you going to terminate the SSL?

You've got the grandfather of the cloud, AWS. You've got your access keys and secrets; internal database or RDS databases; WebDAV, S3, or another data store; HAProxy, ELBs, or another load balancer. Where are you going to terminate the SSL connections?

You've got Azure: subscriptions, resources, tenants; internal or Azure SQL; WebDAV or Azure object store; HAProxy or Azure load balancer. Where will the SSL be terminated?
Google Cloud Platform: service account key; internal or Cloud SQL; WebDAV, Cloud Storage, or another store; HAProxy, cloud load balancers, or another load balancer. Where will the SSL termination be?

And OpenStack: account details; internal or external SQL; WebDAV, Swift, or another object store; HAProxy, Octavia, or another load balancer. Where will the SSL termination be?

So on the last five slides we covered five different infrastructures, and we had five different questions on each. It's very important to know the answers to those five questions. We bring them up because we see that they're not necessarily answered when we go on site. Depending on how you answer them, there are severe implications. Where you've put your databases and where you've put your blob stores can result in outages, purely from the way you've designed your system, from these five decisions alone. So while we glossed over those maybe just a little bit, you need answers to those five questions.

The IaaSes look pretty much the same, but there are subtle differences. Actually, some big differences. When you're looking at documentation: most of the IaaSes, except for vSphere, need a BOSH registry, and depending on the documentation you look at, port 25777 is often not listed. Every IaaS has its own branding, its own flavors and features, and not all features are available in all regions either, so you have to be careful of that. And when you're following the docs, which is important, you also have to be very aware of which infrastructure the docs were written for, because they will probably skip steps that your infrastructure is going to need.
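The five questions from those slides (credentials, database, blob store, load balancer, SSL termination) lend themselves to a written checklist that flags whatever is still unanswered before anyone deploys. What follows is only a minimal sketch of that idea: the class name, field names, and example answers are hypothetical and are not part of BOSH or Cloud Foundry.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FoundationDecisions:
    """One record per foundation; None means the question is still open."""
    iaas: str
    credentials: Optional[str] = None       # e.g. vCenter login, access keys
    database: Optional[str] = None          # internal or external
    blobstore: Optional[str] = None         # WebDAV, S3, Swift, ...
    load_balancer: Optional[str] = None     # HAProxy, ELB, Octavia, ...
    ssl_termination: Optional[str] = None   # where SSL is terminated


def unanswered(decisions: FoundationDecisions) -> List[str]:
    """Return the names of the questions nobody has answered yet."""
    return [f.name for f in fields(decisions)
            if getattr(decisions, f.name) is None]


# Hypothetical example: an AWS foundation with two questions still open.
plan = FoundationDecisions(
    iaas="aws",
    credentials="access key and secret from the security team",
    database="external (RDS)",
    blobstore="S3",
)

if __name__ == "__main__":
    print("Still undecided:", unanswered(plan))
```

Keeping a record like this in the same repository as the manifests is one way to make the open questions visible to every team in the room.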
So, yeah, definitely keep in mind the differences between them. There's a lot of difference in terminology. In the pricing tiers: even though they all have a VMs-per-hour price, once you start adding in access and network traffic the pricing is very different. And in feature parity between platforms.

Once you get your IaaS keys, take a few minutes and do some testing. Do your credentials, your IAM keys, allow you to create and delete buckets, create and delete files? For most of these things, Amazon has a tool you can download, Azure has a tool, MinIO has a tool. Download those and use your credentials. Just take five minutes, exercise them, and make sure you can actually create or destroy something.

Yeah, I'm 0 for 6 on my last production deploys when it comes to the set of credentials I get for the underlying infrastructure actually doing all of the things I need them to do, whether it's create a VM or create a bucket. One I had was: I can write to the bucket, but I can't read the bucket. That was cute. So just because you've got a set of credentials, don't assume that the permissions you've been given will actually work. Try to test those out before you do a BOSH deploy and then have to debug that. There are some real simple tools out there. Go ahead and use them.

Yeah, also realize that your security team is going to try very hard to minimize the attack surface, so they give you the least permissions possible. The keys you get are probably going to be missing stuff. You may want to consider talking to them about getting your own private account and doing all your deploys there, so that your infrastructure is separate from anyone else's. That might make it a bit easier for them.

Now, when you're talking to them about getting your own account, make sure you're not the person with the credit card. It doesn't matter if you have a corporate card from the company.
Don't use it. Don't fall into the trap. When the credit card screen comes up, find somebody whose job it is to have the credit card number and have them put it in. Because what happens is: you're deploying, everything's going great, the card expires, and the infrastructure gets turned off. Then they're going to look at whose card it was. "Oh, it was your card. How come you didn't update it?" There are people whose job is to make sure that the cards are paid and the cards are active; if the cards are stolen or addresses change, it's all handled. Amazon has umbrella accounts: set your account up and put it under the umbrella. I've had an issue where the credit card got declined and the infrastructure just disappeared.

In that vein, with all these infrastructures you can usually get an account manager. Get to know them and talk to them periodically. Sometimes, if the credit card gets declined, instead of turning off the infrastructure they might give you a phone call first so you can update it. So get to know your account manager.

We talked a few minutes ago about the fact that when your certs expire, you're going to have a problem. When your credit card expires, you're going to have every problem.

Networking is a big can of worms to talk about, and we've already been talking a little bit about TCP ports. You're going to have to go through the documentation: SSL bootstrapping, director, NATS, that's going to be in one document, and then CredHub, UAA, application SSH, SHIELD, registry. This is not a complete list. Nowhere will you find a complete list. You are going to have to go through everything you're installing, everything you're working with, find that list, and build it. And I recommend you do something like this, where you actually label what the ports are for, because your security department is going to say, "You gave me these IP addresses. What are they for?"
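A labeled port list like the one described can live in a small script, so the same inventory produces the firewall request and a quick reachability check. This is a sketch only, not a complete list. The port numbers are the commonly documented defaults as far as I know (25777 for the registry is the one named in the talk), and both the numbers and the purpose text should be treated as assumptions to confirm against your own manifests. `PortRule`, `firewall_request`, and `is_reachable` are hypothetical names.

```python
import socket
from typing import List, NamedTuple


class PortRule(NamedTuple):
    port: int
    component: str
    purpose: str


# Assumed default ports; confirm each one against your own manifests.
PORT_INVENTORY: List[PortRule] = [
    PortRule(4222, "NATS", "BOSH agents talking to the director's message bus"),
    PortRule(25555, "BOSH director", "bosh CLI and pipelines talking to the director"),
    PortRule(25777, "BOSH registry", "VM settings lookup on most non-vSphere IaaSes"),
    PortRule(8443, "UAA (director)", "authentication for the bosh CLI"),
    PortRule(8844, "CredHub (director)", "credential lookups during deploys"),
    PortRule(2222, "Application SSH", "cf ssh into app containers"),
]


def firewall_request(rules: List[PortRule]) -> str:
    """Render one labeled line per port, sorted, for the security team."""
    ordered = sorted(rules, key=lambda rule: rule.port)
    return "\n".join(
        f"tcp/{rule.port:<6} {rule.component:<20} {rule.purpose}"
        for rule in ordered
    )


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(firewall_request(PORT_INVENTORY))
```

Running `is_reachable` from the jumpbox against each entry once the firewall change is in place gives a quick answer to whether the rules that were requested are the rules that were applied.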
If your company has a corporate firewall, factor in a couple of days to debug it, every time, as you will have problems with it every time. If your company has a proxy to access the internet, factor in a few days to debug it, every time, as you will have problems with it every time. New rules get added, definitions get updated. You may need to disable, or they may decide to disable, ICMP pings for a CVE or something. Stuff changes. You want to have your security team on speed dial. If they have any of this stuff, you are going to be on the phone with them.

Also be careful of home router ranges. When you're setting up your network, avoid the common ones, the 192.168.0.x, the 10.0.1.x. I was working with a customer months ago. I VPN'd into their network and tried to log into their jumpbox. Timed out. Tried to log into the jumpbox again. Timed out. Looking at it: why am I trying to SSH into my printer? Oh. Their network and my network have the same IP range. It won't route. You're going to cause a world of headaches if you use these IP ranges for your network, because any time you have a VPN, there's going to be somebody on your team who has a problem. In my case I found a workaround: I went onto the guest network, which was on a different IP range. But that's kind of annoying.

So, isolation segments. They're a little bit easier to deal with. If you're booting up multiple foundations, you can use the same IP ranges for all of them, because they're isolated. But you do have to work with your services team to make sure your isolation segments and their services IPs don't overlap. Otherwise the containers can't talk to your database, your Redis, any of that sort of stuff.

And [unclear] needs to flow downhill.
I believe that's completely self-explanatory.

There are lots of things you have to check. If you have DNS set up, double-check that it's returning the correct private IP addresses instead of public ones. If you're on AWS, for instance, and it's returning the public IP addresses for S3, you could be billed at a slightly different rate than if you're using the private IP addresses, because the traffic goes out and comes back in again. It's a different tier.

Check your VM sizing and disk sizing. Make your disks bigger than you think you're going to need. Remember, you can run out of resources, so plan ahead so that you don't fill up.

So, any questions?

See, I helped with a few of the slides. Like Kevin said, if there are any questions, cool, we'll answer them now. If not, you can come see us down at the... what is it called again? The Foundry. We'll be the people standing there, kind of chilling out because our talk is done. Feel free to ask us any questions you've got. Otherwise, I want to thank you folks for showing up today. Enjoy the rest of the conference. Thank you.
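A footnote to the networking section: the home-router-range story and the isolation segment warning both come down to checking whether two IP ranges overlap, which Python's standard `ipaddress` module can do before anything is deployed. This is a sketch under assumptions. The list of "home router" ranges is illustrative (the talk names only 192.168.0.x and 10.0.1.x), and the function names and example networks are hypothetical.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple

# Illustrative home-router defaults; extend with whatever your team's routers use.
HOME_ROUTER_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.168.0.0/24", "192.168.1.0/24", "10.0.0.0/24", "10.0.1.0/24")
]


def find_overlaps(named_cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of named networks whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


def home_range_collisions(cidr: str) -> List[str]:
    """Return the home-router ranges that a planned network would collide with."""
    planned = ipaddress.ip_network(cidr)
    return [str(home) for home in HOME_ROUTER_RANGES if planned.overlaps(home)]


if __name__ == "__main__":
    # Hypothetical plan: an isolation segment that swallows the services subnet.
    plan = {
        "isolation-segment": "10.20.0.0/16",
        "services": "10.20.4.0/24",
        "jumpbox": "172.20.1.0/24",
    }
    print(find_overlaps(plan))
    print(home_range_collisions("192.168.0.0/16"))
```

An empty result from both functions does not prove the network plan is sound; it only rules out the two specific collisions described in the talk.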