Well, hi everybody, welcome. Thanks for being here. So welcome to the Open Source On-Ramp. This track is meant for people new to or interested in open source, so our goal is to get you started with stuff. So without further ado, I would like to introduce Amy, who's going to talk about serverless.

Oh, good, everything works. That's always nice. Oh, I'm so sorry about that. Oh, wow, that was loud. Good morning, everybody. Or rather, good afternoon. This is my first time at Open Source Summit, so welcome, everybody. Just to make sure you're all in the right place: this is Managing Your Serverless Servers, Again. This is going to be everyone listening to me talk for about half an hour, super excitedly, about serverless, because it's my jam. So let's get started.

What do I mean by "again"? "Again" means I talk about the subject a lot. My name is Amy Arambulo Negrette. I'm a developer advocate over at DigitalOcean, and I've been doing serverless development since 2018. It's something I'm both deeply passionate about and something I've seen fall on its face multiple times. Because have you ever really built anything if you haven't seen it break? I would say no. In fact, I did a talk over at Serverlessconf 2019 called Managing Your Serverless Servers, and that was specifically about troubleshooting. While this is going to include portions of that, it's also going to include a lot more. We're going to talk about the entire software development cycle as it pertains to serverless. So by the end of this talk, I hope that you have a deeper understanding of the way functions as a service works and what to consider when designing a serverless application.

So let me set the scene for us here. This is my favorite hot take, and it's that serverless still has servers. If you haven't seen this as an argument, then you probably haven't seen an unfounded argument on the internet before, and for that I envy you. This is the sort of comment
you'll see on forums or threads or live streams, kind of in that vein of "your favorite language is dead" and "if you're programming on this OS, you're doing it wrong," that sort of thing. All of these arguments are nonsense. They're all wrong. They're always just fighting words. If it executes and delivers something on the internet, naturally it's going to have a server, because that's what the server does in this story.

But that does leave an actually interesting question: what is serverless? This alone isn't a bad question. The previous statement was bad. This is less bad, because cloud providers are stamping the word "serverless" on a lot of things, and they're producing numerous services every day. Sometimes they're actually serverless. Sometimes they just say they are, but they still want you to plug in machines and configure stuff and do the hard part of programming, which is not what I signed up for. So let's do some level setting and figure out what serverless actually is first.

According to Serverless Stack, which is an interesting book that will walk you through the various parts of what the serverless landscape looks like: serverless computing is an execution model where the cloud provider is responsible for executing a piece of code by dynamically allocating the resources, and only charging for the amount of time the resources are used to run the code. Because of that, the code is sent to the provider for execution, usually in the form of functions, and hence serverless is sometimes referred to as functions as a service, or FaaS. And that took a lot of breath, so give me a second. Whoo.

That is mostly true, except: serverless is sometimes referred to as FaaS, but that's not what it is. It's so much more than just a function itself. So we're going to take another detour, because that's going to be this entire talk. We're going to talk about cloud native, because any technological conversation needs it to be taken seriously.
It's another nebulous term that has a computer word in it and doesn't actually tell you what it does. So this is the definition from the Cloud Native Computing Foundation, who say, in more words: cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach. That is what they say. Which means it's a bunch of things that run in the cloud, and run in the cloud only. That's the short version of that.

Some of these technologies are more specialized, such as messaging or databases. They're all preconfigured, in a way that's called a managed service, because you don't have to install MySQL on your machine, or you don't have to set up any of the licensing for MongoDB. Managed means provider managed. Now, if the service requires you to detail and provision a virtual machine, and some of the cloud databases are like that, or even a lot of managed Kubernetes services are, it's considered server-full, because you had to pick a machine to do it. If everything is provisioned by the provider, like functions or cloud storage, then it's considered serverless. That is the difference.

Now we're going to go back to what my original point was. If serverless isn't FaaS, then what is FaaS? FaaS is a type of cloud computing service that allows you to execute code in response to events without the complex infrastructure typically associated with building and launching microservices applications. I like this definition because it is relatively concise, and also because it comes from IBM, and I find that hilarious. The human-readable version of this is that it's where our code lives and is executed. It's the compute. It's the brain of our serverless application.
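To make "execute code in response to events" a little more concrete, here is a minimal sketch of what such a function might look like. The entry-point name `main`, the shape of the `event` dictionary, and the response format are all assumptions for illustration; every provider defines its own handler signature.

```python
import json


def main(event: dict) -> dict:
    """Hypothetical entry point: the platform would call this once per event."""
    # Everything the function needs arrives in the event itself; nothing is
    # kept between invocations.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"greeting": f"Hello, {name}!"}),
    }
```

Note that there is no server setup, port binding, or routing in the code. In this model, that part belongs to the provider.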
So when we do talk about serverless servers, this is what we're talking about. And if you're curious about the deep inner workings of FaaS, this is a fascinating talk from Serverlessconf 2018 where Lyndon Nichols talks about not just what FaaS does, but the actual performance implications between the different cloud providers. It is really trippy, and you can see how they all work together and what the actual hardware looks like for the providers that were willing to surrender that information. The other short version that you need to take away from this is that, according to this slide, functions are supposed to be stateless, ephemeral, and fully managed containers.

So we're going to go into what FaaS is. When you have your executable code uploaded to your provider, it creates a containerized image right there. During your first invocation, the provider spins up that initial container, and as you get more invocations, more containers, or replicas, or what have you, are created. As each completes its execution, with the exception of that initial container, each container is destroyed, until all of them are destroyed, and then the initial container is destroyed.

Now, I know it's the industry standard to use shipping containers, and I get it, but that's not my favorite type of container. This is my favorite type of container. If it's not packing my lunch,
I want nothing to do with it. So in our example here, let's say you want lunch, because we just had lunch, and it was as much of a feat as you could imagine, trying to get a bunch of people to pick lunch. You make an order, the restaurant boxes up your meal, and you take a seat. Then all your friends show up. They show up in order, and each one says, "I'll have what they're having," and a new boxed lunch shows up for every single person in the exact same way. As they finish at different times, their boxes are taken away when they are done, and they leave. But you are a good host, and you stay until all of them are done. And when all of them are done and gone, you get yours taken away, and then you take your leave.

That is the very general structure of a function. As far as when and how each of these steps is executed, that differs from provider to provider, because they all have their own priorities as far as optimizations and performance and SLAs.

So this is the response to that hot take, something I've already explained a couple of times now, and that is: the function is your server. It's the thing that runs the code and will likely coordinate your various services based off of an event or a request. But it also means that the provider manages all of it. I said it before: fully managed means provider managed, your updates, your networking, and FaaS is a fully managed service. So why do we bother with fully managed services?
Because we want our provider to do the heavy lifting. They maintain software updates. Not for your dependencies, that's your job, but it is in the provider's best interest to keep all of that underlying infrastructure up to date and optimized and secure enough for them to meet their contract. And speaking of underlying infrastructure, they also manage things like networking and scaling and all of the hard parts, all of the necessary parts outside of application development, which are going to be handled by the provider.

Speaking of that underlying infrastructure, I mentioned how the provider will build your container. So the developer doesn't need to be overly worried about things like orchestration and the actual containerization. Some providers will allow you to upload a custom container, but standard FaaS has the provider building the container and then packaging it for you. It also means that they're allowed to set the actual specifications: things like how much memory you're allowed to use, what your function duration is allowed to be, what your CPU is going to look like, what the actual hardware underneath is. So if you're trying to do something that is extremely performant and very highly customized, it's probably not the best idea for you.

So what are you expected to do? Well, naturally, it's your code. Please test it. Please secure it. That's your job. That is the part of shared responsibility where the provider will do its part, as I said, to maintain its SLA, but it's also your job to make sure your app is good. So you have to manage your dependencies. That is your job. You have to be the one getting those emails from GitHub saying all of your dependencies are vulnerable. It's like, "Well, I just used a library. That's not my fault." That is your job. So beyond that, what do we do? Do we just let the provider do the rest?
That would be great, but that's not true. As I said, we do have our responsibilities, and that is to make a serverless app good. So what do we mean by making it good? We know what to do with your standard monolithic applications, your containerized applications. It's security. It's maintainability. But if we can't see or touch the infrastructure, how do we make anything good?

Well, the answer is simpler to say than it is to actually do properly, and that is: just be a good developer. Good application development means good serverless development, because that's basically what you're doing. You're building an application and just being pushed closer and closer to the infrastructure. Because of that, none of the design principles change. If anything, because you're being pushed closer, it means you actually get more responsibilities as an application developer.

But enough of that. Let's build. Don't worry, there are no live demos in here, so nothing can crash. "Let's build" is more like, let's build in our minds, in our hearts. The good thing about web development in general, and web development good practices, best practices rather, is that it's been around so long that there's an indeterminate number of articles that will tell you how to do it right. They might be conflicting, because what are standards? Who knows. That's not important. We're not going to talk about what those articles mean, because that is a different fight that will take a different hour. We're going to talk about the basic principles of good application development, the questions you need to answer when you are building.

Is it secure? Palo Alto... well, we're going to go deep. Well, not deep. We're going to talk more about serverless security than you generally will in this kind of survey type of talk. But what else do we need? We need it to be well monitored.
You actually need to be able to see what your app is doing. You need it to be reproducible. That means being able to stand it up and take it down without it crashing all the time because you did it twice, or having a request come through twice and not having it do wildly different things. Is it maintainable? That's a principles question. We'll get to it. And is it efficient? That's an opinionated question, and we'll get to that too.

Now let's talk about security, because we always have to talk about security first, or security will knock down the door and we'll all have to reset our passwords or something. When you're dealing with securing serverless, Palo Alto Networks, as I was starting to say, had a really good blog post on the types of things you should look out for, especially when dealing specifically with event-based infrastructure. The base principle is: know your serverless attack vectors. For event-driven infrastructures, it's going to be event data injection. Things like storage and database events: when there's a change there, you don't want
You don't want Either impermissible or unpredictable behaviors coming out of your function Stream processing events because it's like those previous problems except then it's happening all the time and you can't stop HTTP API calls that's a lot of letters, too If you've ever had to build an API in the past 20 years This is something that this is usually your first go-to for Security hardening because this is where the port scanners will find your end points and then attack it And then you have to read a bunch of reports about it and go fix it and Finally code changes which this is especially important when you're dealing with distributed teams open-source projects just because You are trying to encourage people to make changes to your application But that said not all in code changes are secure not all of them are even done in good faith So you have to make sure that you investigate what your code changes does in order to prevent anything bad from happening So Excuse me again, I'm sorry This is how you protect those serverless applications first first step is access and permissions This is making sure that all of your resources that your functions touch don't actually aren't actually reachable from the outside So that if your function goes down Your resources are Essentially secured from the outside world The best security you can do is to make sure everything's off and nothing conducted. It's that kind of principle You want to make sure that only permitted users are able to make code changes you want to be sure that All of your calls are authenticated and anything that doesn't fall into these very type ravers are dealt with properly so that there's no Edge case that someone can use to exploit your function Also, just like all forms of web apps vulnerability scanning on your code base on your ports on all of these things This is just to make sure you do your due diligence. 
You probably meant to do things well. I've built things well. I've built things as secure as they would let me, and there would still be things that pop up because I was using something that was insecure from a dependency that it was using, and nothing that I did. That's just a thing that happens. And runtime protection: things like data sanitization, to prevent the injection of bad events causing malformed reactions from your function. You don't want any of that to get into your data, or, if you're dealing with a section that's in the middle part of a pipeline, you do not want to return bad data and ruin someone else's day. So, just a good life tip overall: security is always hard and it's always the most important. I wish they paid me to say that.

Now we're going to talk about monitoring, which is just as fun as security. I love talking about monitoring, but that's just me. I've built a lot of things that have gone down, and it's always good to know exactly when they went down. So here are your options as far as serverless monitoring. You'll always have your native serverless logging and metrics: things like your CPU, memory usage, storage usage, and sometimes execution duration. If you deal with a timeout issue, it actually may not show up in your logs, because the function didn't close properly, so it didn't write. Haha, that's tricky.

These are the base metrics that you generally will get for free from any provider. If you are writing custom metrics and they are just derivatives of these, things like "I want to know the CPU usage, but I want to measure it over time," you can do that math on your own time. You don't do it in your custom metrics, because if your math is bad anywhere, it becomes harder and harder to find the further it is away from you. So only take your base metrics. Don't make derivative metrics. Do derivative math during your analytics process.
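As a rough sketch of that split, the function below only records the base reading, and the averaging happens later in a separate analytics step. The in-memory `log` list and the field names are hypothetical stand-ins for whatever logging or metrics sink you actually use.

```python
from statistics import mean


def record_base_metric(log: list, invocation_id: str, cpu_percent: float) -> None:
    """Inside the function: write down the raw reading and nothing else."""
    log.append({"invocation": invocation_id, "cpu_percent": cpu_percent})


def average_cpu(log: list) -> float:
    """Later, in analytics: derive the number from the raw readings."""
    return mean(entry["cpu_percent"] for entry in log)
```

If the averaging turns out to be wrong, the raw readings are still intact and the math can be redone without redeploying the function.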
Do not do it in this step, please.

Then there's, of course, application and runtime logging. That will give you a fair idea of what your invocations look like and what your requests look like. Maybe you have your own logging solution and you would rather push your errors there, so that you can run things like analytics over what these requests that are causing all of these errors look like. You can do that. And sometimes the logging that comes with your cloud provider is not structured the way you want it to be. That is often going to happen. But it's there in case you need to do something quick and dirty and try to figure out what went wrong.

Once you exhaust all of these options, you'll get into your third-party monitoring options. If you're working for a company of size or a project of size, chances are your company already bought something, and chances are either your security team or your DevOps team already has access to this dashboard. Just send them all of your logs, because they'll ask for them anyway, and this way it's not on your budget to pay for it.

There are a bunch of unified monitoring platforms: Datadog, Epsagon, Dynatrace, New Relic. That was just what came to mind as I was typing this, but there are a ton of them. They collect your native logging and metrics, and then they provide insights and dashboards on all that collected stuff. And they'll do all that extra math afterwards, so you don't also pay for custom metrics. The problem is it is quite expensive. Datadog especially is quite expensive. But it's worth it if you have a very distributed sort of architecture. If you really need those insights, that's an option for you, so you're not essentially rebuilding the same product. Because, and I think I saw this on Twitter earlier this week, if it's their one job to do this, nothing you build, when you have three other jobs to take care of, is going to be as good as this one product. Okay, so that's just worth
looking into.

Now we're going to talk about reproducible actions. We're going to start getting into the more opinionated portions of serverless as we go on. As I said, it's often stateless, which means it is not aware of what's going on in the system around it. So theoretically, any request that you give it should provide the same output for the same input. If it doesn't, then you are relying on a state that isn't there, or that you can't control or predict. So don't do that.

Also, don't overload the function. AWS allows you to load Java as one of the runtimes, and that takes like half the runtime space, because Java is a beast. If you use something like that, it is very easy to go, "Well, I'll just push as much as I can into this container." But it still needs to run. You're still going to use some amount of onboard memory in order to be able to run your process. Don't overstuff it, because tracking stuff down is going to be very difficult. It's meant to be lightweight and fast and small. That is the point of a function.

And also, don't rely on ordering. I talked about these unwritten states. Ordering is usually the biggest violator of this unwritten state issue. Basically, you are assuming that other functions will be done before you get to it. Not necessarily true. If you're going to do something like that, make sure that you're managing or tracking your state somewhere, or at least validating that the state you're expecting is there. It's like, "Oh, did this complete? Then I'll go." And make sure that you have a mechanism that's able to handle the delay of waiting for other things to finish.

And finally, close the connections opened by your functions.
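Closing what you open can be sketched as follows. The standard-library `sqlite3` module is only a stand-in for whatever managed database you actually connect to, and the `orders` table and `count_orders` name are made up for the example.

```python
import sqlite3
from contextlib import closing


def count_orders(db_path: str) -> int:
    """Open a connection, use it, and always close it before returning."""
    # closing() guarantees conn.close() runs even if a query raises.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY)")
        (count,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        return count
```

The same pattern applies to any client that holds a socket open: tie its lifetime to a `with` block or a `try`/`finally`, so a failed invocation does not leave a connection dangling.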
This is something I ran into recently, because your container has its own configuration. Sure, the database you're connecting to may be, maybe it's a lazy database. Maybe it's only being run when it's actively being connected to. Maybe it's not. You don't know. Especially if it's managed, it could have something that is not reporting that's preventing you. Or that could be running fine, and it's just your container going, "I don't want that many connections going out, so I'm going to stop you now." And then you can't reach out from your container and you don't know why. So make sure you close all your connections, especially for databases.

Is it maintainable? Let's be honest. Is all of your code on your laptop right now because you're still trying to figure stuff out? Even if you're just getting started, use version control. Use some kind of repo, even if you end up like me and you have 25 empty repos. It's better than losing track of everything because you needed a hard wipe for one reason or another. Also, a lot of platforms just require you to have a repo now anyway, or a container, just so that the platform is not the one in charge of tracking your changes.

And also, use infrastructure as code. You can use a third-party service like Terraform or Serverless or Pulumi, or you can use a native deployment language like DigitalOcean's app spec or AWS SAM. Just something that codifies what your application structure actually looks like, so that thing can be saved. If you're doing everything from the console, you will lose things, I guarantee, within the next two weeks, because you will need to crash and rebuild, and you won't know what you're rebuilding from anymore.

And make sure you build for any number of developers, even if you're doing this by yourself. We all have side projects, but you're not going to be the same person in two months.
I'm never the same person in two months. We've all had to put stuff down, right? Life happens, work happens, and when you come back to it, it's like an alien wrote this thing. You don't know what they were thinking and you don't know what they were doing. So do future you a favor, and if you're doing this as an open source project, do any contributor a favor: please write stuff down. And I'm not just talking about functions, even though you can use something like OpenAPI to record your expected inputs and outputs and that sort of thing, and if you can, use that so that it builds automatically.

But also, have a project plan, please. This is just me, in general, talking to me later when I watch the recording of this presentation. Please make a project plan, so you know what it is you're building and where you are going with this design. And if you had an idea, like, "I'm going to make sure that this is built this way and this is going to be monitored this way," you don't forget it. This is just you being a good maintainer. This is you being the maintainer you want to see in the world. Write stuff down. Write down as much as possible.

Now this is going to be the most opinionated portion, because now we're going to talk about efficiencies, and a lot of efficiencies are taken at the cost of readability and maintainability. I personally err on the side of readability, because even if it's harder to maintain, if another person can read it, that's one less conversation
I have to have, explaining what this function is doing or why there are five variables that seem like they have the same name. This is basically a product decision that you as a person, or your team, has to make on your own. But there are basic precautions that you can take to measure the efficiency of your code base.

Again, let's talk about dependencies. If you're in the prototyping phase and you're just adding dependencies to see what works best, make sure you remove everything that isn't being used by the end product. This reduces your surface area for security vulnerabilities and reduces the number of libraries that you have to maintain. And as you keep adding to it and you add more functionality to your functions, make sure you monitor what that size looks like. If you get too close to the container size limit, make sure you refactor it so that you don't run into these limitations. The same thing with runtimes: if you find that more data is being pushed to your function and because of that it's just running longer, make sure you fix that before you end up with timeout issues.

So supposedly, by now, you have designed an application. Woo-hoo. And hopefully that application is alive. Hopefully it's working. Hopefully stuff is making it from end to end. So congratulations, everybody. We imagined an application.

So that's fine. What do we do now, if it's alive? Can we just pack up and go home? I know I want to. It's great, it's fine, but what we actually end up doing is entering the maintenance cycle of an application. Because remember, all good app dev is good serverless dev. So now we're going to address this as if it was your average software stand-up. We're going to answer what went well, what went poorly, and what can be improved. And if you got out of stand-ups to go to this,
I apologize to you specifically Let's talk talk about what went well the things you want to look at when you're talking about your serverless Application is maintaining your application health because if it went well You want to make sure you keep doing that and continuous testing to make sure that it's actually going well and It's not just you making assumption because you didn't get an email saying your your stack was down Well one poorly Bugs Bugs always happen. Everyone's got them. So we fix them But there's something more important. That's capital B bad. What do you do when something is capital B bad? You are CAs and if you This goes back to my point about Did you really build anything if it hasn't broken? If it did anything really break if you didn't have to write a five page report on how I broke So that is that's a thing you got to do I think everyone was like by the time you hit senior engineers bring at least two RCA's I'm just gonna make that assumption. So I don't feel bad and Finally what can be improved? Everything always continuous improvement We want new features. 
We want to be able to give users new things to do. We want to make all the things they used to be able to do more efficient. And we also want to identify anything we forgot to give users in the first place and fix that. We're also going to talk about repository workflows, things like Git flow and GitOps and trunk-based flow and all that stuff, because it does something weird when you're dealing with a bunch of services. So I'm not going to talk about every single one of those individually.

But I am going to talk about serverless health, because serverless health is important. End-to-end monitoring: this is part of continuous testing. Just make sure all of your requests are moving from point A to point B and back again. Sometimes they don't, and when they don't, that is going to be your first flag that you have a problem. But also, if that happens, your users are going to be the ones who tell you that something bad happened, and that's not what you want. Because serverless is a bunch of managed services in a trench coat, this becomes specifically important. You have to make sure not just that all of your code works and all the managed services are up, but that all those integrations are still working, because sometimes they'll change. You want to make sure that your application is available and consistent, because if things do change, it makes everything go bad.

You'll also want unit testing on platform managed services. Because even though I've said about 20 times already that managed service means platform managed: the cloud doesn't go down ever, does it? Not once in the past five days has the cloud gone down. Yeah. That's what we have to keep telling ourselves. It's not true. Even though they promise, check. Always check that your service is up. All providers are notoriously slow on their status pages. They are, because they're dealing with incident response.
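A check of your own, run independently of any status page, could be sketched like this. The probe functions are hypothetical placeholders: in practice each one would make a small real request to a managed service your application depends on.

```python
from typing import Callable, Dict


def check_services(probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every probe and report which dependencies answered correctly."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            # A probe that blows up counts as "down", not as a crash of the check.
            results[name] = False
    return results
```

Run on a schedule, something like this tells you that an integration stopped working before a user or a status page does.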
They need to figure out what's wrong. They don't want to update the wrong part of the status page and freak a bunch of people out. So they're going to be slow. You, however, will notice right away if your stuff is down, and you need to know why. And if it's something that you have no control over, that is when you get into that ticket queue system and say, "Okay, I need you to either fix this or refund me something, because oh my god, I'm having problems."

And finally, build runbooks for your support teams. Again, even if that support team is you. Even if you're a startup of one person, write a runbook. A runbook is less about having to do tech support and more about being able to routinely understand how your application works. You don't want it to go down, and you want to be able to handle it consistently.

So we're going to go into the meaty part, which is root cause analysis. This definition will be in my slide deck. It is from the State of Washington's Department of Enterprise Services, which I did not know was a thing. The basic idea is: please research the problem before trying to fix it, so you actually know what's happening. For what most people have to deal with, it's you writing a five-page report after reading a bunch of logs and talking to a bunch of people who were on call, and knowing where that code base was specifically at the point where the incident happened. You don't want to be researching something if your code base is already out of sync, because their first follow-up question will be, "Why wasn't this fix already pushed?"
Lightstep has a product that actually does automated serverless analysis, and it is sick.

Finally, repository workflows. When you rebuild a serverless stack, it will sometimes rebuild the entire thing. That doesn't seem like a big deal, except it also means all of the things that you rebuilt are unavailable during the time it takes for it to update itself. Most of the time it will only do the one part. But if it senses any other change in more than one part, it may just redo the whole stack, and you have to plan for that. Also, doing rollbacks on serverless is kind of hard. You do have to have a plan. A lot of people build specific tooling just to be able to handle this case. It is a pain. And again, make sure it can be handled by multiple people, because it's going to be handled by a team that isn't you. If you're the only person who's able to log in and change it, you're going to get a phone call, on an actual phone.

Now it's time to grow. What does life look like after serverless? Do you have too many functions? Are all of your functions just scattered on the floor, loosely tied together with message queues, hopes, and dreams? Don't do that. Maybe you start exceeding your container limits, which is super common, and maybe it just seems like a lot. Move to containers: containerize some services, or use containers as a service. Basically, it's like a super general form of functions as a service, except you don't have to deal with things like the limitations that the platform puts on you, because you're now expected to maintain your own container. Because of that, you get a little more wiggle room on what those limitations actually look like. They both focus on code, with a little bit more infrastructure management on the containers as a service side, but you also end up with more control and options. They are extremely similar. It's basically functions as a service, if they don't do it for you.

So, can you use serverless forever?
Yes, you can, I say with a heavy asterisk. You'll need a state manager, because if you're reaching that point, you probably do need to start managing state. You need to create service groupings for your more complex microservice logic. And you want to optimize your request processing by analyzing your application behavior.

So, all of that. I'm now going to talk super fast, because we're now on the speed run where I summarize what I talked about for half an hour. We talked about what FaaS is and how it works. We talked about good development principles, including security, monitoring, reproducibility, maintainability, and efficiency. We talked about how to maintain serverless applications, including serverless health, root cause analysis, and repository workflows. And we talked about growth strategies for serverless, including maturing your serverless platform or moving straight out into containers as a service.

Whoo, I have one minute left. Any questions? It's fine. If you are interested in any of this, the slides are already up on the summit page for this talk. You'll find all the links that I talked about at the bottom. Otherwise, the links will also be on my Twitter, that is twitter.com/nerdypaws, N-E-R-D-Y-P-A-W-S. I don't know, give me half an hour and I can have it up then. Thank you.