Okay, so I guess this is both a cool thing and a bad thing. The cool thing is that since I'm part of this talk, I get to talk. A couple of things to mention. Number one: since I'm part of this talk, you should be questioning, well, why is he running the track and also speaking in it? It turns out that my colleague Nima and I submitted a talk, and of course when it came to the review I said, well, I submitted it, so I have to stay out of it. Then it turned out that Matt ranked it high. When I saw that I thought, well, this is going to look bad because I'm part of the selection. But he kept on insisting, and I said, okay, fine, that's cool, because maybe people want to hear about serverless. So that's what we'll talk about.

The other thing I want to mention is, I don't know about a lot of you, but it's always good and bad when you feel old. Why am I mentioning this? First, Nima here: I used to be at IBM Research, and he was a student doing his PhD who worked with me. That makes me feel old. Second, there used to be a time when, if I worked with somebody, I led the project and did everything. Well, I'm feeling very old, because he pretty much did all the work; I'm taking some credit, but he really did the work. So let me introduce the talk, and we can commiserate later about being old.

What are we talking about? Serverless. Why is this cool, and why did I participate in this? Because part of my job is to try to find out what's cool, but also how we can improve the platform we love, Cloud Foundry, which we're spending a lot of time working on, and serverless was the obvious thing. So the first question is: what is serverless computing?
That's the first question you should ask. A lot of people look at it as functional programming. Sure, but there are tons of functional programming languages you can use, so why serverless? Well, functional programming on the web is another way people look at it: you're invoking remote functions. But then somebody at Pivotal, I'll mention his name, maybe not... Dimitri, told me, well, that's just CGI. And I said, no, it's not CGI, and he kept trying to convince us it is. I don't know, I think it's different, but it certainly looks like CGI, except lightweight, stateless, and so on.

The obvious questions that come up when you ask, okay, should I start using serverless, are: what's the advantage? Why is it different from PaaS? What's so cool about it? The first thing that comes up, especially when I talk to other people at IBM, where I've been pushing serverless for a couple of years, is that it's going to save you money. But then the question is: is it going to save money for you, who are maybe potential customers of ours, or competitors of ours, like some of our colleagues here from Pivotal who run Cloud Foundry like we do? What is it going to do for you? Those are the kinds of questions. The other thing is: is there a way to compare what I'm doing now versus what I was doing before? And of course the obvious question is: couldn't you just cf push the app and be done with it? Why do you have to create another complexity, another abstraction? What's the advantage?
What we try to do in this talk is help answer that question in the context of Cloud Foundry. Specifically, I want to outline what this is not, and then Nima is going to go into the details of what it is, including some data and results. First, we're not proposing CF Serverless; this is not a proposal. I understand the whole proposal process, I run it, but that's not what we're trying to do here. We're trying to experiment. And we clearly cannot explore all the possibilities of serverless and Cloud Foundry, simply because we don't have enough time: Nima works on Diego, I move around and spend some time on BOSH, so we have day jobs. We tried to do something that was sort of a 20% project.

Also, even though we work for IBM, we're not trying to favor one platform or another. As a matter of fact, the last time we tried to give this talk, some people at IBM told us, no, you can't do it. I said, what? No, we're open source. They said, no, you can't talk about it. Okay, but I will, so we're here to talk about it anyway. Part of it is that we're not trying to say you should go adopt ours; we're trying to give you a way to make your own decisions. And note that we look at this as a set of test suites, and it's not a complete test suite. The final thing I want to do before I pass it to Nima is to briefly mention what we tried to do in terms of the experiments, and the platforms we targeted. We defined experiments, ran the experiments, and shared the results, then rinse and repeat. And these are the platforms we targeted.
Azure now has Functions. OpenWhisk is IBM's solution for this. There is Iron.io, which has an open source version; OpenWhisk is open source, while Azure, of course, is not. And then, of course, many of you probably know about AWS Lambda, one of the pioneering efforts around serverless, released a while back. With that, let me switch to Nima. I'm going to sit down, and at the end we'll have some time for questions, so I can come back. Thank you.

All right, I think I have one. Yeah. Thank you very much, Max. Two disclaimers: I don't think he's much older than me, and he also contributed significantly to this work, so I've got to credit him for that. As Max mentioned, the main reason we did this experiment was to understand serverless platforms, especially because when we started this work, in fall 2016, serverless platforms were just emerging. The space is more mature now, but at that point there was very little published about a lot of these platforms. So what we wanted was a way to understand how these serverless platforms work, and, if you want to choose one serverless platform over another, what factors you need to consider.
To do that, the first step we took was to decide that maybe it's better for us to simulate something that resembles the behavior of a serverless system, to give us enough understanding to then define experiments to run against the real serverless platforms and see how they behave. And because we're all from Cloud Foundry, we thought: Cloud Foundry is a platform that runs applications, and serverless systems essentially run functions that you load into containers, so it's more or less the same. Why not build something on top of Cloud Foundry that we can experiment with and use to understand the system? That's how CF Serverless emerged. We worked with Dimitri on the BOSH team, and we basically started coding something that emulated the behavior of a serverless system. The way it worked was that it was a CF application that you would push to a Cloud Foundry deployment, and that CF application would manage other applications on the platform. If you had functions, you would define them as typical CF applications, and then you would make a call to this manager application, which would turn your other CF applications on or off as requests came in. Basically, it would turn the application on, respond to the request through the application it launched, and then keep the container for that application around for a certain amount of time, say 30 seconds. If no new request arrived for that application, it would shut the container down.

CF Serverless is interesting for two main reasons. First, it raises the level of abstraction significantly. With PaaS we went one level above IaaS, in the sense that we didn't have to care about infrastructure; with containers we went one level above that, to applications.
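To make the manager's lifecycle concrete, here is a rough sketch of that idle-timeout logic. This is our illustration, not the actual CF Serverless code: the class and method names are made up, and the real prototype drove the CF API instead of printing messages.

```python
import threading

# Hypothetical sketch of the manager's idle-timeout behavior: on each request
# we "start" the function's container if needed, and a timer shuts it down
# after the idle window (30 seconds in the CF Serverless prototype).
class FunctionManager:
    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self.running = {}        # app name -> pending shutdown timer
        self.lock = threading.Lock()

    def _start_container(self, app):
        # In the real prototype this would call the CF API to start the app;
        # buildpack and droplet downloads happen here, hence the ~7 s cold start.
        print(f"starting container for {app}")

    def _stop_container(self, app):
        # Fires when the idle window elapses with no new requests.
        with self.lock:
            self.running.pop(app, None)
        print(f"stopping idle container for {app}")

    def handle_request(self, app):
        with self.lock:
            timer = self.running.get(app)
            if timer is None:
                self._start_container(app)   # cold start
            else:
                timer.cancel()               # warm: reset the idle clock
            t = threading.Timer(self.idle_timeout,
                                self._stop_container, args=(app,))
            t.daemon = True
            t.start()
            self.running[app] = t

    def is_running(self, app):
        with self.lock:
            return app in self.running
```

With the prototype's 30-second window, a steady trickle of requests keeps the container warm, while any gap longer than the window forces the next caller to pay the cold-start cost again.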
With serverless, we don't have to care about anything at the level of managing and running operations: essentially, all you need to care about is the function you run. So it's very high in the level of abstraction, and it significantly saves on the engineering effort you need to put in place to manage the application. Second, it's presumably cheaper, because you're charged only for the amount of time your function runs. If your function runs only occasionally, you're going to be charged a lot less than if you had an application up and running the entire time, paying for all the resources it sits on.

With all this information, we built CF Serverless and realized that a couple of things are very important, so we started defining metrics. First, it's important to understand what the throughput of serverless applications is. It's important to understand how they behave with memory-intensive or CPU-intensive functions. And it's also important to understand how these serverless platforms manage containers, because we realized container management is a significant part of operating a serverless system. So we defined a set of functions to address these requirements. We defined an echo function, which was basically a hello-world function: you launch it, it sends the hello world back, and we measure the round-trip time. We had a memory-intensive matrix multiplication function, which multiplies two 300-by-300 matrices and returns the result; that's a typical memory-intensive problem. For CPU-intensive functions, we had a function that finds the prime numbers below one thousand, which, again, is
a known problem. We also had a curler function, which would keep curling an endpoint from inside its container, so that we could tell whether the container stays around or gets killed: once the curler stopped hitting the endpoint we expected it to curl, we knew the container had been killed.

Full disclosure: the data I'm going to talk about is based on the results we collected during fall 2016, when we ran these experiments, so it may be that results have changed since. For the experiments, we deployed the functions to all the platforms: AWS Lambda, Azure Functions, Iron.io, our own CF Serverless instance, and OpenWhisk, which is IBM's offering. We gave all of them 512 megabytes of memory. We set all of them up in US East, except for OpenWhisk, which I think is primarily hosted in Dallas. We launched 100 ramp-up requests to all the endpoints to warm up the platforms, then ran another 100 requests to collect data, and we did this in three rounds of execution. And yes, those were the environments I mentioned.

So let's have a look at some of the results. We intentionally do not put these results side by side, because we don't want to draw any comparative conclusions; if you want to draw comparative conclusions, go do it yourself. The purpose here is just to give you some idea of how the platforms behave. The workloads we launched at these endpoints ran in two modes, sequential or parallel, and that's how we measured throughput. The idea with the sequential workload was that you hit the endpoint once, wait a certain amount of time, then hit it again, and we collect information about how much time it takes for each request to come back with a result. So, as you can see, these are the results
for CF Serverless. Essentially, the first request you send goes to the managing application we wrote, and CF Serverless would go create a container, load the code for the function into that container, create the endpoint, and then respond to the request. That's why, if you look at the graph, the first request usually takes around seven seconds. But once the container is up and running, the follow-up requests take a lot less, because the application is ready and can respond quickly. If you then wait for more than 30 seconds, in the case of CF Serverless, we kill the container, so your next request again takes the full time for everything to come up. Also, if you increase the load enough that you need another instance of the application to respond to your requests, you see a spike, which corresponds to creating a second container to serve them.

Here is an example of a high-throughput function, where we ran the same echo, but this time sending all the requests at the same time. These results are from Azure. One thing we noticed is that as we sent more requests to the endpoint, the response time got slower and slower: the requests would queue up, and it would take longer and longer for each to come back with a result. But then there was suddenly a significant drop in response time. That's when the serverless platform, in this case Azure, realizes, hey, I'm probably hitting a threshold, and it's better to launch another container to respond. Once that second container is in effect, the response time suddenly drops, but then it starts ramping up again as more requests get queued up.
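For illustration, the workload functions and the two load modes described earlier might look roughly like the following in Python. These are our sketches, not the code we actually deployed: each platform had its own function format, so the names, signatures, and defaults here are assumptions (the curler probe is omitted).

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# --- Sketches of the benchmark workloads ---

def echo(payload="hello world"):
    """Baseline: return the input unchanged, to measure round-trip time."""
    return payload

def matmul(n=300):
    """Memory-intensive: multiply two n x n matrices (pure Python for clarity)."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def primes_below(limit=1000):
    """CPU-intensive: all primes below `limit`, via a simple sieve."""
    sieve = [True] * limit
    sieve[0], sieve[1] = False, False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

# --- Sketches of the two load modes ---

def invoke(url):
    """Hit a function endpoint once; return the round-trip time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

def sequential(url, n=100, delay=1.0):
    """Sequential mode: one request, wait `delay` seconds, repeat."""
    times = []
    for _ in range(n):
        times.append(invoke(url))
        time.sleep(delay)
    return times

def parallel(url, n=100):
    """Parallel mode: fire all n requests at once."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(invoke, [url] * n))
```

`sequential(url)` reproduces the one-request-wait-repeat pattern behind the flat-line graphs, while `parallel(url)` fires everything at once, which is the mode that exposed the queueing and scale-out behavior just described.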
Another thing we looked at was container management. One thing we realized with CF Serverless is that creating the container is probably the most expensive part of the whole serverless ecosystem, because that's where you actually put all the bits and pieces together and get an application up and running. So we wanted to see how other platforms behave when it comes to creating containers. For CF Serverless, when you send a request, it creates a container and loads the code into it, and if the container is idle for a set amount of time, it gets killed; that's 30 seconds in our case. For OpenWhisk, AWS Lambda, and Azure, we realized they actually keep the container around somehow, because requests would come back a lot faster, on the order of hundreds of milliseconds, which is very different from the roughly seven seconds we observed with CF Serverless. Now, some of this is because in Cloud Foundry we do some crazy magic when launching applications, which involves downloading the buildpack, downloading the droplet, and so on, and all of that is quite expensive. But for some of these platforms, like Azure, we also noticed that when they start a container, even a plain, simple container, it can take two to three seconds for the container to become available. For Lambda, that was a lot shorter, sometimes on the order of 80 milliseconds. One of our assumptions was that maybe they keep some sort of cache of containers: keep one instance of the container around and freeze it, so that it doesn't use many resources but becomes available almost instantly once you need it. One thing that is important is that we treated all of these platforms as black boxes,
because there is not that much information about how exactly they manage containers, so some of this is based on our observations and the data we collected, and some of it on assumptions we made. Here I'm showing the results of a memory-intensive sequential function execution, the 300-by-300 matrix multiplication. You see the typical behavior you'd expect from a serverless platform: all the requests come back in around the same time, so it's more or less a flat line. In this case it's sequential, so requests go one after another, but the amount of time the function takes to execute is more or less the same each time. What's interesting is that once you make something resource-intensive, like matrix multiplication, parallel, it can significantly degrade the performance of the system. This example is from Azure: the first few requests take some time and then return, but all of a sudden the platform would crash, everything would stop, and we wouldn't receive any results. Then another container would presumably be launched, respond to a few of the requests, and then go down again. So for something heavily memory-intensive, we noticed that Azure was not able to cope and scale well when a good number of requests came in. We also ran the CPU-intensive and other resource-intensive cases; this result is from CF Serverless, run in sequential fashion. You can see the same pattern we saw for echo: the first request takes a lot longer because it involves creating the container, and the follow-up requests are roughly flat. For the parallel case, we ran it on OpenWhisk, and we saw that OpenWhisk, and also Lambda, I believe, did a much better job than Azure at handling parallel requests for resource-intensive computation. You see a slight increase in
response time, but basically every single request came back successfully, and the slope is not that steep; the increase in response time wasn't significant, which was quite interesting. It basically means those platforms were better able to identify that there was heavier load on the platform, and better at creating new containers and providing more resources as the number of requests increased.

On to some of the observations we made in these experiments. One of the interesting ones was timeouts. These platforms generally state that the default timeout for a lot of the functions is 30 seconds, and that you can change it, up to five minutes. The interesting thing is that those longer timeouts only apply if you invoke the functions from within the cloud environment you operate in. If you're using Lambda, for example, you can increase the timeout up to five minutes, but then you have to call those functions from within the VPC. If you call the functions from outside the VPC, from outside AWS, the timeout is always 30 seconds, and it kills your function if it takes longer than that. That's one of the things you may need to take into consideration if you want to use serverless functions from outside. Another thing we noticed is that it's important to understand how your cloud platform scales when you launch a lot of parallel requests at it: as in the Azure case, when the computation is resource-intensive, the platform may simply give up, so it's important to understand how the platform responds. For the same reason, you need to plan for unforeseen load, because if the load suddenly increases, the underlying serverless platform you're using may perform in ways you don't
expect. The other thing we noticed concerns the general assumption that more money always means better performance. It doesn't necessarily turn out that way with serverless platforms. It very much depends on the architecture of the platform, the way it manages functions, the way it manages containers. Even if you end up spending a lot more money on a given serverless platform, if the platform doesn't do a good job managing containers, it doesn't really matter: you're still going to notice disruptions in your service. There is a blog post we have written highlighting some of the lessons we learned running these experiments; the link is here, and if you're interested, you can definitely go check it out.

The last thing I want to mention before wrapping up this talk is where we want to go from here. The main reason we did this was to understand serverless platforms, so we started thinking about something like a "SPEC serverless": a bunch of tools that we would make available open source, so that you can run them against these different serverless platforms, collect data similar to what we've reported here, and then decide for yourself whether a given serverless platform is the right choice for you. What we've done so far is put the proposal out there; we've defined the objectives, the metrics, the workloads, and the tests, and we've drafted something that we're discussing internally within IBM, and also with Pivotal, in order to decide at what point we want to make it publicly available, both the tool and the benchmark definitions we've prepared and worked on. I think that's it, and thank you very much. We have time for questions, about five minutes.
All right, cool. Just mention your name and affiliation.

Hi, I'm Sabao from Pivotal. Is the Diego task API not sufficient to run something like serverless functions? You can define a function, and Diego basically runs LRPs along with tasks, so there is a task API. Why can't we use that, and basically keep a pool of container instances that can be run immediately with a specific task definition you pass, like Python or Go or Ruby or Bash?

You mean in Cloud Foundry, right? Okay, so there are a couple of interesting things about tasks. The first is that tasks do not get routes, so they do not become available externally. You can't have an endpoint for your task such that you hit it and, every time you hit it, it does something for you; tasks are more or less background jobs. That's one of the main reasons. The other thing is that tasks suffer from the same problems I mentioned for CF Serverless: every single time you run a task, you need to create a container, download the buildpack, and download the droplet, and that's a very expensive thing to do. Oftentimes your task is going to end up taking seven seconds before it's actually ready to run your thing. Exactly, so that's one of the other things we were considering; it's definitely possible, but it's a lot more expensive. Sure, you could do that, but we didn't do that part. Our implementation is a lot closer to serverless behavior, because it actually brings up the application and then manages the application's life cycle, similar to tasks, but benefiting more from the fundamentals of Cloud Foundry in that sense.

It seems like the largest time-consuming process, as you mentioned, is starting up that container and all the infrastructure effort that has to go along with that.
Did you consider the idea of simply having an existing generic Python container that was always running, and then loading those functions as modules on demand, so you weren't spending that container startup time every go?

That's a very good question. Yes, so essentially there are a couple of issues with having one container and loading code into it, and one of the primary ones is security. Containers provide a certain level of isolation, and you want your processes to be isolated from one another: you don't want one process to go and ruin another process's data, or ruin the state of that process, for whatever reason. So you want the isolation you get from containers when it comes to serverless, and that's one of the primary reasons you need separate containers for separate functions. But yes, if you want to forget about the security and all the good things that containers provide, that's potentially a solution.

You had a question? All right. Thank you very much. Appreciate it.