All right, all right, guys, thank you for being here at this early hour on the third day of the conference, after the party. This is not easy. But you don't want to oversleep serverless. This is a big trend, too big to oversleep. And for those who are still sleeping: some of us were sleeping when Docker was happening, when containers were happening. Some are still waking up to virtualization, we know this. And some are in total hibernation. Our industry is fascinating because the leading edge is moving so fast, and the gap between what's happening in technology and the majority keeps growing. Fascinating times. Anyway, that's not part of the talk. Let's get to the talk.

My name is Dmitri Zimine. I'm one of the creators of StackStorm, and I've been around the IT automation world for quite a while. Today I'll be talking about serverless, and how this is supposed to work. We'll cover what serverless is: the definition, the frameworks out there. We'll spend most of the time actually looking at things; for those of you in the back, I'll be typing in a terminal, so to be sure you can see it, consider changing seats right now, up to you. And then we'll talk about the path forward for OpenStack, now that serverless is becoming a reality.

What is serverless? Well, serverless is a meaningful term that makes absolutely no sense, just like many other terms: DevOps, Agile, Cloud. And that makes it perfect as the term behind a big trend, so that we can have all the conversations about how serverless your serverless is, and just like we have blog posts with yet another twelve definitions of DevOps, we can do all of that again with serverless.

Some of us think serverless is Lambda. And indeed, it started with Lambda; it started with Amazon producing this masterpiece. Amazon would like us to keep thinking that serverless is Lambda, but we know by now that it is more than that. Some of us try to bring a better definition: functions as a service. This turns out to be a precise term with a good meaning, and as such it cannot capture the whole trend behind serverless. So it has its place, but serverless remains the name of the trend. For the authoritative definition of serverless, check out Mike Roberts' write-up on martinfowler.com. It's a read longer than most of our attention spans, but I highly recommend it.

A week ago we were at the serverless conference in Austin, where every talk started with "what is serverless," and it looks like we're converging now on three main bullet points. Serverless is event-driven: it changes things from the programming model where everything spins around the database to an architecture where the event stream is at the center, and that is a pretty profound change. Serverless is functions: not services, not things we want running all the time, but things that are done when they're done. And it's infinite scale. Like, infinite. Of course there are limitations, and if you've played with Lambda you know that to scale beyond certain limits you'll need to call Amazon and ask them to add capacity. But that's not our problem. Scale is no longer a developer problem; we don't have to think about scale in development, we don't need to learn how to develop for scale.
Right now, scale is not my problem; it's the provider's. So: event-driven, functions, scale, and, most importantly, a different payment model. We pay per use. Does that make serverless good for everything? Not quite. It is good for particular workloads. It's good for very spiky, unpredictable workloads where capacity planning is difficult; and if anyone has looked deep into auto-scaling, you know there is actually interesting science in doing auto-scaling well, it's pretty hard. It's good for occasional workloads: there are tasks that aren't worth keeping a server up for, and Lambda is used for a variety of tasks that only happen once in a while. And most importantly, serverless is about saving money. The pay-per-use cost model allows for very large savings, not for everything, but for particular types of load.

As for use cases: people are presenting them now, and it's obvious that serverless is coming out of its infancy. People are talking about different types of data processing: scientific computations, image processing, marketing data processing, processing data on streams. Serverless is backing web and mobile applications with thick clients, doing things like rendering the React.js stuff on the server side. Again, it proves very economical for a particular set of use cases, and there is a variety of other applications. What is interesting is that, depending on how you look at it, data processing looks very heavy by invocation count, but in terms of how many functions are out there, backing web applications is a pretty big use case.

What is happening now is that we've left the time of toy applications for serverless and entered the time when serious companies are building serious applications on this technology. Tim Wagner, who as you know is determined to destroy all the hardware outside of AWS data centers (he was blending a piece of hardware this time around), presented three such use cases once more. The point is that serverless is now used in heavy production, and these companies have reported substantial savings from it.

An interesting example comes from Nordstrom. They built a canonical application, Hello, Retail!, which they use internally to bring developers up to speed on the technology stack. It's a proof-of-concept application you can check out on GitHub and see how it works. The observation there is that an end-to-end serverless application is by far not just Lambda; it is a whole stack. There's DynamoDB, Kinesis, CloudFront, S3, Step Functions, and integrations with other things like Twilio, and those integrations are essential. It's a lot. I think what we're seeing right now is a clarification of serverless: it is not just functions as a service, it is a stack, and the serverless stack has been formed. Here I compare the Amazon offering with the Microsoft offering, which by the way is very impressive, and you can see that everyone playing in serverless right now is trying to build that type of offering, because a year ago we had just Lambda, and with just Lambda you can do very little. So let's talk a little about the ecosystem; everything, I think, spins around the big cloud offerings, the big cloud giants.
I'll talk about the types of serverless frameworks, one category at a time, as you'd expect. Now that serverless has become the trend, frameworks are popping up every day. There are four major categories.

The first is the offerings from the big clouds, and Amazon is the leader. What Azure is doing is really impressive, even for those of us who are skeptical about Windows; I was looking at the logo and I thought, you know what, they would actually do themselves a favor if they somehow disengaged from the Windows association in that logo. They're awesome. Google is maybe less impressive. IBM is putting a lot of marketing power behind Bluemix; we'll see how successful they're going to be in competing with Amazon.

Then there are the standalone frameworks. Iron.io had done serverless even before Lambda, and they're still around, offering not just Lambda-like functionality but Kinesis-like and database-like functionality, so there's a whole stack of offerings. Webtask, coming out of Auth0, turns out to fit the serverless requirements perfectly, so it's getting used right now. There are interesting, I'd say exotic, things like Backand, which make some interesting assumptions about developers, for instance that developers only want to write JavaScript. If those assumptions hold, these guys probably have a future.

The next category is the shim layers on top of AWS. And notice I say on AWS, because Microsoft doesn't take prisoners: there isn't much of an ecosystem around Microsoft tooling. We know Microsoft is different; they're so thorough in their tooling that they don't leave much room. On the other hand, trying to use Lambda or the other AWS services raw, at scale, quickly brings you to Serverless.com or one of the other frameworks that make it easy to work with the offering and sometimes extend it. For instance, some of them let you run Go applications on Lambda, some are multilingual, and some just add convenience.

The third category is implementing serverless on top of your existing Kubernetes or your existing Swarm. You need functions as a service; Kubernetes has jobs, Swarm doesn't. You can do it yourself, but Fission from Platform9 is awesome: it makes creating a function and deploying it on a Kubernetes cluster enjoyable, so check it out. On the Docker side there are a couple of things, like FaaS, which is not even a framework yet, just emerging, and serverless-swarm, which I put together; again, not a framework, more something we built for ourselves and put out there. So there's less happening in the Docker ecosystem than in Kubernetes, but I mention it for completeness.

Lastly, we have the open-source, do-it-yourself serverless frameworks. OpenWhisk is one of them; in fact, it powers IBM's Bluemix cloud. And, recently, StackStorm. Why "recently"? Well, I think because we were sleeping; I was sleeping. We missed the point and looked at StackStorm too narrowly, seeing it as a DevOps automation tool, up to the point where our users began using it for serverless. That opened our eyes: we should throw our hat in the ring and position ourselves as a serverless framework too. So now we are playing. I think I've spoken for too long. Let's do some show.
What I want to do is the typical demonstration that people explaining serverless always do. I'm biased, so I'll choose StackStorm as the platform for doing it, but this is exactly what people do on the other systems. Here we have a box that has StackStorm on it, and on this box I'm going to create a function and run it. Say hello. And for a change: everyone shows serverless with JavaScript, but we here are probably more comfortable with Python, Ruby, or shell. So shell, just for fun. How does that sound for a hello world? Okay.

So this is the function. To make it a function, we need to add a little bit of metadata to explain to the system what it is. To save you from my typing, I've put this together already. The function name will be hello, and we put it in a pack; a pack is how you logically keep functions together, and ours is called "boston". We'll run it as a local shell script; if you're a Python person, you'd use the Python runner here, and so on. Here is the description. The entry point is the file. And then come the parameters: for each one we give the name, a description, and some extra settings to map it to positional or key-value arguments, to play nicely with shell scripts and make them easy to write.

Now I do st2 action create hello... yeah, I'm still sleeping. Oh, we already have it, so let's delete it first. Here is our action. What can we do with it? st2 run; we have help for it, and the help says it takes a name. And if we run it, st2 run boston.hello name=openstack, sure enough, it runs. There is a UI too (I'm still asleep); here is the UI showing that boston.hello ran and returned its output here.

What's interesting is that I can run this with debug on. The convenience here is that we can now debug this function locally; we can exercise it before we do anything else with it. Look at that: we run the function, and the client shows us all the interesting things about it, including the POST it is making to call out to this action. Which hints that this function has an API. Again, not to bother you with my typing, here is the query we're going to make: a POST to the executions endpoint saying, can you please create an execution of, let me change that, boston.hello. Off it goes, and... unauthorized. Which is a good thing, right? It doesn't let just anyone do anything. Luckily we have st2 apikey create, so we can create an API key for this (and luckily this box is going away, since I'm showing you the key). We pass it in a header with -H, and we get the action going, and most interestingly we get back an execution ID. So if I do st2 execution get on it: hello, openstack, we got the action to run.

All right, so this is a toy application; there's really nothing special about it. But what I wanted to show is that we created an action in any language, played with it locally, brought it up, and now it has an API. You can run this function via the API, you can run it via a webhook, and now you can actually hook it up with events. Let's try that. To hook it up with events, we need a rule. The rule defines that on a particular trigger, checked against certain criteria, we run the action.
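For reference, here is a rough sketch of what that metadata and the terminal session look like. The file names and parameter details are illustrative, not verbatim from the demo:

```yaml
# hello.yaml: action metadata registering the shell script as a StackStorm action
---
name: hello
pack: boston                      # packs keep related actions together
description: Say hello from a serverless function
runner_type: local-shell-script   # a Python function would use the python-script runner
entry_point: hello.sh
enabled: true
parameters:
  name:
    type: string
    description: Who to greet
    required: true
    position: 0                   # mapped to the script's first positional argument
```

```bash
# Register the action, run it locally, then call it over the REST API
st2 action create hello.yaml
st2 run boston.hello name=openstack

# The API refuses anonymous calls, so mint a key and pass it in a header
st2 apikey create -k
curl -X POST https://<st2-host>/api/v1/executions \
  -H "St2-Api-Key: <key>" \
  -H "Content-Type: application/json" \
  -d '{"action": "boston.hello", "parameters": {"name": "openstack"}}'
```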
In this particular case, we're watching a file, /tmp/zoo. There's a file-watcher sensor that fires a trigger every time a new line is added to the file, and the rule runs this boston.hello action every time that happens. st2 rule list: okay, it's already in place. So I'll just echo a line into /tmp/zoo, and off it goes. Now let's check st2 trigger-instance list. There's a variety of things here: here is the file-watcher line trigger, it happened, and here the rule actually reacted to it. We can check st2 execution list (I keep forgetting that we have autocomplete), and there is our action; st2 execution get shows it. This was the action.

So what happened just now? We took a simple YAML definition, we took the stimulus, the event, the trigger associated with it, and we configured the system to fire the action on that event. That is the functions-as-a-service loop pretty much end to end.

That's about it, so let me actually describe StackStorm a little. StackStorm is an event-driven system. It has sensors; in our example, the file watcher was a sensor. Sensors emit triggers; the new line in the file was a trigger. Triggers hit the rules, and rules are very simple: if this happens, fire this action, taking the parameters from the trigger payload. Then we can stitch actions together into workflows, so a workflow runs actions in a row, like actions within an action; we'll talk about workflows later. For the workflow engine we use Mistral, and we at StackStorm are Mistral contributors, so some of the stuff you'll see in a moment is already running on OpenStack.

Okay, a little more talking. Is this boring? How am I doing? Are you still asleep? Give me something. All right, keep going? Or should we all just go home? All right, let's keep going. You know, I'm picking up steam because the coffee has now been digested. That was a toy application to introduce the concepts. Our friends and partners began looking for real applications, and now I'll talk about applying what we just saw to a real problem. The real problem is genomic sequence annotation.

There is a company called Softberry. They have a bunch of proprietary algorithms that they stitch together and offer to doctors; the majority of them do genomic annotation. Who here is familiar with what genomic annotation is? Awesome; if I mess it up, let me know, because I'm just familiarizing myself with the domain. Simply put: you take a sample, say a bacterium, and put it in a device called a sequencer, which produces a long sequence in a four-letter alphabet. And the sequence is really long; we're talking three gigabytes long. As a doctor, you look at that sequence and it makes no sense. But there are substantial databases that help find the important parts of the sequence. So people throw it at computation, compare the sequence against everything that is known about the genome by now, and return what is called an annotated sequence. The doctors, the practitioners, can look at that and draw meaningful conclusions.
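For reference, a sketch of what that rule can look like; the rule name is made up, but linux.file_watch.line is StackStorm's stock file-watcher trigger:

```yaml
# zoo_rule.yaml: fire boston.hello whenever a line is appended to /tmp/zoo
---
name: on_new_line_say_hello
pack: boston
description: Run the hello function on every new line in the watched file
enabled: true
trigger:
  type: linux.file_watch.line
  parameters:
    file_path: /tmp/zoo
criteria: {}                        # no filtering; react to every new line
action:
  ref: boston.hello
  parameters:
    name: "{{ trigger.line }}"      # the new line becomes the greeting
```

Register it with st2 rule create zoo_rule.yaml, then echo something >> /tmp/zoo and watch st2 execution list pick it up.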
For instance, this particular feature of this particular bacterium correlates with something responsible for response to a particular kind of antibiotic, so it can be treated with that antibiotic, and so on. I don't know much beyond that, but it comes down to computation. The typical genomic annotation pipeline consists of three parts. The first is prediction, which is usually heavy but relatively short. The second is comparing the sequence against everything that's known about it, and that part is highly parallelizable; there are a number of databases, and the databases are big but not too big, around 100 gigabytes, which nowadays is nothing. Then all the computation results need to be merged together, with some processing on top.

What's interesting is that there are a number of algorithms participating here, and these algorithms were written by different scientists at different times in different organizations, often in different languages, and the probability of even two of them working on the same computer is near zero. Scientists have a peculiar way of doing programming. So for a company offering genomic annotation as a service, this presents typical serverless challenges. The workload is extremely elastic: some of these steps run for a week on one machine, or for a day if you run them on five. They're all function-type workloads: you run one once, it produces a piece of data, and the pipeline moves to the next step. Then there is the need to package these programs, written by scientists in different places, and run them in response to a variety of events. There are also additional requirements: long running times, pipeline orchestration (it is difficult to put all these pieces together), and they also wanted local development.

So: Lambda won't do, we cannot run that long on it. OpenWhisk was beginning to get sequential workflows, but the workflow power was not sufficient for the pipeline. Azure actually comes the closest, but Docker-container-as-a-function support is not there. So we ended up building a solution based on StackStorm and Mistral to run their serverless computing, and there are three parts to it. I hopefully will have time to demo it, but the important things are these. Step one is the administrative task: we need to bring the serverless infrastructure up. What we put together runs on Swarm, with shares that let the functions exchange data; we put StackStorm, a local registry, and some other auxiliary things on the cluster, and we make the Swarm elastic.

Then, as a developer, your responsibility is two things. First, take each program and turn it into a container. Remember all those Java, Fortran, or Perl programs, all different; they just go into containers and get pushed to the private Docker registry. Second, define the steps, the logic of how these containers are stitched together, as a StackStorm action with a Mistral workflow, and the system takes it from there. When the user sends their data, the data is dropped on the disk; that constitutes an event, and when it lands, it fires off the computational workflow.
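To make the shape of this concrete, here is a minimal sketch of such a pipeline in Mistral v2 syntax. Everything here is illustrative: boston.docker_run is a hypothetical wrapper action around the Swarm controller, the image names are made up, and the real comparison step fans out in parallel (think Mistral's with-items) rather than running once:

```yaml
version: '2.0'

boston.genome_pipeline:
  type: direct
  input:
    - sequence_file
    - email
  tasks:
    predict:
      # step 1: gene prediction, heavy but short, packaged as a container
      action: boston.docker_run
      input:
        image: registry.local:5000/predict
        args: <% $.sequence_file %>
      on-success:
        - blast
    blast:
      # step 2: compare against the known databases; highly parallelizable
      action: boston.docker_run
      input:
        image: registry.local:5000/blast
        args: <% $.sequence_file %>
      on-success:
        - merge
    merge:
      # step 3: combine the results, think MapReduce, and post-process
      action: boston.docker_run
      input:
        image: registry.local:5000/merge
        args: <% $.sequence_file %>
      on-success:
        - notify
    notify:
      # mail the annotated result back to the user (core.sendmail is a stock action)
      action: core.sendmail
      input:
        to: <% $.email %>
        subject: Annotation finished
```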
The computational workflow calls the Swarm controller and says, run me this task; then all the cool Docker stuff happens, one task runs, and the workflow continues on. So let's see this in action. I won't need this anymore. We are now on the cluster: docker node ls shows we have three Docker nodes running, one of them the leader, and we have StackStorm running on the same node. And, as everyone does when playing with Docker, we have a visualizer that shows the Swarm.

For the first part, remember, I'm now a developer. As a developer, I want to define the workflows and package up my functions. The functions are in one of these directories; if you take a look, there's a lot of stuff in there. I don't know how it runs, because it's packaged in a container; it's containerized, so you run the container with a parameter and it produces the results, and I don't care how it works inside. To do that, we have a build step that takes all the Dockerfiles and pushes the images into the private registry; the images have actually already been pushed.

Now what we want to do is run the workflow. I'll start it running, because it takes a while, and then I'll explain what it is. Where is my cheat sheet again... What it does is run an action, and this time the action is a workflow; it's multi-step. It takes the input, and, by the way, this is what we're talking about when we say "sequence": look at that, this is a sequence; good luck making sense of it. It takes the sequence, produces a file named result, and at the end it sends mail to a particular address. So let's get it running. Here is an execution. If we go in here, we can see that the first step is already running. This is the pipeline; this is an execution of the pipeline, at its first step; you can see the bubble going.

To explain what the pipeline is doing, I'll show it in the workflow designer. On the right you see the standard Mistral syntax, kept standard just to make it convenient to maintain. There is some preparation: create a directory for each user and do some other things to isolate one user's data from another's. Then we run this step, and here you can see that we're actually taking the image from the registry and running it with parameters. Then there are some more preparations; the preparations apparently also require algorithms that are not portable, so they also run in a Docker container. And at this point we blast it off, literally: we run BLAST across the nodes, and it takes a while as the tasks spin up. We then combine the results, think MapReduce; all the results are combined, some processing happens on them, we build a nice graphic, and then we send the mail.

Let's see where we are. Once again, this time around we were the user: we submitted the data, it went to StackStorm, and StackStorm fired the workflow. We won't go deep into the action; there is some stuff we had to add to Swarm, because, as those of you familiar with it know, Swarm doesn't have a concept of jobs, so we needed to add that, but that's minor. By now the workflow has finished, and I'm actually expecting an email to arrive. I won't wait for too long, but I'm curious where it is. It is here.
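The developer-side packaging step, roughly; the registry address and image name are made up for illustration:

```bash
# Turn one pipeline step into a container and push it to the cluster's private registry
docker build -t registry.local:5000/softberry/predict:latest ./predict
docker push registry.local:5000/softberry/predict:latest

# Kick off the pipeline as a StackStorm workflow action and watch its steps
st2 run boston.genome_pipeline sequence_file=/share/input/sample.fa email=user@example.com
st2 execution list
```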
So it sends me a nice graphic, and it also sends me the annotated sequence. Remember how cryptic this sequence was at the beginning; now it carries everything that was found in the databases, and the findings are added back into the database as well. I don't know how to read it, but my partner does. And we're talking really serious stuff here: the algorithms shown here have been used, in the past, not right now, to hunt down pretty serious bacteria and find cures for infections that were otherwise pretty deadly. So we're pretty happy to be in this spot. Now, for the record, I cheated: I ran this against a tiny database, and with the full database we'd probably be running until the show ends. And I cheated in another way: I ran it on Amazon. The customer actually works on Amazon, so our main deployment was there.

But just before we go on to talking about the future with OpenStack, hot off the press: our team took the same technology and ran it on OpenStack. There's a MapReduce example out there, and you can check it out. And in addition to just doing that, they did something really cool, and the credit goes to Winson Chan, whom some of you know; he's one of the Mistral core contributors, and to the rest of the team. Let me show you what they got; this will be a recording.

Similarly, here is Docker, running on Rackspace cloud. There are two workers here (actually, I have a pointer), and we have a three-node Swarm cluster. We're going to run a MapReduce job that simulates a similar thing; it's a word-count example. It conceptually repeats our genomics pipeline, but it's simple, and we use it for testing all the time. As you can see, we launch a massive number, I think 15 instances, but we run them with memory reservations, so right now there are 15 tasks scheduled there and they don't fit. What we have is a StackStorm auto-scaling rule that is responsible for scaling the Swarm that sits on top of OpenStack, based on the Swarm load. And here we see that the trigger event happened: the queue of pending Swarm tasks exceeded the threshold, it fired the event, and StackStorm says it's time to auto-scale. So we go into auto-scaling: Rackspace is now provisioning a VM, and there is a Heat template that provisions it and configures it so it becomes a member of the Swarm. And now I don't need to cross my fingers, because I've seen the video already; I know the node will come up. Here it is; the node comes up.

And here we see a really cool thing. On one of the panels, some very respected folks were discussing Kubernetes and saying that when you run on Amazon, you want elastic capacity. Remember, serverless is about saving money; we want elastic capacity so that we only pay for use. But unless you have the ability to dynamically scale your Swarm or Kubernetes, whatever you call it, on top of the cloud, you as a user don't have that elastic capacity, and you as a provider don't have the agility to use your compute power to offer this infinite elastic capacity to your end users. Eventually, things just don't scale. So how did we do this? Let me pause this sucker and show you something; we'll jump into the code a little again. Here's the t... what? All right.
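A sketch of what that auto-scaling rule can look like; the swarm.pending_tasks trigger and swarm.scale_up action are names for our custom sensor and action, so treat them as assumptions:

```yaml
# autoscale_rule.yaml: add a Swarm worker when tasks pile up unscheduled
---
name: scale_up_on_pending_tasks
pack: swarm
description: Scale the Swarm on OpenStack when the pending-task queue grows
enabled: true
trigger:
  type: swarm.pending_tasks        # emitted by a custom sensor polling the Swarm API
criteria:
  trigger.queue_length:
    type: greaterthan
    pattern: 5                     # the threshold; tune it for your cluster
action:
  ref: swarm.scale_up              # signals the Heat scale-up webhook
```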
I was going to leave the Heat template for you as an exercise. No, I'm determined; I see Zane here, so I'll make sure I show the Heat template, he'll be interested to see this. So, this is the Swarm manager Heat template. It defines all the worker stuff. It defines the cloud-init. But most importantly, it defines the scale-up policy, the scale-up webhook, and the scale-down policy. Then there is an action. We had to write an action here because, while a single API call is trivial in StackStorm, calling a Heat webhook takes two API calls: one to authenticate and one to signal. So it's a trivial Python action, and as usual for an action, there are two files: the script, and the YAML file that defines the action. There is also a pending-task sensor; if you're interested in how to deal with Swarm this way, take a look at it. But most importantly, the rule. The rule is what wires this all together and makes it all possible. st2 rule list... not that one, that was for AWS; this is the OpenStack rule. As you can see, it triggers on the pending tasks: the Swarm pending-task sensor fires an event when there are too many tasks in the queue with no place to be scheduled, the rule checks that against the threshold, and then it fires the webhook using the scaling action.

I'm done with the showing. Those of you who are interested in hearing me talk can stay; those of you who were only here for the show can leave. So, OpenStack and serverless: who cares? Well, cloud providers, maybe. Maybe they want to offer serverless functionality to compete with AWS; I don't know what they're thinking about competing with AWS. Private clouds: your developers may want to use these models. OpenStack itself, maybe, given the bit of soul-searching we're doing about whether OpenStack should be doing everything Amazon is doing. Most importantly, I think, developers care. Developers have real use cases for workloads like these. They want to use event-driven architectures, and they will use them, with OpenStack or without it.

So OpenStack has only three options: build these things itself, use something like OpenWhisk, or use something like StackStorm. And I'm biased; I think StackStorm is the one. But because I'm biased, I only want to present facts. The facts are: we have the majority of the serverless stack already in place, and, with everything you saw today, you can see that we cover the pieces currently missing to deliver end-to-end serverless functionality. And when we do, we can take the canonical serverless architecture and see what covers each piece: we need a database, and we have it; we need the event stream, and we have it; we need storage; we need workflow; we need Lambda, and here we come. And StackStorm, Apache 2 licensed, has been around a while and is relatively mature at this point. We have a lot of contributors; we're developers here, we know binary, and we have 256 of them, which I found amusing, exactly a round number in hex. Substantial installation base, a lot of integration packs, and we are largely built on the OpenStack technology stack.
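For the curious, here is roughly what the two moving parts look like. First, the interesting fragment of the Heat template; resource names are illustrative, and the real template also carries the cloud-init that joins each new VM to the Swarm:

```yaml
resources:
  swarm_workers:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 2
      max_size: 10
      resource:
        type: OS::Nova::Server     # in practice a nested template with cloud-init

  scale_up_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      scaling_adjustment: 1
      auto_scaling_group_id: { get_resource: swarm_workers }
      cooldown: 60

outputs:
  scale_up_url:
    # the webhook our StackStorm action signals after authenticating
    value: { get_attr: [scale_up_policy, alarm_url] }
```

And a sketch of the trivial Python action, assuming Keystone v3 password auth; the parameter names are mine, not the shipped ones:

```python
# scale_up.py: two API calls, one to authenticate, one to hit the Heat webhook
import requests
from st2common.runners.base_action import Action  # older versions: st2actions.runners.pythonrunner


class ScaleUp(Action):
    def run(self, auth_url, username, password, project, webhook_url):
        # call 1: get a Keystone token (auth_url is the bare Keystone endpoint)
        body = {"auth": {
            "identity": {"methods": ["password"],
                         "password": {"user": {"name": username,
                                               "domain": {"id": "default"},
                                               "password": password}}},
            "scope": {"project": {"name": project,
                                  "domain": {"id": "default"}}}}}
        token = requests.post(auth_url + "/v3/auth/tokens",
                              json=body).headers["X-Subject-Token"]
        # call 2: signal the scale-up webhook with the token
        resp = requests.post(webhook_url, headers={"X-Auth-Token": token})
        return resp.status_code < 300
```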
We actually started StackStorm as a company with the idea of automating OpenStack, and only later realized that automation goes beyond OpenStack. The team has been around the community since probably the inception of Mistral, so we know some of you, you know some of us, and we're contributing. We already bring a large set of integrations with a variety of development tools, some of them pretty mature and heavily used, and these integrations can be extended.

So I think I've done my part. I've given you enough information to see that it is possible to build serverless on OpenStack today, that it has been done, and that it has been done for serious applications. From this point on, those of you who are interested in this topic should judge for yourselves. Take StackStorm for a ride: Vagrant is the usual way; if you're adventurous, try Docker; Chef, Ansible, or Puppet for production; and read the documentation. Everything you've seen today in the demos is in these two repositories. Check them out; you can actually reproduce most of it, except that the proprietary algorithms that run the genomic stuff are not there. And then let's discuss. What I suggest is we start some of the discussions here in Boston, just to get things going; find me at the StackStorm booth, and we'll pick the discussion up from there.

Lastly, I rate my talks. And if you found something interesting, and you feel like giving me a star, the best place to put that star is on our GitHub repo. We're approaching 2,000 GitHub stars, and we're actually giving a prize to the person who gives us the 2,000th star. So please check it out; maybe while you're standing in a line somewhere, if you have your phone. If you feel like you learned something and this presentation was interesting, give me a star. And with that, thank you.