My name is Rob. I'm an engineer at Microsoft. My role is working with partners in the container space; specifically, I work with them to enable them on the platform and make them run more efficiently on Azure. Today I want to talk to you about building fast data solutions with DC/OS, and I'm going to show you on Azure. I'm not a sales guy, I don't hold any quotas, so I'm not going to try to sell you Azure. But if you do choose to run on Azure, there are a lot of good reasons to do so, and I'll give you some tips on how to do that.

Hopefully you're familiar with what fast data is, and with the SMACK stack; if you're not, I'll give you a very high-level overview. Fast data solutions are your typical IoT solutions, but with time criticality involved. I'll give you a good example. I worked with a customer that has those gates that alarm as you exit a store if you haven't paid for an item. The way those things work is that there are little RF chips in the tags on whatever product you're looking to purchase. As you hit that sensor on the way out, the sensor determines the direction you're moving in. If it detects that you're moving towards the exit, that you're leaving, it sends an event up to the cloud, some computations are done to see whether you've purchased that product or not, and if you haven't, an event gets sent back down to that sensor and the sensor alarms. Now, you can imagine that if that round trip took longer than, say, a couple hundred milliseconds, if it took say 20 seconds, it would be valueless. The person would already be in their car, driving away with whatever they've pinched, right? So that's the time-sensitive nature. And there's not just one of those gates in a store. There are probably five or six per store, there may be a thousand stores, and there are a couple of people, maybe a dozen, walking through any gate at any point in time. So that's hundreds of thousands of events going on per second. That's the IoT nature of fast data.

The SMACK stack you're likely very familiar with: Spark, Mesos, Akka, Cassandra, and Kafka. Spark is a distributed processing engine where you can do ML and data processing. Mesos is the reason we're all here; it basically acts as a kernel for your data center or cluster of servers. Akka is an actor-based framework for building distributed applications. Cassandra is a linearly scalable NoSQL database, and Kafka is a linearly scalable event buffer. In this talk I'm gonna be talking about Mesos, Kafka, and Cassandra, and specifically about DC/OS around Mesos. The session's demo driven: about half demos, half slides. Live demos can fail, so all hail the demo gods, blah, blah, blah.

Because it's demo driven, I have to talk to you about the application I wrote for the demo. Imagine you have a bunch of sensors that are writing into Kafka. I didn't wanna create a bunch of instances of those sensors, so I created something called a producer, which acts like a bunch of sensors. It's writing to Kafka, specifically to a topic called sensor-temp. Now, Kafka is partitioned, and depending on the key for my event, it's gonna get written to a specific partition. That partitioning is why Kafka is linearly scalable. Then I've got a consumer component that I wrote, and it subscribes to the topic as a whole. That consumer can be assigned to one or more of those partitions.
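As an aside, here's a minimal sketch of what that producer side could look like. The talk doesn't say what language the demo components are written in, so this is just an illustration in Node.js with the kafkajs client; the topic name matches the demo, but the broker address and payload shape are my assumptions.

```javascript
// Hypothetical sketch of a "producer pretending to be many sensors".
// kafkajs hashes the message key to pick a partition, so events from the
// same sensor always land on the same partition; that keying is what makes
// Kafka linearly scalable.
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'sensor-producer', brokers: ['localhost:9092'] });
const producer = kafka.producer();

async function run() {
  await producer.connect();
  setInterval(async () => {
    const sensorId = `sensor-${Math.floor(Math.random() * 100)}`;
    await producer.send({
      topic: 'sensor-temp',
      messages: [{
        key: sensorId, // the key determines the partition
        value: JSON.stringify({ sensorId, temp: 15 + Math.random() * 10, ts: Date.now() }),
      }],
    });
  }, 20); // roughly 50 messages per second, like the demo's producer
}

run().catch(console.error);
```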
When an event comes in on a partition that it's assigned to, I'm gonna capture that event and write some data into Cassandra. Makes sense so far? It gets a little more complex, but not much. Now, remember I mentioned that a consumer can be assigned to one or more partitions. In fact, when I have one consumer, it's gonna be assigned to all the partitions. But it's not gonna read off all the partitions at once. It's gonna be reading off of a partition, and it's gonna get back a block of data, maybe 50 events, some number of events at a time.

The interesting thing about Kafka, one of the challenges you'll find working with it, is that it doesn't really have the notion of how far behind you are on the topic as a whole. You've seen things like queue depth; that concept doesn't really exist in Kafka. The way it works is that when I go to pull from one of those n partitions, I get back some metadata along with the events, and that metadata tells me how far behind I am on that particular partition. So I may be pulling from the first partition, and it says I'm 50 or 500 behind on that partition, but I have no notion of how far behind I am on partitions two through n. Makes sense so far?

So I wanted to handle that, because I wanna know how to scale. I wanna know how far behind I am on my partitions so I can scale my consumers. What I decided to do is this: when I have a second consumer and that rebalance occurs, or n number of consumers, every time a consumer gets back that metadata saying how far it is behind on its partition, I write to another topic saying, hey, I'm partition one, I'm 500 behind. Another one may be writing, I'm partition three, I'm 50 behind. And then I've got another component listening off of that topic that calculates the aggregate: how far am I behind as a whole? Makes sense? Now, it's not altogether 100% accurate, because as I mentioned, if there are partitions that are unread, you're behind on them but you don't know it. We'll see how we can handle that a little later. Once we have that aggregate lag, once I know how far behind I am on my topic as a whole, I'm gonna write that to a workflow engine, and we're gonna see what that looks like later; we're gonna use that workflow engine to auto-scale my consumers. Makes sense? That's the scenario.
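Before moving on, here's a rough sketch of that per-partition lag reporting, again in Node.js with kafkajs (an assumption; the actual components could be in anything). kafkajs happens to expose exactly the metadata described above: each batch carries the partition's high watermark, so the consumer can compute how far behind it is on that one partition and publish it to a side topic.

```javascript
// Hypothetical sketch: the consumer computes its per-partition lag from
// batch metadata and reports it to a side topic; a separate "lag reader"
// would subscribe to that topic and sum the latest value per partition.
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'sensor-consumer', brokers: ['localhost:9092'] });
const consumer = kafka.consumer({ groupId: 'sensor-consumers' });
const reporter = kafka.producer();

async function run() {
  await Promise.all([consumer.connect(), reporter.connect()]);
  await consumer.subscribe({ topic: 'sensor-temp' });

  await consumer.run({
    eachBatch: async ({ batch, resolveOffset, heartbeat }) => {
      for (const message of batch.messages) {
        // ... write the event into Cassandra here ...
        resolveOffset(message.offset);
      }
      // The high watermark is the next offset the broker will assign, so the
      // lag on this one partition is everything between our last-read offset
      // and the high watermark.
      const lag = Number(batch.highWatermark) - Number(batch.lastOffset()) - 1;
      await reporter.send({
        topic: 'consumer-lag', // side-topic name assumed for illustration
        messages: [{
          key: String(batch.partition),
          value: JSON.stringify({ partition: batch.partition, lag }),
        }],
      });
      await heartbeat();
    },
  });
}

run().catch(console.error);
```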
So here's what I'm gonna illustrate. I tend to think of container-based solutions in layers, starting with just the containers themselves. What do containers buy me? What do I get out of them, and how do they help me, in the context of this talk, build fast data solutions? I'll start with that; we won't spend a ton of time on it. Then we're gonna talk about running at scale: what does the orchestrator, specifically DC/OS, give me when I'm building fast data solutions? And because these are persistent, stateful workloads, how do I handle data persistence? I'm gonna talk about using a product called Portworx to solve a lot of the challenges you have when working with persistent workloads; I'll outline what those challenges are and how Portworx resolves them. And then I'm gonna touch on some day-two operations. How many of you have seen any of the talks that dealt with VAMP? Several of you. Those talks covered canary releasing in VAMP; I'm gonna be using VAMP to do microscaling. I'm gonna use VAMP to build a workflow that scales out given the lag that I'm seeing from Kafka. And that's about it.

So let's jump into the local development aspect, just what containers do. From the perspective of fast data solutions, prior to containers, how would you go about developing against Cassandra or Kafka? You'd have to either install it yourself or use a shared solution, right? You'd be sharing a pooled solution with other people. Neither of those is a very good option. Installing it yourself is wasted calories; it's difficult and challenging. And with a pooled solution, other people can step on what you've got, they can delete your data, and it's not portable. If it's shared, you can't run it on your laptop; if you're on a plane, how does that work? Today it's as easy as running up a container. Just spin up the container and develop against it; I'll show you how to do that in just a second.

Containers give you more than just the ability to run things like Kafka or Cassandra. To me, the attribute of a container that really gives you everything is the fact that it encapsulates its dependencies. That fact ensures that if something runs on your machine, it's gonna run the same way in production, in dev, and in staging. And that same feature allows you to get density, dense workloads, in your data center or in the cloud. If you don't know what I mean by that, I've built a slide to illustrate it. The left-hand side of the slide shows the old world: we've got a hypervisor and a couple of VMs running on it, and the VMs are color-coded to indicate what dependencies are installed on them. In the center, you've got applications that are color-coded by what dependencies they require. You can see the dark green apps can run nicely on the VM that has the dark green dependencies installed, and the slightly lighter green apps can run on theirs. But I have all these apps sitting in the center that can't run anywhere. Maybe those two VMs up there are only running at 10% CPU, 10% memory, and yet I've got all these workloads with nowhere to run. Now, if you look really closely (you're gonna have to look really closely), I've put a very thin line around those boxes to indicate that they're now containerized. What that means is that the dependencies are now encapsulated inside of those containers, so they can all run on any server that has a container engine on it. Now maybe that server's running at 60, 70% capacity. That's the dependency-encapsulation attribute of containers; that's what enables the density.

So let's take a quick demo of developing these fast data solutions locally. If we take a look in here, you can see I've run a docker ps and I've got Kafka up and running. You can also see that I've just done a list of my networks. Now, the easiest way for me to run up Cassandra is to just run a Docker container. If you wanna run any of the code you're gonna see later, the producer, the lag reader, the consumer, any of that stuff, it's all out on my GitHub: github.com, Rob Bagby, DC/OS Kafka Cassandra.
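For reference, the local setup boils down to something like this. This is a sketch only, since the exact commands live in the repo's docs, and the network name here is my own placeholder:

```sh
# Create a user-defined bridge network so Kafka, Cassandra, and the app
# containers can resolve each other by name (network name assumed).
docker network create fast-data-net

# Run Cassandra on that network, publishing the CQL port to the host so
# local tools can connect straight to localhost:9042.
docker run -d --name cassandra --network fast-data-net \
  -p 9042:9042 cassandra:3.11
```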
What I've got under docs is also some nice guidance on how to run both Cassandra and Kafka containerized, because there are a ton of articles out there and they all kind of drive you in different directions; it's actually a bit painful to do. It talks about why we need to set up a network, and then gives the command to run on either Docker for Windows or Linux. So I'm just gonna copy this command and run it. See, now it's running; I thought for a second we'd have to skip the local demo. Okay, no panic. So now Cassandra's up and running.

Now, you can either trust me, or, as my father told me when I left for college, trust no one. If some of you are like my father and you're not gonna trust me, I've written a little utility to illustrate how you can read and write to Cassandra. If you go back out to that little help page I've got, you can see I've got a test container, a little Cassandra tester. I'll show you the code in a second, but let's just run it up. The Cassandra tester's up and running. If I do a docker logs, you can see that every three seconds it writes a message and then reads back the last 10 events from Cassandra, right? So I've got Cassandra up and running. Now, let's go ahead and stop this.

When I'm working with Cassandra, I wanna work locally, and I wanna use all the tools I'm normally using. If you look at the Cassandra docker run, you can see that I mapped port 9042 on my host down to 9042 in the container. The benefit is that I can use something like DataStax DevCenter to connect directly to my local machine, open the connection, and just start developing against that cluster. Make sense?

Okay, so now I've got Cassandra and Kafka running locally, and I wanna start building my fast data solution. Remember, I've got my producer that's writing events to Kafka, a consumer that's reading them and also writing out lag events, and then I've got a lag reader. All that code, again, is out on that GitHub repo, and I've got it locally here: you can see the consumer, the producer, and the lag reader. And I've got a Docker Compose file; it's got the producer, the consumer, and the lag reader in it. You can run docker-compose build to build it, then run it. So let's just run this thing: docker-compose up -f, or excuse me, -d, to run it as a daemon. That spins up my reader and my producer. Now I can do a docker logs on my producer, and you can see I'm publishing those messages locally to Kafka. Docker logs on my consumer: you can see I'm reading those events and writing out to that separate topic on Kafka. And then I can go over to my lag reader, and you can see the lags are gonna start growing: 3,400 and climbing, and they'll continue to grow. So I can actually use docker-compose scale to scale my consumers up, and I can show, while developing locally, how I'm able to handle the scaling and make sure all my code and my lag reader are working appropriately.
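That Compose setup is roughly this shape. A sketch, assuming the repo builds the three components from local directories and that they join the same Docker network Kafka and Cassandra are on; service and variable names are invented for illustration:

```yaml
# docker-compose.yml (illustrative)
version: "2"
services:
  producer:
    build: ./producer
    environment:
      KAFKA_BROKERS: kafka:9092
  consumer:
    build: ./consumer
    environment:
      KAFKA_BROKERS: kafka:9092
      CASSANDRA_HOST: cassandra
  lag-reader:
    build: ./lag-reader
    environment:
      KAFKA_BROKERS: kafka:9092
networks:
  default:
    external:
      name: fast-data-net   # the pre-created network from earlier
```

With that in place, `docker-compose up -d` starts everything, and `docker-compose scale consumer=3` (the scale syntax of the docker-compose versions current at the time) spins the consumer count up or down.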
All that, and if we watched the logs on the lag reader, you'd start to see those numbers slowly come down. So as you can see, from a container perspective I'm able to develop locally. But I'm not gonna wanna scale across a single node using Docker Compose, am I? That's the role of the orchestrator. So let's jump back to the slides and see how we can take advantage of the orchestrator to really run at scale, moving past the development side.

When you're running at scale, that's the role of the orchestrator, and if you look at the orchestrators, they all tend to solve generally the same set of problems. Whether you look at Mesos and DC/OS, Docker Swarm, or Kubernetes, they all tend to act as a kind of kernel for a cluster of machines. You don't have to worry about where to schedule a specific service; you basically declare that the service has these requirements, and the orchestrator finds somewhere to run it. It can do health checks, both at the process level and into the application, and when an application isn't running healthy, it can reschedule and restart it. It allows you to scale not just on one machine but across a cluster, and to gain resiliency and high availability by scaling across fault domains. All the orchestrators tend to do those same things.

So what is it that's special about DC/OS, specifically when we start talking about building these fast data solutions? The answer, and if you sat through the keynote this morning it really hit home on this, is these vetted, stateful frameworks. What are they? They're really distributed applications. They've got the notion of a controller and workers, in Mesos terms a scheduler and executors, and the benefit is that the scheduler is both application-aware and cluster-aware. It can make application-based decisions on how to run that application given what's going on in the cluster. If you've ever installed something like Cassandra or Kafka yourself, you'll know it's not trivial to do. That's all encapsulated inside of those frameworks: the installation, understanding the dependency tree (I need to make sure X is installed before Y), getting an HA implementation of these systems up and running. And they go beyond that. If you want to do things like add a node to Cassandra, that's again a non-trivial operation without these frameworks. All of that and more. I've put a bit of an eye chart up here to illustrate some of the benefits that are in the Cassandra framework. You can look at each of these, but they should hopefully illustrate the power of running these. They're not templates. It's not like a Helm chart that you run to install onto Kubernetes, because a framework is an application that's aware of its surroundings as well as of the app itself. So hopefully I've driven that home.

Now, if you want to run these frameworks on DC/OS, and you want to do it in Azure, there are a couple of ways to install it. One, you can just go into our portal, go into the marketplace, and search on Mesosphere; I'll show you that in a second. Another way is through something called ACS Engine. In Azure, the way we deploy is with something called an ARM template: Azure Resource Manager templates.
It's a big giant blob of JSON that defines the topology of what you want. You can use ACS Engine, which is an open source project; I'll show it to you in a second. It's a little Go app: you create a little tiny template, you run it against ACS Engine, it gives you that ARM template, and you just deploy it. That allows you to define custom topologies for DC/OS. So you can say: maybe I want two private pools, or maybe I want to deploy into an existing virtual network because I want to build a hybrid solution, or maybe I want attached disks for storage, like I do. You'd use ACS Engine to do that, and I'll give you a high-level walkthrough in a little bit. That's the Azure pitch, right?

Okay, so we talked about the orchestrators, and specifically about the power of DC/OS. Now I want to talk to you about data persistence, and specifically about how Portworx solves some of the challenges we have with data persistence. Let's talk about the options for container persistence. I'm going to talk about the Azure ones in particular, but most of these generalize to other clouds or even on-prem.

The first one is Azure-specific: Azure Files, which is a file sharing service. You might think Azure Files is your answer for persistence if you're going to run Cassandra or Kafka. The challenge is that it wasn't designed for that. It's a file share designed for sharing files, not for high IOPS. It's heavily throttled, and if you're running something like Cassandra, you're going to get throttled and it's not going to work for you. So you don't want to use Azure Files.

Next option: you could use the ephemeral disks on the VM, and there are people who do that. We've got people who want to eke out that last bit of performance in Cassandra, and they run on the ephemeral disks knowing that if the node dies, they lose everything on that node. But they take that into account with their backup and snapshotting strategy. You can do that, but it's for a highly advanced scenario where you need to eke out that last bit of performance; it's not the general case.

Then you've got attached disks, and that's probably the most common choice. I'll talk about attached disks in more detail in just one second. And then you've got pooled storage solutions. What pooled storage typically does is take a bunch of, say, attached disks, aggregate them all together, and then serve out virtual volumes. I'll talk about how these pooled storage solutions (some examples are Portworx and GlusterFS) solve some of the problems you're going to see with attached disks.

So with that, let's talk about the challenges of attached disks. The first challenge I think of has to do with container rescheduling. If you're looking down at your PC, you're missing about 30 minutes of my work on the animation on this slide, so everybody look up, let's respect the animation. All right: when a workload gets scheduled to node one and then needs to get rescheduled to node two, what happens with that disk? The disk is gonna have to move from node one to node two. (You can look down at your PCs again, the cool animation's done for now. I'll let you know when the next one's coming.)
Or you're gonna have to make sure that the work gets rescheduled onto the same node, and if the node was the problem, that's not gonna work. So you've got this disk-movement problem. And what's the challenge when the disk moves? Even if the orchestrator can tell the disk to detach from here and attach over there, you've got latency. That's a moving part, right?

The second challenge is what I call the container-disk challenge: the relationship between containers and disks. If two containers are writing to the same disk and you have to reschedule one, naturally you'd have to reschedule both of them, because you've got to move the disk, right? Or you've got to enforce a one-to-one relationship. And what's the problem with one-to-one? A couple of things. One, there's a maximum number of disks you can attach per VM. But secondly, a disk is not a very granular unit. You might say: I want a five-gig disk for container one. What if you need to go to 10 gigs? Well, over-provision; but now you're paying for something you don't need. It's not the best solution; there are challenges there.

Some of those challenges are addressed by these pooled storage solutions. In this slide, what you see is two nodes, and each node has four disks attached, each disk 128 gig. The pooled storage takes those and aggregates them into a terabyte. So when node one wants to spin up container one, and container one needs a 10-gig volume, that pooled solution gives you a virtual volume of 10 gigs. If it needs to be 15, voila, it's 15. And if the container needs to get rescheduled to node two (no animations here, by the way), then voila, no disk movement. That's the benefit.

So what am I using as this pooled storage solution? I'm using Portworx. Portworx is a great solution. Not only does it solve the problems I just mentioned around pooling and serving out virtual volumes, but it was built and designed for containers, so it has a lot of per-container functionality. Things like the ability to encrypt per volume, so you get basically container-by-container encryption, bringing your own keys. You get a Docker volume driver, so as the orchestrator is saying, hey, I need this bit of compute, this container, scheduled over here, it can also call into the volume driver to get a volume. You're treating storage the exact same way you're treating your containers. You also get enterprise features such as backups and snapshots. I don't want to beat the drum for Portworx too hard. Jeff's here from Portworx; he'll be out in the hall. If you want to talk to Jeff or the other folks, please do, and they'll talk to you about it.

Let's do a demo now about running at scale. Basically, in this demo I want to show you how, in a DC/OS cluster, I can get Cassandra up and running using Portworx on the back end. Then, in the last demo, I'll show you running the entire app and scaling it. So, I basically created a little test cluster here. It's a three-node cluster with Portworx running across the three nodes. And I've got something called Repoxy: the Portworx UI is sitting in the private agent pool, so I need Repoxy in order to serve it out. I've got that running, and you can see Lighthouse out here.
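That Docker volume driver is worth making concrete. Here's a rough sketch of what it enables; the flag names follow the Portworx driver as I understand it, so double-check against the Portworx docs for your version, and the volume name, sizes, and replication factor here are made up:

```sh
# Ask the Portworx volume driver ("pxd") for a 10 GiB virtual volume carved
# out of the pooled storage, replicated across two nodes.
docker volume create -d pxd --name cassandra-data \
  --opt size=10 --opt repl=2

# Any container can then mount it like an ordinary Docker volume; if the
# container gets rescheduled to another node in the pool, no disk has to move.
docker run -d --name cassandra \
  -v cassandra-data:/var/lib/cassandra \
  -p 9042:9042 cassandra:3.11
```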
Because Cassandra takes a little while to deploy, I'm gonna go ahead and deploy it now, and then I'll backtrack and talk to you a little bit about it. So I'm gonna search for Cassandra, and I'm gonna install the Portworx Cassandra framework. The reason I'm choosing the Portworx Cassandra framework and not the other Cassandra framework is that this one has the volume driver integrated into it. I'm just gonna put two nodes out here. While this thing's installing, one thing you'll notice as it starts to deploy is that you'll see that dependency tree illustrated out here: things get deployed in order, which illustrates what I was saying about the value of the scheduler.

While that deploys, let's take a look at Lighthouse. This is the UI for Portworx. You can see I had a three-node cluster with a 128-gig attached disk on each node, but Portworx has aggregated that into 384 gig. You can see I've got three nodes, all running, but no volumes have been served out yet. If we watch this for a second, as that scheduler starts to kick out my Cassandra nodes (and don't worry about the big red line, I just didn't configure an email server or whatever), you can see it now: the first Cassandra node is up, and when the second node comes up, you're gonna see another 10-gig volume get handed out. I didn't do anything; the orchestrator is controlling all of this. That shows you the value of that volume driver.

So we're gonna let the Cassandra install run for a little bit, because there's still a second node that needs to come up. Meanwhile, let me take a quick detour over to ACS Engine and show you how you might install a cluster. I'm just gonna give you the high level, but hopefully it'll be enough to go on if you want to try it in Azure. You go to github.com/Azure/acs-engine, and if you go into docs, under acsengine.md you can see the instructions for installing ACS Engine. In general, the easiest way to install it is either to run it inside of Docker (the easiest way), or to just grab the binaries under the releases. Then, if you go into the ACS Engine repo, you can see these examples. These are the little mini templates that you fill out and run against ACS Engine. You call acs-engine generate and pass it one of these templates. In this case, I used this template: I've got DCOS as the orchestrator, I've chosen one master, I entered my DNS prefix, I chose one 128-gig disk, and I entered my public key. I ran that through ACS Engine, got an ARM template back, and then deployed that ARM template. That's essentially how simple it is to deploy a cluster with ACS Engine.

So now you can see Cassandra has just been served its second volume, so Cassandra's up and running with two nodes. Well, this one's still staging, so we'll wait till that finishes up, and then what I can do is grab that Cassandra tester, run it up here, and prove that it all works. Then we'll jump back to the slides very briefly, and I'll show you how we can auto-scale this thing. Okay, everything's up and running. Let's grab that tester. I just happen to have the Marathon config for my Cassandra tester handy; it looks something like the sketch below.
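A minimal sketch of what a Marathon app definition like that tester's would look like. The image name, resource sizes, and environment variables are all illustrative, not the actual config from the repo:

```json
{
  "id": "/cassandra-tester",
  "cpus": 0.1,
  "mem": 128,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "robbagby/cassandra-tester"
    }
  },
  "env": {
    "CASSANDRA_CONTACT_POINT": "node-0.cassandra.mesos",
    "CASSANDRA_PORT": "9042"
  }
}
```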
So I take that, go into DC/OS services, and I add my little Cassandra tester. There it is; it's gonna deploy quick. That little thing starts to buzz when I lie, did you notice that? Or when I'm about to lie. So let's click on it. It's still staging; let's give it a second. It only takes this long when you're on stage, and it takes longer the more people are in the audience: for every 20 people, it's another 10 seconds. Okay, it's up and running. We can click on it (and I lost my mouse), we can click on here and take a look at the logs. Now, Cassandra's taking a little bit longer; it's not completely up yet, so the tester's taking a minute. I'll jump back to this in a little bit, but let me pop into the slides and keep going, because we wanna get to the money part of this demo, which is the VAMP solution.

What we wanna do next is talk about the next day. Hopefully I've illustrated the ease with which you can install Cassandra, and the ease with which you can take advantage of pooling solutions such as Portworx to handle those persistence problems. So now let's talk about day two. Again, if you missed the talks by Tim and by Julian on VAMP earlier (one of them was on running DevOps, the other one was just on VAMP), I highly urge you to watch those recordings; they're unbelievable presentations. They talk about canary deployments and all the power of VAMP from a canary deployment perspective. I'm gonna be talking about VAMP from the perspective of using it to do microscaling.

So what is VAMP, if you didn't see those sessions? It's canary releasing and autoscaling for microservice systems. Basically, it takes a bunch of telemetry out of the orchestrator and makes that telemetry available to you inside of workflows. That's the simplest way to think about it. It's got constructs that represent the artifacts you care about: containers, workflows, all of that is encapsulated in these constructs. But the real money is this rich set of data, which includes data coming from the orchestrator but also data that you can push in yourself from your application. You can push that data onto what you can think of as a bus, and then you can write workflows that operate against it, which is exactly what I'm gonna do. I'm gonna take that aggregate lag that tells me how far behind I am in Kafka, and I'm gonna push it into VAMP. Then I'm gonna write a workflow that says: hey, if my aggregate lag is greater than 500 and I've got fewer than 10 consumers running, spin up another consumer, and do that check every 15 seconds. But if my aggregate lag is less than 100 and I've got more than one instance, spin one back down. So eventually I should level off, given my rudimentary algorithm. Make sense?

I've got to explain a couple of artifacts in VAMP so you understand it at a high level, without going into too much detail. Breeds basically describe entities; you can think of them like a Marathon JSON file or a YAML file in Kubernetes. They just define the entity. A blueprint is a topology; it typically has several entities. So my blueprint is gonna have my lag reader, my consumer, and my producer. Make sense? Don't you hate it when a guy says "make sense"? I can't stop saying it now.
And then you've got deployments: a deployment is just a blueprint that's running. And then you've got workflows. A workflow is just a little Node.js application, deployed as a container, that runs against that telemetry, either the orchestrator's telemetry or your own custom telemetry, which is what I'm going to use.

So why don't we just take a look at a demo. Let's see if my Cassandra tester's working yet. There you go, my Cassandra tester's up and running, so Cassandra's up and running. I actually didn't write the kind of crap component I normally write: it retried. So I should get a little clap for that. I'll clap for myself, yes! No crap code here. There's a lot of crap code here, by the way.

All right, so I'm gonna close this out, close out this session, and go ahead and connect to another cluster. You guys ever see the movie Dumb and Dumber? Where he says: just when I think you couldn't possibly do anything dumber, you go and do something like this, and totally redeem yourself. Hopefully I totally redeem myself here. All right, so I'm connected to another cluster. Let's take a look at it. I'm doing port forwarding, by the way; that's why I'm on localhost. You can see this cluster's got a bunch of stuff running. I've got Portworx, and Repoxy again for my back end. I've got Kafka and Cassandra both running, and both of those are the Portworx versions that have the Portworx volume driver. I've also got VAMP running, and Elasticsearch; VAMP is using Elasticsearch for its time-series data.

So, VAMP. If we take a look here and click into it, there's a little group here, and the most important guy here is the UI. Actually, the most important guy is the API, but for the purposes of this demo, it's the UI. If we pop in here and take a look at the breeds, you can see I've got a breed for my producer, and if we take a look at that, you can see that all it really is is a definition of my entity. The real money there is that it points at a container on Docker Hub: robbagby/wac-demo-producer. And it's got a bunch of environment variables, and those variables can get overridden either by the blueprint or, if you're calling into the API, you can override them there. So I've got my producer, my lag reader, and my consumer breeds, and then I've got a blueprint, my ready demo here. In the blueprint, you can see it's got the breed for my producer, and I can override those environment variables there. Can you see this well enough without me zooming in? Yeah, the consumer and producer details don't really matter too much.

So I've got this blueprint, and I can run this thing up. Let's go ahead and deploy it. Now, remember I told you we're gonna get all these events when this thing deploys. You can see we're starting to deploy here, and it's gonna go ahead and try to deploy those three containers into DC/OS. When the lag reader gets up and running and starts calculating the aggregate lag, I'm actually writing that back to the VAMP API. So I should be able to see my aggregate lag in these events. I'm gonna filter out the health, metrics, and allocation events (we keep just missing it), and we should start seeing some lag events popping up. If I go back over to DC/OS and take a look in services, you can see my services are running. My producer's up and running, and if we take a look at the logs, you can see it's pushing 50 messages every second.
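Backing up a step: a VAMP breed like that producer's boils down to something like this. This sketch follows VAMP's YAML artifact shape as I recall it, with the port and environment variable names invented for illustration:

```yaml
name: producer:1.0.0
deployable: robbagby/wac-demo-producer   # the Docker Hub image the breed points at
ports:
  web: 8080/http
environment_variables:
  KAFKA_BROKERS: broker.kafka.l4lb.thisdcos.directory:9092   # overridable by the blueprint or the API
  MESSAGES_PER_SECOND: "50"
```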
If I go back to the services, you can see that all my services are up and running, and the lag reader's out there pushing those lag events. Let's see where it's pushing them: you can see it out here pushing them, one was 294, one was 33. If we go back to VAMP, you can see all those lag events getting pushed into VAMP.

Now, I've written a workflow that does just what I told you. Let's take a look at it. Let's go into the breeds, hide these events for a second, and look at my auto-scale breed. Here's the workflow, and I'll just show you the money bit. What you can see is that I've basically got an HTTP GET to my URI, to VAMP running in the cluster, for the events where the tag is "lag". So it's just getting my lag events back. Those get passed back to me as a response, and then I run this logic: if the lag is greater than 500 and my instance count is less than 10, scale the instances up; otherwise, if the lag is less than 100, scale them down. Makes sense?

So let's run this thing. Let's take a look at my events and see where my lags are. My lags are over 500 now, so when I kick this off, it should immediately start pushing instances out. There's my auto-scale-ready workflow; let's start it. It's gonna run every 15 seconds, so let's just watch it go. While this thing's kicking, you can see it just ran and didn't do anything, because my lag is 458. Remember, that lag can kind of go up and down artificially, because there are certain partitions it's not reading off, but eventually we'll hit over 500 and this thing will kick off. And when it does, we're gonna see my instance count go up. Here we go, we're now up to two instances. Can you all see that? Two instances popping up. If we go back to DC/OS and look at it, you can see in the services that I'm now deploying my second instance out here.

Let's also take a look in DC/OS at the density we're running at. You should see it move up higher as we deploy more and more services and as VAMP sees these lags going up. You can see my lag is starting to increase rather drastically, because as I spin up new consumers, they're hitting partitions that were never being read before. So all of a sudden it says: oh, partition X now has 50 or 100 or 500. Eventually, once we have 10 consumers out there, you'll have clarity on what your entire lag is, and then we'll move back down below 100 because my consumers will catch up. It'll eventually smooth itself out to somewhere between four and five containers. Make sense? There it is, "make sense" again; I can't stop saying it.

So let's take one more look over here at DC/OS. We can see we've now got six running and we're trying to deploy the seventh, and if we go into the nodes, you can start seeing that we're actually getting higher and higher density. So: it's the container, the fact that it encapsulates its dependencies, that allows it to run anywhere; it's the orchestrator that allows you to schedule it anywhere; but it's this workflow component that gives you the ease of spinning them up and down as needed.
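To make that workflow concrete, here's a rough sketch of the shape such a VAMP workflow could take. VAMP workflows are Node.js scripts, but the endpoint paths, response shapes, and names below are placeholders of mine, not the actual VAMP API; treat it as pseudocode that happens to run.

```javascript
// Hypothetical micro-scaling workflow: read the latest aggregate-lag event,
// then nudge the consumer service's instance count up or down.
const http = require('http');

const VAMP = { host: 'vamp', port: 9090 };   // in-cluster address assumed
const MAX_INSTANCES = 10, SCALE_UP_AT = 500, SCALE_DOWN_AT = 100;

function getJson(path, cb) {
  http.get({ ...VAMP, path }, res => {
    let body = '';
    res.on('data', chunk => (body += chunk));
    res.on('end', () => cb(JSON.parse(body)));
  });
}

// Runs once per invocation; VAMP schedules the workflow every 15 seconds.
getJson('/api/v1/events?tag=lag', events => {
  const lag = Number(events[0].value);                            // newest lag event
  getJson('/api/v1/deployments/ready-demo', deployment => {
    const scale = deployment.clusters.consumer.services[0].scale; // shape assumed
    let instances = scale.instances;

    if (lag > SCALE_UP_AT && instances < MAX_INSTANCES) instances += 1;
    else if (lag < SCALE_DOWN_AT && instances > 1) instances -= 1;
    if (instances === scale.instances) return;                    // nothing to do

    // PUT the new scale back to VAMP (path and payload shape assumed).
    const req = http.request({
      ...VAMP,
      method: 'PUT',
      path: '/api/v1/deployments/ready-demo/clusters/consumer/services/consumer/scale',
      headers: { 'Content-Type': 'application/json' },
    });
    req.end(JSON.stringify({ cpu: scale.cpu, memory: scale.memory, instances }));
  });
});
```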
The benefit is that if you have workloads that vary throughout the day, you can spin up containers for the ones that need more and spin down the others that don't. And you really don't have to worry about what I call macro scaling (scaling up nodes) as much; you're just spinning workloads up and down inside your existing cluster. So let's just go back to VAMP for a second here, and we'll keep it up as the lag starts to go down. We're sitting at, I think, nine or ten. The lag will eventually start going down, and as it drops below 100, this thing will start to spin back down. But for now, why don't I take any questions if we have any? We've got a microphone here. If they're Portworx-related, again, the Portworx guys are out there, and we've also got somebody from VAMP here who can answer deeper VAMP questions.

First of all, thank you for the talk. Maybe it's a Portworx question, but as we are talking about fast applications: how do you measure the delay of writing in a scalable, distributed, pooled data store?

I'm sorry, can you repeat the question?

Yes. When I work with these kinds of technologies, one of the most important parts is data locality. So if you write to or read from a pooled data store, how do you measure the delay of the writing and the reading? Because we are talking about fast applications, I assume all of this is on a microsecond scale. Do you measure this delay?

Currently, in this app, I'm not doing that, but we can take that offline if you want, and I can direct you to some people who can probably help you from that perspective.

Well, that's it. If you've got patience for just another couple of seconds: we've just hit below the 100 threshold, and we should see this thing start to scale back. That's it, thanks for attending. And there we go, we scaled down to nine.