This is a talk on using the HTTP API. Maybe a short question up front: who of you has used the HTTP API before? You? Cool, actually a bunch of people. And of the others who didn't raise their hand, how many of you have written frameworks the old-style way using libmesos? OK, no one? All right, I hope we still have some new exciting stuff for you, or some best practices for writing HTTP-based frameworks. Maybe just a word about who we are. That's actually Max from ArangoDB. ArangoDB is a close partner of ours, Mesosphere, in Germany. They're basically always the guinea pigs implementing new features. For example, they were one of the first frameworks using the new persistent volumes back when they were released, and now they're also one of the first trying to migrate from the old way to the new way. There are actually a number of frameworks already written with the new HTTP API, but there aren't so many that have actually moved over from the old-style way to the new-style way. All right, just as a brief outline of what this talk is about. We're going to talk about the HTTP v1 API, and that was actually a Mesos 1.0 milestone, so it's officially supported there. Bits of it were already in prior versions, for example the scheduler and executor APIs in Mesos 0.24. And as you can already hear from those different names, it's not one big API, it's actually several APIs. There's the framework part, which consists of the scheduler and the executor APIs. And then for operators, so for people who actually want to control the cluster, there's yet another API called the operator API. Overall it's an RPC-like HTTP API, and the coolest feature in my opinion is that it allows for language-independent schedulers. Before, you always had this libmesos dependency, and there were people maintaining drivers, for example for Java or for Go, but it was always really hard to keep them up to date with the most recent development. So it's really nice that, independent of language, basically any language supporting HTTP, you can now write your framework out of the box. We're going to come to some limitations later in this talk. So in the beginning, in the first years of Mesos, there was just libmesos, and that was all you had to use for writing your own framework. How did that look? Imagine we want to write a Java-based framework, and here we only care about the scheduler part, we don't care about the executor part. We had a Java library, mesos.jar, which internally, via JNI, was using libmesos as the library. This construct of those two libraries could then talk to the Mesos master and basically receive resource offers, start new tasks, and do anything a scheduler would do. Overall, that had several disadvantages. The first one being it wasn't really portable. This was specific to Java- or Scala-based frameworks; if you wanted to write, say, a Go-based framework, as we just heard with Kubernetes for example, you had to write yet another binding to libmesos. So that was really nasty to keep track of. The second part was it was really hard to debug.
Whenever you found an issue somewhere, say offers weren't received or offers weren't sent, it took a really long time to figure out which step in this chain was actually failing, and there was no easy way to get logs throughout the entire lifecycle. There was also an upgrade dependency: frameworks had to be upgraded if we had breaking changes in libmesos, so you couldn't just keep your frameworks running. It didn't happen often, but it was still annoying that you had to keep track of the development of libmesos. And maybe from an operator perspective it was also pretty annoying that sometimes the responses changed. For example, with the state versus state-summary endpoints, we added new fields, and you could never be really sure what you'd get, because new fields could have been added or existing fields deprecated. There was no real versioning of the different APIs. So that's part of the motivation for moving to the HTTP-based API. The second big part is the networking side of it. With libmesos, there were some real challenges from a networking perspective. What happens if I have a firewall in between the master and my scheduler? What happens if one of them is running in a container? You really had to keep track of how the networking was working in those cases. All right, let's break down this wall in between and see how we can get going from there. Before we break down walls, we should actually understand how many APIs there are. As I said initially, it's not just one API being used by Mesos, it's actually a bunch of them. First of all, the one we mentioned before is the scheduler API. The scheduler API is for the scheduler talking to the master: receiving offers, starting tasks, receiving task status updates and reacting to them. On the other hand, down on the agent, we have the executor API, which allows the executor to talk to the agent and vice versa. This is for the tasks running on those nodes, so they can communicate with the Mesos agent running there. Together, as both are required to write a framework, the scheduler and the executor APIs are called the framework API. Then there's one other person involved, and that's the operator. The operator needs to interact with both the master and the agent, to set limits, change logging levels, basically anything an operator would do. That's covered by the operator API, which again consists of a master API for communicating with the master and an agent API for communicating with the agent. Internally, and that's not yet touched by this effort to have a more or less consistent HTTP API, there's the internal API between master and agent. That's basically still left as it was before, but it's also not really visible, nor should it be visible, to any framework developer. That's why we first wanted to get the outside-facing APIs right, the ones you as a framework developer might use, and then as a next step tackle the internal ones. All right, now we understand basically what's happening, and we can actually start building something new around this.
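In the v1 API that we'll get to next, these different APIs map roughly onto the following HTTP endpoints. This is just an illustrative sketch: the hostnames are placeholders, the default ports 5050 (master) and 5051 (agent) are assumed, and every call against these paths is an HTTP POST.

```python
# Illustrative only: Mesos v1 HTTP API endpoints, with placeholder hosts
# and the default ports assumed.
MASTER = "http://master.example.com:5050"
AGENT = "http://agent.example.com:5051"

V1_ENDPOINTS = {
    "scheduler": MASTER + "/api/v1/scheduler",   # scheduler <-> master
    "executor": AGENT + "/api/v1/executor",      # executor  <-> agent
    "operator_master": MASTER + "/api/v1",       # operator  <-> master
    "operator_agent": AGENT + "/api/v1",         # operator  <-> agent
}
```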
So the goals. The main goals were, first of all, like that one slide we saw about the networking issues: to make it work inside firewalls and containers more easily, without requiring manual adjustment or having to be aware of it. Second, to allow for pure-language clients, so you don't have a native dependency anymore and can basically pick any language supporting HTTP to write your scheduler, your service or framework. Also the versioning I mentioned, to make it easier for us to keep track of the different versions of our API and also make it better visible to operators which API version they are currently using. And what I really, really like as a goal is the documentation part. The old APIs just kind of grew, which also happened back in the Twitter times, and now we actually have it as a concrete goal to have the APIs better documented, so they can be more easily used and understood, with good documentation from the start that keeps being maintained as we add new versions. So those were the goals, and this is basically what came out as the current implementation of the HTTP API. Every call you make, be it from an operator or from a scheduler, is a POST request. Then there are different responses depending on what you did. You can subscribe to certain events; for example, as an operator you might want to subscribe to the event stream and just see what's going on in the cluster. In that case you receive a 200 OK, and following that you receive a RecordIO-formatted stream of data. RecordIO basically means that for each event streamed to you, for each packet, you first get the length, followed by the JSON payload; it's actually a really simple format. And then there are the non-subscribe calls. If you, say, send a call to start a new task, that results in a 202 Accepted response. All right, how does it actually look in the cluster? We have a scheduler, and the scheduler first subscribes to the master because it wants to be connected. This is also a connection which should be persistent throughout the lifetime of the scheduler. So we subscribe with our FrameworkInfo, similar to what we would have done before when we registered our new framework with the master. Then we get a response, and this response is a stream. The first event is going to be SUBSCRIBED, telling us, hey, we successfully registered with the master, and that's followed by offer events. So similar to how we got offers before, we now get offers via the stream of data, and it's pretty much the same format as we had before, so it's easy to follow and understand. As a scheduler I can then decide to accept such an offer and basically start a task on it, as I would have done in the old API. That call, if successful, is not a streaming one; I simply get a response back, and then I know, hey, my tasks are good to go, the master is aware of them and will try to start them on the agent.
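As a minimal sketch of that scheduler flow, assuming Python with the third-party 'requests' library and a placeholder master address (this is not the speakers' actual code), the subscribe call and a later accept call could look roughly like this:

```python
import requests  # third-party HTTP library, assumed to be installed

MASTER = "http://master.example.com:5050"  # placeholder address

# SUBSCRIBE: one long-lived POST whose 200 OK response is a chunked,
# RecordIO-framed stream of events (SUBSCRIBED, OFFERS, HEARTBEAT, ...).
subscribe = {
    "type": "SUBSCRIBE",
    "subscribe": {
        "framework_info": {"user": "root", "name": "example-framework"}
    },
}
resp = requests.post(
    MASTER + "/api/v1/scheduler",
    json=subscribe,
    headers={"Accept": "application/json"},
    stream=True,
)
resp.raise_for_status()                      # expect 200 OK
stream_id = resp.headers["Mesos-Stream-Id"]  # must be echoed on later calls
# The events themselves are read from resp as a RecordIO stream
# (see the parsing sketch further below).

# A later, non-streaming call such as ACCEPT gets a 202 Accepted response.
accept = {
    "type": "ACCEPT",
    "framework_id": {"value": "<id from the SUBSCRIBED event>"},
    "accept": {"offer_ids": [], "operations": []},  # filled from received offers
}
requests.post(
    MASTER + "/api/v1/scheduler",
    json=accept,
    headers={"Mesos-Stream-Id": stream_id},
)
```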
Next, the master needs to launch those tasks on the agent, and for that the agent spawns the executor, and the executor then subscribes to the agent, because they also need to communicate. And this is the second part of the API, the executor API. Once it has subscribed, the executor can give responses back to the agent, whether a task has started successfully or whether anything has failed. Maybe one interesting thing while we have this picture: over those subscribed connections, so between the executor and the agent and between the scheduler and the master, there are also heartbeats flowing. Both sides are always checking whether the other part is still communicating, whether it's still alive, so they can detect when something is failing. This is also kind of important: if you write your own executor, you should really implement and take care of those heartbeats, so you can always make sure the other side is still running. That's exactly what this slide is about. Periodic heartbeat events are sent by the master, so going back to the earlier picture, the master is going to send those heartbeats to the scheduler from time to time, and it's up to the scheduler to receive them and check that the master is still alive. If after a certain timeout it hasn't seen one, it needs to be aware of that and reconnect. The master does the same: it tracks those persistent connections, and if they fail it allows the framework to reconnect within the failover timeout. All right, this brings us to the operator API. The operator API is used by a human operator. In theory, a framework could use it as well; there's nothing actively preventing a framework from accessing the operator API, but I'd highly recommend not to, because that's not really the intention. If you have a valid use case you might want to go for it, but it's not the intended use. Here as well, everything is based on POST requests, so I can simply subscribe, for example, to the event stream, in this case on the agent side. And we now have a nice abstraction here. Before, when I as an operator wanted to keep track of the state of my cluster, I had to call state-summary every five or ten seconds, for example; I had to poll to keep track of the state of my cluster. Now, with this event stream, we have something really nice. Marathon has actually had something like this for a while, and what it allows you to do is simply keep track of what events are going on in the cluster. For example, if I subscribe to it as an operator, I might get a TASK_UPDATED event, so I'm proactively being informed when something happens in the cluster. Again, this is the RecordIO format: I first get the length telling me how much data is expected, and then I get the actual JSON of what has happened in the cluster.
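To make the RecordIO framing concrete, here is a small, illustrative parsing sketch in Python, again assuming the 'requests' library and a placeholder master address, with no error handling or reconnection logic:

```python
import json
import requests  # third-party HTTP library, assumed to be installed


def recordio_events(response):
    """Yield JSON events from a RecordIO-framed, chunked HTTP response.

    Each record is the decimal byte length of the payload, a newline,
    and then exactly that many bytes of JSON.
    """
    buf = b""
    for chunk in response.iter_content(chunk_size=None):
        buf += chunk
        while True:
            newline = buf.find(b"\n")
            if newline == -1:
                break
            length = int(buf[:newline].decode("ascii"))
            start, end = newline + 1, newline + 1 + length
            if len(buf) < end:
                break  # wait for more data
            yield json.loads(buf[start:end].decode("utf-8"))
            buf = buf[end:]


# Example: subscribe to the master's operator event stream and watch task updates.
resp = requests.post(
    "http://master.example.com:5050/api/v1",  # placeholder master address
    json={"type": "SUBSCRIBE"},
    headers={"Accept": "application/json"},
    stream=True,
)
for event in recordio_events(resp):
    if event.get("type") == "TASK_UPDATED":
        print(event["task_updated"])
```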
Cool, versioning. Versioning was mentioned before; our goal is to have the API versions match the major versions of Mesos. For example, we're right now at Mesos version one, so this is the current API version you can use, and if we get to Mesos 2.0 there will probably be a version two of the API. In between, we guarantee backwards compatibility, and we define compatibility similarly to how other big projects do. We allow ourselves to add new fields, for example, or to add new events; that's okay. But we wouldn't remove or rename anything. So your old code, which is looking at those different events, should still work, as we're mostly adding stuff on top. I think this makes things easier for people implementing frameworks, because they can rely on the API staying the same and don't have to change or recompile once there's a new Mesos version, and it also makes it easier for operators to write scripts querying the operational state of their clusters. Now, the API status as of Mesos 1.1, and actually this was already true for the 1.0 release: the scheduler and executor APIs are stable. We would recommend framework writers really use them from now on, because what's also going to happen in the future is that new features will only be implemented for the HTTP API. For example, this event stream: you can't use it if you're on the old libmesos implementation. That's why this is the most important part, that you can actually switch over your frameworks, and we'll talk about how to do that in a minute. The operator API we still consider unstable. That means we might still rename some fields, we might want to remove some fields, but the overall concept is going to stay the same; we're just not sure about certain implementation decisions there yet. From a status perspective, there are already some client libraries, and we'll see them later; we highly recommend you use them. If you go to the Mesos documentation, you'll see there are a number of client implementations, for example in C++ directly in the Mesos code. There's a JavaScript framework which is also based on the HTTP API. And there's the DC/OS SDK, which also internally uses the HTTP API. Whenever you want to start a new framework, it's really good to look at those, because it's still kind of hard to know what you have to do, and there are some pitfalls, as we'll see later. All right, this brings me to ArangoDB. As mentioned earlier, ArangoDB is often one of the first implementers or first users of new Mesos features, so I'll hand over to Max, who can tell you a little bit about what they have done with respect to the HTTP API. Yes, thank you. Before I start talking about the framework and HTTP and all the rest of it, let me give you a brief overview of what ArangoDB is and why we would need a framework for it. First of all, ArangoDB is a database, and it is a multi-model database. That means we support multiple different data models: in our case a document store with JSON documents, but it is also a graph database, and it also has key-value store functionality if you just want to use it as a key-value store, all in one engine. So it's not different database engines next to each other, it is one engine with a query language that lets you mix and match all three data models. The API is HTTP REST, so we have for a long time used HTTP, REST and JSON documents. Our query language, AQL, allows you to do joins over documents, it allows graph queries, it allows transactions, and you can even mix all of this within a single query, in one engine. Furthermore, the database is extensible.
We embed Google's V8 engine in the database server, and therefore you can extend the already available HTTP REST API with your own routes. If you need, for example, a data-centric microservice which is close to your data and executes queries efficiently, then you can implement that in JavaScript and run it on the database server, and you can make a nice abstraction of this service by exposing it as an HTTP REST service which is handled directly from within the database. Now, what is not on this slide, but what is relevant here, is that ArangoDB is a distributed data store which scales horizontally, and therefore we very much like to integrate with DC/OS and with Apache Mesos to be able to deploy ArangoDB easily. Thanks to the Universe in DC/OS, it is literally two clicks to deploy a distributed ArangoDB cluster in a DC/OS cluster. If you later need more servers, you can just scale up via the user interface. If you need fewer servers, maybe because the amount of data has shrunk, you can scale down, and we handle this gracefully by first cleaning out the data from some servers and then shutting them down in a controlled fashion. And as you see, deploying and scaling a distributed data store with state is, of course, a difficult task, and DC/OS and framework writing is exactly the right approach for us to give a good user experience to people who want to use ArangoDB. So, we have written a framework; most of it was done by myself and colleagues of mine. But that is hard. Let me just explain why it is hard to write a framework. Well, as you know from Mesos, a part of the planning and the scheduling, the second level of scheduling, happens in the framework scheduler; after all, it's called a scheduler. So it is not the case that the framework just tells the system, I need this and that task. Rather, it has to take part in the scheduling: it has to receive offers about free resources in your cluster and then make a decision as to whether an offer is accepted and used to deploy a task or not. So there's a certain difficulty in the scheduling logic, in particular because we use persistent volumes. We want that if a task goes down, for whatever reason, it comes back up on the same node and still sees its own data; otherwise we would have to resynchronize the data, which would take a longer time. Now, the second part which makes it difficult to write a framework is that it needs to be resilient in its own right. If you want to run a database in production, not only the database itself must be resilient and fault tolerant, but also your scheduler. If somebody kills the scheduler, maybe just for the silly reason that you need to deploy a new version, then the framework must somehow find its own state, recover from the crash, get back into action and pick up the pieces which were left. We do this by using ZooKeeper and storing the latest state of the ArangoDB cluster, as we knew it, in ZooKeeper. When the framework starts anew, it just has a look at ZooKeeper and picks up the things where they were left off, by asking Mesos, is this the list of running tasks, and so on, and doing task reconciliation. Anyway, the framework does deployment, or helps with organizing it. It deals with persistent volumes. It organizes a part of the failover; another part is handled in the ArangoDB components themselves. And it takes part in up- and down-scaling.
So if you click scale down in the ArangoDB user interface, you express the wish that the database should shrink. Then, first of all, the database itself moves data away from one node, and in the end the framework scheduler notices that this node is now empty and can shut it down in a controlled way. So a lot of things happen, and if you actually look at the complexity: our framework is written in C++, because we started at a time when libmesos was essentially the only option, and it is over 5,000 lines of code, just the framework scheduler. And this is the SLOC count, so this is without comments and empty lines and so on; if you really count all the source lines, it's over 12,000 lines of code. This is a code base which has to be written, debugged and maintained, and it causes a lot of headache. So now let's see how we can maybe improve this situation. Not only was writing the framework in itself complicated, we also had issues with respect to managing it. We were using libmesos, and libmesos is a huge thing; it itself has 119K source code lines, and even just the build environment in a Docker container to build our framework needs three gigabytes of disk space, which can be a problem on the build machine. So there are a lot of technical difficulties with respect to linking, with respect to using the right libraries, version dependencies, and it doesn't become easier because usually the framework is shipped as a Docker image. We also had the problems Jörg mentioned: with the old API there has to be network communication in both directions. Not only does the framework have to open a connection to the master, but the master also connects back to the framework. So there were lots of firewall issues and so on, and it was a huge headache. Now, what do we hope the HTTP API can improve here? Nowadays, and this is different from a year ago I would say, we wouldn't recommend writing your own framework in C++ from scratch. Rather, we would recommend using a software development kit which uses the HTTP API. In this way, the effort to write a framework is much reduced. Let me give a few arguments for this. For example, you can develop your framework in any language; you are not limited to C++ or Java, so you can take JavaScript, you can take Go or whatever language you want. The communication is JSON, that is clear text, and it's easier to debug: you can just use a network sniffer, see what communication goes on between the framework and the master, and home in on problems much more quickly. And you have a lot less code. For example, just recently we have started to experiment with the JavaScript framework Jörg has mentioned, and this is, I think, actually the link to it, Jörg, isn't it? Yes, to the GitHub repository. We are trying to see how we could move our framework to maybe a JavaScript implementation, or maybe something else which uses a software development kit and uses the HTTP API. That would then allow us to get rid of libmesos, have a much easier way to build the framework, and remove a lot of boilerplate code. The first experiments are encouraging. For example, with this JavaScript framework, or framework maker, I can essentially just describe what tasks for ArangoDB have to be started and what their dependencies are, and I don't have to write thousands of lines of C++ code.
Rather, I have to write a few dozen lines of JSON to describe what tasks I want, so it's much more compact. Obviously, some complexity still remains: if we want to achieve all the things we have done with the C++ framework, to handle failover, to handle scaling and things like that, then you still have to do some kind of coding. But again, it can be done in another programming language and in a much shorter way. Let me say two more things about this. First of all, a warning: this software development kit is early stage. It's probably not yet ready for production, but it's encouraging, that's what I want to say. And secondly, there's another talk tomorrow by Jörg and Ken, actually at 10:25, in which lecture room, also here? Whatever, you'll find it out. That one is devoted to the topic of writing stateful frameworks, so there's more to be said about this. Okay, let me finish by mentioning a few challenges we found when experimenting with the JavaScript framework. There's actually a fundamental thing missing from the HTTP API, namely the state abstraction. With libmesos, a framework could persist some state, to ZooKeeper or any other place, via the state abstraction API. That is still missing in the HTTP API, so whatever framework you write against it cannot persist state easily. Therefore it's very hard to make the framework, when it comes back after a crash, pick up the pieces where it left off; it can't really remember, for example, the framework ID with which it had connected to the master. So if it now reconnects to the master, it's probably considered to be a new framework. That is the fundamental problem. The four points here are more of a problem with the particular JavaScript software development kit: we found that it doesn't yet support persistent volumes, that it's difficult to organize failover and scaling at this stage, and that it doesn't do the new operator API. So that's my word of warning that this JavaScript code is maybe not quite production ready. But the point remains for the future that I think it will be considerably easier to write frameworks using not only the HTTP API but also the SDKs supplied with it. And with this, I hand back to Jörg. If you have any more questions about ArangoDB or about our framework, just contact me after the talk or whenever. Thank you very much, both for presenting and also for trying out all those new features early on. Maybe just two more words, one about this slide. This also underlines the point you just made, that using an SDK is actually a pretty good choice, because of the last point about operations. This is something you should always keep in mind when running your software: how can you monitor it? How can you debug it? How can you update it? If you have a new version of your framework, how can you roll it out in a safe fashion? That's where an SDK comes in quite handy, because otherwise this is something everyone has to invent from scratch as well. And just jumping back one slide: this Node.js, this JavaScript SDK Max just mentioned, in my opinion it's really, really cool for prototyping. It was actually written by a community member who wanted to try out Flink on Mesos, and so he basically said, yeah, sure, I'll write an SDK for that. And the Flink implementation is actually really, really short; as I said, it's nothing big.
This is not the Flink setup I would run in production, but for them, just to test that they can actually run it on Mesos, it's pretty cool. And by the way, we're currently working on the Flink integration, for whoever's interested in Apache Flink: the next Flink version is going to have explicit Mesos support, and around the same time we're also going to have a DC/OS Universe package for it. Our talk tomorrow is actually exactly about using an SDK: we wrote the DC/OS SDK, making it easier to write stateful applications. Maybe going a little bit back in the history of Mesos: Mesos already had the intention, in the early design phase, to be an SDK for easily writing distributed systems. And just using Mesos, it really is easy as long as you write stateless services, but as soon as you try to write stateful services, it becomes a little harder. This is why we now have this new SDK, which we'll present tomorrow. All right, so if you want to get started with your own framework, there are multiple points to start from. If you have an existing framework based on Java, there's actually something really cool, and it's even in the Apache Mesos repository: it's a shim which allows you to write your software against a common API, and then you can switch in the background whether it should still use the old libmesos implementation or the new HTTP implementation for new features. That's something nice; it's mostly for Java- or Scala-based services, but it really makes it easy to switch over from the old world to the new world without having to roll everything out from scratch throughout your entire cluster and then figure out what's going wrong. The second one is the JavaScript framework I mentioned; it's called mesos-framework. That's nice if you just want to prototype your framework, if you just want to try out something new. And if you want to understand the basics of the HTTP API, I would also say this is a nice piece of code to look at, to understand the first bits, not so much the failover and the real production bits, but the initial ideas behind the HTTP API; it's a pretty good place to start reading. And then, if you really want to write your own framework, I would urge you to look at the DC/OS SDK and probably visit the talk tomorrow. All of that is going to enable you to write really short definitions of frameworks. For example, this is an example of a scheduler with the Java-based API, and with the DC/OS SDK it's actually often sufficient to just have a YAML file from which the code is generated for you, so you don't even have to code in the end. Now, best practices for using the HTTP API. First of all, as mentioned multiple times, really use libraries. Don't try to write it all by hand; at least look at what other people did before you. If you have a Java-based framework already written, have a look at the shim; it should be pretty straightforward to move over from your existing implementation to the shim and then step by step switch over to the new HTTP world. As mentioned, new features will probably only be supported for the HTTP API, so it's good to switch as early as possible. When running HTTP-based frameworks in your cluster, keep in mind that you should use persistent connections: Connection: keep-alive, and you'll also see that the transfer encoding is chunked. Especially with the subscribe call, you keep getting events over time, so you have to keep your connection open.
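Tying the persistent connection and the heartbeat points together, here is a minimal, illustrative watchdog sketch in Python (hypothetical helper names, not taken from any SDK): it remembers when the last event arrived, picks up the heartbeat interval advertised in the SUBSCRIBED event, and considers the subscription dead if several intervals pass without anything arriving, at which point the scheduler would resubscribe.

```python
import time

heartbeat_interval = 15.0       # overwritten by heartbeat_interval_seconds from SUBSCRIBED
last_event_time = time.monotonic()


def on_event(event):
    """Feed every decoded event (including HEARTBEAT) through this handler."""
    global heartbeat_interval, last_event_time
    last_event_time = time.monotonic()
    if event["type"] == "SUBSCRIBED":
        heartbeat_interval = event["subscribed"].get(
            "heartbeat_interval_seconds", heartbeat_interval
        )
    # ... handle OFFERS, UPDATE, and so on here.


def connection_looks_dead():
    """Allow a grace period of a few missed heartbeats before resubscribing."""
    return time.monotonic() - last_event_time > 3 * heartbeat_interval
```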
Since you're keeping connections open, and in order not to open too many of them, HTTP pipelining is a good thing, so you can interleave HTTP calls instead of serializing them and slowing down your entire cluster. When writing your own scheduler or your own executor, make sure that you react to those heartbeats, because otherwise you might be killed, or you might not recognize when the other side isn't there anymore. And, as with all things, it's really important to implement the authentication schemes provided, because otherwise you could end up having to configure your cluster in a really insecure way and actually allow frameworks to hijack your cluster. For most of those points, if you're using one of the SDKs, they'll do the best effort for you and you don't have to worry about it, because it's just handled by the SDK. All right, thank you very much. Any questions? No questions? All right, thank you very much for listening, and thank you very much for writing HTTP-based frameworks.