Okay, so we are Jay from IBM China and Kapil from Mesosphere, and we're here to present.

Let me first talk about motivation. The title is probably already self-explanatory: we don't want ZooKeeper. First of all, let me say some good words about ZooKeeper. It's a very mature technology, it's been around for a while, it's very feature-rich, and I could go on and on. I know there might be some ZooKeeper lovers here. But ZooKeeper has some limitations. It's kind of heavy, especially for [unclear]. ZooKeeper has some hard dependencies, and it's Java. I'm not saying anything bad about Java, but yeah, it's Java. ZooKeeper also comes with language bindings, so if you want to build your application against ZooKeeper, you need to find the specific language binding and develop on top of that library. So there are some bad things about ZooKeeper; it definitely has its downsides.

But again, I want to emphasize this very strongly: it's all about having options. For example, if you already have etcd or Consul or any other key-value store in your environment, you want to use that; you don't want to run ZooKeeper as well. You can use Mesos against your existing infrastructure. That's the whole point. It's like having different flavors of chocolate.

Now I'm going to give an overview of Mesos high availability. First of all, you need a distributed key-value store. Actually, let me ask first: how many of you are using, or have been using, a key-value store in your system or environment? Quite a lot. And how many of you have already developed, for example, leader election, leader detection, that kind of thing for your application? A few. Yes, okay.
I'm going to talk briefly about that. So you need a key-value store. I put the word "distributed" in parentheses because, from Mesos' point of view, you don't necessarily need your key-value store to be distributed. That is only for the high availability of the key-value store itself. From Mesos' point of view, a non-distributed store is okay.

This is a very typical Mesos setup. You have at least three Mesos masters to get it working in cluster mode, and you specify a quorum, for example. Then you have one and only one leading master, which is the outcome of the election. I know many of you are already familiar with this. So there is leader election and leader detection. Detection is for the Mesos agents and frameworks to find out which master is currently leading. The third piece is the replicated log. It's one of the internal libraries used by the Mesos masters to store some data, for example your quota information, your reservations, or roles. I'm going to talk about that later.

Leader election and detection are quite straightforward, actually. Here we have ZooKeeper as the key-value store, which has come along with Mesos from the very beginning. So let's talk about leader election. Everything is very straightforward.
It's very simple, so I'm going to go through it pretty quickly. At the very beginning, all masters contend for leadership. Of course one will succeed and the other two will fail. At this point the leading master holds its leadership in ZooKeeper, and the ones that failed keep watching it to see whether the leadership has expired due to, for example, a network partition or a master failure.

So here we have a leading master. What happens on a network partition or a master failure? The other two keep watching until the failure happens. Then they are notified immediately, and they contend again as we mentioned above. Of course one master succeeds and the other fails. One holds the leadership and the other just keeps watching it. So a new leader is elected when the first Mesos master is dead.

What about agents and frameworks? How can they detect the master? It's very simple. They just call detect against ZooKeeper, and ZooKeeper returns the current leading master's information, in this example its IP and port. The agents and frameworks then use that information to connect to the current leading master.

Next is the replicated log. It's slightly more complicated, but in its simplest form you can think of it as a replicated, fault-tolerant, append-only log.
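As an aside on the election and detection flow described above, here is a minimal Python sketch of the idea. It is not the Mesos or ZooKeeper API: the `LeaderKey` class, its method names, the addresses, and the TTL values are all hypothetical, and a manually advanced logical clock stands in for session expiry.

```python
class LeaderKey:
    """Toy in-memory stand-in for a key-value store holding one leader entry.

    Hypothetical illustration only: not the ZooKeeper, etcd, or Mesos API.
    """

    def __init__(self):
        self.now = 0        # logical clock in seconds, advanced by tick()
        self.entry = None   # (master_address, lease_expiry) or None

    def tick(self, seconds):
        """Advance time; an unrefreshed lease expires, like a lost session."""
        self.now += seconds
        if self.entry is not None and self.entry[1] <= self.now:
            self.entry = None

    def contend(self, address, ttl):
        """Atomically claim leadership if nobody holds it. True means elected."""
        if self.entry is None:
            self.entry = (address, self.now + ttl)
            return True
        return False

    def refresh(self, address, ttl):
        """The current leader renews its lease to keep holding leadership."""
        if self.entry is not None and self.entry[0] == address:
            self.entry = (address, self.now + ttl)
            return True
        return False

    def detect(self):
        """What agents and frameworks ask: who is the leading master right now?"""
        return None if self.entry is None else self.entry[0]


def elect(store, masters, ttl=10):
    """All given masters contend; returns the ones that won (at most one)."""
    return [m for m in masters if store.contend(m, ttl)]


masters = ["10.0.0.1:5050", "10.0.0.2:5050", "10.0.0.3:5050"]
store = LeaderKey()
first = elect(store, masters)        # one master wins, two fail and keep watching
store.tick(5)
store.refresh(first[0], ttl=10)      # the leader keeps its lease alive
store.tick(11)                       # leader dies: no refresh, so the lease expires
standbys = [m for m in masters if m not in first]
second = elect(store, standbys)      # the watchers contend again
print(first, second, store.detect())
```

A real store would notify the watching masters when the entry disappears; here the expiry is simply checked whenever the clock advances, which keeps the sketch deterministic.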
If you're interested, you can go to mesos.apache.org, where there is documentation for it, and you can also read some research papers from Twitter and others. The Mesos master uses the replicated log to store cluster state in a replicated, durable way. You can think of it as an external data store that is persistent.

The replicated log runs the Paxos algorithm internally. As some of you who are familiar with Paxos know, you need to know every other participant in order to run the algorithm. So what happens is that every replica in the replicated log registers in ZooKeeper, and of course every replica keeps its own registration alive so that it does not expire. That information is available in ZooKeeper. When a new replica joins the group, this information is propagated to the existing replicas, so every replica has global knowledge of all the nodes in the group. Since every replica has that global knowledge, they can run Paxos.

So for the replicated log, ZooKeeper is really just a registration entity. Everybody registers with ZooKeeper, and in this way they get to know each other. Then they can run the Paxos algorithm. That's the goal.

So we want to replace ZooKeeper. Essentially, we replace every entity we just mentioned that facilitates election, detection, and registration. The replacement can be anything. It can be ZooKeeper, it can be etcd or Consul. It can be any key-value store, essentially, even your own application.

The key components, as we mentioned, come in three forms. One is the contender. The contender is very simple.
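The registration role described above for the replicated log can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `Registry` and `Replica` classes, the PID strings, and the `majority` helper are hypothetical and are not taken from the Mesos code. It only shows the idea that each replica registers once and every replica ends up knowing the whole group.

```python
class Registry:
    """Toy registration point standing in for the key-value store (hypothetical)."""

    def __init__(self):
        self.members = set()
        self.replicas = []

    def register(self, replica):
        """Record the replica, then push the full membership to every replica."""
        self.members.add(replica.pid)
        self.replicas.append(replica)
        snapshot = frozenset(self.members)
        for r in self.replicas:
            r.known = snapshot


class Replica:
    def __init__(self, pid):
        self.pid = pid
        self.known = frozenset()   # this replica's view of the group


def majority(group_size):
    """Smallest number of replicas that forms a majority of the group."""
    return group_size // 2 + 1


registry = Registry()
replicas = [Replica("log-replica@10.0.0.%d:5050" % i) for i in (1, 2, 3)]
for replica in replicas:
    registry.register(replica)

# After registration every replica sees the same three members.
print(sorted(replicas[0].known), majority(len(replicas[0].known)))
```

The store takes no part in the consensus itself in this picture; it only lets the replicas find each other, which is why any key-value store with registration and change notification could play the role.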
It's just a contend API. If you write your own module, you implement the contend API and Mesos will call that interface, expecting to get a boolean back. If it's true, then I'm the leading master. If it's false, there must be another one who is leading.

For the detector, of course, I call the detect API and I get a MasterInfo back, telling me the IP and port of the current leading master.

The PID group is probably not that intuitive. It has an initialize call. Basically, as a new replica, I'm going to join the Paxos group, so I initialize. Then it's your responsibility to implement this interface to register in the key-value store and maintain that presence. The PID group is actually a class in libprocess, and internally it does all the work for the Paxos algorithm.

Previously ZooKeeper was quite closely coupled with Mesos, so we carved out these interfaces. If you have your own implementation of a key-value store, you can just write your own module using these interfaces. We have a good example, which I will show you later. It's very simple.

Okay, so I'm going to hand over to Kapil here to talk about modules in general.

So far we have seen which components we need to replace in order to get rid of ZooKeeper, and now we'll see how we can do it with modules. So what is a Mesos module? It was introduced about two years ago, and it's an extension to the existing Mesos code. You can call it a module, a plugin, an extension, whatever. The idea is that we have several components in Mesos.
Some of them may not give us what we want, and we want to be able to replace them. In other cases you want to add some extra functionality to the Mesos master, the slave, the framework, and so on, with the idea being that this extra functionality may or may not be universal. Maybe your organization needs it but nobody else does.

Some of the examples you often see in a cluster are the isolators: if you have specific hardware, you want to isolate it properly. Then there are authenticators: everyone has their own authentication protocols and mechanisms. And then there is a third kind of module, the hooks. Hooks are less of an extension and more of a listening service. You can see "my task is getting launched" or "my task is dying" or "the slave exited" and so on. They are also quite helpful for updating information during task launch. For example, if you want to add some authentication privileges in the task launch sequence, you can have a hook that intercepts the launch, updates the information, and then passes it on to the remaining components. Again, the idea is that you intercept the code path, modify the arguments, and then continue. You are not adding extra functionality; you are just modifying the code path in some sense.

So how are modules used? Typically a module is compiled and placed in a shared library of its own, for example libmesos_network_overlay.so. It usually contains one module, or maybe a set of modules that are required for some functionality in the same process. Then, when you launch the Mesos master or agent, you specify the module parameters: --modules= followed by a JSON description. This JSON blob contains the path of the module library, the names of the modules, and so on. And then you typically have some additional flags.
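To make the `--modules` JSON description above more concrete, here is a small Python sketch that builds such an argument. The field names ("libraries", "file", "modules", "name") follow the layout in the Mesos modules documentation as far as I recall it, so check them against the documentation for your Mesos version. The library path and the module name are hypothetical.

```python
import json
import shlex


def modules_flag(library_path, module_names):
    """Build a --modules=<JSON> argument listing one library and its modules."""
    spec = {
        "libraries": [
            {
                "file": library_path,
                "modules": [{"name": name} for name in module_names],
            }
        ]
    }
    # shlex.quote makes the JSON safe to paste into a shell command line.
    return "--modules=" + shlex.quote(json.dumps(spec))


# Hypothetical library path and module name, for illustration only.
flag = modules_flag("/usr/local/lib/libmy_isolator.so", ["org_example_MyIsolator"])
print(flag)
```

The resulting string would be passed on the master or agent command line, together with whichever additional flag selects the module for a given subsystem.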
In this case, I'm saying --isolation=my_isolator. Basically I'm telling the agent to load the my_isolator module from this definition and make it available, or use it, during the launch of tasks and so on. The my_isolator module gets loaded during agent initialization in this case. Then, whenever a task is launched, it is called and does the isolation for the task's resources inside that isolator implementation. Again, this uses the existing APIs. So if there is a module that you want to write, it has to be for some component within Mesos that already has a virtual class or interface definition, and that's what we are going to do here.

Now that we know what modules are, the core part of this later section of the talk is to see what we as a community can do with modules. Say a developer comes up and says, "I have a really awesome module that provides your Mesos with everything you need." But how do we actually make it useful for others? Is it okay for a module developer to just put a module somewhere so that you can go and use it? What about compatibility? What if there are frequent changes in Mesos that are going to break it? As a community we want to answer all these questions: how do we make this whole module development process streamlined, in a way that the community can use it, participate, and provide feedback?

Modules are tricky, and that's why so far we haven't seen that many of them. Everything we have seen so far in terms of modules has been highly specialized, written by people who are a lot more familiar with Mesos than an ordinary person who is more focused on their own application.
So it requires very intimate knowledge of Mesos, and that's why developing, building, testing, and using modules are all still a bit complicated, more complicated than we would like them to be. Let's take a look at some of these steps.

Writing modules. Writing a module does not, in itself, require knowledge of all of Mesos. You don't have to be familiar with the Mesos code, and you don't need to be a Mesos developer. The subsystem you are implementing is the only piece you need to be aware of. However, you do need familiarity with the Mesos programming model. What I mean by this is that Mesos uses libprocess, it has event-driven mechanisms, it uses futures and promises, and so on. Because the interface of the subsystem you're implementing depends on these, you have to be aware of them and knowledgeable about them. Finally, the module you are developing, or the subsystem you are working with, is closely tied to Mesos. It's possible that in the next version we update the isolator interface, and now your module needs to adapt. In most cases that means you have to enhance your module to add features and provide support for the new interface.

Building Mesos modules. This used to be quite tricky. First you had to build Mesos, and in order to do that you had to install all the dependencies that Mesos requires. It takes several minutes to build, and then of course there are the version dependencies between Mesos modules and Mesos.
The good news is that, starting with Mesos 1.0, the RPM and deb packages that are available contain everything you need in order to compile a module. So you no longer need to have the Mesos source downloaded and compiled on the machine; you can now just go to the packages.

Testing is yet another pain point so far. We like modules, but how do we do unit testing? Can we have something similar to what Mesos itself has, which is a bunch of unit tests based on the gmock and gtest frameworks? So far modules don't have that. The good news here is that there are some patches on the way to create something similar, something like a libmesos_tests testing library that your module can link against to run unit tests just like Mesos does. That way, whenever you update something, you get the feedback right away, instead of someone using their development or production cluster to find the errors.

So back to the whole idea of community-driven modules, and to answer the question: once I have a module, and I have built it and tested it, what do I do next?
The question is: as a community, how can we make it easier for developers to provide modules, and for the community to actually consume those modules, provide feedback, and iterate on them?

The proposal that we are still brainstorming is to have a central repository of all modules. That can be, for instance, on GitHub, under the GitHub Mesos organization, where there is already a modules repository. This repository would contain pointers to all the Mesos modules that are publicly available. Those pointers can point to your own Mesos module repo, which the developer would own, or the community would own, depending on where and how they want to host it. Then we want to see if we can create binary packages, deb and RPM packages, so that people can just click the link, get the module, and put it in their Mesos environment.

Here is an example of the naming. You can say: this is my module, this is the module version, and this is the Mesos version. Because module and Mesos versions are closely tied, we have to provide multiple versions of your package, one for each Mesos version.

We are actively working on the Mesos module CI. The idea is that whenever you create a pull request against a Mesos module, for example, it will build your source code and test it, to see if it actually works against the latest master and also against the previous versions of Mesos that are still supported, and so on.

So with that, let's take a look at etcd. As many of you already know, etcd is a distributed key-value store, and there are quite a few projects using it already. It has a very beautiful HTTP API, which I really like because there's no language binding needed: you can use whatever you want and just do the HTTP call. And yes, etcd may already exist in your environment.
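To illustrate the point that "just an HTTP call" is enough, here is a minimal Python sketch that builds (but does not send) the kind of requests an election and a detection could use. It assumes the etcd v2 keys API with its `prevExist`, `ttl`, and `wait` parameters; the talk does not say which API version or key layout the etcd module uses, so the endpoint and the key name below are hypothetical.

```python
from urllib.parse import urlencode

ETCD = "http://127.0.0.1:2379"        # assumed local etcd endpoint
LEADER_KEY = "/v2/keys/mesos/leader"  # hypothetical key name


def contend_request(master, ttl):
    """PUT that only succeeds if the key is absent, with a TTL on the entry."""
    url = ETCD + LEADER_KEY + "?" + urlencode({"prevExist": "false"})
    body = urlencode({"value": master, "ttl": ttl})
    return ("PUT", url, body)


def detect_request(wait=False):
    """GET the current leader; wait=True asks etcd to block until it changes."""
    query = "?" + urlencode({"wait": "true"}) if wait else ""
    return ("GET", ETCD + LEADER_KEY + query, None)


for method, url, body in (contend_request("10.0.0.1:5050", 10), detect_request(True)):
    print(method, url, body)
```

Any HTTP client in any language can issue these requests, which is the property the speaker is praising: no store-specific client library is required.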
I know some companies are using it. You have probably seen this diagram before; I just replace the question mark with etcd. You don't need to modify or rebuild Mesos if you want to use the etcd module. It actually comes as three modules: the master detector, the master contender, and the PID group. But they are bundled in one project, and you can just download that one and compile it.

I want to say once again that it's all about having options. I'm not saying etcd is better than ZooKeeper or Consul or anything else. It's just to enable you to use your own flavor of projects and tools in your own environment. I'm not trying to convince you to switch and kill off ZooKeeper. So the title may be a little bit misleading; we just wanted it to be catchy. We want to be able to use ZooKeeper, etcd, Consul, and potentially a lot more KV stores.

Okay, now it's demo time. This is the live demo. I'm going to start the agent just to show you the master detector. You can see it detected the master at port 5050 and registered with it. Now I'm going to kill one master here; I'm going to kill the leading master to trigger a failover. There's a timeout value you can configure. And here you see it detected the new master at port 6060, which is the new leading master. You can also see the master log here. That's pretty much the demo of the etcd module. I'm going to show you the module CI web page briefly, so I'm going to hand over to Kapil to talk a little bit about it.
So this is the CI building the module, and as you can see, I pushed something which failed. That's the simple idea of testing a module. If we go to the pull request, it shows where the data is coming from. I recently created a pull request against the Mesos etcd module, it went into the build, the CI detected that the build had failed, and so we have the status here. It will be a bit more lively when we have the final solution. As far as the etcd module goes, it's still a work in progress, and feedback is more than welcome. And that's pretty much it. Thanks. We can take questions now.

Before that, actually, sorry: we want to give a big thanks to all the folks who have been working along with us. The first one is [name unclear]; I think he's here. He actually modularized the contender and detector interfaces. We also have support from Cody and Benjamin, who actually started the module project itself. And Joseph: right now there are still some patches in progress, and Joseph is reviewing some of them. So thanks. You can grab the modules from these two links and try them out. Currently they are based on Mesos version 1.1. And that's it. We're ready to take questions.

So, I want to build a cluster with 10 physical hosts, and I run the application inside a container. The application needs to persist its data somewhere.
I don't have shared storage, and I want to protect that data from the application. Is there a very lightweight way for me to do that, no matter whether it's a file system interface or something else? The data is only, say, less than 100 megabytes.

So, the replicated log: a framework can actually use that for a very limited amount of data. You can read the documentation of the replicated log, and your framework can use it. But there's a limitation on ZooKeeper storage. I think for every znode there may be a one-megabyte limit, so you would probably need to chunk your data into small pieces. If you use the replicated log, it transparently uses ZooKeeper or etcd underneath. The interface is from the replicated log to your framework, so you don't really need to care which module is underneath. It's just a key-value store behind the replicated log in Mesos. That's to my knowledge right now.

Okay. Thank you very much. Have a good day.
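As a footnote to the chunking suggestion in the answer above, here is a minimal Python sketch of splitting a payload into pieces under a size limit and joining them back. The one-megabyte limit is the figure the speaker mentions from memory, and the `part-NNNNNN` key naming is hypothetical; nothing here is tied to the replicated log or ZooKeeper APIs.

```python
LIMIT = 1024 * 1024  # assumed per-entry limit of about one megabyte


def chunk(data, limit=LIMIT):
    """Split bytes into pieces of at most `limit` bytes, keyed by index."""
    return {
        "part-%06d" % (offset // limit): data[offset:offset + limit]
        for offset in range(0, len(data), limit)
    }


def reassemble(parts):
    """Join the pieces back in key order (keys are zero-padded, so they sort)."""
    return b"".join(parts[key] for key in sorted(parts))


payload = bytes(range(256)) * 10240   # 2.5 MiB of sample data
parts = chunk(payload)
print(len(parts), reassemble(parts) == payload)
```

A real store would also need a way to write all pieces consistently, for example an index entry written last, which this sketch leaves out.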