Welcome! We're going to talk about Marathon, about jobs, about the current state, and maybe the future state. My name is Johannes. I work at Mesosphere on the Marathon project. If you have any questions about what we talked about yesterday, or about what I'm going to talk about today, please drop me a line. If you can't find me in person, I'm on Twitter and in various Slack channels, including the Mesos Slack and the DC/OS Slack. So take the chance and write me a message if I screwed up somewhere, or if you have more questions than I can cover today.

First, let's talk about Marathon. Marathon is a container orchestrator on top of Mesos. Who's using Marathon already? Okay, great, I think everybody raised their hand. I think Marathon is one of the most popular container orchestrators on top of Mesos. When you scroll down on our GitHub page, we have a "Powered by Marathon" section, and I put some of the logos from that section on this slide. I'm really proud of all our users, and I wanted to put more of the users I met yesterday during the town hall discussions on the slide, but I couldn't find you in the list. So if you're not in there already, make sure to open a PR to add your company or your team to our users section; I'm really happy to merge those PRs. We do have some users here: Yelp obviously uses Marathon, and Yelp is a really good contributor as well. PayPal, eBay, and Verizon also use Marathon to run their production workloads. And Marathon is one of the fundamental parts of DC/OS. Are there DC/OS users here? Just a few... oh, more hands are going up. Okay, great. So every time you use the Services section in DC/OS,
you're also using Marathon underneath, just with a slightly fancier UI.

The first part of this talk covers past achievements and past releases. Last year in April, in alignment with the DC/OS open source announcement, we released Marathon 1.1, which included features for local and external storage as well as readiness checks; we'll talk about the highlighted features right after this overview. In Marathon 1.3 we introduced the Universal Container Runtime. What we did there was extend the Mesos containerizer, which until then ran shell commands through the command executor, to be more open, so that it can run, for example, Docker images natively on top of Mesos. That was quite nice and enabled further development, for example task groups later on. In 1.3 we also enabled GPU support for the Universal Container Runtime, added CNI (Container Network Interface) support, and enhanced the partition awareness of Marathon and Mesos. Mesos removed TASK_LOST and replaced it with more meaningful task statuses such as TASK_UNREACHABLE or TASK_GONE, and we adopted those statuses in Marathon as well. Now, if you have a network partition and can't reach your agents, you can define in a fine-grained way how to handle those situations. In the past, whenever a task became unreachable or lost, Marathon would replace it instantly; now you can define how long Marathon should wait for those tasks. Okay, that was 1.3. Then 1.4, at the beginning of this year, introduced pods. For the work on pods, or task groups, we borrowed the concept and the name from Kubernetes, and it basically meant a rewrite of more than 50% of the Marathon code base to go from app definitions to task groups. This was quite a big change for us, and now we are able to run containers side by side, sharing some resources. We will dive into this a little later. We also introduced Mesos health checks.
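By the way, the partition-handling behavior mentioned a moment ago is configured per application through Marathon's unreachableStrategy field. A minimal sketch of what that might look like in an app definition (the command and the timeout values here are purely illustrative):

```json
{
  "id": "/my-service",
  "cmd": "./run-service",
  "cpus": 1,
  "mem": 512,
  "instances": 3,
  "unreachableStrategy": {
    "inactiveAfterSeconds": 300,
    "expungeAfterSeconds": 600
  }
}
```

Roughly: after a task has been unreachable for inactiveAfterSeconds, Marathon may start a replacement; after expungeAfterSeconds it gives up on the original task entirely. Check the Marathon docs for your version for the exact semantics.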
In the past, Marathon itself performed the health checks that determine whether your application is up, running, and healthy, and this was quite a bottleneck for big installations. Now you can have those checks executed directly on the Mesos agent, which is quite nice and has nice scaling effects. We also introduced enhanced debugging, so you can now dive into stuck deployments and see what's going on. And the most recent release, I think four weeks ago, was Marathon 1.5. 1.5 is not that rich in features, but it is rich in improvements: we landed a bunch of quality, scalability, and performance improvements. We also added support for more ways of expressing secrets. This is only the API; you need to provide your own implementation behind it, and we'll talk about that later. But now you can use file-based secrets to make sure your secrets are not exposed as environment variables: you can expose them as files mounted into your container. And we enhanced the networking API. In 1.3, when we introduced CNI support, one container could join exactly one CNI network; in 1.5 we enhanced this a little, so that containers can join multiple CNI networks, or multiple overlay networks.
So it's a bit more flexible now. That was a rough overview of what we did in the past one and a half years; let's dive into some of the features on this slide. We introduced local persistent volumes and external storage. Storage has been a first-class citizen in Mesos for quite a while now. Different applications have different needs for their storage layer, and basically we see three kinds of applications. First, you have your stateless containers, where you don't care too much about the data inside. Those containers get a default sandbox, and those sandboxes are cleaned up five minutes after the container terminates. But if you run data applications, you usually do care about your data and about what happens when the container, and with it the data, is gone. So we introduced local persistent volumes, and this concept is quite nice because it's designed for distributed databases. In a distributed database you have replication in the data layer. Imagine Elasticsearch, or whatever database you want to think of: the application layer replicates data across all of your nodes. But you still want the performance of local writes; you want to be really sure your writes are fast, and not wait for a network device to commit your write requests. So we introduced local persistent volumes, which means that Marathon reserves some disk space on one particular agent, and that space gets a label: it is reserved just for your application and for your scheduler, Marathon. If your container terminates, Marathon waits to see a Mesos offer carrying this specific data volume again, so you can restart your container on exactly the same data your previous container was running on. So imagine you want to do
an upgrade of your database or whatever data application you're running. In the past, without local volumes, you had a problem: you update your Docker container, it restarts, and then it needs a full replication of your data set. That's bad, because you're shifting a lot of data around your data center. Now you can stop your application, replace it with the new version, get exactly the same data back, and avoid those full data replications. That's quite nice, and the configuration in Marathon, on the right-hand side of the slide, is quite easy. In the volumes section you define your volume, say where it should be mounted, say it should be persistent, define the size of your volume, and you're done. So you get this benefit with a very simple configuration. On the other hand, we introduced external storage. We chose REX-Ray as the first provider for external storage; if you haven't seen it, they have a booth outside, so you can talk to them. What they provide is a driver for talking to external storage, for example Amazon EBS. With the lower configuration on the slide, you just say: I have an external volume and I want to mount it here; then you configure the driver, and you're good to go. Marathon and Mesos, plus this extension, take care of mounting the external storage into your container at the configured path. This is quite helpful if you're running traditional databases, say traditional SQL databases without a cluster solution. You probably want to keep your data when a container terminates or fails, so that another container can pick up the externally stored data and continue working, although you pay for that with the downside of remote writes. But if you're locked into such a database and can't change it for whatever reason, I think this is a good opportunity and a good way to solve the problem. Moving further along this line of releases, we come to the Universal Container Runtime.
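On the slides, the two storage variants look roughly like the following app-definition fragments, shown here side by side as a JSON array. This is a sketch from memory: the commands, volume sizes, volume name, and the rexray driver option are illustrative, so check the Marathon docs for your version:

```json
[
  {
    "id": "/database-with-local-volume",
    "cmd": "./start-db --data-dir data",
    "cpus": 1,
    "mem": 1024,
    "container": {
      "type": "MESOS",
      "volumes": [
        {
          "containerPath": "data",
          "mode": "RW",
          "persistent": { "size": 1024 }
        }
      ]
    }
  },
  {
    "id": "/database-with-external-volume",
    "cmd": "./start-db --data-dir /data",
    "cpus": 1,
    "mem": 1024,
    "container": {
      "type": "MESOS",
      "volumes": [
        {
          "containerPath": "/data",
          "mode": "RW",
          "external": {
            "name": "my-ebs-volume",
            "provider": "dvdi",
            "options": { "dvdi/driver": "rexray" }
          }
        }
      ]
    }
  }
]
```

In the first fragment Marathon reserves local disk on one agent and pins the task to it; in the second, the volume driver attaches the named external volume wherever the task lands.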
The Universal Container Runtime is basically an enhancement of the Mesos containerizer. This thing that starts tasks, that starts containers, we call a containerizer. What we had from the very beginning was the Mesos containerizer: starting shell commands, wrapping a container around them on the fly, and doing resource isolation. Then Docker came up, so we adapted: the Mesos agent became able to start Docker containers, and we did this by simply passing the request on to the Docker daemon. But then we had the problem of a third-party dependency in the installation: you don't only have Mesos and Marathon, or DC/OS, you also have a Docker daemon. Some of our users and customers don't want that third-party dependency; they want a Mesos and Marathon installation that can run containers without a Docker daemon. And basically that's similar to what Mesos already does: we're already running tasks and containers and isolating them against each other. So I think it was quite a small step to go one step further and enhance this containerizer to also run Docker images. Using the Mesos containerizer is just a one-line change in the Marathon app definition, from type DOCKER to type MESOS. You can still specify an image, and the Mesos fetcher will download and extract it, so you get the same experience as with the Docker daemon, but it runs on the Mesos containerizer, and you benefit from all the experience of the big installations the Mesos containerizer is running in. It's quite a nice feature, and this whole containerizer lives in the Mesos world, so the Mesos community can really drive the implementation forward. They were able to support CNI networks really nicely, to utilize GPUs, to start pods, that is, to launch task groups, and now they can even launch containers hierarchically. This is really nice and enables
independent development, regardless of what Docker happens to prioritize.

Let's talk about GPUs. This was also a feature introduced in the same release as the UCR. Beginning with that release, you're able to isolate GPUs against each other. There is a new first-class property in the Marathon app definition; in this example you can specify that this container should get four GPUs. This was developed together with NVIDIA. When you utilize GPUs, you need to do quite some work around it: you can't just start your container, you need the right kernel extensions and kernel libraries, and it depends on your kernel. In a highly dynamic world like Mesos, you don't necessarily know which kernel you're running on. So Mesos did all the work around this. In the Docker world, NVIDIA has a great way of hiding this complexity: you can just run your container via nvidia-docker, and that extension takes care of all the magic. With an app definition like this one, Mesos does all the magic for you, binding all the libraries you need, and you can run your GPU-consuming tasks right away. So now you can utilize GPUs and do machine learning, or whatever else. We heard in the keynote this morning about TensorFlow and that you can run TensorFlow on top of DC/OS; there was a talk this morning by Kevin about this topic, and I think there's a MesosCon University session this afternoon. If you're interested, I put some links in the slides, so make sure to visit those talks. Okay, so now we're at the pods section. Pods are really interesting because they give us more flexibility in our architecture.
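Before we dive into pods, here is a sketch of how the two previous points show up together in an app definition: the one-line switch to the Mesos containerizer (type MESOS), plus the gpus resource. The image name and the resource numbers are just examples:

```json
{
  "id": "/gpu-training-job",
  "cmd": "python train.py",
  "cpus": 4,
  "mem": 8192,
  "gpus": 1,
  "container": {
    "type": "MESOS",
    "docker": { "image": "tensorflow/tensorflow" }
  }
}
```

Because GPU isolation lives in the Universal Container Runtime, the gpus resource goes hand in hand with the type MESOS containerizer setting.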
The basic concept of pods is that we're not starting one container, we're starting a group of containers. They can share some resources, like networking or volumes, but they don't have to. On the other side, each of them has its own specifically reserved resources, CPU, memory, and disk, so inside the pod they're isolated against each other, but they can share resources if they want to. Classic examples: a logging daemon sitting beside your main application, a Logstash grabbing all your logs and shipping them to Elasticsearch, gathering metrics, introducing some back pressure, or placing a load balancer in front. So you have much more flexibility in what you're doing when you're using task groups. This is also done in the Mesos containerizer. If you're in the DC/OS world, there's this new thing, which is currently also being ported to the Mesos CLI, where you can do a task exec: you can exec into your Mesos task via the CLI and run some bash script or bash commands there, which is really good for debugging, and that's also done via these task groups. Okay. Oops, I always try to point over there, but I think I need to go here. Okay. What was also enabled by the UCR was container network interfaces. In the Marathon app definition you're now able, in 1.5, to join multiple CNI networks, and it's really easily done by defining a networks property and the mode of your networks. In this example it's container/bridge, so you're running your container in a bridge network, but you could also use mode container together with a network name, and that lets you join an overlay network. If you're using DC/OS, we provide a solution for running overlay networks bundled inside DC/OS, but if you want to bring your own network provider, you can use the Weave overlay networks or some other distribution.
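A minimal pod definition, posted to /v2/pods rather than /v2/apps, might look like the following sketch: a main container plus a log-shipping sidecar sharing an ephemeral volume, joined to a bridge network. The names, images, sizes, and commands are all illustrative, and the field names are from memory of the Marathon pods API, so verify them against the docs:

```json
{
  "id": "/example-pod",
  "containers": [
    {
      "name": "main-app",
      "resources": { "cpus": 0.5, "mem": 256 },
      "image": { "kind": "DOCKER", "id": "nginx" },
      "volumeMounts": [ { "name": "logs", "mountPath": "/var/log/app" } ]
    },
    {
      "name": "log-shipper",
      "resources": { "cpus": 0.1, "mem": 64 },
      "exec": { "command": { "shell": "tail -F /var/log/app/app.log" } },
      "volumeMounts": [ { "name": "logs", "mountPath": "/var/log/app" } ]
    }
  ],
  "volumes": [ { "name": "logs" } ],
  "networks": [ { "mode": "container/bridge" } ]
}
```

Each container declares its own reserved resources, while the pod-level volumes and networks sections describe what the containers share.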
Everything that conforms to CNI can be used, so you can use all kinds of networks. This is quite nice and enables IP-per-container and more service-discovery options, and it's also a really small configuration in the Marathon app definition. Let's stick with this networking story and transition smoothly from readiness checks to health checks. Readiness checks were introduced as a Marathon-only concept. With readiness checks, you define checks that are performed after the container is spawned, and they keep running until the first readiness check succeeds; then the readiness checks stop, and Marathon considers the container up, running, and ready to serve traffic. Health checks, on the other side, only start after the first success of a readiness check. Readiness checks are really designed for giving a task, giving a container, time to start up: to warm your caches, to spin up your JVM, to run migrations, whatever you need to do before your application can reasonably serve requests. You can configure them similarly to health checks. And if you're using a load balancer up front, for example marathon-lb: marathon-lb respects health checks, so it will not route traffic to unhealthy applications, and because the health checks only start after the readiness checks, marathon-lb will also not route traffic to a container that is still spinning up and not yet ready. Okay, good. Another feature in this story is Mesos health checks. I think some of you have had problems in big installations when running Marathon HTTP health checks; we heard this yesterday in Tom's talk, and a little bit this morning as well. When you're running a big cluster, let's assume 500 to 2,000 nodes with lots of applications, and you configure an HTTP health check for every application, Marathon is quite busy
performing all those 6,000 health checks all the time and saturating the network. So what did Mesos do? Mesos introduced a feature called Mesos health checks, which are performed on the Mesos agents. Marathon can now specify, in the task info, that it wants health checks to be performed by the Mesos agent, and those health checks are much more distributed: if you have a thousand nodes, you have a thousand agents that can perform health checks, instead of a single Marathon instance. So it scales better. They also don't produce network traffic, because they run against localhost. You get quite some advantages from Mesos health checks, and they solve a lot of problems in some installations. But they are different. Imagine you have a network partition, and one agent is partitioned away. You may have two scenarios. In the first one, your edge router can still steer traffic to that agent, so from the user's perspective you're good to go: the user doesn't notice that this agent is missing, so it's okay that it's partitioned away, nobody notices. If you're using Marathon health checks and Marathon is not able to reach this agent, Marathon will ping it, in this example three times, and on the third failure Marathon will replace the task with another one, even though the task is actually running and able to serve traffic from the outside. If you're using Mesos health checks, they are performed by the Mesos agent: the agent pings the task, and it's up and running.
So it's totally fine: the task is considered healthy and is not restarted by Marathon, and it doesn't matter, because traffic is still flowing. But you can have other edge cases, where the application is not reachable by Marathon: maybe your Marathon installation sits near your edge router, and you want to ensure the application is reachable from the Marathon leader precisely because that's where your edge router is. In that scenario you probably want to stick to Marathon health checks, and in others you want to go with Mesos health checks. It really depends on your specific installation which kind of health check you need; you have to think about which way you want to go. But if you have a really big installation: we saw this with two thousand or more applications. This picture is taken from a blog post. At around 2,000 applications, Marathon was so busy doing health checks that it was no longer able to scale to the same number of applications as before; Marathon was busy doing health checks all the time. That's the left picture. When they moved to Mesos health checks, they were able to start way more applications, because the load produced by the health checks is distributed across the Mesos agents. So again, it really depends on your scenario. If you want the benefit of checking reachability over the network, maybe stick to Marathon health checks; if you want this outsourced to the Mesos agents and you're okay with that kind of check,
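In an app definition, the two concepts discussed here sit side by side, roughly like this sketch. The paths, ports, and intervals are illustrative, and MESOS_HTTP is the agent-executed variant of the plain HTTP health-check protocol:

```json
{
  "id": "/web-service",
  "cmd": "./start-web-service",
  "cpus": 1,
  "mem": 512,
  "instances": 3,
  "portDefinitions": [ { "port": 0, "name": "http" } ],
  "readinessChecks": [
    {
      "name": "warmup",
      "protocol": "HTTP",
      "path": "/ready",
      "portName": "http",
      "intervalSeconds": 30,
      "timeoutSeconds": 10
    }
  ],
  "healthChecks": [
    {
      "protocol": "MESOS_HTTP",
      "path": "/health",
      "portIndex": 0,
      "intervalSeconds": 60,
      "maxConsecutiveFailures": 3
    }
  ]
}
```

The readiness check runs first, until it succeeds once; only then do the recurring health checks, here executed by the Mesos agent, take over.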
You're okay going with Mesos health checks. Staying with the debugging theme: we also introduced a better way to debug deployments. This was a feature many users and customers asked for. Say they deployed an application, and by accident, or by not knowing what they were doing, they requested 20 CPUs, and no agent in the cluster carries 20 CPUs. That application can never start, and in the Marathon UI it just waits forever, starving, because no offer is ever big enough. So we introduced this debugging feature in 1.4. If you're using DC/OS, you have a UI with more or less a funnel. In the first row you see the role, and you see that 50% of the agents match the role: in this cluster I had a private and a public agent; the private one matched my requirements, and the public one didn't match my role or constraint, which is why only 50% match. In the second row, only the matching resources are considered for the displayed bar: we see 100% match the constraints and 100% match the CPUs, but no offer matches the memory. So I probably need to change the memory configuration for this app. If you want a more detailed view, there's also a table with checks and crosses, depending on whether each host fulfilled your needs in the last offer cycle or not. But if you're not using DC/OS, you can do the same things. Well, let's stick to DC/OS users for a moment: if you use the DC/OS CLI, you get basically exactly the same information. You get this table from dcos marathon debug list, and you see all the applications Marathon is currently trying to start; in this example it's a CPU-hungry task.
It was waiting, had seen six offers, and couldn't use any of them. Then you can dive in with dcos marathon debug details plus the app ID, and you see that none of the offers matched on CPU. But if you're not in the DC/OS world, you can still use all of this, and I know of users doing way more fancy things with it than we do. There's an endpoint called /v2/queue, and from that endpoint you get the same information. For each application you get a summary of the last offer cycle: if you have two agents, you see two entries there. I only grabbed the one about insufficient CPUs. There was one processed offer, because the offer from the public agent was filtered out in a previous step, and that one offer was declined, so we know there's a problem with the CPUs and we need to fix it. On the other side you see a summary of all offers this application has ever seen. In this example, the application has seen 30 offers across 30 offer cycles and declined all of them because of the CPUs. You can poll this endpoint constantly, define alerts in your monitoring tool, page your SRE when an application is not starting, whatever you want. There are really great things you can do with this endpoint, and if you're doing great things with it, ping me; I would love to hear your story. Okay, now the secrets API. Marathon by itself only defines an API for secrets. We've had the secrets API for quite a while, and it let you configure environment-based secrets. But usually you don't want secrets exposed in your environment variables; you want a secret injected into a specific file, and you want to make sure that only this container can read that file. So in Marathon we introduced an API where you still define your secrets section, your object of secrets, and for each secret you define a source, whatever
this means to your setup. On the other hand, you can define a volume and say: please mount this secret at this specific path. And then you can hook in a plugin. By default, Marathon does not ship an implementation for this API, but it's JVM code: you can easily write a plugin that consumes this information, and you get a callback directly before Marathon hands the task info to Mesos, and there you can do whatever you want. I know of users who connect to their secret store there, pull out the secret, put the information into the task info, and then pass it on to Mesos. But I heard yesterday in the town hall that some users are also trying to implement a Mesos plugin to do this on the Mesos side. So it's up to you what you do with it: we provide the API. If you're in the DC/OS Enterprise world, you can just use it; that kind of plugin is included there, along with the Mesos adaptations. So it's up to you what you want to do with these enhancements.
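Putting the pieces together, an app definition using the file-based secrets API might look like the following sketch. Remember that the source string is opaque to Marathon itself; what it resolves to is entirely up to the plugin you hook in. The names and paths here are made up:

```json
{
  "id": "/my-service",
  "cmd": "./run --password-file=secrets/db-password",
  "cpus": 1,
  "mem": 512,
  "secrets": {
    "dbPassword": { "source": "production/database/password" }
  },
  "container": {
    "type": "MESOS",
    "volumes": [
      {
        "containerPath": "secrets/db-password",
        "secret": "dbPassword"
      }
    ]
  }
}
```

The secrets object declares the secret and its source; the volume entry references it by name and mounts the resolved value as a file, instead of exposing it as an environment variable.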
Okay, great. Let's talk a little bit about quality and the things we did in the past releases. The top-left picture is from a monitor I see every morning when I go to the office. We defined a bunch of integration test suites to make sure we don't break things. I'm not sure if we still have it for 1.3, but we have it for 1.4, 1.5, and master, and we're running integration tests all the time. We had some issues in the past with flakiness: Marathon is a distributed system, we make assumptions, maybe some of them are not as stable as we'd wish, and we had some interference between runs. But I think nowadays we've made it really stable again, so all our integration tests are stable. And we're running a big soak cluster installation. Most of the time, when one of our users reports a problem with some exotic combination of app definitions, we fix the problem and introduce an integration test for it, and in some cases we also introduce a test case for the soak cluster. For example, I know some users who redeploy their entire testing environment every minute, because developers deploy constantly and everything queues up, and we simulate this behavior in our soak cluster to make sure we can handle those setups without running into regressions and without being too slow. We're also doing some scale tests nowadays; I'll talk a little later about moving to nightly builds. Okay, some of you may recognize this: it's one of our GitHub bots. Our GitHub bot first declines every PR, so don't be mad when your freshly opened PR gets declined; it's all bot. What it does: you can follow the link to Jenkins, and this bot builds every PR made against the Marathon code base, so unit tests are running and integration tests are running.
For our integration tests we really spin up a cluster of Marathon and a cluster of Mesos and do some serious testing on it. When that's finished, every PR and every commit to master is uploaded as a kind of release: everything is built and pushed to AWS S3, and with that configuration you can spin up a DC/OS cluster running exactly the Marathon version from that PR. This is quite nice, because every time someone commits to Marathon master, our CI job builds that commit, runs all the integration tests and unit tests, and then updates a PR in DC/OS with the newest Marathon version. That newest Marathon version is then picked up by the nightly integration tests of DC/OS, so we run the DC/OS integration tests against the latest commit of Marathon master. It's quite funny: a commit to Marathon triggers a CI job that triggers a commit to DC/OS that triggers a CI job that triggers integration tests. But we're running about four DC/OS clusters every night, testing all the stuff we're doing, and with this we hope to make sure we don't run into regressions. And we have this nice shiny emoji when the build is passing.

Okay, the next steps. I'll talk about some plans we currently have. I can't promise that everything is going to happen; some of it seriously will, but it's up to all of us to shape the future of Marathon. So if we drop something from Marathon 1.6 for various reasons, don't be mad at us. Just ping us and say: hey, we really want to have that, and maybe I want to contribute to it. Let's get in contact and work together. The first thing, oops, sorry, the first thing is the Container Storage Interface. The same thing we did for the Container Network Interface, we're now doing for storage: a couple of vendors are working together to define one interface for how containers should communicate with storage, and it's called CSI.
This was mentioned yesterday in the keynote; we're really pushing it forward. As soon as the first draft is marked as a release candidate, we'll start introducing it: first the work to enable Mesos to do this, and afterwards we'll introduce it in Marathon as well. Then we want to do a thing called fault domains, and cloud bursting. Currently you can model your topology using Mesos agent labels, and I know users doing this: putting a label on an agent saying, hey, you're in region us-west, you're in this availability zone, or you're running on-premise. But it's done via labels, and Mesos is planning first-class support for those regions and fault domains. I think it's a draft, or already committed: the Mesos protobuf gets a domain info for each agent, with a region info and a zone info. If you're running on-premise, you can use the region as your data center and the zone as your rack information; if you're running in the cloud, you use your provider's region and availability zone, to make sure Marathon can spread tasks over your availability zones. I would imagine it will then be possible to define constraints like: please make sure my task is scaled out evenly across all my availability zones, or evenly across all my racks. And once we have this first-class support, we can easily add the thing called cloud bursting on top. That's what Eric was talking about this morning: if you have some workload that's just temporary, and you want to burst out to some AWS instances beside your regular on-premise instances, you can do that.
You can label those agents, or rather mark them in the region or rack information as temporary burst nodes, and Marathon should then schedule only particular tasks on these special temporary agents. Then, Mesos is also planning to enhance role support: there are plans to enable multi-role and hierarchical roles, and Marathon is planning to support this too. Currently Marathon is not able to register for more than one role. When you start Marathon, you define which role you're interested in; if you do nothing, Marathon is interested in the unreserved role, the asterisk, and if you're running DC/OS, Marathon is also interested in the slave_public role. With this new feature, Marathon would be able to be interested in more roles, to really handle multiple roles or even hierarchical roles. This is quite a nice feature, and once it has fully landed in Mesos, the work on top of Marathon will start. I've already seen quite a lot of work around multi-tenancy in Marathon, around what this will mean for Marathon and what implications we'll see after introducing it. And for sure, once IPv6 support is fully in place in Mesos, Marathon will include it as well; Marathon is mostly just doing validations on top, so this is not a big deal for us. And Metronome and Chronos: this was probably the most-asked question yesterday at the town hall, so here's a little bit of the story behind it. Chronos had been there for ages. Last year in April we released Metronome, mainly because of the sheer number of users asking for a modernization of Chronos and for new features in Chronos, but with the team at that point we were not able to support Marathon development alongside Chronos development.
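To make the Metronome discussion concrete: a scheduled job in this style pairs a run specification with a cron schedule. The sketch below is based on my recollection of the Metronome jobs API, so treat the exact field names as approximate, and the command and schedule as placeholders:

```json
{
  "id": "nightly-cleanup",
  "run": {
    "cmd": "./cleanup --older-than 7d",
    "cpus": 0.1,
    "mem": 64,
    "disk": 0
  },
  "schedules": [
    {
      "id": "every-night",
      "cron": "0 2 * * *",
      "enabled": true
    }
  ]
}
```

The run section looks a lot like a stripped-down Marathon app definition, which reflects how Metronome is built, as explained next.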
So we decided to go for Metronome. Metronome uses Marathon as a library and adds just a thin layer around it to enable cron expressions. The downside is that Metronome is always behind the Marathon development, so you do not have the newest features in Metronome that you would have in Marathon. And if you want to add new features, you need to adapt the Metronome API in the same way you change the Marathon API, which is roughly double the effort. Nowadays we are planning to do something better with that.

Our vision is to really enhance our deployments, so let us talk a little bit about deployments in Marathon. Deployments in Marathon are really opinionated: a deployment is always the same shape. You have your current set of applications, you do an update, and you can configure a little bit how the rolling update will work, but it will always go from version A to version B in a rolling way. And that is okay; it works, and most users use rolling upgrades for most of their needs. If you are doing canary upgrades, this is typically done by starting a new application, somehow connecting your canary with the regular application, and once you have gained confidence in the canary, starting your rolling deployment.
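Concretely, the rolling-update knobs mentioned above live in the `upgradeStrategy` section of an app definition. `minimumHealthCapacity` and `maximumOverCapacity` are actual Marathon API fields; the app around them is invented for illustration.

```python
# The only tuning Marathon's built-in rolling deployment offers:
app = {
    "id": "/store/checkout",
    "cmd": "./checkout-service",
    "instances": 4,
    "cpus": 1.0,
    "mem": 512,
    "upgradeStrategy": {
        # keep at least half the instances healthy during the update
        "minimumHealthCapacity": 0.5,
        # allow up to 25% extra instances while old and new overlap
        "maximumOverCapacity": 0.25,
    },
}

# During the rolling update, at least this many instances stay up:
min_healthy = int(app["instances"]
                  * app["upgradeStrategy"]["minimumHealthCapacity"])
```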
So this is how it is typically done, but you always stick to rolling deployments. What we are currently planning, and my colleague Alex has been working on this for quite a while, is making this more flexible, so that the logic for how to calculate a deployment is not deeply coded inside Marathon. We want to make this pluggable, with an interface to hook into: one deployment can contain multiple steps, and each step can again contain multiple actions. An action could be "start one task of this version on this node" or "stop exactly this one", and a step could be "wait for operator interaction", that is, wait for someone to acknowledge this step manually. We are currently in the brainstorming phase, really writing this down, and as soon as it is ready to share, we are going to share it. But we do not want to make promises we cannot fulfill, so these are just things we are currently thinking about, and we are currently doing all the refactoring under the hood to make this possible. We are really working on building good infrastructure to shape these features on top of.

So we will introduce an interface which you can implement in Java, or whatever JVM language you want, to describe how your deployment should look. Maybe your deployment is a blue-green deployment, a canary deployment, or a rolling deployment; that is fine. But maybe you do something different: maybe you have a Java implementation generating a scheduled job, a one-off job. We are also planning to make all life-cycle events pluggable. Today, whenever a container terminates, Marathon restarts it. Maybe we can make it configurable: if this task terminated cleanly, it is okay.
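A sketch of what such an exit-code-aware life-cycle policy could look like. The function, its arguments, and the returned verdicts are all hypothetical; no such interface exists in Marathon today.

```python
# Hypothetical life-cycle hook: decide what to do when a task terminates,
# based on its exit code, instead of always restarting it.

def on_task_terminated(exit_code: int, retries_so_far: int,
                       max_retries: int = 3) -> str:
    if exit_code == 0:
        return "done"    # clean exit: treat it like a finished job
    if retries_so_far < max_retries:
        return "retry"   # failure: start the task again
    return "failed"      # repeated failures: give up and alert
```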
If we get exit code zero, it is fine; if we get exit code one, retry. And maybe we make the schedule a first-class citizen in this interface. We do not know yet, but we would try to merge the Metronome and Marathon development branches back together and enable scheduled jobs in Marathon itself. But again, I cannot promise that this will happen in 1.6. I can promise that we are doing everything we can to build the fundamentals, to take the intermediate step; maybe the last step is done in 1.7 or later, or maybe you want to participate and contribute to this, which is highly welcome.

Okay, I think we have a few minutes left to talk about people and community. Here is an overview of our contributions across the releases I mentioned at the beginning of this talk. In Marathon 1.1 we had nine community contributions, and no contributor had more than three contributions; in 1.3 we had 31 contributions. That is maybe not as big as other large open-source communities, but for our kind of project and our kind of complexity, this is quite nice, and we had some contributors with eight or ten contributions each. So really, thank you for contributing so much to the Marathon project. In the last release there were not so many, but I think there is a trend towards more contributions. So here are community contributions that were merged recently; there were really nice features inside, and the great thing about MesosCon is that I have met most of the people who contributed them. One contribution introduced a plug-in for scheduling decisions: you can now hook into Marathon's offer matching and inject custom code to decide whether to start a particular task on a particular offer or not. It is really nice that you can inject custom logic there.
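The offer-matching hook just mentioned is a JVM plugin in Marathon itself; purely as an illustration of the kind of decision logic one could inject there, here is a sketch that declines offers from agents in maintenance mode. All names and structures here are invented, not the actual plugin interface.

```python
# Illustrative offer-matching decision: given a resource offer, return
# whether Marathon may launch the task on it. Here we decline offers
# from agents that are currently in maintenance mode.

def accept_offer(offer: dict, maintenance_agents: set) -> bool:
    return offer["agent_id"] not in maintenance_agents

offers = [
    {"agent_id": "agent-1", "cpus": 4.0},
    {"agent_id": "agent-2", "cpus": 8.0},
]
usable = [o for o in offers if accept_offer(o, {"agent-2"})]
```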
There was another contribution to not start tasks on an agent which is currently in maintenance mode. I know a bunch of users have been requesting this for ages, and someone sat down and implemented it, because we had not had the chance to do so. That was really nice. There were also some fixes: we obviously screwed up one migration a little bit, and a user fixed it and prevented us from releasing it that way. So really, thank you for this. I know that some of you are running Marathon, or Mesos, in a patched version in production. This gives us good feedback and helps us ship a good product for you, and if something breaks in your production, please keep reporting it and telling us about your problems. We really appreciate it. Thank you very much for reporting, for fixing, and for helping us build Marathon.

I think that is it. If you have questions, just grab me or write me somewhere.

Q: You talked about the Marathon plans for deployments. What about Marathon-LB? Are there plans to integrate the routing part as well? You have these canary plans; how will traffic then be routed to the real application?

A: The plan for the new deployments is to stick to the current API we expose. We are aware that we have consumers of the Marathon event bus, Marathon-LB for example being one consumer, but we do have a bunch of consumers behind this bus, and through all these changes we are sticking to the same API.
That is, the API we expose on the event bus stays the same, so tools like Marathon-LB that route traffic based on it are not affected. I think it should stay the same: if your canary release contains three old instances and one new one, the new one should get maybe 25% of the traffic. But if you are using a more advanced load balancer, and I think some of the users here are using quite an advanced one, you can do this in a more fine-grained way: you can say, give this application version maybe only 10% of the traffic and the others the rest, and things like that. So the plan in our current design document is to keep the current API consistent, so that tools which do not adapt to the new way can still work as they did before. But we could offer more events if you want them: if Marathon-LB were interested in more events for more specific versions, we could think about a concept of registering for additional events, and then you would be able to do more sophisticated routing at the edge router.

Q: I had a question about pods. The last time I checked, there was no support for external volumes in pods. Is there any plan to add support for that?

A: I am not totally sure whether external volumes are fully supported in pods or not. I would assume that we are aware of it if it is not working, and that we can work on it, but I need to double-check whether this is on the roadmap for the next release or not; I am currently not sure where it stands.

Q: There is currently no way to specify external volumes in the pod specification itself, so there is no support.

A: Then, if there is no ticket, I would invite you to open one, saying: hey, there was this guy at MesosCon who said pods are a great concept but do not support external volumes, please implement it.

Q: Sure, okay, thank you very much. Are there any plans to implement DaemonSet-like behavior?
A: DaemonSets basically means starting a container on every agent. That is quite hard for Marathon to do, because Marathon is not aware of your cluster topology and does not hold state about Mesos agents. But for these kinds of admin or operator tasks, I think Ben mentioned yesterday in the keynote that Mesos is trying to establish a concept of reserving resources, so that you can go directly to Mesos and say: hey Mesos, I want to run these administrative tasks on those agents. Maybe that is a way to address topics like DaemonSets. I know there was a GitHub issue back in the day when we still used GitHub issues, and it was the most commented issue; I think there were 30 or 40 participants, so introducing DaemonSets was discussed really heavily. I do not think we are currently planning to include them in the Marathon code base in the next version. But with the new deployment concept, maybe one of the community members can implement a Java deployment generator which makes sure that a task is started on every agent. Once we are flexible enough that you can hook good Java code in to generate deployments and be informed about certain events in Marathon, you can really customize this, and maybe find some sophisticated way to get information about all Mesos agents and then schedule tasks to all of them, to simulate first-class features of other schedulers like DaemonSets.

That is a great example; I will note it down, because we are planning to do hackathons and things like that, so this would fit. Okay, he says he already did that, and at some point I am going to get rid of his crappy code. Well, maybe it is not that crappy, and we can turn it into nice and shiny code inside the Marathon code base. Okay, great, thank you.

Q: I have a question about health checks.

A: Yes?
Q: Now we have Marathon health checks, and we have Mesos health checks, and I want to propose a third type of health check: ingress health checks, checks which run on the side of the ingress plane. By ingress plane I mean Marathon-LB, for example, or Traefik, or something else. Why are they important? Because basically I do not care whether Marathon itself can reach my tasks; I really heavily care that the ingress can reach my tasks and serve them to customers without issues. So I think it would not be the worst idea to implement a class of ingress checks, and maybe some circuit breakers with it. Just an idea.

A: That is a really good idea. In the end, you want to make sure that your edge router can reach your tasks. Exactly: if you have an only internally accessible application, you may not be interested in this kind of thing, but if you have an exposed application, you are really interested in it, so this is really valuable. I think this kind of thing could be implemented on the Mesos side, maybe also on the Marathon side, but let us chat about this topic; it is really interesting. What most users want here is the behavior that the task is reachable from the outside, not necessarily via a Marathon health check, so it sits somewhere between Marathon and Mesos health checks. If it is not reachable by Marathon, that may be okay; if it is reachable from the Mesos agent, great; but maybe you want something in between, from the edge router to your application. It is a really good suggestion, and we should write a short document describing the idea and propose it in a ticket to drive this further. I would love to help you.

Q: Be sure I will, and I will do some examples with Traefik.

A: Great, thank you very much.
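One way the proposed ingress-side check could work, sketched as a minimal circuit breaker that an edge router such as Marathon-LB or Traefik might keep per task. The class, thresholds, and behavior are all illustrative assumptions, not an existing feature of those tools.

```python
# Illustrative ingress-side health tracking: the edge router probes the
# task directly and stops routing to it after repeated bad responses.

class CircuitBreaker:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    def record(self, http_status: int) -> None:
        if 200 <= http_status < 400:
            self.failures = 0       # healthy probe resets the counter
        else:
            self.failures += 1

    @property
    def open(self) -> bool:
        """True means: take this task out of the routing pool."""
        return self.failures >= self.threshold

breaker = CircuitBreaker()
for status in (200, 500, 502, 503):  # one good probe, then three failures
    breaker.record(status)
```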
I think we are a little bit over time, so I will make room for the next speaker. But if you have further questions, or want to discuss your proposal or other topics, grab me, I am here the whole day, or grab one of the other Mesosphere employees to talk about those topics. Thank you very much.