Hello everyone, and thank you for coming. I will start with an overview of our current SaaS solution, and then I will go into more detail about the new microservices platform that we built. An important part of this presentation will be about the lessons learned while building this new platform. I also have a demo in which I will show how we scale our microservices in order to better handle more translation requests.

OK, just a few words about me. I have a software development background, and I am passionate about people and technology. I am interested in anything related to scalability, big data and machine learning. I am currently leading the big data and machine translation group for SDL in Cluj, Romania, and I am also the co-founder of the Big Data and Data Science Meetup in Cluj.

SDL is active in the translation industry as a large-scale machine translation provider. By large-scale machine translation provider, I mean that we translate over 15 billion words every month. You are probably familiar with Google Translate; we do the same thing, but for enterprise companies.

Over the course of the last years, the large-scale machine translation providers have improved the quality of their services. They did this both by improving their machine learning algorithms and by taking advantage of technological progress, being able to process more data, store more data and compute more with less money. What made machine translation really useful in practice, on the other hand, is customization. By customization I mean the practice of training the statistical engines with data that is really close, or very similar, to the client's data. These customized engines handle terminology with, on average, 24% better quality.

On the other hand, the training process is quite expensive. It is expensive in terms of time: if training an engine from scratch five years ago took days or even weeks, now it takes six or eight hours, which is still quite a lot. From a hardware perspective, you need a lot of CPU, RAM and disk in order to train these engines. And from a data perspective, you also need a lot of clean data, both monolingual and bilingual.

So we developed a new technology to address this problem, called adaptive machine translation. In the normal machine translation flow, the engines are built by a statistical training process, and after that, if you translate the same sentence 100 times, you will get the same answer 100 times. Adaptive machine translation engines, on the other hand, are able to keep learning after they were trained. They learn in a similar way during the training process, but they are also able to incorporate feedback from the users afterwards, and they get better and better over time.
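To make this a bit more concrete from an API point of view, a rough sketch of the feedback loop could look like the interface below; the type and method names are hypothetical, just for illustration, and not our actual API.

```java
/**
 * A rough sketch of the adaptive MT feedback loop described above.
 * The interface and method names are hypothetical, not the real SDL API.
 */
public interface AdaptiveMtEngine {

    /** Returns the engine's current best translation for the source sentence. */
    String translate(String userId, String sourceSentence);

    /**
     * Feeds a post-edited translation back into the engine. The engine extracts
     * candidate rules from the pair, compares them with its existing translation
     * model, and only keeps the ones that do not lower the overall quality, so
     * the same user gets better suggestions for similar sentences over time.
     */
    void learnFromPostEdit(String userId, String sourceSentence, String postEditedTranslation);
}
```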
Let's take a concrete example. If we have a persona like a translator who sends a translation request to a machine translation engine, he will get back an MT output. He looks at this output and decides that he has a better way of translating that sentence, so he actually modifies that translation. The modified translation goes back into the adaptive machine translation engine, which extracts some rules that it will apply when the same user comes with a similar translation input. So it gets better and better over time for that specific user.

Let's also look at an actual sentence, for example an English to French one. If we want to translate "No further requirements are needed", the machine translation outputs "Pas d'autres exigences ne sont requises", and the translator decides that he has a better translation: "Aucune exigence supplémentaire n'est nécessaire." So what did the adaptive engine actually learn from this? It learned some new rules, for instance that in this context "needed" is probably better translated as "nécessaire", "further" as "supplémentaire", and "no further requirements are" as "aucune exigence supplémentaire ne". It could also pick up some bad translations, but it compares the rules it extracted from the sentences we gave it with its existing translation model, and when it decides that learning them would actually lower the quality of the translation, it does not learn them.

Before we implemented this, several experiments were done with human help, and we ended up with the conclusion that adaptive MT lowers post-editing time by at least 7% on average.

OK, so this was the first use case, adaptive machine translation. We also have another use case, which is neural machine translation. If we look back at the history of machine translation, we can practically distinguish three main ways of doing machine translation: rule-based, in which you define and build the model by hand; traditional statistical machine translation, in which you define the model by hand and learn it statistically from the data; and neural machine translation. I am really happy that the presentation before mine was about deep learning and neural networks, because the neural networks wave has reached the machine translation field as well. For some time, neural networks were too computationally costly and resource demanding to compete with state-of-the-art statistical machine translation, but the situation changed in 2015, and we now see engines trained with neural networks that we are able to put into production.

Compared to the two previous approaches, in neural machine translation you define an architecture, and the architecture takes care of discovering, defining and learning the model from the data. Neural machine translation actually uses a deep learning architecture that is capable of learning the meaning of a text, and as a consequence the translation output is much more fluent and natural sounding. Neural MT also shows significant quality improvements over past engines, because it is able to capture both local and global dependencies and to handle long-range word reordering. For example, in our case we observed an impressive 30% improvement in quality for the English to German engine, which for us, and in general, is quite a challenging language pair to train. In our previous attempts with the statistical approach, we managed to improve it by 5% or 10% after investing a lot of time in it, so 30% is quite impressive.
We already have some neural machine translation engines, actually more than 10, that we offer as part of our on-premise offering, but we want to put these engines into the cloud as well. In order to do that, we need to be able to accommodate new hardware, especially GPUs. We need a flexible infrastructure that can handle both new and old engines, both CPU and GPU based. And we need to break up the old implementation so we can extract the common part, deploy it separately and scale it separately. For sure, a new and modern API would also help us onboard clients much more easily and much faster.

OK, so these were the two new use cases, and then we thought: how can we actually accommodate them in production? We started by looking at our SaaS solution, which has been in production for more than 10 years, I would say. The current solution is a service-oriented architecture, but all the services are deployed into a single application server, so they are packed together and deployed together, and we cannot really scale them individually. We looked more closely at this SaaS solution, which was built by the same team that is now building the new platform. It is quite a mature platform, because we iterated on it over the course of five to seven years, so we reached a quite stable solution. As I mentioned in the beginning, we currently translate 15 billion words a month with it. We have high availability, and we have no P1 or P2 bugs reported, even from our external clients or from our technical support team; we practically discover all the bugs in the development or QA environments. And it is the only large-scale, commercial-grade machine translation solution apart from the ones from Google and Microsoft.

So these were the pros of this solution; now let's look a little at the cons. It has some flows that were built against requirements that are now outdated, five to seven years old, that were great at the time, but in the meantime the situation has changed. Our translation engines are not modular at all, so we cannot easily add new things to them or extract things from them. Scaling this solution up or down requires quite a lot of human intervention: somebody needs to clone some VMs, then run some scripts, then add something to the database via other scripts, and only then is the engine available to the customer. So it takes, let's say, an hour to spin up a new engine. We concluded that, overall, we have a monolithic solution that is quite hard to adapt, especially for the use cases I mentioned in the beginning, but in general for any new use case. So the idea of a new platform came into the picture, and we started to think more and more seriously that we would probably need to build something from scratch.

We identified some key concepts that we would like to have in the new platform. We decided that scalability is very important for us: we need to be able to accommodate new clients, especially large-scale clients with a lot of translations. Latency and speed are also important, because some of our clients wait in front of a monitor to see the translation coming back. And we want to have independent services.
In this phase we had not necessarily decided that these would be microservices, but we wanted independent services so that we can scale them based on usage instead of scaling everything as a monolith. We want to be elastic, so we want to autoscale up, and also, when the clients are not there for some hours, to scale the deployment down so we don't pay as much. We want to build a solution that responds well to failures. We want as few manual steps as possible; we want to do everything via scripts. And we want reliable monitoring and alerts, because this is quite important for us: we don't want to be called over the weekend or during the night unless manual intervention is really needed.

We also started to look at what other companies with similar scaling problems are doing, and we observed quite a pattern. A lot of them started from a monolith in Perl, Rails or C++, then continued with a Java or Scala implementation, and then ended up using microservices. We actually followed a similar path: in 2010 we started, with the same team, to build the Java implementation, having a Ruby on Rails implementation that we could not scale anymore, even by doubling the number of machines involved, and we could not get better numbers out of it. So we decided to follow the same path and use microservices.

After some proofs of concept and a lot of experimentation, we ended up with this technology stack. In order to solve our scaling problem, we decided to use Mesos as our cluster manager, with Marathon and Chronos on top of it. We use HBase as our NoSQL database, in which we store all the rules and everything else that needs to be stored in the platform. We use the HDFS layer from Hadoop as storage, both for HBase and for all the other content we need to keep in the platform. To address latency and fault tolerance, we use Kafka as our messaging system: all our microservices are stateless, so services communicate with each other using messages, and those messages are persisted in Kafka. We use ZooKeeper, both because it is required by Mesos and because we implemented some logic on top of it for service discovery. We use protocol buffers to serialize all the messages exchanged between microservices. On the infrastructure side, we use Ansible to automate everything we previously did manually, plus all the new things we need to do in the new platform, and our production environment is in AWS. We actually started developing the solution in our private data center and then decided it was better to host it in AWS, because the old solution was taking up all of our resources in the production facility, so we did not have room for anything new; moving to AWS was quite a good decision. For monitoring and alerts we use the Elasticsearch, Logstash and Kibana stack. We use Grafana for dashboards and metrics, and I will show that part in the demo. We collect application-specific metrics in OpenTSDB and show them in Grafana, or use them to analyze how healthy our deployment is. And we monitor everything in our platform using Zabbix.
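To give a concrete flavour of the messaging part of this stack: a service that needs a translation done publishes a protobuf-serialized message to a Kafka topic, and a stateless worker picks it up from there. A minimal sketch with the plain Kafka producer API is below; the broker address, topic name, key and payload are placeholders, and in the real platform the payload bytes come from a protobuf-generated class rather than a raw string.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TranslationRequestProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-1:9092"); // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            // In the real platform the value would be a protobuf message
            // (e.g. a generated TranslationRequest, serialized with toByteArray()),
            // which also keeps old and new service versions compatible.
            byte[] payload = "source=en, target=fr, text=...".getBytes();

            // Topic name and request-id key are illustrative only.
            producer.send(new ProducerRecord<>("translation-requests", "request-42", payload));
        }
    }
}
```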
On the microservices side, we use Docker as our containerization platform. We started with Dropwizard, then we realized it is easier to use Spring Boot as the REST application framework. And we started with Java 8 three years ago, when it was still quite new, but we had a good experience with it, so that also turned out to be a good decision. OK, so this was the technology stack that we ended up using.
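Since I mentioned Spring Boot: each of our microservices is, at its core, a small self-contained REST application. A minimal sketch is shown below; the endpoint path, parameters and response are invented for illustration and are not our real API.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class TranslationServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(TranslationServiceApplication.class, args);
    }

    // Illustrative endpoint only; the real API looks different.
    @PostMapping("/translations")
    public String translate(@RequestParam String source,
                            @RequestParam String target,
                            @RequestBody String text) {
        // A real service would hand the request over to the messaging layer
        // and return an id the client can poll; here we just acknowledge it.
        return "queued " + source + "->" + target + " translation of " + text.length() + " characters";
    }
}
```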
So now let's look at the lessons we learned over the last two or three years while building this scalable platform.

The first one is related to cost. From the beginning we knew that we wanted to build something cost-efficient, and we were quite passionate about finding ways to optimize our costs. As I mentioned, we started development in our private data center and postponed using AWS for a while, and that was not necessarily a good decision, because when we started to put things into AWS, we saw a lot of differences compared to our private data center. Our private data center is located in Denver, Colorado, and we deployed into the AWS region in Oregon. Our engines are quite IOPS-intensive, meaning they do a lot of input and output operations, and we could not get the same performance in AWS that we were getting in our private data center. This is mainly because in AWS the performance of your storage is more or less tied to how big the volumes are that you are using. In the end we had to change our implementation to make the engines less IOPS-intensive, in order to get the same numbers from the AWS deployment as from our private data center. We also had a lot of configuration differences, so we decided it is a good idea to keep a production clone in AWS: before putting a release into production, we test everything in the production clone to see whether we still have differences, especially on the config side. The production clone is quite similar to the production environment - it even has the same IPs - although, mainly for cost reasons, we do not use the same size of Mesos cluster; we try to choose the engines that are most relevant and do the tests and the validation on those. Still, currently 40% of our AWS bill is non-production cost, meaning only 60% of the cost is production: we have some small dev clusters, QA clusters and the production clone, as I was mentioning. To keep the AWS costs down, we do periodic cleanups of practically everything: snapshots, EBS volumes, EC2 instances and so on. We have some alerts in place, so when the number of running instances goes above a threshold, we receive a notification and start to look into why the number is so high. And we also try to use the latest AWS instance types. For example, we were using EC2 instances of the r3.4xlarge type, which have 16 cores and 128 GB of memory. AWS released a new generation of these instances, called r4.4xlarge, with exactly the same hardware specification - the same 16 cores and 128 GB - at approximately 25% lower cost. The only difference is that the old ones have an ephemeral SSD disk, which we did not use, so for us the switch was fine.

We also had a lot of discussion in the team about using Elastic Block Storage or Elastic File System from AWS, and we decided to use EFS for anything that is shared across all the Mesos slaves and EBS for the things that are more local. And, for sure, we reserve our instances, especially for our production cluster, which brings the cost down by approximately 30%.

The second lesson learned is about security; in this new platform we had to look at security from a different angle. The access to our production cluster in AWS is via a single SSH bastion host, and that host also has filters on the IPs that are allowed to access it. We also use GPG encryption so that our passwords are not stored in clear text in git, and to restrict access to specific environments. We had a situation, especially at the beginning, when somebody was trying to run some sensitive commands against the production clone but ended up actually redeploying some services into the production cluster. After that, we decided who would mainly do our deployments and restricted the list of people who can run sensitive commands against production, in order to prevent this type of error. Apart from this, in all our other clusters everybody has the same access and can do whatever they want; the only restrictions are in the production environments. We also enabled AWS termination protection - it is just a click - but we only ended up enabling it after somebody from the team, not the same person as before, terminated a Mesos slave by mistake. That was not a big issue, but if they had terminated the whole cluster, it would have been.

On the high availability side, at the infrastructure level we usually allocate one extra node, roughly 10% more than the capacity that is actually needed, in order to stay highly available even if some human errors happen, as I mentioned before. We also had cases when our EC2 instances were decommissioned and we had to rebuild them; it did not happen frequently, but we had a few situations of this type. On the microservices side, we allocate at least two instances per type of microservice. We do this for high availability purposes, even if in some cases it would not be justified by the traffic on that microservice. And we also set constraints so that those two instances are not deployed on the same Mesos agent, so that if the agent goes down we do not lose the microservice. We also run tests in our QA performance environments with 5x or 10x more traffic than we expect or have in production, in order to reach the limits of our platform and to know, if things break, which components will break first.

Also from the previous solution that we built, we understood that monitoring is really important, so we invested quite a lot of time in monitoring this platform and these microservices from all angles. We use Zabbix, as I mentioned, for the infrastructure. We collect application-specific metrics, and we have a screen in our office showing dashboards about the health of the deployment.
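The application-specific metrics behind those dashboards are pushed into OpenTSDB through its HTTP API. A stripped-down sketch of such a reporter is below; the host, metric name and tags are made up for the example, and a real reporter would batch data points instead of sending them one at a time.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MetricsReporter {

    // Pushes a single application metric to OpenTSDB via its HTTP /api/put endpoint.
    // Host, metric name and tags below are illustrative, not the real ones.
    public static void report(double translatedWordsPerSecond) throws Exception {
        String json = "{"
                + "\"metric\":\"mt.translated.words.rate\","
                + "\"timestamp\":" + (System.currentTimeMillis() / 1000) + ","
                + "\"value\":" + translatedWordsPerSecond + ","
                + "\"tags\":{\"service\":\"translation-engine\",\"langpair\":\"en-nl\"}"
                + "}";

        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://opentsdb:4242/api/put").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        // OpenTSDB answers 204 when it accepted the data point.
        System.out.println("OpenTSDB responded with HTTP " + conn.getResponseCode());
    }
}
```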
We also collect user statistics about everything that happens in the platform - all the activities that users perform - using the Elasticsearch, Logstash and Kibana stack, as I mentioned. And we do external monitoring using Pingdom, with PagerDuty to call us in case something is really wrong.

On the resource allocation side, for example on memory limits, we initially started, quite at the beginning, by setting only Marathon restrictions on the container: we had a container and we said it needs one gigabyte of memory. After some time we were seeing some containers being killed. When we investigated the issue, we saw that those containers were actually seeing the memory of the whole Mesos agent, so 128 gigabytes instead of one. I know it is quite a dumb thing, but we only managed to fix it after actually seeing it, fortunately in our QA environments rather than in production. So we set limits of the same one gigabyte both on the Marathon side and on the Docker side. But occasionally we were still seeing some microservices dying, and when we investigated, we saw that the JVM was trying to use the whole one gigabyte of memory and could not garbage collect fast enough when it was getting close to the limit, so Marathon was simply killing those containers. So we also set a constraint on the JVM side, this time about 10% lower, a limit of around 900 megabytes, with one gigabyte on the Docker side and one gigabyte on the Marathon side. That actually solved our problem. While investigating this, we realized that we were producing crash dumps but not saving them outside the container, so we were losing them. We therefore mounted a partition on each container that remains available even after the container dies, so we can take the core dump and see exactly what happened.

Related to CPU allocation: as you probably know, the CPUs in Mesos are actually CPU weights. Initially, if we needed one core, we allocated a CPU weight of one. We concluded that this is not the best idea, because we could not use our clusters very efficiently and we could not overprovision. So we lowered the CPU weights from one to a fraction of that, and now we are able to use our cluster even better.

OK, so the next lesson is related to releases. When we started this platform, we were thinking: we are building something scalable, and except maybe for the load balancer we don't have any single point of failure, so releases with no downtime will be quite easy to do. That was not actually the case, because there are so many components involved in a deployment; as I mentioned, we currently have more than 35 types of microservices, and a lot of things can go wrong. So we invested quite some time in the production clone, in order to simulate our deployments two or three times before actually releasing them into production, and to catch all the differences between our QA environment and the production environment. We also wrote some scripts to monitor the downtime, because if you want to have releases with no downtime, first you need to be able to measure that there is none.
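The measuring part itself can stay simple. The idea behind such a script is sketched below: poll a health endpoint once a second for the duration of the deployment and count every second without a healthy answer. The URL is a placeholder, and this is only an illustration of the idea, not our actual scripts.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class DowntimeProbe {

    public static void main(String[] args) throws Exception {
        int failedSeconds = 0;
        for (int i = 0; i < 600; i++) {   // observe for roughly 10 minutes
            try {
                HttpURLConnection conn = (HttpURLConnection)
                        new URL("https://mt-platform.example.com/health").openConnection();
                conn.setConnectTimeout(1000);
                conn.setReadTimeout(1000);
                if (conn.getResponseCode() != 200) {
                    failedSeconds++;       // answered, but not healthy
                }
            } catch (Exception e) {
                failedSeconds++;           // no answer at all within the timeout
            }
            Thread.sleep(1000);
        }
        System.out.println("Seconds without a healthy answer: " + failedSeconds);
    }
}
```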
We also invested some time in making the messages that circulate between microservices compatible across different versions, and on this side using protocol buffers helped us a lot. We also invested some time in automating with Ansible all the manual steps that were involved in a deployment, so now a deployment is probably one or two Ansible commands and everything happens from there. We have Ansible commands for almost all the activities we need in order to keep this cluster up, to do upgrades and to do deployments. We are also able to bring up a cluster from zero - meaning from having nothing, to bringing up the EC2 instances, deploying all the components, deploying all the microservices and validating that the cluster is healthy - and it takes around 20 to 30 minutes to do all these steps.

Investigations in this distributed platform became quite a complex issue. In the previous solution, having all the services deployed in a single application server is a big advantage for the logs, because you can at least track when a client request hits your first endpoint and all the steps it takes along the way, since everything happens in the same application server, even if you have multiple deployments. In the new platform, where a flow may hit 10, 12 or 15 microservices, it is really hard to understand all the operations a request went through from the entry point to the exit point. So we aggregate our logs into Elasticsearch, and we had to attach a request ID to each of the messages and each of the log lines circulating in the platform, to be able to investigate per request later. Initially we were both shipping the logs to Elasticsearch and writing them to stdout, because it was much easier to debug things at the Mesos level. Quite quickly our Mesos agents ran out of disk, so we had to disable the stdout logging and only ship the logs to Elasticsearch. Later, when we started to do performance tests and put load on the platform, we realized that the appender that was shipping things to Elasticsearch could not keep up with the volume coming from our clients. So we had to send the logs to Kafka first, and Kafka forwards them to Elasticsearch. We still see a little delay between the moment we see a request entering and the moment it appears in our graphs, but it is quite an acceptable delay, under a second, let's say. We also collect application-specific metrics in OpenTSDB, and usually when something goes wrong we have to correlate all these sources to understand where the problem is.

As I mentioned when we identified the key concepts, we said we wanted to have independent microservices. It is something that everybody in the team agrees with, but it was not an easy goal to achieve. As I mentioned, we have more than 30 microservices now. It was a challenge especially for the legacy code: we took some code, especially related to our engines, from the legacy solution and packed it as a microservice, and we extracted the parts that could be extracted as independent services. What surprised us a little is that even for the new services that we defined from scratch, after two or three months we realized that those services could be broken into multiple microservices all over again. It is quite a continuous refactoring process to identify things that we can break into even more microservices.
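One detail that makes debugging across this many services bearable is the request ID I mentioned. With SLF4J and Logback, such an ID is typically carried in the MDC so that every log line shipped to Elasticsearch can later be filtered per request; the sketch below shows the pattern, not our exact code.

```java
import java.util.UUID;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class RequestIdExample {

    private static final Logger log = LoggerFactory.getLogger(RequestIdExample.class);

    // Attach a request id to every log line written while handling a request,
    // so the aggregated logs in Elasticsearch can be filtered per request.
    // In the platform described above, the id also travels inside the messages
    // passed between microservices.
    public static void handle(String incomingRequestId) {
        String requestId = (incomingRequestId != null)
                ? incomingRequestId
                : UUID.randomUUID().toString();
        MDC.put("requestId", requestId);   // referenced in the log pattern, e.g. %X{requestId}
        try {
            log.info("translation request accepted");
            // ... do the work and pass requestId on to the next microservice ...
        } finally {
            MDC.remove("requestId");
        }
    }
}
```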
Probably the 30 microservices number does not sound like a big number compared to Netflix, who were having, I think, 350 or something like that, but for us, going from a monolith to 30-plus microservices is quite an achievement.

Another lesson learned is that we had to reevaluate our assumptions periodically. When we started the platform, considering also the previous experience we had with different solutions and with serving different machine translation use cases, we were thinking that we could imagine quite well what the users would do. Then we released an alpha version, and then a beta for a limited set of users, and we understood that they actually don't use our APIs in the way we recommended or expected. So we had to change some of our APIs, in some cases going through two or three versions, in order to better accommodate the way our clients use them. For the adaptive machine translation flow we now have, I think, around 2,000 to 3,000 users, which is not necessarily an impressive number compared to the previous solution, but it is still a big number. We also understood that the speed of a request is quite important, especially for the synchronous requests, because some people are waiting for the translation to appear, so it should appear in a decent time. And since in our previous solution the English to Spanish language pair was the most heavily used one, we were thinking the same would hold for adaptive machine translation; it was not the case - we had more users on English to French and English to Dutch than on English to Spanish - so we had to adjust our allocation of resources by lowering the English to Spanish resources and giving more to English-French and English-Dutch.

OK, so even if we have been building this platform for over two years now, and we have learned a lot of things and managed to fix a lot of things, we realize that we still have a lot to do. For example, the periodic upgrades of the stack - you saw that there are many things in it - are quite a demanding task; it takes quite a lot of effort to keep up with the latest versions, and we periodically invest this time while trying to keep the impact on the features we are building to a minimum. We still need to make improvements on the monitoring side; even if we are at a decent level right now, we understand that we can do better than that. And from the beginning we realized that autoscaling would be quite a nice thing to have. As I mentioned, in the previous solution we had to manually add VMs, run some scripts and add some DB scripts in order to accommodate an increase in load, even if some of these things were scripted. Right now, with a single click in Marathon, you can scale your microservices, so it is quite easy, but it still requires that somebody notices that a microservice needs to be scaled because the load on that specific engine is now bigger, and that somebody gives that click. We also saw yesterday, in the Netflix presentation, a nice slide about how they do autoscaling. And we have clients in the old solution that come and translate for six hours at huge traffic, then don't come back for the rest of the day and only return the next day. In this type of use case, autoscaling is really worth investing time in, because it can lower the cost quite significantly.
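That click is really just a call to Marathon's REST API, so an autoscaler would boil down to picking a target instance count from the metrics we already collect and issuing a request like the sketch below; the Marathon host and the application id are placeholders.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MarathonScaler {

    // Scales a Marathon application to the requested number of instances using
    // Marathon's REST API (PUT /v2/apps/{appId} with an "instances" field).
    public static void scale(String appId, int instances) throws Exception {
        URL url = new URL("http://marathon.example.com:8080/v2/apps" + appId);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        String body = "{\"instances\": " + instances + "}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Marathon responded with HTTP " + conn.getResponseCode());
    }

    public static void main(String[] args) throws Exception {
        // Placeholder app id: e.g. scale the English-to-Dutch engine to ten instances.
        scale("/translation/engine-en-nl", 10);
    }
}
```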
Maybe it is also worth mentioning that the size of our Mesos cluster is between 10 and 20 nodes, which is not very big; in the old solution, for example, we have more than 3,000 VMs in the production environment. The Mesos cluster is this size mainly because it hosts the new features. As I mentioned, adaptive MT takes quite some time until people actually understand the idea, try it, see that there are benefits and adopt it, so it is a process that will take some time. On the other hand, we want to migrate even more clients from the old solution to the new one, and then we will actually see a lot more traffic. We also still have a lot of components that are not in Mesos; I am referring here especially to HBase, HDFS, Kafka and ZooKeeper. Currently all of them are managed by Ambari, but we would like to migrate them into Mesos, maybe actually into DC/OS, in order to use the cluster even more efficiently. For example, HBase is quite CPU-intensive for us - the machines we allocated to it run at 80-90% CPU - and in the Mesos cluster we have some CPUs available, so it would be a good fit to share the same resources. But this migration requires quite some time, so for now we have postponed it a little. And there is also the Elasticsearch part, which is still kept outside of Mesos.

OK, so let's now see a demo. I actually recorded this part in order not to have surprises; the recording has no sound, so I will try to explain as it plays. We start the demo from the Marathon UI. We have here an English to Dutch engine that currently has one instance, so one microservice instance that is healthy. Then we see that this is our QA Mesos cluster: we have nine slaves and a lot of microservices deployed. We will use Grafana to look at some numbers on how things are actually going in the platform, especially the number of translation requests handled, the number of translated words, and the cluster CPU utilization. I will use JMeter to create some load on the platform, and I already have a predefined script with a synchronous translation flow: it submits one translation, waits for the translation to finish, and then retrieves the actual translated content.

I start the script, and we see that translations are already happening. Going back to the Grafana UI, we see that the number of translation requests increased; actually, the first thing we see is that the cluster CPU utilization increased from around one or two percent to ten or eleven percent, let's say. We also see that the number of translation requests stabilizes around 120 requests per second. I came back after five minutes to let the numbers stabilize, and we see that the cluster utilization is around ten percent and the request rate is around 120 requests per second.

We go back into Marathon and scale the microservice to two instances. We see that it enters deployment quite quickly, and in a few seconds we also see the cluster CPU utilization increase by approximately another ten percent; gradually the number of translation requests and translated words increases as well. We come back again after three minutes, so that the numbers reach a stable point, and we see that they increased from 120 to approximately 240 translations per second.

We now go back into Marathon and scale the microservice to ten instances. We see that they enter the deployment state and gradually become healthy; as they become healthy, they are added to the load balancer and actually start to receive traffic. We then go into Grafana and see the cluster utilization start to increase, tied to the number of instances being brought up. We come back after five minutes, again to let the numbers reach a stable point, and we see that the cluster utilization is around 91 to 94 percent, and that we reach 1,000 translation requests per second with these ten instances of the microservice. It is also quite important to notice that the number of translated words increased accordingly.

If we now put into perspective what happened over the course of the last 30 minutes: we started from zero, with no translations in the system, then we reached 120 requests per second with one instance, 240 with two instances, and approximately 1,000 with ten instances. We also went from zero translated words per second to roughly 3,000, then to 6,000, and then to approximately 24,000 words per second, and we saw the cluster CPU utilization stabilize around 91 percent. If we go back into JMeter, over the course of 25 minutes we made more than 800,000 translation requests, with very few failures - one failure or something like that. So this was the demo that I wanted to show you. Now, if you have any questions, please ask them.

Audience: As you were scaling up, I noticed that you don't have any labels in Marathon, so what are you using for load balancing?

OK, so we built a script that is actually a bridge between HAProxy and Marathon. Our endpoints, let's say our front end, receive the requests via the HAProxy layer. It is practically a custom script: when we started, three years ago, marathon-lb and all those tools were not really available, so we had to build this part by hand. It also contains a little bit of logic in which we decide how to actually route our requests, but it is nothing really fancy: a request enters HAProxy, which decides where to send it, using an algorithm that is not quite round robin, a custom implementation of round robin. Any other questions?

Audience: On your slide about future work you showed that you also want to bring HBase and so on into the Mesos world. How do you plan to do that?

It is a really good question. During these two days of presentations I have actually been asking many people how they do it, and I don't have an answer yet. I see that HDFS is already in DC/OS, I see that some people on GitHub have already done some implementations, and the guys from Portworx told me that they have something to look at on how to port HBase onto a Mesos cluster. So I have a lot of leads that I want to follow, but unfortunately I don't have a simple answer. Any other questions? If not, you can find me around here, and you can also send me messages on Twitter; you can contact me in any way and I will try to answer your questions. OK, thank you very much.