Okay, I'll get started. Thank you for coming to this talk. I'm not giving the talk itself; I'm merely giving the introduction: why I think you should listen to this, why I think it matters, and what I like about this initiative. The first reason is that these guys are French, they came a long way, and we need to support the home team here. The second reason is that the project is forward-looking. Due to the nature of our community, we tend to look at use cases that are immediately relevant, but these guys are looking at the next step. As we progress further into the Internet of Things, we will have more and more connected objects, and more and more processing will need to happen close to where the action happens rather than centrally. Exploring how you can push compute processing closer to where the action happens is an interesting development if you want to be ready for what comes next. The next reason I really like this project is that these guys are researchers, and they have a very scientific approach to the problem. Whenever they make an assumption, they actually prove it with experiments and check that it makes sense. It's not just them thinking it's a good idea: they try to prove it with serious experiments, for example on the Grid'5000 testbed, which has ten different sites in France, so they can check that massively distributed clouds actually work rather than just assume they will. The other thing I like about this initiative is that it's bottom-up rather than top-down. A lot of people come to developers and say, well, this is not the way you should have been doing things; you should have done things completely differently.
You should drop whatever you've done so far and rebuild things differently, and that doesn't really work well, because we make changes in an incremental manner. This team is taking a bottom-up approach: they tried to use OpenStack for their use case, they took vanilla OpenStack and ran it, and it mostly worked, though not that well. Now they are looking at the changes they can push upstream to make OpenStack better suited to their use cases, rather than coming up with a completely different project that would work only for them. I think more and more people realize that this is the only way to get things done in OpenStack, and I think they're on the right track. Finally, the last thing I like about this project is that it makes us look at the edges of the use-case spectrum. We have a community driven by contributors, and those contributors are mostly paid by organizations to work on OpenStack, which drives a certain type of feature and behavior inside OpenStack. That means the middle of the use-case spectrum is pretty well covered: we have medium-sized private clouds pretty much nailed down at this point, whereas our mission is to provide massive scale on one side and be very easy to deploy on the other. You will see in this presentation that in order for them to succeed, we need to be easy to deploy and massively scalable. So whatever we do in support of this initiative pushes us toward the edges of the use-case spectrum, where our community will not naturally go due to the market dynamics around OpenStack. So without further ado, I don't want to take more time with the intro: let me introduce you to the Discovery team. Thank you.

Thank you, Thierry. Thierry actually said a lot about our initiative. I'm going to try to dive into the details, but I really insist on one point: as researchers, we are not users of OpenStack, and we are not, right now, let's say, developers of OpenStack.
We are trying to investigate what the cloud could be in the next five or ten years. That's our mission. My name is Adrien Lebre; I am a researcher at Inria, the French governmental research institute for computer science and applied mathematics. As Thierry mentioned, I'm going to introduce the Discovery initiative. Then Jonathan will present the first revision we made to Nova in order to address our use cases, and finally Mathieu will give you some details about where we are and where we would like to go. The Discovery initiative was started two or three years ago now, and our starting point was the vision end users have of the cloud. When you are an end user, you expect to get services or resources from the cloud, on demand, by going through the network. The reality is that to get resources or cloud services, you have to go through the network to reach the data centers. You can obviously have your own private cloud, powered by OpenStack, but most public cloud services are deployed on a few data centers located at the edge of the backbone. So what is the objective of Discovery? It is quite simple: the idea is to bring the cloud back into the network. The specificity of this initiative is that, instead of trying to federate different cloud stacks, we would like to leverage OpenStack to make such an intercloud a reality. Obviously, the picture I showed just before is a kind of ultimate goal; it seems quite difficult to remove giant actors such as Amazon, Google, or Microsoft from the picture. But the main idea is to propose a strong alternative to these giant actors by providing a more distributed cloud infrastructure. So why did we start such an investigation in 2013?
Because the trend at that time was to build larger and larger data centers. I guess most of you are familiar with this picture: it is a Microsoft data center in Quincy. For those who have never seen it, just look at the size of the cars and imagine how many servers you can put inside such a DC. This picture is well known; what is less known is that if you go on Google Maps and look for Quincy, you can see not only the Microsoft data center but also the data centers of Dell, Yahoo, and so on. Why have all these data centers been deployed in that place? Because, to reduce the cost of operating such DCs, cloud providers look for attractive locations, and in Quincy the Columbia River provides cheap hydroelectric energy; that is why this place is now quite famous for data centers. If you go to the Quincy city website, you can see this nice slogan: "Agriculture meets technology." From a place that was basically a peaceful farming town, it now hosts some of the biggest DCs. While that is quite funny, if we continue along this trend we are going to face a lot of issues. The first one is reliability: just imagine a disaster in such an area, a military attack, a terrorist attack; how can you protect such a mega DC? The second one, a concern that was really important for the European continent, is jurisdiction. This was a key element when we started this initiative, because jurisdiction was the major brake on the adoption of cloud computing by European companies. Why? Because most cloud providers come from the USA, and most of the jurisdiction is based on US rules.
Three years later, the major concern is no longer jurisdiction but the network overhead of reaching such a DC. I put some pointers here if you want more details, but basically, as Thierry said, we are now moving to the IoT paradigm, and having large DCs deployed in only a few areas does not satisfy the requirements of that paradigm. So what is the proposal of the Discovery initiative? Our idea is to rely on the micro-DC concept proposed by Microsoft a few years ago: deploy micro DCs, i.e., smaller DCs, closer to the end users. The question Microsoft did not solve at the time is where to deploy such micro DCs, and the first contribution of Discovery is quite simple: wherever you have a network point of presence, it is easy to benefit from the existing facilities (the converters, the inverters, and so on) and simply extend each PoP with a few servers in charge of providing the cloud capabilities. I'm not going to dive into the details, but this is the French NREN, the French network backbone for education and research, and in each red square there is a PoP where you can put some servers. Due to time constraints, sorry, I won't elaborate, but trust me, these infrastructures are well suited to deploying micro DCs. The previous picture was the French NREN; this one is the European one, called GÉANT; and I guess some of you are familiar with this one, the NREN in the USA. Once again, in each circle you can envision deploying servers, storage capabilities, and network capabilities to provide this concept of an intercloud. The scenario can obviously be extended to the wireless backbone.
We are working with operators like Orange in France, and the idea is to deploy a micro DC at the bottom of each radio base station; that way, all IoT services can directly benefit from a local micro DC. The main challenge we have to solve is finding a way to operate such a massively distributed infrastructure. You should consider that the envisioned infrastructure will be composed of 100 up to 1,000 micro DCs spread across a territory, and we must find a way to operate all these data centers. The first solution that comes to mind when you want to operate a federation of data centers is the brokering approach. From the research point of view, while such solutions are production-ready for basic scenarios, they do not satisfy the requirements of all use cases. First, they face the API issue: each time you receive a request, you have to forward it and translate it into the right API according to the cloud stack you want to use. The second issue the brokering approach faces is that it has to re-implement, at the top level, the mechanisms end users are familiar with. Basically, if you are familiar with OpenStack, what you expect from the broker is the same functionality as OpenStack; if you are familiar with OpenNebula or CloudStack, it is the same. So everyone who tries to provide a brokering approach has to re-implement all these mechanisms at a higher level. The first decision we took was to remove the other software stacks from the picture: instead of considering OpenNebula, CloudStack, OpenStack, and so on, we decided to work only with OpenStack. This makes the problem partially easier. If you look at the state of the art on how OpenStack can federate different micro DCs, you find only top-down approaches.
What do I mean by top-down approaches? You have a substrate in charge of receiving requests and forwarding them to the different micro DCs. There are two well-known solutions: Cells (v1, and we got a presentation of Cells v2 just during the last session), and another proposal derived from the Cascading solution made by Huawei, which is called Tricircle. If you look at these two solutions, they also re-implement dedicated mechanisms in order, for example, to provide the scheduling capability you expect at a high level. The other issue this approach faces is scalability: whether you take Cells v2 or the Tricircle model, the top cell should itself be distributed in order to address the scalability as well as the reliability challenges. So what position did we take when we discovered that? We said: maybe there is another way to address this use case; instead of going through a top-down approach, we can try a bottom-up analysis. And there it is: we propose to address our use case by revising OpenStack through a bottom-up approach. By such means, we expect to be able to natively federate different instances of OpenStack. That means you deploy one OpenStack instance, then another, and these two instances are able to cooperate in a native manner thanks to P2P and self-* mechanisms. Why do we say bottom-up? Because the first step is to make the low-level OpenStack services cooperative; once we have done that, we can address the higher services, up to the as-a-Service functionality. So let's start this bottom-up approach and go to the OpenStack documentation, which is great. What you discover is that most of the OpenStack services already support horizontal scaling: most of the OpenStack services are already distributed.
They are already cooperative. The only issues are the database and the RabbitMQ bus. So let's start with this first step. The first way to distribute the SQL database is to leverage the Galera solution, which is based on an active-active replication system where each DB is synchronized each time you make a modification on one of them. There are already production deployments based on this solution, but unfortunately it does not scale well enough for our target. The other way to store the state is to leverage a key-value store, a well-known approach from the scientific point of view for distributing data over a highly and massively distributed infrastructure. So the first question we addressed with our first PhD was this one: how can we switch from a MySQL system to a key-value store system in order to store our state in a distributed manner? To answer this question, we first focused on Nova. We took a look at the Nova architecture and saw that it is composed of several sub-services, such as nova-scheduler, nova-network, and nova-compute, and that they do not directly interact with the database: all interactions are centralized in another sub-service, nova-conductor. We also looked at nova-conductor and saw that it leverages a db.api component. This component exposes an API that is used to define the interactions with the database, and at the time there was one implementation, targeting relational databases: the link between db.api and the database went through SQLAlchemy. So what we did is create a second implementation of db.api targeting key-value stores. To do so, we created Rome, which is, in short, a library that brings a relational object mapping extension to key-value stores.
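The core idea of a relational object mapping over a key-value store can be sketched as follows. This is a deliberate simplification, not Rome's actual API: model objects are serialized to JSON and stored under `table:id` keys, and an in-memory dict stands in for Redis so the example is self-contained.

```python
import json


class KVStore:
    """Stand-in for Redis: maps string keys to JSON strings."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class Model:
    """Base class mapping object attributes to a key-value record."""

    table = None  # set by each concrete model class

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    def save(self, store):
        # One record per object, keyed by "<table>:<id>".
        store.set("%s:%s" % (self.table, self.id), json.dumps(self.__dict__))

    @classmethod
    def get(cls, store, id_):
        raw = store.get("%s:%s" % (cls.table, id_))
        return cls(**json.loads(raw)) if raw else None


class Service(Model):
    table = "services"


store = KVStore()
Service(id=1, host="node-1", binary="nova-compute").save(store)
svc = Service.get(store, 1)
```

With a real Redis backend, `KVStore.set`/`get` would become `redis.Redis` calls, and joins or transactions, which the key-value store lacks, would have to be reimplemented in the mapping layer, as the talk explains below.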
The idea of Rome is to provide query objects and session objects that use the same API as those provided by SQLAlchemy, and to reuse a lot of code from the default SQLAlchemy-based implementation. As Rome is compatible with SQLAlchemy, this limits code breakage in Nova's code base. So we managed to arrive at a proof of concept with a Nova infrastructure deployed on several sites. Each site was hosting a controller and several compute nodes, so in the end we had several controllers and several compute nodes, and all these nodes were collaborating thanks to a shared AMQP bus, but also thanks to a shared key-value store hosted on several sites. I will now give some details about how we integrated Rome inside the OpenStack code, with two examples. In the first example, I would like to show you how we used Rome to create model classes. Model classes, in short, are used to store database records in an object-oriented way. Here you can see some code: this is the default model class for services with the SQLAlchemy implementation, and here is what we get when we use Rome. You can see that the attribute declarations are almost the same. Some arguments, such as the table args, have disappeared because we don't use them yet in Rome, and we added some annotations to optimize the querying with Rome. But the code is almost the same, so it enables us to reuse a lot of code from the models declared with the default SQLAlchemy implementation. Now I would like to show you a second example: how we used Rome inside the API functions. When we took a look at the default implementation, we saw that all the API functions were using a model_query function, which, in short, provides the API functions with a SQLAlchemy query.
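The "annotations to optimize the querying" mentioned above could work along these lines. This is a hypothetical sketch, not Rome's actual mechanism: a class decorator records which attributes to index, and the store maintains a secondary index so lookups by those attributes avoid a full scan of all records.

```python
def indexed(attr):
    """Class decorator: declare an attribute worth a secondary index."""
    def wrap(cls):
        cls._indexed_attrs = getattr(cls, "_indexed_attrs", []) + [attr]
        return cls
    return wrap


class Store:
    def __init__(self):
        self.records = {}  # (table, id) -> record dict
        self.index = {}    # (table, attr, value) -> set of ids

    def save(self, table, rec, indexed_attrs=()):
        self.records[(table, rec["id"])] = rec
        for a in indexed_attrs:
            # Maintain the reverse index at write time.
            self.index.setdefault((table, a, rec[a]), set()).add(rec["id"])

    def find_by(self, table, attr, value):
        # O(matches) lookup instead of scanning every key of the table.
        ids = self.index.get((table, attr, value), set())
        return [self.records[(table, i)] for i in sorted(ids)]


@indexed("host")
class Instance:
    table = "instances"


store = Store()
for i, host in enumerate(["h1", "h1", "h2"]):
    store.save("instances", {"id": i, "host": host}, Instance._indexed_attrs)
on_h1 = store.find_by("instances", "host", "h1")
```

The trade-off is the classic one: each declared index costs extra writes but turns frequent filtered queries (e.g. "all instances on this host") into direct lookups.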
So what we did is override this model_query function to provide a Rome context and a Rome query, which follows the same API as the queries provided by SQLAlchemy. I will illustrate with a second example: here I took the code for the aggregate_host_get_all function. Here we have the implementation with SQLAlchemy, and with Rome we see that it is almost the same; we just change an annotation related to session management. To validate this proof of concept, we ran some experiments on Grid'5000. Grid'5000 is a scientific testbed that enables computer science researchers to do experiment-driven research. We used it and ran many experiments, which can be divided into two categories. The first category is mono-site experiments, where the idea was to evaluate the overhead of using Rome and Redis versus the default solution based on SQLAlchemy and MySQL. The second set was dedicated to multi-site experiments, where the idea was to determine the impact of latency and also to check the compatibility of our solution with higher-level mechanisms. The experimental protocol was as follows: we deployed OpenStack infrastructures on Grid'5000, first some based on the default MySQL and SQLAlchemy solution, then some with Rome and Redis. We then asked each infrastructure to create 500 virtual machines, and we measured the time taken to create the virtual machines as well as the time taken to serve the API functions. On this slide you see the results for the mono-site experiments, where the idea was to measure the overhead of Rome. We took the time needed to serve all API functions, and what we see is that, without any optimization and without changing the model, Rome was faster than the default implementation for 80% of the requests.
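The model_query idea described above can be sketched like this. The names are illustrative, not Nova's actual internals: a small query object exposes the same `filter_by()`/`all()`/`first()` chain as a SQLAlchemy Query, but is backed by a plain scan over key-value records, so API functions written against the SQLAlchemy style keep working unchanged.

```python
class KVQuery:
    """Mimics the subset of the SQLAlchemy Query API used by db.api code."""

    def __init__(self, rows):
        self._rows = rows

    def filter_by(self, **kw):
        # Keep rows whose attributes match every keyword filter.
        rows = [r for r in self._rows
                if all(r.get(k) == v for k, v in kw.items())]
        return KVQuery(rows)

    def all(self):
        return list(self._rows)

    def first(self):
        return self._rows[0] if self._rows else None


def model_query(store, table):
    """Drop-in replacement: a query over the key-value records of a table."""
    return KVQuery([rec for (t, _id), rec in sorted(store.items())
                    if t == table])


# A tiny dataset standing in for the backend's contents.
store = {("services", 1): {"id": 1, "host": "a", "disabled": False},
         ("services", 2): {"id": 2, "host": "b", "disabled": True}}
active = model_query(store, "services").filter_by(disabled=False).all()
```

Because the query object keeps the same call chain, a db.api function such as the `aggregate_host_get_all` example mentioned in the talk only needs its `model_query` swapped, not its body rewritten.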
For the remaining 20% of requests, the default implementation was faster than ours, for two reasons. The first is that, as we store objects in Redis, we have serialization and deserialization operations to transform Python objects into a JSON format that can be stored in Redis. The second is that, since Redis has no operations such as joins or relational transactions, we had to implement them in Rome. So there is an overhead, and this is why we have this red part here. But this red part should be put in perspective with this table: despite the fact that Rome was slower for some requests, when we measured the overall time taken to create the 500 VMs, the difference was not that big. Now, regarding the multi-site experiments: we deployed several sites on Grid'5000, and to ensure the reproducibility of our results we used nodes from the same physical site but emulated geographically distributed sites by adding latency with the tc tool. The idea was to deploy several clusters on Grid'5000, each cluster containing one controller, six compute nodes, and, in the case of Redis, one dedicated database node. In the case of MySQL, the database node was located in one of the sites. We used Redis and MySQL in their default configurations, and we also tried to run some experiments with Galera, but we had trouble with reproducibility: we were not able to obtain reproducible results, so from a scientific point of view it was not satisfactory. I will not show those results in this slide, but if you are interested, we can talk about it later. So here are the results for the multi-site experiments. We deployed from two to eight clusters, and the global pattern is that when you increase the number of clusters, the time taken to create the VMs decreases.
The reason is that when you increase the number of clusters, you increase the number of controllers in your infrastructure as well as the number of compute nodes. All these nodes can work in parallel, so they share the workload, and in the end you decrease the time taken to create the VMs. What we see is that with eight clusters we reach the scalability limit of a single MySQL node: at that point MySQL becomes a bottleneck, while our solution based on Redis plus Rome behaves quite well. We also see the same pattern for both 10 ms and 50 ms inter-site latencies. My last slide is about compatibility with higher-level mechanisms: since we modified a low-level mechanism, we wanted to check that we did not break compatibility with higher-level mechanisms. We ran the same experiments as before, but we created a dedicated availability zone for each site and again asked to create 500 VMs. In the end, we saw that it did not hurt the performance; the performance was of the same order. And it enabled us to guarantee fairness in the creation of the VMs: they were equally distributed across the sites. But since we did not see an improvement from using availability zones, we think that, from a research point of view, we should perhaps study how to improve locality. Now that I have introduced this proof of concept, the question is: can we go beyond the research proof of concept and find a way to transfer it to the OpenStack community?

Okay, so can we go beyond the research proof of concept? This is the last part of the talk. To answer this question, let's first look at where we are. All our code is available on GitHub, under the BeyondTheClouds organization. It gathers all the code produced during the Discovery project; I will just mention a few items of interest here.
First, as presented by Jonathan, you can find our forks of Nova and Glance and check the additions we have made. The additions are actually well separated from the upstream code base, so our changes are clearly isolated from what is done upstream. It is based on the Mitaka release. For Glance, the work is still in progress, so we plan to integrate it soon. You can also find a development environment: if you want to hack on our solution a little, you can test it. It is quite simple, based on Vagrant and DevStack, and it comes in two modes. The standalone mode deploys a regular OpenStack, except that it uses Rome and Redis as the database backend. The collaborative mode mimics what Jonathan did in the multi-site environment, with several OpenStack instances operating collaboratively, except that it runs on your local machine using two VMs, let's say. Of course, we are open to comments and feedback; you will find a link for sending feedback at the end of the presentation. Concerning where we go: we plan a short-term milestone next October, for the next release of OpenStack. This release will be focused on conformity, which is an important aspect for us. Conformity in terms of features: what we would like to show in the next release is that we don't break any high-level functionality by replacing the database backend. We currently have someone working on this, integrating Tempest into the development workflow and testing incrementally that we don't break anything from a high-level point of view. We will be compatible with Nova and Glance, and hopefully Cinder if time allows. Performance is another point of interest for us: as Jonathan said, some requests are still slower than with MySQL, so we will dig into that and see whether we can improve them.
If you remember what Jonathan showed you, we can add some decorators to accelerate some requests; maybe we can dig a little into that. One important performance measurement we would like to address is the scalability of the solution. In the Discovery project, the goal is to target something like 100 sites with 10 servers each, each one being a micro DC. This is the scale we would like to reach, so we will also run extensive experiments to show that we can reach it. The proof of concept already allows OpenStack to be operated in a collaborative way, but we would like to go one step further, so I'd like to highlight something we will work on in the mid-term. Let's take a simple example: a user in zone 1 would like to start a VM, so it sends a boot request to the Nova API somewhere in zone 1. For the moment, if you don't care about locality, and more precisely if the bus is not aware of any locality at all, this request can actually go to the scheduler in zone 2: the scheduler in zone 2 can capture the user's request. Going one step further, if our solution does not implement locality in terms of the database, that scheduler can then actually schedule the VM in zone 3. So you have a user requesting a VM in zone 1, but the request goes to zone 2 and the VM lands in zone 3. This is the opposite of what we want with Discovery, because you get uncontrolled inter-site traffic, and in the end you will probably have a high latency between the VM and your user; this is exactly what we don't want. So we will have to address this issue in the next release, planned for 2017. That release will be focused on conformity, inherited from the previous release, and on locality. We plan to extend the NoSQL backend to take locality into account.
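The locality problem in the zone example above could be addressed with something like a zone-aware placement filter. This is an illustrative sketch under assumed names, not the actual Discovery design: prefer hosts in the requester's zone, and fall back to remote zones only when the local zone has no capacity, so the VM lands near its user whenever possible.

```python
def pick_host(request_zone, hosts, min_free_ram_mb=2048):
    """hosts: list of dicts with 'name', 'zone', 'free_ram_mb'.

    Prefer the requester's zone; fall back to remote zones only when
    no local host has enough capacity. Returns a host name or None.
    """
    candidates = [h for h in hosts if h["free_ram_mb"] >= min_free_ram_mb]
    local = [h for h in candidates if h["zone"] == request_zone]
    pool = local or candidates  # remote zones are a last resort
    if not pool:
        return None
    # Simple capacity-based tie-break within the chosen pool.
    return max(pool, key=lambda h: h["free_ram_mb"])["name"]


hosts = [{"name": "z1-n1", "zone": "zone-1", "free_ram_mb": 4096},
         {"name": "z2-n1", "zone": "zone-2", "free_ram_mb": 8192},
         {"name": "z3-n1", "zone": "zone-3", "free_ram_mb": 16384}]
chosen = pick_host("zone-1", hosts)
```

Even though zone-2 and zone-3 hosts have more free RAM, the request from zone-1 stays on a zone-1 host; the same preference would also have to apply to the bus and the database reads for inter-site traffic to be fully controlled.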
We have some ideas on how to do that; I just put two of them here. The idea is to try, as far as possible, to keep the I/O local and thus mitigate the inter-site traffic. We are also working with Orange Labs, which provides some resources to the project in terms of postdocs and PhDs. We will have a postdoc and a PhD student working on the integration of the key-value store in Neutron, so hopefully we will be able to integrate this by April 2017. Concerning the longer-term roadmap, there is something important to note: making OpenStack fit the fog computing paradigm is not only about the scalability or distribution of OpenStack itself. There is, for example, the problem of locality to deal with, the problem of high latency, and so on. In the next two slides, we list some challenges that we will have to solve; these are mainly scientific challenges, and a few are more technical. I won't go into too much detail, but our bottom-up approach follows the same methodology: we deal first with the shared services. Rome, for example, was a first study addressing the problem of storage in fog computing. We will do the same for the communication layer, and so on: going one level deeper in the stack each time, we will have to revise every core service of OpenStack and, in the end, enhance the API itself. Beyond revising the internals, there are also challenges around deploying such an infrastructure and maintaining it. The key element here is that, with the bottom-up approach, we want to make vanilla OpenStack able to support fog/edge cloud computing. We have three partners, Orange, Inria, and the support of RENATER; Orange and Inria have committed their support for at least three years by providing PhD students, postdocs, and engineers.
You can see that some of the positions are in red, which means they are already filled; they correspond to the first layer of the bottom-up approach, the work currently being done on the shared services, for example. Others are not yet filled, and we will open these positions every six months. So, if you are interested, you can go to the website and check the positions, or just come and contribute; we are very open. We are close to the end of the talk, so here are the three take-away messages. The first is that fog/edge computing is coming, so be prepared: academics and industry agree that it will be the new trend for delivering cloud computing resources, and it is actually becoming reality. Second, in the Discovery project we do not want to reinvent the wheel: we will try as much as possible to integrate our ideas into OpenStack, and the ultimate goal is to make OpenStack support massively distributed clouds. Finally, as several companies and institutions, like Orange and RENATER, have expressed their interest in the Discovery objectives, we propose the creation of a Massively Distributed Clouds working group, to federate actors of the OpenStack community around the project of building a vanilla OpenStack able to operate fog/edge clouds. The Massively Distributed working group will have its inaugural session tomorrow at the Hilton, so you can join us if you are interested in the topic. The goal of this first session will be to identify ongoing or related actions around fog/edge computing that exist in the OpenStack community right now. We will have some talks and presentations during the session; if you want the full schedule, you can go to the etherpad right here. You can add your name to the list of attendees, but you don't have to. Okay.
So, the Discovery initiative: the presentation we gave today is the result of the work of several people, researchers, engineers, and stakeholders from important companies. We would like to thank them, and thank you for listening. If you have some questions, please go to the mic. Thanks.

So, just go ahead, and I will repeat the question. The scalability of the Nova scheduler is already a challenge with the current number of compute nodes; do you worry that this is going to exacerbate that problem? And the second question was about consistency: do you have to worry about the consistency of the whole database, because it is now distributed across a lot of sites and receives requests from all of them? Do you do anything to deal with that?

So there are two questions, the first about the scalability of the scheduler and the second about consistency in the DB. I will reply to the first one, and maybe Jonathan can answer the second. Regarding the first one, you are completely right: since, thanks to the key-value store, we will be able to manage more and more Nova nodes, the scheduler will also have to manage more nodes, which increases its scalability issue. We are a research institute, and before addressing these challenges we worked a lot on distributed scheduling in clouds. We have several proposals, one of which is called the Distributed Virtual Machine Scheduler, which is also based on a ring, and we have scientific papers about it. The main choice we deliberately made in this initiative is that, instead of just doing what we call "paper work," we want to make our scientific contributions valuable for your community. What does that mean? It means that, as Mathieu presented, we already have one PhD student working on these challenges, with the objective of transferring the contributions we made from the theoretical point of view into the practical world.
So this answers the first question, I guess. Sort of, yeah. Regarding consistency: when we looked at the default implementation provided for the SQLAlchemy and MySQL backend, we saw that in some parts of the db.api component, for example in the function in charge of providing fixed IPs to instances, there was a session mechanism used to ensure that a fixed IP is assigned to one instance without conflicting with, for example, another db.api running on another node that might pick the same IP. What I understand is that these sessions provided by SQLAlchemy are a kind of wrapper on top of the transactions provided by relational databases: they work in an object-oriented way and are, in fact, an object-oriented abstraction of relational transactions. So what we did is take this session API and reimplement it to work on top of a key-value store. We use a two-phase commit technique to ensure that when one node takes a fixed IP, another node will not take the same one. This is also why, in our experimental protocol, we created 500 VMs: to detect this consistency problem, this conflict problem, and it appears that using two-phase commit was enough to prevent it. Okay, thank you. I'm afraid there is no more time to answer questions, so may I propose that we talk after this session; we will have plenty of time for questions then. Thank you very much. Once again, if you have questions or if you want to debate some position, you are welcome tomorrow at the inaugural meeting of the Massively Distributed working group. Thanks.
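The fixed-IP conflict described in that answer can be illustrated with a compare-and-set loop. This is an in-memory sketch under assumed names, not Nova's or Rome's actual code (with a real Redis one would typically use WATCH/MULTI/EXEC): a node commits its allocation only if the free-IP list has not changed since it read it, so two racing allocators can never hand out the same address.

```python
class KV:
    """In-memory store with a versioned compare-and-set primitive."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {k: 0 for k in data}

    def read(self, key):
        return self.data[key], self.version[key]

    def cas(self, key, expected_version, value):
        if self.version[key] != expected_version:
            return False  # someone else committed in between: caller retries
        self.data[key] = value
        self.version[key] += 1
        return True


def allocate_fixed_ip(kv):
    """Pop one IP from the shared free list without double allocation."""
    while True:
        free, ver = kv.read("free_ips")
        ip, rest = free[0], free[1:]
        if kv.cas("free_ips", ver, rest):  # commit only if the list is unchanged
            return ip
        # Otherwise another node won the race; re-read and try the next IP.


kv = KV({"free_ips": ["10.0.0.2", "10.0.0.3"]})
a = allocate_fixed_ip(kv)
b = allocate_fixed_ip(kv)
```

The 500-VM experiment mentioned above stresses exactly this path: every creation allocates from the shared pool, so any missing concurrency control shows up as two VMs holding the same IP.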