Hello, my name is Jean-Pierre. I'm part of a French company named CloudWatch, which is interested in innovative cloud solutions. One of our focuses is Hadoop and how we can make it ready for the cloud era. Today's presentation is on these challenging topics, and on the three main questions we came across when we researched the subject. The first one: does it make sense to run Hadoop on top of virtualized infrastructure, knowing that it was designed for dedicated hardware? The second one: can we fit Hadoop into the cloud model? Can we make it available for self-service, rapid provisioning, and elasticity? And if it does fit into the model, can we optimize its usage? And if we have time, I will show you how easily we can navigate the data.

Well, the first question that came up when we researched Hadoop is whether or not it makes sense to run it on top of virtualized infrastructure. To answer this question, we need to look at three key performance indicators. The first one is storage I/O. The second one is CPU usage. And the third one is CPU waiting for I/O (iowait). Those three indicators are very important for us, because they let us identify where the bottlenecks occur. As you can see in this graph, the virtual performance is close to the physical performance if we choose the appropriate solutions.

The data displayed here are from two platforms. The first platform is the physical one, which consists of a 40-gigabit network and three nodes, each with internal storage in RAID 1 mode, two Intel Xeon processors, and 128 GB of RAM. On top of that, we have deployed the Intel distribution of Hadoop on CentOS 6.4. The second platform is a virtual one, similar to the physical one, to which we add storage on Ceph. The Ceph storage provides two tiers of storage, SATA and SSD. Each server hosts one VM, and its capacity mirrors the physical machine.

Well, for the first question, does it make sense to run Hadoop on virtualized infrastructure?
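To make the three indicators concrete, here is a minimal sketch of how they can be computed from two samples of Linux-style cumulative counters (as exposed by /proc/stat and /proc/diskstats). The sample numbers and the ten-second interval are made up for illustration, not measurements from our platforms.

```python
def cpu_percentages(before, after):
    """Each sample is a dict of cumulative jiffies: user, system, idle, iowait.
    Returns CPU usage and iowait as percentages of the elapsed interval."""
    deltas = {k: after[k] - before[k] for k in before}
    total = sum(deltas.values())
    busy = deltas["user"] + deltas["system"]
    return {
        "cpu_usage_pct": 100.0 * busy / total,
        "iowait_pct": 100.0 * deltas["iowait"] / total,
    }

def io_throughput(sectors_before, sectors_after, interval_s, sector_bytes=512):
    """Storage I/O in MB/s from cumulative sector counts."""
    return (sectors_after - sectors_before) * sector_bytes / interval_s / 1e6

# Hypothetical samples taken 10 seconds apart:
t0 = {"user": 1000, "system": 500, "idle": 8000, "iowait": 500}
t1 = {"user": 1600, "system": 700, "idle": 8400, "iowait": 1300}
print(cpu_percentages(t0, t1))          # CPU usage and iowait percentages
print(io_throughput(0, 2_000_000, 10))  # MB/s over the interval
```

A high iowait percentage with moderate CPU usage is exactly the bottleneck signature we look for when comparing the virtual platform against the physical one.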
Our answer is yes, because the performance is close to the physical infrastructure. Our next step now is to fit Hadoop into the cloud model. For this, we need to look at three points. The first one is to enable self-service, rapid provisioning, and elasticity. I will launch a demo that shows a cloudified Hadoop launched within two minutes. So I'll go to the demo, and I will explain later what I've done. For the time being, I'll launch a cluster, let's say, name it CloudWatch, and see how long it takes.

To achieve this, we use two functions: the first function manages the virtual layers, and the second function manages the Hadoop cluster. For the first function, we use OpenStack Havana and Ceph storage. Havana is deployed on three nodes in a typical manner. The controller node provides image services and volume services on top of Ceph block storage. The network layer is provided by Open vSwitch version 2.0 and Netfilter in kernel 3.10. For Nova compute, we use KVM with QEMU version 1.6. For the second function, which manages the Hadoop cluster, we rely on the development of the Mirantis guys, named OpenStack Savanna. For this demo, I won't go through all the details of how Savanna works; the Mirantis guys will present it in deep detail in their own presentation.

Well, next we will see how fast the cluster is launched. The logical architecture shown here is Hadoop on Savanna. The Savanna API is provided by a VM on the cluster. And now you can see that the cluster is launched within two minutes. Challenge achieved.

So now that Hadoop is cloudified, we can move on to the last question: how can we optimize the usage of the Hadoop cluster? Well, for this one, multitenancy and workload distribution and task management are essential to run a successful shared Hadoop cluster. As I will be short of time, I will go through the multitenancy feature, and especially resource scheduling.
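Behind the demo, launching a cluster through Savanna boils down to one REST call carrying a JSON body. The sketch below only builds that body; the field names are modeled on the early Savanna API, and the plugin, Hadoop version, and IDs are placeholder assumptions, not the exact values used in our deployment.

```python
import json

def make_cluster_request(name, plugin, hadoop_version, template_id, image_id):
    """Build the JSON body for a Savanna-style cluster-create call.
    Field names follow the early Savanna API; IDs are placeholders."""
    return {
        "cluster": {
            "name": name,
            "plugin_name": plugin,
            "hadoop_version": hadoop_version,
            "cluster_template_id": template_id,
            "default_image_id": image_id,
        }
    }

# Hypothetical values for a cluster like the one in the demo:
body = make_cluster_request("CloudWatch", "vanilla", "1.2.1",
                            "template-uuid", "image-uuid")
print(json.dumps(body, indent=2))
```

The point is that the whole provisioning step is a single declarative request: Savanna resolves the template into VMs, volumes, and a configured Hadoop cluster on its own.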
The implementation of resource scheduling is quite simple. We use the Capacity Scheduler, which provides a minimum guaranteed capacity and a burst capacity. So let's move to the demo. The demo has been running for a while. We have set up two queues in the Hadoop cluster. The first one has a capacity of 25% of the cluster capacity, and can burst to 75% when there are free resources. The second queue has 50% of capacity and 100% of burst capacity.

So let's monitor the behavior of the cluster by launching two jobs. Here is the monitoring system. Now we launch the first job. It takes a moment because the Java context is being launched. As the cluster is free, the first customer will take 75% of burst capacity. The cluster has four map slots, so 75% of four is three. Now let's launch the second job with a second customer. Well, as you can see, when the second customer needs resources, the cluster rebalances them according to the quota of each queue.

All the actions so far are done for a specific kind of user. Our objective is to provide a solution that can be used by any user. To enable this kind of feature, I will take an example from CloudWatch data and show how to integrate another solution to visualize and explore the data. The tool used here is from Pentaho. Pentaho provides three features. The first feature is visual MapReduce, which lets you develop MapReduce jobs with a graphical user interface; this screenshot shows that graphical user interface. The second is an analytics platform which collects and analyzes data. And the third is ad hoc visualization, which gives users the ability to create their own solution.

So we have a demo of the ad hoc visualization. On this graph, we can see that the dashboard has three parts, and all items in the three parts are clickable. So we can click through one, for example, and in this view we can go into deep detail of the data. And we can analyze it easily by dropping analysis axes into the menu.
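The slot arithmetic from the demo can be sketched as follows. The queue percentages match the demo setup; the rebalancing rule is a deliberately simplified model of what the Capacity Scheduler actually does (it ignores preemption timing and per-user limits).

```python
def slots_for_queue(total_slots, guaranteed_pct, burst_pct, other_queue_busy):
    """Simplified Capacity Scheduler model: a queue may burst up to its
    burst limit while the rest of the cluster is free, and falls back to
    its guaranteed share once another queue needs resources."""
    pct = guaranteed_pct if other_queue_busy else burst_pct
    return int(total_slots * pct / 100)

# Demo setup: 4 map slots; queue 1 = 25% guaranteed, 75% burst.
print(slots_for_queue(4, 25, 75, other_queue_busy=False))  # 3 slots while alone
print(slots_for_queue(4, 25, 75, other_queue_busy=True))   # 1 slot after rebalance
```

This reproduces what the monitor shows: the first job grabs three of the four map slots, then drops back to its guaranteed quota when the second customer's job arrives.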
And in near real time, the solution makes requests into the Hadoop system. It takes a little time because the cluster is in Paris. So you can see that the refresh is near real time, and you can set up any axes you want.

So let's summarize what we have done. The three main points we went through were: first, it makes sense to run Hadoop on virtualized infrastructure, because the performance is near the physical performance. Second, Hadoop does fit into the cloud model, because we can make it available for self-service, rapid provisioning, and elasticity. And the third point: we can maximize the use of a Hadoop cluster by using the multitenancy features.

So we have reached the end of the presentation. You can reach me at jampion.com. Thank you very much.