Hi everyone, thank you for coming to this talk. My name is Gene Pang, and I'll be talking about accelerating Spark workloads in a Mesos environment with Alluxio. A little bit about myself before we get started: I'm a software engineer at Alluxio, and I'm also a PMC member of the Alluxio open source project. Before I started at Alluxio, I got my PhD from UC Berkeley's AMPLab, which also produced software like Mesos, and before that I was working at Google on distributed databases and systems. You can find my Twitter and GitHub handles here.

Here's a brief overview of what I'll be talking about today. I'll start with an introduction and overview of Alluxio, then go into some use cases where people are using Alluxio, Spark, and Mesos together, describe how you can use Spark and Alluxio together, cover how this gets deployed on Mesos, and finally walk through a demo of this on DC/OS.

So first, let's talk about the big data ecosystem. Back in the day, say 10 to 15 years ago, it was actually pretty simple: there was basically one framework, Hadoop, where we had Hadoop MapReduce for computation and HDFS for storage. It was a very simple, one-to-one mapping.
However, as time went on, the ecosystem evolved, and today there are lots of different computation frameworks and lots of different storage systems that people like to use. HDFS is still here, but there are a lot of new ones, like the cloud stores: Amazon S3, Google's storage, Microsoft Azure, and also lots of storage appliances. On the computation side there's Hadoop MapReduce, but there's also Spark, which is a big one, plus Flink, Presto, you name it. So this can get pretty messy, just in terms of writing applications, managing the storage, and adding new systems or removing old ones. Also, because there's now such a wide variety of compute and storage, it's harder to get optimal I/O performance from the storage to the application.

This is where Alluxio wants to help. Alluxio is a new layer in between storage and computation: it sits below the application frameworks and above the storage systems, abstracts the storage systems away, and provides a single, unified, global namespace across all the different storage systems. Applications really only need to talk to Alluxio, using different paths, and they can access their data even though it happens to be stored in separate storage systems. In addition, Alluxio is a distributed system that is typically deployed close to the computation, and it can cache and store data closer to the application, in memory.
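To make the unified-namespace idea concrete, here is a toy sketch, my own illustration rather than Alluxio's actual code, of how a single mount table can map paths in one namespace onto different backing stores. The host name and bucket name are made up for the example:

```python
# Toy sketch of a unified namespace (NOT Alluxio's real implementation):
# a mount table maps prefixes of the logical namespace to storage URIs.

MOUNT_TABLE = {
    "/": "hdfs://namenode:9000/",   # root backed by HDFS (hypothetical address)
    "/s3a": "s3a://dcos-demo/",     # an S3 bucket mounted under /s3a
}

def resolve(path: str) -> str:
    """Translate a logical path into the underlying storage URI."""
    # Pick the longest mount point that prefixes the path.
    best = max((m for m in MOUNT_TABLE
                if path == m or path.startswith(m.rstrip("/") + "/")),
               key=len)
    rest = path[len(best):].lstrip("/")
    return MOUNT_TABLE[best] + rest

print(resolve("/LICENSE"))        # -> hdfs://namenode:9000/LICENSE
print(resolve("/s3a/sample_1g"))  # -> s3a://dcos-demo/sample_1g
```

The application only ever sees `/LICENSE` and `/s3a/sample_1g`; where the bytes actually live is the mount table's concern.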
What this enables is in-memory access to data for applications, and that is very powerful: applications only need to talk to one API and one namespace, and they can get almost local-memory-speed I/O to their data, even though the data can be spread all over the place in different storage systems. Ultimately this lets you decouple computation and storage, which means you can scale each independently, and it's a very scalable way to operate.

With Alluxio in the picture, you can power a lot of different types of analytics in a lot of different ecosystems and scenarios: we've seen big data, IoT, AI, and machine learning use cases. You can also deploy in very diverse environments: on premise, in the cloud, and across clouds. Especially with Mesos, we've seen Alluxio deployed across clouds, and across cloud and on-premise environments, simply by deploying Alluxio workers to different locations, so it's a very flexible way to operate.

To summarize what Alluxio provides: it unifies your data, giving you one single API and a single namespace for your data regardless of where the data actually lives underneath. It gives you flexibility: since you get to decouple your computation and storage, you can scale them independently and mix and match different computation and storage systems, and Alluxio makes that much easier. And lastly, since Alluxio can store data closer to the application and in memory, it can greatly improve I/O performance.
Alluxio is an open source project, and it's actually one of the fastest growing ones. Alluxio has been open source for about four years now, and this is a graph of the number of GitHub contributors over the first four years of the project; the top line is Alluxio. This graph is a little bit stale: if you go on GitHub today, I think there are over 600 contributors to the Alluxio project. It's been really exciting and fun to be a part of that community and see how different people are using Alluxio in different ways.

Next, I'll talk about some Alluxio, Spark, and Mesos use cases. This first one is from Qunar, a big travel website in China. They have an interesting use case with multiple computation frameworks: they have Spark and Flink, they want to do both streaming and batch, and there's a lot of mixing and matching between the two frameworks. They also have multiple storage systems: HDFS and Ceph. So with multiple computation frameworks and multiple storage systems, it was becoming a headache to manage all of that, and they weren't getting the performance that they wanted.
So they added Mesos and Alluxio to the picture. By doing this, they could abstract away the different storage systems that they have, get the higher-performance I/O that they were looking for, and share a lot of that data between the two computation frameworks through memory. This sped up a lot of their jobs, and in some really bad peak times some of their queries completed 300 times faster in the new environment, so they found a lot of value in Alluxio.

The next example is from Guardant Health, who do a lot of genomics and cancer research. They had a Spark-on-HDFS environment, but it wasn't scaling as well as they needed: they have a lot of data and needed to scale out more. So they moved to MinIO, a cloud-style object store, where they could scale out the data far beyond HDFS. Once they did that, though, the data was remote and slower, and they wanted the performance back from when the data was local. So they added Alluxio, along with Mesos, and ran Spark on Alluxio over MinIO. That sped up a lot of the access, and also let them keep accessing their HDFS data, so it gave them the flexibility to use different storage underneath their applications and scale out to their needs.

Next, I'll talk about how Spark and Alluxio can be used together. One class of benefits Alluxio provides is being able to share data via memory. If you just run Spark jobs on Mesos over some storage, and the jobs cache their own data, then each Spark context caches its own copy, which essentially duplicates the data you have in memory and wastes space. In this example, blocks one and three are each stored by two Spark contexts, which is somewhat unnecessary, especially if they're on the same machine. If you use Alluxio instead, that data is stored in Alluxio, and Spark doesn't have to store it internally: Alluxio manages caching and storing the data in memory for the Spark applications, which get direct access to that in-memory data without duplicating the memory usage, and with memory-speed I/O to the data.

Even if there's only one Spark context, there's a lot of benefit, since you can share that data across different invocations of that context. If a Spark context crashes or has to be restarted, all of its cached data is lost and has to be re-read, and if you re-read it from something slow, it takes a long time to restart that Spark job. If you store the data in Alluxio instead, then even if the Spark application crashes or restarts, the data is still resident in the memory of the Alluxio worker, so the restarted application can read the data directly from memory, which greatly improves I/O performance.

Here is a high-level overview of the Alluxio architecture. Alluxio has three major components: the Alluxio client, the Alluxio master, and the Alluxio workers. The client lives in the application, and it is what the application uses to communicate with Alluxio.
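Before going into the master and workers, the cache-sharing benefit described a moment ago can be sketched with a toy model. This is a deliberate simplification of my own, not Alluxio's real block store: two compute "contexts" read through one shared in-memory store, so each block is held once and fetched from the slow under-storage once, and the cached copy outlives any one client:

```python
# Toy model (NOT Alluxio's implementation): an external in-memory block
# store shared by multiple compute "contexts". Each context reads through
# the store, so blocks are held once rather than once per context.

class BlockStore:
    def __init__(self, backing):
        self.backing = backing      # simulates the slow under-storage
        self.memory = {}            # simulates the worker's memory
        self.backing_reads = 0      # counts trips to the slow store

    def read(self, block_id):
        if block_id not in self.memory:              # cold: load from under-storage
            self.backing_reads += 1
            self.memory[block_id] = self.backing[block_id]
        return self.memory[block_id]                 # warm: served from memory

store = BlockStore(backing={"block-1": b"aaaa", "block-3": b"cccc"})

# Two independent "Spark contexts" read the same blocks through the store.
ctx1 = [store.read("block-1"), store.read("block-3")]
ctx2 = [store.read("block-1"), store.read("block-3")]

# Even if ctx1 goes away (a crash or restart), the blocks stay cached here.
print(store.backing_reads)   # 2 -- each block hit the slow store only once
print(len(store.memory))     # 2 -- one shared copy per block, not one per context
```

The same two properties, deduplicated memory and cache survival across client restarts, are what the Alluxio worker provides to the Spark contexts in the slides.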
The masters and workers, which I'll describe in more detail in a moment, do most of the interaction with the storage, which is here on the right. So applications only really need to talk to Alluxio, and the masters and workers interact with the underlying backing storage for the data and metadata.

The Alluxio client is the main way applications interact with Alluxio, and there are a few different APIs. There's the native Java Alluxio FileSystem client, which has the Alluxio-specific operations, such as pinning and unpinning, mounting and unmounting, setting TTLs, and things like that. There's the HDFS-compatible file system client, which means applications don't have to modify their code if they're already reading and writing HDFS: Alluxio looks just like an HDFS client, so it interacts with Alluxio while the application thinks it's talking to HDFS. And a new API that was added a few weeks ago is the S3 API: when applications are written against S3, Alluxio can also speak the S3 API, so applications can just be pointed at Alluxio instead. Those are the three major ways to interact with Alluxio.

Then there's the Alluxio master, which is primarily used for managing metadata. There are a few major classes of metadata: the file system namespace metadata, the metadata for all the blocks of data, and the metadata about the workers in the system. The primary master also writes to a journal, so all actions are durable, and the secondary masters essentially tail that journal to keep up to date.

The workers are the primary components for storing the actual data: they store it, serve it, and read and write it to and from storage. They can store data in different storage media, via a feature called tiered storage: Alluxio can use memory, SSDs, and hard drives, and there are eviction and promotion policies built in that you can configure to get the behavior that you want.

Next, I'll briefly talk about how Alluxio can be deployed on Mesos. Alluxio is part of the DC/OS Universe; as you can see here, Alluxio is a package in the Universe. As I mentioned before, Alluxio provides a unified view of the data you have in different storage systems and gives you higher-performance I/O to that data, and DC/OS makes deployment very easy and scalable; it's a very useful tool for dealing with the infrastructure. So together, it's really convenient to have Alluxio packaged for DC/OS: you get much faster deployments for your applications, and you deploy Alluxio in the same framework. This also means applications in the DC/OS and Mesos world can access data that lives outside the Mesos world, with Alluxio bridging that gap while continuing to provide high-performance access to that data.

So next, I'll show a short video demo of how it looks to deploy Alluxio on DC/OS, with a short Spark demo. For this demo, it's a very simple setup: we have DC/OS running on Mesos, we are going to install Spark and Alluxio together, and we
will run a few simple Spark commands to show how Spark interacts with Alluxio, and we will also interact with some data in Amazon S3. In terms of the demo setup, we're using the software versions shown here, on an Amazon EC2 m3.xlarge instance.

We have HDFS set up, and this is the UI showing the files in HDFS: you can see there's one file, called LICENSE. We also have an S3 bucket for this demo, and the listing of the bucket shows two files: a README file and a sample_1g file. These are the two storage systems we'll be interacting with today. Back in DC/OS, we also have a Docker registry that stores some of our Docker images for Spark, Alluxio, and so on.

To install Alluxio, it's in the Universe, so you can search for Alluxio and install it from there. There are two main things you have to configure for the installation. One is the license: the license has to be base64-encoded, which is what's happening right now, and the encoded string gets pasted into the configuration. The other is the UFS address, the under file system address: this is where we mount some storage system to the root of Alluxio, because Alluxio needs something backing the root of its file system. Here we're using the HDFS deployment I showed earlier as the under file system for the root of Alluxio. Once you've configured those two major points, we install it.
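The base64-encoding step in the install above can be reproduced in a couple of lines. The license content here is just a stand-in, since the real license file isn't shown in the demo:

```python
import base64

# Stand-in content; the real Alluxio license file is not shown in the demo.
license_text = b'{"name": "example-license", "key": "..."}'

# This encoded string is what gets pasted into the package configuration.
encoded = base64.b64encode(license_text).decode("ascii")
print(encoded)

# Decoding recovers the original bytes, so the installer can read it back.
assert base64.b64decode(encoded) == license_text
```

The same result can be produced on the command line with the `base64` utility, which is the usual way to prepare the value for the DC/OS config form.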
Once the installation starts, you can see a lot of the tasks start up for Alluxio running on Mesos: some of the tasks are the workers and one is the master, and those are the main components that get started.

Once it looks like Alluxio has started, we log into the master and run a few commands with the Alluxio shell. Since Alluxio provides a file-system-style namespace and interface, you can do simple file-system-like operations. So we log into the master, start up a Docker container, and take a look at some of the Alluxio configuration. The first command is a basic ls; it's a little bit cut off here, but it's basically `alluxio fs ls /`, listing the root. You can see it already lists one file: we haven't done anything with Alluxio yet, but there's already a file there, because we mounted the HDFS I showed earlier into Alluxio, and HDFS had that LICENSE file. You can also see that it says "not in memory": the file hasn't been loaded into Alluxio yet, so the metadata is in Alluxio but the data is still only in HDFS.

With the next command, we mount another storage system into the Alluxio namespace, since Alluxio can combine different storage systems in the same namespace. Towards the end of the command, we're mounting the dcos-demo S3 bucket into /s3a, so the path /s3a in Alluxio now maps to that S3 bucket. Now if we list /s3a in Alluxio, we see the two files I showed earlier: the README file and the sample_1g file that already existed in the bucket. The metadata has been pulled into Alluxio, but the contents have not, since nothing has read that data yet.

Next, we start up a Spark shell and run a few commands with Spark on top of Alluxio. We log in, start a Docker container, and start a Spark shell, which starts up the executors. We're going to run a simple command that counts the data in that sample_1g file from the S3 bucket. First, we set the log level to INFO so we can see the timing information. Then we read the sample_1g file from Alluxio: we create an RDD from the file, and the path we pass in starts with the alluxio:// scheme, then the master hostname, and then the path /s3a/sample_1g. Then we run a simple count on that RDD. At this point it reads the data in from S3, since it hasn't been read before, and if you look at the time, it took about 30 seconds to read all that data in. In the process, the data also gets saved into Alluxio space: if we do the same listing on the /s3a path in Alluxio, the sample_1g file now says "in memory", because an application read the file, which pulled it in from S3 and stored it in Alluxio. So the next time you want to read it, it will already be in memory.
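What the demo just showed, a slow first read that populates the cache and a fast repeat read served from memory, is the classic read-through cache pattern. Here is a minimal model of it; the latency is simulated with an arbitrary sleep, not the demo's real 30-second S3 read:

```python
# Minimal read-through cache model (my simplification of the demo's behavior):
# the first read pays the remote latency and populates the cache; repeat
# reads are served from the cache without touching the remote store.

import time

REMOTE_DELAY = 0.05   # pretend latency for the "S3" read (arbitrary value)

remote = {"/s3a/sample_1g": b"...data..."}   # simulated remote object store
cache = {}                                   # simulated in-memory cache

def read(path):
    if path in cache:
        return cache[path]        # warm read: memory speed
    time.sleep(REMOTE_DELAY)      # cold read: pay the remote latency
    cache[path] = remote[path]
    return cache[path]

t0 = time.perf_counter(); read("/s3a/sample_1g"); cold = time.perf_counter() - t0
t0 = time.perf_counter(); read("/s3a/sample_1g"); warm = time.perf_counter() - t0
print(f"cold={cold:.3f}s warm={warm:.6f}s")   # warm is far faster than cold
```

The ratio between the two reads is the whole story of the 30-second versus 3.5-second numbers in the demo.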
Next, we start a new, separate Spark shell and run the same thing: we set the log level, create the same RDD from the same Alluxio file, and run the count. Here you can notice a NODE_LOCAL locality level for these tasks: because the data is now loaded into Alluxio, Spark can schedule tasks local to the data, and the data is in memory in this case, which greatly speeds up the processing. If you look at the time, it now took about three and a half seconds, so roughly eight to nine times faster to read and process that data again, because it was read from Alluxio memory. That is the end of the demo.

Here are the results from this simple demo: this chart shows the duration in seconds of that RDD count. In each section, the top bar is the first time you run the count and the second bar is the second time. If you run directly on S3, the bottom two bars, the second read takes the same amount of time, because you're just accessing S3 again. With Alluxio, the second time you read the data, it's read from Alluxio memory, since it's already been cached there, so the I/O is much faster: in this example, the application is about eight times faster because the data is already in Alluxio.

In conclusion, I showed how easy it is to use Alluxio and Spark in a Mesos environment, and I gave an overview of Alluxio and how it can benefit a lot of
different scenarios. Alluxio can also provide a lot of I/O performance, because it can store data closer to the application and in memory, and I've shown that Alluxio can easily connect different storage systems into a single, unified namespace. That is the end of my talk. Thank you very much.

Q: Thanks for the presentation, it's really cool. Just a question: if I understand correctly, Alluxio is like distributed memory across multiple machines, and it reads the data from memory on different machines, right?

A: Yes, you can read memory from other machines; the cache in memory spans multiple machines.

Q: So what about cache invalidation and consistency?

A: That's a good question. Alluxio primarily works in environments where the data is immutable, and when the data is immutable, that's no longer an issue. Once you write a file in Alluxio, you'd have to delete it if you wanted to update it.

Q: A somewhat similar question: if I have a data set that's larger than any one of my servers can hold by itself, and it needs to be striped across multiple servers, how does Alluxio deal with failures? Say I have a terabyte of data and I lose a node in my cluster holding part of that data: will I pull in just that part, or will I have to pull in the entire data set again?

A: That's a good question. In the Alluxio world there's a concept of files and blocks, and Alluxio only has to re-read the lost blocks of data, not the entire file. If the file has, say, ten blocks and you lose a machine that held two of those blocks, you re-read just those two blocks from the underlying store, or, if they're also on a different machine, even from another Alluxio machine. So it's at the block level.

Q: One more question, about the Spark data locality you mentioned. I'm confused: in this case
the data is in Alluxio, and there are Spark workers running, but they might be on different machines; the Spark jobs may run on different nodes than the Alluxio nodes.

A: Yes, that's possible.

Q: Then in that case, what does locality mean?

A: If they're not on the same machine, then you can't get locality. You get locality when the Spark workers are on the same machines as the Alluxio workers.

Q: But say we have one Spark node and two Alluxio nodes, and the data in memory is sharded between those two nodes: then the Spark worker will see local data on the one node it shares, but the other half of the data is on another node, right?

A: Yes, if you only have one Spark worker, then for roughly half of those tasks the data will not be local, because it's on a different machine.

Q: We were talking before, and I've always had this one question. I read about caching RDDs in off-heap mode, which I think used to be backed by Tachyon. Why would we want to cache an RDD off-heap if we could just save it in Alluxio? Isn't that the same thing? I'm asking about Spark caching versus Alluxio caching at the memory level.

A: Spark has many different storage levels for caching, and one of them is the in-memory level, which is not Alluxio. There are a few distinctions here, but one of the major ones is that when Spark caches data itself, in its own internal memory, that cache belongs to that one Spark context or Spark job. So if you store data in one Spark context, another Spark context cannot read that data, because it's
in a different context. If you have the data in Alluxio, any Spark context can read it from memory, and actually it doesn't have to be Spark: it could be Flink or any other application. Essentially, you're pulling some of the caching duties out of Spark and into an external system.

Q: Another question: you said several underlying file systems are supported. Can you say how many? I only see GlusterFS and HDFS.

A: There's GlusterFS, anything that speaks the HDFS interface, S3, and NFS-type systems as well, so we support many file systems. Is there one that you're particularly interested in?

Q: In my case it was NFS.

A: Yes, I think we do have people using it with NFS as well.

Q: Okay, thanks.

Any more questions? Thank you.