Good morning everybody, welcome to the run-up event of The Fifth Elephant big data conference. I'm Lalatendu Mohanty, and I work on GlusterFS. I'm going to talk about GlusterFS, and about using it with Hadoop. A colleague of mine helped prepare this presentation; he actually gave it in the early part of 2013, and I have taken it and modified it to add a few things.

So this is the agenda. The obvious question first: what is GlusterFS? Then use cases of GlusterFS, where you can use it. Then Hadoop and GlusterFS.

What is GlusterFS? GlusterFS is a scale-out, open-source storage solution which can scale up to petabytes of data and serve thousands of clients. It aggregates storage from servers on the network into a storage pool. So if you have multiple servers, each with storage inside, you can aggregate them and present them as a single global namespace, a single volume, which you can use for storing your data, for big data analytics, whatever you want.

GlusterFS is a file system that runs completely in userspace. It relies on an on-disk file system that supports extended attributes. Extended attributes are a feature of Linux file systems where you can associate extra properties with files. A file created on a disk file system has its default metadata; if you want to attach some more information to the file, you can do that through extended attributes.
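As a small aside, here is what extended attributes look like from the command line; this is a minimal sketch (the file name is made up, and it needs the `attr` tools plus a file system with user-xattr support, such as ext4 or XFS):

```shell
touch notes.txt

# Attach an arbitrary user.* property to the file
setfattr -n user.origin -v "fifth-elephant-demo" notes.txt

# Read one attribute back, or dump all of them
getfattr -n user.origin notes.txt
getfattr -d notes.txt
```

GlusterFS stores its own bookkeeping (for example, hash-layout ranges) in attributes like these, just under its own `trusted.*` namespace.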
So GlusterFS uses extended attributes very extensively, and that is one of the reasons why we don't need a separate metadata server. GlusterFS is a distributed file system, and if you look at most distributed storage solutions, they have a metadata server which holds information about the files. Like in HDFS, the Hadoop file system, we have a NameNode which knows, for each file, where its blocks of data are exactly present. In GlusterFS we don't need that, and one of the reasons is the use of extended attributes.

If you have any questions, please feel free to interrupt me in between; I'm glad to answer them.

So this is a picture of a pretty typical GlusterFS deployment. You can see here, this is the GlusterFS storage pool, and these are servers. It's quite flexible in terms of what you can use for your disks: you can use a JBOD, just a bunch of disks, or a SAN, whatever you want. If you use GlusterFS you can aggregate the storage from all the nodes and present it as a single volume to the clients. These are the protocols the clients can use: NFS, SMB/CIFS, HTTP, FTP, and also the GlusterFS native mount, which is a FUSE file system running in userspace. So these are the protocols on which you can access a GlusterFS volume.

GlusterFS is a scale-out solution, so it provides near-linear performance scaling. That means when you keep adding nodes, the performance should actually increase, not decrease, because you're adding more resources; that's how scale-out solutions should work.

Now let's talk about the GlusterFS architecture a little bit. It's a software-only solution.
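The access protocols above can be exercised like so, as a sketch, assuming a volume named vol1 already exists on a server called server1 (both names hypothetical, and the Samba share name too):

```shell
# Native GlusterFS mount via FUSE (needs the glusterfs client package)
mount -t glusterfs server1:/vol1 /mnt/gluster

# The same volume over NFSv3
mount -t nfs -o vers=3 server1:/vol1 /mnt/gluster-nfs

# Or over SMB/CIFS, if Samba has been set up to export the volume
mount -t cifs //server1/gluster-vol1 /mnt/gluster-smb -o guest
```

The point is that it is the same data underneath; the client just picks whichever protocol suits it.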
It can run on commodity hardware, whatever is available; you can take that hardware's storage and run GlusterFS on it. There are no external metadata servers; it scales out with elasticity and is extensible and modular. It gives unified access: you can access files through multiple protocols, like I mentioned. And it's a POSIX-compliant file system. The advantage we get from POSIX compliance is that clients can access the files on GlusterFS through a lot of protocols: NFS, SMB, FUSE, HTTP.

Let's break GlusterFS down into a couple of components, so you can really understand the concepts behind it.

The first thing is storage pools. Like I said, GlusterFS is a distributed file system, so you can have multiple nodes. You can start with a single node, but as your storage needs grow, you can keep adding nodes and the file system size actually increases. A pool of such nodes is called a storage pool, and in GlusterFS there is the concept of a trusted storage pool. Say there are a couple of nodes which are already part of the pool; if you want to add a new node, you have to do an invitation. That's called a peer probe. Without the invitation, a node cannot join the storage pool. So you do a peer probe to the new machine, and if the peer probe is successful, it becomes part of the storage pool and you can use its storage in GlusterFS.

The pool's membership information is also used for determining quorum. There are situations in a distributed file system where, because of some disaster, a couple of nodes go down. So there is a concept in GlusterFS where you can configure a quorum. Let's say I have 10 nodes, and if 5 of my nodes are down, I don't want to keep operating, because that would put the data at a higher risk of corruption. That kind of feature is there in GlusterFS: the node information we have in the storage pool is used to determine, based on the quorum parameters you set, whether the pool should stay in operational mode or not. Any questions?

What is a brick? A brick is where the data actually resides. When you create a volume, you need to mention the bricks it should use, which is where exactly the data is present, or will be present, on the different servers. A brick consists of a server name plus the partition, the on-disk file system, that you want to export for GlusterFS to aggregate. There is no limit on how many bricks a volume can use. A brick inherits the limits of the underlying file system. Like I already said, GlusterFS needs an on-disk file system, so when you are using a brick, it inherits the limits of the on-disk file system underneath it. Let's say that file system doesn't support files beyond a certain size: when you write data to GlusterFS, it finally goes down to that file system, so if you exceed the limit, the write would fail. So you need to be aware of the limitations of the underlying file system. Generally we recommend that each brick in the cluster be the same size, because files are distributed across bricks, and the distribution works out better when the bricks are equal.

Here is an example you can see: there are three storage nodes, and the first one has three bricks, this one has five bricks, and this one again has three bricks. Any questions?
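The peer-probe and quorum workflow can be sketched with the gluster CLI roughly like this; hostnames and the volume name are hypothetical, and this needs a running GlusterFS installation:

```shell
# From a node that is already in the trusted storage pool,
# invite two new nodes into the pool:
gluster peer probe server2
gluster peer probe server3

gluster peer status   # state of each peer from this node's point of view
gluster pool list     # every node in the trusted storage pool

# Server-side quorum: stop serving if too many pool members are down
gluster volume set vol1 cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%
```

With a 51% ratio, a 10-node pool keeps operating only while at least 6 nodes are up, which matches the "5 of 10 down" scenario above.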
Now let's look at GlusterFS volumes. A volume is a logical collection of bricks. You saw the bricks in the previous picture; you can take a logical collection of bricks and present it as a single namespace, a single volume. You can mount a volume on the client through the protocols I mentioned: NFS, SMB, or FUSE, the file system in userspace. The mount syntax is simple: you mention the volume name, and a server name which is one of the nodes from the storage pool.

Bricks from the same node can be part of different volumes. What you can do is, inside an exported partition, create subdirectories and use each of them as a brick for a different volume. This is an example: I have three nodes here, node one to node three, and you can see these exports; /exports are the partitions which I am going to use for GlusterFS, and inside them are different brick directories, like brick1. All the brick1 directories are part of the volume which stores music, and all the brick2 directories are part of the volume which stores videos. That can be done.

Now, the different kinds of volumes in GlusterFS. We have a couple of algorithms in GlusterFS, and depending on the volume type, we decide how to distribute the data across the nodes: where to store it and how to store it. These are the volume types: distribute, stripe, and replicate, plus the combinations of those three: distributed-replicate, striped-replicate, and distributed-stripe. Each of them has trade-offs; one has a certain advantage, another has a certain disadvantage, so you can combine them and get the best of the lot depending on your requirement.

A distributed volume is a very simple way to distribute files across all the bricks. Let's say I have two nodes, and in each node I have a single brick. These are the two nodes, server one and server two, so I have two bricks, export1 and export2. I am writing three files, so it will distribute the three files across the two nodes; here it is putting two files, and here it is putting a single file.

So the question now is: how does it decide which file to put where? It uses a hash algorithm. From the file name it computes a hash, and when you create a volume, it assigns a hash range to each of the bricks. When you are writing a file, it generates the hash on the fly, checks which range the hash falls into, and decides where the file should go, because each brick already has a hash range. Let's say, in simple terms, it divides the space into 0 to 50 and 50 to 100, and the file's hash comes out as 51. That falls between 50 and 100, so it will put the file in brick 2. That's how it decides, on the fly, where to put the file.

We don't have any replication here, so if one node goes down, the files on it are unavailable; it is effectively data loss. But the hashing mechanism is definitely useful, because we don't need a metadata server: the hash range information is stored in the extended attributes on the bricks. So when a client requests a file, it checks the extended attributes, sees where the file is, and goes directly to the file. We don't need a separate metadata server tracking which file sits where.

The hash algorithm I'm talking about is the Davies-Meyer hash algorithm. The 32-bit hash space is divided among the number of bricks you have, and it is done per directory, because when you create a directory, its layout should be present on each of the bricks. Let's say there is a directory, directory1, and you create two files in it: one file goes to one brick and the other file goes to another brick. So when you create a directory, it is created on all of the bricks, but when you create a file, the file goes to one brick according to the hash range. Any questions on this?
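The hash-range placement above can be sketched in a few lines of Python. This is only an illustration of the idea: it uses CRC32 as a stand-in for GlusterFS's Davies-Meyer hash, and the brick names are made up.

```python
import zlib

def make_layout(bricks):
    """Assign each brick a contiguous slice of the 32-bit hash space."""
    span = 2 ** 32 // len(bricks)
    return {brick: (i * span, (i + 1) * span - 1)
            for i, brick in enumerate(bricks)}

def pick_brick(layout, filename):
    """Hash the file name and pick the brick whose range contains the hash."""
    h = zlib.crc32(filename.encode()) & 0xFFFFFFFF  # stand-in for Davies-Meyer
    for brick, (low, high) in layout.items():
        if low <= h <= high:
            return brick
    return list(layout)[-1]  # any rounding remainder falls to the last brick

layout = make_layout(["server1:/exports/brick1", "server2:/exports/brick1"])
for name in ["song1.mp3", "song2.mp3", "song3.mp3"]:
    print(name, "->", pick_brick(layout, name))
```

Because the placement is a pure function of the file name and the layout, any client can compute it locally, which is exactly why no central metadata server is needed.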
There is also a volume type called a replicated volume. A replicated volume gives synchronous replication of all the files you create. Basically, when you create a replicated volume with two bricks, each brick is a mirror of the other. Whenever you write data, the data gets written to both bricks simultaneously, and if, because of some failure condition, one brick goes down, GlusterFS can serve requests from the other brick. When the failed brick comes back up, GlusterFS automatically heals it; this is called self-healing. It's similar in spirit to HDFS, where if something goes down and comes back, HDFS also re-replicates the data. Any number of replicas can be configured in GlusterFS when you create a volume. Internally, each operation in the replication functionality is a transaction, just to maintain consistency; data consistency is maintained throughout. It also maintains its own changelog of files, so it knows what to replicate and how to make the copies consistent again.

This is a picture of a replicated volume. We have two bricks here, and they are part of one replica set, so whatever file you create is going to go to both of the places. If I create a file on the mount, it goes to both.

So now we know about the distributed volume type and the replicated volume type, and you can combine them and get the best of both. Replication is definitely helpful when you need redundancy for your data in case of failures; distribution is helpful because it gives you more throughput. Combine them and you get both benefits: that's the distributed-replicated volume. Here is the picture. We have four servers, each server with a single brick in it, and we have created the volume with replica two, so there are two pairs: these two bricks are part of one replica group, and these two are part of the other replica group. When you write a file, it first decides, using the hashing algorithm, which group the file should go to; the file comes to that group and actually gets replicated onto both of the bricks in it. So when this brick is down, the client can still access the file and write to it, because one copy is still available. The replica count depends on what you give in the volume-create command; here it is two, but you can do three, four, whatever you want, for multiple redundancy levels.

We have another type of volume, the striped volume, where files actually get striped across different bricks. It's only recommended where you have files bigger than any single brick. Say you have two bricks, each of one TB, and you are writing a two TB file: obviously the two TB file cannot reside in a single brick, because it is bigger than each brick. So you can use striping to split the file into parts, with a part stored in each brick. A brick failure can result in data loss here, because you don't have a replica: if one node goes down, that's data loss. So it is always recommended to use striping together with replication, as a striped-replicated volume, so that you have a redundant copy of the data.

Addition of bricks, removal of bricks: these are all things you can do dynamically at run time, without affecting the application and without taking the volume offline. The volume stays online while you do all these operations: adding bricks, removing bricks, and rebalance. Rebalance is something we need in the case of a distributed volume. Let's say you have ten bricks holding a lot of data, and you need more storage space, so you add five new nodes. When you create new data it can land on all the nodes, but what about the existing bricks, which are already, say, 70% full? You want to evenly spread the data across all the nodes, so what you do is run a rebalance, which redistributes the data across all the nodes. The load also gets spread with better utilization: you won't hammer a single brick, but rather the whole cluster. And you can do this during run time without affecting the application.

[Audience question: what happens when a brick fails, and who manages the fix?] GlusterFS can handle a lot of it itself. Let's say you have a replicated volume and a brick goes away, because of a power failure or something else. Since you have a replica pair, the application won't be affected, because the other brick in the pair can handle the requests coming from the clients. Now say the power issue is fixed and the server comes back up. GlusterFS will automatically understand that this brick was offline for some time, go through its changelog, see what needs fixing, and automatically start healing it. That's the recovery mechanism.

[Question: is there a master node which takes care of the duplicates?] No, and that is the benefit: it's a distributed thing. There is no master node; the replica set takes care of it between the nodes. You do still have to keep an eye on it, of course: if something is not responding, you need to react and fix it anyway, because let's say you have two-way replication and one brick goes down, and then the other brick goes down too. So you have to take care of repairs, but you don't have to take the volume offline to do it. [Can we see what's happening?] Yes, you can see the logs, and there are also commands where you can see the status of any self-healing that is going on.
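The replicated and distributed-replicated setups, the online growth, and the heal monitoring discussed above look roughly like this with the gluster CLI; all server, path, and volume names are hypothetical:

```shell
# Two-way replicated volume: every file lands on both bricks
gluster volume create photos replica 2 \
  server1:/exports/brick1 server2:/exports/brick1
gluster volume start photos

# Distributed-replicated: four bricks with replica 2 give two mirrored
# pairs, and files are distributed by hash across the pairs
gluster volume create vol1 replica 2 \
  server1:/exports/brick2 server2:/exports/brick2 \
  server3:/exports/brick1 server4:/exports/brick1
gluster volume start vol1

# Grow the volume online with one more replica pair, then rebalance
gluster volume add-brick vol1 replica 2 \
  server5:/exports/brick1 server6:/exports/brick1
gluster volume rebalance vol1 start
gluster volume rebalance vol1 status

# Watch self-heal progress after a brick comes back
gluster volume heal vol1 info
```

Note how bricks are listed in replica-pair order: with replica 2, each consecutive pair of bricks on the command line forms one mirror.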
Access mechanisms for GlusterFS volumes: you can mount a GlusterFS volume through the native FUSE protocol, the file system in userspace. There is NFSv3, and NFSv4 support is coming from the NFS-Ganesha project. There is SMB. Then there is libgfapi, a GlusterFS API: a library written in C with which a program can directly talk to the GlusterFS services, so you can develop applications directly against it; it is mostly used for our integration points with different projects. There is also a REST API over HTTP, and HDFS-style access for Hadoop.

A very important part of GlusterFS's implementation is the translator. It is inspired by the GNU Hurd project, which has translators: each translator is a functional unit, and translators are stackable. So in GlusterFS, the functionality you have seen in volumes, the distribute functionality or the replicate functionality, is implemented as a translator, and it is very easy to change the translator stack to include new functionality. This is a pictorial representation of a translator stack. Translators can be in the client-side stack or on the server side. The blue part you see is the client side, and this is the server; this is the disk where the data finally gets written. A request starts at the top on the client, passes through translators that do a lot of performance tuning and things like that, then comes to the server; the server also has translators, and finally it reaches the disk, where we talk to the underlying on-disk file system. Any questions on this?

GlusterFS is already integrated with a lot of projects, like OpenStack, Samba, NFS-Ganesha, oVirt for management, QEMU, Hadoop, and others.

Finally, putting all the pieces together for GlusterFS, this is the recommended setup. You have the hardware; on top of that, RAID if available; LVM, so you can actually support thin provisioning of volumes; then an on-disk file system that supports extended attributes, such as XFS or ext4; and then you start the GlusterFS service. The green items you see are required, and you can see the colour legend here: strongly recommended, and optional. Any questions on this?

These are the use cases of GlusterFS. Unstructured data storage, like web data. Archival: long-term archival of any data. Disaster recovery: we have a feature called geo-replication, which I will cover a little later, and you can use geo-replication for disaster recovery. Virtual machine image storage. Cloud storage for service providers. Content cloud: when you have multiple devices and the data synchronizes across them, say you start something on a mobile and continue it on a tablet, the storage should be capable of supporting that. Big data. And semi-structured data: XML data, emails, JSON formats.

So now let's talk about Hadoop with GlusterFS; this is a run-up to a big data conference, after all. Hadoop is a very important part of big data, where you can do a lot of analytics on large data sets. Hadoop has several programming models, and one of them is MapReduce: you can use MapReduce on Hadoop to process large data sets and do analytics on them. We support Hadoop on GlusterFS, so you can run MapReduce jobs on GlusterFS volumes. For that we have a plugin, the glusterfs-hadoop plugin, which replaces HDFS with GlusterFS underneath Hadoop. There is a GitHub page which tells you about this plugin and how to use it. There is a README there which has all the requirements, and you just use Maven to create a JAR file of the plugin; you can look into it if you want to use GlusterFS with Hadoop.

Now the obvious question comes up, right? Hadoop already has HDFS, the Hadoop file system, so why would you use GlusterFS, or some other open source alternative, for it? Advantages. The first advantage is that HDFS is not a POSIX file system, so you cannot access it
through NFS, SMB, FUSE, or some other standard protocol, which would be pretty useful. GlusterFS, like I already talked about, supports a lot of protocols you can use to access the data on its volumes.

The other advantage is having a single storage system both for MapReduce and for storing the data. Let's say you have an app you are observing and it is producing a lot of log files you want to analyze. Normally you would copy them into HDFS so that MapReduce jobs can be run on them. But if you use something like GlusterFS, you can use GlusterFS to store those log files in the first place, and run the MapReduce jobs on the same storage. Your time is saved, and you also don't have to manage two different things, HDFS plus a separate storage system; one storage system serves both purposes. That's one of the biggest advantages, I would say.

In the HDFS architecture there are two kinds of nodes: the NameNode and the DataNodes. The DataNodes are where the tasks actually get run, and the NameNode is where the metadata is stored, so whenever Hadoop wants to run a job, it asks the NameNode where the blocks are present and tries to run the job there. With GlusterFS you don't need a NameNode; the architecture works fine without one, and that way you save a lot of hardware and management overhead, which is a big advantage.

This is what I mentioned earlier: geo-replication and erasure coding. Geo-replication is used for disaster recovery. It is an asynchronous, incremental replication service that can replicate data from one site to another, across a local area network, a WAN, or the Internet, so it's pretty handy. The other is erasure coding. Erasure coding is another way of getting data redundancy and fault tolerance; it is similar to a software implementation of RAID, and it reduces the storage overhead compared to full replication.

MapReduce jobs on HDFS use something called data locality optimization: first you get the information about where the data is present, and you try to run the job there, so that you save network bandwidth, which obviously results in better performance. With GlusterFS, the plugin takes advantage of the fact that we store, in each file's extended attributes, where the file is present. The plugin finds out where the file is stored, and the MapReduce tasks actually run there. So the same data locality optimization as in HDFS is present with GlusterFS. I believe this is not available on something like Amazon S3, which is what I was getting at earlier.

There are two prominent Apache projects worth mentioning: one is Apache Spark and the other is Apache Ambari, and both work with GlusterFS. The Apache Spark project is an open-source data-analytics cluster computing framework. It's basically a programming model, like MapReduce; it can replace MapReduce in Hadoop, and it works with Hadoop, so you can keep Hadoop and use Spark to do the analytics. It is mostly oriented towards doing the analytics in memory: it gets the data into memory and runs on it, and because of that it is very fast; it claims up to 100 times faster in certain applications, like data mining. You can read more about it on the web. The other project, Apache Ambari, is the automation project for provisioning, managing, and monitoring Apache Hadoop clusters, and you can use it for doing all of that with GlusterFS. It's pretty easy to use, and it pulls in the peripheral projects too; I think you can install and configure Nagios with it as well, so it's pretty helpful.

This is the deployment of GlusterFS with Hadoop. You can see the other protocols, like NFS and FUSE, in the picture; you can still access the GlusterFS volume in the same way. We have a number of servers here, each server with its bricks, and the MapReduce tasks can run on those same servers.
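To give a flavour of how the glusterfs-hadoop plugin mentioned earlier gets wired in, here is a sketch of the Hadoop core-site.xml it used; the property names follow the plugin's README from that era and may differ across plugin versions, and the server and volume names are made up:

```xml
<!-- core-site.xml: point Hadoop at a GlusterFS volume instead of HDFS.
     Property names per the old glusterfs-hadoop README; names/values
     here (server1, vol1, /mnt/glusterfs) are hypothetical. -->
<configuration>
  <property>
    <name>fs.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>glusterfs://server1:9000</value>
  </property>
  <property>
    <name>fs.glusterfs.volname</name>
    <value>vol1</value>
  </property>
  <property>
    <name>fs.glusterfs.mount</name>
    <value>/mnt/glusterfs</value>
  </property>
  <property>
    <name>fs.glusterfs.server</name>
    <value>server1</value>
  </property>
</configuration>
```

With this in place, the plugin JAR on the classpath lets MapReduce jobs read and write `glusterfs://` paths the way they would HDFS paths.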
These are the resources for GlusterFS. If you have any queries, you can go to gluster.org; you have links for all of these there. You can also ask questions on the IRC channel on Freenode; it's pretty active, so you can ask any questions you have there. Thanks; if you have any questions now, I would be glad to answer. Thank you.