Well, to start: there's been a thing with the organization and the monitor at the door not being updated, but no worries. Here we go.

Good afternoon, everyone. We're really happy to be here. My name is Alejandro Comisario; I'm the technical lead of cloud services at MercadoLibre.com, and I'm here with Leandro Reox, who is also a technical lead of several projects in cloud services at Mercado Libre. We're here today to tell you a little story about how we scaled our picture solution using OpenStack Swift.

For those of you who don't know Mercado Libre, let's talk about it a little bit. Mercado Libre is an Argentinian company based in Buenos Aires. It is the leading e-commerce platform in Latin America, with a presence in 14 countries, including Portugal in Europe. We are in eighth position among the world's online retailers. There are 1,900 employees working at Mercado Libre, of whom 400 are in IT. More than 90 million users are registered on our platform today. Since our API went public for developers to build their applications around it, we're hitting five million API requests per minute. Regarding throughput and bandwidth, we are handling four gigabits per second incoming from our users, and our more than 8,000 virtual instances are running on over 1,300 physical servers. So I'm going to hand over to Leandro to start talking a little bit about our previous solution and why it couldn't scale well.

Well, so you might say, "What a nice testing lab you have there, guys. It's so big. What a nice QA environment." But actually, no: the eighth-largest online retailer in the world is running fully on top of OpenStack. We build our services around OpenStack: we built memcached as a service, database as a service, queuing as a service. It's all on top of OpenStack, and we are using the whole stack. We're using Cinder, we're using Neutron, we're using Nova.
Of course, we're using Swift, and we want to talk about that. We're also planning to release the first version of our platform as a service by the end of this year, using Nova plus Docker containers. That's really cool. And we're replacing our custom auto-scaling API with Heat and Ceilometer. So we're doing pretty cool stuff. I guess that's not bad for us; we are pretty proud of it.

So let's talk a little bit about our previous solution and why our previous storage solution didn't scale right for us. Storing billions and billions of images on NFS storage, on our old NFS appliances, didn't scale right for us, mainly due to performance issues. Our previous platform didn't have the ability to scale up and out well. It didn't have the ability to add more CPU units or more processing power. So we ended up with a lot of available storage but no processing power at all. We were locked in the hard way, and we needed to find a way to deal with this problem. We started to shard across several NFS clusters to try to solve it that way, but that didn't work either.

We were hitting a lot of bottlenecks in our infrastructure. When we tried to access an uncached object, for example to look up an image, every single read implied one database query. So when we had a problem in the caching layer, the database couldn't keep up with it. That meant downtime, and downtime, of course, reduces our annual bonus. And that's sad. That's really sad.
So we needed to do something to improve this. We have a layer of proxy servers that does all the online scaling for the users, to upload a picture and scale the thumbnail online, but that layer didn't scale well. When that layer went down, the whole site crashed and you couldn't see images. We are an e-commerce platform; we sell stuff. The images are the core of our business, so they need to be available all the time. And we couldn't handle traffic peaks.

And what about scaling the caching layer? We have a big Varnish caching layer on top of our CDN. If you want to scale that layer, rehashing happens, and that implies a lot of backend hits. But as I told you previously, our solution was unable to handle traffic peaks, so we had to do it in the early morning, at 3 a.m. It took a lot of time and it implied downtime. It wasn't cool at all. So we weren't able to follow business needs, and implementing other formats, for example upgrading an image format or adding an HD image to the platform, was unthinkable.

So, why did we choose Swift? To picture this for you guys: the images were previously out there. You can even see the guy back there who's administering the platform. We wanted to deploy an open source platform, an open source object storage platform, that fixed the issues we were having previously. There are a lot of reasons why we chose Swift, but we're going to look at a couple.

Because it's open source. We love open source. We are hackers; we love to hack, we love to do things, we love to contribute to open source. It's a very strong reason for us. You can run it anywhere. I don't know if you guys know that back in South America our fridges are built on top of Open Compute standards. I'm just kidding.
I'm talking about commodity hardware, and that leads us to the best-selling phrase of the last couple of years: you know, vendor lock-in.

It's durable, and it's as granular as you want. You can control all your availability by zoning the smart way: you can distribute your zones across your racks, across your data center, or anywhere. So you can manage that. And it's multi-tenant. You have to publish just one endpoint, and any user in your company can use the solution. It's pretty cool. If you integrate that with Keystone, that's a killer combo.

And you guys don't need me to tell you that Swift scales. You just need to add proxies for processing or throughput power, and you just need to add storage nodes for storage, sharding, and all that stuff. To make it clear for you guys: we are an e-commerce platform, and when you access Mercado Libre, all the images that you see in the items, all the CSS files, all the JS files that make up the style of the site are loaded from Swift.

And you don't need a big team to administer it. You need just a couple of handsome guys like us. Or, if you are not an IT-related company, or you don't want or don't have that team, we think there are two ways to Swift heaven. The other is SwiftStack, so you can hit the SwiftStack guys to get that for you.

Single point of failure: let's not even talk about it, because with a shared-nothing approach, every piece of the Swift infrastructure is transparently independent from the others. That means every part of the solution knows everything about everyone.
So that is what allows you to scale without limits.

Having a REST API is great for us, because it lets us make sure that we're always speaking the same language and making the data flow through the exact same protocol. Talking about developers: well, our developers talk HTTP REST the whole day, and they can choose any language they want to build their Swift clients.

Highly secure, of course, because Swift by default is integrated with Keystone. Every call into the solution gets authenticated and authorized by the identity service, so you can ensure, among other things, multi-tenancy. But if you want to make specific content public, you can do it through specific ACLs.

If you want to scale the throughput and the bandwidth of your implementation to the skies, you can. That is definitely our use case: we implemented Swift in-house to take total control over it. And of course, there is a huge community, the best community, behind Swift. You can hit IRC, you can hit Launchpad, you can hit the mailing list, and you're going to bump into lots of amazing people from SwiftStack and Rackspace who will for sure get you the right answer.

So yeah, there was Swift. That was the real option. Swift saved the day. Actually, it saved our jobs and our annual bonus.

We tested Swift, we've been using it for a while, we're using it a lot, and we like it. We love it. We implemented Swift in a way that is called multi-disk configuration. It's a pretty unusual configuration: one disk maps to one port and one service, for example an object server, and every single disk has its own configuration file. So if you want to tune something for a particular disk, you can do it, and that's pretty cool stuff. I can remember kernel parameters, but I always forget the numbers.
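To make the multi-disk idea concrete, here is a minimal sketch in Python of the "one disk, one port, one configuration file" mapping. It is an illustration under assumptions, not the speakers' actual tooling: the base port `6010`, the `/srv/node/<disk>` path layout, the `workers` value, and the helper name `per_disk_configs` are all hypothetical. `bind_port`, `devices`, and `workers` are real Swift object-server options, but the generated text is only a fragment of a full config.

```python
from configparser import ConfigParser
from io import StringIO

BASE_PORT = 6010  # assumed starting port; the talk does not give one


def per_disk_configs(disks, base_port=BASE_PORT):
    """Return {disk_name: ini_text}, one object-server config fragment per disk.

    Each disk gets its own port, so it can be tuned (or restarted)
    independently of every other disk on the same node.
    """
    configs = {}
    for offset, disk in enumerate(disks):
        parser = ConfigParser()
        parser["DEFAULT"] = {
            "bind_port": str(base_port + offset),  # one port per disk
            "devices": f"/srv/node/{disk}",        # hypothetical path layout
            "workers": "2",                        # per-disk tunable (assumed value)
        }
        parser["app:object-server"] = {"use": "egg:swift#object"}
        buffer = StringIO()
        parser.write(buffer)
        configs[disk] = buffer.getvalue()
    return configs


if __name__ == "__main__":
    for disk, text in per_disk_configs(["sdb", "sdc", "sdd"]).items():
        print(f"# object-server-{disk}.conf")
        print(text)
```

With this layout, each disk would also appear in the ring with its own port, which is what lets a per-disk setting differ from its neighbours.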
So we have 1.2 petabytes raw in our Swift cluster: 416 terabytes net, with three copies. This is our Swift cluster in its current implementation. We have 255 terabytes net actually available. We have more than 1.4 billion pictures uploaded into the solution, and we are adding more than 13 million images to Swift per day. Per day. That's a lot. That's a lot of small files, guys. The average size of a picture in Mercado Libre is about 20 KB, and we're using just 60 gigs, sorry, 60 terabytes of the current storage space. So there are a lot of small files, and we have fast clients.

Is it hard to scale, if you want to scale? The good thing about Swift is that it scales linearly. If you want to scale your throughput and your bandwidth, you add more proxies to the solution, and if you want to scale your storage capacity, you add more data nodes, little by little, without having to worry about performance degradation.

We are 10x faster than before. I mean, think about an open source, no-cost solution that allows us, for every picture in a specific item, when it's uploaded, to process in real time 20 versions of the same picture, to show on different kinds of devices and different kinds of browsers, and for the upload to be just one single operation, without having to wait even minutes. It was a dream, and today, thanks to Swift's ability to handle that much throughput, it's pretty easy for us.

And talking about saving hardware and lowering costs: imagine the static content being shown by the site, not only the pictures. Imagine configuration files; imagine internal data structures that have to be shared among internal applications. Well, give each of those applications a couple of web servers and multiply it by the number of
departments and projects inside your company, and that's the exact amount of hardware that we saved by storing all this static content in one separate Swift cluster. So that is absolutely huge for us. Again, having this rock-solid implementation of an object store, and having total control over it, gives you the tools to build the right solution, take control of everything, and not have any impact on the business at all.

So, maybe you're still scared. You're asking whether Swift is strong enough, whether it is mature enough, to store all your company's stuff. We're here to shout out loud that Swift is the core of our business. We are an e-commerce platform, we sell things through images, and all the images are stored on Swift. So you don't have reasons to be afraid, because Swift is strong. It's a strong solution. It's a really cool solution.

But we have a couple of tips for you that we gathered from great guys like Craig on IRC and the SwiftStack people. And of course, if you haven't got Joe's book yet, you should get it.

Always use enterprise-grade drives, for your SATA drives and for your SSD drives. You're going to gain a lot, and you won't have to be struggling with all that. If you have remote support, you end up calling all the time and saying to the guys, "Replace that, replace that, replace that." That's not cool at all.

Use a dedicated high-speed network. We use ours mostly for replication, because we are throughput guys; we are not bandwidth guys. But if you actually are bandwidth guys, maybe you need to implement a high-speed network to the data nodes from the proxies, and so on.

Use SSDs for accounts and containers.
This is actually a pretty important tip. We noticed a major gain switching from SATA drives to SSD drives for accounts and containers, because of the nature of the service: you're hitting an SQLite database that needs really, really high speed for concurrency.

And of course, haste makes waste, guys. Festina lente. If you need to add or remove resources from Swift, adding disks or removing disks, do it slowly. Sometimes slow is better, if you don't want to harm performance on your platform and your Swift cluster. Do it slowly. That's actually a great tip.

In our specific case, our whole production internal network is dominated one hundred percent by fast clients. That means lots of applications craving thousands of GET and PUT operations against our Swift cluster. So imagine you need the perfect setup for that. In our case, we found out that striping, all in the same account, striping all your throughput across several containers, allows you to scale your throughput to the skies.

Talking about smart zoning: this is an initial setup decision that you need to think through really well when you're deploying Swift. You need to really think about where your data nodes are going to be placed inside your data center. You need to be really thoughtful about how your zones are going to be configured, so that you can handle disaster situations.

Talking about recycling: at least in our use case, if you have data in your cluster that you are not going to need over the next several weeks, several months, or several years, please delete it. By doing that you free an inode, and an inode freed on a disk on a data node is an inode that doesn't need to be cached.
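The container striping tip mentioned above can be sketched as follows. This is a hedged illustration, not the scheme Mercado Libre actually uses: the container count, the `pictures-NNN` naming, the account name, and both helper names are assumptions. The only Swift-specific fact relied on is the object URL shape `/v1/{account}/{container}/{object}`. The idea is that every container is backed by its own SQLite database, so spreading writes over many containers spreads the update load.

```python
import hashlib


def pick_container(object_name, num_containers=16, prefix="pictures"):
    """Map an object name to one of `num_containers` containers.

    A stable hash keeps the mapping deterministic, so readers can find
    the object again without any lookup table. All names are hypothetical.
    """
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    index = int(digest, 16) % num_containers
    return f"{prefix}-{index:03d}"


def object_path(account, object_name, num_containers=16, prefix="pictures"):
    """Build the Swift API path for an object in its striped container."""
    container = pick_container(object_name, num_containers, prefix)
    return f"/v1/{account}/{container}/{object_name}"


if __name__ == "__main__":
    for name in ("item-1001.jpg", "item-1002.jpg", "item-1003.jpg"):
        print(object_path("AUTH_demo", name))
```

One design consequence worth noting: with plain modulo hashing, changing `num_containers` later remaps most objects, so the container count is best chosen generously up front.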
So, talking about caching too: make sure that everything that has to be cached is actually there. Take a trip to the memcached on your proxy nodes, run a few little queries, and make sure that everything is there. But on the other hand, make sure that you know exactly what needs to be cached, at the proxies and before them.

There is a specific point that we really wanted to call out: check your setup for background services. If you have a specific disk that is always kissing the upper line of a specific monitoring graph in your company, or you have weird I/O peaks at specific times of the day, well, just grab a good cup of coffee and go through all your configuration for background services, replication, auditors, etc., and make sure you have everything well configured. Because, let me tell you, in Swift everything has an explanation, thank God.

And last but not least, since we still have your attention, the point we wanted to leave for last: make sure you really understand what Swift is made for. Make sure that Swift has a use case in your company. Understand the technological and economic implications of using Swift inside your company. And if you do, you just have to deploy it, configure it, monitor it, hit IRC if you're missing something, and then you can drink tea like a sir the whole day.

Well, thank you so much for coming. We really enjoyed our talk, and if you have any questions about our setup or the deep-dive technical implementation, we were the ones who made it happen. We want to encourage everyone to use this platform. We're using it big time, and that's really great. And if we did it, anyone can do it. So, thank you very much.