 My name is Simon, I work on the infrastructure team at Shopify, and today I'm going to talk about the past five years of scaling Rails at Shopify. I've only been around at Shopify for four years, so the first year was a little bit of digging, but I wanna talk about all of the things that we've learned, and I hope that people in this audience can maybe place themselves on this timeline and learn from some of the lessons that we've had to learn over the past five years. This talk is inspired by a guy called Jeff Dean from Google, he's a genius, and he did this talk about how they scale Google for the first couple of years, and he showed how they ran with a couple of my SQLs, they sharded, they did all this stuff, they did all the no SQL paradigm, and finally went to the new SQL paradigm that we're now starting to see, but this was really interesting to me because you saw why they made the decisions that they made at that point in time. I've always been fascinated by what made, say, Facebook decide that now is the time to write a VM for PHP to make it faster. So this talk is about that. It's about an overview of the decisions we've made at Shopify, and less so about the very technical details of all of them. It's to give you an overview and mental model for how we evolve our platform, and there's tons of documentation out there on all of the things I'm going to talk about today. Other talks by coworkers, blog posts, readme's, and things like that. So I'm not gonna get into the weeds, but I'm going to provide an overview. I work at Shopify, and at this point, you're probably tired of hearing about Shopify, so I'm not gonna talk too much about it. Just overall, Shopify is something that allows merchants to sell people to other people, and that's relevant for the rest of this talk. We have hundreds of thousands of people who depend on Shopify for their livelihood, and through this platform, we run almost 100K RPS at peak. The largest sales hit our rail servers with almost 100K requests per second. Our steady state is around 20 to 40K requests per second, and we run this on 10,000s of workers across two data centers. About $30 billion have made it through this platform, which means that downtime is costly, to say the least. And these numbers, you should keep in the back of your head as the numbers that we have used to go to the point that we are at today. Roughly, these metrics double every year. That's the metric we've used. So if you go back five years, you just have to cut this in half five times. I want to introduce a little bit of vocabulary for Shopify, because I'm going to use this loosely in this talk to understand how Shopify works. Shopify is at least four sections. One of those sections is the storefront. This is where people are browsing their collections, browsing their products, adding to the cart. This is the majority of traffic. Somewhere between 80 to 90% of our traffic are people browsing their storefronts. Then we have the checkout. This is where it gets a little bit more complicated. We can't cash as heavily as we do on the storefront. This is where we have to do writes, decrement inventory, and capture payments. Admin is more complex. You have people who apply actions to hundreds of thousands of orders concurrently. You have people who change billing, they need to be billed, and all these things that are much more complex than both checkout and storefront in terms of consistency. And the API allows you to change the majority of the things you can change in the admin. The only real difference is that computers hit the API and computers can hit the API really fast. Recently I saw an app for people who wanted to have an offset under orders numbers at a million, and so this app will create a million orders and then delete all of them to get that offset. So people do crazy things with this API, and that's one of our second largest source of traffic after the storefront. I want to talk a little bit about a philosophy that has shaped this platform over the past five years. Flash sales is really what has built and shaped the platform that we have. When Kanye wants to drop his new album on Shopify, it is the team that I am on that is terrified. We had a sort of fork in the road five years ago. Five years ago was when we started seeing these customers who could drive more traffic for one of their sales than the entire platform otherwise was serving of traffic. They would drive a multiple, so if we were serving a thousand requests per second for all the stores on Shopify, some of these stores could get us to 5,000. And this happens in a matter of seconds. Their sale might start at 2 p.m., and that's when everyone is coming in. So there's a fork in the road. Do we become a company that support these sales or do we just kick them off the platform and throttle them heavily and say, this is not the platform for you? That's a reasonable path to take. 99.9 something percent of the stores don't have this pattern. They can't drive that much traffic. But we decided to go the other route. We wanted to be a company that could support these sales, and we decided to form a team that would solve this problem of customers that could drive enormous amounts of traffic in a very short amount of time. And this is, I think, was a fantastic suggestion, and this happened exactly five years ago, which is why the time frame of the talk is five years. And I think it was a powerful decision because this has served as a canary in the coal mine. The last sales that we see today and the amount of traffic that they can drive say 80K RPS, that's what the steady state is going to look like next year. So when we prepare for these sales, we know what next year is going to look like and we know that we're going to laugh next year because we're already working on that problem. So they help us stay ahead, one to two years ahead. In the meat of this talk, I will walk through the past five years of the major infrastructure projects that we've done. These are not the only projects that we've done. There's been other apps and many other efforts, but these are the most important to the scaling of our Rails application. 2012 was the year where we sat down and decided that we were going to go the antifragile route with Flast Sales. We were going to become the best place in the world to have Flast Sales. So a team was formed, whose sole job was to make sure that Shopify as an application would stay up and be responsive under these circumstances. And the first thing you do when you start optimizing an application is you try to identify the lower hanging fruit. In this case, or in many cases, the lower hanging fruit is very application dependent. The lowest hanging fruit from an infrastructure side that's already harvested in your load balancers, Rails is really good at this, or your operating system. They will take all of the generic optimization tuning. So at some point that work has to be handed off to you and you have to understand your problem domain well enough that you know where the biggest wins are. For us, the first ones were things like backgrounded checkouts. And this sounds crazy. What do you mean they weren't backgrounded before? Well, the app was started in 2005, 2004, and back then, backgrounding jobs in Ruby or Rails was not really a common thing. And we hadn't really done it after that either because it was such a large source of technical debt. So in 2012, a team sat down and collected the massive amount of technical debt to move the background jobs into, or remove the checkout process into background jobs. So the payments were captured, not in a request that took a long time, but in jobs asynchronously with the rest. And this of course was a massive source of speedup. Now you're not occupying all these workers with long running requests. Another thing we did at these domain specific problems is it was inventory. If you just, you might think that inventory is just decrementing one number and doing that really fast if you have thousands of people. But MySQL is not good at that. If you're trying to decrement the same number from thousands of queries at the same time, you will run into lock contention, so we have to solve this problem as well. And these are just two of many problems we solved. In general, what we did was we printed out the debug logs of every single query on the storefront, on the checkout, and all of the other hot paths, and basically started checking them off. I couldn't find the original picture, but I found one from a talk that someone from the company did three years ago where you can see the wall here where the debug logs were taped. And the team at the time would go and cross off and write their name on the queries and figure out how to reduce this as much as possible. But you need a feedback loop for this. We couldn't wait until the next sale to see if the optimizations we've done actually made a difference. We needed a better way than just crashing at every single flash sale but having a tighter feedback loop, just like when you run the tests locally, you know whether it worked pretty much right away. We wanted to do the same for performance. So we wrote a load testing tool and what this load testing tool will do is that it will simulate a user performing a checkout. It will go to the storefront, browse around a little bit, find some products, add them to its cart and perform the full checkout. It will fussy test this entire checkout procedure and then we spin up thousands of these in parallel to test whether the performance actually made any difference. This is now so deeply webbed in our infrastructure culture that whenever someone makes a performance change, people ask, well, how did the load testing go? This was really important for us and as Siege, just hitting the storefront, that is something that just runs a bunch of the same requests, is just not a realistic benchmark of realistic scenarios. Another thing we did at this time was we wrote a library called Identity Cache. We had a problem with, we had one MySQL at this point hosting tens of thousands of stores and when you have that, you're pretty protective of that single database and we were doing a lot of queries to it and especially these sales were driving such a massive amount of traffic at once to these databases. So we needed a way of reducing the load on the database. The normal way of doing this or the most common way of doing this is to start sending queries to the read slaves. So you have databases that feed off of the one that you write to and you start reading from those and we tried to do that at a time. It has a lot of nice properties to use the read slaves over another method but when we did this back in the day there wasn't any really good libraries in the Rails world. We tried to fork some and tried to figure something out but we ran into data corruption issues. We ran into just mismanagement of the read slaves which was really problematic at the time because we didn't have any DBAs and overall, mind you, this is a team of Rails developers who just have to turn infrastructure developers and understand all of this stuff and learn that the job because we decided to handle flash sales the way that we did. So we just didn't know enough about MySQL and these things to go that path. So we decided to figure out something else and deep inside of Shopify, Toby had written a commit many, many years ago introducing this idea of identity cache, of managing your cache out of bound in Memcache. Idea being that if I query for a product I look at Memcache first and see if it's there, if it's not, if it's there, I'll just grab it and not even touch the database. If it's not there, I'll put it there so that for the next request it will be there and every time we do a write we just expire those entries. That's what we managed to do. This has a lot of drawbacks because that cache is never going to be 100% what is in the database. So when we do a read from that managed cache we never write that back to the database. It's too dangerous. That's also why the API is opt in. You have to do fetch instead of find to use IDC because we only want to do it on these paths and it will return read only records so you cannot change them to not corrupt your database. This is the massive downside with either using read slaves or identity cache or something like this is that you have to deal with what are you going to do when the cache is expired or old. So this is what we decided to do at the time. I don't know if this is what we would have done today. Maybe we've gotten much better at handling read slaves and they have a lot of other advantages such as being able to do much more complicated queries but this is what we did at the time and if you're having severe scaling issues already identity cache is a very simple thing to do and use. So after 2012 and what would have been probably our worst Black Friday cyber Monday ever because the team was working night and day to make this happen. There's this famous picture of our CTO face planted on the ground after exhausting work of scaling Shopify at the time and someone then woke him up and told him, hey dude check out this down. We were not in a good place but identity cache load testing and all this optimization it saved us and once the team had decompressed after this massive sprint to survive these sales and survive Black Friday and Cyber Monday this year we decided to raise the question of how can we never get into this situation again. We'd spend a lot of time optimizing checkout and storefront but this is not sustainable. If you keep optimizing something for so long it becomes inflexible. Often fast code is hard to change code. If you optimize storefront and checkout and had a team that only knew how to do that there's gonna be a developer who's gonna come in add a feature and add a query as a result and this should be okay. People should be allowed to add queries without understanding everything about the infrastructure. Often the more slower thing is more flexible. Think of a completely normalized schema. It is much easier to change and adapt upon and that's the entire point of a relational database but once you make it fast it often is a trade off of becoming more inflexible. Think of say an algorithm, a bubble sort. N square is how much is the complexity of the algorithm. You can make that really fast. You can make that the fastest bubble sort in the world. You can write a C extension in Ruby that has inline assembly and this is the best bubble sort in the world. But my terrible implementation of a quicksort which is N log N complexity is still gonna be faster. So at some point you have to stop optimizing, zoom out and re-architect. So that's what we did with sharding. At some point we needed that flexibility back and sharding seemed like a good way to do that. We also had the problem of fundamentally Shopify is an application that will have a lot of rights. Doing these sales there's gonna be a lot of rights to the database and you can't cash rights. So we have to find a way to do that and sharding was it. So basically we built this API. A shop is fundamentally isolated from other shops. It should be. Shop A should not have to care about shop B. So we did per shop sharding where one shop's data would all be on one shard and another shop might be on another shard and the third shop might be together with the first one. So this was the API. Basically this is all the sharding API internally exposes. Within that block it will select the correct database where the product is for that shop. Within that block you can't reach the other shard. That's illegal. And in a controller this might look something like this. At this point most developers don't have to care about it. It's all done by a filter that will find a shop on another database, wrap the entire request in the connection that that shop is on and any product query will then go to the correct shard. This is really simple and this means that the majority of the time developers don't have to care about sharding. They don't even have to know if it's existent. It just works like this and jobs will work the same way. But it has drawbacks. There's tons of things that you now can't do. I talked about how optimization might, you might lose flexibility with optimization but in architecture you lose flexibility at a much grander scale. Fundamentally shops should be isolated from each other but in the few cases where you want them to not be there's nothing you can do. That's the drawback of architecture and changing the architecture. For example, you might wanna do joints across shops. You might wanna gather some data or an ad hoc query about app installation across shops. And this might not really seem like something you would need to do but the partners interface for all of our partners who build applications actually need to do that. They need to get all the shops and get the installations from them so it was just written as something that did a join across all the shop and listed it. And this had to be changed. And so the same thing went for our internal dashboard that would do things across shops, find all the shops with a certain app. You just couldn't do that anymore so we have to find alternatives. If you can get around it, don't shark. Fundamentally Shopify is an application that will have a lot of rights but that might not be your application. It's really hard and it took us a year to do and figure out. We ended up doing an ad application level but there's many different parts or levels where you can shark. If your database is magical, you don't have to do any of this. Some databases are really good at handling this stuff and you can make some trade-offs at the database level so you don't have to do this at an application level. But there are really nice things about being on a relational database. Transactions and schemas and the fact that most developers are just familiar with them are massive benefits and they're reliable. They've been around for 30 years and so they're probably gonna be around for another 30 years at least. We decided to do that at the application level because we didn't have the experience to write a proxy and the databases that we looked at at the time were just not mature enough. And I actually looked at some of the databases that we were considering at the time and most of them have gone out of business. So we were lucky that we didn't buy into this proprietary technology and solved it at the level that we felt most comfortable with at the time. Today, we have a different team and we might have solved this at a proxy level or somewhere else, but this was the right decision at the time. In 2014, we started investing in resiliency and you might ask, what is resiliency doing in a talk about performance and scaling? Well, as a function of scale, you're going to have more failures and this led us to a threshold in 2014 where we had enough components that failures were happening quite rapidly and they had a disproportional impact on the platform. When one of our shards was experiencing a problem, every request to other shards and shops that were on the other shards were either much slower or filling altogether. It didn't make sense that when a single Redis server blew up, all a Shopify was down. This reminds me of a concept from chemistry where your reaction time is proportional to the amount of surface area that you expose. If you have two glasses of water and you put a teaspoon of loose sugar in one and a sugar cube in the other glass, the glass with the loose sugar is going to be dissolved in the water quicker because the surface area is larger. The same goes for technology. When you have more servers, more components, there's more things that will react and can potentially fail and make it all fall apart. This means that if you have a ton of components and they're all tightly knitted together in a web where if one of these components fail, it drags a bunch of others with it and you have never thought about this, adding a component will probably decrease your availability and this happens exponentially. As you add more components, your overall availability goes down. If you have 10 components with four nines, you have a lot less downtime if they're tightly webbed together in a way that one of them is a single point of failure. And we hadn't really at this point had the luxury of finding out what our single point of failure even were. We thought it was going to be okay, but I bet you if you haven't actually verified this, you will have single points of failure all over your application where one failure will take everything down with it. Do you know what happens if your memcast cluster goes down? We didn't and we were quite surprised to find out. This means that you're only really as weak or as good as your weakest single point of failure. And if you have multiple single points of failure, multiply the probability of all of those single points of failure together and you have the final probability of your app being available. Very quickly, what looks like downtime of hours per component will be days or even weeks of downtime globally amortized over an entire year. If you're not paying attention to this, it means that adding a component will probably decrease your overall availability. The outages look something like this. Your response time increases and this is a real graph of the incidents at the time in 2014 where something became slow. And as you can see here, the timeout is probably 20 seconds exactly. So something was being really slow and hitting a timeout of 20 seconds. If all of the workers in your application are spending 20 seconds waiting for something that's never going to return because it's going to timeout, then there's no time to serve any requests that might actually work. So if shard one is slow, requests for shard zero are gonna lag behind in the queue because these requests to shard one will never ever complete. The mentor that you have to adopt when this starts becoming a problem for you is that single component failure cannot compromise the availability or performance of your entire system. Your job is to build a reliable system from unreliable components. A really useful mental model for thinking about this is the resiliency matrix. On the left hand side, we have all the components in our infrastructure. At the top, we have the sections of the infrastructure such as admin, checkout, storefront, the ones I showed from before. Every cell will tell you what happens if that component on the left is unavailable or slow. What happens to the section? So if Redis goes down, is storefront up, is checkout up, is admin up? This is not what it actually looked like in reality when we drew this out. It was probably a lot worse. And we were shocked to find out how red and blue, how down and degraded Shopify looked when what we thought were tangential data stores like Memcache and Redis took down everything along with it. The other thing we were shocked about when we wrote this was this is really hard to figure out. Figuring out what all these cells and the values of them are is really difficult. How do you do that? Do you go into your production and just start taking down stuff? How do you know? What would you do in development? So we wrote a tool that will help you do this. The tool is called Toxy Proxy. And what it does is that for a duration of the block, it will emulate network failures at the network level by sitting in between you and that component on the left. This means that you can write a test for every single cell in that grid. So when you flip it from being red to being green, from being bad to being good, you can know that no one will ever reintroduce that failure. So these might look something like this, that when some message queue is down, I get this section and I assert that it responds to success. At this point in Shopify, we have very good coverage of our resiliency matrix by unit tests that are all backed by Toxy Proxy. And this is really, really simple to do. Another tool we wrote is called Semi-In. It's fairly complicated exactly how all of these components work and how they work together in Semi-In. So I'm not gonna go into it, but there's a readme that goes into vivid detail about how Semi-In works. Semi-In is a library that helps your application become more resilient and how it does that, I encourage you to check it to readme to find out how it works. But this tool was also invaluable for us to run, to not, or to be able to be a more resilient application. What we, the mental model we mapped out for how to work with resiliency was that of a pyramid where we had a lot of resiliency debt because for 10 years we hadn't paid any attention to this. The web I talked about before of certain elements dragging down everything with it was eminent. It was happening everywhere. The resiliency matrix was completely red when we started. And nowadays it's in pretty good shape. So we started climbing it. We started figuring out, writing all these tools, incorporating all these tools. And then when we got to the very top, someone asked the question, what happens if you flood the data center? That's when we started working on multi-DC in 2015. We needed a way such that if the data center caught fire, we could fail over to the other data center. But resiliency and sharding and optimization were more important for us than going multi-DC. Multi-DC was largely an infrastructure effort of just going from one to n. This required a massive amount of changes in our cookbooks. But finally we had procured all the inventory and all the servers and stuff to spin up a second data center. And at this point, if you want to fail over Shopify to another data center, you just run the script. And it's done. All of Shopify has moved to a different data center. And the strategy that it uses is actually quite simple and one that most Rails apps can use pretty much as is if the traffic and things like that are set up correctly. Shopify is running in a data center right now in Virginia and one in Chicago. If you go to a Shopify owned IP, you will go to the data center that is closest to you. If you're in Toronto, you're gonna go to the data center in Chicago. If you are in New Orleans, you might go to the data center in Virginia. When you hit that data center, the load balancers in that data center inside of our network will know which one of the two data centers is active. Is it Chicago or is it Aspern? And it will route all the traffic there. So when we do a failover, we tell the load balancers in all the data centers, what is the primary data center? So if the primary data center was Chicago and we're moving it to Aspern, we tell the load balancers in both the data centers to route all traffic to Aspern. Aspern in Virginia. When the traffic gets there and we've just moved over, any write will fail. The databases at that point are in read only. They are not writable in both locations at one because the risk of data corruption is too high. So that means that most things actually work. If you're browsing around Shopify and Shopify storefront looking at products, which is the majority of traffic, you won't see anything. Even if you are in admin, you might just be looking at your products and not notice this at all. And while that's happening, we're failing over all of the databases, which means checking that they're caught up in the new data center and then making them writable. So very quickly the charge recover over a couple of minutes. It could be anywhere from 10 to 60 seconds per database and then Shopify works again. We then move the jobs because when we move all the traffic, we stop the jobs in the source data center. So we move all the jobs over to the new data center and everything just ticks. But then how do we use both of these data centers? We have one data center that is essentially doing nothing, just very, very expensive hardware sitting there doing absolutely nothing. How can we get to a state where we're running traffic out of multiple data centers at the same time, utilizing both of them? The architecture at first looks something like this. It was shared. We had shared Redis instances, shared Memcash between all of the shops. When we say a shard, we're referring to a MySQL shard, but we hadn't sharded Redis. We hadn't sharded Memcash and other things. So all of this was shared. What if instead of running one big Shopify like this that we're moving around, we run many small Shopify's that are independent from each other and have everything they need to run? And we call this a pod. So a pod will have everything that a Shopify needs to run. As the workers, as the Redis, the Memcash, the MySQL, whatever else there needs to be for a little Shopify to run. If you have these many Shopify's and they're completely independent, they can be in multiple data centers at the same time. You can have some of them active in data center one and some of them active in data center two. Pod one might be active in data center two and pod two might be active in data center one. So that's good, but how do you get traffic there? So for Shopify, every single shop has usually a domain. It might be a free domain that we provide or their own domain. When this request hits one of the data centers, the one that you're closest to, Chicago or Virginia, depending on where in the world you are, it goes to this little script that's very aptly named sorting hat. And what sorting hat will do is that it will look at the request and interpolate what shop, what pod, what mini Shopify does this request belong to? If that request is on a shop that is going to pod two, it will route us to data center one on the left. But if it's another one, it will go to the right. So sorting hat is just sitting there, sorting the request and sending them to the right data center. Doesn't care where you're landing, which data center you're landing to, it would just route you to the other data center if it needs to. Okay, so we have an idea now of what this multi DC strategy can look like, but how do we know if it's safe? Turns out that there's just needs to be two rules that are honored. Rule number one is that any request must be annotated with the shop or the pod that it's going to. All of these requests for the storefront are on the shop domain, so they're indirectly annotated with the shop they're going to through the domain. With the domain, we know which pod, which mini Shopify that this request is belonging to. The second rule is that any request can only touch one pod. Otherwise, it would have to go across data centers. And potentially this means that one request might have to reach Asia, Europe, maybe also North America, all in the same request. And that's just not reliable. Again, fundamentally shops and request to shops should be independent, so we should be able to honor these two rules. So you might think, well, it sounds reasonable. Like Shopify should just be an application with a bunch of control actions that just go to a shop. But there were hundreds, if not a thousand requests that didn't, that violated this. They might look something like this. They might do something going over every shard and counting something or doing something like that, or maybe it's uninstalling PayPal accounts and seeing if there are any other stores with it, or something like that across multiple stores. When you have hundreds of endpoints that are violating something you're trying to do, and you have a hundred developers who are doing all kinds of other things and introducing new endpoints every single day, that's gonna be a losing battle if you just send an email because tomorrow someone joins who's never read that email, who's gonna violate this. Raphael talked a little bit about this yesterday. He called it white listing. We called it shitless driven development. The idea is that your job, if you wanna honor rule one and two, is to build something that gives you a shit list, a list of all the things that violate the rule. If you do not obey the shit list, you raise an error telling people what to do instead. This needs to be actionable. You can't just tell people not to do something unless you provide an alternative. Even if the alternative is that they come to you and you help them solve the problem. But this means that you stop the bleeding and you can then, going forward, rely on rule one and two, in this case, being honored. When we had this for Shopify, rule one and two honored, our multi DC strategy worked. And today, with all of this, building a top of five years of work, we're running 80,000 requests per second out of multiple data centers. And this is how we got there. Thank you. Do you have any global data that doesn't fit into a shard? Yes. We have a dreaded master database and that database holds data that doesn't belong to a single shop. In there is, for example, the shop model. We need something that stores the shop globally because otherwise the low balancers can't know globally where the shop is. Other examples are apps. Apps are sort of inherently global and then they're installed by many shops. It can be billing data because it might span multiple shops. There's actually a lot of this data. So I didn't go into this at all but I actually spent six months of my life solving this problem. So we have a master database and it spans multiple data centers. And the way that we solve this is essentially we have reed slaves in every single data center that feed off of the master database that is in one of the data centers. If you do a write, you do cross DC writes. This sounds super scary but we eliminated pretty much every path that has a high SLO from writing on this. So billing has a lower SLO in Shopify because the writes have to be cross DC. But the thing is that billing and partners and the other sections of this master database, they're in different sections. They're fundamentally different applications and as we speak, they're actually being extracted out of Shopify because Shopify should be a completely sharded application. And if they're extracted out of Shopify then you're also doing a cross DC write because you don't know where that thing is. So it's not really making the SLOs worse and it's okay that some of these things have lower SLOs than the checkout and storefront and the admin that have the highest SLOs. So that's how we deal with that. We don't really deal with it. How do you deal with a disproportionate amount of traffic to a single pod or a single shop? So I showed a diagram earlier that shows that the workers are isolated per pod. This is actually a lie. The workers are shared which means that a single pod can grab up to 60 to 70% of all of the capacity of Shopify. So what's actually isolated in the pod are all the data stores and the workers can sort of move between pods fluid like they're fungible. They will move between pods on the fly. The low bounce just sends requests to it and it will appropriately connect to the correct pod. So this means that the maximum capacity of a single stores is somewhere between 60 and 70% of an entire data center. And it's not 100% because that would cause an outage because of a single store which we're not interested in but that's how we sort of move this around. Is that answer? How do we deal with large amounts of data? Yeah, like someone who's doing importing 100,000 customers or 100,000 orders. Well, this is where the multi-tenancy strategy or architecture sort of shines. These databases are massive. Half a terabyte of memory, many, many tens of cores. And so if one customer has tons of orders, then that just fits. And if the customer is so large that they need to be moved, that's sort of what this data fragmentation project is around, is around moving these stores to somewhere where there might be more space for them. So basically we just deal with it by having massive, massive data stores that can handle this without a problem. The import itself is just done in a job. Some of these jobs are quite slow for the big customers and we need to do some more parallelization work but most of the time it's not a big deal. If you have millions of orders and it takes a week to import that, you have plenty of other work to do during that time otherwise, so this is not something that's been high, high on the list. How much time I have? Done? Okay, thank you. Okay. Thank you.