Good morning everybody. How are we doing? Excited for SCALE? So, our sale events usually open at 8 a.m., and that's when you have members wondering: why is it so slow? And they definitely let us know, in the app feedback and through our member care team. You also have stakeholders who are extremely unhappy. You have someone in marketing who's been wary of sending people back to our site after the last time it hiccupped. And on top of all of that, you have pride in your work.

I have a nice graphic for you. This is a screenshot from New Relic, and this is an actual incident: web transaction time is hovering around 200 milliseconds, and then there's a significant drop in throughput within a matter of about one to two minutes. And remember, this is all Symfony. This is all PHP.

So the obvious question is, what kind of scale are we talking about? Some numbers: 2.3 million NGINX log events per hour, and up to 250 orders per minute. If I do some very complicated arithmetic, what that tells me is that there's a lot of low-hanging fruit. Lots and lots of low-hanging fruit.

I could walk you through everything we did and tried, all the trials and tribulations, the dark days. But really, that's not as useful as what I think you guys want to know, which is: what were the practices? What worked? And so I brainstormed with the team and gathered up what we do and why.

First principle: if you're not debugging the biggest bottleneck, you are wasting your time. Because what will happen, as you're debugging this application, is that you'll be spot-fixing things over here that have basically no impact, because you're all bottlenecking in one place. It doesn't matter how fast the path to that bottleneck is; that bottleneck is still the slowest part.

So imagine an architecture diagram, where you have users on the far left, then your web front end, then your back-end API, and on the right you have your e-commerce database, or any database really. And imagine the flow of data through the system. In our experience, the biggest bottlenecks are going to happen on the data side.
They're going to happen on the API side. Your web front-end problems are not as endemic; they're easier to fix than having to go back and redo hundreds of millions of records' worth of historical data to fit a new schema, stuff like that. That's very, very challenging to do.

And so as you're debugging these bottlenecks and tackling the biggest one first, another one is going to crop up, right? After a while you're going to start to feel like Hercules slaying the Hydra: every time you chop off one head, you see three more grow. How do you deal with this sort of thing? I suggest you do what Hercules did, which is burn the stump. Make sure that nothing else can crop up there again. What that really means is: get down to the bottom of why you have a performance problem. Get to the root cause, and fix the root cause. That might not always be possible, and so you will have to make some trade-offs. But as much as you can, you must eliminate root causes.

Second principle: choose and use your tools, or make your own. These are the tools you're going to use to inspect your application, understand what it's doing, and make sure that it's doing what you expect. You must use an application performance monitoring tool. And I know that here we're very FOSS-friendly; we don't like saying that sometimes there is a better paid solution. But pay for the software. Please. Use New Relic. Use AppDynamics. Use anything. Don't just rely on StatsD or whatever instrumentation you've made yourself. You need something that someone else has spent all of their time working on to make good.

Alongside that, you must profile your applications. What the APM will do for you is help you identify where there might be an issue with your code. What the profiler will do is help you drill into that performance problem, understand why it's there, and potentially get to a fix.
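What that drill-down looks like depends on your runtime; PHP shops typically reach for Xdebug or XHProf. As a generic illustration (in Python rather than PHP, with a made-up hot function standing in for an expensive endpoint), the standard-library profiler can be driven programmatically:

```python
import cProfile
import io
import pstats

def build_listing():
    # Stand-in for an expensive endpoint; in your app the profiler
    # attributes time to whatever the real hot spots turn out to be.
    return sum(i * i for i in range(200_000))

profiler = cProfile.Profile()
profiler.enable()
build_listing()
profiler.disable()

# Show the five most expensive calls, sorted by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
print(report)
```

The report ranks functions by where the time actually went, which is exactly the question the APM alone can't answer.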
And so once you have found an issue, you've debugged it, and you think you've fixed it: verify it. Measure it again and make sure your server actually got faster. That may be the biggest thing that I'm going to share with you today.

Next principle: understand your trade-offs. Every time we make a decision in engineering, we must understand that there's a trade-off. Sometimes the trade-offs are insignificant. Sometimes they're very, very significant, and you regret it later and you don't want anyone to find out what you did. So what I'm endorsing here is to spot-fix performance problems and treat those spot fixes as exceptions, not the rule. This is very important. In other words, if you find a particular piece of your application that is extremely performance-sensitive, fix that spot. Whatever things you do there to make it work, to get it under a number, let's say you must get it under 50 milliseconds: do disgusting, nasty things, but keep them there and keep them contained. Make sure that everyone is aware that the cart endpoint is nasty. Be careful touching it. There are lots of tests around it. It's not going to make sense when you look at it, but it's extremely fast and we like that. But again, that's not the rule.

And here, who is a Simpsons fan? Anybody might have heard of this show? I have an image of old Simpsons Homer and his self-designed car, which is a lovely green pistachio color and has all kinds of bells and whistles on it. That's what happens when you do not make engineering trade-offs. You don't understand what the problems are and what the solution should be, you try to please everybody, you end up pleasing nobody, and your solution costs $80,000.

So, next principle: limit the total work done. In other words, when you do work, do it as rarely as you can. Do not do it over again. You're going to want to reuse computation whenever possible. So, cache everything. Cache SQL queries. Cache JSON blobs. Cache documents. Cache entire HTML results. Cache everything.
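All of those are the same cache-aside idea underneath. A minimal sketch in Python, using an in-process dict with a TTL as a stand-in for memcached (the key names and TTL are illustrative, not ours):

```python
import time

_cache: dict = {}  # key -> (expires_at, value); stand-in for memcached

def cached(key, ttl_seconds, compute):
    """Return a cached value, recomputing only after the TTL lapses."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit is not None and hit[0] > now:
        return hit[1]  # still fresh: the expensive work is skipped
    value = compute()  # expensive work happens at most once per TTL
    _cache[key] = (now + ttl_seconds, value)
    return value

# The "expensive" computation runs once; the second lookup is a cache hit.
calls = []
result1 = cached("events:today", 300, lambda: calls.append(1) or {"events": [1, 2, 3]})
result2 = cached("events:today", 300, lambda: calls.append(1) or {"events": [1, 2, 3]})
```

The same shape works whether the value is a SQL result, a JSON blob, or a whole rendered page; only the key and the TTL change.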
Throw things into memcached and set a sensible lifespan on them. Reuse things everywhere you can. And this is not going to be informed by technical constraints; it's going to be informed by your application. You must know what is safe to cache and when, and when to show someone a stale result versus a fresh one.

Along those lines, use a caching proxy. Everyone in here should be using a caching proxy. Who is not using a caching proxy? I see some heads. We use Varnish. Varnish is the single greatest thing that saved our website. Thank you, Travis Hansen. And the reason is that we were spending lots of CPU time redoing expensive calls on hautelook.com. A given events call might take 6 or 7 seconds and about 300 queries to traverse the taxonomy and generate this giant JSON blob. And we only need to regenerate it when events change, which, as we said, is 8 a.m., 1 p.m., and sometimes 4 p.m. So, do you think our database knows when that's going to expire? Yes. Do you think our API can set a caching header for Varnish to know when to expire it? Double yes. If you include Varnish as part of your development life cycle, if you include this concept of caching everything you can, you're going to end up with a significantly reduced need for capacity, because Varnish will be doing most of the heavy lifting.

And finally, use a content delivery network. EdgeCast, Akamai, whatever you choose to use. This is significantly important because you will not be re-calculating all of the work that you've already done. For the same reasons that you want to use Varnish, use a CDN, and it will actually make your members' experience better due to geolocation of caches and the other nice features they give you for the arm and the leg that they're going to charge you, and the virgin blood, and your firstborn, and so on.

Next principle. This is pretty big, and it ties into the previous concept: choose technologies that are designed to scale. Not everything is designed to scale.
Some things, like we said, are a trade-off: they're easy to write software in, but they're not designed to scale. Conversely, some things are very challenging to write in, and they scale very nicely. In this case, I'm endorsing the use of things that might be trickier but are designed for scale. And I know that a lot of developers don't like to think in terms of Varnish when they're writing their APIs, but Varnish is designed for scale. It actually prevents the stampeding-herd problem, which I'm not sure everybody knows about: if you have 1,000 clients, or N clients, connecting to your Varnish server and they're all requesting one resource, which usually is a URL, then Varnish will consolidate those requests, make them wait, and fire off one backend call to your API server. It will take that result and send it back to those 1,000 clients, and they're none the wiser. So you just got 1,000 times more servers for free, and it's going to cache the result for you, so you might not actually revisit that backend for 5, 10, 15, 20 minutes.

And this is probably the most controversial part. I've been talking about PHP so far, but you might want to rewrite expensive pieces of your app with more appropriate technologies. For us, that meant our search implementation, which was written in PHP, using Doctrine, using Symfony, and was very, very slow. Fortunately, it's a fairly isolated endpoint that doesn't impact a lot of other things. So we rewrote the entire thing in Scala, and it is significantly faster, it is significantly more scalable, and we use significantly fewer servers. I'm not sure how many more times I can say significantly, but it is important.

So those are the principles of performance. These are things that you want to pepper into your designs and stack-rank, of course. Where it gets a little less negotiable are the three performance commandments.
And again, these are things that you must do, regardless of how you choose to tackle your performance problem, regardless of what it is.

The first performance commandment: know thy system. What do I mean when I say know thy system? It means that your app is running on something. What is it? Whether it's Windows, Linux, or FreeBSD, it doesn't matter. Know it well. Know it inside and out. Have an expert on staff who can help you debug these problems. Because if you're building on a shaky foundation, you're not going to get anywhere.

Second commandment: know thy runtime. Know it exceptionally well, whether it's PHP, Java, Ruby, or Python. Understand what its limitations are, and understand what you can do about those limitations. Know what your frameworks are good at. Know what they're actually really bad at. And these are areas where you cannot have strong opinions strongly held. They must be strong opinions weakly held, because you will be challenging your assumptions regularly.

Third commandment: know thy application. This is something that no one else can do for you. When I say know thy application, that means understand who is using it. Understand what their behavior patterns are. Understand what that means for your application. Are carts more expensive than looking at events? Are product pages particularly challenging because there are so many of them? You need to know how the application is being used before you can possibly debug and appropriately act on a performance problem.

So, first commandment: know thy system. In our case, we're using Linux: CentOS 6 and 7. And we're very happy with it. This next part is a little controversial. I know we're talking about performance here, not manageability; again, we're talking about trade-offs. Choose bare metal wherever possible. This is really, really important, and I know it rubs some people the wrong way. Don't use containers. Don't use virtual machines. Use bare metal.
This is the absolute fastest way to run your software. You don't want anything between your runtime and the metal. This is for performance. This is for when you have Black Friday coming up and you have no other options. You won't like this option, but you're going to install PHP on all of your virtual machine hosts and run PHP right on the box.

Second, for know thy system: tune the kernel. The kernel by default is very conservative and gentle with your hardware. It thinks your hardware is brittle. We know better. Get aggressive with the kernel tuning.

I'd like to take a break here to tell you a short story. When I started at HauteLook, we had a big, big problem with connecting to our MySQL database. We got lots of PHP errors to the tune of "MySQL server has gone away", connections dropped, stuff like that. And naturally, all of our investigation went straight to the server box. We're tuning, we're changing parameters in the kernel, we're trying to figure out why this thing is dropping connections like crazy, right? Turns out, the call was coming from inside the house the entire time. The problem was never on the servers; it was actually on the clients. When PHP connects to MySQL under a default kernel, it will use up all of the ephemeral ports it has available. And when PHP tries to make a new connection to run another SQL query, it's actually going to get a timeout. That's where you get the "MySQL server has gone away" error. I'd like to thank Travis Hansen, back there, for figuring out what the hell was going on. What we ended up doing was setting a kernel flag called net.ipv4.tcp_tw_reuse to 1. That tells the kernel to reuse sockets sitting in TIME_WAIT instead of burning through new ephemeral ports. All of a sudden, problem solved, and the website is so much faster. Funny how that works.

So, that's just one value. If you want to tackle all of these values at once, you're going to want to use a tool called tuned for sensible defaults.
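Applied by hand, that one-line fix looks something like this (the sysctl key is the standard Linux one; the drop-in file name is just a conventional choice):

```shell
# Enable reuse of TIME_WAIT sockets for outgoing connections (takes
# effect immediately, requires root).
sysctl -w net.ipv4.tcp_tw_reuse=1

# Persist it across reboots.
echo 'net.ipv4.tcp_tw_reuse = 1' >> /etc/sysctl.d/90-tcp-tuning.conf
```

Note this only matters on the client side of high-churn connections, which is exactly where our problem turned out to live.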
tuned will give you a profile that you can apply, and it ships with a number of them, like virtual-guest and virtual-host. You're going to want latency-performance. That's very, very important: latency is what you care about here, not throughput or other concerns.

So, I'm going to skip some stuff here. Let's go to knowing your runtime. In our case, it's PHP. I've had people come up and tell me, "I didn't think PHP could be fast." You are right. It can't. PHP is very, very slow, because it does a lot of stuff. The first thing you want to do to make PHP faster is use recent versions. In some cases, in some workloads, you'll see that a newer version is actually twice as fast. So if you're on a legacy version of PHP, this is the absolute first thing you want to do: get that PHP upgraded. Even 5.5 to 5.6 is going to be good. 7 should blow us all away. We shall see.

Second, about knowing your runtime: frameworks are inherently slow. They speed development, but they hog resources. In particular here, we're talking about Doctrine. Doctrine, for the uninitiated, is an ORM that is commonly used in PHP shops. It is actually the single biggest consumer of CPU anywhere in our environment. Further along those lines, PHP eats CPU. Eats it. You heard it here first. It eats lots of it. So, what you're going to want to do when you're performance-sizing with PHP is think in terms of concurrency, not in terms of throughput. Throughput is a post-facto measure of what your server did; concurrency is what it is doing at any given time. So, I'm going to give you the ultimate performance-tuning formula for PHP. Take the number of physical CPUs on your host. Divide by the average response time in seconds. What that equals is the total number of PHP workers you can run on that box at any given time.
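As arithmetic, the formula as given in the talk looks like this (the core count and response time below are made-up example numbers, not HauteLook's):

```python
def max_php_workers(physical_cpus: int, avg_response_seconds: float) -> int:
    """Worker cap from the talk's formula: CPUs / average response time."""
    return int(physical_cpus / avg_response_seconds)

# e.g. a 16-core box whose requests average 200 ms:
workers = max_php_workers(16, 0.2)
print(workers)  # 80
```

In php-fpm terms, the result is what you would feed into `pm.max_children`; the point is that the cap falls out of measured response time, not out of how much traffic you wish the box could take.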
What that means is that every single request that comes in and wants CPU time will get CPU time. It will not wait. Because if you have a ton of requests coming in beyond that, it will immediately start to snowball, and the workers will end up stealing CPU from each other. And this is not even in a virtualized environment; this is on a physical host. You will actually have the PHP processes stealing CPU from each other. So if you do all this math and you're like, "wow, that doesn't seem like a lot," then you need more hardware than you have. That is how you capacity-plan.

So, finally: knowing your application. I alluded to this earlier, but you want to focus on the parts that make you money. The parts that make you money are the most important parts. So if you have a performance problem with the happy path in your application, focus on that first. The critical path must be fast. And I'll close with, I think, the finest example of product-focused engineering that we've had at HauteLook. As part of knowing your application, take cues from user behavior. I have for you a tale of too many carts.

I've mentioned carts a couple of times because they are very expensive for us; Doctrine is doing a lot. And we realized that carts were actually the lion's share of our CPU time. About 40% was spent just looking at empty carts. So the engineers are digging into the code, trying to figure out what's going on with PHP, and our VP, who is actually not here today, says: all right, hold up, step back. What do we know about the members? What do we know about how they use the cart? And it turns out we didn't know anything about how they use the cart. As it turns out, what the numbers look like is this: of 100% of the members that visit the site, only 5% have ever added to cart in their entire HauteLook career. And of those, only 10% have ever added to cart again.
So, what we decided to do is send back a cookie that says, "do not look at my cart." That goes to the web client and the mobile client, and every time you add to cart, we clear the cookie. So what does that mean? It means we've essentially gotten a 20-fold performance improvement for free, because we're not doing as many cart calls. The people that are shopping and never adding to cart, the window shoppers, have a faster experience, because they're not waiting on an empty-cart call that doesn't do them any good. And the people that are adding to cart actually have a faster time adding to cart, shopping, and checking out, because they're not competing with the people that are just window shopping. With that, we were able to drop our average response time by about 30% or 40%, which is significant after you bundle in all of the other stuff we'd already done.

So, that's it for me. I'd like to open it up to any questions you have. I have a slide up here for you: you can reach me at Joel E. Salas, that's on Twitter, that's on Gmail, that's on GitHub. So, any questions?

So, the question is: we've been talking about performance, but what about scalability? How do we scale the hardware out? That's actually some of the stuff that I had planned to talk about. What we do is, again, use tuned to set the appropriate scheduler and elevator settings. We also tune our file systems very, very aggressively. PHP likes to talk to your disk; we actually do that 100 to 200 times per web call. It'll open 100 to 200 files. And if you have access-time updates enabled on your file system, you're going to be writing to your files constantly, 100 times per call, and all you're doing is updating a timestamp. So, what we do with our mount options on ext4 is: noatime, we disable write barriers, and we set the commit interval to 600 seconds. So, for up to 600 seconds, there's volatile data that might be lost. But, again, we don't like to write to disk.
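As an /etc/fstab sketch, those options together look roughly like this (the device and mount point are placeholders, not our actual layout):

```shell
# /etc/fstab entry, illustrative only:
# <device>   <mount>   <type>  <options>                              <dump> <pass>
/dev/sda3    /var/www  ext4    defaults,noatime,nobarrier,commit=600  0      2
```

noatime kills the timestamp writes, nobarrier drops the write-barrier flushes, and commit=600 lets the journal batch up to ten minutes of metadata. That trade only makes sense because, as I said, the data that matters lives elsewhere.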
So, most of the data that matters is not on that box. That's how you get the same hardware to do more, basically. Again, don't virtualize. Yes, sir?

So, the question was: when we re-implemented our search in Scala, did we consider any of the existing options? Our search implementation is actually leveraging SolrCloud. The Scala portion is the part where we take our gigantic, best-in-class product index and give you nicely filtered results based on brands, colors, and the other facets that we add. That's what our API is doing, but we are using SolrCloud heavily. Yes?

So, the question was: who in the organization actually initiated the idea of tackling performance? I'd like to say that it was the engineering group, but it really came from the product team, from the leadership, and from the members. They said: we need this to do better. And we decided to tackle that as a problem. But, like I said, intrinsically, we as engineers want to design things that are fast, that are efficient, that are well designed, that we can be proud of. So, really, I don't think it was a very difficult conversation to say, we need to tackle this problem. There was a bit of pushback along the lines of "performance is not a feature, it's not worth engineering time." But we got to a point where there was an inflection, and the dollar figures made the case themselves.

All right. Okay. I think that's my time. Thank you. Appreciate it.