I know you all love PHP. How many people use PHP? Excellent, that's good, so this is applicable to a lot of you. I'm going to talk about building large-scale applications with PHP. There are some challenges when trying to scale PHP, and I'll talk about those briefly. Then I'll cover the basics of compiling PHP to C++ using HipHop, and then some experiments with Threading Building Blocks that colleagues at Comparalel have done.
Obviously PHP has a lot of benefits. It's very widely available, lots of people know how to use it, and there's lots of software available in it. But then you have problems with its runtime. For a lot of parallelisation problems you want to be able to use threads, and the runtime has no support for that. That's a shame, because a lot of effort has already gone into making it thread safe internally so that it can run, for instance, inside Apache with its threaded MPM.
And obviously, when you've got scaling problems, there are other languages you might want to use. Maybe you decide some language is the one true language because it supports parallelising, or whatever it is. But the reality is that you can't just switch, because you've got programmers and you've got software that's already written in PHP. And you don't really want to have a big fight over which language to move to either.
Facebook had these problems as well, and they came up with this idea: if we're using PHP, we've already made all these compromises in our language. It doesn't let us do anything crazily powerful, so we could actually transform a quite reasonable subset of it to C++ and then just compile that. HipHop is a Facebook project to do just that. They use it for Facebook, so it's got to be at least partially successful. Internally, it's thread safe and it uses Intel's Threading Building Blocks for memory management.
Sorry, for memory allocations [unclear]. And to do that, they've actually had to reimplement a lot of the PHP extensions in a thread-safe way.
As an example of the difference between micro-optimising common functions and converting the whole thing to a C++ programme and compiling it, we can have a look at some Drupal figures from a group of people who took the Drupal core. First of all, they took a function which they thought was called lots and lots, implemented a C version of it, and graphed the results. The difference between the micro-optimised function and the main one is like the difference between these very slight steps. Whereas the ones I've circled here in red, with a very perfect circle, are the results from the HipHop version. So it really adds an order of... well, it's not really an order of magnitude. It's something like 30% over the version with all of the typical PHP acceleration things like APC added to it.
Do I know what function it was? I don't know. I could probably find it. It's just some small function that's called a lot. I'm not that familiar with the Drupal internals, no.
Another one, which a colleague, Lent, will be talking about soon, is doing this to WordPress. One of the things you quite often hear about parallelisation is, well, my problem doesn't suit parallelisation. In an abstract sense there are some problems that don't support parallelisation. But in a larger sense the question is: does my application have any parallelisable bits in it? And the answer, at least for WordPress, which you'd think of as a general blog with probably not much that's parallelisable, was that there was enough there that could be split off into parallel loops.
Even with only a small amount of investigation of the code base, the colleague found several loops where each iteration didn't depend on the one before it, so he could use a parallel for keyword. Threading Building Blocks knows how many processors there are on the system, so it's pretty good about setting up worker threads and leaving them around so we can use them quickly. And it's not really that slow to start a new thread anyway. I think on Linux it's something like 10,000 or 20,000 cycles typically, and a teardown can be a bit more depending on how many kernel objects there are, but even small loops can sometimes benefit from a little bit of threading. And TBB gives you lots of keywords to make that easy. So we've had positive results with that, and there's a white paper that Slints will elaborate on later.
So my point is that it's not just a narrow bunch of specialist tasks, like weather modelling, or breaking SHA-1, or Bitcoin, that can benefit from threading. Quite a few things can, as long as you're using a runtime which can do it very quickly. In C++ you can create and tear down threads very quickly. If you're using an interpreter it's a lot different, because there's lots of state to worry about to implement each thread, so they don't typically work so well. But HipHop is quite an attractive target for trying threading to accelerate lots of different applications.
So I've probably run out of my five minutes there, but has anyone got any comments or anything to say about that?
Can you still run plain PHP for testing purposes when you introduce this parallel for?
Right, so the question is whether you can still run regular PHP if you're using a parallel for. Well, with parallel for it's very easy. If the threading isn't there you just use a regular for loop, and in general a lot of the API is like that.
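To make that fallback concrete, here is a minimal sketch in C++ of a parallel for that degrades to a regular for loop. It is an illustration only: `parallel_for_sketch` and `squares_sketch` are hypothetical names invented for this example, it uses plain `std::thread` rather than TBB's own API, and it starts fresh threads on every call instead of keeping worker threads around as the talk describes. It is only valid when no iteration depends on another.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not TBB's API): calls body(i) for every i in [begin, end).
// Assumes iterations are independent, i.e. no iteration reads what another writes.
void parallel_for_sketch(std::size_t begin, std::size_t end,
                         const std::function<void(std::size_t)>& body,
                         bool use_threads = true) {
    assert(begin <= end);
    const std::size_t count = end - begin;
    if (count == 0) return;

    // hardware_concurrency() may return 0 when the value is unknown,
    // so clamp the worker count to at least 1 and at most one per iteration.
    std::size_t workers = use_threads ? std::thread::hardware_concurrency() : 1;
    workers = std::max<std::size_t>(1, std::min(workers, count));

    if (workers == 1) {
        // Fallback: threading is not available or not wanted,
        // so this is just a regular for loop.
        for (std::size_t i = begin; i < end; ++i) body(i);
        return;
    }

    // Split the range into contiguous chunks, one per worker thread.
    const std::size_t chunk = (count + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t lo = begin + w * chunk;
        if (lo >= end) break;
        const std::size_t hi = std::min(end, lo + chunk);
        pool.emplace_back([lo, hi, &body] {
            for (std::size_t i = lo; i < hi; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();
}

// Hypothetical usage: each iteration writes only its own element,
// so the loop gives the same result with or without threads.
std::vector<long> squares_sketch(std::size_t n, bool use_threads) {
    std::vector<long> out(n);
    parallel_for_sketch(0, n, [&out](std::size_t i) {
        out[i] = static_cast<long>(i) * static_cast<long>(i);
    }, use_threads);
    return out;
}
```

The calling code has the same shape either way; only the `use_threads` flag changes, which is the property the answer above relies on for testing.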
The idea is that you've identified something like a sort or a map or a filter, you think, I'm going to try threading that part and see what performance gains I get, and you don't have to change the structure of the code any further. The more hardcore stuff, like actually creating new threads that do their own thing and communicate with each other, is outside the scope of what we're trying to do. It's quite difficult to work with that kind of code, and you possibly don't want to do that with the kind of applications you're using PHP for, which are everyday applications.
Okay, just a quick one. This threading system you're using, how applicable is it to other CPU types? Is it specific to, say, Intel's particular CPUs, or can I use it on AMD, or is it open source so I can run it on this thing, which is MIPS?
Right, yeah, so if you use the library with any of Intel's competitors' processors it actually melts them. No, actually they haven't really done anything specific to Intel processors at all. The only thing really is that Intel processors have hyperthreading and most other processors don't really do that. Intel processors are better at threading in that, for the same amount of die space, they usually run more threads, though maybe not compared to something like [unclear]. So they haven't hobbled it in any way. Under the hood it's just POSIX threads, pthreads. The only thing they've done is make a nice threading API so that you can use the advantages of threading, and it's a GPL library, actually.
And one thing to add to what was just mentioned: the TBB library itself has back ends for different processor architectures, and there are actually a lot of ports to different architectures. There are some currently unsupported ones which would be interesting, like ARM, but a lot of others are already supported.
Right, so thank you very much, Sam. And thank you guys for paying attention. On Friday we will present in full the paper that was presented at the conference, with all the details and code, and we have put out two white papers. It's on Friday at half past 11 or something like that. Please check the time on the website, because it's wrongly printed here; it doesn't appear. So, thanks, Sam.