Hi everyone. Thank you for joining us today. I'm Ori, a performance engineer, and I'm very pleased to introduce Brett Simmers, who's a software engineer at Facebook. Brett has been with the HHVM team for three years, from the early days of the project, and we're very fortunate to have him lend his expertise to the migration process. He's going to be talking about what HHVM is and what it means for you as a developer. So without further ado, passing it to Brett.

So before I get started, who here has not heard of HHVM? Okay, good. So I'll be doing a quick introduction, just the history of the project and where we are now.

When you hear about HHVM, you might confuse it with HPHPC, also known as HipHop for PHP. This was a project that was started back in 2009, when Facebook was adding users faster than we could add servers to support those users, and so a few of our engineers decided to try transforming our PHP code into C++ with the idea that it would execute faster. That ended up being pretty successful.
We saw a 2-5x performance increase over standard PHP, and it was used from about mid 2009 until February of last year, when it was retired and replaced by HHVM. It transformed all the PHP into C++ and then built that into a normal C++ program with a built-in web server. A lot of the performance gains came from static type inference: if you had a loop with a counter variable named i, it would usually be able to figure out that i was always an integer, and so in the C++ that we generated from the PHP, i's type would just be int, and that would be a lot faster than operating on a generic PHP value. That's where a lot of the wins came from.

HPHPC probably saved the company, because we couldn't physically install servers fast enough to keep up with the user growth. But there were a few problems with it, and so I think around mid to late 2010 a few engineers started coming up with ideas for replacing it.

The main problem we wanted to solve was that we had separate development and production environments. Originally, when we first rolled out HPHPC, we could run the site on HPHPC or on the standard Zend PHP engine, and this was fine because people would do their development using the standard PHP interpreter and then for production we would build using HPHPC and push that. But after a while we decided that it would be worthwhile to start adding features to the language. A few of the things we added are generators, more type hints, and a few other things. As soon as we added features to the language, obviously we added support for those to HPHPC, but it didn't seem reasonable to add support for those to the standard PHP interpreter, and so we ended up creating our own interpreter called HPHPI, and that's what developers used in their development environment. This worked great, but unfortunately it introduced a new class of bugs where, because we had these two completely different execution engines between development and
production, there could be behavioral differences between the two. Somebody would test their code in their sandbox, it would work great, and then it would get deployed to production and it wouldn't behave as they expected. So we wanted a way to fix this.

The other problem was that the gains we were getting from the ahead-of-time type inference were starting to wear pretty thin. Because PHP is a dynamic language, we need to be able to support the cases where 99.9% of the time a variable is an integer but 0.1% of the time it's a string, and in those cases HPHPC wasn't able to statically prove that it's always an integer. So we would have to fall back to the generic value type, which we call Variant, and that was just the representation of a generic PHP value. As you may guess, the operations on that were pretty slow.

So we came up with HHVM. It initially started as a project within the same code base. We shared large parts of the runtime: we shared the same code for operating on arrays, we shared a lot of the same code for creating objects, and other stuff like that. And we were able to take advantage of those cases where the type was one way 99% of the time. I'll get into that in a little more detail soon, but it was able to make up for the fact that in those cases where we couldn't statically infer the type, in practice it was one type most of the time.

This hasn't been much of a problem recently, but one of the biggest points of confusion about HHVM is how it relates to HPHPC. HHVM is written in C++, but it does not compile PHP to C++. So we got rid of the required build step. There is an optional ahead-of-time compilation step that I'll talk about later that you can use for extra performance, but it's completely optional, because we want to get as close as we can to being a drop-in replacement for PHP 5. The development and production
environments are almost exactly the same with HHVM, solving one of the big problems of HPHPC, and I'll talk about the "almost" in a little bit too. We fully support eval and all the other dynamic features of the language, like accessing a variable using a string name and other stuff like that. A lot of them are going to be really slow, but we do support them.

Okay, so I'll just quickly show an example of compiling some code using HHVM. If you have this PHP statement that just multiplies a number by four and adds one, the first step is compiling the PHP into bytecode. HHVM is a stack-based machine, so all the inputs and outputs to these operations are implicit based on the operation. The first one up at the top just pushes the integer four onto the stack, the next one loads local zero onto the stack, which is n in this case, and then we multiply the two, and then we push one on the stack, we add, and then we store the result back into n.

So this is the bytecode, and we have an interpreter that can run the bytecode. You've probably noticed that apart from the integer bytecodes there aren't any types mentioned here, and that's because most of the bytecodes are type agnostic: they can operate on any type. This is good because it means it's flexible and the transformation from PHP to bytecode is pretty straightforward, but unfortunately it does mean that the interpreter is kind of slow. I think it's a little bit slower than the standard PHP interpreter.

So to get our performance back we compile it down to x86 machine code, and we have an intermediate step called HHIR, or the HipHop Intermediate Representation. Earlier I mentioned that we try to take advantage of those cases where we know the type 99% of the time but we can't statically prove that it's always a certain type. The way we deal with that is that at the beginning of each chunk of code, which we call a tracelet, we guard the types of values that we care about. So in this case we start by guarding that the first
local is an integer, and then we can emit the rest of the code assuming that it is an integer, because we know that if at runtime it turns out to not be an integer, the type check will fail and we'll go to the fallback mechanism. Then the rest is pretty straightforward: we just load the value, and then we multiply t1 by 4, and then we add 1 to it, and we store the result back to local zero. The t1, t2, t3 are just anonymous values. They roughly represent what you would see on the stack in the interpreter, but there isn't an exact correspondence.

Okay, so then this is what we end up with for machine code. If this is too detailed for any of you, it'll be over soon, and the details aren't terribly important. The first two instructions just compare the type tag for integer, which is 0xA, with the type tag for local zero, which is at that offset off of RBP. So the first two lines check the type of local zero, and if it's not an integer it jumps to a stub that retranslates this code for whatever the type happens to be. So if we get to this piece of code and n happens to be a double or a string instead of an integer, then it'll jump to the retranslate stub and it will go through the whole translation process again, and it'll generate code that's pretty similar to what you see on the left there, except all the types will be double or string instead of int. Then the next instruction loads the value from the local, and the rest are pretty straightforward: we just multiply by 4, and then instead of adding 1 we just use the increment instruction, and then finally we store it back into the local.

Okay, so that was the very, very compressed version of what HHVM is. If you have questions about it at the end I'll be happy to answer them. But the most important part of this talk is what you're getting yourself into by migrating to HHVM, and the most important part, which I mentioned earlier, is that we want to be a drop-in replacement for PHP 5, or as close as we can get. So if you have some
PHP that you wrote and tested using PHP 5 and it behaves differently using HHVM, it's almost certainly an HHVM bug, and there are a bunch of ways to get in contact with us that I'll talk about in a little bit. Unfortunately there are some places where we intentionally diverge, and in our GitHub repository there's a file that lists all of these. These are mostly things that we decided weren't worth getting right because it would hurt performance too much, or the original decision was just so insane that we didn't think we should go along with it.

And we don't have a built-in web server. Early versions of HHVM did, and HPHPC did have its own built-in web server, but we noticed that a lot of the questions we were getting in our IRC channel and on GitHub were related to configuring the web server, and since we're mostly compiler writers, not web server developers, we decided to try to get out of the web server business. So now you can just use HHVM with FastCGI behind Apache or nginx or whatever your current web server is.

The two to five x performance increase that I mentioned earlier was running facebook.com, because that's the code base that we use to build HHVM and tune it. With other things like WordPress and MediaWiki and SugarCRM and other popular PHP packages, you'll typically see a one and a half to three x performance improvement. And then the main weakness of HHVM that you'll probably run into at some point is that we have some high startup costs for a few different reasons, and I'll get into that later.

Okay, so if you run into any problems you can just go to HHVM.com. That's our main website and it has links to all the rest of these, but the most important ones to remember are: you can file a GitHub issue, which is the best way of reporting a problem to us, and then we have a few Facebook groups that you can post in for help. They're kind of like mailing lists, but on Facebook. And then we also have some IRC channels on FreeNode.
#hhvm is for general use of HHVM, and #hhvm-dev is if you're interested in developing HHVM or if you're trying to debug a really tricky problem.

Okay, so the performance story is kind of interesting, because as I mentioned we've added some features to the PHP language, and so we can't compare the performance of facebook.com on PHP 5.6 versus HHVM, because we use a whole bunch of language features that are only in HHVM. We're trying to get as many of these features as we can into upstream PHP, and we've succeeded with a bunch of them. I think the generator implementation in current PHP is mostly compatible with what we do, and I think they're looking at adding support for type hints on non-object types. But until PHP supports everything we've added to the language, we just can't run Facebook using normal PHP. So the 2 to 5x number that I mentioned earlier is kind of a guess based on what the performance looked like at the last point that we could run Facebook using normal PHP and the relative gains we've seen since then.

Of course we can always use microbenchmarks, and it's fun, and there are lots of microbenchmarks where we're 10 or 20 times faster than standard PHP, but in the real world they really don't matter. You can make a microbenchmark two times faster without affecting the performance of a big website like Facebook or MediaWiki. So yeah, they're fun to look at but they usually don't mean that much.

The biggest non-Facebook package that we can run reliably and that people have tested pretty heavily is WordPress, and I think the results on that are that we're generally two to three times faster. For Wikimedia I don't have any numbers for the full site, but we can parse the page for Barack Obama in 40% of the time that it takes PHP 5, which is pretty encouraging.

And then, repo authoritative mode. You might remember a while ago I mentioned that although we don't have the mandatory compilation step from PHP to C++ and then to a binary, we do have
an optional ahead-of-time compilation step. I think the plan is to not do this right away, but assuming you do get fully on HHVM, once you are there this is a really good thing to look at. At least on facebook.com we generally see a 30 to 40 percent performance improvement on top of the gains that you already get from HHVM. Most of the gains come from ahead-of-time type inference that is pretty similar to what HPHPC did, and even though we can't infer all the cases that matter, we can infer a lot of them. So if we can statically prove that a certain variable or a certain object property is always of a certain type, then we don't have to guard on that at runtime, and that can save a lot of time.

It also does other things globally that we can't do in the JIT when we're working within a function. It can globally propagate constants, and it can tell us when a function call can be statically bound, because in PHP you can define a function to mean any number of different things: you can either include a different file or you can define the function conditionally in the body of an if statement. We want to handle all that correctly, but we also want to handle it efficiently in the common case where there is only one definition for a specific function or a specific class.

The main difference if you do decide to use repo authoritative mode is that instead of deploying all your PHP files like you normally would, you'll deploy a bytecode repo, which is just a SQLite database containing bytecode that's been parsed and optimized and then serialized into this database, and you just push that out with your binary. It means that when the server is running it doesn't have to check to see if the PHP files have changed between each request, and because it contains bytecode instead of PHP, it can contain all sorts of useful metadata about types that were inferred statically. The main disadvantage of this, apart from the additional operations overhead, is
that eval is not supported. I'm not sure if you actually use it much in production, but it is theoretically possible to support a limited version of eval, and we've been thinking about doing that. So if you wanted to just evaluate a math expression or something like that, then that should be theoretically possible, but for now it's just unconditionally banned. And although eval is not supported, most of the other dynamic features of the language are, so you can still use dynamic object properties and you can still access local variables using a string name. Those are all supported; they're just going to be really slow, because when something like that happens the static analysis will mostly give up on that part of the function, because it's too tricky to get that right, and when we're compiling it in the JIT it'll also give up on certain optimizations.

Okay, so now the bad stuff. I mentioned that there's some startup overhead that you might need to deal with, and there are two main types, and I think this is probably one of the most commonly misunderstood parts of HHVM. The two parts are: there's a bytecode cache and there's a JIT code cache.

The bytecode cache, which is the same format as the repo from repo authoritative mode, lives on disk. It's built up automatically and on demand, so if you just run HHVM on your PHP script it will read in the PHP, it will parse it, it will convert it to bytecode, and it will save that to disk. That means the first time you do that it'll be kind of slow to start up, but then anytime you run HHVM on the same PHP file again it can just read it out of the bytecode cache instead of parsing the file. Obviously if you change the file it's going to have to reparse it, but in a large code base where you have lots of different files, you're normally only going to be changing one file at a time or a small handful of files. So that's the bytecode cache: it's persistent, it lives on disk, and if you have your configuration set up correctly it can be shared between
different users.

The second cache is the JIT code cache, and if you remember the code example that I showed earlier, this is where that code lives. It's built up at runtime, it lives completely in memory, and it is rebuilt from scratch every time we start the process. That's pretty standard for a JIT; it's just something that a lot of people don't realize when they're running HHVM. And it is pretty expensive to generate: generally, compiling a piece of code will take 50 to 100 times longer than it takes to run that piece of code. So if you have a big script that runs maybe a million lines of PHP but each of those lines is only run once, running with the JIT will actually be a net loss. That's why for small command line scripts and other short-lived processes we actually recommend running with the JIT off.

But a good way to work around that is that you can convert whatever your script is doing to run in server mode, and that way the process stays alive and you just send requests to it using FastCGI. I think Erin has already done that with the job runner, so that was the main weakness that we've already dealt with, and it's usually not that much of a problem. There are things that behave differently in command line mode and server mode, but there actually aren't that many, so the process of converting your workflow to server mode usually just means starting up a server and then hitting it with a request using the right file name. You'll usually only run into problems if you need to read from standard in or something else like that.

Okay, so that was all operations stuff that you might run into when you're deploying HHVM. But when you're actually writing PHP, there are a few things that you can do to make HHVM happy. Some of these are good for static analysis, some of them are good for the JIT, and some of them are good for both, and a lot of these are just good things to do in PHP in general. I don't think any of them are really that specific to HHVM. So the first
one is to declare your object properties. If you just use object properties without declaring them, then they end up going into a hash table in the object, and it's a lot slower than accessing the declared properties, which we can access using a fixed offset, kind of like a struct in C.

And this is not just for performance, it's also good for correctness: if you use type hints on your function parameters, then that will help static analysis and it will also help you catch some bugs. This is one thing that is more valuable in repo authoritative mode than normal mode.

Then this next line is kind of a generic "don't use the really dynamic parts of the language", like eval, or $$n to access a variable with a string name. Don't define a constant using a non-constant name. Don't use compact and extract. Those are things that either take the current local variable environment and put it in an array or, in the case of extract, take an array and explode that into the local variable environment of the current function. Extract is also really, really bad for security reasons, so you just shouldn't be using that in general.

And then define all your classes and functions at the top level. This goes back to what I said earlier about how in PHP you can define a function inside an if statement. You can say: if foo, then function x blah blah blah, else function x something else. We don't know which way that if statement is going to go, and so until we actually run the code, if someone calls x later on, we won't know which x they're actually going to call. One thing you can do to avoid this is define all your functions and classes at the top level, unconditionally, and if you give them all unique names then that will help a lot. If you're calling a function x and there's only one function x in the code base, we still have to make sure that you actually defined x, but we know that if it is defined it's going to be this one specific x.

And then the second to last bullet
point, which is keep as much code as possible inside functions. This is less important now than it used to be, but it's still pretty important. What that means is, if you just write a normal PHP script and have some code in the far left margin, just not inside a function or anything, we call that a pseudomain. We used to not run pseudomains through the JIT, for a couple different reasons that I can get into if someone is interested, but the important point here is that the type flow logic that we do in the JIT is much simpler to reason about inside a function than in these pseudomains. In large code bases this tends to happen naturally, so you probably have most of your code in object methods or in functions defined in a bunch of different files, but it's just something to keep in mind.

And then the final, most important point is that while you should keep all of these in mind, don't spend your time micro-optimizing your PHP code. The point of HHVM is to run whatever PHP you throw at it as quickly as we can. There are certain things that we're just never going to run quickly, like eval and extract; we're just never going to handle those very efficiently. But as long as you keep these general guidelines in mind, you shouldn't have to worry too much about "oh, how is HHVM going to compile this PHP", and you shouldn't have to worry about things like cache performance. That's mostly our job. And if you have a piece of PHP that you wrote in a straightforward manner and it's not running as fast as you think it should, then it's possible that that's just something we haven't optimized yet, and so if you file a GitHub issue or come ask us on IRC then we can try to optimize that.

Yeah? Is there a mode in which you get warnings about this, if you do all these things you advise against? I know there's hhvm -l. Will that warn if you're doing these things that are not ideal?

No, I don't think it will. I think the -l is mostly just to make sure that your syntax is
valid and a few other things, but that's actually a good idea. It wouldn't be that hard to warn on these. At least in the case of eval, if you try to use that in the production mode, or repo authoritative mode, it will fatal at runtime, but for the other things, no, we don't have warnings for those.

I have a question too. First I want to say there are 35 people watching on IRC, so we have about 60 people watching. But the question from IRC is: any worry about memory leaks in the job runner in server mode? Does the parser leak?

Okay, so there are two parts to that question. Because PHP uses reference counting for memory management, it's pretty easy to introduce a cycle into your objects, and in that case, if you have a PHP request that is running for a really long time and you're building up a bunch of objects that are stuck in cycles, then those objects will leak over time. But I think in the job runner that is being asked about, the requests don't actually run that long. They typically stop after about a minute, and so in that case all the memory that you allocate as a PHP developer is going to be freed at the end of the request. That's just because of the way we allocate memory in HHVM. So if you have 10,000 objects that are stuck in a cycle, they'll be there until the end of the request and there's no way to get rid of them, but at the end of the request they will be freed. So the short answer is that it depends on how long your requests are. If the request itself is going to go on for 10 or 20 minutes then you might have to be careful, but this shouldn't be different in HHVM.

And then the other, harder part of the question is: if you have a custom extension written for HHVM that is allocating memory from malloc, you do have to be careful to free that memory at the end of the request. We have a few different ways to help you with that, but if you're writing C++ code to extend HHVM, it is possible to introduce memory leaks.

So
you said something about conditionally defining functions and that kind of stuff. We do have some code, although less than I thought we did, where we go "if not function_exists something, then define something", because it comes in from an optionally compiled-in module or something and we cannot assume that it is installed on the platform. Is that covered by our list of things that are horrible?

It depends, but the short answer is yes. I heard a yes from the audience. So yeah, the really good answer to this will take a long time, but the short answer is that if you can avoid stuff like that, you should.

Right, so I guess the thing that you're sort of implying here is that mixed types are bad for performance. At least, I heard a rumor that mixed return types are really bad. Is that true, and why?

So I guess there are some contexts where mixed return types could be worse than in other cases. The most important thing is that in HHVM mixed return types are not as bad as they were in HPHPC, because in HPHPC you would have to fall back to the fully generic code path. But with HHVM, going back to this example, this code that we have on the right: if you have six different types flowing through this piece of code, we're going to spit out a version of this for each of those six different types, and then at runtime we'll select the correct one to run. So there is going to be the overhead of the type guards at the top failing until we get to the right one, but once we've made it past the type guards, which are generally pretty cheap, it's going to run this code that was emitted for those specific types. So while you should avoid mixed types when you can, you really shouldn't jump through any hoops to do that, because part of the power of PHP is that it is a dynamic language, and when it makes sense you can be a little bit loose with the types.

Talking about it in the context of return types, that could mess
up the static analysis a little bit. If you have a function x that always returns a string, then wherever we see a call to x we know that the return value is going to be a string and we can optimize based on that. But if one of the cases inside x returns an int, then I think for now we will fall back to saying that we don't know the return type of x. But again, once the call to x is finished, even though we don't know the return type, we're still going to go through the same procedure and emit the specialized machine code that we dispatch to at runtime.

Okay, another question. When HHVM gets really slow for dynamic features, does that mean slower than default PHP would run, or just slower than HHVM could run?

So based on some tests I've done with the interpreter, it should be about the same speed as standard PHP. I know that they've been doing a lot more performance work on their interpreter than we have, since we're mostly focusing on the JIT. So the short answer is it should be about the same as standard PHP, maybe a little bit slower, and that's only in the parts that use the dynamic features. If you use a dynamic object property in one part of your code base, it's not going to magically make the rest of the code slower.

There we go. What's the availability of profiling tools, as far as finding some of these hot spots actually in a production environment, so that we can go in and then tune them?

Yes. So you're talking about profiling the PHP code itself, right? Yeah, so we have a tool called XHProf, which is built into HHVM, and you can turn it on without any special compiler flags or anything. It will basically profile every function call, and then it gives you the data back. I think if you want it in a pretty format you'll have to do some post-processing on it yourself, but it'll give you pretty much what you'd expect from a profiler. We have a tool internally that will generate a call graph, and it'll show the inclusive and
exclusive time spent in each function, so that is available.

Nice. Is that something that can actually be run in a production environment, or would we sort of set up... Okay, awesome.

So the profiler is off by default, but it can be turned on by PHP code at runtime.

Well, that was all I had, so that was actually perfect timing with the first question. Is there another question? Up here.

Following up on Brian's profiling question, do you support the Xdebug protocol for inline debugging?

We do not, but we actually have an intern working on that right now, so if that goes well we should have support for that in a month or two.

Sort of a non-technical question, but could you share anything about the decision to open source HHVM, what motivated you to do it, and what your experience has been so far?

Sure. So HPHPC was originally open source. I think we rolled it out internally in 2009 and it was open sourced in the beginning of 2010, and I think back then the main reason was that there was no reason not to, and we thought the rest of the world could benefit from it. Because it was kind of hard to deploy, and there was this extra build step that a lot of people didn't want to go through, we didn't really see that much adoption of HPHPC externally. But with HHVM we saw that if we did things right we could build it as mostly a drop-in replacement for PHP. If you were watching the history of the project on GitHub, there was a period of a couple years where it was pretty quiet. Not much happened; every month or so we would do one big commit that had all the changes we had done in the past few months, which only just barely counts as open source. The code was available under an open source license, but there wasn't much support available externally and there wasn't really a community around it. So once it became clear that HHVM was going to successfully replace HPHPC, we decided that we would put a lot more effort into it. And yeah, I think the
main reason is that it's been pretty important to our success, but releasing it to the rest of the world isn't going to hurt us and it's going to help a lot of people.

You also asked about what it's been like. Paul, who was originally going to do this talk, would probably be a better person to answer that, but for me personally it's been a lot of fun. It's really nice to be able to say that what I work on is this big open source project that is gaining in popularity. We've gotten a lot of useful bug reports, we've gotten a lot of useful contributions from the community, and there are also benefits to the company. It's useful for recruiting: I know that a lot of people have said that their interest in working for Facebook was helped by the fact that we had this big open source project. I know that was kind of a long, rambling answer, but did it help? Okay.

I have a question. I'm a product manager, so I think a lot about users. How does the speed increase manifest for users?

So it should just be faster page loads. I know that most of your requests are served from Varnish and they're cached, so obviously it won't affect those, but once you have someone logged in and editing a page, it should hopefully mean that when they click preview or when they click save, that should be two to three times faster.

Sweet, which would be pretty neat.

So I guess just a political question. You're making extensions to the language. Are you tied to PHP-FIG or other standards groups, or is Facebook making these changes on their own?

So the original changes we made to the language, we just made on our own. All the discussions were done internally, and they were mostly driven by common pain points that we saw from our PHP developers. You might have seen that we announced recently that we've been working on a spec for the PHP language, and I think we've released a chapter of that as
a preview. The idea with that is that it's been something that the PHP community has wanted for a long time, but it hasn't really existed, and so the closest thing that there's been to a PHP spec has been the implementation of PHP posted at php.net. This version of the spec was written based on PHP 5.6, I think, and so it doesn't include any of our extra features. So what was the specific body you mentioned?
Oh, I mean, I'm kind of wondering if you were able to unmoor yourself from PHP-FIG's stagnation, but I'm not that familiar with it. I'm just wondering if there's a collaborative process to come up with more language extensions. Like, I know Hack is internal to Facebook. Is that true?
Hack is open now. I actually have a few slides about that that I forgot about, but I can talk about those if you're interested. But yeah, the early extensions we just did on our own because we just wanted to get them done so we could take advantage of them. Now that we're doing more of our development in the open, we are going to try and involve the rest of the community, and it's not going to be "look, here's a thing we did"; it's going to be "here's a thing that we think is a good idea, what do you guys think?" Hopefully we can come up with something that both HHVM and PHP can implement the same way.
Yeah, show us the slides about Hack.
Do you have one more question first?
In parts of the code we return sort of anonymous objects as just arrays. Presumably, for performance benefit, it's better to turn those into classes with specific declared members?
Yes, it probably will be. A lot of that depends on how much we're able to statically infer. If you have a function that always returns a specific object type and that object does have declared properties, then that will generally be faster than an array with string keys.
Did they estimate how many fewer servers they needed thanks to using HHVM as opposed to the standard Zend PHP?
Yeah, so that was what I had on one of the first slides. This 2-5x number is kind of an estimate, as I mentioned earlier, because we can't run Facebook using the standard PHP interpreter. Whenever we say something is like a 2x win or a 5x win, we're talking about CPU time used, and so generally if your application has a 2x win, that means you should need half as many servers. So that's a very, very rough answer.
Do you have a favorite HHVM-exclusive feature?
Probably generators and async functions. Should I go into what they are? Okay. Well, just quickly: if you use Python, you're probably familiar with generators, and async functions were actually originally implemented using generators. It's basically just a way of doing cooperative multithreading within your PHP program, and we use it to do non-blocking I/O internally, so talking to the database and memcache. It just makes the flow of your program a lot simpler. Instead of manually calling an asynchronous MySQL API and then passing around the handle and then waiting on it when the time comes, you just use the await keyword, which indicates "I care about this value," and then HHVM will suspend your function until the value is ready and then resume execution.
Okay. Speaking of HHVM-specific features, I heard about this feature that allegedly exists where there is something that helps you build HTML DOMs in PHP with native correct spacing and stuff like that.
Yeah, so that's XHP.
Okay.
I don't have any examples of it, but it basically lets you write what looks like raw XML in your PHP, and we end up transforming that into a tree of objects that you can then tell to render itself. It automatically escapes everything, so you don't have to worry about cross-site scripting attacks and stuff like that. That is built into HHVM. For it to be really useful, you need a core set of classes for things like links and
headers and stuff like that, and I think those are available on GitHub somewhere.
Okay, cool. Two questions. Is there a way to have HHVM warn if you're using these PHP extensions? Because for MediaWiki we still want to have our code just run on sort of default PHP.
Yes. I think by default these extensions are not on, and you have to turn them on either by converting your code to Hack, which I'll talk about, or by turning on a specific flag which enables the HHVM-specific extensions.
Great. My other question was: when you have a variable but it doesn't have a value until halfway through your function, where you give it a value, does that trigger having to handle multiple types, or does HHVM deal with something transitioning from undefined to being a string or whatever?
Right, so we should deal with that fine. All the static analysis that I mentioned is control-flow sensitive, so if you have a variable x that you set to a string halfway through your function, we will know statically that everywhere after that, x is a string. It's only if you have something like "if some function call, x equals a string, else x equals some integer" that after that we only know that it's either an int or a string.
Okay, so I'll quickly go through the Hack slides. For those of you that haven't heard of it, Hack is our internal version of PHP. Well, it used to be internal; it's now open source at hacklang.org. There have been a lot of misconceptions about it, but it's basically a safer, more sane subset of PHP with additional type annotations. It's statically typed, but you don't have to declare the type of everything. You do have to declare the types of your function parameters, your object properties, and function return types, but everything else is inferred. It's compatible with PHP because it's basically just a subset of PHP. If you're running HHVM, you can run Hack code and PHP, and they can call back and forth between each other, and
there's no runtime overhead. All the internal object representations are the same at runtime and all that. And yeah, if you know PHP, you won't have any trouble reading Hack. You might need a little bit of time to get used to writing Hack, but it's really not that bad.
So this is just a pretty quick example. The top is PHP and the bottom is that code after it's been converted to Hack. This is just a function that takes in a map, passes each item in the array through a sanitize function, and then sends that to the database. You'll notice a few changes. The first one is that the data parameter now has a type, and it's a map from string keys to string values. Map is one of our replacements for the PHP array. If you've been using PHP for a while, you're probably familiar with the many uses of PHP arrays and how these many uses are sometimes not great things to have in a single data structure. We noticed that in practice most PHP arrays were either a map or a vector-like array where the keys were just zero through n, and so we created a few classes to embody these specific use cases. There's Map; there's Vector, which is just like a C array; there's Pair, which just has two elements; and there's Set. Map can map from strings or integers to any other value, Set can hold strings or integers, and Vector can hold any type.
So that's what data is. We're just saying that previously data was always an array and it had string keys mapping to string values, and now we're just being more explicit about that. Then we create a new Map, and you'll notice that we've added support for these new collection classes to all the built-in language features, so you can still use foreach with your Map and you can still use the key-value syntax, and that all just works as you'd expect. Another important thing about this is that if you do pass data back into PHP code, Map is just an object, and so it's going to be passed around, and if PHP code was expecting an array and does a foreach, that'll just work as expected. Then we call sanitize, we put the result in the map, and then we send it to the database. So you notice that there really aren't that many changes here. We automatically inferred the type of the sanitized data, and we know the type of k and v. If you're familiar with C++, it's pretty similar to just declaring everything auto. The type checker knows the values that you're putting in these locals, and so there's no point in making you be redundant about it. Are there any questions on this part? Okay. But yeah, so this is Hack. It's not scary; it's not a brand new language to learn.
This is an example of what the type checker looks like. In this example, you notice that on the line that is underlined in red we forgot to specify a key for the map. If this had been normal PHP code, then that would just append the value to the end of the map using the next integer key that had not yet been used, but that's not what you want, since this is a map from string to string, so the type checker yells at you about that. That's just one example of the many common mistakes it can catch. If the sanitized data had been a Vector instead of a Map, then this would be just fine, because appending to the end of a vector is something that makes sense and it's something that people do pretty often.
So I already covered this: you only have to annotate a few things, and you might already have type hints on your functions, so there might not be that much to add. Everything else is inferred. These are the types that are available. All the normal PHP types: we have int, string, bool, double, classes, arrays. We support nullable types, so you can say this thing is an int or null, and then if you want to pass it to a function that takes an int, you have to check if it's null first. The type checker is smart enough to notice that in this branch of the if statement it's actually an int; it's not null. We support tuples, closures, collections. We also
support generics, and one thing that is a common source of confusion is that the generics don't mean anything at runtime. All of these type annotations are mostly for the benefit of the type checker. If I have a function foo that takes a type parameter, and you're coming from C++, then you're used to that function being duplicated for all the types that you give it, but in PHP that doesn't make sense, because PHP is dynamically typed and at runtime all the values are just mixed. So if you start getting fancy with your types using generics and constraints, just keep in mind that all of that is ignored at runtime, except for parameter type hints like the Map here, since those are verified. We also support verifying return type hints, but I think it's off by default. Yeah, so that's Hack. Any questions about Hack? That was the very, very compressed introduction; I didn't know how much time I had to go over it.
Not necessarily related to Hack, but I'm wondering if, with all the metadata you're now collecting, you have improved the performance of the reflection classes.
Yes. I think it depends on what specifically you're worried about with reflection. Are you asking about specific JIT optimizations to improve it, or...?
More that currently, when you run a reflection operation on a class, say for example finding where a function is defined, that's actually fairly slow in the current implementation in PHP.
Okay.
PHP goes back and looks at the original file and figures that kind of stuff out rather than just having the metadata pre-calculated.
So we don't have to do that. We do have all that metadata stored in memory. If you want to instantiate a new object with a string class name instead of using the actual class name, that's obviously going to be slower, but I would expect it to be much faster than having to go back and parse the file and figure
out that stuff is actually there. If there's something in reflection that you're using a lot in production and you've noticed that it is slow in HHVM, then you can come and ask us about it, and there's a pretty good chance that it's just slow because we haven't tried optimizing it yet.
Awesome. Thank you so much, Brett. Thanks Ori and rabla and Brett and Paul. I really appreciate you coming and talking to us about this.
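For reference, the await behavior described in the Q&A (a function is suspended until the value it cares about is ready, then resumed) can be sketched in Python, which the talk itself mentions as a familiar point of comparison. This is only an analogy built on Python's asyncio, not Hack or HHVM code. The names fetch, load_page and run_demo, the source labels, and the simulated delays are all assumptions made up for illustration.

```python
import asyncio


async def fetch(source, delay, log):
    """Hypothetical stand-in for a non-blocking database or memcache read."""
    log.append(f"start {source}")
    # await suspends this coroutine until the sleep completes, which
    # leaves the event loop free to run the other fetch in the meantime.
    await asyncio.sleep(delay)
    log.append(f"done {source}")
    return f"{source}-value"


async def load_page(log):
    # Both fetches are in flight at the same time; gather returns the
    # results in argument order, regardless of which one finished first.
    return await asyncio.gather(
        fetch("database", 0.02, log),
        fetch("memcache", 0.01, log),
    )


def run_demo():
    """Run both simulated reads and return (values, event log)."""
    log = []
    values = asyncio.run(load_page(log))
    return values, log


if __name__ == "__main__":
    values, log = run_demo()
    print(values)
    print(log)
```

The event log shows the slower read being suspended while the faster one completes, with no handles passed around by hand, which is the simplification in program flow that the answer describes.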