I see the room filling up. It's gonna be a popular session, so if you have a seat next to you, try to look friendly and get one of these folks in the back of the room to sit down. Otherwise the back of the room is gonna get very crowded in the next couple of minutes, and then I'll just awkwardly call you out and make you come up to the front. So Josh Miller's up here, there's a seat next to him, he's very friendly. I mean, they've clearly underestimated the drawing power of memory management in PHP and how incredibly popular it is. Okay. I think there's a single premium seat right in the very front here if anyone wants it. Be brave. I did a talk last year on social anxiety and stuff. This is not that one, but we'll still work through it anyway.

Okay, this is PHP and how memory works in it, basically. My name is Sean McCabe. I will be talking about how PHP handles memory: when you declare a variable, what does that mean? When you have an array or an object, what does that mean? How does that differ between versions of PHP? When does memory get released? It should be fairly entry level as long as you have some experience with programming languages, and it should give you the basics of what is happening with memory behind the scenes and what you are using, especially with something like Drupal, which is very memory intensive, and oftentimes that can be a big problem.

First of all, we'll talk about how memory works in PHP, or in most dynamic languages, compared to something that's more static like C or C++. In C you declare everything about the variable: you declare what type it is, how long the array is going to be, and nothing is dynamic. So you're not really storing a lot of that information attached to each variable. You don't need to know, is this a string, is it an object, is it an array. You know what it is: it's a string, or it's a character array, or it's an integer,
and so that generally makes it more lightweight. Whereas in a dynamic language like PHP it can be anything, and it can change: it can be an integer and then become a string, or it can become an object. That means there's a lot of data stored to keep track of that information as well. So if you're just storing the number one, you're actually storing more than a single piece of data. You're not using just 32 bits of data, or even an 8-bit small int or something like that. You have this more flexible amount of memory, but that means it's also more memory.

If you're doing something like an array, which we'll touch on a little bit more later, that is even more information, because the array is somewhat of a variable itself. It has information on itself, like how many items it holds, and then each item in the array is also a variable, so it has to know what type it is and all that same metadata, because it's not just an array of integers or something, it's an array of anything. You can see I sort of showed a little bit of that up there: in something like C or C++ you're basically just storing the data, and in PHP you are storing all this metadata as well. If you're storing big huge strings, that's probably a very small piece, but if you're storing lots of small chunks of data in a bunch of big arrays, you can have a lot of memory used up just in overhead versus in the data itself. So you have to think about more than just what you're storing, but how you're storing it.

So, like I said, a little bit more on arrays. That means if you put something in an array and you want to store a whole bunch of it, or loop through it, or something like that, that's not necessarily always a good thing to
do, because it's not a really cheap way to store something, whereas if you come from a different programming language, it is a cheap way to store something. In a different language you're storing a pointer to the first element and that's it, so you have almost no overhead whatsoever. Arrays, especially of a fixed size, are a really, really efficient way to do that, and in PHP it's pretty much the opposite: they give you no extra efficiency. I mean, that's not completely true. Through the Zend memory management stuff they do some tricks, but I've got some data that I'll show at the end, and in general you're gonna use a lot more memory for that. Almost no one knows this, but you can actually declare arrays in PHP that have a fixed length, and they are more memory efficient. You only use them when you really know the length of something, but they will actually save you some memory. I didn't link that in here, but if you look up fixed length arrays in PHP you will find it.

So this is now how PHP tracks memory, and so how it releases memory, because you don't necessarily have to unset or clear any memory that you create. Once it is no longer used, it will be freed. How PHP tracks whether something is used or not is based on what references it. This is very similar to how something like Java does it as well. It doesn't necessarily delete something, it just removes a reference, and if there are no references to something, then it removes it. For example, if you call unset, you are not freeing the memory. All you are doing is removing that reference, and if you remove all the references and the count goes to zero, then the memory will be freed. So by running an unset you're not necessarily clearing that memory. It's not like calling delete in C++. My original background is C and C++, so I'll reference that a bit, and also I know sort of how memory works
there, when you manage it much more directly, whereas in PHP it's sort of obfuscated, and with the nature of the language you don't really work on it directly at all.

So we'll see here in this example, if I set a simple variable to the string "test", that's fine. That variable points to the memory that stores the word "test". That's easy. And then if I set a new variable to that same variable, they're actually just references. PHP will do that automatically, so both of those are just a reference to the string "test". So if I go and unset $i, I'm not actually freeing that up. That data is still stored there. All I'm doing is removing the reference from $i. It basically does a refcount-minus-minus, that's pretty much all it's doing, because each value keeps track of how many things reference it. The reference doesn't go the other way; it's the value saying, "I'm referenced four times," or something.

You can get into some issues with circular dependencies, where a property of an object references a different object, which has a property which is the original object. Then even if you delete both of them, they each still have one reference to them, and that's where garbage collection will run and figure that kind of thing out and clean it up. So that kind of stuff won't even get cleared up on unset. It will just get cleared when garbage collection runs.

If you see here, at the end, once we finally unset $j as well, then there's no more reference to that data and the memory will be cleared. So you don't quite have direct control, and you have to keep track of your references pretty specifically. Even though PHP doesn't have pointers, and so it seems like you don't have to do that, if you have stuff hanging around in globals or in variables that are of a higher scope than what you want, you can be holding on to memory for a lot longer than you intend to.

And then so now we'll talk a bit about how PHP allocates memory, as
in PHP the language itself, while it's running. It's not as direct as you might think. If you declare a variable, we're not necessarily allocating exactly that much memory. Say you do something that's four characters and a null terminator or whatever: you're not doing five sets of 32 bits or whatever. You're, like we said, storing more metadata, but also it's going to have pre-allocated memory in various chunk sizes ahead of that. So it's always going to have some memory sitting there, allocated to the program, waiting to be used, and then there's a whole bunch of logic that dictates how it should grab more memory, when it should do that, and how much it should grab. That's pretty much all done for performance reasons, because declaring and grabbing memory constantly can be quite slow, which I'll show some data on later.

There are three types of how it will allocate that: small, large, and huge. Even within the small one, and small is anything under 4,096K, I think, it will do them in a whole series of chunk sizes, anywhere from 8 bytes, I think, then 16, 32, all the way up quite a bit. That will be based on either what you're allocating or what it expects you to be allocating. Say you run a loop a thousand times and you keep adding another element to an array each time you go through the loop. It will adjust how it allocates memory. For the first five times through that loop it'll only allocate a small amount, and then it will allocate twice as much the next time, and twice as much the time after that, and more the time after that, because it's trying to predict: okay, you're using memory really aggressively, so we're going to allocate it more aggressively to try to keep up with you, so we have it sort of ready to go instead of having to sort of
grab it from the operating system as you're going. So you'll notice if you try to track memory in PHP, it's very difficult, because things you do don't necessarily correlate exactly with the memory usage you're seeing, because it's trying to allocate memory ahead of time for you. You can say, okay, I did one change and now suddenly it shows that I'm using a whole megabyte more memory. Why is that? I added one variable. It just might be that you hit a threshold and it's allocating differently.

The large and the huge ones are for when you have to allocate really, really big amounts of memory, and they will actually use different memory mapping functions internally in the C layer of PHP, which is built that way for performance reasons. Honestly, that gets a little beyond my area of expertise. I'm not a PHP core developer, in case anyone was under any illusions.

PHP also does a thing called copy on write. Like I showed before, pretty much everything in PHP is kind of like a reference. It's not exactly the same as a reference in a more specific programming language, but like I showed here, if I have two variables and they both point to the same thing, my data "test" there, there's only one copy of it. If I go change one of those variables, that's when it's going to suddenly make a copy. The variable $j was only just pointing to what $i pointed to, and then once I change $i to something else, we know it was not intended for $j to also be that thing. So then it makes a copy, which it uses for $j, and then it does something new for my variable $i. So even when you're not declaring variables, when you're just changing things around, it can actually be writing memory.

This can be kind of tricky if you have objects and stuff that you're passing around into functions. Normally if you pass something into a function, you don't need to pass it by reference, because it is a reference
already, and it will be very memory efficient. Say you're passing a whole node around: it won't make another copy of that node each time you chain it down into a different function, until you edit it, at which point it will copy on write, and you will start making copies of things and using more memory. So you just have to be somewhat conscious of how that works, because you can unintentionally cause a lot more memory to be used when maybe you don't need to. Maybe it's something you save in a temporary variable instead. It's very situationally dependent, but it's just something to keep in mind.

Okay, we're going to talk a little bit about how memory changes in different versions of PHP. As you guys probably know, things changed pretty dramatically in PHP 7: performance pretty much doubled, memory usage was pretty much cut in half, and a lot of that has to do with how they handle memory. There wasn't as much performance work put into that in the 5 branch, and in 7 there was a lot. They more aggressively allocate memory, but then they also more conservatively use it, and there's a bunch of optimization, so if you get into huge arrays and things like that, we don't use quite the same amount of memory that we used to.

So I'll pop into some data here that we can go through, and hopefully we'll have a few minutes for questions at the end. This is the memory usage of a very simple program. It's basically running that one line that you see there. I did a C version for comparison. It's obviously not running that line, because that's a PHP line, but it's doing essentially the same thing. In the first one we're just creating an array of a very small range, and then we increase the range as it goes up. I wanted to go up by factors of a thousand, but the last one is only a hundred million. It's not a billion, because a billion
causes my computer to explode, as you'll see, because I do not have a hundred and thirty gigs of RAM, so that wasn't going to work.

You'll see in the first column, when we're doing something very small, newer versions of PHP are actually less memory efficient, because they are more aggressively allocating memory and they cache more things. So you'll see we only use 14 megabytes in the old 5.6 and we use 25 in 7.2. For an almost unfair comparison, C uses 700K, because it basically just does a loop and nothing else. Just so you know, these do also produce output, so there's a little bit of output involved in these test scripts. That does have a minor effect on this data, but it's pretty negligible.

As we go along in the data, though, you'll see that where newer versions of PHP were less efficient, they become drastically more efficient. You'll see the difference between PHP 5 and PHP 7 being 13 gigs to 3 gigs, so that's even well above the estimated halving of memory that they claimed. Obviously this is a bit of a staged example and not something like Drupal, although in benchmarks it uses a half to a third in running comparisons between 5.6 and 7.0. You'll see, though, that between the versions of PHP 7 the memory usage has stayed pretty much the same. There's a 10 megabyte difference in my biggest test there, but that could well be based on how it happened to run. If you run these, you won't necessarily get these exact values. It depends on your system, and also these are an average over multiple runs, because they will allocate slightly differently depending on your block sizes available and stuff.

A little bit of a note there for the last C example: I actually had to do that with a multi-dimensional array, because you can't make an array that has 100 million entries in it in a
standard sort of C setup. There is actually a cap on how much you can fit in. It's based on, like, an integer size times four, of the bytes that it points to and stuff. So that's actually done in two setups, but the test should be pretty much functionally the same. So that's a win for PHP. If we weren't clear, it's better than C, it's faster in that very specific use case. Okay, thank you. That was a joke. You guys were a little slow on that one.

And this one, which was actually not what I intended to do in my talk initially, but the speed is actually quite significant in the allocation of memory. So this is the same test program that I was talking about, and this is how long it took to run. You'll see for the initial stuff it's basically the same. It's a small fraction of a second. Again C is a bit faster, but it's all almost negligible because it's so small, with newer versions of PHP 7 actually being slightly slower for a very simple program. But you see, as it goes up, it gets bad. For 7.2, you'll see, weirdly, it got a little faster as it went on. I assume that was just a quirk in the data, or maybe it gets warmed up, I don't know. Hey, cache warming is an actual term. That one wasn't a joke. Okay, it kind of was.

You'll see in 5.6, though, the speed gets atrocious really quickly. Where all the other versions are finishing in a small fraction of a second, we're already up to 250 milliseconds just for allocating some memory. That's a fair bit. That's maybe the time it should take to fully render a whole Drupal page, or fully build it in PHP, and that's just doing a very simple array allocation. Then if we do it at a really big size, it basically starts to break down. It starts to thrash and doesn't allocate memory very well, and so it goes to five minutes, and anything over that basically ceases to work in PHP 5 on my laptop, which is running 16 gigs of RAM. Whereas the other ones, if you see,
they maintain a very consistent thing. Even when creating a hundred million records we only go to 700 milliseconds, which for such a big amount of data is pretty reasonable. And you see we don't actually get that far off from a C program, which is running in 300 milliseconds. So we're getting fairly comparable performance to what is probably about the fastest way we could do this, aside from maybe writing in pure assembly, but at that point the C is going to compile down to basically pure assembly, so that's pretty much as fast as we can go.

We'll probably go to questions now. We've got a little bit of time here. It was supposed to be 25 minutes, so I've got about five minutes if anyone wants to ask questions. I am a big nerd for memory, so I can probably answer most things, but like I said, I'm not a PHP core developer. And remember, we're all working on our social anxiety, so feel free to ask questions.

Thanks for the class. I was kind of hoping you were gonna cover memcache a little bit and how that works with PHP memory. I'm coming across an issue where there's just an endless amount of cache render memory problems, which I thought would have been solved by using volatile memory like memcache. So I wanted to know what your thoughts are on the best way to profile and track down the issues so I can isolate the problem.

I mean, a Valgrind setup is pretty good. New Relic too. I don't work for them, but shameless plug, that stuff is pretty good and it will help you drill down. If you're having memcache or Redis problems, though, a real big thing to look at is that you are not cycling through your cache too quickly. If you don't have enough excess space there, you can be writing to and deleting from cache way too often, and everything will still run, because of course it just discards the oldest things, but you can be discarding things too often and then you'll thrash quite badly.

Right, right. So I guess
what I'm noticing is certain slabs, and that basically the memory cache looks like it's starting over. So it's not like it's using cache, it's like endless growth of memory that never stops. I don't know if that's because there's not enough memory allocated for the PHP processes and so forth. So I guess it's a personal problem, but I just want to know how I could track it down, because I'm seeing certain slabs that get reset.

Yeah, it can be a little difficult. We've done similar performance work for that, and it's doing the Valgrind runs and things like that to see where stuff is allocated. But I would definitely try very specific, small use cases and then see how it is performing in cache, where your reads and writes are and stuff like that. So what's the age of stuff in cache? If it's moving through quite quickly, then you can get a lot of that. You can also get things where you'll cache stuff that you don't really want to cache, like forms, which aren't really cache, they're more of a data store basically, and you'll get stuff that just builds and builds and builds and won't flush out. Although you should be having some cleanup in your cache settings. I mean, even if it fills up, it should automatically flush out. It's probably more of a specific problem, but anyway, next question.

Thanks. Are there one or two common mistakes that we as developers are probably making that we can go Google how to stop making?

You use too many damn arrays, pretty much. Honestly, everyone imagines that loading data from the database or something is really slow, and so it's best to just keep everything in memory. That is true to an extent. Caching a couple of nodes or something is good, but if you're running a big report or something like that, it is probably a lot more efficient to pull a single row, deal with that row, and then free it, and then move
along again. That is by far the biggest mistake that we have staff make internally: you just put it in memory and it's fast. And it's like, okay, well, it's always a trade-off. If a report runs in one second or one and a half seconds, versus whether it explodes the server or not, I'll go with the non-exploding version. I guess you wanted two, but I would say that's probably the biggest one.

Thank you, this is great.

Thanks. And for profiling tools, I'll just throw in that Blackfire's great, although I don't work for Sensio, and it's really expensive if you start using it a lot, but it's great.

So the thing that always makes me wonder if I'm doing something wrong: in Drupal 7, if you just needed to iterate over a number of entities and grab a certain field off of each, you could load a field without loading the whole entity. And now with the entity API it's like you have to get all the things all the time, because it loads into a particular object and they're all dependent on one another. If you do a load off of storage, you have to get the whole thing.

Yeah, which kind of isn't the best. Although a lot of it is sort of just-in-time loading, so if you grab it or look at it then it will fill it out, but it'll only have essentially a stub for a lot of that data. If you have any entity references and stuff like that, they will fill out if you try to use them, but they won't be filled out until you do that. That's a weird thing where you can also screw up your stuff by measuring it.

That was a quantum... it's Schrödinger, a Schrödinger's cat.

Yeah.

So is there anything to do differently there? And I guess the other question is, is there a reason to maybe use some of these iterable ArrayObject kind of things that are in...

Yeah, and it comes down to what you need, and do you even need the whole entity, to be honest. It's like, can you,
you know, just load something more specific? Sometimes, if you need to be really performant, it's maybe, I'll just pull a piece from the database or something, rather than loading full entities. They are really nice and fun and great to work with, and they do all the permission stuff and everything else that's attached to them, but sometimes you have to go away from that if you're doing something where it's really important that it be performant, and where instead of one entity it's going to load 10,000 entities.

Also be aware, if you're loading a whole bunch of entities over and over, make sure you set the flag so that when you load them they're not cached. If you load a thousand things and you don't set that flag, you have a thousand of them sitting in memory, which will take up a ton of space, and especially if you're not accessing them again, you're gaining absolutely no advantage from that. So make sure you load them and then discard them immediately.

So by discard, is that unset them?

Well, there are flags in most of the loading functions for entities and stuff in both Drupal 7 and 8. There's a cache flag, and you set it to false, and then it won't put anything in the cache to begin with. And then you don't have to unset it. The minute you stop using it, it's going to clear automatically. So if you have a variable called blog and you set it to a node, and then you loop through and set blog to a new node, the old one, once you stop using it, is going to be cleared out of memory automatically. But if we're dumping it all in a cache, it goes into literally a big array of nodes, and those just sit there for the whole page load, and it can just go crazy with the memory.

So you pointed out that if you have to store lots of small things, you're going to incur the most overhead. So suppose I have to store lots of small things and I care about
that. Is there anything I can do about it at all? Like some crazy hack, like, let's make a big binary blob? Does anything like that exist in PHP, or am I just...

The one thing that comes kind of close is, like I said, you can use a fixed length array, and then it will use less, at least for the array overhead. Just generally, the way PHP works, each variable is a dynamic variable, and so it's going to have that data. Most of the performance fixes are in how they fixed the underlying memory management architecture, which, like I showed, has improved. Your ability to do that as a programmer yourself is a bit limited. Putting it into a big blob is going to be slightly more memory efficient, but it's probably going to be way less CPU efficient, and so your trade-off is going to be worse overall. Plus it's just more of a pain in the ass.

Just checking, thanks.

Sorry, you mentioned that PHP basically estimates how much memory you intend on using. You gave the example with the array and adding elements to the array. I was wondering if PHP does the same thing with object properties as well as methods.

It does, kind of. My understanding is it does the same thing for basically any time you're allocating memory. Any time you're adding a variable or declaring stuff, it's going to monitor your memory usage and then try to increase the space. So that's objects, variables, anything that gets used. You can futz around with some of the settings for that, like sizes for when it'll use certain chunks and what your limits are, although for the most part you just try to not use a lot of it.

Are there any ways to override this default guesstimation? Are there any contexts where you may or may not want that?

Um, you can configure it somewhat, like what thresholds there are. I don't know if you can lock it down or not. I didn't think to look into that. That would be a pretty low level
language thing, I think. That is pretty much built in. I don't think there's a config, but I wouldn't stake my life on that. It's built into the language itself, yeah. I actually linked to it, and I'll post the slides later, but in the allocation code, basically the memory allocation stuff that's written in C, there's a weirdly helpful comment block at the top that talks a lot about how it does that memory allocation. It's almost a tutorial, which is kind of surprising, because it's just hidden in a comment block. It should be a blog post, but if you Google how PHP allocates memory, you don't get squat.
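The reference counting and copy on write behaviour described in the talk can be sketched in a few lines of PHP. This is a minimal sketch, not the speaker's slide code: the function names refCountDemo and copyOnWriteDemo, the constant PAYLOAD_BYTES, and the 1 MiB payload are assumptions made for illustration. It builds the string at run time with str_repeat because short literals such as "test" are interned in PHP 7 and later and would not show a measurable change. Exact byte counts depend on the PHP version and the state of the allocator, so only the direction of each change is meaningful.

```php
<?php
// Illustrative sketch: names and sizes below are assumptions, not from the talk.
const PAYLOAD_BYTES = 1024 * 1024; // 1 MiB, large enough to stand out in the numbers

/**
 * Two variables share one string; the memory is released only when the
 * last of them is unset. Returns bytes in use above the starting point.
 */
function refCountDemo(): array
{
    $base = memory_get_usage();

    $i = str_repeat('x', PAYLOAD_BYTES); // one string, one reference
    $afterCreate = memory_get_usage() - $base;

    $j = $i;                             // second reference, no copy is made
    $afterAssign = memory_get_usage() - $base;

    unset($i);                           // count drops from 2 to 1, data stays
    $afterFirstUnset = memory_get_usage() - $base;

    unset($j);                           // count reaches 0, string is freed
    $afterSecondUnset = memory_get_usage() - $base;

    return compact('afterCreate', 'afterAssign', 'afterFirstUnset', 'afterSecondUnset');
}

/**
 * Two variables share one string until one of them is written to,
 * at which point PHP gives the written variable its own copy.
 */
function copyOnWriteDemo(): array
{
    $base = memory_get_usage();

    $i = str_repeat('x', PAYLOAD_BYTES);
    $j = $i;                             // still a single string in memory
    $beforeWrite = memory_get_usage() - $base;

    $j[0] = 'y';                         // first write: $j gets a separate copy
    $afterWrite = memory_get_usage() - $base;

    $untouched = ($i[0] === 'x');        // $i still sees the original data

    return compact('beforeWrite', 'afterWrite', 'untouched');
}

foreach (['refCountDemo', 'copyOnWriteDemo'] as $demo) {
    echo $demo, ': ', print_r($demo(), true), PHP_EOL;
}
```

On a typical PHP 7 or 8 build, the second assignment should add close to nothing, the first unset should leave usage roughly unchanged, and the write to $j should roughly double it.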