It's three o'clock and I believe in starting promptly, so let's do this. If you came here to see the worker threads crash course, you're in the right place, and thank you. But I do question your judgment, because Joyee Cheung is giving a talk in the other room about how Node bootstraps itself. She's awesome, it's going to be an amazing talk, and it's totally where I want to be right now. But I'm happy to be here, too. If you want to leave now because I've insulted you, you totally can.

This is also probably going to be less of a crash course and more of an introduction, but a not-so-gentle introduction. I don't know. We'll see how this goes. Anyway, you've made the wrong choice to be here.

So hello, friends. Thank you. I talk fast. I'll try not to, but actually I won't try not to, because I have a half hour, and I ran through this talk about an hour ago and it was way over, so I cut a bunch of stuff out. Mostly jokes, though, so don't worry about it. It just won't be as funny as I wanted it to be, but hopefully it'll have as much material.

My name is Rich. I work for the University of California at their San Francisco campus, at the library. Right about now is when I imagine people going, "He works in the library? What is he doing here?" in sort of a Jim Gaffigan voice. That just makes me relax and feel good about things. But anyway, as it happens, I do a lot of work on Node core, and I'm also the author of a rock opera about a steakhouse. I don't know if the sound on this is plugged in, but let's see if this works here. I don't know. Okay, so it's coming out of the projector, I think. Okay, cool.
So anyway, I didn't come here to play examples of the rock opera, but I did want to point it out because all the links to blog posts and documentation and videos and stuff are at Palace Family Steakhouse dot com. I don't think I have a link to the slide deck, which I guess I'll add after I'm done here. Okay, anyway, I'm not here to talk about that stuff. I'm here to talk about Node.

But first, some disclaimers. The views expressed are my own and not necessarily those of my employer. That's a pretty standard disclaimer. The views expressed are my own and not necessarily those of Node. There are a lot of other people involved in Node, and a few of them are here right now. Naturally, we don't all see everything the same way, hence this disclaimer.

With that out of the way: Node. Hey, have you heard about worker threads? They were introduced in Node 10.5.0 but required a command-line flag to use in that version. So use Node 12 if you don't want to use a command-line flag, which you don't. Specifically, use Node 12.11.0 or newer, although I don't think there is a newer version in the 12 release line. There's 13.x out as well. But anyway, 12.11.0 is the first LTS version where support is officially stable rather than experimental. So, yeah. Oh, no, it's 12.13.1, because as of this writing it's the most recent. Yes, 12.13.1 is the most recent. Great. Okay, super.

So anyway, worker threads. Hey, what are they? They're kind of like web workers, but different. For example, if you're used to web workers, there are shared workers. No such thing here. They're also kind of like threads in other programming languages, but not. If you've used threads in other programming languages, cool. If you haven't used threads in other programming languages, cool. It's going to be fine. Don't worry.
So, JavaScript, right? We all love JavaScript, and we all know JavaScript is single-threaded. Even if you had no idea what that means before you saw the slide, there's a 100% chance that you've heard the phrase "JavaScript is single-threaded," because JavaScript is single-threaded. Arguably. I actually don't want to have that argument, though. The point is that your code can do one thing at a time. That's why this program never exits. There's only one execution thread handling this code, so the code in the setTimeout, which would break us out of the while loop, never gets to run, because the code in the setTimeout won't run until the while loop relinquishes control of the event loop. So this is going to run forever, or until you hit Control-C, or until you turn off your computer, or whatever. But it won't exit cleanly, and whatever causes it to exit, it's not going to be that setTimeout.

This is called blocking the event loop. If you already understand what that means, great. If you don't, trust me, and look it up later. There are some good YouTube videos of talks about it. My favorite is "What the heck is the event loop anyway?" by Philip Roberts. I think that was at JSConf EU. I don't know. Anyway, there's a link at Palace Family Steakhouse dot com.

Now, you may be thinking, "Hey, this is cool and everything, but Node is asynchronous."
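The program on the slide is not captured in this transcript. As a rough sketch of the blocking behavior just described, the never-exiting version is shown in a comment, followed by a bounded variant that does exit. The helper name `busyWait` and the 200 ms duration are assumptions made for illustration.

```javascript
// The never-exiting program would look roughly like this:
//
//   let done = false;
//   setTimeout(() => { done = true; }, 1000);
//   while (!done) {}   // never yields, so the timer callback never runs
//
// A bounded variant that shows the same effect and then exits:
let timerFired = false;

// Ask for the callback to run "as soon as possible".
setTimeout(() => {
  timerFired = true;
  console.log('timer callback finally ran');
}, 0);

// Hypothetical helper: spin on the CPU without yielding to the event loop.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // nothing here gives control back to the event loop
  }
}

busyWait(200);

// Still synchronous code, so the timer has not had a chance to run yet.
console.log('timer fired during the loop?', timerFired);
```

The timer callback only runs once the synchronous code has finished, which is why an unbounded loop waiting on a flag set by that callback can never end.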
"I don't really have to worry about this blocking-the-event-loop stuff." You're probably not actually thinking that, or you wouldn't be here. But anyway, Node can do many things at once, like handle multiple simultaneous HTTP requests or read multiple files. And that's true, but it's mostly true for input/output, for I/O. If you're doing, say, data-sciencey stuff or graphics processing or something that's CPU-intensive, then let's just say the default state of things is not as asynchronous.

Prior to worker threads, the usual way people would offload CPU work in a non-blocking way in Node was the cluster module, and if that's working for you, great. But here's the thing: cluster spreads your workload out across multiple Node processes with independent memory and so on. So sharing large amounts of data is often problematic, and each process consumes the full amount of RAM required by Node. This can be inefficient, although again, if it's working for you, then great. But it doesn't work great for a lot of things, and even for things that it does work for, worker threads will often be better, because worker threads are more lightweight and they are better at sharing data.

So let's dive in. Here's a hello world example, and we're going to go through it step by step. The first line pulls in three things from the worker_threads module: the Worker class, the isMainThread boolean, and the parentPort object. The Worker class will be familiar to you if you've used web workers. If you haven't used web workers, there's a great blogger about the worker class, and that's Karl Marx. But, um, we'll explain the Worker class along with isMainThread and parentPort later. Just know that they all came from the worker_threads module. We'll start with isMainThread.
We use it to make sure we're not inside a worker thread. We're checking that we're in the main thread so that we know it's okay to launch a worker thread. If we didn't do this check, then this code might launch a worker thread that launches a worker thread that launches a worker thread, ad infinitum, or until you run out of, you know, stack space or something. Hey, I don't know what you run out of, actually. That'd be a good experiment. Anyway, this kind of check is usually only necessary when the code for the worker thread and the code for the main thread are in one file, which is probably not what you want to do for production code, but works great for a hello world example. So there you go.

We know we're in the main thread, so what we're going to do is create a worker thread. We use the constructor for the Worker class. That's what new Worker is, right? And we pass it __filename, which, as you probably know, is the special Node variable that contains the name of the file currently being executed. And if you didn't know it, now you do. You can create a worker thread to run any JavaScript file you specify. Here we're specifying the file that's currently executing, but you can also pass a string path to any file, or you can pass in a string containing code to be executed, although to do that you need to pass a second argument to the constructor that tells it that you're doing that. I tend to avoid that, because a string as a blob of executable code gives me the same kind of hives that eval does, because that's basically what it is. But it's an option if it makes sense for your use case.

So we're creating a worker. Now let's listen for messages from the worker. This is the usual event listener syntax in Node. Remember, we're in the main thread still, not the worker thread. We're listening for message events on the worker we've created.
When we get one, we're going to use console.log to print the message. And that's it for the main thread.

Remember, we were in an if block that checked whether we were in the main thread. So now let's use else and do the right stuff for when we're in the worker thread. The only thing we're going to do is use parentPort to send a message using its postMessage method. In the main thread, parentPort will be null, so if we want to send a message to the worker from the main thread, we use the postMessage method that's on the worker instance. But in a worker thread, parentPort.postMessage can be used to send messages to the main thread. So let's use it to send a message that says "hello world," and that's the end of the file. You'll remember that in the main thread we had a listener for messages from the worker, and that listener prints the message. So if you run this file, it will print "hello world." Not terribly useful. There are much easier ways to do that, but it does introduce the very basic concepts of worker threads.

So now let's do something just as contrived as this, but more interesting. Perhaps you remember the game Six Degrees of Kevin Bacon. If not, it's simple: given the name of any actor in a film, your job is to connect them to Kevin Bacon in six or fewer steps, in the following manner. Let's say you are challenged to connect Katy Perry to Kevin Bacon in six or fewer steps. Katy Perry was in Zoolander 2 with John Malkovich, and John Malkovich was in Queens Logic with Kevin Bacon. Boom: Katy Perry to Kevin Bacon in two steps. I've seen neither of those movies.

There are already websites that solve Six Degrees of Kevin Bacon by using IMDb data. Several years ago, I wanted to do this for musicians playing on recordings of individual songs. So I made a site called Music Routes, but it's been broken for a long, long time.
So let's fix it. First, surprise: surprisingly, there's no usable database available for which musicians played on which tracks. A lot of people think AllMusic has it. AllMusic has album data, but not track data. A lot of people think MusicBrainz has that information, but it has artist data, not individual data. And a lot of people think Discogs has that information. Discogs just copies whatever is on the album sleeve, which means that, for example, it will tell you that Rudy Sarzo played bass on Ozzy Osbourne's Diary of a Madman. As everybody knows, he didn't. He was just in the credits. Lee Kerslake played drums and Bob Daisley played bass. That's the end of the Ozzy Osbourne information for this talk, but you all know just a little bit more now about Ozzy Osbourne.

So anyway, that brings us to Wikidata, which has some data along these lines, but less than you'd think. And, you know, that's cool. It's Wikidata. Everybody can add information to it. But it's also very, very, very unusably slow for the many, many queries we'll need to make in our searches. So I built my own database and published it. It's very incomplete, but it'll do for here. I also built a rudimentary little visualizer, which we might get to later. I don't know.

In order to solve these things, we could use breadth-first search. I am now about to give the world's worst overview of breadth-first search. Here goes. Let's go back to connecting Katy Perry to Kevin Bacon. Step one: is Katy Perry === Kevin Bacon?
That's a JavaScript triple equals there in the middle. The answer, obviously, is no. Don't be ridiculous. Step two: find everyone that was in a movie with Katy Perry. Do any of those people happen to be Kevin Bacon? The answer is no. It includes John Malkovich and other people. Step three: find everyone that was in a movie with any of those people that were in a movie with Katy Perry. Do any of those people happen to be Kevin Bacon? The answer is yes, so we're done. Congratulations, you've just witnessed the worst explanation of breadth-first search ever.

Now let's do a slightly better explanation. This will be the second-worst explanation of breadth-first search ever. We're going to connect Katy Perry and Kevin Bacon, but this time not through movies. This time, let's do it through music. Kevin Bacon has a band with his brother, Michael Bacon. The band is called the Bacon Brothers, and I'm not making that up. Fun fact: as you can see in this photo from Wikipedia, Michael Bacon's nose has never been successfully photographed.

So let's see if we can connect Katy Perry to Kevin Bacon via music. Step one: is Katy Perry Kevin Bacon? No. Get out of here with that nonsense. So here's a visualization of Katy Perry in the middle and everyone she recorded with on her album One of the Boys, which I'm sorry to say is the only Katy Perry album that I have in the data set. You can open a pull request if you want to correct this horrible injustice. Anyway, Katy Perry is that circle in the middle, like I said, and each circle in the surrounding ring is someone who is one step away from Katy Perry, because they played with her on that album. So somewhere in there is legendary session horn man Jerry Hey. There he is. There's also Eurythmics guitarist Dave Stewart at the bottom, who goes by David A. Stewart because, literally, and I'm not making this up either, there are too many Dave Stewarts out there.
There's a David N. Stewart, and, it doesn't matter. Notably absent from that ring, though, is Kevin Bacon. So now imagine we take each of those circles in the ring around Katy Perry and we find everyone who has recorded with each of those people. We would take all those people and make an outer ring with a circle for each of them. I didn't do that, mostly because, well, for a lot of reasons. One, I'm lazy. But also because there'd be way too many circles to fit on a slide. We're going to get to that in a minute, but, yeah, the number of circles is going to grow exponentially, or exponentially-ish, with each ring. Right, so you wouldn't want to see all those circles. But I did scrawl this ugly blue line to sort of represent that outer ring, kind of like Saturn. You know, a little ring around the planet of Katy Perry. Anyway, I'm here to tell you something exciting about that outer ring, though. It totally contains Kevin Bacon.

So that's basically breadth-first search. Here are the results, if you don't believe me. There you go. Yay. Jon Bon Jovi, who would have thought? Okay, so that is breadth-first search. Let's implement it. No, just kidding. For purposes of this presentation, it's an implementation detail. There are trade-offs among the various ways of implementing it. I don't really want to get into it and I don't have time. I'm talking way too fast as it is. But you can check out this repository for how I implemented it, as well as the other algorithms we're going to talk about in a little bit.

The important thing is that our approach will keep the CPU busy rather than do a bunch of work up front. This is so that we can see how cool worker threads are, but it's also a legitimate trade-off one might make in the real world. It's not always worth it to spend time up front preprocessing your data if it's time-consuming, takes up too much storage, et cetera, et cetera.
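The code on the slide is not captured in this transcript, and the talk points to the repository for the real implementation. As a rough illustration only, here is a sketch of a breadth-first search over a toy in-memory data set. The `tracks` object, the helper names, and the return convention (number of steps, or -1 when there is no connection) are all assumptions, not the author's code.

```javascript
// Hypothetical toy data: track ID -> IDs of the individuals on that track.
const tracks = {
  t1: ['katy', 'dave'],
  t2: ['dave', 'jon'],
  t3: ['jon', 'kevin'],
};

// All tracks that any of the given individuals appear on.
function tracksWith(individuals) {
  const wanted = new Set(individuals);
  return Object.keys(tracks).filter((id) => tracks[id].some((i) => wanted.has(i)));
}

// All individuals that appear on any of the given tracks (deduplicated).
function individualsOn(trackIds) {
  return [...new Set(trackIds.flatMap((id) => tracks[id]))];
}

// Expand rings outward from `start` until one of them shares a track
// with `target`. Returns the number of steps, or -1 if unreachable.
function breadthFirstSteps(start, target) {
  if (start === target) return 0;
  const targetTracks = new Set(tracksWith([target]));
  let ring = tracksWith([start]);
  let steps = 1;
  while (!ring.some((id) => targetTracks.has(id))) {
    // One more step out: everyone on the current tracks, then all their tracks.
    const next = tracksWith(individualsOn(ring));
    // Each ring contains the previous one, so no growth means a dead end.
    if (next.length === ring.length) return -1;
    ring = next;
    steps += 1;
  }
  return steps;
}

console.log(breadthFirstSteps('katy', 'kevin')); // katy -> dave -> jon -> kevin
```

Every pass through the loop re-queries a larger set of people, which is where the exponentially-ish growth in work comes from.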
So here's what it looks like. We'll step through this. The first line gets all the tracks of the start person. Sorry for the long variable names, but it made sense when I wrote it in the repository. Anyway, let's say it's Aretha Franklin. We're going to put all the tracks in index zero of an array of tracks for the start person. The index indicates how many steps we've gone from the start individual. This line populates the corresponding array of individuals that are the source of those tracks above, so in this case it's an array of just one individual ID. It's just Aretha Franklin. For the two lines starting here, we're going to do the exact same thing for the target person. Let's say it's Carrie Brownstein. This line checks to see if we have a match by seeing if there are one or more tracks in both lists.

Lastly, this while loop runs until a match is found. This line adds the individuals in the tracks that result from going out one more step from the start individual than we've gone so far. So it's all the people on tracks with Aretha Franklin. Then the next time it runs, it's going to be all the people on tracks with those people who are on tracks with Aretha Franklin, and so on and so on, getting exponentially slower as more and more data is involved in deduplication and queries. This line updates the matches list, so the while loop will stop if we've found a match.

So each time that last while loop runs, remember, each of these rings is exponentially-ish more work. Longer paths will take longer. And, you know, there's a bit of a solution hinted at by the use of an otherwise unnecessary array in that previous code. But first of all, let's check how breadth-first search performs on my laptop. Here's a run with the results. It took more than 14 seconds just to do the breadth-first search.
That's a lot. We can do better, even without worker threads, by doing bidirectional search. Really quickly, here's how bidirectional search works. First, Katy Perry is not Kevin Bacon, despite the striking resemblance evident in that photo. Again, we look for everyone that is connected to Katy Perry and check to see if Kevin Bacon is in there. He's not. Now we bounce back to Kevin Bacon and we get everyone connected to him. We check to see if there is an overlap in the two sets of musicians. If not, then we do another query. We do one for Katy Perry's cluster of people: still no match. Do one for Kevin Bacon's cluster of people: still no match. You know, until there's a match.

So let's visualize it like this. Excuse me. That over on the left is Katy Perry and that over on the right is Kevin Bacon. Katy Perry and Kevin Bacon: different people. There's space between them. So there's a bunch of little dots, but those are all people, just like Katy and Kevin. Anyway, those are all the people who played with Katy Perry. Still no Kevin Bacon. This is breadth-first search. We're going to do another query. Now this query starts to get expensive. I didn't include all the dots that should be in there, but imagine like 800 times as many dots, and still no Kevin Bacon. We're going to do another query, and this one's even more expensive, but finally, there's Kevin Bacon.

All right, so now bidirectional search would go this way. Hmm, let's check. Okay, all the people who played with Katy Perry: Kevin Bacon's not in there. All the people who played with Kevin Bacon: Katy Perry is not in there. Oh, look at this. We do one more query, and Kevin Bacon and Katy Perry have some overlap. Jon Bon Jovi or someone is in the middle there. Dave Stewart, somebody is, anyway. So, yeah, we're doing fewer expensive queries.
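As a rough sketch of the bidirectional idea just described, again over a hypothetical toy data set rather than the author's repository code, the search can alternate between growing the start person's ring and the target person's ring, checking for overlapping tracks after each expansion. All names here are assumptions made for illustration.

```javascript
// Hypothetical toy data: track ID -> IDs of the individuals on that track.
const tracks = {
  t1: ['katy', 'dave'],
  t2: ['dave', 'jon'],
  t3: ['jon', 'kevin'],
};

// All tracks that any of the given individuals appear on.
const tracksWith = (individuals) =>
  Object.keys(tracks).filter((id) => tracks[id].some((i) => individuals.includes(i)));

// All individuals that appear on any of the given tracks (deduplicated).
const individualsOn = (trackIds) => [...new Set(trackIds.flatMap((id) => tracks[id]))];

// Alternate expansions from both ends until the two rings share a track.
// Returns the number of steps, or -1 if the two people are not connected.
function bidirectionalSteps(start, target) {
  if (start === target) return 0;
  const rings = [tracksWith([start]), tracksWith([target])];
  let steps = 1;
  let turn = 0;    // 0: expand from the start person, 1: from the target person
  let stalled = 0; // consecutive expansions that found nothing new
  while (!rings[0].some((id) => rings[1].includes(id))) {
    const next = tracksWith(individualsOn(rings[turn]));
    stalled = next.length === rings[turn].length ? stalled + 1 : 0;
    if (stalled === 2) return -1; // neither side can grow any further
    rings[turn] = next;
    steps += 1;
    turn = 1 - turn;
  }
  return steps;
}

console.log(bidirectionalSteps('katy', 'kevin'));
```

The number of expansions is the same as in the one-directional search, but each side only grows about half as far, so the largest and most expensive rings are never queried.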
We're doing the same number of queries, because we have to do as many queries as it takes to get the number of steps to connect the two people, but we're doing less expensive queries. By the way, if you need big O notation to show how that works, or if my explanation sucks so bad that you have no idea what I'm talking about, there's a Wikipedia article. All right, how am I doing on time? I've got 12 minutes. Okay.

Bidirectional search is just like breadth-first search, right down to the comments, except for the contents of the while loop. So let's zoom in on that while loop. You can see that we do a breadth-first search starting from the start individual. Then we check if there's a match, and if not, we do a second breadth-first search starting from the target individual. And we repeat until we find some overlap. Holy moly. We went from 14 seconds to less than three. It's awesome.

But wait, this talk is about worker threads. Why be bound to a single thread, doing one breadth-first search over here and checking, and then doing another one over here and checking, and then doing another one over here and checking? Why not just run two threads doing simultaneous breadth-first searches, racing to see which one can return an overlapping individual first? So to create our worker threads this time,
we are calling new Worker again, and this time we're putting the worker code in a file called worker.js. There's also a new thing over there in the second argument, which is a workerData property. This allows us to provide the ID of the individual to start with. So workerData gets serialized and sent over to the worker, which then deserializes it into its own copy of the data, which is basically what happens when you postMessage data as well.

Now, worker threads can do this awesome, magical thing where, if you do things just right, you can share memory and also transfer memory buffers between the main thread and the worker thread. Sharing memory doesn't actually resemble sharing nachos like this, but I needed an image. We're not doing this in this app. workerData just sends a copy. But if your data is of a predictable size and format, and if there's a lot of it, look at the docs for information on sharing memory or transferring it. It will be useful.
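To make the copy, transfer, and share distinction concrete, here is a small sketch. It is an illustration under stated assumptions, not code from the talk: it uses a `MessageChannel` inside a single thread (with `receiveMessageOnPort`, available in Node 12.3.0 and later) so that it runs without a separate worker.js file. The `roundTrip` helper is hypothetical. Messages posted to a worker go through the same kind of serialization.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('worker_threads');

const { port1, port2 } = new MessageChannel();

// Hypothetical helper: post a value on one port, optionally transferring
// buffers, and synchronously read it back from the other port.
function roundTrip(value, transferList = []) {
  port1.postMessage(value, transferList);
  return receiveMessageOnPort(port2).message;
}

// 1. Plain data is copied: changing the copy leaves the original alone.
const original = { id: 42 };
const copy = roundTrip(original);
copy.id = 7;
console.log(original.id); // still 42

// 2. An ArrayBuffer in the transfer list is moved, not copied:
//    the sender's buffer is detached and reports a length of 0.
const buffer = new ArrayBuffer(8);
const moved = roundTrip(buffer, [buffer]);
console.log(buffer.byteLength, moved.byteLength); // 0 8

// 3. A SharedArrayBuffer is shared: both sides see the same memory.
const shared = new SharedArrayBuffer(4);
const sameMemory = roundTrip(shared);
new Int32Array(sameMemory)[0] = 123;
console.log(new Int32Array(shared)[0]); // 123

port1.close();
```

With a real worker, the first case corresponds to `new Worker('./worker.js', { workerData: { id } })`, where the worker receives its own copy of `{ id }`.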
There's also pooling. For this application, we always need two workers, one for Katy Perry and one for Kevin Bacon, and I don't care about the cost of starting one up, just waiting until I need it and then starting one up. But in an application where your needs are more dynamic and you are trying to get the absolute best performance, you'll want to investigate having a pool of workers that you use as needed. There are npm modules that can help if you don't want to implement pooling yourself.

So over in worker.js, reading the workerData value is done like this. You import the workerData property from the built-in worker_threads module, then read the value of the id key.

Context switch: back in the main thread, we have an error listener that simply rethrows any unexpected errors from the worker, and we have a callback that we use when we receive a message from the worker. The index here is used to distinguish the results from Katy Perry's worker thread and Kevin Bacon's worker thread. So we might use zero for Katy Perry's thread and one for Kevin Bacon's thread.

Let's head over to the worker code again and see how the worker invokes the callback. As each worker is created, it grabs all the tracks the individual is on and sends them, along with the individual, back as an object to the main thread. That postMessage will cause the callback in the main thread to execute. So here's the callback. Again, the index is a value that lets us use the same callback for Katy Perry's tracks as we use for Kevin Bacon's tracks. We also get all the individuals from whom the list of tracks is derived, and we check to see if there are any tracks that are on both lists, thus indicating that our expanding circles are overlapping and we can stop. If we have a match, we call a function called done. We'll check that out in a second. If we're not done, we send a message to the worker to go get us another expanding ring of
tracks and individuals. I'm not going to show the worker code that listens for the message, as it's pretty similar to what we've already seen. Plus, I feel like I'm rocketing through this fast enough. But if it gets the value "next," it gets the next set of search results and sends them back to the main thread. Just know that to receive the message, the workers listen for the message event on the parentPort object.

But I do want to talk about that done function. It removes the listeners we have for both workers, and then it calls this method that's on all workers called terminate. What terminate does is end the worker thread and return a promise that resolves to the exit code of the worker thread. If we have cleanup code or whatever, and we want to make sure the worker thread exits cleanly, we can put this in an async function and await the value, but in this case we don't. It's going to exit with an error code, because we're terminating it while it's running. We could also send it a message and have it end gracefully, but that's extra code and overhead we don't need in this case. This just says "end execution immediately, please," but we could do something more elegant if we wanted to. And lastly, we print our results.

So let's see how this performs. Remember, single-threaded breadth-first search took over 14 seconds. Single-threaded bidirectional search: under three seconds. Oh my gosh, it's less than 700 milliseconds now. I can't believe it. Unbelievable.
This should be illegal. Now, I have to admit that the main motivation here, as you might have picked up by now, wasn't really to talk about worker threads, as awesome and as exciting as they are. It was to write a program to efficiently find out how far Palace Family Steakhouse is from Lil Nas X. The answer, by the way, is six degrees. It starts with Lil Nas X, of course, and the first degree is Billy Ray Cyrus, who performed on Old Town Road. I was as surprised as anybody to find out that, as recently as 2009, that's what Billy Ray Cyrus looked like. The second degree is country star Mary Chapin Carpenter, who along with Billy Ray Cyrus was on Dolly Parton's song Romeo.

Now, Dolly Parton gets her own slide because, you know, we need to stop and just pay a few respects to Dolly Parton. The aforementioned Romeo was on her 32nd studio album. She wrote it. She produced it. People who aren't country music fans, and I'm not really one myself, don't realize the extent to which she is in control of her sound and her career. She is a legend and a force to be reckoned with, so don't mess with Dolly.

Also, you know, since we're starting from Old Town Road and everything like that, the analogous legend and force to be reckoned with in Node is Anna Henningsen. She's the one most responsible for implementing worker threads. She basically did it single-handedly. As far as all things Node go, it's extremely difficult to give Anna too much credit. So, you know, you should totally just start applauding right now. Yeah. She'll be one of the people on the Node Technical Steering Committee panel at 9 a.m. tomorrow morning, so check that out.

Anyway, back to this nonsense. After Dolly Parton's track, Mary Chapin Carpenter goes through Saturday Night Live bandleader G. E. Smith and Tom Waits and a trumpet player named Chris Grady, who played on a track that I was on. Anyway, why should I restrict the fun of vanity exercises like this to me?
You can head over to this Glitch URL and try some stuff out. And since, um, you know, I do want to just take a... okay, let's see here. This is not working the way I expected it to. Okay. Let's see here. Yeah, so we can... yeah, so let's see here. Let's try... I did Kermit the Frog, because my daughter... it doesn't matter why I did Kermit the Frog. Earlier I did Kermit the Frog to Bob Dylan, but that didn't seem as much fun as some other things. What? The Pope? The Pope is, yeah, a musician. I don't think he's in the discography, but I don't know. We could try, like, Miles Davis or something. Let's try that. Here you go: four steps from Kermit the Frog to Miles Davis.

So anyway, you search for people and they're not... oh, and this is where the visualizer comes in. So here's Herbie Hancock and everybody he's played with. And then if you want to know what Randy Kerber was on, well, he was on this Herbie Hancock thing. But if you want to know everything Herbie Hancock is on, there it is. You get the idea. Okay, so anyway, have fun with that Glitch thing. Yay.

Okay. Present, please. This is going to throw me back to the beginning. Don't throw me back to the beginning. Yeah. Okay. So there you go. There's also a link at Palace Family Steakhouse dot com. Thanks. I think... oh my gosh, I came in like four minutes under. That's awesome. Okay, um, yeah, wow, everybody stayed, it seemed like, and that's just fun. I hope I didn't completely waste your time. Thank you very much.