I was asked on a recent project to explore whether a machine learning model could detect humans in a photo, differentiate them from the background, and replace that background with an image of the user's choosing: essentially green screening without the green screen. I was really intrigued by the task, and I was able to work through it, though it had some difficulties along the way, so I figured I'd make this tutorial video. It's a fairly specific use case and may not match what you're trying to do, but I also think this video is a good introduction for non machine learning experts on how to create a dataset, train a machine learning model, and then visualize the results of that model, so hopefully some of it will be useful to you. Okay, so to illustrate what DeepLab is all about, I figured the best way to show it is through a piece of software called Runway ML. This is a great toolkit for people who are more creative but less code-centric and like to have a visual interface for things. It's in beta right now; you can download it and get it running on your computer. Once you do, there's a menu on the left where you can browse models. You can see I've already got things running, but there are all kinds of machine learning models you can install, and the one we're using is DeepLab. I've got it handy in my left menu, so I'll just show it to you. What you would do is add it to a workspace, a new workspace, and those workspaces will show up in the menu on the left. So here's DeepLab. As an input, I've got my webcam running, and you could also process a folder of images or a video. Rather than export this somewhere, I'm just going to have it run a preview, and you can see all these different colors where the camera is trying to recognize things.
It looks like a lot of noise, probably because my surroundings are not that recognizable, but you can see the outline of me fairly clearly, so the machine learning model has detected that I am a human. If we can do this well enough, then we can create a perfect silhouette around me, everything else can be considered the background, and we can put me on, say, a tropical island instead of in this drab room. So that's DeepLab, Runway ML, and machine learning happening more or less in real time. I'm going to shut this down, and we're going to get into the tutorial. So I wrote a little article on this, and I'll publish a link to it; it's going to be my cheat sheet as I walk through everything, because there are a lot of steps. As a first step, I think it's helpful to get a good programming environment in place. We're not going to do a ton of coding, but there is a little bit involved, and since setting up a coding environment can be a significant hurdle, it's best to get it out of the way before you try to gather your images and so on. Otherwise you'll do all that work and then be tearing your hair out because you can't get your programming environment up and running. This is going to require Python and a library installer for Python called pip, and I feel the easiest way to get those up and running is through something called Anaconda, if you've never heard of it. It gives you access to all these different Python tools and makes them very easy to install, and it makes it easy to manage multiple environments, or virtual environments, in Python. That's important when you're dealing with machine learning, because different machine learning models have different requirements and different versions that work and don't work, and it can be quite frustrating to manage all of those for one project and then switch to another with totally different requirements.
Anyway, if you download Anaconda (I'm using Anaconda Navigator; they also have a command-line-only version called conda), you can set up these different environments by just clicking create, giving it a name, say my-env, and choosing your version of Python right here. They only offer 3.7 here; I could install older ones as well, but 3.7 is going to work well for us. I've actually got this TensorFlow environment already fired up, and once you've done that, you can just say open terminal. I'm on a Mac, so it's going to open the standard Mac terminal. I like to use iTerm, so I'm just going to copy that command, go into iTerm, and paste it; now I've got it going in iTerm and I can close this up. How do you know you're in your virtual environment? It's listed between the parentheses: tf, in my case, is the name I gave it, so we are in our TensorFlow environment. Right now we have some of the dependencies we'll need (you can see pip, Python, and a few other things), but we're still going to need to install more. So the first thing you'll need is the TensorFlow models repo. Specifically, you'll only need the research folder of models. I'll post a link to this, but it's github.com/tensorflow/models; go into research, and you'll see deeplab. Now, I have a browser extension that lets me download specific folders from within a GitHub repo; I can just click this button here. I've already downloaded it, so I'm not going to bother, but if you don't have that, I highly recommend it, and I can look up its name in a second. Otherwise, just download the entire repo and then pull out that models/research folder. I've got this article folder set up, and then there's that models/research folder, and then deeplab, with all the code that we downloaded from the GitHub repo.
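If you'd rather stay on the command line, the Navigator clicks above map to a couple of conda commands. The environment name tf is just what I used; pick your own:

```shell
# Create a Python 3.7 environment and activate it
conda create -n tf python=3.7
conda activate tf
# Confirm you're in it: the prompt shows (tf), and python points at 3.7
python --version
```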
Now, I wasn't able to get this running out of the box. I had a lot of trouble and found some really great tutorials, which I'll post links to as well, that helped me get on the right path, but I bundled all of those fixes up as a bunch of scripts, which are in my deeplab-training GitHub repo, so you'll want to go ahead and clone that as well. You may not have issues with this sort of thing; I'm on a Mac, it might be environment specific, and these issues might get fixed down the road, but for me these scripts were helpful, and hopefully they'll be helpful to you. At the very least, there are some scripts in there that will help in generating your own dataset. There are various folders; you'll see this models/research folder, which supplements the TensorFlow deeplab folder we just downloaded. So go ahead and download this folder, and in your downloads, go to that models/research directory. You'll see a few scripts here; we're not going to overwrite anything that we've already downloaded, we're just going to paste them into place at the same directory structure level. So we've got the eval, train, and vis pqr scripts, and then within that deeplab folder, I've got a dataset already set up, so I'm going to copy this PQR folder into datasets, as well as these two scripts here, build_pqr_data and convert_rgb_to_index. I'll go over what all these things are in a second; just make sure you've got that directory structure. If we go back to the root of this repo, you'll notice that in addition to this models directory, we have a CV directory and a utilities directory. I won't go over CV just yet, and we may not actually cover it in this video, but we will go over the utilities. Mostly these are things that are useful in helping you gather photos for your dataset, the things whose backgrounds you want to substitute, and they'll be useful in training the model. We'll get into that in a second. First, let's make sure our environment is running.
So we've got Python version 3.7.5, and I think this is the command for checking the version. We've got pip 19.3.1. Notice that whenever I run pip, because I have multiple environments of pip and Python, if you're using pip3, make sure you always invoke it as pip3, not just pip. Otherwise, it's going to install to your default pip environment, which may be pip2, and you'll wonder why things aren't working; it's because packages are going to your default pip install. So we look good there. Now I'm going to install TensorFlow. I had issues with TensorFlow 2, and I had issues with tensorflow-gpu even though I have a compatible GPU on my Mac, so I'm unfortunately just running regular old TensorFlow, and an older version at that, 1.15. It's one of the last versions of TensorFlow before they switched to TensorFlow 2. If you can get it working with 2 and you need to, go ahead, but this should work just fine with TensorFlow 1.15. So I'm going to go ahead and install that. It's going to tell me I already have it, because I do. Then I'm going to install Pillow, which is pip3 install Pillow. I've got that; it's version 7 at the time of this recording. There are a couple of other things I need: tqdm (I honestly don't even remember what that's for) and NumPy, which is useful for all sorts of things and used a lot in data science and machine learning. Those should be the requirements we need right now. Now I'm going to want to change into my models/research directory, so I'm going to type cd, then go find that folder on my hard drive and just drag it in. And now I'm in there, and I can see that deeplab folder. So I'm going to be in the models/research directory, and then we're just going to run a test here. I'm going to run it with python3, not python. Again, if you have multiple versions of Python installed, and Python 2 is your default, plain python will run these commands with Python 2.
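Here are all the installs from this step in one place, pinned the way I described, with TensorFlow held back to the 1.15 release line:

```shell
# Always pip3, never bare pip, so packages land in this environment
pip3 install tensorflow==1.15
pip3 install Pillow
pip3 install tqdm numpy
# Sanity checks
python3 --version
pip3 --version
```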
So always make sure to run any Python command with python3. So I'm going to run that model test, and we won't go through this whole testing thing. Oh, it's going to tell me (and I'm glad this happened, because it happens constantly and was very confusing to me) no module named deeplab, even though I have this deeplab folder right here. What you need to run is this command, which just adds it to your Python path. Now, if you run the model test, it should work. No module named nets. Ah, right, sorry: there's one more folder we need from that TensorFlow models/research directory, not just deeplab, but a folder called slim. You want to copy that out of the GitHub repo, so I'm going to download it, copy slim, and put it in the models/research folder. Now let's see if we can run it. Cross our fingers. Seems like we're getting a little further, and it's running the tests. You want to make sure you pass all the tests, so I'm going to let that run and do its thing; I'm hoping that since I've run it before, it's going to work fine. And now we are ready to prepare all of our images, prepare our dataset, and get the stuff we need to train our model. Again, anytime you open a terminal to run Python commands, if you've closed your terminal since last time, make sure to run that PYTHONPATH export to include slim. Otherwise, you're going to get some funny errors. Making a dataset: what are we going to need? You're going to need images of human beings, and we want images without the background, so we don't have to manually cut them all out. Then we're going to need a consistent background image, or at least in my case I wanted a consistent background image. Why? Because this was going to be a photo booth: the background was always going to be the same, and the thing we're substituting out was going to be the same.
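The two commands in question, as given in the models/research setup instructions; remember the export has to be re-run in every fresh terminal session:

```shell
# From inside the models/research directory:
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
# Now the deeplab and slim modules resolve, and the test suite can run:
python3 deeplab/model_test.py
```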
And by having a dependable background image, our machine learning model can learn much more quickly what's a human versus what's background, and do a better job of creating that silhouette around humans. So we're going to pick out a background image that we might imagine being in our environment and have that ready to go. And then, of course, you're eventually going to want images of the backgrounds you might want to swap in: that island we talked about, or, say, the Alps. Things you might want to consider if you're building a photo booth like I am: What are the lighting conditions going to be? How close will people be standing to the camera? How many people will be in the photo? What are their races? Their genders? Their clothing styles? Will they have props? Will they be facing sideways, or maybe always facing forward? Will they be jumping up in the air? All of that can affect how the machine learning model trains. You don't want to overtrain the model on a very specific use case in case people do things you might not expect, but you do want to capture a breadth of what you expect to see in your actual scenario while you're training your model. So next we're going to scrape some images from Google. This is how I trained my model; there are a variety of ways and places you can get images from, and Google is just the easiest for me. I'm just doing a standard Google image search. You'll want to make sure that you're searching for photos you can actually use, right? So ones with open licenses from Creative Commons or similar, but for the purposes of this, I'm just going to show you a standard Google image search. In our directory, in that utilities folder, you'll see this scrapeimages.py file. I got this from Gene Kogan and the ML4A guides repo. Let's see if we can pull that up. Here's their utils repo.
He's got all sorts of useful stuff in here for machine learning. This is scrape wikiart; somewhere in here, I think, is a scrape Google thing. Maybe I didn't get it from him and I'm unfairly attributing it, but he's still got lots of great code. Anyway, this scrapeimages script: let's take a look at it in a code editor. All right, so there's that utilities scrapeimages.py. Let me make it a little bigger. It takes a few arguments. You can specify the number of images, the directory you want to download to (you can see Gene Kogan's username in the default download path, so you'll want to change that to your actual directory if you specify one), and the search term you're searching for. It looks like the default is bananas; we don't want bananas. And then the number of images, which I'm going to make 100. I think it maxes out at 100, and that's a Google limitation, not a script limitation, so if you're scraping using a script, you'll have to scrape in batches of 100 and change your search each time. The other thing I wanted to point out is that in addition to your query, the script specifies some parameters for the Google search. These fit my use case; you may want to change them, but here's what you'll see. The size: I'm downloading medium-size images, because if I downloaded images that are too large, it's going to take forever to train my model. As it is, if I could do it again, I might make them even a little smaller than most of the photos I was downloading. The type is going to be photo versus clip art. They're going to be transparent images, meaning ideally photos of people without their background. And then, to help make sure I'm getting truly transparent images, I'm specifying the file type as PNG. Those are the significant parts of this script. Now we want to run it.
We've got this command here that we're going to copy from the readme, and we're going to go to our terminal. Looks like our test did pass; that only took 26 seconds, I could have waited for it. We're going to paste in this script command, but we need to change some stuff, obviously. The directory is going to be the path where we want our scraped images; let's put them in a folder called scrape and a subfolder called google. Let me just figure out where we are in the terminal here; I don't remember. Okay. The other thing I need to mention is that this script requires a couple of dependencies. One of them is Beautiful Soup, which aids in scraping, and the other is urllib2. There's a newer version of urllib, and there are different ways to import it; I think the original script said from urllib2, which didn't work for me, so I had to just do import urllib2. That's the other change I made to the script. Okay, so I've got my script command here, scrape images, and I'm going to run it. I'm searching for worker, we've got 100 images, and they're going to the scrape/google directory, so I should start to see those populate in here. There's one. It's slow, but it should speed up a little bit, and you can see these are transparent images of workers. We've got some clip art in here we probably don't want, and we're starting to get some 403 Forbidden errors. For whatever reason, you might not be able to scrape using the user agent string we specified in here, which is this, or maybe the file type is not right; hard to say. But we are getting a bunch of images. Okay, so I've got a bunch of images here, and what I've done is clean them out to make sure I've only got useful images: in my case, people, no background, no stock photography, no weird artifacts in the images, just a pure transparent background around the silhouettes of people. And you can see I've got all kinds of different people. It's not just guys in hard hats, although mostly it is.
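To make those search parameters concrete, here's a hypothetical sketch of how a scraper like this can build its Google Images URL. The function name is mine, but the tbs filter codes (medium size, photo type, transparent, PNG) are the real Google Images filters the script sets:

```python
from urllib.parse import urlencode

def build_search_url(query, page=0):
    """Build a Google Images URL filtered to medium, transparent, PNG photos."""
    params = {
        "q": query,
        "tbm": "isch",  # image search
        # isz:m = medium size, itp:photo = photo (not clip art),
        # ic:trans = transparent, ift:png = PNG file type
        "tbs": "isz:m,itp:photo,ic:trans,ift:png",
        "ijn": page,    # result page; each page caps out around 100 images
    }
    return "https://www.google.com/search?" + urlencode(params)
```

You'd then fetch that URL with a browser-like user agent string and parse the image URLs out of the response with Beautiful Soup, which is essentially what the script does.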
There are some office workers and things like that. We're going to want to download way more imagery than this, but for now, that's all I'm going to show you of that process. Then we're going to want to get a background image, so let's look for construction site images. Whatever size we get here is going to be what we composite our foregrounds onto. I've got this one here; I'll go ahead and save it, and we'll put it in the scrape folder, in the BG image folder for now: construction site. Okay, so I've got my background image, and I've got a bunch of foreground images, and I've got the first one loaded up here in Photoshop. I've also got all these Photoshop actions, which are in the repo; you can load them by going to the actions panel in Photoshop, choosing load actions, selecting that actions file, and importing it. There are several different ones. The first one we're going to use is place and save, but we need to make some modifications to it, so I'm just going to do that. We need this to match the dimensions of our background image, so I've got it sized at 1280 by 960. Then, for the place command, we want to place the background image from a location on your hard drive; mine's in this BG folder, construction site. So I'm going to do that: I've re-entered the place step. The rest of this should be fine until the export step, where you want JPEG at quality 60, and the location will be this composite folder. You can call it whatever you want; just make a folder where you can save all your exported images. Those are all the changes you need to make to that action. I'm going to go into that composite folder and delete this file, because it didn't run the whole action. But now we can automate that action in batch: we choose our set, choose place and save, and choose a folder, which is going to be all the images we want to import, so this google folder.
I'm suppressing all the dialogs and errors and things like that, and the destination is none, because we already have an export specified in the action itself. So if I run this, you can see it's running and compositing these people into the images, and we can see it better if we look in our composite folder at what's being created. You can see the background is the same size every time; it's bringing in the people and moving them to the bottom center, so that hopefully they all fit. If we've got any images of people that are bigger than the background image itself, then they'll obviously be out of frame, and we'd probably want to delete those before we train our model if we can't see their faces; likely in your real photo booth scenario, your background replacement scenario, that situation won't occur anyway. By the way, you may be more familiar with a tool like ImageMagick, or Python's Pillow library, or OpenCV for doing this same sort of image compositing. If that's your skill set, definitely use those; they're great. This just happened to work for me, and hopefully it might work for other people who are more familiar with Photoshop than with these command-line file compositing tools. So I'm just going to close all of these. The next thing we need to do is create our segmentation images, and remember, that's just going to be the color-coded version where the foreground is one color and the background is black, to indicate the difference between the two. So we're going to go back in and load up one of our original images and go to our actions palette, and the action we want this time is segment. Again, we're going to need to change some of this stuff: 1280 by 960, to match whatever we exported our last images at. We're not placing anything in, so that's fine.
But we are saving these to a different directory; let's call it a folder called segment. That's where we'll save them out. The only other thing is that you might want to change this fill color. In here, I've got it set to the segmentation color for a person. If you're not segmenting people, then you want to look up the color table for the object class you are segmenting, so let's pull that up. Should be able to find it here. Okay, here's our color table. What you'd want to do is sample the color of your desired category, whatever you want to be training on. So if it was boat, it looks like we can grab a hex value here, and you'd go into that Photoshop action and fill with that color. But I'm going to leave mine as it is. And real quick, I'm going to make one change here: this convert to sRGB should not be checked. That converts the color to a monitor-adjusted color, and we don't want to make any color adjustment, so I'm going to uncheck it and save. Okay, so that action is good to go. I'll close that, go into segment, and delete what we just created, and now we'll automate a batch of the segment action, choosing our scrape/google folder as the source. We don't need a destination because we've already got that specified, and now it's going to start processing these images. Let's go to our Finder and see what it's creating. Okay, so you can see, it's basically just importing those transparent PNGs, selecting them, filling them with our desired segmentation color, then making a background layer and filling it with black. This is going to work great for training our model. And it's all done. Again, we want a way bigger dataset than what we have here, but for tutorial purposes, this is going to work just fine. Okay, so we've been operating out of this scrape directory, and we've got these segmented images and these composited background-plus-foreground images.
Now we just need to move all of those into our dataset, which lives in the models/research/deeplab directory. In here, you've got datasets, and I've just got one called PQR. I've got some temporary images in here, but I'm just going to paste my new images in, and then I'm going to paste my segmented images into the SegmentationClass folder. Okay, and now what I need to do is split this dataset up into images that I'm going to train the model on and images that are going to be used to prove that the model training has worked: training and validation datasets. Just for the purposes of this tutorial, I'm going to grab roughly the first two-thirds of the images for training. There are tools to do this more randomly and a bit better, but let's just grab some images here. Basically, in this ImageSets directory, we have our text files with our training set, our validation set, and then a list of all the images in both the training and the validation sets. So I'm going to open this up and get rid of what's in there; it's just a placeholder. Then I'll go into my image directory, and for the purposes of demonstration, let's grab roughly two-thirds of the files here and paste their names in. It's pasted them in with the extension, which I'm actually going to remove. I should probably redo these scripts so they allow this file to include the extension rather than hard-coding the extension, but I'll just roll with what we've got for now. So that's our train file. Then I'm going to copy the rest of these, and they'll go into our validation file, so paste those. And for trainval, which is all of them, I'll just grab train (too many windows here), paste those in, then validation, paste those in. And that's a list of all our files.
Now I'm just going to do a quick find and replace on those extensions, removing that JPEG extension, and now we have our lists of training, validation, and all images, formatted as the model needs them. The last thing we need to do: the SegmentationClass images are RGB images, meaning that when the model trains, it has to read an R, G, and B color value, three different variables, essentially, and we want to reduce the dimensionality of that so it only has to read one thing. So one smart thing the DeepLab dataset tooling does is index those colors: when it sees black, instead of seeing R 0, G 0, B 0, it just says this is index zero, and when it sees that pink, it says this is index one. So we're going to convert those images, and we've got a script that's going to help us do that. It's this one here; let's take a look at the convert_rgb_to_index script. Pretty simple. The main thing you need to pay attention to is this palette. We're segmenting people, and this RGB value, 192, 128, 128, is what we're looking for; that's that pinkish color. And then 0, 0, 0, which is black, is going to be our background. So black is index zero, pink is index one, and the palette is set up. We're not trying to detect any other objects; otherwise, we would need their color codes and we'd add them to this palette. And you can see the paths here; this is just where it's looking for those things. If your folder structure varies for some reason, make sure you change this segmentation class path. It's going to save out to this segmentation class raw path. That's all you need to know about that script, so let's run it. In here, we've got the command to run; our directory is going to be where that script is, the datasets directory. So you can see the datasets, and I'll just paste that. And we got an error: it's this .DS_Store file. Now, if you've ever dealt with this before, it's kind of annoying.
It's just a file that the Mac file system stores in directories. So let's delete that .DS_Store from PQR/SegmentationClass, and then let's check our JPEGImages directory too; should be good. Now let's run that script again and make sure we got everything. All right, so now you can see we've got this SegmentationClassRaw folder, and it's got all of our images. If we look at these images, they look like they're just black. Don't be worried about that; it's not a problem, per se. You can see the file size here is 14 kilobytes, so it's obviously storing some kind of information, and in fact it's just storing an index value: a zero or a one. Our preview application isn't set up to display that properly, but the segmentation mask is there. Okay, so next we need to create a tfrecord folder. TFRecord is TensorFlow's database storage format, a way of optimizing storage when we're training our model. I don't know much about it, to be honest, but we have a script in here that's going to help us with it. It's called build_pqr_data, and it's a copy of the build_voc2012_data file; I made a copy so I didn't change anything, in case you ever want to refer back to the original. What it has in here is the image folder (and obviously, you'll want to change these paths if you named things differently): I've got PQR/JPEGImages for my JPEG images, my SegmentationClassRaw folder is in here, my ImageSets folder is specified in here, and then I've got my tfrecord folder. The script won't create that tfrecord folder for you, so you'll want to make sure it's in place; let's make that tfrecord folder now. Okay, so now we've got everything we need, and we should be able to run this script. Oh, the other thing: check your file names. I've got these hardcoded for JPEG and PNG, and you'll want to make sure you set those to whatever formats you saved out to.
Looks like we've got another PNG in here from my last project. Save that. Then we're going to run this script from the same datasets directory we're in. Okay, our conversion is done, and if we look at this tfrecord folder, we've got a bunch of files in here, so we should be good to start training. Okay, so there's one other thing you may want to do before you start training; actually, I'd recommend it. To make the training much faster, we want to use a pretrained model, which already has a concept of the thing we're training on. In this case, we're using the Pascal VOC 2012 model; I've got a link up for it here. You'll want to download it and extract it into the exp/train_on_trainval_set/init_models directory, and just unzip it as is. You'll see the model checkpoints; those are going to be the basis we train our model on top of. When we train a model on top of another model, it's called transfer learning. This pretrained model will already have a concept of roughly what a human looks like, and we're going to make it more specific by training it to recognize our specific background and the humans on top of it. Okay. Once you have that, our training scripts are located in the models/research folder, so I'm going to change into that directory, and the script we're going to run is train_pqr.sh. This is a bash script that calls the train.py file. Then we've got exp, which is where everything's going to export to. There are a few parameters here you could play with; I honestly haven't played with them that much. The crop size you do want to pay attention to: you'll want to set it to the height and width of whatever you're training on. Again, if we look at our images, they're 1280 by 960. That's honestly pretty big to be training on; I'd suggest you make your images smaller, otherwise it's just going to take longer.
Then you're going to set your number of iterations, which is a parameter you pass here. Once you've got that script set up, we can run our training command. Just a heads up: this is going to take a long time, even if you've got a fast computer. If you're running on CPU, it's going to take a really long time; if you're running on GPU, you might expect it to be 8 to 10 times faster, depending on the machine. The size of your images affects things, and the number of iterations you train for affects things, so you need to balance those variables and play with them. I can't tell you the ideal ratio; it all depends on your dataset. So our training command is this one here; we're in models/research, and here we go. We got no module named deeplab: that's the issue with starting the terminal up for the first time, where you need to run that PYTHONPATH export with slim to make sure deeplab imports properly. Let's try again. It's going to do some basic setup before it starts training. Okay, so eventually you'll see this starting queues step, and then it's going to report how many seconds it takes between each step; that's how you know it's training. Now, as it trains, you're going to see these checkpoint files start to build up in the PQR dataset's exp/train_on_trainval_set/train folder. This is at checkpoint zero, and you can see it's about a 330 megabyte file. That is essentially the conclusions your model has reached about your training set at this particular iteration. As you train your model, those files keep being generated, and I think, depending on what you set up, every four checkpoints or so it clears the oldest one out and keeps filling this directory until training is fully done. You'll also get a loss value over time; that loss value will go down until eventually it reaches zero or close to zero.
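For reference, the invocation inside train_pqr.sh looks roughly like this. The flag names come from DeepLab's own documentation, but the values and shell variables here are abridged placeholders for this tutorial's PQR layout, so treat it as a sketch rather than a copy-paste command:

```shell
python3 deeplab/train.py \
  --logtostderr \
  --train_split="train" \
  --model_variant="xception_65" \
  --train_crop_size=961,1281 \
  --train_batch_size=1 \
  --training_number_of_steps=3000 \
  --dataset="pqr" \
  --tf_initial_checkpoint="${INIT_FOLDER}/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${PQR_DATASET}/tfrecord"
```

Note the crop size is given as height,width; some versions of the script expect the flag passed twice (once per dimension) instead of comma-separated, so match whatever your checkout uses.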
So beyond that, you're not going to have a visual cue of how well your model trained or what the results of that training were. For that, you need to run the vis_pqr.sh script in the same folder. We run the script, and it starts the visualization process. If we look in our folder structure, we'll see this vis folder with segmentation results, and those will start to populate with images; you get a sense of the progress here, visualizing batch one. If we look in the vis folder's segmentation results, we see both the image that it was visualizing and the mask. I've got a whole other tutorial coming on how to use OpenCV to do the actual background swapping, but I've created a Photoshop action in the meantime, if you just want to get a sense of the two things together, the segmentation mask and the combined image. Just run this merge segmentation action and choose the corresponding mask; it'll place it in there, giving you a merged layer and a mask, and we could trim out the background and see, oh, we lost a little bit of the hat here, but not too bad; it did a pretty good job. So again, that's training a model in DeepLab with TensorFlow, and gathering datasets. The next tutorial video will be on using OpenCV to take these segmentation masks, take a set of photos, and swap out the background with something of our choosing, so keep an eye out for that. Let me know if you have any questions. Thanks.
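As a preview of what that background swap boils down to, it's a masked pixel copy: keep the photo wherever the mask says person, take the new background everywhere else. A minimal NumPy sketch, with names of my own choosing:

```python
import numpy as np

def swap_background(photo, mask, new_bg):
    """photo/new_bg: HxWx3 uint8 arrays; mask: HxW, nonzero = person."""
    keep = (mask > 0)[..., None]  # HxWx1 boolean, broadcast across RGB
    return np.where(keep, photo, new_bg).astype(np.uint8)
```

OpenCV adds niceties like feathering the mask edges, but this is the core of the operation.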