On today's Visual Studio Toolbox, Jeffrey will show us why Visual Studio Code is the place to be for data science and Python.
Hi, welcome to Visual Studio Toolbox. I'm your host, Robert Green, and joining me today is Jeffrey Mu. Hey, Jeffrey. Hey, Robert. Thanks for coming on the show. Yeah, no problem. Jeffrey's a program manager in the Python tools group at Microsoft. This is part two of our Python series. We had Tyreek on in the previous episode to give us an overview of Python. Today, you're going to focus on data science. Yeah, for sure. So what does that mean?
Yeah. So why data science? Well, data science is one of the biggest workloads in Python right now. We estimate that around 30 to 35 percent of all Python developers do data science of some sort. So everyone's trying to do this. Is Python as a language optimized for data science, or is it more of a general-purpose thing? Well, the reason Python is so popular for data science is that it's such an easy language to pick up. A lot of data scientists don't have an engineering background, so it's really easy for them to plug and play with a minimal learning curve, and it's almost the de facto standard in the industry. Okay.
So because data science is so popular, we want to have a first-class data science experience inside VS Code. Most data scientists use Jupyter Notebooks, which is a tool people use to explore and develop code. Because of this, we want Visual Studio Code to offer a first-class experience for data science and for developing with Jupyter Notebooks as well. Our team has been cooking up a lot of really cool new features over the past month, and they were just released this month. So I'm super excited to be here to show you all the cool things we can do with data science inside Visual Studio Code. Cool.
So before we get started: if you're interested in general Python, or something like web development in VS Code, you can check out my colleague Tyreek's video that you mentioned earlier. He also has a video on Channel 9. Okay. There's also aka.ms/vst/pythonvscode, which has some getting-started material, and there was a whole series on Python development. Yeah, so definitely go check that out. We forgot to mention it on air in the last episode, so we'll make up for that now.
All right. Data science. Yeah. If you've never heard of VS Code before, it's completely open source and it's free. It's really easy to transition your Jupyter Notebooks from other editors or IDEs into VS Code, because we now offer a fully functional Jupyter UI inside VS Code and fully functional Jupyter hotkeys as well. So it's really easy to bring a Jupyter Notebook in or create a new one. There's no learning curve. You just plug and play, and it's what you expect. Okay.
To get started, first, if you don't have Visual Studio Code, install it from code.visualstudio.com. Visual Studio Code is a bare-bones IDE, so it doesn't come with the data science features out of the box. You'll need to install the Python extension on top, which gets you those features. To do this, go into Visual Studio Code. On the left-hand side, you'll see the Extensions tab. Click into it and search for the keyword Python. It'll be the first one that shows up, and you can click Install. As you can see, I already have it installed on my machine for the sake of this demo. Okay. Once you have VS Code and the Python extension installed, the last thing you'll need is a distribution of Python on your machine.
If you don't have Python on your machine, you'll need to install it, and there are two main ways to do that. You can download it from the official Python website, or you can get the Anaconda distribution of Python, which I personally recommend. The TL;DR, or the gist of it, is that Anaconda is basically a package manager, and it's really good for data science because it's really good at managing your environments. Anaconda is something most of the data science community uses, so I would personally recommend it. I also have Anaconda installed on this machine, so that's what I'll be showing as well. Got it.
Once you have those installed, the first thing you'll want to do is create a new Jupyter Notebook. If you already have a Jupyter Notebook, you can simply open it up, but for the sake of this video we're going to go through the getting-started experience. Okay.
The first thing we want to do is open the Command Palette. Let me just close this. You can access it through Control+Shift+P, or Command+Shift+P if you're on a Mac, and it'll bring this up. The Command Palette is so good because it has all the actions of VS Code listed in it. If you ever don't remember how to get to some menu or some action, you can just bring this up and search for what you need. In our case, we're going to create a new Jupyter Notebook, so I'll search "create new". You'll see one of the first things that pops up is Create New Blank Jupyter Notebook. If we click on this, it'll open up what we call our Notebook Editor, which is the new feature we have for editing Jupyter Notebooks. And you might be asking, what's the difference between a Jupyter Notebook and a regular Python file?
The main thing is that in a Jupyter Notebook file, your code is segmented into different sections, which we call cells, whereas in a Python file everything is all together. What's really good about Jupyter Notebooks is that these cells can be run individually. You can run individual pieces of your code, even multiple times, and you don't have to run the entire file to see a new output. That's where Jupyter Notebooks are really useful: you can experiment, test, and change one or two things just to see if that makes your data better.
Now that we're in our Notebook Editor, we can go through the UI. If we look at the notebook itself, we'll see that when you make a new notebook, there's one cell already. You can see there are buttons in the cell and there are also buttons up top. The buttons in the cell are what we call cell-level actions; they affect the cell itself. The ones up top in the toolbar are what we call notebook-level or global actions; they affect the entire notebook.
We'll go through the in-cell actions first. On the left you have your move cell up and move cell down buttons. I only have one cell here right now, but if I had another cell below, I could click this button and it would move the cell below that one. To the right of that, we have our execution counter. It tells you the order in which the cell was run relative to the other cells, so it'll show a number like one, two, or three. I haven't run this cell yet, so there's no number beside it. Below that we have our run button. Let's say I write some code, print("hi"). You can just click that run button and it'll execute that cell. So we'll see the word hi.
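To make the cell idea concrete, here is a minimal sketch of how a two-cell notebook could be laid out on disk. It assumes the standard nbformat 4 JSON layout that `.ipynb` files use; the cell contents and the `summarize_cells` helper are hypothetical, chosen only for illustration, and this is not how the VS Code Notebook Editor is implemented.

```python
import json

# A minimal notebook in the nbformat 4 JSON layout: a list of cells,
# each carrying its own source, execution count, and saved outputs.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,  # this cell was the first one run
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
            "source": ["print('hi')"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not run yet, so no counter is shown
            "metadata": {},
            "outputs": [],
            "source": ["x = 2 + 2"],
        },
    ],
}


def summarize_cells(nb):
    """Return (cell_type, execution_count, first source line) for each cell."""
    rows = []
    for cell in nb["cells"]:
        first_line = cell["source"][0].rstrip("\n") if cell["source"] else ""
        rows.append((cell["cell_type"], cell.get("execution_count"), first_line))
    return rows


# An .ipynb file is this structure serialized as JSON text.
ipynb_text = json.dumps(notebook, indent=1)

for row in summarize_cells(json.loads(ipynb_text)):
    print(row)
```

Because each cell stores its own execution count and outputs, a tool can rerun one cell and update only that cell's entry, which is what makes the run-one-piece-at-a-time workflow possible.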
Below that we have our buttons for run cells above and run cells below. Run cells above will run all the cells above the current cell, not including the current cell. Where this is really useful is when you have a lot of cells, maybe 20 or 30, that you need to get through before you can run the current cell. Instead of clicking the run button on all 30 of those cells, you click this one button and it'll run them all for you. Similarly, run cells below will run the current cell and all the cells below it. So it minimizes the number of clicks you need.
The next one is the switch to change the cell to Markdown. Jupyter Notebooks support not just Python as a language but also the Markdown syntax. Markdown is basically a way to pretty-print your text. You can have bold text, lists, and so on. This is for commenting your notebook and making it more readable. The last one is delete cell. If you click on that, it'll just delete the cell.
Now let's go over some of the global actions in the notebook. At the bottom of your notebook, you'll always have this insert cell below button, so it's really easy to add a new cell below whenever you want. There's also an insert cell above button, which works like insert cell below but adds the cell at the very top. It only shows on hover. If you click on that, you'll see it adds a cell above.
Now let's go through the top-level toolbar. The first two are your kernel actions. First is restarting your Python kernel. If anything happens to your kernel, if it gets into some weird state, or if you ever need to restart, you can just click on that and it'll restart your kernel. The next one is interrupt kernel.
This is for when, let's say, you're running some really long code cell and it gets stuck in an infinite loop, or it's just taking a really long time and you want to stop it. If you click on this button, it'll stop the execution. The next one is another insert cell button. This will insert a new cell below whatever cell you have focused. For example, I have a cell here; I can click this and it inserts a new cell below. The next one is your run all cells button. Like run all above and run all below, if you just want to run all the cells in your notebook to see the output, you click this button instead of running all your cells individually. The next one is clear all output. You can see this print statement has an output of hi. I can click on this and it'll remove all output in my notebook.
The next one is what we call the variable explorer. It'll show all the active variables in your notebook. Currently I don't have any variables in this notebook, so nothing shows up, but later in this video I'll show a more advanced notebook and go into this feature in more depth. Next is save notebook. And finally there's convert and save as a Python script. This is what I was mentioning earlier: we have a feature where you can convert your notebook into a Python file. Again, I'll go into more depth on this feature later in this video.
So let's move on. That was when I created a notebook from scratch, but let's say you already have a notebook that you want to bring into VS Code. For example, I have this "data science is cool" notebook, which it is. Let's open that up. You can see that it also opens up in our notebook editor. This is a notebook I already have, and you can see that the outputs are already saved.
So if you already have a notebook with the output saved from another notebook application, it'll keep that output in this one too. There's also full support for IntelliSense and IntelliCode. As you can see here, let's say I remove this statement and I want to write plt.show. You can see IntelliCode shows up with the suggestion for show. There's also IntelliSense for the whole API, so if I do this, you can see it also shows the function signature of what show does. So plt comes from matplotlib.pyplot. What is that? Matplotlib is one of the most popular plotting libraries for data science. It's really good for graphing and seeing your data.
The first feature I want to go through in more depth is called our plot viewer. You can see here that this cell generated a plot as an output, and there's a lot of data. The plot's kind of small, so it's hard to see the data clearly. What you can do is click on this top-left button here, and it brings up the plot viewer in a new window. The plot viewer will bring up a bigger image of that plot, and then you can do really cool things, such as zooming in on the plot. Let's say you want to look closer at where it crosses the x-axis. You can see it here. It also lets you save the plot in different formats, so if you want to save it as a PDF or as a PNG image and share it with other people, you can do that as well.
Now let's go back to our notebook. The other cool feature I mentioned before was the variable explorer. Let's say I open that. I just ran this code cell, and you can see that p shows up, because p is a variable I just made.
The variable explorer shows the current state of all your variables. Where this is really useful is when you have many cells that you run, and maybe sometimes you run them out of order because you want to fix some code or run something again. Many times you won't remember what a variable was or what state it's in, and you'd have to print it out. The variable explorer is really useful because you can just open it up and see at a glance exactly what your variables are. It has the name of the variable, so in this case we have p. It has the type of the variable, so we have an ndarray type. It has a count, which is basically the length of the variable, so we have a size of 100 here and it shows 100. But the most interesting part is the value. This is really useful for data sets.
This also ties into our new feature called the data viewer. If you look on the right, there's a button here for array or ndarray types and data frame types. It'll say show variable in data viewer. If you click on this, it'll open up a new window and give you an Excel-like interface with all your data. This is really useful because it lets you sanity check or look at your data set without having to write code to do it. It just does it for you. Even more useful: let's say you want to do a sanity check to make sure there's nothing negative, or to see how many values are less than one. All you need to do is click this filter rows button and you'll see this text box pop up. In this text box, let's say I want to make sure that none of my values are negative. All I have to do is type less than zero and I can see, oh, there's nothing matching that.
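As a rough sketch of what the variable explorer columns (name, type, count, value) and a data viewer row filter amount to, the snippet below builds the same kind of table by hand. Everything here is an illustrative assumption: `variable_table` and `filter_rows` are hypothetical helpers, and a plain Python list stands in for the 100-element ndarray from the demo so that the example needs only the standard library. It is not how the extension computes these views.

```python
import random


def variable_table(namespace):
    """Build (name, type, count, value preview) rows, skipping private names."""
    rows = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue
        # Sized objects report their length; anything else counts as one item.
        count = len(value) if hasattr(value, "__len__") else 1
        preview = repr(value)
        if len(preview) > 40:
            preview = preview[:37] + "..."
        rows.append((name, type(value).__name__, count, preview))
    return rows


def filter_rows(values, predicate):
    """Keep (index, value) pairs whose value satisfies the predicate."""
    return [(i, v) for i, v in enumerate(values) if predicate(v)]


random.seed(0)
# Stand-in for the 100-element array in the demo: values between 0 and 5.
p = [random.uniform(0.0, 5.0) for _ in range(100)]

for row in variable_table({"p": p}):
    print(row)

negatives = filter_rows(p, lambda v: v < 0)  # analogous to typing "< 0"
below_one = filter_rows(p, lambda v: v < 1)  # analogous to typing "< 1"
print(len(negatives), "negative values;", len(below_one), "values below one")
```

The point of the built-in views is that this kind of throwaway inspection code does not have to be written in the notebook at all.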
Or if I want to see everything less than one, it'll give me all the values less than one. So that was the data viewer.
The next thing I want to show you is Jupyter hotkeys. Like I mentioned, Jupyter hotkeys can make your workflow more productive. Instead of having to click for all these actions or find them in the menus, you can just use hotkeys, and we have full support for many Jupyter hotkeys. For example, there's Shift+Enter, which runs the cell. If I want to go into command mode, I can press Escape, and you'll see this turn blue. Then I can navigate between my different cells. And if I want, I can press D twice to delete. There we go. For the full list of Jupyter hotkeys, you can check the end of this video or the description, where we'll have a link to our documentation. There are a lot more hotkeys supported.
The next thing I want to go through is remote Jupyter servers. Right now I'm running on my laptop right here. It's pretty fast, but obviously not as fast as some server in the cloud, right? A lot of machine learning and data science tasks are really compute intensive, and I don't want to sit here for maybe a few days or even a week waiting for something to run on my machine. So we have the ability to connect to a remote Jupyter server and leverage its compute power. To do this, we go back to the Command Palette, so Control+Shift+P, or Command+Shift+P if you're on a Mac, and search for the command Specify Jupyter server URI. We'll see that show up. All you need to do is click it, and you'll see that by default it's running the local Jupyter server. So right now it's just running on my local machine.
But if you have a remote machine in the cloud and you want to leverage the compute power of a GPU or a really powerful CPU, you just click on this button, and here you can enter the URL of whatever server it is. So that's really useful too. This entire interface also supports Remote SSH with Visual Studio Code, so you can connect to a remote server that way as well.
The last thing I want to show you, like I mentioned previously, is convert and save as a Python script. In this scenario, let's say I'm pretty happy with what I've done so far in my notebook. The cells are generating the right plots and the code seems right, and now I want to turn it into a production service, like an API that others can use. Or I want to convert it to a Python script so that people can run it from the command line and don't need a Jupyter notebook to do it. Well, all we have to do is click this one button, and it'll automatically convert all my notebook code into a Python file. Before, I would have had to manually copy and paste all the code in the cells into a new file. Let's say you have a huge notebook with hundreds of cells; that could take an hour or so. This makes it almost instant, and we can see that here.
Can you go in the other direction, if you have a Python file? Yeah, for sure. You can go back to a Jupyter notebook. There's also a feature that lets you go the other way. So it really works both ways: you can go from one to the other and work in whatever you're comfortable in, and we want to encourage that flexibility. Once you convert and save as a Python file, it'll open up in what we call our Python interactive window.
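The manual copy-and-paste that the convert button replaces can be sketched in a few lines. This is only an approximation under stated assumptions: it reads the nbformat 4 JSON layout, marks each cell with the `# %%` comment that the VS Code Python extension recognizes as a cell boundary in a `.py` file, and turns Markdown cells into comments. The `notebook_to_script` function and the example notebook are hypothetical, and this is not the extension's own conversion code.

```python
def notebook_to_script(nb):
    """Join notebook cells into one Python source string with `# %%` markers."""
    chunks = []
    for cell in nb["cells"]:
        source = "".join(cell["source"]).rstrip("\n")
        if cell["cell_type"] == "code":
            chunks.append("# %%\n" + source)
        elif cell["cell_type"] == "markdown":
            # Markdown becomes comments so the file stays valid Python.
            commented = "\n".join(
                ("# " + line).rstrip() for line in source.splitlines()
            )
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


# Hypothetical three-cell notebook used only for this illustration.
example_notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Sine demo"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": [
                "import math\n",
                "values = [math.sin(i / 10) for i in range(100)]",
            ],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(len(values))"],
        },
    ],
}

script = notebook_to_script(example_notebook)
print(script)
```

Because the markers are ordinary comments, the resulting file still runs top to bottom with a plain `python` command, while an editor that understands `# %%` can offer per-cell run and debug actions on the same file.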
And you might be asking, what's the difference between our notebook editor, our Python interactive window, and a regular Python file? Well, the main difference is that our notebook editor is mostly for Jupyter notebook files, the ones with the .ipynb extension. It gives you that traditional Jupyter notebook interface where you have cells, your input and output are inline, and you get the general Jupyter UI. The Python interactive window here is kind of a hybrid of a traditional Python file and our notebook editor. You'll see it's a Python file, but you also have an overlay for our Jupyter notebook cells, because it knows it came from a Jupyter notebook. So you'll see these Run Cell, Run Below, and Debug Cell actions. You kind of have the best of both worlds in this case.
Let's just save this file real quick first. I already have one that's named test, so let's name it test1.py. You can see this is a traditional Python file with the .py extension, but we also have our overlays for Jupyter notebook cells: Run Cell, Run Above, and Debug Cell. Where this is really cool is that you can still run individual code cells as if they were in a Jupyter notebook, but you can also run the entire file like a regular Python file.
What's even cooler is that, because it's a Python file now, we also have the ability to debug cells. Instead of having to debug the entire file, you can debug just the individual cell. So if I want to debug this cell, for example, I can just click Debug Cell.
You can see it starts stepping through the cell line by line. If I want to go through each line of code, I can just click step over, and you'll see it keeps going through each line. You can see the variables and the call stack update as well. Cool.
The last feature I want to go through in the Python interactive window is the input box. We have a fully functional IPython REPL window at the bottom right. Here you can type in code and run it alongside your existing Python file. This window also has full IntelliSense and IntelliCode capability. It has the context of what you ran previously, and you can run new code to update your current state. For example, I've run this cell, so it has the context of NumPy, Pandas, and Matplotlib. Here I can create a new variable, let's name it x, and I'll make an array of zeros just as an example. So I can say zeros, and let's make it size 10. And you can see that runs as well. Cool, yeah.
So as you can see, my entire data science workflow, from getting started, to creating a new notebook, to bringing my own notebook into VS Code, and then doing the experimentation and the debugging, was all done inside this one tool, Visual Studio Code. That's where I think our tool excels: you don't have to switch between different tools, and everything can be done in this one really amazing tool. That's fantastic. And this just came out this month, so it's all brand new. I encourage everyone to go try it out for themselves. All you have to do is download Visual Studio Code and the Python extension, and then bring your own notebook or create a new one and explore for yourself. Awesome, all right, thanks for showing us that. Yeah, thank you so much for having me here.
For anybody doing Python, data science, or Jupyter notebooks, this is absolutely the ideal environment of choice. For sure. Cool. And we're putting a lot of our focus on this tool, so we're planning to have a lot of new features coming out in the coming months. Hopefully I'll be back soon to demo even more. Absolutely. All right, cool. Hope you enjoyed that, and we will see you next time on Visual Studio Toolbox.