So one of the things I wanted to do is show you a real-world application that I personally use Jupyter for: data analysis. I literally have a folder on my computer called typos analysis, where I like to do my analysis of student interactions with Typos. And you'll see that this is a very similar structure to how I teach it in the class.

We've been working off of the iris data set for a few lectures now, and I happen to have a folder on my computer that I like to call data sets. I sometimes use data, I sometimes use data sets; you get the idea of that terminology. The analysis folder is where I do the vast majority of my programming, or my analysis. I also like to have a folder for images, and the idea here is that maybe I want to save my graphs as I'm doing my analysis, so here's the folder for saving those graphs. I do have some other things, and that's mostly just more data sets.

But specifically I want to go into that analysis folder. Once again, you see I've done a few different types of analysis: usability studies with Typos, measuring the learning gains of students who used it. But for our sake I want to go off of just my everyday query: are students using Typos, and when? And that's exactly what I'm going into. You can see I've done this analysis a few times, with a different semester each time. In this case, for example, I want to look at yours. So again, this is the file that I use to know what the general activity inside of Typos is.

Now I'm going to warn you: as you can see, yes, it's a lot of information, and I'll even increase the size just a hair so you don't have to squint at it. I'll try and explain it as it goes along. That first little snippet is very similar to our example video, where I'm just showing you some import statements.
And I've got a few other things. I've got a little line here because, once upon a time, Jupyter wouldn't immediately load the matplotlib files, and this little trick here did that. And I haven't removed it because I'm lazy and it works, and don't mess with my code.

Then just some other things. When certain parts of, say, pandas or NumPy or any library are being deprecated and removed, or are no longer developed, they'll show warnings. I don't want to see those warnings, so I'm just ignoring them and moving on.

This nice little bit here: think about what you're seeing for a second. This is me overriding how to do that filter snippet that I've shown you in the previous videos. So literally this little line here, if you recall, is what doing any type of filter comes down to in a nutshell. And I won't lie, personally I dislike doing that. I hate the fact that I've got to put the data frame and then the data frame again, and there's like four square brackets. I'm lazy; I don't want to do that stuff. So this is a great example of a website, Stack Overflow, and specifically a technique that I found on that website and love. I very commonly use this little snippet. It's actually part of my setup process now when I do data analysis, just because it makes things so much cleaner in my opinion, so I put it there all the time.
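Here is a minimal sketch of what a setup cell along those lines might look like. The exact snippet from the video isn't reproduced in this transcript, so treat everything here as an assumption: the helper name `where_eq`, the tiny sample data frame, and the idea that the helper is attached to `pd.DataFrame` as a method are all hypothetical stand-ins for the Stack Overflow technique being described.

```python
import warnings

import pandas as pd

# In a notebook, older setups often also needed the `%matplotlib inline`
# magic at this point (an assumption about the "little trick" mentioned).

# Silence deprecation chatter from libraries; a blunt, notebook-only choice.
warnings.filterwarnings("ignore")


def where_eq(df, column, value):
    """Return only the rows of df where `column` equals `value`."""
    return df[df[column] == value]


# Attach the helper as a method so filters can be written as a short call.
# `where_eq` is a hypothetical name, chosen so that it does not shadow any
# built-in DataFrame method.
pd.DataFrame.where_eq = where_eq

# Hypothetical sample data in the spirit of the iris data set.
iris = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica"],
    "sepal_length": [5.1, 4.9, 6.3],
})

# The usual spelling, with the data frame named twice:
setosa_long = iris[iris["species"] == "setosa"]

# The shorter spelling through the helper:
setosa = iris.where_eq("species", "setosa")
```

Both spellings select the same rows; the helper just hides the repeated data frame name and the nested brackets.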
This next little bit is just converting a string that is a date into a Python datetime object. The reason why is that way I can do sorting analysis and I can see whether one date happened before another, some great things with that, so I use it. This next little bit is the configuration setup: how many weeks I want to go back. I literally want to go 16 weeks back because it's a 16-week course. And then date format is just how I want to display different dates for the visualization.

This next little bit here is effectively how I connect. Typos is a web platform that is connected to a database. When you submit your exercises, they are stored in the database for me to do analysis on. And I don't have to worry about you trying to hack in, because I'm working off of a local version of it. I don't like to work off of the production version. That's bad; don't do that. If you ever hear the word production in computing, avoid messing with it at all costs. Always work on something we like to call development. But anyway, I have a local copy, and this is my way of connecting to my own personal computer, where I have the login and password, which are separate from the actual database's but still stored in the environment variables on my computer. Great. And there's our course, course number 33 off. Okay.

Now this next line, let me run this. So I just hit Shift+Enter, and there's that line. I didn't get an error, fantastic, and all that data is now loaded into memory.

This next little bit is a giant snippet that took me a couple of hours to build way back in the day. Effectively, I want to see how many students completed exercises, how many students looked at exercises, and what the total number of exercises completed in a given day was, because some students like to repeat exercises and do them multiple times. So again, this is me seeing how many students completed exercises, then how
many students looked at those exercises. And I'm doing some filtering. You can see there's my last name; I don't want my own name in there to fluff it up, so I don't want those. Same kind of thing: I like to do live demonstrations of everything, so I have a demo account that I work off of, and any demo accounts get excluded as well.

Then the next little bit is where I merge those two things together, and in this case I print it out afterwards. This is because I want to confirm that, say, the course ID is manipulated and changed appropriately, and it gives me a visual indication that this code did run and execute. Fantastic, so we see the course ID exists.

Now this line right here I put on its own line because, when I was dusting this off, it was running into some slight problems. I hadn't messed with this code in a year, so I needed to dust it off and debug it, and it was getting hung up on this line; it kept erroring. So when I was debugging I put it on its own line, because then I knew where the problem was coming from, and I worked from there. In this case, that three just tells me yes, this code ran and executed to completion with no errors. Fantastic. Now we can do analysis. The data frame from that query is now loaded into what I call results.

So in our case I am just going to do some cutoffs here; that's just showing me what I want to work off of and where I'm going. So there are seven days, and our window runs from today to 16 weeks ago. Fantastic. I do a quick little jump through it, and then I want to do some different things here. Those numbers of students right now are just strings, so I'm going through and changing them into integers. And then, what do you know, there we go. So you can start to see these are the last
five records of students interacting with Typos. Specifically, it seems, based on this recording, we're over, oh well, you can't see it, but today is Sunday, April 20th, and the last record seems to be that a student was on the 17th. The 17th was Friday, so nobody's been on Typos all weekend. I'm not mad, but it does confirm something. For example, tons of students are using Typos on Thursday when we have class, and tons of students are using Typos on Wednesday or Tuesday when we're having class. As many people looked at those exercises as I would have imagined, but interesting enough. So anyway, we keep on going.

This is just another one to say, well, how many exercises were completed in a given day? It's really the exact same approach. I'm not printing it out, mostly because it would be a long string and it's not doing much for me. But I'm going to do the same thing: run that query and then look, for the past 16 weeks, at how many exercises were completed by students. You can imagine it should be roughly the same. In this case, tons of students are completing: if we look at the number of viewers, there are only, roughly speaking, 84 students, and there are only 85 views of those exercises, but there were tons of completions. People were repeatedly completing them, so that's actually good, in my view.

But like I said, I want to take this data, now that I've done the analysis, and visualize it. I want to see how it looks across 16 weeks, not just five days. So again I'm going to do some quick analysis. In this case I'm adding in a new folder, or sorry, a new column. I'm merging those two data frames that I have together, because they work off of the same time zone, and then I'm creating something called time zone with day. As you can see, all I'm doing is applying a nice little formatting approach where I get to see the day as well, and that's more just kind of visual help because
different classes meet at different times. Maybe you're interacting with it on Wednesday, Monday, Friday, Sunday; I've seen some students get on it on Sunday. So I like to know what day, not just the date, so I don't have to look it back up. This is how you would go about doing that. In this case, you can see I'm calling it time zone with day. I take the results from the time zone, and then I apply some function to it. This is something known as a lambda, which comes from lambda calculus. It's not something we're going to do in this class, but just to explain it a little bit: this says, for each row, whatever the element in the row is, do this thing. And as you can see, datetime transforms that string into a date, and then you take your date and convert it to this date format that we are showing, plus the actual day, the day name shorthand. So that's what it's doing, and then I'm just doing a tail on that.

The last little bit here is now just some quick little, what am I doing here? Let me run it and see. So I'm just doing some formatting of my data. As you can see, I'm masking exercises: mask the time zones for a particular day, so I'm not seeing everything, just certain ones from a particular interval, a window if you will, from the start to the end. Then I check whether that length is equal to zero. It took me a second to think through this, but if nobody went on to Typos on a particular day, I still want that day to exist. So in that case, create a record for that day where nobody looked at or completed a Typos exercise: no usage that day. And then I'm doing a quick little sneak where, once I've added in all these new days, I re-sort my results so that those days flow correctly in a sequential manner.

And now what we're getting into is literally the plotting approach, and there's a lot going on here. Again,
I'm doing three columns for every day. You can see I'm doing some quick formatting, so I'm creating some subplots. In this case I'm using that axes one, so there are three rectangles that I want to work off of, specifically how they appear and what their results are. Then I'm adding some labels; these are mostly my way of creating some extra visualizations. So, for example, the y-axis label is going to be the frequency, and the title is going to be number of students that completed or viewed exercises. And then my indices are, sorry, my tick marks for the x-axis are some set of indices specified by the number of days. I don't want to see all of them; I just want to see a subset.

Then there are just some other little bits going on here. For example, there's the legend. There's the y-axis, where I start at zero so I'm not getting into negatives; that's more visual, stop at zero. Then the tick marks: this is how I'm setting the rotation. So here are the tick labels, and here's rotate that data 45 degrees and let it be right-aligned. That just helps with the visuals. And then there are some other little pieces where I'm auto-labeling that data. That's not something I'm going to be working off of or visualizing here; as you can see, I've commented that out, but it just showed numbers atop each one of the days.

So let's see how this all comes together. As you can see, I've run through each part of these exercises, and then there's this last little thing. This is literally a visualization of all of the student activity through Typos. And yeah, a lot of it I can get. In our course, Typos was a required exercise, so I see a lot of students in this kind of window in between. So that's most likely a Tuesday, that's most likely a Wednesday, that's most likely a Thursday. Tuesday, Thursday, Wednesday, Thursday. And you can see, oh, what do you know, we happen to have a nice little, probably, vacation going on in February, roughly around
that time. So on February 20th we probably did not have class for some odd reason; I'd have to double check. But interestingly enough, if you go to, say, the beginning of March, there was a huge uptick, or, that's actually a Thursday, a little odd uptick. There wasn't as much activity going on there, but there were students that looked at the exercises. I think there was a mistake there; I think that was the get weather data exercise, where you literally couldn't complete it. But then you see the nice little trough of spring break. Some students kind of got on a little bit, someone did, but what do you know, you didn't get on an academic website during spring break. Yeah. But then I can see some more behavior, and apparently some student hopped on over the weekend at one point and just looked around.

So I can see this activity and this behavior based on this visualization. And again, this is what I use Jupyter for on a semester-by-semester basis. I have other instructors who are using Typos in their courses, and this is how I can show them how their students are using Typos during the class.
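To make the last few steps of the walkthrough concrete, here is a rough sketch of filling in the days with no activity, labeling each day with its weekday, and drawing three bars per day. It is not the notebook from the video. The column names, the counts, the date window, and the `DATE_FORMAT` value are all made up for illustration, and it fills the empty days with pandas `reindex` instead of the loop-and-re-sort approach described above.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

DATE_FORMAT = "%m/%d"  # assumed display format, not the one from the video

# Hypothetical per-day counts; 03/06 and 03/08 have no rows at all.
results = pd.DataFrame({
    "date": ["2024-03-04", "2024-03-05", "2024-03-07"],
    "completed": [10, 25, 30],    # students who completed an exercise
    "viewed": [12, 30, 41],       # students who looked at an exercise
    "completions": [18, 60, 75],  # total completions, repeats included
})
results["date"] = pd.to_datetime(results["date"])

# Give every day in the window a row, with zeros where nobody was active.
window = pd.date_range("2024-03-04", "2024-03-08", freq="D")
daily = (
    results.set_index("date")
    .reindex(window, fill_value=0)
    .rename_axis("date")
    .reset_index()
)

# For each date, build a label: the formatted date plus the short day name.
daily["date_with_day"] = daily["date"].apply(
    lambda d: d.strftime(DATE_FORMAT + " %a")
)

# Three bars per day, drawn side by side around each tick position.
fig, ax = plt.subplots(figsize=(10, 4))
width = 0.25
positions = list(range(len(daily)))
for offset, column in enumerate(["completed", "viewed", "completions"]):
    shifted = [p + (offset - 1) * width for p in positions]
    ax.bar(shifted, daily[column], width, label=column)

ax.set_ylabel("Frequency")
ax.set_title("Number of students that completed or viewed exercises")
ax.set_xticks(positions)
ax.set_xticklabels(daily["date_with_day"], rotation=45, ha="right")
ax.set_ylim(bottom=0)  # stop at zero so the axis never shows negatives
ax.legend()
fig.tight_layout()
```

In a notebook the figure would display under the cell; calling `fig.savefig(...)` with a path in an images folder would keep a copy of the graph.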