 Good morning everyone. Welcome to this second part of our UK Data Service workshop, Mapping Crime Data in R. This is the live code demonstration, which will be done in R. My name is Nadia Kennar. I'm a research associate with the UK Data Service, based at the Cathie Marsh Institute at the University of Manchester. This is part two of the webinar that was held on Tuesday, and we'll be walking through some of the basic components for mapping spatial data in R, specifically mapping crime data. So we can pretty much just jump straight in.
For this workshop we will be using RStudio. This is the RStudio interface. There is a bit of a lag, so I'm just going to turn my camera off as well. There we go. RStudio is an integrated development environment for R, which is a programming language for statistical computing and graphics, and RStudio is typically organised into four main panels. We have the source panel, which is where you write your code, so it could be a script, a notebook, an R Markdown file or some sort of Shiny app. There are lots of file types that you can have in your source panel, but this is where you typically write your code and edit your scripts, and this panel supports syntax highlighting, code completion and so on. You then have the console panel, which is what we're looking at here, and the console is just where the R code is executed. The top right panel is your environment and history panel. It shows the current environment that you're working in and lists the variables, data frames and other objects that are in memory. There's also a history tab within this panel, which tracks the commands that have been executed. The bottom right panel is the files, plots, packages, help and viewer panel, and this is a way for you to navigate through your project's files.
You can visualise plots that are generated here. You can manage R packages here, so you can install and load libraries. You can access the R documentation and you can even view web content internally.
For the bulk of this webinar we'll be working through four main R Markdown files. I'm just going to head over to the GitHub page, which looks something like this. This is our crime data in R repository on GitHub. GitHub is a platform for hosting and sharing code, built on the Git version control system, which allows multiple people to work on the same code base simultaneously without overwriting each other's changes. We're going to be using content from the February 2024 folder. Within this folder we have our code, which is all stored here. We'll be working through the .Rmd files. I've also included an interactive Binder link, which was set up through Python. This launches the four files in your browser and allows you to run the code without having to install everything on your laptop. Bear in mind that the Binder link is very, very slow, partly because this repository is quite large and partly because Binder is just quite slow, so if you are trying to work through the code in Binder it might take a little bit longer than running it yourself.
There are multiple ways for you to obtain this code. If you head back to the main repository you'll see a big green Code button, and this allows you to copy the HTTPS link and clone the repository onto your own computer. But before going through that, you could also just download the zip file.
This downloads the whole repository straight onto your computer, within your documents, and it will also allow you to work through the code at the same time as me. What this means, though, is that if I were to make changes to any part of this repository, you would have to delete the folder on your computer and then re-download the whole repository again. However, if you copy this URL and create a project within R, any time changes are made you can pull those changes directly onto your computer.
So, in order to clone a repository into R, I will just talk through this briefly and then we can jump straight back into the code demonstration. Go ahead and copy this URL. You'll see a green tick, which means it's been copied successfully, and then head back over to RStudio. If you open up RStudio you should see four empty panels. Currently I am within the crime R workshop project, and as you can see the content here is the same as what we've seen on GitHub. To clone a repository onto your own computer you can simply click File, New Project. It might take a couple of seconds to load.
There's a question in the Q&A which asks which data we should download. All the data is within the GitHub repository, within the data folder, so there's nothing specific for you to download. All you need to do is read the data sets into R, and I will walk through this as we go.
So yes, once you click New Project you'll see three options to create a project: New Directory, Existing Directory and Version Control. We're going to be using Version Control. A version control system is just a tool that helps to manage changes to documents and track code, and Git, which GitHub is built around, is one example of a version control system.
So go ahead and click Version Control and then click Git, and this gives you a box where you can paste the URL that we just copied from GitHub. You can call the project whatever you'd like, so give it an appropriate name, and then you can choose to store it in any part of your computer: click Browse and store it in a folder that makes sense for you. I would also suggest opening this in a new session, because it means you're not complicating any projects that you already have open. I'm not going to create the project because I've already done that, but once you click Create it might take a couple of seconds to load, just due to the amount of content, and then you should be looking at what I'm looking at here.
So I'm going to start with some prerequisites. This is just a little bit of information about how to set your working directory and how to install and load packages. A working directory in R is a file path on your computer that sets the default location of any files you read into R or save out of R. It's like a little flag somewhere on your computer which is tied to this specific analysis project, and if you ask R to import a data set from a text file, or save a data frame to a text file, it will assume that the file is inside your working directory. Because we have created an R project, your working directory is automatically set and you do not need to do that yourself. But if you have downloaded the code from GitHub using the zip file, then you will need to set your working directory using the setwd() function. You can also find out which working directory you're in by typing getwd() with parentheses. Oh, interesting. I think my laptop's done that thing where it switches your keyboard shortcuts, but that's absolutely fine. So as you can see, this is my working directory. This is where the crime R workshop is stored on my computer.
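A minimal sketch of the two working directory commands just mentioned. The folder used here is a temporary directory standing in for wherever you unzipped the workshop files, so treat the path as a placeholder rather than the real project location.

```r
# Remember where we started so we can switch back afterwards
old_wd <- getwd()

# Stand-in for your unzipped workshop folder (hypothetical location);
# in practice this would be something like the folder you downloaded to
project_dir <- tempdir()

# Point R at that folder, then confirm the change
setwd(project_dir)
new_wd <- getwd()
print(new_wd)

# Restore the original working directory
setwd(old_wd)
```

If you opened the cloned repository as an RStudio project, you would normally only need getwd() to confirm where you are.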
And during this talk we will be using multiple packages, as you see here. You will need to install all the listed packages before running any of this code. To run a chunk of code, which is just this greyed-out bit of code that we see here, you can click this little green arrow, which runs the current chunk. So go ahead and run this if you don't have these packages installed. If you do, you don't need to: you only need to install a package once within R, and then you can simply load the packages with the library function, like so. I'll just give this a couple of seconds to run.
Emma, someone just asked if you could share the link to the GitHub again. Would you mind pasting that in the chat? I can't actually see what's being pasted, but thank you.
I'm not sure why it's decided to install my packages again, which is odd. I think I deleted some of the .DS_Store files, which might have confused what's happening within R, but that's absolutely fine. We'll just give it a second, it won't take too long anyway. As you can see, the output from running the code is shown in the console, and it prints a bunch of warning messages in orange. Messages like these, printed while packages install and load, can typically be ignored. And fabulous, it's just loading that last package, which is cartogram. Sweet, that has worked.
Just as a little bit of extra information: in order to clone a repository from GitHub you might be asked to set up a PAT, which is simply a personal access token.
Now typically you won't need to do this, but if you want to work more deeply with GitHub's tools, which might involve things like forking or working more internally within a repository, then you might be asked to provide a PAT. This can be obtained by using a package called usethis and its create_github_token() function. This will take you directly to GitHub and give you quite a long code, which should never be shared with anyone else, and then you can plug that into the gitcreds package by supplying the token when asked for it. It's not entirely necessary for this workshop since we are just cloning.
So I'm going to move on to section one, and I hope everyone's been able to clone the repository all right. I know there were some issues beforehand but I think I've managed to resolve the file path errors. We're going to start by looking at how to read in data, how to run some basic exploratory analysis and how to produce some point maps.
We are first going to read in our crime data set, which was obtained from the open data police website, data.police.uk. This is a site for open recorded police data from England, Wales and Northern Ireland. You can download various different data sets, including street-level crime, outcomes and stop and search data, and you can explore the APIs containing detailed crime data about individual police forces and so on. We are just going to be using the crime data for Surrey from 2021 to 2022, December to December. In order to read in a data set with R we can use a package called readr and its read_csv function. If we go ahead and run this chunk of code, you'll see that an object has now been placed in our environment at the top right. This object has 68,881 observations and 12 variables. So let's explore this data set a little bit further so we know exactly what we're working with.
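As a rough sketch of the read_csv step, the example below writes a tiny made-up CSV to a temporary file and reads it back. The rows, the LSOA codes and the lowercase column names are all assumptions for illustration; the real workshop file lives in the repository's data folder and has 12 columns.

```r
library(readr)

# Made-up rows standing in for the Surrey crime CSV (values are not real data)
toy <- data.frame(
  longitude  = c(-0.75, -0.74, -0.57),
  latitude   = c(51.34, 51.33, 51.24),
  lsoa_code  = c("E01000001", "E01000001", "E01000002"),
  crime_type = c("Drugs", "Anti-social behaviour", "Burglary")
)

# Write the toy data out, then read it in the same way the workshop does
csv_path <- tempfile(fileext = ".csv")
write_csv(toy, csv_path)
crime <- read_csv(csv_path, show_col_types = FALSE)

# Number of observations (rows) and variables (columns)
print(dim(crime))
```

With the real file you would pass its path inside the project instead of csv_path.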
We can use the head function to print the first six rows of data, and this lets us briefly explore a summarized version of our data set. We've got 12 variables. Not all are important to us, but we're going to be looking at the longitude, the latitude, the LSOA code, the LSOA name and of course the crime type. These will be the variables of interest to us. LSOA stands for Lower layer Super Output Area, which is a small-area statistical geography used to define boundaries within an area, and we'll be exploring this a lot today, so don't worry if that doesn't make too much sense yet.
We can also use the unique function, which lists all the different values that appear in a column. So let's have a go at exploring unique. What I've done is called on the crime data set and then specifically called on the crime type variable, which is spelt crime_type, and this lists the 14 different crime types that we can see.
So let's go ahead and produce some fairly basic frequency tables to explore those crime types a little bit more. I'm using functions from base R, so these come from packages that you don't need to install; they are already installed with R itself. Let's use the table function to create a frequency table, as you can see here, for each different type of crime. I wonder if I should zoom in. I'm not sure if that's clear for everyone, I hope so. By default the table is sorted by category, and we can use the order function to order the table by count instead. So let's go ahead and run this chunk. As you can see, we have those 14 crime types listed, and now they are listed in order of their counts as well.
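The base R functions described above can be sketched on a small made-up data frame. The column name crime_type follows the transcript; the rows are invented, and the exact ordering call used in the workshop chunk may differ from the one shown here.

```r
# Invented rows; only the crime_type column matters for this sketch
crime <- data.frame(
  crime_type = c("Drugs", "Anti-social behaviour", "Drugs",
                 "Burglary", "Drugs", "Burglary")
)

# First rows of the data (head shows up to six by default)
print(head(crime))

# Distinct values that appear in the column
print(unique(crime$crime_type))

# Frequency table, sorted alphabetically by category by default
counts <- table(crime$crime_type)
print(counts)

# Reorder the table by count, largest first
counts_sorted <- counts[order(counts, decreasing = TRUE)]
print(counts_sorted)
```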
We can also create very simple plots in RStudio, again using base R. I'm using a function called par, which is seen here, and the function barplot, and I'm assigning this to a variable called y. If we go ahead and run this, it gives us a very simple bar plot of our crime types and their counts. Obviously this isn't a very neat plot, but it does give us a bit of an idea about what type of data we are dealing with. If you attended the session on Tuesday you might remember me talking about the challenges of mapping crime data, and one of the challenges I raised was the issue with violence and sexual offenses being grouped into one category. As you can see, violence and sexual offenses make up the largest category, which is not surprising when it covers two very different definitions. So it's important to think about the implications of using this specific crime type in your analysis and what this might mean for your interpretations.
Currently, the crime data set that we have is not spatial. We need to make it spatial in order to produce maps, and this is where we move on to simple features and projection methods. Simple features are handled in R by a package known as sf, and this allows you to handle and manipulate the unit of analysis, so that's the points, the lines and the polygons. Remember, that's your vector data. Simple features allow you to store spatial objects, and the sf package has been growing for numerous years. It replaced a previous package called sp, and it gives you access to a lot of features and functions for spatial data. For this exercise I'm going to keep it simple and just focus on how to use sf to make spatial data visualizations in combination with ggplot2, which is a very common R package for visualizations.
So, as mentioned in the talk, we spoke about coordinate reference systems and how you need to assign a coordinate reference system to your data set in order to make it spatial, which then allows us to map the data. A coordinate reference system defines a specific map projection as well as transformations between different spatial reference systems, and each of these spatial reference systems has an EPSG code, which I did mention in the talk on Tuesday. There are thousands of coordinate reference systems to use, but the most common you will see are the BNG, which stands for British National Grid, and WGS84. Each projection method has a coordinate reference system attached to it. In our case we will be using the World Geodetic System, and this is because our crime data set includes longitude and latitude variables. Remember, we can have a look at that by just opening the data set. We do have a longitude and a latitude variable, which means that the coordinate reference system we want to use is in fact the World Geodetic System 1984.
So what we need to do is transform this non-spatial object into a spatial object. The first thing we can do is check the coordinate reference system of our original crime data set, and we can use the function st_crs to do this. I've also named the package, the sf package. So let's go ahead and run that and see what happens. In the output we see that there is no coordinate reference system currently attached to this data set, which is not surprising because it isn't spatial, but we can transform it by using a function called st_as_sf, again from the sf package. I'm also assigning this to a new object called sf so as not to overwrite the original data set.
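A small sketch of the CRS check, assuming the sf package is installed. The data frame is invented, and 27700 is the standard EPSG code for the British National Grid, which the transcript does not state explicitly.

```r
library(sf)

# A plain (non-spatial) data frame with invented coordinates
crime <- data.frame(
  longitude = c(-0.75, -0.57),
  latitude  = c(51.34, 51.24)
)

# A plain data frame has no coordinate reference system attached
print(is.na(st_crs(crime)))

# Look up the two systems mentioned above by EPSG code
wgs84 <- st_crs(4326)   # World Geodetic System 1984 (longitude / latitude)
bng   <- st_crs(27700)  # British National Grid (projected, metres)

print(wgs84$epsg)
print(bng$epsg)

# WGS84 is a geographic (long/lat) system; the British National Grid is not
print(st_is_longlat(wgs84))
print(st_is_longlat(bng))
```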
I'm then calling on the crime data set here and supplying the coordinate variables, in order to tell the function which variables should become spatial. So I'm calling on those longitude and latitude variables, and you see this argument crs. This is where you would input your specific EPSG code, and in this case we know that the EPSG code for the World Geodetic System is 4326. Just for reference, R is case sensitive, so if you included this variable with a capital L and ran this chunk, it would say that the columns don't exist. So make sure that your variable names match exactly with the data set. Let's go ahead and run this.
Let's recheck the coordinate reference system by using st_crs, and now we can see that a coordinate reference system has been attached to our data frame. We can see this through the EPSG code at the top, telling us we're using 4326. We can explore the spatial components by using the glimpse function, which allows us to view the new variable that's been added. You might notice that our longitude and latitude variables have now been converted into a new variable called geometry, which holds our point data. You can even see this in the environment panel, where we've moved from 12 to 11 variables.
So let's go ahead and attempt to map some point data. We can do this using ggplot. ggplot has a function called geom_sf, and this allows you to supply a spatial data frame that has a geometry column, which we do have, and it automatically calls on that geometry column to make a plot. So let's see what happens when we plot this data frame. As you can see, I've called on the sf object and not the crime object because, again, we need that geometry column.
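The conversion and first plot described above can be sketched as follows, assuming sf and ggplot2 are installed. The points are invented, and the object is named crime_sf here rather than sf (the name used in the workshop) purely to avoid confusion with the package name.

```r
library(sf)
library(ggplot2)

# Invented points with the lowercase column names the transcript describes
crime <- data.frame(
  longitude  = c(-0.75, -0.74, -0.57),
  latitude   = c(51.34, 51.33, 51.24),
  crime_type = c("Drugs", "Anti-social behaviour", "Burglary")
)

# Turn the data frame into a spatial object; column names are case sensitive
crime_sf <- st_as_sf(crime, coords = c("longitude", "latitude"), crs = 4326)

# The CRS is now attached, and the two coordinate columns have been
# replaced by a single geometry column
print(st_crs(crime_sf)$epsg)
print(names(crime_sf))

# geom_sf finds the geometry column on its own
p <- ggplot() +
  geom_sf(data = crime_sf)
print(p)
```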
So as you can see, we have an arguably messy and inconsistent map, but we can see some sort of shape starting to emerge when plotting this point data. We can make this a little bit clearer by exploring the different crime types more specifically, and we can do this by applying an aesthetic and coloring each crime type differently. If we do this we get a bit of a nicer plot, right? We can start to see some form of distribution of our crime types across the area of Surrey. You can go one step further with ggplot. There are lots of options you can use to manipulate and personalize your plots, and this chunk of code here allows you to include a title, a subtitle and a caption, and it just looks so much nicer this way, doesn't it?
The next step is to plot this point data onto a reference map, and we can do this by using an annotation map tile function alongside ggplot, which simply puts that point data over something like a Google Maps image. Again I'm going to stick to coloring by crime type, so let's see what this might look like. So now we actually have a map. We can see a bit more clearly where this crime data was reported, and this is for the whole of Surrey. Sorry, just a pause because I have lost my notes. Okay, here we go.
We can also subset for just one specific crime type, if that's what you're interested in, because the previous plot is obviously very messy and it's hard to see which crime type is actually happening where. So let's subset for just one crime type, in this case antisocial behavior. I'm again creating a new object using the assignment operator, and I'm calling this antisocial behavior, or ASB. I'm using the subset function from base R, calling on our spatial data frame, which is sf, and then I'm calling on the specific crime type, which in this case is antisocial behavior.
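A hedged sketch of coloring the points by crime type, adding labels, and subsetting to one category. The data, the label text and the object names crime_sf and asb are illustrative, and the category spelling "Anti-social behaviour" is assumed to match the data set exactly.

```r
library(sf)
library(ggplot2)

# Invented points; the crime_type spellings are assumptions
crime <- data.frame(
  longitude  = c(-0.75, -0.74, -0.57, -0.60),
  latitude   = c(51.34, 51.33, 51.24, 51.30),
  crime_type = c("Anti-social behaviour", "Drugs",
                 "Anti-social behaviour", "Burglary")
)
crime_sf <- st_as_sf(crime, coords = c("longitude", "latitude"), crs = 4326)

# Colour each point by its crime type and add a title, subtitle and caption
p <- ggplot() +
  geom_sf(data = crime_sf, aes(colour = crime_type)) +
  labs(title = "Toy crime points",
       subtitle = "Invented data for illustration",
       caption = "Not real police data")
print(p)

# Keep only one crime type; the string must match the data exactly,
# including capital letters
asb <- subset(crime_sf, crime_type == "Anti-social behaviour")
print(nrow(asb))
```

The result of subset is still a spatial data frame, so it can be passed to geom_sf in the same way as the full data set.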
The same goes with it being case sensitive, so make sure that this value matches what we see in the crime data set, which it does. I've also removed a couple of columns by using the select function. So now if we look in our environment we see we have a much smaller object. We can see that we have 1214 individual antisocial behavior crimes that have been reported. So let's go ahead and plot this on our base map as well, to see what this looks like a bit more clearly. As you can see, this is zoomed in to just the area of Surrey, and we can start to see a better picture of the distribution, but arguably it still isn't all that clear. So that is the end of section one. It's very introductory, but it allows us to explore our crime types and how we can make this non-spatial data frame spatial.
I'm going to give everyone a couple of minutes to complete this little activity here, because I do think it's nice to have breaks in between so that you can explore the data sets without me talking over your thoughts. So I'll give you five minutes to run through this activity. I'm simply asking you to repeat what we did but for the crime type drugs. Feel free to explore any other crime type if you want, and I'll come back in three to five minutes and talk through these examples here.
I'm just going to read through the comments in the Q&A, and it seems that a few people are having issues cloning the repository, which is a bit surprising. I'm not sure why the file paths aren't reading. I thought I'd removed the .DS_Store files, which are just files stored on a Mac that help it organize your folder structures. I'm sorry if that's not working. I can't quite address the issue right now, but I'm hoping that you can follow along with this code demonstration.
I'll address the errors towards the end and you should be able to clone this repository by the end of the day. Again, so sorry about that, but hopefully you can still follow along with me, or you can download it as a zip file and run it on your own computer anyway.
Anna has also asked why I removed some columns from the ASB data frame. This was simply to tidy the data frame. I was just showing you an extra command that you could use from the dplyr package. These were three variables that I didn't want in the data set, so that's variable one, which was the long crime ID that we didn't really need, and I believe variables nine and ten were just the location that these appeared in. Well, we can just have a look, can't we? Yeah, I've just cleaned the data set a bit and removed variables that we didn't need.
Sweet. Okay, let's go through this activity since it's a pretty short one. I just can't seem to get rid of this Zoom chat. Okay, there we go. So I first asked you to subset the data for those crimes recorded as drugs. In this first ellipsis here, this is where you would call on your variable, which in this case is called crime type, and then you want to call on the specific crime type that you're interested in, which in this case is drugs. So let's see what happens when we subset this. Fantastic. As you can see, this has effectively subsetted our data frame and selected just the drugs crimes. Now, if you try to type drugs with a lowercase d you're going to get a different result and you're going to have no rows of data, because again R is very case sensitive. I then asked you to store this as a new object, as we did for antisocial behavior. So similarly we apply this code here, crime type with the crime type you're interested in, and I'm using the assignment operator to create a new object called drugs. If you look
in your environment you can see that's now been added to the right. The last step I asked was for you to explore plotting this point data over a base map. A base map and a reference map are synonymous, by the way. So let's go ahead and do that. The function to create a base map is the annotation map tile function, and you then need to call on the data set that you're interested in plotting. In this case we're interested in the new object that we've just created called drugs, so we'll go ahead and do that. This gives us a picture of the distribution of drugs crimes across Surrey. As you can see, there's far less point data than for the antisocial behavior category.
Okay, sweet. We're now going to move on to topic two, so I'll just get rid of this. In this section we will be working with shapefiles, more specifically how to read in a shapefile and join it to our aggregated crime counts, and from there I'm going to briefly introduce classification methods as a way to better visualize crime counts. As always, make sure you load the necessary packages. You don't need to reinstall these every time.
So what is a shapefile? A quick recap for those who weren't there on Tuesday: a shapefile is a geospatial vector data format that is used in GIS software, and it stores information about the geographic location and the attribute information of a specific area. When I say vector data, this is your points, your lines or your polygons. Typically it stores a single feature class, which means it will store a single type and will not mix differing feature types. So you'd have a point shapefile, or a line shapefile, or a polygon shapefile, and typically you won't have a mixture of line and polygon data. These shapefiles are made up of files with different extensions, which we can also explore from our files panel. If you head back over to that February 2024 folder here and click data and click shapefile, this is where the shapefile is stored for you. If
anyone is interested in how I downloaded any of the data sets, by the way, there is a document called downloading data here, which takes you through how I downloaded the shapefiles, the crime data and the census data.
Anyway, back to the shapefile. As you can see, there are four file extensions within our shapefile folder, and these are pretty much usable across multiple GIS applications. That first file, the .shp file, is probably the most important one. It contains the geometry data, so it's the file that contains all the geometry for all the features within that specific area. This shapefile was specifically selected for the area of Surrey Heath, which is an area within Surrey. You then have the .shx file, which is the positional index of the feature geometry, and then you've got the .prj file, which contains the coordinate reference system and the projection itself.
So let's go ahead and explore this shapefile a little bit. In order to read in a shapefile we can use the st_read function, which comes from the sf package. If you go ahead and run this code you can see the output just below. It says that we are reading in a layer called england lsoa 2021 from your data source using the ESRI Shapefile driver. It's a simple feature collection, and it tells us that we have 55 features and four fields. A feature is simply a reference to one LSOA, so it's telling us that we have 55 LSOAs in Surrey Heath. It's also telling us that this is a multipolygon, and a multipolygon is just a collection of polygons. It's telling us that we have more than one polygon in this data set: we have 55. It also tells us the bounding box that the area falls within, and it tells us the projected CRS, which right now is set to the British National Grid, but we will come back to transforming that in a little bit. So the first step is just to explore the shapefile. If you print the first six rows of data, again
by using the head function, you can see that we have five variables. We have the LSOA name, we have the label, which isn't very relevant to us, we have the LSOA code, which is just a reference code for each area, and we have the geometry column, so these are all the coordinates that make up the boundary of one area. So this is simply an empty shapefile. It's just a shapefile that draws all the areas within Surrey Heath, and we can go ahead and plot this empty shapefile to see what we're actually looking at. Again we can use the ggplot function and the geom_sf function to do so, and this is because we have a geometry column within the shapefile as well. geom_sf automatically detects that geometry column and is able to plot the shape. So if we go ahead and run this, you see we have this fairly simple map that shows us all the areas in Surrey Heath. Each area is an individual LSOA, and there are 55 within Surrey Heath.
To make sure that your data frame is in fact spatial, you can use the function class and supply the object that you're interested in, and it will tell you what you're working with. In this case, yes, we do have a simple feature data frame, which is excellent, because this is how we can map our data sets. We can draw attention to that specific geometry column as well by supplying the variable geometry, and as you can see it tells us a little bit more about all the spatial attributes within this shapefile.
So now we have a basic understanding of what a shapefile is and how we can import one into R. The next step is to run some data manipulation and create some data frames that can work with the format of the shapefile. What we want to do is group our crimes by each LSOA, and this is because the individual crime data set that we see here just holds individual crime records, right? These just tell us where a crime
has happened at individual points in time and individual points in space. Our shapefile, however, as you can see here, is made up of areas, so we want to aggregate our crimes to each of these areas. This is because you would expect to see multiple crimes in one LSOA; you're not expecting to see just one crime in one LSOA. We can do this using grouped statistics through the dplyr package. In this little chunk of code here I've created a new object called crimes grouped by LSOA. Typically I wouldn't have such a long name, but I thought it would make things clearer. I'm then calling on the original crime data set, not the spatial one, and grouping it by LSOA code, which is a bit of a mouthful, and then summarizing the count of crimes for each LSOA. If we go ahead and run this code you can see we've got a new object that's been added to our environment, which says we have 744 observations. Now, if you remember, this crime data set is for Surrey but our shapefile is for Surrey Heath, so we're going to have more observations than we have areas in the shapefile at the moment. That's not a worry, because we're going to join these data sets anyway to match the shapefile.
Once you've got your aggregated crime counts, and I can just open this here for you, you'll get a table that looks like this. You'll get the LSOA code and the count of crimes that have happened in each area. If you scroll down and take a look, you see how this number varies, right? And there are 744 LSOAs in Surrey alone. So let's go ahead and join this to our shapefile so that we can then plot those counts over the shapefile. We can do this by using a function called left join, as seen here. The left join function returns all the rows of the table on the left side of the join, along with the matching rows from the table on the right side of the join. So it's matching the LSOAs across these two data sets and then
attaching the crime counts. I'm first calling on an object called surrey_lsoa, again just a new object that I'm creating so we're not overwriting or confusing anything that we've done. I'm using the left_join function and calling on two objects here: our shapefile, which can be seen here, and the crimes grouped by LSOA, and then asking it to match these by the LSOA code. You could also match by the LSOA name, because both variables exist within both data frames, but I'm just using the code because it's a bit simpler. If we go ahead and run that, we've got a new object that's been added to our environment: we now see this surrey_lsoa. Let's explore the data set just a little bit. Here is our merged and aggregated shapefile. We have the name of the area, we have the LSOA name situated here, the LSOA code, and we now have that crime count attached to our shapefile, so we have the amount of crime that's happened within each LSOA in Surrey Heath. You can also see this has been reduced back down to 55 observations, the same as the shapefile. Now we've got that, we can go ahead and start to map this data. Oh, sorry, you can view the geometry as well by using the st_geometry and st_bbox functions, but these just tell you about the specific spatial unit. But yeah, let's go ahead and map this data. Let's start with ggplot, since we've been using that quite a lot. I'm simply going to plot this over a base map. Again I'm calling on the geom_sf function, which is the layer that picks up the geometry column. I'm calling on our surrey_lsoa and I'm filling this by the count, which is that new variable that we can see here in this column: our crime count. I'm also adjusting the transparency, so I have this set at 0.5 for now, and I'm also creating a scale fill gradient, which just improves the
aesthetics of our plot. So let's go ahead and run this. Now we've got a plot that shows the distribution of crime counts in Surrey Heath, plotted using a shapefile, and this provides a much nicer map than what we were doing in section one with just the point data analysis. We can also use a package called tmap. tmap allows you to create thematic maps, that is, maps that show some sort of spatial relationship, and the syntax is very similar to ggplot, which makes it really easy to plot maps in R. You can plot two types of maps with tmap: interactive maps or static maps. You can switch between them by simply changing the mode with tmap_mode. If you wanted a static map, you would set the tmap mode to plot, and it tells you now that the tmap mode has been set to plotting. Let's see what this will look like. We have this fairly simple map that shows us the distribution of crime counts in Surrey Heath. We can see that to the east there seem to be higher counts of crimes, which is very interesting. But what would happen if we changed this plotting mode to view? View is for interactive viewing. You can then just rerun this tm_shape code here, and now we have an interactive map. What this allows us to do is zoom in, and we can also hover over certain LSOAs to see which ones are which. So we can say that this area here, Surrey Heath 005E, would be the LSOA with the highest crime count. Now, although that map is pretty good, it's very simple. How can we move on to better visualising these counts, given that count data does not equally represent the population distribution at hand? tmap allows you to alter the characteristics of these thematic maps by changing the style argument, and the different styles result in different binning
techniques. When mapping quantitative data such as crime counts, typically the variables need to be put into these bins. If I just run this again, you'll see what I mean in this key here: we've got 1 to 20, 21 to 40, 41 to 60, 61 to 80. These are bins that have just been automatically assigned; the package has grouped these categories in a way that it thinks makes sense. But we can manipulate this and change the way that we bin this data to get a more accurate representation. As I said, we can use the style argument to do this within tmap. In the example below I'm going to be using three different classification methods: k-means, Jenks and standard deviation. It's not entirely important to know exactly what these three things are, but I've summarised briefly how these styles will result in your bins being different. So let's move on to the code. In this bit of code I'm creating three separate maps for our three separate classification systems, and I'm assigning these to variables a, b and c. In our first map, a, we are using the same plot that we used above; the only difference is that we've included this argument here, style equals kmeans. This is a method that basically aims to partition the n observations into k clusters, in which each observation belongs to the cluster with the nearest mean. Let's go ahead and run that. You can see a new object has been added to our environment and no map has been plotted, and this is because we have assigned our map to a variable. We're then going to run a Jenks classification map in the same way, using that style argument but calling on jenks instead. Jenks is also known as natural breaks, by the way, and it aims to arrange a set of values into natural clusters. The last method that we're using is the standard deviation classification, and I'm just calling this into object
c. As you can see in our environment, we've got three large tmap objects, a, b and c. We can then plot these by choosing whether we want the plot or the view mode; I'm going to change this back to plot just so we can see clearly. You can also use the tmap_arrange function to plot all maps at the same time, which is why I saved them as separate objects. If we scroll down, we'll see that now we've got three maps with three separate classification systems. You need to pay attention to the differences in the colour schemes across these maps and the effect these might have on the interpretation of your data. If you were to present this to a group of people who were unfamiliar with these classification systems, they might quite easily misinterpret the distribution of crimes in these areas, and we might be either over-predicting or under-predicting what we are actually seeing. So the choice of classification system and how you bin your data is very important, especially in crime analysis, because the main aim of crime analysis is to create prevention strategies for police and government to reduce crime counts. If we are providing resources to areas where crime is actually not as high as we thought, then obviously that's just wasted income and funding, isn't it? You can also plot categorical variables in R using tmap. This is known as small multiples. It's similar to... sorry if you can hear that, I thought I had that muted. It is similar to the facet grid function in ggplot. Right, I think that's muted; sorry if you could hear my notifications going off. Where was I? I'm just going to define a categorical variable here. In this example I'm going to separate the map into multiple components by LSOA. Now, this isn't the best example of a categorical variable, but if we just open up our
surrey_lsoa, we'll see this variable called... what was it called, sorry... yeah, this variable called name here. This is the categorical variable I'm going to use for this example, but obviously it isn't a very clear example. A better example would be having something like an urban or rural distinction and then using your facet functions to plot this across your urban and rural areas. Let's go ahead and have a look at what a facet map might look like. It gives you these individual maps, but obviously this example's not great because we are just using the label that was identified in the crime data set. You can also add additional features to your maps. You can change the style by using the tmap_style function; this one is a colour-blind option, which is great for accessibility. You can also add legends by using the tm_layout function. You can add plots within your maps, you can change the colours, you can change the heights of all of these, you can change the position of where you want your legends to be, and you can also add things like compasses, scale bars and grids to get something that looks like this. So you can be very flexible with how you present your data using tmap. I've been talking for a while, so I'm going to give you five minutes to work through activity two, and then I will talk through the answers and move on to section three. In this activity I'm going to give you the opportunity to explore other classification methods, specifically looking at bclust and hclust. bclust is bagged clustering and hclust is hierarchical clustering. Again, it's not incredibly important to know the distinction between those two now, but just have a go at running through this activity on your own and filling in those blank ellipses, and then we'll work through this together. So yeah, I'll come back
in three to five minutes. Okay, it's been a couple of minutes, which should be enough time to fill in this activity. It's really just a repeat of what we've gone through but using different classification systems. Let's start by assigning our bclust and hclust classification maps to separate objects; I've asked you to call these h and b and then plot them together using tmap_arrange. The first step is to create that object h. We then need to supply the data set that we're using, in this case surrey_lsoa. We then need to fill this by the variable that we're interested in plotting, which for us is the crime count, and then we want to supply the specific style. If you had a go at looking at the help page, you would know that this is simply bclust. Let's go ahead and run that, and we do the exact same thing for... oh, I've done h and b the wrong way round; let's change that to hclust, sorry. So the first one is hclust, and the second one will be our bagged clustering, so we'll call this object b. Again you call on surrey_lsoa, you then specify which variable you're interested in plotting, and you set your classification system just here, so this will be bclust. I then asked you to use tmap_arrange to plot these together. This is really simple: you just type in the two maps that you want to plot, or all the maps you want to plot. If we scroll down, we see that we've now got our two maps with the bagged and hierarchical clustering. There are many different ways to classify your data, and you must think carefully about the choice you make, as it may affect the conclusions your readers draw. That was just a quick introduction to classification systems, so now we can move on to section three. Just checking the time, we're about an hour in, which is spot on. So let's head to section three of our code, which I shall just open. So in this section
we are going to incorporate some census data onto our shapefiles so that we can calculate and explore the differences between crime rate and crime count. We then briefly introduce cartograms as an alternative to thematic maps, but depending on the time I might not be able to run that code. As always, you need to load those packages. If your project is still open you don't need to load the packages every time, but if you've reopened it then you're going to have to use the library function to reload your packages. So what exactly is crime rate? Well, we know that count data does not account for population density. So while the code in sections one and two helps us identify interesting patterns in point-level crime data, it doesn't give us much room for a detailed analysis. Crime rate is best understood as crimes per 1,000 resident people, as per the latest official census, over a selected time period. Using this rate reduces statistical bias and reduces the effect of the modifiable areal unit problem, which was addressed in session one. Accounting for these findings is an enormously important task, because if we can understand the causal processes that underlie this variation, we may be in a position to enact policy change that can bring about changes in the volume of crime in society at any given point in time. So I went ahead and downloaded some census data, from 2022 I believe, and I used the UK Data Service CKAN website to do so. I've given steps in the downloading-the-data document about how I did this, because it is quite a tedious website with a lot of clicking. But no worries about having to do that yourself: this data does exist within the repository. So let's go ahead and read in this census data. I have specifically downloaded two types of census data, known as the res count and the work count. These are reflective of the
residential population and the workday population. The residential population reflects the usual activity of an area, whereas the workday population reflects the people who work there, plus those who are resident in the area and either work from home or do not work. These workday and residential population statistics are arguably limited because they don't include activities other than employment, but for the sake of this analysis I thought it would be quite interesting to compare crime rate over employment population. When conducting your own analysis, it might be more accurate to use something like the total LSOA population, because the workday and the residential counts only reflect a partial population, as these then tend to represent a more regular spatial grid. So let's go ahead and read in our residential population; we're just going to look at that one for now. I've called this residential count. I'm reading the file in and cleaning the names. Did I mention this before? I've done this for every data set, but cleaning the names basically just means that all variable names will be lowercased and joined by underscores, which makes data manipulation so much easier. I'm then renaming some of the variables within this data set because they were quite messy: they had this variable to represent the LSOA code, this variable to represent the LSOA name, and this variable, for some reason, to represent the count. So I just changed these so the variables are more consistent with what we know and what we're working with. Let's go ahead and read that in and view those first couple of rows of data. Again we have the LSOA code (quite a mouthful) and the name, and we have these categorical attribute variables, which aren't really of interest right now. I'm going to treat all of these categories as one, which is, I know, questionable. We then have the res count, which indicates the residential population count
in each LSOA. Our next step is then to join this to our new shapefile again: we want to join this onto a shapefile that includes those grouped crime statistics. We can do this by using the left_join function from the dplyr package to attach them by the same LSOA code. In this instance I'm assigning back to surrey_lsoa, because I don't want to create a new object; I just want to add those new residential census variables onto what we have already. I'm calling on surrey_lsoa here, calling on the residential count, and joining by the LSOA code in each data set. You'll notice that these are named differently, but they both still refer to the code of the LSOA. If we run that and explore this, you can see that there have been new variables added. In fact, I'll just open this up in the panel so it's a bit clearer. If we scroll along, we can see that our res count has now been added to our data set along with our crime counts, and this means that we will be able to calculate the crime rate. So how do you calculate crime rate? Well, a crime rate is simply the number of reported crimes divided by the total population, then multiplied by 100,000. For our data set, we can take the count variable, divide this by the population variable, which is the workday or the residential count, and then times by 1,000. We're only multiplying by 1,000 in this case because that is the average population of an LSOA. If you are using a larger unit of analysis you would typically multiply by 100,000, but LSOAs are very small-area statistics; I mean, in Surrey alone there were 744 LSOAs, and that's just in Surrey. So the maths to do this would look something like this: your count divided by the population, times 1,000. We can do this really simply in R using dplyr. I'm calling on our surrey_lsoa object and I'm assigning
this back into surrey_lsoa so that we can create a new variable called crime rate using the mutate function. This crime rate variable is made by this calculation here: we have the count variable, that's our crime count, divided by the residential population count and multiplied by 1,000. If we then view this data set, you can see this new variable here, crime rate, which we just created using mutate, has now been added. So we now have our crime rate as opposed to just our crime count, and we can view that fully here. Let's go ahead and plot that crime rate so we can see the differences as opposed to using crime count. Again I'm using ggplot; I won't over-explain what this plot does here. The only difference is that we are filling by crime rate: we've just replaced the count variable with the crime rate. This now gives us a more defined and accurate representation of crime relative to population. Obviously we can do the same with tmap, so you can explore this in your own time, but you can see that the interpretation may be very different when looking at crime rate as opposed to crime count. The max crime rate we have here is 935, which seems arguably very high, but this is much different to the interpretation that we would have with crime count. I have included this section on cartograms, which I wanted to get through, but unfortunately running a cartogram can take a while and I don't want to sit here for 15 minutes. So I've just commented out this code so that you can run it in your own time: run it when you want to go and make lunch or get a coffee, and you can see the differences. A cartogram is basically a type of map where different geographic areas are modified based on variables associated with those areas. There are two types of cartograms, contiguous and non-contiguous, i.e. if they share
a common border or not. In short, it's just a map in which the geometry of the region is distorted in order to convey the information of an alternative variable, and each region's area will be inflated or deflated according to its numeric value. So feel free to run this code in your own time, and we will just skip to the activity. Section four, I believe, is the longest section and I want to spend some time on that. That draws section three to a conclusion, which showed us how we can calculate crime rate and then how we can plot these crime rates. For activity three, all I'm asking you to do is to now plot the crime rate but using the workday population statistics. The first step is to read and clean the data, the second step is to join this to our shapefile again, and then you can go ahead and follow these steps here. Make sure you run these two chunks here to read in the workday population statistics. I've called the new variable work count as opposed to res count so we're very clear about what we're dealing with. If you open up the workday count, seen here, we get a very similar data set to the residential one. We can then join this workday count onto our surrey_lsoa again. Give that a minute to load, and now you can see (I'll just open this so it's a little bit clearer) that we've got a res count and a work count column added to our shapefile. This is a very rich data set where we've got the crime counts, the residential population counts and the workday population counts. You could go ahead and add your total LSOA population count and so on to create further crime rates. Once you have that, you can just follow these steps, and in three to five minutes I will talk through the activity and demonstrate the answers. So yeah, I'll be back in three to five minutes. Okay, that should hopefully have been enough time for everyone to just walk through this
activity, so let's go ahead and follow the steps. The first step was to calculate the crime rate and assign this to a new variable called crime rate two. I have supplied that for you here, as you can see. We then want to calculate the crime rate but for the workday statistics, so this variable is called work_count, I believe, and then we want to times this by 1,000. I was just checking that the name of the variable was work count, and it is. We then want to plot this using ggplot. We can do this by simply assigning the data, which is surrey_lsoa, the big data set that we've been using, and we want to fill this by the crime rate two variable. Let's go ahead and plot that and see what we have. Oh, I'm not sure why that's taking time to load. There we go. I believe that's worked, but I'm not sure why it looks so funny. Let me just check my crime rate two variable. Oh, interesting: it's because these have been set to zero, haven't they? That's okay, no worries. Then we want to plot this using tmap. We're going to call on our crime rate two variable again and go ahead and plot that. This is very odd; sorry, give me one second while I adjust that. Crime rate two: I've not spelt that wrong or anything, have I? Sorry about this. That's not happened before, or did it happen in the practice? I'm just seeing if we've actually done something wrong. That's a shame, because it means that we won't be able to compare the two. Oh, I know exactly what I've done wrong: I haven't put in the brackets for the variable, and that's exactly why we're struggling to plot. Let's see if it notices that now. There we go. But because we have these weird missing values in that column, it's not actually plotting right now. That's not happened before, so I don't quite know why it's done that. We might even just try to... I will
address (just looking at the time) I will address this error towards the end of the workshop and push the changes then. We'll just move on to the last topic, because this was just to show the differences between what a residential or workday population statistic might look like. So let's move on to our last section, which is section four. In this section we are going to be covering spatial interpolation, which is the process of using points with known values to estimate values at other unknown points, and we will then move on to creating heat maps in both ggplot and leaflet. Spatial interpolation, as a quick recap, is a statistical technique used to predict or estimate values at unsampled locations within an area covered by existing observations. It's based on the principle that spatially or temporally close points tend to have similar values, known as spatial autocorrelation. In the context of crime data, spatial interpolation can be useful for many reasons, but the most common are to fill gaps, that is, to estimate crime rates in areas where there is missing data, which is what we'll be doing; to smooth crime rates, that is, to generate continuous surfaces of crime rates; and you can also use it for resource allocation and even public information. Obviously the first step is to read in that crime data set. I'm not going to do that again, but this is just a reminder of what it looks like. As we know, there are 6,881 observations, but this is for the whole area of Surrey. We discussed that there are 55 LSOAs in Surrey Heath alone, so we can use the filter function in dplyr to select crimes that have happened in just Surrey Heath. Let's go ahead and do that. I'm using the str_detect function here, which is a function from the stringr package used to specify the condition for filtering. It checks if a string, in this case the
contents of the LSOA name column in each row, matches a specified pattern. So let's go ahead and run that. Oh, I haven't loaded the package, sorry: you want to load your stringr package with library first, and then you want to assign this to a new object called Surrey Heath crimes. These are all the crimes that have taken place in Surrey Heath, and if we look to our right we can see that there are now 468 observations. Remember that these are the original crime records across the boundary area of Surrey Heath, but what we need to do, similar to the previous task, is to group these crimes by LSOA in order to obtain the aggregated statistics. I've done this in a new object called cleaned crimes grouped by LSOA. I'm calling on the Surrey Heath crimes (sorry, not the grouped data set, the original data set), grouping these by LSOA and using summarise to count the number of crimes. Let's go ahead and run that, and now we have a data frame with two variables that gives our crime counts per LSOA just in Surrey Heath. If you look at your environment panel, you might notice that we actually have 54 observations instead of 55. We know that the Surrey Heath shapefile contains 55 different areas, but this cleaned, grouped data set for Surrey Heath only contains 54. We can run a spatial interpolation to account for this by estimating the missing crime count based on the average of the neighbouring LSOAs. First let's identify the missing LSOA in the crime data set. You can do this in two ways. You can do this manually by simply opening up your data set and having a look for which values are NA, or we can use the setdiff function. This is a function used to find elements in one vector that are not present in another vector, basically calculating the difference between two sets. It is part of base R.
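The filter, group and setdiff steps described so far can be sketched on a toy data set as follows. This is only a minimal illustration, not the workshop's own code: the data values, the placeholder LSOA codes (X001 and so on) and the object names crimes, shapefile_codes, surrey_heath_crimes, crimes_by_lsoa and missing_lsoa are all made up for the example, and the shapefile is stood in for by a plain vector of codes.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the crime data: one row per recorded crime (values are made up)
crimes <- data.frame(
  lsoa_code = c("X001", "X001", "X002", "X003", "Y001"),
  lsoa_name = c("Surrey Heath 001A", "Surrey Heath 001A", "Surrey Heath 001B",
                "Surrey Heath 001C", "Woking 001A")
)

# Toy stand-in for the LSOA code column of the shapefile (hypothetical codes)
shapefile_codes <- c("X001", "X002", "X003", "X004")

# Keep only crimes whose LSOA name contains the pattern "Surrey Heath"
surrey_heath_crimes <- crimes %>%
  filter(str_detect(lsoa_name, "Surrey Heath"))

# Aggregate to one row per LSOA with a count of crimes
crimes_by_lsoa <- surrey_heath_crimes %>%
  group_by(lsoa_code) %>%
  summarise(count = n())

# Codes that are in the shapefile but never appear in the grouped crime data
missing_lsoa <- setdiff(shapefile_codes, crimes_by_lsoa$lsoa_code)
print(missing_lsoa)
```

The order of the arguments to setdiff matters: putting the shapefile codes first asks which areas have no crime records, whereas swapping the arguments would ask which crime records fall outside the shapefile.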
It is also commonly used for identifying missing or additional elements when comparing either two data sets or two vectors. So let's have a go at finding that missing LSOA. I've called on the shapefile and its LSOA code column, and then I've called on the cleaned crimes grouped by LSOA and its code column as well. If we look at the output, we can see that the missing LSOA is this one, E01030808. Obviously this code doesn't mean much to anyone on its own, but it has identified the missing LSOA as this area. Now, in order to run the spatial interpolation on the average of the neighbouring LSOAs, spatial visualisation is actually required in order to understand whether this LSOA does in fact have any neighbouring LSOAs. This is because if this area doesn't have any neighbouring areas, we will not be able to establish a mean and therefore estimate a crime count value. So before running the spatial interpolation, we can establish a list to see if this area is neighbouring any other areas, so that we can get the mean from those neighbouring areas. We can do this using a function called st_touches. This is a function from the sf package that identifies all the touching entities within the spatial data. When used with the same data set as both arguments (in this case I'm using the shapefile twice), it checks each LSOA in the data set against every other LSOA to see if they share a common boundary. This allows us to view which LSOAs in this shapefile are in fact neighbouring and whether any aren't. Let's go ahead and run that. You can see that this has created a list object in our environment, as you can see here. This result is basically just a list where each element corresponds to an LSOA in the shapefile, and each element of the list contains the indices of LSOAs that touch, i.e.
are neighbours of, the LSOA at that element's position in the list. Quite complicated, I know, but this then allows us to identify the LSOAs with no neighbours. We can use the sapply function from base R on the neighbours list: it checks whether the length of each element, that is, the list of neighbour indices, is zero, which would indicate that an LSOA has no neighbours. Let's go ahead and run this one. This has created a new object, as you can see, called no neighbours, and it is a logical vector for areas 1 to 55 of our shapefile. A logical value is just true or false, and it tells us for which LSOAs it is true that they have no neighbours and for which it is false. We can then extract the indices of the LSOAs with no neighbours and then extract the corresponding LSOA codes. We're going to assign this to a new object called LSOA no neighbours, and we're using the which function to do this. This is just used to find the indices of the true values in the no neighbours vector, and these indices correspond to LSOAs in the shapefile that have no neighbours. Let's go ahead and run that. The last step is to extract the corresponding LSOAs into a new object, which I'm calling LSOAs with no neighbours. This line uses the indices to subset the LSOA code column of your shapefile, which contains the codes of the LSOAs (that's just this bit here), and this gives us a vector of LSOA codes for those LSOAs identified as having no neighbours. Let's see what happens when we run this code. You can see that it comes back with nothing; if we look in our values it's the same, and we have an empty integer. If this object is empty, this means that all your LSOAs have at least one neighbour. If this object contained an LSOA code or an ID, then it would tell you that there are LSOAs with no
neighbouring LSOAs. What this means for us is that the missing area that we found does in fact have neighbouring areas, and therefore we can run a calculation to interpolate the missing value. For each missing area we need to find the neighbouring areas and calculate the average crime count. I've created this loop in R that allows us to do this. I've tried to summarise what this loop is doing, because it is quite long, but basically for each missing LSOA it identifies its neighbours, gathers the crime counts from these neighbouring areas, calculates the average crime count and rounds it, and finally adds it back into our original data set. So this is the for loop that we've created here. I'm just wondering if I should break down this chunk of code a little bit more, because it is quite involved; maybe I'll point out some important features. As you can see, this full bit of code is the loop that we use to calculate the average crime count from the neighbouring areas to then interpolate that missing value. If I go ahead and run that, you can see it has nicely printed out the average crime count for that missing area, and it has given it a value of 5.625. You can use this code in your own work as well. The only thing that you would need to do is replace the data set that you're calling on, which here is the cleaned crimes grouped by LSOA, with the data set of interest. Okay, so that's effectively replaced that missing value that we had. If we look back at our objects, you can see that the cleaned crime data set has now increased to 55 observations rather than 54. If we open this up and we
have a look for that missing area, it's now been added here. We have a value of 6, because the loop has rounded the average up and imputed it for that missing area. So let's compare the previous Surrey LSOA object, which is the data set with the missing area, and a new object without the missing area, just to see how these might be different. In the same way we created Surrey LSOA by joining the shapefile to the crimes grouped by LSOA, I'm going to create a new object called Surrey LSOA new by joining the shapefile to the cleaned crimes grouped by LSOA. Let's just run that second line, because we don't need to rerun Surrey LSOA, so you can comment that out if you want. If you look into our objects, you can see we've got this new object here, with 55 observations and six variables. I think I've just figured out why the tmap package wasn't working, and that's because I think I've overwritten the Surrey LSOA object, so in fact I might just rerun that. There we go, that's brought us back down to our original shapefile. That's fine. So we can compare the differences between the cleaned and the non-clean data sets both statistically and visually. We can do this statistically by just comparing the mean and the median. If we run these two here, you can see that in our cleaned data set we have a mean value of 8.61, but in the data set that contains the missing value this average increases to 8.67, which arguably isn't a huge difference at all, but you can imagine how this might change if you had, you know, 10 LSOAs with missing values that all needed imputing. So yeah, it's really important to think about the effect that interpolating these values would have on your data set. We can also visually compare these, so we can create some basic ggplots like we've done before. In this instance, the only difference in this ggplot is 
that I'm calling on scale_fill_viridis, this option here in lines 231 and 238. It's just a colour scale for the fill aesthetic, and it's known for its perceptually uniform properties, i.e. making accurate representations of the gradients. We can set the heat maps to be either plasma or magma, and these are just different colour palettes. This is a very simple heat map that you can make with ggplot. So let's just have a quick look at what these would look like with our clean data set and our non-clean data set. Let's have a look at the data set with the missing value. There we go. You can see this is the area that had the missing value, and you might have spotted this in sessions one and two. And then this is the data set without the missing LSOA, and you can see the slight differences in the colours of the maps and in the interpretations that we see. So you just need to think about that interpretation: would you want to include a missing area in your analysis, or would you want to interpolate it? I guess the answer depends on what the next stages of your analysis would be. So now we're going to move on to creating some more specific heat maps. I'm going to make a heat map specifically of antisocial behaviour. There is a little bit of data preparation needed, and these steps are just repeated from the steps above. They can be summarised as such. The first step is to filter for ASB in the crime data frame. We've run this before, but I'm going to do this again: I'm calling on the data set, filtering for just the antisocial behaviour and filtering for just what happens in Surrey Heath. Your next step is to group these and summarise the counts, so I'm calling this ASB grouped by LSOA, and you can go ahead and run that. We then want to find the missing LSOAs in this data set, and we can go ahead and print that. And your last step is just 
to impute these missing values. You can use the same for loop to do so, but the only thing, again, that you need to change is this variable here, so I've replaced it with the ASB grouped by LSOA. Let's go ahead and run that loop. See how that took a little bit longer? That's just because we had more LSOAs with missing antisocial behaviour crime counts. Then we can go ahead and join this back onto the clean data set so that we can plot it. So now we have a clean data set of our antisocial behaviour crime counts across the area of Surrey Heath. I'm going to use the magma option here under scale_fill_viridis to set our heat map colours, and once we run it we'll get a map that looks something like this. As you can see, we have a very high incidence of antisocial behaviour in the east of Surrey Heath. We can also use leaflet to create our heat maps, and this code snippet basically just provides an interactive map. We can go ahead and run the colour palette, and this chunk of code here lets us make an interactive map. Just excuse me. So let's go ahead and run this leaflet map here. You might have noticed a warning that says the sf layer is not longitude/latitude, and this is because we haven't actually transformed this new data set yet; we haven't assigned it to the correct coordinate reference system. We can simply do this. The way you know your map hasn't worked is if you get a world map that looks like this, with an empty image. So what you first want to do is transform that data set, and we can do this using the st_transform function and calling on the correct EPSG code. If you run that and rerun the leaflet, you'll see now, hopefully, there we go, that we can plot the ASB crime counts and explore this a bit more visually. I'm just looking at the time, it's currently quarter to. I might just go over this activity. Let me see what time it is. Yeah, so in this activity I just 
wanted you to have a go at exploring the ASB data but with the missing values, again just as a way to see how the interpretations of your data can be significantly affected by things like this. I might just work through this activity, or maybe I'll give you two or three minutes just to follow these steps. They're very similar to what we've done in the previous steps; you're just doing everything but the spatial interpolation here. So I'll pause for two minutes and then come back to the activity. I believe it's been a few minutes, so I'll just slowly start walking through this activity, and then we can finish with five minutes on the Q&A. The first step is just to filter for your antisocial behaviour, so this just recalls the specific crime type, and we also want to detect only those areas in Surrey Heath, so we call on the LSOA name. I've assigned this to a new object called ASB2, just to avoid confusion. Again, we're going to group and summarise. I'm not going to talk about what that does, as I've done that about five times, but we group by the LSOA code and then we summarise the count. Let's go ahead and run that. We then want to left join this, which joins it back up to our shapefile, so that's shape_file and also the ASB grouped by LSOA 2 object in this instance, because it's the unclean data set, as in the data set with the missing values. And now we can go ahead and plot these in ggplot and in leaflet. Very simply, you input that new object that we just created, which is called ASB LSOA missing, and we want to fill this with the crime count. I'm not sure why we have an "lt" there, that's not meant to be there; it's meant to say fill equals count, bracket. That should be correct, so let's go ahead and run that. So yeah, we have this really empty map 
right now that shows various missing areas, with missing ASB counts in those LSOAs. We can also do the same with leaflet. We first want to define a colour palette, and then remember you need to transform your data set, otherwise the coordinate reference system that leaflet uses won't be able to understand what you're trying to plot. We can transform this with the st_transform function, and then you want to call on your EPSG code for the World Geodetic System, which is the one we want. Then let's go ahead and plot this in leaflet, just so we can have a look at those trends a bit more interactively. And there we go, we have this really mismatched map of our ASB with all those missing areas, and, you know, if you were to present this, it doesn't really effectively highlight the actual distribution of crime counts in these areas. And that draws us to the end of this workshop. I've included a syntax document. I'm just going to clean this up a little bit, sorry, let's get rid of this. I've included a code syntax file here, which just has all the code in it without all the writing, in case you just want to run all of that through. There's also an additional topics section here. This has some information about setting up a Google API to run some Google Maps in R, there's also some stuff about binning data, and there's some stuff about interactive maps and functions like jittering, which are used to protect the privacy of victims. So yeah, that draws this workshop to a conclusion. Thank you all for your time. I hope I've been able to provide you with some useful information about mapping crime data in R.
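For anyone following along afterwards, the neighbour-averaging idea from this session (find each missing area's neighbours, average their crime counts, round, and add the value back) can be sketched roughly as below. This is only a minimal base R sketch and not the loop from the workshop files: the area codes, the counts and the hand-written `neighbours` list are all made up for illustration, and with real LSOA boundaries the neighbours would more likely come from a spatial function such as `sf::st_touches()`.

```r
# Toy crime counts for four areas. Area "E" is deliberately absent,
# standing in for an LSOA with no recorded crimes (hypothetical data).
crime_counts <- data.frame(
  lsoa_code = c("A", "B", "C", "D"),
  count     = c(4, 5, 6, 8),
  stringsAsFactors = FALSE
)

# Hypothetical adjacency list: for each area, the areas that border it.
neighbours <- list(
  A = c("B", "E"),
  B = c("A", "C", "E"),
  C = c("B", "D", "E"),
  D = c("C", "E"),
  E = c("A", "B", "C", "D")
)

# Areas that exist in the boundary data but have no row in the crime counts.
missing_codes <- setdiff(names(neighbours), crime_counts$lsoa_code)

# Start from the observed counts and append one imputed row per missing area.
imputed <- crime_counts
for (code in missing_codes) {
  # Counts are taken from the observed data only, so one imputed value
  # never feeds into another.
  is_neighbour <- crime_counts$lsoa_code %in% neighbours[[code]]
  nb_counts <- crime_counts$count[is_neighbour]

  # An area with no observed neighbours cannot be imputed this way.
  if (length(nb_counts) == 0) next

  imputed <- rbind(
    imputed,
    data.frame(
      lsoa_code = code,
      count = round(mean(nb_counts)),
      stringsAsFactors = FALSE
    )
  )
}

print(imputed)
```

In this toy example the four neighbours of area "E" average 5.75, so the imputed count is 6. To reuse the pattern, you would swap `crime_counts` for your own grouped data set, in the same way the loop in the session was pointed at a different object for the ASB data.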