a tiny bit on my end in case you can hear some baby noise. Excellent. Good afternoon, everyone. Thanks for joining us this afternoon. I'm Dermot McDonnell, I'm a former associate at the UK Data Service and I'm here today to take a look at some simple social network analysis examples, mainly focusing on how we can use Python to achieve that task. So today is not a complete introduction to Python. It's not for people who have never, ever used Python before, but you don't need to be good at it to learn something today. I don't cover the complete basics of social network analysis either, but I do cover some of the essential concepts. So if you're thinking about learning Python from scratch, then we've got some different materials and there's some excellent open source materials also. And if you're interested in social network analysis, the essential concepts and kind of theories and how those are linked, we've just recently concluded a webinar series with myself and a colleague, Julia. And so you can consult that as well. So today is for people who maybe know a tiny little bit of Python and maybe know a tiny little bit about social network analysis. And I'm gonna show you how you can combine the two together. So this is a coding demonstration. So not only can you follow what I'm gonna be doing, but you can also execute the code yourself as we go through the exercise today. So a colleague, Julia Kazmier, will have posted the link to the Jupyter notebooks that we're using today. A Jupyter notebook is simply an electronic notebook, similar to your paper versions, in which you can write programming code. So today we're using Python. If you were an R user, for example, you could use Jupyter notebooks for that. C sharp, Julia, lots of different programming languages can be written using a Jupyter notebook. So if you want, you can just purely follow what I'm doing for about the next 30 minutes. 
And then we'll have some questions at the end. Or, if you possibly can, you can execute the code in real time with myself also. So today we're gonna look at the first webinar and some of the example code we wrote for that. So we've got a Jupyter notebook here, the fundamentals of social network analysis. And the way we do it is instead of trying to run things on our laptops and having to install Python and install packages, and that's kind of quite a finicky, technical aspect, instead we can all do this online. So you can click the link that Julia posted in the YouTube chat. It's the same link as I'm gonna select just now. So basically this launches our Jupyter notebook to the cloud, and that allows us to run all the code without having to install anything on our own machines. And not only is this quite a useful teaching tool for us here at the data service, but it's also quite useful for you as an analyst as well. It allows you to share your code with others and allow others to make changes to your code without overwriting what you've done yourself. So it's quite a good way of doing things. So basically once Binder has launched, so that's kind of the cloud service we're using, you can see a folder like this. So now basically those Jupyter notebooks I showed you before are now interactive. So if I click on the fundamentals Jupyter notebook, then that launches basically the programming code we're going to use today, excellent. So this contains a lot more material than we're gonna go through today. So I used some of this in the webinar series from about two months ago. So we're just gonna focus on one aspect today. So basically a Jupyter notebook is like any kind of webpage: you can scroll down, you can work through it that way. Basically Jupyter notebooks consist of cells. So this logo here is in its own separate cell. Think of a cell like a paragraph or a section, it just encloses text or it encloses images or it encloses code. 
So what I'm gonna do is I'm gonna go to section six of this Jupyter notebook, which contains a very simple and real example of some social network analysis that I conducted myself in my own research field. So I'm just gonna head down to this section of the notebook here. So basically what I was interested in is UK charities and I'm interested in to what degree they're connected. So there's a long running organizational sociology concept known as board interlock. So that's the degree to which boards of companies or charities or public sector bodies are connected. So in this example here, I just defined a very, excuse me, kind of rudimentary pilot study of board interlock in the UK charity sector. So I'm interested in to what extent do we see board interlock among, I say UK charities, but we're talking about England and Wales and actually specifically in this example, Manchester. So basically two charities are connected if they have the same trustee. So if I'm on the board of one charity and I'm on the board of another, then I can say those two charities are connected through me. So that's the network kind of phenomenon we're interested in, how are charities as organizations connected through individuals. So to use some of the social networking terms, the things that we're interested in studying, you know, the entities that are connected are known as nodes in social network analysis. There's lots of different terms, depending if you're coming from a mathematical perspective or a sociological perspective, we're gonna use the term node. So a node in our example is a charity. And then we're interested in what we're gonna call connections, but the technical term we're gonna use today is ties. So how are two charities connected? If they are, we can say that they share a tie between them. So ties are simply a connection between nodes. 
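To make the node and tie idea concrete, here is a minimal sketch in plain Python. The trustee and charity names are invented for illustration and are not from the Manchester data set; the only rule it encodes is the one just described, that two charities share a tie if they have a trustee in common.

```python
from itertools import combinations

# Hypothetical data: each trustee mapped to the charities whose boards they sit on.
trustee_boards = {
    "Trustee 1": ["Charity A", "Charity B", "Charity C"],
    "Trustee 2": ["Charity C", "Charity D"],
    "Trustee 3": ["Charity E"],
}

# Nodes are the charities themselves.
nodes = sorted({charity for boards in trustee_boards.values() for charity in boards})

# A tie links every pair of charities that appear on the same trustee's list.
ties = set()
for boards in trustee_boards.values():
    for first, second in combinations(sorted(boards), 2):
        ties.add((first, second))

print(nodes)
print(sorted(ties))
```

Trustee 3 sits on only one board, so Charity E is a node with no ties in this toy example.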
So what I've done is I've taken a publicly available data set, it's an open data set of charities who are at least headquartered in the city of Manchester in England. That doesn't necessarily mean that they only operate in Manchester. We can think of international charities like Oxfam, which will have a headquarters in London, but will operate across 80, 90, possibly more, 100 and something countries around the world. So that's just to give us our kind of framework for what we're gonna do today. Hopefully it's a semi-interesting topic for us to look at. I know I'm certainly interested in it. So the way we use Python to conduct some social network analysis kind of takes the form of a couple of steps. And the first step is known as the kind of preliminary steps. So that's really before we start importing data, cleaning data, doing analysis, et cetera. We essentially have to tell Python, okay, hey, these are the methods and the functions and the processes we need today to conduct our analysis. So I'll make this a tiny bit bigger, excellent. So what we do here is we have a code cell. So basically this is a type of cell in a Jupyter notebook. In contrast, the one above just here, we can see that this is a text cell. We just write plain text as if it was a Word document or a notebook. We can write here, for example, I can save that, execute it so it actually appears. The code cell is different. Whatever programming language we're using in Jupyter notebooks, this is what we will write in this little section here. So we're using Python for this. So basically we need a couple of different packages in Python. So think of a package as a kind of a collection of programming code that allows us to do things in Python. So for example, the pandas package, I'm not entirely sure what the correct pronunciation is, is basically a data cleaning or a data wrangling package that allows you to import data, clean it up and spit it out in lots of different formats. It's incredibly useful. 
We're gonna make good use of it today. NumPy's for just conducting mathematical operations in Python more advanced than just adding or summing or multiplying, dividing, et cetera. We can do kind of more complex mathematical operations. What we're really interested in today is NetworkX. So that's the Python social network analysis package that we need today. There is another one in Python. I think it is graph QL, graph something. I find NetworkX possibly easier to deal with. I think it's maybe a bit more English language based. So if you want NetworkX to do something, the piece of programming code is closely related to English. But it's just to flag up, certainly for some of you who are familiar with maybe the R programming language, that there's multiple ways of doing the same thing basically. And for us today with social network analysis, we're gonna use the NetworkX package for our purposes. We've got a data visualization package that's quite useful when we draw some graphs in a moment. And then we've got kind of a standard piece of Python programming which kind of just allows us to loop through lists, for example. It's nothing complicated at all. We just have to call it in to this Python session. So that's what we do with the preliminaries. We tell Python, go to where you save all the code that we need and pull in these bits of code to this session today. So if I execute this code cell, I can do it two ways. I can click on the cell here. You can see it'll either be highlighted in blue or green. Green means you're in editing mode. You know, I could type things just here. Either way, it doesn't matter. I'm not editing at the moment. I'm happy with that code. And I can go up here to the run button and just click on it. And that will tell the Jupyter notebook to execute that code. Now, we're not expecting any output from this. 
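A minimal sketch of what such a preliminaries cell might look like. The aliases pd and nx are the ones mentioned in the talk; np is the common convention and is an assumption here. The visualization package and the looping helper are not named in the talk, so they are left out of this sketch, and the version printout is an addition for illustration only.

```python
# Preliminaries: pull the packages we need into this Python session.
import pandas as pd     # data import and cleaning
import numpy as np      # numerical operations (alias assumed, not stated in the talk)
import networkx as nx   # social network analysis

# Importing produces no output by itself; printing versions is optional
# and only confirms that the packages loaded.
for package in (pd, np, nx):
    print(package.__name__, package.__version__)
```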
So all we're saying is, you know, import these packages in so that in a moment we can start actually, you know, performing some programming and some analysis. And there's a keyboard shortcut as well. You can hold shift and then press the enter key. I think maybe holding control and the enter key works as well. But basically, if you wanna execute code today or using a Jupyter notebook in general, go to the cell in particular, you know, click on it so that it's highlighted and press run. So we can rerun this as many times as we want because we're just saying, you know, load in these packages. And how do you know which packages you need to use? Well, basically, when you learn how to do social network analysis or web scraping using APIs in Python, machine learning, et cetera, the examples you find will say, in order to execute this, you know, piece of analysis, you will need the NetworkX package, the pandas package, the requests package, et cetera. So you don't need to, you know, come up with these off the top of your head. It'll be made very clear to you that you can only conduct social network analysis if you use these packages in Python. So that's getting ourselves set up. Excellent. I know I've taken a bit of time, but again, assuming that you're reasonably new to Python and maybe particularly new to using notebooks, I thought I'd spend a little bit of time. So we move on, so we get kind of stuck into things now. So what I want to do is get some data in to my Jupyter notebook. As I said, I've got some data about charities in Manchester and who sits on the board of these organizations. This is, again, public information, so don't be surprised when you see some real names. And it would be quite a coincidence if one of you was one of the examples I'm just about to show, probably quite unlikely. So what I do here is, first, I want to read in some data. In order to do that, I use the pandas module that we imported earlier. 
You can see here when I did import it, I gave it a shorthand reference, pd. I didn't have to do that. That just saves me time later on. Instead of typing out pandas in full, I can just say pd and Python knows that I'm referring to the pandas package. So I want to call on my data cleaning package in Python. And then I want to use the read_csv method. So that's a function or a method that allows me to pull in data from somewhere on my machine or on the cloud, which we're using today, and load it into my Jupyter notebook. So I do that, so I take this file here in the data folder, I load it in, and then I want, you know, I tell Python, give me a quick look at the data, how many rows does it have, how many columns does it have. And this function here just basically says show me the first 12 rows. So using the head function, just give me the first 12. I could easily say the first five, for example. But we'll execute the code first and see what kind of results we get. So that's really good. So it's clearly working. There's about 2,700 rows, four columns. And we can see an example of the first 12 rows and all the four columns just here. So how to interpret the data is basically we have individual one here. This is their unique ID. This is their real name. So this is Rabbi Abraham Hassan. This individual has three trusteeships. So this person sits on the board of three Manchester charities. And then these are the unique IDs of each of those charities. So using our example, and we'll develop this further in a moment, we can initially say that these three charities here are definitely connected to each other through a common individual here, this Rabbi here. We can see that there's a second person who sits on the board of nine organizations, which I'm not sure where they find the time. And again, here are the unique IDs. So again, we can say that these nine charities are all connected to each other through this individual. 
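A minimal sketch of that import step. The real file in the data folder isn't reproduced here, so this reads a small invented CSV from memory instead; the column names (trustee_id, name, trusteeships, regno) and every value are assumptions for illustration, laid out as one row per trustee and charity pairing, as in the description above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the trustee file: one row per trustee-charity pairing.
csv_text = """trustee_id,name,trusteeships,regno
1,Trustee One,3,100001
1,Trustee One,3,100002
1,Trustee One,3,100003
2,Trustee Two,2,100003
2,Trustee Two,2,100004
3,Trustee Three,1,100005
"""

# read_csv accepts a file path or, as here, a file-like object.
trustees = pd.read_csv(io.StringIO(csv_text))

print(trustees.shape)    # (number of rows, number of columns)
print(trustees.head(5))  # the first five rows; head(12) would show twelve
```

With a real file you would pass its path to read_csv in place of the in-memory text.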
It gets more complicated later because two organizations might be connected multiple times through different trustees. And how do we take account of that? That's easily done in social network analysis, but it's not a mode of analysis we're gonna undertake today. We're just gonna simply look at are two charities connected or not and how many times does that happen? So excellent. So one of the first things we want to do is basically just take a look at how many charities there are in the data set. You'll notice the way it's currently structured is we have basically lists of trustees. So there's 2,700 records, but of course, that's not necessarily 2,700 people. It's X number of people on the board of Y number of charities. So we just want to do a simple count of how many organizations are in the data set. So basically I say count the reg no, so that's the registration number column, after dropping duplicates. So I only want to count unique instances of charity number and we get 1,123. So basically there's 1,123 charities in Manchester that are connected through their trustees. So the next thing we want to do when we're conducting social network analysis is we need to get the data in the correct format. Now this is something we went through in quite a lot of detail in the webinar so I won't go too much into it now. But essentially we want to move from the data set we just had, which is like lists of trustees and a column showing which charities they're connected to. And basically we want every row to be a charity and we want every column to be a charity. And in the cells, in between the rows and columns, we want a one if those two charities are connected and we want a zero or a blank space if those two charities are not. So that's probably a bit abstract to conceive and to visualize just as I've explained it. So let's just get straight into it and take a look at how it should look. 
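As a rough illustration of both steps, here is one way to count unique charities and build the charity-by-charity matrix in pandas. This is a sketch under stated assumptions, not the notebook's own code: the column names trustee_id and regno and all the values are invented, and the crosstab-and-multiply approach is simply one common technique.

```python
import numpy as np
import pandas as pd

# Hypothetical trustee records: one row per trustee-charity pairing.
trustees = pd.DataFrame({
    "trustee_id": [1, 1, 1, 2, 2, 3],
    "regno": [100001, 100002, 100003, 100003, 100004, 100005],
})

# Count unique charities by dropping duplicate registration numbers.
n_charities = trustees["regno"].drop_duplicates().count()
print(n_charities)

# Trustee-by-charity table: 1 where a trustee sits on that charity's board.
incidence = pd.crosstab(trustees["trustee_id"], trustees["regno"])

# Charity-by-charity table: number of trustees each pair of charities shares.
shared = incidence.T.dot(incidence)

# Reduce to connected / not connected and blank out the diagonal,
# since a charity is not tied to itself.
values = (shared.to_numpy() > 0).astype(int)
np.fill_diagonal(values, 0)
charity_matrix = pd.DataFrame(values, index=shared.index, columns=shared.columns)

print(charity_matrix)
```

Keeping the counts in shared, rather than reducing them to ones and zeros, is what you would use for the multiple-trustee case mentioned above.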
Excellent, so to kind of give life to what I've just said, basically now we have all of those 1,123 charities all listed individually as rows. So here's Charity 208879, so that's a Manchester charity. And then along the columns we have the same charities just listed again that way. And then in each cell a zero tells us if those two charities are not connected and a one would tell us if those two charities are connected. Because there's again 1,100 rows and 1,100 columns Python's not gonna, the notebook rather is not gonna show all of it. So you can see here the ellipses kind of saying that there's lots more columns in between that we haven't taken a look at. There's no real reason to actually visually inspect it. I mean an easier thing here would be just to use the pandas module and to just export the data set and then you could just look at it manually. You could open it in Excel or some open office software. But it's good to do a little visual inspection but we don't need to comb every column in every row to check that there are ones. As you probably guessed, it's much, much simpler just to go across the rows or go across the columns and just say count the number of ones and that'll tell us how many connections there are. So that data set or that matrix that I've just created, it's called charity underscore math. That could be anything, that's just a variable name I've picked. Then I want to use the sum command. So basically just sum either the rows or the columns. So I specify the axis, so that's a way in Python of referring to the columns or the rows. So if I refer to axis one, I'm referring to the rows. Basically, I just want to count, yeah. Basically I want some summary statistics actually, it's not just counting, it's give me some summary statistics of all the ones and zeros that we find in the data set. As you can see here, we get the average. So basically each charity is on average connected to about three others. 
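The row-sum-then-summarise step can be sketched as follows, using a tiny invented four-charity matrix rather than the real 1,123 by 1,123 one. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical 0/1 matrix: A, B and C are all tied to each other, and D is tied to C.
labels = ["A", "B", "C", "D"]
charity_matrix = pd.DataFrame(
    [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]],
    index=labels,
    columns=labels,
)

# axis=1 adds up the ones across each row: one total per charity.
ties_per_charity = charity_matrix.sum(axis=1)

# describe() reports the count, mean, min, max and quartiles of those totals.
print(ties_per_charity.describe())

# axis=0 adds down each column instead; for a symmetric matrix the totals match.
print(charity_matrix.sum(axis=0).describe())
```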
So there's quite a lot of board interlock in the Manchester charity sector. Probably not too surprising, there's, well there are I think half a million people in Manchester. You tend to get kind of the same people volunteering. So it's not like all 500,000 people have a chance of volunteering as a trustee. There tends to be this idea of a civic core when it comes to the charity sector, that roughly the same kind of numbers and groups of people perform multiple roles in the sector. So you can see the minimum number of connections is one. So every charity basically is connected to at least one other in this data set. And there's one charity, we can see the max statistic here, that's connected to 23 others. So that's quite a well networked or possibly quite an important charity that kind of sits in the middle of the Manchester charity network. We can start looking that up in a moment. And just to show that you can also just say, well take the information in each column and give me the summary statistics and you get the exact same results. For the simple reason that we have 1,100 rows and 1,100 columns, it's the same organizations. So it doesn't matter if you want to count all the ones across or all the ones down the way, we get the same summary statistics. I'm showing you maybe more complicated Python code than you need to do this. There's a simpler way of using the NetworkX package in a moment. But I do encourage you to take maybe a deeper look at what I've just shown you, particularly how we construct the network data set. It's not something you necessarily have to do all of the time. But why I think it's a good thing to do is it really, really forces you to focus on the kind of key concepts and just what it is you're trying to do. So by setting up the data set that way, with all the rows and all the columns having charities, I made it very explicitly clear that I'm interested in how charities are connected. 
Because you probably have noticed I could have done it the other way and I could have put all the trustees on the columns and all the trustees in the rows as well and said, are two trustees connected to each other by sitting on the same board. That's another way of looking at the same data set slightly differently. But basically after all that work, just if any of that wasn't interesting or particularly relevant, it's all in service of getting the data in the right format so then I can start using the NetworkX package, which allows us to do the actual core social network analysis. So basically I go again to the NetworkX package here. I refer to it by its shorthand, it's nx. It has a method called from_pandas_adjacency. So that just means from a matrix data format, which I've just created, create a NetworkX object. And we get our result here which basically just says we've created a network graph. A graph for most of us is a more generic term for data visualization, some kind of pictorial representation of data, or an infographic, which is not necessarily a data-based product. In social network analysis, a graph is basically the data, it's network data. So again, it comes from the kind of mathematical origins of social network analysis. So from now on, when you hear the term graph, we're not actually talking about data visualizations. We're talking about that network that I've just created. So those columns with all the charities and those rows, that's now known as a graph because it's a list of nodes and it's a list of ties. So basically a graph is a collection of all the things that are connected and all the connections between them. But we will see a visualization of that graph in just a moment. But it's just so you understand with social network analysis, a graph refers to network data and not necessarily an actual visualization. 
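A minimal sketch of that conversion, using a small invented matrix in place of the real one. nx.from_pandas_adjacency is the NetworkX function named above; the variable names and data are illustrative.

```python
import networkx as nx
import pandas as pd

# Hypothetical 0/1 charity-by-charity matrix (rows and columns share the same labels).
labels = ["A", "B", "C", "D"]
charity_matrix = pd.DataFrame(
    [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]],
    index=labels,
    columns=labels,
)

# Each non-zero cell becomes a tie (edge) between the row and column charities.
char_graph = nx.from_pandas_adjacency(charity_matrix)

# The graph is just data: a collection of nodes and a collection of ties.
print(sorted(char_graph.nodes()))
print(sorted(char_graph.edges()))
```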
So now what I've done with this piece of code is I've said, right, here's my network data, put it into the NetworkX package so that Python now knows that we can start doing some social network analysis methods and techniques. So what we want to first kind of focus on is some network level summaries. So basically we took a slight look at them. We looked at the average number of connections between charities, the maximum and the minimum number of connections. We want to go kind of one step further. We want to look at the size of our network. So how many charities are in the network and how many connections between them exist. So basically there's a very simple NetworkX method. It's called the info method. So give me information about this network data here, which I called char graph. So it produces again some summary information about kind of the aggregate properties of the network. The number of nodes is 1,123. We knew that from previously. We know that we have 1,100 unique charities. What we didn't know previously was, you know, how many connections existed between them and we get about 1,500 connections. So out of 1,000 charities, there's about 1,500 connections between them, which is reasonably dense. So we can get an exact measurement of density in a moment, but it's a reasonably high number of connections between charities. And again, as we learned before, each charity is connected to, on average, about three other organizations in Manchester. So that brings us to a concept in social network analysis called density. So the density of a network is essentially, of all the possible connections that could exist, how many actually exist in reality. So how many possible connections between charities have been realized? Again, thankfully, we can use the kind of English language-based methods that the NetworkX package provides. So instead of asking for information, now we're asking for information on density. So we use the density method. 
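A small sketch of the size and density measures on an invented four-charity network. One caveat: as far as I know the info function was removed in NetworkX 3.0, so this sketch asks for the node and edge counts directly instead. The last few lines redo the density arithmetic with the figures quoted above, treating "about 1,500" as an approximate edge count.

```python
import networkx as nx

# Hypothetical network: A, B and C are all tied to each other, and D is tied to C.
char_graph = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])

n_nodes = char_graph.number_of_nodes()
n_ties = char_graph.number_of_edges()
print("Nodes:", n_nodes, "Ties:", n_ties)

# Density = ties that exist / ties that could exist.
# For an undirected network that is 2m / (n * (n - 1)).
density = nx.density(char_graph)
print("Network density:", density)

# The same formula with the approximate figures quoted in the talk.
talk_nodes, talk_ties = 1123, 1500  # 1,500 is approximate
approx_density = 2 * talk_ties / (talk_nodes * (talk_nodes - 1))
print("Approximate density of the Manchester network:", round(approx_density, 4))
```

On those approximate figures the density comes out at roughly 0.0024, so well under 1% of possible ties.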
And what I do is I just take that result and I put it in a new variable called density. That's just so I can call it later on. So I want basically Python to print a message back to my screen saying, you know, this text here, i.e. network density colon and then whatever is captured by this variable density here. So actually those 1,500 connections in reality, from a substantive perspective, that's a reasonable number of connections between charities. But mathematically, it's actually a very, you know, very small share of the connections that could exist. So density is measured as a proportion between zero and one, or just think of it as a percentage between zero and 100. So for example, basically, you know, fewer than 1% of all possible connections have been realized in the Manchester charity sector. So that makes it seem quite, you know, seem like quite a bare network, quite an empty network. But we're talking about real organizations, real life here. You don't tend to find very dense, real networks. You might find things like internet networks or electrical wiring networks to be very dense. So of all the wires that could be connected in and plugged in, the vast majority are, that's a good thing from an electrical engineering perspective. Real life social networks don't tend to be very dense. So then we can move through a couple of other social network analysis concepts, you know, so clustering is one very interesting, well-developed concept. And we're interested in one aspect of clustering which is known as transitivity. So this is a concept where basically, if, you know, three charities exist in the network and connections exist between two pairs of them, basically what's the probability that the third connection will become realized? So it's better to kind of look at this visually. So basically a connection between three organizations would be known as a triad. So a trio of connections. Sorry, let's make this a bit more legible. Excellent. 
So this is an example of a possible triad. So you can see we have a group of three charities in a network, two pairs of them are connected, let's say charity A and charity B, and charity A and charity C. So basically the clustering concept and the clustering measure we're gonna calculate is what's the probability of these two organizations forming a connection? So this is the friend of a friend kind of idea in a social network and what's the probability that a friend of a friend becomes a friend, basically. And this just gives us an idea of how likely is a network to become a bit denser? How likely are connections to form when we see this kind of example in the network? So that's a possible triad. So this is what we'd expect to find. We're basically saying, if we see a possible triad, what's the probability of this example coming to life in the network? Let's make this a bit bigger again, perfect. So it's a concept called transitivity. So again, we want the transitivity method from NetworkX. Hopefully you're seeing this is why I think NetworkX is quite good. It's quite a logical and English language-based package of programming code. Again, I'm just calling this variable triadic closure to use the technical term. I could just call it triads, for example. I'm just creating this variable name, whatever I want. Again, ask Python to execute that code. And yeah, I get a measure of about 61%. So that's quite high. What that basically means is if we see examples where charities have an organization in common, those two charities are likely to form a bond between them. So just to scroll up quickly, basically if we see this scenario here, if we see that absence of a connection between two charities, then about 60% of the time it forms into this kind of connection. So a triad or a triangle. 
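To see what the transitivity measure is counting, here is a sketch on a small invented network. nx.transitivity is a real NetworkX function, defined as three times the number of triangles divided by the number of connected triples (possible triads); the graph and variable name are illustrative.

```python
import networkx as nx

# Hypothetical network: A, B and C form a closed triangle, and D is tied only to C.
char_graph = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])

# Connected triples (two ties meeting at a middle charity):
#   centred on A: 1, centred on B: 1, centred on C: 3, centred on D: 0 -> 5 in total.
# Closed triangles: 1 (A-B-C), which closes 3 of those 5 triples.
triadic_closure = nx.transitivity(char_graph)
print("Triadic closure:", triadic_closure)
```

Here three of the five possible triads are closed, giving 0.6. The two open ones are A-C-D and B-C-D, where D is not tied to A or B.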
So that just gives an idea that in my network, if a charity knows another charity through a common link, then there's a pretty good chance, there's a better chance than not, that those two charities will actually form a connection themselves. Okay, so we'll just cover a couple of other social network analysis concepts and methods. So we've kind of taken a look at the network in an aggregate sense. So network level measures of analysis. We're gonna now take a look at node level. So now we're interested in individual charities and what's the maximum number of connections per charity? What's the average number of connections? Can we get some sense of how important certain charities are in the network? For example, so we'll take a quick look at some of those measures just now. So one concept is called centrality. So there's a couple of different types of centrality, and the one we're interested in just here kind of captures importance. So how important is a certain organization in the network? So basically the most important charity in my network is the one with the most connections. So that's the connection between the concept we're looking at, importance, and the kind of empirical measure we can look at, which is the number of connections. So the most important charity is the one with the most connections, is what we're saying just here. And this is a measure called degree centrality. So basically there's a tiny bit more code needed to pull out the degree measure. So basically I have my network data here which I've called char graph. It's got a degree method. So it's like, tell me, you know, count all the connections of all the charities in my network. Yep, so I feed that method all of the nodes. So I basically said to Python, look at all the nodes in my network and count all the connections between them. And basically I store the results in a variable called this just here. 
And what I do is I want to take a quick look at, I've plucked this charity semi-randomly. So charity number two, two, five, one, one, six, tell me the number of connections or the number of ties that charity has. So this charity has 12 connections to other organizations in the network. Yeah, and then what I do here is basically I just kind of loop through all of the results. So basically I have a list of all the charities and for each charity I count how many connections it has. And now I'm just saying, you know, show me the top 20 best connected organizations. And yeah, so the most important charity in the network, i.e. the one with the most connections, is this one here, 530-002. Let's actually make this real, that might actually help. We can actually just go and check who this actually is, register, there we go. Yep, search the register, government websites are not particularly intuitive. Okay, so basically it's a religious charity focused on education. And this is the best connected charity in Manchester. You can see where I get the information from. So I have the list of all the trustees who sit on the board of this organization. And look, it's no surprise that the first person alone has three other trusteeships. So they sit on the board of three other charities. The second person sits on a different charity. Yeah, so you can see just even scrolling through the results, just visually, it's quite a well connected charity. And I've asked for the top 20. This kind of structure here is a loop. So we're saying for every element in a list, and my list is called this thing here, sorted degree. Again, it's a variable name, it could be called anything. This allows me to specify how many records to loop through. So if I was interested in the top 10, I would say, I can make this explicit. So basically go from the first record to the 10th, or I could say 100 if I wanted. And then we have to scroll down a bit further, but it's within our control. Excellent. 
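The degree lookup and the top-N loop can be sketched like this on a small invented network. The name sorted_degree follows the talk; degree_by_charity and the data are illustrative. Note that this counts raw ties, as in the talk; NetworkX also offers nx.degree_centrality, which divides each count by the number of other nodes.

```python
import networkx as nx

# Hypothetical network: A, B and C are all tied to each other, and D is tied to C.
char_graph = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])

# Count the ties of every node and store the result as {charity: number of ties}.
degree_by_charity = dict(char_graph.degree(char_graph.nodes()))

# Look up a single charity.
print("Charity C has", degree_by_charity["C"], "ties")

# Sort from most to least connected.
sorted_degree = sorted(degree_by_charity.items(), key=lambda item: item[1], reverse=True)

# Loop through the top records; the slice controls how many (here 2, in the talk 20).
for charity, degree in sorted_degree[:2]:
    print(charity, degree)
```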
So we're going to recreate basically that for a slightly, oh no, actually we are going to do some visualizations. So a very common way of analyzing the number of connections in a network is to produce a histogram. It's called a degree distribution. And it'll be very familiar to any of you, even if you're not primarily a quantitative researcher. It's a very common data visualization type. It basically just shows us a count of the number of connections in the network. So what we're doing here is we're just creating a variable that stores a count of all the connections in the network. I'm then using the plot function in Python. I'm saying, right, give me a histogram using this dataset and then just some aesthetic information. So give me a graph title, label the x-axis, and show me the resultant graph. It's being a bit funny with the display, let's zoom out a bit. Yeah, there we go, here we go, that's a bit better. So basically we've got a histogram of the number of connections in the network. So basically we can see that the most common number of connections is one. So most charities only have between one and two connections in the network. But there's about 150 organizations, 130 organizations have about four, between four and five connections. Then there's about 50 organizations that have between six and seven connections, and so on, you get this long kind of thin tail. So there's a very, very small number of charities, as we saw earlier. In fact, there's only two that have more than 20 connections in the network. Okay, and finally, so I mentioned that a graph is a more technical term in social network analysis. It's basically just another word for network data. But of course we can create what we understand to be graphs of network data, so data visualizations. Visualizing a network is not actually necessary to do good social network analysis. 
In fact, it's often, not quite a waste of your time, but as soon as the network becomes more than trivially large, so unless your network is incredibly small, it becomes very difficult to find meaning in the visualization of a network. And we can kind of see this here. So this is the Manchester charity network visualized. I'm just trying to get a better representation of it. So basically we can see that there's a large cluster in this network: there's lots of individual charities that are connected to one or two or three other organizations. And then on the outskirts of the network, if you want to put it that way, we have one organization here that's connected to this one, so it only has one connection. This charity has two connections, because it's connected to this organization here and it's connected to this one here. Then this one is connected to three: this one here, this one here, and this one here. And then this one is connected to one, two, three, four, five, six. So the visualization is revealing, if only trivially, of the network structure, the network properties. But it's highly reliant on the method of drawing the network, and there's lots of ways to draw the network. So this is one particular way that I've chosen. It's not quite a random layout; it's basically got an algorithm that decides where to put all the dots. And there's lots of ways of doing that. So here's an alternative way of doing it. This is the exact same network data. I haven't done anything. I'm not trying to trick you with a different data set. But this time I've said, right, draw the network using a circular algorithm. And as you can see here, all the charities are on the outside in this kind of rugby ball shape, and then all the connections between them crisscross. So you can somewhat see that this charity here, if you follow one of the lines, is connected to this charity here.
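To make the point about layouts concrete, here is a rough sketch of drawing the same data two ways. It assumes NetworkX and Matplotlib and uses a toy graph with hypothetical labels. The force-directed spring_layout is an assumption on my part: the demonstration only says that an algorithm decides where the dots go, so the notebook may use a different one.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Toy stand-in for the charity network (hypothetical labels, not real data).
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("D", "F"),
])

# Two sets of x/y coordinates for exactly the same nodes and edges.
pos_spring = nx.spring_layout(G, seed=42)  # force-directed (assumed choice)
pos_circle = nx.circular_layout(G)         # nodes evenly spaced on a circle

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 5))
nx.draw(G, pos=pos_spring, ax=ax_left, node_size=80)
ax_left.set_title("Force-directed layout")
nx.draw(G, pos=pos_circle, ax=ax_right, node_size=80)
ax_right.set_title("Circular layout")

# The layout only changes the picture. The degrees are computed from the
# graph itself, so they are identical whichever drawing you look at.
print(dict(G.degree()))
```

Only the positions differ between the two panels, which is why the summary statistics are a more dependable guide than either picture.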
Again, exact same network data, but this time I have effectively no way of, you know, determining how many connections are in this network, or which charities have them. Maybe it's this charity here, because the lines are denser? You know, is this the charity with the most connections? I really don't know. So you can visualize the network. It's sometimes useful; it may be something that's good for, you know, a poster at a conference. But hopefully, as I've shown by executing the code to look at some of the summary statistics, visualizing the network is not central to your understanding of what's going on at all. So if there's one takeaway from today: do not waste your time trying to draw a very beautiful network diagram. It's often not worth it. So that's as far as I wanted to take you today. It was just to give you a sense of how we work through a social network analysis example in Python. Again, I haven't explained the code on a very, very fundamental level. This isn't an introduction to Python in general; we have other materials, and there's lots of other great initiatives out there, not just from us. And again, we're not teaching you social network terms today either. This is hopefully for those of you who have some sense of both and just wanted a kind of talk-through of here are the steps and here's what's going on at the key moments. So that's the end of the demonstration from me. Hopefully a good few of you have followed along online, which would be really, really good. I'm now going to take a look at the chat. I'm currently minus a lot of IT equipment, so I do not have a second screen at the moment, which is a shame. But I will take a quick look at some of your comments. Excellent. Okay, so Farsi, thank you very much. Hopefully we've sent you the link and you've received emails from us as well. So there's lots of code you can see.
I've just basically focused on one simple social network analysis example. There are three notebooks that are full of code. They're full of examples. Plenty of practice is how you really develop your programming. I'm not a computer scientist by training. I'm a social scientist who's trying to shift towards more computational methods in my research and teaching. Hopefully today has demonstrated as well that it is possible to go very far with some decent Python and programming skills in general. The way I find that I explain this is: if you remember those American military movies where they show the boot camp, and there's this 12 foot high wall and they're trying to help each other over it, and you're like, how do you get over this bloody thing? It's impossible. But once you get over it, the hard part is finished and you can just run nice and freely to the end of the obstacle course. I find learning programming, or Python, is like that, where there's a very high initial barrier. You're like, I don't understand what a package is. How do I install packages? How do I import them into Python? What am I doing here? What's a function? What's a variable? But hopefully once you get past that, and as I've shown you today, you can go very, very far in a reasonably short time with some decent, standard, basic Python skills. That's enough from me. Definitely enough chat. I'm gonna stay on for another five minutes or so if you wanna post questions. You'll have our contact details. You probably won't have mine, because I've just left the data service. But if you can find my details, then absolutely contact me. But I would encourage you to get in contact with my colleague Julia Kazmier or the data service in general. We're doing lots more stuff like this. We're doing drop-in sessions where you can bring your own questions and ask an expert at the data service as well.
So you'll probably see my face for a moment while the feed is still live. So if there are any questions, I'm happy to answer. If not, then thank you very much for joining. Really good luck in terms of what you're doing with your computational social science. Thank you. Oh yes, there was a question about charity data. Yes, so I just used one particular subset of data to do with Manchester charities. Yeah, I would just go to the Charity Commission for England and Wales website, as they've got a data download and they've got a kind of public search tool. So you could just use the data download and you'd be able to filter by postcode or by city to get what you want. There's lots of open data about charities now, which is really, really good. Yes, so there's a question from Ian Hamilton. Thank you very much. Are there Python packages for statistical clustering methods, so PCA or latent block modeling? Yes, so I think in terms of network clustering, if that's what you're specifically asking about, yeah, so block modeling, yes, can be done in Python. I don't think it can be done in the NetworkX package. I'd need to have another look. I think there's a separate package for that. I certainly know, if you use, what's the other software? It's Pajek. Yeah, there's a couple of other social network analysis packages that you could use that are specifically for doing statistical modeling, so things like exponential random graph modeling. The NetworkX package is quite good; it does some advanced stuff, but it's mainly for describing networks, doing those kind of fundamental summary statistics. And then there typically tend to be very specific packages for doing more advanced modeling or block modeling. So apologies, I can't give you the exact package name, but there certainly are packages in Python that can do that. Excellent.
So if there aren't any other questions, and if you do have questions later, again, send them through the data service and Julia will pass them to me. If not, hopefully I'll see some of you in two weeks' time from today for the next session. And thank you very much. Bye everyone.