 So first of all, thank you very much to Mark for perseverance and organizational skills and making this all happen. And I want to say a special thank you to the Sealy Foundation. So it's great to fund these sorts of things. And I think it's very exciting for me to be able to talk to a group like this and to spread the knowledge that I get so excited about. So I enjoy that opportunity and thank you very much for that. And in terms of the overall, the hospitality here has been wonderful and thank you also to the University of Auckland for putting this together and hosting it. So the work I'm going to talk about this evening is what I'll do is give you an idea of some of the things we learn from looking at networks and economics. Understanding human behavior really has to be put in a social context because we're really social animals. We depend on each other for all kinds of things. And in particular, our information, the kinds of favors that we get from other individuals, all of these things are really put in a social context. And so economics for many years has been something which hasn't paid so much attention to social structure. But understanding social structure now gives us a lot more understanding of very particular kinds of economic activities and things. And this is a very partial list of some of the things that are ones that we get real new understandings of from looking at networks. So a lot of interactions occur in social environments and obviously having opportunities for job, who we hear job information about often comes to our social networks. Adoption of new products, I'll talk about that a bit this evening. Sharing of favors, risk, opinion formation, investment in education, all kinds of things, trading relationships. All of these have patterns to them and understanding those patterns can help us understand the behaviors. 
And so in particular, what I'm going to do is start with just a couple of pictures of different kinds of networks to give you some idea of the breadth of the subject. And then I'm going to talk a little bit about how we deal with these animals and how we measure them and how that can help us with economic insights. So this is a fascinating one. This is one from a study by Bearman, Moody, and Stovel. It's what's known as the Adolescent Health data set. And these little dots are kids in high school. And a line between two of them means that they had a romantic relationship during the 18-month period when the study occurred. So this is a network of basically romantic relationships. The kinds of things that are interesting from this: this is a fairly fragmented network. So there's actually one large piece, what's known as a giant component. There's a whole set of nodes that are connected through different relationships. And then there's a bunch of smaller pieces, and then there's actually 63 pairs here who were just fairly steady during the time period. But the kinds of patterns that we see emerging here are patterns that we'll see in a lot of different contexts, and we'll be able to understand things about these. So this will affect who hears what news, who interacts with whom, who adopts which kinds of fashions, products and so forth. Now something very different: I'm going to jump to sort of the opposite end of the spectrum in terms of networks. These are military alliances. And these are ones just before the First World War, and you can look at ones just before the Second World War, and you can look at ones today. And they look very different. They have very different patterns and actually, you know, if you go into understanding these patterns, you can begin to see why it is that in fact we have many fewer wars these days than we had previously. So if you look at the world, the incidence of wars has gone down by a factor of 10 since 1950.
So these are the kinds of things that you can begin to understand by looking at some of these alliance structures and so forth. So what I'm going to do is go through and talk about some of the ways in which we can work with these networks and the ways in which they can help us understand different kinds of phenomena. And I'm going to paint with a very broad brush tonight. So I'm going to cover a lot of ground on basic concepts and show you how these work. So the main challenge in working with networks is complexity. And this is just a very simple calculation that's useful to sort of put things in context. So how many networks are possible if we just looked at 30 different nodes? So if we just had 30 people, say in a classroom in a school, how many different networks of friendships could we observe? Well, person one could have 29 different possible friends, yes or no. Person two could have 28, not counting the one already counted with person one, and so forth. If you go through that, there's 435 different possible relationships just among 30 people. And if you think about how many different networks that gives you, you could have any combination of those 435 relationships on or off. And that's a huge number. So it's in fact 2 to the 435th possible networks that exist in a society of just 30 people. Well, how many atoms are there in the universe? Somewhere between 2 to the 158 and 2 to the 246, by Wikipedia. So I wouldn't count this as hard science, but if you look at Wikipedia, the estimates vary somewhere on those orders. Okay, so that tells us that there's a huge number of networks. We're not going to be able to identify them easily. It's just too big a class. And so when we think about social structure, it can be very complex.
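The counting argument above can be checked in a few lines. This is only a sketch of the arithmetic from the talk; the variable names are illustrative and not from the speaker.

```python
import math

n_people = 30

# Each unordered pair of people is one possible friendship: 29 + 28 + ... + 1.
possible_links = math.comb(n_people, 2)

# Each possible friendship is either present or absent.
possible_networks = 2 ** possible_links

print(possible_links)                  # 435
print(len(str(possible_networks)))     # 131 decimal digits
print(possible_networks > 2 ** 246)    # True: larger than the upper atom estimate quoted above
```

Python integers have arbitrary precision, so the full value of 2 to the 435th is computed exactly rather than approximated.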
And so we're going to have to have simple ways of understanding a social network, condensing it down, and then trying to understand in this condensed form what we can understand about behavior. So tonight what I'm going to do is go through three different illustrations of different properties that networks have, and then how each one of those properties affects human behavior. So the first one is just connectivity or density. So does the network have lots of connections or does it have very few connections? So does it look like the last military alliance network or does it look like the high school one, where it was a little sparser? The next thing I'll talk about is segregation patterns. Does everybody interact with anybody else in the network fairly evenly? Or do we see strong cuts in the network where there's subgroups that are only interacting among themselves and not interacting across groups? Okay, so those are going to be two different aspects of a network. And the third thing I'll talk about is position in a network. So when we begin to look down into the network and look at particular nodes, which nodes are well situated? Which ones are in more powerful or influential positions? Which ones are more peripheral? So these are three different aspects of networks that we can look at, and all of them will have interesting implications for behaviors. And so that's what we'll do. So I'll start with connectivity and density. And this one is probably the simplest to understand intuitively. In denser networks, there's going to be more conductivity. You're going to be able to connect more people with more connections. And so density and the distribution of links is important in many different kinds of applications. We have flu and disease spread. So understanding influenza every year, understanding what's going on with Ebola these days, depends a lot on understanding the network structure. Other forms of contagion, diffusion of ideas, products and so forth.
So higher density roughly is going to correspond to more contagion. And I'll sort of go through this. There's going to be some caveats to that. So there's some basic ideas that are going to be fairly obvious and then some nuances that won't be as obvious. So why does more connectivity lead to more contagion? Well, the idea is fairly simple and we can see it very easily. Let's just take a random graph where we take a society and start putting down connections between individuals at random. So here I put them down with a very, so degree in network parlance means how many connections you have. So if I have degree five, it means I have five friends. If I have degree seven, I have seven friends. So here this is a situation where we put down links at random, but we put them down so that each person has about a quarter of a friend on average. So very rare. Friendships are pretty rare. Now if you had a disease that started in this network, it wouldn't spread very far. So it just doesn't have the connectivity. There's not enough interaction. It would die out very quickly. Go to degree one and the network starts to coalesce. It starts to look a lot more like that romance network. You're going to get a giant component. You get path connections so that things can spread through the network. And that happens fairly quickly. So now if about half the nodes are connected in a giant component, there's still some in smaller pieces and some isolated. You go to two and a half and boom, the network really coalesces. So what do we take away from this? One thing to take away from this is there's a very fine line between being disconnected and being connected. So it's a very rapid what physicists refer to as a phase transition. So you go from average degree one, so each person has an average one friend to two and a half, and you've got a very different world. It's much more connected and now things can spread quite a bit. 
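The rapid transition described above can be illustrated with a small simulation. This is a minimal sketch, assuming the textbook random-graph setup in which every pair of nodes is linked independently with the same probability; the function name, the 1,000-node size, and the seed are arbitrary choices, not details from the talk.

```python
import random

def largest_component_share(n, avg_degree, seed=0):
    """Share of nodes in the largest connected piece of a random graph."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)        # link probability that yields the target average degree
    parent = list(range(n))         # union-find forest over the nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps the trees shallow
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:            # put down this link at random
                parent[find(i)] = find(j)   # merge the two components

    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

shares = {d: largest_component_share(1000, d) for d in (0.25, 1.0, 2.5)}
for d, share in shares.items():
    print(f"average degree {d}: largest component holds {share:.0%} of nodes")
```

Exact percentages depend on the seed and the number of nodes, but the qualitative jump between an average degree of a quarter and an average degree of two and a half should be visible in any run.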
In epidemiology, people always worry about this basic reproductive number. If a person gets ill, what's the chance that they transmit it to somebody else? How many expected future cases does that lead to? If it leads to more than one, then you've got a problem. And that's exactly what this picture is pointing to. So connectivity leads to more diffusion. The denser the network, the more connectivity, and it's a very sharp transition where once you've got an average degree of three, you're pretty well going to be connecting the world everybody to everybody else. It happens very rapidly in terms of a random graph transition. So in this comparison, we see a rapid transition. So if you begin to look around the world, different networks have different signatures. So some are fairly sparse, others are denser. So this is just a few numbers. If you look at the high school friendships in adolescent health, when you talk to people who are good friends of yours, people name on average about six and a half. The romance network was just below one. It was about .8 friends each person had, romances. And one of the networks we'll look at in just a few minutes, a borrowing network in rural India, about three and a half different households that an average household will borrow from. You can look at co-authorship relationships among different scientists. So how many people do they collaborate during their career? Biology, about 15 and a half. Economics, about 1.7. Math, 3.9. Physics, 9.3. You get different collaboration structures among different scientists. That's going to lead to different diffusion patterns and different densities and different conductivities in those different societies. Facebook, everybody asks about this, so you have to put it up there. 120. What Facebook friends mean is a whole other issue. But on average, a typical Facebook person, active on Facebook has 120. As of 1999. 
Okay, so let me now push on. So we've made one point, that this phase transition happens at a fairly rapid rate. From one to three, you've suddenly connected the world fairly strongly. The next point is that as you increase links, that makes the network more connected, but it has a countervailing effect. And what's that countervailing effect? The countervailing effect is that as you have more connections, you might spend less time with any given one. So you're diversified more in terms of the relationships. And as you diversify more in terms of the relationships, that can actually slow down contagions or slow down things transmitting from one to another. So now I have more friends, but I'm spending less time with any one, and that means the probability of transmitting something from any one to another decreases. And that's something which has an important set of implications. So I'll talk about some work I did with two students at Stanford, Matt Elliott and Ben Golub. And what we did was we looked at financial contagions. So we were trying to understand when it is that failures of one financial institution could actually lead to the failure of another one. And basically it follows a pattern where as you increase the number of counterparties, or other firms that a given firm is making transactions with, you expose them to more risk. And you can transmit shocks, so that if one ends up having a problem and going bankrupt for some reason, then that can cause them not to make payments back to another, and that can transmit through the network. And so you want to understand that and avoid 2008, 2009 kinds of financial contagions. So when you look at this, you know, increasing the number of partners connects the network together, but it also diversifies the investments. So as I'm dealing with more and more other firms, I have less and less of a percentage of my portfolio with any particular firm.
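The two opposing forces just described, more partners connecting the network while each individual exposure shrinks, can be explored with a toy simulation. To be clear, this is not the model from the work with Elliott and Golub. It swaps in a much simpler threshold rule, assumed here purely for illustration: every partner is an equal slice of a firm's portfolio, and a firm fails once at least 20 percent of its partners have failed. All names and parameter values are hypothetical.

```python
import random

def random_graph(n, avg_degree, rng):
    """Adjacency lists for a random graph with the given average degree."""
    p = avg_degree / (n - 1)
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs

def cascade_share(nbrs, first_failure, threshold):
    """Share of firms that end up failing after one initial bankruptcy."""
    failed = {first_failure}
    frontier = [first_failure]
    while frontier:
        next_frontier = []
        for f in frontier:
            for g in nbrs[f]:
                if g in failed:
                    continue
                bad = sum(1 for h in nbrs[g] if h in failed)
                # Assumption: each partner is an equal 1/degree slice of the portfolio.
                if bad / len(nbrs[g]) >= threshold:
                    failed.add(g)
                    next_frontier.append(g)
        frontier = next_frontier
    return len(failed) / len(nbrs)

def average_cascade(avg_degree, n=400, trials=40, threshold=0.2, seed=1):
    """Average cascade size over several randomly chosen initial failures."""
    rng = random.Random(seed)
    nbrs = random_graph(n, avg_degree, rng)
    starts = [rng.randrange(n) for _ in range(trials)]
    return sum(cascade_share(nbrs, s, threshold) for s in starts) / trials

results = {d: average_cascade(d) for d in (0.5, 3.0, 12.0)}
for d, share in results.items():
    print(f"average partners {d}: average share of firms failing {share:.1%}")
```

Under these assumptions the cascades should be small when firms have very few partners, because the network is disconnected, and small again when they have many, because each exposure is a small slice, with the large cascades in between.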
And what that does is that minimizes the risk in terms of now if one of them goes bankrupt, that's a smaller part of my portfolio. I have less danger of going bankrupt myself. So what happens is you end up with a danger sort of in the middle range. And I'll just give you one picture of this. So if you look at, say, average number of partners, these are simulations. We've done simulations on a whole series of different networks. I'll show you one of them in a moment. But here what's on this axis is the average number of partners that a firm has. And what we do is shock one of the firms and then have it go bankrupt and then try to understand how that transmits through the society. And what you see here is this is basically zero. Here's one having one other connection. Here's the three. So the world is becoming very connected. And then once you get up to six or seven, things drop down quite a bit in this world. So this is actually based on the banking reserve ratios and so forth. But I won't go into the details behind this. But basically what's happening is as you increase the number of partners, now the things become more diversified and you get lower numbers of bankruptcies as a result of shocking the first one. Okay? So this can have countervailing effects depending on exactly what the interaction structure is. But now this can be done in a lot of different contexts. This is done in the context of a bank, but you can also do it in the context of diseases or other kinds of diffusion of products, other kinds of interaction systems where there's also similar kinds of behaviors. This is a picture of the actual network, which underlies this is the amount of debt from different sovereign countries in Europe that are held inside financial institutions, in particular banks in other European countries. So for instance here, 18% of French sovereign debt is held by German banks. 
11% of Italian sovereign debt is held inside of Germany and so forth; for Spain, 14% is held in Germany. When you begin to look at this network and you look at the banks, they fall in that not-so-nice region, which is sort of in between, where you have substantial exposure but you have a network which ends up being fairly connected. And so understanding that can give us an idea of why it might be a dangerous system and in particular, you know, why you might worry about financial interactions and firms not necessarily taking into account the costs that they might impose on society. So for instance, you know, we're getting this increased connectivity, and as you increase the number of partners you get more diversified, but if you end up not getting too diversified, then you're in sort of a danger zone. Let me make one point about this before moving on. When you think about banks, investment banks, and hedge funds, there's one difficulty that they have in terms of, you know, what would be good for society. We'd like them to be in a world where things are more diversified. Why doesn't it go that way? Well, there's sort of two things that impede them. One is that they don't internalize the effect that their investments have on the system, right? So they're not internalizing the fact that if they have more partners, then that'll be safer for the entire system. They just take into account what it's going to do to their own portfolio. And so they under-diversify from that perspective. The second is that, you know, they do connect to multiple partners, so they go to the point where you're getting three, four, five partners, because that improves their bargaining position. You don't want to just have one other partner. But at the same time, they don't have the incentives to diversify sufficiently.
Secondly, when you look, for instance, at the European bank situation, part of it is also imposed by the specific way in which they're regulated. And they're regulated to actually hold very specific narrow classes of assets. And if you require that they can only hold sovereign debt from other nations as certain kinds of collateral, then that means that they end up investing in a narrow class and don't diversify their portfolios as much as you might like them to. So there's a whole series of things that play into this. But understanding the network structure can help us develop better measures of when it is that a financial institution is too connected to fail, how do we worry about them, and so forth. So there's a lot that can be done with these simple measures that can enhance our understanding of these kinds of things. Okay, so that was sort of a quick look at connectivity and density. And the point here is the obvious point: more connection, more diffusion, more contagion. The subtleties are that it happens very rapidly, and often it's going to be a middle range where things are actually the most dangerous in terms of having widespread contagion. Okay, so the next thing I want to talk about is segregation patterns. And this is another aspect of networks where we can have fairly clear measurements and we can draw fairly clear conclusions about how that impacts some behaviors of human animals in social situations. Okay, so there's something which is known as homophily, which is a term from Lazarsfeld and Merton in the 1950s, but it basically refers to the fact that in most social contexts, people tend to form friendships with other people who are very similar to themselves. So when you look into high schools, people of the same ethnicity tend to be friends with each other, same age, same choices of hobbies, all kinds of things. So you see a lot of like socializing with like.
And it's not something which escaped historians. So you go back to Aristotle, you see it in Aristotle's writing. It's something which is endemic to human behaviors. It's been measured on all kinds of different dimensions: age, race, gender, religion, and profession. I'll show you some pictures of it. So this is a high school from the same Add Health data set. These are students, and now instead of looking at romantic relationships, we're looking at their friendship relationships. Okay, so these are the people that they call their friends. And in particular, these are broken down by race. So this is an American high school of about 255 students. The blue nodes are blacks, self-reported. Reds are Hispanic, yellows are whites, and then there's a few others. So this one's predominantly white, some black, and then a few Hispanic students. And how was this graph drawn? Well, the graph wasn't drawn by me moving the dots around to sort of put the yellow dots in one part and the blue dots in another. What it did was it used an algorithm, and the algorithm tried to pull dots together that were connected and push dots apart if they were not connected. So it's an algorithm that picks two points, and if they're friends with each other, it tries to move them closer together, and if they're not friends with each other, it moves them apart. So it has no idea what the colors of the dots are. It just arranges them by the friendships. And what ends up happening is you can see that it's a fairly segregated network. So the whites tend to have more of their friendships with whites, blacks tend to have more friendships with blacks, and fewer friendships across. And in particular, this is an interesting effect also. When a group gets very, very small, then it can sometimes integrate. So here the Hispanics are a small enough population that they can't sort of form friendships just among themselves. They end up being fairly well integrated in this high school. Okay.
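The drawing procedure described above, pulling linked dots together and pushing unlinked dots apart without ever looking at their colors, can be sketched as follows. This is a minimal illustration and not the layout software used for the slide. The target distances (one unit for friends, three units for non-friends), the step size, and the two small example groups are all assumptions.

```python
import math
import random

def layout(nodes, edges, steps=500, rate=0.05, seed=0):
    """Place nodes in the plane using only who is linked to whom."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for v in nodes}
    linked = {frozenset(e) for e in edges}
    for _ in range(steps):
        shift = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = math.hypot(dx, dy) or 1e-9
                if frozenset((u, v)) in linked:
                    pull = dist - 1.0             # friends settle about one unit apart
                else:
                    pull = min(0.0, dist - 3.0)   # non-friends are pushed out to three units
                # Positive pull moves the pair together, negative pull moves it apart.
                shift[u][0] += rate * pull * dx / dist
                shift[u][1] += rate * pull * dy / dist
                shift[v][0] -= rate * pull * dx / dist
                shift[v][1] -= rate * pull * dy / dist
        for v in nodes:
            pos[v][0] += shift[v][0]
            pos[v][1] += shift[v][1]
    return pos

# Hypothetical example: two tight friend groups joined by a single friendship.
group_a = ["a0", "a1", "a2", "a3"]
group_b = ["b0", "b1", "b2", "b3"]
nodes = group_a + group_b
edges = [(u, v) for g in (group_a, group_b) for i, u in enumerate(g) for v in g[i + 1:]]
edges.append(("a0", "b0"))
pos = layout(nodes, edges)

def mean_distance(pairs):
    return sum(math.dist(pos[u], pos[v]) for u, v in pairs) / len(pairs)

linked_pairs = {frozenset(e) for e in edges}
unlinked = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]
            if frozenset((u, v)) not in linked_pairs]
print(mean_distance(edges), mean_distance(unlinked))
```

The layout never sees group labels, yet linked pairs end up closer together on average than unlinked pairs, which is why visible clusters in such a picture reflect the friendship pattern rather than the person drawing it.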
What's this? This is the same high school, but now with a stronger measure of friendship. So here what I did is I took the graph and I only put a link between two individuals if in a given week they did at least three activities together. So they had lunch together, they studied together, they went to a movie together, et cetera. So there's a whole series of things that they're asked about what they do. And now you can see that the network almost completely fragments, right? So it's cut almost in half. There's a few friendships across race, but very few, okay? Now this is also going to have consequences for things like learning, diffusion, and so forth, because something that starts over here might never make it across the network. And so you can begin to see that this is going to have implications for how people behave and how they adopt behaviors, inside a high school especially. Okay. Other pictures? This is one we're going to talk about a little bit later. This is a picture from one of the Indian villages we've been working in. Each dot here is a household, and there's a link between two of the dots if they borrow or lend kerosene or rice to each other. Okay. So this is the kerosene and rice borrowing and lending network. And here the color coding is by caste. So in particular, it's just a rough cut of caste designation. So the blue nodes are what are known as scheduled castes and scheduled tribes. So castes that are recognized by the Indian government for affirmative action, whereas the red ones are other backward castes and general castes, the ones that are relatively advantaged. Okay. And here again you can begin to see that there's a schism, and in particular now the frequency with which there are borrowing and lending, kerosene and rice relationships across these caste designations is about six-tenths of a percent. It's about nine percent within.
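The comparison just quoted, the share of possible links that exist across groups versus within groups, is straightforward to compute once each node has a group label. The sketch below uses a made-up six-household example, not the village data. The function name and labels are hypothetical.

```python
from itertools import combinations

def link_frequencies(group_of, links):
    """Share of same-group pairs and of cross-group pairs that are linked."""
    linked = {frozenset(link) for link in links}
    counts = {"within": [0, 0], "across": [0, 0]}   # [linked pairs, all pairs]
    for u, v in combinations(group_of, 2):
        kind = "within" if group_of[u] == group_of[v] else "across"
        counts[kind][1] += 1
        if frozenset((u, v)) in linked:
            counts[kind][0] += 1
    return {kind: hits / total for kind, (hits, total) in counts.items()}

# Hypothetical example: two groups of three households, one link across.
group_of = {"h1": "blue", "h2": "blue", "h3": "blue",
            "h4": "red", "h5": "red", "h6": "red"}
links = [("h1", "h2"), ("h2", "h3"),
         ("h4", "h5"), ("h5", "h6"), ("h4", "h6"),
         ("h3", "h4")]

freq = link_frequencies(group_of, links)
print(freq)   # within: 5 of 6 pairs linked, across: 1 of 9 pairs linked
```

Dividing the within-group frequency by the across-group frequency gives one simple summary of how segregated a network is; for the figures quoted above, nine percent against six-tenths of a percent, that ratio is about 15.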
So it's more than ten times more likely that you do this within the caste designation than across. Okay. And if you look closely here, you can also see there's another cut in this graph, right? So these nodes interact almost not at all with these. There's actually only one link across from that group of blue nodes to this group of blue nodes. So if you go further down into the caste designations you'll see that there's further cuts in the graph and so forth. But this is something that's very prevalent. There's going to be lots of situations where we see these kinds of cuts in graphs, and understanding that segregation pattern is going to have implications for people's behaviors. One more picture of homophily. This, what is this? This is the U.S. Senate. These are votes. So each one of these dots is a senator. So this is a U.S. legislature, and there's a link between two senators if they voted the same way on at least 100 bills in one of the sessions. Okay. And then there's no link if they voted differently on sufficiently many. So it's just a co-voting relationship. This is from a physicist actually, Lucioni. And you can begin to see, again, this is one that's drawn by one of these algorithms, so it pulls apart the senators that don't vote together and pulls ones closer together that do. This is from 1990. So there's a question about whether American politics have become more segregated over time. Here's 1990. Here's 2000. And there is 2010. So by looking at this you can begin to see that there's fewer co-votes on bills. A lot of this is endogenous, but certainly there's a change that's going on in terms of the Senate, and you can see it pretty clearly when you look at the graphs. The one guy in the middle, yes. So this is the Democrat from Nebraska. It's Nelson, yes. So there are some independent-minded senators occasionally who sort of cross parties. These are tied. Yes. And then there's other, yeah. So that's another independent from Vermont.
But it actually looks very blue, in terms of an independent that looks more blue than green, right? Okay. So you can begin to see that these kinds of cleavages are going to have impacts. We can see them over time. We can measure them. There's lots of different ways of measuring whether or not a graph is really segregated. And that's going to have some implications for behavior. So let me go through that. It's going to affect things like spread of information, adoptions of technology, product purchases, and so forth. One thing I want to touch on a little bit, which I think is sort of important in understanding, is that it can be something which leads to sustained differences in behavior over time. So you can have poverty traps. You can have different groups having very different behaviors, and that can be sustained over time when you see a lot of homophily. And so I'll talk about some work I did with Toni Calvó-Armengol some years ago. What we were doing was we were looking at labor markets, and we were looking at how people make decisions of whether or not to participate in a labor market. So in particular, do I become educated? Do I look for jobs? Or do I drop out of high school and end up either unemployed or not in the labor force, which in the U.S. could mean that I'm in a gang or in other activities besides educating myself? So we can look at that kind of decision: do I stay in the labor force or do I drop out of it? And what turns out to be important, of course, is my peers' decisions. So if I have peers who are educating themselves, that helps me in a lot of ways. It gives me a role model for becoming educated. It gives me more information about what it might be like to go ahead and go to university. It gives me access to future job information in a group of friends that are all going to become educated and get jobs, as opposed to people who are all going to drop out.
So there's a whole series of ways in which these behaviors are going to reinforce themselves. And then what we can do is we use game theory to try to analyze these kinds of situations. So you end up with a situation where people are going to end up making decisions that depend on their friend's decisions. And so you've got a system where people are influencing each other, and you're trying to understand what might happen in that. And I'm not going to go through the details of that kind of study. What I will do is just give you a picture of it, and the ideas will become crystal clear very easily. So we've got this homophily, and what can happen, and it's very different starting conditions, can lead to very different outcomes for different groups. And that can be sustained over time. And just in terms of data, sort of in the background, one thing that's been difficult to explain over time has been differences in labor force participation rates, wage rates, and so forth by race. And in the U.S., this is U.S. data. These are males, black, white, Hispanic, and this is 1970 through about 2006. And the blacks participate at the lowest rate, the whites, intermediate, and Hispanic at the highest rate. And fairly systematically differently in terms of their labor force participation rates. There's a whole series of explanations for it, but I'm going to point out that the social context is one feeder into this. Interestingly, if you look at females, it flips. So here, when you look at females, it's actually the highest participation rate is among black females. White is intermediate and Hispanic tends to be lower. But we see systematic differences, and there's a lot of gender and racial homophily here. And in particular, let's just look at a very simple example to sort of go through what might be underlying this. It's a fairly straightforward point. 
But if we think about two groups exhibiting homophily, so say the green group and the yellow group, and let's think of a situation where people's decisions are going to be influenced by their friends, and there's something which is known as the majority game, which is that I want to do what the majority of my friends are doing. So the majority of my friends go on to become educated and go to university. I want to do that. If the majority of my friends drop out, I want to do that. So I'm just going to emulate the majority of my friends. And what we do is we start just with a little difference between these two groups. So we'll start with, say, this group having two dropouts to start with, and the other group not having any dropouts. And then what begins to happen, as you can see fairly quickly, if I'm dropping out, if at least half my friends do, well, then this person has at least half their friends dropping out. So boom, that person drops out. Then we have another person here who has at least half their friends dropping out. So that person's going to drop out this person as well. And so then we end up with one of the, in this extreme example, one group completely drops out. But then, given the homophily, the other group is staying all staying in the labor force because all of their friends are staying in the labor force. And so we get a clean cut because of the clean cut in the social graph. We end up with this reinforcing itself when we get one group that's going to drop out completely and one group that's going to stay in. Now, the logic behind this is fairly simple. But I think it's important to keep in mind because it means that if you want to be doing programs or policies that aim at avoiding dropouts, understanding this kind of interaction means that you want to target those policies towards groups of individuals. Because if you just target one at a time, that's not going to make enough of a difference to change the system. 
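The majority-game example above can be traced step by step in code. This is a small sketch with a made-up friendship network: five green students, five yellow students, and a single friendship across the groups. The rule implemented is the one stated in the talk, that a student drops out once at least half of their friends have. The node names and the specific links are assumptions.

```python
def spread_dropouts(friends, initial_dropouts):
    """Repeat the rule: drop out once at least half of your friends have dropped out."""
    dropped = set(initial_dropouts)
    rounds = [set(dropped)]
    while True:
        newly = {
            person
            for person, their_friends in friends.items()
            if person not in dropped
            and their_friends
            and 2 * sum(f in dropped for f in their_friends) >= len(their_friends)
        }
        if not newly:
            return dropped, rounds
        dropped |= newly
        rounds.append(newly)

edges = [
    ("g0", "g1"), ("g0", "g2"), ("g1", "g2"), ("g1", "g3"),
    ("g2", "g3"), ("g2", "g4"), ("g3", "g4"),            # green group
    ("y0", "y1"), ("y0", "y2"), ("y0", "y3"), ("y1", "y2"),
    ("y2", "y3"), ("y3", "y4"), ("y1", "y4"),            # yellow group
    ("g4", "y0"),                                        # the one cross-group friendship
]
friends = {}
for a, b in edges:
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)

# Start with two dropouts in the green group and none in the yellow group.
dropped, rounds = spread_dropouts(friends, {"g0", "g1"})
for number, who in enumerate(rounds):
    print(f"round {number}: {sorted(who)}")
# round 0: ['g0', 'g1']
# round 1: ['g2']
# round 2: ['g3']
# round 3: ['g4']
```

In this example the whole green group ends up dropping out one student at a time, while every yellow student stays in, because the only cross-group friend is never as much as half of anyone's friends.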
So this kind of entrenchment means you have to target groups of individuals going into a particular high school and attacking that high school and trying to change the policy there rather than just trying to give scholarships to a few outstanding students. You bring a few up, but that's not going to make a difference if they're really looking at, say, the majority of their friends and so forth. You need to do it in a targeted manner in order to do that. So this can give us understanding. It actually gives, you know, when you go into the labor numbers and try and understand differences in wages and other kinds of things, it can, you know, help explain a lot of the variation, which can't be explained just by the choices of parents and wealth access and other kinds of things. There's still some residual, which ends up being something you can explain with this kind of understanding. Okay. So that was two different points we've gone through. Connectivity matters, density matters. Segregation patterns can have deep impacts on societies, can help us understand how different parts of a society can be doing very different things over time. The last thing I want to talk about is influential positions. So understanding when it is that somebody is influential. So, for instance, in that graph before, we were looking at people reacting to each other's neighbors. Well, if somebody has a lot more neighbors, they might end up having more impact than somebody else who has very few neighbors. So how that evolves and so forth might depend on which nodes are the ones that drop out first. Are they really important people in terms of the network position or are they more tangential individuals? So we can begin to understand that a bit. So how does a position in a network matter? There's many things to measure. So this is probably one of the more interesting from a mathematical perspective. 
There's a lot of different ways to measure what it means to be important or central in a network. And I'll mention a few of those. And the implications will depend on the particular application that we have in mind. So I'm going to talk about two different applications. One, we'll go back to the 15th century. We'll talk about the rise of the Medici. And then the second one, we'll come back and talk about research we've been doing in rural India, trying to understand exactly what the implications of network position are there. Okay, so let's start with the Medici. So this is data that comes originally from Kent in the 1970s and then was re-analyzed by Padgett and Ansell in the 1990s. And what they did is they went through and mapped out every intermarriage between different families in an oligarchy in Florence from the mid-1300s through the mid-1400s, basically. And so what this is looking at is some of the more prominent elite oligarchy families in the early 1430s. And there's a link between two of the families if they married a child from one to the other. So usually these were teenage daughters marrying 30 to 35-year-old sons, and they were often arranged marriages for either business or political purposes. So the idea was these were sort of social collateral. You gave your daughter to one family and the son became one of your political lieutenants in the city council. So going into this time period, it was basically an oligarchy ruled by a whole series of families. These are 16 of the most prominent. The Strozzi, in particular, were one of the most powerful. They had the most seats, the most wealth. There was a whole series of other families that were important. The Medici were not particularly prominent at that time period. So they weren't the wealthiest. They didn't have the most political clout. And yet they rose to power. 
And so let's look at the graph and just try to understand a little bit about what the graph tells us about influence, you know, what we can see in the graph. So there's several things that are particular about their position in this graph. So when we look at the number of connections, the Strozzi are connected to four other families. The Guadagni were another family that was important. They're connected to four others. The Albizzi were another important family. They're connected to three others. The Medici have six, so they have more connections. That's not the full explanation. So when you look at this, Cosimo de' Medici was basically the person who sort of engineered the rise of the Medici. And if you read Machiavelli, Machiavelli was actually fascinated with the Medici and spends a lot of time talking about Cosimo's leveraging the position in the network. And what's true about the Medici is they have a position where they were important connectors between other families. They were important brokers. So if you think about, you know, 15th century Florence, it was hard to write contracts with other families and have them necessarily be enforced. So if two families, you know, say the Barbadori want to do business with the Salviati, you know, they can write a contract together and hope that it works, but if they don't have some way of enforcing it, it might not work. And the marriages were one way of enforcing it, but another way was to go through a family which could mediate it. So if you were both married to one family, then you could go to them and they could be a broker and help mediate the relationship and help make sure that that relationship worked. And in particular, you know, we won't ask you to all count this, but one thing you can do is you can ask, when you look at the shortest paths between different pairs of families, how often do the Medici lie on those paths? 
How often do the Strozzi lie on those paths? How often do the Guadagni lie on those paths? And you can do that for all the different families. So if you do that, the Medici lie on 52% of the shortest paths between other families. The Strozzi about 10%, and the Guadagni about 25%. So the Medici were much more central in what's known as betweenness centrality. They were between a lot of others, and in a world where it was important to broker deals, sort of literally in a godfather-like way, it became very important to have that central position. There's one other thing that's sort of important here, which is that a lot of the families that the Medici were connected to were not connected to each other. So actually the Ridolfi and the Tornabuoni (this is missing an i) were the only two that were connected out of the ones that they were directly connected to. Whereas the Strozzi were connected to a bunch of families that were all connected to each other. So they could all deal directly with each other and didn't have to go through the Strozzi. So there were all kinds of things that led to sort of political competition among some families, whereas the Medici were important, other families had to go through them, and basically the political party that they built sort of builds out of this part of the graph, and they rose to power and it became an autocracy out of something that was an oligarchy. So what does that lesson tell us? This is ex post historical interpretation, but the graph can tell us things about positions of certain families or certain nodes and help us understand, say, a historical event in this case. Now this is one particular way of measuring influence. And it's one that's important in a world where you have to worry about enforcing contracts, you have to worry about doing business deals, you have to worry about making political deals and making sure that somebody's going to abide by them. That can be very different than if you just want to spread information, for instance. 
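The counting exercise just described (how often a family lies on the shortest paths between other pairs of families) can be sketched as follows. The edge list is an assumption on my part: it is the Florentine marriage network as it is commonly distributed with network-analysis software, not something read off the slide, and it lists only 15 families, so any family without a recorded marriage tie is simply absent. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Assumed edge list (commonly distributed version of the Padgett marriage data).
MARRIAGES = [
    ("Acciaiuoli", "Medici"), ("Castellani", "Peruzzi"), ("Castellani", "Strozzi"),
    ("Castellani", "Barbadori"), ("Medici", "Barbadori"), ("Medici", "Ridolfi"),
    ("Medici", "Tornabuoni"), ("Medici", "Albizzi"), ("Medici", "Salviati"),
    ("Salviati", "Pazzi"), ("Peruzzi", "Strozzi"), ("Peruzzi", "Bischeri"),
    ("Strozzi", "Ridolfi"), ("Strozzi", "Bischeri"), ("Ridolfi", "Tornabuoni"),
    ("Tornabuoni", "Guadagni"), ("Albizzi", "Ginori"), ("Albizzi", "Guadagni"),
    ("Bischeri", "Guadagni"), ("Guadagni", "Lamberteschi"),
]

def build_graph(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def bfs_counts(graph, source):
    """Distance from source to every node, and the number of shortest paths."""
    dist, paths = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]
    return dist, paths

def betweenness(graph):
    """Average, over pairs of other nodes, of the share of their
    shortest paths that pass through each node."""
    info = {s: bfs_counts(graph, s) for s in graph}
    n = len(graph)
    n_pairs = (n - 1) * (n - 2) / 2
    scores = {}
    for v in graph:
        dist_v, paths_v = info[v]
        total = 0.0
        for s, t in combinations([x for x in graph if x != v], 2):
            dist_s, paths_s = info[s]
            if v not in dist_s or t not in dist_s or t not in dist_v:
                continue  # unreachable pair contributes nothing
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += paths_s[v] * paths_v[t] / paths_s[t]
        scores[v] = total / n_pairs
    return scores

graph = build_graph(MARRIAGES)
scores = betweenness(graph)
for family in ("Medici", "Guadagni", "Strozzi"):
    print(family, "degree:", len(graph[family]), "betweenness:", round(scores[family], 2))
```

If the assumed edge list is right, the ranking should line up with the one quoted in the talk, with the Medici well ahead of the Guadagni and the Strozzi.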
It might not be that the Medici were the best information spreaders or the best way to start a contagion. That might be a very different question. So I'm going to talk now about a very different project which also tries to capture influence and understand position in a network and what the implications of that are. This is a diffusion project in rural India. So this is work I've been doing with Abhijit Banerjee, Arun Chandrasekhar, and Esther Duflo. I'll talk a little bit about one paper and a little bit about some of the stuff we're doing now. So this was a microfinance project where a bank was going into southern India into a series of villages and diffusing information about the microfinance project, and they wanted to make sure that the information got out. And they were getting very different results in otherwise similar villages. So they go into one village, they talk to a few people, they say spread the news, we'll be back in two weeks to start signing people up, and nobody would show up. They go into another village, they tell some people, and they get a bunch of people showing up. And so they wanted to understand why it is that they're successful in one village and not in another. So in our data, the smallest take-up was 7% in one village, and 44% was the max. And so if we try and understand the word of mouth process here, we can begin to understand why they were successful in some contexts and not others. And again, it's going to boil down to hitting the right influential nodes. So they were looking for very particular people to spread the news, and sometimes those people were central and sometimes they weren't. So what we did is, before the bank entered, we went in and surveyed the villages, mapped out the networks, and then from those networks we can tell who was central, who wasn't, and whether they hit the right nodes or not. So we were in Karnataka, southern India. This is one of the villages, a typical network. 
So here the little dots are individuals, and a clump of them is a household. And this is the question of, if you had to borrow 50 rupees for a day, about a dollar, who would you go to? So then you can point to somebody, and there's a link between one household and another if one of the people in one household named somebody in another household that they would go to to borrow money. So who do you go to temple with? Obviously not a terribly religious village. Who do you go to for advice? Who comes to you to borrow kerosene? Who do you go to in an emergency for medical help? And so forth. So we asked them a whole series of questions, and from this we build up a network, and that's the same network we saw earlier represented in the blue and red dots with the caste. But you can picture them in different ways. And then that gives us an idea. So if the bank went into a village and happened to, say, hit this household as the one that they start telling to spread information, that information might not diffuse as well as if they hit somebody else who's better connected in that network. Okay. And so first of all, you know, do initial injection points matter, and how should we measure their role? So I apologize to people who were at the workshop, since I talked about some of this at the workshop, but I'll talk about some stuff that I didn't talk about there in just a few minutes. So one way of measuring this is just by counting the number of connections that a given household has. So here, this is obviously the most central household because it's got seven, this one has six, and so forth. The problem with just counting numbers of connections is it misses a lot about the position, right? So here, this two looks a lot better than this two, partly because it's connected to a seven and a six rather than two twos, but also because it can sort of reach out to the rest of the network fairly easily and reach a lot of people. 
So what we'd like to have is a measure that captures the fact that it's not just how many connections you have, but you want to be better connected, you want your friends to be well connected themselves, right? So what's a measure of how connected somebody is? It's not just counting friends, but weighting them by how connected they are. So we want something that says the centrality of a given household is proportional to the sum of the centralities of their friends. So instead of just counting my friends, weight them by their centrality. That's a self-referential system. There's a solution to it from mathematics which is known as an eigenvector. I won't go through the math of it, but there's a well-defined way of measuring and finding these numbers. And if you do that in the particular example we were just looking at, this one comes out as a 0.3, this one comes out as a 0.1. So this one's about three times as important because it's connected to a 0.4 and a 0.5. This one's connected to a 0.2, basically 1-6. So being better connected means you're in a better position if you have better connected friends as well. So we've got these two different definitions. We've already looked at betweenness centrality. We've got a whole series of different measures we can be using. So one thing that we defined was also something which is defined directly for a diffusion process. So instead let's try and measure things directly by how well people diffuse things. If that's what we're interested in, why don't we just measure it directly rather than using something like counting friends or counting how important your friends are, or counting betweenness? Instead we'll ask, if this person was hit in the network, how many friends do they have, how many friends of friends do they have, and so forth. So what we did was we defined diffusion centrality as follows. So let's suppose that people spread news with probability 0.5 to any given friend. 
And we'll do this for, say, four periods. So we'll just have a couple of parameters. We say we think people talk for some number of iterations, and they randomly bump into their friends with some given probability. So as we increase the probability, they're going to spread more. If we increase the time period, they're going to spread more. So if you did this, then what would happen? Well, maybe they bump into one of their friends, with probability a half, in the first period. Second period, they tell some people. Third period, it spreads out a little further. Fourth period. So for this node, if you were running it at this probability, one simulation of that would say that this person has a 13. And if you took this other node and did the same calculation, you'd end up with six. So the first node would be about twice as good at spreading news as the second one based on this measure. So there's a measure that we can go by, particularly looking at diffusion and trying to figure out which are the best nodes in a graph for diffusing information. And if we go through the villages, we can then ask about these hypotheses. Maybe it's just higher degree that matters. Maybe it's higher eigenvector centrality that matters. Maybe it's higher diffusion centrality that matters. Maybe if the initially informed nodes, the first people the bank went to and talked to, had higher diffusion centrality, then that explains why they get better traction in some villages than in other villages. So here's degree centrality. So this plots the average number of connections, the average number of households that each household the bank talked to was connected to. So a particular point here is a village. So in this village, the first people that they talked to on average had about 14 friends each, and they got about 9% take-up. And so you can go through this. 
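The diffusion measure described above can also be computed without running random simulations. The sketch below uses the expected-value version, which is how I understand the published definition of diffusion centrality: sum, over periods t = 1 to T, the entries of (qA)^t in each node's row, where A is the adjacency matrix and q the passing probability. That formula is not spelled out in the talk, so treat it as an assumption; the toy star network and the function name are hypothetical.

```python
def diffusion_centrality(graph, q, periods):
    """Expected number of times news seeded at each node is heard by others
    (counting repeats), when each link passes it on with probability q
    in each of `periods` rounds."""
    walk = {node: 1.0 for node in graph}    # row sums of (q*A)^0
    total = {node: 0.0 for node in graph}
    for _ in range(periods):
        # One more multiplication by q*A, done node by node.
        walk = {node: q * sum(walk[nbr] for nbr in graph[node]) for node in graph}
        for node in graph:
            total[node] += walk[node]
    return total

# Toy village: one hub household linked to four others (hypothetical data).
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}

one_round = diffusion_centrality(star, q=0.5, periods=1)
two_rounds = diffusion_centrality(star, q=0.5, periods=2)
print(one_round["hub"], one_round["a"])
print(two_rounds["hub"], two_rounds["a"])
```

With a single period the measure is just q times the number of friends, so it collapses to degree; running more periods is what lets friends of friends count.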
And basically if you look at the relationship between the participation in those villages and how many friends the initial people had, there's no relationship. It didn't really matter. So that's not the explanation. It's actually sloping a little bit negative, if anything. When you look at eigenvector centrality, it's positive, and it explains about a third of the variation in the data. So you get a positive relationship. Now it does look like having more eigenvector central people to start with made a difference. And if you do diffusion centrality, you also get a positive relationship. Now you actually explain about 50% of the variation across villages. So there's still about 50% of the variation which is unexplained, but it does a better job of explaining it than either eigenvector centrality or just counting neighbors. And in fact, betweenness centrality looks a lot like degree centrality. It doesn't do much in this. So betweenness does great for the Medici. It doesn't do so great in southern India. I'm not sure it's an Italian-Indian statement, but they're very different applications. Rising out of an oligarchy and just diffusing information are two different things, and you get very different results here. Okay. Let's see, in terms of time, one thing I'll do is just quickly mention one thing that we did, and then I'll close with another thing that we're doing right now. So actually mapping out these networks is expensive. And a bank doesn't have the resources to go into every village, map out a network, find who the most central people are, and so forth. They might as well just tell everybody. So here, these are the people that the bank identified as important people based on whether they're shopkeepers, teachers, and so forth. What we did is you can go in and just ask people in the village. 
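For readers who want to see how the eigenvector centrality used in this comparison can be computed, here is a minimal power-iteration sketch. It is not the study's code. The toy network and the function name are hypothetical, and each round adds a node's own score to its friends' scores (iterating with A + I rather than A), a standard trick I am using so the iteration settles down on tree-like graphs; it leaves the resulting ranking unchanged.

```python
def eigenvector_centrality(graph, iterations=200):
    """Each node's score is repeatedly reset to its own score plus the sum of
    its friends' scores, then rescaled so that all scores sum to one."""
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new = {node: score[node] + sum(score[nbr] for nbr in graph[node])
               for node in graph}
        norm = sum(new.values())
        score = {node: value / norm for node, value in new.items()}
    return score

# Hypothetical village: households d and e both have exactly two connections,
# but d sits next to the well-connected hub h and e does not.
village = {
    "h": {"a", "b", "c", "d"},
    "a": {"h"}, "b": {"h"}, "c": {"h"},
    "d": {"h", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

centrality = eigenvector_centrality(village)
for node in ("h", "d", "e"):
    print(node, "degree:", len(village[node]), "score:", round(centrality[node], 3))
```

The point of the toy example is the one made earlier in the talk: two households with the same number of connections can differ in centrality once their friends are weighted by how connected those friends are.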
So just go into the village and ask people, who do you think would be important for us to talk to? And the villagers themselves are, in particular, we ask them, who should we go to talk to if we want to diffuse information about a new financial product? So if we ask them that question, they tell us answers, and a lot of them tell us the same individuals, and these individuals turn out to have high diffusion centrality. So they're very good at answering that question. And when you put the two together, who are the leaders and the people that they tell us to talk to, you end up with four of the most central people in the village. So the combination of using the information that we can see about the demographics, plus just asking the people who are the best diffusion points, the people seem to have a good idea of that and can tell us fairly readily. So we end up here with fairly good injection points. What we've done in a little bit in that is try to understand why. And basically, it appears that people just by sitting and listening to what's going on in their network and hearing news about other individuals get informed about who's really central in the network. It's easier to hear about people who are very central. And so if you're just listening and I keep hearing about another person, that allows me to learn about who's influential in my network, even if I don't know them directly. Okay. Let me end up with one sort of interesting thing that we've been working with lately and is a curious effect. So in understanding how people pass information, we also wondered whether or not they're selective in who they pass information to. So in giving information out about microfinance, it was important that they knew that the loans were without, there was an unlimited number of loans. So if everybody in the village wanted a loan, that was okay. Every eligible household could get a loan and there wasn't going to be any rationing. 
What we did is we gave them something which was rationed. So they could come and participate in a study we were doing, and there were only going to be 24 seats in that study out of a village of about a thousand people. And the amount that we paid was a little more than a day's wages for about an hour's activity. So it was very attractive for them. And then we actually mapped out who they told. So we told some people in each village. We mapped out who they talked to, who they told. And we wanted to see, would they be selective? So if you think about this, who am I going to tell? Well, maybe I keep it entirely to myself. I don't tell anybody. Maybe I tell my best friend. But do I tell the town gossip, or am I more careful not to tell people who are highly central? So what do people end up doing? If you look at a person and then look at one of their friends, if that friend is below the median in terms of diffusion centrality, there's a 14% chance that they tell the friend. If that friend is above the median, there's only a 3% chance that they tell them. So they're actually selective, and they're telling the less central friends about this opportunity. So they want to tell their friends because it's nice to have their friends say, wow, you gave me this great opportunity. But they don't want to tell their friends who might spread the news to other individuals. So they tell less central friends, at least empirically. OK. So that's a very quick run through the world of networks and economics. But what I wanted to do is just give you some impression of how it is that social structure is important in helping us understand economic behavior. And we've just sort of gone through and looked at basically three different points. So density, one aspect of the networks. It affects connectivity. It drives contagions, diffusion. You get these sharp transitions. Segregation patterns are important. They can impede diffusion. 
They can help foster different behaviors in different parts of a network. And finally, there's a whole series of questions and understanding influence, understanding who's powerful in a network, who's good at spreading information. There's different ways of measuring these things. And I think overarching all of this is the fact that despite the fact that these networks are incredibly complex and difficult to measure, there are simple ways of capturing the features of a network that do translate fairly systematically into different kinds of behaviors. And so there's ways that we can take that and understand economics, hopefully, in a better way and improve the welfare of the individuals involved. So I guess that's a good place to stop. And thank you very much. A few questions. So please make them fairly loud. Just a question around the influential. I found it very fascinating. I mean, I'm sort of marketing at the moment. And that sort of thing seems to save a lot of money before we take the market. Well, what I actually was wondering was, because your example was about the village in India. We're going to have a different investigation if you come to a more complex society. I'm not disrespecting anybody in India, but I'm saying is this definitely how you deal with this? Is it that those algorithms would differ from the actually different country or not? Yes. So I think there's two aspects to it. One is it could be different in different social contexts for different cultural backgrounds, but it could also depend on what the product is. So one reason that I think things were particularly simple in the Indian case was that all that really mattered. People know a lot about microfinance, its existence already in India. So more of the information was just a simple fact. Microfinance is coming to our village and there's going to be a meeting. And it might be very different from, say, diffusing a product. 
So if you think about a game that are going to be played among teenagers, well, now you care about what your friends are doing and you might need several friends to adopt it before you're willing to adopt it. So that's going to be a very different dynamic. And I think in terms of the cultural differences, you've got people interacting here. It was mostly word of mouth. So it's relatively poor villages, relatively uneducated. Most people have four years of school as the median. There's not a lot of cell phone usage, no internet usage almost and so forth. So that's going to be very different than a world where people are connected through all kinds of social media platforms and being bombarded by social media directly. So, yes, it can depend on context. I think one thing to take away from it, though, is that we can build very simple models. Like the idea of diffusion centrality, it's a very simple calculation. It's not hard to do. And so by getting into the details of exactly how we think people interact, we can build fairly simple models where we can get a lot of explanatory power out of the network structure. And I think that hopefully will hold true in much wider contexts. And I think it does when you look at other people's work as well in Facebook and other kinds of things, you can see things diffusing as well. I'm interested in whether you have seen any examples of this being used to increase voters to participate in democracy and how you might go about using it. Yeah, so I guess there's a lot of interest in increasing turnout. And certainly, I think there's prominent examples of political parties taking advantage of social media to encourage turnout. And I think part of it is just understanding exactly how the population is communicating and how to best reach them. And best reaching them could depend on the context. So here it was just basically getting word out. 
Voter turnout's a little more complicated because most people are aware that the election's going to happen. It's more encouraging them to turn out. And that could be a very different poll. It's not something I know as much about, but it's something that I imagine you would take a different tack to because now I want somebody who's not only good at reaching people, but also good at convincing them. And here, we just needed the news to get out. And it might be a very different thing that you want in terms of figuring out what's the best way to encourage people to attend. And that could, yeah. That's the idea of what the difference is. Right, right. So that's a great question. And in fact, there's a lot of changes going on in the world right now. So when you look at the size of the paths between different people. So for instance, on Facebook, it's been estimated at 4.7 now as the average distance between two people out of 700 million nodes. So you've got 700 million people, and you pick any two, and you ask, what's the shortest path between them? It's roughly five friends, and they're connected. When you go back to the Middle Ages, it was about 11 and a half. And this is estimated by looking at the speed of the propagation of plagues. So they can actually measure out how closely connected people might have been. So it's gone down by a factor of two over a number of centuries. Those are rough cuts. But one thing that I think, in terms of the things we've talked about this evening, I'm undertaking some research with another student to try and understand exactly the question you're asking. And there's sort of two different forces at work. One is that as we increase the access to social media, we increase the distance at which people can interact. So we're shrinking the world by having more connectivity. But at the same time, it's also easier to find and locate people who have exactly the same interests. 
So I can find other people who are professors of economics who like to mountain bike. I can make them my friends. Well, now you've got a bunch of friends who look exactly like each other. So the homophily is increasing at the same time. So you've got a denser graph that could become more homophilistic. And so what that ultimately does to the society is an interesting question that we don't have a good answer for yet. And I don't know which of those will win out in the long run, but that kind of transition is happening. So in these villages, the families interact basically with about 15 other families on average. And those 15 families tend to be fairly closely connected, both by geography and caste. So they're fairly narrow. And it's sort of amazing to go into these villages a year and a half after the microfinance had come in and you find families who just don't even know it. It's available? No, I didn't know. So those can be really isolated. In the modern world, you have this impression you're really well-connected, but it's not clear whether the homophily is going to overwhelm the connectedness or not. Small changes in the network? Yeah, yeah. So for the Medici thing, it looked like if you put one more marriage, you wouldn't change the picture. Yes. So there's an emerging area of study on how robust a lot of the measures are. So actually Arun Chandrasekhar, one of our co-authors on the Indian study, his dissertation was on how susceptible different measures are to missing data or mismeasured data. And there's a lot of mismeasurement and a lot of errors. And so actually we have about 20 pages of our appendix recreating and making sure that all of the measures are robust. And certain ones are much more susceptible to these kinds of things than others. So degree centrality, just counting friends, that scales roughly with the measurement error and it's fairly robust. 
With things like eigenvector centrality, betweenness, and some of those, you can find examples where small tweaks in the network can change those things. So that's something you always have to be aware of: the data is noisy. And everything I looked at tonight and showed you was as if it was a static network. It's just fixed. And in fact these things are changing over time. There's measurement error. There's all kinds of things going on. So these are just our best guess at what the centrality of something is. And so the answer is we still don't know. There's people working on it. But it's understudied. Could you tell us a little bit more about it? It's interesting to know if it's got a name and also being an algorithm, iterative algorithm. How often do you run it before you start to do some features? So actually, that's not my algorithm. So the way that this was drawn is there's software. Now there's literally hundreds of algorithms you can find out there. There's a small literature just on what's the best way to picture one of these graphs and how do you do that? So there's a number of different algorithms. And usually what they do is they run them until there's some convergence. And convergence is measured by the points no longer moving by more than a certain distance. So you can set a tolerance. And then you ask for convergence to some level. It's a little bit of an art still, because the algorithms are somewhat ad hoc. There isn't a well-defined best algorithm. So there's a lot of them out there. I can give you references after, yeah. Right, right. So I guess you can think of, so this was a question about leadership, in case people didn't hear: how is social media affecting leadership and the ability for people to position themselves. It's interesting that social media you can think of as just amplifying the reach that people have to some extent. 
So people could position themselves. With the Medici, it's Giovanni, Cosimo's father, who actually arranged a number of the marriages. There's debate about the extent to which they were foresighted and actually arranging these marriages deliberately to be in this advantageous position, and how much of it was fortuitous and they were lucky. And I guess it's hard to put yourself in the right position in a network, and some of it's a little bit of luck. I think what happens with social media is there are rich-get-richer effects. And so once somebody has certain numbers of followers, it's easier to find that person. It's easier to hear about them. It's easier to be aware of them. So for instance on Twitter, the more followers you have, the easier it is to attract more. And so what you end up with is a very skewed distribution, where once you get above a certain amount it sort of snowballs. And so there are effects of this that happen, and they don't necessarily happen because this is a better person to listen to than somebody else who didn't happen to get past that critical mass point. So one thing that can happen is you get these sort of acceleration patterns that occur through social media that wouldn't necessarily happen just by word of mouth.
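The rich-get-richer effect mentioned here can be illustrated with a very small simulation. This is a sketch under assumptions of my own, not a model from the talk: each new user follows exactly one existing user, chosen with probability proportional to that user's current follower count plus one (the plus one gives newcomers some chance of being found). The function name and parameters are hypothetical.

```python
import random

def simulate_followers(n_users, seed=0):
    """Return a list where entry i is the number of followers of user i."""
    rng = random.Random(seed)
    followers = [0]   # user 0 starts alone with no followers
    tickets = [0]     # each user holds (followers + 1) tickets in the draw
    for new_user in range(1, n_users):
        target = rng.choice(tickets)   # more followers means more tickets
        followers[target] += 1
        followers.append(0)
        tickets.append(target)         # extra ticket for the follower just gained
        tickets.append(new_user)       # baseline ticket for the newcomer
    return followers

counts = simulate_followers(5000, seed=1)
ranked = sorted(counts)
print("median followers:", ranked[len(ranked) // 2])
print("largest follower count:", ranked[-1])
```

Every user here is identical except for arrival time and luck, so any skew in the follower counts comes from the snowballing itself rather than from anyone being better to listen to.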