Kevin Lewis is a Berkman Fellow and a PhD candidate in the Sociology Department at Harvard this year, soon to have his PhD. He also worked previously with a former Berkman Fellow named Jason Kaufman on some Facebook research and social network research. We're excited to have him today to talk about mate choice in online dating, so welcome, Kevin.
Thanks so much. Appreciate it. Thank you all so much for coming. It's wonderful to see all of you. I'm thrilled that turnout is so great. I'm a little worried, to be honest, that all of you really want to spend your Valentine's Day learning about mate choice from a 28-year-old single guy. So I'm thrilled to see you all. So one of the reasons that online dating caught my interest in the first place is that it's so rare to see a social phenomenon that changes so drastically in such a short period of time. It was really only 10 or 15 years ago that the practice was tremendously marginalized. I mean that socially, in the sense that very few people participated in the first place and relationships that began online accounted for much less than 1% of the relationships out there, and also culturally, in that the practice was highly stigmatized. The sense was, well, if you're resorting to online dating, there must be something going on in your face-to-face dating life, some kind of desperation there. And really it's only in the past 10 or 15 years that this has changed drastically. We've all started to hear more and more anecdotes from friends, and perhaps personal experiences, of successful relationships that began online. I think the numbers themselves really reveal how striking this change has been.
So as recently as 2005, a study showed that 37% of all basically single American internet users had tried an online dating site, and a recent study by Michael Rosenfeld shows that of all romantic partnerships that began between 2007 and 2009, 22% of heterosexual relationships began online and 61% of same-sex couples met online as well, making online dating clearly the most common way that same-sex couples meet today and the third most common way that heterosexual couples meet, behind shared acquaintances and social venues such as bars and clubs. We've also of course seen an accompanying decline in the extent of stigma that's associated with the practice. People are much more comfortable admitting that they date online, that they met their partner online. And so I think this transformation is fascinating, and the degree of impact online dating has had makes it an empirical topic worthy of study in its own right. What I'm going to talk about most of the time today, however, is not the social prevalence but actually the academic importance of online dating, and what we as social scientists can learn from data from these sites. So mate choice for decades has been a central topic of interest to scholars of inequality, and this is for two main reasons. First of all, to the extent to which romantic pairing in a given society involves intimacy and trust, mate choice patterns can tell us a great deal about social closure, or the extent to which individuals from different backgrounds accept each other as equals. Additionally, insofar as romantic pairing leads to offspring, we can also learn a great deal about intergenerational mobility, or the likelihood that the status differences of today will be passed along to the children of tomorrow. Now to date, the overwhelming majority of research on mate choice in the sociological literature has focused on marriage patterns.
This is due to the importance of the marriage relationship and also the availability of accurate, nationally representative data in the form of census records, marriage records, and survey data. And to anyone familiar with the literature on homophily, it will be no surprise that birds of a feather flock together in marriage just as they do for virtually every other type of relationship. So with respect to marriage we call this endogamy, the tendency for people to marry within their social group, and homogamy, the tendency for people to marry someone similar in status. By and large, however, across a striking array of attributes, people tend to marry those who are similar to them. Now prior work has not only described the landscape of marriage patterns that are out there but identified three possible causes of the patterns we observe. First of all, we've long known that who we date or marry is constrained by who's available to date or marry in the first place. So it may be the case that my perfect match, you know, my soulmate, lives in some beautiful village in the south of France and spends her days drinking wine and reading philosophy, as any soulmate of mine will surely be doing. Given that the odds of me bumping into such a person are basically minuscule, it's much more likely I'll end up dating another socially awkward grad student like myself from Harvard, right. If I'm lucky, maybe someone from the law school. So the point is that relationships are constrained first of all by opportunity structures, and given that similar people tend to self-select into similar places, it's no surprise that this homogeneity is turning up in relationship patterns as well. Now second of all, relationships are influenced by third-party interference.
I mean this not just in the sense of external norms or sanctions, such as who your friends, your parents, your church, or even your government want you to date or marry, but also much more direct interference as well. Given that friends tend to resemble one another, and friends of friends tend to be sent out on dates with one another, it's again no surprise that homogeneity is turning up in relationship patterns. Finally, there's of course the individual preference for similarity, or homophily proper. Now marriage is obviously important for a number of individual and social outcomes, but insofar as we're learning about boundaries between groups, what's important is not so much who we actually date or marry but who we want to date or marry, or as sociologists would say, subjective social distance as opposed to objective social distance. To date, however, due to limitations in available data and limitations inherent to the study of mate choice, it's actually been very difficult to disentangle the role of individual preferences from these other two causes, for the following three reasons. First of all, preferences are constrained. As I just mentioned, who we date or marry is constrained by who's available in the first place. You can meet your potential partner anywhere, a coffee shop, on the subway, at a Berkman luncheon talk maybe, but unless we have data on all these possible ties that didn't happen, it's impossible to say something meaningful about the actual ties that did. And so given that complete data on opportunity structures generally aren't available for romantic pairing, it's really been difficult to separate the role of individual preference from constraints of opportunity.
Second of all, preferences are multidimensional. We all have preferred characteristics in an ideal partner across a wide variety of attributes, but traditional marriage data sets only contain a handful of basic demographic attributes. Consequently, multivariate modeling of relationship outcomes is actually surprisingly rare, and we don't have much of an idea whether an apparent preference for one attribute is actually a spurious consequence of preferences for another attribute with which the first is correlated. Finally, preferences are directed. What I mean by this is a very basic and very consequential social fact of which many of us are painfully aware, and that's that one of the biggest constraints on who we date or marry is who's going to date or marry us. The reason this is consequential is that very different underlying preferences can produce indistinguishable patterns in relationships. For example, let's take heterosexual mating with respect to educational attainment, so you have men and women who have a high school degree, an undergraduate degree, or a graduate degree. Now one possibility is that the process directly parallels the pattern. In other words, everyone prefers similarity with respect to educational attainment in their partner, and so of course homogeneity in marriage patterns is what we observe. Now another possibility is that everyone prefers to date or marry someone with higher education. So what we have in this case is that those at the top pair off, those in the middle pair off, and those at the bottom have no choice but to pair off as well, producing the identical pattern. Now finally, there may be gender asymmetry with respect to preferences. In other words, it may be the case here that women very much want to date or marry someone with a similar level of education, but men don't care at all about the educational background of their spouse because they are caring about some other attribute.
In this case it doesn't matter that the men don't care, because they'll be forced to date or marry the only women who are willing to date or marry them. So the point is that indistinguishable patterns in relationships can be produced by very different underlying processes, but if we only have data on the relationship outcomes that emerge from this process, hypotheses about the dynamics that produce these outcomes can only be tested indirectly. So we have the situation then where a basic sociological puzzle has to date been unanswered, and that's what's the role of individual preferences in mate selection. Now recently scholars have added another concern to the literature on marriage, and that's that marriage doesn't capture the same thing that it used to. Marriage patterns have traditionally excluded, and continue to exclude in most states today, data on same-sex coupling for instance, and given rising rates of divorce, rising rates of cohabitation, and a rising average age of first marriage, marriage patterns are also capturing a smaller and smaller proportion of heterosexual couples each year. In other words, marriage is an important outcome, but it's also only one possible outcome of a much longer relationship development process. And so while the majority of sociological research focuses on marriage, in my dissertation I shift attention to the exact opposite end of the mate choice spectrum and instead focus on the initial searching and sorting processes whereby strangers consider each other as potential mates, express interest in some subset of this population but not others, and find this interest is or is not reciprocated. And to do this I use data from an online dating site. In addition to its empirical prevalence, as I mentioned earlier, online dating data have a number of methodological advantages that speak directly to the limitations and the obstacles identified earlier.
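The three education scenarios described above (everyone prefers similarity, everyone prefers higher education, or only women care about similarity) can be checked with a small sketch. Everything in it is an illustrative assumption rather than anything from the talk: one man and one woman per education level, hypothetical utility functions, and a brute-force search for pairings with no "blocking pair" (a man and woman who would both be at least as happy, and one strictly happier, with each other).

```python
from itertools import permutations

LEVELS = [0, 1, 2]  # 0 = high school, 1 = undergraduate, 2 = graduate


def likes_similar(own, other):
    # Hypothetical utility: closer education levels are better.
    return -abs(own - other)


def likes_higher(own, other):
    # Hypothetical utility: more partner education is always better.
    return other


def indifferent(own, other):
    # Hypothetical utility: education plays no role at all.
    return 0


def stable_pairings(men_utility, women_utility):
    """Return every pairing that no man-woman pair would want to break.

    A pairing is a tuple p where the man with education m is matched
    to the woman with education p[m].
    """
    stable = []
    for p in permutations(LEVELS):
        man_of = {w: m for m, w in enumerate(p)}
        blocked = False
        for m in LEVELS:
            for w in LEVELS:
                if p[m] == w:
                    continue
                gain_m = men_utility(m, w) - men_utility(m, p[m])
                gain_w = women_utility(w, m) - women_utility(w, man_of[w])
                # Blocking pair: nobody loses and at least one gains.
                if gain_m >= 0 and gain_w >= 0 and (gain_m > 0 or gain_w > 0):
                    blocked = True
        if not blocked:
            stable.append(p)
    return stable


scenarios = {
    "everyone prefers similarity": (likes_similar, likes_similar),
    "everyone prefers higher education": (likes_higher, likes_higher),
    "women prefer similarity, men indifferent": (indifferent, likes_similar),
}

for name, (men_u, women_u) in scenarios.items():
    print(name, stable_pairings(men_u, women_u))
```

In this toy setup each scenario leaves the same single pairing, with every man matched to the woman at his own education level, which is why outcome data alone cannot tell the three preference structures apart.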
So first of all, we know preferences are constrained, and in an online dating site we have complete data on the opportunity structure for interaction. In other words, we know exactly who has an account at a given time, and therefore not just who we message but who we don't contact as well. Second, preferences are multidimensional, and on online dating sites people typically report a wide number of individual attributes about themselves, including not just demographic attributes like race or religion but also things like body type or whether or not someone has pets, basically data about which sociologists don't usually have information. Finally, preferences are directed, but on online dating sites we can actually observe this directed interaction itself as it plays out in real time: whether one person contacts another, and whether or how that person responds. So now we can actually get to the data. The data I'm using for my dissertation come from a popular online dating site called OKCupid, with which many of you are probably familiar either personally or academically. So it may or may not be the best dating site on earth, but it's certainly one of the larger ones.
So OKCupid, it is free?
Absolutely, yeah, so you should all be members, right?
No, doesn't that also differentiate it from competing sites, eHarmony and the ones that charge?
Certainly. So an absolutely important feature of OKCupid, in addition to its size, it claims to have several million active users, is that it's free, unlike many for-pay dating sites, subscription-based dating sites. So that eliminates a substantial barrier to entry. On the other hand, you might say that people on this site might be less serious about romantic relationships.
Do you think it's younger?
Yeah, and I have the descriptives on the demographics if you're interested in those later, but generally people are in their 20s and 30s. The median age is about 27.
It also, importantly, advertises itself as a generalist dating site, right, as opposed to a specifically niche site catering to individuals from certain backgrounds or with certain interests. So now I thought it might be helpful before going forward to give an example of the actual raw data we're working with, or what an online dating profile on OKCupid looks like. So I just logged into the site and picked a profile at random that I thought I'd show you guys. So Colin's not here today. So when you create a profile on OKCupid, the first thing you do is, you know, select a username that others will find attractive and upload the most gorgeous possible picture of yourself. You have the opportunity also to respond to a bunch of these open-ended essay prompts where you describe yourself. Now the data I have are completely stripped of identifiers. I don't have screen names. I don't have pictures, except for how many photos each user uploads. I certainly don't have the open-ended responses, except for how long profiles are. What I do have, however, are these closed-ended responses on the right-hand side of the profile, where people can indicate basic information about themselves by checking a number of boxes. So importantly, also, you can search for other users on the site on the basis of these attributes. And so for instance I can tell the site that I'm looking for an athletic Hispanic Pisces who drinks socially and owns a cat. And when I find such a person, I have the opportunity to send him or her an email using the site's internal messaging system. So what this creates then, by combining the data on profiles with the data on messaging, is basically a social network data set. So here's an example, for instance, of all messaging on the site among users in the Northeast over a two-month period in late 2010, on which the data are based.
So the nodes are colored here according to gender, and the size of the nodes is proportionate to the quantity of messages received. So visualizations can often be deceptive, but here we see an important, just basic feature of online dating with which anyone who uses these sites will be familiar, and anyone who studies these sites will be familiar as well. And that's that most messaging behavior is men contacting women. Basically gender norms in real life perhaps are reproduced online. It's not actually the case that there are many more women online, it's just that the sizes of those nodes are much, much larger. Yes.
How are we supposed to see that in this graph? See, and it's mostly men.
So the size of the node is proportionate to the quantity of messages one receives. So the red dots are women. Gender balance is roughly equal. Yes. So it's an algorithm that basically just minimizes the distance between nodes that are connected. And so what you end up seeing then is actually some geographic segmentation, because people are still contacting one another locally, not necessarily interested in finding your soulmate in Alaska if you live in Boston, right. So anyway, the point is that you basically create this giant social network data set using these data. Now in the time remaining I want to just run through some basic findings from this research. I'm going to focus, basically for practical reasons, on those who live in a zip code beginning with 10, or the area surrounding New York City. This has to do with the size of the population we're dealing with and computational limitations. But many of the patterns I'll talk about today are absolutely robust across a number of different geographic regions. So what I want to begin with is exploring some gendered hierarchies on the site, and basically how male and female preferences differ tremendously. This is something about which a lot of work is published but not often justified with the same type of data.
So what we're going to see here is basically bars quantifying the attention that individuals from certain social backgrounds receive on the site, where in comparison to a reference category, here white users for instance, we see the likelihood of someone receiving a message based on their background, here in terms of race. So what we see here, for men, is the finding that basically there's only one group of men on the site that's receiving the vast majority of messages, and that's white men. So the differences are much less pronounced among minority groups. It's basically white males on the site who are receiving much of the attention from women. When we switch attention to females, the case is absolutely reversed, in that it's not one particular group that's receiving most of the attention but a different group that's receiving the least amount of attention, and that's black women on the dating site. We see that Indian women actually receive the most attention on the site, the greatest number of incoming messages, followed by Hispanic women and white women, but this difference is tremendously pronounced. In any case, that's basically the gendered nature of the racial status hierarchy we see online. Something similar pops up, to use another example, when it has to do with educational attainment. So how likely are men who have different levels of attainment to receive messages from any woman on the site? The story here is straightforward, that basically the more education you have, the more attractive you are to female users of OKCupid. Now here's what happens when you look at women. So insofar as men are interested in a female with higher education, men are basically interested in a female who has a college education, no more and no less. So women with a master's degree or a higher degree, as well as women with a two-year college degree or a high school degree, are actually penalized vis-a-vis women with a bachelor's degree. Yes.
I'm controlling for everything. Yeah.
So this is, we chatted before, right. Thanks. Actually, so you can't see them as well, but there are these 95% confidence intervals, which all do not overlap with zero. So we chatted beforehand a bit about the OKCupid blog, which many of us have read and find fascinating, and because of the nature of the data they can provide these relatively basic descriptives that are nonetheless fascinating. What I'm doing here is actually running a statistical model that controls for a wide variety of other possible reasons one user may be contacting another, to make sure that the differences I'm identifying are not, for instance, just an artifact of similarity based on age or any other attribute. So that's just an example, two examples, of the gendered nature of status hierarchies online and how different types of individuals are more or less likely to receive attention on the site. What I want to do next is shift attention not to status hierarchies but to homophily, the tendency for similar people to message one another, and to compare rates of matching based on similarity across different types of attributes. So with respect to education here, now previously the coefficients indicated basically one's likelihood of receiving a message from anyone on the site, and here these parameter estimates indicate the likelihood that two individuals from the same background will contact one another. So these are basically rates of homophily, where these positive coefficients indicate a preference for similarity, and if we were to see a negative one, that would indicate a preference for dissimilarity. So with respect to educational background, we see the highest sorting basically at the top and the bottom of the spectrum. Controlling for education, we also have some important matching effects based on income. Not many people report income on the site, it's often private or unknown.
Question about education.
Could we explain the previous data by the fact that there are a lot fewer PhD students, and therefore when PhD students contact each other it shows much less traffic?
So all these models control for basically the opportunity structure, and this has been a huge problem with past research, in that for instance rates of white homophily always seem to be very pronounced because there are many more white people in the population, and so just based on chance we'd expect more relationships. And so all my models control for that opportunity structure. What is interesting then is that even some of these smaller groups do display pronounced degrees of homophily, precisely because they're smaller groups and therefore more easily able to differentiate themselves, I think.
On the previous slide I noticed that the uncertainty bars for the lower levels of education were actually larger.
Sure.
Which suggests that in the OKCupid data set the number of people with high levels of education is actually larger.
Sure, and you absolutely find that when you look at the descriptives on the site compared to the broader population. So you're right also that there are fewer individuals in those groups, which is the same case with income, where very few people are reporting income and are often saying their income is private. Right. And so the models again take that into consideration. In these cases we can't be too confident in those results because the confidence interval is so large. We do see significant matching at the bottom, however, among those who make between zero and thirty thousand dollars a year. More people report their education and report their religion, for instance. So even though scholars have shown that the importance of religious similarity, for instance, has been declining over time, we still see substantial matching based on religion, especially among atheists, Catholics, and Jewish users.
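The point above about opportunity structure, that a large group will show many same-group ties by chance alone, can be illustrated with a rough sketch. This is not the statistical model used in the talk, only a simple assumed baseline: for each group, compare the observed share of its messages that stay within the group to the share expected if recipients were picked uniformly at random. The function name, the user labels, and the toy data are all hypothetical.

```python
from collections import Counter


def same_group_ratio(group_of, ties):
    """Observed within-group share of sent ties divided by the chance share.

    group_of: dict mapping user -> group label
    ties: iterable of (sender, receiver) pairs
    A ratio above 1 means more same-group contact than random mixing implies.
    """
    n_total = len(group_of)
    group_size = Counter(group_of.values())
    sent = Counter()
    sent_within = Counter()
    for sender, receiver in ties:
        g = group_of[sender]
        sent[g] += 1
        if group_of[receiver] == g:
            sent_within[g] += 1
    ratios = {}
    for g, n_sent in sent.items():
        observed = sent_within[g] / n_sent
        # Chance that a randomly chosen other user is in the sender's group.
        expected = (group_size[g] - 1) / (n_total - 1)
        if expected > 0:  # skip one-person groups, which have no baseline
            ratios[g] = observed / expected
    return ratios


# Toy population: eight users in group "a", two in group "b".
group_of = {f"a{i}": "a" for i in range(8)}
group_of.update({"b0": "b", "b1": "b"})
ties = [
    ("a0", "a1"), ("a1", "a2"), ("a2", "a3"), ("a3", "a4"),
    ("a4", "a5"), ("a5", "a6"), ("a6", "b0"), ("a7", "b1"),
    ("b0", "b1"), ("b1", "a0"),
]
ratios = same_group_ratio(group_of, ties)
print(ratios)
```

In this toy data group "a" keeps 75% of its messages in-group and group "b" only 50%, yet once group size is taken into account "a" sits slightly below its chance baseline while "b" is well above it, which is the kind of reversal the raw shares hide.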
And finally, of course, the pronounced importance of racial similarity to messaging behavior, with the highest degrees of homophily among Indian users, followed by black users, Hispanic users, and white users at the bottom. So matching in messaging behavior, right, similar people who are likely to message one another.
This doesn't mean they have successful dates?
No, and that's absolutely one limitation of the data, right, is I'm seeing what's going on online. I have no idea what happened after that, if people met up in person. Which is why all these data have only to do with the first message that people send and the first response they receive. Because beyond that it's really difficult to infer what longer interactions, or the absence thereof, mean. If you and I suddenly stop messaging one another, we could have hit it off and moved the interaction offline, or alternatively the match could have just failed and we lost interest, right.
So is the unit a message and a response, or messages and responses?
Both. So I look at first contact, right, whether that's the first one you send or the first one someone responds with. Those are the network ties in this data set. Yeah. So it's directed.
And you're controlling for the number of total messages?
No, because I see the extent to which someone sends messages to others as an endogenous phenomenon, right. That's going to depend not just on my internal disposition but also on who's available. The question that happens again another network's going on the room. So what these data obscure, actually, so it is actually remarkably rare to get a multivariate model of relationships in the first place, because usually all these data aren't available and you just basically get some descriptive statistics on one or another dimension. What even these data obscure is the other types of matching that are going on on the site.
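The definition of a network tie given above (only the first message between two people, plus whether it drew a first response) can be sketched as follows. The log format, the function name, and the user labels are assumptions for illustration; the talk does not describe how the site's data are actually stored or processed.

```python
def first_contact_ties(messages):
    """Reduce a message log to directed first-contact ties.

    messages: iterable of (timestamp, sender, receiver) tuples
    Returns a dict mapping (initiator, recipient) -> True if the
    recipient ever wrote back, else False. Later messages in an
    ongoing exchange are ignored.
    """
    ties = {}
    for _, sender, receiver in sorted(messages):
        if (sender, receiver) in ties:
            continue  # the initiator writing again adds nothing new
        if (receiver, sender) in ties:
            ties[(receiver, sender)] = True  # this message is a reply
        else:
            ties[(sender, receiver)] = False  # a new first contact
    return ties


# Hypothetical log: u1 contacts u2 and gets a reply; two contacts go unanswered.
log = [
    (1, "u1", "u2"),
    (2, "u2", "u1"),
    (3, "u1", "u2"),
    (4, "u3", "u1"),
    (5, "u1", "u4"),
]
ties = first_contact_ties(log)
print(ties)
```

Keeping only the first contact and first reply sidesteps the ambiguity mentioned in the talk: a conversation that stops could mean the pair moved offline or simply lost interest.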
So I'm going to give you an example of matching based on other, non-demographic attributes about which we usually don't have data. So it turns out, for instance, that people who work in clerical or administrative occupations, and I kept the scale here the same so you can literally compare the size of coefficients across, people who work in clerical or administrative occupations are very likely to seek one another out online. As are people who smoke occasionally. People who do not drink at all whatsoever, not even at the Berkman holiday party. People who use drugs occasionally. People who are Virgos. This continues to baffle me. If anyone can explain to me why I see differential degrees of matching by astrological sign, I'd be curious to hear it. Someone back in our department thought perhaps people were misreading it and thought it said virgins, and that those were the people finding one another. Thanks, Mary. I'm told that's not really possible given the structure of the site, but it was entertaining. People who love cats, as well as people who own dogs. Individuals who have multiple children, which of course is a marginal category on the site, speaking to the previous comment. We find the same type of effect among individuals who say they like children but don't want them. So again, minority categories on the site certainly seeking one another out. Another important category: people who self-describe their body type not just as fit or athletic but actually jacked. So people who spend a lot of time in the gym, presumably, are very likely to message one another.
Aaron, an example of jacked.
Right. Okay.
People who are especially short. People who have more than 10 photos on their profile, whether these are people who are very attractive and showing off what they've got, so beauty sorting, or people sorting on vanity perhaps. And people who have especially lengthy profiles, so those PhD students who instead of writing their dissertations are pouring out their hearts to strangers online are also messaging one another. So this just gives some illustration of the multidimensional nature of mating and mate choice that usually we don't have, based on limitations in prior data. So what I want to do as one final step is show you a regularity that I found in the data that was very surprising to me and very surprising to others, and this is comparing rates of initial messaging to rates of response on the site and disentangling these two dynamics. So what I'm going to show you in this model is the exact same thing we just saw previously, except I'm going to present two sets of coefficients. The green coefficients are going to show you my tendency to message someone who's similar to me, where a positive coefficient will indicate a preference for similarity and a negative one will indicate a preference for dissimilarity. Now the yellow bars will indicate who I'm likely to reply to, where again positive will indicate a preference for similarity in replying and negative will indicate a preference for dissimilarity in replying. So if we just look at the matching coefficients, the situation is very similar to what we just observed, the high degree of homophily on all these attributes. Here, however, is what we get when we look at the response behavior. So two features are immediately striking. The first is that, and the colors are difficult to see with the light, the matching coefficients are largely positive and the response coefficients are largely negative. This is again basically the interaction between each of these categories and reciprocity.
What this means then is that people overwhelmingly prefer similarity in their initial messaging and overwhelmingly prefer dissimilarity in their responses. The second immediate thing we notice is of course that the standard errors are very large and the confidence intervals are very large, because we're dealing with so few ties there in the first place. But even if we restrict attention to the significant coefficients, we see that basically some of the strongest social boundaries, I would say, are also the most fragile, in that I may be very, very unlikely to contact someone from a different racial background than me, but in the unlikely event that such a person contacts me first, I'm actually more likely to reply than to someone from the same racial background. So, quick summary. The three types of findings I've demonstrated here today are, first of all, that we actually see some very pronounced gendered hierarchies in initial contacting behavior on the site, not just with respect to racial background and education, which I focused on today, but also income, religion, and most other attributes that you look at. But usually, because we only have data on the outcome of the mate choice process, we actually can't observe these asymmetries in preferences. Second of all, I showed that demographic attributes are absolutely important to mate selection, but so are non-demographic attributes. And so while some people, in response to the question of who are you, will provide a basic demographic description, others might answer that question in terms of what they like or what they do or even how they look, and it turns out preferences for those types of attributes are equally as consequential as preferences for demographic attributes. And finally, I showed that initial contact behavior varies systematically from response behavior. It turns out that some of the strongest social boundaries are also the most fragile. Thank you. Yeah, please.
I was wondering if you could speak to the particular method of OKCupid, which is that you write questions, in a sense, that you ask other people to answer, which is quite uncommon on other sites.
Sure.
So does this create a filtering mechanism that already kind of maybe biases the start of the messaging process?
Possibly, well, very likely, right.
It's unlikely to find anywhere else.
So for those not familiar, OKCupid has a distinct approach to matching, right, and that is that actually you know best what you're looking for, which may or may not be problematic as an assumption in the first place. And so what you do is go on the site and answer tons and tons of questions in three different ways. So say here's a question, and you will give OKCupid: here's my response, here's my ideal partner's response, and here's how much it matters, right. And so then the site has an algorithm that takes a bunch of answers to these questions, accumulates them, and pops back out a match percentage. It then uses this percentage to basically rank your search results and match you within the site. Now the trouble is we're talking about a dyadic similarity score between every possible pair of users on the site, which in my original data set is about 1.8 million people. The bottom line is that's a lot of dyads, and so I don't actually have the data on the match percentages on a, you know, holistic level. I'm not as concerned about that, however, except insofar as one's answers to those questions would intersect with these other questions, which basically hone in more on demographic and descriptive characteristics about you rather than dyadic interpersonal compatibility, right. But it's one limitation of the data, that I don't have, you know, that kind of backdrop of the site itself. Good question.
The data set that you used was primarily Manhattan, right?
So the entire data set is, for these analyses, right, people in the New York City area.
Did you find much variation, that being one of the most diverse areas in the country? If you went to less diverse areas, did you see the data and trends change significantly?
In some ways yes, in some ways no. So the latter two findings I presented, about the importance of non-demographic characteristics, that is robust across regions, right. In some cases it's not exactly the same categories, maybe it's jacked people in New York and, you know, super skinny people elsewhere, but the general take-home is the same, and the reciprocity finding is also robust to a large number of regions. Racial hierarchies are also very common regardless of where you look, but some of the other dynamics absolutely vary tremendously. So in New York it also turns out that, notwithstanding the research on moral boundaries surrounding atheists, for instance, atheists and agnostics are doing great on these sites. They're receiving tons of messages, men and women alike, and that's something that's certainly not robust across regions. And so I think some of those gendered hierarchies are actually localized, and that's a topic of future exploration, just to understand why these patterns vary in different regions.
When you started the presentation you were talking about, you know, the other research that's been done, particularly focusing on the other end of the spectrum, you know, the endpoint, marriage kind of thing. What I was wondering was whether in your research you saw any differences, particularly in people's mate choices online versus what other research has shown outside of the online context.
Sure, great question. So in the first place, the advantages of these data are as I laid out, right, we actually hone in on preferences. The disadvantage is that I'm looking at, as I pointed out, the entirely different end of the mate choice spectrum, so it's very likely that the things people care about in initial contacting are very different from what they care about in a marriage partner.
To me this underscores the utility of my approach, actually, in that we can probably expect that in these earliest screening and sorting stages social boundaries are going to be most salient and personality factors play a minimal role, because we haven't had a chance to meet someone yet. So there really is no role for chemistry, which many say is the problem with online dating also: someone looks great, and then you get on the first date and it's like, oh my goodness, this is not what I expected. So that's one limitation. Insofar as one does compare them, though, you often find that rates of homophily are exaggerated in traditional data sets, again because they don't control for opportunity structures or because they don't control for other attributes that might be correlated. The prior research has also found that whether you look at dating, cohabitation, or marriage patterns, you always see a pronounced degree of homogeneity. What that would suggest is that these patterns develop even earlier in the mate selection process, and that's something that these data can provide insight into.
On OkCupid, not to reveal too much experience, people can express what they're looking for. So did you see any differences if someone says "I'm looking for marriage" on the site versus "I'm looking for a quick..."?
Good question. So all these results are limited. On the site you can indicate that you're there for a number of reasons: a one-night stand, dating, a relationship, long-distance pen pals. These findings are limited to people who are looking for either long-term or short-term dating at a minimum; you can check a number of boxes. So I would expect these dynamics to vary according to intent. Likewise, I focus on people who self-identify as single; you might expect that people who are married might behave a little
bit differently on a dating site than others, and so I hone in on that category. Yeah?
I'm sure that a lot of people in this room have read the article the other day in the New York Times about the M.R.S. and the Ph.D., talking about education levels of men and women. It was interesting that one of your slides didn't refer to men ever preferring women with a lesser educational background. I was wondering if you've encountered any data, and it might be outside the scope of your research, to corroborate that and to show that men are now preferring women who are more educated to some extent.
So you're absolutely right that I don't look at my preferences for someone else relative to my own position. I look at these two general theories of preferences, similarity and competition: who's receiving more messages overall. Interestingly, you find there's one effect where men with only a high school education on the site are going out of their way to avoid women with a high school education. One might argue, well, for this group in particular there basically is no one else on the site that has lower educational attainment than they do. So the extent to which my findings speak to those dynamics is indirect, and one would have to look at the entire mixing matrix, as we would call it, of interaction, where you see how people across every possible dimension here interact with every possible dimension there. The advantage of my models is that I'm also controlling for popularity, the overall tendency to receive messages, as well as the overall tendency to send messages, and if you include both those controls with the matrix you have an overdetermined model. That's a more complex explanation than you asked for, but those are the technical reasons that those findings actually might be distorted, and also why this is actually not an unreasonable way to look at the picture. Yeah?
It seems like there's so much data for this site alone, and obviously all the other sites and
geographic areas, that you could slice and dice it forever.
Yeah, almost.
You probably don't have time in one PhD, but I wonder, if you were to advise someone who was just starting this sort of research, what other areas you think they could find some valuable information in or make interesting inferences from.
Appreciate it. So I think because the literature that's out there, again, hasn't really been able to distinguish the role of these different factors, we really have underdeveloped understandings of these factors. What I presented here is a very broad (narrow geographically, but broad) overview of different types of preferences. I think the natural next step is to hone in on one or another dimension of mate choice and better understand the precise patterns that are going on with respect to education, for instance, or with respect to religion: understanding why these preferences turn up as they do, understanding why we see the geographic variation we do. And personally, I would love in the next step to get a grant to actually go out and interview actual dating site users systematically and provide some interpretation and meaning to the findings I have: sit down with people and go through their behavior with them. Why did you contact this person and not this one? What's going through your mind? I think that missing qualitative component would add a great deal to interpreting some of the findings we have here. I think that a more in-depth study would be the natural next step. Question? Yeah.
Does this stuff actually correlate with dating records? I mean, can you sort of look up in different countries... not dating records, marriage records, rather. Marriage records are legal documents, to some degree public. So if you're talking about mating preference, can you look at marriage records and see whether that correlates with dating site activity?
One could. So then you come back to the original point, though, which is that the marriage records are
basically just looking at the outcome of this process. So first of all, that's obscuring this directionality issue, for instance, but it's also, as Ryan mentioned, focusing on the opposite end of the spectrum. You could compare the baseline patterns that you observe in each, but it's difficult to know exactly what to make of those, because they could be tapping into different things at different stages in the process, or reflect the methodological advantages or disadvantages of either. But those data are out there. The literature on marriage patterns in sociology is extremely well developed, with nuanced patterns over time and geographic variation across countries. So it's a really developed literature, but I'm trying to kind of go beyond that in a way. Yeah?
Are there dating sites that are trying to track outcomes? I mean, they must want to collect data, as long as the outcomes are good, right?
Yeah, so, the only ones they want to share, yes. Right. So Match and eHarmony, I know, have both conducted surveys in collaboration with an external agency to show that each, respectively, is the best dating site and produces the most happy marriages and that kind of thing, as each has claimed to. So the findings that I began with are something I would put a little more credibility in, because they were conducted by a sociologist who actually went out and surveyed a nationally representative sample of the population. Not that the Match and eHarmony results are biased, but you're right that they're absolutely interested in marketing their sites and showing that they produce positive, happy matches. The survey data, though, do show that couples who met online aren't in any systematic way, I don't believe, different from those who met offline, which is heartening to skeptics, I guess.
Yeah, I know there have been some OkCupid blog posts sort of tracking what questions and stuff make people
most successful, as in they say they met each other online. And I was just wondering if that's just data that you couldn't get, in terms of people who said, "I met my significant other on OkCupid and that's why I'm taking my profile down."
I do have those data, actually. So you're right, there's quite a bit of information out there that they have that we didn't acquire. But one of those things is that when you close your account, the site asks, okay, why are you leaving us? You can say, well, you advertised too much, or I've met my soulmate online. So I do have those data. I've not actually looked at them yet, because you're dealing with so few cases, and the trouble is that even if you indicate that you're leaving because you've found a relationship you're happy with, you don't actually indicate who that person is.
I thought you do indicate who it is.
Yeah? Okay. If you do, they didn't give us that, unfortunately. So I would have to take the potentially risky step of inferring backwards based on, like, the last person they were in touch with, which may or may not actually be that person. So it's a little bit tricky. Maybe I'll have to go back to them for the extra bit of data. Yeah?
Great talk, Kevin, thanks for this. I'm interested in hearing more about the reciprocity result. I found that interesting. Did you control for gender in that?
Absolutely.
So is there a theory behind that?
Sure, great question. So all the models I showed have basically the exact same set of controls, and I'm controlling for gender and controlling for the baseline tendency to reciprocate messages in the first place. That said, it absolutely is open to interpretation, and that's one effect in particular that I think the qualitative data would be helpful for. One could speculate, for instance, that people might be more likely to give a polite rejection to someone who's crossing a racial boundary
in the first place, because you don't want to come across as racist or that kind of thing. I think that type of explanation is not so likely, given that there's basically no cost to not replying to a message, and in fact there's some risk in sending a polite rejection and the other person misinterpreting it as interest. Some dating coach over here is a shaker, right? So rule number one you learn when you're going on a dating site is that no response means no interest. And then you get some awkward situations happening when in fact you do reply: oh well, Aaron was so cute, and he took the time to craft this nice, beautiful message to me; I'm just not interested in dating an academic, though, so I'll at least send him a "no thanks." And Aaron over there, well, in some cases he'll actually write back and be like, "What did I do wrong?" And so this is not exactly an exchange I want to be in. So I don't think that explanation is as plausible. What I think is going on, actually (we were discussing it before the talk), has to do with the basic nature of cognition and prejudice. I've not looked at these numbers, but I would expect to see that this racial in-group preference is equally pronounced in profile views as well as messages, and that people, when they go out and look on the site, basically aren't even considering individuals of a certain racial background. But if this person goes out of their way to send me a message, all of a sudden they're pushing their way onto this mental map, and I might give them an entirely different type of consideration and actually look at their profile in a way that I wouldn't have otherwise. So I think that's really what's going on here: the initial stereotype and prejudice is just kind of a blanket approach, but once someone pops up on your map, you have an entirely different mental framework with
which you view them. Yeah?
So to follow up on that, I was just thinking... I wondered if these kinds of patterns correlate with implicit association tests, where you have people do word associations very quickly and these things come out. But I wanted to take another step and ask about ethics. You might not feel like addressing the question, but OkCupid received some controversy (it was probably a marketing ploy) when they sent messages to people who had not been active on the site for a while, saying: we have now tweaked things; we see that you're attractive, and we're going to tweak it so you see other attractive people. Okay, so you say, well, that's kind of silly. But what happens if you get something like that with race, knowing that people are most likely to message other people of a similar race, or that most people aren't likely to message African American women, apparently, and hence they get filtered out? Do you have anything to say about the use of your sort of work in the filtering mechanisms that get built into the software?
So I'm going to try and be as descriptive and objective as possible here as a social scientist, and that's to say I really don't know what's going on in that black box. It could be any number of things, to be honest, some of which would be more or less problematic for this type of research. I've heard similar things about OkCupid in particular. That's certainly not to say that they're the only site that does it; they just might be more open about it. One also hears, more or less frequently, about users whom OkCupid contacts at some point and says: look, based on patterns of contact and who's looking at your profile, we've identified you as someone who's really attractive compared to other site users, and now we're going to upgrade your matches to give you more attractive search results as well.
Honestly, I'm skeptical whether they're actually doing anything at all, right?
Some people say it's just a marketing ploy.
Yeah. I mean, that would require a lot of effort, and I'm not sure what exactly that buys them, to fiddle with the algorithm to upgrade your search results like that. So I wouldn't be surprised if that were the case, but I've got no idea what else is going on. And again, that could be more or less problematic for my findings insofar as their behind-the-scenes alterations coincide with the data I'm looking at. So, a great question. Yeah?
Another possible explanation that I'm thinking of, with regard to cross-racial contact...
Sure.
...is that you don't have the data on the preferences and hobbies and all that kind of stuff. So it could be that someone is a different race than me, but they have a really strong reason to break that norm and contact me because we have something really obscure in common, like scuba diving.
Right, matching based on some other unobservable. It could be the case, but one would have to explain why that would turn up only in the replying and not the initial messaging. What caused someone to actually take the additional step to go and look at the profile and identify that extra compatibility? There's a lot in the models that may pick up on that kind of thing, but one would need to explain why that turns up only in replying and not in the initial message, which I think is the puzzle. I'm quite convinced it's not an artifact, which some people are concerned about also. It's quite robust, and there are lots of controls in the models, but I think the interpretation is still a little bit open.
I have an alternate theory that might be interesting; I don't know if it's worth exploring. It's the same thing about why, because I think that's kind of the weirdest finding in some ways, and the most interesting, because on some level it
contradicts all this homophily stuff that we as social scientists assume is going on everywhere all the time. And that is that the internet historically is kind of good at helping people solve these second-order status problems. What I mean by that is that I might not do something in public, not because I don't want to do it or I'm necessarily uncomfortable with it myself, but because I know Kevin's watching; Kevin, as my friend, is going to see it. And the internet's a really nice safe space in the sense that Kevin, my friend, might not see it. So in that sense I wonder if that's part of what's going on: that people are actually able to reveal the fact that they're curious or intrigued by the fact that someone who's not like them has contacted them. I mean, I think it's compatible with the explanation you gave.
Sure.
It's a little more nuanced. It goes into, like, why the response.
Yeah.
It's like part of why the response. Anyway, I don't know.
There's some interesting data there on bisexuality, the degree to which women label themselves as bisexual.
Oh yeah, yeah, I've seen that on the OkCupid blog.
It's kind of performative, even if they don't follow through with that particular preference or have that preference genuinely.
Yeah. So all these models are of course of heterosexual pairing, for a number of reasons, foremost among which is that we really don't know how same-sex or bisexual pairing works, because our data come from marriage, for the most part. But I absolutely want to compare patterns among these different groups; I think it could be illuminating in lots of different ways. Really interesting. That's it? Yeah. Want to share?
Well, one of the main things when you use the site is the percent match number, right? It sort of gives you a valuation. I'm just wondering how much of an impact you think that has in terms of whether you respond to matches
just by itself, versus other things, or in terms of how people see it. It almost seems like that's kind of the new proximity, just your percent match.
Sure. I'm confident it's playing a tremendous role in filtering this vast population down to the small portion of people you might actually be compatible with, or that the site thinks you're compatible with. Again, however, I don't think that narrowing would affect my results to a great degree; I think that is kind of an independent dimension. And I think that's one of the fascinating things about this topic in the first place: maybe this is replacing geographic distance. I think it is in that way, and in another way it's basically bringing individual preferences to the fore. One slide, if you don't mind, that I love to show, since my other favorite thing to talk about is Facebook, not online dating: here's a snapshot of my own Facebook friendship network, for instance. And it's ridiculously rare to see this type of segmentation in any network. We have here my college friends, my high school friends, my grad school friends, and Currier House, where I'm a tutor. Yeah, thanks, guys. And so traditionally, where would we be finding our romantic partners? Absolutely in one of these social foci. Not that one, though; people lose their jobs for that. But we're meeting our partners in educational settings. There are also yoga buddies, bartenders, other people: these social venues that constrain interaction. And if it's not a social venue, we're probably meeting people who are basically two degrees off this map, a friend of a friend. Online dating is totally shattering that traditional way that people meet one another. What you're seeing instead is a more direct mediation between individual preferences and the outcomes, where rather than going into the grocery store and looking for a certain type of soap
and being limited by what's available there, we go on amazon.com instead, where the selection's much broader and we can actually get the product we want, insofar as it's available on the site. But yeah, I think it's a fascinating aspect of online dating for that reason.
Have you looked at the possibility of the opposite hypothesis, which is that online dating, particularly in the niches, reflects friends of friends? In other words, the same people that you might be dating on, let's say, JDate might be the same people you're friends with on Facebook or in business relationships with on LinkedIn.
Sure, I think one might suspect so, especially in these smaller groups of people in kind of niche categories. But just from anecdotes from friends and peers, I think occasionally you bump into someone online that you might recognize from real life and kind of steer clear of such a person. And occasionally also you get stories of two people meeting one another online who very possibly would have met otherwise, because it turns out they are two degrees removed in certain ways: they actually just missed each other at this conference, or they're both regulars at some coffee shop and just happened to have never struck up a conversation yet. But I think in general you are meeting people that you wouldn't have otherwise, and I think those instances are more like the exception. It really is amazing how constrained interaction is and how many people we don't meet just because of our day-to-day trajectories and where we're walking and who we're interacting with. I think online dating just really transforms that. Sorry, I skipped over you a second ago. Yes?
I was going to say, OkCupid of course accepts everybody. How do you think the results there might differ from a niche site like JDate, or from a site like eHarmony that is known to
reject potential customers whose responses are too far outside the mainstream?
Right, great question. I think there are a couple of different cases there. There are some sites like eHarmony that I guess reject people; I'm not as familiar with that case. Or I think perhaps of, like, attractivepeople.com or something, where you have to apply, and if you're attractive enough they let you in: beautifulpeople.com. And so that's one type of selection. The other problem, though, especially insofar as I'm trying to infer social boundaries from interaction on these sites, is that one could say: look, those who draw the strongest boundaries are just going elsewhere, onto a site specifically for them. If I really want a Jewish partner, I'm not going to be on OkCupid; I'm going to be on JDate. So to that end, these findings could also be biased towards those who are more generally open. That said, one other advantage of these sites is precisely that you can search for these described or searchable characteristics in a way you can't elsewhere. And so insofar as such a niche site doesn't exist, or maybe it does exist but it also needs to have a sizable population, going on OkCupid is very helpful for me, because if I'm really interested in dating that Virgo or that excessive alcoholic or someone who doesn't do drugs or whatever, it's very easy to find them. Once online dating sites reach a kind of critical threshold, it's worth me jumping in, saying: look, they have enough people there that even as someone who draws these strong boundaries, this site will let me hone in on exactly that population. So I think those two tendencies might counterbalance one another in terms of the strength of the boundaries I see drawn. Yeah?
Oh yeah, I was just wondering: one of the pieces of advice I've seen in terms of using OkCupid is to respond to people, because it tracks how often you respond, and if you
don't respond enough it'll give you, like, a red dot above your name, in red: this person responds rarely. And so people think, well, why should I bother? It's not worth my effort to email that person, because they rarely respond.
I would say the opposite, though, I think, and again, chatting with folks, you sometimes find this. So right, OkCupid will give you an indication of how often they reply. I think basically it's like "frequently," which is probably the same as "always"; it doesn't want to make people seem desperate. So it's like "often," "sometimes," or basically "selectively." And you might say, oh, okay, this person has a red dot, they barely ever reply, why should I even bother? But it's also like, oh, this person has a red dot; there's something going on here, and this person's selective, more discerning. I think it's also in some ways more attractive, and it makes that person seem more desirable.
Hmm, I don't know.
Yeah, it's like one of the screens that person puts up; it's another boundary that they've created, a sort of exclusivity.
Absolutely.
What do you think about the feature, for example, in JDate, where the inbox will animate until you actually read the message?
Oh, I wasn't aware. So it just kind of bugs you and is annoying until you actually look at it?
Yeah.
Frustrating. If I were a user of the site, I'm not sure if I'd get irritated by it at some point. But I think again that goes back to these baseline gender norms of interaction that you see on these sites: basically, as a male, you're the one who's initiating contact all the time, just as men often do in real life in heterosexual pairing, and as a female on the site, mostly you just have to sit there and watch the messages come pouring into your inbox. And in fact the most tedious thing as a female is to plow through all these: oh my god, there are 72 new messages, and half of them are just
like "hey, what's up" with the face, and the other half for whatever reason are totally unattractive or undesirable in one form or another. So I think that really colors the day-to-day experience of these sites for men versus women. Another aspect of that that I enjoy, though, is that I get into arguments with female single friends all the time. They say, oh, guys don't like being approached and don't like women who are too confident, and we should just let them come to us. And I say it's total bunk. In fact, on the site you see, coinciding with those baseline patterns of interaction (I took the slide out, sorry), that the response rate among men is much, much higher. As a woman reaching out to a man, you are over twice as likely to get a reply as a man sending a message to a woman. So I was like, look, this is fantastic for you. And then, I don't know. Sorry, thank you. Yes?
I wanted to go back to that really fascinating delta between the inbound and outbound responses with regard to race. Eli Pariser gives this really fascinating TED Talk on what he calls the filter bubble...
Okay.
...that algorithmic filtering mechanisms can predetermine and often skew the behaviors of users.
Sure.
So could this be an indicator not of social biases but perhaps algorithmic ones?
Like, coming from where, exactly? So the algorithm being the matching percentage, or just who OkCupid chose to show you?
If there's this much responsive interest from people getting messaged outside of their race, perhaps it's not emphasizing that enough in its recommendations.
So basically it's not presenting you with these matches. What I would wonder then is where those cross-race messages are coming from in the first place. We know they're very rare, but once they do happen, we see this other pattern. So for those that do happen, I'm not sure that that response is as compatible with
your suggestion. There are people crossing this boundary in the first place, even though it's unlikely, and I'd be curious to know who exactly those people are and why.
I see.
And also, there are data that we don't have in that data set indicating how people found each other. So when I first clicked on your profile, did I see it because you contacted me first? Did it pop up in a match recommendation? Did it pop up in search results? I think those types of data might also provide more insight into where exactly these ties are coming from, but I'm not sure we have the nuance there in the data to explore it. That's one possibility, and again it speaks to the black box of other things that might be going on behind the scenes there.
And related: are the rates of cross-race messaging really that much lower?
Yeah. So the coefficients here are all in terms of log odds of one user contacting another one. And they're very pronounced, for those who are familiar with speaking log odds, hopefully not many of us. Relative to baseline rates of communication, these are in some cases very pronounced differences, and almost all of them are significant. It's difficult to pare those down into meaningful descriptions that are more intuitive; basically, if I give a descriptive stat that's very intuitive, it also doesn't control for a bunch of other things that I should be controlling for. So what I can say is it absolutely is a pronounced division, and this is what people have been finding for decades in the marriage literature as well: intermarriage has increased in recent years, but the racial boundaries to marriage are still absolutely the strongest, just as they are in virtually all other types of relationship.
So you're controlling for income, education, other factors?
You mean in this model? Yeah, everything's all together. In the previous one, where I gave a more
nuanced look at one category versus another, there's one with a bunch of controls and just education, and another one with just race, but this multidimensional model is everything in one. Part of the reason I only focused on New York is that the approach I use, because of the complex interdependencies in this network data set, requires a tremendous amount of time and resources. Some of these model results took over a week to run with 60 gigs of RAM on Harvard supercomputers or whatever, and so I faced severe computational problems just to do what I wanted to do in the proper way of doing it. So it's both a strength and a weakness, I think. Yeah?
I'm curious what the strongest affinity for interracial responses is. Is it that Asian women respond to white men most frequently? I'm just curious about any startling results.
I've not looked at those, which gets to the exact same issue we had earlier with the specific rates of messaging between people with different educational backgrounds: basically, that would entail filling out the entire mixing matrix and saying men of this type are interested in contacting women of this type, which doesn't allow you to control for some things but is also kind of inherently interesting. There are some scholars at UMass Amherst who have a comparable dataset from OkCupid, and they're looking at some of these exact types of intergroup preferences, which is fascinating work. The trouble is, again because we usually don't have these data, I think people struggle to come up with a thorough explanation. If you have just five racial groups, that's 25 data points you need to explain, and it turns out these findings are not really amenable to straightforward explanations like men prefer this or women prefer that. To actually come up with a nuanced, accurate explanation for why each of these patterns occurs as it does is something that's been elusive, I think,
for some time. Yeah?
I guess apart from education and income and race, what if you just looked for the keywords that were the most indicative of matches, like they like the Red Sox or like skiing or something? Are there particular responses...?
Well, I think that'd be fascinating.
And again, you just, like, rank keyword matches and get those responses.
One could; OkCupid could. But we didn't get those data, nothing from the open-ended responses. But it'd be fascinating to see a similar plot of keywords. I'm sure there are things out there that, for whatever reason, are off the charts in terms of predicting compatibility. Although again, you have to be careful with those types of things, because when you throw enough data points at anything you're going to get some results just by chance also, which is a limitation of other fascinating research that has provided these broad overviews. It's important to identify what's actually signal and what's just noise. But I think that'd be fascinating. Yeah?
So my last question is sort of speculative. I think there's one way to read these, which is the way that you've presented it here, in the sense that you're looking at this as: we have this amazing transcript of this behavior that's been going on for a long, long time, and now we can finally understand what the general mechanisms are behind the scenes, what's inside the black box of dating. I think there's another way to read it, which is about the expansion of online communication, and more historically situated. So we've got this period in the United States, the late 20th century, where we have increasing indicators of social isolation...
Sure.
...and expansion of digital online communication tools, and that this is telling us something about the way that people seek information and build social ties under those conditions,
right? And I'm just curious what you think of that, or if you have ideas about how you would think about it in those terms. Good question. So I've pitched things today as online dating being somewhat of a natural-experiment type thing, where it presents these ideal structural conditions for isolating the importance of one factor we usually haven't been able to isolate. So one natural response is, well, to what extent is the interface itself, or these unique social-historical conditions, changing preferences themselves and altering what we see? Historical effects, I think, would be compatible with my interpretation, in that people might care about different things today, but we're still interested in what they care about. Insofar as the interface itself is influencing behaviors, that is more problematic, and some of the questions we've talked about today already touch on that. At the end of the day, though, I think it's just doing both. I think online dating is making preferences more important, and so that allows us to better understand them, and that's also consequential for the relationships themselves. But you're right, though. I mean, why is online dating on the rise in the first place? Well, people have online social networks all over the place that they're increasingly familiar and comfortable with. We have these structural changes in society that I mentioned, again, rising rates of divorce and increasing age of first marriage, and so I think a lot of factors are kind of channeling people into this technology. But I don't see it going anywhere anytime soon, especially as the stigma declines. I mean, as more and more people are using it and finding matches, and people are just tired of being single, more and more people are going online, which further decreases the stigma, and so on, until everyone's going to be only using online dating and never leaving their homes. See any other research to that effect? Not that in particular.
Yeah, an article, but more picky because of online dating. Well, fascinating too, right? Insofar as these categories online get reified in the first place, and maybe because something is giving us this option to fill out, oh, it wants to know what I think of dogs, okay, maybe this is important, or we're therefore considering attributes we wouldn't otherwise. Right, yeah. Which again emphasizes the fact that meeting someone online is just the inverse of meeting in real life, basically, right? In real life you have this immediate sense of chemistry, of interpersonal attraction, but we're not at all walking around with signs on our foreheads about our income and our education and most of our personality. Whereas online you have all these ticks in the boxes and your own description of yourself, and then you get out in real life and there may be fantastic chemistry or it might be just dead, right? And so you're approaching this romantic selection from entirely different angles, but interestingly with high rates of success either way, right? And so I think that's another fruitful area for future research: what exactly is going on there, how relationships that begin online are different from those that don't. And I wouldn't be surprised if at the end of the day they're very comparable, right? Yeah, the more things change, the more they stay the same, right? That's what else is so shown, as you've been... I mean, one thing that I find interesting is that you've been talking about how looking at online data (I'm a big proponent, a big fan of online data, that's my dissertation too) allows us to understand patterns of matching and dating, but at the same time you just said that the way we pick people online is very different from the way we pick people offline. Sure. So I was wondering to what extent we can learn from one about the other. Like, you can learn about how the process gets to, again, the result in
different ways online and offline, but at the same time it might not be representative. Like, I can imagine people who are more willing to look outside the box are looking online because they don't find suitable partners in their real life. Sure. So some of this has to do with how the process itself online is different; other parts have to do with how generalizable this population is, right? Who exactly is online? Actually, recent research has shown that once you focus in on the people actually at risk of online dating, in other words those who are single, those with internet access, the two populations are statistically indistinguishable with respect to a wide variety of demographics, which lends confidence in generalizing. But yeah, the process itself is very different; one doesn't necessarily speak to the other. But I think it does have this advantage of focusing in on this thing of interest. I mean, I think it's intriguing that you find a high degree of homophily by race, for example, in here, or religion, given that you already have specialized dating sites that are largely based around those categories. But what's interesting, though, is the meaning embedded in that site as well, right? So maybe you actually really care about someone who's similar to you with respect to X, but do you really want to be dating someone who's going to say that all they care about is that? So maybe I really want a Jewish partner, but to actually take that step to go on JDate... Or, you know, we all want an attractive partner, right? But to actually apply for beautifulpeople.com, so yeah, it's going to give me someone attractive, but someone who's totally vain, right? It's a self-selection thing, where the people who think that's important go elsewhere, and the people who are willing to fiddle around at the margins... Certainly. I guess what I'm saying is that there may be cases where we, you know,
care very much about these things, but to actually go take the step to join a niche site for them, right? Like, I really like Apple, but I'm not going to join Cupidtino, a niche site for lovers of Apple products. Oh god, I didn't know that existed. It's amazing what's out there and what people forward you as a scholar of online dating: dating sites for truckers, dating sites for beautiful people, the, uh, Date Harvard Square... no, it's where, so as I understand it, men with Harvard degrees sign up for free and then women of any background can pay for the privilege of going online and dating a Harvard-educated male. It's like, wow. At the end of the day these might actually be perfect matches for one another, but for all the wrong reasons. Let them go off. That's right, that purges the rest of our... a thousand flowers bloom, right? I shouldn't say things like that. But do they have to have a degree or just be at Harvard, the men? Yeah, a Harvard degree, I think. Oh, that's true: harvarddropouts.com. That's it, that's the name. Again, good idea. Wait a minute, camera. Thank you.