I'm going to talk about the way in which we have used, and you can use, biological data to answer social science research questions, and the way we in Understanding Society have combined those biological and social data. I was trying to work out how to fit this all into 90 minutes, so I've set myself times for the different sections. After each section we'll stop and have a discussion, and you can ask questions. I'm just going to spend a couple of minutes on what Understanding Society is, because the purpose of today isn't really to tell you about the study, but to talk about the kinds of social and biological questions and data, and what some of the issues are in analysing them. So a little bit on that, then I'm going to talk about why we in Understanding Society have invested in these kinds of data, why it's useful to have biological and social data together, and the kinds of research questions it enables you to answer. Then we'll stop and have a discussion, and maybe you can think about some research questions you might be interested in answering if you could have those kinds of data. Then I'm going to say a bit about what the actual data are that we have in Understanding Society, and some of the issues that you need to think about in analysing those data that you perhaps wouldn't think about with a normal social variable, so what some of the different challenges are in having these kinds of data. And then we've got two initiatives at the moment in Understanding Society, some fellowships and ways of designing experiments to think about collecting these data, that you could get involved in if you want to. So I'll say a little bit about that at the end.
And I should say that because Melinda is here and is going to be talking about genetics and social science later, although Understanding Society has genetics data that Melinda might be using, I'm not really going to talk about that at all, other than one slide to say what we have. So very briefly, Understanding Society is a household panel study. It's multi-purpose and covers a range of different domains. It began in 2009 and we collect data annually from everybody in the household. At Wave 1 the target was to interview everybody in 40,000 households, which was about 100,000 people. It builds on and incorporates BHPS. BHPS was set up in 1991, and 8,000 households from BHPS were moved into Understanding Society at Wave 2. So for those 8,000 households you've got data going back now 25 years. It's funded by ESRC and government departments including DWP, who are here in the room and are one of our funders. So we work quite closely with government in trying to think about how the data can be useful for policy purposes. It's a public data set, so you can download it from the data archive. Most people who use it are academics, but government and third sector and some commercial people use it as well. And it's part of a family of household panel studies. In the UK we have this rich tradition of longitudinal studies that are mainly about birth cohorts, but internationally there are Understanding Society and things like PSID in the States and HILDA in Australia: household panel studies that have been set up in similar ways so that you can do comparative research, although we're the only one at the moment that has biological data. Understanding Society has five different samples and only two of them have the biological data, so that's what the rest of this talk will be about. But just to mention them all now, there's a general population sample. That's a stratified sample of the whole population.
And there were about 26,000 households, 41,000 interviews, in that sample at Wave 1. At Wave 1 as well there was an ethnic minority boost. The aim of that was to get representative data on a large enough sample of the five main ethnic groups in the UK so that you could analyse differences between ethnic groups effectively. But a decision was made at the time not to do the biological data on that sample. So although in the general population sample there were ethnic minorities who we've collected the biological information on, we didn't do it in the boost sample. BHPS I've already mentioned, and we did collect the biological data for them. There's an Innovation Panel, which I'll come back to at the end when I talk about experimentation. That's where we do lots of experiments before we do things in the main study, to make sure that they work. And then finally last year we added a second immigrant and ethnic minority boost sample to re-boost numbers of the main ethnic groups and to draw in new immigrant groups that have moved to the UK since we started. So the kinds of things you might want to think about, in thinking about how you would use Understanding Society for your research or what makes it different: it's a representative sample of everybody. So when we come to talk about these biological data, we have it on everyone over the age of 16, so you can think about how things differ in different parts of the population. It's an annual study, so everyone gets interviewed annually, but it's continuous, so you can use it to think about before and after things. We happened to ask the EU Brexit question last year, so some people are now doing analyses of people's attitudes before and after the referendum result. I've already mentioned BHPS. Everyone in the household is interviewed, so you can think about how a mum's employment affects the kids or affects her husband, either at the same time or over time.
It's a large sample, which means that we have enough people for you to look at subgroups of the population. It was designed so that there are 10,000 people per decade of age, and we get about 500 new births a year. So there's a whole range of things you can do to look at specific groups that you might be interested in. It includes the four countries of the UK, and so as different countries introduce policies in different ways you can think about how you might be able to compare them to look at the things you're interested in. You've got natural experiments going on because of devolution and other differences. I've mentioned the boost samples. We do ask consent to link to administrative data. We're already linked to education data, if that's what you're interested in. And I met Graham, Mike's colleague, at reception last night and he tells me it will be linked to DWP by, can I say, by March. But we will see whether that's real. We're well on the way to linking to DWP data, hopefully. I talked about the Innovation Panel briefly and I'll come back to it at the end in terms of what we're doing on health experiments. I'm not going to give you a chance to ask me questions about Understanding Society now, because that's not what this talk's really about. If you do have questions, then in breaks and things I'm happy to answer them. And I answered lots of questions over dinner last night about personality scales, I seem to remember. Okay, so the reason that we in Understanding Society, and people more generally, think it's really important to bring biological and social data together is that in the biological world, in the medical world, people have this really rich information on people's health. But they tend to incorporate one measure of socioeconomic status and think that they're looking at how social factors affect health.
In contrast, in social sciences we often have self-report data where we might have one measure that says, over the last 12 months has your health been good, fair or poor. So we treat health as a unitary concept, and then we have really rich information on people's social situation. But how you can then unpick those things when you don't have that richness together is a problem. And so in a number of studies, and in Understanding Society, we felt it was really important to bring those two sides together, to have really rich information on people's health and really rich information on people's socioeconomic status. So what we've done, and what we'll be talking about in my session this morning, is we have what are called biomarkers. Biomarkers are objective measures of biological processes, and it may be a normal process. I mean it's normal, there's a normal distribution: we can all breathe, and it measures how much we can breathe. But obviously one of the key things we do is to indicate where there's a problem, to use those measures to think who is it that has a problem with their lung function or whatever the measure is. A biomarker may be a simple measure, so your height is a biomarker. We then take different physical measures, so that's blood pressure, lung function, those kinds of things. And then perhaps the one we most focus on is that we take blood samples, and we use them to identify different biomarkers that measure either diseases you might have, so diabetes or something, or how high your cholesterol is or what your liver function is. So a whole set of markers for different physiological systems. The reason we think it's really useful to do this in a social survey, as I said, is about having richness on both sides of the thing we're interested in: biomarkers give us earlier, more precise and more objective measures of people's health and illness.
So we can identify, and I'll show you some data on this in a minute, for example whether or not people have diabetes before they actually know they have diabetes. That means that you can look at the whole distribution of people who might have a health problem, not just those who've experienced it and gone to a GP and had it diagnosed. It means we can look at things that are predicting people's risks for ill health in the future, and therefore we can think about what we can do to prevent that. What are the intervention points, if we know at what stage these biomarkers start to look a bit high or abnormal in some way? Given that we can look at things before they manifest themselves as illness and disease, we can start to think about some of the pathways that might link people's social factors to their health or their health to their social factors. So if we think that stress, for example, is a key way that social factors might influence your health, we can start to think about biomarkers that we think would be particularly affected by stress, and look at whether those things are changing between seeing a difference in people's social circumstances and seeing a change in their health. By looking at the genetic side of the data that we have, you can start to ask whether there are some biological underpinnings to some of the things we think are social. I'm assuming Melinda might be talking about that, and also about gene-environment interaction. It may not be that your genetics are affecting something directly, but that they're interacting with something else. So for example we know that there's a gene-environment interaction for pollution and lung function: some people are more susceptible to lung function problems if they're in situations of pollution.
And then although your genes don't change at all in your life, there's something called epigenetics, which is how those genes get switched on and off or altered because of things that happen to you during your life. That's something we have data on, and I will be giving you an example of that in a minute. So although your genetics stays the same, epigenetics means that whether or not those genes work properly may change. And a lot of this thinking about why we might be interested in this is about how it is we can intervene to address the way in which people's social factors impact on their health or their health impacts on social factors. I've said that already. So what I'm going to do now is give you a series of examples that I hope illustrate some of those points about the value of having these more objective, more precise measures of health when we're trying to understand how people's social lives and health interact with each other. Does anyone want to ask any questions yet? So the first thing is that we can start to think about how, as I said, different aspects of people's health develop and decline over their lives. There is a natural process to that. In early life, for example, your lung function develops, and your body mass and all your muscles do. So in this early stage of your life, different kinds of biomarkers, or different kinds of physiological systems, are developing. At some point you reach a peak, and then gradually as you get older those things tend to decline. There are some which go the other way. And so we can think about what might happen if you're disadvantaged: it may be that you don't rise as high or you decline more quickly. And you can start to think about when those things are happening in people's lives and what you can do about them. So this graph is about grip strength. Grip strength is measured with the machine shown at the top.
You squeeze it really hard and it gives you a measure of your upper body strength. That's a really good predictor of frailty and mortality in later life. This just shows you the grip strength for men and women. Men have much stronger grip strength than women. It peaks a bit higher, in the late 30s. But women seem to decline a bit more slowly. And then when you look at socioeconomic groups, the differences aren't quite as stark as this gender difference, but there's the same sort of thing: more disadvantaged people peak earlier and then they decline more quickly. So that helps you to start to think about, as I say, different intervention points. A different example. This is CRP. I'm going to talk about this a bit later, but this is a measure of inflammation. Inflammation is one of the ways that we think stress may affect your biological systems and hence go on to create ill health. Higher values of CRP are bad. What this shows is the difference in CRP across the whole lifespan for people with different educational qualifications. This olive green colour is people with no qualifications. Although earlier in life they start with low CRP, so it's not that bad compared to the other groups, by the time they hit their 30s it has become much higher. And then it stays really high throughout middle life and old life. Where this begins to decline here is probably much more an issue of selective mortality than of their CRP levels getting better. In contrast, those people with a degree stay low pretty much throughout that whole period. So again, we can see how these social differences in biomarkers, which are not necessarily manifesting themselves in ill health yet, emerge quite early in life and then get worse, or the gap between people gets worse.
So another example, starting to think about how we bring this together with social data, is to ask why it is that people might have a biomarker, or might have a disease, but then don't report it, or aren't aware of it, or don't engage with health services as a way of treating it. The example I'm using is diabetes. It may be that people have diabetes but they're not really picking up on the symptoms of it, and so they're not going to a GP to see what's wrong and get it treated. They might be aware that they have these symptoms, they're tired, a bit lethargic or whatever, but be really busy or distracted by other problems in their lives and hence not go and seek treatment. It may be that they've sought treatment, they know they're diabetic, and the GP has told them they need to change their lifestyle and diet to manage that diabetes, but they don't manage to do that. Or it may be that the GP gives them medication but they don't adhere to the medication regime, and therefore that's not working well. So there's a whole set of social factors in the natural history of a disease and getting it treated that might affect the relationship between what is actually going on underneath in terms of the biology and what people are experiencing on the surface. So this is looking at diabetes both as measured by a biomarker and as people's self-report. The left-hand, bluey-purple bar is the biomarker. So slightly over 4% of women and 6% of men in Understanding Society have a biomarker indicating that their blood sugar levels suggest they have diabetes. The red bars are those people who tell us they have diabetes. The green bars are those people who were taking medication that was for diabetes, and the grey bar is anyone who has any one of those things.
And then when you break that up, what you find is that quite a lot more men than women know they have diabetes but are poorly managing it, i.e. they're taking medication or they've been told to adjust their lifestyle, but they're not actually doing that enough to keep their blood sugar levels below the at-risk level. The red bars are those people who, according to their blood levels, have diabetes but don't know about it. They haven't told us they've got it, and they're not on medication that suggests they've got it, so they probably don't know that they have it. The green bars are the people who are okay. They've got diabetes but they're managing it well. They know they have it, they may or may not be on medication, and their blood sugar level suggests that everything is under control. And the orange bars are the people who were on diabetes medication but didn't tell us they had it, so they don't seem to be aware of it. Maybe they're on lots of different medications for different things and they don't know, or maybe they chose not to tell us, but for whatever reason they were being treated for diabetes and didn't tell us about it in the survey. And when you look at that by social class, perhaps not surprisingly, what you see is that poorly managed diabetes, so those people who know they've got it and might be on medication but the way they're managing it isn't doing enough to keep their sugar levels low, is much less likely amongst those with degrees or A-levels than amongst those with no qualifications. For undiagnosed diabetes, there doesn't seem to be such a social gradient.
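To make that grouping concrete, here is a minimal sketch of how the three indicators just described (a raised blood sugar biomarker, a self-report of diabetes, and being on diabetes medication) could be combined into the groups shown on the slide. The function name, the category labels and the example records are illustrative assumptions, not Understanding Society variable names, and the threshold that makes a marker count as raised is left outside the sketch. The talk does not say how someone on medication who did not report diabetes but had a raised marker was treated, so the sketch assumes they fall in the treated-but-not-reported group.

```python
from collections import Counter


def classify_diabetes(marker_high: bool, self_reported: bool, on_medication: bool) -> str:
    """Assign one respondent to a diabetes group (hypothetical labels)."""
    if self_reported:
        # Knows about the condition: managed well only if the marker is under control.
        return "poorly_managed" if marker_high else "well_managed"
    if on_medication:
        # Treated for diabetes but did not report it in the survey
        # (assumption: this applies whatever the marker shows).
        return "treated_not_reported"
    if marker_high:
        # Marker suggests diabetes, with no report and no medication.
        return "undiagnosed"
    return "no_indication"


# Made-up records: (marker_high, self_reported, on_medication)
respondents = [
    (True, True, True),
    (True, False, False),
    (False, True, False),
    (False, False, True),
    (False, False, False),
]

counts = Counter(classify_diabetes(*r) for r in respondents)
for group, n in sorted(counts.items()):
    print(group, n)
```

Cross-tabulating a variable like this against qualifications is the kind of step that would sit behind the social class comparison described above.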
So the next example I wanted to give is around that question that those of us in social science have analysed for a very long time: overall, over the last 12 months, how do you rate your health, excellent, good, fair or poor? We know that variable is really well associated with different levels of mortality and morbidity and use of health services, and it's a really good single question to predict a whole set of other things to do with health. But we also know there are systematic biases in it, and that worries us when we are using it as a way of allocating resources or thinking about need. If I were to ask you that question now, what would happen, so the psychologists tell us, is that you'd first of all think about what is health, what do I mean by health, what's my health. You'd then think, well, how is my health compared to the people I'm sat with, my family, people of my age, what I need to do today. Then you would look at the scale you're presented with, which generally is a five-point scale, and you'd pick a point, and you'd do all that in milliseconds. So that's how you answer that question, and yet we treat it very much as this measure that predicts everything. In that momentary assessment you make a really good assessment of how your health is, but we do know that there are differences by social group. So what we tried to do is say, well, can the biomarkers help us understand what those differences are? Are there different kinds of biomarkers that measure different kinds of health, that might help us understand why it is that people systematically answer this question differently? And so we grouped the biomarkers we had into these different groups. First of all there were what we called visible biomarkers, so your BMI, your waist circumference, how much body fat you've got, so that was one
set. We then looked at things about your physical fitness, how well you feel you can function, so your heart rate, your grip strength, that measure I just talked about, and lung function, because that is about how you can be physically active. There were then a set of measures that we think really are a proxy for, or a way of looking at, fatigue, so if you feel tired and lethargic: the inflammatory markers, markers of anaemia, and a marker for an immune problem. And then we looked at those markers that are about disease. So for example, if you have high blood pressure and you've reported you've got it or you're taking medication, we noted that as a disease that you know you have, because if you know you have a disease you might answer that question differently than if you don't know you have a disease. So that was the disease known group, and then there was the disease not known group, so those people who have high measures on these different things but haven't told us they've got it. Then we looked at the literature to think, well, who would we expect to report these things systematically differently? Mildred Blaxter, a really famous medical sociologist, did this work where she talked to people about their health and what it meant to be healthy. So for example, from that literature we might predict that women think of their visible biomarkers, their visible body, as being more important for their health than men do, whereas men might think of fitness as being more important than women do when they think about answering that question about health. And so we have hypotheses all the way through about who these different kinds of biomarkers might matter more to when answering the question about health. The other hypotheses we had were that fatigue might be more relevant to women than men, and to older people than younger people, and that knowing you have a disease might be more relevant
amongst older people and people in high income groups. Our main assumption about having a disease and not knowing it was that if you didn't know you had a disease, the relationship with this self-report health variable would be weaker than if you knew you had a disease. Some of what we found was that this was right: it is true that there's a much stronger correlation between visible biomarkers and self-report health for women than for men, and for younger people than older people, that physical fitness is more important for men in reporting their health than for women, and so on. But our hypotheses weren't all true, and some of those we were a bit puzzled about. So for example, if you have a disease and you don't know about it, there is a much stronger correlation than if you have a disease and you do know about it. Now you might actually think, well, there is a logic there: if you have a disease, you know it, and you're managing it because you're on medication, you might discount that a bit more in the way you answer that health question than if you're experiencing symptoms associated with having high blood pressure or whatever it is but you don't know about it. So it may be that some of our hypotheses weren't right, rather than the findings being counter-intuitive. But nevertheless, the point really in telling you about this here is that having the biomarkers enables us to start thinking about what it is about the way people think about their health that might be systematically different amongst different social groups. So the next example I wanted to give you is thinking about whether having these objective measures of health can help us have a bit more confidence when we think about how social factors and health might be related. This example is about work and health. We know when you look at data more generally that people who are in work tend to have better health than people who are unemployed, and so we assume that returning to work is going to improve
people's health, and a big part of some of the government's programmes is very much saying that this is the case. But then there's another literature that says not all work is good for health, that there are different aspects of work that might be negatively associated with your health. Most of the existing literature that's looked at this has been based on self-report health data and self-report data about the quality of work. So you might on the same day be asked, overall how is your health, and, do you feel you have much control in your work. You can imagine that if you're a very positive person you might answer both those things quite positively, and if you're a less positive person you might answer both those things negatively. So there's a lot of criticism of this literature, and in terms of thinking about policies that push this idea, it's based on this problematic assumption or problematic investigation. But biomarkers might give us objective measures with which to look at this question, and then we haven't got that problem of both parts of the association being based on self-report data. This is a project led by Tarani Chandola. What he did was look at people in Understanding Society who were out of work in one wave and went back to work in the next wave, and compare them with the people who stayed unemployed across the two waves. The biomarker measure that he used is something called allostatic load. Allostatic load is about the way that stress over time impacts on your physiological systems. It's a set of about eight different biomarkers for four different physiological systems, and it's meant to measure the cumulative burden on you of having stress over your lifetime. So the first bar is the predicted level of allostatic load, where high is bad, for those people who remained unemployed between the two waves, and so their allostatic load is about 2.7
or something. If someone moved into what was classified as a good quality job, in terms of questions that they answered about their control at work and different things like that, then their allostatic load was lower. It's not significantly lower, but it was lower, so that implies that possibly moving into a good quality job is a positive thing in terms of health. But if you had a poor quality job on a number of different measures, then it looked like that actually was worse for your health. There is a range of things that could be explaining this. It could be that the type of job that people go into, depending on their pre-existing health conditions, might be explaining this, but Tarani did control for health before this. He looked at a whole set of things that you might think would be due to selection effects, to income and education differences, a whole range of ways of trying to ensure that the differences he was looking at were more about the change of employment status than about background factors for people. So then my last example, before stopping and having a discussion about some of this, is about this idea that although your genetics don't change, your environment might impact on the way in which the genes control biological functions in your body. The epigenetics is a way of measuring something called methylation, which shows the extent to which each of your individual genes has been turned on or off, or harmed in some way that stops it working properly. One of the big things that people have been doing in epigenetics is creating what's called an epigenetic clock. The idea of this is to look at how much different parts of your genome have been affected by methylation, by this turning things on and off, as a way of measuring biological aging, this idea that if you're more disadvantaged you might biologically age faster than more advantaged people. And so there's a particular
famous clock in epigenetics created by Steve Horvath. It's based on epigenetics measured in blood, which is what we've done, and it's based on 353 markers, and we had 17 of them missing, so there are some issues about that. This clock has been shown to be really strongly associated with a range of what you might think of as aging conditions. So the first thing we did is just look at how this measure of biological aging based on epigenetics is associated with actual chronological aging. The red line is just where the two things are equal, and the other line is the fitted model for the actual data, and you can see they tend to diverge, particularly at the ends of the age range. Part of that might be that many of the studies that have looked at epigenetics before have been on quite narrow age ranges or quite small samples, and so having this really big age range in Understanding Society enables you to look at it on the whole age distribution, not just in selected groups or selected groups with health problems. I should say, because I'm telling you about the data later, that we only have epigenetics data for about 1,100 people, which perhaps doesn't sound very big in a social science setting, but it is one of the biggest data sets on epigenetics there is at the moment. So what this suggests is perhaps that the way the clock has been calculated until now may not take into account what you would expect in a normal general population. But when you use the clock to look at whether or not it's associated with socio-economic circumstances, so whether there is a correlation between people's social disadvantage and how much the methylation has changed in ways that are affecting people's biological ageing, what you find is that there is a significant association for childhood socio-economic status. So if you grew up in a semi-skilled or unskilled household, or in a household
where neither parent was working, then your epigenetic clock is likely to be significantly older than if you grew up in a house where your parents were more affluent. But you don't see the same association, it's not significant, with a range of measures of current socio-economic status; this is just illustrated with one. So there is some suggestion, which would fit with what we think biologically, that your circumstances in childhood are when some of the damage might be done to these biological ageing processes. So that is the end of the examples I wanted to share about why we might want to think about using biomarkers in social science research questions. I wanted to stop here so you have a chance to ask questions, and maybe think about some of the research questions that you could ask with these sorts of data, and then I'll go on and talk a bit more about the data and some of the things you need to think about in analysing them. So I don't know if anybody has any questions. So I'm going to say a little bit about the data we collected, and then about some of the issues you need to think about in analysing these kinds of data. I've got half an hour, is that right, till 10.30? Yep, okay. So when you think about the sorts of biomarkers you might collect, as I said, there are physical measures, which some people call biomeasures, like height, blood pressure, grip strength and lung function. And then there are the data that we have analysed in Understanding Society from blood, but we're looking at doing things in the future where we might take these data from saliva samples, from hair samples, from other bodily fluids that we won't mention, and a whole set of things. What you can do with them is identify clinical indicators of disease, so I'm going to give you an example in a minute around chronic kidney disease; things that are risk factors, so some of the inflammatory markers, they're risk
factors, we know they're associated with diseases in the future; things that we think are on the stress pathway; and then there are some novel indicators. I was at a meeting the other day and people talked about how there's a fashion in biomarkers. So a little while ago telomeres were all the fashion, so that was seen as a way of measuring biological ageing, and trends have moved on a little bit. And we have in Understanding Society examples of all of them. So what we did was we sent a nurse into people's homes, in Wave 2 for the general population sample and Wave 3 for the British Household Panel sample. As we were just talking about, the ethnic minority boosts were excluded. You were eligible to take part if you had done the main interview in that wave; if you lived in Great Britain, so we didn't do this in Northern Ireland because we couldn't get a nurse workforce in Northern Ireland; and if you spoke English in your main interview, so we have translators in the study for the general interview but we didn't have that for the nurse interview. And then there were not enough research nurses in Britain to run this study for Understanding Society, so we had to take a random sample of people in order to match the nurse workforce that existed at the time. The nurse interview took place about five months after the main interview. So in the main interview people were asked whether they would like to be followed up with a nurse interview, and if they said yes, they were. The interview and taking all these samples lasted about an hour. Not everybody took part, and so one thing you need to think about if you're using these data is who took part. I'm struggling to see this table from here, so I'm guessing you might be too from down there, but we have almost 21,000 people who did the nurse visit, and so they have the physical measures and all those sorts of data. 14,000 people gave blood. Sometimes when you take blood, the blood might not be
taken well, or it was posted to the clinic so it might have got stuck in the post, so there are a range of reasons why, even if someone gave blood, you might not be able to produce some of these analytes. We produced analytes for 13,000 people. And I think I've said this before, but the physical measures we did were height, weight, waist circumference, body fat, a range of measures of lung function, blood pressure, heart rate and grip strength. And then we asked a whole set of questions about what their health was like on the day and what other things they were doing on the day because, as I'll show you later, some of those things affect these measures, and so you need to take them into account when you're analysing them. For the blood samples, there are a few people who we couldn't ask to give us blood, for different reasons. As we were talking about just now, the consent was for future research; we didn't specify what we were going to do with the blood because at the time we didn't have funds to do anything with the blood. We asked for a separate consent for the genetics, to extract DNA and do genetic analysis. There's a set of issues around the way the blood was collected that mean only some kinds of measures can be produced. So some measures need fresh blood, but we froze the blood, and that limits the things we can do. So when we bid for the grant to do these blood analyses, we wanted to think about what researchers might want to do, and so we thought about what biomarkers we should include in that list in terms of these sorts of criteria: where you might expect there to be some kind of environmental factor associated with them; where they were on the pathway to major health conditions that a reasonable proportion of the general population might have, so that you would have a big enough sample within Understanding Society to do analyses; and we took account of the way that we had collected and processed
the blood. And then we have a range of measures. So we have lipids, which are ways of measuring how much fat there is in your blood. As we've talked about a lot already, we have this measure of blood sugar. We have inflammatory markers; we have a marker for wear and tear on your immune system; some anaemia markers, which are about poor nutrition; liver function; kidney function; and then a set of different hormones that we think are associated either with development when you're young or with decline as you get older. So testosterone is all about building up muscles, about development, and some of the others we have, like DHEAS, are about the way things decline as you get older. For genetics we used something called a CoreExome chip. Are you going to be talking about this? What that means is that you have the whole genome, and they identified SNPs along the genome which are really good at predicting other SNPs, so they only measured 500,000 SNPs and then they used that to impute 8 million. So there's a big set of genetics data. The genetics were done for people who consented to it and, going back to the discussion that was happening earlier, only for people of white European descent, and that's because this particular chip was designed for that population group. For the epigenetics, because this is quite expensive, we did it, as I said, for about 1,100 people, and so we wanted to think about the people who it would be most valuable to have these data for. So we obviously had to do it for people who had the genetics data, and there were some things about the way the blood was processed that were important. We did it on people in the BHPS, and people for whom we had at least 10 years of data in the BHPS, because we thought if you are interested in how the environment over time has affected your epigenetics, they might be the people that had the most interesting things that you could look
at. And again, so we have 1,100 people and about 850,000 epigenetic sites that the data were collected for. These numbers sound quite scary, I think, to social scientists: you know, we have 8 million SNPs, 850,000 epigenetic sites. But the data are quite straightforward. So the genetics data is either 0, 1 or 2: you don't have this particular SNP, you inherited it from one parent, or you inherited it from both. And the epigenetics data are about the proportion of methylation that's taking place, so the number is between 0 and 1. So you have huge numbers of these data because you have them for so many different sites on your genome, but the actual data themselves are quite straightforward to understand. And then there are specialist packages that manage these two different kinds of data, which are basically based on R, and so then you can extract the information you need, or we would extract the information you need, and then you would get the data. Which is what I was just saying: if you just want genetics and epigenetics data you can get it straight away from a genetics archive; I'm guessing that's not interesting to people here. If you want the epigenetics and genetics data combined with social data then you have to apply to a committee called METADAC. METADAC works across different ESRC studies, and it looks at what you're wanting to do, not in terms of the scientific question but in terms of whether what you're proposing would be disclosive, because of the sensitive nature of the data. And if it is, they would ask you to think about having different variables, as opposed to rejecting your application, and so hopefully then your application would get passed and you would be given the data. I just said all that. So I now wanted to say a little bit about some of the things you need to think about in terms of analysing the data, unless anyone wants to ask a question about what data there are. So when you use biomarker data, in addition to the usual
things you might think about when considering how to analyse a variable, there are a set of things you need to think about because this is a different kind of data. So you need to think about whether there are clinically feasible ranges for this particular measure that you ought to know about. If blood pressure is over 500 or something, you need to realise that person isn't alive, so it suggests there might be something wrong with those data. So it's just a slightly different version of thinking about outliers. You need to think about things that might have happened to the person, or that they might have done, close to taking the measure, because that might affect the measure in terms of the thing you're interested in. So if they've had an operation, then some of the inflammatory markers will be really high because they've had an operation, as opposed to because of the things you're interested in. Smoking, eating particular things, drinking: all of those might affect some of the blood or blood pressure results. Then there is when and how the blood was taken. So I'm going to show you an example in a minute where we think the time of day at which different groups of participants in the study were interviewed might be affecting some of the results that we've got, so you need to think about that and control for those things in your analyses. Then other conditions: it may be that you're seeing high levels of a particular biomarker not because of the disease you mainly associate with that biomarker but because of something else, so you need to think about comorbidities. And then medications. Medications are used by people to control things like diabetes, so if you are analysing diabetes or high blood pressure, what do you do about those people who are actually taking medications to control those things? And then what do you do if you are analysing something else but you know that actually high blood pressure medication might affect
it? So there are things like that that you need to think about. And then there are some things that might help us. So for some conditions there are internationally agreed ways of standardising and using these data. And what we've done, for every one of the biomarkers available in Understanding Society, is we've got a glossary that goes through all those things and tells you what you need to worry about, and that's on our website. So to give you a couple of examples; I won't do this one, I'll do the next one. So ferritin is a measure of iron that's stored in your blood, and so it's quite important in terms of anaemia, which is what you think of in terms of iron in the blood. Anaemia, or low levels of iron, is generally much more prevalent in women; it's associated with fatigue and other things. But actually one thing you need to think about is that often with some of these measures both ends of the distribution indicate that there's a problem. So with anaemia we tend to think about low iron in the blood being a problem, and hence about the groups in society and other things that are associated with it, but actually there's another kind of problem associated with iron, which is having far too much iron in the blood, and that tends to be much more common in men and is associated with subsequent heart disease. There are clinical cut-offs for this that are in the guide, and this is what it looks like. So for men iron overload is much more of an issue and it increases with age, and for women iron depletion, or low levels of iron, is much more important, and that decreases with age. So this next biomarker I wanted to talk about is, again, just to make you think about the things you need to think about: testosterone. So this is a hormone associated with growth and development and building muscle strength, and there's quite a lot of interest in it in social science. So it was associated with the stock market crash, it's seen as a kind of male social
behaviour. There have been studies using testosterone suggesting that people who are self-employed have higher levels of testosterone, and other things. And because of the way it's measured you can only really use the data to look at testosterone in men; the test we did wasn't really sensitive enough to detect levels in women. But this is also the one where time of day matters. So we know that testosterone systematically varies during the day, and we expect testosterone to decline with age for men, but it starts to go up here around 64. And we think that's because of the way the interviews worked, or rather we know that's why, because we looked at it: once you're retired you tend to be interviewed in the morning or during the day, whereas if you're working you're interviewed in the evening. So this looks like a really strange example of testosterone not doing what we think it should do, but actually you need to think about how that interacts with the way people take part in surveys. There are some biomarkers where you need to know how people think about them in the literature or in clinical settings in order to interpret the data or use them effectively. So for kidney disease there are international standard formulae that take the blood data that we have, and you have to apply different levels for different age groups and ethnic groups and genders in order to get a measure that indicates whether or not people have kidney disease. And all this is available in the glossary if you're interested in doing it; this is just to make you aware there are things you need to think about. And perhaps not surprisingly, what you find is that levels of kidney disease increase with age. So we have funding to try and encourage people to use these data, and we do various events and provide various resources. And a key thing to know if you're interested in this is that we currently have a call out for biomarker fellowships, so this is funding
for up to a year if you wanted to do a project using the biomarker data or the genetics data. The deadline is next Thursday, but that is possible if you are interested, and I'd be happy to talk to you about it. We then have various workshops that we do, again to try and encourage it. So we're planning a genetics one in February and another biomarker one in April, because by then we will have added some new biomarkers to the ones we already have. Dan Benjamin, who's an economist and runs the Social Science Genetic Association Consortium, is coming to do a master class in late spring, and then we have an annual conference that's just about the use of these data that you might be interested in. But we've talked a little bit about the fact that we have only measured these data once so far, and these data were really expensive to collect, but obviously for lots and lots of the questions that you might want to ask, really what you want is to have change in these data; you want change over time. And at the same time as we've collected these data, or since then, we've moved to a way of conducting the main study where some people are only interviewed on the web rather than face to face. And so what we've started to think about is how we can collect biomarker data again so we can have effective ways of looking at change. So it needs to be comparable to the data we've already collected, but taking into account the fact that we'll probably never get funding to do it in the way we did it before on the whole sample, and that lots of our respondents are not used to having an interviewer come to their house any more because they're on the web. So can we collect biomarker data that's robust and high quality but nevertheless takes account of these different ways of conducting a survey that we use now? And at the same time as thinking about the way the survey world has changed for our respondents, technology has changed, so things we could only previously do in blood we can now do in hair samples.
We're really interested in things like the microbiome, which requires stool samples. We talked a little bit earlier about whether we can do things by smartphone, and different ways of asking people questions. Do we need nurses? There are no nurses in the room, are there? Do we need nurses any more to do data collection for these kinds of biological measures? Can ordinary interviewers collect these data? Can respondents collect these data for themselves? And again, we talked a little bit about this earlier, the response rate we got in Understanding Society to giving blood or doing this nurse interview is a bit lower than in some of the medical studies. So is that because if you take part in a medical study it's very salient to you and therefore you're more likely to be willing to give blood, or was it that we didn't give feedback, so why would you give us your blood if we're not giving you anything back from it? So there's a set of questions that we're interested in. So in the Innovation Panel that we have, which will go into the field in 2019, we're doing an experiment. It's purely about health and we've got three arms going on. So there's one where people will get a traditional nurse visit, the way we did it before, to get what you might think of as medical gold standard measures; they'll take blood samples and all those sorts of things. There's one where one of our ordinary interviewers will go to the respondent's house and do the questionnaire and explain to them all about these different samples. And there's one where the respondent will do the interview on the web and there'll be videos to explain to the respondent how to do these measures themselves. So clearly that's not a full blood sample; what we're doing is blood spots. So, I don't know if you've seen people who have diabetes, they often do one of these pinprick things. So you can do that and get people to send you the paper that you press your thumb on to afterwards, to do some of the same blood analyses that we've already done. We're getting people to send us hair samples.
So you need a bit of your hair from here, only a little bit, and that allows us to analyse a whole set of data around different hormones and exposure to toxins, so lead and all those sorts of things in the atmosphere. And then we're also doing an experiment, which we were talking about earlier, about whether or not giving people feedback affects either their willingness to take part or their subsequent behaviour. Sorry, I've got a different slide on this thing in front of me. So we have funding to do this. We're really interested in both who is willing to do this kind of data collection in these different ways and what the quality of the data is. That's what we're doing, but next month, December, we're doing a call so that you could propose experiments if you wanted to. So we had a workshop about this, and people are thinking about experiments, for example, where you take a photo of your body and you can use that to measure body fat and body size in some ways. I think I mentioned people doing in-the-moment mental health measures by smartphone. We're talking about other samples that we might collect, and how acceptable it would be to respondents to ask them to give us urine or stool samples, because you can get at very different sorts of measures with that. So there's a whole set of things that people are interested in, but this call is open to everybody to propose an idea. So we're going to launch it in December. Given this is a bit different to our normal experiments, we're wanting people to say in January if they're interested in doing something, because there's no point you developing a big protocol for how to collect something if we've had 100 people interested, because we only have a 50-minute interview, so we'd need to think about how to handle that if we had a lot of interest. But the data will be collected in 2019 and be available in 2020. So if you're interested in experimentation, or
thinking about how you could do something in your own study and you want to try it somewhere first, that might be the place to go. And that's all I wanted to say. Do you have any more questions?