Welcome to this series of presentations on some of the methodological considerations researchers should keep in mind when they're doing biosocial research. My name is Tarani Chandola. I'm from the University of Manchester and part of the National Centre for Research Methods. I'd also like to acknowledge the Understanding Society Biomarker team, who contributed a lot of the material and slides to this set of presentations.
First of all, I'd just like to acknowledge that the methodological considerations we should keep in mind when doing biosocial research are not special or new. They are the same sorts of issues that should be kept in mind when doing standard biological or social science research. It's just that some of these methodological issues get really highlighted and emphasised because of the nature of biosocial data, and I'll be going through some examples in which these issues really get emphasised.
This series of talks is divided into three components. First, I'll be talking about the need for a biosocial research framework. Secondly, I'll be going into some of the data quality considerations researchers should keep in mind when they're doing biosocial research. And finally, I'll be talking about some of the missing data implications in biosocial research.
So the very first part is about why we need a biosocial research framework. For those of you who are completely new to the area and want to know what biosocial research is, I can recommend a really excellent talk by Professor Michaela Benzeval on the topic "What is biosocial research?". I've got the YouTube link over here. It's part of a series of talks she has done on biosocial research, and this is one she did for the NCRM.
Why do researchers want to combine biological and social data?
And that's perhaps the most important question in determining the kind of research framework we use in approaching biosocial research. There are a number of reasons why people are interested in combining biological and social data. For example, people might be interested in using biomarkers as an objective measure of physical functioning, health and illness. They might be interested in using biomarkers to look at the pathways between social factors and health, or in using biomarkers to understand how biological factors act as distal causes that influence social outcomes. In line with that, biomarkers can also be used to understand gene-environment interactions. I'll be going through some example slides of how each of these biosocial frameworks can help in understanding the associations between biological and social data.
Biomarkers have often been used as a better, more objective measure of health, and that's because self-reported health collected in standard surveys often has a lot of bias. It depends on people's own perceptions of their health, so it's very subjective. It also depends on people being aware that they have particular diseases or conditions. So you can imagine that self-reported health might be considerably affected by all of these biases, including a person's mood on the day they were sampled, whereas a biomarker, which is based on an objective clinical measure, may seem to be free from such biases. But actually, there are advantages and disadvantages to using biomarkers in place of self-reported health, and I'll go through a few examples of that.
In the Labour Force Survey, for example, they measure work-related stress, which is my own field of research. They measure stress by asking people: have you suffered from any illness or disability, physical or mental problem, that was caused or made worse by your job?
And if people say yes to that, they are then shown a list of conditions and asked how they would describe this illness. One of those descriptions is stress, depression or anxiety. So if people say that their health was made worse by work because of stress, depression or anxiety, that's taken as an indication that they have work-related stress.
Now you can imagine that there are quite a few biases associated with this measure. Individuals are asked to self-report any work-related illness they believe they have suffered over the previous 12 months, so it really depends on their ability and willingness to self-diagnose these links. They have to be, in a sense, epidemiologists or medical doctors, because they have to ascribe the cause of the illness to work. Maybe there is a link, maybe there isn't. And people may actually fail to recognise a link with their working conditions when there is one. So there are all kinds of reasons why work-related stress might be hard to measure from a questionnaire or interview.
In contrast, a lot of people like using biomarkers of stress. There are well-known physiological systems which produce stress hormones, which in turn activate the body's stress symptoms. This gets a number of hormones, such as cortisol, adrenaline and noradrenaline, going in your system, and this has been measured in quite a few biosocial studies. It's also important to remember that it's not just the activation of these stress hormones during an acute stress response that matters, but also the recovery. This points to one of the disadvantages of measuring biological data as opposed to self-reported data: you have to measure this physiological stress reaction over a considerable period of time.
And it costs a lot of money to measure biological stress response systems in the first place, and measuring them over a long period of time adds considerably to that cost. So you can imagine that asking people a question in a questionnaire might be a lot less costly than trying to measure their biological stress responses.
Biomarkers have also been used to measure how people grow, develop and change over the life course. A number of biomarkers are useful for measuring life course developmental processes, growth hormone for example. High levels of growth hormone or testosterone in early life might be indicative of one kind of process, whereas high levels of the same hormones in later life may mean something very different. So it's important to keep in mind the stage of the life course at which these particular biomarkers are measured, because they have different implications and different meanings.
Some of the possible gene-environment interactions are shown in this slide. The G denotes genetic factors and the P denotes phenotypes. Sometimes these are biological phenotypes, sometimes they are psychological or personality phenotypes. The E denotes environment, and in gene-environment analysis the environment is very often the social environment. The Y denotes a distal outcome. Usually it's health, but it can be another kind of outcome people are interested in.
In the top left hand corner we've got a causal process in which the genetic factors influence the distal outcome, say health, independently of the social environmental factors. The genetic factors are influencing the phenotype, which, as I said, is usually a biological phenotype. In the top right hand corner we've got effect modification going on.
So the genetic factors, which influence the biological, personality or psychological phenotype, modify the association between social factors and the distal Y, the health outcome.
In the bottom left hand corner we've got a process where the genetic factors are actually causing the environmental response. A lot of the time people have looked at how genetic factors are correlated with educational attainment, for example, or intelligence. So there's a whole process by which genetic factors can influence the phenotypes which result in low educational attainment, and there's been a whole series of studies that have tried to disentangle the way in which biology is the independent variable that affects social environmental outcomes.
And in the bottom right hand corner we've got an example where genetic factors influence the phenotype, which is actually causing both the environmental factor and the distal Y, the distal health outcome. So the association between the social environment and the health outcome is confounded by the genetic factors.
So in each of these examples of biomarker research we really need to be very careful in trying to find out what the association between the social and biological data is, and that needs careful consideration within the relevant theoretical framework. If we don't have that theoretical framework, we are in danger of making multiple comparisons and hunting for the associations with the lowest p-values, which is basically unscientific and non-reproducible. That's because with biosocial data we have hundreds, maybe thousands, of variables relating both to somebody's biology and to their survey responses.
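The bottom right hand scenario, where the association between the social environment and health is confounded, can be illustrated with a small simulation. This is only a sketch under assumed values: the unit effect sizes, the normally distributed noise, the sample size and the function names are all hypothetical and not taken from the slides. In this setup E has no causal effect on Y at all, yet a naive regression of Y on E finds a clear association, which disappears once the phenotype P (which carries the genetic influence here) is partialled out of both variables.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Part of y left over after removing its linear dependence on x."""
    a, b = ols_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def simulate(n=20000, seed=1):
    """Assumed structure: G -> P, P -> E, P -> Y, and no arrow from E to Y."""
    rng = random.Random(seed)
    G = [rng.gauss(0, 1) for _ in range(n)]      # genetic factor
    P = [g + rng.gauss(0, 1) for g in G]         # phenotype, caused by G
    E = [p + rng.gauss(0, 1) for p in P]         # social environment, caused by P
    Y = [p + rng.gauss(0, 1) for p in P]         # health outcome, caused by P only
    return G, P, E, Y


def naive_and_adjusted_slopes(n=20000, seed=1):
    """Slope of Y on E, before and after partialling out the phenotype P."""
    _, P, E, Y = simulate(n, seed)
    _, naive = ols_fit(E, Y)
    _, adjusted = ols_fit(residuals(P, E), residuals(P, Y))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = naive_and_adjusted_slopes()
    print(f"naive slope of Y on E:       {naive:.3f}")
    print(f"slope after adjusting for P: {adjusted:.3f}")
```

With these assumed effect sizes the naive slope is 2/3 in expectation, while the adjusted slope is close to zero. Which variable to adjust for is exactly the kind of decision that depends on the theoretical framework: the code only gives the right answer because the causal structure was specified in advance.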
And the temptation for a lot of researchers is just to run a lot of correlations and do p-hacking, for example, which would be nonsensical, because we need to be analysing these associations within the context of a particular biosocial theoretical framework.
And finally, I'd just like to emphasise the need for interdisciplinary research teams, because this kind of biosocial research framework requires expertise in both the biological sciences and the social sciences. Unless somebody is trained in both, it is quite hard for one person to come up with a relevant biosocial theoretical framework, which is why some of the best biosocial research relies on multidisciplinary, interdisciplinary research teams.
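To make the earlier warning about multiple comparisons concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: 1,000 hypothetical biomarkers that are pure noise, 200 participants, and a p-value computed with the Fisher z approximation rather than an exact test. The Bonferroni threshold is shown only as one standard correction; it is not something the talk prescribes.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def count_false_positives(n_people=200, n_biomarkers=1000, alpha=0.05, seed=1):
    """Correlate many unrelated noise 'biomarkers' with one noise outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]
    p_values = []
    for _ in range(n_biomarkers):
        # Each biomarker is generated independently of the outcome,
        # so every "significant" result below is a false positive.
        biomarker = [rng.gauss(0, 1) for _ in range(n_people)]
        r = pearson_r(biomarker, outcome)
        p_values.append(approx_p_value(r, n_people))
    uncorrected = sum(p < alpha for p in p_values)
    bonferroni = sum(p < alpha / n_biomarkers for p in p_values)
    return uncorrected, bonferroni, min(p_values)


if __name__ == "__main__":
    uncorrected, bonferroni, smallest = count_false_positives()
    print(f"p < 0.05 without correction: {uncorrected} of 1000")
    print(f"p < 0.05 / 1000 (Bonferroni): {bonferroni} of 1000")
    print(f"smallest p-value found:       {smallest:.5f}")
```

Because none of the simulated biomarkers is related to the outcome, roughly 5% of them would still be expected to fall below p = 0.05 by chance. Picking out the lowest p-values from such a scan, without a prior hypothesis, reports noise as a finding.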