Hello, everyone, and a very warm welcome to this webinar, where we'll be highlighting data available in the UK for investigating obesity. So first, some introductions. I'm Professor Rebecca Hardy. I'm currently at the MRC Unit for Lifelong Health and Ageing at University College London, but as of next month I'll become the director of CLOSER, also based at UCL. Also presenting today are Dr. Vanessa Higgins from the UK Data Service, University of Manchester, and Dr. Michelle Morris from the CDRC, University of Leeds. This is one of a series of webinars jointly organised by a number of data resources which are funded by the ESRC. Coming up on the 2nd of April we have Education in the Long Run, which is from the UK Data Service and CeLSIUS. Previous webinars were about finding and using data for research on travel to work and on ethnicity and migration, and you can watch the recordings of these webinars on the website if you missed them. So the topic for today is obesity, and this is what we're going to cover. First of all, I will talk about CLOSER resources, and in particular the CLOSER harmonised longitudinal body size data set and how it can be used. Then Vanessa Higgins will be telling us about data available from the UK Data Service, things like the Health Survey for England and the National Child Measurement Programme. And then Michelle Morris will be talking about consumer data sets available from the CDRC. There should be time at the end for you to ask questions and for us to answer them. We won't take questions as we go along, so please type your questions into the box at any point and we'll answer as many of them as we can at the end. So I'll now start with the first presentation, which is about CLOSER and our resources for investigating obesity in the UK. The aim of CLOSER is to maximise the use, value and impact of the UK's longitudinal studies.
And I'll talk specifically about the CLOSER resources for investigating obesity rather than study-specific resources. So CLOSER has eight partner studies, which are listed here. There are the four national British birth cohorts, born in 1946, 1958, 1970 and 2000 to 2001. That last one is the MCS, the Millennium Cohort Study, at the bottom of this figure here. There's a large longitudinal household panel survey, Understanding Society. And then there are three regional MRC-funded cohort studies: the Hertfordshire Cohort Study, which is the oldest study in CLOSER, whose members were born in the 1930s; ALSPAC, which is from the Avon region around Bristol, born in the 1990s; and the Southampton Women's Survey, which actually recruited women prior to pregnancy, then followed those who became pregnant through their pregnancy and followed their children once they were born. So I wanted to get an idea of whether you have used any of these studies or whether you've heard of any of them. So can you answer this question: how many of you have either used or heard of any of these studies? Okay, so we have a mixture of responses. There's about a third of you who haven't used or heard of these. Some of you have used them and some have heard of them. So hopefully, by the end of this presentation, we'll have motivated those who haven't used these studies before to do so. First I wanted to briefly mention CLOSER Discovery. CLOSER Discovery is an online resource that enables researchers to search the data from our eight partner studies. It allows you to search before you actually go to the studies, to see whether they've got the relevant data to answer the research question that you're interested in. Now all CLOSER studies have some height and weight measures. Some of these are self-reported and some are measured. They also have questions related to weight, such as unintentional or intentional weight loss.
And I've shown here a screenshot from CLOSER Discovery for Understanding Society with a search by weight. I've used Understanding Society as it is not included in the CLOSER body size data set, which I'll focus on today, but Understanding Society is a key ESRC-funded study that has data on height and weight. As I briefly mentioned, Understanding Society is a household panel study consisting of around 40,000 households, collecting data annually on all members of those selected households, so it covers all ages. Most data from Understanding Society are available under end user licence from the UK Data Service, which you'll hear more about during this webinar. More details about Understanding Society and the data that they have are available from their website. So the main thing I want to talk about today is the harmonised CLOSER body size data set. CLOSER has a number of harmonisation work packages and has produced a number of these harmonised data sets. Specifically relating to body size, the data set contains five of the CLOSER studies. There are the four nationally representative British birth cohorts, the idea being to provide nationally representative data to track how body size changes across life and across different generations. ALSPAC was also included because there's a 30-year gap between the BCS, born in 1970, and the Millennium Cohort, born in 2001, where there was no national birth cohort. So ALSPAC isn't nationally representative, but we've included it to fill that important gap. This slide shows the main ages at which data were collected in each of these studies. You can see there are multiple measures within the different stages of the life course, and there are also measures at comparable ages across the cohorts. For example, we have measures at around the age of 7 and 10 or 11 across all the studies.
Now, with the exception of birth, where usually just weight and not length was collected, weight and height are available at these ages. From height and weight we can calculate body mass index, or BMI, as an indicator of adiposity. So this is a weight-for-height measure. And from BMI, age-specific cut-offs for overweight and obesity, and indeed underweight as well, can be calculated. Now it might seem straightforward to compare heights and weights across these studies, but even with these apparently simple measures, considerable work was required to make sure that they were as comparable as possible, both within studies at different ages and across studies, so that we could compare mean levels of BMI, distributions of BMI and the prevalence of overweight and obesity. So for example, in the early years of the older cohorts, height and weight were measured in imperial units, so that's inches for height, for example. These were all converted to metric measures. Then a standardised cleaning protocol was applied across all the studies to remove biologically implausible values. And finally we harmonised the samples to make them more comparable. The most restrictive sample is the 1946 birth cohort, and we therefore excluded multiple births, which are not included in that study, and also ethnic minorities, because the 1946 study is a cohort of those who were born in England, Scotland and Wales in 1946 and so includes very few ethnic minorities. Finally we checked the data set and the coding and documented it, and it has been deposited at the UK Data Service for use. There are some indicator variables as well with each set of height and weight measures, for example indicating whether the value was measured or self-reported. The data are made available under two licence types. The end user licence data just require you to register with the UK Data Service.
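As a rough illustration of the harmonisation steps just described (converting imperial measures to metric, dropping biologically implausible values, and deriving BMI), here is a minimal Python sketch. It is not the CLOSER cleaning protocol: the function names and the plausibility bounds are hypothetical placeholders chosen only for illustration, and the fixed thresholds of 25 and 30 apply to adults only, since children need age-specific cut-offs.

```python
from typing import Optional

INCH_TO_M = 0.0254         # exact definition of the inch in metres
POUND_TO_KG = 0.45359237   # exact definition of the pound in kilograms


def inches_to_metres(inches: float) -> float:
    """Convert a height recorded in inches to metres."""
    return inches * INCH_TO_M


def pounds_to_kg(pounds: float) -> float:
    """Convert a weight recorded in pounds to kilograms."""
    return pounds * POUND_TO_KG


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def clean_adult_bmi(weight_kg: float, height_m: float,
                    min_height_m: float = 1.0, max_height_m: float = 2.5,
                    min_weight_kg: float = 25.0,
                    max_weight_kg: float = 300.0) -> Optional[float]:
    """Return BMI, or None if height or weight falls outside the bounds.

    The default bounds are illustrative assumptions, not the thresholds
    used in any real cleaning protocol.
    """
    if not (min_height_m <= height_m <= max_height_m):
        return None
    if not (min_weight_kg <= weight_kg <= max_weight_kg):
        return None
    return bmi(weight_kg, height_m)


def adult_bmi_category(value: float) -> str:
    """Classify an adult BMI using the conventional fixed adult thresholds."""
    if value >= 30:
        return "obese"
    if value >= 25:
        return "overweight"
    if value < 18.5:
        return "underweight"
    return "normal weight"


# Example: a record stored as 70 inches and 154 pounds.
height_m = inches_to_metres(70)
weight_kg = pounds_to_kg(154)
example_bmi = clean_adult_bmi(weight_kg, height_m)
```

A record with a height of 17.5 m (for instance, a misplaced decimal point) would come back as `None` from `clean_adult_bmi` rather than producing a misleading BMI.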
The special licence access data require you to submit an application to the studies via UKDS, just explaining what your research project will be about. Now the data include CLOSER IDs that enable you to link with other harmonised data that CLOSER has produced, and I'll show an example of that in a minute. Updated versions of the data set, available under the end user licence, also include the original study identifiers, so you can also link these data back to the original study data. So what's the value of this data set? Well, the key advantage of longitudinal studies with these repeat measures of body size is that you're able to look at change within individuals, and by comparing across studies you can compare these age-related changes across different generations. So here we looked at BMI, and presented here are the probabilities of being overweight or obese in comparison to normal weight. What we're then able to do is to see how the obesity epidemic has developed by cohort and also within cohort by age. And what we see is that those born from the 1990s onwards, the youngest two cohorts, have rates of overweight or obesity in childhood approximately three times higher than the earlier-born cohorts. And then, in the older three cohorts that have reached adulthood, you can see that the rise in the probability of overweight or obesity starts at increasingly early ages. So this very powerfully shows, I think, the impact of the onset of a more obesogenic environment, and by that I mean one that promotes overconsumption of unhealthy foods and discourages physical activity. It's an environment that was prevalent from around the 1980s onwards and therefore hit the different generations at different stages in life, with the youngest cohorts being born into this environment.
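The linkage by shared identifier mentioned above amounts to a keyed join between two data sets. A minimal Python sketch follows, assuming each data set has been read into a list of records. The field names (`closer_id`, `bmi_age42`, `father_social_class`) and all values are invented for illustration and are not the variable names in the deposited data.

```python
def link_by_id(body_size_rows, other_rows, id_field="closer_id"):
    """Inner join of two lists of dict records on a shared identifier.

    Only individuals present in both data sets are kept.
    """
    other_by_id = {row[id_field]: row for row in other_rows}
    linked = []
    for row in body_size_rows:
        match = other_by_id.get(row[id_field])
        if match is not None:
            # Combine the fields from both records into one.
            linked.append({**match, **row})
    return linked


# Hypothetical records standing in for two harmonised data sets.
body_size = [
    {"closer_id": "A1", "bmi_age42": 24.1},
    {"closer_id": "A2", "bmi_age42": 29.3},
    {"closer_id": "A3", "bmi_age42": 31.0},
]
socioeconomic = [
    {"closer_id": "A1", "father_social_class": "non-manual"},
    {"closer_id": "A3", "father_social_class": "manual"},
]

linked = link_by_id(body_size, socioeconomic)
```

Here individual `A2` drops out of `linked` because there is no matching socioeconomic record; whether to keep or drop unmatched individuals is an analysis decision to make deliberately rather than by default.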
So as I said, the CLOSER resource now allows you to use these harmonised body size data sets and to link to other variables to investigate, for example, risk factors for the development of obesity, or how obesity influences subsequent health and social outcomes. As an example to show you what is possible, I'm showing these data, where we've merged the body size data set that I've been talking about with a harmonised data set from another CLOSER work package, which harmonised socioeconomic position variables. So here we're using harmonised father's social class as a marker of childhood socioeconomic position. We can now look at how social inequalities in body mass index have emerged and changed over the life course, again by cohort, and it is the four national cohorts presented here, as ALSPAC was actually excluded from these analyses. We can see social inequalities in childhood BMI only in the Millennium Cohort Study, with little evidence in the other studies. And then the other plot shows adult BMI by childhood social class, so we can see inequalities in all the adult cohorts, which widen with age. So remember, you see these inequalities by childhood social class despite there being little evidence of inequalities in BMI in childhood in these cohorts. So again, a very powerful demonstration of how, with the emergence of the obesity epidemic, we've also seen the emergence of social inequalities in high BMI and in overweight and obesity. I hope this gives you an idea of the sort of really important and policy-relevant information that can be gained by analyses of these data sets. And my final slide: thanks to the funders and CLOSER partners. So I'll now hand over to Vanessa, who's going to speak about the data available from the UK Data Service.
I'm Vanessa, I work at the UK Data Service. I also use some of the cross-sectional data sets that we have on obesity for my research.
So I'm going to be giving you a whirlwind tour in 10 minutes of the cross-sectional data that the UK Data Service makes available, and also some with related content, so diet and physical activity for example. Once I've been through those data I'll then talk about the benefits of using the data, give you some examples of use, and show how to find and access data on obesity from the Data Service. In summary, there are four key cross-sectional data sets that we provide access to that I think are really useful for obesity research, and these are the top four bullets. So there's the Health Survey for England, the Scottish Health Survey, the Welsh Health Survey, although that has now finished and the last data set is 2015, and the National Child Measurement Programme data set. These four data sets are particularly useful because they contain data on objectively measured obesity. That means that measurements such as weight and height for body mass index, and waist and hip circumference to measure abdominal obesity, are directly measured from the respondents during data collection. Other studies rely on subjectively measured data on obesity, so self-reported data, and self-reported data, as you can imagine, are prone to underreporting. For example, if I was asked how much I weigh, I might underestimate it because it's more desirable in Western society to weigh less. So self-reported data are seen as less reliable, and objectively measured data are the gold standard. These four at the top all have objectively measured data. They're also very useful because they're produced annually. So you can get up-to-date data, you can use them to compare trends over time, or you can pool more than one year within the survey to get larger sample sizes for sub-populations. And I said that the Welsh Health Survey had finished in 2015.
Some health questions have now switched over to the National Survey for Wales, but that contains subjectively measured weight and height, not objective measurements. So the bottom set of bullets are examples of other data sets that we hold at the Data Service that you might also find useful. I've already mentioned the National Survey for Wales. These surveys have some obesity content, but it's not the objectively measured content, and some of them only ask about obesity in certain years. So for example, the Northern Ireland survey (NI is Northern Ireland) doesn't have obesity measurements, but it contains questions about weight control and perceptions of children's weight. So there's some useful stuff in there, but it's just not the objective measurement. And then another example is the British Social Attitudes Survey, where from time to time there are relevant questions. So in 2015 there were questions on perceptions of obesity and the prioritisation of expenditure on health conditions, which included obesity. So it's worth going to the data catalogue, which I'll show you shortly, and having a look around, because there are obviously these key core data sets, but if you're interested in more than the objective measurements then there are other things around as well. My next few slides will focus purely on the Health Survey for England and the National Child Measurement Programme, because these are the most widely used of the data sets that I've mentioned so far. It's worth noting that the Scottish Health Survey is really similar to the Health Survey for England in terms of content and methodology. So the Health Survey for England is a really important survey for looking at changes in the health and lifestyle of people in England. It informs policy, it monitors the health of the nation, and it provides key estimates for things like obesity and various other things. It's commissioned by NHS Digital and it's carried out by NatCen and University College London.
It's been conducted annually since 1991, and the latest data available from the UK Data Service are the 2016 data. The sample size fluctuates from year to year. It's been relatively stable recently, and the current sample size is around 8,000 adults and 2,000 children. From time to time there are boosts. So for example, in 2004 there was an ethnic minority boost, so that there were more people from ethnic minority backgrounds in the sample. It has a complex sample design and it's representative of the population of England, so it's reliable. When a household is selected, all the adults within the household are asked to participate, and up to two children only. Interviewers then go into the household and conduct a face-to-face interview with each participant, and then there's a follow-up visit with everybody from a specially trained nurse to collect samples and measurements. So blood samples, delightful urine samples, waist circumference measurements, things like that. So it's very in-depth. Because it's a cross-sectional survey, it contains different respondents each year. So unlike the longitudinal cohort studies that Rebecca has spoken about, you can't track an individual's health over time. But what it does give you is that annual snapshot of the nation's health at one point in time. Every year the survey includes a set of core questions and core measurements, and you can see that height and weight, waist and hip, and also physical activity and fruit and veg consumption are covered every year. In addition, the blue box at the bottom left-hand side shows that each year there's either an additional module, an additional boost or a special focus on particular topics. So for obesity you might be interested in cardiovascular disease or physical activity, and I've just listed a few there. It's also worth noting two other things about the Health Survey for England, which are in the right-hand box at the bottom.
First is that in 2016 there were specific questions on weight management, which were really interesting, about whether people go on diets, whether people use fitness apps and lots of other questions. So go and take a look at that, it's really interesting. Second is that the UKDS holds a time series data set of the Health Survey for England on obesity, which I actually deposited with some colleagues because we were writing a paper about trends over time and we thought it might be useful, and the syntax is there too, so you can update it with more recent data if you want to. So you don't have to put all the data together yourselves, which is good. Okay, so I'm going to give you a poll now. In 2017, what percentage of adults in England do you think were overweight or obese according to the body mass index? So that's a body mass index of over 25. Okay, so we've got 13% who think that 28% is the right answer, 44% who think 51% is the right answer, and 44% who think 64% is the right answer. Well, 64% is the right answer. So the majority of the English population are either overweight or obese, shockingly. These data come from the Health Survey for England. So it's a whopping 64%, so well done to those of you who got that right. When combined together, yeah, it's two thirds of the population, but in terms of obesity, so a BMI of over 30, it's only, well it's not only, but it's 27% of men and 30% of women. And as I said, you can compare things over time. So what's really nice is that you can look back at what proportion were overweight or obese in 1993, for example. In 1993 it was only 53%. So it's risen by 11 percentage points during that time period, which is quite shocking really. So that's just an example of the kinds of interesting things you can look at with the Health Survey for England. Moving on to the National Child Measurement Programme. Again, this is funded by NHS Digital. Its aim is to monitor obesity amongst children and to inform planning.
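The comparison of 53% in 1993 with 64% in 2017 is a change in percentage points, which is not the same thing as a relative (percent) change. A small Python sketch of the arithmetic, using only the two figures quoted above; the function names are illustrative.

```python
def percentage_point_change(earlier_pct: float, later_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return later_pct - earlier_pct


def relative_change_pct(earlier_pct: float, later_pct: float) -> float:
    """Change expressed as a percentage of the earlier value."""
    return 100.0 * (later_pct - earlier_pct) / earlier_pct


# Prevalence of overweight or obesity among adults in England, as quoted above.
pp_rise = percentage_point_change(53, 64)    # 11 percentage points
relative_rise = relative_change_pct(53, 64)  # roughly 21 percent
```

So the prevalence rose by 11 percentage points, which corresponds to a relative increase of roughly a fifth on the 1993 level.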
It was first established in 2005, and data are available from the Data Service from 2006 to 2007 onwards. If any of you have got children, your children who are school age will probably take part in this, because it takes place within all state-maintained schools, and some private schools can take part as well. So it involves trained health professionals going in at reception year, so age 4 to 5, and also at year six, that's the last year of primary, age 10 to 11. They go in and take the objective measurements. It's just height and weight. It's not the waist circumference or hip circumference. So what the database contains is slightly different to the normal kind of statistical survey data that you'd get from us. It's only in either Access database or CSV format, and there are three main data tables, which contain information at pupil level, school level and local authority level. And then there are a few more tables which give you the codes for things within the first three tables. So they are things like lookup tables for local authority codes, things like that. The database available is actually a reduced version of the full data set, and that's to ensure that the risk of disclosure is minimal, but the pupil database still contains over a million pupils in 2012 to 2013 from 17,000 schools. It's pretty amazing, really. It's huge. It's a bit more limited in the number of variables that you get, because it's only got sex and age and regional variables. I'm not sure if it's got any others, I can't remember off the top of my head, but it's not this kind of wide data set with lots of information about socioeconomic group or anything like that. But it's a really big sample size. So next, these are a number of data sets that you might find useful that have data on diet or physical activity, which obviously are closely related to obesity and being overweight.
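To make the table structure just described more concrete, here is a minimal Python sketch of decoding a coded column in a pupil-level table using a lookup table, both read as CSV. Everything in it is assumed for illustration: the column names (`pupil_id`, `school_year`, `sex`, `la_code`, `la_name`), the codes and the rows are invented and do not reflect the actual NCMP file layout.

```python
import csv
import io

# Invented stand-ins for a pupil-level table and a local authority lookup table.
PUPIL_CSV = """pupil_id,school_year,sex,la_code
1,R,F,X001
2,6,M,X002
3,6,F,X001
"""

LA_LOOKUP_CSV = """la_code,la_name
X001,Example Authority A
X002,Example Authority B
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_la_names(pupils, lookup):
    """Add a readable authority name to each pupil record via the lookup table."""
    names = {row["la_code"]: row["la_name"] for row in lookup}
    return [{**p, "la_name": names.get(p["la_code"], "unknown")} for p in pupils]


pupils = attach_la_names(read_rows(PUPIL_CSV), read_rows(LA_LOOKUP_CSV))

# Count pupils per local authority.
counts = {}
for pupil in pupils:
    counts[pupil["la_name"]] = counts.get(pupil["la_name"], 0) + 1
```

With real files the same pattern would apply with `open(path, newline="")` in place of `io.StringIO`, and with whatever column names the accompanying documentation specifies.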
So one of interest is the National Diet and Nutrition Survey: although it's only 1,000 people who take part, it actually also has objectively measured obesity in it. So that's a survey of diet and nutrition with obesity measurements too. It's worth knowing about those, and as we said, these slides will be available with the webinar recording, hopefully over the next couple of days. So I'll skip over that and you can go back to it if you need it. So, the benefits of using large-scale cross-sectional data sets. I just wanted to highlight these. Obviously these are high-quality, well-documented data from surveys conducted by well-respected organisations such as ONS and NatCen, so they're really good for reuse by secondary analysts. They're nationally representative, as I've said, so the results are generalisable to the population. The sample sizes are generally large, so you can do subpopulation analysis of things by gender and age. They contain a wide range of covariates for multivariate analysis, such as socioeconomic group and lots of other things. As I've said, you can examine change over time or pool data over time for bigger sample sizes. And also many of the surveys are household surveys. They interview all members of the household, so you can look at intra-household relationships, which makes it really interesting. So you can look at, for example, adults' obesity and children's obesity and the associations between them. I'm not going to go into detail about the research examples, but these studies, especially the Health Survey for England, are really well used, and there are lots of publications out there which use these data for obesity. I've listed two there. You can go and look at those and they might give you some ideas for what you can do. Okay, so finding data in the data catalogue. On the front page of the Data Service you'll see our data catalogue, so you just type in your term and press the search data button and you will see some results like this.
So I've typed in obesity and I've refined some of the filters down the left-hand side, so I've set the date from 2010 to 2019 and I only want UK survey data. I've ordered the results by relevance, and you can see that the National Child Measurement Programme is coming up here. And then if you scroll down you'll see many of the surveys that I've just spoken about. So that's how you find data, in a very quick overview. Data access: this is my last slide before I hand over to Michelle. So as Rebecca said, you need to come to the Data Service and register with us, which doesn't take very long. Data are free to use unless you're using them commercially, so you just have to tell us what you're using them for. Most of the data I've spoken about are available under end user licence. So basically you only have to... Once you've registered and you've found your data, it's very, very quick to download. It's 10 minutes or something like that to download. Some of these data are available under special licence, which makes it a little bit more complicated. It just means that there are more stringent conditions associated with accessing the data, and it can take longer to get the data because your application needs to be checked. One more recent development from NHS Digital is that in the future, for the Health Survey for England and also for the National Child Measurement Programme data set, you will have to go through a new process, so you won't be able to get these data under end user licence. You'll have to go through the NHS Digital Data Access Request Service, or DARS. So a bit like the special licence, they'll have more stringent conditions than under end user licence, and it will take a bit longer to get the data. The most recent NCMP, National Child Measurement Programme, data are actually in DARS already, so they're not available from the Data Service.
This is all very recent, in the last few weeks, and some of the detail is still to be confirmed. I'm afraid I don't know it, and I don't want to give out any false information here, so I suggest that you watch this space for more information from NHS Digital, and if you've got any queries about that, look at the contact links I've provided here, because NHS Digital are going around giving some roadshows about this. So it doesn't mean you can't have the data, it just means that there are more stringent conditions. And over to Michelle.
Hi everybody, and thank you Vanessa and Rebecca for your talks. Really interesting, I learnt lots too. I'm Michelle Morris from the University of Leeds. I'm a University Academic Fellow, a Turing Fellow and, importantly for this talk, a co-investigator at the CDRC. My research looks at using commercial data to better understand obesity. The focus of this talk is going to be quite different from the last two, moving towards different sorts of data sets that have information about obesity and the wider drivers of obesity. The CDRC is a national centre. It was funded by the ESRC as part of their Phase 2 Big Data Network and it's accessible for all academics to use. Its main aim in the first phase was to broker relationships with commercial data partners, host data and make them available for other academics to apply to use. So I've got a quick poll now, similar to what you've done already, just to find out how many of you have either used or heard about the CDRC before. So the majority of you haven't even heard of the CDRC, so that's great, that gives me a good sales pitch and a good promotional bit for the next eight minutes or so. Some of you have heard of it, brilliant, and only a few of you have actually used the CDRC. Right, so this is how the CDRC works in practice, and a little bit about what's in it for you as academics or researchers and also what's in it for our data partners.
So the data sharing partnership is established by the team at the Consumer Data Research Centre, where there are academics but also professional services staff, business development managers and people whose job it is to help with these relationships and then host the data. So we have collaborative projects, some with the commercial data partners who deposit data, and then the research is completely independent, but the data partners are generally very excited about the possibility of working with world-class academics and researchers, which is not just people at Leeds or the other CDRC institutes, UCL in London, Liverpool and Oxford, but also anybody who wants to use the data, so that could include some of yourselves. And working with the commercial data partners helps to provide context to problems. So they can drive some research, but as a researcher you could have a look at what data we've got available, then devise your own research question and put in a proposal to apply to use those data based on your own research. There are also opportunities to link the different data sets available, so it could be that you use some CDRC data in conjunction with the data sets that we've already heard about today. So here are some of our data partners, taken from one of the standard CDRC slides. At first glance many of these are not relevant to obesity, but I'm going to talk you through some examples of partners that we've got that do provide obesity data, and also data on the drivers of obesity. The drivers of obesity, in the simplest terms, are energy in from diet and energy out from physical activity: if we consume too much food and are not physically active enough, we're likely to become overweight or obese. So available via the CDRC are data from the UK Women's Cohort Study. This is not entirely a commercial data partner but is a partner at the University of Leeds who hosts these data and makes them available via the CDRC for people to use.
So the UK Women's Cohort Study was set up in the 1990s and recruited 35,000 middle-aged women with a view to investigating relationships between diet and health outcomes, and these women are still being followed up for cancer and death outcomes. At baseline all 35,000 women completed questionnaires about their diet and physical activity, and gave self-reported height and weight information and a whole host of other information. At follow-up, 14,000 of these women provided more questionnaire information on their diet and physical activity and various other demographic and behavioural questions. Another data set that we have is provided by Heart Research UK. These data are from 6,000 individuals, collected between 2007 and 2015, and were generated by workplace lifestyle assessments. So workplaces asked Heart Research UK to go in and do these lifestyle assessments and provide their employees with healthy heart scores. To generate the healthy heart scores they collected survey information on diet and activity, and their lifestyle assessors measured people, so got their anthropometrics: height, weight and waist circumference. For a subsample of these 6,000 they also collected blood, so we have things like triglycerides. It is cross-sectional. There is only one measure for each person in there, and while we have some basic demographics about the people, the identifier is their workplace, as opposed to any kind of information about where they actually lived. Bounts is a physical activity data set. It's an app that is available to download and that incentivises physical activity, and the data are available for the whole of 2016. There are 500,000 active users in the data set, and 344,000 of them have additionally provided basic demographics. The sorts of things that are included here are step counts.
So the app on your phone will measure your step counts per day, and you can also link various devices, so Fitbit, Strava, Garmin and Jawbone, and when you check in to certain institutions, things like gyms, it will load that into the Bounts app for you. There's distance travelled, and we've got some GPS information. Of these users, 33,000 have 205 days of activity recorded in the data set. To put that in the context of things like UK Biobank, there the average is 7 days of activity recorded, but it is for more people, 98,000. And the thing with all of these, but Bounts especially, is that it is people who've actively signed up to the app. It's not been nationally sampled, so there is bias in the data towards those who are motivated. YouGov data are also available. It's survey data from February 2018 to 2019, and the things that are included are supermarket shopping behaviors, attitudes towards sustainability, muting and food behaviors, location, age, gender and socio-economic status. So a really rich data set about behaviors. I'll just skip through these, but these are transactional data from companies that do not wish to be named at this point but are really interesting data. So how do you get the data? You can either go to the CDRC website directly and browse to the data, or visit data.cdrc.ac.uk and then search in the field at the top right. All data sets come at one of three different levels, so we have open, safeguarded and secure data. For open data, there are quite a lot of different data sets that have been repackaged so that they're easy to access, and you can just download those. If the data are safeguarded or secure, you do have to make a request and put in a project outline to the CDRC for consideration. And the secure data need to be worked on in one of the secure sites, either in Leeds, London, Liverpool or Oxford. The other thing you can see when you browse to the data sources is a full data profile.
So in here there's a bit of background about who the data come from, and also full metadata about what is actually contained within the file that you would access. So really briefly, I just want to mention the wider determinants of obesity. It's not as simple as energy in and energy out, and this is the Foresight map from 2007 of the obesity system. There are 108 nodes on this map within these seven key domains. If you think about all of these potential drivers of obesity, then many of the other commercial data sources available to the CDRC are relevant in the context of obesity, which might bring in things like house price data from Zoopla, income data from Acxiom, or geodemographic information from CACI or TransUnion, or you might be interested in how people travel to work and what transport is available. So there are many more data sets that could be of interest in obesity research when considering the whole system. Here's an example of another data profile, for CAMEO geodemographic information. A geodemographic classification uses data from the census, and commercial data in this instance, to generate a profile for the whole of the UK. So here, at the highest level, everyone in the UK is grouped into 10 different categories of like types of people, tied to a small-area geography. There are open source versions of geodemographics too, like the Output Area Classification, and they're really quite nice for profiling people for obesity as well. So if you are interested in how we can use different sorts of data in the obesity system, this is just a little bit of a plug for another ESRC resource, where a number of us have authored papers about how big data can be used in obesity research.