Okay, good afternoon everyone, and welcome to this afternoon's webinar, which is Using Administrative Data with Understanding Society for Education Research. Today is a joint webinar, so it's myself, Deborah, from the UK Data Service, and I have Birgitta Rabe with me from the Institute for Social and Economic Research. So we're going to be presenting this webinar between us.

Okay, so let's crack on with the webinar. This is an overview of what we're going to cover. As you can see, there's quite a lot there. We're going to have a look first of all, generally, at education in social surveys. I'll talk very briefly about how you can search for data on education. And then I'm going to hand over to Birgitta, and she's going to introduce you to the UK Household Longitudinal Study, or Understanding Society as you may know it. Once she's introduced you to all the fantastic content relating to education, we'll switch back to me and I'll talk to you briefly about accessing the data, should you want to go ahead and use it in your work. And then, as I mentioned, we'll have time at the end just to show you some ways you can get in touch with us if you need further help, and we'll try and get through any questions that you might have today.

So, education in social research in general. It's a really important area of research, and educational background is recognised as being a highly important factor in shaping individuals' lives and their future outcomes. In many studies education variables might be used as a predictor for a specific dependent variable, although they're also widely used as the main variable of interest, depending on what the research study is focused on. Education-related variables might include something like highest qualification, age left full-time education, what institution they attended and so forth. And this is just a very brief list of the many variables that are available.
Education variables can be used as a direct measure of educational attainment, but they're also widely used as proxy measures for ideas such as economic or occupational opportunity. You can see some examples of previous research, and I'll show you where to find those a little bit later on, just in case you need any inspiration.

Now, if you're not familiar with the UK Data Service already, we are a single point of access to a whole host of different social science data, and that can include international macro data. So if you want to have a look at comparing different countries, we have things such as the OECD education statistics, for example. We've also got individual and aggregate level census data, we've got some cross-national studies, and we have many thousands of micro-level data surveys from the UK. And these include cross-sectional, panel and cohort studies.

If you're just starting out and you're not really quite sure what your exact research question will be, but you want to start to explore education data, you can start by looking at the data by theme page, which you can see on your screen. You can see on the left-hand side we've got a number of different themes available, and if you click on the one of interest, such as education, you'll see a number of different tabs. That's just really a shortcut, if you like, to take you to some of the key data sources. It will take you to our search engines, but also, if you look at the research tab, it will show you some examples of existing research. And as I say, that's quite a good place to get some inspiration. There's also a fourth tab, which is resources, and this will include some useful links to external websites and resources which might come in handy, so it's also worth having a look there.
You can also use our Discover search tools. All you would need to do is type in some keywords here, and then you can use the options on the left-hand side of the screen just to refine and narrow down your search. You can do that on a survey level, but you can also use the Variable and Question Bank to search for specific questions. And sometimes, if you're looking at a very specific topic, that can be the better option. If you want any more help on how to search for data, then we have a webinar in the first week of May which will look specifically at some really good tips for searching for data.

Okay, so let's get on to the really interesting stuff. I'm going to hand over now to Birgitta, who is going to introduce both the survey itself and, more specifically, the education data.

Hello, so now it's over to me. I'm Birgitta, as Deborah just said, from ISER. We lead the Understanding Society study from a scientific point of view. I will talk to you about the education content in Understanding Society, and then further about how it can be used with linked administrative data.

So, starting off with a very quick introduction: what is Understanding Society? It's also known as the UK Household Longitudinal Study, UKHLS. It is essentially a longitudinal household panel survey of all ages, so it covers everyone from babies to very old people. And it is designed to track and analyse change at individual and household level. It started fairly recently, in the year 2009-10, with a sample of around 30,000 UK households, making it the largest panel survey in the world.

Apart from the general population sample, it includes some special subsamples as well. There is the ethnic minority boost sample, which includes a special sample of the five main ethnic minority groups of the UK. It includes an Innovation Panel for methodological research. It also includes the British Household Panel Survey, which many of you may know.
It is essentially the predecessor of Understanding Society and ran from 1991 to 2008, and the whole sample has been included in Understanding Society. And of late, in 2015, we added another sample, which is the immigrant and ethnic minority boost sample. So in total we now have around 40,000 households, in which we have approximately 100,000 individuals, of whom 60,000 are adults and 6,000 are youths who we interview separately.

So, the household panel design. I'd like to go into it a little bit more here. As I said, Understanding Society is a longitudinal sample of individuals representing the whole UK population, and they are interviewed within a household context. Specifically, this means that we started off by randomly selecting a set of addresses. If at any one address there was more than one dwelling, we would select dwellings within that address, and then households within those dwellings. Then we go on and collect information about all the residents of the selected household. And subsequently we follow the sample members' life courses over time, collecting data also from the people living with them. So if you have a young person growing up in a household, you observe him or her over time. If she leaves the house and forms her own family, then the new members, such as her new spouse, will be part of Understanding Society as well. And so on over the life course.

So the basic design of Understanding Society is similar to the British Household Panel Survey and to household panel studies in other countries, like the German Socio-Economic Panel, for example.

In terms of the data collection, we have 12-month intervals between interviews, and we have waves of continuous fieldwork that span 24 months. This means that some people can still be interviewed in their first wave of Understanding Society while other people are already on to their next wave. So these waves are overlapping. And the main sources of information we collect are from adults in households.
Everyone aged 16 and older is considered an adult. We collect information from these people at the household and individual level, and also through proxy questionnaires, should an adult not be available for interview. But quite a special component of Understanding Society is that we also directly interview children aged 10 to 15 through the so-called youth questionnaire. Furthermore, there is information available from interviewers, and also data which is created through the process of the interview.

Understanding Society has key topics, and those are the main research domains that we cover. These include employment; family and household; health, health behaviours and well-being; income, wealth, expenditure and consumption; and, as the fifth key topic, education, which is of interest to the webinar today.

So, going a bit further into the education content in Understanding Society now. What I'm showing you here in this table is an overview of the information that we collect in the survey itself. We show the information on the left-hand side. Then you can also have a look at the waves in which it was collected. And finally there are the sources and the file name for that information, if you want to go back and actually have a look at those variables in more detail.

Starting off with the education information we have on adults: we have the highest level of education attained by each adult. This is asked on entry into the panel and then updated each year, because you might think of an adult taking part in further education courses, perhaps gaining a new qualification, so you need to update this over time. And there are derived variables available which combine all the sources of information collected over the waves.

Furthermore, and quite interestingly, we also have the family background of adult respondents. So what we do have is the highest level of education achieved by each parent of sample members.
So even if you have a 60-year-old person, say, in the panel, they will be asked about the qualification levels of their own parents. And finally, on adults, at Wave 3 we collected their cognitive function through various measures, using tests of word recall, arithmetic, number series and verbal fluency. So we have quite refined measures of cognitive function collected at Wave 3.

Next, going on to the data collected in the survey on children and young people. You can see the table is quite full. We have a lot of different things that we cover, starting off at the top with parents' attitudes and behaviours. What we have here is, for example, helping children with homework, the frequency of spending leisure time together, the relationships with the children, and parenting styles in quite a lot of detail, including interesting things such as even shouting at and spanking your children. And finally there are the aspirations that parents have for university. This is just a selection of the types of data that we collect on attitudes and behaviours. The modules that collect these are carried in odd waves: waves 1, 3 and 5, and then continuing on to wave 7 and so on.

Furthermore, we have a little bit of information about the school that children go to, reported by the parents. But this is just things like whether all the children in the family attend the same school, and whether it is a private or a state school. As you will see in a moment, we have a lot more information on schools through other sources.

Going on to young people, we have young persons' attitudes to education. This covers people in the youth panel, so aged 10 to 15. But we also ask very similar questions of young adults aged 16 to 21, because obviously many in this group will still be in education, and similar things apply to them as to younger people.
These variables include things like the importance of doing well at school, perceptions of parents' interest in their education, and labour market and educational aspirations. There are multiple waves in which these things are asked. Most of these modules are on a two-year rotation.

And finally, from the youth questionnaire, we have more on young persons' behaviours. These are also quite interesting. For example, they cover things like bullying at school, both bullying others and being a victim of bullying, misbehaving at school, playing truant, and a lot of information on homework. Are they given homework? How much time do they spend on it? Who helps them? Do they get help at all? Do they attend other types of classes after school, for example private tutoring, dance lessons, music lessons, all of that sort of thing? And finally, the young people are also asked whether their parents attend the parents' evenings at school.

So this basically covers the information that we have in the survey itself, and you can see it's a broad array of things that are relevant to education.

Now, apart from the survey content, it's quite interesting that we also have the opportunity to explore linked administrative data through education data linkage. That's what I'm going to talk about for the rest of the time.

To give you an overview of this, we distinguish two big areas here. One is the school level and one is the individual level. The school-level linkage is all made possible through collecting school identifiers. This is both for children that are currently still at school and for young adults, for whom we also might be able to link in information about their schools. It's collected in waves one, three and five. For adults it's only asked once if they've already left school. You don't want to ask these things over and over again.
Now, these identifiers are available through Special Licence access, and Deborah will talk at the end a little bit more about how you can go about getting access to them.

The second big area is the matching in of the National Pupil Database. The National Pupil Database is an admin dataset of all pupils in state schools in England. It covers a lot of information, and I will go into much more detail on what it actually covers. But just as an overview, it has information about the background of students, their attainment, absences and exclusions. The data that we have actually linked to Understanding Society covers a number of years, from as early as 1995-96 to, most recently, the year 2012-13.

So this is just an overview of these areas, and I'm going to talk in more detail about both of them now.

Starting off with the school identifiers, the administrative linkage at school level. We have collected school names in the survey from parents, so we ask the parents to give us the name of the school their children attend. This is for children aged 4 to 15. The parents give us the school name, and we immediately convert it into a school code during the interview using a lookup table. The school names are collected in all the odd waves, so one, three, five and so on.

There may be several reasons why information on school is missing. One of these reasons is that parents might not respond to the question. Another is that, as I told you, it's only on a two-year rotation, so you don't have the responses in the even waves. And in wave 3 there was also a routing error, which led to not all eligible parents actually receiving the question they should have received. So I'm going to go through some possibilities for how to recover the information that you might want to have year on year.
The first thing to note about the wave 3 missing information is that there was a question in the survey asking: is your child still at the same school that they were at previously, the same state or the same private school? Using this question, you can infer the school from the previous response for those children that haven't changed school.

If you want to know the school children attended in the even waves, then if a child is in a particular school at wave 1 and at the same school at wave 3, you might think it's plausible to assume they were also in that same school in the even wave in between.

And finally, using the linked National Pupil Database data, you can also recover more school codes, because these data contain school codes. So by combining the two information sources, you will know in quite a lot of detail where children spent their years, the schools that they attended.

For young adults, the school name was asked in wave 1 of everyone born since 1981. This year is related to the information that's available on schools, which generally doesn't go back further than cohorts born in 1981. In subsequent waves after wave 1, it was only asked of new entrants to the survey, the so-called rising 16s. If an adult has already left school, the school code refers to the last school attended.

So, the school code data. What you will receive in your data release is the official code of the school used in official statistics. In England this is the so-called DfE establishment number, and the same for Wales. For those of you who have used these things before, it's the LAESTAB variable. In Scotland you will receive the SEED number, and in Northern Ireland it's the DE reference number. The nice thing about this is that, using the official codes, you can merge in official statistics. A lot of this information is actually freely available on the internet, and it gives you a wealth of information about the context in which children are schooled.
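The even-wave assumption described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the study team's procedure: the function name `fill_even_waves`, the dictionary layout (wave number mapped to school code) and the school codes themselves are all made up for the example.

```python
from typing import Optional


def fill_even_waves(codes_by_wave: dict[int, Optional[str]]) -> dict[int, Optional[str]]:
    """Fill a missing school code when the waves on either side agree.

    Hypothetical helper: the input maps wave number to school code,
    with None where no code was collected.
    """
    filled = dict(codes_by_wave)
    for wave, code in codes_by_wave.items():
        if code is not None:
            continue  # a code was collected at this wave, nothing to infer
        before = codes_by_wave.get(wave - 1)
        after = codes_by_wave.get(wave + 1)
        # Only infer when the child is observed at the same school on both sides.
        if before is not None and before == after:
            filled[wave] = before
    return filled


# Invented child: codes collected at waves 1, 3 and 5 only, with a move after wave 3.
child = {1: "9991234", 2: None, 3: "9991234", 4: None, 5: "9995678"}
print(fill_even_waves(child))
```

In this made-up case wave 2 is filled with the wave 1 and wave 3 school, while wave 4 stays missing because the child changed school between waves 3 and 5 and the timing of the move is unknown. School codes recovered from the linked National Pupil Database could be added to the same dictionary before filling.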
Just to give you a few examples of this: in England you have school performance tables, so you have detailed information over time about the school's academic performance. There's also school-level census data, which gives you characteristics of the student body. So, for example, you can know what proportion of the school was receiving free school meals, or what proportion of ethnic minorities you find in the school, and you can imagine that this contextual information can be really interesting for a lot of questions about peer effects and other things. And finally, also very interesting, you can merge in Ofsted inspections data. These are grades awarded by the Office for Standards in Education on school quality, ranging from outstanding to inadequate. This is really interesting because you can relate the quality of the school to the outcomes the child attains within that school.

I'm just going to give you an example of what you can do with this, which we have produced for our Education Topic Guide. There is a link at the bottom of this page, but it is also under the resources that we're going to give you at the end. This looks at access to outstanding schools and how this differs by family background. As I said, the Ofsted grade that schools can receive ranges from outstanding to good, to requires improvement, and finally to inadequate, or fail. What this graph shows is, within each Ofsted grade, the proportion of children by the educational background of the mother. What we can see is that children whose mother has a degree or higher are quite likely to be in an outstanding school, whereas for the two lower education categories you can see that fewer children are in outstanding schools. And the reverse: if you look at the inadequate schools, shown in blue bars here, you can see that children whose mother has a degree or higher are much less likely to be in an inadequate school.
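A tabulation of this kind can be sketched as follows. It is a minimal illustration, not the code behind the Topic Guide chart: the rows, school codes and grades are invented, the names `children`, `ofsted_grade` and `grade_shares` are hypothetical, and survey weights are ignored for brevity.

```python
from collections import Counter, defaultdict

# Invented survey rows: (child id, school code, mother's highest qualification).
children = [
    (1, "9991001", "Degree or higher"),
    (2, "9991001", "Degree or higher"),
    (3, "9991002", "Degree or higher"),
    (4, "9991003", "Degree or higher"),
    (5, "9991002", "No qualifications"),
    (6, "9991003", "No qualifications"),
    (7, "9991004", "No qualifications"),
    (8, "9999999", "No qualifications"),  # school missing from the inspection file
]

# Invented school-level file, keyed on the official school code.
ofsted_grade = {
    "9991001": "Outstanding",
    "9991002": "Good",
    "9991003": "Requires improvement",
    "9991004": "Inadequate",
}


def grade_shares(rows, grades):
    """Share of children in each inspection grade, by mother's qualification."""
    counts = defaultdict(Counter)
    for _child_id, school_code, mother_qual in rows:
        grade = grades.get(school_code)  # the merge step: look up by school code
        if grade is None:
            continue  # unmatched schools are dropped from the tabulation
        counts[mother_qual][grade] += 1
    shares = {}
    for qual, grade_counts in counts.items():
        total = sum(grade_counts.values())
        shares[qual] = {g: n / total for g, n in grade_counts.items()}
    return shares


for qual, dist in grade_shares(children, ofsted_grade).items():
    print(qual, dist)
```

The dictionary lookup stands in for merging a downloaded school-level file onto the survey data by school code. The printed shares describe only the invented rows above and say nothing about the real survey.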
Actually, children in families where the mother has no formal qualifications are twice as likely to be in an inadequate school. So this basically points to the fact that there is quite unequal access to good schools across children from families of different backgrounds. You can see there are a lot of interesting things that can be done just by merging in administrative data at the school level, and it's fairly straightforward. All you need to do is get the school codes, find some information on the web, and off you go.

Okay, so next we turn to the individual-level linkage. So far we have linked administrative data from England, and the data is called the National Pupil Database. What we have done is collect consents for education data linkage at Wave 1 from all parents of 4 to 15 year old children, and also from all young adults born since 1981. The consent rates were 68% for children and 78% for young adults. Then, using those consents, we matched to the National Pupil Database held at DfE, using names, gender, date of birth and address as linkage variables. The match rates were 82% for children and 55% for adults. The reason for the lower match rate amongst the adults is that the current postcode that we hold for adults is not the one that they had when they were attending school. For future linkages we have now also gained permission to use school names, the name of the last school attended, so in future these match rates will be much better.

As I said, the National Pupil Database is a registered database of all pupils in state schools in England. It dates back to as early as 1996 for some data items, so it now spans a considerable number of years, and most importantly it contains attainment data as children progress through school.
I think it's widely known that English children are amongst the most tested children in the world, and you can see here the different assessments they go through. These start from the Early Years Foundation Stage Profile at the end of reception year, at age 5, and then go through the different key stages of education, where key stage 4 is the GCSEs at age 16. It even goes on to key stage 5, where the students take AS and A levels, for example. In this table you can also see the academic years from which this data was collected in the National Pupil Database.

On top of the attainment data, this database also includes quite rich information on pupil background. There's some information about free school meal eligibility, ethnic background and so on, which, once linked to Understanding Society, will perhaps not be that valuable, because Understanding Society has very detailed ethnic categories of its own. But quite interesting also are absences and temporary exclusions from school. So you can know all the reasons for, and the number of, sessions missed through absences, and when children were excluded from school because they were naughty, basically. So this is all the data contained in the National Pupil Database.

What we have done, for each individual that we were able to link to Understanding Society, is add NPD data not only backwards in time but sometimes even forwards in time, using all the available years of data from 1995 up to the most recent academic year at the time of linkage, which was 2012-13.

Just to give you an example of how to think about this, I've come up with two fictional examples. These are not actual survey members. The first one is about Kate. Her parents gave consent to link her education records in 2010, during wave one, and she was eight at the time.
She was successfully matched, and we were able to get her information on how she did in school in the past, at ages five and seven. But because linkage took place a little bit later than consent collection, in the meantime she had also taken the primary school SATs tests, and we were able to add those as well, along with her absences in each year of school up to the year 2012-13. And because she is not a naughty child, and young children don't tend to be that naughty, she has not been excluded from school yet, so we don't have any information on that for her, and we hope it will stay like that.

Our second fictional person is David. He consented to link his education records when he was 25, so he had already left school some years before. Still, he was successfully matched, and we have data going back to key stage two. This is again the SATs result at the end of primary school, and we can follow him up to his A-level results. We do not have information on absences and exclusions for him, because those were started later in the National Pupil Database records.

So you can see that for different people we will hold different types of information, but over time we will build up a really interesting overview of how people move through school in different ways.

The linked NPD and Understanding Society data does not include all the data items that are in the National Pupil Database, but just about all of them. We have only excluded some items that we thought were really not useful. There are dozens and dozens of variables available for you to use, and I'm just going to talk you through some of them to give some examples. At age five we have the Early Years Foundation Stage Profile, including all the subscores, for about 2,000 children. We have points attained at key stage one, that's at age seven, in reading, writing, maths and science, for about 2,600 children. We have all the SATs results at the end of primary school for about 6,000 children.
We have GCSE grades in more than 30 subjects and a whole load of summary indicators of performance at GCSE level for about 4,000 students. Then at AS and A level we have the grades in virtually all the subjects as separate variables, plus summary indicators and a lot more, for about 3,000 students. We have termly absence rates by reason. So was it because you were ill? Were you visiting a doctor? Was it an absence because of observance of a faith? Was it an unauthorised absence? All of that we have for 7,800 students. Exclusions from school only affect people who have been naughty, but they affect more people than is perhaps commonly known, and about 900 such exclusions have been recorded so far.

So you can see there is a lot of data available, and I'm going to spend a little bit of time talking about what you could possibly do with this linked data at the individual level. As Deborah said in the introduction, there are different ways to think about these variables.

You could think of them as an outcome. So, for example, you might want to explore the reasons for absenteeism. Why do young people start becoming truant, for example? You might want to study the impact of family dissolution, or other disruptions in life, on school attainment. This is really an ideal thing to study with a survey such as Understanding Society, because we have a lot of detail about changes in circumstances, also regarding income and employment status, things that happen at the family level and that might conceivably have an effect on school attainment. Thirdly, you might want to investigate how school trajectories are affected by other characteristics, such as neighbourhood deprivation. Understanding Society makes available geographical identifiers which allow you to merge in information such as indices of multiple deprivation or census data. So any data at local level you can study in conjunction with the outcomes here.
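As a rough sketch of the neighbourhood example, an area-level deprivation measure can be attached to pupils through a geographical identifier and attainment then summarised by deprivation group. Everything here is assumed for illustration: the area codes, scores, quintiles and the names `pupils`, `imd_quintile` and `mean_score_by_quintile` are invented and do not correspond to variables in the data release.

```python
from collections import defaultdict
from statistics import mean

# Invented pupil rows: (pupil id, area code, key stage 2 points).
pupils = [
    (1, "A01", 27.0),
    (2, "A01", 29.0),
    (3, "A02", 31.0),
    (4, "A03", 25.0),  # area missing from the deprivation file
]

# Invented area-level lookup: deprivation quintile per area code.
imd_quintile = {"A01": 1, "A02": 5}


def mean_score_by_quintile(rows, quintiles):
    """Mean attainment score per deprivation quintile, skipping unmatched areas."""
    scores = defaultdict(list)
    for _pupil_id, area_code, points in rows:
        quintile = quintiles.get(area_code)  # merge on the geographical identifier
        if quintile is not None:
            scores[quintile].append(points)
    return {q: mean(vals) for q, vals in sorted(scores.items())}


print(mean_score_by_quintile(pupils, imd_quintile))
```

A real analysis would of course go further, for example by modelling trajectories over several key stages and controlling for family characteristics from the survey.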
Obviously, you might also want to use education as a control variable. A very straightforward example is a wage equation. There's always the problem that you need to control for ability, and here you could use attainment as a proxy for ability. Finally, heterogeneity analysis is something else that springs to mind, where you want to describe outcomes by different levels of attainment.

Just to give you one example which has been in the media recently: the DWP launched an Improving Lives strategy which focuses on workless families, and they have actually used the linked data for their report. They wanted to see how parents' worklessness relates to children's likelihood of failing to reach expected attainment levels. So what they've done is to show, for workless families and working families, what proportion of children fail to reach the expected levels at key stages one, two and four. And the result is actually quite striking. You can see that children in workless families, as the headline here shows, are almost twice as likely to fail at all stages of their education. So it's quite a stark message, and this is based on the linked data that we provide.

I hope that you see some opportunities for your research in this linked data, and my final words will be on the next steps here. We have collected consent for linkage again at Wave 4, and we are actually very close to updating the data linkage. This will provide some more recent academic years that we can cover. We have also re-asked all our respondents to see whether they might now want to consent to data linkage, even if previously they had reasons not to. That, together with some other changes, means that we hope to have a larger group of people consenting and then matching, because now we will also match on school code, not only on the other variables that we used previously.
Also this year we are planning, and we already have permission, to perform education data linkage with Scottish education data, where again we can link in information on the background of students, on their attainment, absences and exclusions. This is quite similar to the English case, although the data is different and the education system is different. Then, looking forward, for 2018 we have planned to perform education data linkage with Wales. There are some further datasets that we can link to as well, which are the HESA data, Individualised Learner Records and the Early Years Census. The exact timing depends on a number of factors. Among others, the consent for linking in the Early Years Census, for example, has only been collected in Wave 7, which is still in the field. So this is something where you should watch this space for the future, but as time goes by this data will become richer and richer and will cover longer spans of the life course.

Finally, just to point you to a few useful resources if you want to find out more. We have a user guide for the linked data, which is actually interesting. It's a very big document listing all the variables that are available, including the sample sizes for each of them, so you can see at a glance whether sample sizes are big enough to do whatever you have in mind. There are some summary statistics in there as well to help you understand the data better. We also have a topic guide, which guides you through the education content of Understanding Society and also gives more examples of research that can be conducted. The next two links are just generally to our Understanding Society documentation and to the National Pupil Database documentation on the DfE website, which you might want to look at. So this is all from me for now, and I will hand back over to Deb for the rest.

OK, thank you very much. So hopefully you can see that Understanding Society and the linked data is really such a fantastic resource if you are interested in education-related topics, so we're just going to finish up
very quickly with a brief overview of accessing the data, and I'll just talk about the different access conditions that Birgitta mentioned earlier.

The Understanding Society data, and the linked data as well, can be accessed through the UK Data Service. Some of you, if you've already downloaded data, will know that you just pop on to Discover and search for the data, or look through the data by theme pages. Once you've found the relevant dataset that you're interested in, you click on the download/order icon, which you will see to the side of the page.

Now, we've already mentioned that different datasets will have different access conditions. Those access conditions really reflect the level of detail and sensitivity of the data, so let's have a really brief look at what we mean. Some of this you may be familiar with if you've already used data from us.

Just briefly, we have the End User Licence data. The main Understanding Society dataset, for example, is available under there, and that's for anyone who is registered with the service. It's very simple: a couple of clicks and you download the data directly to your PC. There are very few conditions here. The level of detail will be not so great, so for geographical region, for example, you might get government office region but you won't get any smaller area information. And with dates of birth, for example, you will probably just get the year of birth. That's quite standard for all datasets, really.

The individual-level identifiers are available under our Special Licence conditions. Again, it's quite simple. You can download this onto your PC in the same way, but you do have to fill out a Special Licence form, and there are some additional conditions which you must make sure you read and accept. What happens is that you fill out the Special Licence form and submit it, we pass it over to the data owners, and they will just have a look and approve that. And once they've approved that then we will give
you access and, as I say, you download it to your PC. This will give you information which is a little bit more detailed. So now you can have smaller geographical areas, such as local authority and local education authority, for example. You'll get a little bit more detail with dates of birth, so you'll get the month and the year, and you'll get all those school identifiers that Birgitta mentioned earlier.

Now, the full linked NPD database is available just under Secure Access. The reason is that it obviously contains very detailed information which is personal data, so we have to make sure that we protect the confidentiality of those people's school data. Access is limited to those who have applied to be an Approved Researcher, and they will have to come along to London for a day and spend it with us, and we will give them some training on data protection, data security and how to use the data safely. Once access is granted, rather than downloading the data onto your own PC, access is via our virtual Secure Lab. That will perhaps be a slightly different way of working, but we go through all of that on the SURE training course. That will give you, as I say, the linked NPD database, but it will also give you other things such as full dates of birth and national grid references, so you can get much finer levels of detail. If you need any more advice about that, then obviously do get in touch with us, and I'll just show you how to get in touch with us now.

At the UK Data Service we have our support help desk, and we deal with all sorts of data-related queries. We will help you find data, help if you're not sure how to understand the data or coding frames and so on, and look at any problems that you think you've identified. You can email us on support at www.dataservice.ac.uk or you can use the web address at the bottom of the screen. But also, Understanding Society has its very own user support forum, and this is a really, really valuable resource, so you can log a query
directly with the Understanding Society team, and that's quite often a useful thing to do if your question is very technical or very detailed, for example. You can also search for previous issues that people have raised, so it may be that somebody has asked the same question, and you can just do a quick search and find the answer there and then. So they're two really, really good resources if you need some help.

Okay, so that comes to the end of the content that we want to cover today. There are obviously other ways to keep in touch, so you can follow us using Twitter and Facebook, and that's the same for the Understanding Society team as well. We also have a subscription email service, so if you subscribe we can keep you up to date with any new data releases and so on. So thank you very much, and we wish you all the best with your research.