Hi, I'm Becky Twinsley. I lead the Admin Data Census project in the Office for National Statistics and, as Ollie said, I'm going to talk to you with Becca about how we're transforming the population and migration statistics system. First of all, I'm going to talk to you a bit about the context and where we've come from with the Admin Data Census. So, basically, we want to put our users at the heart of what we're doing, and we want to transform the way that we produce our population and migration statistics to better meet their needs. We're working in partnership across the Government Statistical Service, and we're progressing an ambitious programme of work to put administrative data at the core of our evidence on population and migration statistics by 2020: population for England and Wales, and international migration for the UK. That brings together work that we've been doing across the ONS, both in the Admin Data Census project and the Migration Statistics Transformation Programme, and we want to deliver improved statistics by using the wealth of administrative data that's available to us and that we can now access through the Digital Economy Act. So, casting our minds back to 2014, the National Statistician at the time made a recommendation that in 2021 we would do an online census of all households and communal establishments, and that we would also progress our work to use an increased amount of administrative data and survey data, in order to produce the best possible outputs for the 2021 census and also to make improvements to annual statistics between our censuses. Off the back of that recommendation, the Government agreed that that was the way forward, but they also set out a much stronger ambition that censuses after 2021 would be conducted using other sources of data. And at the time, that was dependent on dual running sufficiently validating the perceived feasibility of that approach.
So, basically, we would run both approaches, including one based on admin data, and we would compare that with our 2021 census outputs, and then in 2023 we'd come back and make a recommendation about the future of censuses in England and Wales. Since that time, we have done an awful lot of work to think about what we want to do in terms of our transformation and delivering some early benefits, and that's where our ambition to put admin data at the core of population and migration statistics in 2020 has come from. And we need to really consider how we bring together the range and the wealth of sources that we've got to produce the best possible outputs, ones that will deliver the ambition of admin data at the core of our evidence but will also deliver our best possible 2021 census outputs, and that will help us to carry on transforming and improving the way we produce our statistics. So we have taken a decision that we will no longer be making that comparison with the 2021 census outputs, but rather focusing all of our efforts on using all available sources to produce our best possible statistics. So I've used the term Admin Data Census, and what do we mean by that? Well, we're aiming to meet user needs by using a combination of different data sources that are available to us. So that might be admin data, survey data, and other types of data, for example big data. And all of this evidence will feed into a recommendation about the future of the census in England and Wales in 2023. Previously, we focused our efforts on the three key types of outputs that we get from our census: the size of the population, the number and structure of households, and the characteristics of housing and population. But we're now looking at how we can deliver to a wider remit than that, so we're looking at topics that might not traditionally be captured through a census.
So a good example of that is income, but we're looking at lots of other things that we deliver through the ONS and how we might be able to deliver and improve statistics on a wider range of topics. And we've previously demonstrated that there's lots of potential with admin data, but on its own it won't provide a complete solution. So we need to combine it with a range of other types of data, but also with surveys. And we think we've got two key requirements for those new surveys. One would be an annual 1% coverage survey that would help measure the size of the population and households, and adjust where appropriate. And the other would be an annual characteristics survey. We haven't yet confirmed what size that survey might be, but it would help to support the production of the range of characteristics that our users have an interest in. So thinking about our current model, our census every 10 years provides a really rich source of detail right down to really small areas. And in between our censuses we have less detail, at regional and local authority levels. A future model gives us lots more opportunities to provide more frequent statistics, and maybe allows us to do longitudinal analysis that might help us better understand outcomes for different population groups. It would enable us to produce new outputs and maybe be more flexible to our user needs and how things change. And that helps us to deliver our vision of better statistics, better decisions. Today our focus is going to be on our research to transform the population and migration statistics system. But you can also look at the Admin Data Census pages on our website to see the work that we've been doing on other parts of this model.
Great. So hi, it's Becca Briggs here from the Centre for International Migration at the ONS. So I'm going to talk you through a little bit more of the background on our latest research.
So here, just to give you a bit more context, Becky has mentioned some of this already, but from an ONS perspective we produce population statistics for England and Wales and international migration statistics for the UK as a whole. So the size, or the stock, of the population at a point in time is a really important component of our statistics system, alongside measuring and understanding how that population is changing over time, so the flows. From an international migration perspective, the current method we have for estimating that uses the International Passenger Survey. This plays a really important role in our system at the moment, and will also do so in future. However, we have long recognised at the ONS that it's been stretched beyond its original purpose and that there's more we can do to provide richer statistics for our users. So back in 2017 we announced our ambition to make more use of administrative data sources to transform our statistics, and we've already published a series of targeted research pieces since then. So for example, over the last couple of years, a number of reports on things like student migration using the admin data available in that space. And as Becky mentioned, we've also recognised that the broader population system needs to change and make more use of administrative data sources. So back in March 2014 the National Statistician's recommendation was that the 2021 census is predominantly online and that we make increased use of admin data to enhance those statistics. And again we've already published a number of pieces of research in that space on how we can use admin and survey data to measure the size and characteristics of the population. So I will move on now to talk a bit about our very latest research report. So at the end of January we published a research engagement report that updated our users on our population and migration statistics transformation journey and, really crucially, sought their feedback on our future plans as well.
So what did this cover? Well, it went through the progress we've been making towards a new approach for producing population stocks and flows using administrative data, and how we're bringing more sources together than ever before to fill gaps in our coverage. So that includes things like linking across immigration, education, health and income records and exploring how we can use these to determine the usually resident population of England and Wales and also those immigration flows to the broader UK. So this includes developing data-driven rules based on what we can see around registrations and signs of activity in the data sets and what we can identify from each source. As part of that research we also produced a series of more in-depth case studies that put a spotlight on what different administrative data sources, when they're linked together, can tell us about international migration. So to kick off with one of the central features of that report: as Becky mentioned, user needs are absolutely at the heart of what we're doing. So we set out a framework for how we're delivering our transformation work, which you can see at the left of the screen in the diagram there, and our users absolutely sit at the heart of this. So what do people need to know about population and migration? How is that changing over time? And one of the things we've heard quite frequently is: what more can our statistics say about the impact that population change is having on the economy, labour markets and society more broadly? So once we've thought about those user needs, we then move on in our framework to thinking about the concepts and definitions we need to measure. So for example, around UK migration and population, what are our users interested in knowing and how can we create statistical definitions that support their understanding? Then moving on to data sources. So what data can we use to answer these questions and measure those definitions?
And that includes administrative data sources, the census, what we can get from wider survey sources, and the opportunities around big data or other data sources that are available to us. And then once we've thought about those data sources, what methods do we need to use to analyse them in the best way? So how will we estimate population flows and population stocks? And then feeding through from that, what outputs do we need to produce from the data? So we currently produce a range of different population and migration statistics. But what do our users want to see now and in future? How frequently? What format should they be in? So that's giving you a bit more background on the framework for how we're working to deliver transformation. But we already know quite a lot about what users have told us they need to know. So firstly, to reflect, we know there's a rapidly changing policy context. For example, with the UK exiting the EU and the Government's plans for a new immigration system, that's really putting the spotlight on our statistics on migration. Our statistics system also underpins a wide variety of other statistics, for example as denominators in employment rates, and informs a wide range of decisions. So at a national and local level, how many school places are needed? What provision of health services should there be for a population, or an ageing population, in certain areas? Where should new businesses be located, based on what we can see about population change or the characteristics of a local area? Users have told us that they want coherent statistics on the size, or stock, of the population and how it changes over time, so the flows. Breakdowns for different groups and characteristics. A better understanding of how population and migration changes impact on society and the economy.
Those statistics in a frequent and timely way, and a clearer understanding of the quality of these statistics, which is really important as we move to a more admin-data-based system. I think crucially as well, regular opportunities to provide feedback on the changes that we're making. Now I'll talk a bit more about one of the key areas of our research and what we published back in January, linking back to the framework I've just talked about: the concepts and definitions. What do we want to measure in our statistics system? Firstly, user needs really inform that. So what our users need to know about will directly inform the definitions that we need to provide statistics about. We already have a set of existing definitions that we publish, based largely on UN standards that allow international comparisons. So you might be familiar with concepts around the usually resident population, long-term migration and short-term migration. However, what we know and what we've seen from our research to date is that people's lives are complex and their patterns of movement might not naturally fit the statistical definitions that we're using. So to understand that further, we think we need more flexibility in some of our concepts and definitions. Our analysis of Home Office administrative data that we published back in July 2018 really started to illustrate this. The picture below is a hypothetical example of an individual's travel patterns. But what we can start to use some of that administrative data to see is where people are travelling to the UK. They might have a visa valid for a certain period, but perhaps they're arriving and departing within that period. There's some complexity to the travel patterns. In our latest research, one of the topics that we looked into a little bit further is circular patterns of movement. So what I mean by that really is repeated travel patterns. So people that are coming into and out of the UK on a repeated basis.
And that's something, again, that the users of our statistics have told us they want to know more about. And it's a current evidence gap. So it's not something that we publish regular statistics on at the moment through our existing survey-based system. So what we did is take a look at what we can see from administrative data sources, and in particular some Home Office data, which tells us about visas, and the data source called Exit Checks. And we looked into what we could see about those repeated patterns of movement. So as I already set out, we have some existing definitions that we do publish information on, including long-term migration, short-term migration and visitors to the UK. And it's likely that some of those repeated patterns of movement are already measured within some of our current definitions, even if we don't label those as circular. So by using that Home Office data, we were able to start to explore those repeated patterns of movement and how we could potentially start to group them by low, medium, high and very high frequencies of movement into and out of the UK. And we could start to look at some of the characteristics of those patterns. So, for example, we saw that individuals that had a low or a medium number of journeys in and out usually stayed for a period of around two to five months and travelled for the purposes of study or family. But there's more to do in this space, and we think there's really clear potential to produce statistics on this in future. And as part of our report, that was one of the things that we were aiming to gather further feedback from our users on. So what do people need to know about these types of patterns of movement that we're not currently covering in our statistics, but that administrative data might allow us to cover in future?
OK, you're back to Becky now. So I'm going to talk to you about the research that we've been doing on stocks and flows.
And previously, as part of the Admin Data Census, we have been focusing on a stocks-based approach to using our admin data to measure population. And we've used that in what we call our Statistical Population Dataset. So we've linked four key data sets together. Those are the English and Welsh School Census; the Higher Education Statistics Agency, or HESA, data that covers students; the NHS Patient Register; and the DWP and HMRC Customer Information System, which is basically a list of people who are registered with a National Insurance number. And using those four key sources, we applied at the time a set of quite arbitrary rules to try and reduce that down to what looks like the usually resident population. And so we had rules that said you have to be on two of those sources for us to consider you as usually resident. We used a little bit of activity data, so data where we can see interactions with those sources, to try and resolve some of the address conflicts, where people might be registered at different addresses on different sources. And then we produced what we call our Statistical Population Dataset, or SPD, version 2. And what you can see on the chart there is that for total population, we actually have quite good results. So when we look at the total population of each local authority, 96% of local authorities were of quite a similar quality to what you'd get from census outputs when we compared it to 2011. But when you break that down by age and sex, you can see that actually there are quite big coverage patterns. So particularly for working age males, there's quite a big amount of over-coverage, so too many people compared to the census estimates. And when we look at females in particular, there tends to be a little bit more under-coverage, in that the estimates are lower than the census estimates. And particularly for older ages, there are higher amounts of under-coverage.
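As an illustration only, the two-source inclusion rule described above can be sketched in a few lines of Python. This is a minimal sketch, not the ONS implementation: the source labels, the person identifiers and the `usually_resident` function are hypothetical names, and it assumes records have already been linked so that each person carries one identifier across sources.

```python
from collections import Counter

# Hypothetical labels standing in for the four linked sources named in the talk.
SOURCES = (
    "school_census",
    "hesa",
    "patient_register",
    "customer_information_system",
)


def usually_resident(records_by_source, min_sources=2):
    """Return the IDs that appear on at least `min_sources` of the sources.

    `records_by_source` maps a source label to an iterable of linked person IDs.
    """
    counts = Counter()
    for source in SOURCES:
        # Convert to a set so a person is counted at most once per source.
        counts.update(set(records_by_source.get(source, ())))
    return {person_id for person_id, n in counts.items() if n >= min_sources}


# Made-up linked records for five people.
linked = {
    "school_census": {"p1"},
    "hesa": {"p2", "p3"},
    "patient_register": {"p1", "p2", "p4"},
    "customer_information_system": {"p2", "p4", "p5"},
}

resident = usually_resident(linked)
print(sorted(resident))
```

In this made-up data, people who appear on only one source are excluded, which is the mechanism behind the under-coverage the talk describes for some groups; a single threshold for everyone is exactly what the data-driven rules are meant to improve on.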
So we've taken a step back from that and thought about whether there is a better way of using what we've learned about all those different data sources, and using new data sources that might become available, to think about how we can better estimate the usually resident population. On the other side of that, we also need to think about how we can produce estimates of population change, so the flows that Becca mentioned earlier. So we're taking what we're calling a hybrid approach to this. So we will carry on doing our stocks-based approach and looking at the flows you might get from differencing between two different stocks. But we'll also take a different approach to thinking about how we can use the different data that we've got to produce different flows. And those flows will be international migration, but will also cover local or internal migration, and births and deaths. So we'll use all of our different data sources to try and estimate the flows. And then we should have two sets of stocks and two sets of flows. And we then want to be able to evaluate the differences between those two sets of outputs and find a way of triangulating those estimates. So how do we produce a coherent set of stocks and flows that make sense from the different approaches that we've taken? So that would be our aim, to produce those transformed population and migration statistics. So going back to the stocks-based approach to the total population, we're now looking at a hierarchical approach using data-driven rules rather than the arbitrary rules that I described before. So for the different age groups we'll look at what would be the most appropriate source that would best cover that particular age group. So a good example would be 5 to 15 year olds. So probably the key data source there would be the English and Welsh School Censuses. However, we know that that's got some gaps in coverage. It only covers state school pupils.
And so we then want to look at what other sources might be available to plug those gaps for children who don't attend state school, who are schooled privately, but also for children who might be homeschooled or not in the education system. So another good source of information there is the Child Benefit data. However, we know that there are eligibility requirements for Child Benefit and it doesn't cover the full population. So we can then also look at the NHS Patient Register, which also has good coverage of those school-aged children. And so we've taken that approach with different age groups, looking at what are the best types of sources that we could use. And we plan to publish something in the spring that will show you the outcomes of that research. But what we're already seeing is that by taking a different approach and using the sources in a different way, we're actually reducing that over-coverage that we saw for working age men, and we're also reducing some of the under-coverage that we saw for females. So you can expect to see the publication of that over the next few months. Some of the case study work that we've been doing has also helped us understand where some of the deficiencies were in some of the rules that we had previously, so where we said that we wanted a record to be on two sources to consider them part of the usually resident population. On the left hand side, you can see some work where we've linked Migrant Worker Scan data with the Personal Demographics Service data. And what that's shown us is that for EU and for non-EU residents, the rate at which they register with the health services is different. There are different patterns. And that just goes to show that having an arbitrary rule for the total population doesn't really work, and that we need to think about the different population groups and the different ways that they interact with the different sources.
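To make the hierarchical idea concrete, here is a minimal Python sketch of a source-priority rule for one age group. It is an assumption-laden illustration rather than the published method: the priority order follows the 5 to 15 year old example in the talk, while the source labels, identifiers and the `build_age_group_population` function are hypothetical.

```python
# Hypothetical priority order for 5 to 15 year olds, following the talk:
# school census first, then child benefit, then the patient register.
PRIORITY_5_TO_15 = ("school_census", "child_benefit", "patient_register")


def build_age_group_population(records_by_source, priority):
    """Map each person ID to the highest-priority source that holds them.

    Sources are visited in priority order, so a later source only
    contributes people who were not found on an earlier one.
    """
    chosen = {}
    for source in priority:
        for person_id in records_by_source.get(source, ()):
            chosen.setdefault(person_id, source)
    return chosen


# Made-up linked records for four children.
children = {
    "school_census": {"c1", "c2"},      # state school pupils
    "child_benefit": {"c2", "c3"},      # c3 is not on the school census
    "patient_register": {"c3", "c4"},   # c4 appears on neither earlier source
}

population = build_age_group_population(children, PRIORITY_5_TO_15)
for person_id in sorted(population):
    print(person_id, population[person_id])
```

Unlike a fixed two-source threshold, each lower-priority source here only fills gaps left by the sources above it, and a different priority list could be supplied for each age group.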
On the right hand side, you can see some work where we've linked the HESA student data with the Home Office Exit Checks data. Previously, our rules said that if someone was studying a course for a year, we would assume that they were usually resident. However, what we see is that a lot of overseas students are actually much less likely to be in the country for the whole year, even if they're on a one-year course. And so that again shows that we need to think about our rules in a different way.
OK, so now I'm going to go through a few more of the case studies that we published back in January, and some of the insights that we had from these around international migration patterns and what we can say from linking different data sources together. So this first one is looking at higher education data linked to Home Office Exit Checks data, and what that can tell us about departure patterns and length of stay of non-EU students. So I'm building on some of that analysis that Becky just mentioned. I should say that these are all initial experimental research outputs at this stage. So the results we're getting are from linked data, so the records we've been able to link across these data sources. And what we were able to see here really, I think, is a confirmation of some of the patterns we've seen in our previous work on student migration. So most non-EU students departed long term at the end of their studies. And a further high proportion extended their stay in the UK, or departed short term and returned on a longer-term visa. We also saw here that around 10% of graduating non-EU students that emigrated long term left within a week of their course end date, so either a week before or a week after. I think what we can see here, and why this is important for our work to put admin data at the core of our migration statistics in future, is that using linked administrative data we can start to identify student travel patterns.
So we can identify those signs of activity in more depth than we've been able to to date. For example, you can really see what's happening with departures and the number of days that international students stay here in the country. So moving on to the next case study, here what we were looking at is linking across higher education data with HMRC data, and what that can tell us about the employment and economic activity of international students in higher education in England and Wales. So our analysis showed that EU students are more likely to be working than not, with the opposite found for non-EU students. And there might be a range of reasons explaining that, so both immigration rules and the economic background of students. For example, non-EU students pay higher fees, and they're more likely to be sponsored in their studies, so their focus could be on that rather than working. But what this is also showing, and why this is important for our work to put admin data at the core of our statistics, is that it's actually an interesting finding from the perspective of building our Statistical Population Dataset. So Becky mentioned earlier that we're moving towards more of a hierarchy and data-driven rules approach that uses multiple sources, rather than a two data source rule, because what we can see here is that we won't necessarily capture all students in certain data sources. Not all students have economic activity that we will pick up on. So it's really important to triangulate across multiple sources. Moving on to our next case study, what we're looking at here is what we can see from linking Migrant Worker Scan data to income and benefits data, and what that can tell us about activity patterns of non-UK nationals. So again, what we're looking at is whether we can pick up on signs of activity that indicate that people are here and resident in the UK.
And the cohort analysis we did here illustrated that four in five non-UK nationals did have signs of activity across the income and benefits data. So that really illustrates that it's quite an important source for the work we're doing. We did find more international migrants who were earning and paying tax through the Pay As You Earn data than appeared in the benefits data, which was consistent across both EU and non-EU groups. But again, I think the message is quite consistent with what I've just been saying, in that overall what this is showing, and why it's important for our work to put admin data at the core of our statistics, is that as an indicator of presence in the UK we need to be triangulating across other types of data sources. So the Migrant Worker Scan tells us about people who have registered for a National Insurance number. By linking that across to information on income and benefits, we can start to see more about the potential patterns of arrival and what people are earning in the country. So moving on to our next case study, here we were looking at what linking Migrant Worker Scan data to National Health Service data can tell us about how and when international migrants appear on those data sources. And Becky mentioned some of the main findings from this around data lags earlier on. And that was one of the really key things we could see here. We could start to look at the time scales and what's happening between arrival in the UK and registration with different types of services, and how that might differ for different groups. So for example, we found that EU nationals registered more quickly for a National Insurance number than non-EU nationals, whereas non-EU nationals registered more quickly with the NHS than EU nationals. Again, some of that might be explained by immigration rules and the health surcharge, which we mentioned on the previous slide.
So this work is important for what we're doing around putting admin data at the core of our statistics because, again, what it's showing is that we need to look at multiple different sources to really develop a full understanding of the population. There might be certain lags in what we can pick up on different data sources, which is important for how we use them as part of a future system to measure population and migration. So that was a bit of a flavour of the work we've been doing in that space, but there is a lot more depth published as part of our report. You can find more information on each of those case studies, the methods and data sources we've been using, and more in-depth findings, so do take a look if you're interested. But in terms of the overall next steps on this programme, the main thing is continuing to engage with our users, so the report we published back in January was very much a research engagement report. As part of that we asked a series of questions of the users of our statistics to get feedback on what we're doing and whether it looks like it's heading in the right direction. And as Becky mentioned, we plan to publish our next update on this programme of work in spring this year, which will report back a bit about what we've heard so far and how that's going to steer our future work programme. But as part of our overall work we're still developing our future system, and we're still in the process of looking at further administrative data sources that we can use to help deliver this system. So for example, at the moment we have some coverage gaps around EU migration, and we're looking to see what administrative data sources we can use to help us fill those gaps. So really our further research is looking at what we can do to link across a full range of sources to continue to build that integrated system that will allow us to measure population and migration, producing coherent statistics for our users.
So in terms of how we're working to deliver this, we're continuing to collaborate really closely across government with other government departments to develop our approach and to think about what data we can use to address evidence gaps that have been identified by our users. Alongside this we're also taking forward work to look at our existing survey sources. So I mentioned earlier that our current approach for measuring international migration is based around the International Passenger Survey. So we published a work plan in February which set out some work we're doing to look at the coherence of our existing survey sources, so the IPS but also the Labour Force Survey and our Annual Population Survey, to help understand what each of those different survey sources is telling us and also how we'll be able to use them in future as part of our system to measure population and migration. So we're going to be reporting some conclusions on that later this year, so do watch out for that. And I think that's it from us in terms of what we wanted to go through. So we're really happy to take any questions or feedback on this at this point. I think I've seen there are some that have been coming through on the system, so we can pick those up in a second. But just to say as well that any further feedback by email is also really, really welcome. So as I said, we're definitely looking to get views from all the users of the statistics we publish so we can feed that into our ongoing research.
Okay, thank you to Becky and Becca. I'm going to speak a little bit now and then we'll go back to all of your questions and then consider them in the poll. So I wanted to say something about administrative data. I think we're going to agree, given all the things that Becky and Becca have been talking about, that this is a very exciting space to be working in at the moment. I wanted to look at that particularly from the context of longitudinal studies.
We have a number of census longitudinal studies in the UK, which I'll mention in a moment, together with cohort studies. These are really excellent and rich research data sets, but particularly in the case of the census ones, a switch from census data to other sources would mean quite a lot of change for them. So I want to start off by saying a little bit about the possible threats from a switch from census to admin data, both thinking cross-sectionally about how those data might change and, more interestingly from my point of view, in a longitudinal sense. And then to think a little bit about the opportunities that I see from greater use of admin data. I should explain that I work as part of the UK Data Service and CeLSIUS, the Centre for Longitudinal Study Information and User Support. So for those of you that aren't familiar with census longitudinal studies, we've got three studies in the UK. I work on the ONS Longitudinal Study, which is a 1.1% sample of individuals in England and Wales, together with other people living in the households of sample members. We have census data from 1971 through to 2011 and we'll have 2021 census data in due course. And we also have some administrative data, especially mortality data, and I'm going to come back to that later. We have another study in Scotland, which is a 5% sample with census data from 1991 onwards. And one in Northern Ireland, with a 28% sample of the population and census data from 1981 onwards. And both Scotland and Northern Ireland have significantly more administrative data than we do at the moment in England and Wales. And you can find out about all three of those studies and how to use them at the website calls.ac.uk. So I thought there were some possible scenarios to think about in terms of a switch from census data to administrative data.
As Becky said, the most likely outcome, and the one that we think is going to happen, is that we'll replace the census with administrative data. At the same time, we'll have two population surveys, a coverage survey and a characteristics survey. But of course, we don't know exactly what's going to happen. So it's possible that the decennial census might continue as in the past, albeit with more admin data. And maybe another outcome is that we'd have a reduced decennial census with some admin data. When we think about this switch from census to administrative data, just in a cross-sectional way, it's obvious that there are some existing census questions that aren't particularly easy to replace with administrative data. And these are the areas that, collectively, those of us who are interested in this need to think about. Some of those questions are self-assessed, for example the self-assessed question on general health. We can't know that without asking people. Similarly, questions about national identity. The main national identities are supported, and some additional national identities are available for people to type in or to write into their census form. We can't really know those without asking people. Again, with religion, we can't know what people believe without asking. In other areas, an important gap, I think, is the provision of care. A lot of that is done informally. If we want to know about caregivers' needs and the need for care, we're going to have to find some way of replacing what we know from the census. And that might be from an annual population survey. The 2011 census in England and Wales asked about second residences. It's easy to think that they're not particularly important in terms of scale, that it's about people with holiday homes or places they work from. But in fact, the largest group of people who have two residences are children of separated parents. 
And we know more about that than we ever knew before from the last census. It's important for us to try to keep up with that information. Another omission is details of the workplace and journey to work. At best, I think, from administrative data we're going to know about where people work, perhaps from their tax records, but there are possible problems with that. And we don't really know anything about how people get to work without asking them. Again, that's something that a population survey would have to cover. Finally, in terms of gaps, a significant one is household structure. The census asks about the relationships between people within the household, and this is fairly poorly replicated in administrative data. Some administrative sources ask about more than one person in the household and about who might be married to whom or who might be whose children. But it's quite hard to reconstruct family relationships from that information, especially when we're thinking about complex households that might have more than one family living in them. We're going to have new questions in 2021. We'll have questions on gender identity, on sexual orientation, and on veteran status. And the precise reason why we're going to ask those questions in the 2021 census is that we don't have any other good alternative sources of that data at the moment. That need for new questions is always going to continue in the future, I think. So what happens with longitudinal data? Well, the LS structure in England and Wales and in Scotland assumes that a census will occur. The data structure in Northern Ireland is a little bit different. If we don't have a census, then that structure can't continue on as it is at the moment. But all is not necessarily lost. 
Given the multiple administrative sources that we know how to link to individuals, through the work that Becky and Becca have been describing, we can think about how we might take those LS sample members, identifying them by their date of birth and linking them by their name, and doing that in a secure setting. We could link administrative data to the LS sample, and we'd be able to observe a lot of the census-type characteristics, albeit not all of them. We can also think about existing surveys, and that's a bit harder. We ask lots of people lots of questions in surveys, but they're all relatively small samples, and only a small set of the people in those samples would be in the LS as well. So joining the existing surveys to the LS is quite hard. One model, as we've discussed, and something that may well happen, is that we have an annual population survey. That might be something akin to what happens in France, where they have a rolling census in which all municipalities are covered either in full or through a fairly large sample over a five-year period. If we had a large population survey, we could include questions that replace those census ones that don't have any other alternative sources. That would be really great and would produce fantastic data, but it still has problems for us trying to use the LS. So perhaps we'd sample people over a 10-year period. The transitions that we could observe, changes in LS members from one location to another or from one job to another, for example, would be occurring over different time periods for different people. And in terms of analysis, we'd have to try and take that into account. The fundamental model of linking administrative data, I think, offers an awful lot of scope for the LS. So we can think of some of the new records that we could add to the LS. Becca talked about circular migration and being able to see how people leave and enter the country over time. 
And being able to add things like passport observations would add a great deal of depth to the LS, being able to see how people move in and out of the UK. We can also link additional administrative data that expands the scope of what might be a census or an admin-enhanced census. Income is an obvious example of that. But we can also add some fields that we've had in the past in the census and don't have any more. In particular, I'm thinking there of information about qualifications. The censuses in 1971, 1981 and 1991 had a box where you could write in post-school qualifications. You could write in up to six qualifications with the subject, the awarding institution, and the type of degree or other qualification. That's obviously expensive to process. But if we can link to National Pupil Database data and HESA data, then we could again look at that sort of data on what happens to people with different sorts of degrees later in their lives. So today, on World Book Day, for example, we could look at people who have degrees in creative writing or in publishing, and see what happens to them maybe 20 years later in their career. But I think that's just scratching the surface. I think the greatest opportunity that we have with the LS, given the work on admin data, is that we don't have to be limited to just the 1% sample that we have at the moment. We can use a complete population and link them forward longitudinally. We should aim to be really ambitious about this and have a 100% sample of people, from some date in the future where we start linking their administrative data and moving forward. A lot of the work ONS have done so far has been on cross-sectional analysis, producing population estimates and flow estimates from year to year, and looking at how their methods improve from year to year and how the estimates become better from year to year. And that's really exciting. 
But I think where we're going to go in the future is looking at how people change over time. That 100% population could contain a sort of golden sample based on the LS that goes right back to 1971, and in some cases earlier: we've also got details of wartime registrations in 1939 for some of our sample members. We could use that sort of 1% spine in England and Wales, and the equivalents in Scotland and Northern Ireland, to test out exciting new administrative linkages. So I think the longitudinal study is going to change in the future post-2021. But we shouldn't necessarily be scared of that change, because there's an awful lot of opportunity for us to do new and exciting research. We'll have to work out how to do some of that, but it's an exciting area to be working on at the moment. With that, we're going to switch over to looking at your questions. I can see lots of people have been posting questions, so we'll go through those. I've just got on the bottom of this slide some forthcoming events that you might be interested in, coming up every Tuesday for the next few weeks, run by ESRC-funded data investments that are available to researchers.
Okay, so looking at the questions, I'll start at the top. I'll read them out and Becky and Becca can reply. Our first question is, do you have a link to the case studies mentioned in the previous slide that show how different administrative sources can be used to tell us about international migration?
Yes, so we can certainly circulate something following this, or you can find them from the ONS release calendar. The report was published on the 30th of January, but certainly we can send that information. I think there's a link embedded in the slide as well, so when those are available, you'll be able to follow that link to get to those case studies too.
The slides will be available via the UK Data Service, as well as a recording of this webinar. 
Our second question: can the work you're doing on circular migration inform flows across the Irish border in future? At the moment we only have very broad estimates of the numbers of people who commute daily across the border between Ireland and Northern Ireland. This has obvious policy significance over the next few years.
I think across all of the research we're doing, we're considering what geographical levels we can provide information on. Our research on circular migration is ongoing, so we'll be looking at that. I think also just to say that as part of our broader transformation programme, we're working really closely with statisticians in Northern Ireland, Scotland and Wales to consider the coverage of the data we produce across different parts of that system. We're also working quite closely with statisticians in Ireland in this space.
Is it fair to say the work on circular migration relies on people crossing an international border and having a passport observed?
Yes, at the moment we're working with a certain extract of Home Office data, so that's something we're using to continue to explore what we can actually measure in that space.
Not to say what happens with the Irish border; none of us know at the moment. Will any of this work allow us to arrive at more robust estimates of the population of immigrants who are here illegally? Estimates for the illegal immigrant population have remained prevalent for over a decade in the UK now, and there are lots of demands to try and improve on that.
Yes, so I can answer this one. When we set out the ambition to put administrative data at the core of our evidence on migration, one of the things we did recognise is that there is a lot of interest in what we can say around illegal or irregular migration, but that it's a really complex area to measure. So it's something that we are looking into at the moment. 
I haven't got at this point a sort of "we will be publishing more information at a certain point in time", but just to say that yes, it's something that we're looking at.
I suppose I also have thoughts about that from a sort of academic point of view, and without any of my research project hats on. I think we need to recognise that the more we link data, the more potentially dangerous that data is and the more potentially disclosive that data is. We need to make sure people treat that data, and do research with that data, responsibly. So from my point of view, I'm thinking about how we have to protect that data both from stereotypical hackers and from government overreach. There have been stories in the newspapers and online recently about Home Office use of the National Pupil Database for immigration assessment. I'm speaking as an academic, so I can be more free in what I say perhaps, but I'm concerned that if we put too much of an immigration slant on the use of data, then people will be less willing to engage with everyday administrative processes, like filling in details of their children going to school and what their nationality is.
OK, our next question. What research are you doing on internal migration?
So I was just going to say quickly, if it's all right, to just respond to the point there. From the ONS perspective, what we're looking at is very much what the data can tell us from a statistical perspective. So when we're looking at what we can integrate across different administrative sources, that is for the purpose of measuring those aggregate levels that we can see in population and migration. We set out quite recently our data sharing principles and how we work to keep data safe and secure. All that information is part of our recent report, so if anyone's interested in that or has further questions, do just take a look. OK, so going on to the question about internal migration. 
So that is obviously a part of the work that we're doing around stocks and flows. We have been focusing on international migration, but one of the outcomes from the work that we're doing around the stocks will also be to look at internal migration. About a year ago, we published some initial research on internal migration using admin data. That research will continue as part of the package of things that we produce around population and migration statistics.
OK, could you please elaborate a bit more on integrating big data?
So we're working really closely with the big data team that's based in ONS to explore the possibilities of using big data in this space. We have previously published work on how we might use aggregate mobile phone data to look at commuting patterns. That was something that we published probably about 18 months ago, and it's available on our website. We are also looking at other data sources that might be available and that might help to support the work that we've been doing. So could, for example, property websites help to improve our understanding of housing and describe the housing stock? I think there's a range of possibilities. We do have a big data team that we're working closely with, as I said. And we also have the Data Science Campus based at ONS that we're working closely with too.
OK, will we ultimately be able to get all the population characteristics data from admin data that we get from the census?
Yes, so this is obviously a really challenging area for us. We previously published, as part of our annual assessments, a sort of snapshot look at all the different topics that we get from the census, the availability of those in admin data, and how the definitions for those different topics might align. And as you can imagine, the spread is quite broad. Some topics are really well captured in the admin data. 
The coverage is really good and they meet the definitions that we measure through our surveys and through the census. But some of those topics aren't available. I think the provision of care is a really good example of that. That's why we said that there would need to be a survey supporting this. But we also want to work with different data providers to see how we might be able to capture some of those other topics as well. And as I said before, we're looking far beyond the topics that we usually get from the census. I think we've been quite upfront about the different trade-offs that there might be through this approach. So yes, it's possible that some of those topics might not be available, or might not be available to the same level of detail. But actually there's a load of other benefits, like the frequency of topics being a bit more responsive to user needs, and also being able to look at the longitudinal aspects of some of the topics, so that we can start really understanding the outcomes of different population groups and the impacts that different groups have on the provision of services at local levels. So that will be part of that recommendation in the future, and we will continue to engage with users about some of those trade-offs and what really is needed to meet their needs.
Okay, how will you capture migrant families which include a non-working parent or grandparents in the data? I'm thinking they will have NHS registration but not HMRC or school.
So we've actually done quite a bit of research around how you might get to households through admin data. And actually we've run quite a few seminars through the Royal Statistical Society particularly on this topic. It's a really good example of a topic that has high user need but for which the definition that you can get from admin data is very different from what you get from a survey. 
So from admin data you're basically looking at addresses, whereas through a survey we're able to ask about families and how people live together. But some of the work that we have done is looking at household composition. And actually we thought that that was going to be a really hard thing to get to, because traditionally admin data sources don't really look at how people live together. But the research that we've done so far has shown that actually this is a more promising area, and that in some areas, particularly lone parents, we're actually able to observe that really well in the admin data. So it's something that we will continue to have those discussions on. And we are also now extending that to think about how we might look at families, which again present themselves slightly differently from how people live at an address.
Okay, is there an ongoing commitment to publish small area population data?
Yeah, absolutely. We know that this is really important in terms of user needs, and that is something that we are continuing to base our research on. So, yes, absolutely.
Is there, I wonder, a need to publish data to any particular level for local authority services? Do local authorities need ward-level estimates or, legally, do they just need local authority levels?
Yes, so there's a real range of requirements that we have from different groups. So we will continue to be discussing with those different groups what their user needs are. Obviously, there are regulations about what we have to produce as well. So we will be taking that all into account.
Okay, this is a good question, well, the previous ones were as well. Are there any admin data sets you can't currently access but you'd like to?
Yes, so we're exploring a really wide range of different admin data sets. The focus of our effort has been on some of the high-quality data sets. 
So those data sets that have got really high population coverage or that fill some of those gaps. So working with the Home Office to think about how we see people moving in and out of the country. But I guess there's nothing that's really off-limits. So we will be continuing to try and understand how different data sources can fill the gaps that we've got.
Okay, how will you use the data to update local authorities, OAs, LSOAs? I think that's kind of similar to the previous question about small area estimates.
Yes, so there is a project that's being led by the geography team at ONS that's looking to see whether it's possible to update the boundaries around OAs and so on through using admin data. So that is something that we are considering as an office as well.
How robust is the matching between different sources?
So we have published a lot of reports on the matching methods that we've developed, particularly those we developed to use on hashed, pseudonymised admin data. Those are all available on the ONS website. We have a matching team based in ONS, and we are continuing to look at best practice around matching and developing our methods to make it as robust as possible.
Okay, are you planning linkage so that we can produce statistics normalised for population size, e.g. rates per X in the population of local areas?
Yes, so that's another really important point. It's probably something that we've explored less in the past, the impact that a change to the population statistics system will have on some of those rates. But we will certainly be taking forward conversations with the various users who use our outputs in rates. So yes, we'll be taking that into account.
Are there any plans to tackle the overestimation seen in the higher ages?
Yes, so that was based on our previous rules to produce the Statistical Population Dataset. 
The work that we've done more recently on those data-driven rules has actually shown that we've reduced a lot of that over-coverage. So that's showing good promise. I guess one of the challenging things is thinking about how we produce robust estimates of the population. If you think about the census, we have a census coverage survey, and that is how we produce robust estimates. We might need to explore different methods from those that we currently use. But yes, it's showing promise, thinking about using data in a different way.
Okay, so I think it's the last couple of questions we've got. Ward-level data is insufficient granularity for an LA; it needs to be LSOA at least. So this is again going back to the question of the small area estimates and how small the areas will be that we're able to work with.
Yes, so if we think about the stock estimates or the size of the population, we've actually previously taken that down to output area level. And even though we're now updating the way that we produce those estimates, we've actually shown that there's quite good promise for the size of the population at output area. So we will be continuing to explore at what level it's reasonable that we should produce those outputs. And again, we will be tying that in with the user needs as well.
Okay, now finally, what data sources do or could give information about place of birth, country of birth, and therefore allow analysis by migrant generation?
Yes, so that's a really good question and probably links back to some of the discussion we were having earlier around concepts and definitions. One of the challenges around admin data is that some things are reported in different ways in different datasets. We have concepts around country of birth and nationality, but also citizenship. So I think it's something that we're thinking about quite closely as part of any migration analysis. 
And working with sources such as those from the Home Office, which will give us some of that information, to think about what we do need to measure in the future. I think this is also a really good example of where we will continue to borrow the strength that we get from the census data. Country of birth is obviously quite a static characteristic. And we are now thinking about how, even if we move away from the decennial census in the future, we can still continue to use the strength and the power that we get from the 2021 census. So I think country of birth is a really good example where we can continue to use that to provide much more insight into our admin-based estimates.
Yeah, I think country of birth is also a good example of an area where the longitudinal study can illustrate some of the context. As you say, it's something that shouldn't change, although the understanding of it might change, and the international boundaries might change. So someone who was born in what was Yugoslavia at the time might give their country of birth as Serbia or Croatia or any of the other countries now. But we also find a small level of noise in the data: people aren't necessarily consistent in what they say for their country of birth, because of form-filling errors and other things, as well as political stance.
And I was just going to come back to the point that Ollie was making earlier, that there are loads and loads of opportunities for the LS in the future, in the strength and power that you might get from the longitudinal analysis of different data sources. I think it's a really exciting time, but not without its challenges.
Okay, with that, I'd like to thank you all for listening. I'd like to thank Becky and Becca for their presentation. Thanks also to my colleague, Bustlist, for helping us run the whole show. As I said, the slides will be available. The recording will be available as soon as we've processed it. 
One last thing is, again, to remind you that we've got some webinars coming up that you might also be interested in. If you're particularly interested in the longitudinal study, we've got an event coming up in Belfast on April 8th and 9th; again, the CALLS website will have information about that. And with that, I'd like to wrap up. Thank you all for listening and goodbye.