Good morning everyone. I am David Ronsley. I'm the senior technical coordinator for the aggregate data unit within the UK Data Service. We deal primarily with aggregate data from the UK census of population and from international providers like the IMF, the World Bank and the OECD. I've been working with census data since 1995, so I've just been through my third census now and I'm starting to get the hang of it. I'm just going to stop my video so we can concentrate on the slides.

So we'll start with a little quiz. This is probably the last time I'm going to be able to use this one. I've been using this slide now for roughly eight years, so it refers back to the 2011 census. We haven't got a poll for this one, so it's just a little bit of fun. In your mind, can you tell me how many people in the UK identified their religion as Jedi in 2011? In the 2011 census detailed religion question, there were 176,632 people who identified as following the Jedi religion. So this is aggregate data: information about the characteristics of a group of people at a geographic level.

Now, this morning at 9:30, the detailed religion data from the 2021 census dropped into my inbox, and there isn't anyone in there who is down as following the Jedi religion. I might have to ask ONS what's happened to that, because that's a huge difference. Maybe they've gone into another religion box. I don't know. But there were 5,054 people following the Satanism religion, if that's of any interest. So that's the detailed religion data that has dropped this morning, and I'll be updating this slide with something else.

So aggregate data is all about populations, groups, regions or countries. It could be anything. It could be about people: it could be average life expectancy, employment rates, gross domestic product, greenhouse gas emissions or, in this case, census headcounts. And aggregate data can be time series.
So a lot of our international economic data is time series going back 70 or 80 years. Or it can be a single point in time. So although the census is, in effect, a time series, it's also a single point in time, because it's only once every 10 years and so much changes over those 10 years.

So the question on Jedi is probably not that useful to yourselves, unless you're planning on founding a church. But suppose the diagram above was your local area, and the numbers represented households. You can identify single-person households, that is, people living alone, and you can add in the characteristic that the person living alone is over the age of 60. Using that information, we can build up a picture of where in the country we could target resources for looking after vulnerable people who, two years ago, would perhaps have been shielding from COVID.

So the census of population in the UK is every 10 years. The first modern census that we would recognise as having characteristics similar to the one we had in 2021 was in 1841. There were censuses before that, back to about 1815, I think, but they were more ad hoc. The data is held secure for 100 years, and then the original papers are released to the public. We have digitised censuses from 1971. There is actually data out there from 1961 now; I don't know if it's readily available yet. And this was the last ever census as we know it, although they said that the last time as well. There are plans to use more and more administrative data and one-off polls to replace the census. I'm not so sure. I think it's going to keep going for a bit longer yet, because we're very good at doing censuses. Brilliant at doing censuses. We're not so great at keeping administrative data.

So we talk about a UK census, but it's not a UK census. It's three different censuses. One is in Scotland, held by the National Records of Scotland, formerly the GROS.
One is in England and Wales, held by the Office for National Statistics, and one is in Northern Ireland, held by NISRA, the Northern Ireland Statistics and Research Agency. They are broadly the same, but they also differ in certain ways. There are differences in the geographies that are used. Geography thresholds can be different. Geography names are certainly different. The numbers of geographies are different as well. There are different questions asked. Sometimes the same questions are asked, but slightly differently. And the outputs from the census can be different as well, because each agency is responding to the needs of its nation.

There is lots of harmonisation, and the agencies work together to make sure there is enough of it that we can do cross-border analysis. For 2011 and 2001, we ourselves produced a UK census that combined the three different censuses. But for 2021/22, because there were differences in the questions, and also because of the time difference with the Scottish census, which was held a year later, that UK census might be very different. It doesn't exist yet. There is work going on to create it. What might happen is that an advisory board will be put together to advise users on how best to do cross-border analysis.

So the England and Wales census was held on the 21st of March 2021. There was a 97% response rate, which is phenomenal. The average completion time was just 23 minutes; that was for the online census. Following that, there was a census coverage survey, so the ONS went around and found out where was covered well, where wasn't covered well and why, and they followed up and tried to fill in the gaps. There was also a census quality survey, where they tried to understand how questions were answered. Were they answered correctly? Were people having problems answering certain questions? And then they did a lot of output consultation with interested groups.
UKDS was involved in that for the academic community, but there were all sorts of other users, from local authorities, charities, community groups and commerce, who were asked what they wanted out of the census. And from that, we're now getting the output. We've already had a number of releases. ONS did a fantastic job of collecting the census and compiling it. Their output timetable is running a little bit late, and that's due to increased consultation with local authorities. They put out some data to the local authorities and asked them to comment on it, and the local authorities took quite a long time responding to that.

Northern Ireland's census was held on the same day and went through all the same milestones. Their census has been absolutely perfect. The response was fantastic. Their coverage was fantastic. Their output prospectus is going absolutely to timetable. We get metadata and draft table outlines in advance so that we can prepare for the releases. It's fantastic, A plus, going brilliantly at NISRA.

Scotland took the option of holding it a year later due to COVID. ONS and NISRA took the view that they had done so much work on it already and, as one of the criteria was to have the greenest census ever and they'd already done their paperwork, they chose to go with the 2021 date. Scotland decided to do it a year later. Because of that, their response rate was initially very poor. I don't know why. I think possibly people were getting mixed messages: maybe we've had the census already, maybe it's not happening, who knows. However, they had an extension period, they put a lot of advertising and media work in, and they eventually got their response rate up to 89%, which is actually an okay response rate. And with follow-up coverage surveys, they will get that to above 90%. They're currently in a consultation phase on outputs, and they're cleaning and collating the responses they've got.
We don't have any dates on when that data will be released yet, but I would think late spring or early summer 2023.

So, the census outputs in England and Wales. We've already had population and household estimates. Initially we had rounded estimates; we've now had unrounded estimates. We've got some releases on household and resident characteristics, international migration, and armed forces. That's new to this census because of the question on the armed forces. We've got data from country level down to what's known as middle layer super output area or output area level. An output area is roughly 150 to 400 people, so that's the lowest level of data you'll get. We've had information on sex, age by single year, sex by single year of age, number of households, population densities, number of residents in households, legal partnership status, living arrangements, and households by some deprivation dimensions. We've had information on country of birth, passports held and length of residence.

In Northern Ireland, we've had broadly similar outputs. We haven't had anything on armed forces; there wasn't an armed forces question asked in Northern Ireland. I'll talk about that more later. Northern Ireland data at the moment is only at country and local government district level, with nothing at output area level.

So, the next census releases. Today we've had a release from the Office for National Statistics for England and Wales on ethnic group, national identity, language and religion. And we've had some updates to the census maps that the ONS have produced, to add in the data from those recent releases. We've got a very busy December. We've got Welsh language, labour market and travel to work releases from ONS. And we've got the phase two release from Northern Ireland. Whereas the ONS are releasing by areas of interest, Northern Ireland are doing what they call phases, so they will drop a large amount of data at once.
So Northern Ireland's phase two is health, disability, unpaid care, housing and accommodation. And their phase three is marital status, household composition, living arrangements, sexual orientation, qualifications, labour market and communal establishments. That's going to be in spring 2023.

All this data at the moment is univariate data. It's just single variables, not variables mixed with other variables. From spring 2023 into summer 2023, they will be releasing their multivariate data. These are the larger, more complex tables. They will also be releasing their data via the flexible table builder. This is where you can take existing tables and flex them: you can slightly change them, or you can mix and match your variables. And the table builder will go away and ask: can you do this? Is it viable? Will it reveal personal characteristics that might identify individuals? If it does, they might blur the data and release it to you, or they might say, no, I'm sorry, you can't do this. Or they might just say, yeah, it's fine, you can have this data. We've seen this in operation with some test datasets. Can't wait to see it working with real data. That's going to be a bit of a game changer. We've never had anything like that before. And then the final releases, of multivariate data at the small geographies, will be ongoing from summer to autumn 2023.

So what can census aggregate data tell us? It is the most complete source of information about the UK population. Nothing comes close to it. It's data about populations, employment, ethnicity, housing. You name it. That's the aggregate data, but there's more than the aggregate data. There's geographic boundary data for making maps. There's microdata for looking at smaller populations and anonymised individuals. There's flow data, so there's data about where people move.
It might be where they live and travel to work, or where they've migrated within the country, or where they've come into the country, international migration. And then there's derived data such as deprivation data; you might have heard of Townsend or Carstairs scores. And there's also spatio-temporal data, which looks at how the population changes between, say, night and day.

So there were new questions in 2021. There's a question on gender in England, Wales, Scotland and Northern Ireland. It was a voluntary question. We haven't seen any data from that yet. We're really looking forward to seeing that; it could be very interesting. As you can see, the question was slightly different. The lilac-coloured one was the question asked in England and Wales. The black-lined one was the question asked in Scotland. The question on sexuality wasn't asked in Northern Ireland, but it was asked in England, Wales and Scotland. And again, the way the question was asked is very slightly different.

Then veteran status. This is people who have served in the armed forces but are no longer in the armed forces. It wasn't asked in Northern Ireland. Obviously, that's a very sensitive question for Northern Ireland. They will be attempting to... sorry if you can hear that, dogs. Yeah, so it wasn't asked in Northern Ireland. They will attempt to recreate the answers from administrative data. We have had output from this for England and Wales. We've had, I think, four datasets out of that, which we have available.

And health conditions wasn't asked in England and Wales, but was in Scotland and Northern Ireland. Scotland included a write-in option that Northern Ireland didn't have, and slightly different tick options as well. That could be very interesting for Scotland if people are writing in possible long COVID conditions.

So what can't it tell us? It can't tell us anything about wealth or income. There are no questions asked on that at all.
There is some derived deprivation data from other information within the census: the number of people living in the house, access to cars and vans, that sort of thing. There is no personal identification for a hundred years. There's data blurring and obfuscation, and there is sometimes cell swapping, so that people can't be identified as individuals.

Census geography is tricky. The building block for census geography is the output area, which is about 150 to 400 people. You're not going to come across the output area anywhere else. It's just for censuses. The output areas are built once the census has been taken. They're designed to have similar population sizes, to be socially homogeneous, and not to spill over certain physical boundaries such as large roads and rivers. They're not supposed to be part residential and part industrial, for instance. And that's why they are built after the census is taken, so that socially homogeneous areas can be worked out.

Output areas are then used to build up super output areas: lower layer super output areas and middle layer super output areas in England and Wales. In Scotland, they're called data zones, or they were called data zones; they might have changed their name by the time they come to do the outputs. In Northern Ireland, there were just SOAs. They might be moving to LSOAs as well, or they might just be sticking to local government district level. None of the above relates to anything in the real world. However, there's also data at local authority level. There will also be data at ward level, so that's council wards, or electoral divisions in Wales. There is no postal geography. However, we have created a system which will allow you to convert postal geography to census geography. I don't know what else to say about that.

Scotland has slightly different thresholds for output areas because the Scottish population is different. It's very sparse, certainly in the north of the country. In Scotland, it's a minimum of 50 people.
Whereas in England, it's a minimum of 100 people. Yes, it's tricky. It also changes very slightly for every census. Output area geography is designed to stay as similar as possible to the last census, but there are changes over time. It can't be helped, because the population changes and where it lives changes.

So, to access our aggregate data, we have three different routes, because the way we're funded has meant that we've created new platforms, and the way we're trying to allow access to the data differs as well. We have InFuse and Casweb, which allow you to get individual output areas and individual variables if you want. Whereas our data in CKAN is bulk data. It's just grab as much as you can, and then you can take it away and put it into your own statistical software, into Excel or SAS, Stata, SPSS, whatever you want, and then you can analyse it yourself.

I will very quickly show you CKAN. So this is what we call our CKAN. CKAN is a data platform in use in a lot of countries around the world for sharing datasets. It's got a fantastic search engine. So we can search for anything. Say we want to search for 2021 and see what we find. There we go, 33 datasets found for 2021. We could search for ethnicity. There we go, and we get our data back. Or we can use these little tags at the side to search for things. So if you want to search for all 2021 datasets with ethnicity in, there we go. So this is our Northern Ireland data. We've got some metadata all about that, and then we have the data itself. It gives you a little bit of information on that data, and then we can just download it, and that's the whole dataset downloaded. In this case, it's an Excel file. At the moment, all the data comes from ONS, and this was in Excel format.

Okay, so InFuse has data from 2001 and 2011. We will be getting rid of InFuse.
We're going to replace it with something else. We've got all our data from 1971 to 2011 in a format where we can access individual variables and individual geographies. We just need to create a platform to put on top of that, so we can show that data to yourselves and you can access it easily. But we won't be doing that until we've got all our 2021 data in a similar format. Casweb is our really, really interesting platform, but it has data all the way back to 1971, and it has boundary data for 2001 and 2001 in there if you want to create maps. And CKAN has got our latest data releases in there.

So we've got a little activity, if you want to do this. If you want to go to InFuse and have a look, we've got a worksheet that you can download, and you can have a look at using InFuse to get out some 2011 data, if that's what you want to do. I don't know if that's okay for people. The relevant URLs are in the chat, so you can follow those easily. Just let me know how you're getting on. I'll get InFuse up so that you can see what's going on.

So this is InFuse, and we'll be accessing 2011 census data in there. We can choose geography or topics first. It does take a little time, unfortunately. InFuse is old; that's just why we want to replace it. It uses a data model that is quite complex and takes a while to load things. We choose our local authority. We expand Birmingham. We choose all wards and electoral divisions, and then we add, and that confirms that we've added all wards and electoral divisions for Birmingham. Next, we're going to select from the left-hand side. We're going to select household composition. We're going to look at this first option here. It gives us some information about that particular variable. We have a total of households, and one-person households where the person is aged 65 or over. We'll add that. Next, we get some information about what we've asked for, and tell it to go and get the data.
When it's got that data, we can then download it, and we then have total households in Birmingham and also all households where there is a single person aged 65 and over living alone. And when we get the data, we get three files. We get a citation file, which tells you how to cite the data if you're going to publish. We get a metadata file, which gives you some information about the variables. And we also get a data file that has the numbers in, and that's in comma-separated values format. I'll give you a minute or two more to just go over that. But also, you can do this in your own time. And if you've got any questions, you can always come back to us. We have a help desk where you can come and ask questions too.

So we have had a question: when will InFuse be replaced? Please, please. We haven't got an exact date on that, because we're right in the middle of the 2021 releases. So we're just trying to get the data available as soon as possible in bulk format. When that's done, we will have to spend some time taking the tables, exploding them into their individual components and putting them into our data model. That takes quite a while. And then we have to create a new interface, which means involving our software division, so that will take time as well. So we're probably looking at late 2023 for that. But we will have data available in bulk form as soon as it's published. We can't compete with the flexible table builders that ONS and NISRA have created, so we're not going to. If you want to create your own tables that way, that's absolutely fine. Those table builders will have an API, so what we might be doing is looking at building something on top of that. That might be a visualisation tool or a mapping tool.

Okay. So we've had a quick look at InFuse. On other data that's available: as we say, traditionally there's been no deprivation data.
2021 will have some deprivation indicators in there, which are derived from room occupancy, house ownership, tenancy, car availability, employment status, etc. There are also traditionally a couple of recipes for deprivation data, so there are the Carstairs and the Townsend indexes. We will hopefully be updating those with 2021 data. There's also the Index of Multiple Deprivation, which is created by ONS. That's available from the GOV.UK site, and hopefully it will be updated soon as well.

If we do want to match geographies: obviously people know their own postal geography. They know their streets, they know their towns, they know their postcodes. If you want to take that information and use it with the census, we have a tool for doing that. It's not updated with 2021 geographies yet, but we're hoping to get that updated as soon as possible. It uses address points, Royal Mail address points, to calculate an area population, and then does some clever proportioning to mix and match geographies. I'm not going to show that today. We do have an activity based around that, which you can do in your own time if you like. We've got some test data that you can upload and apportion to different geographies. As I say, we're not going to do that today.

As I say, our census bulk data is available via CKAN. These are whole tables for geographic areas. It's fantastic to search. We have metadata, and we're expanding that all the time. We will have data going all the way back to 1971 in there eventually. We do have an activity based around CKAN. If you want to go to www.statistics.digitalresources.jisc.ac.uk, we can then show you how easy that is, if you just follow the instructions on screen. On our front screen for CKAN, there's a huge search box. As I say, it's got an excellent search facility in there. It also understands synonyms.
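
For anyone who would rather script the search box than click through it, here is a minimal sketch in Python. It assumes the portal exposes the standard CKAN Action API (`package_search`), which was not shown in the talk, and `BASE_URL` is a placeholder that would need replacing with the portal's real address. The helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address (assumption): swap in the real portal URL.
BASE_URL = "https://ckan.example.org"


def build_search_url(query, rows=10, base_url=BASE_URL):
    """Build a CKAN package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Return (title, sorted resource formats) for each dataset in a
    decoded package_search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported an unsuccessful search")
    found = []
    for pkg in response["result"]["results"]:
        formats = sorted({r.get("format", "") for r in pkg.get("resources", [])})
        found.append((pkg.get("title", pkg.get("name")), formats))
    return found


def search(query, rows=10, base_url=BASE_URL):
    """Run the search over the network and summarise the matches."""
    with urlopen(build_search_url(query, rows, base_url)) as resp:
        return dataset_titles(json.load(resp))
```

Calling `search("arrival")` against a real portal would be the scripted equivalent of typing "arrival" into the search box; whether synonyms are handled the same way through the API is not something the talk confirms.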
You could search for the whole term, census 2021 age of arrival in the UK, or you could just maybe put in age, or maybe even arrival might work in there. It does. If you put arrival in, we've actually got a number of datasets in there, for 2011 and 2021. We have England and Wales demography and migration data, year of arrival, and we've also got age of arrival in the UK. We have a large number of datasets there. We've got the data at England and Wales level, regional level, and for upper and lower tier local authorities.

Now, upper and lower tier local authorities are something that the government have just introduced. Upper tiers are county councils and unitary authorities. Lower tiers are smaller local authorities. However, I've looked at the data, and some of the areas are in both upper and lower tier. I'm not quite getting my head around that, and I think I need to go back and have a look at the specification. Anyway, local authorities: think of councils, essentially. We've got middle layer and lower layer super output areas, so these are the medium and smaller census areas, and also output areas, the very small building-block areas, if you want to look at really small areas. We've got those at the regional level, so we've got output areas in Wales, output areas in the East of England, output areas in Yorkshire, and if you scroll down to the last screen, we've got metadata about the releases as well.

Manchester appears in lower tier local authorities. It's also one of those councils that appears in upper tier local authorities as well, so you could actually choose either. Of course, this is in Excel, so you would need to have Excel on your computer. The Excel files that we get from the ONS have two sheets: the first sheet has metadata about the dataset, and the second sheet has the table and the data itself.
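
If Excel isn't to hand, the same two-sheet workbook could be read with a script. This is only a sketch, with several assumptions: the third-party `openpyxl` package is installed, the table is on the second sheet as described, and the first row of that sheet holds the column headers. The real files may have title rows above the header, and the column names used below are invented for illustration.

```python
def load_table_rows(path, sheet_index=1):
    """Read every row of the table sheet into a list of lists.

    Sheet 0 is taken to be the metadata sheet and sheet 1 the table,
    following the description above. Needs openpyxl (third-party).
    """
    from openpyxl import load_workbook  # imported lazily so the rest runs without it

    workbook = load_workbook(path, read_only=True, data_only=True)
    sheet = workbook.worksheets[sheet_index]
    return [list(row) for row in sheet.iter_rows(values_only=True)]


def find_area(rows, area_name, name_column):
    """Return {header: value} for the first row whose name column matches
    area_name (case-insensitive), or None if there is no match.

    Assumes rows[0] is the header row.
    """
    header, *body = rows
    idx = header.index(name_column)
    wanted = area_name.strip().lower()
    for row in body:
        if str(row[idx]).strip().lower() == wanted:
            return dict(zip(header, row))
    return None


# Tiny stand-in table with placeholder numbers (not census figures)
# and hypothetical column names.
demo_rows = [
    ["Area name", "Aged 0 to 4", "Aged 20 to 24"],
    ["Leeds", 1, 2],
    ["Manchester", 3, 4],
]
manchester = find_area(demo_rows, "Manchester", "Area name")
```

With a downloaded file, `find_area(load_table_rows("downloaded.xlsx"), "Manchester", "Area name")` would do the lookup in one step, once the file name and the name column are adjusted to match the actual workbook.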
Rather than scrolling through all these, what you can do is just select the column with the local authority name in it, click on Find & Select, and find Manchester. I've just realised that this question doesn't exactly match the data. It talks about long-term health problems or disabilities, but that's not in age of arrival. What I should have been asking about is the number of people who arrived in the UK aged nought to four years.

I will quickly share my screen, sorry about that. We could type in arrival, and you get 11 datasets, and the one we're looking for is age of arrival in the UK. If we look at that, we've got the upper tier local authorities, and we can explore or we can just download. I'm simply going to download this data and open it in Excel. So, my dataset. I shall try and expand the text on there. We have a metadata sheet, which tells us all about the dataset and the units used, and the table sheet, which is still very small, and we can search the local authorities to find Manchester. So we can find the age of arrival in the UK in five-year age bands, and I think I was meant to ask how many people there were resident in Manchester who were aged 20 to 24 years on arrival. Okay, so that's a quick look at that; you can very quickly get access to data. I'm sorry about that. It looks as though I'm using an old question in there and I need to update it. I thought I had done that. I shall update it before I send out the slides.

So, to talk about some of the other resources we have at the UK Data Service. We're funded by the Economic and Social Research Council. We're a single point of access to a wide range of secondary social science data, and we are free to use. Certainly most of our census and international data is openly accessible, so you don't have to log in to anything. Some of our data is behind a login, but again it is free to use.
So, if you go to our website, we have a large number of resources and guides, videos, access to webinars, the impact blog, case studies, the data catalogue, data skills modules as well, which I'll talk about later, and access to our help desk, where you can get general or specialist help on the data. We have a learning hub there, and if you click on it you will get access to lots and lots of information about all sorts of things, one of which is the census learning hub. You can use that to learn how to use aggregate data, flow data and microdata, how to map data with R or with QGIS, and how to access census boundary data as well.

We have our YouTube channel as well, where you can access all our webinars and also lots of how-to guides. And we have a number of data skills modules if you are new to an area: modules on survey data, aggregate data, longitudinal data, and exploring the crime surveys with R.

So you might want to know what the point of census data is. Well, one of the major points of census data is shaping policy. This is the UK budget for 2015, and how money is allocated is very much down to the data within the census. Thank you very much.