Hello and welcome to our workshop for the third sector on evidence and impact using census data. My name is Patti Doran, I work for the UK Data Service at the University of Manchester, and with me is Dave Rawnsley, who also works for the UK Data Service but is based at Jisc. So just to go over what we're covering today: I'm going to give a quick introduction to the workshop and what we're trying to achieve by delivering these workshops to the third sector. Then Dave's going to give a presentation about aggregate data. That will be followed by a short quiz just to check your understanding of the information that Dave's been covering. And then Dave's going to give a demonstration of some of the tools that we have at the UK Data Service. So we're going to have a look at InFuse and GeoConvert, and if we've got time we'll also do a little demonstration of DKAN, or we might leave that for the end, around the questions and discussion. And then we've got an activity for you, which is in the handout, and you'll have 10 or 15 minutes to complete that activity. Then we'll come back together and have some feedback and see how you got on, answer some questions, and then hopefully move into a more general discussion about using data to present impact for your services. So just as an introduction, the UK Data Service is a comprehensive resource that's funded by the Economic and Social Research Council. It's a single point of access for a wide range of secondary social science data. That includes census data but also social survey data. But what we also have, which people are generally less aware of, is a whole range of support, training and guidance tools. And that's what we're here to do today: to help with some of the support and training side of things. So the objectives of this workshop are to promote our services specifically to third sector users and to increase understanding of how data can be used within a variety of settings.
And we want to support people to access that data, and that might be through our tools or that might be through other ways, which we'll discuss today. And the whole aim of that is to enable people to produce evidence by enhancing their data knowledge and skills. So the rationale for this workshop was that we're aware that there are lots of third sector organisations out there who provide services that fit within the social science field. Lots of people work to improve outcomes for marginalised groups, reducing inequalities, and are out there providing services to those most in need. And lots of these services run on short-term funding, and the funding relies on demonstrating the need and proving that the work that you're doing is creating impact and that the interventions that you deliver are worth funding. And so how the UK Data Service can help is through all the evidence that can be produced from the data that we store. So the social survey and census data that we have can help provide context to the work that you're doing and demonstrate where the services are most needed. And so this is the first workshop of a series of three. Today we're using census data, but we're going to explore other forms of data over the coming weeks as well. And we know that third sector organisations can mean a whole lot of different things. We could be talking about really big charities. We could be talking about very small local groups. We could be talking about small groups that have a national remit. So we know that there's quite a lot of variation out there, and that not all third sector organisations will have a dedicated research team, and often it may fall on the people delivering the services to also produce the reports and demonstrate the impact and need. So we know it's quite complex and there are a lot of demands on people's time. So we're trying to give you more awareness of the tools and of how we can support you.
So just to go back to the sort of data that we have. At the UK Data Service we have three sorts of data. There's aggregate data, which is what we're looking at today, so we're using the census data to show how aggregate data can be used. And then we also have a lot of microdata. That's data about individuals, and lots of that comes from the UK surveys. There's also a subset of census data which is microdata. And we also have other sorts of data. So we have some international data sets, business data, and we also have qualitative data in our repositories. So there's a lot of data out there that can support your services. And hopefully through the workshop today and the ones in the next weeks, you'll get a greater understanding of how our data can support you. So I'm going to pass over to Dave now, who's just going to give an introduction to aggregate data.
OK, hi, everyone. I am David Rawnsley. I work for Jisc. We used to be part of the University of Manchester, but through the vagaries of modern life we are now ourselves part of the third sector. We are a not-for-profit organisation supporting UK academia, FE, HE, research and schools, and increasingly the third sector itself. I've been involved with the census since about 1995. I started at the University as a placement student and stayed. So what do we mean by aggregate data? Well, it's data about populations, groups, regions, countries. It can be a time series, so a series of data, or it can be a single point in time. The sorts of things that we're looking at are population, census headcounts, the economy. So we're looking at trade, possibly bilateral, the flow of trade between countries. It could be greenhouse gas emissions. We have a lot of data on CO2 emissions. Or it could be groups of people: employment rates for a country, for instance. Here's some aggregate data. These are data from PISA, which is the international educational data set.
So along the bottom we have different countries, and then we have a reading score, with dots for girls and diamonds for boys. Girls, as you can see here, are outdoing boys in every single country. When we talk about aggregate data, we're usually looking at data with a geographic unit. So it could be a whole country, it could be a census ward, it could be a postcode. And aggregate data can be about people, it could be households, it could be bedrooms, and it will have been averaged, totalled, or otherwise derived from individual-level data, which we find in survey or census returns. And it's important to understand that each data set has a universe or a population. So that might be a daytime population, or it could be a workplace population. And that data is aggregated into a table, usually. There'll be a geographic element. I've got a better image than that. I've got some actual data I got from the census this morning. So it will usually have a geographic element. These are local authorities in Northern Ireland. And they also have a unique identifier, because it's possible some places could have the same name. The name alone isn't guaranteed to be unique: it's quite common for there to be more than one ward called Central. So we attach a unique identifier to it. Each variable will have a description. Here we have age three and over, and the main language used being English. And that will also have a unique identifier as well. And that's so that you can cite this data. We've built a system at the census unit which largely removes the tables from the search aspect of it. You don't need to know what data is in what table. But we do still present the data in tabular form. So what can census data tell us? It's the most complete source of information. It's getting close to about 100% coverage. It's never quite that. It's for a single point in time, and the last one was the 27th of March, 2011.
After that day, even the day after, that data can, to some extent, be outdated. However, for that day, it's almost 100%. The next one will be March 2021. There's no exact date yet, although it will be March. And it covers a wide range of demographic and socioeconomic characteristics. This is not an exhaustive list. There are other products that come out of the census as well. So we have geographic boundary data for creating maps. Microdata: samples of individual-level data. Flow data: so we look at how people are moving around within the country and also externally, and how they're getting to work as well. And derived data: so we have deprivation data and spatio-temporal data, such as Pop24/7, which looks at populations at different times of the day. So how are they produced? Everyone gets a form, either a physical form or an online form. A lot of work goes into the forms beforehand. It's ongoing now, asking what questions do we want on that form? And local government, central government, businesses, and academic organisations such as ourselves all have a say in what questions we want putting on there and what we want as an output from those questions. In 2011, there was a new question added on civil partnerships. Local authorities might want questions on household composition and accommodation types, so they can plan for services. Central government might want to know about ethnicity, employment, disability, so they can put funds into the right departments. Business might want to understand population movement and travel-to-work areas. And from those questions, a series of tabular outputs are designed at different levels of disclosure. And that's quite important, the disclosure element. I will mention that later. There are higher levels of information available for larger areas and lower levels for smaller areas. Once the tables have been created, that's when they create the geographies. Or certainly the output area building blocks for the geography.
The output areas are altered if there's a significant change to population, social homogeneity, or physical boundaries. And I'll cover that a little bit more later. And then the higher-level geographies are rebuilt with the changed output areas. Once that's done, they add disclosure control. So disclosure control is where they blur the data. They restrict certain data, or they remove it, if it's felt that it could in any way reveal the identity of an individual. And so far, that's never happened. And the different census organisations are extremely hot on security and disclosure control, because they want people to have 100% faith that nothing bad is going to come from filling in a form. There's also anonymity through the 100-year rule: no individual's details can be disclosed for 100 years, and only then is the individual-level data released. And as I said, this has never been breached. So a little word about census geography. It's not as bad as it used to be, I'm glad to say. But it's still tricky. So the building block is the output area. That's the smallest area. It's designed to have a similar population size throughout the nations. It's supposed to be socially homogeneous, so the people within it are supposed to be, to some extent, similar. And it's constrained by obvious boundaries, such as roads and rivers. And then in 2011, output areas were changed so that they aligned with local authority boundaries, which you would have thought would have happened before that, but apparently it hadn't. In England, Wales and Northern Ireland, there's a minimum of 40 households and 100 residents in each output area, although they attempt to get 125 households. That's the recommended size. In Scotland, I don't know why, maybe because it's a lower population density, there's a minimum of 20 households and 50 people, with a target average of about 50 households. And with output areas, they attempt to keep them stable over time.
But obviously with changing populations and changing boundaries, it's not always possible. From the output areas, super output areas are built, and there are two types, lower layer and middle layer. In Scotland, they call them data zones and intermediate geographies, and in Northern Ireland, they just have lower super output areas. And this probably means nothing to you because they're not real: we don't refer to them, we don't pay our council tax to them, we don't live in a lower layer super output area. No one ever talks about them. They tend not to have names, just numbers, but it's a way of trying to create a geography that rarely changes over time. And so we can then use those to compare with previous and future censuses. On top of those, we do have real geographies, so we have regions, counties, local authorities, and wards. Those will be council and electoral wards, and electoral divisions in Wales. You don't have postcodes in the census. However, we have an application called GeoConvert which you can use to look up geographies and also convert from one geography to another, within certain caveats, because obviously converting postcodes up to council district areas can be tricky. I guess this is my final slide. So we've seen that aggregate data is derived from individual-level data pulled together so that large patterns and trends can be seen. And aggregate data are often about different groups of people or regions, collected for administrative purposes by census offices, central banks, national statistical offices, and local and central government departments, and they use them to identify how and where they should be using public money. Here we've got just a little slide to show you how the budget was spent in 2015, and it's census data that informs these decisions.
And aggregate data is also used to check how different groups in the community are affected by these policies and by these changes in funding, hopefully to inform future policy changes.
So I think we'll move on now to Dave's demonstration of how to access some of our census aggregate data. He's going to start with InFuse and move on from there. So Dave will do this for about the next 20 minutes, and then we'll move on to the activity, where you'll give it a try yourself. So yeah, thank you.
Okay, so InFuse is a web interface we built for the 2001 census data and then updated for 2011. So I'm just going to go through it and get some data out. We were thinking about what relevant sort of statistic we could find for something in the news. And we thought about households with single occupancy and households with older people, obviously with sheltering in place. It will be quite useful to see what sort of populations we have of people who are over 65 and on their own. So this is our front page to InFuse. Choose 2011 data. There are two routes into the data, and whichever one you choose will restrict the other: if you choose geography first it will restrict the detail of topics, and if you choose topics it will restrict the geography that you can get. So we're going to choose geography first, and then we're going to look at Manchester. That's where we're based, in Manchester, so I understand the ward names. Okay, so here we have all the geographies from the different nations in the UK. We do have some UK-level data. All the nations have their own census questionnaires and questions, and they are all wanting slightly different things out of the census. And over time they have moved further and further away from this idea of a UK census. But there are still certain areas that are the same. So we're going to choose local authorities. Expand that and we'll get a list of all the local authorities in England.
It's rather a large list unfortunately, so it just takes a little bit of time. Okay, so here we have all the local authorities, and we choose Manchester, and we're just going to open that up and select all the wards. All we have to do is click that one. That's the 32 wards within Manchester itself, and we'll add that. We'll get confirmation that that's been added. So Manchester, all wards and electoral divisions. I hope you can see this, and then I will click next. That'll take us to the topics. So we start with our filters, which will narrow down the topic selections that we have. We're going to choose household composition. It's just a very, very simple query we're going to do here. And just to say Dave, sorry to interrupt, but if that is a bit small for you, there is a zoom icon at the top of your screen that you should be able to use to zoom in and just make that a bit clearer. Thanks Dave. That's okay. We're just going to choose the first. We just want household composition on its own. We could choose it mixed with other topics as well. We'll just choose this one. And then we get some information about what we've chosen. So it tells you exactly what it means by household composition. Oh, this is Northern Ireland only, so I'd better go back this way. I've chosen the wrong one; it should be household composition and age. That's not exactly what I'm after. I'm just going to go with that. It says it's Northern Ireland, that's all. I'm not feeling very confident about this. The data we want: we'll tick people who are aged 0 to 64, then people aged 65 and over. And that's what we want. We've added that now, with confirmation in here. This is just confirmation of what we've selected: the selected areas and the category combinations, and then we'll get the data. Once the query's been run, a download data button will appear. We'll download that, and I'm hoping that this is okay, and I'm going to open this. You know, it comes down as a zip file.
And in there we get some metadata about what we've selected. I'll make that larger. We'll zoom in on that. So it tells us that these are the unique identifiers for the variables. So these are the field names. These are the identifiers for the cells: the topic, the category, and the description. We also have how you can cite the data if you're going to use this in any publications. And the data itself, excellent. We have the wards of Manchester. This tells you what level the wards are at, and the geography type. And here are our two counts. So this is persons in one-person households who are aged 0 to 64, and these are people in one-person households who are 65 and over. And we can just do a quick calculation on these to get some sort of percentages. So we're going to create a new column. I didn't take the whole population. So it's that, divided by that plus that, times a hundred. Hopefully that should give us something useful. It doesn't look useful at all. And I can just create that. And that gives us, of the people living on their own in each ward, the percentage who are 65 and over. And if we sort this smallest to largest, then as you might expect, in the city centre there are lots of young people living there. It's only 5%. Hulme as well; that's an area with a lot of students and young people living there. And down here, in Moston, Burnage and Higher Blackley, approaching 40% of the people living on their own are 65 and over. I will just very quickly show you GeoConvert, as I mentioned earlier. GeoConvert is a tool to match postcodes and other geographies and convert data between them. Very quickly, I can give you some information about some postcodes, for instance. We only have the latest postcode lookup, unfortunately. What we can do here is add data about postcodes. So we can add some deprivation scores. For England, these are the Index of Multiple Deprivation, which is derived from census data and other administrative data.
We can add some classifications as well. These are sort of little pen pictures of areas. Actually, I'll choose the subcodes because these are more detailed. We can add metadata about how urban or rural they are, and postcode populations; these are derived from the number of address points that the Royal Mail delivers to. And we can also add some geographic data and also data about the postcode itself. So we can see when the postcode was introduced or terminated, as some postcodes go out of service. I could supply a list of postcodes if I want. I'll use a sample list, but I could just supply my own. We get some metadata about the postcodes and about the match, and also we can get some results. I'll quickly show you here. So this is a postcode in South Manchester. Oh, we didn't put the description in, did we? Oh, sorry, this is the description. My area is described, in a little pen picture, as inner city ethnic mix. And here are the deprivation score and the deprivation rank. And we have more information about what those mean on the website. And I will quickly show you DKAN as well. If you are after large quantities of census data, so if you do want whole tables, we've loaded bulk data into something called DKAN. That's just the name of the software it uses. So here we can look at the topic. So if you want to look at language, these are all the tables that use language. So we can have a look at a language profile table for Wales, and we can just grab all the data for, for instance, all local authorities or all output areas, only the ones in Wales, obviously. Preview the data; it doesn't tell us much. Oh, here we go. So we can make those larger. If that's what you want, you can download that, and that will give you all that data for all the output areas in Wales. That will take some... oh, no, it's downloaded actually. I can quickly show you; there you go. All the data at that level on language. Okay, I think that covers everything I wanted to say.
That's brilliant. Thanks, Dave.
It's really good to see the tools.
That's it, quickly. Obviously, we're on the end of a phone, we're on the end of email and social media, so if you've got any questions, yeah, follow up.
Yeah, that's great. So next, we've had some questions come through. Just as a reminder, there is a question box there. I've tried to answer the questions that have come through, but I've only given quick answers to those, so we can pick them up again at the end when we come to the discussion. But right now we're going to have a break, though it's not really a break, and hopefully you'll all come back and talk to us at the end of it. We've got a handout. So hopefully you can see on the GoToWebinar control panel, down the bottom of mine anyway, there's something that says handouts, and there's a PDF there that you should be able to click on. And it's some step-by-step instructions to download some data, like Dave did, not from DKAN but from InFuse. So I would like you to go away just for 10 minutes. Have a look at those instructions, follow them, and download the data. And there's an optional part at the end to open up the data set in Excel or something similar and try to pull out the statistics that we're looking for. And we're really trying to highlight with this activity how useful the data can be to give context to a situation. So there's a hypothetical question there: you're in Birmingham and you want to know how many older people are living by themselves, because you're concerned about the impact of social isolation with COVID-19. So we want to know how many people locally in Birmingham that might be affecting. So have a look and see if you can find the handout. I'll just check a question here. Let's see if people are having problems with accessing this. The website Dave just used, that was DKAN, D-K-A-N; we'll put the link in there in a bit, Fran.
Oh, we've just done that, okay, all right. So what we're going to do is come back at five to three. We'll stay here, so if you do have any questions, pop them in the question box and we'll answer them. But hopefully you can all work through that worksheet at your own pace. And we're going to come back at 2:55 and discuss that and see how you got on, and then have a wider discussion about how you think you might be able to use some of this aggregate census data, or perhaps other aggregate data, to support the work that you're doing in the third sector. So that's what we're up to now. I've just got this slide that I might put up to share. That's what the worksheet should look like. I'm hoping you can all see that. Hopefully you've all got it, and these are the instructions. It says 3 p.m. there, but we're just running a couple of minutes ahead maybe. So we're going to aim for 2:55 to start back. We're going to do some polls, though. So we'll put the polls up and see how we get on that way. This is the first question here. So did you manage to download the data, to get that far through the activity? That's great, I think everyone's answered that. So most of you got that far, so that's great. Sorry that not all of you made it that far, but the activity is there to follow up on later in your own time, and we can discuss that more in a bit as well if you like. But we've just got a couple more questions. So for those of you that did download it, how many different variables or different pieces of information did you download? So we've got quite a split here. I mean, it kind of depends how you define your variables. You might have said two if you're talking about the geography and the category, or you might have said four, because we did the geography and then we did total households, we did one-person households, and we did one-person households aged 65 or over, so that was four different categories.
But when you opened it up in Excel, and I think maybe I can show you on my screen hopefully, those are the four categories there, but you'd see you had eight columns. So there were actually eight bits of data there. So it just depends how you interpret that. Those are some of the ID variables and the labels that Dave was talking about before. So it doesn't just come with the information that you selected; there's other information attached to that. So that's what we've got here. So we'll do the next question, for those of you that did get it into Excel. I realise not all of you would have done this, but how many one-person households where the householder is 65 or over are there in Bartley Green? Bartley Green being just one ward in Birmingham. Yep, so we've got most people saying 1,409, which is right; it's this number here. So that's the number of one-person households where the person is 65 or over. So that's great. The number beside it, the 3,712, that's the number of one-person households. And 10,728 was the total number of households. So I think there are two more questions. So what percentage of households in Bartley Green is this? This would have required you to do a bit of calculation, so we'll see how many of you got that far. We do other training around introduction to Excel and introduction to statistics and so on, so I realise that this might not be the sort of thing that everyone does all the time. But we've got quite a few people answering this, so that's great. And we've got most people there saying 13.1%, which is right. You might see that on my screen if you hadn't done the calculation yourself. So that's down there, the 13.1%. And actually I've just realised you've probably got the last answer on the screen as well, if you're looking carefully. I think there's one more. You don't need to look too carefully; there's a big arrow pointing to it.
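As a quick check on the Bartley Green figures just discussed, here is a minimal Python sketch of the same arithmetic. The three counts are the ones quoted in the session; the variable names and the `percentage` helper are only illustrative and are not part of InFuse or the handout.

```python
# Counts for Bartley Green as quoted in the session (2011 Census, via InFuse).
total_households = 10_728
one_person_households = 3_712
one_person_65_plus = 1_409


def percentage(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of ALL households that are one person aged 65 or over
# (this is the poll answer, 13.1%).
share_of_all_households = percentage(one_person_65_plus, total_households)

# Share of ONE-PERSON households where the person is 65 or over
# (the kind of percentage Dave calculated for the Manchester wards).
share_of_one_person = percentage(one_person_65_plus, one_person_households)

print(share_of_all_households)  # 13.1
print(share_of_one_person)      # 38.0
```

The two results differ only in the denominator, so it is worth saying which one you mean when you quote a figure in a report.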
We'll put the last question up anyway. What percentage of people aged 65 and over are living alone across all wards in Birmingham? That was the original question that we were trying to answer. And as I say, I've put it up on the screen there, so you can see that a lot of you are getting that. Anyway, hopefully some of you found that for yourself. So this is just a copy of the Excel data that I downloaded. And first I just wanted to point out that I cleaned up this data slightly. So I changed some of the labels up here at the top of the columns so that they made a bit more sense and were more concise, so I could read them in the columns. So this was the total households, this was the one-person households, and this was the one-person households aged 65 or over. So I did that first. And then I went down to the bottom and I did my totals. So I just summed up all those columns so I knew what the total across all of Birmingham was. And then I calculated the percentage by dividing the one-person households aged 65 or over by the total households, and copied the percentages down the side. And then I ordered the wards from lowest percentage to highest percentage, just so I could see the trend and see what was happening, and see what the smallest percentage was compared to the highest percentage, and then I looked at the mean as well. So those were just the basic steps that I took to clean up that data.
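The spreadsheet steps described above (totals, a percentage per ward, sorting, then the mean) could also be sketched in Python. This is only an illustration under stated assumptions: the Bartley Green row uses the counts quoted in the session, while "Ward A" and "Ward B" and their counts are made-up placeholders, not real Birmingham figures, and the field names are hypothetical rather than the column labels InFuse produces.

```python
# One row per ward. Only the Bartley Green counts come from the session;
# the other two rows are HYPOTHETICAL placeholders for illustration.
wards = [
    {"ward": "Bartley Green", "total": 10728, "one_person": 3712, "one_person_65": 1409},
    {"ward": "Ward A (hypothetical)", "total": 8000, "one_person": 3000, "one_person_65": 600},
    {"ward": "Ward B (hypothetical)", "total": 12000, "one_person": 3500, "one_person_65": 1800},
]


def pct(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Step 1: column totals across all wards, then the overall percentage.
total_households = sum(row["total"] for row in wards)
total_65_alone = sum(row["one_person_65"] for row in wards)
overall_pct = pct(total_65_alone, total_households)

# Step 2: percentage of households in each ward that are one person aged 65+.
for row in wards:
    row["pct_65_alone"] = pct(row["one_person_65"], row["total"])

# Step 3: order the wards from lowest to highest percentage.
ranked = sorted(wards, key=lambda row: row["pct_65_alone"])

# Step 4: mean of the ward percentages (not the same thing as overall_pct,
# because wards differ in size).
mean_pct = round(sum(row["pct_65_alone"] for row in wards) / len(wards), 1)

for row in ranked:
    print(row["ward"], row["pct_65_alone"])
print("Overall:", overall_pct, "Mean of wards:", mean_pct)
```

With real data you would load the downloaded CSV instead of typing the rows in; the calculation steps stay the same.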