Okay, hi everybody. So today we're going to talk about area profiles. We're kind of in the middle of the census releases, so some of this will be looking forward to what data we can expect to see later on. The practical element is really focused on the topic summary tables that have been provided, and we'll go through that. There are a couple of practical elements, depending on how time goes. The second one I'll leave you with and explain if we don't have enough time to go through it. But I'm going to move on now to what we're going to cover. So first of all, the purpose of today, which I've said a little bit about; the kind of information that's available in the census; the scale at which you can look at it; and then a practical exercise in accessing data, with some examples of how you might manipulate that data. As Jill said, if you have questions, then put them in the Q&A as we go along. With such a big audience, I'm not going to try and answer them in flight. I'm going to pick them up at the point where you go on to the practical exercise in accessing data. So first of all, just thinking about what you're trying to do if you're developing an area profile. What are you interested in? Are you interested in economic outcomes, in demographic characteristics, etc.? There are a number of research areas where the census is particularly useful, and it's particularly useful because it provides you with a local geographic scale, which is hard to get from most other sources of data. There are some administrative data sources that do give you local area geographies, but the census is kind of unique in providing this baseline of characteristics of the population and so on. In terms of thinking about what you're producing, you also need to think about who you're producing it for and therefore how you represent it. And there'll be a couple of examples of how to think about that.
I'm not going to cover the use of mapping, but I will use it for examples here. And we will be running a course on QGIS, which is a free, open-source geographical information system that would enable you to do that. I think that's scheduled in May. So looking at the topic data, this has been released over October, November, December and January. And what we've got is individual and household variables and, for housing, dwelling spaces. And these are the main areas. So the first release was demography and migration. The rest have gone out of sequence. I'm going to go through these in more detail, so I'm not going to go through the headings there. So looking at demography and migration, the individual variables we have are those people who usually live in an area, age, sex, whether they're in households or communal establishments, and their partnership status. Then there's a whole set of things around migration, I suppose, broadly applied: whether somebody's a migrant, their country of birth, when they came to the UK, how old they were and what passports they hold. And for households, we get living arrangements, which is like household composition. There are two variants on that, so living arrangements and household composition; then short-term residents; and a count of the number of dimensions of deprivation. So within the census, they enumerate deprivation in terms of employment, health, housing and education. If you're interested later on, we can talk about how those definitions are done, but when you look at the data, all of that should be within the documentation behind it. In terms of ethnic group, at individual level we get ethnic group, religion, national identity and language. And for households, we get households where ethnicity is mixed, an assessment of language competence in terms of English competence for members of the household, and whether a household is multilingual.
For veterans, at individual level we get whether somebody previously served in the UK armed forces. We can tell from the census whether people are currently serving from another indicator, and whether they're living in households or communal establishments. And at the household level, we get the number of veterans in a household and whether the household reference person is a veteran. In terms of housing, for individuals who are living in communal establishments, you'll get the type of communal establishment and position. Position means whether they're a resident or a member of staff. You also get information about second addresses and the type of second address. We'll come to the use of that later on in the later releases. For a household, we get whether they share facilities, which means a kitchen or bathroom, the type of tenure, the type of accommodation or type of property, and an occupancy rating measured on both bedrooms and rooms. I'm happy to talk about that later on. The definition has changed slightly and there may be issues if you're using this for comparability. There's also whether people have cars and central heating. And just thinking about household housing deprivation, this is based on there not being enough bedrooms, or sharing facilities, or lacking central heating. And then in terms of dwelling spaces, these are all of the dwelling spaces, so census enumerators go out and estimate places that are unoccupied. So it says whether a place is occupied, and then the number of bedrooms and number of rooms. Labour market and travel to work. So for an individual, we get the occupation, occupational social class (the National Statistics Socio-economic Classification), the industry, and the banded number of hours worked, so we can tell the difference between those people who work part time and full time. We also get whether somebody has ever worked and their current economic activity status.
So that would include people like students, people who are retired, people who are caring. And then the method of travel to work and distance travelled to work. I think there's a note there about the point at which the census was taken. This was during lockdown, so this will be quite different to previous versions of the data. The next release was around the new questions that came in in 2021 on sexual orientation and gender identity. For both of these we were uncertain about the return rates. I think the return rates have been quite good, but from an ONS perspective they're quite nervous about the geographical scale they allow this data to be seen at. So if you're interested in that, you may have to deal with a higher-level geography than ideally you would like. In terms of education, we get the highest level of qualification and whether individuals are schoolchildren or full-time students. And in terms of education deprivation, this is seen as a household where nobody has a level two qualification or above, and level two is the equivalent of GCSE. In terms of general health and disability, we get a self-assessed question on people's health, whether they're disabled, and whether they provide unpaid care. Again, the health deprivation is picked up from the measures in general health and disability. And then at household level we get the number of people with long-term health problems or disability. So that was phase one. That was all of the data that we have available now. What we're in the middle of is the release of phase two. The first part of that is about the short-term population, so that's people who come to the UK and intend to stay for less than 12 months. They've been enumerated, but they are short-term residents. The second one, which is what I suppose lots of researchers are looking for, is multivariate data that will have two or more variables at different scales.
Now the ONS have introduced a new interface called a flexible table builder. I've not seen it yet, though I've seen the pilot of it previously. What this will provide is the ability to put in multiple variables. It incorporates statistical disclosure control, which means that if the counts are too low, they will first of all maybe be swapped with neighbouring geographic areas, but also some counts will be suppressed because they're too low. So it's quite likely that the categories available may be reduced. So for example, if you're looking at ethnicity and housing tenure, you might find that when you look at ethnicity or housing tenure on their own you can look at output area level, the smallest geography. But you may be pushed up a geographical scale or two when you combine data, because you'll have too many empty cells. So that actually has a number of different things in it, and I'm going to go through each of these and say something more about them. So the alternative population bases will provide an estimate of workplace populations, and workday populations, which combine those people who are resident and not working and those who are working in the area. For students, for instance, you'll get out-of-term addresses. And also second addresses, for a number of potentially different reasons, so if it's a holiday home, etc. And those population bases should be based on people spending at least 30 days there. For example, an MP who lived in their constituency and travels down to Westminster for Parliament would appear with two addresses, one somewhere in London and one where the constituency was. And then for some of the questions, we've been promised small populations. So for these, as well as the categories (in ethnic group, there are 20 categories), there was a write-in option. So lots of much smaller groups can be identified, and similarly with country of birth, religion and national identity. So for those where there's enough data, there's a commitment to produce data sets.
So there are some examples here of what they've said in their initial documentation, and at the bottom, also about British Sign Language, Romani, Somali, etc. Now the next two sets of data are likely to only be available from the UK Data Service. They'll be available, I've been told, from the summer. So the first is flow data. This is a particular type of data that shows where somebody came from, based on where they were 12 months before in terms of migration, and where they are now. So we can have some idea of migrant flow both internally and internationally. Secondly, we will get workplace flow. So this will be based on people's place of work and place of residence and what they say about their commuting. Similarly with second address and student flow. So this relates to the second address detail and the out-of-term detail. For the flow data, we will hold an open version, which won't hold much data; a safeguarded version, which you'll be able to access if you're registered with the UK Data Service; and also a secure version with more detail. To access the secure version, you will need training and for your project to be approved, and you'll need to use the data in a kind of safe space where you can't take it outside. In effect, what you do is gatekept so that confidential data can't be released. So microdata is a set of multiple variables about individuals. There are a number of samples being released. There's a 5% sample based on a regional geography and a 5% sample based on combined local authorities. If you're registered, you'll be able to use them. There's a 1% household sample. And there's also one that's sent to an international census database at the University of Minnesota. So that's the safeguarded data. If you want more detail, then the secure data has 10% individual and 10% household samples. Those will be available, as I said, with training and approval of your project.
So just to start off with, having done that, could we have a look at what information you're interested in? So if you go into Mentimeter and just type in what you're interested in, we should start to see a word cloud developing. So deprivation seems to be figuring quite big in the areas, then housing, health, ethnicity, age, education, demographic details. So a number of interesting areas, and I suspect we could probably cluster some of these underneath that. So picking out one that just struck me, in terms of smoking, there's no data in the census about smoking. And again, there's not much on it. There's nothing on income. So the proxy for income is the occupation, which will give an occupational level or social class, but there's no direct measure of income. What I would say for those of you who are interested in researching areas that aren't covered here is that the ONS are open to proposals through the secure service to try and link data from other places. So if, for example, you were looking at the Health Survey for England, for the work on COVID there were links made to census-level data pre-release in order to enable people to look at it. Somebody's put constituency-level data; that will be there. We'll be looking at more data ourselves. Things around crime and violence are not really here, so again, you would need to link other data to them. Okay, so that's looking like an interesting word cloud. So let's move on to geographic scale. I'm going to use some examples here to illustrate this, but broadly speaking, the boundary data and the data that can be mapped onto it will be available on a kind of administrative base for local authorities and health. So you could combine those to fit police force areas or local enterprise partnerships or sub-regional groupings of authorities such as Greater Manchester. There'll also be electoral data. So the ward data is available. I believe the constituency data is available or is about to be.
And lastly, the statistical areas, which I'm going to say a bit more about. You can get these from the UK Data Service or the ONS Open Geography Portal. So that's the kind of boundary data. If you're interested in mapping, you need to look at those. The current status at the UK Data Service is that you need to download the boundaries for the whole country, but there will be a boundary selector tool coming up fairly soon where you'll be able to pick the boundaries for the area you're interested in. So output areas were introduced in 2001, and the aim was twofold. One was to make them more homogeneous. So there was a matching of characteristics within the areas that meant they would be similar in terms of things like tenure, property type, etc. They were also set minimum and maximum sizes. So the output areas are the smallest, the building block, and that's where we can expect a more homogeneous type of population. Prior to 2001, we simply used the collection base for the census, which was the enumeration district. The second aim is that these are more stable over time. So from 2001 to 2011 and 2011 to 2021, the target was that there would be less than 5% of changes in output areas, and that's been achieved. Those output areas are then grouped up into lower layer super output areas. So this is maybe familiar to some of you, and it is used in a range of public statistics. So the IMD indicators, the Index of Multiple Deprivation indicators, are all held at that level, LSOA. Reported crime is held at that level. And then the largest grouping, which is probably equivalent to ward sizes, is the middle layer super output area. So this has been used in statistics such as COVID cases and educational attainment. So I'm just going to illustrate what happens here. This is a map of tenure, the percentage of people in the private rented sector in the city of Manchester.
At the left, at MSOA level, you can see the distribution of the percentage in the private rented sector. And you can see some highlights where there seem to be large proportions. As we move along to the smaller area, the lower layer super output area, we see more distinction emerging. And that's kind of interesting. And again, we get a bigger spread. And as we move into the output area, we can see the range now goes from nought up to 97%. We can see much more fine, granular detail. And this is the first point I suppose I wanted to make about representation: we can see very detailed information, but how do we make that make sense to whoever our audience is? As an example, I have taken that output area level for Manchester and mapped it onto the wards. So you can see the difference within wards as well as the kind of shape of what's going on. And I've focused in around the university area, where you can see these big pockets of private rented sector that may or may not have something to do with the university. But that's broadly around the south of the city centre. And you can see quite distinct patterns of tenure. If you were to look at the other tenure types, you would probably find things like social housing in parts of them and home ownership in parts of them. So that's the geographic scale. I don't know why this says players, but I think it's because you can have a leaderboard. Okay, so a lot of people want to look at smaller geographies, output areas, and then wards, and then local authorities. And the second one: what type of area do you want to profile? There's a kind of dominance of local authority, but with some sub-region, region and country answers. If we have time, it would be interesting to know what the others are, but I don't think I've got time to stop and ask now. But if you want to, put it in the chat. There's a couple about parish issues, and it's not a question I know the answer to. It's one I'd have to go away and find out about.
I mean, they are clearly an electoral division. I'm not sure they're big enough to publish at that level, but I can research that question afterwards. Okay, so moving on then. This is where we're going to get into a bit of a practical exercise. There are three ways of accessing the aggregate data. There's ONS, the UK Data Service and Nomis. I'm just going to say a little bit more about them, and for completeness there's also where you can get the boundary data. So for the practical exercise, on the event page I've shared a Word document which is about accessing data. And what I'd like you to do is just have a look at these different sites. So ONS gives you data. You can scale your geography, the type of area you want to select, and the boundary of it. So if you're doing, for example, a local authority, you can select the wards within a particular local authority. ONS also has a number of supporting materials, and the release calendar if you want to keep up with what's coming from the census. At the UK Data Service, our data site is currently in development, so at the moment you have to download all of the data and take out what you want. There are boundary data as well. There's supporting material, including some explainers about some of the variables. So there's a section on race and migration, and things about the new variables: veterans, sexual identity and gender identity. And within Nomis, you can pick the data. Similar to ONS, you can set the geography you want and bound that within a particular area, and you can also select the categories you want to pick up. So if you want to access that document. Let's just check people can access that, first of all. I will pick up the chat later and have a look at some of those questions. One thing to say is that the Scottish release isn't until next year, so we haven't really got to grips with what's going to happen there.
Yep, so Chris has put a link in the chat to the document. Well, to the web page. The document is called Accessing Data, so it's the first on the screen. Okay, so I think this is an opportunity, if you want to have a comfort break as well, to take 10 to 15 minutes and we'll come back. Hopefully you'll get a flavour of what the different websites offer. It's fair to say for all of them that they will be changing over time. They're likely to converge to something similar, with the ability to select multiple variables, to scale your geography, etc. But at the moment that's the current state of them. So I'm just going to take a comfort break myself. I'll be back in five minutes. If you have any questions during that time, then pop them into the Q&A. A quick question I can answer. LSOAs and MSOAs are all bounded within local authorities, but they don't fit ward boundaries. So the method ONS use for mapping output areas to ward boundaries is a statistical one. There may be different counts. If you are involved with London, with the census information in London, for example, they use their own method based on housing. And household data is used as a sampling base, and a lot of surveys will be rebased for that. So the first question I've got up is about the short-term visitor population. There were certainly people who were staying in an establishment or a household. So it would ask them to complete an individual form, which would say whether they were a short-term visitor. So for example, if I had somebody coming to visit me from overseas, they would have been included in my census return. So it has no information about visas or citizenship, but it would include them if they completed the form. The health deprivation metric is based on people who say their health is poor or very poor. Jonathan's asked about the small population data.
Some of that is available, so you can get the ethnicity and religion data at the moment at local authority level. The thing to do is to watch the ONS release calendar, because their dates have tended to shift around. So the dates are becoming firmer around the flexible table builder and that multivariate data. But what you'll see is others with more general times for release. Somebody's asked what the percentage samples are. I'm not sure what you mean by the question, if you could expand on that. Charlie, you can go to the UK Data Service without logging in. If you're going to use it in future, particularly for safeguarded data, it's a good idea to register. Okay, so just a question about the number of deprivations in the ONS data and IMD. The answer is there isn't much connection. The Index of Multiple Deprivation has 32 indicators that are amalgamated together into seven domains, whereas the census data is fairly basic. So we've talked about health, we've talked about education, we've talked about housing. The other characteristic is employment, and that's if there's a household with nobody working. So the only question outstanding is on the percentage samples, where I'm not sure what you meant, if you could expand. I think you're right that the short-term resident population is not necessarily reliable. It won't include everybody who's here. Housing affordability data. I'm not sure it's available at that level. It's a while since I've looked at it, and the last time I looked at it, it was at MSOA level. Okay, why are ONS, the UK Data Service and Nomis different? Well, for ONS, this is the first time they've released the data themselves. In the past, they've released it through Nomis and through us. And we've used a different tool to the one we're using now. Nomis looks like it's using something fairly similar. So the ONS is a newcomer in terms of the way to access data. And I think partly it's because we have different audiences.
So Nomis is probably more familiar to those involved in policy areas in public services. And the UK Data Service tends to have a more academic base, though a number of policy people do use it as well, particularly for other information. I wouldn't want to comment on the APIs. I know my colleagues who've been taking in the aggregate data have said they have had quite a lot of work to do. So I don't know what ONS's plans are for that in terms of public release. They've obviously released something to Nomis and to us, and I'm not sure that's a public document as yet. But at some stage I would expect them to be transparent and to release the API documentation. Well, I'll just give a couple more minutes for you to have a look at those, and then we'll return to the rest of the questions. A couple of things in the Q&A. From Keith: the best source for geography undergrads to use in a practical class. At the moment, I would probably say ONS, because they're the first past the post, so they will have the multivariate data. It depends on what else you're doing with them, because the UK Data Service has probably more explanatory stuff aimed at that audience and also holds a number of other resources that might be interesting to use. Hi Jasmine. I'll try and find the paper on output area classifications and put a link, not in the chat, I'm sorry, but on the event page. There was some work done for the 2001 census by a colleague at UCL, which was how they arrived at those. I think they took a load of potential variables and saw how significant they were. Hi Vita, I'm not sure of the geography in Scotland, but there will be smaller areas than local authorities. Okay, I think we'll go back to the webinar then. So thanks for those questions. Hopefully I've answered some of them, which I didn't expect to be able to do. So that was the practical exercise.
That was the first one. For the second, there's a lot of data on the event page, and I'll be suggesting you have a look at that and then we come back. So thinking about building an area profile, what do you need to decide? First of all, you decide your variables, as we said: what things you're going to include in that area profile, whether it's an economic profile, a demographic profile, etc. You then need to get that data and prepare it, define your geographical scale, and then decide how you're going to represent it. So it could be by using mapping, or it could be by using statistical summaries of the data, or tables or charts. And a couple of types of indicators that might be useful for this are here. So this is taken from the MSOAs in the six Olympic boroughs in London. And what I've done is to look at the proportion of a particular group in an area such as a neighbourhood. So it's at MSOA level, and I've combined counts of families with dependent children, so that's couples, lone parents and other families with dependent children, and divided them by all the households in an area. So what you can see is whether this is an area where there are a lot of dependent children. And you can see a graph here of the distribution of those percentages. Similarly, you can map these and band them as I did with tenure. What this tells you is that there are some areas where there are high proportions of dependent children and some areas where there are low proportions. When I did this kind of exercise in Greater Manchester at LSOA level, what it showed was parts of the city where there were very few children. Those linked very much to the city centre and to what we call the Oxford Road corridor, which is where MMU and Manchester University are located. Now the second thing you might want to do is to focus on a particular group. So in your community, you may have a particular group that you're interested in looking at. So how are they distributed?
And what this measure does is to take the count of households with dependent children in an area and divide it by the count of all households with dependent children across all areas, then multiply by the number of areas. So the number you get, if there was an even distribution, should be around one. And you can see the majority fit into that. But there are some areas where there are quite high concentrations and some areas where there are quite low concentrations. So these kind of fit into the measures. And in a couple of the example spreadsheets here, I've added in those measures. Okay, so we've got some answers about Scotland in the chat. So moving on from there, what I've provided is a number of different spreadsheets on the event page. I'll just pick up one as an example. So we've got the bedroom standard, country of birth, ethnicity, household composition, highest level of qualification, industry, occupational social class, occupancy levels, sorry, occupation, religion, tenure. Can you see this on the screen? Yes, we can, Nigel. So if I go into the religion one, I've picked out the wards in Derby, Leicester and Nottingham. And this is the table that came down from Nomis. So what I've done is add in a second table where I've translated that into local authority, ward and the counts, and then I've generated that second indicator, the concentration indicator. So if you look at Derby, you can see there's one ward, Blagreaves, where 19% of the Sikh population live, another where 21% live, then 15 and 16%. So they are quite concentrated in particular parts of Derby. And that kind of measure could be applied to any of those religious groups. Similarly, I've used the other indicator with tenure. So here is the raw data coming from Nomis. And then I've looked at the proportion in the private rented sector, again at ward level, so we can see again some quite big differences between the concentrations, with 41% in Leicester, one over 62%, 48%, etc.
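The two indicators described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the counts have already been downloaded and arranged as one number per area. The function names, the ward names and all of the figures are invented for the example and are not taken from the census or from the spreadsheets on the event page.

```python
from typing import Dict


def proportion_in_area(group: Dict[str, int], total: Dict[str, int]) -> Dict[str, float]:
    """First indicator: the share of each area's households that belong to the group."""
    return {area: group[area] / total[area] for area in group if total[area] > 0}


def concentration_index(group: Dict[str, int]) -> Dict[str, float]:
    """Second indicator: each area's share of the group across all areas,
    multiplied by the number of areas.

    A value near 1 means the area holds roughly an even share of the group.
    """
    overall = sum(group.values())
    if overall == 0:
        raise ValueError("the group has no members in any area")
    n_areas = len(group)
    return {area: count * n_areas / overall for area, count in group.items()}


# Invented counts for four hypothetical wards (not census figures).
households_with_children = {"Ward A": 300, "Ward B": 100, "Ward C": 50, "Ward D": 50}
all_households = {"Ward A": 1000, "Ward B": 1000, "Ward C": 500, "Ward D": 500}

proportions = proportion_in_area(households_with_children, all_households)
concentration = concentration_index(households_with_children)

for ward in households_with_children:
    print(f"{ward}: {proportions[ward]:.0%} of households, index {concentration[ward]:.2f}")
```

In this made-up example Ward A holds 300 of the 500 households in the group, so its index is 2.4, well above the value of one that an even spread would give. Dividing the index by the number of areas gives the share of the group living in each area, which is the form of the ward percentages quoted for Derby.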
So we're going to go back to this and say, have a look at those now and see what you can do with them. Now, I haven't tried to produce a profile document myself, because I think it depends on what variables you want to use. But this would give you a kind of ward-level report based on a number of the indicators. For those of you who aren't going to go away and get your own data because you're not at that stage yet, if you want to practise, these data sets are available there for you to use and play with. You could, for example, do a profile of Nottingham, or perhaps Leicester or Derby. So if you want to have a quick look at those, think about how you might use them and the kind of indicators you might want to generate. We'll give you about 10 minutes and then come back for a final set of slides. So we'll come back at 10 past and finish off the webinar, and have a chance to go through any more questions. So if you have more questions, then put them into the Q&A and we'll try and wrap up. There's a small survey as well about our approach to events in the future. If you are going to go before that because you've got other commitments, could you please fill in the feedback as well? Thank you. So another minute or so to have a look at those, but they will be available on the page for those of you who want to look at them. And then I just want to say a few more things about area profiles. I suppose the main one will be about data that's to come. Okay, I've got a couple of questions coming through. So geometric centroids are available through the UK Data Service. You'll have to look at the set of supporting geographical information. I can't directly point you to it now. But if you have an issue with it, then email me or call the help desk and we can sort out access. They should also be on the Open Geography Portal, but I haven't used that. I tend to use the UK Data Service. Charlie has asked about MSOA-level data.
So this is the ONS's own area profiling tool. You can download that data. So it's about going through, identifying the data you want in ONS and then picking the MSOAs within the geography you're interested in. Okay, so let's just go back now to the presentation. So I suppose the big thing I've personally been interested in is when we can look at different characteristics of the place or the people we're interested in. So, for example, I had a student who did some work on the Bangladeshi population and how it changed over time. And in looking at that we used the different aspects of housing conditions, educational attainment, employment, etc. And it was possible therefore to produce a picture in Greater Manchester of how the Bangladeshi population had changed over time. So we used census data from 1991, which was when the first ethnic categories were put into the census, through to 2011. Similarly, if you wanted to look at ageing, you might look at the neighbourhoods where older people lived, and that would give you some kind of focus on how to model the experiences they had, similarly in terms of employment, health, education, housing, etc. So the multivariate data is going to be very important for meaningful area profiles, and I think we will be developing courses. But having not seen the flexible table builder, or the way that either Nomis or the UK Data Service will translate that, there will be work for us to do before we can advertise courses. On that basis, I'm just going to see. So you've got the presentation slides, so you can look at our training events. We have a set of what we call core courses: an introduction to the data service; using the secure access service, which is the training to acquire accreditation; how to find data; and data management. And for those of you involved in teaching, we also support students through the dissertation programme for those who are using our data.
There's a range of other courses available on things like longitudinal research; one coming up on mental health research, where the practical element will use data; crime mapping using R; and stuff on computational social science and time series analysis. So what we plan in terms of census courses is probably to repeat some of these, because they do seem to be very popular, but also, once we get the multivariate data, to develop these courses and make them more accessible, and similarly when flow data and microdata become available. We've arranged with ONS to do a workshop on statistical disclosure control, which will be linked to multivariate data. And in the future we'd also like to do one on quality assurance. So one of the key things to say about the census, and we can't see the data yet below local authority level, is that the numbers that we get are estimates. In most areas we don't get 100% of responses, and those responses will vary between types of neighbourhood. So there are some quality assurance things to think about, once we get that smaller-level data, to see which populations might be under- or over-represented. And there's my email address, so whether I've said so or not, if there are things you want to ask me about, then please do. I'll go through the chat, which had quite a lot on geography, and definitely research the parish issue. I think somebody's already answered the question about Scotland. I think it's fair to say that a lot of these courses will be repeated with Scottish data once that data has been released. And following on from that there will also be some. Okay, so thanks very much. I think we're going to close the webinar now. Do look at our events page to see what's coming up. There are likely to be more census courses developed as the releases are made available. As I said, there's definitely going to be one on statistical disclosure control.
We will be doing multivariate data as well, and potentially something on the quality assurance processes that are used. So enjoy yourself and thanks very much.