[Opening remarks inaudible; the speaker introduces the session on behalf of the UK Data Service.] ... what the organisations require. Sometimes it's difficult to cross over, particularly where there isn't that support within your institution or within the placement. The aim of this session is to try this out, so for us the feedback is very important. I would also like to encourage you to contribute as we go along. I'm going to do a few short slides first about the kinds of data you might see, and then invite you to comment, either from your experience of data or on what you expect from the organisations you're going to. I'm then going to talk a bit about skills and tools. The main part of the session is really what we have to offer, including a couple of useful tools for some of the work you might be expecting to do. As Jill said, if you have any questions, just pop them in the Q&A. We might pick them up as we go along or at the end of particular segments. First of all, there is the data collected by the government that is published every week. At the end of each week, they consolidate the number of positive COVID tests. I picked out some of the MSOAs, that is, middle layer super output areas, that are relevant and that you might recognise.
There's University Park there, Lenton, the area around the Arboretum, the City Centre, and a couple of places in Wollaton and St Ann's. What these show is the number of cases per week that have been reported. You could look at those and say, well, actually, before Christmas, the numbers were pretty high in Lenton, and they've come down, and you can make a fist of explaining that. The way we tend to report these is to use what's called a rate. The standard rate for this is, for some reason, cases per 100,000. What you can see now is that I've standardised those cases, so it's easier to compare between the different areas. In particular, if you look at the Wollaton area and St Ann's East, they're much smaller than the other areas. If we had just compared the raw numbers, we'd have had quite a different picture. The rate in St Ann's East is quite high, whereas the number wouldn't appear out of place against some of the other areas. That's just a quick way of thinking about how you present numbers. You may well be presented with tables of data that just have counts in them. If you're going to translate them, percentages or rates are useful. In terms of looking at COVID, as I said, the convention is to look at cases per 100,000. In other areas, you might be looking at different rates. If you're looking at employment between different wards in the city, you might want to translate the counts into percentages. I took the population figures here from the published mid-year population estimates. You will need to find an effective denominator that tells you how many people are in that particular geography. Moving on from here, the other big source of data you have is what I've called administrative data. Generally, this is collected as part of the operation of the organisation you're working with. It might be individual records. If you're working in a small charity, it might be a paper register of people who come in and use the services of the charity.
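As a minimal sketch of the rate calculation described above (a count divided by a population denominator, scaled to 100,000), the following Python snippet may help. The function name and the figures are hypothetical and are not the values from the slide; they only illustrate how a smaller area can have fewer cases but a higher rate.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Return the number of cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Hypothetical figures for illustration only (not taken from the talk).
areas = {
    "Area A (larger)": {"cases": 30, "population": 12_000},
    "Area B (smaller)": {"cases": 20, "population": 5_000},
}

for name, figures in areas.items():
    rate = rate_per_100k(figures["cases"], figures["population"])
    print(f"{name}: {figures['cases']} cases, {rate:.0f} per 100,000")
```

The same function works for any denominator you can find for a geography, for example a mid-year population estimate for a ward.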
If you're working somewhere larger, it might be electronic. It might be a row in a spreadsheet. It might include numbers and words, and it may also include things like sound. You may have an interview, which is an individual record of talking to someone, but it would be audio. The first thing to think about with that kind of data, if you're asked to analyse it, is how does the organisation do it at the moment? The second question to ask is, does your analysis need to be repeated? I think this brings us on to an important area, because if it needs to be repeated, you need to do it in a way that somebody else, who's not you, can carry out in the future. That means you're also going to be writing a set of instructions and using tools that somebody else can acquire the skills to use fairly easily. So if you're asked to look at a task with administrative data, those are the two key questions I would ask: how does the organisation do it, if it has done it at all, and does your analysis need to be repeated? The second type of data to think about is secondary data. This is stuff that's already been collected and is held in archives like our own, the UK Data Service archive. It might include surveys, it might include the census, and it might include qualitative data: interview recordings or transcripts, evidence from focus groups, photographs or documents. If you're using secondary data, you know that it's already been collected. For many people, the assumption is that collecting your own data is best. I would say, from my experience, that using other people's data that has already been collected and validated is generally more effective for most pieces of work you would need to do. Clearly, if the data isn't there, you might need to think about the next area, which is primary data. This is where the organisation might have decided, or it might be part of your task, to help understand issues by producing surveys, doing interviews or collecting data.
There are a few things to think about here. I think the first is, how do you design a good survey? One piece of advice I would give you is to have a look at those that are already around. For example, if you're doing a survey on food needs, then it's probably quite important to understand the type of household you're working with. How old are the children? Is it a couple or a single parent? What are their living circumstances? What are their cooking circumstances? That way you get some picture of where the service is meeting needs. We call those things demographic characteristics. Typically, they include things like age, sex, household structure, ethnicity, maybe migration status, et cetera: the kind of things that say who we are. If you're collecting qualitative data, I think you need to think very carefully about how you store it. Whether a survey is on paper or electronic, you need to store it, but surveys tend to look after themselves in that way, because to analyse them you're likely to put them into some kind of computer software. Whereas with qualitative data, you might be writing a set of notes, or you might be taking an audio recording and then doing a transcript. The way you store those so that other people can use them, again, needs some thought and needs discussion with the organisation you're working with. Do they regard the data as sensitive? Do you have to take particular safeguards to make sure that it can't be accessed by other people, et cetera? So, just to move on from there very quickly to skills and tools, my experience of working with placements over a few years was that most organisations tended to use Excel for numerical data. Some of the organisations you work with may have their own databases to collect their operational data, and they may also have reporting tools available that allow you to analyse and use it. So, those are the things that you would be likely to find available.
Whereas in universities, we often use quite expensive software, partly because we get it cheap. We use statistical packages like SPSS and Stata, and mapping software like ArcGIS, but that may be far too expensive for many of the organisations you'll be working with. So, in terms of general advice on this topic, I would say that if you're doing something the organisation needs to do again, or that another placement student will need to do later, use the tools the organisation uses for its work, or commonly available or free tools. But if you're doing a one-off project, a research project, then use either the tools you're familiar with or the ones you want to learn. So, the UK Data Service is funded by the Economic and Social Research Council, which is the part of the UK research funding infrastructure particularly targeted at the social sciences. It's a single point of access to a wide range of secondary social science data, and we also develop support, training and guidance. We run a number of events: quite a lot of general ones, so that people can enrol on them from all over, and some specific ones like this for particular audiences. And these are the kinds of topics we cover. This is not an exhaustive list, but it tries to pull together what we look at. So, if you're interested in employment and work, we hold the Labour Force Survey, which is done every quarter, and there's a component in which the same people are asked questions over five quarters. A larger part of that collection is the Annual Population Survey, which uses fewer questions but collects a lot of information about the population of the country. In terms of health, we hold the Health Survey for England, the Scottish Health Survey, and other related health surveys. On family finances, there are a couple of big surveys, one on family resources and one on living costs.
Then there are the crime surveys for England, Wales and Scotland, a whole set of things around attitudes and opinions, and housing and the local environment. So, these are national surveys. We'll talk a little bit more about them, but we hold those, so you can access information about these areas. These are quite large surveys. Many are held annually; some ask the same people each time, others ask a random group of people. So, for example, the English Housing Survey is what's called cross-sectional. A random group is picked each year, and they are asked a series of questions about their housing conditions, their attitudes to housing, and so on. It's commissioned by the Ministry of Housing, Communities and Local Government. I suppose the reason for telling you these things is that these surveys can provide a backdrop, the national picture, that would inform priorities for your local organisation. It might inform some questions you would ask. So, if you're looking at poverty and debt, things that you can find out from the Family Resources Survey about what kind of people are experiencing poverty and debt might be useful, so that you can compare the national picture with what's going on locally. So, the target audience is basically most people. I think the one proviso is that, because this is publicly funded data collected basically for the public good, there is a question mark over commercial use of the data, and it may be chargeable to that audience. But as long as it's in the public interest, it can still be used. In the set of people we talk about here, quite often the business consultants are people who will be working with local organisations, local charitable organisations, et cetera. As a student, you fall into the category of academic research.
Government departments use it quite extensively, as do charities and foundations, business consultants, independent research centres and think tanks. So, we have a large potential audience, and I think smaller local organisations make less use of this data than they could; that could be improved through working with you, because you're aware of it. I think that's one of the reasons placement students are a target audience for us: the skills and knowledge you take into smaller organisations, who may not have the skills themselves to access the kind of data and tools we have, could be useful to them. So, the data generally comes from official agencies, mainly central government. There are some international datasets that are collected from organisations like the World Bank and the IMF. Research institutions that are funded by the UK research bodies may be required to deposit their data, so you would find some data held in the archive that has been collected by other research projects and individual academics. In terms of market research agencies, most of the data deposited by them is where they've been commercially contracted by public bodies to do the data collection. There may also be public records and historical sources. Focusing on surveys first of all, they hold data about individuals or households. They're often commissioned by government departments, so the Health Survey is commissioned by the Department of Health, et cetera. They're conducted by organisations such as the Office for National Statistics, the National Centre for Social Research, et cetera. Their key characteristics include large sample sizes. The aim of doing this kind of survey is to be able to say something about the state of the country in terms of a particular area.
So, if you're talking about the world of work and employment, then the aim is that you can draw conclusions about what the national picture is, because we've got a large enough sample. And we use similar questions with a new sample of people each time, often repeated regularly. So, the English Housing Survey is a repeated cross-sectional survey. Here is an example of one that is run every year. Quite often, you will have seen news headlines about issues like Brexit and immigration as you've been growing up, and more recently probably about inequality. A lot of the reporting and analysis of this comes from this survey, the British Social Attitudes survey. It's been run every year since 1983. It covers social attitudes on a range of topics, and there are annual reports on the key topics. So, the chart on the right shows whether people think income distribution is fair in Britain, and the difference is shown between England and Scotland. What it seems to show is that, as a group of people in the country, a bit more than half of us think income distribution is unfair. But there's a larger proportion in England who think it's fair or very fair than in Scotland. So, that topic of inequality has been one of the focuses. If you're interested in particular attitudes within the population that affect your service, then this survey might be a useful one to look into. The Health Survey has masses of data. It takes details from approximately 8,000 adults and 2,000 children each year. It asks questions about their experiences of and attitudes towards health. It also takes clinical measurements, so height, blood pressure and weight, and biological samples. And this chart comes from that. It's basically showing who's been drinking more than the recommended weekly amount, by age and sex. I have something obscuring the top right of my screen, but the darker bars are men and the lighter bars are women.
And then we have a breakdown into 10-year age bands. So, we can see that around one in four men are drinking too much up to the age of 45, then it rises for those between 45 and 75, and then it drops again when people hit 75. This is about those people who have answered the question. So, if you were to ask where the problem age for drinking is among men, I would guess around 45 to 65, because that looks like where it's high. That's the group you'd want to intervene in, where you may be preventing alcohol-related diseases, et cetera. With women, it's slightly younger, and that youngest age group is also quite worrying, where women are drinking more than men. That's the only age group where that's happening. But that's just me reading off the chart. If you're interested in health, then you might be able to get some ideas about the national picture that you could then compare with what you know about Nottingham, where your organisation is running different types of interventions. And then here we have the English Housing Survey. On the right here is what's been generated from the survey by the authors, and that's an infographic. It's basically saying how the proportion of households in the private rented sector has changed over the last three or four years. Now, I'm quite cynical about the way statistics are used, I suppose, because what we're seeing there is a very small drop, but it looks quite big on that chart, doesn't it? It's gone from 20.3 to 18.7. That's 1.6 percentage points, which is less than a 10% drop in relative terms. Is that significant? We don't know, because we don't really know the denominator at the moment either. So I'm a cynic about that data, but it might be useful. If you're looking at housing in Nottingham, you might ask how many households are in the private rented sector, and has that number gone down in line with what's happened nationally? Or has it increased?
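To check the arithmetic in the housing example above (a fall from 20.3 to 18.7), here is a small Python sketch that separates the change in percentage points from the relative change. The function names are hypothetical and chosen only for illustration; the two input figures are the ones quoted in the talk.

```python
def percentage_point_change(before: float, after: float) -> float:
    """Absolute change between two percentages, in percentage points."""
    return after - before


def relative_change_percent(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting value."""
    if before == 0:
        raise ValueError("starting value must be non-zero")
    return (after - before) / before * 100


before, after = 20.3, 18.7  # share of households in the private rented sector (%)

points = percentage_point_change(before, after)
relative = relative_change_percent(before, after)

print(f"Change: {points:.1f} percentage points")
print(f"Relative change: {relative:.1f}%")
```

Reporting both figures side by side makes it harder for a chart's scale to exaggerate a small movement.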
A further question might be, what kind of households are in the private rented sector? Many students will be living in the private rented sector in Nottingham: for those at the university, possibly around Lenton and Dunkirk; for those at Nottingham Trent, maybe up towards Arboretum and Forest. But those are a particular type of household. Most students who leave halls after their first year would expect to live in the private rented sector. Whereas I suppose many people with children might expect to be living in the private rented sector later on, as their children start to grow up. The other thing the infographic does is that the map shows where the highest and the lowest proportions in the private rented sector are outside London. So, that infographic was derived from the survey, and we could have done that ourselves with a tool we're going to show later on and that you're going to have a go at. The other thing that is worth understanding is longitudinal studies. These follow individuals over time. There are two types. One is where you have a household or an individual, you start asking them questions, and you say, are you prepared to be part of this panel, and we'll come back and ask you questions each year. Some big examples include the British Election Study, which uses a panel methodology, and Understanding Society, which we're going to look at. But we also have specific cohort studies, which look at cohorts of people who were born in particular years. The 1970 British Cohort Study is an example of that, as is the Millennium Cohort Study. And there's also an ageing one. These enable us to answer slightly different questions. It's probably getting quite complicated, and I'll give you a couple of examples here, but it does enable us to say what happens to somebody over that period of time. So, if you look at the English Longitudinal Study of Ageing, it's quite important to try and understand why people are well or not in later life.
And that survey helps us to do that by asking people about their previous experiences of work, of health, et cetera. The dataset that is probably the largest around, Understanding Society, has 40,000 households and 100,000 individuals, has been running for 10 years, and incorporates a previous survey, the British Household Panel Survey. It has some useful features. In particular, it has an ethnic minority boost sample; in many surveys, data on ethnic minority attitudes or experiences is quite weak, and this survey addresses that through the boost. It also collects some biomarkers, and there are options to link to other data that's held about individuals. And it covers a wide range of things: areas like employment and earnings, benefits, political party identification, household finance, environment, family life, et cetera. And there are questions added in as we go on. So, in response to COVID, there was a set of surveys carried out every two months by the Understanding Society team, covering household experiences of employment, education, income, health and living conditions. This chart, which I produced, identifies the percentage of households in housing arrears by ethnicity. It showed quite a marked difference between Bangladeshi households, at one in four in housing arrears, Pakistani and Black African households, at one in five, down to white British, which was more like one in 20. So there are quite different experiences of this particular problem of being in housing arrears by ethnicity, and similar indicators around debt as well. That particular data was very useful. The last screen is just to cheer you up, for those of you who love SPSS. Here is the way the data might be presented when you download it. Typically, SPSS holds everything as numbers, and it holds the key to those numbers to translate them. So, looking across the top, we have self-rated health.
So, this probably comes from the health dataset: sex, age, marital status, highest educational achievement, and ethnicity. And as we go along, we can see how those are translated in the bottom table. So, for health, two means you're in good health and three is fair health. For sex, two is female and one is male. The ages are in age bands, so it looks like two is 25 to 44, et cetera. So, it's in the kind of age bands we were looking at before. Then marital status again, A-level or no qualification, and ethnicity. So, what we see when we use this data is a set of numbers in a table, but when we report on it, we get the descriptive elements. We'll be able to see the state of people's health by gender, for example, in a typical table. So, that's the kind of stuff we have about survey data. We also hold qualitative data, though probably quite a lot less, because one of the issues here is that many researchers who carry out qualitative research don't feel comfortable depositing personal interviews, photographs or records of focus group discussions, because they are very easily tied to an individual. If you think of the secondary quantitative data, it's going to be quite hard to work out from all of those characteristics who a person is, whereas when you come to qualitative data, people may well be recognisable. We hold historical data that has gone past the point where identifying the person would matter. So, we hold census data from 1911 back to 1851, and the 1921 census data will be added. That is a set of data that you can access under different conditions, and that's partly because of the requirement to protect individuals from disclosure of who they are. There is some qualitative data, some interview data, et cetera, so you can look for that on there. But thinking about the qualitative data that might be particularly useful, I think a lot of people would look at newspaper archives to explore issues.
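The idea of numeric codes plus a key of labels, as described above for the SPSS file, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names and the two sample rows are hypothetical, and only the labels mentioned in the talk (health 2 and 3, sex 1 and 2) are included, so any other code is left as a number rather than guessed.

```python
# Value labels mentioned in the talk; codes not listed here are left unchanged.
VALUE_LABELS = {
    "health": {2: "Good", 3: "Fair"},
    "sex": {1: "Male", 2: "Female"},
}


def decode(record: dict) -> dict:
    """Replace numeric codes with their labels wherever a label is known."""
    return {
        variable: VALUE_LABELS.get(variable, {}).get(code, code)
        for variable, code in record.items()
    }


# Hypothetical rows, as they might look in the numeric view of a data file.
rows = [
    {"health": 2, "sex": 2},
    {"health": 3, "sex": 1},
]

decoded = [decode(row) for row in rows]
for row in decoded:
    print(row)
```

Statistical packages apply this kind of lookup for you when producing tables; the sketch is only meant to show why the raw file looks like a grid of numbers.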
Again, you're not worried about disclosure there: once somebody has been written about in a newspaper, you can talk about them safely. An example here is the National Child Development Study. There was a qualitative element which took interviews with 220 people, looking at questions around neighbourhood and belonging, leisure activities, social participation, personal communities, life histories, identity, and reflections on being part of that study. So, these are quite useful, and we give access to them. You'd also find that in your local archives, either at the university or at the council, there will be archive material such as oral history collections that may be of relevance to the work you're doing. What I want to do now is describe some of the other data that might be relevant. We have a set of aggregate data, generally about population groups, regions or countries. This is an example, a world map of inequality. But the bit that might be most interesting for you is the census-type data. First of all, this is just to give you an appreciation of how we might think of data. We take a space, we draw lines around it, and we say, this is the population of that space. Typically, in Nottingham, you would draw lines around different types of geographical areas. You're probably familiar with the names of places rather than necessarily the boundaries of those places. So, when we talk about Lenton and Dunkirk, you would kind of know where it is. But would you know where the boundaries actually are? If you vote in local elections, you might know that your councillor is the councillor for that particular area. But if you don't vote in local elections, then you wouldn't necessarily know that. When we come to looking at numbers and maps, we have this quite useful set of data from the census.
You're at a particularly difficult time in terms of census data, because the last census was in 2011 and the 2021 census data will only be released later this year. So, you're unlikely to be able to use the latest data. But the 2011 data may still have some useful information: it's used as a baseline, and it has combinations of characteristics for small geographies. You can get these kinds of tables of data out. You can also look at individual characteristics. It has your baseline population, employment, occupation, social class, qualifications, ethnicity, religion, et cetera, and housing and tenure. Now, this may have changed since 2011, but it's still probably the most comprehensive source of information you're going to get, unless you're working with an organisation like the council or a health service that collects its own data. So, you may well find you'll want to use this. Going back to the previous slide, the way to get aggregate data is from this link here, InFuse, and you can create your tables and download them from there. I'll share these slides through Steve so that you can have a look at them yourselves. And for those of you who might be thinking about doing a placement again next year, there will be new information as well, because the 2021 census will be available later on this year. It also includes additional questions on sexuality, gender at birth and having been in the armed forces. If we think about what census data does, it's collected in specific geographies. These geographies were developed in 2001 to be statistically relevant. They're called output areas. They're based on the idea of trying to get similar places grouped together, and they have a population of around 100 to 300 people. That makes an output area, and then those are grouped into two bigger blocks, the lower layer super output area and the middle layer super output area. Depending on what kind of data you're looking at, you will see these used as the base.
So, for example, the Index of Multiple Deprivation uses lower layer super output areas. The COVID case statistics I showed you use middle layer super output areas. In Scotland and Northern Ireland you have slightly different things. These areas are not real places, but they have some advantages, in that they tend to stay relatively similar between censuses. And when you look within them, the people are more likely to be similar, as opposed to having quite a wide range on the category you're looking at. You can also look at data by region, county, local authority, ward and electoral division, such as parliamentary constituency. The map on the right shows higher education participation by local authority, where the darker green means higher levels of participation in higher education. Okay. I think I'll now go on to the mechanics of getting in. The UK Data Service generally requires registration. For a lot of the data, once you're registered, you can access it, and we'll talk about the different access regimes afterwards. If you're at a UK university, just use your university username and password to log in. When you log into the UK Data Service, it'll take you through the validation screen that you have for Nottingham University. To register, simply click the register button on the first page. I will go into the screens in a minute for one of the examples and come back over these a bit. The reason you need to register is that most of our data requires a registered user, so that we can track who has used it. We do have some open access data that's available with very few restrictions, which might be publicly available anyway, like the international aggregate data from the IMF and World Bank, quite a lot of the census data, which again is publicly available, and the teaching datasets. But if you want to use real survey data, you need to register and agree to the conditions of the UK Data Service.
Once you've done that, most data is available; most survey data is classed as safeguarded. There may be some additional conditions for some data. The third level is probably unrealistic for you; it is really for researchers, either doing postgraduate research or working as researchers in a university, because it uses a secure access mechanism where your application is checked and the data has to be used through a secure physical or virtual environment. This data typically has a lot more detail, so there's a much higher risk of identifying who the individuals are. It may include lower-level geographies, which means that combinations of characteristics could be tied to an individual. A lot of business data, and a lot of commercially sensitive data, is held under these conditions to protect it for commercial reasons. Okay, I'm going to demonstrate this now. I think a good place to start, if you're going to get going with this, is the Learning Hub. I'm just going to switch over. We now have the front page of the UK Data Service. As I said before, you can click on the register button. It knows me because I'm from the University of Manchester, so I'll simply click on that and it will log me in. For you, you might need to start typing the name of your organisation and it will take you in. So, the place I suggest starting is the Learning Hub. There's some stuff in here if you're new to using data: a description of what secondary data analysis is, some interactive training modules where you can develop skills, something about researching data, and a section particularly for students, a student's guide to the UK Data Service. Later on, if you're doing your dissertation, there's also information about dissertations, and we run sessions targeted at dissertation students. We've then got the data skills modules and material about survey data.
So, that covers the different types of data we've looked at. Probably not relevant to your placement, but we have a group who are involved in computational social science, which is doing things like scraping Twitter and so on. There are also sections on the census, software and tools, and geography and data. Now, that last one might be of particular relevance to you, so that you can think about generating local estimates and mapping data from other studies. So, a good place to start, I think, is that section. There are a number of interactive tools. There is, for example, a tool on mapping data using free GIS software, if you want to do that. What I would then move on to, once you're familiar with what you're doing, is how do you find data? If we go into Find data from the front page, I'm on the welcome page. If I'm going to find something, I can just search for it. As a starter, say I don't know exactly what I'm looking for, but I want something about mental health. If I search for that, it gives me 250 results, and looking at the first ones, a lot of them are about mental health trusts. Well, I'm actually interested in mental health and students. So, if I refine that search, what I then get is some more relevant information: student mental health during the COVID-19 pandemic, a study by people at Birmingham, and getting off to a mentally healthy start in doctoral study, which might be useful. And then it goes off to something else that isn't so interesting. So, the first thing to do, when you're not sure what you're looking for, might be just to go there and search. If I decide I want to look at this dataset, I can click on it, and this data is open, so I can actually access it. It's based on a depression and anxiety scale. So, without being logged in or registered, I could access this, because the data is open.
I think the reason for showing you that way in is that you may well find it very useful, if you are looking at a topic, to be able to go and have a look at things like this and start to get some indications of what's going on nationally to inform local work and local analysis, because it's always good to have a comparison. I suppose I know Nottingham reasonably well, having worked there for three years, and it does seem to me there are some particular issues in Nottingham which organisations there will be involved with. So, there are particular issues around debt and around precarity. In common with many other urban areas, Nottingham has a level of deprivation that is probably higher than many other areas in the country. So, if you're looking at any questions around debt, around work, around employment, around health, then you've got these kinds of resources to tell you what the national picture is and to compare it with the indicators being generated in Nottingham. Just to say something about the formats of data: once you get down to the survey data, you'll get it in SPSS or Stata format. You can also have it in .dat format, but unless you're familiar with SPSS, Stata or R, I would suggest you'd probably use a tool like Nesstar. Other materials, like that mental health survey, may be provided as databases or spreadsheets, and a lot of the material linked to the qualitative data will be in Word or PDF documents. In terms of what else we have, we do webinars and workshops on specific areas. For example, we're doing workshops this week on the Labour Force Survey. We do sessions on the health survey, we did one on the crime survey last month, and many of those are recorded so that you can go back and look at them if you're interested. We also have guides and video tutorials that you'll find through the Learning Hub initially.
We have a set of topic pages, so you can explore specific topics, and we also have some material on methods and software. Finally, if you've got an individual query, we also run a help desk, which will respond to your questions. At the first stage, you might use that to make sure you get registered properly, but once you're registered, if you have a query about the data you're looking at, how to interpret it, et cetera, then you can use the help desk.