Hello, welcome to the second online workshop that we're doing for the third sector, from the UK Data Service. Today we're talking about data skills and what survey data is. So thank you for coming today, I'm quite excited to be here and telling you all about this. So what we're covering today is: I'm going to give a short introduction to the series of online workshops and what we're trying to achieve. Then I'm going to give a presentation about social surveys and secondary analysis. And then we're going to have a little bit of a quiz just to engage and see how everyone is and check understanding of the presentation. Then I'm going to give a demonstration of exploring some survey data using a tool we've got at the UK Data Service called Nesstar. And then we've got a handout, which hopefully has been emailed to you as well. So please check your email if you'd like the handout separately or if you can't get it from the GoToWebinar control panel. There will be 10 or 15 minutes to do an activity in your own time. Then we'll come back and see how you got on with that, discuss it and have any further questions and discussion. So that's the general format for today. So just quickly, the UK Data Service is funded by the ESRC, the Economic and Social Research Council. And it's a single point of access for secondary social science data. So we hold a lot of the data that's produced with government funding across the UK and some from further afield as well. We also hold all the census data, but as well as holding the data and providing access to it, we also have a whole range of support and training that we offer to people who want to use the data. And so the objective of these workshops that I'm running is to promote our services, our resources and our data to members of the third sector. What we've got is not just an academic resource.
And we're really keen to get as many people as possible using all the resources. And so we want to increase understanding of how the data can be used by different people and in different ways. And we want to support people to do that. And I think that through doing this, we'll be enabling people to provide evidence and enhance their knowledge and data skills. So just specifically why we want to do it with the third sector, by way of introduction as well. I've only more recently moved into working for the University and the UK Data Service. For many years, I worked for Manchester City Council and before that, I worked in various settings and volunteered in New Zealand and Australia, where I'm from. And I think that data is really important when you're delivering community services or delivering services to different groups across our communities. So lots of the organisations that I'm sure you're from run programs or deliver support to improve the outcomes for marginalised groups, or you're trying to reduce inequalities and providing services often to those most in need. But these sorts of services rely on funding, often the funding is short term, and you need to be constantly proving that what you're doing is of value and is needed. And so I think that's where the UK Data Service can come in, because some of the data that we hold can help provide the context to the work you're doing and demonstrate why the services are needed. But I am aware that there's a whole range of organisations out there with all sorts of resources to provide that evidence. So these workshops are very much introductory, to try and provide that sort of starting point for how data can help with your services. The sort of data that we hold at the UK Data Service is quite varied. Last week we talked about aggregate data and census data. Today we're going to talk about microdata that comes from UK surveys.
But we also do hold some qualitative data and mixed-method data. And there was a question that came up in last week's workshop about that. And so next week in the final workshop, I'll try and round up some of the questions and comments that have come in from the feedback and address those as well, when we're talking about using data to tell stories. And so just now to talk about social surveys. So what is survey data? Survey data is a product of systematic data collection. And it's quite typical for these data to be collected using a standard set of questions asked of a representative sample of people. And so that produces microdata. So it's quite important when you're talking about rigorous social survey data that it was collected in a systematic way and the samples were representative, because then you know that you can generalise the findings from that data to the wider population. So where do these sorts of surveys come from? They come from interviews. Often these are done face-to-face in people's homes. But more and more these are sort of mixed mode. A lot of the survey collection at the moment, because of the COVID-19 lockdown, has gone online or through the web. But typically people are asked questions, and it could be questions asked of the individual. It could be asked of households. Sometimes it's businesses or other units as well. So it's not just collected from individuals; survey data comes from a wide range of sources. And so once those questions are answered, the information is collected from a number of respondents and it is stored in records in a data set that can be used to produce statistical summaries. So each answer is put into a table. And those tables are put into massive data sets, which we can then analyse. And so just to give an example, I'm going to talk about the British Social Attitudes survey.
So the British Social Attitudes survey has been going since 1983 and it is carried out to take an annual snapshot of public opinion on a range of key political and social issues. The survey is conducted by an organisation called NatCen. And the sort of information that comes from the survey can be used to gauge opinion on things such as same-sex marriage, austerity and, more recently, Brexit, and it's often picked up by the media. So that's the sort of questions that are asked through the British Social Attitudes survey. And just as a bit more detail, it's funded by the Economic and Social Research Council, as well as other government departments and research foundations. Different think tanks or departments can fund certain questions. So if there's a particular topic, like Brexit perhaps, that they want to find out more about, different departments will ask for that question to be put into the survey in any given year. It's a random selection of households across Britain, and then one person in each household is selected randomly again. And in 2018, for example, there were 3,879 interviews. These are personal one-to-one interviews, but they're also followed up with a self-complete questionnaire. So in that self-complete questionnaire, there might be more sensitive questions that are answered. And then this is how we store the data at the UK Data Service. This is an example of the catalogue record for the British Social Attitudes 2018 survey. And here you'll find all the information about where it was collected, how many people took part, who collected the data, and what sort of questions were asked. And just to have a look at what that data actually looks like when you download the data set: this is an example of what the National Food Survey, which is an open data set, looks like. And this is in RStudio. So data in this format is typically referred to as microdata, as the records relate to the original data collection units.
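As a rough illustration of that layout, microdata can be pictured as a list of records, one per collection unit, each holding the same set of fields. This is only a sketch in Python: the records and field names below are invented for illustration and are not taken from the National Food Survey or any other real data set.

```python
# Invented records for illustration only; not real survey responses.
records = [
    {"id": 1, "age": 34, "region": "North West", "household_size": 2},
    {"id": 2, "age": 61, "region": "Wales", "household_size": 1},
    {"id": 3, "age": 19, "region": "London", "household_size": 4},
]

n_records = len(records)          # one row per collection unit
fields = list(records[0].keys())  # one column per question or derived field
n_fields = len(fields)

# A single cell is found by picking a row and then a column.
second_age = records[1]["age"]

print(f"{n_records} rows, {n_fields} columns: {fields}")
print(f"age recorded on row 2: {second_age}")
```

A real survey file has the same shape, just with thousands of rows and dozens of columns, which is why it is normally opened in a statistics package rather than typed out like this.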
So for a survey where the responses are from individuals, the records will be for individuals too. The generic term for a unit is a case. So each row across here is a case. And you can see on the side up here, there are 6,699 cases. That's what "observations" means. So that's how many rows of data there are here, which is why you need a package like R or SPSS to do the analysis. And then the variables are stored across the top in the columns. And again, you can see here there are 53 variables. And the individual cell where a row and a column cross over is the value. So the value for that case and that variable is what's highlighted in each individual box. So that's what the data looks like. And just to talk a little bit about the value of secondary analysis as opposed to collecting your own survey data. So we have what we call data collectors. And then we've got the secondary analysts. And so on the left here, we've got examples of some data collectors. So big organizations like the Office for National Statistics or NatCen, as we talked about, and other projects like Understanding Society, which might be run out of universities. The photo there is of Peter Townsend, who's a qualitative researcher, who did quite a lot of research around poverty. So these are all people or organizations that do primary research. They collect the data, and they normally do some analysis of it for a specific planned purpose. But then if they are prepared to share the data, it becomes available for secondary analysis. And so we hold lots of that data, which people can access for their own analysis. And that's called doing secondary research. So you're reanalyzing the data. You might be reanalyzing it for a similar purpose, or you might have a whole different research question that you're applying to the same data that was collected. And I've got a picture here of a blog that was recently produced using Understanding Society data.
And it was produced by National Voices, about older people and COVID-19. So it was using the Understanding Society data to help understand their experiences. So that's just one example of how data can be used in secondary analysis. So there are some pros and cons of using secondary data. Firstly, the big pro is that some of these data sets would be impossible to create on your own. For a small community organization or a small charity, producing data on this level would be impossible; the samples are usually in the thousands. These are multi-million pound surveys. And despite that, lots of organizations do complete their own surveys. And even completing a small survey can be quite costly. So it can be quite cost effective to use secondary data that might be able to answer your research question or the problem that you're trying to explore. There are lots of ethical issues around data collection as well. But when you complete secondary analysis, all those ethical issues have already been dealt with. So you don't need to worry about disclosing individuals, because all the data will be anonymized. And you don't need to recontact any of the people who the data was collected from. They've all already agreed, and they know that the data will be passed on in an anonymized way. So you can reuse data collected by others, but you can still make your own claims on it. I mean, these data sets are huge, so there are all sorts of questions that can be answered using them. But one of the cons attached to it is that you don't know how and why the data was collected or how the data set was built. So you have to put in quite a lot of effort to get to know the data and understand how it was all put together. And there are still some ethical issues that need to be considered, because they may limit what data you can access. I won't go into it too much today, really, but there are different ways in which you can access data from the UK Data Service.
So lots of our data sets you can just log in, once you're registered, and download. But then sometimes we have different versions of data. So some of the more sensitive data, like low-level geography or anything that relates to a small group of people, will be limited to perhaps secure access. And so you have to go through a whole other process to access that data. And the data may not exactly match your research questions. So you have to think about how you can use it. Sometimes it's a bit of a compromise. And you can't make the studies any longer. You can't add more questions onto them. So the two things to remember from this are that you need to make an effort to understand the data, and you also need to be a bit pragmatic about whether the data are good enough for your purpose. We often get questions coming through our help desk here saying, you know, I want to answer exactly this. And we have to reply to them, well, we don't have anything that exactly matches that, but perhaps you can think about rephrasing your problem a bit so that the data that we do have will be able to work for you. So how you go about doing a bit of research when you're doing secondary analysis is that you think of what your question or your problem is. What is it that you're trying to address? And then you locate some data that you think will help answer that problem. And then you evaluate the data, so you start to explore it and think, is this exactly what I want? And then you carry out your analysis. So it's kind of a linear process. Only it's not really, because you might locate the data and then think, well, actually, that doesn't really have the questions in it that I want. So you might go back to your question or problem and start to rephrase it, or try to locate a different set of data.
And the same thing might happen when you come to evaluating the data and you're looking into it a bit more. It's at that point that you realize that it doesn't quite align with what you were hoping for. So you might go back to your question, or go back to the data and try to find some different variables within it. And the same thing might happen when you get to analysis. So it's an iterative process, and you might have to go forwards and backwards a bit until you get the data in the form that you want and can find the right data for your problem or question. So to make sense of your data and to understand if it is suitable for your research, you must understand what information was collected, who it was collected from, where and when it was collected, and how the data might have been changed before it was archived with us at the UK Data Service. And all this information is in the documentation that comes with the data. I showed you the catalog record before. There's another tab along from the catalog record which, after you've accessed the data, gives you all the documentation. The documentation can be quite comprehensive. There can be lots of very long documents, but the answers are in there. And we're here to help you understand that documentation if you can't find the answers you're looking for. So just before I go into looking at some of our data, I want to go over some simple statistics. So just to first talk about different types of variables. The two main types of variables are, first, interval variables, which are on a continuous scale. It's something like age, which is continuous and has no gaps between each answer. And then there are categorical variables. And we can think of two main types of categorical variables. There are nominal variables, which have no natural order. So that could be something like your favorite color. And then ordinal variables, which are naturally ordered.
So something that's on a scale, like: on a scale of one to five, how good are you feeling today? That sort of thing. So here are some examples of these sorts of questions. The first one is a nominal sort of variable, so no natural order. What type of school does your oldest child attend? And so there are just different types of school and they're not in an order. The second one is going back to the ordinal variables. So it's a scale. And oh, there should have been a one there, on the side of completely satisfied. You often see these called Likert scales. So often one to five or one to four, going from one end of being completely satisfied to completely dissatisfied in this example. And then the last one is a continuous variable. So how much did your family pay in school fees in the previous year? And there'll be any answer there from zero through to thousands, I imagine, across the country. And then there are different ways of describing these sorts of variables. So for the continuous or interval variables, because there can be many, many, many different answers, we don't look at the individual numbers. We look at things like the mean, or average, or the most common answer, which would be the mode. We look at how it varies across the distribution: is it a standard distribution or is it not? And then we often visualize them with histograms, because they're an easy way to see what the pattern is. For categorical variables, though, we can use counts and also percentages, because when counts get big, it's nice to see what the percentage is. And we often put them in tables. So let's have a little look at tables. This is an example of a two-way table. So in here, we've got down one side a category of age, so whether someone's under 18 or over 18. And then up the top in the columns, we've got: what time do you normally get out of bed? This is just an example that I completely made up.
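A made-up two-way table of this kind, with percentages running down the columns, can be sketched in a few lines of Python. The under-18 counts (20 and 40) and the row totals of 60 are the figures quoted in the talk; the over-18 cells are not stated anywhere and are inferred here from those totals and percentages, so treat them as an assumption.

```python
# Made-up counts. The under-18 row (20 and 40) and the row totals of 60 are
# the figures quoted in the talk; the over-18 cells are inferred from them.
counts = {
    "Under 18": {"Before 9am": 20, "After 9am": 40},
    "Over 18": {"Before 9am": 50, "After 9am": 10},
}
columns = ["Before 9am", "After 9am"]

# Column totals: how many people gave each answer, across both age groups.
col_totals = {c: sum(row[c] for row in counts.values()) for c in columns}

# Column percentages: each cell as a share of its column total.
col_pct = {
    group: {c: round(100 * row[c] / col_totals[c]) for c in columns}
    for group, row in counts.items()
}

for group, row in counts.items():
    cells = "  ".join(f"{row[c]:>3} ({col_pct[group][c]}%)" for c in columns)
    print(f"{group:<9} {cells}  total {sum(row.values())}")
```

With these counts the under-18 row comes out as 29% and 80%. Those two figures do not add to 100% across the row, which is the clue that the percentages are meant to be read down each column instead.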
So in this example, it's important to realize that the percentages are going down the columns. So it's important to look at where the 100% is. Often it's in the columns, but you might see it across the rows as well. So you could look at the individual counts, which are the numbers not in brackets. If we look at the column here on the end, we can see that there were 60 people under 18 and 60 people over 18. So half the sample from each group, 120 all together. But then out of those 60 people under 18, 40 said they got out of bed after 9 a.m. and 20 said they got out before 9 a.m. So you can see that that's not what the 29% and the 80% relate to, because those don't add to 100%. The percentages are going down the columns. So what we're doing here is comparing within each answer. We're looking at who gets out of bed after 9 a.m., and we know that, of the people in the sample who get out of bed after 9 a.m., 80% are under 18 and only 20% are over 18. So that's what we're comparing in this column, and we're reading down the column to compare that. So now we're just going to have a little quiz, just to answer some questions from the presentation. But before I do that, I want to point out as well that in your GoToWebinar control panel, which hopefully you can all see (you might need to expand it with the orange arrow if you can't see it all), there should be a questions dropdown box. So if you have any questions, you can pop them in there as you're thinking about them and we'll answer them as we go along. If you have any technical problems, you can put them in there as well. We'll answer most of them at the end in the discussion, but if there's anything that comes up, feel free to put it in the questions box. Right, we've got just three questions to ask you now about that presentation. So we'll put those up and see how we get on. So the first one, what is survey data?
So survey data is: select what you think. Is it individual-level data? Is it data collected over time? Or is it summary data about populations, groups or regions? Thank you. So most of you, 79%, said individual-level data, so it's microdata, and that's the answer I was looking for. Data collected over time: I mean, surveys can be collected over time. Data from that sort of survey we would normally call longitudinal data. And summary data about populations, groups or regions, that's when you aggregate the data, and that's what we covered last week. So, yep, that's great. The next question, what is secondary data analysis? Yeah, so it's reusing data collected by someone else. That's the answer I was looking for. We've got "a second report produced from a social survey". So I mean, it could be that, but that's the end product as opposed to what the actual process is. So yeah, secondary data analysis is reusing data, generally collected by someone else. I think we've got one final question here just to see how we're getting on. Which of the following is an example of a categorical variable? We've got more different answers here. So I was looking for the number of people living in the house. So it's going to be a category. I guess it could be sort of continuous, but it's going to be sort of defined. Household income is continuous. Money is on a continuous scale. Household income is not in a category; that would be an answer in a survey that people could put a specific figure to, and it's going to be along a scale. And age of the oldest person in the house is the same, that's going to be over a much longer scale. So the number of people living in the house is more like a categorical variable. Maybe that wasn't the best example there, but yeah, that's great. So now, will you share the slide? Did you want me to put the slide up? Yeah. I'll just answer this question, because I think it's probably relevant.
So if you categorize age, then age of the oldest person could be categorical, yes, that's right, yeah. So it depends how the question was asked. The point being made here is that if you categorize age, then the age of the oldest person would be a categorical variable as well. So yes, it depends what the answer options were. If the question was how old was the oldest person in the house, and you had bands, so you'd made categories of age, like is the oldest person in the house between 25 and 35, or 35 and 45, that would be a categorical variable for age. So it's one of the things you often have to do with survey data to tidy it up. You might get a continuous variable for age, that's very common. And one of the things you'd have to do is put it into age bands to make it into a categorical variable, so that you can do things like produce your two-way tables and your bar charts and things like that. So yes, maybe that was a good example to put in as a question, because it helps make you think about the variables in different ways and how you're going to use them. And we will see that a bit more as well in the demonstration that I'm about to give now. And there's just a question there as well from Ella about whether we will be sharing the slides afterwards. And yes, we will. They'll be emailed out, but they'll also be up on the website. I think actually we might just email out the link to the slides on the website. So now I'm just going to move on and do a quick demonstration of one of the tools we've got at the UK Data Service, called Nesstar. And this is a tool that you can use to explore some of the survey data that we have online. So you don't need to have a tool like R or SPSS. You can just get some of the statistics straight through our website online. Because, just to make the point as well, most of these survey datasets that we have, you can't open up in Excel.
The data is too large and it doesn't generally come in that sort of format. So you really do need one of the bigger software packages to do the more in-depth analysis with it, but you can do quite a lot with Nesstar online. So this is the UK Data Service website homepage. From here, if you go to get data and then explore online, you find the different tools that we have for exploring data online. I just want to note that I've already logged into the website. You don't need to be logged into the website to go to Nesstar, but it helps. I'll point out in a bit where it helps. So this is the website address for Nesstar. It's got its own sort of separate page. And this is what it looks like. Hopefully the screen is big enough for you. Just put a comment in if it's not and I'll zoom it in a little bit more. But you can see the interface for Nesstar looks like this. I think it's quite industrial really, actually. But there's a lot of data in here, so it needs to be done in quite a structured way. So down the left hand side is where our navigation is. And you can see here we've got research datasets, and if I open that up, these are not all the datasets that we have, but all the ones that we've got available to view on Nesstar. So a lot of our main key datasets are available here and you can explore the data through them. And then we've got unrestricted access teaching datasets, and then there's a wider range of teaching datasets as well. So I'm going to use one of the teaching datasets now for this demonstration, just because it's an easier way of doing it, with a smaller dataset. So you use the little crosses and minuses on the sides to expand and drop down these various categories. Sorry, I think. Patty, we've just had a request to make the screen slightly bigger. Yeah, I just saw that. I'll just say, if anyone's still struggling, you should be able to shrink the size of Patty's camera at the top as well by dragging the little line on the screen.
Now that reminded me, I was going to turn my webcam off for this part, just to make the screen better. So hopefully that's made it a bit easier to see. Great. So yeah, I've expanded down from the unrestricted access data and I'm looking at the teaching datasets. I'm going to click on here, the Crime Survey for England and Wales. And so when I click on that title, up here on the main part of the website you get a description of the Crime Survey for England and Wales. So it tells you a bit about it, what it measures, what its purpose is. And then if you look down here, it shows you the actual dataset. And so this is the 2007-2008 unrestricted access teaching dataset. And it gives you an abstract specifically about this dataset. So it tells you what's different between the teaching dataset and the main dataset and what topics were covered in it. And so then we've got metadata and variable descriptions. So the metadata is more like the supporting information for the study. We're more interested in the variables, so we're going to look at the variable descriptions. And then it's got some top-level categories of the different variables that are available in here. And so the Crime Survey for England and Wales collects experiences of crime, but it also collects fear of crime, attitudes towards the police force, and feelings around antisocial behavior. And then, as most of our social surveys do, it also collects a whole lot of information about the individuals who are filling in the survey. So demographic information about people. And so I'm going to look here at fear of crime as an example. And these are all the questions down the side here that were asked about fear of crime. So you can see there are quite a lot. There are about a dozen questions there asked about how worried you are about certain things, how safe you feel. And we're going to look at this question here. So how safe do you feel walking alone after dark?
And this is what I think is quite cool about Nesstar: you can click on that question and straight away you've got a statistic here telling you what the feelings about walking alone after dark are. So I'm just going to note at this point as well that I'm not going to talk about weighting or weight any of the data today. There is an option to do that, and I will send some information around next week about weighting, but today we're just exploring it and looking at the data. So these aren't weighted results, but they'll still be fairly representative of the population of England and Wales, from where they were collected. So you can see here we've got the exact numbers. We've got some summary statistics under here. So we know that this was a sample of 11,625 people. And this is how many responded with each of these answers. So this is an ordinal categorical variable. And then we've got the percentages on the side as well, because those are quite big numbers. It's quite nice to have the simpler percentages to understand it. So I'm just up here. The name of the variable is often something short and sometimes it doesn't make much sense, but this one is walkdark. That's the name of the variable. And the label is normally a sort of summary of the question. So how safe do you feel walking alone after dark? But here we've got what's called the literal question. So it was: how safe do you feel walking alone in this area after dark? Would you feel very safe, fairly safe, a bit unsafe or very unsafe? And so we can see that most people feel fairly safe. So they're not feeling very safe, but there are very few that are feeling very unsafe. So it's a quick way of looking at a statistic about a specific fact.
And so if you think about all the different surveys that we've got and all the different questions that go down to this sort of level, you can see that you can easily find something that puts context on the sort of work that you're doing. So you might be working around safe neighborhoods. Or the British Social Attitudes survey is quite good. That's in Nesstar as well, and you can go and find out views and opinions on all sorts of things. And so you could quickly drop a percentage or a statistic from here into a report that you're writing or a blog post that you're writing. And you know that that's a representative statistic of the population as a whole. But you might want to be a bit more specific. This is where the two-way tables come into it. So you can see at the top here, we've got description, tabulation and analysis. So I'm going to click on tabulation and then you can see a blank two-way table comes up. And so I'm going to click again on how safe do you feel walking alone after dark. And I'm going to put that in my rows. And see that thing that just came up, just flipped up there and then stopped? That was saying you need to be logged in to do this. And because I've already logged in, it let me go past that screen. Otherwise it would have asked me to log in. But just to make you aware that Nesstar can be a bit unstable when you try to log in from within Nesstar. So that's why in the exercise, when you come to do the activity next, you'll see that I've asked you to log in before you go to Nesstar, from the main UK Data Service website. And then you're less likely to encounter problems once you're exploring data in Nesstar. And just to say, if you haven't managed to register yet, and I apologize if you haven't, I know it's not a straightforward or necessarily an instant process for people who aren't in an institution registered with the UK Data Service. You'll still be able to explore the statistics like I was just doing.
You just won't be able to do the two-way tables in the activity. But going back to this two-way table. So now I've put how safe do you feel into this more tabular form. But I want to explore that a bit more. So I want to think, how does that contrast when you look at gender perhaps, or age? And so I'm going to look at some different variables. So I'll go up a bit further here to the socio-demographic variables. And I open that and you can see there are things here around gender, age, marital status, ethnicity, education, whether you're in work or not. So we can compare it by any of those variables. And I was thinking of doing it by age. I thought that might be quite interesting. But then this is a good example of why you can't really put age, which is a continuous variable, into a cross tab. So if I click on age and I say, well, I want to put age in the columns, you'll see that across the top, there were respondents from age 16 right across up to 101. So that's very hard to interpret, unless we're specifically interested in 50-year-olds and we want to look at 50-year-olds and work out what their fear is of walking alone. So that variable isn't a good one for us to put in a two-way table, because it's got too many categories, because it's not really a categorical variable. It's a continuous variable. We can look at that in different ways though, and I will show you that shortly. Instead, I'm going to remove that one here. So I'll remove that from the table and we'll put in gender instead. So I'll add sex to the columns and it will tell us in the columns how people feel about walking alone, by gender. So now you can see we've got gender at the top and the answer to the other question down the side. And as you could probably predict, it's telling us quite an interesting story. So like before, you can see we've got the 100% down the bottom.
These are percentages, and we could change the percentages to go the other way, which is just something to bear in mind when you're doing the activity if your answers don't look perhaps like how you expect. You can also see the raw numbers if you just wanted to see the count. But we're looking at the percentages in the columns. So you can see, how safe do you feel walking alone after dark? 38.8% of men feel very safe, as opposed to women, only 15% of whom said that they felt very safe. And looking at the other end, 17% of women said they felt very unsafe when only 4% of men did. So you can see there's quite a different story going on there. And I just wanted to show you one more thing up here. There's an icon for graphs, and if I select that there's a little drop-down menu. And so I'm going to select the second bar graph, and I'll have to zoom that out. You can see I've put that into a graph and you can see from here, there's two quite different pictures. The same question, but a very different distribution for males and females. So you can see males are much more here on the very safe side and women are much more in the middle. And you know, they're not feeling very safe or necessarily very unsafe, but much more so towards the unsafe side than males. These icons up here are quite small, unfortunately. There's one here which clears it. So I'm just going to clear that and just put this age one back into the columns, just to show you that it doesn't look very good in a table, but again, we can put that into a bar graph, and maybe I will just reduce the size of that. So you can see, oh, it's zoomed out for some reason. And it shows you the distribution. Oh, it's not letting me zoom now, but you can see the distribution of age. This is the age of respondents from the Crime Survey for England and Wales. So it starts at 16.
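The gender comparison above is a two-way table with column percentages: each column (men, women) adds up to 100%, and the cells show how that group's answers are spread. As a rough illustration of what the tool is calculating, here is a small Python sketch. The records are made up for the example and are not Crime Survey data, and `column_percentages` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical respondent-level records: (sex, answer). Illustrative only.
responses = [
    ("Male", "Very safe"), ("Male", "Very safe"), ("Male", "Fairly safe"),
    ("Male", "A bit unsafe"),
    ("Female", "Very safe"), ("Female", "Fairly safe"), ("Female", "Fairly safe"),
    ("Female", "A bit unsafe"), ("Female", "Very unsafe"),
]


def column_percentages(records):
    """Return {column: {row: percent}} so that each column sums to 100."""
    cell_counts = Counter(records)                      # (column, row) -> count
    column_totals = Counter(col for col, _ in records)  # column -> count
    table = {}
    for (col, row), n in cell_counts.items():
        table.setdefault(col, {})[row] = 100 * n / column_totals[col]
    return table


table = column_percentages(responses)
for sex, answers in table.items():
    for answer, pct in answers.items():
        print(f"{sex:7s} {answer:13s} {pct:5.1f}%")
```

Dividing by the row totals instead of the column totals would give the percentages "the other way", which answers a different question: of everyone giving a particular answer, what share are men and what share are women.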
So you can see there's a few there who are younger and then it's a fairly even distribution across the middle age group. There are fewer here. Often you find in social surveys that around this age, middle age, mid 40s to sort of 60, it's harder sometimes to capture those people because they're busy people out working full time. And so they're harder to capture for social surveys, they're often less likely to respond, and then it goes down with age. But that's the sort of thing, this age distribution, which might be adjusted with the weights if you are using weights. That's just to show you the distribution of that continuous variable. So I think that's about all I was going to show you for the demonstration of Nestar. So now we will move on to the activity. So hopefully you've all managed to get a hold of the handout. So there's a handout which looks like this. Again, it's in the Go to Webinar control panel under handouts, and it was emailed to you around lunchtime today as well. And I made a couple of comments on this last week: I do appreciate it's a bit hard having a PDF handout while you're trying to do an activity on an online tool as well. So just a little tip in case you don't know, I use Alt+Tab a lot, or, as I'm on a Mac, it's Command+Tab. So if you hold down the Command or Alt key and then press Tab, you can switch between the screens quite quickly. So if you have it set up, you can switch between the PDF and Nestar to follow the instructions that way. So it's just a little tip if that helps, because I know it can be a bit tricky. But what we're going to do is we're going to work through the handout instructions. You can post questions in the question box on Go to Webinar as you're doing it. I'll still be here while you're doing this and can address any questions that come up or any problems. So just work through the exercise in your own time.
There's some questions throughout the handout and we'll come back together at the end to answer those questions. So jot down the answers that you get and we'll come back together at 3 p.m. to answer those questions using the polls, and then have more of a wider discussion and answer any more questions that you have. So that's what we're going to do now. So if you want to start your activity, we'll come back together at 3 o'clock. There's been a question about how to locate Nestar from the UK Data Service website once you're logged in. I can just demonstrate that again. There is a link on the handout as well that will take you straight to the Nestar page, but from the UK Data Service website, if you go to Get data and then Explore online, Nestar is the first option there. So that's from the UK Data Service website homepage, Get data, Nestar. And then that's the web address there. Okay, I think we're going to come back together now. We're going to answer some questions first to see how you all got on. So here are our questions. So firstly, I'm just hoping most of you completed the activity. So I was going to ask a quick question to see how many of you managed to complete the activity or at least do most of it. So about a third of you got right through to the end, by the looks of it. So that's good to know. Maybe I'll give a bit more time next time. Sorry if you didn't get to the end of it or if you had problems registering or finding Nestar. It can all be a bit complicated to find everything that you need to find, but you have that activity shared with you now. So hopefully you'll be able to finish that in your own time. And hopefully most of you got as far as the first couple of questions. So how many respondents from the Health Survey for England reported having a physical or mental health condition or illness lasting or expected to last 12 months or more? I'm glad most of you got this right. So I'm looking at this thing.
I didn't actually write these answers down. So I'm just going to go with the majority here, which looks pretty clear. So we've got 3,914. So that's right. So out of the 10,067, actually I think I've got the slide here. So 10,067, that's the total number of respondents in the sample, all the people in the survey. So out of them just over a third, close to 40%, said that they had some sort of physical or mental health condition that had lasted or that they expected to last 12 months or more. So that's the answer to the first question. So if we just go back, how many people reported that that condition affected stamina, breathing, or fatigue? So you probably noticed in the survey, they then broke that down, so they said, do you have a condition? And then they asked, what sort of condition is it? We're interested in what sort of condition you have. So how many of those had a condition that affected their stamina, breathing, or fatigue? And these are the sort of simple statistics that you should be able to get up without needing to register. Just by clicking on a variable, it will show you what the distribution of that answer is. And so you've got 1,114. So that's right. So 3,914, that was the total number of people with any sort of condition. So 2,800 was the number of people that didn't have a stamina, breathing, or fatigue condition. And so 1,114 were the number of people in the sample who had a long-term condition affecting stamina, breathing, or fatigue. And so now we're looking at those respondents with conditions affecting their stamina, breathing, or fatigue. So what percentage are in the lowest-income quintile? So if you managed to get as far as the crosstab, you could look at income quintiles and break that down by people with conditions: how many were in the lowest-income quintile? So there were some explanatory notes in the handout.
So a quintile is when you have the continuous variable of income and you divide it into five even categories, so you can split the sample between the lowest fifth and the highest fifth. Often you'd compare the extremes, because that would be the most extreme way of showing any sort of income inequality. And so on this one, most people have the correct answer there. So it was 30.7% who were in the lowest quintile. And I'll just have a look here, because this is what I've been talking about with crosstabs and looking at percentages in columns versus rows. So this one here is the table that you should have come up with, so we can see, if I can get my pointer over there, it's 30.7% in the lowest quintile at the top there, of those who mentioned that they had one of those conditions. So the instructions asked you to put the quintiles in the rows and the condition in the columns. If you'd done it the other way around and put the conditions in the rows and the quintiles in the columns, you would have got the percentages within each quintile instead. And that is where you would have got some of these different percentages. So the 40.4% is there at the lowest quintile, but that's not actually answering the question. That's looking at, in the lowest quintile, how many people with a long-term condition had a condition that affected their fatigue or breathing. So that's 40% of the lowest quintile. And that's not what we're interested in. We're interested in the difference between having that condition or not having that condition across the quintiles. So that's why you needed a table that looked like this in order to answer that specific question. And now we're just going to answer the last question. So compared to all long-term health conditions, are conditions affecting stamina, breathing or fatigue more prevalent or less prevalent among people in the lowest income quintile?
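To make the quintile and rows-versus-columns points above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the incomes and condition flags are made up, the function names are hypothetical, and the simple rank-based split is not necessarily how the Health Survey for England derives its income quintiles.

```python
from collections import Counter


def assign_quintiles(incomes):
    """Rank the incomes and split them into five equal-sized groups (1 = lowest)."""
    order = sorted(range(len(incomes)), key=lambda i: incomes[i])
    quintile = [0] * len(incomes)
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // len(incomes) + 1
    return quintile


# Hypothetical respondents: income (in thousands) and whether they report the condition.
incomes = [9, 11, 12, 14, 18, 21, 25, 28, 33, 40]
has_condition = [True, True, False, True, False, False, True, False, False, False]

quintiles = assign_quintiles(incomes)
cells = Counter(zip(quintiles, has_condition))  # (quintile, condition) -> count


def share_within_condition(q, condition):
    """Of everyone with this condition status, what percentage are in quintile q?"""
    total = sum(n for (_, c), n in cells.items() if c == condition)
    return 100 * cells[(q, condition)] / total


def share_within_quintile(q, condition):
    """Of everyone in quintile q, what percentage have this condition status?"""
    total = sum(n for (qq, _), n in cells.items() if qq == q)
    return 100 * cells[(q, condition)] / total


print("Of those with the condition, % in lowest quintile:",
      share_within_condition(1, True))
print("Of the lowest quintile, % with the condition:",
      share_within_quintile(1, True))
```

The two functions read the same cell of the table but divide by different totals, which is exactly the difference between the 30.7% and the 40.4% discussed above: same counts, different question.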
Those of you who got that far mostly all got that right. So that's when we're starting to interpret the results. So we're looking at this table again and we're looking at that 30.7% and we're saying that it's more prevalent. So if you look at the total column, that's telling you the distribution of the people with long-term illnesses altogether. So in general, 24% of people with some sort of long-term condition are in the lowest quintile, but actually 30% of those with one of those fatigue or breathing conditions are in the lowest quintile. Compare that to the highest quintile: 14% of people with a long-term condition are in the highest quintile. So already we can see some sort of economic gradient to health there, but even then only 10% of those with a breathing or fatigue issue are in the highest, or that's the second highest, quintile. So there are other conditions. So these sorts of breathing, fatigue and stamina conditions are more likely to be experienced by those on lower incomes. And that answers the research question around risk factors for COVID-19. You're more likely to have adverse effects from it if you've got a pre-existing condition, particularly one that affects your breathing. And so therefore, you're also more at risk if you're in one of the lower income brackets, because you're more likely to have those conditions. So if you're doing work in that sort of area, where you're trying to support people with long-term conditions or you're trying to support people who are on lower incomes, this is how you can frame some of your arguments about why you're wanting to do this sort of work. So that's where we are with that. And there was one last little thing on the activity sheet, to show that in a graph. So you might have come up with a graph that looked like that. And that's showing it graphically again, which I think just paints a stronger picture.
So you can see here that overall the distribution of people that don't have one of these conditions is kind of even. It's slightly lower for those in the highest quintile. But when you look at people who do have these conditions, they're much more likely to be in those lower income brackets. So it's a much clearer way of showing that. Someone asked in the questions if you can export these graphs. So I mean, I did that for this presentation. I just took a screenshot of it and pasted it into my presentation. And you can do the same, or you can right click on it when you're in Nestar, copy the image and paste it into your report. You can see the quality is reasonably good. So you can do that. I would just say you should obviously be citing the source of your data. And if you go to the UK Data Service catalog page for that data set, it gives you a reference in there which you can use and just paste straight in to cite where that data came from. So now we're going to move on to any sort of wider discussion. So if you have questions, put them in the question box. Ali will read them out to me. Oh, are the copied graphs accessible to screen reader users? That's a really good question. That's come from Matthew. I'm not exactly sure of the answer. But I do know that the UK Data Service is currently ensuring as much as possible that the website is entirely accessible under the new legislation that's coming into force later in the year. So most of our website will be. There's some things like the tools, like Nestar. I'm not sure how easy it's going to be to make that accessible with a screen reader. But I can get back to you on that, Matthew, if you like, and find out for you. Because again, I'm not sure if the copied graphs are accessible or not. But we can take that question away and get back to you on that. Does anybody else have any questions? You can raise your hand, or you can... Oh, sorry.
We've got one from Patti that says, in the cross tabulation, is it possible to get the p-value, i.e. the chi-squared test? Yeah, I think it is. I don't often do that sort of level of analysis in Nestar. I would normally be in another program like R or SPSS to do that. But I'm pretty sure you can. And see, there's a further tab here for analysis. I mean, you can actually run some regressions. Oh, you know more about this, Ellie? No, no, I don't think so. Oh, sorry, I thought you were going to say something. So you don't have any of the statistics down there in the table. But it does give you the statistics when you do the regressions, or if you do the correlation, you get the statistics in there as well. So there are further statistical tests, for which the information will be available. But I can follow that one up for you as well, Pfizer, and get back to you just to give a more exact answer. We do actually have a few guides as well on the UK Data Service website, which I can also share. I'll just show you quickly where some of these guides are. Last week I showed it in the screenshots, but I'll do it here today. So we were under Get data to find, to explore and to actually get the data. But under Use data is where you'll find all our support and resources. And so under here, there's guides and video guides. So the guides are all PDF guides. And so it's methods and so on, and we've got guides to exploring online. And so there'll probably be stuff in here around, well, there's not one about Nestar, but there's definitely videos about Nestar. And we've got a whole YouTube channel, so if you click through to the YouTube channel, you'll find all our videos. And there's a whole little set of videos about exploring Nestar, which will give you some more demonstrations of how it can be used. And I'll just point out down here, while I'm here as well, there's our data skills modules.
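On the question above about getting a p-value for a cross-tabulation: outside Nestar, the chi-squared test of independence can be worked through by hand. The sketch below is in Python rather than the R or SPSS mentioned in the answer, and it is only an illustration. The counts are hypothetical, `chi_squared_2x2` is a made-up helper name, it handles a two-by-two table only, and it ignores survey weights, which a real analysis of survey data would need to consider.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = men, women; columns = feels safe, feels unsafe.
counts = [[80, 20],
          [60, 40]]
stat, p = chi_squared_2x2(counts)
print(f"chi-squared = {stat:.2f}, p = {p:.4f}")
```

A small p-value suggests the row and column variables are unlikely to be independent in the population. For tables larger than two by two, the degrees of freedom change and a statistics package is the more practical route.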
And we have a whole data skills module on survey data that takes about two hours to complete. And it goes through everything I've talked about today in a bit more detail. It doesn't use Nestar though. For the examples, it uses SPSS. Or there is a free alternative called PSPP, which you could use as well. So that is worth exploring as well, if you want to find out more about what we've been talking about today. I think there's another question: is there a way to search for variables? Oh, that's right. Someone asked about searching for variables. We do have on our website a variable and question bank. It's not updated as much as it used to be, but it's still useful, because lots of the surveys have run for a long time and lots of the surveys use the same variables year on year or wave on wave of data collection, because that means you can compare the variables over time. And so if you go here from the UK Data Service homepage to Get data, you see this box comes up here, the variable and question bank. And so you can click on that. And then you can type in whatever variable it is that you're interested in. So I don't know if this will come up or not yet, but say we search for walking alone. And yeah, so then you've got these questions about people being afraid to walk alone in the area where they live. And that question that we just looked at, well, that says it's from the English Longitudinal Study of Ageing. You'll find some questions are asked in different surveys and it's the same question. And that's because lots of these questions are written in a way which is validated, and so it's asked in a way that they know is going to get a fair answer, which is another good reason to use secondary data, because all the questions have been thought about and developed in such a way that they give a proper representative answer from the population. So this is a good place to look for questions, even if you are doing your own survey.
If you use some of the questions that are standardized here, you could even then compare your own survey of your service users to a nationally representative one, to look at the differences or similarities. So this variable and question bank can be used in all sorts of ways. But you'll see here that, straight from the variable and question bank, you can open it up and find the responses to that question there, which doesn't give you the percentages or anything, but you can explore further and look at all instances of this variable. Or once you've found the variable, you can then look for it. So you know that this was in the English Longitudinal Study of Ageing, so you could look that up in Nestar and find the variable that way. So if you just want to do a bit of a search about what sort of things you can ask or how things are asked, I definitely recommend looking at the variable and question bank and exploring that a bit.