My name is Maureen Haker. I've worked with the UK Data Service in various capacities for the last nine years, on anything from digitizing data to reuse projects. I also lecture at the University of Suffolk in the Department for Social Sciences and Humanities.

Great, thanks Maureen. So I'm Allie Bloom. I work with the UK Data Service user support and training team. I also do a lot of work on creating student resources, resources for dissertation students. So yeah, we're here to talk to you today about that and how you can use data in your dissertation.

All right. So what we're planning to do today is to go over what secondary analysis is and look a bit closer at some of the key methodological issues of reuse projects, for both qualitative and quantitative data. But before we get into that, I'm going to do a brief overview of the UK Data Service for those of you who have not yet had the pleasure of exploring the archive. I'll also point you to some further resources which can help you if you're planning a reuse project for your dissertation. This is an introductory workshop, so we're going to assume that most of you will probably not have used archives before and may have had only introductory modules on research methods.

But before I get too far along, I just want to address a simple question first: what is secondary analysis? In short, secondary analysis is a method which asks new questions of old data. It's analyzing data that you've not collected yourself. Usually researchers collect far more data than they actually need to answer their own research questions. Think of those national surveys which collect a lot of data on a representative sample, or qualitative studies which contain interviews that can last one to two hours, sometimes more. Those data sets can answer a lot of different questions, and they can be analyzed using a lot of different techniques. Secondary analysis makes use of exactly that.
It reuses data that's already been collected by someone else.

There is a complicated nuance here around terminology. You may have heard of some other terms used to describe this method, including secondary data analysis or reuse projects. They all refer to the exact same method, so don't be too confused by this. There is an ongoing debate about what to call this method. In 2007, Libby Bishop wrote about the primary-secondary dualism, basically making the argument that there may be a privileging of methods where you actually go out and collect the data yourself, and that using terms like secondary analysis or secondary data reinforces that hierarchy. But actually, primary and secondary analysis are a lot more similar than different once you have fully considered all the key methodological issues of secondary analysis as a method. Consequently, you'll find an increasing use of the term data reuse, and that's the term I'll be using throughout the workshop. It takes into account the huge range of ways that data can be used and reused, and it doesn't imply that any of those projects are secondary to the initial use of the data. So you can use whatever term comes to mind, but just be aware that there are a few different ways of describing this process of reusing data.

Okay, so you know what secondary analysis is, but where would you find data that's already been collected? This is where the UK Data Service, which holds the largest collection of social science data in the UK, comes in. We're a comprehensive resource funded by the ESRC, and our main job is really just to be a single point of access to a wide range of secondary social science data. The main purpose, then, is the collection, ingest and processing of data, and then the further dissemination of that data for people to use. In addition to that data infrastructure core, we also have a service layer, which provides extensive support, training and guidance.
Who is it for? Well, we like to think it's really for anyone who has an interest in data. Traditionally, the main audience, and probably the people who both deposit and use our data the most, tend to be academic researchers and students. But there are a lot of other groups that are well represented too, including government analysts, charities, foundations, businesses, research centres and think tanks, all of which deposit data with us and use our data. Given the importance of data, how it's used and how it's disseminated, we're trying to reach out to and support a wide range of communities.

What kinds of data do we hold? The majority of the data, at least judging by the number of collections, is quantitative. We have over 8,000 collections, of which 6,600 or so are quantitative. And we hold a wide variety of that data: survey data, both cross-sectional and longitudinal, aggregate statistics, domestic and international macro data, census data and microdata. And then, of course, we also have qualitative and mixed-methods data.

Where does it come from? Again, that varies depending on the data type. Some of the sources you see here, including agencies and statistical time series, are clearly our main sources of quantitative data. Most of our qualitative and mixed-methods data comes through individual academics: if they get a research grant, they deposit their data with us after their project is over. And of course, we also hold some originally paper-based public records and historical sources, including things like the census.

And where can you find information about all this? We have a website, ukdataservice.ac.uk, which holds a lot of that information, and from there you can find our catalogue. We also have hundreds of pages which discuss methodological issues like gaining consent, anonymizing data and storing data. There are also some student-specific tutorials and pages, such as our data skills modules.
We have workbooks and exercises that are based on collections we hold, and there are also help pages if you have specific questions about how to use the website.

But getting back to dissertation projects: what kind of projects can you do reusing data from the archive? Well, we really need to go back to the beginning to answer this and think about the research process. Hopefully you're familiar with this model. It starts with some kind of general direction or general topic for your research. You do some background research into the literature already written on that topic, and from there, hopefully, you're inspired to ask a research question which builds on that body of research. Once you have that research question, you decide the best way to answer it and you design your project. Once you've settled on a method, you collect the data, analyze it, and begin your write-up, which is what you would submit for your dissertation. You might add a few extra steps in there, or swap a couple of those steps depending on your theoretical foundation, but generally speaking, this is how we normally think of the research process.

When you're doing a project using secondary analysis, the process looks a little bit different. The standard model clearly shows how the research question is built from your chosen topic area, your preliminary search, and possibly the literature that you find. With secondary analysis projects, however, the research question is usually derived from the data. You would start with the topic you're interested in, but instead of looking for literature, you look for data and you start evaluating the collections. When you find a collection that intrigues you, you then ask a research question of the data. From there, you would go out to find what literature exists on that question. You then don't need to collect any data.
You just need to access it, which is of course one of the key advantages of secondary analysis. The data is already collected; you just need to get your hands on it, either by downloading it from the catalogue page, or, if it's only available in paper form, by actually going to the archive. Once you have it, you analyze it and write up your dissertation.

So the key point I'm making here about reusing data for dissertation projects is about where your research question comes from. It would take a lot of time, and what we might call inside knowledge of the data within the archives, to come up with a research question first and then search for the perfect data to answer it. You might be searching for the right data for a really long time unless, as I said, you already have a good working knowledge of the collections held in the archive. For a dissertation project, when your time and resources are limited, you'll want to look at the data first and from there develop a research question which gives a new take on that data. You can, of course, spend time looking for the right data. But for a dissertation project, you may want to first go and look at what data exists on your general topic area before nailing down your research question.

With that being said, it's probably important to have some kind of idea about the kind of project you want to do. While you might be exploring data without a specific question in mind, you may want to think about what kind of research design your project will follow. There are four different types of reuse projects that lend themselves quite well to dissertations: reanalysis, the replication study, the comparative study, and the re-study. Reanalysis is probably the one that comes to mind when thinking about what secondary analysis is.
This involves thinking about the wide range of approaches that you can take in the analysis of a data set. It usually means asking a different research question from the one the original researchers were trying to answer. So, for example, Clive Seale and Jonathan Charteris-Black did a study using comparative keyword analysis of illness narratives. Now, the original illness narratives had been looked at exclusively for health research; the interviews were meant to explore how diagnoses are made. When Seale and Charteris-Black came along to do the comparative keyword analysis, however, they were much more interested in the analysis of the discussions between the patients and the doctors rather than the actual health issues that came up in the interviews. So the question can be very different in that kind of way. Or sometimes the question can be on a similar topic to the original research, but with a slightly different focus. For example, Joanna Bornat looked at gerontology as a topic and found two different data sets looking specifically at it. However, Bornat's research question was on racism, which wasn't the focus of the original work, but the data set was rich enough to allow her to explore this theme within the existing data.

Now, if you want to use the exact same analysis strategy, that would be a replication study, which is also possible. Right now there's a real concern about the reproducibility of research, and replication studies can reveal the sort of messiness that's involved in working through data. One of the most famous, or perhaps infamous, examples of replication is from Thomas Herndon, who was a postgraduate student at the University of Massachusetts. He was assigned an assessment to replicate the findings of a published study, and he chose Reinhart and Rogoff's 2010 paper, "Growth in a Time of Debt". Basically, this paper identifies a threshold proportion of national debt to GDP beyond which you see negative economic growth.
So Thomas Herndon pulled the OECD data to rerun the analysis as the paper had laid it out, and got a completely different answer. The published paper said that growth turns negative once debt exceeds 90% of GDP. However, Herndon calculated that debt can actually exceed GDP, and even then it has only a minimal negative impact on economic growth. After contacting the original investigators, he found a flaw in their data set: some cells had basically been miscopied from the OECD data. The full story was published in 2013 in the New Yorker. A replication study won't always uncover flaws like this, of course, but it is nonetheless a study design that could be worth considering, and it helps you develop an appreciation for the research process. You could even develop a project whereby you rerun a series of studies on the same topic, or you explore a complicated data set with, say, missing data, transformed variables and so on.

You can also do comparative work. You might be looking at an international comparison between two countries, or comparing social subgroups of the same population based on a shared social characteristic. Our key data page for quantitative data sets outlines some of the large national surveys held at the archive, any of which would allow you to do comparative work without having to collect two sets of data yourself. You could compare samples across time, geographic place, gender or ethnicity; these characteristics are usually collected as standard for the larger surveys.

The final type of reuse, which I'm going to illustrate with a case study, is the re-study. This is where you replicate the methods of a study for purposes of comparison. It involves a little bit of secondary analysis, but it also allows you scope to collect a little of your own data as well. The example of this kind of reuse project is the School Leavers study collection.
The original study was conducted by Ray Pahl in the late 70s as part of a much wider community study on the Isle of Sheppey. As part of that project, Pahl asked teachers to set a particular kind of essay just before students were due to leave school, prompting them to imagine that they were reaching the end of their life and that something had made them think back to the time they left school. They were then asked to write a short essay about what happened in their life over the following 30 to 40 years. In 2009, Graham Crow and Dawn Lyon (this picture here is of Graham Crow with Ray Pahl) decided to reanalyze this data set and focus solely on student aspirations. Using the very same methodology, they conducted a re-study of school leavers on the Isle of Sheppey in 2009. The prompt given to students in the 2009-2010 data collection was nearly the same: imagine that you're at the end of your life and reflect back on what you've done since leaving school. They then transcribed the essays and compared the themes from the new set of essays to the set collected by Ray Pahl. You can see the wording of the prompt here, along with a small snippet of one of the essays.

The findings are fascinating, and they show the difference in young people's aspirations after one generation, 40 years on. But how exactly were they different? Well, in 1978, students expected a much more grounded, arguably mundane sort of job. Career progression tended to be gradual, and it followed on from very hard work. Sometimes there was talk of periods of unemployment or, quite morbidly, even their own death or the early death of someone they loved. You can see a few examples of quotations from those essays in the left column, such as the one at the bottom: "I longed for something exciting and challenging, but yet again, I had to settle for second best. I began working in a large clothes factory."
In 2009-2010, however, students imagined well-paid, instantaneous jobs filled with choice, but also with some uncertainty. Crow and his research team noted a clear influence of celebrity culture in those essays. For example, you have the quote at the bottom from a girl who writes: "In my future, I want to become either a dance teacher, a hairdresser, or professional show jumper, horse rider. But if I do become a dancer, my dream would be to dance for Beyonce or someone really famous."

This study is perhaps a bit larger than what might be realistic for a dissertation. The goal was to engage the entire community alongside the research and find innovative ways of including participants in the research outputs. As part of the initiative, they published the Living and Working on Sheppey website, which helps to create a shared history and memory among the community of what living on the Isle of Sheppey means. While this would be an ambitious project for a dissertation, it is nevertheless a good example of how you can combine a bit of data collection and data reuse in one project. For those of you who are perhaps doing PhD work, this is certainly something to consider for your own projects. For others, you can still design a more feasible study with a smaller sample and smaller outputs.

So hopefully you are now brimming with ideas of what you might want to look for in the archives, or what kind of project you might be able to do reusing data. Since you're not collecting data yourself, you'll find reuse projects have comparatively few ethical considerations, and hopefully you won't hit too many snags with any ethical review boards. However, that doesn't mean there are no ethical considerations, and there are two key points I want to make before diving into qualitative and quantitative data. The first of these centres on access: how do you get permission to use the data?
So if you're reusing data at an established archive like the UK Data Service, we've taken a lot of the pain out of access by negotiating licensing issues with the person who collected the data. This usually means you need to sign what's called an End User Licence. This is a legal document which states that you're going to do two really important things. The first is that you won't share the data onward, including with your supervisor. If you need help with your analysis, or your supervisor needs to see the data, then he or she will need to register and download the data for themselves. The End User Licence stipulates that you cannot, under any circumstances, share your data or your login with anybody.

The second concerns anonymization. All the data we hold is anonymized, which is again likely to be the case if you're reusing data from an archive. However, just because it's anonymized does not mean it's completely impossible to figure out the identities of participants. Mark Elliott has some excellent YouTube tutorials which go through anonymization theory, but in short, he makes the argument that no anonymization strategy will be 100% effective at all times. Consequently, should you inadvertently uncover the identity of any participant, the End User Licence stipulates that you will not reveal that identity to anybody. Those are the key issues to recognize when signing the End User Licence.

Once you have sorted out access, the second point I want to make is that you need to ensure you cite the data. In short, citing archived data helps data creators track the impact of their study. It also supports reproducibility and makes it easier to find the data used for your project. This issue is so important that the UK Data Service has a page on the website which goes into a little more detail about data citation and explains why this is an important ethical issue.
With the UK Data Service, we try to make this easy by supplying the citation you need on our catalogue pages. You'll find a citation and copyright box on the catalogue page; you just expand that, and this is what you'll see. You can literally copy and paste this citation into your reference list.

So you've got access, you've sorted out the citation, but now comes actually doing the secondary analysis. I'm going to talk through qualitative data first and a couple of the key issues in getting started with it, and then I'll pass back to Allie, who's going to talk about quantitative data. First, I'm going to talk about orienting yourself to the collection, then about recontextualizing the data, and I'll end with a very quick point about sampling.

When you first download a qualitative data set, you'll get a zipped folder which looks a bit like this: some folders stuffed with files. Most qualitative data is held as what are called RTF (rich text format) Word documents, so to find the data, you just need to go into the RTF folder. This folder, when opened, should hopefully look like this: all your data, nicely organized. Clicking on one of those files would open up a file which looks a bit like this. Now, this is a snippet of one of the School Leavers study essays; the RTF folder holds over a hundred of those essays in their entirety. But the files don't have to be just essays or interview transcripts. You might, for example, have PDFs of handwritten notes like the one in the upper left corner, or ethnographic notes like the ones on the right. Some collections might also have images or videos, like the one in the lower left-hand corner. Most likely, though, you'll end up opening interview transcripts like the one seen here. These should have clear turn-taking and speaker tags, so you know who's talking.
There are a lot of different data types available, so just make sure you have a good look through the collection first and see what's actually in there.

The next thing you need to do is orient yourself to the project, and I think the main point here is not to underestimate the amount of time it will take to get acquainted with the data set. There might be multiple levels of context to get through in order to really understand that data. What I mean is that you may have more than just the data collected at the time of the interview to consider: you might also have to consider the basic social characteristics of the participants, the historical time period in which the data was collected, or where it was collected. The idea is that you need to understand the data set as a whole in order to get at the root of what the data can convey. Every collection archived at the UK Data Service comes with some documentation, which is a really useful starting point for this. It contains more information about the methodology, and might include things like an interview schedule or a call for participants; sometimes it includes segments from publications that arose from the original study. I've also seen things like funding applications. For qualitative data sets, this documentation is called the user guide, and this is an example of one. This one happens to have an interview guide for interviewers, as well as a blank consent form, a sample profile, and so on. It's all further background information to help you understand how the data was collected.

But what if you want to know more about the participants themselves? Every qualitative study also has what's called a data listing, and here's an example of one. It's a table which gives a brief overview of the data in that collection.
So each row represents a piece of data, or even a participant, and each column lists some characteristic or attribute of that interview or that person. It's a quick way of getting an at-a-glance view and getting to know who took part in the study.

In addition to the context of the data, you may also need to consider the sample. For example, if the data set is too large, you may need to take a sub-sample. Qualitative collections tend to be smaller anyway, but many of the archived data sets come from funded projects and can contain a considerable amount of data. For a small dissertation project, you may want to be realistic and decide whether you need to limit yourself to a smaller sub-sample of the larger collection. For example, the Edwardians collection, which was put together by Paul Thompson and is widely considered to be the first oral history of Britain, contains 453 interviews of 80-plus pages each. By contrast, most dissertation projects probably have an expectation of maybe six to ten interviews. So you would need a clear sampling strategy to help you choose which of those 453 interviews you actually want to look at. Or you might be interested in a particular subgroup of the population, so again, you'd want to think about what criteria you're looking for. You might also want to combine data from different collections that complement each other. Remember, it would take a lot of time to sift through and find pieces of different data sets to pull together, but this is another possibility. If you feel all the data speaks to the same topic and research question, and you've done the work of recontextualizing to ensure those interviews work together, then there's no reason why you can't mix and match in that way too. It's just another consideration you can make.

And now I'm going to hand over to Allie to talk about the key methodological challenges of quantitative data. So thanks, Maureen.
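[Editor's note: before moving on, the sub-sampling step Maureen described can be sketched in a few lines of Python. Everything below is hypothetical: the file names, fields and values simply stand in for rows you would take from a collection's data listing.]

```python
import random

# Toy stand-in for a data listing: in practice you would work from
# the rows of the collection's own data-listing table. File names,
# fields and values here are all hypothetical.
listing = [
    {"file": "int001.rtf", "gender": "F", "occupation": "teacher"},
    {"file": "int002.rtf", "gender": "M", "occupation": "farmer"},
    {"file": "int003.rtf", "gender": "F", "occupation": "nurse"},
    {"file": "int004.rtf", "gender": "M", "occupation": "docker"},
    {"file": "int005.rtf", "gender": "F", "occupation": "clerk"},
]

# Step 1: apply a purposive criterion, e.g. a subgroup of interest.
women = [row for row in listing if row["gender"] == "F"]

# Step 2: draw randomly down to a manageable sub-sample size.
random.seed(42)  # fixed seed so the draw is reproducible and reportable
subsample = random.sample(women, k=2)
print(sorted(row["file"] for row in subsample))
```

The same two-step pattern (filter on explicit criteria, then a documented random draw) scales to a collection like the Edwardians: it gives you a defensible sampling strategy you can describe in your methods chapter.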
I'm going to give a quick rundown now of the key methodological things you need to consider if you're doing a dissertation project reusing quantitative data. I'll cover these in two main areas: first, when you're going about selecting your data, and then when you're going through the process of understanding it.

So first of all, selecting your data: we're going to look at topics, concepts and variables. From everyone's answers in the poll, I know some of you already know which topic area you're interested in or might want to explore. But if not, I'll just run through an idea of some of the data available. The UK Data Service holds quantitative data on a variety of topics. Just to give you an idea, there's data available that could allow you to look at the environment, workforce patterns, health care, family spending, attitudes to the police and the criminal justice system, time use during the pandemic (so how people spent their time), and attitudes to various political stances and political opinions. And these are just some examples. There's a wealth of data out there, so please do have a look if you're interested in using quantitative data.

Once you have a general idea of your topic area, you can think about the data you might use to explore it. The key thing for a quantitative reuse project is to understand what it is you're trying to measure. You need to think about the key concepts you want to measure and how these might relate to variables within a data set. I'm going to give an example of this to hopefully make it a little clearer. For example, let's say we were interested in looking at the relationship between fear of crime and age. As Maureen said earlier, we're starting with a broad topic instead of a specific question. So our key concepts here are fear of crime and age.
So that means we need to find some data which has variables that measure these concepts and will allow us to formulate or answer a research question about them. As Maureen said earlier, it may not be possible to find the perfect data set, and you could spend a lot of time searching for it. So instead, we're going to start from data on this general topic and then derive our concepts and questions from the existing variables. It's also fine if you already have a question in mind, though; just keep in mind you might need to be flexible and adapt it based on the data available.

If you're looking for data on key topics, there are a few places you can start. You can use the theme pages on our website to look for key data sets on particular themes, such as health, finances or the environment. You can also type keywords or the names of particular data sets into the catalogue to search for them. And you can use the Variable and Question Bank, which allows you to search for data based on the names of particular variables, or for variables that measure particular things. But please be aware that not all of the data sets in our collection are covered by it, so you might miss some if you search using that alone; definitely use a couple of places when you're searching, and you might want to check the catalogue as well. The final one is the HASSET thesaurus, which lets you search for key concepts. You type in your key concept, poverty or something like that, and it will find you data sets that have questions based on it. The links to all of these can be found in the Find data section of the website, if you go to where it says browse and access key data. We will have a practice of this in the practical session as well.

So once you've found a data set that you think might be suitable, you'll need to consult the catalogue page.
This will give you an overview of the key topics, the background, and a brief overview of the methodology. You can also access the documentation from here, including any user guides or technical reports, lists of the variables included in the data set, and any notes added by the producers outlining changes that might have been made since the data was originally deposited. Sometimes changes will be made, for example, to the weighting, to the name of a variable, or to correct errors in the data set, so it's important to check for those. Importantly, the list of variables can usually be found in the documentation: some data sets have a dedicated variable list, while most will have a codebook or data dictionary which lists the name of every variable included in that data set.

So, back to our example. For our question on fear of crime, we might want to start by looking at a survey such as the Crime Survey for England and Wales, a large, important crime survey which might cover our key topics. Just a bit of background: this is an important source of information about crime because it provides crime statistics that are independent of police records. It's an example of a repeated cross-sectional survey; it's conducted annually, and it samples 35,000 individuals aged 16-plus, along with a slightly smaller sample of 3,000 individuals aged 10 to 15. And from having a look in the documentation, I can see that the Crime Survey for England and Wales has two variables that might fit our key concepts: a quality of life variable, which asks how much your own quality of life is affected by fear of crime on a scale of 1 to 10, where 1 is no effect and 10 is a total effect on your quality of life, and a variable that measures the age of the respondent.
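[Editor's note: a first exploratory look at two variables like these might be sketched as below. The variable names and every value are invented for illustration; they are not taken from the real CSEW codebook or data.]

```python
from collections import defaultdict

# Hypothetical extract: qual_life is the 1-10 "effect of fear of
# crime on quality of life" item; age is respondent age in years.
rows = [
    {"age": 22, "qual_life": 2},
    {"age": 34, "qual_life": 3},
    {"age": 51, "qual_life": 5},
    {"age": 67, "qual_life": 7},
    {"age": 71, "qual_life": 6},
]

def age_band(age):
    """Collapse exact age into three broad bands."""
    return "16-39" if age < 40 else "40-64" if age < 65 else "65+"

# Mean score per age band: a first look at whether the reported
# impact of fear of crime varies with age.
scores = defaultdict(list)
for r in rows:
    scores[age_band(r["age"])].append(r["qual_life"])
means = {band: sum(v) / len(v) for band, v in scores.items()}
print(means)  # e.g. {'16-39': 2.5, '40-64': 5.0, '65+': 6.5}
```

Collapsing a continuous variable like age into bands is itself an analytic choice you would need to justify, and with real survey data you would also apply the supplied weights before reading anything into the group means.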
So now we've found our variables, an important step is to think critically about what they're measuring. For example, take the quality of life variable. Does this really measure fear of crime, or is it measuring how much fear of crime affects these individuals' lives? To help you think about this, you can go back and look at the original question that relates to the variable, as I've done here, and see what the question is really asking. Again, the full questionnaire can usually be found in the documentation, or sometimes the full question is written out next to the variable. This is also an example of how we might want to adapt our research question: we may not be able to find a variable that measures fear of crime itself, but we might adapt our project so we're looking at the impact of fear of crime on individuals at different ages, for example.

As well as considering your variables and concepts, you might also want to consider the type of analysis you want to do, or can do, with different types of data. For example, if you're looking at individuals at a particular time point, you might want to use a cross-sectional survey. If you want to look at individuals, families or households at multiple points in time, you might want a repeated cross-sectional survey. If you want to follow the same individuals, or cohorts of individuals, over time, you would look at longitudinal data. If you're interested in small geographic areas, so particular local authorities or the impact of certain councils, you might want to look at census aggregate data or flow data. And if you want to compare countries over time, international time series data would probably be the best choice.

It's also important to think about your population: that's the group you want to measure. This might be the population of the world, the UK, or perhaps a particular city or local authority area.
And you also need to consider your unit of analysis. So are you interested in individual people, or are you interested in households? And this will affect the data you can use, because some data sets are only available for particular geographies or at a certain unit level. Finally, as I said earlier, it's important to remember that this process isn't linear. You might need to go back and forth and realign your question with the available data, or compromise if the perfect data isn't available, or choose some different variables. And this is all part of doing secondary analysis. So now just briefly, a quick bit on understanding your data. Once you've chosen your data set, there are a few final things you need to consider. As we said, the documentation usually contains information on the variables, but it should also have information on the questionnaire used to collect the data. And as I said earlier, to understand secondary data, it's really important to understand the questionnaire and the questions that were asked. Something that's particularly important about that is the routing, that is, who was asked which question. This is because many questionnaires use something called computer-assisted interviewing, which sends respondents through the questionnaire by different routes depending on their previous answers. Therefore, many of the questions in a survey might only be applicable to some of the sample, which might mean that the question you're after doesn't address your population of interest, or in some cases it might result in sample sizes that are too small to do useful analysis. So it's always a good idea to check the documentation and see who was asked the questions behind the variables you're interested in. So for example, this is a variable called FLEX10 from the Labour Force Survey, which relates to special working arrangements. You can see the question as well as the range of answers.
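To make the routing idea concrete, here's a minimal Python sketch. The field names (`in_work`, `flex10`) and the tiny data set are purely illustrative, not the actual Labour Force Survey layout: the point is just that respondents who weren't routed to a question have no answer recorded, which shrinks the effective sample for that variable.

```python
# Illustrative sketch of questionnaire routing: a question is only
# asked of respondents routed to it (here, those in work), so
# everyone else has a missing value. Field names are made up.

respondents = [
    {"id": 1, "in_work": True,  "flex10": 7},
    {"id": 2, "in_work": False, "flex10": None},  # never asked
    {"id": 3, "in_work": True,  "flex10": 3},
]

# Only routed respondents contribute to analysis of this question,
# so the effective sample size can be much smaller than the survey.
asked = [r for r in respondents if r["in_work"]]
print(len(asked))  # effective sample size for this question: 2
```

So before committing to a variable, it's worth checking in the documentation how many respondents were actually routed to the question behind it.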
And underneath this, it says that this question is asked if the respondent is in work. And then it also gives information on how they identified who was in work by responses to other questions or variables. You'll find that the documentation will look a bit different across different data sets, but this gives a general idea of what you can expect. You can also find information about how the data has been processed after collection. So derived variables are created from the raw data, and here is an example of a derived variable. This shows how the FLEX10 variable, which we talked about on the last slide, has been used to derive a new variable called FLEXW7. So this variable has been derived by taking into account the responses from the original variable. So those who responded seven to FLEX10, saying they had a zero hours contract, are now coded one on the new FLEXW7 variable, defining them as people who work a zero hours contract. Don't worry if this looks a little confusing, it can take a minute to get your head around. But once you understand all the variable names, it should be a very useful way to get to know your data. And just to say, not all documentation contains these diagrams. In some surveys you'll just have the SPSS or the Stata syntax, which shows you how the variables have been derived. But it's really important that you understand the origin of any variables that you're using. You'll also need to think about sampling. Surveys and similar quantitative data sources are almost always based on samples. And one important question you need to ask about your data is: is the sample representative? So you need to know who is included in the sample. Is it adults? Is it only those in private addresses? What was the response rate? Were there differential responses across the population? This will tell you about some potential biases. You'll also need to find out if you need to use a survey weight to make the data representative.
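As a sketch of what a derivation like the FLEX10-to-FLEXW7 one is doing, here's the recode written out in Python. The response code 7 meaning a zero hours contract comes from the example above; the helper name and the sample responses are invented for illustration, and a real survey's syntax files would define the exact rules.

```python
# Illustrative sketch of a derived variable: respondents who gave
# response code 7 ("zero hours contract") on the original variable
# are coded 1 on the new variable, everyone else 0.

def derive_flexw7(flex10_responses):
    """Recode raw FLEX10 responses into a 0/1 derived variable."""
    return [1 if response == 7 else 0 for response in flex10_responses]

flex10 = [3, 7, 1, 7, 5]        # raw responses for five respondents
flexw7 = derive_flexw7(flex10)  # derived zero-hours-contract flag
print(flexw7)                   # [0, 1, 0, 1, 0]
```

Reading the derivation rules in the documentation (or the SPSS or Stata syntax) tells you exactly which recodes like this were applied before the data reached you.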
And again, you can find all this information in the documentation that comes along with the data. And finally, you need to ask whether you have enough cases to make a precise estimate. So for example, the Crime Survey for England and Wales has a large sample size. However, for surveys with smaller sample sizes, or if you're analysing a smaller subpopulation, which means you're cutting the number of respondents down, there might not be enough cases to make precise estimates. So I've just given you a general overview about sampling considerations and other things to consider, but we do have a number of resources on the website. In particular, I'd point you to the UK Data Service guides on weighting and complex sampling. So just to summarise: think about your key concepts, what you're trying to measure, and how these relate to variables in a data set. Check the catalogue and documentation to help you to understand your data. And make sure you're considering questionnaire routing, derived variables, and sampling. And if you want more information on our dissertation resources, keep an eye out on our Twitter for the hashtag UKDS dissertations. We also have some new student pages on the website that will help you think further about your student project. These can be found on our learning hubs. So there's a section that's dedicated to secondary analysis for dissertations, a section with dissertation resources, including PDF guides and quick start data set guides. And there's also a 'finding the right data for your project' section with videos and worksheets to help you think through some of the things we've discussed today.
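To illustrate the two checks just mentioned, here's a small Python sketch of applying a survey weight and counting the cases left in a subpopulation. All the numbers here are invented for illustration; real weights, and how to use them, are described in each data set's documentation and in the UK Data Service guides on weighting.

```python
# Illustrative sketch: weighted estimates and subgroup case counts.
# All values below are made up; real survey weights ship with the data.

def weighted_mean(values, weights):
    """Weighted mean of survey responses (e.g. a 1-10 scale)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

scores  = [2, 5, 8, 4]           # illustrative responses
weights = [1.2, 0.8, 1.0, 1.0]   # illustrative survey weights

print(round(weighted_mean(scores, weights), 2))  # 4.6

# Before estimating for a subpopulation, check the unweighted
# number of cases actually left after subsetting the sample.
ages = [17, 64, 34, 12, 45]
adults = [a for a in ages if a >= 16]   # e.g. restrict to ages 16+
print(len(adults))  # 4 cases in this (tiny) illustrative subgroup
```

With only a handful of unweighted cases in a subgroup, any estimate will be very imprecise, which is exactly why checking case counts before analysis matters.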