Thank you, everyone, for coming to this dissertation project introduction to secondary analysis for qualitative and quantitative data. My name is Maureen Haker, and I've worked with the UK Data Service for about 10 years now on everything from digitization of some of our qualitative collections to reuse of collections. And I'm here with my colleague Allie Bloom.

Yeah, hi, everyone. I'm Allie. I work as part of the user support and training team, mainly focusing on surveys, how we use them, creating resources, and helping everyone get the most out of our data. So it's great to have everyone here today, and I'm looking forward to going through some dissertation project tips.

Fantastic. So it's early on a Monday morning, and I think without further ado, we'll go ahead and get started.

Okay, so what we're planning on doing today is to go over what secondary analysis is and look a bit closer at some of the key methodological issues of reuse projects for qualitative and for quantitative data. And before we get any deeper into that, I'm going to go through a very brief overview of the UK Data Service for those who haven't had the pleasure yet of seeing some of our data. I've tried to pull a few case studies in throughout as well, so you can see some of our data in action. And we're going to end by signposting you to some further resources. This workshop is very introductory. We've assumed that most of you have probably not used an archive before, and may have only had some introductory modules on research methods. So do let us know if you have any questions about any of the terminology that we're using.

So the first thing that I want to address, before we get too far along, is what secondary analysis is. In short, secondary analysis is a method which asks new questions of old data. So it's analyzing data that you've not collected yourself.
And usually researchers collect far more data than they actually need in order to answer their own research questions. So think of, for example, those national surveys, which collect a lot of data on a representative sample, or qualitative studies, where interviews can often last one to two hours, sometimes more. Those data sets can answer a lot of different questions, or they can be analyzed using a lot of different techniques. So secondary analysis is basically trying to make use of that data. In short, secondary analysis is reusing data that's been collected by somebody else.

But there's a complicated nuance here around terminology. You might have heard other terms used to describe this method, and that includes secondary data analysis, or, as I've just used, reuse projects. They all refer to the exact same method, so don't be too confused by that. But there is an ongoing debate about what to call this method. In 2007, Libby Bishop wrote an article about the primary/secondary dualism. Basically, she makes the argument that there may be a privileging of methods where you go out and collect the data yourself, and that when we use terms like secondary analysis or secondary data, it reinforces that hierarchy. But actually, primary and secondary analysis as methods are probably a lot more similar than different once you fully consider some of the key methodological issues of secondary analysis. So consequently, you'll find an increasing use of the term data reuse, and that's a term that we'll use throughout this workshop. It takes into account the huge range of ways that data can be used and reused, and it doesn't imply that there's any kind of privileging of certain methods, or that any use of the data is secondary to the initial use of the data. You can use whatever term comes to your mind first.
But just be aware that there are a few different ways of describing this process of reusing data.

So now you know what secondary analysis is. But where would you find data that's already been collected? This is where the UK Data Service, which holds the largest collection of social science data in the UK, comes in. We're a comprehensive resource that is funded by the ESRC, and our main job is to be a single point of access to a wide range of secondary social science data. So the main purpose is the collection, ingest and processing of that data, and the further dissemination of that data for people to use. But in addition to that data infrastructure core, we also have a service layer which provides extensive support, training and guidance.

Who is it for? Well, we like to think that it's for anybody who has an interest in data. Traditionally our main audience, and the people who probably both deposit and use our data the most, tend to be academic researchers and students. A lot of other groups are well represented as well: government analysts, charities, foundations, businesses, research centers and think tanks all give us data and use our data. Given the importance of data and how it's used and disseminated, we're trying to reach out to as many communities as possible.

And what kind of data do we hold? The majority of our data, at least judging by the number of collections, is certainly quantitative data. We hold over 8,000, maybe 9,000, collections, of which about two thirds are quantitative, and we hold a wide variety of that data too. So there's survey data, both cross-sectional and longitudinal, there's aggregate statistics, domestic and international macro data, census data and micro data, and of course we've got a sizable qualitative and mixed methods data collection as well.

Where does it come from?
Again, that varies depending on the data type. Some of the sources that you see here, including agencies and statistical time series, are clearly going to be some of the main sources for our quantitative data. Most of our qualitative and mixed methods data comes through individual academics, through their research grants, and of course we hold some originally paper-based public records and historical sources, and that includes things like the census.

Where can I find information about it? We've got a website, ukdataservice.ac.uk, which holds a lot of information, and from there you can find our catalog. We also have hundreds of pages which discuss methodological issues like gaining consent, anonymizing data and storing data, and there are also some student-specific tutorials like our data skills modules. We've got some exercises and workbooks that are based on some of the collections that we hold, and there's a help page if you have specific questions about how to use the website.

But getting back to dissertation projects: what kind of research projects can you do reusing data from an archive? Really, I think we need to go back to the beginning to answer this and think about the research process. So you're hopefully familiar with this model. It starts with some kind of topic or general direction for your research. You do some background research into the literature that's already written on your topic, and from there you are hopefully inspired to ask a research question which builds on that body of research. Once you have a research question, you then decide the best way to answer that question and design your project. Once you've settled on that method, you collect the data, you analyze it, and you begin your write-up, which is what you would submit for your dissertation.
Now, you might have a few extra steps or swap a couple of those steps depending on your theoretical foundation, but generally speaking this is how we normally think about the research process. When you're doing a project using secondary analysis, however, this process will look a little bit different. This model clearly shows that the research question is built from your chosen topic area, your preliminary search and possibly the literature that you find. However, with secondary analysis the research question is derived from the data. So you start with a topic that you're interested in, but instead of looking for literature you look for data, and you start evaluating the collections. When you find a collection that intrigues you, you then ask a research question of that data, and from there you find out what literature exists on that question, and potentially also on the data. After that you don't need to collect any data, you just need to access it, which is of course one of the key advantages of secondary analysis. The data is already collected; you just need to get your hands on it, either by downloading it from one of our catalog pages or, sometimes, by actually going into an archive if it's only available in paper form. Once you have it, you can analyze it and write it up for your dissertation.

So the key point that I'm making here about reusing data for dissertation projects is about where your research question comes from. It would take a lot of time, and a kind of inside knowledge about the data within archives, to be able to come up with a research question and then search for the perfect data to answer it. You'd be searching for the right data for a really long time unless, as I said before, you already have a really good knowledge of the collections that are held by the archive.
So for a dissertation project, when your time and resources are limited, you'll want to look at the data first, see what's out there, and then develop a research question which may hopefully give you a new take on that data. You can of course spend some time looking for the right data and do some evaluation of the data sets, but for a dissertation project you may want to first look and see what data already exists on your general topic area before you start nailing down your research question.

So with that being said, it's probably important to have some kind of idea about what kind of project you want to do. While you might be exploring data without a specific question in mind, you may still want to think about what kind of research design your project will follow. There are four types of reuse projects that I think lend themselves really well to a dissertation project, and these are a reanalysis, a replication study, a comparative study and a re-study. So I'll go through and explain what these are.

Reanalysis is probably the one that comes to mind when thinking about what secondary analysis is. It involves thinking about the wide range of approaches that you can take in the analysis of the data set. It usually means asking some kind of different question from what the original researchers were asking. So for example, Clive Seale and Jonathan Charteris-Black did a study using comparative keyword analysis of illness narratives. Now, the original illness narratives had been looked at exclusively for health research, so the interviews were meant to explore how diagnoses were made.
When Seale and Charteris-Black came along to do the comparative keyword analysis, however, they were much more interested in analyzing the discussions between patients and doctors than in the actual health issues that came up in the interviews. So the question can be very different in that kind of way. Or sometimes the question can be on a similar topic to the original research but have a slightly different focus. For example, Joanna Bornat looked at gerontology as a topic, and she ended up finding a couple of different data sets that look specifically at gerontology. However, Bornat's research question was focused on racism, which wasn't the focus of the original work behind the data she was reusing, but the data set was rich enough to allow her to explore that theme within the existing data.

If you want to use the exact same analysis strategy, that would be a replication study, and that's also possible. Right now there's a real concern about reproducibility in research, so things like replication studies can help reveal the messiness that's involved in working through the data. There are some infamous examples of replication studies, and one of these is from Thomas Herndon, who was a postgraduate student at the University of Massachusetts. He was assigned an assessment to replicate the results from a published study, so he chose Reinhart and Rogoff's 2010 paper, Growth in a Time of Debt. Basically, the paper came up with the proportion of your GDP that your national debt can reach before you start to see negative economic growth. So Thomas Herndon read the article and pulled all of the data from the OECD, which was publicly available data, so that he could rerun their analysis as described in the paper. But he got a completely different answer. The paper said that the debt can't exceed 90% of GDP; however, he calculated that the debt can actually exceed your GDP, and even then the impact on economic growth is only minimal. So after contacting the original
investigators, he found that there was a flaw in their data set: they had miscopied from the original OECD data set and ended up missing out, I think it was, five or six countries. The full story was published in 2013 in the New Yorker, if you're interested in reading it. A replication study hopefully wouldn't always show these kinds of flaws in the original study, but nonetheless I think it shows that it's a study design that's worth considering, and it helps you develop an appreciation for the research process. You could even develop a project whereby you rerun a series of studies on the same topic, or you explore a complicated data set with, I don't know, missing data or transformed variables and so on.

You can also do comparative work. So you might be looking at an international comparison between two countries, or comparing social subgroups of the population based on shared social characteristics. Our key data page for quantitative data sets outlines some of the large national surveys that are held at the archive, and any of those, I think, would allow you to do some kind of comparative work without actually having to go out and collect two sets of data. So you could compare samples across time, across geographic place, or across gender or ethnicity. These characteristics are usually collected as standard for some of those larger surveys.

The final type of reuse I'm going to go through with a case study, if I can move my slides along. There we go. So this final type of reuse, the re-study, allows you to replicate the methods of a study for purposes of comparison. It does a little bit of secondary analysis, but it also allows you scope to collect a little bit of your own data. The example of this kind of reuse project is from the collection School Leavers Study. The original study was conducted by Ray Pahl in the late 70s, as part of a much wider community study that he did on the Isle of Sheppey. As part of that project, Pahl sort of stumbled upon teachers
who had set an assessment for their students to write an essay just before they were due to leave school. It prompted them to imagine that they were reaching the end of their life and that something made them think back to the time that they left school, and they were asked to write a short essay on what had happened in their life over the next 30 to 40 years. Ray Pahl thought, that's a really interesting essay, can I have those as data? And the teachers quite happily gave them to him. I don't think that would happen today, but nevertheless he was able to collect this data set of essays.

In 2009, Graham Crow and Dawn Lyon, and you'll see Graham Crow on the left there with Ray Pahl on the right, decided that they wanted to reanalyze that data set, and they were focusing solely on student aspirations. So, using the same methodology, they conducted a re-study of school leavers on the Isle of Sheppey in 2009, and the prompt that they supplied to students in the 2009-2010 academic year was nearly the same: imagine that you're at the end of your life and reflect back on what you've done since leaving school. They then transcribed those essays and compared the themes from the new set of essays to the themes in the essays that were collected by Ray Pahl. You can see the wording of the prompt here, and there's a little snippet at the bottom of one of the essays, or at least what it looks like digitized.

The findings are fascinating. They show the difference in young people's aspirations after one generation, or roughly 30 years, had passed. But how exactly were they different? Well, in 1978 students expected much more grounded, arguably mundane, sorts of jobs. Career progression was gradual, and it followed on from a lot of hard work. Sometimes there was talk of periods of unemployment or being on the dole, or even death. You can see a few examples in the left column there of quotations from those essays, such as the one at the bottom: I
longed for something exciting and challenging, but yet again I had to settle for second best. I began working in a large clothes factory.

The later essays, however, showed that students were imagining well-paid and instantaneous jobs. They were filled with choice but also filled with uncertainty. Crow and his research team also noted a clear influence of celebrity culture in those essays. So for example, you can see the quote at the bottom from a girl who writes, "In my future I want to become either a dance teacher, a hairdresser, a professional show jumper, a horse rider. If I do become a dancer, my dream would be to dance for Beyonce or someone really famous."

So this study was larger than what might be realistic for a dissertation project. The goal was to engage the community alongside the research and find innovative ways of including participants in research outputs. As part of that initiative they published the Living and Working on Sheppey website, which helps to create a shared history and memory among the community of what living on the Isle of Sheppey means. While this would be an ambitious project, to say the least, for a dissertation, it is nevertheless a good example of how you can combine a bit of data collection and data reuse into one project. For those of you who might be doing some PhD work, or thinking about PhD work in the future, that's certainly something that you could consider for one of your projects. For others, you can certainly design a much more feasible study with a smaller sample and smaller outputs. Perhaps you're just going out and collecting one or two interviews, or something like that, alongside the data that you're reusing.

So hopefully you're now brimming with ideas of what you might want to look for in the archives, or what kind of project you might want to do reusing data. Since you're not collecting data yourself, you'll find that reuse projects have comparatively few ethical considerations, and hopefully you wouldn't hit too many snags with
ethical review boards. However, that doesn't mean that there aren't any ethical considerations. So there are two key points that I want to make before diving into qualitative and quantitative data.

The first of these has to do with the access point: how do you get permission to use the data? If you're reusing data from an established archive like the UK Data Service, we've taken a lot of the pain out of access by negotiating a license with the person who collected the data. So this usually just means that you need to sign what's called the End User License. This is a legal document of our terms and conditions, and it commits you to two really important things. One is no sharing of the data onward, and that includes with any supervisors. So if you need help with your analysis and your supervisor needs to see the data, then he or she will need to register and download the data themselves. The End User License stipulates that you cannot under any circumstance share your data or your login with anyone.

The second point is that all of the data that we have would be anonymized, and again that's likely going to be the case if you're reusing data from an archive. However, just because it's been anonymized doesn't mean that it's completely impossible to figure out the identities of participants. The Information Commissioner's Office uses the term "effectively anonymous", meaning that we are able to share the data knowing that the risk of re-identification is quite minimal and the risk to participants is quite low. So it's just acknowledging that there's never a 100% effective way of anonymizing data. Consequently, should you inadvertently uncover any identities of the participants, the End User License stipulates that you won't reveal that identity to anyone. To be honest, it's unlikely that it would happen anyway; it's just a point to acknowledge that technically
it's not impossible that this could happen. So those are the two key issues to recognize when reusing data and when signing our End User License.

Once you have sorted out access, the other point that I would want to make is that you need to ensure that you're citing the data. In short, citing archived data helps data creators track the impact of their study. It also supports reproducibility, and it makes it easier to find the data that you used for your project. The issue is so important that the UK Data Service has a page on our website which goes into a little bit more detail about data citation and will help explain why this is an important ethical issue. With the UK Data Service, we've tried to make it easy by supplying you with the citation that you need for your reference list on our catalog pages. You would see that citation under the heading "Citation and copyright", and if you expand that box you would see this kind of pop-up. On the catalog pages you can set your citation format, and then you can literally just copy and paste that citation into your reference list.

So now that you've got access and you've sorted out the citation, now comes actually doing secondary analysis. First I'll walk you through qualitative data and a couple of the key issues with getting started with qualitative data, and then I'm going to pass over to Allie, who will talk about quantitative data. So first I'll talk about orienting yourself to the collection, then I'll talk about recontextualizing the data, and finally I've just got a small point about sampling as well.

When you first download a qualitative data set, you'll get a zipped folder which looks a bit like this. You've got some folders, and all of them are going to be stuffed with files. Most qualitative data is held as RTFs, so to find the data you need to go to the rtf folder. This is the format of a word processing document. So this folder, once it's opened, here it is, will look like this
and there you are, that's all of the data, nicely organized. If you click one of those files, you'd be opening up what looks a bit like this, and this is a snippet of what the school leavers' essays look like in their entirety. The rtf folder that I opened that up from has over 100 of those files. But those files don't just have to be essays or interviews. You might, for example, have PDFs of handwritten notes, like in the upper left corner, or there might be ethnographic notes, like the one on the right. Some collections might also have some images or videos, like you see in the lower left-hand corner. Most likely, though, you'll end up opening up an interview transcript like the one that's seen here. It should have some clear turn-taking and speaker tags so you know who is speaking. There are a lot of different data types available, so make sure you have a good look through the collection first and see what's actually in there.

The next thing that you need to do is actually orient yourself to the project. I think the main point here is to not underestimate the amount of time that it would take to get acquainted with the data set. There may be multiple levels of context that you have to get through in order to really understand the data. What I mean by that is that you may have more than just the data that's collected at the time of the interview, or whatever kind of data collection it is; you might also need to consider basic social characteristics of the participants, the historical time period in which the data was collected, and where the data was collected. So really the idea is that you need to understand the data set as a whole in order to really get at the root of what the data can convey.

Every collection archived at the UK Data Service also has some documentation that's provided with the data set, and that will be a really useful starting point. It often contains more information about the methodology, such as an interview schedule or a call for participants, or
sometimes it includes segments from publications arising out of the original study, or it might even have something like the funding application. For qualitative data sets, this documentation is called the user guide. So here's an example of a user guide. This one happens to have an interview guide for interviewers as well as a blank consent form, there's a sample profile, and so on. It's just further background information to help you understand how the data was collected.

But what if you want to know more about the participants themselves? Every qualitative study should also have what's called a data listing, and here's an example of one of these. It's a table that gives a brief overview of the data in the collection. Each row represents a piece of data, or even a participant, and each column represents some sort of characteristic or attribute of that interview. So it's a quick way of getting to know who took part in the study.

In addition to the context of the data, you may also need to consider the sample. For example, if the data set is too large you may need to take a subsample. Now, qualitative collections tend to be smaller anyway, but many of the archived data sets are funded projects, and they can collect a considerable amount of data. So for a small dissertation project you'll want to be realistic and decide if you need to limit the number of participants to a smaller subsample of the larger collection. For example, the Edwardians collection, which was put together by Paul Thompson and is widely considered to be the first oral history of Britain, contains about 453 interviews of 80-plus pages each. Conversely, most dissertation projects probably have an expectation of, I'd say, maybe around six to ten interviews. So you would need a clear sampling strategy to help you choose which of those interviews you want to look at. Or you might be interested in a particular subgroup of the population, so again you'd want to think about what criteria you're looking for. You might also want
to combine data sets, so there may be data from different collections that complement each other. Now remember, it would take a lot of time to sift through and find the different pieces of data sets to pull together, but this is another possibility if you feel like you've found data across different collections that all speak to the same topic and research question that you want to answer. You would need to put in a little bit of work recontextualizing the data and making sure that those interviews work together as a new collection. But yeah, that's the final point that I wanted to make about qualitative studies, so I'm going to hand over to Allie now, and she's going to talk about key methodological challenges of quantitative data.

Thanks, Maureen, that's great. So as Maureen said, I'm going to give a quick rundown of the key methodological things you need to consider when doing a dissertation project reusing quantitative data. I'm going to cover the key things you need to know and consider in two main areas. The first of these is when you are selecting your data, and the second is when you are getting to grips with and understanding your data.

So, selecting your data. As I said, some of you, and it looks like quite a lot of you, might already have a clear idea of the topic area that you want to look at or explore for your dissertation. But if not, I'm just going to run through some of the data we have available and some of the topics that we hold quantitative data on. Just to give you an idea, there's data available that could allow you to look at the environment, so I know we had someone in the chat interested in climate; workforce patterns; healthcare; family spending; attitudes to the police and the justice system, and I know someone popped in the word cloud that they were interested in the police; time use during the pandemic, so how people spent their time; and attitudes to
all different kinds of social topics and political opinions. And these are just some examples.

So why have I started with topic? This is because, as Maureen said earlier, you'd have to know the data sets really, really well to find the perfect one to answer your question. So instead we can think about this as a process of refining our project and questions as we explore the data. Let's say you now have a general idea of your topic area and you're starting to think about the data you might use to explore it. A good place to start is by trying to think about what you want to measure. This is really key for a quantitative reuse project: you need to know the key concepts you want to measure so that you can relate these to variables within a data set. So for example, let's say we were interested in looking at the relationship between fear of crime and age. Our key concepts here are fear of crime and age, so we need to find some data which has variables that measure these concepts and allow us to formulate and answer a research question about them. Because, as Maureen said earlier, it's hard to find the perfect data set, it's easier to start from this broader topic and then derive your key concepts and questions from the existing or available variables. But that's not to say that if you do already have a question in mind, that's bad. Maybe you have something in mind from a previous research proposal or a discussion with your supervisor, and you can always search for data in line with this. Just keep in mind that you might have to be a bit flexible and revise it based on the data that's available.

If you're looking for data on your key topics, there are a few places to start within the UK Data Service. You can use our theme pages to look for data sets on a particular theme, you can type keywords into the data catalog, or you could use our variable and question bank. This allows you to search for particular variables within data sets, but just please be aware that it's not
completely comprehensive of all of our data. So if you do have a query, or you think something's missing, do get in contact with our help desk, like Maureen mentioned, and we'll be able to help you with that. There's also the HASSET thesaurus, which allows you to search through key sociological concepts, so things like discrimination, crime, pregnancy, and maternity care, which I know someone was interested in. You can search for these key concepts and then it'll direct you to data sets that have been tagged with them. The links to all of these can be found on the find data section of the website, and we will have a practice in the practical session too.

Once you've found a data set that you think might be suitable, you'll need to consult the catalogue page. Like Maureen said earlier, all of our data sets have catalogue pages, and it's the same for quantitative as it is for qualitative. This will give you an overview of the key topics, the background and a brief overview of the methodology. You can also access the documentation from here, including any user guides, technical reports and, really importantly, the list of the variables included in the data set, and any notes that might have been added by the data producers outlining any changes that have been made since the data was originally deposited. Sometimes there are edits to weighting, or particular things are changed afterwards, so make sure you have a look for those so you know which version of the data you're using.

Importantly, as I said, the list of variables can usually be found in the documentation. The file which contains the variables can vary depending on which data set you select, but it can usually be found in either the user guide, a specific variable list document, or a code book or data dictionary. Here you can find information on what the variables measure and who they apply to, which is something I'll discuss in a minute. So for our example, if we wanted data on crime, we might
look at the Crime Survey for England and Wales. This is a really large, important survey, and it looks like it might cover our key topics. The Crime Survey is a really important source of information about crime, as it provides statistics that are independent from police records. It's an example of a repeated cross-sectional survey: it's conducted every year, and there are 35,000 individuals aged 16 plus and 3,000 aged 10 to 15. Yeah, so that's an overview of the Crime Survey; I thought I had another bullet point there, but evidently I don't.
And so back to our variables. Having a look in the codebook, I can see that the Crime Survey for England and Wales has two variables that might measure our key concepts. The first of these is quality, which asks, "How much is your own quality of life affected by fear of crime, on a scale of 1 to 10, where 1 is no effect and 10 is a total effect?", and the second is age, which measures the respondent's age. So we've found these variables, but our next step is to think really carefully about whether they're suitable and, importantly, to think critically about what they measure. So look at our quality of life variable: does this really measure fear of crime, or, as it says, does it measure how much fear of crime affects quality of life? With all these variables, this is something you need to think about: what is the question in the survey really asking? To understand this, you can look at the original questionnaire, which is also usually found in the documentation.
As well as considering your variables and your concepts, you might want to consider the kind of analysis you want to do, or that you can do, with the data. So, for example, if you're interested in looking at individuals at a particular time point, you might want to use a cross-sectional survey, or if you want to look at the same individuals over time, you'll use longitudinal data. If you are interested in small geographic areas, so that's data available at a very small local geographical area, you
might want to look at census aggregate data or flow data, and if you want to compare countries over time, have a look at our international time series data. It's also important to think about your population, that's the group that you want to measure. So, for example, this might be the population of the world, the UK, or perhaps a particular city or local authority area. As well as this, you're going to want to think about your unit of analysis: are you interested in individual people, or maybe in households? This will affect the data you use, as some data sets are only available for particular geographies or for certain units of measurement. That's all to do with the level at which we make the data available, that kind of anonymization and disclosiveness that Maureen was talking about earlier, and if you're interested you can read more about that on the access pages of the website. Finally, it's important to remember that this process isn't linear. You might need to go back and forth and realign your question with the available data, or compromise if a perfect data set isn't available, and this is all part of doing secondary analysis.
So now I'll run through super quick how to understand your data. As we said, the documentation usually contains information on the variables, but it should also have information on the questionnaire that was used to collect the data. As I said earlier, to understand secondary data it's really important to understand this process, and something that's particularly important is the routing, that is, who was asked which questions. This is because many questionnaires use something called computer-aided interviewing, which sends the respondent through the questionnaire by different routes depending on their previous answers. So, for example, if the survey has a particular set of questions that only apply to people aged 65 plus, and the respondent answers that they're 40, they won't be directed to that section of
questions. Therefore many questions may only apply to some of the sample, so it's always a good idea to check the documentation and see who was asked about the variables that you're interested in. So this is an example of a variable called FLEX10 from the Labour Force Survey, which looks at special working arrangements. You can see the exact wording of the question along with the range of answers and how they've been coded, and this is an example of what this might look like in the documentation. But when we come to look at our routing and our computer-aided interviewing, underneath this we can see that it says the question is asked if the respondent was in work in the reference week, and it also gives information on how they identified who was in work from their responses on other variables. You will find that the documentation can look a bit different across different data sets, but this gives you a general idea of the kind of thing to expect.
You can also find information on how the data has been processed after it was collected. For example, we have derived variables, which are created from the raw data, and here is an example of one. This shows how the FLEX10 variable, which is what we were looking at on the previous slide, has been used to derive a new variable called FLEXW7, which basically just measures whether someone has a zero-hours contract or not. So this variable is derived by taking into account the responses from the original variable: all those who responded 7 on the FLEX10 variable, which, if I go back, you can see was a measure of whether they had a zero-hours contract or not, have been coded as 1 on the FLEXW7 variable, and therefore this new derived variable indicates whether someone has a zero-hours contract or not. Not all documentation will contain these diagrams, and in some surveys it will just be the syntax or the information that's written at the bottom to show you how these variables have been derived. I know this can look a bit
confusing, but I promise, like Maureen said, the more you get to know the data set, the more it starts making sense to you, and you get used to the names of the variables and the kind of process that's used. But it is really important that you understand the origin of any variables you're using, so if you're really struggling, do get in contact with our help desk and we'll be able to go through it with you.
You also need to think about sampling. Surveys and similar quantitative data sources are almost always based on samples, and one important question you need to ask about your data is: is the sample representative? So you need to know who's included in the sample, what the response rate was, and whether there is any information about differential response across the population, as this will tell you about potential bias. You also need to find out if you need to use a survey weight in order to make the data representative, and again you'll find all this information in the documentation. A second and final question you need to ask is whether you have enough cases to make a precise estimate. For example, the Crime Survey for England and Wales has a large sample size, which should allow for precise estimates. However, with smaller samples, or if you're analyzing a particular subpopulation, you might have insufficient cases in the sample to make precise estimates. So I've just given a quick overview there of sampling considerations. We do have a number of resources on the UK Data Service website, including some guides, so do have a look there as well.
So just to summarize: you need to think about your key concepts, what you're trying to measure, and relate these to the variables in the data set; always check the catalogue record to help you understand your data; and make sure you're considering sampling. So that's the end of my section on quantitative data. Again, just to remind you, if you have any questions please do pop these in the Q&A box. If anyone's having issues with this, do let
us know in the chat. But before we move on to the practical, I'm just going to highlight some of our dissertation resources that you might find useful as well. If you go to our website and head to Home, the Learning Hub, and then to the little box called Students, which is just at the top on the right-hand side, you can find further information on what data are available, more detail on how you can find and access data, the UK Data Service dissertation award, and further resources to help you think through your data. You may also want to explore our Learning Hub, which teaches you the basics about accessing and finding data and about all the different kinds of data types that we hold. We also have our data skills modules, which are interactive training modules designed for anyone who wants to start using secondary data. So we've got ones on starting with survey data, longitudinal data and aggregate data, and we've also got one on using crime surveys and analyzing these in R, so do have a look at those if you're interested too. Finally, if you want to follow us for any more updates, follow our hashtag #UKDSdissertations on X, formerly Twitter; we post all the latest updates and resources there as well.
And then I've put together just a few further resources as well. So for anyone who is launching themselves into secondary analysis, there are lots of book chapters and short written guides that can help you along the way. Timescapes has a series of these which is particularly useful. Timescapes is a longitudinal qualitative archive, but they've got some really useful short guides that are very accessible. There are also a few different books: Using Secondary Data in Educational and Social Research has a book chapter by Jane Heaton which is really useful, and if you're doing qualitative secondary analysis, Kahryn Hughes and Anna Tarrant a couple of years ago published the secondary analysis of qualitative data kind of handbook, if you will. We've also
got, as Allie has mentioned before, things like the data skills modules and lots of video tutorials, and we also have lots of tools and templates. So if you are looking to gather your own interviews, for example, or make your own survey, we've got lots of templates that you can use to help you write a really good consent form or information sheet, with lots of text that you can use and reuse as needed, so please check those out as well.
There's just some... oh sorry, yeah, there's just some of our next questions. We said find us on Twitter, and do have a look at our YouTube channel; the recording of this session will be available there afterwards. And again, #UKDSdissertations on... I should update this, it's not Twitter anymore, is it? X. We do have a recording that'll be available on YouTube once it's ready, so give it a little bit of time and check back on our YouTube channel. You can also view past events on our YouTube channel as well.