So hello everyone, my name is Maureen Haker, and I'm going to be one of the hosts today for the Getting Started with Secondary Data Analysis online workshop. I've worked with the UK Data Service for over a decade now, specializing in qualitative data. So I've done everything from ingest and curation of qualitative data to reuse projects and helping others out with data management of qualitative data. And I run a lot of the training sessions related to qualitative data. Nigel, did you want to introduce yourself?
Yeah, so I'm Nigel de Noronha. I've been with the Data Service two and a half years, working primarily with quantitative data, in particular around census data and secondary survey data. Okay, so we've split this into two parts, and the aim is to take questions at the end of the whole session. So the first part is about using secondary quantitative data. And there's some overlap between the two, so some of the things I cover are also relevant for the qualitative data. So first of all, what is secondary data? How do we reuse it? And let's look at some examples of the kind of data, key issues about reusing it, and what resources are available. And we will be running a couple of practical activities to enable you to try it out. And at the end, as I said, we'll go over a Q and A. First of all, just thinking to yourself, have you reused existing data before? If you have, then you will have some familiarity and maybe this will add something to your knowledge. And if you haven't, this is a good place to start as an introduction. Just thinking about what is secondary data. In terms of quantitative data, we have a number of major data collectors. So the Office for National Statistics collects a range of surveys and census data. There's a team at Essex who run Understanding Society, which we will have a brief look at in one of the exercises, and which is a major longitudinal study that's run yearly.
And the one we're looking at was actually a data set created for teaching purposes from the COVID wave of their data collection. And there's NatCen, who carry out social research on things like the British Social Attitudes Survey. So when we look at what they collect, they then deposit it with us and that is available for secondary research. So they do some of their own primary research with it, but the main aim of those surveys is to enable researchers across the UK and more broadly to explore the topics they're interested in. So the purpose that people use it for may not be the purpose that it was originally collected for. So what's good about having data available is that some of these data sets would be impossible to create yourself. So the sample size of the data sets is really large. They're quite expensive things to do and you wouldn't be able to collect this amount of data in most research projects. So they're cost effective because the cost has already been paid, so you have free access to them. All the ethical issues about data collection have been dealt with and there is no need to recontact the data subjects because they've all given permission for their data to be used, and that data can then be used to make the claims that you want to for your research. But there are some downsides that you need to address and think about. So first of all, you don't really have the insider understanding of the data and data collection. Some of that will be reflected in the documentation, probably of varying quality across the different surveys, but there is generally a lot of documentation about what happened with the data and data collection. So for example, in Understanding Society, they do an ethnic minority boost to improve the collection. You can read about how that's done, understand it and have some kind of shared understanding with the data collectors of what was done. Because it's not your data and you haven't designed it, you've got some work to do to get to know it.
And in terms of the way you use it, you will still have some ethical issues to think about. So are the claims you're making likely to cause potential harms, for example? And the fit may not exactly match your research question. So you wouldn't necessarily be able to answer all of the things you want with the secondary data source. And there isn't the potential to extend those studies. So for you, I think the key thing is, what does this data mean? How do I understand it? And making a judgment, a pragmatic judgment, about whether the data are good enough for your research purposes. And what I'm gonna just make us think about here is the kind of messiness of research. So the ideal model is we start off with a research question, we find the data, we evaluate it and we analyze it. But actually it's a lot more messy than that. The process is quite iterative. So we might get a research question, find some data, check it, decide it's not good enough and go back and maybe think about changing the research question. We may evaluate the data and again, go back for more data. So the whole process is quite an iterative one, which is a necessary part of research anyway. If you're collecting primary data, you have the same. But with secondary data, you may well be looking for other sources as well. The starting point is how do we access data? How do we find data? When you go onto the UK Data Service website, you go straight onto a page to find data. There are also supporting theme pages. There are a series of webinars that we hold around different data sources so you can look back. And there is more material being added and more videos about different types of data. Then you look at how does it fit with my research? You need to understand what was collected, who from, when and where, and any changes made to that raw data before being archived. And we'll think a bit more about that.
But in all of the main surveys, you will have quite comprehensive user guides, the original questionnaires, the interview schedules, plus a variety of other resources, you know, the sampling strategies, et cetera. For some of the studies, you will also be able to see the research that people have done with it. The first practical activity is to look at documentation. Let's see what data is available. Let's have a go. What sorts of documentation are available? That's what I'm going to cover. So I'm going to switch screens and take you into the UK Data Service site and just show you. So I'm just going to share the front screen there. I'll stop this share. So this is the front page you would come into. So the first thing I'm going to do is to log in so that I can see documentation. So if you're in a university setting, you'll be logged in through your institution. If you're not, then you will have your own personal login. If you have any queries about logins, we can answer those later on, I'll point you to them. So now I'm going to go back to that front screen and find some data. And I'm going to show you a relatively new survey that's been deposited by the Centre on the Dynamics of Ethnicity, the EVENS survey, the Evidence for Equality National Survey. When I search for that, I get the one I want, Evidence for Equality National Survey. And here on the front page, we get the title, the authors, the citation information, the sponsors and contributors, a summary of the topics and an abstract. There's quite a lot of information that's actually been deposited by the team who produced this. And then we get the coverage and methodology. And here we can see when the fieldwork was conducted, the coverage in terms of geography, the observational units, and some information about the population and the method. This is a kind of unusual survey in that it used a non-probabilistic method. So there's a lot of information about the weighting in here.
So when we go into the documentation tab, we can see the code book, the user guide, the technical report, citation information and the data dictionary. And in this study, we have access to the book that was produced. So this book is actually open access electronically. So you can go and look at the findings that the authors produced from this. So that's the kind of documentation that's there. As an example, can you see the slides there? Yeah, you're good, I can see them. Okay, so that's a kind of quick tour through that. We're gonna give you the opportunity to have a look at that for yourself in a bit. But let's first of all think about some of the things we might be looking for. So what's the type of analysis? So for a cross-sectional survey, we might have individuals, families, households or businesses at one point in time. There are a number of those, so the EVENS survey I just showed you is a cross-sectional survey that won't be repeated, under the current funding arrangements anyway. Some established surveys like the British Social Attitudes Survey are repeated cross sections, as is the English Housing Survey. So they take a random sample each time, on an annual basis in these cases, of individuals, families, households or businesses and look at those on a yearly basis. So each year the British Social Attitudes Survey is published. They also publish a report alongside it, and similarly with the English Housing Survey. We also have longitudinal data where the same people, individuals, families or households, are asked questions over a period of time. So Understanding Society is an example of a longitudinal data set. We have some other data. So we have the kind of data from the census which we'll talk about if people are interested in but we have covered that. And we also hold country data from sources like the World Bank over time. So these are based on statistical returns from countries across the world and are available for you to use and download.
An example here is the crime survey. So this is a repeated cross-sectional survey. It's seen as an important source of information about crime. So we have administrative records from police recording and those are available, but what this gives is people's perceptions of crime and experiences of crime which don't necessarily make their way onto police records. So it's an annual survey of around 35,000 individuals, adults over 16, and 3,000 aged 10 to 15. It identifies those who've been victims of crime in the previous 12 months and has various demographic characteristics as well as attitudes to the police and the criminal justice system. So data is stored as individual anonymized records and there are different levels of access. So our standard access is called safeguarded, and most of our data sets are held as safeguarded data which you can access. There are some open ones which largely have been provided for teaching purposes. There's also secure access, which requires you to go through a training and accreditation process and has particular access restrictions about where you can use it. So you need to use it in a safe environment. And this is an example of the way the data is stored. So here we have a label at the beginning and then the variables, and we've put in text descriptors in this example. So looking at what people have done with the crime survey, this study by Khalifeh et al. looked at violence against people with disabilities in England and Wales. It used the crime survey from 2009-10, brought in disability measures and used the special license version of the data. So it looked at 46,000 adults with 9,000 having at least one limiting disability. And having looked at that and adjusted for various characteristics, they found that disabled people were more likely to experience violence, and that the levels of victimization were highest amongst those with mental health problems.
Because this was a sample, a random sample drawn from the population, they can use the weighting to infer that 116,000 victims of violence were attributable to their disabilities. So it's really a significant piece of research that probably changed the direction of policing to some extent and the way we think about policing and violence. So just thinking about sampling, which I mentioned there, surveys are based on samples. So is the sample representative? So one of the classic groups excluded from many surveys is people who don't live in their own household. A lot of sampling is done from the Postcode Address File. That uses private residential addresses. So people who are living in institutions don't tend to be surveyed. So if you're interested in prisoners in His Majesty's prisons, we wouldn't have data. Our general surveys would not include anybody from that kind of population. There might be specific data sets that would. We have to think about response rates and bias. So who is more likely to complete this? Are we likely to get overrepresentation of certain groups? And for most surveys, you will find there is a weighting mechanism to make the data representative. So typically what a survey designer would do would be to take the population of the country from a source like the census, then see how their responses compared to that census for their population of interest, and apply weights to make the data more representative. That kind of takes you down the line to ask, do I have enough cases? So it depends on what characteristics you're looking at, but it's very important for small subpopulations. So ethnicity data in surveys has been problematic for many of them. So I mentioned Understanding Society, which used a boost method to enhance the sample size so that they could make more precise estimates about individual ethnic groups. For some other surveys, the estimates are very broad and the confidence intervals are very wide.
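To make the weighting idea above concrete, here is a minimal sketch of one simple approach, post-stratification on a single characteristic. It is an illustration only: the age bands, the sample and the population shares are made-up values, and the function name `post_stratification_weights` is hypothetical. Real surveys document their own, usually more elaborate, weighting schemes in their user guides.

```python
from collections import Counter


def post_stratification_weights(sample_groups, population_shares):
    """Return one weight per respondent so weighted group shares match the population.

    sample_groups: list with the group label of each respondent (hypothetical data).
    population_shares: dict mapping group label to its share of the population,
    for example taken from census tables (values here are invented).
    """
    n = len(sample_groups)
    counts = Counter(sample_groups)
    # weight = population share / sample share for the respondent's group
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


# Hypothetical sample of 10 respondents in which younger people are under-represented.
sample = ["16-34"] * 2 + ["35-64"] * 5 + ["65+"] * 3
# Hypothetical population shares for the same age bands.
population = {"16-34": 0.3, "35-64": 0.5, "65+": 0.2}

weights = post_stratification_weights(sample, population)

# After weighting, each group's share of the total weight equals its population share.
weighted_share = {
    g: sum(w for w, s in zip(weights, sample) if s == g) / sum(weights)
    for g in population
}
```

Under-represented groups get weights above 1 and over-represented groups get weights below 1. Weights cannot create information, though: a subgroup with very few respondents will still give imprecise estimates however it is weighted.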
So if you're interested in particular subpopulations, you'll need to make sure you've got the coverage to be able to make the kind of estimates you would like to make. As good practice, always cite the data. We provide citations for every data set record. You can use the citation tool to copy and paste the correct citation into any work you produce. So here is an example from the British Social Attitudes Survey. That's available on the front page, as I showed you on finding data. And then we have a kind of question about who was asked what. So the British Social Attitudes Survey is probably a good example of this. Because we use computer-aided interviewing techniques in quite a lot of our surveys now, they make it easy to send respondents through the questionnaire by different routes. So one of the ways the British Social Attitudes Survey boosts the number of questions they can ask is by only asking questions of a subsection of the sample around each specialist topic they're looking at. So typically if you look at one of their surveys, you might find four different topic areas, each covered by a quarter or a half of the respondents. And here we have an example of the kind of question. So this is taken, I'm not sure where from actually. I can't see the citation. It's the Labour Force Survey, isn't it? Is it the Labour Force Survey? Okay. I think so. So this is asking about different working hours arrangements and it gives you a series of options to say what your working arrangements are. And it applies to those people who meet the conditions at the bottom. And having collected that kind of raw data, we can then derive variables from it. So here we can see the logic to decide whether somebody has a zero hours contract or not. So it excludes those who have contracts that aren't zero hours and focuses in on those that might do. So you can then get an indicator to say whether somebody has a zero hours contract or not.
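As a rough illustration of how a derived variable like this can be built from raw answers, here is a minimal sketch. The field names (`in_employment`, `arrangements`), the code `"zero_hours"` and the routing rule are all assumptions made for the example; they are not the actual variable names or derivation logic of any survey, which you would take from its documentation.

```python
def derive_zero_hours(in_employment, arrangements):
    """Derive a zero-hours indicator from raw answers (hypothetical coding).

    Returns None when the question did not apply (respondent was routed past it),
    1 when a zero-hours contract was reported, and 0 otherwise.
    """
    if not in_employment:
        return None  # not asked, so the derived value is "not applicable"
    return 1 if "zero_hours" in arrangements else 0


# Hypothetical raw records: a multi-response question on working arrangements.
respondents = [
    {"id": 1, "in_employment": True, "arrangements": ["flexitime"]},
    {"id": 2, "in_employment": True, "arrangements": ["zero_hours", "shift_work"]},
    {"id": 3, "in_employment": False, "arrangements": []},
]

derived = {
    r["id"]: derive_zero_hours(r["in_employment"], r["arrangements"])
    for r in respondents
}
```

Keeping "not applicable" separate from 0 matters when you analyze the result: people who were never asked the question should not be counted as not having a zero hours contract.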
Census data: nearly all of the census data is out for England and Wales, it's coming out for Northern Ireland, and it will begin to be released for Scotland. I think the sex and age population tables are out for Scotland, but the topic data will be available from next year. So within this, we have different types of data, and they're all tied to a particular census geography. So the aggregate data has univariate and multivariate defined tables. All three census authorities are building what is called a flexible table builder. In effect, you can define the variables you want to look at and you will get a table created for you, with dynamic suppression for statistical disclosure reasons. So that set of data is probably the most commonly used. It's used both in the academy but also in policy, the voluntary sector, public services, et cetera, to understand the population and the characteristics of that population. To support the mapping of that data, we also provide boundary data for census geographies. So that includes the administrative geographies of local authorities, regions and wards, and a statistical geography, which is more stable over time and was developed in the 2001 census for England and Wales. We've recently received the microdata, which is a sample of individual census returns. So there are two individual files, both 5% samples, so around 3 million records. One covers a combined local authority geography, of which there are around 270, where smaller local authorities are combined with neighboring ones, and the other a regional geography. There's also a secure data set which has a 10% sample of individuals. There's a 1% sample of households which is safeguarded and a 10% sample of households which is again a secure data set. So those are all now available from the Data Service. The safeguarded data is all available, the aggregate data is open and the boundary data is open.
And then we have flow data, which we are redesigning the interface for, but flow data in effect has an origin point and a destination point for commuting, for migration, for student term-time and home addresses, and for people with second addresses where they stay more than 30 days. So we're still processing that and that is likely to be available towards the end of this year or very early next year. We will be providing some training on that as well once the data is available and the interface is defined. So just to give an example here, this is a custom data set I developed looking at housing deprivation by ethnicity and year of arrival in the UK. And this is for London. And what it shows you is levels of housing deprivation by when people came to the UK or whether they were born in the UK. Okay, I will come back to that question later on, I think. So I should have actually asked you to do an activity after I did the find. So can I just...
Do you wanna go back to it now and we'll do that quick activity, and then we can switch over to qualitative data, and that might actually start to cover a little bit around ethical considerations. So we've had a question come in around ethical considerations for secondary analysis, so that might start to address that, but why don't you do the activity and then I'll take over.
So apologies for that. There is a worksheet called catalog and documentation tab. So this gives you the opportunity to have a go at it. Is there a link to it, Jill, to put in the...
It will be, yeah, I think Jill's gonna pop the link in the chat in just a second, so if you...
Yeah, there is. So it's a two-part thing. So the first part looks at the Understanding Society teaching data set and asks you to answer some questions. And the second part looks at a mixed methods study, so a qualitative study.
So if we take a few minutes to look at those, we'll come back and help you with the answers. So just to go over the answers to the first set of questions, the observation unit for this is both individuals and households. So the answer is C. The data covers the UK, and there are a number of topics in there. So a range of topics that you could be looking at. I actually used it for some work I did looking at precarity and debt among different ethnic groups in the population during COVID, which did reveal some interesting pictures of different levels of precarity among different ethnic groups. Possibly not a surprise to those who study that. It was worth using that data to be able to produce that kind of result and see what was happening during COVID. So I think we've probably had enough time to look at these. Again, part two is in the coverage and methodology section. So this study was conducted in England and you get a set of transcripts of the interviews in text format there. So hopefully that was useful, seeing the documentation. As I say, a key part of getting to grips with secondary data generally is understanding where it comes from, how it's collected, et cetera. So what I'm gonna do is move forward to the qualitative section.
All right, so I'm gonna talk about very similar things to Nigel, but for qualitative data. So first I'm gonna go through a couple of different types of qualitative data reuse projects and give you some ideas of the sorts of things you can do. Then I'll walk you through a case study of one of those types of reuse. Then I'll do a quick overview on getting started reusing qualitative data, including a couple of key issues that arise when trying to reuse data. And it might start to touch then on some of those ethical considerations. And then I'll show you some ways of finding qualitative data.
Just to say, qualitative data reuse has become more common in recent years, and the UK Data Service certainly is offering more qualitative data sets in a much more accessible way than it has been able to before. But it's still one of those niche areas. It used to be that you would need to actually come into the archive and sift through boxes of paper in order to reuse qualitative data. But now it's much more downloadable. You can search through the data itself. So it's much more accessible, basically, than those paper-based collections of old used to be. So it is still a slowly emerging area, but I'd say it's definitely much more common than it used to be. So there are many different ways you can reuse qualitative data. You can simply give a description or an understanding of a particular social and historical point in time. And this is useful because you can see more of the data than just what the publications reveal. You might not be able to see all of the data, depending on what's available in the archive, but you can certainly see more data than what the original authors of publications thought was salient for their projects. And this is useful because you won't be limited by what's been published. You can actually explore it a little bit further and see what might be of interest or of relevance to your own research questions. Another way to reuse data is to consider analyzing the methods used and then look at what lessons might be gleaned about the most effective ways of, for example, sampling or data collection or even developing topic guides. So one particularly valuable use might be to look at, for example, how an interview was planned, how it was laid out before the interview was conducted, so what questions the interviewers thought they were going to ask, and then look at what was actually talked about in the interviews.
And there might be many reasons why certain questions are not asked in interviews, but it's certainly one of those skills that as a researcher you aim to develop. Some interview schedules are designed to be more flexible. Sometimes tangents come up and you just want to interrogate them further, but in any case, it's a particularly important skill to have that intuition. And you can't really see that unless you start analyzing what kind of methods were used, how they were used, and what decisions were made along the way. And you can pick up on any of those points in the research process and start exploring that further. So you can do reuse projects for methodological advancement. Another type is called reanalysis, and that just looks at the wide range of approaches you can take in the analysis of the dataset. So it usually means asking a different research question from what the original researchers were trying to answer. So for example, Clive Seale and Jonathan Charteris-Black did a study using comparative keyword analysis of illness narratives. Now the original illness narratives were looked at exclusively for health research. So the original researchers were particularly interested in how diagnoses were made. But when Seale and Charteris-Black came along to do the comparative keyword analysis, they were much more interested in an analysis of the discussions between patients and doctors rather than the actual health issues that came up in the interviews. So the question can be very different in that kind of way, or sometimes the question can be on a similar topic to the original research but have a slightly different focus. So for example, Joanna Bornat was looking at gerontology as a topic and she found two different datasets that looked at gerontology. However, Bornat's research question was on racism, which wasn't the focus of the original work, but the dataset was rich enough to allow her to explore racism within the medical field within that existing data.
Another type of reuse is going to be exemplified by the case study that I'll go through with you. And this case is a re-study, which is where you replicate the methods of a study for purposes of comparison. So you might be looking at historical comparison, which allows you to demonstrate how society has changed over time, or it could be geographical or class comparison, or comparison on any other kind of variable to show differences between subgroups. So this example of a reuse project is called the school leavers study. So the original study was conducted by Ray Pahl in the late seventies as part of a much wider community study on the Isle of Sheppey. The UK Data Service holds a number of collections related to the community study, but the school leavers subset specifically looks at student aspirations. So Ray Pahl, when he was conducting his study, found out that teachers were setting a particular essay just before students were due to leave school, prompting them to imagine that they were reaching the end of their life and something made them think back to the time that they left school. And they were then assigned to write a short essay on what happened in their life over the next 30 to 40 years. So Ray Pahl found out that the teachers set this essay and thought, that's really interesting, I'll have that in my data set. And in 2009, Graham Crow and Dawn Lyon, and this picture here is of Graham Crow on the left with Ray Pahl on the right, decided to reanalyze that data set and focus solely on student aspirations within those essays. And using the very same methodology, they conducted a re-study of school leavers on the Isle of Sheppey in the 2009 to 2010 year. And the prompt supplied to students during their data collection was nearly the same, or aimed to be as similar as possible. So it's, imagine that you're at the end of your life and reflect back on what's been done since leaving school.
And they then transcribed those essays and compared the themes from the new set of essays to the set of essays that were collected by Ray Pahl in the 70s. And you can see the wording of the prompt here, and there is a little snippet there of one of the transcribed essays. There was a challenge in doing a re-study of this specific type, because Ray Pahl basically stumbled upon teachers who had assigned the essay. He didn't necessarily have a lot of control over how the prompt was supplied to students. The teachers were able to share the essays with him, but he didn't necessarily control how it was presented to students or how those essays were collected, and with that first set of essays you can actually see in the original paper that they were marked. So teachers actually went through and marked them. When Graham Crow did the re-study, those essays were not marked and the research team were the ones who had control over what the essay prompt was. So Graham Crow goes into some detail about this in some of his publications and discusses how they devised the prompts based on conversations with Ray Pahl about his original study. So there is some methodological discussion there as well, but the conclusion that Graham Crow draws is that the overall picture painted by the essays as a collective still offers a valuable comparison. And the findings do show a shift in aspirations, as you might imagine. And so here are a few more details about what they received back. So a slightly different gender divide, but a similar amount of data received. Both data sets covered the same general themes of health, education, career, family and leisure, but they covered them in very different ways. So how exactly were they different? Well, in 1978, students expected much more grounded and arguably mundane sorts of jobs. Career progression was gradual and followed on from a lot of hard work.
And sometimes there was talk of periods of unemployment or being on the dole, or even death or the early death of someone they loved. So you can see a few examples in the left column of some of the quotations from those essays, such as the one on the bottom. I longed for something exciting and challenging, but yet again, I had to settle for second best. I began working in a large clothes factory. The later essays, however, showed students imagining well paid and instantaneous jobs that were filled with choice, but also with some uncertainty. Crow's research team also noted a clear influence of celebrity culture in those essays. For example, you've got the quote on the bottom from a girl who writes, in my future, I wanna become either a dance teacher, a hairdresser, or a professional show jumper, horse rider. If I do become a dancer, my dream would be to dance for Beyonce or someone really famous. The impact of this study, however, spans beyond just the interesting changes that they've noted in young people's aspirations. Their study was part of a much bigger community project on the past, present and future of the Isle of Sheppey. So the goal was to engage the community alongside the research, find innovative ways of including participants in the research outputs and involve them in the process. So as part of that initiative, they published the Living and Working on Sheppey website, which has videos and artwork that was produced by residents and children of the Isle of Sheppey, as well as ways for those who participated in the research to stay in touch with each other and read about the history of their community. So they really helped to create a shared history and memory among the community of what living on the Isle of Sheppey means. So hopefully you are thinking about the different types of projects that you might be able to do with qualitative data, but how might you go about actually finding some qualitative data?
In terms of searching for data, qualitative data poses a bit of a challenge. Interview transcripts, essays and other types of qualitative data often hold far more information than just what an abstract might say on the catalog page. So you might be missing out on a whole range of collections that could potentially touch on the topics that you're interested in, simply because nobody has the time to sit there and read all of the data for every collection. So at present we have, I think it's over 1,800 qualitative collections. So there is quite a lot out there and all of it is very rich, detailed information. So we have a tool that we've developed to help with this and it's called Qualibank. Like the data catalog, you simply type in a keyword, but instead of searching through abstracts and catalog pages like the data catalog does, Qualibank actually searches through the data itself. So when you click on the search button of the data catalog, you'll see that there's a link to Qualibank that appears right underneath the search button. You can also just type in ukdataservice.ac.uk forward slash Qualibank. And with this tool, you might be able to identify relevant interviews that are spread across different collections, or find a collection where you didn't think the theme might come up. So in this example, I've typed in typhoid and you can see that it's searched through and highlighted in the data itself where that word is mentioned. So the first couple of hits were from the Morale and Home Intelligence Reports collection. But further down, we can see there are examples from the Edwardians interviews as well. So when you click on one of those search results in Qualibank, so I clicked on one of the interviews that had come up, it brings you straight to the spot in the interview where your keyword is mentioned. And if you scroll back to the top of the page, you can also see that there are links to external resources and the collection documentation.
So if you click on one of those, as I did, there it is at the top, it brings you to the bottom of the page. And it includes things like audio extracts of transcripts, images related to the interview, or sometimes there might be some web resources. Finally, one last feature of QualiBank. If you want to cite directly from an interview transcript, you can simply click on the create citation button, which you can see in the left-hand menu here, and then highlight the portion of the interview that you are interested in. So it does it by utterance. So every time you see the interviewer ask a question and then the respondent's answer, each of those is its own utterance. So you just highlight the bit that you would want. And then you would click the retrieve citation button. So your create citation button turns into a retrieve button and you get this little pop-up. And you can copy and paste that citation into your document. It's got a persistent identifier, which is the URL that you see at the end of the citation. And that brings your readers directly to the exact paragraph you've highlighted within QualiBank. So this is basically a citation tool that introduces a new layer of transparency into qualitative work. So it helps you accurately cite the data that you're reusing, but it also allows you to add further context into some of the work that you're doing. Okay, so we've covered the different types of reuse projects you can do and how to find and access that data, but what about the process of actually analyzing the data? The first thing you would need to do, like with quantitative data, is orient yourself to the original research project. And I think the main point here is to not underestimate the amount of time that it might take to get acquainted with the data sets. So there might be multiple levels of context to get through in order to really understand that data.
And what I mean by that is that you need more than just the data that's collected at the time of the interview or data collection. You might also need to consider, for example, the metadata of the participant. So some of their key attributes: their gender, their social class, their occupation, et cetera. You might need to consider the historical time period in which the data was collected. So a lot of our data is much more recent, but some of it was collected in the 20th century, and some of it was collected in recent years and published last year. You might also need to think about where the data was collected and what some of the larger political and social context of that region is. So really, the idea is that you need to understand the data set as a whole in order to really get at the root of what the data can convey. And the documentation provided with that data set is a really useful starting point for that. It often contains more information about the methodology, how the data was collected. So it might have things like an interview schedule or a call for participants, or sometimes it includes segments from publications arising out of the original study, or I've also seen funding applications. I've also seen some studies which have sections written up by the principal investigator about particular features of the data set. So for example, Annette Lawson conducted a study in the 1980s on adultery. And that is quite a sensitive topic, quite a taboo topic, particularly for the 1980s in England. And her sample ended up becoming a primary focus for her. She had a bit of a gender and a class bias within her sample. So she wrote a 56-page document just on her sample, comparing all of the features to national numbers and concluding, quite interestingly, that some of the biases in her sample can be disregarded in a way because she was doing exploratory research.
And she asked whether or not some of those features on which it deviated from a representative sample actually mattered in terms of the topic. So it's a really interesting read. In my time working with qualitative data sets at the UK Data Service, I've also seen background contextual materials that were taken from the area of research. So for example, there are meeting minutes from local councils, government pamphlets, even correspondence and letters from participants. And all of that helps to paint a picture of what was going on around the study. And hopefully that would then be included with the documentation. You might also need to consider the sample. So for example, if the data set is too large, you may need to take a sub-sample of that data set. This may not be much of an issue with qualitative research, since they usually are smaller studies anyway, but there are some collections which had quite a lot of funding and you'll need to carefully consider what's actually feasible. So for example, take the Edwardians collection, which is sort of the founding collection of the qualitative data part of our archive. This was put together by Paul Thompson and is widely considered to be the first oral history of Britain, and it contains 453 interviews of 80-plus pages each. So that would take a considerable amount of time to read and reread and become acquainted with, and then actually do your analysis. So you may need to consider taking a sub-sample of that. Conversely, you might find interviews from different data sets that complement each other, and you'd like to combine them to make a new, larger data set that's more useful to you. So you may need to do a bit of work there around harmonizing the data sets, seeing if they're comparable to each other and then carefully choosing which of those can get added into your own sample. And finally, you'll need to think through how you will approach the data.
So you might use an inductive strategy where you start with the data and you see what comes from that, or you might use a deductive strategy where you have a firmer idea of what you're looking for in the data. Both are equally valid, but you'll need to consider your approach as you get started. So this was a very brief overview of a couple of key issues when getting started with qualitative reuse. If you're looking for more guidance or discussion on these issues, there are a few sources that I highly recommend. The first and foremost is the SAGE Handbook of Qualitative Secondary Analysis. It just recently came out, within the last 18 months or so. It's edited by Kahryn Hughes and Anna Tarrant and it's a comprehensive guide to issues around recontextualization and sampling. There's also a short single chapter out of Silverman's most recent edition of Qualitative Research. Libby Bishop wrote this chapter specifically on reusing qualitative data. So it's filled with further examples of reuse and addresses the key issues in more depth. If you have access to the book from your library, I'd definitely recommend starting with the short chapter. There's also the Timescapes Methods Guides series, which is available online. Those are short, they're just a few pages, but there's one from Sarah Irwin and Mandy Winterton which is another great guide, just a short guide to help you get started. Okay, so we do have a practical activity now. So Jill, hopefully you are able to pop that in the chat. So this one will give you a little bit of experience with what we call a download bundle. So this is where you're not just looking for data or looking through the documentation, but you've actually downloaded something. So we're gonna be using an open data set, which doesn't require you to log in in order to download it. And the activity basically just asks you to access the data, so download the bundle, and then just start exploring a little bit.
So there are a few different prompts to see if you can find a couple of pieces of documentation, and then another couple of questions so that you can actually look through the data and see what the data looks like in your download bundle. So I'll give you a few minutes with that, and then we'll come back and I'll do a quick demonstration on the screen, in case you had some questions around navigating the download bundle. Okay, just another minute or so and then I'll do a quick run-through on the screen. Okay, I'll just quickly run through a few points. So if you've followed the link, hopefully it should have taken you to the Pioneers of Social Research catalog page, and you'll see, sorry, my Zoom menu is blocking it, but there is a purple access data button on the upper right-hand side of the catalog page. It would bring you to a page that looks like this, and you can just click download here. Let me see if I need to change my share screen. I do, hold on just a moment. And it should open up a folder for you which looks like this. There we are. So you'll see that there are a few different folders in our download bundle. You've got your mrdoc folder that has all of your documentation in it. So question one was asking about finding the data list. You can find the data list in both an Excel format and a PDF format, and you can see that ulist is what it's called here, user list. And it's just an at-a-glance look, basically, at all of the participants. You can also find the interview guide here.
So sometimes all of these different pieces of documentation are bundled together in what we call a uguide, a user guide. All of your qualitative collections will definitely have it either separated out like this with clear file names, or it might have a bundle called just a uguide, and you should be able to find things like your interview guide there. Now if you go not into the mrdoc folder but, here it's called the RTF folder, this is where you'll find your data. So here we've got interview summaries and interview transcripts. So if you open up the interview transcripts, you'll see here are all your transcripts, and Stan Cohen is listed as interview five on there. So if you open that up, you'd be able to see the actual interview transcript with Stan Cohen. This particular collection also has interview summaries, which again correspond to the same numbers, so interview summary five is also gonna be Stan Cohen. I think I've asked you to look at number 20, but basically you can have different types of data and normally it's all gonna be within your data folder. So it'll either be an RTF folder or sometimes it's a PDF folder, but I think a lot of them are RTF. And the interview summary in this instance is there because some of these life histories are really quite long, so the interview summary gives you a blow-by-blow very quickly. It's usually taken excerpts out of the interview transcript and it's just a shorter case study, so that if you're looking for a bit of information you can quickly scan through the interview summary. We also sometimes receive interview summaries where they're not able to share the interview transcript itself, either because there's a concern about confidentiality, or there are some safety issues, or there weren't the right permissions in place to be able to share the full interview transcript.
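If you prefer to get an overview of a download bundle with a script rather than by clicking through folders, a minimal sketch might look like the following. It assumes the bundle has already been unzipped, and that the folder and file names (for example `mrdoc`, `rtf`, or a file name containing `ulist`) follow the layout described above; the function names `inventory_bundle` and `find_by_keyword` are hypothetical helpers written for this illustration, not part of any UK Data Service tool, and your own bundle may be organised differently.

```python
from collections import defaultdict
from pathlib import Path


def inventory_bundle(bundle_root):
    """Group every file in an unzipped bundle by its top-level folder.

    Files sitting directly in the bundle root are grouped under ".".
    Paths are returned relative to the bundle root, with "/" separators.
    """
    root = Path(bundle_root)
    inventory = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root)
            top_level = relative.parts[0] if len(relative.parts) > 1 else "."
            inventory[top_level].append(relative.as_posix())
    return dict(inventory)


def find_by_keyword(inventory, keyword):
    """Return files whose name contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        name
        for files in inventory.values()
        for name in files
        if keyword in Path(name).name.lower()
    )


if __name__ == "__main__":
    import sys

    # Pass the path to your own unzipped bundle as the first argument.
    if len(sys.argv) > 1:
        bundle = inventory_bundle(sys.argv[1])
        for folder, files in bundle.items():
            print(f"{folder}: {len(files)} file(s)")
        # "ulist" is assumed to appear in the data list's file name.
        print("Possible data lists:", find_by_keyword(bundle, "ulist"))
```

Running it against your own bundle folder would print how many files sit under each top-level folder, which is a quick way to tell the documentation apart from the data before you start reading.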
So sometimes we do receive interview summaries instead. Sharing qualitative data can be a bit tricky because you have so many indirect identifiers, so you'll see different types of formats in some of those data sets. So hopefully this activity has given you just a bit of a glance at what a download bundle looks like. If you bear with me just a moment, I'm gonna switch back to our PowerPoint. So if you do have any questions about the download bundle or navigating it, do pop those into the Q&A and we can go through those. But just to signpost at the end here, you can contact us using the usual means. So we do have a JISCMail list you can sign up for. We're on social media. Our YouTube account holds all the recordings for our online workshops. So if you're interested in some of the others, for example, we've recently done one on anonymizing quantitative and qualitative data. So you can listen back to that one if you'd like. Those are all available on our YouTube. And we do have upcoming events. So check back on our events page. The usual ones that we run, usually on a termly basis, will include different issues around data management. There is one that my colleague Hina does which is specifically on ethical and legal issues in data sharing, which some of you might be interested in. And if you're interested in getting some one-to-one help or just asking some questions, we do have drop-ins and that sort of thing as well. So do have a check on our events page to see what else is coming. We still have a few sessions coming up. So there's one next week that I'll be hosting with Anka on data documentation. And we have a few that are on specific types of data. So we'll have a spotlight on qualitative data coming up as well. I think we're just at about time. So if there are any other questions, you can always, if I can get my, there we go.
You can always pop us an email if something comes up later and we'd be happy to answer there. Just let us know how you're getting on. And if you do end up reusing our data, do let us know. So if you publish anything, let us know. We can always add it to our catalog pages so others can see what work you've done with our data sets. But I think that's it from us today. Check our events page if you're interested in any future events. And thank you all for attending. I hope you've gotten something from this. Yeah, thank you.