Hi, everyone. Welcome to the Center for Open Science's Love Data Week presentation webinar on sharing sensitive qualitative data. We're very excited to have everyone here with us today. A couple of quick housekeeping items. Feel free to use the chat to share any resources or introduce yourself if you'd like to tell us a bit about yourself, what kind of research you do, anything like that. We have a Q&A section, and that will be for questions for our presenter, Dr. Rebecca Campbell. So if you have questions during her talk, feel free to use that function, and I'll help to moderate that after the talk. We have everyone but Dr. Campbell and myself on mute right now, so feel free to use the chat as needed. This session is being recorded and will be available on our YouTube channel in a few days, and we are very excited to have everyone here. To introduce Dr. Campbell: she is a University Distinguished Professor in the ecological community psychology program at Michigan State University. Her program of research focuses on violence against women and children, with an emphasis on sexual assault. She studies sexual assault survivors' help-seeking experiences with the legal, medical, and mental health systems, and how community and campus systems address survivors' needs. She works collaboratively with social systems to conduct community-based participatory action research using quantitative and qualitative methods. I'm very excited for her presentation today. So with that, I will go off camera and give you the floor. Thank you, Dr. Campbell.

Thank you. Thank you very much. I appreciate the invitation to be here. I am Rebecca Campbell. My pronouns are she/her, and I am a professor of psychology at Michigan State University. I appreciate all of you taking some time from your schedules today to join us for our webinar on sharing sensitive qualitative data. So first off, let me say I'm new here. I'm new here in the Center for Open Science community.
I'm also kind of new in this whole open science movement thing, and that is largely due to how I was trained and the kind of work that I do. I am a community psychologist, and that means I do all of my research in community-based field settings, all of it. Most of my work is participatory action research. That means I am working collaboratively with community partners to both study a social problem and also try to figure out solutions, and evaluate those solutions, to that social problem. I do a great deal of mixed methods research with a lot of qualitative interview data collection, and my topical focus is sexual assault: how community systems respond to sexual assault survivors. So my focal population for most of my research is sexual assault survivors. In my professional circle, open science isn't really talked about as much, and it often seems like, well, that's something other areas work on. Like in psychology, we're like, oh, that's what the social-personality folks think about, folks who do laboratory research and theory-driven, hypothesis-focused research that's largely quantitative, largely survey and experimental, and with non-traumatized populations. That said, I would also describe myself as being very open-minded and curious about open science practices. I believe in transparency as a value, personally and professionally, and as a practice. It's very critical in the work that I do that I am transparent with the survivors and with the communities that I work with. So to that end, I do share my data and have shared my data consistently throughout my career. It's largely been the quantitative data that I've been sharing, but I have shared some qualitative data in the past. All that said, I didn't feel a huge sense of immediacy or urgency to really wade into the complexities of open science for the type of work that I do until I received a grant from the U.S.
Department of Justice Office on Violence Against Women, and buried in the grant was special condition number 10, the data archiving plan: applicants should anticipate that OVW will require, including through a partial withholding of award funds, that data be submitted to the National Archive of Criminal Justice Data. For those of you who may not be familiar with NACJD, this is a very long-standing archive in my world that holds criminal justice, crime, delinquency, and victimization data. It's been around since 1978, and it's maintained by ICPSR at the University of Michigan. Some of its data sets are fully open and public access, but given the sensitivity of the type of data that this archive holds, some of the data sets are behind a wall, so to speak, and they require an application, a review, and a vetting process before the data are released, to make sure that the data are going to qualified folks who can handle their sensitivity. Our particular project was going to be addressing what's called the national rape kit backlog, and I'll tell you more about that in just a second.
It was a continuation of a very long-term, now 10-plus-year community-based participatory action research project I've been doing in Detroit, Michigan. And yes, the funder makes the location of the project public information and promotes it, so as we're going through today's webinar, I am starting from the place that everybody knows I am working in Detroit. The location is not confidential; it is not private.

Okay, so let's do a quick sidebar to tell you very briefly about the substance of this research, because it's really important context for today's webinar: it very much shaped some of the dilemmas we faced and the decisions that we made. I said this is about rape kit backlogs, so what the heck is a rape kit? After a victim has been sexually assaulted, they are advised to go to a hospital emergency department to seek medical care, but also to have this rape kit collected. What that is, is a complete full-body exam to collect all of the evidence of that crime that's been left behind. A doctor or a nurse can swab the body for blood, saliva, and semen; all of that is packaged up in those little envelopes you see there, and it's all bundled into a kit. This kit is then released to the police, and the police are supposed to take that kit and submit it to a forensic crime laboratory for DNA analysis, and that DNA analysis can help with the investigation and the prosecution of the reported assault. But what we've learned is that police have not been submitting these kits to laboratories for DNA testing. Going back as far as the 1980s, they've been putting these kits in storage, untested, and those are called rape kit backlogs. They exist in all 50 states: big cities, small cities, rural communities. As my community and I have been studying this, it's often framed as a very profound institutional betrayal, and that's how survivors experience it. They reported this to the police, they went through this honestly pretty invasive exam, thinking and hoping and expecting that the police would do something with this evidence, and instead they didn't. There are many reasons why the police don't test these kits, but what we see converging across many different studies of different types of settings is that it wasn't an institutional priority. They just weren't going to investigate and prosecute these cases, and so they literally put these kits on a shelf.

So in our grant project, we were going to do qualitative interviews with the first cohort of sexual assault survivors to have those old backlogged kits finally sent to the lab and tested, and whose cases were finally prosecuted. We would be interviewing survivors, and they would be recounting the assault itself and how they were initially responded to and treated by the police, which was not going to be good. They would be describing the prosecution of their cases, which would be very challenging and emotionally difficult, and they would be talking about their health and healing journey after the assault, after this institutional betrayal.

So I was worried. I mean, to put it mildly, I was worried. I was worried about many things. One of the things I was worried about was creating a safe, trauma-informed environment; as a trauma community psychologist, I've done a lot of projects, and I felt that with a lot of intentionality we could do that. But I was also very worried about protecting their privacy and their confidentiality given this mandated data archiving requirement: how would we release these transcripts to this national archive in a way that would prevent the re-identification of their data? That's really what we're going to focus on today. So when I say open science, I'm actually talking about a pretty narrow slice here today: archiving data in a particular national archive, and a particular set of focal concerns around preparing data for distribution and sharing in this national archive. So our team wrote a paper about this,
published in Advances in Methods and Practices in Psychological Science. If you haven't read this paper, I hope today's webinar will encourage you to go take a look, and if you have read the article, today's webinar will pull the curtain back even a little bit more and share some of the behind-the-scenes information that didn't make it into the article about how and why we did what we did.

So first off, we had to start with what we called phase zero, which is: if we have a requirement to share these data in a national archive, then we need to tell the survivors that before we collect any data. We needed to start with informed consent. So we developed a trauma-informed protocol for informed consent, specifically how to talk to a traumatized population about this and why you're doing it. We wrote a paper about that protocol if you're interested. And this is the language from the consent form: as part of our agreement with our funder, we're required to share anonymous transcripts, and we promise that before we share them, we'll take out all of the usual suspects, your name, any other names, dates, and any other details about your case that would be identifiable. That is a sentence that is very easy to write. It is very easy to say. And it is very hard to enact in practice. I'm going to put a pin in that, because we're going to keep coming back to it. It was something that we didn't think through when we wrote the consent form, and it's something that we had to think through in a lot of detail over the course of this project.

The other thing that we did in this consent form, as part of our trauma-informed practice, is that we gave the survivors the opportunity to withhold some of their data. We said, if there are specific sections or things that you've talked about in the interview that you don't want included in the transcripts that we're going to share and archive, you can tell us at the end of the interview and we'll remove those. I appreciate that some folks may wonder why we did that, and I'm happy to talk about that in Q&A, but we felt that given the level of trauma and betrayal that they had experienced, it was important to give them this choice and this control. So we're going to put a pin in that too, and come back and let you know what happened: if we give people that option, do they take it?

Once we had all of this in place, we could start our data collection. We interviewed 32 sexual assault survivors, again, that first cohort; that's a pretty typical sample size in qual research. All of them agreed to participate, and all of them agreed to data archiving; they didn't have a concern with it. The interviews were incredibly rich and detailed. There was no indication that by telling them we're going to archive your data, they withheld information; they were incredibly forthcoming. They were, honestly, incredibly painful interviews. We also took the opportunity to ask them about this archiving requirement: what do you think about this, why did you agree to it? We wrote a paper about that if you're interested. They had really positive views about it. They felt pretty passionately that they wanted other people to understand what they had been through, and they wanted other researchers to learn from their experiences, so they were glad that there was an opportunity to do that. And then we come back to that question: we gave them the choice that they could take data out. They didn't. Only two of them requested redactions prior to archiving, and those redactions were not substantive; they were two women who asked that we remove profanity from the transcripts before they were archived.

So we collected our data and went about our normal process. We did our qualitative data analysis; for the qual folks here, we used Miles, Huberman, and Saldaña's framework, and ATLAS.ti version 8 was our analytic software. We did member checks with our victim service agencies, and we wrote our grant final report; our funder requires a very substantive final report. We
got all the way through that, and it's like, now it's go time. Now we have to prepare our data for the archive. The first thing that we needed to grapple with was that we had 32 transcripts with, I mean, a lot of information, a lot of incredibly personal, private information, and we needed to figure out, okay, which pieces and parts of this need to be remediated? What's potentially identifiable? What are the clues here somebody could use to re-identify who this person is? To do that, we turned to available guidance to help us through how we were going to do this.

First off, I would say, at the 50,000-foot view, we have IRBs. So we went to the IRB and we said, okay, we've got to do this de-identification and archiving, can you help us with this? And their reply was, well, your consent form says name, dates, any other identifiable information, so do that. And we said, we know; we need some help and a thought partner in this. And again, no disparaging them, their reply was: do what your consent form said. So we're like, oh, okay. I'm going to put a pin in that too; we're going to come back at the end to talk a little bit about what we wanted and hoped for from our IRB. Lovely people, but it wasn't the source of guidance that we were hoping for. At the 25,000-foot view, we had the archiving guidelines of the archive, the National Archive of Criminal Justice Data. This was helpful just in helping us think through, all right, what's the format, what documentation do we need? At the 10,000-foot view, now we're getting into a little bit more about, okay, well, what do we need to take out? What is it we're going to do here? Obviously the HIPAA Safe Harbor identifiers were useful, and the Qualitative Data Repository creation handbook was useful too in helping us think about the buckets of issues that we would need to think about. And then at the ground level, we found some examples in the published literature, of qualitative researchers talking about and showing how they prepared or remediated data prior to sharing it, so we had those. But the problem was, where we were in that moment, we weren't at the ground. We were below ground. We were buried; we were at the roots level, literally completely overwhelmed with the incredible sensitivity and emotionality of this data. And we kept coming back to that phrase: any other details about your case that would be identifiable. Well, it felt like there were millions of details in those interviews that were identifiable, and we literally felt buried. There's all this guidance, but it wasn't sinking down; it wasn't coming down to the level where we were at.

So a key lesson learned in this project is this: qual researchers very likely are going to need subject matter experts, not open science experts, although those were helpful too. We need subject matter experts to give you that root-level guidance, people who know your topic and your population and can really be your thought partners in unraveling what the identifiability risks are. For our project, that was the victim service agency staff as well as the prosecutors, the prosecutors who prosecuted these cases. So I want to tell you a little bit about these consultations and how they were helpful to us. If you haven't worked with prosecutors in your life, you may or may not know, but you might guess from TV, they are a lovely, argumentative bunch of people, and if you're explaining what you're doing to them and you can basically get through to the other side with them, you've been able to have all of the holes poked and filled, because they're a tough, tough group. So I'm explaining to them what we need to do, and they're like, okay, so let me get this straight: you have this research transcript, and within this you have all of these different pieces of information. I'm like, yep. And they say, all right, here's what you need to do: you need to pull out each
of those data pieces, represented here as a dot. They're like, each and every one of these pieces has to be evaluated. I'm like, uh-huh. They were instrumental in giving us a framework to think about this. They said, well, you're going to want to think about who else knows that information. One of the people who might know that information is the perpetrator themselves. Another person who might know would be the prosecutor, the victim service agency, maybe the person who was sitting in open court the day that this case was prosecuted. So think about the who: who else knows this information? And then think about the how: how do they know that information? Do they know that information because they're the one who perpetrated the act in question? Do they know about it because they heard it in open court? How else do they know about it? And then, what other records or data sources also contain that information? In our case, that's a court record, a trial transcript. They're like, all of the stuff that you cover in your interviews is going to have huge overlap with a publicly available document. And long story short, yes, you can get those by name; if you can get one of those, you'll get the name of the perpetrator, and you might get the name of the survivor too.

This made us pause and freeze in our tracks, because it meant that our research interview had this ghost or companion document out in the publicly available ether that contained quite a bit of the same information, so we were going to have to figure out how to de-identify relative to those transcripts. The prosecutors helped us figure out how to get some of these publicly available transcripts, and we sat down and looked at them to get a feel for that kind of information. Once we had that lens on this, it's like, okay, then each data point, we pull it out and we look at it. What are our options? Can we retain this, is it safe to retain it? Can we keep it but remediate it, blur it in some way, shape, or form? Or are we going to have to pull it out of the transcript and redact it, because it just can't be there, because it's identifiable, or it's a really important piece of information that can re-identify? And then at the end, we're going to put everything back into a remediated transcript that is both qualitatively different from the original transcript and quantitatively different: it's going to have less information, and the information in it is not going to be as detailed and rich.

So with that framing, now we could sit down and start working through each of those little dots in all 32 interviews. For our remediation coding, we formed a team; this was done by multiple people. Some of them had really deep knowledge of the interviews: they were the interviewers, and we wanted that because they knew the sore spots, they knew the sensitive spots, they knew what was really hard for those survivors. But we also had a person who knew the substantive area but didn't know the interviews, and we wanted someone with a little bit more distance to look at this. Maybe they would see things differently, but also as a check to say, if you take all of this out, does the transcript still make sense? So we're starting to think about not just protecting privacy but the future usability of the data. We had a mix in the team of who was doing this work, and then everybody was assigned their transcripts, and they would go through dot by dot, and read, read, read, read the transcripts. I'm putting a pin there, because there's a lot of cost in doing that over and over again. They would look at this, draw the proverbial circle, tag it, and say, this is a piece of information at risk for re-identification. We were using Microsoft Word at this stage; it worked fine, though I wish we would have stayed in ATLAS.ti, and we can talk about that later. But I think it also reflects that at
this stage, we're still thinking about this as just a simple recoding task, and you could just do that in Word. But we did need to create an audit trail of our deliberation, so we used the comment box as a way of deliberating. The initial coder would make a proposal: I propose that we remediate this by blurring, and here's the way it should be blurred; or, I think this should be redacted, and here's why. They would propose a plan for each and every dot, so you can imagine what the comment section looks like: layers and layers and layers of comment boxes. Then we would have team review and discussion about the proposed remediation plans, for each dot, in each transcript, for each survivor, with multiple rounds of discussion until we came to consensus. We made the plan, the coder would implement it, and then we did multiple checks to make sure it was remediated in the way that we agreed upon.

So now you want to see some examples, right? I'd love to show you some examples. I'm a little hamstrung here, because the whole point is that I can't show you the unredacted data, because it has identifiability risks. In writing the manuscript, we tried to think about, okay, how can we show our work here? So we're going to try to talk you through some of the dilemmas and the choices and show you as much as we can. Again, what I hope to show here in the examples is how our team started out going into this thinking entirely about privacy and confidentiality, and, as we did this work, had to broaden our lens to think, yes, about privacy and confidentiality, but also about usability: how these decisions would affect the usability of the data that we would ultimately be archiving.

We'll start off with the easy ones: names. Simple one, Safe Harbor; everybody knows you take the names out. It's in the consent form. Easy peasy, and by and large it was, and there were a lot of names in here, because they would refer to their friends, the service providers, and the perpetrators by name. So this is a straightforward thing: you just remediate it by taking the name out and putting in the role, so perpetrator, detective one, detective two, and the like.

Here's one where it wasn't so easy. Because these assaults were finally prosecuted, and prosecuted in open court, this became a known thing in survivors' lives, and many of them talked about how, as it became known in their lives and in their families and in their circles, other people in their social networks often disclosed that they too had been assaulted. So we had within the transcript a secondary disclosure, an assault experience of somebody who is not our participant and didn't agree, didn't consent, to any of this. So we're looking at this, and we're like, okay, we're going to need a little bit more blurring here. If we just took out the name and put in the role, mother, that blurs it, except that should something fail along the way and this transcript really be re-identifiable, this poor person's identity is now blown, and their story is now known in a way that they probably didn't want. So for the name, instead of putting the category mother, everybody just became family member. We felt that was a pretty minimal trade-off in usability: yes, some future users might want to know that it was a mother versus a generic family member, but that's okay; we felt that it was important for protecting the privacy of this person. In terms of the details of the assault, I'm going to cover in just a few slides how we handled all of the assault narratives to protect that. But again, here we had to blur to a much higher superordinate category than we initially thought we would.

All right, let's talk about dates. We promised we would take out the dates, and we did. We took out, I was assaulted on this date, my court date was this date, so the transcripts just say date redacted. But one of the things that was really
substantively important in this project was the delay, the delay in justice. They reported in 1990, the kit sat for 10 years, it was finally tested in 2012, they went to court in 2015. All of those dates are interesting to me as a psychologist doing research in the criminal justice system, and I knew that my other CJ colleagues would be interested in those dates and would want to know how they overlay with different legislation. And we can't do that, because if you know those dates, you can start to narrow down and get to our transcripts. So instead we introduced time intervals: we would say a five-year interval, a ten-year interval, and we don't give the dates that stop and start the interval, just the amount of passage of time. That's a moderate trade-off in usability. We knew that people would be disappointed that the dates weren't there, and they would understand why the dates weren't there, but that was what we felt was a reasonable compromise between privacy and still trying to give some usable data to future users.

All right, now the biggie: the assault narrative. Survivors told us what happened in these assaults, both to them and, as I said, sometimes in secondary disclosures. Our approach was to start with remediation, and our plan was to blur. We would take the text where they're describing the rape, and we also know that it's out there in a court transcript, because you have to retell the story of the assault in court, so we're mindful that there's this other record out there. We're trying to go through line by line, sentence by sentence, and remediate specific words, particularly salient, unusual words. We're trying to blur, we're trying to smudge, we're trying to keep the story there but remove or blur to superordinate categories or remove distinctive phrases. And this was tiring, it was frustrating, and we weren't sure it was working. And then one day in our research team meeting, somebody said, I feel like I'm rewriting her story, and that doesn't feel okay. I wrote that down in my field notes, because everybody was like, yes, that's it. Among other problems, this is somebody's story. You don't rewrite their story. Their story is their story is their story. That's a very fundamental thing in our field, in our world. So we're like, okay, then what are our options? What we decided to do was go to a much more extreme option, and we redacted. We redacted the story. We couldn't just take it all out, though, because you need to know what happened in the assault. Given that we're pretty experienced researchers in this area, I know what kinds of details many future users would want to know. I don't know all of them, but I know many of them, and there is kind of a standard set of things that researchers want to know about an assault, particularly in the criminal justice system: was it perpetrated by a stranger, were there weapons, and so on. So we made a list of the common variables that are coded about assault narratives, and we wrote a summary giving the answers to those questions that future users would likely have. We're not changing their words; we are assuming responsibility for the task of removing the narrative and giving back information to future users around the key variables. We felt this was a mild-to-moderate trade-off in usability: if you really want the narrative itself, the data won't have that, but if you want to know the variables, very likely the transcripts will have that.

Another big chunk that we were worried about was victims' experiences with the legal system, because there was a lot of narrative about that: their initial experience reporting to the police, the court process. We approached that the same way we started out with the assault narratives, planning to do that line-by-line, word-by-word remediation and blurring, but here we actually had a completely different experience of
where we ended up versus where we started, because we realized we could actually retain huge chunks of this with very minimal redactions. You see there, for example, a quote from one of the interviews. It's right there: I had two separate incidents in the rape kit backlog. That's not an identifiable piece of information; we had a lot of those. Then the details of the first assault are redacted, and you see the age at the assault. I reported to the police, and they treated me basically like garbage, like a whore, like a liar; they threw me away. So in [state redacted], when I was kidnapped off the street by a stranger who turned out to be a serial rapist, I was a little more hesitant to call the police. So let's have a moment for what this means, that this is not identifiable. This is actually really common. Literally, the words garbage, whore, and liar, when we cross-checked against other transcripts, even those terms are not unique and identifiable. This was a very common experience. It's a substantive finding of our project, but it also made it possible for us to hold on to bigger chunks of data than we ever thought, because this is not unique information, not at all.

The last example was health and healing. We asked them about that, and most of this we could remediate with a blur. They would talk about health, mental health, physical health; again, we didn't want to distinguish mental and physical health, as even that felt a little too potentially identifiable, so it just became a very superordinate category, health condition. And when they would talk about their healing journeys, again, that was a situation where we couldn't really remediate on a word-by-word level, and it was really very private, very sensitive information. So whatever we could blur, we did, and otherwise we made the decision to redact it. We acknowledge that that piece will be a high trade-off in usability: future users will not have access to some of that information. So, some lessons
learned out of this phase: this was a very time-, labor-, and emotionally intensive process, and we learned that what is identifiable and what may be traumatic are not necessarily the same thing. We read and reread and reread traumatic material over and over again, trying to figure out how to remediate it, and that increased our vicarious trauma. I'm going to put a pin in that; we're going to come back to it as well.

Final phase. We're done, right? Ta-da! I mean, we have an original transcript, we pulled each of those dots out, we did the who, how, what, where, all of that, we deliberated and deliberated, and we put it back in. We now have these remediated transcripts. Well, a lot happened between the left side and the right side of the screen, between day one and day whatever it was, so it felt like we needed a little bit more here. What we thought was going to be kind of a simple word task in Microsoft Word became coding, but the coding actually became an analysis. This was very deliberative decision making, so it was analysis, and you need a validity assessment in qualitative analysis. So we needed to assess the validity of our de-identification, not just the coding but the analysis, the whole thing. We decided to use a very classic framework for our validity analysis, Lincoln and Guba. I'm going to put a pin in that, whether that was the right choice or not; we'll come back to it later. In the interest of time, I'm not going to go through all of Lincoln and Guba's criteria, but I will talk a little bit about credibility: confidence in the accuracy of the findings. How confident are we that this was accurate? We spent a lot of time trying to figure out what the heck accuracy means in this context, and basically we operationalized it as: did it work? We took a whole bunch of information out, or remediated or blurred it, trying to protect their identities. Are they re-identifiable? Well, you have to be kind of careful about how you test that working
hypothesis. So we provided agency staff with a set of remediated transcripts. The agency staff knew who these women were; they had worked with them all throughout their court experience. And we said, can you read this and tell me who it is? Can you re-identify the survivors? And the answer was no, they couldn't. We had blurred and removed enough that they could not tell which client it was. So we would say we think, slash, hope it worked. Again, you'll see the pin there, something to come back to later, but we felt it was a reasonable way to assess credibility, and it did give us some confidence that we had done a good job getting identifiable information out of there.

So I'm going to wrap up with a few reflections and lessons learned, coming back to those pins, and then we'll open it up for your questions and discussion. Let's go back to the pin about the IRB guidance. Again, lovely people, my IRB. This is a quote from our paper's discussion section: we followed this, but we found ourselves wanting more consultative support from our IRB. We've made a point to say we're not disparaging our IRB colleagues; we're just highlighting the limits of their training and the limits of their role. They were very focused on the front end of the data, making sure we had informed consent. But on the back end of the data, where I would argue there are just as many IRB and ethics issues, they were not our thought partners. So I think it really behooves those of us who do qual work, who do sensitive work, to really think about what it is we need from our IRBs. They seem to be less and less involved, more and more hands off, and I would like them to be a useful thought partner to me and my colleagues doing really difficult work, to think through what the re-identification risks are.

Another key reflection I want to come back to is the vicarious trauma. I mean, we went through a lot in this project.
That comes through in the paper, and hopefully it's coming through in the webinar. Rereading these things over and over and over again was hard, and I have to say this was kind of an aha moment for me, because I literally wrote the book on vicarious trauma and sexual assault research. I was mindful of it, I was paying attention to it, so it wasn't like I completely forgot everything I've ever thought or written about, and we used strategies for it. But we didn't quite have enough gas in the tank to get this car across the finish line. It was very hard. We needed a larger team, and we needed more time, to give people the breaks they needed to mitigate the trauma. I underestimated the impact on us, and I underestimated the time and resources it would take. So I share that as a way, for myself and others, to say: you're going to need more time, so you can rotate people in and give them the mental health breaks they need to step back and take some time off.

This also made me think: does all of this have to be done by people? What can be automated? I had this notion coming in of, oh, I'm a teeny tiny small project, I'm 32 cases; that automated stuff is for the huge healthcare data sets. I would love to see more efforts to scale some of those automation methods down to the little people like me doing small-scale qualitative work, because there's probably some stuff we could have automated. I was unfamiliar, unaware, and just didn't have those connections, so that's on me too in terms of my professional networks. But I also raise it to remind the folks doing those automated things: yes, it's for big projects, but boy, it'd be really helpful for us little ones too.

Coming back to this question of did it work: let's go philosophical here for a moment. What does that even mean? I really don't know. I felt like we operationalized it in a
reasonable way: could the transcripts be re-identified by people who know the participants? In the review process for the manuscript, one of our reviewers asked, is Lincoln and Guba a reasonable choice? It's the choice we made. It's a very classic, commonly used validity assessment. Was it designed for this? Heck no. Although I will highlight that when you actually read the working definitions of each of these constructs, there's open science stuff in here: the findings reflect the participants' views, not the researchers' biases; the findings are consistent and could be repeated. So I actually think it did have some utility here. But I do think it's kind of a TBD for those working in the open science space and in qualitative research: what does validity look like, how should we be assessing it, and how do we assess the utility of the data? I think those remain open questions.

And then finally, as I said at the top: I came into this thinking a lot of the open science stuff belongs to other areas, but I've learned it's in, and should be in, all the disciplines, research settings, paradigms, designs, data collection methods, and research populations. It can and should have very broad applicability. Transparency matters, but I think it needs to be tailored, customized, and contextualized to the type of research and the population, because risks are relative, and what is best practice in one area may not be best practice in another. So I'd like to challenge us to think about how we take these principles and apply and contextualize them to different types of research.

With that, I will finish my formal remarks and say thank you all for joining. And I need to thank my incredible set of colleagues who went with me on this journey and did this work: Dr. Jivorka, Jasmine Engleton, Katharine Fischwick, Dr. Gregory, and Dr. Goodman-Williams, truly an amazing set of folks. So with that I'm going to stop sharing my
screen and bring everybody back for the Q&A.

Awesome, thank you so much. The chat is just filled with folks who are so happy and grateful that you have shared this process. We have a lot of questions that have come in, in the chat and in the Q&A, so I'm going to try to bounce around a little bit. Because they came in as you were talking, some of what you said later in the webinar might address them, so maybe we can just hit the high points on those. But lots of thank-yous, and admiration for the hard work that you and your team have done in going through this process and making sure that we stand by our ethical and professional values. So thank you for that.

Real quick, did I turn off my share screen correctly, or no? We've still got PowerPoint for you. All right, let me remember what I did. Okay, there we are. Sorry about that. Thanks.

So, a question about the funders: did the funders agree to the dual consent? Let me see if I can find it. The last statement in the consent, where you were using parts of the data for the research and all of it was not archived: were the funders okay with this, or did you have a conversation with them, and what did that look like?

We didn't really have a conversation with them. We sort of said, this is what we think is trauma-informed practice and this is what we're doing, and they said okay. But I think that was also because they were genuinely curious whether people would pull their data out, and they didn't.

Okay. And kind of similar to that consent process: when you talked to participants during consent, were they made aware of what the archive was, and who had access to it? What information did you give them?

Yeah, that's a great question, and I will refer folks to the other articles for a deeper dive, but briefly, yes, we did tell them. The main thing is that we were
explaining to them that it was going to other researchers. Because NACJD, as I said, has those two sides, and this was going on the protected side, where requests would be vetted and access limited to researchers, it wasn't full public access. So we said it was going to other researchers; we simplified it that way, because we were really struggling with how to explain this quickly. We were at the beginning of a complicated process, still in that trust-building phase. So we said it will go to other researchers to learn from and study, but it will be de-identified, and here's everything we're going to take out. And like I said, that seemed fine. But again, I want to highlight that our research population was a traumatized population who had also been through court. That didn't absolve us of our responsibilities in any way, shape, or form to protect them, but they were kind of used to a level of, my life has literally been on the public stage, so that could be a factor in why they were like, yeah, I'm good with this.

And just to highlight some of that: can you give a little bit more context about the archiving demand? Is the data in an open archive? And how do you feel about qualitative data sharing under data sharing agreements, in comparison to the approach that you took?

So the data are in a protected archive. Someone said, well, that's not open science, it's not publicly available, and I'm like, I think my definition of open science may be a little bit different from others'. I am sharing data, but I'm making it available to a subset of the general population: qualified researchers. And I sleep fine at night with that decision, given the sensitivity of the data, because I do think it's important to be transparent with the research community. My data, I don't think, really work well on the public scale, but I think sharing does work well within the research
community.

Yeah, and I do think that aligns very well with the concept of "as open as possible, but as closed as necessary," right? It's a trade-off. It's not 100 percent open, anybody can look at it, but it's also not entirely closed, so making it available is going to be really important. And let's keep going with the questions; I want to make sure we get through as many as we can.

An interesting question: the strategies that you developed for the interview study, how did you apply them, or how might they apply or not, to other types of qualitative evidence, like field notes?

Field notes, wow. Yeah, so I do have field notes for parts of this particular project. I think that's where you start getting into, when you go down the literature on open science and qual research, you'll find mixed views, where some who do that really in-depth ethnographic work feel really hesitant about sharing their field notes. I think you could use these same principles; I think there would be parts of field notes where there would be reasonable justification for redacting. It's practice in field notes that you bracket your own emotions and feelings; that's part of how you get your validity and disentangle your participants' perspectives from your own. I could see a scenario where people would want to share that as part of a dialogue about how they do it, and others who may not want to, because what they're unburdening themselves of in the field notes is the different emotions they have, the different experiences they're having. So I think that's an interesting space. But again, I think the same thing applies: since field notes are basically you recounting what you've seen and experienced, names, dates, and details can be
blurred, remediated, or the like. I think there would be a fair amount of resistance to doing that, but I think it's an open question how it could be done and what it might look like.

Thank you. One question about process: you thought of a lot of different issues as you were working through this entire research study. Did you do any brainstorming in the beginning, with something like a mind map? Or, I heard a bit of, you were flagging issues as they came along. Can you talk a little bit about the mix of what you were able to identify in the beginning versus what came up, and also how that impacted your budget and timeline? You talked about that a little, but this is an invitation to say more if you want.

I'm going to be really honest here: this was me going la la la la la. I literally had so much to worry about. I knew it was coming. I knew we had to do the informed consent. I knew enough to hold back some budget; I estimated, I think, three or four months to do this, and I held that back. So I knew it was coming, but I didn't think about it a whole heck of a lot, and I figured out a lot along the way. I wouldn't recommend that, and that's one of the reasons we wrote the paper, to be honest. Talking about transparency here: there are a lot of metaphors, but we were building the plane as we were flying it. At the same time, I did want to come into it with, okay, here we are, what are we going to do with this, and really tried to do that. I think the mind map ahead of time would have been a really helpful thing; I love that idea. What was most helpful to me was having that incredible sense of panic and sitting down with the prosecutors and basically just getting yelled at for days while we worked through it with them. That's just the way
they communicate, but it was really helpful to me. So that lesson learned, of go find your subject matter experts for that roots-level, dirt-level knowledge, I think is transferable to lots of people, and I encourage it. I don't know that you can do it ahead of time, maybe you can, but at the very least we needed it then, and it was really helpful then.

A question about de-identification and anonymization: this is a lot of work, as you pointed out, it can reduce the richness and usefulness of the data, and it still implies a risk of re-identification. So the question was about your decision: you have a managed-access option, weighed against archiving the data in a repository that uses encryption and implemented access control. I mean, you did do that; you did have managed access. So can you talk a little bit about the metadata that you included? I'm assuming you talked through with ICPSR how to keep the data discoverable so people are able to access it.

Yep. So the information about where it's archived is shared in the paper; the DOI is available. It may not be fully up yet, the curators are still double-checking all of that, but the DOI is established. We do have some code books, we do have some manuals, and we did a pretty good job, we hope, when we redacted something, of being clear about what was redacted, so you know what the whole was. There were a couple of times where it just says "redacted," because knowing what the whole was, was too identifiable or too risky in our assessment, but that was actually pretty rare. So we did try to provide some metadata to future users about what they're seeing.

Another question about process: were the individual remediated transcripts shared back with the individual participants, again thinking about this being participatory? Right?
That's a great question, and again, something that, had I built the plane a little bit more before we started flying, might have had a different answer. Here's the thing: this is a very difficult population to recruit, an incredibly difficult population to recruit, and part of the appeal, if you will, for them is that it's a one-time engagement. I have a long-term relationship with the service agency, the prosecutors, and so on, but between us and the survivors there's an ethos of: I'm going to come, I'm going to tell you. There's kind of a healing piece in that as well, which I've written about. So it did occur to us at the end: I would really like to share this back with the participant, to see if she, he, or they are okay with it. We had no mechanism to do that; our consent was set up as a one-time thing. And that was important, because our victim service agency partner said, if you make this a repeated thing, I don't think they're going to do it. They've been through a repeated thing. They will give back one more time, but only one more time. So it is something I would like to tinker with in future research, to see if there is a subgroup who might be interested, or to offer it as a choice. It was kind of a paternalistic decision not to even offer that; we just did it one way, and I would like to open up a bit more option and see if folks might want to reread it. I'm not sure how many would; it means going back over traumatic material yet again. But that's not my decision, it's their decision. I just couldn't offer that decision because of the way I set up my consent process.

Thank you. I think that's really helpful for folks who are planning new projects and thinking about how they might want to do this: what options are we going to give? So I think that's really important. I love
this question, and I think you'll have some things to say. This attendee is a program officer for a grant program that requires applicants to submit data management plans, and so they encourage applicants to think about balancing confidentiality and openness, and about the different elements that can be shared. This person was struck by how often these DMPs say that, due to the sensitivity of the data, nothing can be shared, so they're wondering what advice we can give those folks.

I think this idea that the data are too sensitive to share is a very real, normal human reaction for folks doing work with marginalized, minoritized populations, and I honor that feeling. I've had that feeling, because there are so many populations that have been actively harmed by the research community: Native communities, Black Americans, so many. So it's like, let's be careful here, folks. And sometimes you get that notion that you can't share it, or you shouldn't share it, or it's too sensitive. I don't want either side to dig into their bunker on this. I don't want one side to say it is impossible, it can't be done, it shouldn't be done, because I think there is an important part of our role as researchers in making voices heard by others. We don't speak for the participants; they speak for themselves, and the extent to which we can share their data, using as much of their voice, literally or figuratively, as possible, that's important. And at the same time, I don't want people getting in the bunker saying everything can be shared, see, here's proof, even the most traumatized. Ours is a pretty unusual situation, so let's not overgeneralize from mine either. So I don't have a great answer other than that I think it should be an open conversation among funders, and I don't want people to have a knee-jerk reaction on
either side. "Oh, you must share." "Oh, you can't share." It's like, whoa, whoa, stop. Think, talk, look at what's happening, look at the different resources. Is it possible? Rather than just assuming that it isn't, or that it automatically is.

Wonderful. An interesting question about reuse: what types of reuse of your data do you foresee, and are there reuses that you think will not be possible after your cleaning process?

I think one key thing that's absolutely going to be impossible with our data is any kind of discourse analysis on how survivors talk about their assaults. That's just not possible; we literally took it out and replaced it with a summary of variables. But I don't know that people doing that kind of work would go to these data anyway for that kind of question. There are other data sets you would probably go to instead, where the re-identifiability risk is lower. Again, remember, my city is public. If that weren't on the table, well, in the review process one of my reviewers said, you know, if you didn't have that, think about how many other things might be available, and I was like, yeah, you're right. But when you know literally where it is, which courthouse, which FOIA office to go to, it does constrain you quite a bit. So I think it's not going to be super helpful in that way, but for folks who really want to understand how the criminal legal system treats sexual assault survivors, there's very rich data there.

Let's see, questions are coming in as quickly as we can answer them; this is awesome. So, a question about how you would decide how much people on the research team could handle, or when to tag in and out. What kinds of characteristics would you look for within the team in the future?

Yeah, that's a good question,
and it's part of a bigger structure we have in our teams around vicarious trauma. While they were interviewing we had those structures in place, and we just had to continue applying them here; again, my underestimation was how much wear and tear this was going to cause. Part of it is being proactive: just rotate people in and out. There are power dynamics here between a researcher and her or their team. You know, I'm your doctoral advisor, I'm your employer; people are going to be very reluctant to say, hey, I need a break. I try to create an environment where they will do that, but, at the risk of being paternalistic, I need to be mindful that with power differentials people might not. So there's just a rotate-in and a rotate-out that happens, and actually that takes care of a lot of it: knowing that you get a break, you rotate in, you rotate out, you know a break is coming. And then people could let me know, or they might let another team member know, who might say, I'm going to take over somebody's shift, and I'm like, absolutely. And then there's providing support for each other within the team, while also recognizing people may not want their support from me. They want support, and I need to step back so they can go get it. It doesn't need to be me, and maybe it shouldn't be me, but I can hold the space for them to go get the support they need.

Thank you. So I know we're at the hour. I have a quick question: would you be willing to share your slides with us? We usually send a follow-up.

Awesome. It would be pretty hypocritical not to; I believe in transparency. You can have my slides.

Awesome. So folks will have the slides, we'll have the recording, and we'll be sure to include those papers that were mentioned in the presentation as well, so folks can have them. Maybe this is the perfect question to
end on: if you could do anything differently, whether in the data management plan, the archiving preparation stage, or anything else, what would that be? That might be our last question, unfortunately.

I actually wouldn't do anything differently, because I went into this wanting the journey. I went into this with a mindset of curiosity, of being a learner. I'm kind of old in my career; I'm the old dog, I'm the dinosaur. I wanted to learn some new tricks, I wanted to learn new things, so I let that happen, and I wanted it to happen. I wanted that experience of feeling unsure and confused and naive, to learn and to remember what it feels like to learn. That was really important to me. So me personally, I wouldn't do anything differently, because I really wanted that experience. Practically, I would start the prep on the de-identification at the time you're cleaning the transcripts, because you have to clean the transcripts and check for accuracy when you're doing qualitative analysis anyway, and I could have saved some wear and tear if we had started doing some of the immediate safe-harbor work then. That's a concrete tip, something I wish I had done differently. But otherwise, I really wanted to see what I could learn, and I hope that came through in today's webinar, and I hope it came through in the paper: somebody not knowing the space, trying to learn this space, and sharing what it feels like to learn it.

This was excellent. There's so much positivity flowing; I see some of it, I can't see all of it.

Thank you. I thank you for welcoming me into your new space. I hope it's useful, and I look forward to meeting some of you and learning more about what you're doing.

Thank you so much. Thank you, everyone, for coming. Please check out any other events that we have, and we hope to hear from all of you soon. Thanks. Thank you. Take care, everyone.