My name is Beth Duckles. This is Vicki Steeves. I'm a sociologist, Vicki is an academic librarian, and we're here to talk about qualitative research using open source tools. We'll start off with the basics of what qualitative data analysis is, to dive right into it. So this is a way of collecting, organizing, and interpreting typically textual data, although sometimes visual data as well. Everything from interview transcripts, public documents, focus groups, open-ended survey questions, anything like that. After a close reading of this qualitative data, researchers then assign codes to the text, and the themes and relevant quotes come out of the data. It's a very iterative process. Folks who do this are often doing it multiple times over, so they might do it, work a little bit more on the codes that they're creating, and keep going as they read more.
Here's what it's not, and I say this because a lot of times folks get confused about what qualitative data analysis is. Anything where you can press a button and it comes back and gives you an answer is not going to be qualitative data analysis. There are cool things that the software tells you, but you, the person who's doing the research, have to do a lot of the work and the analysis yourself. This is not word counts. It's not word clouds. It's not machine learning or text mining, keyword extraction, sentiment analysis. Those are all great. I really support them and want more of that, but qualitative analysis isn't any of those things. Or data analysis, artificial intelligence, predictive learning. It's also not mixed methods per se. Qualitative can be a part of mixed methods, but it's not necessarily always a part of mixed methods. Qualitative, as I say here, is deep engagement. So you're really looking at the textual data, or whatever data you've got, really engaging with it and trying to come out with specific themes from that analysis. So why would you do this? 
Why would you do this research? Well, it's used by a lot of fields. Anthropology and sociology are kind of well known for doing it. Education as well. Nursing, information science, psychology, and a lot of folks in user experience and marketing are starting to do this work too. There are also a lot of types of data out there that are textual, and so this is a way to analyze that. Everything from observation, so ethnographic or observational research, to interviews and focus groups, as we mentioned, and also tons of public or archival documents, memos, jottings, and media. Tweets, for instance, are a really growing field for qualitative data analysis. And also all kinds of visual stuff as well. There's a lot of visual analysis in this field.
The reason to do it is that you generate rich, detailed data that keeps cultural context and individual perspectives intact. We allow for a deeper understanding of phenomena when we try to understand them from the words of people who were there. And often, and I say this all the time, qualitative research should study things that quantitative cannot. So we shouldn't be doing it if we can just do a survey, find some numbers, and answer the question that way. Please go do that if you can get the numbers to figure out the answers to your problem. Qualitative is a lot of work to do for something that you already have numbers for, but it really helps us understand the whys of whatever phenomenon we're trying to understand.
So how does QDA work? Typically there's collecting the data of some kind: organizing it, gathering it, pulling together source materials. Sometimes you're doing interviews or you're collecting observations. Then you do this coding process that I mentioned, the qualitative data analysis of coding specific parts of the data, and then finding patterns. 
And there are different theoretical ways to think about finding those patterns, but you're articulating the connections, relationships, and patterns that you find in the data. And then finally, you create this analysis, using the theoretical and substantive knowledge that you have to understand what you find in the data. So I'm going to pass this along. We're going to talk a little bit about what exists right now for qualitative data analysis, and then we'll talk about our open source products.
So a lot of our work was born out of the current landscape of qualitative research and what's called, on Wikipedia, computer-assisted qualitative data analysis software. So QDA tools. It's fine. You can see the list here varies a lot in terms of prices. NVivo, for example, which is one of the most widely used qualitative tools across disciplines, is about $1,400 for a license and it only works on Windows and Mac. The cheapest tool is $15 a month and uses Flash, and it is not going to stop using Flash. So there's that. And then we listed the three currently maintained open source qualitative tools. One is QualCoder, which we just happened to find the same way we serendipitously found each other. It works on Linux and it's being tested on Windows. And the cool thing about QualCoder is that you can use AV, so you can import audio and visual materials. There's qcoder, which Beth is going to talk about, which is an R package for qualitative and works on all operating systems. And then I'm going to talk about Taguette, which is the open source one that I work on.
So the landscape was pretty grim before us in terms of accessibility to qualitative. People were putting things together like, I'm going to cut and paste quotes into Excel, so you have one column that's your code and one column that's the text. There are all these very hacky systems that people have come up with to get around having to pay these massive fees. 
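As a rough illustration of the two-column spreadsheet workaround described above (one column for the code, one for the quoted text), here is a minimal Python sketch. The excerpts, the code names, and the function names are all invented for illustration; none of them come from the talk or from any of the tools mentioned.

```python
import csv
import io
from collections import defaultdict

# Hypothetical (code, text) pairs, invented for illustration only.
ROWS = [
    ("cost", "We could not pay for the license."),
    ("workaround", "I pasted every quote into a spreadsheet."),
    ("cost", "The trial version kept expiring."),
]


def to_csv(rows):
    """Write (code, text) pairs as a two-column CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["code", "text"])
    writer.writerows(rows)
    return buf.getvalue()


def group_by_code(csv_text):
    """Read the sheet back and collect the excerpts filed under each code."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["code"]].append(row["text"])
    return dict(grouped)


sheet = to_csv(ROWS)
by_code = group_by_code(sheet)
for code, excerpts in sorted(by_code.items()):
    print(f"{code}: {len(excerpts)} excerpt(s)")
```

The sketch also hints at why this approach is fragile: each pasted quote loses its link to the document and position it came from, so nothing ties a row back to its source.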
Like, I always think back to the first time I ever did qualitative. I was working at a museum that could not pay for the license. And so I just kept putting in different emails to get different trial versions. That's not sustainable. It's not equitable.
And so why FLOSS? The first reason is equity. These are under-resourced disciplines, mostly in the social sciences and the humanities, that do not get as much money from funders. This also spans lots of geographic areas that typically do not have the resources to pay for tools. So the default becomes these hacky systems, or people just stay analog. They print out all of their transcripts and highlight with different colored markers or Post-it notes. And this makes the analysis sort of unavailable to other researchers. When we think about openness and reproducibility in qual (I'm a reproducibility librarian, so talk to me about that another time), this is just not sustainable.
And part of that instability is around lock-in. The export options in these paid platforms really vary. They're not backwards compatible. So you can lock students, which gives me the shivers at night, and researchers into a platform that has no alternatives. And so when you leave a university, when you leave a well-resourced job, you have nowhere to go for your computer-assisted qualitative data analysis. There's also a level of complexity that these paid packages have: tons of add-ons and extras they want to sell you that really overwhelm newcomers, when really the basic functionality is to highlight a section and give it a tag. That's a really simple program to make. However, we've seen a little bit more standardization, which I'm going to talk about later, and some open codebook initiatives as well.
So the challenges for open source qualitative packages come in two forms. One is technological and the other is cultural. 
So I'm going to talk about the culture first, even though it's the opposite way on my slide. The first is primarily data privacy concerns. People who do qualitative research are doing things like interviewing refugees or celebrities in Turkey, and these communities are really in danger. And so when they think about privacy concerns, both with their own data and with data that exists and is findable, these folks often don't understand that open source can be as secure as a paid platform. So these are some cultural barriers that we've seen. Then the other side is the education side. Folks either teach you the hacky systems that they've created to get around paying for licenses, or you get trained on these proprietary packages and then, again, when you leave you have no alternative.
On the technological side, there are often lots of collaborations in qualitative research, and so people need, one, the ability to collaborate, which doesn't exist in many open source or paid qualitative packages. And then also there's not really good provenance or version control around the way people tag things, so it gets all tangled up.
So there are some open initiatives in qual I wanted to highlight. There are some past open source qualitative initiatives that are now defunct. One is RQDA, which is an R package. The other one people talk about a lot is the Coding Analysis Toolkit, called CAT. The dates that you see there are the times they were last updated. There's also the Qualitative Data Repository, which is a new and really safe way of publishing qualitative analysis that's run out of Syracuse University. You can have things like the metadata accessible, but people have to request access to the actual data file, so you keep things discoverable but locked down for your populations. And they also have this great initiative called Annotation for Transparent Inquiry. I really highly recommend that you look at it. 
It's about annotating specific passages in an article with links to data and more methodological information than what would normally be allowed in an article. So it's a great open initiative in qual. And also just one quick note on this thing, the Rotterdam Exchange Format Initiative. This is a great open, community-run initiative in qualitative to be able to exchange projects and codebooks between tools. The charts are kind of big, but it's been adopted by a lot of these paid platforms. And so there's some movement towards more interoperability in qualitative, which is wonderful.
So the problem that we tried to solve is that qualitative researchers need more equitable access to software. The options are far too expensive. The ability to highlight text and tag it should be really easily available in a GUI. And it's not fair or right that qualitative researchers without these massive funds can't afford the basic software to do their research. And so we're going to describe our two approaches. Beth is going to start with qcoder and then I'm going to talk about Taguette.
If you get nothing else from this, that last slide is exactly what we want. We think that there should be more options. And so I was at rOpenSci last year with Dan Sholler and Elin Waring. We all got together, we're all social scientists, and we started talking about this problem. And we came up with a solution which is still a prototype, still in development. We'd love your help if you're at all interested in supporting it. Our goal was to create a lightweight, open source, textual qualitative data coding package. And we also wanted this to be available to the existing R quantitative packages that are looking at things in terms of text mining and sentiment analysis. So we think that this would really help to create more capacity for mixed methods, and for more interesting mixed methods. We use RStudio and Shiny to create an interactive front end interface for users. 
At this point we support .docx and text files, and the data remains local. This is just an example of the interface that we have right there. We have a coding process here. And this just gives you an example. We're using codes of conduct as our test case or example data. We need to use that repository to find more examples. But that's an example of some codebooks that we put together and some of the outputs that we might end up getting.
This is still very much in the early stages of development. Right now the problem is that you need to understand R and RStudio to be able to use this, and that's not particularly welcoming. We know that. We also know it's not particularly easy to collaborate yet. And we need to support more file types and extend this a little bit more, to make it really easy to kind of plug and play. It's on GitHub. Please come support us in any way. We're looking for collaborators and also funds for development, possibly working together.
I should mention that what's pretty fun is that Vicki and I met each other over this. We didn't know each other. We both kind of put out that we had made these packages, and then somebody was like, you two should talk to each other. So it's pretty exciting that we can see there are multiple folks who are interested in creating these things. We also have various stories of having run into people who were trying to do the same thing. So there are a lot of folks out there who really want to have an open source community that does qualitative research. But one of the challenges, I think, as we've mentioned, is that a lot of qualitative researchers don't have the technical skills to be able to create this kind of work. And so we'd love to make that bridge and that connection. Elin Waring, who's sort of the PI for much of what we've been doing here, is working with some CS students, hopefully this summer, to further the package. 
But again, we're really open to any support or additional ideas from folks.
So the approach that I took with Rémi Rampin, who is the main developer of Taguette, and Sarah DeMott is a web-based application called Taguette. So Rémi Rampin, hi, he's on the live stream in France. He lives in the US, but he's just in France right now. Which is why it's Taguette, because I like to poke fun at his Frenchness. I was like, it's baguette, but with a T, and you tag things, and that's what qual is. So we had kind of fun with that. Taguette is installable both locally and on a server for different collaborative work. We also have a server where folks can use Taguette, and the development is on GitLab. So feel free to also check out the dev queue and stuff there.
So this is what Taguette looks like. My screen on my desktop at home is pretty wide, so this screenshot isn't super clear. Sorry about that. But with Taguette, you can import pretty much anything that Calibre takes. Calibre is an e-book manager that we use to basically convert all these documents into HTML to show them to you on your web page. You can highlight and tag different text. We have this thing called Backlight, which I did not think was going to be very popular, but it basically grays out the text that is not highlighted, so the yellow pops a lot. And when I teach students this, they always freak out at this backlight and I have no idea why. So we have backlight. It's a thing. You can also export different views, you can do group work, and you can actually collaborate in real time. So that's sort of just how we've tried to do Taguette.
So like I said, it's a web-based thing. You can use it on any desktop. You can install it on a server. It's built with Python and Calibre. The data is stored in a SQLite database, so it's provenance aware. These are all sort of super-developy things I'm telling you, but it's our technical approach. 
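To make the "highlight a passage, give it a tag, and keep provenance in SQLite" idea a little more concrete, here is a minimal Python sketch using the standard `sqlite3` module. It is an assumption-laden illustration, not Taguette's actual schema or code: the table names, column names, function names, and the sample sentence are all hypothetical.

```python
import sqlite3

# Hypothetical schema for illustration; NOT Taguette's real database layout.
SCHEMA = """
CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT NOT NULL, body TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
CREATE TABLE highlights (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    start_offset INTEGER NOT NULL,
    end_offset INTEGER NOT NULL
);
-- Append-only log: who did what, and when (the provenance part).
CREATE TABLE change_log (
    id INTEGER PRIMARY KEY,
    made_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    author TEXT NOT NULL,
    description TEXT NOT NULL
);
"""


def open_project():
    """Create an in-memory project database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def _log(db, author, description):
    db.execute(
        "INSERT INTO change_log (author, description) VALUES (?, ?)",
        (author, description),
    )


def add_document(db, author, name, body):
    cur = db.execute("INSERT INTO documents (name, body) VALUES (?, ?)", (name, body))
    _log(db, author, f"added document {name}")
    return cur.lastrowid


def add_tag(db, author, path):
    cur = db.execute("INSERT INTO tags (path) VALUES (?)", (path,))
    _log(db, author, f"added tag {path}")
    return cur.lastrowid


def add_highlight(db, author, document_id, tag_id, start, end):
    """Record that characters [start, end) of a document carry a tag."""
    cur = db.execute(
        "INSERT INTO highlights (document_id, tag_id, start_offset, end_offset) "
        "VALUES (?, ?, ?, ?)",
        (document_id, tag_id, start, end),
    )
    _log(db, author, f"tagged document {document_id} [{start}:{end}] with tag {tag_id}")
    return cur.lastrowid


def excerpts_for_tag(db, path):
    """Return the highlighted text of every passage filed under a tag."""
    rows = db.execute(
        "SELECT d.body, h.start_offset, h.end_offset "
        "FROM highlights h "
        "JOIN documents d ON d.id = h.document_id "
        "JOIN tags t ON t.id = h.tag_id "
        "WHERE t.path = ? ORDER BY h.id",
        (path,),
    )
    return [body[start:end] for body, start, end in rows]


# Usage with an invented sentence and an invented user name.
db = open_project()
text = "We could not pay for the license, so we used trial versions."
doc_id = add_document(db, "researcher_a", "interview-01.txt", text)
tag_id = add_tag(db, "researcher_a", "cost")
add_highlight(db, "researcher_a", doc_id, tag_id, text.index("could"), text.index(","))
print(excerpts_for_tag(db, "cost"))
```

Storing offsets rather than pasted text keeps every excerpt tied to its source document, and the append-only log is one simple way to answer "who tagged this, and when" in a collaborative project.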
Also, this again was born out of a need in the community, out of the lack of open source qualitative tools. So we have tried to help this community by making super intensely commented and image-heavy documentation. We have open educational materials on the OSF. We have people email us with their errors all the time, actually, which is very cool, and some tweets as well.
The other thing about Taguette is that it is multilingual. The story behind Taguette is me complaining for two years over dinner with my friend Sarah and my partner Rémi, and he got so tired of hearing us complain about there being no open source qualitative tool that he made it. And the cool thing is that Sarah and I are then able to teach it as a part of our services as librarians. She was actually able to present it in Vietnam and Syria, and the users in these sessions were able to upload their own materials and get started right away. So you can see it's multilingual. And we're actively translating the tool into different languages. We have French and German. If you speak something else, that's the link. So please feel free.
And so just to sum up, the core of our presentation is that qualitative research has traditionally been pen and paper, or dominated by these super locked-down, super expensive proprietary software packages. And so only really well-resourced institutions have access. That's not fair. Qual researchers need FLOSS. We're trying to fix that gap. We encourage anyone with some abilities in programming to contribute. Uplifting what Beth said, a lot of these folks don't have the access to learn things like Python. So we're trying to bridge that gap between folks who are heavy in development and the communities that see this real need. So that is what we have for you. That's it. We're happy to take questions.