 and kind of get that experience. So Rebecca. Great, thanks Comstock. So our speaker today is Dr. Sean Lawson, an Associate Professor in the Department of Communications at the University of Utah. Dr. Lawson's research focuses on relationships among science, technology, and security. In particular, he focuses on the intersections of national security and military thought with new media information and communication technologies. Today Dr. Lawson is going to speak to us on collecting social media content for qualitative research. And like Comstock said, we're going to hold questions at the end, but feel free to put them in chat and we can get to those at the end of the session. So without further ado, please join me in welcoming Dr. Sean Lawson. All right, thank you. So, yeah, so the title for today's talk is Collecting Social Media Content for Qualitative Research. We'll start off by going over some general tools that you can use to find and collect multiple kinds of data online. We'll then sort of transition into highlighting some platform-specific tools for Twitter, YouTube, Reddit, Instagram, Facebook, and Telegram. Of course, the number of platforms here that we're going to discuss is not exhaustive. So you'll notice like TikTok is missing, for example. Haven't done a lot of work with TikTok or Snapchat. So there's always more platforms that you can deal with, but these are sort of the big ones that we'll talk about today. Additionally, I'm not going to be able to go into a lot of detail on each of the tools that I'm going to show you and specifically how to use them. Rather, this is meant to be more of like a quick overview of what's available, what's possible, to sort of spark your interest and get you thinking about what's out there and how you might start using some of these tools in your own research projects. I'll provide a link at the end where you can find all of the tools that I'm talking about here. And so I hope you'll go and play with them. 
Don't worry about feverishly taking notes and trying to write down lots of URLs while I'm talking. Again, there'll be a link at the end, just a short bit.ly link that you can go to, and even a QR code if you want to just snap it with your phone and it'll take you there. And it's essentially just a list of all the links from the presentation in the order that I mentioned them. And you could even use some of the tools that I talk about in the presentation to save that web page with all those links in it, or maybe scrape those links. Or, I don't know, screenshot them and then OCR them. The possibilities are endless for what you could do. And maybe that even gives you an opportunity to play with some of the tools we talk about here. Okay, so the other caveat at the beginning is to be aware that these tools come and go as the platforms make changes or as the people who created the tools maybe stop maintaining them or move on to other projects. I really try to emphasize free and open source tools as much as possible in the presentation, which means that they're often created by either a single person or a small group of people, often doing it for free just as sort of a passion project. So sometimes these things go away or they stop working. In fact, when we get to Telegram later, you'll see one of the tools that I recommend you check out was literally working yesterday, because I was using it. And then this morning, when I clicked the link to go to the page to screenshot it to put it in the presentation, I got like a 503 error and the page isn't working. I don't know if it's going to continue to not work or if it's going to come back or not. I still put it in the presentation, but that's literally how volatile some of this stuff can be. So you just have to be patient and creative in doing this work. Like I said, often what works today doesn't work tomorrow. I literally found that out again in the last 24 hours.
So maybe before we go a little farther, I should say a little bit more about who I am, although I think Rebecca pretty much hit it already. As she said, I'm an associate professor in the Department of Communication, and I'm also the director of the Communication Institute here at the University of Utah. I'm currently a non-resident fellow at the Marine Corps University in what's called the Brute Krulak Center for Innovation and Future Warfare. And I've also been a participant for the last couple of years in the Atlantic Council's Digital Sherlocks program through the Digital Forensic Research Lab. And throughout the years, I've used digital data and tools in my academic work pretty extensively and also in some limited reporting that I used to do for Forbes. I used to be a contributor at Forbes. I haven't done that in the last couple of years. But in all of that work collectively, I've made a lot of use over the years of blogs, Twitter data, email breach data, public records, and other kinds of social media data. And I've also made use of social media APIs, Python scraping, and various kinds of search engine optimization tools in the work that I do. More recently, I've been interested in what's called open source intelligence, or OSINT. And many of the resources shared here today will come from that world of OSINT, because I often find that my colleagues and students are unaware of these resources. Often in academia, we talk about digital methods or digital research or digital ethnography. Those tend to be the terms that we use. And there's a lot of good tools that are made by and for academics. But there's like this whole other parallel universe of OSINT, or open source intelligence, where there's lots of other tools. And I think probably even more tools in that space than there are sometimes in the digital methods space. And so that's where I'm going to sort of draw a lot of my inspiration from today. Okay.
So though the talk title is about collecting data, before we can collect data, we have to find it first. And that's actually not as easy as it would seem a lot of times. And neither as you will see is the collecting portion of it either. A lot of times we think that because things are online, it's easy. They're just right there on our screen. They're easy to find and they're easy to save. They're easy to collect and use in our research. But once you start especially dealing with certain social media platforms, Facebook for example, what you'll find is it's not that easy. And in fact, in a lot of cases, the platforms don't want to make it easy for you to find what you're looking for. And especially they don't want to make it easy for you to save that information and use it for something. Okay. So first and foremost, we have to find what we're looking for. And what's more, if we're engaged in a longer term ethnographic kind of project, monitoring our sources of digital data is also important over the long term, right? So for each of the platforms I'll discuss, I'll point you towards some tools and references for searching, monitoring and collecting or archiving digital data. So we'll hit those three for each one of the platforms. But I'll start with some general tools for web searching, monitoring, and collecting or archiving that are helpful regardless of what platform you're studying. Okay. So let's start with the search component of this. The first is of course, everyone's favorite Google, right? But I think most of the time, we usually just sort of casually type things in the answer box, as I like to call it, and hit go. And it mostly gives us what we're looking for. Usually within the first 10 results on the first page, how many of us really go far beyond into the second or third page of search results on Google? Not very often, right? So that works for most casual searching. 
But there are actually a ton of advanced search operators that you can use in Google that are very helpful for letting you narrow in on specifically what you're looking for, right? So how many people here have used some of the advanced search operators before? A couple people, right? So you can do search operators like in title, right? Like only search within the title. You can search in URL, right? So terms just within the URL itself, not the content of the page, but just the URL of the page, right? You can search a particular site, the content of just a site by just saying site colon, put in the domain name, right? And then search whatever terms you want. Okay, so and there's a ton more of these that you can use as well. Okay, so you can go and search Google for Google advanced search operators and you will come up with a ton of guides and lists for these. So I didn't put one up here for you. But in the world of like computer hacking and security and pen testing, another term that this goes by is what's called Google dorks or Google dorking. And so you can search that as well. But I did provide you one example here. And again, this link will be in the references at the end. You can go and look at it. And this provides examples of really specific search strings in Google for finding things like files that have the word, you know, for official use only in them from a domain name that ends with .gov during a particular time period, et cetera, or certain search strings that you can search for to find web cameras that have the security set to the default login and password, which could allow you to log into those cameras and see what's happening on that webcam. And all sorts of other things. So most of these Google dorks that you're going to find on this site, probably not things we're going to really be using as academic researchers, right, and students. 
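Those operator strings can also be composed programmatically, which is handy if you're generating lots of targeted searches for a project. Here's a minimal sketch: the `build_query` helper is hypothetical, but the `site:`, `intitle:`, and `filetype:` operators it emits are real Google syntax.

```python
# Sketch: composing Google advanced search operators ("dorks") into a
# single query string. The build_query helper is made up for illustration;
# the operators themselves (site:, intitle:, filetype:) are real.

def build_query(terms="", site=None, intitle=None, filetype=None):
    parts = []
    if site:
        parts.append(f"site:{site}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if filetype:
        parts.append(f"filetype:{filetype}")
    if terms:
        parts.append(f'"{terms}"')
    return " ".join(parts)

# e.g. PDF documents marked "for official use only" on .gov sites
query = build_query(terms="for official use only", site="gov", filetype="pdf")
print(query)  # site:gov filetype:pdf "for official use only"
```

You'd then paste the resulting string into Google (or a Google Alert, as we'll see in a moment) rather than scripting the search itself.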
But I include these here because they give you, I think, an idea of how specific and targeted you can really get with a Google search if you really understand and get good with the advanced search operators. And so look at some of these examples from the exploit-db.com database. And think to yourself, okay, how could I do something like this for my research and for what I'm interested in? And what could I potentially be getting to in terms of sources if I did that? Okay, of course, there's more to search on the web than Google, right? So don't ignore other search engines like Bing, or even foreign search engines like Yandex, all right? So Yandex is actually a Russian search engine. And assuming they don't go out of business because of all of the, you know, all the financial stuff that's going on with Russia right now with the sanctions and everything, Yandex is actually a pretty good search engine, especially if you're doing reverse image search. So you can take an image that you find online and either pop in the URL to that particular image, or you can download the image and then upload the image to Yandex and do a reverse search to find every place else online that that image has been posted. And in fact, Yandex in my estimation is really the best tool for that. Google has reverse image search as well. But quite frankly, it sucks. It's not very good; for as good as the rest of Google's search is, the reverse image search is pretty awful and usually doesn't get very good results. But Yandex is like creepy good in the things that you can find. Finally, there are other tools like Google Trends, which can be helpful for finding the most-searched terms related to a broad topic area. Search engine optimization tools, or SEO tools, can further help with finding the most popular sites and content related to a particular topic or search, or with finding related or similar sites.
You can start with one site and then have these tools find sites that are similar or related that are posting similar kinds of content. They can help you find links and references between a group of websites. So there's lots of things that you can do with these. And so all of this is good for sort of getting an initial handle on a discourse or where certain topics might be discussed online that you're interested in. And then you can dig in further with the other kinds of tools that we're going to discuss. Okay. So next up is monitoring. You might want to monitor the topics and the discourses that you're interested in over a long period of time. And so that is instead of just finding some data and then saving what's available at that particular moment and then moving on to your analysis, you might want to do a more ethnographic or sort of longitudinal kind of project. And that's where different kinds of monitoring tools come in. So I'm sure many of you are probably already familiar with Google Alerts. You can use it to monitor the web in general or you can monitor for news items for the search results and the terms that are of interest to you. And then you can use all of those advanced search operators that I mentioned before in your Google Alert searches. So instead of having to search the web regularly or search Google News regularly, you can have that happen automatically in the background and then have new results just delivered to you regularly. And so you can see an example on the slide of some of the alerts that I have set up, although I guess it's a little bit small. Okay. So but what do you do with those, right? Once you have the alerts, what do you do with them? Well, you can then have those results delivered to what's called an RSS reader, or sometimes they're called an RSS feed reader. And so RSS feed readers are really power tools for monitoring online content. So how many people here use an RSS reader already? One person. Okay. 
I can't really see online how many people may be raising their hand. Maybe one person. Oh no, that's me. Hi. There was a thumbs up. Yeah. Okay. We do have at least one thumbs up there from someone who uses an RSS reader. Okay. So the two that I recommend the most, especially for academics, are Feedly and Inoreader. I actually have paid accounts at both of them because I find both of them very useful. There's a lot of overlap between the two, but there are some slightly different features that each of them have. And, you know, they're not that expensive. And so I just kind of pay for both of them. So what can you do with them? Well, you can use them to subscribe to updates from particular webpages that have RSS feeds on the webpage. And most news and blog sites will have RSS feeds on the page. And with either Feedly or Inoreader, all you have to do is just pop in the URL of the page you want to follow. It will go and find the RSS feed for that page. And then you can subscribe to it. And whenever that page posts an update on their news site or their blog site, it'll automatically be delivered to you. You can also subscribe to your Google Alerts by subscribing to the RSS feed of the Google Alert. In both Feedly and Inoreader, you can subscribe to certain social media accounts, like Twitter, Telegram, and Reddit, for example. You can even subscribe to email newsletters. So you don't have to have all that going to your email and junking it up. Instead, you can have them sent to your RSS reader, and you can consolidate all of it in one spot. You can do things like set up filters within your feed reader for incoming information, to filter the information as it comes in and highlight certain terms that you're interested in in the information that's coming to you. These tools also offer various integrations with other services, like Google Docs, Dropbox, etc., so that you can automatically have data saved into spreadsheets or into folders on your computer.
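Under the hood, an RSS feed is just XML that a reader like Feedly or Inoreader polls and diffs for new items. If you ever want to do that polling step yourself, here's a minimal sketch using only the Python standard library; the feed string is made up for illustration, standing in for what a blog or Google Alert would actually serve.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed, like what a blog or Google Alert serves.
FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item><title>Post one</title><link>https://example.com/1</link></item>
  <item><title>Post two</title><link>https://example.com/2</link></item>
</channel></rss>"""

def parse_items(feed_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in parse_items(FEED):
    print(title, link)
```

In practice you'd fetch the feed URL with urllib on a schedule and keep track of which links you've already seen, which is essentially what the hosted readers do for you.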
So Feedly and Inoreader are going to come up multiple times throughout the remainder of the presentation, because they're really super valuable. And then finally, there are cases where none of the above are going to work for you. And in that case, you might have a page that you need to follow, but it doesn't have an RSS feed, or even email alerts or updates. And in that case, you can use a website change detection service to notify you when a page has changed. So the one that I use that's free is just called Follow That Page. There are a lot of other options in this space, both free and paid. This one's pretty simple, not a lot of bells and whistles, but it is free and it serves the purpose. And you can have the notifications delivered as email updates or as RSS feeds, both of which, you guessed it, you can monitor in your RSS readers. You can have all of the stuff coming into one location that's sort of your dashboard. All right, so now let's turn to sort of general collecting tools that you can use. Okay, so there are also free tools for collecting. The most general and, I think, most universal free tool for capture is good old screenshot or just print to PDF, right? We all have this capability on our computer, so don't discount these. Oftentimes, and I'll be honest, I do this myself, I will get caught up looking for like a fancier way to do it, like there's got to be a script, there's got to be a service, there's got to be software, maybe somebody's written something and posted it on GitHub, and I'll just waste a bunch of time and it's like, you know what, I could have just printed this page to PDF and saved it on my computer and I would have been done with this like an hour ago, right? But I'm like looking for the elegant solution or whatever when the free and simple solution is staring me in the face. So don't discount these, okay?
They're often more efficient than fancier ways to save materials online, and these options are always a good fallback if other tools aren't going to work, or if in some cases, like Facebook a lot of times, there just aren't other tools, because Facebook really, really does not want you saving their content. And so sometimes save to PDF or screenshot is like the only option, unless you're willing to go out and pay for like a really expensive sort of industry solution or whatever, which I think most of us as academics don't have that kind of money, right? Okay, so as you can see my slide here is very meta, M-E-T-A, not capital-M Meta, is it two T's now? I don't know, yeah, but it's very meta. I have a screenshot of me saving a tweet as a PDF. So technically I'm kind of saving it twice at the same time, whoa, right, mind blown. Okay, but there are also free online archiving tools that you can use. And so these are particularly useful if you need to share archived versions of your data publicly. So the two most well-known options in this regard are archive.today and of course everyone's favorite, the Wayback Machine. Now I'm sure a lot of us probably use the Wayback Machine to find old versions of websites if the website has gone away and we can't get it anymore. But what a lot of people don't realize is there's actually a Chrome browser extension for the Wayback Machine that you can get that will allow you to save websites to the Wayback Machine. And in fact you can sign up for an account on the Wayback Machine on archive.org, get your own account, it'll be linked to your browser extension, and when you save things it'll save snapshots of pages using the Wayback Machine's tools, and those will be saved in your account, and so you can go back and find the websites that you've saved later. And there are also tools for exporting those and downloading them locally as well.
So I'm not going to go into those, they're usually like Python command line stuff, but there is that option as well. And then the other one that's newer is archive.today. Now both of these options are used a lot by people in the OSINT community when they're doing investigations where it might be related to fraud or disinformation or atrocities like we're seeing in Ukraine right now. And so they will save and archive the materials that they find publicly to these services so that they can prove later that the material is real and it's not just coming from them; it has been archived and saved by a third party that is generally trusted. So that's a thing that you can do as well. So you can use these to save entire web pages, or if you just want to save like a tweet or something like that, you can do that as well. They're generally better for entire web pages. There are also browser extensions that will let you save full versions of websites or just pieces, like frames of a website. So two of my favorites for the Google Chrome web browser, which is what I use a lot, are called SingleFile and Save Page WE. They're both free and have slightly different features. And usually if one doesn't really do what I need or doesn't work on a particular page, the other one will. And so they're very helpful. So they will download the entire content of the page including images and embedded things like thumbnails. They won't grab like video files, but they will grab like the thumbnail of the video. And so it'll save a fully encapsulated version of the page that you're looking at to your local machine so that you can go back and see everything that was there later. So if you've ever tried to just do like a Command-S to save a web page, you'll notice a lot of formatting gets lost. A lot of images aren't there. It doesn't usually look very good.
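Incidentally, saving to the Wayback Machine can also be scripted: it exposes a "Save Page Now" endpoint at web.archive.org/save/ that takes the target URL in the path. Here's a sketch; treat it as illustrative, since the endpoint's behavior, rate limits, and authentication requirements can change.

```python
import urllib.parse
import urllib.request

def save_page_now_url(target_url):
    """Build a Wayback Machine 'Save Page Now' request URL for target_url.
    Query characters like ? and = are percent-encoded so they survive
    being embedded in the path."""
    return "https://web.archive.org/save/" + urllib.parse.quote(target_url, safe=":/")

def archive(target_url):
    # A GET against the save endpoint asks the Wayback Machine to snapshot
    # the page. Requires network access; kept separate so the URL-building
    # logic can be used (and tested) on its own.
    with urllib.request.urlopen(save_page_now_url(target_url)) as resp:
        return resp.url

print(save_page_now_url("https://example.com/page?id=1"))
# https://web.archive.org/save/https://example.com/page%3Fid%3D1
```

That's essentially what the browser extension does for you, with the added benefit of tying the snapshot to your archive.org account.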
But with a tool like SingleFile or Save Page WE, it saves the whole thing as an encapsulated page. So you can use these to, again, save an entire blog post or a news story or a social media account profile page, a list of search results from Google or Bing or something like that, or even just one social media post at a time. So there's lots of things that you can do with these. And then finally, we can use some sort of freemium, you might say, or paid tools for general collecting and archiving. So these include more general purpose note-taking tools like Evernote, which makes it really easy to save local copies of like a web page that you're looking at, like a news story or blog post or something, or a solution like DevonThink, which I use a lot. DevonThink is Mac only, though. So if you're on Windows, it's not going to work for you. In both cases, you can use the free versions for quite a lot. And there's optional paid upgrades for more features and, in the case of Evernote, more storage capability. Okay. And then finally, there are also a couple of really sort of specialist tools used by OSINT investigators in particular, and essentially what they do is save local copies of everything you browse in real time while you are browsing. And such tools are particularly good for longer term ethnographic projects. So the first of these, which is the best known and has been around a little bit longer, is called Hunchly. It is a paid product, but the developer's a really nice guy. He's Canadian, right? So figures, right? He's a really nice guy and he offers, I think it's like a 50% discount for academics. I think the license is something like a hundred bucks a year. And, you know, with the 50% discount, you know, like 50 bucks a year. So if you have a need for something like that, it's pretty awesome. I've downloaded the trial in the past and played around with it. I didn't ultimately end up buying it because I didn't have a need for it at that particular moment.
But it works really well. It's really cool. You can go watch some of the videos about it on his YouTube channel. It's a really neat tool. And then there's also a newer one that's very similar, does a very similar thing, called Vortimo. It's also a paid product, but the developer offers the tool for free if you're an academic. So if you just email him from your .edu address and say, hey, I'd like to use this, can I have a fully functioning license? He'll just respond and say, yeah. So that's pretty cool. Okay, so let's keep going. This is where I say, I guess in my best infomercial voice, but wait, there's more. So let's go to Twitter now. Let's go to that first search, that first option. So the big one is probably Twitter, right? You know, a lot of us use Twitter to collect online data, maybe not because it's the most important place that people are, you know, doing things or talking these days. Arguably that might be happening in other places, but Twitter, I always say, is sort of like the data equivalent of the drunk looking under the streetlight for his keys, right? And like somebody comes up and asks him, like, what are you doing? He's like, I'm looking for my keys. Like, oh, did you drop them here? And his response is, no, this is just where the light is shining. Twitter's where the light is shining, right? I mean, it's the easiest of the social media platforms to collect data from. And to search and monitor, collect data, do all the things. It's the easiest. And so that's sort of why I think you see people using a lot of Twitter data. So to find what you need, the first thing you should do in all of these cases is learn how to use the native search features of the platforms themselves. So in Twitter's case, it has a really good set of advanced search options right on the platform. So if you've not tried these out, you definitely should just go and play. Just try different searches and see what you get.
It's especially fun to do like geolocated searches to see what are people tweeting like here where I live, within a certain radius of like the U or some other location, right? It's always fun to see what you come up with. But there's also more that's actually kind of hidden. There's more search operators that you can use on Twitter that go beyond even what the advanced search interface allows. So I've provided a link here and it's in the list of links at the end. But there's just a lot of options that don't necessarily come up in the advanced search itself. So you can filter for tweets that have links to news stories in them or that just have any kind of link. You can filter for tweets that have images or videos. You can do really specific geocode searches using latitude and longitude. So on the general advanced search page, you can sort of search in a general area by, like, city name or something. But with the geocode operator, so it's just geocode colon, you can actually put in a specific latitude and longitude and set a radius in kilometers around that point within which you want to search for content. And there's a lot of others that you can use in Twitter as well. So it's quite powerful. Okay, so let's turn to monitoring then. So for monitoring, of course, you can use your app or the web interface. Of course, we probably all do that, right? We either have the Twitter app or we just go to twitter.com, you know, in our web browser. But there's a free tool called TweetDeck. Anybody use TweetDeck? No TweetDeck users here? Any TweetDeck users online? I see one head. Yeah, I see one head bobbing online. So all you got to do is just log into Twitter and go to tweetdeck.twitter.com. And it opens up something that looks like what I have on the screen here, gives you a dashboard with lots of scrolling columns of tweets all happening at the same time.
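That geocode operator takes the form geocode:&lt;lat&gt;,&lt;long&gt;,&lt;radius&gt;km, and composing it programmatically helps keep the coordinates straight when you're building several searches. A small sketch; the helper function is hypothetical, but the operator syntax it produces is Twitter's own.

```python
def geocode_query(terms, lat, lon, radius_km):
    """Compose a Twitter search using the geocode: operator:
    keywords plus geocode:<lat>,<long>,<radius>km."""
    return f"{terms} geocode:{lat},{lon},{radius_km}km"

# e.g. tweets about a protest within 5 km of downtown Salt Lake City
print(geocode_query("protest", 40.7608, -111.891, 5))
# protest geocode:40.7608,-111.891,5km
```

You'd paste the resulting string into Twitter's search box (or into a TweetDeck column, or a TAGS sheet, both discussed in this talk) rather than hitting an API directly.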
And you can use it to monitor your general timeline, interactions with you, like responses to you, direct messages to you. You can use it to monitor just one person's account, if you really want to, like, stalk them and see everything they post in real time. You can use it to conduct advanced Twitter searches and follow the results of that search in real time. You can follow lists. So you can make lists of accounts in Twitter and you can follow those lists. So as you can see in the screenshot here, I have a cyber, infosec, and privacy list that I created. That's a list of other people's accounts, like cybersecurity and information security professionals; I put them in a list. And then I follow tweets from that list, right? And then I have two other lists there, one that I created and one that a journalist created, specifically with people tweeting about Ukraine. And so I can watch what's happening in Ukraine and I can watch what's happening in the sort of cybersecurity community, all streaming in real time in front of me. Okay. And actually this is only three columns. I think I probably have about nine columns. So you have to scroll left and right to see all of the columns. Okay. So it's a very useful tool. Next up is our RSS readers again. So again, the paid versions of both Feedly and Inoreader allow you to monitor particular Twitter accounts or search results alongside your other web and news alerts. So for example, in my case, like I mentioned, I have my Twitter list of cybersecurity folks. And in Feedly, I subscribe to a Twitter search that filters for news items posted by people from my cybersecurity list. And then it shows me the content of those news items. So I don't even have to go out and click the link to get the news items. It pulls in the news items that are tweeted by this group of professionals into my feed reader, and then goes and gets those articles and pulls the content into the reader. It's all centralized in one spot, right?
Very helpful. Okay. So then how are we going to collect and archive these materials later, though? So again, we can always use the general methods that I mentioned before, right? Screenshotting, save to PDF, saving web pages with Chrome extensions, all that kind of stuff will always work. You can always fall back on that, right? But a good tool that lots of people in the OSINT world really like is called TweetBeaver. So this is good for downloading following and follower lists of particular accounts, the specific account data for a particular account like the person's name, the description, when it was created, location, all that sort of metadata about an account, as well as tweets from a particular account. And as you can see, there's more options on here of other stuff that you can do with it as well. You can give it two accounts and then say compare the follower and following lists and see what the commonalities are between two accounts, for example. That's sometimes very helpful as well. But if you want to download tweets from multiple users that meet a search criteria, and do so over time, automatically, you'll need a different tool. So this is where a tool like the Twitter Archiving Google Sheet, or TAGS, comes in. It's also free and allows the use of all those advanced search operators that we talked about. And it can be set to run automatically and on an ongoing basis to pull data for you into a spreadsheet. So I've used this for multiple projects over the years and in my teaching, and highly recommend it. And then finally, as in all of these cases, if you want to get a little bit more advanced, you can do a lot more if you are willing to get your hands dirty and play around a little bit with the command line on your computer and download some Python scripts or some Python applications from GitHub and try them out. The one that's probably the most popular for collecting tweets is called Twint.
And it has the advantage over TAGS in that Twint does not use the Twitter API. So TAGS uses the API, which has certain limitations on how many tweets you can save during a given amount of time. Twint does not use the API. It essentially is just scraping the Twitter page. And so you can actually get a lot more data. The downside, though, is if Twitter makes like a major change to the platform, then sometimes Twint gets broken for a while until the developer fixes it. And then it goes back to working again. So it's always a bit of an arms race there. Okay. So now let's talk about YouTube. Like any platform, you want to start by getting familiar with whatever advanced search options the platform itself has. This is going to be a theme, okay, if you haven't noticed already. Okay. So YouTube actually has quite good advanced filtering options. I've sort of done a dropdown to show you these here. I mean, how many people use the advanced filtering options on YouTube regularly? One person here, a couple people here. I'm not seeing anybody online. Maybe one person saying so online. Okay. I know that I often don't use them. Just kind of casually, just like Google, just kind of casually, you know, search for what I'm looking for. But you can actually get really specific with the filters in YouTube to find what you're looking for. But there are third party tools, just like with most of these platforms, there are additional third party tools that allow you to do other things that you can't do on the platform itself. So one of those in the YouTube case that I think is really interesting is YouTube Geofind. So it essentially allows you to find videos posted from a particular geographical area that meet a certain keyword criteria. So, you know, a use case for this might be like, let's say you're studying like a protest movement, for example, and a protest is happening in a particular area.
You could use a Geofind search for videos being posted from that particular area with keywords related to the protest, for example, to try to find video that people are posting from that protest. You could combine that with a geocode search on Twitter, right, for the same kind of data but tweets. And now you potentially have a database of tweets and videos from YouTube and Twitter about the protests that you're studying, and you can pull all that data and start working with it, right? So it's pretty cool what you can start doing with some of these tools. Just like for the general web, there are also SEO tools that you can use. I just put two of them here that are free: one's called TubeBuddy and the other's called Social Blade. Usually in all of these cases there's also a paid option, but you can actually get quite far with just the free version in terms of getting basic data that's helpful. Again, in all of these cases, especially with YouTube, the SEO tools are helpful for focusing in on the data, the particular artifacts, that are going to be of most use to you. Especially with YouTube, downloading lots of video content is going to take a lot of time and a lot of storage space, right? And it takes a lot of time to analyze that as well. If you're a qualitative researcher and you're really watching every minute and reading every word of the transcript and coding it in detail, you're not going to be able to do a bulk download of, you know, terabytes worth of video or whatever. That's not the kind of work we do as qualitative people. But at the same time, you're going to want to try to find the videos and the content that you can make an argument are most influential, that maybe got shared the most or got the most comments or the most views or upvotes or something like that, right?
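That winnowing step is simple to script once you have the metadata in hand. A toy sketch of the idea, with all titles and numbers invented:

```python
# Hypothetical: pick the most-viewed videos from collected metadata so you
# only hand-code the most influential artifacts. All values are invented.
videos = [
    {"title": "Clip A", "views": 120_000, "comments": 340},
    {"title": "Clip B", "views": 5_000, "comments": 12},
    {"title": "Clip C", "views": 48_000, "comments": 90},
]

# Sort by view count, descending, and keep the top two for close coding.
top = sorted(videos, key=lambda v: v["views"], reverse=True)[:2]
print([v["title"] for v in top])
# → ['Clip A', 'Clip C']
```

You could swap `"views"` for `"comments"` or any other engagement field, depending on which measure of influence you can best defend in your methods section.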
And so these kinds of tools can help you find that kind of data to help you make choices about which artifacts you're going to take the time to download and save and really pull apart and try to analyze. For monitoring, of course, as we know from any YouTube video that we watch, if you want to monitor what's happening on YouTube channels, you can subscribe and smash that bell icon to be notified whenever there's new content on the channel, right? We've all seen that before. But you can also use your good old RSS readers again. You can take the URL for any YouTube channel, pop it in Feedly or Inoreader, and it will read it as an RSS feed. And whenever there's a new video posted, there will be an update in your RSS reader. I actually didn't know that literally until a couple of months ago. I was like, where's this been all my life? It's been out there. I just didn't know. Okay. Now, how do you get all of these videos? How do you save them locally? There are actually lots of readily available tools for saving YouTube. These include more user-friendly options that are paid, such as 4K Video Downloader, which I've used before and it works quite well. But once again, if you're willing to get your hands dirty and use some command-line tools, there are tools that will let you do everything 4K Video Downloader will do, plus probably even more, for free. The big one in that regard is called youtube-dl. You can get that on GitHub. Again, I put the link for that in the references at the end. And you can also get youtube-comment-downloader. youtube-comment-downloader does exactly what it says: it downloads comments, and that's pretty much it. So if you just need to download the comments on a video, that's a good tool to use. Pretty simple and fast. youtube-dl is like a Swiss Army knife for YouTube videos, though. It'll download the video. It'll download it in multiple formats. It'll download just the audio track.
It'll download the subtitles in multiple languages. I mean, it's crazy, all the things it'll do. And it'll download comments as well. And there's probably even other stuff that I'm forgetting that it does. So it does lots of things, but you just have to be willing to learn to work with your command line a little bit. All right. So let's move on to Reddit now. Again, your first stop is always platform search. It's not quite as good as some of the other platforms, but okay. But there are some robust third-party tools that you can use for searching on Reddit for different kinds of content, over different time periods, et cetera. I put a couple of those on the slide, and the links are in the references at the end. I include multiple of these even though, if you look at them, they kind of all do the same thing, because like I mentioned at the beginning, sometimes these tools stop working. And so it's always good to have multiple tools in your tool belt for doing these things, because sometimes one will stop working, and if that happens, you can move on to another one, because they kind of come and go. Again, for monitoring what's going on on Reddit, you can just get your own account, subscribe, and use the web interface. You can use the app on your phone or your mobile device, of course. But again, you can subscribe to subreddits, which are basically what forums are called on Reddit, in both of the feed readers that I talked about before. Again, you've got to have a paid account for them to do that, but it's one of the features you get. And again, it helps to centralize everything into one spot. And again, with the automation that I talked about and the integration with other tools like Google spreadsheets and Dropbox, et cetera, you can even automate some of the process of capturing and downloading the information to use later. But yet again, if you're willing to go a little bit further, you can do more with collecting Reddit data.
Of course, again, all those general collection tools that we talked about before are always your fallback. Those are always going to be something that can work for you, even if they're not the most elegant. But again, if you want to get your hands dirty with your command line and some Python, there are various tools for scraping Reddit. One that's specifically dedicated just to Reddit is called Universal Reddit Scraper, and you can use that. And there are a couple of other ones that I'm going to talk about. One of them, which I'll mention in relation to Telegram, actually scrapes more than just Telegram, and also scrapes Reddit and some other sites as well. So there are multiple tools that you can use in this regard. Okay, let's move on to Instagram. So Instagram is a bit of a challenge. We're getting into Facebook territory now, and so things get more difficult as we move down this list of platforms in terms of searching and finding information. For Instagram, you can use just the generic platform search. It's not very good. But you can search for hashtags and usernames and just general keywords or whatever. And it's kind of okay, but it's not great. There are quite a few third-party tools that you can use for Instagram, though, that are helpful. So this is one of them on the slide here, and there are others similar to it, kind of like the ones we looked at for Reddit. And then you can always also use the site: operator in Google for any of these sites as well. So you could just search site:instagram.com and then whatever terms you're looking for. It's not going to be comprehensive, but it will give you results. Okay. And so that's another thing that you can do. So usually you're going to want to try more than one of these tools, because not all of them are going to do everything or be really comprehensive. For monitoring Instagram, it's really just using your account and your mobile app.
That's pretty much it. There are third-party tools similar to TweetDeck that work with Instagram, but I haven't found any that are free or even really reasonably priced. They're usually meant primarily for marketers and advertisers, and as a result they're really expensive, like a $100-a-month subscription fee or something, which we're probably not going to do as poor academics. For collecting, again, you can use one of the tools I mentioned before, 4K Video Downloader. They actually have a tool for downloading Instagram as well as the tool for downloading YouTube, and I think they also have a tool now for TikTok as well. So you can try that out. And then again, if you're willing to get dirty with the command line, there's a tool that lots of people in the OSINT world use called Instaloader. In all of these cases, though, you will have to have an Instagram account. There's not much you can do with Instagram if you don't have an account and you're not logged in, or you're not grabbing an API key and popping it into that command line before you try to use it. So just be aware of that. Same goes for Facebook. There's really not a lot you can do these days with Facebook if you don't have an account and you're not logged in. Even search: you can search using these third-party tools, but usually I find that if you click on those search results and you go there, before you're even allowed to see anything it's going to ask you to log in. Even if it's otherwise public content, Facebook is going to want you to log in. Okay. So again, you can use the platform search on Facebook. It's okay, but it's not as good as it used to be. Facebook got rid of the super powerful Graph Search ability that they used to have. They got rid of that a couple of years ago.
There are folks, like some of the folks I've put on the slide here, who have sort of reverse engineered and created a quasi-recreation of some of the elements of Graph Search. It's not as comprehensive as it used to be, but it provides more capabilities than just what the platform search box allows. So you can try these out. Again, the links are available in the references at the end. For monitoring Facebook, again, you pretty much just have to use your account. There used to be some tools that would work with this, but again, Facebook has really clamped down on things in the last several years. So you pretty much just have to use your account through the web or through the app to really monitor what's happening on Facebook. And then in terms of collection, it's really difficult to collect from Facebook these days as well. Again, they don't want you scraping; they don't want you saving bulk content. So really, those general collection techniques that we talked about at the beginning are probably going to be your best friend here. There are some helper tools that can help with this. There are various kinds of auto-scrolling bookmarklets or Chrome extensions that will basically automate the process of scrolling the page. You know how it is: you scroll, right? And then you have to wait for it to load more content, and then you scroll, and it has to load more content. There are bookmarklets and extensions that automate that process, that essentially just scroll the page automatically for you and force that content to load. And then once you get however much you want, you can print to PDF or something like that. There are also full-page screenshotting extensions that will screenshot beyond the visual range of what you're seeing on your screen, so everything that's loaded. You can use a tool like that as well.
Instagram is a problem in this regard for auto-scrolling, though, because as you scroll, the stuff that goes above the screen unloads. So it loads new content and unloads the old content, so that the only thing that's ever actually loaded is what's on your screen. So there are additional tools that you can get for Instagram that will force the stuff you pass by to stay loaded, and also load the stuff further down, so that you can keep it all loaded and then save it, right? So again, it's a little bit of a war, a little bit of offense versus defense, back and forth, when it comes to saving this stuff. Okay, we're getting a little short on time, but we have one platform left, and that's Telegram, which is sort of a new addition for me to all of this. Telegram, of course, has become more popular lately. Oh, I will say one more thing. Let's go back to Facebook for a second. Another thing you can do on Facebook that I forgot to mention is the mobile interface. So if you're looking at something on Facebook and you want a cleaner version that's easier to save in terms of layout and less clutter, you can just put an m in front of the URL. So instead of www.facebook.com, you can put m.facebook.com and then whatever the rest of the URL is. It'll load the mobile version that you would normally see on your phone, but in your full browser, and then you can print to PDF or screenshot from there, and it gives you a much cleaner version to work with. Okay, so before we run out of time, let's go through Telegram really quickly. It's increasingly an important platform, especially if you're studying political extremism at all. A lot of folks in the extremist world have moved to Telegram from more mainstream platforms like Facebook and Twitter over the last couple of years.
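Incidentally, going back to that Facebook mobile-interface trick for a second: the URL rewrite is mechanical enough to script if you're processing many links at once. A small sketch, with a made-up example URL:

```python
from urllib.parse import urlsplit, urlunsplit

def to_mobile_facebook(url):
    """Rewrite a www.facebook.com URL to the cleaner m.facebook.com layout."""
    parts = urlsplit(url)
    host = parts.netloc.replace("www.facebook.com", "m.facebook.com")
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

print(to_mobile_facebook("https://www.facebook.com/somepage/posts/123"))
# → https://m.facebook.com/somepage/posts/123
```

You could run a whole spreadsheet column of saved links through a function like this before batch print-to-PDF.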
It's also been particularly important for following the war in Ukraine, especially some of the cyber aspects of it, some of the hacking that's going on. A lot of that has been coordinated and is being posted about on Telegram. Search on Telegram is rather difficult. There are not a lot of great on-platform options, but you can use the general search strategies mentioned above. So for example, pointing Google at open Telegram channels. To search Telegram this way, you want to use site:t.me. You don't want to use telegram.com or telegram.net or telegram.org; it's t.me. So if you do that and then put in what you're looking for, you'll come up with some information. Not everything, but some. There are also third-party search and directory services, a couple of which are listed on the slide here. The best strategy for Telegram is really to use a combination of all of these tools, because none of them is really comprehensive in the search results it gives back. So the best approach is to use a combination of all of them, start gathering what I call a bunch of seed channels that are relevant to what you're looking for, and then start snowballing out from there: looking for other channels that the relevant channels you've found are mentioning, or whose content they're forwarding, and then start following those channels as well. So you're using the search to get you in the ballpark initially, and then you kind of snowball sample from there. Okay. One of these, Telemetr.io (I don't know how you say it), is the one that was actually down this morning. I don't know if it's back up yet or not, but you can keep an eye on it. In terms of monitoring, of course, you can use your own account. You can use the app on your phone or on your desktop.
I will say, if you're using this to follow extremists, be very careful in setting up an account, because you do have to use your phone and your phone number. So if you're going to use Telegram to study extremist movements online, I would recommend getting a burner phone, just a cheap $100 crappy Android phone, and going out and getting maybe a Mint Mobile $5 starter-pack SIM card. That'll give you a phone number for a couple of weeks as a trial of the Mint Mobile service. Use that phone number to download the app, set up your account, and get your desktop app set up. And then you can let the phone number go away, or you can create a different free burner phone number somewhere else online and associate that with your account. If you're just doing this for normal research that's not dangerous at all, just do it on your regular phone or whatever. But if you're studying extremists, be careful. I'm not really going into OPSEC here, in terms of keeping yourself and your own privacy safe online. So if you have questions about that or need help with that, let me know. I'm just assuming sort of normal, everyday academic use with these tools. If you're doing something a little crazier, let me know, and I can walk you through that. Again, you can use an RSS reader like Feedly or Inoreader to follow content from Telegram, but once again, you have to be logged in to your Telegram account for that to work. Now, you can look at public Telegram channels without having an account or being logged in. That's mostly what I do, because I'm looking at extremists, so I don't even want to set up a burner account to do it. So I mainly just stick with publicly available stuff and save things using the general methods that we were talking about: screenshotting and print to PDF. But there are other options there.
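The seed-and-snowball idea I mentioned a moment ago is really just breadth-first bookkeeping, and it can help to keep it in a script rather than in your head. A toy sketch, where the channel names and the `mentions` mapping are invented stand-ins for data you'd gather by reading each channel:

```python
from collections import deque

# Hypothetical snowball sample: start from seed channels and follow whatever
# channels they mention or forward from. `mentions` is invented example data.
mentions = {
    "seed_channel": ["channel_a", "channel_b"],
    "channel_a": ["channel_c"],
    "channel_b": [],
    "channel_c": ["seed_channel"],
}

def snowball(seeds, mentions):
    """Breadth-first walk outward from the seed channels."""
    seen, queue = set(seeds), deque(seeds)
    while queue:
        for nxt in mentions.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

print(snowball(["seed_channel"], mentions))
# → ['channel_a', 'channel_b', 'channel_c', 'seed_channel']
```

In practice you would cap the depth or review each newly found channel for relevance before adding it, so the sample doesn't balloon past what you can actually follow.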
Again, if you're collecting, you can use your RSS reader and those integrations with Dropbox and other tools like spreadsheets to help you collect. You can use your own account, if you have one, and download the desktop app; you can actually save individual posts, images, or entire channels' worth of content using the desktop app. There's a guide from the folks at Bellingcat, which I've linked here, that shows you how to do that. And finally, there are yet again command-line tools, some written specifically for Telegram. So there's a Telegram API scraping tool written by one of the researchers at the Digital Forensic Research Lab at the Atlantic Council. And then there's another one called snscrape. It will scrape Telegram, but also lots of other platforms as well: Facebook, Instagram, Reddit, Mastodon, Twitter, VKontakte, Sina Weibo, and maybe some others. And that one was created by the Bellingcat folks. So you could try that on Telegram, but you could try it out on other platforms as well if you want to give it a try. So that was a lot for 40, 45 minutes. Sorry if I went a little bit fast. Like I said, all of the links to everything I talked about here are provided at this link, bit.ly/osint-qualitative, or you can scan that QR code with your phone. I promise there's no malware or anything; it'll just take you to that page. And actually, I've shared those links using another tool that I quite recommend called OneTab. It's an extension for Chrome. You know how you get really excited researching and you have like 50 tabs open all at once, and you're like, I don't know what to do with these, but I don't want to get rid of them either? You can just click a button in OneTab and it'll save all of them into a page, into a list. And then you can actually publish the list and make it public.
So in this case, I did that with all of these tools, and I'm sharing it with you. It's a really, really neat tool. All right. So with what time we have left, if there are questions or discussion or anything, I'm happy to take questions. Oh, okay. Well, feel free to email me. I should have put my email on there. I'm at the U, so it's just sean.lawson at Utah. Yeah, you can find me. Use your OSINT tools. Track me down. And yeah, I'm happy to help. I love this stuff. I geek out on this stuff all the time. So I'd be happy to help you with it for sure. Yeah. A lot of stuff. Yeah. Yes. Again, it really depends on the platform, and it depends on whether you're willing to get into the command line. But a lot of these tools that work from the command line will download metadata in JSON format, which you can then convert into CSV and put into a spreadsheet. And some of them are configurable in terms of what metadata you get, like you can choose to have some fields and not others. And some of them are not, so you just get whatever they get, and that's what you have to work with. The Twitter Archiving Google Sheet, for example, pulls a ton of additional data beyond just the content of the tweet. I mean, it's amazing how much metadata comes with a single tweet, and it pulls all of that and puts it in a spreadsheet for you. So there's a lot that you can do. And then you can take that data and convert it into different formats and do social network analysis on, like, retweet networks or follow networks. Yeah. I have a question, but I also want to comment: we're getting some really nice comments in the chat. People, thank you for your awesome presentation. And then the question I wanted to ask is: you shared really amazing tools for searching, monitoring, and archiving the content. I'm curious if you have any recommendations for analyzing a lot of qualitative data. Yeah. That's actually quite a challenge sometimes. Oh, shoot.
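(An aside on the JSON-to-CSV conversion mentioned above: it's only a few lines with Python's standard library. This sketch uses invented field names; a real collector's output will have its own schema:)

```python
import csv
import io
import json

# Hypothetical: flatten JSON metadata, as many command-line collectors emit,
# into CSV for a spreadsheet. The records and field names are invented.
records = json.loads('[{"user": "a", "text": "hi", "likes": 3},'
                     ' {"user": "b", "text": "yo", "likes": 1}]')

buf = io.StringIO()  # in a real script, open("tweets.csv", "w", newline="")
writer = csv.DictWriter(buf, fieldnames=["user", "text", "likes"])
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
```

For deeply nested JSON you'd pick out the fields you care about first, since CSV is flat; but for most collector output this simple pattern gets you straight into a spreadsheet.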
Did I quit the Zoom? Oh, no, there I am. I was trying to see the chat window. Yeah. So for analyzing, you can take most of this material and port it over into something like NVivo or MAXQDA or ATLAS.ti. Those are the big qualitative analysis programs that I think most people use. They're quite expensive though, right? We have them at the library, though. And I'm seeing Donna Ziggins, who's our qualitative expert for ATLAS.ti and NVivo. Yeah. And you know, there are tools, there are Chrome extensions and such, I don't know about ATLAS.ti, but there are for MAXQDA and NVivo, for saving online data directly into those platforms. I haven't tried the one for NVivo. I have used the one for MAXQDA a couple of times, because I have a license for MAXQDA, and quite frankly, I think it's awful. It's super slow, and oftentimes there are bad formatting errors in what it saves. So I really prefer a lot of these other tools. So for analyzing a lot of data, the big platforms are probably the best. You will need to make sure you're converting the data that you download into a format that those tools can use, so converting it into something like RTF or PDF or JPEG or just plain text. But yeah, those are pretty good. I've also done qualitative projects in the past using DEVONthink on Mac. It's Mac only, but it's quite flexible in what you can do, and I've actually done qualitative projects in that program as well. There are also a lot of networked, markdown-based note-taking applications these days that you could potentially try out for this, if you could wrangle the data into markdown format. One of those that looks like it has a lot of promise for doing this kind of work is called Obsidian. That's not in my list of links, but if you just go to obsidian.md, you'll find it. Yeah.
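Wrangling collected data into markdown for a tool like Obsidian is itself easy to script. A toy sketch; the post fields here are invented:

```python
# Hypothetical: turn one collected post (invented fields) into a markdown
# note of the kind Obsidian and similar note-taking apps work with.
post = {"author": "example_user", "date": "2022-03-01", "text": "Example post text."}

note = f"# {post['author']} ({post['date']})\n\n{post['text']}\n"
print(note)
```

In a real pipeline you'd loop over all your collected records and write each `note` to its own `.md` file in your vault folder, perhaps adding tags or links between notes for the networked features.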
So, you know, all the big software platforms for analysis are much better than they used to be. But quite frankly, working with qualitative data is still quite challenging. It's not easy. It's a lot easier to collect it than it is to work with it once you have it, really. So, I am seeing it's two o'clock, so I want to take a chance to thank you before we lose anyone else. And maybe you can still hang out a couple minutes? Okay. Great. Well, thank you so much. We really appreciate this. Yeah. Thank you. Thanks to everybody for showing up. Yeah, I'm happy to stick around and answer more questions if people want to keep chatting or have other questions. All right. And thanks to everyone online for joining us too. Oh, and Donna is saying the NVivo capture plugin is slow too. That's why I'm here, thinking about how else to capture. Yeah. I was wondering if maybe NVivo was better. Maybe I should try that, but no, I guess not. So that's unfortunate. What's that? Do I have a GitHub page? No, I don't. Maybe I should. I don't know. Most of the code that I have is just really janky stuff that I've cobbled together in Python or AppleScript myself, just to do automation on my local computer or scraping of one particular website. I did a project with content from RT several years ago, and so I created a Python scraper that just scraped news articles from RT from a big list of URLs. But it's really not good for any other websites. Literally, I wrote it just to scrape RT. Yeah. Do you teach a course here at the U? Like, if we have students interested in really diving into the qualitative research and the tools, is there a course they should sign up for? I don't have a course specifically about the digital tools. I do teach our undergraduate course on qualitative methods in the communication department. That's, I think, COMM 3700.
And then we do use some of these tools in a more topics-based course I teach, COMM 5610, which is Information Technology and Global Conflict. One of the projects we do is using various different social media monitoring and capture tools to follow international issues of interest to students throughout the course of the semester. Every couple of weeks we change tools, use a different tool, and sort of explore what it can do and how it works and stuff like that. Yeah, but it might be cool to put together a full course on just using these tools. Yeah, I'm still basically at the dummy level when it comes to that. But the documentation for a lot of the command-line tools is actually pretty good, you know, if you're willing to play around and just read the documentation. And a lot of times the people that make them are super nice, so if you're having a problem figuring out how to use one, if you just message them, they'll often help you, because they're creating these things as a passion project and they want people to use them. And if they can help you, usually they will. And then Python, you know, that's a little more of a commitment, to really start learning some Python. But quite frankly, I wrote my scraper in, I don't know, maybe a couple of days. And I had only ever done really basic, print-hello-world kinds of Python to that point. At that particular moment, this was in like early 2017, there really was not another tool that was going to work for me to get all of the RT content that I needed in a plain-text format that I could then import into NVivo or MAXQDA or something like that to really do the analysis. And I was like, I'm going to have to figure this out.
And so I literally just spent two intense days really learning enough Python and learning the Beautiful Soup package, which is what you use for scraping, and, you know, looking at a lot of Stack Overflow posts and YouTube videos, to hack together some crappy script. And it did the job: I scraped like 850 RT articles with it, and it did what I needed it to do. Is it pretty? Could a person who's a real coder probably do a lot better? Yeah, for sure. But it didn't take that long and I got what I needed, you know. Now, I haven't used a lot of Python since then, so if I need to do it again I'd probably have to start over at zero, but it's doable. It's doable much more quickly than most people realize, I think, if you really just focus in on it. The presentation reminded me of a really good presentation we had here like four years ago. I'm sure David remembers, when Sarah Sunwell did her research on #GiveElsaAGirlfriend. She had captured over 600 tweets just exactly as you said: screenshot, save to PDF, and then transferred it into NVivo or something. Yeah, that's a lot of work. Yeah, yeah, TAGS could save 600 tweets for you much quicker. Yeah, she's like, I'm sure there's an easier way, but I don't know how to do it. The one time, though, when doing it the manual way is better is if you really need the full visual context of the tweet, right? So if you're saving using Twint or TweetBeaver or TAGS or any of those kinds of tools, they're mostly doing what you were asking about: they're saving it directly into a spreadsheet. So you're getting the text, you're getting all the metadata, and if there's an image there, you're just getting the link to the image, right, or the video or whatever the multimedia content is, and that's all you're getting if you want to