Thanks very much. For anybody who would like to follow along, there's a short link for the slides: bit.ly/TidyTuesday. That's bit.ly, forward slash, Tidy with a capital T and Tuesday with a capital T. Thanks for the opportunity to talk with you today about an online data visualization learning community and what we found out about how accessible it is for blind people using screen reading software.
So who are we? My name is Silvia Canelón. I'm a postdoctoral research scientist at the University of Pennsylvania, where I use the R programming language to analyze electronic health record data.
I'm Liz Hare, and I've been working with R for about 15 years to do genetics and genomics data analysis. I met Silvia in MiR, which is an organization of people from minority backgrounds working together to make R a better space for everybody. Silvia had an interest in data accessibility, and she had the idea to scrape the information being generated by the Tidy Tuesday project so that we could see how often descriptive text was being added so that blind people could follow the data visualizations.
There are a few things that we wanted to give you to take away from our talk today. There are a lot of online learning communities around R and other data science software, mediated in places like Twitter and Slack, that are very supportive, and people are really enjoying participating. Tidy Tuesday is a specific data visualization project using the tidyverse packages in the R open source statistical programming language. We found that 3% of the tweets displaying data visualizations had alt text, which would describe the contents of the data visualization in a way that could be read by a screen reader. 84% were described only as "Image", which was just a default placeholder when the user didn't enter anything.
So we're still wondering about how blind people can participate in these data science learning communities. You may be asking why blind people need access to data visualization. I know that there are people who think that it's a form of communication that can't be transmitted in any other way, but most of the time it does need to be. There are blind scientists who need to be able to read the scientific literature, contribute to it, and present their work. There are more general applications in current events, things like election maps and public health. We all would like to have access to knowing the COVID infection rates in the communities around us.
There are some statistical software packages like form and SAS that will use vector oriented graphics to produce data visualizations that have aspects that can be grabbed by a screen reader, but R doesn't have any of that capability at this time. Screen reading software provides voice or braille access to text that's on the computer screen. It does not describe graphics. You may be familiar with the artificial intelligence image descriptions that are newly provided by some operating systems and by Facebook and LinkedIn. These processes have not been developed enough that they could meaningfully describe data visualizations. Then we wondered specifically about Tidy Tuesday and the ability of blind people to participate.
What is Tidy Tuesday? As Liz mentioned earlier, it's a social project that happens on a weekly basis, and it's part of a broader online learning community. It provides participants an opportunity to practice summarizing and visualizing data with modern tooling like the tidyverse packages. On the right hand side of the screen is an embedded tweet, which is an invitation for people to participate on Twitter every week. This tweet was sent out by Tom Mock, who leads the social project.
He sent this out last night in preparation for today, which is Tuesday, and so we can see that the tweet includes information about the data set and where it comes from, about the project itself, and some relevant links that people can follow.
Then on this slide, on the left hand side, is an embedded tweet showing an example of what we're calling Tidy Tuesday submissions, or data visualization shares: what they might look like and what they might include. In the body of the tweet itself there might be a verbal description. Maybe the author shares their interest in the data or what they found, maybe they share something about their learning process, maybe they tried a new function or a new package, and they may also choose to include some interpretation of the data. The tweet can also include a link to the author's source code, which is another really important aspect of the social project, as participants are encouraged to share their code openly so that others may learn from the process that went into creating the data visualization and then reproduce it or modify it for their own needs. And then of course the tweet will generally have an image attached of the data visualization that was created, but rarely do these images have text descriptions, or alt text, attached to them.
So how were the data collected? The tweets for Tidy Tuesday have been collected since April of 2018, which is when this project started. Tom Mock has been collecting that data with the rtweet package from rOpenSci, and he makes it available in the Tidy Tuesday repository. What I did then was use that data to identify the link for each individual tweet, do some processing, and use the RSelenium package, also from rOpenSci, to scrape the alt text attribute that corresponded to the image in each of those tweets.
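
As a rough illustration of what "scraping the alt text attribute" means, here is a minimal sketch in R. It is not the speakers' RSelenium code: it assumes the xml2 package is installed, and it parses a small, made-up HTML fragment instead of driving a browser to live tweets. The fragment, file names, and alt text are hypothetical.

```r
library(xml2)

# Hypothetical HTML standing in for the markup around three tweet images.
# Real Twitter markup is far more complex and is not reproduced here.
tweet_html <- '
<div>
  <img src="plot1.png" alt="Bar chart of penguin counts by species.">
  <img src="plot2.png" alt="Image">
  <img src="plot3.png">
</div>'

# Parse the fragment and collect every <img> element
page <- read_html(tweet_html)
images <- xml_find_all(page, ".//img")

# Pull the alt attribute from each image; images without one yield NA
alt_text <- xml_attr(images, "alt")
print(alt_text)
```

In this sketch an image with no alt attribute comes back as `NA`, so it can be told apart from an image carrying the default "Image" placeholder.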
So the right hand side now shows a screenshot of what it looks like for me if I'm in my browser, in Firefox, and I open up my web inspector. If I'm looking at the tweet that I shared in the last slide, you can see that on the left hand side there's some HTML code that corresponds to that tweet. The way that RSelenium helped me in this process was essentially by running a browser that would travel to each of these individual tweets, find the image that was attached, and then scrape that alt text attribute. In this particular screenshot we can also see the alt text for the image underneath the image itself, in a block of text. That's visible to us because there are certain extensions you can install in your browser that let you see the alt text for different images, so that's what is showing up here.
So what did we find after scraping? We found that over the three years of the Tidy Tuesday project there were over 7000 data viz tweets, and only 215, or 3% of them, had alt text. Participation in Tidy Tuesday has increased over time, but as we can see from these line graphs, the use of alt text is recent and remains really low. What we see on the right hand side is a line graph where there's a gray line corresponding to all the tweets submitted as part of this project over time, and then there's another line in a darker color that shows which of those tweets also have alt text attached to them. We can see that for the most part there has been increased participation in the project over time, but there hasn't been much change in how alt text is used in this context, except for at the end, where we see this spike. That corresponds to a conversation that was had on Twitter when Liz and I found out we would be presenting this talk today. There was some conversation about our preliminary findings, and some changes were implemented because of that.
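
The tally behind figures like "3% had alt text" and "84% were described as image" could be sketched along these lines in base R. This is an assumption about the approach, not the speakers' analysis code: the input vector is made up, and `classify_alt` is a hypothetical helper name.

```r
# Hypothetical scraped values; NA means no alt attribute was found.
alt_text <- c(
  "Line chart of weekly tweet counts.",
  "Image",
  "Image",
  NA,
  "Image"
)

# Label each value as described, placeholder ("Image"), or missing
classify_alt <- function(x) {
  cleaned <- trimws(x)
  ifelse(
    is.na(cleaned) | cleaned == "",
    "missing",
    ifelse(cleaned == "Image", "placeholder", "described")
  )
}

status <- classify_alt(alt_text)

# Share of each category, as a percentage of all images
shares <- round(100 * prop.table(table(status)), 1)
print(shares)

# The talk reports 215 tweets with alt text out of "over 7000";
# 7000 is used here only as a lower bound for the denominator.
reported_share <- round(100 * 215 / 7000, 1)
print(reported_share)
```

Separating the "Image" placeholder from real descriptions matters because the placeholder is technically non-empty alt text but conveys nothing to a screen reader user.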
So Liz, could you tell us where else alt text is missing in addition to this project?
Yeah, there's a lot of great open source material, books and tutorials, available on the internet, and they're all wonderfully searchable by Google so that you can answer the questions you have while you're coding. But many of them are lacking alt text for their data visualizations, and sometimes they even display code as PNG images, which can't be read by a screen reader. This problem is also found with Bookshare, which is a nonprofit that provides access to electronic books for people with print disabilities. They work with publishers, who often don't provide alt text for images. Bookshare doesn't provide it themselves, and again, sometimes the actual code snippets are missing from those books and manuals. Some R package documentation has data visualizations in it to show you how these interesting new tidyverse kinds of graphs look and come out, and it's really difficult, as someone who's trying to display data and can't see those, to figure out what the code actually produces. There are also a lot of R blogs that have educational components but again are lacking alt text or lacking access to the code. This happens both at the individual level, as we mentioned with things like blogs, and at the corporate level, where RStudio's documentation on their websites is often missing these accessibility features.
So once we had scraped these alt texts, we were interested in asking what about them makes them good and what makes them effective, based on my experience online and reading media and also going through each of the alt texts that Silvia scraped. The first item that I felt was really important was that something is shared about what the data is showing, what meaning is being conveyed. 34% included that. 28% included a description of what variables were on the axes of the data visualization.
12% had some indication of the scale of the axes, or a way to find out what the scale was from the description of the data. And this one is pretty important too: you need to tell us what kind of graph it is. Is it a line plot? Is it a scatter plot? Is it a QQ plot or a visual plot? 56% provided that. I'd also like to add that there are a lot of really cool, modern, and also some very specialized kinds of graphics, especially within the tidyverse. If you're using something that's a little less common, it would be really good to provide some orientation to how the data visualization works and what it's saying.
So we saw in the line graph a couple of slides ago that there seems to be something changing now, and so we wanted to ask: has the tide started shifting? Are we noticing changes outside of this as well? We know that conversations are happening within R in different spaces. One great example is the useR! conference that's happening this year, which Liz is a part of as one of the organizing volunteers. The conference itself has been very intentional about centering accessibility practices in how it's delivered, in a way that makes it as accessible as possible for participants who are attending as speakers or who are just attending to participate in the different talks and tutorials. And as Liz mentioned earlier, the MiR community is also a space where we're having conversations like the one we're having today.
There's an accessibility committee within that community that Liz and I are a part of, where we regularly meet and discuss these things. And then within Tidy Tuesday in particular, as I mentioned, when we shared our preliminary findings on Twitter, the leader of the social project, Tom, was able to add a whole section to the Tidy Tuesday repository inviting and encouraging participants to include alt text in the data visualizations that they share on Twitter. It includes resources as well, so participants have a place to start from.
More broadly, we have also noticed conversations happening in other places. The Data Visualization Society had a conference earlier this year, Outlier Conf, that featured a few different talks about data visualization accessibility, and a variety of a11y conferences and meetups are also having these conversations. There are also some changes happening with tooling within R. One good example is that within R Markdown, which is very popular, there is now the ability to add alt text as an option to a code chunk, so that whatever graphical output that code chunk is producing can come equipped with alt text as well. In the same way that one might add a figure caption, now we can also add an alt text tag to that graphical output. But it is important to note that these changes are great, but they are just starting and they're slow.
So what can you do? You can make your data visualizations accessible everywhere that you can. This includes adding alt text to websites, journalism, scientific publishing, social media, and other spaces. And in situations where alt text isn't available in your particular document or platform, try to find creative ways to describe your visualizations in words.
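
One way to put this advice into practice is to assemble alt text from the elements Liz described earlier: the kind of graph, the variables on the axes, the scale, and the meaning being conveyed. The sketch below is only an illustration in base R. `compose_alt_text` is a hypothetical helper, not part of any package, and the axis range in the example is made up.

```r
# Hypothetical helper: build alt text from the four elements named in the talk
compose_alt_text <- function(chart_type, x_var, y_var, scale_note, takeaway) {
  parts <- c(
    # Kind of graph and the variables on each axis
    sprintf("%s of %s (y-axis) against %s (x-axis).", chart_type, y_var, x_var),
    # Scale of the axes
    scale_note,
    # What the data is showing
    takeaway
  )
  paste(parts, collapse = " ")
}

# Example values are illustrative; the axis range is invented for this sketch
alt <- compose_alt_text(
  chart_type = "Line chart",
  x_var = "month",
  y_var = "number of tweets",
  scale_note = "The y-axis runs from 0 to 400.",
  takeaway = "Total tweets rise over time while tweets with alt text stay near zero."
)
print(alt)
```

A string like this could then be supplied as the `fig.alt` option of an R Markdown code chunk, the alt text chunk option mentioned in the talk, or typed into the alt text field when attaching the image to a tweet.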
Liz and I have provided a short list here of some resources, including a great blog post by Amy Cesal, the new Chartability audit workbook by Frank Elavsky, another resource as well, and a few links specific to Twitter: How do I even add alt text to the images that I attach? How can I see the alt text if I want to? What kinds of extensions are available? Are there extensions available to help me remember to attach alt text to my images? There's a resource for that too, specific to Firefox, but other browsers will have them as well. So thanks so much. The resources, Liz's analysis, and a link to the TidyTuesdayAltText package that we used for this analysis are available at the link on the slide. Thanks so much, and we're excited to take some questions.
Fantastic, thank you so much for sharing this amazing and important work that you've been doing, and tips for us all to do better. There are lots of questions, but I think we only have time for one. Jess asks: are there any good resources for testing what the screen reader experience is like for people, or accessibility in general?
Yeah, so I think a great place to explore would be some of those a11y resources. a11y stands for accessibility, and there are a lot of conversations there that happen within the context of web development, with user testing and screen reader testing as part of that whole process. So I would encourage folks to check out some of those resources.