 I want to make a couple of things very clear first. One of those is I'm just a data scientist who happens to be blind. So if you were looking for an expert on research into students with disabilities in the classroom and things like that, you needed to bring someone else. But what I can do is offer a lot of perspective as someone who has been through a number of chunks of the pipeline for data science and some of the experience that I've learned both from myself and also from colleagues as well as some of the work that Drew has mentioned working with a number of people at organizations like BioQuest to actually think about some of this stuff very seriously and put together frameworks for making data science more accessible. And this is a very positive talk. So I'm gonna be talking about barriers but I'm gonna be trying to help you see that this is stuff we can deal with. So where we're going here, I'm gonna do a brief introduction, get a few terminology things down, not because I think they're the ultimate definitions but they're the ones I'm using. We're gonna talk about what disability is and we're also going to talk about the question of why we should actually include more people like me because it's worth asking and being upfront about. And then we'll talk about both social and technical barriers to people with disabilities participating in data science and some summary thoughts. So some introduction. What is disability? Disability actually has a lot of different definitions depending on the context. The first one that you will encounter the most often probably is the very legal and medical model for disability. This is the verbiage you will see in most federal and state laws that are protecting against discrimination for instance or establishing eligibility for certain things and that's what this definition is useful for. And it's any physical or mental condition that substantially limits one or more daily life activities. That's a lot of stuff. 
So disability is not just deaf, blind, wheelchair user. It goes way beyond that and I think most of this room probably knows that already. The broader model, which doesn't focus as much on what is wrong with you, how do we fix you, what do we give you, is the social definition that says that disability arises from the design of the environment around people, not their specific impairment or difference. And like I say, both of these definitions do actually have utility. The first one, again, for protections and for establishing who's eligible for certain things. But the second definition is a lot more useful in the context of instruction, and that's where I'm really coming from in this talk: training in data science and a little bit on the recruitment side. So these are the definitions that I'm using for terms. Again, I don't wanna start a fight about definitions. This is just how I'm using them. And actually I've already learned something at this conference about this first one, universal design. I had an attendee be brave enough to come up and tell me that they feel like universal design actually excludes them, which is really interesting to me and something that I didn't know. And I think it's because a lot of times this term gets used very specifically. It has a very physical definition. It was coined in the 1980s by an architect, and the idea is designing an environment so that it is as usable and accessible as possible for all people. I would encourage us to take a broader definition and look at it as the pinnacle of inclusion. It is the top of the mountain. If I'm truly designing universally, it shouldn't have that specific definition. It means this works for everybody. That is a big goal and you don't have to reach it in one step, like we just heard in the previous presentation this morning. So I would encourage you to also think about inclusive design, not as a state. Universal design to me is a state of things.
It is when everything has become universally designed. Inclusive design is how we get there. And inclusive design is a verb. It is the thing you do. It's how you climb the mountain. And accessibility is one measure of how good a job you're doing at inclusion. And I wanna highlight that yes, I am focusing on accessibility here. It doesn't mean I think it's the only important measuring stick. Finally, when I say accessibility in this talk, I'm actually using a very narrow definition and concept of it. I'm talking about it from the disability context: can a participant with a disability engage with the content and the learning? I am not talking about availability. Okay, so with all that out of the way, why should we do this? Well, first of all, because people with disabilities are an asset. You want us. When you look at what recruiters for training programs and people hiring in this field are looking for, they want people who can come up with creative solutions to complex problems. They want collaborators. They want people who can communicate and advocate effectively. This is what we do every day. We come up with workarounds for complex problems. You have to be able to collaborate with people, otherwise you're not going to get what you need. You have to be able to work with people. You can't just make demands. You have to be a good collaborator if you've got a disability, especially if you know you have it, and we're going to talk about that in a minute. And finally, you have to be a good communicator and advocate. And in a day and age when we're all talking about needing more science advocacy, needing people who can communicate about science across cultural boundaries and community boundaries, we're good at that. So another thing is we perceive the world differently.
This is obvious to most people in this room. Different points of view are good. This is my "because the law says you have to" bullet point. Our representation is growing and you do need to accommodate. What I'm going to help you do today is figure out how you can minimize how often you actually have to do accommodation. It is the morally right thing to do. And finally, and again, we've already heard that this shouldn't be your only justification for doing things, but I think it's worth including: inclusive design and inclusive teaching practice actually help everyone. A rising tide lifts all ships. So let's talk about some barriers and some ways to fix them and areas where I think that there's a lot of opportunity for improvement. So we're going to talk first about some social ones, and I'm going to go through these pretty quickly because this audience is going to be pretty familiar with some of them. But I think it's important to show common threads and also highlight differences where they arise. So here's a common one. Low expectations. This kills everything. This is a non-starter. And it's a hard nut for us as a community to crack because it starts at K-12 or ahead of K-12. If a person's parents, peers, teachers, mentors don't think that a particular career or a particular path is an option for them, then neither will they. We are a logical species sometimes. So the fix for this is simply to remember, especially in the case of disability, but I think this is true across the board: a lowered expectation is not an accommodation, it's discrimination. And what you want to do instead is adjust your expectations. You may have to go sideways. They may have to complete a different task, demonstrate a different skill, something like that, to show you that they've accomplished your learning goals. But your learning goal, your expectation, remains the same. The path may be different. So this is a big one and I still get this a lot.
"Students with disabilities don't take my classes. I'll be happy to accommodate them when they show up." First of all, you're probably wrong. A lot of people with disabilities don't know they have them and they shouldn't necessarily have to. And a lot of people with disabilities may not tell you because there's stigma. They don't want to walk up and say, hi, I have PTSD. Okay, that's not going to be a comfortable thing to do. I probably would, but you guys know that by now. So the problem with this approach is that it creates extra work for you. This idea of accommodate later, accommodate when it comes up, individually each time, results in duplication of work, extra work for both you and the participant. And the participant doesn't get the same learning experience that their peers are getting because you are both scrambling. So the way to get around this is not super straightforward. I mean, it is kind of straightforward, which is to realize that the need for individual accommodation is never going away. I'm not going to say that it is. That does have to happen sometimes. But you can minimize the amount of time you have to spend doing it by designing inclusively, and yes, I promise some concrete suggestions for doing that are coming. Basically, this is my trash can social barrier slide because everything falls into it. Assumptions are a problem. The two biggest ones: the first is where people hear that they've got a participant with a particular disability and they start jumping to conclusions like, okay, blind person must need Braille. That kind of thing, assuming that you can predict what someone does or does not want, or can or cannot do, and that goes back to expectations, is a problem. The other one is what I've stressed before, which is that not everybody knows they've got a disability. And so there's an assumption that you always do and that you know what you need.
So to get around that, there are several different strategies you can take. The biggest one to take home from this, on the fix side, is to focus on your learning goals and then work out what kind of outcomes need to happen. To mitigate the impact of other assumptions, realize that considering multiple possible disabilities and trying to design around them in your design phase is great, but you don't have to think of everything. Inclusive design doesn't mean come up with your full list of disabilities and then come up with accommodations for each one. It just means make it so that those are irrelevant. All right, so let's talk concrete things. What are some of the actual physical barriers to participation in the data science field? Computers, in and of themselves, can be a problem. Not everyone with a disability knows which operating systems, which features within those operating systems, or which third party software works really well for them. So the fixes for this are twofold. First of all, when you are designing instructional materials, try to be operating system agnostic. Try to make it not matter what system I run your tools on. And start familiarizing yourself with the accessibility stuff that's actually built in to most of the common operating systems. Here's the Ease of Access panel in Windows. It's so expansive, because Microsoft has done a decent enough job, that I can't even show you the whole thing in one screenshot. And it's got built-in rudimentary screen reading capability, and for those who don't know what a screen reader is, the idea is I walk up, take your monitor away, and you can run the computer because it's talking to you. They've got magnification tools, ways to adjust color schemes so that they're colorblind friendly, all sorts of good stuff. And Apple people, I haven't forgotten about you. Mac has been doing this since the early to mid 90s when they did some of their first GUIs. They're good at it.
They actually know how to do this. And this is under the Apple menu in the preferences. If you search for it under your little Spotlight thing, you'll find it too. So, these are third party options. And again, you can get familiar with these. Familiarize yourself with them, learn how to use them. The top two there, JAWS and NVDA, those are fully supported screen readers that can read literally with no monitor, like I said. NVDA is free and open access and the learning curve is not that steep. "I have never used a screen reader and don't know how it works" is no longer an excuse because it doesn't cost $1,000. There's other software up there, Kurzweil and so on (these slides will be available later), that can help students with print disabilities. The most common one of those you'll know is dyslexia. It helps with reading documentation: having it read out loud, formatting it in a way that works, providing highlighting as it reads, and so on. And finally, you can even have somebody who physically cannot interact with a computer keyboard and mouse, and they can run a computer with software like Dragon. Everyone can do data science if you are familiar with things that can help them. So now I'm going to go back here to the rest of the fixes for this. Once you've done that familiarizing, set aside time in your training, in your workshop, in your course to actually tell everybody about it. You don't have to find the particular group of students that need to hear about it. Tell everyone. You don't know who you're going to help. You may have people who may not consider themselves to have a disability but are like, wait, I can make that screen more high contrast? That's great, because I have a really bad astigmatism, for instance. Okay, web accessibility and application accessibility. This is a big one. Not all things are designed accessibly. Inaccessibility takes many different forms.
Things can be completely incompatible with a screen reader. The screen reader can just land on something and say, I don't even know what this is. It can be just a poor choice of colors. It can be very difficult to read but kind of pretty fonts. No Comic Sans. And it could just be poor layout, poor user interface design. This is a problem for everybody. And that's something that I hope is coming through as a common thread. So, the way to try to address this and work through it is to design courses and trainings to be as fluid as possible about which applications, which tools I use to complete them. Now I totally understand that your whole training may be on a specific tool and that's fine, but try to find areas where there is flexibility so that people can use what works for them, what they have experience with, and what they've set up for themselves. Definitely consider accessibility when you are designing things. Get help. If you just start Googling for accessible application development or accessible web design, you'll get a flood of resources. A lot of them are strictly for vision, but there's a lot more coming out for working with participants with learning disabilities and so on. So, the other thing is, and we've talked about this at this conference, this goes back to the last talk as well: advocate for this with your peers. Ask them to do the same thing. We're preaching to the choir to some extent here. Go home and talk to people. Make this an expectation. So, another barrier is coding in and of itself. This is central to a lot of data science stuff, right? You're gonna code. And it's not that people can't access computers. We've just talked about that. That's doable. It's actually sometimes the ethos and the way a language works in and of itself. And Drew, you might need to get the volume. But I have a video to show you what I'm talking about. Here we have a small block of code written in three different ways.
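The slide shown in the video is not part of this transcript. As a rough reconstruction from the screen reader narration, the "Python with spaces" version might look like the lines stored in `SNIPPET_LINES` below. That reconstruction is an assumption: the underscore in `print_numbers` and the indentation widths (two and three spaces, following what the speaker counts aloud) are guesses. The helper `describe_indent` is a hypothetical addition, not a tool from the talk. It spells out each line's leading whitespace in words, which is the detail a line-by-line pass with a screen reader may not announce.

```python
# Reconstructed (assumed) text of the "Python with spaces" slide.
# It is stored as plain strings and never executed.
SNIPPET_LINES = [
    "# This is the Python version with spaces",
    "",
    "list = [1, 2, 3, 4, 5]",
    "",
    "def print_numbers():",
    "",
    "  for element in list:",
    "   print(f'The number is: {element}', end='\\n')",
    "",
    "print_numbers()",
]


def describe_indent(line: str) -> str:
    """Return the line prefixed with a spoken-style count of its indentation."""
    stripped = line.lstrip(" \t")
    if not stripped:
        return "blank"
    lead = line[: len(line) - len(stripped)]
    tabs = lead.count("\t")
    spaces = lead.count(" ")
    parts = []
    if tabs:
        parts.append(f"{tabs} tab" + ("" if tabs == 1 else "s"))
    if spaces:
        parts.append(f"{spaces} space" + ("" if spaces == 1 else "s"))
    prefix = ", ".join(parts) if parts else "no indent"
    return f"{prefix}: {stripped}"


if __name__ == "__main__":
    for line in SNIPPET_LINES:
        print(describe_indent(line))
```

Run as a script, this prints one line per source line, for example `2 spaces: for element in list:`, so the nesting can be heard without arrowing across each line character by character.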
In each case, it does the same thing: creates a list of numbers, loops over those numbers, and prints each element with the phrase "The number is:" in front of it. Each of these works. The first is written in Perl. The second is written in Python with tabs to show indentation level. And the third is written in Python with spaces to show indentation level. Let's just read through these line by line with a screen reader as though we were proofreading them and see how they sound. Number this is the Perl version. Blank. My at list equals left paren, one comma two comma three comma four comma five right paren, semi. Blank. For each my dollar item left paren, at list right paren, left brace. Tab print quote the number is colon, dollar item backslash n quote semi. Right brace, blank. Number this is the Python version with tabs. Blank. List equals left bracket one comma two comma three comma four comma five right bracket. Blank. Def print numbers left paren, right paren, colon. Blank. Tab for element in list colon. Tab, tab print left paren, F tick the number is colon, left brace element right brace tick comma, end equals tick backslash n tick right paren. Blank. Print numbers left paren, right paren. Blank. Number this is the Python version with spaces. Blank. List equals left bracket one comma two comma three comma four comma five right bracket. Blank. Def print numbers left paren, right paren, colon. Blank. For element in list colon. Print left paren, F tick the number is colon, left brace element right brace tick comma, end equals tick backslash n tick right paren. Blank. Print numbers left paren, right paren. Blank. Blank. Okay. So with the first version, Perl, we get a lot of verbosity. It could be a little bit overwhelming to hear all those different symbols, the left braces, the right braces, the semicolons, et cetera. But at least I know where I am in the code at all times. Once I've heard a left brace, I know I'm in a loop until I hear a right brace.
Whenever I hear a semicolon, I know that the statement ahead of it is finished. Contrast that with Python, where in the first case, with the tabs, it's a little more pleasant to listen to. There are fewer symbols being thrown around. But we have to be really careful to listen to how many tabs are at the front of each line to hear where the definition of the function starts, where the for loop starts, where the things that are supposed to happen in the for loop start, and so on. The third version is especially problematic because we actually didn't get any report of how many white spaces were in front of each of those lines, because it uses spaces instead of tabs. Turns out that reading line by line isn't going to give you that with a screen reader. The only way to do it is to go up to the line and arrow over with the right arrow key, something like this. Print for element in list, space, space, f. So there were two in front of the f. Print left paren, space, space, space, p. And three in front of the p in print. So I know that print is under the for loop, which is how it should be. And I haven't made a mistake. But you can see how each of these presents differences that could present barriers. And this is how the ethos of a language, and even just the way code is written, can present potential barriers to people with disabilities. So the way we can get through some of that is to be code agnostic where we can. Let participants use the code that works for them to analyze the data that they can. Or present multiple types of code, show multiple ways to do things. Have multiple modes of instruction and multiple modes of measuring it. Allow people to pick what environments they use to write their code in. Maybe they want to use just a text editor. Maybe they want a full integrated development environment where they can run everything at once. Try to provide flexibility. Flexibility is another common thread. Let's talk a little bit about mathematical notation.
Mathematical notation itself can be a barrier. And I'm not talking about math. I'm talking about how it's written. Screen readers and computers in general have trouble interpreting mathematical equations just as written because they are symbolic representations of abstract concepts and discipline matters. And even if you can get it read out loud to you, then it's going to read similar to the example you just saw with the code. It's not going to read the equation the way you and I would read it in sort of a paragraph form description. It's going to read it from left to right, top to bottom. And even if that isn't the case, let's say it can say it all perfectly. Do you want to learn calculus from an audio book? So what's the solution for this? Well, maybe it's Braille math if you're a blind student. Guess what? A lot of K-12 blind students don't get Braille math instruction unless someone pushes for it. And even in that case, if I can write it, can you read it? No. So we need ways to translate it. So how do we get around this? I think having alternative text descriptions in words of equations that you are providing to participants is a great idea because it isn't doing any work for them. But it does allow them to take that equation and render it in whatever way they want. And finally, I think we as a community could work to come up with better ways to represent this stuff and maybe even apply computer vision to taking these equations that we write down and actually interpreting them into natural language. I have colleagues that teach models to recognize proteins by pictures. I'm pretty sure we can teach them to read math. So lastly is data visualization. OK, the term is the first problem. Visualization. We need to get past that. And this is where Drew and I have talked a little bit about trying to get people to call it data sensation and work with different ways of representing data. 
For the past several hundred years it has been two-dimensional graphical representations that we pass around to each other. This doesn't work anymore. This doesn't work for anybody, let alone people with disabilities. And so for the remedies, I think we've got several things. First of all, again, this flexibility issue. When you are providing data that you want participants to work with in a training, provide it multiple ways. Give them the nice graphical representation, ideally colorblind friendly. Provide a representation that's just straight black and white, in just lines, because that's easy to render into a tactile, touchable image with raised lines. Provide the tabular version. Maybe somebody works better with that. I often do. Provide the raw data so that I can render it however I need to in order to work with it. And we need to explore as a community, for ourselves and for broadening participation by people with disabilities, using machine learning, augmented reality, audio feedback, and haptic feedback (that's anything that vibrates or does anything by touch) to actually allow all of us to explore our data in more rich and interactive ways. And that will help everyone. Here we have a simple chart. This is how I have to look at graphs. What it comes from isn't particularly important. It's just some data exploration I was doing as part of my research a little while back. What is important is that already you have probably gotten the key features from this chart. You know what the axis labels are, you know what the scale is, and you know the general behavior of the data. But here's how I have to look at this in order to read it. This is about six or seven X magnification. And before I do too much work visually, let's just at least see if the screen reader can help me out at all. Can I get a value for this bar? No. Can I get a label for the axis? No. And that's because this is just a picture.
It's a PNG, like we'd output from R, Python, or whatever your favorite analytical tool is. There's no metadata that the screen reader can use to tell me anything about this chart, except maybe a file name. So I'll work visually. And this process would be similar, if a little faster, with a tactile image. Start at the bottom left, and I'm going to go over and up. And what I'm seeing is that as things increase on the X axis, Y is jumping, and that jumping sort of becomes less severe as we get to a certain point. But what is that in relationship to? What's going on with the axes themselves? So this is transfer cost with dnl equals 1. And up on the Y, I have the mean number of losses. So what I can say now is that as cost of transfer increases, the mean number of losses is jumping up until it gets to a point where it doesn't jump quite as severely. But what about this particular point right here where the jumps become less severe? What's this bar about? So I follow it down to the X axis, and it's at a transfer cost of 7. I remember what that axis label is. So I come back up here, and it's just a little bit over this line. So this line is at a mean number of losses of 100. And we're a little bit over that line. But what's the scale? Now I've got to find this line, OK? That's at 80. So let's call it 105, 110 as the mean number of losses when the transfer cost is 7. This is how I have to go through this graph and you get the point. This takes time. So at the very least, we need to recognize that students who have to access data in different ways and look at it differently are going to need to spend more time. OK. So the rest of this is just two slides with summary points. And because we're close on time, I'm not going to go through them because I've been trying to summarize as I go. The key points that I really want to hit are that you don't have to be a superhero at this. You can do this slowly over time. But every little bit helps. We need to demand it of our colleagues.
We need to really encourage our funders, our NGO partners, our professional organizations to actually start thinking about this and recognizing it and funding it. And focus on learning goals, not specific tasks, and how they're going to be completed. Inclusive design does not mean design individually for each person and try to come up with all the things that you could possibly run into and accommodate as you go. It means make accommodation unnecessary. And with that, I'll take some questions.
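As one small illustration of the "provide it multiple ways" advice for charts from earlier in the talk, here is a minimal sketch that produces a plain-text description and a tab-separated table to ship alongside an image. The function name `describe_chart` is hypothetical, and the numbers in `EXAMPLE_POINTS` are made-up placeholder values, not the speaker's data. Only the axis names come from the chart walkthrough.

```python
def describe_chart(x_label, y_label, points):
    """Return a text description plus a tab-separated table for a chart.

    points is a list of (x, y) pairs in the order they appear on the x axis.
    """
    xs = [x for x, _ in points]
    lowest = min(points, key=lambda p: p[1])
    highest = max(points, key=lambda p: p[1])
    lines = [
        f"Chart of {y_label} by {x_label}.",
        f"X axis: {x_label}, from {min(xs)} to {max(xs)}.",
        f"Y axis: {y_label}, lowest {lowest[1]} at {x_label} {lowest[0]}, "
        f"highest {highest[1]} at {x_label} {highest[0]}.",
        "",
        f"{x_label}\t{y_label}",  # table header row
    ]
    lines.extend(f"{x}\t{y}" for x, y in points)
    return "\n".join(lines)


# Placeholder values for illustration only (assumption, not real results).
EXAMPLE_POINTS = [(1, 20), (3, 60), (5, 90), (7, 108), (9, 112)]

if __name__ == "__main__":
    print(describe_chart("transfer cost", "mean number of losses", EXAMPLE_POINTS))
```

The resulting text could be saved next to the image file or used as its long description, so a screen reader user gets the axis labels, the range, and every value without having to magnify the picture.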