Hey everyone, welcome to theCUBE's coverage of Women in Data Science 2022. I'm Lisa Martin, and I'm here with one of the featured keynote speakers for this year's WiDS event: Cecilia Aragon, a professor in the Department of Human-Centered Design and Engineering at the University of Washington. Cecilia, it's a pleasure to have you on theCUBE.

Thank you so much, Lisa. It's a pleasure to be here as well.

You have an amazing background that I want to share with the audience. You are a professor, a data scientist, an aerobatic pilot, and an author with expertise in human-centered data science, visual analytics, aviation safety, and the analysis of extremely large and complex data sets. That's quite the background.

Well, thank you so much. It's all very interesting and fun.

And as a professor, you study how people make sense of vast data sets, using a combination of computer science and art, which I love. And as an author, you write about interesting things: how to overcome fear, which is something everybody can benefit from, and how to expand your life until it becomes amazing. I need to take a page out of your book. You were also honored by President Obama a few years back. My goodness.

Thank you so much. Yes, it's been quite a journey to get here, but I feel really fortunate to be here today.

Let's talk about that journey. I'd love to understand whether you were always interested in STEM, or whether it was something you got into later. I know that you are the co-founder of Latinas in Computing and a passionate advocate for girls and women in STEM. Were you always interested in STEM, or did you come to it by a non-linear path?

I was always interested in it. When I was a young girl, I grew up in a small Midwestern town, and my parents are both immigrants. I was one of the few Latinas in a mostly white community. And I loved math, but I also wanted to be an astronaut.
And I remember when we were asked, I think it was in second grade, what would you like to be when you grow up? I said, oh, I want to be an astronaut. And my teacher said, oh, you can't do that. You're a girl. Pick something else. And so I picked math, and she was like, okay. Well, maybe it would be better to say I never really lost my love of being up in the air, and potentially in space, but I ended up working in math and science. And I loved it, because one of the great advantages of math is that it's kind of like a magic trick for young people, especially if you're a girl or if you are from an underrepresented group. Because if you get the answers right on a math test, no one can mark you wrong. It doesn't matter what the color of your skin is or what your gender is. Math is powerful that way. And I will say there is nothing like standing in front of a room of people who think little of you and silencing them with your love of numbers.

I love that. I never thought about math as power before, but it clearly is. And I wish we had more time, because I would love to get into how you overcame that fear. I mean, you write books about that. But being told you can't be an astronaut, you're a girl, and maybe being laughed at because you liked math: how did you overcome that and say, never mind, I'm doing it anyway?

Yeah. Well, the short answer is I had incredible imposter syndrome. I didn't believe that I was smart enough to get a PhD in math and computer science. But what enabled me to do that was becoming a pilot. I learned how to fly small airplanes. I learned how to fly them upside down and pointing straight at the ground. And I know this might sound kind of extreme, so this is not what I recommend to everybody. But if you are brought up in a way where everybody thinks little of you, one of the best things you can possibly do is take on a challenge that's scary. I was afraid of everything.
But learning to fly, and especially learning to fly loops and rolls, gave me the confidence to do everything else, because I thought: I pointed an airplane at the ground at 250 miles an hour and waited. Why am I afraid to get a PhD in computer science?

Wow. How empowering is that?

Yeah, it really was. So that's really how I overcame the fear. And I will say that I encountered situations while getting my PhD in computer science where I didn't believe that I was good enough to finish the degree. I didn't believe that I was smart enough. What I've learned since is that that was just emotional residue from my childhood, and from people telling me that I couldn't achieve.

And look what you've achieved so far. It's amazing. We're going to be talking about some of the books that you've written, but I want to get into data science and AI and get your thoughts on this. Why is it necessary to think about human issues in data science and AI? What are your thoughts there?

So there's been a lot of work in data science recently looking at societal impacts. If you treat data science as a purely technical field and you don't think about unintended consequences, you can end up with tremendous injustices and harms to society and to individuals. And I think any of us who has dealt with an inflexible algorithm, even if you just call up customer service and get told, press five for this, press four for that, and you say, well, I don't fit into any of those categories, or have the system hang up on you after an hour, will understand that any algorithmic approach, especially over very large data sets, risks impacting people, particularly people from low-income or marginalized groups, though really any of us can be impacted in a negative way.
And so, as a developer of algorithms that work over very large data sets, I've always found it really important to consider the humans on the other end of the algorithm. That's why I believe that all data science is truly human-centered, or should be human-centered.

It should be human-centered, and it also involves both technical issues and social issues.

Absolutely correct. One example is that many of us who started working in data science, including, I have to admit, me when I started out, assumed that data is unbiased, that it's scrubbed of human influence, that it is pure in some way. However, that's really not true. As I started working with data sets, I learned, and this is generally known in the field, that data sets are touched by humans everywhere. As a matter of fact, in our recent book, Human-Centered Data Science, we talk about five important points where humans touch data, no matter how scrubbed of human influence it's supposed to be. The first one is discovery: when a human encounters a data set and starts to use it, that's a human decision. Then there's capture, the process of searching for a data set; any data set has to be selected and chosen by an individual. Once that data set is brought in, there's curation: a human has to select among various data sets and decide which is the proper one to use, making judgments the entire time. And perhaps one of the most important ways data is changed and touched by humans is what we call the design of data. What that means is that whenever you bring in a data set, you have to categorize it. For example, suppose you are a geologist classifying soil data. You don't just take whatever the description of the soil data is; you may fit it into a previously established taxonomy, and you're making human judgments when you do that.
So even though you think, oh, geology data, that's just rocks, that soil data has nothing to do with people, it really does. And finally, people label the data that they have. This is especially critical when humans are making subjective judgments, such as what race the person in this data set is. They may judge it by looking at the individual's skin color, or they may try to apply an algorithm to it. But you know what? We all have very different skin colors, and categorizing us into race boxes really diminishes us and makes us less than we truly are. So it's very important to realize that humans touch the data and interpret the data; it is not scrubbed of bias. And when we make algorithmic decisions, even the very fact of having an algorithm that makes a judgment, say, on whether a prisoner is likely to reoffend, affects the judge: even if the algorithm only makes a recommendation, the judge is influenced by that recommendation. And that obviously has an impact on that human's life. So we consider all of this.

So you've just given five solid reasons why data science and AI inevitably are, or should be, human-centered. But in the past, what led to the separation between data science and humans?

Well, I think a lot of it simply has to do with incorrect mental models. Many of us grew up thinking that humans have biases but computers don't, and so if we just take decision-making out of people's hands and put it into the hands of an algorithm, we will get less biased results. However, recent work in data science and artificial intelligence has shown that that's simply not true: algorithms don't just reinforce human biases, they amplify them. So algorithmic biases can be much worse than human biases and can have greater impacts.

So how do we pull ethics into all of this, into data science and AI? That ethical component seems to me like it needs to be foundational.

It absolutely has to be foundational.
And this is why what we believe, and what we teach at the University of Washington in our data science courses, is that ethical and human-centered approaches and ideas have to be brought in at the very beginning of the algorithm. It's not something you slap on at the end, or say, well, I'll wait for the ethicists to weigh in on this. No, we are all human. We can all make human decisions. We can all think about the unintended consequences of our algorithms as we develop them, and we should do that at the very beginning. All algorithm designers really need to spend some time thinking about the impacts their algorithms may have.

Right. Do you find that people still need convincing of that, or is the field generally moving in the direction of understanding that we need to bring ethics in from the beginning?

It's moving in that direction, but there are still people who haven't modified their mental models yet. So we're working on it. We hope that our book will be used as a supplemental textbook in data science courses that are currently focused exclusively on algorithms, and that by introducing human-centered approaches at the very beginning of learning about algorithms, data science, and mathematical and statistical techniques, the next generation of data scientists and artificial intelligence developers will be able to mitigate some of the potentially harmful effects. We're very excited about this. This is why I'm a professor: because I want to teach the next generation of data scientists and artificial intelligence experts how to make sure their work really achieves what they intended, which is to make the world a better place, not a worse one; to enable humans to do better; to mitigate biases; and to lead us into this century in a positive way.

So the book, Human-Centered Data Science, you can see it there over Cecilia's right shoulder.
When does this come out, and how can folks get a copy of it?

It came out March 1st, and it's available in bookstores everywhere. It was published by MIT Press, so you can order it online, go to your local independent bookstore, or order it from your university bookstore as well.

Excellent. I've got to get my hands on a copy and dig into it, because it sounds so interesting, and also so thoughtful and clear in the way that you described it, along with all the opportunities that AI, data science, and humans are going to unlock for the world, for jobs, and for great things like that. So I'm sure there's lots of great information there. Last question: as I mentioned, you are keynoting at this year's WiDS conference. Talk to me about the top three takeaways the audience is going to get from your keynote.

So I'm very excited to have been invited to WiDS this year, which of course is a wonderful conference supporting women in data science, and I've been a big fan of the conference since it was first developed here at Stanford. The first of the three top takeaways is to really consider that data science can be rigorous and mathematical and human-centered and ethical. It's not a trade-off; it's both at the same time, and that's really the number one idea I'm hoping the keynote will bring to the entire audience. Secondly, I hope it will encourage women, or anyone who's been told, maybe you're not a science person, this isn't for you, you're not good at math, to disbelieve those views. If you, as a member of any type of underrepresented group, have ever felt, oh, I'm not good enough for this, I'm not smart enough, it's not for me, I hope you will reconsider, because I firmly believe that everyone can be good at math; it's a matter of having the information presented to you in a way that honors the background you have.
So when I started out, my high school didn't have AP classes, and I needed to learn in a somewhat different way than the people around me. What I tell young people today is: if you are struggling in a class, don't think it's because you're not good enough. It might just be that the teacher is not presenting the material in a way that is best for someone with your particular background. It doesn't mean they're a bad teacher, and it doesn't mean you're unintelligent. It just means that maybe you need to find someone else who can explain it to you in a simple and clear way, or maybe you need some scaffolding, that is, extra classes that will help you, and not necessarily remedial classes. I believe very strongly, as a teacher, in giving students very challenging classes, but then giving them the scaffolding so that they can learn that difficult material. I have longer stories on that, but I think I've already talked a bit too long.

I love that, the scaffolding. I think one of the high-level takeaways we're all going to get from your keynote is inspiration. Thank you so much for sharing your path to STEM, how you got here, and why data science and AI have to be foundationally human-centered. I'm looking forward to the keynote. Again, Cecilia Aragon, thank you so much for spending time with me today.

Thank you so much, Lisa. It's been a pleasure.

Likewise. For Cecilia Aragon, I'm Lisa Martin. You're watching theCUBE's coverage of Women in Data Science 2022.