Live from Houston, Texas, extracting the signal from the noise, it's theCUBE covering Grace Hopper Celebration of Women in Computing. Now, your hosts, John Furrier and Jeff Frick. Okay, welcome back everyone. We are here live in Houston, Texas with the Grace Hopper Celebration of Women in Computing. I'm John Furrier, the founder of SiliconANGLE, and we are here with Jeff Frick, my co-host, General Manager of theCUBE. We are on day one of the Grace Hopper Celebration. Megan Smith has just given a great keynote, CTO, she's from Silicon Valley, now working with the government, really opening up the doors for innovation across the board. Again, this is about women in tech celebration, but we have Brian Nosek, who's not a woman in tech, he's a man in tech, who's here to share his point of view, Executive Director of Center for Open Science, also a professor on leave from University of Virginia. Welcome to theCUBE. Thank you for having me. So what do you think about all the sea of ladies here? The men's line is pretty short to get into the men's room. Yeah, it is a fantastic meeting already. We've had great conversations at our booth and everything, just a lot of energy around everything tech related. So it's really- And it's pretty amazing too, it's about technology and a lot of programmers here, a lot of developers, a lot of leadership. Talk about your role and your company and how you're involved in this because you have an interesting story. I want you to get it out quickly. COS, the Center for Open Science, tell people what you're working on because it's a nonprofit. You're liberating the market with new free software, all open source, providing great tooling for data science. Share what you're working on. Yeah, so we are a mission driven nonprofit, but it is a technology startup. 
So our goal is to open data across the sciences and to improve reproducibility of that research because really the data is what's valuable in science but science depends on openness, right? It's all about making things freely available for critique, for extension, for questioning whether those things are accurate inferences. And so we're really trying to provide a framework to make it very easy for sharing data and collaborating across researchers. And how's that different than what they used to do before? Because there was always independent follow up, people would try to replicate. What has really changed in the internet and open standards and where we are now from how that used to be done? Yeah, well the values of sciences have always been about transparency, right? Basic ideas you have to say how it is you got to that scientific claim in order for people to believe it. But the practices haven't been that way. So the emergence of the internet, science is actually way behind compared to other industries because the scientists have not embraced the idea of openness of their materials, of their data, of how they pursue their workflow in producing the findings. It's interesting, I want you to share your data science tooling because I want to get your perspective on women in tech because I see women as a different gender bias towards data as well. They have a different unique orientation than male data scientists. Sometimes more creative, just in my opinion now, general bias, but what's the role of women in data science in your organization, in the academic community, obviously I know you're on leave as a professor, but what's the younger generation look like? What's the mix? What's the mindset? What are some of the insights that you could share with the folks out there in terms of this new breed of data scientists and the women in particular? Yeah, so for us, openness is two things. 
One, it's about transparency and access to data, but the second part of openness is inclusivity. That the way in which our science and our technology is going to succeed is by drawing on talent across the range of people that can get involved. So really our organization, both by mission and by process, depends on pulling talent from everywhere that we can. And so having a very inclusive technology community in general, and then specific to our company, is critical for us to meet the goals that we have. What's your staffing look like right now? So we have 68 staff, 42% are female, and 75% of our staff are technology people. QA through software development, full stack. What do you think about the digital native generation growing up? I was only interested in saying, I'm a DoD, a dad of a daughter, and they're growing up. And they're just naturally inclined to love tech. And it's not even about math or science, but there's men and women now are attracted 100% in on the native role. But there's now a generation of computer scientists now in college today. What's it like? What is the current mix there? Just any anecdotal observations that you could share in terms of what the makeup is, what they're thinking, what's it like? Yeah, it's a great question. And the experience that we've had so far, because we have a very young organization, we have a strong internship program where we're getting people that are students wanting to get experience and then very rapidly scaling up. And what is drawing them, at least in the work that we're doing, is to solve problems. It isn't just technology for technology's sake, but technology for an application to really address some issues, some needs, some problem in the community. And for us, obviously that's opening science, but we see a lot of interest from all the interns and early developers that come into our program in wanting to have their technology matter in some way. 
So you're donating all this software for free, tooling, the picks and shovels, if you will, for this next generation of software engineering, software's eating the world. Open source is now a tier one citizen. When I was growing up in the computer, it was radical, tier two, it's like I don't want to pay the licensing fees, it was mostly CES, very mail driven. Now it's a tier one open collaborative resource. The game has completely changed. What's your take on this? Do you like, I mean, you're super excited, you're doing your own startup or nonprofit. What is the next generation of open source going to look like in your opinion? Because now the data is important. The specialism is also still there. People are vertically targeting. Scientists, for instance, they don't want to be jack of all trades, they want to go deep vertical. Right, and most organizations are using, the data is the basis for the monetization strategy or whatever commercialization strategy they have, and ours is completely the opposite. The data should be freely available. That's the thing that we should not control. And instead, we want to encourage the different services that connect to the Open Science Framework to leverage their expertise, their services, let them be expert in what they're expert in, but make sure that the connections between the services and the data that it's resting on are the things that are freely accessible. Can you give an example of where it works? Where open data was a really good thing and you can give an example of a success. Yeah, so open data in science is so important because tons of money is invested in conducting individual studies. And if those data don't get accessed by others, then they can't be used for the many different types of possibilities that they could be. 
So we have a nonprofit that I did before this called Project Implicit, where we look at implicit biases, thoughts and feelings people might have, judgments they might make by gender or race or age, and all of that data is freely accessible on the Open Science Framework. And so many other researchers have been able to use it for questions that we would have never asked in our own laboratory, but they have expertise in, and so they've been able to use the same data to solve new problems. And so basically they didn't want to reinvent the wheel, basically. Right, and the resources that went into collecting that. Right, at Project Implicit, we have a million people a year coming to complete these tests. Those data are of high value and we're not going to get to doing all of it. We're not going to have all the ideas of what could be learned from that. And so making it available changes the equation. Yeah, so let's dig down on the bias thing a little bit because I think it's an interesting topic. There's always a fight in the news, right, about, you know, is it biased or not? Well, of course everything is biased, right? We all bring our life experiences, a certain filter. So that bias is not really the problem. It's when you don't recognize that you have biases, right? And not appreciating the fact that there's bias. So talk a little bit about how you guys have kind of unbundled the yes, things are biased, but yes, let's also recognize it's biased and try not to let that bias get in the way of the fact. Yeah, yeah, it is a huge challenge because bias is a very ordinary outcome of the mind, right? We have to make sense of reality but our brain doesn't experience reality. It comes through our filters, exactly as you said, right? We have to construct an understanding of the world. So we make assumptions, we jump to conclusions, we use stereotypes and all of those things will influence our individual judgments whether we know that we're using them or not. Right. 
And so a lot of the things that we've been trying to do as an organization is try to identify what assumptions might be driving our behaviors in different circumstances for hiring, for retention, for how it is we try to create a climate that is inclusive for everybody that's there, and can we challenge those assumptions that we have? And by having a very diverse work group, the assumptions come up more easily. More people can notice things that I might not see because of my own life experience, the positions that I have, how people interact with me whereas having many people from different, having different origins and different experiences, a lot of that stuff bubbles up more easily and then everybody can succeed more easily from that. So is that the best way to kind of expose it, to shine the light on it, is to have more points of view so that someone will say, wait, wait, wait, I don't see it that way at all. I mean, we talked to Lori Mackenzie from the Clayman Institute and she talked a lot about attacking the problem via behavior, right? You can attack it via behavior in language and vocabulary. You know, real specific things that are actionable that you can do, that you can teach, you can define rather than trying to change the way people think, right? The work. Exactly, I've been studying implicit bias since I started graduate school in 1996. I'm not free of implicit bias. I am about as educated as you can get on it, right? I'm just studying it every day, all day and it's still in my mind because I don't get to decide whether it's in there or not so it really is on behavior, as you say, is that I have values, I have intentions of how I want to behave, I may not always behave aligned with those intentions but where I can intervene is on what I do and be open to having other people challenge that and say, wait a second, how did you get to that decision? Why did you decide to advance him rather than her? 
That special assignment, you know, she's totally perfect for that. Why didn't you think of her for that? Oh, I don't even know why, you know, and so having that challenge and not having it be an attack on me as a person but rather a challenge on the process that I've done. That is the key point, that is the key point. Rather than being persecuted for being yourself, right? Just be you, right? Being open to the bias is actually a competitive advantage because you're going to have better input rather than negative energy. That's right. So like, I mean, everyone has a bias. Everybody has their own biases. However, harnessing it will be the key. Yeah, I mean, the key lesson that I've learned from doing research in this area is to have humility about my own decision-making, right? We are so much into saying, I made an objective decision, but they're not objective, right? They're through all of these filters and so being willing to be humbled by that and have a challenge. How have your studies changed? I mean, this is something that we are very interested in. We do a lot of, you know, anecdotal research and some algorithms around online behavior with interactions on Twitter data. Has the internet changed your acquisition of bias data or has it actually worsened it? Yeah. You can argue that the social graph has actually increased the groupthink, if you will, actually accelerates the bias or doesn't create more data. Yeah. Back to the data question. It's a great question and there's competing ideas about it and there's not good resolution, right? One perspective is, well, you have so much more access to so many more pieces of information that would challenge your current way of thinking. And so isn't that great? It's an open landscape. But at the same time, we can completely isolate ourselves in a much bigger way by looking only at those data sources that are already aligned with how we think. Yeah. Right? 
If I am a left winger, there's certain websites that I might go to. If I'm a right winger, there's other websites I might go to and it can create much stronger echo chambers. And there's different channels of communications now. Yeah. Now there's omnichannel, front, back, side, you know, it's like, I can look this way but then back channel somebody this way. So it's really interesting. I mean, I just haven't found any data on this area because this is the key. How do you surface someone to look in the mirror and say, how am I behaving? Yeah. Versus being persecuted. I mean, just this, yesterday flying here on my Facebook thread, I threw out a comment to my friends and said, I'm going to the Grace Hopper. I'm super excited. Of course, I've interviewed so many women on theCUBE. Never once asked them how they balance their family just as well as they're supposed to talk tech. But I said, is it not politically correct to say I love women in tech? Just kind of a comment because I love women in tech. Who doesn't? And all of a sudden, the thread started. A lot of response. You call them women or do you call them ladies? A debate about that. A debate about the word politically correct. Yeah. Just every single. A debate about the word love. The word love. That's one guy brought up the Greek definitions of. There's four versions of love. Amen. And every second. So my point is, you can't get it right. It's everyone will argue about every semantic. Yeah. So this is a challenge. Yeah. It is challenging. And I think a lot of it is just being genuine, right? And being open to having someone push back a little bit and say, well, did you think about it this way and not have that be a threat? Have that be an opportunity? That's exactly what came out of the Facebook thread. One guy, notable comment said, John, just be yourself. They're going to dig you. But don't say things stupid. Don't. Which is like, okay, wait a minute. That means that I'm not myself then. Okay. 
You might not be myself. Right. But my point was be genuine. Right. Okay. Be yourself. But don't be trying to be fake. Right. Don't fake it. Right. Oh yeah. Because that's the easiest to detect, right? Right. So I just want to bring it back before we wrap up with Center for Open Science with your core project. Can you give any examples of research, breakthroughs? You said you're kind of general. You don't necessarily concentrate on a particular type of thing like cancer research or this or that. But are there any stories you can tell where having a platform such as what you've been working on have really resulted in either faster problem solving, innovative problem solving, breakthrough problem solving that people can really kind of understand? Yeah. So the latest project that we have that's an illustration of that is that there's a very strong sense in the scientific community across different disciplines that we're not getting as much reproducible research into the published literature as we'd like. That the published literature may not be as reliable as it needs to be. So we, through the Open Science Framework, we ran a crowdsourced project. 270 different collaborators all worked together on replicating 100 results in psychology. And the platform made it entirely possible to do it. It would not have been feasible to coordinate the protocols, to get all of these separate data collections done, to have those all organized into a single environment and then to make that publicly accessible. And those results were just published last month in Science and have really advanced the conversation in the scientific community about what the challenges for reproducibility are and then how we might improve them. And openness is one of the answers. So what was the answer? Yeah, curious. The short answer is that we were able to reproduce less than half of the results of the published studies that we tried to reproduce, which is much less than we would want. Right. 
And so the follow up question is why, right? What is it about the published literature? What makes it so hard to reproduce? And that's where there are right now very interesting debates. You know, and it also is a debate about the academic literature, right? Because the academic journals wield so much power and it's kind of like the old newspaper, right? There's only so much space at the top of the fold. They have a lot of distribution power. There's only so many pages in those journals. How is open source and the research side impacting that process? Because you've also got much easier kind of open peer review than you used to have, just those guys in the hallowed hall. That's right. Yeah, the pipeline for getting something into the scientific literature is based on 1940s technology. And there's this fantastic movement of open source, open science technology to try to transform that. Post publication peer review, like you're describing, right, open data, more generally open materials. Pre-registration, where people say in advance publicly what they're going to do in that study and how they're going to analyze it and to increase the credibility of what they find rather than just data dredging, right? There's also gatekeepers that run that too. That's right. And so it can really democratize the entire scientific community by making it more open so that anybody that has a good idea can put it into that marketplace and have it be wrestled with. Talk about developers. What are you guys doing? What do you guys do with developers? How do they get involved? Obviously APIs are going to be critical to what you do. Is it cloud-based? Quickly give us the run through the geek side of it. The more interesting, which clouds you use, Amazon, Azure, all of the above. Yeah, so we are a Python, JavaScript shop. Our primary tool, the Open Science Framework, is a web-based application. We use Amazon and Rackspace for a lot of our storage and computing needs. 
And we have a learning-based internship and hiring program. So because we came out of academics, we have a tech intern program where they can get in. They're all paid internships or they can even do it for credit and they get involved in the code base right away. That's fantastic. And so they really get this very on-the-ground, on-the-job, service-based learning kind of experience. And our goal, obviously, is to hire them or at least make them very hireable by the time they get out. So a meeting like this is also very important for us. Oh yeah, it is fantastic for us. Because we get people that are enthusiastic, they're driven by mission, they want to contribute and they just need the opportunity to get in. So are you on an evangelism mission here and or recruiting or both? Both. We're trying to say what the center is doing. We're always trying to advance our mission. We're also presenting tomorrow about implicit bias, my research work and how we're trying to wrestle with that at the Center for Open Science in terms of a practical application. And we're trying to hire like crazy. Well, it's a great mission. We totally support it. We think you're doing amazing stuff. The bias stuff we want to get, if you can get us a copy of that, so let's follow up and we can get it on our blog, get it out to our audience. And great stuff, really is the future. It really is about democratization. The old way in academic is going to be frictionless data sharing, access to tooling, picks and shovels, it's a gold rush, as we say in California. So, thanks so much for coming on theCUBE, appreciate it. We'll be right back with more from Grace Hopper in Houston, Texas, celebration of women in computing. I'm John Furrier, Jeff Frick, we'll be right back.