From around the globe, it's theCUBE with digital coverage of AWS Public Sector Online. Brought to you by Amazon Web Services. Hello, and welcome back to the coverage of AWS Public Sector Summit virtual. I'm John Furrier, host of theCUBE. We're here in theCUBE Studios, quarantine crew here talking to all the guests remotely. And it's part of our virtual coverage of AWS Public Sector. So I've got a great guest here talking about data science, weather, predictions, accurate climate modeling, really digging into how cloud is helping science. Dr. Chelle Gentemann, who's a senior scientist at Farallon Institute, is my guest. Chelle, thank you for joining me. Thank you. So tell us a little bit about your research. It's fascinating how, you know, I've always joked in a lot of my interviews that 10, 15, 20 years ago, you needed supercomputers to do all these calculations. But now with cloud computing, it opens up so much more on the research side and the impact is significant. You're at an awesome institute, the Farallon Institute, doing a lot of stuff in the sea and the ocean and various things. What's your focus? I study the ocean from space. About 71% of the Earth is covered by ocean, and 40% of the population of the globe actually lives within 100 kilometers of the coast. The ocean influences our weather, it influences climate, and it also provides fisheries and recreational opportunities for people. So it's a really important part of the Earth's system. And I've been focused on using satellites, so from space, trying to understand how the ocean influences weather and climate. And how new is this in terms of just state-of-the-art? Fairly new, or has it been around for a while? What's some of the progress in the state-of-the-art that you're involved in? I started working on satellite data in the 90s during school. And I liked the satellite data because it's the interface of applied math, computer science, and physics.
The state-of-the-art is that we've really had remote sensing around for about 20, 30 years. But things are changing because right now we're having more sensors and different types of instruments up there, and trying to combine that data is really challenging. Our brain is really good in two and three dimensions, but once you get past that, it's really difficult for the human brain to try and interpret the data. And that's what scientists do: they try and take all these multidimensional datasets and try to build some understanding of the physics of what's going on. And what's really interesting is how cloud computing is impacting that. It sounds so exciting, the confluence of multiple disciplines kind of all right there. I could geek out big time. So I've got to ask you, you have the public dataset program. Are you involved in that? Do you take advantage of that in your research? How are some of the things that AWS is doing helping you, and is that public dataset program part of it? It's a big part of it now. I've helped to deploy some of the ocean temperature datasets on the cloud. And the reason AWS public datasets have the potential to transform science is that the way we've been doing science, the way that I was trained in science, was that you would go and download the data. And at most of these big institutions that do research, you start to create these dark repositories where the institution or someone in your group has downloaded datasets. And then you're trying to do science with these data, and you're not sure if it's the most recent version. It makes it really hard to do reproducible science because if you want to share your code, somebody also has to access that data and download it. And these are really big datasets. So downloading it could take quite a long time. It's not very transparent, it's not very open. So when you move to a public dataset program like AWS, you just take all of that download out of the equation.
And instantly when I share my code now, people can run the code and just build on it and go right from there, or they can add to it or suggest changes. That's a really big advantage for trying to do open science. You know, I had a dinner with Teresa Carlson, who is awesome, she runs public sector for AWS. And I remember this is years ago and we were dreaming about a future where we would have national parks in the cloud, or this concept of, you know, Yosemite is a beautiful treasure, a physical place, you go there. And we were kind of dreaming that, wouldn't it be great to have these datasets or a supercomputer public commons? It sounds like that's kind of the vibe here, where it's shareable. And it's almost like a digital national park or something. It's a shared resource. Is that kind of happening? First of all, how do you react to that and what are your thoughts around that dream? And is this kind of tied to that? Yeah, I think it ties directly to that. When I think about how science is still being done and has been done for the past sort of 20 years, we had a real change about 20 years ago when a lot of the government agencies started requiring their data to be public. And that was a big change. So then we actually had public datasets to work with. So more people started getting involved in science. Now, I see it as sort of this fortress of data that in some ways had prevented scientists from really moving rapidly forward. But with moving onto the cloud and bringing your ideas and your compute to the dataset, it opens up this entire Pandora's box, this beautiful world of how you can do science. You're no longer restricted to what you have downloaded or what you're able to do, because you have this unlimited compute. You don't have to be at a big institution with massive supercomputers. I've been running hundreds of workers, analyzing in RAM over 200 or 300 gigabytes of data, on a $36 Raspberry Pi that I was playing around with that was my kids'.
That's transformative. That allows anyone to access data. And if you think about what it would take to do that in the old days, rack and stack servers. First of all, you've got to get the cash, buy servers, rack them, stack them, connect them. A downright nightmare. So I've got to ask you now, with all this capability, and first of all, you're talking to someone who loves the cloud, so I'm pretty biased: what are you doing now with the cloud that you couldn't do before? So certainly the old way, from a provisioning standpoint: check, done. Innovation bar's raised. Now you're creative, you're looking at solutions, you're building an enabling device like a Raspberry Pi, almost like a switch or an initiation point. How has the creativity changed? What can you do now? What are some of the things that are possible that you're doing? I think that you can point to some of the data sets that have already gone on the cloud and are being used in these really new, different ways. And again, it points to this: when you don't have access to the data, it's simply because you have to download it. So downloading the data and figuring out how to use it and figuring out how to store it is a big barrier for people. But when things like the HF Radar data set went online, within a couple of months there was a paper where people were using it to monitor bird migration in ways that they'd never been able to do before because they simply hadn't been able to get the data. There's other research being done where they've put whale recordings on the cloud and they're using AI to actually identify different whales. It's using one data set, but it's also the ability to combine all these different data sets and have access to them at the same time and not be limited by your computer anymore. For a lot of science, we've been limited by our access to compute.
And when you take that away, it just opens all these new doors into doing different types of research with new types of data. You could probably correlate the whale sounds with the temperature and probably say, hey, it's cold, or, you know, I'm making that up. But that's the kind of thing that wouldn't have been possible before because you'd have to get the data set, do some math. I mean, this is cool stuff with the ocean. Can you just take a minute to give people an insight into some of the cool projects that are being either thought up or dreamed up or initiated or done or in process or in flight? Because obviously there's so much data in the ocean, so many things to do. It's very dynamic. There's a lot of data, obviously. Share, for the folks that might not have a knowledge of what goes on, what are you guys thinking about? A lot of what we're thinking about is how to have societal impact. So as a scientist, you want your work to be relevant, right? And one of the things that we found is that the ocean really impacts weather at scales that we simply can't measure right now. So we're really trying to push forward with space instrumentation so that we can monitor the ocean in new ways at new resolutions. And the reason that we want to do that is because the ocean impacts long-term predictability in weather forecasts. So with a lot of weather forecasts now, you can go on to Weather Underground or whatever weather site you want, and you'll see the forecast goes out 10 days, and that's because there's not a lot of accuracy after that. So a lot of research is going into, how do we extend into seasonal forecasts? I'm from Santa Rosa, California. We've been massively impacted by wildfires, and being able to understand how to prepare for the coming season is incredibly important. And surprisingly to a lot of people, the ocean plays a big role in that.
The ocean can impact how storm systems grow, how they evolve, and how much water or moisture they actually pick up from the ocean and then transport over land. So it's really interesting to talk about how the ocean impacts our weather and our seasonal weather. That's an area where people are doing a lot of research. And again, you're talking about different data sets, and being able to work together in a collaborative environment on the cloud is really what's starting to transform how people are working together, how they're communicating and how they're sharing their science. It just opens up so many possibilities. I want to get your vision of what you think the breakthroughs might be with cloud for research computing, because you have kind of old school and new school. Amazon CEO Andy Jassy talks about old guard, new guard. The new guard is really more looking for self provisioning, auto scaling, all that supercomputer on demand, all that stuff at your fingertips. Great, love that. But is there any opportunity for institutional change within the scientific community? What's your vision around the impact? Because it's not just scientific, it also can go to government for societal impact. So you start to see this modernization trend. What's your vision on the impact of cloud on the scientific community? I think that the way the scientific community has been organized for a long time is that scientists are at institutes and a lot of the research has been siloed. And it's siloed in part because of the way the funding mechanism works. But that inhibits creativity and inhibits collaboration and inhibits the advancement of science, because if you hold on to data, if you hold on to code, you're not allowing other people to work on it and to build on what you do.
The traditional way that scientists have moved forward is you make a discovery, you write up a paper, you describe it in a journal article, and then you publish that. Then if someone wants to build on your research, they get your journal article, they read it, then they try to understand what you did, they maybe recode all of your analysis. So they're redoing the work that you did, which is simply not efficient. Then they have to download the data sets that you access. This slows down all of science and it also inhibits bringing in new data sets again because you don't have access to them. So one of the things I'm really excited about with cloud computing is that by bringing our scientific ideas and our compute to the data, it allows us to break out of these silos and collaborate with people outside of our institution, outside of our country and bring new ideas and new voices and elevate everyone's ideas to another level. It brings the talent and the ideas together and now you have digital and virtual worlds because where you've been virtualized with COVID-19, you can create content as a community building capability or your work can create a network effect with other peers and is a flash mobbing effect of potential collaboration. So work, workforces, workplaces, workloads, workflows, kind of interesting or kind of being changed in real time. You just talked about speed, agility. These are technical concepts being applied to kind of real world scenarios. I mean, your thoughts on that. Yeah, I mean, I now work with people like right now, I'm working with students in Denmark, Oman, India, France and the US. That just wasn't possible 10 years ago and we're able to bring all these different voices together which it really frees up science and it frees up who can participate in science which is really fun. I mean, I'm a scientist, I do it because it's really, really fun and I love working with other people. 
So this new ability that I've gained in the last couple of years by moving onto the cloud has really accelerated all the different types of collaborations I'm involved with, and is hopefully accelerating science as a whole. You know, I love this topic. It's one of my passion areas, and it's an itch I've been scratching for over a decade too: content and your work is an enabler for community engagement because you don't need to publish it to a journal. That's like a waterfall mentality. If you can publish something or create something and show it, demo it or illustrate it, that's better than a paper, right? If you're on a video, you can talk about it, and it's going to attract other people; like-minded peers can come together. That's going to create more collaboration data. That's going to create more solidarity around topics and accelerate the breakthroughs. For our last paper we actually published all the software with it. We got a digital object identifier for the software, published the software and then containerized it, so that when you read our paper, at the bottom of the paper you get a link, you go to that link, you click on a button and you're instantly in our compute environment. You can reproduce all of our results, do the error propagation analysis that we did, and then if you don't like something, go ahead and change it or add onto it or ask us some questions. That's just magical. Yeah, it really is. And you know, Amazon has been a real investor, and I've got to give props to Teresa Carlson and her team and Andy Jassy, the CEO, because they've been investing in credits and collaborating with groups like Jet Propulsion Lab, you guys, everyone else. Space has been a big part of that. Obviously Bezos loves space. So they've been investing in that and bringing that resource to the table. So you've got to give Amazon some props for that. But great work that you're doing. I'm fascinated.
I think it's one of those examples where it's a moonshot but it's doable. It's like, you can get there. Yeah, and it's just so exciting. I'm the lead on a proposal for a new science mission to NASA and we are going all in with cloud computing. So we're going to do all the processing on the cloud. We want to have the entire science team on the cloud and create a science data platform where we're all working together. That's just never happened before. And I think that by doing this we multiply the benefits of all of our analysis. We make it faster and we make it better and we make it more collaborative. So everyone wins. Chelle, you're an inspiration to many. I'm so excited to do this interview with you. I love what you said earlier at the beginning about your focus being computer science, physics, space. That confluence of multiple disciplines, not everyone can have that. Some people just get a computer science degree. Some people say, you know, I'm pre-med or I'm going to do biology, I'm going to do this. This notion of multiple disciplines coming together is really what society needs now, as we're converging or virtualizing or becoming a global society. And that brings up my final question, and something that I know you're passionate about: creating a more inclusive scientific community, because you don't have to be just the computer science major. Now, if you have all three, it's a multi-tool win, you're a multiple-skill player. But you don't have to be something specific to get into this new world, because if you have a certain discipline, whether that's math, maybe you don't have computer science but it's quick to learn, the frameworks are out there. No code, low code. So cloud computing supports this. What's your vision and what's your opinion of how more inclusivity can come into the scientific community?
I think that when you're at an institution or at a commercial company or a nonprofit, if you're at some sort of organized institution, you have access to things that not everyone has access to. And in a lot of the world, there's trouble with internet connectivity. There is, you know, trouble downloading data. They simply don't have the ability to download large data sets. So I'm passionate about inclusivity because I think that until we include global voices in science, we're not going to see these global results that we need to. We need to be more interdisciplinary and that means with working with different scientists in different fields. And if we can all work together on the same platform, that really helps explode interdisciplinary science and what can be done. A lot of science has been quite siloed because you work at an institution so you talk to the people one door down or two doors down or on the same floor. But when you start working in this international community and people don't have to be online all the time, they can write code and then just jump on and upload it. You don't need to have these big powerful resources or institutions behind you. And that gives a platform for all types of scientists and all types of levels to start working with everyone. This is why I love the idea of the content in the community being horizontally scalable because if you're stuck around a physical institution or space, you kind of have group think or maybe have the same kind of idea being talked about. But here when you pull back the remote work with COVID-19 as an example, it highlights it, the remote scientists could be anywhere. So that's going to increase access. What can we do to accept those voices? Is there a way or an idea or formula that you see that people could, assuming there's access, which I would say yes, what do we do? What do you do? 
I think you have to be open and you have to listen, because if I ask a question into the room where my colleagues work, we're going to come up with an answer, but we're going to come up with an answer that's informed by how we were trained in science and what fields we know. So when you open up this box and you allow other voices to participate in science, you're going to get new and different answers. And as a scientist, you need to be open to allowing those voices to be heard and to acting on them and including them in your research results and thinking about how they may change what you think and bring you to new conclusions. Machine learning has been a part of your work in the past, I know, and obviously cloud, you're a big fan, I can tell, a proponent of it. Machine learning and AI can be a big part of this too, not only in sourcing new voices and identifying what's contextually relevant at any given time, but also on the science side. So if you can take a minute to give your thoughts on the importance and relevance of machine learning and AI, because you've still got the humans and you've got machines augmenting each other, and that relationship is going to be a constant conversation point going forward. You know, is there data about the data, and what are the machines doing? What are your thoughts on all of this? Machine learning and AI: what's the impact? Well, it's funny you say impact. So I work with this NASA IMPACT project, which is this interdisciplinary team that tries to advance science, and it's really into machine learning and AI. One of the difficulties when you start to do science is you have an idea like, okay, I want to study tropical storms, and then you have to go and wade through all these different types of data to identify when events happened and then gather all the data from those different events and start to try and do some analysis.
They've been really successful in using AI to actually do this sort of event identification: what's interesting, and how can we use AI and machine learning to identify those interesting events and then gather everything together for scientists to then try and bring more analysis. So AI is being used in a lot of different ways in science. It's being used to look at these multi-dimensional problems that are just a little bit too big for our brains to try and understand, but if we can use AI and machine learning to gather insights into certain aspects of them, it starts to lead to new conclusions and it starts to allow us to see new connections. So AI and machine learning have this potential to transform how we do science, and cloud computing is a big part of that because we have access to so much more data now. Yeah, it's a real enabling technology, and when you have enabling technology, the power is in the hands of the creative minds and it's really what you can think up and what you can dream up, and that's going to come from people. Yeah, phenomenal. Final question for you. Just to kind of end on a light note, Dr. Chelle Gentemann here, senior scientist at the Farallon Institute, you're doing a lot of work on the ocean, space, ocean interaction. What's the coolest thing you're working on right now, or that you've worked on, that you think would be worth sharing? There's a couple of things, I have to think about what's the most fun. Right now I'm working on doing some analysis with data. We had a big, huge international field campaign this winter off of Barbados. There were research vessels and aircraft. There were Saildrones involved, which are these autonomous robotic vehicles that go along the ocean surface and measure air-sea interactions. And right now we're working on analyzing that data. So we have all of this ground truth data.
We're bringing in all the satellite observations to see how we can better understand the earth system in that region, with a specific focus on air-sea interactions over the ocean. When it rains, you get salinity stratification; where there's strong solar, you get diurnal stratification. So you have upper ocean stratification in heat and salinity, and we look at how those impact the fluxes and how the ocean impacts the heat and moisture transport into the atmosphere, which then affects the weather. So again, this is this multi-dimensional data set with all these different types of both ground truth data and satellite data that we're trying to bring together, and it's really exciting. It could shape policy, it could shape society, maybe have a real input into global warming, our behaviors in the world. Sounds awesome. Plus, I love the ground truth and the observational, and it sounds like our media business algorithm. We've got to get the observation, get the truth, report it. Sounds like there's something in there that we could learn from. Yeah, it's very interesting because you often find what you see from a distance is not quite true up close. I can tell you that as media, we do a lot of investigative journalism. So we appreciate that. Dr. Chelle Gentemann, senior scientist at the Farallon Institute, here as part of AWS Public Sector Summit, thank you so much for your time. What a great story. We'll keep in touch. Love the Saildrone, we interviewed them, great innovation, and continue the good work. I look forward to checking in later. Thanks for joining. Thanks so much, it was nice talking to you. Okay, I'm John Furrier with theCUBE. We're here in our studios covering the Amazon Web Services Public Sector Summit virtual. This is theCUBE virtual, bringing you all the coverage with Amazon and theCUBE. Thanks for watching.