Okay, we're back. This is Dave Vellante at Wikibon.org. We're live here at IBM IOD. Think Big is the mantra of this conference, and we've been thinking fast, thinking big, and this is a fantastic event. It's IBM's premier software event for the information management group, but it's really morphed into a big data event. IBM's staking its claim and grabbing its piece of the big data pie, a leader, if not the leader, in big data. And so this is Dave Vellante. This is theCUBE, SiliconANGLE's continuous coverage, our flagship coverage of events. We go out to these events, we extract the signal from the noise, we bring the smartest people we can find to you, and we share with you, our audience, the insights that we're learning at these events. I'm here with my co-host.

I'm Jeff Kelly from Wikibon.org, and we're here with our next guest, Thomas Jackman from the Desert Research Institute. Welcome.

Thank you.

So you're an associate research professor for advanced visualization, computation, and modeling. Tell us a little bit about DRI and we'll go into some of the things you're doing with IBM's big data portfolio.

Okay. Well, the conference here is in Las Vegas, here in the state of Nevada, and we're with the university system, the system of higher education for the state of Nevada. There are two university campuses and there's a research institute. DRI is the research arm for the state of Nevada. It's an environmental sciences and engineering research institute. And we do environmental monitoring, field studies, data collection for monitoring resources, natural resources, weather, climate, pollution.

So, a lot of data.

Yes, we have a lot of data.

That is a lot of data that you need to collect to do that work. Diverse sets of data. So, Thomas, talk about what your data life was like a few years ago.
Go back as far as you want, five years, seven years, a decade, whatever, and what it's like today, how you're dealing with that, and what kind of value you're getting out of that data. How has that changed?

So, what happened is, obviously, our data just sort of grew, right? There was no strategy that said it's going to grow a lot more and here's what we're going to do. So you arrive at a point in your IT infrastructure and you say, wow, we have a problem. And so what we have to do is start centralizing our data. We have to be aware of the different types of data. We need to be able to access the data in real time, and we need to analyze it better. And so we're just starting on the road of partnering with IBM, and we're planning to have IBM PureSystems deployed at DRI, where we can start using the data, deploying the data, and analyzing the data better.

So, what are your objectives with the data? What are you trying to get to? What's the end game?

So, obviously, we need to monitor our natural resources better. So, automation, supervised monitoring for the environment, monitoring power grids, renewable energy, weather forecasting, emergency preparedness. So, all this data: getting the data, collecting the data, analyzing the data, then helping public policy people, legislators, emergency responders, other scientists, and businesses.

How are you guys funded?

Ah, that's interesting. So, we're an academic institution. DRI is very unusual in that most of our funding does not come from the state. Most of our funding comes from contracts. So the grants and contracts that we develop are how we're mostly funded.

So, you've got to earn your...

That's correct, that's exactly the way it is.

Okay, so that means that you've really got to align with what people want. You've got to create value.

We have to create value.

Yeah, so that's an interesting organizational model.

So, and that actually is the new model.
I think we're all thinking that the academic institution of the future has got to be prepared for less dependence on federal funding. And we're going to have to be looking for partnerships with industry and companies, local companies.

So, we'll hopefully get in the PureData, PureSystems, giant academic institution discount from IBM, a little plug there for the IBM sales reps. But talk about this: you guys are scientists. There's a big term now in the industry, data science. At Wikibon and SiliconANGLE, we turn out all kinds of cool infographics on data science. One of the first data scientists, you know, the new breed of data scientists: I met with Hilary Mason of Bit.ly. We've had Jeff Hammerbacher on, who kind of coined the new vogue term, data science. Are you guys data scientists? What's going on in your world? What does data science mean to DRI?

Well, I run the center, we call it CAVCAM. And we do high performance computing, advanced visualization. We actually have a virtual reality, a six-sided virtual reality environment, which allows us to sort of create natural environments and sort of bring in our data sets and look at things that we wouldn't be able to look at otherwise. So to actually experience things like Lake Tahoe and to be able to see soil moisture, temperature, vulnerability to wildfires. We're able to see these things. And so this is perhaps a new interface for looking at data and using it and interacting with it.

So is it a simulation type of model? So you can test what-if scenarios and things like that? Interesting. So how do you go from doing some interesting simulations to taking your findings and turning those into actions that help the community, the environment, the business community, and your other stakeholders?

So that is part of the science. It's actually validating that these models are true. Ground-truthing these things. And data is, of course, a requirement for that. So we are bringing the process in.
Of course, everyone has concerns about simulation. You know, how real is it? Are graphics involved? So, lots of graphics involved. So is it a movie or is it real? And so bringing people in, convincing them that they can do training inside this. So wildfire training, for example. All sorts of emergency preparedness. So these are the types of things we can do.

So I think about, Thomas, I think about fraud detection. Fraud detection used to be a game of sampling. You'd take, you know, little bits of sample, maybe a lot of sample, and then you'd mine the data, build a model, make some assumptions, and six months later you might find that some fraud was committed, and you'd go back in and try to rectify it. And that's really changed quite dramatically in the financial services business, where sampling is dead and they look at just the entire data stream, and then within maybe not minutes, but certainly hours, and sometimes minutes, they can identify fraud. You've got a different problem in terms of the predictive abilities and the complexities of the model. Think about weather. But talk about how things are changing, or are they changing, in terms of sampling? How are your models changing? Are they becoming, obviously, more data heavy, but in proportion, relying more on the data, or is the data helping calibrate the models? Help us squint through that.

So it's the same idea that one might use in investment banking: you look back to predict the future. So the more data you have from the past, the better you can actually predict the future. And so with, for example, climate models, you can sort of look at data historically, parameterize your models, create probability distributions, and then run models going forward in time. Not too dissimilar from what high-frequency trading uses. We can also start capturing data in real time.
So actually imagining that we have streaming data, we could start monitoring renewable energy resources, for example, photovoltaic arrays and wind farms, monitoring that data in real time. Because in that particular case, you have to match demand response in usage of electric power to what nature is producing for you. And so you have to know that. And if the energy providers have to integrate this into the grid, well, they need to know this, and they need to know it as early as possible.

Yeah, so we always have this conversation: what's real time? And sometimes in certain worlds, in the advertising world, real time is before you lose the customer.

Yeah, right.

What is real time in your world?

Well, there are many different worlds, right? In climate, real time can be 15 minutes, could be an hour. In renewable energy, for example, real time is now, and perhaps you'd need to know every one sixtieth of a second. The grid needs to be monitored every one sixtieth of a second. We run at 60 hertz. So we need to know it very accurately. There's lots of data that could potentially be streamed in. So we have problems of size and speed.

Well, you mentioned earlier, as you were taking this journey, you hit a roadblock. You realized at some point, well, we have a problem here. We didn't plan for this data explosion and the different types of data. So could you go into that a little bit more? What was the roadblock? Were you having performance issues? Were you just overwhelmed by the amount of data, and felt like maybe you were missing things? What was that moment where you said, we need to really start thinking about this in a different way?

Well, we are an academic institution. We're perhaps not much like an investment bank, where everything would be strategically planned. So our data grows because individual scientists collect more and more data.
And you just keep adding more hard drives, and you eventually get to a point where you say, we can't do this individually, we have to start centralizing our resources and sharing our resources better. Because it's academia, we use custom approaches, not necessarily enterprise-ready approaches. And so what we're embarking on is taking the same strategy as Wall Street and using it in the academic research environment to do better, more sophisticated data management and prepare for the future.

So that's also going to require a cultural shift in some sense in your organization. So how are you tackling that? I mean, the idea of treating data as an asset and as a centralized service is much different from individual scientists who've got maybe a few hard drives, they've got their data, and they're not necessarily concerned with what's happening over here. It brings up questions about data policy and management. So how are you addressing some of those issues, both technically and culturally, changing the thinking?

So the scientists are all coming along willingly. I think everyone is recognizing that there is a problem. And so in terms of this proof of concept, we're having lots of interest in participating, lots of people willing to try moving their data to an enterprise-ready type of platform, to start looking at enterprise-ready types of databases and data management strategy. So I think probably only a year ago we would have had a lot of convincing to do, but we're starting to see people who are just becoming far more interested and prepared for the next step.

Well, people become very attached to their data. And sometimes they are very reluctant to put that into a shared system. That happens in the enterprise, and it's obviously happening in academia as well.

The hard part is they have to keep devoting more and more time to supporting that data. And so having a shared service to do that is a big win for everyone.

Absolutely.
Would you say this whole concept of big data changed your thinking and your point of view on data sources? Was the data that you're using all there internally, or with these new big data technologies and thinking bigger, did you start to look outside the four walls of your organization? Talk about that dynamic a little bit.

So we have all different types of scientists. We have field scientists. We have laboratory scientists. We have computational scientists. So not everyone would have even classified what they had as data. It's information. It's lab records. It's spectrographs. It's remote sensing. It's aerial photography. So these are all different types of data sets. How do they all get geo-referenced? How do they get correlated? How can you start bringing them together and making decisions, actionable decisions, based on all this? In the past, they were separate data sets, and now we have the capability to start bringing it all together. And so it's getting exciting. And I think that academia is actually ready and prepared to also start adopting this strategy for data. I think that's a turning point.

What do you see as your big barrier? In other words, what one thing, if you could change it, would really accelerate things, make your life better, create more value? What's the gate right now? Is it skill sets? Is it technology? Is it process? Is it, you don't know what you don't know?

I think technology has been fairly siloed. Interestingly enough, when students go and study engineering or science, they're not learning these strategies right now. Just now in the business schools, they're starting to teach this. But this all has to trickle down. So we need the skills in the students who are going to be participating in the research studies. So we need these skills to trickle all the way down to the universities and perhaps even to the high schools. So education is probably the biggest thing that's going to start helping big data.
So yeah, expand on that a little bit. The skills, meaning the understanding of how data sets interact with one another, how data needs to be visualized? What skills in particular do you think really need to be built up?

Well, databases, software engineering. Engineering: teaching civil engineering with smart structures, where you're thinking about having sensors in bridges, in buildings; teaching renewable energy, where you're thinking about monitoring and starting to predict how much energy you're going to generate. These are things that we can start teaching in the universities, and could and should. And so using predictive analytics and introducing that into engineering and science curricula, in addition to bringing in operations management from the business schools. And so having this cross-disciplinary education would be really helpful.

Exciting times. Thomas, thanks very much for coming on theCUBE. I really appreciate your perspectives, and good luck with the initiatives.

And good luck with the funding and those IBM discounts.

All right, thanks for coming on. Keep it right there. We'll be right back from Las Vegas. This is theCUBE, SiliconANGLE's coverage of IBM's IOD. Keep it right there.