Hi, everyone, and welcome to the Big Data Deep Dive with theCUBE here on EMC TV. I'm Richard Schlesinger, and I'm here with tech industry entrepreneur and Wikibon analyst Dave Vellante and SiliconANGLE CEO and editor-in-chief John Furrier, known as the anchors of the always popular and informative broadcast theCUBE. We are here to talk about Big Data in science now, and welcome to you guys. Thanks for coming by. We've talked about Big Data in terms of every aspect of everyday life, but there's an obvious application for solving great scientific problems. There must be an awareness of that in the alpha geek community.
There's some serious attention, and quite frankly it's intoxicating when you talk to these guys out there in the science field, because there are two things going on right now that are affecting their world, and they're on top of it. There's no need for any kind of education there. High performance computing, Intel, Moore's law is getting better and better, more processing power, more stuff can be stored thanks to EMC. But now with Big Data, all that horsepower is there to crunch the data. So in space and in oil expeditions, things like that, it's just absolutely phenomenal.
It's tailor-made, it seems to me, for science, which deals with great amounts of data.
And John's right. High performance computing used to be this niche thing, yeah, only the science community used it. Now it's going mainstream. So a lot of these concepts that were confined to the scientific community are now finding their way into mainstream Big Data. Multiply that by 10, and that's the science community.
So we found a group of astronomers. This is a fascinating contrast between old technology and new technology. From time immemorial, astronomers have been keeping track of the sky by recording stars on these sort of almost photographic-type glass plates that they have to store all over the world, millions of these things. It couldn't be more old-fashioned.
So now they're going to try to use Big Data to analyze those things and really get a more realistic idea of how the universe was born and how it's evolving. The project is called PARI, and watch this.
How large is the universe? How did the universe begin? Could there be intelligent life out there? The information we need to answer all these big questions is really right in front of us. If we can harness that information, we can unlock the mysteries of the universe.
Don Cline is the founder and CEO of the Pisgah Astronomical Research Institute, or PARI, located in North Carolina's Pisgah National Forest. This non-profit center for astronomical research and education is using the power of Big Data to show us the universe as we've never seen it before.
Astronomy is all about observation. Now if we just look into the telescope, we can see what the sky looks like right now. But what really is interesting is to see how the sky has changed over a long period of time. Is the star getting brighter? Is the star still there? Is it moving?
Fortunately, astronomers have been recording the night sky since the mid-1800s on photographic glass plates known as star plates. This is a plate from Maria Mitchell Observatory on Nantucket Island. It was taken July 19, 1936, of a comet called Peltier, right here. These fragile plates have been hidden away in basement archives for generations. That is, until now.
This is Gamma 2. It's one of two high-precision digitizing machines that we have to scan our star plates. PARI is on a mission to digitize these 200,000 donated star plates into a massive online database so they can be accessed and analyzed by researchers around the world. Right now we have stored over one terabyte of star data in our database.
Czech Republic astrophysicist Dr. René Hudec is using the star plate database to help plan an upcoming European Space Agency mission.
Set to launch in 2014, the ESA's Gaia spacecraft will use its one billion pixel digital camera to survey more than a thousand million stars. The main goal is to understand the universe, the origin of the universe. René is intrigued by gamma ray bursts he's found on the PARI plates and plans to have the Gaia spacecraft take a closer look at this rare phenomenon. He hopes that understanding these massive blasts of energy could someday help us develop a limitless energy source right here on Earth. And this is already something where we can learn a lot about the origin of the generation of energy.
Back at PARI, the ultimate goal is to digitize all the world's estimated 5 million star plates. But at 2 gigabytes per plate, secure, reliable data storage is mission critical.
We think it's important work and we want to make sure they have the right tools to get the job done. When EMC's Bob Hawkins learned this cash-strapped non-profit needed to beef up their data center, he knew he could help. Today, the Star Plate Digital Archive runs on EMC-donated storage. It's all about big data, the ability to store that data in a secure and reliable environment, the ability to analyze it, and the ability to tease out the full potential of the data.
What we've done so far is just the tip of the iceberg. As we continue to build the digital database, we will open up untold opportunities for new research. The possibilities of what we can discover are endless.
Wow, so two things struck me about that piece. One is they've really got their work cut out for them. They've only just begun. And the second thing is, just because I'm so curious about this stuff, I can't wait to find out if they learn something, you know, earth-shaking, so to speak.
Yeah, so another area, Richard, is healthcare, right? And healthcare historically has not been an aggressive adopter of technology, but at the same time, big data has potentially huge transformative effects on the industry.
Yeah, I mean, one of the things we'll have is Terrell Depp, my interview with him. I just want to show you this great video from Cassandra Summit, which is another alpha geek kind of big data conference. But what's interesting about this video is that Terrell talks about problems that he's solving in healthcare, and there are many problems to solve. There are database problems, there are government laws, HIPAA that prevents data from being shared. It's complicated. It's really complicated. But what he's focusing on is patient-led care. This is what the revolution of big data is all about. It's about the people. Everything's people-centric. In this case, it's healthcare, people-centric. So great video, he's changing the game. Again, another disruptor like Virginia in the previous video. So watch this video from Terrell, great video.
We're back live here at the Cassandra Summit. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host.
I'm Jeff Kelly, lead big data analyst from Wikibon. And we're joined here by Terrell Depp from Healthcare Anytime, CTO there. Welcome to theCUBE. It's your first time on theCUBE and we promise it'll be an enjoyable experience, right, John?
Yeah, so our goal here is to really kind of get to the events and extract the signal from the noise. People want to know what's going on at Cassandra in the big data space. It's being debated, debates that range from the proper language to use, NoSQL versus SQL, relational databases, the latest shiny new toy, whatever that is. People like to talk about it. But really what we've found is big data is really changing business with mobile and with dashboards like the Nexus 7 and the iPad. Real-time analytics data, fast data to the edge of the network, is really viable.
And I'm excited to talk to you about that, because what better place than the medical profession if you want to have information at your fingertips? Because it's life and death, it's not just business. Doctors are in the field. So healthcare is a really viable space. So is big data a reality? That's my first question for you in the hospital space. Obviously tablets are relevant. Big data, what's the state of the union for big data there?
Yeah, big data unfortunately is getting some pretty slow adoption. And I attribute that primarily to the kind of inside-the-box thinking. Healthcare is not known for its rapid technology adoption. As a result, new technologies are even slower to catch on. And when we start looking at solving problems in healthcare, we have to think very much outside the box. You cannot take an individual patient and put them in a situation where the information you collect on them is the same as for the patient that you saw previously. Every patient, every episode of care is going to be absolutely different. Even for the same problem, your heart attack and my heart attack may be completely different. Your follow-up visit is going to be different from my follow-up visit. How do you take that data and put it into something that can be mined successfully so that you get decent analytics out of it? You simply can't put it into a traditional two-dimensional data model.
And also, not all the records are digitized too, right? I mean, that's another problem, right?
That's a completely separate problem. That's a problem that's more of a cultural problem. It goes back to the adoption of technology being very slow.
In the hospital business, there's a lot of database work. And with HIPAA, regulations around data are a factor. Is that a factor, or is it not a factor? I don't know much about it, but that seems like it could be a factor.
Well, anytime you're talking about regulations and compliance, it's going to be a factor.
And for a company like Healthcare Anytime, it's a big factor. And it goes beyond just the clinical aspect, it goes into the financial aspect, because we have to not only be HIPAA compliant, we also have to be PCI compliant. And getting that data, keeping that data secure, accessing that data in a timely manner, that's what big data is all about.
So for our viewers out there who might not be that familiar with Healthcare Anytime, why don't you tell us about kind of what you guys do, what's your core product, and whether it's patient portals and kind of bringing in data from multiple sources. So tell our audience a little bit about what you guys do, and then maybe we can talk a little bit about kind of how you do it.
Well, the core purpose for Healthcare Anytime's existence is satisfying the patient need. It's patient-facing healthcare. That's a market that has been grossly underutilized, underleveraged, mostly because there's no money in it. And part of the emphasis that the Obama administration has put on healthcare is to get that into the hands of the patient and make them more accountable. And that's where we come into play. Now, we're not newcomers to this environment by any means. We actually have 30 years of experience in the healthcare space as one of the top healthcare information service providers, systems vendors, in the small and medium community hospital market. So this is something that is kind of a natural progression for us as we started looking at where is the gap, where is the opportunity, where is the outlier potential here. And so we look at it from a standpoint of patient engagement, which obviously starts with collecting information about the patient, getting them to pay their bill, and even engaging the patient's extended family and friends to get them more involved in accountability efforts and things like that. So the whole thing gets wrapped up into what's commonly called a patient portal.
So let's talk about some of the real use cases of both your platform and just kind of healthcare big data analytics and applications in general. I mean, what are we talking about?
I think you have to take a look at what are the different stakeholders in healthcare. Obviously we have the patient, and that's probably the most atomic level of care, a single episode of care or visit to the doctor's office being the smallest, most atomic component. The patient is generally more interested in knowing, okay, what were the results of my most recent lab test? They're not as interested in the long-term perspective or the long-term view of their care like a doctor would be. Then you take another step out and you have a completely different perspective on the data, in that you have a hospital or even a region that is interested in things that are peculiar to that region. Some areas have a higher rate of diabetes than other areas. Obesity is more common in certain areas than in other areas. Those regions want to look at the data in a very different way. Step it out a little bit further and you start looking at a national and international level, and we have public health and immunization concerns, and we want to know what's this next flu outbreak going to do and how do we project where the flu is headed next? How do we stop these things before they become financial catastrophes?
Right, so it very much depends on their perspective.
Exactly. Where are you coming from?
Terrell, thanks for coming on theCUBE. We really appreciate it. You guys are in a growing market. Obviously there needs to be some transformation in healthcare and you guys are doing some good work there. Appreciate the time.
What do you guys make of the fact that he said that the people in the healthcare industry are resisting this new technology?
Well, HIPAA notwithstanding, the more data that you can put in the hands of the patients when they're making critical healthcare decisions for themselves or their families, the better off the system will be.
So how do you get the doctors, how do you get the industry to go along?
I think one is having the data available. Step one. Two is getting the iPads in front of the doctors. You know, get the iPad, get a device in front of the doctors where they can see and touch the benefits which big data allows them to have. Things will change.
Because big data is getting accepted in all sorts of, you know, we've been talking about all sorts of different aspects of life including, you know, even the environment. Environmental researchers: we've done a piece about this company called Treemetrics. It's an Irish company and they, let me just get this right, they have an index of 11 million trees on three continents, and they've developed a big data system to keep track of them to help meet the market demand while, at the same time, minimizing the effect on the forest. So take a look at this.
Our company has developed a radical new way to better measure forests. We started off with the basic idea of improving the way forests are measured. Up to now, forests have been measured with 19th century tools, measuring tapes and calipers. On average, 20% of the value of a forest is lost at harvesting time. That value is lost mainly because the wrong logs are cut from the wrong forest. So what our system does is try to show the forest owner and the sawmiller where the optimum logs are in the forest estate, and cut them just in time. Keep that forest alive, keep that tree alive until it's optimum to fell that tree, when the market really needs those logs that are in that forest. We've created a new system using 3D scanners. So the starting point is 3D scanners, from the air from an aeroplane and on the ground with a 3D laser scanner.
And we're the first company in the world that combines aerial scanning, which looks down on the forest and gives you a macro view: what's the variation of trees across a wider area? And then the ground sampling system is the terrestrial scanner, where we go in and we actually scan sample plots to try and work out what's actually down on the ground. We've developed intelligent software then that takes this new information and creates a digital forest, a 3D forest. And that forest is stored in the cloud. The forest owner can type in the logs that the market currently wants, and they can cut those logs and test all available forests out there across the resource to see which forests best suit that current market demand. It's bringing the world of lean manufacturing to the forestry supply chain. The old world was based on models, and models are generally not ever totally correct. And with the analytics system that we have, we're able to do some new types of data mining and analysis to give the forest owners real insight into the actual contents of the forest.
We're only waking up to the whole power of big data. We've realized that there's a huge opportunity to capture massive amounts of data from different parts of the world and use that data to work together with other forest owners, to share data, to be able to give insight at a global level. Foresters think at a national or a local level too much, whereas it's a global challenge. And that challenge is to try and keep that market supplied but using as little forest as possible, trying to manage that forest as sustainably as possible. So big data and analytics is where that solution lies. It's about getting better information and analyzing that information, analyzing that data, to create new insight, new information, and make better decisions. That's the real opportunity of big data and analytics.
The missing link is this better information, tackling this information problem of what is actually out there before harvest. We've developed a virtual harvesting machine that allows the forest owner and the sawmiller to go online after we scan and collect this better information, and allows them to cut the forest online and see which forests yield the right amount of logs, the right types of logs, and where those logs are in the forest. With our technology, we've proven that you can keep the market supplied but cut less trees. That can only be good for the planet: more wood from less trees.
You know what I love about that video is that incredible giant machine ripping trees out, brute force, and yet we're talking about squeezing more productivity out of that process, and you're seeing that in a lot of different industries, transportation and logistics and the like.
Yeah, Dave, there's a trend out there called the Internet of Things, where sensor networks, we talked earlier, this is just one example of the big data impact. It's not just what you think it is. Everything possible is now changing, and it's a beautiful thing. I love the Internet of Things.
And we'll have more on the Internet of Things. We're out of time now, but we hope that you'll join us again. I'd like to thank you guys, John and Dave, for joining us again and sharing your insights, and we'll have more installments of the Big Data Deep Dive coming up, so be sure to stay tuned to the conversation with my new best friends from theCUBE right here on EMC TV.