Okay, we're back live here in San Francisco, California, at Oracle OpenWorld 2012. This is siliconangle.com's theCUBE, our flagship program, where we go out to the events and extract the signal from the noise. I'm John Furrier, the founder of siliconangle.com, and I'm joined by my co-host. I'm Dave Vellante of wikibon.org, and we're here with Vu Nguyen, who is an infrastructure engineer at the NASA Jet Propulsion Laboratory, a NetApp customer. Welcome to theCUBE. Thanks for coming on. Oh, thank you. So, we're here at Oracle OpenWorld, and what do you think of the vibe here? It's interesting, fun, exciting. Yeah, glad to be a part of it. What brings you to OpenWorld? Network Appliance has asked me to come and speak about the Mars Science Laboratory, which a lot of people know as Curiosity, and the recent events related to Curiosity. Right, the Curiosity project. So, tell us about the Mars initiative. A lot of people are interested in that. You know, there was a period of time where the world sort of questioned whether or not we should be exploring Mars, and then NASA went in that direction. Give us a little history of that program to the extent that you can. The whole idea behind Mars is that we're trying to understand Mars better in terms of its geology, and the more we know about Mars' geology, the better we can understand our own Earth's geology. And so the concept here is that the more we learn about the geology on Mars, the better we might understand how and why Mars might have lost a lot of its atmosphere, lost water and things of that nature, and how that might come back and assist us in understanding our world better. Now, we were talking off camera, and I made the comment that probably not a lot changes, even with the Mars project running for 10 years, right? Starting in 2002. I had said that not a lot probably changes on Mars in 10 years. So that's not the case?
It changes in a decade? It changes, and actually it changes quite often. And in a lot of cases it changes in surprising ways that we hadn't anticipated. In particular, over a period of, I believe it was within a year (I'm not quite sure of the number of months), we experienced a case where the MRO orbiter flew over a mountainous region and the initial images showed that there was nothing there. It was just barren, the side of a mountain. But on the second pass, we saw what appears to be some kind of liquid flow. And that in itself is very exciting because we hadn't anticipated it. We're not sure whether it's water, liquid carbon dioxide, or some other kind of liquid, but obviously there's some form of liquid coming out. So that in itself shows that there's a lot of change happening on a planet that we had thought was fairly lifeless or fairly static. Well, I've got to ask you, because obviously we're here at Oracle OpenWorld, with databases and all that boring tech stuff compared to what you work on. Just two weeks ago, I had a chance to watch the space shuttle Endeavour fly over Silicon Valley. And people had that moment of, wow, it's like the end of an era, going back from the 60s to today, but a whole other level of science is taking place that you're part of. Share with the folks two things. One is, what's the feeling within the science community that you're involved in around the end of that chapter with Endeavour? And then two, what new things are you exploring in science, and in space in particular, that are truly exciting? Share them, get people excited about it. Because this is not a niche market. People were really emotionally moved by that shuttle experience. And there are a lot of science and math geeks out there who love spaceships and science. So take us through the emotion of that event, and then what's the new frontier like now?
In the case of Discovery and the shuttle in general, even though at the Jet Propulsion Laboratory we don't have a direct link to, or work related to, the shuttle, we ourselves feel quite saddened. You're right, it is the end of an era. And we see new efforts going into getting more manned missions, more humans into space. But again, it's sad. It's like seeing a sibling going away to college or going off to school, where we're not going to be able to see them again and things like that. Well, not seeing them for a while. Retirement home or something like that. But yeah, we are quite saddened by it. We felt that it was a hard program and it was definitely appreciated and needed. And with that, what's the new stuff that's happening? Because the project that you're involved in is exploring the science side of things. We'll get into the big data conversation in a second, but what are the new exciting things? A lot of people, after that event, are trying to understand where the action is, where the excitement is. To simplify it a bit, going back to the MER project, which most people know as the Spirit and Opportunity rovers: that mission launched in 2003, around June. There were two spacecraft, and if you compare the technology associated with those spacecraft to the technology associated with Curiosity, take just something as simple as the images that come from them. Now, mind you, again, I'm oversimplifying this. We're not rocket scientists, so you can do that. Yeah, so you can view it as taking something like a VHS image and looking at it.
And then all of a sudden you bring in a Blu-ray and you look at the resolution there and you go, oh my, we can actually see the tiny little specks of dust and sand and dirt on rocks, on surfaces, whereas before we had to actually get closer to see them. So that in itself is a testimony to much more data. We encounter much more data coming down, and there's also the difference in technology, where we're having to deal with larger data and also being able to examine and analyze images and things like that for the science. I was talking with a high school student just a couple of weeks ago and I said, you know, what are you interested in career-wise? He's like, I love space and science. I go, what do you think that will turn into? And he said, and this is just complete naivety, I think the answer to our energy problem is somewhere in space. Just that kind of unconsciously competent vision. Just a little bit out there. I mean, it's a stretch to say that, but share with us, given that kind of future leader who might someday be part of the science community: what are some of the things in science that they're going to come in and discover? What are the discoveries out there that you see, that are being talked about in the science community? You can oversimplify it, just from a vision standpoint. What are the new breakthroughs that are coming out of the data? I really can't talk much about the data because I'm not too involved in the science teams themselves. On my side of it, I'm on a team that performs real-time operations. So in a lot of our cases, we get the data as it comes down, we process it into what we call data products, and then we pass those on to the science teams. And that in itself expands the data multiple times over, so it gets really big. In terms of science, I really can't say much because I'm not on the team.
But in terms of just the sheer storage and the data sizes that are coming down, expansion rates and things like that, it definitely pushes our limits in terms of the amount of data that we have to store. And so we always have to reevaluate what kind of infrastructure we have in order to accommodate more and more data coming down, because we never remove anything. We hold on to every piece of data because we never know when we might come upon a particular day's worth of work on Mars where we didn't realize we had discovered something until the scientists and other folks come back, reevaluate the information, and go, hey, we found something interesting. Let's go back and take a look at it. And if it's a major event, the data might actually explode up and above that. So, you know. So access to the data is key. You've got to search. So you need to have some sort of data store that has low latency. Talk about some of the challenges there and what new tech you're using. Especially on the downlink, we're expected to process the information in real time. We are required to expand data products at a certain rate, and the scientists and other teams expect that data to come in so they can perform their level of analysis and also plan the next day's worth of missions: where to go, where to drive, what rock to look at, what kind of geology we're going to do, what kind of chemical analysis and things like that. So the data comes down, you process it in real time essentially, and then it gets archived? It gets archived, well, it gets stored, let's put it that way. It gets stored, and then from that storage the other teams can pick up the data and continue on with the processing. Okay, and so it's never really archived. It's sort of perpetually archived, I guess. It will eventually be archived. But it goes into more of a storage state than an archive.
So we store it, and then as it gets older and older, it'll probably get into an archive state, but even in the archive state, the project wants the data to be always online and readily available, so it's not really an actual archive. So you've got essentially three tiers, right? You've got the processing tier, the storage layer, which is sort of the mainstream. And then somewhat the archive tier, yes. The archive, kind of. And which is the most storage intensive? Can you talk about those? It's the processing. It is, okay. The processing is most intense. Why is that? Can you describe the type of data that you're ingesting and how you're handling that? The type of data is actually telemetry coming down. And the telemetry is broken into what we call channelized telemetry. We process that and break the channels out, because each channel serves a purpose. Some of it's science data. Some of it is engineering data, I mean spacecraft health and things like that. And so it just depends on where it goes. But that information has to be processed in near real time so that the scientists and the other teams can get at their data. So you're getting a raw data stream. We're getting a raw data stream. Okay. And so you have to put that into some kind of format that's consumable by the scientists. Right. Yeah. The format that comes down is, well, compression is not really the right way to say it, but it's a form of compression. And so we have to expand it and process it. And prepare it. And can you talk about how much data you manage? I mean, is it objects that you're managing? I'll give you a couple of cases. We're just starting to do a lot of the science. Curiosity is very new, so it's very fresh. We really only have an estimate of how much data might be coming down. But so far, we're getting something in the neighborhood of, I believe it was 128 megabytes per day.
And that in itself is expanded at least 30 times. And that's just from the initial ground data system doing that expansion. Up and above that, the science team will probably process it further, and I'm not sure what the expansion rate on that would be. Okay. And so talk about your storage infrastructure, if you would. What does it look like? And how has it changed over the last 10 years? We've gone through numerous vendors, but ultimately we have a requirement for the infrastructure to be fairly robust. It has to survive; minimal downtime is required. We do have windows of opportunity where maintenance can be done, but they're few and far between. Scientists and engineers expect the data to be available at all times, because, again, you don't know when they might come back and reevaluate the information that they already have. So what's the infrastructure look like? You've got a set of servers, you've got... We have a set of servers, and we use NAS servers. Yeah, we have a lot of NAS servers. And we also have SAN-attached storage for our databases and things like that. Okay. So you store in files? Yeah, basically, the databases are what we use for searching. They contain metadata. And the files themselves are just stored on the NAS. Is object storage something that you've looked at? No, we have not looked at that. No? No. I think it's too early for that. At this point, I'm not sure. But it's something that we will evaluate with the future missions that are coming on board. So big data is a big part of it. Obviously, you guys have to collect data and then get access to the data really fast. A lot of people do batch storage using Hadoop, but what kind of technologies are you guys using for the store, the backend? For the backend, we have a lot of Network Appliance to handle that storage.
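For readers who want to see what the quoted figures imply, here is a rough back-of-the-envelope sketch. It uses only the two numbers given in the interview, roughly 128 megabytes per day of downlink and an expansion of at least 30 times by the ground data system. The function names, the 1 GB = 1024 MB convention, and the one-year horizon are assumptions made for illustration; the result is a lower bound, since further processing by the science teams is not included.

```python
# Figures quoted in the interview; both are approximate.
RAW_MB_PER_DAY = 128     # "something in the neighborhood of ... 128 megabytes per day"
EXPANSION_FACTOR = 30    # "at least 30 times", so treat results as a lower bound


def expanded_mb_per_day(raw_mb=RAW_MB_PER_DAY, factor=EXPANSION_FACTOR):
    """Megabytes of data products produced per day after ground expansion."""
    return raw_mb * factor


def total_expanded_gb(days, raw_mb=RAW_MB_PER_DAY, factor=EXPANSION_FACTOR):
    """Cumulative expanded volume in GB (assumption: 1 GB = 1024 MB)."""
    return days * expanded_mb_per_day(raw_mb, factor) / 1024


if __name__ == "__main__":
    print(expanded_mb_per_day())     # 3840 MB per day
    print(total_expanded_gb(365))    # 1368.75 GB after one year
```

Because nothing is ever deleted, the cumulative figure only grows, which is consistent with the point that the infrastructure has to be reevaluated regularly.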
So you're building schemas out when you store all the data? I'm not going to talk about that. Ha ha ha. Ha ha ha. Okay. All right, well, we're almost out of time. Getting the hook there. Trying to drill in, find out what he's got in there. I mean, NASA, they got to keep their secrets top secret. So almost had an answer there, Dave. But we'll be right back with our next guest here inside theCUBE, Jet Propulsion Lab, doing some great work. Obviously storing data from space for discovery is a big deal. And thanks for coming inside theCUBE, appreciate it. We'll be right back.
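As an illustration of the channelized telemetry described earlier in the interview, where channels are broken out because each serves a purpose (science data, engineering data such as spacecraft health), here is a minimal sketch of routing records by channel. The channel identifiers, the category table, and the record layout are all hypothetical; the interview does not describe the actual formats or software used at JPL.

```python
from collections import defaultdict

# Hypothetical channel-to-category table; real channel IDs are not given
# in the interview.
CHANNEL_CATEGORY = {
    "CH-0001": "science",
    "CH-0002": "engineering",
}


def route_by_channel(records, table=CHANNEL_CATEGORY):
    """Group (channel_id, payload) records by destination category.

    Channels missing from the table are kept under "unclassified"
    rather than dropped, in keeping with the never-remove-anything policy.
    """
    routed = defaultdict(list)
    for channel_id, payload in records:
        category = table.get(channel_id, "unclassified")
        routed[category].append((channel_id, payload))
    return dict(routed)


if __name__ == "__main__":
    sample = [("CH-0001", b"img"), ("CH-0002", b"temp"), ("CH-9999", b"?")]
    for category, items in route_by_channel(sample).items():
        print(category, len(items))
```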