From Union Square in the heart of San Francisco, it's theCUBE, covering Spark Summit 2016, brought to you by Databricks and IBM. Now here are your hosts, John Walls and George Gilbert. Welcome back to Spark Summit 2016. I'm John Walls, here along with George Gilbert. We're in San Francisco at the Hilton for the second day of our coverage of Spark Summit 2016. We're joined now by John Farland, who is a technical consultant with a group called DNVGL, and they're really a consultancy that helps companies, in essence, mitigate risk in a lot of different sectors: maritime, oil and gas, energy, just to name a few. John, thanks for being with us. We appreciate it. It's great. It sounds like a really cool job, frankly. George and I were talking about this before, but you deal with the energy group, that's your focus. This is your first time with us here on theCUBE, so tell our CUBE audience a little bit more about what you do at DNVGL, and particularly within the energy group. Yeah, so DNVGL, like I kind of mentioned, is a large organization. It's got four pillars: maritime, oil and gas, energy, and business assurance. We also have business units for software, cybernetics, and research, but those are not our major bread and butter. So within those four pillars, I work in the energy business unit, and energy is obviously a hot topic all over the world right now. And within energy, I work in a group called policy advisory and research. So policy advisory and research is basically charged with consulting and providing advisory services to utilities in the United States and internationally. So they basically ask us to solve some of the more difficult problems, or at least they ask us our thoughts on them. And then we go and say, all right, we've dealt with this problem in this part of the world, so here's the solution that we think you should implement for whatever you're looking for.
All right, and so obviously with energy, especially I'm thinking about consumption patterns, weather comes into play there. You've got smart meters, so you've got all these crazy inputs going on, right? Yeah, actually, and just a second on that. I just got done with my talk, and I was just kind of saying I spent a little bit too much time going into the nitty-gritty technical stuff, but one of the cool things about my job is that electricity usage has a lot of things going on, a lot of things driving it. Technology aside, what hour of the day it is is a huge driver of how much electricity you use. Also, what about the electricity you used before? If it's 2 p.m. now, what about 2 p.m. yesterday? Because we humans are very habitual people. We have diurnal cycles. What I did yesterday at the same time pretty much predicts what I'm gonna do today at the same time. So things like that. Temperature is a huge deal. We worry about things like heat buildup in buildings. So if you're trying to model how much electricity a structure is using, right, you have to account for, of course, what the outside atmospheric conditions are at that time, but also what they were for the past few hours, because the heat builds up in a structure. And depending on how efficient that structure is at dissipating the heat buildup, its thermal envelope, you have to model that as well. There are so many complex drivers. It's crazy to actually think about it. You don't realize how complicated it is. So where are utilities in the big data game? Because they have a lot of opportunity, it seems like. As you said, they know what their users are doing. Yeah, I know. Better than anybody, right? Because that's their bread and butter. So where are they in terms of putting that data to actual action, whether it's conserving energy or delivering a better service, whatever that experience might be? I think they're looking for insights.
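The drivers listed above (hour of day, usage at the same hour yesterday, current temperature, and temperature over the preceding hours) can be sketched as model inputs. This is only an illustrative sketch, not DNVGL's model: the function name `build_features`, the three-hour `heat_window`, and all of the numbers are assumptions made up for the example.

```python
from datetime import datetime, timedelta

def build_features(timestamps, load_kwh, temp_f, heat_window=3):
    """Build one feature row per hour from hourly load and temperature.

    heat_window (hours of past temperature to average) is an illustrative
    stand-in for heat buildup, not a value taken from the interview.
    """
    rows = []
    for i in range(24, len(load_kwh)):  # skip day one: no "yesterday" yet
        past_temps = temp_f[i - heat_window:i]
        rows.append({
            "hour": timestamps[i].hour,                     # hour-of-day driver
            "load_same_hour_yesterday": load_kwh[i - 24],   # habitual behavior
            "temp_now": temp_f[i],                          # current conditions
            "temp_recent_mean": sum(past_temps) / len(past_temps),
            "load": load_kwh[i],                            # value to be modeled
        })
    return rows

# Synthetic two-day series (made-up numbers, not real meter data).
start = datetime(2016, 6, 7, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]
load_kwh = [1.0 + 0.1 * (h % 24) for h in range(48)]
temp_f = [60.0 + (h % 24) for h in range(48)]

features = build_features(timestamps, load_kwh, temp_f)
```

Rows like these could then be passed to whatever regression or forecasting model is in use; the sketch stops at feature construction because the interview does not say which model is fitted.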
I think the ball's in a lot of people's courts right now; we have not figured out exactly how to do that. And when I say we, I mean, we're a consulting group, so we actually talk to utilities all around the country. And first things first: collecting the data. The first thing is deploying a smart meter network. Back in the day, every meter would be read once a month. So that's 12 observations per household per year. All right, so now we're going to smart meters. And smart meters not only are connected to the internet, they record things at an hourly frequency. So now we went from 12 per year to 8,760, the number of hours in a year. Now, I mean, I have one client in particular that's talking about five-second sampling frequencies, and not only just the household: every end use in the household, every light, every dishwasher. I had gotten involved in one sort of energy consumption project, and the thing that the company sponsoring it found out pretty quickly was that when you're just looking at the meter, you don't get a lot of granularity. You can't ask interesting questions. The instrumentation's too coarse. But when you get to every device, that's when you can do things like the thermal envelope and, you know, what appliances run when, and I imagine the utilities might not have access to that, because that would be an enormous expense. Yeah, I mean, think about it. Experian and some of those credit organizations, oddly enough, have demographic information on folks, but utilities don't know a whole lot. You're exactly right. So you're not the utility, but you might have other avenues to instrument the home and get that data? Yeah, and so I think you're absolutely right. And what end-use metering does is start to chip away at the question of occupancy, human behavior inside the house, right?
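The jump in data volume described above is easy to work out. A minimal sketch, assuming a 365-day year and one reading per interval per meter; the helper name `readings_per_year` is made up for the example.

```python
HOURS_PER_YEAR = 24 * 365                 # 8,760, the figure quoted above
SECONDS_PER_YEAR = HOURS_PER_YEAR * 3600

def readings_per_year(interval_seconds):
    """Number of readings one meter produces in a year at a fixed interval."""
    return SECONDS_PER_YEAR // interval_seconds

monthly = 12                              # one manual read per month
hourly = readings_per_year(3600)          # smart meter, hourly interval
five_second = readings_per_year(5)        # five-second sampling

print(f"monthly: {monthly:,}  hourly: {hourly:,}  five-second: {five_second:,}")
```

Five-second sampling yields 720 times as many readings as hourly metering for a single measurement point, and metering every end use in the home multiplies that again by the number of devices.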
So you can tell what the quantity of electricity usage was, but what drove that? Was it because it was really hot for the last three days, or was it because you're having your cousin's birthday party and you have 20 people over? You don't know that unless you're monitoring a little bit more granular information. Well, when you're advising utilities, have you gotten to the point where you can instrument the house with greater granularity than just the meter? We're starting to, and that might actually tie in to why we're here in the first place. I mean, Spark is finally giving us the ability; we've made leaps and bounds, right? So the energy industry in general lags behind a lot of industries by years, if not decades, right? And so finally, now we have the reason to use something like Spark. Just like you're saying, we have all this data coming in, and we need some way to efficiently manage it and to analyze it and run the sort of algorithms and models that have been legacy in the industry. The industry, trust me, is a very advanced industry when it comes to analytics, mathematics, and that sort of thing. It's not advanced in dealing with large amounts of data and combining the analytics with it. So we know what we want to ask, more or less, right now. We have data now that should allow us to answer the questions at a much broader level. And what I'm discovering now is that there are questions we didn't even know we wanted to ask, because we're finally able to say, all right, let's look at it all. Let's look at it all. Rumsfeld's unknown unknowns. Yeah, exactly, you didn't even know that you wanted to ask that question. So what's getting you that information now? How are you collecting that? So let me give a quick example. So we were talking before about the peak. Everyone worries about the peak, rightly so. That is the one moment in time where electricity demand is greatest.
So typically it's in the summer, the hottest day of the summer, three o'clock in the afternoon, four o'clock in the afternoon, something like that. A lot of utilities are worried about that, as they should be. When you institute a program to save energy, energy efficiency, there's something called demand response. So demand response is basically a program where you sign up, and if the utility calls an event, that utility can curtail your load. They can shut off your air conditioner, and you maybe get paid by the utility to be okay with that happening. But basically they wanna be able to say, oh man, it's gonna be a really big day, we need to curtail a little bit of load. So we focus on the peak. What if we didn't have to focus on the peak on one day? What if I was able to take all of the hours of the entire summer and try to analyze where I should target that energy efficiency program to save the most electricity? Not just that one hour. Is that because you could store it now? Not necessarily. That's a bigger question from a distribution perspective: what is the optimal way of servicing my end uses? But I know we're now getting into the energy nitty-gritty. But if you still have to manage to that one peak over the summer, traditionally the generation capacity is two X the average need. So if you still have to get to that one peak for one hour, you still need that two X capacity. Right, well, there's something called reliability. So the electricity grid is set up a little bit interestingly. Half of it's kind of deregulated. So you've got the utility, which is regulated, right? They're regulated by the CPUC in California, which is basically the public utilities commission. But the generation is not necessarily regulated. There's something called CalISO in California; it's an independent system operator. They're worried about reliability.
They're charged with making sure that there is enough capacity to meet demand at any moment in time during the summer. And if that moment is smaller, the reliability demand is less. You can bring that margin down. You don't need to be two X. If you were really sure that the number was going to be this, you could bring it down to this and still have adequate reliability. So back to why you're here. Yeah. All right. Yeah, because you're doing maritime, you're doing oil and gas, business assurance, not just energy. I mean, what is Spark doing for all of your clients, not just the energy guys, in terms of streamlining their processes and helping them make better decisions, I guess ultimately mitigating risk? Yeah, I mean, this whole set of few days has been pretty impressive, I gotta admit. I have learned more about how Spark is completely integrated into every enterprise platform you can think of. Everybody is somehow leveraging the strengths of Spark. I think everybody's kind of come to terms with the fact that it's really good at this. It may not be as good at that as I want it to be in the future, but still, you know, low-hanging fruit. You can put Spark on your stack, on whatever analytics platform you have, and immediately you see gains. One of our use cases, I did a use case that was just pure data manipulation, no analytics. It took me 23 hours to do this job on one of our legacy servers. It took me seven minutes using Spark. And I don't know about any other use case, but I think that's probably the greatest efficiency gain that I've ever heard of with Spark. And, you know, if anything even close to that could be planted within each stack of each one of our business units, we would be able to, like I said, answer questions that we didn't even know existed.
So when you go back, I mean, what are you going to put into practice, or what do you want to explore from what you've learned here, from what you've seen other people doing and how that would be applicable to what you want to do for your clients? Right, I think the same thing with low-hanging fruit, right? So we're certainly not going to just get rid of everything we have and everything we do and go full production Spark. Spark complements what we do. As it stands, we don't have projects that justify going full-blown Spark, right? We have a few mega projects where I'm dealing with hundreds and hundreds of gigabytes of data, and even that's not that big when we talk about big data. But for us, I mean, 500 gigabytes of data is kind of hard to deal with on most platforms. So for us, we're going to complement our current services with Spark. So with Spark and other big data platforms, we're going to say, hey, if you've got a huge job, we've got a platform that can handle it. Going forward, I actually have this vision about what we could do, and it's pretty incredible. I mean, we could go to real-time forecasting: I could stream smart meter data into Spark and estimate forecasting models so that I know almost instantly what the electricity demand is going to be at any structure that I have data for. That's not what we're doing now, for sure, but that is definitely a great use case for where we see the energy industry going forward. In that scenario, would you need to get behind the meter itself, down to the household devices, or would the meter be a good proxy for the more detailed analysis that you might have done historically? Yeah, that's a great question, and this is kind of a great shameless plug for something we're doing. DNVGL is doing probably one of the biggest end-use studies in the world, and it's kind of what we were talking about at the beginning.
There's this technology called non-intrusive load metering. Basically, I do not even need to go into the household. This is a machine learning tool. I can put this device outside of your household, on your smart meter, and it measures... The type of energy that you're consuming? Pretty much. And associates that with the appliance or the device? It actually registers a change of 100 watts. So anytime the load goes up or down by 100 watts, it registers as an event, and that event gets recorded over and over and over again. We take that whole log of events, we use a machine learning algorithm, and we smash it up against a library of known end uses, and it's a categorization problem. So we say, okay, these events happened, that looks like an air conditioner. These events happened, that looks like a dishwasher. So that's without even going into the home; you said behind the meter. I can go up to your meter right now, put this device on it, and I'll know how much electricity you're using, with a little bit of erring on the side of caution and uncertainty, yeah, and what end uses that's actually coming from. That's interesting, because it's non-intrusive, low investment. All right, so kids, turn off the lights. John Farland said, turn off the lights when you're not home. Thanks for being with us. And like I said, we think you've got a pretty cool job. Yeah, yeah, yeah, really appreciate it. Thank you very much. DNVGL and John Farland. Back with more from San Francisco, here on theCUBE, in just a moment.
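The event-logging step described above (register a change of 100 watts or more, then match the events against a library of known end uses) can be sketched as follows. This is a toy illustration under stated assumptions: the interview says a machine learning algorithm does the matching, whereas this sketch swaps in a simple nearest-step-size lookup, and the `SIGNATURES` library, the wattages, and the readings are all hypothetical.

```python
THRESHOLD_WATTS = 100  # step size quoted in the interview

# Hypothetical signature library: typical on/off step in watts per end use.
SIGNATURES = {"air conditioner": 3000, "dishwasher": 1200, "light": 100}

def detect_events(power_watts, threshold=THRESHOLD_WATTS):
    """Return (index, delta) for each reading-to-reading change >= threshold."""
    events = []
    for i in range(1, len(power_watts)):
        delta = power_watts[i] - power_watts[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events

def classify(delta, signatures=SIGNATURES):
    """Label an event with the end use whose step size is nearest in magnitude.

    A stand-in for the machine learning classifier mentioned in the interview.
    """
    return min(signatures, key=lambda name: abs(signatures[name] - abs(delta)))

# Made-up whole-house readings in watts; the final 40 W change is ignored.
readings = [500, 500, 3500, 3500, 4700, 4700, 1700, 500, 540]
labeled = [(i, d, classify(d)) for i, d in detect_events(readings)]

for index, delta, end_use in labeled:
    state = "on" if delta > 0 else "off"
    print(f"reading {index}: {delta:+d} W -> {end_use} {state}")
```

A real system would have to handle overlapping appliances, multi-state devices, and noise, which is presumably where the learned classifier and the uncertainty mentioned in the interview come in.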