From New York, it's theCUBE. Covering Big Data New York City 2016. Brought to you by headline sponsors Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris.
Welcome back to New York City, everybody. This is theCUBE, the worldwide leader in live tech coverage. This is wall-to-wall coverage from Monday to Thursday. We're here concurrent with Strata + Hadoop World; we call this Big Data NYC. Chad Meley is here. He's the vice president of product and solutions marketing at Teradata, and he's joined by Cheryl Wiebe, who's the practice lead for analytics of things at Think Big Analytics, a Teradata company. Folks, welcome to theCUBE. It's good to see you.
Thanks. Great to be here.
So Chad, let me start with you. What are you guys doing at the event? You've got a presence here. What's the buzz?
Yeah, absolutely. I mean, you guys know that at Teradata, for the last five-plus years, we've really gone to market with an ecosystem approach, right? We believe that our clients will deploy a multitude of technologies: the Teradata database, things like Hadoop, NoSQL, Spark. So events like this are great, where we can interact with customers and prospects and talk with them about what they're doing around combining these solutions and some of the integration features we offer in that regard.
So Cheryl, we haven't talked much about IoT at this event, but you guys have a big IoT initiative going on. Maybe set that up for us.
Okay. Well, first of all, as Chad just mentioned, a lot of our customers have been building their corporate data asset with all their back-end systems for years, but now the new sources of data are going out into the wild, so to speak. They're being sourced from edge devices. And so we're bringing the data that comes from those sensors and from those edge devices into, potentially, a sensor cloud, uniting it with the traditional enterprise data asset.
And so we've built a bunch of connectors to those devices. We've built a listener platform that can, in a very self-serve manner, connect with various edge gateways from leading IoT platform providers that we're partnering with. But on the analytics side, when that sensor data comes in, our customers are just getting started with experimenting on it and figuring out how to condition it, how to aggregate it, how to build rules on it. And then eventually they get to build predictive models, this sort of saving-the-world kind of streaming analytics you hear about. So we've been doing a lot of work in the last few years on making those first steps, those first forays, those first experiments easier to do. And we've been harvesting a lot of that work into what we call proven analytics. They're kind of accelerators to help. They're code, they're data models, they're prefab, predefined metrics and visualizations that we bring to the table in our little bag of tricks when we go work with customers. And so we help them do things like: hey, there are all these anomalous or noisy conditions in the sensor data. How do I write a bunch of rules to standardize it, organize it, replace nulls with zeros if that's what your rule is? So that's the sort of work we've done with Caterpillar, and that's foundational to when we do predictive maintenance to help prevent a mining truck from breaking down. We predict which truck out of a fleet is going to break down, so we can alert the mining manager at the site that, hey, this is the one you should pull out of the pit today because it could stop the whole operation.
You're seeing that in the field, right? That condition-based maintenance is one of the more popular and initial implementations of IoT, right? So this whole idea of, I've got some expensive, heavy equipment, right? It could be a train, it could be an aircraft engine, it could be an MRI device.
And the idea is: what can I learn from the sensor data, integrating that with other things like maintenance records or other usage of the device, to be able to predict failure? And if I can do that, there are many things I can do, right? I can take scheduled downtime to repair that. I can also ensure that the necessary inventory is there, right, so that in case it does go down we've minimized the downtime. And there are a lot of commonalities, whether you're trying to, again, predict failure of a train or an MRI device or a 500-ton dump truck in the case of Caterpillar. And taking these experiences around what combinations of analytics work best and making them repeatable for other clients, we think, is very good for our clients, right? Because it'll shrink the time to value and the investment and take risk out of the project.
Expensive assets that you want to keep utilizing.
Absolutely.
And now I have to do a truck roll to figure out, well, after the fact, right? And so, the sensor cloud, what is the sensor cloud? Is it a new use case for cloud? Is it sort of a halfway house for data?
That's a good question. Well, sensor cloud, I mean, it's just one way of looking at the data lake and the part of the data lake that houses or stores sensor data. It's a relatively inexpensive place to house a great amount of data. So sensor data is the biggest part of big data. You know, it's expanded by 10- or 100-fold, even over the social media kind of data that we were talking about. So yeah, it's basically a data lake. Usually it'll have specific constructs for time series data, because what sensor data is, is it takes state information from a sensor: oil pressure at this time, this time, this time; fuel rail line temperature at this time, this time; and it streams all of that sensor data in. So that's what sensor data consists of. But then there's important tagging data. Well, what was the machine? What was the device? What were the conditions around the local weather? What have you?
So there's metadata that needs to be brought in. So all of that gets stored in the sensor data lake, or the sensor cloud, as some people call it. And then it's really important to have connective tissue into the corporate data asset. You need to be able to reach into the sensor cloud. You may need to deploy certain models; certain rules for conditioning the data, even if they're authored somewhere else, need to be pushed down to be run at the cloud. And then sometimes those data management rules are their own form of analytic that could be pushed down to the edge device. And that's something that we're experimenting with now.
To that end, I want to add something that the Siemens team shared with me. Siemens is a client of ours, a big industrial giant. And obviously they're collecting sensor data from locomotives at the edge. But what they were sharing with us is that unless they bring that into a central repository, they won't make correlations. Because a train failure, say of a particular bearing on one locomotive, only happens once every four years. You're not going to get enough data to do the analytics. But when you take thousands of trains, you start to see the patterns faster.
So there are two quick questions. One is that to do this properly, you have to be able to design the models. You have to be able to set it up so the models can be trained over time. And you have to ensure that the scoring takes place and actuates or enacts something. How does that play out? Because Teradata is known for being a large central resource. So how does that map? How do those activities map back to your traditional approach, and how are you changing your traditional approach to deal with the reality of the edge?
Well, there are a couple of things we're doing. First of all, as you know, there's a ton of innovation going on in the open source world.
In particular, lots of our customers are saying, we've got all these cool algorithms that are in CRAN, the group that centralizes all the R code out there. All this sort of open source data science out there. So we've got a lot of technology that can help R run in-database. So we can bring some of that innovation in data science right into the traditional corporate data center, either through data labs or things like that. And then we're building models, yes, you're right, more traditionally in that corporate central ecosystem. But the other thing we're doing is certifying on a number of our partners' gateways, so that when the model that you built in the center, which is a very distinct and different activity from the scoring...
That's essentially a little bit abstract.
Yeah, exactly. So the scoring can be done in a completely different environment. And that's specific to that one device. So a good example is Apple's got some huge central ecosystem on which they're building predictive models that can detect sleep or activities like walking or exercise, et cetera. When they push that out to my watch, that's a tiny little device; it has very little compute power, very little memory, I mean relatively speaking. So it's only that sub-portion that's specific to my watch that's running at the edge. So for me, in this consumer device, this is my edge.
Yeah, I'll give you a very simple example. I don't know if it's a coincidence, but I am also an Apple Watch user and did the recent upgrade, and I was driving around with my daughter in Los Angeles not too long ago. Rented a car, drove up somewhere, parked the car, got out, and my watch popped up and said, your car is located at.
So my car actually figured out, or my watch along with my phone was actually able to figure out, that I had been driving and that I had stopped, and I had stopped long enough that the car was parked. And that's an example of pushing down new scoring and models to be able to say, when this happens, interpret it in this way. Okay, next question related to this, if I can. Teradata has also historically been associated with the data warehouse, proximate to the IT organization, and you guys have had a long legacy of being able to span some of the tensions that have historically existed between people in the business who wanted to study things, business results and business performance, and the IT organization, who was still in a position to actually manage the technology. As you move to the edge, you've got business people, IT people, and operational technology people, OT people. You have, therefore, interesting visibility into how this three-planets problem is playing out. What is happening in that intersection? Because that's going to be crucial to predicting, if I may, how a lot of the edge and IT relations actually come together.
I think there's a lot to that; it's a great question. One part of it that I want to initially address is this intersection between OT and IT, which I think is fascinating. The more that I've researched it and gone out and talked to our clients, you see that there's a huge problem that needs to be solved when it comes to IoT and big data, because a lot of the initiatives are starting off in OT. OT is the group that is putting sensors in an automobile or a tractor or a train.
These are real-time guys.
Absolutely, and they're technologists, and they do not report up to the CIO. That's correct. But then you think about the stuff Cheryl's talking about around an architecture and a shared data management platform and things. That's traditionally the domain of IT. So a big issue for a lot of industrial IoT giants is how you bridge this divide, right?
Because OT have their own systems, with historians and things like that. And one of the things that we're offering and addressing in this is, it's not unlike, as you just said, our ability in the past to bridge with the business. You have to give them an easy-to-use solution, but increasingly you have to make it more self-service. So there's the idea of being able to very easily ingest sensor data into a shared environment through a product we have called Listener, where it's very GUI-driven. You don't have to be a technologist to learn how to do it. You don't have to get in the queue to have IT get your sensor data into a place where you can utilize it. Or there are products we have like AppCenter or Aster on Hadoop that, again, are geared at taking mere mortals and giving them the power of analytics through shared functions and reuse of data. So I think increasingly this is a big part of our value proposition and how we can add value, in terms of just making things more self-service to bridge that IT and OT divide.
I think the other thing that we're increasingly doing is going back to our roots to go directly to the business people. In this case, it's the plant engineers, the process engineers, and saying, you know what? You need all your data at your fingertips. And they're having a lot of trouble getting it out of the historians. So providing that kind of visibility is really important. So we're doing things like connectors to get data out of historians. I think you're going to start seeing the IoT platform providers put some of their technology, their data capture technology, right in plants. There will be a cultural divide, because typically there's been this air gap between the plant, and the local area network running in the plant, and the rest of the corporation. So there are some cultural and data policy issues to be bridged there.
But I think that there's a ton of plants out there in corporate America where there's a huge amount of data sitting in what's literally called the data graveyard or the data boneyard. And we're starting to see projects, and interest in doing projects, to mine that data. We can't get our process to maturity without analyzing all of our data. And that data graveyard needs to be dug up, so to speak.
Well, the observation we've made is that when you think about context, data in one context may have no value, but in another context, it may have significant value. And in that graveyard, if you apply it to the right context, especially when you start talking about time series and whatnot, there are significant nuggets of value that companies are going to be able to generate over the next few years.
How is your ecosystem changing, evolving as you get into this new market? I'm inferring that you work with customers like Caterpillar and Siemens already, but you see this opportunity to extend into operations technology and industrial internet, if you will. How is your ecosystem shifting?
Yeah, I mean, if you look at some of the moves we've made, right, we started this journey over half a decade ago, moving from being a data warehouse company to really embracing an analytical ecosystem. So we've been making investments in bringing these things together, right? So as an example, Cheryl is talking about streaming sensor data into your data lake. And while we don't think a data lake is tied to a technology, the reality is most of our clients are using Hadoop for that. And then in Teradata, they might have things like inventory, right, maintenance records, sales data, financials, and if you're going to connect the dots on those and make correlations, you need capabilities like we have with QueryGrid, which allows you to execute a single query, leverage the parallelism of both environments, and connect those dots.
And I mentioned earlier in the segment the things we have around Listener, right? Which is perfect, tailor-made for streaming and sensor data. So I think a lot of the things that we've been working on from a horizontal framework are directly applicable to IoT, but then you couple that with new partnerships we've developed with the likes of Siemens. They actually have an IoT platform as a service that utilizes Teradata and Hortonworks together with some other IP that they have. There are other pending partnerships that Cheryl mentioned, right, where they have the edge gateways. This is increasingly an area where we need to develop those partnerships and certify to further enable the ecosystem. So I think we're well positioned on that. And then Cheryl was also mentioning, and like I alluded to, more business solutions like condition-based maintenance, or some of the other ones that we don't necessarily have time to get into today, that are really geared at addressing fundamental business questions that organizations are facing around IoT. What kinds of analytics are best suited for this problem? What sensor data should I trust and keep? We can help them with that from a consultative perspective.
Well, the organizational issues that you touched on are pretty substantial, the divide between OT and IT. Security comes into it. I wish we had more time to talk about the TAM as well. We really haven't gotten into that. Unfortunately, we have to leave it there, but thanks so much for coming on theCUBE.
It's our pleasure.
Our pleasure.
All right, keep it right there, everybody. We'll be back with our next guest right after this short break. We're live from New York City. Right back.