Live from San Francisco, it's theCUBE. Covering Google Cloud Next 2018. Brought to you by Google Cloud and its ecosystem partners. Hello everyone, welcome back. This is theCUBE live in San Francisco for Google Cloud's big event. It's called Google Next '18, for 2018. It's their big cloud show showcasing all their hot technology, a lot of breaking news, a lot of new tech, a lot of new announcements. Of course, we're bringing it here for three days of wall-to-wall coverage live. This is day two. Our next guest is Tim Kelton, co-founder of Descartes Labs, doing some amazing work with imagery and data science, AI, TensorFlow, using the Google Cloud Platform to analyze nearly 15 petabytes of data. Tim, welcome to theCUBE. Thanks for coming on. Thanks for being here. So we were just geeking out before we came on camera over the app that you have. Really interesting stuff you guys have going on. Again, really cool. Before we get into some of the tech, tell us about Descartes Labs. You're a co-founder. Where did it come from? How did it start? And what are some of the projects that you guys are working on? I think, therefore, I am. Exactly, exactly. Yeah, so we're a little different story than maybe a normal startup. I was actually at a national research laboratory, Los Alamos National Laboratory, and there was a team of us that were focused on machine learning and using data sets like remotely sensing the earth with satellite and aerial imagery. And we were working on that from around 2008 to 2014. And then we saw this explosion in use cases for machine learning, applying it to real-world problems. But at the same time, there was this explosion in cloud computing and in how much data you could store and train on. So we started the company in late 2014, and now here we are today. We have around 80 employees. And what's the main thing you guys do from a data standpoint? Where's the data come from? 
Take a minute to explain that. Yeah, so we focus on a lot of geospatial-centric data, a lot of satellite and aerial imagery, a lot of what we call remote sensing: sensors orbiting the earth or flying low over the earth, all different modalities, such as different bands of light, different radio frequencies, all of those types of things. And then we fuse them together and have them in our models. And what we've seen is that there's not just one magic data set that gives you the pure answer, right? It's fusing a lot of these data sets together to tell you what's happening, and then building models to predict how those changes affect our customers, their businesses, their supply chains, all of those types of things. Let's talk about, I want to riff on something real quick. I know I want to get to some of the tech in a second, but my kids and I talk about this all the time. I've got four kids, two in high school, two in college, and they see Uber remapping New York City every five minutes with the data they get from GPS. And we started riffing on drones and self-driving cars or aerial cars. You know, if we want to fly in the air with automated helicopters or devices, you've got to have some sort of coordinate system. I mean, this is why we need this geospatial work. And so, I know it's fantasy now, but what you guys are getting at could be an indicator of the kind of geospatial work that's coming down later. Right now there's some cool things happening, but you'd need kind of a namespace, like coordinates, so you don't bump into something, or so these automated drones don't fly near airports or cell towers or windmills, wind farms. And those are the types of problems we solve, or we look to solve, you know, changes happening over time. 
Often it's the temporal cadence that's almost the key indicator in seeing how things are actually changing over time, and people are coming to us and saying, you know, can you quantify that? We've done things like agriculture, looking at crops grown, looking at every single farm over the whole U.S., then building that into our models and saying, you know, how much corn is grown in this field, and then testing it back over the last 15 years. Then, as new imagery comes in daily, flooding in through our cloud-native platform, we just re-run those models and say, you know, are we producing more today or less today? And then how is that data used? For example, take the agriculture example. That's used to say, okay, this region is maybe more productive than this region. Is it because of weather? Is it because of other things that they're doing? You can go back; there's all different types of use cases. Maybe if you're insuring that crop, you might want to know if it's flooded more on the left side of the road or the right side of the road. You know, as a predictive indicator, you might say this is looking like a drought year. How have we done in drought years like 2007 and 2000? You look at irrigation trends. Well, and we were talking off camera about the ground truth. Can you use IoT to actually calibrate the ground truth? Yeah, and that's the sensor fusion. We're seeing, you know, everywhere around us, just floods and floods of sensors. So we have the sensors above the earth looking down, but then as you have more and more sensors on the ground, that's this set of ground truth that you can calibrate against and go back and train over and over again. It's a lot harder problem than, you know, is this a cat or a dog? Yeah, this is why I was riffing on the concept of a namespace with the developer concept around, this is actually space. 
If you want to have flying drones deliver packages to transportation, you need to know, you know, some sort of triangulation know what to do. But I got to ask you a question. So what are some of the problems that you're asked to look at? Now that you have, you know, you have the top down view geo space, you've got some ground truth sensor exploding in with more and more devices at the network as they instrument, you know, they're connected to IP and whatnot. What are some of the problems that you guys get asked to look at? I mentioned the agriculture. What else are you guys solving? Any sort of like land use or land classification or facilities and facility monitoring, it could be any sort of physical infrastructure that you're wanting to quantify and predict how those changes over time might impact that business vertical. You know, and they're really varied. They're everything from energy and agriculture and just real estate and things like that. Just last Friday I was talking with, we have two parts to our company. We have kind of our, from the tech side, we have the engineering side, which is normal engineering, but then we also have this applied science where we have a team of scientists that are trying to build models often for our customers because they're not, you know, this is geospatial and machine learning. That's a rare breed of person. You don't want to cross pollinate. Yeah, and that's just not everywhere. Not all of our customers have that type of individual. But they were telling me, they were looking at, you know, the hurricane season coming up this fall and they had a building detector and they can detect all the buildings. So in just like a couple hours, they ran that over all of the state of Florida and identified every building in the whole state of Florida. So now as the seasons come in, they have a way to track that. It can be proactive and notify someone, hey, your building might need some boards on it or some sort of risk. 
Yeah, and the last couple of years, look at all the weather events. You know, in California, we've had droughts and fires, but then you have flooding and things like that. And you're even able to start taking new types of sensors that are coming out. Like the European Space Agency has a sensor that we ingest that does synthetic aperture radar, where it's sending a radar signal down to the earth and capturing it. So you can do things like water levels in reservoirs and things like that. And look at irrigation for farming. Where's the drought going to be? Where's the flooding going to be? So for the folks watching, go to descarteslabs.com slash search. They've got a search engine there. I wish we could show it on the screen here, but we don't have the terminal for it on this show. But it's a cool demo. You can search and find, you can pick an area, a football field, an irrigation ditch, anything, a cell tower, a wind farm, and find duplicates, and it gives you a map around the country. So the question is, okay, what is going on in the tech? 'Cause you use cloud for this. So how do you make it all happen? Yeah, so we have two real big components to our tech stack. The first is, obviously, we have lots and lots of satellite and aerial imagery. That's one of the biggest and messiest data sets, and there's all types of calibration workloads that we have to do. So we have this ingest pipeline that processes it, cleans it, calibrates it, removes the clouds, not as in cloud computing infrastructure but as in clouds overhead, and the shadows they cast down on the earth. We have this big ingestion process that cleans it all and then finally compresses it, and then we use things like GCS as an infinitely scalable object store. And what we really like on the GCS side is the performance we get, because we're reading and pulling that compressed imagery in and out all day long. So every time you zoom in or zoom out, we're expanding and removing that. 
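The ingest flow he describes, clean, calibrate, compress, then park the result in an object store like GCS, can be sketched roughly as follows. Everything here is illustrative: a dict stands in for the object store (a real pipeline would use a storage client such as google-cloud-storage), and the processing steps are placeholders, not Descartes Labs' actual pipeline.

```python
import zlib

# Stand-in for an object store like GCS; a real pipeline would use a
# storage client instead of an in-memory dict.
object_store = {}

def calibrate(pixels):
    # Placeholder radiometric calibration: scale raw sensor counts.
    return [p * 0.0001 for p in pixels]

def remove_clouds(pixels, cloud_mask):
    # Drop pixels flagged as cloud (or cloud shadow) by a mask.
    return [p for p, cloudy in zip(pixels, cloud_mask) if not cloudy]

def ingest_scene(scene_id, raw_pixels, cloud_mask):
    """Clean, calibrate, compress, and store one satellite scene."""
    pixels = calibrate(raw_pixels)
    pixels = remove_clouds(pixels, cloud_mask)
    payload = zlib.compress(",".join(map(str, pixels)).encode())
    object_store[f"scenes/{scene_id}.z"] = payload
    return len(payload)

# Ingest a tiny fake scene: 6 raw pixel values, 2 flagged as cloudy.
size = ingest_scene("landsat-001", [100, 200, 300, 400, 500, 600],
                    [False, False, True, False, True, False])
print("stored", size, "compressed bytes")
```

The scene ID and scale factor are made up; the point is only the shape of the pipeline, with compression last so reads pull the smallest possible objects back out of storage.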
But then our models, sometimes what we've done is, maybe we're making a model on vegetation and we just want to look at the infrared bands. So we'll want to fuse together satellites from many different sources, fuse together ground sources, sensor sources, and maybe pull in just one of those bands of light, not pull the whole files in. So that's what we've been building in our API. So how do you find GCP? What do you like? We've been asking all the users this week: what are the strengths? What are some of the weaknesses? What's on their to-do list? Documentation comes up a lot. We'd like to see better documentation, but okay, that's normal. What's your perspective? If you write code or develop, you always want something, feature parity and stuff. So from our perspective, one of the most core strengths of GCP is the network. The performance we've been able to see from the network is basically on par with what we used to have. You know, when we were at national laboratories, we'd have access to high-performance supercomputing, some of the biggest clusters in the world, and the network and GCS have let us scale linearly. Like our ingest pipelines: we processed a petabyte of data on GCP in 16 hours through our processing pipeline on 30,000 cores, and it'll just scale that network bandwidth right up. Do you tap the premium network service, or is it just the standard network? This is just stock. That was actually three years ago that we got that bandwidth. How many cores? That was 30,000. Because Google announced, we were going to talk this morning about their standard network and the premium network. I don't know if you saw the keynote, where you get the low latency, proximate to your users, if you pay a little bit more. But you're saying on the standard network you're getting just incredible performance. Yeah, that was early 2015. 
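On the point above about pulling in just the infrared bands for a vegetation model: the classic piece of band math behind that kind of work is NDVI, the normalized difference vegetation index, computed per pixel from the near-infrared and red bands. This is a generic illustration with invented reflectance values, not Descartes Labs' API.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index per pixel: (NIR - red) / (NIR + red).
    Values near +1 suggest dense vegetation; values near 0, bare soil or water."""
    return [(n - r) / (n + r) for n, r in zip(nir, red)]

# Hypothetical reflectance values for three pixels: crop, soil, water-ish.
values = ndvi([0.6, 0.3, 0.1], [0.1, 0.2, 0.1])
print([round(v, 3) for v in values])
```

Because the index needs only two bands, a band-selective API like the one described can fetch a small fraction of each scene instead of whole files.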
It was just a few people in our company scaling up our ingest pipeline. We look at that, that was, back then, 40 years of imagery from NASA's Landsat program that we pulled in, and not that far off in the future, that petabyte's going to be a daily occurrence. So we wanted our ingest to scale, and one of our big questions early on was actually, could the cloud even handle that type of scale? So that was one of the earliest workloads on things like- And you feel good about that now, right? Oh yeah, and that was one of the first workloads on preemptible instances as well. What's on the to-do list? What would make your life better? So we've been working a lot with Istio, that was shown here. So we actually gave a demo. We were in a couple of talks yesterday on how we leverage and use Istio in our microservices. Our APIs are all built on that, and so is our multi-tenant SaaS platform, so our ML team, when they're building models, they're all building models for different use cases, different bands of light, different geographic regions, different temporal windows. So we do all of that in Kubernetes. And so those are all- What does Istio give you guys? What's the benefit of Istio? Yeah, for us, we're only using it on a few of our APIs, and it's things like really being able to see, when you start splitting out these microservices, that network and that node-to-node or container-to-container latency and where things break down; being able to do circuit breaking or retries, trying a request, say, three different times before returning a 500; rate limiting some of your APIs so they don't get crushed, or so you can scale them appropriately; and then actually being able to make custom metrics and fuse those back into how GKE scales the node pools and things like that. So okay, that's how you're using it. So you were talking about Istio before. Are there things that you'd like to see that aren't there today, more maturity or? 
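The retry behavior he describes, trying a request a few times before handing back a 500, is something Istio lets you declare on a route rather than code into each service. A rough sketch of such a VirtualService; the service name and numbers are hypothetical, not Descartes Labs' actual configuration:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: imagery-api        # hypothetical service name
spec:
  hosts:
  - imagery-api
  http:
  - route:
    - destination:
        host: imagery-api
    retries:
      attempts: 3          # try up to three times before failing
      perTryTimeout: 2s
      retryOn: 5xx,connect-failure
```

Because this lives in the mesh config, the retry policy can change without redeploying the service behind it.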
Yeah, I think Istio's at a very early starting point on all of these types of tools. So you want more? Oh yeah. A 1.0. Definitely. But I love the direction they're going, and I love that it's open. And if I ever wanted to, I could run it on-prem, but we were built basically native in the cloud. So all of our infrastructure is in the cloud; we don't even have a physical server. What does open do for you, for your business? Is it just a good feeling? Do you feel like you're less locked in? Do you feel like you're giving back to the community? We read the Kubernetes source code. We've committed changes. Just recently, there's Google's open-sourced OpenCensus library for tracing and things like that. We committed PRs back into that last week. If something doesn't quite work how we want, we can actually go and- You can push it upstream. You can push that upstream. Add value. For your business. When you get into really hard problems, you kind of need to understand that code, sometimes at that level. Or build tools, where Google took their internal tool Blaze and open-sourced it as Bazel. And so we've been using that on our monorepos to do all of our builds. So you guys take it downstream, you work on it, and then upstream contributions go back out? Sometimes, yeah. Whenever you need to. Yeah, even Kubernetes, if nothing else, we've looked at the code multiple times and said, oh, this is why that autoscaler is behaving this way. Now I can understand how to reshape my workload a little bit and alter it so that the scaler works a little more performantly, or we extract that last 10% of performance to try and save that last 10% of cost. This is fascinating. I would love to come visit you guys and check out the facility. It's the coolest thing ever. I think it's the future. There's so much tech going on. So many problems that are new and cool. Yeah. 
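For context on the Bazel point above: a monorepo build in Bazel is driven by BUILD files written in Starlark, declaring targets and their dependencies. A minimal, hypothetical example (the target and file names are made up):

```starlark
# BUILD file (Starlark): hypothetical targets for a Python service in a monorepo.
py_library(
    name = "imagery_lib",
    srcs = ["imagery_lib.py"],
)

py_binary(
    name = "ingest",
    srcs = ["ingest.py"],
    deps = [":imagery_lib"],   # Bazel rebuilds only targets whose deps changed
)
```

Because dependencies are declared explicitly, Bazel can cache and rebuild incrementally across a large monorepo, which is the usual draw of the tool.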
And you've got the compute to boot behind it. Final question for you. How are you using analytics and machine learning? What are the key things you're using from Google? What are you guys building on your own? If anything, can you share a quick note on the ML and the analytics? How are you guys scaling that up? Yeah, so we've been using TensorFlow since very early days. That geo-visual search that you're seeing, we use TensorFlow models in some of those types of products. So we're big fans of that as well, and we'll keep building out models where it's appropriate. Sometimes we use very simple packages. You know, you're just doing linear regression or things like that. So you're applying that? Yeah, it's the right tool for the right problem, and it's knowing that and applying it. And just quickly, you guys are for-profit, non-profit, what's the commercial side? Yeah, we're for-profit. We're a Silicon Valley VC-backed company, even though we're in the mountains. Who are the VCs? Which VCs are in? Crosslink Capital is one of our leading VCs, Eric Chin and that team down there, and they've been great to work with. So they took a chance on a crazy bunch of scientists from up in the mountains of New Mexico. Hey, you can't go wrong, that sounds like a good VC-backed opportunity. Yeah, and we had a CEO that was kind of from the Bay Area in Mark Johnson, and so we needed kind of both of those to- I mean, I'm a big believer you throw money at great smart people in emerging markets like this. And you've got a mission that's super cool. It's obvious there's a lot to do, and there's opportunities as well. Yeah, it's a tremendous opportunity. Congratulations, Tim. Thanks for coming on theCUBE. Tim Kelton, he's the co-founder of Descartes Labs, here in theCUBE, breaking down, bringing the technology, the applied businesses, all these brains working on the geospatial future. For theCUBE, we are geospatial here in theCUBE at Google Next in San Francisco. 
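On the linear regression point above, "the right tool for the right problem": an ordinary least-squares trend line needs nothing more than a few lines of plain Python. The yield numbers below are invented, purely for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = a*x + b over paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical corn-yield index by year: is production trending up or down?
years = [2014, 2015, 2016, 2017, 2018]
yields = [1.0, 1.1, 1.3, 1.2, 1.5]
a, b = linear_fit(years, yields)
print(f"trend: {a:+.3f} per year")
```

For a single trend like this, a closed-form fit is simpler and cheaper than reaching for a deep-learning framework, which seems to be the point being made.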
I'm John Furrier, Dave Vellante. Stay with us for more coverage after this short break.