From San Mateo, California, it's theCUBE, covering SnapLogic Innovation Day 2018. Brought to you by SnapLogic.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're at the Crossroads. That's 92 and 101 in the Bay Area. If you've been through it, you've had time to take a minute and look at all the buildings, because traffic's usually not so great around here. But there are a lot of great software companies that come through here. It's interesting, I always think back to the Siebel building that went up, and now that's Rakuten, who we all know from the Warriors jerseys, the very popular Japanese retailer. But that's not why we're here. We're here to talk to SnapLogic. They're doing a lot of really interesting things, and they have been in data. Now they're doing a lot of interesting things in integration, and we're excited to have a many-time CUBE alum. He's Greg Benson. Let me get the title right: Chief Scientist at SnapLogic and, of course, a professor at the University of San Francisco. Greg, great to see you.

Great to see you, Jeff.

So I think the last time we saw you was at Flink Forward, right? Interesting open-source project, data, AdMove. So the open-source technologies, and the technologies available for you guys to use, just continue to evolve at a crazy, breakneck speed.

Yeah, it is. Open source in general, as you know, has really revolutionized all of computing, starting back with Linux and what that's done for the world, right? In one sense it's a boon, but it also introduces a challenge, right? Because how do you choose? And even when you do choose, do you have the expertise to harness it? The early social companies really leveraged Hadoop and Hadoop technology to drive their business and their objectives. And now we've seen a lot of that technology be commercialized and have a lot of services built around it. And SnapLogic is doing that as well.
We help reduce the complexity and make a lot of this open-source technology available to our customers.

Let's talk about a lot of different things. One of the things is Iris. So Iris is your guys' leverage of machine learning and artificial intelligence to help make integration easier, if I get that right?

That's correct, yeah. Iris is the umbrella term for everything that we do with machine learning and how we use it to enhance the user experience. One way to think about it is when you're interacting with our product. We've made the SnapLogic Designer a web-based, drag-and-drop interface for constructing these integration pipelines. We connect these things called Snaps, right? It's like building with Legos to build out these transformations on your data. And we would like to believe that we've made the Designer one of the simplest interfaces for this type of work. But even with that, there are many times where you have to make decisions. What type of transformation do you do next? How do you configure that transformation? If you're talking to an Oracle database, how do you configure it? What are your credentials if you talk to Salesforce? If I'm doing a transformation on data, which fields do I need? What kind of operations do I need to apply to those fields? So as you can imagine, there are lots of situations where you have to make decisions as you're building out these data integration pipelines. One way to think about Iris is that it's there to help reduce the complexity, to help reduce the kinds of decisions you have to make at any point in time. So it's contextually aware of what you're doing. And at that moment in time, based on mining the thousands of existing pipelines and scenarios in which SnapLogic has been used, we leverage that to train models that make recommendations, so that you can speed through whatever task you're trying to do as quickly as possible.

Right.
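As an editorial illustration of the idea described above (recommending the next step by learning from existing pipelines), here is a minimal Python sketch. It is not SnapLogic's implementation: Iris is described in this interview as using trained models, whereas this sketch simply counts which Snap most often follows another. The pipelines, the Snap names, and the `train` and `recommend` functions are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical historical pipelines; the Snap names are illustrative only.
PIPELINES = [
    ["File Reader", "CSV Parser", "Mapper", "Redshift Insert"],
    ["File Reader", "CSV Parser", "Filter", "Mapper"],
    ["Salesforce Read", "Filter", "Mapper", "Redshift Insert"],
    ["File Reader", "JSON Parser", "Mapper"],
]


def train(pipelines):
    """Count how often each Snap directly follows another Snap."""
    follows = defaultdict(Counter)
    for pipeline in pipelines:
        for current, nxt in zip(pipeline, pipeline[1:]):
            follows[current][nxt] += 1
    return follows


def recommend(follows, current, k=3):
    """Return up to k of the most frequent successors of `current`."""
    counts = follows.get(current, Counter())
    return [snap for snap, _ in counts.most_common(k)]


model = train(PIPELINES)
print(recommend(model, "File Reader"))  # ['CSV Parser', 'JSON Parser']
```

Retraining on new pipelines only requires calling `train` again on the updated history, which is a very rough stand-in for the adaptive behavior discussed in the interview.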
It's such an important piece of information, because if I'm doing an integration project using the tool, I don't have the experience of the thousands and thousands of pipelines, and actually you guys are now doing, what, a trillion document moves last month? I just don't have that expertise. You guys have the expertise. And truth be told, as unique as I think I am, and as unique as I think my business processes are, probably a lot of them are pretty much the same as those of a lot of other people who are hooking up Salesforce to Oracle, or hooking up Marketo to their CRM. So you guys really take advantage of that, using the AI and the ML to help guide me along, with what is probably a pretty high-probability prediction of what my next move is gonna be.

Yeah, absolutely. Back in the day, you might have heard of wizards, right? These sorts of things that would walk you through. And really, it seemed intelligent, but it wasn't really intelligence or machine learning, right? It was really just hard-coded facts or heuristics that hopefully would be right for certain situations. The difference today is that we're using real data, gigabytes of metadata that we can use to train our models. And the nice thing about that is it's not hard-coded, it's adaptive. It's adaptive both for new customers and for existing customers. We have customers that have hundreds of people who just use SnapLogic to get their business objectives done. And as they're building new pipelines, as they're putting in new expressions, we are learning that for them within their organization. So their coworkers can come in the next day and get the advantage of all the intellectual work that was done to figure something out. It will be learned and then made available through Iris.

Right, I love this concept of operationalizing the machine learning and the augmented intelligence. So how do you apply it?
Don't just talk about it, don't give it the name of some dead smart person, but actually apply it to an application that you start to benefit from. And that's really what Iris is all about. So what's changed the most in the last year since you launched it?

So, one thing I'll say is the most interesting thing that we discovered when we first launched Iris. I should say that one of the first Iris technologies we introduced was something called the Integration Assistant. This was an assistant that would make recommendations of the next Snap as you're building out your pipeline, so the next transformation or the next connector. Before we launched it, we did lots of experimentation with different machine learning models, and we did different training to try to get the best accuracy possible. And what we really thought was that this was going to be most useful for the new user, right? Somebody who hasn't really used the product. It turns out that when we looked at our data and at how it got used, yes, new users did use it, but existing or very skilled users were using it just as much, if not more, because it was so good at making recommendations that it was like a shortcut. Even if they knew the product really well, it's still actually a little more work to go through our catalog of 400-plus Snaps and pick something out. When it's just sitting right there saying, hey, here's the next thing you need to do, you don't even have to think, you just have to click and it's right there. So it speeds up the expert user as well. That was an interesting revelation about machine learning and our application of it. In terms of what's changed over the last year, we've done a number of things.
One is operationalizing it, so that instead of training off a snapshot, we're now training on a continuous basis, and we get that adaptive learning that I was talking about earlier. The other thing that we have done, and this is kind of getting into the weeds, is that we were using a decision tree model, which is a type of machine learning algorithm, and we have now switched to neural nets. So we use neural nets to achieve higher accuracy and also a more adaptive learning experience. The neural net allowed us to bring in this organizational information, so that your recommendations would be more tailored to your specific organization. The other thing that we're just on the cusp of releasing concerns the Integration Assistant. We were working on a beginning-to-end type of recommendation, where you were working forward. But what we found, in talking to people in the field and to our customers who use the product, is that there are all kinds of different ways that people interact with the product. They might know where they want the data to go, and then they might want to work backwards. Or they might know that the most important thing they need this to do is join some data. It's like when you're solving a puzzle with the family: you either work on the edges, or you put some clumps together in the middle and work from there. That puzzle-solving metaphor is where we're moving the Integration Assistant, so that you can fill in the pieces that you know, and then we help you work in any direction to make the puzzle complete. So that's something that we've been adding to. We recently started recommending, based on your context, the most common sources and destinations that you might need, but we're also about to introduce this idea of working backwards, and also working from the inside out.

Right.
And then we just had Gaurav on, and he was talking about how the next iteration of the vision is to get to autonomous.

Correct.

To get to where the thing not only can guess what you want to do, with a pretty good idea, but it actually starts to basically do it for you, and I guess it would flag you if there's some strange thing or a need for assistance, and really move almost to full autonomy in this integration effort.

Yeah, so, I mean, yes, we want to get to that vision. And I'm the one that has to make that vision a reality, right? Yeah. So the way I like to explain it is that customers or users have a concept of what they want to achieve, and that concept is a thought in their head, and the goal is to get that concept or thought into something that is machine executable, right? What's the pathway to achieve that? Or, if somebody's using SnapLogic for a lot of their organizational operations or for their data integration, we can start looking at what they're doing and make recommendations about other things that they should or might be doing, right? So it's kind of a two-way thing, where we can give you some suggestions, but also, people know what they want to do conceptually, so how do we make that realizable as something that's executable? I'm working on a number of research projects that are getting us closer to that vision, and one that I've been very excited about is that we're working a lot with NLP, natural language processing, as many companies and other products are investigating. Our use in particular is in a couple of different ways, to be concrete. We've been working on a research project in which you don't have to know the name of a Snap. Right now you get this thing called the Snap Catalog and, like I said, it has 400-plus Snaps, right? To go through the whole list, it's pretty long, okay?
You can start to type a name and, yeah, it'll narrow it down, but you still have to know exactly what that Snap is called. What we're doing is applying machine learning to allow you to either speak or type the intention of what you're looking for. Like, I want to parse a CSV file. Now, we have a File Reader and we have a CSV Parser, but if you just typed in "parse a CSV file," it may not find what you're looking for. So we're trying to take the human description and connect that with the actual Snaps that you might need to complete your task. That's one thing we're working on. I have two more. The second one is a little bit more ambitious, but we have some preliminary work that demonstrates this idea of actually saying or typing what you want an entire pipeline to do, okay? So I might say, I want to read data from Salesforce, I want to filter out only records from the last week, and then I want to put those records into Redshift. And if you were to just say or type what I just said, we would give you a pipeline that maybe isn't entirely complete, but is working and allows you to evolve it from there. So you don't have to go through all the steps of finding each individual Snap and connecting them together. This is still very early on, but we have some exciting results. And then the last thing that we're working on with NLP: in SnapLogic, we have a nice UI, and it's really good, but a lot of the heavy lifting in building these pipelines is in the actual manipulation of the data. To manipulate the data, you need to construct expressions, and in SnapLogic we have a JavaScript expression language. So you have to write these expressions to do operations, right? One of our next goals is to use natural language to help you describe what you want those expressions to do, and then generate the expressions for you.
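To make the first of these ideas concrete, here is a small editorial sketch in Python of matching a typed intention to catalog entries. It deliberately swaps in plain keyword overlap (Jaccard similarity) for the machine-learned matching described in the interview, so it illustrates the problem and is not SnapLogic's method. The catalog contents, descriptions, stop-word list, and the `search` function are hypothetical.

```python
import re

# Hypothetical catalog: Snap name -> short description (illustrative only).
CATALOG = {
    "File Reader": "read a file",
    "CSV Parser": "parse csv text into documents",
    "JSON Parser": "parse json text into documents",
    "Redshift Insert": "write documents into a redshift table",
}

# Small illustrative stop-word list so filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "into", "from", "to"}


def tokens(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def search(query, catalog, k=2):
    """Rank Snaps by Jaccard overlap between the query and name + description."""
    q = tokens(query)
    scored = []
    for name, description in catalog.items():
        t = tokens(name + " " + description)
        overlap = len(q & t)
        if overlap:
            scored.append((overlap / len(q | t), name))
    # Highest score first; ties broken alphabetically by Snap name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


print(search("parse a CSV file", CATALOG))  # ['CSV Parser', 'File Reader']
```

A pure keyword approach like this fails as soon as the user's wording differs from the catalog text, for example "load a spreadsheet," which is the gap a learned model is meant to close.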
So the way to get at that vision is that we have to chisel away, we have to break down the barriers on each one of these, and then collectively this will get us closer to that vision of truly autonomous integration.

Right, but what's so cool about it, and again, you say autonomous, I can't help but think autonomous vehicles, and we had a great interview with a guy in autonomous vehicles. He said, if you have an accident in your car, you learn, the person that you had the accident with learns a little bit, and maybe the insurance adjuster learns a little bit. When you have an accident in an autonomous vehicle, everybody learns, the whole system learns. So that learning is shared, orders of magnitude greater, to the greater benefit of the whole. And that's really where you guys are sitting in this cloud situation. You've got all this integration going on with customers, you have all this translation and movement of data, and everybody benefits from the learning that's gained by everybody's participation. I think that's what is so exciting, and why it's such a great accelerator compared to how things used to be done before, by yourself in your little company, coding away trying to solve problems. It's a very, very different kind of paradigm, to leverage all that information about actual use cases, about what's actually happening with the platform. So it puts you guys in a pretty good situation.

I completely agree. And just as another analogy: look, we're not gonna get rid of programmers anytime soon, right? Programming is a complex human endeavor. However, Snap pipelines are kind of like programs, right? And what we're doing in our domain, in our space, is trying to achieve automated programming. So you're right, as you said, it's learning from the experience of others, learning from the crowd, learning from the mistakes, right?
And capturing that knowledge in a way that, when somebody is presented with a new task, we can either make it very quick for them to achieve that or actually provide them with exactly what they need.

Right, right.

And so, yeah, it's very exciting.

Yeah. So we're running out of time, but before I let you go, I wanted to tie back to your professor job.

Sure.

How do you leverage that? How does that benefit what's going on here at SnapLogic? Because you've obviously been doing that for a long time, and it's important to you. Bill Schmarzo, a great fan of theCUBE, we deemed him the Dean of Big Data a couple of years ago, and he's now starting to teach. So there are a lot of benefits to being involved in academe. So what do you do there in academe, and how does it tie back to what you do at SnapLogic?

Sure. So, yeah, I've been a professor for 20 years at the University of San Francisco, and I've long done research in operating systems and distributed systems, parallel computing, programming languages. And I had the opportunity to start working with SnapLogic in 2010. It was this great experience of, okay, I've done all this academic research, I've built systems, I've written research papers, and SnapLogic provided me with an opportunity to actually put a lot of this stuff into practice and work with real-world data. It's much different, and I think a lot of people on both sides of the industry and academia fence will tell you that a lot of the really interesting stuff in computer science happens in industry, because a lot of what we do with computer science is practical, right? And so I started off bringing in my expertise and working on innovation and doing research projects, which I continue to do today.
And at USF we happen to have a vehicle already set up. All of our students, both undergraduates and graduates, have to do a capstone senior project or master's project, in which we pair up the students with industry sponsors to work on a project. This is a time in their careers where they don't have a lot of professional experience, but they have a lot of knowledge, right? So we bring the students in and we carve out a project idea, and the students, under my mentorship and working with the engineering team, work towards whatever project we set up. Those projects have resulted in numerous innovations that are now in the product. The most recent big one is Iris, which came out of one of these research projects as a machine learning project about three years ago. And I continuously have lots of other projects in the works. On the flip side, my experience at SnapLogic has allowed me to bring this industry experience back to the classroom, both in terms of explaining to students what the expectations will be when they get out into industry, and in being able to make the examples more real and relevant in the classroom. So for me, it's been a great relationship that's benefited both of those roles.

Right. Well, it's such a big and important driver of what goes on in the Bay Area. USF doesn't get enough credit. Clearly, Stanford and Cal get a lot. They bring in a lot of smart people every year. They don't leave, they love the weather. And it is really a significant driver, not to mention all the innovation that happens and the cool startups that come out. Well, Greg, thanks for taking a few minutes out of your busy day to sit down with us.

Thank you, Jeff.

All right. He's Greg, I'm Jeff. You're watching theCUBE from SnapLogic in San Mateo, California. Thanks for watching.