Live from San Francisco, it's theCUBE, covering Flink Forward, brought to you by Data Artisans.

This is George Gilbert. We are at the Data Artisans conference, Flink Forward. It is for the Apache Flink community, sponsored by Data Artisans and all the work they're doing to move Flink forward and to surround it with additional value that makes building stream processing applications accessible to mainstream companies. Right now, though, we are not talking to a mainstream company. We're talking to Greg Fee from Lyft, not Uber. Greg, tell us a little bit about what you're doing with Flink. What's the first use case that comes to mind that really exercises its capabilities?

Sure, yeah. So the process of adopting Flink at Lyft really started with one use case, which was trying to make machine learning more accessible across all of Lyft. We already use machine learning in quite a few applications, but we want to make sure we use machine learning as much as possible. We really think that's the path forward. One of the fundamental difficulties with that is having consistent feature generation between the offline, batchy training scenarios and the online, real-time streaming scenarios, and the unified processing engine of Flink really helps us bridge that gap.

When you say unified processing engine, are you saying that you can manage code and data as sort of an application version, where some of the code or data is part of the model, and so you're versioning?

That's even a step beyond what I'm talking about. It's the basic, fundamental ability to have one piece of business logic that you can apply at the batch, bulk layer and then in the real-time layer. That's the core piece of what Flink gives you.

Are you running both batch and streaming on Flink?

Yes, that's right.

So you're using the windows, or just periodic execution on a stream, to simulate batch?

That's right.

So feature generation crosses a broad spectrum of possible use cases in Flink.

Yeah. And this is where we transition more into what the DA Platform can give us. We're looking to have thousands of different features across all of our machine learning models, so having a platform that can host many of these little programs, and help with the application lifecycle of each of these features as we version them over time, is something we're very excited about.

Can you tell us a little more about how the stream processing helps you with the feature selection and engineering? Are you using streaming, or simulated batch, in the same programming model to train these models, picking out different derived data? Is that how it's working?

So the typical lifecycle is that there's a feature engineering stage. The data scientist is looking at their data, trying to figure out patterns in it. The way you apply Flink there is, as you come up with potential algorithms for how you generate your feature, you can run that through Flink, generate some data, apply a machine learning model on top of it, and play around with that data, prototype things.
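To make the "one piece of business logic in both layers" point concrete, here is a minimal sketch in Flink's Java DataStream API. The event type, the feature function, and the job wiring are illustrative assumptions, not Lyft's code; the point is that the identical MapFunction can run over a bounded, batch-style source for training and over an unbounded stream for serving.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnifiedFeatureJob {

    // A hypothetical ride event; in practice this would be deserialized from the event hub.
    public static class RideEvent {
        public String driverId;
        public double tripDistanceKm;
        public double tripSeconds;
    }

    // The single piece of business logic: derive a feature value from a raw event.
    // Because it is an ordinary Flink function, the identical class can run in a
    // bounded (batch-style) job for training and in an unbounded job for serving.
    public static class AvgSpeedFeature implements MapFunction<RideEvent, Double> {
        @Override
        public Double map(RideEvent e) {
            return e.tripSeconds > 0 ? e.tripDistanceKm / (e.tripSeconds / 3600.0) : 0.0;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Offline: a bounded source (here a stub element; in practice, replayed history).
        DataStream<RideEvent> historical = env.fromElements(new RideEvent());
        DataStream<Double> trainingFeatures = historical.map(new AvgSpeedFeature());

        trainingFeatures.print();
        env.execute("unified-feature-generation");
    }
}
```

Swapping the bounded source for a streaming one, say a Kafka consumer, changes the job's inputs but not the feature logic, which is what keeps offline and online feature generation consistent.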
Oh, so what you're doing is, offline or out of the platform, you're doing the feature selection and engineering, then you attach a stream to it that has perhaps just the relevant features. And then that model gets, well, maybe not yet, but eventually versioned as part of the application, which includes the rest of the application logic and the data.

Right. Some of this was touched on this morning in the keynotes: the versioning and maintaining of machine learning applications, it's a very complex ecosystem. So being able to go from the prototype stage, doing stuff in batch, to doing stuff in production in real time, and then being able to version those over time to move to better and better versions of the feature generation, is very important to us.

I don't know if this is the most politically correct thing, but you just explained it better than everyone else we've talked to about how it all fits together with the machine learning. So once you've got that in place, it sounds like you're using the DA Platform, as well as perhaps some extensions for machine learning, to add that as a separate lifecycle alongside the application code. Is that going to be the enterprise-wide platform for developing and deploying machine learning applications?

Yes, certainly. We think there's probably a broad ecosystem to do machine learning; it's a very wide-open area. Certainly my agenda is to push it across the company and get as many things running in this system as possible. I think the real-time aspects of it, and the unifying aspect of what Flink and the platform can give us in terms of the lifecycle, are key.

So are you set up essentially as a shared resource, a shared service, which is the platform group, and then all the business units adopt that platform and build their apps on it?

Right. My initiative is part of a greater data science platform at Lyft. We have hundreds of data scientists who are going to be looking at this data, giving me little features that they want to run, and we're probably going to end up numbering in the thousands of features. So my goal is to be able to generate all of those and maintain all of those little programs.

And when you say generate all those little programs, that's the application logic and the model specific to that application?

That's right, though there are features that are typically shared across many models, so there are two layers of things happening.

So you're managing features separately from the models. Interesting, okay, haven't heard that. And is the application manager tooling going to help address that, or is that custom stuff that you have to do?

I think there's a potential that that's the way we're going to manage the model stuff as well.

That you would put it in the application platform. And that's at the boundary of what you're doing right now, or what you will be doing shortly.

Right. It's a matter of use case, whether it's online or offline, and how it fits best with the rest of the Lyft engineering system.
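As a thought experiment on managing features separately from models, a registry keyed by feature name and version lets many models share one feature and migrate between versions independently. Everything here is hypothetical, a sketch of the pattern rather than Lyft's or the DA Platform's API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: features live in a registry, keyed by (name, version),
// so many models can share one feature and migrate between versions independently.
public class FeatureRegistry {

    private final Map<String, Function<double[], Double>> features = new HashMap<>();

    public void register(String name, int version, Function<double[], Double> logic) {
        features.put(name + ":v" + version, logic);
    }

    public Function<double[], Double> lookup(String name, int version) {
        return features.get(name + ":v" + version);
    }

    public static void main(String[] args) {
        FeatureRegistry registry = new FeatureRegistry();

        // v1 of a feature: raw trip distance.
        registry.register("trip_distance", 1, raw -> raw[0]);
        // v2 improves on it (say, log-scaled); v1 keeps running until every
        // model that consumes it has migrated -- the lifecycle described above.
        registry.register("trip_distance", 2, raw -> Math.log1p(raw[0]));

        // Two different models can pin different versions of the same feature.
        double rawDistance = 12.5;
        System.out.println(registry.lookup("trip_distance", 1).apply(new double[]{rawDistance}));
        System.out.println(registry.lookup("trip_distance", 2).apply(new double[]{rawDistance}));
    }
}
```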
When you're talking about your application landscape, do you have lots of streaming applications that feed other streaming applications through a hub, or are they more discrete artifacts, discrete programs? And when do you keep state within the stream processor, and when do you have it in a shared database?

That's a lot of questions; it's kind of a deep question. The goal is to have a central hub where all of our event data passes through, and that allows us to decouple.

So, to be careful, that's not a database central hub, that's an event hub.

An event hub, yeah. An event hub in the middle allows us to decompose the system into different, smaller programs, which, again, are probably going to number in the thousands, so that different parts of the company can maintain their own part of the overall system. That's very important to us. I think we'll probably see Flink as the major player in terms of how those programs run, but we'll also be shooting things off to other systems like Druid, Hive, Presto, Elasticsearch.

As derived data.

All derived data from these Flink jobs. And then we're also pushing data directly out into some of our production systems to feed into these machine learning decisions.

Okay, this sounds like the most ambitious infrastructure we've heard about, and it sounds pretty ubiquitous.

I mean, we want to be a machine learning first company, so hopefully it's everywhere.

So help clarify for me, because mainstream companies have programmed with a DBMS as a shared state manager for decades: explain to them when you would still use a DBMS for shared state, and when you would start using the distributed state that's embedded in Flink, with the derived data at the endpoints, at the sinks.

I guess this gets into exactly your use cases, and your opinions and thoughts about how to use these things best, but--

Your opinion is what we're interested in.

From where I'm coming from, I see databases as potentially one sink for this data. They do some things very well, right? They do structured queries very well. You can have indices built up, and aggregates that really feed into a lot of visualization stuff. But from where I'm sitting, we're really moving away from databases as something that feeds production data. We've got other stores to do that, stores that are more tailored toward those scenarios.

When you say to feed production data, this is transaction capture, or data capture.

Right. We don't have a lot of atomic transactions outside of payments at Lyft. Most of this stuff is eventually consistent, so we have stores more like Dynamo or Cassandra or HBase feeding a lot of our production data.

And those databases, are they for ambient information, like influencing an interaction? It doesn't sound like automating a transaction. It sounds like context that helps with analytics, but very separate from the OLTP apps.

That's right. You can kind of bifurcate the company into the data that's used in production to make decisions that are facing the user, and then our analytics backend that helps business analysts and the executives make decisions about how we proceed.

Oh, and so that second part, that backend, is more like operational efficiency and coding new business processes to support new ways of doing business. But the customer-facing stuff, specifically like payments, still needs a traditional OLTP. And those use cases aren't growing that much.

That's right. We have very specific use cases for a traditional database, but in terms of capturing the type of scale and the type of growth that we're looking for at Lyft, we think some of the other storage engines suit those better.
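Pulling together the shape described here, a central event hub with many small Flink programs emitting derived data, a minimal sketch might look like the following. The broker address, topic name, and message format are placeholder assumptions, and print() stands in for the Elasticsearch, Druid, or Hive connector sinks a real job would attach:

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class DerivedDataJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The central event hub: broker address, topic, and group id are placeholders.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("group.id", "derived-data-sketch");

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props));

        // One small program among thousands: count events per key per minute.
        DataStream<Tuple2<String, Long>> counts = events
                .map(new MapFunction<String, Tuple2<String, Long>>() {
                    @Override
                    public Tuple2<String, Long> map(String line) {
                        // Crude parse for the sketch: first comma-separated field is the key.
                        return Tuple2.of(line.split(",")[0], 1L);
                    }
                })
                .keyBy(0)
                .timeWindow(Time.minutes(1))
                .sum(1);

        // Derived data would fan out here to Elasticsearch, Druid, Hive, or a
        // production store via connector sinks; print() keeps the sketch self-contained.
        counts.print();
        env.execute("event-hub-derived-data");
    }
}
```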
So in that use case, would the OLTP DBMS be at the front end of this? Would it be a source or a sink? It sounds like it's a source.

We actually do it both ways, right? It's great to get our transactional data flowing through our streaming system; there's a lot of value in that. But we also push some of the aggregate results back out to a DBMS, which helps with our analytics pipeline.

Okay, this is actually really interesting. So where do you see the DA Platform helping going forward? Is it something you don't really need, because you've built all that scaffolding to help with application lifecycle management, or do you see it as something that'll help push Flink enterprise-wide?

I think the DA Platform really helps people adopt Flink at an enterprise level. Maintaining the applications is a core part of what it means to run it as a business, so we're looking at the DA Platform as a way of managing our applications. And I'm mostly talking about one application we have for Flink at Lyft; we have many other Flink programs actually running that are unrelated to my project.

What about managing non-Flink applications? Do you need an application manager? Is it okay that it's associated with one service or platform like Flink, or is there a desire among bleeding-edge customers to have an overall infrastructure management, application management kind of suite?

Yes, for sure. You're touching on something that I've started to push inside of Lyft, which is the need for an overall application lifecycle management product.

Would that plug into the DA Platform, or whatever the, say, Confluent equivalent is? Or is it going to be tied directly to the operational capabilities or the functional capabilities, not the management capabilities? In other words, would it plug into core Flink, core Kafka, core Spark, that sort of stuff?

I think that's largely to be determined. If you go back to how distributed system design typically works, we have a user plane, which is going to be our data users. Then you end up with the thing we're probably most familiar with, which is our data plane: technology like Flink and Kafka, Hive, all those guys. What's missing in the middle right now is a control plane, to map from the user's desires, the user's intention, to what we do with all of that data plane stuff. So you launch a new program: maybe you need a new Kafka topic, maybe you need to provision more Kafka capacity, you need to get some Flink programs running. Whether that talks directly to Flink and goes against Kubernetes or something like that, or whether it talks to a higher-level, more application-specific platform, I think it's certainly a lot easier if we have some of these platforms in place.

Because they give you better abstractions to talk to the platforms.

That's right.

That's interesting. Okay, geez, we learn something really, really interesting with each interview.
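The control plane described here doesn't exist as a single product, but its job can be sketched: take a declarative statement of user intent and reconcile it against the data plane. All of the names and interfaces below are invented for illustration; real implementations would call the Kafka admin API and a Flink or Kubernetes deployment API:

```java
// Hypothetical control-plane sketch: the user states intent ("run this feature"),
// and the control plane maps it onto data-plane operations. All names invented.
public class ControlPlaneSketch {

    // User-plane input: a declarative description of what should exist.
    public static class FeatureSpec {
        public final String name;
        public final int version;
        public final String inputTopic;

        public FeatureSpec(String name, int version, String inputTopic) {
            this.name = name;
            this.version = version;
            this.inputTopic = inputTopic;
        }
    }

    // Data-plane operations the control plane orchestrates.
    public interface DataPlane {
        void ensureTopic(String topic, int partitions);
        void submitStreamingJob(String jobName, String inputTopic);
    }

    // The control plane: reconcile declared intent against the data plane.
    public static void reconcile(FeatureSpec spec, DataPlane dataPlane) {
        String jobName = spec.name + "-v" + spec.version;
        dataPlane.ensureTopic(spec.inputTopic, 16);  // provision capacity if missing
        dataPlane.ensureTopic(jobName, 16);          // derived-data topic for the feature
        dataPlane.submitStreamingJob(jobName, spec.inputTopic);
    }

    public static void main(String[] args) {
        DataPlane fake = new DataPlane() {
            public void ensureTopic(String topic, int partitions) {
                System.out.println("ensure topic " + topic + " (" + partitions + " partitions)");
            }
            public void submitStreamingJob(String jobName, String inputTopic) {
                System.out.println("submit Flink job " + jobName + " reading " + inputTopic);
            }
        };
        reconcile(new FeatureSpec("avg_speed", 2, "events"), fake);
    }
}
```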
I'm curious, though: if you look out a couple of years, how much of your application landscape will be continuous processing? And is that something you can see mainstream enterprises adopting, or have decades of work with batch and interactive made it too difficult for people to learn something so radically new?

I think it's all going to be driven by the business needs, and whether the value is there for people to make that transition, because it is quite expensive to invest in new infrastructure. For companies like Lyft, where we're trying to make decisions very quickly for users, getting down to like two seconds makes a difference for the customer, so we're trying to be as real-time as possible. I used to work at Salesforce. Salespeople are a little less sensitive to these things; it's a very traditional world. But even Salesforce is moving toward stream processing, so I think we're going to see it slowly adopted across the big enterprises.

I imagine that's probably for their analytics.

That's where they're starting, of course, yeah.

Okay. So this was a little more affirmation of how we're going to see the control plane evolve, and of the interesting use cases that you're up to. I hope we can see you back next year, and you can tell us how far you've proceeded.

I certainly hope so, yeah.

This was really interesting. So, Greg Fee from Lyft, we will hopefully see you again. And this is George Gilbert. We're at the Data Artisans Flink Forward conference in San Francisco. We'll be back after this break.