Live from San Francisco, it's theCUBE, covering Flink Forward, brought to you by Data Artisans.

Hi, this is George Gilbert. We're with theCUBE today at Flink Forward, the Data Artisans conference, the annual conference here in San Francisco for the Apache Flink community. And we're joined by Andrew Gao from Capital One. Capital One is always doing bleeding-edge use cases with the latest technology. Andrew, good to have you.

Thank you, it's good to be here.

So tell us about your latest, most bleeding-edge use case with Apache Flink. What are you trying to enable?

Yeah, sure. For the last year and a half, a couple of teams and I have been working on developing a fraud decisioning platform on Kubernetes. We've been running in production since September, and we have three use cases on it now.

Okay, so let's pick one use case. Tell me, what is it about stream processing, to start with, that makes it better? And then let's talk about the Apache Flink tooling that's coming out to make it even more accessible.

Sure. In terms of what we're using Apache Flink for, the use case I worked on specifically was the one where customers go to a bank and either cash a check or try to withdraw cash, and we deployed a defense there to make real-time decisions based on their past transactions.

And again, just to be clear, the defense is to determine whether the transaction should be authorized, that this is not fraud.

Whether they should call the fraud operators or not, pretty much.

Okay, so tell us how Flink made that better relative to what you were doing before.

Sure. At Capital One, we're definitely sold on the idea of stream processing; throughout the company we're moving towards a Kappa architecture. We've had a pretty good experience building up features in real time using Flink and then sending them off to our machine learning models to return a result.
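The pattern Gao describes, deriving features from a customer's recent transactions in real time and handing them to an offline-trained model, can be sketched in plain Python. This is a minimal illustration of the idea only, not Capital One's system; the class names, feature names, and thresholds are all hypothetical, and a production version would run as a keyed, windowed Flink job rather than in-process state.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # hypothetical one-hour look-back for features


class FeatureBuilder:
    """Keeps a sliding window of recent transactions per customer and
    derives simple aggregate features, mimicking what a keyed window
    operator in a stream processor would compute."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.history = defaultdict(deque)  # customer_id -> deque of (ts, amount)

    def features(self, customer_id, ts, amount):
        h = self.history[customer_id]
        h.append((ts, amount))
        # Evict transactions that have fallen out of the look-back window.
        while h and h[0][0] < ts - self.window:
            h.popleft()
        amounts = [a for _, a in h]
        return {
            "txn_count_1h": len(amounts),
            "total_amount_1h": sum(amounts),
            "max_amount_1h": max(amounts),
        }


def score(features):
    """Stand-in for the offline-trained fraud model: flag a burst of
    activity or an unusually large single withdrawal."""
    return features["txn_count_1h"] >= 5 or features["max_amount_1h"] > 5000


fb = FeatureBuilder()
f = fb.features("cust-1", ts=1000, amount=6000)
print(score(f))  # → True: a single large withdrawal trips the threshold
```

In the real pipeline, the feature dictionary would be sent to the model service and the decision routed back to the point of sale or to fraud operators.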
Oh, so, okay, so here the use case is sort of continuous learning for the models, and to keep the application going live with...

We're not yet at the point of continuously training models, though that is probably the end goal. But we will have models that we've developed offline, and then we'll use the transactions coming in through our streams to calculate these features and send those features to the models, which will tell us whether it's fraud or not fraud.

Okay. So now, Capital One and a few other companies are really sophisticated and have been able to take the open source code and put the infrastructure around it to make it easy to operate, easier for your DevOps teams. What is it that you see coming from the Data Artisans folks that might make that job much easier?

So, unfortunately we started developing our Kubernetes platform before the Data Artisans platform was announced. We've already done some work with deploying Flink applications on Kubernetes, but we'd definitely be excited to see what the original contributors have to offer with their Data Artisans platform.

And in addition to the resource management, what about orchestrating what's essentially a very distributed application, where you have the compute and state management co-located on nodes and, in fact, highly integrated? So, have you been able to do elastic scaling and repartitioning of the data, and checkpoints for distributed state and then restoring from those when rolling out new versions, things like that?

So far, at least for Flink, we generally provision our clusters ahead of time, so there's no rescaling at the Flink cluster level. We actually have multiple Flink clusters running on a single Kubernetes cluster. In terms of state management, so far, I won't say our experience has been painless, but it's been pretty good to us in terms of restoring from failures.

Restoring from?
Failures. Like, if we have task managers that die, Kubernetes can just let them die and it will recreate them; it will auto-heal, pretty much, and recover from the checkpoint by itself.

Okay, and have you been monitoring the capabilities rolling out with the DA platform?

Yeah, yeah, especially the resource manager. So right now, as I said, we do have multiple Flink clusters on the Kubernetes platform, and that was pretty much to address the issue of resource sharing. That being a problem: if one app died on a shared Flink cluster, it could impact the other apps. So our approach so far was to have multiple Flink clusters and separate them by namespaces, but it seems like the resource manager could offer a similar feature.

And so what are some of the things you'd like to add, where either your additional tooling might help or where additional application support in the form of the DA platform might make some of your wish list easier?

That's a hard question.

I didn't mean to ask you that, yeah. It's hard because you have to weigh between what's coming from the vendor and what you're doing.

Yeah, exactly. There are a lot of things internally that we want to do, and so far we've pretty much dealt with our problems ourselves and just did workarounds. So we haven't had too much experience directing that type of work towards the Flink contributors.

Okay, okay. Well, it sounds like you're definitely pushing the envelope on production, and I imagine you see lots more use cases coming down the road.

Yeah, we have three use cases right now that are running in production, but this fraud platform we're trying to build is supposed to handle pretty much all the bank fraud.

All fraud, ultimately.

Ideally.

Ideally, wow. Okay, on that note, Andrew, we should end it, and hopefully we'll see you back next year when you can tell us how far you've gone.

All right, sounds good.
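The namespace-per-application isolation Gao describes, one Flink cluster per app on a shared Kubernetes cluster so one job's failure or resource pressure can't impact another, might look roughly like the config fragment below. The names and resource figures are purely illustrative, not Capital One's actual configuration.

```yaml
# One namespace per fraud application; a ResourceQuota caps what that
# app's Flink cluster can consume on the shared Kubernetes cluster.
apiVersion: v1
kind: Namespace
metadata:
  name: fraud-check-cashing        # hypothetical application name
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: flink-quota
  namespace: fraud-check-cashing
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
```

A cluster-level resource manager, as in the DA platform discussed above, aims to provide this kind of isolation and sharing without hand-maintaining a quota per application.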
So this is George Gilbert with Andrew Gao of Capital One and we will be back after this short break with more from Flink Forward in San Francisco.