Live from San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017.

Good afternoon, everyone. This is George Gilbert. We're at Big Data Silicon Valley, in conjunction with Strata + Hadoop World. We've been here every year for six years, and I'm pleased to bring with us today a really interesting panel with our friends from Attunity. Itamar Ankorion, as we were just discussing, is an Israeli name, but some of us could be forgiven for thinking it Italian or Turkish. Itamar is CMO of Attunity. We have Chris Murphy, who is from a very large insurance company that we can't name right now, and then Martin Liddell from Deloitte. We're going to be talking about their experience building a high-value data lake and some of the technology choices they've made, including how Attunity fits in. Maybe to kick that off, Chris, perhaps you can tell us what the big objectives were for the data lake. What outcomes were you seeking?

Okay. I'd start off by saying there wasn't any single objective. It was very much about putting in a key enterprise component that would facilitate many, many things. Looking back now, hopefully with some wisdom, I see it as putting in data as a service within the company. We built it as an operational data lake first and foremost, because we wanted to generate value for the company and convey to people that this was something worth investing in on an ongoing basis. And then, once you've actually pulled all the data together and started to curate it and make it available, you can start doing the research work as well. So we were trying to get the best of both worlds.

Let me follow up on that really quickly. It sounds like, if you're doing data as a service, central IT as a function created a platform on which others would build applications, and you had to make that platform mature to a certain level, not just the software but the data itself. At that point, did you show prototype applications to different departments and business units, or how organically did the uptake move?

Not so much. It was very much a set of fast-delivering agile projects working together. We used to call it the holy trinity of projects: putting in a new customer portal that would get all of its data from the data lake; putting in a new CRM system that would get all of its data from the data lake and talk to the customer portal; and then, behind all that, the data lake itself feeding the data to both systems. We were developing in parallel with those projects, and those were not small projects; they were sizable beasts. But side by side with that, we were still able to use the data lake for some proof-of-concept work around analytics. Interestingly, one of the first things we used the data lake for on the analytics side was meeting a government regulatory requirement, where they needed us to pull a large amount of data together very quickly. And when I say quickly, I mean within two weeks. We went to our typical suppliers and asked how long it would take; about three months, they thought.
In terms of actually using the data lake, we pulled the data together in about two days, and most of the delay was due to the lack of strict requirements; we were just figuring out exactly what people wanted. And that really helped demonstrate the benefit of having a data lake in place.

So Martin, tell us how Deloitte, with its deep bench of professional services skills, could help make that journey easier for Chris and for others.

There were actually a number of areas where we were engaged, all the way from the very beginning, working on the business case creation. Where it really came to life was when we brought our technology people in to work out a roadmap for how to deliver it. As Chris said, there were many moving parts, and there were therefore many teams within Deloitte engaged, with different areas of specialization, from web development on one hand to Salesforce CRM in the background, and then obviously my team of data ninjas that came in and built the data lake. We also partnered with other third parties on the testing side. So we covered the full life cycle there.

If I were to follow up on that: it sounds like, because there were other systems being built out in parallel that depended on this, you probably had fewer degrees of freedom in terms of what the data had to look like when you were done.

I think that's true to a degree. But when you look at the delivery model we employed, it was very much agile delivery, and during the elaboration phase we were working together very closely across these three teams. So there was a certain amount of, well, not freedom in terms of what to deliver in the end, but agreement on what good would look like at the end of a sprint or for a release. There were no surprises as such, but through the flexible architecture we had built and the flexible delivery model we had, we could also respond to changes very quickly. If the product owner made priority calls and changed priority items on the backlog, we could respond quite quickly.

So, Itamar, maybe you can help us understand how Attunity added value that other products couldn't, and how it made the overall pipeline more performant.

Okay, absolutely. The project this very large insurance company was putting together was an operational data lake, and it was very important for them to get data from a lot of different data sources so they could merge it together for analytic purposes, and also to get the data in real time so they could support real-time analytics using information that is very fresh. That data, as in many financial services and insurance companies, came from the mainframe, from multiple systems on the mainframe as well as other systems, and they needed an efficient way to get the data ingested into their data lake. That's where Attunity came in, as part of the overall data lake architecture, to support an incremental, continuous, universal data ingestion process. Attunity Replicate lends itself to loading the data directly into the data lake, into Hadoop in this case, or, if they opt to use Kafka, going through mechanisms like Kafka and others. So it provides full architectural flexibility to capture data as it changes in many different databases and feed it into the data lake, where it can be used for different types of analytics.
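The ingestion pattern Itamar describes, continuous change capture landing in Hadoop, optionally by way of Kafka, can be sketched in a few lines of Python. This is a minimal illustration under assumed names (topic, broker, landing path), not Attunity's actual implementation, which is a commercial product rather than a consumer script:

```python
# A sketch of the consume-and-land half of a CDC pipeline: change events
# captured from source databases arrive on a Kafka topic, and a consumer
# lands them in the data lake. Topic and field names here are hypothetical.
import json
import os

from kafka import KafkaConsumer  # kafka-python

consumer = KafkaConsumer(
    "policy_admin.changes",            # hypothetical CDC topic, one per source
    bootstrap_servers="broker:9092",
    group_id="lake-ingest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

os.makedirs("landing", exist_ok=True)
batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= 1000:
        # A real pipeline would land this in HDFS/Hive; a local
        # JSON-lines file stands in for that step here.
        with open("landing/changes.jsonl", "a") as f:
            for event in batch:
                f.write(json.dumps(event) + "\n")
        batch.clear()
```

In practice the landing step would write columnar files (Parquet or ORC) into HDFS partitions that Hive can query, but the consume, batch, and land loop is the core shape.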
Just to drill down on that one level: many of us would assume that the replication log Attunity models itself on would be similar to the event log Kafka models itself on. So is it that if you use Kafka, you have to modify the source systems and therefore put more load on them, whereas with Attunity you're piggybacking on what's already happening, so you don't add to the load on those systems?

Great question; let me clarify. First of all, Kafka is a great technology that we're seeing more and more customers adopt as part of their overall big data and data management architectures. It's a publish-subscribe infrastructure that allows you to scale the messaging and storage of data as events, as messages, so you can easily move data around and process it in a more real-time, streaming fashion. Attunity complements Kafka and is actually very well integrated with it, as well as with other streaming types of ingestion and data processing technologies. What Attunity brings to the picture is primarily the key technology of CDC, change data capture: the ability to capture data as it changes in many different databases, to do that in a manner that has very little impact, if any, on the source system and environment, and to deliver it in real time. So what Attunity does, in a sense, is turn the databases into live feeds, which can then stream either directly into the Hadoop platform, into Hive and HDFS, or, for example, into Kafka for further processing through our Kafka integration. So again, it's very complementary in that sense.
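Itamar's "live feeds" come down to what each change event carries. Below is a minimal, hypothetical sketch of such an event and of replaying it downstream; the field names and envelope shape are assumptions, since tools like Attunity Replicate define their own formats:

```python
# Each change event carries the operation type and row images, so a
# downstream consumer can reconstruct the table without touching the
# source system. Field names here are illustrative assumptions.
change_event = {
    "table": "POLICY_HOLDER",         # hypothetical source table on the mainframe
    "op": "UPDATE",                   # INSERT | UPDATE | DELETE
    "ts": "2017-03-14T10:22:05Z",     # commit timestamp from the source log
    "key": {"policy_id": "P-104822"},
    "before": {"policy_id": "P-104822", "status": "PENDING"},
    "after": {"policy_id": "P-104822", "status": "ACTIVE"},
}

def apply_event(table: dict, event: dict) -> None:
    """Replay one change event against an in-memory copy of the table."""
    key = tuple(sorted(event["key"].items()))
    if event["op"] == "DELETE":
        table.pop(key, None)
    else:
        # INSERT and UPDATE both leave the "after" image as the current row.
        table[key] = event["after"]

table = {}
apply_event(table, change_event)
print(table)  # the row keyed by policy_id, now with status "ACTIVE"
```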
Okay, so Chris, maybe give us a little more color on the before and after state. Before these multiple projects happened, and then with the data lake as a data foundation for these other systems you're integrating, what business outcomes changed, and how did they change?

That's a tough question. I've been asked many flavors of that question before, and the analogy I always come back to is that it's like moving from candlepower to electricity. There was no single use case that shows why you need a data lake; it was many, many things we wanted to do, and in the before picture all of that was very challenging. Like many companies, we had outsourced the maintenance, support, operation, and running of our systems to third parties, and we were constrained by that. We were in that crazy situation where we couldn't get to our own data. By implementing the data lake, we broke down that barrier and had things back under our control. I mentioned before that POC we did with the regulatory reporting: again, three months versus two days. It was night and day in terms of what we were now able to do.

Many banks are beginning to say their old business model was to get the customer's checking account and then upsell and cross-sell all the other related products and services. Is something like that happening with insurance, where if you break down the data silos it's easier to sell other services?

"There will be" is probably the best way to put it. We're not there yet; it's a long journey, and we're doing it in stages. I think we've done three different releases on the data lake to date, and that's very much the plan. We want to do things like nudges, to show a customer that, you know, there are products that could be a very good fit for them, because once you understand your customer, you understand what their gaps are, what their needs and wants are. Again, that's very much on the roadmap; we're just not at that part yet.

So help us maybe understand some of the near-term steps you want to take on that roadmap toward that nirvana, and what role Attunity as a vendor and Deloitte as a professional services organization might play in getting you there.

Attunity obviously was all about getting the data there as efficiently as possible. Unfortunately, like many things in your first iteration, our data lake is still running on a batch basis, but we'd like to evolve that as time goes by. In terms of actually making use of the lake, one of the key things we were doing was implementing a client matching solution. We didn't actually have an MDM system in place for managing our customers. We had 12 different policy admin systems, and customers could be coming to us playing any role: they could be a beneficiary, they could be the policyholder, they could be a power of attorney, and we could talk to somebody on the phone and not really understand who they were. You get them into the data lake, you start to build up that 360-degree view of who people are, and then you start to understand what you can do for this person. That's very much the journey we're going on.

And Martin, are you organized by industry line, and is there a sort of capability maturity level where you can say, okay, you have to master these skills, and at that skill level you can do these richer business offerings?

Yeah, absolutely. First of all, yes, we are organized by industry groups, and we have a common model across industries that describes what you just said. When we talk about an insight-driven organization, that's really where you're moving to on the maturity curve: as you become more mature in using your analytical capabilities, you turn data from just data into information, and into a real asset that you can actually monetize. Where we went with Chris's organization, and actually with many other life insurers, is the first step on this journey: what Chris described, being able for the first time to see a customer-centric view, to see what a customer has in terms of products and therefore what they don't have, and where there are opportunities for cross-selling. That's a first step toward becoming more proactive, and there's a lot more that can follow after that. We've got maturity models that we assess against, and we gradually move organizations to the right place for them, because it's not going to be right for every organization to make the huge investment to become an insight-driven organization. But most companies will benefit from nudging them in that direction.
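The client matching Chris describes, one person appearing across 12 policy admin systems as policyholder, beneficiary, or power of attorney, is essentially a record linkage problem. Here is a toy sketch of deterministic matching on normalized identity fields; the field names and matching rule are illustrative assumptions, and production MDM or matching tools use probabilistic scoring across many more attributes:

```python
# Group records from several source systems under one customer key to
# approximate the 360-degree view Chris describes. Deterministic match
# on normalized name, date of birth, and postcode; purely illustrative.
from collections import defaultdict

def identity_key(record: dict) -> tuple:
    """Normalize the fields used to decide two records are the same person."""
    return (
        record["last_name"].strip().lower(),
        record["date_of_birth"],               # e.g. "1961-07-04"
        record["postcode"].replace(" ", "").upper(),
    )

def build_360_view(records: list[dict]) -> dict:
    """Group records from all source systems under one customer key."""
    customers = defaultdict(list)
    for rec in records:
        customers[identity_key(rec)].append(rec)
    return customers

# Hypothetical rows pulled from two of the twelve policy admin systems:
records = [
    {"system": "life", "role": "policyholder", "last_name": "Doe",
     "date_of_birth": "1961-07-04", "postcode": "ab1 2cd"},
    {"system": "annuity", "role": "beneficiary", "last_name": "DOE",
     "date_of_birth": "1961-07-04", "postcode": "AB1 2CD"},
]
for key, recs in build_360_view(records).items():
    print(key, "->", [(r["system"], r["role"]) for r in recs])
```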
Okay, on that note, we're going to have to leave it here. I will say that I believe there's a session at 2:30 today with Deloitte and the unnamed insurance team, talking in greater depth about the case study with Attunity. And on that, we'll be taking a short break. We'll be back at Big Data Silicon Valley. This is George Gilbert, and we'll see you in a few short minutes.