 from Berlin, Germany. It's theCUBE, covering DataWorks Summit Europe 2018. Brought to you by Hortonworks. Hello, welcome to theCUBE. I'm James Kobielus. I'm the lead analyst for Big Data Analytics in the Wikibon team of SiliconANGLE Media. And we're here at DataWorks Summit 2018 in Berlin, Germany. And it's an excellent event and we are here for two days of hard-hitting interviews with industry experts focused on the hot issues facing customers, enterprises in Europe and the world over, related to the management of data and analytics. And what's super hot this year, and it will remain hot as an issue, is data privacy and privacy protection. Five weeks from now, a new regulation of the European Union called the General Data Protection Regulation takes effect. And it's a mandate that is affecting not only any business that is based in the EU, but any business that does business in the EU. It's coming fairly quickly and enterprises on both sides of the Atlantic and really throughout the world are focused on GDPR compliance. So that's a hot issue that was discussed this morning in the keynote. And so what we're gonna be doing over the next two days, we're gonna be having experts from Hortonworks, the show's host, as well as IBM (Hortonworks is one of their lead partners), as well as a customer, Munich Re. They'll appear on theCUBE and I'll be interviewing them about not just GDPR, but really the trends facing the big data industry. Hadoop, of course. Hortonworks got started about seven years ago as one of the solution providers that was focused on commercializing the open source Hadoop code base. And they've come quite a ways. Their recent financials were very good. They continue to rock and roll on the growth side and customer acquisitions and deal sizes. So we'll be talking a little bit later to Scott Gnau, their chief technology officer, who did the core keynote this morning. He'll be talking not only about how the business is doing, but about a new product announcement. 
The Data Steward Studio that Hortonworks announced overnight is a new solution that is directly related to, and useful for, GDPR compliance. So we'll ask Scott to bring us some more insight there. But what we'll be doing over the next two days is extracting signal from noise. The big data space continues to grow and develop. Hadoop has been around for a number of years now, but in many ways it's been superseded in the agendas, the priorities of enterprises that are building applications from data, by some newer, primarily open source technologies such as Apache Spark, TensorFlow for building deep learning and so forth. We'll be discussing the trends towards the deepening of the open source data analytics stack with our guests. We'll be talking with a European-based reinsurance company, Munich Re, about the data lake that they have built for their internal operations. And we'll be asking Andreas Kohlmaier, their lead of data engineering, to discuss how they're using and how they're managing their data lake and possibly to give us some insight about how it will serve them in achieving GDPR compliance and sustaining it going forward. So what we'll be doing is that we'll be looking at trends not just in compliance, not just in the underlying technologies, but in the applications that Hadoop and Spark and so forth, these technologies, are being used for. And the applications, the initiatives in Europe, are really the same as worldwide in terms of what enterprises are doing. They're moving away from big data environments built primarily on data at rest, that's where Hadoop has been the sweet spot, towards more streaming architectures. And so Hortonworks, as I said, the show's host, has been going more deeply towards streaming architectures with its investments in NiFi and so forth. We'll be asking them to give us some insight about where they're going with that. We'll also be looking at the growth of multi-cloud big data environments. 
What we're seeing is that there's a trend in the marketplace away from predominantly premises-based big data platforms towards public cloud-based big data platforms. And so Hortonworks, they are partners with a number of the public cloud providers. IBM, I mentioned, and they've also got partnerships with Microsoft Azure, with Amazon Web Services, with Google and so forth. We'll be asking our guests to give us some insight about where they're going in terms of their support for multi-clouds, support for edge computing, analytics and the internet of things. Big data increasingly is evolving towards more of a focus on serving applications at the edge, like mobile devices that have autonomous smarts, like for self-driving vehicles. Big data is critically important for feeding, for modeling and building the AI needed to power the intelligence in endpoints, not just self-driving cars, but intelligent appliances, conversational user interfaces for mobile devices, for consumer appliances, like Amazon's got their Alexa, Apple's got their Siri and so forth. So we'll be looking at those trends as well, towards pushing more of that intelligence to the edge, and the power and the role of big data and data-driven algorithms like machine learning in driving those kinds of applications. So at Wikibon, the team that I'm embedded within, we have published just recently our updated forecast for the big data analytics market, and we've identified key trends that are revolutionizing and disrupting and changing the market for big data analytics. So among the core trends, I mentioned the move towards multi-clouds, the move towards more public cloud-based big data environments in the enterprise. I'll be asking Hortonworks, who of course built their business and their revenue stream primarily on on-premises deployments, to give us a sense for how they plan to evolve as a business as their customers move towards more public cloud-facing deployments. 
And IBM, of course, will be here in force. Tomorrow, which is Thursday, we have several representatives from IBM to talk about their initiatives and partnerships with Hortonworks and others in the area of metadata management, in the area of machine learning and AI development tools and collaboration platforms. We'll also be discussing the push by IBM and Hortonworks to enable greater depths of governance applied to enterprise deployments of big data. Both data governance, which is an area where Hortonworks and IBM as partners have achieved a lot of traction in terms of recognition as among the pacesetters in data governance in multi-cloud, unstructured, big data environments, but also model governance, governing the version controls and so forth of machine learning and AI models. Model governance is a huge push by enterprises who increasingly are doing data science, which is what machine learning is all about. Taking that competency, that practice, and turning it into more of an industrialized pipeline of building and training and deploying into an operational environment a steady stream of machine learning models into multiple applications. Edge applications, conversational UIs, search engines, e-commerce environments that are driven increasingly by machine learning that's able to process big data in real time and deliver next best actions and so forth, more intelligence into all applications. So we'll be asking Hortonworks and IBM to net out where they're going with their partnership in terms of enabling a multi-layered governance environment to enable this pipeline, this machine learning pipeline, this data science pipeline, to be deployed as an operational capability into more organizations. Also, one of the areas where I'll be probing our guests is automation in the machine learning pipeline. That's been a hot theme that Wikibon has seen in our research. 
A lot of vendors in the data science arena are adding automation capabilities to their machine learning tools. Automation is critically important for productivity. Data scientists as a discipline are in limited supply. I mean, experienced, trained, seasoned data scientists fetch a high price, there aren't that many of them. So more of the work they do needs to be automated, and can be automated by increasingly mature tools on the market from a growing range of vendors. I'll be asking IBM and Hortonworks to net out where they're going with automation inside of their big data and machine learning tools, and with their partnerships going forward. So really what we're going to be doing over the next few days is looking at these trends, but it's going to come back down to GDPR as a core envelope that many companies attending this event, DataWorks Summit Berlin, are facing. So I'm James Kobielus with theCUBE. Thank you very much for joining us and we look forward to starting our interviews in just a little while. Our first up will be Scott Gnau from Hortonworks. Thank you very much.