Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks.

Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, along with my co-host, James Kobielus. We're joined by Tendü Yoğurtçu. She is the CTO of Syncsort. Thanks so much for returning to theCUBE.

Thank you, Rebecca and James, it's always a pleasure to be here.

So you've been on theCUBE before, and the last time you were talking about Syncsort's growth. Can you give our viewers a company update? Where are you now?

Absolutely. Syncsort has seen extraordinary growth within the last year. We tripled our revenue, doubled our employees, and expanded the product portfolio significantly. Because of this phenomenal growth, we also embarked on a new initiative, refreshing our brand. We rebranded, and this was necessitated by the fact that we have such a broad portfolio of products. We are actually showing our new brand here, articulating the value our products bring by optimizing existing infrastructure, assuring data security and availability, and advancing data by integrating it into next-generation analytics platforms. So it's a very exciting time in terms of Syncsort's growth.

So the last time you were on the show was pre-GDPR, but as we were discussing before the cameras were rolling, you were explaining the kinds of adoption you're seeing and what you're hearing from customers in this new era. Can you tell our viewers a little bit about it?

When we were discussing last time, I talked about four megatrends we are seeing, primarily driven by advanced business and operational analytics: data governance, cloud, streaming, and data science with artificial intelligence.
We talked, and we made a lot of announcements, focused on the use cases around data governance, primarily helping our customers with their GDPR, the General Data Protection Regulation, initiatives, and how we can create visibility into enterprise data through security and lineage and by delivering trusted data sets. Now we are talking primarily about the cloud. The keynotes at this event, and our focus, are around cloud, driven again by the use cases, right? How businesses are adapting to the new era. One of the challenges that we see with our enterprise customers, and we have over 7,000 customers, by the way, is the ability to future-proof their applications, because this is a very rapidly changing stack. We have seen the keynotes talking about the importance of connecting your existing infrastructure with the future, modern, next-generation platforms. How do you future-proof the platform and make it agnostic as to whether it's Amazon, Microsoft, or Google Cloud, or on-premise legacy platforms, so that today's data is available in the next-generation platforms? So the challenge we are seeing is, how do we keep the data fresh? How do we create that abstraction so the applications are future-proofed? Because organizations, even financial services customers in banking and insurance, now have at least one cluster running in the public cloud. And with private implementations alongside, hybrid becomes the new standard. So our focus and most recent announcements have been around really helping our customers with real-time, resilient change data capture: keeping the data fresh, feeding downstream applications through streaming and messaging frameworks, for example Kafka and Amazon Kinesis, as well as keeping the persistent stores and the Hadoop data lake, on-premise or in the cloud, fresh.

That puts you in great alignment with your partner, Hortonworks.
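The change data capture pattern described above can be sketched in a few lines. This is an illustrative toy, not Syncsort's implementation: it diffs two snapshots of a source table and emits change events, where a production pipeline would instead tail a transaction log and publish each event to Kafka or Kinesis. All names here are assumptions for illustration.

```python
# Minimal sketch of change data capture (CDC): diff successive snapshots
# of a source table and emit insert/update/delete events for downstream
# consumers. In production these events would be published to a topic
# (e.g. Kafka or Amazon Kinesis) to keep the data lake fresh.

def capture_changes(old_snapshot, new_snapshot):
    """Compare two {key: row} snapshots and return a list of change events."""
    events = []
    for key, row in new_snapshot.items():
        if key not in old_snapshot:
            events.append(("insert", key, row))
        elif old_snapshot[key] != row:
            events.append(("update", key, row))
    for key in old_snapshot:
        if key not in new_snapshot:
            events.append(("delete", key, None))
    return events

before = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
after = {1: {"name": "Ada L."}, 3: {"name": "Cy"}}

for event in capture_changes(before, after):
    # A real pipeline would publish each event downstream, e.g.
    # producer.send("table-changes", event), rather than print it.
    print(event)
```

Snapshot diffing is the simplest way to show the idea; real CDC tools read the database's transaction log instead, which is what makes low-latency, resilient capture possible.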
So, Tendü, since we are here at DataWorks, which is Hortonworks' show, I wonder if you can break out for our viewers, at a high level, the nature of your relationship and partnership with Hortonworks, and how the Syncsort portfolio plays with HDP 3.0, with Hortonworks DataFlow, and with the DataPlane Services.

Absolutely. We have been a long-time partner of Hortonworks, and a couple of years back we strengthened our partnership. Hortonworks is reselling Syncsort, and we actually have a joint prescriptive solution for ETL onboarding into Hadoop. And it's very complementary. Our strategies are very complementary, because what Hortonworks is achieving is creating that abstraction, future-proofing, and consistency of interaction, as was referred to this morning, across the platform, whether it's on-premise, in the cloud, or across multiple clouds. We are providing the data application layer consistency and future-proofing on top of the platform, leveraging the tools in the platform for orchestration: integrating with HDP, certifying with Ranger, with Hortonworks DataFlow, and, of course, with Atlas for lineage.

The theme of this conference is ideas, insights, and innovation. As a partner of Hortonworks, can you describe what it means for you to be at this conference? What kind of community building happens, deepening existing relationships and forming new ones? Can you talk about what happens here?

This is one of the major events around data, and it's DataWorks, as opposed to being specific to Hadoop itself, right? Because the stack is evolving, and the data challenges are evolving. For us, it really means the interactions with the customers, the organizations, and the partners here, because the dynamics of the use cases are also evolving. For example, data lake implementations started in the US, and we have seen EMEA, European organizations, moving to streaming, to data streaming applications, faster than the US.
Why are the Europeans moving faster to streaming than we are in North America?

I think a couple of different things might play a part. Open source is really enabling organizations to move fast. When the data lake initiative started, we saw a slightly slower start in Europe, but more experimentation with the open-source stack. And through that, the more transformative use cases really started evolving, like, how do I manage users' interactions with their remote controls as they are watching live TV? Those types of transformative use cases became important. And as we move to the transformative use cases, streaming is also very critical, because lots of data is available, and being able to keep the cloud data stores, the on-premise data stores, and the downstream applications supplied with fresh data becomes important. In fact, in early June we announced that Syncsort is now part of Microsoft's One Commercial Partner program. With that, our integrated solutions for data integration and data quality are Azure Gold certified and Azure ready. We are in a co-sell agreement, and we are jointly helping a lot of customers move data and workloads to Azure and keep those data stores across the platforms in sync. So, lots of exciting things. There's a lot happening in the application space, and there's also still a lot happening connected to the governance use cases we have seen. Feeding security and IT operations data into, again, modern next-generation analytics platforms is key, whether it's Splunk or Elastic as part of the Hadoop stack. So we are still focused on governance as part of these multi-cloud, on-premise, and cloud implementations as well. In fact, we launched our Ironstream for IBM i product to help customers make this data available not just from mainframes, but also from IBM i, in Splunk, Elastic, and other security information and event management platforms.
And today we announced workflow optimization across on-premise, multi-cloud, and cloud platforms. So there is lots of focus across the portfolio of products, to optimize, assure, and integrate, helping customers with their business use cases. That's really our focus as we innovate organically and also acquire technologies and solutions: what are the problems we are solving, and how can we help our customers with business and operational analytics, targeting those megatrends around data governance, cloud, streaming, and also data science.

What is the biggest trend, do you think, that is driving all of these changes? As you said, the data is evolving, the use cases are evolving. What is it that is keeping your customers up at night?

Right now it's still governance keeping them up at night, because this evolving architecture is also making governance more complex, right? If we are looking at financial services, banking, insurance, and healthcare, there are lots of existing infrastructures, mission-critical data stores on mainframe and IBM i, in addition to the gravity of data changing, with lots of data from online businesses generated in the cloud. So governing all of that, while also optimizing and making those data stores available for next-generation analytics, makes governance quite complex. That really creates a lot of opportunity for the community, right? For all of us here to address those challenges.

It sounds to me, when I'm hearing Splunk, that's machine data; I think of the internet of things and sensor grids. When I'm hearing IBM mainframes, that's transactional data, that's your customer data and so forth. It seems like much of this data that you're describing, that customers are trying to cleanse and consolidate and provide strict governance on, is absolutely essential for them to drive more artificial intelligence into end applications and the mobile devices that are being used to drive the customer experience.
Do you see more of your customers using your tools to massage the data sets, as it were, that data scientists then use to build and train their models for deployment into edge applications? Is that an emerging area where your customers are deploying Syncsort?

Thank you for asking that question, and thanks for unpacking it and giving it to me. It is a complex question, but it's a very important one. Yes. In previous discussions we have seen, and this morning Rob Thomas from IBM mentioned as well, that machine learning, artificial intelligence, and data science really rely on high-quality data, right? It's the old adage from an anonymous 1950s computer scientist: garbage in, garbage out. When we are using artificial intelligence and machine learning, the impact of bad data multiplies. It multiplies through the training on historical data, and it multiplies through the insights that we are getting out of that. So data scientists today are still spending significant time preparing the data for the AI and data science pipelines. That's where we shine, because our Integrate portfolio accesses the data from all enterprise data stores, and cleanses, matches, and prepares it in a trusted manner for use in advanced analytics with machine learning and artificial intelligence.

Because the magic of machine learning for predictive analytics is that you build a statistical model based on the most valid data set for the domain of interest. If the data is junk, then you're going to be building a junk model that will not be able to do its job. So, for want of a nail the kingdom was lost; for want of a Syncsort data cleansing and governance tool, the whole AI superstructure will fall down.

That's really it, yeah. Absolutely.

Great. Well, thank you so much, Tendü, for coming on theCUBE and for giving us a lot of background and information.

Thank you for having me. Always a pleasure.
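The data preparation step discussed above, cleansing and matching records into a trusted set before they ever reach a model, can be sketched as follows. The field names, validation rules, and function names are illustrative assumptions for this sketch, not Syncsort's product API.

```python
# Minimal sketch of "garbage in, garbage out" prevention: validate,
# normalize, and deduplicate raw records before they reach model training.
# Field names and rules are illustrative assumptions, not a product API.

def cleanse(records):
    """Drop invalid rows, normalize fields, and deduplicate by id."""
    seen = set()
    clean = []
    for rec in records:
        # Validation: require an id and a plausible age.
        if rec.get("id") is None or not (0 <= rec.get("age", -1) <= 120):
            continue
        # Normalization: trim whitespace and lowercase the email.
        rec = dict(rec, email=rec.get("email", "").strip().lower())
        # Deduplication: keep only the first occurrence of each id.
        if rec["id"] in seen:
            continue
        seen.add(rec["id"])
        clean.append(rec)
    return clean

raw = [
    {"id": 1, "age": 34, "email": " Ada@Example.COM "},
    {"id": 1, "age": 34, "email": "ada@example.com"},   # duplicate id
    {"id": 2, "age": 999, "email": "bad@example.com"},  # implausible age
    {"id": None, "age": 28, "email": "x@example.com"},  # missing id
]
trusted = cleanse(raw)  # only the first record survives
```

The point of the sketch is the ordering: because training amplifies whatever it is fed, the validate/normalize/deduplicate pass has to sit upstream of the model, not be patched in afterward.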
I'm Rebecca Knight for James Kobielus. We will have more from theCUBE's live coverage of DataWorks 2018 just after this.