from San Jose. It's theCUBE, presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners.

Welcome back to theCUBE. We are live in San Jose at our event, Big Data SV. I'm Lisa Martin. My co-host is George Gilbert, and we are down the street from the Strata Data Conference. We are at a really cool venue, Forager Tasting Room & Eatery. Come down and join us, hang out with us. We've got a cocktail party tonight, and we also have an interesting briefing from our analysts on big data trends tomorrow morning. I want to welcome back to theCUBE now one of our CUBE VIPs and alumna, Tendü Yoğurtçu, the CTO at Syncsort. Welcome back.

Thank you. Hello, Lisa. Hi, George. Pleasure to be here.

It's our pleasure to have you back. So what's going on at Syncsort? What are some of the big trends that you, as CTO, are seeing?

In terms of the big trends we are seeing: Syncsort has grown a lot. In the last 12 months we actually doubled our revenue; it has been a very successful path of both organic and inorganic growth, and we have more than 7,000 customers now. So it's a great pool of customers we are able to talk to, to see the trends and how they are trying to adapt to digital disruption and make data part of their core strategy. Data is no longer just an enabler; across the enterprise, we are seeing data become the core strategy. This is reflected in four mega trends, all connected to enabling business as well as operational analytics.

Cloud is one, definitely. We are seeing more and more cloud adoption. Even our financial services, healthcare and banking customers now have a couple of clusters running in the public cloud, with multiple workloads. Hybrid seems to be the new standard, and it comes with challenges: governance, IT governance as well as data governance, is a major one, and scoping and planning for the workloads in the cloud continues to be a challenge as well.
Our general strategy across the product portfolio is to have our products follow a design once, deploy anywhere strategy. So whether it's a standalone environment on Linux, running on Hadoop or Spark, on premises or in the cloud, regardless of the cloud provider, we enable the same application, with no changes, to run in all of these environments, including hybrid.

Then we are seeing the streaming trend: with connected devices, with digital disruption, so much data is being generated that you need to be able to stream and process data on the edge with the internet of things. To address the use cases Syncsort is focused on, we are providing change data capture and near-real-time and real-time data replication into next-generation analytics and big data environments. Last year we launched our change data capture (CDC) product offering with data integration, and we continue to strengthen that with the Vision Solutions merger. We had the real-time data replication capabilities, and we are now seeing even a Kafka data bus becoming a consumer of this data: not just keeping the data lake fresh, but capturing the changes from a diverse set of sources, publishing them onto a Kafka data bus, and making them available for applications and analytics in the data pipeline.

The third trend we are seeing is around data science. If you noticed, this morning's keynotes were all about machine learning, artificial intelligence and deep learning: how do we make use of data science? It was very interesting for me, because everyone is talking about the challenge of how you prepare the data and deliver trusted data for machine learning, artificial intelligence and deep learning. Because if you are creating your models based on bad data, then the insights you get are also impacted.
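To make the CDC-to-Kafka idea above concrete, here is a minimal, hypothetical sketch of what a change event might look like and how it would be keyed for publication to a Kafka topic. The event schema, field names and topic name are illustrative (loosely modeled on common CDC formats), not Syncsort's actual product output.

```python
import json
from datetime import datetime, timezone

def make_cdc_event(op, table, key, before=None, after=None):
    """Build a minimal change-data-capture event (hypothetical schema)."""
    return {
        "op": op,          # "c" create, "u" update, "d" delete
        "table": table,
        "ts": datetime.now(timezone.utc).isoformat(),
        "before": before,  # row image before the change (None for inserts)
        "after": after,    # row image after the change (None for deletes)
        "key": key,
    }

def serialize_for_kafka(event):
    """Return the (key, value) byte pair a Kafka producer would publish.
    Keying by primary key keeps all changes for a row in one partition,
    so downstream consumers see them in order."""
    key = json.dumps(event["key"], sort_keys=True).encode("utf-8")
    value = json.dumps(event, sort_keys=True).encode("utf-8")
    return key, value

# Example: an update to a customer row flowing from a source database.
event = make_cdc_event(
    op="u",
    table="customers",
    key={"customer_id": 42},
    before={"customer_id": 42, "email": "old@example.com"},
    after={"customer_id": 42, "email": "new@example.com"},
)
key, value = serialize_for_kafka(event)
# In a real pipeline this pair would be handed to a producer, e.g.:
# producer.produce("customers.cdc", key=key, value=value)
```

Consumers of such a topic can either apply the changes to keep a data lake fresh or subscribe directly for analytics, which is the dual use described above.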
We offer our products on both the data integration and the data quality side to prepare the data: cleanse, match and deliver a trusted data set for data scientists and make their lives easier. Another area of focus for 2018 is whether we can also add supervised learning to this. With the Trillium data quality domain experts now in Syncsort, we have a lot of domain expertise in the field that we can infuse into machine learning algorithms, connecting our data profiling capabilities with our data quality capabilities: recommending business rules to data scientists and helping them automate the mundane tasks with recommendations.

The last, but not least, trend is data governance. Data governance is almost an umbrella focus for everything we are doing at Syncsort, because everything about the cloud trend, streaming, data science and building that next-generation analytics environment for our customers depends on data governance. It is in fact a business imperative, and regulatory compliance use cases give it even more importance. For example, the General Data Protection Regulation in Europe, GDPR.

Just a few months away.

Just a few months, May 2018. It is on the mind of every C-level executive, and not just at European companies: every enterprise has European-sourced data in its environment. So compliance is a big driver of governance. We look at governance in multiple aspects. Security, ensuring data is available in a secure way, is one aspect. Another is delivering high-quality data: cleansing and matching. The example Hilary Mason gave in the keynote this morning, about how context matters in searches of her name, was very interesting, because you really want to deliver that high-quality, trusted data set in the enterprise. Our Trillium Quality for Big Data, which we launched in Q4, addresses that preparation; the product is generally available now, and we are actually in production with a very large deployment.
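The idea of connecting profiling with quality rules, as described above, can be sketched very simply: profile a column's values against known patterns and, when most values conform, recommend a validation rule for a data steward to accept. The patterns, threshold and function names below are illustrative assumptions, not Syncsort's actual algorithm.

```python
import re

# Hypothetical pattern library a profiler might check columns against.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_zip": re.compile(r"^\d{5}(-\d{4})?$"),
}

def recommend_rules(values, threshold=0.9):
    """Return the pattern names that at least `threshold` of the
    non-empty values conform to, as candidate business rules."""
    non_empty = [v for v in values if v]
    recommendations = []
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if non_empty and hits / len(non_empty) >= threshold:
            recommendations.append(name)
    return recommendations

column = ["a@example.com", "b@example.com", "c@example.com",
          "not-an-email", "d@example.com", "e@example.com",
          "f@example.com", "g@example.com", "h@example.com",
          "i@example.com"]
print(recommend_rules(column))  # 9 of 10 values match -> ['email']
```

A supervised-learning version would learn new patterns and thresholds from rules that stewards have previously accepted or rejected, rather than relying on a fixed library.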
So that's one area of focus. And the third area is how you create visibility: the farm-to-table view of your data.

Yeah, that's your talk. I love that.

Yes, thank you. Tomorrow I have a talk at 2:40, March 8th. I'm so happy it's on Women's Day.

That's right. "Get a farm-to-table view of your data" is the name of your talk: track data lineage from source to analytics. Tell us a little bit more about that.

It's all about creating more visibility, because for audit reasons you want to understand how many copies of your data have been created, where your data has been, and who accessed it. Creating that visibility is very important. In the last couple of years, everyone was focused on how to create the data lake and make data accessible: break the data silos and liberate the data from the multiple legacy platforms an enterprise might have. Once that happened, everybody started worrying about how to create a consumable data set and how to manage this data, because data has been on legacy platforms like mainframes and IBM i-Series, on relational data stores; it is in the cloud, where the gravity of data originating is increasing; it's originating from mobile. Hadoop vendors like Hortonworks and Cloudera are creating visibility into what happens within the Hadoop framework. We are deepening our integration with Cloudera Navigator; that was our announcement last week. We already have integration with both Hortonworks' Atlas and Cloudera Navigator. This is one step further, where we actually publish what happened to data at the most granular level, the field level, with all of the transformations the data has been through outside the cluster. That visibility is now published to Navigator itself, and we also publish it through RESTful APIs.
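Field-level lineage of the kind described above can be pictured as a chain of records, each naming the dataset, the field, the transformation applied, and the upstream field it was derived from; a catalog then walks the chain from analytics back to the source. The record schema, dataset names and transform descriptions below are illustrative assumptions, not any vendor's actual API.

```python
# Minimal sketch of field-level lineage records of the sort that could be
# published to a metadata catalog (e.g. via a REST endpoint).
lineage = [
    {"field": "cust_name", "dataset": "mainframe.CUSTFILE",
     "derived_from": None, "transform": "source"},
    {"field": "customer_name", "dataset": "staging.customers",
     "derived_from": ("cust_name", "mainframe.CUSTFILE"),
     "transform": "EBCDIC-to-UTF-8 conversion, trim"},
    {"field": "customer_name", "dataset": "lake.customers_clean",
     "derived_from": ("customer_name", "staging.customers"),
     "transform": "standardize case, deduplicate"},
]

def trace(field, dataset, records):
    """Walk lineage back from a field in a target dataset to its source,
    returning the chain of (dataset, field, transform) steps in
    source-to-target order."""
    index = {(r["field"], r["dataset"]): r for r in records}
    chain = []
    key = (field, dataset)
    while key in index:
        r = index[key]
        chain.append((r["dataset"], r["field"], r["transform"]))
        key = r["derived_from"]
        if key is None:
            break
    return list(reversed(chain))

for dataset, field, transform in trace("customer_name",
                                       "lake.customers_clean", lineage):
    print(f"{dataset}.{field}: {transform}")
```

This is the "farm-to-table" view in miniature: an auditor asking where `customer_name` in the lake came from gets the mainframe field and every transformation in between, including steps that happened outside the cluster.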
Governance is a very strong and critical initiative for all of these businesses, and we play into the security aspect as well as the data lineage and tracking aspect and the quality aspect.

So this sounds like an extremely capable infrastructure service for delivering trusted data. Can you sell that to an economic buyer alone, or do you go in conjunction with another solution, like anti-money laundering for banks? What are the key things that they place enough value on that they would spend budget on it?

Yes, absolutely. The use cases may originate as anti-money laundering, which is very common, or fraud detection, and it ties to getting the single view of an entity, because in anti-money laundering you ultimately want to understand the single view of your customer. So there is usually another solution in the picture. We provide the visibility of the data as well as that single view of the entity, whether it's the customer view in this case or the product view in other use cases, by delivering the matching, cleansing and deduplication capabilities in addition to accessing and integrating the data.

Okay. When you go into a customer, recognizing that we still have tons of silos and we're realizing it's a lot harder to put everything in one repository, how do customers tell you they want to prioritize what they're bringing into the repository, or what they want to work on that's continuously flowing in?

It depends on the business use case, and usually by the time we are working with the customer, they have selected that top-priority use case: risk and anti-money laundering, or, for insurance companies, we are seeing a trend toward building a centralized data marketplace.
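The matching and deduplication step behind a "single view of the customer," as discussed above, can be sketched with stdlib fuzzy matching: records from different source systems whose names are similar enough get collapsed into one cluster. `difflib` here is a stand-in for a production matching engine; the threshold, fields and greedy clustering strategy are illustrative assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def dedupe(records, threshold=0.85):
    """Greedy clustering: each record joins the first cluster whose
    representative name it resembles closely enough; otherwise it
    starts a new cluster (a new candidate entity)."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if similarity(rec["name"], cluster[0]["name"]) >= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters

# Same customer appearing in two source systems, plus a distinct one.
records = [
    {"name": "Jonathan A. Smith", "source": "crm"},
    {"name": "Jonathan Smith",    "source": "billing"},
    {"name": "Maria Garcia",      "source": "crm"},
]
clusters = dedupe(records)
print(len(clusters))  # the two Smith records collapse into one cluster
```

In an anti-money-laundering context, each resulting cluster is one entity, so transactions from the CRM and billing systems can be analyzed against a single customer rather than two apparently unrelated ones.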
So depending on the business case, many of our insurance customers in the US, for example, are creating the data marketplace and working with near real time and micro-batches. Europe seems to be a bit ahead of the game in some cases: Hadoop adoption was slow, but suddenly they went right into the streaming use cases. We are seeing more direct streaming, keeping the data fresh, and more utilization of Kafka, messaging frameworks and the data bus.

And in the case where they're sort of skipping the batch-oriented approach, how do they keep track of history?

It's still micro-batches in most cases, and the metadata is still associated with the data, so there is an analysis of what happened to that data historically. Tools like ours, and the vendors, come into the picture to keep track of that, basically.

Oh, so in other words, knowing what happened operationally to the data paints a picture of its history.

Exactly. Interesting.

And for governance we usually also partner: for example, we partner with the Collibra data governance platform, and we partnered with ASG, creating the business rules and the technical metadata and providing them to the business users, not just to the IT data infrastructure. On the Hadoop side, we partner very closely with Cloudera and Hortonworks to complete that picture for the customer, because nobody is interested in just what happened to the data in Hadoop, or in the mainframe, or in their relational data warehouse. They are really trying to see what's happening on premises and in the cloud, across multiple clusters, traditional environments and legacy systems, and to get that big-picture view.

So on that: enabling a business to have, as we'd say in marketing, a 360-degree view of data, knowing that there's so much potential for data to be analyzed to drive business decisions that might open up new business models, new revenue streams, increased profit.
What are you seeing as a CTO? When you go in to meet with a customer with data silos and you're talking to a chief data officer, what's the cultural journey they have to go on to start opening up other parts of the business to have access to data, so they really have that broader 360-degree view?

Yes, chief data officers are actually very good partners for us, because usually chief data officers are trying to break the silos of data and make sure the data is liberated for the business use cases. Still, most of the time the infrastructure and the cluster, whether deployed in the cloud or on premises, are owned by IT infrastructure, and the lines of business are really the consumers and clients of that. The CDO in that sense almost mediates, connecting those lines of business with IT infrastructure around the same goals for the business. They have to worry about compliance, about creating multiple copies of the data, about the security and availability of the data. So we are very good partners with the CDOs in that sense, and we also usually have the IT infrastructure owner in the room when we are talking with our customers, because they have a big stake: they are like the gatekeepers of the data, making sure it is accessed by the right people in the business.

Sounds like they're in the role of maybe a good cop, bad cop, or maybe mediator. Well, Tendü, I wish we had more time. Thanks so much for coming back to theCUBE, and like you said, you're speaking tomorrow at the Strata Conference on International Women's Day: "Get a farm-to-table view of your data." Love the title. Good luck with that tomorrow, and we look forward to seeing you back on theCUBE.
Thank you. I look forward to coming back and letting you know about more exciting organic innovations and acquisitions.

All right, we look forward to that. We want to thank you for watching theCUBE. I'm Lisa Martin, with my co-host George Gilbert. We are live at our event, Big Data SV in San Jose. Come down and visit us, stick around, and we will be right back with our next guest after a short break. Thank you.