Live from Midtown Manhattan, it's theCUBE. Covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. Basil Faruqui, who's the Solutions Marketing Manager at BMC, welcome to theCUBE. Thank you, good to be back on theCUBE. So, first of all, I heard you guys had a tough time in Houston, so I hope everything's getting better, and our best wishes to everyone affected by what went down. We're definitely in recovery mode now. Yeah, and hopefully that'll get straightened out quickly. What's going on with BMC? Give us a quick update. And in the context of Big Data NYC, what's happening? What is BMC doing in the big data space now, the AI space now, the IoT space now, the cloud space? So, like you said, the data lake space, the IoT space, the AI space, there are four components of this entire picture that literally haven't changed since the beginning of computing. If you look at those four components of a data pipeline, it's ingestion, storage, processing, and analytics. What keeps changing around it is the infrastructure, the types of data, the volume of data, and the applications that surround it. And the rate of change has picked up immensely over the last few years, with Hadoop coming into the picture and public cloud providers pushing it. It's obviously carrying a number of challenges, but one of the biggest challenges that we are seeing in the market, and that we're helping customers address, is the challenge of automating this. And obviously the benefit of automation is in scalability as well as reliability. So, when you look at this rather simple data pipeline, which is now becoming more and more complex, how do you automate all of this from a single point of control? How do you continue to absorb new technologies and not re-architect your automation strategy every time, whether it's Hadoop, whether it's bringing in machine learning from a cloud provider? And that is the issue we've been solving for customers for the past 30, 40 years.
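The four stages Basil names can be sketched, purely for illustration, as four chained functions. Every name, threshold, and record here is hypothetical; this is just the shape of a generic pipeline, not any BMC or Control-M API:

```python
# Illustrative only: the four stages of a generic data pipeline
# (ingestion, storage, processing, analytics) chained together.

def ingest():
    """Pull raw records from a source (here, a hard-coded sample)."""
    return [{"vehicle_id": 1, "temp": 92}, {"vehicle_id": 2, "temp": 105}]

def store(records, lake):
    """Persist raw records into a 'data lake' (here, a plain list)."""
    lake.extend(records)
    return lake

def process(lake):
    """Transform: flag readings above an arbitrary threshold."""
    return [{**r, "alert": r["temp"] > 100} for r in lake]

def analyze(processed):
    """Analytics: count how many records triggered an alert."""
    return sum(1 for r in processed if r["alert"])

lake = []
alerts = analyze(process(store(ingest(), lake)))
print(alerts)  # 1
```

The point of the interview is that while these four boxes stay constant, everything around them (infrastructure, data volume, tooling) keeps changing, which is what makes a single point of automation control valuable.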
All right, let me jump into it. So, first of all, you mentioned some things that never change: ingestion, storage. And what was the third one? Ingestion, storage, processing, and eventually analytics. So, okay, that's cool, I totally buy that. Now, if you move on and say, hey, okay, you believe that's standard, but now in the modern era that we live in, which is complex, you want breadth of data, but you also want the specialization when you get down to machine learning. That's highly bounded, and that's where the automation is right now. We see the trend essentially making that automation broader as it goes into the customer environments. Correct. How do you architect that? If I'm a CXO or a CDO, what's in it for me? How do I architect this? Because that's really the number one thing: I know what the building blocks are, but they've changed in their dynamics in the marketplace. So, the way I look at it is that what defines success and failure, particularly in big data projects, is your ability to scale. If you start a pilot and you spend three months on it and you deliver some results, but you cannot roll it out worldwide, nationwide, whatever it is, essentially the project has failed. The analogy I often give is Walmart, which has been testing the pickup tower, I don't know if you've seen it; this is basically a giant ATM for you to go pick up an order that you placed online. They're testing this at about 100 stores today. Now, if that's a success and Walmart wants to roll this out nationwide, how much time do you think their IT department's gonna have? Is this a five-year project, a 10-year project? No, management's gonna want this done in six months, 10 months. So, essentially, this is where automation becomes extremely crucial, because it is now allowing you to deliver speed to market, and without automation, you are not going to be able to get to an operational stage in a repeatable and reliable manner.
You're describing a very complex automation scenario. How can you automate in a hurry without sacrificing the details of what needs to be done? In other words, that would seem to call for repurposing or reusing prior automation scripts and so forth. How can the Walmarts of the world do that fast but also do it well? Yeah, so we go about it in two ways. One is that, out of the box, we provide a lot of pre-built integrations to some of the most commonly used systems in an enterprise, all the way from the mainframes, Oracles, SAPs, Hadoops, Tableaus of the world; they're all available out of the box for you to quickly reuse these objects and build an automated data pipeline. The other challenge we saw, particularly when we entered the big data space four years ago, was that automation was something that was considered close to the project becoming operational. And that's where a lot of rework happened, because developers had been writing their own scripts and using point solutions. So we said, all right, it's time to shift automation left and allow companies to build automation as an artifact very early in the development life cycle. About a month ago, we released what we call Control-M Workbench, essentially a community edition of Control-M targeted at developers, so that instead of writing their own scripts, they can use Control-M in a completely offline manner without having to connect to an enterprise system. So as they build and test and iterate, they're using Control-M to do that, and as the application progresses through the development life cycle, all of that work can then translate easily into an enterprise edition of Control-M. Just quickly, just define what shift left means for the folks that might not know software methodologies, so they don't think it's the political left or right, you know what I'm saying?
I mean, this is software development, so quickly take a minute to explain what shift left means and the importance of it. Correct. So if you think of software development as a straight-line continuum, you start with building some code, you do some testing, then unit testing, then user acceptance testing. As it moves along this chain, there was a point right before production where all of the automation used to happen. You know, developers would come in and deliver the application to ops, and ops would say, well, hang on a second. All this crontab and all these other point solutions have been used for automation; that's not what we use in production, and we need you to now... So test early and often. Test early and often. So the challenge was that the tools the developers used were not the tools being used on the production end of the site. And there was good reason for it, because developers don't need something really heavy, with all the bells and whistles, early in the development life cycle. Now, Control-M Workbench is a very light version which is targeted at developers and focuses on the needs they have when they're building and developing. So as the application progresses through... How much are you seeing waterfall, and how much are you seeing people shifting left becoming more prominent now? What percentage of your customers have moved to agile and shifting left, percentage-wise? So we survey our customers on a regular basis, and the last survey showed that 80% of the customers have either implemented a more continuous integration and delivery type of framework or are in the process of doing it. So getting up close to 100 is pretty much possible. And what is driving all of that is the need from the business. The days of the five-year implementation timelines are gone. This is something that you need to deliver every week, two weeks, in iterations.
And we have also innovated in that space, with an approach we call Jobs-as-Code, where you can build entire complex data pipelines in code format so that you can enable automation in a continuous integration and delivery framework. I have one quick question, then Jim, I'll give you the floor to get a word in. But I have one final question on this BMC methodology thing. You guys have a history. Obviously, BMC goes way back. I remember Max Watson as CEO, and Bob Beauchamp; back in '97 we used to chat with them, they dominated that landscape. But we're kind of going back to a systems mindset. So the question for you is, how do you view the issue of this holy grail, the promised land of AI and machine learning, where end-to-end visibility is really the goal, right? At the same time, you want bounded experiences at the root level so automation can kick in to enable more activity. So there's a trade-off between going for the end-to-end visibility out of the gate, and having bounded visibility and data to automate. How do you guys look at that market, because customers want the end-to-end promise, but they don't want to try to get there too fast and hit diseconomies of scale, potentially? How do you talk about that? And that's exactly the approach we've taken with Control-M Workbench, the community edition, because early on you don't need capabilities like SLA management and forecasting and automated promotion between environments. Developers want to be able to quickly build and test and show value, and they don't need something with all the bells and whistles. We're allowing you to handle that piece in that manner, through Control-M Workbench. As things progress and the application progresses, the needs change as well: well, now I'm closer to delivering this to the business, I need to be able to manage this within an SLA, I need to be able to manage this end to end and connect this to other systems of record, and streaming data and clickstream data, all of that.
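The Jobs-as-Code idea Basil describes is that the pipeline definition lives as plain data in version control and gets validated in CI like any other artifact. A minimal sketch of that idea follows; the job names, fields, and `validate` helper are all hypothetical, and this is not the actual Control-M definition schema:

```python
# Hypothetical "jobs-as-code" sketch: a pipeline described as plain data,
# checked in alongside application code and validated in a CI step.
import json

pipeline = {
    "DailyIngest": {"command": "ingest.sh", "depends_on": []},
    "Transform":   {"command": "etl.py",    "depends_on": ["DailyIngest"]},
    "Report":      {"command": "report.py", "depends_on": ["Transform"]},
}

def validate(defn):
    """CI-style check: every dependency must refer to a defined job."""
    for name, job in defn.items():
        for dep in job["depends_on"]:
            if dep not in defn:
                raise ValueError(f"{name} depends on unknown job {dep!r}")
    return True

validate(pipeline)             # raises if the graph references a missing job
print(json.dumps(pipeline))    # serializes like any other config artifact
```

Because the definition is just data, a broken dependency fails the build early, in the developer's loop, rather than surfacing when ops tries to take the pipeline into production.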
So we believe that it doesn't have to be a trade-off, that you don't have to compromise speed and quality for end-to-end visibility and enterprise-grade automation. You mentioned trade-off. So with Control-M Workbench, the developer can use it offline. What amount of testing can they possibly do on a complex data pipeline automation when the tool is offline? I mean, it seems like the more development they do offline, the greater the risk that it simply won't work when they go into production. Give us a sense for how they mitigate risk when they use Workbench. We spent a lot of time observing how developers work. And very early in the development stage, all they're doing is working off of their Mac or their laptop, and they're not really connected to anything. And that is where they end up writing a lot of scripts, because whatever code, whatever business logic they've written, the way they're gonna make it run is by writing scripts. And that essentially becomes the problem, because then you have scripts managing more scripts, and as the application progresses, you have this complex web of scripts and crontabs and maybe some open-source solutions trying to simply make all of this run. And doing this in an offline manner doesn't mean that they're losing all of the other Control-M capabilities. Simply, as the application progresses, whatever automation they've built in Control-M can seamlessly now flow into the next stage. So when you are ready to take an application into production, there's essentially no rework required from an automation perspective. All of that that was built can now be translated into the enterprise-grade Control-M, and that's where operations can then go in and add the other artifacts, such as SLA management and forecasting and other things that are important from an operational perspective. I'd like to get both your perspectives, because Jim, you're like an analyst here, so I want you guys to comment.
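One way to picture what an offline workbench can usefully test is a local dry run of the job graph: resolving dependencies into an execution order without connecting to any enterprise scheduler. The sketch below is illustrative only, a generic topological sort, and not a real Control-M or Workbench API:

```python
# Illustrative: dry-running a job graph locally by resolving the order in
# which jobs could run, without any connection to a scheduler backend.

def execution_order(jobs):
    """Kahn-style topological sort over a {name: [dependencies]} graph."""
    order, resolved = [], set()
    pending = dict(jobs)
    while pending:
        ready = [n for n, deps in pending.items() if set(deps) <= resolved]
        if not ready:
            raise ValueError("cycle detected in job graph")
        for n in sorted(ready):       # sorted for a deterministic dry run
            order.append(n)
            resolved.add(n)
            del pending[n]
    return order

graph = {"ingest": [], "transform": ["ingest"], "report": ["transform"]}
print(execution_order(graph))  # ['ingest', 'transform', 'report']
```

This is the risk-mitigation point in the answer: the artifact built offline (the graph and its constraints) is the same artifact the production system will later execute, so what got validated locally is what ships.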
My question to both of you, looking at this time in history... I was on the BMC side; we mentioned some of the history. You guys are transforming on a new journey and extending that capability into this world. Jim, you're covering state-of-the-art AI and machine learning. What's your take on this space now? Strata Data, which was Strata + Hadoop World: Cloudera went public, Hortonworks is now public; the Hadoop guys kind of grew up, but the world has changed around them. It's not just about Hadoop anymore. So let's get your thoughts from this kind of perspective. We're seeing a much broader picture at Big Data NYC versus the Strata Hadoop show, which seems to be losing steam; the bigger focus is much broader, horizontally scalable. Your thoughts on the ecosystem right now? I'll let Basil answer first, unless Basil wants me to go first. Yeah, and I think the reason the focus is changing is because of where the projects are in their life cycle. You know, what we're seeing now, as most companies are grappling with it, is: how do I take this to the next level? How do I scale? How do I go from just proving out one or two use cases to making the entire organization data-driven, and really injecting data-driven decision making into all facets of decision making? So that is, I believe, what's driving the change that we're seeing, that, you know, you've now gone from Strata + Hadoop to Strata Data and a focus on that element. And like I said earlier, the difference between success and failure is your ability to scale and operationalize. Take machine learning, for example. Really, it's not a hype market anymore; it's show me the meat on the bone, show me scale. I've got operational concerns of security and whatnot. And machine learning, you know, that's one of the hottest topics.
A recent survey I read, which polled a number of data scientists, revealed that they spent less than 3% of their time training the data models and about 80% of their time on data manipulation, data transformation, and enrichment. That is obviously not the best use of a data scientist's time, and that is exactly one of the problems we're solving for our customers around the world. That needs to be automated to the hilt. Correct. To help them be more productive and deliver faster results. Ecosystem perspective, Jim, what are your thoughts? Yeah, everything that Basil said, and I'll just point out that many of the core use cases for AI are automation of the data pipeline. You know, it's driving machine-learning-driven predictions, classifications, abstractions and so forth into the data pipeline, into the application pipeline, to drive results in a way that is contextually and environmentally aware of what's going on: the history, historical data, what's going on in terms of current streaming data, to drive optimal outcomes, using predictive models and so forth, inline to applications. So really, fundamentally, what's going on is that automation is an artifact that needs to be driven into your application architecture as a repurposable resource for a variety of downstream systems. Do customers even know what to automate? I mean, look, that's the question. You're automating human judgment. You're automating effort, you know, like the judgments that a working data engineer makes to prepare data for modeling and whatever; more and more of that can be automated, because those are patterned, structured activities that have been mastered by smart people over many years. I mean, we just had a customer on, GlaxoSmithKline, GSK, at that scale, and his attitude is: we see the results from the users, then we double down, pay for it, and automate it.
So the automation question, it's an open question, it's a rhetorical question, but it begs the question: who's writing the algorithms as machines get smarter and start throwing off their own real-time data? What are you looking at? How do you determine you're going to need machine learning for machine learning? Are you going to need AI for AI? Who writes the algorithms for the algorithms? Automated machine learning is a hot, hot area, not only as a research focus, but we're seeing more and more solution providers, like Microsoft and Google and others, going deep and doubling down on investments in exactly that area. That's a productivity play for data scientists. I think the data markets are going to change radically, in my opinion; you're starting to see some things with blockchain and some other things that are interesting. Data sovereignty and data governance are huge issues. Basil, let's give you final thoughts for the segment as we wrap this up. Final thoughts on data and BMC. What should people know about BMC right now? Because people might have a historical view of BMC. What's the latest? What should they know? What's the new Instagram picture of BMC? What should they know about you guys? So I think what people should know about BMC is that all the work that we've done over the last 25 years, on virtually every platform that came before Hadoop, we have now innovated to take into things like big data and cloud platforms. So when you are choosing Control-M as a platform for automation, you are choosing a very, very mature solution. An example of which is Navistar; their CIO is actually speaking at the keynote tomorrow. They've had Control-M for 15, 20 years, and they've automated virtually every business function through Control-M.
And when they started their predictive maintenance project, where they're ingesting data from about 300,000 vehicles today to figure out when a vehicle might break and do predictive maintenance on it, they said that they always knew they were gonna use Control-M for it, because that was the enterprise standard, and they knew that they could simply extend that capability into this area. When they started about three, four years ago, they were ingesting data from about 100,000 vehicles. That has now scaled to over 325,000 vehicles, and they have not had to re-architect their strategy as they grow and scale. So I would say that is one of the key messages we are taking to market: that we are bringing innovation that spans over 25 years and evolving it. Modernizing it, basically. Modernizing it and bringing it to newer platforms. Well, congratulations. I wouldn't call that a pivot; I'd call it extensibility, kind of modernizing the core things. Absolutely. Thanks for coming on and sharing the BMC perspective inside theCUBE here at Big Data NYC. This is theCUBE, I'm John Furrier with Jim Kobielus here in New York City, with more live coverage. We'll be here for three days: today, tomorrow and Thursday. More Big Data NYC coverage after this short break.