Live from New York, it's theCUBE, covering the IBM Machine Learning Launch Event, brought to you by IBM. Now here are your hosts, Dave Vellante and Stu Miniman.

Welcome back to the Big Apple, everybody. This is theCUBE, the leader in live tech coverage. We're here covering the IBM Machine Learning Launch Event, hashtag IBMML. Jeff Josten is here, an IBM Distinguished Engineer in the DB2 space. Jeff, welcome to theCUBE.

Thank you.

DB2, we're going to geek out on DB2. Love it. So give us the update. DB2 12 launched in October?

That's correct.

All right. Tried and true database, and if you want something that works, boom, DB2 is it. Give us the update. What was new in 12, and then bring us through to the announcements today.

So in DB2 12, which just went generally available back in October, we have several new features that are of high interest to our customers. One has to do with increasing the scale of the transaction rates we can support, as well as the table and object sizes. We can now go up to 280 trillion rows in a single table with DB2, eliminating a lot of the limits that some customers are bumping up against in the mobile and internet-of-things worlds. We also increased the ingest rate for inserts into a single table: we achieved over 11 million inserts per second, fully recoverable, fully logged.

So: high ingest rates, large table sizes, and improved manageability of those tables for the admins. As data gets larger and larger over time, the admin burden, the DBA burden, can become cumbersome. So at the same time we're increasing scale, we need to improve the management capabilities. We now allow piecemeal management of large tables: you can reorganize a single partition, or extend the size of a single partition, without having to do table-level operations anymore. So improved manageability, in-memory enhancements, query performance improvements, JSON data support, XML improvements, lots of SQL improvements. A full bevy.

So tons of features. In January of 2015, IBM announced the z13, and the big theme of that announcement was bringing analytics and transactions together, this notion of hybrid transaction and analytical processing. Bring us two years forward now. What's relevant to the announcement today, and how is that all going?

It's going great. We're of the mindset in DB2 development that a lot of the transactional data resides on our systems on z/OS, so bringing analytics capabilities to that data makes a lot of sense and resonates with a lot of our customers. We're building out our analytics and query capabilities to run alongside the transactions, utilizing the base capabilities in z13 and the z/OS operating system that allow for strong multi-tenant operations and isolation of workloads. So you can run analytics alongside transactions without the two interfering with one another.

In z13 there are a lot of very interesting features for the HTAP world. Large memory is one of them: the more we cache in memory in a database, the better performance we can deliver, not only for transactions but also for the analytics running on the system. The SIMD processing with vector operations becomes very interesting in the analytics space, as do the SMT capabilities for multi-threading on a single core, which we utilize in DB2 on the zIIP processors.
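To make the SIMD point concrete, here is a minimal sketch, using Java's incubating Vector API, of the kind of vectorized column scan an analytics engine performs: one compare instruction evaluates a whole lane-width of values at once instead of one row at a time. This illustrates the general technique, not DB2's actual implementation; the column data and threshold are invented.

```java
// Run with: java --add-modules jdk.incubator.vector SimdScan.java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class SimdScan {
    static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_PREFERRED;

    // Count rows whose value exceeds a threshold, one SIMD lane-width per step.
    static long countOver(int[] column, int threshold) {
        long hits = 0;
        int i = 0;
        int upper = SPECIES.loopBound(column.length);
        for (; i < upper; i += SPECIES.length()) {
            IntVector v = IntVector.fromArray(SPECIES, column, i);
            VectorMask<Integer> m = v.compare(VectorOperators.GT, threshold);
            hits += m.trueCount();          // how many lanes matched the predicate
        }
        for (; i < column.length; i++) {    // scalar tail for the leftover rows
            if (column[i] > threshold) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] col = new int[1_000_000];
        for (int j = 0; j < col.length; j++) col[j] = j % 100;
        System.out.println(countOver(col, 90));  // 90,000 under this fill pattern
    }
}
```

In a database engine the same idea is applied to compressed, columnar in-memory formats, which is part of why large memory and vector units compound each other.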
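And circling back to the ingest numbers quoted above: the 11-million-inserts-per-second result is an engine-level figure, but on the application side, sustaining high insert rates usually comes down to batching parameterized statements so network round trips don't dominate. A hedged JDBC sketch of that pattern follows; the SENSOR_READINGS table, URL, and credentials are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BulkIngest {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; assumes the IBM JCC JDBC driver is on the classpath.
        try (Connection con = DriverManager.getConnection(
                "jdbc:db2://zhost.example.com:446/DB2LOC", "dbuser", "dbpass")) {
            con.setAutoCommit(false);  // commit per batch, not per row
            String sql = "INSERT INTO SENSOR_READINGS (DEVICE_ID, TS, READING) " +
                         "VALUES (?, CURRENT TIMESTAMP, ?)";
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                for (int i = 0; i < 100_000; i++) {
                    ps.setInt(1, i % 500);
                    ps.setDouble(2, Math.random() * 100);
                    ps.addBatch();
                    if (i % 1_000 == 999) {  // ship a batch every 1,000 rows
                        ps.executeBatch();
                        con.commit();
                    }
                }
                ps.executeBatch();  // flush any partial final batch
                con.commit();
            }
        }
    }
}
```

Per-batch commits group the log writes while preserving the fully-logged, fully-recoverable property Jeff emphasizes.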
So a lot of the synergies between DB2 and the underlying hardware and operating system really come to bear in the HTAP world. And then on top of that we put the analytics accelerator, which brings a new dimension to the HTAP capabilities.

Okay, so before we go there: at the time, a couple of years ago, somebody would have a mainframe system with the transaction data, the critical data, and then they'd have an InfiniBand pipe to the data warehouse, and that's how they did their sort of hybrid. But that really wasn't the vision you put forth. Have you been able to consolidate that workflow?

Yes. That's been a big part of our technical roadmap, it's resonating with customers, and we do have customers consolidating that workflow onto the Z systems now, many times using the analytics accelerator. You get a lot of benefits from doing this. You avoid the complexity of ETL processing, which is a big resource consumer in a lot of shops. You allow the analytics to go against the real-time data, so you eliminate the data latency concerns you have when you're extracting and loading onto other platforms, where you're potentially querying stale data and not seeing the latest transactional data back in the transaction systems. And security is another big concern with a lot of the customers we work with: if you're moving data around and you have multiple copies of it, you now have data security concerns you wouldn't have if you could consolidate all those operations against a single copy of the data.

Okay, so what's the technical challenge, Jeff, and maybe the technical enabler, in creating that kind of hybrid, HTAP system? Is it just larger, cheaper memories? There's got to be more to it than that. And you're seeing it as a trend, obviously. Steve Mills once said that ever since there have been databases, there have been in-memory systems, but it's only in the last three or four years that there's been an explosion. Why? And what are the technical challenges?

Yeah, so in our world, where we have fairly large data sets and transaction processing is very prevalent, with mission-critical systems that have to be highly available and highly secure, the technical challenge when you introduce analytics processing is not only making the analytics perform, but running it in a way that doesn't interfere with those mission-critical transactions. So workload isolation is absolutely a critical factor and a big technical challenge. On the z/OS platform we have the capabilities of the workload manager, which is baked into the operating system and lets you define business goals for the various workload classes in the system and achieve that level of isolation. So I can run fairly complex queries in my DB2 for z/OS system without impacting the transactions running alongside those queries, even though they're sharing memory, sharing CPU, sharing the same underlying hardware resources. And that's where the analytics accelerator comes in: it achieves an additional level of isolation, and also an additional level of acceleration, that we can't achieve natively on z/OS.
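From the application side, that accelerator offload is meant to be transparent. In DB2 for z/OS, routing is governed by the CURRENT QUERY ACCELERATION special register. Here's a minimal, hedged JDBC sketch of the pattern: the special register is real DB2 for z/OS syntax, but the host, credentials, and SALES table are made-up placeholders, and the available register values (NONE, ENABLE, ELIGIBLE, ALL, among others) can vary by version.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AcceleratedQuery {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; assumes the IBM JCC JDBC driver on the classpath
        // and a configured analytics accelerator behind the DB2 subsystem.
        try (Connection con = DriverManager.getConnection(
                "jdbc:db2://zhost.example.com:446/DB2LOC", "dbuser", "dbpass");
             Statement st = con.createStatement()) {

            // Let DB2 route eligible queries to the accelerator; transactions on
            // the same subsystem keep running natively on z/OS.
            st.execute("SET CURRENT QUERY ACCELERATION = ELIGIBLE");

            // A complex aggregate of the sort that would be offloaded.
            try (ResultSet rs = st.executeQuery(
                    "SELECT REGION, SUM(AMOUNT) FROM SALES GROUP BY REGION")) {
                while (rs.next()) {
                    System.out.printf("%s: %s%n", rs.getString(1), rs.getBigDecimal(2));
                }
            }
        }
    }
}
```

The design point worth noting: the application never addresses the accelerator directly; DB2 decides per query whether to offload, which is what lets it present a single system image, as Jeff describes next.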
And so by putting that complex analytics processing off onto our hybrid analytics accelerator, the query acceleration runs as a single image as far as the applications are concerned. But in the background we're utilizing different memory, different CPs, and it's completely isolated from the mission-critical transactions running against that data.

So does recovery change in any way if something goes wrong? Is that a key technical challenge?

Yes. If you're putting data off onto an accelerator, you now have to ask what happens if that accelerator fails, and there are some additional recovery considerations that apply. The design we have is that DB2 coordinates the recovery, the backup and recovery jobs that bring the data back to currency. There are also additional considerations for disaster recovery scenarios: if you're using an accelerator at your local site and you fail over to an alternate site, you'd want those acceleration capabilities over there as well, and you'd want to repopulate that data as quickly as possible after the failover.

Jeff, can you speak a little bit about the role of the DBA? How do you see that maturing? Do they become more valuable to the business now? It seems like they'll have more capabilities, and, I guess it's not quite up the stack, but some similarity to what a data scientist would do. How's that changing?

Yeah, so we're trying to do the technical work in the engine to take away the mundane tasks DBAs typically need to do: backup and recovery, database reorganizations, managing query performance, working with application groups on performance issues and concurrency issues and things like that. We're trying to take those mundane tasks away and build in more automation, so the system manages these things automatically. That frees the DBA to do higher-value business operations, working with the application groups, working more in a data-scientist kind of capacity.

Yeah, I was reading an article about the long-term viability of databases themselves. How do you see the SaaS model infiltrating the database world? Obviously IBM's got a large presence in cloud. How do you see that maturing and changing over time?

Yeah, many of the customers we work with run large mission-critical systems on the mainframes, and when we talk to them about cloud, they're very interested in the flexibility and elasticity that cloud can provide, as well as self-service provisioning for their application developers. But the customers we work with are mainly interested in on-premises deployment, maintaining control and ownership of those assets, mainly for data security reasons and also for some performance reasons. At the same time, we want to advance the cloud service capability to make provisioning easier. So we're looking at private cloud kinds of implementations, where the IT staff is the service provider and the service consumers are the application groups inside the company, who can easily provision services on that mainframe infrastructure being maintained on-prem. By providing these services, we can provide a lot of self-service automation for the application developer persona, and we also provide the basis to perhaps take these services out into the public cloud someday.
But our initial target is private cloud kinds of configurations.

Yeah, and if you define cloud as an operating model, not a place where you put stuff, so let's park that for a second, people want databases as a service. Is that the operating premise?

That's correct, yeah. And at different granularities. As an admin, you'd like to be able to provision a new DB2 environment automatically, without having to do a lot of the manual install and migrate kinds of things you traditionally had to do on those systems. And then from an application developer standpoint, it's a different granularity of provisioning you're looking at. It's typically schemas within an existing database that's already provisioned on those systems. And there we get into some of the multi-tenancy capabilities we've built into z/OS over the years, which allow these developers to very easily provision their schemas alongside other schemas in the same DB2 environment.

So it's granularity of instances, is that right? And there are obviously a lot of go-to-market challenges with databases as a service, like pricing, but from a technical standpoint it's security, noisy-neighbor stuff, right? And you're saying it's the granularity of the instances that I provision, is that right?

Yes, that's correct. And we think we're building on a strong technology base here, because of the multi-tenancy capabilities we've had in z/OS for many years. So we can get that isolation and solve some of these noisy-neighbor problems that are perhaps more difficult to solve in other cloud environments.

It's the original private cloud, folks. Some people might roll their eyes at that, but it's true. When you think about z/OS and the capabilities there, and you look at how people define private cloud, you've had that for decades, right?

A lot of those tenets are built in, so we think we have a very strong technology base to build out these services.
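Picking up the schema-level granularity Jeff described a moment ago, here is a minimal sketch of what self-service provisioning inside an existing DB2 for z/OS subsystem might look like from the service layer. SET CURRENT SQLID is genuine DB2 for z/OS syntax for setting the implicit schema qualifier, but the team name, table, grant target, and connection details are all hypothetical, and a real service would validate inputs rather than concatenating them into SQL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TenantProvisioner {

    // Provision a per-team schema inside an already-running DB2 subsystem.
    // Assumes the connecting admin ID is authorized to assume the team's SQLID.
    static void provision(Connection con, String teamId) throws Exception {
        try (Statement st = con.createStatement()) {
            // Objects created from here on are implicitly qualified with the team's schema.
            st.execute("SET CURRENT SQLID = '" + teamId + "'");  // sketch only: never concatenate untrusted input
            st.executeUpdate("CREATE TABLE ORDERS (" +
                    "ID BIGINT NOT NULL PRIMARY KEY, " +
                    "TOTAL DECIMAL(12,2) NOT NULL)");
            // Give the team's developer auth ID access to its own objects only.
            st.executeUpdate("GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE ORDERS TO " + teamId + "_DEV");
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; assumes the IBM JCC JDBC driver on the classpath.
        try (Connection con = DriverManager.getConnection(
                "jdbc:db2://zhost.example.com:446/DB2LOC", "dbadmin", "secret")) {
            provision(con, "APPTEAM1");
        }
    }
}
```

The multi-tenancy Jeff points to is what makes this pattern safe: each team's schema lives alongside the others in one subsystem, with z/OS workload isolation keeping neighbors from getting noisy.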
Okay, so I know you can't give us too much detail here, but where should we be focused in the future? What kinds of things, just at a high level, should the industry expect from this class of systems going forward?

Yeah, well, my domain is databases, and I work on the mainframe platform. As a database engineer, we're always concerned about performance first and foremost, so we look at opportunities for hardware-software integration, and for using large memory to solve performance problems, especially as the Moore's Law curve changes going into the future and we can't rely, as we did in the past, on CPU speeds doubling every 18 months. So hardware-software integration, pushing things into hardware acceleration where it makes sense, and utilizing large memory are going to be really important. And as we talked about before, scaling databases is a concern, especially in the mobile world: how do you manage these large databases, and make that more automated over time? Not only in cloud environments, where you really rely on automation and self-management, but even in the traditional mission-critical production environments we maintain today, because IT professionals just don't have the time, or sometimes even the skills, to manage these environments the way they have in the past.

Jeff, how does the pending end of the traditional storage stack affect database design? Because since the dawn of computing, you've expected there to be some rotating, mechanical device that you're going to have to deal with, that's going to create latency issues, et cetera. From an architectural perspective, as you see more in-memory, non-volatile memories, and flash storage, how does that change how you think about designing databases?

It has a fundamental impact on how we look at designing database management systems, because we want to cache in memory as much as possible for performance. However, memory, as you say, is volatile, so we still need persistent storage for database recovery and database persistence. That has typically been predicated on flash storage and spinning disks, but more and more we're going to cached storage. The caches in our storage servers are becoming very, very large these days, so we can utilize those caches much the same way we use coupling facilities for very high-speed clustering in the database: it's electronic storage. You can think of DASD controllers, disk controllers, in the same way, very large memories caching data out there, and we can get at that data a lot faster than we could at a spinning disk. Flash gives us another dimension of potential, because now, especially for random access, we can get at data a lot faster on a flash device than we could on a spinning disk.

Do you expect the application development paradigm to change? I guess with CAPI and things like that it doesn't have to, but if you were able, as a developer, to exploit that non-volatile resource more explicitly, it would dramatically increase performance, presumably. Do you see developers beginning to change the way they write applications?

Yeah. For example, some queries that weren't possible in the past now become possible.

Like what?

Some table joins. Perhaps I want to do a three-way join between some very large tables; you wouldn't dare run that on some of the older systems. With today's larger-memory systems you can start to contemplate it, the performance becomes very acceptable, and we can do it with acceptable resource utilization. And the analytics accelerator gives us a lot more power to bring to bear there. So application developers are now able to run queries that in the past just weren't able to run.

Great. And you're addressing the audience today, is that right?

That's correct.

What time are you on, do you know?

This afternoon. I think it's the last slot in the agenda, 4:50.

Great. "Data Innovation with Hybrid Transaction/Analytical Processing with DB2." Awesome, don't miss that. Jeff, thanks very much for coming on theCUBE. Really appreciate your time.

You bet.

All right, keep it right there, everybody. We're back with our next guest right after this short break. We're live from the Big Apple. Right back.