From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. In the words of famous people like Nobel laureate Niels Bohr and baseball legend Yogi Berra, predictions are very difficult, especially if they're about the future. Hello and welcome to this week's theCUBE Research Insights powered by ETR. In this special Breaking Analysis, we're pleased to host our third annual data predictions power panel with some of our collaborators in theCUBE Collective and members of the data gang. With us today are five of the top industry analysts focused on data platforms: Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, IDC's Carl Olofson, Dave Menninger of Ventana Research, now part of ISG, and Doug Henschen of Constellation Research. Guys, thanks, we really appreciate you, and we are very excited for our annual look ahead on data. Now, before we get into it, I want to briefly share some ETR data from an October survey of more than 1,700 IT decision makers. This graphic shows net score, or spending momentum, on the vertical axis, and the overlap of these platforms within those 1,700 accounts, representing the pervasiveness of the platform within the dataset. This data is isolated to the analytics, business intelligence, database/data warehouse, and ML/AI sectors. And we've selected a subset of the companies in this group of sectors that are representative vendors for today's discussion. Notice that red line at 40%. Anything above that indicates a highly elevated spending velocity on a platform. So a couple of quick points and then we'll get into it. First of all, the presence of Microsoft and AWS is impressive and notable, well ahead of Google Cloud. The momentum of OpenAI at a net score of nearly 80% is astoundingly high.
And its presence on the X axis represents about seven times the account penetration of Anthropic, which you see on the left-hand side of the chart just above Dataiku. Snowflake and Databricks remain above the 40% mark with pretty strong momentum. And you can see a number of other companies that we'll discuss directly or indirectly across this graphic in this basket of sectors that we've chosen: MongoDB, SAP, IBM Watson, and then we've got governance, metadata, pipeline, and ETL tools like Informatica, Collibra, Alation, Alteryx, et cetera. You've got BI platforms like ThoughtSpot, Qlik, Tableau, and Looker, and of course a number of database and data analytics platforms like Couchbase, Cloudera, SAS, and of course Oracle. So this gives you a general quantitative sense of the relative position of these platforms in what is a multi-hundred-billion-dollar TAM. Okay, let's get started by looking back at this team's 2023 predictions and how the analysts fared. This graphic shows all of the 2023 predictions for each analyst in one table. It's got a little commentary on evidence of whether the prediction was a direct hit, which is green, a glancing blow, which is yellow, or a miss, which is red. So a quick scan of the heat map shows that the data gang did pretty well in its 2023 predictions, notwithstanding that these were self-evaluated by each of our analysts. Okay, so let's get into the 2023 predictions review, starting with Sanjeev Mohan. Sanjeev, we're showing your prediction about unified metadata becoming the kingmaker and your expectation that data products would rise in popularity, and you're sharing evidence of Microsoft Fabric, Databricks Unity Catalog, and some other proof points. Take it away and explain your logic and your assessment.
Yeah, so when we came up with these predictions last year, this was before AI took off. My whole thesis was that data catalogs are going to become more powerful, they'll add more use cases and go far beyond just being catalogs. And at that time I was thinking data quality, security, privacy, some of those things. And then Unity Catalog came up, and we see it has merged the AI model catalog with the data catalog. Microsoft Fabric, I thought, is a really great concept that brings together different disparate pieces of the architecture of the stack into one place. So this is how I see metadata starting to converge for different use cases. The second prediction I had was on data products. I was so upbeat about data products that I ended up writing a book called Data Products for Dummies. I feel data products have now become mainstream. I have multiple conversations every week where people are now starting to create these data products. For example, I was at KubeCon a couple of months ago and met with the Intuit team. They have 900 data products, and I was told that the mandate for them is that in the future, all access to data is going to be through a data product. So I feel data products have become a common theme. In fact, the definition and the examples of data products have also increased. Last year at this time, like I said, AI was not in the picture, but now we are so deep into the space, so think RAG pipelines, for instance. I'll be talking about AI agents. A lot of these are actually data products. You could do your LLM inference, have a whole mechanism, and wrap it up into a data product. So that's why I rate both of these as green. Okay, great, thank you. Okay, next we've got Tony Baer. We're going to be rapid fire here. Tony, you predicted that the industry would begin to rethink the modern data stack, and you've cited some evidence of that with a mix of green, yellow, and red. So appreciate the self-evaluation.
Explain your 2023 prediction in more detail and your assessment of its accuracy, please. Yeah, thanks, Dave. Number one, the modern data stack was a whole idea to basically modularize all the different pieces that you need to go from transactional to analytic data. Great idea, except for the fact that in execution it resulted in a lot of complexity. So if we're looking at how this prediction performed over the past year, a metaphor here, and I may be guilty of overusing this, would be like having one leg in a bucket of boiling water and one leg in a bucket of ice. On average, I guess it was middling. It was okay. Let me break this down into several areas. Some areas basically showed more progress than others. I think in the area of flattening the analytics and transaction data stack, we saw pretty impressive progress. And this was progress that was already ongoing when we made this prediction last year. For instance, Oracle rethinking MySQL with HeatWave, which combines both an analytic and transactional database, the first time that MySQL had actually been applied towards analytics. And Google doing similar things with Postgres in AlloyDB. I think what was really interesting this past year is seeing what Amazon's done. Amazon has, I've lost count now, 15 or 16 databases, whatever. It wasn't like all of a sudden they were going back to database Pangea or anything like that. But they were now putting in more of what I call seamless connections. And they did it in a fairly ingenious way. They took advantage of the technology they used in Aurora, which is basically looking at logs and replicating change logs. So on their transactional database, Aurora, performance would not be impacted; instead it would generate a change stream, which would automatically feed to Redshift. And basically the first announcement was with MySQL, which went GA.
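As an editorial aside, the change-stream pattern Tony describes, where writes land in the transactional store and replicate asynchronously into the analytic store by replaying a change log, can be sketched in a few lines. This is a minimal illustration of the general zero-ETL idea, not Aurora's actual implementation; all class and key names below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Change:
    op: str           # "upsert" or "delete"
    key: str
    value: dict = None

@dataclass
class TransactionalStore:
    """Toy OLTP store: every write also appends to a change log."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def upsert(self, key, value):
        self.rows[key] = value
        self.log.append(Change("upsert", key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append(Change("delete", key))

@dataclass
class AnalyticReplica:
    """Toy OLAP replica: replays the change log from its last offset."""
    rows: dict = field(default_factory=dict)
    applied: int = 0  # log position already consumed

    def sync(self, log):
        for change in log[self.applied:]:
            if change.op == "delete":
                self.rows.pop(change.key, None)
            else:
                self.rows[change.key] = change.value
        self.applied = len(log)

oltp = TransactionalStore()
olap = AnalyticReplica()
oltp.upsert("order-1", {"amount": 120})
oltp.upsert("order-2", {"amount": 75})
oltp.delete("order-1")
olap.sync(oltp.log)  # the replica catches up without re-querying oltp.rows
```

The point of the design, and what Tony credits AWS for, is that the analytic side consumes the log rather than querying the transactional tables, so replication adds no read load to the OLTP database.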
What's impressive is that AWS expanded this now to Postgres, and even more interestingly, to the NoSQL side with DynamoDB and OpenSearch. And so that was actually very significant progress for a cloud provider, a hyperscaler, that really has been known more for its complexity in the past, to start to find ways to put things together. Another area where I think we saw very strong progress is in-database machine learning, fast becoming table stakes, fast becoming a checkbox. It's been implemented in different ways. For instance, with Redshift, it's basically a matter of calling in models that you've developed in SageMaker, whereas with Oracle or Google BigQuery ML, it's a matter of dealing with canned machine learning models that are already in-database. Going forward, I think there'll be a lot of very interesting progress in terms of how we can incorporate, let's say, foundation models in this as well. So I see very impressive progress there. It's not that you'll necessarily do all AI in-database, but for those use cases where you would like to compress the stack, I think there will be an expanding array of alternatives there. Where progress has been more halting is in the area of data transformation and streaming. ELT has been an area of progress, made possible in the cloud because, of course, storage is so cheap. You do the transformation in-database, and that eliminates a whole layer of the stack, that whole staging server. So that's been impressive, that's been going on, and that's become basically a standard part of the modern data stack in the cloud. On the other hand, the continued popularity of tools like Fivetran for extraction and load and dbt for transformation.
That points to the fact that there still is a stack of tools here and some needed integrations. And I'm looking forward to seeing how they become more seamlessly integrated into these platforms. Finally, I would say where we've had the least progress, I think, is with streaming and building and managing data pipelines. I do think I'm going to hold out a ray of hope, which we'll go into in the predictions, because I think generative AI could make a contribution here in terms of simplifying and integrating data pipeline generation and data pipeline management. Great, good tease there. Thank you, Tony, good stuff. All right, moving right along. Carl Olofson, you said that SQL is back, and you're showing a sea of green in your evidence column. I gotta ask you, was SQL ever gone? Please explain your 2023 prediction and the proof points that led you to that direct-hit evaluation. Well, okay, so to begin with, about two years ago the CEO of MongoDB declared that SQL is dead, nobody's teaching it in school anymore, it's going away, nobody cares. And then, of course, a very short time after that, MongoDB came out with its own SQL query mechanism for documents on Atlas. That immediately tells you that there's a reason for that pivot. In fact, in addition to that, Couchbase, their competitor in the JSON document space, is offering Capella Columnar, a column-based SQL analytics engine that accompanies Couchbase on the Capella cloud platform. And Redis has been supporting SQL for the past couple of years. Databricks was once militantly Spark-based. In fact, if you had conversations with Databricks three or four years ago, they would have said nobody cares about SQL anymore, everybody's moving to Spark. Well, now they have their own SQL capability called Databricks SQL, and a SQL Statement API that they recently announced, so that you can invoke SQL in the form of SELECT queries directly through this API.
Certainly suggesting a shift in emphasis. The most popular DBMS engines, according to DB-Engines, which is an online ranking site, are Oracle, MySQL, Microsoft SQL Server, and PostgreSQL, all of which are, of course, based on SQL. And Oracle has greatly expanded its footprint in the SQL space, well beyond Oracle Database, making big investments in MySQL with HeatWave on OCI and also on AWS, and also with their newly announced native OCI service for open source PostgreSQL. So in all those ways, they're increasing their footprint and adding to their investment because they see growing opportunity in the SQL space, and in order to take advantage of that opportunity, they need to move beyond the boundaries of their flagship Oracle Database. There isn't time to go into all the other ways in which SQL is rising to the top of the tree, but it was the case that people were saying, oh, that's old-fashioned, we don't need SQL anymore, we do these other things. And actually, when you look at it on the transaction processing side, there was a distinctly negative attitude towards SQL amongst application developers. Now, we do know there's a strong preference among application developers for document products like MongoDB, and that's going to continue. And by saying SQL is back, I don't mean it's washing away everything else. My personal belief is that going forward, we're looking at what we might call a multi-model future, where databases may support more than one format, both internally and externally, and more than one method for accessing and organizing data. And in fact, with the advent of AI and generative AI, in an environment where people want to see data put together in different forms, there'll be an even stronger push to have databases that support multiple different formats for data.
But I believe that through all of that, SQL will remain the primary way of doing business data analysis, for this very simple reason: SQL is by far the most powerful mechanism for doing analysis of data without having to anticipate in advance, in the structure of the database, what kinds of questions are going to be asked. Well, nice call, Carl. SQL remains the killer app of big data, as Amr Awadallah said years ago. All right, David Menninger, you predicted that the definition of data is expanding, citing metric stores, feature stores, model management, and data sharing as examples of what we could expect in 2023, and you show a mostly green level of accuracy for that prediction. Could you please elaborate? Sure. So, I mean, clearly the definition of data is expanding, and maybe we should give ourselves all a red X because we didn't capitalize on what was happening around GenAI, but GenAI has certainly expanded the definition of data. But specifically with respect to the four items I mentioned, GenAI has also brought a greater focus on AI and the processes around AI. So we see much more interest in feature stores. We see much more interest in model management and in trying to bring together managing LLMs and other types of models as well. On the data sharing front, we've got Databricks and Snowflake battling over establishing standards on how data should be shared, tying into some of Sanjeev's points earlier about data products. So I think those are all, I would give those all green ratings, as you've displayed there. On the metric store front, there are a couple of vendors who are focused on metric stores specifically, but analytics and metrics, broadly speaking, have gotten less attention in the whole governance process. I'd like to see more of Sanjeev's predictions and desires for the market to expand the catalogs and to incorporate these things more fully.
And then I think I would have given the metric store a green rating as well, but I don't think there's as much focus on that as there ought to be right now. So I'll keep my comments a little briefer and we can get to the 2024 predictions. Thank you for that, David. Appreciate it. You keep an eye on the clock. All right, last but not least for the 2023 look back, we have Doug Henschen, with the forecast last year that BI, analytics, reporting, and dashboarding would be commoditized and embedding and automation would ascend. You've got some examples here, and this looks like a direct hit. Please elaborate, Doug. Yeah, I've been talking about embedded BI analytics being on the rise for a few years, I've written a lot of reports about it, and the trend continued in 2023. So it's green. It's about embedding these insights at decision points, not forcing people to go off to separate reports and dashboards and then come back to their work and then make their decision. To do that, we saw more SDKs, more granular APIs to let developers bring those insights into apps, more GitHub integration with CI/CD capabilities, more low-code, no-code development options. We also saw more workflow from some of these BI analytics vendors and use of event architecture to automate. So use those insights not to trigger an alert that drives you off to a report or a dashboard, but to trigger an action in an application. And of course we saw the enterprise apps vendors like Oracle, SAP, Salesforce, and Workday all increasingly embedding insights at decision points within their enterprise applications. And then the late 2023 announcements I alluded to; in my book, we didn't really miss the GenAI thing, because I think Microsoft was kind of premature in its GenAI announcements. Precious few actual GenAI capabilities for the enterprise are generally available. Most everything is still in private preview.
Microsoft Copilot in Teams, with natural language query of Power BI exposed through Teams, finally hit public preview in November 2023. Tableau announced Pulse. This is more business-user-focused: insights embedded in things like Slack and email and, later, probably this year, more Salesforce apps. Amazon Q, which is really inspired by QuickSight Q, is now bringing that natural language query into many places within the Amazon ecosystem; it's also a preview at this point. So lots coming that will further advance this trend in 2024. Great, some really good examples there, thank you. All right, guys, appreciate you looking back on 2023, but let's get to the heart of our call today and turn our attention to the 2024 predictions. We're going to keep the same order. The designated analyst will present his prediction, and then we'll have time for one or two other analysts to chime in on that forecast. Here's a table showing all of the predictions for 2024. 100% of them include AI, but they span new data platforms, governance, metadata, database, skills gaps, and more. So let's really get into it. Sanjeev, you first, please. You've got the rise of the intelligent data platform. We've been talking about the next data platform beyond the so-called modern data platforms of Snowflake, Databricks, Google, AWS, Microsoft. You could probably include Oracle in that mix as the database king. You've got that plus a call on governance and open source LLMs going after proprietary foundation models, lots to cover. So please take it from there and elaborate. So thank you for that, Dave. I have so many predictions, I'm going to keep it short. I published a blog recently with 10 trends, and all of them have to do with AI. The number one prediction on my list is the intelligent data platform. What I'm saying here is that, in fact, a lot of our predictions last year were on the data stack.
The modern data stack, whatever you want to call it. So my prediction for this year is that we are bringing AI into the mix. We are not taking data out to a separate data stack just to serve AI use cases; that would be too much data movement, which brings all sorts of issues with it. But how do I bring AI into my existing data stack? That's what I'm calling an intelligent data platform. Essentially what we are doing here is that we've got an infrastructure layer which is cross-cloud, or supercloud, as you call it. And then we've got a unified storage layer. We've talked quite a bit about it, and we may even talk more about a unified storage layer. We've already separated out storage and compute. Now we are also separating out analytical engines. We have a Spark layer, we could have Pandas, we could even have DuckDB, SQL of course. Into that layer of analytics we are now bringing algorithms. These foundation models could be either open source or proprietary. So I could have OpenAI or Anthropic's Claude, or I could have any of the models from Hugging Face. So we are bringing the model catalog on top of the data stack. We are also adding vector search along with that. The idea is that we expose this analytical layer through an API or an SDK layer, so that on top we can still have data products. The data products don't go away, nor do the BI dashboards, which are actually data products. But now we are also adding more than just data artifacts. We could even have AI agents, which are going to be, in my opinion, a very big move in 2024. These AI agents will work on top of LLMs, but instead of just summarizing documents or translating them, they will kick off some sort of orchestration of a task. So AI agents, chatbots, all of that together is what I'm calling an intelligent data platform. Along with bringing data and AI together, we are also going to converge the governance of each.
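As an editorial aside, the vector-search layer Sanjeev places alongside the analytical engines can be illustrated with a minimal, self-contained sketch: brute-force cosine similarity over stored embeddings. In a real platform the embeddings would come from a foundation model; the three-dimensional vectors and document names below are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class VectorIndex:
    """Toy vector store: brute-force nearest-neighbor search."""
    def __init__(self):
        self.items = []  # (doc_id, embedding) pairs

    def add(self, doc_id, embedding):
        self.items.append((doc_id, embedding))

    def search(self, query, k=1):
        ranked = sorted(self.items, key=lambda it: cosine(query, it[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

index = VectorIndex()
index.add("refund-policy", [0.9, 0.1, 0.0])
index.add("pricing-faq",  [0.1, 0.9, 0.2])
index.search([0.8, 0.2, 0.1], k=1)  # -> ["refund-policy"]
```

A RAG pipeline of the kind Sanjeev mentions would take the top-k documents returned here and feed them to an LLM as context; production systems swap the brute-force scan for an approximate nearest-neighbor index.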
So we are now going to have AI governance, which will build upon data governance. In data governance, we had this ability to see what artifacts I have: technical metadata, business metadata, operational metadata. Now I should also be able to see what models are in use in my company. Are these models certified for use? What business cases are associated with those models? Because, in fact, we just talked about multi-model, so we've got many use cases. So this is the prediction that I'm making, and I see a lot of vendors moving in this direction, a lot of hyperscalers, Databricks. I'm sure we'll see stuff coming out of Snowflake, and then Oracle, Cloudera, Teradata; in my opinion, they all start converging into this integrated stack that I'm calling an intelligent data platform. I'll give you a quick data point on your comment about open source going after proprietary foundation models. In the latest ETR survey of 1,700 IT decision makers, Llama 2 had approximately 17% more installation citations than Anthropic, and the other interesting data point was that about 30% of those were on-prem. So topic for another day, but Doug Henschen and Dave Menninger, you got a quick comment on Sanjeev's predictions? I'd start by saying very ambitious and visionary, and I think the marketplace is a long way from seeing the sort of sophistication you're talking about; maybe the top 1% or 2% of companies that are sophisticated could do that sort of thing, and even vendors are struggling with these things. We're starting to see some database vendors build their own models, their own generative models, but we're really in the stone knives and bearskins stage of GenAI. And I've talked to a lot of CXOs, and very few of them are even aware of the Collibras, the Atlans, Microsoft Purview, AWS DataZone, Google Dataplex. I think analysts shouldn't get too far ahead of where the market really is and the appetite for these things.
We're just starting to get to the early stages of previews. GA is still a long way off, and actual usage further still. I think that OpenAI stat you had, Dave, indicated a lot of tire kicking and experimentation, but not a lot of real production use. I was gonna make a similar comment, but I wanna get to the heart of why I think the consolidation into a single platform is gonna lag a little bit, and that has to do with the skill sets that are involved in the different activities. Clearly we have to have the analytical processing in the platform, but the tooling around it, I believe, will continue to be separate. So maybe I'm nitpicking a little bit. And the other point: I had similar data to you, Dave, in that we see the open source models being used and adopted at similar levels to the commercial models. So in aggregate, not picking one versus another, the OSS models are being used as much as the commercial models. Great, thanks, guys. Okay, Tony Baer, up next. We're showing your prediction that GenAI will simplify database design, deployment, and operations. How so? Okay, thanks, Dave. In essence, what I'm talking about here is kind of double-clicking on what Sanjeev said. It's a less ambitious view, looking not so much at whether enterprises are going to buy GenAI tools; it's more about seeing generative AI, and machine learning for that matter, further working their way into how we operate databases. So it's not necessarily gonna be an extra SKU or anything like that. It's just gonna be more a matter of some invisible automation that's gonna happen underneath. And so I'll just give some examples of what I kind of expect to see.
And I think one reason why I'm fairly bullish in the near term is that, for the most part, these are fairly incremental improvements, because we've been working machine learning into the operation and design of databases for years, for instance, for deciphering data structures. So I see this as being kind of an incremental improvement. For database design, I think the biggest bang for the buck is gonna be with whatever deals with the content of the data. And this is where we can take advantage of, for instance, the document entity extraction capabilities of foundation models. Again, we're not trying to ask enterprises to go full bore into adopting generative AI, but I could see some of this automation creeping inside the database design tool. So for instance, I could imagine you put in some requirements document, and some developer or constituent, the folks who are collaborating, are putting in annotations and comments on the side; the large language model could essentially do some entity extraction and, based on requirements and requests, could start to suggest some very rudimentary data modeling. I'm not talking about lights-out here, but: here's a start, here's a rough data model, here's some rough entity relationship (ER) diagrams, we'll start to generate a schema. I could also say, and this one is, I think, relatively a no-brainer, that this could be used for synthetic data generation, which we use for a lot of testing. In other words, based on the characteristics of the actual data, have a language model look at that data and generate synthetic data based on the characteristics of the data that's already there. That's, relatively speaking, a no-brainer.
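Tony's synthetic-data idea, profile the real data's characteristics and then generate fake rows that match, can be sketched without any language model at all; an LLM-based version would replace the simple statistical profile below with richer inference. In this toy sketch (the column names and sample rows are hypothetical), numeric columns are profiled by mean and standard deviation, and categorical columns by their value set:

```python
import random
import statistics

def profile(rows):
    """Capture simple per-column characteristics of real tabular data."""
    prof = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            prof[col] = ("numeric", statistics.mean(values), statistics.stdev(values))
        else:
            prof[col] = ("categorical", sorted(set(values)), None)
    return prof

def synthesize(prof, n, seed=0):
    """Generate n synthetic rows matching the profiled characteristics."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        row = {}
        for col, (kind, a, b) in prof.items():
            # Numeric: sample a Gaussian; categorical: pick an observed value.
            row[col] = rng.gauss(a, b) if kind == "numeric" else rng.choice(a)
        out.append(row)
    return out

real = [
    {"age": 34, "plan": "pro"},
    {"age": 29, "plan": "free"},
    {"age": 41, "plan": "pro"},
]
fake = synthesize(profile(real), n=100)
```

The synthetic rows never repeat actual records, which is exactly why this pattern is useful for testing against realistic but non-sensitive data.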
Further down the line, and I think this gets more into Sanjeev's AI agents, and I don't necessarily expect this to be a done deal in 2024, I could start to see the first steps towards taking the code generation capabilities of generative AI to help create some of those data transformation pipelines. Now, we're seeing the first steps of this today with SQL code generation. It's not a concept leap, it's more of a technology leap, which is going to take more time, because designing the orchestrations and optimizing how we use compute for all of this, those are going to be details that take a while to work out. So I don't expect that to mature in 2024, but I expect to see some first steps there. In a related sense, I could see generative AI being applied towards the governance aspect of database management, and a good example here I'd like to point to is Atlan, which is a data catalog provider. There are many different types of data catalogs; this one focuses on DataOps. They start with a common natural language search function, which is kind of akin to the natural language or conversational query functions that you're starting to see in the likes of Amazon Q, Databricks' LakehouseIQ, and so forth, but this is being applied towards search of metadata. It can auto-discover database metadata, and this also builds incrementally on what we've had with machine learning all these years, which can deduce things like table and column names, schema specifications, and data lineage, and then, and this is the generative part, generate the documentation in plain English. Atlan is already starting to do that. So this is the type of thing I expect to see more of this year. Great, thank you, Tony. Carl, you've got a comment on Tony's prediction. Yeah, I just think it's a great prediction.
And I think it actually dovetails pretty well with what Sanjeev was saying, because the whole idea behind it is that we can take what is inherently very complex, which is enterprise data, and we can render it into a simpler form. The data itself doesn't actually get simpler. The physical database doesn't get simpler, but our interaction with the data gets simpler because it's being mediated by this generative AI capability, which I think is what Tony was talking about. But it also makes it possible to do things like create the intelligent data platform, because building the intelligent data platform requires all of these little steps, all these incredibly precise actions that you need to take to make sure that the data is synchronized and that it makes sense together and all this kind of stuff. Human beings usually fall down on these things. The projects start and stall and then get abandoned because it's too hard. But generative AI doesn't get tired. It doesn't get bored. It just does its job. So to look at it that way, Tony's idea of the simplification of the data environment, not necessarily the database, but the whole data environment and the way you build and maintain data, dovetails nicely with what Sanjeev was saying. All right, Carl, let's stick with you and take a look at your prediction that GenAI and other developments are going to catalyze a rationalization of what I infer to be data silos, based on your prediction, to enable combinatorial data use cases, which will ultimately create governance challenges. So I would say this may seem obvious to some people, but are you predicting that organizations are going to be able to succeed in 2024? Or will this governance challenge create insurmountable barriers to outcomes this year? Please explain.
Okay, so this prediction is kind of a cousin to Sanjeev's prediction, in that he was talking about the intelligent data platform. I'm saying that right now, the data organization of most enterprises is a total mess, okay? It's not well-coordinated. Data is created for individual applications, and then people take data from the databases that were created for those individual applications, combine it together for specific analytic problems, and put it in data warehouses and that kind of thing, or they drop it in a data lake, but they're not really creating a coherent environment for the data overall. So it continues to be a mess. When generative AI comes into the picture, it's going to start combining data in ways that it was never designed to be combined. You're either going to get irrational combinations, because it's just going to make assumptions that are not valid, or it's going to expose things that we didn't expect to have exposed, because it's picking data from different places that were previously not connected at all. So we need to think about that. And we need to think particularly about how this impacts legacy data. So to answer your question, this is not a prediction that says this will all be done in 2024. No, that's ridiculous. This could be a 10-year process of trying to rationalize data, but it has to be done. You'll never get the full value of generative AI for data capture, combination, and presentation unless you also deal with these issues and come up with decisions that say, for instance, we can't put this data together with that data, or this data can't be exposed because it's confidential. We didn't mark it confidential before because we didn't think it would ever be combined with something else that exposes its confidential nature. But now we have to; there's a lot of work there. So, I mean, Sanjeev was talking about a structure, which is the intelligent data platform. I'm talking about human effort, which is going to be messier.
Even with generative AI helping, it's still going to be a big effort.

Gotcha, thank you for that. Tony and Doug, I believe you've got comments on Carl's prediction.

Okay, I'll dive in first. I guess my prediction was kind of the bright side of this, but Carl very much points to the fact that this also gets us into deeper, more complex waters. It sort of echoes what happened with the modern data stack, which was supposed to bring all this stuff together but added lots of complexity. So yes, as we start putting together data that wasn't necessarily put together before, because it was in different formats, different systems, different contexts, we will start dealing with some Frankenstein-type table combinations. And I think what's going to be really important here is data lineage, so that we understand the provenance of all this data and understand what happens to it. As Carl's saying, when we start getting into this recombination stuff, we're getting into some deeper waters, and it's going to take time for our ability to cope with this to catch up.

Great, thank you.

Yeah, I would totally agree. I think the reality for enterprises is heterogeneity. We have seen efforts toward this proverbial intelligent data platform. I think Microsoft Fabric is this idea that goes across their collaborative products, across their enterprise apps like Dynamics, across their productivity apps. But most organizations, even mid-sized organizations, are heterogeneous, and organizations aren't necessarily interested in putting all their eggs in one basket. So some of the chaos that Carl talked about is spot on, and it's a tough challenge.
It is a tough challenge, and I would just point out, in furtherance of what Doug was saying, that the heterogeneity is important. No data platform that arises out of this can be uniform; it can't be just one thing. It has to be more like a data environment, and maybe even a data environment that tolerates the idea that in different contexts, different sets of facts are true. In other words, you may have a different temporal context. You may ask what was true last Tuesday, and it'll tell you what was true last Tuesday, which is different from today. So all of that needs to be part of the system.

Giving new meaning to alternative facts. Okay, up next, Dave Menninger, with the prediction that despite all the hype around Gen AI, it won't replace traditional AI in the most demanding use cases. And you're also predicting a continued AI skills gap. Again, this feels like a lock, but add some color and some data points that increase the degree of difficulty for us, please.

Well, the reason I think it's important to discuss this is that generative AI is sucking all the air out of the room. So I think people need to be aware that, yes, there's a lot of attention, it's doing a lot of great things, it's making various software products easier to use, and it's making software easier to code. But in the most demanding use cases, it really isn't there. That's not what it's designed to do, at least not yet. For instance, we have some research showing that in areas like document summarization or natural language assistance, generative AI is clearly more likely to be valuable; it's going to have a greater impact. In our research, one and a half times as likely to have a greater impact there.
However, when you go to the other end of the spectrum, and this particular set of research was around banking, if you look at things like credit risk, fraud detection, algorithmic trading, even customer acquisition, predictive AI models, traditional AI, were twice as likely to have an impact over the next two years as generative AI. So there are these areas where it's important to recognize we need more advanced skills and capabilities to develop those types of models. Data scientists have to understand a lot of things. They have to understand the biases in the data. They have to understand the training processes. They have to worry about overfitting and poor sampling. And you also have to understand that models are never 100% accurate; you need to evaluate the impact of false positives and false negatives. So while generative AI is making tooling better, including using generative AI to create models, I'm not trusting my personalized medical care to anything that wasn't developed by a Stanford-trained biomedical PhD, right? I want the real knowledge in there, and I want people developing those models who have the skills. And our research shows the skills don't exist today. Only one quarter of organizations report they have the skills they need to develop AI models, and two thirds report that they can't get those skills; they're the most difficult skills for them to find and retain in the market. So I want to make people aware: don't put all your eggs in the generative AI basket. It's absolutely valuable, but don't put all your eggs in one basket.

Dave, I love how you always bring the data to back up your opinions. Sanjeev and Carl, do you have anything to add here?

So, since I've taken on this role of being the insufferable optimist, or maybe I'm just insufferable, what I want to say is that in 1994, if somebody had told me, don't put all your eggs in one basket called the World Wide Web, because it's just a toy...
All it does is show static data. I would have never imagined that 10 years later my entire tax filing would be online, I would never have to go to a travel agent, and I could buy stuff. So I agree that Gen AI is early in its journey, but I have faith that in a few years this is going to be the new norm. If I look at the history, before the World Wide Web took over we already had markup languages on Unix; I used to use SGML. But HTML made things very easy. TCP/IP had been in existence for 25 or 30 years, and then HTTP came along and it became pervasive. So I feel Gen AI is at the early stage the World Wide Web was at in 1994, and in another 10 years it could be a different story.

Carl, you had a comment?

Yeah, I just wanted to reinforce what Dave said and offer this observation. If you recall a movie called The Right Stuff, you'll remember that in that movie the scientists believed that the astronauts would just be passengers and that the computer equipment would run the whole thing. And if they had gotten their way, John Glenn might not have survived, because he did go manual in order to adjust the angle of entry for his capsule when something else went wrong. And of course, in Apollo 13, because there were three highly trained pilots, they were able to save their own lives by making adjustments that involved a little more than just calculations; it involved seat-of-the-pants rough estimates and, based on knowledge, doing the right thing. So my point is this: generative AI does indeed do a lot of work for us so we don't need to do it, and it relieves non-technical people of responsibility so they can do their jobs without having to learn a lot of stuff.
But at the same time, I believe, along with Dave, that it will ultimately open up more opportunity for highly trained individuals to guide it in the right directions, to help set up the environments so that everything works right, and to avoid problems. I'm not even talking about hallucinations, which I think are relatively easy to avoid, but more problematic things that involve misinterpretation and improper combinations of data and that kind of thing. So I totally agree that this is actually an opportunity for more, not less, skills development.

Nice throwback references, Carl, thank you. All right, we've got about five minutes left. The last prediction comes from Doug Henschen, who's predicting that Gen AI will have a material impact on how organizations approach BI and predictive analytics. Doug, this is a prediction with big implications for data analysts, data pros working in the pipeline, Tableau jocks, and end business users. Very exciting. Are you saying we'll see this transformation before the end of the year?

No, no, I'm saying this is continuing the trend I talked about in 2023: increasing use of embedding, increasing use of insights being exposed where people are doing their work. The most adopted and most used feature in the augmented analytics trend that started five to eight years ago was natural language query. Gen AI in 2023 was all about announcements; in late 2023 we started to see public previews; in 2024 we're going to start to see GAs. And the most material impact on BI and analytics is going to be putting natural language query on steroids: more accuracy, more verbose, more interpretive. With the examples I gave, Microsoft Copilot, Tableau's Einstein Copilot, Amazon Q from AWS, we're going to start to have this query capability anywhere you need it, in apps, in places other than just a BI platform.
I think the caution is that there's still going to be a really important role for analysts, and that's going to be curating the data, curating the questions, curating the prompts. Mainstream business users don't necessarily know what to ask, so when you see natural language query interfaces, often there are suggested questions, sort of prompts, to get them started. Analysts are also going to be curating business metrics. Maybe we'll have Gen AI building the dashboards and reports, for those that are still needed as a sort of system of record. But I think the analyst class is going to be focused more on deciding the key metrics that matter, determining whether this is good or bad for the business, and guiding users in their natural language query interactions on the decisions they need to make. Business users will need those prompts and will need guidance as they start to use Gen AI instead of toggling between dashboards, reports, and their transactional interfaces. You have more of a change in role than an elimination of that role.

Dave Menninger, I know you've got to run, and Tony, you've got some quick comments here, and then we'll wrap.

Yeah, I'll make a quick comment. In three quarters of organizations, less than half of the workforce has access to analytics. This is the solution for that problem. Prior to Gen AI, we saw that about a quarter of organizations were using natural language processing and 80% were using dashboards and reports. We need to see the inverse of that, so that more of the organization can access analytics.

All right, Tony, I'll give you the final word here.

Yeah, a quick point on what Doug said, which is that where the importance is going to be is curating data, questions, and prompts. It's knowing how to ask the right questions of the data, because the technology will basically be there to get it.
So I think what Doug is saying right there is that it's all going to come down to people knowing how to ask the right questions.

Guys, can I just add one quick thing on that? I want to continue the dialogue I had earlier about how technology advances. To me, prompt engineering today is highly complex and it just keeps changing. But this reminds me, again, of the time when internet search first came out. Remember, we had to put in double quotes and add hints here and there. Now we don't even think about it. So, same thing: we are at the early stages of prompt engineering.

Yeah, no doubt. Guys, I'm sorry we've got to go, but I want to share my personal gratitude. It's amazing to me; this has been our third year through. It kind of started back at AWS re:Invent in 2021, and I'm looking forward to more collaboration in the future. So thank you.

Thank you. Thank you, Dave.

All right, I also want to thank Alex Myerson (yes, happy new year), who's on production and manages the podcast, and Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters, and Rob Hof is our editor-in-chief at siliconangle.com. Remember, all these episodes are available as podcasts wherever you listen. I don't know if you know this; I'm really proud to say we hit almost 650,000 downloads of Breaking Analysis this past year. Wow. All you've got to do is search "Breaking Analysis podcast." I publish each week on siliconangle.com and thecuberesearch.com, and you can email me at david.vellante@siliconangle.com, DM me, or comment on our LinkedIn posts. And check out etr.ai; they've got great survey data and are constantly updating the dataset. This is Dave Vellante for theCUBE Research Insights powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.