Hello, this is Dave Vellante with theCUBE. One of the most gratifying aspects of my role as a host of theCUBE is that I get to cover a wide range of topics, and quite often we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics we cover throughout the year. One of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. It resulted in a very popular session last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back. With me are five outstanding market analysts: Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and principal analyst at Constellation Research. Now, what is it that we're calling you guys, the data pack? Like the Rat Pack? No, no, that's not it. It's the data crowd, the data crowd. And the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. Good to be here. Thank you. Nice to be here. Before we get into the 2023 predictions, we thought it'd be good to look back at how we did in 2022 and give a transparent assessment of those predictions. So let's get right into it. We're going to bring these up here. The predictions from 2022 are color coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system, okay? But as always, we want to be open so you can decide for yourselves.
So we're going to ask each analyst to review their 2022 prediction, explain their rating, and what evidence led them to their conclusion. So Sanjeev, please kick it off. Your prediction was that data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double-click on that. Yeah, absolutely. Thank you so much, Dave, for having us on this show today. We graded ourselves, and I could have very easily made my prediction from last year green, but let me explain why I left it as yellow. I fully believe that data governance had a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog, called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks go GA. So overall, I saw tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space, and this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference; Data.world, Alation, Informatica, and a bunch of other companies all added tremendous new capabilities. So it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now; the funding, VC, and IPO markets are really, really down. But other than that, data governance had a banner year in 2022. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like the fact that you included in your prediction some things that were binary, so you dinged yourself there. So good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. As you see here, you've given yourself a bright green thumbs up. Okay, let's hear why you feel that was the case.
What do you mean by reality check? Thanks, Dave, for having us back again. This is something I just wrote about and tried to get away from, and this topic just won't go away. I did speak with a number of folks, early adopters and non-adopters, during the year, and what I found pretty much validated what I was expecting: data mesh has now become a front-burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason I was doing a Google search on data mesh, and I happened to trip across this ridiculous article, I won't say where because it doesn't deserve any publicity, about the eight best data mesh software companies of 2022. One of my predictions was that you'd see data mesh washing. So I just hopped on that, maybe three sentences, wrote it in about a couple of minutes, saying this is hogwash essentially. And then I left for re:Invent, and the next night, when I got into my Vegas hotel room and clicked on my computer, I saw 15,000 hits on that post, which is the most hits of any single post I've put up all year. And the responses were wildly pro and con. So it pretty much validates my expectation that data mesh really did hit a lot more scrutiny over this past year. Yeah, thank you for that. I mean, I remember that article; I remember rolling my eyes when I saw it. And then recently I talked to Walmart, and they actually invoked Martin Fowler and said they're working through their data mesh. So it takes a lot of thought, and as we've talked about, it's really as much an organizational construct; you're not buying data mesh, to your point. Okay, thank you, Tony. Carl Olofson, here we go.
You've graded yourself a yellow on the prediction that graph databases take off. Please elaborate. Yeah, sure. So I realized, in looking at the prediction, that it seemed to imply that graph databases were going to be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part, and I should have set it in the right context. It's really over a three-to-five-year period that graph databases will become significant, because they still need accepted methodologies that can be applied in a business context, as well as proper tools, in order for people to be able to use them seriously. But I stand by the idea that the category is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And we're also seeing interesting developments in terms of things like AWS with Neptune, and Oracle providing graph support in Oracle Database this past year. Those things are, as I said, growing gradually. There are other companies, like TigerGraph, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people, without a whole ton of technical training, can actually apply the technology to solve business problems. Yeah, thank you for that. I mean, these specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for specialized databases. I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have, and maybe out to the edge. Part of it is that graph isn't all that specialized; you might think you can apply graphs to a great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. And so it's going to be a process.
All right, Dave Menninger: streaming data permeates the landscape. You gave yourself a yellow, why? Well, I couldn't think of an appropriate combination of yellow and green; maybe I should have used chartreuse. But I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now, often based on Kafka. If you look at Confluent, their revenues have grown, and continue to grow, at more than 50% a year, and they're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. I do expect that we'll continue to see streaming become more and more mainstream, and perhaps in a five-year timeframe we will deal with data first as streaming and then at rest. The worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. I like the tough grading system, and it's not trivial. When you talk to practitioners doing this stuff, there are still some complications in the data pipeline. But I think you're right; it probably was a yellow plus. Doug Henschen: data lakehouses will emerge as dominant. I mean, when you talk to practitioners about lakehouses, they all use that term. They certainly use the term data lake, but now they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence? Well, I think I was accurate.
I spoke about it specifically as something that vendors would be pursuing, and we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery; Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this, in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce, and IBM all advocate this idea of a single platform for all of your data. Now, the trend was also supported by the big embrace of Apache Iceberg we saw in 2022. That's a structured table format used with these lakehouse platforms. It's open, so it ensures portability, and the structured table helps with warehouse-side performance. Among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, and IBM all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing and advocating. For users it's still very cutting edge. I'd say the leading-edge 5% of companies have really embraced the lakehouse, and I think we're now seeing the fast followers, the next 20 to 25% of firms, embracing this idea and embracing lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer making the announcement about Iceberg, and he asked for a show of hands: any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So the vendors are ahead of the curve; they're pushing this trend, and we're now seeing a little bit more mainstream uptake. Good job. I was there; it was you, me, and I think two other hands that were up, which was kind of humorous. All right, well, I like the fact that we had some yellow and some green. You know, when you think about these things, there's the prediction itself, did it come true or not?
There are the sub-predictions that you guys make, and of course the degree of difficulty. So thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? So my prediction is that the metadata space is currently a mess and it needs to get unified. There are too many use cases of metadata being addressed by disparate systems. For example, data quality has become really big in the last couple of years; so has data observability. And in the whole catalog space, people don't like to use the words data catalog anymore, because a data catalog sounds like a museum of metadata, if you will, that you go and admire. So what I'm saying is that in 2023, we will see metadata become the driving force behind things like DataOps, things like orchestration of tasks using metadata, not rules. Not saying if this fails, then do this; if this succeeds, go do that. But getting to the metadata level and then making a decision as to what to orchestrate, what to automate, how to do data quality checks and data observability. So this space is starting to gel, and I see there will be more maturation in the metadata space. Even security and privacy, some of these topics which are handled separately today, and I'm talking about data security and data privacy, not infrastructure security, these also need to merge into a unified metadata management piece, with a knowledge graph and semantic layer on top so you can do analytics on it. So it's no longer something that sits on the side, limited in its scope. It is actually the very engine, the very glue, that is going to connect data producers and consumers. Great, thank you for that. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? Well, I agree with many aspects of what he says.
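Sanjeev's contrast between rule-driven and metadata-driven orchestration can be made concrete with a rough sketch. The class, fields, and thresholds below are hypothetical; a real implementation would pull these values from a catalog or observability tool rather than hard-coding them:

```python
from dataclasses import dataclass

@dataclass
class DatasetMetadata:
    """Hypothetical metadata a catalog or observability tool might surface."""
    name: str
    quality_score: float   # 0.0-1.0, e.g. from a data quality check
    freshness_hours: int   # hours since the last successful load

def next_action(md: DatasetMetadata) -> str:
    """Decide what to orchestrate by inspecting metadata, not fixed
    if-this-task-fails-then-do-that rules."""
    if md.quality_score < 0.8:
        return "quarantine_and_alert"
    if md.freshness_hours > 24:
        return "trigger_refresh"
    return "publish_to_consumers"

print(next_action(DatasetMetadata("orders", 0.95, 30)))  # trigger_refresh
```

The decision is driven by the state of the data itself, which is the shift Sanjeev is describing: orchestration reacting to metadata rather than to a hard-wired task graph.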
I think there's a huge opportunity for consolidation and streamlining of these aspects of governance. Last year, Sanjeev, you said something like we'll see more people using catalogs than BI, and I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's behind-the-scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen waves of management technologies, some of the latest being data observability and customer data platforms, but they fail to sweep away all the earlier investments in data quality and master data management. So yes, I hope the latest tech offers glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, business leaders, including the CIO, only want to spend as much time, effort, money, and resources on these sorts of things as it takes to avoid getting breached, ending up in headlines, getting fired, or going to jail. So vendors, bring on the ML and AI smarts and the automation of these sorts of activities. So if I may say something, the reason why we have this dichotomy between data catalogs and the BI vendors is because data catalogs, in my opinion, are very soon not going to be standalone products. They're going to get embedded. So when you use a BI tool, you'll actually use a catalog to find what it is that you want, whether you're looking for data or you're looking for an existing dashboard. So the catalog becomes embedded into the BI tool. Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats on this topic? Well, I'm glad you asked, because I do. Now, data catalogs are interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them, they're valuable to organizations.
Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics, for just the reasons Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. Those are good things. So we expect to see those capabilities, whether embedded or separate, continue to permeate the market. And a lot of those catalogs are now driven by machine learning, so they're learning from the patterns of usage when people use the data. All right, okay, thank you guys. Let's move on to the next one. Tony Baer, let's bring up the predictions. You've got something in here about the modern data stack; we need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? I think in a way it's gotten almost too modern. I don't know about long in the tooth, but it is getting long. The modern data stack has traditionally been defined as: you have the data platform, which would be the operational database or the data warehouse, and in between you have all the tools that are necessary to get that data from the operational realm, or the streaming realm for that matter, into the data warehouse, or, as we may be seeing more and more, the data lakehouse. And I think what's important here is that we have seen a lot of progress, and this would be in the cloud, with the SaaS services. You see that especially in the modern data stack, where it's not just the MongoDBs or the Oracles or the Amazons that have their database platforms; you have the Informaticas and all the other players, and the Fivetrans, with their own SaaS services.
And within those SaaS services you get a certain degree of simplicity, which takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to, unfortunately, is what I would call lots of islands of simplicity, which means it leaves it to the customer to integrate, to put all that stuff together. It's a complex tool chain. So what we really need to think about here is that we have too many pieces. Going back to the discussion of catalogs, we have so many catalogs out there. Which one do we use? Because chances are most organizations do not rely on a single catalog at this point. So I'm calling on all the data providers, all the SaaS services, to literally get it together and make this modern data stack less of a stack, make it more of a blending, an end-to-end solution. And that can come in a number of different ways. Part of it is data platform providers adding services that are adjacent, and there are some very good examples of this; we've seen progress over the past year or so. For instance, MongoDB integrating search, which is a very common capability in applications developed on MongoDB, so MongoDB built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. AWS just did zero-ETL, which is a first step toward simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google BigQuery integrating streaming pipelines. And you're seeing a lot of movement in in-database machine learning as well. So there are some good moves in this direction, and I expect to see more of it this year. Part of it is the SaaS platforms adding functionality.
But more importantly, because you're never going to get your data team and your developers to standardize on the same tool, it's kind of like herding cats, in most organizations that is not going to happen, take a look at the most popular combinations of tools and start to come up with pre-built integrations and pre-built orchestrations, and offer some promotional pricing; in other words, two products, or two services, for the price of one and a half. I see a lot of potential for this, and to me, if the goal is to simplify things, this is the next logical step. I expect to see more of this this year. And in Oracle MySQL HeatWave you get another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? Well, I really like Tony's phrase, islands of simplicity; that really says what's going on here, which is that all these different vendors, when you ask how these stacks work, have their own stack vision, and one application group is going to use one while another application group uses another. And some people will say, like when you go to an Informatica conference and they tell you, we should be the center of your universe, well, you can't connect everything in your universe to Informatica, so you need to use other things. So the challenge is, how do we make those things work together? Tony has said, and I totally agree, that we're never going to get to the point where people standardize on one organizing system.
So the alternative is to have metadata that can be shared among those systems, and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy, but the motive for the vendors is that they can become more active, critical players in the enterprise, and of course the motive for the customer is that things will run better and more completely. So I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says, basically, where did it come from, who created it, what's its current state, what's the security level, and so on. The good news is the operational stuff can actually be captured automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources and streaming data, with data in different formats, the whole thing has grown. It's like what a customer of mine used to say: I understand your product can make my system run fast, but right now I just feel like I'm putting my problems on roller skates. I don't need to accelerate what's already not working. Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, the Hadoop movement, NoSQL was the big thing, and I remember Amr Awadallah saying to us on theCUBE that SQL is the killer app for big data. So your prediction here, let's bring it up, is SQL is back. Please elaborate. Yeah, so of course, some people would say it never left.
Actually, that's probably closer to the truth. But in the perception of the marketplace, there's been all this noise about alternative ways of storing and retrieving data, whether it's key-value stores or document databases and so forth. We got a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore, we're going to use Spark for everything. Except that only a handful of people know how to use Spark. Oh, well, that's a problem. And for ordinary, conventional business analytics, Spark is an over-engineered solution to the problem; SQL works just great. What's happened in the past couple of years, and what will continue to happen, is that SQL is insinuating itself into everything. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or, of course, Snowflake, which is loving this, because that is what they do, and their success certainly points to the success of SQL. Even MongoDB. We were all, I think, at the MongoDB conference where on one day we heard SQL is dead, they're not teaching SQL in schools anymore and that kind of thing, and then a couple of days later at the same conference they announced a new analytic capability based on SQL. But didn't you just say SQL is dead? So the reality is that SQL is better understood than most other methods of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. Even in systems that are very non-relational, such as graph and document databases, the query languages are being built or extended to resemble SQL, because SQL is something people understand. Now, you remember when we were in high school and you had to take debate class and you were forced to take one side and defend it? So I was at a Vertica conference one time, up on stage with Curt Monash.
And I had to take the NoSQL, the-world-is-changing, paradigm-shift side. So just to be controversial, I said to him, who really needs ACID compliance anyway? And of course his head exploded. But Tony Baer, what are your thoughts on all this? Well, my first thought is congratulations, Dave, on surviving being up on stage with Curt Monash. Amen. I definitely concur with Carl; we actually are seeing a SQL renaissance. And if there's any proof of the pudding here, I'd say lakehouse is the icing on the cake, as Doug had predicted last year. Now, for the record, I think Doug was about a year ahead of time in his prediction; this year is really the year I see the lakehouse ecosystem firming up. You saw the first shots last year. But on this point, data lakes will not go away. I'm actually on the home stretch of doing a market landscape on the lakehouse, and the lakehouse will not replace data lakes, in that there is still the need for those data scientists who know Python, who know Spark, to go in there and do their thing without all the restrictions or constraints of a pre-built or pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need for SQL. Maybe MongoDB was saying that they're not teaching SQL anymore; well, maybe we have an oversupply of SQL developers. I'm being facetious there. But there is a huge skills base in SQL, and analytics have been built on SQL. Then came the lakehouse. And why this really helps fuel the SQL revival is that the core need in the data lake, what brought on the lakehouse, was not so much SQL. It was the need for ACID. And what was the best way to do it? It was through a relational table structure.
So the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you can govern down to the column and row level, which you really could not do in a data lake or a file system. So while the lakehouse can be queried in that manner, you can go in there with Python or whatever, it's built on a relational table structure, and for those types of data lakes, it becomes the end state. You cannot bypass that table structure, as I learned the hard way during my research. So the bottom line, I'd say, is that the lakehouse proves we're starting to see the revenge of the SQL nerds. Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one is really thought-provoking and interesting. I mean, we're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is that the definition of data is expanding. What do you mean by that? So I think for too long we've thought about data as, I'll say, the facts that we collect, the readings off of devices and things like that. But data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why it has happened, and to project what might happen in the future. And my point is that these data derivatives need to be supported and managed just like the data itself needs to be managed. We can't treat them as entirely separate. Think about all the governance discussions we've had; think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack, right? So if these things are treated separately, it creates much more complexity.
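Dave's point that derivatives need the same management as base data can be illustrated with a small sketch. The class and field names here are invented, but the idea is that a metric carries owner, lineage, and security attributes exactly the way a governed dataset does:

```python
from dataclasses import dataclass

@dataclass
class Governed:
    """Governance attributes applied uniformly to data AND its derivatives."""
    owner: str
    source_lineage: list   # upstream datasets or metrics this depends on
    security_level: str

@dataclass
class Metric(Governed):
    """A derivative: a formula over data points, governed like data itself."""
    name: str = ""
    formula: str = ""

arpu = Metric(owner="finance",
              source_lineage=["revenue", "active_users"],
              security_level="internal",
              name="arpu",
              formula="revenue / active_users")
print(arpu.owner, arpu.formula)
```

Because the metric and its source datasets share one governance model, there is one set of moving parts, not two, which is exactly the simplification being argued for.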
I also think it creates a bit of a myopic view on the part of the IT organizations that are acquiring these technologies; they need to think more broadly. So, for instance, metrics. Metric stores are becoming a much more common part of the tooling of a data platform. Similarly, feature stores are gaining traction. Those are designed to promote reuse and consistency across AI and ML initiatives, the elements that are used in developing an AI or ML model. And let me go back to metrics and clarify what I mean by that: any type of formula involving the data points. I'm distinguishing metrics from the features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So just as you figure out how to calculate a metric, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other area where I see expansion is around external data. Organizations have for decades been focused on the data they generate within their own walls. We see more and more of these platforms acquiring data from, and publishing data to, external third-party sources, whether within some sort of a partner ecosystem or as a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses they're conducting. And the last great frontier, in my opinion, in this expanding world of data is driver-based planning. Very few of the major data platform providers offer these capabilities today. These are the types of things you would do in a spreadsheet, and we all know the issues associated with spreadsheets: they're hard to govern, they're error-prone.
And so if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward, and that's a very common thing to do, what the income from that property might look like. Income, expenses, we can plan and purchase things appropriately. So I think we need this broader purview, and I'm beginning to see some of those things happen. The evidence today, I would say, is more focused around the metric stores and the feature stores; we're starting to see vendors offer those capabilities, and we're starting to see the MLOps elements of managing the AI and ML models find their way closer to the data platforms as well. Very interesting. When I hear metrics, I think of KPIs; I think of data apps that orchestrate people, places, and things to optimize around a set of KPIs. It sounds like a metadata challenge. Somebody once predicted we'll have more metadata than data. Carl, what are your thoughts on this prediction? Yeah, I think that what Dave is describing as data derivatives is, in a way, another word for what I was calling operational metadata, which is data not about the data itself, but about how it's used, where it came from, what the rules governing it are, and that kind of thing. If you have a rich enough set of those things, then not only can you model how well your vacation property rental may do in terms of income, but also how well the application that's measuring it is doing for you. In other words, how many times have I used it? How much data have I used? And what is the relationship between the data I've used and the benefits I've derived from using it? We don't have ways of doing that.
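The rental-property example Dave describes is the essence of driver-based planning: a handful of driver assumptions produce a forward projection. A minimal sketch with made-up numbers (the function and its parameters are illustrative, not any vendor's API):

```python
def project_annual_income(units: int, monthly_rent: float,
                          occupancy_rate: float, annual_rent_growth: float,
                          years: int) -> list:
    """Project gross rental income per year from a few driver assumptions."""
    projections = []
    rent = monthly_rent
    for _ in range(years):
        projections.append(round(units * rent * 12 * occupancy_rate, 2))
        rent *= 1 + annual_rent_growth  # apply the rent-growth driver
    return projections

# 10 units at $1,500/month, 90% occupancy, 3% annual rent growth, 3 years
print(project_annual_income(10, 1500.0, 0.90, 0.03, 3))
# [162000.0, 166860.0, 171865.8]
```

Change one driver, say occupancy, and the whole projection updates, which is exactly the spreadsheet workflow Dave argues data platforms should absorb, with proper governance instead of error-prone cells.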
What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. Where did it come from? When was it created? When was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have so that we can track how it's used. And it also tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used, or it gets used but it gets altered in some way that actually misleads people. So we need the mechanisms to be able to do that. And I'd say that it's true that we need that stuff. I think that starting to expand is probably the right way to put it. It's gonna be expanding for some time. I think we're still a distance from having all that stuff really working together. Maybe we should say it's gestating. If I may... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. Absolutely. So whether we call it data mesh or not, I'm not getting into that conversation, but everything that I'm hearing in what Dave is saying, Carl, is that this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products. All this thing about vacation rentals and how is it doing? That data is coming from different sources. I'm packaging it into a data product. And to Carl's point, there's a whole set of operational metadata associated with it. The idea is for organizations to see things like developer productivity. How many releases am I doing of this? What data products are most popular?
I'm actually right now in the process of formulating this concept that, just like we had data catalogs, we are very soon going to be requiring a data products catalog, so I can discover these data products. I'm not just creating data products left, right and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire it and save costs. But this is a data product. Now there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just a formalization of all these aspects of a product. How do you use it? What is the SLA on it? What is the quality that I'm prescribing? So data products, in my opinion, shift the conversation to the consumers, or to the business people. Up to this point, Dave, when you're talking about data, all of data discovery and curation has been very data producer centric. So I think we'll see a shift more into the consumer space. Dave, can I just jump in there very quickly? What Sanjeev has been saying there, this is really central to what Zhamak is talking about. It's basically about treating data as a product, about the lifecycle management of data. Metadata is just elemental to that. And essentially one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. By the way, did everyone just notice how Sanjeev just snuck in another prediction there? Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums? And then I'll also comment, Tony, on your last year's prediction. You're really talking about how it's not something that you're going to buy from a vendor. It's very specific to an organization, their own data product. So touché on that one. Okay, last prediction. Let's bring him up, Doug Henschen. BI and analytics is headed to embedding. What does that mean?
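A data contract like the one Sanjeev describes, formalizing how a data product is used, its SLA, and its prescribed quality, might look something like this minimal sketch. The field names and thresholds are invented for illustration; there is no single standard shape for a data contract:

```python
# A hypothetical data contract, formalizing the aspects Sanjeev lists:
# usage, SLA, and quality expectations. Field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class DataContract:
    product_name: str
    owner: str
    freshness_sla_hours: int         # how stale the data is allowed to get
    min_completeness_pct: float      # prescribed quality threshold
    allowed_uses: list = field(default_factory=list)

    def meets_sla(self, hours_since_refresh, completeness_pct):
        """Check a delivery against the contract's SLA and quality terms."""
        return (hours_since_refresh <= self.freshness_sla_hours
                and completeness_pct >= self.min_completeness_pct)

contract = DataContract("vacation-rental-metrics", "analytics-team",
                        freshness_sla_hours=24, min_completeness_pct=99.0,
                        allowed_uses=["reporting", "forecasting"])
print(contract.meets_sla(hours_since_refresh=6, completeness_pct=99.5))  # True
```

The point of making the contract machine-readable is that a consumer, or a data products catalog, can check each delivery against it automatically rather than relying on tribal knowledge.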
Well, we all know that conventional BI, dashboarding, reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. It's always been that sort of 25% of employees really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision, or better still, using analytics as triggers for automation and workflows, not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So leading companies are pushing for next generation applications, part of this low code, no code movement we've seen, and they wanna build that decision support right into the app. So the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics, predictions, even recommendations built into these applications. And I think the progressive BI and analytics vendors are supporting this idea of driving insight to action, not necessarily requiring humans to interact with it if there's confidence. So we want prediction, we want embedding, we want automation. This low code, no code development movement is very important to bringing the analytics to where people are doing their work. We've got to move beyond what I call swivel chair integration between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert and then wait. So what are your thoughts on Doug's prediction? I'm in total agreement with Doug. So how did we get here? I'm going to say that collectively, as an industry, we made a mistake. We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake.
We were limited by the technology available at the time. Decades ago we had to separate these two systems so that the analytics didn't impact the operations, right? You don't want the analytics preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together, and organizations recognize the need to change. As Doug said, the majority of the workforce in the majority of organizations doesn't have access to analytics. That's wrong, we've got to change that. And one of the ways that's going to change is with embedded analytics. Two thirds of organizations recognize that embedded analytics are important, and it even ranks higher in importance than AI and ML in those organizations. So it's interesting, this is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not. As opposed to turning somebody loose in the wild with the data, they're given sort of a guided path to the data, and the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service, compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. And let me just break in with something here, a comment on what Dave said about what Doug said, which is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our Future of Intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build it. You have to write the applications to work that way; it just doesn't happen automatically. Developers have to write applications that reference analytic data and apply it while they're running, okay?
And that could involve simple things like complex queries against the live data, through something that I've been calling analytic transaction processing, or it could be through something more sophisticated that involves AI operations, as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. This is very much consistent with what I was talking about before, basically rethinking the modern data stack and going into more of an end-to-end solution. I think that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline, just into their data lakes, to do all that very exploratory work and deep modeling. But clearly, it just makes sense to bring operational analytics into where people work, into their workspace, and further flatten that modern data stack. But with all this metadata and all this intelligence we're talking about, injecting AI into applications, it does seem like we're entering not only a new era of data, but a new era of apps. Today, most applications are about filling forms out or codifying processes, and they require human input. And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data, whether it's from the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, and then present it to humans. Do you guys see this as a new frontier? I think that certainly would work.
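The pattern Carl describes, enacting a model's result automatically unless the score is too low for confidence, reduces to a simple routing decision. This is a hedged sketch; the 0.8 threshold and the action names are hypothetical, not from any particular product:

```python
# Sketch of the pattern described above: act on a model score automatically
# unless confidence is too low, in which case route to a human reviewer.
# The 0.8 threshold and the action names are hypothetical.

REVIEW_THRESHOLD = 0.8

def route_decision(score, action):
    """Enact the recommended action automatically when the score is high
    enough; otherwise queue it for a human to look at."""
    if score >= REVIEW_THRESHOLD:
        return ("auto", action)       # confident: execute without a human
    return ("human_review", action)   # low confidence: a person reviews it

print(route_decision(0.93, "approve_discount"))  # ('auto', 'approve_discount')
print(route_decision(0.55, "approve_discount"))  # routed to human review
```

The design choice worth noting is that the human is the exception path, not the default, which is what distinguishes this from the swivel-chair pattern Doug criticized earlier.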
Very much so, but it's going to take a while. As Carl said, you have to design it, you have to get the prediction into the system, and the analytics at the point of decision have to be relevant to that decision point. And I also recall that a lot of the ERP vendors, back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how much of a challenge it is to get to what Doug's saying. One element that could be applied in this case has to do with architecture. If applications are developed to be event driven, rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. But in the end, it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago. And still we are not seeing this come to fruition in most business applications. And do you think it's going to require a new type of data platform, database? I mean, today data is all far flung. We see it's all over the clouds and at the edge. Today you cache that data, you throw it into memory. I mentioned MySQL HeatWave; there are other examples where it's kind of a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures. And just when we thought we had it all figured out. Without referring to disk, which to my mind is almost like talking about cave painting, I think that all the things that have been mentioned by all of us today are elements of what I'm talking about.
In other words, the whole improvement of the data mesh, the improvement of metadata across the board, and the improvement of the ability to track data and judge its freshness, the way we judge the freshness of a melon or something like that, to determine whether we can still use it, is it still good, that kind of thing. Bringing together data from multiple sources dynamically in real time requires all the things we've been talking about; all the predictions that we've talked about today add up to elements that can make this happen. Well guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it sort of shapes the outcome here of the predictions. Let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David and Doug. I really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. Thank you for having us. Thank you. All right, this is Dave Vellante for theCUBE signing off for now. Follow these guys on social media, look for coverage on siliconangle.com and thecube.net. Thank you for watching.