We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get to lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant and has evolved and matured. Its stack has matured over the years. Why is it not the default platform for data?
Yeah, well, I think that's become a lie over time. So I think, you know, if we go back 10 or 12 years ago with the advent of the first data lake, really around Hadoop, that probably was true, that you couldn't get the performance that you needed to run fast interactive SQL queries in a data lake. Now, a lot's changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to be columnar. You need to store data in a columnar format. And then, you know, columnar formats were introduced to data lakes. You have Parquet, ORC, and Avro that were created to ultimately deliver performance out of that. So, okay, we got largely over the performance hurdle. You know, more recently people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse. And now we've got the creation of new data formats, again, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from Curt Monash many years ago where he said, you know, it takes six or seven years to build a functional database. I think that's right. And now we've had almost a decade go by. So, you know, these technologies have matured to really deliver very close to the same level of performance and functionality as cloud data warehouses. So I think the reality is that's become a lie.
And now we have large, giant hyperscale internet companies that don't have the traditional data warehouse at all. They do all of their analytics in a data lake. So I think we've proven that it's very much possible today.
Thank you for that. So Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, open is a moving target. I remember UNIX used to be open systems. And so it's an evolving, you know, spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or that you're fearful of in a proprietary system?
I suppose for me, open buys us the ability to be unsure about the future. Because one thing that's always true about technology is it evolves in a direction slightly different to what people expect. And what you don't want to end up having done is backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and you've stored trillions of records in that technology and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage, and your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be. And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data. And that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud scale infrastructure to scale up and scale down as we need. And so we don't have the concern that we don't have enough hardware today to process what we want to achieve. We can just scale up when we need it and scale back down. So open source has really allowed us to stay at the cutting edge.
So just let me play devil's advocate here a little bit.
I've talked to Zhamak about this, and obviously her vision is that data mesh is open source, open source tooling, and it's not proprietary. You're not going to buy a data mesh; you're going to build it with open source tooling, and vendors like you are going to support it. But to come back to today, you can get to market with a proprietary solution faster. I'm going to make that statement; you tell me if it's a lie. And then you can say, okay, we support Apache Iceberg. We're going to support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes, and with every new open source thing that comes along, they say, we do that too. Why can't proprietary systems do that and be as effective?
Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA. And those cloud data warehouses that can reach beyond their own proprietary storage drop all the performance that they were able to provide. So it reminds me, again, of going back 10 or 12 years ago, when everybody had a connector to Hadoop and they thought that was the solution, right? But the reality was, a connector was not the same as running workloads in Hadoop back then. And I think similarly, when you connect to an external table that lives in an open data format, you're not going to get the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, they're always going to be incentivized, to get that data ingested into the data warehouse, because that's where they have control. And the bottom line is the database industry has really been built around vendor lock-in.
I mean, from the start, how many people love Oracle today but are customers nonetheless? I think lock-in is part of this industry. And I think that's really what we're trying to change with open data formats.
Well, it's interesting. I'm reminded of when I see the gas price, the teaser gas price. I drive up and then I say, okay, that's the cash price. Credit card, I'm going to pay 20 cents more. But okay. So let me come back to you, Justin. What's wrong with saying, hey, we support open data formats, but yeah, you're going to get better performance if you keep it in our closed system? Are you saying that long-term, that's going to come back and bite you? Because you mentioned Oracle, you mentioned Teradata; by implication, you're saying that's where Snowflake customers are headed.
Yeah, absolutely. I think this is a movie that we've all seen before, at least those of us who've been in the industry long enough to see this movie play over a couple of times. So I do think that's the future. And I loved what Richard said. I actually wrote it down because I thought it was an amazing quote. He said, it buys us the ability to be unsure of the future. That pretty much says it all. The future is unknowable, and the reality is, using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query it via SQL, that's totally cool. They can both work off of the same exact data sets. By contrast, if you're focused on a proprietary model, then you're kind of locked in again to that model. I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape, where a proprietary approach kind of closes you in and locks you in.
So I would say this, and Richard, I'd love to get your thoughts on it, because I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, yeah, they're jamming us on price and the license costs are high, but we do get value out of it. And so my question to you, Richard, is: are the, let's call them data warehouse systems, or the proprietary systems, going to deliver a greater ROI sooner? And is that an allure that customers are attracted to? Or can open platforms deliver as fast an ROI?
I think the answer to that is, it can depend a bit. It depends on your business's skill set. So we're lucky that we have a number of teams that work in proprietary databases that provide our operational data capability. And we have teams of analytics and big data experts who can work with open datasets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases today. We can back off very tight SLAs to them. We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, where increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based, data mesh type technologies. And so it's not a simple answer that one will always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily replicate with open technologies.
Yeah, Richard, staying with you. You mentioned some things before that struck me. You know, the Databricks versus Snowflake thing is always a lot of fun for analysts like me. Richard, you mentioned you have a lot of rock-star data engineers. You've got Databricks coming at it from a data engineering heritage.
You've got Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan have said, you know what, I think it's actually harder to play in the data engineering world. So, i.e., it's easier for the data engineering world to go into the analytics world versus the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people?
So I think I'd probably fall back on general programming skill sets. So the advice that I saw years ago was, if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense. I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and can innovate us beyond our competitors. So I think my advice to people who are starting here, or trying to build teams to capitalize on data assets, is begin with open, license-free capabilities, because they're very cheap to experiment with, and they generate a lot of interest from people who want to join you as a business, and you can make them very successful early doors in your analytics journey.
And it's interesting. Again, analysts like myself, we do a lot of TCO work and have over the last 20-plus years. And normally it's the staff that's the biggest nut in total cost of ownership. Not in Oracle; there the license cost is by far the biggest component in the blame pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open, closed, Snowflake, Databricks. Where does Starburst, as this engine for the data lakehouse, the data warehouse, fit in this world?
Yeah, so our view on how the future ultimately unfolds is, we think that data lakes will be a natural center of gravity for a lot of the reasons that we described: open data formats, and lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately, storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is we still don't think you're going to get everything there either. We think that centralization of all your data assets is basically just an impossible endeavor. And so you want to be able to access data that lives outside of the lake as well. So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have, but to have comprehensive analytics and to truly understand your business, and understand it holistically, you need to be able to go access other data sources as well. And so that's the role that we want to play: to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via the creation of data products as well.
Great, okay, thanks guys. Right after this quick break, we're going to be back to debate whether the cloud data model that we see emerging and the so-called modern data stack is really modern, or is it the same wine, new bottle, when it comes to data architectures? You're watching theCUBE, the leader in enterprise and emerging tech coverage.