Great. Great. Thank you very much. It's a pleasure to be here and I'm really happy to be on with Sanjeev again. We met about a year ago when I was organizing a data lake open source panel at Subsurface, which is actually coming up again; you should go check it out. Sanjeev was moderating, it was a fascinating panel, and we thought it'd be a good idea to get back together, talk about what's changed over the past year, and bring up some of the topics we discussed last year with a more recent update. By way of background, I'll just share that I was around when the term big data was first making the rounds. I was heavily involved in the early days of cloud computing back then, and it seemed like the cloud was changing everything in IT; every major technology throughout the IT industry was being impacted. It wasn't just ripples from the cloud, it was big crashing waves. And so we had Big Data Camp, which is an unconference where we discussed what big data meant. We were grappling with the fact that now everyone could have access to open source tools to manage big data, like Hadoop and those kinds of things. So we were trying to understand what was coming, what opportunities were there, what advantages we could take with this new technology, and what we should expect to see. Back then the questions were things like: how do we deal with distributed data, because there's so much data we can't store it on one server, or one hard drive, or whatever we were putting our data on? And how do we deal with things like Eric Brewer's CAP theorem, where you can only have two out of the following three: consistency, availability, and partition tolerance. You can't have all three, right?
And it feels like over the past 10 years we've been grappling with that and trying to solve that problem, and even though there is no perfect solution, many technologies have come along that have given us the ability to live within those constraints, softening the trade-off instead of being strictly stuck with only two. But now what I'm noticing is that the changes are still coming, only instead of being punctuated in these big, thunderous changes that seem to come out of nowhere and come at us fast, it's more like a long tsunami wave: you don't really notice right away that the change is happening, but before you know it you're up to your neck, and there's so much happening that you didn't see coming because it happened so slowly, or seemed to happen so slowly, even though there's so much weight of water behind it. So we thought we'd take this opportunity to talk about some of those big mega trends that have been coming at us for the past couple of years, where we're at, and what the big news is today. And so, again, I'm happy to have Sanjeev with us, and I'll start off by sharing one of the trends that I've noticed: we've seen companies storing large amounts of database data, not just unstructured data, putting massive amounts of data into an object store and then treating that data as if it was database data, actually querying it in real time with SQL queries. That, to me, seems like one change that has been occurring recently. But Sanjeev, why don't you tell us what you think: is this one of the big changes, and what is this new modern data stack, what does it mean?
Yeah, so thank you, Dave, for setting this up. You said it very well, how there's this trickle of change that's coming. In fact, I hadn't thought of it that way; for me it's like everything seems like a mega trend, every day there's something new going on. Even between meetings, if I go on LinkedIn right now, today I see dbt Labs got a Series D of $220 million, something like that, at around a $4 billion valuation. So to me there's just so much happening, but now that you've posed it that way, I think this is like death by a thousand cuts. The complexity is so high these days; this whole data and analytics space is disaggregating, and specialization is happening so rapidly that we are having to deal with so many different moving parts. So you asked, what is a modern data stack? A modern data stack is one way for us to consolidate into fewer independent pieces that are somehow pre-integrated and have been proven to work together. Since I mentioned dbt, I'll use them as an example. If I'm building an analytical architecture, I can use, say, Fivetran with HVR for data ingestion, dbt to do data transformation, land my data into Snowflake, and then use my BI reporting tool of choice, or SQL, and do my reports. So this is an example of a modern data stack. It does not reduce the number of moving parts, but it at least brings together some of the technologies in an accelerated, pre-integrated fashion, so you then have to think less about integrating those technologies yourself. So that is the modern data stack. But customers want flexibility, they want to innovate. The modern data stack is one way of handling this; at the same time, it does not stop the growth of specialized technologies.
So, you know, in the early days we would talk about stacks as being like the LAMP stack or the MEAN stack; it was a very concise set of software that you would combine together, and you could probably deploy it quickly on your laptop or something and run it. But it sounds to me like what you're talking about with this modern data stack is that there's so much involved that it's not just one piece of software, it's not even one process; it's multiple processes and multiple pieces of software coming together to handle the challenge. Correct. Yes. For example, Dave, I talked about these few companies, but a modern data stack could even have a reverse ETL component, it could have data observability, it could have a data catalog. So that's the difference between this and, say, an ELK stack, which was Elasticsearch, Logstash, Kibana; the scope is much wider here. Right, right. And so the stack is almost like conceptual parts, and you can fill it in with the best-of-breed vendor that fits your use case. So you end up with all of these building blocks, and it's really up to you to piece this together. So what are the building blocks of a modern data stack? That's a great question. When I have done my architecture diagrams in the past, I break it down into five different pieces. The first one would be ingestion and integration. And we actually talked about some of the companies, like Fivetran, for example; Striim is another one, with a lot of emphasis on streaming data. So that's ingestion and integration.
Then comes storage. Storage is hugely controversial, always in the news, because this is how we store data. We have sort of come to an acceptance that modern technologies are being built on top of object stores: object stores give us super high durability, and they give us the disaggregation of compute and storage, like Amazon S3. But what sits on top of storage, is it a cloud data warehouse, or is it a data lakehouse? That is something still under a lot of debate. So that's building block number two. So we talked about ingestion and integration, landing it into storage. The third building block is analytics. When I talk about analytics, I'm talking about all kinds of analytics. It may be traditional analytics, where we used to build reports and dashboards, and we still do, or it could be more advanced, data science style analytics. But it also includes the semantic layer, knowledge graphs, the data virtualization layer, and analytics engines, like you mentioned, Dremio, or Starburst with Trino/Presto. All of that comes under the analytics piece. So those are the three distinct building blocks, and underneath them are two horizontal building blocks. One is governance and the other is operations. Governance is all the metadata-related things that we do to catalog the data, curate the data, and make sure we are doing proper data access governance, so people only see data they're authorized to see, and making sure we meet data privacy compliance regulations like GDPR or CCPA. So that's governance. And the fifth layer is operations. That is primarily DataOps, but it's not just DataOps; DataOps is DevOps for data. It could be MLOps, you know, like feature engineering. It's about how I automate and reuse the pieces that I'm building for my analytics architecture, all the way from ingesting the data to consuming the data. So those are the five building blocks that I see as critical.
Just to throw in another question here: we're talking all about the database and the data and moving all that data around through these five elements. Is this mostly happening in the public cloud, or is it happening on prem, or is it a little bit of both? Where do you see that happening? Very good question. In fact, I think this is a question that needs careful attention. Is this all in the cloud or is this on prem? Because we keep thinking that everything is happening in the cloud; everybody's talking about cloud. I want to point you to something that Andy Jassy talked about last year in April, while he was still with AWS and hadn't yet moved to Amazon.com. He said only 5% of global IT spending is in the cloud. Now, even if that's off by a little, it's no more than 20%. So what does that mean? 80% of the work is still going on on premises. So this componentization of the building blocks that I talked about, it's not necessarily in the cloud. We are seeing that on premises is now starting to bring a lot of these platform-as-a-service capabilities on prem. So even on prem, I don't necessarily need a very expensive file system. I can have an open source object store like MinIO or Red Hat Ceph or Apache Ozone, and I can build on top of that. So we are seeing a transfer of advancements from the cloud, and I have to say the cloud is where all of this started, onto on prem. And a follow-up on that: we're talking primarily about all of these changes that are happening with the data, and you mentioned that DataOps is sort of like DevOps for data. And it seems to me that these are almost two completely separate tracks, where application deployment is happening at its own pace, and the data is more complicated, so it's been evolving at a different pace, almost like a separate wave coming.
So you've just raised one of my pet peeves. I'm a data guy, and I feel that data people, and this is the first time I'm saying this in public, have had a lack of empathy for the business. We lived in our own sort of domain. So we've been late at adopting DevOps, we've been late at adopting data as a product, while software did those things. Not just DevOps and data as a product, but also data monitoring, data observability. So now I feel the data world is moving rapidly to catch up and be part of the bigger ecosystem, which includes applications. So how did we build applications in the past in the data world? When I started my career, I was writing SQL statements. Then I was in Oracle when Oracle introduced PL/SQL, which was a procedural language, and thereby we got stored procedures. Now we were writing business logic in stored procedures. But what we're seeing today is people saying: no, I will write my logic in whatever language I want; I don't necessarily need to tightly couple it with my database architecture. What if I wrote my application at a higher level, in Python, in Ruby, in Java, in Scala, and then pushed it down into the database? The database is still an excellent execution engine, but we have decoupled our application development from it. So what we are seeing is actually very interesting. Because of data gravity, first we brought compute close to data: we could push down Spark, we could execute logic where the data resides, even push down machine learning training right into the database. Now we are seeing even the applications being pushed down. For example, there's a new thing starting. Actually, it's not new, but it's only now starting to get traction: WebAssembly, or Wasm. Wasm allows you to bring your applications and your compute close to the data.
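The pushdown idea Sanjeev describes, writing logic at the application level but letting the database execute it next to the data, can be sketched in a few lines. This is an illustrative sketch only: Python's built-in sqlite3 stands in for a real warehouse engine, and the `sales` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("EU", 50.0), ("US", 200.0)])

# Anti-pattern: pull every row across the wire and aggregate in the app.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals_in_app = {}
for region, amount in rows:
    totals_in_app[region] = totals_in_app.get(region, 0) + amount

# Pushdown: the logic still lives in the application code, but the
# database executes the aggregation where the data resides.
totals_pushed = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))

assert totals_in_app == totals_pushed  # same answer, far less data moved
```

The same principle scales up to pushing Spark jobs or ML scoring into the engine: the result that crosses the wire is the small aggregate, not the raw rows.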
That's right. It's funny you mention data gravity; that was a term that I think was either coined or defined by Dave McCrory, who was also one of those Big Data Camp people from my past. But yeah, I've noticed a similar trend: when big changes are occurring, they usually happen in the simplest area first, and deploying applications is simpler than deploying and updating data. It kind of makes sense that it would happen that way, but you're right, maybe the data folks didn't think it was going to apply to them as soon as it has. Actually, I'm so glad you brought that up because I forgot. It's good you mentioned that. In defense of data people, which is myself at least, and I guess you, even though now you do a lot of infrastructure and applications work: applications don't change all the time, but data does. We have a concept of data gravity; we don't have a concept of application gravity. So it takes time for data to really rise up to the same level as applications, because once you write an application, sure, with microservices you can ship changes every day, 10,000 changes, but data is changing literally every millisecond, and you need to handle data very differently than you handle applications. That's right. Which brings us to our next point: if the DataOps folks want to be more agile, if the company wants to be more agile, how do they take these modern data building blocks and put them together to help companies become more agile? So, there are a few things that come to mind. One is, I've talked about DataOps, and it's not like I'm pushing for DataOps on this call, but DataOps gives you a more structured way of building your pipelines.
You know, using some of the low-code/no-code techniques allows us to bring in more of the citizen roles: if you have the right tools, then citizen analysts and citizen integrators can drag and drop and build their self-service applications. Automation is really important. And then in the cloud, I think one way we have managed to become very agile is that a lot of data technologies are finally becoming software as a service. Snowflake introduced SaaS, and that was such a big deal; when Snowflake first came onto the scene it was like, wow, I don't have to tune my database, I don't have to manage it, it's self-tuning, self-healing, no indexes to be created. That was a surprise. I feel that now the data providers are taking it to the next level, which is serverless, where the entire database is abstracted into literally an API call. So not only did the cloud introduce fully managed services, but where I still had to figure out what instance type to pick and how big my cluster should be, even that is now starting to get abstracted through serverless. Got it. So beyond serverless, you mentioned a lot of other technologies in the DataOps space. And I'm just wondering, for example, we saw this change from ETL to now ELT. Correct. So how is that, for example, helping a company become more agile? That's a good point. So look at why we went from ETL to ELT. Extracting something from, let's say, SAP HANA or Teradata does not change; that becomes like a commodity. Loading into an Oracle Autonomous Data Warehouse or Redshift or BigQuery also does not change. So the revelation was that the extract and the load are static, but transformation really depends on how you want to write your business logic. Transformation is very specific to your organization, your use case, your business requirements.
So ELT says: let's package extract and load, because that's commodity, and buy it from a third party. And then let's do transformation separately, because for transformation, maybe I want to write it in PySpark, for instance; I can use whatever language I want, whatever underlying storage I want. So you see, this is how we broke it apart, and instead of having transformation sit in between, we moved transformation after extract and load. Also, loading became easier. You see, the problem we used to have way back with traditional data warehouses like Teradata and Netezza was that storage was very expensive, so we had to transform the data before we stored it; we aggregated the data. Now cloud storage is cheap and almost infinite. So why don't we extract and load the data in its raw form, and then transform it? Now I can also handle multiple use cases. If tomorrow somebody says, oh, by the way, I have a new use case, I want to do predictive maintenance on my equipment, I have the raw data, because I loaded it as-is, exactly as I extracted it, and I can transform it for the new use case. Going back to one of my favorite pieces of storage, the object store, again: you just reminded me that when folks are doing the extract and load, they can extract from wherever and load it to the object store, and that can, like you said, be a component they buy or an open source tool. But then once it gets to the object store, there are so many different applications or databases that need to access that data. And that is where you do the transformation, so you can transform that data, not necessarily to make it smaller, but to put it in the right format for all of those tools that might access the data on the object store, whether they need it in CSV format or Parquet or whatever. So yeah, that makes a ton of sense.
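The ELT pattern just described can be sketched end to end. Again this is only an illustration: sqlite3 stands in for the cloud warehouse, and the `raw_readings` table, the machine names, and the overheat threshold are all hypothetical. The point is that the raw load is kept untouched, so a new transformation, like the predictive maintenance use case, can be layered on later.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Extract + Load: land the data raw, exactly as it came from the source.
db.execute("CREATE TABLE raw_readings (machine TEXT, ts TEXT, temp_f REAL)")
db.executemany("INSERT INTO raw_readings VALUES (?, ?, ?)", [
    ("pump-1", "2022-03-01T10:00", 180.0),
    ("pump-1", "2022-03-01T11:00", 245.0),
    ("pump-2", "2022-03-01T10:00", 150.0),
])

# Transform, use case 1: a reporting view in Celsius.
db.execute("""CREATE VIEW readings_c AS
              SELECT machine, ts, (temp_f - 32) * 5.0 / 9.0 AS temp_c
              FROM raw_readings""")

# Transform, use case 2 (added later): predictive maintenance flags,
# built from the same untouched raw data.
db.execute("""CREATE VIEW overheat_alerts AS
              SELECT machine, ts FROM raw_readings WHERE temp_f > 200""")

alerts = db.execute("SELECT machine FROM overheat_alerts").fetchall()
# The raw load supports a use case nobody planned for at ingest time.
```

Under classic ETL, the aggregation would have happened before load, and the raw temperatures needed for the second use case would already be gone.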
So then, what are you seeing as far as trends in this modern data architecture, beyond what we've already talked about? So, yeah, let's change track and talk about something different. What I'm going to talk about is how this whole era we're living in is the era of decentralization. I want to narrate something that Tim O'Reilly originally came up with, this idea of the pendulum swinging all the time from one place to another. Modern is a very relative thing, but going back to the 1950s and 60s, mainframe computers were the de facto standard. Everything was centralized in mainframe computers, but then as computer usage went up, there was a move to decentralize hardware, and we did it through personal computers, PCs, laptops, and so on. So we went from centralized to decentralized. Then Microsoft Windows came and centralized everything back at the operating system software level. Then the internet came and we decentralized once again, but then the cloud providers and cloud data warehouses came and once again we centralized. Now the pendulum has swung again and we live in the decentralized phase. The internet decentralized the hardware, the cloud decentralized how we access services, microservices decentralized applications. Now we are seeing that even in data, decentralization is taking place through concepts like data mesh. Now, data mesh is a business concept, not a technology concept, so we are still working through it, but the idea of data mesh is that we have decentralized our teams, made them domain-driven, and we've decentralized how we build our data for consumption, through data as products. If I take it even a step further, I think we are only at the cusp of decentralization.
One thing that has caught the world's attention, especially venture capitalists, is Web3. Web3 is all about the decentralized internet, decentralized apps. What Web3 is saying is: what is the biggest problem we have in the data world? It's trust. What if I could take my data product and put it on Web3, which is tamper-proof and cryptographically secured? I now have a lineage of my data product. What I'm sharing with you is not a common view; it's something that I think may happen in the next three to five years. But I'm going to stop here because I want to get your thoughts on some of these decentralization ideas. You know, the pendulum is a really good analogy, like so many of Tim O'Reilly's other analogies, like Web 2.0. But yeah, the decentralized concept: like you said, we did start to organize around the centralized cloud, putting everything into an Amazon or Google or Microsoft cloud. That sort of centralized things, but once that layer becomes commoditized, it typically allows innovators to rely on it as a foundation for a new layer, and that new layer seems to have been a distributed layer on top of that cloud commodity, for a variety of reasons. And it seems to be powered by this blockchain concept, which I think is a little bit off topic, simply because, okay, great, Bitcoin is making a bunch of people money and everyone wants in on that, so what's the framework we're going to build blockchain applications around? I've seen so many ridiculous applications using blockchain for things that just do not belong in a blockchain and will just lead to complete and utter disaster, yet at the same time there's so much interest around it.
You know what they say: first they ignore you, then they laugh at you, then they fight you, and then you win, right? I feel like we're in that same world with Web3, where there's so much nonsense that at first it's easy just to ignore or laugh at, but there's so much activity around it that they're starting to solve real problems with this technology, even if some of it seems ridiculous. There's so much of it that I definitely imagine Web3 will take off in some way, shape, or form. I don't know yet what the killer app is, and I think that's part of the problem, we haven't really seen it. Although I personally think there might be something around data personalization: keeping your data away from the marketing firms, away from the salespeople cold-calling you. Somehow abstracting your data and keeping it decentralized, maybe in your own browser or something, and only sending it when you want to send it. But I don't know, we'll see. So do I think there's something there? I do, I just don't know exactly where that will be. Oh, one last thing I do want to say, and actually I'll let you talk about it, because I think there's a question about this we can get to later, around the value of blockchain in this new world. I am actually surprised how many companies I'm now talking to, product companies, that actually have some sort of blockchain hidden underneath, and they don't even want to talk about it. In fact, there's a company called Vendia. Tim Wagner, the father of serverless computing, the guy who created Lambda, created a company called Vendia.
They actually use a quantum ledger as their engine, but they don't necessarily want to talk about it, because people have views of blockchain, and when they think blockchain they think crypto, and crypto is very expensive. In fact, it's such a resource hog that many countries have even banned it. The amount of electricity used just by mining crypto is the same as the amount of electricity used by all of the Netherlands. So now the question is, why is Web3 such a big deal if we are struggling with these issues? That is because with the new way of doing it, we are actually avoiding all this heavyweight processing. This is a concept called proof of stake. Proof of stake is still very new, but basically it's lightweight; it's more of a trust-based approach, where we don't have to do the heavyweight proof-of-work computation. So it's actually not the blockchain we think of when we think of crypto; this is the new way of thinking about it. And by the way, you were talking about "first they ignore you, then they laugh at you, they make fun of you." You know, this very recording: we could take this recording and turn it into an NFT. It is an NFT. All right, anybody in the audience, if you want to buy the NFT for this recording, just put your price in the Q&A window. No? Right. Well, I think it just points out that there's so much interest, and they have solved a problem, and now will they find other use cases for that technology? I think they will. One example I've talked about before: a friend of mine works for a company that imports seeds from other countries into the United States, and the difference between those seeds and other seeds is really in the DNA. The seeds grow slightly differently and are more resistant to certain fungus or whatever.
And so somebody who's buying the seeds wants to know that that seed actually came from where they think it came from. So they're creating this shared ledger of some sort, maybe the same type of ledger you're talking about, where it's a proof of, not work, but, what do you call it? Proof of stake. Proof of stake, so it's proof that the person selling me the seed actually did have the right to that seed. So maybe we'll see it in a distributed partnership type of situation. Yeah, and just to clarify, I am not at all proposing that we put all our data on a blockchain or a quantum ledger. I'm talking about metadata, information about that data, which is lightweight. Right, right. Okay. By the way, we're talking about trends, so go ahead, because there are so many other trends we can talk about; we just got into this fun topic of Web3, but yeah, go ahead, Dave. Yeah, well, what other trends do you see coming beyond Web3? Okay, so there's a shift happening. We've typically been very focused on structured data, doing some heavy-duty coding on it, batch data, but now I see a lot of emphasis on streaming data and on using semi-structured and unstructured data. We talked about low-code/no-code. One thing I want to call out is that a lot of the time our discussion becomes very focused on the analytical architecture, which is: you take the data from operational systems, you land it into S3, and then you do the transformation and machine learning on top of that. We forget that there's the whole operational side of the world, and that operational side is hugely critical. That is actually the bread and butter of companies.
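A lightweight, tamper-evident ledger for metadata, like the seed provenance example above, can be sketched as a simple hash chain. This is only an illustration of the idea, not any particular blockchain, consensus mechanism, or the quantum ledger Vendia uses; the seed-lot records are invented for the example.

```python
import hashlib
import json

def add_entry(ledger, metadata):
    """Append a metadata record, chaining it to the previous entry's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    payload = json.dumps(metadata, sort_keys=True) + prev_hash
    ledger.append({"metadata": metadata,
                   "prev": prev_hash,
                   "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(ledger):
    """Recompute every hash; any edited entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in ledger:
        payload = json.dumps(entry["metadata"], sort_keys=True) + prev_hash
        if (entry["prev"] != prev_hash or
                hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]):
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
add_entry(ledger, {"seed_lot": "A-17", "origin": "NL", "owner": "grower-1"})
add_entry(ledger, {"seed_lot": "A-17", "origin": "NL", "owner": "importer-9"})
assert verify(ledger)

ledger[0]["metadata"]["origin"] = "XX"   # tamper with the provenance...
assert not verify(ledger)                # ...and verification fails
```

Note that only the metadata about the seed lot travels through the chain, which is what keeps this kind of lineage lightweight.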
This is how companies make money, through these operational databases. I could be down on my analytical database for a few seconds or minutes and recover, but if my operational databases are down, my business stops. So we're seeing a couple of trends there, and they actually oppose each other. One of them is specialized databases, for example distributed SQL databases, started by, to be honest, Google Cloud Spanner, then CockroachDB, Yugabyte, and a bunch of others. These are databases that support SQL, and most if not all of them have PostgreSQL compatibility, but they are distributed in a way that they're not just multi-region or multi-availability-zone, they're multi-cloud. I could today have a database where I don't care where my data is getting served from: if I'm in the EU, I get data served locally from one cloud provider; in the US, it's a different cloud provider. Not something for every use case, and I'm not proposing that everybody should look into this, but we have the technology today. So this is on the operational side. The other thing I'm seeing is another school of thought, which says that if I'm doing heavy-duty graph work, then go get a graph database, not a problem. But the trend is to do multi-model, or whatever you want to call it; the idea is to converge the different data structures into a single database. This way I may have a relational database where I can do document-database-style JSON lookups, I can do graph, I can do time series, I can do full-text search. Some vendors are saying this is the way of the future; some are saying no, get a specialized database. What you should pick really depends on your use case. Right. Yeah. And having worked at Redis for four years, you know, they're known as being a cache, but they've done the same thing.
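The multi-model idea, one engine serving both the relational and the document "model," can be illustrated with Python's built-in sqlite3, assuming your SQLite build includes the JSON1 functions (most modern builds do). The `products` table and its JSON documents are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One relational table, but one column holds a JSON document: the same
# engine serves both tabular queries and document-style lookups.
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, doc TEXT)")
db.execute("""INSERT INTO products (doc) VALUES
    ('{"name": "widget", "tags": ["metal", "small"], "price": 9.5}'),
    ('{"name": "gadget", "tags": ["plastic"], "price": 12.0}')""")

# Relational-style query...
count = db.execute("SELECT COUNT(*) FROM products").fetchone()[0]

# ...and a document-style lookup inside the JSON, in the same database.
name = db.execute(
    "SELECT json_extract(doc, '$.name') FROM products "
    "WHERE json_extract(doc, '$.price') > 10").fetchone()[0]
```

The trade-off the speakers describe is exactly this: convenient convergence in one engine versus the deeper capabilities of a dedicated document or graph database.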
Now, if you want to run any one of those types of databases in memory, you can use a module to keep all of your data in Redis and use it as a graph or JSON store, whatever. Well, great. I want to keep an eye on the time here, and I know that we've got some questions related to some of these topics, so I want to make sure we hit them. One of the questions is: what are the trends you're seeing within individual stack components, such as ingestion, transformation, and BI? And the next topic we were going to talk about is common tools and stacks. So why don't we address that: what are some of the common tools and stacks, and some trends you're seeing around them? Right. And by the way, in some ways I've been answering those questions already: streaming data for ingestion is a thing; for transformation, it's doing the transformation in a layer above and then doing a pushdown into the execution layer. You can take that transformation and version control it, yet another DataOps/DevOps thing, which we didn't really do very efficiently in the past. If I'm doing an ETL or ELT transformation, I can put it on GitHub and do version control, and if something goes wrong, I can roll back. So just to answer some of the questions that have come up. Moving to this new question about common tools and stacks: the joke in the market is that today, at least in the US, we have five stacks; everything we do now revolves around these five stacks, and they are AWS, Azure, Google Cloud, Snowflake, and Databricks. These are becoming so big that they are sort of subsuming the entire operations around them. But along with these stacks, tools are coming up in many different places; we talked about some of them.
For example, data observability is a very interesting space. We see a lot of companies talking about data observability; some of the big ones I see are Acceldata, Bigeye, and Monte Carlo. They are basically saying that as stacks and end-to-end pipelines become very complex, how do you know that something isn't broken? How can you proactively find out what is going on in that space? So that is a set of common tools that has only recently emerged. Another set of tools that has emerged is around data access governance. Here we see companies like Okera, Immuta, and Privacera, which are basically simplifying the way we create policies. In fact, even the cloud providers have this; AWS has something called Lake Formation, but that's for AWS only. So we see common tools starting to emerge in specific areas. Now, I've got one question here I want to go ahead and answer, which comes from a user at a public library. They're using on-premises Cloudera and Hortonworks clusters, but they have budget constraints and can't afford them anymore, so they're looking to migrate to a new cluster. Given what we know about the changes that are occurring, do you know of any stacks they might move to that would help with these challenges? So the first question is whether they are going to move at all. They have two choices: private cloud or public cloud. My first reaction was to pick a cloud where you already have the skills. Maybe you've got Microsoft Teams and Microsoft Office 365, so Azure may be a good choice because you already have Power BI skills; I'm just using that as an example. So my first thought was to pick your cloud provider and then move into its managed Hadoop service: Azure has HDInsight, Google has Dataproc, and AWS has EMR. But then I stopped myself, because if you're cost constrained, how do we know whether the cost will stay the same or go up?
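The data observability idea above, knowing proactively that a pipeline is broken, boils down to running automated checks on freshness, volume, and quality before downstream jobs consume a table. Here is a toy sketch; the thresholds and record shape are hypothetical, and real tools like Monte Carlo or Bigeye do far more (lineage, anomaly detection, alerting).

```python
from datetime import datetime, timedelta, timezone

def check_table(rows, min_rows, max_staleness):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    if len(rows) < min_rows:
        problems.append(f"expected >= {min_rows} rows, got {len(rows)}")
    if rows:
        newest = max(r["updated_at"] for r in rows)
        if datetime.now(timezone.utc) - newest > max_staleness:
            problems.append(f"data stale: newest record at {newest}")
    return problems

# A table updated over the last 100 minutes passes a 50-row / 1-hour check.
now = datetime.now(timezone.utc)
rows = [{"id": i, "updated_at": now - timedelta(minutes=i)} for i in range(100)]
issues = check_table(rows, min_rows=50, max_staleness=timedelta(hours=1))
print(issues)  # []
```

A check like this would typically run as a scheduled job after each pipeline stage, failing loudly instead of letting a silent data gap reach the BI layer.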
So you can also look at on-premises options; Cloudera, for instance, has Cloudera Private Cloud, which uses more modern technologies. And I would say you literally have to do a proof of concept to see what the nature of your workload is and where it is going to be most cost effective to run. Maybe they can store their data in an object store to keep their costs down. Yes. And then take a look at open source Cloudera or Hortonworks, or perhaps Spark, which can also be an option for them. Keeping the data in the object store and keeping your clusters smaller might be a good alternative for them. Okay, we're running out of time here, and I don't want to cut a couple of things short. So what about how you evaluate the components of a modern data architecture? One of the points is about deployment. That was a great question, because it is exactly what we were just discussing with evaluation. You may be in a situation where you say: if I have to run my batch job, or I have to hydrate my data warehouse, that's a continuous process, so I'm going to keep that on-prem. But when I have to train my machine learning models and I need 1,000 GPUs, I don't have 1,000 GPUs sitting spare on premises, so I'm going to use the cloud. The cloud is fantastic when you have ephemeral use cases where you want to scale up and scale down very fast and then stop paying. So one evaluation criterion for your deployment depends on the nature of your use case. We're running out of time, so briefly: the second evaluation criterion is skill sets, which we've already talked about. Another one that I want to highly emphasize is customer experience, and that customer experience could be developer experience or end-user experience.
But the problem that I constantly run into is organizations that have accumulated a lot of technology, like a data governance data catalog, and it's shelfware. Why? Because it is so hard, so unintuitive, to use. So make sure that whatever tool or technology you pick can be easily adopted by the intended users. Okay, good. Well, we are almost out of time here, so let's get to the last question, which is: how do you future-proof your architecture? Okay, good point. There are many ways of doing it. Again, depending on your use case, if you're a Netflix or a Capital One and you're all in on AWS, that's fantastic. But if you're a smaller company and you're not sure which cloud provider to pick, then choosing a vendor-neutral third-party product that's multi-cloud or hybrid multi-cloud might be a good idea, and I mentioned some of them already in my talk track. This way you're not locked into a certain stack or technology provider. The other option: some people like open source and think it is super important for them. We know open source as a business model has struggled, but PostgreSQL, which I've mentioned a few times, might be a good idea because now you're not locked in; you've kind of future-proofed your architecture. The third thing is to look for cloud-native deployments, and cloud native, by the way, also includes private cloud on-prem. These are products that run in containers orchestrated by Kubernetes, because as long as I have a Kubernetes engine, I have portability. The last thing I want to talk about is, when you are future-proofing, look at whether you want an integrated solution or best of breed. If you go for an integrated solution, you will be locked in. However, it's faster; you don't have to worry about identity and access management or logging, and you have a single throat to choke if something goes wrong. But if you want to future-proof yourself...
...then you may look for best of breed. With best of breed, the onus is on you to do the integration, but you are not locked into a particular technology. Okay, so I know we're really short on time here. I just want to get to a couple of questions and then we've got to wrap up, as we're actually over. One question: how do you view database deployments in a DevOps pipeline, in terms of configuration and product migrations? That's a real challenge. Any suggestions there? Database deployments in a DevOps environment should all be done through infrastructure as code (IaC), where you have a Kubernetes operator and a YAML file. In the YAML file you define your configuration, so instead of hard-coding the configuration into your database deployment, you keep it in a file. The file gets put on GitHub, you have revisions of it, and you can quickly roll back, go back and forth, and deploy on different platforms. Okay, and then one last question, which touches on one of the main themes: do you think the emergence of cloud data warehouses poses a threat to traditional data warehouse vendors? 100%. There is really no such thing as a traditional data warehouse vendor anymore. Teradata, you know, had its own issues with pricing, but now Teradata Vantage is there in the cloud. And by the way, it has the same maturity of product; it can do a join across 40 tables in the blink of an eye, but all the emphasis is on the cloud. Well, great, and with that, I think we are about done, or we should be done, and we've answered our questions, so we don't have to take any more. Thank you very much, Sanjeev. And I do want to point out that we are going to continue to discuss these topics.
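The infrastructure-as-code pattern described in that answer, declaring database configuration in a versioned file rather than hard-coding it, can be sketched in a few lines. This is a hedged illustration, not a real operator: the configuration fields are hypothetical, and in practice the file would be YAML read by a Kubernetes operator rather than Python data.

```python
import json

# Declarative database config as data (standing in for a YAML manifest a
# Kubernetes operator would read). Serializing it deterministically means
# each revision committed to Git is a clean, diffable snapshot, and a
# rollback is just checking out the previous revision of the file.
desired_v1 = {"replicas": 3, "version": "14.5", "storage_gb": 100}
desired_v2 = {"replicas": 5, "version": "14.5", "storage_gb": 100}

def render(config):
    """Deterministic text form of the config, the artifact you'd commit."""
    return json.dumps(config, indent=2, sort_keys=True)

def diff(old, new):
    """Fields an operator would reconcile when the manifest changes."""
    return {k: (old.get(k), new.get(k)) for k in new if old.get(k) != new.get(k)}

print(diff(desired_v1, desired_v2))  # {'replicas': (3, 5)}
```

The point of the sketch is the workflow, not the code: because the desired state is a text file under version control, "what changed between deploys" and "go back to last week's configuration" become ordinary Git operations.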
We are planning to organize an unconference in late March or early April. We haven't set the date yet, so I don't have anywhere to point you to right now, but we are calling it Modern Data Architecture Camp; "camp" is another term for an unconference. Since you've registered for this event, we will send you an email to let you know when that event is scheduled. It will be an opportunity for more people to come join us and have a broader discussion about more topics, and each topic can get its own breakout session, so you'll be able to ask and get answers to more questions. Again, thank you very much. I really appreciate the opportunity to be here with you, Sanjeev, and I look forward to doing more of these with you. Thank you so much. Take care, everybody. Yeah, thank you so much to Dave and Sanjeev for their time today, and thank you to all the participants who joined us. As a reminder, this recording will be on the Linux Foundation YouTube page later today. We hope you're able to join us for future webinars. Have a wonderful day. Thank you.