Host: Hi, this is the host of the podcast, and welcome to another episode of T3M, Topic of the Month. The topic this month is data, and who better to talk to than Sam Lambert of PlanetScale? Sam, it's great to have you on the show.

Sam: Thank you so much for welcoming me on.

Host: We've covered PlanetScale so many times, but it's a great idea to remind our viewers: what is PlanetScale? Because when we talk about PlanetScale, we're talking about data in huge quantities — the whole history with YouTube and all those things.

Sam: PlanetScale is truly the world's most scalable database platform. Our underlying core technology was created at YouTube. It's called Vitess — a MySQL sharding and clustering manager that was built on top of MySQL to handle all of YouTube's data. It was later adopted by companies like Slack, Square, Roblox, Etsy, and GitHub, and GitHub is where I came into contact with the technology. It was fantastically good for scaling MySQL. Then I came to PlanetScale, and we've hired a bunch of ex-GitHub folks to build what is a completely developer-focused DevOps database platform. So not only do we have the most scalable underpinnings for the database, we also have an entire DevOps layer on top, which allows people to automate their database: deploy their schemas into production, roll back schema changes without data loss, and essentially automate the database as part of the CI/CD process.

Host: How have you seen the evolution of data consumption and creation in this cloud-centric, Kubernetes-centric world?

Sam: There's obviously more data than ever. The thing that surprises me, being at this company, is the number of companies we see that are very small but have vast amounts of data under management. We speak to companies that have a nine-person engineering team and terabytes and terabytes of data being generated from their application.
They might be a mobile game, they might be an AI startup — there are so many different reasons you might store lots of data, and in this very connected world we're in, the abundance of data being created is continuous and wide. It still amazes me. And then come the hard challenges of storing data, retrieving it easily and quickly, and doing it in a resilient way.

We're now seeing huge migrations out of data centers into the cloud, and it involves taking a strategy that's actually native to the cloud. Scooping up data center workloads and just putting them into the cloud usually makes you less reliable and less efficient, and makes your application slower. So we help a lot of companies take their data center MySQL deployment, which could be huge, and migrate it into the cloud. And we don't just migrate it in the sense that we simply run MySQL — we run a cloud-native MySQL cluster for you on top of Kubernetes, leveraging cloud technologies so that it's appropriately baked into the cloud environment.

It's ever surprising how large these data sets are and how difficult it can be for companies to manage their data. Databases still remain one of the most significant sources of outages and incidents, and if you don't apply the right practices in the cloud, you can increase your rate of incidents and end up less reliable than you were in the data center. So we've spent a long time helping these customers think in a more cloud-native, Kubernetes-oriented way.

Host: What are some of the big challenges or pain points that you see — you did touch on some of them — where these companies struggle? Of course there are a lot of solutions out there, and companies offering similar things, but talk a bit about the pain points and how Vitess and PlanetScale help
them accelerate their journey.

Sam: A lot of the pain points come from the differences from the data center: things like reliability, networking, and latency. Cloud instances tend to be a lot more ephemeral than data center servers and disks. You can't hop on a RAID controller and start trying to resurrect failed disks to retrieve data. You likely have ephemeral pods that will never come back, and attached volumes that are persistent but need to be detached, reattached, migrated, and scaled up and down in the cloud world. The cloud has completely different resiliency properties.

Vitess being very, very scalable, but also very reliable, means that we've done millions and millions of failovers across hundreds of petabytes of data at various companies at huge scale, meaning failure scenarios are very well tested and predictable. Newer, untested solutions have not gone through such a rigorous process of being built in a very hostile environment. Vitess being the way it is — written in Go and built on Borg at Google, which was pre-Kubernetes and, again, a completely ephemeral environment — means we're used to keeping services highly available in a less reliable and predictable environment like the cloud. So we help people migrate out of data centers, where they might know the name of the machine, which rack and which server it sits in, and where they treat machines as a lot more special. You have to get away from that in the cloud, and Vitess is very, very good at doing so.

Host: Give us an overview of the overall picture of how PlanetScale helps folks with their journey, which doesn't just end with migrating the data.

Sam: Yes, so we manage the entire lifecycle of the database. Most companies produce a database back end and leave the user to do the automating and managing: doing
backups, doing schema changes — all of the daily happenings and routine running and maintenance of a database. At PlanetScale, we not only have the world's most scalable back end, we also take full ownership of managing the database. You do not need database expertise to manage your database with PlanetScale.

Schema changes are notoriously difficult: they are locking operations, they can cause downtime if done incorrectly, and they're never really handled as a deployment — they're usually a manual process within the software development lifecycle. PlanetScale, being natively integrated into the DevOps flow, allows you to fully automate schema changes and roll them back. Very similar to a GitHub pull request, we have a deploy request that lets you deploy a schema change iteratively, safely, and online.

We really spend a lot of time obsessing over how developers interact with the database daily. It's not just about that one time you set up the cluster — get set up easily and, oh, it's great, it works in a demo. We focus on the years and years of building that you do alongside your database, and we try to integrate the process of building into the database workflow. We have prescribed, very well-refined workflows with the database, and that's what makes it very unique — that's what makes us more of a platform than just a database back end.

Host: We talked about how data has evolved. Let's talk about how PlanetScale has evolved since its inception.

Sam: There's a two-sided answer to that. On one side, we work very closely with our users and our community, and we constantly look at how they use the product and how they evolve and manage their usage.
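The online, non-blocking schema changes Sam described earlier — deployed and rolled back like a pull request — are commonly built on a shadow-table technique: create a copy of the table with the new schema, backfill it, then swap names, keeping the old table around so the change can be reverted. Here's a minimal illustrative sketch of that idea using SQLite; the table and column names are hypothetical, and real tools (including Vitess's online DDL) also replicate writes that land during the backfill, which this sketch omits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ana",), ("bo",)])

# 1. Create a shadow table with the desired new schema (adds an email column).
conn.execute(
    "CREATE TABLE _users_new (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)

# 2. Backfill existing rows (a single batch here for brevity; real tools
#    backfill in chunks while mirroring live writes).
conn.execute("INSERT INTO _users_new (id, name) SELECT id, name FROM users")

# 3. Swap names. Keeping the old table around is what makes rollback cheap.
conn.execute("ALTER TABLE users RENAME TO _users_old")
conn.execute("ALTER TABLE _users_new RENAME TO users")

rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
print(rows)  # [(1, 'ana', None), (2, 'bo', None)]
```

At no point was the original table locked for a long rewrite, which is the property that makes schema changes deployable rather than a maintenance event.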
We listen to them. We see the kinds of incidents they might have related to the database, and we try to provide tooling that makes remediating and solving those things much simpler. We have our Insights product, which lets you really drill into performance problems at the database level — really get in there, understand them, and get feedback from the database.

On the other side, there's how battle-tested Vitess is. We try to make it as hard as possible to do the wrong thing and mess up your database. We protect your database: we don't allow you to directly apply schemas, drop data, or drop database columns. All of those operations have to go through a deploy that is logged, monitored, safe, still online, and rewindable. We try to bake in a lot of good defaults, and a lot of that comes from experience — everyone at PlanetScale has a lot of experience running databases at scale, and that really helps. Having that process layered on a really solid foundation means we can provide a really solid experience for our users overall.

Host: From a technology and innovation point of view — the whole market ecosystem is moving at a fast pace, new use cases are emerging, Kubernetes is going into production, so unique challenges are coming up — what kind of innovation, what kind of work is being done at PlanetScale?

Sam: We're bringing a lot of innovation, including a bunch of world firsts. One of the products we released late last year was PlanetScale Boost, which is a real-time, consistent query cache. One thing that's burdensome is that a lot of legacy databases require a lot of tiptoeing around them.
Typically, you have to shed load from the database using caching, and caching can be a really appropriate way of speeding up your database performance and how you use your database. That said, caching comes with a number of challenges. Invalidating the cache when you get updates, writes, and deletes is difficult. It's a second system: you have to store data there, you have to manage that data and its freshness, and inconsistent caches can lead to bugs, broken applications, and just bad user experiences.

So we've built this query cache that uses the same protocol — the same MySQL protocol, the same connection, even. You find slow queries in PlanetScale Insights, you tell us you want a query cached, we build the query plan into the engine, and we then stream updates to the cache so that a real-time materialized version of that data — that query's results — is always in memory. That means you get thousands-of-times performance improvements on queries without any worries about invalidation, or about hosting Memcached, Redis, or solutions like that.

Host: Perfect, thank you. I'm also curious — of course we all know the history of PlanetScale, so we know Vitess and PlanetScale — but looking at today's world, can you give us a glimpse of some of the use cases or users where you feel, hey, these are the ones with the hard challenges, and that's where we help them?
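The invalidation burden Sam contrasts Boost against shows up in even the smallest hand-rolled cache: every write path must remember to evict the affected entries, or readers silently see stale data. A minimal illustrative sketch (the table and helper names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 100)")

# The cache is a second system whose consistency is now our problem.
cache = {}  # SQL text -> cached result rows

def cached_query(sql):
    # Serve from cache when possible; fall through to the database otherwise.
    if sql not in cache:
        cache[sql] = conn.execute(sql).fetchall()
    return cache[sql]

def update_price(product_id, price):
    conn.execute(
        "UPDATE products SET price = ? WHERE id = ?", (price, product_id)
    )
    # Every single write path must remember this line, or readers go stale.
    cache.clear()

q = "SELECT price FROM products WHERE id = 1"
before = cached_query(q)  # [(100,)]
update_price(1, 150)
after = cached_query(q)   # [(150,)] -- correct only because we invalidated
```

A streaming, materialized cache like the one described removes exactly this class of manual bookkeeping: updates flow into the cached result automatically instead of depending on application code to evict.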
Sam: Yeah, so we have some very, very large deployments running on our cloud: single databases with hundreds of terabytes of data doing millions of queries a second, Fortune 500 public companies where we are their main cloud database. We have a payments platform that does 17 billion dollars of transactions every year running on our platform. We're really starting to see category leaders and very large-scale internet companies realize that they want to get out of the business of managing their own databases, and that they can buy a platform run by folks they would happily hire into their own organization, right? Our engineering team has scaled the workloads at GitHub, Google, Facebook, Twitter — we've seen it a number of times, and we're building the product we would have bought ourselves while working at any of those companies.

Other large internet companies are thinking: yeah, the right way is to outsource to people we trust to know what they're doing when it comes to running high-scale infrastructure, on a technology that's trusted globally by very large customers. So we've been very fortunate — knock on wood — to gain the trust, and take on the workloads, of some very, very large customers. If you go on our website, you'll see a number of case studies and a lot of well-known logos. Most people, during their work day, will — guaranteed — work with a tool that is hosted on PlanetScale technology, and that's something we're very fortunate for and very excited about.

Host: When it comes to data, do you see that organizations also need cultural changes internally, where developers shift left — where, once again,
data is not another silo, not someone else's problem, not just something for the database team to deal with? Culturally, do organizations need to start looking at data the way we look at security and the other aspects of writing and deploying an application?

Sam: Yeah. I think culturally, people need to understand how underserved they are by their current database practices. Databases have been so difficult for so long that people have almost given up on the idea that they can actually have magical, enriching experiences while using them. We've made sure our platform speeds up your active development, whereas if you ask anybody else what stance they have toward their database, they're not expecting speed-ups from it — they're grateful and happy if the database just doesn't go down while they're trying to make changes. That's where we're trying to push this revolutionary shift, and it's cultural. There's a lot of trauma and negativity built up around databases — people not wanting to deploy databases, or touch databases, or go near databases. We've done a lot of user research, and the word "fear" comes up all the time. People fear databases.
They fear interacting with the database. That's crazy, because what of substance are you building that doesn't require database changes? Any real feature requires database changes. And here's the thing I would challenge everyone listening who does DevOps: unless you are deploying your database, and you can roll back your database as part of undoing a deploy, are you really, fully doing DevOps if your database doesn't flow through the DevOps loop with you?

So we allow you to branch your database at the time you branch your code. From the very moment you start developing a feature, PlanetScale is there with you: you develop against a development branch, you deploy using a deploy request, you monitor and alert based on Insights, and if you see issues, you revert and rewind the deployment without losing any data. Unless you can do that flow — and I think most people say they're doing DevOps because they're doing it in the stateless world; they're like, yeah, we've got this great continuous deployment, it rolls out code changes blue-green, it's amazing, but then the DBA logs in and manually does that stuff with the database — you're not doing DevOps. You're ticking a box organizationally. Unless your database fully lives in lockstep with your application lifecycle, gives you feedback, and goes around that loop with you, you have not completed the full DevOps journey. And to date, there are very few solutions that actually enable you to do that. That's something very core to our philosophy, and we're starting to see a major cultural shift of people realizing the job's not done until every part of the stack operates this way. The database has been the absolute hardest part to bring into this world, but now we're getting there, and people are starting to understand.

Host: As you said, you do help users with their journey.
There are a lot of folks who are in the very early stages with a long way to go, and a lot of folks who are further along. I won't ask you for a full playbook, but how should organizations approach data — irrespective of where they are in their journey — to build a culture, or to pick a tool, so that they don't hit roadblocks and they future-proof themselves, so that whatever innovation comes next, data does not become a roadblock for them?

Sam: I think it's about building an iterative, incremental process for adding features and modifying and working with your database. You should try to find the bottlenecks in your process — the sources of fear, the places where you have to be too safe, too cautious — and try to round those things out with automation. And stay disciplined. Don't start to tell yourself that because you're at a certain scale, a certain size, of a certain importance, you're above continual iteration and continuous delivery. People start to say, wow, we're big now, everything needs to be bundled up into these big releases. People still excuse themselves with maintenance windows. Maintenance windows are inexcusable in 2023. If you're taking a maintenance window to do any operation on your database, you are fundamentally sitting on bad database technology, and you need to find an approach to get away from doing that — because those practices are not just technically poor, they set a completely wrong standard within your organization, and they represent a fundamental lack of imagination.

Host: Since we were talking earlier about culture: there are practices that were popularized at Netflix, like chaos engineering, and we talk a lot about organizations being prepared. Because, as I said, applications can go up and down, but with the database, when something
suddenly goes wrong, nothing is more disastrous than that. So talk about some of the modern practices you're seeing that help teams look at data from a holistic point of view, and that prepare them for when something does go wrong.

Sam: I think I have a better answer for thinking about how things go wrong. A lot of organizations fill the gaps with humans. You have to fundamentally understand that incidents are not exceptional — they're a general operation you should expect. Think about chaos engineering: it helps you understand the failures you should simply expect, because failures happen. A lot of people operate on a hope and a prayer that failures won't happen. Well, they will — they definitely will. Availability zones: it's not if they go down, it's when they go down. So you have to build architecture that recognizes the inevitability of failure, and then you have to understand that if the answer to that failure is human beings getting paged and woken up to respond, you will never hit any respectable SLA or SLO. You have to automate, ahead of time, what to do when something goes wrong.

Now, that's a challenge if you're using a new or untrusted database, because chances are it hasn't seen that scenario many times over. So you're stuck between a rock and a hard place in the database world. You want innovation, you want speed, you want DevOps — all of the things that make modern software development bearable — but at the same time, if the database is less than a decade old, you're likely being dicey with the safety and reliability of your data. So you need both: you need the fundamentals, and you need to be building on a solid foundation, but you also need all of the delight, the forward momentum, the DevOps built in.
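Sam's point about automating the response ahead of time — software, not a paged human, deciding what happens when the primary dies — can be sketched as a tiny promotion routine. This is purely illustrative (all names are hypothetical, and the health check is just a flag); real systems such as Vitess coordinate failover through replication state and consensus, not a single boolean.

```python
class Node:
    """A database node in a toy cluster model."""
    def __init__(self, name, role):
        self.name, self.role, self.healthy = name, role, True

def check_health(node):
    # Real systems probe replication lag, disk, and network; here it's a flag.
    return node.healthy

def auto_failover(primary, replicas):
    """Promote the first healthy replica if the primary is down -- no pager."""
    if check_health(primary):
        return primary
    for replica in replicas:
        if check_health(replica):
            replica.role = "primary"
            primary.role = "demoted"
            return replica
    raise RuntimeError("no healthy replica available")

primary = Node("db-1", "primary")
replicas = [Node("db-2", "replica"), Node("db-3", "replica")]
primary.healthy = False            # simulate an availability-zone failure
new_primary = auto_failover(primary, replicas)
print(new_primary.name)            # db-2
```

The design point is that the decision tree exists before the failure does: the 3 a.m. page is replaced by a code path that has already been exercised many times.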
Host: Sam, thank you so much for taking the time out today to talk about this topic. I would love to sit down and chat with you again. Thank you.

Sam: Thank you so much for having me. I'm very excited to have been here, and I'm happy to come back.