Hey guys, welcome back to my YouTube channel. This is Daniel Rosehill here and I'm doing an experimental video today. It's going to be called Beer and Backup. Now, this is not my idea. I'm robbing the creative idea from the Restore It All backup podcast, which I'll introduce in a moment. It's one of my favorite podcasts. It's, to the best of my knowledge, the only podcast that really dives deep into the world of backup, which is one of my long-standing tech fascinations. I'll put it up on the screen there.
A couple of things. Firstly, I was on this podcast about a year ago, maybe more. It's corona time. What can I say? It's all a blur. But I was on there at some point, episode number 74, talking to Curtis and Prasanna. I think this was actually one of the first podcast interviews I ever did. I had a great time on the show and they described me as a self-described backup anorak, which I remember Curtis got a great laugh out of, because he had never come across the term anorak before. I think it's a British term, and growing up in Ireland, you borrow a lot of slang from the Brits. It's a kind of tongue-in-cheek word for someone who has a goofy interest in a very niche area. I'm trying to think of other words for anorak. You get the idea, and backup certainly fits that criterion. There's nothing cool or sexy or glamorous about it. Even within tech, it's probably the dullest field. And yet folks like me, Curtis and Prasanna are fascinated by backup. I'm going to try to touch a little bit on my up-to-now private theories about the psychology of that in this YouTube, however you'd describe this, rant, or me talking into a microphone in the late hours of the evening.
So yeah, that was my episode. If you want to hear me on the podcast, check out episode 74. But here's the podcast itself. It's an absolute gem of a podcast.
There are very few podcasts I've actually stuck with over the course of time. There are a lot of mediocre podcasts out there. This is a great podcast. It's got a lot of fans. They've put out 158 episodes. These two guys: W. Curtis Preston, who has literally written the book on backup. And that's actually literal. He's literally written the backup textbook. And he does the podcast with a friend of his called Prasanna Malaiyandi. They used to work together at Druva. Now Prasanna is going on to Zoom. So if Curtis and Prasanna ever listen to this YouTube video, they'll know I'm actually keeping tabs on the podcast.
They've done some really great episodes. They did one where they interviewed a Hollywood guy and he talked about how Hollywood studios do backup. Backup is really interesting, guys. I'm trying to sell it. It's a great jumping-off point for learning about enterprise tech, storage and all these admittedly unsexy fields. They talk about LTO. They've got a few episodes on tape, for those who know a bit about backup. It's really entertaining. They do great interviews. And if you're into tech, and especially if you're into backup, this podcast is like nirvana. It's like listening to people who actually get backup and care about backup. I went years without ever meeting another backup fiend before I discovered this podcast.
So the actual topic of this video blog is going to be the sorry state of consumer backup. Because I'm going to be uploading this after my beer, I have to get this straight: Backup and Beer One, the Sorry State of SaaS Consumer Backup. That's going to be the title. So yeah, I just want to talk about that for a little bit. Consumer SaaS backup is pitiful. Let's take a step back, and this is the way I want to frame the discussion.
So you get up in the morning. Your average person pulls out their phone, they check their email, they check their Facebook. I check Reddit, YouTube, going through the things. And as you go about your day, my contention would be that the average person these days actually creates more data that lives natively in the cloud than they create locally, right?
What do you do on your local computer anymore? I'm looking at my desktop open in front of me here. I do my video editing on the local computer, like I'd say almost everybody does, because video editing is one of those things that is quite hardware intensive, RAM intensive. And it's really a bandwidth problem. There are a couple of cloud video editors, but they're just not anywhere near as good as the desktop ones, right? My wife is an architect. CAD is another thing that's really, really heavy in terms of the data involved in doing architectural renderings and stuff like that. Same if you're an engineer, whatever.
It's a bandwidth issue rather than a cloud issue. I don't think it's the case that these applications couldn't be deployed as cloud-native SaaS applications. Let's say I shoot five gigs of video and I now need to edit that video. If it's going to be a cloud application, I first need to upload my five gigs of video to the cloud. And if I have my really lovely two megabit per second upload connection, it's going to take me about six hours just to get my clips up to the cloud, before I edit them and publish them. Once I've done the editing, I could publish cloud to cloud. It's beautiful, but there's a constraint there, and that's the same for CAD files.
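To put a number on that six hours, here is a quick back-of-the-envelope sketch. It assumes decimal gigabytes, a link that actually sustains its rated speed, and no protocol overhead; the function name `upload_hours` is made up for illustration.

```python
def upload_hours(size_gb: float, uplink_mbps: float) -> float:
    """Hours needed to push size_gb gigabytes through an uplink_mbps link.

    Assumes decimal units (1 GB = 1000 MB) and ignores protocol overhead,
    so real-world uploads will be somewhat slower than this.
    """
    size_megabits = size_gb * 1000 * 8      # gigabytes -> megabits
    seconds = size_megabits / uplink_mbps   # megabits / (megabits per second)
    return seconds / 3600


if __name__ == "__main__":
    # The example from the video: 5 GB of footage over a 2 Mbps uplink.
    print(f"{upload_hours(5, 2):.1f} hours")
```

So five gigs at two megabits per second works out to a little over five and a half hours before you have edited a single frame, which is where the "about six hours" comes from.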
So it's actually an internet problem. The internet, needless to say, is there, but my contention would be that we don't yet have the universal, high-caliber connectivity needed to really take cloud computing to the next level. I'm certain it's going to get there, by the way. Here's my prediction: in five years there's going to be almost nothing that actually lives in local software. There's this concept called edge computing that never really got a lot of traction. The idea was that computers would be super lightweight, basically terminals, almost like a Chromebook, with very, very lightweight operating systems. Almost like what I've been using for the last 10 years, actually, which is LXDE Ubuntu. It's a really lightweight Linux version, and I do basically nothing on it. I just use it for video editing and Google Chrome, and that's all I do at my desktop. Everything else, I'm creating data in the cloud.
Anyway, where were we? Your average person these days creates data in the cloud from the moment they get up in the morning, passively or actively, right? By passively, I mean that you open your Gmail and you start getting emails from your boss, from your co-workers, whatever, with attachments. Let's say some of that data might be really important, so that's data you'd ideally want to be backing up.
Let's go to active data creation. I'm going to pause momentarily to actually do the backup and beer element of this. Heineken at the end of a long and sweaty conference day in Tel Aviv, nothing better.
Let's talk about data creation now. So you respond to your co-worker's email. If we break that down from a technical standpoint, let's say you're not doing anything fancy, you're just using the stock Gmail app, right? The stock Google Mail app on an Android phone, and you're writing an email. Now, what's just happened on a technical level? Well, you've created some data. So where was that data created?
To the best of my understanding, the Gmail client, or really any webmail client, doesn't work like the desktop clients. Those still exist, but not so many people use them: Mozilla Thunderbird on Ubuntu, and I even forget what the Windows one is called, it's been so long since I used it. If you remember those, you did your send and receive and they pulled stuff down from the server onto your local computer, and you could actually see the directory where all the email was stored. In webmail, that stuff is native to the cloud. Apart from whatever the app caches, it's not really living on your local machine at all.
So when you open your Google Mail and you read an email, let's just keep it simple here, and you hit reply and you write some text, that email client, the Gmail client for Android, is just a front end to the web service. You're creating your data there. So where is that data being created? Directly in the cloud. It's a cloud-native application. You're not creating it locally and then bouncing it up to the cloud with a local copy left behind. You're creating it straight in the cloud, and the recipient reading it on their Gmail is reading it in the cloud. You're pushing some data up to Google's cloud by sending that little email. It stays on Google's cloud and the recipient just reads the data live in the cloud. So that's cloud-native data.
Now, that's a really simple example, an email, and it can be taken much, much further. So you send your email, then you go on to Reddit and you comment on a few Reddit posts, or you put up a couple of posts of your own. What's happening when you go on Reddit? Well, let's say you're accessing Reddit from the desktop. You go onto Google Chrome, you go to Reddit.com, you click create post, you enter your post, you hit post. What's just happened there? You've created data that's being stored wherever Reddit is hosted.
We could go on like this, talk about Facebook, LinkedIn, blah, blah, blah. Now, the problem that occurs is this. You're going about your day, you're reading emails, responding to emails, you're posting a bit on Reddit. What else might your average internet user be doing in a day in terms of data creation? I don't know. You post a YouTube video, you leave a comment on someone's YouTube video. You go onto Facebook, you post in a Facebook group, you tweet, you tweet photographs and then maybe you delete them from your phone. So we're going through our day as your average consumer and we're actually continuously creating data, if you think about it. If you don't use a computer for a day, which is totally reasonable, you go on a little digital fast and you leave your smartphone at home, even then I would argue the pool of data in the cloud that you wish to protect might still be growing completely passively. The example I used before was your colleagues emailing you stuff that you'd want to be backed up.
Now, where is the problem with this cloud-native thing from a backup perspective? The problem is that the cloud is not backup. That's your primary data source. The cloud is amazing. Cloud computing is fantastic. As a long-term Linux user, I've been a cloud supporter from day one, because back when Linux compatibility wasn't great, the cloud was great. It doesn't matter whether your co-workers are on Mac or on Windows or also on Linux, everyone can work in something like Google Workspace. You don't have to worry about the operating system for each person. So the cloud's amazing. This isn't critical of the cloud at all. To the contrary. This is just saying that the cloud is not backup and the cloud is not enough. When we talk about backup, and I've talked about this before, the long-standing best practice recommendation is 3-2-1.
The 3-2-1 rule means you want three copies of your data, that's the original data and two copies, on two different storage media, with one of those off-site. Even though I'm sure I've covered it already in a ton of videos, why does the 3-2-1 rule exist? Why do you need two different backup copies? Well, you need one on-site copy.
So let's just go back to the cloud. You have a ton of data in your Google Drive. Now why is the cloud not backup? Because that's the most common objection, right? I don't really talk about backup to anyone in real life except my long-suffering wife, who has been subjected to probably way too many hours of me mumbling about backup. But if I told my friends, guys, I'm burning M-Discs, this crazy storage technology, and they asked why, and I said because I want to back up my Google Drive, they'd be like, why would you need to back up your Google Drive? Your Google Drive's not going anywhere. That's what people think, and it's a problem, because it's not correct.
There are a few things that could happen to a Google Drive. Firstly, Google's a SaaS provider, right? They're not a benevolent entity that has your data. I'm trying not to say this like a conspiracy theorist, because that's not the point at all. The point is that Google is a third-party company. They're holding your data. And if it's a premium product like Google Workspace, they'll hold your data as long as you pay them to hold your data, right? So that's the first thing. It's an external party. And yeah, Google's not going to vanish from the earth tomorrow along with your data, but a smaller SaaS provider? Crazier things have happened. There have actually been some pretty big data losses. It has happened.
There are a few other things that can happen. You can find yourself in vendor lock-in. You can have a bunch of your data in a cloud, this happened to me recently, and the cloud provider can move the goalposts. They say: well, your 15,000 that were free?
Tomorrow that's going to cost you $10,000 a month, and you're like, oh, what do I do? I don't have 10,000 bucks. So there you go. Your data was in a cloud. You thought your data was safe. Suddenly, you may not be able to afford having your data in the cloud.
What else can happen? Here's another, perhaps more credible scenario: ransomware. Increasingly, cloud services are interconnected. Everything's integrating with every other part of the cloud, which is, again, very good. But from a cybersecurity standpoint it does pose a risk. Let's say you integrate your Google Workspace with a SaaS app that turns rogue and has read and write permissions. It could wipe your inbox, calendar, YouTube, all of them, right? That's not unprecedented. That is actually a real threat vector.
What else can happen? User error. And this is the thing. People assume, well, if I delete my Gmail data, Google has a backup. No. If you read the fine print of AWS or B2 or Google, any major cloud provider, you will see that security and backup are the user's responsibility. And I've literally seen it happen at a company I once worked at. Excuse me, doing a beer and backup thing is probably not ideal when I'm suffering a bit from bloating, as has happened since my gallbladder surgery. Linux geeks are very familiar with the joke that goes, oh, I just did an rm -rf /*, whatever it is exactly, the command that deletes an entire file system. The intern types this at the root of the server, and your server is completely finished, right? That can happen in Google Drive. I've literally seen it happen. I've seen someone who wasn't so tech savvy, let's just say, accidentally delete the company Google Drive, and then someone deleted that from the trash, right?
And that trash is literally the safeguard. There is no other safeguard if you use Google out of the box. For Google Mail or Google Calendar, say, you can create your own backups, 100%, though restoring them is another issue. But that's what you get. That's your safeguard. It's totally possible that someone will delete all of the company Google Drive, and someone goes into the trash and says, oh, let me try restore. Oh, shit, I hit the empty trash button. Boom. Two seconds later, your entire company's drive is gone. And you'll contact Google and say, hi, you have a backup of my Google Drive, right? And they'll be like, nope, we do not have a backup. They might have a backup, but there's a difference between backup and restore. They may have a backup that they will not restore for your average consumer. Maybe for an enterprise they'll have some kind of backup plan. But you say, please restore, and they'll tell you to go jump in a lake.
So those are the reasons why the cloud isn't backup. There's nothing wrong with the cloud. It's just one data source. The cloud is someone else's computer. From a backup standpoint, the cloud is just about as good as having all your data on a NAS in your house, right? The cloud has redundancy, it's professionally managed, beautiful, beautiful, beautiful. But it doesn't equal backup, because backup involves the risk protection measure of having your data in one place and then having it in two other places. One of them is off-site, and why is one off-site? Because if your house burns down, your primary data and your on-site backup are both good for nothing. So you always want to have that fail-safe: we have a backup here, and just in case a tornado strikes or there's some other disaster and our on-site copy and our primary are both destroyed, well, we have that off-site backup to fall back on.
I just noticed my levels are a little bit high. I saw some clipping on my audio interface. So I'll try to stop shouting here.
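If you like seeing rules written down as checks, the 3-2-1 rule can be sketched in a few lines. This is only an illustration, not any product's logic: the `Copy` class, the `check_321` function and the example inventories are all made up, and "off-site" is modeled as "at least one copy lives somewhere other than the primary".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "cloud", "hdd", "optical"
    location: str  # where the copy physically or logically lives


def check_321(copies):
    """Return a list of 3-2-1 violations. The first copy is the primary."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 storage media")
    primary_location = copies[0].location if copies else None
    if not any(c.location != primary_location for c in copies[1:]):
        problems.append("no copy away from the primary's location")
    return problems


# "It's in my Google Drive, it's fine": one copy, one medium, one place.
cloud_only = [Copy("Google Drive", "cloud", "google")]

# Primary in the cloud, a copy on a home NAS, an M-Disc kept elsewhere.
three_two_one = [
    Copy("Google Drive", "cloud", "google"),
    Copy("NAS", "hdd", "home"),
    Copy("M-Disc", "optical", "relative's house"),
]

if __name__ == "__main__":
    print(check_321(cloud_only))
    print(check_321(three_two_one))
```

The cloud-only setup fails all three checks, however redundant the provider's own infrastructure is, which is the whole point of the argument above.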
Where were we? So yeah, that's why the cloud is not backup. Now let's roll back a bit to the start of this talk, when I was talking about how much data your average consumer creates. Not even a business user, but also a business user. From the moment they get up to the moment they go to sleep, they're sending emails, they're posting on their social media, they're tweeting, they're editing, they're YouTubing, blah, blah, blah, the whole thing goes on, right? Now how much of that data can be protected from a 3-2-1 backup compliance standpoint? I'm just going to pause here for a moment. The answer is very, very little.
There are enterprise tools intended for businesses to back up Google Drive. There are actually even consumer tools. But take Google. Google is a big ecosystem, right? Even if you're your average user and you're not on Workspace, you're on free Google, you just have Daniel at gmail.com, whatever, you start creating data in the Google pool. You start labeling your friend's house in Google Maps. You create a little list on Google Maps. You create a custom Google Map. You start putting stuff in your Google Drive. You put up a video to YouTube. If you look at the Google ecosystem or space, whatever you want to call it, there's actually a bunch of products there. It's very far reaching, and you may forget that, oh yeah, Google Maps is part of my Google footprint. It's all connected to that Gmail account, right?
So even products like Synology's Cloud Sync, which is a nice, decent product, and there's a different product for Google Workspace for paid Google users, only get kind of the core of that stuff. The stuff that your average business will care about: Google Drive, email. I don't think email is even there, actually. So in a normal, functional backup strategy, you want a few things. You want 3-2-1.
You want automation, because no one really wants to be thinking about backup. You'd rather be doing the beer aspect of this than the backup aspect, or most normal, rational people would. So you want it to be automated, you want it to be 3-2-1 compliant, and you want it to be either differential or incremental.
There are three different backup approaches. You can do a full backup, which means that every time you take a backup, you duplicate the whole shebang, right? That would be like Google Takeout, if anyone's familiar with that. It literally just pulls down the whole of your Google data, and that is very inefficient. I'm going to turn down the output level a tiny bit on my interface. When you're talking about a big data pool, at enterprise scale, a full backup just becomes a pain in the butt to store.
So, the good backup approaches. You have differential, which takes a full backup, then looks at the changes since that point in time, the positive and the negative, the creations and the deletions and the moving around of the file structure, registers that, and creates a little file that's your differential backup. And then you have incremental, which is a series of little slices. Each time, it writes the changes to the file structure since the last incremental, and then if you need to restore, you go like this, if I can use my hands: you put the slices together. With differential the slices are bigger, if you want to think about it like a pizza, right?
You want to back up your pizza. A full backup would mean we copy the pizza, the whole thing, but every time you back up, you end up with a whole other pizza. Differential backup would mean that you took a caper off the pizza since the first time you backed it up, so we record that, and to restore you take a caper off your backup pizza.
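For anyone who prefers code to capers, here is a toy sketch of how the three approaches differ. It is not how any real backup tool stores data: snapshots are just dictionaries mapping a file path to a version label, and the names `diff` and `apply_change` are invented for the example.

```python
def diff(old, new):
    """Changes that turn snapshot `old` into snapshot `new`."""
    return {
        # files that were created or modified
        "upsert": {path: v for path, v in new.items() if old.get(path) != v},
        # files that existed before and are now gone
        "delete": sorted(path for path in old if path not in new),
    }


def apply_change(snapshot, change):
    """Return a new snapshot with one recorded change set applied."""
    result = dict(snapshot)
    result.update(change["upsert"])
    for path in change["delete"]:
        result.pop(path, None)
    return result


day0 = {"a.txt": "v1", "b.txt": "v1"}   # state when the full backup is taken
day1 = {"a.txt": "v2", "b.txt": "v1"}   # a.txt edited
day2 = {"a.txt": "v2", "c.txt": "v1"}   # b.txt deleted, c.txt created

full = dict(day0)

# Differential: every backup records all changes since the FULL backup.
differentials = [diff(day0, day1), diff(day0, day2)]

# Incremental: every backup records changes since the PREVIOUS backup.
incrementals = [diff(day0, day1), diff(day1, day2)]

# Restoring from differential needs the full plus only the latest slice.
restored_differential = apply_change(full, differentials[-1])

# Restoring from incremental needs the full plus every slice, in order.
restored_incremental = full
for change in incrementals:
    restored_incremental = apply_change(restored_incremental, change)

assert restored_differential == day2
assert restored_incremental == day2
```

Notice the trade-off: the second differential slice repeats the a.txt change, so differential slices grow over time, while the incremental slices stay small but you need every one of them to restore.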
Incremental backup would be: you took a caper and an olive off your pizza. In the first incremental backup, we record that the olive was taken off the pizza. In the second one, we record that the caper was taken off the pizza. And if you screw up your original pizza, we rebuild it from the backup pizza plus those recorded changes. I never thought I'd explain backup by reference to a pizza and capers, but there you go, guys. What a day this has been.
So anyway, even with Google, only a few backup products are available. And by the way, I have to make a confession. I had an idea a few years ago to actually start a startup here in Israel, an actual consumer backup startup that would be the Swiss army knife of consumer backup. It would try to integrate with Twitter, Reddit, blah, blah, blah, everything your average person uses. Asana, project management, Monday.com, Todoist, Wunderlist, it goes on and on. If you actually stopped and thought about how much SaaS you use in a day, you'd probably come up with 50. I would imagine a lot of people would come up with 20 or 30 products. As a freelancer, I use my accounting platform, my receipt generation platform, my billing platform. It's all SaaS, it's all living in the cloud.
So yeah, I had the idea, and then I consulted with some business coaches on it and realized it was actually not a good business idea. A, running a backup business would probably be pretty boring. B, there just isn't demand, because most consumers are not as paranoid as me, or not as backup-woke as me. They see that something backs up to the cloud. Beautiful. I don't need to worry about it. Most people don't care about backup. Enterprises care about backup. So the consumer backup space has been kind of left untouched.
What do we have in consumer backup? You have the big data pools: Google Workspace, Microsoft 365, whatever it's called. I don't use it.
So I'm not sure what it's called, but you know, Microsoft's competing cloud offering. Some companies have developed solutions that will do some kind of backup on these. But again, at least speaking about Google, they are, from my standpoint, not sufficient. They do not capture the full typical imprint of a user on Google's cloud.
Now let's talk about more sources instead of repeating names. Medium.com. I used to blog a lot on Medium.com, right? That's the kind of place where nobody really expects people to be backing up their stuff. Medium is a SaaS. It's a CMS that lives in the cloud, and you just go in and create your blog. So what do you do? You're putting text and images up to Medium's cloud. Now, Medium is typical of what I'd call a backup provider, sorry, a SaaS provider with almost no useful backup functionality. And Medium.com would say, hey, what do you mean? We have backup. We have a data export functionality. Users can export their own data.
Sorry, that's not backup. Firstly, if we're talking in backup terms, that's a full backup. That's the least useful kind from a backup standpoint. It's very inefficient. It means every time I want to back up Medium.com, I need to pull everything down, everything down, everything down, each time. Secondly, if you actually go into the export, and I've talked about this before, I think I posted about it on GitHub, which is more SaaS again, code repositories, stuff like that. If you actually look at what you get out of Medium.com, you unzip, you take a look, and most people, again, would stop here. They'd say, oh, great, I've got some kind of a backup, I don't need to worry about Medium. But actually go into the HTML files. They don't give you the images. It's HTML files with hyperlinks to the images that you've uploaded to their cloud.
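As a rough sketch of how you could at least find out which images an export like that is missing, the snippet below walks a folder of HTML files and lists every image URL they point at. The assumptions are mine: that the export is a directory of `.html` files and that images appear as ordinary `<img src="...">` tags. The class and function names are invented, and nothing here is an official Medium tool or API.

```python
from html.parser import HTMLParser
from pathlib import Path


class ImageLinkCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def image_urls_in_html(html):
    """Return the image URLs referenced by one HTML document."""
    collector = ImageLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


def image_urls_in_export(export_dir):
    """Map each .html file under export_dir to the image URLs it links to."""
    return {
        str(path): image_urls_in_html(path.read_text(encoding="utf-8"))
        for path in sorted(Path(export_dir).rglob("*.html"))
    }


if __name__ == "__main__":
    sample = (
        '<p>My post</p><img src="https://example.com/a.png">'
        '<img alt="x" src="https://example.com/b.jpg" />'
    )
    print(image_urls_in_html(sample))
```

From a list like that you could fetch each file with something like `urllib.request.urlretrieve` and store it next to the HTML, though you'd want to check the service's terms before pointing a script at it.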
You don't get those in the download. So unless you're manually going through each link, one by one by one, pulling the images out of the cloud, and I'm sure there's some way you could do it programmatically, they're not in the backup archive.
So the picture I've found is that these data export functionalities are probably not even intended for backup. They're probably intended for people de-platforming, leaving a SaaS to go to a different offering. These are not backup functionalities. They really only exist for GDPR compliance, or compliance with other data protection instruments. If you look at Reddit's data export approach, you can Google "how do I back up my Reddit account?" and you'll finally get to some knowledge article in Reddit's Zendesk, and you'll see there's a data request form. You need to actually manually complete the form and get it to someone at Reddit. I don't know if it's automated on their side, but let's just say it's not. Someone at Reddit needs to actually pull out your user archive. That's totally un-backup-friendly. A, it's not automatic from the user side. You have to manually complete a form every time, so it can't be automated. B, it's a full backup, as we've discussed, which is not great. And that's probably it.
So, and I realize I'm coming up to 30 minutes here, so I'll try to wrap up this beer so that I can wrap up the backup banter. The system we have today: put these two facts together. We're creating more data in the cloud, or most people are, I would contend. I just got some message in OBS about some encoding being overloaded. Never seen that before. Hopefully the video is coming through okay. We're creating more data in the cloud than we're creating on our local systems.
And we have very little in the way of actual, viable backup for consumer SaaS out there. I should actually wrap this video up, because I'm now worried that it's going to be completely unusable. Most people now are committing the majority of their data to the cloud every single day, and there's no way for them to actually back it up in a viable way. Unless, and this is what I used to do, you set yourself reminders: hey, on the 30th of every month, I'm going to download my Reddit archive, my Twitter archive, my Facebook archive, my LinkedIn archive. You go through service by service by service, completing forms, using data export tools, all manually. Then that goes into your backup approach: you store that on your NAS, and then you off-site that. So you could do that. I just gave up on doing it because it was distracting. And at some point you just have to say, it does not exist. Right now, at the time of writing, no one's come up with a really, really robust consumer SaaS backup tool. And if you compare this to, excuse me, the enterprise backup ecosystem, it just does not compare.
Why do I care so much about backup? I'm going to wrap up with this. My wife asked me: why is it that us backup folk can only open up about our backup love over beer, like Curtis and Prasanna? I think it's because we're deeply ashamed of our love of backup. We realize it's the most pathetic area of tech. And yet we know it's super important. For me, and I suspect for a lot of backup people, it's probably a combination of two things. A, being into tech. I think that's a given if you care about backup. B, probably having a predisposition to anxiety, because one of the features of anxiety is that we can't tolerate uncertainty. We crave absolute facts.
And in backup, there are no absolute facts. It's about risk mitigation. And the cloud, from a backup standpoint, is today actually a pretty risky environment for consumers. You're just kind of trusting that all your SaaS providers are sort of doing backup. You hope they're doing backup. You hope that if something goes wrong, you can get your data out. You don't know. There's no way you can actually create a robust backup plan. So if you actually learn about backup, you'll drive yourself crazy thinking about how badly protected the consumer SaaS footprint is.
I want to finish on a personal note. Why do I care about backup? I didn't really finish that thought. For me, it's that I've had a data loss disaster before. I've experienced those, being a long-term sort of DIY techie, you know, hosting my own websites, building my own computers. I've seen what happens when you spend a year tinkering with a Linux operating system, only for the hard drives to catastrophically fail. I've actually experienced this. The last year of your coding goes to waste. And then you say, I'd better learn about backup. So that's number one. That's on the negative side.
On the positive side, my late grandfather, alav hashalom, was the custodian of the Jewish records of the synagogue in Cork in Ireland. As part of that job or function, he held on to data, literally physical books from literally the 19th century, like burial records, wedding records. And we got them digitized. Anyone who's done some kind of archival process for a museum or a religious organization or a government department will know there's an amazing feeling when it comes to actually storing historical data. It's a very satisfying feeling, taking a chunk of history and making sure that it's robust, because paper in some ways is actually more robust than bits and bytes. Stored digital data can be surprisingly brittle.
I didn't talk today about bit rot and cold data storage, but that's a problem I've talked about in my M-Disc videos. And sometimes you do manage to do it right. You learn about backup and storage, and you get a request from someone looking for the burial record or the wedding record of their great-great ancestor, and you're able to pull that data because you protected it. You've digitized those historical records, you've backed them up, they're safe, as safe as safe can be. It's an amazing feeling.
So I've both lost data and, just through personal tech experiences, come to appreciate the value of data preservation and digital archiving and all that good stuff. And I think it's a really important field. Again, maybe it's just because I'm anxious, but it's a part of the tech world that's always attracted me, albeit, yeah, it's not glamorous, it's not cool. But it's so important, and arguably never more so than in the cloud era. The cloud era is amazing. It's made all our lives better. But far too many people, and I include IT professionals in this bracket, just haven't thought about this aspect of the cloud. We mistake redundancy for a valid backup approach. We say the data is available, it's great, my Google Drive's always been there, therefore that is backup. And it's not.
So that's where I will leave it for my inaugural Backup and Beer episode. I almost made it through my beer. I might do another one of these if there's any viewer interest, because I probably do have more thoughts about backup just kind of tinkering around in the back of my mind, and they'll come out in the next Beer and Backup episode, if they are to come out at all. Thank you guys for watching this video blog, or, I don't know, whatever this was, these bits and bytes that have floated onto the internet.
And if you want to get more videos, subscribe. But if you do, subscribe using not the "all" option but the recommended one, because my videos cover a boatload of topics. If you want the YouTube algorithm to help you see the content that's relevant to you, that will work well. Thank you guys for watching, and until the next video.