Hi, this is your host, Swapnil Bhartiya, and welcome to TFiR Newsroom. Today we have with us once again Liran Haimovitch, CTO and co-founder of Rookout. Liran, it's great to have you on the show.
It's great being back here. I love this show.
I love having you as well. This month, you folks announced what you call the fourth pillar of observability: you added snapshots. I do, of course, want to talk about why you did that and why you think it's the fourth pillar. But before we go there, just give us a quick update on what you announced earlier this month.
So earlier this month, we announced snapshots as a standalone feature, as the fourth pillar of observability, essentially allowing software engineers to capture the state of their application with a single line of code or a single click of the mouse, to see all the variable values, all the stack traces, the global context, everything with a single operation, rather than having to tediously list the stack trace and all the variables and exactly how to dump each and every one of them. We capture everything with the utmost accuracy with a single operation.
But since you're calling it the fourth pillar: of course, metrics are there, logs are there, traces are there. And if you look at the whole CNCF landscape, projects keep evolving and new projects keep coming in. So when you talk about snapshots and the need for a fourth pillar, was it the case from the very beginning that we needed something else, or is it that now, because of the evolution of workloads, we need snapshots in addition to metrics, logs and traces?
So I think the answer is two-sided, and you're actually correct on both approaches. On the one hand, snapshots in various forms have been with us for decades.
I mean, if you look at the Linux operating system, if you look at the Windows operating systems, if you look at other popular operating systems, they all had forms of snapshots. They would usually take a core dump when a process crashes, or a kernel dump when the operating system crashes. So it's actually a very common technique if you look back at more traditional computing, where you want to capture everything within a certain process, within a certain application, to investigate and fix complex bugs and other troublesome issues. Snapshots are also very common for analyzing memory issues: taking heap dumps, taking memory dumps, figuring out what's taking memory, and so on and so forth. So on the one hand, snapshots have been with us in one form or another for decades. On the other hand, if you look at the modern philosophy of observability, everything started with logs, which have been with us since the 50s, or God knows when. It all started with hello world applications and everything evolved from there. Then, over the past two or three decades, we figured out that logs don't scale too well to some more advanced use cases. And so we invented metrics and later traces to help us get a bigger picture, to be able to zoom out on complex and dynamic environments and figure out what's going on. Now the thing is, this zoom-out approach has been led by SREs, first and foremost, who care about the big picture; that's their job, knowing whether the system is up or down. But today, as those systems are becoming more complex and as we're shifting left and handing more and more responsibility to developers, we also need to zoom in from logs. We can't just keep zooming out, because developers need accuracy and fidelity. They need to know exactly what's going on. And snapshots are the answer for that. Snapshots are the zoom-in version of logs, allowing you to get a lot more information about the single event you care about.
So snapshots are essentially empowering developers, giving them more power, more control, more tools over the whole stack that they're in. Now, can you go a bit deeper into how it works, once again from a developer's perspective?
Actually, Rookout offers multiple approaches for taking snapshots. Let's take the most basic approach, which is using code. Today, let's say you're writing some piece of code and you're seeing some interesting event, maybe some error condition you want to document. So you're probably going to write a log line. The thing is, that log line is gonna say, I got an error. It might include the error itself, because that's very easy to write out. But then you would be struggling to know what more to share. Do I want to share the inputs of the function that failed? Do I want to share network information? Was it a network call? Do I want to share anything from the database? Do I have some internal context, such as a dictionary or a cache, whose content I want to share? And then for every one of those where you say, yes, I want to share it, you're meeting a hell of a lot of challenges. How do I share this? What's the correct encoding? What's the correct format? How do I access it? And even worse, will accessing and trying to log this variable negatively impact my application? I mean, I run my own podcast, and one of my first guests mentioned that they actually crashed the entire system with bad log lines. Now, the way most software engineers deal with this dilemma is that they log very little, because they don't want to spend a lot of time on that log. They don't want to take a lot of risk. They will log the error message and that's about it. Now, when somebody later goes to that place, the error message is going to be very lackluster, very minimal. It's probably not going to solve all of their problems. It probably won't solve much at all.
And then you start iteratively collecting more and more and more, trying to figure out what's going wrong. Instead, snapshots just allow you to write in the code, "take a snapshot here", and you instantly get the full state of the application. All the local variables, the stack trace, the logs, everything is very accurately, very efficiently, very safely stringified and written to whatever format you want, so that you can later load it and easily observe it. So that's the first use case, which is very much in line with how you use logs and other forms today.
We actually ran a whole series on DevOps and platform engineering. And when I listen to you, or when we look at the whole evolution of cloud native, we did notice that a lot of focus, a lot of tools and technologies are being developed to cater to DevOps and SREs. But now we are hearing a lot about bringing back that developer experience. And when I'm listening to you, it also seems to be about empowering developers. If we just zoom in on observability itself, do you see that the tools that were created, the whole approach that the observability ecosystem is taking, are not really geared toward developers, and that you folks or the community are now looking at doing things to give developers more power? If you look at snapshots, do you think this is the direction we are moving in?
I think so. If you look even at the very term observability, it was coined by distributed tracing companies. Now, I love those companies. I use distributed tracing myself, but distributed tracing is damn hard to use. It's damn hard to implement. It's a very complex beast, and most individual contributors out there are not distributed tracing experts. In fact, they can barely use distributed tracing.
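Editor's sketch: returning to the code-level snapshot described above (one call that captures the local variables and the stack trace), the idea might look roughly like the following in Python. This is a minimal illustration using only the standard library; `take_snapshot` and `charge` are hypothetical names invented for this example and are not Rookout's API.

```python
import inspect
import json
import traceback


def take_snapshot(label):
    """Capture the caller's local variables and call stack in a single call.

    Hypothetical helper for illustration only; not a real library API.
    """
    caller = inspect.currentframe().f_back
    try:
        return {
            "label": label,
            "function": caller.f_code.co_name,
            "line": caller.f_lineno,
            # repr() sidesteps the "what encoding, what format" question
            # by stringifying every value the same way.
            "locals": {name: repr(value) for name, value in caller.f_locals.items()},
            # Outermost frame first, the snapshotting function last.
            "stack": [f"{f.name}:{f.lineno}" for f in traceback.extract_stack(caller)],
        }
    finally:
        del caller  # avoid keeping a reference cycle through the frame


def charge(customer_id, amount):
    cache = {"retries": 2}
    if amount <= 0:
        # One line instead of hand-picking which variables to log.
        return take_snapshot("invalid amount")
    return None


snapshot = charge("c-42", -5)
print(json.dumps(snapshot, indent=2))
```

The point of the sketch is the call site: the developer writes one line and does not decide up front which inputs or internal context to dump.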
Now, that's okay, because distributed tracing today is mostly used by centralized observability teams and SRE teams who are working to monitor the system as a whole and troubleshoot complex performance issues and reliability issues in multi-region, multi-zone environments. Now, that's a great tool for solving those issues, but many companies don't have those issues, and even for the companies that do have them, those are one in a thousand. Whereas every engineer I know fixes bugs every single day, and they can't do that using overly complex tools that solve overly complex problems they don't have. Which is exactly why snapshots allow software developers to zoom in, focus on their code, or on whatever code they're trying to investigate and fix. It helps with troubleshooting, it helps with onboarding, it helps with training, it helps with design, it helps with new features, and it helps with the deprecation of features. It gives you the high fidelity of information that you're used to as a developer. Essentially, it brings you back to the days of the local debugger, except you can use it in the cloud, in cloud native environments. And in fact, there's a whole tirade I can give you about how cloud native computing has taken away the debugger, or the traditional debugger, from developers. They can no longer use it because they can't reproduce things locally, almost at all.
Do you think that we are still in a phase, when you're talking about a fourth pillar, where this evolution will continue and there are still more gaps that we need to fill to bring more empowerment, more tools to developers? Or do you think that this was the last one, and now developers have all the tools they need? Do you see that after a while, you'll talk about a fifth pillar as well?
So that's a good question. And it's really hard to predict the future, especially with the world moving so fast.
I know I've heard more than one person ask recently how I see observability for AI, which I haven't actually formed an opinion on yet. So I need to think about that. No big answers from me just yet. But when you think of the big upcoming terms you should know about: AI, and maybe crypto actually does break into the world. We're seeing a lot of new technologies on the rise. We're seeing a lot of new potential paradigm shifts in how we do software engineering. And it's hard to say that the way we're doing observability today is gonna be the best way forward for years and years. Even if you think about observability, the term is what, 10 years old, maybe 15 at the most. It wasn't here 20 years ago; it might not be here 20 years from now.
Because you interact with a lot of customers, you see that there are certain pain points that developers still face. Sometimes they're scattered all over the place, and sometimes you see, hey, these are patterns. So this is a pain point when it comes to observability or giving them full visibility; okay, this is one more thing we need to do. And that might become a category in itself, just like snapshots. Do you see anything like that? Are there still some pain points over there?
So working with our customer base, I think it's very clear that the biggest gap is around engineers needing to zoom in. And we're seeing that snapshotting tooling as a whole, including the dynamic nature of snapshotting with production-debugging-oriented tools, the static nature of snapshotting with code, and also a lot of automation around smart snapshots and the like, all of those create a very comprehensive picture that allows software engineers to see very well into their code and troubleshoot everything that goes wrong or is unexpected in any way, shape or form.
And I think, from what I'm seeing today, this is the key missing piece. And I'm not seeing anything just around the bend waiting for us.
As these technologies evolve, of course, sometimes they fill some gaps, but sometimes they also overlap, or sometimes they replace other things. So when we look at snapshots, which capture all the data that is needed when something unexpected happens, will they make other things obsolete, like metrics, logs and traces, or do they just complement them?
In a way it complements them, in a way it replaces them. Just like we've seen the use of logs decrease with metrics and traces carrying some of the burden, I also think we're gonna be seeing the use of logs decrease in part due to snapshots in particular, and also dynamic observability, or developer-led observability, in general. I think today much of the emphasis we put on logging is fear-driven. It's all about fear of missing out on data. So I think we're definitely gonna be seeing a decrease in logging. And potentially just as important, we're gonna be seeing more and more logs being shifted from hot storage for day-to-day use to cold storage for audit and backup purposes, which is way, way cheaper. And so we're definitely seeing organizations learn to rely much less on logs. Besides the huge cost of logs, where 99% of the logs are never used, logs are actually a funky way of working, because you're working backwards. You're starting from a pile of data and trying to search it for pieces that may or may not be relevant to what you're asking, and that are there simply because they happen to be there, by mistake or by accident. Using a more direct approach of specifying what you need, using dynamic observability and the like, is actually a much better approach, because you specify what you need and then you have to do a lot less hoop jumping once you get the data to analyze it and understand what's going on.
So all in all, yeah, I definitely see a huge decrease in logging and logging costs in general in our future.
And since you brought up the point of cost, I also want to quickly talk about that. Companies are becoming sensitive when it comes to cost, and cloud costs are also growing. How do snapshots affect cost, save time, and make things easier for developers as well?
So this entire logging process is incredibly expensive. It's expensive in the amount of engineering effort you spend on setting the logs up in the first place. It's expensive in the amount of effort you spend optimizing those logs, adding all those variables. It's expensive in how much you're paying the log providers and in how you then work to reduce logs. The way we see it, observability needs to be way more reactive. Some events are very important. You need to keep them forever and ever, mostly audit logs for security, for compliance and so on. But for developers, we see that it's way more effective to let them specify in real time the data they need and get what they need, when they need it, which is less than 1% of the data you're collecting today, rather than just trying to collect everything and hoping the answer is gonna be somewhere in that haystack. In taking that approach, you're getting way more accurate pictures, but only the pictures you care about. So you're getting high-fidelity data when you need it instead of low-fidelity data all the time, which has a terrible signal-to-noise ratio, is much harder to use, and you're still paying tons for it.
Of course, snapshots were announced earlier this month. Depending on what metrics you have, can you talk about what kind of adoption you're seeing, or what the use cases are where snapshots make the most sense?
The thing is, we started with a very dynamic approach to instrumentation.
We started early on, four years ago, with a tool that allows you to set what we call a non-breaking breakpoint anywhere in the code. And then you could turn that non-breaking breakpoint into a snapshot, a log, a metric, or a span. Now at the time, we didn't even call it a snapshot. We just said, you know, set a non-breaking breakpoint here and get the data. And what we expected was that users would use all forms of it. They would use it to create snapshots, which weren't named at the time, logs, metrics, traces. But what we saw was that, as much as those other approaches were being used, and people did inject metrics on the fly and inject logs on the fly, that use paled in comparison to how much they used snapshots. It was so much more convenient to capture everything and then kind of sort it out later on in the UI, rather than trying to squeeze it into a text string and then analyze that. And we were astonished when we looked: over 90% of the usage ended up being in those snapshots. And we went back to the users and asked them: you can create snapshots and analyze them, you can create log lines and analyze them, you can create new metrics. Why do you keep sticking with snapshots? And it turned out that for our users, the developers, snapshots were so much more powerful, so much more beneficial, that that's what they focused on, which kind of brought us to the conclusion that we need to make snapshots their own thing. And also, they asked us: let us add it statically, not just dynamically in real time. Ahead of time, we know of cases where we would love to have a snapshot if this ever happened. So let me code this in, just like I would code in a log line. One other great use case we have added is the ability for the agent and the application to automatically take snapshots when interesting things happen. Our key use case for that early on is taking snapshots whenever a test fails.
So essentially, whenever you would see an assert failure, we would take a snapshot showing you, besides the actual and expected values and the condition that failed within the assertion, all the local variables, the specific version of the code that failed, and everything. And there are so many cases where you see a test fail and it's either very hard to reproduce because it's flaky, or it's very hard to reproduce locally, or it doesn't reproduce locally at all. And just seeing everything combined together, the code that failed with the data, in a single snapshot is so much more powerful than having to dig through the test logs, trying to reproduce, and spending so much time and effort on it. And actually, we're looking for more use cases. We're always on the lookout to hear from our customers and the community about the cases where they would like snapshots to be taken automatically and shared with them, as a notification or otherwise.
Liran, thank you so much for taking the time out today to talk about snapshots, and to share your insights on how this whole observability ecosystem is evolving and how the focus is shifting back to developers. Thanks for sharing those insights. And as usual, I would love to have you back on the show. Thank you.
Thanks. It's been great being here. I hope to see you and maybe some of the audience next week in Amsterdam for KubeCon.