Hello everyone, good afternoon. I hope you enjoyed your lunch, and thank you for coming to this session. My name is Ilya, I'm from Weaveworks, based in London, and today I'd like to talk to you about time-traveling in the universe of microservices and orchestration.

I've been working on cloud native tools since about 2014. As a company we run a commercial SaaS product on Kubernetes and EC2, and I must say we are probably one of the early vendors in this space: we released Weave Net, an overlay network for Docker, back in 2014 when I started, and it has been our flagship open source product since. We also built a commercial SaaS product called Weave Cloud, and time travel is one of the features we have in Weave Cloud. I'll show you a little bit of that at the end.

I'll begin by going back in time and talking about some of the experience I had in the past with Linux systems, and I'll relate it to the present day, when we have containers, and how containers could have come in so useful back then in various cases. Hopefully you'll enjoy it. It won't be too technical a talk, and hopefully it's quite good for an after-lunch talk.

All right, so we start. As I said, I'll start by giving you a little insight into my journey in software. Why do I care about these things? Why do I care to come here and talk to you about this? Hopefully you can make up your own mind by relating to it, or not. Then I'll show you where we are today and all the wonderful things we've been able to do with containers and Weave Cloud, such as time travel. But before I begin, I wanted to bring up the question: why do I care about containers? Why should you care about containers?
I've used Linux since 2002 or 2003, I can't remember for sure, and that's like 12 years, right? I've used it in a work context for about nine years now. I've done various different things, and I'll go through some examples momentarily. I just want to say that throughout this time I learned a lot from the material the community puts out there: blog posts, documentation, etc. I really appreciate this information being given to me for free, and I'm trying to give back to the community by going out to conferences and meetups and talking about these things.

While working with Linux I solved various kinds of problems, but there have always been two predominant themes. The first is packages and dependency management. That was there from day one, except I had no idea what I was doing. I really had no idea. Then I learned a bit more, then I learned that I had learned very little, and then I learned more, and I think I'm getting a fairly good understanding of it. The tools I find most handy are Docker containers and Docker images. That was a big breakthrough.

The second thing is resource management. It's something that didn't hit me right away, but I eventually understood: hey, this app is using too much memory, how do I stop it from doing that? Or, I'm trying to compile something but I'm also trying to write this document, how do I make that happen nicely and reasonably fast? There's no magic there, but container orchestration comes in pretty handy here. I wouldn't trust myself with setting up cgroups or anything like that; I'd rather tell my orchestrator to do that, whether it's Mesos or Kubernetes. However, there's no magic at all. You still have to understand what you are doing.
It just helps a little bit.

So it was around 2005 that I went to university in England and got myself a desktop machine, a normal PC, and poked around for a while. Obviously I installed Linux right away; I had used Linux before that. At some point I got bored of the hardware and looked up what else I could buy, and I got myself a Sun Ultra 5 for 30 pounds on eBay. I liked the desktop a lot. I admired the look and feel of it. I couldn't do much with it, the terminal was pretty terrible, but it had this great look and feel, and I also really liked the keyboard. I couldn't find a photo of it.

What happened is that I went and installed Solaris 8, which is what this box originally came with, and it worked okay, I could get the desktop. After that I learned about Solaris 10, which had just come out around that time, and I learned about containers in Solaris 10, and ZFS and DTrace and all these cool things. I tried to install Solaris 10 and it didn't quite cut it: this machine was too slow for Solaris 10, so I couldn't really use the latest OS on it. So unfortunately I only got the taste for containers from the commentary in videos and the various white papers they had. I couldn't really use containers because I couldn't run Solaris 10 on this machine, and I couldn't use the keyboard with my PC because it had a different cable, but that's a separate story. It served as a good monitor stand for a while, until I didn't have room for it at some point. Anyway, that's how I learned about containers.

After that I was using Gentoo Linux for a number of years. One of the first things you do with Gentoo is "C-H root", or actually you call it "chroot", right? I only learned that later.
I thought it was "C-H root". I didn't know what that meant, but it was definitely a thing. Now that I know about containers and mount namespaces, I understand chroot was like a basic version of that back in the day. So I really had no idea what I was doing here, but I definitely typed that; it was in the docs. Then I learned a bit of bash and customized my prompt and stuff, right? This is what they use, and this is where I learned how to customize the prompt. That was fun.

And then I compiled a lot, like everything. In Gentoo that's what you do: you compile everything from source. So I ended up compiling pretty much every open source project that was somewhat popular, because I was curious. While compiling things a lot, I learned to control the way my compile jobs used resources on the machine. The basic way of doing that was the nice command, and later ionice. They were quite nice, but I later learned that these were really rudimentary and indirect ways of managing resources on your machine. I'm not going to question whether you should be compiling your desktop OS from source or not; that's a separate discussion.

Let's move on. I started hacking on some open source projects, and while folks were releasing tarballs, Debian packages, Red Hat packages, etc.,
I wasn't getting any of those on Gentoo, so I had to compile mine. Well, why not, right? I ended up doing this kind of thing most of the time: I'd put a new version under a prefix with the version number in it, and I would often have to specify a "--with-something" path to tell it where to find some headers or whatever. Then I would install it manually. I started learning these things and grasping how they work, but it was on the very small scale of my desktop.

As I progressed through my university course... I was doing electronics, and we didn't really do programming languages or computer science as such. We did a bit of Python and then jumped straight to C, and not C on Unix or Windows or whatever, but C on microcontrollers. That was quite fun. But what we did at uni was one thing, and what I did at home was all open source. At uni we would use some other compilers, and at home I used GCC, and I picked up some microcontroller projects from SourceForge, etc. Those projects wouldn't normally use Automake or anything like that. You'd just use make and specify CC, and then you'd go and check which flavor of GCC you've got to be using today. And I had multiple of those installed. A package management problem again, right?
So, yeah, it would be something like that, and then sometimes you'd get a linker error, so you'd have to specify LD as well.

Anyway, what I was getting to is that back then I discovered this project on SourceForge called Modules, and it was able to manage different versions of GCC on your machine. It was already quite old at the time. As I've just double-checked before this talk, it turns out the first release of Modules was back in 1998, and it was written in Tcl. I never actually used it; I was just aware of it. I thought it would be something I could use if I had to manage more versions of GCC for more people, but for myself I was just about managing. When I looked it up, I was quite surprised: somebody is actually working on this project right now. They've made a lot of changes recently, and it's alive and well and on GitHub, which was a fun discovery. I wonder why they're not using Docker images. They're using this old Tcl project.

In any case, as things moved on towards the end of my university course, I got a part-time job at a hosting company, which I'll tell you more about in a minute. I was just going to say that while working on my dissertation I learned to use Git. You can still find that online, but that's irrelevant to the topic; it was just a means of coming to the point where I got the job. So I'm working at this hosting company. They do shared hosting.
This is 2007 or 2008 or something like that, 2009 actually. Anyway, it doesn't really matter that much. Shared hosting was still a thing; maybe not so many people used it, but some people did. I'm just going to play this little video.

Essentially we had some of the worst dependency management, package management and config management problems. We didn't have many servers, and there would be different customers running their WordPress and similar apps on different servers. Some of them would require different versions of PHP, and we had to move people around from one server to another to match up with the version of PHP they wanted. That kind of stuff. And don't even get me started on resource management, because these WordPress sites got hacked all the time. So that was a lot of fun.

Anyway, I started googling Linux containers. I remembered Solaris containers, right? And I thought: containers on Linux, is that a thing yet? So I started googling that, and I remember starting to find certain things like OpenVZ, but that was kind of complicated, and it wasn't something I was able to introduce at this company. I was a junior support person, but I was trying to figure out what would be a better solution to this. I didn't even try OpenVZ at the time; it seemed complicated, and I was writing that dissertation anyway.

That job was over at some point, and I got a DevOps job. Fancy title, I still don't know what it means. And again I'm managing packages, dependencies and machine resources most of the time. Some of the things we had to deal with involved... oh, yeah, we used Puppet, for example. At the time I was like, oh yeah, Puppet sounds cool. And I still like the syntax a lot.
I think it's a very nice and expressive syntax. I'm not using Puppet anymore, but I have this slide here because this is actually quite similar to what we do today, except we do it with Kubernetes. Essentially, we were doing this kind of decentralized, masterless Puppet setup, where we do a git pull, or a clone or whatever, then go into that directory and do puppet apply. We did that periodically from a cron job. So essentially we were doing decentralized Puppet, and I'm going to show you later how we do a similar thing with Kubernetes. Similar concept.

One of the first things I discovered was this Ruby Version Manager, RVM. After you install RVM you end up with this sort of thing in your bash profile, right? And after you've done a couple of those things, you end up with the cd command aliased to this function, which does magic whenever you change directories, basically. Which is great. So we managed to rip that out and replace it with this thing called rbenv, which is much simpler and nicer. However, we soon introduced Clojure and Node.js and some Python and some different versions of the JVM: Cassandra was using one while our Clojure app was using a different one. Fun stuff. Containers could definitely have come in useful there.

You probably all know this. Besides that RVM stuff, you still had to do this kind of thing where you use one package manager to get your other package manager, and then use that package manager to get your packages. No, actually, in Ruby you've got three, right?
You'd use your system package manager to install RubyGems; once you've installed RubyGems you install Bundler, and Bundler is like a wrapper on top of gem. Things are not dissimilar in the Python world. Node.js has a little less of it. Anyway, it's much easier once you can encapsulate this in your container image and not have to worry about it. And eventually they'll improve their package managers, supposedly.

We also had this amazing thing called Capistrano. All you had to do was go and run cap deploy production. That's all you had to do. You know what I call it now? It was a great chaos monkey tool, except that you triggered it with a different intent.

Anyway, let's move on. There's this other thing as well, right? There was this git-flow. Somebody was inspired by the git-flow blog post and implemented it in GoCD, if you're familiar with GoCD, and nobody really knew how to use it. Our software wasn't that complicated. Git-flow is designed for folks who do versioned releases and run different versions in different environments and such things. We didn't do any of that and didn't need any of it, so I ripped that out right away. People just weren't aware that this was completely useless complexity that we had.

I removed a bunch of things, and another thing I happened to implement there was this, which was okay. We had this pool of VMs where a developer would go and say "I want a VM", they'd run "vm take", and then they could SSH to it, do some work and throw it away, right? We didn't have Packer. We could have used Packer if we had been aware of it; we weren't. We used VMware. Okay, fine, but this could easily have been replaced with a bunch of Docker containers really, right?
Some time later a colleague of mine asked me to come over to his desk: "I want to check something out." It was the first demo of Docker, on Hacker News. That was pretty amazing, pretty mind-blowing, but at the time I was like: Solaris containers, the things I'd seen with OpenVZ, this is very different. It took me a little time to come to terms with it, but I was definitely impressed with how easy it seemed to be. Later I started learning a bit more about Mesos, CoreOS, Consul, Terraform, all these new tools, and I played around with them. Then I got a new job. I met the RabbitMQ co-founders, who had just started a new company. It was called Zettio at the time; now it's Weaveworks. We released Weave Net; well, they released Weave Net just before I joined. Anyway, I started working on containers. How fun is that?

So, fast forward to the present. What have we got? Let's consider Weave Net, for example. It's an application-oriented overlay network. A Weave Net user doesn't really need to know anything about the underlying technology it uses, VXLAN, IPsec, that kind of stuff. Users don't have to make any changes to their underlying infrastructure to run Weave Net, and they can do things like give each of their apps an identity with its own IP address and use default ports, so they don't have to use port remapping for each instance of different apps, right? We'll get to some of the examples shortly. Oh, yeah, here we go.
So imagine this, in perhaps the olden days. You have two hosts. You've got something running here and something else here, and you've got this other thing that gets this port. If you follow this scheme, you say: okay, now I'm going to put a couple more things of this kind here and give them these port numbers, and now I've got an okay thing, I suppose. Then I introduce another thing and I bump the port numbers again in a different way, because that's supposedly a slightly different thing. If I'm doing this allocation manually, perhaps these numbers are completely different. But just imagine that you had three different things using completely different port numbers, and they all have to be aware of what those port numbers are. If I hit host 2 on 1981, what kind of app am I going to get? Who am I going to get to talk to there, right?

It gets a bit more complex, as you can see, and then you say: all right, let's just use DNS. But the problem is that no developer is keen to write this sort of thing, and if they wanted to automate it, they'd have to read a pretty big book, and they can't be bothered most of the time. In any case, DNS comes out of the box with Kubernetes and DC/OS. We don't have to care; it just works now. And we don't actually have to use those complex service discovery mechanisms that were built over time. People did all sorts of things, right?
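To make the bookkeeping problem above concrete, here is a minimal Python sketch. The host names, service names and the port table are all hypothetical, made up for illustration (only "host 2, port 1981" echoes the example above), and the fixed port 8080 is an assumed convention. The first half models hand-allocated ports that every client has to know about; the second half models what a client needs when DNS comes out of the box and every instance listens on the same port.

```python
# Hypothetical, hand-maintained allocation: (host, port) -> service.
# Every client needs a copy of this table, and it changes on every redeploy.
PORT_TABLE = {
    ("host1", 1980): "frontend",
    ("host1", 1981): "frontend",
    ("host2", 1980): "orders",
    ("host2", 1981): "catalogue",
}


def who_answers(host: str, port: int) -> str:
    """Answer 'if I hit host 2 on 1981, who do I get to talk to?'"""
    return PORT_TABLE.get((host, port), "unknown")


# With orchestrator-provided DNS and one well-known port per protocol,
# a client only needs the service name.
DEFAULT_HTTP_PORT = 8080  # assumed convention for this sketch


def service_url(service: str, port: int = DEFAULT_HTTP_PORT) -> str:
    """Build a URL from the service's DNS name; no lookup table involved."""
    return f"http://{service}:{port}/"


if __name__ == "__main__":
    print(who_answers("host2", 1981))
    print(service_url("catalogue"))
```

The point of the contrast is that the second function has no state to keep in sync: the name is the only thing a developer has to remember.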
My personal view is that a lot of service discovery systems like ZooKeeper and Consul were invented because DNS was too hard, and DNS was usually owned by an ops team, managed somehow separately and hard for developers to automate. Anyway, modern container orchestration systems come with DNS out of the box. You can forget this. You don't have to buy that book.

And if you use an overlay network such as Weave Net, you can just use default port numbers for all of your apps. Let's say all of these things talk HTTP, and we use 8080 because we don't want to run as root. That's it. They're all on the same port. You just connect over HTTP using the host name and 8080, and you can potentially hard-code half of those things. You don't have to do some sophisticated lookup, unless you have a particular case where you do client-side load balancing for whatever reason and that makes sense. For the vast majority of cases you can actually just hard-code the ports and forget about port number lookups.

Once we have containers and APIs, orchestration APIs I'm talking about, we can implement policy. Kubernetes has network policy, which allows us to express that the blue thing is allowed to talk to the red thing, the red thing is only allowed to talk to other red things, and the dark blue thing shouldn't be talking to the red thing. That sort of thing. That's something we can do declaratively with the Kubernetes network policy API, for example.

Another thing we can do with the APIs is distributed observability, and we do that with Weave Scope. I'm sure a lot of you are familiar with the situation where you have multiple terminals running htop and you think, oh, that's cool. But I was never able to process all this information.
I can look at one of them and it all makes sense, but put four or eight of these things, or however many we've got here... I think this was, no, three, four. Anyway, too many for me to actually understand. And things like that too, right: tail in multiple terminals is also pretty verbose and hard to process. If you stare at it all day, maybe you'll get used to it, but I struggle with it. It kind of reminds me of this a little bit.

So, runtime metadata and APIs allow us to build richer tools with a lot more context. Imagine a situation like this: you went onto the server and found that there's a Java process, and it's using some JAR that lives here. We look at when the JAR was last modified. Oh, that's a long time ago, right? But in order to obtain this information you had to be aware of what Java is, what a JAR is, and what that thing is. Imagine you'd forgotten the concept of a Java JAR. You'd just be thinking: okay, there is a process, it takes some arguments. Is that a configuration file? What is it? It's hard to tell whether this represents your actual application.

But once you package it in a container, you give it a name, and that's an application. You have a version tag. You can attach extra labels as metadata to specify things like the build date. You can look it up in the registry using the manifest ID and find out exactly what is in there and when it was pushed to the registry. So we can build richer tools having all this metadata. We can build things like what we do in Weave Cloud Explore. We can look at all the containers whose image begins with "sock shop release", see where they run, dig into them and find out, like, in graph view...
We can find out who's talking to whom, and we can see more information here about a particular process inside it. There is a process view as well, so you can look at lower-level stuff. Essentially you get more metadata that we can present much more meaningfully to you, and this is done with zero configuration.

Then things like this, right: we can look at the front-end service here and see who's talking to front-end and how much memory the neighbors are using, for example. And we can also go back in time. This is the time travel feature; I'll show you a live demo afterwards. You can go back in time and find out what happened last night. So, looking at it: 79 MB now, 71 MB last night. Well, not that different. If you want to find out more, you can click here and go to the monitor view, where you can explore metrics in Prometheus. I will get to that shortly.

You can also log in to any of these containers and take a look. This is just the standard output of the container. You can also drop into a shell inside the container if you want to do a manual check of some kind, or just look at whether the output still looks right. This is something you can disable in production; it's more of a development mode.

Another thing we are able to do with the API is what we call GitOps. What I showed you earlier was Puppet, right? We do a similar thing. You'd think it's just a git pull and kubectl apply in a loop, but it takes a little more than that, because you want to be able to lock things, you want to be able to update the image attribute whenever there are new images, and do other things.

So here's the Weave Cloud Deploy UI. We're looking at the front-end service here. This is the image tag.
We're running right now That's cool, but we can take a look at the latest change that's been applied to the cluster and We can see that okay. Well it it got out to make ultimate it so essentially well in this case I clicked on that button earlier and I Automated the service so if it sees a new tag in the registry that will get updated in git and synchronized as a cluster and When I click that button We stored a note an annotation in YAML that Essentially means if you if you were to take this git repository into a different cluster, this will be Picked up from there right so if you rebuilt a cluster or something like that or you just want to clone it an environment So you can also you know, you can also use git in all different ways that you can use git For example git blame right you can see what changes Who made them and when so the Yeah, this particular one I made most of the changes, but The automation tool we've cloud deploy said this particular annotation here and You can correlate those whose metrics we have in Prometheus Here's a deploy event that caused the spike. What was that? So this is this is a This is memory usage, I think it's kind of pretty low However, no, that's that sign notes, but who cares I mean there's the same spike in in irate for Like instantaneous rate of CP usage Over five minutes There's a spike there too in this bike here. So I looked at this particular one. I found this git revision, right? I went in git and looked it up and turned out I scaled the low test deployment to 24 pods Now that's that's what caused the spike went away right away, but There was there was spike nevertheless, so now I'm going to show you that this is all real It's very simple demo showing how this works with Kubernetes on DCOS today Come on. 
Well, let's hope it all still works. Okay, so I have my CloudFormation console here. Can everybody see this? Great. I have my CloudFormation console, I have DC/OS, and I keep this DNS domain. Okay, so I'm in DC/OS. I can see I have Kubernetes installed, I'm running the latest version of Kubernetes, and it looks fairly happy. I forwarded the port earlier, so here I've got my Kubernetes nodes, and I've got some pods in the system namespace. Amongst those I can see some pods with the weave prefix; those are all the Weave Cloud agents, and it looks like the status is all good. So I should be able to use Weave Cloud now.

Just to be clear, Weave Cloud has a two-month free trial. It's a SaaS product: you run agents in your cluster, and we present you this UI and all the features that come with it.

So here's my Kubernetes cluster running on DC/OS, and I can take a look at any particular thing here, right? Let's take a look at, for example, the Weave Cortex agent. What is this? I can see there is a pod; let's take a look. I can already see that it's running Prometheus, right? So this is Prometheus, and Prometheus scrapes all the services it sees in the cluster, and that's why it's talking to everybody. Okay, I'm happy with that.

I can see there is a load test against the front-end, and the load test runs some Python process. It's got 24 pods. Let me take a look at some of these pods. For example, I can go to this pod and look at it in the pods view. If I drop out, I see that there are all these pods here. If I select front-end, I can see all the pods that talk to it. I'm going to narrow it down to the load test and the sock-shop namespace, and I'm going to look at one of the load tests, just to double-check it's not getting any errors in the terminal. Okay, standard output... I can see that. Cool, looks good.
Okay, so if I'm super curious, I can actually go into one of the containers inside this pod and have a look at what it's doing. Is this working? It worked earlier. The only problem we may have is Wi-Fi. Oh cool, all right. Okay, so what have we got? Well, I'm just going to do this, right? Okay, it's running this, and there is this config file that specifies the load test. So supposedly I'm very curious: what's in that config file? Okay... [unclear] config. Oh, yeah, it's .py, I just missed that bit. So we can take a look at the code that's running in this container if we really want to.

Anyway, what else can I show you? Oh, yeah, I can show you the monitor view. There are Prometheus metrics, and I have a few things here. We've got node resources: we can see all the stats about memory and CPU usage from all the nodes in the Kubernetes cluster that is on DC/OS. This doesn't currently run on all the DC/OS slaves, but it can be done too.

We can take a look at the last couple of days, for example, or even a whole week. We can see a deploy event here, right? It's what I showed in the screenshot earlier. We can see that there's a particular git commit deployed at that time, and there were some other things deployed earlier. This is probably the time when I created the cluster, and here is the time, I think, when I deployed the Sock Shop. We can see there are definitely spikes in all the graphs round about this time.
So yeah, that looks like when I deployed the Sock Shop app. Okay, let's take a look at the deploy history. In the overview we can see all the different events that took place, and we can have a look at some of the things I showed earlier. For example, this thing I think we had on the screenshot, yeah. So we go to GitHub, we can see this, and we can navigate this repo, view the file, and potentially look at the history and see all the commits that there are. You can see that there are automated commits from the system Weave Flux user, and there are commits from me. So I can make changes to the repo, and Weave Cloud Deploy makes changes to the repo as well.

Going back to Weave Cloud, you can take a look at this front-end service, for example, and see all the events that relate to the front-end service more specifically, right?

And now I'll show you the time travel feature. If I go back to live mode... or actually I can go straight to time travel. There are three main modes: live, pause and time travel. So if I go back to the time before I deployed the Sock Shop... I can zoom out, and I think that was on the 24th. Right, we can see there are a lot fewer things here. There are just the system pods. Or we can even look at just controllers, which is probably easier to look at. Sometime around this time I deployed the Sock Shop. And later, oh, at this time the load test already has 24 pods. If we keep this selected, we can go back a bit, and I think at the beginning I only had two. I need to fine-tune my timescale here. Okay. Yeah, it was around this time of the night. Clearly this is not a production system; it's just a demo environment, so I've been working on it late at night. We can see how this was different a little earlier on. Oh, this timescale is super fine.
I've got to zoom out a bit more. Yeah, so we can see now it's two pods, right? We can observe things like this. So, yeah, here comes time travel. I think I actually managed to finish this earlier than I thought. We didn't overrun, and we still have time for questions. Please.

Audience: Does it understand layer seven?

Speaker: No, we don't have insight into what goes on at layer seven. As in, you mean HTTP requests? Yeah, no, we haven't got insight into that. It operates on layer three. But we do have a plugin that looks at layer seven: there's a plugin that is able to introspect HTTP requests using eBPF.

Audience: Cool. And does it use kind of tunnels, like VXLANs, for networking between the nodes?

Speaker: Well, this demo actually did not use Weave Net, and Weave Net is not required to use Weave Cloud. Weave Net is essentially a separate project, and the visualization that you saw does not depend on the network. You can use it with any network.

Audience: Oh, cool.

Speaker: But if you'd like to use Weave Net, we use VXLAN, and IPsec as well. Some of our customers who use DC/OS chose Weave Net because they need encryption, and we provide IPsec that is really easy to configure. You don't have to know anything about IPsec to use IPsec in Weave Net.

Audience: Cool. Thank you.

Speaker: Thanks. Any other questions?

Audience: Is there any kind of performance hit from running the agents?

Speaker: No, it's pretty low overhead. We worked pretty hard on that. For example, the Explore UI did have a performance impact, and we managed to optimize that by switching to eBPF, which is this kernel technology that allows you to do interesting things in kernel space. So it's pretty fast, and some of the other things, like Prometheus, are pretty low overhead.

Are we good? Thank you. Okay. Thanks a lot, everyone. Thank you.