I have the con, as they say, so hi everybody. I need to do the screen share and everything. That should be it, it says I'm sharing, there we go. Right, how are you all doing? I know I can't get any feedback, but hi. So I am Mike Ellsmore, I'm a developer advocate at Logz.io. I'm somewhere between JavaScript hackery and somebody who wants to become an operator; I'm not sure which one I am quite yet. So what's the problem this talk is all about? Well, it's Node.js. Node.js isn't actually the problem itself; in reality it's the fact that when we ship Node.js we ship it by the metric tonne, many, many, many of them, and when we're working in systems where we're shipping huge amounts of it, there's a lot of data and we don't seem to look at it. To be fair, there is a problem with logs and metrics data, and the fact that we don't seem to try tracking it, using it, or keeping an eye on it until something goes bang, which is always too late. Or, if we're lucky, somebody has had foresight because something went bang beforehand, and they've implemented it for us, so we don't have a clue where it's going, or why, or how, et cetera. So this talk is a beginner-to-intermediate thing explaining how you can start tracking this information, and why, and what it is, so that you can see and find the benefits later. I'm also going to apologize now: I'm having to drink a lot of water very regularly to keep my throat going. The joys of getting sick when it's warm. So, logs. They look like this. We're used to these when we're building, and it's usually just console output using colour or something like that, and, you know, it's useful. It helps us debug. It helps us work out what to do. But what about when we're running thousands of them at scale on the internet? Well, the next step: observability. If you haven't heard what that is, it's this. It has three core components. The first one is logs.
So we use logs for diagnosis; metrics for detecting the state of a system, whether a small component or the overall view of the system; and traces, which are how we isolate where a problem could be, so we can actually drill down into it. Most of us as web developers are very used to this: when we open up DevTools in any browser and look at the network traffic, we see the traces and the impact of all the web requests. That's network tracing. But the tracing here can be applied to the code and its processes; we could equally call it something like distributed profiling. And most people at this point are going: I don't need any of this, I've got X, Y, Z solution that means I don't need it. Or: we're a small company, we don't need to look at it. Or: we're a huge company, we have an ops team, we don't need to care about this. I disagree. Because this isn't going to be easy, and something is going to go wrong at some point, and you're all going to have to look at it. It's just the way the internet works. Prepare for the inevitable bang. So let's start with the first one. Event logs: what are they? Well, they're immutable data points. They're what has happened in your system, because if it were what's happening, you'd have data precognition. So it's any given line of data, usually timestamped, with a discrete piece of information. This is console.log; this is what you see from STDOUT or STDERR. These are the discrete data points we can use to work out what's going on inside a given part of the application at any given time. I'm actually missing my speaker notes, so I've forgotten the rest of the data points there. So what are the best open source tools for this? Well, one of the most common, and the best in my opinion, is the ELK stack, which stands for Elasticsearch, Logstash, and Kibana. These are Apache 2 licensed and there is a community version available to everybody.
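To make that concrete, here's a minimal sketch of the kind of immutable, timestamped log line we're talking about: plain Node.js, no libraries, just writing structured JSON to STDOUT. The field names here are my own choice for illustration, not any particular standard.

```javascript
// A structured log entry: a timestamp plus a discrete piece of information.
// Once it's written, it's an immutable record of what has happened.
function logEvent(level, message, fields = {}) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...fields,
  };
  // STDOUT is where log shippers like Filebeat pick lines up from.
  console.log(JSON.stringify(entry));
  return entry;
}

logEvent('info', 'image request received', { url: 'https://example.com/panda.jpg' });
logEvent('error', 'queue unreachable', { queue: 'images' });
```

A shipper tailing STDOUT can parse each of these lines back into an object at the other end, which is exactly what makes them searchable later.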
The ELK stack also comes with Beats, which play a similar role to Logstash. Elasticsearch is the engine that stores all the log items that go in. Logstash and Beats are the means of sending data into Elasticsearch; specifically, when it comes to logs, it's a Beat called Filebeat. And Kibana is the visualization and graphing engine on top. This is where you're able to actually build alerting and find out what's going on. System metrics. These are slightly more difficult and a bit harder to comprehend sometimes, at least for me anyway. System metrics are, once again, immutable data points; this is what has definitely happened. They're usually a label, a data point, and a timestamp. These are specific data points around, well, if you're looking at system data, then it would be CPU usage, memory usage, free memory, disk I/O, network. You can even get power consumption if you configure it correctly. And this is at the system level; you can then go down to the process level as well, but on average you just look at it from an overall systems point of view. You can also define custom metrics to be sent through, which can be very useful if you're trying to do things like tracing user behaviour alongside these statistics. Next slide. And there we go: here are the best tools to use for this. Grafana is a visualization and reporting tool that sits on top. Sorry, open source Grafana is a UI tool for this; it comes with dashboards and tooling to do alerting, et cetera, on the information. In our case, we run Grafana on top of our ELK stack, so we're passing information into Elasticsearch and reading it in Grafana. One of the most common alternatives is Prometheus. Prometheus can be run on top of Elasticsearch, but its better use case is with something called M3DB, and I'm currently learning more about that because it's a whole new data engine I've never used before. Oh, there we go.
If anybody does have any questions at any point, just drop them in the chat and I will get to them at the end of the talk, or when my brain can process everything at once, which is not very often, I won't lie. And then the last one, the hard mode. This is something that took me a while to wrap my head around, finally understand, and be able to use: tracing. It's the end-to-end flow of an application, and it consists of traces and spans. A trace is the execution path of the application, so data going in at its endpoint; and spans are the individual pieces the trace followed. A trace is built of spans. So, for example, in terms of a JavaScript application, an Express app, we'd treat the entry point, the GET request with the security authorization at the top of it, as the beginning of the trace. Then you can measure each of the middleware points, or all the individual components, as individual spans, until it reaches our endpoint, which is when the data has finished. So that's my kind of challenge. There was a decent question there, which I'll answer when I get round to the demo part, thank God. And tracing in this form is best used in the microservices, distributed pattern, purely because if it's a monolith you can do simpler versions of profiling from beginning to end, where you can just run through the entire pipeline, et cetera. Whereas if you've got components running in isolation, you want something that can measure across thousands of machines and connect those services together. Just think of Netflix with its diagram of somewhere between hundreds and thousands of microservices. Obviously, when they want to find a problem, they want to know which microservice is causing other microservices to have problems, and distributed tracing allows for that. There we go.
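To illustrate the trace-is-built-of-spans idea without any tracing library, here's a hand-rolled sketch: a "trace" that collects named, timed "spans" around each step of a request. Real tools like Jaeger do vastly more (context propagation, sampling, cross-process correlation), so treat this purely as a mental model; every name in it is made up.

```javascript
// A toy trace: a named collection of spans.
function startTrace(name) {
  return { name, spans: [] };
}

// A toy span: records its name, start/end times, and duration.
function startSpan(trace, name) {
  const span = { name, start: Date.now(), end: null, durationMs: null };
  trace.spans.push(span);
  return {
    finish() {
      span.end = Date.now();
      span.durationMs = span.end - span.start;
    },
  };
}

// Simulate an Express-style request passing through middleware steps.
const trace = startTrace('GET /image');
const auth = startSpan(trace, 'authorize');
auth.finish();
const handler = startSpan(trace, 'save-image');
handler.finish();

// The spans, in order, are the execution path the trace followed.
console.log(trace.spans.map(s => s.name));
```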
And the best tools for this. I'm actually missing one from here, but that's because I haven't used it at all in my own free-time playing. These are Jaeger and Zipkin. I've used a little bit of Zipkin, but I've used Jaeger more. Jaeger is a distributed, OpenTracing-compatible tool that's open source. It was originally built by the people over at Uber, was open-sourced, and now has a very, very busy community around it. There's really in-depth material about Jaeger: they have a very active community at jaegertracing.io where you can get hold of the Slack, the GitHub, the whole shebang, and learn anything I miss or that's well beyond my knowledge. The app itself is built in Java, which is my nightmare language (everybody has one), so I can't actually explain how the internals of Jaeger work without crying. So here we go: demo time. This is going to take a little while, preferably 15 to 20 minutes, where I'll take us through implementing some base logging and metric data, and, if we're lucky, hopefully implementing some Jaeger. I have taken the brave, or stupid, choice of trying to do it as somebody who has never done this before, so we're going to be using the documentation to do this. I did do a dry run, but just in case. So, there we go: we have our wee application. It's just a Docker Compose thing, because I wanted to replicate having multiple versions of the application running independently. But we're going to be building this under the principle that we have no idea where it's going to be deployed. This could be deployed to Heroku, it could be on a Cloud Foundry instance, it could be Kubernetes, any one of a million different places. So I'm going to show the shortcut, but I'm also going to show the better way of thinking about things where possible.
So: two parts to the application, and a queue in the middle to glue the thing together. The receiver just receives a request and sends it to the MQ; the processor reads from the queue, takes the image URL from the receiver's message, and just saves the image to hard disk. You can imagine it doing, you know, many hundreds of millions of different, more difficult things, but it's a good start to show off the basic principles. So, the receiver: really simple code. A simple GET request, try/catching into a queue, with lots of little bits of logging so we know what's going on. And the same here. This is just the processor: it turns on, listens to the queue, consumes from it, downloads the image, drops it on the disk, and just carries on, with loads of little log lines. The only slightly hacky thing is I've put a setTimeout in to make sure this turns on after RabbitMQ does. So let's turn this little thing on, docker-compose up, and, just for good measure, the build version of it. The receiver is on and running at port 8080, which obviously it's not, because of Docker magic; we're going to get it on a funky long port so we don't conflict. And there we go, there's the processor; it's now also connected. So here's a lovely cute panda image, like that, and we're just going to save that. So the image has been sent to the queue; we check the queue. Aha: image request received, sent to the queue, the queue has been interacted with, and the processor is receiving it and saving it to disk. Now, if we wanted to, we could do a docker exec, go into the container, and actually grab the image, but I don't see the point in doing that. And, as Jones put it earlier, these are the plain-text logs, just the line that goes out. That is a log; that is fine. Now we're going to go and do something possibly dangerous.
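The demo code itself isn't reproduced here, so as a stand-in, here's a minimal single-process sketch of the same shape: a receiver that pushes image URLs onto a queue, and a processor that consumes them. In the real demo this is Express plus RabbitMQ across two containers; in this sketch the "queue" is just an in-memory array, and all the names are mine.

```javascript
// In-memory stand-in for the message queue between receiver and processor.
const queue = [];
const saved = [];

// Receiver: takes an image URL from a request and drops it on the queue.
function receive(imageUrl) {
  console.log('image request received, sent to queue');
  queue.push({ imageUrl, receivedAt: Date.now() });
}

// Processor: consumes tasks off the queue and "saves" each image.
function processNext() {
  const task = queue.shift();
  if (!task) return null;
  // The real version would download task.imageUrl and write it to disk.
  saved.push(task.imageUrl);
  console.log(`saved ${task.imageUrl} to disk`);
  return task;
}

receive('https://example.com/panda.jpg');
processNext();
```

Swapping the array for a real broker is what lets the two halves run on different machines, which is exactly why the logging matters: you can no longer just watch one process.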
We're going to modify code that's working, which, you know, could inevitably lead to it not working. We're going to go into the receiver. So, in the application: npm i, we want to install Winston. We want to be looking at the receiver one first. Luckily we have a whole bunch of things. So: const winston. Why is it that when you know somebody is watching you type, you cannot type for the life of you? If anybody can type whilst being watched, you need to teach me everything you know, because I am awful. Excellent. And to do this correctly, we want a logger: logger equals winston.createLogger. If I remember correctly, the default one will just throw everything to disk anyway, so we'll want to change that. Is that a debug statement or an info statement, do you think? I'm thinking that's an info statement. Actually, no, you know what, that's debug. We don't need to know when it actually starts, but we do need to know when it's finished, so that one gets .info. And that's because, in this instance, we're able to infer the log level of the information by defining it here, and these will be my strings. And this one is an error. This is important: this we will define as an error, so we can easily find it later. And this case: .info. And there we go. Now, obviously, I'm going to kill this one quickly; I should have done that before I started typing. There we go. I can't build: "cannot create property". Ooh, I have gone and clowned. Let me refer back to my notes, like anybody sensible would. Yeah: transports. Oops, got ahead of myself completely with the transports. Let's kill that so that, whilst we're doing this bit, it's configuring in the background. There we go: transports. Now it's just going to dump everything to the log. While I'm doing that, there's also another one I want to look at, which is the easy way of doing this.
So this is going to start dumping working versions of the log out to our console. Hopefully, if I haven't... "cannot create property" again. Intriguing. I'm pretty sure that's how we use the logger. No. It's changed. Wonderful: I was only practising this two hours ago and I've already forgotten everything through panic. There we go, that's better. And the rabbit queue. Excellent. There we go. So if I now run this, it will save and it will go through Winston. So this is a multi-line object it's sending through. In answer to your question, Kenneth: that is an object version of it. Inside Winston, it's converting it to an object with the message and the log level information, which we can then interpret at the other end. So we're sending that to disk. Well, to STDOUT now. If I can find... here we go. We're going to do the quick and dirty version first: ship your data. We're going to just throw this into Logz.io via the fastest means possible, which, if it's multi-line, if it's an object, can be interpreted directly off the Docker log stream we have running with it. So I'm going to turn this on in a terminal. There we go. And that should be listener.logz.io. And this one, I believe I saved it locally. Shipping, shipping... token. Yeah, that should be the one. Ta-da. And that means anything we do in here should start appearing in our ELK. Now, this is important. So if I just send one random request to make it do some stuff: there we go, some information is flowing through. And then if I go back to here and log into our ELK stack, which, instead of having to stand up my own version on my own machine, I'm going to be lazy and use the one happily and handily provided by my employer. So, logging into Logz.io, and Kibana. We should quickly see some information coming through. It does occasionally take a moment, so double-check that it says it's getting stuff. Collecting logs... well, it says it's getting stuff.
And I see information going through. Now we've just got to wait a moment for all the queuing to settle and catch up with itself. Always what you want: waiting for the universe to catch up with you. So that's great; we'll move on whilst we wait for that. So in this case, we're relying on the Docker stream being one of the tailed logs that gets picked up and thrown into Logz.io via a handy-dandy little helper Docker container. However, what happens when you're deploying this to a standalone server somewhere, or an enterprise piece of kit in a box, or even just a box in your office that you want to monitor on top of everything else? You're not going to have Docker and handy little helpers to do this. You're going to have to hard-wire it in. So we're going to add a different transport into the code. These transports are the way Winston is told how to communicate, to the console or to different destinations. There is one for Logz.io, and that should be on npm: winston-logzio, I believe. All this does is allow us to add a new transport, which means instead of waiting for it to be sent out to the console and then shipped to the ELK stack for processing, we can read it directly from here. So if I do this: go back to our code, do npm i --save winston-logzio. I should probably definitely kill that one, so it starts up again when I need it. We'll insert that here, move that over yonder, and add it to the list of transports. Let's cheat a little bit and go find what I typed in here. And I'm going to need to... yeah, I have to do this bit just off screen so that nobody sees it. Oops, because I've just realised I've got secrets locally in the .env in the Docker image to capture. So I'm just going to do the .env bit off screen, so nobody can see it and tell me off for accidentally leaking keys everywhere. Whoops. Let's see.
Let's change that to process.env dot... When you think you're fully prepared and you forget to do one wee thing. Save that. Yes, good, good. And I believe that should work, right? And now we should be shipping directly to Logz.io from the source code, as well as sending it to our local console. Still not coming through. Intriguing. We'll find out in a moment, when it goes bang. The problem with the request: invalid URI. I'm going to do something I'm going to very much regret. Yep. Oh, you silly, silly devil: if you intend to use dotenv locally, please remember to load dotenv locally. Oops. And it's saying it's passing it through. Hunky-dory, wonderful, that's all we want. Let's see if we're actually going to receive the information. Critically frustrating when it does this to me. Okay, so there are all my tests from earlier today. Minutes ago. And just when you want this to work absolutely perfectly live... let's trigger a couple more. Well, at least we know it's working locally and the transport isn't bailing out, because it would normally throw for an exception, so it is transferring, just being slow, and I'll have to work that one out later. Okay. Well, apart from the demo fail of me not being able to see... ah, there we go. There we go. Oh, I probably haven't got the log level set to the correct one; I probably want it set so all levels get sent through. I'll turn that on now and just double-check. Yeah, I was limiting it to the error level. But there we go: we have log information being sent over quite nicely. Now it's definitely playing catch-up all the way. But we can see that we sent items to the queue, great, and if we did the same in the receiver and the processor, it would be exactly the same. I'm probably going to have to copy and paste that bit over as well, so excuse me for a moment. Yep. Okay. Now we're good.
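The failure mode here, process.env values coming back undefined because the .env file was never loaded, is worth a tiny sketch. This is roughly what dotenv's config() does for you, written as a deliberately naive parser rather than the real library:

```javascript
// Naive .env-style parser: what dotenv does, in miniature.
// Until something like this runs, process.env simply won't have your keys.
function loadEnv(contents) {
  for (const line of contents.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const idx = trimmed.indexOf('=');
    if (idx === -1) continue;
    const key = trimmed.slice(0, idx).trim();
    const value = trimmed.slice(idx + 1).trim();
    // Like dotenv, don't override variables that are already set.
    if (!(key in process.env)) process.env[key] = value;
  }
}

loadEnv('WEBINAR_TOKEN=abc123\n# a comment\nWEBINAR_LOG_LEVEL=info');
console.log(process.env.WEBINAR_LOG_LEVEL); // 'info'
```

The key names are made up for the example; the point is simply that the load has to happen before any code reads the variables.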
So let's move on to the next bit, which is metrics. Now, metric data. We're not going to go into sending custom information from inside the application, because that is really, really hard and probably a 50-minute session on its own. So we're going to take the cheap way, because the cheap way is nice, quick, and easy, and you can get some benefits out of it very quickly. So, once again, the Logz.io docs. Very comprehensive docs, and the docs team have taken a long time making sure this is simple for everybody to consume. This one is the Docker deployment. What it does is connect to the host to read out the metric information from the host machine for all the Docker containers, so it's very comprehensive in the information it receives. So if I paste that in here, and we remove this and change it to the key, the local key... keys, Logz.io, metrics. Let's make that run. There we go: this metrics container. What it's actually doing is starting up a copy of Metricbeat and, using the system module, getting at the host and containers and grabbing all the metric information, on a tick; I think it's a one- or five-second tick. I've thrown some random data in just to make sure it's actually doing some stuff. And if we go back to the metrics: there we go, and we can see that we're starting to receive some information. Excellent. Send to queue. That's Kibana. There we go. There's no data yet. Wonderful. But if we then go into the dashboards and look for a specific dashboard: there we go, that's what I'm after. And we change this to the Docker overview, giving us the information being sent over from Docker. Unfortunately, once again, we're waiting an ice age for the information to propagate everywhere. The joys of information having to pass through about 15 different gateways to get where you need it. But it will come through eventually. Just to be completely open and transparent:
when you install Metricbeat, you can then use it to look at many, many other components. Metricbeat is actually built and maintained as part of the Beats, as part of the Elastic stack. If you're after just the system-level information, great, but it can also drill down into all sorts of smaller components. So if you want to maintain information about CouchDB (one of my favourite little projects), Docker specifically, Elasticsearch itself, HAProxy, et cetera, et cetera, it will log and maintain it. Let's give it another moment to see if all the information is actually propagating through. It is not, and I don't have enough time to work out why I'm not seeing that information. Okay, it's definitely got my key and it's definitely set to go to the overall view, the wonderful one at the very top. No joy; we'll just have to keep waiting for it to turn up. Right, so I would be seeing metric information if I'd had time to pre-propagate this with proper information for us all. Yeah, that's my fault. So the last step is tracing. Now, for those of you who are not aware of what tracing is, this is the part most people haven't seen or used. I'm going to be talking about Jaeger now. Jaeger is a wonderful beast, but it is a beast; in my opinion, it is a big stack of things. This is your standard homegrown Jaeger stack. This is your application: what you have built, what you are shipping into the universe. There's your application, there's the client, and there is usually a Jaeger agent, which is gathering information. You send information from the client to the agent; the agent then sends it over to the collector. The collector is the middleware that does sampling and aggregation, et cetera, before sending it over to the database and storage for it to eventually be queried and used in the Jaeger UI.
In our case, we're going to very, very quickly try to skip from that bit straight to that bit, and then see if we can visualize some stuff inside of Logz.io. Just give it another moment... no, I've not left myself enough time to debug and fix that now. Nuts. Grand, absolutely grand. So let us move on to the next step, shall we? I need my code. Let's go into the processor, which is the one with more code for doing stuff. So the processor, there we go: it connects to the RabbitMQ, waits for tasks, and processes each task sent to it. Each task is a simple image URL, which it's then saving to disk; well, grabbing the image and saving that to disk. The joys of having too many windows. First things first, we need to set up the collector. I've already done this earlier in my double-checking. So this is our collector. It's actually based on the current Jaeger collector, but we wrap it ourselves. Now that's the metrics one, that's the logs one, there's the Jaeger one. We wrap it to add our security logic, so that we can use our tokens, because normally you don't have that tokenized level of security around the standard Jaeger collector: it relies on being inside your network and secured that way. This allows it to work for a vendor, so that, for example, our platform is secured and you need to be able to talk to it securely; we've wrapped the authentication in there so you can add your token and ship information. So, turn it on, and excellent: we have a collector. Now, very, very, very quickly, if I remember which window I left the tracer configuration in. This is what happens when you have four windows open and each one has 20 tabs; trying to remember where you left all of your information is a bit of a pain. So, how are we going to write this? Cool. So we need the Jaeger client. You know what, let's make this bigger so that we can all see it. That's that. So, back to my code.
We want to install the Jaeger client, and we're going to get that here. So: const initTracer = jaegerClient.initTracer. We're going to configure this tracer, which is OpenTracing-compliant. And OpenTracing is an older standard that has been superseded by something I'd like to mention in a little moment. Regardless... oh, I've only got a few minutes, so if this doesn't work, we're going straight to talking about that point. Cool. So I'm going to copy and paste, as all great development is done, from here to here, and we're going to rename this to webinar-image-save. For the sake of all things being truthful, no one would ever want to see this code being used in the wild anywhere, so it is version zero. And after that point, we want to init the tracer with the options and the config. Grand, absolutely fantastic. Then we move to the OpenTracing documentation for the way of actually sending this over, because then we can use the standard tracing API: tracer.startSpan. So, startSpan, and then we can start sending the information. const span; so, startSpan on the AMQP request. And then we can just keep adding to the span. As I said before, a trace is built of spans, so we get to add the different components to it. In this case, we get to log the information and call finish. And we're going to do that. So, do we have an exit condition in that code? We do. So that becomes the "something went horribly wrong, bye bye, goodbye, good night" case. Oops, just change that to an E. That's not an E, that's a three. What happens when you think you can touch-type and you can't. There we go. So that's going to go and not explode. And I'm going to need to include opentracing so it has the correct tag object, and then npm i --save opentracing. Actually, better kill this one so we know it's dead when we turn it on. OpenTracing. There we go. And that's my dead trace.
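The error-handling pattern being wired up here, logging onto the span in the failure case and always finishing the span, is the important bit. Here's the shape of it using a stand-in tracer object, since the real jaeger-client needs a running agent or collector behind it. The tracer below is a mock I wrote for illustration; only the startSpan/log/setTag/finish call pattern mirrors the OpenTracing-style API used in the demo.

```javascript
// Mock tracer exposing an OpenTracing-style surface (illustration only).
function makeMockTracer() {
  const finished = [];
  return {
    finished,
    startSpan(name) {
      const span = { name, logs: [], tags: {} };
      return {
        log(fields) { span.logs.push(fields); },
        setTag(key, value) { span.tags[key] = value; },
        finish() { finished.push(span); },
      };
    },
  };
}

// Process a task, recording success or failure onto the span either way.
function processTask(tracer, task) {
  const span = tracer.startSpan('amqp_request');
  try {
    if (!task.imageUrl) throw new Error('no image URL in task');
    span.log({ event: 'file_saved', url: task.imageUrl });
    return true;
  } catch (err) {
    // The "something went horribly wrong" exit: tag the span and log the event.
    span.setTag('error', true);
    span.log({ event: 'error', message: err.message });
    return false;
  } finally {
    span.finish(); // the span must always be closed, success or not
  }
}

const tracer = makeMockTracer();
processTask(tracer, { imageUrl: 'https://example.com/panda.jpg' });
processTask(tracer, {});
console.log(tracer.finished.length); // 2
```

An unfinished span never gets reported, which is why the finish call lives in the finally block rather than at the end of the happy path.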
And we're going to want to add one for when it's alive, so we know when it's good and finishing, and we'll just add some traces in between. So here we go: received a request, and an event that it was true. And then we're just going to add an event, so we know where it's going through. So let's log an event: file saved. Let's go back, because we don't need that. That should be enough. And tracing: take all of two seconds to stand this back up, and that should mean the processor is going to send some traces. Of course it's not fine, because I didn't clear out the bits I forgot to delete. Okay. Let's see if we can, very, very quickly... and I need to log out of it. Done. Done. I didn't configure it correctly; I wanted to fix that. Logging in as me on the corporate account, and go to our Jaeger. There we go. So this is our Jaeger. Cute little logo. I personally feel it should be a Jaeger from Pacific Rim, but that's me and the giant nerd inside me wanting to escape. And is it working? Yes, it is. So: grab, throw, send, do things like that. Sending through, and it's getting information. Excellent. Select service. What did I name the service? That would have been useful to remember. Image save. Cool. Thank you. No. Not going to have the same problem again, are we? Darn it. Oh, no. I know why: I haven't configured it to talk to ours. So, where did I leave that? Talk to it over Thrift; that's the bit I forgot. Okay, I believe I'm missing the Thrift part anyway, and I'm going to run out of time, so I may as well move along. But normally, once it had picked up the collector and pulled that through, you'd find the traces and be able to see all the individual dispatches and how everything is breaking down. That's how we should have seen it: really tiny, minuscule spans consisting of microseconds as it passed through the individual steps of the code. But I don't have time.
I don't think I have time to do that now. So, last bit. Let's get back to presenter mode. There is an all-encapsulating specification coming through, and that is OpenTelemetry. OpenTelemetry is the merging of OpenTracing and OpenCensus, which were both CNCF projects, and OpenTelemetry is now a CNCF project as well, and has been for a while. It's combining log, metric, and tracing information into this overall observability view of your system, with open source software. We at Logz.io are really, really on board with this, especially as life's always easier when there's one way of doing things rather than 15 million of them. And, before I get told off: questions, if there are any. If I've left any sort of inkling of wanting to ask anything and you don't want to ask it now, you can send me an email; I'm more than happy to answer it there, or forward you to the right person, or answer on Twitter, because, well, what else are you going to do on Twitter other than angrily rant and answer questions? I apologize for the state of not leaving enough room for debugging things in the middle of a demo. Even when you've done a dry run, there's always a chance that something goes horribly wrong. And okay, so, Grava Sharma: can a node be a span as well? Any chance you can give us some more context there? Are we talking node in the form of a graph, or node in the form of an application instance? Because if it's an application instance, then in theory, yes: if you have a microservice that serves a single domain of information, then you can treat it as an entire trace, with spans for, say, all the functions you're passing through. But if you're talking in the form of a graph, normally you build the graph view of information from the traces and spans. So I'm not entirely sure what you're asking there, but I'm more than happy to answer anything else that comes through. Let me see how many people...
Okay, we'll just give it one more minute for questions. And if there are no more, we'll wrap up. Thank you so much, Mike. And I've managed to keep my tonsils in one piece throughout this. Thank you, Lord. Awesome. I hope you feel better. Okay, looks like we are all set here. Any last words, Mike? Yeah, if you've had any interest in this, there is a very active community around Grafana, and a really, really active community around Prometheus, if you want to get involved there. And if you want to learn how to do logging, metric, or tracing data, you can go into the OpenTelemetry community and learn so much from them. Or you can come down to Logz.io and pester us, because it's our bread and butter, so we may as well help where we can. Wonderful. All right. Thank you for that, and thanks everyone for joining us. We look forward to having you join us next time. Have a great day. Bye.