Okay, hello, thanks for coming. We would like to start with one important question: who likes to look for a needle in a haystack? Please raise your hands. Okay, only a few of you. That's good, because we won't supply any magnets. Instead of showing you how to look for a needle in a haystack, we will show you how to have it already on a plate, so that you never have to think "oh no, these aren't the logs I was looking for."

My name is Alicja. I work for Intel, and since Liberty I've been focused on the Kolla project, where I am also a core reviewer. We have a great community and it's a pleasure to be a part of it. It's also a pleasure to give this presentation with these two great guys, with whom I was working on logging during the Mitaka cycle: Eric and Michał.

Hi, I'm Eric. I work for Mirantis as a software developer, and I'm here to talk about Kolla and logging, which is what I contributed to over the past few months.

My name is Michał. I work for Intel, and I'm also a proud member of the Kolla core reviewer community. Yes, let's talk about Kolla.

This is the quick agenda we're going to walk through. First we'll make a short introduction to Kolla for whoever may not know what it is about. Then we'll talk about how you deal with logs in Docker in general, what we did to make it work with rsyslog, and why we removed that later. Then, how we implemented the central logging service with Kibana and Elasticsearch; ELK or EHK, I will leave that for later. And we also have a short live demo of running central logging for you.

So let's go ahead and talk about Kolla. In short, Kolla is a deployment of OpenStack in, or rather using, Docker. The two major parts of it right now are Docker containers and Dockerfiles, which we build for different distros. We don't build the full Big Tent, but we do build quite a big chunk of it. We also supply you with some Ansible playbooks to quickly and easily deploy a production-ready OpenStack, with HA, with Ceph, with pretty much anything you want, using our Docker containers.

As I said, Kolla currently consists of Docker containers, Dockerfiles, and deployment tools. Ansible is the main one right now, but we also had a prototype of Mesos using our containers, and Kubernetes using our containers, which you have probably seen in the demo at one of the keynotes. kollaglue is a repository on Docker Hub: if you don't want to build your own containers, you can just download them from Docker Hub and run them. We tag those containers every release, so you will be up to date, and you can customize pretty much anything, down to a single line of config, per host.

So, logging in containers. This is a tricky thing, because as you probably know, containers are pretty much separate operating systems, so getting logs out of them is kind of tricky. We don't want to put additional services in the same container. That makes things harder, because when your service writes a log somewhere, you would need another, different service to convey that log to the central logging service, and so on and so forth, and in containers it is an anti-pattern to run more than one service. So we had to deal with that.

So how does Docker log? It's a pretty elegant solution. Docker itself captures the standard error and standard output of the service that has PID 1, which is the main service of the container, and it can then use one of several logging drivers to do whatever you want with the log. json-file is the simplest one and the default: it just saves the log to a file. You also have syslog, journald, gelf, and fluentd. You can customize this per container, and to see which logging driver a given container runs with, you can use docker inspect. The docker logs command is a very easy way to access the standard output of PID 1: just docker logs and the name or ID of the container.
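For illustration, this is roughly what that looks like with the standard Docker CLI (the container and image names here are just hypothetical examples):

```sh
# Run a service with an explicit log driver (json-file is the default anyway)
docker run -d --name nova_compute --log-driver=json-file example/nova-compute

# Check which log driver a container is using
docker inspect --format '{{ .HostConfig.LogConfig.Type }}' nova_compute

# Read the stdout/stderr captured from the container's PID 1
docker logs nova_compute

# Or follow the stream, like tail -f
docker logs --follow nova_compute
```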
So that's the easy way. However, docker logs only works with the json-file and journald drivers.

rsyslog was the hacky way we dealt with logs in Liberty. It was hacky because, pretty late in the Liberty cycle, we needed to supply a nice operational way to manage logs. The way rsyslog works is this: it creates a special device called /dev/log, the services that are going to log to syslog write into this special device, it becomes a stream, and then the syslog daemon gets the logs and can parse them, manage them, and so forth. However, it wasn't meant to be a containerized service, because it requires you to have a /dev/log. We had to somehow provide the same /dev/log in multiple containers, and sharing things under /dev is not easy; in fact it's nothing short of impossible with newer Docker, because Docker itself started to use those very same files under /dev, and we couldn't simply share them.

The other thing is that syslog also wasn't meant for multi-line logs, and Python tracebacks are multi-line logs, which basically means it's ugly, and you don't want ugly logs. At some point we didn't even have the tracebacks at all; that was a bug, which we fixed. But even once we got it going, it still was ugly. Forwarding Python tracebacks around rsyslog is possible, but it's not pretty. So, bottom line, we removed it completely after Liberty, and from Mitaka we are doing something a bit different.
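To make the multi-line problem mentioned above concrete, here is a tiny, purely illustrative Python snippet. logging.exception emits the message plus the full traceback, which spans several physical lines, and a line-oriented shipper like syslog treats each of those lines as a separate record:

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("demo")

try:
    1 / 0
except ZeroDivisionError:
    # One logical event, but the traceback below is emitted as many lines
    log.exception("request failed")
```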
Okay, so: the central logging service. Is it worth introducing? What are the advantages of this solution? Let me explain it by example. Have you ever been to a Washington art museum and seen this exhibition? No? Okay, I was there and I really liked it, but honestly, I believe it's not a very good setup for debugging an application. Without a central logging service, we may feel like we're standing in front of a wall full of screens, not sure where to look first. Wouldn't it be better to have only one single interface with access to everything? With a central logging service, that is possible.

Another important thing: I'm sure most of us have felt like this while debugging an application, having to scroll through many screens in order to find the answer. Imagine that we can add a search bar or a visualization panel to this book full of logs and get answers to everything much faster. Once again, with a central logging service we can do it very fast: we have filtering options and different visual representations of our data. So that is why we decided to introduce the central logging service in Kolla as a new Mitaka feature.

Here is our architecture. It consists of three components: Heka, Elasticsearch, and Kibana. All of them run in Docker containers, and we have Heka on every node. It collects data from the services, which log to files. Then the data is fed to Elasticsearch, which is also the backend for Kibana. Kibana is designed to interact with the indices stored in Elasticsearch, and it allows you to visualize data, search data, and interact with them. We also support Elasticsearch cluster mode. To give you a better overview of each component, we will describe them briefly.

So we're now going to talk about Heka in more detail. A quick introduction: what is Heka? It's stream processing software. It's open source, it was developed by Mozilla, and it's written in Go. As stream processing software, Heka acquires data, processes it, and then sends it to some external system, a storage system for example; in our case this is Elasticsearch. As Alicja said, Heka runs on every cluster node, so it's a very important component: it runs everywhere, collecting logs and sending them to Elasticsearch.

So why was Heka created in the first place? Heka is actually a unified data processing system, which means it can collect and process any type of data. It can be logs, it can be metrics, it can be anything. In the case of Kolla and the work we are talking about today, we use it to collect logs, but in the future we may use it to collect something else, metrics for example.

This is the Heka pipeline. Heka works as a data pipeline with plugins at each stage. There are five types of plugins: inputs, splitters, decoders, filters, and outputs, and each plays a role in the pipeline. Heka comes with built-in plugins, which are written in Go, and you can also write your own custom plugins in Lua, so it's very flexible from that point of view. Every Lua plugin is actually a sandbox, and it's limited in terms of the CPU and memory it can consume; if it consumes too much memory, Heka will kill the plugin.

Some of the Heka highlights I'd like to mention: Heka is very lightweight, which makes it possible to run it on every cluster node, and it's also very flexible. I will come back to this in a bit. When we designed this central logging solution for Kolla, obviously the question of using Logstash or Heka was raised, and we decided to go with Heka. The main reason was that we wanted to build a distributed system, and we did not want to have a JVM running on each node. We also conducted a number of experiments, a number of performance tests, and Heka was much faster and more lightweight than Logstash. And as I said already, Heka is very flexible: we can define plugins as code, which makes it very flexible for us. So we decided to use Heka, which means our stack is not ELK but EHK.
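As a rough sketch of what that pipeline looks like in practice, here is a minimal Heka TOML configuration that tails log files and ships them to Elasticsearch. The plugin names (LogstreamerInput, SandboxDecoder, ESJsonEncoder, ElasticSearchOutput) are from Heka's documentation, but the paths, patterns, and the Lua decoder file are hypothetical, so treat this as an outline rather than Kolla's exact configuration:

```toml
# Input: follow the OpenStack log files on this node
[openstack_logs]
type = "LogstreamerInput"
log_directory = "/var/log/kolla"              # hypothetical log location
file_match = '(?P<Service>[^/]+)\.log'
decoder = "openstack_decoder"

# Decoder: a custom Lua sandbox that parses the OpenStack log format
[openstack_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/openstack_log.lua"   # hypothetical decoder file

# Encoder and output: index the parsed messages into Elasticsearch
[es_encoder]
type = "ESJsonEncoder"
index = "log-%{%Y.%m.%d}"

[ElasticSearchOutput]
message_matcher = "Type == 'log'"
server = "http://elasticsearch:9200"
encoder = "es_encoder"
```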
So now I'm going to talk briefly about Elasticsearch. Elasticsearch is an open source product. It's basically a highly scalable full-text search engine. It's very well known, it's used by many applications and many companies, and it's written in Java. The main highlights of Elasticsearch for us: it's highly scalable, which means you can increase your cluster capacity just by adding nodes, and adding nodes to the cluster is very easy. It's highly available, because the data is replicated across the cluster. It's a full-text search engine based on the Apache Lucene library, which is a good library; it's been there for a long time and it's well documented. It's document-oriented, and every action you can perform, you do through a RESTful API, which uses JSON over HTTP.

So now Alicja is going to talk about Kibana. The last component is Kibana, and it is also an open source product from Elastic. In the latest, fourth version, it comes with a Node.js app that sits between the Elasticsearch backend and the Kibana UI. It was designed to work with Elasticsearch, so it has a built-in proxy to it. How can it interact with Elasticsearch indices? There is a wide range of choices, from search bars, which allow us to use different fields or just look for an entire message, to time filtering: logs can be filtered by different time ranges, both absolute and relative. There is also a wide range of different charts; there are bar charts and pie charts, and, for example, there is a date histogram, which by default shows the count of logs versus time.

And now we will show you some basic examples. On the left we can see the date histogram, and the two other charts are top 10 sources, that is, the top 10 services that are logging to our service, and top 10 hosts. Here we can also see a bar chart divided by severity levels, so we can see at once whether something is wrong with our application and what its state is. At the bottom there is a search panel where we can specify the different fields we would like to examine.

So we have all our logs stored in one place, but we don't want to give unauthorized users access to them. That is why we were thinking about adding authentication to Kibana. The first solution we considered was Shield. It is also a product from Elastic, and in the latest version it provides security for the whole stack, so also for Elasticsearch, and it has a login UI for Kibana. Unfortunately it requires a license, so we couldn't use it. The next idea was to use nginx, but that would require introducing a new component, and we have already implemented HAProxy in Kolla, so we decided to use it to add Kibana authentication based on a simple access control list. Also, TLS for Kibana has recently been added to Kolla.

So now you know how it works, you know all the advantages, and I'm sure you're thinking that you'll have to spend hours configuring and installing the solution. We have another great surprise for you: it is as easy as three steps. In comparison to the standard Kolla procedure, you have to build three more images, which are heka, elasticsearch, and kibana; you have to change one flag in the configuration, which is enable_central_logging; and then you can deploy Kolla with one command, and you have the whole OpenStack with the central logging service running.
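Concretely, the three steps might look something like this. The flag name (enable_central_logging) comes from the talk; the exact build and deploy command names follow Kolla's Mitaka-era tooling, so check the docs of your version before relying on them:

```sh
# 1. Build the three additional images alongside the usual ones
kolla-build heka elasticsearch kibana

# 2. Enable central logging in your Kolla configuration
#    (in /etc/kolla/globals.yml):
#
#    enable_central_logging: "yes"

# 3. Deploy OpenStack, central logging included
kolla-ansible deploy
```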
If you want some details about how to deploy Kolla, or anything else, you can watch a webinar presented by our PTL; the link is provided below, and we'll also show this link at the end of the presentation.

Okay, so now we are going to show you a quick demo. This is the Kibana UI, the Kibana interface that we use; when you open Kibana in your browser, this is what you get. It's already connected to Elasticsearch, so it fetches the logs from the Elasticsearch index. You get a graph here, which is a histogram, the count of logs over time, very basic, and you also get a table here with the actual logs. Here it's the logs over the last 15 minutes, and I can change this and get the logs over the last seven days. If I do that, a query is made to Elasticsearch, and then we get all the logs for that period of time.

Now I can go ahead and customize this table, because here it has only two columns, very basic. So I can add the host name, the program name that generated the log, the severity, and maybe the entire payload of the log.

One of the advantages of this solution is that you can search for specific logs very quickly. For example, I have a search bar here. Let's say I want to see if I have errors. I put "error" here; it's a full-text search, and I will get all the logs that include "error" in any field, anywhere. This can be very convenient if you don't know where this error string is and you just want everything that includes "error". The other thing I can do is be much more specific, and say that I want to get all the Neutron logs whose severity is error, which can obviously be useful.

Now I have a nice search, I like this one, and I know I'm going to reuse it in the future. So what I can do is save it: I save it here, let's call it "neutron", and save. It's actually saved in Elasticsearch, because the Kibana metadata goes into Elasticsearch as well. So I can come back a few days later, go here, and reopen my search. As you see at the top, this is the Discover panel of Kibana, where you get this kind of table with all the logs and the search capability. Now Alicja is going to show you how to create actual visualizations and analytics based on the data.

Okay, so here is the Visualize tab. We can choose it, and we can see that there is a wide range of different visualizations. We will choose, for example, the bar chart, and I'm going to show you how to create a bar chart for the different severity levels. Here on the Y axis we have the count; for the X axis we choose Terms and then a specific field, the severity label, and we can increase the size. Now it looks like this: we have the same color for all severity levels. If we would like to split it, there is the "split bars" option; we choose the same field, and now it looks like this, with a different color for each severity level.
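The searches in this part of the demo use Kibana's Lucene query-string syntax. As a sketch, they could look like the queries below; the exact field names depend on what the Heka decoder puts into Elasticsearch, so programname and severity_label here are assumptions, not necessarily the fields Kolla produces:

```
# Full-text search: any log containing "error" in any field
error

# Field-qualified search: Neutron logs whose severity is error
programname:neutron AND severity_label:ERROR
```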
In the same way as with the searches, which Eric explained, we can save our visualization. Let's call it "severity bars", and in the future we can just come here and open it. Okay, so we have our search and we have our visualization; what can we do next? We can create a dashboard. We go to the dashboard tab and choose the "add visualization" option. Here we can get the searches saved by Eric; here is "neutron", and we can see that it appears here. We can also add visualizations, so "severity bars", and we can create a custom dashboard, adding different visualizations and different searches to it. We can also save this dashboard; like here, I will call it "test dashboard" and save, and then we can access it in the future by opening it.

And now we are going to show you our custom Kolla dashboard. It's here; we prepared it earlier. As you can see, we have different charts here, and we also have a search panel. Once we have prepared our dashboard, our searches, and our visualizations, we can go to the Settings tab, to Objects, and export everything to JSON in order to save it or share it with somebody. So we just export everything, and it is saved as JSON. Thank you.

Just a quick note about the dashboard, and maybe we mentioned this a bit before: currently there is no default dashboard in Kolla, so when you install the solution, what you get is just Kibana, and you have to create your own dashboard. But in the future our plan is to create a default dashboard, so when you install the solution you get a dashboard; if you're happy with that kind of graphs and charts, you can use it, and otherwise you can customize it or change it completely.

I think that's the end of our presentation. Thank you for your attention, and I think we are ready to take questions; we have some time.

Hi, great presentation, by the way. Thanks. Looks amazing. A few questions I had. Can it be non-containerized, as in, can I deploy this without Kolla? Have you tried that, or have you been focusing solely on Kolla? I'm asking for cases where Kolla hasn't been implemented: I come from the Solaris side of things, and we have a Solaris implementation of OpenStack. If I want to quickly implement this EHK stack and analyze logging, how do I do it?

We didn't test it, but I don't see why it would be an issue, apart from the fact that when you use our Ansible deployment, it is all configured to work together. That part you would need to do manually: you would need to deploy Heka, and you could probably deploy Heka with containers; you would need to do some wiring around the configuration, and that's pretty much it, I would think. It will require manual labor, but you should still be able to do it.
Okay, the other question: let's say I've run a search which says "give me all errors on Nova", and I'm debugging a specific problem where I'm trying to create a VM or attach a volume, and something's going wrong, and I have no clue where to look, because it could be failing pretty much anywhere. I want centralized logging, and I want very fine-grained relative time. Where it said "last 15 minutes", can I narrow it down to the last five minutes? And can I freeze it? I don't want additional logging clogging up my search results. Can I say: I'm going to run this search, freeze the set of logs that I have right now, and don't give me incoming logs? Because maybe I've turned debug on on nova-compute or Neutron, and there could be tons of logs; it could just keep writing stuff, and my search gets cluttered again. Can I freeze a set of logs and search within that?

Yes. In Elasticsearch you can do whatever you want. You can say "I want from this date and this hour to this date", you can say last two minutes, last 50 seconds, last 15 seconds, whatever you want. The search language in Elasticsearch is pretty powerful (the demo Eric showed is just scratching the surface of it) and pretty well documented, so you can do all sorts of crazy stuff with it.

Once again, thanks, guys.

Sometimes the most interesting logs come right before a hard power loss or a network connectivity loss, and those interesting logs sit in the disk cache and don't wind up getting written to disk before the node spirals out of control. What do you do to deal with that kind of situation?

By default, everything always gets logged on the node first; every log is written to a file on the node at first. So even if you lose connectivity and the node itself drops off the network, the log files will be there. You may not have those lines in your central logging, but we have normal file logging on the node as well, and they will stay there.

And I will add to that: the system is also quite robust, because if you lose connectivity and then the connectivity comes back, Heka will be able to read the logs again, the ones that have not been pushed to Elasticsearch yet, so it will recover everything, and you should not lose logs, basically. It's the same if the connectivity between Heka and Elasticsearch goes down: the logs will accumulate in the files on every node, and then, when the connection is back up, everything will synchronize and you will get your logs.

I guess the one problem that I haven't been able to find a way around is when the logs are in the disk cache and haven't been written to the disk when the problem occurs, a kernel crash or a hard power loss. So they haven't been sent over the network and they haven't been written to disk; they're in the disk cache. How much can you do about that? I haven't found any kind of real solution, and that's often when the logs are most interesting, you know, right before you have a hard kernel crash or something like that.

Correct me if I'm wrong, Eric, but there's really nothing we can do about that; we don't go that low-level.

Yeah, I was going to say the same, because even while Heka is processing the logs, at that moment the logs are in memory, so if your computer crashes at that time, then you will lose those logs. That kind of thing can happen.

Do you use the standard log format in the OpenStack services, or have you tried the JSON formats?

No, we use the standard format, and we parse it with Heka.

Okay, and how does the traceback parsing work? Is it reliable?

Yeah, it's totally reliable; it works as you'd expect it to work. We have a Heka decoder which is specific to OpenStack, and it can deal with tracebacks. When it sees the beginning of a traceback, it says "oh, it's a traceback, now I'm going to read more lines and accumulate them until I see the end of the traceback", and that ends up as just one message.
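This is not Kolla's actual decoder, but a minimal sketch of that accumulate-until-the-end idea using Heka's Lua sandbox API (process_message, read_message, and inject_payload are documented sandbox functions; the traceback-detection patterns are simplifying assumptions):

```lua
-- Sketch of a Heka Lua decoder that merges a multi-line Python
-- traceback into a single message instead of one message per line.

local buffer = {}
local in_traceback = false

function process_message()
    local line = read_message("Payload") or ""

    if line:match("Traceback %(most recent call last%)") then
        in_traceback = true            -- start of a traceback: begin buffering
        buffer = { line }
        return 0
    end

    if in_traceback and line:match("^%s") then
        buffer[#buffer + 1] = line     -- indented continuation line
        return 0
    end

    if in_traceback then
        buffer[#buffer + 1] = line     -- final exception line ends the traceback
        line = table.concat(buffer, "\n")
        in_traceback = false
        buffer = {}
    end

    -- Emit one message, whether a plain log line or a whole traceback
    inject_payload("txt", "log", line)
    return 0
end
```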
Congratulations, guys, really good presentation. My name is Neil, I work in cloud engineering at Box, and I was wondering, I think you touched upon it a little bit, whether you could extend this idea to getting metrics from your various components, and maybe even from the VMs themselves, into the same pipeline you're using for log aggregation. Could you also extend this for metrics aggregation?

We have this in our plans, although not precisely as you describe it, because we don't want metrics to be stored in Elasticsearch; there are better ways to deal with metrics, since they are a different kind of data. Heka will come in handy for it, and we will probably create a new set of tooling to deal with metrics and monitoring data, with something like collectd or Snap. That's one of the things we want to deal with in this cycle, so if any one of you would like to help... Thank you, sir. I think we have a summit session about this. Yes, I think we do.

Thank you, guys. So, I think you mentioned this as a way to gather infrastructure logs. Any thoughts on how to extend this to user application logs?

We have no such plan for now, so maybe we could integrate with a project like Monasca in the future, but right now it's only about the OpenStack services logs and the infrastructure logs.

Hi. When you compared Logstash with Heka on performance, did you also test Logstash with Filebeat as a lightweight log shipper?

Yeah, we thought about it, but even if you use lightweight agents, you need to have a centralized Logstash somewhere, and we really wanted to have a distributed and scalable architecture. So we have Heka everywhere, on every node, with Heka pushing directly to Elasticsearch, which we know scales.

Okay, and is the filtering in Heka as powerful and as easy to use as in Logstash?

Well, you have to know Lua; you have to know how to develop plugins in Lua if you want it to be very flexible and do exactly what you need.

For basic splitting of a log file into log lines from the OpenStack log format, is there built-in filtering, or do you also have to write Lua scripts for simple things?

As I said, Heka comes with built-in plugins, and there are many of them. For example, you have splitters; there are very basic splitters that will split your log stream on end-of-line characters, and you have many plugins like that. But in our case, to be able to deal with our tracebacks, we had to write our own decoder.

Okay, thanks very much. Any other questions? Thank you, we can close this. Thank you very much again.