what's monitoring. I don't want to scare you in the morning, but perhaps wake you up a little bit. There's no standard definition of what monitoring is, so let me explain what monitoring means to me. The foundation, I would say, is availability and functional monitoring. If I'm not able to ping a device or reach a service, I have no idea what's going on, what the performance is, what the metrics are. So it's about figuring out whether your infrastructure is available, whether your services are available, and whether they are basically functional — whether you can log into the database, for example. That's the foundation for everything that comes later. On top of that, there are time series. Metrics and time series have become very popular, especially in the last three or four years, starting with Graphite. There were things before that — RRDtool, MRTG, and all that stuff — but people were never so interested in metrics, and that has changed a lot in the last years. So this is a big topic for me. Then there are logs and events, which means pulling information out of systems and dealing with the logs they emit. And another very important thing for me is user experience. It doesn't help you in any way if all your checks are green and everything is okay, but the user experience is bad and the user is not able to use the web interface or the client application. So getting some perspective on how your application is served to users is also important, I think. So, what to monitor? When we visit customers, when we are on projects, it's sometimes really not easy to figure out what should be monitored. That's a hard question: what is important for me? There are different approaches to get to a full-featured monitoring. If you have no idea where to start, I would say the best way is to focus on your business.
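To make the availability layer concrete: this is a minimal sketch of what such a foundation check can look like — a plain TCP connect with the Nagios-style exit-code convention. The host and port in the usage comment are placeholders, not anything from the talk.

```python
import socket

# Nagios-style plugin exit codes, the common convention for check scripts
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_tcp(host, port, timeout=5.0):
    """Basic availability check: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is reachable"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"

# A monitoring core would run this and use the exit code, e.g.:
# code, message = check_tcp("db.example.com", 5432)  # placeholder host/port
# print(message); sys.exit(code)
```

Functional monitoring goes one step further than this — actually logging into the database instead of just opening the port — but the shape of the check stays the same.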
What are the services you're making money with, internally or externally — what is the important service, what drives your business? To answer that, I think a top-down approach to monitoring is very helpful. The opposite — bottom-up — means you monitor every device you can find: you run an auto-discovery across every IP address you have, and everything that replies gets monitored. But that doesn't help you. It just means you get 2,000 emails in your mailbox and you will probably spend half of your day creating rules to move them to the trash. Therefore a top-down approach is really very helpful. First of all, focus on your business logic: what are you actually doing? The funny thing is, some people don't know what they are doing. I meet a lot of customers where I ask, "What are you making money with?" and they say, "Yeah, that's hard to say." Then you have a problem — you should know what you're making money with. Starting with the business logic, with the external services your customers use, the ones your customers are unhappy about when they are not running, is a very good point to start. Then focus on the applications: if you have an external web service or a web shop or something like that, figure out which applications are responsible for keeping that business logic up and running. It could be one application, it could be several applications that together make your business service work. Underneath that there are services — a Tomcat, a database, whatever is needed for an application to be up — and then you come back to the business logic on top of that. And at the end, the infrastructure is important as well. Of course, there are different perspectives on infrastructure, because the people interested in, I don't know, a failed disk are not the same people interested in a failure of the business logic.
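The top-down layering described above — business logic on top, then applications, then services, then infrastructure — is essentially a dependency tree where a failure anywhere below rolls up. A tiny sketch, with entirely made-up node names:

```python
# Hypothetical service tree: business service on top, an application below it,
# services underneath, infrastructure at the bottom. All names are invented.
TREE = {
    "webshop": {                             # business logic
        "shop-frontend": {                   # application
            "tomcat": {},                    # service
            "postgresql": {"disk-sda": {}},  # service with infrastructure below
        },
    },
}

def rollup(tree, failed):
    """A node is healthy only if it is not failed and all of its children are."""
    return all(
        name not in failed and rollup(children, failed)
        for name, children in tree.items()
    )
```

With this shape, a failed `disk-sda` makes `webshop` unhealthy too — which is exactly the point the talk makes about why a hard drive failure eventually matters to the business logic.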
Perhaps your management is not interested in a hard drive failure, but somebody should take care of it, because if all your hardware is crashed, then you have a problem in your business logic as well at a later point. So there are different perspectives, but I would definitely say a good place to start is going the top-down way. So, how to monitor? How you do it depends heavily on your perspective. What's important for you, what you'd like to see, is something nobody can answer for you. The perspectives on your infrastructure and your services can be so different that, depending on your employees, or on the consultant who comes to your company, you get told a totally different story. So there's no perfect rule for it. One thing I've heard a lot of times when people discuss monitoring is "push or pull — which is better?" And I think there's no "or" there; it's an "and". There are cases where push makes sense, and there are other cases where pull makes sense. For me there's no "or" — it's like arguing whether Vim or Emacs is better. Maybe there is a better one, I don't know, but for me it's definitely push and pull. Sometimes it makes sense to go to a machine and pull the metrics out. On the other hand, it's also important to deal with passive events coming in — metrics that are sent in the push way, for example SNMP traps. They are still out there, and I think we're not going to kill them in the next ten years. Personally, I don't like auto-discovery. Auto-discovery is very helpful in marketing, because you press a button and then you have ten thousand green or red lights, but the quality of an auto-discovered environment is most of the time not very good. Because what do you have? A bunch of services you figured out exist in your environment — but perhaps it's a laptop, a notebook, a workstation, whatever. It's really hard to build a good monitoring environment out of an auto-discovery scan.
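Back to push versus pull for a moment — the "and, not or" point can be sketched in a few lines. Nothing here is a real protocol; the wire format and any host names in the comments are illustrative only.

```python
import json

# Pull: the monitoring server actively fetches a value from the target.
# Push: the target emits a value and the server passively receives it.

def pull(read_metric):
    """Server side: poll a target by calling its reader, active-check style."""
    return read_metric()

def push_datagram(metric, value):
    """Agent side: build the datagram a pushing client would send, e.g. via
    sock.sendto(datagram, ("monitoring.example.com", 5140))  # placeholder"""
    return json.dumps({"metric": metric, "value": value}).encode()

def receive(datagram):
    """Server side: decode a passively received push event (SNMP traps
    follow this passive pattern, just with a very different encoding)."""
    return json.loads(datagram.decode())
```

The design point is that both directions end up producing the same kind of event on the server — which is why a good setup usually supports both.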
There are some exceptions to that: especially if you work in big network environments, auto-discovery can make sense. If you have a good tool — for example OpenNMS, which is really good for telcos and big network environments — there's very good auto-discovery that also creates dependencies for you. There it can be helpful, because creating dependencies on the network layer, for example, can be really hard work if you do it with infrastructure as code. But in general, for IT services and infrastructure, you should definitely think about infrastructure as code. That means you have a process that defines where your services are. Use configuration management — Puppet, Ansible, Salt, whatever. You already have something that orchestrates your infrastructure, where you say this service should run on this bunch of machines — and monitoring should be a part of it. The times where you set up all your services and then opened a ticket for the monitoring team saying "please take care of the monitoring" — please don't do that anymore. Monitoring must be part of the lifecycle: when a service goes up, it has to be monitored. Also in the early stages of development it should be part of the process that monitoring is an integral part of it. And when the service goes down, it gets removed from the monitoring system again. Infrastructure as code is, for me, the only way to do monitoring right — monitoring as part of your process — because it reduces failure rates and it reduces alerts about things that are fine or that are no longer there. So it's definitely a good way, and also an important criterion when choosing a tool: you should figure out which tool you can configure with your favorite configuration management. Provide monitoring as a service.
If you have a monitoring system, make sure the other folks in the company have a chance to work with it. Provide an API, an interface, whatever, so they can participate in the monitoring and are not required to open a ticket or write an email. Monitoring should be a fundamental part of your infrastructure design, and therefore you need something like a service interface. Whatever you use, monitoring needs to be part of the process, and to get people to engage with it you have to provide it as a kind of service: if you would like to have a service monitored, independent of whether you are on duty or not, you use the interface, you add your service, and then it's in the monitoring system — you get your metrics, and you can also create dependencies on it. And if you don't add it the right way, you won't get metrics for your service, so that's an important effect. So, having explained what I think monitoring is and how a technical approach can work, let's talk about the tools, because that's hopefully why we're here — starting with availability and functional monitoring tools. What's out there? There are dozens of tools. There's no reliable database of open source monitoring tools; if you look on Wikipedia, there are about, I don't know, 64 of them. There's also a monitoring survey James Turnbull ran for a couple of years. It's a little bit outdated, since the last one is from 2015, but it gives a good impression of what's out there: he asked people on a yearly basis what they are using, what they are doing with their tools, and what's important to them. And in the results you can see that Nagios is still number one. Then there are a couple of SaaS services like CloudWatch and New Relic, some homegrown tools — which usually means a modified Nagios or something like that.
Then Icinga is coming up, then Sensu, Zabbix and all these tools from the open source, on-premise area, which I will talk about — except Opsview and Centreon, because they are kind of flavors of Nagios and I don't want to cover that ground again. They are of course individual products with advantages and disadvantages, but it's also not possible to fit them into a 45-minute talk. So I will focus on the open source on-premise tools in this survey: Nagios, Icinga, Sensu, Zabbix, Riemann and OpenNMS. Perhaps you noticed that OpenNMS is not in the survey, but I think it's worth mentioning, so I put it in. Let's talk about the first one. It's kind of a love-hate relationship: I used Nagios for years, I was on the Nagios Community Advisory Board once, and the thing is — I think Nagios was a good system, the way steam engines were cool at some point in time. Nagios has a lot of advantages, it is easy to extend and all that stuff, but today I would definitely say there are better alternatives. Of course you can still use Nagios if you have to monitor 20 or 30 hosts, and Nagios really is reliable, because the codebase is so old and so many people have patched it that it simply works. But there are better options out there. I'm not a Nagios hater, to make that clear, but if you start fresh, if you're thinking about doing something new with monitoring, please don't start with Nagios — start with anything else, but not with an old product that has no active development in it. Sorry, Nagios, if somebody from the project is here. Icinga 2 is the other one on the list. I should also say that I'm involved in the Icinga project; it has pros and cons as well, and I would like to treat every product fairly.
Icinga originally came out of Nagios: the Icinga project forked the Nagios 3 codebase. At some point we figured out that it was really hard to enhance that codebase, so the team started to rewrite it from scratch in C++ — that became Icinga 2. A definite advantage of Icinga 2 is that a lot of integrations with other tools are built in: if you would like to write metrics to Graphite, InfluxDB, OpenTSDB and so on, it's just a feature you need to enable — you don't need Graphios or other external tools to make that happen. It has an application-level cluster stack, and it's really easy to automate because it has a REST API. The REST API to add, delete and modify services at runtime makes it very easy to run in an HA environment, and that's definitely an advantage. A disadvantage of Icinga could be that there are so many possibilities — active and passive checks, checks running on the server or on the client — that it can be complex to set up, especially if you have a system with multiple nodes, or the certificate handling you need to make it secure; people sometimes have problems with all of that. There's always room for improvement, of course — the documentation here is not so bad, but we regularly see that people struggle with the sheer number of possibilities. Talking about Sensu: Sensu in general has a very similar approach to Nagios and Icinga. It also has standalone and subscription checks, meaning the server can do things and the client can do things. A lot of people complain about RabbitMQ, which is necessary to run Sensu. I'm not a RabbitMQ expert, but if you look at the forums, a lot of people complain that it's not running, it's not stable, it's hard to install. I don't want to bash RabbitMQ, because I don't really know — but the transport layer seems to be a common problem when people run Sensu in production.
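Coming back to the Icinga 2 REST API mentioned above — this is what adding a host at runtime looks like. The request shape (PUT to `/v1/objects/hosts/<name>` with `templates` and `attrs`) follows the Icinga 2 API; the URL, credentials and host data here are placeholders, and a real call would also need the instance's CA certificate.

```python
import base64
import json
import urllib.request

def build_create_host_request(api_url, user, password, name, address):
    """Build the PUT request that creates a host via the Icinga 2 REST API."""
    body = json.dumps({
        "templates": ["generic-host"],
        "attrs": {"address": address},
    }).encode()
    return urllib.request.Request(
        f"{api_url}/v1/objects/hosts/{name}",
        data=body,
        method="PUT",
        headers={
            "Accept": "application/json",
            "Authorization": "Basic "
            + base64.b64encode(f"{user}:{password}".encode()).decode(),
        },
    )

# Against a real instance (the API listens on port 5665 by default):
# req = build_create_host_request("https://icinga.example.com:5665",
#                                 "root", "secret", "web01", "10.0.0.11")
# urllib.request.urlopen(req)  # needs the cluster's CA certificate configured
```

This is exactly the mechanism that makes the "monitoring as a service" and infrastructure-as-code ideas from earlier practical: a provisioning run can register and deregister hosts without anyone touching the monitoring server's config files.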
There's also no historical data: if you want to do SLA reporting afterwards, that's really hard in Sensu, because the information is there in the logs, but you don't have a data model to access it. And what's really sad is that a lot of Sensu features moved to enterprise-only. Dependencies, for example — there are no dependencies in the open source version. SNMP moved to the enterprise version. It's not wrong to have an enterprise version, don't get me wrong, but I don't know where the border between the open source and the enterprise version should be drawn here. I think the open source version is in a hard place, because it's probably not enough for an advanced monitoring setup. If you're able to buy the enterprise features on top of it, then it will pretty much do everything you need, but this is something you should be aware of. Zabbix. Zabbix is also a very popular monitoring tool. In Japan, for example, I heard it's the de facto standard, so a lot of people are using Zabbix there — perhaps you know better. It's a full-featured solution, and a definite advantage is that you get a lot out of the box: you install the agents, turn them on, and you get all the data, your graphs, everything — it's really easy to get your monitoring system started. Alerting and graphing are integrated in Zabbix, which makes it easy. It's a little bit harder to orchestrate and automate, but that's usually the case with every tool: the more you get out of the box up front, the harder it usually is to extend later. Because everything is integrated, extending it in some direction is really not that easy. And scaling Zabbix can be a problem, because even though the proxy systems have their own databases, at the end all the data has to be written into a single central database. To be fair, other tools like Icinga have their issues there as well, but that's a scaling limitation.
I know customers who hit that with Zabbix: if you want to scale out, your database needs to be very well equipped — in a large environment you definitely need SSDs to deal with all the data coming in from Zabbix. Riemann. Anyone heard of Riemann? Okay, a couple of you. Riemann is a project where there's not so much going on right now — that's my last point on the slide, but I noticed there's not much happening in the project at the moment. Riemann is a stream processor: you have a server running, and all the clients constantly push event streams to the server. It's written in Clojure, and it's really important to know that, because you also have to write your stream-processing rules in Clojure — you have to be familiar with that language to really work with Riemann in a good way. If you would like to measure different things — a web server, an application server — there are different Riemann clients to send the metrics over to the Riemann server. It's stateless: it doesn't store any data, it just processes it and shows it in a web interface where you can see the metrics. The advantage of Riemann, perhaps in combination with another monitoring tool, is that you get real-time information about your systems: you constantly get your performance streams out of the system and see what's going on. But like I said before, there's not so much going on in the project — I checked GitHub again yesterday; there are a couple of small documentation enhancements, but not much more. On the other hand, it could be that it's simply done and there's not much left to do — I don't know. OpenNMS. OpenNMS has been on the market for a very long time. It's also a full-featured open source solution — you have everything in there. It's very good at auto-discovery, like I told you before, and it's very good in heterogeneous environments; telco and network environments are areas where they are very strong.
It's based on Java, which wouldn't be worth mentioning whether you hate or love Java, except that forking out of Java is expensive. Every time you leave the OpenNMS Java context — when you have to execute an external plugin — the performance is horrible. A lot of stuff is included, SNMP and everything, inside the JVM, and that's pretty fast; but if you have to leave the JVM because you have to do something that's not included in OpenNMS, then performance is not so good. That's not really an OpenNMS problem, it's a Java problem: forking out of the JVM is expensive, and therefore scaling out external checks with OpenNMS doesn't make sense — but I'd also guess that's not its field of expertise. The auto-discovery is really cool, and they're really nice guys in the OpenNMS project — I've known them for years. It's definitely a cool tool if it fits your needs. Okay, leaving functional monitoring: let's assume we've set everything up with our favorite tool, and we can figure out whether something is running or not. Metrics and time series — it's all about counting; whether it ends up as money or not, we would like to know what our metrics look like, and the biggest player here, I would say, is definitely Graphite. I'm not talking about all the RRD-based tools — MRTG, PNP4Nagios, whatever. RRD is definitely also a very cool thing; the problem with RRD is that for most people it's not fancy enough. That's not a technical argument, I know, but sometimes you don't need technical arguments, and it's definitely a problem that people would like to have fancier graphs where you can combine different things, and that's hard with RRD. So, Graphite: the database underneath Graphite, Whisper, is very similar to RRD, with a couple of differences — for example, you can also write data points out of order, which doesn't work with PNP/RRD, where the data has to arrive in sequence; that is not necessary with Whisper.
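To make Graphite concrete: metrics reach it through Carbon's plaintext protocol, which is just `path value timestamp` lines over TCP (port 2003 by default). A minimal sketch — the metric name and host in the usage comment are placeholders:

```python
import socket
import time

def carbon_line(path, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol: 'path value timestamp\\n'."""
    timestamp = int(timestamp if timestamp is not None else time.time())
    return f"{path} {value} {timestamp}\n"

# Shipping it to a carbon-cache (2003 is the default plaintext port;
# the host name is a placeholder):
# with socket.create_connection(("graphite.example.com", 2003)) as sock:
#     sock.sendall(carbon_line("shop.checkout.duration_ms", 123).encode())
```

The dots in the metric path are what create the tree structure the talk mentions — `shop.checkout.duration_ms` becomes a branch in Graphite's metric tree.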
Graphite kind of started the metrics revolution — it really started to be popular, I would say, four years ago, something like that. And its biggest advantage is also its biggest disadvantage, because Graphite consists of different components: you have Whisper, which is the database layer with the RRD-like files; you have Carbon; you have Graphite-web. Some of the original components are, first of all, really hard to install — a lot of people fail with Graphite because they never get it up and running. But the separate components also scale very well, and if you look at GitHub there are so many projects replacing individual components of Graphite — carbon-c-relay, for example, which acts as a proxy in front of the carbon-cache — so many things that it's really a moving target, and it can be hard to debug. I know there are a lot of large environments based on Graphite and they've figured out how to deal with it, but getting started with it and scaling it needs some knowledge, I would say. Still, it's definitely kind of the standard. Another thing is OpenTSDB. Anybody using OpenTSDB? Congratulations, you made it. OpenTSDB is pretty cool, but it's hard to set up, because it's based on Hadoop and HBase — you have to know how that machinery works, and if you do, congratulations, then you can run it. It's really able to scale like crazy, and you can store all your data: you don't have to downsample, you can live with the raw data forever — if you're able to pay for the disks, of course. And once it's up and running it's very easy to scale: based on HBase and Hadoop, you just add a node and that's it. It's very powerful, and a lot of monitoring tools provide an API to OpenTSDB. So if you already have knowledge about Hadoop and HBase, or already have a cluster, that would be a good point to start. But if you just want to get going with metrics and first have to set up Hadoop and HBase — well, good luck if you have the time.
Anyway, it's harder to get started with OpenTSDB. Another thing gaining more and more attention is Prometheus. Prometheus was originally developed by a Berlin-based company named SoundCloud for their internal metrics storage. They open sourced it, and about a year ago, I think, it became a member of the Cloud Native Computing Foundation. Prometheus is also a time series database, but with a dimensional data model: the model is not tree-based like in Graphite, where you have these tree-structured metric paths — Prometheus is very flexible in its data model. Originally it was designed for web services you can query externally; there's no plugin mechanism built in for things like OS metrics, so you need additional tools installed on the target — if you would like to get load or other infrastructure information, for example, you need to install the node exporter. What makes Prometheus very powerful is its rule-based alerting: there's a component named Alertmanager where you can create alerts based on metrics and thresholds and route them to a user. If your setup is really based on metrics — if basically all your information comes out of, let's say, response times and specific load scenarios — then Prometheus is really good for building a monitoring on top of metrics. It has to fit your environment, but definitely in the cloud area — Kubernetes is in the Cloud Native Computing Foundation as well — it's gaining a lot of traction, and it's heavily developed and moving forward. It's definitely an interesting tool to check out in the next year.

Another thing is InfluxDB. InfluxDB has a very similar scope to Graphite — I think the initial idea was, let's make Graphite easier. And it really is much easier to install, and it has a very powerful, SQL-like query language: if you're familiar with SQL, it's really, really easy to get metrics out of InfluxDB. The catch is that the cool stuff needs the enterprise edition — horizontal scale-out is an InfluxDB Enterprise feature, so you need to pay for it. They've also put a lot of energy into doing more with it: they developed a stack named the TICK stack — Telegraf, InfluxDB, Chronograf and Kapacitor. Telegraf sends metrics to InfluxDB, Chronograf is the web interface to analyze them, and Kapacitor is something like the Alertmanager in Prometheus. So the folks at InfluxData figured out that they are probably more than a metrics database and created components to cover the whole chain: sending metrics to InfluxDB, a web interface to analyze them, and a rule-based approach to get alerts out to the user.

Elastic — I would say it's the de facto standard. Who's using Elastic in some way here? Definitely more than OpenTSDB. Don't ask me why — I would say they were the first player, and they have made very good decisions in buying other projects. Elasticsearch has been there for a couple of years, based on Lucene, and the Elastic Stack also entered the time series era, I would say about one and a half years ago: there's a Kibana extension named Timelion for metrics and time series, which is pretty cool. You should also look at Elastic Beats, which is a way to send metrics information directly from your tools — there are Beats for Icinga, for Nagios, perhaps for your application — so you can send metrics straight to Elasticsearch, bypassing Logstash, and then use Kibana to query them. It's a different modeling approach, because the fundamental way Elasticsearch works underneath is different from how Prometheus or Graphite work: with Elastic it's more important to know at the beginning what you would like to see, where the Graphite approach is more like "put everything in, store everything if you can afford it, and look later at what you need". With Elastic, I would say, you need to think more up front about what you want metrics for and how the document and object design should look. StatsD can be helpful here, in combination with Logstash: StatsD is a metrics aggregation daemon from Etsy where you can work with counters and aggregate specific values, and putting that information back into Elastic is very powerful — it can help you in some ways as well.

So now we've found out where we can store the metrics. Several of the tools have their own web interfaces — the InfluxData tools, Kibana of course — but if you talk about visualization, about getting all these metrics displayed, Grafana is the thing. Grafana is, I would say, the standard right now, because it works with all of these databases: Grafana has interfaces to all these tools and more. The people working on Grafana — Torkel and the Grafana team at Raintank — bring out new releases, I would say, every few weeks, and it's very easy to start with. That's probably one of the reasons so many people use it: it's easy to combine different data sources, even from different back ends, in a single panel — you can pull some series out of InfluxDB, some from Graphite, and on top of that information from Elastic. Who knows Grafana annotations? Not so many. Annotations are a cool way to mark events in your graphs: something happens in your system — for example a Puppet run, or, the classic, a git commit where somebody broke the system — and you can show that event right in the graph. Grafana can look into your metrics database and additionally look up your syslog information, and that's pretty cool, because when you see that something is wrong with your performance, your metrics, your response times, sometimes it's just a Puppet run exchanging some software, just a commit in your software. The annotations make it very easy to see that. You can see, depending on
the event — here it's just a test event — you can also add more information: you could say "there's a git commit by developer XYZ", so you can call him. It makes it really easy to do a quick analysis of what happened when the performance changes. So if you use Grafana and have some kind of log management tool in the middle, you should definitely give it a try.

Talking about logs and events: first of all, I think we have to start with the difference between a log and an event. A log is just a flow of unstructured data. Hopefully it has some kind of timestamp — that's what makes it workable — but beyond that it's just a timestamp and a bunch of information. If you would like to do more with it, you have to split it up into attributes. Turning a log into an event means identifying what the timestamp is, which service is responsible, what the message is. And I would say it's always the same process — unless you only have to store your logs and don't care what's inside, but if you would like to work with your logs, you have to turn them into events. The process should always go from log to event. Think of Logstash or Graylog: you have Grok, for example — different patterns that split the log line into its parts. Most of them already exist out there, so we don't have to reinvent them; just grab a pattern, adapt it a little, and then take action. If you really work with logs, you first have to identify the attributes, and then you can work with them later on. Elastic in this area is even more the standard than in the time series area. It's highly integrated: Elastic was the first one there with Elasticsearch, based on Lucene; then they — I don't know whether to say bought or adopted — took in Logstash, developed by Jordan Sissel, and then Kibana. So they got all these tools together — Beats, too, was an external project named Packetbeat before — and Elastic did very well in assembling the right components into a complete solution. It's now the Elastic Stack; it was previously the ELK stack, but since they added Beats that name doesn't work anymore. Out of the box you don't have any user authentication or such things, and if you want everything in a fancy, secured way, you need X-Pack for it. But how Elastic handles this is, I think, very good. Logstash, for example, has a very powerful API, which is an advantage over others: you can really go to a Logstash instance and see what's going on, how the data is processed, how quick it is — and that's open source. If you would like to have it in a fancy, clickable way in Kibana, you need to buy X-Pack, but I think the border they draw between the open source and the enterprise edition is pretty good, because you have everything — you have the APIs, you can access them; if you need it in a more convenient way, you buy X-Pack. The Logstash API came with, I think, Logstash 5, a year or a year and a half ago; if you're working with Logstash and you see some kind of processing or performance problems, the API is very, very helpful. And it's by far the largest community: if you think about log management and all that stuff, Elastic is the biggest community.

Another cool tool is Graylog. Graylog is also based on Elasticsearch, so it uses kind of the same database, but the biggest difference is that all the configuration and the processing rules — everything you have to do by hand in Logstash — are provided through a graphical interface. Also, if something like authentication and authorization is important for you, Graylog can be a very good choice, because everything is in there: connecting to LDAP, having user roles, all that stuff is free. And it's easier to start with Graylog, because you have an interface where you can see your input sources and output sources. It also works very well in combination with Logstash — it's not an "or", it can be an "and": you can combine all these tools, and there are a lot of users running Graylog but using Logstash to get the information out of the systems. Another tool is named Fluentd. Fluentd, like Prometheus, has joined the Cloud Native Computing Foundation. It's a kind of unified logging layer — I would say Fluentd can be a replacement for Logstash; it has a very powerful log routing layer and connects every system with the others. (I have two minutes? You have nine minutes? What is correct — I only have ten minutes? Okay, I have the right time zone, hopefully.) Fluentd can be a good alternative to Logstash, also if you don't want to use Elasticsearch as the data store and use something else instead, because Fluentd supports multiple backends. Fluentd is also a good point to start. An advantage over Logstash could be that it has built-in reliability: it has a file- and memory-based buffer system, and you can also replicate to multiple Fluentd services, which Logstash doesn't have. So it could definitely be a good man-in-the-middle replacement.

User experience — probably the last area, and the one that covers all the others. Who here is doing end-to-end monitoring in some fashion? Okay, we'll not have a lot of fans here, I guess. It's not super popular — really figuring out how your application behaves in the browser, or how your fat client works — not a lot of people do it, because it's a lot of work. For a typical ops person it's perhaps not so much fun to deal with a front end and fill in different form fields, but it's really, really helpful. End-to-end monitoring gives you the other perspective, so you don't end up with a disappointed customer — and it's really often the case that technically everything is right, but the user experience is shitty anyway. I'm not talking about a bad
interface, that's another story, but your interface just doesn't work as expected. Therefore, for some services, for example a webshop, going through your shopping experience, adding a product to the cart, doing a checkout, and checking that the invoice is correct could make sense. There were tools out there, WebInject and AutoIt; they were cool a long time ago, but they are really not in active development anymore. The last WebInject release was in 2006, and still people are using it; AutoIt is also a little bit outdated, and the current Windows versions are not supported.

There are two end-to-end user tools I would like to mention here. One is Sakuli. Sakuli is a combination of a tool named Sahi and one named SikuliX; one is for web testing and one is for fat-client testing. It's mainly for Nagios-compatible systems, so Nagios and, I think, Centreon and OP5 and all that stuff. For most of the checks you can just launch a Docker container, so it's very well isolated. I'm not really sure, but some of the Sahi features may be enterprise-only; you have to check what you need here.

Another interesting product, which I guess nobody here knows, is a product named Alyvix. It's developed by an Italian company. It's a complete solution for monitoring web user experience and fat clients; they have an IDE where you can really create test cases on an end-to-end basis, and they have a full audit trail and a notification system as well, which sends you a screenshot, for example, if something is wrong. You should definitely have a look if you would like to do end-to-end monitoring, if you would like to check your user experience, or if you would like to see whether your Citrix is working. They are also able to work with mainframe terminals, if that's a thing here; mainframe and DevOps, I don't know, it could work in some way. Anyway, if you have it, they can handle it. They can also work with Java applets; they can do pretty much everything.
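The webshop walk-through described above can be approximated, far more modestly than Sakuli or Alyvix do it, with a plain HTTP-level check. This is a sketch under assumptions: the shop URL and the page markers are hypothetical placeholders, and a real end-to-end tool drives an actual browser or fat client rather than raw HTTP.

```python
from urllib.request import urlopen

# Hypothetical shop URL and page markers -- replace with your own flow.
SHOP = "https://shop.example.com"

STEPS = [
    ("home page", f"{SHOP}/", "Welcome"),
    ("product",   f"{SHOP}/item/42", "Add to cart"),
    ("checkout",  f"{SHOP}/checkout", "Invoice"),
]


def check_shop(steps=STEPS) -> list:
    """Walk the pages a customer would hit; a step fails on an HTTP
    error, a timeout, or a missing expected text marker."""
    results = []
    for name, url, marker in steps:
        try:
            with urlopen(url, timeout=10) as resp:
                body = resp.read().decode("utf-8", "replace")
                ok = resp.status == 200 and marker in body
        except OSError:
            ok = False
        results.append((name, ok))
    return results
```

Even a crude check like this catches the case the speaker describes: all the infrastructure checks are green, but the checkout page no longer renders what the customer needs.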
It's really cool. So, the conclusion. If you go out of this talk and someone asks next week, did you learn anything, what now? Your boss comes on Monday, and, I'm sorry, there is no best tool, and that was not the goal of this talk, to say: choose that one. I think it depends on what you need. There are kind of two different approaches. There's the monolithic way, where you have tools that do a lot out of the box, like Zabbix and OpenNMS; they have everything in there, you install the agents, you have graphing, logging, everything. And if that's enough for you, then okay, because it's easy to set up and you don't have to deal with all the external components. Also, if tech is perhaps not your main focus, if tech is an important part but not all your people are into tech and you would just like to have monitoring, then it could be a good choice, because you get something easily. If you need more, if you need to scale out, if you want to play with the newest fancy hot stuff on the market, then a modular approach could be better. If you are a tech-driven company, if your ops people are really up to date and would like to play, then combining different tools together in a new way is the best approach. I prefer a modular approach, a toolchain, because sometimes the monolithic approach is good enough, but the problem is that at some point you see a new thing coming up again, say a new metrics system you would like to play with, and then going from a monolithic approach to a modular approach, saying I would like to replace my integrated graphing solution with Graphite or whatever, is hard. So the best thing is: take your favorite tools and set up a kind of real-life use case. Don't just test on your local Linux box, or only in addition; play with a use case you have in the company, play with the integration, try to hook different tools together, and choose your
favorites. Sometimes you are not able to make the best decision; like in life, you can go through every argument, but sometimes it's just flipping a coin. If you don't know, just start with one and figure out if you like it, and if you like it and it does everything you need, then it's perfect for you. So, I'm hopefully in time. Thank you very much for listening. Are there questions? Is there time for questions? Five minutes. Are there questions? Are you awake? Kind of. I didn't get the question, I'm sorry. Hello?

Yeah, I was just curious why you didn't mention PagerDuty in this.

Because of open source. PagerDuty is cool, like the others, VictorOps and all these tools in between, but it's an external alerting service. VictorOps is also cool, because they're using Icinga, I know. No, it's an external service, like a lot of other cool things in the software-as-a-service market, of course, like New Relic or CloudWatch or Datadog or Librato, but this talk is focused on open source, possibly on-premise tools, and therefore it's not part of it. Of course, I know that all this alerting, the external notifications, SMS, voice, is a big part of a monitoring toolchain. You can do it on your own with various solutions, but PagerDuty is definitely a good choice as well. More questions? There's somebody.

This question is not related to the talk exactly, but I have used both the open source monitoring landscape and the proprietary stuff, so, in your opinion, how far ahead is something like Splunk or PagerDuty compared to the ELK Stack and everything, in terms of a scaling, tech-driven startup?

Yeah, that's a good question. If you name Splunk: Splunk is awesome if you can afford it. Splunk is extremely expensive, so if you would like to store a lot of data, it's really hard to afford. If you say money doesn't matter, I would say go with it; it's super easy, and you can call somebody and scream at them if it doesn't work. So Splunk is really, really
good, but since you pay storage-wise, you pay for the data you store. I see a lot of customers moving away from Splunk, not because of the tool, but because they're not able to afford it; it's really only a cost reason why people go away from Splunk. In terms of the features, Splunk is definitely much easier to configure, so I would say Graylog is closer to the Splunk approach than Elastic, because all the configuration of the inputs and outputs can be done via the web interface. Also, there's the enterprise integration that open source tools would only provide as enterprise add-ons: things like Puppet or Ansible work pretty well with Splunk. Splunk is fast, and Splunk is also capable of working with probably every input and output necessary, so it's definitely a good tool. I'm not deep into Splunk, so I cannot install it; I can work with it, and I know in general what it is and what it's able to do, but I would say it's really just a money reason, otherwise it's a good tool. Like I said, if you can afford it, congratulations, it's a good one, and I don't mean that badly; if somebody from Splunk is here, I'm open for discussions, you never know who's in the audience. More questions? There's just time for one more question.

We have discussed many open source tools, so my question is about activity tracing. Let's say I'm talking about a visualization, a particular graph where all my use cases are on the y-axis and all my microservices, one, two, three, are on the x-axis, and I want to do something like color-coded activity tracing for each of my use cases. Which tool is the most suitable one?

Color-coded on what, the performance?

No, no, activity tracing. Suppose I have eight microservices: one for the UI, one for the database, the service that's talking to the database, and then my IoT
devices, and I want to activity-trace my use case. Suppose, let's say, I'm switching on my device; starting from my switch through to the application, I want to trace the activity and visualize a graph like that. Which of the tools that we discussed would be the most suitable to visualize that kind of graph?

I would say none of these tools. If you would really like to go into the application, New Relic is very powerful. It has some requirements, because you have to replace PHP or the JVM with their stuff, but they are really able to go into it. There is also an open source alternative to New Relic; I can tell you later, it's not part of the presentation, but with it you really can see what's going on in your application, and the combination of the use case coming from a service with what's happening in the database. I'll talk to you later, because I have something I can show you. Okay, I'm here both days and, like I mentioned before, also on Saturday, I think. Thank you very much and enjoy the conference.