So, first a few words about me: I'm Peter Czanik from Hungary, syslog-ng evangelist at Balabit, the upstream developer of syslog-ng, and I do syslog-ng packaging, support and advocacy. Let me give you a quick overview of what I will talk about today. First of all, a quick introduction to syslog-ng and why it is good to use in IoT devices; the roles of syslog-ng; a bit about log message formats; the different kinds of devices where syslog-ng is used right now; and, if time permits, I will also introduce configuring syslog-ng a bit.

So, first of all, what is logging? It's recording events on a computer or a device. In this case I put up an SSH login message, but it can be just about anything. And what is syslog-ng? It's an enhanced logging daemon with a strong focus on portability and central log collection.

So, why central logging? First of all, ease of use: you don't have to check tens or thousands of devices, all logs are at a central location. It's also availability: even if the sending machine is down, you can check the log messages and figure out what happened. And it's also security: if your end-user device is hacked, you can figure out from the log messages what happened.

And why use syslog-ng on IoT devices? First of all, it's portable. It's mainly developed on x86, but it runs on ARM, POWER, MIPS, whatever. I often run into CPU architectures I have never heard of before when I find syslog-ng packages on the Internet. It also has a small footprint, as most of it is written in C. You can use it to do complex processing and filtering of log messages, so you can make sure that only relevant log messages leave your devices, in a format which is ready to use. And best of all, you can use the same software everywhere, both on your devices and also on your central servers.

So, what are the main roles of syslog-ng? First of all, it's collecting messages.
It can also process them, filter them, and at the end either store them locally or forward them to another destination. Let's talk about these in more depth. The first role is data collection, and it's not just system logs, it can be any text data on your devices. So you can collect from local, platform-specific sources like /dev/log, the journal, and so on, but also from the network. With our focus on central log collection, we support all of the different legacy and new syslog protocols over UDP, TCP, and encrypted connections. And you can collect messages from many other different sources, like files, sockets, or even application output.

The next, and in my opinion the most important, feature of syslog-ng is log processing. You can classify and normalize log messages with built-in parsers, and you can rewrite log messages. And I'm not talking about falsifying messages here, but, for example, anonymizing log messages when required by compliance, like GDPR and other requirements. You can reformat log messages: for example, if you need JSON messages on the collecting side, you can do that. And you can enrich log data using GeoIP or additional data based on the message content.

The next role is filtering. It has two main uses. First of all, you can discard unnecessary log messages: for example, you don't want to forward all of the debug-level messages, except when you are debugging an application. And the other one is message routing, making sure that the right messages reach the right destinations: for example, authentication messages from your devices reach the related analytics applications. There are many possibilities for filtering. It can be based on the message content or its parameters, and you can use many different comparisons and filtering functions. Best of all, any of these can be combined together using Boolean operators.

The next and final role is destinations.
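To make the filtering part concrete, here is a minimal sketch in syslog-ng configuration syntax of several filter functions combined with Boolean operators; the filter name is hypothetical, the functions themselves (`facility()`, `level()`, `match()`) are standard:

```
# Keep authentication-related messages of severity error or above,
# but drop anything whose text mentions "debug".
filter f_auth_errors {
    facility(auth, authpriv)
    and level(err..emerg)
    and not match("debug" value("MESSAGE"));
};
```

Such a filter can then be referenced from any log statement, so the same condition drives both discarding and routing.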
Traditionally, syslog stored all of the log messages in flat files on the local file system; later on, network destinations were introduced, so you could collect syslog messages at a central location. Later still, many different destinations were introduced, like SQL databases, and in the past few years we have added different big data destinations like Hadoop, NoSQL databases, and messaging systems like Kafka or AMQP.

Let's talk a few words about message format. Normally, if you have a Linux device, log messages are in a format of date, hostname, and some text. Here we are back to my favorite SSH login example, and you can see that it practically is an English sentence with some variable parts in it. It was designed to be read by humans, and it's quite good at that, but if you want to create reports from these messages, no two log messages are the same. I mean, messages from SSH are different from Apache, whatever, so it's quite difficult to create reports from log messages.

The other key takeaway is that there is a solution for this problem. It's called structured logging. In this case, instead of using free-format text messages, you use name-value pairs to represent events. For example, an SSH login message can be described as an application name, a username, a source IP, a source port, whatever. The good news is that syslog-ng was built around name-value pairs right from the beginning; otherwise it wouldn't be possible to do complex filtering and reformatting of log messages. So date, facility, priority, and so on are all represented in syslog-ng as name-value pairs, and using parsers you can turn unstructured and some structured data formats into name-value pairs as well, so you can use those in filtering, or just store parts of the name-value pairs, which makes it a lot more flexible. There are parsers for some structured formats, like a CSV parser, a JSON parser, and a key=value parser, which is a format used in many firewall systems.
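As an illustration of the parsers just mentioned, here is a minimal sketch that turns key=value firewall-style messages into name-value pairs; the parser name, prefix, and the referenced source and destination are hypothetical:

```
# Parse "action=drop src=1.2.3.4 dst=..." style messages into
# name-value pairs under the ".kv." prefix.
parser p_kv {
    kv-parser(prefix(".kv."));
};

log {
    source(s_network);
    parser(p_kv);
    destination(d_central);
};
```

After parsing, fields like `${.kv.src}` can be used in filters, templates, or stored selectively.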
For example, patterndb is for unstructured log messages like the SSH login message, and if you have a log message which is not covered by any of these, you can extend syslog-ng either by writing something in C or, if Python is easier for you and you don't have a high message rate, you can write your parser in Python as well.

So here comes a tricky question: which syslog-ng version is the most used? What do you think? Here is some background data: the project started 20 years ago; RHEL, the most widespread platform, ships version 3.5; and the current version is 3.13. What do you think? Just a few guesses which version is the most popular. Actually, it's 1.6, and with that we are coming to the topic of IoT devices. I don't think any other version has a larger installed base: over 100 million devices sold, as far as I know.

So let's talk about devices. There are many consumer devices which have syslog-ng inside. I already mentioned the Kindle, where anything you do in different parts of the application stack is logged by syslog-ng, and when there is an Internet connection the data is sent to Amazon; you might not be quite happy about that, I guess. Another user is the BMW i3 electric car, where, as far as I could understand from the configuration file, they use it for troubleshooting, and they use quite complex filters in their configuration to make sure that only relevant information is logged.

Another set of devices are network storage and different network devices. Here are just a few names: Synology, FreeNAS, or the Turris Omnia, which you can see on display in another building here. Often you can access syslog-ng on these only from the command line, and it was originally installed on these devices for troubleshooting and security purposes, but in some cases they provide a nice, rich graphical user interface where you can configure logging and also query log messages, and in some cases it can even be used as a central logging solution for a small network.
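To show what the Python option looks like, here is a minimal sketch of a Python parser embedded in a syslog-ng configuration. This assumes syslog-ng was built with Python support; the class name, the extracted field name, and the regular expression are all illustrative:

```
# Illustrative Python parser: pull the username out of an SSH
# "Accepted ..." message into the ssh.user name-value pair.
python {
import re

class SshLoginParser(object):
    def parse(self, log_message):
        msg = log_message["MESSAGE"].decode()
        m = re.search(r"Accepted \S+ for (\S+)", msg)
        if m is None:
            return False          # drop messages that do not match
        log_message["ssh.user"] = m.group(1)
        return True
};

parser p_python_ssh {
    python(class("SshLoginParser"));
};
```

As the talk notes, this is the convenient path when message rates are low; for high-volume parsing, patterndb or a C module is the better fit.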
I want to highlight Synology here, which recently became one of our contributors, providing quite nice memory-related fixes. Another set of devices are industrial ones, like the National Instruments real-time Linux devices which I have on display; these are used for control and automation. On these devices syslog-ng can be configured on the command line, but they provide a rich graphical user interface to query log messages. They provide it as a troubleshooting tool: the applications running on these devices log to syslog-ng, and this is how it is used.

Another interesting use case, and unfortunately I cannot name names here, is where syslog-ng is used both on the devices and on the central logging side. We have users in the car industry and also in smart metering, where many different kinds of metrics are collected on these systems, and syslog-ng and the syslog protocol are used to forward this information from the devices to a central location, where the data is stored to different big data destinations.

I want to say a few words about configuring syslog-ng, and my first advice is: don't panic. The syslog-ng configuration is simple and logical, even if it looks difficult at first sight, and often also at second sight. It has a pipeline model: there are many different building blocks, like sources, destinations, filters, parsers and so on, and all of these are connected together into a pipeline using log statements at the end.

Let me quickly walk you through a typical configuration. It starts with some global settings, like the version number of syslog-ng; you can include other configuration files and set some global options, and many of these can be overridden in later parts of the configuration. Here we define some sources. The first one is for local log sources: using system() you can hide away the differences between different platforms. At the bottom there is a network source defined.
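The opening of such a configuration can be sketched like this; the source names are hypothetical, the directives themselves are standard syslog-ng syntax:

```
# Global settings: declare the config version and pull in the
# bundled configuration library.
@version: 3.13
@include "scl.conf"

# Local logs: system() abstracts away platform differences
# (/dev/log, the journal, etc.); internal() is syslog-ng's own messages.
source s_local {
    system();
    internal();
};

# Messages arriving from other devices over the network.
source s_network {
    network(transport("tcp") port(514));
};
```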
Here are some destinations. The top one is a simple one, collecting all of our log messages into a flat file. The other one is for Elasticsearch, where you can define an index name, a cluster name, and a template describing what data to send to Elasticsearch. Here we define some filters and parsers: a simple one; in the middle of the screen you see a long list of filter functions combined using Boolean operators; and at the bottom you see the loading of a pattern database for message parsing. And here is the heart of the configuration, where all of these building blocks are connected together. The first log statement is for local log messages; it's simple, just a source, a filter, and the destination. The next one is a bit more complex: here we use multiple sources, filter the collected messages, parse them, and send the result to Elasticsearch.

Here is a screenshot from Kibana, where you can see the results of message parsing in the different parts of this dashboard. You can also anonymize log messages, which is becoming more and more important, especially here in Europe. For example, if you think about making sure that no credit card numbers are leaking, there are multiple ways to find this information, and you can make sure that you not only find sensitive information but also overwrite it, and not just with a simple constant value but with a hash of the original information, so you can use the data for analytics without leaking sensitive information.
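The log statements described above, which tie the building blocks together, can be sketched like this; all the referenced block names are hypothetical:

```
# Simple path: local logs, one filter, flat file.
log {
    source(s_local);
    filter(f_auth_errors);
    destination(d_file);
};

# More complex path: several sources, filtering, parsing,
# then shipping to Elasticsearch.
log {
    source(s_local);
    source(s_network);
    filter(f_auth_errors);
    parser(p_patterndb);
    destination(d_elasticsearch);
};
```

The order inside a log statement matters: messages flow top to bottom through sources, filters, parsers, and finally destinations.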
Here is a nice map. This was created using a small Turris Omnia router which I have: connection attempts from around the world, with the IP addresses converted to geographical locations using GeoIP. Here is the configuration, if you are interested; my presentation, with all of these configurations, is up on the FOSDEM site.

So, what is new in syslog-ng in recent releases? Making sure that even if you don't have a stable network connection, no messages are lost. You can do generic correlation using the grouping-by parser, where you can use the results from all of the different parsers in syslog-ng. The Python parser was added recently. You can use the REST API for Elasticsearch, and use the HTTP destination, for example to send data from IoT devices to Splunk; it's a typical use case. You have a wildcard file source now, and, as I mentioned, quite a lot of different performance and memory usage improvements in the last few releases.

So let me list the benefits of using syslog-ng in IoT and big data environments again. First of all, it's high-performance and reliable log collection. It's a simplified architecture, where you can use the same application for system logs and any kind of text data. The resulting data is easier to use due to parsing and reformatting, and it can also lower the resource requirements on the processing side thanks to the efficient message filtering and routing.

If you want to learn more about syslog-ng, the central source of information is syslog-ng.org. We have the source code of syslog-ng up on GitHub, which also hosts our issue tracking system. We have a mailing list, and if you have a question you can also post it on GitHub.

Do you have any questions? [Audience question, partly inaudible, about what needs to be done, compared to normal syslog, to get logs into Elasticsearch.] If you have all of these small devices, the usual architecture is that your code sends logs to syslog, syslog-ng sends the log messages to the central location, and the central syslog-ng server sends them to Elasticsearch, so you don't have to install Elasticsearch on
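The GeoIP enrichment behind the map can be sketched roughly like this; this assumes a locally installed GeoIP2 city database, and the parser name, field prefix, and database path are all illustrative:

```
# Enrich each message with geographical data looked up from the
# source IP address; results land under the ".geoip." prefix
# (e.g. ${.geoip.location.latitude}).
parser p_geoip {
    geoip2("${SOURCEIP}",
           prefix(".geoip.")
           database("/etc/syslog-ng/GeoLite2-City.mmdb"));
};
```

The resulting latitude/longitude pairs are what a dashboard like Kibana plots on the world map.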
all of your devices. Actually, I don't know; the smallest devices are usually Raspberry Pi and that category, so some embedded Linux system or BSD or whatever. Any more questions? I will be here in front of the room for a while, and I also have a developer colleague here, so if you have more programming-side questions you can also ask him. Thank you.