Hello, everyone. My name is Nikolay Kondrashov. I work at Red Hat in the common logging team, where I focus on the user session recording project, which I'm going to talk about right now. I also maintain the FreeRADIUS packages in RHEL and Fedora. In my free time I founded, and still maintain, the DIGImend project, which works on graphics tablet support in Linux, and I do embedded development in the rest of my free time.

The user session recording project is about recording what users see on the terminal and type into the terminal, recording which commands users execute and which files they access, making that controllable centrally and stored securely, and also about searching the recordings, correlating them with other logs, and playing them back.

Red Hat clients in government, medical, and financial areas have been asking us for this for a long while, because sometimes they are required by law to do it. They want to know who broke their servers and how. They want to know who stole their data if that happens, especially in medical. They also want to trace user problems for support.

Has anybody actually been recorded this way, ever? Nobody? Has anybody set up recording of users? Okay. Has anybody wished they had set up recording of users? Okay, one, two, three. That's good.

There is a great number of commercial offerings. They range from specialized hardware, a box you buy, put on your network, and plug into your network cables; you give it the keys to decrypt your SSH sessions, database sessions, and other connections, and it intercepts that traffic, records it, and gives you access to it. The range continues through systems you install yourself on your own hardware, to jump hosts, where you log into one system and are then thrown to the target system while the software in the middle records everything, and finally to systems which record directly on the target hosts.
And these systems record everything: there is a great range of offerings for recording keystrokes and display, including the graphical display, on Windows, Unix, and Linux; recording commands, applications you started on Windows, URLs you accessed in the browser, et cetera. These systems are often integrated with identity management and access control, and some offerings are first of all identity management solutions and only then recording. And of course, they store their data on central servers and allow audit, searching, post-mortem analysis, and playback of the recordings.

Unfortunately, there was so far no open-source solution for this. The classic script program is still popular, but it is totally not security-oriented, and if you want to build something security-oriented on top of it, you need to apply a lot of effort. I have actually seen such efforts, and there was much more to them than just script. Then there is sudo's I/O logging, which records your sudo sessions and supports searching the recordings and playing them back, but it is not centralized; if you want to centralize it, you have to rsync the recordings or store them on a network file system somewhere. The closest thing there is is TTY audit: it records user input, it can be centralized the way logs can be, and it is security-oriented, but it doesn't record output.

So what we decided to do is implement a tool called tlog, which records the terminal I/O in user space. We chose to do it in user space because it was faster to do, faster to iterate on in development, and much easier for users to try. Later, after we had already started, I learned from the audit developers that logging the output in the kernel would likely not fly with the current audit architecture anyway.
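The user-space approach tlog takes can be illustrated with a short sketch: run the user's program under a pseudo-terminal, read everything that passes through, and cut it into finite JSON messages. This is a minimal illustration in Python, not tlog's actual code; the function name and the record fields (`rec`, `pos_ms`, `out_txt`, `out_bin`) are invented for the example and do not reflect tlog's real schema.

```python
import json
import os
import pty
import select
import subprocess
import time


def record_command(argv, chunk_size=4096):
    """Run argv under a pseudo-terminal and return its output as a list
    of JSON-encodable records (illustrative schema, not tlog's)."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave, stdout=slave,
                            stderr=slave, close_fds=True)
    os.close(slave)  # keep only the master side in this process
    start = time.monotonic()
    records = []
    rec_id = 0
    while True:
        readable, _, _ = select.select([master], [], [], 1.0)
        if not readable:
            if proc.poll() is not None:
                break  # child exited and nothing left to read
            continue
        try:
            data = os.read(master, chunk_size)
        except OSError:  # on Linux, EIO means the slave side was closed
            break
        if not data:
            break
        rec_id += 1
        try:
            # Note: a real recorder must handle multi-byte UTF-8
            # sequences split across chunk boundaries; this sketch
            # simply falls back to a byte array for the whole chunk.
            out_txt, out_bin = data.decode("utf-8"), None
        except UnicodeDecodeError:
            out_txt, out_bin = None, list(data)
        records.append({
            "rec": rec_id,                                     # message ID within the recording
            "pos_ms": int((time.monotonic() - start) * 1000),  # millisecond offset
            "out_txt": out_txt,
            "out_bin": out_bin,
        })
    os.close(master)
    proc.wait()
    return records


if __name__ == "__main__":
    for msg in record_command(["echo", "hello"]):
        print(json.dumps(msg))
```

In the real tool, each such message would go to the journal, with a few fields duplicated as journal fields so the recordings can be found and listed.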
Then we used the logging infrastructure for delivery of our recordings along with other logs. That saves us a lot on infrastructure and maintenance, and it lets us easily correlate with other logs, since the recordings are just logs, and there are plenty of solutions for centralizing log storage. We are using audit logs for recording the rest of the session, obviously, because everything is already there: the commands executed, which you can extract from the execve syscalls; the files accessed, which you can extract from the open syscalls; there is lots of stuff there which we can use already.

We target enterprises that already have long-term log storage, with central control via the FreeIPA IdM solution and SSSD on the clients. We are building a web UI for playback, correlation, and all the nice things, and we plan to make it a component so that we can embed it into OpenShift, CloudForms, or similar solutions. Right now we are building the web UI for Cockpit, about which there were talks earlier by Stef. Controlling who to record is done via SSSD or manually, and we are going to provide the configuration interface in the Cockpit web UI as well. Apart from this, for quite a while we have already had command-line tools for recording and playback.

So I'm going to quickly show you how I log into a system that is being recorded, how the recording appears in Cockpit, how we can play it back, and how we can play with the list of recordings. Here on the left is the terminal I'm going to be recording in; once I log in there and do a little something, the session should appear here on the right, at the bottom. Let's change it to user one, so it's easier. Which one?
That's the biggest there is, I'm sorry, that's what I'm using there. It doesn't really matter; the main point is that there is the session on the right. Here we are. We can start playing it back, rewind it to the end, and then we can type away. So I'm entering a sudo command, I'll fail it, and it should play back on the right. Come on, catch up. Let's try some editing, and then I can run mc. It's lagging a little, because this stuff is logged to the journal and displayed back, so there is a little bit of latency. So, basically, it looks like this. We can go back to the recordings, we can do filtering, enter a username; the point is that all of that is stored in the journal. All the recordings are recorded to the journal, the recordings are listed here from the journal, and they are played back directly from the journal.

The recording process is started as the user's login shell. In the most basic setup, you simply assign the user this login shell. When the user logs in, the recording process creates a pty, starts the actual shell under the pty, and then passes the data between the pty and the real terminal, recording everything that passes: it cuts the data into pieces, converts them to JSON, and logs them.

We optimized our JSON schema for searching and for streaming, because the data from the terminal is continuous but the messages have to be finite, so we cut it into pieces. We record input and output as separate fields in the JSON; we store timestamps separately, with millisecond precision; we preserve everything, and if there are any invalid UTF-8 characters which we cannot put into JSON, we encode them as byte arrays, specifically as bytes. When we log to the journal, because we want to be able to find our recordings and list them, we take some of the fields from the JSON message (we don't take them away, we duplicate them) into the journal as journal fields: most of all the recording ID, which is unique per host; the user that has been recorded, which can be trusted because the recording process is set-UID; the audit session ID of the recording; and the ID of the message within the recording.

Cockpit uses a very basic interface to access the journal, which is nevertheless reliable and efficient enough for our purposes so far: it basically runs journalctl on the host, asks it to output all the entries as JSON, and the code in the browser consumes that; there are enough options to do the things that we need. When we list the recordings, we ask journalctl to match the UID of the recording process, the set-UID recording process, because we want to be able to trust those messages and not mix in similar or fake messages from users or other programs. If we are filtering by username, we add the username; we can limit the "since" and "until" dates; and we ask for all the records and tell it to follow, so we can update the list of recordings live, as you saw. At the moment we read everything, all the messages that match, including the terminal data, which is quite wasteful; then we find the unique recording IDs and aggregate all the information about the recordings from the journal entries. When we are playing, we again match the UID of the set-UID recording process, for the same reasons, then we add the recording ID, and there we go: we list all the entries, we follow the recording so that we can play recordings back as they happen, and we read all the entries and start processing them immediately as we open the page; as soon as the first ones arrive, we are ready to play.

The next big step will be correlating with audit logs. That is quite a mess, difficult to work with, both with the original audit logs and with the logs that journald generates, so we made a tool that joins the messages belonging to a single event in the kernel. Audit quite often emits several log messages for a single event; they can be intermixed if events happen in parallel; and these messages contain values that we need to match against to correlate, but only one of the messages belonging to the event contains a given value, so we need to be able to match against the whole event. So that we don't make several requests for entries, we have to do it in a single go: we join the messages up into a single event, organize the fields a little, and log it as JSON as well, using this aushape tool that we made. The plan for correlating with a recording is basically to add another match to the journalctl command line, which would also match the relevant journal entries, basically by audit session ID. The UI is: we have the terminal, we have a little window with the logs, they scroll along as we play back the session, and we can click on a particular log entry and rewind the playback to that position.

There are some things we would like to improve in journalctl; maybe we'll be doing the patches sometime soon. First of all, I know that the journal has indices for field values, and that's great, and it works fast, but we would like to get matches on partial field values, so that the user can start matching entries by user or by hostname while they type; they wouldn't have to type exactly the value that appears in the field, which is the case right now. We would also like to be able to match recorded input and output, the stuff that appeared on the terminal and that the user typed in, and to search related logs, such as searching for part of a command that was executed in the audit logs, and then correlate that. We wouldn't mind if that were a bit slow; that's okay. Also, when we list the recordings, we don't need the message field of the entries, we just need several fields, and the message is the overwhelming part of an entry with recording data, so we would like journalctl to be able to return only some fields.

Then, not related to the journal but still interesting: we might need to handle different terminal types in a reliable manner. Terminal types are basically similar these days and probably nothing will happen, but we might need to do something about it. As of now, if you want faithful reproduction, you have to play back in the same terminal type that you recorded in. Then there's the problem that the JavaScript terminal emulator we use supports only a subset of control sequences. To deal with that, we would like to maybe embed a terminal emulator library into the recording process, so that we can present a single terminal type to all the recorded programs and have a single language of control sequences that we record. There is a library called libvterm, used by Neovim, if anybody knows it.

Character encodings are a much more real problem. Parts of the world still don't use UTF-8; in Japan, for example, it's not very popular, in large part because of conflicts over how various characters were unified across cultures. We will need to convert to UTF-8 anyway if we want to search consistently in Elasticsearch, and if we want to store the data in JSON at all. Conversion can lose data, so we are thinking of maybe compressing the original to preserve it and keeping both versions, one for searching and one for storage. And since we would be preserving the original, we might as well clean up the recording, so that there are no control characters or the like that would interfere with searching.

Finally, and perhaps most interesting, is seeking in the playback. What you see on the terminal depends on the output that came before: you can send some color attribute at the start of your session and then see everything in blue, or something more important than that; it all depends on everything that came before. To reproduce the state of the terminal at a particular moment, we would need to play everything back from the start, and that is obviously slow if you have lots of I/O. If you are recording someone paging through something all the time, or just watching the output of a program for two hours, it can take a while to process the output from the start. We only know the starting state, so to work around that we are thinking of maybe hooking into the terminal emulator during playback, so that we can take keyframes, snapshots of terminal emulator state, at regular intervals; then we would only need to play back the little part since the nearest keyframe. And if we get the terminal emulator library into the recorder, then we will of course be able to generate these keyframes on the fly as we record; they will be in storage, and playback will be much faster.

Okay, there are three slides here that I'm not going to go through. You can get these slides from the All Systems Go! website if you'd like to try it. If you do try it, please try to break it; we need that. Try aushape: you can just feed it your audit logs and see how it looks; that would be nice. And if you are really determined, and if you get hold of some of these guys from Cockpit, they can help you, you can try to build Cockpit and see the UI. It will look different later, but right now it's working. That's it. Questions?

Did you implement the virtual terminal in JavaScript, in the Cockpit UI here?
We took the ready-made term.js library and we just feed it the I/O data; that was easy.

What about GUIs?

GUIs: well, we have the web UI.

No, I mean, if I start, say, Emacs in the terminal and it opens a GTK window and I start doing things in there, how does tlog react?

Well, we don't record anything in graphical sessions; we are thinking of maybe doing that. When you have a terminal session inside a graphical session, it kind of doesn't make much sense, because you have so many opportunities to do something using different means, and graphical sessions should be recorded graphically: you have to record the whole picture. That will be a faraway step; so far we are focusing on servers, where we usually don't have a graphical interface. But there are solutions which record the graphical interface, so it is definitely possible, and we have some plans for the future there.

You mentioned that you need to make the recorder the user's shell, the shim thing. Can you then hand back to any shell, or are you emulating some specific shell? Users generally want a specific shell.

It can be any program running under it. At the moment it is configured in the recording process globally, per system, and there is a way with SSSD: if you use SSSD, it will actually handle the shell replacement and supply the actual shell that you have configured, with AD, or LDAP, or IPA, whatever you want.

Since it works with the terminal, what is the advantage of doing it at this level instead of doing it in the SSH daemon, for instance?

It works in any terminal; it doesn't have to be SSH, it can be the console, or telnet if you want.

If you are in the terminal and you cat a big file and it prints megabytes to the terminal, what happens with the journal?

Well, by default, at this moment, the journal will be swamped with lots of stuff, but you can turn on rate limiting: you can set the rate at which messages are logged to the journal, and the output will simply slow down. So if you are silly enough, you can of course wait until it is over, or you can just terminate your process and use a pager, which will then be faster for you to see the whole file, or the parts that you need. We also implemented, by request, dropping journal messages instead: you will see the output going fast, but only part of it will be logged. I suppose the idea there was that we don't care about fast output; we only care about what users type into the terminal.

What happens if you type a password by mistake into the terminal?

It is not recorded by default: we don't record input by default. You can turn it on, but we don't record input by default, because everything that you type you can usually see on the screen anyway. And the idea of this recording is not that it records really everything that you do, because there are a million ways to avoid being recorded in this situation; it is rather to capture the intent, the malicious intent if somebody has it, than the actual act, because if you really prepare, it will be easy to circumvent. The real recording part, the real audit part, is of course the audit logs, and you have to set those up properly so that you capture these things.

All right, no more questions. Thank you.