So, please welcome on stage Mikhail, aka GreyCat, who will introduce us to an exciting project called Kaitai Struct for dissecting media file formats. Hello, everyone, and thank you for joining me today. My name is Mikhail Yakshin, and I'd like to talk about dissecting media file formats with a tool named Kaitai Struct. So, what's the idea? Media file formats grow more and more complex every day. Media software developers have to deal with a multitude of different media file formats. Some of them are well documented, but still pretty complex to process. Some of them are proprietary and undocumented and need to be reverse engineered. That's an even more complicated task and requires quite a few hurdles to jump. For example, one needs to do proper black-box reverse engineering for the result to be included in a free and open source project without major legal problems. One needs to do lots of testing, making hypotheses and proving them right or wrong, making decisions, mapping a proprietary format step by step, exploring it, and writing some kind of description or specification for it. Basically, the mission that such a developer must undertake is going from the byte representation of a file format in a stream to objects loaded into memory, and sometimes going back from memory to a stream. That is, we have some kind of stream that needs to be dissected into objects laid out in memory, usually as some kind of tree or graph of objects. Well, the development workflow for such a process involves writing parsing code in a certain programming language. Then you write some extra debugging code to ensure that it actually works, because you need to somehow prove that it works: you either dump it to the screen, check some assertions, run it under a debugger, or something like that. Then you basically debug it till you drop, because parsing binary formats is not exactly a trivial task.
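To illustrate that traditional workflow, here's a minimal sketch in Python of hand-written imperative parsing code plus the debug dumping that goes with it. The toy format, its magic bytes, and the field names are all invented for illustration:

```python
import struct

def parse_record(data: bytes) -> dict:
    """Hand-written imperative parser for an invented toy format:
    2-byte magic "TY", then u2le width, u2le height."""
    magic, width, height = struct.unpack_from("<2sHH", data, 0)
    # the kind of assertion you sprinkle everywhere while debugging
    if magic != b"TY":
        raise ValueError("bad magic: %r" % magic)
    return {"width": width, "height": height}

# the "dump it to the screen" step of the workflow
rec = parse_record(b"TY" + struct.pack("<HH", 640, 480))
print(rec)  # {'width': 640, 'height': 480}
```

Multiply this by every field, every special case, and every target language, and the scale of the problem becomes clear.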
There are quite a few pitfalls to overcome, like going over a boundary inside some structure, dealing with endianness, dealing with byte alignment, dealing with a few other things like assertions, checks, barely specified formats, special cases, conditional reading and writing, et cetera. As soon as you finish such a big task, you get some sort of parsing library that loads objects from a stream into memory. But then, if you want to support some other programming language, you basically need to redo the whole process from the start, writing essentially the same code imperatively in that other language. Actually, almost every media format library I've encountered has these dumping tools. On this slide I've listed some of them, and they exist not for some random reason: they are needed by the developers of these libraries to debug their code and to see if it really works. Needless to say, errors in file format libraries can be devastatingly dangerous. Almost every such error, such as a buffer overflow, reading beyond the bounds of a structure, or interpreting something wrongly because of human error in writing the code, is almost always remotely exploitable. They frequently provide arbitrary code execution, especially if we're talking about buffer overflows in libraries written in languages such as C. They leak information. They can lead to denial of service. For example, in libjpeg since 2010 there were 22 vulnerabilities, and quite a few of them are really dangerous. In libpng, for example, there are four vulnerabilities, but they're still dangerous as well. If we return to the start and look at what file format specifications exist, we'll discover that there is no single universally accepted standard.
If we take a look at the documents provided by file format authors, there are quite a few notations invented to describe a file format: for example, C structures, as we see here with ELF headers; intricate tables, as we see here with network protocols; or even more intricate tables, as we see here with a random page describing the Microsoft Word document format, mapping bytes and bits and trying to explain their values. Network protocol engineers have something better to the rescue: they've got Wireshark, which is universally accepted as the tool of the trade. It allows you to dissect packets and see what's inside in a tree format. Basically, you have a dump; you can point at any byte in the dump and see what value in the protocol packet it corresponds to, and vice versa. But what about the same thing for media files? It's a bit more complicated. There are quite a few proprietary tools available, such as 010 Editor, or Hexinator, or Synalyze It!, as some of you may be familiar with. But generally, there is no universally accepted tool, or at least one that supports enough popular formats to dissect and build upon. Well, we've tried to fill this gap and actually go one step further. So I'd like to present the Kaitai Struct project, which is a declarative file format specification language. All the words in this phrase are actually meaningful. The emphasis is on declarative: it means that we do not specify how to read the format, but what is inside the format. It's harder to implement in some cases, but it gives us quite a few advantages that I'd like to show a bit further into the presentation. We can compile a .ksy file that we've set up with the file format specification into ready-made parser libraries in quite a few target programming languages that I'll demonstrate further.
As well, we can visualize, dump, and debug all these file format specifications using several tools built around the Kaitai Struct project, such as visualization tools and the Web IDE that I'll demonstrate further. The .ksy format is YAML-based, and that's actually a good thing, because it's very easy to write your own tools. For example, it's quite a snap to write a tool that would embed one .ksy file into another .ksy file; it's generally a matter of a script of five or ten lines. Last but not least, it's free and liberally licensed. We use GPLv3 for the compiler, and the generated code uses some runtime libraries that we supply as well, which are MIT or Apache 2.0 licensed. So even though the compiler is GPL, it's possible to use the output of the compiler in proprietary products as well. The supported target languages right now are C++, C#, Java, JavaScript, Perl, PHP, Python, and Ruby. As a bonus, we support output to GraphViz; I'll demonstrate it further, it's quite an interesting side project as well. As experimental features, right now we are developing Swift support, and we're developing support for exporting .ksy files as Wireshark dissectors, to be able to load the same declared formats into the Wireshark interface and see them there, plus quite a few other probably interesting target formats. So how does it look? The natural API generated by Kaitai Struct looks something like this. Here we have a demonstration with a GIF file. Generally, the Kaitai Struct file declares a tree of objects: here, for example, we have the header, the logical screen descriptor, the global color table, et cetera. And generally, it comes down to traversing this tree of objects from some starting point. For example, this code in Java starts with Gif.fromFile, which parses data from the file.
And then you just do file.something.something.something and extract the data right away. For example, this is a one-liner that shows the screen width and height, which are the dimensions of the visible fragment of a GIF image, right away in one line of code. This is our Web IDE. It's probably much better to just demonstrate it right away. This is probably now the main working place of a developer who wants to get his or her hands dirty with Kaitai Struct. Here we have, simultaneously, an editor to see and edit the .ksy file in the upper left corner. In the upper right corner there is a hex dump of a loaded file; here we have a Microsoft AVI file and its corresponding format description. As you may have seen in editors like 010 Editor or Hexinator or other proprietary editors, it's possible to select any byte in the stream and go exactly to the value in the object tree in the lower corner to see what this byte corresponds to. And one can traverse, open, and close arbitrary objects in the object tree as well and see what they look like. It's fully interactive: changing a single symbol in the .ksy code recompiles everything conservatively and reparses the file right away in the new way that you've just specified. So, for example, if you add some lines of code that add parsing of some new field, it will just appear right away; you basically don't need to do anything else. For those who want a more hardcore console style of parsing, there is also a console visualizer. Here we have a JPEG file loaded into it. It doesn't look as flashy as the web one, but it works just as well. It doesn't feature a separate editor, of course; you're expected to have your own editor on the console, or whatever you want to use, in some other window. It focuses just on visualization.
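To make the generated-API idea concrete, here is a hand-rolled Python sketch of what the generated GIF classes conceptually do: it is not actual compiler output, and the class structure is simplified, but the byte layout (6-byte header, then a logical screen descriptor with two little-endian u2 dimensions) is the real GIF layout:

```python
import struct

class Gif:
    """Sketch of a Kaitai-style generated parser for a GIF file."""
    def __init__(self, data: bytes):
        self.magic = data[0:6]                 # b"GIF87a" or b"GIF89a"
        assert self.magic[:3] == b"GIF", "bad magic"
        self.logical_screen = LogicalScreen(data[6:13])

class LogicalScreen:
    def __init__(self, data: bytes):
        # u2le image_width, u2le image_height -- real GIF field layout
        self.image_width, self.image_height = struct.unpack_from("<HH", data, 0)

# a minimal synthetic GIF header for demonstration
hdr = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0, 0, 0)
g = Gif(hdr)
print(g.logical_screen.image_width, g.logical_screen.image_height)  # 320 200
```

The point is the shape of the API: you construct the root object once, then read dimensions (or anything else) by walking attributes, with no parsing code of your own.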
You have the same tree, you have the same binary dump, you can traverse it and see whether the file specification you've just entered matches what you expect to see or not. This is how our .ksy files look. Since it's YAML, it allows us to set up fields and field types, and what's important is that it's declarative, not imperative. On the left we see our declarative specification; on the right we see the kind of imperative code it usually compiles to. We do not have things like while loops, direct ifs, conditional control, jumps, basically any of the code flow that is immediate in imperative implementations. We just describe the file structures. If there are repetitions, we enter them as repetitions; if there is conditional parsing, we enter it as conditional parsing; and that opens up quite a few interesting possibilities. We have quite a few built-in data types, such as integers, floats, unaligned bit integers and bit fields, strings, byte arrays, and enums, and of course we allow user-defined data types. We have sequential parsing, parsing attributes one by one in sequence. We have out-of-order parsing, something called instances, so you can actually seek in the file to parse other parts of the file by some index or offset. We have calculated attributes to ease the representation of something read from the file in some more convenient form. We have checking for magic signatures, such as fixed content in headers, for example. We have conditional parsing. We have type switching on a condition, something like a switch statement. We have repetitions until the end of the stream, repetitions with a predefined number of iterations, or repetitions until some condition is met. We have a powerful expression language that can be used almost everywhere, and that's a good thing, because it compiles into direct expression code in the target languages.
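A small illustrative .ksy fragment shows several of these constructs together; the format name, magic bytes, and fields here are invented for demonstration, not taken from the formats repository:

```yaml
meta:
  id: toy_container
  endian: le
seq:
  - id: magic
    contents: [0x54, 0x4f, 0x59]    # fixed-content "magic signature" check
  - id: num_entries
    type: u4
  - id: entries
    type: entry
    repeat: expr
    repeat-expr: num_entries        # repetition with a computed count
types:
  entry:
    seq:
      - id: kind
        type: u1
      - id: body
        size: 4
        if: kind != 0               # conditional parsing
```

Note that there is no control flow here, only structure: the compiler decides how to turn `repeat-expr` and `if` into loops and branches for each target language.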
For example, this one shows how we can parse an attribute named full_length, declared as an unsigned four-byte integer in the first place, and then parse as many elements as we need, calculating the number of elements as full_length minus four, divided by six. This is how it compiles to C++; this is how it compiles to Python. You can see that the code is quite different. And this is how it compiles to JavaScript. One difference is that JavaScript doesn't have integer division, so we end up with Math.floor. Another interesting thing I wanted to demonstrate is the GraphViz visualization. Basically, we compile the specification to GraphViz, and this is what we get: a human-readable diagram that one can pass to colleagues or to other people, so they can just take a look at the format and implement it, for example, in some other program. We've got a grand repository of formats, including tons of formats right now. It's quite interesting; you can find it at our GitHub page and see for yourself whether there's anything interesting for you there. There are quite a few image file formats, video file formats, audio file formats, archives, documents, executables, file systems, et cetera. Thanks for your attention. I'd like to see if any questions arise. The question was about handling incorrect values in the complex expression language. Basically, there is no internal checking in Kaitai Struct: it just compiles the expression as I showed you, and at runtime it will probably raise some sort of exception or error. This would be specific to the particular target language that you compile this code to. Please. Yes. Roughly, on being stream-based: what about codecs that have bitstream fields that are quite complicated, like most Huffman codes, or Golomb codes, or various others; they have much more complex bitstreams that we'd like to analyze.
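The compiled logic for that full_length example, written out by hand in Python, looks roughly like this; it's a sketch rather than actual compiler output, and the six-byte element type is kept opaque for brevity:

```python
import struct

def parse(data: bytes):
    # u4le full_length, followed by (full_length - 4) / 6 six-byte elements
    (full_length,) = struct.unpack_from("<I", data, 0)
    count = (full_length - 4) // 6   # integer division, as in the ksy expression
    elements = []
    pos = 4
    for _ in range(count):
        elements.append(data[pos:pos + 6])
        pos += 6
    return full_length, elements

full_length, elements = parse(struct.pack("<I", 16) + b"AAAAAABBBBBB")
print(len(elements))  # 2
```

In JavaScript the same `count` computation would need an explicit `Math.floor`, which is exactly the kind of per-language difference the expression compiler handles for you.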
Is it possible to write in your declarative language the way that you parse these bitstreams? Yes. The question was about parsing bitstreams with more complex codecs like Huffman codes, et cetera. Since version 0.6, we have support for reading bitstreams. It's slowly growing. It's not very optimal to parse bitstreams per se just to perform simple operations like unpacking or uncompressing something; it's probably more efficient to apply some special processing to the whole byte stream there. But it can also be done, and I don't see any major problems here. Could you repeat that a little bit louder? Repeating the question: the question was about reading something from a stream, not from a whole file on disk. Right now, the API basically allows reading and parsing from two sources: from a file on disk or from an arbitrary byte array in memory. If you can organize the parser in some way that would be, for example, chunk-based, parsing one chunk and then stopping, it's no problem to go with a chunk-based stream that you somehow buffer in memory, adding to this buffer again and again and recalling the parser. I guess that would be it. The next question was about adding annotations to a .ksy file to have a more readable representation of whatever is going on there. There are several possibilities to do so. We allow adding annotations to fields that are emitted as doc strings in the target code, so if you load it into an IDE, you just see whatever the comments for the fields are. In the Web IDE, there are several syntaxes that allow you to mark up the formatting for the representation: for example, choosing hex, binary, or decimal representation, et cetera. And last but not least, you can use calculated values to represent something in a more human-readable way as well.
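The buffer-and-recall-the-parser approach described in that answer could be sketched like this in Python; the chunk layout here (a one-byte length prefix followed by that many payload bytes) is invented for illustration and is not a Kaitai API:

```python
def feed(buffer: bytearray, new_data: bytes, out: list):
    """Append freshly arrived data, then parse as many complete
    chunks as possible, leaving any partial chunk in the buffer.
    Invented chunk layout: u1 length, then `length` payload bytes."""
    buffer.extend(new_data)
    while buffer:
        length = buffer[0]
        if len(buffer) < 1 + length:
            break                          # incomplete chunk: wait for more data
        out.append(bytes(buffer[1:1 + length]))
        del buffer[:1 + length]            # consume the parsed chunk

buf, chunks = bytearray(), []
feed(buf, b"\x02hi\x03w", chunks)          # second chunk arrives split in two
feed(buf, b"ow", chunks)
print(chunks)  # [b'hi', b'wow']
```

Each call to `feed` plays the role of "recalling the parser" on the growing buffer, which is the pattern suggested for stream sources.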
Bit flags are usually parsed using the bit parsing syntax, and you usually get them as separate fields that you can basically access in any way you want.
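As a rough illustration of what "separate fields" means for bit flags, here is a hand-rolled Python equivalent; the flag names and bit positions are invented, and generated code would expose them as attributes rather than a dict:

```python
def parse_flags(byte: int) -> dict:
    # unpack one packed flags byte into separate named fields,
    # the way bit-field accessors expose them to the caller
    return {
        "is_compressed": bool(byte & 0b1000_0000),  # bit 7
        "has_palette":   bool(byte & 0b0100_0000),  # bit 6
        "version":       byte & 0b0011_1111,        # low 6 bits
    }

print(parse_flags(0b1100_0010))
# {'is_compressed': True, 'has_palette': True, 'version': 2}
```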