It's sort of the tutorial that I wish I had when I started with LLVM. I first got involved when I was doing an internship at PlayStation a few years ago, and we were hacking on the LLVM compiler a little bit. At the time I was a grad student; I knew a little bit about compilers, but not a lot about this framework, and I think I was intimidated by its size initially. So today my goal is to lessen that intimidation if you've never touched LLVM. If you have, and you are an expert at LLVM, then I welcome your opinions on how we can improve these sorts of talks for beginners, to make LLVM a successful and accessible tool.

Right from the start I want to give a little demo of the programs we're going to walk through during this talk. That way you know what to pay attention to, or whether you're in the right room, and so on. So I'm going to navigate to the first demo here. Looking in the top right, I have a hello world program, and the demo is going to analyze the code and print out all the functions in it. Well, it's just a hello world, so there's just main, one function in there. So how do I write that analysis pass? The next one, demo number two: there are a few functions in a small program, and it's going to count how many basic blocks are in there and how many instructions, so it gives us an instruction count. So, a little bit of statistics and analysis about our code. Demo number three: this time we're going to start looking at function caller-callee relationships, just the calls that are made. Again I'll have a simple little program, and we'll see something like "direct call to function", where the name's a little mangled: an add function called from main. So we can get some idea of what's going on in programs, for source code understanding. And demo number four is going to be injecting some code into different programs. So I had a hello world program that prints Bonjour, but I was able to manipulate it a little bit with LLVM to put a little print message at the top when we enter each function. It says: hello, you're running an instrumented binary; performance may vary while running an instrumented binary. Okay, so those are the tools that I'm going to walk us through; that's what you'll be able to leave this room ready to program. You'll see the four demos again as we walk through the slides.

Okay, so who am I, your speaker? My name is Mike Shah. I'm currently a lecturer over at Northeastern University in Boston, Massachusetts, in the U.S., so I'm about six hours out of my time zone right now. I'm interested in computer graphics, game engines, and systems. My research has been in performance tools, a lot of static and dynamic analysis, as well as visualization tools. I like to build things that help make programmers' lives easier, and I think I'm in good company in this community. I like teaching, guitar, running, weightlifting; pretty much anything computer science is interesting to me. You can find out more about me, my courses, and all the slides on my website.

Okay, so again, this is an introduction to LLVM, and I have some specific goals to kickstart the day for the rest of your sessions: figuring out what LLVM is and what's in the project, and understanding how to obtain and install LLVM. That second one I'll go through quickly, because we don't all have time to compile and build from source, but I've found that a lot of students I've taught struggle at this point, so I'll give you a cheat sheet.
I'll have little examples with LLVM and Clang and some of the tools, and then show how to produce the demos I've shown. I'm actually going to show all the source code, which, just as a warning, means I'm going to break a lot of presentation rules: there's going to be code here, lots of long sentences. But that's okay, because we want a good slide deck we can review later. And then there are your goals for tomorrow: to start thinking about solutions, what you can do after you've learned a little bit about LLVM, what's going to keep you motivated, keep you growing, and how to make your lives easier as programmers. As I mentioned, I want you to be able to run through these slides again with confidence and excitement even if you haven't worked with LLVM previously. All the slides and the code will be at my website, mshah.io/fosdem18.html, and I'll make sure that URL is in the top left of most of the slides as well.

So what is LLVM? It's a project started at the University of Illinois around 2000. Chris Lattner was the lead architect; here he is. He was at Apple for a long time, is now at Tesla, but still contributes. You might know his contributions to Swift. The LLVM project itself is backed by a lot of major companies like Intel, Apple, Google, and so on. But the important thing about this project, of course, is that it's open source. Anyone can contribute to it; anyone can get the source and commit their updates upstream. I guess the real question is: it's great that it's open source and popular, but what makes LLVM the tool that a lot of programmers are paying attention to right now? It's getting a lot of attention; why is it winning awards? The secret recipe, if we want to look, is in the research paper published in 2004 that describes the details and the internals. Of course, we also have the source code. But I want to talk about Chris Lattner's big idea, from when he himself was a graduate student in the early 2000s.

So he was thinking about compilers early on during his graduate work. Briefly, to refresh: the idea of a compiler is to translate a high-level language into machine code, right? We don't write in ones and zeros. So we have our C++ source, or language of choice, that gets lexed and parsed to make sure it's a valid program. Then there's, optionally I guess, an optimizer that makes our code better for us. On the back end, code generation gives us the code for our machines, and we can finally execute our programs. Okay. Around this time, 2000, maybe even before, JIT compilers, just-in-time compilers, were really continuing to gain traction in languages like Java, which have these virtual machines that can compile code online, as it's running or about to run. But as they do this, every time you run a Java program it has to perform optimizations over and over and over again; every time you run the program. So at the time, for Lattner and his advisor, Vikram Adve, the big idea was: can we perform these optimizations at compile time? How much of the heavy lifting could we move to compile time, before we run a program, to increase performance? So he was thinking about some low-level virtual machine. And if we notice: low-level virtual machine, LLVM. So that's where the name originally came from.
Now the project is just referred to as LLVM itself. Okay, so in the middle of our compiler pipeline there's this optimizer, and that became the focus of the project, at least for Lattner and his team in the early 2000s. So what's typically going on during the optimization stage of a compiler? There's some intermediate representation, some intermediate language, that the high-level language gets translated into: something that's more of a regular language, something that can be manipulated a little more easily. Just to motivate why we'd want an intermediate representation, think about how many ways you could write a C or C++ program that achieves the same thing. Why can't we have a better, more regular structure? So here's an example of what an IR instruction looks like in LLVM. It's a branch instruction: branching on some condition, whether it's true or false, and then jumping. It looks like assembly.

Okay, let me jump into how to get LLVM at this point. Now that we're a little bit excited about it, we want to download it and try all the tools. As I mentioned, I'm going to run through this very quickly; in the future you can use it as a reference for getting set up on your own. I do want to mention that the LLVM project is evolving at a rather rapid pace. The main architects and folks contributing to it are willing to make breaking changes to improve the tool, so it's always growing. So typically when I'm working with LLVM, I'm building from source from one of the latest versions. The instructions, where they'll always be, are at the Getting Started page, and that's where you can figure out where to download the repos from SVN and so on. For this talk, all of my examples work with LLVM 5.0; they should work on most of the previous versions as well. I'm running on an Ubuntu 16 machine on the x86 architecture, and similar instructions will work on Mac and so on. Tools you'll need: SVN, CMake, Make, some C compiler, and that's about it. So, how to get started: create a directory, typically named with the date you're downloading. Within that folder I create a few different directories: a build directory, where your binaries will go, and then your source. Keep your project nice and organized. And then it's about 12 steps from downloading the relevant projects to actually running make. This is my cheat sheet for you to follow later on. After you follow these steps, you'll want to take a little break and stretch; depending on the speed of your CPU, it takes me about 15-plus minutes at least. You'll know it worked when you check your build directory and it looks something like this: a bunch of binary files, Clang, and various tools. These are all the resources you get. One thing to heed: your system might already have Clang or some compiler installed through a package manager or native to your operating system, but we want to build from source. The version often matters when we're building tools, and as I mentioned, they're always pushing the project forward.

At this point in the slide deck, let's just assume LLVM is up and running. Our first example: emitting LLVM's intermediate representation. As I mentioned, this is the important part for the optimizer: it works on the IR, not on pure C or C++ or whatever source code.
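To make that concrete before we move on, here is what a conditional branch looks like in LLVM IR. This is a minimal sketch in standard IR syntax; the label names are illustrative rather than the exact ones from the slide:

    ; branch on a 1-bit condition to one of two labeled blocks
    br i1 %cond, label %iftrue, label %iffalse

It reads like assembly: test the condition, then jump to one label or the other.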
The compiler we do this with is Clang, or Clang++ for C++, and it can generate code that targets this intermediate representation. So here's our working hello world program, just printing a message: Bonjour. I do speak French as well as English. If we compile and run it, that's it, and it works: Bonjour. But then we want to make sure we're working with the version of Clang that we built from source. You can do this by checking the version: yep, we're at version five, we've got our architecture, and here's where exactly it was installed. Everything looks good. All right, so now we can use Clang to emit some LLVM, get the IR, and keep talking about this step at the optimizer. To do this with Clang we have some options: -S and -emit-llvm, with our source file. -S, similar to other compilers, says only run the preprocessor and compilation steps to generate something, and the specific thing we're generating is the LLVM representation. That's sort of our assembly here.

Now, as a quick aside: Clang++? Isn't this an LLVM talk? It is. Since 2000 this project has grown really big, so there are a lot of tools, and I'm going to touch on what I think are the important tools in the LLVM umbrella of projects. The first one, of course, is Clang. As I mentioned, if you're on a Mac you've likely already used Clang; that's Apple's default compiler. So can Clang, or perhaps other tools, work with this LLVM thing? Yes, of course; we're in the right place. The key feature here is that if we have these language front ends, whether it's Clang or something else, they can each target this intermediate representation, and then the optimizer can work on that IR. So whether we're in C, Fortran, Ada, Objective-C, Swift, or whatever, we can get to this common optimizer, and then there are separate code generators from the intermediate representation for each architecture. I won't talk as much about that part during this talk, but I'll have some resources at the end if it's interesting.

Okay, so let's take a closer look at the IR. Well, first, how about a pop quiz to wake everyone up? I know it's a little bit blurry, but from the audience: what do you think this function does? It adds two values. It adds two values, yeah, perfect. And yes, it is named add1; I didn't make it too hard. And there are two arguments, you got it: an i32 %a and a %b. i32 looks like an integer? It's an integer, a 32-bit one, yeah. And every function has this entry point; this is where we enter, we have to start somewhere. We store the result of our addition. This is an actual add instruction, capturing what's going on; you can think of this again like assembly. It takes our two arguments, a and b, and then we return the result, %tmp1, whatever came out of that add instruction. So if you can read assembly from your school days, or even if you program C, you can understand this intermediate representation. It's quite nice; it's very readable.

So then, to LLVM's secret sauce, which really is this intermediate representation. As mentioned, as long as we can get our front-end language to this IR, we can optimize it in a common way. It's fairly readable, and fairly writable: you could actually write programs in this IR.
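For reference, the function from the pop quiz looks roughly like this in IR. This is reconstructed from the discussion above, so the exact temporary names may differ from the slide:

    define i32 @add1(i32 %a, i32 %b) {
    entry:
      %tmp1 = add i32 %a, %b   ; the actual add instruction
      ret i32 %tmp1            ; return the result
    }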
Writing IR by hand might take a little additional effort, but I've seen it done; I've done it. It's also a bit more of a well-defined language, in the sense that a lot of compilers trying to achieve the same thing will target C or something and just bootstrap that as their intermediate language; I'd say this is quite a bit better defined. The IR itself is strongly typed; we saw that i32. So there are types, and even pointers can be defined with a star. There's what's known as an infinite number of registers. So from your assembly programming, if you have that experience, you'd see rax, rdx, r15, or whatever, 16 general-purpose registers. Here you have infinite: you get as many of these temporary registers as you want. And that's useful for what's called static single assignment (SSA), which is a way to analyze programs in this intermediate form and make various optimizations; a lot of the common ones build on it. A quick aside on static single assignment, taken from the wiki page. What this is, starting on the right side, is assigning every single variable a unique name. Then you can capture that there's some redundancy: x1 is just y2 in that assignment, so we can get rid of something. That would be an example of a simple optimization, and a lot of the common ones build from there. For more on the IR, the chapter in The Architecture of Open Source Applications book, from Lattner himself, is a great resource.

So, back to using our tools: using clang++ and generating some IR. Here it is, from our hello world program, emitting LLVM IR. What I get when I do this is a .ll file, two lowercase Ls. This gives me the textual, readable form that we had looked at: that branch and that simple add instruction. And here it is. So this is hello world. There's just a tiny bit more in the file, but this is all the code that does the work. I'll pause for a second so you can look at it, and take some audience feedback, maybe from the front rows, if you can read the source. What do you see here? What stands out from these instructions? A call to printf. So we see the functions that are defined, and there's a call instruction here, so we can trace it. We see our functions. Anything else? Yeah, we see the string: Bonjour is encoded in there, just like it would be in an assembly data section. There's explicit alignment; that might be important for architectures we're targeting. What else? There's a target here: this target triple, telling me this is generated for an x86-64 machine. Pardon? Generated for this machine to run it, yeah, that's what it's telling me. So I can target different architectures; I've targeted my machine for this, 64-bit. And these allocas here, that's sort of our stack allocation, yeah. Note that the IR is not machine-specific, meaning this is the same sort of IR I'll get on other machines, but from this IR I'll generate the assembly for my target architecture. So if it's a PowerPC, an ARM machine, x86, that gets translated later in the backend stage. And yeah, with printf we cheat a little bit here, in that the C library links in with the compiler, so we know about printf. If you do other things, even just C++ iostream, you'll see the code explode. But yeah, these are all good findings. Pardon? %1 is not used? %1 is not used here; it's allocated, but not optimized away. Not optimized: I haven't run any passes on it, that's true. Yeah, good. Okay, so these were the things I noticed too.
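That unused %1 is exactly the kind of thing SSA makes easy to spot. The wiki-style example from a few slides back, spelled out (my rendering of it):

    y := 1              y1 := 1
    y := 2      ==>     y2 := 2
    x := y              x1 := y2    ; y1 is never read again: dead code we can delete

Because every name is assigned exactly once, "is this value ever used?" becomes a trivial question to answer.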
So: you saw the source file reference, you found the data layout, which has to travel with this file, as mentioned, for the machine you're compiling against. There are lots of these percent signs, because we have infinite registers. There's no optimization yet; I haven't run anything from the optimizer on this, and we'll get to that. There are things called phi nodes, which are sort of selectors that come up, but none of them are here. And then there's the type information, and at the very end some metadata and attributes that are sort of hints to the compiler. Question: so you cannot change the name of a register, cannot assign to the same register a second time, since it's SSA? Right, you'll get a new one; they need to be unique. Okay, so all registers are assigned once? Assigned once, yeah, exactly. Good questions, everyone.

Okay. So then, as alluded to, from this common optimizer, that's the information we need to target our back ends. Hopefully we're enjoying the IR's readability, and the machines enjoy it too. This leads us to our next tool, lli. It's an execution engine that directly executes program bitcode using its own JIT compiler. There's a JIT compiler in here. And because the IR is very readable, I think that assists this project. So if I actually want to run this program (I haven't generated any executables or binaries yet), I just do lli hello.ll, our textual bitcode again, and I get the same output: Bonjour. Okay, so that's pretty neat; I can just run it really quickly like that. Now, of course, the intermediate representation has another form, a binary form called bitcode, with a .bc extension, where the binary data is a bit more compact, and that can also run through the lli tool. Exactly: you can convert back and forth as you need. One form for readability, programming, and inspection; the other for compactness, if you have a lot of these files. Pardon? Yeah, same thing; I know they use bitcode, but it's the same representation.

Okay. So our next tool, llvm-as, the assembler, takes the IR assembly form and converts it to that bitcode form. If we want to do that, it's as simple as feeding in the .ll file. If I list the contents, I'll get a new .bc file, ready to go. And again, I can run that through lli: same output, still good to go. Same program, same representation. In general, we don't want these tools making breaking changes, so the conversions are meaning-preserving. Okay. So I execute the bitcode, and my claim is that maybe the JIT engine can execute it more efficiently in that form. Guesses why? It knows the input data. Yeah, it knows the input data, and it comes well packed. A lot of programming languages do this; Python, for example, compiles into its own bytecode so it can execute quicker. Here's what the bitcode looks like: it truly is a binary file that you can't read. And once you start getting a lot of these files and they get big, the bitcode is much, much smaller, much more compressed.

Okay. So we've got this bitcode, but eventually we do want to generate assembly for some target machine and build an executable we can distribute to someone. Our tool for that is llc, the static compiler. It takes in this IR bitcode and converts it to assembly code. Same workflow as before.
We just run the tool on the bitcode. If I list the directory, I'll see a .s file for the assembly. And here it is, hello.s. Then, if you enjoy reading assembly, it's there for you. Correct, yeah: .ll is human-readable, .bc is the binary, and the .s is assembly, which we can debate about being human-readable or not. Okay. So this is the full circle. A lot of people have worked hard on the LLVM project so it can target many different machines; you can pass a flag in for the CPU and see all the different architectures you can generate code for. This list goes on much, much deeper than what I've shown here.

At this point, we've gotten familiar with some of the IR and played around with the tools, the compiler toolchain. But we haven't utilized the optimizer yet. Again, that was Lattner's big idea with this project. That brings us to the next tool, opt: the LLVM analyzer, but also the optimizer. So I'm going to run opt on our hello program; run opt on the textual bitcode, with this little -time-passes flag. Basically, there's a bunch of optimizations you can apply, and I'm not applying any right now. So I get a list of my passes, with just a verifier in there to make sure it's valid input, and that's it. But I'm timing the passes here. So I want to introduce the idea of passes that comes with the opt tool, the optimizer. Just as your compiler might look through the source code many times to build your program, the optimizer can cycle through your source code, or the bitcode, many times, looking for opportunities to work and optimize. There are a lot of ways we can do this. In LLVM, you have different granularities at which you can inspect your code. At a higher level, a source-file level you can think of, there's a module pass, which gets fed a full module. There's a call graph pass that works bottom-up over the call graph. There's a function pass that runs over individual functions, every function in a program. There are basic block passes that run over all the blocks: essentially anything between curly braces in C++. And then there are a few other types of passes, and there might even be more added in the future. Within these there are also two kinds of passes: analysis passes and transform passes. Analysis passes gather information; that's mostly what I'm focusing on in the next couple of examples. Transform passes actually mutate the program: there's some side effect on the original code.

So the next thing to get introduced to is how to analyze IR with this pass framework. This leads us down the road of code optimization and code understanding, and further ideas you can think of from there, machine code generation and so on. So I've got a task: I want to print all the functions in a program, maybe to get a list or to see where they're coming from. What kind of pass should we use for that? Any guesses from the audience? Function pass. You got it; it's appropriately named. It runs over each individual function; that's how it's structured. We'll see an example of a module pass later that could also do the job, but function pass is the natural answer. So, writing our first function pass. Where you'll be working when you try this is in your source tree, under llvm/lib/Transforms: there's a Hello project with Hello.cpp, and that's where you get started.
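For reference, here's a trimmed sketch of what that file contains, condensed from the upstream Hello.cpp of the LLVM 5 era; details may differ slightly in other versions:

    #include "llvm/IR/Function.h"
    #include "llvm/Pass.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    namespace {
      struct Hello : public FunctionPass {
        static char ID;               // pass identification
        Hello() : FunctionPass(ID) {}

        bool runOnFunction(Function &F) override {
          errs() << "Hello: ";        // errs() is LLVM's printf/cout-style stream
          errs().write_escaped(F.getName()) << '\n';
          return false;               // false: we made no changes to the IR
        }
      };
    }

    char Hello::ID = 0;
    static RegisterPass<Hello> X("hello", "Hello World Pass");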
It's already wired into the build system, so you can just run make as you make changes, and you can use it as a sandbox to explore the LLVM project. And as you continue on, you can add more passes; that's pretty well documented, how to do that. The code box on the slide is essentially what I sketched above. I'm in terminal land most of the time, so here it is, Hello.cpp. This is code from LLVM, not from me. Nice and bright. Here's the piece we care about, though; we'll walk through the rest. There's this function, runOnFunction, that returns a boolean and takes a function in as a reference argument. Then errs(), which is sort of LLVM's cout or printf, a safe way to print, writes out "Hello: " and then F.getName(), the function's name. Then it returns false, because it's not making any changes. We'll look at that again shortly.

So let's build our hello pass. How you do this: navigate to the lib/Transforms/Hello directory within your build directory, where there's a makefile. Following the instructions found there, you type make and it builds for you. It builds a library; this is our pass as it's built, in the lib folder of the build directory: LLVMHello.so. So it's a library there. When we want to run our first pass, again we're doing this with the opt tool, since we've treated this sort of as an optimization. So we start with opt. We choose which library we want to load, sort of which plug-in I guess: it's the LLVMHello library. And within it we can have multiple passes; we have one that's just called hello. And the optimizer runs on bitcode, so our input is the hello.bc file. Yes, correct: the name "hello" is overloaded here. Okay. So if I run this as shown, I'll see a tiny printout: Hello: main. There's just one function in our hello world, so it works. That was all there was to it, to print out the functions in our source code. So that's good.

So let's go over the anatomy of the pass, the parts I skipped over. Again, runOnFunction is the piece that does the work. We're returning false because this is just an analysis pass: I'm not making any changes to the code, no transformations, just getting information. We're inheriting from the FunctionPass class here. And then this is where the name comes in; you asked about "hello". You register this with the pass manager: I'm going to register this pass; Hello is the name of my struct; "hello" is what we're going to call it. That's why I could refer to it on the command line. Then you can add a description, if you want to give your users help when they're on the command line, and other attributes. That's what we care about. Does it have to be C++? Well, there are bindings, yeah; Python in particular. I've seen passes written that way; there are some libraries for that. But C++ is the native answer, and again, that's how I knew what to type here. Okay, so congratulations, we did it, and this also proves all of you are properly configured. So this is where we're going to start.

Now we want to move forward a little. This first pass is getting at static analysis. A brief reminder of what static analysis is: what information (bugs, performance problems, errors) can you uncover before you run the program? It's there to aid the programmer, and it's a great way to get full coverage of what's going on in the program.
The cons: you're not actually running the program, and you might be overly conservative, exploring code that never actually runs, so that computation might grow unnecessarily.

So let's build a second pass, and this time I want to collect some program data. It's going to print the function name, and it's going to count basic blocks and instructions. I'm going to use this new sample code, and then you can try it on your own. I've got two functions here: a countdown function, or rather a count-up function to 10 (it still works as intended), and our add function. Then from main I just print out the add and run the countdown, or count-up, or whatever. So that program's called loops; I compile and run it. We can use our previous pass, hello, this time on loops.ll; that's all I'm swapping out. And we'll see, yeah, there are three functions here: countdown, add, and main. So it's still working. How can we expand on it? Here's our second pass. Again, a function pass is probably the right tool for this, and this is it; this is the code that does it. Within the runOnFunction body, that's where the work's being done. I create two integers to keep track of the basic block and instruction counts, and then for each function, I iterate through each basic block, and within each of those basic blocks, iterate through the instructions. So once I have a function, I can keep working downward, drilling into smaller granularities; I can't really work back up as easily. And once I have that information, I print it out: the basic block count and the instruction count. This is a common pattern you'll see.

Now, of course, I've modified our Hello.cpp, so I've got to save it and run make again, as done previously, to build a new version of our library. The other change is that this one is registered as hello2. So I run hello2 on that code and see how many basic blocks and instructions there are; looks like countdown is our most intense function here. And it works. That's kind of nice: even a primitive tool like this, if you're running your compiler's optimizations at -O3 or -O2, experimenting with inlining, lets you see whether your functions are growing or shrinking and get an idea of program size. Just from a little tool like this. Again, we're running hello2, and as a reminder of what the function pass is doing: it runs on each function individually. It doesn't know about the other functions, so I didn't need to reset variables or anything like that. Conversely, if I want to store information about other functions, I need to write my own data structure for that. That's something you can do.

All right, so let's keep going. Try to think about what we can do with some of this instruction information, or just other interesting passes to build. Here's some homework for later. Again, I'm not pulling these ideas from nowhere: the "Writing an LLVM Pass" guide is the definitive guide on these passes.
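To collect that counting pass in one place, its body looks roughly like this: a sketch matching the walkthrough above, dropped into the same runOnFunction skeleton as before:

    bool runOnFunction(Function &F) override {
      unsigned BBCount = 0, InstCount = 0;
      for (BasicBlock &BB : F) {        // every basic block in this function
        ++BBCount;
        for (Instruction &I : BB) {     // every instruction in this block
          (void)I;
          ++InstCount;
        }
      }
      errs() << F.getName() << ": " << BBCount << " basic blocks, "
             << InstCount << " instructions\n";
      return false;                     // analysis only, no transformation
    }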
Okay, so I want to move to our third pass, another function pass, and this time it's going to show direct function calls, so I can figure out which functions are calling other ones. All right, here it is. I'm sure folks in the back can read this; don't worry, all the code's online, and I'll zoom in a little on the parts that matter. This time I'm including a new header, llvm/IR/CallSite.h. That sounds like something that might be useful for figuring out whether functions are calling other functions. And so there's an object here I'm instantiating: a CallSite.

Now, there are a lot of these different facilities in LLVM. Again, I'm not an expert; I don't know all the APIs by heart, so where am I going to go for help? The LLVM docs, of course. And often the appropriate strategy is typing "LLVM" plus whatever it is into Google. That brings you to the Doxygen documentation pages, and they're pretty good. Even on this page, I can jump around and see all the overloaded versions of CallSite, to see how it's used. The other common thing I do: remember, we compiled from source, so we have all the source code. So you can practice your grep skills here. Grep for getInstruction or CallSite and you'll see all the different ways it's used in the project. That's how I like to learn how some of these things are used. Right, and if your project built successfully, you'll probably find a working example there.

Okay, continuing on with the third pass, finding function calls. The next thing I do, once I have a CallSite, is check that instruction to see: was our instruction something callable? Was it a call in the first place? If it is, I can take getCalledValue()->stripPointerCasts(). I can't deal with function pointers yet, or weird things like that, but I can just cast this to a Function and figure out what that instruction calls. Okay, so cast it to a function, lowercase f. That means when my function pass runs, at this point I print "direct call to function" f->getName(), the callee I found within my function, "from" capital F, which is the function the pass is running on. So the result on our loops program: I see direct call to function add (the mangled name) from main, direct call to function printf, direct call to function countdown from main. We call those three functions there. So again, it's a simple little function pass, but what's interesting about this one is how you can start seeing the need, or maybe the ability, the power, to build your own data structure from here. I've got some edges now at this point. So with those edges, I can think about what kind of graph structure I can build, and actually hand a visualization of the program off to someone. That could be a very useful thing in a project: a static call graph we could build up.

Okay, here's a little bonus trick on outputting graphs. LLVM actually provides a pass for us that can output control flow graphs. A control flow graph: if I'm looking at a function, what paths can it take? Where does it branch? To view these, install a viewer for .dot files, that's the format; I prefer xdot, it works pretty well. We'll view the dot file for the countdown (or remember, really count-up) function and see what's going on. So if I look at it: here's our C++ source code, and LLVM gives me a nice control flow graph, and I can trace where it goes. There's one branch here, which is where the while condition chooses true or false, and here's where we just return. So that's a nice debugging tool. What's also nice about this, as you're learning about the IR and getting familiar with it, is that you can always look directly at the IR of the function in the two different views, map it back and forth, and take it apart. So this is again another way I trace through programs to learn.
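To recap that third pass before we move on, here's its core collected in one place: a sketch against the LLVM 5-era API (newer LLVM releases replaced CallSite with CallBase):

    #include "llvm/IR/CallSite.h"

    bool runOnFunction(Function &F) override {
      for (BasicBlock &BB : F) {
        for (Instruction &I : BB) {
          CallSite CS(&I);                 // wraps call/invoke instructions
          if (!CS.getInstruction())
            continue;                      // not a call at all
          Value *Callee = CS.getCalledValue()->stripPointerCasts();
          if (Function *f = dyn_cast<Function>(Callee))
            errs() << "Direct call to function " << f->getName()
                   << " from " << F.getName() << "\n";
          // indirect calls through function pointers are skipped here
        }
      }
      return false;
    }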
If you're thinking about building a compiler that targets LLVM IR, those side-by-side views might be an appropriate place to start. So: we've got three passes that gather attributes statically and give us some interesting information. Now I want to move us toward dynamic analysis. Our goal is to figure out what information (bugs, performance problems, errors) we can uncover while the program is actually running. So we're going to add something to the program to monitor how it functions. First, the pros: this gives us real values from wherever the code actually executes. The cons: well, you've instrumented it, so you've changed its behavior a little, but hopefully to learn something valuable. Why use LLVM for this? There are a lot of different profiling tools, but here we have total control, if you have the source code, over what to monitor and where to inject into our code. We're not at the whim of the profiler.

So I want to add in some functions here. Typically, if you do this on your own, you do it in an ad hoc fashion: scattering printfs everywhere in your project, or lots of #define and #endif pairs. Instead, with this nice tool LLVM, we can inject as needed and then take that instrumentation out again, which is nice. And again, fair warning: I'm going to put a lot of code on the slides and talk through it, but you'll have these later.

So, step one of this process: we write some code that we want to instrument. That means another example. The code we're going to instrument is just our hello world program. The hook into that code: I wrote a little function called init_main that's going to print a little message out once we enter main. So when you enter the function it says: "Hello! You are running an instrumented binary; performance may vary while running an instrumented binary." And you could run this on any function. Step one, then, is to generate IR for the hook, that little function we're going to be inserting. Create the IR representation; well, we know how to do that using clang, and we just emit this chunk out, so we've got this function. There it is. The name is mangled, so if you've worked with compilers you know it's going to look a little funky, something like _Z10_init_maini. You can demangle it; I'm just working with the mangled name. Here's the code we want to modify, just a simple program as an example. And now it's time for, I think, a module pass. Why is that appropriate for this? Partly I want to show you a module pass, and partly it makes a little more sense to me to start at the module level, a bigger level, and work down. Modules have a collection of functions in them that we can instrument.
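For concreteness, the hook from step one is just ordinary C++, something like the sketch below. The message is from the demo; the exact name and signature are my reading of the mangled symbol, so treat them as assumptions:

    // hook.cpp: the function whose calls we'll inject.
    #include <cstdio>

    // With the Itanium C++ ABI this mangles to roughly _Z10_init_maini,
    // which is the name the pass works with directly.
    void _init_main(int arg) {
        printf("Hello! You are running an instrumented binary. "
               "Performance may vary while running an instrumented binary.\n");
    }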
So here it is, the big code on the dark screen. There are three parts here; let me tell you what they are. The first thing I do is create a sort of stub for that hook function, init_main, which is going to get inserted in our main function. This is like giving a declaration: hey, this function exists, just like in your regular programs. And again, it's the mangled name that LLVM understands, so it's exactly that _Z10..._init_main...i string. The next chunk of code: since we're in a module, just like we iterated through basic blocks and instructions before, a module has a collection of functions that I can iterate through, and that's what it's doing here, looking through all the functions. Within that, the first thing is that I ignore my instrumentation functions: if I run into those, I don't want to instrument an instrumented function and sort of recurse like that. The second thing I do is look for a particular function, main in this case. And since I am modifying code, I return true from this pass. This pass, again, is a runOnModule pass, taking in a Module &M.

So that setup-hooks function, what is it doing? Well, it's creating a placeholder for that function, as I mentioned. The observation here is that if I'm putting in a little placeholder, a stub, I'm building up a function signature, and it has to match the hook's signature: init_main returns void and takes in one parameter, which is an integer. So that's the little stub I've got to build up, and this is how I do it. What this reminds me of is a code generator: if you're building a programming language or something, these are the sorts of functions you're going to be using to build up little stubs. Then there's a bigger function that's not going to fit on the slide; it's implemented in my examples, again at mshah.io/fosdem18. That instrument-enter function is choosing the specific function to insert. It actually looks very similar to the setup-hooks function, except there's a getOrInsertFunction, and I put that stub in there, and that's it. If you look at the code, we don't need to do anything fancy for this profiling, but I've included an example that shows how to pass in one parameter, so you have that; it took me forever to figure out back in the day, so learn from it.

So I've written this code, compiled and made it, and it works. The next step is to run our pass. I run it on our hello world program, and I call the output "ready to be hooked"; that's the .ll file I emit. This is something new: I've run opt with an actual transformation, and I'm making sure I save the result somewhere. It's been transformed, so it's a different file. Once I've done that, I link in our instrumentation. Remember, our very first step was to compile that hook function into its own IR, so now I merge them together, link them in. And you knew we were missing one tool, if you were watching closely: the llvm-link tool links two or more bitcode files into one. That way, where we had the declaration of the function, the definition is now there somewhere too; it's all merged. You can think of llvm-link as the linker for our IR code, sort of a cheap version of a linker that just smashes all the files together.
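Stepping back, the skeleton of that module pass looks roughly like this: a simplified sketch of the walkthrough, using the LLVM 5-era API (getOrInsertFunction returned a Constant* back then), with the helper logic folded inline:

    bool runOnModule(Module &M) override {
      LLVMContext &Ctx = M.getContext();

      // Part 1: declare the hook stub, void _init_main(int), by mangled name.
      FunctionType *HookTy = FunctionType::get(
          Type::getVoidTy(Ctx), {Type::getInt32Ty(Ctx)}, false);
      Constant *Hook = M.getOrInsertFunction("_Z10_init_maini", HookTy);

      // Part 2: walk the module's collection of functions.
      bool Changed = false;
      for (Function &F : M) {
        if (F.getName() == "_Z10_init_maini")
          continue;                        // don't instrument the instrumentation
        if (F.isDeclaration() || F.getName() != "main")
          continue;                        // this demo only hooks main

        // Part 3: insert a call to the hook at the top of main.
        IRBuilder<> B(&*F.getEntryBlock().getFirstInsertionPt());
        B.CreateCall(Hook, {B.getInt32(0)}); // example of passing one parameter
        Changed = true;
      }
      return Changed;                      // true: we modified the module
    }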
What's kind of nice about llvm-link is that you can think back to the little analyses we've been doing on these small programs: if you have a lot of .ll files, why not merge them all together, and all of a sudden you can maybe do whole-program optimization, or a sort of whole-program analysis. Okay: so I've linked everything together and output it as the instrumented demo, our full textual file. And for the grand finale, when I run it again with lli, one of our previous tools, take my word for it that it instruments main: "Hello! You are running an instrumented binary, performance may vary," et cetera, et cetera, and then finally we get to the printf: Bonjour. All right, pretty cool; it works. Questions about that last pass? Take a moment to think about it, and I'll send us into wrapping up here.

So again, if these are your first days with LLVM, some challenges to try: printing out function arguments, exploring the documentation and figuring out how to get those args; recovering some metadata from the different optimizations that are already running (there are things like profile-guided optimizations that the compiler uses: how can you access those?); or maybe writing a little Python script that links together all of your source files. Think about how you can do that. Moving on, say the next week, I might try something like building your own control flow graph or a call graph: again, how do you put those edges together and build up a full graph? Different attributes can be found on functions; figure out how to get at that metadata. Maybe, using our basic stats pass, if we've got a lot of really small functions, say 10 instructions or fewer, tell the compiler to inline those (I sketch that one below, after the resources). That could be a cheap little optimization. Again, a very simple tool; it might add performance, it might not, but it's a fun experiment to try. And then some, I don't know if they're hard, but interesting problems: look at auto-vectorization and see if you can play around with inserting some SIMD instructions. There are other tools I didn't talk about, so I have to mention the sanitizers: that's how you check things like memory safety or thread safety in your program. See if you can at least run them, and then see if you can modify the code, because you have it, and do some interesting printouts of what's going on. So that's the syllabus going forward.

Resources: I've got a big list here you can look through. The weekly LLVM newsletter is worth subscribing to. There's a nice grad-student guide by Adrian Sampson that's well worth reading; along with this talk, I hope these are good resources. The LLVM blog keeps you up to date, and there's an IRC channel that's very active with developers. Other tools that have been helpful for me as a compiler engineer: hexdump, meld. And if you're looking for projects to learn from (I'm in academia), just google someone's LLVM homework assignment; you'll find interesting problems. Okay, so some interesting talks to continue with: there's a nice YouTube series, as well as a full course on program analysis, where you can learn about things like data flow analysis and whole-program optimization and really take those forward. And of course I can't leave without saying that contributing to LLVM is a good thing. This is an open source project, and there's a full guide, plus the LLVM developer meetings; the guide covers what the etiquette is, how to commit patches, and where you can start looking for low-hanging fruit if you're just starting to get involved. You can start searching from there, just to gain some momentum.
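Before the conclusion, here's that "inline the tiny functions" challenge spelled out a little. This is my own sketch, building on the counting pass from earlier; whether it actually helps performance is exactly the experiment:

    bool runOnFunction(Function &F) override {
      unsigned InstCount = 0;
      for (BasicBlock &BB : F)
        InstCount += BB.size();

      // Hint to the inliner: tiny functions are cheap to duplicate.
      if (InstCount <= 10 && !F.hasFnAttribute(Attribute::NoInline)) {
        F.addFnAttr(Attribute::AlwaysInline);
        return true;                  // we changed the function's attributes
      }
      return false;
    }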
This is my conclusion: LLVM, I think, is a really exciting tool with a lot of power, even in very simple projects, whether you're working in performance-tool building or are just curious. Even if you attended this talk and decided this wasn't the right thing for you: if you're a C++ developer, at least look through the source code, because you can always learn by reading the architecture of this code. And again, this is a big project, but it shouldn't be scary; it's just new. You can do it; you can run through these examples. And that's what I have for you. I can take questions, but otherwise, thank you for your attention. We have about 15 minutes; I think I'll take a few.

Q: Thank you for your talk. Do any ideas exist by which you could do this kind of analysis by right-clicking on a function, so that the whole thing you described just happens behind the scenes?

A: Oh yeah, sure. The question was whether there are ideas or tools where you can just click on something and get this information statically, or dynamically generate it, instead of having to make it all by hand; it's pretty powerful, but also a lot of manual labor. So, first: I ran a lot of these passes manually, but there are a lot of tools along those lines. I didn't talk about Clang much, but there are these Clang tools, clang-tidy and a lot of refactoring tools, that run alongside IDEs, constantly looking through the source code, refactoring things, giving you information about the program. So the answer is yes, in short.

Q: [About finding cohesion and code dependencies.]

A: Yes. The question was about finding cohesion and code dependencies amongst the code, and where to start looking. I know there has been some work on data dependency, on looking at which files are dependent on others, especially for figuring out compilation time, these sorts of things. Again, I don't know any off the top of my head, but folks on the IRC would, and I could probably Google it quickly and find some projects. But this is the right tool for that sort of thing.

Q: You optimize the IR and then use the just-in-time compiler to run it. The IR is hardware-independent, but if you want to optimize it, you usually use hardware-specific stuff. So after the optimization, is the IR no longer hardware-independent?

A: Sure. So the question is about this flow: once we get the IR, we run some optimizations on it, and then we can run it through the JIT engine to get some execution back. The IR, because it's general-purpose and supposed to be not machine-specific; I'm not sure, and I don't want to promise on camera, that you'll get all the optimizations until you generate your final assembly code for your target machine. The JIT is sort of a work in progress; that's something people are working on right now, fitting in some of these optimizations, and I don't know how many of them are there. But if you are doing standard optimizations, dead code elimination, reducing code size, inlining, these sorts of things: whether you'll see machine-specific gains, I think you're going to have to test on the machine with the final assembly.

Audience: Just a moment. The IR is machine-specific. It depends on the ABI; it can even include inline assembly. There have been a couple of attempts to make the IR architecture-independent, like the PNaCl project.
Audience (continued): And RenderScript inside Android is also using bitcode, but it's not architecture-independent. So it is dependent.

A: Sure. The comment is that the IR is architecture-dependent. Do you know how much, what percentage? Audience: A small percentage, but it's important; it's mostly the data layout. If you look at the data layout, it says how big each type is. The same IR isn't used for all targets: you can't just take IR built for one architecture and recompile it for, say, ARM, or recompile it for x86. It sometimes works, but it's not guaranteed. A: So the comment is that the data layout does matter. Yeah, absolutely; thank you for clarifying that.

Q: How do memory-managed languages fit into LLVM?

A: That's a good question. The question is about how LLVM works with a language like Java, which is memory-managed. I know there was a Java front-end project for this some time ago. I don't have a good answer for that, actually, because I think that project got discontinued. Perhaps someone else knows more. I know languages like Swift have their own reference-counting implementation on top. So I don't know the answer to that, unless anyone else does.

Q: [About the JIT and observing its optimizations.]

A: Yes. The question was about the JIT: if it's doing optimizations, can we see them at runtime? Again, I haven't played around with the JIT enough to give a concrete answer. Certainly you could emit some information; you could embed some sort of reflection capability to emit something about the code size or how it's improved. I've played around with some of the profile-guided optimizations to figure out hotness, but again, I've looked at that on C++ mostly.

Q (in the back): [About using the JIT for a game scripting language.]

A: That's a good question. The question was about using the JIT as a scripting language for, say, a game, where you can optimize as you're going, at runtime. My answer to these questions is always: test it, run it, and see if it meets your needs. I'm not convinced the JIT (I've used it only through lli) is fast enough for that sort of task, in my experience.

Audience: Just to reference: there are some talks at the LLVM conference by people who've used LLVM for JITs. They tend to use it as the last stage of a multi-stage JIT, so it'll be determined that this section of code is really, really hot, so it's worth spending a lot of time on it.

A: So the comment was that it's typically not used to JIT everything, but there is a place for it at the final stages of optimization. Do you happen to know if that's profile-guided, driven by hotness? Yeah. I've done a little bit of work in gaming, specifically on profile-guided optimizations, and games are a really interesting use case because they're so dynamic, so they might be a good case for always having that running at some point. But what I've found is that where I've gotten the biggest optimizations in big projects like games is during link-time optimization, where you get this sort of whole-program optimization; that's where the biggest wins have been. Audience: That's going to make the build take very long. Yeah.

There's another comment on the right first. Pardon? Can you repeat the question? Oh: can you write the transformations, like I was doing before, on the intermediate... oh, I see, directly on the C++ type information or something. I haven't done any work on that myself; I'm sure that's a problem that's been looked at. I'm trying to think whether I can reference anything on that; again, if anyone has a better answer. It's an obvious concern, yeah.
Does anyone have an answer for that? A quick question here, then you have the next one. Yeah, I see. The question was about getting the compiler, or excuse me, the parser, to run when you only need that specific piece of it. Did someone have an answer? Yes: libclang should give you the AST. So I think that's the tool I would inspect; libclang should give you access to the whole AST.

Q: From a practical point of view: you have a project with several thousand files, and you create the intermediate representation for every file, then merge them all together, link them all together, and then run the optimizer on it. Do you think this is something that's doable, or is it going to explode on an ordinary machine?

A: Yeah. So the question was about this idea: I've got a huge project, and I want to link together all of the intermediate representations and generate one giant bitcode file. I have done this. It does take some time to actually build that file, but as long as you have enough memory on your machine, what I have found is that it's usually worth doing, because you get a lot of gains from inlining, which is usually the obvious winner. But you can also play around with some of the instruction layout and the function ordering. So I think this is a win: you can do it, and it's worth trying; it's something I've had success with. That's a great question. Thank you very much.