All right, everyone, let's get started with the next lecture. Today we're going to tackle the topic of metaprogramming. The title is a little weird, and it's not entirely clear what we mean by metaprogramming, but we couldn't really come up with a better name, because this lecture is about the processes that surround the work you do with software. It's not necessarily about programming itself, but about the process: things like how your system is built, how it's tested, how you add dependencies to your software. That sort of stuff becomes really relevant when you build larger pieces of software, but it isn't really programming in and of itself. The first thing we're going to talk about in this lecture is the notion of build systems. How many of you have used a build system before, or know what one is? OK, so about half of you. For the rest of you, the central idea behind a build system is this: you're writing a paper, you're writing software, you're working on a class, whatever it might be, and you have a bunch of commands, written down in your shell history or in a document somewhere, that you know you have to run to do a particular thing. For example, there's a sequence of commands you need to run to build your paper or your thesis, or just to run the tests for whatever class you're currently in. The idea of a build system is to encode these rules, which commands to run in order to build particular targets, into a tool that can do it for you. In particular, you teach this tool about the dependencies between the different artifacts you might build. There are a lot of different tools of this kind, and many of them are built for particular purposes or particular languages. Some are built for building papers. Some are built for building software.
Some of them are built for particular programming languages, like Java. And some tools even have build functionality built in. npm, for example, as you might be aware if you've done Node.js development, has a bunch of built-in tools for tracking dependencies, building them, and building everything of yours that depends on them. But more generally, these are known as build systems, and at their core they all function in a very similar way. You have a number of targets: the things you want to build. These are things like paper.pdf, but they can also be more abstract things like "run the test suite" or "build the binary for this program". Then you have a bunch of dependencies: things that need to be built in order for a given thing to be built. And then you have rules that define how you go from a complete list of dependencies to the given target. An example might be: in order to build my paper.pdf, I need a bunch of plot images that are going to go into the paper, so they need to be built first. But once they have been built, how do I construct the paper from those files? That is what a rule is: the sequence of commands you run to get from one to the other. How you encode these rules differs between tools. In this particular class, we're going to focus on a tool called make. make is a tool you will find on almost any system you log in to today: it'll be on macOS, it'll be on basically every Linux and BSD system, and you can pretty easily get it on Windows. It's not great for very complex software, but it works really well for anything of simple to medium complexity. make is just a command you can run on the command line. If I type make in this empty directory, it says: make: *** No targets specified and no makefile found. Stop. So it helpfully tells you that it stopped running.
But it also tells you that no makefile was found. make will look for a file literally called Makefile in the current directory, and that is where you encode these targets, dependencies, and rules. So let's try to write one. Let's imagine I'm writing this hypothetical paper. I'm going to create a Makefile, and in it I'm going to say that my paper.pdf depends on (that's what the colon here indicates) paper.tex, which is going to be a LaTeX file, and plot-data.png. And the command to build it is going to be pdflatex paper.tex. For those of you not familiar with this particular way of building documents: LaTeX is a really handy language for producing documents. It's a really ugly language and it's a pain to work with, but it produces pretty nice documents, and the tool you use to go from a .tex file to a PDF is pdflatex. Here I'm also saying that I depend on this plot, plot-data.png, which is going to be included in my document. What I'm really saying is: both dependencies need to be present, and should either of them ever change, I want you to rebuild paper.pdf. But I haven't told it how to generate plot-data.png, so I want a rule for that as well. I'm going to define another target, and it's going to look like this: plot-%.png. What % means in make is "any string"; it's a wildcard pattern. The cool thing is that you can repeat this pattern in the dependencies. So I can say that plot-%.png is going to depend on %.dat (.dat is a common suffix for data files), and it's also going to depend on some script, plot.py, that's actually going to produce the plot for me. The rule to go from one to the other can be multiple lines, but in my particular case it's just one line. I'm going to explain what it is in a second.
All right, so here we're saying that in order to go from a .dat file matching the wildcard in the target, plus the plot.py file, you run plot.py with -i, which is going to be the way we mark the input in our Python script; I'll show it to you later. $* is a special variable, defined for you in Makefile rules, that expands to whatever the % matched. So if I ask for plot-foo.png, it's going to look for foo.dat, and $* expands to foo, so this produces the same stem as the one we matched in the target. And $@ is a special variable that means the name of the target, i.e., the output file. Hopefully what plot.py will do is take whatever the data is here, produce a PNG somehow, and write it into the file indicated by $@. So now we have a Makefile. Let's see what happens if the only file in this directory is the Makefile and we run make. It says: No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop. What it's saying is this: first of all, it looked at the first target in our file. When you give make no arguments, it tries to build whatever the first target is; this is known as the default goal. So in this case it helpfully tried to build paper.pdf for us, then looked at the dependencies and said: in order to build paper.pdf, I need paper.tex and this PNG file; I can't find paper.tex, and I don't have a rule for generating paper.tex, therefore I'm going to exit. There's nothing more it can do. So let's make some files here. Let's create an empty paper.tex, then type make. Now it says: No rule to make target 'plot-data.png', needed by 'paper.pdf'. So now it has one dependency, but it doesn't know how to get the other one.
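Putting the pieces together, the Makefile described so far looks roughly like this (reconstructed from the description above; note that in a Makefile, each command line under a target must be indented with a tab, not spaces):

```make
# Default goal: build the paper from its source and the generated plot.
paper.pdf: paper.tex plot-data.png
	pdflatex paper.tex

# Pattern rule: plot-foo.png is built from foo.dat and plot.py.
# $* is the matched stem (foo), $@ is the target (plot-foo.png).
plot-%.png: %.dat plot.py
	./plot.py -i $*.dat -o $@
```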
It knows that there's a target pattern that matches, but it can't actually find that target's dependencies, so it ends up doing nothing at all. It still needs us to provide the input for the PNG. So let's put some useful stuff into these files. Luckily I have a plot.py from earlier, so let me copy it over here. Let's look at what the .tex file is. This is what LaTeX looks like. It's not very pretty, but basically I'm defining an empty document, and I'm using includegraphics, which is the way you include an image file, to include plot-data.png. This is of course why we want the PNG file to be a dependency of paper.pdf. plot.py is also not very interesting. It imports a bunch of libraries, parses the -i and -o arguments, and loads data from the -i argument. It uses a library called matplotlib, which is very handy for quickly plotting data, and it plots the first column of the data as x's and the second column as y's. So we're just going to have a data file with two columns, x and y, on every line. Then it saves that as a figure into whatever the given -o value is. OK, so we need a data file. It's going to be data.dat, because we want plot-data.png, and our rule said that the way you go from that pattern to the .dat file is to take whatever follows plot-. So if we want plot-data, we want data.dat. In this file, we're just going to put in some linear coordinates, because why not. Hmm, that's not linear; let me fix that. All right. Now what happens if we run make? Well, ooh. OK, so what just happened? make first ran plot.py with the correct files to generate the PNG file, and then it ran pdflatex paper.tex, and all the stuff we see below is just the output from that tool. If we wanted to, we could silence that tool's output so it doesn't mess with all of our own output.
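A plot.py like the one described might look roughly like this. The lecture doesn't show the script line by line, so the exact structure here is an assumption; only the -i/-o flags and the two-column plotting behavior come from the description above:

```python
#!/usr/bin/env python
# Sketch: parse -i/-o, load two-column data, plot column 1 against
# column 2, and save the figure as a PNG.
import argparse
import sys

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="plot two-column data")
    parser.add_argument('-i', dest='input', required=True, help='data file')
    parser.add_argument('-o', dest='output', required=True, help='output PNG')
    return parser.parse_args(argv)

def main(argv=None):
    args = parse_args(argv)
    # Imported lazily so the argument handling above works even on a
    # system without numpy/matplotlib installed.
    import numpy as np
    import matplotlib
    matplotlib.use('Agg')  # render to a file; no display needed
    import matplotlib.pyplot as plt
    data = np.loadtxt(args.input)          # rows of "x y"
    plt.plot(data[:, 0], data[:, 1], 'x')  # first column as x, second as y
    plt.savefig(args.output)

if __name__ == '__main__' and len(sys.argv) > 1:
    main()
```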
But in general, you notice that it ran the two commands, and, perhaps unsurprisingly, in the right order. If we now do ls in the current directory, we see a bunch of files that were generated by pdflatex, but in particular we have the PNG file, which was generated, and we have paper.pdf. And if we open paper.pdf, we see that it has one image, which is a straight line. In and of itself, not a very surprising or interesting result. But here's where this gets really handy: if I type make again, make just says "'paper.pdf' is up to date" and does no work. Whenever you run make, it tries to do the minimal amount of work needed to produce whatever you asked it to produce. In this case, none of the dependencies have changed, so there's no reason to rebuild the paper or to rebuild the plot. If I now edit paper.tex, say I add "hello" here, and run make, then if we scroll up we'll see it didn't run plot.py again, because none of that rule's dependencies changed, but it did run pdflatex again. And indeed, if we open the paper, it now says hello over there. On the other hand, if I were to change, say, the data file, making this value .8, and run make, then it plots again, because the data changed, and it regenerates the PDF, because the plot changed. And indeed, the paper turns out the way we expected it to. Not that this particular pipeline is very interesting, because it's not; it's only two very straightforward targets and rules. But this can come in really handy when you start building larger pieces of software, where there are many dependencies. You might even imagine that if you're writing a paper, one of your targets is producing the data file in the first place.
So one of the Makefile targets might be: run my experiment, run my benchmark, stick the data points that come out into this file, then plot the results, and so on, all the way until you end up with the final paper. What's nice about this is, first of all, you don't have to remember all the commands to run, and you don't have to write them down anywhere. But also, the tool takes care of doing the minimal amount of work needed. Often you'll find conventional sub-targets for make, like make test, which compiles your entire piece of software and also runs the tests. There might be things like make release, which builds with optimizations turned on, creates a tarball, and uploads it somewhere; it does the whole pipeline for you. The idea is to reduce the effort required by any part of your build process. Now, what we saw here was a very straightforward example of dependencies. We saw that you can declare files as dependencies, and you can even declare transitive dependencies: I depend on this thing, which is generated by this other target. Very often, when you work with dependencies in software more broadly, you'll find that your system ends up having many different types of dependencies. Some are files, like we saw here. Some are programs: this Makefile implicitly depends on Python being installed on my machine. Some might be libraries, like matplotlib, which we depend on here. Some might be system libraries, like OpenSSL or OpenSSH, or other low-level crypto libraries. And you don't necessarily declare all of them; very often there's an assumption about what is installed on the given system. What you'll find is that for most places where you have dependencies, there are tools for managing those dependencies for you.
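Convenience targets like the make test and make release mentioned above are usually declared as "phony" targets, since they don't produce a file with that name. A sketch, with made-up command names standing in for a real project's steps:

```make
# .PHONY tells make these targets aren't files, so they always run.
.PHONY: test release

test: mybinary
	./run-tests.sh

release: test
	tar czf release.tar.gz bin/
```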
And very often, the things you might depend on are stored in what are known as repositories. A repository is just a collection of things, usually related, that you can install. That's basically all a repository is. You might be familiar with some of them already. Some examples: PyPI, a well-known repository for Python packages; RubyGems, which is similar for Ruby; crates.io for Rust; npm for Node.js. But other things are repositories too. There are repositories for cryptographic keys, like Keybase. There are repositories for system-installed packages: if you've ever used the apt tool on Ubuntu or Debian, you are interacting with a package repository where people who have written programs and libraries upload them so that you can install them. Similarly, you might have repositories that are entirely open. The Ubuntu repositories, for example, are curated by the Ubuntu developers, but on Arch Linux there is something called the Arch User Repository, where users can share their own libraries and packages themselves. So repositories are either managed or entirely open, and you should often be aware of which, because if you're using an entirely open repository, the security guarantees you get from it are perhaps weaker than what you get from a curated one. One thing you'll notice if you start using repositories is that software is very often versioned. And what do I mean by versioned? Well, you might have seen this for things like browsers, where starting the browser prints something like version 64.0.20190324. This is a version number, with dots separating the components. That's one kind of version number, but if you start, I don't know, Photoshop or some other tool, there might be versions that look like 8.1.7. These version numbers are usually numerical, but not always.
Sometimes they have hashes in them, for example to refer to Git commits. But you might wonder: why do we have these? Why is it even important to attach a number to software you release? The primary reason is that it lets me know whether my software will break. Imagine that I depend on a library that Jose has written, and Jose is constantly making changes to his library because he wants to make it better. He decides that one of the functions his library exposes has a bad name, so he renames it. My software suddenly stops working, because my code calls a function in Jose's library that no longer exists, depending on which version of Jose's library people have installed. Versions help solve this, because I can say: I depend on this version of Jose's library. Then there have to be some rules around what Jose is allowed to do within a given version: if he makes a change that I can no longer rely on, his version number has to change in some way. There are many schools of thought on exactly how this should work: what the rules are for publishing new versions, and how they change the version numbers. Some schemes are simply dictated by time. For example, browsers very often have time-based versions like the one above: a number on the far left that indicates which release, then an incremental number that is usually zero, and then a date at the end. This one is March 24, 2019, for some reason, and it indicates version 64 of the browser from that date. If they release patches or hotfixes for security bugs, they might update the date but keep the number on the left the same. People have strong opinions on exactly what the scheme should be, and you sort of depend on knowing what schemes other people use.
If I don't know what scheme Jose is using for changing his versions, maybe I just have to say: you have to run exactly 8.1.7 of Jose's software, otherwise I cannot build my software. But this is a problem too. Imagine that Jose is a responsible developer of his library: he finds a security bug and fixes it, without changing the external interface of the library. No functions change, no types change. Then I want people to build my software with his new version, and it just so happens that building mine works fine with his new version, because that particular release didn't change anything I depended on. So one attempted solution to this is something called semantic versioning. In semantic versioning, we give each of the dot-separated numbers in a version a particular meaning, and a contract for when you have to increment each of them. We call the first number the major version, the second the minor version, and the third the patch version. The rules are as follows. If you make a change to your software that is entirely backwards compatible (it does not add anything, it does not remove anything, it does not rename anything; externally, it is as if nothing changed), then you increment only the patch number, nothing else. So security fixes, for example, will usually increment the patch number. If you add something to your library (I'm just going to say "library", because libraries are usually where this matters), you increment the minor version and set the patch to 0. So in this case, the next minor release after 8.1.7 would be 8.2.0. And the reason the minor version matters is that I might depend on a feature Jose added in 8.2.0, which means you can't build my software with 8.1.7.
That would not be OK, even though the reverse holds: if I had written my software against 8.1.7, you could run it with 8.2.0. The reverse is not true, because with the older version the feature might simply not have been added yet. And then finally, you increment the major version if you make a backwards incompatible change: if my software used to work with whatever version you had, and then you make a change, such as removing or renaming a function, that means my software might no longer work, then you increment the major version and set minor and patch to 0. So the next major version here would be 9.0.0. Taken together, these rules let us do really nice things when declaring our dependencies. In particular, if I depend on a particular version of someone's library, rather than saying it has to be exactly that version, what I'm really saying is: it has to be the same major version, at least the same minor version, and the patch can be whatever. This means that if I have a dependency on Jose's software, any later release that is still within the same major version is fine. That even includes, keep in mind, an earlier version, assuming the minor is the same. Imagine you are on some older computer that has version 8.1.3: in theory, my software should work just fine with 8.1.3 as well. It might have whatever bug Jose fixed in between, like whatever security issue, but this has the nice property that you can now share dependencies between many different pieces of software on your machine. If you have version 8.3.0 installed, and there is a bunch of software where one package requires 8.1.7, one requires 8.2.4, and one requires 8.0.1, all of them can use that same installed version of the dependency; you only need it installed once. One of the most familiar examples of this kind of semantic versioning is Python's versioning. Many of you may have come across the fact that Python 3 and Python 2 are not compatible with one another. Python 3 is not backwards compatible.
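The compatibility rule just described (same major version, at least the stated minor, patch can be whatever) can be written down in a few lines. This is an illustration of the lecture's rule, not any particular package manager's exact resolution algorithm:

```python
# Version compatibility check, following the rule described above:
# same major version, and at least the required minor version.
def compatible(required, available):
    """Both arguments are (major, minor, patch) tuples."""
    return available[0] == required[0] and available[1] >= required[1]

# If I declare a dependency on 8.1.7 of Jose's library:
assert compatible((8, 1, 7), (8, 2, 4))      # later minor: fine
assert compatible((8, 1, 7), (8, 1, 3))      # earlier patch, same minor: fine
assert not compatible((8, 1, 7), (9, 0, 0))  # new major: may break me
assert not compatible((8, 1, 7), (8, 0, 9))  # older minor: feature may be missing
```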
If you write code in Python 2 and you try to run it in Python 3, it might not work. There are some cases where it will, but that is more accidental than anything else. And Python actually follows semantic versioning, at least mostly. So if you write software that runs on Python 3.5, it should also work on 3.6, 3.7, and 3.8. It will not necessarily work on Python 4, although that is hopefully a long time away. But code written for Python 3.5 will possibly not run on Python 3.4. So one thing you will see many software projects do is try to push their version requirements as low as possible. If you can depend on a version whose minor and patch are both 0, that is the best possible dependency to have, because it is completely liberal about which release of that major version people use. Sometimes this is hard; sometimes you genuinely need a feature that was added later. But the lower you can get the requirement, the better it is for those who want to depend on your software in turn. When working with these dependency management systems, or with versioning in general, you'll often come across the notion of lock files. You might have seen this where you try to do something and it says "cannot reconcile versions", or you get an error like "lock file already exists" (these are often somewhat different topics). In general, the purpose of a lock file is to make sure that you don't accidentally update something. At its core, a lock file is really just a list of your dependencies and which version of each you are currently using. So my version requirement might be 8.1.7, and the latest version on the internet somewhere might be 8.3.0, but whatever is installed on my system is not necessarily either of those two; it might be 8.2.4 or something like that. And the lock file will then say: dependency Jose, version 8.2.4. There can be many reasons to want a lock file. One of them is that you might want your builds to be fast.
If, every single time you tried to build your project, your tooling downloaded the latest version of every dependency and then compiled it and then compiled your own code, you might wait a really long time each build, depending on the release cycle of your dependencies. If you use a lock file, then unless you've updated the version in your lock file, the tool just uses whatever it built previously for that dependency, and your development cycle can be a lot faster. Another reason to use lock files is to get reproducible builds. Imagine that I produce some kind of security-related software, I very carefully audit my dependencies, and I produce a signed binary along with a sworn statement from me that this version is secure. If I didn't include a lock file, then by the time someone else installs my program, they might get a later version of a dependency, and maybe that later version has been compromised somehow, or just has some other security vulnerability that I haven't had a chance to look at yet. A lock file basically allows me to freeze the ecosystem at the versions that I have checked. The extreme version of this is something called vendoring. When you vendor your dependencies, it really just means you copy-paste them: take whatever dependency you care about and copy it into your project, because that way you are entirely sure you will get exactly that version of that dependency. It also means you can make your own modifications to it. But it has the downside that you no longer get the benefits of versioning: your users will no longer automatically get newer releases of that software, for example when Jose fixes his security issues. Not that he has any, of course. One thing you'll notice is that in talking about this, I've been talking about bigger processes around your systems.
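To make the lock file idea concrete, here is a sketch of what one entry might record. This is not any particular tool's format; the package name and the checksum placeholder are made up:

```json
{
  "dependencies": {
    "joses-library": {
      "required": "^8.1.7",
      "locked": "8.2.4",
      "checksum": "sha256:<hash of the exact downloaded artifact>"
    }
  }
}
```

The key point is the split between "required" (the liberal semver constraint) and "locked" (the exact version this project last built against), which is what makes builds both fast and reproducible.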
These are things like testing, things like checking your dependency versions, and also things like setting up build systems. And often, you don't just want a local build system: you want a build process that involves other kinds of systems, or you want it to run even when your own computer isn't necessarily on. This is why, as you start working on larger and larger projects, you will see people use this idea of continuous integration. Continuous integration systems are essentially cloud build systems. The idea is that you have your project stored on the internet somewhere, and you have set it up with some kind of service that runs things for your project on an ongoing basis, whatever they might be. Continuous integration can be all sorts of stuff: releasing your library to PyPI automatically whenever you push to a particular branch, running your test suite whenever someone submits a pull request, or checking your code style every time you commit. The easiest way to think about these is as event-triggered actions: whenever a particular event happens to your repository or your project, a particular action takes place, where the action is usually some kind of script, some sequence of programs that get invoked to do something. This is really an umbrella term that encapsulates a lot of different types of services. Some continuous integration services are very general: Travis CI, Azure Pipelines, and GitHub Actions are all very broad CI platforms, built to let you specify what you want to happen whenever any event that you define happens. There are also more specialized systems that deal with things like continuous integration coverage testing.
For example, tools that annotate your code to show which pieces of it have no tests covering them, built only for that purpose, or built only for testing browser-based libraries, or something like that. So often you can find CI tools built for the particular kind of project you're working on, or you can use one of the broader providers. One thing that's nice is that many of them are actually free, especially for open source software, and if you're a student you can often get them for free as well. In general, the way you use a CI system is that you add a file to your repository, often known as a recipe, and what the recipe specifies is a similar sort of dependency structure to what we saw with Makefiles. It's not quite the same: instead of files, the events might be something like "someone pushes a commit", "a commit contains a particular message", "someone submits a pull request", or just "continuously". One example of a continuous integration service that's not tied to any particular change to your code is Dependabot. You can find this on GitHub. Dependabot is something you hook up to your repository, and it just scans for newer versions of your dependencies that you're not using yet. So for example, if I was depending on 8.1.7 and I had a lock file that locked it to 8.2.4, and then 8.3.0 is released, Dependabot will say "you should update your lock file" and submit a pull request to your repository with that update. So this is a continuous integration service triggered not by me changing anything, but by the ecosystem at large changing. Often these CI systems integrate back into your project as well; very often these CI services provide things like little badges. So let me give an example: here's a project I worked on recently that has continuous integration set up.
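A recipe of the kind just described is usually a small configuration file checked into the repository. Here is a sketch in the style of a GitHub Actions workflow; the workflow name, runner image, and the make test step are illustrative, not a prescription:

```yaml
# Sketch: on every push, check out the code and run the test suite.
name: test
on: push
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test
```

The event ("on: push") and the action (the list of steps) are exactly the event-triggered-action model described above.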
So for this project, you'll notice its README, if I can zoom in here. Oop, that's much larger than I wanted. Here, you'll see that at the top of the repository's page there are a bunch of these badges, and they display various kinds of information. You'll notice that I have Dependabot running, so the dependencies are currently up to date. It tells me whether the test suite is currently passing on the master branch. It tells me how much of the code is covered by tests, and it tells me what the latest version of this library is and where the latest version of the library's documentation is available online. And all of these are managed by various continuous integration services. Another example that some of you might find useful, or might even be familiar with, is the notion of GitHub Pages. GitHub Pages is a really nice service that GitHub provides which lets you set up a CI action that builds your repository as a website, essentially a blog. It runs a static site generator called Jekyll, and Jekyll just takes a bunch of Markdown files and produces a complete website, and then, as part of GitHub Pages, they will also upload that to GitHub's servers and make it available at a particular domain. And this is actually how the class website works. The class website is not a bunch of HTML pages that we manage by hand. Instead, there's a repository, missing-semester. If you look at the missing-semester repository, you will see, if I zoom out a little here, that it just has a bunch of Markdown files. It has, let's look at the metaprogramming.md file for 2020. If I go to "raw" here, this is the raw Markdown for today's lecture. So this is the way that I write the lecture notes: I commit them to the repository we have and I push, and whenever a push happens, the GitHub Pages CI runs the GitHub Pages build script and produces the website for our class, without me having to do any additional steps to make that happen.
And so, yeah, sorry, go ahead. Yeah, so it's using a tool called Jekyll, which takes a directory structure that contains Markdown files and produces a website: it produces HTML files, and then, as part of the action, it takes those files and uploads them to GitHub's servers at a particular domain, usually a domain under github.io that they control, and then I have set up the missing-semester domain to point to that GitHub one. I want to give you one aside on testing, because it's something that many of you may be familiar with from before. You have a rough idea of what testing is: you've run tests before, you've seen a test fail. You know the basics of it. Or maybe you've never seen a test fail, in which case, congratulations. As you get to more advanced projects, though, you'll find that people have a lot of terminology around testing, and testing is a pretty deep subject that you could spend many, many hours trying to understand the ins and outs of. I'm not going to go through it in excruciating detail, but there are a couple of terms that I think are useful to know the meaning of. The first of these is a test suite. A test suite is a very straightforward name for all of the tests in a program. It's really just a suite of tests: a large collection of tests that are usually run as a unit. And there are different types of tests that often make up a test suite. The first of these is what's known as a unit test. A unit test is a usually fairly small, self-contained test that tests a single feature. What exactly a feature means is a little bit up to the project, but the idea is that it should be sort of a micro test that only tests one very particular thing. Then you have the larger tests that are known as integration tests. Integration tests try to test the interaction between different subsystems of a program.
So to give an example: if you're writing an HTML parser, a unit test might be a test that it can parse a single HTML tag. An integration test might be: here's a whole HTML document, parse it, right? That is going to exercise the integration of multiple subsystems of the parser. You also have the notion of regression tests. Regression tests are tests that test things that were broken in the past. So imagine that someone submits some kind of issue to you and says, your library breaks if I give it a marquee tag. And that makes you sad, so you want to fix it. So you fix your parser to now support marquee tags. But then you want to add a test to your test suite that checks that you can parse marquee tags. The reason for this is so that in the future you don't accidentally reintroduce that bug. That is what regression tests are for, and over time your project is going to build up more and more of them. They're nice because they prevent your project from regressing to earlier bugs. The last one I want to mention is a concept called mocking. Mocking is the idea of being able to replace parts of your system with a sort of dummy version of itself that behaves in a way that you control. A common example: say you're writing something that does, oh, I don't know, file copying over SSH. There are many things you might want to mock here. For example, when running your test suite, you probably don't actually care that there's a real network there; you don't want to have to set up TCP ports and stuff. So instead you're going to mock the network. The way this usually works is that somewhere in your library you have something that opens a connection, or reads from the connection, or writes to the connection, and you're going to overwrite those functions internally in your library with functions that you've written just for the purposes of testing.
Where the read function just returns canned data, and the write function just drops the data on the floor, or something like that. Similarly, you could write a mocking function for the SSH functionality. You could write something that does not actually do encryption and doesn't talk to the network; it just takes bytes in here and, magically, they pop out the other side. And you can ignore everything in between, because for the purposes of testing the file-copying functionality, the stuff below doesn't matter, and you might mock it away. Usually, in any given language, there are tools that let you build these kinds of mocking abstractions pretty easily. That is the end of what I wanted to talk about for metaprogramming, but this is a very, very broad subject. Things like continuous integration and build systems: there are so many out there that can let you do so many interesting things with your projects, so I highly recommend that you start looking into them a little. The exercises are sort of all over the place, and I mean that in a good way. They're intended to show you the kinds of possibilities that exist for working with these kinds of processes. For example, the last exercise has you write one of these continuous integration actions yourself, where you decide what the event is and you decide what the action is, but try to actually build one. And this can be something that you might find useful in your own projects. The example I give in the exercises is to try to build an action that runs write-good or proselint, one of the linters for the English language that we talked about, on your repository. And if you do, we could enable it for the class repository so that our lecture notes are actually well written. This is one other thing that's nice about this kind of continuous integration testing: you can collaborate between projects. If you write one, I can use it in my project.
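To tie the testing terminology together, here's a sketch of the mock-the-network idea described a moment ago, using Python's unittest.mock. All the names here (Network, copy_file, the connection strings) are hypothetical stand-ins, not a real library:

```python
from unittest import mock

# Hypothetical network layer of a file copier. In the real program these
# methods would open sockets; in tests we never want to touch them.
class Network:
    def read(self, conn):
        raise RuntimeError("real network not available in tests")

    def write(self, conn, data):
        raise RuntimeError("real network not available in tests")

def copy_file(net, src, dst):
    # Read everything from the source connection, write it to the
    # destination connection, and report how many bytes were copied.
    data = net.read(src)
    net.write(dst, data)
    return len(data)

# In the test, replace the network with a mock we control: read returns
# canned bytes, and write just records what it was given
# ("drops the data on the floor").
fake = mock.MagicMock()
fake.read.return_value = b"hello"

n = copy_file(fake, "src-conn", "dst-conn")

assert n == 5
fake.write.assert_called_once_with("dst-conn", b"hello")
```

The test exercises the copying logic end to end without any sockets or SSH existing at all, which is exactly the point of mocking.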
And that's a really handy feature, where you can build up this ecosystem of improving everything. Any questions about any of the stuff we covered today? Yeah. In my experience, it's often the case that make is used along with CMake, and they do different parts of building some program. Do you feel like taking a minute to talk about CMake and how it plays with make? So the question is: why do we have both make and CMake, what do they do, and is there a reason for them to work together? So CMake, I don't actually know what the tagline for CMake is anymore, but it's sort of a better make for C, as the name implies. CMake generally understands the layout of C projects a little bit better than makefiles do. It's built to try to parse out what the structure of your dependencies is and what the rules for going from one to the other are. It also integrates a little more nicely with things like system libraries. So CMake can do things like detect whether a given library is available on your computer, and if it might be available at multiple different paths, it tries to find which of those paths it's actually present at on this system and then link it appropriately. So CMake is a little bit smarter than make is. Make will only do whatever you put in the makefile. That's not entirely true, there are things called implicit rules, which are built-in rules in make, but they're pretty dumb. Whereas CMake tries to be a larger build system that is opinionated by default to work for C projects. Similarly, there's a tool called Maven. Maven and Ant, which is another project, are both built for Java projects. They understand how pieces of Java code interact with one another and how you structure Java programs, and they're built for that task. Very often, at least when I use make, I use make sort of at the top, and then make might call other tools that build whatever subsystem they know how to build.
My makefile might call cargo to build a Rust program, and then call CMake to build some C dependency of that. But then at the top, I'm going to do some stuff at the end after the programs have been built, and that might be: run a benchmark, which is in the Rust code, and then plot it using the C code, or something like that. So for me, make is sort of the glue at the top that I might write. Usually, if your makefile gets very large, there's a better tool. What you'll find at big companies, for example, is that they often have one build system that manages all of their software. If you look at Google, for example, they have this open source system called Bazel, and I don't think Google literally uses Bazel inside of Google, but it's sort of based on what they use internally, and it really is intended to manage the entire build of everything Google has. And Bazel in particular is built to be, I think they call it, a polyglot build framework. The idea is that it works for many different languages: there's a module for Bazel for this language and that language and that language, but they all integrate with the same Bazel framework, which then knows how to integrate dependencies between different libraries and different languages. Do you have a question? Sure. So when you say expressions, you mean the things in this file? Yeah. So makefiles are their own language, and it's a pretty weird language. It has a lot of weird exceptions. In many ways it's weird just like bash is weird, but in different ways, which is even worse. When you're writing a makefile, you can sort of think like you're writing bash, but you're not, because it's broken in different ways. But it is its own language. And the way that makefiles are generally structured is that you have a sequence of, I think they call them, directives. So this thing here is a directive, and this is a directive.
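As a sketch of what such directives look like, a minimal makefile in the spirit of the one on screen might be the following (the file names here are hypothetical stand-ins):

```makefile
# target: dependencies
#     then tab-indented commands (the rules) on the lines below
paper.pdf: paper.tex plot.png
	pdflatex paper.tex

plot.png: data.dat plot.py
	python plot.py -i data.dat -o plot.png
```

Running `make paper.pdf` would first rebuild plot.png if data.dat or plot.py changed, and only then rerun pdflatex.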
And every directive has a colon somewhere: everything to the left of the colon is a target, and everything to the right of the colon is a dependency. Then all of the lines below that line are the sequence of operations, known as the rules, for how, once you have the dependencies, you build those targets. Notice that make is very particular that you must use a tab to indent the rules. If you do not, make will not work. They must be tabs; they cannot be four or eight spaces, they must be tabs. And you can have multiple operations here, right? Like I can add an echo hello or whatever, and then it would first run this and then run this. There's an exercise for today's lecture that has you try to extend this makefile with a couple of other targets that you might find interesting, and that goes into a little bit more detail. There's also some ability to execute external commands, to, say, determine what the dependencies might be if your dependencies are not a static list of files, but it's a little limited. Usually, once you start needing that sort of stuff, you might want to move to a more advanced build system. Yeah? What if you have two dependencies, and both of them depend on a common library, but they have conflicting versions or something? Yeah, so the question is: what happens if I have, let's say, library A and library B, and they both depend on library C, but library A depends on, like, 4.0.1, and library B depends on 3.4.7? They both depend on C, and so ideally we'd like to reuse C, but they depend on different major versions of C. What do we do? What happens in this case depends entirely on the system that you're using and the language that you're using. In some cases, the tool will just be like, well, I'll just pick four, which sort of implies that it's not really using semantic versioning. In some cases, the tool will say: this is not possible.
If you do this, it's an error, and the tool will tell you: you either need to upgrade B, like have B use a newer version of C, or you need to downgrade A. You do not get to do this, and compilation will fail. Some tools are going to build two versions of C: when it builds A, it will use the major-four version of C, and when it builds B, it will use the major-three version of C. One thing you end up with here are really weird situations, where if C has dependencies, then now you have to build all of C's dependencies twice, too: once for three and once for four. And maybe they share and maybe they don't; you can end up in particularly weird situations. Imagine that library C writes to a file, like some cache file on disk. If you run your application now, and A does something to call, like, C.save, and B does something like C.load, then suddenly your application at the bottom is not going to work, because the format is different. So these situations are often very problematic, and most tools that support semantic versioning will reject this kind of configuration for exactly that reason, because it's so easy to shoot yourself in the foot. All right, we will see you again tomorrow for security. Keep in mind, again, if you haven't done the survey: the question I care the most about in the survey is what you would like us to cover in the last two lectures. The last two lectures are for you to choose what you want us to talk about, and to ask any questions that you want us to answer. So please add that if you can. And that's it. See you tomorrow.