Welcome to another edition of RCE. Again, this is Brock Palen. You can find us online at rce-cast.com. You can find all the old shows, nominate new shows, and look at a list of people we're looking at talking to. Also, I have here Jeff Squyres from Cisco Systems, one of the authors of Open MPI. Jeff, once again, you grace us with your presence.

You're far too kind, Brock. This is great stuff. And today we're gonna be looking at a project, actually kind of a suite of projects. We're gonna be talking about one of them in particular, but there's actually a whole bunch of additional tools that integrate nicely with it. And it's something that most people don't think about too much and just kind of assume, but having really good tools in this area is actually really, really important. At least in my experience, in most software projects there are only one or two people who really understand the tools in this particular area.

Yeah, so the project we're talking about today is CMake. I wanna call it a build system, but I think it does more than that, so we'll let our guest describe what it is when we get into that. Our guest is another representative from Kitware, which has been represented on the show before. Bill, why don't you take a moment to introduce yourself?

Thanks for the kind words about CMake. I'm one of the original founders of Kitware. I did my master's degree at Rensselaer Polytechnic Institute, RPI. I spent about nine years at GE's Research Center in Niskayuna, which, shockingly enough, is about 15 minutes away from where Kitware is located now. I started Kitware in 1999 with four other folks, and the CMake project actually came out of one of our first projects, with the National Library of Medicine, the creation of the Insight Toolkit in early 2000. We were actually tasked with creating a build system, and that's where CMake got its beginnings.

So describe CMake.
Some people may have used the regular GNU or POSIX make before, but describe CMake and what it does.

It's a meta build system, and in fact, many people who use CMake use GNU make all the time. So what CMake does is the user provides a simple input describing what they want built: they want a library, and here are the source files. Then CMake generates maybe GNU makefiles, maybe Visual Studio project files, maybe Xcode project files, and then the user uses those tools to actually do the build.

So what's the difference here? Why do we need another make-like utility? I mean, make is pretty prevalent on POSIX-like systems. Now, you did mention cross-system support, like Visual Studio projects and whatnot, but does anybody use the X11 project files anymore? What's the advantage of CMake over the traditional make?

Certainly the cross-platform nature of it is a huge advantage and one of the main reasons people adopt it. However, the makefiles that it generates are also very interesting even in a POSIX world, in that they have progress reporting as they're building and they do nice colored output. They also abstract things away: building a shared library is just "create a library", and you don't have to figure out what particulars are involved with doing that on different systems.

So how is that different from, say, probably the most obvious of your competition here, Automake, which doesn't really do progress or color, but does do things like "hey, just make me a library, magic, go"?

Again, the cross-platform nature of it would be one of the major advantages. The Autotools work fine as long as you have a complete Unix system; they use Bash and M4 and a whole host of other tools. CMake was designed so that the only thing it really depended on was a C++ compiler.

So a little sidetrack here: does that mean CMake itself is written in C++?

It is, it's C and C++.
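A toy sketch of the meta-build idea described above, in which one tool-neutral description is rendered into more than one build format. This is only an illustration, not how CMake is implemented: the `Library` class, both rendering functions, and the choice of a GNU makefile and a flat shell script as the two output formats are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Library:
    """Tool-neutral description of what to build (hypothetical, not CMake's model)."""
    name: str
    sources: list[str]


def object_for(source: str) -> str:
    # Map a source file name to its object file name, e.g. a.c -> a.o
    return source.rsplit(".", 1)[0] + ".o"


def to_makefile(lib: Library) -> str:
    """Render the description as a GNU makefile for a static library."""
    objects = [object_for(s) for s in lib.sources]
    lines = [f"lib{lib.name}.a: {' '.join(objects)}", "\tar rcs $@ $^"]
    for src, obj in zip(lib.sources, objects):
        lines += [f"{obj}: {src}", "\tcc -c $< -o $@"]
    return "\n".join(lines) + "\n"


def to_shell_script(lib: Library) -> str:
    """Render the same description as a flat list of shell commands."""
    objects = [object_for(s) for s in lib.sources]
    commands = [f"cc -c {s} -o {o}" for s, o in zip(lib.sources, objects)]
    commands.append(f"ar rcs lib{lib.name}.a {' '.join(objects)}")
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    lib = Library("hello", ["a.c", "b.c"])
    print(to_makefile(lib))
    print(to_shell_script(lib))
```

The point of the split is that the description of the library never mentions a tool; only the rendering functions know about make or the shell, so a third backend could be added without touching the description.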
Okay, and I understand there's a bunch of other advantages versus the Autotools too. You wanna give a little blow by blow of some of your cool features?

The ability to build cross-platform with only a C++ compiler. And again, I think one of the really strong points of CMake is that it takes advantage of the most scarce resource on a project, and that's the people working on it. If you look across Kitware, there are people that really like Visual Studio; they like developing on Windows. There are the people that use Qt Creator, and they're Linux developers. There are people like me that use Emacs on the command line. And all those people can work on a project together with CMake, because it's actually generating the build tool of choice for the developers instead of forcing all the developers on the team to use a particular tool set. So that, I think, is a big advantage: it can actually take advantage of those developer skills and allow them to use the tools they want.

So you actually have people developing on different platforms on a single project. Does CMake do any type of verification that the code they're writing is portable between systems, or is CMake completely unaware of any of that?

It's just a build system. So you can put any code you want in there, and it will obviously not build cross-platform if it doesn't build on that platform. However, like you mentioned in the beginning, there's more to CMake than just a build tool. It's actually grown into a family of tools. There's CTest, which runs tests in a project. There's CDash, which is a web-based application. Again, these are all open source tools and fit into this family of tools. CDash is a PHP-based LAMP stack web application that displays test results. It's a continuous integration system similar to Hudson.
And then finally, CMake also comes with a tool called CPack, which can create Windows installers or RPMs or Debian packages automatically using the install rules already in your CMake project.

So CMake itself is written in C++, and all these other tools are independent? Like, you don't have to adopt the entire CMake environment with CTest and CPack, or can you take pieces?

You can take pieces. Some people use it just for the build system. Some people use the whole suite. Obviously, CDash is completely separate. That's a separate application. Kitware hosts cdash.org. There's also My CDash: if you wanna try it out, you can create a hosted project really quickly. But it's not tied to CMake. CTest and CPack are actually C++ binaries that ship bundled with CMake, so they're always available. One of the restrictions of CMake, I think, is that whatever tools you're gonna actually call from the build file have to be available, and we wanna make sure that the core testing and packaging are always available on all the platforms. And the only way to do that is to bundle them with CMake itself.

So is CMake itself distributed in popular Linux distros, or is it mainly downloaded via the internet?

We've actually had quite a growth in downloads from our site. I think we get somewhere around 1800 to 2000 downloads a day of the CMake binaries and sources. But CMake is also included in all the major Linux distributions, Cygwin, and the OS X ports. So CMake's readily available everywhere. And that was one of the problems early on, sort of a chicken-and-egg thing: we've got this new tool, but you have to install the binary to do your build. But that's changing over time. I'd say that the KDE adoption had a lot to do with that popularity.

So yeah, this is exactly an important point, because way back eons ago when we started the Open MPI project, we were looking at all the various build tools that were available at the time.
We did not want to introduce a dependency on a build tool. We were the new MPI implementation on the block, and we didn't wanna say, oh yes, to try out our awesome code, you gotta first go get this build tool and then you can build Open MPI. And so that was why we ultimately ended up going with the Autotools at that time, because they do a wonderful job of bootstrapping a tarball, so that when you download the Open MPI tarball you don't need to have the Autotools installed. All you need are compilers and make. Does CMake do something like that, or does a source tarball that was created with CMake still need to have CMake installed on the target machine?

Yeah, that's a really good question, and certainly a sticking point for CMake adoption. But yes, CMake does have to be installed on the system. It's used in the makefiles: they call back into CMake to generate dependencies. That's one of the other features I forgot to mention earlier: CMake does source-level dependencies. It goes through and looks for what files are included and generates those. And, like I said, the color output, that's coming from the CMake binary. And again, this is how we get the cross-platform ability, because essentially we're providing a shell or a set of tools that's always around on your build platform. The Autotools get around it because they say, well, we require a POSIX Unix environment to be around for the build. So it's actually quite a bit of stuff that has to be around.

Gotcha. All right, yeah, that's a fair point: you don't need to have the Autotools installed, but you need to have everything else installed.
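The source-level dependency scanning mentioned above, looking through a file for what it includes, can be sketched roughly as follows. This is a simplified illustration and not CMake's actual scanner: it only looks at quoted `#include "..."` lines, ignores the preprocessor and search paths, and takes file contents from an in-memory dictionary. The names `direct_includes` and `all_includes` are made up for the example.

```python
import re

# Matches lines such as:  #include "foo.h"   (angle-bracket includes are skipped)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def direct_includes(text: str) -> list[str]:
    """Headers named by quoted #include lines in one file's text."""
    return INCLUDE_RE.findall(text)


def all_includes(source: str, files: dict[str, str]) -> set[str]:
    """Every header reachable from `source`, following includes of includes.

    `files` maps a file name to its contents; unknown names are treated
    as empty files so that missing headers do not stop the scan.
    """
    seen: set[str] = set()
    stack = [source]
    while stack:
        current = stack.pop()
        for header in direct_includes(files.get(current, "")):
            if header not in seen:  # also protects against include cycles
                seen.add(header)
                stack.append(header)
    return seen
```

A build tool can then treat an object file as out of date whenever any header in that set changes, which is the information a hand-written makefile usually gets wrong.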
All right, so then let me shift the question a little bit here and ask: does it even make sense to have a bootstrap ability in a platform-independent way, such that the first thing in the project file or the first thing in the makefile is to build a mini CMake itself, so that it can build the rest of the project? Does that even make sense? Has there ever been any request for that?

So there have been requests for it, and the early CMake actually did that with the Insight Toolkit I talked about. In 2000, the first thing it did was run a little configure-like script that built CMake, and then CMake built the system. It certainly could be done today, but with the ubiquitous availability of CMake it's really becoming less and less of a problem. I mean, at some point people don't say, well, I don't want to use GMake because you have to have GMake installed. It's getting really easy. There are package managers: apt-get cmake, you're done. There's a whole bunch of other tools you're depending on in your project too. And again, if you get really stuck, all you really need is a C++ compiler to be able to build CMake. And also, Kitware provides binaries for all the major platforms. We even go through sort of a lot of trouble to make sure the Linux binary that we provide works on all the platforms. It's actually built on, I think, a 10-year-old version of Mandrake so that it's completely backwards compatible.

Wow. That's painful.

You know, when that machine dies we're in trouble. We're gonna try to virtualize it in the next couple of months.

So let's get a little bit more of this history. The toolkit you mentioned was what you originally made CMake for. At that time, did you envision keeping CMake as a separate thing, or was it even called CMake then?

I don't think it was called CMake right then. It was sort of the ITK build system, and that's actually a great segue to a story.
So the Insight Toolkit was a segmentation and registration toolkit for medical data, mainly funded by the National Library of Medicine to work with the Visible Human data set. I was pitching my idea of this new build system to the engineering leads, a group of people working on the project, and one of the guys raised his hand: "So what are you doing creating a new build system? You know, it's gonna be the ITK weird build system. Why don't you use something that's already out there?" And before he finished his sentence, before he even let me answer, he sort of said, "Oh wait, wait a minute, I see what you're doing. You're not creating a build system for ITK. You're creating a brand new C++ build system for the rest of the world." And that's what we were doing, and it definitely took a long time for it to get traction.

So what are some of the projects that are using CMake? You mentioned KDE; what are some of the other ones that are using it?

So the Kitware suite of tools uses it: ParaView, VTK. Some other ones, and there's quite a long list: OpenSceneGraph uses it, Blender 3D, ITK of course uses it, Quantum GIS, Second Life is one of the more popular packages that uses it, OGRE uses it, GDCM, the Bullet physics engine, Avogadro, quite a host of KDE applications. MySQL is another really popular one that uses it. They were using it originally only for their Windows build, and they recently adopted it for all their builds. The new compiler LLVM also uses it.

So actually, Open MPI falls in that same category. We use CMake for our Windows build. We still have fairly extensive Autotools support on the POSIX side. We've talked about moving it all over to CMake, but unfortunately no one's had the time and resources to do it. But yeah, very definitely on the Windows side, there's quite a bit of logic and stuff over there that all the good people at HLRS maintain for us.
And like I said, that's how it started with MySQL. For years, that's what they had, sort of a dual build system, and I think six months ago or so, they finally unified the build system.

What other languages does CMake support? So obviously C and C++, but what other things can it build?

It can build Fortran, which is really important, I think, for the scientific community. LAPACK actually uses it as one of their build systems, and our Fortran support actually supports Fortran 90, so there's a full Fortran parser inside CMake. And this is sort of a funny story. One of the guys here helped someone port some code over to CMake. It was a Fortran-based code, and he was telling the guy how to use it, and he says, "Well, you type cmake and then you type make and then it builds your project." And the guy said, "You mean I only have to type make once?" Because these Fortran 90 projects have modules, and you have to build them in the right order so that the modules are there when the project goes to include them. They're sort of byproducts of the build, and they're only there if you build in exactly the right order. So what a lot of Fortran 90 programmers do is they just keep typing make over and over again, and after about seven times, it actually works.

That's pretty terrible. I was on mute before, so you didn't hear me chuckle when you said that the first time, but I can confirm that disastrous build systems like that are out there and in real-world usage, and that's terrible. And that's why tools like this exist. So let me ask you a derivative question then, going off on the Fortran bit. I'm actually involved in the Fortran MPI-3 effort, and so I know a bunch more about Fortran than I ever thought I would. Fortran 2008, for example, is an incredibly subtle language. It ain't your father's Fortran 77. Are you guys keeping up with all the syntax for that stuff as well?
I mean, I assume you don't need full Fortran parsing; you just need to be able to figure out dependencies and the like, right?

Yeah, we basically just need to be able to figure out what's being produced and what's being used by each module, so that we can create a correct graph and then build them in the right order. Fortran 90 works a lot like Java, actually. You say "use some module" or "produce some module", and we just need to parse enough of it to figure out what it's producing and what it's using, and then we can figure out the order of the build.

All right, and so we got kind of distracted. What are the other languages that you're able to build? So you said C, C++, Fortran; do you do things like Java and others?

Java support is admittedly weak. People do Java with it, though. You can do custom commands in CMake where you can basically run anything you want, something that produces something that produces something. So a lot of people do Java like that. They might even run Ant from CMake, but we haven't spent a lot of time on sort of native Java support. Another powerful thing that CMake can do through these custom commands is code generators. Things like moc or things like SWIG, which actually put quite high demands on a build system.

Oh, that's cool. So CMake has a lot more understanding of the underlying code. I mean, with normal make, I'm used to always saying that this object file depends on this source file. So CMake can understand that it needs to compile this Fortran thing to get the module before it compiles the next one that uses that module. It understands this; I don't have to keep track of all this myself. I can just kind of point it at my source code and go. Is it that easy?

It's that easy, yeah. And it does that with C, C++, and Fortran. It can parse the code and figure out the include dependencies and then make sure things are up to date.
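The Fortran module ordering described above (parse just enough to see which modules each file produces and uses, build a graph, then build in dependency order) can be sketched like this. It is a deliberately simplified illustration and not CMake's Fortran parser: the two regular expressions only handle plain `module name` and `use name` lines, and the function name `build_order` is invented for the example. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
import re
from graphlib import TopologicalSorter

# "module foo" on a line of its own; this skips "end module foo"
# and "module procedure foo".
MODULE_RE = re.compile(r"^[ \t]*module[ \t]+(\w+)[ \t]*$",
                       re.IGNORECASE | re.MULTILINE)
# "use foo" or "use foo, only: ..." at the start of a line.
USE_RE = re.compile(r"^[ \t]*use[ \t]+(\w+)", re.IGNORECASE | re.MULTILINE)


def build_order(sources: dict[str, str]) -> list[str]:
    """Order files so that each module is compiled before its users.

    `sources` maps a file name to its Fortran text.
    """
    # Which file produces each module (Fortran names are case-insensitive).
    provides = {}
    for fname, text in sources.items():
        for mod in MODULE_RE.findall(text):
            provides[mod.lower()] = fname

    # For each file, the set of files that must be compiled first.
    graph = {}
    for fname, text in sources.items():
        deps = set()
        for mod in USE_RE.findall(text):
            provider = provides.get(mod.lower())
            # Modules nobody in the project provides (for example
            # compiler-supplied ones) impose no ordering.
            if provider is not None and provider != fname:
                deps.add(provider)
        graph[fname] = deps

    # Raises graphlib.CycleError if the modules depend on each other in a loop.
    return list(TopologicalSorter(graph).static_order())
```

With an order like this in hand, one pass of the build is enough, which is the contrast with typing make repeatedly until the module files happen to exist.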
So then does CMake have any parallelism in it, or does it just make all these dependencies correctly, and then you use something like make's parallel build with -j?

Yeah, what we depend on is the underlying build system's parallelism, and on generating correct makefiles so that it will work. So yeah, make -j is what we use on systems that support it. Another interesting tool we're looking at: someone's working on a Ninja generator for CMake. Ninja is a new build tool from Google, sort of a low-level make replacement to do better parallelism.

So we talked a little earlier about what languages you support. And you mentioned that CMake is included in all the popular Linux distros and on OS X via ports, and it obviously must be available and work on Windows, since we talked about Visual Studio project files and so on. So that's a pretty good list of systems. What's your list of compilers that you support? On Linux alone, there are four or five different popular compilers in addition to the GNU suite, particularly ones that people like to use in the HPC arena, because there are a lot of religious factions about "this compiler gives me better numeric performance" and "this compiler gives me better XYZ performance" and things like that. And they all have slightly different rules for making, say, shared libraries. So what's your laundry list of compilers that you support?

The set of compilers that CMake supports is, I think, fairly complete. I don't have a list off the top of my head. I would have to dig down and look at the dashboard. And the dashboard is our continuous integration testing system. So when people come to me and say, hey, do you support this compiler? Can you help me get that compiler working? I say, sure, as long as you'll contribute a dashboard. And what that means is running a build with that compiler and putting the results up on the CMake dashboard every night.
And this is really important for CMake development, because if someone comes to me and says, I want compiler X to work with CMake, and I just work with them and get it done once, it'll probably work for about six months until we check in something that happens to break it. But I think our tool set is pretty complete. If you can find a compiler that doesn't work with it, I'd like to hear about it, and maybe we can set up a dashboard and get it working. We're pretty proactive when people complain about something not being supported. And the addition of compilers, especially in a project environment, requires only adding a few text files into the module directory of CMake. So it's actually pretty easy to do even if it's not built into the binary. The support for the compilers is not in C++. It's in the CMake language.

So we've talked a lot about the build and comparing it against make, actually compiling your code. What if I have options? I want to include this library, or enable or disable GPU support. We're so used to configure --with and a library path. Does CMake handle that kind of functionality also?

Sure, it really does a good job of that. That's probably one of its strong suits that I should have mentioned. It actually has a full configuration GUI that it can run. It's a Qt-based GUI that shows all the options for a project. You can turn things on and off, like build with MPI, build without MPI. So it's got a full configuration GUI, and you can also do it from the command line as well.

Ah, good. That's what I wanted to hear, because we're all about automated builds and things like that. So GUIs are good, but for automation, command-line-type things are better. So is this kind of analogous to Autoconf's configure with --with-foo and --enable-foo and things like that?

Yeah, and it's customizable per project. CMake actually has what's called a set of cache variables.
When you configure CMake on a project, it writes out a file that stores all the persistent variables for that project; essentially, we call them cache variables. They're stored in this file, and they turn things on and off. They can be things that were discovered through system introspection, or they can be things set by the user.

Okay, now you gotta forgive me, because the majority of my experience is with the Autotools here. So this sounds a lot like you have functionality similar to Autoconf's built-in M4 tests: look for this header file, look for this library, and do these shell things to see if the system supports function XYZ, and various things like that. Is that correct?

That is correct, yes. CMake has what we call system introspection options. You can do try-compiles, or try-compile-and-run. You can run whatever you want and try to discover if the system has certain libraries or header files as well. So the build can adapt to the system.

Okay, and one of the biggest problems that the Open MPI project had with the GNU Autotools was that Automake in particular was very much engineered for a static directory structure: I have subdirectories A, B, C, and off of C come D, E, and F, and that's it, it's kind of fixed. But our developers very much don't want to deal with the build system. The developers are C kind of developers, and they're like, holy, I don't want to try and figure out what you guys did with configure and make and things like that. I just want to add a new directory and have the build system pick it up. Is that a kind of thing that CMake can do?

It could, and certainly we've done that in projects. In fact, we've actually had a chance to do that in the recent rework of ITK, creating ITK version 4, the Insight Toolkit where CMake came from. And we had a big modularization effort.
And the system we created actually has a bunch of directories which are each a module of the system, and CMake automatically figures out which ones are there. Users can actually just drop a new module into the system, run CMake, and then it'll pick up the new module.

So let's talk about CTest a little bit. With CMake, we can turn all these features on and off. If I have a large collection of code with different options, I mean, Jeff, with the MPI stuff, you've got all these different networks and stuff, and maybe different options you wanna try. Does CTest enable this? Is it serial, or can I spread that across multiple hosts? What other things can CTest plus CMake buy me?

I mean, CTest is really the tool we use to run tests. So inside the CMake language you could say, add test, foo, bar, and then that would run some application, some executable, and by default it looks for zero for passing, non-zero for failing. And then the results of those tests can be sent to CDash or reported right there to the user. So it's really just a tool used to run the regression tests. Did I answer your question?

Yeah, yeah. So what other features does CTest provide, though?

There's a -j option, so you can try to run the tests in parallel if you set that up. It does not support running on separate hosts without some extra work, although we have built systems that do that. For example, ParaView uses MPI, and we've got a little mini cluster set up, and we run CTest on it, and then it'll launch ParaView tests that are actually running on other systems. But that's using external software. CTest doesn't actually do the SSH pushing or anything like that.

Okay, and so CTest and CMake, and I think you mentioned there was one other tool in this suite of build and testing environment.

Yeah, that's called CPack.

CPack.

It's a packaging tool.

Yeah. Does that require special tools to use to give people a package to install?
In a sense, yes. It works a lot like CMake does itself. So it's not a packager, although it can create simple packages like tar files, but if you want to create an RPM, it uses the RPM tools. On Windows, it uses the Nullsoft installer. So essentially it's another way of abstracting out your package: here are the files I want to package, here's what they look like, and then go ahead and create it for an RPM or a Debian package or a Windows installer or a Mac installer or a Mac drag-and-drop bundle. It allows you to abstract that out and have one central place, the CMakeLists file, where you tell CMake how to build your system and install it, and where you can create this abstract description of the packaging. Then it will use the native packaging system to create a really nice installer for the system.

So how does somebody transition over to using CMake? Like, for example, an Autotools wonk like me. If I wanted to convert over my existing code base, how would I go about learning about that? Are there examples out there, or are there some sample projects that are kind of canonical CMake usage that are good for noobs like myself?

So that's a good question. There's a book; of course, you can buy the book. There's a tutorial that's included in the book, and the tutorial is actually part of the test suite of CMake, so that's available. The tutorial chapter is actually freely available online, and CMake really has an excellent mailing list that's very proactive and answers your questions relatively quickly. The best thing to do would probably be to walk through the tutorial and sort of get a feel for it. And again, it's pretty simple stuff at the basic level, where you're creating a library and an executable and things like that. Obviously build systems can get more complicated depending on what you need to do. But start small and then build up.

So does CMake integrate with any IDEs, or do you have to invoke CMake independently?
Usually, if you're building a project, you invoke CMake and it generates the build system, and the build system might be the input to an IDE. Once you've generated that once, from then on, if you update, say, an input to CMake, we put hooks into the build so that CMake will rerun and the IDE will then reload the project if something changed.

So what's the largest, or most complicated, I guess complicated would be the more important, thing you've ever seen or done yourself with CMake?

Well, certainly the largest would be the KDE build. I think they like to say it's the largest open source project in the world, and it's not a trivial build, because they've got things like the moc code generator, which has to be invoked. And again, the idea of code generators was something that was in CMake from the beginning, since VTK is wrapped into multiple languages. And that's sort of as complicated as a build can get: you maybe build some C code into an executable, and then that executable parses some files and generates more code, which then needs to be built into a library. Being able to do that round-trip thing and having it work in Visual Studio and Xcode and through the makefiles is not trivial.

So by the same token, and I'm sorry, I'm drawing on my own experience here, in Open MPI we have exactly that problem with some Flex and Bison files, some token-parsing kinds of code generation stuff. Does CMake support natively invoking Flex or Lex and Bison or Yacc?

Sure, there's a module directory in CMake that has various packages. I believe there's one for Lex and Bison. And again, it wouldn't be that hard to do with the custom commands, where you list the inputs and the outputs and then create custom targets for generating code. Then if you use the outputs and put them into a shared library or an executable, everything sort of happens in the right order and CMake does the right thing.
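The custom-command idea above, where each step lists its inputs and outputs and the tool works out the order, can be illustrated with a small sketch. It models the generator chain from the KDE answer (build an executable, run it to generate code, build that code into a library). Everything here is an assumption made for illustration: the `Rule` class, the `plan` function, and the file and command names are hypothetical, and the sketch only prints an order without running anything or detecting cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One build step: a command plus the files it reads and writes."""
    command: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def plan(goal: str, rules: list[Rule]) -> list[str]:
    """Commands needed to produce `goal`, with producers before consumers."""
    # Index every output file by the rule that generates it.
    producer = {out: rule for rule in rules for out in rule.outputs}
    ordered: list[str] = []
    scheduled: set[Rule] = set()

    def visit(path: str) -> None:
        rule = producer.get(path)
        if rule is None or rule in scheduled:
            return  # a plain source file, or a step already scheduled
        scheduled.add(rule)
        for needed in rule.inputs:
            visit(needed)  # schedule whatever generates the inputs first
        ordered.append(rule.command)

    visit(goal)
    return ordered


if __name__ == "__main__":
    rules = [
        Rule("ar rcs libparser.a parser.o", ("parser.o",), ("libparser.a",)),
        Rule("./gen spec.txt parser.c", ("gen", "spec.txt"), ("parser.c",)),
        Rule("cc -c parser.c -o parser.o", ("parser.c",), ("parser.o",)),
        Rule("cc gen.c -o gen", ("gen.c",), ("gen",)),
    ]
    for command in plan("libparser.a", rules):
        print(command)
```

The rules are deliberately listed out of order: the ordering comes only from matching one rule's outputs to another rule's inputs, which is why declaring both accurately matters.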
So a number of years ago, too, there was this paper called "Recursive Make is Evil" or something like that. I have a dim recollection of that paper, where they talk about how invoking make, and then invoking make in a subtree, and then invoking make in yet another subtree is bad, bad, bad for a variety of reasons. Does CMake take a different approach, or do you use a similar approach? How do you address this kind of issue?

Sure, let me answer that. That actually has an interesting story to it. We read the paper and we said, oh yeah, we're a makefile generator, we should be able to generate something like that. And we set out to generate a single monolithic makefile. What we found out is that, without depending on some of the more advanced features of GMake, because we support other makes as well, and given the dependency analysis and the code generation support and the things I talked about with Fortran, in some instances recursive make is absolutely required to make sure all that stuff happens in the right order. So we sort of have a hybrid system, where it does as much monolithic stuff as it can, so that when you type make, it'll make sure the targets are built in the right order. But as it gets down to the individual target level, it may actually invoke make several times just to make sure that all of make's dependencies are set up and done correctly.

So what's the craziest use of CMake you've ever seen, where it's completely unexpected?

I suppose watching people use it to build something like LaTeX. So people have used it to build books.

Oh, wow. But I guess if you're using a programming language to write a book, then maybe you need a build tool.

Well, that's a perfect computer science solution, right? Every problem in computer science can be solved with another level of abstraction.

So what are the features in upcoming versions of CMake that we can expect? What are some cool new things that are coming down the pipe?
The exciting feature we came up with recently is something we call ExternalProject. And what this is, is actually sort of a meta CMake build, or maybe not even a CMake build. We create what we're calling superbuilds. A lot of times it's really complicated these days, because everybody wants to use lots of different packages, but there really isn't a package manager per se that works across all the systems. So if you're on a development team and you're building something that needs Qt and needs Boost and needs VTK, and maybe it uses ParaView, and then sometimes it uses LAPACK, what we can do is use this ExternalProject command, which can actually go fetch and download a copy of the package, either from a Git repository or a tar file or off a website, and then it can configure that, whether it's a CMake build or not, and then build it. And then we can create a system whereby you can get a developer quickly up to speed and have everything built and have it work across all platforms. So that's a neat feature we added recently, and I think that'll be expanded upon in the next few years.

So what license is CMake distributed under?

Without restriction.

And that also covers any build-system-specific files that it generates, like makefiles or Visual Studio project files?

A Visual Studio project that it generates still needs CMake around. So, again, going back to "does CMake need to be installed?": it does. It generates these files and then you build off of them, but you can't redistribute them; it wouldn't be worthwhile.

Now, how does the CMake community work? I mean, what's Kitware's role in the community? Are you kind of a benevolent dictator, or are there committers from outside Kitware? How do you guys function?
I think recently we've been moving away from the benevolent dictator model to more of an inclusive community of developers, and we've even got a guy working on adding new regular expression support through a Google Summer of Code, through the KDE project. And what's really enabled this expansion is our adoption of distributed version control through Git.

So what's the website for CMake, and where can people find it and the mailing lists?

The website is www.cmake.org, and there's information about the mailing list there as well. The address is cmake@cmake.org.

Well, Bill, thank you very much for your time, and we'll have this show out soon.

Yeah, thanks, Bill.

Well, thank you. It was enjoyable talking to you about CMake, and hopefully we both learned a little bit.