Welcome to another edition of RCE. Again, this is Brock Palen. You can find the RCE podcast online at rce-cast.com. You can also follow me personally on Twitter, where I tweet about HPC and computer-related things: all one word, brockpalen. You can also find that linked on our website. Again, I also have Jeff Squyres from Cisco Systems. He's also one of the authors and organizers of Open MPI. So, Jeff, thanks again for your time. Hey, Brock, how's it going? People can find me on Twitter as well, Jeff Squyres. I tweet not very frequently, but I did recently. Every once in a while, when something grabs my eye. I'm not a regular Twitterer, but I do try to write in my blog at least once a week or so. So all that's linked off the website. I guess other shout-outs we should give: by the time this goes to air, I think EuroMPI starts about a week after. So be looking for interesting research results to come out of that. That's always a cool conference. Yeah, and for everyone else in the HPC community, especially those of us in the States, the big one, Supercomputing, is coming up in November. So we're only a little more than a month away from that. So I will be there. Jeff, I assume you will be there. Yes, I'll be there and we'll be doing the rounds. Yep, yep. Be looking around for the normal listeners and former guests we've had. So again, SC is always a lot of fun. It always puts me to sleep though, too. Afterwards, I need to take a vacation after SC. It is exhausting. And then right after that is Thanksgiving, where you're inevitably descended upon by family. So no rest for the wicked, even after Supercomputing. So our guest today, it's funny. I actually tweeted once that I wished more software providers included modules, which we will get into, modules with their software packages, so we wouldn't have to write so many ourselves. But we have with us today.
Modules with a very specific meaning: not the generic software module meaning, but a very specific software package. Yeah, not Fortran modules, no other type of modules. This is environment modules, I think is what some people call them to clarify. But our guest is R.K. Owen, and R.K. is actually his initials. R.K. is joining us from the Pacific Coast. So R.K., thanks for your time. Well, thanks very much, Brock and Jeff. I appreciate being here today. And let me just tell you a little bit about myself. I currently work at Lawrence Berkeley National Lab, part of NERSC, the National Energy Research Scientific Computing Center. It's a division there. I'm originally a physicist by training, but I found that I enjoy software far more than doing the physics. So I've kind of gravitated towards that. And so I've been part of maintaining modules for several years now. So for those of us who are not familiar with modules, what exactly are they and what do they accomplish? Well, first of all, yeah, I identify them as environment modules because, as you know, modules is kind of an overloaded term. I mean, there's kernel modules, Perl modules, Python modules. And I think you mentioned Fortran modules. But I think this kind of got there first, and it was originally identified just as modules, but more correctly, it's environment modules. And what it does is it allows a user to easily manipulate their user environment: you have environment variables like PATH, MANPATH, LD_LIBRARY_PATH, and so on and so forth. And so it allows you to change those on the fly. And also you can create aliases: whether you run in the C shell or the Bourne shell, you have a unified way to create an alias in those shells. And as a side benefit, it will also modify X11 window resources. Now Brock and I are both long-time environment module users.
I think we've exchanged some emails even many years ago, in my prior life as a research assistant back at Indiana University, and possibly even way back in my Notre Dame days. But one of the genius things that I love about environment modules from a user perspective is that it can add and remove things from your environment, even if, for example, some path is in the middle of your $PATH environment variable. And you can say, oh yes, well, you know, unload that package, and environment modules will just go and find it in the middle of the path and remove it. And to me that was always a wonderful, wonderful thing, particularly for users who don't know or care how all that mojo works. They just want to say, I don't want that in my environment anymore. Was that a specific design goal? I think, yeah, the original design was to be as flexible as possible in dealing with these environment variables. And so I think it had always been designed with that goal, where if you had some entry in a path that was in the middle, it could easily be extracted and removed. But you'll find that most of the module commands themselves do things like prepend to the path or append to the path. If you want to instantiate a new application in your environment, generally that module file will be appending onto your path. And you do a number of these things, and say then you want to remove it: you do a module remove, and it'll extract it from the path wherever it may be. So on traditional Unix and Linux type systems, we've always just modified the dot files, you know, .bashrc, .cshrc. So besides removing entries from the middle of your path, what are some of the other benefits of modules over using the dot files? Well, the dot files, I mean, you're still free to use them wherever. What's nice about modules is that you're not locked into one thing or the other.
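The prepend, append, and remove-from-the-middle behavior described here can be illustrated with a small sketch. This is not how the modules tool is implemented (that is C with an embedded Tcl interpreter); it is a minimal Python illustration, and the function names and the /opt/... directories are hypothetical.

```python
# Hypothetical helpers for illustration only; these are not functions
# from the modules tool itself.

def split_path(value, sep=":"):
    """Split a PATH-style string into its non-empty entries."""
    return [entry for entry in value.split(sep) if entry]


def prepend_path(value, entry, sep=":"):
    """Return the path with `entry` placed at the front."""
    return sep.join([entry] + split_path(value, sep))


def append_path(value, entry, sep=":"):
    """Return the path with `entry` placed at the end."""
    return sep.join(split_path(value, sep) + [entry])


def remove_path(value, entry, sep=":"):
    """Return the path with `entry` removed, wherever it sits."""
    return sep.join(e for e in split_path(value, sep) if e != entry)


if __name__ == "__main__":
    path = "/usr/local/bin:/usr/bin:/bin"
    path = prepend_path(path, "/opt/app1/1.0/bin")  # "load" app1
    path = prepend_path(path, "/opt/app2/2.3/bin")  # "load" app2
    # app1 now sits in the middle of the path; "unloading" it still works.
    path = remove_path(path, "/opt/app1/1.0/bin")
    print(path)
```

Because removal matches on the entry itself rather than on its position, the order in which packages were added does not matter when one of them is taken out again.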
Well, the thing about dot files is that they're fine for people who have very much a static environment. You know, you have your set number of a few applications that you want to use, so you could put that stuff in the dot files just as well. However, with modules, what's nice is that if you need to use some application today, but you don't need to use it tomorrow, instead of having it litter your environment, you can just say, well, module load application one here. And then that will put it in your environment and you can start using it. And tomorrow, you don't want it in your environment. Or say you're trying out some new application and you want to trade between two applications or two versions of the same application. Modules allows you to load one, try it out, unload it, load the other application or the other version, and try that out. And so you can go back and forth very easily. Now, one thing also that modules does, which I kind of like, is that it allows a person or a system administrator to place each application in its own root. For me, I usually put my application under /usr/local/package, then the package name, and then the version number. And so you don't have to just dump everything into /usr/local/bin, which has been the usual way of doing things. Now that is a good thing, and it allows multiple concurrent versions of the same software package, for example, or multiple different builds for whatever reason, maybe based on architecture or configuration or something like that. But one of the criticisms of environment modules that I've heard over the years is that, well, I get this ginormous path that's terrible, blah, blah, blah. What do you say to people who give that kind of complaint? Well, I agree with them that it can give a very large path. And I know most of the work where I was introduced to modules was on Crays.
And even now, they segment their applications a lot using modules. And yes, the path can be horrendously long. There's no doubt about that. However, the flip side is, well, then you have everything in /usr/local/bin. And so what's the best way of doing that? Well, I'd say it's easier to manage your applications using modules, with a different path for each, than it is to dump everything in /usr/local/bin. And particularly if you're doing your own stuff and you're building stuff from tarballs, it really is nice to just create a simple module file, whether you're doing it for the system or for yourself, and then just load that, and then you have access to the application that you just unpacked from the tarball. Yeah, I agree with that sentiment completely. I personally don't really care how long the PATH environment variable gets, because it's just a heck of a lot easier logistical problem for me as a user to have just the packages that I want versus the system administrator, exactly as you said, trying to manage 600,000 packages inside one directory tree like /usr/bin. And particularly if you want to have 10 versions of, say, oh, I don't know, a great software package like Open MPI, it gets really sticky to install that into one tree. And Red Hat and others have come up with some fairly creative solutions for it, the whole alternatives package and so on. But I've logged into HPC clusters where I've seen 20 different MPI implementations, Open MPI and otherwise. And boy, that's just terrible to do, in my humble opinion, via alternatives, and much better served via modules. So that's just my two cents. No, I do all the user support, the software-facing stuff at our site. And I think at one time we had 30 different MPIs, when you include all the permutations of versions and compiler versions. MPI is a good example, compilers are a good example. But another case actually is just the idea of reproducibility.
You can say I did everything with version X, but that might not be the version that, say, a collaborator uses or something like that. And so it's nice to be very explicit, and being able to support many users in a large environment is just significantly easier with modules. Yeah, especially with libraries, you definitely need to have a number of them, because not every application can use the latest, greatest version of a library. Yeah, yeah, no, sometimes you just have to go backwards, and this makes it easier to go. So we're really talking about HPC centers, and I know I've seen it in a lot of places. How widely adopted is modules? How many places have you seen it at? Well, I've seen it wherever you have Crays, you know, Crays have them natively. And yeah, it seems like every place I've been to, well, of course, I kind of pushed to have modules there. Also, I noticed that at Lawrence Berkeley Lab, you know, without my help, they had modules installed because they actually had a lot of applications that they had to serve up. So where outside the HPC environment do you see uptake of modules? Or is that not something you really pay attention to? No, I'm not really out there looking at other places too much. However, I would like to see a modules package installable on, say, Debian or Ubuntu. I think, yeah, SUSE Linux has it. Red Hat, I'm not sure. But yeah, I think it would be nice if it was available, maybe as an optional package that a person could install. Because where I got into this was, yeah, I was a software developer. I needed to have access to different applications and different libraries. And I wanted a way to manage it on my system. And so I noticed that Crays had this neat thing called modules. At first, when I was told about it, I thought about just the idea that it was a child process modifying a parent process.
I said, well, I didn't like that idea, because I thought it would be nothing but trouble. It'd be just specific to Crays. But when I found out more about it and how it did it, I said, oh, I have to have it on my box. And that's when I downloaded the 3.0 beta version of the software and had to go through quite a few hoops to get it to compile. And then once I got it to compile and work mostly pretty well, I went to the email list and announced that I have it, and it's available for other people to download, because at that point it seemed like modules was getting stagnant. There was nothing really happening for several years before that. And this was about 1999. So you found that, you know, Cray had this modules package and you wanted to expand it to other places. Even to this day, who do you think should be using modules that isn't? Well, like I said, it should be a package available for Ubuntu, Debian, Red Hat, anyplace. Because the way I see it, it's really targeted at, well, the system administrator, so they can manage a selection of applications, and also the software developer, if they need access to a different set of libraries or a different set of tools, depending on whatever project they're working on. So I guess this is kind of an open call to packagers out there. If anybody's interested in making a modules package, please contact him and he would love to talk to you. Well, in fact, I'll do whatever I can to make it as easy as possible. I did have a Red Hat package, or I have the hooks in the sources themselves to do a Red Hat package. And right now I'm also working on an Ubuntu package, because that happens to be the system I use most often. Yeah, years ago, I believe that I made a package of modules for OSCAR.
There was a cluster distribution at the time called OSCAR, and I made some RPMs for that, and apparently they lived for quite a long time even after I moved on and out of the OSCAR project. But they were quite useful there, because I added another layer on top of modules for persistence across multiple nodes. So it actually did modify your dot files and things like that. So you could say, hey, choose this version of MPI, and that would actually modify not just your current environment, but also your startup environment, because you kind of needed that in a parallel environment. And I think we went back and forth on that a couple of times on the mailing list a bunch of years ago. Okay. Well, there's also one feature of modules: if you have in your home directory a .modulerc, you can have it do a module load of whatever modules you commonly use, and the module command itself will edit that file. And so you can put commonly used modules in that one file, and they'll be accessible whether you log in with a C shell or Bourne shell or K shell or whatever. Well, I'm going to pretend that that feature wasn't there and that the work that I did a couple of years ago was useful. Yeah, if you look at some of the things I have in the module sources, there are actually scripts and whatever that I had to develop for my job, because we had this kind of a split between user services and system administration, and I was more in user services. So we could do things to user dot files, but we couldn't do things to the system dot files, you know, that would make modules or whatever available for everybody. So, yeah, you'll find that there's a lot of scripts and cruft in the module sources which are solving problems just like that. So this seems like a good segue here. What language is modules written in, and why? It's written in C and it has an embedded Tcl interpreter.
I think, well, originally the first version written by John Furlong used awk scripts, and they found that it was just too slow for them. And so I think what they did is they went to C because it would be a lot faster, and they used the embedded Tcl interpreter because at the time that was kind of the only thing that was available. Now, if I was to write modules today, I'd probably use C and embed a Perl interpreter, but that's just my preference. So that's a very interesting comment. Why Perl? Is Perl just your favorite language du jour, or is there something that lends itself to modules, or what's your rationale for saying that? Well, I think the reason I would choose Perl for myself is that it's the scripting language that I'm most familiar with. I do a lot, if not most, of my work in Perl. However, modules was written with the embedded Tcl because at the time that was the only embeddable interpreter that was for the most part available. And it did have the promise of being a kind of glue to tie different applications together. However, one of the things I'm toying with as an idea: the module command right now uses Tcl, but there's really no reason that you can't embed other interpreters also, like Perl, Python, even M4 if you want. As long as there's some way to embed it, initialize the interpreter, and perform the few actions which are required for modules, then there would be really no reason why a module file couldn't be written in Tcl or Perl or Python, because they'd all be doing the same thing: basically giving the module command the information that it needs in order to modify the environment for the user. So is this anything more than a little twinkle in your eye right now? Is this something that we can expect in a future version?
Well, if I had an infinite amount of time, yes, but it is something that I'm working towards, mostly by rewriting the code to be more modular (there we go with the module word again), modularizing it so that it is less dependent on Tcl specifics and allows other interpreters to be embedded. So it's something that I'm working towards with the 3.3 version, which has been in the works for quite a while now. It probably won't happen there, but with each iteration the code gets cleaner and more concise, and I wouldn't be surprised with 3.4, maybe. Probably that would be my goal for 3.4: to allow other interpreters to be embedded. So modules has this base feature of manipulating your environment. What is your personal favorite feature of modules? Well, I think my favorite feature is the one that I use most often, the prepend-path module directive, because that's the one I use in all my module files, since I just want to prepend something to my path. In times past, on various machines, I would use this, and also the other feature, the use.own module file. That way you can basically add your own set of module files, and that way I can tailor my environment. On previous systems, for whatever reason, the system tools were really pretty inadequate and I'd rather use GNU tools, and so either the GNU tools were available in a different directory or I'd compile them myself and put them under my home directory somewhere. And so using prepend-path and the use.own module file, I could use GNU tools and bypass whatever system tools they had available. So it sounds like there's been a lot of modification out there. Jeff did some modification. It reminded me of some modification. How flexible is the modules environment for modules itself? For modules themselves.
Now that was one of the things that I introduced with version 3.1: versioning for modules themselves, for the module command. Up to that point modules could swap in and out different versions of, say, an application, but it never had a way to swap in and out a version of itself. And so with 3.1 I introduced all the framework and mechanisms so that I could be working on modules 3.1.5 while the system could be defaulting to 3.1.4. That allowed me to switch the module tool itself between versions, so I could try out and make sure that the next version was working appropriately. But with module files themselves, you're kind of locked into a certain framework at this time. That may change in the future, though, with some ideas that I've been floating around. So I know I've got some really large module files. What's the most complicated module file you've ever seen personally? Well, I've seen a few from the email list, and quite frankly, if a module file is too complicated, in the sense that it's trying to do too many different things, with too many conditionals, you're probably doing something wrong. Quite frankly, a module file should be fairly simple and fairly direct, in that it's basically adding onto the PATH, the MANPATH, the LD_LIBRARY_PATH, and setting other environment variables like CFLAGS or whatever. If it's doing something more complicated than that, yeah, you should probably rethink what you're doing or maybe break it up into different module files. Yeah, I remember back in my Notre Dame graduate student days, in the HPC and centralized IT groups, they had some absolutely monster module files that had all kinds of logic, and they would load sub-module files conditionally and do some crazy things like that. Brock, what do you have large module files for? What are you doing with them?
The main ones are actually for certain libraries; IMSL is a good example. They come with a file that you can source in your dot file, but it just sets 20 different environment variables. Oh, I see, so you're not doing anything crazy, you're just setting a truckload of variables. Okay, yep, yep. Probably the craziest thing we do is, you know, check to see if something else is loaded and maybe load it for you. Okay, so does modules have anything built in that keeps me from, say, loading two versions of the same piece of software at the same time? Well, I'm glad that you asked that, because yeah, there is a feature which is native to environment modules, and it's called conflict and require. Typically, if you have multiple versions of the same application or same library, you don't want them all to be loaded, because quite frankly, if you have two different versions of the Open MPI library, then you have kind of a problem: which one will get used? The first one in the list on the LD_LIBRARY_PATH. So what you want to do, if you have module files for different versions of the same library, is put a conflict in there. In this case, say the module file name is openmpi and then you have versions, I don't know, 1.1, 1.2, 1.3; then in the module file itself you'd say conflict openmpi. So the first time you load an openmpi module, that will get loaded fine, no problem. Then if you say load openmpi/1.3, it'll say, oh sorry, can't do that, it conflicts.
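The conflict check described here can be sketched in a few lines. The following Python is only an illustration of the idea under simplifying assumptions (modules are identified by "name/version" strings, and a conflict is declared by bare module name, as in the openmpi example); the class and function names are hypothetical and are not part of the modules tool.

```python
class ConflictError(Exception):
    """Raised when a module cannot be loaded because of a declared conflict."""


def base_name(module):
    """Return the name part of a 'name/version' string, e.g. 'openmpi'."""
    return module.split("/", 1)[0]


def load_module(loaded, module, conflicts):
    """Return a new list of loaded modules with `module` added.

    `loaded` is the list of 'name/version' strings already loaded.
    `conflicts` is the set of module names that `module` declares a
    conflict with (the equivalent of 'conflict openmpi' in its module file).
    """
    for other in loaded:
        if base_name(other) in conflicts:
            raise ConflictError(f"{module} conflicts with loaded module {other}")
    return loaded + [module]


if __name__ == "__main__":
    # The first openmpi module loads fine because nothing is loaded yet.
    loaded = load_module([], "openmpi/1.2", {"openmpi"})
    try:
        # A second version is refused because an openmpi module is present.
        load_module(loaded, "openmpi/1.3", {"openmpi"})
    except ConflictError as err:
        print("refused:", err)
```

Declaring the conflict against the module's own name is what makes the versions mutually exclusive: whichever version is loaded first blocks all the others until it is unloaded.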
Now also there's the other feature of requires. I'm not sure what Open MPI would require, but say it required autoconf for whatever reason; you could say require autoconf, and then I think it'll tell you that, oh, you can't load the module because it doesn't have all the required ones. But what you can also do is have one module file which will do a module load of a whole string of things, so that with one module file you're actually setting up your entire developer environment. Cool. Well, in a slightly different vein, we asked you about the largest, most complicated module file usage you've seen. What's the strangest use of a module file that you've seen? Well, I think on the email list we often get questions from people saying, well, this application has a script that needs to be run, and then they try and coerce module files into running this script. Generally, yeah, you shouldn't do that, because essentially the script is trying to do what module files are supposed to be doing, and that is basically setting up your environment. So oftentimes it's just far easier or far better if you look at the script, see what it's trying to do, and if it's setting certain environment variables or doing certain, I don't know, actions, put those into a module file. And so those are kind of the strangest and weirdest things, but for the most part most people just use module files or modules for what they're intended for, and that is basically modifying the environment. So say I'm a software developer, say I'm in Jeff's position, and I want to include a module file with my source, but being a first-time user I don't know what I should do or an intelligent way to do something. If I wanted to get some help making that module, can I get any?
Yes, there is an email list. I think it's called modules-interest at sf.net or sourceforge.net, and there are actually quite a few people who look at the email list, and they are very good about answering questions. They actually do a far better job of answering questions than I do, because I think they have more of a bead on what modules are, especially in their own environment, in their own situation. Many questions come in and I've never seen one just drop through the cracks yet. And it's an interesting question, because I don't really recall any software providers or developers asking for help. I think modules are fairly straightforward in the sense that if you want to add your own module file, you look at MODULESHOME and MODULEPATH and you can discover pretty much where things are being placed on the system. And so it may be that most software developers, if they do want to add a module file, haven't really needed help yet. But if they do, as I say, the modules email list is very helpful, and of course I'm very willing to help out anyone, particularly if they want to add modules to their package to make it more available. So what's coming up in future versions of the modules software package? What new features are you doing? What are people asking for, and things like that? You talked about new language interpreters, which I think would be personally awesome, because I never remember the syntax for Tcl, since this is the only context in which I need it. So something like a Perl language interpreter, I'll give you a plus one on that one right here. But what else is coming down the pike? All right, so I've got the 3.3 line up there just waiting to go, and what it has is internationalization, and also I think what I want to add is something like module reference counting.
Now there's also a Tcl version of modules, which I don't really have much to do with, but it's part of the CVS repository, and they already have module reference counting there. It's a good idea and I want to add it to the C version, and also add some module file directives for creating stacks in the environment, so you can push a value onto, and then pop it off of, some sort of environment variable stack. Also I'm going to remove some cruft that's in the modules command itself, like module file caching. It was kind of an artifact of 3.0; at the time they used caching to try and speed up modules, but they also hampered the module environment by limiting it to two levels of directories. Other things like tracing, I haven't really needed that at all. And also how to handle modules in a global home environment, because if you have a global home and you're logging on from three or four different machines, all with different host operating systems, then you have kind of a conflict with the module purge command. And, as I said, embedding other interpreters, or allowing them to be embedded, and also just trying to package it so that it'll be available for Ubuntu, Debian, perhaps Red Hat. And another tool I would like to have is a Tcl/Tk script that will help craft module files from a template, so that for yourself it's just a matter of invoking a script and popping up a window, and you can just click off: I want to add this to PATH, MANPATH, LD_LIBRARY_PATH, whatever paths you have there. Okay, R.K., well, thank you very much for your time. What's the website for modules? The one that's active is modules.sf.net; that's the website that's associated with the SourceForge repository, and that's the one that we keep up to date. Okay, well, thank you very much for your time. I appreciate it, thank you very much.