Prepare yourselves for a spelunking expedition through the ancient caves of Fortran 4, the PDP-10, and the original Colossal Cave Adventure game. Please welcome Chris to the stage.
Thank you. So again, my name is Christopher Swenson. I have my Twitter handle and my GitHub on every slide. And I've just published the slides on GitHub as well, so feel free to follow along or look later. Like he said, this talk is going to be about the original Colossal Cave Adventure, as well as weird details of the PDP-10 and Fortran 4. And I'll get to all that. The audience for this talk is basically people who are here, people who are curious about this sort of thing and are a bit programmer-y. This talk is mostly in Fortran 4 with some Python. It's not like anybody here is probably a master of Fortran 4, and I am not either.
So who am I? We already went through who I am: occasional BeeWare core contributor, currently at Twilio. And I like stuff, which is why I did this.
So the motivation for this. Quick background: when you start at Twilio, it is a tradition that you use Twilio to build an app as a sort of initiation. And so I was like, well, I'm going to build a game of some sort. Twilio, for those of you that don't know, generally does voice calls and text messaging and stuff like that. So I was like, why not write a game with text messaging as the core mechanic? And if I'm going to do that, the obvious solution, I think, is to write a text adventure. But computer science is the science of laziness, so I'm not going to write a text adventure. Why don't I just port an existing text adventure to Twilio, or basically to SMS, so that I can play it? And so that's what I did. I decided, well, why not do the first ever text adventure? Arguably, that is Colossal Cave, sometimes called Adventure, depending on which version you get.
It was originally written in 1976, and the first version that you can still find source code for is from 1977. It was extremely popular, and ported to dozens of different computers all throughout the 70s and 80s. Some of the terminology used in it still lives on to this day. The original source code that we have was written for Fortran 4 on the PDP-10, which was a minicomputer from 1966. So it was kind of a fun challenge to get this to work at all. You can't actually take Fortran 4 code and compile it these days. There's not a compiler. There just isn't one. And even if there was one, we don't have a PDP-10 to run it on anymore. I think those are all dead now. So I had a lot of fun writing a Python interpreter for the Fortran 4 code to get it to run on modern machines, and then hooking it up to text messaging as the way to interact with it.
And if you want, it is up and running now. You can text 1-669-238-3683. I think that spells, like, Advent 3 or something. But anyway, you can actually play this now. You will probably melt the server if you all try it at once, because this program is real inefficient. There are so many layers of bad things going on. This is really just a talk of shame, I'm not going to lie. And echoing the keynote of this morning, the code wound up being more complicated than the problem it solved. It's maybe a 700-line Fortran program, and it's like a 1,000-line interpreter that I wrote for it. But I guess that's OK, because I had to write an interpreter and then also emulate the PDP-10 and a whole bunch of other really weird things in order to get this thing to run at all.
So if you start up the game, it looks something like this. This is the intro from the original game. I do not actually recommend emailing or sending complaints to Will Crowther. He's a very lovely man. I think he's a professor somewhere now or something. I'm not sure he's still around.
But he probably does not care about bugs in this game anymore. It's been 40 years. And I think I even have it in a terminal here, where you can run it locally as well. By default, it will just run locally. You can ask it for instructions, and it will print them. You can wander around and go into a building: enter building, or just enter, maybe. It's not the smartest thing in the world. You can take things. It's a text adventure. I'm not going to play it any more than that, but this does work. And this is running on the actual program I wrote, interpreting the original unmodified Fortran 4 source code.
So the PDP-10 looks something like this. It would probably fit on something about the size of the stage. If you were to have one here, it would be super loud. And you notice there's no monitor on it or anything like that. The only way you would typically interact with these things is through a tape drive, or a teletype machine to do text input and print output. And that's exactly what it was meant for, which in some sense makes it work perfectly for SMS, because a teletype works by sending a line of input and getting some lines of output back. So that maps perfectly to sending a text message. And everything was 80 columns back then, so in a text message, 80 columns will fit fine.
So Fortran 4 is really fun, and nothing at all like any programming language any of us have probably touched in the past 30 years. We get all those fun things that you hear about in programming, like all caps. There's no such thing as recursion, so functions can't call themselves or even make multiple calls around. It's got to be a really straightforward kind of program. There are line numbers everywhere. We're going to go through some of the source code. It's a lot of fun. Spaces don't matter. You can remove every space in the entire program, and it will still compile. And it was meant to be entered on punch cards.
So I actually had to modify the source a little bit to get it to fit on these slides and have syntax highlighting, because modern syntax highlighters don't understand what to do with such rigidly formatted code. Typically, every line starts with a tab, because you would enter the line numbers on the leftmost side. If the first character is a C or an asterisk, that means it's a comment.
And so it sort of looks like this. It's a little weird. This is a data declaration statement. The first line just says that any variable I don't otherwise declare is an integer if it's a single letter. And then REAL RAN is actually a really weird thing. It took me forever to figure out, because it's undocumented, but that is a function that returns a random value. Fortran 4 doesn't have functions, interestingly enough, so this is some really strange implicit function. And then you have a bunch of arrays of text, which is all the text in the game.
So, to go through and give you some more flavor of Fortran 4: it uses things like .NE. This was written in the time of teletypes, and teletypes didn't have all these fancy keys that we're used to these days, like less-than, or equals. Well, it does have equals, but there was no equals-equals. So I think .EQ. is the operator you would use. There are line numbers in places. They don't have to come in any particular order. So it says GOTO 1, and 1 appears like 100 lines down. That doesn't matter at all. It sets up a bunch of variables. It reads in a bunch of the arrays. Notice that when we declare the array, you can't actually give it any values. You have to do that separately, as actual code. It's got this really weird, almost Python-like way to initialize an array that was kind of advanced for its time. You give it a little for loop, almost, to initialize its data.
So you don't really have for loops or while loops or anything like that. You just have this one thing called DO. You would tell it what the last line is, and the compiler would remember. It would loop through that 300 times, and then it would eventually exit onto that line there. Array access sort of looks like function calls. But there were no function calls, so that's fine. And I guess, in a sense, they're more... well, we'll get to that.
Input is really strange. Whenever you read input, you read it into a variable there, and you have to give it another line number that has a FORMAT statement that tells you what kind. And G naturally stands for integer, so that's reading in an integer. I looked so long to find manuals for this ancient version of Fortran to try to figure all this stuff out. Eventually I just had to guess for a lot of it, and I kept trying to run the program over and over again until I got it to finally run.
This is a computed GOTO statement, which is a lot of fun. It actually computes the last bit there, and then it uses that as an index and picks the line number to go to next based on that. So it's sort of like a switch or jump-table kind of thing. It uses a table internally. It's really strange.
This is one of my favorite ones. It took me forever to figure out what this did. It's a READ statement, so it's reading input again, this time from a tape drive. I think the 1 means read from the tape drive, of course. And the 1005 format means read an integer, and then some number of A5s, which means ASCII strings. I haven't even gotten to all the fun parts yet. And that last statement there will read an array of them all in one line like that, which I think is really cool, that you could do that so succinctly and tersely. None of this is intuitive at all, I think, if you don't know Fortran 4 specifically.
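As a rough illustration of the computed GOTO described above, here is a minimal Python sketch. The function name `computed_goto` and the label numbers are made up for illustration, and the fall-through behaviour for an out-of-range index is an assumption about the compiler, not something stated in the talk.

```python
def computed_goto(labels, index, next_line):
    """Pick the label to jump to, as in GOTO (l1, l2, ...), index.

    The index is 1-based. Assumption: an out-of-range index simply
    falls through to the following statement (next_line).
    """
    if 1 <= index <= len(labels):
        return labels[index - 1]
    return next_line


# Hypothetical statement: GOTO (100, 200, 300), K
print(computed_goto([100, 200, 300], 2, next_line=50))
```

An interpreter would evaluate the index expression first, then use the returned label to decide which line to execute next.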
So you do have subroutines that you call, I would say, but they're not like functions, because they don't return a value. So how do you return a value? This is a subroutine that it uses to ask the user a yes or no question and then return, in some sense, the answer. The way that works is that in the actual subroutine itself, since you passed in a variable as the last argument, which is YEA, it will assign that before it returns. And so every variable can be in or out or both, which is just weird. Other than that, it is pretty straightforward based on the things we know: .EQ., .OR. Again, they didn't have all those fancy characters on their keyboards. The line labels are actually unique to each subroutine, so you can reuse them, which is just fun.
So, some things that I haven't talked about. This was on a PDP-10. Before about 1980, computers didn't really have this concept of eight bits or 16 bits being in a byte or in a word. They were sort of just making it up as they went along. The Digital Equipment Corporation, who made the PDP-10, was using a 12-bit system and then switched to a 36-bit system around the time of the PDP-10. And so all the numbers and everything it was working with were 36 bits, which is really strange. That's why earlier it was reading in five ASCII characters as a string, because that's how many would fit in 36 bits. Nowadays, we have this kind of standardized, and so we take that for granted. But on these systems, things just get really weird, with 12-bit and 36-bit integers and 18-bit addresses and things like that.
I mentioned this briefly: ASCII wasn't really a thing back then. It was but a pipe dream. There was a version of it released in 1963, but it is not the ASCII that you and I know and love today. The control characters were just all messed up. Most of the letters and stuff were there, so luckily we can still read the data from that era.
But this machine was designed in 1966, before most modern versions of ASCII. So you can get those five characters in, and you have an extra bit. The way you do this is, if you have the string ABCDE, you just fit it in the 36 bits and then stick an extra bit on the right. And Fortran 4 doesn't have this concept of data types like we do. A five-character string is treated the same as an integer, and you can also use it as a floating point. It's all just whatever you want, man. Littered through the code, it just interchangeably uses strings and integers, because it didn't care. This is obviously only ever going to run on a Fortran 4 compiler written for a 1966 machine, so we don't need to do anything.
And here's some fun. Maybe you think: this is a text adventure, why do I care how the integers are represented? The reason is that Fortran 4 did not have text processing capabilities, so it actually had to split up its own strings. If I type in a two-word command, it's got to find the space and do a .split() like we would in Python. And the way it does this is with this beautiful function here. I had to stare at this function a long time before I figured out what was going on. But it's kind of a work of art, how it does this.
Just to give you a quick tour: in this first DATA line here, these numbers are in octal. That's what the quote means, it's an octal number. And those are the bit offsets of each character in ASCII. So the 400 is like the eighth bit, I think, or something like that. It'll be the shift. And since it doesn't have bit shift operations, it has to use multiplication, which you can kind of see right here. It's shifting one character at a time. And this XOR statement is looking for a space. If only the, what is it, fifth bit is set, then that's an ASCII space character.
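The packing described above can be sketched in Python. This is only an illustration under the layout given in the talk (five 7-bit characters left-justified in a 36-bit word, with one spare bit on the right); the names `pack`, `unpack`, and `find_space` are hypothetical, padding short strings with spaces is an assumption, and Python's shift operators stand in for the multiplication the original code has to use.

```python
SHIFTS = [29, 22, 15, 8, 1]  # bit offset of each of the five characters
WORD_MASK = (1 << 36) - 1    # a PDP-10 word is 36 bits wide


def pack(text):
    """Pack up to five 7-bit ASCII characters into one 36-bit word."""
    word = 0
    for ch, shift in zip(text.ljust(5), SHIFTS):
        word |= (ord(ch) & 0x7F) << shift
    return word & WORD_MASK


def unpack(word):
    """Recover the five characters from a 36-bit word."""
    return "".join(chr((word >> shift) & 0x7F) for shift in SHIFTS)


def find_space(word):
    """Return the position (0-4) of the first space, or -1 if none."""
    for pos, shift in enumerate(SHIFTS):
        if (word >> shift) & 0x7F == 0o40:  # octal 40 is ASCII space
            return pos
    return -1
```

Under this layout the fourth character sits at a shift of 8 bits, and `1 << 8` is octal 400, which lines up with the 400 mentioned for the slide's DATA line.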
And so it looks for a character that only has that bit set by doing a bunch of shifts, XORs, and ANDs. Once it finds it, it'll say, yes, this is a two-word statement. And then it does some stuff to separate out the two words. I could give a whole talk, maybe a nice lightning talk, just on how this function works, and it's really awesome. And it was a really good way to just destroy my interpreter versions, because getting all of these little bits right is almost impossible.
So that was a little foray into Fortran 4. Then my goal was to write a Fortran 4 interpreter in Python. So how do you do a compiler? How do you write an interpreter? They sort of follow a basic formula. If you've ever taken a compilers course or seen someone do a talk on it, you typically have these four or five formal steps. You scan all the text into tokens that are sort of like words. You try to construct the words into meaningful sentences or statements. Then you take that, and you get a tree that you can execute, but you want to optimize it and tweak it and stuff. And then you can use that tree to generate code.
And I thought to myself, that just sounds like way too much work. I'm not going to do all that stuff. This is an intro project for my company. I only have a couple of days. Let's just hack this. How hard could it be? Let's use some regular expressions or something. That's my general strategy whenever I need to write something and just need to slam it out. This only has to work for one program ever. I'm not going to compile the world's Fortran 4. And all the Fortran 4 that will ever be written has probably already been written. There's never going to be another line written, so this is going to be fine. So I just split the lines, and then I try to split each line by whitespace, commas, and parentheses.
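A minimal sketch of that splitting step might look like the following. The function name `split_line` and the sample statement are hypothetical, and this is not the speaker's actual code, just one plausible way to do it with a regular expression.

```python
import re


def split_line(line):
    """Split a source line on whitespace, commas, and parentheses."""
    return [tok for tok in re.split(r"[\s,()]+", line) if tok]


# Hypothetical numerical IF statement
print(split_line("IF (K) 10, 20, 30"))
```

The first token then hints at the statement type, and the rest can be handed to a statement-specific parser.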
And then I just try to guess what statement this is going to be, and then try to interpret the rest of it. It's real bad. Don't ever do this in real code, but it's fine for a little fun project.
And if you're slamming together a quick compiler, namedtuple is your friend. Normally, you want to create a class that will contain your token and has a bunch of stuff on it. But I'm just like, I don't have time to write the class statement. I'm just going to use namedtuple. namedtuple, for those of you that do not know, creates a class with that name that accepts those parameters, and that's it. That's all it does. It's a simple container, and you can use it as a tuple, which makes it really easy to pass around. So my pseudo-grammar is written all in named tuples. An If contains an expression and a statement. You can have something called a numerical If, which has a negative, an equal-to-zero, and a positive label that it will jump to. Things like that: the Goto, and then there was the computed Goto. So basically a named tuple for every kind of statement that can occur.
And then I load the tape drive, which is just a file in our case. But in the Fortran 4 world it has to simulate reading in from the tape drive, the first character device or something. Then I read in the code and try to parse the lines. I didn't talk about this, but some of the lines can be split. If it's a really long line, you can put it on multiple lines, and there's a special character column that's reserved for that. If that's marked, then you have to combine them, so I have to parse that specially. And then you just hack it until it works. So I do a quick lexical analysis run where I parse all the lines and slam them all together, and replace tabs with eight spaces, because otherwise nothing will make sense if you try to actually keep the tabs in. Remove all the comments, because we don't care about those.
Take out all the labels, because we're going to have to keep track of those. And then basically go through and just start executing statements after that. It literally just takes the output of that.
In executing statements, there's a little bit of logic for handling those DOs, because while you can't recursively call anything, you can have many DO statements in a row. And it can be kind of complicated to figure out exactly which DO statement you're in and when exactly it's going to return, because it didn't have a return. It just gave the next line to execute after you've done this many iterations. So you have to keep track of all of that, and I do it with a simple stack. If you were actually writing a compiler for this, those are all determined statically, so you wouldn't need a stack. You could do it all at compile time and know exactly how many levels deep you're going to be.
And then the execute statement is just a giant switch. If it's an If, execute it as an If: grab the expression, evaluate it as a Boolean, and run things. Then it'll check all of the other kinds of statements it could be, and just start executing them. Nothing super fancy there. You have to be able to evaluate expressions. If you need it to be a Boolean, then it does that. It checks to see if I threw it into a Python string or a Python integer, and then it'll just return those. Otherwise it might be an XOR or an equals or a NOT or an add and things like that, and it has to do all of that computation. There's a bunch of hacks in here, because IF statements are really annoying: it could be one of those weird numerical IF statements that I mentioned. So it has to do a quick regular expression match to grab all that, because we love regular expressions.
TYPE is how you give output to the user. And it's another one of those things that has the format argument.
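The named-tuple statements and the "giant switch" described above might look roughly like this in Python. The field names and the three statement types shown are hypothetical, and evaluating the IF expression as a plain variable lookup is a deliberate simplification, not how the real interpreter evaluates expressions.

```python
from collections import namedtuple

# Hypothetical statement types; a real interpreter needs one per statement kind.
Goto = namedtuple("Goto", ["label"])
NumericIf = namedtuple("NumericIf", ["expr", "negative", "zero", "positive"])
Assign = namedtuple("Assign", ["name", "value"])


def execute(stmt, variables):
    """Run one statement; return the label to jump to, or None to fall through."""
    if isinstance(stmt, Goto):
        return stmt.label
    if isinstance(stmt, NumericIf):
        value = variables[stmt.expr]  # simplification: expr is a variable name
        if value < 0:
            return stmt.negative
        if value == 0:
            return stmt.zero
        return stmt.positive
    if isinstance(stmt, Assign):
        variables[stmt.name] = stmt.value
        return None
    raise NotImplementedError(stmt)
```

A main loop would call `execute` repeatedly, moving to the returned label when there is one and to the next line otherwise.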
You tell it what kind of thing you're going to print out. And so this just does a quick little ASCII formatting of that, where it converts the integers into strings and then prints them out, things like that. And there's a hack, because I was sloppy in how I handled arrays, and so I had to unhack that right there. That's fine. And that's how it really works: it literally just executes statements over and over again. This program will run essentially forever. It doesn't really terminate, because it's just constantly reading user input and then outputting things, and that's all it does, in a big loop forever.
So this is what the game sort of looks like. I had all that execution going on, but I wanted to store the state, because this is going to be running on a web server or something like that, where someone's going to be texting in and it's going to have to respond. So it needs to keep all of that structure. So I take the entire interpreter state right here and I just throw it in a dictionary. And then I pickle it. I compress that. Oh, and then I throw that whole big blob in a Postgres database and store it per user. It'll stay there forever, and whenever you come back to it months later and type in your next command, it will pull from the Postgres database, warm up the program, dump the entire interpreter state back into itself, and then just pick up right where it left off. It's amazing.
So there are three interfaces that we need in order to accomplish this. We need a tape drive, which is just a file, so that's pretty easy. And we have to do a teletype input and a teletype output. Those could be the command line again, or they could be texting. I already said we're going to use SMS. So I was going to just do a little Heroku app using Flask, and it looks like this. It's only a few dozen lines of code, if that.
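The save-and-restore idea described above can be sketched with the standard library. This is a stand-in, not the speaker's code: it uses an in-memory SQLite database instead of Postgres so it runs anywhere, zlib as an assumed compression step, and a hypothetical `games` table, phone number, and state dictionary.

```python
import pickle
import sqlite3
import zlib


def save_state(conn, phone, state):
    """Pickle and compress the interpreter state, then store it per user."""
    blob = zlib.compress(pickle.dumps(state))
    conn.execute(
        "INSERT OR REPLACE INTO games (phone, state) VALUES (?, ?)",
        (phone, blob),
    )
    conn.commit()


def load_state(conn, phone):
    """Return the saved state for this user, or None for a new player."""
    row = conn.execute(
        "SELECT state FROM games WHERE phone = ?", (phone,)
    ).fetchone()
    return pickle.loads(zlib.decompress(row[0])) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (phone TEXT PRIMARY KEY, state BLOB)")
save_state(conn, "+15555550100", {"line": 42, "variables": {"K": 3}})
print(load_state(conn, "+15555550100"))
```

Each incoming message would then load the blob for that phone number, run the interpreter until it waits for input again, and save the updated state.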
But yeah, it literally chops any command off at 20 characters, because it can't read more than 20 characters anyway. And then it's like, okay, just grab the state from our table, unpickle it, and throw it into a game somewhere in here. And then it does a time.sleep, which is how you can tell this is really quality production code right here. Basically it runs until it has waited a certain amount of time without seeing any more output, and then it decides it's done and goes back to sleep. And then it'll accept a line of input, keep looping forever, and update the state. And literally it's just a 10 kilobyte blob. That's the entire database: phone number, 10 kilobyte blob of interpreter state, and that's it.
So the background photo is from someone. That is my talk. I had a lot of fun writing it. I really love these sorts of software archaeology projects, you know, looking into old code and seeing how it works, and then just doing awful, terrible, horrible hacks and having fun. So if you do play with the phone number, feel free to. It will probably break. If you actually look in the code, there are all sorts of places where I literally just sys.exit. I'm like, I don't even know what to do here. I hope the code never reaches this point, and if it does, just too bad. So if you manage to hit one of those, congratulations. That's how it is.
And I did want to point out, I think I mentioned at the very beginning, I'm a BeeWare core contributor. I think we're having a BeeWare breakfast tomorrow morning. Meet here at nine o'clock out front. BeeWare, we can talk about it later. But yeah, that's my talk.