I'm a junior lead software senior engineer at the Ladders, Cassio Juarez on Twitter, and today I'm going to talk to you a bit about COBOL. First of all, who here knows COBOL? All right, so it should be pretty easy to put one past you guys. First I'd like to talk a little bit about what COBOL is and get into a bit of the history of the language itself. Back in 1959, the landscape in computer programming was that you bought a computer from somebody, IBM, Honeywell, one of the other big companies, and you got the programming language with it. There really wasn't a cross-platform language for writing code, so as a customer you were tied to whatever piece of hardware you bought, and that made things hard for you. It also made things hard for the people trying to sell the computers, because one of the big objections they ran into was, "Well, we don't really like your language," or, "What if we decide we want to switch things up?" So in 1959 a woman named Mary Hawes was at a conference and started cornering people and saying, "Hey, we've got this need on the business side for some sort of common programming language that we can use to write business applications," and she started putting together a team of people who were interested in doing this. By about the middle of 1959 they had started an organization to develop COBOL. They had a long-range committee that was in charge of strategic long-term planning, a medium-range committee in charge of figuring things out for the next few years, and a short-range committee that was in charge of actually designing the language. The short-range committee is the only one that actually ended up doing anything. In December 1959 they produced the first COBOL spec.
So the driving ideas behind COBOL, and this is according to Jean Sammet, who was on the short-range committee, were these. First, they wanted a language that was natural: something that people could read, that would read like English, like something people were used to. Second, they wanted ease of transcription to the required media. If you think back to that time, what you know of it, or for those of you who were there and remember it, we're talking about five-bit character sets. We're talking about things that are entered on teletypes. There were academic languages that people had proposed that were specified in characters nobody was actually able to type, and so the idea was that COBOL would be something where the language specification matched what you could actually type and print out on a computer at that time. The third requirement was that they wanted something where the problem structure could be specified in the language that was created. And the final thing was that they wanted something that was easy to implement. That led to a few things. The initial idea was that the first word of every sentence would be a verb, so a program was effectively a set of imperative statements directed at the computer. You'd have very few verbs with a lot of different options, and GO TO would be allowed after every sentence, so at any point you could throw in a GO TO. This was before Dijkstra had written "Go To Statement Considered Harmful." At the time, people actually thought GO TO was a pretty good idea. There was also this idea that new verbs could be added at any time, so initially the thought was that the language would be extensible. In about 1965 that was removed from the specification, because it turned out that nobody had actually bothered to implement it. Quick word about functions.
Everybody liked Michael's talk previously, right? So in 1959 there was a lot of academic thought about computers. There were a lot of papers out, and they confused the hell out of everybody in the business world. The idea was that functions were impossible for non-mathematicians to understand. So instead of having functions in COBOL, they came up with their own approach, these verbs that you would use and chain together, so that it would be easy for other people to understand. They stuck with this idea for a long time. In 1989 some built-in functions were added, and in 2002 user-defined functions were finally added to the spec. Before that, some manufacturers had inserted their own, but they were ad hoc, one-off things. Anyway, I want to talk a little bit about some of the things that we got out of COBOL, good or bad. One of the first is that enforced separation of concerns can be a good thing. COBOL programs are divided into four divisions: an identification division, an environment division, a data division, and a procedure division. If you look at most of your programs today, you can recognize these things in your own code, but most programming languages and environments don't enforce them. You're left on your own, with whatever tools are there, to have something that identifies who you are, something that contains all the environment-specific code, like your configuration files, someplace where hopefully you declare your data in a way that somebody can go and look at and understand what you're working on, and then something that actually has the code you're working with. COBOL enforced this in the structure of the programs. I've got an example from Storm and the Clojure DSL that Storm provides. Is anybody familiar with Storm, the event processing system? All right. Well, they have a Clojure DSL.
It's a system for doing basically the same types of work that you'd be doing with COBOL, which is processing data. And if you look at the definition of a bolt, which is one of the units of work in Storm, it looks a lot like the structure of a COBOL program, in that you've got your split sentence, which is your identifier for the bolt. You've got your output, which is that word thing. And then you've got your inputs, which are the tuple and the collector. So some systems that we use these days provide varying degrees of this stuff. COBOL was really the first place where people started thinking about these things in computer languages, frameworks, and systems. The second thing to talk about is that naturalness is subjective and not always a useful measure of how good a language is. I think this is the part everybody's been waiting for, which is some real-life COBOL code. If you look at this, hopefully you can get an idea of what's supposed to be happening here. This program reads records from one file and copies them over to a file that will be printed out. It's a very simple program, but if you read it, it reads pretty much like plain English. If you had a rough idea of what it was supposed to do, you could go through and identify all the parts that do these things. In fact, I'd argue that even if you weren't a programmer, you could probably read through this and figure out what was going on. So this is great at a certain level. You can look at this, see what's going on, no big deal. The problem is that it becomes difficult to work with more abstract ideas when you're confined to a language like this. So this idea of naturalness that they originally had ended up hamstringing COBOL, because all of your code had to be written this way. It all had to be verbs and then the various permutations.
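The slide itself isn't captured in this transcript. As a rough stand-in, here is a minimal sketch in Python, not COBOL, of the kind of program described above: it reads records from one file and copies them to a file meant for printing, with the code laid out in four sections that loosely mirror the four divisions. Everything in it is an assumption made for illustration: the program name, the fixed-width record layout, and the in-memory files standing in for real input and printer files.

```python
import io

# IDENTIFICATION: says what this program is (hypothetical name).
PROGRAM_ID = "COPY-TO-PRINT"

# ENVIRONMENT: everything specific to where the program runs.
# In-memory files stand in for a real input file and a printer file.
def open_files(input_text):
    return io.StringIO(input_text), io.StringIO()

# DATA: the record layout, declared in one place.
# Assumed fixed-width layout: 10-character name, then 5-character amount.
RECORD_LAYOUT = [("name", 0, 10), ("amount", 10, 15)]

def parse_record(line):
    return {field: line[start:end].strip() for field, start, end in RECORD_LAYOUT}

# PROCEDURE: the actual work. Read each record and write one print line.
def copy_records(infile, printfile):
    count = 0
    for line in infile:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        record = parse_record(line)
        printfile.write(f"{record['name']:<10} {record['amount']:>5}\n")
        count += 1
    return count

def run(input_text):
    infile, printfile = open_files(input_text)
    count = copy_records(infile, printfile)
    return count, printfile.getvalue()
```

Nothing in Python forces this layout; the comments are only a convention. In COBOL the divisions are part of the program structure.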
And so when you start doing things like math or string manipulation, or some of the other important things you want to do, you end up having to be really verbose. When people talk about billions of lines of COBOL being out there, part of it is because some of this stuff is just really hard to write in COBOL. You'll see this when people talk about languages. They'll say, "Oh, I like Ruby because it's intuitive," or, "It's natural." Usually what they're saying is, "This looks like something I know." In COBOL's time, that was English. Nowadays, most people are used to working with languages that have functions or things like that, and so those things seem natural. But again, it's subjective, and it's all really very domain specific. The final thing that I think we should take from COBOL is that ETL will always be with us. ETL is extract, transform, and load. That's effectively what we're always doing with databases, and it's what COBOL was designed to do. COBOL should not be thought of as a general-purpose programming language. It's really a DSL and a framework for doing ETL work: pulling something out of one place, manipulating it, and writing it back out to another place. You may not really be able to see this, but the point I'm trying to make is that the thing on the right is from a COBOL book that I have, and it describes the process of pulling things out, manipulating them, and shoving them off into another file. The diagram at the top is from the Pig documentation. Pig is a framework built on top of Hadoop, which is a MapReduce framework. And the thing on the bottom is a Storm topology diagram. We're still doing the same things. We're still looking for languages that let us express these ideas well, and we're succeeding or failing to various extents.
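To make the extract, transform, load shape concrete, here is a minimal sketch in Python. It isn't taken from the COBOL book, Pig, or Storm. The function names, the comma-separated input format, and the list used as a destination are all assumptions chosen for illustration.

```python
def extract(lines):
    """Extract: pull raw comma-separated records out of the source."""
    for line in lines:
        line = line.strip()
        if line:  # ignore blank lines
            name, amount = line.split(",")
            yield {"name": name.strip(), "amount": amount.strip()}

def transform(records):
    """Transform: clean up each record (upper-case names, numeric amounts)."""
    for record in records:
        yield {"name": record["name"].upper(), "amount": int(record["amount"])}

def load(records, destination):
    """Load: write the finished records out to the destination."""
    for record in records:
        destination.append(record)
    return destination

def run_etl(lines):
    # Hypothetical driver: chains the three stages, loading into a new list.
    return load(transform(extract(lines)), [])
```

Calling `run_etl(["alice, 42", "bob,7"])` pulls each record out of the input, normalizes it, and appends it to the destination list. That is the same pull out, manipulate, write back flow, just at toy scale.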
But if you go back and look at what COBOL did and what they were trying to do, you'll recognize a lot of the things that we're trying to do these days with respect to big data and event processing. So, final thing. Actually, that URL is pretty small. It's bit.ly-go-ruco-cobol. I have some links there to some of the documents that I used for this. Unfortunately, a lot of the COBOL stuff isn't freely available online, but this links to some ACM articles and some books that you can read if you want to learn more about COBOL, because we should all be learning about our past and our history. That's it. Thank you, and go learn about COBOL.