So our final speaker today will be Brian Lennon, who is Associate Professor of English and Comparative Literature at Penn State University. And the title of the talk is "Can Multilingualism Be Simulated?" Okay. As I hope will be clear, while what I have to say today has any number of points of contact with the two papers that you just heard, I think it would also be useful for you to think of it as complementary to the presentation by our colleague, Rohini Srihari, this morning, insofar as I'm going to be filling in some, not all, but some of the intellectual and institutional history of the kind of work that Rohini was talking about this morning. Today I'm going to talk about a kind of multilingualism that, while of course deeply social in itself, is quite different from the socially individuated and mostly existential multilingualism whose literary expressions, or more often frustrations, I wrote about in a book published in 2010 entitled In Babel's Shadow: Multilingual Literatures, Monolingual States. The multilingualism that I'm going to talk about today is the product not of the complexity of human social life as such, but of interesting breakdowns in the use of computers to attempt to manage that complexity, and particularly the complexity of linguistic confusion. So what you're looking at on the left-hand screen is the result of an experimental mechanical translation from German into English of a scholarly review of a book on a topic in mathematics. This text appeared in an essay by Victor Yngve, included in a volume entitled Machine Translation of Languages, published in 1955 and containing revised versions of papers presented at the first international conference on machine translation, convened at the Massachusetts Institute of Technology in 1952.
Now, for those of you who are unfamiliar with what I find to be an absolutely fascinating episode in the early history of computing, it's useful, I think, to think of machine translation as the first imagined cultural, rather than strictly military, application of the arithmetic computing machines developed by the United States and the UK for use in cryptanalysis and ballistics calculations during the Second World War. Much of what we now call computational linguistics and artificial intelligence has its origin in early work on machine translation, and we might say that in many ways much of that early work was driven by the profoundly cultural power of speculation, in the imagination of fully automated, fully computerized natural language processing and production sufficiently accurate to pass the so-called Turing test by persuasively simulating the discourse of a human being in a particular national grapholect, or standardized written dialect. Now, plainly the process that produced this text on the left-hand screen has no hope of doing that, but in the period 1949 to 1966, especially in the United States and the UK, both enthusiasts and skeptics described fully automated, high-quality machine translation in positively mythic terms, as a holy grail. This is a phrase that was frequently used in the literature of the period. And what we're talking about here is entirely computerized translation of sufficient quality, in both correctness by various measures and style, so as to require no human preparation of the source text serving as input and no human editing of the target output once the process was complete. And this dream, and it really is a dream even today, had its own acronym in the literature of the period: FAHQT, fully automated, high-quality translation. In the rest of this paper, I'll also be using a common abbreviation for machine translation, which is just MT.
Now, there's a great deal that I'm going to have to skip over here in order to return to Victor Yngve and the text you have before you on the left-hand screen. What I'm eliding begins with a patent granted in the Soviet Union in 1933 and then continues in the post-war writings of Warren Weaver in his role as a director at the Rockefeller Foundation in the United States. Around 1946, Weaver began speculating about new applications for the Colossus code-breaking computers built for the Government Code and Cypher School in Bletchley, near London, suggesting that cryptanalytic techniques might be applied to the translation of natural languages. Initially the medium for this speculation was simply private correspondence with figures like the cyberneticist Norbert Wiener, to whom Weaver wrote in 1947, and these are his words from the letter: "When I look at an article in Russian, I say, this is really written in English, but it has been coded in some strange symbols. I will now proceed to decode." A memorandum entitled "Translation" that Weaver distributed to his circle of acquaintances in 1949 revisited this correspondence, referring to Claude Shannon's information theory as well as the sinologist Erwin Reifler's work on comparative semantics in English and Chinese. It also foregrounded a "war anecdote," this is Weaver's phrase, related to Weaver by William Prager, a mathematician at Brown University. The German-born Prager, who had emigrated to Turkey during the war before arriving in the United States, had encoded a sentence in Turkish, this is the anecdote Weaver's telling, for one of his colleagues in mathematics to practice a deciphering technique on. The most important point about the fact that this experiment succeeded, Weaver asserted in his memo, was that the decoding was done by someone who did not know Turkish and did not know that the message was in Turkish.
So Prager, who knows some Turkish, writes a message in Turkish and then uses some sort of enciphering technique, the details are not given, in order to encode the message. Then he gives it to a colleague of his who is practicing a deciphering technique. The colleague succeeds in deciphering the message but is then crestfallen because it makes no sense, and brings it back to Prager saying this didn't work. Prager looks at it and says that when you break the text back up into words and correct for the transliteration he used for some letters, the message is in Turkish. So the first conclusion that Weaver drew from this was that a logical basis for all languages might be accessed using cryptanalytic techniques, and this conclusion was very quickly discredited. Nevertheless, and this is why I describe much of this work as being driven by the profoundly cultural power of speculation, Weaver's memo was galvanizing, and by the end of 1949 research groups had been formed at MIT, UCLA, and the University of Washington, where a team was led by Reifler, the most prominent of a very few MT researchers whose training was in a discipline other than mathematics and engineering. Starting in 1950, Reifler, who appears to have been the first to respond in writing to Weaver's memo, circulated a series of papers entitled Studies in Mechanical Translation, using his credentials as a scholar of comparative semantics, a translator, and a teacher of Chinese and German as foreign languages to advocate for MT from a humanist perspective. The essay by Victor Yngve that included the text on the left-hand screen was entitled "Syntax and the Problem of Multiple Meaning," and it's a good example, I think, of work that balanced speculative optimism with pragmatism and a sense of humor in dealing with obstacles. For various reasons, including real hardware limitations, much of the earliest work on MT
had focused on crude word-by-word dictionary translation, and Yngve's essay is in some ways an attempt to mediate conflict between the theoretical and perfectionist MIT approach, which was devoted to the long-term goal of FAHQT, and the empirical and operational approach of Reifler's group at the University of Washington, which merely sought to produce usable translations. Yngve began by observing what he called the remarkable fact, these are his words, that most of the languages of interest for mechanical translation divide a section of discourse such as a sentence into about as many words as English does. Furthermore, he continued, these are his words still, words of various languages can be found that have substantially the same meaning as certain English words. For this reason, he suggests, word-for-word translations are surprisingly good, tantalizingly good, and we might as well take them as an acceptable first step. Noting, however, that any given input word may have several meanings in the output language, Yngve admitted that polysemy, especially conspicuous to the translator, is an issue in nearly every spoken or written utterance, insofar as meaning in natural language is profoundly dependent on context. This, he explained, had led him to think of context as a kind of repository for, and these are his words, information necessary for the resolution of the multiple-meaning problem, to be extracted from that repository. Hypothesizing that the sentence was the proper unit of analysis, since it is likely to contain enough information to resolve most of the multiple-meaning problems, he described an experiment in the partial translation of a book review from German into English, conducted manually using index cards to build up a dictionary of German-English word equivalents.
Rather than concealing the grammatical meaning of the German original with an imperfect translation, Yngve explained, this partial translation left German word order and grammatical particles, including inflectional word endings, intact in the output. Yngve observed that, and these are his words, people who knew a little German grammar, after they had recovered from their mirth, demonstrated that they were able to understand quite well and fairly rapidly what was being said, while those who knew no German at all were able to grasp only the subject matter from the translated stems and not much else. This, he concluded, suggested that a viable solution to the translation of grammatical meaning is needed. Meanwhile, because, as he put it, and these are his words, slight knowledge of the input language helps the reader a great deal, it was desirable for those who would need to read MT output to obtain basic knowledge of the source language through a brief introductory course. Now, this is important because it points to the issue of language acquisition, which I would say formed a kind of absent presence in MT research until it was brought front and center by a report that would produce the nearly complete collapse of research funding in the field after 1966, and this is something to which I'll return at the very end of my talk. So in my book In Babel's Shadow I juxtapose this text that you've been looking at on the left-hand screen with a passage from a novel entitled Between, by Christine Brooke-Rose. This is on the right-hand screen, and I know that this is a novel that some of you in the audience are familiar with. One of the things I found myself doing in that book was speculating myself about what is similar and what is different about these two texts, beyond what's merely obvious about the difference.
The text given by Yngve on the left-hand screen is, we could say, a prototype of the kind of incompletely translated output that an ordinary civilian, at least, will sometimes obtain from even the best freely available, non-specialized machine translation engines today, although of course it's no longer going to be anywhere near this crude. As such, we might want to say that the text given by Yngve on the left-hand screen marks a gap between algorithmic, computational processing and the human uses of language, and that it thus represents a kind of simulated multilingualism. I'm using the word "simulated" both in the ordinary sense and also a bit mischievously, in order to turn the tables on the operational multilingualism of much work in natural language processing, the way that it often presumes a certain national monolingualism, and of course one bit of the work that Rohini Srihari was showing us this morning contradicts that and represents an advance beyond that. The text by Christine Brooke-Rose on the right-hand screen is, by contrast, an artifact of literary expression, specifically the literary expression of, say, a multilingual human self possessing the privilege of a certain level of education.
These two texts are entirely different in most ways, in terms of both provenance and purpose, right? The text given by Yngve on the left-hand screen is a representation of a kind of failure in relation to the real goals of the work that produced it, while Christine Brooke-Rose's text represents what many literary critics and scholars might want to call a virtuosic literary style. But I think that in the monolingual context that both texts address, we can say that both texts serve as incitements to multilingualism, or at least to language acquisition, even if both options are very narrowly circumscribed indeed. And there's a whole host of other issues here that I'm simply not addressing in this paper, which have to do with the history of the European empires and of philology and Orientalism, and which all bear on the topic of machine translation, but not in a way that I have time to address today. So I think that in the monolingual context that they both address, both of these texts serve as incitements to multilingualism, or at least to language acquisition, however narrowly circumscribed we may want to see those options as being. This is why I emphasized Yngve's conclusion, apropos this text on the left-hand screen, that those who need to read machine translation output should obtain basic knowledge of the source language through, as he put it, a brief introductory course. And for that matter, if you can read the text on the right-hand screen in its entirety, you can also choose to read Christine Brooke-Rose's narrator here as invested in the translational dynamic equivalence or commensurability of languages just as much as in their difference, right? She says at one point, three quarters of the way through that text: "as if languages loved each other behind their own façades, despite alles was man denkt darüber davon dazu." So these texts are complementary in that sense as well, or symmetrical.
In his memo entitled "Translation," Warren Weaver had placed MT in the service of an imperial internationalist ideal, describing the multiplicity of human languages as a worldwide translation problem, his phrase, that impedes cultural interchange between the peoples of the earth and is a serious deterrent to international understanding, also his words. Speculating about invariable properties statistically common to all languages, Weaver invoked the philologist and Orientalist Max Müller and, apparently unaware of Müller's contempt for them, onomatopoetic, echoic "bow-wow" theories of the origin of human language, suggesting that all human beings had identical vocal organs producing similar ranges of sounds, and these are Weaver's words, with minor exceptions such as the glottal click of the African native. Phonological and graphic correlations between words in Chinese and English had been demonstrated by Erwin Reifler, Weaver noted, while Hans Reichenbach, a founder of the Berlin Circle, who, these are Weaver's words, had also spent some time in Istanbul and, like many of the German scholars who went there, was perplexed and irritated by the Turkish language, had discovered common features of the basic logical structures of otherwise very different languages. Describing the deep use of language invariants as the most promising approach of all to MT, Weaver imagined languages as towers erected on a common foundation with an open basement, and translation as a traversal of that basement rather than shouting from tower to tower. In another very well-known short text produced during the same period, entitled "The New Tower," Weaver described computer engineers as building a new Tower of Anti-Babel. The years from 1949 to 1960 were in many ways a golden age for MT
in the United States, defined by such liberal technocratic optimism and by a series of disciplinary advances, if not necessarily advances in technical implementation. During this period the journal Mechanical Translation was founded to support a growing mass of important publications, and a public demonstration of Russian-to-English MT in 1954 at IBM's Technical Computing Bureau in New York would help secure easy access to generous government, military, and private funding even before the Sputnik crisis of 1957. Although this 1954 demonstration, showcasing the work of the MT group at Georgetown University, has recently been called contrived and a fraud, at the time it clearly marked a surge forward. Now, these words, "contrived" and "fraud," appear in a recent memoir by Anthony Oettinger, who began working on MT as a Harvard undergraduate in 1949, went on to produce the first doctoral dissertation on MT in 1954, and then went on to lead Harvard's MT group for some time afterward. In this recent memoir Oettinger recalls that when he joined the Automatic Language Processing Advisory Committee of the National Academy of Sciences, convened in 1964 to assess progress on MT, these are his words, "I knew that I was probably going to end up by taking my own research field down the drain, but I already had the firm conviction that MT was not going anywhere and that it made no sense to perpetuate a fraudulent belief that something might be achieved." Oettinger describes a culture of casinoized grantsmanship, with both U.S. and Soviet researchers engaged in, these are Oettinger's words, a kind of amiable conspiracy to extract money from their respective governments, playing each other off with various experiments and demonstrations that sometimes bordered on fraud. The committee's report, issued in 1966, was deeply skeptical of researchers' claims that MT
was needed to help process Russian-language technical literature, observing that the present supply of human translators greatly exceeds the demand and that, this is a direct quotation from the report, "there is no emergency in the field of translation." It stated flatly that to date, without recourse to human translation or editing, there has been no machine translation of general scientific text, and none is in immediate prospect, and it observed that after eight years of work the Georgetown group could still not produce output usable without human post-editing. Finally, it noted that in some cases, these are the words from the report, it might be simpler and more economical for heavy users of Russian translations to learn to read the documents in the original language, adding that many U.S. scientists already did just that, that instructional resources were available for those inclined to make use of them, and that acquiring basic reading facility in Russian was not likely to divert disablingly large quantities of researchers' time. Apropos the labor cost of using human translators to post-edit MT output, it quoted Robert T. Beyer, a physicist at Brown University, who observed, and these are Beyer's words: "I found that I spent at least as much time in editing as if I had carried out the entire translation from the start. Even at that, I doubt if the edited translation reads as smoothly as one which I would have started from scratch. I drew the conclusion that the machine today translates from a foreign language to a form of broken English somewhat comparable to pidgin English. But it then remains for the reader to learn this patois in order to understand what the Russian actually wrote. Learning Russian would not be much more difficult." Now, this is as far as I have time to go today with this passage, which I would say juxtaposes the simulated multilingualism of failed machine translation with the acquired multilingualism that it would seem to encourage.
Two final notes in conclusion. The first is that the impact of the ALPAC committee's report was really in many ways nothing less than devastating. By 1968 the Association for Machine Translation and Computational Linguistics had dropped machine translation from its name, as the ten U.S. research groups that were active in 1963 dwindled to three, with research virtually shut down in the UK and also significantly reduced in other contexts in which it had been growing, like Japan and the Soviet Union. The second thing is that, as some of you know, this is not the end of the story by any means, because new work on machine translation tied to much more modest goals eventually emerged after 1975 and 1976 under the sponsorship of the European Communities, and thus the so-called winter of machine translation was really a relatively brief one. But I think that we can say that it took the ALPAC report, and this nearly complete collapse of both public credibility and research funding, to get machine translation researchers to move beyond what I would call the metaphysical objective of fully automated high-quality translation and to resign themselves to a durable human-computer symbiosis. It's been pointed out that it was only after the ALPAC report, in subsequent work on interactive human-computer translation workstations, that professional translators were invited to join MT research efforts as translators, rather than as models for their computer surrogates or post-editors of their output. Thank you.