In this paper I propose a critical reading of a famous piece of Fortran code, which erroneously has been held responsible for the failed launch of Mariner I, the first American space probe, in 1962. After explaining how numerous 'urban myths' have been constructed around this piece of code — which, when executed, leads to an infinite loop — I turn to a deeper investigation of its functioning with the aim of showing how, more generally, some unpredictable consequences of software-based technologies can be viewed as the result of what I propose to call the 'impossible linearization of code'.
I focus on the way in which a Fortran compiler works and I examine some of the political and cultural assumptions implicit in the process of compilation. I show in what ways the misguided introduction of a dot ('.') rather than a comma (',') into the source program - a human error on the part of the programmer - escapes the compiler and how, ultimately, the apparent perfection of this line of code leads to uncontrollable consequences and potentially to disaster. Thus, I advance the proposal of understanding code as a process of unstable linearization.
Drawing on the work of theorists such as Bernard Stiegler and Jacques Derrida, but also on classics of the theory of programming languages, I show how the functioning of code is based on the fundamental assumptions about writing, calculability and instrumentality that characterize Western thought on technology, and how, nevertheless, code unavoidably exceeds and escapes such assumptions. Ultimately, I argue that such a problematizing reading of code can be beneficial in the context of Critical Code Studies, since it enhances our capacity to engage with software-based technologies deeply on both the cultural and the political level.
One of the most famous bugs in software — so famous that in the late 1980s professionals started wondering whether it was in fact just an urban myth — is the following FORTRAN statement:
DO 10 I = 1.10
...
10 CONTINUE
This legendary piece of code is widely quoted in the technical literature as well as on the many websites devoted to the discussion of software errors. When executed, it leads to an infinite loop, which has erroneously been held responsible for the failed launch of Mariner I, the first American space probe, in 1962. In a posting dated 28 Feb 1995 on http://www.rchrd.com/Misc-Texts/Famous_Fortran_Errors John A. Turner debunks the myth of the FORTRAN error by referring to a previous discussion posted by Dan Pop on the newsgroup http://comp.lang.fortran on Mon, 19 Sep 1994.[1] To give but one example of the many pieces of narrative circulating on the Internet in relation to this bug, Turner mentions how one of the Frequently Asked Questions on http://alt.folklore.computers is "III.1 - I heard that one of the NASA space probes went off course and had to be destroyed because of a typo in a FORTRAN DO loop. Is there any truth to this rumor?". Also, Dieter Britz is reported to have posted the following question on http://comp.lang.fortran: "[i]t is said that this error was made in a space program program, and led to a rocket crash. Is this factual, or is this an urban myth in the computer world?". Other incorrect (but often sharp and colorful) versions circulate, such as those posted by craig@umcp-cs ("[t]he most famous bug I've ever heard of was in the program which calculated the orbit for an early Mariner flight to Venus. Someone changed a + to a - in a Fortran program, and the spacecraft went so wildly off course that it had to be destroyed") and by Sibert@MIT-MULTICS ("it is said that the first Mariner space probe, Mariner 1, ended up in the Atlantic instead of around Venus because someone omitted a comma in a guidance program"), both reported by Dave Curry in his piece on 'Famous Bugs' on http://www.textfiles.com/100/famous.bug. [2]
The legend results from the confusion of two separate events: Mariner I was in fact destroyed on 22 July 1962, when it started behaving erratically four minutes after launch, but this erratic behaviour was due to a combination of hardware and software failures (Ceruzzi 1989: 202 f.), while the DO-loop piece of code was identified (and corrected) at NASA in the summer of 1963, during the testing of a computer system. In fact, according to the story originally recounted by Fred Webb on http://alt.folklore.computers in 1990 and subsequently reported by Turner, in that summer a team of which Webb was part undertook preliminary work on the Mission Control Center computer system and programs. Among other tests, an orbit computation program that had previously been used during the Mercury flights was checked for accuracy, and the person conducting the tests came across a statement in the form of 'DO 10 I=1.10'. [3]
Webb writes:
"This statement was interpreted by the compiler (correctly) as:
DO10I = 1.10
The programmer had clearly intended:
DO 10 I = 1, 10
After changing the `.' to a `,' the program results were corrected to the desired accuracy." (http://www.rchrd.com/Misc-Texts/Famous_Fortran_Errors, last accessed on 23 May 2009 at 12.01 GMT)
Apparently, the program's answers had been accurate enough for the sub-orbital Mercury flights, so no one suspected a bug until the programmers tried to make it more precise in anticipation of later orbital and moon flights. Thus, the error seems to have been found under harmless circumstances and was never the cause of any actual failure of a space flight.
What I want to emphasize by retelling this story is how the misguided introduction of a dot ('.') rather than a comma (',') into the source program - a human error on the part of the programmer - resulted in a perfectly 'correct' program statement which nevertheless potentially led to unforeseen consequences. But in what ways could such a mistake - that is, the presence of a vicious piece of code capable of generating a potentially fatal infinite loop in a mission-critical software system - remain undetected for so long? In order to explain this point it is worth spending a few words on what happens after a piece of code has been written and before it can be executed, and particularly on the concept of the 'compiler'.
The FORTRAN piece of code I am analysing here is part of a much larger computer program - that is, of a text generally inscribed in a form that takes into account the space of the page (even if the page is visualized on a screen) and that respects the conventional direction of (Western) reading. However, from the point of view of the theory of formal languages the program is one continuous string, with blanks and semicolons coded as ASCII characters that are also part of the string. In fact, the use of new lines and tabulations has the sole function of making the program more easily readable for the human eye. In order to be executed the program needs to undergo not just one but many transformations. As Alfred V. Aho, Monica S. Lam, Ravi Sethi and Jeffrey D. Ullman remark in their classic handbook on formal languages, "[b]efore a program can run, it first must be translated into a form in which it can be executed by a computer. The software systems that do this translation are called compilers" (Aho et al. 2007: 1). For the sake of clarity, let me now follow Aho et al. (2007) in their informal and intuitive explanation of compilers. They define a compiler as follows:
Simply stated, a compiler is a program that can read a program in one language -- the source language -- and translate it into an equivalent program in another language -- the target language. ... An important role of the compiler is to report any errors in the source program that it detects during the translation process. (Aho et al. 2007: 1)
A compiler is thus an automaton (or, simply put, a piece of software) that 'reads' any program previously written in the high-level programming language for which the compiler has been designed (the 'source language') and re-inscribes it as binary code (the 'target language') - a fundamental step toward the stage of execution, when the binary code will actually become a sequence of opening and closing circuits. The terms 'read' and 'translate' used in the above passage are metaphors for the material re-inscription of the source program in a different form - a process of re-inscription that is technically called 'compilation'. But how does this re-inscription work? What does it do?
The authors establish that the process of compilation can be broken down into different phases. They write:
The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as output a token ... that it passes on to the subsequent phase, syntax analysis. (Aho et al. 2007: 5f.)
This passage uses all the rhetorical devices that are common in technical descriptions of programs: it switches from terms denoting temporal "phases" of execution of the program ("lexical analysis") to terms denoting agency of (parts of) the program itself ("lexical analyzer"). Yet, what this quote makes clear is that the process of compilation consists first of all of detecting 'interruptions' in the string of the program (such as blanks and semicolons) and of re-grouping the symbols of the string into smaller sub-strings called "lexemes". It must be noted that the compiler is an automaton that follows the rules of production of a specific grammar - and it is this grammar that establishes what works as an interruption in the source program (for instance, a blank space or a ';' works as an interruption, but an 'a' does not). The symbol of interruption therefore ruptures the chain of symbols and functions as a discriminating tool for re-grouping the other alphabetical symbols into new groups ("lexemes"). For each lexeme, the phase of lexical analysis produces a "token" - in other words, each lexeme is re-inscribed as a slightly different string with a few explanatory symbols added. At this point, the initial single string of the source program has been completely re-inscribed. I want to emphasize here how the lexical analyzer also works on the basis of linearization and discreteness: its function is to "read" the ruptures in the source program, to make them work as ruptures, to make them function in the context of the computer so that the originary string is transformed into a different string (the sequence of tokens). The lexical analyzer works with the rupture in order to re-inscribe the string. The ruptures (the blanks, the semicolons) disappear from the initial string as symbols - that is, as characters with their own binary encoding in the computer memory - and they are re-inscribed literally as the "tokenization" of the program. They are actually what enables such tokenization.
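The re-inscription described above can be made concrete with a minimal sketch in Python. It is an illustration only, not the lexical analyzer of any actual compiler: the toy language, the token names ("number", "id", "op") and the function name tokenize are assumptions introduced here. Blanks, new lines and semicolons are declared as "ruptures": they decide where one lexeme ends and the next begins, but they are not themselves carried over into the sequence of tokens.

```python
import re

# Hypothetical token classes for a toy language (not FORTRAN, not a real compiler).
TOKEN_SPEC = [
    ("number", r"\d+(?:\.\d+)?"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+\-*/]"),
    ("rupture", r"[ \t\n;]+"),  # blanks, new lines, semicolons: separators only
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Re-inscribe one continuous string as a list of (token name, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise ValueError(f"unexpected character {source[position]!r} at {position}")
        # A rupture ends the previous lexeme but leaves no token of its own.
        if match.lastgroup != "rupture":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


print(tokenize("rate = rate + 60;\nx = 1.5"))
```

The printed list contains eight pairs, from ('id', 'rate') to ('number', '1.5'). None of them corresponds to a blank, a new line or a semicolon, although the grouping of the remaining characters depends entirely on where those symbols stood in the original string.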
Starting from the sequence of tokens produced in the phase of lexical analysis, another re-inscription is carried out. The authors continue: "The second phase of the compiler is syntax analysis or parsing. The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation that depicts the grammatical structure of the token stream" (Aho et al. 2007: 8). Again, it is not important to understand all the technicalities of this passage. What is important is that the tokens are 'rearranged' once more according to the syntactical rules of the grammar on which the language is based, with the help of the additional symbols inscribed into the tokens in the previous phase. The way this happens varies physically: obviously no graphical representation of a tree is actually 'depicted' in the memory of the computer. The parse tree is also a rebuilt string: in other words, the sequence of tokens is regrouped in order to be then analyzed by the "semantic analyzer", which in turn checks the consistency between the tokens and produces the so-called "intermediate code". Aho et al. (2007) explain: "In the process of translating a source program into target code, a compiler may construct one or more intermediate representations, which can have a variety of forms" (Aho et al. 2007: 9). Such intermediate code is a further re-inscription of the source code and a further transformation of the ruptures inscribed in the source program - ruptures that actually make it possible for code to function.
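To give a sense of what this second re-inscription may look like, here is a minimal sketch in Python of a parser for a toy assignment statement. Everything in it is an assumption made for illustration: the two-level grammar, the hand-written list TOKENS and the function name parse_assignment do not come from Aho et al. or from any FORTRAN compiler. The "tree" is nothing more than a nested tuple, that is, the same tokens regrouped according to the rules of the grammar.

```python
# Tokens as (name, lexeme) pairs, written by hand here; in a compiler they
# would be the output of the lexical analysis phase.
TOKENS = [("id", "x"), ("op", "="), ("id", "a"), ("op", "+"),
          ("id", "b"), ("op", "*"), ("number", "60")]


def parse_assignment(tokens):
    """Regroup a flat token sequence into a nested tuple (a 'tree')."""
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else (None, None)

    def advance():
        nonlocal position
        position += 1

    def operand():
        name, lexeme = peek()
        if name not in ("id", "number"):
            raise SyntaxError(f"expected an operand, found {lexeme!r}")
        advance()
        return lexeme

    def term():  # term -> operand ('*' operand)*
        node = operand()
        while peek() == ("op", "*"):
            advance()
            node = ("*", node, operand())
        return node

    def expression():  # expression -> term ('+' term)*
        node = term()
        while peek() == ("op", "+"):
            advance()
            node = ("+", node, term())
        return node

    name, target = peek()
    if name != "id":
        raise SyntaxError("expected an identifier on the left-hand side")
    advance()
    if peek() != ("op", "="):
        raise SyntaxError("expected '='")
    advance()
    tree = ("=", target, expression())
    if position != len(tokens):
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


print(parse_assignment(TOKENS))
# ('=', 'x', ('+', 'a', ('*', 'b', '60')))
```

The nesting of the tuples records the grammatical structure (the multiplication is grouped before the addition), yet the result remains a sequence of symbols held in memory rather than a depicted tree.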
It is important to notice at this point that, while performing all these re-inscriptions of the source program, a compiler also checks the program for correctness - that is, it assesses the program's compliance with the rules of transformation defined for the specific programming language in which it was written (Aho et al. 2007: 1). A compiler can, however, only detect errors according to such rules (for instance, it can easily locate a missing ';'); it cannot anticipate whether the source program will function in an unpredictable way when executed. For instance, it cannot anticipate whether a perfectly correct portion of the source program will cause the computer to enter an infinite loop with unforeseeable consequences.
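The limit of this kind of checking can be observed with a short sketch in Python, which uses Python's built-in compile function as a stand-in for a compiler; it is not a FORTRAN compiler, and the helper name accepted_by_compiler is an assumption introduced here. The first source text breaks a rule of the grammar (a missing ':', comparable to the missing ';' mentioned above) and is rejected. The second is well formed and is accepted without complaint, although it would never terminate if it were executed.

```python
def accepted_by_compiler(source):
    """Return True if Python's own compiler accepts the source text."""
    try:
        compile(source, "<sketch>", "exec")  # translation only; nothing is executed
        return True
    except SyntaxError:
        return False


malformed = "while True\n    pass\n"   # missing ':' violates a rule of the grammar
endless = "while True:\n    pass\n"    # well formed, yet it would loop forever if run

print(accepted_by_compiler(malformed))  # False
print(accepted_by_compiler(endless))    # True
```

Neither text is run at any point: the check concerns only the compliance of the string with the rules of the language, not the behaviour that the string will produce at run time.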
I want to emphasize here that a compiler works on the basis of what the French anthropologist André Leroi-Gourhan calls the linearization of language. According to Leroi-Gourhan (on whose theory Jacques Derrida famously comments in Of Grammatology) the emergence of alphabetic writing must be understood as a process of linearization (Leroi-Gourhan 1993: 190). To understand the concept of linearization better, one must start from Leroi-Gourhan's concept of language as a "world of symbols" that "parallels the real world and provides us with our means of coming to grips with reality" (195). For Leroi-Gourhan graphism is not dependent on spoken language, although the two belong to the same realm. Leroi-Gourhan views the emergence of alphabetic writing as associated with the technoeconomic development of the Mediterranean and European group of civilizations. At a certain point in time during this process writing became subordinated to spoken language. Before that - Leroi-Gourhan states - the hand had its own language, which was sight-related, while the face possessed another one, which was related to hearing. He explains:
At the linear graphism stage that characterizes writing, the relationship between the two fields undergoes yet another development. Written language, phoneticized and linear in space, becomes completely subordinated to spoken language, which is phonetic and linear in time. The dualism between graphic and verbal disappears, and the whole of human linguistic apparatus becomes a single instrument for expressing and preserving thought - which itself is channelled increasingly toward reasoning. (Leroi-Gourhan 1993: 210)
By becoming a means for the phonetic recording of speech, writing becomes a technology. It is actually placed at the level of the tool, or of 'technology' in its instrumental sense. As a tool, its efficiency becomes proportional to what Leroi-Gourhan views as a "constriction" of its figurative force, pursued precisely through an increasing linearization of symbols. Leroi-Gourhan calls this process "the adoption of a regimented form of writing" that opens the way "to the unrestrained development of a technical utilitarianism" (212).
Expanding on Leroi-Gourhan's view of phonetic writing as "rooted in a past of nonlinear writing", and on the concept of the linearization of writing as the victory of "the irreversible temporality of sound", Derrida relates the emergence of phonetic writing to a linear understanding of time and history (Derrida 1976: 85). For him linearization is nothing but the constitution of the "line" as a norm, a model - and yet, one must keep in mind that the line is just a model, however privileged. The linear conception of writing implies a linear conception of time - that is, a conception of time as homogeneous and involved in a continuous movement, be it straight or circular. Derrida draws on Heidegger's argument that this conception of time characterizes all ontology from Aristotle to Hegel - that is, all Western thought. Therefore, and this is the main point of Derrida's thesis, 'the meditation upon writing and the deconstruction of the history of philosophy become inseparable' (86).
However simplified, this reconstruction of Derrida's argument demonstrates how, in his rereading of Leroi-Gourhan's theory, Derrida understands the relationship of the human with writing and with technology as constitutive of the human rather than instrumental. Writing has become what it is through a process of linearization - that is, by conforming to the model of the line - and in doing so it has become instrumental to speech. Since the model of the line also characterizes the idea of time in Western thought, questioning the idea of language as linear implies questioning the role of the line as a model, and thus the concept of time as modelled on the line. It also implies questioning the foundations of Western thought (by means of a strategy of investigation that Derrida names "writing in general", or "writing in the broader sense"). In fact, if we follow Derrida's reworking of Leroi-Gourhan's thought, a new understanding of technology (as intimately related to language and writing) entails a rethinking of Western philosophy - ambitious as this task may be.
However, to return to the process of software compilation, a compiler can be viewed as an automaton that examines the text of the program (for instance, our lines of FORTRAN code), detects "interruptions" in it and re-groups the symbols belonging to the FORTRAN string into smaller sub-strings called "tokens". As we have seen, the introduction of a dot ('.') in the FORTRAN code we are examining results in a perfectly 'correct' program statement which can be "tokenized" by the compiler according to the rules of the FORTRAN grammar. However, when it functions as a rupture in the FORTRAN string, the dot leads to a very different tokenization from the one that would be obtained if the dot were replaced by a comma. With a comma, the string would be broken down by the compiler into the following tokens:
DO
10
I
=
1
10
With a dot, the string is instead broken down into the following tokens:
DO10I
=
1.10
Both sequences of tokens are correct according to the rules of substitution of FORTRAN. However, when executed, the first sequence of tokens leads to the repetition of the loop ten times, while the second leads to an infinite loop -- that is, to a repetition of the loop which goes on forever.[4] Thus, ultimately, a perfectly linearized string results in the perfect execution of a sequence of actions that potentially leads the computer system to disaster. In other words, execution (or, in the technical jargon, 'run time') is the moment in which apparently successful linearization leads to uncontrollable consequences.[5] Linearization does not therefore ensure the perfect calculability of the consequences of technology (and, broadly speaking, of the future).
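The two tokenizations listed above can be reproduced with a minimal sketch in Python. It is a deliberately reduced model and not a FORTRAN lexical analyzer: the two regular expressions and the function name tokenize_statement are assumptions introduced for illustration, and they cover only the two statements discussed here. The sketch relies on one property of the fixed-form FORTRAN of the period, namely that blanks inside a statement are not significant, which is why 'DO 10 I' can be read as the single name 'DO10I'. As in the lists above, the comma works only as a rupture and is not emitted as a token.

```python
import re

# Assumed, simplified shapes of two statement kinds; a real FORTRAN grammar is far richer.
DO_LOOP = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=(\d+),(\d+)$")
ASSIGNMENT = re.compile(r"([A-Z][A-Z0-9]*)=(\d+(?:\.\d+)?)$")


def tokenize_statement(statement):
    """Return the tokens of one statement, treating blanks as insignificant."""
    packed = statement.replace(" ", "")  # blanks are dropped before any grouping
    loop = DO_LOOP.match(packed)
    if loop:
        label, counter, first, last = loop.groups()
        # The comma separates the two bounds and leaves no token of its own.
        return ["DO", label, counter, "=", first, last]
    assignment = ASSIGNMENT.match(packed)
    if assignment:
        variable, value = assignment.groups()
        return [variable, "=", value]
    raise SyntaxError(f"statement not recognized: {statement!r}")


print(tokenize_statement("DO 10 I = 1, 10"))  # ['DO', '10', 'I', '=', '1', '10']
print(tokenize_statement("DO 10 I = 1.10"))   # ['DO10I', '=', '1.10']
```

Both statements are accepted; no error is raised in either case. The difference of a single character is enough to move the string from one rule of the grammar to another, and the sketch has no way of knowing which of the two the programmer intended.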
Understanding software bugs as failed instances of the linearization of (programming) languages, and understanding software in general as an unstable process of linearization, can lead to many important consequences. Firstly, it shows that a bug is a decision. To understand this point better, it must be kept in mind that software debugging is a late stage of software development, and is part of what in Software Engineering is generally called the 'test phase' (Sommerville 1995). Before being released to commercial users, a software system needs to be tested - i.e., it is necessary to verify that the system meets its specifications, or that it works as expected. When a test reveals an anomalous (or unexpected) behaviour of software, code must be inspected in order to find out the origin of the anomaly - namely, the particular piece of code that performs in that unexpected way. Code must then be corrected in order to eliminate the anomaly. The testing process takes time because all the functions of the system need to be tested. Furthermore, sometimes the correction of an error introduces further errors or inconsistencies into the system and generates more unexpected behaviour.
Although in the phase of testing unexpected behaviour is generally viewed as an error, it is worth noting that decisions must still be made at this level. The testing team is responsible for deciding whether the unexpected behaviour of the system must be considered an error or just something that was not anticipated by the specifications but that does not really contradict them. Errors need to be fixed (by correcting code), but non-dangerous (and even useful and welcome) anomalies can just be allowed for and included in the specifications. Thus, the activity of deciding whether an anomaly is an error introduces changes in the conception of the system, in a sustained process of iteration. In the case we are examining, the bug itself is discovered and identified as a bug because a number of previous decisions have been made - namely, the decision to start Project Mercury (made by NASA and the US government), leading to the re-examination and re-evaluation of the Mission Control Center software (and particularly of this specific orbit computation program) according to newly established standards of accuracy.
Even more importantly, the above analysis of FORTRAN code emphasizes the horror with which we generally look at the unexpected consequences of technology. To give but one example, in his recollections about the so-called software crisis of the late 1960s, Brian Randell writes:
I still remember the ABM debate vividly, and my horror and incredulity that some computer people really believed that one could depend on massively complex hardware and software systems to detonate one or more H-bombs at exactly the right time and place over New York City to destroy just the incoming missiles, rather than the city or its inhabitants. (Randell 1979: 5)
Clearly, Randell's horror at the excessive self-confidence of some software professionals stems from the connotative association between technology, catastrophe and death in a cold-war scenario. "Horror" - a powerful emotion - is the result of the anticipation of the consequences of technology combined with the awareness of its intrinsic fallibility (Frabetti 2010). In a way, it could be said that we are so terrified by the unexpected consequences of technology that, even when there are none (as in the case of the innocuous FORTRAN bug), we invent them. As I have argued earlier on, the intrinsic fallibility of software relies on linearization - and so do the alphabet, writing, and our conception of time. But linearization is always unstable, always impossible. And yet, it is necessary for software, as well as for writing and for technology, to exist - or, as Bernard Stiegler would have it, to be externalized. Perhaps what we see with horror is not the 'genie in a bottle' scenario of the atomic experiment gone wrong but the constitutive capacity of technology to escape calculation and control, its ability to be much more than an instrument for us. What we see with horror (or what we would rather not see) is the prostheticity, or the originary technicity, of what we consider 'the human': never a master of its tools, always constituted through and with them.
[1] Turner's posting is dated 28 Feb 1995 23:31:39 GMT. Pop's originary posting is dated Mon, 19 Sep 1994 at 10:56:50 GMT. See http://www.rchrd.com/Misc-Texts/Famous_Fortran_Errors (last accessed on 23 May 2009 at 12.01 GMT).
[2] Dave Curry's posting gives a detailed account of the research originally started by John Shore on "documented reports on 'famous bugs'". See http://www.textfiles.com/100/famous.bug (last accessed on 23 May 2009 at 12.19 GMT).
[3] Project Mercury's sub-orbital flights took place in 1961 and its orbital flights began in 1962.
[4] Technically, the FORTRAN statement containing the comma means 'repeat the action (represented by "... CONTINUE") while increasing the value of the counter "I" from 1 to 10, then stop', while the statement containing the dot means 'assign the value 1.10 to the variable "DO10I", then go on repeating the action "... CONTINUE" forever'. However, there is no need to resort to an explanation based on meaning to clarify the different executions of the two different pieces of code, because the two different instances of tokenization are sufficient to lead to different configurations of the circuitry of the system, resulting in different computer behaviours.
[5] The circular time of the loop is still linear time. This makes it even clearer how linearization cannot completely rule out the unexpected.
Aho, Alfred V., Lam, Monica S., Sethi, Ravi, and Ullman, Jeffrey D. Compilers: Principles, Techniques, and Tools. London and New York: Pearson and Addison-Wesley, 2007.
Ceruzzi, Paul. Beyond the Limits: Flight Enters the Computer Age. Cambridge, MA: MIT Press, 1989.
Derrida, Jacques. Of Grammatology. Baltimore: The Johns Hopkins University Press, 1976.
Frabetti, Federica. "'Does It Work?': The Unforeseeable Consequences of Quasi-Failing Technology". Culture Machine 11 (2010) http://www.culturemachine.net/index.php/cm/article/view/388/409.
Leroi-Gourhan, André. Gesture and Speech. Cambridge: MIT Press, 1993.
Randell, Brian. "Software Engineering in 1968". Proceedings of the IEEE 4th International Conference on Software Engineering. Munich: 1-10, 1979.
Sommerville, Ian. Software Engineering. Harlow: Addison-Wesley, 1995.
Stiegler, Bernard. Technics and Time, 1: The Fault of Epimetheus. Stanford, CA: Stanford University Press, 1998.