Saturday, May 26, 2007

John Backus, 1924-2007

1977 Turing Award winner John Backus died on March 17.

Long before I met him, he made two contributions that had major effects on my career--as well as on the field of computing: FORTRAN and BNF.

John's team invented and implemented the original FORTRAN for the IBM 704 computer in the period 1954-57.
In late 1953, Backus wrote a memo to his boss that outlined the design of a programming language for IBM’s new computer, the 704. This computer had a built-in scaling factor, also called a floating point, and an indexer, which significantly reduced operating time. However, the inefficient computer programs of the time would hamper the 704’s performance, and Backus wanted to design not only a better language, but one that would be easier and faster for programmers to use when working with the machine. IBM approved Backus’s proposal, and he hired a team of programmers and mathematicians to work with him.

The challenge Backus and his team faced was not designing the language, which they felt they could easily do. Instead, it was coming up with a device that would translate that language into something the machine could understand. This device, known as a translator, would eliminate the laborious hand-coding that characterized computer programming at the time. It contained an element known as a parser, which identified the various components of the program and translated them from a high-level language (one that people understand) into the binary language of the computer. [THOCP]
FORTRAN was not the very first "high-level language," but it was the first with a translator (compiler) that produced efficient programs, and the first to be widely used. Acceptance was certainly not instantaneous: I didn't have the opportunity to use it until 1965 (actually, FORTRAN II on an IBM 1620). But its effect was dramatic, radically changing the way people thought about programming (and programmers), and inspiring the inventors of literally hundreds of other languages.
No single one of those languages represented as great an advance in the state-of-the-art as FORTRAN was over the assembly languages of the day. Not COBOL. Not LISP. Not Algol. Not Pascal. Not PL/I. Not Ada. Not Simula. Not Smalltalk. Not C. Not C++. Not Java. Not ML. Not your favorite language, either. (Although cumulatively we have made a few FORTRAN's worth of progress in the past half-century.)

Relatively few implementors of these languages have devoted as much attention to producing efficient machine code as the original FORTRAN team did. A notable exception is the latest A. M. Turing Award winner,
Fran Allen.
FORTRAN provided a quantum leap in software development, comparable to what the integrated circuit did for hardware. It took a good programmer roughly as long to plan, write, and debug a given number of lines of FORTRAN code as the same number of lines of assembly code. But those FORTRAN lines corresponded to about ten times as much program functionality and ten times as many machine instructions. This radically changed the economics of software production. The "computer revolution" would have proceeded much more slowly over the past 50 years if software had cost ten times as much, if ten times as many programmers had been needed to write it, and if the pool of programmers had remained limited to those of us who thought and wrote in terms of sequences of hardware instructions.
Let me give just one simple example: The first time I taught Advanced Programming, the term project was to write a machine-language program for the Bendix G-15D that accepted four typed-in numbers, sorted them, and printed them out in increasing order. [1] Today that assignment might be suitable for the first couple of weeks of an introductory programming course, but let me assure you: My students thought it was pretty challenging.
John's second major contribution to computing was BNF. [2] In the wake of FORTRAN,
It almost seemed that each new computer, and even each programming group, was spawning its own algebraic language or cherished dialect of an existing one. Apart from local technical variations, these languages were much like others already present or under design. ... However, if we remember that improvement in process, as much as product, is a goal of computing theory, the profusion of language developments was undoubtedly beneficial. Profusion, to be beneficial, must lead to unification. ... In 1957 it seemed appropriate for ACM to support the petition of some user groups (USE, SHARE, and DUO, to be exact) for a study that would lead to a single (national) programming language that all could (would) use on their computers and in whose terms programs would be communicated between the users of all machines. The petition, dated May 10, 1957, contained recommendations, one of which suggested that ACM appoint a committee to study, and recommend action for, the creation of a universal programming language. [Alan Perlis]
Not surprisingly, John Backus was a member of this committee, which soon agreed to merge its work with that of a similar GAMM committee in Europe. At a meeting in Zurich in 1958, the joint committee produced a draft description for an International Algorithmic Language (IAL) that later came to be called ALGOL 58.

At one point, John discovered that he and Peter Naur had different understandings of the syntax of one of the constructs of the language, which led him to seek a more precise way of defining programming language syntax than the natural-language descriptions that had been used to that point. From his mathematical education, John recalled the "production systems" of Emil Post, and found that they could be used quite satisfactorily to define the syntax of the language. [3] For the next meeting of the committee, he wrote up a description of the draft language in this form. [4] As he recalled in an interview with Grady Booch,
It was a little paper and I handed it - I hand carried it because it was so late in the game, so I had these copies that I dragged to the meeting and passed out to people and nobody paid any attention to it. ... Except Peter Naur. When he came to write up the thing, he used this descriptive method, and improved it in the process.
Peter Naur wrote that "In particular, the invention by Backus of a formal notation for describing most of the syntax of the language proved to be a great help in preparing a complete description. The successful use of this formal notation for describing a language designed for actual communication (and not just as an object of study) must be regarded the most significant contribution of the ALGOL effort to the problem of documentation." Alan Perlis said that it was "not that ALGOL provided BNF, but a need to describe ALGOL provided in a sense a need for BNF and Backus gave it to us."

The syntax of ALGOL 60, expressed in "Backus Normal Form" provided the organizing principle for the hugely influential "Report on the algorithmic language ALGOL 60," edited by Naur. Each construct of the language was explained by giving its BNF syntax, some examples, and an English description of its semantics. (Since the syntax was recursive, the report had to be studied recursively, as well.) This report has been called the foundation of Computer Science.

For the first time, a programming language was a formal object, subject to scientific study. Parsing moved from a black art to a problem that could be solved in general, leading to algorithms for generating the tables controlling table-driven parsing: such as precedence, operator precedence, LL(k), Early's algorithm, Cocke's algorithm, etc. [5] [6]

The Algol 60 Report was the Bible of Computer Science when I was a graduate student. Its combination of brevity (16 pages) [7], clarity, and completeness has never been improved on. In Tony Hoare's memorable words, "Here is a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors. Of particular interest are its introduction of all the main program structuring concepts, the simplicity and clarity of its description, rarely equalled and never surpassed."

Virtually every programing language since ALGOL 60 has been described by a grammar in BNF, or, more frequently, "a somewhat extended BNF." [8] A programming language without a grammar is now almost inconceivable.

By the time I first met him, John had come to feel that FORTRAN, ALGOL, and the myriad other languages that sprang up in the '60s and '70s presented obstacles to the efficient expression of programs and to the efficient execution of programs. They imposed what he called "the von Neumann bottleneck," requiring the expression of programs as sequences of instructions to be executed in a particular order, even if the operation being implemented, say a matrix multiplication, conceptually offered a great deal of parallelism. He expressed these concerns, and an approach he was working on to deal with them, in his 1977 Turing Award address, "Can programming be liberated from the von Neumann style? A functional style and its algebra of programs." He spent much of the latter part of his career--still at IBM--refining his ideas about functional programming (FP).
There were a number of functional language projects active worldwide, so many that in 1982, together with John Hughes, I organized an IFIP working group on Functional Programming, WG2.8, and chaired it until I retired from IBM in 1996. At its largest, the IBM contingent consisted of 7 Research Staff Members, including Alex Aiken (who by the way devised an ingenious compile-time FL type checker), Ed Wimmers, Peter Lucas, and me, all of whom Backus considered colleagues. A terrific manager, who always considered his role as manager to be working For his group. You should know that he would be considerably disturbed to hear his colleagues described as "assistants." [John Williams [9], private communication]
Williams recalls
Listening to John Guttag's [1978] Marktoberdorf lectures on his 1977 CACM paper (chronicled in EWD 676) and talking with him over (several) beers, it seemed to me that his algebraic approach to adding abstract types would be a natural way to extend FP's primitive types (at that time only atoms and lists). Over the course of a few meetings, we developed an equational style of adding type definitions into FP, and determined to try to write it up.

John G. proposed that you would be a natural collaborator, and we met at Xerox PARC to enlist your help and to determine how to proceed. [10] As I recall, the paper consisted of some FP background, some algebraic abstract types background, and the FP type proposal together with a couple examples, written by respectively me, you, and us.

Backus was not much enamored of static type checking at that time. He considered it Draconian, which, in the early days before good type inference algorithms, it was. [ibid.]
The result was the paper "FP with data abstraction and strong typing," which is the closest I came to a direct research collaboration with John Backus.

On the occasions when we were together, I developed a great deal of respect for John's intelligence, wit, tenacity, intellectual curiosity, and willingness to engage as an equal. On the one hand, he was obviously proud of what "his friends had accomplished" with FORTRAN, but on the other hand, he never tried to use that to set himself above the people he was interacting with.

The world is poorer with his passing.

Other links:
IBM biography.

Transcript of FORTRAN Session at the first History of Programming Languages conference:
John Backus's presentation,
discussant's remarks,
question and answer session,
full text of all questions submitted.

Paul McJones's memories.
The Independent obituary, by Martin Campbell-Kelly.
AP obituary, by Brian Bergstein.

Updated on 5/31/07: Corrected errors and amplified the history of the FP with types paper, based on a private communication from John Williams.
Updated on 6/3/07: Added better links to documents, provided by Paul McJones.

[1] It would have been a harder problem if I had asked them to sort five numbers, rather than four, and their programs would have been significantly different.
[2] BNF originally stood for Backus Normal Form; today it is usually called Backus-Naur Form. But that's a story for another day. ("If it had not been for Naur's work in recognizing the potential of Backus's ideas and popularizing them with the ALGOL committee, Backus's work would have become virtually lost; and much of the knowledge we have today about languages and compilers would not have been acquired." Donald Knuth)
[3] Well, most of the syntax. What we would today, following Chomsky, call the context free syntax.
[4] A revised form of this paper was later published as "The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM Conference" in the Proceedings of the International Conference on Information Processing, UNESCO, Paris, June 15-20, 1959, which I have been unable to track down online.
[5] My own small contribution, Mixed-Strategy Precedence (MSP) would have gone completely unnoticed had we not used it as the basis for the XPL compiler-generator system.
[6] Meanwhile, Don Knuth formulated the general problem and solved it in a 1965 paper modestly called "On the translation of languages from left to right." Unfortunately, it took one or two decades for most of us to understand the full power and subtlety of LR(k) parsing--in fact, there are probably insights encoded in that paper that I have still not discovered. During the time that I worked on parser generators, every time I thought I'd discovered something novel and interesting about LR(k) parsers, I'd go back and re-read Knuth's paper and realize that he must have understood that to have written exactly what he wrote.
[7] Except, I guess, by the meta-circular interpreter for LISP 1.5 written in LISP 1.5, which, alas, was ambiguous.
[8] One distinguished language designer even published a paper complaining that there were too many variations on BNF being used, and proposing that everyone instead use a new one given in the paper. It had no effect.
[9] Not the guitarist John Williams, not the composer and conductor John Williams, but the former Cornell Professor of Computer Science John Williams.
[10] I was at this Marktoberdorf Summer School, but cannot now remember whether I was enlisted into the collaboration on the spot, or only after we had returned to the U.S.


Comment by Blogger Keizer:

very useful post, actually it helped a lot since i'm learning these things in my institute

3:45 PM  

Post a Comment

<< The Way It Was home