Monday, September 11, 2006

Exhaustive Testing

The most exhaustively tested program that I know of still had a serious bug when it was released. But this story has a happy ending.

Background: One night circa 1960 a group of Pacific Union College undergraduate students was playing the UCLA Executive Game in the Data Processing Laboratory, when a member of one of the teams spotted an anomaly on his team's balance sheet. The numbers didn't add up. In particular, the bottom line was not exactly the sum of the numbers it was supposed to total. It was off by one. [1]

Initial investigation: Off-by-one errors are not uncommon in programming, but they generally do not show up as errors in arithmetic, particularly in integer arithmetic. Suspicion first fell on the hardware, but running the hardware diagnostics did not reveal any faults, and the error proved to be quite repeatable for those inputs. Next, the Executive Game program was suspected, but the relevant section of code was a single block add of the numbers, and it was hard to see how that could be wrong. So, finally, attention fell on the binary-to-decimal conversions.

It wasn't necessary to convert all the numbers by hand (remember, this was before pocket calculators), since inspection of the program's memory revealed that one of the binary numbers was odd and its corresponding decimal printout was even. This should not happen when converting integers! There must be something wrong with the Bendix-supplied conversion routine, and, sure enough, it failed a stand-alone test with that number as input.

At this point, we felt we had "scored" against Bendix, since the routine was used in hundreds of G-15 installations. However, it would be even more satisfying to tell Bendix what the error was, and better still to provide the fix. And of course it was natural to wonder whether other integers were incorrectly converted, and if so, how many.

Diagnosis: One of the students took up the challenge. He had a binary number that didn't convert correctly, and the source code for the conversion routine, which was only a few dozen instructions of straight-line code, so he could step through it and check the results after each instruction against a hand calculation. He found that the final multiplication--to obtain the least-significant decimal digit--left a fractional part that was very close to, but not quite, one. Had it been just a little larger, there would have been a carry into the integer part, and the final digit of the converted number would have been correct.

Working backwards, he found that the initial division was not programmed for the 58 word times required to produce a full 28-bit quotient. Presumably, the original programmer wanted to speed up the routine and had reasoned that all 28 bits weren't needed, since 10^7 is less than 2^24. However, there were a few unfortunate bit patterns in the truncated quotient that could lead to that loss of a final carry [2]. The smallest integer resulting in such a bit pattern had five digits, which is probably why no one had noticed the problem before.
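The carry-loss mechanism can be illustrated with a deliberately crude model. This is not the actual G-15 arithmetic--in this model a truncated quotient fails for nearly every input, while the real routine failed for only a few thousand--but the mechanism is the same: the residual fraction stops just short of one, and the final carry never happens.

```python
def convert(n, frac_bits, round_up):
    """Toy model: form n / 10**7 as a fixed-point binary fraction,
    then peel off seven decimal digits by repeated multiplication
    by ten.  Not the actual G-15 arithmetic."""
    q, r = divmod(n << frac_bits, 10**7)
    if round_up and r:
        q += 1                         # model a divide that never comes up short
    digits = []
    for _ in range(7):
        q *= 10
        digits.append(q >> frac_bits)  # integer part of the product
        q &= (1 << frac_bits) - 1      # fractional part feeds the next digit
    return int("".join(map(str, digits)))

n = 4_321_099                          # an arbitrary odd example value
assert convert(n, 28, round_up=True) == n        # exact
assert convert(n, 28, round_up=False) == n - 1   # final carry lost: odd in, even out
```

Note that an odd input comes out even, just as in the balance-sheet anomaly.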

Constraints: The "obvious solution" was simply to extend the division to 58 word times. This was tried immediately, and generated a correct result for the problem case. Although this was strong evidence that the problem had been correctly diagnosed, it was not an acceptable fix for the routine, because it would have cost an extra drum revolution per call. (The longer division would not be complete before the next instruction came off the drum.) So something more clever would be needed, satisfying some difficult constraints:
  • It could not take an extra drum revolution.
  • The locations of the integer and the fraction entry points could not be changed, since they were hardcoded into calling programs.
  • The exit must happen at the original time, to avoid costing an extra drum revolution in the calling program.
  • The new routine must use only (a subset of) the drum locations used by the original, since calling programs were free to use the remaining locations for their own purposes.

As another programmer who believed he could improve any program [3], he set himself the challenge of meeting all these constraints while still generating a correct conversion. And he produced a program that, by his analysis, surely did so.

Testing: Presumably the original programmer had also convinced himself that the original routine was correct. What if there were some other "corner case" that caused the new routine to fail? Before submitting the new routine to Bendix, it seemed prudent to test it stringently.

He devised the following test methodology: Write a loop that counted from 0 to 9,999,999, both in binary and in BCD [4], converted the binary to BCD, and compared the results. This would take weeks to run, but should surely provide incontrovertible evidence of the correctness of the routine. He set it up so that it could be run in chunks, whenever the machine was not otherwise occupied [5]. Whenever someone needed the machine, they'd log how far the test had gotten, and when the machine was free again, it would be restarted at that point.
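In modern terms, the harness might be sketched as follows (the conversion routine here is just a stand-in, and the loop bound is shortened):

```python
def bcd_increment(digits):
    """Advance the independently kept decimal (BCD) counter by one."""
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] < 9:
            digits[i] += 1
            return
        digits[i] = 0

def convert(n):
    """Stand-in for the binary-to-decimal routine under test."""
    return [(n // 10**p) % 10 for p in range(6, -1, -1)]

bcd = [0] * 7
mismatches = []
for n in range(1, 100_000):            # the real run counted clear to 9,999,999
    bcd_increment(bcd)
    if convert(n) != bcd:
        mismatches.append(n)
assert mismatches == []
# Note that n never goes negative -- exactly the case the test missed.
```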

After weeks of testing, the loop finally reached 9,999,999 without having encountered a single conversion error, and we were all convinced that the new routine was ready for the big time.

Submission: The routine was carefully prepared on DPL (rather than Bendix) coding sheets, and everything was carefully documented, in the hopes that Bendix would distribute it unmodified to all G-15 installations, enhancing the fame of the DPL. And sure enough, shortly after submission, we received a copy of the distribution that went to all sites, and it was on our coding sheets. We were all elated.

OOPS: Soon after Bendix's distribution, the programmer received an urgent telephone call: "Were you aware that your new routine drops the sign of negative numbers?" (He wasn't, but it did.)

Moral: The most exhaustive test I've ever witnessed failed to uncover a significant flaw. Furthermore, had the 10,000,000 test cases been supplemented by any one of the other 9,999,999 values in the domain of the routine, the error would have been caught instantly.

In the spirit of one of Alan Perlis's aphorisms [6]:

If you have ten million test cases, you probably missed one.

Happy ending: Applying even more cleverness, the student was able to produce a program that met all the constraints, and correctly converted both positive and negative integers. But he did not attempt another exhaustive test before submitting it.

[1] By focusing on the least-significant digit, he was probably calling his executive qualifications into question. :-)
[2] "Few" is of course relative here. As I recall, it was only a few thousand values out of ten million.
[3] This was not me, although I was surely just as cocky.
[4] Since the BCD counting was just part of the test harness, it could be optimized for simplicity and clarity, rather than speed or space, making it unlikely that it would fail in a fashion that correlated in any way with the conversion routine.
[5] Sort of like SETI@home.
[6] "If you have a procedure with ten parameters, you probably missed some."


G-15 Binary-to-Decimal Conversion

On a machine with binary arithmetic, conversion of numbers from decimal form to binary is easy; conversely, on a machine with decimal arithmetic, it is conversion to decimal that is easy. The Bendix G-15 had binary arithmetic internally, but most input and output was done in decimal. [1] The Bendix Computer Division [2] provided standard subroutines for single- or double-precision (7 or 14 digits) [3] conversion in each direction for both integers and fractions.

An integer whose magnitude is less than 10^7 can be exactly represented in either form, so conversion could be exact. However, this is not true of fractions. There is no finite binary fraction that exactly equals decimal 0.1, and no 7-digit decimal fraction that exactly equals hexadecimal 0.01 (1/256). Some rounding was inevitable, but it was highly desirable that converting a decimal fraction to binary and then converting it back would yield the original decimal value.

Conversion of a binary fraction to decimal was quite straightforward using available machine operations. The first decimal digit of the fraction could be obtained by multiplying the fraction (in binary) by u, the G-15's hex digit for ten, and extracting the integer part of the result. [4] The remaining digits could be computed in the same way from the remaining fractional parts. Since each multiplier was just a four-bit number, each multiplication required only eight word times. [5]
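A sketch of that digit extraction, using a Python integer as the 28-bit fixed-point word:

```python
def fraction_digits(f, frac_bits=28, ndigits=7):
    """f is a binary fraction in [0, 1) stored as an integer with
    frac_bits fractional bits; returns its first decimal digits."""
    digits = []
    for _ in range(ndigits):
        f *= 10                        # the short multiply by u (ten)
        digits.append(f >> frac_bits)  # integer part of the product
        f &= (1 << frac_bits) - 1      # remaining fractional part
    return digits

# binary 0.01 (= 1/4) is decimal 0.25
assert fraction_digits(1 << 26) == [2, 5, 0, 0, 0, 0, 0]
```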

It might seem that it would be even easier to convert integers than fractions, but it wasn't. A digit-at-a-time strategy would have required division, to extract the low-order digit as the integer mod 10, and then further divisions of the subsequent quotients. But such divisions could not be short-cut like the multiplications, since the full precision results would be needed. Furthermore, the G-15's division algorithm did not provide any quick way to access the remainder. The solution adopted by the Bendix programmers was to divide the integer by the binary value corresponding to 10,000,000 (or to 100,000,000,000,000)--thereby converting it to a binary fraction--and then use the appropriate fraction conversion routine. [6]
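The whole integer path can be sketched the same way. Rounding the quotient upward is this sketch's assumption for keeping the conversion exact--it is not a documented detail of the Bendix routine:

```python
def int_to_decimal(n, frac_bits=28):
    """0 <= n < 10**7.  Divide by 10**7 to turn n into a binary
    fraction, then extract digits the way the fraction routine does.
    The ceiling division is an assumption of this sketch, not a
    detail of the actual G-15 routine."""
    q = -((-n << frac_bits) // 10**7)   # ceiling division
    digits = []
    for _ in range(7):
        q *= 10
        digits.append(q >> frac_bits)   # next decimal digit
        q &= (1 << frac_bits) - 1       # keep the fractional part
    return digits

assert int_to_decimal(1_234_567) == [1, 2, 3, 4, 5, 6, 7]
```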

Here's an anecdote about testing a binary-to-decimal conversion routine.

[1] Actually in a form of binary coded decimal (BCD), using hexadecimal digits restricted to the range 0-9. I.e., the decimal number 1,234,567 would be encoded as the hexadecimal number 1234567, equivalent to the binary number 0001 0010 0011 0100 0101 0110 0111 (equal to the decimal number 19,088,743).
[2] Coincidentally also abbreviated BCD.
[3] Note that 9,999,999 decimal is only 98967z in hex, so there were more than four "extra" bits in the single-precision binary representation of seven-digit decimal numbers, and more than eight for double-precision representation of fourteen digits.
[4] The G-15 had a special command for extracting bits from the product register, surely designed just for this purpose.
[5] And since a four-bit multiply shifted the multiplier register right just four bits, the entire sequence of seven multipliers could be loaded as the single word, represented in hex as uuuuuuu.
[6] And I never learned of a superior approach.


Sunday, September 10, 2006

The UCLA Executive Game

One of the early applications of computer simulation to business education was an "Executive Game" developed at UCLA. Multiple teams of "executives" would make financial decisions for imaginary firms competing in the business of manufacturing and selling widgets. [1] At the start of the game, each team was given an (identical) profit and loss sheet and balance sheet from the previous quarter for the "firm" they were taking control of.

For each fiscal quarter (usually about a quarter-hour of real time), each team would decide six values:
  1. Selling price
  2. Manufacturing volume
  3. Investment in plant
  4. Advertising budget
  5. Research & Development budget
  6. Dividend

These would be entered into the computer, and based on what the team had decided, what its competitors had decided, and a hidden variable representing general economic conditions, the computer would calculate sales, expenses, and profits for each firm, and print out new P&L and balance sheets. The simulation was a gross simplification of reality:

  • The initial condition of N identical single-product firms constituting the entire market was grossly unrealistic.
  • Non-numerical factors (e.g., quality of R&D, cleverness of advertising) were not considered at all.
  • Real executives have to decide on a much larger collection of budgetary figures.

However, the formulas in the game were intended to be fairly realistic, given its simplifications. For example: advertising had a quicker effect than R&D, but the effect did not persist as long; unit manufacturing costs depended on the ratio of production to nominal plant capacity; R&D reduced manufacturing costs as well as raising sales; cost of capital depended on net worth; and so on. One aim of the game was to get the players to think strategically, rather than just quarter-to-quarter, about how they were trying to position their firm (low-cost producer, top-of-the-line product, or whatever).
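A toy model of one firm's quarter gives the flavor; every formula and coefficient here is invented for illustration (the UCLA report's actual equations are not reproduced in this post), and only the qualitative relationships are modeled:

```python
def quarter(price, volume, advertising, rnd, plant, goodwill, know_how):
    """One hypothetical fiscal quarter for one firm.  All constants
    are made up; only the qualitative behavior matters here."""
    # Advertising acts quickly but fades; R&D "know-how" builds slowly and lasts.
    goodwill = 0.5 * goodwill + advertising
    know_how = 0.95 * know_how + 0.2 * rnd
    # Unit cost depends on utilization (production vs. nominal capacity)
    # and is reduced by accumulated R&D.
    utilization = volume / plant
    unit_cost = 10.0 * (1.0 + (utilization - 0.9) ** 2) - 0.001 * know_how
    # Demand rises with goodwill and know-how, falls with price.
    demand = (goodwill + know_how) * 50.0 / price
    sold = min(volume, demand)
    profit = sold * price - volume * unit_cost - advertising - rnd
    return profit, goodwill, know_how
```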

UCLA used the game to train its own business students, and also ran sessions for real executives (presumably at real executive prices--but that would include the donuts and the posh conference room). They also published a technical report completely explaining the internal workings of the game. [2]

Someone passed a copy of this report on to one of the students working at the DPL, with the suggestion that this might be a good way to interest the college's Business & Accounting Department in using the computer. Of course, UCLA's version was written for a different computer, but the report was clear enough, and the formulas were simple enough, that it was not too hard to produce a G-15 version of the program.

The DPL proudly invited the B&A Department's faculty and students to try out the new game. They came, politely played for an afternoon, and were never seen in the DPL again. [3]

However, the denizens of the DPL and their friends took to the game. It was a lot higher-tech than Monopoly, and probably somewhat more realistic--at least, it was my first exposure to P&L statements and balance sheets. We quickly discovered that the game gave no credit at all for paying dividends (they were just subtracted from cash on hand) [4], so we only had five decisions to make, instead of six.

The game also made no attempt to simulate the stock market, so share prices were not an issue, and there were no stock options. [5]

[1] This was long before the Web and web browsers; the term was probably adopted from the popular board game Monopoly.
[2] The principles of the Open Source Movement were being practiced long before it acquired its Capital Letters.
[3] But I did manage to get a substantial column-inch credit in my Journalism course by writing a long story about it that was published in the Napa Register.
[4] One wonders whether this was just an oversight, or was somehow prescient.
[5] I did not become a virtual millionaire until the bubble.


Sunday, September 03, 2006

When I first met Niklaus Wirth

In 1965 Stanford University formed the first Computer Science Department west of the Mississippi, chaired by Prof. George Forsythe. [1]

I applied for admission in the spring of 1966. To make sure that I was more than just a name on paper when my application was considered, I decided to visit Stanford and talk to some of the professors. A friend helped me set up some interviews. I remember only one of them: my interview with Prof. Niklaus Wirth. [2]

We started with some inconsequential chit-chat [3]. But then Wirth asked me how much I knew about Algol 60. Since my previous experience with Algol was limited to one unsuccessful attempt to use Algol 58, and one lecture by Peter Naur on Algol 60, this did not seem to be a good time to bluff. So I admitted that I knew very little about Algol, but added that I'd been using FORTRAN II for almost a year, and understood that they were very similar.

It was immediately apparent that this was the wrong thing to have said. Wirth practically spluttered that there was no resemblance between FORTRAN and Algol 60. FORTRAN was a cobbled-together collection of features, and Algol 60 was a formally defined language that could be made the object of scientific study. [4] I don't recall that we managed to communicate anything else of importance that day.

Fortunately, this interview didn't prevent me from being accepted by the department (though I did not get Wirth as my advisor). As a graduate student I learned that the "Revised Report on the Algorithmic Language Algol 60" was practically holy writ, to be studied deeply and internalized; memorized, if possible. [5] And I did come to understand why Wirth felt the languages were so different. To quote Sir Charles Antony Richard (Tony) Hoare, Algol 60 was "a language so far ahead of its time that it was not only an improvement on its predecessors but also on nearly all its successors."

[1] There were already faculty and students specializing in computing within the Mathematics Department, but forming a separate department was a watershed event, for Stanford and for the discipline.
[2] This was before Wirth had earned his reputation as the world's only professional programming language designer.* He had just published a pair of papers on Euler, but PL360, Algol W, Pascal, Modula, and Oberon were all in the future.
*Wirth not only designed and implemented several important and influential languages, he kept learning from experience (his own and others), and repeatedly published critiques of his own languages, something that amateurs seldom bother to do.
[3] Which failed to uncover that each of us had started his programming career using the Bendix G-15D computer.
[4] "The report was a breakthrough in establishing programming languages as objects of scientific study." -- 2005 Turing Award Citation.
[5] Bill McKeeman, one of Stanford's first couple of Ph.D.s in Computer Science (the other was Raj Reddy), was my first advisor at Stanford. He told me that to satisfy his foreign language requirement he had been asked to translate a portion of the German version of the Report to English--and that he had relied much more on his knowledge of the Report than on his knowledge of German. Alas, I was given a German statistics journal to translate for my test.