XPL

XPL is a programming language based on PL/I, and a portable one-pass compiler written in its own language, and a parser generator tool for easily implementing similar compilers for other languages. XPL was designed in 1967 as a way to teach compiler design principles and as starting point for students to build compilers for their own languages.

XPL was designed and implemented by William McKeeman and David B. Wortman at University of California, Santa Cruz and James J. Horning and others at Stanford University. XPL was first announced at the 1968 Fall Joint Computer Conference. The methods and compiler are described in detail in the 1971 textbook 'A Compiler Generator'.

They called the combined work a 'compiler generator'. But that implies little or no language- or target-specific programming is required to build a compiler for a new language or new target. A better label for XPL is a Translator Writing System. It helps you write a compiler with less new or changed programming code.

The XPL language is a simple, small, efficient dialect of PL/I intended mainly for the task of writing compilers. The XPL language was also used for other purposes once it was available. XPL can be compiled easily to most modern machines by a simple compiler. Compiler guts can be written easily in XPL, and the code is easy to read. The PL/I language was designed by an IBM committee in 1964 as a comprehensive language replacing Fortran, Cobol, and Algol, and meeting all customer and internal needs. These ambitious goals made PL/I complex, hard to implement efficiently, and sometimes surprising when used. XPL is a small dialect of the full language. XPL has one added feature not found in PL/I: a STRING datatype with dynamic lengths. String values live in a separate text-only heap memory space with automatic garbage collection of stale values. Much of what a simple compiler does is manipulating input text and output byte streams, so this feature helps simplify XPL-based compilers.

The XPL compiler, called XCOM, is a one-pass compiler using a table-driven parser and simple code generation techniques. Versions of XCOM exist for different machine architectures, using different hand-written code generation modules for those targets. The original target was System/360.

XCOM compiles from XPL source code, but XCOM itself is written in XPL too. It is thus a self-compiling compiler, not reliant on anyone else's compilers. Several famous languages have self-compiling compilers, including Burroughs B5000 Algol, PL/I, C, LISP, and Java. Creating such compilers is a chicken-and-egg conundrum. The language is first implemented by a temporary compiler written in some other language. XCOM began as an Algol program running on Burroughs machines, translating XPL source code into System/360 machine code. Someone manually turned its Algol source code into XPL source code. That XPL version of XCOM was then compiled on Burroughs, creating a self-compiling XCOM for System/360 machines. The Algol version was then thrown away, and all further improvements happened in the XPL version only. This is called bootstrapping the compiler. Retargeting the compiler for a new machine architecture is a similar exercise, except only the code generation modules need to be changed. Nowadays, GNU C compilers are available on nearly all platforms, and so it is feasible and often easier to write most new compilers in C or C++ than to use the compiler's own language. Only the C compiler needs to be self-compiling.

The XCOM compiler has a hand-written lexical scanner and a mechanically-generated parser. The syntax of the compiler's input language (in this case, XPL) is described by a BNF grammar. XPL's grammar analyzer tool XA turned this into a set of large data tables describing all legal combinations of the syntax rules and how to discern them. This table generation step is re-done only when the language is changed. When the compiler runs, those data tables are used by a small, language-independent parsing algorithm to parse and respond to the input language. This style of table-driven parser is generally easier to write than an entirely hand-written recursive descent parser. XCOM uses a bottom-up parsing method, in which the compiler can delay its decision about which syntax rule it has encountered until it has seen the rightmost end of that phrase. This handles a wider range of programming languages than top-down methods, in which the compiler must guess or commit to a specific syntax rule early, when it has only seen the left end of a phrase.

XCOM originally used a now-obsolete bottom-up parse table method called Mixed Strategy Precedence, invented by the XPL team. MSP is a generalization of the simple precedence parser method invented by Niklaus Wirth for PL360. Simple precedence is itself a generalization of the trivially simple operator precedence methods that work nicely for expressions like A+B*(C+D)-E. MSP tables include a list of expected triplets of language symbols. This list grows larger as the cube of the grammar size, and becomes quite large for typical full programming languages. XPL-derived compilers were difficult to fit onto minicomputers of the 1970s with limited memories. MSP is also not powerful enough to handle all likely grammars. It is applicable only when the language designer can tweak the language definition to fit MSP's restrictions, before the language is widely used.

XCOM and XA were subsequently changed to instead use a variant of Donald Knuth's LR parser bottom-up method. XCOM's variant is called Simple LR or SLR. It handles more grammars than MSP but not quite as many grammars as LALR or full LR(1). The differences from LR(1) are mostly in the table generator's algorithms, not in the compile-time parser method. XCOM and XA predate the widespread availability of Unix and its yacc parser generator tool. XA and yacc have similar purposes.

XCOM is a one-pass compiler. It emits machine code for each statement as each grammar rule within a statement is recognized, rather than waiting until it has parsed the entire procedure or entire program. There are no parse trees or other required intermediate program forms, and no loop-wide or procedure-wide optimizations. XCOM does, however, perform peephole optimisations. The code generation response to each grammar rule is attached to that rule. This immediate approach can result in inefficient code and inefficient use of machine registers. Such are offset by the efficiency of implementation, namely, the use of dynamic strings mentioned earlier: in processing text during compilation, substring operations are frequently performed. These are as fast as an assignment to an integer; the actual substring is not moved. In short, it is quick, easy to teach in a short course, fits into modest-sized memories, and is easy to change for different languages or different target machines.

Another part of XCOM is the runtime support library for allocating and garbage-collecting XPL string values. This library must be included into most every program written in XPL.

The last piece of the XPL compiler writing system is an example compiler named SKELETON. This is just XCOM with parse tables for an example toy grammar instead of XPL's full grammar. It is your starting point for building a compiler for some new language, if that language differs much from XPL.

XPL is run under the control of a monitor, XMON, which is the only operating system-specific part of this system. The originally published XMON was optimized for IBM 2311 disk drives, and was most efficient on those drives. XMON is about 50 percent efficient on IBM 2314 disk drives, and is dramatically less efficient on subsequently introduced disk drives, such as the IBM 3330, 3330-1 or 3350. Converting XMON from the use of primitive NOTE, POINT and READ/WRITE disk operations to EXCP or XDAP yields a dramatic increase in disk utilization efficiency, and a corresponding reduction in operating system overhead.

XPL is open source. The System/360 version of XPL was distributed through the IBM SHARE users organization. Other groups ported XPL onto many of the larger machines of the 1970s. Various groups extended XPL, or used XPL to implement other moderate-sized languages. NASA's HAL/S was implemented via XPL.

Another Translator Writing System of the 1970s is Frank DeRemer's proprietary MetaWare, which was used in several industrial-grade compilers.