= INTERCAL IMPLEMENTOR'S NOTES =
Eric S. Raymond <esr@thyrsus.com> & Alex Smith <ais523(hello(spambots)\))@nethack4.org>

The C-INTERCAL compiler has a very conventional implementation using YACC and
LEX (latterly, bison and flex).  It generates C, which is then passed to your C
compiler.

== Lexical issues ==

The spectacular ugliness of INTERCAL syntax requires that the lexical
analyzer have two levels.  One, embedded in the input() function,
handles the backquote and bang constructs, and stashes the input line
away in a buffer for the splat construct's benefit.  The upper level
is generated by lex(1) and does normal tokenizing for YACC.

The upper level of the lexer also does a bit of syntax analysis, by
distinguishing it from an ordinary COME FROM and passing different
symbols to the parser. The lexer is also responsible for helping the
parser to match sparks and ears in complicated array situations, by
distinguishing possibly closing sparkears from definitely opening
sparkears.

In order to splatter erroneous statements correctly, the generated code has to
track both logical and physical lines.  That's the reason for the 'lineno'
variable in the generated C, it's actually tracking physical lines.

Numeral tokens for input are defined in a symbol table (numerals.c) that is
directly included in the run-time library module (cesspool.c).  This avoids
having to put the the size of the numerals array in an extern.  To add new
numeral tokens, simply put them in the numerals initializer.

== Compilation ==

The parser builds an array of tuples, one for each INTERCAL statement.  Most
tuples have node trees attached.  Once all tuples have been generated, the
compile-time checker and optimizer phases can do consistency checks and
expression-tree rewrites.  Finally, the tuples are ground out as C code by the
emit() function.

Calculations are fully type-checked at compile time; they have to be because
(as I read the manual) the 16- and 32-bit versions of the unary ops do
different things.  The only potential problem here is that the typechecker has
to assume that :m ~ :n has the type of :n (32-bit) even though the result might
fit in 16 bits.  At run-time everything is calculated in 32 bits.  When
INTERCAL-72 was designed 32 bits was expensive; now it's cheap.  Really, the
only reason for retaining a 16-bit type at all is for the irritation value of
it (yes, C-INTERCAL *does* enforce the 16-bit limit on constants).

Labels are mapped to tuple indices (logical line numbers) in the code checker,
just before optimization.

== Code Generation ==

Each line of INTERCAL is translated into a C if()-then; the guard part is used
to implement abstentions and RESUMES, and the arm part translates the `body' of
the corresponding INTERCAL statement.

The generated C code is plugged into the template file ick-wrap.c inside
main().  It needs to be linked with cesspool.o, fiddle.o and lose.o (these are
in libick.a, with the support for runtime switches, arrgghh.o).  Cesspool.o is
the code that implements the storage manager; fiddle.o implements the INTERCAL
operators; and lose.o is the code that generates INTERCAL's error messages.
The routine arrgghh.o parses the runtime command line arguments.

The abstain[] array in the generated C is used to track line and label
abstentions; if member i is on, the statement on line i is being abstained
from.  If gerund abstentions/reinstatements are present in the code, a second
array recording the type of each statement in generated into the runtime, and
used to ensure that these operations are translated into abstention-guard
changes on all appropriate line numbers.

RESUMES are implemented with a branch to a generated switch statement that
executes a goto to the appropriate label.  If there are no RESUMES, no such
switch is generated.

The compiler places a simple label at the location of each COME FROM in the
program, while all of the machinery for checking for abstention and conditional
execution and actually performing the jump is placed immediately after the code
for the target statement.

AIS adds: The additions to the language [since 0.24] have made code
generation potentially more complicated, depending on which switches
are on. Each INTERCAL command is still translated into an if()-then,
but with extra guards around it. COME FROM is still handled the same
way, but only in singlethreaded programs with no computed COME
FROMs. For anything more complicated than that, a general COMING FROM
mechanism is invoked. C's most confusing control-flow identifiers,
setjmp and longjmp, are used. When computed COME FROMs are involved,
after every labeled statement, a set of gotos are done to check each
in turn. A jmp_buf, cjb, is used to store the point in the program
where the check started. Each COME FROM leading to that line and
computed COME FROM anywhere in the program is checked, and modifies
cjb to aim for its suckpoint if neccesary. In the case of a
multithreaded program, one cjb will be stored in a linked ring of
threads, and the other will be the one in the main program.  Extra
guards are also added for ONCE/AGAIN; the position of this guard
depends on whether the command is a NEXT or something else. ONCE/AGAIN
are quite legal even in singlethreaded programs. Computed ABSTAIN is
dealt with by allowing the abstain[] array to hold values other than 0
or 1.  Multithreading is supported by a new library, libickmt.a, to
hold the multithread versions of libick.a functions. See unravel.c for
an explanation of the full gory details of the multithread
implementation (it has many comments explaining the implementation at
the top).

== Optimization ==

The optimizer core does full recursive folding of all constant
expressions at compile time (evaluating away all the irritating little
kluges you have to use to generate 32-bit constants).  It also checks
for known INTERCAL idioms for `test for equality', `test for
nonzeroness', and the C logical operators &, |, ^, and ~.

Here are the constant-folding optimizations from the original optimizer core:

'Vx$y'~"#0x55555555" into (x | y)
'?x$y'~"#0x55555555" into (x ^ y)
'&x$y'~"#0x55555555" into (x & y)
(x ^ 0xFFFF) or (x ^ 0xFFFFFFFF) into ~x

AIS adds: When I got my C-INTERCAL distribution, it only had a
rudimentary optimiser. Constant folding had already been done, and the
following optimisations. The following is mixed INTERCAL and C,
sometimes in the same statement. It should be clear from context and
whether I have used INTERCAL spark-ears or C brackets which language
is which, although in some places I've mixed them. Note that the
optimizer is only run in binary.

ESR adds: After 0.24, Alex Smith, assisted by Joris Huizer, wrote a
program that interprets an Optimizer Idiom Language and uses it to
generate a library implementing those optimizations to be linked to
the compiler.  This subsumes all the stuff that used to be in the
optimizer code.

AIS adds: Optimizations available with -f (I wrote all these)

Guard removal (if a line can't be abstained, the if(!abstained[x]) line is
removed when possible)

Ignorance checks (if a variable can't be ignored and it can't overflow,
assignment is done with C's = rather than a function)

Recognition of NEXT...NEXT...computed RESUME as an idiom for if() (currently
disabled because the rest of the optimiser can't handle it)

Optimizations available with -F (I wrote this one)

This optimization should mean that INTERCAL suddenly becomes the best language
for various benchmark programs. Basically, the file is analysed to try to
determine whether it takes no input and always produces the same output. If
that is the case, the output is simply a shell-script (with executable
permissions set and a #! line) that outputs the correct output! Therefore, the
time required for one run of the program is taken up during compilation, and
whenever the program is executed, it just copies out the predetermined
output. This means (for instance) that primes.i runs in only a fraction of a
second when optimized with -F. Of course, this tends to slow down the compiler
and often produces a large object file, which is why it has a command-line
option of its own.

== Credits ==

ESR wrote the first version of this compiler over a weekend using a pre-ANSI C
compiler.  It worked, but it wasn't pretty. 

Louis Howell added the array support later; he also torture-tested the
COME FROM implementation by actually using it for the life2.i program
included in this distribution, and fixed some bugs.

Brian Raiter did a much-needed delinting while porting it for ANSI C,
flex and Linux. He also improved the lexical analyzer's line tracking.

Alex Smith, assisted by Joris Huizer, did a lot of heavy-duty work
on optimization, including designing and implementing OIL, after 0.24.
