EXPERIMENTS WITH A SUPERCOMPILER

Valentin F. Turchin, Robert M. Nirenberg, and Dimitri V. Turchin

The City College, the City University of New York
New York, New York 10031

This work has been supported by National Science Foundation grant # MCS-8007565
1. INTRODUCTION
A Supercompiler Project based on the language REFAL was outlined in [1]. For an introduction to the main concepts of this project see [2], which also includes, as an appendix, the formal definition of REFAL. A full and detailed exposition is to be found in [3]. In this paper we present and discuss some examples of the operation of the first experimental model of a supercompiler. We have chosen our examples to serve as "exhibits" which highlight the most important features of the supercompiler and intimate its potential uses (this is reflected in section names). Acquaintance with REFAL is not assumed in this paper; our comments will be sufficient for a reader experienced in programming languages to understand REFAL programs (and almost to learn REFAL in the process).

A usual compiler goes through a program Ps in a source language Ls and translates constructs of Ls into equivalent constructs of the target language Lt. The definition of the source language we need in order to write such a compiler should be in the compilation mode: essentially, a list of equivalences between the constructs of Ls and Lt. The work of a supercompiler is organized differently. It requires a definition of the target language in the interpretation mode. Using this definition, a supercompiler initiates the process of partial execution of the source program Ps, during which those operations which can be completed are completed, while those which depend on the input data are set off and compiled into a target program. The technique of supercompilation described in [3] allows a deep transformation of the structure of the original program Ps, which may lead to a completely different program expressed in terms of certain generalized states of the computing machine, the "basic configurations". Another important feature of supercompilation is the use of the outside-in strategy of the evaluation of nested function calls.
Partial and outside-in evaluation have been used in the context of REFAL since 1970. Our approach is closely related to the efforts of some other researchers, done mostly in the context of LISP. For a review on partial evaluation see Ershov's paper [4]; see also the recent paper by Komorowski [5]. The outside-in strategy of evaluation is related to the delay rules by Vuillemin [6], lazy evaluation by Henderson and Morris [7], and the use of suspensions while constructing data by Friedman and Wise [8].

The source language for our supercompiler is REFAL. The target language Lt may be, in principle, any language. The core of the supercompiler is independent of the target language, and produces the output program in the form of a graph of transitions between basic configurations of the abstract REFAL machine. This graph can then be mapped on any machine. A version of the supercompiler is now being developed which uses PASCAL as the target language Lt. The model described in this paper, however, maps the graph of states and transitions back on the REFAL machine. Thus Lt is REFAL, and this model can be seen as an optimizer of REFAL programs.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1982 ACM 0-89791-082-6/82/008/0047 $00.75
The supercompiler itself is also written in REFAL. It has been run under the REFAL implementation at the CUNY/University Computer Center, on an IBM System/370 machine.

Each exhibit consists of three parts as printed by the supercompiler:

1. The original program in REFAL, which includes definitions of all necessary functions.

2. The starting configuration of the REFAL machine, which may (and almost always does) include free variables. Thus the starting configuration is in fact a new function defined through the functions of the original program. It has a standard notation C1.

3. The output of the supercompiler: an optimized configuration (function) C1. If necessary, it uses auxiliary functions, which are denoted C2, C3, etc.

2. PARTIAL EVALUATION

     1  EXHIBIT 1
     2
     3  ORIGINAL PROGRAM:
     4
     5  FA 'A' EX = 'B' K/FA/EX.
     6     SY EX = SY K/FA/EX.
     7     =
     8  =END
     9
    10  STARTING CONFIGURATION:
    11
    12  *(C1) = *(/FA/THE CAT WAS BLACK)
    13
    14  SUPERCOMPILED PROGRAM:
    15
    16  C1 = 'THE CBT WBS BLBCK'
The original program here is the definition of a recursive function /FA/ which goes through its argument (supposed to be a string of characters) and replaces every occurrence of the letter A with the letter B. The definition consists of three sentences. The first sentence, line 5, begins with the function name FA typed starting in column 1. Other lines begin with blanks in column 1, which means that they are additional sentences defining the same function. Each sentence consists of the left and right sides separated by the equality sign. The left side is a particular case of the argument, and the right side gives the expression which must replace the function call (left side) in order to make one step of evaluation.

The left side in line 5 describes the case where the argument starts with the letter A. Quoted strings of characters in a sentence, like 'A' in this case, represent themselves. The pair EX is a free variable, which stands for any expression; E is the variable type, X is its index. The right side of the first sentence includes a function call K/FA/ EX. The characters K and "." are left and right evaluation brackets. The function name in a call follows the evaluation sign K and must be delimited by slashes. The first sentence may be read in this way: if the argument of /FA/ starts with the letter A on the left, then replace the call by the letter B concatenated on the right with the result of the application of /FA/ to the remaining part of the argument. Concatenation is implied in REFAL whenever two expressions stand by one another.

In the process of evaluation, the REFAL machine tries to apply sentences in the order in which they appear in the program. Thus the second sentence (line 6) will be applied only if the first sentence was found inapplicable, i.e. if the argument does not start with A. The pair SY is a free variable which can take as its value any, but exactly one, symbol. A symbol is any character, or something delimited by (unquoted) slashes, like a function name or a number. When this sentence is applied, variable SY will take the first symbol of the argument as its value, and EX will take on the rest. The meaning of the right side is obvious.

The third sentence says that the value of /FA/ of an empty argument is empty. Line 8 indicates the end of the whole program.
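For readers who prefer a conventional notation, the three sentences of /FA/ can be mimicked in Python. This is only an illustrative sketch: the name fa and the use of Python strings for REFAL expressions are assumptions made here, and the sketch covers flat character strings only.

```python
def fa(arg: str) -> str:
    """Mimic /FA/: try the three sentences in the order they are written."""
    if arg.startswith("A"):
        # Sentence 1:  'A' EX = 'B' K/FA/EX.
        return "B" + fa(arg[1:])
    if arg:
        # Sentence 2:  SY EX = SY K/FA/EX.  (any other first symbol is copied)
        return arg[0] + fa(arg[1:])
    # Sentence 3:  empty argument gives an empty result
    return ""


print(fa("THE CAT WAS BLACK"))  # prints: THE CBT WBS BLBCK
```

The order of the if-tests plays the role of the order of sentences: the second branch is reached only when the first one does not apply.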
The starting configuration (line 12) is written, for certain technical reasons, in a somewhat different code, which is more on the mathematical side as compared to the computer-style code used to input programs. Here we do not put quotes on symbols which represent themselves; the price we have to pay for this liberty is a clumsier notation for variables and function calls. Variables S1 and EX of the computer code will be represented as *S1 and *EX, respectively, in the mathematical (meta)code. The evaluation sign K and the paired evaluation stop "." are represented by a pair of ordinary parentheses with an * preceding the left parenthesis. The starting configuration definition in line 12 can be seen as the definition of the function to be computed, which in this particular case is a function of no variables. In the computer code this definition would look like:

    C1 = K/FA/'THE CAT WAS BLACK'.

(Blanks which appear in quotes are treated as ordinary characters.)
The optimized program produced by the supercompiler (line 16) is the result of the evaluation of the starting function call, namely, the original argument with every A replaced by B.
3. OUTSIDE-IN EVALUATION STRATEGY

     1  EXHIBIT 2
     2
     3  ORIGINAL PROGRAM:
     4
     5  FA 'A' EX = 'B' K/FA/EX.
     6     SY EX = SY K/FA/EX.
     7     =
     8  FB 'B' EX = 'C' K/FB/EX.
     9     SZ EX = SZ K/FB/EX.
    10     =
    11  =END
    12
    13  STARTING CONFIGURATION:
    14
    15  *(C1*E'1') = *(/FB/*(/FA/*E'1'))
    16
    17  SUPERCOMPILED PROGRAM:
    18
    19  C1 'A' E1 = 'C' K/C1/ E1.
    20     'B' E1 = 'C' K/C1/ E1.
    21     S2 E1 = S2 K/C1/ E1.
    22     =
Function /FB/ in the original program scans the argument from left to right, like function /FA/, but the substitutions it makes are different: every letter B is replaced by the letter C. The starting configuration defines a two-pass processing of the input: in the first pass each A is replaced by B, and in the second pass each B is replaced by C. By using the outside-in strategy and looking for recurrent configurations, the supercompiler transforms this program into a more efficient one-pass algorithm. Function /C1/ scans the argument (input) only once, substituting C for A (line 19) and B (line 20), and leaving all other symbols as they were (line 21).
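The effect of this transformation can be illustrated with a small Python sketch that places the two-pass composition next to a one-pass function shaped like the supercompiled /C1/. The names fa, fb, two_pass and c1 are chosen here for illustration; the sketch shows only the result of the fusion, not the supercompilation process that derives it.

```python
def fa(s: str) -> str:
    """First pass: replace every A with B."""
    return "".join("B" if ch == "A" else ch for ch in s)


def fb(s: str) -> str:
    """Second pass: replace every B with C."""
    return "".join("C" if ch == "B" else ch for ch in s)


def two_pass(s: str) -> str:
    """The starting configuration: /FB/ applied to the result of /FA/."""
    return fb(fa(s))


def c1(s: str) -> str:
    """One pass over the input, mirroring the three non-empty cases of C1."""
    out = []
    for ch in s:
        if ch == "A" or ch == "B":   # lines 19 and 20: both become C
            out.append("C")
        else:                        # line 21: any other symbol is kept
            out.append(ch)
    return "".join(out)


for sample in ["", "ABBA", "THE CAT WAS BLACK"]:
    assert c1(sample) == two_pass(sample)
```

The intermediate string produced by the first pass is never built in c1, which is the practical gain of the one-pass form.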
4. PROGRAM SPECIALIZATION
For lack of space we shall only briefly describe one exhibit on this topic without actually listing printouts. A function K/LAST/(EL)ES. of two arguments EL and ES has been defined. It goes from right to left through the string of symbols ES and compares each symbol SX with the list EL using function K/IN/SX(EL). which gives 'T' if SX is in EL and 'F' if it is not. Evaluation of function LAST stops when the first symbol SX from EL is found in ES, or when ES is exhausted. This process is, clearly, a loop in a loop. Consider K/LAST/('+-')ES. as the starting configuration. This is a specialization of function LAST for a given list EL; an efficient program for it should be one simple loop over ES. And that is exactly the program produced by the supercompiler.
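Since the printout is not listed, the following Python sketch only illustrates the shape of the two programs. The value returned by LAST is not stated in the text, so the sketch assumes, purely for illustration, that it returns the symbol found (or an empty string); the names is_in, last and last_plus_minus are likewise hypothetical.

```python
def is_in(sx: str, el: str) -> bool:
    """Inner loop: is the symbol sx a member of the list el?"""
    for candidate in el:
        if candidate == sx:
            return True
    return False


def last(el: str, es: str) -> str:
    """General version: outer loop over es (right to left), inner loop over el."""
    for sx in reversed(es):
        if is_in(sx, el):
            return sx        # assumed result: the symbol that was found
    return ""                # es exhausted


def last_plus_minus(es: str) -> str:
    """Specialization for el = '+-': the inner loop is unrolled into two tests."""
    for sx in reversed(es):
        if sx == "+" or sx == "-":
            return sx
    return ""


for sample in ["", "abc", "a+b-c", "x-y+z"]:
    assert last("+-", sample) == last_plus_minus(sample)
```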
5. FORMAL LANGUAGE DEFINITION AND TRANSLATION
One type of program specialization is known as compilation, when we speak of formalized programming languages. To have a programming language L implemented means to have a machine which for every program P1 written in L and an input Ip for P1 produces ultimately, one way or another, an output Op of the program P1. If the implementation is in the interpretation mode, then it is just a machine which requires two inputs, in other words, a recursive function of two arguments: P1 and Ip. If the implementation is in the compilation mode, then it first converts the text P1 into a machine, or recursive function, requiring one argument Ip as input. This second recursive function is nothing else but the specialization of the first, general, recursive function for a concrete first argument P1. If the general interpreting machine, that is to say the semantics of the language L, is defined in REFAL, then an interpretive REFAL implementation provides an interpretive implementation of the language L, whereas a supercompiler will translate a text P1 in L into an efficient compiled program for the target language. We show how this works on a very simple example used in [2] and [3] to introduce basic concepts of theory (see Exhibit 3, part 1).

     1  EXHIBIT 3, PART 1
     2
     3  L  E1 ';' E2 (EA) = K/L/E2 (K/L1/E1 (EA).).
     4     E1 (EA) = K/L1/E1 (EA).
     5  L1 'CROSS' (S1 E2) (S3 E4) = S1 S3 K/L1/'CROSS' (E2) (E4).
     6     'CROSS' (E1) (E2) = E1 E2
     7     'ADD' (E1) (EA) = EA E1
Function L here is the interpreting function of a language, which we also call L. Its format is

    K/L/ EP (EA).

where EP is a program in L, and EA is the input for this program. A program in L consists of a number of statements separated by semicolons. Each statement is an operation with one parameter which is enclosed in parentheses and follows the operation name. There are only two types of operation: ADD and CROSS. Operation ADD simply concatenates its parameter to the right of the processed data, e.g., operation ADD(FISH) when applied to string CAT results in the string CATFISH. Operation CROSS(parameter) "crosses" the processed data with the parameter by putting their symbols in alternation until one of the strings is exhausted. Then it reproduces the other string, e.g., operation CROSS(AB) when applied to 12345 produces A1B2345; CROSS(ABCDE) operating on 123 yields A1B2C3DE, etc.

It takes five lines in REFAL to define this language formally. Line 3 defines the semantics of the semicolon as standing for consecutive execution of operations. Variable E1 will take the first statement as its value; in the right side we see the call of function /L1/ which applies statement E1 to the data being processed EA. Then the result becomes a new state of the data, and the rest of the program E2 is applied to it by function /L/. Line 4 works for the last (or the only) statement in the program. Lines 5 to 7, defining the way one operation works, should be easily understood. Note that since line 6 is preceded by line 5, at least one of the variables E1 and E2 in it must not start with a symbol (i.e. must be empty if there are no parentheses). Therefore we simply concatenate E1 and E2 in this case.

At this stage we recommend that the reader take the following program:

    (*)  CROSS(CAT);ADD(DOG)

and apply it to the string LION by evaluating:

    K/L/'CROSS' ('CAT') ';' 'ADD' ('DOG') ('LION').

The result should be:

    CLAITONDOG
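The interpretive definition of L can be paraphrased in Python as follows. This is a sketch under stated assumptions: a program is represented as a list of (operation, parameter) pairs rather than as REFAL text, data are flat strings, and the names l1 and run_l are chosen here to echo /L1/ and /L/.

```python
def l1(op: str, param: str, data: str) -> str:
    """Apply one statement of L to the data (counterpart of function /L1/)."""
    if op == "CROSS":
        if param and data:
            # Line 5: take one symbol from the parameter, one from the data.
            return param[0] + data[0] + l1("CROSS", param[1:], data[1:])
        # Line 6: one of the strings is exhausted, reproduce what is left.
        return param + data
    if op == "ADD":
        # Line 7: append the parameter to the right of the data.
        return data + param
    raise ValueError(f"unknown operation: {op}")


def run_l(program: list[tuple[str, str]], data: str) -> str:
    """Execute the statements one after another (counterpart of function /L/)."""
    for op, param in program:
        data = l1(op, param, data)
    return data


program_star = [("CROSS", "CAT"), ("ADD", "DOG")]
print(run_l(program_star, "LION"))  # prints: CLAITONDOG
```

The loop in run_l corresponds to lines 3 and 4 of the exhibit: the result of each statement becomes the data for the rest of the program.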
        EXHIBIT 3, PART 2

    17  STARTING CONFIGURATION:
    18
    19  *(C1*E'1') = *(/L/CROSS(CAT);ADD(DOG)(*E'1'))
    20
    21  SUPERCOMPILED PROGRAM:
    22
    23  C1 S2 S3 S4 E1 = 'C' S2 'A' S3 'T' S4 E1 'DOG'
    24     S2 S3 E1 = 'C' S2 'A' S3 'T' E1 'DOG'
    25     S2 E1 = 'C' S2 'AT' E1 'DOG'
    26     E1 = 'CAT' E1 'DOG'
As a starting configuration we chose the application of program (*), given above, to arbitrary input data. Hence the supercompiled program is the efficient program resulting from the compilation of program (*). The target language here is REFAL. Should the target language be an Algol-like language, the supercompiled program would look like (semiformally):

    IF E1 does not start with a symbol
    THEN return 'CAT' E1 'DOG'
    ELSE [chop off S2, redefine E1;
          IF E1 does not start with a symbol
          THEN return 'C' S2 'AT' E1 'DOG'
          ELSE [chop off S3, redefine E1;
                IF E1 does not start with a symbol
                THEN return 'C' S2 'A' S3 'T' E1 'DOG'
                ELSE [chop off S4, redefine E1;
                      return 'C' S2 'A' S3 'T' S4 E1 'DOG']
               ]
         ]

Note that we have succeeded in translating program (*) without writing a compiler for the language L in which (*) is written. With a supercompiler, we only have to define in REFAL each new language we want to use, and to define it in the interpretation mode, at that, which is not nearly as serious a job as writing a compiler. The supercompiler then uses this interpretive definition of a language L in order to produce an efficient (compiled) program from any input program in L. It is this application of our program that gave it the name of supercompiler.
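The semiformal Algol-like text can be made concrete in Python. The sketch below is a hand transcription of the nested IFs, not output of the supercompiler; it assumes that the input is a flat string of characters, so that "does not start with a symbol" simply means "is empty".

```python
def c1(e1: str) -> str:
    """Residual program for CROSS(CAT);ADD(DOG) applied to arbitrary input e1."""
    if not e1:                         # E1 does not start with a symbol
        return "CAT" + e1 + "DOG"
    s2, e1 = e1[0], e1[1:]             # chop off S2, redefine E1
    if not e1:
        return "C" + s2 + "AT" + e1 + "DOG"
    s3, e1 = e1[0], e1[1:]             # chop off S3, redefine E1
    if not e1:
        return "C" + s2 + "A" + s3 + "T" + e1 + "DOG"
    s4, e1 = e1[0], e1[1:]             # chop off S4, redefine E1
    return "C" + s2 + "A" + s3 + "T" + s4 + e1 + "DOG"


print(c1("LION"))  # prints: CLAITONDOG
```

No trace of the statements CROSS and ADD, or of the interpreter that dispatches on them, remains in this function: only the operations that depend on the input are left.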
6. PROBLEM SOLVING, PROGRAM INVESTIGATION, AND THEOREM PROVING
There are, however, other interesting applications. A great many of the problems that mathematics deals with are easily and naturally formulated as the inversion of recursive functions (if we do not limit the domain of a recursive function to numbers only). By inversion of a recursive function we mean finding the set of all, or some, of its arguments for which the function takes a given value. Since we always have the recursive predicate of equality at our disposal, we can assume that the function to be inverted is a predicate.

To investigate the behaviour of a computer program and to verify it, we must find answers to such questions as: do the input and output of the program obey a certain relationship, is a given type of output produced under any circumstances, etc. Clearly, all such questions can be formulated as inversions of recursive predicates defined through the recursive function representing the program. Finally, the solution of an inverse problem may be seen as the proof of a theorem.

The following example shows how an inverse problem can be formulated in REFAL and solved with a supercompiler (see Exhibit 4).
     1  EXHIBIT 4
     2
     3  ORIGINAL PROGRAM:
     4
     5  ADD (E1) '0' = E1
     6      (E1) E2'1' = K/ADD/(E1) E2. '1'
     7  EQ  ('0')'0' = 'T'
     8      (E1'1')E2'1' = K/EQ/(E1)E2.
     9      E1 = 'F'
    10  P   E1 = K/EQ/(K/ADD/('0111')E1.) '011111'.
    11  CUT 'T' = 'T'
    12  =END
    13
    14  STARTING CONFIGURATION:
    15
    16  *(C1*E'1') = *(/CUT/*(/P/*E'1'))
    17
    18  SUPERCOMPILED PROGRAM:
    19
    20  C1 '011' = 'T'
In lines 5 to 6 we define the operation of addition of "theoretical" numbers used in recursive arithmetic. In this representation symbol 0 stands for zero; the number x', which immediately follows number x, is represented by adding symbol 1 on the right. The definition of addition in REFAL almost literally repeats the well-known definition of recursive arithmetic:

    x + 0  = x
    x + y' = (x + y)'

We want now to find the number x that when added to 3 results in 5. For this purpose we define in line 10 the predicate P of one argument x which becomes true if x is the solution sought:

    P(x) := (3 + x = 5)

The definition of equality in strict REFAL takes lines 7 to 9.
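The definitions of ADD, EQ and P in Exhibit 4 can be paraphrased in Python, with numbers written as strings such as '0111' for 3. The sketch then inverts P by plain enumeration of small arguments; this brute-force search is an assumption made here for illustration and is not the method used by the supercompiler, which obtains the answer by transforming the program.

```python
def add(x: str, y: str) -> str:
    """x + 0 = x ;  x + y' = (x + y)'  on strings like '0111'."""
    if y == "0":
        return x
    if y.endswith("1"):
        return add(x, y[:-1]) + "1"
    raise ValueError("not a unary number: " + y)


def eq(a: str, b: str) -> str:
    """Equality predicate, following lines 7 to 9 of Exhibit 4."""
    if a == "0" and b == "0":
        return "T"
    if a.endswith("1") and b.endswith("1"):
        return eq(a[:-1], b[:-1])
    return "F"


def p(x: str) -> str:
    """P(x) := (3 + x = 5)."""
    return eq(add("0111", x), "011111")


# Enumerate the unary numbers 0..9 and keep those for which P is 'T'.
solutions = [n for n in range(10) if p("0" + "1" * n) == "T"]
print(solutions)  # prints: [2]
```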
The problem now is to find the argument (or arguments) of P for which it becomes true. We can try to do this by supercompiling the program for P. But first we transform P in a certain way, being guided by the following logic. We are interested only in those arguments of P for which the value is 'T'. In order to cut off as soon as possible all those branches of the program which lead to 'F', let us construct a predicate which is 'T' when P is 'T', and is undefined otherwise. We achieve this by applying function /CUT/, defined in line 11, to the result of P. This composition becomes our starting configuration. The supercompiled program tells us that there is one and only one solution to the problem: number 2.

Let us now give the supercompiler a more difficult task of subtracting binary, not unary, numbers. Function /NLZ/ = 'No Leading Zeroes' is introduced in order to have only those solutions which are free of leading zeroes.
     1  EXHIBIT 5
     2
     3  ORIGINAL PROGRAM:
     4
     5  ADDB  (E1'0')E2 SX = K/ADDB/(E1)E2. SX
     6        (E1'1')E2'0' = K/ADDB/(E1)E2. '1'
     7        (E1'1')E2'1' = K/ADDB1/(E1)E2.'0'
     8        (E1 SX) = E1 SX
     9        ()E2 = E2
    10  ADDB1 (E1'0')E2'0' = K/ADDB/(E1)E2.'1'
    11        (E1'0')E2'1' = K/ADDB1/(E1)E2.'0'
    12        (E1'1')E2'0' = K/ADDB1/(E1)E2.'0'
    13        (E1'1')E2'1' = K/ADDB1/(E1)E2.'1'
    14        (E1 SX) = K/ADDB/(E1 SX)'1'.
    15        ()E2 = K/ADDB/('1')E2.
    16  EQ    () = 'T'
    17        (E1 SX)E2 SX = K/EQ/(E1)E2.
    18        (E1)E2 = 'F'
    19  CUT   'T' = 'T'
    20  NLZ   '0' = '0'
    21        '1' EX = '1'EX
    22  P     EX = K/CUT/ K/EQ/(K/ADDB/('11011110')EX.)'100010100'..
    23  =END
    24
    25  STARTING CONFIGURATION:
    26
    27  *(C1*E'1') = *(/P/*(/NLZ/*E'1'))
    28
    29  SUPERCOMPILED PROGRAM:
    30
    31  C1 '110110' = 'T'
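The answer printed in Exhibit 5 can be checked independently with a short Python sketch. The function addb below is an ordinary ripple-carry adder on binary strings written for this illustration; it is not a transcription of the REFAL functions ADDB and ADDB1, and the search over candidates is a brute-force stand-in for what the supercompiler achieves by program transformation.

```python
def addb(a: str, b: str) -> str:
    """Add two binary numerals given as strings, digit by digit from the right."""
    digits, carry = [], 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += int(a[i])
            i -= 1
        if j >= 0:
            total += int(b[j])
            j -= 1
        digits.append(str(total % 2))
        carry = total // 2
    return "".join(reversed(digits)) or "0"


# Candidates without leading zeroes (the role played by /NLZ/): '0', '1', '10', ...
candidates = [bin(n)[2:] for n in range(512)]
solutions = [x for x in candidates if addb("11011110", x) == "100010100"]
print(solutions)  # prints: ['110110']
```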
7. IMPLEMENTING INDUCTION

The principle of induction is implemented in the supercompiler as described in [3] (see Exhibit 6).
Examining the original program in this example one can verify that function /F/ is defined (for the purpose of illustration) in such a way that its value (when it exists) is identical to its argument. The supercompiler recognizes this fact by examining the values of the function computable in a minimal number of steps, makes a generalization, and proves it by induction. Then it simplifies the program to one line (line 20).

     1  EXHIBIT 6
     2
     3  ORIGINAL PROGRAM:
     4
     5  F  'A'E1 = 'A' K/F/E1.
     6     E1 (E2) = K/F/E1. (K/F/E2.)
     7     (E1) SX = K/F/(E1). SX
     8     SA E1 SA = K/F1/SA. E1 SA
     9     =
    10  F1 S1 S2 = S2 S1
    11     S1 = S1
    12  =END
    13
    14  STARTING CONFIGURATION:
    15
    16  *(C1*E'1') = *(/F/*E'1')
    17
    18  SUPERCOMPILED PROGRAM:
    19
    20  C1 E1 = E1

While the equality

    x + 0 = x

in the recursive arithmetic is one of the axioms, the equality

    0 + x = x

is a theorem, which can be proven only using induction. Our final exhibit is the proof of this theorem by the supercompiler (see Exhibit 7).

     1  EXHIBIT 7
     2
     3  ORIGINAL PROGRAM:
     4
     5  ADD (E1) '0' = E1
     6      (E1) E2'1' = K/ADD/(E1) E2. '1'
     7  EQ  ('0')'0' = 'T'
     8      ('0')E2'1' = 'F'
     9      (E1'1')E2'1' = K/EQ/(E1)E2.
    10      (E1'1')'0' = 'F'
    11  =END
    12
    13  STARTING CONFIGURATION:
    14
    15  *(C1*E'1') = *(/EQ/(*(/ADD/(0)*E'1'))*E'1')
    16
    17  SUPERCOMPILED PROGRAM:
    18
    19  C1 E1 = 'T'
The starting configuration here is a predicate which checks the theorem for an arbitrary argument E1. The fact that it can be equivalently transformed to the trivial form in line 19 proves the theorem.

It should be added, though, that the only slightly more difficult commutativity theorem

    x + y = y + x

cannot be directly proven by a supercompiler. The reasons for this, and the way the commutativity of addition can be proven by a supercompiler using a metasystem transition, are discussed in [9].
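For reference, the induction argument behind the theorem 0 + x = x of Exhibit 7 can be written out by hand from the two defining equations of addition, x + 0 = x and x + y' = (x + y)'. The derivation below is a conventional textbook proof given for orientation; it is not a trace of the steps performed by the supercompiler.

```latex
\begin{align*}
\text{Base case } (x = 0):\quad
  & 0 + 0 = 0
  && \text{by } x + 0 = x \text{ with } x = 0, \\
\text{Inductive step } (x = y'):\quad
  & 0 + y' = (0 + y)'
  && \text{by } x + y' = (x + y)' \text{ with } x = 0, \\
  & \phantom{0 + y'} = y'
  && \text{by the induction hypothesis } 0 + y = y.
\end{align*}
```

Hence 0 + x = x holds for every number x built from 0 by repeated application of the successor operation.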
8. CONCLUSIONS

1. A supercompiler is a program which initiates a process of partial execution of a source program, during which those operations that can be completed are completed, while those which depend on the input data are set off and compiled into a target program. It is important that in this process the supercompiler should be able to determine a set of recurrent "basic" configurations of the computing machine and to build the target program as a graph in which the nodes are basic configurations, whereas the arcs are transitions between them. Our program is the first, and still rather rough, working model of a supercompiler. In addition to the aforementioned feature (which is the definition of the concept of supercompiler), our model can perform certain simple generalizations and prove them by induction.
2. The principles developed in [3] are shown to work as expected. All major examples used in [3] to introduce and illustrate concepts of theory have now been run on the computer.

3. The design of our supercompiler model is straightforward, and no attempt has been made to provide for special cases which one can encounter while performing program optimization. Yet the program shows excellent performance in very different types of optimizations and very diverse applications. To us, it demonstrates the generality and fruitfulness of the basic principles employed.
4. Examples illustrate the following applications of the supercompiler:

   • program specialization
   • program optimization
   • the use of interpretive definition of a programming language to produce efficient compiled programs
   • problem solving of different kinds, including program investigation
   • theorem proving

5. The straightforward design of our first model leaves a vast field open to improvement and specialization. First and foremost, we have not yet used the concept of metasystem transition with a supercompiler, which must, as has been shown in [3], crucially enhance its power.
Acknowledgement

The authors are greatly indebted to Charles Weldon for help in the installation and maintenance of the CUNY REFAL system.
REFERENCES

1. Turchin, V.F. A Supercompiler System based on the language REFAL. SIGPLAN Notices, 14, 2 (February 1979), pp. 46-54.

2. Turchin, V.F. Semantics Definitions in REFAL and Automatic Production of Compilers, in: Semantics-Directed Compiler Generation, Lecture Notes in Computer Science #94, Springer Verlag, 1980, pp. 443-474.

3. Turchin, V.F. The Language REFAL, the Theory of Compilation, and Metasystem Analysis. Courant Institute Report #20, New York, 1980.

4. Ershov, A.P. On the essence of translation, in: Neuhold, Editor, Formal Description of Programming Concepts, North-Holland Publ. Co. (1978), pp. 391-418.

5. Komorowski, H.J. Partial evaluation as a means for inferencing data structures in an applicative language: a theory and implementation in the case of PROLOG, Conference Record of the Ninth ACM Symp. on Principles of Prog. Lang. (1982), pp. 255-267.

6. Vuillemin, J. Correct and optimal implementation of recursion in a simple programming language. Journal of Computer and System Sciences, vol. 9, No. 3, Dec. 1974.

7. Henderson, P., and Morris, J.H., Jr. A lazy evaluator, Proc. 3rd ACM Symp. on Principles of Progr. Lang. (1976), pp. 95-103.

8. Friedman, D.P., and Wise, D.S. CONS should not evaluate its arguments, in: Michaelson and Milner, Editors, Automata, Languages, and Programming, Edinburgh Univ. Press (1976), pp. 257-284.

9. Turchin, V.F. The Use of Metasystem Transition in Theorem Proving and Program Optimization, in: Automata, Languages and Programming, the 7th Colloquium, Lecture Notes in Computer Science #85, Springer Verlag, 1980, pp. 645-657.