Automatic Compilation of Logical Specifications into Efficient Programs

Abstract

We describe an automatic programmer, or “compiler” which accepts as input a predicate calculus specification of a set to generate or a condition to test, along with a description of the underlying representation of the data. This compiler searches a space of possible algorithms for the one that is expected to be most efficient. We describe the knowledge that is and is not available to this compiler, and its corresponding capabilities and limitations. This compiler is now regularly used to produce large programs. 1. Int reduction This work is motivated by a desire to help programmers do their job better, i.e., more easily and quickly create and modify programs that are more efficient, correct and understandable. Our approach follows the well-travelled route of supplying a “higher level language” which allows a programmer to say more of what he wants the machine to do and less of the details of how it is to be done. This leads to programs that are shorter, easier to understand and modify, and contain fewer bugs. However, higher level languages tend to degrade efficiency, since their compilers fail to make many optimizations that a human might make. In fact, some optimizations cannot even be expressed in the higher level language. Our higher level language, APS, is an extension of lisp in which programs can be written with much less commitment to particular algorithms or data representations. This is a benefit to the degree (which we believe is quite large) that programmers spend their effort dealing with these issues. One way to avoid thinking about data representation is to use a single powerful representation for all data. To some extent this is the approach of APL [Pakin 681, PROLOG [Clocksin 841 and Relational Databases [Ullman 821. This unfortunately results in a large performance penalty. APT, SETL [Schonberg 811 and MRS [Genesereth 811 provide the illusion of a uniform relational representation, but avoid the penalty by representing different relations with different data structures. APT goes further by accepting ‘*specifications” that contain compound well-formed formulas (wffs). Its compiler assumes the responsibility of finding good algorithms to ‘This research was supported by the Defense Advanced Research Projects Agency under contract No. MDAtXX3 81 C 0335. Mews and conclusions contained in this report are the authors’ and should not be interpreted as representing the official opinion or policy of DARPA, the U.S. Government, or any person or agency connected with them. implement these specifications. Another advantage of multiple representations, though not the focus of this paper, is that new things can be regarded as data, e.g., the + function may be regarded as a relation which is true of infinitely many tuples. APT compiles uses of such relations into computations rather than data structure accesses2 The compilation process is not entirely done by the machine. The programmer must provide a small amount of “annotation”, which is not part of the actual specification. The most important annotations tell AP5 how to represent the data. If the predefined representations are inadequate, he can add new ones. The compiler searches a space of algorithms appropriate for the given representations. Its job is to find the most efficient one that meets the specification. of course, the variablity of the representations makes this job much more difficult. This paper describes how the compiler does its job. The following programming methodology seems to work well for APT: First the programmer writes a specification. Next he adds annotations that select general (but inefficient) data representations. The result is compiled into a prototype that can be tested on small examples. (Even specifications contain bugs!) Finally he optimizes the prototype by changing the most critical annotations. 2. Example Suppose we want a program to find the nephews of an input person. The data is a set of people along with the sex and parents of each. For simplicity we define a nephew as the son of a sibling (not including the son of a brother-inlaw or sister-in-law), and we define siblings to include halfbrothers and half-sisters. Figure 2-1 shows a sample lisp program for this task, and figure 2-2 shows a corresponding AP5 program. The APT program contains two wffs, one as a definition for the Sibling relation and the other as a specification of the objects desired as output. Wffs are represented in prefix notation. The quantifiers tl and 3 are represented by the symbols All and Exists. The connectives, A, V, -, etc. are represented by the symbols “and”, “or“, “not”, etc. Comparison of these two programs reveals that the lisp program specifies both a data representation and an algorithm while the APT program only specifies the result to 2Computations can also express the intent of recursive definitions, which AP5 prohibits because they allow multiple interpretations. 20 / SCIENCE From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved. (Defun list-nephews (person) ; Algorithm: ; (1) get the parents of person, ; (2) get their children, excluding the . input person (siblings) i (3) get their children (nephews and nieces) ; (4) filter out the nieces ; (5) remove dupl icate nephews ; Representation: ; persons are symbols (with property lists) ; ii": x 'parents) is a list of x’s parents 'children) is a list of x's children ; (git i 'sex) is x's sex (remove-duplicates (loop for parent in (get person 'parents) nconc (loop for sibling in (get parent 'children) unless (eq person sibling) nconc (loop for nephew in (get sibling *children) when (eq (get nephew 'sex) 'male) ect nephew))))) -isp program to find nephews co1

Topics

    0 Figures and Tables

      Download Full PDF Version (Non-Commercial Use)