summaryrefslogtreecommitdiffstats
path: root/share/doc/papers/px/pxin1.n
diff options
context:
space:
mode:
Diffstat (limited to 'share/doc/papers/px/pxin1.n')
-rw-r--r--share/doc/papers/px/pxin1.n538
1 files changed, 0 insertions, 538 deletions
diff --git a/share/doc/papers/px/pxin1.n b/share/doc/papers/px/pxin1.n
deleted file mode 100644
index c70f729..0000000
--- a/share/doc/papers/px/pxin1.n
+++ /dev/null
@@ -1,538 +0,0 @@
-.\" Copyright (c) 1979 The Regents of the University of California.
-.\" All rights reserved.
-.\"
-.\" Redistribution and use in source and binary forms, with or without
-.\" modification, are permitted provided that the following conditions
-.\" are met:
-.\" 1. Redistributions of source code must retain the above copyright
-.\" notice, this list of conditions and the following disclaimer.
-.\" 2. Redistributions in binary form must reproduce the above copyright
-.\" notice, this list of conditions and the following disclaimer in the
-.\" documentation and/or other materials provided with the distribution.
-.\" 3. All advertising materials mentioning features or use of this software
-.\" must display the following acknowledgement:
-.\" This product includes software developed by the University of
-.\" California, Berkeley and its contributors.
-.\" 4. Neither the name of the University nor the names of its contributors
-.\" may be used to endorse or promote products derived from this software
-.\" without specific prior written permission.
-.\"
-.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
-.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
-.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
-.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
-.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
-.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
-.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
-.\" SUCH DAMAGE.
-.\"
-.\" @(#)pxin1.n 5.2 (Berkeley) 4/17/91
-.\" $FreeBSD$
-.\"
-.tr _\(ru
-.nr H1 0
-.NH
-Organization
-.PP
-Most of
-.I px
-is written in the
-.SM "VAX 11/780"
-assembly language, using the
-.UX
-assembler
-.I as.
-Portions of
-.I px
-are also written in the
-.UX
-systems programming language C.
-.I Px
-consists of a main procedure that reads in the interpreter code,
-a main interpreter loop that transfers successively to various
-code segments implementing the abstract machine operations,
-built-in procedures and functions,
-and several routines that support the implementation of the
-Pascal input-output environment.
-.PP
-The interpreter runs at a fraction of the speed of equivalent
-compiled C code, with this fraction varying from 1/5 to 1/15.
-The interpreter occupies 18.5K bytes of instruction space, shared among
-all processes executing Pascal, and has 4.6K bytes of data space (constants,
-error messages, etc.) a copy of which is allocated to each executing process.
-.NH 2
-Format of the object file
-.PP
-.I Px
-normally interprets the code left in an object file by a run of the
-Pascal translator
-.I pi.
-The file where the translator puts the object originally, and the most
-commonly interpreted file, is called
-.I obj.
-In order that all persons using
-.I px
-share a common text image, this executable file is
-a small process that coordinates with the interpreter to start
-execution.
-The interpreter code is placed
-at the end of a special ``header'' file and the size of the initialized
-data area of this header file is expanded to include this code,
-so that during execution it is located at an
-easily determined address in its data space.
-When executed, the object process creates a
-.I pipe ,
-creates another process by doing a
-.I fork ,
-and arranges that the resulting parent process becomes an instance of
-.I px .
-The child process then writes the interpreter code through
-the pipe that it has to the
-interpreter process parent.
-When this process is complete, the child exits.
-.PP
-The real advantage of this approach is that it does not require modifications
-to the shell, and that the resultant objects are ``true objects'' not
-requiring special treatment.
-A simpler mechanism would be to determine the name of the file that was
-executed and pass this to the interpreter.
-However it is not possible to determine this name
-in all cases.\*(Dd
-.FS
-\*(dd\ For instance, if the
-.I pxref
-program is placed in the directory
-`/usr/bin'
-then when the user types
-``pxref program.p''
-the first argument to the program, nominally the programs name, is
-``pxref.''
-While it would be possible to search in the standard place,
-i.e. the current directory, and the system directories
-`/bin'
-and
-`/usr/bin'
-for a corresponding object file,
-this would be expensive and not guaranteed to succeed.
-Several shells exist that allow other directories to be searched
-for commands, and there is,
-in general,
-no way to determine what these directories are.
-.FE
-.NH 2
-General features of object code
-.PP
-Pascal object code is relocatable as all addressing references for
-control transfers within the code are relative.
-The code consists of instructions interspersed with inline data.
-All instructions have a length that is an even number of bytes.
-No variables are kept in the object code area.
-.PP
-The first byte of a Pascal interpreter instruction contains an operation
-code.
-This allows a total of 256 major operation codes, and 232 of these are
-in use in the current
-.I px.
-The second byte of each interpreter instruction is called the
-``sub-operation code'',
-or more commonly the
-.I sub-opcode.
-It contains a small integer that may, for example, be used as a
-block-structure level for the associated operation.
-If the instruction can take a longword constant,
-this constant is often packed into the sub-opcode
-if it fits into 8 bits and is not zero.
-A sub-opcode value of zero specifies that the constant would not
-fit and therefore follows in the next word.
-This is a space optimization, the value of zero for flagging
-the longer case being convenient because it is easy to test.
-.PP
-Other instruction formats are used.
-The branching
-instructions take an offset in the following word,
-operators that load constants onto the stack
-take arbitrarily long inline constant values,
-and many operations deal exclusively with data on the
-interpreter stack, requiring no inline data.
-.NH 2
-Stack structure of the interpreter
-.PP
-The interpreter emulates a stack-structured Pascal machine.
-The ``load'' instructions put values onto the stack, where all
-arithmetic operations take place.
-The ``store'' instructions take values off the stack
-and place them in an address that is also contained on the stack.
-The only way to move data or to compute in the machine is with the stack.
-.PP
-To make the interpreter operations more powerful
-and to thereby increase the interpreter speed,
-the arithmetic operations in the interpreter are ``typed''.
-That is, length conversion of arithmetic values occurs when they are
-used in an operation.
-This eliminates interpreter cycles for length conversion
-and the associated overhead.
-For example, when adding an integer that fits in one byte to one that
-requires four bytes to store, no ``conversion'' operators are required.
-The one byte integer is loaded onto the stack, followed by the four
-byte integer, and then an adding operator is used that has, implicit
-in its definition, the sizes of the arguments.
-.NH 2
-Data types in the interpreter
-.PP
-The interpreter deals with several different fundamental data types.
-In the memory of the machine, 1, 2, and 4 byte integers are supported,
-with only 2 and 4 byte integers being present on the stack.
-The interpreter always converts to 4 byte integers when there is a possibility
-of overflowing the shorter formats.
-This corresponds to the Pascal language definition of overflow in
-arithmetic operations that requires that the result be correct
-if all partial values lie within the bounds of the base integer type:
-4 byte integer values.
-.PP
-Character constants are treated similarly to 1 byte integers for
-most purposes, as are Boolean values.
-All enumerated types are treated as integer values of
-an appropriate length, usually 1 byte.
-The interpreter also has real numbers, occupying 8 bytes of storage,
-and sets and strings of varying length.
-The appropriate operations are included for each data type, such as
-set union and intersection and an operation to write a string.
-.PP
-No special
-.B packed
-data formats are supported by the interpreter.
-The smallest unit of storage occupied by any variable is one byte.
-The built-ins
-.I pack
-and
-.I unpack
-thus degenerate to simple memory to memory transfers with
-no special processing.
-.NH 2
-Runtime environment
-.PP
-The interpreter runtime environment uses a stack data area and a heap
-data area, that are kept at opposite ends of memory
-and grow towards each other.
-All global variables and variables local to procedures and functions
-are kept in the stack area.
-Dynamically allocated variables and buffers for input/output are
-allocated in the heap.
-.PP
-The addressing of block structured variables is done by using
-a fixed display
-that contains the address of its stack frame
-for each statically active block.\*(Dg
-.FS
-\*(dg\ Here ``block'' is being used to mean any
-.I procedure ,
-.I function
-or the main program.
-.FE
-This display is referenced by instructions that load and store
-variables and maintained by the operations for
-block entry and exit, and for non-local
-.B goto
-statements.
-.NH 2
-Dp, lc, loop
-.PP
-Three ``global'' variables in the interpreter, in addition to the
-``display'', are the
-.I dp,
-.I lc,
-and the
-.I loop.
-The
-.I dp
-is a pointer to the display entry for the current block;
-the
-.I lc
-is the abstract machine location counter;
-and the
-.I loop
-is a register that holds the address of the main interpreter
-loop so that returning to the loop to fetch the next instruction is
-a fast operation.
-.NH 2
-The stack frame structure
-.PP
-Each active block
-has a stack frame consisting of three parts:
-a block mark, local variables, and temporary storage for partially
-evaluated expressions.
-The stack in the interpreter grows from the high addresses in memory
-to the low addresses,
-so that those parts of the stack frame that are ``on the top''
-of the stack have the most negative offsets from the display
-entry for the block.
-The major parts of the stack frame are represented in Figure 1.1.
-.so fig1.1.n
-Note that the local variables of each block
-have negative offsets from the corresponding display entry,
-the ``first'' local variable having offset `\-2'.
-.NH 2
-The block mark
-.PP
-The block mark contains the saved information necessary
-to restore the environment when the current block exits.
-It consists of two parts.
-The first and top-most part is saved by the
-.SM CALL
-instruction in the interpreter.
-This information is not present for the main program
-as it is never ``called''.
-The second part of the block mark is created by the
-.SM BEG
-begin block operator that also allocates and clears the
-local variable storage.
-The format of these blocks is represented in Figure 1.2.
-.sp
-.so fig1.2.n
-.PP
-The data saved by the
-.SM CALL
-operator includes the line number
-.I lino
-of the point of call,
-that is printed if the program execution ends abnormally;
-the location counter
-.I lc
-giving the return address;
-and the current display entry address
-.I dp
-at the time of call.
-.PP
-The
-.SM BEG
-begin operator saves the previous display contents at the level
-of this block, so that the display can be restored on block exit.
-A pointer to the beginning line number and the
-name of this block is also saved.
-This information is stored in the interpreter object code in-line after the
-.SM BEG
-operator.
-It is used in printing a post-mortem backtrace.
-The saved file name and buffer reference are necessary because of
-the input/output structure
-(this is discussed in detail in
-sections 3.3 and 3.4).
-The top of stack reference gives the value the stack pointer should
-have when there are no expression temporaries on the stack.
-It is used for a consistency check in the
-.SM LINO
-line number operators in the interpreter, that occurs before
-each statement executed.
-This helps to catch bugs in the interpreter, that often manifest
-themselves by leaving the stack non-empty between statements.
-.PP
-Note that there is no explicit static link here.
-Thus to set up the display correctly after a non-local
-.B goto
-statement one must ``unwind''
-through all the block marks on the stack to rebuild the display.
-.NH 2
-Arguments and return values
-.PP
-A function returns its value into a space reserved by the calling
-block.
-Arguments to a
-.B function
-are placed on top of this return area.
-For both
-.B procedure
-and
-.B function
-calls, arguments are placed at the end of the expression evaluation area
-of the caller.
-When a
-.B function
-completes, expression evaluation can continue
-after popping the arguments to the
-.B function
-off the stack,
-exactly as if the function value had been ``loaded''.
-The arguments to a
-.B procedure
-are also popped off the stack by the caller
-after its execution ends.
-.KS
-.PP
-As a simple example consider the following stack structure
-for a call to a function
-.I f,
-of the form ``f(a)''.
-.so fig1.3.n
-.KE
-.PP
-If we suppose that
-.I f
-returns a
-.I real
-and that
-.I a
-is an integer,
-the calling sequence for this function would be:
-.DS
-.TS
-lp-2w(8) l.
-PUSH \-8
-RV4:\fIl a\fR
-CALL:\fIl f\fR
-POP 4
-.TE
-.DE
-.ZP
-Here we use the operator
-.SM PUSH
-to clear space for the return value,
-load
-.I a
-on the stack with a ``right value'' operator,
-call the function,
-pop off the argument
-.I a ,
-and can then complete evaluation of the containing expression.
-The operations used here will be explained in section 2.
-.PP
-If the function
-.I f
-were given by
-.LS
- 10 \*bfunction\fR f(i: integer): real;
- 11 \*bbegin\fR
- 12 f := i
- 13 \*bend\fR;
-.LE
-then
-.I f
-would have code sequence:
-.DS
-.TS
-lp-2w(8) l.
-BEG:2 0
- 11
- "f"
-LV:\fIl\fR 40
-RV4:\fIl\fR 32
-AS48
-END
-.TE
-.DE
-.ZP
-Here the
-.SM BEG
-operator takes 9 bytes of inline data.
-The first byte specifies the
-length of the function name.
-The second longword specifies the
-amount of local variable storage, here none.
-The succeeding two lines give the line number of the
-.B begin
-and the name of the block
-for error traceback.
-The
-.SM BEG
-operator places a name pointer in the block mark.
-The body of the
-.B function
-first takes an address of the
-.B function
-result variable
-.I f
-using the address of operator
-.SM LV
-.I a .
-The next operation in the interpretation of this function is the loading
-of the value of
-.I i .
-.I I
-is at the level of the
-.B function
-.I f ,
-here symbolically
-.I l,
-and the first variable in the local variable area.
-The
-.B function
-completes by assigning the 4 byte integer on the stack to the 8 byte
-return location, hence the
-.SM AS48
-assignment operator, and then uses the
-.SM END
-operator to exit the current block.
-.NH 2
-The main interpreter loop
-.PP
-The main interpreter loop is simply:
-.DS
-.mD
-iloop:
- \fBcaseb\fR (lc)+,$0,$255
- <table of opcode interpreter addresses>
-.DE
-.ZP
-The main opcode is extracted from the first byte of the instruction
-and used to index into the table of opcode interpreter addresses.
-Control is then transferred to the specified location.
-The sub-opcode may be used to index the display,
-as a small constant,
-or to specify one of several relational operators.
-In the cases where a constant is needed, but it
-is not small enough to fit in the byte sub-operator,
-a zero is placed there and the constant follows in the next word.
-Zero is easily tested for,
-as the instruction that fetches the
-sub-opcode sets the condition code flags.
-A construction like:
-.DS
-.mD
-_OPER:
- \fBcvtbl\fR (lc)+,r0
- \fBbneq\fR L1
- \fBcvtwl\fR (lc)+,r0
-L1: ...
-.DE
-is all that is needed to effect this packing of data.
-This technique saves space in the Pascal
-.I obj
-object code.
-.PP
-The address of the instruction at
-.I iloop
-is always contained in the register variable
-.I loop .
-Thus a return to the main interpreter is simply:
-.DS
- \fBjmp\fR (loop)
-.DE
-that is both quick and occupies little space.
-.NH 2
-Errors
-.PP
-Errors during interpretation fall into three classes:
-.DS
-1) Interpreter detected errors.
-2) Hardware detected errors.
-3) External events.
-.DE
-.PP
-Interpreter detected errors include I/O errors and
-built-in function errors.
-These errors cause a subroutine call to an error routine
-with a single parameter indicating the cause of the error.
-Hardware errors such as range errors and overflows are
-fielded by a special routine that determines the opcode
-that caused the error.
-It then calls the error routine with an appropriate error
-parameter.
-External events include interrupts and system limits such
-as available memory.
-They generate a call to the error routine with an
-appropriate error code.
-The error routine processes the error condition,
-printing an appropriate error message and usually
-a backtrace from the point of the error.
OpenPOWER on IntegriCloud