diff options
Diffstat (limited to 'share/doc/papers/px/pxin2.n')
-rw-r--r-- | share/doc/papers/px/pxin2.n | 923 |
1 files changed, 0 insertions, 923 deletions
diff --git a/share/doc/papers/px/pxin2.n b/share/doc/papers/px/pxin2.n deleted file mode 100644 index d0edf98..0000000 --- a/share/doc/papers/px/pxin2.n +++ /dev/null @@ -1,923 +0,0 @@ -.\" Copyright (c) 1979 The Regents of the University of California. -.\" All rights reserved. -.\" -.\" Redistribution and use in source and binary forms, with or without -.\" modification, are permitted provided that the following conditions -.\" are met: -.\" 1. Redistributions of source code must retain the above copyright -.\" notice, this list of conditions and the following disclaimer. -.\" 2. Redistributions in binary form must reproduce the above copyright -.\" notice, this list of conditions and the following disclaimer in the -.\" documentation and/or other materials provided with the distribution. -.\" 3. All advertising materials mentioning features or use of this software -.\" must display the following acknowledgement: -.\" This product includes software developed by the University of -.\" California, Berkeley and its contributors. -.\" 4. Neither the name of the University nor the names of its contributors -.\" may be used to endorse or promote products derived from this software -.\" without specific prior written permission. -.\" -.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND -.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE -.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS -.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) -.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT -.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY -.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF -.\" SUCH DAMAGE. -.\" -.\" @(#)pxin2.n 5.2 (Berkeley) 4/17/91 -.\" $FreeBSD$ -.\" -.nr H1 1 -.if n .ND -.NH -Operations -.NH 2 -Naming conventions and operation summary -.PP -Table 2.1 outlines the opcode typing convention. -The expression ``a above b'' means that `a' is on top -of the stack with `b' below it. -Table 2.3 describes each of the opcodes. -The character `*' at the end of a name specifies that -all operations with the root prefix -before the `*' -are summarized by one entry. -Table 2.2 gives the codes used -to describe the type inline data expected by each instruction. -.sp 2 -.so table2.1.n -.sp 2 -.so table2.2.n -.bp -.so table2.3.n -.bp -.NH 2 -Basic control operations -.LP -.SH -HALT -.IP -Corresponds to the Pascal procedure -.I halt ; -causes execution to end with a post-mortem backtrace as if a run-time -error had occurred. -.SH -BEG s,W,w," -.IP -Causes the second part of the block mark to be created, and -.I W -bytes of local variable space to be allocated and cleared to zero. -Stack overflow is detected here. -.I w -is the first line of the body of this section for error traceback, -and the inline string (length s) the character representation of its name. -.SH -NODUMP s,W,w," -.IP -Equivalent to -.SM BEG , -and used to begin the main program when the ``p'' -option is disabled so that the post-mortem backtrace will be inhibited. -.SH -END -.IP -Complementary to the operators -.SM CALL -and -.SM BEG , -exits the current block, calling the procedure -.I pclose -to flush buffers for and release any local files. -Restores the environment of the caller from the block mark. -If this is the end for the main program, all files are -.I flushed, -and the interpreter is exited. -.SH -CALL l,A -.IP -Saves the current line number, return address, and active display entry pointer -.I dp -in the first part of the block mark, then transfers to the entry point -given by the relative address -.I A , -that is the beginning of a -.B procedure -or -.B function -at level -.I l. -.SH -PUSH s -.IP -Clears -.I s -bytes on the stack. -Used to make space for the return value of a -.B function -just before calling it. -.SH -POP s -.IP -Pop -.I s -bytes off the stack. -Used after a -.B function -or -.B procedure -returns to remove the arguments from the stack. -.SH -TRA a -.IP -Transfer control to relative address -.I a -as a local -.B goto -or part of a structured statement. -.SH -TRA4 A -.IP -Transfer control to an absolute address as part of a non-local -.B goto -or to branch over procedure bodies. -.SH -LINO s -.IP -Set current line number to -.I s. -For consistency, check that the expression stack is empty -as it should be (as this is the start of a statement.) -This consistency check will fail only if there is a bug in the -interpreter or the interpreter code has somehow been damaged. -Increment the statement count and if it exceeds the statement limit, -generate a fault. -.SH -GOTO l,A -.IP -Transfer control to address -.I A -that is in the block at level -.I l -of the display. -This is a non-local -.B goto. -Causes each block to be exited as if with -.SM END , -flushing and freeing files with -.I pclose, -until the current display entry is at level -.I l. -.SH -SDUP* -.IP -Duplicate the word or long on the top of -the stack. -This is used mostly for constructing sets. -See section 2.11. -.NH 2 -If and relational operators -.SH -IF a -.IP -The interpreter conditional transfers all take place using this operator -that examines the Boolean value on the top of the stack. -If the value is -.I true , -the next code is executed, -otherwise control transfers to the specified address. -.SH -REL* r -.IP -These take two arguments on the stack, -and the sub-operation code specifies the relational operation to -be done, coded as follows with `a' above `b' on the stack: -.DS -.mD -.TS -lb lb -c a. -Code Operation -_ -0 a = b -2 a <> b -4 a < b -6 a > b -8 a <= b -10 a >= b -.TE -.DE -.IP -Each operation does a test to set the condition code -appropriately and then does an indexed branch based on the -sub-operation code to a test of the condition here specified, -pushing a Boolean value on the stack. -.IP -Consider the statement fragment: -.DS -.mD -\*bif\fR a = b \*bthen\fR -.DE -.IP -If -.I a -and -.I b -are integers this generates the following code: -.DS -.TS -lp-2w(8) l. -RV4:\fIl a\fR -RV4:\fIl b\fR -REL4 \&= -IF \fIElse part offset\fR -.sp -.T& -c s. -\fI\&... Then part code ...\fR -.TE -.DE -.NH 2 -Boolean operators -.PP -The Boolean operators -.SM AND , -.SM OR , -and -.SM NOT -manipulate values on the top of the stack. -All Boolean values are kept in single bytes in memory, -or in single words on the stack. -Zero represents a Boolean \fIfalse\fP, and one a Boolean \fItrue\fP. -.NH 2 -Right value, constant, and assignment operators -.SH -LRV* l,A -.br -RV* l,a -.IP -The right value operators load values on the stack. -They take a block number as a sub-opcode and load the appropriate -number of bytes from that block at the offset specified -in the following word onto the stack. As an example, consider -.SM LRV4 : -.DS -.mD -_LRV4: - \fBcvtbl\fR (lc)+,r0 #r0 has display index - \fBaddl3\fR _display(r0),(lc)+,r1 #r1 has variable address - \fBpushl\fR (r1) #put value on the stack - \fBjmp\fR (loop) -.DE -.IP -Here the interpreter places the display level in r0. -It then adds the appropriate display value to the inline offset and -pushes the value at this location onto the stack. -Control then returns to the main -interpreter loop. -The -.SM RV* -operators have short inline data that -reduces the space required to address the first 32K of -stack space in each stack frame. -The operators -.SM RV14 -and -.SM RV24 -provide explicit conversion to long as the data -is pushed. -This saves the generation of -.SM STOI -to align arguments to -.SM C -subroutines. -.SH -CON* r -.IP -The constant operators load a value onto the stack from inline code. -Small integer values are condensed and loaded by the -.SM CON1 -operator, that is given by -.DS -.mD -_CON1: - \fBcvtbw\fR (lc)+,\-(sp) - \fBjmp\fR (loop) -.DE -.IP -Here note that little work was required as the required constant -was available at (lc)+. -For longer constants, -.I lc -must be incremented before moving the constant. -The operator -.SM CON -takes a length specification in the sub-opcode and can be used to load -strings and other variable length data onto the stack. -The operators -.SM CON14 -and -.SM CON24 -provide explicit conversion to long as the constant is pushed. -.SH -AS* -.IP -The assignment operators are similar to arithmetic and relational operators -in that they take two operands, both in the stack, -but the lengths given for them specify -first the length of the value on the stack and then the length -of the target in memory. -The target address in memory is under the value to be stored. -Thus the statement -.DS -i := 1 -.DE -.IP -where -.I i -is a full-length, 4 byte, integer, -will generate the code sequence -.DS -.TS -lp-2w(8) l. -LV:\fIl i\fP -CON1:1 -AS24 -.TE -.DE -.IP -Here -.SM LV -will load the address of -.I i, -that is really given as a block number in the sub-opcode and an -offset in the following word, -onto the stack, occupying a single word. -.SM CON1 , -that is a single word instruction, -then loads the constant 1, -that is in its sub-opcode, -onto the stack. -Since there are not one byte constants on the stack, -this becomes a 2 byte, single word integer. -The interpreter then assigns a length 2 integer to a length 4 integer using -.SM AS24 \&. -The code sequence for -.SM AS24 -is given by: -.DS -.mD -_AS24: - \fBincl\fR lc - \fBcvtwl\fR (sp)+,*(sp)+ - \fBjmp\fR (loop) -.DE -.IP -Thus the interpreter gets the single word off the stack, -extends it to be a 4 byte integer -gets the target address off the stack, -and finally stores the value in the target. -This is a typical use of the constant and assignment operators. -.NH 2 -Addressing operations -.SH -LLV l,W -.br -LV l,w -.IP -The most common operation done by the interpreter -is the ``left value'' or ``address of'' operation. -It is given by: -.DS -.mD -_LLV: - \fBcvtbl\fR (lc)+,r0 #r0 has display index - \fBaddl3\fR _display(r0),(lc)+,\-(sp) #push address onto the stack - \fBjmp\fR (loop) -.DE -.IP -It calculates an address in the block specified in the sub-opcode -by adding the associated display entry to the -offset that appears in the following word. -The -.SM LV -operator has a short inline data that reduces the space -required to address the first 32K of stack space in each call frame. -.SH -OFF s -.IP -The offset operator is used in field names. -Thus to get the address of -.LS -p^.f1 -.LE -.IP -.I pi -would generate the sequence -.DS -.mD -.TS -lp-2w(8) l. -RV:\fIl p\fP -OFF \fIf1\fP -.TE -.DE -.IP -where the -.SM RV -loads the value of -.I p, -given its block in the sub-opcode and offset in the following word, -and the interpreter then adds the offset of the field -.I f1 -in its record to get the correct address. -.SM OFF -takes its argument in the sub-opcode if it is small enough. -.SH -NIL -.IP -The example above is incomplete, lacking a check for a -.B nil -pointer. -The code generated would be -.DS -.TS -lp-2w(8) l. -RV:\fIl p\fP -NIL -OFF \fIf1\fP -.TE -.DE -.IP -where the -.SM NIL -operation checks for a -.I nil -pointer and generates the appropriate runtime error if it is. -.SH -LVCON s," -.IP -A pointer to the specified length inline data is pushed -onto the stack. -This is primarily used for -.I printf -type strings used by -.SM WRITEF . -(see sections 3.6 and 3.8) -.SH -INX* s,w,w -.IP -The operators -.SM INX2 -and -.SM INX4 -are used for subscripting. -For example, the statement -.DS -a[i] := 2.0 -.DE -.IP -with -.I i -an integer and -.I a -an -``array [1..1000] of real'' -would generate -.DS -.TS -lp-2w(8) l. -LV:\fIl a\fP -RV4:\fIl i\fP -INX4:8 1,999 -CON8 2.0 -AS8 -.TE -.DE -.IP -Here the -.SM LV -operation takes the address of -.I a -and places it on the stack. -The value of -.I i -is then placed on top of this on the stack. -The array address is indexed by the -length 4 index (a length 2 index would use -.SM INX2 ) -where the individual elements have a size of 8 bytes. -The code for -.SM INX4 -is: -.DS -.mD -_INX4: - \fBcvtbl\fR (lc)+,r0 - \fBbneq\fR L1 - \fBcvtwl\fR (lc)+,r0 #r0 has size of records -L1: - \fBcvtwl\fR (lc)+,r1 #r1 has lower bound - \fBmovzwl\fR (lc)+,r2 #r2 has upper-lower bound - \fBsubl3\fR r1,(sp)+,r3 #r3 has base subscript - \fBcmpl\fR r3,r2 #check for out of bounds - \fBbgtru\fR esubscr - \fBmull2\fR r0,r3 #calculate byte offset - \fBaddl2\fR r3,(sp) #calculate actual address - \fBjmp\fR (loop) -esubscr: - \fBmovw\fR $ESUBSCR,_perrno - \fBjbr\fR error -.DE -.IP -Here the lower bound is subtracted, and range checked against the -upper minus lower bound. -The offset is then scaled to a byte offset into the array -and added to the base address on the stack. -Multi-dimension subscripts are translated as a sequence of single subscriptings. -.SH -IND* -.IP -For indirect references through -.B var -parameters and pointers, -the interpreter has a set of indirection operators that convert a pointer -on the stack into a value on the stack from that address. -different -.SM IND -operators are necessary because of the possibility of different -length operands. -The -.SM IND14 -and -.SM IND24 -operators do conversions to long -as they push their data. -.NH 2 -Arithmetic operators -.PP -The interpreter has many arithmetic operators. -All operators produce results long enough to prevent overflow -unless the bounds of the base type are exceeded. -The basic operators available are -.DS -Addition: ADD*, SUCC* -Subtraction: SUB*, PRED* -Multiplication: MUL*, SQR* -Division: DIV*, DVD*, MOD* -Unary: NEG*, ABS* -.DE -.NH 2 -Range checking -.PP -The interpreter has several range checking operators. -The important distinction among these operators is between values whose -legal range begins at zero and those that do not begin at zero, -for example -a subrange variable whose values range from 45 to 70. -For those that begin at zero, a simpler ``logical'' comparison against -the upper bound suffices. -For others, both the low and upper bounds must be checked independently, -requiring two comparisons. -On the -.SM "VAX 11/780" -both checks are done using a single index instruction -so the only gain is in reducing the inline data. -.NH 2 -Case operators -.PP -The interpreter includes three operators for -.B case -statements that are used depending on the width of the -.B case -label type. -For each width, the structure of the case data is the same, and -is represented in figure 2.4. -.sp 1 -.so fig2.4.n -.PP -The -.SM CASEOP -case statement operators do a sequential search through the -case label values. -If they find the label value, they take the corresponding entry -from the transfer table and cause the interpreter to branch to the -specified statement. -If the specified label is not found, an error results. -.PP -The -.SM CASE -operators take the number of cases as a sub-opcode -if possible. -Three different operators are needed to handle single byte, -word, and long case transfer table values. -For example, the -.SM CASEOP1 -operator has the following code sequence: -.DS -.mD -_CASEOP1: - \fBcvtbl\fR (lc)+,r0 - \fBbneq\fR L1 - \fBcvtwl\fR (lc)+,r0 #r0 has length of case table -L1: - \fBmovaw\fR (lc)[r0],r2 #r2 has pointer to case labels - \fBmovzwl\fR (sp)+,r3 #r3 has the element to find - \fBlocc\fR r3,r0,(r2) #r0 has index of located element - \fBbeql\fR caserr #element not found - \fBmnegl\fR r0,r0 #calculate new lc - \fBcvtwl\fR (r2)[r0],r1 #r1 has lc offset - \fBaddl2\fR r1,lc - \fBjmp\fR (loop) -caserr: - \fBmovw\fR $ECASE,_perrno - \fBjbr\fR error -.DE -.PP -Here the interpreter first computes the address of the beginning -of the case label value area by adding twice the number of case label -values to the address of the transfer table, since the transfer -table entries are 2 byte address offsets. -It then searches through the label values, and generates an ECASE -error if the label is not found. -If the label is found, the index of the corresponding entry -in the transfer table is extracted and that offset is added -to the interpreter location counter. -.NH 2 -Operations supporting pxp -.PP -The following operations are defined to do execution profiling. -.SH -PXPBUF w -.IP -Causes the interpreter to allocate a count buffer -with -.I w -four byte counters -and to clear them to zero. -The count buffer is placed within an image of the -.I pmon.out -file as described in the -.I "PXP Implementation Notes." -The contents of this buffer are written to the file -.I pmon.out -when the program ends. -.SH -COUNT w -.IP -Increments the counter specified by -.I w . -.SH -TRACNT w,A -.IP -Used at the entry point to procedures and functions, -combining a transfer to the entry point of the block with -an incrementing of its entry count. -.NH 2 -Set operations -.PP -The set operations: -union -.SM ADDT, -intersection -.SM MULT, -element removal -.SM SUBT, -and the set relationals -.SM RELT -are straightforward. -The following operations are more interesting. -.SH -CARD s -.IP -Takes the cardinality of a set of size -.I s -bytes on top of the stack, leaving a 2 byte integer count. -.SM CARD -uses the -.B ffs -opcode to successively count the number of set bits in the set. -.SH -CTTOT s,w,w -.IP -Constructs a set. -This operation requires a non-trivial amount of work, -checking bounds and setting individual bits or ranges of bits. -This operation sequence is slow, -and motivates the presence of the operator -.SM INCT -below. -The arguments to -.SM CTTOT -include the number of elements -.I s -in the constructed set, -the lower and upper bounds of the set, -the two -.I w -values, -and a pair of values on the stack for each range in the set, single -elements in constructed sets being duplicated with -.SM SDUP -to form degenerate ranges. -.SH -IN s,w,w -.IP -The operator -.B in -for sets. -The value -.I s -specifies the size of the set, -the two -.I w -values the lower and upper bounds of the set. -The value on the stack is checked to be in the set on the stack, -and a Boolean value of -.I true -or -.I false -replaces the operands. -.SH -INCT -.IP -The operator -.B in -on a constructed set without constructing it. -The left operand of -.B in -is on top of the stack followed by the number of pairs in the -constructed set, -and then the pairs themselves, all as single word integers. -Pairs designate runs of values and single values are represented by -a degenerate pair with both value equal. -This operator is generated in grammatical constructs such as -.LS -\fBif\fR character \fBin\fR [`+', '\-', `*', `/'] -.LE -.IP -or -.LS -\fBif\fR character \fBin\fR [`a'..`z', `$', `_'] -.LE -.IP -These constructs are common in Pascal, and -.SM INCT -makes them run much faster in the interpreter, -as if they were written as an efficient series of -.B if -statements. -.NH 2 -Miscellaneous -.PP -Other miscellaneous operators that are present in the interpreter -are -.SM ASRT -that causes the program to end if the Boolean value on the stack is not -.I true, -and -.SM STOI , -.SM STOD , -.SM ITOD , -and -.SM ITOS -that convert between different length arithmetic operands for -use in aligning the arguments in -.B procedure -and -.B function -calls, and with some untyped built-ins, such as -.SM SIN -and -.SM COS \&. -.PP -Finally, if the program is run with the run-time testing disabled, there -are special operators for -.B for -statements -and special indexing operators for arrays -that have individual element size that is a power of 2. -The code can run significantly faster using these operators. -.NH 2 -Mathematical Functions -.PP -The transcendental functions -.SM SIN , -.SM COS , -.SM ATAN , -.SM EXP , -.SM LN , -.SM SQRT , -.SM SEED , -and -.SM RANDOM -are taken from the standard UNIX -mathematical package. -These functions take double precision floating point -values and return the same. -.PP -The functions -.SM EXPO , -.SM TRUNC , -and -.SM ROUND -take a double precision floating point number. -.SM EXPO -returns an integer representing the machine -representation of its argument's exponent, -.SM TRUNC -returns the integer part of its argument, and -.SM ROUND -returns the rounded integer part of its argument. -.NH 2 -System functions and procedures -.SH -LLIMIT -.IP -A line limit and a file pointer are passed on the stack. -If the limit is non-negative the line limit is set to the -specified value, otherwise it is set to unlimited. -The default is unlimited. -.SH -STLIM -.IP -A statement limit is passed on the stack. The statement limit -is set as specified. -The default is 500,000. -No limit is enforced when the ``p'' option is disabled. -.SH -CLCK -.br -SCLCK -.IP -.SM CLCK -returns the number of milliseconds of user time used by the program; -.SM SCLCK -returns the number of milliseconds of system time used by the program. -.SH -WCLCK -.IP -The number of seconds since some predefined time is -returned. Its primary usefulness is in determining -elapsed time and in providing a unique time stamp. -.sp -.LP -The other system time procedures are -.SM DATE -and -.SM TIME -that copy an appropriate text string into a pascal string array. -The function -.SM ARGC -returns the number of command line arguments passed to the program. -The procedure -.SM ARGV -takes an index on the stack and copies the specified -command line argument into a pascal string array. -.NH 2 -Pascal procedures and functions -.SH -PACK s,w,w,w -.br -UNPACK s,w,w,w -.IP -They function as a memory to memory move with several -semantic checks. -They do no ``unpacking'' or ``packing'' in the true sense as the -interpreter supports no packed data types. -.SH -NEW s -.br -DISPOSE s -.IP -An -.SM LV -of a pointer is passed. -.SM NEW -allocates a record of a specified size and puts a pointer -to it into the pointer variable. -.SM DISPOSE -deallocates the record pointed to by the pointer -and sets the pointer to -.SM NIL . -.sp -.LP -The function -.SM CHR* -converts a suitably small integer into an ascii character. -Its primary purpose is to do a range check. -The function -.SM ODD* -returns -.I true -if its argument is odd and returns -.I false -if its argument is even. -The function -.SM UNDEF -always returns the value -.I false . |