ack/doc/basic.doc

.TL
.de Sy
.LP
.IP \fBsyntax\fR 10
..
.de PU
.IP \fBpurpose\fR 10
..
.de RM
.IP \fBremarks\fR 10
..
The ABC compiler
.AU
Martin L. Kersten
.AI
Department of Mathematics and Computer Science.
.br
Vrije Universiteit
.AB
This manual describes the 
programming language BASIC and its compiler
included in the Amsterdam Compiler Kit.
.AE
.SH
INTRODUCTION.
.LP
The BASIC-EM compiler is an extensive implementation of the
programming language BASIC.
The language structure and semantics are modelled after the 
BASIC interpreter/compiler of Microsoft (tr), a detailed comparison
is provided in appendix A.
.LP
The compiler generates code for a virtual machine, the EM machine
[[ACM, etc]]
Using EM as an intermediate machine results in a highly portable
compiler and BASIC code.
The drawback of EM is that it does not directly reflect one particular
hardware design, which means that many of 
the low level operations available within 
BASIC are ill-defined or even inapplicable.
To mention a few, the peek and poke instructions are likely
to be behave errorneous, while line printer and tapedeck 
primitives are unknown.
.LP
This manual is divided into three chapters.
The first chapter discusses the general language syntax and semantics.
Chapter two describes the statements available in BASIC-EM.
Chapter 3 describes the predefined functions,
ordered alphabetically.
Appendix A discusses the differences with
Microsoft BASIC. Appendix B describes all reserved symbols.
Appendix C lists the error messages in use.
.sp
Additional information about EM and the Amsterdam Compiler Kit
can be obtained from .... and found in ......
.SH
SyNTAX NOTATION
.LP
The conventions for syntax presentation are as follows:
.IP CAPS 10
Items are reserved words, must be input as shown
.IP <> 10
Items in lowercase letters enclosed in angular brackets
are to be supplied by the user.
.IP [] 10
Items are optional.
.IP \.\.\. 10
Items may be repeated any number of times 
.IP {} 10
A choice between two or more alternatives. At least one of the entries
must be chosen.
.IP | 10
Vertical bars separate the choices within braces.
.LP
All punctuation must be included where shown.
.NH 1
GENERAL INFORMATION
.LP
The BASIC-EM compiler is designed for a UNIX based environment.
It accepts a text file with your BASIC program (suffix .b) and generates
an executable file, called a.out.
.LP
Should we call the preprocessor first?
.NH 2
LINE FORMAT
.LP
A BASIC program consists of a series of lines, starting with a 
positive line number in the range 0 to 65529.
A line may consists of more then one physical line on your terminal, but must
is limited to 1024 characters.
Multiple BASIC statements may be placed on a single line, provided
they are separated by a colon (:).
.NH 2
CONSTANTS
.LP
The BASIC compiler character set is comprised of alphabetic
characters, numeric characters, and special characters shown below.
.DS
= + - * / ^ ( ) % # $ \\ _
! [ ] , . ; : & ' ? > <  \\ (blanc)
.DE
.LP
BASIC uses two different types of constants during processing:
numeric and string constants.
.br
A string constant is a sequence of characters taken from the ASCII
character set enclosed by double quotation marks.
.br
Numeric constants are positive or negative numbers, grouped into
five different classes.
.IP "a) integer constants" 25
Whole numbers in the range of -32768 and 32767. Integer constants do
not contain decimal points.
.IP "b) fixed point constants" 25
Positive or negative real numbers, i.e. numbers with a decimal point.
.IP "c) floating point constants" 25
Real numbers in scientific notation. A floating point constant
consists of an optional signed integer or fixed point number
followed by the letter E (or D) and an optional signed integer
(the exponent).
The allowable range of floating point constants is 10^-38 to 10^+38.
.IP "d) Hex constants" 25
Hexadecimal numbers, denoted by the prefix &H.
.IP "d) Octal constants" 25
Octal numbers, denoted by the prefix &O.
.NH 2
VARIABLES
.LP
Variables are names used to represent values in a BASIC program.
A variable is assigned a value by assigment specified in the program.
Before a variable is assigned its value is assumed to be zero.
.br
Variable names are composed of letters, digits or the decimal point,
starting with a letter. Up to 40 characters are significant.
A variable name be be followed by any of the following  type 
declaration characters:
.IP % 5
Defines an integer variable
.IP ! 5
Defines a single precision variable (see below)
.IP # 5
Defines a double precision variable
.IP $ 5
Defines a string variable.
.LP
NOTE: Two variables with the same name but different type is
considered illegal (DONE?).
.LP
Beside single valued variables, values may be grouped
into tables or arrays.
Each element in an array is referenced by the array name and an index,
such a variable is called a subscripted variable.
An array has as many subscripts as there are dimensions in the array,
the maximum of which is 11.
.br
If a variable starts with FN it is assumed to be a call to a user defined
function. 
.br
A variable name may not be a reserved word nor the name 
of a predefined function.
A list of all reserved identifiers is included as Appendix ?.
.NH 2
EXPRESSIONS
.LP
BASIC-EM differs from Microsoft BASIC in supporting floats in one precision
only (due to EM).
All floating point constants have the same precision, i.e. 16 digits.
.LP
When necessary the compiler will convert a numeric value from
one type to another.
A value is always converted to the precision of the variable it is assigned
to.
When a floating point value is converted to an integer the fractional
portion is rounded.
In an expression all values are converted to the same degree of precision,
i.e. that of the most precise operand.
.br
Division by zero results in the message "Division by zero".
If overflow (or underflow) occurs, the "Overflow (underflow)" message is
displayed and  execution is terminated (contrary to Microsoft).
.SH
Arithmetic
.LP
The arithmetic operators in order of precedence,a re:
.DS L
\^		Exponentiation
-		Negation
*,/,\\,MOD	Multiplication, Division, Remainder
+,-		Addition, Substraction
.DE
The operator \\\\ denotes integer division, its operands are rounded to
integers before the operator is applied.
Modulus arithmetic is denoted by the operator MOD, which yields the
integer value that is the remainder of an integer division.
.br
The order in which operators are performed can be changec with parentheses.
.SH
Relational
.LP
The relational operators in order of precedence, are:
.DS
=	Equality
<>	Inequality
<	Less than
>	Greater than
<=	Less than or equal to
>=	Greater than or equal to
.DE
The relational operators are used to compare two values and returns
either "true" (-1) or "false" (0) (See IF statement).
The precedence of the relational operators is lower 
then the arithmetic operators.
.SH
Logical
.LP
The logical operators performs tests on multiple relations, bit manipulations,
or Boolean operations.
The logical operators returns a bitwise result ("true" or "false").
In an expression, logical operators are performed after the relational and
arithmetic operators.
The logical operators work by converting their operands to signed
two-complement integers in the range -32768 to 32767.
.DS
NOT		Bitwise negation
AND		Bitwise and
OR		Bitwise or
XOR		Bitwise exclusive or
EQV		Bitwise equivalence
IMP		Bitwise implies
.DE
.SH
Functional
.LP
A function is used in  an expression to call a system or user defined
function.
A list of predefined functions is presented in chapter 3.
.SH
String operations
.LP
Strings can be concatenated by using +. Strings can be compared with
the relational operators. String comparison is performed in lexicographic
order.
.NH 2
ERROR MESSAGES
.LP
The occurence of an error results in termination of the program
unless an ON....ERROR statement has been encountered.
.NH 1
B-EM STATEMENTS
.LP
This chapter describes the statements available within the BASIC-EM
compiler. Each description is formatted as follows:
.Sy
Shows the correct syntax for the statement. See introduction of
syntax notation above.
.PU
Describes the purpose and details of the instructions.
.RM
Describes special cases, deviation from Microsoft BASIC etc.
.LP
.NH 2 
CALL
.Sy
CALL <variable name>[(<argument list>)]
.PU
The CALL statement provides the means to execute procedures
and functions written in another language included in the
Amsterdam Compiler Kit.
The argument list consist of (subscripted) variables.
The BASIC compiler pushes the address of the arguments on the stack in order
of encounter.
.RM
Not yet available
.NH 2
CLOSE
.Sy
CLOSE [[#]<file number>[,[#]<file number...>]]
.PU
To terminate I/O on a disk file.
<file number> is the number associated with the file 
when it was OPENed (See OPEN). Ommission of parameters results in closing
all files.
.sp
The END statement and STOP statement always issue a CLOSE of
all files.
.NH 2
DATA
.Sy
DATA <list of constants>
.PU
DATA statements are used to construct a data bank of values that are
accessed by the program's READ statement.
DATA statements are non-executable,
the data items are assembled in a data file by the BASIC compiler.
This file can be replaced, provided the layout remains
the same (otherwise the RESTORE won't function properly).
.sp
The list of data items consists of numeric and string constants
as discussed in section 1.
Moreover, string constants starting with a letter and not
containing blancs, newlines, commas, colon need not be enclosed with
the string quotes.
.sp
DATA statements can be reread using the RESTORE statement.
.NH 2
DEF FN
.Sy
DEF FN<name> [(<parameterlist>)]=<expression>
.PU
To define and name a function that is written by the user.
<name> must be an identifier and should be preceded by FN,
which is considered integral part of the function name. 
<expression> defines the expression to be evaluated upon function call.
.sp
The parameter list is comprised of a comma separated 
list of variable names, used within the function definition,
that are to replaced by values upon function call.
The variable names defined in the parameterlist, called formal
parameters, do not affect the definition and use of variables
defined with the same name in the rest of the BASIC program.
.sp
A type declaration character may be suffixed to the function name to
designate the data type of the function result.
.NH 2
DEFINT/SNG/DBL/STR
.Sy
DEF<type> <range of letters>
.PU
Any undefined variable starting with the letter included in the range of
letters is declared of type <type> unless a type declaration character
is appended.
The range of letters is a comma separated list of characters and
character ranges (<letter>-<letter>).
.NH 2
DIM
.Sy
DIM <list of subscripted variable>
.PU
The DIM statement allocates storage for subscripted variables.
If an undefined subscripted variable is used 
the maximum value of the array subscript(s) is assumed to be 10.
A subscript out of range is signalled by the program (when RCK works)
The minimum subscript value is 0, unless the OPTION BASE statement has been
encountered.
.sp
All variables in a subscripted variable are initially zero.
.sp
BUG. Multi-dimensional arrays MUST be defined.
.NH 2
END
.Sy
END
.PU
END terminates a BASIC program and returns to the UNIX shell.
An END statement at the end of the BASIC program is optional.
.NH 2
ERR and ERL
.PU
Whenever an error occurs the variable ERR contains the
error number and ERL the BASIC line where the error occurred.
The variables are usually used in error handling routines
provided by the user.
.NH 2
ERROR
.Sy
ERROR <integer expression>
.PU
To simulate the occurrence of a BASIC error.
To define your own error code use a value not already in
use by the BASIC runtime system.
The list of error messages currently in use 
can be found in appendix B.
.NH 2
FIELD
.PU
To be implemented.
.NH 2
FOR...NEXT
.Sy
FOR <variable>= <low>TO<high>[STEP<size>]
.br
 ......
.br
NEXT [<variable>][,<variable>...]
.PU
The FOR statements allows a series of statements to be performed
repeatedly. <variable> is used as a counter. During the first
execution pass it is assigned the value <low>,
an arithmetic expression. After each pass the counter
is incremented with the step size <size>, an expression.
Ommission of the step size is intepreted as an increment of 1.
Execution of the program lines specified between the FOR and the NEXT
statement is terminated as soon as <low> is greater than <high>
.sp
The NEXT statement is labeled with the name(s) of the counter to be
incremented.
.sp
The body of the FOR statement is skipped when the initial value of the
loop times the sign of the step exceeds the value of the highest value
times the sign of the step.
.sp
The variables mentioned in the NEXT statement may be ommitted, in which case
the variable of increment the counter of the most recent FOR statement.
If a NEXT statement is encountered before its corresponding FOR statement,
the error message "NEXT without FOR" is generated.
.NH 2
GET
.Sy
GET [#]<file number>[, <record number>]
.PU
To be implemented.
.NH 2
GOSUB...RETURN
.Sy
GOSUB <line number
  ...
.br
RETURN
.PU
The GOSUB statement branches to the first statement of a subroutine.
The RETURN statement cause a branch back to the statement following the
most recent GOSUB statement.
A subroutine may contain more than one RETURN statement.
.sp
Subroutines may be called recursively. 
Nesting of subroutine calls is limited, upon exceeding the maximum depth
the error message "XXXXX" is displayed.
.NH 2
GOTO
.Sy
GOTO <line number>
.PU
To branch unconditionally to a specified line in the program.
If <line number> does not exists, the compilation error message
"Line not defined" is displayed.
.RM
Microsoft BASIC continues at the first line
equal or greater then the line specified.
.NH 2
IF...THEN
.Sy
.br
IF <expression> THEN {<statements>|<line number>}
[ELSE {<statements>|<line number>}]
.br
.Sy
IF <expression> GOTO <line number>
[ELSE {<statements>|<line number>}]
.PU
The IF statement is used
to make a decision regarding the program flow based on the
result of the expressions.
If the expression is not zero, the THEN or GOTO clause is
executed. If the result of <expression> is zero, the THEN or
GOTO clause is ignored and the ELSE clause, if present is
executed.
.br
IF..THEN..ELSE statements may be nested.
Nesting is limited by the length of the line.
The ELSE clause matches with the closests unmatched THEN.
.sp
When using IF to test equality for a value that is the
result of a floating point expression, remember that the
internal representation of the value may not be exact.
Therefore, the test should be against a range to
handle the relative error.
.RM
Microsoft BASIC allows a comma before THEN.
.NH 2
INPUT
.Sy
INPUT [;][<"prompt string">;]<list of variables>
.PU
An INPUT statement can be used to obtain values from the user at the
terminal.
When an INPUT statement is encountered a question mark is printed
to indicate the program is awaiting data.
IF <"prompt string"> is included, the string is printed before the
the question mark. The question mark is suppressed when the prompt
string is followed by a comma, rather then a semicolon.
.sp
For each variable in the variable a list a value should be supplied.
Data items presented should be separated by a comma.
.sp
The type of the variable in the variable list must aggree with the
type of the data item entered. Responding with too few or too many
data items causes the message "?Redo". No assignment of input values
is made until an acceptable response is given.
.RM
The option to disgard the carriage return with the semicolon after the
input symbol is not yet implemented.
.NH 2
INPUT [#]
.Sy
INPUT #<file number>,<list of variables>
.PU
The purpose of the INPUT# statement is to read data items from a sequential
file and assign them to program variables.
<file number> is the number used to open the file for input.
The variables mentioned are (subscripted) variables.
The type of the data items read should aggree with the type of the variables.
A type mismatch results in the error message "XXXXX".
.sp
The data items on the sequential file are separated by commas and newlines.
In scanning the file, leading spaces, new lines, tabs, and
carriage returns are ignored. The first character encountered 
is assumed to be the state of a new item.
String items need not be enclosed with double quotes, provided
it does not contain spaces, tabs, newlines and commas,
.RM
Microsoft BASIC won't assign values until the end of input statement.
This means that the user has to supply all the information.
.NH 2
LET
.Sy
[LET]<variable>=<expression>
.PU
To assign  the value of an expression to a (subscribted) variable.
The type convertions as dictated in section 1.X apply.
.NH 2
LINE INPUT
.Sy
LINE INPUT [;][<"prompt string">;]<string variable>
.PU
An entire line of input is assigned to the string variable.
See INPUT for the meaning of the <"prompt string"> option.
.NH 2
LINE INPUT [#]
.Sy
LINE INPUT #<file number>,<string variable>
.PU
Read an entire line of text from a sequential file <file number>
and assign it to a string variable.
.NH 2
LSET and RSET
.PU
To be implemented
.NH 2
MID$
.Sy
MID$(<string expr1>,n[,m])=<string expr2>
.PU
To replace a portion of a string with another string value.
The characters of <string expr> replaces characters in <string expr1>
starting at position n. If m is present, at most m characters are copied,
otherwise all characters are copied.
However, the string obtained never exceeds the length of string expr1.
.NH 2
ON ERROR GOTO
.Sy
ON ERROR GOTO <line number>
.PU
To enable error handling within the BASIC program.
An error may result from arithmetic errors, disk problems, interrupts, or
as a result of the ERROR statement.
After printing an error message the program is continued at the
statements associated with <line number>.
.sp
Error handling is disabled using ON ERROR GOTO 0.
Subsequent errors result in an error message and program termination.
.NH 2
ON...GOSUB and ON ...GOTO
.Sy
ON <expression> GOSUB <list of line numbers>
ON <expression> GOTO <list of line numbers>
.PU
To branch to one of several specified line numbers or subroutines, based
on the result of the <expression>. The list of line numbers are considered
the first, second, etc alternative. Branching to the first occurs when
the expression evaluates to one, to the second alternative on two, etc.
If the value of the expression in zero or greater than the number of alternatives, processing continues at the first statement following the ON..GOTO 
(ON GOSUB) statement.
When the expression results in a negative number the 
an "Illegal function call" error occurs.
.NH 2
OPEN
.NH 2
OPTION BASE
.Sy
OPTION BASE n
.PU
To declare the lower bound of subsequent array subscripts as either
0 or 1. The default lower bound is zero.
.NH 2
POKE
.Sy
POKE <expr1>,<expr2>
.PU
To poke around in memory. The use of this statement is not recommended,
because it requires full understanding of both
the implementation of the Amsterdam
Compiler Kit and the hardware characteristics.
.NH 2
PRINT [USING]
.NH 2
PUT
.PU
To be implemented
.NH 2
RANDOMIZE
.Sy
RANDOMIZE [<expression>]
.PU
To reset the random seed. When the expression is ommitted, the system
will ask for a value between -32768 and 32767.
The random number generator returns the same sequence of values provided
the same seed is used.
.NH 2
READ
.Sy
READ <list of variables>
.PU
To read values from the DATA statements and assign them to variables.
The type of the variables should match to the type of the items being read,
otherwise a "Syntax error" occurs.
.NH 2
REM
.Sy
REM <remark>
.PU
To include explantory information in a program.
The REM statements are not executed.
A single quote has the same effect as  : REM, which
allows for the inclusion of comment at the end of the line.
.RM
Microsoft BASIC does not allow REM statements as part of
DATA lines.
.NH 2
RESTORE
.Sy
RESTORE  [<line number>]
.PU
To allow DATA statements to be re-read from a specific line.
After a RESTORE statement is executed, the next READ accesses
the first item of the DATA statements.
If <line number> is specified, the next READ accesses the first
item in the specified line.
.sp
Note that data statements result in a sequential datafile generated
by the compiler, being read by the read statements.
This data file may be replaced using the operating system functions
with a modified version, provided the same layout of items
(same number of lines and items per line) is used.
.NH 2
STOP
.Sy
STOP
.PU
To terminate the execution of a program and return to the operating system
command interpreter. A STOP statement results in the message "Break in line
???"
.NH 2
SWAP
.Sy
SWAP <variable>,<variable>
.PU
To exchange the values of two variables.
.NH 2
TRON/TROFF
.Sy
TRON
.Sy
TROFF
.PU
As an aid in debugging the TRON statement results in a program
listing each line being interpreted. TROFF disables generation of
this code.
.NH 2
WHILE...WEND
.Sy
WHILE <expression>
  .....
WEND
.PU
To execute a series of BASIC statements as long as a conditional expression
is true. WHILE...WEND loops may be nested.
.NH 2
WRITE 
.Sy
WRITE [<list of expressions>]
.PU
To write data at the terminal in DATA statement layout conventions.
The expressions should be separated by commas.
.NH 2
WRITE #
.Sy
WRITE #<file number> ,<list of expressions>
.PU
To write a sequential data file, being opened with the "O" mode.
The values are being writting using the DATA statements layout conventions.
.NH
FUNCTIONS
.LP
.IP ABS(X) 12
Returns the absolute value of expression X
.IP ASC(X$) 12
Returns the numeric value of the first character of the string.
If X$ is not initialized an "Illegal function call" error
is returned.
.IP ATN(X) 12
Returns the arctangent of X in radians. Result is in the range
of -pi/2 to pi/2.
.IP CDBL(X) 12
Converts X to a double precision number.
.IP CHR$(X) 12
Converts the integer value X to its ASCII character. 
X must be in the range of 0 to 127.
It is used for cursor addressing and generating bel signals.
.IP CINT(X) 12
Converts X to an integer by rounding the fractional portion.
If X is not in the range -32768 to 32767 an "Overflow"
error occurs.
.IP COS(X) 12
Returns the cosine of X in radians.
.IP CSNG(X) 12
Converts X to a double precision number.
.IP CVI(<2-bytes>) 12
Convert two byte string value to integer number.
.IP CVS(<4-bytes>) 12
Convert four byte string value to single precision number.
.IP CVD(<8-bytes>) 12
Convert eight byte string value to double precision number.
.IP EOF[(<file-number>)] 12
Returns -1 (true) if the end of a sequential file has been reached.
.IP EXP(X) 12
Returns e(base of natural logarithm) to the power of X.
X should be less then 10000.0.
.IP FIX(X) 12
Returns the truncated integer part of X. FIX(X) is
equivalent to SGN(X)*INT(ABS(X)).
The major difference between FIX and INT is that FIX does not
return the next lower number for negative X.
.IP HEX$(X) 12
Returns the string which represents the hexadecimal value of
the decimal argument. X is rounded to an integer using CINT
before HEX$ is evaluated.
.IP INT(X) 12
Returns the largest integer <= X.
.IP INPUT$(X[,[#]Y]) 12
Returns the string of X characters read from the terminal or
the designated file.
.IP LEX(X$) 12
Returns the number of characters in the string X$.
Non printable and blancs are counted too.
.IP LOC(<file\ number>) 12
For sequential files LOC returns 
position of the read/write head, counted in number of bytes.
For random files the function returns the record number just
read or written from a GET or PUT statement.
If nothing was read or written 0 is returned.
.IP LOG(X) 12
Returns the natural logarithm of X. X must be greater than zero.
.IP MID$(X,I,[J]) 12
To be implemented.
.IP MKI$(X) 12
Converts an integer expression to a two-byte string.
.IP MKS$(X) 12
Converts a single precision expression to a four-byte string.
.IP MKD$(X) 12
Converts a double precision expression to a eight-byte string.
.IP OCT$(X) 12
Returns the string which represents the octal value of the decimal
argument. X is rounded to an integer using CINT before OCTS is evaluated.
.IP PEEK(I) 12
Returns the byte read from the indicated memory. (Of limited use
in the context of ACK)
.IP POS(I) 12
Returns the current cursor position. To be implemented.
.IP RIGHT$(X$,I)
Returns the right most I characters of string X$.
If I=0 then the empty string is returned.
.IP RND(X) 12
Returns a random number between 0 and 1. X is a dummy argument.
.IP SGN(X) 12
If X>0 , SGN(X) returns 1.
.br
if X=0, SGN(X) returns 0.
.br
if X<0, SGN(X) returns -1.
.IP SIN(X) 12
Returns the sine of X in radians.
.IP SPACE$(X) 12
Returns  a string of spaces length X. The expression
X is rounded to an integer using CINT.
.IP STR$(X)
Returns the string representation value of X.
.IP STRING$(I,J) 12
Returns thes string of length Iwhose characters all
have ASCII code J. (or first character when J is a string)
.IP TAB(I) 12
Spaces to position I on the terminal. If the current
print position is already beyond space I,TAB
goes to that position on the next line.
Space 1 is leftmost position, and the rightmost position
is width minus 1. To be used within PRINT statements only.
.IP TAN(X) 12
Returns the tangent of X in radians. If TAN overflows
the "Overflow" message is displayed.
.IP VAL(X$) 12
Returns the numerical value of string X$.
The VAL function strips leading blanks and tabs from the
argument string.
.SH
APPENDIX A DIFFERENCES WITH MICROSOFT BASIC
.LP
The following list of Microsoft commands and statements are
not recognized by the compiler.
.DS
SPC
USR
VARPTR
AUTO
CHAIN
CLEAR	
CLOAD
COMMON
CONT
CSAVE
DELETE
EDIT
ERASE
FRE
KILL
LIST
LLIST
LOAD
LPRINT
MERGE
NAME
NEW
NULL
RENUM
RESUME
RUN
SAVE
WAIT
WIDTH LPRINT
.DE
Some statements are in the current implementation not available,
but will be soon. These include:
.DS
CALL
DEFUSR
FIELD
GET
INKEY
INPUT$
INSTR$
LEFT$
LSET
RSET
PUT
.DE