.\" $Header$
.TL
.de Sy
.LP
.IP \fBsyntax\fR 10
..
.de PU
.IP \fBpurpose\fR 10
..
.de RM
.IP \fBremarks\fR 10
..
The ABC compiler
.AU
Martin L. Kersten
.AI
Department of Mathematics and Computer Science.
.br
Vrije Universiteit
.AB
This manual describes the programming language BASIC and its compiler
included in the Amsterdam Compiler Kit.
.AE
.SH
INTRODUCTION.
.LP
The BASIC-EM compiler is an extensive implementation of the programming language BASIC.
The language structure and semantics are modelled after the BASIC interpreter/compiler of Microsoft (tr);
a detailed comparison is provided in appendix A.
.LP
The compiler generates code for a virtual machine, the EM machine [[ACM, etc]].
Using EM as an intermediate machine results in a highly portable compiler and BASIC code.
The drawback of EM is that it does not directly reflect one particular hardware design,
which means that many of the low level operations available within BASIC are ill-defined or even inapplicable.
To mention a few, the peek and poke instructions are likely to behave erroneously,
while line printer and tape deck primitives are unknown.
.LP
This manual is divided into three chapters.
The first chapter discusses the general language syntax and semantics.
Chapter two describes the statements available in BASIC-EM.
Chapter three describes the predefined functions, ordered alphabetically.
Appendix A discusses the differences with Microsoft BASIC.
Appendix B describes all reserved symbols.
Appendix C lists the error messages in use.
.sp
Additional information about EM and the Amsterdam Compiler Kit can be obtained from .... and found in ......
.SH
SYNTAX NOTATION
.LP
The conventions for syntax presentation are as follows:
.IP CAPS 10
Items are reserved words and must be input as shown.
.IP <> 10
Items in lowercase letters enclosed in angular brackets are to be supplied by the user.
.IP [] 10
Items are optional.
.IP \.\.\. 10
Items may be repeated any number of times.
.IP {} 10
A choice between two or more alternatives. At least one of the entries must be chosen.
.IP | 10
Vertical bars separate the choices within braces.
.LP
All punctuation must be included where shown.
.NH 1
GENERAL INFORMATION
.LP
The BASIC-EM compiler is designed for a UNIX based environment.
It accepts a text file with your BASIC program (suffix .b) and generates an executable file, called a.out.
.LP
Should we call the preprocessor first?
.NH 2
LINE FORMAT
.LP
A BASIC program consists of a series of lines, each starting with a line number in the range 0 to 65529.
A line may consist of more than one physical line on your terminal, but is limited to 1024 characters.
Multiple BASIC statements may be placed on a single line, provided they are separated by a colon (:).
.NH 2
CONSTANTS
.LP
The BASIC compiler character set comprises alphabetic characters, numeric characters,
and the special characters shown below.
.DS
= + - * / ^ ( ) % # $ \\ _ ! [ ] , . ; : & ' ? > < \\ (blank)
.DE
.LP
BASIC uses two different types of constants during processing: numeric and string constants.
.br
A string constant is a sequence of characters taken from the ASCII character set,
enclosed by double quotation marks.
.br
Numeric constants are positive or negative numbers, grouped into five different classes.
.IP "a) integer constants" 25
Whole numbers in the range -32768 to 32767.
Integer constants do not contain decimal points.
.IP "b) fixed point constants" 25
Positive or negative real numbers, i.e. numbers with a decimal point.
.IP "c) floating point constants" 25
Real numbers in scientific notation.
A floating point constant consists of an optionally signed integer or fixed point number
followed by the letter E (or D) and an optionally signed integer (the exponent).
The allowable range of floating point constants is 10^-38 to 10^+38.
.IP "d) Hex constants" 25
Hexadecimal numbers, denoted by the prefix &H.
.IP "e) Octal constants" 25
Octal numbers, denoted by the prefix &O.
.NH 2
VARIABLES
.LP
Variables are names used to represent values in a BASIC program.
A variable is given a value by an assignment in the program.
Before a variable is assigned, its value is assumed to be zero.
.br
Variable names are composed of letters, digits or the decimal point, starting with a letter.
Up to 40 characters are significant.
A variable name may be followed by any of the following type declaration characters:
.IP % 5
Defines an integer variable
.IP ! 5
Defines a single precision variable (see below)
.IP # 5
Defines a double precision variable
.IP $ 5
Defines a string variable.
.LP
NOTE: Two variables with the same name but different types are considered illegal (DONE?).
.LP
Besides single valued variables, values may be grouped into tables or arrays.
Each element in an array is referenced by the array name and an index;
such a variable is called a subscripted variable.
An array has as many subscripts as there are dimensions in the array, the maximum of which is 11.
.br
If a variable starts with FN it is assumed to be a call to a user defined function.
.br
A variable name may not be a reserved word nor the name of a predefined function.
A list of all reserved identifiers is included as Appendix ?.
.NH 2
EXPRESSIONS
.LP
BASIC-EM differs from Microsoft BASIC in supporting floats in one precision only (due to EM).
All floating point constants have the same precision, i.e. 16 digits.
.LP
When necessary the compiler will convert a numeric value from one type to another.
A value is always converted to the precision of the variable it is assigned to.
When a floating point value is converted to an integer the fractional portion is rounded.
In an expression all values are converted to the same degree of precision,
i.e. that of the most precise operand.
.br
Division by zero results in the message "Division by zero".
If overflow (or underflow) occurs, the "Overflow (underflow)" message is displayed
and execution is terminated (contrary to Microsoft).
.SH
Arithmetic
.LP
The arithmetic operators, in order of precedence, are:
.DS L
^            Exponentiation
-            Negation
*,/,\\,MOD    Multiplication, Division, Integer division, Remainder
+,-          Addition, Subtraction
.DE
The operator \\ denotes integer division; its operands are rounded to integers before the operator is applied.
Modulus arithmetic is denoted by the operator MOD,
which yields the integer value that is the remainder of an integer division.
.br
The order in which operators are performed can be changed with parentheses.
.SH
Relational
.LP
The relational operators, in order of precedence, are:
.DS
=     Equality
<>    Inequality
<     Less than
>     Greater than
<=    Less than or equal to
>=    Greater than or equal to
.DE
The relational operators compare two values and return either "true" (-1) or "false" (0) (see the IF statement).
The precedence of the relational operators is lower than that of the arithmetic operators.
.SH
Logical
.LP
The logical operators perform tests on multiple relations, bit manipulations, or Boolean operations.
The logical operators return a bitwise result ("true" or "false").
In an expression, logical operators are performed after the relational and arithmetic operators.
The logical operators work by converting their operands to signed two's-complement integers
in the range -32768 to 32767.
.DS
NOT    Bitwise negation
AND    Bitwise and
OR     Bitwise or
XOR    Bitwise exclusive or
EQV    Bitwise equivalence
IMP    Bitwise implies
.DE
.SH
Functional
.LP
A function is used in an expression to call a system or user defined function.
A list of predefined functions is presented in chapter 3.
.SH
String operations
.LP
Strings can be concatenated by using +.
Strings can be compared with the relational operators.
String comparison is performed in lexicographic order.
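.LP
As a small illustration of the operator rules in this section, the following sketch combines
arithmetic, relational, logical and string operators.
The line numbers and variable names are chosen for this example only,
and PRINT is assumed to behave as in Microsoft BASIC, since its description is still missing from this manual.
Under the rules given above the program is expected to print 1, -1, -1 and ABCD, in that order;
the exact spacing of the output is not specified here.
.DS
10 REM operator sketch, not taken from the compiler sources
20 A% = 7 : B% = 2
30 REM remainder of the integer division 7 by 2
40 PRINT A% MOD B%
50 REM a relation yields -1 for true and 0 for false
60 PRINT A% > B%
70 REM AND works bitwise on the two integer results
80 PRINT (A% > B%) AND (B% > 0)
90 REM string concatenation with +
100 PRINT "AB" + "CD"
110 END
.DE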
.NH 2
ERROR MESSAGES
.LP
The occurrence of an error results in termination of the program
unless an ON ERROR GOTO statement has been encountered.
.NH 1
B-EM STATEMENTS
.LP
This chapter describes the statements available within the BASIC-EM compiler.
Each description is formatted as follows:
.Sy
Shows the correct syntax for the statement.
See the introduction of the syntax notation above.
.PU
Describes the purpose and details of the instructions.
.RM
Describes special cases, deviations from Microsoft BASIC etc.
.LP
.NH 2
CALL
.Sy
CALL [()]
.PU
The CALL statement provides the means to execute procedures and functions
written in another language included in the Amsterdam Compiler Kit.
The argument list consists of (subscripted) variables.
The BASIC compiler pushes the addresses of the arguments on the stack in order of encounter.
.RM
Not yet available
.NH 2
CLOSE
.Sy
CLOSE [[#][,[#]]]
.PU
To terminate I/O on a disk file.
The file number is the number associated with the file when it was OPENed (See OPEN).
Omission of parameters results in closing all files.
.sp
The END statement and STOP statement always issue a CLOSE of all files.
.NH 2
DATA
.Sy
DATA
.PU
DATA statements are used to construct a data bank of values that are accessed by the program's READ statement.
DATA statements are non-executable; the data items are assembled in a data file by the BASIC compiler.
This file can be replaced, provided the layout remains the same
(otherwise the RESTORE won't function properly).
.sp
The list of data items consists of numeric and string constants as discussed in section 1.
Moreover, string constants starting with a letter and not containing blanks, newlines, commas or colons
need not be enclosed in string quotes.
.sp
DATA statements can be reread using the RESTORE statement.
.NH 2
DEF FN
.Sy
DEF FN [()]=
.PU
To define and name a function that is written by the user.
The function name must be an identifier and should be preceded by FN,
which is considered an integral part of the function name.
The expression after the = sign is evaluated upon function call.
.sp
The parameter list comprises a comma separated list of variable names, used within the function definition,
that are to be replaced by values upon function call.
The variable names defined in the parameter list, called formal parameters,
do not affect the definition and use of variables defined with the same name in the rest of the BASIC program.
.sp
A type declaration character may be suffixed to the function name
to designate the data type of the function result.
.NH 2
DEFINT/SNG/DBL/STR
.Sy
DEF
.PU
Any undefined variable starting with a letter included in the range of letters
is declared of the given type unless a type declaration character is appended.
The range of letters is a comma separated list of characters and character ranges (-).
.NH 2
DIM
.Sy
DIM
.PU
The DIM statement allocates storage for subscripted variables.
If an undefined subscripted variable is used, the maximum value of the array subscript(s) is assumed to be 10.
A subscript out of range is signalled by the program (when RCK works).
The minimum subscript value is 0, unless the OPTION BASE statement has been encountered.
.sp
All elements of a subscripted variable are initially zero.
.sp
BUG. Multi-dimensional arrays MUST be defined.
.NH 2
END
.Sy
END
.PU
END terminates a BASIC program and returns to the UNIX shell.
An END statement at the end of the BASIC program is optional.
.NH 2
ERR and ERL
.PU
Whenever an error occurs the variable ERR contains the error number
and ERL the BASIC line where the error occurred.
The variables are usually used in error handling routines provided by the user.
.NH 2
ERROR
.Sy
ERROR
.PU
To simulate the occurrence of a BASIC error.
To define your own error code use a value not already in use by the BASIC runtime system.
The list of error messages currently in use can be found in appendix B.
.NH 2
FIELD
.PU
To be implemented.
.NH 2
FOR...NEXT
.Sy
FOR = TO[STEP]
.br
\&......
.br
NEXT [][,...]
.PU
The FOR statement allows a series of statements to be performed repeatedly.
The loop variable is used as a counter.
During the first execution pass it is assigned the initial value, an arithmetic expression.
After each pass the counter is incremented with the step size, an expression.
Omission of the step size is interpreted as an increment of 1.
Execution of the program lines specified between the FOR and the NEXT statement
is terminated as soon as the counter is greater than the final value.
.sp
The NEXT statement is labeled with the name(s) of the counter to be incremented.
.sp
The body of the FOR statement is skipped when the initial value of the loop times the sign of the step
exceeds the highest value times the sign of the step.
.sp
The variables mentioned in the NEXT statement may be omitted,
in which case the counter of the most recent FOR statement is incremented.
If a NEXT statement is encountered before its corresponding FOR statement,
the error message "NEXT without FOR" is generated.
.NH 2
GET
.Sy
GET [#][, ]
.PU
To be implemented.
.NH 2
GOSUB...RETURN
.Sy
GOSUB
.PU
To branch unconditionally to a specified line in the program.
If the line does not exist, the compilation error message "Line not defined" is displayed.
.RM
Microsoft BASIC continues at the first line equal to or greater than the line specified.
.NH 2
IF...THEN
.Sy
.br
IF THEN {|} [ELSE {|}]
.br
.Sy
IF GOTO [ELSE {|}]
.PU
The IF statement is used to make a decision regarding the program flow based on the result of the expression.
If the expression is not zero, the THEN or GOTO clause is executed.
If the result of the expression is zero, the THEN or GOTO clause is ignored
and the ELSE clause, if present, is executed.
.br
IF..THEN..ELSE statements may be nested.
Nesting is limited by the length of the line.
The ELSE clause matches the closest unmatched THEN.
.sp
When using IF to test equality for a value that is the result of a floating point expression,
remember that the internal representation of the value may not be exact.
Therefore, the test should be against a range to handle the relative error.
.RM
Microsoft BASIC allows a comma before THEN.
.NH 2
INPUT
.Sy
INPUT [;][<"prompt string">;]
.PU
An INPUT statement can be used to obtain values from the user at the terminal.
When an INPUT statement is encountered a question mark is printed to indicate the program is awaiting data.
If <"prompt string"> is included, the string is printed before the question mark.
The question mark is suppressed when the prompt string is followed by a comma, rather than a semicolon.
.sp
For each variable in the variable list a value should be supplied.
Data items presented should be separated by a comma.
.sp
The type of the variable in the variable list must agree with the type of the data item entered.
Responding with too few or too many data items causes the message "?Redo".
No assignment of input values is made until an acceptable response is given.
.RM
The option to discard the carriage return with the semicolon after the input symbol is not yet implemented.
.NH 2
INPUT [#]
.Sy
INPUT #,
.PU
The purpose of the INPUT# statement is to read data items from a sequential file
and assign them to program variables.
The file number is the number used to open the file for input.
The variables mentioned are (subscripted) variables.
The type of the data items read should agree with the type of the variables.
A type mismatch results in the error message "XXXXX".
.sp
The data items on the sequential file are separated by commas and newlines.
In scanning the file, leading spaces, new lines, tabs, and carriage returns are ignored.
The first character encountered is assumed to be the start of a new item.
String items need not be enclosed in double quotes,
provided they do not contain spaces, tabs, newlines or commas.
.RM
Microsoft BASIC won't assign values until the end of the input statement.
This means that the user has to supply all the information.
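.LP
The INPUT and IF...THEN descriptions above can be combined in a short sketch that reads a number from the terminal
and tests a floating point result against a range instead of testing for equality.
The variable names, the target value 0.3 and the tolerance 1E-6 are arbitrary choices made for this example,
and PRINT is assumed to behave as in Microsoft BASIC.
.DS
10 REM tolerance sketch, values chosen for illustration only
20 INPUT "Enter a number"; X
30 Y = X * 0.1
40 REM compare against a small range around 0.3, not with =
50 IF ABS(Y - 0.3) < 1E-6 THEN PRINT "about 0.3" ELSE PRINT "not 0.3"
60 END
.DE
A fixed tolerance is the simplest form of such a range test;
for values of very different magnitude a tolerance scaled by the operands may be more appropriate.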
.NH 2
LET
.Sy
[LET]=
.PU
To assign the value of an expression to a (subscripted) variable.
The type conversions as dictated in section 1.X apply.
.NH 2
LINE INPUT
.Sy
LINE INPUT [;][<"prompt string">;]
.PU
An entire line of input is assigned to the string variable.
See INPUT for the meaning of the <"prompt string"> option.
.NH 2
LINE INPUT [#]
.Sy
LINE INPUT #,
.PU
Read an entire line of text from a sequential file and assign it to a string variable.
.NH 2
LSET and RSET
.PU
To be implemented
.NH 2
MID$
.Sy
MID$(,n[,m])=
.PU
To replace a portion of a string with another string value.
The characters of the string on the right-hand side replace characters in string expr1, starting at position n.
If m is present, at most m characters are copied, otherwise all characters are copied.
However, the string obtained never exceeds the length of string expr1.
.NH 2
ON ERROR GOTO
.Sy
ON ERROR GOTO
.PU
To enable error handling within the BASIC program.
An error may result from arithmetic errors, disk problems, interrupts, or as a result of the ERROR statement.
After printing an error message the program is continued at the statements associated with the given line.
.sp
Error handling is disabled using ON ERROR GOTO 0.
Subsequent errors result in an error message and program termination.
.NH 2
ON...GOSUB and ON ...GOTO
.Sy
ON GOSUB
ON GOTO
.PU
To branch to one of several specified line numbers or subroutines, based on the result of the expression.
The line numbers in the list are considered the first, second, etc. alternative.
Branching to the first occurs when the expression evaluates to one, to the second alternative on two, etc.
If the value of the expression is zero or greater than the number of alternatives,
processing continues at the first statement following the ON..GOTO (ON GOSUB) statement.
When the expression results in a negative number an "Illegal function call" error occurs.
.NH 2
OPEN
.NH 2
OPTION BASE
.Sy
OPTION BASE n
.PU
To declare the lower bound of subsequent array subscripts as either 0 or 1.
The default lower bound is zero.
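.LP
The effect of OPTION BASE on subscripted variables can be sketched as follows.
The array name T and its size are chosen for this example only.
The form DIM T(5) is an assumption borrowed from Microsoft BASIC, because the DIM syntax is not spelled out above,
and PRINT is likewise assumed to behave as in Microsoft BASIC.
.DS
10 REM array sketch, assuming the Microsoft BASIC form of DIM
20 OPTION BASE 1
30 DIM T(5)
40 REM with lower bound 1 the elements are T(1) to T(5)
50 FOR I% = 1 TO 5
60 T(I%) = I% * I%
70 NEXT I%
80 PRINT T(5)
90 END
.DE
Under these assumptions line 80 is expected to print 25.
Without line 20 the lower bound would be zero and T(0) would be a valid element as well.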
.NH 2
POKE
.Sy
POKE ,
.PU
To poke around in memory.
The use of this statement is not recommended, because it requires full understanding
of both the implementation of the Amsterdam Compiler Kit and the hardware characteristics.
.NH 2
PRINT [USING]
.NH 2
PUT
.PU
To be implemented
.NH 2
RANDOMIZE
.Sy
RANDOMIZE []
.PU
To reset the random seed.
When the expression is omitted, the system will ask for a value between -32768 and 32767.
The random number generator returns the same sequence of values provided the same seed is used.
.NH 2
READ
.Sy
READ
.PU
To read values from the DATA statements and assign them to variables.
The type of the variables should match the type of the items being read,
otherwise a "Syntax error" occurs.
.NH 2
REM
.Sy
REM
.PU
To include explanatory information in a program.
The REM statements are not executed.
A single quote has the same effect as : REM,
which allows for the inclusion of a comment at the end of the line.
.RM
Microsoft BASIC does not allow REM statements as part of DATA lines.
.NH 2
RESTORE
.Sy
RESTORE []
.PU
To allow DATA statements to be re-read from a specific line.
After a RESTORE statement is executed, the next READ accesses the first item of the DATA statements.
If a line is specified, the next READ accesses the first item in the specified line.
.sp
Note that DATA statements result in a sequential data file generated by the compiler,
which is read by the READ statements.
This data file may be replaced with a modified version using the operating system functions,
provided the same layout of items (same number of lines and items per line) is used.
.NH 2
STOP
.Sy
STOP
.PU
To terminate the execution of a program and return to the operating system command interpreter.
A STOP statement results in the message "Break in line ???"
.NH 2
SWAP
.Sy
SWAP ,
.PU
To exchange the values of two variables.
.NH 2
TRON/TROFF
.Sy
TRON
.Sy
TROFF
.PU
As an aid in debugging, the TRON statement results in a program that lists each line being interpreted.
TROFF disables generation of this code.
.NH 2
WHILE...WEND
.Sy
WHILE ..... WEND
.PU
To execute a series of BASIC statements as long as a conditional expression is true.
WHILE...WEND loops may be nested.
.NH 2
WRITE
.Sy
WRITE []
.PU
To write data at the terminal in DATA statement layout conventions.
The expressions should be separated by commas.
.NH 2
WRITE #
.Sy
WRITE # ,
.PU
To write a sequential data file, opened with the "O" mode.
The values are written using the DATA statement layout conventions.
.NH
FUNCTIONS
.LP
.IP ABS(X) 12
Returns the absolute value of expression X
.IP ASC(X$) 12
Returns the numeric value of the first character of the string.
If X$ is not initialized an "Illegal function call" error is returned.
.IP ATN(X) 12
Returns the arctangent of X in radians.
The result is in the range of -pi/2 to pi/2.
.IP CDBL(X) 12
Converts X to a double precision number.
.IP CHR$(X) 12
Converts the integer value X to its ASCII character.
X must be in the range of 0 to 127.
It is used for cursor addressing and generating bell signals.
.IP CINT(X) 12
Converts X to an integer by rounding the fractional portion.
If X is not in the range -32768 to 32767 an "Overflow" error occurs.
.IP COS(X) 12
Returns the cosine of X in radians.
.IP CSNG(X) 12
Converts X to a double precision number.
.IP CVI(<2-bytes>) 12
Converts a two byte string value to an integer number.
.IP CVS(<4-bytes>) 12
Converts a four byte string value to a single precision number.
.IP CVD(<8-bytes>) 12
Converts an eight byte string value to a double precision number.
.IP EOF[()] 12
Returns -1 (true) if the end of a sequential file has been reached.
.IP EXP(X) 12
Returns e (base of the natural logarithm) to the power of X.
X should be less than 10000.0.
.IP FIX(X) 12
Returns the truncated integer part of X.
FIX(X) is equivalent to SGN(X)*INT(ABS(X)).
The major difference between FIX and INT is that FIX does not return the next lower number for negative X.
.IP HEX$(X) 12
Returns the string which represents the hexadecimal value of the decimal argument.
X is rounded to an integer using CINT before HEX$ is evaluated.
.IP INT(X) 12
Returns the largest integer <= X.
.IP INPUT$(X[,[#]Y]) 12
Returns the string of X characters read from the terminal or the designated file.
.IP LEN(X$) 12
Returns the number of characters in the string X$.
Non-printable characters and blanks are counted too.
.IP LOC() 12
For sequential files LOC returns the position of the read/write head, counted in number of bytes.
For random files the function returns the record number just read or written by a GET or PUT statement.
If nothing was read or written 0 is returned.
.IP LOG(X) 12
Returns the natural logarithm of X.
X must be greater than zero.
.IP MID$(X,I,[J]) 12
To be implemented.
.IP MKI$(X) 12
Converts an integer expression to a two-byte string.
.IP MKS$(X) 12
Converts a single precision expression to a four-byte string.
.IP MKD$(X) 12
Converts a double precision expression to an eight-byte string.
.IP OCT$(X) 12
Returns the string which represents the octal value of the decimal argument.
X is rounded to an integer using CINT before OCT$ is evaluated.
.IP PEEK(I) 12
Returns the byte read from the indicated memory location.
(Of limited use in the context of ACK)
.IP POS(I) 12
Returns the current cursor position.
To be implemented.
.IP RIGHT$(X$,I)
Returns the rightmost I characters of string X$.
If I=0 then the empty string is returned.
.IP RND(X) 12
Returns a random number between 0 and 1.
X is a dummy argument.
.IP SGN(X) 12
If X>0, SGN(X) returns 1.
.br
If X=0, SGN(X) returns 0.
.br
If X<0, SGN(X) returns -1.
.IP SIN(X) 12
Returns the sine of X in radians.
.IP SPACE$(X) 12
Returns a string of spaces of length X.
The expression X is rounded to an integer using CINT.
.IP STR$(X)
Returns the string representation of the value of X.
.IP STRING$(I,J) 12
Returns the string of length I whose characters all have ASCII code J.
(or first character when J is a string)
.IP TAB(I) 12
Spaces to position I on the terminal.
If the current print position is already beyond space I, TAB goes to that position on the next line.
Space 1 is the leftmost position, and the rightmost position is width minus 1.
To be used within PRINT statements only.
.IP TAN(X) 12
Returns the tangent of X in radians.
If TAN overflows the "Overflow" message is displayed.
.IP VAL(X$) 12
Returns the numerical value of string X$.
The VAL function strips leading blanks and tabs from the argument string.
.SH
APPENDIX A DIFFERENCES WITH MICROSOFT BASIC
.LP
The following Microsoft commands and statements are not recognized by the compiler.
.DS
SPC USR VARPTR
AUTO CHAIN CLEAR CLOAD COMMON CONT CSAVE
DELETE EDIT ERASE FRE KILL LIST LLIST LOAD LPRINT
MERGE NAME NEW NULL RENUM RESUME RUN SAVE WAIT
WIDTH LPRINT
.DE
Some statements are not available in the current implementation, but will be soon.
These include:
.DS
CALL DEFUSR FIELD GET INKEY INPUT$ INSTR$ LEFT$ LSET RSET PUT
.DE