.tr ~
.TH UNI_ASS VI
.ad
.SH NAME
uni_ass \- universal assembler/loader
.SH SYNOPSIS
/usr/em/lib/\fImachine\fP_as [options] argument ...
.SH DESCRIPTION
The universal assembler is a framework allowing easy
generation of an assembler for any byte-oriented machine.
The framework includes common pseudo-instructions for name
definition, label usage, storage allocation and initialization
and expression evaluation.
The resulting program assembles and links assembly modules.
Arguments may be flags, assembly language modules or libraries.
.br
Flags are:
.IP -d[\fIn\fP]
Produce a listing on standard output.
The octal number \fIn\fP is mainly used for debugging purposes.
The default is 700; 500 and 600 give slightly different
listings.
.IP -s[\fIn\fP]
Produce a human-readable symbol table on standard output.
The default for \fIn\fP is 3.
The value 2 causes a listing of only the symbols internal to
the modules.
The value 1 causes a listing of external symbols only.
.IP -o
The argument following this flag is taken as the name of the
resulting load file.
The default name is \fBa.out\fP.
.PD
.PP
The assembler assembles
and links together assembly language modules
for the target machine,
taken from files and libraries,
producing an a.out file.
.PP
Two different types of arguments are allowed:
.IP "1-"
Assembly language modules
.PD 0
.IP "2-"
UNIX archives, as maintained by arch(I). These archives must
contain only
assembly language modules with \fI.define\fP as their first
statement.
.PD
.PP
Note that it is not possible to do a partial load;
loading starts from assembly language and produces binary
machine code. No symbol table and no relocation bits are produced.
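.PP
As an illustration, a call of the following shape would assemble one module,
resolve its unsatisfied references from one archive, list the external
symbols and name the load file.
The names \fIprog\fP, \fIstart.s\fP and \fItail.a\fP are hypothetical, and
\fImachine\fP stands for the name of the target machine:
.nf
.sp 1
/usr/em/lib/\fImachine\fP_as -s1 -o prog start.s tail.a
.sp 1
.fi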
.SH "SEGMENTS and TYPES"
The statements allocating and initializing space,
like instructions and
some pseudo-instructions, reserve that space in the current
segment.
The type of the current segment is determined by
one of the pseudo-instructions: \fI.text, .data, .bss\fP and
\&\fI.org\fP.
The assembler concatenates all space allocated in each of the
text, data and bss segments.
That is: every byte in a text segment is followed by another
byte in the text segment, except the last, of which there is
only one in each program.
The org segment differs from the other three in the sense that
the assembler makes no attempt to concatenate pieces of org
segments.
Each \fI.org\fP pseudo-instruction has a parameter telling where it
should start allocating space.
In the final stages of the assembly the text, data and bss
segments are concatenated in that order after the length of
each segment has been made a multiple of a machine dependent
constant.
The first segment (text) starts at location 0.
.br
The start address of each segment can be set by the .base
pseudo-instruction.
.sp
The labels defined in a particular segment
have the type of that
segment; other types are: \fIundefined\fP and \fIabsolute\fP.
All variables that do not have a value have the type
\fIundefined\fP; a good example is an unsatisfied external
reference.
Numbers have the type \fIabsolute\fP.
The type of expressions depends on both the operators and the
operands used.
Generally, but not always, the following rule holds: whenever
one of the operands is absolute, the resulting type is that
of the other operand.
Not every operation is allowed on every combination of types,
for example: it is not allowed to add two \fItext\fP values.
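.sp
As a sketch of the general rule, with the hypothetical names \fItab\fP,
\fIsize\fP and \fIlast\fP:
.nf
.sp 1
        .data
tab:
        .space 8
size = 8                ! a number, so absolute
last = tab + size - 1   ! data, the type of the non-absolute operand
.sp 1
.fi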
.SH SYNTAX
.IP letters
Both upper and lower case may be used and are seen as
different.
The underscore '_' is considered to be a letter.
.IP identifiers
Identifiers are a sequence of letters and digits, starting with
a letter or a period '.'.
Only the first eight characters are remembered by the
assemblers; identifiers with the same first eight characters
are considered to be identical.
Identifiers can receive a value only once, through assignment or a
label definition.
.IP "local labels"
Local labels consist of a single digit.
They can only be defined in the label part of a statement and
can be used anywhere an identifier is allowed.
They can be redefined at will.
Two forms of use exist: \fIf\fPorward and \fIb\fPackward
references.
The first consists of the digit followed by an \fIf\fP
and refers to the first definition of that label following the
reference.
The second consists of the digit followed by a \fIb\fP
and refers to the last definition of the label before the
reference.
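.br
A sketch of both forms, using only the .word pseudo-instruction so that it
does not depend on the instruction set of a target machine:
.nf
.sp 1
        .data
1:
        .word 1b        ! the definition of 1 on the line above
        .word 1f        ! the definition of 1 on the line below
1:
        .word 1b        ! the second definition of 1
.sp 1
.fi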
.IP strings
Strings are enclosed in single "'" or double """ quotes.
The use of \eddd where ddd is an octal number and \en, \er,
\et, \eb and \ef is allowed and has the same meaning as in the
C language.
.IP numbers
Numbers are a sequence of letters and digits, starting with a
digit.
No difference is made between small and capital letters.
.br
The base of the number is determined in the following way:
.nf
if the number ends with an 'h' it is hexadecimal else
if the number starts with '0x' it is hexadecimal else
if the number starts with '0' it is octal else
it's decimal.
.fi
Note that the number '0x10h' is an illegal hexadecimal number,
because 'x' is an illegal hexadecimal digit.
The number should be written as '0x10' or '10h'.
The range of numbers depends on the machine.
A rule of thumb is that the number of bits allowed in numbers
is the same as the width of the machine's registers.
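.br
As an illustration of these rules, each of the following statements should
initialize one byte with the value 17:
.nf
.sp 1
        .byte 17        ! decimal
        .byte 021       ! octal, leading 0
        .byte 0x11      ! hexadecimal, leading 0x
        .byte 11h       ! hexadecimal, trailing h
.sp 1
.fi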
.IP expressions
The following operators are recognized:
.nf
.sp 1
op   type     action

|    binary   bitwise or
&    binary   bitwise and
^    binary   bitwise exclusive or
+    binary   two's complement addition
+    unary    no effect
-    binary   two's complement subtraction
-    unary    two's complement negation
*    binary   two's complement multiplication
/    binary   two's complement division
%    binary   two's complement remainder
.tr ~~
~    unary    one's complement negation
.tr ~
.sp 1
.fi
The operator precedence is the same as in C.
.br
The operands allowed are: identifiers, numbers and expressions.
The evaluation order can be changed using the brackets '[' and
\&']'.
.sp
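As a sketch of precedence and bracketing, with the hypothetical identifiers
\fIval1\fP and \fIval2\fP:
.nf
.sp 1
val1 = 2 + 3 * 4        ! 14, since * binds tighter than +
val2 = [2 + 3] * 4      ! 20, the brackets force the addition first
.sp 1
.fi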
.IP comment
The character '!' denotes the start of a comment; every character
up to the next newline is skipped.
Exclamation marks in strings are not recognized as the start of
a comment.
.IP statements
Statements are separated by newlines and ';' and can be
preceded by label definitions.
Label definitions have the form "\fIidentifier\fP~:" or
"\fIdigit\fP~:".
Statements can be: empty, an assignment, an instruction or a
pseudo-instruction.
.IP assignment
An assignment has the form:
.br
\fIidentifier\fP = \fIexpression\fP
.br
The identifier receives the value and type of the expression.
.IP instruction
The syntax of an instruction depends on the type of the target
machine.
An example of an assembly file is presented at
the end of the document.
.IP pseudo-instruction
.de Pu
.sp 1
.ti +5
\&\\$1
.sp 1
..
.Pu ".extern \fIidentifier [, identifier]*\fP"
The identifiers mentioned in the list are exported and can be
used in other modules.
.Pu ".define \fIidentifier [, identifier]*\fP"
Used for modules that are to be part of a library.
The .define pseudo-instructions should be the first statements in such modules.
When scanning a module in a library the universal assembler
checks whether any of its unsatisfied external references is
mentioned in a .define list. If so, it includes that module in
the program.
The identifiers mentioned in the list are exported and can be
used in other modules.
.Pu ".byte \fIexpression [, expression]*\fP"
Initialize a sequence of bytes.
This is not followed by automatic alignment.
.Pu ".short \fIexpression [, expression]*\fP"
Initialize a sequence of shorts (2-byte values).
This is not followed by automatic alignment.
.Pu ".long \fIexpression [, expression]*\fP"
Initialize a sequence of longs (4-byte values).
This is not followed by automatic alignment.
.Pu ".word \fIexpression [, expression]*\fP"
Initialize a sequence of words. The number of bytes occupied by
a word depends on the target machine.
This is not followed by automatic alignment.
.Pu ".ascii \fIstring\fP"
Initialize a sequence of bytes with the value of the bytes in
the string.
This is not followed by automatic alignment.
.Pu ".asciz \fIstring\fP"
Initialize a sequence of bytes with the value of the bytes in
the string and terminate this with an extra zero byte.
This is not followed by automatic alignment.
.Pu ".align [\fIexpression\fP]"
Adjust the current position to a multiple of the value of the
expression.
The default is the word-size of the target machine.
.Pu ".space \fIexpression\fP"
Allocate the indicated number of bytes.
The expression must be absolute.
.Pu ".org \fIexpression\fP"
Start an org segment with the location counter at the indicated
value.
The value of the expression must be absolute.
.Pu ".text"
.Pu ".data"
.Pu ".bss"
Start a segment of the indicated type.
.Pu ".base \fIexpression\fP"
Set the starting address of the current segment to the value of
the expression.
The expression must be absolute.
.Pu ".errnz \fIexpression\fP"
Stop with a fatal error message when the value of the
expression is non-zero.
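.sp 1
A sketch of a small library module using some of these pseudo-instructions;
the names \fImsg\fP, \fImsglen\fP and \fIbuf\fP are hypothetical:
.nf
.sp 1
\&.define msg,msglen
        .data
msg:
        .asciz "hello"  ! six bytes: five characters and a zero byte
msglen = 6
        .bss
        .align 2        ! continue at a multiple of 2
buf:
        .space 16       ! sixteen bytes, not in the .define list
.sp 1
.fi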
.SH "SEE ALSO"
ack(I), arch(I), a.out(V)
.SH "EXAMPLE"
An example of Intel 8086 assembly code.
.sp 2
.nf
.ta 8 16 32 40 48 56 64
\&.define begbss
\&.define hol0,.diverr,.reghp
\&.define EIDIVZ

EIDIVZ = 6

base = 0x01C0
topmem = 0xFFF0

        .org topmem-16
\&.extern __n_line
maxmem:
__n_line:
        .space 16
        .errnz __n_line-0xFFE0

        .base base

        .text
        cld
        xor ax,ax
        mov (2),cs
        mov (0),.diverr
        mov sp,maxmem
        mov di,begbss
        mov cx,[[endbss-begbss]/2]&0x7FFF
        ! xor ax,ax ! ax still is 0
        rep stos
        mov ax,1
        push ax
        call _start
3:
        jmp 3b
\&.diverr:
        push ax
        mov ax,EIDIVZ
        call .error
        pop ax
        iret
        cmp 0,4(bx)(di) ! just to show this addr. mode

        .data
begdata:
hol0:
        .word 0,0
        .word 0,0
        .word 3f
\&.reghp:
        .word endbss
3:
        .asciz "PROGRAM"
.sp 3
.fi
.SH DIAGNOSTICS
Various diagnostics may be produced.
The most likely errors, however, are unresolved references,
probably caused by the omission of a library argument.
.SH BUGS
The resulting a.out file contains no information about the size
and starting address of the segments.
.br
The resulting a.out file does not contain a symbol table.
.br
The alignment might give rise to internal assertion errors when
the alignment requested is larger than the machine dependent
segment alignment.
.br
Identifiers declared as externals cannot be used as locals in
any following module.