.\" ack/man/uni_ass.6, 1984-07-12 15:18:13 +00:00
.\" $Header$
.tr ~
.TH UNI_ASS VI
.ad
.SH NAME
uni_ass \- universal assembler/loader
.SH SYNOPSIS
/usr/em/lib/\fImachine\fP_as [options] argument ...
.SH DESCRIPTION
The universal assembler is a framework allowing easy
generation of an assembler for any byte-oriented machine.
The framework includes common pseudo instructions for name
definition, label usage, storage allocation and initialization
and expression evaluation.
The resulting program assembles and links assembly modules.
Arguments may be flags, assembly language modules or libraries.
.br
Flags are:
.IP -d[\fIn\fP]
Produce a listing on standard output.
The octal number \fIn\fP is mainly used for debugging purposes.
The default is 700. 500 and 600 give slightly different
listings.
.IP -s[\fIn\fP]
Produce a human-readable symbol table on standard output.
The default for \fIn\fP is 3.
The value 2 causes a listing of only the symbols internal to
the modules.
The value 1 causes a listing of external symbols only.
.IP -o
The argument following this flag is taken as the name of the
resulting load file.
The default name is \fBa.out\fP.
.PD
.PP
The assemblers assemble
and link together assembly language modules
from files and libraries,
producing an a.out file.
.PP
Two different types of arguments are allowed:
.IP "1-"
Assembly language modules
.PD 0
.IP "2-"
UNIX archives, as maintained by arch(I). These archives must
contain only
assembly language modules with \fI.define\fP as their first
statement.
.PD
.PP
Note that it is not possible to do a partial load;
loading starts from assembly language and produces binary
machine code. No symbol table and no relocation bits are produced.
.SH "SEGMENTS and TYPES"
Statements that allocate and initialize space,
such as instructions and
some pseudo-instructions, reserve that space in the current
segment.
The type of the current segment is set by
one of the pseudo-instructions: \fI.text, .data, .bss\fP and
\&\fI.org\fP.
The assembler concatenates all space allocated in each of the
text, data and bss segments.
That is: every byte in a text segment is followed by another
byte in the text segment except the last, of which there is
only one in each program.
The org segment differs from the other three in the sense that
the assembler makes no attempt to concatenate pieces of org
segments.
Each \fI.org\fP pseudo-instruction has a parameter telling where it
should start allocating space.
In the final stages of the assembly the text, data and bss
segments are concatenated in that order after the length of
each segment has been made a multiple of a machine dependent
constant.
The first segment (text) starts at the location that is given
as an argument to the .base pseudo-instruction.
The default is 0.
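.sp
As an illustration only, the final layout step described above can be
sketched in Python.
The function names and the alignment value of 2 are assumptions made for
this sketch; the real constant is machine dependent and is not taken from
the assembler's source.
.nf
```python
def round_up(length, multiple):
    """Round length up to the next multiple of the alignment constant."""
    return (length + multiple - 1) // multiple * multiple


def layout_segments(text_len, data_len, bss_len, base=0, align=2):
    """Return {segment name: (start address, padded length)}.

    base plays the role of the .base argument (default 0);
    align stands in for the machine dependent constant (assumed here).
    """
    layout = {}
    start = base
    for name, length in (("text", text_len),
                         ("data", data_len),
                         ("bss", bss_len)):
        padded = round_up(length, align)   # pad each segment first
        layout[name] = (start, padded)     # then place it after the previous one
        start += padded
    return layout
```
.fi
With these assumed inputs, a 5-byte text segment based at 0x01C0 is padded
to 6 bytes, so the data segment starts at 0x01C6.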
.sp
The labels defined in a particular segment
have the type of that
segment; other types are \fIundefined\fP and \fIabsolute\fP.
All variables that do not have a value have the type
\fIundefined\fP; a good example is an unsatisfied external
reference.
Numbers have the type \fIabsolute\fP.
The type of expressions depends on both the operators and the
operands used.
Generally, but not always, the following rule holds: whenever
one of the operands is absolute, the resulting type is that
of the other operand.
Not every operation is allowed on every combination of types,
for example: it is not allowed to add two \fItext\fP values.
.SH SYNTAX
.IP letters
Both upper and lower case may be used and are seen as
different.
The underscore '_' is considered to be a letter.
.IP identifiers
Identifiers are a sequence of letters and digits, starting with
a letter or a period '.'.
Only the first eight characters are remembered by the
assemblers; identifiers with the same first eight characters
are considered to be identical.
An identifier can receive a value only once, through assignment or a
label definition.
.IP "local labels"
Local labels consist of a single digit.
They can only be defined in the label part of a statement and
used anywhere an identifier is allowed.
They can be redefined at will.
Two forms of use exist: \fIf\fPorward and \fIb\fPackward
references.
The first consists of the digit followed by an \fIf\fP
and refers to the first definition of that label following the
reference.
The second consists of the digit followed by a \fIb\fP
and refers to the last definition of the label before the
reference.
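.sp
The lookup rule for local labels can be sketched in Python as follows.
This is a minimal model, not the assembler's implementation: the function
name is hypothetical, and it assumes that every label definition and every
reference has been given its own index in source order.
.nf
```python
def resolve_local(ref, position, definitions):
    """Resolve a local label reference such as "3f" or "3b".

    definitions maps a digit to the ascending list of source indices at
    which that local label is defined; position is the index of the
    reference itself.  Returns the index of the definition referred to.
    """
    digit, direction = ref[0], ref[1]
    places = definitions.get(digit, [])
    if direction == "f":
        # forward: the first definition following the reference
        later = [p for p in places if p > position]
        if later:
            return later[0]
    elif direction == "b":
        # backward: the last definition before the reference
        earlier = [p for p in places if p < position]
        if earlier:
            return earlier[-1]
    raise LookupError("no matching definition for " + ref)
```
.fi
Because the search is relative to the reference, the same digit can be
redefined any number of times without ambiguity.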
.IP strings
Strings are enclosed in single "'" or double """ quotes.
The use of \eddd where ddd is an octal number and \en, \er,
\et, \eb and \ef is allowed and has the same meaning as in the
C language.
.IP numbers
Numbers are a sequence of letters and digits, starting with a
digit.
No difference is made between small and capital letters.
.br
The base of the number is determined in the following way:
.nf
if the number ends with an 'h' it is hexadecimal else
if the number starts with '0x' it is hexadecimal else
if the number starts with '0' it is octal else
it's decimal.
.fi
Note that the number '0x10h' is an illegal hexadecimal number,
because 'x' is an illegal hexadecimal digit.
The number should be written as '0x10' or '10h'.
The range of numbers depends on the machine.
A rule of thumb is that the number of bits allowed in numbers
is the same as the width of the machine's registers.
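.sp
The base rules above can be sketched in Python.
The function name is hypothetical and the machine dependent range check is
left out; the sketch only shows the order in which the rules are tried.
.nf
```python
HEX_DIGITS = "0123456789abcdef"


def parse_number(token):
    """Return the value of a number token, following the base rules above."""
    text = token.lower()               # small and capital letters are equal
    if not text or not text[0].isdigit():
        raise ValueError("a number must start with a digit: " + token)
    if text.endswith("h"):
        digits, base = text[:-1], 16   # trailing h: hexadecimal
    elif text.startswith("0x"):
        digits, base = text[2:], 16    # leading 0x: hexadecimal
    elif text.startswith("0"):
        digits, base = text, 8         # leading 0: octal
    else:
        digits, base = text, 10        # otherwise decimal
    if not digits:
        raise ValueError("no digits in number: " + token)
    value = 0
    for ch in digits:
        d = HEX_DIGITS.find(ch)
        if d < 0 or d >= base:
            raise ValueError("illegal digit in number: " + token)
        value = value * base + d
    return value
```
.fi
Since the trailing h is tested first, 0x10h is treated as the hexadecimal
digits 0x10, and the x is rejected as an illegal digit.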
.IP expressions
The following operators are recognized:
.nf
.sp 1
op      type    action
|       binary  bitwise or
&       binary  bitwise and
^       binary  bitwise exclusive or
+       binary  two's complement addition
+       unary   no effect
-       binary  two's complement subtraction
-       unary   two's complement negation
*       binary  two's complement multiplication
/       binary  two's complement division
%       binary  two's complement remainder
.tr ~~
~       unary   one's complement negation
.tr ~
.sp 1
.fi
The operator precedence is the same as in C.
.br
The operands allowed are: identifiers, numbers and expressions.
The evaluation order can be changed using the brackets '[' and
\&']'.
.sp
.IP comment
The character '!' denotes the start of a comment; every character
up to the next newline is skipped.
Exclamation marks in strings are not recognized as the start of
comment.
.IP statements
Statements are separated by newlines or ';' and can be
preceded by label definitions.
Label definitions have the form "\fIidentifier\fP~:" or
"\fIdigit\fP~:".
Statements can be: empty, an assignment, an instruction or a
pseudo-instruction.
.IP assignment
An assignment has the form:
.br
\fIidentifier\fP = \fIexpression\fP
.br
The identifier receives the value and type of the expression.
.IP instruction
The syntax of an instruction depends on the type of the target
machine.
An example of an assembly file is presented at
the end of the document.
.IP pseudo-instruction
.de Pu
.sp 1
.ti +5
\&\\$1
.sp 1
..
.Pu ".extern \fIidentifier [, identifier]*\fP"
The identifiers mentioned in the list are exported and can be
used in other modules.
.Pu ".define \fIidentifier [, identifier]*\fP"
Used for modules that are to be part of a library.
The .define pseudo-instructions should come first in such modules.
When scanning a module in a library the universal assembler
checks whether any of its unsatisfied external references is
mentioned in a .define list. If so, it includes that module in
the program.
The identifiers mentioned in the list are exported and can be
used in other modules.
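.sp
The selection of library modules can be sketched in Python.
This is a simplified model under stated assumptions: the library is scanned
once, in archive order, and each module is reduced to the names in its
\&.define list plus the external names it refers to.
The function name and the data layout are hypothetical.
.nf
```python
def select_modules(library, unresolved):
    """Pick the library modules that satisfy pending external references.

    library is a list of (module name, defined names, referenced names)
    in archive order; unresolved is the set of references still open
    before the library is scanned.
    Returns (included module names, references still unresolved).
    """
    unresolved = set(unresolved)
    defined = set()
    included = []
    for name, defines, references in library:
        # include the module if its .define list mentions an open reference
        if unresolved.intersection(defines):
            included.append(name)
            defined.update(defines)
            unresolved.update(references)        # the module may need more
            unresolved.difference_update(defined)
    return included, unresolved
```
.fi
Under the single-pass assumption the order of modules in the archive
matters: a module is only picked up if a reference to it is already open
when it is scanned.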
.Pu ".byte \fIexpression [, expression]*\fP"
Initialize a sequence of bytes.
This is not followed by automatic alignment.
.Pu ".short \fIexpression [, expression]*\fP"
Initialize a sequence of shorts (2-byte values).
This is not followed by automatic alignment.
.Pu ".long \fIexpression [, expression]*\fP"
Initialize a sequence of longs (4-byte values).
This is not followed by automatic alignment.
.Pu ".word \fIexpression [, expression]*\fP"
Initialize a sequence of words. The number of bytes occupied by
a word depends on the target machine.
This is not followed by automatic alignment.
.Pu ".ascii \fIstring\fP"
Initialize a sequence of bytes with the value of the bytes in
the string.
This is not followed by automatic alignment.
.Pu ".asciz \fIstring\fP"
Initialize a sequence of bytes with the value of the bytes in
the string and terminate this with an extra zero byte.
This is not followed by automatic alignment.
.Pu ".align [\fIexpression\fP]"
Adjust the current position to a multiple of the value of the
expression.
The default is the word-size of the target machine.
.Pu ".space \fIexpression\fP"
Allocate the indicated number of bytes.
The expression must be absolute.
.Pu ".org \fIexpression\fP"
Start an org segment with the location counter at the indicated
value.
The value of the expression must be absolute.
.Pu ".text"
.Pu ".data"
.Pu ".bss"
Start a segment of the indicated type.
.Pu ".base \fIexpression\fP"
Set the starting address of the first of the consecutive segments
(text) to the value of the expression.
The expression must be absolute.
.Pu ".errnz \fIexpression\fP"
Stop with a fatal error message when the value of the
expression is non-zero.
.SH "SEE ALSO"
ack(I), arch(I), a.out(V)
.SH "EXAMPLE"
An example of Intel 8086 assembly code.
.sp 2
.nf
.ta 8 16 32 40 48 56 64
   .define begbss
   .define hol0,.diverr,.reghp
   .define EIDIVZ
   EIDIVZ  =       6
   base    =       0x01C0
   topmem  =       0xFFF0
           .org    topmem-16
           .extern __n_line
   maxmem:
   __n_line:
           .space  16
           .errnz  __n_line-0xFFE0
           .base   base
           .text
           cld
           xor     ax,ax
           mov     (2),cs
           mov     (0),.diverr
           mov     sp,maxmem
           mov     di,begbss
           mov     cx,[[endbss-begbss]/2]&0x7FFF
   !       xor     ax,ax           ! ax still is 0
           rep stos
           mov     ax,1
           push    ax
           call    _start
   3:
           jmp     3b
   .diverr:
           push    ax
           mov     ax,EIDIVZ
           call    .error
           pop     ax
           iret
           cmp     0,4(bx)(di)     ! just to show this addr. mode
           .data
   begdata:
   hol0:
           .word   0,0
           .word   0,0
           .word   3f
   .reghp:
           .word   endbss
   3:
           .asciz  "PROGRAM"
.sp 3
.fi
.SH DIAGNOSTICS
Various diagnostics may be produced.
The most likely errors, however, are unresolved references,
probably caused by the omission of a library argument.
.SH BUGS
The resulting a.out file contains no information about the size
and starting address of the segments.
.br
The resulting a.out file does not contain a symbol table.
.br
The alignment might give rise to internal assertion errors when
the requested alignment is larger than the machine dependent
segment alignment.
.br
Identifiers declared as externals cannot be used as locals in
any following module.