.TH "ACK.OUT" 5 2017-01-18
.ad
.SH NAME
ack.out\ \-\ ACK-assembler and link editor output
.SH SYNOPSIS
.B #include <out.h>
.SH DESCRIPTION
This manual page discusses the format of object files, as generated by ACK
assemblers and the link editor LED.
The format is designed to be compact, machine independent, and
portable from one machine to another,
so that an object file can be produced on one machine, and
further processed on another.
.ta \w'#define x'u +\w'XXXXXXXX'u +\w'XXXXXXXXXXX'u
.PP
In the following discussion, most structures are defined using
\fBuint16_t\fR and \fBuint32_t\fR as type indicators.
The text calls a 2-byte (char) value a short and a 4-byte value a long.
However, these types
have a machine dependent byte and word order.
Therefore, a machine independent representation is chosen for the
object format:
a long consists of two shorts, of which the least significant one
comes first, and a short consists of two bytes, of which the
least significant one comes first.
There is no alignment between various parts and structures in the object
file.
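.PP
As a rough illustration of the byte order described above, a reader could
decode values from a byte buffer as follows.
This is a minimal sketch; the helper names get2 and get4 are hypothetical
and are not taken from the ACK sources.
.nf
```c
#include <stdint.h>

/* Decode a 2-byte value: the least significant byte comes first. */
static uint16_t get2(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a 4-byte value: two shorts, least significant short first. */
static uint32_t get4(const unsigned char *p)
{
    return (uint32_t)get2(p) | ((uint32_t)get2(p + 2) << 16);
}
```
.fi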
.PP
In general, an object file consists of the following parts:
.PP
.nf
\- a file header
\- a number of section headers
\- the sections themselves
\- a number of relocation structures
\- a symbol table
\- a string area containing the names from the symbol table
.fi
.PP
.B The header.
.br
The header of an object file has the following structure:
.PP
.nf
struct outhead {
uint16_t oh_magic; /* magic number */
uint16_t oh_stamp; /* version stamp */
uint16_t oh_flags; /* several format flags */
uint16_t oh_nsect; /* number of outsect structures */
uint16_t oh_nrelo; /* number of outrelo structures */
uint16_t oh_nname; /* number of outname structures */
uint32_t oh_nemit; /* length of sections */
uint32_t oh_nchar; /* size of string area */
};
.fi
.PP
.nf
#define HF_LINK 0x0004 /* unresolved references left */
.fi
.PP
The fields of this structure have the following purpose:
.nr x \w'oh_magic\ \ \ 'u
.IP oh_magic \nxu
A magic number, indicating that this is an object file.
.IP oh_stamp \nxu
A version stamp, used to detect obsolete versions of object files.
.IP oh_flags \nxu
Currently only used for the HF_LINK flag. When this flag is set, the
object file contains unresolved references.
.IP oh_nsect \nxu
The number of sections and section description structures, later on
referred to as \fIoutsect\fR structures.
Usually, there are only a few sections, for instance a TEXT section,
a ROM section, a DATA section and a BSS section.
Notice that neither the assemblers nor LED know more about them than their
names.
.IP oh_nrelo \nxu
The number of relocation structures, later on referred to as \fIoutrelo\fR
structures.
.IP oh_nname \nxu
The number of symbol table structures, later on referred to as \fIoutname\fR
structures.
.IP oh_nemit \nxu
The total number of bytes in this object file used for the sections themselves.
This field is used to find the relocation and symbol table structures fast.
.IP oh_nchar \nxu
The size of the string area (the number of bytes).
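.PP
The header layout can be sketched as a small reader that fills
\fIstruct outhead\fR from the first 20 bytes of a file.
The function names read_head and has_unresolved, and the helpers get2 and
get4, are hypothetical; the field offsets assume the fields are stored in
declaration order with no padding, as stated above.
.nf
```c
#include <stdint.h>

#define HF_LINK 0x0004  /* unresolved references left */

struct outhead {
    uint16_t oh_magic, oh_stamp, oh_flags;
    uint16_t oh_nsect, oh_nrelo, oh_nname;
    uint32_t oh_nemit, oh_nchar;
};

/* Least significant byte first, least significant short first. */
static uint16_t get2(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get4(const unsigned char *p)
{
    return (uint32_t)get2(p) | ((uint32_t)get2(p + 2) << 16);
}

/* Fill *h from the 20 header bytes at buf. */
static void read_head(const unsigned char *buf, struct outhead *h)
{
    h->oh_magic = get2(buf + 0);
    h->oh_stamp = get2(buf + 2);
    h->oh_flags = get2(buf + 4);
    h->oh_nsect = get2(buf + 6);
    h->oh_nrelo = get2(buf + 8);
    h->oh_nname = get2(buf + 10);
    h->oh_nemit = get4(buf + 12);
    h->oh_nchar = get4(buf + 16);
}

/* Nonzero if the object file still contains unresolved references. */
static int has_unresolved(const struct outhead *h)
{
    return (h->oh_flags & HF_LINK) != 0;
}
```
.fi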
.PP
.B The section descriptions.
.br
The next part of an object file contains the outsect-structures.
An outsect structure has the following layout:
.PP
.nf
struct outsect {
uint32_t os_base; /* start address in machine */
uint32_t os_size; /* section size in machine */
uint32_t os_foff; /* start address in file */
uint32_t os_flen; /* section size in file */
uint32_t os_lign; /* section alignment */
};
.fi
.PP
The fields in this structure have the following purpose:
.IP os_base \nxu
The start address of this section in the target machine.
This address is determined by LED,
when producing a non-relocatable object file.
It is ignored for relocatable object files.
.IP os_size \nxu
The size of this section on the target machine.
.IP os_foff \nxu
The start address of this section in this file.
.IP os_flen \nxu
The size of this section in this file.
This field does not have to have
the same value as the \fIos_size\fR field!
For instance, an uninitialized
data section probably has \fIos_flen\fR set to 0.
Notice that
the \fIoh_nemit\fR field of the header contains
the sum of all the \fIos_flen\fR fields.
.IP os_lign \nxu
The alignment requirement for this section. The requirement is that
the loader must leave
.IP "" \nxu
\ \ \ \ \ \ \ \fIos_base\fR \fBmod\fR \fIos_lign\fR = 0
.IP "" \nxu
intact.
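.PP
The alignment requirement can be illustrated with two small helpers.
This is only a sketch: the names align_up and is_aligned are hypothetical,
and both assume a nonzero \fIos_lign\fR.
.nf
```c
#include <stdint.h>

/* Nonzero if base satisfies os_base mod os_lign = 0 (lign nonzero). */
static int is_aligned(uint32_t base, uint32_t lign)
{
    return base % lign == 0;
}

/* Round base up to the next multiple of lign (lign nonzero). */
static uint32_t align_up(uint32_t base, uint32_t lign)
{
    uint32_t rem = base % lign;

    return rem == 0 ? base : base + (lign - rem);
}
```
.fi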
.PP
.B The sections.
.br
The next part of an object file contains the sections themselves.
Usually, the LED program places the sections right behind one another in the
target machine, taking the alignment requirements into account.
However, the user is allowed to give
the start address of each section.
If the user gives a start address for,
say, section 2 but not for section 3, section 3 is placed
right behind section 2.
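.PP
The placement rule above might be sketched as follows.
This is not the LED implementation: the function place_sections and its
parameters has_start and start are hypothetical, the first section is
assumed to begin at address 0 when no address is given, and a
user-supplied address is assumed to be used unchanged.
.nf
```c
#include <stdint.h>

struct outsect {
    uint32_t os_base, os_size, os_foff, os_flen, os_lign;
};

/*
 * Assign os_base to n sections.  has_start[i] is nonzero when the user
 * gave start[i] for section i; otherwise the section is put right
 * behind the previous one, rounded up to its own alignment.
 */
static void place_sections(struct outsect *s, int n,
                           const int *has_start, const uint32_t *start)
{
    uint32_t next = 0;  /* first free address so far */
    int i;

    for (i = 0; i < n; i++) {
        uint32_t lign = s[i].os_lign ? s[i].os_lign : 1;
        uint32_t base;

        if (has_start[i]) {
            base = start[i];
        } else {
            uint32_t rem = next % lign;
            base = rem ? next + (lign - rem) : next;
        }
        s[i].os_base = base;
        next = base + s[i].os_size;
    }
}
```
.fi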
.PP
.B The relocation structures.
.br
Relocation information is information that allows a program like LED
to combine several object files and produce an executable binary
if there are no unresolved references.
If relocation information is present, it amounts to 10 bytes per
relocatable datum. The information has the following structure:
.PP
.nf
struct outrelo {
uint16_t or_type; /* type of reference */
uint16_t or_sect; /* referencing section */
uint16_t or_nami; /* referenced symbol index */
uint32_t or_addr; /* referencing address */
};
.fi
.PP
.nf
/*
* relocation type bits
*/
#define RELSZ 0x0fff /* relocation length */
#define RELO1 0x01 /* 1 byte */
#define RELO2 0x02 /* 2 bytes */
#define RELO4 0x03 /* 4 bytes */
#define RELOPPC 0x04 /* 26-bit PowerPC address */
#define RELOLIS 0x05 /* PowerPC lis */
#define RELOVC4 0x06 /* VideoCore IV address in 32-bit instruction */
#define RELPC 0x2000 /* pc relative */
#define RELBR 0x4000 /* High order byte lowest address. */
#define RELWR 0x8000 /* High order word lowest address. */
.fi
.PP
.nf
/*
* section type bits and fields
*/
#define S_TYP 0x007F /* undefined, absolute or relative */
#define S_EXT 0x0080 /* external flag */
#define S_ETC 0x7F00 /* for symbolic debug, bypassing 'as' */
.fi
.PP
.nf
/*
* S_TYP field values
*/
#define S_UND 0x0000 /* undefined item */
#define S_ABS 0x0001 /* absolute item */
#define S_MIN 0x0002 /* first user section */
#define S_MAX (S_TYP-1) /* last user section */
#define S_CRS S_TYP /* reference to other namelist item */
.fi
.PP
The fields of this structure have the following purpose:
.IP or_type \nxu
Contains several flags: the RELSZ field holds one of RELO1, RELO2 and RELO4,
indicating the size of the relocatable datum, or one of the
machine-specific types RELOPPC, RELOLIS and RELOVC4;
RELPC is set when the datum is
relocated pc relative; RELBR and RELWR indicate byte and word order of
the relocatable datum.
RELBR and RELWR are needed here.
It is not sufficient
to have flags for them in the header of the object file, because some
machines (NS 32016) use several of the possible combinations in their
instruction encoding.
.IP or_sect \nxu
Contains the section number of the referenc\fIing\fR section.
This is a number that lies between S_MIN and S_MAX.
The section indicated with number S_MIN
is the first section in the sections-section, etc.
.IP or_addr \nxu
Contains the address of the relocatable datum, in the form of an
offset from the base of the section indicated in the \fIor_sect\fR field.
.IP or_nami \nxu
Usually contains the index of the referenced symbol in the symbol table,
starting at 0.
In this case, the reference is to an undefined external symbol, a common
symbol, or a section name.
The relocatable datum then contains
an offset from the indicated symbol or the start of the indicated section.
It may, however, also have the same value as
the \fIoh_nname\fR field of the header.
In this case the relocatable datum
is an absolute number, and the datum is relocated pc relative.
The relocatable datum must then be relocated with respect to the
base address of its section.
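.PP
As an illustration of how RELSZ, RELBR and RELWR interact, the following
sketch reads a relocatable datum of 1, 2 or 4 bytes.
The function names half and read_datum are hypothetical, and the
machine-specific types are deliberately left out.
.nf
```c
#include <stdint.h>

#define RELSZ 0x0fff  /* relocation length */
#define RELO1 0x01    /* 1 byte */
#define RELO2 0x02    /* 2 bytes */
#define RELO4 0x03    /* 4 bytes */
#define RELBR 0x4000  /* High order byte lowest address. */
#define RELWR 0x8000  /* High order word lowest address. */

/* Combine two bytes; with RELBR the high order byte comes first. */
static uint32_t half(const unsigned char *p, uint16_t type)
{
    if (type & RELBR)
        return ((uint32_t)p[0] << 8) | p[1];
    return ((uint32_t)p[1] << 8) | p[0];
}

/* Read the datum at p according to the or_type value in type. */
static uint32_t read_datum(const unsigned char *p, uint16_t type)
{
    switch (type & RELSZ) {
    case RELO1:
        return p[0];
    case RELO2:
        return half(p, type);
    case RELO4:
        if (type & RELWR)  /* high order word comes first */
            return (half(p, type) << 16) | half(p + 2, type);
        return (half(p + 2, type) << 16) | half(p, type);
    default:
        return 0;  /* machine-specific types are not handled here */
    }
}
```
.fi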
.PP
For RELOPPC and RELOVC4, the relocatable datum is a PowerPC or
VideoCore IV instruction.
The relocation depends on the instruction, and uses an offset encoded
in the instruction.
.PP
RELOLIS assembles a PowerPC \fBlis\fR instruction.
The relocatable datum is a 4-byte integer.
The high bit is set for ha16 or clear for hi16.
The next 5 bits are the register \fIRT\fR.
The low 26 bits are a signed offset.
The relocation replaces the datum with the PowerPC instruction
\(oq\fBlis\fR\ \fIRT\fR,\ ha16[\fIsymbol\fR\ +\ \fIoffset\fR]\(cq,
or with hi16 in place of ha16 when the high bit is clear.
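.PP
The RELOLIS bit layout can be sketched as follows.
The function name relolis is hypothetical.
The sketch assumes the usual PowerPC conventions: \fBlis\fR is encoded as
\fBaddis\fR with primary opcode 15 and a zero source register, hi16 is the
upper half of the value, and ha16 is the upper half adjusted for a
sign-extended lower half.
.nf
```c
#include <stdint.h>

/* Build the lis instruction from a RELOLIS datum and a symbol address. */
static uint32_t relolis(uint32_t datum, uint32_t symbol)
{
    uint32_t rt = (datum >> 26) & 0x1f;    /* 5 bits after the high bit */
    uint32_t offset = datum & 0x03ffffff;  /* low 26 bits */
    uint32_t value, imm;

    if (offset & 0x02000000)               /* sign extend from 26 bits */
        offset |= 0xfc000000;
    value = symbol + offset;               /* wraps modulo 2^32 */

    if (datum & 0x80000000)
        imm = ((value + 0x8000) >> 16) & 0xffff;  /* ha16 */
    else
        imm = (value >> 16) & 0xffff;             /* hi16 */

    /* opcode 15 in the top 6 bits, RT in the next 5, immediate low */
    return ((uint32_t)15 << 26) | (rt << 21) | imm;
}
```
.fi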
.PP
.B The symbol table.
.br
This table contains definitions of symbols.
It is referred to by outrelo-structures, and can be used by debuggers.
Entries in this table have the following structure:
.PP
.nf
struct outname {
union {
char *on_ptr; /* symbol name (in core) */
long on_off; /* symbol name (in file) */
} on_u;
#define on_mptr on_u.on_ptr
#define on_foff on_u.on_off
uint16_t on_type; /* symbol type */
uint16_t on_desc; /* debug info */
uint32_t on_valu; /* symbol value */
};
.fi
.PP
.nf
/*
* S_ETC field values
*/
#define S_SCT 0x0100 /* section names */
#define S_LIN 0x0200 /* hll source line item */
#define S_FIL 0x0300 /* hll source file item */
#define S_MOD 0x0400 /* ass source file item */
#define S_COM 0x1000 /* Common name */
.fi
.PP
The members of this structure have the following purpose:
.IP on_foff \nxu
Contains the offset of the name from the beginning of the file.
The name extends from the offset to the next null byte.
.IP on_type \nxu
The S_TYP field of this member contains the section number of the symbol.
Here, this number may be S_ABS for an absolute item, or S_UND, for an
undefined item.
The S_EXT flag is set in this member if the symbol is external.
The S_ETC field has the following flags:
S_SCT is set if the symbol represents a section name,
S_COM is set if the symbol represents a common name,
S_LIN is set if the symbol refers to a high level language source line item,
S_FIL is set if the symbol refers to a high level language source file item,
and S_MOD is set if the symbol refers to an assembler source file item.
.IP on_desc \nxu
Currently not used.
.IP on_valu \nxu
Is not used if the symbol refers to an undefined item.
For absolute items
it contains the value, for common names it contains the size, and
for anything else it contains the offset from the beginning of the section.
In a fully linked binary, the beginning of the section is added.
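.PP
A reader might take \fIon_type\fR apart as in the following sketch.
The function names sym_section, sym_is_external and sym_is_common are
hypothetical; section indices are counted from 0 for S_MIN.
.nf
```c
#include <stdint.h>

#define S_TYP 0x007F  /* undefined, absolute or relative */
#define S_EXT 0x0080  /* external flag */
#define S_MIN 0x0002  /* first user section */
#define S_CRS S_TYP   /* reference to other namelist item */
#define S_COM 0x1000  /* Common name */

/* Index of the section of the symbol (0 = first), or -1 if none. */
static int sym_section(uint16_t on_type)
{
    int typ = on_type & S_TYP;

    if (typ < S_MIN || typ == S_CRS)
        return -1;  /* S_UND, S_ABS or S_CRS */
    return typ - S_MIN;
}

/* Nonzero if the symbol is external. */
static int sym_is_external(uint16_t on_type)
{
    return (on_type & S_EXT) != 0;
}

/* Nonzero if the symbol represents a common name. */
static int sym_is_common(uint16_t on_type)
{
    return (on_type & S_COM) != 0;
}
```
.fi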
.PP
.B The string area.
.br
The last part of an object file contains the name list.
This is just a sequence of null-terminated strings.
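.PP
Since \fIon_foff\fR is an offset from the beginning of the file, a name
lookup on a file held in memory can be sketched as follows.
The function name sym_name and its parameters are hypothetical; the bounds
checks are an addition for safety, not part of the format.
.nf
```c
#include <stddef.h>
#include <string.h>

/*
 * Return the name stored at file offset on_foff, or NULL when the
 * offset lies outside the file or no null byte follows it.
 * file and size describe the whole object file in memory.
 */
static const char *sym_name(const unsigned char *file, size_t size,
                            unsigned long on_foff)
{
    if (on_foff >= size)
        return NULL;
    if (memchr(file + on_foff, 0, size - on_foff) == NULL)
        return NULL;
    return (const char *)(file + on_foff);
}
```
.fi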
.PP
The relocation information, the symbol table, and the name list do not
have to be present, but then of course we do not have a relocatable
object file.
.PP
.B Miscellaneous defines
.br
The following miscellaneous defines might come in handy when reading
object files:
.PP
.nf
/*
* structure sizes (bytes in file; add digits in SF_*)
*/
#define SZ_HEAD 20
#define SZ_SECT 20
#define SZ_RELO 10
#define SZ_NAME 12
.fi
.PP
.nf
/*
* file access macros
*/
#define BADMAGIC(x) ((x).oh_magic!=O_MAGIC)
#define OFF_SECT(x) SZ_HEAD
#define OFF_EMIT(x) (OFF_SECT(x) + ((long)(x).oh_nsect * SZ_SECT))
#define OFF_RELO(x) (OFF_EMIT(x) + (x).oh_nemit)
#define OFF_NAME(x) (OFF_RELO(x) + ((long)(x).oh_nrelo * SZ_RELO))
#define OFF_CHAR(x) (OFF_NAME(x) + ((long)(x).oh_nname * SZ_NAME))
.fi
.SH "SEE ALSO"
led(6), object(3)