left_shift() adapts the example from @davidgiven that caused `ack -O3`
to crash. multiply() is a similar example.
Edit the build system to use -O3 for this test. It now takes -O3 from
the test's filename, and still defaults to -O0.
The strength reduction (SR) phase mishandled the END pseudo when SR
found at least 2 different expressions in the same loop, and SR made a
new header for the loop.
Since 1987 (commit 159b84e), SR appended the new header to the end of
the procedure. Later, SR fixed the header by adding a BRA branch to
the loop's entry, and moving the END pseudo from the previous basic
block to the header. If SR found multiple expressions, it called
fix_header() multiple times, and tried to move the END again. The
extra move failed assert(INSTR(e) == ps_end), or moved another line
after the END, so opt2 (the peephole optimizer) would crash.
Fix by removing fix_header() and moving the code to make_header(), so
SR adds the BRA and moves the END when it makes the header, and does
so only once for each header.
Adjust init_code(), which inserts code into a header. Stop checking
for an empty header, because make_header() now adds BRA. After
inserting code before BRA, don't move LP_INSTR, because later
insertions should go before BRA, not after END.
Our build enables assertions in some other ACK tools (like assemblers
and LLgen), but disabled them in ego until now. Bug #203 becomes a
failed assertion in ego's SR phase:
sr: util/ego/sr/sr_reduce.c:483: fix_header: Assertion
`INSTR(e) == ps_end' failed.
Comment out 2 assertions in util/ego/share, because they fail on
systems with 4-byte int, 8-byte long.
Calls like `debug("something\n", 0, 0, 0, 0)` cause clang warnings,
because debug() is a macro that passes its arguments to printf(), and
clang warns about extra 0s to printf(). Silence the warnings by
hiding the printf() in a new function do_debug(). The code still
passes extra 0s to printf(), but clang can't warn.
Macros debug() and verbose() should use C99 __VA_ARGS__, so they don't
require the extra 0s; but ACK doesn't use __VA_ARGS__ yet.
Adjust some format strings for debug() or fatal(), or cast their
arguments, to match their types. I don't know whether uint32_t is
unsigned int or unsigned long, so I cast it to unsigned long, and
print it with "%lx".
In util/led/sym.c, #include "save.h" to declare savechar(), and use
parentheses to silence a clang warning in hash().
Most warnings are for functions implicitly returning int. Change most
of these functions to return void. (Traditional K&R C had no void
type, but C89 has it.)
Add prototypes to most function declarations in headers. This is
easy, because ego declares most of its extern functions, and the
comments listed most parameters. There were a few outdated or missing
declarations, and a few .c files that failed to include an .h with the
declarations.
Add prototypes to a few function definitions in .c files. Most
functions still have traditional K&R definitions. Most STATIC
functions still don't have prototypes, because they have no earlier
declaration where I would have added the prototype.
Change some prototypes in util/ego/share/alloc.h. Functions newmap()
and oldmap() handle an array of pointers to something; change the
array's type from `short **` to `void **`. Callers use casts to go
between `void **` and the correct type, like `line_p *`. Function
oldtable() takes a `short *`, not a `short **`; I added the wrong type
in 5bbbaf4.
Make a few other changes to silence warnings. There are a few places
where clang wants extra parentheses in the code.
Edit util/ego/ra/build.lua to add the missing dependency on ra*.h; I
needed this to prevent crashes from ra.
Fixes https://github.com/davidgiven/ack/issues/188
One call to n_coerc() omits the 6th and last argument. This worked in
traditional K&R C, but stops working if we declare n_coerc() with a
prototype of all 6 parameters.
Change the last parameter to a pointer. Declare n_coerc() with
prototype, so it now requires all 6 arguments. Pass NULL when have no
iocc_t. This NULL exists only to satisfy the prototype; n_coerc()
will not use this NULL.
A different fix would declare n_coerc() with 5 parameters and `...`,
then use <stdarg.h> to read the 6th argument when it exists.
If a .c file included "types.h" before "mach.h", then it missed the
declaration of mach_option(). Fix by adding "xmach.h".
Fix mach/powerpc/ncg/mach.h and mach/vc4/ncg/mach.h to use the correct
type in their printf() format strings.
Also add `static` and remove `register` in mach/proto/top/top.c. A
static function is only in one file, so its function declaration may
go in that file, instead of a header file.
Use the new syntax in the mips, pdp, powerpc tables to declare
functions before calling them. These declarations prevent compiler
warnings about implicitly declaring functions. They also provide
prototypes of the function parameters.
Also fix a warning in the powerpc table: use `bsearch(...) != NULL` to
avoid converting the pointer from bsearch() to an int.
The syntax for topgen is a block `{...}` among the parameters in the
top table. It looks like the syntax of LLgen, but topgen doesn't
allow nested blocks, so declarations like `struct whatever {...};`
don't work. The token OPEN_BRACKET that begins a declaration_block
doesn't conflict with the LETTER that begins a parameter_line or the
'%' that begins a separator.
Because a block writes a #line command to gen.h, a parameter line now
also writes a #line command to gen.h, so it doesn't get a wrong line
number from a previous block.
Use switch.h to reduce warnings from clang about implicit declarations
of functions. I used `grep ... do_*.c | sed ... | sort`, and some
manual editing, to make the big list of Do???() functions.
Edit C code to reduce warnings from clang. Most warnings are for
implicit declarations of functions, but some warnings want me to add
parentheses or curly braces, or to cast arguments for printf().
Make a few other changes, like declaring float_cst() in h/con_float to
be static, and using C99 bool in ego/ra/makeitems.c and
ego/share/makecldef.c. Such changes don't silence warnings; I make
such changes while I silence warnings in the same file. In
float_cst(), rename parameter `str` to `float_str`, so it doesn't
share a name with the global variable `str`.
Remove `const` from `newmodule(const char *)` in mach/proto/as to
silence a warning. I wrongly added the `const` in d347207.
For warnings about implicit declarations of functions, the fix is to
declare the function before calling it. For example, my OpenBSD
system needs <sys/wait.h> to declare wait().
In util/int, add "whatever.h" to declare more functions. Remove old
declarations from "mem.h", to prefer the newer declarations of the
same functions in "data.h" and "stack.h".
Use C89 size_t for sizes from sizeof() or to malloc() or realloc().
Remove obsolete (unsigned) casts. Sizes were unsigned int in
traditional C but are size_t in C89.
Silence some clang warnings. Add the second pair of round brackets in
`while ((ff = ff->ff_next))` to silence -Wparentheses. Change
`if (nc_first(...))/*nothing*/;` to `(void)nc_first(...);` to silence
-Wempty-body. The code in compute.c nc_first() had the form
`if (x) if (y) s; else t;`. The old indentation (before 10717cc)
suggests that the "else" belongs to the 2nd "if", so add braces like
`if (x) { if (y) s; else t; }` to silence -Wdangling-else.
Shuffle extern function declarations. Add missing declaration for
LLparse(). Stop declaring RENAME(); it doesn't exist. Move some
declarations from main.c to extern.h, so the C compiler may check that
the declarations are compatible with the function definitions.
Assume that standard C89 remove() is available and doesn't need the
UNLINK() wrapper.
In lib/incl, don't need to include <stdio.h> nor <stdlib.h> to use
assert().
Remove alloc.h. If you don't clean your build, then an outdated
BUILDDIR/obj/util/LLgen/headers/alloc.h will survive but should not
cause harm, because nothing includes it. Don't need to remove alloc.h
from util/LLgen/distr.sh, because it isn't there.
Run the bootstrap to rebuild LLgen.c, Lpars.c, tokens.c.
The ACK builds an internal LLgen without installing it. The new
target would rebuild LLgen's own parser using the ACK's internal
LLgen. Keep bootstrap.sh, which uses an installed LLgen. The new
target is more convenient for those who build the ACK but don't build
and install a separate LLgen.
This will cause ACK libc to provide int64_t as long (instead of long
long) on LP64, if we ever get such a platform.
LP64 would have 64-bit long and 64-bit long long, so int64_t might be
either type. For example on amd64, int64_t is long in NetBSD libc,
and long long in OpenBSD libc. Support for long long in ACK remains
incomplete (no printf "%lld"), so it seems better to prefer long where
possible. Also, int64_t being long before long long is more
consistent with int32_t being int before long.
Put suffixes on the values of INT32_MAX, INT64_MAX, and related
constants, so they have the same types as int32_t and int64_t.
Most machines had undefined valu_t and redefined it to a different
type. Edit mach/*/as/mach0.c to remove such redefinitions, so the
next change to valu_t will affect all machines.
Edit mach/proto/as/comm0.h to change valu_t to int64_t, and add
uvalu_t and uint64_t.
Remove int64_t y_valu8 from the yacc %union, now that valu_t y_valu
can hold 64 bits. Replace y_valu8 with y_valu. The .data8 pseudo
becomes less special; it now accepts absolute expressions.
This change simplifies the assembler and seems to have no effect on
the assembled output. Among the files in share/ack/examples, the only
changes are in hilo_bas.* and startrek_c.linuxppc, but those files
seem to change whenever I rebuild them.
This allows `long long x; switch (x) {...}` in C. Add test in C.
This adapts the code for csa 8 and csb 8 from the existing code for
csa 4 and csb 4, for both i386 and m68020.
Shifts that drop an EM word don't need to coerce the word to REG.
Some arithmetic right shifts can use _cdq_.
Drop rules for illegal integer conversions. Sizes below a word are
illegal in EM, except as the source size of _cii_.
Add rules for 8-byte integers to m68020 ncg. Add 8-byte long long to
ACK C on linux68k. Enable long-long tests for linux68k. The tests
pass in our emulator using musahi; I don't have a real 68k processor
and haven't tried other emulators.
Still missing are conversions between 8-byte integers and any size of
floats. The long-long tests don't cover these conversions, and our
emulator can't do floating-point.
Our build always enables TBL68020 and uses word size 4. Without
TBL68020, 8-byte multiply and divide are missing. With word size 2,
some conversions between 2-byte and 8-byte integers are missing.
Fix .cii in libem, which didn't work when converting from 1-byte or
2-byte integers. Now .cii and .cuu work, but also add some rules to
skip .cii and .cuu when converting 8-byte integers. The new rule for
loc 4 loc 8 cii `with test_set4` exposes a bug: the table may believe
that the condition codes test a 4-byte register when they only test a
word or byte, and this incorrect test may describe an unsigned word or
byte as negative. Another rule `with exact test_set1+test_set2` works
around the bug by ignoring the negative flag, because a zero-extended
word or byte is never negative.
The old rules for comparison and logic do work with 8-byte integers
and bitsets, but add some specific 8-byte rules to skip libem calls or
loops. There were no rules for 8-byte arithmetic, shift, or rotate;
so add some. There is a register shortage, because the table requires
preserving d3 to d7, leaving only 3 data registers (d0, d1, d2) for
8-byte operations. Because of the shortage, the code may move data to
an address register, or read a memory location more than once.
The multiplication and division code are translations of the i386
code. They pass the tests, but might not give the best performance on
a real 68k processor.
The assembler wrongly defined _bfexts_ and _bfffo_ with the same bits
as _bfextu_; this turned all bfexts and bfffo instructions into
bfextu. Motorola's 68k Programmer's Reference Manual (1992) gives
different bits for bfexts, but still has wrong bits for bfffo. Change
bfexts and bfffo to match the 68k emulators musahi, aranym, syn68k.
The bitfield width is from 1 to 32, not 0 to 31, so move the warning
from 32 to 0. This doesn't change the warning message, so it will say
that 0 is "too big", when 0 is really too small.
Add EXACT to the rule for adi 8, in the same way that the old rules
for and 8, ior 8, xor 8 have EXACT.
Add rules for sli 8 and sru 8 when shifting 32 bits, and add
assertions in llshift_e.c to test these rules.
Add cases with long long a, b in hexadecimal, where it is more obvious
whether or not a + b or a - b carries to or borrows from bit 32. Add
failure codes to identify each case.
Given `long long o1` and `unsigned int o2`, then `o1 << o2` was
converting o2 to long long and then to int. Remove the first
conversion and just convert o2 to int.
My i386 code from 893df4b gave the wrong sign to some 8-byte
remainders. Fix by splitting .dvi8 and .rmi8 so each has its own code
to pick the sign. They and .dvu8 and .rmu8 share a private sub
.divrem8 for unsigned division.
Improve the i386 code by using instructions like _bsr_ and _shrd_.
Change the helpers to yield a quotient in ebx:eax or a remainder in
ecx:edx; this seems more convenient, because _div_ puts its quotient
in eax and remainder in edx.
Also change UINT32_MAX in <stdint.h> from 4294967295 to 4294967295U.
The U suffix avoids a promotion to long or unsigned long if it would
fit in unsigned int.
Define _EM_LLSIZE but not EM_LLSIZE. The leading underscore is a
convention for such macros. If code always uses _EM_LLSIZE, we will
never need to add EM_LLSIZE. The flag -D_EM_LLSIZE={q} is in
plat/linux386/descr, not lib/descr/fe, so platforms without long long
don't define _EM_LLSIZE.
<stdint.h> doesn't keep the old code for _EM_LSIZE == 8, because I
change it to _EM_LLSIZE == 8. No platform had _EM_LSIZE == 8, and the
old limits like INT64_MAX were wrong.
Add tests for comparisons and shifts. Also add enough integer
conversions to compile the shift test (llshift_e.c), and disable
some wrong rules for ldc and conversions.
Skip the long-long test set on other platforms, because they don't
have long long. Each platform would need to implement 8-byte
operations like `adi 8` in its code generator, and set long long to
8 bytes in its descr file.
The first test is for negation, addition, and subtraction. It also
requires comparison for equality.
For now, a long long literal must have the 'LL' or 'll' suffix. A
literal without 'LL' or 'll' acts as before: it may become unsigned
long but not long long. (For targets where int and long have the same
size, some literals change from unsigned int to unsigned long.)
Type `arith` may be too narrow for long long values. Add a second
type `writh` for wide arithmetic, and change some variables from arith
to writh. This may cause bugs if I forget to use writh, or if a
conversion from writh to arith overflows. I mark some conversions
with (arith) or (writh) casts.
- BigPars, SmallPars: Remove SPECIAL_ARITHMETICS. This feature
would change arith to a different type, but can't work, because it
would conflict with definitions of arith in both <em_arith.h> and
<flt_arith.h>.
- LLlex.c: Understand 'LL' or 'll' suffix. Cut size of constant when
it overflows writh, not only when it overflows the target machine's
types. (This cut might not be necessary, because we might cut it
again later.) When picking signed long or unsigned long, check the
target's long type, not the compiler's arith type; the old check
for `val >= 0` was broken where sizeof(arith) > 4.
- LLlex.h: Change struct token's tok_ival to writh, so it can hold a
long long literal.
- arith.c: Adjust to VL_VALUE being writh. Don't convert between
float and integer at compile-time if the integer might be too wide
for <flt_arith.h>. Add writh2str(), because writh might be too
wide for long2str().
- arith.h: Remove SPECIAL_ARITHMETICS. Declare full_mask[] here,
not in several *.c files. Declare writh2str().
- ch3.c, ch3bin.c, ch3mon.c, declarator.c, statement.g: Remove
obsolete casts. Adjust to VL_VALUE being writh.
- conversion.c, stab.c: Don't declare full_mask[].
- cstoper.c: Use writh for constant operations on VL_VALUE, and for
full_mask[].
- declar., field.c, ival.g: Add casts.
- dumpidf.c: Need to #include "parameters.h" before checking DEBUG.
Use writh2str, because "%ld" might not work.
- eval.c, eval.h: Add casts. Use writh when writing a wide constant
in EM.
- expr.c: Add and remove casts. In fill_int_expr(), make expression
from long long literal. In chk_cst_expr(), allow long long as
constant expression, so the compiler may accept `case 123LL:` in a
switch statement.
- expr.str: Change struct value's vl_value and struct expr's VL_VALUE
to writh, so an expression may have a long long value at compile
time.
- statement.g: Remove obsolete casts.
- switch.c, switch.str: Use writh in case entries for switch
statements, so `switch (ll) {...}` with long long ll works.
- tokenname.c: Add ULNGLNG so LLlex.c can use it for literals.
Add long long type, but without literals; you can't say '123LL' yet.
You can try constant operations, like `(long long)123 + 1`, but the
compiler's `arith` type might not be wide enough. Conversions,
shifts, and some other operations don't work in i386 ncg; I am using a
union instead of conversions:
union q {
long long ll;
unsigned long long ull;
int i[2];
};
Hack plat/linux386/descr to enable long long (size 8, alignment 4)
only for this platform. The default for other platforms is to disable
long long (size -1).
In lang/cem/cemcom.ansi,
- BigPars, SmallPars: Add default size, alignment of long long.
- align.h: Add lnglng_align.
- arith.c: Convert arithmetic operands to long long or unsigned long
long when necessary; avoid conversion from long long to long.
Allow long long as an arithmetic, integral, or logical operand.
- ch3.c: Handle long long like int and long when erroneously applying
a selector, like `long long ll; ll.member` or `ll->member`. Add
long long to integral and arithmetic types.
- code.c: Add long long to type stabs for debugging.
- conversion.c: Add long long to integral conversions.
- cstoper.c: Write masks up to full_mask[8]. Add FIXME comment.
- declar.g: Parse `long long` in code.
- decspecs.c: Understand long long in type declarations.
- eval.c: Add long long to operations, to generate code like `adi 8`.
Don't use `ldc` with constant over 4 bytes.
- ival.g: Allow long long in initializations.
- main.c: Set lnglng_type and related values.
- options.c: Add option like `-Vq8.4` to set long long to size 8,
alignment 4. I chose 'q', because Perl's pack and Ruby's
Array#pack use 'q' for 64-bit or long long values; it might be a
reference to BSD's old quad_t alias for long long.
- sizes.h: Add lnglng_size.
- stab.c: Allow long long when writing the type stab for debugging.
Switch from calculating the ranges to hardcoding them in strings;
add 8-byte ranges as a special case. This also hardcodes the
unsigned 4-byte range as "0;-1". Before it was either "0;-1" or
"0;4294967295", depending on sizeof(long) in the compiler.
- struct.c: Try long long bitfield, but it will probably give the
error, "bit field type long long does not fit in a word".
- switch.c: Update comment.
- tokenname.c: Define LNGLNG (long long) like LNGDBL (long double).
- type.c, type.str: Add lnglng_type and ulnglng_type. Add function
no_long_long() to check if long long is disabled.
This provides adi, sbi, mli, dvi, rmi, ngi, dvu, rmu 8, but is missing
shifts and rotates. It is also missing conversions between 8-byte
integers and other sizes of integers or floats. The code might not be
all correct, but works at least some of the time.
I adapted this from how ncg i86 does 4-byte integers, but I use a
different algorithm when dividing by a large value: i86 avoids the div
instruction and uses a shift-and-subtract loop; but I use the div
instruction to estimate a quotient, which is more like how big integer
libraries do division. My .dvi8 and .dvu8 also set ecx:ebx to the
remainder; this might be a bad idea, because it requires .dvi8 and
.dvu8 to always calculate the remainder, even when the caller only
wants the quotient.
To play with 8-byte integers, I wrote EM procedures like
mes 2, 4, 4
exp $ngi
pro $ngi,0
ldl 4
ngi 8
lol 0
sti 8
lol 0
ret 4
end
exp $adi
pro $adi,0
ldl 4
ldl 12
adi 8
lol 0
sti 8
lol 0
ret 4
end
and called them from C like
typedef struct { int l; int h; } q;
q ngi(q);
q adi(q, q);
This turns EM `con 5000000000I8` into assembly `.data8 5000000000` for
machines i386, i80, i86, m68020, powerpc, vc4. These are the only ncg
machines in our build.
i80 and i86 get con_mult(sz) for sz == 4 and sz == 8. The other
machines only get sz == 8, because they have 4-byte words, and ncg
only calls con_mult(sz) when sz is greater than the word size. The
tab "\t" after .data4 or .data8 is like the tabs in the con_*() macros
of mach/*/ncg/mach.h.
i86 now uses .data4, like i80. Also, i86 and i386 now use the numeric
string without converting it to an integer and back to a string.
This takes literal integers, not expressions, because each machine
defines its own valu_t for expressions, but valu_t can be too narrow
for an 8-byte integer, and I don't want to change all the machines to
use a wider valu_t. Instead, change how the assembler parses literal
integers. Remove the NUMBER token and add a NUMBER8 token for an
int64_t. The new .data8 pseudo emits all 8 bytes of the int64_t;
expressions narrow the int64_t to a valu_t. Don't add any checks for
integer overflow; expressions and .data* pseudos continue to ignore
overflow when a number is too wide.
This commit requires int64_t and uint64_t in the C compiler to build
the assembler. The ACK's own C compiler doesn't have these.
For the assembler's temporary file, add NUMBER4 to store 4-byte
integers. NUMBER4 acts like NUMBER[0-3] and only stores a
non-negative integer. Each negative integer now takes 8 bytes (up
from 4) in the temporary file.
Move the `\fI` and `\fP` in the uni_ass(6) manual, so the square
brackets in `thing [, thing]*` are not italic. This looks nicer in my
terminal, where italic text is underlined.