d0p1/ack - Cute Engineering : Cute solutions to hard problems

Author	SHA1	Message	Date
David Given	86c832ef86	Put saved registers in actually the write place. I hope.	2016-11-15 21:54:15 +01:00
David Given	cc686ded62	Get subtractions the right way round.	2016-11-15 20:25:11 +01:00
David Given	852d3a691d	Update the table to return call output values in the right registers. Fix the register allocator so the corrupted registers only apply to throughs (otherwise, you can't put output registers in corrupted registers).	2016-11-11 21:48:36 +01:00
David Given	b5c1d622f5	Rework the way stack frames are laid out to be simpler and, hopefully, more correct. Saved registers are now placed in what may be the right place.	2016-11-11 21:17:45 +01:00
David Given	84ee75ec07	Merge from default.	2016-11-11 20:17:54 +01:00
David Given	fd91851005	Add enough return types to the K&R C that the ACK builds (on Linux) using clang now.	2016-11-10 22:04:18 +01:00
David Given	9261cd978d	Typo fix.	2016-10-31 23:16:02 +01:00
David Given	941072e0d7	Add, I hope, patterns for fmsub, fnmadd, and fnmsub (also float versions).	2016-10-31 22:36:54 +01:00
David Given	44f0cea6ca	Also use fmadd for single-precision floats.	2016-10-31 19:55:16 +01:00
David Given	064d1a5d5d	Use fmadd for multiply-and-add instructions.	2016-10-31 19:52:17 +01:00
David Given	8c3670483f	Get top working with the PowerPC; use it to eliminate useless branches and moves.	2016-10-29 23:37:11 +02:00
David Given	a8c4dac67c	Merge from default (merging in George Koehler's PowerPC changes).	2016-10-29 22:40:40 +02:00
David Given	a311e61360	Add support for preserved registers.	2016-10-29 20:22:44 +02:00
David Given	1ae8b90238	More opcodes.	2016-10-29 12:55:34 +02:00
David Given	61349389fb	More opcodes. sti can now cope with non-standard sizes (really need a better fix for this). Hack in crude support for mismatched stack pushes and pops (ints vs longs).	2016-10-29 12:48:05 +02:00
David Given	68419da235	Actually, the locals need to go above the spills and saved regs, so fp == lb.	2016-10-29 12:00:33 +02:00
David Given	2cc2c0ae98	Lots more opcodes. Rearrange the stack layout so that fp->ab is a fixed value (needed for CHAINFP and FPTOAB). Wire up lfrs to calls via a phi when necessary, to allow call-bra-lfr chains.	2016-10-29 11:57:56 +02:00
David Given	658db4ba71	Mangle label names (turns out that the ACK assembler can't really cope with labels that are the same name as instructions...).	2016-10-27 23:17:16 +02:00
David Given	81525c0f2c	Swaps work (at least for registers). More opcodes. Rearrange the stack layout so we can always trivially find fp, which lets CHAINFP work.	2016-10-27 21:50:58 +02:00
David Given	9977ce841a	Remove the bytes1, bytes2, bytes4, bytes8 attributes; remove the concept of a register 'type'; now use int/float/long/double throughout to identify registers. Lots of register allocator tweaks and table bugfixes --- we now get through the dreading Mathlib.mod!	2016-10-25 23:04:20 +02:00
David Given	45a7f2e993	Phi copies are now inserted as part of type inference. More opcodes.	2016-10-24 22:14:08 +02:00
David Given	111c13e253	More opcodes.	2016-10-24 20:15:22 +02:00
David Given	b22780c075	More opcodes, including the difficult and fairly stupid los/sts.	2016-10-23 22:24:08 +02:00
David Given	abd0cedd61	Massive change to how IR types are handled; we use the type code for matching rather than the size. Much cleaner and simpler.	2016-10-23 21:54:14 +02:00
David Given	11b0bc1055	More opcodes.	2016-10-22 20:32:51 +02:00
David Given	2d52b1fdaa	Remove GETRET; values are now returned directly by CALL. Fix a bug in convertstackops which was resulting in duplicate IR groups.	2016-10-22 12:13:57 +02:00
David Given	ceb938fb3c	More opcodes.	2016-10-22 11:26:28 +02:00
David Given	7ae888b754	Hacky workaround the way the Modula-2 compiler generates non-standard sized loads and saves. More opcodes; simplified table using macros.	2016-10-22 10:48:22 +02:00
David Given	f851ab83af	Better (and more correct) floating point conversions; fif; various new opcodes.	2016-10-22 00:48:26 +02:00
David Given	d535be87b1	fef4 and fef8 is now cleaner, albeit slower; add some more register alias stuff.	2016-10-22 00:02:15 +02:00
David Given	4db402f229	Add (pretty crummy) support for register aliases and static pairs of registers. We should have enough functionality now for rather buggy 8-bit ints and doubles. Rework the table and the platform.c to match.	2016-10-21 23:31:00 +02:00
David Given	e4fec71f9c	Lots more opcodes; better eviction behaviour; better register moves. Lots more PowerPC stuff (some working).	2016-10-19 23:29:05 +02:00
George Koehler	99dee0ad24	Remove f14 to f31 from FREG and FSREG. This would have happened later, if f14 to f31 became regvar (like r13 to r31 are now). I am doing it now because ncg is too slow for rules "with FREG FREG uses FREG". We use such rules for adf 8 and other EM instructions that operate on 2 floats. Like my last commit `cfbc537`, this commit speeds ncg by removing choices for register allocation.	2016-10-18 21:16:47 -04:00
George Koehler	cfbc537959	In powerpc ncg, add a speed hack for sti 8. ncg is too slow with this many registers. A stack pattern "with GPR GPR GPR" or "with REG REG REG" takes too long to pick registers, causing ncg 8 to take about 2 seconds on each sti 8. I introduce REG_PAIR and there are only 4 such pairs. For programs that use sti 8 (including C programs that copy 8-byte structs), this speed hack improves the ncg run from several seconds to almost instantaneous. Also add a few COMMENT(...) lines in stacking rules.	2016-10-17 20:31:59 -04:00
David Given	938fb8c2fc	Lots more opcodes.	2016-10-18 00:31:26 +02:00
David Given	4a093b9eba	Add li and mr pseudoinstructions.	2016-10-18 00:21:32 +02:00
George Koehler	c7b68033ef	Add costs to powerpc instructions. Also show how andi., andis., or., set condition codes.	2016-10-17 14:57:21 -04:00
George Koehler	f33b30ed3c	Rewrite .fif8 to avoid powerpc64 fctid This fixes the SIGILL (illegal instruction) in startrek when firing phasers. The 32-bit processors in my PowerPC Mac and in QEMU don't have fctid, a 64-bit instruction. I got the idea from mach/proto/fp/fif8.c to extract the exponent, clear some bits to get an integer, then subtract the integer from the original value to get the fraction.	2016-10-17 00:39:59 -04:00
George Koehler	e2ccc8f942	Add "kills MEMORY" to powerpc sti rules. Adjust some of the loi rules (and associated moves) so we can identify the tokens that must be in MEMORY. With this commit, I can navigate the Enterprise even if I comment out my work-around from `e22c888`.	2016-10-16 18:13:39 -04:00
David Given	5f0164db62	Bolt mcg into the PowerPC backend. It doesn't build yet, but it is generating some code.	2016-10-17 00:06:06 +02:00
George Koehler	19f0eb86a4	Remove IND_LABEL_W and IND_LABEL_D Because li32 always loads a label into a GPR, it is sufficient to coerce LABEL to REG, then use IND_RC_W or IND_RC_D for indirection through the label.	2016-10-16 16:33:24 -04:00
George Koehler	5b5f774a64	Simplify moves to and from IND_RC_* Now that SUM_RC always has a signed 16-bit constant, it happens that the various IND_RC_* tokens also have a signed 16-bit constant, so we no longer need to touch the scratch register.	2016-10-16 16:02:25 -04:00
George Koehler	7c64dab491	Refactor how powerpc ncg pushes constants. When loc (load constant) pushes a constant, it now checks the value of the constant and pushes any of 7 tokens. These tokens allow stack patterns to recognize 16-bit signed integers (CONST2), 16-bit unsigned integers (UCONST2), multiples of 0x10000 (CONST_HZ), and other interesting forms of constants. Use the new constant tokens in the rules for adi, sbi, and, ior, xor. Adjust a few other rules to understand the new tokens. Require that SUM_RC has a signed 16-bit constant, and OR_RC and XOR_RC each have an unsigned 16-bit constant. The moves from SUM_RC, OR_RC, XOR_RC to GPR no longer touch the scratch register, because the constant is not too big.	2016-10-16 13:58:54 -04:00
George Koehler	baa152217e	Remove unused parts of mach/powerpc/ncg/table Remove unused tokens GPRINDIRECTLO, HILABEL, LOLABEL, LABELI. Also remove an #if 0 ... #endif group of patterns.	2016-10-15 20:00:48 -04:00
George Koehler	29cb008faa	In powerpc table, fix macros los() and his(). Change the operator in his() from a - minus to a + plus. When los(n) becomes negative, then his(n) needs to add 0x10000, not subtract it. Also change los(n) to do the sign extension, because smalls(los(n)) should be true, not false. Also change hi(n) and lo(n) to wrap n in parentheses, as (n), because these are macros and n might still contain operators.	2016-10-14 23:59:26 -04:00
George Koehler	409ba7fb1b	Remove most of GPRE from mach/powerpc/ncg/table We only need GPRE in a few places where we write {GPRE, regvar(...)} because ncgg can't parse plain regvar(...). In all other places, a plain GPR works. Also remove gpr_gpr_gpr and a few other unused and fake instructions from the list of instructions.	2016-10-06 22:59:27 -04:00
George Koehler	7cccd88b71	Rename SCRATCH to RSCRATCH. Never stack RSCRATCH nor FSCRATCH. Rename the scratch gpr (currently r11) from SCRATCH to RSCRATCH so I can search for RSCRATCH without finding FSCRATCH. I also want to avoid confusion with the SCRATCH keyword of the old code generator (cg which came before ncg). Change the stacking rules to prevent stacking of RSCRATCH or FSCRATCH or any other GPR or FPR that isn't an allocatable REG or FREG. Then ncgg rejects any rule that tries to stack a GPR or FPR, so change such rules to stack a REG or FREG.	2016-10-06 20:47:42 -04:00
George Koehler	ce5faba919	Remove .linenumber and .filename; use hol0 and hol0+4. We need this because some .e files in lang/ are using 'loe 0' and 'lae 4' to load the line number from hol0 and filename from hol0+4.	2016-09-30 13:40:36 -04:00
George Koehler	e22c8881e7	Add a rule for sdl ldl $1==$2 to work around a bug. In our powerpc table, sdl fails to kill the old value of the local. This is a bug, because a later ldl can load the old value instead of the newly stored value. By rewriting "sdl 0" "ldl 0" as "dup 8" "sdl 0", the newly added rule works around the bug, but only when the ldl is immediately after the sdl. This rule improves code that uses double-precision floating point. The output of printf("%f", 6.0) in C changes from all zero digits to "6000000" but still doesn't print the decimal point. The result of atof("-123.456") becomes correct. In startrek, I can now move the Enterprise, but I still can't fire phasers without crashing the game. We already have a rule for stl lol $1==$2. We had two copies of the rule, so I am deleting the second copy.	2016-09-30 11:50:50 -04:00
George Koehler	6ae415d48b	Rewrite fef 8 in powerpc assembly. In EM, fef splits a float into exponent and fraction. The old C code, given an infinite float, got stuck in an infinite loop. The new assembly code doesn't loop; it extracts the IEEE exponent.	2016-09-29 15:52:54 -04:00

65 commits
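
Commit 29cb008faa describes the corrected behaviour of the his() and los() macros in words only. The arithmetic can be sketched as follows, in Python rather than in the table's own macro syntax. The function bodies are an illustration written from the commit message, not the text of mach/powerpc/ncg/table; `rebuild` is a hypothetical helper, and the 32-bit wraparound mask is an assumption.

```python
MASK32 = 0xFFFFFFFF  # assumption: constants are treated as 32-bit values


def lo(n):
    """Low 16 bits of n as an unsigned value (0..0xFFFF)."""
    return n & 0xFFFF


def hi(n):
    """High 16 bits of a 32-bit n as an unsigned value."""
    return (n >> 16) & 0xFFFF


def los(n):
    """Low 16 bits of n, sign-extended to the range -0x8000..0x7FFF."""
    low = lo(n)
    return low - 0x10000 if low & 0x8000 else low


def his(n):
    """High half, adjusted so that (his(n) << 16) + los(n) gives back n.

    When los(n) is negative, n is first increased by 0x10000 (added,
    not subtracted), which bumps the high half by one.
    """
    if lo(n) & 0x8000:
        return hi(n + 0x10000)
    return hi(n)


def rebuild(n):
    """Hypothetical helper: recombine the two halves, wrapping at 32 bits."""
    return ((his(n) << 16) + los(n)) & MASK32


if __name__ == "__main__":
    for n in (0x00010001, 0x12348000, 0xFFFFFFFF):
        print(hex(n), hex(his(n)), los(n), rebuild(n) == n)
```

The adjustment matters because a PowerPC `addi` sign-extends its 16-bit immediate: if the high half is loaded first and the low half is then added as a signed value, the high half has to be one larger whenever the low half reads as negative.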