d0p1/ack

Author	SHA1	Message	Date
George Koehler	dc05cb2dc8	Add pat cms !defined($1) Switch .cms to pass inputs and outputs on the real stack, not in registers; like we do with .and, .or (`81c677d`) and .xor (`c578c49`). At this point, nearly all functions in libem use the real stack, not registers, for passing inputs and outputs. This simplifies the ncg table (which needs fewer lists of specific registers) but slows calls to libem. For example, after `ba9b021`, each call to .aar4 is about 10 instructions slower. I moved 3 inputs and 1 output from registers to the real stack. A program would take 4 instructions to move registers to stack, 4 to move stack to registers, and perhaps 2 to adjust the stack pointer.	2017-02-13 16:52:32 -05:00
George Koehler	89dd80e34d	Add missing instances of "kills ALL" or "with STACK".	2017-02-13 16:38:26 -05:00
George Koehler	ba9b021253	Use .los4 in lar 4 and .sts4 in sar 4. Our libem had two implementations of loading a block from a stack, one for lar 4 and one for los 4. Now lar 4 and los 4 share the code in .los4. Likewise, sar 4 and sts 4 share the code in .sts4. Rename .los to .los4 and .sts to .sts4, because they implement los 4 and sts 4. Remove the special case for loading or storing 4 bytes, because we can do it with 1 iteration of the loop. Remove the lines to "align size" where the size must already be a multiple of 4. Fix the upper bound check in .aar4. Change .aar4, .lar4, .los4, .sar4, .sts4 to pass all operands on the real stack, except that .los4 and .sts4 take the size in register r3. Have .aar4 set r3 to the size of the array element. So lar 4 is just .aar4 then .los4, and sar 4 is just .aar4 then .sts4. ncg no longer calls .lar4 and .sar4 in libem, because it inlines the code; but I keep .lar4 and .sar4 in libem, because mcg references them. They might or might not work in mcg.	2017-02-13 15:22:00 -05:00
George Koehler	54949f713f	Change .fef8 and .fif8 to pass values on the stack. Reorder the code in .fef8 and .fif8 so that in the usual case, we fall through to the blr without taking any branches. The usual case, by my guess, is .fef8 with normalized numbers or .fif8 with small integers. I change .fef8 and .fif8 to pass values on the real stack, not in specific registers. This simplifies the ncg table, and might help me experiment with changes to the ncg table. This change might or might not help mcg. Seems that mcg always uses the stack to pass values to libem, but I have not tested .fef8 or .fif8 with mcg.	2017-02-12 16:44:37 -05:00
George Koehler	1de1e8f7f0	Experiment with conversions between integers and floats. Switch some conversions from libem calls to inline code. The conversions from integers to floats are now too slow, because each conversion allocates 4 or 5 registers, and the register allocator is too slow. I might use these slow conversions to experiment with the register allocator. I add the missing conversions between 4-byte single floats and integers, simply by going through 8-byte double floats. (These replace the calls to nonexistent functions in libem.) I remove the placeholder for fef 4, because it doesn't exist in libem, and our language runtimes only use fef 8.	2017-02-12 15:45:28 -05:00
George Koehler	2e41c392fa	Implement blm and bls using an inline loop. This replaces a call to memmove() in libc. That was working for me, but it can fail because EM programs don't always link to libc. blm and bls only need to copy aligned words. They don't need to copy bytes, and they don't need to copy between overlapping buffers, as memmove() does. So the new loop is simpler than memmove().	2017-02-11 19:30:12 -05:00
George Koehler	c578c495bb	Edit PowerPC assembly for .and, .cms, .ior, .xor, .zer Remove one addi instruction from some loops. These loops had increased 2 pointers, they now increase 1 index. I must initialize the index, so I add "li r6, 0" before each loop. Change .zer to use subf instead of neg, add. Change .xor to take the size on the real stack, as .and and .or have done since `81c677d`.	2017-02-11 18:00:56 -05:00
George Koehler	83c13597e1	Use "mr" and make a few other tweaks in PowerPC ncg table. Use extended "mr" instead of basic "or" to move registers. Both "mr" and "or" encode the same machine instruction. With "mr", I can more easily search the assembly output for register moves. Fold several stacking rules into a single rule ANY_BHW-REG to STACK. Remove the EM patterns for loc mlu $2==2 and loc slu. The first pattern had the wrong size (should be $2==4, not $2==2). Both patterns were redundant. They rewrote loc mlu as loc mli and loc slu as loc sli, but this table doesn't have patterns for loc mli or loc sli, so it is enough to rewrite mlu as mli and slu as sli.	2017-02-10 11:45:50 -05:00
George Koehler	85391399a4	Use ha16/lo16 to load or store 1, 2, 8 bytes from labels. Add the tokens IND_RL_B, IND_RL_H, IND_RL_H_S, IND_RL_D, along with the rules to use them. These rules emit shorter code. For example, loading a byte becomes lis, lbz instead of lis, addi, lbz. While making this, I wrongly set IND_RL_D to size 4. Then ncg made infinite recursion in codegen() and stackupto(), until it crashed by stack overflow. I correctly set IND_RL_D to size 8, preventing the crash.	2017-02-08 12:31:14 -05:00
George Koehler	5e00e1fce2	Trimming mach/powerpc/ncg/table Remove coercion from LABEL to REG. The coercion never happens because I have stopped putting LABEL on the stack. Also remove LABEL from set ANY_BHW. Retain the move from LABEL to REG because pat gto uses it. Remove li32 instruction, unused after the switch to the hi16, ha16, lo16 syntax. Remove COMMENT(...) lines from most moves. In my opinion, they took too much space, both in the table and in the assembly output. The stacking rules and coercions keep their COMMENT(...) lines. In test GPR, don't write to RSCRATCH. Fold several coercions into a single coercion from ANY_BHW uses REG. Use REG instead of GPR in stack patterns. REG and GPR act the same, because every GPR on the stack is a REG, but I want to be clear that I expect a REG, not r0. In code rules, sort SUM_RC before SUM_RR, so I can add SUM_RL later. Remove rules to optimize loc loc cii loc loc cii. If $2==$4, the peephole optimizer can optimize it. If $2!=$4, then the EM program is missing a conversion from size $2 to size $4. Remove rules to store a SEX_B with sti 1 or a SEX_H with sti 2. These rules would never get used, unless the EM program is missing a conversion from size 4 to size 1 or 2.	2017-02-08 12:27:16 -05:00
George Koehler	ed21a59a82	In PowerPC ncg, allocate register for ha16[label]. Use it to generate code like lis r12,ha16[__II0] lis r11,ha16[_f] lfs f1,lo16[_f](r11) lfs f2,lo16[__II0](r12) fadds f13,f2,f1 stfs f13,lo16[_f](r11) Here ncg has allocated r11 for ha16[_f]. We use r11 in lfs and again in stfs. Before this change, we needed an extra lis before stfs, because ncg did not remember that ha16[_f] was in a register. This example has a gap between ha16[__II0] and lo16[__II0], because the lo16 is not in the next instruction. This requires my previous commit `1bf58cf` for RELOLIS. There is a gap because ncg emits the lis as soon as I allocate it. The "lfs f2,lo16[__II0](r12)" happens in a coercion from IND_RL_W to FSREG. The coercion allocates one FSREG but may not allocate any other registers. So I must allocate r12 earlier. I allocate r12 in pat lae, but this causes a gap.	2017-02-08 12:23:06 -05:00
George Koehler	754e96ef16	Use ha16/lo16 to emit pairs of lis/stw, lis/lfs, lis/stfs. A 4-byte load from a label yields a token IND_RL_W. This token emits either lis/lwz or lis/lfs, if we want a general-purpose register or a floating-point register.	2017-02-08 12:13:54 -05:00
George Koehler	7255ed403f	Tweak some tokens in PowerPC ncg. Remove the GPRINDIRECT token, and use the IND_RC_* tokens as operands to instructions. We no longer need to unpack an IND_RC_* token and repack it as a GPRINDIRECT to use it in an instruction. Allow storing IND_ALL_B and IND_ALL_H in register variables. Create a set ANY_BHW for anything that we can store in a regvar. Push register variables on the stack without using GPRE, by changing stwu to accept LOCAL. Then ncg will replace the string ">>> BUG IN LOCAL" with the register name. (I copied ">>> BUG IN LOCAL" from mach/arm/ncg/table.) Fix the rule for "pat lil inreg($1)>0" to yield an IND_RC_W token, not a register. We might need to kill the token with "kills MEMORY". Rename CONST_ALL to CONST_STACK, because it only includes constants on the stack, and excludes CONST tokens. Instructions still don't allow CONST_STACK operands, so we still need to repack each CONST_STACK as a CONST to use it in an instruction. Rename LABEL_OFFSET_HI to just LABEL_HI, and same for LABEL_HA and LABEL_HO.	2017-02-08 12:12:28 -05:00
George Koehler	1bf58cf51c	Add RELOLIS for PowerPC lis with ha16 or hi16. The new relocation type RELOLIS handles these instructions: lis RT, ha16[expr] == addis RT, r0, ha16[expr] lis RT, hi16[expr] == addis RT, r0, hi16[expr] RELOLIS stores a 32-bit value in the program text. In this value, the high bit is a ha16 flag, the next 5 bits are the target register RT, and the low bits are a signed 26-bit offset. The linker replaces this value with the lis instruction. The old RELOPPC relocated a ha16/lo16 or hi16/lo16 pair. The new RELOLIS relocates only a ha16 or hi16, so it is no longer necessary to have a matching lo16 in the next instruction. The disadvantage is that RELOLIS has only a signed 26-bit offset, not a 32-bit offset. Switch the assembler to use RELOLIS for ha16 or hi16 and RELO2 for lo16. The li32 instruction still uses the old RELOPPC relocation. This is not the same as my RELOPPC change from my recent mail to tack-devel (https://sourceforge.net/p/tack/mailman/message/35651528/). This commit is on a different branch. Here I am throwing away my RELOPPC change and instead trying RELOLIS.	2017-02-08 11:46:31 -05:00
George Koehler	f4cfbedd5c	Remove #include <stdbool.h> from mach/powerpc/as/mach1.c We should not include a system header file here, because mach/proto/as/comm2.y goes through cpp twice. The include can cause problems like https://github.com/davidgiven/ack/issues/1 Remove this #include <stdbool.h> and leave a comment pointing to the includes in comm0.h. Change the few instances of bool, false, true to int, 0, 1.	2017-01-30 16:39:23 -05:00
George Koehler	3c1d2d79f0	Remove type quad, use type word_t in PowerPC as. Type word_t is for encoding the machine instructions. It only needs 32 bits for PowerPC. It was long (which can have 32 or 64 bits), and there was a second type quad (which was uint32_t). Switch word_t to uint32_t and replace quad with word_t. Also change valu_t and ADDR_T away from long.	2017-01-30 16:15:02 -05:00
George Koehler	48e3aab728	Swap RA and RS when assembling "and", "or", and such instructions. They must use OP_RA_RS_RB_C instead of OP_RS_RA_RB_C. The code generator often sets RS and RA to the same register, so swapping them causes no change in many programs. I also rename OP_RS_RA_UI_CC to OP_RA_RS_UI_CC, and OP_RS_RA_C to OP_RA_RS_C, because they already swap RA and RS.	2017-01-30 15:47:09 -05:00
George Koehler	9ddbb66c8b	Turn off comments again. I turned them on by accident in `c416889`.	2017-01-30 15:45:46 -05:00
George Koehler	c41688929c	In PowerPC ncg, switch the scratch register from r11 to r0. r0 is a special case and can't be used when adding a register to a constant. The few remaining users of the scratch register don't do that. I removed other usages of the scratch register in `7c64dab`, `5b5f774`, `19f0eb8`, `f64b7d8`.	2017-01-26 13:10:08 -05:00
George Koehler	1dfd5524e4	In PowerPC top, don't delete addi r0, r0, 0 Also don't delete addis r0, r0, 0. These instructions are special cases that set r0 to zero. If we delete them, then r0 keeps its old value. I caught this bug because osxppc protects the .text segment against writing. (linuxppc doesn't protect it.) A program tried to set r0 to the NULL pointer, but top deleted the instruction, so r0 kept an old return address pointing into .text. Later the program checked that r0 wasn't NULL, tried to write to address r0, and crashed.	2017-01-26 12:44:32 -05:00
George Koehler	8c8f291a07	In PowerPC libem, remove tge.s and powerpc.h Nothing uses the tables in tge.s, after I changed the ncg table. There are no *.e files in libem, so don't try to build them.	2017-01-26 12:39:16 -05:00
George Koehler	f64b7d8ea0	Rewrite how PowerPC ncg does conditional branches and tests. The rewritten code rules bring 3 new features: 1. The new rules compare a small constant with a register by reversing the comparison and using `cmpwi` or `cmplwi`. The old rules put the constant in a register. 2. The new rules emit shorter code to yield the test results, without referencing the tables in mach/powerpc/ncg/tge.s. 3. The new rules use the extended `beq` and relatives, not the basic `bc`, in the assembly output. I delete the old tristate tokens and the old moves, because they confused me. Some of the old moves weren't really moves. For example, `move R3, C0` and then `move C0, R0` did not move r3 to r0. I rename C0 to CR0.	2017-01-25 19:08:55 -05:00
George Koehler	a348853ece	Add missing size declarations for 8-byte registers. This fixes the coercion from IND_ALL_D to FREG. The coercion had never happened, because IND_ALL_D had 8 bytes but FREG had 4 bytes. Instead, ncg always stacked the IND_ALL_D and unstacked a FREG. The stacking rule uses f0, so the code would load f0 with the indirect value, push f0 to stack, load f1 from stack, move stack pointer. Now that FREG has 8 bytes, ncg does the coercion, and the code just loads f1 with the indirect value.	2017-01-25 11:56:58 -05:00
George Koehler	188b23bade	Add constraints for pat lab, as done in the m68020 table. Always use 'kills ALL' when reaching a label, because our registers and tokens have the wrong values if the program jumps to this label from somewhere else. When falling through a label, if the top element is in r3, then require that the rest of the stack is in the real STACK, not in registers or tokens. I'm doing this to be certain that the missing constraints are not causing bugs. I did not find any such bug, perhaps because the labels are usually near other instructions (like conditional branches and function calls) that stack or kill tokens.	2017-01-24 11:26:35 -05:00
George Koehler	bb67dbeb11	Use "kills ALL" instead of a list of killed registers. This is for fef 8 and fif 8. I changed .fef8 so it no longer kills r7, but I don't want to update the list. We already use "kills ALL" for most other calls to libem.	2017-01-23 17:31:29 -05:00
George Koehler	032bcffef6	In PowerPC libem, use the new features of our assembler. The new features are the hi16/lo16 and ha16/lo16 syntax for relocations, and the extended mnemonics like "blr". Use ha16/lo16 to load some double floats with 2 instructions (lis/lfd) instead of 3 (lis/ori/lfd). Use the extended names for branches, comparisons, and bit rotations, so I can more easily read the code. The new names often encode the same machine instructions as the old names, except in a few places where I changed the instructions. Stop using andi. when we don't need to set cr0. In inn.s, I change andi. to extrwi to extract the same bits. In los.s and sts.s, I change "andi. r3, r3, ~3" to "clrrwi r3, r3, 2". This avoids setting cr0 and also stops clearing the high 16 bits of r3. In csa.s, los.s, sts.s, I change some comparisons and right shifts from signed to unsigned (cmplw, cmplwi, srwi), because the sizes are unsigned. In inn.s, the right shift can be signed (sraw) or unsigned (srw), but I use srw because we don't need the carry bit. In fef8.s, I save an instruction by using rlwinm instead of addis/andc to clear a field. The code no longer kills r7. In both fef8.s and fif8.s, I remove the list of killed registers. Also remove some whitespace from ends of lines.	2017-01-23 17:16:39 -05:00
George Koehler	5aa2ac2246	Teach the assembler about PowerPC extended mnemonics. Also make a few changes to basic mnemonics. Fix typo in name of the basic "creqv". Add the basic "addc" and relatives, because it would be odd to have the extended "subc" without "addc". Fix the basic "rldicl", "rldicr", "rldic", "rldimi" to correctly encode the 6-bit MB field. Fix "slw" and relatives to correctly swap their RA and RS operands. Add many, but not all, of the extended mnemonics from IBM's Power ISA Version 2.06 Book I Appendix E. (I used 2.06, published 2009, just because I already had the PDF of it.) This commit includes mnemonics for branching, subtraction, traps, bit rotation, and a few others, like "mflr" and "nop". The assembler now understands branches like `beq cr7, label` and bit shifts like `slwi r7, r7, 2`. These encode the same machine instructions as the basic "bc" and "rlwinm". Some operands to basic names become optional. The assembler no longer requires the level in "sc" or the branch hint in "bcctr" and "bclr"; they default to zero. Some extended names take an optional branch hint or condition register. Some extended names are still missing. I don't provide names with static branch prediction, like "beq+" or "bge-", because the assembler parses '+' and '-' as operators, not as part of an instruction name. I also don't provide some names that 2.06 has for moving to or from the condition register or some special purpose registers, names like "mtcr" or "mfuamr". This commit also deletes some unused tokens and one unused yacc rule.	2017-01-21 23:49:29 -05:00
David Given	d7df126730	Merge pull request #44 from kernigh/kernigh-pr-as mach/proto/as: allow more tokens	2017-01-18 23:33:40 +01:00
George Koehler	f705339f86	Allow more tokens in the assembler. I need this so I can add more %token lines to mach/powerpc/as/mach2.c The assembler's tempfile encoded each token in a byte. This only worked with tokens 0 to 127 and 256 to 383. If a token 384 or higher existed, the assembler stopped working. I need tokens 384 and higher. I change the token encoding to a 2-byte little-endian integer. I also change a byte in the string encoding.	2017-01-17 22:41:11 -05:00
David Given	232545606d	Merge from default.	2017-01-18 00:02:32 +01:00
George Koehler	ba2a03705e	Use prototypes in mach/proto/as/comm5.c Order the function prototypes in comm1.h to match the order of the function definitions in *.c files.	2017-01-17 16:41:29 -05:00
David Given	81c677d218	Add a bunch more set operations to the PowerPC backends, and the Pascal test for the same.	2017-01-17 22:31:38 +01:00
George Koehler	916d270534	Delay inclusion of <stdint.h> when compiling comm2.y See issue #1 (https://github.com/davidgiven/ack/issues/1). The file mach/proto/as/comm2.y goes through cpp twice. The _include macro, defined in comm2.y and used in comm0.h, delays the inclusion of system header files. The inclusion of <stdint.h> wasn't delayed. This caused multiple inclusions of <sys/_types.h> in FreeBSD and <machine/_types.h> in OpenBSD. Use _include to delay <stdint.h>. Also use _include for "arch.h" and "out.h", because h/out.h includes <stdint.h> and h/arch.h might include it in the future. Sort the system includes in comm0.h by moving them up to be with <stdint.h>. Must include <stdint.h> before "mach0.c", because mach/powerpc/as/mach0.c needs it. Must include "mach0.c" before checking ASLD.	2017-01-16 22:39:44 -05:00
George Koehler	e97116c037	Remove some obsolete code that causes a gcc warning. In my OpenBSD/amd64 system, the code becomes if (0) outname.on_valu &= ~(((0xFFFFFFFF)<<32)<<32); The 0xFFFFFFFF is a 32-bit int, so the left shift by 32 is out of range and causes the gcc warning. The intent might be to clear any sign-extended bits, if the assignment outname.on_valu = valu did sign extension. Old C had no unsigned long, so .on_valu would have been long. The code is obsolete because h/out.h now declares .on_valu as uint32_t.	2017-01-16 18:09:55 -05:00
David Given	c471f617b7	Ensure that memory is zero-initialised.	2017-01-16 22:45:03 +01:00
David Given	2cdcc16bc2	Fix a buffer overrun that was manifesting on OpenBSD; also fix a bounds check and some uninitialised variable problems.	2017-01-16 22:44:37 +01:00
David Given	fa5675d439	Run through clang-format.	2017-01-16 21:16:33 +01:00
David Given	e7e29d34ff	Add a test (currently failing) to check that Pascal char sets can store all 256 possible values. Add the PowerPC ncg and mcg backend support to let the test actually run, including modifying a bunch of PowerPC libem functions so that they can be called from both ncg and mcg.	2017-01-15 22:28:14 +01:00
David Given	9a346c382d	Turns out Apple's hi16/ha16 exactly match my ha16/has16, so renamed accordingly. (Memo to self: read the docs before doing the work.)	2017-01-15 11:59:33 +01:00
David Given	f80acfe9f5	Signed vs unsigned lower halves of powerpc fixups are now handled by having two assembler directives, ha16() and has16(), for the upper half; has16() applies the sign adjustment. .powerpcfixup is now gone, as we generate the relocation in ha*() instead. Add special logic to the linker for undoing and redoing the sign adjustment when reading/writing fixups. Tests still pass.	2017-01-15 11:51:37 +01:00
David Given	3c0bc205fc	Update the hi/lo syntax to be a bit more standard.	2017-01-15 10:21:02 +01:00
David Given	8edbff9795	Add assembler support for fixing up arbitrary oris/addi pairs of instructions; this should allow oris/lwz constant value loads, which will save an opcode.	2017-01-15 00:15:01 +01:00
David Given	efab08178b	Fix a bunch of issues with pushing and popping mismatched sizes, which the B compiler does a lot; dup 8 for pairs of words is now optimised.	2017-01-07 18:47:00 +01:00
David Given	6b4f8d72b8	ine and ste are now declared to modify memory (preventing cached values being propagated across the modification).	2017-01-07 13:25:09 +01:00
David Given	7710c76d56	Introduce sequence points before store instructions to prevent loads from the same address being delayed until after the store (at which point they'll return the wrong value).	2017-01-07 13:17:39 +01:00
David Given	0da248dced	Use a better NOT; and after remembering that PowerPC bit numbers are all backwards in the documentation, rewrote IFEQ/IFLT/IFLE to actually work. Probably. Thanks to the B test suite for spotting this.	2017-01-07 01:03:15 +01:00
David Given	73922f1d16	Ensure that procedure labels are word-aligned.	2017-01-06 22:29:52 +01:00
David Given	e3f8fb84dc	Change the i80 assembler to be three-pass, which allows forward references; required for assembling B.	2016-12-29 17:08:53 +00:00
David Given	e50f4be710	Merge from default.	2016-12-26 19:44:48 +00:00
David Given	bf2e0be69a	Merge pull request #27 from kernigh/pr-qemu-doze Teach qemuppc to halt the cpu on _exit().	2016-12-11 23:17:12 +01:00
George Koehler	8605a2fcfc	Add Modula-2 set operations to PowerPC ncg. This provides and, ior, xor, com, zer, set, cms when defined($1) and ior, set when !defined($1). I don't provide the other operations !defined($1) because our Modula-2 compiler hasn't used them. I wrote a Modula-2 example in https://gist.github.com/kernigh/add79662bb3c63ffb7c46d01dc8ae788 Put a dummy comment in mach/powerpc/libem/build.lua so git checkout will touch that file. Without the touch, the build system doesn't see the new *.s files.	2016-12-10 12:23:07 -05:00
George Koehler	fcda786fe9	Add some missing clauses to los, sts, aar, inn, cmi, cmu. We only implement 'los 4', 'sts 4', 'cmi 4', 'cmu 4', not for sizes other than 4. Add clause $1==4. We only implement inn when defined($1). The rule for aar needs 'kills ALL' because it kills many registers, like other rules that call libem.	2016-12-09 19:49:50 -05:00
George Koehler	436114fce4	Add a move from CONST smalls(%val) to GPR. This allows 'move {CONST, $1}, R3' with a small enough $1 to emit one instruction (addi) instead of two instructions (addis, ori). The CONST token confusingly isn't in the CONST_ALL set.	2016-12-09 18:40:14 -05:00
George Koehler	17211eef47	Fix ass to match the EM spec. The spec says, "ASS w: Adjust the stack pointer by w-byte integer". The w argument "can either be given as argument or on top of the stack." Therefore, 'ass 4' would pop the 4-byte integer from the stack, but 'ass' would pop the size w from the stack, then pop the w-byte integer. PowerPC ncg wrongly implemented 'ass' as if it was 'ass 4'. Fix it to accept only 'ass 4'.	2016-12-09 17:32:42 -05:00
George Koehler	5bd0ad4269	Remove the bogus rules for 'lor 2' and 'str 2'. These instructions would load or store the EM heap pointer. They don't work. Programs must use brk() or sbrk() in libsys. The last file to use 'lor 2' and 'str 2' was lang/pc/libpc/sav.e in the Pascal library. Commit `c084f9f` deleted the file, so we no longer need rules 'lor 2' or 'str 2' to build the ACK.	2016-12-09 17:00:56 -05:00
George Koehler	805883e377	Fill in a hint for enabling the COMMENT macro. If you want to enable comments in the .s file, change #define COMMENT(n) /* comment {LABEL, n} */ to #define COMMENT(n) comment {LABEL, n}	2016-12-09 16:58:47 -05:00
George Koehler	244e554f2f	Remove trailing whitespace in mach/powerpc/ncg/table	2016-12-09 16:36:42 -05:00
George Koehler	b8c921ca70	Allow mfspr, mtspr with a register number. PowerPC has a few hundred special-purpose registers. The assembler had only accepted the names "xer", "lr", "ctr". Most programs use only those three SPRs. If I add more names, they would almost never get used, and they might conflict with labels. I want to use "mfspr r3, 0x3f0" and "mtspr 0x3f0, r3" in plat/qemu/boot.s to access register hid0 from supervisor mode.	2016-12-07 17:28:00 -05:00
David Given	55e24e1f24	inn was assuming that bitfields were arrays of bytes, when actually they're arrays of words (which makes the LSB move on big-endian systems).	2016-12-06 21:45:20 +01:00
David Given	fbd6e8f63d	Add support for consecutive labels; needed by the B compiler.	2016-11-27 21:18:00 +01:00
David Given	5bce5fc4da	Change the extension used by Basic files from .b to .bas, to avoid conflicts with B.	2016-11-27 20:38:33 +01:00
David Given	f8fa3ece42	inn on ncg now passes the CPU tests.	2016-11-20 19:35:34 +01:00
David Given	953c08839f	inn works now; add a helper for it.	2016-11-20 12:53:44 +01:00
David Given	196fa914b3	lxa now works, I hope; traps are better (and stubbed out on qemuppc).	2016-11-20 11:57:21 +01:00
David Given	d5328492d7	Better handling of float conversions; more tests; converting to unsigned ints works now.	2016-11-20 11:27:40 +01:00
David Given	454a7494bb	cif8 and cuf8 work now. More tests.	2016-11-19 11:42:30 +01:00
David Given	cc660b230f	Floats and doubles are now written out correctly.	2016-11-19 11:39:13 +01:00
David Given	d31bc6a3f9	Made csa and csb work with mcg; adjust the libem functions and the corresponding invocation in the ncg table so the same helpers can be used for both mcg and ncg. Add a new IR opcode, FARJUMP, which jumps to a helper function but saves volatile registers.	2016-11-19 10:55:41 +01:00
David Given	5208e5f751	Yet another OB1 stack format fix.	2016-11-19 10:42:22 +01:00
David Given	43439c6d0c	Remember to push the result of lor onto the stack.	2016-11-17 22:04:32 +01:00
David Given	81bc2c74c5	A bb's regsin are no longer the same as those of its first instruction; occasionally the first hop of a block would try to rearrange its registers (due to evicted throughs), resulting in the phi moves copying values into the wrong registers.	2016-11-16 20:52:15 +01:00
David Given	581fa4a457	Reenable eviction of corrupted registers, which had been broken by a previous change. Change the register move code to get swaps right, or at least righter.	2016-11-15 21:55:10 +01:00
David Given	86c832ef86	Put saved registers in actually the right place. I hope.	2016-11-15 21:54:15 +01:00
David Given	cc686ded62	Get subtractions the right way round.	2016-11-15 20:25:11 +01:00
David Given	0289b1004e	Allow values left on the stack at the end of the procedure (it's legal!).	2016-11-14 21:47:49 +01:00
David Given	e7132183fb	Fix buffer overrun: if LABEL_STARTER is seen but LABEL_TERMINATOR is not, the label parser will keep going forever looking for the end of the label. It now stops at the end of the string.	2016-11-13 14:04:58 +01:00
David Given	852d3a691d	Update the table to return call output values in the right registers. Fix the register allocator so the corrupted registers only apply to throughs (otherwise, you can't put output registers in corrupted registers).	2016-11-11 21:48:36 +01:00
David Given	b5c1d622f5	Rework the way stack frames are laid out to be simpler and, hopefully, more correct. Saved registers are now placed in what may be the right place.	2016-11-11 21:17:45 +01:00
David Given	84ee75ec07	Merge from default.	2016-11-11 20:17:54 +01:00
David Given	d82df74a7a	Rename addr_t to address_t to avoid clashes with the system addr_t.	2016-11-11 20:17:10 +01:00
David Given	fd91851005	Add enough return types to the K&R C that the ACK builds (on Linux) using clang now.	2016-11-10 22:04:18 +01:00
David Given	4fa2c94a4a	Correctly mangle labels used in initialisers.	2016-10-31 23:21:33 +01:00
David Given	9261cd978d	Typo fix.	2016-10-31 23:16:02 +01:00
David Given	941072e0d7	Add, I hope, patterns for fmsub, fnmadd, and fnmsub (also float versions).	2016-10-31 22:36:54 +01:00
David Given	44f0cea6ca	Also use fmadd for single-precision floats.	2016-10-31 19:55:16 +01:00
David Given	064d1a5d5d	Use fmadd for multiply-and-add instructions.	2016-10-31 19:52:17 +01:00
David Given	e19850b114	Fix a few c11isms.	2016-10-30 16:51:06 +01:00
David Given	ca5b6e07bb	Properly export symbols.	2016-10-29 23:52:17 +02:00
David Given	8c3670483f	Get top working with the PowerPC; use it to eliminate useless branches and moves.	2016-10-29 23:37:11 +02:00
David Given	a8c4dac67c	Merge from default (merging in George Koehler's PowerPC changes).	2016-10-29 22:40:40 +02:00
David Given	a311e61360	Add support for preserved registers.	2016-10-29 20:22:44 +02:00
David Given	e3ebf986e9	More opcodes.	2016-10-29 13:32:09 +02:00
David Given	1ae8b90238	More opcodes.	2016-10-29 12:55:34 +02:00
David Given	acaae765af	Emit negative constants correctly.	2016-10-29 12:55:21 +02:00
David Given	61349389fb	More opcodes. sti can now cope with non-standard sizes (really need a better fix for this). Hack in crude support for mismatched stack pushes and pops (ints vs longs).	2016-10-29 12:48:05 +02:00
David Given	68419da235	Actually, the locals need to go above the spills and saved regs, so fp == lb.	2016-10-29 12:00:33 +02:00
David Given	2cc2c0ae98	Lots more opcodes. Rearrange the stack layout so that fp->ab is a fixed value (needed for CHAINFP and FPTOAB). Wire up lfrs to calls via a phi when necessary, to allow call-bra-lfr chains.	2016-10-29 11:57:56 +02:00
David Given	bfa65168e2	Don't generate phis if unnecessary (because this breaks the critical-edge-splitting guarantee and causes insertion of phi copies to fail).	2016-10-29 10:55:48 +02:00
David Given	658db4ba71	Mangle label names (turns out that the ACK assembler can't really cope with labels that are the same name as instructions...).	2016-10-27 23:17:16 +02:00
David Given	81525c0f2c	Swaps work (at least for registers). More opcodes. Rearrange the stack layout so we can always trivially find fp, which lets CHAINFP work.	2016-10-27 21:50:58 +02:00
David Given	be3dece5af	Allow emission of strings containing ".	2016-10-27 21:48:46 +02:00
David Given	51bd3ee4dd	Fix bug where some phis weren't being inserted when a given variable definition needed more than one phi (due to the dominance frontier containing more than one basic block).	2016-10-27 21:40:25 +02:00
David Given	9977ce841a	Remove the bytes1, bytes2, bytes4, bytes8 attributes; remove the concept of a register 'type'; now use int/float/long/double throughout to identify registers. Lots of register allocator tweaks and table bugfixes --- we now get through the dreaded Mathlib.mod!	2016-10-25 23:04:20 +02:00
David Given	45a7f2e993	Phi copies are now inserted as part of type inference. More opcodes.	2016-10-24 22:14:08 +02:00
David Given	111c13e253	More opcodes.	2016-10-24 20:15:22 +02:00
David Given	a4644dee4d	More opcodes.	2016-10-24 12:08:40 +02:00
David Given	b22780c075	More opcodes, including the difficult and fairly stupid los/sts.	2016-10-23 22:24:08 +02:00
David Given	abd0cedd61	Massive change to how IR types are handled; we use the type code for matching rather than the size. Much cleaner and simpler.	2016-10-23 21:54:14 +02:00
David Given	b1a3d76d6f	Re-re-add the type inference layer, now I know more about how things work. Remove that terrible float promotion code.	2016-10-22 23:04:13 +02:00
David Given	11b0bc1055	More opcodes.	2016-10-22 20:32:51 +02:00
David Given	2d52b1fdaa	Remove GETRET; values are now returned directly by CALL. Fix a bug in convertstackops which was resulting in duplicate IR groups.	2016-10-22 12:13:57 +02:00
David Given	ceb938fb3c	More opcodes.	2016-10-22 11:26:28 +02:00
David Given	7ae888b754	Hacky workaround for the way the Modula-2 compiler generates non-standard sized loads and saves. More opcodes; simplified table using macros.	2016-10-22 10:48:22 +02:00
David Given	90d0661639	Typo fix.	2016-10-22 00:48:55 +02:00
David Given	f851ab83af	Better (and more correct) floating point conversions; fif; various new opcodes.	2016-10-22 00:48:26 +02:00
David Given	d535be87b1	fef4 and fef8 are now cleaner, albeit slower; add some more register alias stuff.	2016-10-22 00:02:15 +02:00
David Given	4db402f229	Add (pretty crummy) support for register aliases and static pairs of registers. We should have enough functionality now for rather buggy 8-bit ints and doubles. Rework the table and the platform.c to match.	2016-10-21 23:31:00 +02:00
David Given	e4fec71f9c	Lots more opcodes; better eviction behaviour; better register moves. Lots more PowerPC stuff (some working).	2016-10-19 23:29:05 +02:00
David Given	ffb1eabf45	Floating point promotion is less buggy.	2016-10-19 23:27:53 +02:00
George Koehler	99dee0ad24	Remove f14 to f31 from FREG and FSREG. This would have happened later, if f14 to f31 became regvar (like r13 to r31 are now). I am doing it now because ncg is too slow for rules "with FREG FREG uses FREG". We use such rules for adf 8 and other EM instructions that operate on 2 floats. Like my last commit `cfbc537`, this commit speeds ncg by removing choices for register allocation.	2016-10-18 21:16:47 -04:00
David Given	d5071e7df1	Promote values accessed via NOP.	2016-10-18 23:58:03 +02:00
David Given	5413d47029	'!' tracing is now always emitted; tracing goes to stderr.	2016-10-18 22:32:09 +02:00
David Given	3520704ea8	Add support for floating point constants.	2016-10-18 22:29:42 +02:00
George Koehler	cfbc537959	In powerpc ncg, add a speed hack for sti 8. ncg is too slow with this many registers. A stack pattern "with GPR GPR GPR" or "with REG REG REG" takes too long to pick registers, causing ncg to take about 2 seconds on each sti 8. I introduce REG_PAIR and there are only 4 such pairs. For programs that use sti 8 (including C programs that copy 8-byte structs), this speed hack improves the ncg run from several seconds to almost instantaneous. Also add a few COMMENT(...) lines in stacking rules.	2016-10-17 20:31:59 -04:00
David Given	938fb8c2fc	Lots more opcodes.	2016-10-18 00:31:26 +02:00
David Given	4a093b9eba	Add li and mr pseudoinstructions.	2016-10-18 00:21:32 +02:00
George Koehler	c7b68033ef	Add costs to powerpc instructions. Also show how andi., andis., or., set condition codes.	2016-10-17 14:57:21 -04:00
George Koehler	f33b30ed3c	Rewrite .fif8 to avoid powerpc64 fctid. This fixes the SIGILL (illegal instruction) in startrek when firing phasers. The 32-bit processors in my PowerPC Mac and in QEMU don't have fctid, a 64-bit instruction. I got the idea from mach/proto/fp/fif8.c to extract the exponent, clear some bits to get an integer, then subtract the integer from the original value to get the fraction.	2016-10-17 00:39:59 -04:00
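The exponent-and-mask idea described in `f33b30ed3c` can be sketched in C. This is only an illustration of the technique, not the libem assembly: the function name `split_double` is hypothetical, the code assumes IEEE 754 binary64 doubles, and infinities and NaNs are deliberately not handled.

```c
#include <stdint.h>
#include <string.h>

/*
 * Split x into an integer part (rounded toward zero) and a fraction,
 * without any float-to-integer conversion instruction.
 * Hypothetical helper; assumes IEEE 754 binary64 and a finite x.
 */
void split_double(double x, double *ipart, double *fpart)
{
    uint64_t bits;
    int exponent;

    memcpy(&bits, &x, sizeof bits);
    exponent = (int)((bits >> 52) & 0x7ff) - 1023;   /* unbiased exponent */

    if (exponent < 0) {
        /* |x| < 1: the integer part is zero, keeping the sign of x. */
        bits &= UINT64_C(0x8000000000000000);
    } else if (exponent < 52) {
        /* Clear the mantissa bits that lie below the binary point. */
        bits &= ~((UINT64_C(1) << (52 - exponent)) - 1);
    }
    /* exponent >= 52: every mantissa bit is already integral. */

    memcpy(ipart, &bits, sizeof bits);
    *fpart = x - *ipart;   /* the subtraction yields the fraction */
}
```

For example, calling `split_double(3.75, &i, &f)` leaves 3.0 in `i` and 0.75 in `f`.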
George Koehler	e2ccc8f942	Add "kills MEMORY" to powerpc sti rules. Adjust some of the loi rules (and associated moves) so we can identify the tokens that must be in MEMORY. With this commit, I can navigate the Enterprise even if I comment out my work-around from `e22c888`.	2016-10-16 18:13:39 -04:00
David Given	5f0164db62	Bolt mcg into the PowerPC backend. It doesn't build yet, but it is generating some code.	2016-10-17 00:06:06 +02:00
David Given	d539389e81	Merge in the unfinished PowerPC branch.	2016-10-16 22:38:27 +02:00
David Given	1e17921208	Implement saving of dirty registers onto the stack.	2016-10-16 22:37:42 +02:00
George Koehler	19f0eb86a4	Remove IND_LABEL_W and IND_LABEL_D. Because li32 always loads a label into a GPR, it is sufficient to coerce LABEL to REG, then use IND_RC_W or IND_RC_D for indirection through the label.	2016-10-16 16:33:24 -04:00
George Koehler	5b5f774a64	Simplify moves to and from IND_RC_*. Now that SUM_RC always has a signed 16-bit constant, it happens that the various IND_RC_* tokens also have a signed 16-bit constant, so we no longer need to touch the scratch register.	2016-10-16 16:02:25 -04:00
George Koehler	7c64dab491	Refactor how powerpc ncg pushes constants. When loc (load constant) pushes a constant, it now checks the value of the constant and pushes any of 7 tokens. These tokens allow stack patterns to recognize 16-bit signed integers (CONST2), 16-bit unsigned integers (UCONST2), multiples of 0x10000 (CONST_HZ), and other interesting forms of constants. Use the new constant tokens in the rules for adi, sbi, and, ior, xor. Adjust a few other rules to understand the new tokens. Require that SUM_RC has a signed 16-bit constant, and OR_RC and XOR_RC each have an unsigned 16-bit constant. The moves from SUM_RC, OR_RC, XOR_RC to GPR no longer touch the scratch register, because the constant is not too big.	2016-10-16 13:58:54 -04:00
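The three constant forms named in `7c64dab491` correspond to simple range tests, sketched below in C. The predicate names are hypothetical, and the sketch covers only the three forms the message spells out; how the table combines them into its 7 tokens is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 16-bit range, as described for CONST2 (hypothetical name). */
bool fits_signed16(int32_t v)
{
    return v >= -32768 && v <= 32767;
}

/* Unsigned 16-bit range, as described for UCONST2 (hypothetical name). */
bool fits_unsigned16(int32_t v)
{
    return v >= 0 && v <= 65535;
}

/* Multiple of 0x10000, as described for CONST_HZ: the low half is zero. */
bool is_multiple_of_0x10000(int32_t v)
{
    return ((uint32_t)v & 0xffff) == 0;
}
```

The forms overlap: 5 satisfies both 16-bit tests, and 0 satisfies all three, which is one reason a table may need more tokens than there are basic forms.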
George Koehler	baa152217e	Remove unused parts of mach/powerpc/ncg/table. Remove unused tokens GPRINDIRECTLO, HILABEL, LOLABEL, LABELI. Also remove an #if 0 ... #endif group of patterns.	2016-10-15 20:00:48 -04:00
David Given	6a23906ad8	Various bits of cleanup; we should almost be ready to try sending this to the assembler soon...	2016-10-15 23:39:38 +02:00
David Given	286435a2ed	Oops, forgot to add the output option spec to the string!	2016-10-15 23:34:54 +02:00
David Given	b36897c299	References to the stack frame are now rendered properly.	2016-10-15 23:33:30 +02:00
David Given	a8ee82d197	Stop passing proc around, and use a global instead --- much cleaner.	2016-10-15 23:19:44 +02:00
David Given	7aa60a6451	Register spilling to the stack frame works, more or less.	2016-10-15 22:53:56 +02:00
David Given	0eb32e7553	Fix yet another bug to do with IR register outputs.	2016-10-15 19:14:25 +02:00
David Given	9504aec2bd	Function termination gets routed through an exit block; we now have prologues and epilogues. mcgg now exports some useful data as headers. Start factoring out some of the architecture-specific bits into an architecture-specific file.	2016-10-15 18:38:46 +02:00
David Given	5ad3aa8595	Add a pile of new instructions used by Pascal; I'm going to need to think about how locals and the local base are handled.	2016-10-15 13:07:59 +02:00
David Given	358c44de35	Bytes were sometimes failing to be sign extended correctly.	2016-10-15 12:11:40 +02:00
David Given	517120d0fb	Allow asm names for registers which are different from the friendly names shown in the tracing (because PowerPC register names are just numbers).	2016-10-15 11:42:47 +02:00
David Given	b2ddf12473	Some more opcodes.	2016-10-15 11:22:40 +02:00
George Koehler	29cb008faa	In powerpc table, fix macros los() and his(). Change the operator in his() from a - minus to a + plus. When los(n) becomes negative, then his(n) needs to add 0x10000, not subtract it. Also change los(n) to do the sign extension, because smalls(los(n)) should be true, not false. Also change hi(n) and lo(n) to wrap n in parentheses, as (n), because these are macros and n might still contain operators.	2016-10-14 23:59:26 -04:00
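The arithmetic behind the his()/los() fix in `29cb008faa` can be checked with a small C model. This is a sketch under assumptions: `lo_signed` and `hi_adjusted` are hypothetical stand-ins, not the table's actual macro definitions. They show why the high half must be increased when the sign-extended low half is negative.

```c
#include <stdint.h>

/* Low 16 bits of n, sign-extended (hypothetical stand-in for los). */
int32_t lo_signed(uint32_t n)
{
    int32_t lo = (int32_t)(n & 0xffff);
    return (lo & 0x8000) ? lo - 0x10000 : lo;
}

/* High 16 bits of n, adjusted so that
 * (hi << 16) + lo_signed(n) == n (hypothetical stand-in for his). */
uint32_t hi_adjusted(uint32_t n)
{
    uint32_t hi = n >> 16;
    if (lo_signed(n) < 0)
        hi = (hi + 1) & 0xffff;   /* add 0x10000 to cancel the negative low half */
    return hi;
}

/* Recombine the halves the way an add of a sign-extended low half would. */
uint32_t recombine(uint32_t n)
{
    return (hi_adjusted(n) << 16) + (uint32_t)lo_signed(n);
}
```

With n = 0x12348000 the low half is -0x8000, so the high half becomes 0x1235; subtracting instead would give 0x1233 and a result that is off by 0x20000.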
David Given	bb17aea73a	You can now mark a register as corrupting a certain register class; calls work, or at least look like they work. The bad news is that the register allocator has a rare talent for putting things in the wrong register.	2016-10-15 01:15:08 +02:00
David Given	886adb86d7	Log empty hops.	2016-10-14 23:19:25 +02:00
David Given	4f2177e41f	Reworked loads and stores; it's now different, maybe not better.	2016-10-14 23:19:02 +02:00
David Given	a63052427e	Factor out the register allocation routines to make them easier to deal with.	2016-10-14 23:17:06 +02:00
David Given	bb53a7fb51	Fix stupid issue where hop output registers were being overwritten, leading to invalid SSA form.	2016-10-14 23:12:29 +02:00
David Given	98fe70a7de	Output register equality constraints work.	2016-10-14 22:17:02 +02:00
David Given	216ff5cc43	Make loads and stores in the table nicer; fix a place where it looked like it was working but only accidentally.	2016-10-12 23:12:53 +02:00
David Given	f06b51c981	Keep track of register types as well as attributes --- the type being how we find new registers when evicting values. Input constraints work (they were being ignored before). Various bug fixing so they actually work.	2016-10-12 22:58:46 +02:00
David Given	4723a1442f	Add code to remove unused phis, converting to pruned SSA form, to avoid confusing the register allocator later.	2016-10-12 21:50:12 +02:00
David Given	df239b3f90	Don't allow the same IR to be added to the sequence list more than once (sometimes happens because op_dup, but makes no sense).	2016-10-12 00:45:36 +02:00
David Given	96dffd2007	Clean up the allocator a bit, in preparation for making it lots more complicated; no semantic changes.	2016-10-11 23:17:30 +02:00
David Given	668cccdff1	A few more opcodes.	2016-10-11 00:29:18 +02:00
David Given	2be1c51885	A little fiddling with store instructions. The PowerPC is not friendly to iburg.	2016-10-11 00:23:35 +02:00
David Given	e93c58dc8d	Refactored the way hops are rendered; add support for emitting code (although with no prologue or epilogue yet).	2016-10-11 00:12:11 +02:00
David Given	92bd1ac5f4	Register allocator now gets all the way through all of my test file without crashing (albeit with register moves and swaps stubbed out). Correct code? Who knows.	2016-10-10 23:19:46 +02:00
David Given	a4d06d1795	D'oh, need multiple passes over the edge splitter in order to properly find all cases.	2016-10-10 23:18:37 +02:00
David Given	fac12aae32	Calculate phi congruency groups; use them to solve the importing-hreg-from-the-future problem (probably poorly).	2016-10-09 22:04:20 +02:00
David Given	23c3575f0f	The register allocator now makes a spirited attempt to honour register attributes when allocating. Unfortunately, backward edges don't work (because the limited def-use chain stuff doesn't work across basic blocks). Needs more thought.	2016-10-09 15:09:34 +02:00
David Given	38de688c5a	Floating point promotion was broken since the IR float change. Fix.	2016-10-09 15:08:03 +02:00
David Given	36cddd6afb	Add some more opcodes; rearrange the registers to be more PowerPC-friendly.	2016-10-09 14:45:13 +02:00
David Given	cfe5312fcc	Predicates can now take numeric arguments. The PowerPC predicates have been turned into generic ones (as they'll be useful everywhere). Node arguments for predicates require the '%' prefix for consistency. Hex numbers are permitted.	2016-10-09 12:32:36 +02:00
David Given	d75cc0a663	Basic register allocation works!	2016-10-08 23:32:54 +02:00
David Given	637aeed70a	Only allocate an output vreg if the instruction actually wants one.	2016-10-08 12:15:21 +02:00
David Given	2198db69b1	Instruction predicates work now.	2016-10-08 11:35:33 +02:00
David Given	9ebf731335	Minor cleanup.	2016-10-08 11:07:28 +02:00
David Given	9db902314b	Fix bug where pushes were being placed in the wrong blocks.	2016-10-08 10:21:24 +02:00
George Koehler	65c2a8a0ae	Remove stackadjust and stackoffset() from ncg. This feature has never been used since its introduction, more than 3 years ago, in David Given's commit `c93cb69` of May 8, 2013. The commit was for "PowerPC and M68K work". I am not undoing the entire commit. I am only removing the stackadjust and stackoffset() feature. This commit removes the feature from my branch kernigh-linuxppc. This removal includes the mach/proto/ncg parts. The default branch already removed most of the feature, but kept the mach/proto/ncg parts. That removal happened in commit `81778b6` of May 13, 2013 (which was a merge; git diff `af0dede` `81778b6`). The branch dtrg-experimental-powerpc merged the default branch but without the removal. That merge was commit `4703db0f` of Sep 15, 2016 (git diff `8c94b13` `4703db0`). My branch kernigh-linuxppc is off branch dtrg-experimental-powerpc, so I can no longer get the removal by merging default. David Given described the stackadjust feature in https://sourceforge.net/p/tack/mailman/message/30814691/ The instruction stackadjust would add a value to the offset, and the function stackoffset() would return this offset. One would use this to track sp - fp, then omit the frame pointer by not keeping fp in a register.	2016-10-07 20:52:13 -04:00
David Given	4e49830e09	Overhaul of everything phi related; critical edge splitting now happens before anything SSA happens; liveness calculations now look like they might be working.	2016-10-08 00:21:23 +02:00
George Koehler	409ba7fb1b	Remove most of GPRE from mach/powerpc/ncg/table. We only need GPRE in a few places where we write {GPRE, regvar(...)} because ncgg can't parse plain regvar(...). In all other places, a plain GPR works. Also remove gpr_gpr_gpr and a few other unused and fake instructions from the list of instructions.	2016-10-06 22:59:27 -04:00
George Koehler	7cccd88b71	Rename SCRATCH to RSCRATCH. Never stack RSCRATCH nor FSCRATCH. Rename the scratch gpr (currently r11) from SCRATCH to RSCRATCH so I can search for RSCRATCH without finding FSCRATCH. I also want to avoid confusion with the SCRATCH keyword of the old code generator (cg which came before ncg). Change the stacking rules to prevent stacking of RSCRATCH or FSCRATCH or any other GPR or FPR that isn't an allocatable REG or FREG. Then ncgg rejects any rule that tries to stack a GPR or FPR, so change such rules to stack a REG or FREG.	2016-10-06 20:47:42 -04:00
David Given	ee93389c5f	Refactor the cfg and dominance stuff to make it a lot nicer.	2016-10-06 21:34:21 +02:00
David Given	d20b63dc94	The register allocator is really a pass, so arrange the code like one.	2016-10-05 23:55:38 +02:00
David Given	87e004e4a9	Warning fix.	2016-10-05 23:55:04 +02:00
David Given	21034c0d65	No, dammit, for register allocation I need to walk the blocks in dominance order. Since the dominance tree has changed when I fiddled with the graph, I need to recompute it, so factor it out of the SSA pass. Code is uglier than I'd like but at least the RET statement goes last in the generated code now.	2016-10-05 23:52:54 +02:00
David Given	d95c75dfd7	Allowing an input filename on the command line makes debuggers happy. (Then we don't need to redirect stdin.)	2016-10-05 23:24:29 +02:00
David Given	88fb231d6e	Better constraint syntax; mcgg now passes register usage information up to mcg; mcg can track individual hop inputs and outputs (needed for live range analysis!); the register allocator now puts the basic blocks into the right order in preparation for live range analysis.	2016-10-05 22:56:25 +02:00
David Given	7a6fc7a72b	Made sure that all files end in vim magic.	2016-10-05 21:07:29 +02:00
David Given	92502901a7	Better management of register data. Add struct hreg.	2016-10-05 21:00:28 +02:00
David Given	ac62c34e19	Add a pass to do critical edge splitting.	2016-10-04 23:42:00 +02:00
David Given	8fedf5a0a8	Added support for the op_bXX conditional branch instructions.	2016-10-04 23:28:16 +02:00
David Given	249855ed23	Fix the horror of the startup code; now uses getopt and stuff and the debug flags can be set as an option.	2016-10-04 22:36:01 +02:00
David Given	ac063a6f54	Remove unused variable (reduce memory usage by 1/10).	2016-10-04 22:35:08 +02:00
David Given	c6f576f758	Bodge in enough phi support to let the instruction generator complete on basic programs.	2016-10-04 21:58:31 +02:00
David Given	e13ff5be31	Don't allocate new vregs for REG and NOP --- a bit hacky, but suppresses stray movs very effectively.	2016-10-04 21:29:03 +02:00
David Given	bd28bddb92	Massive rewrite of how emitters and the instruction selector works, after I realised that the existing approach wasn't working. Now, hopefully, tracks the instruction trees generated during selection properly.	2016-10-04 00:16:06 +02:00
David Given	68f98cbad7	Instruction selection now happens on a shadow tree, rather than on the IR tree itself. Currently it's semantically the same but the implementation is cleaner.	2016-10-03 20:52:36 +02:00
David Given	288ee56203	Get quite a long way towards basic output-register equality constraints (needed to make special nodes like NOP work properly). Realise that the way I'm dealing with the instruction selector is all wrong; I need to physically copy chunks of tree to give to burg (so I can terminate them correctly).	2016-10-02 23:25:54 +02:00
David Given	3aa30e50d1	Come up with a syntax for register constraints.	2016-10-02 21:51:25 +02:00
David Given	c079e97492	Perform SSA conversion of locals. Much, much better code now, at least inasmuch as it looks better before register allocation. Basic blocks now know their own successors and predecessors (after a certain point in the IR processing).	2016-10-02 17:50:34 +02:00
David Given	79d4ab1d96	Add zrl opcode. Keep track of local sizes as well as offsets.	2016-10-02 16:08:46 +02:00
David Given	bf73fcdb64	Add inl and del opcodes.	2016-10-02 14:44:21 +02:00
David Given	b298c27c63	Refactor mcg.h as it's getting a bit big; keep track of register variables.	2016-10-02 00:30:33 +02:00
David Given	06059233da	Make betterer.	2016-10-01 23:41:45 +02:00
David Given	65e75be42d	Fix edge case where leftover pushes would occasionally cause infinite loops in the analysis.	2016-10-01 23:41:35 +02:00
David Given	73d7e89c32	Show expression trees correctly.	2016-10-01 23:41:03 +02:00
David Given	3474e20274	Deal with malformed mes instructions emitted by ego.	2016-10-01 23:13:39 +02:00
David Given	a3cfe6047f	More rigorous handling of IR groups; no need for is_generated and is_root any more (but now passes are required to set IR roots properly when changing instructions).	2016-10-01 22:58:29 +02:00
David Given	21898f784a	We're going to need some type inference after all, I think. Let's do a little for now and see how it goes.	2016-10-01 19:10:22 +02:00
David Given	91e277e046	Predicates work; we now have prefers and requires clauses. Predicates must be functions. Not convinced that semantic types are actually working --- there are still problems with earlier statements leaving things in the wrong registers.	2016-10-01 13:56:52 +02:00
David Given	4a3a9a98dc	It doesn't really make a lot of sense to have BURG nonterminal names different to register classes, so combine them. Refactor the map code.	2016-10-01 12:17:14 +02:00
George Koehler	ce5faba919	Remove .linenumber and .filename; use hol0 and hol0+4. We need this because some .e files in lang/ are using 'loe 0' and 'lae 4' to load the line number from hol0 and filename from hol0+4.	2016-09-30 13:40:36 -04:00
David Given	3a973a19f3	Move fatal(), warning() and aprintf() into the new data module (because they're really useful).	2016-09-30 19:10:30 +02:00
George Koehler	e22c8881e7	Add a rule for sdl ldl $1==$2 to work around a bug. In our powerpc table, sdl fails to kill the old value of the local. This is a bug, because a later ldl can load the old value instead of the newly stored value. By rewriting "sdl 0" "ldl 0" as "dup 8" "sdl 0", the newly added rule works around the bug, but only when the ldl is immediately after the sdl. This rule improves code that uses double-precision floating point. The output of printf("%f", 6.0) in C changes from all zero digits to "6000000" but still doesn't print the decimal point. The result of atof("-123.456") becomes correct. In startrek, I can now move the Enterprise, but I still can't fire phasers without crashing the game. We already have a rule for stl lol $1==$2. We had two copies of the rule, so I am deleting the second copy.	2016-09-30 11:50:50 -04:00
David Given	0d246c0d73	Much better handling of fragments (no run-time code needed to distinguish them from registers) and better handling of individual hops within a paragraph --- no more ghastly hacks to try and distinguish the input from the output.	2016-09-29 22:06:04 +02:00
George Koehler	6ae415d48b	Rewrite fef 8 in powerpc assembly. In EM, fef splits a float into exponent and fraction. The old C code, given an infinite float, got stuck in an infinite loop. The new assembly code doesn't loop; it extracts the IEEE exponent.	2016-09-29 15:52:54 -04:00
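A loop-free split like the one `6ae415d48b` describes can be sketched in C by reading the exponent field directly. This is an illustration only, not the assembly in libem: `split_exponent` is a hypothetical name, the code assumes IEEE 754 binary64, and the frexp-style convention (fraction magnitude in [0.5, 1)) is an assumption. Zeros, denormals, infinities and NaNs are returned unchanged with exponent 0.

```c
#include <stdint.h>
#include <string.h>

/*
 * Return the fraction of x and store its binary exponent, so that
 * x == fraction * 2^exponent. No loop is involved, so an infinite
 * input cannot hang. Hypothetical helper; assumes IEEE 754 binary64.
 */
double split_exponent(double x, int *exponent)
{
    uint64_t bits;
    int biased;

    memcpy(&bits, &x, sizeof bits);
    biased = (int)((bits >> 52) & 0x7ff);

    if (biased == 0 || biased == 0x7ff) {
        /* Zero, denormal, infinity or NaN: not split in this sketch. */
        *exponent = 0;
        return x;
    }

    *exponent = biased - 1022;
    /* Overwrite the exponent field with the bias for 2^-1. */
    bits = (bits & ~(UINT64_C(0x7ff) << 52)) | (UINT64_C(1022) << 52);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For 8.0 this returns 0.5 with exponent 4, and for -3.0 it returns -0.75 with exponent 2.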
David Given	a0131fdb47	You know what, the type inference stuff is a complete red herring. What this actually needs is a more intelligent register allocator. So, remove the type inference.	2016-09-29 19:58:02 +02:00
George Koehler	a71eee3914	For "pat ass", move fake stack to real stack before adjusting SP. This fixes code that tried to "addi SP, SP, 4" to drop a value that was in a register, not on the real stack. Add a rule to optimize "asp 4" (which becomes "loc 4" "ass") when the value being dropped is already in a GPR.	2016-09-28 00:13:35 -04:00
David Given	4572f1b774	Actually, I don't need vregs: hops work just as well. Particularly if I restructure things so that I don't need to walk the blasted ir / burg tree every time I look at an instruction.	2016-09-27 23:38:47 +02:00
George Koehler	1e3dde915a	Remove the "invalid" stacking rule. When ncg fell back on this rule, it did emit the string "invalid" in the assembly code and caused a syntax error in the assembler. Adjust the stacking rules so we can stack LOCAL, CONST, and LABEL without falling back on the "invalid" rule, and so we can stack them when we have no free register except the scratch register.	2016-09-27 16:46:11 -04:00
David Given	e77c5164cf	Fleshed out hops and vregs. The result is almost looking like code now --- uncanny.	2016-09-27 00:19:45 +02:00
David Given	f552c9c7c6	Move map into the data module.	2016-09-26 23:03:04 +02:00
David Given	c4b8e00ae2	Revamp the array module not to use nasty macros any more. Slightly more verbose to use, but definitely cleaner.	2016-09-26 22:48:58 +02:00
David Given	3671892c34	Move the array library into the data module.	2016-09-26 22:24:49 +02:00
David Given	cc176e5183	Keep more data around about ir instructions. Implement a half-baked type inference routine to propagate information about floats up the tree, so we know whether to put floats into special registers as early as possible.	2016-09-26 22:12:46 +02:00
David Given	416b13fd76	Start factoring out the hardware op code.	2016-09-25 23:29:59 +02:00
David Given	39aa672422	Sort of keep track of registers and register classes. Start walking the generated instruction tree --- holy cow, they look like instructions!	2016-09-25 22:17:14 +02:00
David Given	bde5792b1a	Collapse several rule arrays into one; actually generate the array properly.	2016-09-25 17:14:54 +02:00
David Given	67eb21d428	Rename struct insn to struct em (throughout).	2016-09-25 12:29:03 +02:00
David Given	9f78e0b36b	Rethink the way patterns are mapped to rules; generate emitters (probably badly).	2016-09-25 11:49:51 +02:00
David Given	7c028bdd45	We now record the code fragments to be emitted by each rule.	2016-09-25 00:21:46 +02:00
David Given	717b77dd0a	Instruction selection is so important the file needs a longer name.	2016-09-24 22:50:53 +02:00
David Given	629e0ddfc6	Some instruction selection is now happening.	2016-09-24 22:46:08 +02:00
David Given	c8fcbe282a	More grammar changes.	2016-09-24 19:03:55 +02:00
David Given	2acc4ed29d	IR codes are now owned by mcgg; ir terminals are inserted into the table during compilation (so you can refer to them).	2016-09-24 18:31:35 +02:00
David Given	1516657907	Crudely bolt on mcgg to mcg itself.	2016-09-24 17:20:40 +02:00
David Given	6643d39b2c	Fix some late-night typo bugs.	2016-09-24 01:09:32 +02:00
David Given	bb9aa030a5	Procedure compilation now happens after the entire EM file has been read in (so that we can look inside data blocks which might be defined in the future... sigh, csa and csb). csa and csb no longer generate invalid IR.	2016-09-24 01:04:00 +02:00
David Given	ed67d427c9	Replaced the block splicer with a trivial block eliminator (which rewrites jumps to blocks which contain only a jump). Don't bother storing the bb graph in the ir nodes; we can find it on demand by walking the tree instead --- slower, but much easier to understand and more robust. Added a terrible map library.	2016-09-23 23:59:15 +02:00
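The trivial block eliminator in `ed67d427c9` rewrites jumps whose destination contains only a jump. A minimal C sketch of the target resolution follows; `struct block` and `final_target` are hypothetical and do not reflect mcg's real data structures.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a basic block. */
struct block {
    int is_trivial;   /* nonzero if the block contains only a jump */
    size_t target;    /* index of that jump's destination */
};

/*
 * Follow a chain of jump-only blocks from start and return the index
 * of the first block that does real work. The step count is bounded
 * by the number of blocks, so a cycle of jump-only blocks terminates.
 */
size_t final_target(const struct block *blocks, size_t count, size_t start)
{
    size_t current = start;
    size_t steps;

    for (steps = 0; steps < count && blocks[current].is_trivial; steps++)
        current = blocks[current].target;
    return current;
}
```

A pass built on this would replace each jump's destination with `final_target(...)` of that destination, after which the jump-only blocks become unreachable and can be dropped.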
David Given	f8bbf9e87d	Each pass now lives in its own source file; much cleaner.	2016-09-23 21:07:16 +02:00
David Given	9077baa850	Add a bodged in algorithm for converting basic block communication from stacked variables to SSA. Also add dead block removal and block splicing. IR code is much better now.	2016-09-22 23:19:29 +02:00
David Given	6a74cb2e11	Tracing cleanup. Simplified the IR code. Some more opcodes.	2016-09-22 00:15:48 +02:00
David Given	4546dd5f22	Massive grammar overhaul and refactor. Hacked in support for predicates, where instructions can be turned on and off based on their parameters. New lexer using a lexer. Now quite a lot of the way towards being a real instruction selector.	2016-09-21 00:43:10 +02:00
David Given	36d7d1ee4e	Create hacky fake basic blocks for data fragments, used to track which instruction labels descriptor blocks refer to; this allows csa and csb to know where they're going.	2016-09-20 00:19:39 +02:00
David Given	dcba03646b	Treebuilder now gets to the bottom of my test file, merrily generating (probably horribly broken) IR.	2016-09-19 23:30:41 +02:00
David Given	6ce2495aeb	Store the EM code up front and build the basic block graph before generating the IR code. Lots more IR code.	2016-09-19 23:06:59 +02:00
David Given	176cd7365c	Archival checkin of the half-written IR treebuilder.	2016-09-18 23:24:54 +02:00
George Koehler	5b69777647	Rename our pseudo-opcode 'la' to 'li32'. GNU as has "la %r4,8(%r3)" as an alias for "addi %r4,%r3,8", meaning to load the address of the thing at 8(%r3). Our 'la', now 'li32', makes an addis/ori pair to load an immediate 32-bit value. For example, "li32 r4,23456789" loads a big number.	2016-09-18 17:03:23 -04:00
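The addis/ori pair that li32 expands to can be modelled in C to show how the 32-bit immediate is split. This is a sketch, not the assembler's code: `li32_split` and `li32_model` are hypothetical names. Because ori zero-extends its immediate, the high half needs no adjustment here.

```c
#include <stdint.h>

/* The two 16-bit immediates of the instruction pair (hypothetical type). */
struct li32_halves {
    uint16_t hi;   /* immediate for addis */
    uint16_t lo;   /* immediate for ori */
};

struct li32_halves li32_split(uint32_t value)
{
    struct li32_halves h;
    h.hi = (uint16_t)(value >> 16);
    h.lo = (uint16_t)(value & 0xffff);
    return h;
}

/* Model the register contents after the two instructions execute. */
uint32_t li32_model(uint32_t value)
{
    struct li32_halves h = li32_split(value);
    uint32_t reg = (uint32_t)h.hi << 16;   /* addis rD, 0, hi */
    reg |= h.lo;                           /* ori rD, rD, lo */
    return reg;
}
```

For the message's example, 23456789 is 0x0165EC15, so the pair would carry the immediates 0x0165 and 0xEC15.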
George Koehler	9db305b338	Enable the Hall check again, and get powerpc to pass it. Upon enabling the check, mach/powerpc/ncg/table fails to build as ncgg gives many errors of "Previous rule impossible on empty stack". David Given reported this problem in 2013: https://sourceforge.net/p/tack/mailman/message/30814694/ Commit `c93cb69` commented out the error in util/ncgg/cgg.y to disable the Hall check. This commit enables it again. In ncgg, the Hall check is checking that a rule is possible with an empty fake stack. It would be possible if ncg can coerce the values from the real stack to the fake stack. The powerpc table defined coercions from STACK to {FS, %a} and {FD, %a}, but the Hall check didn't understand the coercions and rejected each rule "with FS" or "with FD". This commit removes the FS and FD tokens and adds a new group of FSREG registers for single-precision floats, while keeping FREG registers for double precision. The registers overlap, with each FSREG containing one FREG, because it is the same register in PowerPC hardware. FS tokens become FSREG registers and FD tokens become FREG registers. The Hall check understands the coercions from STACK to FSREG and FREG. The idea to define separate but overlapping registers comes from the PDP-11 table (mach/pdp/ncg/table). This commit also removes F0 from the FREG group. This is my attempt to keep F0 off the fake stack, because one of the stacking rules uses F0 as a scratch register (FSCRATCH).	2016-09-18 15:08:55 -04:00
George Koehler	03b067e1d5	Add the missing .lar4 and .sar4 for powerpc. Inspired by the sparc code (mach/sparc/libem/lar.s). My powerpc code might still have bugs, but it's enough for examples/hilo.mod to work. May need to 'make clean' or touch a build.lua file, so ackbuilder can notice the new lar4.s and sar4.s files and build them.	2016-09-17 23:55:55 -04:00
David Given	24380e2a93	Abstract out the EM reader; skeleton of the tree builder.	2016-09-18 00:02:16 +02:00
David Given	2eee391aef	Basic skeleton of em parser.	2016-09-17 22:21:47 +02:00
David Given	80cb6ba927	Eliminate the RELOH2 relocation, as it never worked --- the address would be calculated incorrectly because of overflow errors. Replace it with an extended RELOPPC relocation which understands addis/ori pairs; add an la pseudoop to the assembler which generates these and the appropriate relocation. Make good. --HG-- branch : dtrg-experimental-powerpc-branch	2016-09-17 12:43:15 +02:00
David Given	45a950571d	Mostly add support for the experimental and largely broken linuxppc platform. (Doesn't quite build.) --HG-- branch : dtrg-experimental-powerpc-branch	2016-09-15 23:12:03 +02:00
David Given	f67c98e239	Distributions are a pain --- let's not bother any more. Instead, we just tag the repository and download a complete snapshot, old and ancient stuff and all.	2016-09-02 23:00:38 +02:00
David Given	612e38f1c6	Remove the old make-based build system, plus some big chunks of horribly obsolete protomake build system.	2016-09-02 22:17:51 +02:00
David Given	856eb120b3	Add files which got missed in the initial build pass.	2016-08-20 14:04:17 +02:00
David Given	204f932ed2	Raspberry Pi backend now builds.	2016-08-20 12:40:13 +02:00
David Given	4d24666432	Move util/data into modules/src/em_data, for consistency with the other modules.	2016-08-14 14:09:38 +02:00
David Given	38fa6941d5	linux68k builds now.	2016-08-14 11:34:18 +02:00
David Given	f253b6a169	linux386 builds. Also, forgot to turn back on the language runtimes.	2016-08-14 10:37:55 +02:00
David Given	262c5fedcf	Biggish refactor to break cycles; my build rules were full of them. cpm builds, which requires top and topgen.	2016-08-14 01:39:40 +02:00
David Given	0d77cb8279	We can build our first C file.	2016-08-07 21:56:53 +02:00
David Given	b50dc4214a	Add check for undefined variables. Find undefined variables. Fix undefined variables.	2016-08-05 00:01:55 +02:00
David Given	5e84be70fd	Massive ackbuilder refactor --- cleaner and more expressive. Lists are automatically flattened (leading to better build files), and the list and filename functions are vastly more orthogonal.	2016-08-04 23:51:19 +02:00
David Given	b2bb4ce3b2	Builds libend (the simplest library). Becoming obvious I need to rework the way ackbuilder deals with lists.	2016-07-30 00:39:22 +02:00
David Given	a8a9d1bbfa	yacc, ncgg; platform ncg builds now.	2016-07-26 23:35:30 +02:00
David Given	bff5c4019c	Baby steps towards building a platform --- make the assembler work. Add ackbuilder support for C preprocessor files and yacc.	2016-07-24 00:50:02 +02:00
David Given	88bd7ce126	Remove defunct pmfiles. --HG-- branch : default-branch	2016-06-03 13:56:50 +02:00
David Given	ef8e6e25e0	Fix a whole pile of issues related to the failed attempt to increase the number of types of relocation possible in the object file. (Now, hopefully, working.) Also change the object serialiser/deserialiser to never try to read or write raw structures; it's way safer this way and we don't need the performance boost any more. --HG-- branch : default-branch	2016-03-18 21:46:55 +01:00
David Given	88e13ecce3	Don't use the ACK preprocessor on host files --- use the host preprocessor instead. --HG-- branch : default-branch	2016-03-14 20:58:19 +01:00
David Given	e85991ec86	Fix stray 'call file'. --HG-- branch : default-branch	2016-03-13 21:40:05 +01:00
David Given	ff0c78cc78	Merge from default. --HG-- branch : dtrg-videocore-branch-branch	2016-03-13 21:13:09 +01:00
David Given	62cc636f10	Merge. --HG-- branch : dtrg-videocore	2015-03-23 00:15:42 +01:00
David Given	9f23fbbe6a	Allow machines to use cg if they wish. --HG-- rename : mach/proto/ncg/build.mk => mach/proto/cg/build.mk rename : util/ncgg/build.mk => util/cgg/build.mk	2015-03-23 00:08:51 +01:00
David Given	c5018d7088	64-bit-ify (adhoc varargs are evil).	2015-03-23 00:07:59 +01:00
David Given	3d5e72e20b	Newer versions of GNU Make have a new function which collides with a variable we're using; change the name of the variable.	2015-03-22 12:09:46 +01:00
David Given	e36d739fa4	Add (largely untested) float/int conversion. --HG-- branch : dtrg-videocore	2013-07-01 13:05:36 +01:00
David Given	8b6951dac0	Fix incorrect offset encoding in lea (sp) instructions. --HG-- branch : dtrg-videocore	2013-06-29 00:35:07 +01:00
David Given	edb174da8d	Fix incorrect encoding of 'push lr' and 'pop pc'. --HG-- branch : dtrg-videocore	2013-06-29 00:32:39 +01:00
David Given	29af6f1adb	ISA change: clz has been renamed to log2. --HG-- branch : dtrg-videocore	2013-06-27 11:25:50 +01:00
David Given	2b3f95de0b	Fix jump range checking in the addcmpb family of instructions. --HG-- branch : dtrg-videocore	2013-06-26 23:32:54 +01:00
David Given	d94c1c8150	Updated distr files. --HG-- branch : dtrg-videocore rename : mach/i80/.distr => mach/vc4/.distr rename : plat/cpm/.distr => plat/rpi/.distr	2013-06-21 23:38:21 +01:00
David Given	fd2360be0f	Ship assembler man pages. --HG-- branch : dtrg-videocore rename : man/8080_as.6 => man/i80_as.6 rename : man/m68k2_as.6 => man/m68020_as.6	2013-06-21 23:20:50 +01:00
David Given	bbd4b46850	Fix stack corruption when adjusting SP. Be a bit more rigorous about clearing the pseudostack on branch/labels. --HG-- branch : dtrg-videocore	2013-06-07 21:25:38 +01:00
David Given	3e0123ca03	Fix treatment of out-of-range values in switch csa. --HG-- branch : dtrg-videocore	2013-06-05 23:57:23 +01:00
David Given	86c6fa2f1e	Implement NOT... --HG-- branch : dtrg-videocore	2013-05-30 23:50:19 +01:00
David Given	d3e3e72860	Update from trunk. --HG-- branch : dtrg-videocore	2013-05-29 15:03:48 +01:00
David Given	e0c121d6e6	Use relocation enumerations rather than hard-coded values for relocation types (these were causing problems due to the enumeration values having changed).	2013-05-29 14:11:04 +01:00
David Given	1f36370d87	Implement nop (the C compiler sometimes generates this!). --HG-- branch : dtrg-videocore	2013-05-26 22:54:53 +01:00
David Given	ef25c53c9c	Fix bug in ine/dee. --HG-- branch : dtrg-videocore	2013-05-26 18:59:19 +01:00
David Given	366cd10194	Remainders are calculated correctly. printf now works. --HG-- branch : dtrg-videocore	2013-05-26 13:13:58 +01:00
David Given	510888e6d5	.csb now works. --HG-- branch : dtrg-videocore rename : mach/vc4/libem/csa.s => mach/vc4/libem/csb.s	2013-05-26 13:06:25 +01:00
David Given	6284512b37	Fix erroneous section check (symbols may not have a defined section in pass 1). --HG-- branch : dtrg-videocore	2013-05-26 00:35:15 +01:00
David Given	308d41e083	Added triple-quad load and store (used by the signal stuff). --HG-- branch : dtrg-videocore	2013-05-26 00:22:08 +01:00
David Given	8c21a2ef9b	Stop fighting the terrible code and remove the regvar support --- it didn't help much and was a pain. --HG-- branch : dtrg-videocore	2013-05-25 23:58:35 +01:00
David Given	3b07fee160	Major bugfix where instructions weren't being shrunk correctly. (Turns out there's built-in support for doing this, which I hadn't found.) --HG-- branch : dtrg-videocore	2013-05-25 23:26:10 +01:00
David Given	b6680a48cc	Disable register variables. The code is a bit worse, but having two stackable registers makes things much easier to understand. --HG-- branch : dtrg-videocore	2013-05-25 13:31:58 +01:00
David Given	d7efb0a32c	Implement .csa. --HG-- branch : dtrg-videocore rename : mach/vc4/libem/dummy.s => mach/vc4/libem/csa.s	2013-05-25 13:31:27 +01:00
David Given	2ee79ab0b2	Encode comparing branch correctly. --HG-- branch : dtrg-videocore	2013-05-25 13:31:01 +01:00
David Given	472f778342	Don't write out constant data as big-endian! Some other cleanups. --HG-- branch : dtrg-videocore	2013-05-25 00:33:38 +01:00
David Given	98a51732ab	Various codegen tweaks. --HG-- branch : dtrg-videocore	2013-05-24 17:04:29 +01:00
David Given	2c7ee27206	Double-quads can be loaded and stored (more) correctly. --HG-- branch : dtrg-videocore	2013-05-22 23:55:23 +01:00

2693 commits