d0p1/ack - Cute Engineering : Cute solutions to hard problems

Author	SHA1	Message	Date
George Koehler	85fcbde22f	Check LOI expressions to prevent a read after free. CS eliminates outer expressions before inner ones, as `x * y * z` before `x * y`. It does this by reversing the order of expressions in the code. This almost always works, but it sometimes doesn't work if a STI changes the value number of a LOI. In code like `expr1 LOI expr2 STI expr2 LOI`, CS might eliminate the inner `expr2` before the outer `expr2 LOI`. This caused a read after free because the occurrence of `expr2 LOI` pointed to the eliminated lines of `expr2`. This bug went unnoticed until my recent changes caused CS to crash with a double free. I did not get the crash in OpenBSD, but I saw the crash in Travis, then David Given reproduced the crash in Linux. See the discussion in https://github.com/davidgiven/ack/pull/73	2018-03-12 20:58:31 -04:00
George Koehler	ebba76e08f	Don't read INSTR(l) after oldline(l) frees it. This bug got in my way while I was looking for another read-after-free bug in the CS phase.	2018-03-11 20:10:13 -04:00
George Koehler	12643f1740	Solve some gcc warnings in ego. Some of these are from gcc -Wimplicit	2018-03-08 18:51:07 -05:00
George Koehler	b1b737ed6c	Optimize procedures that do both a / b and a % b. Enable this in CS for PowerPC; disable it for all other machines. PowerPC has no remainder instruction; the back end uses division to compute remainder. If CS finds both a / b and a % b, then CS now rewrites a % b as a - b * (a / b) and computes a / b only once. This removes an extra division in the PowerPC code, so it saves both time and space. I have not considered whether to enable this optimization for other machines. It might be less useful in machines with a remainder instruction. Also, if a % b occurs before a / b, the EM code gets a DUP. PowerPC ncg handles this DUP well; other back ends might not.	2018-03-05 13:32:06 -05:00
George Koehler	f26259caac	Check AAR earlier to prevent LOI/STI unknown size. In ego, the CS phase may convert a LAR/SAR to AAR LOI/STI so it can optimize multiple occurrences of AAR of the same array element. This conversion should not happen if it would LOI/STI a large or unknown size. cs_profit.c okay_lines() checked the size of each occurrence of AAR except the first. If the first AAR was the implicit AAR in a LAR/SAR, then the conversion happened without checking the size. For unknown size, this made a bad LOI -1 or STI -1. Fix by checking the size earlier: if a LAR/SAR has a bad size, then don't enter it as an AAR. This Modula-2 code showed the bug. Given M.def: DEFINITION MODULE M; TYPE S = SET OF [0..95]; PROCEDURE F(a: ARRAY OF S; i, j: INTEGER); END M. and M.mod: ($R-) IMPLEMENTATION MODULE M; FROM SYSTEM IMPORT ADDRESS, ADR; PROCEDURE G(s: S; p, q: ADDRESS; t: S); BEGIN s := s; p := p; q := q; t := t; END G; PROCEDURE F(a: ARRAY OF S; i, j: INTEGER); BEGIN G(a[i + j], ADR(a[i + j]), ADR(a[i + j]), a[i + j]) END F; END M. then the bug caused an error: $ ack -mlinuxppc -O3 -c.e M.mod /tmp/Ack_b357d.g, line 57: Argument range error The bug had put LOI -1 in the code, then em_decode got an error because -1 is out of range for LOI. Procedure F has 4 occurrences of `a[i + j]`. The size of `a[i + j]` is 96 bits, or 12 bytes, but the EM code hides the size in an array descriptor, so the size is unknown to CS. The pragma `($R-)` disables a range check on `i + j` so CS can work. EM uses AAR for the 2 `ADR(a[i + j])` and LAR for the other 2 `a[i + j]`. EM pushes the arguments to G in reverse order, so the last `a[i + j]` in Modula-2 is the first LAR in EM. CS found 4 occurrences of AAR. The first AAR was an implicit AAR in LAR. Because of the bug, CS converted this LAR 4 to AAR 4 LOI -1.	2018-03-02 16:06:21 -05:00
George Koehler	a7bb4ec4b1	Fixes for compiling ego with -DTRACE - In share/debug.c, undo my mistake in commit `9037d13` by changing vfprintf back to fprintf in OUTTRACE. - In ud/ud.c, move the trace output from stdout to stderr, because stdout has ego's output file, which becomes opt2's input file. If trace output goes to stdout, it gets prepended to the output file, and opt2 errors with "wrong input file". I also edit both build.lua files so ego depends on its header files; this part isn't needed for -DTRACE. One can now use -DTRACE by adding it to the cflags in both build.lua files.	2018-03-01 13:19:38 -05:00
George Koehler	0a6d3de7fe	Use prototypes in ego/cs, ego/sp.	2018-02-05 16:09:30 -05:00
George Koehler	11d48be49e	Fix my typo from commit `5bbbaf4`.	2017-11-17 15:46:24 -05:00
George Koehler	d99a0682fc	Switch ego to libc <assert.h> I also tried, in types.h, to switch ego to libc <stdbool.h>, but that causes an infinite loop in the IL phase.	2017-11-15 19:48:53 -05:00
George Koehler	9037d137f5	Add prototypes, void in util/ego/share This uncovers a problem in il/il_aux.c: it passes 3 arguments to getlines(), but the function expects 4 arguments. I add FALSE as the 4th argument. TRUE would fill in the list of mesregs. IL uses mesregs during phase 1, but this call to getlines() is in phase 2. TRUE would leak memory unless I added a call to Ldeleteset(mesregs). So I pass FALSE. Functions passed to go() now have a `void ` parameter because no_action() now takes a `void `.	2017-11-15 17:19:56 -05:00
George Koehler	5bbbaf4919	Use size_t and void with memory allocation in ego. alloc.h now needs to #include <stdlib.h> to find type size_t and function free().	2017-11-14 20:35:18 -05:00
George Koehler	87a2315037	strcmp, strncmp are in <string.h> Important: Do `make clean` to work around a problem and prevent infinite rebuilds, https://github.com/davidgiven/ack/issues/68 I edit tokens.g in util/LLgen/src, so I regenerate tokens.c. The regeneration script bootstrap.sh can't find LLgen, but I can run the same command by typing the path to llgen.	2017-11-14 17:35:35 -05:00
George Koehler	50a7031007	Don't use '-' in option string to getopt(). @dram reported a build failure in FreeBSD at https://github.com/davidgiven/ack/issues/1#issuecomment-273668299 Linux manual for getopt(3) says: > If the first character of optstring is '-', then each nonoption > argv-element is handled as if it were the argument of an option with > character code 1.... > > The use of '+' and '-' in optstring is a GNU extension. GNU/Linux and OpenBSD handle '-' in this special way, but FreeBSD seems not to. If '-' is not special, then em_ego can't find its input file, so the build must fail. This commit stops using '-' in both em_b and em_ego, but doesn't change mcg. Also fix em_ego -O3 to not act like -O4.	2017-10-29 23:25:07 -04:00
George Koehler	d6e9eac785	Merge branch 'default' into kernigh-linuxppc This merges several fixes and improvements from upstream. This includes commit `5f6a773` to turn off qemuppc. I see several failing tests from qemuppc; this merge will hide the test failures.	2017-10-14 13:50:49 -04:00
David Given	1203e8afd2	mkstemp() is a bit more complex than it looks; because ego wants to use the same base name and generate multiple files based on it, we can't really use mkstemp() for every temporary file. Instead, use mkstemp() once on a placeholder, then generate temporary names based on this. (And delete the placeholder once we've finished.)	2017-08-06 14:25:12 +02:00
David Given	64f2fa9d46	Stop using mktemp() --- on Haiku, it always generates the same filenames, pretty much guaranteeing temporary file overwrites on parallel builds. Use mkstemp() instead which creates the files atomically.	2017-08-06 13:22:05 +02:00
David Given	fd10cf7ac2	Merge from trunk.	2017-08-06 10:42:16 +02:00
George Koehler	a20b87ca01	In ego, put both words and double-words in reg_float. The size of a reg_float isn't in the descr file, so ego doesn't know. PowerPC and SPARC are the only arches with floating-point registers in their descr files. PowerPC and SPARC registers can hold both 4-byte and 8-byte floats, so I want ego to do both sizes. This might break our SPARC code expander because ego doesn't know that 8-byte values take 2 registers in SPARC. (So ego might allocate too many registers and deallocate too much stack space.) We don't build the SPARC code expander, and its descr file is already wrong: its list of register save costs is too short, so ego will read past the end of the array. This commit doesn't fix the problem with ego and PowerPC ncg. Right now, ncg refuses to put 4-byte floats in registers, but ego expects them to get registers and deallocates their stack space. So ncg emits programs that use the deallocated space, and the values of 4-byte floats become corrupt.	2017-02-16 19:55:52 -05:00
George Koehler	cbe5d8640b	Add floating-point register variables to PowerPC ncg. Use f14 to f31 as register variables for 8-byte double precision. There are no regvars for 4-byte single precision, because all regvar(reg_float) must have the same size. I expect more programs to prefer 8-byte double precision. Teach mach/powerpc/ncg/mach.c to emit stfd and lfd instructions to save and restore 8-byte regvars. Delay emitting the function prolog until f_regsave(), so we can use one addi to make stack space for both local vars and saved registers. Be more careful with types in mach.c; don't assume that int and long and full are the same. In ncg table, add f14 to f31 as register variables, and some rules to use them. Add rules to put the result of fadd, fsub, fmul, fdiv, fneg in a regvar. Without such rules, the result would go in a scratch FREG, and we would need fmr to move it to the regvar. Also add a rule for pat sdl inreg($1)==reg_float with STACK, so we can unstack the value directly into the regvar, again without a scratch FREG and fmr. Edit util/ego/descr/powerpc.descr to tell ego about the new float regvars. This might not be working right; ego usually decides against using any float regvars, so ack -O1 (not running ego) uses the regvars, but ack -O4 (running ego) doesn't use the regvars. Beware that ack -mosxppc runs ego using powerpc.descr but -mlinuxppc and -mqemuppc run ego without a config file (since `8ef7c31`). I am testing powerpc.descr with a local edit to plat/linuxppc/descr to run ego with powerpc.descr there, but I did not commit my local edit.	2017-02-15 19:34:07 -05:00
George Koehler	8ef7c31089	Write a powerpc.descr for ego and use it with osxppc. No change to linuxppc and qemuppc. They continue to run ego without any descr file. I copied m68020.descr to powerpc.descr and changed some numbers. My numbers are guesses; I know little about PowerPC cycle counts, and almost nothing about ego. This powerpc.descr causes most of the example programs to shrink in size (without descr -> with descr): 65429 -> 57237 hilo_b.osxppc -8192 36516 -> 32420 hilo_c.osxppc -4096 55782 -> 51686 hilo_mod.osxppc -4096 20096 -> 20096 hilo_p.osxppc 0 8813 -> 8813 mandelbrot_c.osxppc 0 93355 -> 89259 paranoia_c.osxppc -4096 92751 -> 84559 startrek_c.osxppc -8192 (Each file has 2 Mach segments, then a symbol table. Each segment takes a multiple of 4096 bytes. When the code shrinks, we lose a multiple of 4096 bytes.) I used "ack -mosxppc -O6 -c.so" to examine the assembly code for hilo.mod and mandelbrot.c, both without and with descr. This reveals optimizations made only with descr, from 2 ego phases: SP (stack pollution) and RA (register allocation). In hilo.mod, SP deletes some instructions that remove items from the stack. These items get removed when the function returns. In both hilo.mod and mandelbrot.c, RA moves some values into local variables, so ncg can make them into register variables. This shrinks code size, probably because register variables get preserved across function calls. More values stay in registers, and ncg emits shorter code. I believe that the ego descr file uses (time,space) tuples but the ncg table uses (space,time) tuples. This is confusing. Perhaps I am wrong, and some or all tuples are backwards. My time values are the cycle counts in latency from the MPC7450 Reference Manual (but not including complications like "store serialization"). In powerpc.descr, I give the cost for saving and restoring registers as if I was using chains of stw and lwz instructions. 
Actually ncg uses single stmw and lmw instructions with at least 2 instructions. The (time,space) for stmw and lmw would be much less than the (time,space) for chains of stw and lwz. But this ignores the pipeline of the MPC7450. The chains of stw and lwz may run faster than stmw and lmw in the pipeline, because the throughput may be better than the latency. By using the wrong values for (time,space), I'm trying to tell ego that stmw and lmw are not better than chains of stw and lwz.	2016-11-30 15:29:19 -05:00
David Given	3e69d1185a	Fix a whole lot more stray prototypes.	2016-11-24 21:47:40 +01:00
David Given	fd91851005	Add enough return types to the K&R C that the ACK builds (on Linux) using clang now.	2016-11-10 22:04:18 +01:00
George Koehler	b1d1b5e1f8	Fix bugs with memory allocation in ego. cf/cf_loop.c and share/put.c tried to read the next pointer in an element of a linked list after freeing the element. ud/ud_copy.c tried to read beyond the end of the _defs_ array: it only has _nrexpldefs_ elements, not _nrdefs_ elements. These bugs caused core dumps on OpenBSD. Its malloc() put _defs_ near the end of a page, so reading beyond the end crossed into an unmapped page. Its free() wrote junk bytes and changed the next pointer to 0xdfdfdfdfdfdfdfdf.	2016-09-09 23:37:43 -04:00
David Given	f67c98e239	Distributions are a pain --- let's not bother any more. Instead, we just tag the repository and download a complete snapshot, old and ancient stuff and all.	2016-09-02 23:00:38 +02:00
David Given	612e38f1c6	Remove the old make-based build system, plus some big chunks of horribly obsolete protomake build system.	2016-09-02 22:17:51 +02:00
David Given	2b6d251dec	Fix a fun bug where, every now and again, ego would get its temporary files mangled and generate invalid calls to the optimisers. Previously ego would generate a temporary file template that looked like /tmp/ego.A.BB.XXXXXX, call mktemp() on it to randomise the XXXXXX, and then replace A and BB with data. However, it used strrchr to find the A and B. Which would be fine, except when mktemp produced an A or a B in the randomised part... This code was written on 4 March 1991. I was 16.	2016-08-22 23:53:01 +02:00
David Given	2a95b1c5e3	Forgot to check a file in.	2016-08-22 22:45:32 +02:00
David Given	5bae29a00c	ego now builds and is used. This needed lots of refactoring to ego --- not all platforms have ego descr files, and ego will just crash if you invoke it without one. I think originally it was never intended that these platforms would be used at -O2 or above. Plats now only specify the ego descr file if they have one.	2016-08-21 22:01:19 +02:00
David Given	2b2bd93e44	Run through clang-format.	2016-08-21 20:08:05 +02:00
David Given	44b6421519	Run through clang-format.	2016-08-21 19:53:14 +02:00
David Given	671bf250f5	Run through clang-format.	2016-08-21 19:46:19 +02:00
David Given	918f300513	Run through clang-format.	2016-08-21 19:38:54 +02:00
David Given	1b66b63eae	Run through clang-format.	2016-08-21 19:38:02 +02:00
David Given	3584ddb6e9	Push through clang-format.	2016-08-21 19:34:54 +02:00
David Given	a4f136f999	Run through clang-format.	2016-08-21 18:51:36 +02:00
David Given	03a0b182c4	Push em_ego.c through clang-format before working on it.	2016-08-21 18:45:25 +02:00
David Given	88bd7ce126	Remove defunct pmfiles. --HG-- branch : default-branch	2016-06-03 13:56:50 +02:00
David Given	3d5e72e20b	Newer versions of GNU Make have a new function which collides with a variable we're using; change the name of the variable.	2015-03-22 12:09:46 +01:00
David Given	11377070fd	Update distribution files. --HG-- branch : dtrg-buildsystem	2013-05-15 23:46:15 +01:00
David Given	e9233b4712	Build ego. --HG-- branch : dtrg-buildsystem rename : util/arch/build.mk => util/ego/build.mk	2013-05-15 21:14:06 +01:00
David Given	c1aca7dae5	First milestone of replacing the build system. --HG-- branch : dtrg-buildsystem rename : lang/cem/cpp.ansi/Parameters => lang/cem/cpp.ansi/parameters.h	2013-05-12 20:45:55 +01:00
George Koehler	0131ca4d46	Delete 689 undead files. These files "magically reappeared" after the conversion from CVS to Mercurial. The old CVS repository deleted these files but did not record when it deleted these files. The conversion resurrected these files because they have no history of deletion. These files were probably deleted before year 1995. The CVS repository begins to record deletions around 1995. These files may still appear in older revisions of this Mercurial repository, when they should already be deleted. There is no way to fix this, because the CVS repository provides no dates of deletion. See http://sourceforge.net/mailarchive/message.php?msg_id=29823032	2012-09-20 22:26:32 -04:00
dtrg	ee72886e54	Renamed 'switch' variable to avoid conflict with a keyword in modern awks.	2010-08-01 10:35:04 +00:00
dtrg	494d9a3e4a	Now runs descr files through the ANSI C preprocessor, rather than the K&R one (which no longer exists).	2007-04-29 21:23:55 +00:00
dtrg	6a0dd9377d	Removed a dynamically generated file from the distribution.	2007-02-25 22:49:22 +00:00
dtrg	b611731ec3	Updated .distr files for the new release.	2007-02-25 12:51:55 +00:00
dtrg	6d58210806	em_table is now in /h, not /etc.	2007-02-25 12:51:21 +00:00
dtrg	dbe10d2c19	Updated to the version 0.1 of Prime Mover (which involves some syntax changes).	2006-10-15 00:28:12 +00:00
dtrg	014be56fb0	Replaced calls to the custom strindex() and strrindex() functions with the exactly equivalent and standard strchr() and strrchr() functions instead.	2006-07-23 20:01:02 +00:00
dtrg	eed5d461e4	cpp now gets installed in the right place.	2006-07-23 17:52:23 +00:00
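
Several commits in this log (`b1d1b5e1f8`, `ebba76e08f`, `85fcbde22f`) fix reads of memory that had already been freed. The linked-list case from `b1d1b5e1f8` can be illustrated with a minimal sketch in C. The `struct node` type and the `push()` and `free_list()` helpers are hypothetical names made up for this illustration; they are not ego's actual data structures or functions.

```c
#include <stdlib.h>

/* Hypothetical list node for illustration; not ego's real type. */
struct node {
    int value;
    struct node *next;
};

/* Prepend a value and return the new head. Aborts if malloc() fails,
 * which keeps the sketch short. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->value = value;
    n->next = head;
    return n;
}

/* Free every node and return how many were freed.
 *
 * The buggy pattern is:
 *     for (p = head; p != NULL; p = p->next) free(p);
 * which reads p->next after free(p). The fix is to copy the
 * next pointer into a local variable before calling free(). */
size_t free_list(struct node *head)
{
    size_t count = 0;
    while (head != NULL) {
        struct node *next = head->next; /* read before freeing */
        free(head);
        head = next;
        count++;
    }
    return count;
}
```

With the buggy loop, the program may appear to work on allocators that leave freed memory untouched. As the commit message notes, OpenBSD's free() overwrites the freed element with junk bytes, so the stale next pointer becomes invalid and the read crashes.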

307 commits
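
Commit `b1b737ed6c` rewrites `a % b` as `a - b * (a / b)` so that only one division is needed when a procedure computes both the quotient and the remainder. The identity can be checked with a small C sketch. The `struct divmod` type and the `divmod_once()` function are hypothetical names for this illustration only; the actual optimization works on EM code inside the CS phase, not on C source.

```c
#include <assert.h>

/* Hypothetical result pair for illustration. */
struct divmod {
    int quot;
    int rem;
};

/* Compute quotient and remainder with a single division.
 * Assumes b != 0 and that a / b does not overflow
 * (so not INT_MIN / -1). */
struct divmod divmod_once(int a, int b)
{
    struct divmod r;
    assert(b != 0);
    r.quot = a / b;         /* the only division */
    r.rem = a - b * r.quot; /* same value as a % b */
    return r;
}
```

Since C99, integer division truncates toward zero, so `a - b * (a / b)` equals `a % b` for negative operands as well. For example, `divmod_once(-7, 3)` gives a quotient of -2 and a remainder of -1.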