At startup __bound_init() wants to mark the malloc zone as invalid memory,
so that any access to heap memory not allocated through malloc is treated
as invalid. Other pages are initialized as empty regions, and accesses to
them are not treated as invalid by bounds checking.
The problem is that the code incorrectly assumed that the heap starts right
after bss, which is not true in two cases:
1) if we are running from `tcc -b -run`, the program's text, data and bss
are already in malloced memory, possibly in an mmaped region
instead of the heap, so marking memory as invalid starting from _end
will not cover the heap and will probably wrongly mark valid regions.
2) if address space randomization is turned on, the heap again does not
start at _end, and we'll mark something else as invalid instead
of the malloc area.
For example with the following diagnostic patch ...
diff --git a/tcc.c b/tcc.c
index 5dd5725..31c46e8 100644
--- a/tcc.c
+++ b/tcc.c
@@ -479,6 +479,8 @@ static int parse_args(TCCState *s, int argc, char **argv)
return optind;
}
+extern int _etext, _edata, _end;
+
int main(int argc, char **argv)
{
int i;
@@ -487,6 +489,18 @@ int main(int argc, char **argv)
int64_t start_time = 0;
const char *default_file = NULL;
+ void *brk;
+
+ brk = sbrk(0);
+
+ fprintf(stderr, "\n>>> TCC\n\n");
+ fprintf(stderr, "etext:\t%10p\n", &_etext);
+ fprintf(stderr, "edata:\t%10p\n", &_edata);
+ fprintf(stderr, "end:\t%10p\n", &_end);
+ fprintf(stderr, "brk:\t%10p\n", brk);
+ fprintf(stderr, "stack:\t%10p\n", &brk);
+
+ fprintf(stderr, "&errno: %p\n", &errno);
s = tcc_new();
output_type = TCC_OUTPUT_EXE;
diff --git a/tccrun.c b/tccrun.c
index 531f46a..25ed30a 100644
--- a/tccrun.c
+++ b/tccrun.c
@@ -91,6 +91,8 @@ LIBTCCAPI int tcc_run(TCCState *s1, int argc, char **argv)
int (*prog_main)(int, char **);
int ret;
+ fprintf(stderr, "\n\ntcc_run() ...\n\n");
+
if (tcc_relocate(s1, TCC_RELOCATE_AUTO) < 0)
return -1;
diff --git a/lib/bcheck.c b/lib/bcheck.c
index ea5b233..8b26a5f 100644
--- a/lib/bcheck.c
+++ b/lib/bcheck.c
@@ -296,6 +326,8 @@ static void mark_invalid(unsigned long addr, unsigned long size)
start = addr;
end = addr + size;
+ fprintf(stderr, "mark_invalid %10p - %10p\n", (void *)addr, (void *)end);
+
t2_start = (start + BOUND_T3_SIZE - 1) >> BOUND_T3_BITS;
if (end != 0)
t2_end = end >> BOUND_T3_BITS;
... Look how memory is laid out for `tcc -b -run ...`:
$ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -run \
-DONE_SOURCE ./tcc.c -B. -c x.c
>>> TCC
etext: 0x8065477
edata: 0x8070220
end: 0x807a95c
brk: 0x807b000
stack: 0xaffff0f0
&errno: 0xa7e25688
tcc_run() ...
mark_invalid 0xfff80000 - (nil)
mark_invalid 0xa7c31d98 - 0xafc31d98
>>> TCC
etext: 0xa7c22767
edata: 0xa7c2759c
end: 0xa7c31d98
brk: 0x8211000
stack: 0xafffeff0
&errno: 0xa7e25688
Runtime error: dereferencing invalid pointer
./tccpp.c:1953: at 0xa7beebdf parse_number() (included from ./libtcc.c, ./tcc.c)
./tccpp.c:3003: by 0xa7bf0708 next() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4465: by 0xa7bfe348 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4440: by 0xa7bfe212 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5529: by 0xa7c01929 gen_function() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5767: by 0xa7c02602 decl0() (included from ./libtcc.c, ./tcc.c)
The second mark_invalid starts right after the in-memory-compiled program's
_end, and oops, that's not where the malloc zone is (it starts at brk), and
oops again, mark_invalid covers e.g. errno. The compiled tcc then crashes in
bcheck on the errno access:
1776 static void parse_number(const char *p)
1777 {
1778 int b, t, shift, frac_bits, s, exp_val, ch;
...
1951 *q = '\0';
1952 t = toup(ch);
1953 errno = 0;
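The overlap can be checked with plain arithmetic on the addresses from the
failing log above. This is only a sketch: the helper in_range() and the
constant names are hypothetical and not part of bcheck.c; the values are
copied from the log.

```c
#include <stdint.h>

/* Hypothetical helper: is addr inside the half-open range [start, end)? */
static int in_range(uint64_t addr, uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}

/* Addresses copied from the failing run in the log above. */
static const uint64_t invalid_start = 0xa7c31d98ULL; /* in-memory program's _end */
static const uint64_t invalid_end   = 0xafc31d98ULL; /* end of 2nd mark_invalid  */
static const uint64_t errno_addr    = 0xa7e25688ULL; /* &errno                   */
static const uint64_t real_brk      = 0x08211000ULL; /* brk seen by compiled tcc */
```

in_range(errno_addr, invalid_start, invalid_end) is true, while
in_range(real_brk, invalid_start, invalid_end) is false: the marked region
contains errno but not the real heap start.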
The solution here is to use sbrk(0) as an approximation of the program
break start instead of &_end:
- if we are a separately compiled program, __bound_init() runs early,
and sbrk(0) should be equal to or very near start_brk (in case other
constructors malloc something), or
- if we are running under `tcc -b -run`, sbrk(0) will return the
start of the heap portion that is under this program's control, and we
will not mark earlier-allocated memory as invalid.
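A minimal sketch of the idea, not the actual bcheck.c change: the names
HEAP_REGION_SIZE, struct heap_range and heap_invalid_range() are
hypothetical, and the region size is an assumption taken from the logs in
this message, where each heap mark_invalid call spans 0x8000000 bytes.

```c
#define _DEFAULT_SOURCE /* for sbrk() on glibc */
#include <stdint.h>
#include <unistd.h>

/* Assumption: size inferred from the mark_invalid lines in the logs. */
#define HEAP_REGION_SIZE 0x8000000UL

/* Hypothetical type: the range that would be passed to mark_invalid(). */
struct heap_range {
    uintptr_t start;
    uintptr_t end;
};

/* Compute the range to mark invalid, starting at the current program
 * break instead of &_end. Returns 0 on success, -1 if sbrk() fails. */
static int heap_invalid_range(struct heap_range *r)
{
    void *brk = sbrk(0); /* current program break; allocates nothing */

    if (brk == (void *)-1)
        return -1;
    r->start = (uintptr_t)brk;
    r->end = r->start + HEAP_REGION_SIZE;
    return 0;
}
```

Because sbrk(0) is queried at run time, the result follows the real break
both under address space randomization and under `tcc -b -run`.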
With this patch `tcc -b -run tcc.c ...` succeeds in compiling the small
test program above (the diagnostic patch is still applied too):
$ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -run \
-DONE_SOURCE ./tcc.c -B. -c x.c
>>> TCC
etext: 0x8065477
edata: 0x8070220
end: 0x807a95c
brk: 0x807b000
stack: 0xaffff0f0
&errno: 0xa7e25688
tcc_run() ...
mark_invalid 0xfff80000 - (nil)
mark_invalid 0x8211000 - 0x10211000
>>> TCC
etext: 0xa7c22777
edata: 0xa7c275ac
end: 0xa7c31da8
brk: 0x8211000
stack: 0xafffeff0
&errno: 0xa7e25688
(completes ok)
but running `tcc -b -run tcc.c -run tests/tcctest.c` still segfaults; that
is the subject of the next patch.