"make test" crashes without that "save_regs()".
This partially reverts
commit 49d3118621.
Found another solution: In a 2nd pass Just look if
any of the argument registers has been saved again,
and restore if so.
This was causing assembler bugs in a tcc compiled by itself
at i386-asm.c:352 when ExprValue.v was changed to uint64_t:
if (op->e.v == (int8_t)op->e.v)
op->type |= OP_IM8S;
A general test case:
#include <stdio.h>
int main(int argc, char **argv)
{
long long ll = 4000;
int i = (char)ll;
printf("%d\n", i);
return 0;
}
Output was "4000", now "-96".
Also: add "asmtest2" as asmtest with tcc compiled by itself
On 2016-08-11 09:24 +0100, Balazs Kezes wrote:
> I think it's just that that copy_params() never restores the spilled
> registers. Maybe it needs some extra code at the end to see if any
> parameters have been spilled to stack and then restore them?
I've spent some time on this and I've found an alternative solution.
Although I'm not entirely sure about it but I've attached a patch
nevertheless.
And while poking at that I've found another problem affecting the
unsigned long long division on arm and I've attached a patch for that
too.
More details in the patches themselves. Please review and consider them
for merging! Thank you!
--
Balazs
[PATCH 1/2] Fix slow unsigned long long division on ARM
The macro AEABI_UXDIVMOD expands to this bit:
#define AEABI_UXDIVMOD(name,type, rettype, typemacro) \
...
while (num >= den) { \
...
while ((q << 1) * den <= num && q * den <= typemacro ## _MAX / 2) \
q <<= 1; \
...
With the current ULONG_MAX version the inner loop goes only until 4
billion so the outer loop will progress very slowly if num is large.
With ULLONG_MAX the inner loop works as expected. The current version is
probably a result of a typo.
The following bash snippet demonstrates the bug:
$ uname -a
Linux eper 4.4.16-2-ARCH #1 Wed Aug 10 20:03:13 MDT 2016 armv6l GNU/Linux
$ cat div.c
int printf(const char *, ...);
int main(void) {
unsigned long long num, denom;
num = 12345678901234567ULL;
denom = 7;
printf("%lld\n", num / denom);
return 0;
}
$ time tcc -run div.c
1763668414462081
real 0m16.291s
user 0m15.860s
sys 0m0.020s
[PATCH 2/2] Fix long long dereference during argument passing on ARMv6
For some reason the code spills the register to the stack. copy_params
in arm-gen.c doesn't expect this so bad code is generated. It's not
entirely clear why the saving part is necessary. It was added in commit
59c35638 with the comment "fixed long long code gen bug" with no further
clarification. Given that tcctest.c passes without this, maybe it's no
longer needed? Let's remove it.
Also add a new testcase just for this. After I've managed to make the
tests compile on a raspberry pi, I get the following diff without this
patch:
--- test.ref 2016-08-22 22:12:43.380000000 +0100
+++ test.out3 2016-08-22 22:12:49.990000000 +0100
@@ -499,7 +499,7 @@
2
1 0 1 0
4886718345
-shift: 9 9 9312
+shift: 291 291 291
shiftc: 36 36 2328
shiftc: 0 0 9998683865088
manyarg_test:
More discussion on this thread:
https://lists.nongnu.org/archive/html/tinycc-devel/2016-08/msg00004.html
- "utf8 in identifiers"
from 936819a1b9
- CValue: remove member str.data_allocated
- make tiny allocator private to tccpp
- allocate macro_stack objects on heap
because otherwise it could crash after error/setjmp
in preprocess_delete():end_macro()
- mov "TinyAlloc" defs to tccpp.c
- define_push: take int* str again
fixes 5c35ba66c5
Implementation was consistent within tcc but incompatible
with the ABI (for example library functions vprintf etc)
Also:
- tccpp.c/get_tok_str() : avoid "unknown format "%llu" warning
- x86_64_gen.c/gen_vla_alloc() : fix vstack leak
The case below previously was causing an assertion failure
in the target specific generator.
It probably is not incorrect not to allow this even if
gcc does.
struct S { long b; };
void f(struct S *x)
{
struct S y[1] = { *x };
}
The check for structs was too late and on amd64 and aarch64 could
lead to accepting and then asserting with code like:
struct S {...} s;
char *c = (char*)0x10 - s;
... for fast redeclaration checks
Also, check function parameters too:
void foo(int a) { int a; ... }
Also, try to fix struct/union/enum's on different scopes:
{ struct xxx { int x; };
{ struct xxx { int y; }; ... }}
and some (probably not all) combination with incomplete
declarations "struct xxx;"
Replaces 2bfedb1867
and 07d896c8e5
Fixes cf95ac399c
A constant expression removed from the loop.
If subroutine have 50000+ local variables, then currently
compilation of such code takes obly 15 sec. Was 2 min.
gcc-4.1.2 compiles such code in 7 sec. pcc -- 3.44 min.
A test generator:
#include <stdio.h>
int main() {
puts("#include <stdio.h>"); puts("int main()"); puts("{");
for (int i = 0; i < 50000; ++i) printf("int X%d = 1;\n", i);
for (int i = 0; i < 50000; ++i) puts("scanf(\"%d\", &X0);");
puts("}");
return 0;
}
don't catch redefinition for local vars. With this option on
tcc accepts the following code:
int main()
{
int a = 0;
long a = 0;
}
But if you shure there is no problem with your local variables,
then a compilation speed can be improved if you have a lots of
the local variables (50000+)
From gcc docs: "You may also specify attributes between the enum, struct or union tag and the name of the type rather than after the closing brace."
Adds `82_attribs_position.c` in `tests/tests2`
the check on incomplete struct/union/enum types was too early,
disallowing mixed specifiers and qualifiers. Simply rely on
the size (->c) field for that. See testcases.
A CString used to be copied into a token string, which is an int array.
On a 64-bit architecture the pointers were misaligned, so ASan gave
lots of warnings. On a 64-bit architecture that required memory
accesses to be correctly aligned it would not work at all.
The CString is now included in CValue instead.
Some test cases:
#define SE ({ switch (0) { } 0; })
// Should give error:
int x = SE;
void f(void) { static int x = SE; }
void f(void) { enum e { a = SE }; }
void f(void) { switch (0) { case SE: break; } }
// Correct:
int f(void) { return SE; }
int f(void) { return sizeof(SE); }
- avoid memory allocation by using its (int) token number
- avoid additional function parameter by using Attribute
Also: fix some strange looking error messages
In a case like
typedef int T[1];
const T x;
we must make a copy of the typedef type so that we can add the type
qualifiers to it.
The following code used to give
error: incompatible types for redefinition of 'f'
typedef int T[1];
void f(const int [1]);
void f(const T);
don't crash
a test program:
================
typedef struct X { int len; } X;
#define init(s,len) s.len = len;
int main(void) {
X myX;
init(myX,10);
return 0;
}
================
After a patch:
error: field name expected
a test program:
========
typedef struct X { int len; } X;
int main(void) {
X myX;
myX.10 = 10;
return 0;
}
========
Error message before a patch:
error: ';' expected (got "(null)")
After a patch:
error: field name expected
* Documentation is now in "docs".
* Source code is now in "src".
* Misc. fixes here and there so that everything still works.
I think I got everything in this commit, but I only tested this
on Linux (Make) and Windows (CMake), so I might've messed
something up on other platforms...
Jsut for testing. It works for me (don't break anything)
Small fixes for x86_64-gen.c in "tccpp: fix issues, add tests"
are dropped in flavor of this patch.
Pip Cet:
Okay, here's a first patch that fixes the problem (but I've found
another bug, yet unfixed, in the process), though it's not
particularly pretty code (I tried hard to keep the changes to the
minimum necessary). If we decide to actually get rid of VT_QLONG and
VT_QFLOAT (please, can we?), there are some further simplifications in
tccgen.c that might offset some of the cost of this patch.
The idea is that an integer is no longer enough to describe how an
argument is stored in registers. There are a number of possibilities
(none, integer register, two integer registers, float register, two
float registers, integer register plus float register, float register
plus integer register), and instead of enumerating them I've
introduced a RegArgs type that stores the offsets for each of our
registers (for the other architectures, it's simply an int specifying
the number of registers). If someone strongly prefers an enum, we
could do that instead, but I believe this is a place where keeping
things general is worth it, because this way it should be doable to
add SSE or AVX support.
There is one line in the patch that looks suspicious:
} else {
addr = (addr + align - 1) & -align;
param_addr = addr;
addr += size;
- sse_param_index += reg_count;
}
break;
However, this actually fixes one half of a bug we have when calling a
function with eight double arguments "interrupted" by a two-double
structure after the seventh double argument:
f(double,double,double,double,double,double,double,struct { double
x,y; },double);
In this case, the last argument should be passed in %xmm7. This patch
fixes the problem in gfunc_prolog, but not the corresponding problem
in gfunc_call, which I'll try tackling next.
* fix some macro expansion issues
* add some pp tests in tests/pp
* improved tcc -E output for better diff'ability
* remove -dD feature (quirky code, exotic feature,
didn't work well)
Based partially on ideas / researches from PipCet
Some issues remain with VA_ARGS macros (if used in a
rather tricky way).
Also, to keep it simple, the pp doesn't automtically
add any extra spaces to separate tokens which otherwise
would form wrong tokens if re-read from tcc -E output
(such as '+' '=') GCC does that, other compilers don't.
* cleanups
- #line 01 "file" / # 01 "file" processing
- #pragma comment(lib,"foo")
- tcc -E: forward some pragmas to output (pack, comment(lib))
- fix macro parameter list parsing mess from
a3fc543459a715d7143d
(some coffee might help, next time ;)
- introduce TOK_PPSTR - to have character constants as
written in the file (similar to TOK_PPNUM)
- allow '\' appear in macros
- new functions begin/end_macro to:
- fix switching macro levels during expansion
- allow unget_tok to unget more than one tok
- slight speedup by using bitflags in isidnum_table
Also:
- x86_64.c : fix decl after statements
- i386-gen,c : fix a vstack leak with VLA on windows
- configure/Makefile : build on windows (MSYS) was broken
- tcc_warning: fflush stderr to keep output order (win32)