* more ops: umod and udiv
* large immediates: suboptimal code, e.g. when loading
0xffffffffU (which is what a cast from long to int does).
tests2 work up to 67_macro_concat.
* load/store of FP ops
* load/store from symbols, VT_LLOCAL, some with large addend
* load of VT_JMP result
* calls with many args (stack slots)
* calls with FP args
* more operations: and/or/xor/div
* register indirect loads and stores
* load of local addresses
* indirect calls (uses ra as temporary reg if necessary)
* operations *, -, <<
* gen_cvt_sxtw: is not needed in most cases, let's see
tests2 runs until (incl) 09_do_while.
* implement compares, gtst and gsym/gjmp and add
* implement stores (simple cases)
* fix arg passing with more than one register arg, fix
loads to not always use 8byte loads
* add some predefined macros: __riscv, __riscv_xlen,
__SIZEOF_POINTER__ (needed by glibc header)
The first 5 tests of tests2 run now.