This removes a wrong-way dependency of libsys on libem. The C
functions in libsys called .ret, but libsys is after libem in the
linker arguments, so the linker didn't find .ret unless something else
had called .ret. Almost everything called .ret, but I got a linker
error when I wrote an assembly program using the EM runtime, because
my assembly program didn't call .ret.
Add a dummy comment to build.lua, so git checkout touches that file,
the build system reconfigures itself, and the *.s glob sees that ret.s
has gone.
Our libem had two implementations of loading a block from a stack, one
for lar 4 and one for los 4. Now lar 4 and los 4 share the code in
.los4. Likewise, sar 4 and sts 4 share the code in .sts4.
Rename .los to .los4 and .sts to .sts4, because they implement los 4
and sts 4. Remove the special case for loading or storing 4 bytes,
because we can do it with 1 iteration of the loop. Remove the lines
to "align size" where the size must already be a multiple of 4.
Fix the upper bound check in .aar4.
Change .aar4, .lar4, .los4, .sar4, .sts4 to pass all operands on the
real stack, except that .los4 and .sts4 take the size in register r3.
Have .aar4 set r3 to the size of the array element. So lar 4 is just
.aar4 then .los4, and sar 4 is just .aar4 then .sts4.
ncg no longer calls .lar4 and .sar4 in libem, because it inlines the
code; but I keep .lar4 and .sar4 in libem, because mcg references
them. They might or might not work in mcg.
This provides and, ior, xor, com, zer, set, cms when defined($1) and
ior, set when !defined($1). I don't provide the other operations
!defined($1) because our Modula-2 compiler hasn't used them.
I wrote a Modula-2 example in
https://gist.github.com/kernigh/add79662bb3c63ffb7c46d01dc8ae788
Put a dummy comment in mach/powerpc/libem/build.lua so git checkout
will touch that file. Without the touch, the build system doesn't see
the new *.s files.
In EM, fef splits a float into exponent and fraction. The old C code,
given an infinite float, got stuck in an infinite loop. The new
assembly code doesn't loop; it extracts the IEEE exponent.