In the instruction list, mark sraw, srawi, and subfic with /* kills xer */,
and correct the (now unused) "addi." and "lfdu" entries.
Change MACHOPT_F from -m3 to -m2. This changes the code for 15 * i
from
    slwi r3,r4,4
    subfic r5,r4,0
    add r3,r3,r5
to
    mulli r3,r4,15
If the sequence "slwi subfic add" takes 3 cycles and 12 bytes, and
mulli takes 3 cycles and 4 bytes, then mulli is better: the same
cycle count in a third of the code size.
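
As a sanity check on the two sequences, here is a minimal C sketch that models each one and compares the code sizes. The function and constant names are hypothetical and are not taken from mach.c, mach.h, or table. It uses unsigned 32-bit arithmetic so that wraparound is well defined; the low 32 bits of the product are the same for signed and unsigned operands. The 4-byte instruction size is the fixed PowerPC instruction width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the three-instruction sequence. */
uint32_t times15_shift(uint32_t i)
{
    uint32_t r3 = i << 4;   /* slwi   r3,r4,4   : r3 = 16 * i */
    uint32_t r5 = 0u - i;   /* subfic r5,r4,0   : r5 = 0 - i  */
    return r3 + r5;         /* add    r3,r3,r5  : r3 = 15 * i */
}

/* Hypothetical model of the single-instruction form. */
uint32_t times15_mulli(uint32_t i)
{
    return i * 15u;         /* mulli  r3,r4,15 */
}

/* Code size: every PowerPC instruction is 4 bytes wide. */
enum {
    INSN_BYTES      = 4,
    SHIFT_SEQ_BYTES = 3 * INSN_BYTES,   /* slwi + subfic + add */
    MULLI_BYTES     = 1 * INSN_BYTES    /* mulli alone */
};

/* Checks that both forms agree on a few sample inputs. */
void check_times15(void)
{
    static const uint32_t samples[] = { 0u, 1u, 7u, 0x10000000u, 0xFFFFFFFFu };
    for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
        assert(times15_shift(samples[k]) == times15_mulli(samples[k]));
}
```

Both forms produce the same 32-bit result, so the choice comes down to cost. Under the cycle counts assumed above the timing is equal, and the mulli form saves 8 bytes per use.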