Because lwzu and stwu update the pointer as part of the load or store, I can drop an addi instruction from the loop, which makes the loop slightly faster. I wrote a Modula-2 benchmark that exercises some of these loops and timed it on my old PowerPC Mac. Its user time decreases from 8.401s to 8.217s (about 2%) with the tighter loops.
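As a rough illustration of why the update-form instructions save an addi, here is a minimal C sketch of the two loop shapes. It is not the code generator's actual output or table entries; the function names `copy_words_indexed` and `copy_words_update` are hypothetical, and whether a given compiler really selects lwzu/stwu for the second form is an assumption, not something this sketch guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain form: each iteration loads, stores, and then advances the
 * position as a separate step (on PowerPC, typically a separate addi). */
void copy_words_indexed(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Pre-increment form: every access after the first is "advance the
 * pointer, then use it", which is the shape that lwzu and stwu perform
 * in a single instruction each. Hypothetical illustration only. */
void copy_words_update(uint32_t *dst, const uint32_t *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    if (n == 0)
        return;

    const uint32_t *s = src;
    uint32_t *d = dst;

    *d = *s;                      /* first word, no pointer update */
    for (size_t i = 1; i < n; i++)
        *++d = *++s;              /* update-then-access, as lwzu/stwu do */
}
```

Both functions copy the same `n` words; only the second expresses the pointer update as part of each access, so there is no separate increment left to emit inside the loop.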