2007-11-02 18:56:58 +00:00
.sect .text
2016-10-17 04:39:59 +00:00
! Multiplies two double-precision floats, then splits the product into
Add fef 4, fif 4. Improve fef 8, fif 8. Other float changes.
When I wrote fef 8, I forgot to test denormalized numbers. Oops. Now
fix two of my mistakes:
- When checking for zero, `extrwi r6, r3, 22, 12` needs to be
`extrwi r6, r3, 20, 12`. There are only 20 bits to extract.
- After the multiplication by 2**64, I forgot to put the fraction in
[0.5, 1) or (-1, -0.5] by setting IEEE exponent = 1022.
Teach fif 8 about signed zero and NaN.
In ncg/table, change cmf so NaN is not equal to any value, and comment
why ordered comparisons don't work with NaN. Also add cost for
fctwiz, remove extra `uses REG`.
Edit comment in cfu8.s because the conditional branch might be before
or after fctwiz.
2018-01-22 19:04:15 +00:00
! fraction and integer, both as floats, like modf(3) in C,
! http://en.cppreference.com/w/c/numeric/math/modf
2017-02-12 21:44:37 +00:00
!
! Stack: ( a b -- fraction integer )
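For readers more at home in C, the stack effect described above can be approximated as follows. This is only a sketch: `fif8_sketch` and `struct fif_result` are hypothetical names that do not appear in the source, and the sketch leans on the C library's `modf` rather than reproducing the bit manipulation done in the assembly.

```c
#include <math.h>

/* Hypothetical result pair, mirroring ( a b -- fraction integer ). */
struct fif_result {
    double fraction;
    double integer;
};

/* Multiply a and b, then split the product the way modf(3) does.
 * Both parts carry the sign of the product. */
struct fif_result fif8_sketch(double a, double b)
{
    struct fif_result r;
    double product = a * b;

    r.fraction = modf(product, &r.integer);
    return r;
}
```

For example, `fif8_sketch(2.5, 1.5)` splits the product 3.75 into an integer part of 3.0 and a fraction of 0.75.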

.define .fif8
.fif8:
	lfd f1, 8(sp)
	lfd f2, 0(sp)
	fmul f1, f1, f2 ! f1 = a * b
	stfd f1, 0(sp)
	lwz r3, 0(sp) ! r3 = high word
	lwz r4, 4(sp) ! r4 = low word

	! IEEE double = sign * 1.fraction * 2**(exponent - 1023)
	! sign  exponent  fraction
	!   0     1..11    12..63
	!
	! Subtract 1023 from the IEEE exponent. If the result is from
	! 0 to 51, then the IEEE fraction has that many integer bits.
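The exponent test described in the comment above can be sketched in C. The helper name `integer_bits` is hypothetical, and the sketch assumes that `double` is an IEEE 754 binary64 value. It pulls the 11-bit exponent out of the high word, as `extrwi r5, r3, 11, 1` does, and subtracts the bias of 1023.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IEEE exponent minus 1023.
 * A result below 0 means the value has no integer part,
 * 0 to 51 means that many fraction bits hold integer digits,
 * and 52 or more means there is no fractional part. */
int integer_bits(double x)
{
    uint64_t bits;
    uint32_t high;

    memcpy(&bits, &x, sizeof bits);       /* assumes IEEE 754 binary64 */
    high = (uint32_t)(bits >> 32);        /* sign, exponent, top 20 fraction bits */
    return (int)((high >> 20) & 0x7ff) - 1023;
}
```

With this helper, 1.0 gives 0, 8.0 gives 3, and 0.5 gives -1, which corresponds to the "no integer" branch.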
In PowerPC libem, use the new features of our assembler.
The new features are the hi16/lo16 and ha16/lo16 syntax for
relocations, and the extended mnemonics like "blr".
Use ha16/lo16 to load some double floats with 2 instructions (lis/lfd)
instead of 3 (lis/ori/lfd).
Use the extended names for branches, comparisons, and bit rotations,
so I can more easily read the code. The new names often encode the
same machine instructions as the old names, except in a few places
where I changed the instructions.
Stop using andi. when we don't need to set cr0. In inn.s, I change
andi. to extrwi to extract the same bits. In los.s and sts.s, I
change "andi. r3, r3, ~3" to "clrrwi r3, r3, 2". This avoids setting
cr0 and also stops clearing the high 16 bits of r3.
In csa.s, los.s, sts.s, I change some comparisons and right shifts
from signed to unsigned (cmplw, cmplwi, srwi), because the sizes are
unsigned. In inn.s, the right shift can be signed (sraw) or unsigned
(srw), but I use srw because we don't need the carry bit.
In fef8.s, I save an instruction by using rlwinm instead of addis/andc
to clear a field. The code no longer kills r7. In both
fef8.s and fif8.s, I remove the list of killed registers.
Also remove some whitespace from ends of lines.
2017-01-23 22:16:39 +00:00
	extrwi r5, r3, 11, 1 ! r5 = IEEE exponent
	addic. r5, r5, -1023 ! r5 = nr of integer bits
	blt 3f ! branch if no integer
	cmpwi r5, 52
	bge 4f ! branch if no fraction
	cmpwi r5, 21
	bge 6f ! branch if large integer
	! fall through if small integer

	! f1 has r5 = 0 to 20 integer bits in the IEEE fraction.
	! High word has 20 - r5 fraction bits.
	li r6, 20
	subf r6, r5, r6
	srw r3, r3, r6
	li r4, 0 ! clear low word
	slw r3, r3, r6 ! clear fraction in high word
	! fall through

1:	stw r3, 0(sp)
	stw r4, 4(sp)
	lfd f2, 0(sp) ! integer = high word, low word
	fsub f1, f1, f2 ! fraction = value - integer
2:	stfd f1, 8(sp) ! push fraction
	stfd f2, 0(sp) ! push integer
	blr
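The shift-right-then-shift-left idiom used to clear the fraction bits can be sketched in C. The helper `clear_fraction_bits` is a hypothetical name, the sketch assumes IEEE 754 binary64, and it is only meaningful when `n`, the number of integer bits, is between 0 and 51. The `else` branch corresponds to the "large integer" case at label 6, where the fraction bits to clear sit in the low word.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: truncate x toward zero by clearing the fraction
 * bits that lie below its n integer bits (0 <= n <= 51 assumed). */
double clear_fraction_bits(double x, int n)
{
    uint64_t bits;
    uint32_t high, low;

    memcpy(&bits, &x, sizeof bits);
    high = (uint32_t)(bits >> 32);
    low = (uint32_t)bits;

    if (n <= 20) {
        int shift = 20 - n;               /* fraction bits left in the high word */
        high = (high >> shift) << shift;
        low = 0;                          /* the low word is all fraction */
    } else {
        int shift = 52 - n;               /* fraction bits left in the low word */
        low = (low >> shift) << shift;
    }

    bits = ((uint64_t)high << 32) | low;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Subtracting the result from the original value then yields the fraction, as the `fsub` after label 1 does. For instance, 3.75 has one integer bit, so `clear_fraction_bits(3.75, 1)` gives 3.0 and the fraction is 0.75.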

	! f1 is a fraction without integer (or zero).
	! Then integer is zero with same sign.
3:	extlwi r3, r3, 1, 0 ! extract sign bit
	li r4, 0
	stfd f1, 8(sp) ! push fraction
	stw r4, 4(sp)
	stw r3, 0(sp) ! push integer = zero with sign
	blr

	! f1 is an integer without fraction (or infinity or NaN).
	! Unless NaN, then fraction is zero with same sign.
4:	fcmpu cr0, f1, f1 ! integer = f1
	bun cr0, 5f
	extlwi r3, r3, 1, 0 ! extract sign bit
	li r4, 0
	stw r4, 12(sp)
	stw r3, 8(sp) ! push fraction = zero with sign
	stfd f1, 0(sp) ! push integer
	blr

	! f1 is NaN, so both fraction and integer are NaN.
5:	fmr f2, f1
	b 2b
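The two special cases above, a zero that keeps the sign of the value and a NaN detected by comparing the value with itself, can be sketched in C. Both helper names are hypothetical and the sketch assumes IEEE 754 binary64. Keeping only the top bit of the 64-bit pattern plays the role of `extlwi r3, r3, 1, 0` followed by a zero low word, and `x != x` stands in for the `fcmpu`/`bun` pair.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: +0.0 or -0.0, taking the sign from x. */
double zero_with_sign_of(double x)
{
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);
    bits &= UINT64_C(0x8000000000000000);   /* keep only the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: a NaN is the only value unordered with itself. */
int is_nan_by_self_compare(double x)
{
    return x != x;
}
```

A negative input such as -0.25 yields -0.0, which still compares equal to 0.0 but whose reciprocal is negative infinity.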

	! f1 has r5 = 21 to 51 integer bits.
	! Low word has 52 - r5 fraction bits.
6:	li r6, 52
	subf r6, r5, r6
	srw r4, r4, r6
	slw r4, r4, r6 ! clear fraction in low word
	b 1b