ack/mach/powerpc/libem/fif8.s

.sect .text

! Multiplies two double-precision floats, then splits the product into
! fraction and integer, like modf(3) in C.  On entry:
!
! Stack: ( a b -- fraction integer )

.define .fif8
.fif8:
	lfd f1, 8(sp)
	lfd f2, 0(sp)
	fmul f1, f1, f2			! f1 = a * b
	stfd f1, 0(sp)
	lwz r3, 0(sp)			! r3 = high word
	lwz r4, 4(sp)			! r4 = low word

	! IEEE double-precision format:
	!   sign  exponent  fraction
	!   0     1..11     12..63
	!
	! Subtract 1023 from the IEEE exponent.  If the result is from
	! 0 to 51, then the IEEE fraction has that many integer bits.
	! (IEEE has an implicit 1 before its fraction.  If the IEEE
	! fraction has 0 integer bits, we still have an integer.)

	extrwi r5, r3, 11, 1		! r5 = IEEE exponent
	addic. r5, r5, -1023		! r5 = nr of integer bits
	blt 4f				! branch if no integer
	cmpwi r5, 52
	bge 5f				! branch if no fraction
	cmpwi r5, 21
	bge 6f				! branch if large integer
	! fall through if small integer

	! f1 has r5 = 0 to 20 integer bits in the IEEE fraction.
	! High word has 20 - r5 fraction bits.
	li r6, 20
	subf r6, r5, r6
	srw r3, r3, r6
	li r4, 0			! clear low word
	slw r3, r3, r6			! clear fraction in high word
	! fall through

1:	stw r3, 0(sp)
	stw r4, 4(sp)
	lfd f2, 0(sp)			! integer = high word, low word
2:	fsub f1, f1, f2			! fraction = value - integer
3:	stfd f1, 8(sp)			! push fraction
	stfd f2, 0(sp)			! push integer
	blr

4:	! f1 is a fraction without integer.
	fsub f2, f1, f1			! integer = zero
	b 3b

5:	! f1 is an integer without fraction (or infinity or NaN).
	fmr f2, f1			! integer = f1
	b 2b

6:	! f1 has r5 = 21 to 51 integer bits.
	! Low word has 52 - r5 fraction bits.
	li r6, 52
	subf r6, r5, r6
	srw r4, r4, r6
	slw r4, r4, r6			! clear fraction in low word
	b 1b
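
The same bit manipulation can be sketched in C. This is only an illustration of the technique described in the comments above (extract the exponent, clear the fraction bits, subtract), not the code in mach/proto/fp/fif8.c. The name fif8_sketch and its signature are hypothetical, and the sketch assumes that double is an IEEE 754 binary64 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libem.  Multiplies a by b, stores the
 * integer part of the product in *integer, and returns the fraction.
 * Assumes double is IEEE 754 binary64, as the assembly does.
 */
static double fif8_sketch(double a, double b, double *integer)
{
	double value = a * b;
	uint64_t bits;
	int nbits;

	assert(sizeof value == sizeof bits);
	memcpy(&bits, &value, sizeof bits);

	/* Biased exponent minus 1023 = number of integer bits in the fraction field. */
	nbits = (int)((bits >> 52) & 0x7ff) - 1023;

	if (nbits < 0) {
		/* Like label 4: a fraction without integer. */
		*integer = value - value;	/* zero */
		return value;
	}
	if (nbits >= 52) {
		/* Like label 5: an integer without fraction (or infinity or NaN). */
		*integer = value;
		return value - value;
	}

	/*
	 * The low 52 - nbits bits hold the fraction; clear them.  The
	 * assembly does this in one 32-bit word at a time (high word for
	 * 0 to 20 integer bits, low word for 21 to 51).
	 */
	bits &= ~((UINT64_C(1) << (52 - nbits)) - 1);
	memcpy(integer, &bits, sizeof bits);
	return value - *integer;
}
```

With this sketch, fif8_sketch(1.5, 2.5, &i) multiplies to 3.75, stores 3.0 in i, and returns 0.75.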

2007-11-02 18:56:58 +00:00
Archival checkin (semi-working code).

2016-10-17 04:39:59 +00:00
Rewrite .fif8 to avoid powerpc64 fctid

This fixes the SIGILL (illegal instruction) in startrek when firing phasers. The 32-bit processors in my PowerPC Mac and in QEMU don't have fctid, a 64-bit instruction.

I got the idea from mach/proto/fp/fif8.c to extract the exponent, clear some bits to get an integer, then subtract the integer from the original value to get the fraction.

2017-01-23 22:16:39 +00:00
In PowerPC libem, use the new features of our assembler.

The new features are the hi16/lo16 and ha16/lo16 syntax for relocations, and the extended mnemonics like "blr".

Use ha16/lo16 to load some double floats with 2 instructions (lis/lfd) instead of 3 (lis/ori/lfd).

Use the extended names for branches, comparisons, and bit rotations, so I can more easily read the code. The new names often encode the same machine instructions as the old names, except in a few places where I changed the instructions.

Stop using andi. when we don't need to set cr0. In inn.s, I change andi. to extrwi to extract the same bits. In los.s and sts.s, I change "andi. r3, r3, ~3" to "clrrwi r3, r3, 2". This avoids setting cr0 and also stops clearing the high 16 bits of r3.

In csa.s, los.s, sts.s, I change some comparisons and right shifts from signed to unsigned (cmplw, cmplwi, srwi), because the sizes are unsigned. In inn.s, the right shift can be signed (sraw) or unsigned (srw), but I use srw because we don't need the carry bit.

In fef8.s, I save an instruction by using rlwinm instead of addis/andc to clear a field. The code no longer kills r7. In both fef8.s and fif8.s, I remove the list of killed registers.

Also remove some whitespace from ends of lines.

2017-02-12 21:44:37 +00:00
Change .fef8 and .fif8 to pass values on the stack.

Reorder the code in .fef8 and .fif8 so that in the usual case, we fall through to the blr without taking any branches. The usual case, by my guess, is .fef8 with normalized numbers or .fif8 with small integers.

I change .fef8 and .fif8 to pass values on the real stack, not in specific registers. This simplifies the ncg table, and might help me experiment with changes to the ncg table.

This change might or might not help mcg. Seems that mcg always uses the stack to pass values to libem, but I have not tested .fef8 or .fif8 with mcg.