ack/mach/powerpc/libem/fef8.s

.sect .text; .sect .rom; .sect .data; .sect .bss

.sect .text

! Split a double-precision float into fraction and exponent, like
! frexp(3) in C.
!
! Stack: ( double -- fraction exponent )

.define .fef8
.fef8:
	lwz r3, 0(sp)			! r3 = high word (bits 0..31)
	lwz r4, 4(sp)			! r4 = low word (bits 32..63)

	! IEEE double-precision format:
	!   sign  exponent  fraction
	!   0     1..11     12..63
	!
	! To get fraction in [0.5, 1) or (-1, -0.5], we subtract 1022
	! from the IEEE exponent.

	extrwi. r6, r3, 11, 1		! r6 = IEEE exponent
	addi r5, r6, -1022		! r5 = our exponent
	beq 2f				! jump if zero or denormalized
	cmpwi r6, 2047
	beq 1f				! jump if infinity or NaN
	! fall through if normalized

	! Put fraction in [0.5, 1) or (-1, -0.5] by setting its
	! IEEE exponent to 1022.
	rlwinm r3, r3, 0, 12, 0		! clear old exponent
	oris r3, r3, 1022 << 4		! set new exponent
	! fall through

1:	stw r3, 0(sp)
	stw r4, 4(sp)			! push fraction
	stwu r5, -4(sp)			! push exponent
	blr

2:	! Got denormalized number or zero, probably zero.
	extrwi r6, r3, 20, 12		! r6 = fraction bits 12..31 of high word
	or. r6, r6, r4			! r6 = high|low fraction
	bne 3f				! jump if not zero
	li r5, 0			! exponent = 0
	b 1b

3:	! Got denormalized number, not zero.
	lfd f0, 0(sp)
	lis r6, ha16[_2_64]
	lfd f1, lo16[_2_64](r6)
	fmul f0, f0, f1			! multiply it by 2**64
	stfd f0, 0(sp)
	lwz r3, 0(sp)
	lwz r4, 4(sp)
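	extrwi r6, r3, 11, 1		! r6 = IEEE exponent
	addi r5, r6, -1022 - 64		! r5 = our exponent
	! The scaled value is normalized now, but its fraction still
	! needs its IEEE exponent set to 1022, as in the usual case.
	rlwinm r3, r3, 0, 12, 0		! clear old exponent
	oris r3, r3, 1022 << 4		! set new exponent
	b 1b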

.sect .rom
_2_64:
	! (double) 2**64
	.data4 0x43f00000
	.data4 0x00000000
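
The bit manipulation above can be modelled in C. This is only a sketch of the intended frexp(3)-like behaviour, not the libem code itself: the name fef8_model is hypothetical, the host is assumed to use IEEE 754 doubles, and the model renormalizes the fraction on every path (including denormals) the way frexp(3) would. For infinity and NaN it returns the value unchanged with exponent 2047 - 1022, matching the branch to label 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of .fef8: returns the fraction and stores
 * the exponent through *exponent.  Assumes IEEE 754 doubles. */
static double fef8_model(double x, int *exponent)
{
    uint64_t bits;
    uint32_t ieee_exp;

    assert(exponent != NULL);
    memcpy(&bits, &x, sizeof bits);
    ieee_exp = (uint32_t)(bits >> 52) & 0x7ff;  /* extrwi r6, r3, 11, 1 */

    if (ieee_exp == 0) {
        if ((bits & 0x000fffffffffffffULL) == 0) {
            *exponent = 0;                      /* +0.0 or -0.0 */
            return x;
        }
        /* Denormal: multiply by 2**64 so the value becomes
         * normalized, then subtract 64 from the exponent. */
        x *= 0x1p64;
        memcpy(&bits, &x, sizeof bits);
        ieee_exp = (uint32_t)(bits >> 52) & 0x7ff;
        *exponent = (int)ieee_exp - 1022 - 64;
    } else {
        *exponent = (int)ieee_exp - 1022;
        if (ieee_exp == 2047)
            return x;                           /* infinity or NaN */
    }

    /* Clear the old exponent field and set it to 1022, which puts
     * the fraction in [0.5, 1) or (-1, -0.5]. */
    bits = (bits & ~(0x7ffULL << 52)) | (1022ULL << 52);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, fef8_model(8.0, &e) returns 0.5 and sets e to 4, because 8 = 0.5 * 2**4.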
In PowerPC libem, use the new features of our assembler.
The new features are the hi16/lo16 and ha16/lo16 syntax for relocations, and the extended mnemonics like "blr". Use ha16/lo16 to load some double floats with 2 instructions (lis/lfd) instead of 3 (lis/ori/lfd). Use the extended names for branches, comparisons, and bit rotations, so I can more easily read the code. The new names often encode the same machine instructions as the old names, except in a few places where I changed the instructions.
Stop using andi. when we don't need to set cr0. In inn.s, I change andi. to extrwi to extract the same bits. In los.s and sts.s, I change "andi. r3, r3, ~3" to "clrrwi r3, r3, 2". This avoids setting cr0 and also stops clearing the high 16 bits of r3.
In csa.s, los.s, sts.s, I change some comparisons and right shifts from signed to unsigned (cmplw, cmplwi, srwi), because the sizes are unsigned. In inn.s, the right shift can be signed (sraw) or unsigned (srw), but I use srw because we don't need the carry bit.
In fef8.s, I save an instruction by using rlwinm instead of addis/andc to clear a field. The code no longer kills r7. In both fef8.s and fif8.s, I remove the list of killed registers.
Also remove some whitespace from ends of lines.
2017-01-23 22:16:39 +00:00

Rewrite fef 8 in powerpc assembly.
In EM, fef splits a float into exponent and fraction. The old C code, given an infinite float, got stuck in an infinite loop. The new assembly code doesn't loop; it extracts the IEEE exponent.
2016-09-29 19:52:54 +00:00

Change .fef8 and .fif8 to pass values on the stack.
Reorder the code in .fef8 and .fif8 so that in the usual case, we fall through to the blr without taking any branches. The usual case, by my guess, is .fef8 with normalized numbers or .fif8 with small integers.
I change .fef8 and .fif8 to pass values on the real stack, not in specific registers. This simplifies the ncg table, and might help me experiment with changes to the ncg table. This change might or might not help mcg. Seems that mcg always uses the stack to pass values to libem, but I have not tested .fef8 or .fif8 with mcg.
2017-02-12 21:44:37 +00:00
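
The 2017-01-23 message says ha16/lo16 lets a double be loaded with lis/lfd instead of lis/ori/lfd. A small C sketch of the address arithmetic may show why. It assumes ha16 has the usual PowerPC "high adjusted" meaning (the high half plus one when the low half has bit 15 set, to compensate for lfd sign-extending its 16-bit displacement); the function names here are hypothetical and are not part of the assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Low half of an address as lfd sees it: a sign-extended
 * 16-bit displacement. */
static int32_t lo16_signed(uint32_t addr)
{
    int32_t low = (int32_t)(addr & 0xffffu);
    return low >= 0x8000 ? low - 0x10000 : low;
}

/* Plain high half, for use with lis/ori (ori does not sign-extend). */
static uint32_t hi16(uint32_t addr)
{
    return addr >> 16;
}

/* Adjusted high half, for use with a sign-extending displacement. */
static uint32_t ha16(uint32_t addr)
{
    return ((addr + 0x8000u) >> 16) & 0xffffu;
}

/* Address reached by: lis rX, ha16[addr]; lfd f, lo16[addr](rX) */
static uint32_t lis_lfd_address(uint32_t addr)
{
    uint32_t base = ha16(addr) << 16;
    return base + (uint32_t)lo16_signed(addr);
}

/* Address built by: lis rX, hi16[addr]; ori rX, rX, lo16[addr] */
static uint32_t lis_ori_address(uint32_t addr)
{
    return (hi16(addr) << 16) | (addr & 0xffffu);
}
```

Both sequences reach the same address, but the ha16 form folds the low half into the displacement of the load, which saves the ori. For a made-up address 0x1000a000, ha16 is 0x1001 and the displacement is -0x6000, so 0x10010000 - 0x6000 gives 0x1000a000 again.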