ack/modules/src/flt_arith/flt_arith.3

.TH FLT_ARITH 3ACK " Fri June 30 1989"
.ad
.SH NAME
flt_arith \- high precision floating point arithmetic
.SH SYNOPSIS
.nf
.B #include <flt_arith.h>
.PP
.if t .ta 3m 13m 22m
.if n .ta 5m 25m 40m
struct flt_mantissa {
	long	flt_h_32;	/* high order 32 bits of mantissa */
	long	flt_l_32;	/* low order 32 bits of mantissa */
};

typedef struct {
	short	flt_sign;	/* 0 for positive, 1 for negative */
	short	flt_exp;	/* between -16384 and 16384 */
	struct flt_mantissa	flt_mantissa;	/* normalized, in [1,2). */
} flt_arith;

extern int	flt_status;
#define FLT_OVFL	001
#define FLT_UNFL	002
#define FLT_DIV0	004
#define FLT_NOFLT	010
#define FLT_BTSM	020

#define FLT_STRLEN	32
.PP
.B flt_add(e1, e2, e3)
.B flt_arith *e1, *e2, *e3;
.PP
.B flt_mul(e1, e2, e3)
.B flt_arith *e1, *e2, *e3;
.PP
.B flt_sub(e1, e2, e3)
.B flt_arith *e1, *e2, *e3;
.PP
.B flt_div(e1, e2, e3)
.B flt_arith *e1, *e2, *e3;
.PP
.B flt_umin(e)
.B flt_arith *e;
.PP
.B flt_modf(e1, intpart, fractpart)
.B flt_arith *e1, *intpart, *fractpart;
.PP
.B int flt_cmp(e1, e2)
.B flt_arith *e1, *e2;
.PP
.B flt_str2flt(s, e)
.B char *s;
.B flt_arith *e;
.PP
.B flt_flt2str(e, buf, bufsize)
.B flt_arith *e;
.B char *buf;
.B int bufsize;
.PP
.B int flt_status;
.PP
.B #include <em_arith.h>
.B flt_arith2flt(n, e, uns)
.B arith n;
.B flt_arith *e;
.B int uns;
.PP
.B arith flt_flt2arith(e, uns)
.B flt_arith *e;
.B int uns;
.PP
.B flt_b64_sft(m, n)
.B struct flt_mantissa *m;
.B int n;
.SH DESCRIPTION
This set of routines emulates floating point arithmetic, in a high
precision. It is intended primarily for compilers that need to evaluate
floating point expressions at compile-time. It could be argued that this
should be done in the floating point arithmetic of the target machine,
but EM does not define its floating point arithmetic.
.PP
.B flt_add
adds the numbers indicated by
.I e1
and
.I e2
and stores the result indirectly through
.IR e3 .
.PP
.B flt_mul
multiplies the numbers indicated by
.I e1
and
.I e2
and stores the result indirectly through
.IR e3 .
.PP
.B flt_sub
subtracts the number indicated by
.I e2
from the one indicated by
.I e1
and stores the result indirectly through
.IR e3 .
.PP
.B flt_div
divides the number indicated by
.I e1
by the one indicated by
.I e2
and stores the result indirectly through
.IR e3 .
.PP
.B flt_umin
negates the number indicated by
.I e
and stores the result indirectly through
.IR e .
.PP
.B flt_modf
splits the number indicated by
.I e
in an integer and a fraction part, and stores the integer part through
.I intpart
and the fraction part through
.IR fractpart .
So, adding the numbers indicated by
.I intpart
and
.I fractpart
results (in the absence of rounding error) in the number
indicated by
.IR e .
Also, the absolute value of the number indicated by
.I intpart
is less than or equal to the absolute value of the number indicated by
.IR e .
The absolute value of the number indicated by
.I fractpart
is less than 1.
.PP
.B flt_cmp
compares the numbers indicated by
.I e1
and
.I e2
and returns -1 if
.I e1
<
.IR e2 ,
0 if
.I e1
=
.IR e2 ,
and 1 if
.I e1
>
.IR e2 .
.PP
.B flt_str2flt
converts the string indicated by
.I s
to a floating point number, and stores this number through
.IR e.
The string should contain a floating point constant, which consists of
an integer part, a decimal point, a fraction part, an \f(CWe\fP or an
\f(CWE\fP, and an optionally signed integer exponent. The integer and
fraction parts both consist of a sequence of digits. They may not both be
missing. The decimal point, the \f(CWe\fP and the exponent may be
missing.
.PP
.B flt_flt2str
converts the number indicated by
.I e
into a string, in a scientific notation acceptable for EM. The result is
stored in
.IR buf .
At most
.I bufsize
characters are stored.
The maximum length needed is available in the constant FLT_STRLEN.
.PP
.B flt_arith2flt
converts the number
.I n
to the floating point format used in this package and returns the result
in
.IR e . If the
.I uns
flag is set, the number
.I n
is regarded as an unsigned.
.PP
.B flt_flt2arith
truncates the number indicated by
.I e
to the largest integer value smaller than or equal to the number indicated by
.IR e .
It returns this value. If the
.I uns
flag is set, the result is to be regarded as unsigned.
.PP
Before each operation, the
.I flt_status
variable is reset to 0. After an operation, it can be checked for one
of the following values:
.IP FLT_OVFL
.br
an overflow occurred. The result is a large value with the correct sign.
This can occur with the routines
.IR flt_add ,
.IR flt_sub ,
.IR flt_div ,
.IR flt_mul ,
.IR flt_flt2arith ,
and
.IR flt_str2flt .
.IP FLT_UNFL
.br
an underflow occurred. The result is 0.
This can occur with the routines
.IR flt_div ,
.IR flt_mul ,
.IR flt_sub ,
.IR flt_add ,
and
.IR flt_str2flt .
.IP FLT_DIV0
.br
divide by 0. The result is a large value with the sign of the dividend.
This can only occur with the routine
.IR flt_div .
.IP FLT_NOFLT
.br
indicates that the string did not represent a floating point number. The
result is 0.
This can only occur with the routine
.IR flt_str2flt .
.IP FLT_BTSM
.br
indicates that the buffer is too small. The contents of the buffer is
undefined. This can only occur with the routine
.IR flt_flt2str .
.PP
The routine
.I flt_b64_sft
shifts the mantissa
.I m
.I |n|
bits left or right, depending on the sign of
.IR n .
If
.I n
is negative, it is a left-shift; If
.I n
is positive, it is a right shift.
.SH FILES
~em/modules/h/flt_arith.h
.br
~em/modules/h/em_arith.h
.br
~em/modules/lib/libflt.a
Initial revision 1989-07-10 11:17:19 +00:00			`.TH FLT_ARITH 3ACK " Fri June 30 1989"`
			`.ad`
			`.SH NAME`
			`flt_arith \- high precision floating point arithmetic`
			`.SH SYNOPSIS`
			`.nf`
			`.B #include <flt_arith.h>`
			`.PP`
changed lay-out of manual page a bit 1989-07-12 09:48:15 +00:00			`.if t .ta 3m 13m 22m`
			`.if n .ta 5m 25m 40m`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`struct flt_mantissa {`
			`long flt_h_32; /* high order 32 bits of mantissa */`
			`long flt_l_32; /* low order 32 bits of mantissa */`
			`};`

			`typedef struct {`
			`short flt_sign; /* 0 for positive, 1 for negative */`
			`short flt_exp; /* between -16384 and 16384 */`
changed lay-out of manual page a bit 1989-07-12 09:48:15 +00:00			`struct flt_mantissa flt_mantissa; /* normalized, in [1,2). */`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`} flt_arith;`

			`extern int flt_status;`
			`#define FLT_OVFL 001`
			`#define FLT_UNFL 002`
			`#define FLT_DIV0 004`
changed lay-out of manual page a bit 1989-07-12 09:48:15 +00:00			`#define FLT_NOFLT 010`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`#define FLT_BTSM 020`
Added #define for buffer size needed for flt_flt2str() 1989-07-31 13:05:51 +00:00
			`#define FLT_STRLEN 32`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`.PP`
Initial revision 1989-07-10 11:17:19 +00:00			`.B flt_add(e1, e2, e3)`
			`.B flt_arith e1, e2, *e3;`
			`.PP`
			`.B flt_mul(e1, e2, e3)`
			`.B flt_arith e1, e2, *e3;`
			`.PP`
			`.B flt_sub(e1, e2, e3)`
			`.B flt_arith e1, e2, *e3;`
			`.PP`
			`.B flt_div(e1, e2, e3)`
			`.B flt_arith e1, e2, *e3;`
			`.PP`
fixed some bugs, added flt_umin 1989-07-28 14:13:39 +00:00			`.B flt_umin(e)`
			`.B flt_arith *e;`
			`.PP`
Initial revision 1989-07-10 11:17:19 +00:00			`.B flt_modf(e1, intpart, fractpart)`
			`.B flt_arith e1, intpart, *fractpart;`
			`.PP`
			`.B int flt_cmp(e1, e2)`
			`.B flt_arith e1, e2;`
			`.PP`
fixed some bugs, added flt_umin 1989-07-28 14:13:39 +00:00			`.B flt_str2flt(s, e)`
Initial revision 1989-07-10 11:17:19 +00:00			`.B char *s;`
			`.B flt_arith *e;`
			`.PP`
			`.B flt_flt2str(e, buf, bufsize)`
			`.B flt_arith *e;`
			`.B char *buf;`
			`.B int bufsize;`
			`.PP`
			`.B int flt_status;`
			`.PP`
			`.B #include <em_arith.h>`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`.B flt_arith2flt(n, e, uns)`
Initial revision 1989-07-10 11:17:19 +00:00			`.B arith n;`
			`.B flt_arith *e;`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`.B int uns;`
Initial revision 1989-07-10 11:17:19 +00:00			`.PP`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`.B arith flt_flt2arith(e, uns)`
Initial revision 1989-07-10 11:17:19 +00:00			`.B flt_arith *e;`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`.B int uns;`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`.PP`
			`.B flt_b64_sft(m, n)`
			`.B struct flt_mantissa *m;`
			`.B int n;`
Initial revision 1989-07-10 11:17:19 +00:00			`.SH DESCRIPTION`
			`This set of routines emulates floating point arithmetic, in a high`
			`precision. It is intended primarily for compilers that need to evaluate`
			`floating point expressions at compile-time. It could be argued that this`
			`should be done in the floating point arithmetic of the target machine,`
			`but EM does not define its floating point arithmetic.`
			`.PP`
			`.B flt_add`
			`adds the numbers indicated by`
			`.I e1`
			`and`
			`.I e2`
			`and stores the result indirectly through`
			`.IR e3 .`
			`.PP`
			`.B flt_mul`
			`multiplies the numbers indicated by`
			`.I e1`
			`and`
			`.I e2`
			`and stores the result indirectly through`
			`.IR e3 .`
			`.PP`
			`.B flt_sub`
			`subtracts the number indicated by`
			`.I e2`
			`from the one indicated by`
			`.I e1`
			`and stores the result indirectly through`
			`.IR e3 .`
			`.PP`
			`.B flt_div`
			`divides the number indicated by`
			`.I e1`
			`by the one indicated by`
			`.I e2`
			`and stores the result indirectly through`
			`.IR e3 .`
			`.PP`
fixed some bugs, added flt_umin 1989-07-28 14:13:39 +00:00			`.B flt_umin`
			`negates the number indicated by`
			`.I e`
			`and stores the result indirectly through`
			`.IR e .`
			`.PP`
Initial revision 1989-07-10 11:17:19 +00:00			`.B flt_modf`
			`splits the number indicated by`
			`.I e`
			`in an integer and a fraction part, and stores the integer part through`
			`.I intpart`
			`and the fraction part through`
			`.IR fractpart .`
			`So, adding the numbers indicated by`
			`.I intpart`
			`and`
			`.I fractpart`
			`results (in the absence of rounding error) in the number`
			`indicated by`
			`.IR e .`
			`Also, the absolute value of the number indicated by`
			`.I intpart`
			`is less than or equal to the absolute value of the number indicated by`
			`.IR e .`
			`The absolute value of the number indicated by`
			`.I fractpart`
			`is less than 1.`
			`.PP`
			`.B flt_cmp`
			`compares the numbers indicated by`
			`.I e1`
			`and`
			`.I e2`
			`and returns -1 if`
			`.I e1`
			`<`
			`.IR e2 ,`
			`0 if`
			`.I e1`
			`=`
			`.IR e2 ,`
			`and 1 if`
			`.I e1`
			`>`
			`.IR e2 .`
			`.PP`
			`.B flt_str2flt`
			`converts the string indicated by`
			`.I s`
			`to a floating point number, and stores this number through`
			`.IR e.`
			`The string should contain a floating point constant, which consists of`
			`an integer part, a decimal point, a fraction part, an \f(CWe\fP or an`
			`\f(CWE\fP, and an optionally signed integer exponent. The integer and`
			`fraction parts both consist of a sequence of digits. They may not both be`
			`missing. The decimal point, the \f(CWe\fP and the exponent may be`
			`missing.`
			`.PP`
			`.B flt_flt2str`
			`converts the number indicated by`
			`.I e`
			`into a string, in a scientific notation acceptable for EM. The result is`
			`stored in`
			`.IR buf .`
			`At most`
			`.I bufsize`
			`characters are stored.`
Added #define for buffer size needed for flt_flt2str() 1989-07-31 13:05:51 +00:00			`The maximum length needed is available in the constant FLT_STRLEN.`
Initial revision 1989-07-10 11:17:19 +00:00			`.PP`
			`.B flt_arith2flt`
			`converts the number`
			`.I n`
fixed some bugs, added flt_umin 1989-07-28 14:13:39 +00:00			`to the floating point format used in this package and returns the result`
Initial revision 1989-07-10 11:17:19 +00:00			`in`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`.IR e . If the`
			`.I uns`
			`flag is set, the number`
			`.I n`
			`is regarded as an unsigned.`
Initial revision 1989-07-10 11:17:19 +00:00			`.PP`
			`.B flt_flt2arith`
			`truncates the number indicated by`
			`.I e`
			`to the largest integer value smaller than or equal to the number indicated by`
			`.IR e .`
speeded up a bit for converting 0.0 to string 1989-11-27 17:25:55 +00:00			`It returns this value. If the`
			`.I uns`
			`flag is set, the result is to be regarded as unsigned.`
Initial revision 1989-07-10 11:17:19 +00:00			`.PP`
			`Before each operation, the`
			`.I flt_status`
			`variable is reset to 0. After an operation, it can be checked for one`
			`of the following values:`
			`.IP FLT_OVFL`
			`.br`
			`an overflow occurred. The result is a large value with the correct sign.`
			`This can occur with the routines`
			`.IR flt_add ,`
			`.IR flt_sub ,`
			`.IR flt_div ,`
			`.IR flt_mul ,`
			`.IR flt_flt2arith ,`
			`and`
			`.IR flt_str2flt .`
			`.IP FLT_UNFL`
			`.br`
			`an underflow occurred. The result is 0.`
			`This can occur with the routines`
			`.IR flt_div ,`
			`.IR flt_mul ,`
			`.IR flt_sub ,`
			`.IR flt_add ,`
			`and`
			`.IR flt_str2flt .`
			`.IP FLT_DIV0`
			`.br`
			`divide by 0. The result is a large value with the sign of the dividend.`
			`This can only occur with the routine`
			`.IR flt_div .`
			`.IP FLT_NOFLT`
			`.br`
			`indicates that the string did not represent a floating point number. The`
			`result is 0.`
			`This can only occur with the routine`
			`.IR flt_str2flt .`
			`.IP FLT_BTSM`
			`.br`
			`indicates that the buffer is too small. The contents of the buffer is`
			`undefined. This can only occur with the routine`
			`.IR flt_flt2str .`
made to work, and added the b64 shift routines to the interface 1989-07-11 09:15:17 +00:00			`.PP`
			`The routine`
			`.I flt_b64_sft`
			`shifts the mantissa`
			`.I m`
			`.I \|n\|`
			`bits left or right, depending on the sign of`
			`.IR n .`
			`If`
			`.I n`
			`is negative, it is a left-shift; If`
			`.I n`
			`is positive, it is a right shift.`
Initial revision 1989-07-10 11:17:19 +00:00			`.SH FILES`
			`~em/modules/h/flt_arith.h`
			`.br`
			`~em/modules/h/em_arith.h`
			`.br`
			`~em/modules/lib/libflt.a`