How to resolve the algorithm Safe addition step by step in the C programming language
How to resolve the algorithm Safe addition step by step in the C programming language
Table of Contents
Problem Statement
Implementation of interval arithmetic and more generally fuzzy number arithmetic require operations that yield safe upper and lower bounds of the exact result. For example, for an addition, it is the operations +↑ and +↓ defined as: a +↓ b ≤ a + b ≤ a +↑ b. Additionally it is desired that the width of the interval (a +↑ b) - (a +↓ b) would be about the machine epsilon after removing the exponent part. Differently to the standard floating-point arithmetic, safe interval arithmetic is accurate (but still imprecise). I.E.: the result of each defined operation contains (though does not identify) the exact mathematical outcome. Usually a FPU's have machine +,-,*,/ operations accurate within the machine precision. To illustrate it, let us consider a machine with decimal floating-point arithmetic that has the precision is 3 decimal points. If the result of the machine addition is 1.23, then the exact mathematical result is within the interval ]1.22, 1.24[. When the machine rounds towards zero, then the exact result is within [1.23,1.24[. This is the basis for an implementation of safe addition.
Show how +↓ and +↑ can be implemented in your language using the standard floating-point type. Define an interval type based on the standard floating-point one, and implement an interval-valued addition of two floating-point numbers considering them exact, in short an operation that yields the interval [a +↓ b, a +↑ b].
Let's start with the solution:
Step by Step solution about How to resolve the algorithm Safe addition step by step in the C programming language
The provided C code defines a safe_add
function that calculates an interval for the sum of two double-precision floating-point numbers, a
and b
. It ensures that the interval covers all possible results of the addition, taking into account floating-point rounding errors.
Here's a detailed explanation of the code:
-
Header Files:
- The code includes the following header files:
<fenv.h>
: For floating-point environment manipulation functions (fegetround()
,fesetround()
) in the first implementation.<float.h>
: For floating-point control functions (_controlfp()
) in the second implementation.<ieeefp.h>
: For IEEE floating-point rounding control functions (fpsetround()
) in the third implementation.<stdio.h>
: For input/output functions (printf()
) in all implementations.
- The code includes the following header files:
-
safe_add
Function:- The
safe_add
function takes three arguments:interval
: A pointer to an array of two double-precision floating-point numbers where the interval will be stored.a
,b
: The two double-precision floating-point numbers to be added.
- The function calculates an interval for the sum of
a
andb
. The interval is defined such that:interval[0]
is the lower bound of the interval, and it is guaranteed thatinterval[0] <= a + b
.interval[1]
is the upper bound of the interval, and it is guaranteed thata + b <= interval[1]
.
- To calculate the interval, the function uses the following steps:
- It stores the current floating-point rounding mode using
fegetround()
(in the first implementation),_controlfp(0, 0)
(in the second implementation), orfpsetround(FP_RN)
(in the third implementation). - It sets the rounding mode to round to -infinity (
FE_DOWNWARD
,_RC_DOWN
,FP_RM
) usingfesetround()
(in the first implementation),_controlfp(_RC_DOWN, _MCW_RC)
(in the second implementation), orfpsetround(FP_RM)
(in the third implementation). - It calculates the sum of
a
andb
and stores it ininterval[0]
. - It sets the rounding mode to round to +infinity (
FE_UPWARD
,_RC_UP
,FP_RP
) usingfesetround()
(in the first implementation),_controlfp(_RC_UP, _MCW_RC)
(in the second implementation), orfpsetround(FP_RP)
(in the third implementation). - It calculates the sum of
a
andb
again and stores it ininterval[1]
. - Finally, it restores the original rounding mode using
fesetround()
(in the first implementation),_controlfp(orig, _MCW_RC)
(in the second implementation), orfpsetround(orig)
(in the third implementation).
- It stores the current floating-point rounding mode using
- The
-
main
Function:- The
main
function is the entry point of the program. - It defines a 2D array
nums
containing four pairs of double-precision floating-point numbers. These numbers are used to test thesafe_add
function. - It also declares an array
ival
of size 2 to store the interval calculated bysafe_add
. - It iterates through the pairs of numbers in
nums
and callssafe_add
to calculate the interval for each pair. - For each pair, it prints the sum of the pair, the interval calculated by
safe_add
, and the size of the interval.
- The
Notes:
- The code uses different methods for controlling floating-point rounding mode depending on the available headers and functions on different platforms.
- The
volatile
keyword is used to prevent the compiler from optimizing away the calculations in thesafe_add
function, ensuring that the rounding mode is correctly applied. - The format string
%.17g
is used to print the double-precision floating-point numbers with enough precision to show the differences between the rounded results.
Source code in the c programming language
#include <fenv.h> /* fegetround(), fesetround() */
#include <stdio.h> /* printf() */
/*
* Calculates an interval for a + b.
* interval[0] <= a + b
* a + b <= interval[1]
*/
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
#pragma STDC FENV_ACCESS ON
unsigned int orig;
orig = fegetround();
fesetround(FE_DOWNWARD); /* round to -infinity */
interval[0] = a + b;
fesetround(FE_UPWARD); /* round to +infinity */
interval[1] = a + b;
fesetround(orig);
}
int
main()
{
const double nums[][2] = {
{1, 2},
{0.1, 0.2},
{1e100, 1e-100},
{1e308, 1e308},
};
double ival[2];
int i;
for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
/*
* Calculate nums[i][0] + nums[i][1].
*/
safe_add(ival, nums[i][0], nums[i][1]);
/*
* Print the result. %.17g gives the best output.
* %.16g or plain %g gives not enough digits.
*/
printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
printf(" [%.17g, %.17g]\n", ival[0], ival[1]);
printf(" size %.17g\n\n", ival[1] - ival[0]);
}
return 0;
}
#include <float.h> /* _controlfp() */
#include <stdio.h> /* printf() */
/*
* Calculates an interval for a + b.
* interval[0] <= a + b
* a + b <= interval[1]
*/
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
unsigned int orig;
orig = _controlfp(0, 0);
_controlfp(_RC_DOWN, _MCW_RC); /* round to -infinity */
interval[0] = a + b;
_controlfp(_RC_UP, _MCW_RC); /* round to +infinity */
interval[1] = a + b;
_controlfp(orig, _MCW_RC);
}
int
main()
{
const double nums[][2] = {
{1, 2},
{0.1, 0.2},
{1e100, 1e-100},
{1e308, 1e308},
};
double ival[2];
int i;
for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
/*
* Calculate nums[i][0] + nums[i][1].
*/
safe_add(ival, nums[i][0], nums[i][1]);
/*
* Print the result. %.17g gives the best output.
* %.16g or plain %g gives not enough digits.
*/
printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
printf(" [%.17g, %.17g]\n", ival[0], ival[1]);
printf(" size %.17g\n\n", ival[1] - ival[0]);
}
return 0;
}
#include <ieeefp.h> /* fpsetround() */
#include <stdio.h> /* printf() */
/*
* Calculates an interval for a + b.
* interval[0] <= a + b
* a + b <= interval[1]
*/
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
fp_rnd orig;
orig = fpsetround(FP_RM); /* round to -infinity */
interval[0] = a + b;
fpsetround(FP_RP); /* round to +infinity */
interval[1] = a + b;
fpsetround(orig);
}
int
main()
{
const double nums[][2] = {
{1, 2},
{0.1, 0.2},
{1e100, 1e-100},
{1e308, 1e308},
};
double ival[2];
int i;
for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
/*
* Calculate nums[i][0] + nums[i][1].
*/
safe_add(ival, nums[i][0], nums[i][1]);
/*
* Print the result. With OpenBSD libc, %.17g gives
* the best output; %.16g or plain %g gives not enough
* digits.
*/
printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
printf(" [%.17g, %.17g]\n", ival[0], ival[1]);
printf(" size %.17g\n\n", ival[1] - ival[0]);
}
return 0;
}
You may also check:How to resolve the algorithm Check output device is a terminal step by step in the Raku programming language
You may also check:How to resolve the algorithm Jordan-Pólya numbers step by step in the FreeBASIC programming language
You may also check:How to resolve the algorithm Semordnilap step by step in the Java programming language
You may also check:How to resolve the algorithm Tokenize a string step by step in the Seed7 programming language
You may also check:How to resolve the algorithm Tau number step by step in the MAD programming language