How to resolve the algorithm Safe addition step by step in the C programming language

Problem Statement
Step by Step Solution
Sourcecode

Problem Statement

Implementation of interval arithmetic and more generally fuzzy number arithmetic require operations that yield safe upper and lower bounds of the exact result. For example, for an addition, it is the operations +↑ and +↓ defined as: a +↓ b ≤ a + b ≤ a +↑ b. Additionally it is desired that the width of the interval (a +↑ b) - (a +↓ b) would be about the machine epsilon after removing the exponent part. Differently to the standard floating-point arithmetic, safe interval arithmetic is accurate (but still imprecise). I.E.: the result of each defined operation contains (though does not identify) the exact mathematical outcome. Usually a FPU's have machine +,-,*,/ operations accurate within the machine precision. To illustrate it, let us consider a machine with decimal floating-point arithmetic that has the precision is 3 decimal points. If the result of the machine addition is 1.23, then the exact mathematical result is within the interval ]1.22, 1.24[. When the machine rounds towards zero, then the exact result is within [1.23,1.24[. This is the basis for an implementation of safe addition.

Show how +↓ and +↑ can be implemented in your language using the standard floating-point type. Define an interval type based on the standard floating-point one, and implement an interval-valued addition of two floating-point numbers considering them exact, in short an operation that yields the interval [a +↓ b, a +↑ b].

Let's start with the solution:

Step by Step solution about How to resolve the algorithm Safe addition step by step in the C programming language

The provided C code defines a safe_add function that calculates an interval for the sum of two double-precision floating-point numbers, a and b. It ensures that the interval covers all possible results of the addition, taking into account floating-point rounding errors.

Here's a detailed explanation of the code:

Header Files:
- The code includes the following header files:
  - <fenv.h>: For floating-point environment manipulation functions (fegetround(), fesetround()) in the first implementation.
  - <float.h>: For floating-point control functions (_controlfp()) in the second implementation.
  - <ieeefp.h>: For IEEE floating-point rounding control functions (fpsetround()) in the third implementation.
  - <stdio.h>: For input/output functions (printf()) in all implementations.
safe_add Function:
- The safe_add function takes three arguments:
  - interval: A pointer to an array of two double-precision floating-point numbers where the interval will be stored.
  - a, b: The two double-precision floating-point numbers to be added.
- The function calculates an interval for the sum of a and b. The interval is defined such that:
  - interval[0] is the lower bound of the interval, and it is guaranteed that interval[0] <= a + b.
  - interval[1] is the upper bound of the interval, and it is guaranteed that a + b <= interval[1].
- To calculate the interval, the function uses the following steps:
  - It stores the current floating-point rounding mode using fegetround() (in the first implementation), _controlfp(0, 0) (in the second implementation), or fpsetround(FP_RN) (in the third implementation).
  - It sets the rounding mode to round to -infinity (FE_DOWNWARD, _RC_DOWN, FP_RM) using fesetround() (in the first implementation), _controlfp(_RC_DOWN, _MCW_RC) (in the second implementation), or fpsetround(FP_RM) (in the third implementation).
  - It calculates the sum of a and b and stores it in interval[0].
  - It sets the rounding mode to round to +infinity (FE_UPWARD, _RC_UP, FP_RP) using fesetround() (in the first implementation), _controlfp(_RC_UP, _MCW_RC) (in the second implementation), or fpsetround(FP_RP) (in the third implementation).
  - It calculates the sum of a and b again and stores it in interval[1].
  - Finally, it restores the original rounding mode using fesetround() (in the first implementation), _controlfp(orig, _MCW_RC) (in the second implementation), or fpsetround(orig) (in the third implementation).
main Function:
- The main function is the entry point of the program.
- It defines a 2D array nums containing four pairs of double-precision floating-point numbers. These numbers are used to test the safe_add function.
- It also declares an array ival of size 2 to store the interval calculated by safe_add.
- It iterates through the pairs of numbers in nums and calls safe_add to calculate the interval for each pair.
- For each pair, it prints the sum of the pair, the interval calculated by safe_add, and the size of the interval.

Notes:

The code uses different methods for controlling floating-point rounding mode depending on the available headers and functions on different platforms.
The volatile keyword is used to prevent the compiler from optimizing away the calculations in the safe_add function, ensuring that the rounding mode is correctly applied.
The format string %.17g is used to print the double-precision floating-point numbers with enough precision to show the differences between the rounded results.

Source code in the c programming language

#include <fenv.h>	/* fegetround(), fesetround() */
#include <stdio.h>	/* printf() */
 
/*
 * Calculates an interval for a + b.
 *   interval[0] <= a + b
 *   a + b <= interval[1]
 */
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
#pragma STDC FENV_ACCESS ON
	unsigned int orig;
 
	orig = fegetround();
	fesetround(FE_DOWNWARD);	/* round to -infinity */
	interval[0] = a + b;
	fesetround(FE_UPWARD);		/* round to +infinity */
	interval[1] = a + b;
	fesetround(orig);
}
 
int
main()
{
	const double nums[][2] = {
		{1, 2},
		{0.1, 0.2},
		{1e100, 1e-100},
		{1e308, 1e308},
	};
	double ival[2];
	int i;
 
	for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
		/*
		 * Calculate nums[i][0] + nums[i][1].
		 */
		safe_add(ival, nums[i][0], nums[i][1]);
 
		/*
		 * Print the result. %.17g gives the best output.
		 * %.16g or plain %g gives not enough digits.
		 */
		printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
		printf("    [%.17g, %.17g]\n", ival[0], ival[1]);
		printf("    size %.17g\n\n", ival[1] - ival[0]);
	}
	return 0;
}


#include <float.h>	/* _controlfp() */
#include <stdio.h>	/* printf() */

/*
 * Calculates an interval for a + b.
 *   interval[0] <= a + b
 *   a + b <= interval[1]
 */
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
	unsigned int orig;

	orig = _controlfp(0, 0);
	_controlfp(_RC_DOWN, _MCW_RC);	/* round to -infinity */
	interval[0] = a + b;
	_controlfp(_RC_UP, _MCW_RC);	/* round to +infinity */
	interval[1] = a + b;
	_controlfp(orig, _MCW_RC);
}

int
main()
{
	const double nums[][2] = {
		{1, 2},
		{0.1, 0.2},
		{1e100, 1e-100},
		{1e308, 1e308},
	};
	double ival[2];
	int i;

	for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
		/*
		 * Calculate nums[i][0] + nums[i][1].
		 */
		safe_add(ival, nums[i][0], nums[i][1]);

		/*
		 * Print the result. %.17g gives the best output.
		 * %.16g or plain %g gives not enough digits.
		 */
		printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
		printf("    [%.17g, %.17g]\n", ival[0], ival[1]);
		printf("    size %.17g\n\n", ival[1] - ival[0]);
	}
	return 0;
}


#include <ieeefp.h>	/* fpsetround() */
#include <stdio.h>	/* printf() */

/*
 * Calculates an interval for a + b.
 *   interval[0] <= a + b
 *   a + b <= interval[1]
 */
void
safe_add(volatile double interval[2], volatile double a, volatile double b)
{
	fp_rnd orig;

	orig = fpsetround(FP_RM);	/* round to -infinity */
	interval[0] = a + b;
	fpsetround(FP_RP);		/* round to +infinity */
	interval[1] = a + b;
	fpsetround(orig);
}

int
main()
{
	const double nums[][2] = {
		{1, 2},
		{0.1, 0.2},
		{1e100, 1e-100},
		{1e308, 1e308},
	};
	double ival[2];
	int i;

	for (i = 0; i < sizeof(nums) / sizeof(nums[0]); i++) {
		/*
		 * Calculate nums[i][0] + nums[i][1].
		 */
		safe_add(ival, nums[i][0], nums[i][1]);

		/*
		 * Print the result. With OpenBSD libc, %.17g gives
		 * the best output; %.16g or plain %g gives not enough
		 * digits.
		 */
		printf("%.17g + %.17g =\n", nums[i][0], nums[i][1]);
		printf("    [%.17g, %.17g]\n", ival[0], ival[1]);
		printf("    size %.17g\n\n", ival[1] - ival[0]);
	}
	return 0;
}

You may also check:How to resolve the algorithm Check output device is a terminal step by step in the Raku programming language
You may also check:How to resolve the algorithm Jordan-Pólya numbers step by step in the FreeBASIC programming language
You may also check:How to resolve the algorithm Semordnilap step by step in the Java programming language
You may also check:How to resolve the algorithm Tokenize a string step by step in the Seed7 programming language
You may also check:How to resolve the algorithm Tau number step by step in the MAD programming language

How to resolve the algorithm Safe addition step by step in the C programming language

Table of Contents

Problem Statement

Step by Step solution about How to resolve the algorithm Safe addition step by step in the C programming language

Source code in the c programming language