How to resolve the algorithm Verify distribution uniformity/Chi-squared test step by step in the C programming language
Problem Statement
Write a function to determine whether a given set of frequency counts could plausibly have come from a uniform distribution by using the χ² test with a significance level of 5%.
The function should return a boolean that is true if and only if the distribution is one that a uniform distribution (with appropriate number of degrees of freedom) may be expected to produce.
Note: normally a two-tailed test would be used for this kind of problem.
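As a compact reference for the task above, the statistic and the decision rule can be written out as follows. The symbols O_i (observed count in category i), k (number of categories), E (expected count per category), ν (degrees of freedom) and p are notation introduced here for illustration; Q denotes the regularized upper incomplete gamma function.

```latex
\[
E = \frac{1}{k}\sum_{i=1}^{k} O_i, \qquad
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E)^2}{E}, \qquad
\nu = k - 1
\]
\[
p = Q\!\left(\frac{\nu}{2}, \frac{\chi^2}{2}\right), \qquad
\text{do not reject uniformity if } p > 0.05
\]
```

This is the one-sided (upper-tail) form of the test: a large χ² gives a small p, and the counts are then judged unlikely to come from a uniform distribution.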
Let's start with the solution:
Step-by-step solution: Verify distribution uniformity/Chi-squared test in the C programming language
This C code defines functions for numerical integration using Simpson's 3/8 rule, calculating the incomplete gamma function, and performing a chi-square uniformity test on a set of data. A detailed explanation of each part of the code is provided below:
Numerical Integration:
- double Simpson3_8( Ifctn f, double a, double b, int N): This function performs numerical integration using the composite Simpson's 3/8 rule. It takes a function pointer f, the integration interval endpoints a and b, and the number of panels N. Each panel is split into three equal subintervals, so f is evaluated at 3N+1 points. It returns the approximate integral of f over [a, b].

Gamma Function:
- #define A 12: This defines the constant A, the number of terms used in Spouge's approximation of the gamma function.
- double Gamma_Spouge( double z ): This function approximates the (complete) gamma function using Spouge's approximation. It takes a real number z and returns the approximate value of Γ(z). The coefficients are computed once, on the first call, and cached in a static array.
Incomplete Gamma Function:
- double aa1;: This global variable stores the exponent a-1 used by the integrand. A global is needed because the Ifctn function-pointer type accepts only a single argument.
- double f0( double t): This function defines the integrand of the incomplete gamma function. It takes a value t and returns pow(t, aa1)*exp(-t).
- double GammaIncomplete_Q( double a, double x): This function calculates the regularized upper incomplete gamma function Q(a,x). It integrates f0 from 0 to x with Simpson3_8, stopping early once the remaining tail of the integrand is negligible, divides the result by Gamma_Spouge(a), and subtracts the quotient from 1.
Chi-Square Uniformity Test:
- double chi2UniformDistance( double *ds, int dslen): This function calculates the chi-square distance between a set of counts ds of length dslen and a uniform distribution. The expected count is the mean of the observed counts, and the function returns the sum of the squared deviations from that mean, divided by the mean.
- double chi2Probability( int dof, double distance): This function calculates the chi-square probability for a given number of degrees of freedom dof and a given distance, as Q(dof/2, distance/2). The result is the probability of seeing a distance at least this large if the data really is uniform.
- int chiIsUniform( double *dset, int dslen, double significance): This function checks whether a set of counts dset of length dslen is consistent with a uniform distribution at the given significance level. It uses dslen-1 degrees of freedom and returns 1 if the chi-square probability is greater than significance (uniformity is not rejected) and 0 otherwise.
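An equivalent way to reach the same decision, shown here as a sketch and not as the source's method, is to compare the chi-square distance with a tabulated critical value instead of computing a probability. The table below holds the standard upper 5% critical values for 1 to 5 degrees of freedom, rounded to three decimals; the names CHI2_CRIT_05, chi2_uniform_statistic and looks_uniform_05 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Upper 5% critical values of the chi-squared distribution for
 * 1..5 degrees of freedom (standard table values, rounded). */
static const double CHI2_CRIT_05[] = { 3.841, 5.991, 7.815, 9.488, 11.070 };

/* Pearson statistic against a uniform expectation: sum((O - E)^2) / E,
 * where E is the mean of the observed counts. */
double chi2_uniform_statistic(const double *counts, size_t n)
{
    assert(counts != NULL && n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += counts[i];
    double expected = total / (double)n;
    assert(expected > 0.0);        /* an all-zero data set has no statistic */

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = counts[i] - expected;
        sum += d * d;
    }
    return sum / expected;
}

/* Returns 1 if uniformity is not rejected at the 5% level, else 0.
 * Only 2..6 categories (1..5 degrees of freedom) are covered by the table. */
int looks_uniform_05(const double *counts, size_t n)
{
    assert(n >= 2 && n <= 6);
    return chi2_uniform_statistic(counts, n) < CHI2_CRIT_05[n - 2];
}
```

For the first data set in the source, { 199809, 200665, 199607, 200270, 199649 }, the expected count is 200000 and the statistic is about 4.146, well below 9.488, so uniformity is not rejected. Because the table values are rounded, a statistic extremely close to a critical value could be classified differently from the probability-based test.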
Main Function:
- The main function creates two sets of data, dset1 and dset2, and runs the chi-square uniformity test on each.
- For each data set it prints the values, the degrees of freedom, the chi-square distance, the chi-square probability, and whether the data is consistent with a uniform distribution at a significance level of 0.05.
Source code in the C programming language
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
typedef double (* Ifctn)( double t);
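/* Numerical integration method: composite Simpson 3/8 rule over N panels */
double Simpson3_8( Ifctn f, double a, double b, int N)
{
    int j;
    double l1;
    double h = (b-a)/N;          /* panel width */
    double h1 = h/3.0;           /* spacing between sample points */
    double sum = f(a) + f(b);    /* endpoints carry weight 1 */

    for (j=3*N-1; j>0; j--) {
        l1 = (j%3)? 3.0 : 2.0;   /* weight 2 at panel boundaries, 3 elsewhere */
        sum += l1*f(a+h1*j) ;
    }
    return h*sum/8.0;
}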
#define A 12   /* number of terms in Spouge's approximation */

/* Spouge's approximation of the gamma function */
double Gamma_Spouge( double z )
{
    int k;
    static double cspace[A];
    static double *coefs = NULL;
    double accum;
    double a = A;

    /* compute the coefficients once, on the first call */
    if (!coefs) {
        double k1_factrl = 1.0;
        coefs = cspace;
        coefs[0] = sqrt(2.0*M_PI);
        for(k=1; k<A; k++) {
            coefs[k] = exp(a-k) * pow(a-k,k-0.5) / k1_factrl;
            k1_factrl *= -k;
        }
    }

    accum = coefs[0];
    for (k=1; k<A; k++) {
        accum += coefs[k]/(z+k);
    }
    accum *= exp(-(z+a)) * pow(z+a, z+0.5);
    return accum/z;
}
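double aa1;   /* exponent a-1, shared with f0 because Ifctn takes one argument */

/* integrand of the incomplete gamma function: t^(a-1) * exp(-t) */
double f0( double t)
{
    return pow(t, aa1)*exp(-t);
}

/* Q(a,x) = 1 - (integral of f0 from 0 to x) / Gamma(a).
 * Note: for a < 1 the integrand is unbounded at t = 0, so this routine
 * assumes a >= 1 (that is, at least 2 degrees of freedom). */
double GammaIncomplete_Q( double a, double x)
{
    double y, h = 1.5e-2;  /* approximate integration step size */
    int n;

    /* this cuts off the tail of the integration to speed things up */
    y = aa1 = a-1;
    while((f0(y) * (x-y) > 2.0e-8) && (y < x)) y += .4;
    if (y>x) y=x;

    n = (int)(y/h);
    if (n < 1) n = 1;  /* avoid dividing by zero in Simpson3_8 when y < h */
    return 1.0 - Simpson3_8( &f0, 0, y, n)/Gamma_Spouge(a);
}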
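/* chi-square distance of the counts in ds from a uniform distribution */
double chi2UniformDistance( double *ds, int dslen)
{
    double expected = 0.0;
    double sum = 0.0;
    int k;

    /* expected count per category is the mean of the observed counts */
    for (k=0; k<dslen; k++)
        expected += ds[k];
    expected /= dslen;

    for (k=0; k<dslen; k++) {
        double x = ds[k] - expected;
        sum += x*x;
    }
    return sum/expected;
}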
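/* probability of a distance at least this large under the uniform hypothesis */
double chi2Probability( int dof, double distance)
{
    return GammaIncomplete_Q( 0.5*dof, 0.5*distance);
}

/* returns 1 if uniformity is not rejected at the given significance level */
int chiIsUniform( double *dset, int dslen, double significance)
{
    int dof = dslen -1;
    double dist = chi2UniformDistance( dset, dslen);
    return chi2Probability( dof, dist ) > significance;
}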
int main(int argc, char **argv)
{
    double dset1[] = { 199809., 200665., 199607., 200270., 199649. };
    double dset2[] = { 522573., 244456., 139979., 71531., 21461. };
    double *dsets[] = { dset1, dset2 };
    int dslens[] = { 5, 5 };
    int k, l;
    double dist, prob;
    int dof;

    for (k=0; k<2; k++) {
        printf("Dataset: [ ");
        for(l=0;l<dslens[k]; l++)
            printf("%.0f, ", dsets[k][l]);
        printf("]\n");
        dist = chi2UniformDistance(dsets[k], dslens[k]);
        dof = dslens[k]-1;
        printf("dof: %d distance: %.4f", dof, dist);
        prob = chi2Probability( dof, dist );
        printf(" probability: %.6f", prob);
        printf(" uniform? %s\n", chiIsUniform(dsets[k], dslens[k], 0.05)? "Yes":"No");
    }
    return 0;
}