IMSL_KOLMOGOROV2
The IMSL_KOLMOGOROV2 function performs a Kolmogorov-Smirnov two-sample test.
Note
This routine requires an IDL Advanced Math and Stats license. For more information, contact your ITT Visual Information Solutions sales or technical support representative.
Syntax
Result = IMSL_KOLMOGOROV2(x, y [, DIFFERENCES=variable] [, /DOUBLE] [, NMISSINGX=variable] [, NMISSINGY=variable])
Return Value
One-dimensional array of length 3 containing Z, p1, and p2, where Z is the test statistic, p1 is the one-sided p-value, and p2 is the two-sided p-value.
Arguments
x
One-dimensional array containing the observations from sample one.
y
One-dimensional array containing the observations from sample two.
Keywords
DIFFERENCES
Named variable into which a one-dimensional array containing Dn, Dn+, and Dn- is stored.
DOUBLE
If present and nonzero, double precision is used.
NMISSINGX
Named variable into which the number of missing values in the x sample is stored.
NMISSINGY
Named variable into which the number of missing values in the y sample is stored.
Discussion
The IMSL_KOLMOGOROV2 function computes Kolmogorov-Smirnov two-sample test statistics for testing that two continuous cumulative distribution functions (CDFs) are identical, based upon two random samples. One- or two-sided alternatives are allowed. If n_observations_x = N_ELEMENTS(x) and n_observations_y = N_ELEMENTS(y), then the exact p-values are computed for the two-sided test when n_observations_x * n_observations_y is less than 10^4.
Let Fn(x) denote the empirical CDF of the x sample and Gm(y) the empirical CDF of the y sample, where n = n_observations_x - NMISSINGX and m = n_observations_y - NMISSINGY, and let the corresponding population distribution functions be denoted by F(x) and G(y), respectively. Then, the hypotheses tested by IMSL_KOLMOGOROV2 are as follows:
$$H_0: F(x) = G(x) \qquad H_1: F(x) \ne G(x)$$
$$H_0: F(x) \le G(x) \qquad H_1: F(x) > G(x)$$
$$H_0: F(x) \ge G(x) \qquad H_1: F(x) < G(x)$$
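The test statistics are given as follows:
$$D_n = \max(D_n^+, D_n^-)$$
$$D_n^+ = \sup_x \left( F_n(x) - G_m(x) \right)$$
$$D_n^- = \sup_x \left( G_m(x) - F_n(x) \right)$$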
Asymptotically, the distribution of the statistic
$$Z = D_n \sqrt{\frac{m n}{m + n}}$$
(returned in Result(0)) converges to a distribution given by Smirnov (1939).
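As an independent cross-check of the statistics described above, the following minimal sketch computes D, D+, D-, and Z directly from two samples. It is written in plain Python rather than IDL so that it runs without the IMSL license, and it is not the IMSL implementation. The function name ks2_statistics is hypothetical. The sketch assumes the usual convention D+ = sup(Fn - Gm) and D- = sup(Gm - Fn), the scaling Z = D * sqrt(m*n/(m+n)), and that missing values are encoded as NaN.

```python
import math


def ks2_statistics(x, y):
    """Return (D, D_plus, D_minus, Z) for two samples, skipping NaN values.

    Hypothetical helper for illustration only; assumes NaN marks a missing
    observation and that each sample has at least one valid value.
    """
    xs = sorted(v for v in x if not math.isnan(v))
    ys = sorted(v for v in y if not math.isnan(v))
    n, m = len(xs), len(ys)

    d_plus = 0.0
    d_minus = 0.0
    i = j = 0
    # Walk the pooled values in increasing order. After consuming every
    # observation equal to the current value t, i/n and j/m are the two
    # empirical CDFs evaluated at t.
    while i < n or j < m:
        if j >= m or (i < n and xs[i] <= ys[j]):
            t = xs[i]
        else:
            t = ys[j]
        while i < n and xs[i] == t:
            i += 1
        while j < m and ys[j] == t:
            j += 1
        diff = i / n - j / m
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)

    d = max(d_plus, d_minus)
    z = d * math.sqrt(m * n / (m + n))
    return d, d_plus, d_minus, z


if __name__ == "__main__":
    # Every x value lies below every y value, so Fn reaches 1 while Gm is 0.
    print(ks2_statistics([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Because the empirical CDFs only change at observed values, evaluating the difference at each pooled observation is enough to find both suprema.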
Exact probabilities for the two-sided test are computed when m * n is less than or equal to 10^4, according to an algorithm given by Kim and Jennrich (1973). When m * n is greater than 10^4, the very good approximations given by Kim and Jennrich are used to obtain the two-sided p-values. The one-sided probability is taken as one half the two-sided probability. This is a very good approximation when the p-value is small (say, less than 0.10), but it is not very good for large p-values.
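For a rough sanity check of a two-sided p-value, the limiting distribution can be evaluated with the classical series P(Z > z) = 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 z^2). The Python sketch below does this and then halves the result to mimic the one-sided rule stated above. It is only an illustration: it implements neither the exact algorithm nor the approximations of Kim and Jennrich, so its values will generally differ from those returned by IMSL_KOLMOGOROV2. The name smirnov_tail is hypothetical.

```python
import math


def smirnov_tail(z):
    """Asymptotic two-sided tail probability P(Z > z) from the limiting series.

    Hypothetical helper; a large-sample approximation, not the Kim and
    Jennrich method used by the routine.
    """
    if z < 0.2:
        # The tail probability is indistinguishable from 1 this close to zero,
        # and the alternating series converges slowly there.
        return 1.0
    total = 0.0
    k = 1
    while True:
        term = math.exp(-2.0 * k * k * z * z)
        if term < 1e-16:
            break
        total += term if k % 2 == 1 else -term
        k += 1
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    two_sided = smirnov_tail(1.10227)
    one_sided = 0.5 * two_sided  # the halving rule described in the text
    print(two_sided, one_sided)
```

For moderate sample sizes the asymptotic value can be noticeably larger than the exact probability, which is one reason an exact computation is preferred when m * n is small.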
Example
The following example illustrates the IMSL_KOLMOGOROV2 routine with two randomly generated samples from a uniform(0,1) distribution. Since the two theoretical distributions are identical, we would not expect to reject the null hypothesis.
IMSL_RANDOMOPT, set = 123457
x = IMSL_RANDOM(100, /Uniform)
y = IMSL_RANDOM(60, /Uniform)
stats = IMSL_KOLMOGOROV2(x, y, DIFFERENCES = d, $
   NMISSINGX = nmx, NMISSINGY = nmy)
PRINT, 'D =', d(0)
PRINT, 'D+ =', d(1)
PRINT, 'D- =', d(2)
PRINT, 'Z =', stats(0)
PRINT, 'Prob greater D one sided =', stats(1)
PRINT, 'Prob greater D two sided =', stats(2)
PRINT, 'Missing X =', nmx
PRINT, 'Missing Y =', nmy

D = 0.180000
D+ = 0.180000
D- = 0.0100001
Z = 1.10227
Prob greater D one sided = 0.0720105
Prob greater D two sided = 0.144021
Missing X = 0
Missing Y = 0
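The printed values can be checked by hand. The short Python sketch below, which uses only the numbers shown in the output above, recomputes Z from D and the two sample sizes, assuming the scaling Z = D * sqrt(m*n/(m+n)), and confirms that the one-sided probability is one half of the two-sided probability.

```python
import math

# Values taken from the example output above.
n, m = 100, 60            # sample sizes (no missing values reported)
d = 0.180000              # D
two_sided = 0.144021      # Prob greater D two sided

# Assumed scaling of D into the Z statistic.
z = d * math.sqrt(m * n / (m + n))
one_sided = 0.5 * two_sided

print(round(z, 5))        # 1.10227, matching the printed Z
print(one_sided)          # matches the printed one-sided value 0.0720105
```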
Version History