Hypothesis Testing

Hypothesis testing examines one or more sample populations for a statistical characteristic or interaction. The results are generally used to draw conclusions about the probability distributions of the sample populations.

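Hypothesis testing involves four steps:

1. Formulate the hypothesis.
2. Select and collect sample population data.
3. Apply an appropriate statistical test.
4. Interpret the test results.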

For example, suppose the FDA wishes to establish the effectiveness of a new drug in the treatment of a certain ailment. Researchers test the assumption that the drug is effective by administering it to a sample population and collecting data on the patients' health. Once the data are collected, an appropriate statistical test is selected and the results analyzed. If the interpretation of the test results suggests a statistically significant improvement in the patients' condition, the researchers conclude that the drug will be effective in general.

It is important to remember that a valid or successful test does not prove the proposed hypothesis. Only by disproving competing or opposing hypotheses can a given assumption's validity be statistically established.

One- and Two-sided Tests

In the above example, the only hypothesis tested was that the drug would significantly improve the condition of the patients receiving it. This type of test is called one-sided or one-tailed, because it is concerned with deviation from the norm in one direction only (in this case, improvement of the patients' condition). A test designed to detect either an improvement or an ill effect of the trial drug on the patient group would be called two-sided or two-tailed.
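As a minimal sketch of the distinction, the following computes a one-sided and a two-sided significance from the same T-statistic using IDL's T_PDF function, which returns the probability that a Student's t variable is less than or equal to a given value. The statistic and the degrees of freedom are hypothetical values chosen only for illustration.

```idl
; Hypothetical T-statistic and degrees of freedom (illustration only).
t  = 2.1D
df = 38

; One-sided: probability of a deviation at least this large
; in the positive direction only.
p_one = 1.0D - T_PDF(t, df)

; Two-sided: probability of a deviation at least this large
; in either direction.
p_two = 2.0D * (1.0D - T_PDF(ABS(t), df))

PRINT, p_one, p_two
```

Because the t distribution is symmetric, the two-sided value is twice the one-sided value, so a result that is significant in a one-sided test may fall short in the corresponding two-sided test.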

Parametric and Nonparametric Tests

Tests of hypothesis are usually classified into parametric and nonparametric methods. Parametric methods make assumptions about the underlying distribution from which sample populations are selected. Nonparametric methods make no assumptions about a sample population's distribution and are often based upon magnitude-based ranking, rather than actual measurement data. In many cases it is possible to replace a parametric test with a corresponding nonparametric test without significantly affecting the conclusion.

The following example demonstrates this by replacing the parametric T-means test with the nonparametric Wilcoxon Rank-Sum test to test the hypothesis that two sample populations have significantly different means of distribution.

Define two sample populations.

X = [257, 208, 296, 324, 240, 246, 267, 311, 324, 323, 263, $ 
     305, 270, 260, 251, 275, 288, 242, 304, 267] 
Y = [201,  56, 185, 221, 165, 161, 182, 239, 278, 243, 197, $ 
     271, 214, 216, 175, 192, 208, 150, 281, 196] 

Compute the T-statistic and its significance, using IDL's TM_TEST function, assuming that X and Y belong to Normal populations with the same variance.

PRINT, TM_TEST(X, Y) 

IDL prints:

5.52839  2.52455e-06 

The small value of the significance (2.52455e-06) indicates that X and Y have significantly different means.
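The equal-variance assumption can itself be examined. The following is a sketch rather than a prescribed procedure: it uses IDL's FV_TEST function to compare the variances of X and Y, and then repeats the T-means test with the UNEQUAL keyword, which drops the equal-variance assumption.

```idl
; Same sample populations as defined above.
X = [257, 208, 296, 324, 240, 246, 267, 311, 324, 323, 263, $
     305, 270, 260, 251, 275, 288, 242, 304, 267]
Y = [201,  56, 185, 221, 165, 161, 182, 239, 278, 243, 197, $
     271, 214, 216, 175, 192, 208, 150, 281, 196]

; F-statistic and its significance for the hypothesis
; that X and Y have the same variance.
PRINT, FV_TEST(X, Y)

; T-statistic and its significance without assuming equal variances.
PRINT, TM_TEST(X, Y, /UNEQUAL)
```

A small significance from FV_TEST suggests that the variances differ, in which case the UNEQUAL form of the test is the more cautious choice.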

Compute the Wilcoxon Rank-Sum statistic and its probability, using IDL's RS_TEST function, to test the hypothesis that X and Y have the same mean of distribution.

PRINT, RS_TEST(X, Y) 

IDL prints:

-4.26039  1.01924e-05 

The small value of the computed probability (1.01924e-05) requires the rejection of the proposed hypothesis and the conclusion that X and Y have significantly different means of distribution.

Each of IDL's 11 parametric and nonparametric hypothesis testing functions is based upon a well-known and widely accepted statistical test. Each function returns a two-element vector containing the statistic on which the test is based and its significance. The documentation for each function includes an example that demonstrates how to interpret the result.
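Because the functions share this return convention, interpreting a result usually reduces to comparing the second element of the vector with a chosen significance level. A minimal sketch, assuming a threshold of 0.05 (the variable names alpha and result, and the threshold itself, are illustrative and not part of IDL):

```idl
; Same sample populations as in the example above.
X = [257, 208, 296, 324, 240, 246, 267, 311, 324, 323, 263, $
     305, 270, 260, 251, 275, 288, 242, 304, 267]
Y = [201,  56, 185, 221, 165, 161, 182, 239, 278, 243, 197, $
     271, 214, 216, 175, 192, 208, 150, 281, 196]

; Assumed significance level; choose a value appropriate to the study.
alpha = 0.05

; result[0] is the test statistic, result[1] is its significance.
result = TM_TEST(X, Y)
PRINT, 'Statistic: ', result[0], '  Significance: ', result[1]

; Reject the hypothesis of equal means only when the
; significance falls below the chosen level.
IF result[1] LT alpha THEN $
  PRINT, 'Reject the hypothesis that X and Y have the same mean.' $
ELSE $
  PRINT, 'The data do not justify rejecting the hypothesis.'
```

The same pattern applies to the other test functions, although the hypothesis being tested differs from one function to the next.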

Routines for Hypothesis Testing

See "Hypothesis Testing" under the functional category "Mathematics" in the IDL Quick Reference for a brief description of the IDL routines for hypothesis testing. More detailed information is available in the IDL Reference Guide.