Hypothesis Testing

A statistical hypothesis is a statement about the population under study or about the distribution of a quantity under consideration. The null hypothesis, $H_0$, is the hypothesis to be tested. It is a statement about a theory that is believed to be true but has not been proven. For instance, if a new product design is thought to perform consistently, regardless of the region of operation, then the null hypothesis may be stated as "$H_0$: New product design performance is not affected by region." Statements in $H_0$ always include exact values of the parameters under consideration, e.g. "$H_0$: The population mean is 100" or simply "$H_0: \mu = 100$."

 

Rejection of the null hypothesis, $H_0$, leads to the possibility that the alternative hypothesis, $H_1$, may be true. Given the previous null hypothesis, the alternative hypothesis may be "$H_1$: New product design performance is affected by region." In the case of the example regarding inference on the population mean, the alternative hypothesis may be stated as "$H_1$: The population mean is not 100" or simply "$H_1: \mu \neq 100$."

 

Hypothesis testing involves the calculation of a test statistic based on a random sample drawn from the population. The test statistic is then compared to the critical value(s) and used to make a decision about the null hypothesis. The critical values are set by the analyst.

 

The outcome of a hypothesis test is that we either "reject $H_0$" or we "fail to reject $H_0$." Failing to reject $H_0$ implies that we did not find sufficient evidence to reject $H_0$. It does not necessarily mean that there is a high probability that $H_0$ is true. As such, the terminology "accept $H_0$" is not preferred.

 

Example 3.1

 

Assume that an analyst wants to know whether or not the mean of a certain population is 100. The hypotheses for this test can be stated as follows:

$$\begin{aligned} H_0 &: \mu = 100 \\ H_1 &: \mu \neq 100 \end{aligned}$$

The analyst decides to use the sample mean as the test statistic for this test. The analyst further decides that if the sample mean lies between 98 and 102, the null hypothesis will not be rejected, i.e. the data will be treated as consistent with a population mean of 100. Thus, the critical values set for this test by the analyst are 98 and 102. It is also decided to draw a random sample of size 25 from the population.
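The decision rule in this example can be sketched in a few lines of Python. The function name `decide` and the example sample means are hypothetical, and treating the critical values themselves as part of the rejection region is an assumption (the text does not say which side the boundaries belong to, and for a continuous test statistic the choice has no practical effect).

```python
def decide(sample_mean, lower=98.0, upper=102.0):
    """Return the decision for the two-sided rule of Example 3.1.

    Assumption: a sample mean exactly equal to a critical value
    is counted as falling in the rejection region.
    """
    # Sample means strictly between the critical values do not lead to rejection.
    if lower < sample_mean < upper:
        return "fail to reject H0"
    return "reject H0"


# Hypothetical sample means, for illustration only.
print(decide(100.7))  # fail to reject H0
print(decide(102.4))  # reject H0
```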

 

Now assume that the true population mean is 100 (i.e. $\mu = 100$) and the true population standard deviation is 5 (i.e. $\sigma = 5$). This information is not known to the analyst. By the Central Limit Theorem, the test statistic (sample mean) approximately follows a normal distribution with a mean equal to the population mean, $\mu$, and a standard deviation of $\sigma/\sqrt{n}$, where $n$ is the sample size. Therefore, the distribution of the test statistic has a mean of 100 and a standard deviation of $5/\sqrt{25} = 1$. This distribution is shown in Figure 3.9.
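As a rough check on the sampling distribution described above, the following sketch computes the standard deviation of the sample mean and compares it with a small simulation. It assumes the population itself is normal, which the text does not state; the function names, the number of replications and the seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Draw `replications` samples of size n from an assumed normal
    population and return the list of their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


se = standard_error(sigma=5.0, n=25)
means = simulate_sample_means(mu=100.0, sigma=5.0, n=25, replications=10_000)

print(f"theoretical standard error: {se}")
print(f"simulated mean of sample means: {statistics.fmean(means):.3f}")
print(f"simulated std. dev. of sample means: {statistics.stdev(means):.3f}")
```

The simulated values should land close to 100 and 1 respectively, though the exact digits depend on the seed.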

 


Figure 3.9: Acceptance region and critical regions for the hypothesis test in Example 3.1.

 

The unshaded area in the figure bounded by the critical values of 98 and 102 is called the acceptance region. The area of the acceptance region gives the probability that a random sample drawn from the population would have a sample mean that lies between 98 and 102. Therefore, this is the region that leads to the conclusion "fail to reject $H_0$". On the other hand, the shaded area gives the probability that the sample mean obtained from the random sample lies outside the critical values. In other words, it gives the probability of rejecting the null hypothesis when the true mean is 100. The shaded area is referred to as the critical region or the rejection region. Rejection of the null hypothesis when it is true is referred to as a type I error. Since the critical values lie two standard deviations on either side of the mean of the test statistic, there is about a 4.55% chance of making a type I error in this hypothesis test. This percentage is called the significance level of the test and is denoted by $\alpha$. Here $\alpha \approx 0.0455$ or 4.55% (the area of the shaded region in the figure). The value of $\alpha$ is set by the analyst when they choose the critical values.
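The significance level for this example can be reproduced numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module and assumes the normal sampling distribution described earlier (mean 100, standard deviation 5/√25 = 1); the function name `type_i_error` is hypothetical. The result should agree with the significance level quoted above, apart from possible differences in the last digit due to rounding of table values.

```python
from statistics import NormalDist


def type_i_error(mu0, sigma, n, lower, upper):
    """Probability that the sample mean falls outside (lower, upper)
    when the true population mean equals mu0."""
    # Sampling distribution of the sample mean under H0.
    sampling = NormalDist(mu=mu0, sigma=sigma / n ** 0.5)
    left_tail = sampling.cdf(lower)
    right_tail = 1.0 - sampling.cdf(upper)
    return left_tail + right_tail


alpha = type_i_error(mu0=100.0, sigma=5.0, n=25, lower=98.0, upper=102.0)
print(f"alpha = {alpha:.4f}")  # alpha = 0.0455
```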

 

A type II error is also defined in hypothesis testing. This error occurs when the analyst fails to reject the null hypothesis when it is actually false. Such an error would occur if the value of the sample mean obtained is in the acceptance region bounded by 98 and 102 even though the true population mean is not 100. The probability of occurrence of a type II error is denoted by $\beta$.
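Unlike the type I error, the probability of a type II error depends on what the true population mean actually is. The sketch below evaluates it for a hypothetical true mean of 103, which is not a value from the text, while keeping the acceptance region of 98 to 102, the population standard deviation of 5 and the sample size of 25 from the example. The function name `type_ii_error` is likewise an assumption.

```python
from statistics import NormalDist


def type_ii_error(true_mean, sigma, n, lower, upper):
    """Probability that the sample mean falls inside (lower, upper),
    so that H0 is not rejected, when the population mean is true_mean."""
    # Sampling distribution of the sample mean under the true mean.
    sampling = NormalDist(mu=true_mean, sigma=sigma / n ** 0.5)
    return sampling.cdf(upper) - sampling.cdf(lower)


# Hypothetical alternative: the true mean is 103 rather than 100.
beta = type_ii_error(true_mean=103.0, sigma=5.0, n=25, lower=98.0, upper=102.0)
print(f"beta = {beta:.4f}")  # beta = 0.1587
```

Moving the hypothetical true mean further from 100 makes this probability smaller, while a true mean close to 100 makes it larger.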

Two-Sided and One-Sided Hypotheses

As seen in the previous section, the critical region for the hypothesis test is split into two parts, with equal areas in each tail of the distribution of the test statistic. Such a hypothesis, in which the values for which we can reject $H_0$ are in both tails of the probability distribution, is called a two-sided hypothesis.
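The relationship can also be read in the other direction: given a significance level, the two critical values follow from placing half of it in each tail. The sketch below assumes a normal test statistic with mean 100 and standard deviation 1, as in Example 3.1, and uses a hypothetical helper named `two_sided_critical_values`. Feeding it the significance level that corresponds to cut-offs two standard deviations from the mean should recover 98 and 102.

```python
from statistics import NormalDist


def two_sided_critical_values(mu0, standard_error, alpha):
    """Critical values that leave an area of alpha / 2 in each tail."""
    sampling = NormalDist(mu=mu0, sigma=standard_error)
    lower = sampling.inv_cdf(alpha / 2.0)
    upper = sampling.inv_cdf(1.0 - alpha / 2.0)
    return lower, upper


# Significance level implied by cut-offs two standard deviations from the mean.
alpha = 2.0 * (1.0 - NormalDist().cdf(2.0))
lower, upper = two_sided_critical_values(mu0=100.0, standard_error=1.0, alpha=alpha)
print(f"alpha = {alpha:.4f}, critical values = ({lower:.2f}, {upper:.2f})")
# alpha = 0.0455, critical values = (98.00, 102.00)
```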

 

The hypothesis for which the critical region lies only in one tail of the probability distribution is called a one-sided hypothesis. For instance, consider the following hypothesis test:

$$\begin{aligned} H_0 &: \mu = 100 \\ H_1 &: \mu > 100 \end{aligned}$$

This is an example of a one-sided hypothesis. Here the critical region lies entirely in the right tail of the distribution as shown in Figure 3.10.

The hypothesis test may also be set up as follows:

$$\begin{aligned} H_0 &: \mu = 100 \\ H_1 &: \mu < 100 \end{aligned}$$

This is also a one-sided hypothesis. Here the critical region lies entirely in the left tail of the distribution as shown in Figure 3.11.

 


Figure 3.10: One-sided hypothesis where the critical region lies in the right tail.

 
 


Figure 3.11: One-sided hypothesis where the critical region lies in the left tail.
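For a one-sided hypothesis, the whole significance level sits in a single tail, so there is only one critical value. The sketch below is an illustration under stated assumptions: it reuses the sampling distribution of Example 3.1 (mean 100, standard deviation 1), picks a significance level of 0.05 that does not come from the text, and uses a hypothetical helper named `one_sided_critical_value`.

```python
from statistics import NormalDist


def one_sided_critical_value(mu0, standard_error, alpha, tail):
    """Critical value that places the whole area alpha in one tail.

    tail="right" corresponds to an alternative of the form mean > mu0,
    tail="left" to an alternative of the form mean < mu0.
    """
    sampling = NormalDist(mu=mu0, sigma=standard_error)
    if tail == "right":
        return sampling.inv_cdf(1.0 - alpha)
    if tail == "left":
        return sampling.inv_cdf(alpha)
    raise ValueError("tail must be 'right' or 'left'")


# Assumed significance level of 0.05, for illustration only.
right = one_sided_critical_value(100.0, 1.0, 0.05, "right")
left = one_sided_critical_value(100.0, 1.0, 0.05, "left")
print(f"right-tail critical value: {right:.3f}")  # 101.645
print(f"left-tail critical value: {left:.3f}")    # 98.355
```

With the right-tail test, the null hypothesis would be rejected only for sample means above the first value; with the left-tail test, only for sample means below the second.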

 

See Also:

 

F Distribution

Statistical Inference for a Single Sample