Statistical Inference for a Single Sample

Hypothesis testing forms an important part of statistical inference. As stated previously, statistical inference refers to the process of estimating results for the population based on measurements from a sample. In the next sections, statistical inference for a single sample is discussed briefly.

 


 

Inference on the Mean of a Population When the Variance Is Known

The test statistic used in this case is based on the standard normal distribution. If $\bar{X}$ is the calculated sample mean, then the standard normal test statistic is:

$$Z_0 = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \qquad (9)$$

where $\mu_0$ is the hypothesized population mean, $\sigma$ is the population standard deviation and $n$ is the sample size.

 

Example 3.2

Assume that an analyst wants to know if the mean of a population, $\mu$, is 100. The population variance, $\sigma^2$, is known to be 25. The hypothesis test may be conducted as follows:

 

  1. The statements for this hypothesis test may be formulated as:

    $$H_0: \mu = 100$$
    $$H_1: \mu \neq 100$$

    It is clear that this is a two-sided hypothesis. Thus the critical region will lie in both of the tails of the probability distribution.

  2. Assume that the analyst chooses a significance level of 0.05. Thus $\alpha = 0.05$. The significance level determines the critical values of the test statistic. Here the test statistic is based on the standard normal distribution. For the two-sided hypothesis these values are obtained as:

    $$z_{\alpha/2} = z_{0.025} = 1.96$$

    and

    $$-z_{\alpha/2} = -z_{0.025} = -1.96$$

    These values and the critical regions are shown in Figure 3.12. The analyst would fail to reject $H_0$ if the test statistic, $Z_0$, is such that:

    $$-z_{\alpha/2} \leq Z_0 \leq z_{\alpha/2}$$

    or

    $$-1.96 \leq Z_0 \leq 1.96$$

    Figure 3.12: Critical values and rejection region for Example 3.2 marked on the standard normal distribution.

     
  3. Next the analyst draws a random sample from the population. Assume that the sample size, $n$, is 25 and the sample mean is obtained as $\bar{X} = 103$.
  4. The value of the test statistic corresponding to the sample mean value of 103 is:

    $$Z_0 = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{103 - 100}{5 / \sqrt{25}} = 3$$

    Since this value does not lie in the acceptance region $-1.96 \leq Z_0 \leq 1.96$, we reject $H_0: \mu = 100$ at a significance level of 0.05.
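The steps of Example 3.2 can be sketched in a few lines of Python. This is only an illustration using the standard library's `statistics.NormalDist`; the function name `z_test_two_sided` and its parameter names are assumptions made for this sketch and do not come from any particular software package.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z test for a population mean when sigma is known.

    Hypothetical helper for illustration only.
    """
    # Test statistic from equation (9): (sample mean - mu0) / (sigma / sqrt(n))
    z0 = (sample_mean - mu0) / (sigma / sqrt(n))
    # Upper critical value z_{alpha/2}; the lower one is its negative
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject H0 when the statistic falls in either tail
    reject = abs(z0) > z_crit
    return z0, z_crit, reject

# Values from Example 3.2: sigma^2 = 25, so sigma = 5
z0, z_crit, reject = z_test_two_sided(sample_mean=103, mu0=100, sigma=5, n=25)
print(f"Z0 = {z0:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```

With the example's numbers the statistic falls outside the interval bounded by the two critical values, which matches the decision to reject the null hypothesis.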

 

P Value

In the previous example the null hypothesis was rejected at a significance level of 0.05. This statement does not provide information as to how far out the test statistic was into the critical region. At times it is necessary to know if the test statistic was just into the critical region or was far out into the region. This information can be provided by using the $p$ value.

 

The $p$ value is the probability of occurrence of the values of the test statistic that are either equal to the one obtained from the sample or more unfavorable to $H_0$ than the one obtained from the sample. It is the lowest significance level that would lead to the rejection of the null hypothesis, $H_0$, at the given value of the test statistic. The value of the test statistic is referred to as significant when $H_0$ is rejected. The $p$ value is the smallest $\alpha$ at which the statistic is significant and $H_0$ is rejected.

 

For instance, in the previous example the test statistic was obtained as $Z_0 = 3$. Values that are more unfavorable to $H_0$ in this case are values greater than 3. Then the required probability is the probability of getting a test statistic value either equal to or greater than 3 (this is abbreviated as $P(Z \geq 3)$). This probability is shown in Figure 3.13 as the dark shaded area on the right tail of the distribution and is equal to 0.0013 or 0.13% (i.e. $P(Z \geq 3) = 0.0013$). Since this is a two-sided test, values of -3 or less in the left tail are equally unfavorable to $H_0$, so the $p$ value is:

$$p \text{ value} = 2 \times P(Z \geq 3) = 2 \times 0.0013 = 0.0026$$

Therefore, the smallest $\alpha$ (corresponding to the test statistic value of 3) that would lead to the rejection of $H_0$ is 0.0026.

 

 


Figure 3.13: P value for Example 3.2.
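As a rough check on the two-sided p value, the tail probability can be computed directly from the standard normal distribution. The sketch below uses Python's standard library; the helper name `two_sided_p_value` is an assumption for illustration. Note that the unrounded tail probability is about 0.00135, so the computed result is close to 0.0027 rather than the 0.0026 obtained from the rounded value 0.0013.

```python
from statistics import NormalDist

def two_sided_p_value(z0):
    """Two-sided p value for a standard normal test statistic.

    Hypothetical helper for illustration only.
    """
    # Probability of a value at least as extreme in one tail
    upper_tail = 1 - NormalDist().cdf(abs(z0))
    # Both tails are equally unfavorable to H0 in a two-sided test
    return 2 * upper_tail

p_value = two_sided_p_value(3.0)
print(f"p value = {p_value:.4f}")
```

Any significance level at or above this value would lead to rejection of the null hypothesis for a test statistic of 3.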

 

Inference on Mean of a Population When Variance Is Unknown

When the variance, $\sigma^2$, of a population (that can be assumed to be normally distributed) is unknown, the sample variance, $S^2$, is used in its place in the calculation of the test statistic. The test statistic used in this case is based on the $t$ distribution and is obtained using the following relation:

$$T_0 = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \qquad (10)$$

The test statistic follows the $t$ distribution with $n - 1$ degrees of freedom.

 

Example 3.3

Assume that an analyst wants to know if the mean of a population, $\mu$, is less than 50 at a significance level of 0.05. A random sample drawn from the population gives the sample mean, $\bar{X}$, as 47.7 and the sample standard deviation, $S$, as 5. The sample size, $n$, is 25. The hypothesis test may be conducted as follows:

 

  1. The statements for this hypothesis test may be formulated as:

    $$H_0: \mu = 50$$
    $$H_1: \mu < 50$$

    It is clear that this is a one-sided hypothesis. Here the critical region will lie in the left tail of the probability distribution.

  2. Significance level, $\alpha = 0.05$. Here, the test statistic is based on the $t$ distribution. Thus, for the one-sided hypothesis the critical value is obtained as:

    $$-t_{\alpha, n-1} = -t_{0.05, 24} = -1.7109$$

    This value and the critical regions are shown in Figure 3.14. The analyst would fail to reject $H_0$ if the test statistic $T_0$ is such that:

    $$T_0 > -t_{0.05, 24}$$

  3. The value of the test statistic, $T_0$, corresponding to the given sample data is:

    $$T_0 = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} = \frac{47.7 - 50}{5 / \sqrt{25}} = -2.3$$

    Since $T_0$ is less than the critical value of -1.7109, $H_0$ is rejected and it is concluded that at a significance level of 0.05 the population mean is less than 50.

  4. P value

    In this case the $p$ value is the probability that the test statistic is either less than or equal to $-2.3$ (since values less than $-2.3$ are unfavorable to $H_0$). This probability is equal to 0.0152.

    Figure 3.14: Critical value and rejection region for Example 3.3 marked on the $t$ distribution.
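The calculations of Example 3.3 can be reproduced approximately with the Python standard library alone. Since the standard library has no $t$ distribution, the sketch below integrates the $t$ density numerically with Simpson's rule and finds the critical value by bisection. All function names (`t_pdf`, `t_cdf`, `t_lower_critical`) are assumptions made for this illustration, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3  # approximates P(0 <= T <= |t|)
    return 0.5 + half if t >= 0 else 0.5 - half

def t_lower_critical(alpha, df):
    """Value c with P(T <= c) = alpha, by bisection (assumes alpha < 0.5)."""
    lo, hi = -50.0, 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Values from Example 3.3
x_bar, mu0, s, n, alpha = 47.7, 50.0, 5.0, 25, 0.05
t0 = (x_bar - mu0) / (s / sqrt(n))       # test statistic, equation (10)
t_crit = t_lower_critical(alpha, n - 1)  # left-tail critical value
p_value = t_cdf(t0, n - 1)               # P(T <= t0) for the one-sided test
reject = t0 < t_crit
print(f"T0 = {t0:.2f}, critical value = {t_crit:.4f}, "
      f"p value = {p_value:.4f}, reject H0: {reject}")
```

The numerical integration is accurate enough to recover the critical value and p value quoted in the example to about four decimal places.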

     

Inference on Variance of a Normal Population

The test statistic used in this case is based on the Chi-Squared distribution. If $S^2$ is the calculated sample variance and $\sigma_0^2$ the hypothesized population variance, then the Chi-Squared test statistic is:

$$\chi_0^2 = \frac{(n - 1) S^2}{\sigma_0^2} \qquad (11)$$

The test statistic follows the Chi-Squared distribution with $n - 1$ degrees of freedom.

 

Example 3.4

Assume that an analyst wants to know if the variance of a population exceeds 1 at a significance level of 0.05. A random sample drawn from the population gives the sample variance as 2. The sample size, $n$, is 20. The hypothesis test may be conducted as follows:

 

  1. The statements for this hypothesis test may be formulated as:

    $$H_0: \sigma^2 = 1$$
    $$H_1: \sigma^2 > 1$$

    This is a one-sided hypothesis. Here the critical region will lie in the right tail of the probability distribution.

  2. Significance level, $\alpha = 0.05$. Here, the test statistic is based on the Chi-Squared distribution. Thus for the one-sided hypothesis the critical value is obtained as:

    $$\chi^2_{\alpha, n-1} = \chi^2_{0.05, 19} = 30.1435$$

    This value and the critical regions are shown in Figure 3.15. The analyst would fail to reject $H_0$ if the test statistic $\chi_0^2$ is such that:

    $$\chi_0^2 < \chi^2_{0.05, 19}$$

    Figure 3.15: Critical value and rejection region for Example 3.4 marked on the Chi-Squared distribution.

     
  3. The value of the test statistic corresponding to the given sample data is:

    $$\chi_0^2 = \frac{(n - 1) S^2}{\sigma_0^2} = \frac{(20 - 1) \times 2}{1} = 38$$

    Since $\chi_0^2$ is greater than the critical value of 30.1435, $H_0$ is rejected and it is concluded that at a significance level of 0.05 the population variance exceeds 1.

  4. P value

    In this case the $p$ value is the probability that the test statistic is greater than or equal to 38 (since values greater than 38 are unfavorable to $H_0$). This probability is determined to be 0.0059.
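Example 3.4 can be checked in the same spirit. The sketch below, again restricted to the Python standard library, evaluates the Chi-Squared upper tail through the power series for the regularized lower incomplete gamma function and finds the critical value by bisection. The function names (`gamma_p`, `chi2_sf`, `chi2_upper_critical`) are assumptions made for this illustration, not part of any specific package.

```python
from math import exp, lgamma, log

def gamma_p(a, x, max_terms=1000):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < total * 1e-15:  # series has converged
            break
    return total * exp(-x + a * log(x) - lgamma(a))

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a Chi-Squared variable."""
    return 1.0 - gamma_p(df / 2, x / 2)

def chi2_upper_critical(alpha, df):
    """Value c with P(X >= c) = alpha, found by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_sf(hi, df) > alpha:  # widen the bracket until it contains c
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Values from Example 3.4
s2, sigma0_sq, n, alpha = 2.0, 1.0, 20, 0.05
chi2_0 = (n - 1) * s2 / sigma0_sq            # test statistic, equation (11)
chi2_crit = chi2_upper_critical(alpha, n - 1)  # right-tail critical value
p_value = chi2_sf(chi2_0, n - 1)
reject = chi2_0 > chi2_crit
print(f"chi2_0 = {chi2_0:.1f}, critical value = {chi2_crit:.4f}, "
      f"p value = {p_value:.4f}, reject H0: {reject}")
```

The series form is used here only because it needs nothing beyond `math`; it converges for the moderate arguments in this example but is not intended as a general-purpose implementation.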

 

See Also:

 

Hypothesis Testing

Statistical Inference for Two Samples