Hypothesis Testing

 

Decision making starts by identifying a question of concern and then formulating two hypotheses about it.

A hypothesis is a statement that something is true.

One hypothesis is called the null hypothesis, $H_0$, and the other is called the alternative hypothesis, $H_1$.

A statistical hypothesis test is a procedure that uses a test statistic to decide between the two hypotheses.

 

The possible results are as follows:

                          $H_0$ is true       $H_1$ is true
Fail to reject $H_0$      Correct decision    Type II error
Reject $H_0$              Type I error        Correct decision

 

 

 

                       

 

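We need to find a region of sample outcomes to serve as the rejection region (also called the critical region).

Use $\alpha$, the probability of a Type I error, to find the rejection region; use $\beta$, the probability of a Type II error, to find the sample size needed ($\beta$ is not used in Stat 281).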

 

 

Inference for One Population: 

 

The population should have a normal distribution in all of the following cases.

The significance level for hypothesis testing is $\alpha$ and the confidence level is $1 - \alpha$.

 

Inference for population mean

 

In Stat 281, the constant c in the hypotheses is always zero.

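1)      Population mean $\mu$ is unknown and population variance $\sigma^2$ is known

Use the normal curve table.  (This is also called the Z test.)

Below, $\mu_0$ is the hypothesized value of the mean, $\bar{x}$ is the sample mean of a random sample of size $n$, $z_{\alpha}$ is the value from the normal curve table with upper-tail area $\alpha$, and $z_0$ is the observed value of the test statistic.

a) $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$  (two-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ and the critical region is $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$

      p-value for the test is $2P(Z > |z_0|)$

b) $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$  (left-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ and the critical region is $Z < -z_{\alpha}$

      p-value for the test is $P(Z < z_0)$

c) $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$  (right-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ and the critical region is $Z > z_{\alpha}$

      p-value for the test is $P(Z > z_0)$

$100(1 - \alpha)\%$ confidence interval for the population mean $\mu$: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$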
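The three Z test cases and the confidence interval can be sketched in Python. This is a minimal illustration using only the standard library; the function names and the `alternative` labels are hypothetical choices made for this sketch, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def one_sample_z_test(xbar, mu0, sigma, n, alternative="two-sided", alpha=0.05):
    """Return (z0, p_value, reject) for H0: mu = mu0 with sigma known."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "two-sided":      # H1: mu != mu0
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z0)))
        reject = abs(z0) > STD_NORMAL.inv_cdf(1 - alpha / 2)
    elif alternative == "less":         # H1: mu < mu0
        p_value = STD_NORMAL.cdf(z0)
        reject = z0 < -STD_NORMAL.inv_cdf(1 - alpha)
    elif alternative == "greater":      # H1: mu > mu0
        p_value = 1 - STD_NORMAL.cdf(z0)
        reject = z0 > STD_NORMAL.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z0, p_value, reject


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for mu with sigma known."""
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width
```

Here `inv_cdf(1 - alpha)` plays the role of the table value $z_{\alpha}$, so the rejection rule and the p-value always lead to the same decision.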

 

 

 

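2)      Population mean $\mu$ is unknown and population variance $\sigma^2$ is unknown

For sample size $n < 30$, use the t table.  (This is also called the t test.)

Below, $\mu_0$ is the hypothesized value of the mean, $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $t_{\alpha, n-1}$ is the value from the t table with upper-tail area $\alpha$ and $n - 1$ degrees of freedom, and $t_0$ is the observed value of the test statistic.

a) $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$  (two-tailed alternative hypothesis)

      Test statistic: $T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ and the critical region is $T < -t_{\alpha/2, n-1}$ or $T > t_{\alpha/2, n-1}$

      p-value for the test is $2P(T > |t_0|)$

b) $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$  (left-tailed alternative hypothesis)

      Test statistic: $T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ and the critical region is $T < -t_{\alpha, n-1}$

      p-value for the test is $P(T < t_0)$

c) $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$  (right-tailed alternative hypothesis)

      Test statistic: $T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ and the critical region is $T > t_{\alpha, n-1}$

      p-value for the test is $P(T > t_0)$

$100(1 - \alpha)\%$ confidence interval for the population mean $\mu$: $\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}}$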
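A small Python sketch of the t test bookkeeping follows. The Python standard library has no t distribution, so this sketch assumes the critical value is read from the t table and passed in as `t_crit`; all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t0, df): observed t statistic and its degrees of freedom."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    t0 = (mean(sample) - mu0) / (s / sqrt(n))
    return t0, n - 1


def reject_h0(t0, t_crit, alternative="two-sided"):
    """Apply the critical-region rule.

    t_crit is the table value: t_{alpha/2, n-1} for a two-sided test,
    t_{alpha, n-1} for a one-sided test.
    """
    if alternative == "two-sided":
        return abs(t0) > t_crit
    if alternative == "less":
        return t0 < -t_crit
    if alternative == "greater":
        return t0 > t_crit
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


def t_confidence_interval(sample, t_crit):
    """Confidence interval for mu; t_crit must be t_{alpha/2, n-1}."""
    half_width = t_crit * stdev(sample) / sqrt(len(sample))
    return mean(sample) - half_width, mean(sample) + half_width
```

Keeping the table lookup outside the code mirrors how the test is carried out by hand: compute $t_0$, look up the critical value, then compare.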

 

 

 

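For sample size $n \geq 30$, we can use the Z test procedures.

If $\sigma$ is unknown, then it should be replaced by the sample standard deviation $s$.

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$, where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ ------ Formula (2.6)  page 86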

 

Inference for population variance

 

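3)      For population variance $\sigma^2$

Below, $\sigma_0^2$ is the hypothesized value of the variance, $s^2$ is the sample variance from a random sample of size $n$, and $\chi^2_{\alpha, n-1}$ is the value from the chi-square table with upper-tail area $\alpha$ and $n - 1$ degrees of freedom.

a) $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 \neq \sigma_0^2$  (two-tailed alternative hypothesis)

      Test statistic: $\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$ and the critical region is $\chi^2 < \chi^2_{1-\alpha/2, n-1}$ or $\chi^2 > \chi^2_{\alpha/2, n-1}$

      p-value for the test is $2 \min\{P(\chi^2 < \chi^2_0), P(\chi^2 > \chi^2_0)\}$, where $\chi^2_0$ is the observed value of the test statistic.

b) $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 < \sigma_0^2$  (left-tailed alternative hypothesis)

      Test statistic: $\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$ and the critical region is $\chi^2 < \chi^2_{1-\alpha, n-1}$

      p-value for the test is $P(\chi^2 < \chi^2_0)$, where $\chi^2_0$ is the observed value of the test statistic.

c) $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 > \sigma_0^2$  (right-tailed alternative hypothesis)

      Test statistic: $\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$ and the critical region is $\chi^2 > \chi^2_{\alpha, n-1}$

      p-value for the test is $P(\chi^2 > \chi^2_0)$, where $\chi^2_0$ is the observed value of the test statistic.

$100(1 - \alpha)\%$ confidence interval for the population variance $\sigma^2$: $\left( \frac{(n - 1) s^2}{\chi^2_{\alpha/2, n-1}}, \frac{(n - 1) s^2}{\chi^2_{1-\alpha/2, n-1}} \right)$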
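The chi-square p-values can be approximated without a table. The sketch below is one possible approach, not the course's method: `chi2_cdf` is a hand-written helper (an assumption of this example) that sums the power series of the regularized lower incomplete gamma function. In practice a chi-square table or a statistics library would normally be used instead.

```python
from math import exp, lgamma, log
from statistics import variance


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    # Series: gamma(a, y) = y^a e^(-y) * sum_k y^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * exp(a * log(y) - y - lgamma(a))


def variance_test(sample, sigma0_sq, alternative="two-sided"):
    """Return (chi2_0, p_value) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    chi2_0 = (n - 1) * variance(sample) / sigma0_sq  # variance uses n - 1
    lower = chi2_cdf(chi2_0, n - 1)   # P(chi2 < chi2_0)
    upper = 1.0 - lower               # P(chi2 > chi2_0)
    if alternative == "two-sided":
        return chi2_0, 2 * min(lower, upper)
    if alternative == "less":
        return chi2_0, lower
    if alternative == "greater":
        return chi2_0, upper
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
```

The series is adequate for the moderate degrees of freedom met in hand calculations; it is not tuned for very large arguments.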

 

 

 

Inference for one population proportion

 

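            Let $p$ be the population proportion, which is unknown, and let $\hat{p}$ be the sample proportion

            with sample size $n$.

            Assume that the sample size is large enough.

            (A common rule of thumb is that $np$ and $n(1 - p)$ are both at least 5.)

Below, $p_0$ is the hypothesized value of the proportion and $z_0$ is the observed value of the test statistic.

a) $H_0: p = p_0$ vs $H_1: p \neq p_0$  (two-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$ and the critical region is $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$

      p-value for the test is $2P(Z > |z_0|)$

b) $H_0: p = p_0$ vs $H_1: p < p_0$  (left-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$ and the critical region is $Z < -z_{\alpha}$

      p-value for the test is $P(Z < z_0)$

c) $H_0: p = p_0$ vs $H_1: p > p_0$  (right-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$ and the critical region is $Z > z_{\alpha}$

      p-value for the test is $P(Z > z_0)$

$100(1 - \alpha)\%$ confidence interval for $p$ is: $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}$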
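As a rough sketch, the proportion test can be coded with the standard library alone. Note the two different standard errors: the test uses the hypothesized $p_0$, while the confidence interval uses the sample proportion. The function names and the `successes` argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z0, p_value) for H0: p = p0, assuming n is large enough."""
    p_hat = successes / n
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard error under H0
    if alternative == "two-sided":
        return z0, 2 * (1 - STD_NORMAL.cdf(abs(z0)))
    if alternative == "less":
        return z0, STD_NORMAL.cdf(z0)
    if alternative == "greater":
        return z0, 1 - STD_NORMAL.cdf(z0)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


def proportion_confidence_interval(successes, n, alpha=0.05):
    """100(1 - alpha)% interval for p, using p_hat in the standard error."""
    p_hat = successes / n
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width
```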

 

 

 

Inference for Two Populations

 

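Both populations should have normal distributions in all of the following cases.

The significance level for hypothesis testing is $\alpha$ and the confidence level is $1 - \alpha$.  $z_{\alpha}$ and $z_{\alpha/2}$ are values from the normal curve table.

$\bar{x}_1$ is the sample mean of a random sample of size $n_1$ from the population which has mean $\mu_1$ and variance $\sigma_1^2$.  $\bar{x}_2$ is the sample mean of a random sample of size $n_2$ from the population which has mean $\mu_2$ and variance $\sigma_2^2$.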

 

Inference for two population means

 

Independent test for two population means:

 

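In Stat 281, $c$ is always zero.

1)      Population means $\mu_1, \mu_2$ are unknown and population variances $\sigma_1^2, \sigma_2^2$ are known.  Use the normal curve table.

Below, $z_0$ is the observed value of the test statistic.

a) $H_0: \mu_1 - \mu_2 = c$ vs $H_1: \mu_1 - \mu_2 \neq c$  (two-tailed alternative hypothesis)

      Test statistic: $Z = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$ and the critical region is $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$

      p-value for the test is $2P(Z > |z_0|)$

b) $H_0: \mu_1 - \mu_2 = c$ vs $H_1: \mu_1 - \mu_2 < c$  (left-tailed alternative hypothesis)

      Test statistic: $Z = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$ and the critical region is $Z < -z_{\alpha}$

      p-value for the test is $P(Z < z_0)$

c) $H_0: \mu_1 - \mu_2 = c$ vs $H_1: \mu_1 - \mu_2 > c$  (right-tailed alternative hypothesis)

      Test statistic: $Z = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$ and the critical region is $Z > z_{\alpha}$

      p-value for the test is $P(Z > z_0)$

$100(1 - \alpha)\%$ confidence interval for the difference of population means $\mu_1 - \mu_2$: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$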
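For the known-variance case, a short sketch under the same conventions (standard library only; the function name and argument order are hypothetical) might look like this. It returns all three p-values so the caller can pick the one matching the alternative hypothesis.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sample_z(xbar1, xbar2, var1, var2, n1, n2, c=0.0):
    """Z statistic and p-values for H0: mu1 - mu2 = c with known variances.

    Returns (z0, p_two_sided, p_left, p_right).
    """
    standard_error = sqrt(var1 / n1 + var2 / n2)
    z0 = (xbar1 - xbar2 - c) / standard_error
    p_left = STD_NORMAL.cdf(z0)          # H1: mu1 - mu2 < c
    p_right = 1 - p_left                 # H1: mu1 - mu2 > c
    p_two_sided = 2 * min(p_left, p_right)
    return z0, p_two_sided, p_left, p_right
```

For the symmetric normal distribution, `2 * min(p_left, p_right)` equals $2P(Z > |z_0|)$.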

 

 

 

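2)      Population means are unknown and population variances are unknown

The population variances $\sigma_1^2$ and $\sigma_2^2$ will be replaced by the corresponding sample variances $s_1^2$ and $s_2^2$, respectively.

i) For small sample sizes ($n_1 < 30$ or $n_2 < 30$), use the t table.  (This is called the independent t test for the means of two populations.)

a) When the populations have different variances.  Follow all the cases from 1) of Inference for Two Populations with the test statistic replaced by $T = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$.  That is, the variance $\sigma_1^2$ is replaced by $s_1^2$, and the variance $\sigma_2^2$ is replaced by $s_2^2$, in the test statistic from 1).

The sampling distribution of $T$ is approximately a t distribution with a complicated number of degrees of freedom.  A simple approximation for the degrees of freedom, $\nu$, is the smaller of $n_1 - 1$ and $n_2 - 1$, which gives a conservative result.

Therefore, use the t table with degrees of freedom $\nu$ equal to the smaller of $n_1 - 1$ and $n_2 - 1$ to find $t_{\alpha, \nu}$ or $t_{\alpha/2, \nu}$ to replace $z_{\alpha}$ and $z_{\alpha/2}$.

b) When both population variances are equal ($\sigma_1^2 = \sigma_2^2$).

The pooled estimator $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ will be used to replace the common variance.  Then the test statistic is $T = \frac{(\bar{x}_1 - \bar{x}_2) - c}{s_p \sqrt{1/n_1 + 1/n_2}}$, whose sampling distribution is a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Therefore, use the t table with $\nu = n_1 + n_2 - 2$ degrees of freedom to find $t_{\alpha, \nu}$ or $t_{\alpha/2, \nu}$ to replace $z_{\alpha}$ and $z_{\alpha/2}$.

ii) For large sample sizes ($n_1 \geq 30$ and $n_2 \geq 30$), use the normal curve table.

Follow all the cases from 1) of Inference for Two Populations with the test statistic $Z = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$.  That is, the unknown variances are replaced by the corresponding sample variances.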
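The two small-sample variants differ only in the standard error and the degrees of freedom, which a short sketch can make explicit. The function name and the `equal_variances` flag are hypothetical; the unequal-variance branch uses the conservative degrees of freedom described above rather than the more complicated exact formula.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(sample1, sample2, c=0.0, equal_variances=False):
    """Return (t0, df) for H0: mu1 - mu2 = c with unknown variances."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1), variance(sample2)  # sample variances (n - 1)
    difference = mean(sample1) - mean(sample2) - c
    if equal_variances:
        # Pooled estimator of the common variance
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        standard_error = sqrt(v1 / n1 + v2 / n2)
        df = min(n1 - 1, n2 - 1)  # conservative approximation
    return difference / standard_error, df
```

The returned `df` tells which row of the t table to consult for the critical value.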

 

 

Paired test for means from two populations

 

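Let $(x_1, y_1), \ldots, (x_n, y_n)$ be a paired sample from $n$ experimental units.

For example, $x_1, \ldots, x_n$ is the sample from the experimental units before the treatment is applied and $y_1, \ldots, y_n$ is the sample from the same experimental units after the treatment is applied.  The main purpose is to compare the population mean after treatment with the population mean before treatment.

Let the differences $d_i = x_i - y_i$, $i = 1, \ldots, n$, be treated as a one-sample situation, with sample mean $\bar{d}$ and hypothesized mean difference $c$.

a)      When $n \geq 30$, use $Z = \frac{\bar{d} - c}{s_d / \sqrt{n}}$ as the test statistic, where $s_d$ is the sample standard deviation of the sample $d_1, \ldots, d_n$.  Refer to the Z test for one-population inference above.

b)      When $n < 30$, use $T = \frac{\bar{d} - c}{s_d / \sqrt{n}}$ as the test statistic, where $s_d$ is the sample standard deviation of the sample $d_1, \ldots, d_n$.  Refer to the t test for one-population inference above, with $n - 1$ degrees of freedom.  (This is also called the paired t test.)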
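A paired test reduces to a one-sample computation on the differences, as the following sketch shows. The direction of the difference (first measurement minus second) and the function name are assumptions of this example; whichever direction is chosen must match the way the alternative hypothesis is written.

```python
from math import sqrt
from statistics import mean, stdev


def paired_statistic(first, second, c=0.0):
    """Return (statistic, df) computed from the paired differences.

    The same number serves as the Z statistic when n >= 30 and as the
    t statistic with n - 1 degrees of freedom when n < 30.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [x - y for x, y in zip(first, second)]
    n = len(differences)
    s_d = stdev(differences)  # sample standard deviation of the differences
    return (mean(differences) - c) / (s_d / sqrt(n)), n - 1
```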

 

 

 

 

 

Inference for both population variances

 

Use F-distribution table. 

 

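3)  For population variances $\sigma_1^2$ and $\sigma_2^2$

Below, $s_1^2$ and $s_2^2$ are the sample variances, $F_{\alpha, \nu_1, \nu_2}$ is the value from the F-distribution table with upper-tail area $\alpha$, numerator degrees of freedom $\nu_1$ and denominator degrees of freedom $\nu_2$, and $f_0$ is the observed value of the test statistic.

a) $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_1: \sigma_1^2 \neq \sigma_2^2$  (two-tailed alternative hypothesis)

      Test statistic: $F = \frac{s_1^2}{s_2^2}$ and the critical region is $F < F_{1-\alpha/2, n_1-1, n_2-1}$ or $F > F_{\alpha/2, n_1-1, n_2-1}$

      p-value for the test is $2 \min\{P(F < f_0), P(F > f_0)\}$

b) $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_1: \sigma_1^2 < \sigma_2^2$  (left-tailed alternative hypothesis)

      Test statistic: $F = \frac{s_1^2}{s_2^2}$ and the critical region is $F < F_{1-\alpha, n_1-1, n_2-1}$

      p-value for the test is $P(F < f_0)$

c) $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_1: \sigma_1^2 > \sigma_2^2$  (right-tailed alternative hypothesis)

      Test statistic: $F = \frac{s_1^2}{s_2^2}$ and the critical region is $F > F_{\alpha, n_1-1, n_2-1}$

      p-value for the test is $P(F > f_0)$

$100(1 - \alpha)\%$ confidence interval for $\sigma_1^2 / \sigma_2^2$: $\left( \frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2, n_1-1, n_2-1}}, \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2, n_2-1, n_1-1} \right)$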
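F tables usually list only upper-tail values, so the lower critical value is obtained from the identity $F_{1-\alpha, \nu_1, \nu_2} = 1 / F_{\alpha, \nu_2, \nu_1}$. The sketch below assumes both upper-tail table values are looked up by hand and passed in; the function and parameter names are hypothetical.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (f0, df1, df2) for the ratio of sample variances s1^2 / s2^2."""
    f0 = variance(sample1) / variance(sample2)
    return f0, len(sample1) - 1, len(sample2) - 1


def f_test_two_sided(f0, f_upper_12, f_upper_21):
    """Two-tailed decision for H0: sigma1^2 = sigma2^2.

    f_upper_12 is the table value F_{alpha/2, n1-1, n2-1};
    f_upper_21 is the table value F_{alpha/2, n2-1, n1-1}.
    """
    lower_critical = 1.0 / f_upper_21  # equals F_{1-alpha/2, n1-1, n2-1}
    return f0 < lower_critical or f0 > f_upper_12


def variance_ratio_interval(f0, f_upper_12, f_upper_21):
    """Confidence interval for sigma1^2 / sigma2^2 from the same table values."""
    return f0 / f_upper_12, f0 * f_upper_21
```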

 

 

 

 

Inference for proportions of two populations

 

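            Let $p_1$ be the first population proportion and $p_2$ be the second population

            proportion.

            Select a sample of size $n_1$ from the first population; the sample proportion is $\hat{p}_1$.

            Select a sample of size $n_2$ from the second population; the sample proportion is $\hat{p}_2$.

            Assume that the sample sizes $n_1$ and $n_2$ are large enough.

Below, $z_0$ is the observed value of the test statistic.

a) $H_0: p_1 = p_2$ vs $H_1: p_1 \neq p_2$  (two-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) (1/n_1 + 1/n_2)}}$ where $\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$

and the critical region is $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$

      p-value for the test is $2P(Z > |z_0|)$

b) $H_0: p_1 = p_2$ vs $H_1: p_1 < p_2$  (left-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) (1/n_1 + 1/n_2)}}$ where $\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$

and the critical region is $Z < -z_{\alpha}$

      p-value for the test is $P(Z < z_0)$

c) $H_0: p_1 = p_2$ vs $H_1: p_1 > p_2$  (right-tailed alternative hypothesis)

      Test statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) (1/n_1 + 1/n_2)}}$ where $\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$

and the critical region is $Z > z_{\alpha}$

      p-value for the test is $P(Z > z_0)$

$100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$ is: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$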
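The pooled proportion is the step most easily mistaken in hand calculation, so a brief sketch may help. It works from success counts (the hypothetical arguments `x1` and `x2`), in which case the pooled estimate is simply total successes over total sample size. Only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z0, p_two_sided, p_left, p_right) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # same as (n1*p1_hat + n2*p2_hat)/(n1+n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / standard_error
    p_left = STD_NORMAL.cdf(z0)      # H1: p1 < p2
    p_right = 1 - p_left             # H1: p1 > p2
    return z0, 2 * min(p_left, p_right), p_left, p_right


def proportion_difference_interval(x1, n1, x2, n2, alpha=0.05):
    """Interval for p1 - p2; uses the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * standard_error
    return (p1_hat - p2_hat) - half_width, (p1_hat - p2_hat) + half_width
```

The test pools because $H_0$ asserts a common proportion; the interval does not, because it makes no such assumption.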

 

 

 

 

 

 

 

 

 

The following example illustrates the procedure for hypothesis testing.

Example. An automobile manufacturer that wishes to advertise that one of its models achieves 30 miles per gallon (mpg) decides to carry out a fuel efficiency test. Six non-professional drivers are selected, and each one drives a car from Phoenix to Los Angeles. The resulting figures, in mpg, are 27.2, 29.3, 31.2, 28.4, 30.3, 29.6. Assuming that fuel efficiency (mpg) is normally distributed, do the data contradict the claim that the true average fuel efficiency is (at least) 30 at a significance level of 0.05?

 

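Solution:   1) Set up hypotheses

          $H_0: \mu \geq 30$ (Claim: at least 30 mpg)   vs   $H_1: \mu < 30$

2)      Select test statistic:  Since the sample size is 6 < 30, use the t test.

                Test statistic: $T = \frac{\bar{x} - 30}{s / \sqrt{6}}$.  Rejection region (critical region): $T < -t_{0.05, 5} = -2.015$

3)      Evaluate the test statistic value, $t_0$, based on the data on hand.

                    $\bar{x} \approx 29.33$, $s \approx 1.408$, so $t_0 = \frac{29.33 - 30}{1.408 / \sqrt{6}} \approx -1.16$

4)      Conclusion:  Since $t_0 \approx -1.16 > -2.015$, there is not sufficient evidence to reject the claim of at least 30 mpg at a significance level of 0.05.

5)      P-value:  $P(T < -1.16) \approx 0.15$.

                      Since p-value > 0.05, there is not sufficient evidence to reject the claim

                      at a significance level of 0.05.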
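The arithmetic of this example can be checked with a short script. The Python standard library has no t distribution, so the sketch below assumes a hand-written helper, `t_cdf`, that integrates the t density numerically with Simpson's rule; the critical value 2.015 is taken from the t table for 5 degrees of freedom and upper-tail area 0.05.

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, stdev


def t_cdf(t, df, steps=2000):
    """P(T <= t) for a t variable, by Simpson's rule on [0, t] (steps even)."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)

    def pdf(u):
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps  # negative when t < 0, which makes the integral negative
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


mpg = [27.2, 29.3, 31.2, 28.4, 30.3, 29.6]
mu0 = 30.0
n = len(mpg)

t0 = (mean(mpg) - mu0) / (stdev(mpg) / sqrt(n))  # about -1.16
p_value = t_cdf(t0, n - 1)                       # left-tail area, about 0.15
t_crit = 2.015                                   # table value t_{0.05, 5}
reject = t0 < -t_crit                            # left-tailed critical region

print(f"t0 = {t0:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Both routes agree: $t_0$ lies outside the critical region and the p-value exceeds 0.05, so the claim is not rejected.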