Hypothesis Testing Explained in Detail

balajiv.ganesh · Jan 31, 2013

Description
This presentation describes the steps in carrying out hypothesis testing in detail, with the help of examples.

January 31, 2013
Dr.B.Sasidhar 1

Hypothesis Testing
Hypothesis testing or significance testing, is a method
for checking whether an apparent result from a
sample could possibly be due to randomness.
It serves to check on how strong the evidence is.
Statistical hypothesis testing is a means of
assessing whether apparent results in a sample
conclusively indicate that something is really
happening.

There are 4 major components of a test of hypothesis.
1. Null hypothesis

2. Alternative hypothesis (research hypothesis)

3. Test statistic

4. Rejection region
A research hypothesis typically states that there is a
real change, a real difference, or a real effect in the
underlying population or process.
Null hypothesis states that there is no real change,
difference or effect.

Null hypothesis
The null hypothesis H₀ specifies that the
parameter is equal to a single value. For one-tail
tests, it may be of the ≤ or ≥ type.
Example: If we wanted to test whether the mean
weight loss of people who have
participated in a new weight program
is 3 kg, we would test
H₀: µ = 3

Alternative hypothesis
The alternative hypothesis H₁ answers the
question by specifying that the parameter is
one of the following.

1. Greater than the value shown in the
null hypothesis.
Example: If a tyre company wanted to know
whether the average life of its new
radial tyre exceeds its advertised
value of 50,000 km, the company
would specify the alternative
hypothesis as
H₁: µ > 50,000

2. If the company wanted to know whether the
average life of the tyre is less than 50,000 km,
it would test
H₁: µ < 50,000
3. If the company wanted to determine whether
the average life of the tyre differs from the
advertised value, it would test
H₁: µ ≠ 50,000

[Figure: Signs in the Tails of a Test]

Test Statistic
The purpose of the test is to determine whether
it is appropriate or not to reject the null hypothesis.
Therefore the test statistic is the sample statistic
upon which we base our decision to either reject
or not reject the null hypothesis.

Rejection Region
The rejection region is a range of values such that, if
the test statistic falls into that range, we decide to reject
the null hypothesis.
The key question answered by the rejection region is
when is the value of the test statistic sufficiently
different from the hypothesized value of the parameter
to enable us to reject the null hypothesis.
The process we use in answering this question
depends on the probability of our making a mistake
when testing the hypothesis.
Since the conclusion we draw is based on sample
data, the chance of our making one of two possible
errors will always exist.

                    H₀ is true                     H₀ is false
Reject H₀           Type I error, P(Type I) = α    Correct decision
Do not reject H₀    Correct decision               Type II error, P(Type II) = β

The Power of Statistical Test
The power of a statistical test, given as
1 – β = P(reject H₀ when H₀ is false), measures the
ability of the test to perform as required. This
1 – β is called the power of the test. The
greater the power of the test, the better the
decision rule.
There are two types of tail test
1. One-tailed tests - the rejection region is in only
one tail of the distribution
2. Two-tailed tests - the rejection region is in both
tails of the distribution

[Figure: rejection and acceptance regions for a two-tailed test (rejection region in both tails) and a one-tailed test (rejection region in one tail)]

Steps in hypothesis testing
- Define Null hypothesis
- Define Alternative hypothesis
- Calculate Test statistic
- Determine Rejection region
- Compare Value of the test statistic with
Critical Value
- Conclusion
Testing the Population Mean When the
Population Standard Deviation is Known
• Example
– A new billing system for a department store will
be cost-effective only if the mean monthly
account is more than $170.
– A sample of 400 accounts has a mean of $178.
– If accounts are approximately normally
distributed with σ = $65, can we conclude that
the new system will be cost-effective?
Testing the Population Mean (σ is Known)
• Example – Solution
– The population of interest is the credit
accounts at the store.
– We want to know whether the mean
account for all customers is greater than
$170.
H₁: µ > 170
– The null hypothesis generally specifies a
single value of the parameter µ,
H₀: µ = 170
Approaches to Testing
• There are two approaches to test
whether the sample mean supports the
alternative hypothesis (H₁)
– The rejection region method is mandatory
for manual testing (but can also be used when
testing is supported by statistical
software)
– The p-value method, which is mostly used
when statistical software is available.
The rejection region is a range of values
such that if the test statistic falls into
that range, the null hypothesis is
rejected in favor of the alternative
hypothesis.
The Rejection Region Method
The Rejection Region Method for a Right-Tail Test
Example – solution continued

• Recall: H₀: µ = 170 and H₁: µ > 170.

• It therefore seems reasonable to reject the null hypothesis and
believe that µ > 170 if the sample mean is sufficiently large.

[Figure: sampling distribution of the sample mean, with the region "Reject H₀ here" to the right of the critical value of the sample mean]
The Rejection Region Method for a Right-Tail Test
Example – solution continued

• Define a critical value x̄_L for x̄ that is just large enough
to reject the null hypothesis.

• Reject the null hypothesis if x̄ > x̄_L.

The standardized test statistic
– Instead of using the statistic x̄, we can
use the standardized value z:
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

– Then, the rejection region (one-tail test) becomes
\[ z > z_{\alpha} \]
Critical Value for the Rejection Region
• Set the probability of committing a Type I
error to be α (also called the significance
level).

The standardized test statistic
• Example - continued
– We re-do this example using the
standardized test statistic.
Recall: H₀: µ = 170
H₁: µ > 170
– Test statistic:
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{178 - 170}{65 / \sqrt{400}} = 2.46 \]
– Rejection region: z > z_.05 = 1.645.
The standardized test statistic
• Example - continued
Reject the null hypothesis if Z > 1.645.
Conclusion
Since Z = 2.46 > 1.645, reject the null
hypothesis in favor of the alternative
hypothesis.
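The rejection region calculation for the billing example can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `z_statistic` and the variable names are assumptions made for this sketch and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized test statistic z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Billing example: H0: mu = 170, H1: mu > 170, sigma = 65, n = 400, x_bar = 178
alpha = 0.05
z = z_statistic(178, 170, 65, 400)

# Right-tail critical value z_alpha from the standard normal distribution
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z > z_critical
print(f"z = {z:.2f}, critical value = {z_critical:.3f}, reject H0: {reject_h0}")
```

Because the test is right-tailed, only values of z above the critical value lead to rejection.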
– The p-value provides information about the
amount of statistical evidence that supports
the alternative hypothesis.
– The p-value of a test is the probability of observing a
test statistic at least as extreme as the one computed,
given that the null hypothesis is true.

– Let us demonstrate the concept on the billing example.
P-value Method
P-value Method
The p-value
The probability of observing a
test statistic at least as extreme as 178,
given that µ = 170 is…
\[ P(\bar{x} > 178 \text{ when } \mu = 170) = P\left(z > \frac{178 - 170}{65 / \sqrt{400}}\right) = P(z > 2.4615) = .0069 \]
[Figure: sampling distribution of x̄ centered at µ = 170, with the area to the right of x̄ = 178 marked]
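As a rough check of the p-value above, the right-tail area can be computed with Python's standard library. The helper name `right_tail_p_value` is hypothetical and used only for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_p_value(sample_mean, mu0, sigma, n):
    """P(Z > z) for the standardized statistic of a right-tail z-test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)


# Billing example: x_bar = 178, mu0 = 170, sigma = 65, n = 400
p_value = right_tail_p_value(178, 170, 65, 400)
print(f"p-value = {p_value:.4f}")
```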
Interpreting the p-value
[Figure: sampling distributions of x̄ under H₀: µ = 170 and under H₁: µ > 170, with x̄ = 178 marked]
We can conclude that the smaller the p-value,
the more statistical evidence exists to support the
alternative hypothesis.
Interpreting the p-value
• Describing the p-value
– If the p-value is less than 1%, there is
overwhelming evidence that supports the
alternative hypothesis.
– If the p-value is between 1% and 5%, there is
strong evidence that supports the
alternative hypothesis.
– If the p-value is between 5% and 10%, there
is weak evidence that supports the
alternative hypothesis.
– If the p-value exceeds 10%, there is no
evidence that supports the alternative
hypothesis.
– The p-value can be used when making
decisions based on rejection region
methods as follows:
• Define the hypotheses to test, and the required
significance level α.
• Perform the sampling procedure, calculate the
test statistic and the p-value associated with it.
• Compare the p-value to α. Reject the null
hypothesis only if p-value < α.
[…] –1.28 do not reject the null hypothesis.
The p-value = P(Z […] .10, do not reject the null hypothesis
Define the rejection region
A Two - Tail Test
• Example
– AT&T has been challenged by competitors
who argued that their rates resulted in
lower bills.
– A statistics practitioner determines that the
mean and standard deviation of monthly
long-distance bills for all AT&T residential
customers are $17.09 and $3.87
respectively.
A Two - Tail Test
• Example - continued
– A random sample of 100 customers is
selected and customers' bills recalculated
using a leading competitor's rates (see file
AT&T.xls).
– Assuming the standard deviation is the
same (3.87), can we infer that there is a
difference between AT&T's bills and the
competitor's bills (on the average)?
A Two - Tail Test
• Solution
– Is the mean different from 17.09?
H₀: µ = 17.09
H₁: µ ≠ 17.09
– Define the rejection region
\[ z < -z_{\alpha/2} \quad \text{or} \quad z > z_{\alpha/2} \]
A Two – Tail Test
Solution - continued
If H₀ is true (µ = 17.09), x̄ can still fall far
above or far below 17.09, in which case
we erroneously reject H₀ in favor of H₁.
We want this erroneous rejection of H₀ to be a
rare event, say 5% chance.
[Figure: sampling distribution of x̄ centered at µ = 17.09, with α/2 = 0.025 in each tail]
A Two – Tail Test
Solution - continued
From the sample we have x̄ = 17.55, so
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{17.55 - 17.09}{3.87 / \sqrt{100}} = 1.19 \]
Rejection region: z < -z_{α/2} = -1.96 or z > z_{α/2} = 1.96
[Figure: standard normal distribution with α/2 = 0.025 in each tail beyond ±1.96]
A Two – Tail Test
Since z = 1.19 lies between -z_{α/2} = -1.96 and z_{α/2} = 1.96, it does not fall in the rejection region.
There is insufficient evidence to infer that there is a
difference between the bills of AT&T and the competitor.
Also, by the p-value approach:
The p-value = P(Z < -1.19) + P(Z > 1.19)
= 2(.1173) = .2346 > .05
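The two-tail calculation for the AT&T example can be reproduced with a short Python sketch. It uses only the standard library; the function `two_tail_z_test` and its parameter names are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical value z_{alpha/2}, two-tail p-value)."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Both tails count as "at least as extreme", so double the one-tail area
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, z_critical, p_value


# AT&T example: x_bar = 17.55, mu0 = 17.09, sigma = 3.87, n = 100
z, z_critical, p_value = two_tail_z_test(17.55, 17.09, 3.87, 100, alpha=0.05)
reject_h0 = abs(z) > z_critical
print(f"z = {z:.2f}, critical = ±{z_critical:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Both decision rules (comparing |z| with the critical value, or the p-value with α) lead to the same conclusion.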
Inference About a Population Mean When the
Population Standard Deviation Is Unknown or
When the Sample Size is Small
In practice, the population standard deviation
will be unknown.
Recall that when σ is known we use the
following statistic to estimate and test a
population mean
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
When σ is unknown or when the sample size is
small, we use its point estimator s, and the
z-statistic is then replaced by the t-statistic
The t - Statistic
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
The t distribution is mound-shaped,
and symmetrical around zero.
The “degrees of freedom”
(a function of the sample size)
determine how spread out the
distribution is (compared to the
normal distribution).
[Figure: t distributions centered at 0 with d.f. = ν₁ and d.f. = ν₂, where ν₁ < ν₂]
Testing µ when σ is unknown
• Example
– In order to determine the number of workers
required to meet demand, the productivity of
newly hired trainees is studied.

– It is believed that trainees can process and
distribute more than 450 packages per hour
within one week of hiring.

– Can we conclude that this belief is correct,
based on productivity observation of 50
trainees (see file PROD.xls)?
Testing µ when σ is unknown
• Example – Solution
– The problem objective is to describe the
population of the number of packages
processed in one hour.
– H₀: µ = 450
H₁: µ > 450
– The t statistic
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1 = 49 \]
Testing µ when σ is unknown
• Solution continued (solving by
hand)

– The rejection region is
t > t_{α,n–1}
t_{α,n–1} = t_{.05,49} ≈ t_{.05,50} = 1.676.
From the data we have Σxᵢ = 23,019 and Σxᵢ² = 10,671,357, thus
\[ \bar{x} = \frac{23{,}019}{50} = 460.38, \quad \text{and} \quad s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = 1507.55, \quad s = \sqrt{1507.55} = 38.83 \]
Testing µ when σ is unknown
• The test statistic is
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{460.38 - 450}{38.83 / \sqrt{50}} = 1.89 \]
• Since 1.89 > 1.676 we reject the null
hypothesis in favor of the alternative.
• There is sufficient evidence to infer that the
mean productivity of trainees one week after
being hired is greater than 450 packages at .05
significance level.
[Figure: t distribution with the rejection region to the right of 1.676; the test statistic 1.89 falls inside it]
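The hand calculation for the trainee example can be retraced from the two sums given above. The sketch below is a minimal illustration; `t_from_sums` is a hypothetical helper name, and the critical value 1.676 is simply the table value quoted in the slides (the standard library has no t distribution).

```python
from math import sqrt


def t_from_sums(sum_x, sum_x2, n, mu0):
    """Sample mean, sample standard deviation and t statistic from raw sums."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # shortcut formula for s^2
    s = sqrt(variance)
    t = (mean - mu0) / (s / sqrt(n))
    return mean, s, t


# Trainee example: sum of x = 23,019, sum of x^2 = 10,671,357, n = 50, mu0 = 450
mean, s, t = t_from_sums(23_019, 10_671_357, 50, 450)
t_critical = 1.676  # t_{.05,50} table value quoted in the slides

print(f"mean = {mean:.2f}, s = {s:.2f}, t = {t:.2f}")
print("Reject H0" if t > t_critical else "Do not reject H0")
```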
Inference About a Population
Proportion
• Statistic and sampling distribution
– The statistic used when making inference
about p is:
\[ \hat{p} = \frac{x}{n} \]
where x is the number of successes and n is the sample size.
– Under certain conditions [np > 5 and n(1-p)
> 5], p̂ is approximately normally distributed,
with µ = p and σ² = p(1 - p)/n.
Testing and Estimating the
Proportion
• Test statistic for p
\[ Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}, \quad \text{where } np > 5 \text{ and } n(1 - p) > 5 \]
Testing the Proportion
• Example 12.6
– A pharmaceutical company claimed that its
medicine was 80% effective in relieving allergy
for a period of 15 hours. In a sample of 200
persons who were given the medicine, 150
persons had relief. Do you think that the
company's claim is justified? Use 0.05 level of
significance.
Testing the Proportion
• Solution
– The problem objective is to test the
effectiveness of medicine.
– The data are nominal.
– The parameter to be tested is 'p'.
– Success is defined as “having relief”.
– The hypotheses are:
H₀: p = .8
H₁: p < .8
Testing the Proportion
– Solution
• The rejection region is z < -z_α = -z_.05 = -1.645.
• The sample proportion is
\[ \hat{p} = 150/200 = .75 \]
• The value of the test statistic is
\[ Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{.75 - .8}{\sqrt{.8(1 - .8)/200}} = -1.768 \]
Since the calculated z is less than the critical value, we
reject the null hypothesis and conclude that the
claim of the company that its medicine is 80%
effective is not justified.
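The left-tail proportion test above can be checked numerically. The following is a small sketch under the slide's normal-approximation conditions; the helper name `proportion_z` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """Sample proportion and z statistic for H0: p = p0 (normal approximation)."""
    # The slides require np > 5 and n(1 - p) > 5 for the approximation
    if n * p0 <= 5 or n * (1 - p0) <= 5:
        raise ValueError("normal approximation conditions are not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return p_hat, z


# Medicine example: 150 successes out of 200, H0: p = .8, H1: p < .8
p_hat, z = proportion_z(150, 200, 0.8)
z_critical = NormalDist().inv_cdf(0.05)  # left-tail critical value, about -1.645

print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, critical value = {z_critical:.3f}")
print("Reject H0" if z < z_critical else "Do not reject H0")
```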
T-Tests : When sample size is small
Paired sample t-test

H₀: µ_d = 0
H₁: µ_d ≠ 0
H₁: µ_d > 0
H₁: µ_d < 0

Matched pairs
The mean of the population differences is µ_D, that is
\[ \mu_D = \mu_1 - \mu_2 \]
Test statistic:
\[ t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}} \]
Degrees of freedom = n_D − 1
Independent sample t-test

H₀: µ₁ = µ₂
H₁: µ₁ ≠ µ₂
H₁: µ₁ > µ₂
H₁: µ₁ < µ₂

The sampling process.
                Population 1    Population 2
Parameters:     µ₁ and σ₁²      µ₂ and σ₂²
Statistics:     x̄₁ and s₁²      x̄₂ and s₂²
Sample size:    n₁              n₂

If the two population standard deviations are
unknown, then we can estimate the standard
error of the difference between two means.
\[ \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\hat{\sigma}_1^2}{n_1} + \frac{\hat{\sigma}_2^2}{n_2}} \]

Test statistic:
\[ z = \frac{(\bar{x}_1 - \bar{x}_2)}{\sqrt{\frac{\hat{\sigma}_1^2}{n_1} + \frac{\hat{\sigma}_2^2}{n_2}}} \]

If the population variances are unknown, the sample sizes
are small, and the population variances are equal,
then we use the weighted average called a
“pooled estimate” of σ²:
\[ \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]
Where:
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \]

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]
Degrees of freedom = n₁ + n₂ − 2
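To make the pooled-variance formulas concrete, here is a minimal Python sketch. The two samples are made-up numbers chosen only for illustration (the slides give no data for this test), and `pooled_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_1, sample_2):
    """Equal-variances t statistic and its degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(sample_1), len(sample_2)
    s1_sq, s2_sq = variance(sample_1), variance(sample_2)  # divisor n - 1
    # Pooled estimate of the common variance, weighted by degrees of freedom
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    std_error = sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (mean(sample_1) - mean(sample_2)) / std_error
    return t, n1 + n2 - 2


# Hypothetical data, not taken from the slides
sample_1 = [12.1, 11.8, 12.5, 12.0, 11.6]
sample_2 = [11.2, 11.9, 11.0, 11.5, 11.4, 11.1]

t, df = pooled_t(sample_1, sample_2)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with a critical value from a t table for the same degrees of freedom.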
Inference about the difference
between two population
proportions

• For nominal data we compare the population
proportions of the occurrence of a certain
event.
• Examples
– Comparing the effectiveness of a new drug versus
an older one
– Comparing market share before and after an
advertising campaign
– Comparing defective rates between two machines
Parameter and Statistic
• Parameter
– When the data are nominal, we can only
count the occurrences of a certain event in
the two populations, and calculate
proportions.
– The parameter is therefore p₁ – p₂.
• Statistic
– An unbiased estimator of p₁ – p₂ is p̂₁ – p̂₂
(the difference between the sample
proportions).
Sampling Distribution of p̂₁ – p̂₂
• Two random samples are drawn from two
populations.
• The number of successes in each sample is
recorded.
• The sample proportions are computed.

                       Sample 1        Sample 2
Sample size            n₁              n₂
Number of successes    x₁              x₂
Sample proportion      p̂₁ = x₁/n₁      p̂₂ = x₂/n₂

Sampling distribution of p̂₁ – p̂₂
• The statistic p̂₁ – p̂₂ is approximately normally distributed
if n₁p₁, n₁(1 - p₁), n₂p₂, n₂(1 - p₂) are all greater than or
equal to 5.
• The mean of p̂₁ – p̂₂ is p₁ - p₂.
• The variance of p̂₁ – p̂₂ is (p₁(1-p₁)/n₁) + (p₂(1-p₂)/n₂).
Testing the p₁ – p₂
H₀: p₁ - p₂ = 0
Calculate the pooled proportion
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \]
Then
\[ Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
Example : Testing p₁ – p₂
– The marketing manager needs to decide
which of two new packaging designs to
adopt, to help improve sales of his
company's soap.
– A study is performed in two supermarkets:
• Brightly-colored packaging is distributed in
supermarket 1.
• Simple packaging is distributed in supermarket 2.
– The first design is more expensive; therefore, to
be financially viable it has to outsell the
second design.
Example : Testing p₁ – p₂
• Summary of the experiment results
– Supermarket 1 - 180 purchasers bought the
soap out of a total of 904

– Supermarket 2 - 155 purchasers bought the
soap out of a total of 1,038
– Use 5% significance level and perform a
test to find which type of packaging to
use.
Example : Testing p₁ – p₂
• Solution
– The problem objective is to compare the
proportion of sales of the two packaging
designs.
– The hypotheses are
H₀: p₁ - p₂ = 0
H₁: p₁ - p₂ > 0

Population 1: purchases at supermarket 1
Population 2: purchases at supermarket 2
Example : Testing p₁ – p₂
• Compute:
– For a 5% significance level
the rejection region is
z > z_α = z_.05 = 1.645
The sample proportions are
\[ \hat{p}_1 = 180/904 = .1991, \quad \text{and} \quad \hat{p}_2 = 155/1{,}038 = .1493 \]
The pooled proportion is
\[ \hat{p} = (x_1 + x_2)/(n_1 + n_2) = (180 + 155)/(904 + 1{,}038) = .1725 \]
The z statistic becomes
\[ Z = \frac{(\hat{p}_1 - \hat{p}_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.1991 - .1493}{\sqrt{.1725(1 - .1725)\left(\frac{1}{904} + \frac{1}{1{,}038}\right)}} = 2.90 \]
Conclusion: There is sufficient
evidence to conclude at the 5%
significance level, that brightly-
colored design will outsell the
simple design.

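The soap-packaging calculation can be verified with a brief Python sketch. It follows the pooled-proportion formula from the slides; the function name `two_proportion_z` is an assumption made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1_hat, p2_hat, pooled, (p1_hat - p2_hat) / std_error


# Supermarket 1: 180 of 904 bought the soap; supermarket 2: 155 of 1,038
p1_hat, p2_hat, pooled, z = two_proportion_z(180, 904, 155, 1038)
z_critical = NormalDist().inv_cdf(0.95)  # right-tail test at the 5% level

print(f"p1 = {p1_hat:.4f}, p2 = {p2_hat:.4f}, pooled = {pooled:.4f}, z = {z:.2f}")
print("Reject H0" if z > z_critical else "Do not reject H0")
```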
Inference About a Population
Variance
• Sometimes we are interested in making
inference about the variability of
processes.
• Examples:
– The consistency of a production process for
quality control purposes.
– Investors use variance as a measure of risk.
• To draw inference about variability, the
parameter of interest is σ².

Inference About a Population
Variance
• The sample variance s² is an unbiased,
consistent and efficient point estimator for σ².
• The statistic (n − 1)s²/σ² has a distribution
called Chi-squared, if the population is
normally distributed.
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1 \]
[Figure: Chi-squared distributions with d.f. = 5 and d.f. = 10; positively skewed, ranging between 0 and ∞]
Testing the Population Variance
• Example (operation management
application)
– A container-filling machine is believed to fill
1 liter containers so consistently, that the
variance of the filling will be less than 1 cc
(.001 liter).
– To test this belief a random sample of 25
1-liter fills was taken, and the results
recorded (FILL.xls)
– Do these data support the belief that the
variance is less than 1 cc at 5%
significance level?
• Solution
– The problem objective is to describe the population
of 1-liter fills from a filling machine.
– The data are interval, and we are interested in the
variability of the fills.
– The complete test is:
H₀: σ² = 1
H₁: σ² < 1
[…]
The F test rejection region
\[ F > F_{\alpha,\,k-1,\,n-k}, \quad \text{where } F = \frac{MST}{MSE} \]
The F test

H₀: µ₁ = µ₂ = µ₃
H₁: At least two means differ

Test statistic:
\[ F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.17} = 3.23 \]
Rejection region:
\[ F > F_{\alpha,\,k-1,\,n-k} = F_{0.05,\,3-1,\,60-3} \approx 3.15 \]
Since 3.23 > 3.15, there is sufficient evidence
to reject H₀ in favor of H₁, and argue that at least one
of the mean sales is different from the others.
The F test p-value
p-value = P(F > 3.23) = .0467
[Figure: F distribution with the area to the right of 3.23 shaded]
ANOVA Table
JUICE.sav
The test statistic used to test the
hypothesis is F statistic
The F statistic is the ratio of two
Variabilities, that is between-sample
variability and within-sample
variability.
Assumptions:
1. The random variable is normally
distributed.
2. The population variances are equal.
H₀: µ₁ = µ₂ = µ₃ = …
H₁: Not all means are the same
The test statistic
\[ F = \frac{\text{Mean Square for Treatments}}{\text{Mean Square for Error}} \]
Degrees of freedom = k - 1 (numerator) and
n - k (denominator)
Sum of Squares for Treatments (SST) =
\[ \sum_{j=1}^{k} n_j \left( \bar{x}_j - \bar{\bar{x}} \right)^2 \]
Sum of Squares for Error (SSE) =
\[ \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( x_{ij} - \bar{x}_j \right)^2 \]
where x̄_j is the mean of sample j and x̿ is the grand mean of all observations.
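A short sketch may help to see how SST, SSE and F fit together. The three groups below are invented numbers chosen so the arithmetic is easy to follow (they are not the juice data from the slides), and `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SST, SSE, F) for a list of samples, one list per treatment."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-sample variability: group means around the grand mean
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-sample variability: observations around their own group mean
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return sst, sse, mst / mse


# Hypothetical data, not taken from the slides
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
sst, sse, f = one_way_anova(groups)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, F = {f:.2f}")
```

The F value would then be compared with the critical value F_{α,k−1,n−k} from an F table.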
ANOVA Table
                              Sum of Squares   Degrees of freedom   Mean Square         F         Sig.
Between Groups (Treatments)   SST              k-1                  MST = SST/(k-1)     MST/MSE   p-value
Within Groups (Error)         SSE              n-k                  MSE = SSE/(n-k)
Total                         SST+SSE          n-1
Chi Squared Tests
• Two statistical techniques are
presented, to analyze nominal data.
– A goodness-of-fit test for the multinomial
experiment.
– A contingency table test of independence.
• Both tests use the χ² as the sampling
distribution of the test statistic.
The Chi-square Distribution
At the outset, we should know that the chi-
square distribution has only one parameter
called the ‘degrees of freedom’ (df ) as is the
case with the t-distribution. The shape of a
particular chi-square distribution depends on
the number of degrees of freedom.
1. Chi-square is non-negative in value; it is
either zero or positively valued.

2. It is not symmetrical; it is skewed to the
right.

3. There are many chi-square distributions.
As with the t-distribution, there is a
different chi-square distribution for each
degree-of-freedom value.
Properties Chi-square Distribution
Chi-Squared Goodness-of-Fit
Test
• The hypothesis tested involves the probabilities
p₁, p₂, …, p_k of a multinomial distribution.
• The multinomial experiment is an extension of
the binomial experiment.
– There are n independent trials.
– The outcome of each trial can be classified
into one of k categories, called cells.
– The probability pᵢ that the outcome falls into cell
i remains constant for each trial. Moreover,
p₁ + p₂ + … + p_k = 1.
– Trials of the experiment are independent.
Chi-squared Goodness-of-Fit
Test
• We test whether there is sufficient
evidence to reject a pre-specified set of
values for pᵢ.
• The hypothesis:
H₀: p₁ = a₁, p₂ = a₂, …, p_k = a_k
H₁: At least one pᵢ ≠ aᵢ
• The test builds on comparing the actual frequency
and the expected frequency of occurrences in all
the cells.

The multinomial goodness of fit
test - Example
– Two competing companies A and B have
enjoyed a dominant position in the market. The
companies conducted aggressive
advertising campaigns.
– Market shares before the campaigns were:
• Company A = 45%
• Company B = 40%
• Other competitors = 15%.
The multinomial goodness of fit test
• Example – continued
– To study the effect of the campaign on the
market shares, a survey was conducted.
– 200 customers were asked to indicate their preference
regarding the product advertised.
– Survey results:
• 102 customers preferred company A’s product,
• 82 customers preferred company B’s product,
• 16 customers preferred the competitors’ products.
• Can we conclude at 5% significance level
that the market shares were affected by the
advertising campaigns?

The multinomial goodness of fit test
• Solution
– The population investigated is the brand
preferences.
– The data are nominal (A, B, or other)
– This is a multinomial experiment (three
categories).
– The question of interest: Are p₁, p₂, and p₃
different after the campaign from their values
before the campaign?
The multinomial goodness of fit test
- Example
• The hypotheses are:
H₀: p₁ = .45, p₂ = .40, p₃ = .15
H₁: At least one pᵢ changed.
The expected frequency for each
category (cell) if the null hypothesis
is true is shown below, next to the actual
frequencies the sample returned:
Category             Expected frequency   Actual frequency
1 (Company A)        90 = 200(.45)        102
2 (Company B)        80 = 200(.40)        82
3 (Other)            30 = 200(.15)        16
The multinomial goodness of fit test
- Example
• The statistic is
\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}, \quad \text{where } e_i = n p_i \]
• The rejection region is
\[ \chi^2 > \chi^2_{\alpha,\,k-1} \]
The multinomial goodness of fit test
• Example – continued
\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} = \frac{(102 - 90)^2}{90} + \frac{(82 - 80)^2}{80} + \frac{(16 - 30)^2}{30} = 8.18 \]
\[ \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,3-1} = 5.99147 \]
\[ \text{The } p\text{-value} = P(\chi^2 > 8.18) = .01679 \]
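The goodness-of-fit statistic for the market-share example can be recomputed as follows. This is a minimal sketch; `goodness_of_fit` is a hypothetical helper name. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail area of the chi-squared distribution is exp(−x/2); for other degrees of freedom a statistical table or library would be needed.

```python
from math import exp


def goodness_of_fit(observed, probabilities):
    """Chi-squared statistic sum((f_i - e_i)^2 / e_i) with e_i = n * p_i."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return expected, chi2


# Survey results for A, B, other, against the pre-campaign market shares
observed = [102, 82, 16]
expected, chi2 = goodness_of_fit(observed, [0.45, 0.40, 0.15])

chi2_critical = 5.99147  # table value for alpha = .05 and 2 d.f., from the slides
p_value = exp(-chi2 / 2)  # upper-tail area, valid only for 2 degrees of freedom

print(f"expected = {expected}, chi2 = {chi2:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if chi2 > chi2_critical else "Do not reject H0")
```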
The multinomial goodness of fit test
• Example – continued
[Figure: χ² distribution with 2 degrees of freedom; the rejection region (area alpha) lies to the right of 5.99, and the p-value is the area to the right of 8.18]
Conclusion: Since 8.18 > 5.99, there is sufficient
evidence at 5% significance level to reject the null
hypothesis. At least one of the probabilities pᵢ is
different. Thus, at least two market shares have
changed.
Required conditions –
the rule of five
• The test statistic used to perform the
test is only approximately Chi-squared
distributed.
• For the approximation to apply, the
expected cell frequency has to be at
least 5 for all the cells (npᵢ ≥ 5).
• If the expected frequency in a cell is
less than 5, combine it with other cells.
Chi-squared Test of a Contingency
Table
• Test of Independence: a test of
association between two nominal
variables, based on contingency tables.

Null Hypothesis: The two variables are
independent
Alternative Hypothesis: The two variables
are dependent

The chi-squared statistic measures the difference
between the actual counts and the expected
counts (assuming validity of the null hypothesis):
\[ \chi^2 = \sum \frac{(\text{Observed count} - \text{Expected count})^2}{\text{Expected count}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \]
Contingency table χ² test –
Example
– In an effort to better predict the demand for
courses offered by a certain MBA program, it
was hypothesized that students' academic
backgrounds affect their choice of MBA major,
and thus their course selection.
– A random sample of last year's MBA
students was selected. The following
contingency table summarizes relevant data.
Contingency table χ² test –
Example
The observed values
Degree   Accounting   Finance   Marketing   Total
BA       31           13        16          60
BENG     8            16        7           31
BBA      12           10        17          39
Other    10           5         7           22
Total    61           44        47          152
Contingency table χ² test –
Example
• Solution
– The hypotheses are:
H₀: The two variables are independent
H₁: The two variables are dependent
– The test statistic
\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \]
where k is the number of cells in
the contingency table.
– The rejection region
\[ \chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)} \]
Since eᵢ = npᵢ but pᵢ is
unknown, we need to
estimate the unknown
probability from the data,
assuming H₀ is true.
Estimating the expected
frequencies
Under the null hypothesis the two variables are independent:

P(Accounting and BA) = P(Accounting)*P(BA) = [61/152][60/152].

Undergraduate              MBA Major
Degree        Accounting   Finance   Marketing   Total   Probability
BA                                               60      60/152
BENG                                             31      31/152
BBA                                              39      39/152
Other                                            22      22/152
Total         61           44        47          152
Probability   61/152       44/152    47/152

The number of students expected to fall in the cell “Accounting - BA” is
e_Acct-BA = n(p_Acct-BA) = 152(61/152)(60/152) = [61*60]/152 = 24.08
The number of students expected to fall in the cell “Finance - BBA” is
e_Finance-BBA = n(p_Finance-BBA) = 152(44/152)(39/152) = [44*39]/152 = 11.29
The expected frequencies for a
contingency table
• The expected frequency of the cell in row i and
column j of the contingency table is calculated by
\[ e_{ij} = \frac{(\text{Column } j \text{ total})(\text{Row } i \text{ total})}{\text{Sample size}} \]
Calculation of the χ² statistic
• Solution – continued
Observed frequencies, with the expected frequencies in parentheses:
Undergraduate            MBA Major
Degree   Accounting   Finance      Marketing    Total
BA       31 (24.08)   13 (17.37)   16 (18.55)   60
BENG     8 (12.44)    16 (8.97)    7 (9.58)     31
BBA      12 (15.65)   10 (11.29)   17 (12.06)   39
Other    10 (8.83)    5 (6.39)     7 (6.80)     22
Total    61           44           47           152
\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} = \frac{(31 - 24.08)^2}{24.08} + \cdots + \frac{(5 - 6.39)^2}{6.39} + \cdots + \frac{(7 - 6.80)^2}{6.80} = 14.70 \]
Contingency table χ² test –
Example
• Solution – continued
– The critical value in our example is:
\[ \chi^2_{\alpha,\,(r-1)(c-1)} = \chi^2_{.05,\,(4-1)(3-1)} = 12.5916 \]
• Conclusion:
Since χ² = 14.70 > 12.5916, there
is sufficient evidence to infer at 5% significance
level that students’ undergraduate degree
and MBA course selection
are dependent.
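The expected frequencies and the χ² statistic for the MBA example can be recomputed from the observed table. The sketch below uses plain Python lists; the variable names are illustrative, and the critical value 12.5916 is the table value quoted in the slides.

```python
# Observed counts: rows are BA, BENG, BBA, Other;
# columns are Accounting, Finance, Marketing
observed = [
    [31, 13, 16],
    [8, 16, 7],
    [12, 10, 17],
    [10, 5, 7],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# e_ij = (row i total)(column j total) / sample size
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
chi2_critical = 12.5916  # table value for alpha = .05 and 6 d.f., from the slides

print(f"chi2 = {chi2:.2f} with {df} degrees of freedom")
print("Reject H0" if chi2 > chi2_critical else "Do not reject H0")
```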
Yates’ Correction for Continuity
The chi-square distribution is a continuous
distribution. When results for a continuous
distribution are applied to discrete data, certain
corrections for continuity can be made
(whenever d.f. = 1).
NONPARAMETRIC METHODS
Nonparametric methods are statistical
procedures for hypothesis testing that
do not require a normal distribution
( or any other particular shape of
distribution ) because they are based
on counts or ranks instead of the
actual data values
However these methods still require
that you have a random sample from
the population.
NONPARAMETRIC METHODS
Advantages of Nonparametric Testing

1. No need to assume normality; can be used
even if the distribution is not normal.

2. Can even be used to test ordinal data because
ranks can be found based on the natural ordering.

3. Can be much more efficient than parametric
methods when distributions are not normal
Disadvantages of Nonparametric Testing
Less statistically efficient than parametric methods
when distributions are normal
Non-Parametric Tests (counterparts of the T-Tests)
Types of tests:
Wilcoxon Signed Rank Sum Test (similar
to Paired sample t-test)
Mann-Whitney U-Test (similar to
Independent samples t-test)
Wilcoxon Signed Rank Sum Test
• This test is used when
– the problem objective is to compare two
populations,
– the data are not normal,
– the samples are matched pairs.
Mann-Whitney U-Test
• This test is used when
– the problem objective is to compare two
populations,
– the data are not normal,
– the samples are from two independent
populations.

Mann-Whitney U-Test

Test statistic,
\[ z = \frac{U - \mu_U}{\sigma_U} \]
where
\[ U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1, \qquad \mu_U = \frac{n_1 n_2}{2} \quad \text{and} \quad \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \]
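The U statistic and its normal approximation can be sketched as below. The sketch assumes that R₁ is the sum of the ranks of the first sample in the combined ordering, with tied values sharing the average rank; the slides do not define R₁ explicitly. The data are invented and far too small for the normal approximation to be trusted, so they only illustrate the arithmetic. The helper names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_z(sample_1, sample_2):
    """Return (U, z) using U = n1*n2 + n1(n1 + 1)/2 - R1."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])  # rank sum of the first sample (assumed meaning of R1)
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mu_u) / sigma_u


# Hypothetical data, not taken from the slides
u, z = mann_whitney_z([3, 5, 8], [1, 2, 4, 6])
print(f"U = {u}, z = {z:.4f}")
```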
Kruskal-Wallis Test
(similar to ANOVA)
• The variable is ordinal or interval and
not normally distributed for each of the
populations as defined by the different
levels of the factor.
• The cases represent random samples
from the populations and the scores on
the test variable are independent of
each other.
Kruskal-Wallis Test
• The problem characteristics for this test are:
– The problem objective is to compare two or more
populations.
– The data are either ordinal or interval but not
normal.
– The samples are independent.
• The hypotheses are
H₀: The locations of all the k populations are the
same.
H₁: At least two population locations differ.