Sampling Distributions Estimation

balajiv.ganesh · Jan 31, 2013

Description
describes sampling distribution in detail.

January 31, 2013 Dr.B.Sasidhar 1
Sampling Distribution
Introduction
• In real life, calculating the parameters of
populations is usually prohibitive because
populations are very large.
• Rather than investigating the whole
population, we take a sample, calculate a
statistic related to the parameter of interest,
and make an inference.
• The sampling distribution of the statistic is
the tool that tells us how close the statistic
is likely to be to the parameter.

Sampling Distributions – Sample Means
Instead of working with individual scores,
statisticians often work with means.
Several samples are taken,
the mean is computed for each sample, and
then the means are used as the data,
rather than individual scores being used.
The resulting distribution is the sampling distribution of the
sample means.
Sampling Distribution of the Mean
• An example
– A die is thrown infinitely many times. Let X
represent the number of spots showing on
any throw.
– The probability distribution of X is ….
• Suppose we want to estimate µ
from the mean of a sample of
size n = 2.
• What is the distribution of $\bar{x}$?
Throwing a die twice – sampling
distribution of the sample mean $\bar{x}$
$E(\bar{x}) = ?$
$V(\bar{x}) = ?$
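The two questions above can be answered by listing all 36 equally likely outcomes of two throws. The following is a minimal sketch, assuming a fair six-sided die; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumption: a fair six-sided die

# All 36 equally likely outcomes of two throws; record the mean of each pair.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))
dist = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), p)              # each possible sample mean and its probability
print("E(x-bar) =", mean_of_means)  # 7/2
print("V(x-bar) =", var_of_means)   # 35/24
```

The exact fractions show that the mean of the sample means equals the population mean 3.5, while the variance is half the population variance of 35/12.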
Sampling Distribution of the
Mean
n = 5: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{5} = .5833$
n = 10: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{10} = .2917$
n = 25: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{25} = .1167$
Notice that $\sigma_{\bar{x}}^2$ is smaller than $\sigma_x^2$.
The larger the sample size, the
smaller $\sigma_{\bar{x}}^2$. Therefore, $\bar{x}$ tends
to fall closer to µ as the sample
size increases.
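The three variances can be reproduced from the variance of a single throw. This is a small sketch under the assumption of a fair die; `variance_of_mean` is a hypothetical helper name.

```python
from fractions import Fraction

faces = range(1, 7)  # assumption: a fair six-sided die
mu = Fraction(sum(faces), 6)                              # population mean, 7/2
sigma2 = sum((Fraction(x) - mu) ** 2 for x in faces) / 6  # population variance, 35/12

def variance_of_mean(n):
    """Variance of the sample mean for samples of size n."""
    return sigma2 / n

for n in (5, 10, 25):
    print(n, round(float(variance_of_mean(n)), 4))
# 5 0.5833
# 10 0.2917
# 25 0.1167
```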

Sampling Distributions
Properties of the Sampling Distribution of the
Sample Means.
1. The mean of the sample means will be the
mean of the population.
2. The variance of the sample means will be
the variance of the population divided by
the sample size.
3. The standard deviation of the sample means
(known as the standard error of the mean)
will be smaller than the population standard
deviation (for n > 1) and
will be equal to the standard deviation of the
population divided by the square root of the
sample size.

Sampling Distributions (Contd.)
4. If the population has a normal distribution,
then the sample means will have a normal
distribution
5. If the population is not normally distributed,
but the sample size is sufficiently large, then
the sample means will have an approximately
normal distribution.
The standard deviation of the distribution of a sample
statistic is known as the standard error of the statistic.
The standard error indicates not only the size of the
chance error that has been made but also the accuracy
we are likely to get if we use a sample statistic to estimate
a population parameter.

Sampling Distributions
Standard Error of the Mean
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
The Central Limit Theorem
• If a random sample is drawn from any
population, the sampling distribution of
the sample mean is approximately
normal for a sufficiently large sample size
(n>30).
• The larger the sample size, the more
closely the sampling distribution of $\bar{x}$ will
resemble a normal distribution.

Central Limit Theorem
As the sample size increases, the
sampling distribution of the sample
means will become approximately
normally distributed.
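A small simulation can make the theorem concrete. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the seed, and the number of repetitions are assumptions chosen for the demonstration, not values from the slides.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, reps=5000):
    """Means of `reps` random samples of size n from an exponential population
    with mean 1 and standard deviation 1 (a strongly skewed distribution)."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

for n in (2, 30):
    means = sample_means(n)
    se = 1.0 / math.sqrt(n)  # theoretical standard error, sigma / sqrt(n)
    # Share of sample means within 1.96 standard errors of the population mean;
    # for a normal sampling distribution this share would be about 0.95.
    inside = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)
    print(n, round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3), round(se, 3), round(inside, 3))
```

For each sample size the printout lists the simulated mean and standard deviation of the sample means next to the theoretical standard error, so the two can be compared directly.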

Sampling Distributions
The significance of the central limit theorem is that it permits
us to use sample statistics to make inferences about
population parameters without knowing anything about
the shape of the frequency distribution of that population
other than what we can get from the sample.
Regardless of the nature of the population
distribution (discrete or continuous,
symmetric or skewed, unimodal or
multimodal), the sampling distribution of the mean is
always nearly normal as long as the sample
size is large enough.
Sufficiently large: at least 30
Sampling Distribution of the
Sample Mean
1. $\mu_{\bar{x}} = \mu_x$
2. $\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{n}$
3. If x is normal, $\bar{x}$ is normal. If x is nonnormal, $\bar{x}$ is approximately
normally distributed for sufficiently large sample size.

Sampling Distributions
Sample size and Standard Error
As n increases, the standard error decreases.
As the standard error decreases, the precision
with which the sample mean can be used to
estimate the population mean increases.

Sampling Distributions
Finite Population Correction Factor
If the sample size is more than 1% of the
population size and the sampling is done
without replacement, then a correction needs
to be made to the standard error of the
means.
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
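As a rough illustration of the correction, the sketch below computes the standard error with and without the finite population factor. The function name and the numbers (σ = 100, n = 25, N = 500) are hypothetical and chosen only for the example.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population correction
    sqrt((N - n) / (N - 1)) when a population size N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical values, for illustration only.
print(standard_error(100, 25))          # 20.0 (no correction)
print(standard_error(100, 25, N=500))   # slightly smaller than 20
```

The correction factor is always below 1 when n > 1, so sampling without replacement from a small population gives a smaller standard error than the uncorrected formula suggests.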
• Example
– The amount of soda pop in each bottle is
normally distributed with a mean of 32.2
ounces and a standard deviation of .3
ounces.
– Find the probability that a bottle bought by
a customer will contain more than 32
ounces.
– Solution
• The random variable X is the
amount of soda in a bottle.
[Figure: normal curve with µ = 32.2; the area to the right of x = 32, marked "?", is the required probability]

Sampling Distribution of the
Sample Mean
• Find the probability that a carton of four bottles will
have a mean of more than 32 ounces of soda per
bottle.
• Solution
– Define the random variable as the mean amount of soda per
bottle.
[Figure: normal curve of $\bar{x}$ with $\mu_{\bar{x}} = 32.2$; the area to the right of $\bar{x} = 32$, marked "?", is the required probability]
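Both soda probabilities can be checked numerically. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the carton is treated as a random sample of four bottles.

```python
from statistics import NormalDist

mu, sigma = 32.2, 0.3

# One bottle: X is normal with mean 32.2 and standard deviation 0.3.
p_bottle = 1 - NormalDist(mu, sigma).cdf(32)

# Mean of n = 4 bottles: x-bar is normal with mean 32.2 and
# standard error 0.3 / sqrt(4) = 0.15.
n = 4
p_carton = 1 - NormalDist(mu, sigma / n ** 0.5).cdf(32)

print(round(p_bottle, 4), round(p_carton, 4))
```

The results are roughly 0.75 for a single bottle and 0.91 for the carton mean: the mean of four bottles varies less than a single bottle, so it is more likely to stay above 32 ounces.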
• Example
– Dean’s claim: The average weekly income of
M.B.A graduates one year after graduation is
$600.
– Suppose the distribution of weekly income has a
standard deviation of $100. What is the
probability that 25 randomly selected graduates
have an average weekly income of less than
$550?
– Solution …..
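One way to work this example is sketched below. It assumes the sample mean of the 25 incomes is approximately normal (for instance because weekly incomes are themselves roughly normal); that assumption, like the variable names, is not stated on the slide.

```python
from statistics import NormalDist

mu, sigma, n = 600, 100, 25
standard_error = sigma / n ** 0.5   # 100 / 5 = 20.0

# P(x-bar < 550) when the claimed mean of 600 is true
p_below_550 = NormalDist(mu, standard_error).cdf(550)
print(round(p_below_550, 4))
```

The value 550 lies 2.5 standard errors below 600, so the probability is well under 1%.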
• Example – continued
– If a random sample of 25 graduates
actually had an average weekly income of
$550, what would you conclude about the
validity of the claim that the average
weekly income is $600?
– Solution …..

Using Sampling Distributions
for Inference
• To make inferences about population parameters we use
sampling distributions (as in the example).
• The symmetry of the normal distribution along with the
sampling distribution of the mean lead to:
$P(-1.96 \le z \le 1.96) = .95$, or $P\left(-1.96 \le \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \le 1.96\right) = .95$
where $-z_{.025} = -1.96$ and $z_{.025} = 1.96$.
This can be written as
$P\left(-1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95$
which becomes
$P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95$
[Figure: the standard normal distribution of Z, with area .95 between -1.96 and 1.96 and .025 in each tail, beside the normal distribution of $\bar{x}$, with area .95 between $\mu - 1.96 \frac{\sigma}{\sqrt{n}}$ and $\mu + 1.96 \frac{\sigma}{\sqrt{n}}$ and .025 in each tail]
For µ = 600, σ = 100 and n = 25:
$P\left(600 - 1.96 \frac{100}{\sqrt{25}} \le \bar{x} \le 600 + 1.96 \frac{100}{\sqrt{25}}\right) = .95$
$P\left(600 - 1.96 \frac{100}{\sqrt{25}} \le \bar{x} \le 600 + 1.96 \frac{100}{\sqrt{25}}\right) = .95$
which reduces to $P(560.8 \le \bar{x} \le 639.2) = .95$
• Conclusion
– There is a 95% chance that the sample mean
falls within the interval [560.8, 639.2] if the
population mean is 600.
– Since the sample mean was 550, the
population mean is probably not 600.
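The interval in the conclusion can be recomputed in a few lines. This is a sketch with illustrative variable names, using the table value z = 1.96.

```python
mu, sigma, n = 600, 100, 25
z = 1.96                              # z value for the central 95% area

margin = z * sigma / n ** 0.5         # 1.96 * 100 / 5
lower, upper = mu - margin, mu + margin
print(round(lower, 1), round(upper, 1))   # 560.8 639.2

observed = 550
print(lower <= observed <= upper)         # False: 550 lies outside the interval
```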

Sampling Distributions
Student’s t distribution
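When the population standard deviation is
unknown and is replaced by the sample standard
deviation s, the standardized sample mean
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ has a Student’s t
distribution with n - 1 degrees of freedom,
provided the population is normally distributed.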

When the sample size is small ( >
Approximate Sampling
Distribution
of a Sample Proportion
• From the laws of expected value and variance,
it can be shown that $E(\hat{p}) = p$ and $V(\hat{p}) = p(1-p)/n$.
• If both $np \ge 5$ and $n(1-p) \ge 5$, then
$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
is approximately standard normally
distributed.
• Example
– A state representative received 52% of the
votes in the last election.
– One year later the representative wanted
to study his popularity.
– If his popularity has not changed, what is
the probability that more than half of a
sample of 300 voters would vote for him?
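A possible way to work this example with the normal approximation for a sample proportion is sketched below; the variable names are illustrative, and no continuity correction is applied.

```python
import math
from statistics import NormalDist

p, n = 0.52, 300

# Rule of thumb for the normal approximation: np >= 5 and n(1 - p) >= 5.
assert n * p >= 5 and n * (1 - p) >= 5

se = math.sqrt(p * (1 - p) / n)          # standard error of the sample proportion
prob = 1 - NormalDist(p, se).cdf(0.5)    # P(p-hat > 0.5)
print(round(se, 4), round(prob, 3))
```

Under these assumptions the probability comes out at roughly 0.76.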
Sampling Distribution of the
Difference Between Two
Means
• The distribution of $\bar{x}_1 - \bar{x}_2$ is normal if
– The two samples are independent, and
– The parent populations are normally
distributed.
• If the two populations are not both
normally distributed, but the sample
sizes are 30 or more, the distribution of
$\bar{x}_1 - \bar{x}_2$ is approximately normal.
• Applying the laws of expected value and
variance we have:
$E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2$
$V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
• We can define:
$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
• Example
– The mean starting salaries of MBA students from
two universities (U1 and U2) are $62,000
(std. dev. = $14,500) and $60,000 (std.
dev. = $18,300).
– What is the probability that the sample mean of
U1 students will exceed the sample mean of
U2 students? (n_U1 = 50; n_U2 = 60)
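The salary example can be worked as sketched below. The sketch assumes independent samples and relies on both sample sizes being at least 30, so that the difference of sample means is approximately normal; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu1, sigma1, n1 = 62000, 14500, 50   # university U1
mu2, sigma2, n2 = 60000, 18300, 60   # university U2

mean_diff = mu1 - mu2                                   # E(x1-bar - x2-bar)
se_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # sqrt(V(x1-bar - x2-bar))

# P(x1-bar - x2-bar > 0)
p_u1_exceeds = 1 - NormalDist(mean_diff, se_diff).cdf(0)
print(round(se_diff, 1), round(p_u1_exceeds, 4))
```

With a standard error of about $3,128, the difference of zero lies roughly 0.64 standard errors below the expected difference of $2,000, giving a probability of about 0.74.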

Estimation
There are two areas of concern in inferential
statistics:
1. Estimation
The sample statistic is calculated from
the sample data and the population
parameter is inferred (or estimated)
from the sample statistic.
2. Sample size
How large a sample should be taken
to make an accurate estimate?

Estimation
There are two types of estimates:
1. Point estimate
A single value used to estimate an unknown population
parameter.
A point estimate is much more useful if it is accompanied
by an estimate of the error that might be involved.
2. Interval estimate.
A range of values used to estimate a parameter.
It indicates the error by:
a. the extent of its range, and
b. the probability of the true population parameter lying
within that range
Estimator’s Characteristics
• Selecting the right sample statistic to estimate
a parameter value depends on the
characteristics of the statistic.
Estimator’s desirable characteristics:
Unbiasedness: An unbiased estimator is one whose
expected value is equal to the parameter it estimates.
Consistency: An unbiased estimator is said to be
consistent if the difference between the estimator and
the parameter grows smaller as the sample size
increases.
Relative efficiency: For two unbiased estimators, the one
with the smaller variance is said to be relatively more efficient.

Estimation
Confidence Intervals
The point estimate is going to be different from
the population parameter because of
sampling error, and there is no way to know
how close it is to the actual parameter.
Statisticians like to give an interval estimate,
which is a range of values used to estimate
the parameter.

Estimation
A confidence interval is an interval estimate
with a specific level of confidence.
A level of confidence is the probability that
the interval estimate will contain the
parameter.
The level of confidence is $1 - \alpha$ (equal to the
probability).
An area of $1 - \alpha$ lies within the confidence interval.

Estimation
From the confidence level $1 - \alpha$,
we determine $\alpha$, $\alpha/2$, and finally $z_{\alpha/2}$.

$1 - \alpha$ | $\alpha$ | $\alpha/2$ | $z_{\alpha/2}$
0.90 | 0.10 | 0.05 | $z_{.05} = 1.645$
0.95 | 0.05 | 0.025 | $z_{.025} = 1.96$
0.98 | 0.02 | 0.01 | $z_{.01} = 2.33$
0.99 | 0.01 | 0.005 | $z_{.005} = 2.575$
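The z values in the table can be looked up programmatically. This is a sketch using the standard library's `NormalDist.inv_cdf`; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```

The computed values agree with the table to within the rounding of printed normal tables (for example, 2.33 and 2.575 are table approximations of about 2.326 and 2.576).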

Estimation
Estimating the population mean µ when the population
variance is known.
Suppose we have a population with mean µ and variance $\sigma^2$.
The population mean is assumed to be unknown, and our
task is to estimate its value.
The technique of estimation involves drawing a random
sample of size n and calculating the sample mean $\bar{x}$.
Since $\bar{x}$ is normally distributed or approx. normally
distributed, this means that the variable
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
is standard normally distributed.

Estimation
We can make probability statements about
this variable.
From the properties of a normal distribution,
there is a 95% chance that $\bar{x}$ is within 1.96
standard errors of µ.
Therefore,
$P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$

Estimation
Another way of looking at this:
Any time the observed sample mean $\bar{x}$
lies in the interval $\mu \pm 1.96 \sigma / \sqrt{n}$,
the interval $\bar{x} \pm 1.96 \sigma / \sqrt{n}$
encloses µ.
The interval $\bar{x} \pm 1.96 \sigma / \sqrt{n}$ that we construct
using the observed sample mean is called the
95% confidence interval for µ.

Estimation
Confidence Interval Estimator of µ
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
The probability $1 - \alpha$ is called the
confidence level.
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ is called the lower confidence
limit (LCL).
$\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ is called the upper confidence
limit (UCL).
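Graphical Demonstration of the
Confidence Interval for µ
[Figure: the confidence interval, centered on $\bar{x}$ with width $2 z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, running from the lower confidence limit $\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to the upper confidence limit $\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, with confidence level $1 - \alpha$]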
The Confidence Interval for µ (σ is
known)
• Example: Estimate the mean value of the distribution resulting from
the throw of a fair die. It is known that σ = 1.71. Use a 90%
confidence level, and 100 repeated throws of the die.

• Solution: The confidence interval is …..
• Recalculate the confidence interval for 95% confidence level.

• Solution:
[Figure: the .90 interval from $\bar{x} - .28$ to $\bar{x} + .28$ nested inside the .95 interval from $\bar{x} - .34$ to $\bar{x} + .34$]
• The width of the 90% confidence interval = 2(.28) = .56
The width of the 95% confidence interval = 2(.34) = .68
• Note that the 95% confidence interval is wider.

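The half-widths .28 and .34 used above follow from the interval formula with σ = 1.71 and n = 100. The sketch below recomputes them with the table values of z; `half_width` is a hypothetical helper name.

```python
import math

def half_width(z, sigma, n):
    """Half-width z * sigma / sqrt(n) of a confidence interval for the mean."""
    return z * sigma / math.sqrt(n)

sigma, n = 1.71, 100
for level, z in ((0.90, 1.645), (0.95, 1.96)):
    h = round(half_width(z, sigma, n), 2)   # rounded to two decimals, as on the slide
    print(level, h, round(2 * h, 2))
# 0.9 0.28 0.56
# 0.95 0.34 0.68
```

The widths are computed from the rounded half-widths, as on the slide; doubling the unrounded 95% half-width (about .335) would give .67 rather than .68.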
• Example
– Doll Computer Company delivers computers
directly to its customers who order via the
Internet.
– To reduce inventory costs in its warehouses Doll
employs an inventory model, that requires the
estimate of the mean demand during lead time.
– From a sample of 25, it is found that lead time
demand is normally distributed with a mean of
370.16 and a standard deviation of 75
computers per lead time.
– Estimate the lead time demand with 95%
confidence.
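A sketch of the interval for this example follows. It treats the standard deviation of 75 as the known population value, in line with the σ-known setting of these slides; the variable names are illustrative.

```python
import math

x_bar, sigma, n = 370.16, 75, 25
z = 1.96                                # table value for 95% confidence

margin = z * sigma / math.sqrt(n)       # 1.96 * 75 / 5 = 29.4
lcl, ucl = x_bar - margin, x_bar + margin
print(round(lcl, 2), round(ucl, 2))     # 340.76 399.56
```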

Estimation
Estimating population mean when the
population variance is unknown.
Since we don’t have the population standard
deviation, we will use the sample standard
deviation to estimate the population standard
deviation.
This formula will be used:
$\hat{\sigma} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
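The sample standard deviation with the n - 1 denominator can be computed as sketched below. The data values are hypothetical and serve only to illustrate the formula.

```python
import math
import statistics

data = [4.1, 5.0, 3.8, 4.6, 5.3, 4.4]   # hypothetical sample, for illustration only

n = len(data)
x_bar = sum(data) / n
# Sum of squared deviations from the sample mean, divided by n - 1
s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# statistics.stdev uses the same n - 1 denominator
print(round(x_bar, 3), round(s, 3), math.isclose(s, statistics.stdev(data)))
```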

Estimation
Confidence Interval Estimator of µ
$\bar{x} \pm z_{\alpha/2} \frac{\hat{\sigma}}{\sqrt{n}}$

When the population standard deviation is
unknown, and the sample size is small (
