Random Variables & Discrete Prob. Distributions
Random Variables and Discrete Probability Distributions
Random Variables…
A random variable is a function or rule that assigns a number to each outcome of an experiment. Alternatively, the value of a random variable is a numerical event. Instead of talking about the coin flipping event as {heads, tails} think of it as “the number of heads when flipping a coin” {1, 0} (numerical events)
Two Types of Random Variables…
Discrete Random Variable – one that takes on a countable number of values – E.g. the total showing on a roll of two dice: 2, 3, 4, …, 12
Continuous Random Variable – one whose values are uncountable – E.g. time (30.1 minutes? 30.10000001 minutes?)
Analogy: Integers are Discrete, while Real Numbers are Continuous
Probability Distributions…
A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values. Since we’re describing a random variable (which can be discrete or continuous) we have two types of probability distributions:
– Discrete Probability Distribution (this chapter), and
– Continuous Probability Distribution (Chapter 8)
Probability Notation…
An upper-case letter will represent the name of the random variable, usually X. Its lower-case counterpart will represent the value of the random variable. The probability that the random variable X will equal x is: P(X = x) or more simply P(x)
Discrete Probability Distributions…
The probabilities of the values of a discrete random variable may be derived by means of probability tools such as tree diagrams or by applying one of the definitions of probability, so long as these two conditions apply:
1. 0 ≤ P(x) ≤ 1 for all x
2. Σ P(x) = 1, where the sum is taken over all values x
Example 7.1
The Statistical Abstract of the United States is published annually. It contains a wide variety of information based on the census as well as other sources. The objective is to provide information about a variety of different aspects of the lives of the country’s residents. One of the questions asked households to report the number of color televisions in the household. The following table summarizes the data. Develop the probability distribution of the random variable defined as the number of color televisions per household.
Number of Color Televisions    Number of Households (1,000s)
0                              1,218
1                              32,379
2                              37,961
3                              19,387
4                              7,714
5                              2,842
Total                          101,501

Probability distributions can be estimated from relative frequencies. For example, P(0) = 1,218 ÷ 101,501 = 0.012.

x       0      1      2      3      4      5
P(x)    .012   .319   .374   .191   .076   .028

e.g. P(X=4) = P(4) = 0.076 = 7.6%
E.g. what is the probability there is at least one television but no more than three in any given household?
P(1 ≤ X ≤ 3) = P(1) + P(2) + P(3) = .319 + .374 + .191 = .884
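The relative-frequency estimate in Example 7.1 can be sketched in a few lines of Python. This is only an illustration: the names `households`, `distribution`, and `p_one_to_three` are made up for this sketch and do not come from the original slides.

```python
# Household counts (in thousands) by number of color televisions, from Example 7.1.
households = {0: 1218, 1: 32379, 2: 37961, 3: 19387, 4: 7714, 5: 2842}

total = sum(households.values())

# The relative frequency of each value is the estimate of P(X = x).
distribution = {x: count / total for x, count in households.items()}

# P(1 <= X <= 3) is the sum of the probabilities of the values in that range.
p_one_to_three = sum(p for x, p in distribution.items() if 1 <= x <= 3)

for x, p in distribution.items():
    print(f"P({x}) = {p:.3f}")
print(f"P(1 <= X <= 3) = {p_one_to_three:.3f}")
```

Because the probabilities are relative frequencies of one total, they automatically satisfy both requirements of a discrete distribution: each lies between 0 and 1, and together they sum to 1.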
Example 7.2…
A mutual fund salesperson has arranged to call on three people tomorrow. Based on past experience the salesperson knows that there is a 20% chance of closing a sale on each call. Determine the probability distribution of the number of sales the salesperson will make.
Let S denote success, i.e. closing a sale: P(S) = .20
Thus S^C (the complement of S) is not closing a sale, and P(S^C) = .80
Example 7.2…
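Developing a Probability Distribution…
[Tree diagram: Sales Call 1, Sales Call 2, Sales Call 3; each call branches into S with P(S) = .2 and S^C with P(S^C) = .8]

Outcome          X    Probability
S S S            3    (.2)(.2)(.2) = .008
S S S^C          2    (.2)(.2)(.8) = .032
S S^C S          2    (.2)(.8)(.2) = .032
S S^C S^C        1    (.2)(.8)(.8) = .128
S^C S S          2    (.8)(.2)(.2) = .032
S^C S S^C        1    (.8)(.2)(.8) = .128
S^C S^C S        1    (.8)(.8)(.2) = .128
S^C S^C S^C      0    (.8)(.8)(.8) = .512

x       3             2                 1                 0
P(x)    .2³ = .008    3(.032) = .096    3(.128) = .384    .8³ = .512

P(X=2) is illustrated here: three of the eight paths contain exactly two sales, and each has probability (.2)(.2)(.8) = .032.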
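The tree in Example 7.2 can also be enumerated programmatically. The sketch below walks all eight paths and accumulates the probability of each number of sales; the constants `P_SALE` and `N_CALLS` and the dictionary `distribution` are illustrative names, not part of the original example.

```python
from itertools import product

P_SALE = 0.2   # P(S) from Example 7.2
N_CALLS = 3    # three sales calls

# Walk every path of the tree: each call is a sale (True) or not (False).
distribution = {x: 0.0 for x in range(N_CALLS + 1)}
for path in product([True, False], repeat=N_CALLS):
    probability = 1.0
    for sale in path:
        # Calls are independent, so path probabilities multiply.
        probability *= P_SALE if sale else 1 - P_SALE
    # sum(path) counts the sales on this path.
    distribution[sum(path)] += probability

for x in sorted(distribution, reverse=True):
    print(f"P(X = {x}) = {distribution[x]:.3f}")
```

Multiplying along a path is valid only because the calls are assumed independent, which is the same assumption the tree diagram makes.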
Population/Probability Distribution…
The discrete probability distribution represents a population
Example 7.1 the population of number of TVs per household Example 7.2 the population of sales call outcomes
Since we have populations, we can describe them by computing various parameters. E.g. the population mean and population variance.
Population Mean (Expected Value)
The population mean is the weighted average of all of its values. The weights are the probabilities. This parameter is also called the expected value of X and is represented by E(X).
E(X) = μ = Σ x P(x), summed over all x
Population Variance…
The population variance is calculated similarly. It is the weighted average of the squared deviations from the mean.
V(X) = σ² = Σ (x – μ)² P(x), summed over all x
As before, there is a “short-cut” formulation…
V(X) = σ² = Σ x² P(x) – μ²
The standard deviation is the same as before:
σ = √σ²
Example 7.3…
Find the mean, variance, and standard deviation for the population of the number of color televisions per household… (from Example 7.1)
E(X) = μ = 0(.012) + 1(.319) + 2(.374) + 3(.191) + 4(.076) + 5(.028) = 2.084
V(X) = σ² = (0 – 2.084)²(.012) + (1 – 2.084)²(.319) + … + (5 – 2.084)²(.028) = 1.107
σ = √1.107 = 1.052
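As a rough check on Example 7.3, the weighted averages can be computed directly from the distribution. The variable names below are chosen for this sketch only; the short-cut formula is included to show that it agrees with the definition.

```python
from math import sqrt

# P(x) for the number of color televisions per household (Example 7.1).
distribution = {0: 0.012, 1: 0.319, 2: 0.374, 3: 0.191, 4: 0.076, 5: 0.028}

# Mean: probability-weighted average of the values.
mean = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Short-cut form: weighted average of x squared, minus the squared mean.
variance_shortcut = sum(x ** 2 * p for x, p in distribution.items()) - mean ** 2

std_dev = sqrt(variance)

print(f"mean = {mean:.3f}, variance = {variance:.3f}, std dev = {std_dev:.3f}")
```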
Laws of Expected Value…
1. E(c) = c
The expected value of a constant (c) is just the value of the constant.
2. E(X + c) = E(X) + c
3. E(cX) = cE(X)
We can “pull” a constant out of the expected value expression (either as part of a sum with a random variable X or as a coefficient of random variable X).
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the mean monthly profit.
1) Describe the problem statement in algebraic terms:
sales have a mean of $25,000 → E(Sales) = 25,000
profits are calculated by… → Profit = .30(Sales) – 6,000
2) The mean of profit is
E(Profit) = E[.30(Sales) – 6,000]
= E[.30(Sales)] – 6,000 [by rule #2]
= .30E(Sales) – 6,000 [by rule #3]
= .30(25,000) – 6,000 = 1,500
Thus, the mean monthly profit is $1,500
Laws of Variance…
1. V(c) = 0
The variance of a constant (c) is zero.
2. V(X + c) = V(X)
The variance of a random variable plus a constant is just the variance of the random variable (per 1 above).
3. V(cX) = c²V(X)
The variance of a random variable multiplied by a constant coefficient is the coefficient squared times the variance of the random variable.
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the standard deviation of monthly profits.
1) Describe the problem statement in algebraic terms:
sales have a standard deviation of $4,000 → V(Sales) = 4,000² = 16,000,000 (remember that the variance is the square of the standard deviation)
profits are calculated by… → Profit = .30(Sales) – 6,000
2) The variance of profit is
V(Profit) = V[.30(Sales) – 6,000]
= V[.30(Sales)] [by rule #2]
= (.30)²V(Sales) [by rule #3]
= (.30)²(16,000,000) = 1,440,000
Again, standard deviation is the square root of variance, so the standard deviation of Profit = √1,440,000 = $1,200
Example 7.4 (summary)
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the mean and standard deviation of monthly profits.
The mean monthly profit is $1,500
The standard deviation of monthly profit is $1,200
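The laws of expected value and variance used in Example 7.4 reduce to two one-line formulas for any linear function Y = aX + b. The helper `linear_moments` below is a hypothetical name introduced for this sketch.

```python
def linear_moments(mean_x, sd_x, a, b):
    """Return the mean and standard deviation of Y = a*X + b."""
    mean_y = a * mean_x + b            # E(aX + b) = aE(X) + b
    variance_y = a ** 2 * sd_x ** 2    # V(aX + b) = a^2 V(X); the constant b drops out
    return mean_y, variance_y ** 0.5

# Profit = .30(Sales) - 6,000, with E(Sales) = 25,000 and sd(Sales) = 4,000.
mean_profit, sd_profit = linear_moments(25_000, 4_000, a=0.30, b=-6_000)

print(f"mean profit = {mean_profit:,.0f}, std dev of profit = {sd_profit:,.0f}")
```

Note that the fixed cost shifts the mean but has no effect on the standard deviation, which is simply |a| times the standard deviation of sales.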
Bivariate Distributions…
Up to now, we have looked at univariate distributions, i.e. probability distributions in one variable. As you might guess, bivariate distributions are probabilities of combinations of two variables. Bivariate probability distributions are also called joint probability distributions. A joint probability distribution of X and Y is a table or formula that lists the joint probabilities for all pairs of values x and y, and is denoted P(x,y).
P(x,y) = P(X=x and Y=y)
Discrete Bivariate Distribution…
As you might expect, the requirements for a bivariate distribution are similar to a univariate distribution, with only minor changes to the notation:
1. 0 ≤ P(x,y) ≤ 1 for all pairs (x,y).
2. Σ Σ P(x,y) = 1, where the double sum is taken over all x and all y.
Example 7.5…
Xavier and Yvette are real estate agents. Let X denote the number of houses that Xavier will sell in a month and let Y denote the number of houses Yvette will sell in a month. An analysis of their past monthly performances yields the following joint probabilities (bivariate probability distribution).

          X = 0    X = 1    X = 2
Y = 0     .12      .42      .06
Y = 1     .21      .06      .03
Y = 2     .07      .02      .01
Marginal Probabilities…
As before, we can calculate the marginal probabilities by summing across rows and down columns to determine the probabilities of X and Y individually:
P(X=0) = .40, P(X=1) = .50, P(X=2) = .10
P(Y=0) = .60, P(Y=1) = .30, P(Y=2) = .10
E.g. the probability that Xavier sells 1 house = P(X=1) = .42 + .06 + .02 = 0.50
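A minimal sketch of the marginal calculation, using the Example 7.5 joint probabilities (the same values that are tabulated in Example 7.6). The dictionary layout and the names `joint`, `marginal_x`, and `marginal_y` are assumptions made for this illustration.

```python
from collections import defaultdict

# Joint probabilities P(x, y): x = houses sold by Xavier, y = houses sold by Yvette.
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}

marginal_x = defaultdict(float)
marginal_y = defaultdict(float)
for (x, y), p in joint.items():
    marginal_x[x] += p   # sum over all y for a fixed x
    marginal_y[y] += p   # sum over all x for a fixed y

print({x: round(p, 2) for x, p in marginal_x.items()})
print({y: round(p, 2) for y, p in marginal_y.items()})
```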
Describing the Bivariate Distribution…
We can describe the mean, variance, and standard deviation of each variable in a bivariate distribution by working with the marginal probabilities…
same formulae as for univariate distributions…
E(X) = .7, V(X) = .41, σx = .64
E(Y) = .5, V(Y) = .45, σy = .67
Covariance…
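The covariance of two discrete variables is defined as:
COV(X,Y) = σxy = Σ Σ (x – μx)(y – μy) P(x,y), summed over all x and all y
or alternatively using this shortcut method:
COV(X,Y) = σxy = Σ Σ x y P(x,y) – μx μy
Coefficient of Correlation…
The coefficient of correlation is calculated in the same way as described earlier…
ρ = COV(X,Y) ÷ (σx σy)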
Example 7.6…
Compute the covariance and the coefficient of correlation between the numbers of houses sold by Xavier and Yvette.
COV(X,Y) = (0 – .7)(0 – .5)(.12) + (1 – .7)(0 – .5)(.42) + … + (2 – .7)(2 – .5)(.01) = –.15

x    y    P(x,y)    x – μx    y – μy    (x – μx)(y – μy)P(x,y)
0    0    0.12      –0.7      –0.5       0.042
0    1    0.21      –0.7       0.5      –0.074
0    2    0.07      –0.7       1.5      –0.074
1    0    0.42       0.3      –0.5      –0.063
1    1    0.06       0.3       0.5       0.009
1    2    0.02       0.3       1.5       0.009
2    0    0.06       1.3      –0.5      –0.039
2    1    0.03       1.3       0.5       0.020
2    2    0.01       1.3       1.5       0.020
μx = 0.7, μy = 0.5                Total: –0.150

ρ = COV(X,Y) ÷ (σx σy) = –0.150 ÷ [(.64)(.67)] = –.35
There is a weak, negative relationship between the two variables.
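The covariance and correlation in Example 7.6 can be reproduced with a short script. This is a sketch under the same joint probabilities; all variable names are illustrative, and the short-cut formula is included only as a cross-check.

```python
from math import sqrt

# Joint probabilities P(x, y) for Xavier (x) and Yvette (y).
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}

# Means and variances computed directly from the joint distribution.
mean_x = sum(x * p for (x, _), p in joint.items())
mean_y = sum(y * p for (_, y), p in joint.items())
var_x = sum((x - mean_x) ** 2 * p for (x, _), p in joint.items())
var_y = sum((y - mean_y) ** 2 * p for (_, y), p in joint.items())

# Covariance by definition and by the short-cut formula.
cov = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())
cov_shortcut = sum(x * y * p for (x, y), p in joint.items()) - mean_x * mean_y

# Coefficient of correlation.
rho = cov / (sqrt(var_x) * sqrt(var_y))

print(f"COV = {cov:.3f}, rho = {rho:.2f}")
```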
The bivariate distribution allows us to develop the probability distribution of any combination of the two variables. Of particular interest is the sum of two variables. If we consider our example of Xavier and Yvette selling houses, we can create a probability distribution of the total number of houses sold, X + Y…
Sum of Two Variables…
x + y       0      1      2      3      4
P(x + y)    .12    .63    .19    .05    .01
…to answer questions like “what is the probability that two houses are sold?”
P(X+Y=2) = P(0,2) + P(1,1) + P(2,0) = .07 + .06 + .06 = .19
Likewise, we can compute the expected value, variance, and standard deviation of X+Y in the usual way…
E(X + Y) = 0(.12) + 1(.63) + 2(.19) + 3(.05) + 4(.01) = 1.2
V(X + Y) = (0 – 1.2)²(.12) + … + (4 – 1.2)²(.01) = .56
σ(x+y) = √V(X + Y) = √.56 = .75
Laws…
We can derive laws of expected value and variance for the sum of two variables as follows…
1. E(X + Y) = E(X) + E(Y)
2. V(X + Y) = V(X) + V(Y) + 2COV(X, Y)
If X and Y are independent, COV(X, Y) = 0 and thus V(X + Y) = V(X) + V(Y)
Laws
Applying the laws to the Xavier and Yvette example:
E(X + Y) = E(X) + E(Y) = .7 + .5 = 1.2
V(X + Y) = V(X) + V(Y) + 2COV(X, Y) = .41 + .45 + 2(–.15) = .56
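One way to convince yourself of these laws is to build the distribution of X + Y from the joint probabilities and compare the two routes. The sketch below does this for the Xavier and Yvette data; the helper `moments` and the other names are hypothetical.

```python
# Joint probabilities P(x, y) for Xavier (x) and Yvette (y).
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}

def moments(dist):
    """Return (mean, variance) of a {value: probability} distribution."""
    mean = sum(v * p for v, p in dist.items())
    variance = sum((v - mean) ** 2 * p for v, p in dist.items())
    return mean, variance

# Build the distribution of the total X + Y and the two marginals.
total_dist, x_dist, y_dist = {}, {}, {}
for (x, y), p in joint.items():
    total_dist[x + y] = total_dist.get(x + y, 0.0) + p
    x_dist[x] = x_dist.get(x, 0.0) + p
    y_dist[y] = y_dist.get(y, 0.0) + p

# Route 1: moments of X + Y computed "in the usual way".
mean_total, var_total = moments(total_dist)

# Route 2: the laws of expected value and variance.
mean_x, var_x = moments(x_dist)
mean_y, var_y = moments(y_dist)
cov = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())
mean_by_law = mean_x + mean_y
var_by_law = var_x + var_y + 2 * cov

print(f"E(X+Y): {mean_total:.2f} vs {mean_by_law:.2f}")
print(f"V(X+Y): {var_total:.2f} vs {var_by_law:.2f}")
```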
Portfolio Diversification and Asset Allocation
Consider an investor who forms a portfolio, consisting of only two stocks, by investing $4,000 in one stock and $6,000 in a second stock. Suppose that the results after 1 year are:
One-Year Results
Stock    Initial Investment    Value of Investment After One Year    Rate of Return on Investment
1        $4,000                $5,000                                R1 = .25 (25%)
2        $6,000                $5,400                                R2 = –.10 (–10%)
Total    $10,000               $10,400                               Rp = .04 (4%)
Portfolio Diversification and Asset Allocation
Mean and Variance of a Portfolio of Two Stocks
E(Rp) = w1E(R1) + w2E(R2)
V(Rp) = w1²V(R1) + w2²V(R2) + 2w1w2COV(R1, R2)
= w1²σ1² + w2²σ2² + 2w1w2ρσ1σ2
where w1 and w2 are the proportions or weights of investments 1 and 2, E(R1) and E(R2) are their expected values, σ1 and σ2 are their standard deviations, and ρ is the coefficient of correlation
Example 7.8
An investor has decided to form a portfolio by putting 25% of his money into McDonald’s stock and 75% into Cisco Systems stock. The investor assumes that the expected returns will be 8% and 15%, respectively, and that the standard deviations will be 12% and 22%, respectively.
a Find the expected return on the portfolio.
b Compute the standard deviation of the returns on the portfolio assuming that
(i) the two stocks’ returns are perfectly positively correlated
(ii) the coefficient of correlation is .5
(iii) the two stocks’ returns are uncorrelated
Example 7.8 Solution
a The expected values of the two stocks are E(R1) = .08 and E(R2) = .15. The weights are w1 = .25 and w2 = .75. Thus,
E(Rp) = w1E(R1) + w2E(R2) = .25(.08) + .75(.15) = .1325
Example 7.8 Solution
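b The standard deviations are σ1 = .12 and σ2 = .22. Thus,
V(Rp) = w1²σ1² + w2²σ2² + 2w1w2ρσ1σ2
= (.25²)(.12²) + (.75²)(.22²) + 2(.25)(.75)ρ(.12)(.22)
= .0281 + .0099ρ
(i) When ρ = 1, V(Rp) = .0281 + .0099(1) = .0380, so the standard deviation is √.0380 = .1949
(ii) When ρ = .5, V(Rp) = .0281 + .0099(.5) = .0331, so the standard deviation is √.0331 = .1819
(iii) When ρ = 0, V(Rp) = .0281 + .0099(0) = .0281, so the standard deviation is √.0281 = .1676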
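The two-stock formulas in Example 7.8 translate directly into code. The function `portfolio_two_stocks` is a hypothetical helper written for this sketch. Because it keeps full precision in the intermediate values, its standard deviations can differ in the last digit from figures obtained by taking the square root of a variance already rounded to four decimals.

```python
from math import sqrt

def portfolio_two_stocks(w1, w2, mean1, mean2, sd1, sd2, rho):
    """Return (expected return, variance, standard deviation) of a two-stock portfolio."""
    expected = w1 * mean1 + w2 * mean2
    variance = (w1 ** 2 * sd1 ** 2
                + w2 ** 2 * sd2 ** 2
                + 2 * w1 * w2 * rho * sd1 * sd2)
    return expected, variance, sqrt(variance)

# Example 7.8: 25% in stock 1 and 75% in stock 2, under three correlation assumptions.
results = {
    rho: portfolio_two_stocks(0.25, 0.75, 0.08, 0.15, 0.12, 0.22, rho)
    for rho in (1.0, 0.5, 0.0)
}

for rho, (expected, variance, sd) in results.items():
    print(f"rho = {rho}: E(Rp) = {expected:.4f}, V(Rp) = {variance:.4f}, sd = {sd:.4f}")
```

The expected return is the same in all three cases; only the risk changes, and it falls as the correlation falls, which is the point of diversification.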
Portfolio Diversification in Practice
The formulas introduced in this section require that we know the expected values, variances, and covariance (or coefficient of correlation) of the investments we’re interested in. The question arises, How do we determine these parameters? (Incidentally, this question is rarely addressed in finance textbooks!) The most common procedure is to estimate the parameters from historical data, using sample statistics.
Portfolios with More Than Two Stocks
We can extend the formulas that describe the mean and variance of the returns of a portfolio of two stocks to a portfolio of any number k of stocks.
Mean and Variance of a Portfolio of k Stocks
E(Rp) = Σ(i=1..k) wi E(Ri)
V(Rp) = Σ(i=1..k) wi² σi² + 2 Σ(i=1..k) Σ(j=i+1..k) wi wj COV(Ri, Rj)
Where Ri is the return of the ith stock, wi is the proportion of the portfolio invested in stock i, and k is the number of stocks in the portfolio.
Portfolios with More Than Two Stocks
When k is greater than 2 the calculations can be tedious and time-consuming. For example, when k = 3, we need to know the values of the three weights, three expected values, three variances, and three covariances. When k = 4, there are four expected values, four variances and six covariances. [The number of covariances required in general is k(k-1)/2.] To assist you we have created an Excel worksheet to perform the computations when k =2, 3, or 4. (For larger values of k, see the reference at the end of the chapter.) To demonstrate we’ll return to the problem described in this chapter’s introduction.
Chapter-Opening Example
An investor has $100,000 to invest in the stock market. She is interested in developing a stock portfolio made up of General Electric, General Motors, McDonald’s, and Motorola. However, she doesn’t know how much to invest in each one. She wants to maximize her return, but she would also like to minimize the risk. She has computed the monthly returns for all four stocks during a 60-month period (January 2001 to December 2006) (Xm07-00).
After some consideration, she narrowed her choices down to the following three. What should she do?
1. $25,000 in each stock
2. General Electric: $10,000, General Motors: $20,000, McDonald’s: $30,000, Motorola: $40,000
3. General Electric: $10,000, General Motors: $50,000, McDonald’s: $30,000, Motorola: $10,000
Chapter-Opening Example
Because of the large number of calculations we will solve this problem using only Excel. From the file we compute the means of each stock’s returns.

                GE          GM          McDonalds    Motorola
Mean return     0.000305    0.002339    0.007910     0.007997
Chapter-Opening Example
Next we compute the variance-covariance matrix. (The commands are the same as those described in Chapter 4; simply include all the columns of the returns of the investments you wish to include in the portfolio.)

             GE          GM          McDonalds    Motorola
GE           0.003493
GM           0.001076    0.011016
McDonalds    0.001528    0.001989    0.005409
Motorola     0.000933    0.004131    0.002515     0.010277
Chapter-Opening Example
Notice that the variances of the returns are listed on the diagonal. Thus, for example, the variance of the 60 monthly returns of General Electric is .00349. The covariances appear below the diagonal. The covariance between the returns of General Electric and General Motors is .00108. The means and the variance-covariance matrix are copied to the Portfolio Diversification spreadsheet. The weights are typed, producing the accompanying output.
Portfolio of 4 Stocks: spreadsheet output for the three plans (each plan uses the variance-covariance matrix and expected returns computed above)

Plan    Weights (GE, GM, McDonalds, Motorola)    Expected Value    Variance    Standard Deviation
1       .25, .25, .25, .25                       0.0046            0.0034      0.0584
2       .10, .20, .30, .40                       0.0061            0.0043      0.0657
3       .10, .50, .30, .10                       0.0044            0.0048      0.0690
Chapter-Opening Example
Plan 3 has the smallest expected value and the largest variance, making it the worst of the three plans. Plan 2 has the largest expected value, whereas plan 1 has the smallest variance. If the investor is like most investors she would select Plan 1 because of its lower risk. Other more daring investors may choose plan 2 to take advantage of its higher expected value.
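The spreadsheet computation for the three plans can be approximated in plain Python using the k-stock formulas. This is a sketch, not the Excel worksheet itself: the names `means`, `lower`, `portfolio_stats`, and `plans` are invented here, and the covariance matrix is stored as its lower triangle exactly as it is listed in the text.

```python
from math import sqrt

# Mean monthly returns, in the order GE, GM, McDonalds, Motorola.
means = [0.000305, 0.002339, 0.007910, 0.007997]

# Lower triangle of the variance-covariance matrix (variances on the diagonal).
lower = [
    [0.003493],
    [0.001076, 0.011016],
    [0.001528, 0.001989, 0.005409],
    [0.000933, 0.004131, 0.002515, 0.010277],
]

def portfolio_stats(weights):
    """Return (expected value, variance, standard deviation) for the given weights."""
    k = len(weights)
    expected = sum(w * m for w, m in zip(weights, means))
    # Sum of w_i^2 * variance_i ...
    variance = sum(weights[i] ** 2 * lower[i][i] for i in range(k))
    # ... plus 2 * w_i * w_j * COV(R_i, R_j) over each pair with j > i.
    variance += 2 * sum(
        weights[i] * weights[j] * lower[j][i]
        for i in range(k)
        for j in range(i + 1, k)
    )
    return expected, variance, sqrt(variance)

plans = {
    1: [0.25, 0.25, 0.25, 0.25],
    2: [0.10, 0.20, 0.30, 0.40],
    3: [0.10, 0.50, 0.30, 0.10],
}
results = {plan: portfolio_stats(weights) for plan, weights in plans.items()}

for plan, (expected, variance, sd) in results.items():
    print(f"Plan {plan}: E = {expected:.4f}, V = {variance:.4f}, sd = {sd:.4f}")
```

With k = 4 the inner sum runs over the six covariances, matching the k(k-1)/2 count given earlier.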
Binomial Distribution…
The binomial distribution is the probability distribution that results from doing a “binomial experiment”. Binomial experiments have the following properties:
1. Fixed number of trials, represented as n.
2. Each trial has two possible outcomes, a “success” and a “failure”.
3. P(success) = p (and thus: P(failure) = 1–p), for all trials.
4. The trials are independent, which means that the outcome of one trial does not affect the outcomes of any other trials.
Success and Failure…
…are just labels for a binomial experiment; there is no value judgment implied. For example, a coin flip will result in either heads or tails. If we define “heads” as success then necessarily “tails” is considered a failure (inasmuch as we are attempting to have the coin land heads up). Other binomial experiment notions:
– An election candidate wins or loses
– An employee is male or female
Binomial Random Variable…
The random variable of a binomial experiment is defined as the number of successes in the n trials, and is called the binomial random variable. E.g. flip a fair coin 10 times…
1) Fixed number of trials: n = 10
2) Each trial has two possible outcomes: {heads (success), tails (failure)}
3) P(success) = 0.50; P(failure) = 1 – 0.50 = 0.50
4) The trials are independent (i.e. the outcome of heads on the first flip will have no impact on subsequent coin flips).
Hence flipping a coin ten times is a binomial experiment since all conditions were met.
Binomial Random Variable…
The binomial random variable counts the number of successes in n trials of the binomial experiment. It can take on the values 0, 1, 2, …, n. Thus, it’s a discrete random variable. To calculate the probability associated with each value we use combinatorics:
P(x) = [n! / (x!(n – x)!)] p^x (1 – p)^(n – x) for x = 0, 1, 2, …, n
Pat Statsdud…
Pat Statsdud is a (not good) student taking a statistics course. Pat’s exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions. Each question has five possible answers, only one of which is correct. Pat plans to guess the answer to each question. What is the probability that Pat gets no answers correct?
What is the probability that Pat gets two answers correct?
Pat Statsdud…
Pat Statsdud is a (not good) student taking a statistics course whose exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions. Each question has five possible answers, only one of which is correct. Pat plans to guess the answer to each question. Algebraically then: n=10, and P(success) = 1/5 = .20
Pat Statsdud…
Is this a binomial experiment? Check the conditions:
1. There is a fixed finite number of trials (n=10).
2. An answer can be either correct or incorrect.
3. The probability of a correct answer (P(success)=.20) does not change from question to question.
4. Each answer is independent of the others.
Pat Statsdud…
n=10, and P(success) = .20
What is the probability that Pat gets no answers correct? I.e. # success, x, = 0; hence we want to know P(x=0)
P(0) = [10! / (0!(10 – 0)!)] (.20)^0 (1 – .20)^(10 – 0) = 1(1)(.80)^10 = .1074
Pat has about an 11% chance of getting no answers correct using the guessing strategy.
Pat Statsdud…
n=10, and P(success) = .20
What is the probability that Pat gets two answers correct? I.e. # success, x, = 2; hence we want to know P(x=2)
P(2) = [10! / (2!(10 – 2)!)] (.20)^2 (1 – .20)^(10 – 2) = 45(.04)(.1678) = .3020
Pat has about a 30% chance of getting exactly two answers correct using the guessing strategy.
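The two Pat Statsdud probabilities can be checked with a small function for the binomial formula. The name `binomial_pmf` is an assumption of this sketch; `math.comb` from the Python standard library supplies the number of combinations.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Pat guesses on 10 questions, each with a 1-in-5 chance of being right.
p_none = binomial_pmf(0, 10, 0.20)
p_two = binomial_pmf(2, 10, 0.20)

print(f"P(X = 0) = {p_none:.4f}")
print(f"P(X = 2) = {p_two:.4f}")
```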
Cumulative Probability…
Thus far, we have been using the binomial probability distribution to find probabilities for individual values of x. Answering the question “What is the probability that Pat fails the quiz?” requires a cumulative probability, that is, P(X ≤ x). If a grade on the quiz is less than 50% (i.e. fewer than 5 questions correct out of 10), that’s considered a failed quiz. Thus, we want to know P(X ≤ 4).
Pat Statsdud…
P(X ≤ 4) = P(0) + P(1) + P(2) + P(3) + P(4)
We already know P(0) = .1074 and P(2) = .3020. Using the binomial formula to calculate the others: P(1) = .2684, P(3) = .2013, and P(4) = .0881
We have P(X ≤ 4) = .1074 + .2684 + … + .0881 = .9672
Thus, it’s about 97% probable that Pat will fail the test using the luck strategy and guessing at answers…
Binomial Table…
Calculating binomial probabilities by hand is tedious and error prone. There is an easier way. Refer to Table 1 in Appendix B. For the Pat Statsdud example, n=10, so the first important step is to get the correct table!
n = 10
k    p=0.01   0.05     0.10     0.20     0.25     0.30     0.40     0.50     0.60     0.70     0.75     0.80     0.90     0.95     0.99
0    0.9044   0.5987   0.3487   0.1074   0.0563   0.0282   0.0060   0.0010   0.0001   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
1    0.9957   0.9139   0.7361   0.3758   0.2440   0.1493   0.0464   0.0107   0.0017   0.0001   0.0000   0.0000   0.0000   0.0000   0.0000
2    0.9999   0.9885   0.9298   0.6778   0.5256   0.3828   0.1673   0.0547   0.0123   0.0016   0.0004   0.0001   0.0000   0.0000   0.0000
3    1.0000   0.9990   0.9872   0.8791   0.7759   0.6496   0.3823   0.1719   0.0548   0.0106   0.0035   0.0009   0.0000   0.0000   0.0000
4    1.0000   0.9999   0.9984   0.9672   0.9219   0.8497   0.6331   0.3770   0.1662   0.0473   0.0197   0.0064   0.0001   0.0000   0.0000
5    1.0000   1.0000   0.9999   0.9936   0.9803   0.9527   0.8338   0.6230   0.3669   0.1503   0.0781   0.0328   0.0016   0.0001   0.0000
6    1.0000   1.0000   1.0000   0.9991   0.9965   0.9894   0.9452   0.8281   0.6177   0.3504   0.2241   0.1209   0.0128   0.0010   0.0000
7    1.0000   1.0000   1.0000   0.9999   0.9996   0.9984   0.9877   0.9453   0.8327   0.6172   0.4744   0.3222   0.0702   0.0115   0.0001
8    1.0000   1.0000   1.0000   1.0000   1.0000   0.9999   0.9983   0.9893   0.9536   0.8507   0.7560   0.6242   0.2639   0.0861   0.0043
9    1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   0.9999   0.9990   0.9940   0.9718   0.9437   0.8926   0.6513   0.4013   0.0956
Binomial Table…
The probabilities listed in the tables are cumulative, i.e. P(X ≤ k). k is the row index; the columns of the table are organized by P(success) = p
Binomial Table…
“What is the probability that Pat gets no answers correct?” i.e. what is P(X = 0), given P(success) = .20 and n=10?
P(X = 0) = P(X ≤ 0) = .1074
Binomial Table…
“What is the probability that Pat gets two answers correct?” i.e. what is P(X = 2), given P(success) = .20 and n=10?
P(X = 2) = P(X ≤ 2) – P(X ≤ 1) = .6778 – .3758 = .3020
remember, the table shows cumulative probabilities…
Binomial Distribution…
“What is the probability that Pat fails the quiz?” i.e. what is P(X ≤ 4), given P(success) = .20 and n=10?
P(X ≤ 4) = .9672
Binomial Table…
The binomial table gives cumulative probabilities for P(X ≤ k), but as we’ve seen in the last example,
P(X = k) = P(X ≤ k) – P(X ≤ [k–1])
Likewise, for probabilities given as P(X ≥ k), we have:
P(X ≥ k) = 1 – P(X ≤ [k–1])
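The cumulative table and the two identities above can be reproduced with a short sketch. The helpers `binomial_pmf` and `binomial_cdf` are hypothetical names; the last line regenerates the p = 0.20 column of the n = 10 table for k = 0 to 9.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(k, n, p):
    """Cumulative probability P(X <= k)."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))

n, p = 10, 0.20

# Probability that Pat fails the quiz: four or fewer correct answers.
p_fail = binomial_cdf(4, n, p)

# P(X = k) = P(X <= k) - P(X <= k-1)
p_exactly_two = binomial_cdf(2, n, p) - binomial_cdf(1, n, p)

# P(X >= k) = 1 - P(X <= k-1); here, the probability that Pat passes.
p_at_least_five = 1 - binomial_cdf(4, n, p)

# The p = 0.20 column of the cumulative table for n = 10.
column = [round(binomial_cdf(k, n, p), 4) for k in range(10)]

print(f"P(X <= 4) = {p_fail:.4f}")
print(f"P(X = 2)  = {p_exactly_two:.4f}")
print(f"P(X >= 5) = {p_at_least_five:.4f}")
print(column)
```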
=BINOMDIST() Excel Function…
There is a binomial distribution function in Excel that can also be used to calculate these probabilities. Its arguments are, in order: # successes, # trials, P(success), and cumulative (i.e. do we want P(X ≤ x)?). For example:
What is the probability that Pat gets two answers correct?
=BINOMDIST(2, 10, .20, FALSE) gives P(X=2) = .3020
What is the probability that Pat fails the quiz?
=BINOMDIST(4, 10, .20, TRUE) gives P(X≤4) = .9672
Binomial Distribution…
As you might expect, statisticians have developed general formulas for the mean, variance, and standard deviation of a binomial random variable. They are:
μ = np
σ² = np(1 – p)
σ = √(np(1 – p))
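As an illustration, the general formulas can be compared against the definitions of mean and variance applied to the full binomial distribution. The sketch uses Pat's quiz (n = 10, p = .20) as the example; the variable names are invented here.

```python
from math import comb, sqrt

n, p = 10, 0.20

# General binomial formulas.
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

# Cross-check: apply the definitions to the full distribution.
pmf = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}
mean_by_definition = sum(x * prob for x, prob in pmf.items())
variance_by_definition = sum((x - mean_by_definition) ** 2 * prob for x, prob in pmf.items())

print(f"mean = {mean:.1f}, variance = {variance:.1f}, std dev = {std_dev:.3f}")
```

So a student who guesses on every question should expect about 2 correct answers, give or take a little more than one.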
Poisson Distribution…
Named for Siméon Poisson, the Poisson distribution is a discrete probability distribution and refers to the number of events (a.k.a. successes) within a specific time period or region of space. For example:
– The number of cars arriving at a service station in 1 hour. (The interval of time is 1 hour.)
– The number of flaws in a bolt of cloth. (The specific region is a bolt of cloth.)
– The number of accidents in 1 day on a particular stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of highway.)
The Poisson Experiment…
Like a binomial experiment, a Poisson experiment has four defining characteristics:
1. The number of successes that occur in any interval is independent of the number of successes that occur in any other interval.
2. The probability of a success in an interval is the same for all equal-size intervals.
3. The probability of a success is proportional to the size of the interval.
4. The probability of more than one success in an interval approaches 0 as the interval becomes smaller.
Poisson Distribution…
The Poisson random variable is the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
E.g. On average, 96 trucks (successes) arrive at a border crossing every hour (time period).
E.g. The number of typographic errors (successes?!) in a new textbook edition averages 1.5 per 100 pages (interval).
Poisson Probability Distribution…
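The probability that a Poisson random variable assumes a value of x is given by:
P(x) = e^(–μ) μ^x / x! for x = 0, 1, 2, …
where μ is the mean number of successes in the interval or region, and e is the base of the natural logarithm (e ≈ 2.71828).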
Example 7.12…
A statistics instructor has observed that the number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis he concludes that the number of errors is Poisson distributed with a mean of 1.5 per 100 pages. The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos? That is, what is P(X=0) given that µ = 1.5?
$$P(0) = \frac{e^{-1.5}(1.5)^{0}}{0!} = e^{-1.5} = .2231$$
“There is about a 22% chance of finding zero errors”
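The 22% figure can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name poisson_pmf is an assumption made for this illustration, not something from the slides.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

mu = 1.5  # mean number of typos per 100 pages (Example 7.12)

p_no_typos = poisson_pmf(0, mu)
print(round(p_no_typos, 4))  # 0.2231

# The same helper gives the rest of the distribution for 100 pages.
for x in range(5):
    print(x, round(poisson_pmf(x, mu), 4))
```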
Example 7.13…
Refer to Example 7.12. Suppose that the instructor has just received a copy of a new statistics book. He notices that there are 400 pages.
a) What is the probability that there are no typos?
b) What is the probability that there are five or fewer typos?
Poisson Distribution…
As mentioned on the Poisson experiment slide, the probability of a success is proportional to the size of the interval. Thus, knowing an error rate of 1.5 typos per 100 pages, we can determine the mean for a 400-page book as: µ = 1.5(4) = 6 typos per 400 pages.
Example 7.13…
For a 400-page book, what is the probability that there are no typos?
$$P(X=0) = \frac{e^{-6}(6)^{0}}{0!} = e^{-6} = .002479$$
“there is a very small chance there are no typos”
Example 7.13…
For a 400-page book, what is the probability that there are five or fewer typos? P(X ≤ 5) = P(0) + P(1) + … + P(5)
This is rather tedious to solve manually. A better alternative is to refer to Table 2 in Appendix B… …k=5, µ = 6, and P(X ≤ k) = .446
“there is about a 45% chance there are 5 or fewer typos”
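Both parts of Example 7.13 can also be reproduced with a few lines of code instead of a table lookup. This is a sketch only: poisson_pmf and poisson_cdf are hypothetical helper names, and the cumulative probability is simply the sum P(0) + P(1) + … + P(k).

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(k, mu):
    """P(X <= k), found by summing P(0) + P(1) + ... + P(k)."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))

rate_per_100_pages = 1.5
pages = 400

# The mean is proportional to the size of the interval.
mu = rate_per_100_pages * pages / 100  # 6 typos per 400 pages

print(round(poisson_pmf(0, mu), 6))  # part a: 0.002479
print(round(poisson_cdf(5, mu), 3))  # part b: 0.446
```

Scaling the mean first and then applying the formula mirrors the slide's two-step approach, so the same helpers work for a book of any length.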
Example 7.13…
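…Excel is an even better alternative: its POISSON function takes the value x, the mean µ, and a cumulative flag, so =POISSON(5, 6, TRUE) returns P(X ≤ 5).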
doc_629818386.pptx
Random Variables & Discrete Prob. Distributions
Random Variables and Discrete Probability Distributions
Random Variables…
A random variable is a function or rule that assigns a number to each outcome of an experiment. Alternatively, the value of a random variable is a numerical event. Instead of talking about the coin flipping event as {heads, tails} think of it as “the number of heads when flipping a coin” {1, 0} (numerical events)
Two Types of Random Variables…
Discrete Random Variable – one that takes on a countable number of values – E.g. values on the roll of dice: 2, 3, 4, …, 12
Continuous Random Variable – one whose values are not discrete, not countable – E.g. time (30.1 minutes? 30.10000001 minutes?)
Analogy: Integers are Discrete, while Real Numbers are Continuous
Probability Distributions…
A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values. Since we’re describing a random variable (which can be discrete or continuous) we have two types of probability distributions: – Discrete Probability Distribution, (this chapter) and – Continuous Probability Distribution (Chapter 8)
Probability Notation…
An upper-case letter will represent the name of the random variable, usually X. Its lower-case counterpart will represent the value of the random variable. The probability that the random variable X will equal x is: P(X = x) or more simply P(x)
Discrete Probability Distributions…
The probabilities of the values of a discrete random variable may be derived by means of probability tools such as tree diagrams or by applying one of the definitions of probability, so long as these two conditions apply:
Example 7.1
The Statistical Abstract of the United States is published annually. It contains a wide variety of information based on the census as well as other sources. The objective is to provide information about a variety of different aspects of the lives of the country’s residents. One of the questions asked households to report the number of color televisions in the household. The following table summarizes the data. Develop the probability distribution of the random variable defined as the number of color televisions per household.
Example 7.1
.Number of Color Televisions
(1,000s) 0 1 2 3 4 5 Total
Number of Households 1,218 32,379 37,961 19,387 7,714 2,842 101,501
Example 7.1
Probability distributions can be estimated from relative frequencies.
1,218 ÷ 101,501 = 0.012
e.g. P(X=4) = P(4) = 0.076 = 7.6%
Example 7.1
E.g. what is the probability there is at least one television but no more than three in any given household?
?at least one television but no more than three? P(1 ? X ? 3) = P(1) + P(2) + P(3) = .319 + .374 + .191 = .884
Example 7.2…
A mutual fund salesperson has arranged to call on three people tomorrow. Based on past experience the salesperson knows that there is a 20% chance of closing a sale on each call. Determine the probability distribution of the number of sales the salesperson will make. Let S denote success, i.e. closing a sale P(S)=.20 Thus SC is not closing a sale, and P(SC)=.80
Example 7.2…
Sales Call 1 Sales Call 2 Sales Call 3
Developing a Probability Distribution…
P(S)=.2 P(S)=.2 P(S)=.2 P(SC)=.8 P(SC)=.8 P(S)=.2 P(SC)=.8 P(S)=.2
(.2)(.2)(.8)= .032
SSS S S SC S SC S S SC SC SC S S SC S SC SC SC S
P(SC)=.8
P(S)=.2 P(SC)=.8
X 3 2 1 0
P(x) .23 = .008 3(.032)=.096 3(.128)=.384 .83 = .512
P(SC)=.8 P(S)=.2 P(SC)=.8
SC SC SC
P(X=2) is illustrated here…
Population/Probability Distribution…
The discrete probability distribution represents a population
Example 7.1 the population of number of TVs per household Example 7.2 the population of sales call outcomes
Since we have populations, we can describe them by computing various parameters. E.g. the population mean and population variance.
Population Mean (Expected Value)
The population mean is the weighted average of all of its values. The weights are the probabilities. This parameter is also called the expected value of X and is represented by E(X).
Population Variance…
The population variance is calculated similarly. It is the weighted average of the squared deviations from the mean.
As before, there is a ?short-cut? formulation…
The standard deviation is the same as before:
Example 7.3… Find the mean, variance, and standard deviation for the population of the number of color televisions per household… (from Example 7.1)
= 0(.012) + 1(.319) + 2(.374) + 3(.191) + 4(.076) + 5(.028) = 2.084
Example 7.3…
Find the mean, variance, and standard deviation for the population of the number of color televisions per household… (from Example 7.1)
= (0 – 2.084)2(.012) + (1 – 2.084)2(.319)+…+(5 – 2.084)2(.028) = 1.107
Example 7.3… Find the mean, variance, and standard deviation for the population of the number of color televisions per household… (from Example 7.1)
= 1.052
Laws of Expected Value…
E(c) = c
The expected value of a constant (c) is just the value of the constant.
E(X + c) = E(X) + c E(cX) = cE(X)
We can ?pull? a constant out of the expected value expression (either as part of a sum with a random variable X or as a coefficient of random variable X).
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the mean monthly profit.
1) Describe the problem statement in algebraic terms: sales have a mean of $25,000 ? E(Sales) = 25,000 profits are calculated by… ? Profit = .30(Sales) – 6,000
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the mean monthly profit.
E(Profit)
=E[.30(Sales) – 6,000] =E[.30(Sales)] – 6,000 [by rule #2] =.30E(Sales) – 6,000 [by rule #3] =.30(25,000) – 6,000 = 1,500 Thus, the mean monthly profit is $1,500
Laws of Variance…
V(c) = 0
The variance of a constant (c) is zero.
V(X + c) = V(X)
The variance of a random variable and a constant is just the variance of the random variable (per 1 above).
V(cX) = c2V(X)
The variance of a random variable and a constant coefficient is the coefficient squared times the variance of the random variable.
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the standard deviation of monthly profits.
1) Describe the problem statement in algebraic terms: sales have a standard deviation of $4,000 ? V(Sales) = 4,0002 = 16,000,000 (remember the relationship between standard deviation and variance ) profits are calculated by… ? Profit = .30(Sales) – 6,000
Example 7.4…
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the standard deviation of monthly profits. 2) The variance of profit is = V(Profit) =V[.30(Sales) – 6,000] =V[.30(Sales)] [by rule #2] =(.30)2V(Sales) [by rule #3] =(.30)2(16,000,000) = 1,440,000 Again, standard deviation is the square root of variance, so standard deviation of Profit = (1,440,000)1/2 = $1,200
Example 7.4 (summary)
Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000. Find the mean and standard deviation of monthly profits. The mean monthly profit is $1,500 The standard deviation of monthly profit is $1,200
Bivariate Distributions…
Up to now, we have looked at univariate distributions, i.e. probability distributions in one variable. As you might guess, bivariate distributions are probabilities of combinations of two variables. Bivariate probability distributions are also called joint probability. A joint probability distribution of X and Y is a table or formula that lists the joint probabilities for all pairs of values x and y, and is denoted P(x,y). P(x,y) = P(X=x and Y=y)
Discrete Bivariate Distribution…
As you might expect, the requirements for a bivariate distribution are similar to a univariate distribution, with only minor changes to the notation:
for all pairs (x,y).
Example 7.5…
Xavier and Yvette are real estate agents. Let X denote the number of houses that Xavier will sell in a month and let Y denote the number of houses Yvette will sell in a month. An analysis of their past monthly performances has the following joint probabilities (bivariate probability distribution).
Marginal Probabilities…
As before, we can calculate the marginal probabilities by summing across rows and down columns to determine the probabilities of X and Y individually:
E.g the probability that Xavier sells 1 house = P(X=1) =0.50
Describing the Bivariate Distribution…
We can describe the mean, variance, and standard deviation of each variable in a bivariate distribution by working with the marginal probabilities…
same formulae as for univariate distributions…
Covariance…
The covariance of two discrete variables is defined as:
or alternatively using this shortcut method:
Coefficient of Correlation…
The coefficient of correlation is calculated in the same way as described earlier…
Example 7.6…
Compute the covariance and the coefficient of correlation between the numbers of houses sold by Xavier and Yvette.
COV(X,Y) = (0 – .7)(0 – .5)(.12) + (1 – .7)(0 – .5)(.42) + … … + (2 – .7)(2 – .5)(.01) = –.15 = –0.15 ÷ [(.64)(.67)] = –.35 There is a weak, negative relationship between the two variables.
Example 7.6…
X 0 0 0 1 1 1 2 2 2 0.7 Y 0 1 2 0 1 2 0 1 2 0.5 Probability 0.12 0.21 0.07 0.42 0.06 0.02 0.06 0.03 0.01 X - µ(x) -0.7 -0.7 -0.7 0.3 0.3 0.3 1.3 1.3 1.3 Y - µ


Example 7.6…
= –0.150 ÷ [(.64)(.67)] = –.35
There is a weak, negative relationship between the two variables.
The bivariate distribution allows us to develop the probability distribution of any combination of the two variables, of particular interest is the sum of two variables. If we consider our example of Xavier and Yvette selling houses, we can create a probability distribution…
Sum of Two Variables…
…to answer questions like ?what is the probability that two houses are sold?? P(X+Y=2) = P(0,2) + P(1,1) + P(2,0) = .07 + .06 + .06 = .19
Sum of Two Variables…
Likewise, we can compute the expected value, variance, and standard deviation of X+Y in the usual way… E(X + Y) = 0(.12) + 1(.63) + 2(.19) + 3(.05) + 4(.01) = 1.2
V(X + Y) = (0 – 1.2)2(.12) + … + (4 – 1.2)2(.01) = .56
?x ? y ?
Var(X ? Y) ? .56 ? .75
Laws…
We can derive laws of expected value and variance for the sum of two variables as follows… E(X + Y) = E(X) + E(Y) V(X + Y) = V(X) + V(Y) + 2COV(X, Y)
If X and Y are independent, COV(X, Y) = 0 and thus
V(X + Y) = V(X) + V(Y)
Laws
E(X + Y) = E(X) + E(Y) = .7 + .5 = 1.2 V(X + Y) = V(X) + V(Y) + 2COV(X, Y)
= .41 + .45 + 2(-.15) = .56
Portfolio Diversification and Asset Allocation
Consider an investor who forms a portfolio, consisting of only two stocks, by investing $4,000 in one stock and $6,000 in a second stock. Suppose that the results after 1 year are: One-Year Results
Stock 1 2 Total OR Initial Investment $4,000 $6,000 $10,000 Value of Investment After One Year $5,000 $5,400 $10,400 Rate of Return on Investment R1 = .25 (25%) R2 =-.10 (-10%) Rp = .04 ( 4%)
Portfolio Diversification and Asset Allocation
Mean and Variance of a Portfolio of Two Stocks E(Rp) = w1 E(R1) + w2 E(R2) V(Rp) = w12 V(R1) + w22 V(R2) + 2w1w2 COV(R1, R2) = w12?12 + w22?22 + 2w1w2??1?2 where w1 and w2 are the proportions or weights of investments 1 and 2, E(R1) and E(R2) are their expected values, ?1 and ?2 are their standard deviations, and ? is the coefficient of correlation
Example 7.8
An investor has decided to form a portfolio by putting 25% of his money into McDonald’s stock and 75% into Cisco Systems stock. The investor assumes that the expected returns will be 8% and 15%, respectively, and that the standard deviations will be 12% and 22%, respectively. a Find the expected return on the portfolio. b Compute the standard deviation of the returns on the portfolio assuming that (i) the two stocks’ returns are perfectly positively correlated (ii) the coefficient of correlation is .5 (iii) the two stocks’ returns are uncorrelated
Example 7.8 Solution
a The expected values of the two stocks are E(R1) = .08 and E(R2) = .15 The weights are w1 = .25 and w2 = .75. Thus, E(R2) = w1E(R1) + w2E(R2) = .25(.08) + .75(.15) = .1325
Example 7.8 Solution
The standard deviations are ?1 = .12 and ?2 = .22. Thus, V(Rp) = w12?12 + w22?22 + 2w1w2??1?2 = (.252)(.122) + (.752)(.222) + 2(.25)(.75)? (.12)(.22) = .0281 + .0099 ? When ? = 1 V(Rp) = .0281 + .0099(1) = .0380 When ? = .5 V(Rp) = .0281 + .0099(.5) = .0331 When ? = 0 V(Rp) = .0281 + .0099(0) = .0281
Portfolio Diversification in Practice
The formulas introduced in this section require that we know the expected values, variances, and covariance (or coefficient of correlation) of the investments we’re interested in. The question arises, How do we determine these parameters? (Incidentally, this question is rarely addressed in finance textbooks!) The most common procedure is to estimate the parameters from historical data, using sample statistics.
Portfolios with More Than Two Stocks
We can extend the formulas that describe the mean and variance of the returns of a portfolio of two stocks to a portfolio of any number of k stocks. wi E ( R ) Mean and Variancei of a Portfolio of k Stocks
?
i ?1
E(Rp ) = V(Rp ) =
?
i ?1
k
w i2 ? i2 ? 2
? ? w w COV (R , R )
i j i j i ?1 j?i ?1
k
k
Where Ri is the return of the ith stock, wi is the proportion of the portfolio invested in stock i, and k is the number of stocks in the portfolio.
Portfolios with More Than Two Stocks
When k is greater than 2 the calculations can be tedious and time-consuming. For example, when k = 3, we need to know the values of the three weights, three expected values, three variances, and three covariances. When k = 4, there are four expected values, four variances and six covariances. [The number of covariances required in general is k(k-1)/2.] To assist you we have created an Excel worksheet to perform the computations when k =2, 3, or 4. (For larger values of k, see the reference at the end of the chapter.) To demonstrate we’ll return to the problem described in this chapter’s introduction.
An investor has $100,000 to invest in the stock market. She is Chapter-Opening Example interested in developing a stock portfolio made up of General Electric, General Motors, McDonald’s, and Motorola. However, she doesn’t know how much to invest in each one. She wants to maximize her return, but she would also like to minimize the risk. She has computed the monthly returns for all four stocks during a 60-month period (January 2001 to December 2006) (Xm07-00).
After some Chapter-Opening Example consideration, she narrowed her choices down to the following three. What should she do? 1. $25,000 in each stock 2. General Electric: $10,000, General Motors: $20,000, McDonald’s: $30,000, Motorola: $40,000 3. General Electric: $10,000, General Motors: $50,000, McDonald’s: $30,000, Motorola: $10,000
Chapter-Opening Example
Because of the large amount of calculations we will solve this problem using only Excel. From the file we compute the B C D E means of each stock’s returns.
74 0.000305 0.002339 0.007910 0.007997
Chapter-Opening Example
Next we compute the variance-covariance matrix. (The commands are the same as those described in Chapter 4—simply include all the columns ofA returns ofB investments you wish D include in the the the to C E portfolio.) 1 GE GM McDonalds Motorola 2 GE 0.003493 3 GM 0.001076 0.011016 4 McDonalds 0.001528 0.001989 0.005409 5 Motorola 0.000933 0.004131 0.002515 0.010277
Chapter-Opening Example
Notice that the variances of the returns are listed on the diagonal. Thus, for example, the variance of the 60 monthly returns of General Electric is .00349. The covariances appear below the diagonal. The covariance between the returns of General Electric and General Motors is .00108. The means and the variance-covariance matrix are copied to the Portfolio Diversification spreadsheet. The weights are typed, producing the accompanying output.
A 1 Portfolio of 4 Stocks Chapter-Opening Example 2 3 Variance-Covariance Matrix 4 5 6 7 8 Expected Returns 9 10 Weights 11 12 Portfolio Return 13 Expected Value 14 Variance 15 Standard Deviation
B
C
D
E
F
GE GM McDonalds Motorola GE 0.003493 GM 0.001076 0.011016 McDonalds 0.001528 0.001989 0.005409 Motorola 0.000933 0.004131 0.002515 0.010277 0.000305 0.250000 0.002339 0.250000 0.007910 0.007997 0.250000 0.250000
0.0046 0.0034 0.0584
Chapter-Opening Example A
1 Portfolio of 4 Stocks Plan 2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Variance-Covariance Matrix
B
C
D
E
F
GE GM McDonalds Motorola GE 0.003493 GM 0.001076 0.011016 McDonalds 0.001528 0.001989 0.005409 Motorola 0.000933 0.004131 0.002515 0.010277 0.000305 0.100000 0.002339 0.200000 0.007910 0.007997 0.300000 0.400000
Expected Returns Weights Portfolio Return Expected Value Variance Standard Deviation
0.0061 0.0043 0.0657
A 1 Portfolio of 4 Stocks Plan 3 2 3 Variance-Covariance Matrix 4 5 6 7 8 Expected Returns 9 10 Weights 11 12 Portfolio Return 13 Expected Value 14 Variance 15 Standard Deviation
Chapter-Opening Example
B
C
D
E
F
GE GM McDonalds Motorola GE 0.003493 GM 0.001076 0.011016 McDonalds 0.001528 0.001989 0.005409 Motorola 0.000933 0.004131 0.002515 0.010277 0.000305 0.100000 0.002339 0.500000 0.007910 0.007997 0.300000 0.100000
0.0044 0.0048 0.0690
Chapter-Opening Example
Plan 3 has the smallest expected value and the largest variance, making it the worst of the three plans. Plan 2 has the largest expected value, whereas plan 1 has the smallest variance. If the investor is like most investors she would select Plan 1 because of its lower risk. Other more daring investors may choose plan 2 to take advantage of its higher expected value.
Binomial Distribution…
The binomial distribution is the probability distribution that results from doing a ?binomial experiment?. Binomial experiments have the following properties: Fixed number of trials, represented as n. Each trial has two possible outcomes, a ?success? and a ?failure?. P(success)=p (and thus: P(failure)=1–p), for all trials. The trials are independent, which means that the outcome of one trial does not affect the outcomes of any other trials.
Success and Failure…
…are just labels for a binomial experiment, there is no value judgment implied. For example a coin flip will result in either heads or tails. If we define ?heads? as success then necessarily ?tails? is considered a failure (inasmuch as we attempting to have the coin lands heads up). Other binomial experiment notions:
An election candidate wins or loses An employee is male or female
Binomial Random Variable…
The random variable of a binomial experiment is defined as the number of successes in the n trials, and is called the binomial random variable. E.g. flip a fair coin 10 times…
1) Fixed number of trials ? n=10 2) Each trial has two possible outcomes ? {heads (success), tails (failure)} 3) P(success)= 0.50; P(failure)=1–0.50 = 0.50 ? 4) The trials are independent ? (i.e. the outcome of heads on the first flip will have no impact on subsequent coin flips).
Hence flipping a coin ten times is a binomial experiment since all conditions were met.
Binomial Random Variable…
The binomial random variable counts the number of successes in n trials of the binomial experiment. It can take on values from 0, 1, 2, …, n. Thus, its a discrete random variable. To calculate the probability associated with each value we use combintorics: for x=0, 1, 2, …, n
Pat Statsdud…
Pat Statsdud is a (not good) student taking a statistics course. Pat’s exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions. Each question has five possible answers, only one of which is correct. Pat plans to guess the answer to each question. What is the probability that Pat gets no answers correct?
What is the probability that Pat gets two answers correct?
Pat Statsdud…
Pat Statsdud is a (not good) student taking a statistics course whose exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions. Each question has five possible answers, only one of which is correct. Pat plans to guess the answer to each question. Algebraically then: n=10, and P(success) = 1/5 = .20
Pat Statsdud…
Is this a binomial experiment? Check the conditions: ? There is a fixed finite number of trials (n=10). ? An answer can be either correct or incorrect. The probability of a correct answer (P(success)=.20) does not change from question to question. ? Each answer is independent of the others.
Pat Statsdud…
n=10, and P(success) = .20 What is the probability that Pat gets no answers correct? I.e. # success, x, = 0; hence we want to know P(x=0)
Pat has about an 11% chance of getting no answers correct using the guessing strategy.
Pat Statsdud…
n=10, and P(success) = .20 What is the probability that Pat gets two answers correct? I.e. # success, x, = 2; hence we want to know P(x=2)
Pat has about a 30% chance of getting exactly two answers correct using the guessing strategy.
Cumulative Probability…
Thus far, we have been using the binomial probability distribution to find probabilities for individual values of x. To answer the question: ?Find the probability that Pat fails the quiz? requires a cumulative probability, that is, P(X ? x) If a grade on the quiz is less than 50% (i.e. 5 questions out of 10), that’s considered a failed quiz. Thus, we want to know what is: P(X ? 4) to answer
Pat Statsdud…
P(X ? 4) = P(0) + P(1) + P(2) + P(3) + P(4)
We already know P(0) = .1074 and P(2) = .3020. Using the binomial formula to calculate the others: P(1) = .2684 , P(3) = .2013, and P(4) = .0881
We have P(X ? 4) = .1074 + .2684 + … + .0881 = .9672 Thus, its about 97% probable that Pat will fail the test using the luck strategy and guessing at answers…
Binomial Table…
Calculating binomial probabilities by hand is tedious and error prone. There is an easier way. Refer to Table 1 in Appendix B. For the Pat Statsdud example, n=10, so the first important step is to get the correct table!
n = 10 k 0 1 2 3 4 5 6 7 8 9 0.01 0.9044 0.9957 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.05 0.5987 0.9139 0.9885 0.9990 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 0.1 0.3487 0.7361 0.9298 0.9872 0.9984 0.9999 1.0000 1.0000 1.0000 1.0000 0.2 0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000 1.0000 0.25 0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000 1.0000 0.3 0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000 0.4 0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 0.5 0.0010 0.0107 0.0547 0.1719 0.3770 0.6230 0.8281 0.9453 0.9893 0.9990 0.6 0.0001 0.0017 0.0123 0.0548 0.1662 0.3669 0.6177 0.8327 0.9536 0.9940 0.7 0.0000 0.0001 0.0016 0.0106 0.0473 0.1503 0.3504 0.6172 0.8507 0.9718 0.75 0.0000 0.0000 0.0004 0.0035 0.0197 0.0781 0.2241 0.4744 0.7560 0.9437 0.8 0.0000 0.0000 0.0001 0.0009 0.0064 0.0328 0.1209 0.3222 0.6242 0.8926 0.9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 0.95 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0115 0.0861 0.4013 0.99 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0043 0.0956
Binomial Table…
The probabilities listed in the tables are cumulative, i.e. P(X ? k) – k is the row index; the columns of the table are organized by P(success) = p
n = 10 k 0 1 2 3 4 5 6 7 8 9 0.01 0.9044 0.9957 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.05 0.5987 0.9139 0.9885 0.9990 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 0.1 0.3487 0.7361 0.9298 0.9872 0.9984 0.9999 1.0000 1.0000 1.0000 1.0000 0.2 0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000 1.0000 0.25 0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000 1.0000 0.3 0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000 0.4 0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 0.5 0.0010 0.0107 0.0547 0.1719 0.3770 0.6230 0.8281 0.9453 0.9893 0.9990 0.6 0.0001 0.0017 0.0123 0.0548 0.1662 0.3669 0.6177 0.8327 0.9536 0.9940 0.7 0.0000 0.0001 0.0016 0.0106 0.0473 0.1503 0.3504 0.6172 0.8507 0.9718 0.75 0.0000 0.0000 0.0004 0.0035 0.0197 0.0781 0.2241 0.4744 0.7560 0.9437 0.8 0.0000 0.0000 0.0001 0.0009 0.0064 0.0328 0.1209 0.3222 0.6242 0.8926 0.9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 0.95 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0115 0.0861 0.4013 0.99 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0043 0.0956
Binomial Table…
?What is the probability that Pat gets no answers correct?? i.e. what is P(X = 0), given P(success) = .20 and n=10 ?
n = 10 k 0 1 2 3 4 5 6 7 8 9 0.01 0.9044 0.9957 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.05 0.5987 0.9139 0.9885 0.9990 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 0.1 0.3487 0.7361 0.9298 0.9872 0.9984 0.9999 1.0000 1.0000 1.0000 1.0000 0.2 0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000 1.0000 0.25 0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000 1.0000 0.3 0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000 0.4 0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 0.5 0.0010 0.0107 0.0547 0.1719 0.3770 0.6230 0.8281 0.9453 0.9893 0.9990 0.6 0.0001 0.0017 0.0123 0.0548 0.1662 0.3669 0.6177 0.8327 0.9536 0.9940 0.7 0.0000 0.0001 0.0016 0.0106 0.0473 0.1503 0.3504 0.6172 0.8507 0.9718 0.75 0.0000 0.0000 0.0004 0.0035 0.0197 0.0781 0.2241 0.4744 0.7560 0.9437 0.8 0.0000 0.0000 0.0001 0.0009 0.0064 0.0328 0.1209 0.3222 0.6242 0.8926 0.9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 0.95 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0115 0.0861 0.4013 0.99 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0043 0.0956
P(X = 0) = P(X ? 0) = .1074
?What is the probability that Pat gets two answers correct?? Binomial Table… i.e. what is P(X = 2), given P(success) = .20 and n=10 ?
n = 10 k 0 1 2 3 4 5 6 7 8 9 0.01 0.9044 0.9957 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.05 0.5987 0.9139 0.9885 0.9990 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 0.1 0.3487 0.7361 0.9298 0.9872 0.9984 0.9999 1.0000 1.0000 1.0000 1.0000 0.2 0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000 1.0000 0.25 0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000 1.0000 0.3 0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000 0.4 0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 0.5 0.0010 0.0107 0.0547 0.1719 0.3770 0.6230 0.8281 0.9453 0.9893 0.9990 0.6 0.0001 0.0017 0.0123 0.0548 0.1662 0.3669 0.6177 0.8327 0.9536 0.9940 0.7 0.0000 0.0001 0.0016 0.0106 0.0473 0.1503 0.3504 0.6172 0.8507 0.9718 0.75 0.0000 0.0000 0.0004 0.0035 0.0197 0.0781 0.2241 0.4744 0.7560 0.9437 0.8 0.0000 0.0000 0.0001 0.0009 0.0064 0.0328 0.1209 0.3222 0.6242 0.8926 0.9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 0.95 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0115 0.0861 0.4013 0.99 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0043 0.0956
P(X = 2) = P(X?2) – P(X?1) = .6778 – .3758 = .3020
remember, the table shows cumulative probabilities…
Binomial Distribution…
What is the probability that Pat fails the quiz?? i.e. what is P(X ? 4), given P(success) = .20 and n=10 ?
n = 10 k 0 1 2 3 4 5 6 7 8 9 0.01 0.9044 0.9957 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.05 0.5987 0.9139 0.9885 0.9990 0.9999 1.0000 1.0000 1.0000 1.0000 1.0000 0.1 0.3487 0.7361 0.9298 0.9872 0.9984 0.9999 1.0000 1.0000 1.0000 1.0000 0.2 0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000 1.0000 0.25 0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000 1.0000 0.3 0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000 0.4 0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 0.5 0.0010 0.0107 0.0547 0.1719 0.3770 0.6230 0.8281 0.9453 0.9893 0.9990 0.6 0.0001 0.0017 0.0123 0.0548 0.1662 0.3669 0.6177 0.8327 0.9536 0.9940 0.7 0.0000 0.0001 0.0016 0.0106 0.0473 0.1503 0.3504 0.6172 0.8507 0.9718 0.75 0.0000 0.0000 0.0004 0.0035 0.0197 0.0781 0.2241 0.4744 0.7560 0.9437 0.8 0.0000 0.0000 0.0001 0.0009 0.0064 0.0328 0.1209 0.3222 0.6242 0.8926 0.9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 0.95 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0115 0.0861 0.4013 0.99 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0043 0.0956
P(X ? 4) = .9672
Binomial Table…
The binomial table gives cumulative probabilities for P(X ? k), but as we’ve seen in the last example, P(X = k) = P(X ? k) – P(X ? [k–1]) Likewise, for probabilities given as P(X ? k), we have: P(X ? k) = 1 – P(X ? [k–1])
=BINOMDIST() Excel Function…
There is a binomial distribution function in Excel that can also be used to calculate these probabilities. For example: # successes What is the probability that Pat gets two answers correct?
# trials P(success) cumulative (i.e. P(X?x)?)
P(X=2)=.3020
=BINOMDIST() Excel Function…
There is a binomial distribution function in Excel that can also be used to calculate these probabilities. For example: # What is the probability that Pat fails the quiz? successes
# trials P(success) cumulative (i.e. P(X?x)?)
P(X?4)=.9672
Binomial Distribution…
As you might expect, statisticians have developed general formulas for the mean, variance, and standard deviation of a binomial random variable. They are:
Poisson Distribution…
Named for Simeon Poisson, the Poisson distribution is a discrete probability distribution and refers to the number of events (a.k.a. successes) within a specific time period or region of space. For example:
The number of cars arriving at a service station in 1 hour. (The interval of time is 1 hour.) The number of flaws in a bolt of cloth. (The specific region is a bolt of cloth.) The number of accidents in 1 day on a particular stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of highway.)
The Poisson Experiment…
Like a binomial experiment, a Poisson experiment has four defining characteristic properties: The number of successes that occur in any interval is independent of the number of successes that occur in any other interval. The probability of a success in an interval is the same for all equal-size intervals The probability of a success is proportional to the size of the interval. The probability of more than one success in an interval approaches 0 as the interval becomes smaller.
Poisson Distribution…
The Poisson random variable is the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
E.g. On average, 96 trucks arrive at a border crossing every hour. (Successes: truck arrivals. Time period: one hour.)
E.g. The number of typographic errors in a new textbook edition averages 1.5 per 100 pages. (Successes (?!): typographic errors. Interval: 100 pages.)
Poisson Probability Distribution…
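The probability that a Poisson random variable assumes a value of x is given by:
P(x) = e^(−µ) µ^x / x!   for x = 0, 1, 2, …
where µ is the mean number of successes in the interval or region and e is the natural logarithm base. FYI: e ≈ 2.71828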
Example 7.12…
A statistics instructor has observed that the number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis he concludes that the number of errors is Poisson distributed with a mean of 1.5 per 100 pages. The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos? That is, what is P(X=0) given that µ = 1.5?
P(0) = e^(−1.5) (1.5)^0 / 0! = e^(−1.5) = .2231
“There is about a 22% chance of finding zero errors”
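The calculation for Example 7.12 can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name poisson_pmf is hypothetical, and µ = 1.5 is the mean given in the example.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)


# Mean number of typos per 100 pages, as given in Example 7.12.
MU_PER_100_PAGES = 1.5

# With x = 0 the formula reduces to e^(-mu), since mu^0 = 1 and 0! = 1.
p_no_typos = poisson_pmf(0, MU_PER_100_PAGES)

print(f"P(X = 0) = {p_no_typos:.4f}")
```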
Example 7.13…
Refer to Example 7.12. Suppose that the instructor has just received a copy of a new statistics book. He notices that there are 400 pages.
a. What is the probability that there are no typos?
b. What is the probability that there are five or fewer typos?
Poisson Distribution…
As mentioned on the Poisson experiment slide: The probability of a success is proportional to the size of the interval. Thus, knowing an error rate of 1.5 typos per 100 pages, we can determine a mean value for a 400-page book as: µ = 1.5(4) = 6 typos per 400 pages.
Example 7.13…
For a 400-page book, what is the probability that there are no typos?
P(X=0) = e^(−6) (6)^0 / 0! = e^(−6) = .002479
“there is a very small chance there are no typos”
Example 7.13…
For a 400-page book, what is the probability that there are five or fewer typos?
P(X ≤ 5) = P(0) + P(1) + … + P(5)
This is rather tedious to solve manually. A better alternative is to refer to Table 2 in Appendix B… …k = 5, µ = 6, and P(X ≤ k) = .446
“there is about a 45% chance there are 5 or fewer typos”
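Both parts of Example 7.13 can also be worked through in a few lines of Python. This is a sketch under the assumptions stated in the example (1.5 typos per 100 pages, a 400-page book); the helper names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k): the point probabilities P(0) + P(1) + ... + P(k)."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))


# The mean scales in proportion to the size of the interval.
rate_per_100_pages = 1.5
pages = 400
mu = rate_per_100_pages * pages / 100  # mean number of typos in 400 pages

p_no_typos = poisson_pmf(0, mu)       # part a
p_five_or_fewer = poisson_cdf(5, mu)  # part b

print(f"mu        = {mu}")
print(f"P(X = 0)  = {p_no_typos:.6f}")
print(f"P(X <= 5) = {p_five_or_fewer:.4f}")
```

Summing six terms by hand is tedious, but in code the cumulative probability is a single loop, which is the same convenience the table and the spreadsheet provide.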
Example 7.13…
…Excel is an even better alternative: