MA 115, Fall 2011

This is the page maintained by the TF in order to share optional class materials. It uses an alien technology known as MathML, which is adequately supported by the latest version of Mozilla Firefox, and is known not to work with Microsoft Internet Explorer. This page is updated weekly, so direct your Web browser to reload this page every time you check it.

Contents

Distribution Of The Sample Mean

Expected Value And Variance

Chebyshev's Inequality

Stem And Leaf Plot Examples

Distribution Of The Sample Mean

Example

A professional Hold'Em player claims to win 1 big blind per 1 hour session with standard deviation of 12 big blinds per hour. If the winnings from different sessions are independent, find the probability that the average amount won after 30 hours is positive. Do the same for 500 hours.

We can regard the winnings in $n=30$ sessions as independent identically distributed (iid) random variables $X_1, X_2, \ldots, X_n$. Since the sample is reasonably large, we can rely on the Central Limit Theorem. Regardless of the distribution of each $X_i$, we know that the distribution of the average winnings $\bar{X}$ is approximately normal with mean $1$ and standard deviation

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{30}}.$$

And so the answer is

$$P(\bar{X} > 0) = 1 - P(\bar{X} < 0) = 1 - P\left(Z < \frac{0 - 1}{12/\sqrt{30}}\right)$$

or approximately $0.6759616$. Since $P(\bar{X} > 0) = P(X_1 + X_2 + \cdots + X_{30} > 0)$, we can conclude that after playing 30 hours, the player comes out in the black about 67.6% of the time.

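Similarly, if $n=500$ then

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{500}}$$

and

$$P(\bar{X} > 0) = 1 - P\left(Z < \frac{0 - 1}{12/\sqrt{500}}\right)$$

or approximately $0.9687963$. As a consequence, there is a whopping 3% chance that the player will still be in the red after playing for 500 hours.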
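The two probabilities above can be checked numerically. The following is a minimal sketch that writes the standard normal CDF in terms of the error function; the helper names `normal_cdf` and `prob_mean_positive` are made up for this illustration and are not part of the course materials.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_mean_positive(mu, sigma, n):
    """Approximate P(sample mean > 0) using the Central Limit Theorem."""
    se = sigma / math.sqrt(n)              # standard deviation of the sample mean
    return 1.0 - normal_cdf((0.0 - mu) / se)

# Mean of 1 big blind per hour, standard deviation of 12 big blinds per hour
p30 = prob_mean_positive(1, 12, 30)
p500 = prob_mean_positive(1, 12, 500)
print(p30, p500)
```

Both values should agree with the figures quoted in the text up to rounding.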

Expected Value And Variance

Expected Value, Discrete Case

For a discrete random variable $X$ with probability mass function $f(x) = P(X = x)$ and possible values $x_1, x_2, x_3, \ldots$, we define the expected value of $X$ (denoted by $E(X)$) to be the sum

$$E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + x_3 P(X = x_3) + \cdots$$

For example, let $X$ be the amount of money (in USD) you earn in a lottery where you buy a 1 USD ticket and have a small chance to win either a 100 or 10 USD prize. Let $P(X = 99) = 0.004$, $P(X = 9) = 0.04$, and $P(X = -1) = 0.956$. That is, there is a large chance that you win no prize but still have to pay 1 USD for the ticket. Here's the pmf of $X$ in table form (note that the probabilities add up to $1$):

x           -1       9       99
P(X = x)    0.956    0.04    0.004

Then

$$E(X) = (-1) \times 0.956 + 9 \times 0.04 + 99 \times 0.004 = -\frac{1}{5}$$

That is, every time you play this lottery, you lose 20 cents on average.
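The same sum can be computed exactly with rational arithmetic. This is only a sketch: the dictionary `pmf` (net winnings of -1, 9, and 99 USD mapped to their probabilities) and the function name `expected_value` are choices made for this illustration.

```python
from fractions import Fraction

# pmf of the lottery example: net winnings in USD -> probability
pmf = {
    -1: Fraction(956, 1000),
    9: Fraction(40, 1000),
    99: Fraction(4, 1000),
}

def expected_value(pmf):
    """Sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

total_probability = sum(pmf.values())   # should be exactly 1
mean = expected_value(pmf)
print(total_probability, mean)
```

Using `Fraction` instead of floats avoids rounding noise, so the result can be compared directly with the fraction in the text.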

Variance

The variance of any random variable $X$ (denoted $\mathrm{Var}(X)$) is defined to be

$$\mathrm{Var}(X) = E\left[(X - E(X))^2\right].$$

A basic theorem about variance gives us another way of expressing it:

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2.$$

To find the variance of a discrete random variable, we can construct a pmf for $X^2$ and compute $E(X^2)$. Using the variable from the example above, we have

x^2             1        81      9801
P(X^2 = x^2)    0.956    0.04    0.004

And so

$$E(X^2) = 1 \times 0.956 + 81 \times 0.04 + 9801 \times 0.004 = \frac{217}{5}$$

and hence

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{217}{5} - \left(-\frac{1}{5}\right)^2 = \frac{1084}{25} = 43.36.$$

This is a rather large variance (much larger in magnitude than the average value of $X$), just as one would expect from a lottery. You lose a few cents on average, but there is a distinct possibility of earning many dollars every once in a while.
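As a check, the definition of the variance and the shortcut formula can both be evaluated for the lottery variable. The sketch below assumes the same pmf as in the example (net winnings of -1, 9, and 99 USD); the helper `expectation` is a hypothetical name introduced here.

```python
from fractions import Fraction

# pmf of the lottery example: net winnings in USD -> probability
pmf = {
    -1: Fraction(956, 1000),
    9: Fraction(40, 1000),
    99: Fraction(4, 1000),
}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable given by its pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)

# Definition: Var(X) = E[(X - E(X))^2]
var_definition = expectation(pmf, lambda x: (x - mean) ** 2)

# Shortcut: Var(X) = E(X^2) - (E(X))^2
var_shortcut = expectation(pmf, lambda x: x * x) - mean ** 2

print(var_definition, var_shortcut, float(var_shortcut))
```

Both expressions should give the same exact fraction, which is the point of the theorem.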

Chebyshev's Inequality

When applied to random variables, Chebyshev's inequality can be stated as follows: for any random variable $X$ with mean $\mu$ and variance $\sigma^2$ and any real number $k > 1$, at least $1 - \frac{1}{k^2}$ of the distribution of $X$ lies within $k$ standard deviations of the mean. In terms of probability,

$$P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$$

The inequality can be stated for any particular value of $k$. For example, if $k=2$, then at least $1 - \frac{1}{2^2} = 0.75 = 75\%$ of the distribution is within 2 standard deviations of the mean; if $k=5$, then at least $1 - \frac{1}{5^2} = 0.96 = 96\%$ of the distribution is within 5 standard deviations of the mean, and so on.
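The bound is easy to tabulate for several values of $k$. In this small sketch the function name `chebyshev_lower_bound` is made up for illustration, and rejecting $k \le 1$ is an assumption that mirrors the condition $k > 1$ in the statement above (for smaller $k$ the bound is not informative).

```python
def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(|X - mu| < k * sigma)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1.0 - 1.0 / k ** 2

for k in (1.5, 2, 3, 5):
    print(k, chebyshev_lower_bound(k))
```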

Example

The height of a hobbit in his or her tweens follows a distribution with mean $\mu = 36$ inches and standard deviation $\sigma = 4$ inches. Use Chebyshev's inequality to (a) put a bound on the proportion of hobbits whose height is between 30 and 42 inches and (b) find an interval which is guaranteed to contain at least 95% of the population.

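(a) The endpoints 30 and 42 are each 6 inches, that is $1.5\sigma$, away from the mean. We need the proportion of the population within $1.5\sigma$ of the mean, so we can use the inequality directly for $k = 1.5$,

$$P(|X - 36| < 1.5 \times 4) \ge 1 - \frac{1}{1.5^2} = \frac{5}{9}.$$

(b) We can find the $k$ which corresponds to an interval covering at least 95% of the population by solving the equation $0.95 = 1 - \frac{1}{k^2}$. This gives $k^2 = 20$, so $k = \sqrt{20}$, and the interval can be written as $(\mu - k\sigma, \mu + k\sigma) = (36 - 4\sqrt{20}, 36 + 4\sqrt{20})$, or approximately $(18.11, 53.89)$.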
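Both parts of the hobbit example reduce to a few lines of arithmetic. The sketch below uses variable names (`k_a`, `bound_a`, `k_b`, `interval`) chosen only for this illustration.

```python
import math

mu, sigma = 36.0, 4.0

# (a) 30..42 inches is mu +/- 6 inches, so k = 6 / sigma
k_a = (42.0 - mu) / sigma
bound_a = 1.0 - 1.0 / k_a ** 2          # lower bound on the proportion

# (b) solve coverage = 1 - 1/k^2 for k, then build mu +/- k * sigma
coverage = 0.95
k_b = 1.0 / math.sqrt(1.0 - coverage)
interval = (mu - k_b * sigma, mu + k_b * sigma)

print(k_a, bound_a)
print(k_b, interval)
```

Note that the interval in (b) extends well beyond what a normal model would give for 95% coverage, because Chebyshev's inequality makes no assumption about the shape of the distribution.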

Stem And Leaf Plot Examples

Example 1

Sorted data:

43 45 47 49 62 66 67 69 72 72
75 76 81 84 88 104 104 105 105 105

Stem and Leaf plot:

  The decimal point is 1 digit(s) to the right of the |

   4 | 3579
   5 |
   6 | 2679
   7 | 2256
   8 | 148
   9 |
  10 | 44555
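A plot like the one above can be produced with a short script. This is a sketch for non-negative integers only, taking the tens as the stem and the units digit as the leaf; the function name `stem_and_leaf` is hypothetical, and the output layout only imitates the plot shown here.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Stem-and-leaf lines for non-negative integers (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(str(leaf))
    lines = []
    # Walk every stem between the smallest and largest so empty stems are kept
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem:>2} | {''.join(leaves.get(stem, []))}")
    return lines

data = [43, 45, 47, 49, 62, 66, 67, 69, 72, 72,
        75, 76, 81, 84, 88, 104, 104, 105, 105, 105]
lines = stem_and_leaf(data)
print("\n".join(lines))
```

Keeping the empty stems (5 and 9 here) matters, since the gaps are part of what the plot shows about the shape of the data.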

Example 2

Sorted data:

-2.300 -1.500 -1.200 -1.100 -0.790 -0.580 -0.550 -0.260 -0.190 -0.130
-0.072 0.210 0.380 0.800 0.950 0.990 1.300 1.400 1.400 1.500

Stem and Leaf plot (values rounded to one decimal place, with ties rounded away from zero)

  The decimal point is at the |

  -2 | 3
  -1 | 521
  -0 | 8663211
   0 | 248
   1 | 003445
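The second plot needs two extra ingredients: rounding to one decimal place with ties going away from zero, and separate stems for -0 and 0. The sketch below uses `decimal.ROUND_HALF_UP`, which rounds ties away from zero, and reads the data as strings to avoid binary floating-point surprises. The function name `signed_stem_and_leaf` is made up for this illustration, and unlike the previous example it does not print empty stems.

```python
from decimal import Decimal, ROUND_HALF_UP

def signed_stem_and_leaf(values):
    """Stem-and-leaf lines with stem = integer part, leaf = first decimal digit."""
    # Round to one decimal place, ties away from zero, then sort ascending
    rounded = sorted(
        Decimal(v).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
        for v in values
    )
    rows = {}                                  # dicts keep insertion order
    for d in rounded:
        magnitude = abs(d)
        stem = int(magnitude)                  # integer part
        leaf = int(magnitude * 10) % 10        # first decimal digit
        label = f"-{stem}" if d < 0 else f"{stem}"
        rows.setdefault(label, []).append(str(leaf))
    return [f"{label:>2} | {''.join(ls)}" for label, ls in rows.items()]

data = ["-2.300", "-1.500", "-1.200", "-1.100", "-0.790", "-0.580", "-0.550",
        "-0.260", "-0.190", "-0.130", "-0.072", "0.210", "0.380", "0.800",
        "0.950", "0.990", "1.300", "1.400", "1.400", "1.500"]
lines = signed_stem_and_leaf(data)
print("\n".join(lines))
```

Because the values are processed in ascending order, the leaves on negative stems run from the largest magnitude to the smallest, matching the plot above.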

See here for more info.