# MA 115, Fall 2011

This is the page maintained by the TF in order to share optional class materials. It uses an alien technology known as MathML, which is adequately supported by the latest version of Mozilla Firefox, and is known not to work with Microsoft Internet Explorer. This page is updated weekly, so direct your Web browser to reload this page every time you check it.

## Contents

Distribution Of The Sample Mean

## Distribution Of The Sample Mean

### Example

A professional Hold'Em player claims to win 1 big blind per 1-hour session, with a standard deviation of 12 big blinds per hour. If the winnings from different sessions are independent, find the probability that the average amount won after 30 hours is positive. Do the same for 500 hours.

We can regard the winnings in $n=30$ sessions as independent identically distributed (iid) random variables ${X}_{1}$, ${X}_{2}$, ... , ${X}_{n}$. Since the size of this sample is pretty big, we can rely on the Central Limit Theorem. Regardless of the distribution of each ${X}_{i}$, we know that the distribution of the average winnings $\overline{X}$ is approximately normal with mean $1$ and standard deviation

${\sigma}_{\overline{X}}=\frac{\sigma}{\sqrt{n}}=\frac{12}{\sqrt{30}}$.

And so the answer is

$P\left(\overline{X}>0\right)=1-P\left(\overline{X}<0\right)=1-P\left(Z<\frac{0-1}{12/\sqrt{30}}\right)$

or approximately $0.6759616$. Since $P\left(\overline{X}>0\right)=P\left({X}_{1}+{X}_{2}+\dots +{X}_{30}>0\right)$, we can conclude that after playing 30 hours, the player comes out in the black about $67.6\mathrm{\%}$ of the time.

Similarly, if $n=500$ then

${\sigma}_{\overline{X}}=\frac{\sigma}{\sqrt{n}}=\frac{12}{\sqrt{500}}$

and

$P\left(\overline{X}>0\right)=1-P\left(Z<\frac{0-1}{12/\sqrt{500}}\right)$

or approximately $0.9687963$. As a consequence, there is a whopping $3\mathrm{\%}$ chance that the player will still be in the red after playing for 500 hours.
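The two probabilities above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `prob_mean_positive` is made up for this illustration, and the result is only as good as the normal approximation from the Central Limit Theorem.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_positive(mu, sigma, n):
    """P(sample mean > 0), treating the sample mean as normal (CLT approximation)."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
    return 1 - NormalDist(mu, standard_error).cdf(0)


# Values from the example: mean 1 and standard deviation 12 big blinds per hour
p30 = prob_mean_positive(mu=1, sigma=12, n=30)
p500 = prob_mean_positive(mu=1, sigma=12, n=500)
print(f"n = 30:  {p30:.4f}")   # roughly 0.676
print(f"n = 500: {p500:.4f}")  # roughly 0.969
```

Changing `n` shows how slowly the standard error shrinks: it takes four times as many hours to halve it.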

## Expected Value And Variance

### Expected Value, Discrete Case

For a discrete random variable $X$ with possible values ${x}_{1}$, ${x}_{2}$, ${x}_{3}$, ... and a corresponding probability mass function $f\left(x\right)=P\left(X=x\right)$ we define the expected value of $X$ (denoted by $EX$) to be the sum

$EX={x}_{1}P\left(X={x}_{1}\right)+{x}_{2}P\left(X={x}_{2}\right)+{x}_{3}P\left(X={x}_{3}\right)+\dots$

For example, let $X$ be the amount of money (in USD) you earn in a lottery where you buy a 1 USD ticket and have a small chance to win either a 100 or 10 USD prize. Let $P\left(X=99\right)=0.004$, $P\left(X=9\right)=0.04$, and $P\left(X=-1\right)=0.956$. That is, there is a large chance that you win no prize but still have to pay 1 USD for the ticket. Here's the pmf of $X$ in table form (note that probabilities add up to $1$):

| $x$ | $-1$ | $9$ | $99$ |
| --- | --- | --- | --- |
| $P\left(X=x\right)$ | $0.956$ | $0.04$ | $0.004$ |

Then

$EX=\left(-1\right)\times 0.956+9\times 0.04+99\times 0.004=-\frac{1}{5}$

That is, every time you play this lottery, you lose 20 cents on average.
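The same sum can be computed mechanically. This is a small sketch, assuming the pmf is stored as a dictionary from values to probabilities; the names `pmf` and `expected_value` are chosen for this illustration, and `Fraction` is used so that the result comes out exact.

```python
from fractions import Fraction

# pmf of X from the lottery example: value -> probability
pmf = {
    -1: Fraction(956, 1000),
    9: Fraction(40, 1000),
    99: Fraction(4, 1000),
}


def expected_value(pmf):
    """Sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())


# Sanity check: the probabilities must add up to 1
assert sum(pmf.values()) == 1

ex = expected_value(pmf)
print(ex)  # -1/5
```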

### Variance

The variance of any random variable $X$ (denoted $\mathrm{Var}X$) is defined to be

$\mathrm{Var}X=E{\left(X-EX\right)}^{2}$.

A basic theorem about variance gives us another way of expressing it:

$\mathrm{Var}X=E{X}^{2}-{\left(EX\right)}^{2}$.
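For completeness, here is a short sketch of why the two expressions agree. It uses only linearity of expectation and the fact that $EX$ is a constant; it assumes that $E{X}^{2}$ is finite.

```latex
\begin{align*}
\mathrm{Var}\,X &= E\left(X - EX\right)^{2} \\
                &= E\left(X^{2} - 2X \cdot EX + (EX)^{2}\right) \\
                &= EX^{2} - 2\,(EX)(EX) + (EX)^{2} \\
                &= EX^{2} - (EX)^{2}
\end{align*}
```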

To find the variance of a discrete random variable, we can construct a pmf for ${X}^{2}$ and compute $E{X}^{2}$. Using the variable from the example above, we have

| ${x}^{2}$ | $1$ | $81$ | $9801$ |
| --- | --- | --- | --- |
| $P\left({X}^{2}={x}^{2}\right)$ | $0.956$ | $0.04$ | $0.004$ |

And so

$E{X}^{2}=1\times 0.956+81\times 0.04+9801\times 0.004=\frac{217}{5}$

and hence

$\mathrm{Var}X=E{X}^{2}-{\left(EX\right)}^{2}=\frac{217}{5}-{\left(-\frac{1}{5}\right)}^{2}=\frac{1084}{25}=43.36$.

This is a rather large variance (much larger in magnitude than the average value of $X$), just as one would expect from a lottery. You lose a few cents on average, but there is a distinct possibility of earning many dollars every once in a while.
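As a cross-check, the variance of the lottery variable can be computed both from the definition and from the shortcut formula. This is a minimal sketch; the helper `expectation` is a name invented here, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# pmf of X from the lottery example: value -> probability
pmf = {-1: Fraction(956, 1000), 9: Fraction(40, 1000), 99: Fraction(4, 1000)}


def expectation(g, pmf):
    """E g(X) for a discrete random variable given by its pmf."""
    return sum(g(x) * p for x, p in pmf.items())


mean = expectation(lambda x: x, pmf)

# Definition: Var X = E (X - EX)^2
var_definition = expectation(lambda x: (x - mean) ** 2, pmf)

# Shortcut: Var X = E X^2 - (EX)^2
var_shortcut = expectation(lambda x: x ** 2, pmf) - mean ** 2

print(var_definition, var_shortcut, float(var_shortcut))
```

Both routes give $\frac{1084}{25}$, so the shortcut saves work without changing the answer.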

## Chebyshev's Inequality

When applied to random variables, Chebyshev's inequality can be stated as follows: for any random variable $X$ with mean $\mu $ and standard deviation $\sigma $ and any real number $k>1$, at least $\left(1-\frac{1}{{k}^{2}}\right)$ of the possible values of $X$ are within $k$ standard deviations of the mean. In terms of probability,

$P\left(\left|X-\mu \right|<k\sigma \right)\ge 1-\frac{1}{{k}^{2}}$

The inequality can be stated for any particular value of $k$. For example, if $k=2$, then at least $1-\frac{1}{{2}^{2}}=0.75=75\mathrm{\%}$ of possible values are within $2$ standard deviations of the mean; if $k=5$, then at least $1-\frac{1}{{5}^{2}}=0.96=96\mathrm{\%}$ of the possible values are within $5$ standard deviations of the mean, and so on.
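To see how conservative the bound can be, the sketch below compares it with the exact probability for one specific distribution. The fair six-sided die is a hypothetical example chosen for this illustration (it does not appear elsewhere on this page), and the function names are likewise made up.

```python
from math import sqrt


def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(|X - mu| < k * sigma), for k > 1."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def prob_within(pmf, k):
    """Exact P(|X - mu| < k * sigma) for a discrete pmf given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    sigma = sqrt(sum((x - mu) ** 2 * p for x, p in pmf.items()))
    return sum(p for x, p in pmf.items() if abs(x - mu) < k * sigma)


# Hypothetical example: a fair six-sided die
die = {face: 1 / 6 for face in range(1, 7)}

for k in (1.5, 2, 5):
    print(k, round(chebyshev_lower_bound(k), 4), round(prob_within(die, k), 4))
```

For the die, every face already lies within $1.5$ standard deviations of the mean, while Chebyshev's inequality only promises about $56\mathrm{\%}$. The inequality trades sharpness for generality: it holds for every distribution with a finite variance.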

### Example

The height of a hobbit in his or her tweens follows a distribution with mean $\mu =36$ inches and standard deviation $\sigma =4$ inches. Use Chebyshev's inequality to (a) put a bound on the proportion of hobbits whose height is between 30 and 42 inches and (b) find an interval which is guaranteed to contain at least 95% of the population.

(a) We need the proportion of the population within $1.5\sigma $ of the mean, so we can use the inequality directly for $k=1.5$,

$P\left(\left|X-36\right|<1.5\times 4\right)\ge 1-\frac{1}{{1.5}^{2}}=\frac{5}{9}$.

(b) We can find $k$ which corresponds to the interval which covers at least 95% of the population by solving the equation $0.95=1-\frac{1}{{k}^{2}}$. This gives ${k}^{2}=20$, so $k=\sqrt{20}$, and the interval can be written as $\left(\mu -k\sigma ,\mu +k\sigma \right)=\left(36-4\sqrt{20},36+4\sqrt{20}\right)$, or approximately $\left(18.11,53.89\right)$.
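Both parts of the example can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the example (mean 36, standard deviation 4); the variable names are chosen for this illustration.

```python
from math import sqrt

mu, sigma = 36, 4

# (a) 30 and 42 are both k standard deviations away from the mean
k = (42 - mu) / sigma   # 1.5
bound = 1 - 1 / k**2    # Chebyshev lower bound on the proportion

# (b) solve coverage = 1 - 1/k^2 for k, then build the interval
coverage = 0.95
k95 = sqrt(1 / (1 - coverage))
interval = (mu - k95 * sigma, mu + k95 * sigma)

print(f"(a) at least {bound:.4f} of the population")
print(f"(b) k = {k95:.4f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```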

## Stem And Leaf Plot Examples

### Example 1

Sorted data:

43 | 45 | 47 | 49 | 62 | 66 | 67 | 69 | 72 | 72 |

75 | 76 | 81 | 84 | 88 | 104 | 104 | 105 | 105 | 105 |

Stem and Leaf plot:

    The decimal point is 1 digit(s) to the right of the |

       4 | 3579
       5 | 
       6 | 2679
       7 | 2256
       8 | 148
       9 | 
      10 | 44555
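The plot for this data set can be rebuilt by splitting each number into a stem (all digits but the last) and a leaf (the last digit). The sketch below is a simplified illustration that handles only non-negative integers with a leaf unit of 1; it is not the routine that generated the plot above, and `stem_and_leaf` is a name made up for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot for non-negative integers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 104 -> stem 10, leaf 4
        leaves[stem].append(str(leaf))
    lowest, highest = min(leaves), max(leaves)
    width = len(str(highest))
    # Empty stems are kept so that gaps in the data stay visible
    return [
        f"{stem:>{width}} | {''.join(leaves.get(stem, []))}"
        for stem in range(lowest, highest + 1)
    ]


data = [43, 45, 47, 49, 62, 66, 67, 69, 72, 72,
        75, 76, 81, 84, 88, 104, 104, 105, 105, 105]

plot_lines = stem_and_leaf(data)
print("\n".join(plot_lines))
```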

### Example 2

Sorted data:

-2.300 | -1.500 | -1.200 | -1.100 | -0.790 | -0.580 | -0.550 | -0.260 | -0.190 | -0.130 |

-0.072 | 0.210 | 0.380 | 0.800 | 0.950 | 0.990 | 1.300 | 1.400 | 1.400 | 1.500 |

Stem and Leaf plot (rounding away from zero):

    The decimal point is at the |

      -2 | 3
      -1 | 521
      -0 | 8663211
       0 | 248
       1 | 003445
