Day By Day
Notes for MATH 301
Fall 2006
Activity: Go over syllabus. Take roll. Overview examples: Randomness - coin example. Gilbert trial. Election polls. Spam filters.
Creating random samples.
The text is
remiss in telling us how to actually select random samples in practice. Many texts fail in this regard, so to
fill in this blank, we will use three methods of sampling today: dice, a table
of random digits, and our calculator.
To make the problem feasible, we will use a population of size 6. (I know this is unrealistic in
practice, but the point today is to see how randomness works, and to trust that
the results extend to larger problems.) Pretend that the items in our population (perhaps they are
people) are labeled 1 through 6.
For each of our methods, you will have to decide in your group what to
do with "ties". Keep in
mind the goal of simple random sampling: at each stage, each remaining item has
an equal chance to be the next item selected.
By rolling dice, generate a sample of three people. (Let the number on the die correspond to one of the
items.) Repeat 20 times, giving 20
samples of size 3.
Using the table of random digits, starting at any haphazard location, select
three people. (Let the random
digit correspond to one of the items.)
Repeat 20 times, giving 20 more samples of size 3.
Using your calculator, select three people. The TI-83 command MATH PRB
randInt(2,4,5) will produce 5 random integers between 2 and 4, inclusive,
for example. (If you leave off the
third number, only one value will be generated.) If your calculator has only a rand
function, you can achieve the same result as the TI-83 randInt(2,4) with
int(3*rand)+2. Repeat 20 times, giving 20 more samples of
size 3.
Your group should have drawn 60 samples at the end. Keep careful track of which samples you selected; record
your results in order, as 125 or 256, for example. (125 would mean items 1, 2, and 5 were selected.) We will pool the results of
everyone's work
together on the board.
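For anyone who wants to repeat the exercise off the calculator, here is a minimal sketch in Python (an assumption; the course itself uses the TI-83). The helper names rand_int and draw_srs are hypothetical. It builds an integer from a plain rand in the same way as int(3*rand)+2, redraws on "ties", and records each sample in order.

```python
import random
from collections import Counter

def rand_int(low, high, rng):
    """Integer from low to high inclusive, built from a rand on [0, 1)."""
    return int((high - low + 1) * rng.random()) + low

def draw_srs(population_size, sample_size, rng):
    """Draw labels one at a time; a repeated label is ignored and redrawn."""
    chosen = []
    while len(chosen) < sample_size:
        label = rand_int(1, population_size, rng)
        if label not in chosen:
            chosen.append(label)
    # Record the sample in order, e.g. "125" for items 1, 2 and 5.
    return "".join(str(label) for label in sorted(chosen))

rng = random.Random(301)  # arbitrary seed, only so the run can be repeated
samples = [draw_srs(6, 3, rng) for _ in range(20)]
tally = Counter(samples)
print(samples)
print(tally.most_common())
```

There are 20 possible samples of size 3 from 6 items, so with only 20 draws the tally will look uneven; pooling many groups' results should even it out.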
Goals: Review
course objectives: collect data, summarize information, model with probability,
make inferences.
Gain practice taking random samples.
Understand what a simple random sample is. Become familiar
with randInt(. Accept that the calculator's output is
random.
Skills:
…
Know the definition
of a Simple Random Sample (SRS). Simple Random Samples can be defined in two ways:
1) An SRS is a sample where, at
each stage, each remaining item has an equal chance to be the next item selected.
2) A scheme where every possible
sample has an equal chance to be the
sample results in an SRS.
…
Select an SRS from a
list of items. The TI-83 command randInt( will select numbered items from a list randomly. If a number selected is already in
your sample, ignore that number and get a new one. Remember, as long as each remaining item is equally likely to be chosen as the next item,
you have drawn an SRS.
…
Understand the real
world uses of SRS.
In practice, simple random samples are not that
common. It is just too impractical
(or impossible) to have a list of the entire population available. However, the idea of simple random
sampling is essentially the foundation for all the other types of
sampling. In that sense then it is very
common.
Reading: Sections 1.1 to
1.6.
Activity: Dance Fever example.
Use the "Arizona Temps" dataset to
calculate means, standard deviations, and 5-number summaries. To calculate our summary statistics
with the TI-83, we will use STAT CALC 1-Var Stats (to use List 1) or STAT CALC 1-Var Stats L2 for List 2, for example. There are two screens of output; we will be mostly concerned
with the mean x̄, the standard deviation Sx, and the five-number summary on screen two.
Answer these questions:
1) Are high and low temperatures
distributed the same way, other than the obvious fact that highs are higher
than lows?
2) How does a single case affect the
calculator's routines? (What if we
had had an outlier?)
3) What information does
the 5-number
summary disguise?
Now, create the following lists:
1) A list of 10 numbers that has
only one number below the mean.
2) A list of 10 numbers that has the
standard deviation greater than the mean.
3) A list of 10 numbers that has a
standard deviation of zero.
For your fourth list start with any 21 numbers. Find a number N
such that 14 of the numbers in your list are within N of the average.
For example, pick a number N
(say 4), calculate the average plus 4, the average minus 4, and count how many
numbers in your list are between those two values. If the count is less than 14, try a larger number
for N (bigger than 4). If the count is more than 14, try a smaller number
for N (smaller than 4).
Finally, compare the standard deviation to the Interquartile Range (IQR = Q3 -
Q1).
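The search for N in the fourth list can be sketched in Python as below. The data are a hypothetical list of 21 numbers, and the quartiles are taken as medians of the lower and upper halves of the sorted list, which is an assumption about how your calculator computes them. Rather than guessing N and adjusting, the sketch sorts the distances from the average and reads off the 14th smallest.

```python
import statistics

def half_medians(values):
    """Q1 and Q3 as medians of the lower and upper halves (median excluded when n is odd)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

def smallest_n_covering(values, count):
    """Smallest N such that at least `count` values lie within N of the mean."""
    mean = statistics.mean(values)
    deviations = sorted(abs(v - mean) for v in values)
    return deviations[count - 1]

# Hypothetical list of 21 numbers, not taken from the course dataset.
data = [61, 64, 66, 67, 68, 70, 71, 71, 72, 73, 74,
        75, 75, 76, 78, 79, 81, 83, 85, 88, 93]

mean = statistics.mean(data)
sx = statistics.stdev(data)          # sample standard deviation, like Sx
q1, q3 = half_medians(data)
n_value = smallest_n_covering(data, 14)
within = sum(1 for v in data if abs(v - mean) <= n_value)

print(f"mean={mean:.2f} Sx={sx:.2f} IQR={q3 - q1} N={n_value:.2f} within={within}")
```

Comparing N, Sx, and the IQR on your own list is the point of the exercise; the three numbers measure the same "width" in different ways.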
Goals: Compare
numerical measures of center and spread.
Use technology to summarize data with numerical measures. Interpret standard deviation as a
measure of spread.
Skills:
…
Understand the effect
of outliers on the mean.
The mean (or average) is unduly influenced by outlying
(unusual) observations. Therefore,
knowing when your distribution is skewed is
helpful.
…
Understand the effect
of outliers on the median. The median is almost completely
unaffected by outliers. For
technical reasons, though, the median is not as common in scientific
applications as the mean.
…
Use the TI-83 to
calculate summary statistics.
Calculating may be as simple as entering numbers into
your calculator and pressing some buttons: STAT CALC 1-Var Stats. Or, if
you are doing some things by hand, you may have to organize information the
correct way, such as listing the numbers from low to high. Please get used to using the
statistical features of your calculator to produce the means, standard
deviations, etc. While I know you
can calculate the mean by simply adding up all the numbers and dividing by the
sample size, you will not be in the habit of using the full features of your
machine, and later on you will be "missing the
boat".
…
Understand standard
deviation. At first, standard deviation will seem foreign to you,
but I believe that it will make more sense the more you become familiar with
it. In its simplest terms, the
standard deviation is a non-negative number that measures how "wide" a
dataset is. One common
interpretation is that the range of a dataset is about 4 standard
deviations. Another interpretation
is that the standard deviation is roughly ¾ times IQR; that is, the
standard deviation is a bit smaller than the IQR. Eventually we will use the standard deviation in our
calculations for statistical inference; until then, this measure is just
another summary statistic, and getting used to this number is your goal. The normal curve of Chapter 6 will
further help us understand standard deviation.
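The two rules of thumb for standard deviation (range about 4 standard deviations, standard deviation about ¾ of the IQR) can be checked by simulation. This is a rough sketch only: it assumes bell-shaped data, samples of size 30, and Python's default quartile rule, none of which come from the text. The ratios will differ for skewed data or other sample sizes.

```python
import random
import statistics

rng = random.Random(2006)  # arbitrary seed
range_ratios, iqr_ratios = [], []
for _ in range(200):
    # Hypothetical bell-shaped data: mean 50, standard deviation 10.
    sample = [rng.gauss(50, 10) for _ in range(30)]
    sd = statistics.stdev(sample)
    q1, _median, q3 = statistics.quantiles(sample, n=4)
    range_ratios.append((max(sample) - min(sample)) / sd)
    iqr_ratios.append(sd / (q3 - q1))

avg_range_ratio = statistics.mean(range_ratios)   # range measured in standard deviations
avg_iqr_ratio = statistics.mean(iqr_ratios)       # standard deviation as a fraction of IQR
print(round(avg_range_ratio, 2), round(avg_iqr_ratio, 2))
```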
Reading: Sections 1.7 to 1.9 and 8.3
(excluding normal quantile-quantile plots).
Activity: Use the "Arizona Temps" dataset to practice creating
the histograms, stemplots, boxplots, and quantile plots for several lists. Compare and interpret the graphs. Identify shape, center, and spread.
Compare these measures with the corresponding numerical measures you calculated
on Day 2. Notice that the boxplots
and numerical measures cannot describe shape very well. The histograms are hard to use to
compare two lists. The stemplot
is difficult to modify.
Useful commands for the TI-83:
STAT
EDIT (use one of the lists to enter data,
L1 for example; the other L's can be used too)
2nd
STATPLOT 1 On (Use this screen to
designate the plot settings. You
can have up to three plots on the screen at once. For now we will only use one at a time.)
ZOOM
9 This command centers the window around
your data.
PRGM
EXEC QUANTILE ENTER This program I wrote
plots the sorted data and "stacks" them up. It is essentially a quantile plot.
Using the plots now instead of the summary statistics, answer these questions
again:
1) Are high and low temperatures
distributed the same way, other than the obvious fact that highs are higher
than lows?
2) How does a single case affect
the calculator's routines? (What
if we had had an outlier?)
3) What information does the
5-number summary disguise?
Goals: Be able
to use the calculator to make a histogram, boxplot, or a quantile plot. Be able to make a stemplot by
hand.
Skills:
…
Summarize data into a
frequency table. The easiest way to make a frequency table is
to TRACE the boxes in a histogram and record the classes and
counts. You can control
the size and
number of the classes with Xscl
and Xmin
in the WINDOW menu. The decision as to
how many classes to create is arbitrary; there isn't a "right"
answer. One popular suggestion is
to try the square root of the number of data values. For example, if there are 25 data points, use 5
intervals. If there are 50 data
points, try 7 intervals. This is a
rough rule; you should experiment with it. The TI-83 has a rule for doing this; I do not know what
their rule is. You should
experiment by changing the interval width and see what happens to the
diagram.
…
Use the TI-83 to
create an appropriate histogram, boxplot, or quantile plot. STAT PLOT is our main tool for
viewing distributions of data.
Histograms are common displays, but have flaws; the choice of
class width
is troubling as it is not unique.
The quantile plot is more reliable, but less common. For interpretation purposes, remember
that in a histogram tall boxes represent places with lots of data, while in a
quantile plot those same high-density data places are
steep.
…
Create a stemplot by
hand. The stemplot is a convenient manual display; it is
most useful for small datasets, but not all datasets make good stemplots. Choosing the "stem" and
"leaves" to make reasonable displays will require some practice. Some notes for proper choice of stems:
if you have many empty rows, you have too many stems. Move one column to the left and try again. If you have too few rows (all the data
is on just one or two stems) you have too few stems. Move to the right one digit and try again. Some datasets will not give good
pictures for any choice of stem, and some benefit from splitting or rounding
(see the example in the text).
…
Describe shape,
center, and spread.
From each of our graphs, you should be able to make
general statements about the shape, center, and spread of the distribution of
the variable being explored.
…
Compare several lists
of numbers using boxplots.
For two lists, the best simple approach is the
back-to-back stemplot. For more
than two lists, I suggest trying boxplots, side-by-side, or stacked. At a glance, then, you can assess which
lists have typically larger values or more spread out values,
etc.
…
Understand
boxplots. You should know that the boxplots for some lists don't
tell the interesting part of those lists.
For example, boxplots do not
describe shape very well (apart from rough symmetry); you can only see where
the quartiles are. Alternatively,
you should know that the boxplot can
be a very good first quick look at a dataset.
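The square-root suggestion for the number of classes in a frequency table can be sketched as follows. This is an illustration in Python, not the TI-83's own rule (which is unknown here); the function name frequency_table and the data are hypothetical, and the sketch assumes the values are not all equal.

```python
import math

def frequency_table(values, num_classes=None):
    """Equal-width classes from min to max; default count is round(sqrt(n))."""
    n = len(values)
    if num_classes is None:
        num_classes = round(math.sqrt(n))
    low, high = min(values), max(values)
    width = (high - low) / num_classes      # plays the role of Xscl
    counts = [0] * num_classes
    for v in values:
        # The maximum value is placed in the last class.
        index = min(int((v - low) / width), num_classes - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i])
            for i in range(num_classes)]

# Hypothetical data: 25 values, so the rule suggests 5 classes.
temps = [62, 65, 67, 68, 70, 71, 71, 73, 74, 75, 75, 76, 77,
         78, 79, 80, 81, 83, 84, 85, 87, 88, 90, 93, 97]
table = frequency_table(temps)
for left, right, count in table:
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

Passing a different num_classes is the analogue of changing Xscl and watching how the histogram changes.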
Reading: Sections 2.1 to
2.3.
Activity: Sample Spaces. Venn Diagrams. Coins, Dice. Pascal's Triangle.
Using either complete sample spaces (theory) or simulation, find (or
estimate) these chances:
1) Roll two dice, one colored, one
white. Find the chance of the
colored die being less than the white die.
2) Roll three dice and find the
chance that the largest of the three dice is a 6. (Ignore multiple values; that is, the largest value when 6,
6, 4 is rolled is a 6.)
3) Roll three dice and find the
chance of getting a sum of less than 8.
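Both approaches to the three problems above can be sketched in Python (an assumption; in class we list the sample space by hand or simulate with dice). The sketch enumerates the complete sample space for each problem, then estimates the first chance by simulation for comparison. The helper name chance is hypothetical.

```python
import random
from fractions import Fraction
from itertools import product

faces = range(1, 7)
two_dice = list(product(faces, repeat=2))      # 36 equally likely outcomes
three_dice = list(product(faces, repeat=3))    # 216 equally likely outcomes

def chance(space, event):
    """Count the outcomes in the event and divide by the size of the space."""
    return Fraction(sum(1 for outcome in space if event(outcome)), len(space))

p_colored_less = chance(two_dice, lambda o: o[0] < o[1])
p_largest_is_6 = chance(three_dice, lambda o: max(o) == 6)
p_sum_under_8 = chance(three_dice, lambda o: sum(o) < 8)
print(p_colored_less, p_largest_is_6, p_sum_under_8)

# Simulation estimate of the first chance.
rng = random.Random(4)  # arbitrary seed
trials = 10000
hits = sum(1 for _ in range(trials) if rng.randint(1, 6) < rng.randint(1, 6))
estimate = hits / trials
print(estimate)
```

The exact answers are 15/36, 91/216, and 35/216; the simulated value should land near 15/36 but will not match it exactly.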
Goals: Create
sample spaces. Use Venn diagrams
to organize sample spaces. Use
simulation to estimate probabilities.
Skills:
…
Know the definitions
of Sample Space, Event, Outcome, etc.
The basic language of
probability will be used throughout the course, so it is important for you to
be conversant in it.
…
Be able to use a Venn
diagram. The Venn diagram is a way of partitioning the sample
space into mutually exclusive regions.
It can be useful for simply organizing sets, or sometimes is quite
useful in understanding proofs (as we will see in the inclusion/exclusion
formula on Day 6.)
…
List simple sample
spaces. Flipping coins and rolling dice are common events to
us, and listing the possible outcomes lets us explore probability
distributions. We will not delve
too deeply into probability rules; rather, we are more interested in the ideas
of probability and I think the best way to accomplish this is by
example.
…
Simulation can be
used to estimate probabilities.
If the number of repetitions of an experiment is
large, then the resulting observed frequency of success can be used as an
estimate of the true unknown probability of success. However, a "large" enough number of repetitions
may be more than we can reasonably perform. For example, for problem 1 today, a sample of 100 will give
results between 32/100 and 51/100 (.32 to .51) 95% of the time. That may not be good enough for our
purposes. Even with 500, the range
is 187/500 to 230/500 (.374 to .460). Eventually the answers will converge to a useful percentage;
the question is how soon that will occur.
We will have answers to that question after Chapter
?.
…
Recognize the
usefulness and properties of Pascal's Triangle. Pascal's
Triangle is old (known to the Persians and the Chinese in the 11th
century) yet is still quite useful.
There are just two rules to construct Pascal's Triangle: each row begins and ends with a 1, and
each entry is the sum of the two entries above it to the left and the
right. From such a simple
construction, though, we encounter many relationships: the combination formula,
the triangular numbers, the Fibonacci numbers, the powers of 2, among
others. Our chief interest is in
the combination formula and its relationship to the binomial distribution.
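The two construction rules for Pascal's Triangle translate directly into a short routine. This is a sketch in Python (an assumption, not part of the course materials) that builds the rows and checks two of the relationships mentioned: each entry matches the combination formula, and each row sums to a power of 2.

```python
import math

def pascal_rows(n_rows):
    """Rows 0 .. n_rows-1: each begins and ends with 1, inner entries sum the two above."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        middle = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + middle + [1])
    return rows

rows = pascal_rows(8)
for n, row in enumerate(rows):
    # Entry r of row n is nCr, and the row total is 2 to the power n.
    matches_ncr = all(row[r] == math.comb(n, r) for r in range(n + 1))
    print(n, row, matches_ncr, sum(row) == 2 ** n)
```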
Reading: Section
2.3.
Activity: Presentation 1.
Summaries (Chapters 1 and 8.3)
Gather 3 to 5 variables on at least 20 subjects; the source is irrelevant, but
knowing the data will help you explain its meaning to us. Be sure to have at least one numerical
and at least one categorical variable.
Demonstrate that you can summarize data graphically and numerically.
Combinations vs Permutations.
Goals: Continue
exploring Pascal's Triangle and how it relates to counting (permutations and
combinations).
Skills:
…
Know the Permutation
and Combination formulas.
When counting the number of ways of choosing items or
ordering items, our formulas are nCr and nPr, respectively.
You will need to work enough problems so that you know when to use each
of them. One way to keep them
straight is to think of a Combination as a Committee of people,
and a Permutation as a Photograph of that committee. (There are more permutations than
combinations for a particular choice of n and r.) Also don't forget our trick of listing
the complete sample space, but only for small
problems!
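The committee and photograph picture can be checked on a small case by listing the complete sample space. The sketch below uses Python (an assumption; on the TI-83 you would use nCr and nPr) with a hypothetical group of 5 people and r = 3.

```python
import math
from itertools import combinations, permutations

people = ["A", "B", "C", "D", "E"]   # hypothetical group, n = 5
r = 3

committees = list(combinations(people, r))    # order ignored
photographs = list(permutations(people, r))   # order matters

print(len(committees), math.comb(5, r))       # counts from listing vs. nCr
print(len(photographs), math.perm(5, r))      # counts from listing vs. nPr
# Each committee can be photographed in r! different orders.
print(len(photographs) == len(committees) * math.factorial(r))
```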
Reading: Sections 2.4 and
2.5.
Activity: Finish Combinations and
Permutations.
Arrange the letters in FREDA.
Arrange the letters in FREED.
Arrange the letters in ERRORS.
Arrange the letters in SETTER.
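For the four words above, repeated letters reduce the number of distinguishable arrangements: divide n! by the factorial of each letter's count. A brief sketch in Python (an assumption; the function name distinct_arrangements is hypothetical) compares that formula with a brute-force listing.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(word):
    """n! divided by the factorial of each letter's multiplicity."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total

results = {}
for word in ["FREDA", "FREED", "ERRORS", "SETTER"]:
    by_formula = distinct_arrangements(word)
    by_listing = len(set(permutations(word)))   # feasible only for short words
    results[word] = (by_formula, by_listing)
    print(word, by_formula, by_listing)
```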
Demonstrate the Inclusion/Exclusion formula with a 3 set Venn diagram.
Use Venn diagrams to "prove":
A = (A∩B) ∪ (A∩B')
(A∪B)' = A'∩B'
Basic probability rules:
Probability is a number between 0 and 1, inclusive.
The probabilities of mutually exclusive events add when finding the probability of the union.
The probabilities of mutually exclusive and exhaustive events add to one.
Goals: Know the
rules of probability, including addition, complement, and
inclusion/exclusion. The
multiplication rule will be covered on Day 7.
Skills:
…
Understand the
probability rules.
Being adept at probability begins with knowing
definitions and knowing basic formulas.
For example, you can't prove things about mutually exclusive sets if you
can't recite the definition of mutually exclusive. Memorize at first; later it becomes "learned", not
"memorized".
…
Relate the rules to
sample spaces. Remember that the rules we're discussing are all based
on counting elements in sample spaces.
Sometimes it is helpful to have a few "standard" examples in
mind so conjectures or steps in reasoning can be verified. For example, the inclusion/exclusion
principle is shown well with the two-dice problem "what is the chance of
at least one six?". Ignoring
the intersection makes the probability too large (12/36 instead of 11/36).
…
Realize how the Venn
diagram can help verify results. The inclusion/exclusion formula is a good example
where a Venn diagram can help with the proof or development. Other examples are DeMorgan's
Laws. For Bayes'
formula, on Day 7,
the Venn diagram will also be useful.
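The two-dice "at least one six" example and the set identities from the activity can be verified by counting elements of the sample space. This is a sketch in Python (an assumption), with sets standing in for the regions of a Venn diagram; the helper name pr is hypothetical.

```python
from fractions import Fraction
from itertools import product

space = set(product(range(1, 7), repeat=2))
A = {o for o in space if o[0] == 6}    # first die shows a six
B = {o for o in space if o[1] == 6}    # second die shows a six

def pr(event):
    """Probability by counting equally likely outcomes."""
    return Fraction(len(event), len(space))

at_least_one_six = pr(A | B)
with_correction = pr(A) + pr(B) - pr(A & B)    # inclusion/exclusion
without_correction = pr(A) + pr(B)             # ignores the intersection
print(at_least_one_six, with_correction, without_correction)

# Set identities, with the complement taken inside the sample space.
identity_1 = A == (A & B) | (A - B)                           # A = (A∩B) ∪ (A∩B')
identity_2 = space - (A | B) == (space - A) & (space - B)     # (A∪B)' = A'∩B'
print(identity_1, identity_2)
```

A check on one example is not a proof, but it is a quick way to catch a mis-remembered formula.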
Reading: Sections 2.6 to
2.8.
Activity: Constructing
probability trees. Demonstrating
Bayes' with the rare disease problem.
Consider a card trick where two cards are drawn sequentially off the top of a
shuffled deck. (There are 52 cards
in a deck, 4 suits of 13 ranks.)
We want to calculate the chance of getting hearts on the first draw, on
the second draw, and on both draws.
We will organize our thoughts into a tree diagram, much like water
flowing in a pipe. On each branch,
the label will be the probability of taking that branch; thus at each node, the
exiting probabilities (conditional probabilities) add to one.
On the far right of the tree, we will have the intersection events. Their probability is found by
multiplying.
Calculate the chances of:
1) Drawing a heart on the first
card.
2) Drawing a heart on the second
card.
3) Drawing at least one heart.
4) Drawing two hearts.
5) Drawing a heart on the second
draw given that a heart was drawn first.
6) Drawing a heart on the first
draw given that a heart was drawn second.
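The tree for the two-card draw can be written out as arithmetic. The sketch below uses Python fractions (an assumption; in class we draw the tree by hand). Branch labels are conditional probabilities, the products along branches are the joint probabilities, and the last line reverses the conditioning (first draw given the second), which is the step Bayes' formula formalizes.

```python
from fractions import Fraction

# Branch labels on the tree.
p_h1 = Fraction(13, 52)                  # heart on the first draw
p_h2_given_h1 = Fraction(12, 51)         # heart second, given heart first
p_h2_given_not_h1 = Fraction(13, 51)     # heart second, given no heart first

# Joint probabilities at the far right of the tree (multiply along branches).
p_both = p_h1 * p_h2_given_h1
p_h1_only = p_h1 * (1 - p_h2_given_h1)
p_h2_only = (1 - p_h1) * p_h2_given_not_h1
p_neither = (1 - p_h1) * (1 - p_h2_given_not_h1)

p_h2 = p_both + p_h2_only                # heart on the second draw
p_at_least_one = 1 - p_neither
p_h1_given_h2 = p_both / p_h2            # reversing the tree

print(p_h1, p_h2, p_at_least_one, p_both, p_h2_given_h1, p_h1_given_h2)
print(p_both + p_h1_only + p_h2_only + p_neither)   # joint probabilities total
```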
Now we will do this work for the rare disease problem (Problem 2.128).
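The same tree logic for a rare disease can be sketched as below. The numbers are hypothetical and are NOT those of Problem 2.128: a prevalence of 1 in 1000, a test that detects 99% of true cases, and a 2% false positive rate are assumptions made only for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (assumptions, not the textbook's values).
p_disease = Fraction(1, 1000)
p_pos_given_disease = Fraction(99, 100)
p_pos_given_healthy = Fraction(2, 100)

# Joint probabilities for the two branches that end in a positive test.
p_disease_and_pos = p_disease * p_pos_given_disease
p_healthy_and_pos = (1 - p_disease) * p_pos_given_healthy

# Reverse the tree: chance of disease given a positive test.
p_pos = p_disease_and_pos + p_healthy_and_pos
p_disease_given_pos = p_disease_and_pos / p_pos
print(p_disease_given_pos, float(p_disease_given_pos))
```

With these made-up inputs, fewer than 5% of positive tests come from people who have the disease, because the healthy group is so much larger.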
Goals: Be able
to express probability calculations as tree diagrams. Be able to reverse the events in a probability tree, which
is what Bayes' formula is about.
Skills:
…
Know how to use the
multiplication rule in a probability tree. Each branch of a
probability tree is labeled with the conditional probability for
that branch.
To calculate
the joint probability of a series of branches, we multiply the conditional
probabilities together. Note that
at each branching in a tree, the (conditional) probabilities add to one, and
that overall, the joint probabilities add to one.
…
Recognize conditional
probability in English statements.
Sometimes the key word is
"given". Other times the
conditional phrase has "if".
But sometimes the fact that a statement is conditional is
disguised. For example: "Assuming John buys the insurance, what is the chance
he will come out ahead" is equivalent to "If John buys insurance,
what is the chance he will come out ahead".
…
Be able to use the
conditional probability formula to reverse the events in a probability
tree. The key here is the symmetry of the events in the
conditional probability formula.
We exchange the roles of A and B, and tie them together with our formula
for Pr(A∩B).
…
Know the definition
of independence. Independence is a fact about probability, not about
sets. Contrast this to
"disjoint" which is a property of sets. In
particular, two events that each have positive probability cannot be both independent and disjoint.
Independence is important later as an assumption as it allows us to
multiply individual probabilities together without having to worry about
conditional probability.
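The contrast between independent and disjoint events can be seen on the two-dice sample space. This is a sketch in Python (an assumption), with hypothetical event names: independence is checked with probabilities, while disjointness is checked with the sets themselves.

```python
from fractions import Fraction
from itertools import product

space = set(product(range(1, 7), repeat=2))

def pr(event):
    """Probability by counting equally likely outcomes."""
    return Fraction(len(event), len(space))

first_is_6 = {o for o in space if o[0] == 6}
second_is_6 = {o for o in space if o[1] == 6}
first_is_1 = {o for o in space if o[0] == 1}

# Independent: the joint probability equals the product. Not disjoint: they overlap.
independent = pr(first_is_6 & second_is_6) == pr(first_is_6) * pr(second_is_6)
overlap = len(first_is_6 & second_is_6)

# Disjoint: no shared outcomes. Not independent: joint probability 0, product 1/36.
disjoint = len(first_is_6 & first_is_1) == 0
product_rule_holds = pr(first_is_6 & first_is_1) == pr(first_is_6) * pr(first_is_1)

print(independent, overlap, disjoint, product_rule_holds)
```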
Reading: Sections 3.1 and
3.2.
Activity: Continue coins and dice. Introduce Random Variables.
We will finish up the problems from Day 4. Also in our tables, we will include random variables.
Answer the following questions:
1) What is the chance of getting a
sum of 8 on two dice?
2) What is the chance of getting a
sum of 10 on two dice?
3) What is the chance of getting a
sum of x on two dice,
where x is between 1 and 13?
4) What is the chance of getting
10 heads on 20 flips of a fair coin?
5) How can you get the TI-83 to
graph a probability histogram?
Derive a pmf and its cdf. Use the
sum on two dice as an example.
Know how to work back and forth from one to the other.
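Going back and forth between a pmf and its cdf for the sum on two dice can be sketched as follows. The code is Python (an assumption; on the TI-83 the same work is done with lists). The cdf is a running total of the pmf, and the pmf is recovered from the cdf by taking differences. The last lines answer the coin question with the combination formula.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product
from math import comb

# Count how many of the 36 outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

xs = list(range(1, 14))                              # x from 1 to 13, as in question 3
pmf = {x: Fraction(counts[x], 36) for x in xs}       # impossible sums get 0
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))  # running total

# Back from cdf to pmf: each jump in the cdf is the pmf value there.
recovered = {x: cdf[x] - cdf.get(x - 1, 0) for x in xs}

print(pmf[8], pmf[10], cdf[7], recovered == pmf)

# Chance of exactly 10 heads in 20 flips of a fair coin.
p_10_heads = Fraction(comb(20, 10), 2 ** 20)
print(p_10_heads, float(p_10_heads))
```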
Goals:
Understand
that variables may have values that are not equally likely.
Skills:
…
Understand discrete
random distributions and how to create simple ones. We have listed
sample spaces of equally likely events, like dice and coins. Events can further
be grouped together and assigned values.
These new groups of events may not be equally likely, but as long as the
rules of probability still hold, we have valid probability distributions. Pascal's triangle is one such example,
though you should realize that it applies only to fair coins. We will work with "unfair
coins" (proportions) later, in Chapter 5. Historical note: examining these sampling distributions led
to the discovery of the normal curve in the early 1700's. We will copy their work and
"discover" the normal curve for ourselves too using
dice.
…
Know the definition
of a discrete probability mass function (pmf). If a non-negative
function sums to 1 over some set, then we have a discrete pmf. It is not necessary for the set to be
finite; this means we may need to work with infinite sums. Because each item in the sum is a
probability, it is necessary that
each value is at most one.
(Contrast this with the continuous distributions on Day
9.)
…
Know the definition
of a discrete cumulative distribution function (cdf). If a
non-decreasing function begins at 0 from the left and ends at 1 on the right,
and has no place where the derivative is non-zero, then we have a discrete
cdf. The key is that discrete
cdf's are staircases: flat spots separated by discrete jumps.
Reading: Section
3.3.
Activity: Presentation 2.
Probability (Chapter 2)
Choose one of the following games and
1) Give us a short history of the
game.
2) Describe how randomness is part
of the game.
3) Using our probability rules,
show us an example using this game.
Games: Risk, Blackjack,
Backgammon, Roulette, Battleship, Poker, Minesweeper, Cribbage
Calculate a pdf and its cdf. Use
the uniform as an example. Know
how to work back and forth from one to the other. Note: calculus
required!
Go over the cdf method of generating random samples. Requires the cdf in a formula that can be inverted.
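The cdf method can be sketched in Python (an assumption) as below: generate a rand value u between 0 and 1, then solve F(x) = u for x. The uniform on the interval from a to b has F(x) = (x - a)/(b - a), so x = a + (b - a)u. The second density, f(x) = 2x on 0 to 1 with F(x) = x squared, is a hypothetical example chosen only because its cdf inverts easily.

```python
import math
import random

def uniform_from_cdf(a, b, u):
    """Invert F(x) = (x - a) / (b - a): x = a + (b - a) * u."""
    return a + (b - a) * u

def ramp_from_cdf(u):
    """Invert F(x) = x**2 on [0, 1] (pdf f(x) = 2x): x = sqrt(u)."""
    return math.sqrt(u)

rng = random.Random(9)  # arbitrary seed
uniform_draws = [uniform_from_cdf(2, 4, rng.random()) for _ in range(20000)]
ramp_draws = [ramp_from_cdf(rng.random()) for _ in range(20000)]

# The pdf 2x has mean 2/3 (integrate x * 2x from 0 to 1); the sample mean should be close.
uniform_mean = sum(uniform_draws) / len(uniform_draws)
ramp_mean = sum(ramp_draws) / len(ramp_draws)
print(round(uniform_mean, 3), round(ramp_mean, 3))
```

If the cdf cannot be solved for x in a formula, this method needs a numerical substitute, which is why the invertibility requirement matters.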
Goals: Introduce
continuous distributions.
Skills:
…
Know the definition
of a probability density function (pdf).
If a non-negative function
integrates to 1 over some interval, then we have a probability density
function. Notice that the function
can certainly be over 1 (contrast to pmf's); the key here is that the
area is one, not the maximum
height.
…
Know the definition
of a continuous cumulative distribution function (cdf). If
a continuous non-decreasing function begins at 0 from the left and ends at 1 on
the right, then we have a continuous cdf.
The key is the continuity.
If a function has only
jumps, it is discrete. If a
function has no jumps, it is
continuous. A function with both
is mixed. An example of a mixed distribution is a
question like "If you are employed, what is your income?" People without a job have no income, so
there is a spike at 0.
… Realize that these formulas and functions we are exploring are simply models. What we ar