Day By Day
Notes for PBIS 187
Sports
Mathematics
Fall 2006
Activity: Go over syllabus. Take roll. Overview examples: NCAA tournament, QB rating, Batting
averages, What is random?
http://www.sabernomics.com/sabernomics/index.php/2006/05/age-cut-offs-and-month-of-birth-in-baseball
http://www-math.bgsu.edu/~albert/papers/saber.html
http://www.sabr.org
http://sabermetrics.hnrc.tufts.edu
http://www.baseball-reference.com
Goals: Review
course objectives: collect data, summarize information, make inferences, reason
logically.
Activity: Home Run Comparisons.
Pick one of the top home run hitters of all time (get the data from http://www.baseball-reference.com)
and create graphical summaries of their yearly home run totals. Make a histogram, a stem plot, and a
quantile plot.
Useful commands for the calculator:
STAT EDIT (Use one of the lists to enter data, L1 for example; the other L's can be used too.)
2nd STATPLOT 1 On (Use this screen to designate the plot settings. You can have up to three plots on the
screen at once. For now we will only use one at a time.)
ZOOM 9 (This command centers the window around your data.)
PRGM QUANTILE ENTER (This program plots the sorted data and "stacks"
them up, as opposed to a histogram, which places the boxes side by side.)
From your displays, write a short description of the player's home
run history.
To make a histogram: Enter data into a list on the
TI-83. Set up one of the
plots. Zoom the window
settings with ZOOM 9.
To interpret a histogram: Each "bin" is represented by
a rectangle; the height is proportional to the number of cases in that bin or
interval. Tall boxes mean lots of
data; short boxes (or empty boxes) indicate little (or no) data.
To make a stem plot: Choose a "numbers place",
such as tens, hundreds, etc. for a stem.
(You may also have to consider ones, tenths, hundredths, etc. The choice of stem will be dictated by
how many data points end up on each row. With too many stems, each row has just
one or two items. With too few stems, one or two stems hold all the
data. Choosing the proper stem
requires good judgment.) After
choosing a stem, make a column of these stems starting at the lowest value, and
without skipping any values. Then
go through the data set and record each data point on the appropriate row
(stem), writing down only the
digit to the right of the stem's digit.
For example, if you have chosen the tens place for the stem, the data
value 123 would belong on the stem labeled "12" and you jot down the
number "3" for the leaf.
When you are finished, you may want to sort the items (the leaves) on
each row (stem). Note: the stem plot is a visual display; make
sure each digit you write down occupies the same amount of space. If you are typing, use Monaco or
Courier or some other fixed-width font.
It is especially tempting to squeeze together a string of 1's.
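If you want to check a hand-made stem plot on a computer, the steps above can be sketched in Python. This is only a sketch: the function name stem_plot, the leaf_unit parameter, and the home run totals are made up for illustration, and it assumes non-negative whole-number data.

```python
from collections import defaultdict

def stem_plot(values, leaf_unit=1):
    """Return the rows of a stem plot as strings.

    leaf_unit is the place value of the leaf digit (1 = ones, 10 = tens).
    Digits to the right of the leaf are dropped, not rounded.
    """
    rows = defaultdict(list)
    for v in values:
        scaled = int(v // leaf_unit)      # drop digits to the right of the leaf
        stem, leaf = divmod(scaled, 10)   # the last digit kept is the leaf
        rows[stem].append(leaf)
    lines = []
    # List every stem from lowest to highest without skipping empty ones.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(d) for d in sorted(rows.get(stem, [])))
        lines.append(f"{stem:>3} | {leaves}")
    return lines

# Hypothetical yearly home run totals, made up for illustration.
totals = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29]
print("\n".join(stem_plot(totals)))
```

Because each leaf is a single character, the rows line up only in a fixed-width font, which is the same warning given above for typed stem plots.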
To interpret a stem plot: Each row of a stem plot can be
interpreted in the same way as a bin in a histogram; wide stems (just like tall
boxes in a histogram) represent lots of data points. One advantage of a stem plot over a histogram is that every
data point appears in the stem plot; in the histogram, all you know is how
many data values are in an
interval.
To make a quantile plot: A quantile plot is a graph of the rank
of a data value (lowest, second lowest, etc.) against the data value itself. We put the ranks on the left (the
vertical scale) and the data values on the bottom (the horizontal scale). All quantile plots start on the lower
left and end on the upper right.
The TI-83 program QUANTILE will graph a quantile
plot for you; all you need to tell the calculator is which list your data is
in.
To interpret a quantile plot: The slope of the graph is the important
feature of a quantile plot. Steep
sections represent x-values with lots
of data values; flat sections are areas with little or no data.
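For those who want to see the idea without the calculator, here is a rough Python sketch of a quantile plot. I do not know exactly how the QUANTILE program ranks ties or scales its axes, so the function names and the text display below are my own assumptions; the sketch also assumes whole-number data.

```python
def quantile_points(values):
    """Pair each sorted data value (x) with its rank (y), lowest first."""
    ordered = sorted(values)
    return [(x, rank) for rank, x in enumerate(ordered, start=1)]

def text_quantile_plot(values):
    """Crude text display: one row per rank, highest rank on top."""
    ordered = sorted(values)
    low = ordered[0]
    rows = []
    for rank in range(len(ordered), 0, -1):
        x = ordered[rank - 1]
        # Shift the star right by the distance of x from the smallest value.
        rows.append(f"{rank:>3} |" + " " * (x - low) + "*")
    return rows

# Hypothetical yearly home run totals, made up for illustration.
totals = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29]
print("\n".join(text_quantile_plot(totals)))
```

Where many stars stack almost directly above one another, the plot is steep, and that is where the data values are concentrated.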
Goals: Perform
graphical summaries (describing data with pictures). Be able to use the calculator to make a histogram or a
quantile plot. Be able to make a
stem plot by hand.
Skills:
…
Identify types of
variables. To choose the proper graphical displays, it is
important to be able to differentiate between Categorical and Quantitative (or
Numerical) variables. Categorical
variables do not have numerical values, or if they are numerical, it is only a
label.
…
Be familiar with
types of graphs. To graph categorical variables we use bar graphs or
pie graphs. To graph numerical
variables, we use histograms, stem plots, or quantile plots (the TI-83 program QUANTILE). In practice, most of our variables will be numerical but it
is still important to choose the right display.
…
Summarize data into a
frequency table. The easiest way to make a frequency table is
to TRACE the boxes in a histogram and record the classes and
counts. You can control the size
and number of the classes with Xscl
and Xmin
in the WINDOW menu. The decision as to
how many classes to create is arbitrary; there isn't a "right"
answer. One popular suggestion is
to try the square root of the number of data values. For example, if there are 25 data points, use 5
intervals. If there are 50 data
points, try 7 intervals. This is a
rough rule; you should experiment with it. The TI-83 has a rule for doing this; I do not know what
their rule is. You should
experiment by changing the interval width and see what happens to the
diagram.
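The square-root suggestion is easy to try on a computer. The sketch below is an illustration under my own assumptions (equal-width classes running from the smallest to the largest value, with the largest value placed in the last class); it is not the TI-83's rule, which I do not know. The data are the Packers' final scores from the table later in these notes.

```python
import math

def frequency_table(values, n_classes=None):
    """Return (low, high, count) rows using equal-width classes."""
    if n_classes is None:
        # Square-root rule: 25 values -> 5 classes, 50 values -> 7 classes.
        n_classes = max(1, round(math.sqrt(len(values))))
    low, high = min(values), max(values)
    width = (high - low) / n_classes or 1   # avoid zero width if all values match
    counts = [0] * n_classes
    for v in values:
        k = min(int((v - low) / width), n_classes - 1)  # maximum goes in last class
        counts[k] += 1
    return [(low + i * width, low + (i + 1) * width, c)
            for i, c in enumerate(counts)]

final_scores = [3, 24, 16, 29, 52, 20, 14, 10, 33, 17, 14, 7, 16, 3, 17, 23]
for low, high, count in frequency_table(final_scores):
    print(f"{low:6.2f} to {high:6.2f}: {count}")
```

Calling frequency_table(final_scores, n_classes=6) plays the same role as changing Xscl on the calculator: the picture changes with the class width, which is why there is no single right answer.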
…
Use the TI-83 to
create an appropriate histogram or quantile plot. STAT PLOT is our main tool for
viewing distributions of data.
Histograms are common displays, but have flaws; the choice of class
width is troubling as it is not unique.
The quantile plot is more reliable, but less common. For interpretation purposes, remember
that in a histogram tall boxes represent places with lots of data, while in a
quantile plot those same high-density data places are
steep.
…
Create a stem plot by
hand. The stem plot is a convenient manual display; it is
most useful for small datasets, but not all datasets make good stem plots. Choosing the "stem" and
"leaves" to make reasonable displays will require some practice. Some notes for proper choice of stems:
if you have many empty rows, you have too many stems. Move one column to the left and try again. If you have too few rows (all the data
is on just one or two stems) you have too few stems. Move to the right one digit and try again. Some datasets will not give good
pictures for any choice of stem, and some benefit from splitting or rounding
(see the example in class).
…
Describe shape,
center, and spread.
From each of our graphs, you should be able to make
general statements about the shape, center, and spread of the distribution of
the variable being explored. Our
descriptors will be simple words like symmetric, skewed, two-peaked, etc.
Day 3
Activity: Cumulative Progress.
Examples: Pennant races, Running
pace, Bowling averages.
http://www.alexreisner.com/baseball/history/race Davenport's graphs.
To display cumulative progress, use the program PROGRESS. The
program will prompt you for whether you want the endpoint to be the average of
the list or a number you input.
For the pennant races and other yes/no type responses, use INPUT and give it the value
"0". For the other
examples, we will likely use AVERAGE, but you can explore the shape of the graph with
other values. In all graphs,
regions of similar slope have similar averages. We will discuss this phenomenon in our class examples.
Numerical summaries, including box plots: Our main numerical
summaries will be the mean, the median, and the standard deviation. The mean is the arithmetic average, the
median is the middle number in the sorted list, and the standard deviation is a
measure of how spread out the values are.
Roughly, most data sets are 4 to 6 standard deviations wide. That is, the largest value is close to
4 to 6 standard deviations above the smallest value.
The 5-number summary uses the smallest value, the largest value, the median,
and the medians of the two halves of the data. These two other medians are called the quartiles, because
they split the data set up into quarters.
The box plot is a visual picture of the 5-number summary. The calculator has a
selection in the STAT PLOT menu for
this (the 5th
icon). However, I recommend using
the modified box plot
(the 4th
icon) as it has a built-in outlier detector. This outlier detection routine is not foolproof; we still
need good judgment. But it at
least gives us more than just our opinion.
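As a check on the calculator, the summaries above can be sketched in Python. The quartiles are computed as the medians of the two halves, as described above. The outlier rule used here (more than 1.5 box-lengths beyond a quartile) is the usual textbook rule and is my assumption about what the modified box plot does. The data are the Packers' final scores from the table later in these notes.

```python
import statistics

def five_number_summary(values):
    """Smallest value, lower quartile, median, upper quartile, largest value."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # values below the middle position
    upper = data[half + (n % 2):]    # values above it (middle value left out if n is odd)
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

def flag_outliers(values):
    """Values more than 1.5 box-lengths outside the quartiles (assumed rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    fence = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - fence or v > q3 + fence]

final_scores = [3, 24, 16, 29, 52, 20, 14, 10, 33, 17, 14, 7, 16, 3, 17, 23]
print("mean:", statistics.mean(final_scores))
print("standard deviation:", round(statistics.stdev(final_scores), 2))
print("5-number summary:", five_number_summary(final_scores))
print("flagged as outliers:", flag_outliers(final_scores))
```

Dividing the range (largest minus smallest) by the standard deviation gives a quick check of the "4 to 6 standard deviations wide" rule of thumb.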
Goals: Be able
to make and interpret a cumulative progress graph. Be able to calculate and interpret numerical summaries. Be able to make and interpret a box
plot.
Skills:
…
Know the basics of a
cumulative progress graph.
Quite simply, record the result over time. Up indicates success, down indicates
failure. If the result is
continuous (as in running or bowling) then it will be appropriate to modify the
slope (see next item.)
…
Know the two ways a
cumulative progress graph can be drawn. When comparing several subjects (like
teams' season records) and the response is yes/no, or win/loss, etc., it may
make more sense to simply plot the graph without adjustment, to allow a
comparison. Up indicates a
success, down indicates failure, and the endpoint (to the right) will not be at
zero unless by coincidence. When
an adjustment is made, we require the right endpoint to be at zero, and the
amount for each success and failure is adjusted accordingly. Personally I think this is best done
with a computer program. You are
basically multiplying each element in the list by a proportional amount. For the yes/no type answers, use the
average .5 in the PROGRESS
program.
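I have not reproduced the PROGRESS program here, but one plausible reading of it can be sketched in Python: subtract a target from each result and keep a running total. The function name progress, the target parameter, and the win/loss list are my own illustration, not the calculator program itself.

```python
from itertools import accumulate

def progress(results, target=None):
    """Running total of (result - target).

    If target is None, the list average is used, which forces the
    right endpoint of the graph to zero.
    """
    if target is None:
        target = sum(results) / len(results)
    return list(accumulate(r - target for r in results))

# Hypothetical season: 1 = win, 0 = loss.
season = [1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
print(progress(season, target=0.5))   # up half a step per win, down per loss
print(progress(season))               # adjusted so the last point is zero
```

In either version, two stretches of the graph with the same slope have the same average result, which is the parallel-lines feature discussed in the next item.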
…
Recognize the
features easily seen in a cumulative progress graph. The most visual
feature of a cumulative progress graph is the fact that parallel lines denote
periods of equivalent performance.
For example, if the graph over one period of time has the same slope as
over another period of time, then the performance (batting average, running
pace, or whatever is being measured) is the same for both time
periods.
…
Use the TI-83 to
calculate summary statistics.
Calculating may be as simple as entering numbers into
your calculator and pressing a button.
Or, if you are doing some things by hand, you may have to organize
information the correct way, such as listing the numbers from low to high. On the TI-83, the numerical measures
are accessed with the 1-Var Stats function
in the STAT CALC menu.
Please get used to using the statistical features of your calculator to
produce the mean. While I know you
can calculate the mean by simply adding up all the numbers and dividing by the
sample size, you will not be in the habit of using the full features of your
machine, and later on you will be missing out.
…
Compare several lists
of numbers using box plots.
For two lists, the best simple approach is the
back-to-back stem plot. For more
than two lists, I suggest trying box plots, side-by-side, or stacked. At a glance, then, you can assess which
lists have typically larger values or more spread out values,
etc.
…
Understand box
plots. You should know that the box plots for some lists
don't tell the interesting part of those lists. For example, box plots do not describe shape very well; you can only see where the
quartiles are. On the other hand, you
should know that the box plot can
be a very good first quick look.
…
Understand the effect
of outliers on the mean.
The mean (or average) is unduly influenced by outlying
(unusual) observations. Therefore,
knowing when your distribution is skewed or symmetric is
helpful.
…
Understand the effect
of outliers on the median. The median is almost completely
unaffected by outliers. For
technical reasons, though, the median is not as common in scientific
applications as the mean.
Activity: Basketball and football scores
comparisons. Do teams that score
many points also give up many points?
Can final score be predicted from half time score? Using the data below, make scatter
plots of team score versus opponent score and half time score versus final
score. For each scatter plot,
include a correlation coefficient.
2005 Green Bay Packers
Week | Opponent | Half | Final | 2nd
1 | 17 | 3 | 3 | 0
2 | 26 | 7 | 24 | 17
3 | 17 | 13 | 16 | 3
4 | 32 | 7 | 29 | 22
5 | 3 | 35 | 52 | 17
7 | 23 | 17 | 20 | 3
8 | 21 | 7 | 14 | 7
9 | 20 | 3 | 10 | 7
10 | 25 | 17 | 33 | 16
11 | 20 | 14 | 17 | 3
12 | 19 | 14 | 14 | 0
13 | 19 | 7 | 7 | 0
14 | 13 | 10 | 16 | 6
15 | 48 | 3 | 3 | 0
16 | 24 | 7 | 17 | 10
17 | 17 | 13 | 23 | 10
Nov 2005 Milwaukee Bucks
Game | Opponent | Half | Final | 2nd
1 | 102 | 50 | 102 | 52
2 | 96 | 46 | 110 | 64
3 | 100 | 49 | 105 | 56
4 | 110 | 53 | 103 | 50
5 | 102 | 40 | 103 | 63
6 | 109 | 46 | 85 | 39
7 | 87 | 48 | 90 | 42
8 | 103 | 44 | 82 | 38
9 | 100 | 39 | 80 | 41
10 | 97 | 51 | 108 | 57
11 | 99 | 44 | 91 | 47
12 | 85 | 35 | 76 | 41
13 | 100 | 55 | 100 | 45
The pattern in a scatter plot can often be summarized adequately with a
straight line. Usually, we want to
summarize such linear scatter
plots with a single number, the correlation coefficient. The correlation coefficient is a
unit-less number that varies between -1 (perfect negative association) and +1
(perfect positive association). We
will discuss in class a technique to approximate by hand the correlation
coefficient in a scatter plot.
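For comparison with the calculator, the correlation coefficient can also be computed directly from its usual formula. The sketch below is a plain Python version; the function name correlation is my own, and the lists are the Packers' halftime and final scores from the table above.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# 2005 Packers: halftime and final scores, in week order.
half = [3, 7, 13, 7, 35, 17, 7, 3, 17, 14, 14, 7, 10, 3, 7, 13]
final = [3, 24, 16, 29, 52, 20, 14, 10, 33, 17, 14, 7, 16, 3, 17, 23]
print("r for half vs. final:", round(correlation(half, final), 3))
```

Because r has no units, rescaling either list (for example, doubling every score) leaves the result unchanged.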
If the x-variable is time, we have
a time plot. This website (http://alexreisner.com/baseball)
gives some great examples of sports time plots.
Goals: Display
two variables and measure (and interpret) linear association using the
correlation coefficient.
Skills:
…
Plot data with a
scatter plot. This will be as simple as entering two lists of
numbers into your TI-83 and pressing a few buttons, just as for histograms or
box plots. Or, if you are doing
plots by hand you will have to first choose an appropriate axis scale and then
plot the points. You should also
be able to describe overall patterns in scatter diagrams and suggest tentative
models that summarize the main features of the relationship, if
any.
…
Use the TI-83 to
calculate the correlation coefficient.
We will have to use the
regression function STAT CALC LinReg(ax+b) to
calculate the correlation, r. First, you must run the DiagnosticOn command, which you
access through the CATALOG (2nd 0). If you
press ENTER right after the STAT CALC
LinReg(ax+b) command, the calculator
assumes your lists are in L1 and
L2; otherwise you must tell it where they are,
for example STAT CALC
LinReg(ax+b) L2, L3.
…
Interpret the
correlation coefficient.
You should know the range of the correlation
coefficient (-1 to +1) and what a "typical" diagram looks like for
various values of the correlation coefficient. You should recognize some of the things the correlation
coefficient does not measure, such
as the strength of a non-linear
pattern.
…
Recognize
time plots and
their features. A time plot occurs when the x-variable is a time variable. Because the time variable usually
doesn't repeat itself, time plots are sometimes graphed as line
plots, as on Reisner's website.
Activity: Contingency tables.
Is there really a difference between home and away won/loss records in
sports? Baseball managers
sometimes "platoon" their right- and left-handed batters based on the
hand of the opposing pitchers. Can
we see evidence of this?
We cannot make a scatter plot with categorical data. Our next best option is to make a "contingency
table". This is simply a
cross-classification of the data values.
A simple example is the win/loss, home/away record for a team. It is quite easy to make such a table;
we just count how many items in the population fit into each cell.
The real question is whether the
data shows anything meaningful. By
that I mean do the categories show any departure from what would be expected if
everything were just random?
Before we answer this, we will usually want to summarize percentages
from the table, including marginal percentages.
Our first look at this will be to see what things would look like if everything
were random. We will
make the expected
table using our TI-83's. At the same time, the calculator will
give us a number (a P-value) that will help us decide if there is any pattern
present. Key characteristics of
the expected table are that the row and column totals are identical
to the original
data table, but the individual cell totals are proportional to the marginal
totals. That is, the percentage of
cases falling in any column is the same across all rows and vice versa. See the class examples.
The expected table represents how things would have worked out if the two
variables were unrelated. We will
go through examples in class to explain this phenomenon. It is essentially a "what-if"
type argument. "What would
things look like if the two variables were unrelated?" Then we compare that situation to what
has actually occurred, and if the difference is too large, then we conclude
that "dumb luck" is not the most likely explanation.
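The expected table is simple enough to compute by hand or in a few lines of Python: each cell is its row total times its column total, divided by the grand total. The sketch below uses a made-up home/away record for illustration; the function name expected_table is my own.

```python
def expected_table(observed):
    """Expected counts if the row and column variables were unrelated."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Hypothetical record: rows = home, away; columns = wins, losses.
observed = [[50, 31],
            [40, 41]]
for row in expected_table(observed):
    print(row)
```

In the expected table every row has the same winning percentage, and the row and column totals match the observed table, which are the two key characteristics described above.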
Goals: Organize
two categorical variables in a summary chart.
Skills:
…
Create a table
summarizing two categorical variables.
Unlike numerical variables,
summarization of categorical data is accomplished by making frequency
tables. Often along with the
tables, one will calculate marginal totals and percentages.
Activity: Expected Tables.
Today we will continue the material from Day 5, exploring further what the
calculator can do for us.
Specifically, we will look at the baseball platoon data.
Goals: Develop
intuition for when the observed and expected tables are too
different.
Skills:
…
Create the table of
expected counts. The primary method of analyzing categorical tables is
comparing the observed data to a table of expected counts. The TI-83 will calculate the expected
table for us. Our job is
to understand
the meaning of the numbers.
Basically, the expected table is the way the table would have come out
if the two variables were unrelated.
We use it as a baseline in determining
association.
…
Recognize when an
association is present.
When two categorical variables are associated (much
like when two numerical variables are correlated) we detect this with
the χ² (chi-square) test. We will use the calculator's version of this test,
STAT TESTS χ²-Test, to decide if the differences in the tables are too great. You
must have the observed table in a matrix.
The expected table will be stored in another matrix. If p
< .05, we conclude the two tables are quite different. Our reasoning is this: if the difference between the actual
results and the results assuming no association is a small difference, then we
have no reason to think that the variables are related. However, if the difference between the
two tables is considered large, then we conclude something can be said about
the relationship; that is, that one exists.
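To see what the test is measuring, here is a rough Python sketch of the chi-square statistic and its P-value. The P-value shortcut is valid only for 2-by-2 tables (one degree of freedom), the record is made up, and the function names are my own; I am not claiming this is how the calculator does its arithmetic.

```python
import math

def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    rows = [sum(r) for r in observed]
    cols = [sum(c) for c in zip(*observed)]
    total = sum(rows)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / total   # expected count for this cell
            stat += (obs - exp) ** 2 / exp
    return stat

def p_value_2x2(stat):
    """Upper-tail chi-square probability with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical record: rows = home, away; columns = wins, losses.
observed = [[50, 31],
            [40, 41]]
stat = chi_square(observed)
print("chi-square:", round(stat, 3), " p:", round(p_value_2x2(stat), 3))
```

For this made-up record p comes out above .05, so we would not conclude that home and away records differ by more than chance.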
Activity: Presentations.
Graphical (Chapter 1) and Numerical (Chapter 2) Summaries
Collect or find some sports data; the quality of the data is not important for
this project. Use 3 to 5 lists of
data; make sure you have enough data so that your summaries are meaningful, say
at least 20 cases. Summarize your
data using both graphical and numerical summaries. Make sure you have at least one categorical variable and at
least one numerical variable. Make
sure you have at least one 2-variable summary.
Activity: Quiz 1. This first quiz is on graphical and numerical summaries.
Activity: What is Randomness?
Our notions of probability theory are based on the "long run", but
our everyday lives are dominated by "short runs". Today we will look at some everyday
sequences to see if they exhibit this "short term" behavior.
Coin experiment 1: Write down a
sequence of H's and T's representing head and tails, pretending you are
flipping a coin. Then flip a real
coin 50 times and record these 50 H's and T's. Without knowing which list is which, in most cases I will be
able to identify your real coin.
Baseball players: In sports you
often hear about the "hot hand". We will pick a player, look at his last 20 games, and see if
flipping a coin will produce a simulation that resembles his real
performance. Then we will examine
whether we could pick out the simulation without knowing which was which.
Coin experiment 2: Spin a penny on
a flat surface, instead of tossing it into the air. Record the percentage of heads.
Coin experiment 3: Balance a
nickel on its edge on a flat surface.
Jolt the surface enough so that the nickel falls over, and record the
percentage of heads.
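One feature worth comparing between a made-up sequence and a real one is the longest run of identical results; made-up lists tend to switch back and forth too often. That is my suggestion of one possible tell, not necessarily the method used in class. The Python sketch below simulates 50 flips and measures the longest run; the seed is arbitrary.

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical symbols."""
    if not flips:
        return 0
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        current = current + 1 if prev == nxt else 1
        best = max(best, current)
    return best

rng = random.Random(187)   # arbitrary seed so the run can be repeated
simulated = "".join(rng.choice("HT") for _ in range(50))
print(simulated)
print("longest run:", longest_run(simulated))
```

Running this with several different seeds gives a feel for how long the streaks in 50 real flips usually are.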
Goals: Observe
some real sequences of random experiments. Develop an intuition about
variability.
Skills:
…
Recognize the feature
of randomness. Random does not mean haphazard, or without
pattern. We cannot predict what
will happen on a single toss of a coin, but we can predict what will happen in 1,000 tosses of a
coin. This is the hallmark of a
random process: uncertainty in a small number of trials, but a predictable
pattern in a large number of trials.
…
Resist the urge to
jump to conclusions with small samples.
Typically our daily activities
do not involve large samples of
observations. Therefore our ideas
of "long run" probability theory are not applicable. You need to develop some intuition
about when to believe an observed simulation, and when to doubt the
results. We will hone this intuition as we
develop our upcoming inference methods.
For now, understand that you may be jumping to conclusions by just
believing a small simulation's observed results.
Activity: Probability Trees.
Yahtzee. Conditional Probability. Multiplication rule.
The game of Yahtzee involves rolling 5 dice and trying to get a high score in
13 categories. You are allowed to
re-roll any of the dice up to 2 times.
We can describe the various cases we encounter using a probability
tree, which is a chart with a series
of branches that depict what could happen at each roll or re-roll of the
dice. The handy part of a tree is
the inclusion of probabilities at each branch. Sometimes, calculating these individual branch probabilities
is quite tedious. I hope we can
find some simple trees to model some Yahtzee situations.
The branch probabilities in our diagram are really conditional
probabilities. The
"condition" is what has occurred previously in the tree. Usually at the far right of the tree we
calculate a final probability of that particular event. To do this we use the multiplication
rule, which says that the probability
of a series of events in a tree is found by multiplying all the conditional
probabilities together. Our class
examples should make this clearer.
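As a small worked case of the multiplication rule, consider drawing two aces in a row from a deck without replacing the first card. The Python sketch below multiplies the branch probabilities along that path of the tree; the function name chain_probability is my own, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def chain_probability(branch_probs):
    """Multiply the conditional probabilities along one path of a tree."""
    result = Fraction(1)
    for p in branch_probs:
        result *= p
    return result

# First branch: 4 aces among 52 cards.
# Second branch: 3 aces among the remaining 51, given the first was an ace.
two_aces = chain_probability([Fraction(4, 52), Fraction(3, 51)])
print(two_aces)
```

The second fraction is a conditional probability: it is 3/51 only because the first card drawn was an ace.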
Goals: Introduce
probability with trees.
Skills:
…
Understand what is
being displayed in a probability tree.
A probability tree shows all
possible outcomes in a series of events, like dice rolling or card
drawing. They are most useful when
the events described are dependent on one another, as in Yahtzee rolls, or in
card drawing, although technically we can draw trees for independent events
too. However, with independent
events, the conditional probabilities are not influenced by previous events in
the tree (hence the notion of independence.)
…
Be able to prepare a
probability tree for simple problems.
The probability trees we
looked at in class can be quite complicated. I don't expect you to be able to create one for the full
choices in a game of Yahtzee or for poker. However, for small problems (rolling a die and flipping a
coin; describing 3 games for a team; drawing two cards from a deck) I expect
you can create one. The key to
remember is that the branch labels are the chances of what happens at that
point in the sequence (the essence of
conditional probability).
…
Know how to use the
multiplication rule. With a sequence of related events (ones
where the idea of conditional probability makes sense) we can find the
probability of all the events
happening by multiplying all the individual conditional probabilities
together.
Activity: Probability Rules.
Poker. We will make some
probability trees for poker hands.
This may involve some counting techniques.
Goals:
Understand
the basic rules of probability.
Skills:
…
Know the
addition rule. When
two events are mutually exclusive,
we can find the chance that either
event occurs by adding the individual probabilities. An example of using the addition rule correctly is finding
the chance of drawing a spade or a heart on one draw from a deck of cards. The most common misuse of this rule is
to apply it to events that have elements in common and are therefore not
mutually exclusive. An example of
a misuse is finding the probability of getting at least one six on two rolls of
a die.
…
Know the complement
rule. The complement of an event (not compliment) is the set of
outcomes not in the event. For many of our sports examples, this
amounts to one of two items, such as win or loss, hit or miss, success or
failure, but many times there are several choices. Thus the opposite of a hit in baseball, for example, is not
necessarily an out; the batter could be hit by the pitch, he could walk, he
could get a sacrifice fly, etc.
The probability rule for complements is that the probability of an event
is the probability of the complement subtracted from one. This is just saying that either an
event or its complement occurs.
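The two-dice example can be checked by listing all 36 outcomes. The Python sketch below counts the outcomes with at least one six, shows the answer the misused addition rule would give, and shows that the complement rule gives the correct answer; the variable names are my own.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely results of rolling a die twice.
rolls = list(product(range(1, 7), repeat=2))

# Direct count of outcomes containing at least one six.
at_least_one_six = Fraction(sum(1 for r in rolls if 6 in r), len(rolls))

# Misuse: adding the two chances double-counts the (6, 6) outcome.
misuse = Fraction(1, 6) + Fraction(1, 6)

# Complement rule: one minus the chance of no six on either roll.
by_complement = 1 - Fraction(5, 6) ** 2

print(at_least_one_six, misuse, by_complement)
```

The misused sum is too large by exactly 1/36, the chance of the one outcome (two sixes) that both events share.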
Activity: Simulation.
If we make a probability tree for a series of events, we can then use random
devices to simulate the experiment.
From these simulations, we can hopefully draw conclusions about the
probability of the events. We will
look at two examples today.
Should a football team go for the 2 point conversion, or kick the
extra point?
In baseball, is a sacrifice hit ever worth it? There have been some reports on this. I found the following websites.
http://japanesebaseball.com/forum/thread.jsp?forum=1&thread=134
http://www.commonwealthclub.org/archive/03/03-09baseball-speech.html
http://ite.pubs.informs.org/Vol5No1/Bickel
On our calculator, MATH PRB rand and MATH PRB
randInt( are the two most useful random
number generators for us. Before
we begin, we should all reset our random number seeds. Otherwise, everyone's calculator will
give the same sequence of random numbers, and that isn't at all what we mean by
random! To reset your seed, store
any number to rand: <some number here> -> rand. Now, MATH
PRB rand gives a random number between 0 and 1. By cutting this interval at the right place, we can
model an event with any probability we need. For example, if we have an event that has probability 1/3,
we will say it has occurred if our number rand
is between 0 and .333333. If we
have some specific fraction in mind, instead of just any old decimal number, we
can use MATH PRB randInt(
instead. This will give a random
integer between 1 and n. The syntax is MATH PRB
randInt( 1, 12, 3) which will give us 3
numbers, each between 1 and 12.
The third number is optional; if you leave it off, the calculator
will give you just one random number.
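The same two ideas carry over to a computer. The Python sketch below mirrors rand and randInt( with the standard random module; the function names event_occurs and rand_int and the seed value are my own choices for illustration.

```python
import random

random.seed(2006)   # like storing a number to rand; any number will do

def event_occurs(probability):
    """True with the given probability, like comparing rand to a cutoff."""
    return random.random() < probability

def rand_int(low, high, count=1):
    """A list of random whole numbers from low to high, inclusive."""
    return [random.randint(low, high) for _ in range(count)]

print(event_occurs(1 / 3))   # an event with probability 1/3
print(rand_int(1, 12, 3))    # three numbers, each between 1 and 12
print(rand_int(1, 6))        # count left off: just one number
```

Two people who use the same seed get the same sequence, which is the computer version of the warning above about resetting seeds.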
Goals: Develop
an intuition for the power of a simulation. Realize the limitations of a
simulation.
Skills:
…
Know how to use
random number generators (or coins, dice, etc.) to simulate events from a
probability tree. On the TI-83, MATH PRB rand is the most useful random number generator, due to its
flexibility. MATH PRB randInt( can also be
helpful, if the event desired is a number between 1 and n.
…
Know what can and
what cannot be deduced from a simulation.
From a simulation, we can get
an estimate of the probability of some event. This may be an improvement on the theoretical calculation
(using the probability rules) because the event in question may be incredibly
complicated. (Example: getting a
Yahtzee [5 of a kind] in 3 rolls.)
On the other hand, if the simulation wasn't performed enough times
(which we usually won't be able to tell) then the results may be
misleading. It is a skill to
decide whether to believe the results of a simulation or not. Many of the examples I have shown so
far in this course call for this kind of caution.
For instance, we looked at correlations of team scores, but the sample sizes were so small that I would feel
uncomfortable telling anyone our conclusions.
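The five-of-a-kind example can be simulated in a few lines of Python. The sketch assumes one particular strategy (after each roll, keep the most common face and re-roll the rest), which is my assumption rather than a rule of the game; the function names and seed are also my own.

```python
import random
from collections import Counter

def five_of_a_kind_in_three_rolls(rng):
    """Play one turn: an initial roll plus up to two re-rolls."""
    dice = [rng.randint(1, 6) for _ in range(5)]
    for _ in range(2):
        value, count = Counter(dice).most_common(1)[0]
        if count == 5:
            break
        # Keep the most common face and re-roll the other dice.
        dice = [value] * count + [rng.randint(1, 6) for _ in range(5 - count)]
    return len(set(dice)) == 1

def estimate(trials, seed=1):
    """Fraction of simulated turns that end with five of a kind."""
    rng = random.Random(seed)
    hits = sum(five_of_a_kind_in_three_rolls(rng) for _ in range(trials))
    return hits / trials

print("50 turns:    ", estimate(50))
print("20,000 turns:", estimate(20000))
```

Changing the seed moves the 50-turn estimate around a great deal, while the 20,000-turn estimate barely changes, which is the point about small simulations made above.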
Activity: Markov Chains.
Suppose a baseball player has a 40% chance of getting a hit after a hit, but
only a 25% chance of getting a hit after making an out. What will his eventual batting average
be?
In tennis, when the score is tied in a match, play continues until one player
has won two points in a row. How
long will the typical match last?
Markov Chains will help us answer these questions. Here are two other examples:
The status of the bases and outs in a baseball game can be considered to be the
nodes. The inning ends when 3 outs
have occurred, so that is the final node.
There are eight possible base situations, and 3 possible out situations,
so there are 25 total nodes (the 24 base/out combinations plus the final 3-out node). We
will draw this big chart in class and see what we can conclude.
Another situation is the ball/strike count for a batter.
We can use some matrix results to draw some conclusions. If we organize the probabilities of
going from one node to another in an array or matrix of numbers, we can have
our calculator give us the probabilities of the eventual end states of the
system.
Start by drawing the nodes with arrows leading from one node to another, with
associated probabilities. Then
translate these probabilities to the matrix. Be careful to label correctly. If some of the nodes are terminal nodes, i.e., you can't
leave that state, then we have absorbing states. List these at the
bottom and right of the matrix.
If there are no absorbing states (as in the baseball batting average example),
then the easiest way to analyze the problem is to raise the matrix to a large
power. This calculation will give
the probabilities of being in each state, and if they are the same for each
row, we have a steady state solution.
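For the batting average example, raising the matrix to a large power can be sketched in Python without any special libraries. The helper names are my own, and the power of 50 is an arbitrary "large" choice; in this simple model "out" means any at-bat that is not a hit.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_power(m, n):
    """Multiply m by itself n times, starting from the identity matrix."""
    result = [[float(i == j) for j in range(len(m))] for i in range(len(m))]
    for _ in range(n):
        result = mat_mul(result, m)
    return result

# States: 0 = hit, 1 = out. Row = current at-bat, column = next at-bat.
P = [[0.40, 0.60],
     [0.25, 0.75]]

long_run = mat_power(P, 50)
for row in long_run:
    print([round(v, 4) for v in row])
```

Both rows come out the same, so the chain has a steady state, and the first column is the player's eventual batting average.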
In the case of absorbing states, we need to analyze the problem
differently. Break down the matrix
into four matrices, crossing the non-absorbing states with the absorbing
states. We will only be interested
in the top two of these four matrices.
(See class notes.) Call the
first matrix Q and the
second one R. Our two
chief results will be (I - Q)^-1 and (I - Q)^-1 R.
The TI-83 commands are (assuming the numbers are
stored in [A] and [B],
respectively, and N is the number of
non-absorbing states): (identity(N) - [A])^-1 and
(identity(N) - [A])^-1*[B].
(identity( and [A], [B], etc. are found in the MATRIX menus.) The
first result tells us the expected number of times we'll be in each state, and
the second result tells us the eventual probability of being in each absorbing
state.
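The tennis question can be worked the same way in Python. This sketch rests on my own assumptions: player A wins each point with probability 0.6, and the usual deuce rule applies (a player who is one point ahead and loses the next point returns the score to tied). The matrix helpers and variable names are mine, not the calculator's.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Swap in the row with the largest pivot entry.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

p = 0.6      # assumed chance that player A wins any one point
q = 1 - p

# Non-absorbing states: 0 = tied, 1 = A ahead by one, 2 = B ahead by one.
Q = [[0, p, q],
     [q, 0, 0],
     [p, 0, 0]]
# Absorbing states (columns): A wins, B wins.
R = [[0, 0],
     [p, 0],
     [0, q]]

size = len(Q)
i_minus_q = [[float(i == j) - Q[i][j] for j in range(size)] for i in range(size)]
visits = inverse(i_minus_q)        # (I - Q)^-1
end_probs = mat_mul(visits, R)     # (I - Q)^-1 R

print("expected points played from a tie:", round(sum(visits[0]), 3))
print("chance A wins from a tie:", round(end_probs[0][0], 3))
```

Adding across the first row of the first result gives the expected number of points still to be played from a tied score, since each visit to a non-absorbing state costs one point.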
Goals: See how
some trees can be "solved" using Markov chains.
Skills:
…
Know how to draw a
Markov chain with nodes.
In a Markov chain, there are probabilities attached to
the connections between nodes. In
some chains, like the batting average example and the tennis example, it is
possible to revisit nodes previously visited; in our other examples this
doesn't happen. Thus there are a
lot of zeroes in those diagrams.
What we usually want to know is the behavior after the system is
"let run" for a while, i.e., the long run
behavior.
…
Know how to use
matrices to make conclusions about Markov chains. By organizing our
information into appropriate matrices, and performing simple operations (using
the TI-83), we can discover various steady-state solutions to Markov
chains.
Activity: Quiz 2.
This second quiz is on probability, randomness, and
simulations.
Activity: Tournaments.
How do we decide which of several teams is the best one? Many professional sports leagues have a
"regular season" and a "post season" to find out. As we have seen, in only a few trials,
a simulation may give unreliable results.
For some sports, like football, only a few games are played. For other sports, many, many games are
played, like baseball. This leads
to a question: Can we believe in
football that a team with 12 wins is better than a team with 4 wins?
Before answering such a difficult question, we will look at the various ways
that post-season tournaments are organized. Regular seasons are almost always some form of a Round Robin
tournament. Local sports
tournaments often use single-elimination.
Goals: Know the
different types of tournaments.
King Of The Hill (KOTH) (Bowling), Round Robin (RR) (High School
Conferences), Single-Elimination (SE) (NCAA March Madness), Double-Elimination
(DE)
Skills:
…
King of the Hill
Tournament. In this tournament, the two lowest seeded players
compete, the winner plays the next highest seeded player, etc. until the last
winner meets the highest seeded player.
Provided the top seed is at least an even bet against any opponent, this tournament gives the highest seeded player at least a 50% chance of
winning the tournament.
…
Round Robin
Tournament. In the Round Robin tournament, each team plays each
other team once. This is a fairly
common way to conduct a league season, especially in football and
basketball. However, it is quite
possible for there to be no clear-cut champion at season end. This usually leads to a
"playoff" to determine the champion.
…
Single-Elimination
Tournament. In a single-elimination tournament, each loser of a
game is eliminated, until there is only one undefeated team. The KOTH is an example of a
single-elimination
tournament.
…
Double-Elimination
Tournament. In a double-elimination tournament, teams are
eliminated after their second
loss. This creates some
interesting ways to structure the tournament. Most if not all tournaments have a
"loser's" bracket,
the idea being that once a team has lost a game, they should only play teams
that also have lost a game. In
many supposed double-elimination tournaments, the winner of the loser's bracket
is awarded 3rd place, but this type of tournament is more properly
called a consolation
tournament. In a true
double-elimination tournament, the loser of the last winner's bracket game
plays the winner of the loser's bracket, and that winner plays the winner's
bracket winner, possibly twice in a row.
The key is that all but one team will have two losses.
Activity: Seeding.
What is the best way to seed a single-elimination tournament? It is fairly obvious what to do with 5
or fewer teams, but with 6 or more, it is not obvious at all what is
appropriate. We will calculate
some probabilities of teams winning some tournaments, and discuss some rules
for seeding that make guarantees for the team's probabilities of winning the
tournament. These probability
calculations will use the multiplication rules.
Goals: Use
Probability structures to analyze single-elimination
tournaments.
Skills:
…
Intransitive dice
example. It is possible to have a system of real world
probabilities that do not obey stochastic transitivity. Basically, we have this intuition that
if team A beats team B most of the time, and team B beats team C most of the
time, then team A should beat team C most of the time. The dice example shows that this sort
of transitivity is not always the case.
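A short Python sketch can confirm this kind of cycle by counting every pair of faces. The three dice below are an assumed set chosen for illustration and may not be the dice we use in class; win_probability is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def win_probability(die_x, die_y):
    """Chance that die_x shows a strictly higher face than die_y."""
    wins = sum(1 for x, y in product(die_x, die_y) if x > y)
    return Fraction(wins, len(die_x) * len(die_y))

# An assumed set of three six-sided dice (not necessarily the class set).
die_a = [2, 2, 4, 4, 9, 9]
die_b = [1, 1, 6, 6, 8, 8]
die_c = [3, 3, 5, 5, 7, 7]

print("P(A beats B) =", win_probability(die_a, die_b))
print("P(B beats C) =", win_probability(die_b, die_c))
print("P(C beats A) =", win_probability(die_c, die_a))
```

Each of the three probabilities comes out above one half, so A usually beats B, B usually beats C, and yet C usually beats A.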
…
Be able to calculate
the chance of a team winning a tournament. Given a particular
matrix of probabilities, know the formulas for calculating the chance of
winning the tournament. Tree
diagrams, and good organization, will help in these
calculations.
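As one way to organize the tree, here is a Python sketch for a 4-team single-elimination bracket in which 1 plays 4 and 2 plays 3. The probability matrix P is made up for illustration, and win_chances_four_teams is a hypothetical helper, not a formula from the class notes.

```python
def win_chances_four_teams(p):
    """Chance each seed wins a 4-team single-elimination bracket.

    p[i][j] is the chance that team i beats team j (teams indexed 0..3,
    so index 0 is the top seed). Round one pairs seed 1 with seed 4 and
    seed 2 with seed 3.
    """
    semis = [(0, 3), (1, 2)]
    chances = [0.0] * 4
    for a, b in semis:
        # The other semifinal supplies the possible final opponents.
        c, d = semis[1] if (a, b) == semis[0] else semis[0]
        for team, rival in ((a, b), (b, a)):
            reach_final = p[team][rival]
            # Multiplication rule along each branch of the tree.
            beat_c = p[c][d] * p[team][c]
            beat_d = p[d][c] * p[team][d]
            chances[team] = reach_final * (beat_c + beat_d)
    return chances

# Hypothetical probability matrix: row team beats column team.
P = [[0.5, 0.6, 0.7, 0.8],
     [0.4, 0.5, 0.6, 0.7],
     [0.3, 0.4, 0.5, 0.6],
     [0.2, 0.3, 0.4, 0.5]]

chances = win_chances_four_teams(P)
for seed, chance in enumerate(chances, start=1):
    print(f"Seed {seed}: {chance:.4f}")
```

The four chances should add to 1, which is a handy check on any hand calculation as well.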
…
Rules for
seeding. Basic structure for 4 teams: 1 plays 4 and 2 plays 3. Top-n rule: No
team ranked lower than the top n
teams should play fewer games (treat lower ranked teams as automatic wins for
purposes of using this rule).
Sub-tournament rule: All
sub-tournaments must be seeded appropriately. Conclusion: Of
the many structures, only a small number are ordered. We will list them in class. Because of the relative scarcity of ordered tournaments,
they are rarely used in practice (for more than 6 teams). This means that there exist probability
structures where some tournaments are unordered. We will look at the counter-examples for 6 teams, and these
will be the basis for knowing what to do with 7 or more teams.
Activity: Playoff tournaments (like NCAA
March Madness, or NFL playoffs).
What systems do the major sports leagues use for their playoff schedules? Tournaments are a major source of
entertainment, revenue, fan interest, etc. The notions of seeding, byes, winner's and loser's brackets
etc. have been made popular by both professional and local post-season
tournaments. We will look at the
NCAA tournaments and the NFL playoffs to see what is being used.
Revisiting the NCAA tournament we saw on Day 1, what sort of tournament is it,
and given our results from previous days, is it a reasonable approach?
The NFL playoffs involve 6 teams from each conference. The two top teams are given first round
byes. Team 3 plays Team 6, and
Team 4 plays Team 5. After the
first round, teams are relabeled if any upsets have occurred. For example, if Team 6 beats Team 3,
and Team 4 beats Team 5, they re-label old Team 6 as New Team 4, and old Team 4
as New Team 3. Then they have Team
1 play New Team 4 and Team 2 play New Team 3. Again, given our previous results, is this reasonable?
Finally, does Major League Baseball do things fairly? MLB has three divisions in each league. Each division winner is in the
playoffs, plus one "wild-card" team. The wild-card team plays the division winner with the most
victories. The other two division
winners also pair off.
Project 1 due:
Choose one of the following games and
1) Give a short history of the
game.
2) Describe how randomness is part
of the game.
3) Using probability rules, show
some probability examples using this game.
The purpose of this report is to show that you can effectively communicate the
ideas of probability in a real world setting. It will be important to use proper English. If I cannot read your paper or follow
your logic, you will not have convinced me! The probability calculations do not have to be extremely
detailed; please talk to me if you have any doubts about what is
appropriate.
Games: Risk, Blackjack,
Backgammon, Roulette, Battleship, Poker, Minesweeper, Cribbage
Goals: Explore
professional tournaments and seeding to see how practice compares to
theory.
Skills:
…
Know the systems the
major sports leagues use for playoff tournaments. Quite simply, all
major sports leagues use single-elimination tournaments for their
playoffs. Some use 16 teams, some
4, some 6. As we saw,
single-elimination tournaments with 6 teams and higher are usually not ordered,
at least for some probability
structures. It is very unclear
whether those specific probability structures exist in practice.
Activity: NFL regular season schedule.
How is the NFL regular season schedule made? Because NFL teams only play 16 games, we know that a Round
Robin tournament among 32 teams is not possible. Within a division, however, with only 4 teams, it is easy to
have a Round Robin, even a replicated Round Robin. What choices are there for the remaining 10 games of a
season? We will explore what
actually is done as well as other potential options.
Before class, choose an NFL team (we will coordinate teams so we don't all do
the same one) and find out its season schedule for the last three years. (You can access this information from
the following site by clicking on your team of choice. http://www.nfl.com/teams ) See if there are common factors year to
year for your chosen team. We will
compare notes in class and see if we can unravel how they do it.
http://www.sports-scheduling.com
Goals: Explore
NFL Scheduling.
Skills:
…
Understand the actual
method the NFL uses to schedule teams.
Knowing that a complete Round
Robin is not possible, the NFL tries to balance schedules as much as
possible. Because of the division
structure, the league has deemed that teams play their division teams twice
each. The remaining 10 games are
where the decisions are made.
Activity: Traveling Salesman.
Do the major sports leagues take traveling costs into account when they create
their schedules? We will explore
some simple graph theory to try to address this.
First, we draw a graph to represent the cities in our league. Each graph consists of points and
lines. The points represent the
cities, and the lines represent traveling from city to city. We can label each travel path with the
costs of traveling there, or perhaps with mileage. Now, the traveling salesman problem (TSP) amounts to finding
a path that visits each city once and returns to the starting point, but with
minimum travel costs.
The situation of sports teams and schedules isn't exactly the TSP, but has
similar components. In reality,
baseball teams will take a "road trip" and visit several cities
before coming back home.
To begin today, we will schedule road trips for the National League. Then we will see how they
really do it.
http://www.tsp.gatech.edu/index.html
http://ite.pubs.informs.org/Vol5No1/Birge/index.php
http://en.wikipedia.org/wiki/Traveling_salesman_problem
Goals: Know
about the Traveling Salesman Problem (TSP).
Skills:
…
Understand the
complexity of the Traveling Salesman Problem. The Traveling
Salesman Problem is a famous graph theory problem. What is the optimal path a salesman should take to visit
every city in the district at minimum cost? Its solution turns out to be incredibly complex; we are not
going to solve the TSP, but just look at some of the basics of it.
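To get a feel for why the problem is hard, here is a brute-force Python sketch that simply tries every possible tour. The four cities and their mileages are invented for illustration, and shortest_tour is a hypothetical helper name.

```python
from itertools import permutations

def shortest_tour(dist, start=0):
    """Try every ordering of the other cities and keep the cheapest tour."""
    others = [c for c in range(len(dist)) if c != start]
    best_cost, best_tour = None, None
    for order in permutations(others):
        tour = (start,) + order + (start,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

# Hypothetical symmetric mileage table for four made-up cities.
cities = ["A", "B", "C", "D"]
miles = [[0, 10, 15, 20],
         [10, 0, 35, 25],
         [15, 35, 0, 30],
         [20, 25, 30, 0]]

cost, tour = shortest_tour(miles)
print(cost, " -> ".join(cities[i] for i in tour))
```

With 4 cities there are only 3! = 6 orderings to check. With 16 cities there are 15!, about 1.3 trillion, which is why trying everything stops being practical very quickly.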
Activity: MLB and NBA regular season
schedule.
How is the Major League Baseball schedule made? We will see if we can devise some integer solutions to the
equations that make for balanced schedules. For example, back in the 70's and 80's, the American League
used to have each team (in a 7-team division) play teams in their own division
13 times, and teams in the other division 12 times. 13*6 + 12*7 = 78+84 = 162. The National League used 6-team divisions, and their numbers
were 18 and 12: 18*5 + 12*6 = 90 + 72 = 162. What is happening now, with 4- and 5-team divisions, and
inter-league play? Does the NBA do
the same thing for their basketball schedules?
For the following breakdowns of divisions, find all the integer solutions less
than 20.
4 and 4
5 and 5
6 and 6
6 and 7
8 and 8 for basketball (82 games)
4 and 5 and 5
5 and 5 and 6
5 and 5 and 5
5 and 5 and 5 for basketball (82 games)
Goals:
Understand
the equations constraining baseball and basketball
scheduling.
Skills:
…
Understand that
integer solutions to the equations may not be possible. We
seek solutions to equations like 7x + 6y = 162 or 4x + 5y + 4z + 18 = 162. If there are no integer solutions that
are acceptable to baseball, what compromises are made?
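A computer search is one way to list the candidates. The Python sketch below handles the two-division case only; schedule_solutions and its parameter names are hypothetical, and the choice to require at least one game against every opponent is an assumption.

```python
def schedule_solutions(own_rivals, other_rivals, games=162, limit=20):
    """List (x, y) with own_rivals*x + other_rivals*y == games.

    x is the number of games against each team in a club's own division,
    y the number against each team in the other division; both must be
    positive whole numbers below limit.
    """
    return [(x, y)
            for x in range(1, limit)
            for y in range(1, limit)
            if own_rivals * x + other_rivals * y == games]

# Two 7-team divisions: 6 rivals in the division, 7 outside it.
seven_team = schedule_solutions(6, 7)
print(seven_team)
# Two 6-team divisions: 5 rivals in the division, 6 outside it.
six_team = schedule_solutions(5, 6)
print(six_team)
```

The old American League choice (13 and 12) and the old National League choice (18 and 12) both appear in the output, alongside one alternative each that the leagues did not use.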
Activity: Scheduling a Round Robin.
Today we will try to schedule a complete season of competitions using a Round
Robin tournament. I will let you
try your hand at this for a while, say 20 minutes. Then I will show you what I have discovered. Then I will let you see if you can do
some larger ones (10 or more teams).
Goals: Be able
to arrange a season schedule for teams participating in a Round Robin
tournament.
Skills:
…
Know how Latin
Squares can be used to construct a Round Robin tournament. A
Latin Square has each number appear exactly once in each row and column. If we let the rows and columns
represent teams, and the entries in the diagram represent the week they meet,
then we want symmetric Latin Squares, which will describe the schedule for a
Round Robin tournament.
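Here is a Python sketch of one standard construction for an even number of teams. It is not necessarily the method shown in class; round_robin_weeks is a hypothetical helper, and the diagonal is left blank because a team never plays itself.

```python
def round_robin_weeks(n_teams):
    """Return table[i][j] = week in which teams i and j meet.

    Assumes an even number of teams. Weeks are numbered 1..n_teams-1 and
    the diagonal is None because a team never plays itself.
    """
    if n_teams % 2:
        raise ValueError("use an even number of teams (add a bye team)")
    m = n_teams - 1                      # number of weeks
    table = [[None] * n_teams for _ in range(n_teams)]
    for i in range(m):
        for j in range(m):
            if i != j:
                table[i][j] = (i + j) % m + 1
        # The last team takes the one week still open in row i.
        table[i][m] = table[m][i] = (2 * i) % m + 1
    return table

schedule = round_robin_weeks(6)
for row in schedule:
    print(" ".join("-" if week is None else str(week) for week in row))
```

Each row and each column lists every week exactly once, and the table is symmetric, so reading across a team's row gives that team's opponent for each week.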
Activity: Presentations.
Tournament scheduling. Your task
is to design a double-elimination tournament for 6 teams. Assume that the teams have just
finished a regular season so we have them seeded from 1 to 6, 1 being the team
with the best season record. Your
only constraint is that every team except the winner will have two losses at
the end. As part of your design,
you need to convince us that your plan is reasonable. You must balance the desire to have the best teams have the
best chance of winning with the desire to entertain the fans.
Activity: Quiz 3.
This third quiz is on tournaments and scheduling.
Activity: Guess m&m's percentage.
What fraction of m&m's are blue or green? Is it 25 %?
33 %? 50 %? We take samples to find out.
Each of you will sample from my jar of m&m's, and you will all calculate
your own confidence interval. Of
course, not everyone will be correct, and in fact, some of us will have
"lousy" samples. But
that is the point of the confidence coefficient, as we will see when we jointly
interpret our results.
It has been my experience that confidence intervals are easier to understand if
we talk about sample proportions instead of sample averages. Each of you will have a different
sample size and a different number of successes. In this case the sample size, n, is the total number of m&m's you have selected,
and the number of successes, x, is
the total number of blue or green m&m's in your sample. Your guess is simply the
ratio x/n, or
the sample proportion. We call this estimate
p-hat.
Use STAT TEST
1-PropZInt with 50 % confidence for your
interval here today. In sports,
this would be akin to using an observed batting average to estimate a batter's true batting average.
When you have calculated your confidence interval, record your result on the
board for all to see. We will
jointly inspect these confidence intervals and observe just how many are
"correct" and how many are "incorrect". The percentage of correct
intervals should match our chosen level of confidence. This is in fact what is meant by
confidence. In practice, high
values of confidence are used, such as 95 %. We used 50 % in class only as a demonstration; we
would never use such a low value in practice.
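For reference, the usual large-sample formula for this kind of interval is p-hat plus or minus z* times the square root of p-hat(1 - p-hat)/n, and I believe this is what the calculator routine computes. The Python sketch below applies that formula; the sample of 14 successes in 40 is invented, and one_prop_z_interval is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence):
    """Large-sample interval: p-hat plus or minus z* times its standard error."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 14 blue or green m&m's out of 40 drawn.
low, high = one_prop_z_interval(14, 40, 0.50)
print(f"50% interval: {low:.3f} to {high:.3f}")
low, high = one_prop_z_interval(14, 40, 0.95)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Running both lines on the same sample shows the trade-off directly: the 95 % interval is wider than the 50 % interval.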
Goals: Introduce
statistical inference - Guessing the parameter. Construct and interpret a confidence
interval.
Skills:
…
Understand how to
interpret confidence intervals.
The calculation of a confidence interval is quite
mechanical. In fact, as we have
seen, our calculators do all the work for us. Our job is then not so much to calculate the confidence intervals as it is to be able to
understand when one should be used
and how best to interpret
one. The time to use a confidence
interval is when we want to estimate a true value from a population or a
process. Each confidence interval
comes with a confidence level, a
measure of how reliable our method is.
For example, if we use 95 % confidence, then over the long run 95 % of
our intervals will contain the true answer.
…
Understand what makes
a confidence interval narrow.
There are several factors that make a confidence
interval narrow, which is ideally what we want because a narrow
confidence interval
means we are sure of our guess.
First, so we don't fool ourselves, we settle on a confidence level. It is true that a lower confidence level makes a narrower
interval, but we want high confidence so we can believe our results. The second way a confidence interval
can be narrow is if the data we are using has a small standard deviation. In the case of proportions, this means
the value is far from 50 %.
The third
and most important way that confidence intervals can be narrow is with a larger
sample size. This is the only one
of the three that we can realistically control. The moral is simple: results from larger samples are more
believable.
Activity: Applying confidence
intervals to some
binomial processes, such as Batting Averages and Winning Percentages.
After 100 appearances at the plate, a batter has 25 hits. Could this player's true batting
average be .300? The confidence
interval will help us understand which values are reasonable guesses for the
true parameter (in this case, batting average).
Similarly, after 100 games, one team has 60 wins, and another has 40 wins. What is the difference between these
two teams' true abilities? This
time the appropriate routine is STAT TEST 2-PropZInt. We
won't go over the messy calculation details, just the new interpretation for a
difference of two proportions.
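For the curious, here is a Python sketch of the standard large-sample interval for a difference of two proportions, applied to the 60-win and 40-win teams. The 95 % confidence level and the helper name two_prop_z_interval are assumptions; the calculator does the same job without any of this.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# The two teams from the activity: 60 wins and 40 wins in 100 games each.
low, high = two_prop_z_interval(60, 100, 40, 100)
print(f"95% interval for the difference: {low:.3f} to {high:.3f}")
```

Both ends of this interval are positive, so zero is not a plausible value for the difference in the teams' true winning percentages.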
Goals:
Understand
further Confidence Interval applications.
Skills:
…
Be able to give an
English description to a binomial confidence interval. Based
on our m&m's example, we will want to describe confidence intervals as a
"long run" argument. For
example, our actual outcome is just one of many possible sequences. The confidence coefficient is a
probability that we attach to the method we use to guess the true answer.
The range of answers in our interval tells the values we consider to be
plausible. If the interval is
narrow enough, we will be able to make convincing statements. If the interval is wide, we are
basically observing random variation.
…
Understand the 2
sample problem. When we have two random events, like two teams'
winning percentages or two players' performances, we often like to compare them
head-to-head. The proper way to do
this is to analyze their differences. The resulting confidence interval will
often have small percentages, or negative numbers. It is important to be able to interpret such numbers. The key is to remember that we are
examining the difference between the two values. If the two players or teams are nearly equal in ability,
this difference should be close to zero.
If one player or team is much better than the other, this difference
will be far from zero. Ideally, we
want our confidence interval to be completely on one side of zero. This will convince us that one is
better than the other.
Activity: Argument by contradiction.
Scientific method. Type I and Type
II error diagram. Courtroom
terminology.
Some terminology:
Null hypothesis. A statement about a parameter. The null hypothesis is
always an equality or a single claim (like two variables are
independent). We assume the null
hypothesis is true in our following calculations, so it is important that the
null be a specific value or fact that can be assumed.
Alternative hypothesis.
The alternative hypothesis is a statement that we will
believe if the null hypothesis is rejected. The alternative does not have to be the complement of the
null hypothesis. It just has to be
some other statement. It
can be an inequality, and usually is.
Rejection rule. To decide between two competing hypotheses, we create
a rejection rule. It's usually as
simple as "Reject the null hypothesis if the sample mean is greater than
10. Otherwise fail to
reject." We always want to
phrase our answer as "reject the null hypothesis" or "fail to
reject the null hypothesis".
We never want to say "accept the null hypothesis". The reasoning is this: Rejecting the null hypothesis means the
data have contradicted the assumptions we've made (assuming the null hypothesis
was correct); failing to reject the null hypothesis doesn't mean we've proven
the null hypothesis is true, but rather that we haven't seen anything to doubt
the claim yet. It
could be the case that we just haven't taken a large enough
sample yet.
Type I Error. When we reject the null hypothesis when
it is in fact true, we have made a Type I error. We have made a conscious decision to treat this error as a
more important error, so we construct our rejection rule to make this error
rare.
Type II Error. When we fail to reject the null
hypothesis, and in fact the alternative hypothesis is the true one, we have
made a Type II error. Because we
construct our rejection rule to control the Type I error rate, the Type II
error rate is not really under our control; it is more a function of the
particular test we have chosen.
The one aspect we can
control is the sample size.
Generally, larger samples make the chance of making a Type II error
smaller.
Significance level, or size of the test.
The probability of making a
Type I error is the significance level.
We also call it the size of the test, and we use the symbol α (alpha) to represent it. Because we want the Type I error to be rare, we usually will
set α to be a small number, like .05 or .01 or even
smaller. Clearly smaller is
better, but the drawback is that the smaller α is,
the larger the Type II error becomes.
P-value. There are two definitions for the P-value. Definition 1: The P-value is the smallest alpha level at which our observed data would
just cause us to reject the null hypothesis. Definition 2:
The P-value is the chance of seeing data as extreme or more extreme than
the data actually observed, assuming the null hypothesis is true. Using
either definition, we calculate the P-value as an area under a tail in a
distribution.
We will examine these ideas using the z-test for a proportion.
The TI-83 command is STAT TEST 1-PropZTest. The command gives you a
menu of items to input. It assumes
your null hypothesis is a statement about a true proportion
p.
You must
tell the assumed null value, p0,
and the alternative claim, usually the not-equals option. You also need to tell the calculator
your total successes, x, and your
total trials, n. If you choose CALCULATE the machine will simply display the test statistic and
the P-value. We care about whether
the P-value is small or not. If
you choose DRAW, the calculator will
graph the P-value calculation for you.
You should experiment to see which way you prefer.
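The calculation behind the test can be sketched in a few lines of Python. This uses the standard z-test formula for one proportion with the not-equals alternative; one_prop_z_test is a hypothetical helper name, and the batter with 25 hits in 100 at bats tested against a null value of .300 is just an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0):
    """Two-sided z-test of the null hypothesis p = p0."""
    p_hat = x / n
    # The standard error uses p0 because we assume the null is true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Illustrative batter: 25 hits in 100 at bats, null value p0 = .300.
z, p_value = one_prop_z_test(25, 100, 0.300)
print(f"z = {z:.3f}, P-value = {p_value:.3f}")
```

The P-value here is well above .05, so we fail to reject the claim that this batter's true average is .300.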
Project 2 due:
Create a round-robin schedule for a
league that has two 4-team divisions.
Have each team play all the other teams in their own division first,
then all the teams in the other division, then the teams in their own division
again. Balance the schedule with
respect to home and away games; that is, make sure each team has an equal
number of home and away games. In
your report, describe the method you used to create your schedule, and any
difficulties you encountered. Even
if you are unable to satisfy all the conditions I have presented here for you,
let me know how far you got, and why you weren't able to
finish.
Goals: Introduce
statistical inference - Hypothesis testing.
Skills:
…
Recognize the two
types of errors we make.
If we decide to reject a null hypothesis, we might be
making a Type I error. If we fail
to reject the null hypothesis, we might be making a Type II error. If it turns out that the null
hypothesis is true, and we reject it because our data looked weird, then we
have made a Type I error.
Statisticians have agreed to control this type of error at a specific
percentage, usually 5%. On the
other hand, if the alternative hypothesis is true, and we
fail to reject the null hypothesis, we have also made a
mistake. This second type of error
is generally not controlled by us;
the sample size is the determining factor here.
…
Understand why one
error is considered a more serious error.
Because we control the
frequency of a Type I error, we feel confident that when we reject the null
hypothesis, we have made the right decision. This is how the scientific method works; researchers usually
set up an experiment so that the conclusion they would like to make is the
alternative hypothesis. Then if
the null hypothesis (usually the opposite of what they are trying to show) is
rejected, there is some confidence in the conclusion. On the other hand, if we fail to reject the null hypothesis, the most useful
conclusion is that we didn't have a large enough sample size to detect a real
difference. We aren't really
saying we are confident the null hypothesis is a true statement; rather we are
saying it could be true. Because we cannot control the frequency
of this error, it is a less confident statement.
…
Become familiar with
"argument by contradiction".
When researchers are trying to
"prove" a treatment is better or that their hypothesized mean is the
right one, they will usually choose to assume the opposite as the null
hypothesis. For election polls,
they assume the candidate has 50% of the vote, and hope to show that is an
incorrect statement. For showing
that a local population differs from, say, a national population, they will
typically assume the national average applies to the local population, again
with the hope of rejecting that assumption. In all cases, we formulate the hypotheses
before collecting data; therefore, you will never see a
sample average or a sample proportion in either a null or alternative
hypothesis.
…
Understand why we
reject the null hypothesis for small P-values. The P-value is the
probability, assuming the null hypothesis is true, of seeing a sample result as bad as or "worse" than the one we
actually saw. In this sense,
"worse" means even more evidence against the null hypothesis; more
evidence favoring the alternative hypothesis. If this probability is small, it means either we have
observed a rare event, or that we have made an incorrect assumption, namely the
null hypothesis. Statisticians and
practitioners have agreed that 5% is a reasonable cutoff between a result that
contradicts the null hypothesis and a result that could be argued to be in
agreement with the null hypothesis.
Thus, we reject our claim only when the P-value is a small enough
number.
Activity: Baseball player comparisons.
Could two players have the same batting average and yet perform differently
over a short period of time?
Similar
to our work on Day 25, we will use STAT TEST 2-PropZTest to help decide if two players have different true
ability levels. In this setting,
the null hypothesis is that the two players are of equal ability. That is, the difference in their true
proportions is zero.
After we look at a few pairs of players, let's figure out how large a
difference between batting averages is considered statistically
significant. In your groups, have
each person choose a different sample size, like 50 at bats, 100 at bats,
etc. Then invent some fictitious
results and see when two players are considered to be different. Compare notes with your group
mates. We will pool results at the end.
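If you would like to automate the group experiment, here is a Python sketch of the standard pooled two-proportion z-test. The .300 and .250 hitters and the list of sample sizes are fictitious, and two_prop_z_test is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided z-test of the null hypothesis p1 = p2 (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Fictitious players: a .300 hitter and a .250 hitter at several sample sizes.
for at_bats in (100, 200, 400, 800, 1600):
    hits1, hits2 = 3 * at_bats // 10, at_bats // 4
    z, p_value = two_prop_z_test(hits1, at_bats, hits2, at_bats)
    print(f"{at_bats:5d} at bats each: z = {z:.2f}, P-value = {p_value:.3f}")
```

The observed difference is the same 50 points on every line; only the sample size changes, and the P-value shrinks as the number of at bats grows.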
Goals:
Hypothesis
test applications.
Skills:
…
Understand another
context for hypothesis testing.
The two sample z-test for proportions offers us another example of a hypothesis
test. In this setting, the null
hypothesis is that the true difference in proportions is zero. The interpretation of the P-value is
the same: a small enough P-value causes us to doubt the null hypothesis. Here, doubting the null hypothesis
means we think the two players have different batting averages. If the P-value is large, meaning we
fail to reject the null hypothesis, then we conclude the two players could
indeed have the same batting average.
We haven't proven that they
do; it's just a plausible explanation for the data.
Activity: Catch up Day/Review
Goals:
Skills:
…
Be able to formulate
and conduct a statistical hypothesis test. The first step in
conducting a statistical hypothesis test is the formulation of the hypotheses
to be tested. The null hypothesis
is an equality statement, usually one of no change. For example, if we have a historical value we might claim
the current average is the same as the historical average. Or we may claim two players have the
same abilities. The next step is
to gather data and use an appropriate scheme to convert it to a
probability. This probability will
measure how likely the data are given the assumption of the null
hypothesis. If this
"P-value" is small, it means our data is unusual for that
hypothesis. This is the
counter-evidence we need to "prove" the statement wrong. If the "P-value" is large, it
means the data seem consistent with the statement, and we have failed to find
anything wrong.
…
Know the different
uses of the t procedures
and the proportion procedures.
The t-test
and t-interval are used when we
have data that can be put into a list, such as bowling scores, or
game-by-game passing
yards, etc. We need our numbers in
a list, or someone to tell us the mean and standard deviation of the
numbers. For the proportions
tests, we must have binary data, like success/failure data. Our examples of this included
winning/losing, passing successes/passing failures, etc. In either case, the P-value is our
decider: if the P-value is small, we reject the null hypothesis of equality and
believe the alternative is true. If
we fail to reject (because the P-value is large) then we are willing to say the
null hypothesis is plausible. The
data have not yet contradicted the claim, probably due to a small
sample size.
Activity: Quiz 4.
This fourth quiz is on statistical inference.
Activity: Linear
Regression.
Using the Olympic data, fit a regression line to predict the 2004 and 2008 race
results.
Begin by making a scatter plot of the race times. If you want a rough guess for the slope of the best fitting
line through the data, you can connect two points spaced far apart (details in
class.)
Next, use the TI-83's regression features to calculate the best
fit. The
command is STAT CALC LinReg(ax+b),
assuming the two lists are in L1
and L2.
(L1 will be the horizontal variable, years in this
case.) (We used this command on
Day 4 also.)
Have the calculator type this equation into your Y= menu (using VARS Statistics EQ RegEQ), and
TRACE on the line to predict the future results.
Here is the data:
Men's and Women's 100-meter dash winning Olympic times:
Year | Men's winner | Time | Women's winner | Time
1896 | Thomas Burke, United States | 12 sec | |
1900 | Francis W. Jarvis, United States | 11.0 sec | |
1904 | Archie Hahn, United States | 11.0 sec | |
1908 | Reginald Walker, South Africa | 10.8 sec | |
1912 | Ralph Craig, United States | 10.8 sec | |
1920 | Charles Paddock, United States | 10.8 sec | |
1924 | Harold Abrahams, Great Britain | 10.6 sec | |
1928 | Percy Williams, Canada | 10.8 sec | Elizabeth Robinson, United States | 12.2 sec
1932 | Eddie Tolan, United States | 10.3 sec | Stella Walsh, Poland (a) | 11.9 sec
1936 | Jesse Owens, United States | 10.3 sec | Helen Stephens, United States | 11.5 sec
1948 | Harrison Dillard, United States | 10.3 sec | Francina Blankers-Koen, Netherlands | 11.9 sec
1952 | Lindy Remigino, United States | 10.4 sec | Marjorie Jackson, Australia | 11.5 sec
1956 | Bobby Morrow, United States | 10.5 sec | Betty Cuthbert, Australia | 11.5 sec
1960 | Armin Hary, Germany | 10.2 sec | Wilma Rudolph, United States | 11.0 sec
1964 | Bob Hayes, United States | 10.0 sec | Wyomia Tyus, United States | 11.4 sec
1968 | Jim Hines, United States | 9.95 sec | Wyomia Tyus, United States | 11.0 sec
1972 | Valery Borzov, USSR | 10.14 sec | Renate Stecher, E. Germany | 11.07 sec
1976 | Hasely Crawford, Trinidad | 10.06 sec | Annegret Richter, W. Germany | 11.08 sec
1980 | Allen Wells, Britain | 10.25 sec | Lyudmila Kondratyeva, USSR | 11.6 sec
1984 | Carl Lewis, United States | 9.99 sec | Evelyn Ashford, United States | 10.97 sec
1988 | Carl Lewis, United States | 9.92 sec | Florence Griffith-Joyner, United States | 10.54 sec
1992 | Linford Christie, Great Britain | 9.96 sec | Gail Devers, United States | 10.82 sec
1996 | Donovan Bailey, Canada | 9.84 sec | Gail Devers, United States | 10.94 sec
2000 | Maurice Greene, United States | 9.87 sec | Marion Jones, United States | 10.75 sec
2004 | ?? | | ?? |
(a) A 1980
autopsy determined that Walsh was a man.
Men's and Women's 200-meter dash winning Olympic times:
Year | Men's winner | Time | Women's winner | Time
1900 | Walter Tewksbury, United States | 22.2 sec | |
1904 | Archie Hahn, United States | 21.6 sec | |
1908 | Robert Kerr, Canada | 22.6 sec | |
1912 | Ralph Craig, United States | 21.7 sec | |
1920 | Allan Woodring, United States | 22 sec | |
1924 | Jackson Sholz, United States | 21.6 sec | |
1928 | Percy Williams, Canada | 21.8 sec | |
1932 | Eddie Tolan, United States | 21.2 sec | |
1936 | Jesse Owens, United States | 20.7 sec | |
1948 | Mel Patton, United States | 21.1 sec | Francina Blankers-Koen, Netherlands | 24.4 sec
1952 | Andrew Stanfield, United States | 20.7 sec | Marjorie Jackson, Australia | 23.7 sec
1956 | Bobby Morrow, United States | 20.6 sec | Betty Cuthbert, Australia | 23.4 sec
1960 | Livio Berruti, Italy | 20.5 sec | Wilma Rudolph, United States | 24.0 sec
1964 | Henry Carr, United States | 20.3 sec | Edith McGuire, United States | 23.0 sec
1968 | Tommy Smith, United States | 19.83 sec | Irena Szewinska, Poland | 22.5 sec
1972 | Valeri Borzov, USSR | 20.00 sec | Renate Stecher, E. Germany | 22.40 sec
1976 | Donald Quarrie, Jamaica | 20.23 sec | Barbel Eckert, E. Germany | 22.37 sec
1980 | Pietro Mennea, Italy | 20.19 sec | Barbel Wockel, E. Germany | 22.03 sec
1984 | Carl Lewis, United States | 19.80 sec | Valerie Brisco-Hooks, United States | 21.81 sec
1988 | Joe DeLoach, United States | 19.75 sec | Florence Griffith-Joyner, United States | 21.34 sec
1992 | Mike Marsh, United States | 20.01 sec | Gwen Torrance, United States | 21.81 sec
1996 | Michael Johnson, United States | 19.32 sec | Marie-Jose Perec, France | 22.12 sec
2000 | Konstantinos Kenteris, Greece | 20.09 sec | Marion Jones, United States | 21.84 sec
2004 | ?? | | ?? |
Goals: Practice
using regression with the TI-83.
We want the regression equation, the regression line superimposed on the
plot, the correlation coefficient, and we want to be able to use the line to
predict new values.
Skills:
…
Fit a line to
data. This may be as simple as 'eyeballing' a straight line
to a scatter plot. However, to be
more precise, we will use least squares, STAT CALC LinReg(ax+b) on the TI-83, to calculate the
coefficients, and VARS Statistics
EQ RegEQ to type the equation
in the Y= menu.
You should also be able to sketch a line onto a scatter plot (by hand)
by knowing the regression coefficients.
…
Interpret regression
coefficients. Usually, we want to only interpret slope, and slope is
best understood by examining the units involved, such as inches per year or
miles per gallon, etc. Because
slope can be thought of as "rise" over "run", we are
looking for the ratio of the units involved in our two variables. More precisely, the slope tells us the
change in the response variable for a unit change in the explanatory
variable. We don't typically
bother interpreting the intercept, as zero is often outside of the range of
experimentation.
…
Estimate/predict new
observations using the regression line.
Once we have calculated a
regression equation, we can use it to predict new responses. The easiest way to use the TI-83 for
this is to TRACE on the
regression line. You may need to use up and down arrows
to toggle back and forth from the plot to the line. You may also just use the equation itself by multiplying the
new x-value by the slope and
adding the intercept. (This is
exactly what TRACE is doing.) Note: if the x-value you want is outside the
current window settings (lower than Xmin or above Xmax), you must reset the
window to include your x-value before using TRACE.
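The calculator steps above (fit the line, read off the correlation, predict a new value) can also be sketched in a few lines of Python. This is only an illustration, not what the TI-83 does internally; the function names fit_line and predict are made up for this sketch, and the data are the women's 200-meter winning times from the table above.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r

def predict(slope, intercept, x_new):
    """Multiply the new x-value by the slope and add the intercept."""
    return slope * x_new + intercept

# Women's 200-meter winning times, 1948 through 2000 (from the table above).
years = list(range(1948, 2001, 4))
times = [24.4, 23.7, 23.4, 24.0, 23.0, 22.5, 22.40,
         22.37, 22.03, 21.81, 21.34, 21.81, 22.12, 21.84]

a, b, r = fit_line(years, times)
print(f"slope = {a:.4f} seconds per year")
print(f"intercept = {b:.2f} seconds")
print(f"r = {r:.3f}")
print(f"predicted 2004 time = {predict(a, b, 2004):.2f} seconds")
```

Notice the units of the slope: seconds per year. A negative slope here means winning times have been dropping.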
Activity: Continue Olympic Data
regressions.
Explore Residuals, Outliers, and other forms of regressions (other than
linear).
Whenever you perform one of the regressions on the TI-83, the residuals are
stored in a list called RESID. This list of numbers, one for each data
point, tells the difference between the actual value and the predicted value of
the model chosen. Ideally we'd
like these to all be zero. Looking
at residuals can help us find outliers and model deficiencies. The main use of residuals though is
when we include more x
variables in
multiple regression. You should
plot the residuals against the x
value to see if the model is a good one.
Using one of the Olympic races, change a data point to something large and see
what effect the change has on the regression line. Now choose a different data point and try again. Make sure at least one of your points
is off to the side. This will give
you a look at the influence of data points on the edges of the
scatter plot.
Using the race data, try different models to see how the fits change. It is very important with all models to
plot the line or curve with the data.
We will use $R^2$ to measure how
well a model explains the variation, but it's not a perfect measure. Just because it is large does not mean
the model is a good fit. We will
see some examples of this phenomenon in class.
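Here is a small sketch of that last point, using made-up data that follow a curve (y = x squared) rather than a line. The straight-line fit gets a large R-squared, yet the residuals show a clear pattern (positive, negative, positive), which tells us the line is the wrong model. The function names are invented for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys, slope, intercept):
    """Actual value minus predicted value, one residual per data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def r_squared(ys, resid):
    """Fraction of the variation in y explained by the model."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum(e ** 2 for e in resid)
    return 1 - ss_resid / ss_total

# Hypothetical curved data: y = x squared.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]

slope, intercept = fit_line(xs, ys)
resid = residuals(xs, ys, slope, intercept)
print(resid)                            # [2.0, -1.0, -2.0, -1.0, 2.0]
print(f"{r_squared(ys, resid):.3f}")    # 0.920
```

Plotting these residuals against x would show a U shape, the same thing you would look for in the RESID list on the calculator.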
Goals: Continue
Simple Linear Regression.
Skills:
…
Understand what
regression is trying to minimize. The residuals in a regression are the distances
(measured vertically) from the data points to the regression line. The overall objective in regression is
to make these distances as small as possible. Regression uses a technique called Least Squares to
accomplish this: it chooses the line that minimizes the sum of the squared residuals. One consequence
of least squares is that outliers tend to have a large influence on the coefficients
of the regression line.
…
Know the effect
outliers have on regression.
Because the residuals are squared, the regression line
tends to be "attracted" to outlying points in a scatter plot. You should be able to guess the
influence a data point in a scatter plot has on the fitted
line.
…
Be able to perform
the hypothesis test for whether a variable adds to a regression
model. What we
want to know about a variable is whether the slope coefficient for that
variable is zero or not. If the
slope is zero, then that variable would not contribute to the model, and we
would say that the variable is not useful for predicting the response
variable. As usual with a
hypothesis test, we use the P-value as our measure; if the P-value is very
small (less than .05 or .01) we reject the null hypothesis that the slope is
zero.
…
Realize that a
straight line is not the only possible model. "Linear"
regression means the model is a straight line. Other models can be used, and the TI-83 has a number of them
available to you. The key to using
these alternate models is looking at the graph of the data
and the fit.
Another consideration is the interpretation of the parameters. For linear regression, slope is the
important parameter. For the
exponential, it's the growth rate.
For the others, there is no easy interpretation, which makes these other
models less appealing to use. Also
keep in mind that for multiple regression (more x variables) there is no easy theory available.
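As an illustration of an alternate model, here is one common way to fit an exponential curve y = a * b^x: take the natural log of the y-values and fit a straight line to those. Whether this matches the calculator's exponential regression exactly is an assumption on my part; the function names and the data are made up for the sketch.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_exponential(xs, ys):
    """Fit y = a * b**x by fitting a line to ln(y); all y must be positive."""
    logs = [math.log(y) for y in ys]
    slope, intercept = fit_line(xs, logs)
    return math.exp(intercept), math.exp(slope)

# Hypothetical data that doubles every step.
xs = [0, 1, 2, 3]
ys = [3, 6, 12, 24]

a, b = fit_exponential(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, growth rate = {b - 1:.1%} per unit of x")
```

The parameter to interpret is b: each unit increase in x multiplies the predicted y by b, so b - 1 is the growth rate.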
Activity: Investigation of
the QB Rating in football.
What is the formula the NFL uses to assess quarterback efficiency? We will use multiple regression (in
class) to see if we can figure out their method. Then we will compare to the actual formula, and discuss and
critique the formula. We will also
discuss ways to make our own rating.
Our main tool for multiple regression is the software MINITAB. You
enter data in a similar way to the TI-83, with lists of data. In MINITAB they are called columns, such as C1,
C2, etc. To
perform regression, use the pull-down menu and select REGRESS. Enter
your x variables and
your y variable, then click OK. We will talk about the output in class. There will be only a few numbers we
need, so it will be important that you familiarize yourself with the output by
doing a few analyses yourself.
http://football.about.com/c/ht/03/03/How_Calculate_Quarterback_Rating1048560068.htm
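For comparison with whatever our regression produces, here is a sketch of the NFL passer rating as it is commonly published. The constants below do not appear in these notes, so check them against the link above before relying on them; the function names are made up. Notice that apart from the cutoffs at 0 and 2.375, the rating is just a linear combination of four per-attempt statistics, which is why multiple regression has a chance of recovering it.

```python
def clamp(value, low=0.0, high=2.375):
    """Each of the four components is limited to the range 0 to 2.375."""
    return max(low, min(high, value))

def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """NFL passer rating, using the commonly published constants."""
    a = clamp((completions / attempts - 0.3) * 5)         # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)              # yards per attempt
    c = clamp((touchdowns / attempts) * 20)               # touchdown percentage
    d = clamp(2.375 - (interceptions / attempts) * 25)    # interception percentage
    return (a + b + c + d) / 6 * 100

# A hypothetical stat line: 30 attempts, 20 completions, 250 yards, 2 TD, 1 INT.
print(f"{passer_rating(30, 20, 250, 2, 1):.1f}")
```

Because every component is capped at 2.375, the highest possible rating is (4 x 2.375) / 6 x 100, about 158.3.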
Goals: Explore
the very basics of multiple regression.
Skills:
…
Understand how adding
more variables to the regression equation is done. The simplest form
of linear regression is $y = ax + b$. We can add more variables by just
adding more x's. Example: $y = a_1 x_1 + a_2 x_2 + b$. We haven't discussed any
of the details of fitting a multiple regression model; that would require an
entire upper division math course!
However, you should appreciate what is being
attempted.
…
Know how to use the
P-value for the F-test.
The F-test is an overall statement about the model,
which includes all the variables at once.
The null hypothesis for this test is that all the variables have a zero slope, simultaneously. If we fail to reject this null hypothesis, we are
saying there is no evidence that any of the variables help in predicting the response
variable. If we reject this null
hypothesis, we are saying that at least one variable is useful. To find out which variables are useful,
we use the individual t-tests.
…
Know how to use the
P-value for the individual t-tests. After we have rejected the F-test null
hypothesis, we generally explore which of the x-variables are important to the model. There are many ways to do this; we will
look at just one way, called "Backward Elimination". We will start with all the
x-variables available, and then we will drop any that
have large P-values, as those variables are not adding anything to the
model. There is a minor technical
point you should be aware of here: the null hypothesis for each t-test
is that the slope for that variable is zero, given that all the
other x-variables are already in the model. If we
remove a variable using this method, it doesn't mean that variable isn't useful
all by itself in predicting the response variable; it just means it's not
useful now with all the other
variables already in the model.
…
Be able to
use $R^2$ as a measure of a model's overall
fit. $R^2$ measures how small the vertical
deviations from the fitted "line" are. If $R^2 = 1$, then we have a perfect fit; the
"line" goes through every data point. How close to 1 we need to be to say we have a good fit will
depend on the field you are in or the problem you are exploring. In physical sciences, the
values of $R^2$
tend to be very close to 1, such
as .995 or higher. In social
sciences, where the response is often human behavior, $R^2$ values near
.3 may be considered large.
…
Use s as a measure of
a model's overall fit.
On the MINITAB outputs is a value called
s which we can use as another measure of a model's
adequacy. s is an estimate of the standard deviation of the
residuals; it measures the amount of spread around the fit. For example, if s is 10 and the residuals are roughly normally distributed, then about 68 % of the measurements are within 10
units of the model's fit, and about 95 % of the measurements are within 20 units of
the model's fit.
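To give a feel for what the software is attempting, here is a bare-bones sketch of fitting y = a1 x1 + a2 x2 + b by least squares and computing R-squared and s. It is not MINITAB's algorithm, just the textbook "normal equations" solved by elimination; the function names and the data are made up, and the formula for s (residual sum of squares divided by n minus the number of coefficients, then square-rooted) is the usual definition.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        total = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - total) / aug[row][row]
    return solution

def fit_multiple(x_rows, ys):
    """Return [b, a1, a2, ...] for the least-squares fit."""
    design = [[1.0] + list(row) for row in x_rows]   # leading 1 gives the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)

def fit_summary(x_rows, ys, coefs):
    """Return (R-squared, s) for the fitted model."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in x_rows]
    ss_resid = sum((y - p) ** 2 for y, p in zip(ys, preds))
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    dof = len(ys) - len(coefs)
    return 1 - ss_resid / ss_total, (ss_resid / dof) ** 0.5

# Hypothetical data: two x-variables and a response.
x_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
ys = [3.1, 5.9, 5.2, 7.8, 7.1, 10.0]

coefs = fit_multiple(x_rows, ys)
r2, s = fit_summary(x_rows, ys, coefs)
print("b, a1, a2 =", [round(c, 3) for c in coefs])
print(f"R^2 = {r2:.3f}, s = {s:.3f}")
```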
Day 33
Activity: Decathlon.
The decathlon is a series of 10 track and field events to determine the world's
greatest athlete. The events
involve throwing, jumping, endurance, and speed. But a basic question is how performances in these 10
areas should be combined. If we simply
ranked competitors, we wouldn't be able to compare performances from different years or
at different ability levels. We will
consider our own ideas first, then look at how the decathlon was and is
scored. (I have chosen the
decathlon, which is for men. The
corresponding women's competition is the heptathlon.)
http://www.iaaf.org/newsfiles/32097.pdf
http://www.athleticscoaching.ca/UserFiles/File/Sport%20Science/Theory%20&%20Methodology/Combined%20Events/Westera%20Redefining%20the%20decathlon%20scoring%20tables.pdf
http://www.decathlon2000.ee/eng/10athlon.php?id=28
http://www.decathlon2000.ee/pdf/scoringtables.pdf
Goals: Another
example of a linear equation.
Skills:
…
Be familiar with the
issues involved with combining different sorts of measurements. You've
all heard the phrase "You can't compare apples and oranges." In a similar way, you can't add seconds
and inches. But we still want to
compare similar performances. For example, a world record performance
in the 100 meter dash should count as much as a world record performance in the
long jump, even though one is measured in seconds and the other in feet. Just how to combine these disparate
measurements is the issue in scoring a decathlon.
…
Know how ranks can be
used in a competition to choose the champion. One method of
combining performances is to rank the individual events from best to worst and
replace the actual results with the rank, an integer between 1 and
n, the number of competitors. (If we have ties, we assign the average ranks they would
have gotten if there were no ties.)
Each competitor's score is the sum of all their ranks. For the decathlon, the lowest (best)
score would be 10 (a 1 in each event) and the highest (worst) score
would be 10n (a score of n in each event). There are
two problems with this technique.
First, it is not possible to compare performances from other meets; only
the participants in this pool can be scored. Second, small gaps in performance are treated the same as
large gaps because only the rank is used. For example, the difference between the
fastest and second fastest 100 meter dash time could be .01 seconds; these two
would get scores of 1 and 2. The
difference between the worst and second worst dash times might be 1 second, a
large difference, yet those two would get scores of 9 and 10 (if there were 10
competitors), the same one-point gap that separates the nearly identical top two times.
…
Realize that the
system they use for the decathlon scoring could become obsolete. Over
time, performances in a particular event may improve to the point where that
event is having an undue influence on the results. For example, maybe pole vaulting techniques or advances in
equipment have improved the heights jumped to the point where scores in this
event are consistently producing top scores around 1200. Suppose in another event, like the 100
meter dash, the best scores are generally only near 1000. We would want to modify the system so
that the two events' best performances give similar top scores.
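The rank-based scoring method described above, including the rule for ties, can be sketched as follows. The function names and the three-competitor results are made up for illustration.

```python
def average_ranks(results, lower_is_better=True):
    """Rank results from best (1) to worst (n); ties share the average rank."""
    order = sorted(range(len(results)), key=lambda i: results[i],
                   reverse=not lower_is_better)
    ranks = [0.0] * len(results)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next result ties with the current one.
        while end + 1 < len(order) and results[order[end + 1]] == results[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2   # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def rank_scores(events):
    """events is a list of (results, lower_is_better); returns total ranks."""
    totals = [0.0] * len(events[0][0])
    for results, lower_is_better in events:
        for i, rank in enumerate(average_ranks(results, lower_is_better)):
            totals[i] += rank
    return totals

# Hypothetical results for three competitors in two events.
dash_times = [10.50, 10.51, 11.50]   # seconds, lower is better
long_jumps = [7.0, 7.5, 7.0]         # meters, higher is better

print(average_ranks(dash_times))                        # [1.0, 2.0, 3.0]
print(average_ranks(long_jumps, lower_is_better=False)) # [2.5, 1.0, 2.5]
print(rank_scores([(dash_times, True), (long_jumps, False)]))  # [3.5, 3.0, 5.5]
```

The dash ranks show the second problem mentioned above: a .01 second gap and a .99 second gap each cost exactly one rank.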
Day 34
Activity: Basketball Salaries.
Is there a relationship between basketball players' abilities and their
salaries? What factors influence
salary? We will explore the
development of a model for predicting a player's salary. If our formula is successful, athletes
could use the results to bargain for salary adjustments. However, few athletes would
argue for lower salaries than the formula predicts, so such an
approach might well lead to inflated salaries.
Goals: Another
example of a multiple regression equation.
Skills:
…
Know how to use
MINITAB to interpret multiple regression output. The computer
program MINITAB's output for multiple regression gives several P-values. You should know the P-value for the
F-test tells whether there is a relationship at all, but doesn't specify which
variables are important. The
P-values for the t-tests let us know which variables are not contributing, when
they are the last variable added.
Activity: MLB attendance.
What is the association between fan attendance at baseball games and a team's
success on the field? We will
explore this with real data. Of
course there are many factors influencing attendance, and we will try to
accommodate these factors in our model.
http://www.sabernomics.com/sabernomics/index.php/2004/04/winning-and-attendance-in-mlb
Goals:
Regression
example.
Skills:
…
Know the actions to
take when P-values for the t-test are large. When the P-value
for an individual variable's t-test is large, that variable is not
helping explain variation when it is the last variable added. A typical procedure to find a good
model is to pick a pre-specified cutoff, such as 0.05, and drop the variable with the
largest P-value above that cutoff. We then refit
the model without that variable and repeat the procedure until all the
remaining variables have small P-values.
This model then becomes our final model.
Activity: Quiz 5.
This fifth quiz is on correlation and regression.
Activity: Parabolas.
If we ignore wind resistance, then the flight path for a projectile can be
modeled very accurately with parabolas, or quadratic equations. These equations have a
squared x term, so they are also called second order
polynomials. We will explore how
changing the initial velocity and the initial angle will change the flight
path, and hence the landing point.
Using parametric mode, change the angle and the velocity and see how the
distance is affected. In
particular, see how the angle affects distance for a fixed velocity.
Goals:
Ballistics.
Skills:
…
Know the basic form
of the equations for projectile flight.
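We use the parametric feature
on our calculators to do projectile flight. The two equations are $X = V_x t$ and $Y = -16 t^2 + V_y t + h$,
where $t$ is time in seconds, distances are in feet, and $h$ is the initial height. The two
velocities $V_x$ and $V_y$ are set by the initial speed and the angle at which the ball is hit.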
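The same exploration can be sketched in Python. One assumption is added here that the notes do not spell out: the two velocities are taken to be Vx = V cos(angle) and Vy = V sin(angle), where V is the initial speed in feet per second. The function name flight_distance is made up, and wind resistance is ignored.

```python
import math

HALF_G = 16.0  # the 16 in Y = -16 t^2 + Vy t + h (feet, seconds)

def flight_distance(speed, angle_deg, height=0.0):
    """Horizontal distance (feet) when the ball returns to the ground."""
    angle = math.radians(angle_deg)
    vx = speed * math.cos(angle)   # assumed: horizontal part of the velocity
    vy = speed * math.sin(angle)   # assumed: vertical part of the velocity
    # Landing time is the positive root of -16 t^2 + vy t + height = 0.
    t_land = (vy + math.sqrt(vy ** 2 + 4 * HALF_G * height)) / (2 * HALF_G)
    return vx * t_land

# Fixed speed of 100 feet per second, launched from ground level.
for angle in range(15, 76, 15):
    print(f"{angle} degrees: {flight_distance(100, angle):.1f} feet")
```

With the height set to zero, angles the same amount above and below 45 degrees (such as 30 and 60) give the same distance. Try a nonzero height to see whether that still holds.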
Activity: Physics of baseball.
What makes a curve ball curve?
What sort of influences does wind resistance have on the flight of a
baseball? What use can we make of
our knowledge of parabolas in charting the flight path of a batted or thrown
baseball?
The following link is a program that will plot the flight path of a baseball
for various altitudes, speeds, and angles. You will use it in Presentation 3.
http://faculty.tcc.fl.edu/scma/carrj/Java/baseball4.html
Goals:
Understand
some of the factors that influence balls in flight.
Skills:
…
Know how to do simple
calculations of distance traveled.
You should be able to convert
miles per hour to feet per second, and be able to calculate the distance a ball
travels in a certain length of time.
The important thing to remember is how to convert units (5280 feet in a
mile, 3600 seconds in an hour).
The formula we use is Distance = Rate times Time.
…
Be able to discuss
the Magnus effect.
When a ball in flight (baseball, cannonball, ping pong
ball) is spinning, the air pressure on opposite sides of the ball can differ greatly, depending on the direction
of the spin. For example, if the
ball is spinning counter-clockwise (as viewed from above), the air pressure on
the right side of the ball is greater than the pressure on the left
side. (In baseball, this is the
situation with a right-handed pitcher throwing a curve ball. The third base side of the ball moves
faster than the first base side of the ball.) Because the air pressure is lower on the left side, the flight path of
the ball will veer that way. (For
the baseball example above, this makes the ball curve from right to left as
viewed from the pitcher.)
This movement
towards the lower pressure is called the Magnus effect.
…
Know what backspin
does to a pitch. When a pitch is thrown with more backspin than usual,
the Magnus effect will cause the ball to fall more slowly than a pitch thrown
with less backspin. This
"rise" amounts to about 3 inches difference in height when the ball
reaches home plate, 60 feet 6 inches away from the pitcher's mound. Because of the physiology of the brain,
a batter has to commit to a certain swing before he can really tell what sort
of backspin is on the ball. (The
difference in heights of the two types of spinning ball is only an inch or so
at the point where the batter makes his final swing
choice.)
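The unit conversion and the Distance = Rate times Time calculation from the first skill above can be sketched as follows. The function names are made up, and the example treats the pitch as traveling the full 60 feet 6 inches at constant speed, which is a simplification.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

def mph_to_fps(mph):
    """Convert miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR

def travel_time(distance_ft, mph):
    """Time = Distance / Rate, with the rate converted to feet per second."""
    return distance_ft / mph_to_fps(mph)

print(mph_to_fps(90))                               # 132.0
print(f"{travel_time(60.5, 90):.3f} seconds")       # 0.458 seconds
```

A 90 mph pitch covers 132 feet every second, so the batter has less than half a second from release to home plate.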
Activity: Physics of football.
2-dimensional motion.
By knowing the timing of plays, such as how long it takes a quarterback to get
to a pre-designed spot from which to throw the football, and when the receiver
must be at the spot to catch the ball, we can make some calculations of time
and distance. Our chief
tools are a
Cartesian coordinate system and the Pythagorean theorem.
Also using angles and going through possible strategies, we can determine the
path a runner should take to either avoid being tackled (if he is the ball
carrier) or to intercept and tackle an opponent carrying the ball. Perhaps surprisingly, to catch up to
someone running laterally away from you, you should not run towards them, but towards either the sideline
point that they are heading for, or
towards a spot closer to you if they decide not to take
their optimum angle.
Project 3 due:
Develop an alternate scoring system
for the triathlon. The
current method
is to add the three times together.
Your goal is to use some linear combination, for example NewScore = .5
SwimTime + .3 BikeTime + 2 RunTime.
Decide on a scheme that seems fair to you, such as making the spread of
times equal, or making the standard deviations equal, or any other measure you
think makes the scoring fairer than just adding times. You need to justify your choice,
though. You can use
plots, or summary
statistics, or just common sense reasoning, but be persuasive. If you are stuck, please see me outside
of class to get some guidance.
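As a starting point only, here is a sketch of one of the ideas mentioned above: weighting each event by one over its standard deviation, so that every event has the same spread after weighting. The finishing times are invented, the function names are made up, and this is not the required answer; you still need to choose and justify your own scheme.

```python
import statistics

def equal_spread_weights(*event_times):
    """One weight per event: 1 / (sample standard deviation of that event)."""
    return [1 / statistics.stdev(times) for times in event_times]

def new_scores(weights, *event_times):
    """Linear combination of each competitor's times using the weights."""
    return [sum(w * t for w, t in zip(weights, row)) for row in zip(*event_times)]

# Hypothetical times in minutes for four competitors.
swim = [30, 35, 28, 40]
bike = [70, 75, 90, 68]
run = [45, 50, 48, 60]

weights = equal_spread_weights(swim, bike, run)
print([round(w, 3) for w in weights])
print([round(s, 2) for s in new_scores(weights, swim, bike, run)])
```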
Goals:
Understand
some basic physics.
Skills:
…
Know the Pythagorean
theorem and how it is used in a coordinate system. In a right
triangle (one angle is exactly 90 degrees) we can calculate the length of the
hypotenuse (the side of the triangle not touching the right angle) using the Pythagorean theorem: $c^2 = a^2 + b^2$.
To map out where an object (a football
carrier) will be at a particular time when running with a particular speed, we
need to know the distance he runs along the diagonal. If we mark his position on the field as a distance
horizontally from the center and vertically from the center, then these
distances are $a$ and
$b$ in the formula.
…
Know the principle a
ball carrier should use to gain the maximum distance on a run. If
we connect a line between the runner and the defender, who wants to intercept
the runner, then draw a perpendicular line from the midpoint of our first line,
then the runner should avoid running away from this second line. If the runner gets closer to the line,
the defender should "mirror" his path, using the second line, the
perpendicular line, as the "mirror". If the runner makes a mistake and moves
further from the perpendicular line, then the defender should
stay the same distance from the runner; we can then draw a new pair of lines
that decreases the farthest distance the runner can go.
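Both skills above rest on the same calculation: a distance on the field from the Pythagorean theorem, then a time from Distance = Rate times Time. The sketch below uses made-up positions (in yards) and speeds (in yards per second), and the function names are invented. It also checks the key property of the perpendicular line through the midpoint: every point on it is the same distance from the runner and the defender.

```python
import math

def distance(p, q):
    """Straight-line distance between two (x, y) field positions."""
    return math.hypot(q[0] - p[0], q[1] - p[1])   # sqrt(a^2 + b^2)

def time_to_reach(start, target, speed):
    """Seconds needed to run from start to target at a constant speed."""
    return distance(start, target) / speed

# Hypothetical positions: runner and defender 20 yards apart.
runner = (0, 0)
defender = (20, 0)
spot = (10, 15)   # a point on the perpendicular line through the midpoint (10, 0)

print(distance(runner, spot), distance(defender, spot))
print(time_to_reach(runner, spot, 8), time_to_reach(defender, spot, 8))
```

At equal speeds the two players reach any spot on that line at the same moment, which is why the line works as a "mirror".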
Activity: Review
Goals: Know
everything.
Activity: Presentations.
Optimal angle for a HR.
What parameters affect the distance a baseball travels? Using the link from Day 38, pick a stadium to examine. By trial and error, discover what
combinations of speed and angle will produce a home run (the ball must clear the
outfield fence to count as a home run).
Keep careful track of which values work and which don't. You may want to make a two-dimensional
map of speed versus angle shading in those combinations that work. How do the results change with a
moderate wind, say 10 mph?
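The two-dimensional map suggested above can be sketched with the simple parabola model from the Parabolas activity. Be aware of the assumptions: there is no wind or air resistance here (so the results will not match the linked program, which is the one you should use for the presentation), the fence distance, fence height, and starting height are invented numbers, and the function names are made up.

```python
import math

def height_at_fence(speed, angle_deg, fence_distance, start_height=3.0):
    """Height (feet) of the ball when it reaches the fence, with no air resistance."""
    angle = math.radians(angle_deg)
    vx = speed * math.cos(angle)
    vy = speed * math.sin(angle)
    t = fence_distance / vx                      # time to reach the fence
    return -16 * t ** 2 + vy * t + start_height

def is_home_run(speed, angle_deg, fence_distance=400.0, fence_height=10.0,
                start_height=3.0):
    """True if the ball is above the top of the fence when it gets there."""
    return height_at_fence(speed, angle_deg, fence_distance, start_height) > fence_height

# Map of speed (feet per second) versus angle (15 to 60 degrees in steps of 5).
# X marks a combination that clears the hypothetical fence.
for speed in range(120, 181, 20):
    row = "".join("X" if is_home_run(speed, angle) else "."
                  for angle in range(15, 61, 5))
    print(f"{speed:3d} ft/s  {row}")
```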
Activity: Quiz 6.
This last quiz is on the physics of sports.
Managed by: Chris Edwards
edwards at uwosh dot edu
Last updated December 10, 2006