Day By Day Notes for PBIS 187

Sports Mathematics

Fall 2006

 

Day 1

Activity: Go over syllabus.  Take roll.  Overview examples: NCAA tournament, QB rating, Batting averages, What is random?

http://www.sabernomics.com/sabernomics/index.php/2006/05/age-cut-offs-and-month-of-birth-in-baseball

http://www-math.bgsu.edu/~albert/papers/saber.html

http://www.sabr.org

http://sabermetrics.hnrc.tufts.edu

http://www.baseball-reference.com

Goals:     Review course objectives: collect data, summarize information, make inferences, reason logically.

Day 2

Activity: Home Run Comparisons.

Pick one of the top home run hitters of all time (get the data from http://www.baseball-reference.com) and create graphical summaries of their yearly home run totals.  Make a histogram, a stem plot, and a quantile plot.

Useful commands for the calculator:

STAT EDIT  (Use one of the lists to enter data, L1 for example; the other L's can be used too.)

2nd STATPLOT 1 On  (Use this screen to designate the plot settings.  You can have up to three plots on the screen at once.  For now we will only use one at a time.)

ZOOM 9  (This command centers the window around your data.)

PRGM QUANTILE ENTER  (This program plots the sorted data and "stacks" them up, as opposed to a histogram, which places the boxes side by side.)

From your displays, write a short description of the player's home run history.


To make a histogram:
  Enter data into a list on the TI-83.  Set up one of the plots.  Use ZOOM 9 to set the window.

To interpret a histogram:
  Each "bin" is represented by a rectangle; the height is proportional to the number of cases in that bin or interval.  Tall boxes mean lots of data; short boxes (or empty boxes) indicate little (or no) data.

To make a stem plot:
  Choose a "numbers place", such as tens, hundreds, etc. for a stem.  (You may also have to consider ones, tenths, hundredths, etc.  The choice of stem will be dictated by how many data points end up on each row; too many stems and each row has just one or two items, too few stems and one or two stems hold all the data.  Choosing the proper stem requires good judgment.)  After choosing a stem, make a column of these stems starting at the lowest value, and without skipping any values.  Then go through the data set and record each data point on the appropriate row (stem), writing down only the digit to the right of the stem.  For example, if you have chosen the tens place for the stem, the data value 123 belongs on the stem labeled "12" and you jot down "3" for the leaf.  When you are finished, you may want to sort the items (the leaves) on each row (stem).  Note:  the stem plot is a visual display; make sure each digit you write down occupies the same amount of space.  If you are typing, use Monaco or Courier or some other fixed-width font.  It is especially tempting to squeeze together a string of 1's.
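
If you would rather experiment on a computer, here is a minimal Python sketch of the same bookkeeping, using the tens place for the stem (the data values are made up):

    # Stem plot with the tens place as the stem (hypothetical data).
    data = [23, 25, 31, 31, 34, 40, 42, 47, 47, 48, 52, 66]

    for stem in range(min(data) // 10, max(data) // 10 + 1):
        # The leaves are the ones digits of the values on this stem, sorted.
        leaves = sorted(x % 10 for x in data if x // 10 == stem)
        print(f"{stem:2d} | " + "".join(str(leaf) for leaf in leaves))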

To interpret a stem plot:
  Each row of a stem plot can be interpreted in the same way as a bin in a histogram; wide stems (just like tall boxes in a histogram) represent lots of data points.  One advantage of a stem plot over a histogram is that every data point appears in the stem plot; in the histogram, all you know is how many data values are in an interval.

To make a quantile plot:
  A quantile plot is a graph of the rank of each data value (lowest, second lowest, etc.) against the value itself.  We put the ranks on the left (the vertical scale) and the data values on the bottom (the horizontal scale).  All quantile plots start at the lower left and end at the upper right.  The TI-83 program QUANTILE will graph a quantile plot for you; all you need to tell the calculator is which list your data is in.
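
A small Python sketch (again with made-up home run totals) of the points a quantile plot draws, rank against sorted value:

    # Coordinates of a quantile plot: rank (vertical) versus value (horizontal).
    data = [8, 19, 24, 24, 30, 33, 38, 41]

    for rank, value in enumerate(sorted(data), start=1):
        print(rank, value)   # plotted, these run from lower left to upper right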

To interpret a quantile plot:
  The slope of the graph is the important feature of a quantile plot.  Steep sections represent x-values with lots of data values; flat sections are areas with little or no data.

Goals:     Perform graphical summaries (describing data with pictures).  Be able to use the calculator to make a histogram or a quantile plot.  Be able to make a stem plot by hand.

Skills:

                        Identify types of variables.  To choose the proper graphical displays, it is important to be able to differentiate between Categorical and Quantitative (or Numerical) variables.  Categorical variables do not have numerical values, or if they are numerical, it is only a label.

                        Be familiar with types of graphs.  To graph categorical variables we use bar graphs or pie graphs.  To graph numerical variables, we use histograms, stem plots, or QUANTILE (TI-83 program).  In practice, most of our variables will be numerical but it is still important to choose the right display.

                        Summarize data into a frequency table.  The easiest way to make a frequency table is to TRACE the boxes in a histogram and record the classes and counts.  You can control the size and number of the classes with Xscl and Xmin in the WINDOW menu.  The decision as to how many classes to create is arbitrary; there isn't a "right" answer.  One popular suggestion is to try the square root of the number of data values.  For example, if there are 25 data points, use 5 intervals; if there are 50 data points, try 7 intervals.  This is a rough rule; you should experiment with it.  The TI-83 uses its own rule to choose a default class width; I do not know what that rule is.  You should experiment by changing the interval width and see what happens to the diagram.

                        Use the TI-83 to create an appropriate histogram or quantile plot.  STAT PLOT is our main tool for viewing distributions of data.  Histograms are common displays, but have flaws; the choice of class width is troubling as it is not unique.  The quantile plot is more reliable, but less common.  For interpretation purposes, remember that in a histogram tall boxes represent places with lots of data, while in a quantile plot those same high-density data places are steep.

                        Create a stem plot by hand.  The stem plot is a convenient manual display; it is most useful for small datasets, but not all datasets make good stem plots.  Choosing the "stem" and "leaves" to make reasonable displays will require some practice.  Some notes for proper choice of stems: if you have many empty rows, you have too many stems.  Move one column to the left and try again.  If you have too few rows (all the data is on just one or two stems) you have too few stems.  Move to the right one digit and try again.  Some datasets will not give good pictures for any choice of stem, and some benefit from splitting or rounding (see the example in class).

                        Describe shape, center, and spread.  From each of our graphs, you should be able to make general statements about the shape, center, and spread of the distribution of the variable being explored.  Our descriptors will be simple words like symmetric, skewed, two-peaked, etc.

Day 3

Activity: Cumulative Progress.

Examples:  Pennant races, Running pace, Bowling averages.

http://www.alexreisner.com/baseball/history/race  Davenport's graphs.

To display cumulative progress, use the program PROGRESS.  The program will prompt you for whether you want the endpoint to be the average of the list or a number you input.  For the pennant races and other yes/no type responses, use INPUT and give it the value "0".  For the other examples, we will likely use AVERAGE, but you can explore the shape of the graph with other values.  In all graphs, regions of similar slope have similar averages.  We will discuss this phenomenon in our class examples.
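
For those working on a computer, here is a rough Python reconstruction of what PROGRESS does (my sketch, not the actual calculator program): subtract a reference value from each observation and accumulate the running total.

    # Cumulative progress graph values (a reconstruction of PROGRESS).
    def progress(data, reference=None):
        if reference is None:                 # the "AVERAGE" option
            reference = sum(data) / len(data)
        total, points = 0.0, []
        for x in data:
            total += x - reference            # up for success, down for failure
            points.append(total)
        return points

    wins = [1, 1, 0, 1, 0, 0, 1, 1]           # a hypothetical win/loss sequence
    print(progress(wins, reference=0))        # raw cumulative wins
    print(progress(wins))                     # adjusted: the graph ends at zero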

Numerical summaries, including box plots:
  Our main numerical summaries will be the mean, the median, and the standard deviation.  The mean is the arithmetic average, the median is the middle number in the sorted list, and the standard deviation is a measure of how spread out the values are.  Roughly, most data sets are 4 to 6 standard deviations wide.  That is, the largest value is close to 4 to 6 standard deviations above the smallest value.

The 5-number summary uses the smallest value, the largest value, the median, and the medians of the two halves of the data.  These two other medians are called the quartiles, because they split the data set up into quarters.  The box plot is a visual picture of the 5-number summary.  The calculator has a selection in the STAT PLOT menu for this (the 5th icon).  However, I recommend using the modified box plot (the 4th icon) as it has a built-in outlier detector.  This outlier detection routine is not foolproof; we still need good judgment.  But it at least gives us more than just our opinion.

Goals:     Be able to make and interpret a cumulative progress graph.  Be able to calculate and interpret numerical summaries.  Be able to make and interpret a box plot.

Skills:

                        Know the basics of a cumulative progress graph.  Quite simply, record the result over time.  Up indicates success, down indicates failure.  If the result is continuous (as in running or bowling) then it will be appropriate to modify the slope (see next item.)

                        Know the two ways a cumulative progress graph can be drawn.  When comparing several subjects (like teams' season records) and the response is yes/no, or win/loss, etc., it may make more sense to simply plot the graph without adjustment, to allow a comparison.  Up indicates a success, down indicates failure, and the endpoint (to the right) will not be at zero unless by coincidence.  When an adjustment is made, we require the right endpoint to be at zero, and the amount for each success and failure is adjusted accordingly.  Personally I think this is best done with a computer program.  You are basically multiplying each element in the list by a proportional amount.  For the yes/no type answers, use the average .5 in the PROGRESS program.

                        Recognize the features easily seen in a cumulative progress graph.  The most prominent visual feature of a cumulative progress graph is that parallel lines denote periods of equivalent performance.  For example, if the graph over one period of time has the same slope as over another period of time, then the performance (batting average, running pace, or whatever is being measured) is the same for both time periods.

                        Use the TI-83 to calculate summary statistics.  Calculating may be as simple as entering numbers into your calculator and pressing a button.  Or, if you are doing some things by hand, you may have to organize information the correct way, such as listing the numbers from low to high.  On the TI-83, the numerical measures are accessed with the 1-Var Stats function in the STAT CALC menu.  Please get used to using the statistical features of your calculator to produce the mean.  While I know you can calculate the mean by simply adding up all the numbers and dividing by the sample size, you will not be in the habit of using the full features of your machine, and later on you will be missing out.

                        Compare several lists of numbers using box plots.  For two lists, the best simple approach is the back-to-back stem plot.  For more than two lists, I suggest trying box plots, side-by-side, or stacked.  At a glance, then, you can assess which lists have typically larger values or more spread out values, etc.

                        Understand box plots.  You should know that the box plots for some lists don't tell the interesting part of those lists.  For example, box plots do not describe shape very well; you can only see where the quartiles are.  Alternatively, you should know that the box plot can be a very good first quick look.

                        Understand the effect of outliers on the mean.  The mean (or average) is unduly influenced by outlying (unusual) observations.  Therefore, knowing when your distribution is skewed or symmetric is helpful.

                        Understand the effect of outliers on the median.  The median is almost completely unaffected by outliers.  For technical reasons, though, the median is not as common in scientific applications as the mean.

Day 4

Activity: Basketball and football scores comparisons.  Do teams that score many points also give up many points?  Can final score be predicted from half time score?  Using the data below, make scatter plots of team score versus opponent score and half time score versus final score.  For each scatter plot, include a correlation coefficient.

2005 Green Bay Packers

(Columns: Opponent is the opponent's final score; Half, Final, and 2nd are Green Bay's half time, final, and second-half points.)

Week   Opponent   Half   Final   2nd
  1       17        3      3      0
  2       26        7     24     17
  3       17       13     16      3
  4       32        7     29     22
  5        3       35     52     17
  7       23       17     20      3
  8       21        7     14      7
  9       20        3     10      7
 10       25       17     33     16
 11       20       14     17      3
 12       19       14     14      0
 13       19        7      7      0
 14       13       10     16      6
 15       48        3      3      0
 16       24        7     17     10
 17       17       13     23     10


Nov 2005 Milwaukee Bucks

(Columns as in the Packers table, with Game in place of Week.)

Game   Opponent   Half   Final   2nd
  1      102       50     102     52
  2       96       46     110     64
  3      100       49     105     56
  4      110       53     103     50
  5      102       40     103     63
  6      109       46      85     39
  7       87       48      90     42
  8      103       44      82     38
  9      100       39      80     41
 10       97       51     108     57
 11       99       44      91     47
 12       85       35      76     41
 13      100       55     100     45



The pattern in a scatter plot can often be summarized adequately with a straight line.  Usually, we want to summarize such linear scatter plots with a single number, the correlation coefficient.  The correlation coefficient is a unit-less number that varies between -1 (perfect negative association) and +1 (perfect positive association).  We will discuss in class a technique for approximating the correlation coefficient in a scatter plot by hand.
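
If you are curious what the calculator is doing, here is a short Python sketch of the standard formula for r, using the half time and final scores from the first five Packers games above:

    # Correlation coefficient r from the standard formula.
    def correlation(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        return sxy / (sxx * syy) ** 0.5

    half  = [3, 7, 13, 7, 35]     # Packers half time scores, weeks 1-5
    final = [3, 24, 16, 29, 52]   # Packers final scores, weeks 1-5
    print(correlation(half, final))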

If the x-variable is time, we have a time plot.  This website (http://alexreisner.com/baseball) gives some great examples of sports time plots.

Goals:     Display two variables and measure (and interpret) linear association using the correlation coefficient.

Skills:

                        Plot data with a scatter plot.  This will be as simple as entering two lists of numbers into your TI-83 and pressing a few buttons, just as for histograms or box plots.  Or, if you are doing plots by hand you will have to first choose an appropriate axis scale and then plot the points.  You should also be able to describe overall patterns in scatter diagrams and suggest tentative models that summarize the main features of the relationship, if any.

                        Use the TI-83 to calculate the correlation coefficient.  We will use the regression function STAT CALC LinReg(ax+b) to calculate the correlation, r.  First, you will need to have run DiagnosticOn; access this command through the CATALOG (2nd 0).  If you press ENTER after the STAT CALC LinReg(ax+b) command, the calculator assumes your lists are in L1 and L2; otherwise you must type where they are, for example STAT CALC LinReg(ax+b) L2, L3.

                        Interpret the correlation coefficient.  You should know the range of the correlation coefficient (-1 to +1) and what a "typical" diagram looks like for various values of the correlation coefficient.  You should recognize some of the things the correlation coefficient does not measure, such as the strength of a non-linear pattern.

                        Recognize time plots and their features.  A time plot occurs when the x-variable is a time variable.  Because the time variable usually doesn't repeat itself, time plots are sometimes graphed as line plots, as on Reisner's website.

Day 5

Activity: Contingency tables.

Is there really a difference between home and away won/loss records in sports?  Baseball managers sometimes "platoon" their right- and left-handed batters based on the hand of the opposing pitchers.  Can we see evidence of this?

We cannot make a scatter plot with categorical data.  Our next best option is to make a "contingency table".  This is simply a cross-classification of the data values.  A simple example is the win/loss, home/away record for a team.  It is quite easy to make such a table; we just count how many items in the population fit into each cell.

The real question is whether the data show anything meaningful.  By that I mean do the categories show any departure from what would be expected if everything were just random?  Before we answer this, we will usually want to summarize percentages from the table, including marginal percentages.

Our first look at this will be to see what things would look like if everything were random.  We will make the expected table using our TI-83s.  At the same time, the calculator will give us a number (a P-value) that will help us decide if there is any pattern present.  Key characteristics of the expected table are that the row and column totals are identical to those of the original data table, but the individual cell counts are proportional to the marginal totals.  That is, the percentage of cases falling in any column is the same across all rows and vice versa.  See the class examples.
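
Here is a small Python sketch of that calculation for a hypothetical home/away win/loss table; each expected cell is (row total x column total) / grand total:

    # Expected counts from an observed 2x2 table (made-up home/away record).
    observed = [[30, 11],    # home: wins, losses
                [21, 20]]    # away: wins, losses

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)

    # expected cell = row total * column total / grand total
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    for row in expected:
        print([round(cell, 2) for cell in row])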

The expected table represents how things would have worked out if the two variables were unrelated.  We will go through examples in class to explain this phenomenon.  It is essentially a "what-if" type argument.  "What would things look like if the two variables were unrelated?"  Then we compare that situation to what has actually occurred, and if the difference is too large, then we conclude that "dumb luck" is not the most likely explanation.

Goals:     Organize two categorical variables in a summary chart.

Skills:

                        Create a table summarizing two categorical variables.  Unlike numerical variables, summarization of categorical data is accomplished by making frequency tables.  Often along with the tables, one will calculate marginal totals and percentages.

Day 6

Activity: Expected Tables.

Today we will continue the material from Day 5, exploring further what the calculator can do for us.  Specifically, we will look at the baseball platoon data.

Goals:     Develop intuition for when the observed and expected tables are too different.

Skills:

                        Create the table of expected counts.  The primary method of analyzing categorical tables is comparing the observed data to a table of expected counts.  The TI-83 will calculate the expected table for us.  Our job is to understand the meaning of the numbers.  Basically, the expected table is the way the table would have come out if the two variables were unrelated.  We use it as a baseline in determining association.

                        Recognize when an association is present.  When two categorical variables are associated (much like when two numerical variables are correlated) we detect this with the χ² test.  We will use a statistical technique to decide if the differences in the tables are too great, STAT TESTS χ²-Test.  You must have the observed table in a matrix.  The expected table will be stored in another matrix.  If p < .05, we conclude the two tables are quite different.  Our reasoning is this:  if the difference between the actual results and the results assuming no association is a small difference, then we have no reason to think that the variables are related.  However, if the difference between the two tables is considered large, then we conclude something can be said about the relationship; that is, that one exists.

Day 7

Activity: Presentations.

Graphical (Chapter 1) and Numerical (Chapter 2) Summaries

Collect or find some sports data; the quality of the data is not important for this project.  Use 3 to 5 lists of data; make sure you have enough data so that your summaries are meaningful, say at least 20 cases.  Summarize your data using both graphical and numerical summaries.  Make sure you have at least one categorical variable and at least one numerical variable.  Make sure you have at least one 2-variable summary.

Day 8

Activity: Quiz 1.  This first quiz is on graphical and numerical summaries.

Day 9

Activity: What is Randomness?

Our notions of probability theory are based on the "long run", but our everyday lives are dominated by "short runs".  Today we will look at some everyday sequences to see if they exhibit this "short term" behavior.

Coin experiment 1:  Write down a sequence of H's and T's representing heads and tails, pretending you are flipping a coin.  Then flip a real coin 50 times and record these 50 H's and T's.  Without knowing which list is which, in most cases I will be able to identify which list came from the real coin.

Baseball players:  In sports you often hear about the "hot hand".  We will pick a player, look at his last 20 games, and see if flipping a coin will produce a simulation that resembles his real performance.  Then we will examine whether we could pick out the simulation without knowing which was which.

Coin experiment 2:  Spin a penny on a flat surface, instead of tossing it into the air.  Record the percentage of heads.

Coin experiment 3:  Balance a nickel on its edge on a flat surface.  Jolt the surface enough so that the nickel falls over, and record the percentage of heads.

Goals:     Observe some real sequences of random experiments.  Develop an intuition about variability.

Skills:

                        Recognize the feature of randomness.  Random does not mean haphazard, or without pattern.  We cannot predict what will happen on a single toss of a coin, but we can predict what will happen in 1,000 tosses of a coin.  This is the hallmark of a random process: uncertainty in a small number of trials, but a predictable pattern in a large number of trials.

                        Resist the urge to jump to conclusions with small samples.  Typically our daily activities do not involve large samples of observations.  Therefore our ideas of "long run" probability theory are not applicable.  You need to develop some intuition about when to believe an observed simulation, and when to doubt the results.  We will hone this intuition as we develop our upcoming inference methods.  For now, understand that you may be jumping to conclusions by just believing a small simulation's observed results.

Day 10

Activity: Probability Trees.

Yahtzee®.  Conditional Probability.  Multiplication rule.

The game of Yahtzee® involves rolling 5 dice and trying to get a high score in 13 categories.  You are allowed to re-roll any of the dice up to 2 times.  We can describe the various cases we encounter using a probability tree, which is a chart with a series of branches that depict what could happen at each roll or re-roll of the dice.  The handy part of a tree is the inclusion of probabilities at each branch.  Sometimes, calculating these individual branch probabilities is quite tedious.  I hope we can find some simple trees to model some Yahtzee® situations.

The branch probabilities in our diagram are really conditional probabilities.  The "condition" is what has occurred previously in the tree.  Usually at the far right of the tree we calculate a final probability of that particular event.  To do this we use the multiplication rule, which says that the probability of a series of events in a tree is found by multiplying all the conditional probabilities together.  Our class examples should make this clearer.
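
A minimal Python sketch of the multiplication rule along one branch of a tree, for drawing two spades in a row from a shuffled deck:

    # Multiplying conditional probabilities along one branch of a tree.
    from fractions import Fraction

    p_first_spade  = Fraction(13, 52)  # 13 spades among 52 cards
    p_second_spade = Fraction(12, 51)  # given the first draw, 12 spades remain
    print(p_first_spade * p_second_spade)   # 1/17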

Goals:     Introduce probability with trees.

Skills:

                        Understand what is being displayed in a probability tree.  A probability tree shows all possible outcomes in a series of events, like dice rolling or card drawing.  They are most useful when the events described are dependent on one another, as in Yahtzee® rolls, or in card drawing, although technically we can draw trees for independent events too.  However, with independent events, the conditional probabilities are not influenced by previous events in the tree (hence the notion of independence).

                        Be able to prepare a probability tree for simple problems.  The probability trees we looked at in class can be quite complicated.  I don't expect you to be able to create one for the full choices in a game of Yahtzee® or for poker.  However, for small problems (rolling a die and flipping a coin; describing 3 games for a team; drawing two cards from a deck) I expect you can create one.  The key to remember is that the branch labels are the chances of what happens at that point in the sequence (the essence of conditional probability).

                        Know how to use the multiplication rule.  With a sequence of related events (ones where the idea of conditional probability makes sense) we can find the probability of all the events happening by multiplying all the individual conditional probabilities together.

 

Day 11

Activity: Probability Rules.

Poker.  We will make some probability trees for poker hands.  This may involve some counting techniques.

Goals:     Understand the basic rules of probability.

Skills:

                        Know the addition rule.  When two events are mutually exclusive, we can find the chance that either event occurs by adding the individual probabilities.  An example of using the addition rule correctly is finding the chance of drawing a spade or a heart on one draw from a deck of cards.  The most common misuse of this rule is to apply it to events that have elements in common and are therefore not mutually exclusive.  An example of a misuse is finding the probability of getting at least one six on two rolls of a die.

                        Know the complement rule.  The complement of an event (not compliment) is the elements not in the event.  For many of our sports examples, this amounts to one of two items, such as win or loss, hit or miss, success or failure, but many times there are several choices.  Thus the opposite of a hit in baseball, for example, is not necessarily an out; the batter could be hit by the pitch, he could walk, he could get a sacrifice fly, etc.  The probability rule for complements is that the probability of an event is the probability of the complement subtracted from one.  This is just saying that either an event or its complement occurs.
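
A short Python sketch contrasting a misuse of the addition rule with the complement rule, for the chance of at least one six in two rolls of a die:

    from fractions import Fraction

    # Wrong: 1/6 + 1/6 double-counts the outcome where both rolls are sixes.
    wrong = Fraction(1, 6) + Fraction(1, 6)       # 1/3

    # Right: "at least one six" is the complement of "no sixes at all".
    right = 1 - Fraction(5, 6) * Fraction(5, 6)   # 11/36
    print(wrong, right)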

Day 12

Activity: Simulation.

If we make a probability tree for a series of events, we can then use random devices to simulate the experiment.  From these simulations, we can hopefully draw conclusions about the probability of the events.  We will look at two examples today.

Should a football team go for the 2 point conversion, or kick the extra point?

In baseball, is a sacrifice hit ever worth it?  There have been some reports on this.  I found the following websites.

http://japanesebaseball.com/forum/thread.jsp?forum=1&thread=134

http://www.commonwealthclub.org/archive/03/03-09baseball-speech.html

http://ite.pubs.informs.org/Vol5No1/Bickel

On our calculator, MATH PRB rand and MATH PRB randInt( are the two most useful random number generators for us.  Before we begin, we should all reset our random number seeds.  Otherwise, everyone's calculator will give the same sequence of random numbers, and that isn't at all what we mean by random!  To reset your seed, store any number to rand: <some number here> -> rand.  Now, MATH PRB rand gives a random number between 0 and 1.  By truncating appropriately, we can make this number represent any probability we need.  For example, if we have an event that has probability 1/3, we will say it has occurred if our number rand is between 0 and .333333.  If we have some specific fraction in mind, instead of just any old decimal number, we can use MATH PRB randInt( instead.  This will give random integers between 1 and n.  The syntax is MATH PRB randInt(1, 12, 3), which will give us 3 numbers, each between 1 and 12.  The third argument is optional; if you leave it off, the calculator will give you just one random number.
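
Here is a rough Python version of such a simulation for the football question (the success rates, 0.95 for the kick and 0.45 for the two-point try, are assumptions for illustration, not league statistics):

    # Simulated points per attempt: extra point kick versus two-point try.
    import random

    trials = 10000
    kick = sum(1 for _ in range(trials) if random.random() < 0.95)
    two  = sum(2 for _ in range(trials) if random.random() < 0.45)
    print(kick / trials, two / trials)   # average points per attempt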

Goals:     Develop an intuition for the power of a simulation.  Realize the limitations of a simulation.

Skills:

                        Know how to use random number generators (or coins, dice, etc.) to simulate events from a probability tree.  On the TI-83, MATH PRB rand is the most useful random number generator, due to its flexibility.  MATH PRB randInt( can also be helpful, if the event desired is a number between 1 and n.

                        Know what can and what cannot be deduced from a simulation.  From a simulation, we can get an estimate of the probability of some event.  This may be an improvement on the theoretical calculation (using the probability rules) because the event in question may be incredibly complicated.  (Example: getting a Yahtzee® [5 of a kind] in 3 rolls.)  On the other hand, if the simulation wasn't performed enough times (which we usually won't be able to tell) then the results may be misleading.  It is a skill to decide whether to believe the results of a simulation or not.  Many of the examples shown so far in this course illustrate the problem.  We looked at correlations of team scores, but the sample sizes were so small that I would feel uncomfortable telling anyone our conclusions.

Day 13

Activity: Markov Chains.

Suppose a baseball player has a 40% chance of getting a hit after a hit, but only a 25% chance of getting a hit after making an out.  What will his eventual batting average be?

In tennis, when the score is tied in a match, play continues until one player has won two points in a row.  How long will the typical match last?

Markov Chains will help us answer these questions.  Here are two other examples:

The status of the bases and outs in a baseball game can be considered to be the nodes.  There are eight possible base situations and three possible out situations (0, 1, or 2 outs), giving 24 nodes; the inning ends when the third out occurs, and that final node brings the total to 25.  We will draw this big chart in class and see what we can conclude.

Another situation is the ball/strike count for a batter.

We can use some matrix results to draw some conclusions.  If we organize the probabilities of going from one node to another in an array or matrix of numbers, we can have our calculator give us the probabilities of the eventual end states of the system. 

Start by drawing the nodes with arrows leading from one node to another, with associated probabilities.  Then translate these probabilities to the matrix.  Be careful to label correctly.  If some of the nodes are terminal nodes, i.e., you can't leave that state, then we have absorbing states.  List these at the bottom and right of the matrix.

If there are no absorbing states (as in the baseball batting average example), then the easiest way to analyze the problem is to raise the matrix to a large power.  This calculation will give the probabilities of being in each state, and if they are the same for each row, we have a steady state solution.

In the case of absorbing states, we need to analyze the problem differently.  Break down the matrix into four matrices, crossing the non-absorbing states with the absorbing states.  We will only be interested in the top two of these four matrices.  (See class notes.)  Call the first matrix Q and the second one R.  Our two chief results will be (I - Q)^-1 and (I - Q)^-1 R.  The TI-83 commands are (assuming the numbers are stored in [A] and [B], respectively, and N is the number of non-absorbing states): (identity(N) - [A])^-1 and (identity(N) - [A])^-1*[B].  (identity( and [A], [B], etc. are found in the MATRIX menus.)  The first result tells us the expected number of times we'll be in each state, and the second result tells us the eventual probability of being in each absorbing state.
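
Here is a Python sketch of these matrix results for the tennis example, where play continues from a tie until one player wins two points in a row (it assumes player A wins each point independently with probability p):

    # Absorbing Markov chain: states 0 = tied, 1 = A up one, 2 = B up one;
    # absorbing states: A wins, B wins.
    import numpy as np

    p = 0.5                               # assumed chance A wins any point
    Q = np.array([[0.0,   p,   1 - p],    # transitions among non-absorbing states
                  [1 - p, 0.0, 0.0  ],
                  [p,     0.0, 0.0  ]])
    R = np.array([[0.0, 0.0  ],           # transitions into A wins, B wins
                  [p,   0.0  ],
                  [0.0, 1 - p]])

    N = np.linalg.inv(np.eye(3) - Q)      # (I - Q)^-1: expected visits per state
    print(N[0])                           # starting from the tie
    print((N @ R)[0])                     # eventual chances that A or B wins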

Goals:     See how some trees can be "solved" using Markov chains.

Skills:

                        Know how to draw a Markov chain with nodes.  In a Markov chain, there are probabilities attached to the connections between nodes.  In some chains, like the batting average example and the tennis example, it is possible to revisit nodes previously visited; in our other examples this doesn't happen.  Thus there are a lot of zeroes in those diagrams.  What we usually want to know is the behavior after the system is "let run" for a while, i.e., the long run behavior.

                        Know how to use matrices to make conclusions about Markov chains.  By organizing our information into appropriate matrices, and performing simple operations (using the TI-83), we can discover various steady-state solutions to Markov chains.

 

Day 14

Activity: Quiz 2.

This second quiz is on probability, randomness, and simulations.

 

Day 15

Activity: Tournaments.

How do we decide which of several teams is the best one?  Many professional sports leagues have a "regular season" and a "post season" to find out.  As we have seen, in only a few trials, a simulation may give unreliable results.  For some sports, like football, only a few games are played.  For other sports, many, many games are played, like baseball.  This leads to a question:  Can we believe in football that a team with 12 wins is better than a team with 4 wins? 

Before answering such a difficult question, we will look at the various ways that post-season tournaments are organized.  Regular seasons are almost always some form of Round Robin tournament.  Local sports tournaments often use single-elimination.

Goals:     Know the different types of tournaments.  King Of The Hill (KOTH) (Bowling), Round Robin (RR) (High School Conferences), Single-Elimination (SE) (NCAA March Madness), Double-Elimination (DE)

Skills:

                        King of the Hill Tournament.  In this tournament, the two lowest seeded players compete, the winner plays the next highest seeded player, etc. until the last winner meets the highest seeded player.  This tournament gives the highest seeded player at least a 50% chance of winning the tournament.

                        Round Robin Tournament.  In the Round Robin tournament, each team plays each other team once.  This is a fairly common way to conduct a league season, especially in football and basketball.  However, it is quite possible for there to be no clear cut champion at season end.  This usually leads to a "playoff" to determine the champion.

                        Single-Elimination Tournament.  In a single-elimination tournament, each loser of a game is eliminated, until there is only one undefeated team.  The KOTH is an example of a single-elimination tournament.

                        Double-Elimination Tournament.  In a double-elimination tournament, teams are eliminated after their second loss.  This creates some interesting ways to structure the tournament.  Most if not all tournaments have a "loser's" bracket, the idea being that once a team has lost a game, they should only play teams that also have lost a game.  In many supposed double-elimination tournaments, the winner of the loser's bracket is awarded 3rd place, but this type of tournament is more properly called a consolation tournament.  In a true double-elimination tournament, the loser of the last winner's bracket game plays the winner of the loser's bracket, and that winner plays the winner's bracket winner, possibly twice in a row.  The key is that all but one team will have two losses.

Day 16

Activity: Seeding.

What is the best way to seed a single-elimination tournament?  It is fairly obvious what to do with 5 or fewer teams, but with 6 or more, it is not obvious at all what is appropriate.  We will calculate some probabilities of teams winning some tournaments, and discuss some rules for seeding that make guarantees for the team's probabilities of winning the tournament.  These probability calculations will use the multiplication rules.

Goals:     Use Probability structures to analyze single-elimination tournaments.

Skills:

                        Intransitive dice example.  It is possible to have a system of real world probabilities that do not obey stochastic transitivity.  Basically, we have this intuition that if team A beats team B most of the time, and team B beats team C most of the time, then team A should beat team C most of the time.  The dice example shows that this sort of transitivity is not always the case.

                        Be able to calculate the chance of a team winning a tournament.  Given a particular matrix of probabilities, know the formulas for calculating the chance of winning the tournament.  Tree diagrams, and good organization, will help in these calculations.
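
Here is a small Python sketch of such a calculation for a 4-team single-elimination bracket seeded 1 vs 4 and 2 vs 3; the head-to-head probabilities are made up for illustration:

    # P[i][j] = chance team i+1 beats team j+1 (hypothetical probabilities).
    P = [[0.0, 0.6, 0.6, 0.7],
         [0.4, 0.0, 0.6, 0.6],
         [0.4, 0.4, 0.0, 0.6],
         [0.3, 0.4, 0.4, 0.0]]

    # Team 1 must beat team 4, then whichever of teams 2 and 3 advances.
    p1 = P[0][3] * (P[1][2] * P[0][1] + P[2][1] * P[0][2])
    print(p1)   # 0.42 with the numbers above; the other seeds work the same way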

                        Rules for seeding.  Basic structure for 4 teams:  1 plays 4 and 2 plays 3.  Top-n rule:  No team ranked lower than the top n teams should play fewer games (treat lower ranked teams as automatic wins for purposes of using this rule).  Sub-tournament rule:  All sub-tournaments must be seeded appropriately.  Conclusion:  Of the many structures, only a small number are ordered.  We will list them in class.  Because of the relative scarcity of ordered tournaments, they are rarely used in practice (for more than 6 teams).  This means that there exist probability structures where some tournaments are unordered.  We will look at the counter-examples for 6 teams, and these will be the basis for knowing what to do with 7 or more teams.

Day 17

Activity: Playoff tournaments (like NCAA March Madness, or NFL playoffs).

What systems do the major sports leagues use for their playoff schedules?  Tournaments are a major source of entertainment, revenue, fan interest, etc.  The notions of seeding, byes, winner's and loser's brackets etc. have been made popular by both professional and local post-season tournaments.  We will look at the NCAA tournaments and the NFL playoffs to see what is being used.

Revisiting the NCAA tournament we saw on Day 1, what sort of tournament is it, and given our results from previous days, is it a reasonable approach?

The NFL playoffs involve 6 teams from each conference.  The two top teams are given first round byes.  Team 3 plays Team 6, and Team 4 plays Team 5.  After the first round, teams are relabeled if any upsets have occurred.  For example, if Team 6 beats Team 3, and Team 4 beats Team 5, they re-label old Team 6 as New Team 4, and old Team 4 as New Team 3.  Then they have Team 1 play New Team 4 and Team 2 play New Team 3.  Again, given our previous results, is this reasonable?

Finally, does Major League Baseball do things fairly?  MLB has three divisions in each league.  Each division winner is in the playoffs, plus one "wild-card" team.  The wild-card team plays the division winner with the most victories.  The other two division winners also pair off.

Project 1 due:

Choose one of the following games and

1)  Give a short history of the game.
2)  Describe how randomness is part of the game.
3)  Using probability rules, show some probability examples using this game.

The purpose of this report is to show that you can effectively communicate the ideas of probability in a real world setting.  It will be important to use proper English.  If I cannot read your paper or follow your logic, you will not have convinced me!  The probability calculations do not have to be extremely detailed; please talk to me if you have any doubts about what is appropriate.

Games:  Risk, Blackjack, Backgammon, Roulette, Battleship, Poker, Minesweeper, Cribbage


Goals:     Explore professional tournaments and seeding to see how practice compares to theory.

Skills:

                        Know the systems the major sports leagues use for playoff tournaments.  Quite simply, all major sports leagues use single-elimination tournaments for their playoffs.  Some use 16 teams, some 4, some 6.  As we saw, single-elimination tournaments with 6 or more teams are usually not ordered, at least for some probability structures.  It is very unclear whether those specific probability structures exist in practice.

Day 18

Activity: NFL regular season schedule.

How is the NFL regular season schedule made?  Because NFL teams only play 16 games, we know that a Round Robin tournament among 32 teams is not possible.  Within a division, however, with only 4 teams, it is easy to have a Round Robin, even a replicated Round Robin.  What choices are there for the remaining 10 games of a season?  We will explore what actually is done as well as other potential options.

Before class, choose an NFL team (we will coordinate teams so we don't all do the same one) and find out its season schedule for the last three years.  (You can access this information from the following site by clicking on your team of choice: http://www.nfl.com/teams )  See if there are common factors year to year for your chosen team.  We will compare notes in class and see if we can unravel how they do it.

http://www.sports-scheduling.com

Goals:     Explore NFL Scheduling.

Skills:

                        Understand the actual method the NFL uses to schedule teams.  Knowing that a complete Round Robin is not possible, the NFL tries to balance schedules as much as possible.  Because of the division structure, the league has deemed that teams play their division teams twice each.  The remaining 10 games are where the decisions are made.

Day 19

Activity: Traveling Salesman.

Do the major sports leagues take traveling costs into account when they create their schedules?  We will explore some simple graph theory to try to address this.

First, we draw a graph to represent the cities in our league.  Each graph consists of points and lines.  The points represent the cities, and the lines represent traveling from city to city.  We can label each travel path with the costs of traveling there, or perhaps with mileage.  Now, the traveling salesman problem (TSP) amounts to finding a path that visits each city once and returns to the starting point, but with minimum travel costs.
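
A tiny brute-force Python sketch of the TSP for a handful of cities (the mileages are made up); trying every possible tour is exactly why the problem becomes hopeless as the number of cities grows:

    # Brute-force TSP on 4 cities: try every tour starting and ending at city 0.
    from itertools import permutations

    dist = [[0, 10, 15, 20],    # dist[i][j] = made-up mileage between cities
            [10, 0, 35, 25],
            [15, 35, 0, 30],
            [20, 25, 30, 0]]

    n = len(dist)
    best = min(permutations(range(1, n)),
               key=lambda tour: dist[0][tour[0]]
               + sum(dist[a][b] for a, b in zip(tour, tour[1:]))
               + dist[tour[-1]][0])
    print((0,) + best + (0,))   # the cheapest round trip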

The situation of sports teams and schedules isn't exactly the TSP, but has similar components.  In reality, baseball teams will take a "road trip" and visit several cities before coming back home.

To begin today, we will schedule road trips for the National League.  Then we will see how they really do it.

http://www.tsp.gatech.edu/index.html

http://ite.pubs.informs.org/Vol5No1/Birge/index.php

http://en.wikipedia.org/wiki/Traveling_salesman_problem

Goals:     Know about the TSP problem.

Skills:

                        Understand the complexity of the Traveling Salesman Problem.  The Traveling Salesman Problem is a famous graph theory problem.  What is the optimal path a salesman should take to visit every city in the district at minimum cost?  Its solution turns out to be incredibly complex; we are not going to solve the TSP, but just look at some of the basics of it.

Day 20

Activity: MLB and NBA regular season schedule.

How is the Major League Baseball schedule made?  We will see if we can devise some integer solutions to the equations that make for balanced schedules.  For example, back in the 70's and 80's, the American League used to have each team (in a 7-team division) play teams in their own division 13 times, and teams in the other division 12 times.  13*6 + 12*7 = 78+84 = 162.  The National League used 6-team divisions, and their numbers were 18 and 12: 18*5 + 12*6 = 90 + 72 = 162.  What is happening now, with 4- and 5-team divisions, and inter-league play?  Does the NBA do the same thing for their basketball schedules?

For the following breakdowns of divisions, find all the integer solutions less than 20. 

4 and 4
5 and 5
6 and 6
6 and 7
8 and 8 for basketball (82 games)

4 and 5 and 5
5 and 5 and 6
5 and 5 and 5
5 and 5 and 5 for basketball (82 games)
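
Here is a brute-force Python sketch that searches for the two-division solutions above (the three-division breakdowns work the same way with one more loop):

    # Find x (games against each of your own division rivals) and y (games
    # against each team in the other division) with all values below 20.
    def schedules(own, other, games=162, limit=20):
        return [(x, y)
                for x in range(1, limit)
                for y in range(1, limit)
                if (own - 1) * x + other * y == games]

    print(schedules(7, 7))             # the old AL: includes (13, 12)
    print(schedules(6, 6))             # the old NL: includes (18, 12)
    print(schedules(8, 8, games=82))   # the basketball case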

Goals:     Understand the equations constraining baseball and basketball scheduling.

Skills:

                        Understand that integer solutions to the equations may not be possible.  We seek solutions to equations like 6 x + 7 y = 162 or 4 x + 5 y + 4 z + 18 = 162.  If there are no integer solutions that are acceptable to baseball, what compromises are made?

Day 21

Activity: Scheduling a Round Robin.

Today we will try to schedule a complete season of competitions using a Round Robin tournament.  I will let you try your hand at this for a while, say 20 minutes.  Then I will show you what I have discovered.  Then I will let you see if you can do some larger ones (10 or more teams).

Goals:     Be able to arrange a season schedule for teams participating in a Round Robin tournament.

Skills:

                        Know how Latin Squares can be used to construct a Round Robin tournament.  A Latin Square has each number appear exactly once in each row and column.  If we let the rows and columns represent teams, and the entries in the diagram represent the week they meet, then we want symmetric Latin Squares, which will describe the schedule for a Round Robin tournament.
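
Here is a Python sketch of one such construction, the "circle method": fix one team and rotate the rest one position each week, which amounts to building a symmetric Latin Square.

    # Round Robin schedule by the circle method (n teams, n even).
    def round_robin(n):
        teams = list(range(1, n + 1))
        schedule = []
        for week in range(n - 1):
            # Pair off opposite positions, then rotate all but the first team.
            schedule.append([(teams[i], teams[n - 1 - i]) for i in range(n // 2)])
            teams = [teams[0], teams[-1]] + teams[1:-1]
        return schedule

    for week, games in enumerate(round_robin(6), start=1):
        print("Week", week, games)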

Day 22

Activity: Presentations.

Tournament scheduling.  Your task is to design a double-elimination tournament for 6 teams.  Assume that the teams have just finished a regular season so we have them seeded from 1 to 6, 1 being the team with the best season record.  Your only constraint is that every team except the winner will have two losses at the end.  As part of your design, you need to convince us that your plan is reasonable.  You must balance the desire to have the best teams have the best chance of winning with the desire to entertain the fans.

Day 23

Activity: Quiz 3.

This third quiz is on tournaments and scheduling.

Day 24

Activity: Guess m&m's percentage.

What fraction of m&m's are blue or green?  Is it 25 %?  33 %?  50 %?  We take samples to find out.

Each of you will sample from my jar of m&m's, and you will all calculate your own confidence interval.  Of course, not everyone will be correct, and in fact, some of us will have "lousy" samples.  But that is the point of the confidence coefficient, as we will see when we jointly interpret our results.

It has been my experience that confidence intervals are easier to understand if we talk about sample proportions instead of sample averages.  Each of you will have a different sample size and a different number of successes.  In this case the sample size, n, is the total number of m&m's you have selected, and the number of successes, x, is the total number of blue or green m&m's in your sample.  Your guess is simply the ratio x/n, or the sample proportion.  We call this estimate p-hat, written p̂.  Use STAT TEST 1-PropZInt with 50 % confidence for your interval here today.  In sports, this would be akin to using an observed batting average to estimate a batter's true batting average.
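
For reference, here is a Python sketch of the formula behind 1-PropZInt, p-hat ± z* sqrt(p-hat (1 - p-hat) / n); the sample counts are made up, and z* = 0.674 corresponds to 50 % confidence (use 1.96 for 95 %):

    # One-proportion z confidence interval (the 1-PropZInt formula).
    from math import sqrt

    x, n = 12, 30      # e.g., 12 blue or green m&m's in a sample of 30
    z = 0.674          # critical value for 50 % confidence
    phat = x / n
    margin = z * sqrt(phat * (1 - phat) / n)
    print(phat - margin, phat + margin)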

When you have calculated your confidence interval, record your result on the board for all to see.  We will jointly inspect these confidence intervals and observe just how many are "correct" and how many are "incorrect".  The percentage of correct intervals should match our chosen level of confidence.  This is in fact what is meant by confidence.  In practice, high values of confidence are used, such as 95 %.  We used 50 % in class only as a demonstration; we would never use such a low value in practice.

Goals:     Introduce statistical inference - Guessing the parameter.  Construct and interpret a confidence interval.

Skills:

                        Understand how to interpret confidence intervals.  The calculation of a confidence interval is quite mechanical.  In fact, as we have seen, our calculators do all the work for us.  Our job is then not so much to calculate the confidence intervals as it is to be able to understand when one should be used and how best to interpret one.  The time to use a confidence interval is when we want to estimate a true value from a population or a process.  Each confidence interval comes with a confidence level, a measure of how reliable our method is.  For example, if we use 95 % confidence, then over the long run 95 % of our intervals will contain the true answer.

                        Understand what makes a confidence interval narrow.  There are several factors that make a confidence interval narrow, which is ideally what we want because a narrow confidence interval means we are sure of our guess.  First, so we don't fool ourselves, we settle on a confidence level.  It is true that a lower confidence level makes a narrower interval, but we want high confidence so we can believe our results.  The second way a confidence interval can be narrow is if the data we are using has a small standard deviation.  In the case of proportions, this means the value is far from 50 %.  The third and most important way that confidence intervals can be narrow is with a larger sample size.  This is the only one of the three that we can really realistically control.  The moral is simple:  results from larger samples are more believable.

Day 25

Activity: Applying confidence intervals to some binomial processes, such as Batting Averages and Winning Percentages.

After 100 appearances at the plate, a batter has 25 hits.  Could this player's true batting average be .300?  The confidence interval will help us understand which values are reasonable guesses for the true parameter (in this case, batting average).

Similarly, after 100 games, one team has 60 wins, and another has 40 wins.  What is the difference between these two teams' true abilities?  This time the appropriate routine is STAT TEST 2-PropZInt.  We won't go over the messy calculation details, just the new interpretation for a difference of two proportions.

Goals:     Understand further Confidence Interval applications.

Skills:

                        Be able to give an English description to a binomial confidence interval.  Based on our m&m's example, we will want to describe confidence intervals as a "long run" argument.  For example, our actual outcome is just one of many possible sequences.  The confidence coefficient is a probability that we attach to the method we use to guess the true answer.  The range of answers in our interval tells us which values we consider to be plausible.  If the interval is narrow enough, we will be able to make convincing statements.  If the interval is wide, we are basically observing random variation.

                        Understand the 2 sample problem.  When we have two random events, like two teams' winning percentages or two players' performances, we often like to compare them head-to-head.  The proper way to do this is to analyze their differences.  The resulting confidence interval will often have small percentages, or negative numbers.  It is important to be able to interpret such numbers.  The key is to remember that we are examining the difference between the two values.  If the two players or teams are nearly equal in ability, this difference should be close to zero.  If one player or team is much better than the other, this difference will be far from zero.  Ideally, we want our confidence interval to be completely on one side of zero.  This will convince us that one is better than the other.

Day 26

Activity: Argument by contradiction.

Scientific method.  Type I and Type II error diagram.  Courtroom terminology.

Some terminology:

Null hypothesis.  A statement about a parameter.  The null hypothesis is always an equality or a single claim (like two variables are independent).  We assume the null hypothesis is true in our following calculations, so it is important that the null be a specific value or fact that can be assumed.

Alternative hypothesis.  The alternative hypothesis is a statement that we will believe if the null hypothesis is rejected.  The alternative does not have to be the complement of the null hypothesis.  It just has to be some other statement.  It can be an inequality, and usually is.

Rejection rule.  To decide between two competing hypotheses, we create a rejection rule.  It's usually as simple as "Reject the null hypothesis if the sample mean is greater than 10.  Otherwise fail to reject."  We always want to phrase our answer as "reject the null hypothesis" or "fail to reject the null hypothesis".  We never want to say "accept the null hypothesis".  The reasoning is this:  Rejecting the null hypothesis means the data have contradicted the assumptions we've made (assuming the null hypothesis was correct); failing to reject the null hypothesis doesn't mean we've proven the null hypothesis is true, but rather that we haven't seen anything to doubt the claim yet.  It could be the case that we just haven't taken a large enough sample yet.

Type I Error.  When we reject the null hypothesis when it is in fact true, we have made a Type I error.  We have made a conscious decision to treat this error as a more important error, so we construct our rejection rule to make this error rare.

Type II Error.  When we fail to reject the null hypothesis, and in fact the alternative hypothesis is the true one, we have made a Type II error.  Because we construct our rejection rule to control the Type I error rate, the Type II error rate is not really under our control; it is more a function of the particular test we have chosen.  The one aspect we can control is the sample size.  Generally, larger samples make the chance of making a Type II error smaller.

Significance level, or size of the test.  The probability of making a Type I error is the significance level.  We also call it the size of the test, and we use the symbol α to represent it.  Because we want the Type I error to be rare, we usually will set α to be a small number, like .05 or .01 or even smaller.  Clearly smaller is better, but the drawback is that the smaller α is, the larger the Type II error becomes.

P-value.  There are two definitions for the P-value.  Definition 1:  The P-value is the alpha level that would cause us to just barely reject our observed data.  Definition 2:  The P-value is the chance of seeing data as extreme or more extreme than the data actually observed.  Using either definition, we calculate the P-value as an area under a tail in a distribution.

We will examine these ideas using the z-test for a proportion.  The TI-83 command is STAT TEST 1-PropZTest.  The command gives you a menu of items to input.  It assumes your null hypothesis is a statement about a true proportion p.  You must tell the assumed null value, p0, and the alternative claim, usually the not-equals option.  You also need to tell the calculator your total successes, x, and your total trials, n.  If you choose CALCULATE the machine will simply display the test statistic and the P-value.  We care about whether the P-value is small or not.  If you choose DRAW, the calculator will graph the P-value calculation for you.  You should experiment to see which way you prefer.
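
Here is a Python sketch of the calculation behind 1-PropZTest for the not-equals alternative, applied to the Day 25 batting example (25 hits in 100 at bats, testing p0 = .300):

    # One-proportion z test and its two-sided P-value.
    from math import sqrt, erf

    def one_prop_z_test(x, n, p0):
        phat = x / n
        z = (phat - p0) / sqrt(p0 * (1 - p0) / n)
        cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # standard normal CDF at |z|
        return z, 2 * (1 - cdf)                   # P-value: area in both tails

    print(one_prop_z_test(25, 100, 0.30))   # z about -1.09, P-value about .28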

Project 2 due:

Create a round-robin schedule for a league that has two 4-team divisions.  Have each team play all the other teams in their own division first, then all the teams in the other division, then the teams in their own division again.  Balance the schedule with respect to home and away games; that is, make sure each team has an equal number of home and away games.  In your report, describe the method you used to create your schedule, and any difficulties you encountered.  Even if you are unable to satisfy all the conditions I have presented here for you, let me know how far you got, and why you weren't able to finish.

 

Goals:     Introduce statistical inference - Hypothesis testing.

Skills:

                        Recognize the two types of errors we make.  If we decide to reject a null hypothesis, we might be making a Type I error.  If we fail to reject the null hypothesis, we might be making a Type II error.  If it turns out that the null hypothesis is true, and we reject it because our data looked weird, then we have made a Type I error.  Statisticians have agreed to control this type of error at a specific percentage, usually 5%.  On the other hand, if the alternative hypothesis is true, and we fail to reject the null hypothesis, we have also made a mistake.  This second type of error is generally not controlled by us; the sample size is the determining factor here.

                        Understand why one error is considered a more serious error.  Because we control the frequency of a Type I error, we feel confident that when we reject the null hypothesis, we have made the right decision.  This is how the scientific method works; researchers usually set up an experiment so that the conclusion they would like to make is the alternative hypothesis.  Then if the null hypothesis (usually the opposite of what they are trying to show) is rejected, there is some confidence in the conclusion.  On the other hand, if we fail to reject the null hypothesis, the most useful conclusion is that we didn't have a large enough sample size to detect a real difference.  We aren't really saying we are confident the null hypothesis is a true statement; rather we are saying it could be true.  Because we cannot control the frequency of this error, it is a less confident statement.

                        Become familiar with "argument by contradiction".  When researchers are trying to "prove" a treatment is better or that their hypothesized mean is the right one, they will usually choose to assume the opposite as the null hypothesis.  For election polls, they assume the candidate has 50% of the vote, and hope to show that is an incorrect statement.  For showing that a local population differs from, say, a national population, they will typically assume the national average applies to the local population, again with the hope of rejecting that assumption.  In all cases, we formulate the hypotheses before collecting data; therefore, you will never see a sample average or a sample proportion in either a null or alternative hypothesis.

                        Understand why we reject the null hypothesis for small P-values.  The P-value is the probability of seeing a sample result "worse" than the one we actually saw.  In this sense, "worse" means even more evidence against the null hypothesis; more evidence favoring the alternative hypothesis.  If this probability is small, it means either we have observed a rare event, or that we have made an incorrect assumption, namely the null hypothesis.  Statisticians and practitioners have agreed that 5% is a reasonable cutoff between a result that contradicts the null hypothesis and a result that could be argued to be in agreement with the null hypothesis.  Thus, we reject our claim only when the P-value is a small enough number.

Day 27

Activity: Baseball player comparisons.

Could two players have the same batting average and yet perform differently over a short period of time?  Similar to our work on Day 25, we will use STAT TEST 2-PropZTest to help decide if two players have different true ability levels.  In this setting, the null hypothesis is that the two players are of equal ability.  That is, the difference in their true proportions is zero.

After we look at a few pairs of players, let's figure out how large a difference between batting averages is considered statistically significant.  In your groups, have each person choose a different sample size, like 50 at bats, 100 at bats, etc.  Then invent some fictitious results and see when two players are considered to be different.  Compare notes with your group mates.  We will pool results at the end.
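
As a check on your group's results, here is a Python sketch of the 2-PropZTest calculation, with a loop that hunts for the smallest statistically significant batting-average gap at each sample size.  All of the numbers are invented.

    from math import sqrt, erf

    def normal_cdf(z):
        return 0.5 * (1 + erf(z / sqrt(2)))

    def two_prop_p_value(x1, n1, x2, n2):
        p1, p2 = x1 / n1, x2 / n2
        pooled = (x1 + x2) / (n1 + n2)            # pooled proportion under H0
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        return 2 * (1 - normal_cdf(abs((p1 - p2) / se)))

    # Player 1 hits .300; how much better must Player 2 hit to reject at .05?
    for n in (50, 100, 500, 1000):
        for gap in [0.005 * k for k in range(1, 100)]:
            x1, x2 = round(0.300 * n), round((0.300 + gap) * n)
            if two_prop_p_value(x1, n, x2, n) < 0.05:
                print(n, "at bats: significant at a gap of about", round(gap, 3))
                break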

Goals:     Hypothesis test applications.

Skills:

                        Understand another context for hypothesis testing.  The two sample z-test for proportions offers us another example of a hypothesis test.  In this setting, the null hypothesis is that the true difference in proportions is zero.  The interpretation of the P-value is the same: a small enough P-value causes us to doubt the null hypothesis.  Here, doubting the null hypothesis means we think the two players have different batting averages.  If the P-value is large, meaning we fail to reject the null hypothesis, then we conclude the two players could indeed have the same batting average.  We haven't proven that they do; it's just a plausible explanation for the data.

Day 28

Activity: Catch up Day/Review

Goals:    

Skills:

                        Be able to formulate and conduct a statistical hypothesis test.  The first step in conducting a statistical hypothesis test is the formulation of the hypotheses to be tested.  The null hypothesis is an equality statement, usually one of no change.  For example, if we have a historical value we might claim the current average is the same as the historical average.  Or we may claim two players have the same abilities.  The next step is to gather data and use an appropriate scheme to convert it to a probability.  This probability measures how likely the data are, given the assumption of the null hypothesis.  If this "P-value" is small, it means our data are unusual for that hypothesis.  This is the counter-evidence we need to "prove" the statement wrong.  If the "P-value" is large, it means the data seem consistent with the statement, and we have failed to find anything wrong.

                        Know the different uses of the t procedures and the proportion procedures.  The t-test and t-interval are used when we have data that can be put into a list, such as bowling scores, or game-by-game passing yards, etc.  We need our numbers in a list, or someone to tell us the mean and standard deviation of the numbers.  For the proportion tests, we must have binary data, like success/failure data.  Our examples of this included winning/losing, passing successes/passing failures, etc.  In either case, the P-value is our decider: if the P-value is small, we reject the null hypothesis of equality and believe the alternative is true.  If we fail to reject (because the P-value is large), then we are willing to say the null hypothesis is plausible; the data have not yet contradicted the claim, perhaps because of a small sample size.
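
For those who want to check a t-test outside the calculator, here is a minimal Python sketch, assuming the scipy package is available.  The bowling scores and the hypothesized mean of 150 are made up.

    from scipy import stats

    scores = [142, 155, 168, 150, 161, 149, 173, 158]   # invented data
    t_stat, p_value = stats.ttest_1samp(scores, 150)    # H0: true mean is 150
    print(t_stat, p_value)   # a small P-value would cast doubt on mean = 150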

Day 29

Activity: Quiz 4.

This fourth quiz is on statistical inference.

Day 30

Activity:    Linear Regression.

Using the Olympic data, fit a regression line to predict the 2004 and 2008 race results.

Begin by making a scatter plot of the race times.  If you want a rough guess for the slope of the best fitting line through the data, you can connect two points spaced far apart (details in class.)

Next, use the TI-83's regression features to calculate the best fit.  The command is STAT CALC LinReg(ax+b), assuming the two lists are in L1 and L2.  (L1 will be the horizontal variable, years in this case.)  (We used this command on Day 4 also.)

Have the calculator type this equation into your Y= menu (using VARS Statistics EQ RegEQ), and TRACE on the line to predict the future results.
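
If you want to check the calculator's answer another way, here is a rough Python equivalent of LinReg(ax+b), using the post-war men's 100-meter times from the table below.

    from statistics import linear_regression   # needs Python 3.10 or newer

    years = [1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976,
             1980, 1984, 1988, 1992, 1996, 2000]
    times = [10.3, 10.4, 10.5, 10.2, 10.0, 9.95, 10.14, 10.06,
             10.25, 9.99, 9.92, 9.96, 9.84, 9.87]
    slope, intercept = linear_regression(years, times)
    print(slope, intercept)
    print("predicted 2004 time:", slope * 2004 + intercept)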

Here is the data:

Men's and Women's 100-meter dash winning Olympic times:

Year   Men's winner                              Time       Women's winner                            Time
1896   Thomas Burke, United States               12 sec     --                                        --
1900   Francis W. Jarvis, United States          11.0 sec   --                                        --
1904   Archie Hahn, United States                11.0 sec   --                                        --
1908   Reginald Walker, South Africa             10.8 sec   --                                        --
1912   Ralph Craig, United States                10.8 sec   --                                        --
1920   Charles Paddock, United States            10.8 sec   --                                        --
1924   Harold Abrahams, Great Britain            10.6 sec   --                                        --
1928   Percy Williams, Canada                    10.8 sec   Elizabeth Robinson, United States         12.2 sec
1932   Eddie Tolan, United States                10.3 sec   Stella Walsh, Poland (a)                  11.9 sec
1936   Jesse Owens, United States                10.3 sec   Helen Stephens, United States             11.5 sec
1948   Harrison Dillard, United States           10.3 sec   Francina Blankers-Koen, Netherlands       11.9 sec
1952   Lindy Remigino, United States             10.4 sec   Marjorie Jackson, Australia               11.5 sec
1956   Bobby Morrow, United States               10.5 sec   Betty Cuthbert, Australia                 11.5 sec
1960   Armin Hary, Germany                       10.2 sec   Wilma Rudolph, United States              11.0 sec
1964   Bob Hayes, United States                  10.0 sec   Wyomia Tyus, United States                11.4 sec
1968   Jim Hines, United States                  9.95 sec   Wyomia Tyus, United States                11.0 sec
1972   Valery Borzov, USSR                       10.14 sec  Renate Stecher, E. Germany                11.07 sec
1976   Hasely Crawford, Trinidad                 10.06 sec  Annegret Richter, W. Germany              11.08 sec
1980   Allen Wells, Britain                      10.25 sec  Lyudmila Kondratyeva, USSR                11.6 sec
1984   Carl Lewis, United States                 9.99 sec   Evelyn Ashford, United States             10.97 sec
1988   Carl Lewis, United States                 9.92 sec   Florence Griffith-Joyner, United States   10.54 sec
1992   Linford Christie, Great Britain           9.96 sec   Gail Devers, United States                10.82 sec
1996   Donovan Bailey, Canada                    9.84 sec   Gail Devers, United States                10.94 sec
2000   Maurice Greene, United States             9.87 sec   Marion Jones, United States               10.75 sec
2004   ??                                        --         ??                                        --

(a)  A 1980 autopsy determined that Walsh was a man.


Men's and Women's 200-meter dash winning Olympic times:

Year   Men's winner                              Time       Women's winner                            Time
1900   Walter Tewksbury, United States           22.2 sec   --                                        --
1904   Archie Hahn, United States                21.6 sec   --                                        --
1908   Robert Kerr, Canada                       22.6 sec   --                                        --
1912   Ralph Craig, United States                21.7 sec   --                                        --
1920   Allan Woodring, United States             22 sec     --                                        --
1924   Jackson Sholz, United States              21.6 sec   --                                        --
1928   Percy Williams, Canada                    21.8 sec   --                                        --
1932   Eddie Tolan, United States                21.2 sec   --                                        --
1936   Jesse Owens, United States                20.7 sec   --                                        --
1948   Mel Patton, United States                 21.1 sec   Francina Blankers-Koen, Netherlands       24.4 sec
1952   Andrew Stanfield, United States           20.7 sec   Marjorie Jackson, Australia               23.7 sec
1956   Bobby Morrow, United States               20.6 sec   Betty Cuthbert, Australia                 23.4 sec
1960   Livio Berruti, Italy                      20.5 sec   Wilma Rudolph, United States              24.0 sec
1964   Henry Carr, United States                 20.3 sec   Edith McGuire, United States              23.0 sec
1968   Tommy Smith, United States                19.83 sec  Irena Szewinska, Poland                   22.5 sec
1972   Valeri Borzov, USSR                       20.00 sec  Renate Stecher, E. Germany                22.40 sec
1976   Donald Quarrie, Jamaica                   20.23 sec  Barbel Eckert, E. Germany                 22.37 sec
1980   Pietro Mennea, Italy                      20.19 sec  Barbel Wockel, E. Germany                 22.03 sec
1984   Carl Lewis, United States                 19.80 sec  Valerie Brisco-Hooks, United States       21.81 sec
1988   Joe DeLoach, United States                19.75 sec  Florence Griffith-Joyner, United States   21.34 sec
1992   Mike Marsh, United States                 20.01 sec  Gwen Torrance, United States              21.81 sec
1996   Michael Johnson, United States            19.32 sec  Marie-Jose Perec, France                  22.12 sec
2000   Konstantinos Kenteris, Greece             20.09 sec  Marion Jones, United States               21.84 sec
2004   ??                                        --         ??                                        --

Goals:     Practice using regression with the TI-83.  We want the regression equation, the regression line superimposed on the plot, the correlation coefficient, and we want to be able to use the line to predict new values.

Skills:

                        Fit a line to data.  This may be as simple as 'eyeballing' a straight line to a scatter plot.  However, to be more precise, we will use least squares, STAT CALC LinReg(ax+b) on the TI-83, to calculate the coefficients, and VARS Statistics EQ RegEQ to type the equation in the Y= menu.  You should also be able to sketch a line onto a scatter plot (by hand) by knowing the regression coefficients.

                        Interpret regression coefficients.  Usually we only want to interpret the slope, and slope is best understood by examining the units involved, such as inches per year or miles per gallon, etc.  Because slope can be thought of as "rise" over "run", we are looking at the ratio of the units of our two variables.  More precisely, the slope tells us the change in the response variable for a unit change in the explanatory variable.  We don't typically bother interpreting the intercept, as zero is often outside the range of the data.

                        Estimate/predict new observations using the regression line.  Once we have calculated a regression equation, we can use it to predict new responses.  The easiest way to use the TI-83 for this is to TRACE on the regression line.  You may need to use up and down arrows to toggle back and forth from the plot to the line.  You may also just use the equation itself by multiplying the new x-value by the slope and adding the intercept.  (This is exactly what TRACE is doing.)  Note: when using TRACE, and the x-value you want is currently outside the window settings (lower than XMin or above XMax) you must reset the window to include your x-value first.

Day 31

Activity: Continue Olympic Data regressions.

Explore Residuals, Outliers, and other forms of regression (other than linear).

Whenever you perform one of the regressions on the TI-83, the residuals are stored in a list called RESID.  This list holds one number for each data point: the difference between the actual value and the value predicted by the chosen model.  Ideally these would all be zero.  Looking at residuals can help us find outliers and model deficiencies, and residuals become even more important when we include more x variables in multiple regression.  You should plot the residuals against the x values to see if the model is a good one.
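
Here is a small Python sketch of what the RESID list contains, reusing the straight-line fit on a subset of the men's 100-meter data from Day 30.

    from statistics import linear_regression   # needs Python 3.10 or newer

    years = [1984, 1988, 1992, 1996, 2000]
    times = [9.99, 9.92, 9.96, 9.84, 9.87]
    slope, intercept = linear_regression(years, times)
    # Residual = actual value minus the model's predicted value.
    residuals = [t - (slope * y + intercept) for y, t in zip(years, times)]
    print(residuals)   # plot these against years; a pattern suggests a poor model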

Using one of the Olympic races, change a data point to something large and see what effect the change has on the regression line.  Now choose a different data point and try again.  Make sure at least one of your points is off to the side.  This will give you a look at the influence of data points on the edges of the scatter plot.

Using the race data, try different models to see how the fits change.  It is very important with all models to plot the line or curve with the data.  We will use R² to measure how well a model explains the variation, but it's not a perfect measure.  Just because R² is large does not mean the model is a good fit.  We will see some examples in class of this phenomenon.
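
Here is one such example in Python: the data follow y = x², clearly a curve, yet the straight-line fit's R² comes out around .95.

    from statistics import linear_regression, mean

    xs = list(range(1, 11))
    ys = [x ** 2 for x in xs]                  # curved data, on purpose
    a, b = linear_regression(xs, ys)
    fitted = [a * x + b for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean(ys)) ** 2 for y in ys)
    print("R-squared:", 1 - ss_res / ss_tot)   # about .95, yet the residuals curve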

Goals:     Continue Simple Linear Regression.

Skills:

                        Understand what regression is trying to minimize.  The residuals in a regression are the distances (measured vertically) from the data points to the regression line.  The overall objective in regression is to make these distances as small as possible.  Regression uses a technique called Least Squares to accomplish this.  One consequence of least squares is that outliers tend to have a large influence on the fitted regression line.

                        Know the effect outliers have on regression.  Because the residuals are squared, the regression line tends to be "attracted" to outlying points in a scatter plot.  You should be able to guess the influence a data point in a scatter plot has on the fitted line.

                        Be able to perform the hypothesis test for whether a variable adds to a regression model.  What we want to know about a variable is whether the slope coefficient for that variable is zero or not.  If the slope is zero, then that variable would not contribute to the model, and we would say that the variable is not useful for predicting the response variable.  As usual with a hypothesis test, we use the P-value as our measure; if the P-value is very small (less than .05 or .01) we reject the null hypothesis that the slope is zero.

                        Realize that a straight line is not the only possible model.  "Linear" regression means the model is a straight line.  Other models can be used, and the TI-83 has a number of them available to you.  The key to using these alternate models is looking at the graph of the data and the fit.  Another consideration is the interpretation of the parameters.  For linear regression, slope is the important parameter.  For the exponential, it's the growth rate.  For the others, there is no easy interpretation, which makes these other models less appealing to use.  Also keep in mind that for multiple regression (more x variables) there is no easy theory available.

Day 32

Activity: Investigation of the QB Rating in football.

What is the formula the NFL uses to assess quarterback efficiency?  We will use multiple regression (in class) to see if we can figure out their method.  Then we will compare to the actual formula, and discuss and critique the formula.  We will also discuss ways to make our own rating.

Our main tool for multiple regression is the software MINITAB.  You enter data in a similar way to the TI-83, with lists of data; in MINITAB they are called columns, such as C1, C2, etc.  To perform regression, use the pull-down menu and select REGRESS.  Enter your x variables and your y variable, then click OK.  We will talk about the output in class.  There are only a few numbers we need, so it will be important that you familiarize yourself with the output by doing a few analyses yourself.

http://football.about.com/c/ht/03/03/How_Calculate_Quarterback_Rating1048560068.htm

Goals:     Explore the very basics of multiple regression.

Skills:

                        Understand how adding more variables to the regression equation is done.  The simplest form of linear regression is y = ax + b.  We can add more variables by just adding more x's.  Example:  y = a₁x₁ + a₂x₂ + b.  We haven't discussed any of the details of fitting a multiple regression model; that would require an entire upper-division math course!  However, you should appreciate what is being attempted.

                        Know how to use the P-value for the F-test.  The F-test is an overall statement about the model, which includes all the variables at once.  The null hypothesis for this test is that all the variables have a zero slope, simultaneously.  If we fail to reject this test, we are saying that none of the variables help in predicting the response variable.  If we reject this null hypothesis, we are saying that at least one variable is useful.  To find out which variables are useful, we use the individual t-tests.

                        Know how to use the P-value for the individual t-tests.  After we have rejected the F-test null hypothesis, we generally explore which of the x-variables are important to the model.  There are many ways to do this; we will look at just one way, called "Backward Elimination".  We start with all the x-variables available, and then we drop any that have large P-values, as those variables are not adding anything to the model.  There is a minor technical point you should be aware of here: the null hypothesis for each t-test is that the slope for that variable is zero, given that all the other x-variables are already in the model.  If we remove a variable using this method, it doesn't mean that variable isn't useful all by itself in predicting the response variable; it just means it's not useful now, with all the other variables already in the model.
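
For the curious, here is a sketch of Backward Elimination in Python, assuming the numpy, pandas, and statsmodels packages are available (in class we will use MINITAB).  The example data frame and its column names are hypothetical.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    def backward_eliminate(df, response, alpha=0.05):
        predictors = [c for c in df.columns if c != response]
        while predictors:
            X = sm.add_constant(df[predictors])
            fit = sm.OLS(df[response], X).fit()
            pvals = fit.pvalues.drop("const")     # one t-test P-value per variable
            worst = pvals.idxmax()
            if pvals[worst] <= alpha:             # everything significant: stop
                return fit
            predictors.remove(worst)              # drop the least useful variable
        return None

    # Hypothetical data: only x1 actually matters in generating y.
    rng = np.random.default_rng(1)
    df = pd.DataFrame({"x1": rng.normal(size=30),
                       "x2": rng.normal(size=30),
                       "x3": rng.normal(size=30)})
    df["y"] = 2 * df["x1"] + rng.normal(size=30)
    print(backward_eliminate(df, "y").params)     # should keep x1, drop x2 and x3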

                        Be able to use R² as a measure of a model's overall fit.  R² measures how small the vertical deviations from the fitted "line" are.  If R² = 1, then we have a perfect fit; the "line" goes through every data point.  How close to 1 we need to be to say we have a good fit will depend on the field you are in or the problem you are exploring.  In physical sciences, the values of R² tend to be very close to 1, such as .995 or higher.  In social sciences, where the response is often human behavior, R² values near .3 may be considered large.

                        Use s as a measure of a model's overall fit.  On the MINITAB outputs is a value called s which we can use as another measure of a model's adequacy.  s is an estimate of the standard deviation of the residuals; it measures the amount of spread around the fit.  For example, if s is 10, then roughly 68% of the measurements are within 10 units of the model's fit, and roughly 95% of the measurements are within 20 units of the model's fit (assuming the residuals follow an approximately normal distribution).

Day 33

Activity: Decathlon.

The decathlon is a series of 10 track and field events to determine the world's greatest athlete.  The events involve throwing, jumping, endurance, and speed.  But a basic question is how should performances in these 10 areas be combined?  If we simply ranked competitors, we wouldn't be able to compare across different years or different ability levels.  We will consider our own ideas first, then look at how the decathlon was and is scored.  (I have chosen the decathlon, which is for men.  The corresponding women's competition is the heptathlon.)

http://www.iaaf.org/newsfiles/32097.pdf

http://www.athleticscoaching.ca/UserFiles/File/Sport%20Science/Theory%20&%20Methodology/Combined%20Events/Westera%20Redefining%20the%20decathlon%20scoring%20tables.pdf

http://www.decathlon2000.ee/eng/10athlon.php?id=28

http://www.decathlon2000.ee/pdf/scoringtables.pdf

Goals:     Another example of a linear equation.

Skills:

                        Be familiar with the issues involved with combining different sorts of measurements.  You've all heard the phrase "You can't compare apples and oranges."  In a similar way, you can't add seconds and inches.  But we still want to compare similar performances.  For example, a world record performance in the 100 meter dash should count as much as a world record performance in the long jump, even though one is measured in seconds and the other in feet.  Just how to combine these disparate measurements is the issue in scoring a decathlon.

                        Know how ranks can be used in a competition to choose the champion.  One method of combining performances is to rank the individual events from best to worst and replace the actual results with the rank, an integer between 1 and n, the number of competitors.  (If we have ties, we assign the average ranks they would have gotten if there were no ties.)  Each competitor's score is the sum of all their ranks.  For the decathlon, the lowest (best) score would be 10 (a 1 in each event) and the highest (worst) score would be 10n (a score of n in each event).  There are two problems with this technique.  First, it is not possible to compare performances from other meets; only the participants in this pool can be scored.  Second, very similar performances are treated the same as quite different performances because the rank is used.  For example, the difference between the fastest and second fastest 100 meter dash time could be .01 seconds; these two would get scores of 1 and 2.  The difference between the worst and second worst dash times might be 1 second, a large difference, yet those two would get scores of 9 and 10 (if there were 10 competitors), the same difference as the close scores 1 and 2 got. 
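
Here is a short Python sketch of the average-rank calculation.  The five 100-meter times are invented, and a lower time earns a lower (better) rank.

    def average_ranks(values):
        # Tied values share the average of the ranks they would occupy.
        ordered = sorted(values)
        ranks = {}
        for v in set(values):
            positions = [i + 1 for i, s in enumerate(ordered) if s == v]
            ranks[v] = sum(positions) / len(positions)
        return [ranks[v] for v in values]

    times = [10.8, 10.9, 10.9, 11.2, 11.5]
    print(average_ranks(times))   # [1.0, 2.5, 2.5, 4.0, 5.0]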

                        Realize that the system they use for the decathlon scoring could become obsolete.  Over time, performances in a particular event may improve to the point where that event is having an undue influence on the results.  For example, maybe pole vaulting techniques or advances in equipment have improved the heights jumped to the point where scores in this event are consistently producing top scores around 1200.  Suppose in another event, like the 100 meter dash, the best scores are generally only near 1000.  We would want to modify the system so that the two events' best performances give similar top scores.

Day 34

Activity: Basketball Salaries.

Is there a relationship between basketball players' abilities and their salaries?  What factors influence salary?  We will explore the development of a model for predicting a player's salary.  If our formula is successful, athletes could use the results to bargain for salary adjustments.  However, few athletes would argue for lower salaries than the formula predicts, so such an approach might well lead to inflated salaries.

Goals:     Another example of a multiple regression equation.

Skills:

                        Know how to use MINITAB to interpret multiple regression output.  The computer program MINITAB's output for multiple regression gives several P-values.  You should know the P-value for the F-test tells whether there is a relationship at all, but doesn't specify which variables are important.  The P-values for the t-tests let us know which variables are not contributing, when they are the last variable added.

Day 35

Activity: MLB attendance.

What is the association between fan attendance at baseball games and the team's success on the field?  We will explore this with real data.  Of course there are many factors influencing attendance, and we will try to accommodate these factors in our model.

http://www.sabernomics.com/sabernomics/index.php/2004/04/winning-and-attendance-in-mlb

Goals:     Regression example.

Skills:

                        Know the actions to take when P-values for the t-test are large.  When the P-values for the individual variables t-tests are large, this means that variable is not helping explain variation when it is the last variable added.  A typical procedure to find a good model is to drop the variable with the largest P-value over some pre-specified value, such as 0.05.  We then refit the model without that variable and repeat the procedure until all the variables have small P-values.  This model then becomes our final model.

Day 36

Activity: Quiz 5.

This fifth quiz is on correlation and regression.

Day 37

Activity: Parabolas.

If we ignore wind resistance, then the flight path for a projectile can be modeled very accurately with parabolas, or quadratic equations.  These equations have a squared x term, so they are also called second-order polynomials.  We will explore how changing the initial velocity and the initial angle will change the flight path, and hence the landing point.

Using parametric mode, change the angle and the velocity and see how the distance is affected.  In particular, see how the angle affects distance for a fixed velocity.

Goals:     Ballistics.

Skills:

                        Know the basic form of the equations for projectile flight.  We use the parametric feature on our calculators to do projectile flight.  The two equations are X = Vx·t and Y = -16t² + Vy·t + h (distances in feet, time in seconds).  The two velocity components, Vx and Vy, are determined by the speed and the angle at which the ball is hit.
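
Here is a Python sketch of those equations; the 150 ft/sec speed, 35-degree angle, and 3-foot initial height are arbitrary choices.

    from math import sin, cos, radians, sqrt

    def flight_distance(speed_fps, angle_deg, h=3):
        vx = speed_fps * cos(radians(angle_deg))     # horizontal velocity
        vy = speed_fps * sin(radians(angle_deg))     # vertical velocity
        # Solve -16 t^2 + vy t + h = 0 for the landing time t.
        t_land = (vy + sqrt(vy ** 2 + 64 * h)) / 32
        return vx * t_land                           # X = vx * t at landing

    print(flight_distance(150, 35), "feet")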

Day 38

Activity: Physics of baseball.

What makes a curve ball curve?  What sort of influences does wind resistance have on the flight of a baseball?  What use can we make of our knowledge of parabolas in charting the flight path of a batted or thrown baseball?

The following link is a program that will plot the flight path of a baseball for various altitudes, speeds, and angles.  You will use it in Presentation 3.

http://faculty.tcc.fl.edu/scma/carrj/Java/baseball4.html

Goals:     Understand some of the factors that influence balls in flight.

Skills:

                        Know how to do simple calculations of distance traveled.  You should be able to convert miles per hour to feet per second, and be able to calculate the distance a ball travels in a certain length of time.  The important thing to remember is how to convert units (5280 feet in a mile, 3600 seconds in an hour).  The formula we use is Distance = Rate times Time.
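
A quick Python check of such a conversion; the 95 mph pitch speed and the 0.4-second interval are invented numbers.

    mph = 95
    fps = mph * 5280 / 3600       # 5280 feet per mile, 3600 seconds per hour
    print(fps, "ft/sec;", fps * 0.4, "feet traveled in 0.4 sec")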

                        Be able to discuss the Magnus effect.  When a ball in flight (baseball, cannonball, ping pong ball) is spinning, the air pressures can differ greatly depending on the angle of the spin.  For example, if the ball is spinning counter-clockwise (as viewed from above), the air pressure on the right side of the ball is greater than the pressure on the opposite side.  (In baseball, this is the situation with a right-handed pitcher throwing a curve ball.  The third base side of the ball moves faster than the first base side of the ball.)  Because of the decrease in air pressure, the flight path of the ball will veer that way.  (For the baseball example above, this makes the ball curve from right to left as viewed from the pitcher.)  This movement towards the lower pressure is called the Magnus effect.

                        Know what backspin does to a pitch.  When a pitch is thrown with more backspin than usual, the Magnus effect will cause the ball to fall more slowly than a pitch thrown with less backspin.  This "rise" amounts to about 3 inches difference in height when the ball reaches home plate, 60 feet 6 inches away from the pitcher's mound.  Because of the physiology of the brain, a batter has to commit to a certain swing before he can really tell what sort of backspin is on the ball.  (The difference in heights of the two types of spinning ball is only an inch or so at the point where the batter makes his final swing choice.)

Day 39

Activity: Physics of football.

2-dimensional motion.

By knowing the timing of plays, such as how long it takes a quarterback to get to a pre-designed spot from which to throw the football, and when the receiver must be at the spot to catch the ball, we can make some calculations of time and distance.  Our chief tools are a Cartesian coordinate system and the Pythagorean theorem.

Also using angles and going through possible strategies, we can determine the path a runner should take to either avoid being tackled (if he is the ball carrier) or to intercept and tackle an opponent carrying the ball.  Perhaps surprisingly, to catch up to someone running laterally away from you, you should not run towards them, but towards either the sideline point they are heading for, or towards a spot closer to you if they decide not to take their optimum angle.

Project 3 due:

Develop an alternate scoring system for the triathlon.  The current method is to add the three times together.  Your goal is to use some linear combination, for example NewScore = .5 SwimTime + .3 BikeTime + 2 RunTime.  Decide on a scheme that is fair to you, such as making the spread of times equal, or making the standard deviations equal, or any other measure you think makes the scoring more fair than just adding times.  You need to justify your choice though.  You can use plots, or summary statistics, or just common sense reasoning, but be persuasive.  If you are stuck, please see me outside of class to get some guidance.
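
If you choose the equal-standard-deviations idea, here is a Python sketch of one way to start (a starting point only, not a complete project answer; the times are invented).

    from statistics import mean, stdev

    swim = [20.1, 22.4, 19.8, 25.0]      # minutes, one entry per athlete
    bike = [65.2, 61.0, 70.3, 64.4]
    run = [42.7, 40.1, 45.9, 41.5]

    def standardize(times):
        # Rescale so each event has mean 0 and standard deviation 1.
        m, s = mean(times), stdev(times)
        return [(t - m) / s for t in times]

    scores = [sum(parts) for parts in zip(*map(standardize, (swim, bike, run)))]
    print(scores)   # lower (more negative) = better under this scheme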

Goals:     Understand some basic physics.

Skills:

                        Know the Pythagorean theorem and how it is used in a coordinate system.  In a right triangle (one angle is exactly 90 degrees) we can calculate the length of the hypotenuse (the side of the triangle not touching the right angle) using the Pythagorean theorem:  c² = a² + b².  To map out where an object (a football carrier) will be at a particular time when running at a particular speed, we need to know the distance he runs along the diagonal.  If we mark his position on the field as a distance horizontally from the center and a distance vertically from the center, then these distances are a and b in the formula.
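
A quick Python check of the formula; the 30- and 40-yard displacements are invented.

    from math import sqrt

    a, b = 30, 40                 # yards from center, horizontally and vertically
    c = sqrt(a ** 2 + b ** 2)     # c^2 = a^2 + b^2
    print(c, "yards along the diagonal")   # 50.0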

                        Know the principle a ball carrier should use to gain the maximum distance on a run.  Draw a line between the runner and the defender who wants to intercept him, then draw a perpendicular line through the midpoint of that first line.  The runner should avoid running away from this perpendicular line.  If the runner gets closer to the line, the defender should "mirror" his path, using the perpendicular line as the "mirror".  If the runner makes a mistake and moves further from the perpendicular line, then the defender should stay the same distance from the runner; we can then draw a new pair of lines that decreases the farthest distance the runner can go.

Day 40

Activity: Review

Goals:     Know everything.

Day 41

Activity: Presentations.

Optimal angle for a HR.

What parameters affect the distance a baseball travels?  Using the link from Day 38, pick a stadium to examine.  By trial and error, discover what combinations of speed and angle will produce a home run (you must clear the outfield fence for a home run).  Keep careful track of which values work and which don't.  You may want to make a two-dimensional map of speed versus angle, shading in those combinations that work.  How do the results change with a moderate wind, say 10 mph?
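
If you want a first pass before turning to the applet, here is a Python sketch of the speed-versus-angle search that ignores wind and air resistance (the applet includes them, so expect its numbers to differ).  The 400-foot, 10-foot-high fence and the 3-foot contact height are assumptions.

    from math import sin, cos, radians

    def clears_fence(speed_fps, angle_deg, fence_dist=400, fence_ht=10, h=3):
        vx = speed_fps * cos(radians(angle_deg))
        vy = speed_fps * sin(radians(angle_deg))
        t = fence_dist / vx                     # time to reach the fence
        y = -16 * t ** 2 + vy * t + h           # ball height at the fence
        return y >= fence_ht

    for speed in range(110, 181, 10):           # ft/sec
        angles = [a for a in range(10, 81) if clears_fence(speed, a)]
        if angles:
            print(speed, "ft/sec: angles", angles[0], "to", angles[-1], "clear it")
        else:
            print(speed, "ft/sec: no home run at any angle")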

Day 42

Activity: Quiz 6.

This last quiz is on the physics of sports.
