Day By Day
Notes for PBIS 189
Spring 2006
Activity: Go over syllabus. Take roll. Overview examples: Gilbert trial, election polls, spam
filters
Goals: Review
course objectives: collect data, summarize information, make inferences.
Reading: To The Student, pages xx-xxv
Activity: Video 1 – Overview of
Statistics, Discussion of variables and graphs.
Goals: Get a feel
for what questions we answer with statistics. Begin graphical summaries (describing data with pictures).
Skills:
•
Identify types of
variables. To choose the proper graphical displays, it is
important to be able to differentiate between Categorical and Quantitative (or
Numerical) variables.
•
Be familiar with
types of graphs. To graph categorical variables we use bar graphs or
pie graphs. To graph numerical
variables, we use histograms, stemplots, or CUMPLOT (TI-83 program). In
practice, most of our variables will be numerical but it is still important to
choose the right display.
Reading: Chapter 1 (Skip Time Plots)
Activity: Use the monarchs dataset to create the histograms, stemplots, and cumplots for
the variable "years reigned" separately for the Saxon Rulers (829 to
1066), the rulers from William I to Henry VI (1066 to 1471), the rulers from
Edward IV to Charles I (1461 to 1649), and the rulers from Charles II to
present (1660 to 1998). Compare and interpret the graphs. Identify shape, center, and
spread.
Useful commands for the calculator:
STAT EDIT (use one of the lists to enter data, L1 for example;
the other L's can be used too)
2nd
STATPLOT 1 On (Use this screen to
designate the plot settings. You
can have up to three plots on the screen at once. For now we will only use one at a time.)
ZOOM 9 This command centers the window around your data.
CUMPLOT This program I wrote plots the sorted data and
"stacks" them up.
Goals: Be able
to use the calculator to make a histogram or a cumplot. Be able to make a stemplot by hand.
Skills:
•
Summarize data into a
frequency table. The easiest way to make a frequency table is to TRACE the boxes in a histogram and record the classes and
counts. You can control the size and
number of the classes with Xscl and Xmin in the WINDOW menu. The decision as to
how many classes to create is arbitrary; there isn't a "right"
answer. One popular suggestion is
try the square root of the number of data values. For example, if there are 25 data points, use 5
intervals. If there are 50 data
points, try 7 intervals. This is a
rough rule; you should experiment with it. The TI-83 has a rule for doing this; I do not know what
their rule is. You should
experiment by changing the interval width and see what happens to the diagram.
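For the curious, the square-root rule of thumb can be tried outside the calculator. Here is a rough sketch in Python (not a course tool; the data values are made up) that bins a list into equal-width classes the way TRACE reads them off a histogram:

```python
import math
from collections import Counter

def frequency_table(data, num_classes=None):
    """Bin data into equal-width classes, like TRACEing a TI-83 histogram.
    If num_classes is None, use the square-root rule of thumb."""
    if num_classes is None:
        num_classes = round(math.sqrt(len(data)))
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_classes
    counts = Counter()
    for x in data:
        # place x in its class; the maximum value goes in the last class
        i = min(int((x - lo) / width), num_classes - 1)
        counts[i] += 1
    return [((lo + i * width, lo + (i + 1) * width), counts[i])
            for i in range(num_classes)]

data = [2, 3, 5, 7, 8, 10, 12, 13, 15, 15, 18, 19, 21, 22, 24,
        25, 27, 28, 30, 31, 33, 35, 36, 38, 40]  # 25 values -> 5 classes
for (a, b), n in frequency_table(data):
    print(f"[{a:5.1f}, {b:5.1f}): {n}")
```

Changing num_classes here plays the same role as changing Xscl in the WINDOW menu.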
•
Use the TI-83 to
create an appropriate histogram or cumplot. STAT PLOT is our main tool for
viewing distributions of data.
Histograms are common displays, but have flaws; the choice of class
width is troubling as it is not unique.
The cumplot is more reliable, but less common. For interpretation purposes, remember that in a histogram
tall boxes represent places with lots of data, while in a cumulative plot those
same high-density data places are steep.
•
Create a stemplot by
hand. The stemplot is a convenient manual display; it is
most useful for small datasets, but not all datasets make good stemplots. Choosing the "stem" and
"leaves" to make reasonable displays will require some practice. Some notes for proper choice of stems:
if you have many empty rows, you have too many stems. Move one column to the left and try again. If you have too few rows (all the data
is on just one or two stems) you have too few stems. Move to the right one digit and try again. Some datasets will not give good
pictures for any choice of stem, and some benefit from splitting or rounding
(see the example in the text).
•
Describe shape,
center, and spread. From each of our graphs, you should be able to make
general statements about the shape, center, and spread of the distribution of
the variable being explored.
Reading: Chapter 1 (Skip Time Plots)
Activity: Video 2 – Lightning
Research. Dance Fever example.
To calculate our summary statistics, we will use 1-Var Stats (which uses List 1 by default) or 1-Var Stats L2 for List 2, for example. There are two screens of output; we will be mostly concerned
with the mean x-bar, the standard deviation Sx, and the five-number summary on screen two.
Goals: Observe
the creation and interpretation of graphical displays in practice. Compare numerical measures of center.
Skills:
•
Understand the effect
of outliers on the mean. The mean (or average) is unduly influenced by outlying
(unusual) observations. Therefore,
knowing when your distribution is skewed is helpful.
•
Understand the effect
of outliers on the median. The median is almost completely
unaffected by outliers. For
technical reasons, though, the median is not as common in scientific
applications as the mean.
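A quick sketch in Python (hypothetical numbers, not the monarchs data) makes the contrast concrete: a single outlier moves the mean by several years but barely budges the median.

```python
from statistics import mean, median

reigns = [10, 12, 15, 18, 20, 22, 25]   # made-up "years reigned" values
with_outlier = reigns + [64]            # add one unusually long reign

print(mean(reigns), median(reigns))
print(mean(with_outlier), median(with_outlier))
```

The mean jumps from about 17.4 to 23.25; the median only moves from 18 to 19.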
Reading: Chapter 2
Activity: Use the monarchs dataset to calculate the mean, the standard deviation, the
5-number summary, and the associated boxplot
for the variable "years reigned" separately for the Saxon Rulers (829
to 1066), the rulers from William I to Henry VI (1066 to 1471), the rulers from
Edward IV to Charles I (1461 to 1649), and the rulers from Charles II to
present (1660 to 1998).
Compare these measures with the corresponding
histogram and cumulative plot.
Note the similarities (where the data values are dense, and where they
are sparse) but especially note the differences. The boxplots and numerical measures cannot describe
shape. The histograms are hard to
use to compare two lists. The stem
and leaf is difficult to modify.
Answer these
questions:
1) Has the variable "years
reigned" changed over time?
2) How does a single case affect the
calculator's routines?
3) What information does the boxplot
disguise?
Goals: Summarize
data with numerical measures and boxplots. Compare these new measures with the histograms, stemplots,
and cumplots you made on Day 3.
Skills:
•
Use the TI-83 to
calculate summary statistics. Calculating may be as simple as entering numbers into
your calculator and pressing a button.
Or, if you are doing some things by hand, you may have to organize
information the correct way, such as listing the numbers from low to high. On the TI-83, the numerical measures
are accessed in 1-Var Stats function in the STAT CALC menu.
Please get used to using the statistical features of your calculator to
produce the mean. While I know you
can calculate the mean by simply adding up all the numbers and dividing by the
sample size, you will not be in the habit of using the full features of your
machine, and later on you will be 'missing the boat'.
•
Compare several lists
of numbers using boxplots. For two lists, the best simple approach is the
back-to-back stemplot. For more
than two lists, I suggest trying boxplots, side-by-side, or stacked. At a glance, then, you can assess which
lists have typically larger values or more spread out values, etc.
•
Understand
boxplots. You should know that the boxplots for some lists don't
tell the interesting part of those lists.
For example, boxplots do not
describe shape; you can only see where the quartiles are. Alternatively, you should know that the
boxplot can be a very good first
quick look.
Reading: Chapter 2
Activity: Create
the following lists:
1) A list of 10 numbers that has
only one number below the mean.
2) A list of 10 numbers that has the
standard deviation greater than the mean.
3) A list of 10 numbers that has a
standard deviation of zero.
For your fourth list start with any 21 numbers. Find a number N
such that 14 of the numbers in your list are within N of the average.
For example, pick a number N
(say 4), calculate the average plus 4, the average minus 4, and count how many
numbers in your list are between those two values. If the count is less than 14, try a larger number for N (bigger than 4). If the count is more than 14, try a smaller number for N (smaller than 4).
Finally, compare the standard deviation to the
Interquartile Range (IQR = Q3 - Q1).
Goals: Interpret
standard deviation as a measure of spread.
Skills:
•
Understand standard
deviation. At first, standard deviation will seem foreign to you,
but I believe that it will make more sense the more you become familiar with
it. In its simplest terms, the
standard deviation is a non-negative number that measures how "wide" a
dataset is. One common
interpretation is that the range of a dataset is 4 standard deviations. Another interpretation is that the
standard deviation is roughly ¾ times IQR. Eventually we will use the standard deviation in our
calculations for statistical inference; until then, this measure is just
another summary statistic, and getting used to this number is your goal. The normal curve of Chapter 3 will
further help us understand standard deviation.
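Both rules of thumb can be tried on simulated data. A sketch in Python (my own example, using normally distributed values; not a course tool):

```python
import random
from statistics import stdev, quantiles

random.seed(1)
heights = [random.gauss(50, 10) for _ in range(50)]  # bell-shaped data

s = stdev(heights)
q1, q2, q3 = quantiles(heights, n=4)   # quartiles
print("sd:", round(s, 2))
print("range/4:", round((max(heights) - min(heights)) / 4, 2))
print("(3/4)*IQR:", round(0.75 * (q3 - q1), 2))
```

On bell-shaped data the ¾ × IQR value lands close to the actual standard deviation; the range/4 rule is rougher.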
Reading: Chapter 3
Activity: Review Homework 1. Video 3 – Boston Beanstalks. Introduce the TI-83's normal
calculations.
Goals: Introduce
normal curve. Use TI-83 in place
of the standard normal table in the text.
Skills:
•
Using the TI-83 to
find areas under the normal curve.
When we have a distribution that
can be approximated with the bell-shaped normal curve, we can make accurate
statements about frequencies and percentages by knowing just the mean and the
standard deviation of the data.
Our TI-83 has 2 functions, normalcdf( and invNorm( which allow us to
calculate these percentages more easily and more accurately than the table in
the text. We use normalcdf( when we want the percentage as an answer and we use invNorm( when we already know the percentage but not the value
that gives that percentage.
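For reference, the same two kinds of questions can be answered outside the calculator; Python's NormalDist plays the role of normalcdf( and invNorm(. The exam-score numbers here are made up:

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)   # hypothetical exam scores

# normalcdf-style question: what fraction of scores fall between 60 and 85?
between = scores.cdf(85) - scores.cdf(60)
print(round(between, 4))        # about 0.7745

# invNorm-style question: what score marks the top 10% (the 90th percentile)?
cutoff = scores.inv_cdf(0.90)
print(round(cutoff, 1))         # about 82.8
```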
Reading: Chapter 3
Activity: Practice normal calculations.
1)
Suppose SAT scores are distributed normally with mean 800 and standard
deviation (sd) 100. Estimate
the chance that a randomly chosen score will be above 720. Estimate the
chance that a randomly chosen score will be between 800 and 900. The top 20% of scores are above
what number? (This is called the
80th percentile.)
2) Find
the Interquartile Range (IQR) for the standard normal (mean 0, sd 1). Compare this to the standard deviation of
1.
3) Women aged 20 to 29 have
normally distributed heights with mean 64 and sd 2.7. Men have mean 69.3 with sd 2.8. What percent of women are taller than the average man, and
what percentage of men are taller than the average woman?
4)
Pretend we are manufacturing fruit snacks, and that the average weight
in a package is .92 ounces with sd 0.05.
What should we label the net weight on the package so that only 5% of
packages are "underweight"?
5) Suppose that your
average commute time to work is 20 minutes, with an sd of 2 minutes. What time should you leave home to
arrive to work on time at 8:00?
(You may have to decide a reasonable value for the chance of being
late.)
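Afterward, problem 2 can be checked with invNorm-style calculations; here is a sketch using Python's NormalDist in place of the calculator:

```python
from statistics import NormalDist

z = NormalDist()          # standard normal: mean 0, sd 1
q1 = z.inv_cdf(0.25)      # like invNorm(.25)
q3 = z.inv_cdf(0.75)      # like invNorm(.75)
iqr = q3 - q1
print(round(iqr, 3))           # about 1.349
print(round(0.75 * iqr, 3))    # about 1.012, close to the sd of 1
```

Note how ¾ × IQR comes out close to the standard deviation of 1, matching the rule of thumb from the standard deviation discussion.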
Goals: Master normal calculations. Realize that summarizing using the normal curve is the
ultimate reduction in complexity, but only applies to data whose distribution
is actually bell-shaped.
Skills:
•
Memorize 68-95-99.7
rule. While we do rely on our technology to calculate areas
under normal curves, it is convenient to have some of the values committed to
memory. These values can be used
as rough guidelines; if precision is required, you should use the TI-83
instead. I will assume you know these
numbers by heart when we encounter the normal numbers again in chapters 10 and
13 through 19.
•
Understand that
summarizing with just the mean and standard deviation is a special case. We
have progressed from pictures like histograms to summary statistics like
medians, means, etc. to finally summarizing an entire list with just the mean
and the standard deviation.
However, this last step in our summarization only applies to lists whose distribution resembles the
bell-shaped normal curves. If the
data's distribution is skewed, or has any other shape, this level of
summarization is incomplete. Also,
it is important to realize that these calculations are only approximations.
Reading: Chapters 1 through 3
Activity: Presentations. Graphical (Chapter 1) and Numerical
(Chapter 2) Summaries
Collect or find some data; the quality of the data is not important for this
project. Use 3 to 5 lists of data;
make sure you have enough data so that your summaries are meaningful, say at
least 20 cases. Summarize your
data using both graphical and numerical summaries. Also, make sure you have at least one categorical variable
and at least one numerical variable.
Reading: Chapters 1 through 3
Activity: Exam 1. This first exam will cover graphical summaries (pictures), numerical
summaries (summary calculations) and normal curve calculations (areas under the
bell curve). Some of the questions
will be multiple choice. Others
will require you to show your worked out solution. Chapter reviews are an excellent source for studying for the
exams. Don't forget to review your
class notes and recall what we saw in the videos.
Activity: 1) Using the monarchs data, plot "years reigned" versus "death age" on your calculator. Then guess what the correlation coefficient might be. Use the sample diagrams on page 92 to guide you.
Finally, using your calculator, calculate
the actual value for the correlation coefficient and compare it to your guess.
2) Outlier effects. With the
dataset I give you in class, add an eighth point
in three different places and observe how the
correlation coefficient changes.
Goals: Display
two variables and measure (and interpret) linear association using the
correlation coefficient.
Skills:
•
Plot data with a
scatterplot. This will be as simple as entering two lists of
numbers into your TI-83 and pressing a few buttons, just as for histograms or
boxplots. Or, if you are doing
plots by hand you will have to first choose an appropriate axis scale and then
plot the points. You should also
be able to describe overall patterns in scatter diagrams and suggest tentative
models that summarize the main features of the relationship, if any.
•
Use the TI-83 to
calculate the correlation coefficient.
We will have to use the
regression function STAT CALC LinReg(ax+b) to
calculate correlation, r. First, you will have to execute the DiagnosticOn command. Access this command through the CATALOG (2nd 0). If you
type ENTER after the LinReg(ax+b) command, the calculator assumes your lists are in columns
L1 and L2; otherwise
you will type where they are, for example LinReg(ax+b) L2, L3.
•
Interpret the
correlation coefficient. You should know the range of the correlation
coefficient (-1 to +1) and what a 'typical' diagram looks like for various
values of the correlation coefficient.
Again, page 92 is your guide.
You should recognize some of the things the correlation coefficient does
not measure, such as the strength
of a non-linear pattern. You should also recognize how outliers
influence the magnitude of the correlation coefficient. One simple way to observe the effects
of outliers is to calculate the correlation coefficient with and without the
outlier in the dataset and compare the two values. If the values vary greatly (this is a judgment call) then
you would say the outlier is "influential".
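That with-and-without comparison can be sketched in Python (the seven points and the outlier are invented for illustration; the corr function is mine, not a course tool):

```python
from statistics import mean, stdev

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    n = len(xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

# seven roughly linear points
x = [1, 2, 3, 4, 5, 6, 7]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]
print(round(corr(x, y), 3))                    # very close to 1

# add an eighth point far from the pattern and recompute
print(round(corr(x + [8], y + [2.0]), 3))      # much smaller
```

If the two values differ greatly, the added point is "influential" in exactly the sense described above.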
Reading: Chapter 4
Activity: Video 4 – Manatees. Correlation summary.
1) The variables can be entered in
any order; correlation is a fact about a pair of variables.
This will be different when we get to regression; there, the order the
variables are presented matters.
2) We must have numerical variables to calculate correlation. For categorical variables, we will use
contingency tables, in Chapter 6.
3) High correlation does not
necessarily mean a straight line scatterplot. US population growth is an example.
4) Correlation is not resistant;
the dataset from Day 11 showed that the placement of a single point in the
scatterplot can greatly influence the value of the correlation.
Goals: See scatterplots
and correlation in practice.
Understand correlation's limitations and features.
Skills:
•
Recognize the proper
use of correlation, and know how it is abused. Correlation
measures straight line
relationships. Any departures from
that model make the correlation coefficient less reliable as a summary measure.
Just as for the standard deviation and the mean, the correlation coefficient is
affected by outliers. Therefore,
it is extremely important to be aware of data that is unusual. Some 2-dimensional outliers are hard to
detect with summary statistics; scatterplots are a must then.
Reading: Chapter 5
Activity: 1) Using the Olympic data, fit a regression line to predict the
2004 and 2008 race results.
2) Revisit outliers dataset,
adding regression lines.
Goals: Practice
using regression with the TI-83.
We want the regression equation, the regression line superimposed on the
plot, the correlation coefficient, and we want to be able to use the line to
predict new values.
Skills:
•
Fit a line to data. This
may be as simple as 'eyeballing' a straight line to a scatter plot. However, to be more precise, we will
use least squares, STAT CALC LinReg(ax+b) on the
TI-83, to calculate the coefficients, and VARS Statistics EQ RegEQ to type the equation in the Y= menu.
You should also be able to sketch a line onto a scatter plot (by hand)
by knowing the regression coefficients.
•
Interpret regression
coefficients. Usually, we want to only interpret slope, and slope is
best understood by examining the units involved, such as inches per year or
miles per gallon, etc. Because
slope can be thought of as "rise" over "run", we are
looking for the ratio of the units involved in our two variables. More precisely, the slope tells us the
change in the response variable for a unit change in the explanatory
variable. We don't typically
bother interpreting the intercept, as zero is often outside of the range of
experimentation.
•
Estimate/predict new
observations using the regression line.
Once we have calculated a regression
equation, we can use it to predict new responses. The easiest way to use the TI-83 for this is to TRACE on the regression line. You may need to use up and down arrows to toggle back and
forth from the plot to the line.
You may also just use the equation itself by multiplying the new x-value by the slope and adding the intercept. (This is exactly what TRACE is doing.)
•
Understand the
limitations and strengths of linear regression. Quite simply,
linear regression should only be used with scatterplots that are roughly linear
in nature. That seems
obvious. However, there is nothing
that prevents us from calculating
the numbers for any data set we can input into our TI-83's. We have to realize what our data looks
like before we calculate the regression;
therefore a scatter plot is essential. In the presence of outliers and
non-linear patterns, we should avoid drawing conclusions from the fitted
regression line.
Reading: Chapter 5
Activity: Try to summarize and predict the
population growth in the US. Using
the census data, see if any of the other regression functions in the STAT CALC menu are good models.
Goals: Explore
non-linear regressions on the TI-83.
Skills:
•
Effectively model
using non-linear regression functions.
When we have a relationship
that is non-linear, we try other models.
Because straight lines are easy for us to understand (we are accustomed
to them), the coefficients have meaning.
Some of the other functions available to you are also interpretable,
with some familiarity (which I am not expecting from you) but others have
coefficients that are uninterpretable.
Our main use of these alternate functions is to see the fitted model on
the scatterplot. (We add them to
the scatterplot in the same way as for linear regression: VARS Statistics
EQ RegEQ from the Y= menu.)
•
Understand that a high value of r² is not necessarily a good fit. We have seen that when r² = 1, we have a perfect fit. So, you might assume that values very close to 1 are indicators of very good fits, but this is not necessarily the case. The population data should show us some high values of r² that are poor predictive models. Again, we need the scatterplot along with the equation to make proper conclusions.
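One way to see this is to fit a line to accelerating growth. A Python sketch (y = x² standing in for census-style data; the helper function is mine):

```python
from statistics import mean

def linreg_r2(xs, ys):
    """Least-squares fit y = a*x + b, plus r-squared."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    a = sxy / sxx
    b = my - a * mx
    return a, b, sxy * sxy / (sxx * syy)

xs = list(range(1, 11))
ys = [x ** 2 for x in xs]     # clearly curved, not linear

a, b, r2 = linreg_r2(xs, ys)
print(round(r2, 3))                       # high, above 0.9
print(round(a * 15 + b, 1), 15 ** 2)      # linear prediction vs actual at x = 15
```

r² comes out above 0.9, yet the line badly under-predicts beyond the data; the scatterplot, not r², reveals the problem.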
Reading: Chapter 6
Activity: Video 5 – Smoking. Introduce tables of categorical data.
Goals: Introduce
association for categorical variables.
Explore Simpson's paradox.
Skills:
•
Understand that cause
and effect is difficult to establish.
The slogan is
"Association is not the same as Causation." We will encounter this many times throughout the rest of the
course. In the next set of
material (Chapters 7 and 8) we will discuss ways to produce data from which we can draw conclusions about causation.
•
Recognize Simpson's
paradox. Sometimes when data is summarized over several
sub-categories, an association can be reversed. It seems contrary to good common sense, but it is actually
the effects of a lurking variable, and the phenomenon is known as Simpson's
paradox. You should be able to
recognize situations where this paradox might occur. Not all tables of categorical variables will exhibit this
paradox; the tables must be comparing rates over several groups.
Reading: Chapter 6
Activity: Expected Tables.
Goals: Develop intuition
for when the observed and expected tables are too different.
Skills:
•
Create the table of
expected counts. The primary method of analyzing categorical tables is
comparing the observed data to a table of expected counts. (This material comes from Chapter 20,
but I will not expect you to master Chapter 20.)
•
Recognize when an
association is present. When two categorical variables are associated (much
like when two numerical variables are correlated) we detect this with the χ² test. I will show you a way to decide if the differences in the tables are too great (STAT TESTS χ²-Test; you must have the observed table in a matrix. The expected table will be stored in another matrix. If p < .05, we conclude the two tables are quite different.)
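The expected-count recipe and the χ² statistic itself can be sketched in Python (the 2×2 table is made up; turning the statistic into a p-value is what the calculator's test adds):

```python
def expected_table(observed):
    """Expected counts: (row total * column total) / grand total."""
    row_tot = [sum(row) for row in observed]
    col_tot = [sum(col) for col in zip(*observed)]
    grand = sum(row_tot)
    return [[r * c / grand for c in col_tot] for r in row_tot]

def chi_sq(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    exp = expected_table(observed)
    return sum((o - e) ** 2 / e
               for orow, erow in zip(observed, exp)
               for o, e in zip(orow, erow))

obs = [[30, 20],    # hypothetical counts for two groups
       [10, 40]]    # and two categories
print(expected_table(obs))
print(round(chi_sq(obs), 2))
```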
Reading: Chapters 4 through 6
Activity: Presentations. Regression/Correlation (Chapters 4 and
5)
Pick one of the 50 states. Predict
the population in the year 2010 using a regression function (not necessarily
linear though). Describe how you
decided upon your model, and explain how good you think your prediction is.
Reading: Chapters 4 through 6
Activity: Exam 2. This second exam covers scatterplots, correlation,
regression, and associations in categorical data. Some of the questions will be multiple choice. Others will require you to show your
work. Chapter reviews are an
excellent source for studying for the exams. Don't forget to review your class notes and recall what we
saw in the videos.
Activity: Video 6 – Frito Lay. History of polls.
Goals: Introduce
sampling. Identify biases. Explore why non-random samples are not trustworthy.
Skills:
•
Understand the issues
of bias. We seek representative samples. The "easy" ways of sampling,
samples of convenience and voluntary response samples, may or may not produce
good samples, and because we don't know the chances of subjects being in such
samples, they are poor sampling methods.
Even when probability methods are used, biases can spoil the
results. Avoiding bias is our
chief concern in designing surveys.
•
Huge samples are not
necessary. One popular misconception about sampling is that if
the population is large, then we need a proportionately large sample. This is just not so. My favorite counter-example is our
method of tasting soup. To find
out if soup tastes acceptable, we mix it up, then sample from it with a
spoon. It doesn't matter to us
whether it is a small bowl of soup, or a huge vat, we still use only a
spoonful. The situation is the
same for statistical sampling; we use a small "spoon", or
sample. The fundamental
requirement though is that the "soup" (our population) is "well
mixed" (as in a simple random sample – see Day 20).
Reading: Chapter 7
Activity: Creating random samples. We will use three methods of sampling
today: dice, Table B in our book, and our calculator. To make the problem feasible, we will only use a population
of size 6. (I know this is
unrealistic in practice, but the point today is to see how randomness works,
and hopefully trust that the results extend to larger problems.) Pretend that the items in our
population (perhaps they are people) are labeled 1 through 6. For each of our methods, you will have
to decide in your group what to do with "ties". Keep in mind the goal of simple random
sampling: at each stage, each remaining item has an equal chance to be the next
item selected.
Using dice, generate a sample of three people. Repeat 20 times.
Using Table B, starting at any haphazard location, select three people. Repeat 20 times.
Using your TI-83, select three people.
The command randInt(2,4,5) will produce 5
numbers between 2 and 4, inclusive, for example.
Your group should have drawn 60 samples at the end. Keep careful track of which samples you selected; record
your results in order, as 125 or 256, for example. (125 would mean persons 1, 2, and 5 were selected.) We will pool the results of everyone's
work together on the board.
Goals: Gain
practice taking random samples.
Understand what a simple random sample is. Become familiar
with randInt(. Accept that the calculator is random.
Skills:
•
Know the definition
of a Simple Random Sample (SRS). Simple Random Samples can be defined in two ways:
1) An SRS is a sample where, at each stage, each remaining item has an equal chance to be the next item selected.
2) A scheme where every possible sample has an equal chance to be the chosen sample results in an SRS.
•
Select an SRS from a
list of items. The TI-83 command randInt( will select numbered items from a list randomly. If a number selected is already in the
list, ignore that number and get a new one. Remember, as long as each remaining item is equally likely to be chosen as the next item,
you have drawn an SRS.
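The ignore-and-redraw procedure is easy to sketch in Python (randint plays the role of randInt(; the function name srs is my own):

```python
import random

def srs(population_size, n):
    """Draw an SRS of n labels from 1..population_size: generate random
    labels and ignore any number already selected."""
    chosen = []
    while len(chosen) < n:
        pick = random.randint(1, population_size)  # like randInt(1, size)
        if pick not in chosen:
            chosen.append(pick)
    return chosen

random.seed(0)
sample = srs(6, 3)        # three people from a population of six
print(sorted(sample))
```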
•
Understand the real
world uses of SRS. In practice, simple random samples are not that
common. It is just too impractical
(or impossible) to have a list of the entire population available. However, the idea of simple random
sampling is essentially the foundation for all the other types of
sampling. In that sense then it is
very common.
Reading: Chapter 8
Activity: Video 7 – Aspirin. Lurking variables exercises.
Goals: Explore
experimentation ideas. Discover
potential lurking variables.
Skills:
•
Examine situations
and detect lurking variables. When searching for lurking variables, it is not enough
to suggest variables that might also explain the response variable. Your potential "lurking"
variable must also be associated with the explanatory variable. So, for example, suppose you are trying
to explain height using weight. A
possible lurking variable might be age, because age tends to be associated with
weight and height. On the other hand, a variable
associated with height that is unlikely to be related to weight (and therefore
would not be a lurking variable)
is arm span.
•
Understand that
experimentation, done properly, will allow us to establish cause-and-effect
relationships. Observational studies have lurking variables; we can
try to control for them by various methods, but we cannot eliminate them. If the data is collected appropriately
through good experimentation, however, the effects of lurking variables can be
eliminated. This is done through
randomization, the thinking being that if a sample is large enough, it can't realistically be the case that one group contains all the large values of a lurking variable, for example.
Reading: Chapter 9
Activity: Video 8 – Traffic. Coins, Dice, Probability Histograms.
Using either complete sampling spaces (theory) or simulation, find (or
estimate) these chances:
1) Roll two dice, one colored, one
white. Find the chance of the
colored die being less than the white die.
2) Roll three dice and find the
chance that the largest of the three dice is a 6. (Ignore ties; that is, the largest value when 6, 6, 4 is
rolled is 6.)
3) Roll three dice and find the
chance of getting a sum of less than 8.
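After class, the three answers can be checked by listing the complete sample spaces. A Python sketch using exact fractions:

```python
from itertools import product
from fractions import Fraction

two = list(product(range(1, 7), repeat=2))     # 36 equally likely rolls
three = list(product(range(1, 7), repeat=3))   # 216 equally likely rolls

# 1) colored die less than white die
p1 = Fraction(sum(c < w for c, w in two), len(two))
# 2) largest of three dice is a 6
p2 = Fraction(sum(max(roll) == 6 for roll in three), len(three))
# 3) sum of three dice is less than 8
p3 = Fraction(sum(sum(roll) < 8 for roll in three), len(three))

print(p1, p2, p3)   # 5/12, 91/216, 35/216
```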
Goals: Create
sample spaces. Use simulation to
estimate probabilities.
Skills:
•
List simple sample
spaces. Flipping coins and rolling dice are common events to
us, and listing the possible outcomes lets us explore probability
distributions. We will not delve
deeply into probability rules; rather, we are more interested in the ideas of
probability and I think the best way to accomplish this is by example.
•
Simulation can be
used to estimate probabilities. If the number of repetitions of an experiment is
large, then the resulting observed frequency of success can be used as an
estimate of the true unknown probability of success. However, a "large" enough number of repetitions
may be more than we can reasonably perform. For example, for problem 1 today, a sample of 100 will give
results between 30 and 50 95% of the time. That may not be good enough for our purposes. Even with 500, the range is 180 to
220. Eventually the answers will
converge to a useful percentage; the question is how soon that will occur. We will have answers to that question
after Chapter 10.
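The convergence question can be explored by simulation. A Python sketch for problem 1 (the repetition counts and seed are my choices):

```python
import random
random.seed(2)

def estimate(reps):
    """Estimate P(colored die < white die) from reps simulated rolls."""
    hits = sum(random.randint(1, 6) < random.randint(1, 6)
               for _ in range(reps))
    return hits / reps

# the exact answer is 15/36, about 0.417; watch the estimates close in
for reps in (100, 500, 10000):
    print(reps, estimate(reps))
```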
Reading: Chapter 9
Activity: Continue coins and dice. Introduce Random Variables. We will finish up the problems from Day
22, and also examine Pascal's triangle, which is a way of figuring binomial
probabilities (chances on fair coins).
Also in our tables, we will include random variables.
Goals: Understand
that variables may have values that are not equally likely.
Skills:
•
Understand sampling
distributions and how to create simple ones. We have listed
sample spaces of equally likely events, like dice and coins. Events can further
be grouped together and assigned values.
These new groups of events may not be equally likely, but as long as the
rules of probability still hold, we have valid probability distributions. Pascal's triangle is one such example,
though you should realize that it applies only to fair coins. We will work with "unfair
coins" (proportions) later, in Chapters 18 and 19. Historical note: examining these
sampling distributions led to the discovery of the normal curve in the early
1700's. We will copy their work
and "discover" the normal curve for ourselves too using dice.
Reading: Chapter 10 (Skip SPC)
Activity: Central Limit Theorem
exploration. In addition to coins
and dice, rand on your calculator is
another good random mechanism for exploring "sampling
distributions". These
examples will give you some different views of sampling distributions. The important idea is that each time an
experiment is performed, a potentially different result occurs. How these results vary from sample to
sample is what we seek. You are
going to produce many samples, and will therefore see how these values vary.
1) Sums of two items: Each of you in your group will roll two
dice. Record the sum on the dice. Repeat this 30 times, generating 30
sums. Make a histogram or a CUMPLOT of your 30 sums.
Compare to the graphs of the other members in your group, particularly
noting the shape. Sketch the graph you made.
2) Sums of 4 items: Each of you generate 4 random numbers
on your calculator, add them together, average, and record the result; repeat
30 times. The full command is seq(rand+rand+rand+rand,X,1,30)/4->L1, which will generate 30 averages of four random numbers and store them in L1. Again,
make a graph of the distribution.
3) Sums of 12 items: Each of you generate 12 random normal numbers on your calculator using randNorm(65,5,12). Add
them together and record the result; repeat 30 times. The full command is seq(sum(randNorm(65,5,12)),X,1,30)->L2. Again, make a graph of the
distribution. (This is problem
10.30 in our text.)
For all the lists you generated, calculate the
standard deviation and the mean.
We will find these two statistics to be immensely important in our
upcoming discussions about inference.
It turns out that these means and standard deviations can be found
through formulas instead of having to actually generate repeated samples. These means depend only on the mean and
standard deviation of the original population (the dice or rand or randNorm in this
case) and the number of times the dice were rolled or rand was pressed (called the sample size, denoted n).
Goals: Examine
histograms to see that averages are less variable than individual
measurements. Also, the shape of
these curves should get closer to the shape of the normal curve as n increases.
Skills:
·
Understand the concept of sampling variability. Results vary from sample to
sample; this idea is called sampling variability. We are
very much interested in knowing what the likely values of a statistic are, so
we focus our energies on describing the sampling distributions. In today's exercise, you simulated
samples, and calculated the variability of your results. In practice, we only do one sample, but
calculate the variability with a formula.
In practice, we also have the Central Limit Theorem, which lets us use
the normal curve in many situations to calculate probabilities.
Reading: Chapter 10 (Skip SPC)
Activity: Practice Central Limit Theorem
(CLT) problems. We will have
examples of non-normal data and normal data to contrast the diverse cases where
the CLT applies.
Goals: Use
normal curve with the CLT.
Skills:
·
Recognize how to use
the CLT to answer probability questions concerning sums and averages. The
CLT says that for large sample sizes, the distribution of the sample average is
approximately normal, even though the original data in a problem may not be
normal.
·
For small samples, we
can only use the normal curve if the actual distribution of the original data
is normally distributed. It is important to realize when original data is not
normal, because there is a tendency to use the CLT even for small sample sizes,
and this is inappropriate. When
the CLT does apply, though, we are
armed with a valuable tool that allows us to estimate probabilities concerning
averages. A particular example is
when the data is a count that must
be an integer, and there are only a few possible values, such as the number of
kids in a family. Here the normal
curve wouldn't help you calculate chances of a family having 3 kids.
However, we could calculate quite accurately the number of kids in 100
such families.
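A Python sketch of that last point; the mean of 2.2 kids and SD of 1.3 kids per family are made-up values, not data from the text, but with any such values the CLT gives an approximate normal distribution for the total in 100 families:

```python
import math

# Hypothetical population values (assumed for illustration): the number
# of kids per family has mean 2.2 and SD 1.3.  The count itself is far
# from normal, but by the CLT the TOTAL in n = 100 families is roughly
# normal with mean n*mu and SD sqrt(n)*sigma.
mu, sigma, n = 2.2, 1.3, 100
total_mean = n * mu                  # 220
total_sd = math.sqrt(n) * sigma      # 13

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal(mean, sd) variable."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

# Chance the 100 families have more than 240 kids in total:
p = 1 - normal_cdf(240, total_mean, total_sd)
print(round(p, 4))
```

Note the contrast: the normal curve says nothing useful about one family having exactly 3 kids, but it handles the total over 100 families quite well.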
Reading: Chapters 7 through 10.
Activity: Presentations. Sampling (Chapters 7 and 8)
Sample 20 students from UWO. For
each student, record the number of credits they are taking this semester, what
year they are in school, and whether or not they are graduating this
semester. Try to make your sample
as representative as you can. You
must have a probability sample to get full credit. Discuss the biases your sample has and what you did to avoid
bias.
Reading: Chapters 7 through 10.
Activity: Exam 3. This third exam is on sampling, experiments, and
probability, including sampling distributions. Most of the exercises will be multiple choice. Chapter reviews are an excellent source
for studying for the exams. Don't
forget to review your class notes and recall what we saw in the videos.
Activity: Guess m&m's percentage. What
fraction of m&m's are blue or green?
Is it 25%? 33%? 50%? We take samples to find out.
Each of you will sample from my jar of m&m's, and you will all calculate
your own confidence interval. Of
course, not everyone will be correct, and in fact, some of us will have "lousy"
samples. But that is the point of
the confidence coefficient, as we will see when we jointly interpret our
results.
It has been my experience that confidence intervals are easier to understand if
we talk about sample proportions instead of sample averages. Thus I will use techniques from Chapter
18. Each of you will have a
different sample size and a different number of successes. In this case the sample size, n, is the total number of m&m's you have selected,
and the number of successes, x, is
the total number of blue or green m&m's in your sample. Your guess is simply the ratio x/n, or
the sample proportion. We call this estimate p-hat. Use 1-PropZInt with 70% confidence for your interval here today.
When you have calculated your confidence interval, record your result on the
board for all to see. We will
jointly inspect these confidence intervals and observe just how many are
'correct' and how many are 'incorrect'.
The percentage of correct intervals should match our chosen level of confidence. This is in fact what is meant by
confidence.
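A rough Python stand-in for what 1-PropZInt computes, plus a coverage check; the true proportion p = 0.30 is an assumed value chosen for illustration, not a fact about real m&m's:

```python
import math
import random

def prop_z_interval(x, n, conf=0.70):
    """One-proportion z interval (a sketch of what 1-PropZInt does)."""
    # z* multipliers; 1.036 is the standard normal quantile for 0.85
    zstar = {0.70: 1.036, 0.90: 1.645, 0.95: 1.960}[conf]
    phat = x / n
    me = zstar * math.sqrt(phat * (1 - phat) / n)
    return phat - me, phat + me

# Coverage check: if the true fraction of blue-or-green m&m's were
# p = 0.30, about 70% of the 70% intervals should capture it.
random.seed(2)
p, n, hits, reps = 0.30, 50, 0, 1000
for _ in range(reps):
    x = sum(random.random() < p for _ in range(n))
    lo, hi = prop_z_interval(x, n)
    hits += lo <= p <= hi
print("coverage:", hits / reps)   # should land near 0.70
```

This is exactly the "how many are correct" tally we will make on the board, just repeated 1000 times by the machine.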
Goals: Introduce
statistical inference – Guessing the parameter. Construct and interpret a confidence interval.
Skills:
·
Understand how to
interpret confidence intervals. The calculation of a confidence interval is quite
mechanical. In fact, as we have
seen, our calculators do all the work for us. Our job is then not so much to calculate confidence intervals as it is to be able to
understand when one should be used
and how best to interpret one.
Reading: Chapter 13 (Skip "choosing
the sample size")
Activity: Video 9 – Batteries. Changing confidence levels and sample
sizes.
Goals: See how
the TI-83 calculates our CI's.
Interpret the effect of differing confidence coefficients and sample sizes.
Skills:
·
Understand the
factors that make confidence intervals believable guesses for the
parameter. The two chief factors that make our confidence
intervals believable are the sample size and the confidence coefficient. The key result is larger confidence
makes wider intervals, and larger sample size makes narrower intervals.
Reading: Chapter 14
Activity: Argument by contradiction. Scientific method. Type I and Type II error diagram.
Goals: Introduce
statistical inference – Hypothesis testing.
Skills:
·
Recognize the two
types of errors we make. If we decide to reject a null hypothesis, we might be
making a Type I error. If we fail
to reject the null hypothesis, we might be making a Type II error. If it turns out that the null
hypothesis is true, and we reject it because our data looked weird, then we
have made a Type I error.
Statisticians have agreed to control this type of error at a specific
percentage, usually 5%. On the
other hand, if the alternative hypothesis is true, and we fail to reject the null hypothesis, we have also made a
mistake. This error is generally not controlled by us; the sample size is the determining
factor here.
·
Understand why one
error is considered a more serious error.
Because we control the
frequency of a Type I error, we feel confident that when we reject the null
hypothesis, we have made the right decision. This is how the scientific method works; researchers usually
set up an experiment so that the conclusion they would like to make is the
alternative hypothesis. Then if
the null hypothesis (usually the opposite of what they are trying to show) is
rejected, there is some confidence in the conclusion. On the other hand, if we fail to reject the null hypothesis, the most useful
conclusion is that we didn't have a large enough sample size to detect a real
difference. We aren't really
saying we are confident the null hypothesis is a true statement; rather we are
saying it could be true. Because we cannot control the frequency
of this error, it is a less confident statement.
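A quick simulation makes the 5% control concrete; this Python sketch assumes a z-test with known σ = 5, μ0 = 20, and n = 10 (the setup we will use in class), and shows that when the null hypothesis really is true, rejections (Type I errors) happen close to 5% of the time:

```python
import random
import math

# Simulate a two-sided z-test when the null hypothesis is TRUE
# (mu really is 20, sigma = 5 known, n = 10).  Rejecting here is a
# Type I error, and it should happen close to alpha = 5% of the time.
random.seed(3)
mu0, sigma, n, alpha, reps = 20, 5, 10, 0.05, 2000
type1 = 0
for _ in range(reps):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    if abs(z) > 1.960:          # two-sided test at the 5% level
        type1 += 1
print("Type I rate:", type1 / reps)   # near 0.05
```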
Reading: Chapter 14
Activity: Video 10 –
Shakespeare. Practice problems on
hypothesis testing.
Goals: Practice
contradiction reasoning, the basis of the scientific method.
Skills:
·
Become familiar with
"argument by contradiction".
When researchers are trying to
"prove" a treatment is better or that their hypothesized mean is the
right one, they will usually choose to assume the opposite as the null
hypothesis. For election polls,
they assume the candidate has 50% of the vote, and hope to show that is an
incorrect statement. For showing
that a local population differs from, say, a national population, they will
typically assume the national average applies to the local population, again
with the hope of rejecting that assumption. In all cases, we formulate the hypotheses before collecting data; therefore, you will never see a
sample average in either a null or alternative hypothesis.
·
Understand why we
reject the null hypothesis for small p-values. The p-value is the
probability of seeing a sample result as extreme as or "worse" than the one we
actually saw. In this sense,
"worse" means even more evidence against the null hypothesis; more
evidence favoring the alternative hypothesis. If this probability is small, it means either we have
observed a rare event, or that we have made an incorrect assumption, namely the
null hypothesis. Statisticians and
practitioners have agreed that 5% is a reasonable cutoff between a result that
contradicts the null hypothesis and a result that could be argued to be in
agreement with the null hypothesis.
Thus, we reject our claim only when the p-value is a small number.
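Here is a small worked p-value calculation in Python; the poll numbers (220 of 400 voters) are hypothetical, chosen only to keep the arithmetic clean:

```python
import math

def normal_cdf(z):
    """Standard normal P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical election-poll numbers: the null hypothesis says the
# candidate has 50% of the vote; in a sample of n = 400 voters,
# 220 support her, so phat = 0.55.
n, x, p0 = 400, 220, 0.50
phat = x / n
z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)   # z = 2.0
p_value = 1 - normal_cdf(z)                      # one-sided p-value
print("z =", z, " p-value =", round(p_value, 4))
# The p-value is about 0.023, below 0.05, so we reject the 50% claim.
```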
Reading: Chapter 15 (Skip power
calculations)
Activity: Testing Simulation. In this experiment, you will work in pairs
and generate data for your partner to analyze. Your partner will come up with a conclusion (either reject
the null hypothesis or fail to reject the null hypothesis) and you will let
them know if they made the right decision or not. Keep careful track of the success rates.
For each of these simulations, let the null hypothesis
mean be 20, n = 10, and sigma = 5.
You will let mu change for
each replication.
1) Without your partner knowing, choose either 16, 18, 20, 22, or 24 for mu. Then
use your calculator and generate 10 observations. Use randNorm(M,5,10)->L1 where M is the value of mu you chose for this replication. Clear the screen (so your partner can't
see what you did) and give them the calculator. They will perform a hypothesis test using the .05
significance level and tell you their decision.
2) Repeat step 1 until you have
each done 10 hypothesis tests; it is not necessary to use each value of mu
the same number of times, but try to use each one at least once, and use
mu = 20 at least twice. (We need more cases for mu = 20 because
we're using a small alpha level.)
3) Keep track of the results you
got (number of successful decisions and number of unsuccessful decisions) and
report them to me so we can all see the combined results.
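A Python version of the game above, assuming the same setup (σ = 5 known, n = 10, two-sided z-test of H0: mu = 20 at the 5% level); the rejection rate at each mu estimates the power of the test there:

```python
import random
import math

# For each candidate mu, generate 10 normal(mu, 5) observations and
# test H0: mu = 20.  The rejection rates trace out a power curve.
random.seed(4)
mu0, sigma, n, reps = 20, 5, 10, 500
rates = {}
for mu in (16, 18, 20, 22, 24):
    rejects = 0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        z = (xbar - mu0) / (sigma / math.sqrt(n))
        rejects += abs(z) > 1.960    # two-sided 5% test
    rates[mu] = rejects / reps
    print("mu =", mu, " rejection rate =", rates[mu])
```

At mu = 20 the rejection rate is the Type I error rate (near 5%); the farther mu drifts from 20, the more often the test correctly rejects.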
Goals: Interpret
significance level. Observe the
effects of different values of the population mean. Recognize limitations to inference.
Skills:
·
Interpret
significance level. Our value for rejecting, usually .05, is the
percentage of the time that we falsely reject a true null hypothesis. It does not measure whether we had a
random sample; it does not measure whether we have bias in our sample. It only measures whether random data could look like the
observed data.
·
Understand how the
chance of rejecting the null hypothesis changes when the population mean is
different than the hypothesized value.
When the population mean is not the hypothesized value, we expect to reject the null
hypothesis more often. This is
reasonable, because rejecting a false null hypothesis is a correct
decision. Likewise, when the null
hypothesis is in fact true, we hope to seldom decide to reject. If we have generated enough
replications in class, we should see a power curve emerge that tells us how
effective our test is for various values of the population mean.
·
Know the limitations
to confidence intervals and hypothesis tests. Chapter 15 has
examples of when our inference techniques are inappropriate. The main points to watch for are
non-random samples, misinterpreting what "rejecting the null
hypothesis" means, and misunderstanding what error the margin of error is
measuring. Be sure to read the
examples in Chapter 15 carefully as I will not go over them in detail in class.
Reading: Chapters 13 through 15.
Activity: Presentations. Confidence Intervals (Chapter 13) and
Hypothesis Tests (Chapter 14)
Sample 10 of the 50 states randomly.
Calculate:
1) a Confidence Interval for the true average state name length,
2) a Confidence Interval for the true average state capital population, and
3) a test of the hypothesis H0: average state land area = 70,000
square miles versus the hypothesis Ha: average state land area <
70,000 square miles.
Reading: Chapters 13 through 15.
Activity: Exam 4. This exam covers the basics of inferences for the two
techniques we've explored: confidence intervals and hypothesis tests. Also included are the cautions from
Chapter 15. Most of the exercises will be multiple choice. Chapter reviews are an excellent source
for studying for the exams. Don't
forget to review your class notes and recall what we saw in the videos.
Activity: Gosset Simulation. Take samples of size 5 from a normal
distribution. Use the sample standard deviation s instead of σ in the standard 95%
confidence z-interval. Repeat 100 times to see if the true
coverage is 95%. We will pool our
results to see how close we are to 95%.
A century ago, Gosset noticed this phenomenon and guessed what the true
distribution should be. A few
years later Sir R. A. Fisher proved that Gosset's guess was correct, and the t distribution was accepted by the statistical
community. Gosset was unable to
publish his results under his own name (to protect his employer's trade secrets), so he used
the pseudonym "Student".
You will therefore sometimes see the t distribution referred to as "Student's t distribution".
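Gosset's phenomenon is easy to replicate in Python; this sketch assumes a normal population with mean 65 and SD 5 (as in our earlier randNorm examples) and compares the z multiplier 1.960 to the t multiplier 2.776 (4 degrees of freedom):

```python
import random
import statistics
import math

random.seed(5)
mu, sigma, n, reps = 65, 5, 5, 2000
cover_z = cover_t = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)            # s used in place of sigma
    # 95% interval with the z multiplier (the "wrong" interval) ...
    me_z = 1.960 * s / math.sqrt(n)
    cover_z += xbar - me_z <= mu <= xbar + me_z
    # ... and with the t multiplier for n - 1 = 4 degrees of freedom
    me_t = 2.776 * s / math.sqrt(n)
    cover_t += xbar - me_t <= mu <= xbar + me_t
print("z coverage:", cover_z / reps)   # falls short of 0.95
print("t coverage:", cover_t / reps)   # close to 0.95
```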
Goals: Introduce
t-test. Understand how the z-test is inappropriate in most small sample
situations.
Skills:
·
Know why using the t-test or the t-interval when σ is
unknown is appropriate. When we use s instead of σ and do not
use the correct t distribution, we
find that our confidence intervals are too narrow, and our hypothesis tests
reject H0 too often.
·
Realize that the
larger the sample size, the less serious the problem. When we have larger
sample sizes, say 15 to 20, we notice that the simulated success rates are much
closer to the theoretical. Thus
the issue of t vs z is a moot point for large samples.
Reading: Chapter 16
Activity: Matched Pairs vs 2-Sample.
Goals: Recognize
when matched-pairs applies.
Skills:
·
Detect situations where
the matched pairs t-test is
appropriate. The nature of the matched pairs is that each value of
one of the variables is associated with a value of the other variable. The most common example is a repeated
measurement on a single individual, like a pre-test and a post-test. Other situations are natural pairs,
like a married couple, or twins. In
all cases, the variable we are really
interested in is the difference in the two scores or measurements. This single difference then makes the
matched pairs test a one-variable t-test.
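A minimal sketch of this idea in Python; the pre-test/post-test scores are made up for illustration, and 2.365 is the two-sided 5% t cutoff for 7 degrees of freedom:

```python
import statistics
import math

# Hypothetical pre-test and post-test scores for 8 students.
pre  = [62, 70, 55, 80, 66, 74, 59, 68]
post = [68, 74, 59, 83, 64, 80, 66, 71]

# The matched-pairs test is just a one-sample t-test on the differences.
diffs = [b - a for a, b in zip(pre, post)]
dbar = statistics.mean(diffs)
sd = statistics.stdev(diffs)
n = len(diffs)
t = dbar / (sd / math.sqrt(n))
print("mean difference =", dbar, " t =", round(t, 2))
# Compare t to the t table with n - 1 = 7 degrees of freedom
# (the two-sided 5% cutoff is 2.365).
```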
Reading: Chapter 17 (Skip F-test.)
Activity: Finish 2-sample work.
Goals: Complete 2-sample
t-test.
Skills:
·
Know the typical null
hypothesis for 2-sample hypothesis tests.
The typical null hypothesis
for 2-sample problems, both matched and independent samples, is that of
"no difference". For the
matched pairs, we say H0: μ = 0, and for the 2
independent samples we say H0: μ1 = μ2. As usual,
the null hypothesis is an equality statement, and the alternative is the
statement the researcher typically wants to end up concluding. In both 2-sample procedures, we
interpret confidence intervals as ranges for the difference in means, and hypothesis tests as whether the
observed difference in means is far from zero.
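A sketch of the independent-samples version in Python, with made-up data; this uses the unpooled (Welch) standard error, which is what the TI-83's 2-SampTTest computes when Pooled is set to No:

```python
import statistics
import math

# Two independent (hypothetical) samples; H0: mu1 = mu2 ("no difference").
group1 = [23, 19, 25, 30, 22, 27, 24]
group2 = [18, 21, 17, 22, 16, 20, 19]

n1, n2 = len(group1), len(group2)
x1, x2 = statistics.mean(group1), statistics.mean(group2)
s1, s2 = statistics.stdev(group1), statistics.stdev(group2)

# Unpooled ("Welch") 2-sample t statistic:
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t = (x1 - x2) / se
print("difference in means =", round(x1 - x2, 2), " t =", round(t, 2))
```

The confidence-interval version uses the same standard error: the interval is (x1 - x2) plus or minus t* times se, a range for the difference in means.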
Reading: Chapter 17 (Skip F-test.)
Activity: Video 11 –
Salem. Proportions.
Goals: Introduce
proportions.
Skills:
·
Detect situations
where proportions z-test is correct.
We have several conditions
that are necessary for using proportions.
We must have situations where only two outcomes are possible, such as yes/no,
success/failure, live/die, Rep/Dem, etc.
We must have independence between trials, which is typically simple to
justify; each successive measurement has nothing to do with the previous
one. We must have a constant
probability of success from trial to trial. We call this value p. And finally we must have
a fixed number of trials in mind beforehand; in contrast, some experiments continue
until a certain number of
successes has occurred.
·
Know the conditions
when the normal approximation is appropriate. In order to use the
normal approximation for proportions, we must have a large enough sample
size. The typical rule of thumb is
to make sure there are at least 5 successes and at least 5 failures in the
sample. For example, in a sample
of voters, there must be at least 5 Republicans and at least 5 Democrats, if we
are estimating the proportion or percentage of Democrats in our
population. (Recall the m&m's
example: when you each had fewer than 5 blue or green m&m's, I made you
take more until you had at least 5.)
·
Know the Plus 4
Method. A recent (1998) result from statistical research suggested
that the typical normal theory failed mysteriously in certain unpredictable
situations. Those researchers
found a convenient "fix": pretend there are 4 additional
observations, 2 successes and 2 failures.
By adding these pretend cases to our real cases, the resulting
confidence intervals almost magically capture the true parameter the stated percentage
of the time. Because this
"fix" is so simple, it is the recommended approach in all confidence
interval problems. Hypothesis testing procedures remain
unchanged.
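A Python sketch of the plus-four recipe; the example counts (2 successes in 10 trials) are chosen to show the small-sample case where the ordinary interval behaves worst:

```python
import math

def plus4_interval(x, n, zstar=1.960):
    """Plus-4 confidence interval for a proportion: pretend there are
    4 extra observations, 2 successes and 2 failures, then build the
    usual z interval from the adjusted counts."""
    p_tilde = (x + 2) / (n + 4)
    me = zstar * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - me, p_tilde + me

# Example: 2 successes in 10 trials.
lo, hi = plus4_interval(2, 10)
print(round(lo, 3), round(hi, 3))
```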
Reading: Chapter 18
Activity: 2-Sample
Proportions
Goals: 2-Sample
proportions.
Skills:
·
Detect situations
where the 2-proportion z-test is correct.
We need two independent samples, each with a yes/no outcome; the usual null hypothesis is that the two population proportions are equal.
Reading: Chapter 19
Activity: Video 12 –
AIDS
Goals: Conclude
course topics. Know everything.
Skills:
·
Be able to correctly
choose the technique from among the z-test, the t-test,
the matched pairs t-test,
the 2 sample t-test, and
tests for proportions. The key questions to ask are: means or proportions, one sample or two, matched or independent samples, and whether σ is known.
Reading: Chapters 16 through 19
Activity: Presentations. Statistical Inference (Chapters 16 to
19)
Make a claim, a statistical hypothesis, and test it. Gather appropriate data to test your claim. Discuss and justify any assumptions you
made. Explain why your test is the
appropriate technique.
Reading: Chapters 16 through 19
Activity: Exam 5. This last exam covers the t tests and intervals in Chapters 16 and 17, and the z tests and intervals for proportions in Chapters 18
and 19. The basic principles from the
last exam are still used here; the choice of the particular test changes. Due to the nature of these problems,
there will be some overlap with Exam 4 material. Most of the questions will be multiple choice. Chapter reviews are an excellent source
for studying for the exams. Don't
forget to review your class notes and recall what we saw in the videos.
Managed by: Chris Edwards
edwards@uwosh.edu
Last updated March 15, 2006