Using the World Wide Web for Teaching Statistics
This document contains problems that can be used in an introductory statistics course. The problems are grouped into the following categories.
Numerical Description of Data
The following sites have useful resources for teaching statistics.
The Dataset and Story Library: http://lib.stat.cmu.edu/DASL/
Data Surfing: http://it.stlawu.edu/~rlock/datasurf.html
StatLib Dataset Archive: http://lib.stat.cmu.edu/datasets/
Java Applets for Statistics: http://www.stat.duke.edu/sites/java.html
Multimedia Statistics Page: http://www.berrie.dds.nl/index.html
Rice Virtual Lab in Statistics: http://www.ruf.rice.edu/~lane/rvls.html
Journal of Statistics Education Data Archive: http://www.amstat.org/publications/jse/jse_data_archive.html
1. In a survey of 1000 people conducted in June 2002 by Strategy One/Colonial Williamsburg, respondents were asked which issue was most important to them from the choices given in the table below.
Issue | Percentage of Responses
Freedom of Speech | 26
Access to affordable health care | 20
Freedom of religion | 19
Opportunity of economic advancement | 12
Right to pursue an education | 12
Freedom of press | 3
Don’t know/none of the above | 8
(Source: Strategy One/Colonial Williamsburg, USA Today, August 13, 2002)
Draw a pie chart to describe the distribution.
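A pie chart gives each category a slice whose central angle is proportional to its percentage. The short Python sketch below converts the percentages in the table to angles. It uses no plotting library, and the variable names are only illustrative.

```python
# Percentages copied from the table in problem 1.
responses = {
    "Freedom of Speech": 26,
    "Access to affordable health care": 20,
    "Freedom of religion": 19,
    "Opportunity of economic advancement": 12,
    "Right to pursue an education": 12,
    "Freedom of press": 3,
    "Don't know/none of the above": 8,
}

# Each slice's central angle is its share of the full 360 degrees.
angles = {issue: pct / 100 * 360 for issue, pct in responses.items()}

for issue, angle in angles.items():
    print(f"{issue}: {angle:.1f} degrees")
```

The percentages should total 100 before the chart is drawn, which is worth checking for any table used this way.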
2. Twenty-five countries won medals in the 2002 Winter Olympics. The table below lists them along with the total number of medals each won.
Country | Medals | Country | Medals
Germany | 35 | Croatia | 4
USA | 34 | Korea | 4
Norway | 24 | Bulgaria | 3
Canada | 17 | Estonia | 3
Austria | 16 | Great Britain | 3
Russia | 16 | Australia | 2
Italy | 12 | Czech Republic | 2
France | 11 | Japan | 2
Switzerland | 11 | Poland | 2
China | 8 | Spain | 2
Netherlands | 8 | Belarus | 1
Finland | 7 | Slovenia | 1
Sweden | 6 | |
(Source: CNNSI.com http://sportsillustrated.cnn.com/olympics/2002/current_medal_tracker/)
a. Draw a pie chart to describe the distribution. What problems do you encounter?
b. Can you find a way to organize the data so that the graph is more successful?
3. A total of 202 countries participated in the 2004 Summer Olympics, and 75 countries won medals. The table below lists the top 25 countries along with the total number of medals each won.
Country | Medals | Country | Medals
USA | 103 | Romania | 19
Russia | 92 | Spain | 19
China | 63 | Hungary | 17
Australia | 49 | Greece | 16
Germany | 48 | Belarus | 15
Japan | 37 | Canada | 12
France | 33 | Bulgaria | 12
Italy | 32 | Brazil | 10
South Korea | 30 | Turkey | 10
Great Britain | 30 | Poland | 10
Cuba | 27 | Thailand | 8
Ukraine | 23 | Denmark | 8
Netherlands | 22 | |
(Source: CNNSI.com http://sportsillustrated.cnn.com/olympics/2004/medaltracker/medalTrackerByTotal.html)
a. Draw a pie chart to describe the distribution. What problems do you encounter?
b. Can you find a way to organize the data so that the graph is more successful?
4. The following table, based on the American Chamber of Commerce Researchers Association Survey for the second quarter of 2002, gives the prices (in dollars) of five items in 25 urban areas across the United States.
City | Apartment Rent | Phone Bill | Price of Gasoline | Visit to Doctor | Price of Beer
Montgomery (AL) | $576 | $22.28 | $1.335 | $52.33 | $7.88
Juneau (AK) | 1020 | 18.26 | 1.584 | 88.67 | 8.12
Tucson (AZ) | 689 | 21.03 | 1.347 | 54.80 | 7.79
Sacramento (CA) | 749 | 16.99 | 1.643 | 70.00 | 6.99
San Diego (CA) | 1306 | 24.57 | 1.632 | 75.20 | 7.99
Denver (CO) | 891 | 23.10 | 1.343 | 71.80 | 6.90
Hartford (CT) | 896 | 22.39 | 1.419 | 80.25 | 7.15
Jacksonville (FL) | 810 | 20.31 | 1.419 | 63.80 | 7.15
Bloomington (IN) | 678 | 19.95 | 1.402 | 56.67 | 6.99
New Orleans (LA) | 798 | 26.06 | 1.351 | 56.20 | 6.56
Boston (MA) | 1248 | 24.41 | 1.405 | 78.00 | 7.21
Grand Rapids (MI) | 678 | 22.40 | 1.499 | 59.20 | 8.03
Minneapolis (MN) | 815 | 25.16 | 1.366 | 72.20 | 7.49
Springfield (MO) | 568 | 18.25 | 1.309 | 63.72 | 7.89
Billings (MT) | 550 | 30.45 | 1.449 | 70.75 | 7.09
Buffalo (NY) | 714 | 33.71 | 1.413 | 53.00 | 7.03
Charlotte (NC) | 540 | 21.07 | 1.359 | 58.00 | 7.03
Akron (OH) | 686 | 21.16 | 1.519 | 59.40 | 7.29
Oklahoma City (OK) | 579 | 23.04 | 1.308 | 60.02 | 6.94
Portland (OR) | 753 | 20.92 | 1.403 | 72.40 | 7.69
Philadelphia (PA) | 1282 | 21.12 | 1.360 | 62.50 | 8.57
Austin (TX) | 1025 | 19.20 | 1.299 | 68.33 | 6.78
Richmond (VA) | 769 | 26.15 | 1.317 | 59.80 | 6.37
Spokane (WA) | 593 | 18.49 | 1.305 | 61.80 | 6.89
Charleston (WV) | 606 | 27.08 | 1.423 | 64.67 | 7.01
Apartment Rent: Monthly rent of an unfurnished 2-bedroom apartment (excluding all utilities except water), 1½ or 2 baths, approximately 950 square feet.
Phone Bill: Monthly telephone charges for a private residential line (customer owns instruments).
Price of Gasoline: Price of one gallon regular unleaded, national brand.
Visit to doctor: General practitioner’s routine examination of patient.
Price of Beer: Heineken’s 6-pack, 12-oz. containers, excluding deposit.
a. Prepare frequency distributions for the five variables.
b. Construct the relative frequency and percentage distribution for the five variables.
c. Draw histograms.
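One way to approach parts a and b is to group each variable into equal-width classes and count the observations in each class. The Python sketch below does this for the Price of Beer column only. The class limits (a lower limit of $6.00 and a class width of $0.50) are an assumption made for illustration, and other reasonable choices would work as well.

```python
from collections import Counter

# Price of Beer column, copied in the order of the table in problem 4.
beer = [7.88, 8.12, 7.79, 6.99, 7.99, 6.90, 7.15, 7.15, 6.99, 6.56,
        7.21, 8.03, 7.49, 7.89, 7.09, 7.03, 7.03, 7.29, 6.94, 7.69,
        8.57, 6.78, 6.37, 6.89, 7.01]

# Assumed class limits, expressed in cents to avoid rounding surprises.
LOWER_CENTS, WIDTH_CENTS = 600, 50

# Class index 0 is $6.00 to under $6.50, index 1 is $6.50 to under $7.00, ...
counts = Counter((round(p * 100) - LOWER_CENTS) // WIDTH_CENTS for p in beer)
n = len(beer)

print("class, frequency, relative frequency, percentage")
for k in sorted(counts):
    low = (LOWER_CENTS + k * WIDTH_CENTS) / 100
    high = low + WIDTH_CENTS / 100
    freq = counts[k]
    print(f"{low:.2f} to under {high:.2f}: {freq:2d}  {freq / n:.2f}  {100 * freq / n:.0f}%")
```

The same counts give the bar heights for the histogram in part c.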
5. The table below shows the average SAT scores for each of the 50 states and the District of Columbia for 1990 and 2000.
State | 1990 | 2000
Alabama | 1079 | 1114
Alaska | 1015 | 1034
Arizona | 1041 | 1044
Arkansas | 1077 | 1117
California | 1002 | 1015
Colorado | 1067 | 1071
Connecticut | 1002 | 1017
Delaware | 1006 | 998
D.C. | 950 | 980
Florida | 988 | 998
Georgia | 951 | 974
Hawaii | 985 | 1007
Idaho | 1066 | 1081
Illinois | 1089 | 1154
Indiana | 972 | 999
Iowa | 1172 | 1189
Kansas | 1129 | 1154
Kentucky | 1089 | 1098
Louisiana | 1088 | 1120
Maine | 991 | 1004
Maryland | 1008 | 1016
Massachusetts | 1001 | 1024
Michigan | 1063 | 1126
Minnesota | 1110 | 1175
Mississippi | 1090 | 1111
Missouri | 1089 | 1149
Montana | 1082 | 1089
Nebraska | 1121 | 1131
Nevada | 1022 | 1027
New Hampshire | 1028 | 1039
New Jersey | 993 | 1011
New Mexico | 1100 | 1092
New York | 985 | 1000
North Carolina | 948 | 988
North Dakota | 1157 | 1197
Ohio | 1048 | 1072
Oklahoma | 1095 | 1123
Oregon | 1024 | 1054
Pennsylvania | 987 | 995
Rhode Island | 986 | 1005
South Carolina | 942 | 966
South Dakota | 1150 | 1175
Tennessee | 1102 | 1116
Texas | 979 | 993
Utah | 1121 | 1139
Vermont | 1000 | 1021
Virginia | 997 | 1009
Washington | 1024 | 1054
West Virginia | 1034 | 1037
Wisconsin | 1111 | 1181
Wyoming | 1072 | 1090
(Source: College Entrance Examination Board, 2001)
a. Use graphs to display the two SAT score distributions. How has the distribution of average state scores changed over the decade?
b. Compute the paired difference by subtracting the 1990 score from the 2000 score for each state. Summarize these differences with a graph.
Numerical Description of Data
6. In an advertisement in USA Today (July 9, 2001), the company Net2Phone listed its long distance rates to 24 of the 250 countries to which it offers service.
Country | Cost per Minute (cents) | Country | Cost per Minute (cents)
Belgium | 7.9 | Italy | 9.9
Chile | 17 | Japan | 7.9
Canada | 3.9 | Mexico | 16
Colombia | 9.9 | Pakistan | 49
Dominican Republic | 15 | Philippines | 49
Finland | 9.9 | Puerto Rico | 21
France | 7.9 | Singapore | 11
Germany | 7.9 | South Korea | 9.9
Hong Kong | 7.9 | Taiwan | 9.9
India | 49 | United Kingdom | 7.9
Ireland | 7.9 | United States | 3.9
Israel | 8.9 | Venezuela | 22
a. Make a graphical display of these rates.
b. Find the mean and the median.
c. Find the standard deviation.
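Parts b and c can be checked with Python's standard statistics module. The sketch below assumes that the 24 listed rates are treated as a sample, so the standard deviation uses the n - 1 divisor. If the rates are treated as a whole population, statistics.pstdev would be used instead.

```python
import statistics

# Cost per minute (cents): left-hand column of the table, then right-hand column.
rates = [7.9, 17, 3.9, 9.9, 15, 9.9, 7.9, 7.9, 7.9, 49, 7.9, 8.9,
         9.9, 7.9, 16, 49, 49, 21, 11, 9.9, 9.9, 7.9, 3.9, 22]

mean = statistics.mean(rates)
median = statistics.median(rates)
sd = statistics.stdev(rates)  # sample standard deviation (divides by n - 1)

print(f"mean = {mean:.2f} cents, median = {median:.1f} cents, sd = {sd:.2f} cents")
```

Comparing the mean with the median is a quick way to see how strongly the three 49-cent rates pull the distribution to the right.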
7. The U.S. Department of Transportation collects data on the amount of gasoline sold in each state. The following data show the per capita (gallons used per person) consumption in the year 2000. Using appropriate graphical displays and summary statistics, write a report on the gasoline use by state in the year 2000.
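State | Gallons per Person | State | Gallons per Person
Alabama | 544.71 | Montana | 548.5
Alaska | 433.08 | Nebraska | 508.28
Arizona | 452.82 | Nevada | 446.17
Arkansas | 532.82 | New Hampshire | 542.86
California | 422.65 | New Jersey | 474.28
Colorado | 461.90 | New Mexico | 474.28
Connecticut | 431.04 | New York | 551.18
Delaware | 481.45 | North Carolina | 296.66
Florida | 542.36 | North Dakota | 513.3
Georgia | 452.82 | Ohio | 574.83
Hawaii | 327.27 | Oklahoma | 457.63
Idaho | 500.34 | Oregon | 520.42
Illinois | 406.66 | Pennsylvania | 441.44
Indiana | 518.7 | Rhode Island | 410.31
Iowa | 534.7 | South Carolina | 381.86
Kansas | 511.34 | South Dakota | 555.06
Kentucky | 510.9 | Tennessee | 586.58
Louisiana | 522.12 | Texas | 515.17
Maine | 542.36 | Utah | 498.66
Maryland | 542.82 | Vermont | 456.27
Massachusetts | 438.1 | Virginia | 584.03
Michigan | 502.77 | Washington | 506.92
Minnesota | 528.06 | West Virginia | 450.4
Mississippi | 559.29 | Wisconsin | 462
Missouri | 563.56 | Wyoming | 462.67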
8. The Gallup Poll conducted a representative telephone survey during the first quarter of 1999. Among the reported results was the following table concerning the preferred political party affiliation of respondents and their ages.
Age | Rep. | Dem. | Ind. | Total
18-29 | 241 | 351 | 409 | 1001
30-49 | 299 | 330 | 370 | 999
50-64 | 282 | 341 | 375 | 998
65+ | 279 | 382 | 343 | 1004
Total | 1101 | 1404 | 1497 | 4002
a. What percent of people surveyed were Republicans?
b. Do you think this might be a reasonable estimate of the percentage of all voters who are Republicans? Explain.
c. What percent of people surveyed were under 30 or over 65?
d. What percent of people were Independents under the age of 30?
e. What percent of Independents were under 30?
f. What percent of people under 30 were Independents?
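Parts d, e, and f use the same cell count (Independents under 30) but divide by different totals. The Python sketch below makes that distinction explicit. It reads the age groups as 18-29, 30-49, 50-64, and 65+, and the dictionary layout and variable names are only illustrative.

```python
# Counts from the table in problem 8, keyed by age group and party.
table = {
    "18-29": {"Rep": 241, "Dem": 351, "Ind": 409},
    "30-49": {"Rep": 299, "Dem": 330, "Ind": 370},
    "50-64": {"Rep": 282, "Dem": 341, "Ind": 375},
    "65+":   {"Rep": 279, "Dem": 382, "Ind": 343},
}

grand_total = sum(sum(row.values()) for row in table.values())
rep_total = sum(row["Rep"] for row in table.values())
ind_total = sum(row["Ind"] for row in table.values())
under30_total = sum(table["18-29"].values())
ind_under30 = table["18-29"]["Ind"]

pct_rep = 100 * rep_total / grand_total                  # part a
pct_ind_and_under30 = 100 * ind_under30 / grand_total    # part d: share of everyone
pct_under30_given_ind = 100 * ind_under30 / ind_total    # part e: share of Independents
pct_ind_given_under30 = 100 * ind_under30 / under30_total  # part f: share of under-30s

print(f"{pct_rep:.1f}% {pct_ind_and_under30:.1f}% "
      f"{pct_under30_given_ind:.1f}% {pct_ind_given_under30:.1f}%")
```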
9. The following table gives the number of home runs hit during the 2002 season by all of the baseball teams in the American League.
Team | Home Runs | Team | Home Runs | Team | Home Runs
Anaheim | 152 | Texas | 230 | Tampa Bay | 133
Boston | 177 | Chicago | 217 | Cleveland | 192
New York | 223 | Toronto | 187 | Detroit | 124
Seattle | 152 | Oakland | 205 | Baltimore | 165
Minnesota | 167 | Kansas City | 140 | |
a. Find the mean and median.
b. Find the standard deviation.
10. The following table shows the number of stolen bases (SB) by each of the 16 National League baseball teams during the 2002 season.
Team | SB | Team | SB | Team | SB
Colorado | 103 | Montreal | 118 | Cincinnati | 116
St. Louis | 86 | Florida | 177 | Milwaukee | 94
Arizona | 92 | Atlanta | 76 | San Diego | 71
San Francisco | 74 | Philadelphia | 104 | Chicago | 63
Los Angeles | 96 | New York | 87 | Pittsburgh | 86
Houston | 71 | | | |
a. Find the mean and median.
b. Find the standard deviation.
11. According to a survey by Food Processing, 85% of Americans say they eat home-cooked meals three or more times per week (Time, October 7, 2002). Suppose that this result is true for the current population of Americans.
a. Let X be a binomial random variable that denotes the number of Americans in a random sample of 12 who say they eat home-cooked meals three or more times per week. What are the possible values that X can assume?
b. Find the probability that in a random sample of 10 shoppers, exactly 4 faithfully buy the same cereal.
12. During hard economic times, people switch between brands while shopping and rarely stick to one brand. According to an Insight Express online survey of shoppers, only 24% faithfully buy a favorite cereal (CBS.MarketWatch.com, October 1, 2002). Assume that this percentage is true for the current population of all shoppers.
a. Let X be a binomial random variable that denotes the number who faithfully buy the same cereal in a random sample of 10 shoppers. What are the possible values that X can assume?
b. Find the probability that in a random sample of 10 shoppers, exactly 4 faithfully buy the same cereal.
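For a binomial random variable with n trials and success probability p, the probability of exactly k successes is C(n, k) p^k (1 - p)^(n - k). The Python sketch below applies this formula to problem 12, with n = 10 and p = 0.24. The helper name binomial_pmf is hypothetical, chosen only for this illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.24

# Part a: X can be any whole number of shoppers from 0 through n.
possible_values = list(range(n + 1))

# Part b: probability that exactly 4 of the 10 shoppers are loyal buyers.
prob_exactly_4 = binomial_pmf(4, n, p)

print(possible_values)
print(f"P(X = 4) = {prob_exactly_4:.4f}")
```

The same helper can be summed over a range of k values for questions phrased as "at least" or "at most".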
13. In a poll of 12- to 18-year-old females conducted by Harris Interactive for the Gillette Company, 40% of the young females said that they expected the United States to have a female president within 10 years (USA Today, October 1, 2002). Assume that this result is true for the current population of all 12- to 18-year-old females. Suppose a random sample of 16 females from this age group is selected. Find the probability that the number of young females in this sample who expect a female president within 10 years is
a. at least 9 b. at most 5 c. from 6 to 9
14. According to a 2001 study of college students conducted by Harvard University’s School of Public Health, 34.9% of the male students surveyed said they got drunk three or more times in the past 30 days. (USA Today, April 3, 2002). Assuming that this result holds true for all male college students, find the probability that in a random sample of 10 male college students, the number of students who got drunk three or more times in the past 30 days is
a. exactly 4 b. none c. exactly 8
15. The U.S. Bureau of Labor Statistics conducts periodic surveys to collect information on the labor market. According to one such survey, the average earnings of workers in retail trade were $10 per hour in August 2002 (Bureau of Labor Statistics News, September 18, 2002). Assume that the hourly earnings of such workers in August 2002 had a normal distribution with a mean of $10 and a standard deviation of $1.10. Find the probability that the hourly earnings of a randomly selected retail trade worker in August 2002 were
a. more than $12 b. between $8.50 and $10.80
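Normal probabilities such as those in problem 15 are usually read from a z table, but they can also be checked with the NormalDist class in Python's standard statistics module. The sketch below is one way to set up the two calculations, and the variable names are illustrative.

```python
from statistics import NormalDist

# Hourly earnings model stated in problem 15: mean $10, standard deviation $1.10.
earnings = NormalDist(mu=10.0, sigma=1.10)

# Part a: area to the right of $12.
p_more_than_12 = 1 - earnings.cdf(12.0)

# Part b: area between $8.50 and $10.80.
p_between = earnings.cdf(10.80) - earnings.cdf(8.50)

print(f"P(X > 12) = {p_more_than_12:.4f}")
print(f"P(8.50 < X < 10.80) = {p_between:.4f}")
```

Answers obtained from a printed table may differ slightly in the last digit because table z values are rounded to two decimal places.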
16. According to a survey by the Kaiser Family Foundation, employers paid an average of $7954 per employee in annual premiums to provide family health coverage for their employees, and each worker paid an average of $2084 toward these premiums (USA Today, September 6, 2002). Assume that the current annual premiums paid by all workers for family health coverage are normally distributed with a mean of $2084 and a standard deviation of $300.
a. Find the probability that a randomly selected worker pays more than $2500 per year toward the family health coverage premium.
b. What percentage of such workers are paying between $1800 and $2400 per year toward such premiums?
17. In a Visa USA poll of Americans, participants were asked which of the following had taught them the most about money management: school or mistakes. Sixty-four percent of the persons polled said that mistakes had taught them the most (USA Today, May 16, 2002). Assume that this result is true for the current population of all Americans. Find the probability that in a random sample of 400 Americans, the number who will say that mistakes have taught them the most about money management is
a. exactly 250 b. 260 to 272 c. at most 244
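With n = 400 the binomial distribution in problem 17 is commonly approximated by a normal distribution with mean np and standard deviation sqrt(np(1 - p)). The Python sketch below assumes that approach together with the usual continuity correction of 0.5. The helper prob_range is a hypothetical name used only here.

```python
from math import sqrt
from statistics import NormalDist

n, p = 400, 0.64

# Normal curve matched to the binomial mean and standard deviation.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

def prob_range(low, high):
    """Approximate P(low <= X <= high) using a continuity correction."""
    return approx.cdf(high + 0.5) - approx.cdf(low - 0.5)

p_exactly_250 = prob_range(250, 250)
p_260_to_272 = prob_range(260, 272)
p_at_most_244 = approx.cdf(244.5)

print(f"{p_exactly_250:.4f} {p_260_to_272:.4f} {p_at_most_244:.4f}")
```

The approximation is reasonable here because np and n(1 - p) are both far above the thresholds that textbooks usually require.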
18. According to a survey by Money magazine, 27% of women expect to support their parents financially (USA Today, June 19, 2002). Assume that this percentage holds true for the current population of all women. Suppose that a random sample of 300 women is taken.
a. Find the probability that exactly 79 of the women in this sample expect to support their parents financially.
b. Find the probability that at most 74 of the women in this sample expect to support their parents financially.
c. What is the probability that between 75 and 89 of the women in this sample expect to support their parents financially?
19. According to International Communications Research for Cingular Wireless, men talk an average of 594 minutes per month on their cell phones (USA Today, July 29, 2002). Assume that the current mean for all men who own cell phones is 594 minutes per month with a standard deviation of 160 minutes. Let X be the mean time spent per month talking on their cell phones by a random sample of 400 men who own cell phones. Find the mean and standard deviation of X.
20. According to the U.S. Bureau of Labor Statistics estimates, the average earnings of construction workers were $18.96 per hour in August 2002 (Bureau of Labor Statistics News, September 18, 2002). Assume that the current earnings of all construction workers are normally distributed with a mean of $18.96 per hour and a standard deviation of $3.60 per hour. Find the probability that the mean hourly earning of a random sample of 25 construction workers is
a. between $18 and $20 per hour
b. within $1 of the population mean
c. greater than the population mean by $1.50 or more
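When the population is normal, the mean of a random sample of size n is also normal, with the same mean and with standard deviation sigma / sqrt(n). The Python sketch below sets up problem 20 on that basis. Part c is read here as "$1.50 or more above the population mean", which is an interpretation of the wording.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.96, 3.60, 25

# Standard deviation of the sample mean (the standard error).
standard_error = sigma / sqrt(n)
x_bar = NormalDist(mu=mu, sigma=standard_error)

p_between_18_and_20 = x_bar.cdf(20) - x_bar.cdf(18)    # part a
p_within_1 = x_bar.cdf(mu + 1) - x_bar.cdf(mu - 1)     # part b
p_above_by_1_50 = 1 - x_bar.cdf(mu + 1.50)             # part c

print(f"standard error = {standard_error:.2f}")
print(f"{p_between_18_and_20:.4f} {p_within_1:.4f} {p_above_by_1_50:.4f}")
```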
21. According to CardWeb.com, the average credit card debt per household was $8367 in 2001 (USA Today, April 29, 2002). Assume that the probability distribution of all such current debts is skewed to the right with a mean of $8367 and a standard deviation of $2400. Find the probability that the mean of a random sample of 225 such debts is
a. between $8100 and $8500
b. within $200 of the population mean
c. greater than the population mean by $300 or more
22. A Maritz poll of adult drivers conducted in July 2002 found that 45% of them “often” or “sometimes” eat or drink while driving (USA Today, October 23, 2002). Assume that 45% of all current adult drivers “often” or “sometimes” eat or drink while driving. Let p be the proportion of adult drivers in a sample of 400 who behave this way. Find the mean and standard deviation of p and describe the shape of its sampling distribution.
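The sample proportion has mean p and standard deviation sqrt(p(1 - p) / n). The Python sketch below evaluates these for problem 22. The normality check uses the rule that np and n(1 - p) should both exceed 5, which is an assumption, since some textbooks use 10 instead.

```python
from math import sqrt

p, n = 0.45, 400

mean_p_hat = p                      # the sample proportion is unbiased for p
sd_p_hat = sqrt(p * (1 - p) / n)    # standard deviation of the sample proportion

# Rule-of-thumb check for an approximately normal sampling distribution.
approx_normal = n * p > 5 and n * (1 - p) > 5

print(f"mean = {mean_p_hat}, sd = {sd_p_hat:.4f}, approximately normal: {approx_normal}")
```

Problems 23 and 24 follow the same pattern with their own values of p and n.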
23. In a 2002 USA TODAY/CNN/Gallup poll, 37% of taxpayers said that the income tax they had to pay was not fair (USA Today, April 15, 2002). Assume that this percentage is true for the current population of all taxpayers. Let p be the proportion of taxpayers in a random sample of 300 who will say that the income tax they have to pay is not fair. Calculate the mean and standard deviation of p and comment on the shape of its sampling distribution.
24. In a Retirement Confidence survey of retired people, 51% said that retirement is better than they had expected (U.S. News & World Report, June 3, 2002). Assume that this percentage is true for the current population of all retirees. Let p be the proportion of retirees in a random sample of 225 who hold this opinion. Calculate the mean and standard deviation of p and describe the shape of its sampling distribution.
25. According to a 2002 survey by America Online, mothers with children under 18 years of age spent an average of 16.87 hours per week online (USA Today, May 7, 2002). Assume that the mean time spent online by all current mothers with children under 18 years of age is 16.87 hours per week with a standard deviation of 5 hours per week. Find the probability that the mean time spent online per week by a random sample of 100 such mothers is
a. greater than 17 hours
b. between 16.5 and 17.5 hours
c. within .75 hour of the population mean
d. less than the population mean by .75 hour or more
26. Due to sluggish economic conditions, the percentage of companies that host holiday parties for their employees has declined. According to a survey by Hewitt Associates, 64% of the companies hosted holiday parties in 2002 (USA Today, December 2, 2002). Assume that this result is based on a random sample of 400 U.S. companies.
a. Find a 98% confidence interval for the proportion of all U.S. companies who hosted holiday parties in 2002.
b. Explain why we need to make the confidence interval. Why can we not say that 64% of all U.S. companies hosted parties in 2002?
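A large-sample confidence interval for a proportion has the form p-hat ± z * sqrt(p-hat(1 - p-hat) / n). The Python sketch below applies it to part a of problem 26. It obtains the critical value from NormalDist.inv_cdf rather than from a table, so the result may differ slightly from a hand calculation that uses z = 2.33.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n, level = 0.64, 400, 0.98

# Critical value that leaves (1 - level) / 2 in each tail, about 2.326.
z = NormalDist().inv_cdf(1 - (1 - level) / 2)

standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error
interval = (p_hat - margin, p_hat + margin)

print(f"{interval[0]:.3f} to {interval[1]:.3f}")
```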
27. A May 2002 Gallup Poll found that only 8% of a random sample of 1012 adults approved of attempts to clone a human.
a. Find the margin of error for this poll if we want 95% confidence in our estimate of the percent of American adults who approve of cloning humans.
b. Explain what that margin of error means.
c. If we only need to be 90% confident, will the margin of error be larger or smaller? Explain.
d. Find that margin of error.
e. In general, if all other aspects of the situation remain the same, would smaller samples produce smaller or larger margins of error?
28. In the 1992 U.S. presidential election, Bill Clinton received 43% of the vote compared with 38% for George Bush, and 19% for Ross Perot. Suppose we had taken a random sample of 100 voters in an exit poll and asked them for whom they had voted.
a. Would you always get 43 votes for Clinton, 38 for Bush, and 19 for Perot in a sample of 100? Why or why not?
b. In 95% of such polls, our sample proportion of voters for Clinton should be between what two values?
c. In 95% of such polls, the sample proportion of Perot votes should be between what two numbers?
d. Would you expect the sample proportion of Perot votes to vary more, less, or about the same as the sample proportion of Bush votes? Why?
29. In May of 200, the Pew Research Foundation sampled 1593 respondents and asked how they obtain news. In Pew’s report, 33% of respondents say that they now obtain news from the Internet at least once a week.
a. Pew reports a margin of error of ±3% for this result. Explain what the margin of error means.
b. Pew also asked about investment information, and 21% of respondents reported that the Internet is their main source of this information. When limited to the 780 respondents who identified themselves as investors, the percent who rely on the Internet rose to 28%. How would you expect the margin of error for this statistic to change in comparison with the margin of error for the percentage of all respondents?
c. When restricted to the 239 active traders in the sample, Pew reports that 45% rely on the Internet for investment information. Find a confidence interval for this statistic.
d. How does the margin of error for your confidence interval compare with the values in parts a and b? Explain why.
30. In May of 2002, the Gallup Organization asked a random sample of 537 American adults this question:
If you could choose between the following two approaches, which do you think is the better penalty for murder, the death penalty or life imprisonment, with absolutely no possibility of parole?
Of those polled, 52% chose the death penalty.
a. Find a 95% confidence interval for the percentage of all American adults who favor the death penalty.
b. Based on your confidence interval, is it clear that the death penalty has majority support? Explain.
c. If pollsters wanted to follow up on this poll with another survey that could determine the level of support for the death penalty to within 2% with 98% confidence, how many people should they poll?
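The sample size needed for a given margin of error E comes from solving E = z * sqrt(p(1 - p) / n) for n, which gives n = (z / E)^2 * p(1 - p), rounded up. The Python sketch below applies this to part c. The function name sample_size and the two planning values for p (the conservative 0.5 and the earlier poll's 0.52) are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin, level, p_guess=0.5):
    """Smallest n giving the requested margin of error at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z / margin) ** 2 * p_guess * (1 - p_guess))

# Conservative planning value p = 0.5 (largest possible p(1 - p)).
n_conservative = sample_size(0.02, 0.98)

# Planning value taken from the earlier poll, p = 0.52.
n_using_poll = sample_size(0.02, 0.98, p_guess=0.52)

print(n_conservative, n_using_poll)
```

A hand calculation with a table value such as z = 2.33 gives a slightly larger n, because the table value is rounded up.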
31. During the 2000 season, the home team won 138 of the 240 regular season National Football League games. Is this strong evidence of a home field advantage in professional football? Test an appropriate hypothesis and state your conclusion. Be sure the appropriate assumptions and conditions are satisfied before you proceed.
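One way to set up problem 31 is as a one-proportion z test of H0: p = 0.5 against Ha: p > 0.5, where p is the probability that the home team wins. The Python sketch below follows that setup. The one-sided alternative is an assumption about how "home field advantage" should be formalized.

```python
from math import sqrt
from statistics import NormalDist

wins, games = 138, 240
p0 = 0.5                      # no home field advantage under the null hypothesis

p_hat = wins / games
# The standard error uses the null value p0, as is standard for this test.
standard_error = sqrt(p0 * (1 - p0) / games)
z = (p_hat - p0) / standard_error

# One-sided P-value: area to the right of the observed z.
p_value = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

The success/failure condition is easy to confirm: under H0 the expected numbers of home wins and home losses are both 120.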
32. In 2000, 19 million registered voters failed to vote in the presidential election. According to the U.S. Census Bureau’s Current Population Survey in November 2000, the most frequently given reason for not voting was “too busy,” cited by 20.9% of the respondents (USA Today, April 15, 2002). Suppose that a random sample of 250 registered voters who did not vote in the November 2002 midterm elections showed that 18.1% of them stated the main reason for not voting was that they were too busy. At the 5% level of significance, can you conclude that the percentage of the registered voters who did not vote in November 2002 because they were too busy was less than 20.9%?
33. According to a 2002 survey conducted by Harris Interactive for lawyers.com, 47% of Americans dream of owning a business (USA Today, September 10, 2002). Assume that this result was true for the population of Americans in 2002. A recent random sample of 1000 Americans found that 430 of them dream of owning a business. Test at the 5% significance level if the current percentage of Americans who dream of owning a business is different from 47%.
34. In a 2002 Affluent Americans and Their Money survey of “affluent” Americans (having an annual household income of $75,000 or more) conducted for Money magazine by RoperASW, 32% of the respondents indicated that they would have a serious problem paying an unexpected bill of $5000 (Money, Fall 2002). In a recent random sample of 1100 households with annual income of $75,000 or more, 396 said that they would have a serious problem paying an unexpected bill of $5000.
a. Test at the 2.5% significance level whether the current percentage of such households that would have a serious problem paying an unexpected bill of $5000 is higher than 32%.
b. What will your decision be in part (a) if the probability of making a Type I error is zero? Explain.
35. A Vermont study published in December 2001 by the American Academy of Pediatrics examined parental influence on teenagers’ decisions to smoke. A group of students who had never smoked were questioned about their parents’ attitudes toward smoking. These students were questioned again two years later to see if they had started smoking. The researchers found that among the 284 students who indicated that their parents disapproved of kids smoking, 54 had become established smokers. Among the 41 students who initially said their parents were lenient about smoking, 11 became smokers. Do these data provide strong evidence that parental attitude influences teenagers’ decisions about smoking?
a. What kind of design did the researchers use?
b. Write appropriate hypotheses.
c. Are the assumptions and conditions necessary for inference satisfied?
d. Test the hypothesis and state your conclusion.
e. Explain in this context what your p-value means.
f. If that conclusion is actually wrong, which type of error did you commit?
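Part d of problem 35 can be approached with a two-proportion z test that pools the two samples under the null hypothesis of equal smoking rates. The Python sketch below assumes a two-sided alternative, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 54, 284   # parents disapproved: students who became smokers, group size
x2, n2 = 11, 41    # parents lenient: students who became smokers, group size

p1, p2 = x1 / n1, x2 / n2

# Pooled proportion, used because H0 says the two rates are equal.
pooled = (x1 + x2) / (n1 + n2)
standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

z = (p1 - p2) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided P-value

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, z = {z:.2f}, P-value = {p_value:.3f}")
```

The lenient group is small, so the counts of smokers and nonsmokers in that group deserve particular attention when checking the conditions in part c.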
36. The August 2001 issue of Pediatrics reported on a study of adolescent suicide attempts. Questionnaires were given to 6577 middle and high school students, 214 of whom were adopted. Of the 6577 students 213 youngsters said they had attempted suicide within the last year: 16 of those who were adopted and 197 of those who were not. Does this indicate a significantly different rate of suicide among adopted teens?
a. Test an appropriate hypothesis and state your conclusion.
b. If you concluded there was a difference, estimate that difference with a confidence interval and interpret your interval in context.
37. Among 242 Cleveland area children born prematurely at low birth weights between 1977 and 1979, only 74% graduated from high school. Among a comparison group of 233 children of normal birth weight, 83% were high school graduates. (New England Journal of Medicine, 346, no. 3 [2002])
a. Find a 95% confidence interval for the difference in graduation rates between children of normal and very low birth weights. Be sure to check the appropriate assumptions and conditions.
b. Does this provide evidence that premature birth may be a risk factor for not finishing high school? Use your confidence interval to test an appropriate hypothesis.
c. Suppose your conclusion is incorrect. Which type of error did you make?
38. According to Money magazine, the average net worth of U.S. households in 2002 was $355,000 (Money, Fall 2002). Assume that this mean is based on a random sample of 500 households and that the sample standard deviation is $125,000. Find a 99% confidence interval for the 2002 mean net worth of all U.S. households.
39. According to a 2002 survey by America Online, mothers with children under age 18 spent an average of 16.87 hours per week online (USA Today, May 7, 2002). Suppose that this mean is based on a random sample of 1000 such mothers and that the standard deviation for this sample is 3.2 hours per week. Find a 95% confidence interval for the corresponding population mean for all such mothers.
40. According to Money magazine, the average price of new homes in the United States was $145,000 in 2002 (Money, Fall 2002). Assume that this mean is based on a random sample of 1000 new home sales and that the sample standard deviation is $24,000. Find a 95% confidence interval for the 2002 mean price of all such homes.
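Problems 38 to 40 all have large samples, so an interval of the form x-bar ± z * s / sqrt(n) is a common choice. The Python sketch below works through problem 40 on that assumption. A t-based interval would be nearly identical at n = 1000 but needs a t table or a library outside the standard one.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n, level = 145_000, 24_000, 1000, 0.95

# Critical value for a 95% interval, about 1.96.
z = NormalDist().inv_cdf(1 - (1 - level) / 2)

margin = z * s / sqrt(n)
interval = (x_bar - margin, x_bar + margin)

print(f"${interval[0]:,.0f} to ${interval[1]:,.0f}")
```

Changing x_bar, s, n, and level to the values in problems 38 and 39 gives the corresponding intervals.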
41. According to Money magazine, the average cost of a movie ticket in the United States was $5.70 in 2002 (Money, Fall 2002). Suppose that a random sample of 25 theaters in the United States yielded a mean movie ticket price of $5.70 with a standard deviation of $1.05. Assuming that movie ticket prices are normally distributed, find a 95% confidence interval for the mean price of movie tickets for all theaters in the United States.
42. In 1998, as an advertising campaign, the Nabisco Company announced a “1000 Chips Challenge,” claiming that every 18-ounce bag of their Chips Ahoy cookies contained at least 1000 chocolate chips. The students in a Statistics class at the Air Force Academy purchased some randomly selected bags of cookies, and counted the chocolate chips. Some of their data are given below. (Chance, 12, no. 1, 1999)
1219 1214 1087 1200 1419 1121 1325 1345
1244 1258 1356 1132 1191 1270 1295 1135
a. Find a 95% confidence interval for the average number of chips in bags of Chips Ahoy cookies.
b. What does this evidence say about Nabisco’s claim? Use your confidence interval to test an appropriate hypothesis and state your conclusion.
43. According to the U.S. Bureau of Labor Statistics, there were 8.1 million unemployed people aged 16 years and over in August 2002. The average duration of unemployment for these people was 16.3 weeks (Bureau of Labor Statistics News, September 6, 2002). Suppose that a recent random sample of 400 unemployed Americans aged 16 years and over gave a mean duration of unemployment of 16.9 weeks with a standard deviation of 4.2 weeks. Find the p-value for the hypothesis test with the alternative hypothesis that the current mean duration of unemployment for all unemployed Americans aged 16 years and over exceeds 16.3 weeks. Will you reject the null hypothesis at α = .02?
44. According to an estimate, Americans spend an average of $226 per year to “look good,” buying personal-care products and services such as tooth whitener, hair dyes, and sessions at salons (Reader’s Digest, September 2002). Suppose that a recent random sample of 250 Americans showed that they spent an average of $238 on such products and services this year with a standard deviation of $77.
a. Find the p-value for the test of hypothesis with the alternative hypothesis that the current mean annual amount spent on such products and services differs from $226.
b. If α = .01, would you reject the null hypothesis based on the p-value calculated in part a? What if α = .02?
45. According to the International Communications Research for Cingular Wireless, women talked an average of 394 minutes per month on their cell phones in 2002 (USA Today, July 29, 2002). Suppose that a recent sample of 295 women who own cell phones showed that the mean time they spend per month talking on their cell phones is 402 minutes with a standard deviation of 81 minutes.
a. At the 2% level of significance, can you conclude that the mean time spent talking on their cell phones by all women who own cell phones is currently more than 394 minutes per month?
b. What is the Type I error in this case? What is the probability of making this error in part a?
46. According to Money magazine, the average cost of a visit to a doctor’s office in the United States was $60 in 2002 (Money, Fall 2002). Suppose that a recent random sample of 25 visits to doctors gave a mean of $63.50 and a standard deviation of $2.00. Using the 5% significance level, can you conclude that the current mean cost of a visit to a doctor’s office exceeds $60? Assume that such costs for all visits to doctors are normally distributed.
47. According to the U.S. Bureau of Labor Statistics, production workers in the mining industry worked an average of 43.5 hours per week in June 2002 (Bureau of Labor Statistics News, September 6, 2002). A random sample of 24 production workers selected recently from a large mining company, Low Yield Mine, found that they work an average of 41.7 hours per week with a standard deviation of 1.3 hours per week. Assume that the weekly working hours of all these employees are normally distributed.
a. Suppose that the probability of making a Type I error is selected to be zero. Can you conclude that workers at Low Yield Mine work less than 43.5 hours per week? Answer without performing the five steps of a test of hypothesis.
b. Using the 5% level of significance, can you conclude that workers at Low Yield Mine work less than 43.5 hours per week?
48. In the 2002 National Geographic Society-RoperASW poll of geographic knowledge, young adults of 18 to 24 years of age from the United States and 8 other nations were asked to identify 11 countries on a numbered map of Asia (National Geographic, December 2002). The two highest-scoring nations were Germany, with an average of 6.7 correct identifications out of 11, and Sweden, with an average of 6.3 correct answers. Suppose that these means were based on random samples of 400 young adults from Germany and 600 from Sweden, and that the sample standard deviations of scores for Germany and Sweden were .7 and .8, respectively.
a. Let $\mu_1$ and $\mu_2$ be the population means for Germany and Sweden, respectively. Find the point estimate of $\mu_1 - \mu_2$ and its margin of error.
b. Find a 98% confidence interval for $\mu_1 - \mu_2$. Using the 5% level of significance, can you conclude that the mean score for all young adults from Germany is greater than that of all young adults from Sweden?
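The calculations in problem 48 can be sketched in Python as follows. The sketch assumes the large-sample (normal) procedure for two independent samples, with the sample standard deviations standing in for the population values. Treating the margin of error as the 95% half-width is also an assumption, since the problem does not name a level for it.

```python
import math
from statistics import NormalDist

# Summary statistics stated in problem 48.
n1, xbar1, s1 = 400, 6.7, 0.7   # Germany
n2, xbar2, s2 = 600, 6.3, 0.8   # Sweden

point_estimate = xbar1 - xbar2
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)


def z_critical(confidence):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


# Assumption: margin of error reported at the 95% level.
margin_95 = z_critical(0.95) * standard_error

# 98% confidence interval for mu1 - mu2.
margin_98 = z_critical(0.98) * standard_error
ci_98 = (point_estimate - margin_98, point_estimate + margin_98)

# Test statistic for H0: mu1 = mu2 versus H1: mu1 > mu2.
z_stat = point_estimate / standard_error

print(f"estimate = {point_estimate:.2f}, 95% margin = {margin_95:.3f}")
print(f"98% CI = ({ci_98[0]:.3f}, {ci_98[1]:.3f}), z = {z_stat:.2f}")
```

Problem 49 has the same structure, so only the summary statistics and confidence levels would change.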
49. According to Smith Travel Research, the mean hotel room rates in the United States were $85.69 and $84.58 per day in 2000 and 2001, respectively (USA Today, September 4, 2002). Suppose that these mean rates were based on random samples of 1000 hotel rooms for 2000 and 1100 hotel rooms for 2001; further assume that the standard deviations of these rates for the two samples were $18.50 and $18, respectively.
a. Let $\mu_1$ and $\mu_2$ be the mean rates for all hotel rooms in the United States in 2000 and 2001, respectively. What are the point estimate of $\mu_1 - \mu_2$ and its margin of error?
b. Find a 90% confidence interval for $\mu_1 - \mu_2$.
c. Test at the 1% significance level whether the mean hotel room rate for 2000 was higher than that for 2001.
50. In June 2002, the Journal of Applied Psychology reported on a study that examined whether the content of TV shows influenced the ability of viewers to recall brand names of items featured in the commercials. The researchers randomly assigned volunteers to watch one of three programs, each containing the same nine commercials. One of the programs had violent content, another sexual content, and the third neutral content. After the shows ended the subjects were asked to recall the brands of products that were advertised. Results are summarized below.

                  Violent   Sexual   Neutral
No. of subjects   108       108      108
Mean              2.08      1.71     3.17
St. Dev.          1.87      1.76     1.77
a. Do the results indicate that viewer memory for ads may differ depending on the program content?
b. Is there evidence that viewer memory for ads may differ between programs with sexual content and those with neutral content?
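For part a of problem 50, one possible approach is a one-way analysis of variance computed directly from the summary table. The sketch below assumes independent groups with roughly equal variances; the dictionary keys and variable names are illustrative only.

```python
# (sample size, mean, standard deviation) for each program type,
# copied from the summary table in problem 50.
groups = {
    "violent": (108, 2.08, 1.87),
    "sexual": (108, 1.71, 1.76),
    "neutral": (108, 3.17, 1.77),
}

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

# Between-group and within-group sums of squares from summary statistics.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Assumption: the 5% critical value of F with (2, 321) degrees of freedom
# is roughly 3.0 in standard F tables.
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

Part b compares only two of the groups, so a two-sample test on the sexual and neutral rows would be the natural follow-up.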
51. In a full-page ad that ran in many U.S. newspapers in August 2002, a Canadian discount pharmacy listed costs of drugs that could be ordered from a Website in Canada. The table compares prices (in US$) for commonly prescribed drugs.
COST PER 100 PILLS
Drug        United States   Canada   Percent Savings
Cardizem    131             83       37
Celebrex    136             72       47
Cipro       374             219      41
Pravachol   370             166      55
Premarin    61              17       72
Prevacid    252             214      15
Prozac      263             112      57
Tamoxifen   349             50       86
Vioxx       243             134      45
Zantac      166             42       75
Zocor       365             200      45
Zoloft      216             105      51
a. Find a 95% confidence interval for the average savings in dollars.
b. Find a 95% confidence interval for the average savings in percent.
c. Which analysis do you think is more appropriate? Why?
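Because each drug has both a U.S. and a Canadian price, part a of problem 51 can be treated as a paired-differences problem. The sketch below assumes the 12 drugs behave like a random sample and that the dollar savings are roughly normal. The critical value is an assumption taken from a t table for 11 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Prices per 100 pills, in table order (Cardizem ... Zoloft).
us = [131, 136, 374, 370, 61, 252, 263, 349, 243, 166, 365, 216]
canada = [83, 72, 219, 166, 17, 214, 112, 50, 134, 42, 200, 105]

# Paired differences: dollar savings for each drug.
savings = [u - c for u, c in zip(us, canada)]
n = len(savings)
mean_savings = mean(savings)
standard_error = stdev(savings) / math.sqrt(n)

# Assumption: two-sided 95% critical value for df = 11 from a t table.
t_crit = 2.201

ci = (mean_savings - t_crit * standard_error,
      mean_savings + t_crit * standard_error)
print(f"mean savings = {mean_savings}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Replacing `savings` with the Percent Savings column gives the interval asked for in part b.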
52. Ever since Lou Gehrig developed amyotrophic lateral sclerosis (ALS), this deadly condition has been commonly known as Lou Gehrig’s disease. Some believe that ALS is more likely to strike athletes or the very fit. Columbia University neurologist Lewis P. Rowland recorded personal histories of 431 patients he examined between 1992 and 2002. He diagnosed 280 as having ALS; 38% of them had been varsity athletes. The other 151 had other neurological disorders, and only 26% of them had been varsity athletes. (Science News, Sept. 28, 2002). Is there evidence that ALS is more common among athletes?
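A possible starting point for problem 52 is a two-proportion z test comparing the share of former varsity athletes in the two patient groups. The sketch assumes independent samples and uses the reported percentages directly, because the exact counts are not given. Note that it compares athlete rates between diagnoses, which is only indirect evidence about ALS rates among athletes.

```python
import math
from statistics import NormalDist

# Group sizes and reported proportions of former varsity athletes.
n_als, p_als = 280, 0.38       # patients diagnosed with ALS
n_other, p_other = 151, 0.26   # patients with other disorders

# Pooled proportion under H0: the two proportions are equal.
pooled = (n_als * p_als + n_other * p_other) / (n_als + n_other)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_als + 1 / n_other))

# One-sided test of H1: proportion among ALS patients is higher.
z_stat = (p_als - p_other) / standard_error
p_value = 1 - NormalDist().cdf(z_stat)

print(f"z = {z_stat:.2f}, one-sided p-value = {p_value:.4f}")
```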
53. A study of the health behavior of school-aged children asked a sample of 15-year-olds in several different countries if they had been drunk at least twice. The results are shown in the table, by gender. Find a 95% confidence interval for the difference in the rates for males and females. Be sure to check the assumptions that support your chosen procedure, and explain what your interval means. (Health and Health Behavior Among Young People. Copenhagen: World Health Organization, 2000)
Percent of 15-Year-Olds Drunk at Least Twice
Country       Female   Male
Denmark       63       71
Wales         63       72
Greenland     59       58
England       62       51
Finland       58       52
Scotland      56       53
No. Ireland   44       53
Slovakia      31       49
Austria       36       49
Canada        42       42
Sweden        40       40
Norway        41       37
Ireland       29       42
Germany       31       36
Latvia        23       47
Estonia       23       44
Hungary       22       43
Poland        21       39
USA           29       34
Czech Rep.    22       36
Belgium       22       36
Russia        25       32
Lithuania     20       32
France        20       29
Greece        21       24
Switzerland   16       25
Israel        10       18
54. In March 2002, Consumer Reports listed the rate of return for several large cap mutual funds over the previous 3-year and 5-year periods. (“Large cap” refers to companies worth over $10 billion.)
Annualized Returns (%)
Fund Name                       3-year   5-year
Ameristock                      7.9      17.1
Clipper                         14.1     18.2
Credit Suisse Strategic Value   5.5      11.5
Dodge & Cox Stock               15.2     15.7
Excelsior Value                 13.1     16.4
Harbor Large Cap Value          6.3      11.5
ICAP Discretionary Equity       6.6      11.4
ICAP Equity                     7.6      12.4
Neuberger Berman Focus          9.8      13.2
PBHG Large Cap Value            10.7     18.1
Pelican                         7.7      12.1
Price Equity Income             6.1      10.9
USAA Cornerstone Strategy       2.5      4.9
Vanguard Equity Income          3.5      11.3
Vanguard Windsor                11.0     11.0
a. Find a 95% confidence interval for the difference in rate of return for the 3- and 5-year periods covered by these data. Clearly explain what your interval means.
b. It’s common for advertisements to carry the disclaimer that “past returns may not be indicative of future performance,” but do these data indicate that there was an association between 3-year and 5-year rates of return?
55. In 2000, the Journal of the American Medical Association published a study that examined a sample of pregnancies that resulted in the birth of twins. Births were classified as preterm with intervention (induced labor or cesarean), preterm without such procedures, or term or post-term. Researchers also classified the pregnancies by the level of prenatal medical care the mother received (inadequate, adequate, or intensive). The data, from the years 1995-97, are summarized in the table below. Figures are in thousands of births. (JAMA 284, 2000)
TWIN BIRTHS 1995-1997 (IN THOUSANDS)
Level of care   Preterm (induced or cesarean)   Preterm (without procedures)   Term or post-term   Total
Intensive       18                              15                             28                  61
Adequate        46                              43                             65                  154
Inadequate      12                              13                             38                  63
Total           76                              71                             131                 278
Is there evidence of an association between the duration of the pregnancy and the level of care received by the mother?
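A chi-square test of independence is one way to approach problem 55. The sketch below computes the statistic from the table entries exactly as printed. That is an assumption worth stating in an answer: the figures are in thousands, and the statistic scales with the unit used for the counts. The helper name `chi_square_statistic` is illustrative, and the critical value comes from a standard chi-square table.

```python
# Observed table from problem 55 (rows: level of care; columns: birth type).
observed = [
    [18, 15, 28],   # intensive
    [46, 43, 65],   # adequate
    [12, 13, 38],   # inadequate
]


def chi_square_statistic(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


chi2, df = chi_square_statistic(observed)

# Assumption: 5% critical value for df = 4 from a chi-square table.
chi2_crit = 9.488
print(f"chi-square = {chi2:.2f}, df = {df}, exceeds critical: {chi2 > chi2_crit}")
```

The same function can be reused for the two-way tables in problems 56, 59, and 60 by changing `observed`.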
56. The Gallup Poll conducted a representative telephone survey during the first quarter of 1999. Among the reported results was the following table concerning the preferred political party affiliation of respondents and their ages. Is there evidence of age-based differences in party affiliation in the United States?

Age     Republican   Democratic   Independent   Total
18-29   241          351          409           1001
30-49   299          330          370           999
50-64   282          341          375           998
65+     279          382          343           1004
Total   1101         1404         1497          4002
a. Will you conduct a test of homogeneity or independence? Why?
b. Test an appropriate hypothesis.
c. State your conclusion, including an analysis of differences you find (if any).
57. In January 2002, 725 people receiving outplacement assistance, with incomes of $60,000 to $150,000, were asked how long they could comfortably afford to be unemployed (Business Week, April 15, 2002). Eight percent said “less than three months,” 46% said “up to six months,” 26% said “up to a year,” and 20% said “more than a year.” Assume that these results are true for the 2002 population of such people. Suppose we denote the above responses by A, B, C, and D, respectively. Recently, 500 such people were randomly selected and asked the same question. The following table summarizes their responses.
Response           A    B     C     D
Number of people   48   242   120   90
Test at the 5% significance level whether the current distribution of responses differs from the one for 2002.
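Problem 57 is a goodness-of-fit question, and the sketch below shows one way to compute the chi-square statistic in Python. The 2002 percentages are taken as the hypothesized distribution, as the problem instructs. The critical value is an assumption read from a chi-square table for 3 degrees of freedom.

```python
# Observed responses in the recent sample of 500 people.
observed = {"A": 48, "B": 242, "C": 120, "D": 90}

# Hypothesized proportions from the 2002 survey.
p_2002 = {"A": 0.08, "B": 0.46, "C": 0.26, "D": 0.20}

n = sum(observed.values())
expected = {k: n * p for k, p in p_2002.items()}

# Chi-square goodness-of-fit statistic.
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Assumption: 5% critical value for df = 3 from a chi-square table.
chi2_crit = 7.815
reject_h0 = chi2 > chi2_crit
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

Problem 58 follows the same steps with its own percentages, a sample of 400, and the 1% critical value.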
58. In an At-A-Glance Communications 2002 survey, office workers were asked how long they normally took to respond to email. Thirty-six percent said “as soon as I return to my desk,” 35% said “within an hour or two,” 24% said “before the end of the business day,” and 5% said “when I can” (USA Today, May 7, 2002). Assume that these results hold true for the population of all office workers in 2002. Suppose we denote these responses by A, B, C, and D, respectively. A recent random sample of 400 office workers was asked the same question and it yielded the frequency distribution shown in the following table.
Response    A     B     C     D
Frequency   128   142   116   14
Using the 1% significance level, can you conclude that the current distribution of responses differs from the 2002 distribution?
59. A survey conducted from June 21 through August 7, 2002 studied “affluent” Americans with household incomes of $75,000 or more per year (Money, Fall 2002). Part of that survey examined the relationship between the use of a financial advisor and ownership of stocks. Assuming that this portion of the survey was based on a random sample of 400 affluent Americans, the percentages given in the magazine would yield the numbers shown in the following table.


                              Own Stocks   Do Not Own Stocks
Use financial advisor?  Yes   165          135
                        No    43           57
At the 5% significance level, can you conclude that use of a financial advisor is related to stock ownership for all affluent Americans?
60. In a Knowledge Networks/Statistical Research 2002 survey, 8- to 17-year-olds were asked which medium was their favorite (The Reader’s Digest, November 2002). If the survey were based on random samples of 500 boys and 500 girls, the percentages given in the magazine article would have yielded the following table.

        Internet   TV    Phone   Radio   Other
Boys    190        170   60      60      20
Girls   140        85    155     85      35
Using the 1% significance level, test the null hypothesis that the distribution of media preferences is the same for boys and girls in this age group.
61. The following table gives horsepower ratings and expected gas mileage for several 2001 vehicles.
Vehicle           Horsepower (hp)   Gas mileage (mpg)
Audi A4           170               22
Buick LeSabre     205               20
Chevy Blazer      190               15
Chevy Prizm       125               31
Ford Excursion    310               10
GMC Yukon         285               13
Honda Civic       127               29
Hyundai Elantra   140               25
Lexus 300         215               21
Lincoln LS        210               23
Mazda MPV         170               18
Olds Alero        140               23
Toyota Camry      194               21
VW Beetle         115               29
a. Make a scatterplot for these data.
b. Describe the direction, form, and scatter of the plot.
c. Find the correlation between horsepower and miles per gallon.
d. Write a few sentences telling what the plot says about fuel economy.
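For part c of problem 61, the correlation can be computed from its definition, as in this Python sketch. The helper name `pearson_r` is illustrative. A scatterplot should still be examined first, since the coefficient only summarizes linear association.

```python
import math

# Horsepower and miles per gallon, in table order (Audi A4 ... VW Beetle).
hp = [170, 205, 190, 125, 310, 285, 127, 140, 215, 210, 170, 140, 194, 115]
mpg = [22, 20, 15, 31, 10, 13, 29, 25, 21, 23, 18, 23, 21, 29]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(hp, mpg)
print(f"r = {r:.3f}")
```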
62. The following table shows the oil production of the United States from 1949 to 2000 (in thousands of barrels per year).
Year   Oil         Year   Oil         Year   Oil         Year   Oil
1949   1,841,940   1962   2,676,189   1975   3,056,779   1988   2,979,123
1950   1,973,574   1963   2,752,723   1976   2,976,180   1989   2,778,773
1951   2,247,711   1964   2,786,822   1977   3,009,265   1990   2,684,687
1952   2,289,836   1965   2,848,514   1978   3,178,216   1991   2,707,039
1953   2,357,082   1966   3,027,763   1979   3,121,310   1992   2,624,632
1954   2,314,988   1967   3,215,742   1980   3,146,365   1993   2,499,033
1955   2,484,428   1968   3,329,042   1981   3,128,624   1994   2,431,476
1956   2,617,283   1969   3,371,751   1982   3,156,715   1995   2,394,268
1957   2,616,901   1970   3,517,450   1983   3,170,999   1996   2,366,017
1958   2,448,987   1971   3,453,914   1984   3,249,696   1997   2,354,831
1959   2,574,590   1972   3,455,368   1985   3,274,553   1998   2,281,919
1960   2,574,933   1973   3,360,903   1986   3,168,252   1999   2,146,732
1961   2,621,758   1974   3,202,585   1987   3,047,378   2000   2,135,062
a. Find the correlation between year and production.
b. A reporter concludes that a low correlation between year and production shows that oil production has remained steady over the 50-year period. Do you agree with this interpretation? Explain.
c. Fit a least squares regression line to oil production by year.
d. Using this regression line, predict U.S. oil production in the year 2001.
e. Does the prediction in part d look reasonable? Comment.
f. Do you think the regression line is an appropriate model? Comment.
63. The following table gives the total 2002 payroll (rounded to the nearest million dollars) and the percentage of games won during the 2002 season by each of the National League baseball teams.
Team                    Total Payroll   Percentage of Games Won
Arizona Diamondbacks    103             60.5
Atlanta Braves          93              63.1
Chicago Cubs            76              41.4
Cincinnati Reds         45              48.1
Colorado Rockies        57              45.1
Florida Marlins         42              48.8
Houston Astros          63              51.9
Los Angeles Dodgers     95              56.8
Milwaukee Brewers       50              34.6
Montreal Expos          39              51.2
New York Mets           95              46.6
Philadelphia Phillies   58              49.7
Pittsburgh Pirates      42              44.7
St. Louis Cardinals     75              59.9
San Diego Padres        41              40.7
San Francisco Giants    78              59.0
a. Find the least squares regression line with total payroll as an independent variable and percentage of games won as a dependent variable.
b. Give a brief interpretation of the values of the y-intercept and the slope.
c. Predict the percentage of games won for a team with a total payroll of $55 million.
64. The following table gives the total 2002 payroll (rounded to the nearest million dollars) and the percentage of games won during the 2002 season by each of the American League baseball teams.
Team                   Total Payroll   Percentage of Games Won
Anaheim Angels         62              61.1
Baltimore Orioles      60              41.4
Boston Red Sox         108             57.4
Chicago White Sox      57              50.0
Cleveland Indians      79              45.7
Detroit Tigers         55              34.2
Kansas City Royals     47              38.3
Minnesota Twins        40              58.4
New York Yankees       126             64.0
Oakland A’s            40              63.6
Seattle Mariners       80              57.4
Tampa Bay Devil Rays   34              34.2
Texas Rangers          106             44.4
Toronto Blue Jays      77              48.1
a. Find the least squares regression line with total payroll as an independent variable and percentage of games won as a dependent variable.
b. Give a brief interpretation of the values of the y-intercept and the slope.
c. Predict the percentage of games won for a team with a total payroll of $65 million.
65. The following table gives the average hotel room rates in the United States for the years 1992-2001.
Year   Average Hotel Room Rate
1992   $59.39
1993   60.99
1994   63.35
1995   66.34
1996   70.68
1997   74.77
1998   78.24
1999   81.59
2000   85.69
2001   84.58
a. Assign a value of 0 to 1992, 1 to 1993, 2 to 1994, and so on. Call this new variable Time. Make a new table with the variables Time and Average Hotel Room Rate.
b. Construct a scatter diagram for these data. Does the scatter diagram exhibit a linear positive relationship between time and average hotel room rates?
c. Find the least squares regression line.
d. Compute the correlation coefficient r.
e. Predict the average hotel room rate for 2006. Comment on this prediction.
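Parts c through e of problem 65 can be worked through with a short Python sketch such as the one below. It uses the Time coding from part a (0 for 1992) and the usual least squares formulas. The names `time_index` and `predict` are illustrative. The 2006 prediction extrapolates five years beyond the data, which is worth noting in the comment the problem asks for.

```python
import math

# Average hotel room rates for 1992 through 2001, in table order.
rates = [59.39, 60.99, 63.35, 66.34, 70.68, 74.77, 78.24, 81.59, 85.69, 84.58]
time_index = list(range(len(rates)))   # 0 = 1992, 1 = 1993, ..., 9 = 2001

n = len(rates)
mean_t = sum(time_index) / n
mean_y = sum(rates) / n

# Sums of squares and cross products about the means.
sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(time_index, rates))
sxx = sum((t - mean_t) ** 2 for t in time_index)
syy = sum((y - mean_y) ** 2 for y in rates)

slope = sxy / sxx
intercept = mean_y - slope * mean_t
r = sxy / math.sqrt(sxx * syy)


def predict(year):
    """Predicted average rate for a calendar year, using Time = year - 1992."""
    return intercept + slope * (year - 1992)


rate_2006 = predict(2006)
print(f"rate = {intercept:.2f} + {slope:.3f} * Time, r = {r:.3f}")
print(f"predicted rate for 2006: {rate_2006:.2f}")
```

The same formulas give the regression lines requested in problems 63, 64, and 66 once the payroll or diamond data are substituted.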
66. The following table shows the prices of ladies’ diamond rings and the weights of their diamond stones. (Journal of Statistics Education, 1996)
Weight   Price   Weight   Price   Weight   Price   Weight   Price   Weight   Price   Weight   Price
.17      355     .21      483     .12      223     .17      353     .32      919     .25      655
.16      328     .15      323     .26      663     .18      438     .15      298     .35      1086
.17      350     .18      462     .25      750     .17      318     .16      339     .18      443
.18      325     .28      823     .27      720     .18      419     .16      338     .25      678
.25      642     .16      336     .18      468     .17      346     .23      595     .25      675
.16      342     .20      498     .16      345     .15      315     .23      553     .15      287
.15      322     .23      595     .17      352     .17      350     .17      345     .26      693
.19      485     .29      860     .16      332     .32      918     .33      945     .15      316
a. Construct a scatter diagram for these data.
b. Find the least squares regression line.
c. Compute the correlation coefficient r.