# Hypothesis Testing, Correlation Testing, ANOVA & Regression

Please see attached for better formatting on the tables.

8. Flat Tire and Missed Class - A classic tale involves four carpooling students who missed a test and gave as an excuse a flat tire. On the makeup test, the instructor asked the students to identify the particular tire that went flat. If they really didn't have a flat tire, would they be able to identify the same tire? The author asked 41 other students to identify the tire they would select. The results are listed in the following table (except for one student who selected the spare). Use a 0.05 significance level to test the author's claim that the results fit a uniform distribution. What does the result suggest about the ability of the four students to select the same tire when they really didn't have a flat?

| Tire | Left front | Right front | Left rear | Right rear |
| --- | --- | --- | --- | --- |
| Number selected | 11 | 15 | 8 | 6 |
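As a rough sketch of how the goodness-of-fit statistic for this problem could be computed, the snippet below uses only the counts from the table. The critical value is an assumption copied from a standard chi-square table (3 degrees of freedom, right-tail area 0.05), and the variable names are illustrative only.

```python
# Observed counts from the table: left front, right front, left rear, right rear.
observed = [11, 15, 8, 6]

# Under a uniform distribution every tire is equally likely,
# so each expected count is n / k.
n = sum(observed)
expected = [n / len(observed)] * len(observed)

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumption: critical value taken from a standard chi-square table
# for df = k - 1 = 3 and a right-tail area of 0.05.
CRITICAL_VALUE = 7.815
reject_uniform = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.3f}, reject H0: {reject_uniform}")
```

If the statistic falls below the critical value, the test fails to reject the claim that the selections fit a uniform distribution.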

18. Do World War II Bomb Hits Fit a Poisson Distribution? In analyzing hits by V-1 buzz bombs in World War II, South London was subdivided into regions, each with an area of 0.25 km². Shown below is a table of actual frequencies of hits and the frequencies expected with the Poisson distribution. (The Poisson distribution is described in Section 5-5.) Use the values listed and a 0.05 significance level to test the claim that the actual frequencies fit a Poisson distribution.

| Number of bomb hits | 0 | 1 | 2 | 3 | 4 or more |
| --- | --- | --- | --- | --- | --- |
| Actual number of regions | 229 | 211 | 93 | 35 | 8 |
| Expected number of regions (from Poisson distribution) | 227.5 | 211.4 | 97.9 | 30.5 | 8.7 |
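Because the expected frequencies are already given, the test statistic for this problem reduces to a single sum. The sketch below assumes the common textbook convention of df = k - 1 = 4; a stricter treatment would subtract one more degree of freedom for the Poisson mean estimated from the data. The critical value is copied from a standard chi-square table.

```python
# Actual and expected region counts for 0, 1, 2, 3, and 4-or-more hits.
actual = [229, 211, 93, 35, 8]
expected = [227.5, 211.4, 97.9, 30.5, 8.7]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(actual, expected))

# Assumption: df = k - 1 = 4 (textbook convention), right-tail area 0.05,
# critical value from a standard chi-square table.
CRITICAL_VALUE = 9.488
reject_poisson = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.3f}, reject H0: {reject_poisson}")
```

A statistic well below the critical value means there is not sufficient evidence to reject the claim that the hits fit a Poisson distribution.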

11-3

18. Global Warming Survey - A Pew Research poll was conducted to investigate opinions about global warming. The respondents who answered yes when asked if there is solid evidence that the earth is getting warmer were then asked to select a cause of global warming. The results for two age brackets are given in the table below. Use a 0.01 significance level to test the claim that the age bracket is independent of the choice for the cause of global warming.

Do respondents from both age brackets appear to agree, or is there a substantial difference?

| Age bracket | Human activity | Natural patterns | Don't know or refused to answer |
| --- | --- | --- | --- |
| Under 30 | 108 | 41 | 7 |
| 65 and over | 121 | 71 | 43 |
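One way to sketch the test of independence for this contingency table is to build each expected count as (row total × column total) / grand total and sum the usual chi-square terms. The critical value below is an assumption taken from a standard chi-square table (2 degrees of freedom, right-tail area 0.01).

```python
# Rows: under 30, 65 and over.
# Columns: human activity, natural patterns, don't know / refused.
table = [
    [108, 41, 7],
    [121, 71, 43],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over all cells, with E = row total * column total / grand total.
chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(table) - 1) * (len(table[0]) - 1)

# Assumption: critical value from a standard chi-square table, df = 2, alpha = 0.01.
CRITICAL_VALUE = 9.210
reject_independence = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_independence}")
```

A statistic above the critical value would indicate that the choice of cause is not independent of the age bracket.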

12-2

14. Car Emissions - Listed below are measured amounts of greenhouse gas emissions from cars in three different categories (from Data Set 16 in Appendix B). The measurements are in tons per year, expressed as CO2 equivalents. Use a 0.05 significance level to test the claim that the different car categories have the same mean amount of greenhouse gas emissions. Based on the results, does the number of cylinders appear to affect the amount of greenhouse gas emissions?

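| Category | Emissions (tons per year, CO2 equivalents) |
| --- | --- |
| Four cylinder | 7.2, 7.9, 6.8, 7.4, 6.5, 6.6, 6.7, 6.5, 6.5, 7.1, 6.7, 5.5, 7.3 |
| Six cylinder | 8.7, 7.7, 7.7, 8.7, 8.2, 9.0, 9.3, 7.4, 7.0, 7.2, 7.2, 8.5 |
| Eight cylinder | 9.3, 9.3, 8.6, 8.6, 8.7, 9.3, 9.3 |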
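A minimal sketch of the one-way ANOVA computation for these three samples follows. The helper name `one_way_anova_f` is hypothetical, and the critical value is an assumption copied from a standard F table (2 and 29 degrees of freedom, right-tail area 0.05).

```python
# Emissions data from the problem, grouped by number of cylinders.
four = [7.2, 7.9, 6.8, 7.4, 6.5, 6.6, 6.7, 6.5, 6.5, 7.1, 6.7, 5.5, 7.3]
six = [8.7, 7.7, 7.7, 8.7, 8.2, 9.0, 9.3, 7.4, 7.0, 7.2, 7.2, 8.5]
eight = [9.3, 9.3, 8.6, 8.6, 8.7, 9.3, 9.3]


def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]

    # Variation between group means, weighted by group size.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Variation of each value around its own group mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f([four, six, eight])

# Assumption: critical value from a standard F table for (2, 29) df, alpha = 0.05.
CRITICAL_F = 3.3277
reject_equal_means = f_stat > CRITICAL_F

print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within}), reject H0: {reject_equal_means}")
```

An F statistic far above the critical value would be evidence against equal mean emissions across the three categories.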

9-5

12. Home Size and Selling Price - Using the sample data from Data Set 23 in Appendix B, 21 homes with living areas under 2000 ft² have selling prices with a standard deviation of $32,159.73. There are 19 homes with living areas greater than 2000 ft², and they have selling prices with a standard deviation of $66,628.50. Use a 0.05 significance level to test the claim of a real estate agent that homes larger than 2000 ft² have selling prices that vary more than those of the smaller homes.
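The claim about variation can be checked with an F test that compares the two sample variances. The sketch below only computes the test statistic and degrees of freedom from the figures in the problem; it assumes the usual convention of placing the larger sample variance in the numerator, and it leaves the critical value to an F table or software.

```python
# Summary statistics from the problem statement.
n_large, s_large = 19, 66628.50   # homes larger than 2000 sq ft
n_small, s_small = 21, 32159.73   # homes smaller than 2000 sq ft

# F statistic: ratio of sample variances, larger variance in the numerator.
f_stat = s_large ** 2 / s_small ** 2

# Degrees of freedom are n - 1 for each sample.
df_numerator = n_large - 1
df_denominator = n_small - 1

print(f"F = {f_stat:.4f}, df = ({df_numerator}, {df_denominator})")
```

The result is then compared with the right-tail critical F value for 18 and 20 degrees of freedom at the 0.05 level.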

10-2

16. Heights of Presidents and Runners-Up - Theories have been developed about the heights of winning candidates for the U.S. presidency and the heights of candidates who were runners-up. Listed below are heights (in inches) from recent presidential elections. Is there a linear correlation between the heights of candidates who won and the heights of the candidates who were runners-up?

| Winner | 69.5 | 73 | 73 | 74 | 74.5 | 74.5 | 71 | 71 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Runner-up | 72 | 69.5 | 70 | 68 | 74 | 74 | 73 | 76 |
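The linear correlation coefficient for these eight pairs can be sketched directly from its definition. The function name `pearson_r` is illustrative, and the critical value is an assumption taken from a standard table of critical values of r (n = 8, two-tailed, 0.05 significance level, since the problem does not state one).

```python
import math

# Heights in inches, paired by election.
winner = [69.5, 73, 73, 74, 74.5, 74.5, 71, 71]
runner_up = [72, 69.5, 70, 68, 74, 74, 73, 76]


def pearson_r(x, y):
    """Linear correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(winner, runner_up)

# Assumption: critical value of r for n = 8 at alpha = 0.05 (two-tailed),
# from a standard table of critical values.
CRITICAL_R = 0.707
significant = abs(r) > CRITICAL_R

print(f"r = {r:.3f}, significant linear correlation: {significant}")
```

When |r| does not exceed the critical value, there is not sufficient evidence to support a claim of linear correlation.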

10-3

16. Heights of Presidents and Runners-Up - Find the best predicted height of runner-up Goldwater, given that the height of the winning presidential candidate Johnson is 75 in. Is the predicted height of Goldwater close to his actual height of 72 in.?

| Winner | 69.5 | 73 | 73 | 74 | 74.5 | 74.5 | 71 | 71 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Runner-up | 72 | 69.5 | 70 | 68 | 74 | 74 | 73 | 76 |
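A hedged sketch of the prediction step follows. It fits the least-squares line, then applies a convention that is assumed here rather than stated in the problem: when the linear correlation is not significant, the best predicted value is taken to be the sample mean of y instead of the value from the regression line. The critical value of r (n = 8, 0.05 level) is likewise an assumption from a standard table.

```python
import math

winner = [69.5, 73, 73, 74, 74.5, 74.5, 71, 71]     # x: winner height (in.)
runner_up = [72, 69.5, 70, 68, 74, 74, 73, 76]      # y: runner-up height (in.)

n = len(winner)
mean_x = sum(winner) / n
mean_y = sum(runner_up) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(winner, runner_up))
s_xx = sum((x - mean_x) ** 2 for x in winner)
s_yy = sum((y - mean_y) ** 2 for y in runner_up)

# Least-squares regression line: y_hat = intercept + slope * x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r = s_xy / math.sqrt(s_xx * s_yy)

# Assumption: critical r for n = 8, alpha = 0.05, from a standard table.
CRITICAL_R = 0.707

johnson_height = 75
line_prediction = intercept + slope * johnson_height

# Assumed convention: use the line only if the correlation is significant,
# otherwise fall back to the mean of the y values.
best_prediction = line_prediction if abs(r) > CRITICAL_R else mean_y

print(f"y_hat = {intercept:.2f} + ({slope:.4f})x")
print(f"best predicted runner-up height: {best_prediction:.1f} in.")
```

The printed prediction can then be compared with Goldwater's actual height of 72 in. given in the problem.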

https://brainmass.com/statistics/hypothesis-testing/hypothesis-testing-correlation-testing-anova-regression-546357

#### Solution Summary

The solution provides a step-by-step method for hypothesis testing, correlation testing, ANOVA, and regression analysis. Formulas for the calculations and interpretations of the results are also included.

ANOVA, Regression Analysis and Correlation Hypothesis Test

Question 1

[Refer to the file Q1.xls for the data]

a) Test the null hypothesis that six samples of word counts for males (columns 1, 3, 5, 7, 9, 11) are from populations with the same mean. Print the results and write a brief summary of your calculations.

b) Test the null hypothesis that the six samples of word counts for females (columns 2, 4, 6, 8, 10, 12) are from populations with the same mean. Print the results and write a brief summary of your conclusions.

c) If we want to compare the number of words spoken by men to the number of words spoken by women, does it make sense to combine the six columns of word counts for males and combine the six columns of word counts for females, then compare the two samples? Why or why not?

Question 2

[Refer to the file Q2.xls for the data]

a) Using the paired data consisting of the proportions of wins and the numbers of runs scored, find the linear correlation coefficient r and determine whether there is sufficient evidence to support a claim of a linear correlation between those two variables. Then find the regression equation with the response variable y representing the proportions of wins and the predictor variable x representing the numbers of runs scored.

b) Using the paired data consisting of the proportions of wins and the numbers of runs allowed, find the linear correlation coefficient r and determine whether there is sufficient evidence to support a claim of a linear correlation between those two variables. Then find the regression equation with the response variable y representing the proportions of wins and the predictor variable x representing the numbers of runs allowed.

c) Use the paired data consisting of the proportions of wins and these differences: (runs scored) - (runs allowed). Find the linear correlation coefficient r and determine whether there is sufficient evidence to support a claim of a linear correlation between those two variables. Then find the regression equation with the response variable y representing the proportions of wins and the predictor variable x representing the differences of (runs scored) - (runs allowed).

d) Compare the preceding results. Which appears to be more effective for winning baseball games: a strong defense or a strong offense? Explain.

e) Find the regression equation with the response variable y representing the winning percentage and the two predictor variables of runs scored and runs allowed. Does that equation appear to be useful for predicting a team's proportion of wins based on the number of runs scored and the number of runs allowed? Explain.

f) Using the paired data consisting of the numbers of runs scored and the numbers of runs allowed, find the linear correlation coefficient r and determine whether there is sufficient evidence to support a claim of a linear correlation between those two variables. What does the result suggest about the offensive strengths and the defensive strengths of the different teams?