# Chi-square test: Degree of certainty, Cockpit Noise & Cola

Directions: You may include the statistical software output, but you must also include a well-written explanation of the findings. Be sure to answer the question asked in each problem, and explain why, with reference to your output. If you calculate the answers manually, be sure to show your work. I would prefer a Word document with your answers below each problem, but you may also submit an Excel document. (Excel data sets are available in the Course Materials forum as an attachment.)

1: 15.18 Sixty-eight students in an introductory college economics class were asked how many credits they had earned in college, and how certain they were about their choice of major. Research question: At α = .05, is the degree of certainty independent of credits earned? Certainty

Certainty of Major by Credits Earned

| Credits Earned | Very Uncertain | Somewhat Certain | Very Certain | Row Total |
|---|---|---|---|---|
| 0-9 | 12 | 8 | 3 | 23 |
| 10-59 | 8 | 6 | 10 | 24 |
| 60 or more | 5 | 5 | 11 | 21 |
| Col Total | 25 | 19 | 24 | 68 |
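One way to carry out the test for this table is to compute each expected frequency as (row total × column total) / n and then sum (O - E)² / E over all cells. The sketch below does this in plain Python. The function names `chi_square_independence` and `chi2_p_value_even_df` are illustrative, and the closed-form p-value holds only for an even number of degrees of freedom (here df = 4).

```python
import math

def chi_square_independence(table):
    """Return (chi2, df, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

def chi2_p_value_even_df(chi2, df):
    """Upper-tail p-value; this closed form is valid only when df is even."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = chi2 / 2
    term = math.exp(-half)
    total = term
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return total

# Rows: 0-9, 10-59, 60 or more credits.
# Columns: Very Uncertain, Somewhat Certain, Very Certain.
certainty = [
    [12, 8, 3],
    [8, 6, 10],
    [5, 5, 11],
]

chi2, df, expected = chi_square_independence(certainty)
p_value = chi2_p_value_even_df(chi2, df)
alpha = 0.05

print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The written explanation should still tie the decision back to the hypotheses: H0 states that certainty is independent of credits earned.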

2: 15.24 High levels of cockpit noise in an aircraft can damage the hearing of pilots who are exposed to this hazard for many hours. A Boeing 727 co-pilot collected 61 noise observations using a handheld sound meter. Noise level is defined as "Low" (under 88 decibels), "Medium" (88 to 91 decibels), or "High" (92 decibels or more). There are three flight phases (Climb, Cruise, Descent). Research question: At α = .05, is the cockpit noise level independent of flight phase?

(Data are from Capt. Robert E. Hartl, retired.) Noise

Noise Readings by Flight Phase in B-727

Flight Phase

| Noise Level | Climb | Cruise | Descent | Row Total |
|---|---|---|---|---|
| Low | 6 | 2 | 6 | 14 |
| Medium | 18 | 3 | 8 | 29 |
| High | 1 | 3 | 14 | 18 |
| Col Total | 25 | 8 | 28 | 61 |
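Before relying on the chi-square approximation for this table, it can help to inspect the expected frequencies, since the Cruise column holds only 8 of the 61 readings. The sketch below lists the cells whose expected count falls under 5. That cutoff is a common rule of thumb assumed here, not something stated in the problem, and the variable names are illustrative.

```python
levels = ["Low", "Medium", "High"]
phases = ["Climb", "Cruise", "Descent"]
noise = [
    [6, 2, 6],
    [18, 3, 8],
    [1, 3, 14],
]

row_totals = [sum(row) for row in noise]
col_totals = [sum(col) for col in zip(*noise)]
n = sum(row_totals)

MIN_EXPECTED = 5  # assumed rule-of-thumb threshold, not given in the problem

small_cells = []
for level, r in zip(levels, row_totals):
    for phase, c in zip(phases, col_totals):
        e = r * c / n  # expected count under independence
        if e < MIN_EXPECTED:
            small_cells.append((level, phase, round(e, 2)))

for level, phase, e in small_cells:
    print(f"{level} / {phase}: expected = {e}")
```

If small expected counts show up, the explanation of the findings can note that the test result should be read with some caution.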

3: 15.28 Can people really identify their favorite brand of cola? Volunteers tasted Coca-Cola Classic, Pepsi, Diet Coke, and Diet Pepsi, with the results shown below. Research question: At α = .05, is the correctness of the prediction different for the two types of cola drinkers? Could respondents identify their favorite brand in this kind of test? Since it is a 2 × 2 table, try also a two-tailed two-sample z test for π1 = π2 (see Chapter 10 - test of proportions) and test whether your results are the same as using the Chi-square test for independence. Which test is preferable? Why?

(Data are from Consumer Reports 56, no. 8 [August 1991], p. 519.) Cola

Note: You need to run or calculate both a chi-square test of independence and a two-sample test of proportions. The sample size is 46, with 19 regular cola drinkers and 27 diet cola drinkers.

Can Cola Drinkers Tell The Difference?

| Correct? | Regular Cola | Diet Cola | Row Total |
|---|---|---|---|
| Yes | 7 | 7 | 14 |
| No | 12 | 20 | 32 |
| Col Total | 19 | 27 | 46 |
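The comparison requested in this problem can be sketched as follows. The code runs a pooled two-sample z test and a chi-square test of independence (without a continuity correction) on the same counts, so the two statistics can be compared directly. For a 2 × 2 table the chi-square statistic should equal z², and the two-tailed p-values should agree. Variable names are illustrative.

```python
import math

# Correct identifications out of each group, from the cola table.
x1, n1 = 7, 19   # regular cola drinkers
x2, n2 = 7, 27   # diet cola drinkers

# Two-tailed two-sample z test using the pooled proportion.
p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value_z = math.erfc(abs(z) / math.sqrt(2))

# Chi-square test of independence on the same 2 x 2 table.
table = [[x1, x2], [n1 - x1, n2 - x2]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)
chi2 = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(table, row_totals)
    for o, c in zip(row, col_totals)
)
p_value_chi2 = math.erfc(math.sqrt(chi2 / 2))  # upper tail, valid for df = 1

print(f"z = {z:.4f}, z squared = {z * z:.4f}, p-value = {p_value_z:.4f}")
print(f"chi-square = {chi2:.4f}, df = 1, p-value = {p_value_chi2:.4f}")
```

One point for the "which test is preferable" discussion: the z test keeps the sign of the difference, so it can also be run one-tailed, while the chi-square test extends to larger tables.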

Week 4: Quiz: Chi square tests (2 points)
Due on March 22, 2010 in your Individual Forum

Directions: For the following hypothesis tests, identify the null and alternative hypotheses and the critical value. Then calculate the test statistic, note the p-value, and make a decision on the null hypothesis. Please show your work if you calculated manually. If you used statistical software, please show the output. Your p-value will be approximate if you calculated manually (e.g., less than .05) or exact if you used statistical software.

1. One day's sales volume of five flavored shakes at a quick-service restaurant (QSR) is shown. Using the chi-square test for a uniform distribution at α = .05, are sales uniformly distributed across the flavors? Explain your answer with reference to the output.

Types of shakes

Chocolate 127
Vanilla 135
Strawberry 129
Mango 121
Pineapple 113

Goodness of Fit Test

| observed | expected | O - E | (O - E)² / E | % of chisq |
|---|---|---|---|---|
| 127 | 125.000 | 2.000 | 0.032 | 1.43 |
| 135 | 125.000 | 10.000 | 0.800 | 35.71 |
| 129 | 125.000 | 4.000 | 0.128 | 5.71 |
| 121 | 125.000 | -4.000 | 0.128 | 5.71 |
| 113 | 125.000 | -12.000 | 1.152 | 51.43 |
| 625 | 625.000 | 0.000 | 2.240 | 100.00 |

2.24 chi-square
4 df
.6917 p-value
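The output above can be reproduced by hand in a few lines. This sketch assumes the uniform null hypothesis (each flavor is expected to sell n / 5 shakes) and uses a closed-form upper-tail probability that is valid only for df = 4. Variable names are illustrative.

```python
import math

sales = {
    "Chocolate": 127,
    "Vanilla": 135,
    "Strawberry": 129,
    "Mango": 121,
    "Pineapple": 113,
}

n = sum(sales.values())
expected = n / len(sales)  # uniform null: equal expected sales per flavor
chi2 = sum((o - expected) ** 2 / expected for o in sales.values())
df = len(sales) - 1

# Upper-tail chi-square probability; this closed form holds for df = 4 only.
half = chi2 / 2
p_value = math.exp(-half) * (1 + half)

print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
# prints: chi-square = 2.240, df = 4, p-value = 0.6917
```

Comparing the p-value with the significance level of .05 gives the decision on the null hypothesis.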

2: Results of a study of male freshman athletes in the Big Ten Conference are shown. Before doing any formal tests, examine the contingency table and discuss any obvious patterns. At α = .01, is graduation independent of sport? Explain your decision.

Tennis 42 16
Swimming 116 51
Soccer 35 17
Gymnastics 40 23
Golf 30 21
Track 97 69
Football 267 317
Wrestling 70 87
Baseball 77 98
Hockey 39 66
Other 18 5

Tennis Observed 42 16 58
Expected 29.61 28.39 58.00
O - E 12.39 -12.39 0.00
(O - E)² / E 5.18 5.40 10.58
% of chisq 6.3% 6.5% 12.8%
Swimming Observed 116 51 167
Expected 85.27 81.73 167.00
O - E 30.73 -30.73 0.00
(O - E)² / E 11.07 11.55 22.63
% of chisq 13.4% 14.0% 27.4%
Soccer Observed 35 17 52
Expected 26.55 25.45 52.00
O - E 8.45 -8.45 0.00
(O - E)² / E 2.69 2.80 5.49
% of chisq 3.2% 3.4% 6.6%
Gymnastics Observed 40 23 63
Expected 32.17 30.83 63.00
O - E 7.83 -7.83 0.00
(O - E)² / E 1.91 1.99 3.90
% of chisq 2.3% 2.4% 4.7%
Golf Observed 30 21 51
Expected 26.04 24.96 51.00
O - E 3.96 -3.96 0.00
(O - E)² / E 0.60 0.63 1.23
% of chisq 0.7% 0.8% 1.5%
Track Observed 97 69 166
Expected 84.76 81.24 166.00
O - E 12.24 -12.24 0.00
(O - E)² / E 1.77 1.84 3.61
% of chisq 2.1% 2.2% 4.4%
Football Observed 267 317 584
Expected 298.19 285.81 584.00
O - E -31.19 31.19 0.00
(O - E)² / E 3.26 3.40 6.67
% of chisq 3.9% 4.1% 8.1%
Wrestling Observed 70 87 157
Expected 80.16 76.84 157.00
O - E -10.16 10.16 0.00
(O - E)² / E 1.29 1.34 2.63
% of chisq 1.6% 1.6% 3.2%
Baseball Observed 77 98 175
Expected 89.36 85.64 175.00
O - E -12.36 12.36 0.00
(O - E)² / E 1.71 1.78 3.49
% of chisq 2.1% 2.2% 4.2%
Hockey Observed 39 66 105
Expected 53.61 51.39 105.00
O - E -14.61 14.61 0.00
(O - E)² / E 3.98 4.16 8.14
% of chisq 4.8% 5.0% 9.8%
Observed 36 61 97
Expected 49.53 47.47 97.00
O - E -13.53 13.53 0.00
(O - E)² / E 3.70 3.86 7.55
% of chisq 4.5% 4.7% 9.1%
Other Observed 18 5 23
Expected 11.74 11.26 23.00
O - E 6.26 -6.26 0.00
(O - E)² / E 3.33 3.48 6.81
% of chisq 4.0% 4.2% 8.2%
Expected 867.00 831.00 1698.00
O - E 0.00 0.00 0.00
(O - E)² / E 0.00 0.00 0.00
% of chisq 0.0% 0.0% 0.0%
Total Observed 1734 1662 3396
Expected 1734.00 1662.00 3396.00
O - E 0.00 0.00 0.00
(O - E)² / E 40.49 42.24 82.73
% of chisq 48.9% 51.1% 100.0%

82.73 chi-square
12 df
1.24E-12 p-value
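The printed output can be cross-checked with a short calculation. The sketch below rests on two assumptions that should be verified against the original data. First, the unlabeled block after Hockey is taken to have observed counts 36 and 61 (its Expected plus its O - E values). Second, the block with expected counts 867 and 831 is taken to be the column totals entered as an extra data row. Under those assumptions the 12 categories give the same chi-square statistic as the output, but with (12 - 1)(2 - 1) = 11 degrees of freedom rather than 12. The helper names `chi2_stat` and `chi2_sf` are illustrative.

```python
import math

def chi2_stat(table):
    """Chi-square statistic and df for a contingency table (list of rows)."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for o, c in zip(row, col_totals)
    )
    return chi2, (len(table) - 1) * (len(table[0]) - 1)

def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    half = x / 2
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: normal-tail term plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

# Observed counts in the order of the output blocks.
sports = [
    [42, 16],    # Tennis
    [116, 51],   # Swimming
    [35, 17],    # Soccer
    [40, 23],    # Gymnastics
    [30, 21],    # Golf
    [97, 69],    # Track
    [267, 317],  # Football
    [70, 87],    # Wrestling
    [77, 98],    # Baseball
    [39, 66],    # Hockey
    [36, 61],    # unlabeled block; counts reconstructed (assumption)
    [18, 5],     # Other
]

chi2_12, df_12 = chi2_stat(sports)

# Appending the column totals as a data row mimics the printed output.
totals_row = [sum(c) for c in zip(*sports)]
chi2_13, df_13 = chi2_stat(sports + [totals_row])

print(f"12 rows:             chi-square = {chi2_12:.2f}, df = {df_12}")
print(f"12 rows plus totals: chi-square = {chi2_13:.2f}, df = {df_13}")
print(f"p-value at df = {df_12}: {chi2_sf(chi2_12, df_12):.3g}")
```

A totals row contributes nothing to the statistic, so the decision at α = .01 is the same either way. Only the reported df and p-value shift.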

Additional Questions - 150 word count

1: Does correlation equal causation? Does the strength of correlation depend on the direction? What is the meaning of a zero correlation? Explain your answers.

2. What terms describe the fit of a regression equation to the data? What is the importance of the coefficient of determination (r²)? How do you identify outliers in your data? How do they impact your regression equation?

3: Does a regression model imply causation? What are the requirements that must be met for a regression analysis? What happens if these requirements are violated? Why is analysis of residuals important?

4: How does multiple regression analysis differ from bivariate regression analysis? In the following example (see page 596 in Doane & Seward), what independent (predictor) variables are significant? How well does the model predict the dependent variable?

Regression Analysis

R² 0.967
R 0.983 k 5
Std. Error 90.189 Dep. Var. Assessed

ANOVA table

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Regression | 6,225,261.2561 | 5 | 1,245,052.2512 | 153.07 | 2.01E-18 |
| Residual | 211,486.6189 | 26 | 8,134.1007 | | |
| Total | 6,436,747.8750 | 31 | | | |

Regression output, with 95% confidence interval bounds

| variables | coefficients | std. error | t (df=26) | p-value | 95% lower | 95% upper | std. coeff. |
|---|---|---|---|---|---|---|---|
| Intercept | -59.3894 | 71.9826 | -0.825 | .4168 | -207.3520 | 88.5731 | 0.000 |
| Floor | 0.2509 | 0.0218 | 11.494 | 1.08E-11 | 0.2060 | 0.2957 | 0.792 |
| Offices | 97.7927 | 30.8056 | 3.175 | .0038 | 34.4708 | 161.1146 | 0.204 |
| Entrances | 72.8405 | 38.7501 | 1.880 | .0714 | -6.8115 | 152.4924 | 0.086 |
| Age | -0.4570 | 1.2011 | -0.380 | .7067 | -2.9258 | 2.0118 | -0.015 |
| Freeway | 116.1786 | 34.7721 | 3.341 | .0025 | 44.7035 | 187.6536 | 0.129 |
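The summary figures in this output follow from the ANOVA sums of squares, which the short sketch below verifies: R² = SS regression / SS total, F = MS regression / MS residual, and the standard error is the square root of MS residual. It also screens the predictors by p-value. The 0.05 significance level is an assumption, since the question does not state one, and the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table.
ss_regression, df_regression = 6_225_261.2561, 5
ss_residual, df_residual = 211_486.6189, 26
ss_total = ss_regression + ss_residual

r_squared = ss_regression / ss_total
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)

print(f"R squared = {r_squared:.3f}, R = {math.sqrt(r_squared):.3f}")
print(f"F = {f_stat:.2f}, Std. Error = {std_error:.3f}")

# p-values copied from the regression output (intercept omitted).
p_values = {
    "Floor": 1.08e-11,
    "Offices": 0.0038,
    "Entrances": 0.0714,
    "Age": 0.7067,
    "Freeway": 0.0025,
}
ALPHA = 0.05  # assumed significance level; the question does not give one

significant = [name for name, p in p_values.items() if p < ALPHA]
print("significant predictors:", significant)
```

A predictor's 95% confidence interval excludes zero exactly when its p-value is below .05, which gives a second way to read the same table.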