# Chi squared test for independence and linear regression

1. Use the following data to:
a) draw a scatter plot
b) find the coefficient of correlation and test the significance at the .05 level
c) find the regression line
d) predict y' for x = 5
e) find the coefficients of determination and non-determination

| Number of alcoholic drinks (x) | Score on a dexterity test (y) |
| --- | --- |
| 2 | 15 |
| 1 | 18 |
| 3 | 11 |
| 4 | 7 |
| 2 | 10 |
| 1 | 16 |
| 5 | 4 |
| 6 | 2 |

2. A real estate agent found that there is a significant relationship among the number of acres on a farm (x1), the number of rooms in the farmhouse (x2), and the selling price in ten thousands (y) of farms in a specific area. The regression equation is y' = 44.9 - 0.0266 x1 + 7.56 x2. Predict the selling price of a farm that has 500 acres and a farmhouse with 8 rooms.

3. A maker of an instant oatmeal mix is considering adding flavors to its mix. 200 people tested the flavors and gave their preferences. Is there a preference for the flavor at the .05 level? State the hypotheses and identify the claim, find the critical value(s), compute the test value, make the decision, summarize the results.

Plain 20
Cinnamon 58
Apple 48
Maple 22
Peach 52

4. A USA Today poll shows that 74% of respondents felt that other motorists were driving more aggressively than they did five years ago, 23% felt that other motorists were driving the same way they did five years ago, and 3% felt other motorists were driving less aggressively than they were driving five years ago. A sample survey of 180 senior drivers found that 125 felt that other motorists were driving more aggressively than they did five years ago, 36 felt that other motorists were driving about the same as they did five years ago, and 19 felt that other motorists were driving less aggressively than they did five years ago. At a significance level of .10 test the claim that senior drivers feel the same way as those who were surveyed in the USA Today poll. State the hypotheses and identify the claim, find the critical value(s), compute the test value, make the decision, summarize the results.

5. Given the following information, is the grade dependent on gender? Test at the .05 level. State the hypotheses and identify the claim, find the critical value(s), compute the test value, make the decision, summarize the results.

| | A | B | C | D or lower |
| --- | --- | --- | --- | --- |
| Males | 8 | 17 | 25 | 10 |
| Females | 12 | 10 | 21 | 7 |

6. Suppose a researcher wants to investigate whether the proportion of smokers within different age groups is the same. He divides the American population into four age groups. Within each age group, he surveys 80 individuals and asks, "Have you smoked at least one cigarette in the past week?". Test at the .01 level. State the hypotheses and identify the claim, find the critical value(s), compute the test value, make the decision, summarize the results. The results of the survey are as follows:

| | 18-19 | 30-49 | 50-64 | 65 or older |
| --- | --- | --- | --- | --- |
| Smoked at least one cigarette in past week | 24 | 21 | 23 | 12 |
| Did not smoke at least one cigarette in past week | 56 | 59 | 57 | 68 |

a)

b) Using the CORREL function in Excel, we get the correlation coefficient: r = -0.96

Test the significance at the .05 level

Suppose that we predicted a negative association. For the 8 pairs of results, Pearson's correlation coefficient is r = -0.96.
Degrees of freedom: N - 2 = 8 - 2 = 6
From the table of critical values (c.v.), for a one-tailed test for a negative correlation at the 0.05 level with 6 degrees of freedom, the critical value is r = 0.622.
Since the absolute value of the computed r exceeds the critical value, |r| = 0.96 > c.v. = 0.622,
the p-value is less than 0.05 (p < 0.05), and we conclude that the correlation coefficient is statistically significant at the 0.05 level of significance.
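
The significance test can also be checked numerically. What follows is a minimal Python sketch, not the Excel workflow used in this solution: it computes r from the eight data pairs with the usual sums-of-squares formula and converts it to a t statistic with n - 2 degrees of freedom. The one-tailed critical value t = 1.943 (df = 6, alpha = 0.05) is an assumption read from a standard t table and hard-coded; all variable names are illustrative.

```python
import math

# Data from problem 1: number of drinks (x) and dexterity score (y)
x = [2, 1, 3, 4, 2, 1, 5, 6]
y = [15, 18, 11, 7, 10, 16, 4, 2]
n = len(x)

# Corrected sums of squares and cross-products
sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Pearson correlation coefficient
r = sxy / math.sqrt(sxx * syy)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom
df = n - 2
t_stat = r * math.sqrt(df / (1 - r * r))

# Assumption: one-tailed critical t for df = 6, alpha = 0.05 (from a t table)
T_CRIT = 1.943
# Equivalent critical value on the r scale
r_crit = T_CRIT / math.sqrt(T_CRIT ** 2 + df)

significant = abs(t_stat) > T_CRIT
print(f"r = {r:.3f}, t = {t_stat:.2f}, critical r = {r_crit:.3f}, significant: {significant}")
```

Comparing |t| with the critical t is equivalent to comparing |r| with the critical r, so either form of the table leads to the same decision.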

c) From the last table in the Excel output we get the estimated coefficients of the linear regression.
Therefore, the regression line is: y' = 19.375 - 3x

d) Substituting x = 5 into the regression line y' = 19.375 - 3x, we get: y' = 19.375 - 3(5) = 4.375

e) The coefficient of determination is r² = 0.924, and the coefficient of non-determination is 1 - r² = 1 - 0.924 = 0.076.
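
Parts c) to e) can be reproduced without Excel. The sketch below is one possible way to do it, assuming ordinary least squares on the eight data pairs from problem 1; the helper name `predict` and the variable names are illustrative and do not come from the original solution.

```python
# Data from problem 1: number of drinks (x) and dexterity score (y)
x = [2, 1, 3, 4, 2, 1, 5, 6]
y = [15, 18, 11, 7, 10, 16, 4, 2]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and cross-products about the means
sxx = sum((v - mean_x) ** 2 for v in x)
syy = sum((v - mean_y) ** 2 for v in y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Least-squares slope and intercept of y' = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(x_new):
    """Predicted score y' for a given number of drinks (hypothetical helper)."""
    return intercept + slope * x_new


# Coefficient of determination and its complement
r_squared = sxy ** 2 / (sxx * syy)
non_determination = 1 - r_squared

print(f"y' = {intercept:.3f} + ({slope:.3f})x")
print(f"prediction at x = 5: {predict(5):.3f}")
print(f"r^2 = {r_squared:.3f}, 1 - r^2 = {non_determination:.3f}")
```

For simple linear regression with one predictor, r² computed this way equals the "R Square" value that Excel reports in its regression output.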

2. A real estate agent found that there is a significant relationship among the number of acres on a farm (x1), the number of rooms in the farmhouse (x2), and the selling price in ten thousands (y) of farms in a specific area. The regression equation is y' = 44.9 - 0.0266 x1 + 7.56 x2. Predict the selling price of a farm that has 500 acres and a farmhouse with 8 rooms.

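Substituting x1 = 500 and x2 = 8 into the regression equation, we get: y' = 44.9 - 0.0266(500) + 7.56(8) = 44.9 - 13.3 + 60.48 = 92.08. Since y is measured in ten thousands, the predicted selling price is about 920,800.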
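
The substitution is simple arithmetic, and a short Python sketch makes it easy to check or to reuse for other farms. The function name `predict_price` is hypothetical; the coefficients are the ones given in the problem statement.

```python
def predict_price(acres, rooms):
    """Predicted selling price y', in the units of the regression equation
    (ten thousands), from y' = 44.9 - 0.0266*x1 + 7.56*x2."""
    return 44.9 - 0.0266 * acres + 7.56 * rooms


# Farm with 500 acres and an 8-room farmhouse
y_hat = predict_price(500, 8)
print(f"y' = {y_hat:.2f}")
```

Note the negative coefficient on acres: under this fitted equation, each additional acre lowers the predicted price slightly, while each additional room raises it by 7.56 units.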

3. An ...

