# Chi-square Test, Regression Analysis and Correlation

See file attached for proper format of tables.

12.5 A sample of 500 shoppers was selected in a large metropolitan area to determine various information concerning consumer behavior. Among the questions asked was, "Do you enjoy shopping for clothing?" The results are summarized in the following contingency table:

| Enjoy Shopping for Clothing | Gender: Male | Gender: Female | Total |
|---|---|---|---|
| Yes | 136 | 224 | 360 |
| No | 104 | 36 | 140 |
| Total | 240 | 260 | 500 |

(a) Is there evidence of a significant difference between the proportion of males and females who enjoy shopping for clothing at the 0.01 level of significance?

(b) Determine the p-value in (a) and interpret its meaning.

(c) What are your answers to (a) and (b) if 206 males enjoyed shopping for clothing and 34 did not?

(d) Compare the results of (a) through (c).
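One way to check parts (a) through (c) numerically is sketched below. This is a minimal pure-Python sketch, not necessarily the method used in the attached solution; the helper names `chi_square_stat` and `p_value_df1` are illustrative. It relies on the standard identity that, for 1 degree of freedom, the upper-tail chi-square probability equals erfc(sqrt(x/2)).

```python
import math

def chi_square_stat(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / n  # expected frequency under H0
            stat += (f_obs - f_exp) ** 2 / f_exp
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Rows: Yes, No. Columns: Male, Female.
original = [[136, 224], [104, 36]]
modified = [[206, 224], [34, 36]]  # part (c): 206 males enjoy shopping, 34 do not

stat_a, df_a = chi_square_stat(original)
stat_c, df_c = chi_square_stat(modified)
print(f"(a) chi-square = {stat_a:.3f}, df = {df_a}, p = {p_value_df1(stat_a):.3g}")
print(f"(c) chi-square = {stat_c:.3f}, df = {df_c}, p = {p_value_df1(stat_c):.3g}")
```

With the original counts the statistic is about 53.83, far above the tabled 0.01 critical value of 6.635 for 1 degree of freedom. With the modified counts it drops to about 0.011, so the conclusion reverses.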

12.15 The health-care industry and consumer advocacy groups are at odds over the sharing of a patient's medical records without the patient's consent. The health-care industry believes that no consent should be necessary to openly share data among doctors, hospitals, pharmacies, and insurance companies. Suppose a study is conducted in which 600 patients are randomly assigned, 200 each, to three "organizational groupings": insurance companies, pharmacies, and medical researchers. Each patient is given material to read about the advantages concerning the sharing of medical records within the assigned "organizational grouping." Each patient is then asked, "Would you object to the sharing of your medical records with..."; the results are recorded in the following contingency table:

| Object to Sharing Information | Insurance | Pharmacies | Research |
|---|---|---|---|
| Yes | 40 | 80 | 90 |
| No | 160 | 120 | 110 |

The columns are the organizational groupings.

(a) Is there evidence of a difference in objection to sharing information among the organizational groupings? (Use α = 0.05.)

(b) Compute the p-value and interpret its meaning.

(c) If appropriate, use the Marascuilo procedure and α = 0.05 to determine which groups are different.
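The test and the Marascuilo follow-up for this table can be sketched as below. This is an illustrative pure-Python sketch with variable names of my own choosing. It uses the fact that, with 2 degrees of freedom, the chi-square upper tail is exp(-x/2), so both the p-value and the critical value have closed forms.

```python
import math
from itertools import combinations

groups = ["Insurance", "Pharmacies", "Research"]
yes = [40, 80, 90]      # patients who object
no = [160, 120, 110]    # patients who do not object
alpha = 0.05

n_j = [y + x for y, x in zip(yes, no)]  # 200 patients per group
n = sum(n_j)
p_bar = sum(yes) / n                    # pooled proportion who object

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
stat = 0.0
for y, x, size in zip(yes, no, n_j):
    stat += (y - size * p_bar) ** 2 / (size * p_bar)
    stat += (x - size * (1 - p_bar)) ** 2 / (size * (1 - p_bar))

# Closed forms that hold only for 2 degrees of freedom
p_value = math.exp(-stat / 2)
critical = -2 * math.log(alpha)
print(f"chi-square = {stat:.3f}, critical = {critical:.3f}, p = {p_value:.3g}")

# Marascuilo procedure: compare each |p_i - p_j| with its critical range
props = [y / size for y, size in zip(yes, n_j)]
different = []
for i, j in combinations(range(len(groups)), 2):
    diff = abs(props[i] - props[j])
    se = math.sqrt(props[i] * (1 - props[i]) / n_j[i]
                   + props[j] * (1 - props[j]) / n_j[j])
    crit_range = math.sqrt(critical) * se
    if diff > crit_range:
        different.append((groups[i], groups[j]))
    print(f"{groups[i]} vs {groups[j]}: |diff| = {diff:.3f}, "
          f"critical range = {crit_range:.4f}")
```

Under these calculations the insurance group differs from both the pharmacy and research groups, while pharmacies and research do not differ significantly from each other.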

12.23 USA Today reported on preferred types of office communication by different age groups ("Taking Face to Face vs. Group Meetings," USA Today, October 13, 2003, p. A1). Suppose the results were based on a survey of 500 respondents in each age group. The results are cross-classified in the following table:

| Age Group | Group Meetings | Face-to-Face Meetings with Individuals | E-mails | Other | Total |
|---|---|---|---|---|---|
| Generation Y | 180 | 260 | 50 | 10 | 500 |
| Generation X | 210 | 190 | 65 | 35 | 500 |
| Boomer | 205 | 195 | 65 | 35 | 500 |
| Mature | 200 | 195 | 50 | 55 | 500 |
| Total | 795 | 840 | 230 | 135 | 2,000 |

At the 0.05 level of significance, is there evidence of a relationship between age group and type of communication preferred?
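For a 4 x 4 table the same chi-square calculation applies with (4 - 1)(4 - 1) = 9 degrees of freedom. The sketch below also reports each age group's contribution to the statistic, which helps show where the association comes from. The names are illustrative, and the critical value 16.919 is taken from a standard chi-square table rather than computed.

```python
observed = {
    "Generation Y": [180, 260, 50, 10],
    "Generation X": [210, 190, 65, 35],
    "Boomer":       [205, 195, 65, 35],
    "Mature":       [200, 195, 50, 55],
}
CRITICAL_DF9_ALPHA05 = 16.919  # chi-square table value, df = 9, alpha = 0.05

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# Contribution of each age group (row) to the chi-square statistic
row_contrib = {}
for (name, row), r_tot in zip(observed.items(), row_totals):
    total = 0.0
    for f_obs, c_tot in zip(row, col_totals):
        f_exp = r_tot * c_tot / n  # expected frequency under independence
        total += (f_obs - f_exp) ** 2 / f_exp
    row_contrib[name] = total

stat = sum(row_contrib.values())
df = (len(rows) - 1) * (len(col_totals) - 1)

for name, value in row_contrib.items():
    print(f"{name:<13} contributes {value:6.3f}")
print(f"chi-square = {stat:.3f}, df = {df}, "
      f"reject H0: {stat > CRITICAL_DF9_ALPHA05}")
```

Most of the statistic comes from the Generation Y and Mature rows, in particular their "Other" counts.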

13.7 A critically important aspect of customer service in a supermarket is the waiting time at the checkout (defined as the time from when the customer enters the line until he or she is served). Data were collected during time periods in which a constant number of checkout counters were open. The total number of customers in the store and the waiting times (in minutes) were recorded. The results are stored in Supermarket.

(a) Construct a scatter plot.

(b) Assuming a linear relationship, use the least-squares method to determine the regression coefficients b0 and b1.

(c) Interpret the meaning of the slope, b1, in this problem.

(d) Predict the waiting time when there are 20 customers in the store.

13.9 An agent for a residential real estate company in a large city would like to be able to predict the monthly rental cost for apartments, based on the size of an apartment, as defined by square footage. The agent selects a sample of 25 apartments in a particular residential neighborhood and gathers the data below (stored in Rent).

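| Apartment | Monthly Rent ($) | Size (Sq. Feet) | Apartment | Monthly Rent ($) | Size (Sq. Feet) |
|---|---|---|---|---|---|
| 1 | 950 | 850 | 14 | 1,800 | 1,369 |
| 2 | 1,600 | 1,450 | 15 | 1,400 | 1,175 |
| 3 | 1,200 | 1,085 | 16 | 1,450 | 1,225 |
| 4 | 1,500 | 1,232 | 17 | 1,100 | 1,245 |
| 5 | 950 | 718 | 18 | 1,700 | 1,259 |
| 6 | 1,700 | 1,485 | 19 | 1,200 | 1,150 |
| 7 | 1,650 | 1,136 | 20 | 1,150 | 896 |
| 8 | 935 | 726 | 21 | 1,600 | 1,361 |
| 9 | 875 | 700 | 22 | 1,650 | 1,040 |
| 10 | 1,150 | 956 | 23 | 1,200 | 755 |
| 11 | 1,400 | 1,100 | 24 | 800 | 1,000 |
| 12 | 1,650 | 1,285 | 25 | 1,750 | 1,200 |
| 13 | 2,300 | 1,985 | | | |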

(a) Construct a scatter plot.

(b) Use the least-squares method to determine the regression coefficients b0 and b1.

(c) Interpret the meaning of b0 and b1 in this problem.

(d) Predict the monthly rent for an apartment that has 1,000 square feet.

(e) Why would it not be appropriate to use the model to predict the monthly rent for apartments that have 500 square feet?

(f) Your friends Jim and Jennifer are considering signing a lease for an apartment in this residential neighborhood. They are trying to decide between two apartments, one with 1,000 square feet for a monthly rent of $1,275 and the other with 1,200 square feet for a monthly rent of $1,425. Based on (a) through (d), which apartment do you think is the better deal?
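Parts (b), (d), and (f) can be checked with a short least-squares sketch on the 25 apartments listed in the table. The function names `least_squares` and `predict` are illustrative, and comparing each asking rent with the model's prediction is only one reasonable reading of "better deal".

```python
# (monthly rent in dollars, size in square feet) for apartments 1 through 25
rent_data = [
    (950, 850), (1600, 1450), (1200, 1085), (1500, 1232), (950, 718),
    (1700, 1485), (1650, 1136), (935, 726), (875, 700), (1150, 956),
    (1400, 1100), (1650, 1285), (2300, 1985), (1800, 1369), (1400, 1175),
    (1450, 1225), (1100, 1245), (1700, 1259), (1200, 1150), (1150, 896),
    (1600, 1361), (1650, 1040), (1200, 755), (800, 1000), (1750, 1200),
]

def least_squares(xs, ys):
    """Return (b0, b1) for the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ssx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ssxy / ssx           # slope: change in rent per extra square foot
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

rents = [rent for rent, _ in rent_data]
sizes = [size for _, size in rent_data]
b0, b1 = least_squares(sizes, rents)

def predict(size):
    """Predicted monthly rent for an apartment of the given size."""
    return b0 + b1 * size

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"Predicted rent at 1,000 sq ft: {predict(1000):.2f}")

# Part (f): asking rent minus predicted rent (negative means cheaper than predicted)
offers = {1000: 1275, 1200: 1425}
gaps = {size: asking - predict(size) for size, asking in offers.items()}
for size, gap in gaps.items():
    print(f"{size} sq ft: asking rent is {gap:+.2f} relative to the prediction")
```

The fitted line is roughly rent = 177.12 + 1.0651 × size. The sampled sizes run from 700 to 1,985 square feet, so a 500-square-foot apartment falls outside the range the model was fitted on.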

13.17 For those data, SSR = 130,301.41 and SST = 144,538.64.

(a) Determine the coefficient of determination, r², and interpret its meaning.

(b) Determine the standard error of the estimate.

(c) How useful do you think this regression model is for predicting audited sales?
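The coefficient of determination follows directly from the two sums of squares given. The sketch below also defines the standard error of the estimate as a function of the sample size n. Since n is not stated in this excerpt, it is left as a parameter to be filled in from the referenced problem rather than assumed here.

```python
import math

ssr = 130_301.41  # regression sum of squares (given)
sst = 144_538.64  # total sum of squares (given)

r_squared = ssr / sst   # share of variation in Y explained by X
sse = sst - ssr         # error (residual) sum of squares

def standard_error_of_estimate(sse, n):
    """S_YX = sqrt(SSE / (n - 2)) for a simple linear regression on n observations."""
    return math.sqrt(sse / (n - 2))

print(f"r^2 = {r_squared:.4f}")
print(f"SSE = {sse:.2f}")
```

An r² of about 0.90 means that roughly 90% of the variation in the dependent variable is explained by the regression.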

13.39 You are testing the null hypothesis that there is no linear relationship between two variables, X and Y. From your sample of n = 10, you determine that r = 0.80.

(a) What is the value of the t test statistic tSTAT?

(b) At the α = 0.05 level of significance, what are the critical values?

(c) Based on your answers to (a) and (b), what statistical decision should you make?
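The test statistic here is t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. A minimal sketch follows; the critical value 2.3060 is the two-tail 0.05 value for 8 degrees of freedom from a standard t table, not something computed by the code.

```python
import math

n = 10
r = 0.80
T_CRITICAL_DF8 = 2.3060  # t table: two-tail alpha = 0.05, df = n - 2 = 8

t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
reject_h0 = abs(t_stat) > T_CRITICAL_DF8

print(f"tSTAT = {t_stat:.4f}")
print(f"critical values = ±{T_CRITICAL_DF8}")
print(f"reject H0 (no linear relationship): {reject_h0}")
```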

13.41 You are testing the null hypothesis that there is no linear relationship between two variables, X and Y. From your sample of n = 20, you determine that SSR = 60 and SSE = 40.

(a) What is the value of FSTAT?

(b) At the α = 0.05 level of significance, what is the critical value?

(c) Based on your answers to (a) and (b), what statistical decision should you make?

(d) Compute the correlation coefficient by first computing r² and assuming that b1 is negative.

(e) At the 0.05 level of significance, is there a significant correlation between X and Y?
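The F test and the correlation test are linked here: in simple linear regression, FSTAT equals the square of the t statistic for the correlation. The sketch below illustrates this with the given sums of squares. The critical values 4.41 (F with 1 and 18 degrees of freedom) and 2.1009 (t with 18 degrees of freedom) are taken from standard tables.

```python
import math

n = 20
ssr = 60.0
sse = 40.0
F_CRITICAL_1_18 = 4.41    # F table: alpha = 0.05, df = (1, n - 2)
T_CRITICAL_DF18 = 2.1009  # t table: two-tail alpha = 0.05, df = n - 2

msr = ssr / 1             # one independent variable
mse = sse / (n - 2)
f_stat = msr / mse

r_squared = ssr / (ssr + sse)
r = -math.sqrt(r_squared)  # negative root because b1 is assumed negative
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"FSTAT = {f_stat:.2f}, reject H0: {f_stat > F_CRITICAL_1_18}")
print(f"r^2 = {r_squared:.2f}, r = {r:.4f}")
print(f"tSTAT = {t_stat:.4f}, significant: {abs(t_stat) > T_CRITICAL_DF18}")
```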

13.47 In Problem 13.9, an agent for a real estate company wanted to predict the monthly rent for apartments, based on the size of the apartment. The data are stored in Rent. Using the results of that problem:

(a) At the 0.05 level of significance, is there evidence of a linear relationship between the size of the apartment and the monthly rent?

(b) Construct a 95% confidence interval estimate of the population slope.
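A sketch of the slope test and confidence interval on the Rent data from Problem 13.9 follows. The variable names are illustrative, and the critical value 2.0687 is the two-tail 0.05 value for 23 degrees of freedom from a standard t table.

```python
import math

# (monthly rent in dollars, size in square feet) from Problem 13.9
rent_data = [
    (950, 850), (1600, 1450), (1200, 1085), (1500, 1232), (950, 718),
    (1700, 1485), (1650, 1136), (935, 726), (875, 700), (1150, 956),
    (1400, 1100), (1650, 1285), (2300, 1985), (1800, 1369), (1400, 1175),
    (1450, 1225), (1100, 1245), (1700, 1259), (1200, 1150), (1150, 896),
    (1600, 1361), (1650, 1040), (1200, 755), (800, 1000), (1750, 1200),
]
T_CRITICAL_DF23 = 2.0687  # t table: two-tail alpha = 0.05, df = n - 2 = 23

ys = [rent for rent, _ in rent_data]
xs = [size for _, size in rent_data]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ssx = sum((x - x_bar) ** 2 for x in xs)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, then standard error of the slope
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

t_stat = b1 / s_b1  # tests H0: population slope = 0
lower = b1 - T_CRITICAL_DF23 * s_b1
upper = b1 + T_CRITICAL_DF23 * s_b1

print(f"tSTAT = {t_stat:.2f}, reject H0: {abs(t_stat) > T_CRITICAL_DF23}")
print(f"95% CI for the slope: {lower:.4f} to {upper:.4f}")
```

The interval excludes zero, which agrees with rejecting the hypothesis of no linear relationship.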

13.51 The file CoffeeDrink represents the calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin' Donuts and Starbucks:

| Product | Calories | Fat |
|---|---|---|
| Dunkin' Donuts Iced Mocha Swirl latte (whole milk) | 240 | 8.0 |
| Starbucks Coffee Frappuccino blended coffee | 260 | 3.5 |
| Dunkin' Donuts Coffee Coolatta (cream) | 350 | 22.0 |
| Starbucks Iced Coffee Mocha Espresso (whole milk and whipped cream) | 350 | 20.0 |
| Starbucks Chocolate Brownie Frappuccino blended coffee (whipped cream) | 510 | 22.0 |
| Starbucks Chocolate Frappuccino Blended Creme (whipped cream) | 530 | 19.0 |

(a) Compute and interpret the coefficient of correlation, r.

(b) At the 0.05 level of significance, is there a significant linear relationship between calories and fat?
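The correlation and its t test can be sketched as follows, using only the six drinks listed in the table here. If the CoffeeDrink file contains additional rows, the numbers will differ. The critical value 2.7764 is the two-tail 0.05 value for 4 degrees of freedom from a standard t table.

```python
import math

# (calories, fat in grams) for the six drinks listed above
drinks = [(240, 8.0), (260, 3.5), (350, 22.0), (350, 20.0), (510, 22.0), (530, 19.0)]
T_CRITICAL_DF4 = 2.7764  # t table: two-tail alpha = 0.05, df = n - 2 = 4

xs = [cal for cal, _ in drinks]
ys = [fat for _, fat in drinks]
n = len(drinks)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ssx = sum((x - x_bar) ** 2 for x in xs)
ssy = sum((y - y_bar) ** 2 for y in ys)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = ssxy / math.sqrt(ssx * ssy)  # sample coefficient of correlation
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
significant = abs(t_stat) > T_CRITICAL_DF4

print(f"r = {r:.4f}")
print(f"tSTAT = {t_stat:.4f}, significant at 0.05: {significant}")
```

For these six rows r is about 0.73, a fairly strong positive association, but with only 4 degrees of freedom the t statistic does not exceed the critical value.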

#### Solution Summary

The solution provides a step-by-step method for calculating the chi-square test for association, regression analysis, and the correlation coefficient. Formulas for the calculations and interpretations of the results are also included.