# Regression Analysis & Confidence Interval

1. You may recall that in our first class we discussed a data file collected from a graduate survey. At that time we were mostly concerned with descriptive statistics. The file is GRADSURVEY (attached). An interesting question is whether some variables in that data file can explain or predict anticipated salary in five years. Four potential variables are Graduate GPA, GMAT, Spending, and Number of Jobs.

a) Using only Graduate GPA as a variable, determine the regression model used to predict anticipated salary in 5 years. Even if it is not a "good" model, it is all you have to address the following situation: You interview for a job and your Graduate GPA is 3.85. You receive a job offer that guarantees you a salary of $95,000 in 5 years. Having studied past data, what should you do if your decision is based only on salary?

b) Using all four variables listed above, carry out the required analysis to see if you can develop a regression model that is significant at the 5% level. Are all independent variables needed in your model? Interpret your findings.

c) What happens if you add another variable, "Expected Salary"? Why does it increase the predictive ability of the model?

Data set

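| ID Num | Gender | Age | Height | Major | Graduate GPA | Undergrad Specialization | Undergrad GPA | GMAT | Employment Status | Number of Jobs | Expected Salary | Anticipated Salary in 5 Years | Satisfaction Advisement | Spending |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ID01 | M | 22 | 69 | IS | 3.90 | CM | 3.30 | 600 | PT | 0 | 45 | 75 | 5 | 200 |
| ID02 | M | 35 | 67 | A | 3.92 | O | 3.34 | 480 | FT | 2 | 120 | 250 | 4 | 150 |
| ID03 | M | 31 | 67 | MR | 3.77 | BU | 3.04 | 550 | FT | 2 | 85 | 120 | 5 | 65 |
| ID04 | M | 28 | 73 | M | 3.43 | BI | 3.41 | 530 | FT | 4 | 100 | 150 | 5 | 150 |
| ID05 | M | 36 | 70 | EF | 3.51 | BU | 3.12 | 610 | FT | 3 | 80 | 90 | 4 | 300 |
| ID06 | F | 27 | 60 | A | 3.00 | SS | 3.50 | 460 | FT | 3 | 100 | 150 | 4 | 250 |
| ID07 | M | 30 | 68 | EF | 3.65 | CM | 3.02 | 580 | FT | 5 | 100 | 125 | 4 | 400 |
| ID08 | M | 28 | 66 | A | 3.00 | CM | 2.84 | 590 | FT | 1 | 60 | 100 | 6 | 60 |
| ID09 | F | 24 | 65 | UN | 3.22 | CM | 3.13 | 570 | FT | 4 | 50 | 60 | 4 | 180 |
| ID10 | M | 33 | 70 | A | 3.90 | SS | 3.24 | 530 | PT | 5 | 50 | 80 | 4 | 700 |
| ID11 | M | 26 | 71 | A | 4.00 | BU | 3.89 | 550 | FT | 3 | 60 | 100 | 5 | 100 |
| ID12 | M | 24 | 74 | M | 3.20 | BU | 3.22 | 500 | FT | 2 | 65 | 100 | 4 | 200 |
| ID13 | M | 31 | 69 | A | 3.53 | CM | 3.33 | 540 | FT | 3 | 80 | 110 | 6 | 300 |
| ID14 | M | 39 | 71 | EF | 3.42 | CM | 3.04 | 570 | FT | 2 | 100 | 150 | 1 | 100 |
| ID15 | F | 29 | 63 | MR | 3.12 | BU | 3.14 | 480 | UN | 1 | 50 | 100 | 4 | 1000 |
| ID16 | M | 26 | 74 | MR | 3.43 | EN | 2.56 | 600 | FT | 4 | 40 | 65 | 4 | 300 |
| ID17 | F | 23 | 64 | IS | 3.75 | CM | 3.00 | 580 | FT | 1 | 70 | 100 | 5 | 200 |
| ID18 | F | 26 | 63 | A | 3.30 | HU | 3.23 | 520 | FT | 3 | 60 | 75 | 4 | 150 |
| ID19 | M | 30 | 63 | EF | 4.00 | O | 3.75 | 580 | FT | 3 | 105 | 120 | 6 | 150 |
| ID20 | F | 25 | 63 | MR | 4.00 | BU | 3.72 | 650 | FT | 1 | 60 | 100 | 4 | 130 |
| ID21 | F | 27 | 62 | MR | 3.25 | ED | 3.77 | 480 | UN | 2 | 45 | 65 | 4 | 300 |
| ID22 | F | 25 | 63 | EF | 3.51 | BU | 3.64 | 500 | FT | 2 | 60 | 80 | 4 | 200 |
| ID23 | M | 32 | 73 | A | 3.35 | BU | 2.87 | 580 | FT | 1 | 80 | 140 | 5 | 90 |
| ID24 | F | 31 | 65 | MR | 3.22 | BU | 2.95 | 540 | FT | 3 | 65 | 85 | 6 | 170 |
| ID25 | M | 25 | 68 | EF | 3.47 | BU | 3.18 | 590 | PT | 1 | 60 | 150 | 4 | 320 |
| ID26 | M | 29 | 73 | IB | 3.67 | HU | 3.56 | 620 | FT | 2 | 65 | 135 | 4 | 200 |
| ID27 | F | 25 | 64 | MR | 3.40 | SS | 3.26 | 600 | FT | 2 | 55 | 90 | 4 | 600 |
| ID28 | M | 37 | 68 | M | 3.65 | EN | 3.41 | 530 | FT | 2 | 90 | 130 | 2 | 200 |
| ID29 | M | 34 | 66 | A | 3.54 | BI | 3.38 | 540 | FT | 1 | 70 | 100 | 3 | 100 |
| ID30 | F | 33 | 61 | M | 3.64 | ED | 2.79 | 570 | FT | 2 | 45 | 80 | 4 | 160 |
| ID31 | F | 38 | 65 | EF | 4.00 | BU | 3.78 | 570 | PT | 1 | 80 | 110 | 4 | 230 |
| ID32 | M | 30 | 72 | EF | 3.70 | PS | 3.55 | 550 | FT | 2 | 75 | 150 | 5 | 500 |
| ID33 | M | 32 | 73 | M | 3.24 | PA | 3.17 | 580 | FT | 2 | 60 | 85 | 6 | 250 |
| ID34 | F | 28 | 61 | A | 3.37 | SS | 3.68 | 610 | FT | 1 | 75 | 95 | 3 | 150 |
| ID35 | F | 27 | 66 | IS | 3.56 | CM | 3.27 | 560 | FT | 1 | 65 | 90 | 4 | 120 |
| ID36 | M | 41 | 74 | IB | 3.28 | SS | 3.65 | 490 | FT | 1 | 50 | 85 | 1 | 160 |
| ID37 | F | 35 | 65 | M | 3.16 | PS | 3.29 | 510 | FT | 3 | 75 | 100 | 2 | 100 |
| ID38 | F | 25 | 63 | IS | 3.59 | CM | 3.45 | 560 | FT | 1 | 60 | 90 | 3 | 160 |
| ID39 | M | 32 | 70 | EF | 3.80 | EN | 3.03 | 600 | FT | 2 | 90 | 160 | 7 | 130 |
| ID40 | M | 30 | 69 | M | 3.15 | O | 3.22 | 540 | PT | 1 | 55 | 85 | 6 | 110 |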
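One possible way to approach parts (a) through (c) is sketched below with ordinary least squares in NumPy. This is a minimal sketch, not the attached solution: the helper name `fit_ols` and the variable names are hypothetical, the columns are transcribed from the data set above in ID order, and the salary columns are assumed to be in thousands of dollars. The critical values quoted in the comments are approximate table values.

```python
import numpy as np

# Columns transcribed from the data set above, ID01..ID40.
# Assumption: both salary columns are in thousands of dollars.
gpa = np.array([3.90, 3.92, 3.77, 3.43, 3.51, 3.00, 3.65, 3.00, 3.22, 3.90,
                4.00, 3.20, 3.53, 3.42, 3.12, 3.43, 3.75, 3.30, 4.00, 4.00,
                3.25, 3.51, 3.35, 3.22, 3.47, 3.67, 3.40, 3.65, 3.54, 3.64,
                4.00, 3.70, 3.24, 3.37, 3.56, 3.28, 3.16, 3.59, 3.80, 3.15])
gmat = np.array([600, 480, 550, 530, 610, 460, 580, 590, 570, 530,
                 550, 500, 540, 570, 480, 600, 580, 520, 580, 650,
                 480, 500, 580, 540, 590, 620, 600, 530, 540, 570,
                 570, 550, 580, 610, 560, 490, 510, 560, 600, 540])
jobs = np.array([0, 2, 2, 4, 3, 3, 5, 1, 4, 5,
                 3, 2, 3, 2, 1, 4, 1, 3, 3, 1,
                 2, 2, 1, 3, 1, 2, 2, 2, 1, 2,
                 1, 2, 2, 1, 1, 1, 3, 1, 2, 1])
expected = np.array([45, 120, 85, 100, 80, 100, 100, 60, 50, 50,
                     60, 65, 80, 100, 50, 40, 70, 60, 105, 60,
                     45, 60, 80, 65, 60, 65, 55, 90, 70, 45,
                     80, 75, 60, 75, 65, 50, 75, 60, 90, 55])
anticipated = np.array([75, 250, 120, 150, 90, 150, 125, 100, 60, 80,
                        100, 100, 110, 150, 100, 65, 100, 75, 120, 100,
                        65, 80, 140, 85, 150, 135, 90, 130, 100, 80,
                        110, 150, 85, 95, 90, 85, 100, 90, 160, 85])
spending = np.array([200, 150, 65, 150, 300, 250, 400, 60, 180, 700,
                     100, 200, 300, 100, 1000, 300, 200, 150, 150, 130,
                     300, 200, 90, 170, 320, 200, 600, 200, 100, 160,
                     230, 500, 250, 150, 120, 160, 100, 160, 130, 110])


def fit_ols(y, *predictors):
    """Least-squares fit with an intercept.

    Returns (coefficients, t statistics, R^2, overall F statistic).
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    n, k = X.shape[0], X.shape[1] - 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - sse / sst
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    # Standard errors from the usual estimate s^2 * (X'X)^-1.
    cov = sse / (n - k - 1) * np.linalg.inv(X.T @ X)
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, t_stats, r2, f_stat


# (a) Simple regression on Graduate GPA, then a point prediction at GPA = 3.85.
coef_a, t_a, r2_a, f_a = fit_ols(anticipated, gpa)
pred_385 = coef_a[0] + coef_a[1] * 3.85
print(f"(a) salary = {coef_a[0]:.2f} + {coef_a[1]:.2f} * GPA, R^2 = {r2_a:.3f}")
print(f"    prediction at GPA 3.85: {pred_385:.1f} (compare with the offer of 95)")

# (b) Four predictors. Compare F with the 5% critical value of F(4, 35),
#     roughly 2.64, and each |t| with roughly 2.03 (35 degrees of freedom).
coef_b, t_b, r2_b, f_b = fit_ols(anticipated, gpa, gmat, spending, jobs)
print(f"(b) R^2 = {r2_b:.3f}, F = {f_b:.2f}, t = {np.round(t_b[1:], 2)}")

# (c) Add Expected Salary as a fifth predictor and compare R^2 with (b).
coef_c, t_c, r2_c, f_c = fit_ols(anticipated, gpa, gmat, spending, jobs, expected)
print(f"(c) R^2 = {r2_c:.3f}, F = {f_c:.2f}, t = {np.round(t_c[1:], 2)}")
```

Because each model is nested in the next, R² cannot decrease from (a) to (b) to (c); whether the increase matters is what the F and t statistics are meant to indicate. Exact p-values would need a statistics package and are left out of this sketch.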

2. An auditor for a government agency needs to evaluate payments for doctors' office visits paid by Medicare in a particular zip code during the month of June. A total of 25,056 visits occurred during June in this area. The auditor selects a sample of 138 visit claims for the audit. It is determined that the average amount of reimbursement was $93.40 and the standard deviation was $34.55. In 12 of the office visits, an incorrect amount of reimbursement was provided. For the 12 office visits in which there was an incorrect reimbursement, the differences between the amount reimbursed and the amount that the auditor determined should have been reimbursed are in the data file Medicare.

a) What information would you give the agency if it wants to know the total amount of reimbursement it incurred for this geographic area in June? The agency is satisfied with a maximum error of 5% in any estimates it receives.

b) What information would you give the agency if it wants to know the total difference between the amount reimbursed and the amount the auditor determined should have been reimbursed? Again a maximum 5% error is allowed.

Data set

| Difference |
|---|
| 17 |
| 25 |
| 14 |
| -10 |
| 20 |
| 40 |
| 35 |
| 30 |
| 28 |
| 22 |
| 15 |
| 5 |
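Both parts of this problem can be approached as confidence intervals for a population total. The sketch below is one hedged way to set that up with the Python standard library. It assumes that "maximum error of 5%" means a 95% confidence level, that the 126 sampled claims without an error have a difference of 0, and that the t critical value for 137 degrees of freedom is read from a t table. The function name `total_interval` is hypothetical.

```python
import math
import statistics

N = 25056               # all office visits in June (population size)
n = 138                 # sampled claims
xbar, s = 93.40, 34.55  # sample mean and standard deviation of reimbursement

# Assumption: "maximum error of 5%" is read as a 95% confidence level.
# Two-sided 95% t critical value for n - 1 = 137 degrees of freedom (t table).
T_CRIT = 1.9774


def total_interval(mean, sd, n, N, t_crit):
    """Point estimate and confidence interval for a population total,
    using the finite population correction factor."""
    fpc = math.sqrt((N - n) / (N - 1))
    margin = N * t_crit * sd / math.sqrt(n) * fpc
    total = N * mean
    return total, total - margin, total + margin


# (a) Total reimbursement for the area in June.
total_a, lo_a, hi_a = total_interval(xbar, s, n, N, T_CRIT)
print(f"(a) total = {total_a:,.2f}, 95% CI = ({lo_a:,.2f}, {hi_a:,.2f})")

# (b) Total difference. The 12 listed differences come from the sample of 138;
#     the remaining 126 claims are treated as having a difference of 0.
diffs = [17, 25, 14, -10, 20, 40, 35, 30, 28, 22, 15, 5]
all_diffs = diffs + [0] * (n - len(diffs))
dbar = statistics.mean(all_diffs)
s_d = statistics.stdev(all_diffs)
total_b, lo_b, hi_b = total_interval(dbar, s_d, n, N, T_CRIT)
print(f"(b) total difference = {total_b:,.2f}, 95% CI = ({lo_b:,.2f}, {hi_b:,.2f})")
```

Padding the differences with zeros keeps the mean and standard deviation on a per-claim basis across all 138 sampled claims, which is what scaling by N requires.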

Please see attached files.

#### Solution Summary

The solution provides a step-by-step method for carrying out the regression analysis and calculating the confidence intervals. The formulas used and interpretations of the results are also included.