Multiple regression problem.

See attached file for full problem description.

Study & Data Collection:

The administrator of a small rural hospital is interested in better understanding what drives inpatient charges: is it length of stay (LOS), severity of the illness, patient demographics (such as age or gender), or could it possibly be due to one or two specific physicians consistently ordering expensive services? Your job is to develop a model (equation) to show the marginal effect these variables have on inpatient charges and then report your findings in a clear manner.

Data for this study were collected over a 2-week period. Data on charges, LOS, severity of illness, age, gender, and attending physician were collected on all patients discharged during that 2-week period. The data set (patientdataHW4) contains a total of 52 patient observations.

Data Preparation:

Often before conducting data analysis, it is necessary to make some modifications to the existing data set to get it into a form that is useable. I think it's a little easier to modify the data set in Excel, so I would go ahead and open the data file and make the following changes in Excel (before importing into SPSS).

 In the data set, gender is coded as a categorical variable (M/F). Therefore, before we can include gender in our regression analysis, we will first need to create a dummy variable. To be consistent with each other, let's create a dummy variable for "male." (When doing this on your own, you could choose either "male" or "female" to represent "1".) To create a dummy variable, simply add a column called "male" to your Excel data set and code all the males as "1" and all the females as "0".

 The data set includes the name of the attending physician; however, this is also a categorical variable (Jones, Smith, Roberts). Therefore, to include physician information in our regression analysis, we will need to create a dummy variable for each physician. For example, add a column called "Jones" and then for each patient who has "Jones" listed, code this as a 1 and make all the others 0. Do this for Smith and Roberts as well.
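The two coding steps above can also be sketched in a few lines of Python, for readers who prefer a script to editing the spreadsheet by hand. This is only an illustration: the four rows and the helper name `add_dummies` are made up here, and the real patientdataHW4 file is not reproduced.

```python
# Hypothetical rows for illustration only; the real patientdataHW4
# file has 52 observations and additional columns (charges, LOS, ...).
patients = [
    {"gender": "M", "physician": "Jones"},
    {"gender": "F", "physician": "Smith"},
    {"gender": "F", "physician": "Roberts"},
    {"gender": "M", "physician": "Smith"},
]


def add_dummies(rows):
    """Return copies of the rows with 0/1 columns for male and each physician."""
    coded = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows stay untouched
        # "male" is 1 for M and 0 for F, as described above.
        new_row["male"] = 1 if row["gender"] == "M" else 0
        # One 0/1 column per attending physician.
        for name in ("Jones", "Smith", "Roberts"):
            new_row[name] = 1 if row["physician"] == name else 0
        coded.append(new_row)
    return coded


coded = add_dummies(patients)
for row in coded:
    print(row)
```

Because every patient has exactly one attending physician, the three physician columns always sum to 1 in each row. Entering all three into a regression together with an intercept would therefore be perfectly collinear; the questions below only ever add one physician at a time, which avoids that problem.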

Now you are ready to import the file into SPSS and begin your data analysis.


a) Find the Pearson's r (correlation coefficient) for Charges, LOS, severity, age, and male. (Include output in an Appendix)
b) Based on your findings in part (a), which of these variables is correlated with "Charges" and therefore should be included as independent variables in our regression?
c) Run a regression using "Charges" as the dependent variable and LOS, severity, age, and male as the independent variables (Include output in Appendix).
d) What is the R² value for this regression? Interpret the meaning of this R² value.
e) Write out an equation for "Charges" based on your regression results.
f) Are there any independent variables that are not significant based on your regression results?
g) What is the marginal effect of LOS on Charges?
h) Interpret the marginal effect of LOS on Charges - what does it tell us?
i) At what level of significance can we report this result (.01, .05, or .10)?
j) What is the marginal effect of severity on Charges? Interpret and indicate the level of significance.
k) What is the marginal effect of age and gender on Charges? Interpret and indicate the level of significance for each of these demographic variables.
l) Suppose the hospital administrator wants to determine if one physician has significantly higher charges compared to the others. Find the correlation coefficient for Charges and each of the doctors (Jones, Smith, Roberts). (Include output in the Appendix)
m) Based on your correlation coefficients, is there one doctor who appears to have higher charges than the others? Who is it?
n) Perhaps the doctor identified in part (m) simply sees patients whose cases are more severe and that is the reason why his charges are higher than the other doctors. Run the same regression as before, but this time, include this doctor as a dummy variable. This will allow us to test whether or not there is still a significant difference in this physician's charges once we "hold constant" all the other independent variables. (include output in an Appendix)
o) Is the coefficient for this doctor significant? If so, what is the marginal effect? What does this tell us about patient charges for this physician compared to the other physicians?
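The assignment expects SPSS output, but the calculations behind parts (a), (c), (d), and (e) can be sketched in Python with NumPy as a cross-check. Everything in this example is an assumption made for illustration: the eight observations are invented, and Charges is generated without noise from known coefficients so that the fit can be verified. It is not the patientdataHW4 data and says nothing about the actual answers.

```python
import numpy as np

# Synthetic, noise-free data invented for illustration (NOT patientdataHW4).
los = np.array([2, 5, 3, 8, 1, 4, 6, 7], dtype=float)
severity = np.array([1, 3, 2, 4, 1, 2, 3, 4], dtype=float)
age = np.array([34, 61, 45, 72, 29, 50, 66, 58], dtype=float)
male = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=float)

# Charges built from made-up coefficients so the regression can be checked.
charges = 1000 + 500 * los + 200 * severity + 10 * age + 50 * male

# Parts (a) and (l): Pearson's r between Charges and a single variable.
r_los = np.corrcoef(charges, los)[0, 1]

# Part (c): ordinary least squares with an intercept column.
X = np.column_stack([np.ones_like(los), los, severity, age, male])
coef, *_ = np.linalg.lstsq(X, charges, rcond=None)

# Part (d): R^2 = 1 - SS_residual / SS_total.
fitted = X @ coef
ss_res = np.sum((charges - fitted) ** 2)
ss_tot = np.sum((charges - charges.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Part (e): each slope is the marginal effect of that variable on Charges,
# holding the other variables constant.
names = ["intercept", "LOS", "severity", "age", "male"]
for name, b in zip(names, coef):
    print(f"{name}: {b:.2f}")
print(f"Pearson r (Charges, LOS): {r_los:.3f}")
print(f"R^2: {r_squared:.3f}")
```

For part (n), a physician dummy would be added as one more column in `X`. This sketch does not compute standard errors or p-values, so the significance questions (f, i, j, k, o) still need the coefficient table from SPSS or a statistics package.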


On a separate sheet, prepare a report (in memo format) that you plan to submit to the hospital administrator with your findings. Your report should be no more than 2 pages (single spaced, business format). In your report, include the source of your data (explained above) and your key findings. Make the information as relevant and understandable as possible (remember your audience - the hospital administrator). Provide a complete summary of your findings, but present them in a clear, concise way. Finally, include any implications based on your findings - what action(s) would (or would not) be appropriate based on your findings?

Turn in your report and attach the following in an appendix:
 the answers to your questions (these will be graded)
 a printout of your modified data set (with dummy variables)
 any supporting SPSS output (clearly labeled)


Solution Summary

This posting contains a solution to the multiple regression problem described above.