# Regression analysis

Life insurance companies are interested in predicting how long their customers will live, because their premiums and profitability depend on such numbers. An actuary for one insurance company gathered data on 100 recently deceased male customers; the data are contained in the file longevity.xls. He recorded each customer's age at death, the ages at death of his mother and father, the mean age at death of his grandmothers, the mean age at death of his grandfathers, and whether the deceased customer was or was not a smoker.

The variables in the data set, longevity.xls, are:

Longevity = age of the male customer at death
Mother = mother's age at death
Father = father's age at death
Gmothers = mean age of grandmothers at death
Gfathers = mean age of grandfathers at death
Smoker = 1 if the customer was a smoker, 0 if the customer was not a smoker

Estimate a simple regression model showing the relationship between the male customer's age at death and whether the customer was a smoker.

a. What percent of the variation in the customer's age at death is explained by whether the customer was a smoker?
b. What is your estimated value for β1? Interpret what this says about the relationship between longevity and smoking.
c. Test the hypothesis that smoking has no influence on the customer's longevity against the alternative that it has a negative effect on longevity. Assume α = .05.
d. Develop a 95 percent prediction interval estimate of longevity for a particular male who smoked.
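
Parts a through d can be worked in any statistics package. As a minimal sketch, assuming plain Python and a small made-up sample (not the longevity.xls data), the quantities involved could be computed as follows. The helper names `simple_ols` and `prediction_interval` are illustrative, and the critical value `t_crit` is assumed to be looked up in a t table.

```python
from math import sqrt

def simple_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    s = sqrt(sse / (n - 2))             # standard error of the regression
    se_b1 = s / sqrt(sxx)               # standard error of the slope
    return {"n": n, "b0": b0, "b1": b1, "r2": 1 - sse / sst, "s": s,
            "se_b1": se_b1, "t_b1": b1 / se_b1, "x_bar": x_bar, "sxx": sxx}

def prediction_interval(fit, x0, t_crit):
    """Prediction interval for one new observation at x = x0."""
    y_hat = fit["b0"] + fit["b1"] * x0
    half = t_crit * fit["s"] * sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return y_hat - half, y_hat + half

# Made-up toy values for illustration only, NOT the longevity.xls data.
smoker = [1, 1, 1, 1, 0, 0, 0, 0]
longevity = [62, 68, 65, 69, 74, 80, 77, 81]

fit = simple_ols(smoker, longevity)
# 2.447 is the two-sided 5% t critical value for n - 2 = 6 degrees of freedom.
low, high = prediction_interval(fit, x0=1, t_crit=2.447)
print(fit["r2"], fit["b1"], fit["t_b1"], (low, high))
```

Because Smoker is a 0/1 variable, the slope `b1` equals the mean longevity of smokers minus the mean longevity of non-smokers. For part c, the one-sided test rejects the null hypothesis when `t_b1` falls below the negative of the one-sided critical value for n - 2 degrees of freedom.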

Now estimate the multiple regression model:

Longevity = β0 + β1*Smoker + β2*Mother + β3*Father + β4*Gmothers + β5*Gfathers

e. Did including Mother, Father, Gmothers, and Gfathers have an effect on the estimates of β0 and β1? If so, what were the effects? Why did these estimates change?
f. Explain and interpret what the coefficient estimates of β2, β3, β4, and β5 mean.
g. How has adding Mother, Father, Gmothers, and Gfathers into the model improved the explanatory power of the model? Explain.
h. Do the variables Gmothers and Gfathers have a significant effect on longevity? Explain.
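
For parts e through h, it can help to see how adding a correlated predictor shifts the Smoker coefficient and raises R-squared. The following is a rough sketch in plain Python that solves the normal equations directly. It uses made-up toy values and only one added predictor (Mother) to stay short; the helper names `solve` and `ols` are illustrative and not part of the original problem.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])                                 # coefficients, including intercept
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)                           # normal equations: X'X b = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)        # penalizes extra predictors
    return {"beta": beta, "sse": sse, "r2": r2, "adj_r2": adj_r2, "n": n, "k": k}

# Made-up toy values for illustration only, NOT the longevity.xls data.
smoker = [1, 1, 1, 1, 0, 0, 0, 0]
mother = [70, 82, 75, 85, 72, 88, 79, 90]
longevity = [62, 68, 65, 69, 74, 80, 77, 81]

simple = ols([smoker], longevity)
full = ols([smoker, mother], longevity)
print(simple["beta"], simple["r2"])
print(full["beta"], full["r2"], full["adj_r2"])
```

In this toy sample the smokers' mothers died younger on average, so part of the gap attributed to smoking in the simple model is reassigned to Mother once it enters the model. R-squared cannot fall when predictors are added, which is why adjusted R-squared is the more useful comparison for part g.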

Reestimate the model, but this time do not include the variables Gmothers and Gfathers. In other words, estimate the following regression model:

Longevity = β0 + β1*Smoker + β2*Mother + β3*Father

i. How did removing Gmothers and Gfathers affect the model's goodness of fit?

j. Using this new model, develop a point estimate of the longevity of a male who smoked, whose mother died at the age of 78, and whose father died at the age of 90.
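
The comparison of the full and reduced models and the point estimate in part j both reduce to short formulas. Here is a minimal sketch, assuming the error sums of squares (SSE) have already been obtained from regression output. The SSE values and coefficients below are placeholders chosen for illustration, not results from longevity.xls, and the function names are hypothetical.

```python
def partial_f(sse_reduced, sse_full, q, n, k_full):
    """F statistic for H0: the q dropped coefficients are all zero.

    k_full counts the predictors in the full model, excluding the intercept.
    Compare the result with an F critical value on (q, n - k_full - 1)
    degrees of freedom.
    """
    return ((sse_reduced - sse_full) / q) / (sse_full / (n - k_full - 1))

def point_estimate(coefs, values):
    """coefs = [b0, b1, ...]; values = predictor values in the same order as b1, b2, ..."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], values))

# n = 100 customers, 5 predictors in the full model, 2 dropped (Gmothers, Gfathers).
# The SSE values are hypothetical placeholders, NOT results from the data.
f_stat = partial_f(sse_reduced=1050.0, sse_full=1000.0, q=2, n=100, k_full=5)

# Placeholder coefficients [b0, b_smoker, b_mother, b_father], NOT fitted estimates.
coefs = [10.0, -5.0, 0.4, 0.3]
estimate = point_estimate(coefs, [1, 78, 90])   # smoker, mother 78, father 90
print(f_stat, estimate)
```

With the actual regression output, substitute the two SSE values and the fitted coefficients of the three-predictor model. A small F statistic would suggest that Gmothers and Gfathers add little once Smoker, Mother, and Father are in the model.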