Multiple Regression and Non Parametric Methods

Solution Preview

14.20
The data are stored in the file 1420_data.txt as follows:
County MedianIncome MedianAge Coastal
A 48157 57.7 1
B 48568 60.7 1
C 46816 47.9 1
D 34876 38.4 0
E 35478 42.8 0
F 34465 35.4 0
G 35026 39.5 0
H 38599 65.6 0
J 33315 27.0 0

Using R:
a.
Is there a linear relationship between the median income and median age?

> d1420 <- read.table("1420_data.txt", header=TRUE)
> colnames(d1420) = c("County", "MedianIncome", "MedianAge", "Coastal")
> attach(d1420)
The following object(s) are masked from 'd1420 (position 3)':

Coastal, County, MedianAge, MedianIncome
> cor.test(MedianIncome, MedianAge)

Pearson's product-moment correlation

data: MedianIncome and MedianAge
t = 2.7561, df = 7, p-value = 0.02825
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.1099740 0.9367364
sample estimates:
cor
0.7214069

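Conclusion: Yes. The sample correlation between Median Income and Median Age is about 0.72, and it differs significantly from zero at the 5% level (t = 2.7561, df = 7, p-value = 0.02825), so there is evidence of a positive linear relationship.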
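
As a cross-check on the test above, here is a minimal sketch that rebuilds the data inline (assuming the table shown earlier matches 1420_data.txt) and avoids attach(), so no masking messages appear. The names n, r, t_stat, p_value and ct are illustrative and not part of the original session. It recomputes the t statistic from the identity t = r*sqrt(n-2)/sqrt(1-r^2) and compares it with what cor.test reports.

```r
# Data copied from the table above (assumed to match 1420_data.txt).
d1420 <- data.frame(
  County       = c("A", "B", "C", "D", "E", "F", "G", "H", "J"),
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0),
  Coastal      = c(1, 1, 1, 0, 0, 0, 0, 0, 0)
)

n <- nrow(d1420)                                    # number of counties
r <- cor(d1420$MedianIncome, d1420$MedianAge)       # Pearson correlation

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
# two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# The built-in test, referencing columns explicitly instead of attach()
ct <- cor.test(d1420$MedianIncome, d1420$MedianAge)

print(c(r = r, t = t_stat, p = p_value))
```

Referencing columns as d1420$MedianIncome keeps the workspace clean; the "masked" message in the transcript suggests attach() had been called more than once on the same data frame.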

b.
Which variable is the "dependent" variable?

Median Income is the dependent variable; Median Age is the independent (explanatory) variable.

c.
Use statistical software to determine the regression equation. Interpret the value of the slope in a simple regression equation.

> lm(MedianIncome~ MedianAge)

Call:
lm(formula = MedianIncome ~ MedianAge)

Coefficients:
(Intercept) MedianAge
22804.7 361.6

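i.e.,

MedianIncome = 22804.7 + 361.6*MedianAge

Interpretation of the slope: each additional year of median age is associated with an increase of about 361.6 in a county's median income, on average.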
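
To make the meaning of the slope concrete, the following sketch refits the simple regression on an inline copy of the data (assumed to match 1420_data.txt) and compares fitted values one year of median age apart. The ages 50 and 51 are arbitrary illustrative inputs, and the names fit, b, slope, pred_50 and diff_one_year are not from the original session.

```r
# Data copied from the table above (assumed to match 1420_data.txt).
d1420 <- data.frame(
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0)
)

# Simple linear regression of income on age
fit <- lm(MedianIncome ~ MedianAge, data = d1420)
b   <- coef(fit)

# Slope: estimated change in median income per extra year of median age
slope <- unname(b["MedianAge"])

# Fitted income for a hypothetical county with median age 50
pred_50 <- predict(fit, newdata = data.frame(MedianAge = 50))

# Fitted values one year apart differ by exactly the slope
diff_one_year <- unname(
  predict(fit, newdata = data.frame(MedianAge = 51)) - pred_50
)

print(b)
print(c(slope = slope, diff_one_year = diff_one_year))
```

The intercept is the fitted income at a median age of zero, which lies far outside the observed ages (27.0 to 65.6), so it is best read as an anchoring constant rather than a meaningful prediction.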

d.
Include the aspect that the county is "coastal" or not in a multiple linear regression analysis using a "dummy" variable. Does it appear to be a significant influence on incomes?

> lmIncAgeCoast = lm(formula = MedianIncome ~ MedianAge + Coastal)
> summary(lmIncAgeCoast)

Call:
lm(formula = MedianIncome ~ MedianAge + Coastal)

Residuals:
Min 1Q Median 3Q Max
-0.27723 -0.14934 0.01199 0.06174 ...
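
Since the output above is cut off, here is a hedged sketch of how the significance of the dummy variable could be read off programmatically. It uses an inline copy of the data (assumed to match 1420_data.txt); the 0.05 significance level and the names fit1, fit2, coef_table, coastal_row and alpha are assumptions for illustration, not part of the original solution.

```r
# Data copied from the table above (assumed to match 1420_data.txt).
d1420 <- data.frame(
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0),
  Coastal      = c(1, 1, 1, 0, 0, 0, 0, 0, 0)   # 1 = coastal, 0 = not coastal
)

# Multiple regression with the 0/1 dummy for coastal counties
fit2 <- lm(MedianIncome ~ MedianAge + Coastal, data = d1420)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table  <- coef(summary(fit2))
coastal_row <- coef_table["Coastal", ]
print(coastal_row)

# Compare the dummy's p-value with an assumed 5% significance level
alpha <- 0.05
coastal_significant <- unname(coastal_row["Pr(>|t|)"] < alpha)
print(coastal_significant)

# Equivalent check: F-test of the model without the dummy against the model with it
fit1 <- lm(MedianIncome ~ MedianAge, data = d1420)
print(anova(fit1, fit2))
```

The Coastal estimate is the fitted income difference between a coastal and a non-coastal county of the same median age; its row in the coefficient table is the one to consult when answering part d.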
