
R programming - Multiple Linear Regression

Plots should be done in R, and the data set is loaded through R. Please see the attachment.


Solution Preview

Question 1

> dat <- read.csv("http://biostat.jhsph.edu/~iruczins/teaching/140.615/data/fev.csv", row.names=1)

Plotting the data frame as a scatterplot matrix gives a 4-by-4 grid of 16 boxes, 12 of which contain plots. Each plot shows one pair of the 4 variables. For example, the box in the first row, third column plots fev (y-axis, shared by every box in that row) against height (x-axis, shared by every box in that column).
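The command that draws this grid is not shown in the preview. Base R's pairs() produces a scatterplot matrix of this shape, so a minimal sketch, assuming that is what was used, might look like the following. The toy data frame holds made-up values standing in for dat; its column names and order are assumptions taken from the model formula and the description above.

```r
# Made-up stand-in for the fev data (NOT the real values); column names
# and order are assumed from the text of this solution.
set.seed(1)
n <- 40
toy <- data.frame(
  gender  = factor(rep(c("F", "M"), each = n / 2)),
  height  = round(runif(n, 46, 74), 1),
  smoking = rep(c(0, 1), times = n / 2)
)
toy$fev <- round(-5 + 0.13 * toy$height + rnorm(n, sd = 0.3), 2)
toy <- toy[, c("fev", "gender", "height", "smoking")]

# Scatterplot matrix: every variable plotted against every other one.
# Factor columns are converted to their integer codes for plotting.
pairs(toy)

k <- ncol(toy)
n_boxes <- k^2          # 16 cells in the 4-by-4 grid
n_plots <- k * (k - 1)  # 12 off-diagonal cells hold scatterplots
```

With the real data, pairs(dat) would take the place of pairs(toy).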

It is important to look at these plots before we begin, because they give a qualitative feel for the data. For example, we can see that gender and smoking are dummy variables, while fev and height are continuous variables that take positive real values. Furthermore, fev and height appear to be positively correlated, which makes intuitive sense.
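These visual impressions can also be checked numerically. The sketch below, on made-up data with assumed column names, counts the distinct values in each column (a dummy variable has exactly two) and computes the sample correlation between fev and height.

```r
# Made-up stand-in for the fev data (NOT the real values). fev is built
# to rise with height, so the positive correlation here is by construction.
set.seed(1)
n <- 40
toy <- data.frame(
  gender  = factor(rep(c("F", "M"), each = n / 2)),
  height  = round(runif(n, 46, 74), 1),
  smoking = rep(c(0, 1), times = n / 2)
)
toy$fev <- round(-5 + 0.13 * toy$height + rnorm(n, sd = 0.3), 2)

# Number of distinct values per column: 2 marks a dummy variable.
n_distinct <- sapply(toy, function(x) length(unique(x)))
print(n_distinct)

# Pearson correlation between fev and height.
r <- cor(toy$fev, toy$height)
print(r)
```

On the real data, the same two calls applied to dat would show whether the patterns seen in the plots hold up.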

We fit the following model:

fev = beta0 + beta1*gender + beta2*height + beta3*smoking
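Because gender is categorical, R does not multiply beta1 by the label itself. It builds a 0/1 indicator column instead, which is why the coefficient appears as genderM in the summary output. A small sketch with made-up data (values and column names are assumptions for illustration) shows the design matrix behind this formula:

```r
# Six made-up observations (NOT the real fev data).
toy <- data.frame(
  fev     = c(2.1, 3.4, 2.8, 4.0, 1.9, 3.6),
  gender  = factor(c("F", "M", "F", "M", "F", "M")),
  height  = c(57, 66, 61, 70, 55, 68),
  smoking = c(0, 0, 1, 1, 0, 1)
)

# Design matrix for the right-hand side of the model formula.
# The first factor level ("F") is the baseline; "M" gets the indicator.
X <- model.matrix(~ gender + height + smoking, data = toy)
print(X)
```

Under this default coding, beta1 is the expected difference in fev between M and F at the same height and smoking status.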

> model1 <- lm(fev ~ gender + height + smoking, data = dat)
> summary(model1)

Call:
lm(formula = fev ~ gender + height + smoking, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max
-1.82118 -0.30188  0.00575  0.30685  1.44124

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.244425   0.312132 -16.802  < 2e-16 ***
genderM      0.006640   0.034809   0.191  0.84875
height ...
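Each t value in the coefficient table is the estimate divided by its standard error. The sketch below redoes that arithmetic for the two complete rows, using only the printed numbers:

```r
# Estimates and standard errors copied from the printed coefficient table.
est <- c("(Intercept)" = -5.244425, genderM = 0.006640)
se  <- c("(Intercept)" =  0.312132, genderM = 0.034809)

# t statistic = estimate / standard error
t_value <- est / se
print(round(t_value, 3))
```

The results round to -16.802 and 0.191, matching the table. The small t value for genderM goes with its large p-value of 0.84875: once height and smoking are in the model, the data give no clear evidence of a difference in fev between genders.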
