# Bivariate regression model and caveats

1. (a) What is a bivariate regression model? Must it be linear? (b) State three caveats about regression. (c) What does the random error component in a regression model represent? (d) What is the difference between a regression residual and the true random error?

2. (a) Explain how you fit a regression to an Excel scatter plot. (b) What are the limitations of Excel's scatter plot fitted regression?

3. (a) Explain the logic of the ordinary least squares (OLS) method. (b) How are the least squares formulas for the slope and intercept derived? (c) What sums are needed to calculate the least squares estimates?

4. (a) Why can't we use the sum of the residuals to assess fit? (b) What sums are needed to calculate R²? (c) Name an advantage of using the R² statistic instead of the standard error s_yx to measure fit. (d) Why do we need the standard error s_yx?

5. (a) Explain why a confidence interval for the slope or intercept would be equivalent to a two-tailed hypothesis test. (b) Why is it especially important to test for a zero slope?

12:48: In the following regression, X = weekly pay, Y = income tax withheld, and n = 35 McDonald's employees. (a) Write the fitted regression equation. (b) State the degrees of freedom for a two-tailed test for zero slope, and use Appendix D to find the critical value at α = .05. (c) What is your conclusion about the slope? (d) Interpret the 95 percent confidence limits for the slope. (e) Verify that F = t² for the slope. (f) In your own words, describe the fit of this regression.

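| Statistic | Value |
|---|---|
| R² | 0.202 |
| Std. Error | 6.816 |
| n | 35 |

ANOVA table

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Regression | 387.6959 | 1 | 387.6959 | 8.35 | .0068 |
| Residual | 1,533.0614 | 33 | 46.4564 | | |
| Total | 1,920.7573 | 34 | | | |

Regression output

| Variables | Coefficients | Std. error | t (df = 33) | p-value | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Intercept | 30.7963 | 6.4078 | 4.806 | .0000 | 17.7595 | 43.8331 |
| Slope | 0.0343 | 0.0119 | 2.889 | .0068 | 0.0101 | 0.0584 |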
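As a rough cross-check of the output above, the following Python sketch recomputes R², the standard error, F, and the slope's 95 percent confidence limits from the reported sums of squares and coefficients. The variable names are illustrative only. The critical value 2.0345 (two-tailed, α = .05, df = 33) is an assumption supplied here; the nearest row tabulated in Appendix D may differ slightly.

```python
import math

# Values copied from the regression output above
ss_regression = 387.6959
ss_residual = 1533.0614
ss_total = 1920.7573
df_residual = 33
slope, slope_se = 0.0343, 0.0119

# Assumed critical value: two-tailed t for alpha = .05 with df = 33
t_crit = 2.0345

r_squared = ss_regression / ss_total       # explained share of total variation
ms_residual = ss_residual / df_residual    # mean squared error
std_error = math.sqrt(ms_residual)         # standard error of the regression
f_stat = ss_regression / ms_residual       # regression df = 1, so MSR = SSR
t_stat = slope / slope_se                  # from rounded coefficients
ci_lower = slope - t_crit * slope_se
ci_upper = slope + t_crit * slope_se

print(f"R^2        = {r_squared:.3f}")
print(f"Std. error = {std_error:.3f}")
print(f"F          = {f_stat:.2f}")
print(f"t^2        = {t_stat ** 2:.2f}")
print(f"95% CI     = ({ci_lower:.4f}, {ci_upper:.4f})")
```

Because the slope and its standard error are rounded to four decimals in the output, t² computed this way matches F only approximately, and the upper confidence limit can differ from the printed one in the last digit.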

14:16: (a) Plot the data on U.S. general aviation shipments. (b) Describe the pattern and discuss possible causes. (c) Would a fitted trend be helpful? Explain. (d) Make a similar graph for 1992-2003 only. Would a fitted trend be helpful in making a prediction for 2004? (e) Fit a trend model of your choice to the 1992-2003 data. (f) Make a forecast for 2004, using either the fitted trend model or a judgment forecast. Why is it best to ignore earlier years in this data set? Airplanes

U.S. Manufactured General Aviation Shipments, 1966-2003

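| Year | Planes |
|---|---|
| 1966 | 15,587 |
| 1967 | 13,484 |
| 1968 | 13,556 |
| 1969 | 12,407 |
| 1970 | 7,277 |
| 1971 | 7,346 |
| 1972 | 9,774 |
| 1973 | 13,646 |
| 1974 | 14,166 |
| 1975 | 14,056 |
| 1976 | 15,451 |
| 1977 | 16,904 |
| 1978 | 17,811 |
| 1979 | 17,048 |
| 1980 | 11,877 |
| 1981 | 9,457 |
| 1982 | 4,266 |
| 1983 | 2,691 |
| 1984 | 2,431 |
| 1985 | 2,029 |
| 1986 | 1,495 |
| 1987 | 1,085 |
| 1988 | 1,143 |
| 1989 | 1,535 |
| 1990 | 1,134 |
| 1991 | 1,021 |
| 1992 | 856 |
| 1993 | 870 |
| 1994 | 881 |
| 1995 | 1,028 |
| 1996 | 1,053 |
| 1997 | 1,482 |
| 1998 | 2,115 |
| 1999 | 2,421 |
| 2000 | 2,714 |
| 2001 | 2,538 |
| 2002 | 2,169 |
| 2003 | 2,090 |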
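For parts (e) and (f), one possible approach is a linear trend fitted by least squares to the 1992-2003 figures. The sketch below is only an illustration under that assumption: the question leaves the choice of trend model open, and the helper name `linear_trend` is hypothetical. Since shipments fall after 2000, a straight line may overstate 2004, so a judgment forecast could be just as defensible.

```python
# Shipments for 1992-2003, copied from the data above
years = list(range(1992, 2004))
planes = [856, 870, 881, 1028, 1053, 1482,
          2115, 2421, 2714, 2538, 2169, 2090]

n = len(years)
year_mean = sum(years) / n
planes_mean = sum(planes) / n

# Sums of squared deviations and cross products used by least squares
sxx = sum((x - year_mean) ** 2 for x in years)
sxy = sum((x - year_mean) * (y - planes_mean)
          for x, y in zip(years, planes))

slope = sxy / sxx                            # average change in planes per year
intercept = planes_mean - slope * year_mean  # fitted value at year 0


def linear_trend(year):
    """Return the fitted linear-trend value for a given year."""
    return intercept + slope * year


forecast_2004 = linear_trend(2004)
print(f"slope = {slope:.2f} planes per year")
print(f"2004 forecast = {forecast_2004:.0f} planes")
```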

#### Solution Preview

1. (a) What is a bivariate regression model? Must it be linear? (b) State three caveats about regression. (c) What does the random error component in a regression model represent? (d) What is the difference between a regression residual and the true random error?

Solution: (a) A bivariate regression model describes the relationship between two variables, X and Y. The model is used to estimate the value of Y that corresponds to a given value of X, by fitting a least-squares curve to the sample data. The resulting curve is called the "regression curve." No, the model need not be linear: a regression may be linear or nonlinear.

(b) (i) The "fit" of the regression does not depend on the sign of its slope.

(ii) View the intercept with skepticism unless X =0 is logically possible and was actually observed in the data set.

(iii) Regression does not demonstrate cause and effect between X and Y. A good fit only shows that X and Y vary together. Both could be affected by another variable or by the way data are defined.
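Caveat (i) can be illustrated with a small sketch: reversing the sign of every Y value reverses the sign of the slope but leaves R² unchanged. The data and the helper `fit_summary` below are made up for illustration and do not come from the exercises.

```python
def fit_summary(x, y):
    """Return (slope, R^2) for a least-squares line of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Hypothetical data with an upward trend, and its mirror image
x = [1, 2, 3, 4, 5, 6]
y_up = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
y_down = [-v for v in y_up]

slope_up, r2_up = fit_summary(x, y_up)
slope_down, r2_down = fit_summary(x, y_down)

print(f"upward:   slope = {slope_up:.3f}, R^2 = {r2_up:.4f}")
print(f"downward: slope = {slope_down:.3f}, R^2 = {r2_down:.4f}")
```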

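(c) The random error component represents how the observed data differ from the regression line, that is, the variation in Y that X does not explain.

(d) A residual is the difference between an observed value of Y and the value predicted by the fitted regression line; it represents the variation left unexplained after fitting the model. The true random error is the deviation of Y from the true (population) regression line. Because the true line is unknown, the true error cannot be observed, and the residual serves as its sample estimate.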
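The distinction in (d) can be seen in a small simulation, where the true errors are known only because the data are generated artificially. In this sketch the "true" parameters `beta0` and `beta1`, the error standard deviation, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical "true" model: Y = beta0 + beta1 * X + error
beta0, beta1 = 5.0, 2.0
x = [float(i) for i in range(1, 21)]
true_errors = [random.gauss(0, 1) for _ in x]
y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, true_errors)]

# Ordinary least squares estimates from the simulated sample
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
b1 = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
      / sum((xi - x_mean) ** 2 for xi in x))
b0 = y_mean - b1 * x_mean

# Residuals are measured from the fitted line, not from the true line
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(f"fitted line: Y = {b0:.3f} + {b1:.3f} X")
print(f"sum of residuals   = {sum(residuals):.6f}")
print(f"sum of true errors = {sum(true_errors):.6f}")
```

The residuals sum to zero (up to floating-point rounding) by construction of the least squares fit, whereas the true errors generally do not, which is one way to see that the two are different quantities.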

2. (a) Explain how you fit a regression to an Excel scatter plot. (b) What are the limitations of Excel's scatter plot fitted regression?

(a) First, enter the data in an Excel spreadsheet. Then click the Insert menu, choose the scatter diagram option, select all of the data, and plot it. To find the regression equation, click the Chart menu and then choose the Add Trendline option.

(b) Excel scatter ...

#### Solution Summary

The bivariate regression model and its caveats are analyzed. The limitations of Excel's scatter plot fitted regression are given.