
Forward selection, backward elimination, and stepwise selection


Explain the difference between forward selection, backward elimination, and stepwise selection for the inclusion of variables in a regression equation.

Explain how a researcher should interpret a correlation coefficient.

© BrainMass Inc. brainmass.com October 24, 2018, 9:13 pm

Solution Preview

Stepwise regression is a strategy for model building that examines a fraction of all possible regressions to come up with a model that is pretty good, even if not optimal. Stepwise regression examines one predictor at a time, deciding to put it into the model or leave it out. There are several main varieties of stepwise regression. In the regression module, StatTools has separate options for backward, forward, and regular stepwise regression strategies. I will next discuss these three options in general terms.

1. Backward Stepwise Regression
In backward stepwise regression, begin with the model that includes all of the predictors. Drop the worst predictor and rerun the regression with the remaining predictors. Repeat: from each new result, drop the worst remaining predictor and rerun the regression. At each step the worst predictor is removed, so the model becomes simpler with each step.
- With k predictors, at most k steps are needed to complete the backward stepwise procedure. At most k of the 2^k possible models must be examined.
- At each step, the predictor removed is ordinarily the one with the smallest absolute t-value (equivalently, the largest p-value).
- Justification: Removal of the predictor with the least absolute t-value will reduce the R-square by the least of any ...
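
The backward procedure described above can be sketched in a few lines of Python. This is only an illustration, not the StatTools implementation: the function names, the synthetic data, and the stopping rule (stop once every remaining predictor has |t| of at least 2.0) are assumptions made for the example, since the preview does not say when elimination should stop.

```python
import numpy as np

def t_values(design, y):
    """OLS t-statistics for each column of `design` (intercept column included)."""
    n, p = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)               # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta / np.sqrt(np.diag(cov))

def backward_eliminate(X, y, names, t_min=2.0):
    """Start with all predictors; repeatedly drop the one with the smallest |t|.

    t_min is a hypothetical stopping threshold chosen for this sketch.
    """
    keep = list(range(X.shape[1]))
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        t = np.abs(t_values(design, y))[1:]        # ignore the intercept
        worst = int(np.argmin(t))
        if t[worst] >= t_min:                      # every remaining predictor passes
            break
        keep.pop(worst)                            # remove the worst predictor, refit
    return [names[i] for i in keep]

# Synthetic data: y depends on x0 and x1 only; x2 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)
selected = backward_eliminate(X, y, ["x0", "x1", "x2"])
print(selected)
```

Each pass through the loop fits one model, so with k predictors the loop runs at most k times, matching the step count given above.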

Solution Summary

The solution explains the difference between forward selection, backward elimination, and stepwise selection.

See Also This Related BrainMass Solution

Questions: Moving Averages, Wine Sales, Tax Prediction, Income, etc.

1) The method of moving averages is used:
(A) to plot a series
(B) to exponentiate a series
(C) to smooth a series
(D) in regression analysis

2) The following table contains the number of complaints received in a department store for the first 6 months of last year.

Month Complaints
January 36
February 45
March 81
April 90
May 108
June 144
Referring to the above table, if a three-term moving average is used to smooth this series, what would be the second calculated term?
(A) 36
(B) 40.5
(C) 54
(D) 72
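
For reference, a three-term moving average replaces each interior point with the mean of three consecutive observations. A minimal sketch using the complaint counts from the table (the function name is made up for this example):

```python
def moving_average(values, window=3):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

complaints = [36, 45, 81, 90, 108, 144]   # January through June
smoothed = moving_average(complaints)
print(smoothed)  # → [54.0, 72.0, 93.0, 114.0]
```

The first calculated term is centered on February and the second on March.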

3) In multiple regression, the _____ procedure permits variables to enter and leave the model at different stages of its development.
(A) forward selection
(B) residual analysis
(C) backward elimination
(D) stepwise regression

4) A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of the household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is measured in years. The builder randomly selected 50 families and ran the multiple regression. The business literature involving human capital shows that education influences an individual's annual income. Combined, these may influence family size. With this in mind, what should the real estate builder be particularly concerned with when analyzing the multiple regression model?
(A) randomness of error terms
(B) collinearity
(C) normality of residuals
(D) missing observations

5) An auditor for a county government would like to develop a model to predict county taxes based on the age of single-family houses. A random sample of single-family houses has been selected, with the results as shown below:

Taxes Age of house
925 1
870 2
809 4
720 4
694 5
630 8
626 10
562 10
546 12
523 15
480 20
486 22
462 25
441 25
426 30
368 35
350 40
384 50
322 50

Assuming a quadratic relationship between the age of the houses and the county taxes, which of the following is the best prediction of the average county taxes for a 20-year-old house?
(A) $557.30
(B) $481.25
(C) $480.60
(D) $479.15
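
One way to obtain such a prediction is to fit taxes = b0 + b1(age) + b2(age^2) by least squares and evaluate the fitted curve at age 20. The sketch below uses NumPy's `polyfit` on the sample above; the variable names are illustrative, and the fitted value may differ slightly from the listed options depending on rounding in the original solution.

```python
import numpy as np

age = np.array([1, 2, 4, 4, 5, 8, 10, 10, 12, 15,
                20, 22, 25, 25, 30, 35, 40, 50, 50], dtype=float)
taxes = np.array([925, 870, 809, 720, 694, 630, 626, 562, 546, 523,
                  480, 486, 462, 441, 426, 368, 350, 384, 322], dtype=float)

# polyfit returns coefficients from the highest power down: b2, b1, b0
b2, b1, b0 = np.polyfit(age, taxes, deg=2)

# Predicted average county taxes for a 20-year-old house
predicted = b2 * 20**2 + b1 * 20 + b0
print(round(float(predicted), 2))
```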

6) When using the exponentially weighted moving average for the purpose of forecasting rather than smoothing:
(A) the previous smoothed value becomes the forecast
(B) the current smoothed value becomes the forecast
(C) the next smoothed value becomes the forecast
(D) none of the above

7) You need to decide whether you should invest in a particular stock. You would like to invest if the price is likely to rise in the long run. You have data on the daily average price of this stock over the last 12 months. Your best action is to
(A) compute moving averages
(B) perform exponential smoothing
(C) estimate a least-squares trend model
(D) compute the MD statistic

8) The number of cases of merlot wine sold by a Paso Robles winery in an 8-year period follows:

Year Cases of wine
1991 270
1992 356
1993 398
1994 456
1995 358
1996 500
1997 410
1998 376

Referring to the table above, the Holt-Winters method for forecasting with a smoothing constant of 0.2 for both level and trend will be used to smooth the wine sales. The smoothed values of the level and trend for 1993 are _____ and _____ respectively.
(A) 356 and 86
(B) 406.8 and 57.84
(C) 398 and 61.25
(D) 374.8 and 76.5
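
The level and trend updates can be sketched as follows. The convention used here is an assumption, since the question does not state one: the smoothing constant weights the previous estimate (previous level plus trend, or previous trend) and one minus the constant weights the new information, with the level initialized to the second observation and the trend to the second observation minus the first. Some texts use the opposite weighting, which gives different numbers.

```python
def holt_smooth(series, u=0.2, v=0.2):
    """Return (levels, trends), both starting at the second observation.

    Assumed convention: u weights the previous level plus trend and
    (1 - u) weights the new observation; v weights the previous trend.
    Assumed initialization: level = y[1], trend = y[1] - y[0].
    """
    levels = [series[1]]
    trends = [series[1] - series[0]]
    for y in series[2:]:
        level = u * (levels[-1] + trends[-1]) + (1 - u) * y
        trend = v * trends[-1] + (1 - v) * (level - levels[-1])
        levels.append(level)
        trends.append(trend)
    return levels, trends

cases = [270, 356, 398, 456, 358, 500, 410, 376]   # 1991 through 1998
levels, trends = holt_smooth(cases)

# Index 0 is 1992 (the initialization), so index 1 is 1993.
print(round(levels[1], 2), round(trends[1], 2))  # → 406.8 57.84
```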

9) A model that can be used to make predictions about long-term future values of a time series is:
(A) linear trend
(B) quadratic trend
(C) exponential trend
(D) all of the above
