# Multiple Regression Equation Development in SPSS

Develop a hypothetical multiple regression (prediction) equation to predict something in your area of professional or personal interest.

a. First, identify the dependent (criterion) variable that you are interested in predicting. What variable do you plan to predict? Next, choose two independent (predictor) variables that you will use to predict your chosen dependent (criterion) variable.

b. Make up at least 10 values for y, x1, and x2.

c. Calculate two correlation coefficients in SPSS.

d. Using your dependent (criterion) variable as "y" and your independent (predictor) variables as x1 and x2, use SPSS to create the multiple regression prediction equation based on your table of 10 (or more) values for y, x1, and x2.

e. Use values for x1 and x2 to predict y using the equation you created in Step D.

Complete detailed instructions attached.
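Outside SPSS, steps b through e can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the variables (exam score as y, hours studied as x1, hours slept as x2) and all 10 rows of data are made up, and the helper names `pearson_r`, `sample_sd`, `fit_two_predictors`, and `predict` are hypothetical. It uses the standard two-predictor formulas that build the regression weights from the pairwise correlations, so it should agree with an ordinary least-squares fit of the same data.

```python
from math import sqrt

# Made-up data (assumption): 10 students.
y = [55, 62, 60, 70, 72, 78, 85, 84, 92, 95]   # exam score (criterion)
x1 = [2, 4, 5, 6, 8, 9, 10, 12, 13, 15]        # hours studied (predictor 1)
x2 = [6, 7, 5, 8, 6, 7, 9, 6, 8, 7]            # hours slept (predictor 2)


def mean(values):
    return sum(values) / len(values)


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 from pairwise correlations."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    # Standardized (beta) weights for the two-predictor case.
    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)
    # Rescale to unstandardized slopes, then solve for the intercept.
    b1 = beta1 * sample_sd(y) / sample_sd(x1)
    b2 = beta2 * sample_sd(y) / sample_sd(x2)
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


def predict(b0, b1, b2, x1_value, x2_value):
    """Step e: plug predictor values into the fitted equation."""
    return b0 + b1 * x1_value + b2 * x2_value


# Step c: correlations of each predictor with y.
print(f"r(y, x1) = {pearson_r(y, x1):.3f}, r(y, x2) = {pearson_r(y, x2):.3f}")

# Step d: the prediction equation.
b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(f"y-hat = {b0:.3f} + {b1:.3f}*x1 + {b2:.3f}*x2")

# Step e: predict y for a new case (7 hours studied, 8 hours slept).
print(f"predicted y = {predict(b0, b1, b2, 7, 8):.2f}")
```

For the same data, the three printed coefficients correspond to the unstandardized B column of the SPSS Coefficients table.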

© BrainMass Inc. brainmass.com October 25, 2018, 9:42 am https://brainmass.com/statistics/correlation-and-regression-analysis/583710

#### Solution Summary

The solution provides a step-by-step method for carrying out the regression analysis. The formulas used in the calculation and interpretations of the results are also included.

24 Assorted Statistic Problems

(See attached file for full problem description)

---

Question 1

Lilo is creating an experiment to determine mean differences in weight loss (ratio level variable) of obese children as a result of participation in different weight loss programs (nominal variable with three levels: Atkins, South Beach, and Weight Watchers). Although the children will be randomly assigned to the weight loss program (i.e., the treatment), Lilo believes that the child's initial weight may make a difference in how much weight the child loses. Thus she would like to control for initial weight when running the analysis. Which statistical procedure is most appropriate?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 2

Gracie is interested in predicting the amount of water the dogs at the Humane Society consume per day (measured in ounces, ratio level variable) from the size of their paws (measured in inches, ratio level variable). Which statistical procedure is most appropriate?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 3

The research office at a large school district is interested in examining the effect of different formats of professional development on teachers' average implementation of what was learned in the professional development. Approximately 500 teachers from 20 different schools were assigned by convenience to receive professional development via the following formats: 1) face-to-face; 2) some face-to-face and some online; 3) totally online. The researchers anticipate that there will be differences not only due to treatment (i.e., the format of the professional development) but also due to contextual differences that may exist from school to school (e.g., some teachers may receive more support from administrators in terms of implementing what was learned in professional development, some teachers may have more resources to implement, and similar contextual differences). The researchers have the following measures on the teachers who participated: 1) format of professional development (three levels: face-to-face, some face-to-face and some online, totally online); 2) level of implementation (interval level variable); 3) school in which the teacher is employed (nominal with 20 levels). Which statistical procedure is most appropriate?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 4

Joe wants to determine whether the time to run a marathon (ratio level variable) differs, on average, for non-professional athletes who complete a 12-week endurance training program as compared to those who complete a 4-week endurance training program. Joe randomly assigns non-professional athletes to one of the two training programs. In conducting this experiment, Joe also wants to control for the number of prior marathons the participant has run. Which statistical procedure is most appropriate?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 5

Sophie wants to predict the weight of golden retrievers (ratio level variable) by whether they are full-blooded or mixed breed (nominal level variable with two levels). Which statistical procedure do you recommend?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 6

Dedra is interested in examining whether the aggression (interval level variable) of stray dogs changes, on average, depending on the amount of attention provided to them by shelter volunteers (ordinal level variable with three levels: minimum, moderate, maximum). To test the hypothesis, she selects 10 different shelters and randomly assigns the volunteers at each shelter to provide either minimum, moderate, or maximum attention to the dogs (thus at any one shelter, the volunteers will provide only one level of attention, not a combination). In traveling to the different shelters, she notices that some shelters have better trained volunteers, more resources, a calmer overall environment, and other characteristics that may also affect a dog's aggressiveness. Given this situation, what is the best procedure to recommend to Dedra?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 7

Jennifer wants to know whether the amount of time spent surfing the Internet each day (measured in minutes, ratio level variable) and employment status (nominal with two levels: employed or unemployed) can be used to predict the range of motion in a person's fingers (ratio level variable). Which statistical procedure is most appropriate?

a. analysis of covariance

b. hierarchical (or nested) ANOVA

c. multiple linear regression

d. simple linear regression

Question 8

A researcher conducts a simple linear regression and finds a correlation coefficient of .62. What is the coefficient of determination and interpretation of the coefficient of determination using Cohen's interpretations?

a. .38, generally interpreted to be a moderate effect.

b. .38, generally interpreted to be a large effect.

c. .62, generally interpreted to be a moderate effect.

d. .62, generally interpreted to be a large effect.
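The arithmetic behind Question 8 can be checked with a short sketch. The coefficient of determination in simple linear regression is the square of the correlation coefficient. The function names are hypothetical, and the cutoffs used for labeling are Cohen's commonly cited benchmarks for r (.10 small, .30 medium, .50 large, which square to .01, .09, and .25 for r squared); textbooks differ somewhat on wording and exact thresholds, so treat the label as a convention rather than a rule.

```python
def coefficient_of_determination(r):
    """Square a correlation coefficient to get the proportion of shared variance."""
    return r ** 2


def cohen_label_for_r(r):
    """Label |r| using Cohen's conventional benchmarks (.10, .30, .50)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"


r = 0.62
r_squared = coefficient_of_determination(r)
print(f"r = {r:.2f}, r^2 = {r_squared:.4f}, Cohen label: {cohen_label_for_r(r)}")
```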

Question 9

Ignoring the nested nature of data when generating inferential statistics leads to which one of the following?

a. Attenuated correlations

b. Decreased alpha

c. Increased alpha

d. Increased power

e. Increased unsystematic variation

Question 10

In a regression analysis with one independent variable (X), what is the predicted value of Y when the intercept is 8.0, the slope is 5.25, and X is 10?

a. 8.0

b. 5.25

c. 52.50

d. 60.5

e. Cannot be answered from the given information.
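A one-predictor prediction equation has the form predicted Y = intercept + slope × X. The sketch below plugs in the values from Question 10; the function name `predict_y` is hypothetical.

```python
def predict_y(intercept, slope, x):
    """Predicted Y from a simple linear regression equation."""
    return intercept + slope * x


# Values taken from Question 10: intercept 8.0, slope 5.25, X = 10.
y_hat = predict_y(intercept=8.0, slope=5.25, x=10)
print(y_hat)  # 8.0 + 5.25 * 10 = 60.5
```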

Question 11

Oscar has found a statistically significant linear model that allows him to predict attitude toward college (interval level variable) from college entrance exam score (interval level variable). The entrance exam scores in his sample ranged from 26 to 34. A fellow researcher wants to use the prediction equation to predict attitudes toward college from a different sample of students whose entrance exam scores ranged from 12 to 18. Do you recommend they proceed?

a. Yes

b. No

Question 12

Randy uses nested ANOVA to analyze data. In examining the residuals, he finds skewness of 14.9, kurtosis of 9.03, and a p value for the Shapiro-Wilk test of .001. What do the data suggest?

a. Homogeneity of variances is not a reasonable assumption.

b. Homogeneity of variances is a reasonable assumption.

c. Independence is not a reasonable assumption.

d. Independence is a reasonable assumption.

e. Normality is not a reasonable assumption.

f. Normality is a reasonable assumption.

g. There is a statistically significant nested effect.

h. There is a statistically significant main effect.

Question 13

In generating a simple linear regression, you fail to reject the null hypothesis that the slope is equal to zero. Which of the following will NOT be found?

a. A confidence interval for B (the unstandardized regression coefficient) that contains zero.

b. A non-statistically significant correlation coefficient.

c. A non-statistically significant F test for the model.

d. A p value for the regression coefficient of the independent variable of .001.

Question 14

Maleah examines preschool children and generates an analysis to predict height (ratio level variable) based on creativity (interval level variable). She finds that the change in height based on a one-unit change in creativity is zero. Given this, what is the best estimate for predicting height?

a. the effect size

b. the intercept

c. the mean of height

d. the mean of creativity

e. the slope of creativity

f. none of the above.

Question 15

Wesley would like to predict the number of hours college freshmen work per week (ratio level variable) from their parents' income (ratio level variable) and the student's previous year's income (ratio level variable). The coefficient for the correlation between parents' income and student's previous year's income is .13. What does this suggest?

a. There may be a problem with multicollinearity between parents' income and student's previous year's income.

b. There may be a problem with multicollinearity between number of hours worked per week and student's previous year's income.

c. There most likely will not be a problem with multicollinearity between parents' income and student's previous year's income.

d. There most likely will not be a problem with multicollinearity between number of hours worked per week and student's previous year's income.
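Multicollinearity concerns correlation among the predictors, and it is often summarized by tolerance and the variance inflation factor (VIF). With exactly two predictors, the R squared from regressing one predictor on the other equals their squared correlation, so both statistics follow directly from r. The sketch below applies this to the .13 correlation in Question 15. The function names are hypothetical, and the cutoff mentioned in the comment (VIF above 10, or tolerance below .10) is a commonly cited rule of thumb, not a fixed standard.

```python
def tolerance_two_predictors(r_12):
    """Tolerance = 1 - R^2 of one predictor regressed on the other."""
    return 1 - r_12 ** 2


def vif_two_predictors(r_12):
    """Variance inflation factor = 1 / tolerance."""
    return 1 / tolerance_two_predictors(r_12)


r_12 = 0.13  # correlation between the two predictors in Question 15
tolerance = tolerance_two_predictors(r_12)
vif = vif_two_predictors(r_12)

# A VIF near 1 means the predictors share very little variance;
# values above about 10 are often flagged as a possible problem.
print(f"tolerance = {tolerance:.4f}, VIF = {vif:.4f}")
```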

Question 16

The multiple R squared value provided by SPSS when generating a multiple linear regression is .76. This value refers to which one of the following?

a. 58% of the variance of the dependent variable is accounted for by the model.

b. 58% of the variance of the independent variables is accounted for by the dependent variable.

c. 76% of the variance of the dependent variable is accounted for by the model.

d. 76% of the variance of the independent variables is accounted for by the dependent variable.

e. The bivariate correlation coefficient between the independent and dependent variables is .76.

Question 17

The results of a nested ANOVA analysis indicate a statistically significant nested factor. Which of the following is a correct interpretation?

a. The assumption of independence has been violated.

b. Statistically significant main effects should not be interpreted when there is a statistically significant nested factor.

c. There is a statistically significant difference in the dependent measure between the levels of the nested factor within the same level of the independent variable.

d. There is not a statistically significant main effect.

Question 18

The zero order correlation coefficient between the number of ounces of chocolate consumed per day (ratio level variable) and the number of minutes slept per day (ratio level variable) is .21. Which of the following is a correct statement?

a. The p value for the correlation coefficient will be statistically significant.

b. The correlation between chocolate consumed and minutes slept with other independent variables partialed out is .21.

c. The bivariate correlation between chocolate consumed and minutes slept is .21.

d. The bivariate correlation between chocolate consumed and minutes slept is .04.

e. The multiple coefficient of determination is .21.

Question 19

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of cost for repairs [engest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. The engineer's estimate of the cost of school repairs is a good predictor of contract costs for school repairs.

b. The engineer's estimate of the cost of school repairs is NOT a good predictor of contract costs for school repairs.

c. The coefficient of determination is not statistically significant.

d. The overall regression model is not statistically significant.

e. The slope is approximately zero.

Question 20

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of cost for repairs [engest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. About 25% of the proportion of variance of the contract cost to repair schools can be accounted for by its linear relationship with the engineer's estimate of cost for repairs.

b. About 98% of the proportion of variance of the contract cost to repair schools can be accounted for by its linear relationship with the engineer's estimate of cost for repairs.

c. About 98% of the proportion of variance of the engineer's estimate of cost for repairs can be accounted for by its linear relationship with the engineer's estimate of the number of days to complete the job.

d. About 93% of the proportion of variance of the engineer's estimate of cost for repairs can be accounted for by its linear relationship with the engineer's estimate of the number of days to complete the job.

Question 21

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of cost for repairs [engest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. When the contract cost for school repairs increases by $1, the engineer's estimate of cost for school repairs increases by slightly over $25.

b. When the contract cost for school repairs increases by $1, the engineer's estimate of cost for school repairs increases by nearly $1.

c. When the contract cost for school repairs increases by $1, the engineer's estimate of cost for school repairs increases by .009.

d. When the contract cost for school repairs increases by $1, the engineer's estimate of the number of days to complete the job increases by nearly $1.

Question 22

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of cost for repairs [engest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. The predicted contract cost for school repairs is equal to: 26 + .926(engineer's estimate of cost for school repairs).

b. The predicted engineer's estimate of cost for school repairs is equal to: 26 + .926(contract cost for school repairs).

c. The predicted contract cost for school repairs is equal to: 26 + .926(engineer's estimate of the number of days to complete the job).

d. The prediction equation cannot be determined.

Question 23

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of school construction cost [engest] and by the engineer's estimate of number of work days needed to build the school [daysest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. The variance inflation factor suggests there is NOT a problem with multicollinearity between the engineer's estimate of school construction cost and the engineer's estimate of number of work days needed to build the school.

b. The variance inflation factor suggests there is a problem with multicollinearity between the engineer's estimate of school construction cost and the engineer's estimate of number of work days needed to build the school.

c. The variance inflation factor suggests there is NOT a problem with multicollinearity between the contract cost to build a school and the engineer's estimate of number of work days needed to build the school.

d. The variance inflation factor suggests there is a problem with multicollinearity between the contract cost to build a school and the engineer's estimate of number of work days needed to build the school.

Question 24

The state department of education wants to know if the contract cost to repair a school [cost] can be predicted by the engineer's estimate of school construction cost [engest] and by the engineer's estimate of number of work days needed to repair the school [daysest]. Using the construction data set in SPSS, conduct the appropriate inferential procedure at an alpha level of .05. [Note: Conduct the analysis using all cases in the data. In other words, do not remove or filter out any cases that may be outliers, etc.] Which one of the following is correct?

a. The bivariate correlation between the engineer's estimate of the number of work days needed to repair the school and the engineer's estimate of the cost for school repairs is .988.

b. About 35% of the variance of the contract cost for school repairs has been accounted for after the engineer's estimate of the cost for school repairs has been adjusted by removing the common variance with the engineer's estimate of number of work days needed to repair the school.

c. About 12% of the variance of the engineer's estimate of number of work days needed to repair the school is accounted for by the engineer's estimated cost for school repairs after controlling for the contract cost to repair the school.