# Multiple Regression Analysis

See attached files.

14.7 The business problem facing the director of broadcasting operations for a television station was the issue of standby hours (i.e. hours in which unionized graphic artists are paid but are not actually involved in any activity) and what factors were related to standby hours. The study included the following variable:

Standby hours (Y)-Total number of standby hours in a week

Total staff present (X1)-Weekly total of people-days

Remote hours (X2)-Total number of hours worked by employees at locations away from the central plant

Data were collected for 26 weeks; these data are organized and stored in Standby.

a. State the multiple regression equation.

b. Interpret the meaning of the slopes b1 and b2, in this problem.

c. Explain why the regression coefficient, b0, has no practical meaning in the context of this problem.

d. Predict the standby hours for a week in which the total staff present have 310 people-days and the remote hours are 400.

e. Construct a 95% confidence interval estimate for the mean standby hours for weeks in which the total staff present have 310 people-days and the remote hours are 400.

f. Construct a 95% prediction interval for the standby hours for a single week in which the total staff present have 310 people-days and the remote hours are 400.

14.25 In Problem 14.3 on page 532, you predicted the durability of a brand of running shoe, based on the forefoot shock-absorbing capability (FOREIMP) and the change of impact properties over time (MIDSOLE) for a sample of 15 pairs of shoes. Use the following results:

Variable Coeffieient Stand Error t Stat p-value

INTERCEPT -0.02686 0.06905 -0.39 0.7034

FOREIMP 0.79116 0.06295 12.57 0.0000

MIDSOLE 0.60484 0.07174 8.43 0.0000

a. Construct a 95% confidence interval estimate of the population slope between durability and forefoot shock-absorbing capability.

b. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model.On the basis of these results, indicate the independent variables to include in this model.

14.41 The marketing manager of a large supermarket chain faced the business problem of determining the effect on the sales of pet food of shelf space and whether the product was placed at the front (=1) or back (=0) of the aisle. Data are collected from a random sample of stores. The results are shown inthe following table (and organized and stores in Petfood):

Store Shelf space(ft) Location Weekly Sales (dollars)

1 5 Back 160

2 5 Front 220

3 5 Back 140

4 10 Back 190

5 10 Back 240

6 10 Front 260

7 15 Back 230

8 15 Back 270

9 15 Front 280

10 20 Back 260

11 20 Back 290

12 20 Front 310

For (a) through (m), do not include interaction term.

a. State the multiple regression equation that predicts sales based on shelf space and location.

b. Interpret the regression coefficients in (a).

c. Predict the weekly sales of pet food for a store with 8 feet of shelf space situated at the back of the aisle. Construct a 95% confidence interval estimate and a 95% prediction interval.

d. Perform a residual analysis on the results and determne whether the regression assumptions are valid.

e. Is there a significant relationship between sales and the two independent variables (shelf space and aisle position) at the 0.05 level of significance?

f. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.

g. Construct and interpret 95% confidence interval estimates of th population slope for the relationship between sales and shelf space and between sales and aisle location.

h. Compate the slope in (b) with the slope for the simple linear regression model of Problem 13.4 on page 481. Explain the difference in results.

i. Compute and interpret the eaning of the coefficient of multiple determination, r squared.

j. Compute and intrepret the adjusted r cubed.

k. Compare r squared with the r squared vaiue computed in Problem 13.14 on page 487.

l. Compute the coefficients of partial determination and interpret their meaning.

m. What assumption about the sope of shelf space with sales do you need to make in this problem?

n. Add an interaction term to the model and, at the 0.05 level of significance, determine whether it makes a significant contribution to the model.

o. On the basis of the results of (f) and (n), which model is most appropriate? Explain.

14.43 The owner of a moving company typicall has his most experienced manager predict the total number of labor hours that will be required to complete an upcoming move. This approach has proved useful in the past, but the owner has the business objective of developing a more accurate method of predicting labor hours. In a preliminary effort to provide a more accurate method, the owner has decided to use the number of cubic feet moved and whether there is an elevator in the apartment building as the independent variables and has collected data for 36 moves in which the orign and destination were within the borough of Manhattan in NYC and the travel time was an insignificant portion of hours worked. The data are organized and stored in Moving. For (a) through (k), do not include an interaction term.

a. State the multiple regression equation for predicting labor hours, using the number of cubic feet moved and whether there is an elevator.

b. Interpret the regression coefficients in (a).

c. Predict the labor hours for moving 500 cubic feet in an apartment building that has an elevator and construct a 95% confidence interval estimate and a 95% prediction interval.

d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.

e. Is there a significant relationship between labor hours and the two independent variables (cubic feet moved and whether there is an elevator in the apartment building) at the 0.05 level of significance.

f. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.

g. Construct a 95% confidence interval estimate of the population slope for the relationship between labor hours and cubic feet moved.

h. Construct a 95% confidence interval estimate for the relationship between labor hours and the presence of an elevator.

i. Compute and interpret the adjusted r squared.

j. Compute the coefficients of partial determination and interpret it's meaning.

k. What assumption do you need to make about the slope of labor hours with cubic feet moved?

l. Add an interaction term to the model and, at the 0.05 level of significance, determine whether it makes a significant controbution to the model.

m. On the basis of results of (f) and (l), which model is more appropriate? Explain.

14.49 The director ofa training program for a large insurance company has the business objective of determining which training method is best for training underwriters. The three methods to be evaluated are traditional, CD-ROM based, and Web based. The 30 trainees are divided into 3 randomly assigned groups of 10. Before the start of the training, each trainee is given a proficiency exam that measures mathematics and computer skills. At the end of the training, all students take the same end-of-training exam. The results are organized and stored in Underwriting. Develop a multiple regression model to predict the score on the end-of-training exam, based on the score on the proficiency exam and the method of training used. For (a) through (k), do not include an interaction term.

a. State the multiple regression equation.

b. Interpret the regression coefficients in (a).

c. Predict the end-of-training exam score for a student with a proficiency exam score of 100 and who had Web based training.

d. Perform a residual analysis on your results and determine whether the regression assumptions are vaild.

e. Is there significant relationship between the end-of-training exam score and the independent variables (proficiency score and training method) at the 0.05 level of significance?

f. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.

g. Construct and interpret 95% confidence interval estimate of the population slope for the relationship between end-of-training exam score and proficiency exam.

h. Construct and interpret 95% confidence interval estimates of the population slope for the relationship between end-of-training exam score and type of training method.

i. Compute and interpret the adjusted r squared.

j. Compute the coefficients of partial determination and interpret its meaning.

k. What assumption about the slope of proficiency score with end-of-training exam score do you need to make in this problem?

l. Add interaction terms to the model and, at the 0.05 level of significance, determine whether any interaction terms make a significant contribution to the model.

m. On the basis of the results of (f) and (l), which model is more appropriate? Explain.

https://brainmass.com/statistics/regression-analysis/multiple-regression-analysis-414091

#### Solution Summary

The solution provides step by step method for the calculation of multiple regression analysis. Formula for the calculation and Interpretations of the results are also included.

Multiple Regression Analysis, Time Series Analysis

See attached data file.

Background:

One day, after reporting the performance of the company to the shareholders, the CEO of A. Fictitious & Co. decided that he would like to quantify the impact of the company's expenditures has on how much sales it generates. In other words, he would like to know if the company increases the amount spent on marketing by one dollar, how large of an increase (or decrease) in sales would be expected? The major categories of expenditures and how much was spent in each are known for the company, as is the total sales generated per quarter for the last five years. He has the data file with the relevant data sent to you, and asks you to do the multiple-regression analysis to find out the answer to his questions. Oh, and he also asks you to do a time-series analysis on the total sales per quarter and forecast the amount of sales expected in the future.

Part I. Multiple Regression:

1. Look over the expenditure categories that the CEO gave you. Check to see if there interaction between the category Capital Equipment and Materials, and the category Salary and Benefits. Namely, do a multiple regression model with quarterly sales as the y-variable and the four expenditure categories given in the Data set as the x-variables. Do a second multiple regression model with four expenditure categories plus an interaction term between the category Capital Equipment and Materials, and the category Salary and Benefits. After comparing the two models, which is the better model? Can you conclude whether the two categories are independent of each other?

2. Based on your analysis in Question 1, write down the best-fit multiple regression equation for this problem with quarterly sales as the y-variable and the expenditure categories as the x-variables; do not forget the interaction term if there is one. Define each variable in the equation.

3. Answer the CEO's question. Namely, for each the four categories of expenditure (marketing, R&D, equipment and supplies, and salaries and benefits), if the CEO increases spending in one category by one million dollars (holding the others fixed), how much increase in sales should he expect? If you found an interaction term, explain its effects as well.

4. Being a very cautious person, you decide to also give the CEO the confidence intervals for the rates of increase your calculated in Question 3. Calculate the 95% confidence intervals for the slopes you calculated in Question 2, including the interaction term if you found one.

5. Being even more cautious - you are reporting the results to the CEO, after all - you decide to do a residual error analysis by applying the F-Test on the entire regression model. Do so, and interpret the results.

Part II. Time-series Analysis

The CEO noticed that he has five years of quarterly sales data in hand, and they form a time series. He decided to also ask you to perform time-series analysis on it, and use it to forecast what future sales are expected to be at the end of 1Q 2009.

6. Plot the quarterly sales as a function of time in your Excel data spreadsheet. From the shape of these graphs, and any analysis that you think is needed, determine what type of trend model is best suitable for this data. Write down the equation for the trend model, and define and explain each of the variables as it applies to this problem.

7. Do a regression analysis on the data for the trend model you decided on in Question 6, and determine the parameters for the model.

8. Answer the CEO's question. Tell him how much sales are expected to be at the end of 1Q 2009. Be careful, and also include the 95% confidence interval for this number.

Part III. Conclusions

9. Write a report to the CEO of your findings. Which expenditure has the largest impact on sales, and which one has the least impact on sales? How fast do you expect sales to increase in time?

10. When tasked to do this analysis, the CEO made a number of assumptions about the data, and what can be extrapolated from it. List down the assumptions he made, and criticize each assumption. Criticize also the conclusions that you drew from your analysis for your report to the CEO. As a starting point, remember that the data covered a period of five years.

View Full Posting Details