# Linear Regression Decision Model: Appraisers

I need help building a linear regression decision model. It needs to be done in Excel. The data you need is in the Excel file.

a. If the team of appraisers wants to use a simple linear regression decision model (one X, one Y) based on either X1 or X2, which of these independent variables do you recommend they use? Why?

b. Estimate the parameters, and therefore the prediction equation, for

$Y = b_0 + b_1 X_1 + b_2 X_2$

For each additional square foot, what is the change in heating cost?

c. Set up a binary variable (X3) for furnace age and add it to the model from part b. How much additional variability (in %) in heating cost, Y, if any, does furnace age help explain?

d. What is the "best" model you can recommend to the home appraisal team using any combination of the independent variables from parts a through c above? Write the specific model.
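One way to approach part a is to fit each single-predictor model and compare the resulting $R^2$ values. The sketch below is only an illustration in Python rather than Excel: the helper `r_squared` is a hypothetical name, and every data value is made up because the Excel file is not reproduced here.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    # In simple regression, R^2 equals the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)


# Hypothetical values, NOT the data from the Excel file.
outside_temp = [35, 29, 36, 60, 65, 30]          # X1
footage = [1500, 1800, 2100, 1200, 1000, 2400]   # X2
heating_cost = [250, 360, 330, 120, 90, 410]     # Y

candidates = {"X1": outside_temp, "X2": footage}
scores = {name: r_squared(x, heating_cost) for name, x in candidates.items()}
best = max(scores, key=scores.get)  # predictor with the larger R^2
print(scores, "-> prefer", best)
```

With the real data, the same comparison is what Excel's regression output reports as "R Square" for each model.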

© BrainMass Inc. brainmass.com October 25, 2018, 9:49 am https://brainmass.com/math/functional-analysis/linear-regression-decision-model-appraisers-588012

#### Solution Preview


Answers

a. If the team of appraisers wants to use a simple linear regression decision model (one X, one Y) based on either X1 or X2, which of these independent variables do you recommend they use? Why?

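The adequacy of a regression model is judged here by its $R^2$ value, the share of the variability in Y that the model explains. The predictor whose model has the higher $R^2$ is preferred.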

X1, Outside Temp

For independent variable X1, $R^2 = 0.6463$. Thus 64.63% of the variability in Heating Cost can be explained by the linear relationship between Outside Temp and Heating Cost (as described by the regression equation). The standard error of estimate of the regression model is 64.689.

Details

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.803956307 |
| R Square | 0.646345743 |
| Adjusted R Square | 0.626698284 |
| Standard Error | 64.68938215 |
| Observations | 20 |
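The reported statistics are tied together by standard formulas: Multiple R is the square root of R Square, and Adjusted R Square penalizes R Square for the number of predictors. A short Python check, assuming one predictor (k = 1) as in the X1 model:

```python
import math

r_square = 0.646345743   # reported R Square
n = 20                   # reported Observations
k = 1                    # assumed: one predictor (X1) in this model

multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(round(multiple_r, 6))         # 0.803956
print(round(adjusted_r_square, 6))  # 0.626698
```

Both values agree with the Excel output above to the digits shown.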

X2, Footage

For independent variable X2, $R^2 = 0.8863$. Thus 88.63% of the variability in Heating Cost can be explained by the linear relationship between Footage and Heating Cost (as described by the regression equation). The ...

#### Solution Summary

The solution provides a step-by-step method for the regression analysis. The formulas used and interpretations of the results are included in the attached Excel and Word documents.

Simple Linear Regression, Multiple Regression Model, and CI

1. A simple linear regression model relating investment (y) by companies to bank lending interest rate (x) is stated as

$y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon$ is the error term

a. What are the intercept and slope for the relationship between investment and interest rate stated above? What sign would you expect for the slope in the relationship for investment and interest rate? Please explain your reasoning.

b. What is the role of the error term in the simple linear regression model like the one stated above? Please list as many examples as you can of the factors that you think belong in the error term of the simple linear regression model stated above briefly justifying each example.

c. Please explain how you would estimate the relationship between investment and interest rate specified above. [note: you do not need to write any formulas as an answer to this question. Just explain what you will need and the reasoning behind the method you would use to estimate the relationship].

d. For what purpose could you use the estimated simple linear regression model for investment and interest rate? Why could the simple linear regression model stated above be inadequate for the purpose?

2. John Cooper, a real estate appraiser in a small town in Pennsylvania, has estimated the following simple linear regression model relating home prices (y) in the town to the square footage (x) of the homes.

$\hat{y} = 85{,}473 + 52.65x$

(620) (5.22)

n=36 SST=865,500 SSR=545,328 SSE=320,272

where the numbers in parentheses are the standard errors.

SST = total sum of squares, SSR = regression (explained) sum of squares, SSE = error (residual) sum of squares

a. Does the relationship between home prices and square footage reported above make sense? Please interpret the slope of the simple linear regression model estimated by John.

b. At the 1% level of significance, test whether square footage has a significant effect on home price. Please clearly show all your steps and comment on whether your decision to reject or not to reject the null hypothesis makes sense.

c. Calculate and interpret the coefficient of determination ($R^2$) for home price and square footage. Does the magnitude of $R^2$ make sense for the linear regression reported above?

d. Calculate the correlation coefficient (r) and conduct a hypothesis test with a null hypothesis of no correlation between home price and square footage ($r = 0$) against an alternative hypothesis that there is a correlation ($r \neq 0$). Conduct the test at the 1% level of significance.

e. Did you arrive at the same conclusion in b and d? Given the results of your hypothesis tests and the magnitude of $R^2$, do you think it is reasonable to use John's model for predicting home prices in the town?

f. Use the model to predict the selling price of a house with a square footage of 2000.

g. A house with 2000 square feet recently sold for $220,000. Explain why there is a difference between the price predicted by the model for such a house and the actual price at which the house was sold.

h. If you were to estimate a multiple regression model for home prices what other variables might you include in the model? Please briefly justify why you think each of the variables you would include influence home prices.
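The arithmetic behind parts b, c, d, f and g of problem 2 can be sketched in Python. This is an illustration only: the figures are taken exactly as printed in the problem, the variable names are hypothetical, and the critical values for the 1% tests still have to be read from a t table with n - 2 = 34 degrees of freedom.

```python
import math

b0, b1 = 85_473, 52.65        # reported intercept and slope
se_b1 = 5.22                  # reported standard error of the slope
n = 36
sst, ssr = 865_500, 545_328   # sums of squares as printed

t_slope = b1 / se_b1                          # part b: H0 slope = 0
r2 = ssr / sst                                # part c: coefficient of determination
r = math.sqrt(r2)                             # part d: positive root, slope is positive
t_corr = r * math.sqrt((n - 2) / (1 - r2))    # part d: H0 no correlation

predicted = b0 + b1 * 2000                    # part f: price at 2000 square feet
residual = 220_000 - predicted                # part g: actual minus predicted

print(f"t (slope) = {t_slope:.3f}, R^2 = {r2:.4f}, r = {r:.4f}, t (corr) = {t_corr:.3f}")
print(f"predicted = {predicted:,.0f}, residual = {residual:,.0f}")
```

The residual in part g is the portion of the sale price that the fitted line does not account for, which is what the error term in the model represents.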

3. Accountants at Zodok Company believed that several traveling executives submit unusually high travel vouchers when they return from business trips. The accountants took a sample of 35 vouchers submitted from the past year and estimated the following multiple linear regression model relating submitted travel costs (y) to the number of days on road (x1), distance traveled (x2), age of the executive (x3) and gender of the executive (x4).

$\hat{y} = 450.3 + 122.4x_1 + 0.42x_2 - 1.2x_3 - 15x_4$

(150.1) (32.1) (0.06) (0.6) (10)

N=35 SST=122,344 SSR=88,087.68 SSE=34,256

where $x_4 = 1$ for a female executive and $x_4 = 0$ for a male executive.

The numbers in parentheses are the standard errors.

SST = total sum of squares, SSR = regression (explained) sum of squares, SSE = error (residual) sum of squares

a. If you were a member of the team who estimated this model, what additional independent variables would you include in the model? Which variable(s) would you exclude? Please briefly justify your answers.

b. Please interpret the estimated coefficient for each of the independent variables. Does the sign of the estimated coefficient make sense for each independent variable?

c. At the 1% level of significance, test whether each of the four estimated coefficients is statistically significant. Please clearly show all your steps and comment on whether your decision to reject or not to reject the null hypothesis makes sense in each case.

d. At the 1% level of significance, please test whether the four independent variables jointly have a significant effect on travel costs, clearly showing all your steps.

e. Calculate and interpret the coefficient of multiple determination ($R^2$). Does the magnitude of $R^2$ make sense for the multiple linear regression model reported above?

f. Given the magnitude of $R^2$ and your decisions about the significance tests, do you think the model reported above is strong enough to be used for predicting the expected travel costs for executives of Zodok Company?

g. If a 50-year-old male executive submitted vouchers in the amount of $1100 for a 3-day trip to a city 500 miles away, what would be the difference between the submitted amount and the amount the model would predict for this trip?
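The test statistics and the prediction asked for in problem 3 follow directly from the reported output. A minimal Python sketch, using the figures as printed (the dictionary layout and the `predict` helper are illustrative assumptions, and critical values still come from t and F tables):

```python
coef = {"intercept": 450.3, "x1": 122.4, "x2": 0.42, "x3": -1.2, "x4": -15.0}
se = {"intercept": 150.1, "x1": 32.1, "x2": 0.06, "x3": 0.6, "x4": 10.0}
n, k = 35, 4                                   # observations, predictors
sst, ssr, sse = 122_344, 88_087.68, 34_256     # sums of squares as printed

# Part c: t statistic for each slope, df = n - k - 1 = 30.
t_stats = {name: coef[name] / se[name] for name in ("x1", "x2", "x3", "x4")}

# Part d: overall F statistic with (k, n - k - 1) degrees of freedom.
f_stat = (ssr / k) / (sse / (n - k - 1))

# Part e: coefficient of multiple determination.
r2 = ssr / sst


def predict(days, miles, age, female):
    """Predicted travel cost from the reported equation."""
    return (coef["intercept"] + coef["x1"] * days + coef["x2"] * miles
            + coef["x3"] * age + coef["x4"] * female)


# Part g: 50-year-old male, 3 days, 500 miles, voucher of 1100.
predicted = predict(days=3, miles=500, age=50, female=0)
difference = 1100 - predicted

print({name: round(t, 3) for name, t in t_stats.items()})
print(round(f_stat, 3), round(r2, 4), round(predicted, 2), round(difference, 2))
```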

4. A cola-dispensing machine is set to dispense a mean of 2.02 liters into a container labeled 2 liters. Actual quantities dispensed vary and the amounts are normally distributed with a standard deviation of 0.015 liters.

a. What is the probability a container will have less than 2 liters?

b. What is the probability that a container will have more than 2.04 liters?

c. What is the probability that a container will have between 2 and 2.04 liters?
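The three probabilities in problem 4 are areas under a normal curve with mean 2.02 and standard deviation 0.015. One way to compute them, sketched with Python's standard-library `statistics.NormalDist` instead of a z table:

```python
from statistics import NormalDist

fill = NormalDist(mu=2.02, sigma=0.015)   # litres dispensed

p_under_2 = fill.cdf(2.00)                     # part a: less than 2 litres
p_over_204 = 1 - fill.cdf(2.04)                # part b: more than 2.04 litres
p_between = fill.cdf(2.04) - fill.cdf(2.00)    # part c: between 2 and 2.04 litres

print(round(p_under_2, 4), round(p_over_204, 4), round(p_between, 4))
```

Because 2.00 and 2.04 lie the same distance on either side of the mean, the answers to parts a and b are equal by symmetry.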

5. Management of a refrigerator assembly plant is considering adopting a bonus system to increase production. Past records indicate that, on the average, 4000 units are assembled within a week. The distribution of the weekly production is approximately normally distributed with a standard deviation of 60 units.

a. If the bonus is paid on the upper 5 percent of production, what is the cutoff level of production above which the bonus will be paid?

b. What is the probability that at least 4300 refrigerators may be assembled in a given week?

c. What is the probability that less than 3900 units may be assembled in a given week?
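Part a of problem 5 runs the normal calculation in reverse: it asks for the production level with 95% of the distribution below it. A sketch using the inverse CDF from Python's standard library, with the mean and standard deviation given in the problem:

```python
from statistics import NormalDist

weekly = NormalDist(mu=4000, sigma=60)   # units assembled per week

cutoff = weekly.inv_cdf(0.95)            # part a: top 5 percent starts here
p_at_least_4300 = 1 - weekly.cdf(4300)   # part b: 5 standard deviations above the mean
p_below_3900 = weekly.cdf(3900)          # part c

print(round(cutoff, 1), p_at_least_4300, round(p_below_3900, 4))
```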

6. Think about any variable for which you would like to know the mean value in the population but you don't have any data at the population level. Assuming that you have enough budget to collect data only from a sample of 25, please outline and describe the steps you would follow to calculate the 90% confidence interval for the population mean you are interested in.

7. A pharmaceutical company wanted to estimate the population mean of monthly sales for its salespeople. Forty salespeople were randomly selected. Their mean monthly sales were $10,000, with a population standard deviation of $1000. Construct and interpret a 90% confidence interval for the mean sales of all the salespeople.
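Because problem 7 gives a population standard deviation, the interval is a z interval: sample mean plus or minus z times sigma over the square root of n. A minimal Python sketch (the function name `z_interval` is a hypothetical helper):

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin


low, high = z_interval(mean=10_000, sigma=1_000, n=40, confidence=0.90)
print(round(low, 2), round(high, 2))
```

For problem 9, where only a sample standard deviation from 16 observations is available, the same structure applies with a t critical value (15 degrees of freedom) in place of z.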

8. Mileage tests were conducted on a randomly selected sample of 100 newly developed automobile tires. The results showed that the average tread life for the sample was 50,000 miles with a sample standard deviation of 3,500 miles.

a. What is the best estimate of the average tread life in miles for the entire population of these tires?

b. Please construct and interpret a 90% confidence interval for the tread life of the tires.

9. A random sample of 16 ATM transactions at the Last National Bank of Flatrock revealed a mean transaction time of 2.8 minutes with a sample standard deviation of 1.2 minutes. Please construct and interpret the 90% confidence interval for the true (population) mean transaction time.

10. A manufacturer of stereo equipment introduces new models in the fall. Retail dealers are surveyed immediately after the Christmas selling season regarding their stock on hand of each piece of equipment. It has been discovered that unless 40% of the new equipment ordered by the retailers in the fall has been sold by Christmas, immediate production cutbacks are needed. The manufacturer has found that contacting all of the dealers after Christmas by mail is frustrating, as many of them never respond. This year 80 dealers were selected at random and telephoned regarding a new receiver. It was discovered that 38% of those receivers had been sold. Construct a 99% confidence interval for the proportion of sales for all the dealers of this receiver. Based on your result, is it likely that production cutbacks will be needed?

11. A random sample of 160 commercial customers of PayMor Lumber revealed that 32 had paid their accounts within a month of billing. Please construct and interpret the 90% confidence interval for the population proportion of customers who pay within a month.
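Problems 10 and 11 both call for a large-sample confidence interval for a proportion. A sketch under that normal-approximation assumption, with `proportion_interval` as a hypothetical helper name:

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


q10 = proportion_interval(p_hat=0.38, n=80, confidence=0.99)        # receivers sold
q11 = proportion_interval(p_hat=32 / 160, n=160, confidence=0.90)   # paid within a month

print([round(v, 4) for v in q10])
print([round(v, 4) for v in q11])
```

For problem 10, the question is then whether the 40% threshold falls inside the resulting interval.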

12. Think about a claim often made by a company or government agency or any other institution.

a. Describe the steps you would follow to test the validity of the claim.

b. What Type I and Type II errors could you commit while conducting a hypothesis test about the claim?

c. If you choose a 1% level of significance to conduct your hypothesis test, please explain the relationship between this level of significance and Type I error.

13. During an emissions testing period, a group of 10 vehicles using an 85% ethanol-gasoline mixture showed mean CO2 emissions of 240 pounds per 100 miles, with a sample standard deviation of 20 pounds. Another group of 14 vehicles using regular gasoline showed mean CO2 emissions of 252 pounds per 100 miles, with a sample standard deviation of 15 pounds. Assuming unequal population variances, please test whether the emission rates are statistically the same for cars using the ethanol-gasoline mixture and regular gasoline. Conduct the test at the 1% level of significance and explain your decision.
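With unequal variances assumed, problem 13 is usually handled with Welch's two-sample t test, whose degrees of freedom come from the Welch-Satterthwaite approximation. A Python sketch of the test statistic and degrees of freedom (the helper name `welch_t` is hypothetical; the two-tailed critical value still comes from a t table):

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # variance of each sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Ethanol mixture (group 1) versus regular gasoline (group 2).
t_stat, df = welch_t(240, 20, 10, 252, 15, 14)
print(round(t_stat, 3), round(df, 2))
```

The degrees of freedom are typically rounded down before looking up the critical value.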

14. The average cost of tuition, room and board at small private liberal arts colleges is reported to be $8,500 per term, but a financial administrator believes that the average cost is higher. A survey of 36 small liberal arts colleges showed that the average cost per term is $8,745 with a sample standard deviation of $1,200. At the 5% level of significance, is the financial administrator's belief supported by the evidence?

15. The national average gross annual income of certified welders is $30,000. The ship building association believes that its welders on average earn at least as much as the national average. A survey of 25 welders in the ship building industry revealed an average annual income of $32,000 with a sample standard deviation of $2,000. At the 1% level of significance, is the ship building association right? Show all your steps and explain your decision.
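Problems 14 and 15 both reduce to a one-sample t statistic, the sample mean minus the hypothesized mean, divided by the standard error. A short Python sketch (the helper `one_sample_t` is a hypothetical name; the direction of each hypothesis and the table critical value are left to the reader):

```python
from math import sqrt


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of a mean, df = n - 1."""
    return (sample_mean - hypothesized_mean) / (sample_sd / sqrt(n))


t_tuition = one_sample_t(8_745, 8_500, 1_200, 36)    # problem 14, df = 35
t_welders = one_sample_t(32_000, 30_000, 2_000, 25)  # problem 15, df = 24

print(round(t_tuition, 3), round(t_welders, 3))
```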
