# General Statistics

I need help interpreting what is being asked and how to solve problems.

##### Solution Summary

1. A real estate agent named Betsy has gathered data on 150 houses that were recently sold in Portland, Oregon. Included in this data set are observations for each of the following variables:
• the appraised value of each house (in thousands of dollars),
• the selling price of each house (in thousands of dollars),
• the size of each house (in hundreds of square feet), and
• the number of bedrooms in each house.

Betsy wants to understand the relationship between the selling prices (Y) and the appraised values (X) of homes in the Portland area. Use the following scatterplot and regression output to answer Betsy's questions.

a) Is there evidence of a linear relationship between the selling price and appraised value? If so, characterize the relationship (i.e., indicate whether the relationship is a positive or negative one, a strong or weak one, etc.).
b) Identify any unusual observations by circling them in the scatterplot. What would you recommend Betsy do with these data points?
c) The true or population regression model is

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

What is the estimated regression model? Briefly explain the difference between the population and estimated models.
d) Interpret each of the following terms using the output below:
• The standard error of estimate, $s_e$
• The coefficient of determination, $R^2$
• The slope of the estimated least squares line

Results of multiple regression for Price

Summary measures

| Measure | Value |
|---|---|
| Multiple R | 0.8412 |
| R-Square | 0.7077 |
| StErr of Est | 7.9406 |

ANOVA Table

| Source | df | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Explained | 1 | 22593.2835 | 22593.2835 | 358.3182 | 0.0000 |
| Unexplained | 148 | 9331.9463 | 63.0537 | | |

Regression coefficients

| Term | Coefficient | Std Err | t-value | p-value |
|---|---|---|---|---|
| Constant | 7.7083 | 6.6543 | 1.1584 | 0.2486 |
| Value | 0.9482 | 0.0501 | 18.9293 | 0.0000 |
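
The summary measures in this output can be rebuilt from the ANOVA table, which may help when interpreting them in part d). The following is a minimal sketch in Python; the variable names are illustrative and not part of the StatPro output, and only figures printed above are used.

```python
import math

# Figures copied from the simple-regression output (Price on Value).
n = 150                      # houses in the sample
ss_explained = 22593.2835
ss_unexplained = 9331.9463
df_unexplained = n - 2       # one predictor plus the intercept

ss_total = ss_explained + ss_unexplained
r_square = ss_explained / ss_total               # share of price variation explained
ms_unexplained = ss_unexplained / df_unexplained
std_err_est = math.sqrt(ms_unexplained)          # typical residual size, in $1000s
f_stat = ss_explained / ms_unexplained           # explained df is 1
t_slope = 0.9482 / 0.0501                        # coefficient / its standard error

print(f"R-Square     = {r_square:.4f}")
print(f"StErr of Est = {std_err_est:.4f}")
print(f"F            = {f_stat:.4f}")
print(f"t (slope)    = {t_slope:.4f}")
```

The recomputed values should agree with the printed output up to rounding. The t-value differs slightly in the third decimal because the coefficient and its standard error are themselves rounded in the table.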

e) In terms of finding houses that are good deals, would Betsy be more interested in the points above or below the regression line? Explain.
f) Betsy proposes to include the two remaining variables, the size of the home and the number of bedrooms in the home, in the regression analysis. Given the output below, should Betsy have included these two variables?

Results of multiple regression for Price

Summary measures

| Measure | Value |
|---|---|
| Multiple R | 0.8783 |
| R-Square | 0.7714 |
| StErr of Est | 7.0702 |

ANOVA Table

| Source | df | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Explained | 3 | 24626.9734 | 8208.9911 | 164.2190 | 0.0000 |
| Unexplained | 146 | 7298.2563 | 49.9881 | | |

Regression coefficients

| Term | Coefficient | Std Err | t-value | p-value |
|---|---|---|---|---|
| Constant | 2.5174 | 6.5519 | 0.3842 | 0.7014 |
| Value | 0.6841 | 0.0621 | 11.0128 | 0.0000 |
| Square_Footage | 2.4931 | 0.4627 | 5.3884 | 0.0000 |
| Number_Bedrooms | -1.2086 | 1.1094 | -1.0894 | 0.2778 |
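
One way to approach part f) is to compare the two models on adjusted R-square and to run a partial F-test on the two added variables jointly. The sketch below uses only the sums of squares printed in the two ANOVA tables; the partial F-test is a standard technique brought in here for illustration, not something the StatPro output reports.

```python
# Sums of squares from the two ANOVA tables (Price as the response).
n = 150
sse_small, k_small = 9331.9463, 1    # Value only
sse_full, k_full = 7298.2563, 3      # Value, Square_Footage, Number_Bedrooms
ss_total = 22593.2835 + 9331.9463    # explained + unexplained, first model

def adjusted_r_square(sse, k):
    """Adjusted R-square: penalizes each extra predictor."""
    return 1 - (sse / (n - k - 1)) / (ss_total / (n - 1))

adj_small = adjusted_r_square(sse_small, k_small)
adj_full = adjusted_r_square(sse_full, k_full)

# Partial F: does adding the extra predictors reduce unexplained variation
# by more than chance alone would suggest?
extra = k_full - k_small
partial_f = ((sse_small - sse_full) / extra) / (sse_full / (n - k_full - 1))

print(f"Adjusted R-square, Value only: {adj_small:.4f}")
print(f"Adjusted R-square, full model: {adj_full:.4f}")
print(f"Partial F ({extra}, {n - k_full - 1} df): {partial_f:.2f}")
```

A partial F of this size is well above the 5% critical value for (2, 146) degrees of freedom, which is roughly 3.06 in a standard F table, so the two variables add explanatory power as a pair. The individual p-values in the coefficient table suggest that Square_Footage is doing that work, while Number_Bedrooms (p = 0.2778) is not significant on its own.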

2. Rob needs to buy a PC to replace his aging Mac and has collected data on 24 computers. For each computer he has recorded:
• The speed, measured in megahertz,
• The time in minutes the battery maintains its charge,
• The RAM, measured in megabytes,
• The chip type (DX, SX, or SL), encoded as three dummy variables: Chip Type DX, Chip Type SX, and Chip Type SL,
• The monitor type, either Color or Mono, where Color is coded 1 and Mono is coded 0.

a) Why does StatPro display an error message when Rob tries to include the 3 chip type dummy variables in a regression analysis?
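
A likely explanation is exact multicollinearity: every laptop has exactly one chip type, so the three dummy columns always sum to 1, which duplicates the intercept column. The sketch below illustrates this with six hypothetical laptops (not Rob's actual data) and a small helper, `matrix_rank`, written for this example. How StatPro detects and reports the problem internally is not shown in the source.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination (illustrative helper)."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a nonzero pivot at or below the current rank position.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical chip types for six laptops.
chips = ["DX", "SX", "SL", "DX", "SL", "SX"]

# Columns: intercept, DX, SX, SL
all_three = [[1, int(c == "DX"), int(c == "SX"), int(c == "SL")] for c in chips]
# Columns: intercept, DX, SL (SX becomes the baseline category)
drop_one = [[1, int(c == "DX"), int(c == "SL")] for c in chips]

rank_all_three = matrix_rank(all_three)
rank_drop_one = matrix_rank(drop_one)

print(f"All three dummies: {len(all_three[0])} columns, rank {rank_all_three}")
print(f"One dummy dropped: {len(drop_one[0])} columns, rank {rank_drop_one}")
```

When the rank is smaller than the number of columns, least squares has no unique solution. Dropping one dummy restores full rank, and the dropped category becomes the baseline against which the other two coefficients are read.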

b) Interpret the coefficients for Charge, Chip Type DX, and Monitor Type below.

Results of multiple regression for Price

Summary measures

| Measure | Value |
|---|---|
| Multiple R | 82% |
| R-Square | 67% |
| StErr of Est | \$929 |

Regression coefficients

| Term | Coefficient | p-value |
|---|---|---|
| Constant | \$2,135 | 0.0357 |
| Speed | \$63 | 0.1348 |
| Charge | \$6 | 0.2727 |
| RAM | \$26 | 0.4412 |
| Chip Type DX | \$2,839 | 0.0468 |
| Chip Type SL | \$302 | 0.6696 |
| Monitor Type | \$2,130 | 0.0005 |

c) Use the estimated regression equation to predict the price of a laptop computer with the following features: a 50-megahertz processor, a battery that holds its charge for 180 minutes, 20 megabytes of RAM, a DX chip, and a color monitor.
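
The point prediction is the constant plus each coefficient multiplied by the laptop's value for that variable. A short sketch follows, using the rounded coefficients from the output and assuming Chip Type SX is the omitted baseline, since it has no coefficient in the table.

```python
# Coefficients from the laptop regression output, in dollars.
coefficients = {
    "Constant": 2135,
    "Speed": 63,
    "Charge": 6,
    "RAM": 26,
    "Chip Type DX": 2839,
    "Chip Type SL": 302,
    "Monitor Type": 2130,
}

# The laptop described in part c).
laptop = {
    "Constant": 1,       # multiplies the intercept
    "Speed": 50,         # megahertz
    "Charge": 180,       # minutes
    "RAM": 20,           # megabytes
    "Chip Type DX": 1,   # DX chip
    "Chip Type SL": 0,
    "Monitor Type": 1,   # Color is coded 1
}

predicted_price = sum(coefficients[k] * laptop[k] for k in coefficients)
print(f"Predicted price: ${predicted_price:,}")
```

Because the printed coefficients are rounded to whole dollars, the result is approximate.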

d) Find the 95% prediction interval for the price of the laptop characterized in part c.
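
An exact prediction interval needs the underlying data, which the output does not include. A common textbook approximation is the point prediction plus or minus a t multiplier times the standard error of estimate. The sketch below assumes that approximation, 24 - 6 - 1 = 17 degrees of freedom, and a t-table value of 2.110; it slightly understates the true interval width because it ignores uncertainty in the estimated coefficients.

```python
# Approximation (an assumption, not StatPro output): y_hat +/- t * s_e.
n_obs = 24
n_predictors = 6                 # Speed, Charge, RAM, DX, SL, Monitor Type
df = n_obs - n_predictors - 1    # residual degrees of freedom
t_crit = 2.110                   # t table, 97.5th percentile, 17 df
std_err_est = 929                # StErr of Est from the output, in dollars

# Constant, Speed, Charge, RAM, Chip Type DX, Chip Type SL, Monitor Type
coefs = [2135, 63, 6, 26, 2839, 302, 2130]
values = [1, 50, 180, 20, 1, 0, 1]
y_hat = sum(c * v for c, v in zip(coefs, values))

margin = t_crit * std_err_est
lower, upper = y_hat - margin, y_hat + margin
print(f"Approximate 95% prediction interval: ${lower:,.2f} to ${upper:,.2f}")
```

The interval is wide relative to the prediction itself, which reflects the large standard error of estimate and the small sample of 24 computers.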

