Hypothesis testing and regression analysis

PART I. HYPOTHESIS TESTING

PROBLEM 1 A certain brand of fluorescent light tube was advertised as having an effective life span before burning out of 4000 hours. A random sample of 84 bulbs was burned out, giving a mean illumination life span of 1870 hours and a sample standard deviation of 90 hours. Construct a 95% confidence interval based on this sample and be sure to interpret this interval.
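The interval arithmetic for Problem 1 can be sketched as follows. This is a minimal illustration, not the posted solution: it assumes the sample is large enough to read the critical value from the last (infinite-df) row of the Student t table, and the variable names are chosen here for illustration.

```python
import math

# Sample values taken from Problem 1.
n = 84          # sample size
xbar = 1870.0   # sample mean life span (hours)
s = 90.0        # sample standard deviation (hours)

# Assumption: large sample, so use the infinite-df row of the
# Student t table, .025 column (two tails of .025 each for 95%).
t_crit = 1.960

std_error = s / math.sqrt(n)   # estimated standard error of the mean
margin = t_crit * std_error    # half-width of the interval
lower, upper = xbar - margin, xbar + margin

print(f"95% CI: ({lower:.2f}, {upper:.2f}) hours")

# The advertised value can then be checked against the interval.
advertised = 4000.0
print("Advertised value inside interval:", lower <= advertised <= upper)
```

If the advertised value falls outside the interval, the sample does not support the advertised life span at the 95% confidence level.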

PROBLEM 2 Given the following data from two independent data sets, conduct a one-tail hypothesis test to determine if the means are statistically equal using alpha=0.05. Do NOT do a confidence interval.

n1 = 35       n2 = 30
xbar1 = 32    xbar2 = 25
s1 = 7        s2 = 6
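A minimal sketch of the test statistic for Problem 2, assuming both samples count as large (so the variances are not pooled) and assuming the one-tail alternative is that mean 1 exceeds mean 2. The problem does not state the direction, so that choice is an assumption made here.

```python
import math

# Sample values taken from Problem 2.
n1, xbar1, s1 = 35, 32.0, 7.0
n2, xbar2, s2 = 30, 25.0, 6.0

# Large-sample standard error of the difference in means (unpooled).
std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)

# "Data" t: observed difference divided by its standard error.
t_data = (xbar1 - xbar2) / std_error

# One-tail critical value, alpha = .05, infinite-df row of the t table.
t_crit = 1.645

# Assumed alternative: mean 1 > mean 2 (upper-tail test).
reject_h0 = t_data > t_crit

print(f"t = {t_data:.3f}, critical = {t_crit}, reject H0: {reject_h0}")
```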

PROBLEM 3. A test was conducted to determine whether the gender of a display model affected the likelihood that consumers would prefer a new product. A survey of consumers at a trade show which used a female spokesperson determined that 120 of 300 customers preferred the product, while 92 of 280 customers preferred the product when it was shown by a male spokesperson.

Do the samples provide sufficient evidence to indicate that the gender of the salesperson affects the likelihood of the product being favorably regarded by consumers? Evaluate with a two-tail, alpha = .01 test. Do NOT do a confidence interval.
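The two-proportion test in Problem 3 can be sketched as below. It assumes the usual pooled-proportion standard error under a null hypothesis of no difference between the two spokesperson samples; the variable names are illustrative only.

```python
import math

# Counts taken from Problem 3 (first and second spokesperson samples).
x1, n1 = 120, 300
x2, n2 = 92, 280

p1, p2 = x1 / n1, x2 / n2

# Under H0 (difference assumed to be 0) the two samples share one
# proportion, estimated by pooling the successes.
p_pooled = (x1 + x2) / (n1 + n2)
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z = (p1 - p2) / std_error

# Two-tail critical value, alpha = .01: infinite-df row, .005 column.
z_crit = 2.576
reject_h0 = abs(z) > z_crit

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, z = {z:.3f}, reject H0: {reject_h0}")
```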

PROBLEM 4 Assuming that the population variances are equal for Male and Female GPA's, test the following sample data to see if Male and Female PhD candidate GPA's (Means) are equal. Conduct a two-tail hypothesis test at α = .01 to determine whether the sample means are different. Do NOT do a confidence interval.

                      Male GPA's   Female GPA's
Sample Size           12           13
Sample Mean           2.8          4.95
Sample Standard Dev   .25          .8
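A minimal sketch of the pooled-variance ("pooled s") t test that Problem 4 calls for, using DF = n1 + n2 - 2. The critical value is read from the Student t table in this document (df = 23, .005 column); the variable names are assumptions made for illustration.

```python
import math

# Sample values taken from Problem 4.
n_m, mean_m, s_m = 12, 2.8, 0.25    # male
n_f, mean_f, s_f = 13, 4.95, 0.8    # female

# Small-sample degrees of freedom with equal variances assumed.
df = n_m + n_f - 2

# Pooled variance: df-weighted average of the two sample variances.
pooled_var = ((n_m - 1) * s_m**2 + (n_f - 1) * s_f**2) / df
pooled_s = math.sqrt(pooled_var)

std_error = pooled_s * math.sqrt(1 / n_m + 1 / n_f)
t_data = (mean_m - mean_f) / std_error

# Two-tail, alpha = .01: df = 23 row, .005 column of the t table.
t_crit = 2.807
reject_h0 = abs(t_data) > t_crit

print(f"df = {df}, pooled s = {pooled_s:.4f}, t = {t_data:.3f}, reject H0: {reject_h0}")
```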

PART II REGRESSION ANALYSIS
Problem 5 You wish to run the regression model (less Intercept and coefficients) shown below:
VOTE = URBAN + INCOME + EDUCATE

Given the Excel spreadsheet below for annual data from 1970 to 2006 (with the data for rows 5 through 35 not shown), complete all necessary entries in the Excel Regression window shown below the data.

     A      B      C       D        E
1    YEAR   VOTE   URBAN   INCOME   EDUCATE
2    1970   49.0   62.0    7488     4.3
3    1971   58.3   65.2    7635     8.3
4    1972   45.2   75.0    7879     4.5
...
36   2004   50.1   92.1    15321    4.9
37   2005   67.7   94.0    15643    4.7
38   2006   54.2   95.6    16001    5.1

Regression

Input
  Input Y Range: ________
  Input X Range: ________
  [ ] Labels    [ ] Constant is Zero
  [ ] Confidence Level: 95 %

Output options
  ( ) Output Range: ________
  ( ) New Worksheet Ply: ________
  ( ) New Workbook

Residuals
  [ ] Residuals                 [ ] Residual Plots
  [ ] Standardized Residuals    [ ] Line Fit Plots

Normal Probability
  [ ] Normal Probability Plots

Buttons: OK, Cancel, Help

PROBLEM 6. Use the following regression output to determine the following:

A real estate investor has devised a model to estimate home prices in a new suburban development. Data for a random sample of 100 homes were gathered on the selling price of the home ($ thousands), the home size (square feet), the lot size (thousands of square feet), and the number of bedrooms.

The following multiple regression output was generated:

Regression Statistics
Multiple R           0.8647
R Square             0.7222
Adjusted R Square    0.6888
Standard Error       16.0389
Observations         100

                    Coefficients   Standard Error   t Stat    P-value
Intercept           -24.888        38.3735          -0.7021   0.2154
X1 (Square Feet)    0.2323         0.0184           9.3122    0.0000
X2 (Lot Size)       11.2589        1.7120           4.3256    0.0001
X3 (Bedrooms)       15.2356        6.8905           3.2158    0.1589

a. Why is the coefficient for BEDROOMS a positive number?

b. Which is the most statistically significant variable? What evidence shows this?

c. Which is the least statistically significant variable? What evidence shows this?

d. For a 0.05 level of significance, should any variable be dropped from this model? Why or why not?

e. Interpret the value of R squared. How does this value differ from the adjusted R squared?

f. Predict the sales price of a 1134-square-foot home with a lot size of 15,400 square feet and 2 bedrooms.
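Parts d and f of Problem 6 reduce to simple arithmetic on the printed output, sketched below. The coefficients and p-values are copied from the regression table. Lot size is entered as 15.4 because the model measures it in thousands of square feet. The dictionary names are illustrative assumptions.

```python
# Coefficients and p-values copied from the regression output.
coef = {
    "Intercept": -24.888,
    "Square Feet": 0.2323,
    "Lot Size": 11.2589,
    "Bedrooms": 15.2356,
}
p_values = {
    "Intercept": 0.2154,
    "Square Feet": 0.0000,
    "Lot Size": 0.0001,
    "Bedrooms": 0.1589,
}

# Part f: 1134 square feet, 15,400 sq ft lot (= 15.4 thousand), 2 bedrooms.
home = {"Square Feet": 1134, "Lot Size": 15.4, "Bedrooms": 2}
predicted = coef["Intercept"] + sum(coef[name] * value for name, value in home.items())
print(f"Predicted price: {predicted:.3f} ($ thousands)")

# Part d: predictors whose p-value exceeds the 0.05 significance level.
alpha = 0.05
not_significant = [
    name for name, p in p_values.items()
    if name != "Intercept" and p > alpha
]
print("Candidates to drop at alpha = 0.05:", not_significant)
```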

PART III SPECIFIC KNOWLEDGE SHORT-ANSWER QUESTIONS.

Problem 7 Define Autocorrelation in the following terms:
a. In what type of regression is it likely to occur?

b. What is bad about autocorrelation in a regression?

c. What method is used to determine if it exists? (Think of statistical test to be used)

d. If found in a regression how is it eliminated?
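For part c, the Durbin-Watson d statistic is the usual check, and its computation can be sketched as below. The residuals are hypothetical values invented for illustration (15 of them, so that the n = 15, k = 1 bounds from the Durbin-Watson table in this document apply). The function names are also assumptions, and the decision rule shown covers only the test for positive autocorrelation.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Classify d against table bounds dL and dU (positive autocorrelation test)."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > d_upper:
        return "no positive autocorrelation"
    return "inconclusive"


# Hypothetical residuals that drift slowly up and down over time.
residuals = [1, 2, 3, 2, 1, 0, -1, -2, -3, -2, -1, 0, 1, 2, 3]

d = durbin_watson(residuals)
# Table bounds for n = 15, k = 1 at alpha = .05: dL = 1.08, dU = 1.36.
decision = dw_decision(d, 1.08, 1.36)
print(f"d = {d:.4f}: {decision}")
```

Values of d near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.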

Problem 8 Define Multicollinearity in the following terms:
a. In what type of regression is it likely to occur?

b. Why is multicollinearity in a regression a difficulty to be resolved?

c. How can multicollinearity be determined in a regression?

d. If multicollinearity is found in a regression, how is it eliminated?
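One common check for part c is the correlation between pairs of independent variables, sometimes summarized as a variance inflation factor (VIF). The sketch below uses only the six URBAN and INCOME rows that are visible in the Problem 5 spreadsheet, so the numbers are illustrative rather than a result for the full data set. The formula VIF = 1 / (1 - r^2) applies as written only to the two-predictor case, and the function name is an assumption.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Only the six rows shown in the Problem 5 spreadsheet (rows 2-4 and 36-38).
urban = [62.0, 65.2, 75.0, 92.1, 94.0, 95.6]
income = [7488, 7635, 7879, 15321, 15643, 16001]

r = pearson_r(urban, income)

# Two-predictor special case: VIF = 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)

print(f"r(URBAN, INCOME) = {r:.3f}, VIF = {vif:.1f}")
```

A correlation close to 1 in absolute value, or a large VIF, indicates that the two predictors carry much of the same information.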

STUDENT T TABLE

df .10 .05 .025 .010 .005

1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
30 1.310 1.697 2.042 2.457 2.750
40 1.303 1.684 2.021 2.423 2.704
60 1.296 1.671 2.000 2.390 2.660
120 1.289 1.658 1.980 2.358 2.617
 1.282 1.645 1.960 2.326 2.576

DURBIN-WATSON d STATISTIC, α = .05

n k=1 k=2 k=3 k=4 k=5
dL dU dL dU dL dU dL dU dL dU
15 1.08 1.36 .95 1.54 .82 1.75 .69 1.97 .56 2.21
16 1.10 1.37 .98 1.54 .86 1.73 .74 1.93 .62 2.15
17 1.13 1.38 1.02 1.54 .90 1.71 .78 1.90 .67 2.10
18 1.16 1.39 1.05 1.53 .93 1.69 .82 1.87 .71 2.06
19 1.18 1.40 1.08 1.53 .97 1.68 .86 1.85 .75 2.02
20 1.20 1.41 1.10 1.54 1.00 1.68 .90 1.83 .79 1.99
21 1.22 1.42 1.13 1.54 1.03 1.67 .93 1.81 .83 1.96
22 1.24 1.43 1.15 1.54 1.05 1.66 .96 1.80 .86 1.94
23 1.26 1.44 1.17 1.54 1.08 1.66 .99 1.79 .90 1.92
24 1.27 1.45 1.19 1.55 1.10 1.66 1.01 1.78 .93 1.90
25 1.29 1.45 1.21 1.55 1.12 1.66 1.04 1.77 .95 1.89
26 1.30 1.46 1.22 1.55 1.14 1.65 1.06 1.76 .98 1.88
27 1.32 1.47 1.24 1.56 1.16 1.65 1.08 1.76 1.01 1.86
28 1.33 1.48 1.26 1.56 1.18 1.65 1.10 1.75 1.03 1.85
29 1.34 1.48 1.27 1.56 1.20 1.65 1.12 1.74 1.05 1.84
30 1.35 1.49 1.28 1.57 1.21 1.65 1.14 1.74 1.07 1.83
31 1.36 1.50 1.30 1.57 1.23 1.65 1.16 1.74 1.09 1.83
32 1.37 1.50 1.31 1.57 1.24 1.65 1.18 1.73 1.11 1.82
33 1.38 1.51 1.32 1.58 1.26 1.65 1.19 1.73 1.13 1.81
34 1.39 1.51 1.33 1.58 1.27 1.65 1.21 1.73 1.15 1.81
35 1.40 1.52 1.34 1.58 1.28 1.65 1.22 1.73 1.16 1.80
36 1.41 1.52 1.35 1.59 1.29 1.65 1.24 1.73 1.18 1.80
37 1.42 1.53 1.36 1.59 1.31 1.66 1.25 1.72 1.19 1.80
38 1.43 1.54 1.37 1.59 1.32 1.66 1.26 1.72 1.21 1.79
39 1.43 1.54 1.38 1.60 1.33 1.66 1.27 1.72 1.22 1.79
40 1.44 1.54 1.39 1.60 1.34 1.66 1.29 1.72 1.23 1.79
45 1.48 1.57 1.43 1.62 1.38 1.67 1.34 1.72 1.29 1.78
50 1.50 1.59 1.46 1.63 1.42 1.67 1.38 1.72 1.34 1.77
55 1.53 1.60 1.49 1.64 1.45 1.68 1.41 1.72 1.38 1.77
60 1.55 1.62 1.51 1.65 1.48 1.69 1.44 1.73 1.41 1.77
65 1.57 1.63 1.54 1.66 1.50 1.70 1.47 1.73 1.44 1.77
70 1.58 1.64 1.55 1.67 1.52 1.70 1.49 1.74 1.46 1.77
75 1.60 1.65 1.57 1.68 1.54 1.71 1.51 1.74 1.49 1.77
80 1.61 1.66 1.59 1.69 1.56 1.72 1.53 1.74 1.51 1.77
85 1.62 1.67 1.60 1.70 1.57 1.72 1.55 1.75 1.52 1.77
90 1.63 1.68 1.61 1.70 1.59 1.73 1.57 1.75 1.54 1.78
95 1.64 1.69 1.62 1.71 1.60 1.73 1.58 1.75 1.56 1.78
100 1.65 1.69 1.63 1.72 1.61 1.74 1.59 1.76 1.57 1.78

MEAN HYPOTHESIS TEST AND CONFIDENCE INTERVAL FORMULAS

Columns: Null Hypothesis | Standard Deviation | "Data" t for Hypothesis Test | Confidence Interval

["Previous Standard" is the "old" numerical constant to which the new statistic is being compared.]

DF = n - 1

The Standard Deviation is the denominator of the "data" t.

LARGE SAMPLE: DF = ∞

Use "pooled s" above in small sample t and confidence interval formulas.
SMALL SAMPLE: DF = n1 + n2 - 2
******************************************************************************************************************************************
DETERMINE SAMPLE SIZE FOR μ, WHERE B = Error of Estimation

ALWAYS ROUND UP VALUE OF n DETERMINED IN FORMULA

LARGE SAMPLE: DF = ∞
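The sample-size formula itself did not survive on this sheet, so the sketch below assumes the standard large-sample form n = (z * s / B)^2, rounded up as instructed. The standard deviation is borrowed from Problem 1, while the error bound B = 10 hours and the function name are hypothetical choices for illustration.

```python
import math


def sample_size_for_mean(z, s, bound):
    """Assumed formula: n = (z * s / B)^2, always rounded up to an integer."""
    return math.ceil((z * s / bound) ** 2)


z = 1.960   # 95% confidence, infinite-df row of the t table
s = 90.0    # standard deviation borrowed from Problem 1 (hours)
B = 10.0    # hypothetical error of estimation (hours)

n_required = sample_size_for_mean(z, s, B)
print("Required sample size:", n_required)
```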

BINOMIAL PROBABILITY HYP TEST & CONFIDENCE INTERVAL FORMULAS

Columns: Null Hypothesis | Estimator | "Data" t for Hypothesis Test | Confidence Interval

H0: p = Previous Probability Standard

[Previous Probability Standard is the "old" probability to which the probability based on new data is being compared.]

************************************************************************************************************
H0: p1 - p2 = 0 (Previous Probability Difference assumed to be 0.)

************************************************************************************************************
DETERMINE SAMPLE SIZE FOR BINOMIAL p
B = Error of Estimation
[Round sample size "n" determined in this formula to next higher integer.]

DF for sample size is always ∞; for p, use the value closest to .5.
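As with the mean, the binomial sample-size formula is missing from this sheet, so the sketch below assumes the standard form n = z^2 * p * (1 - p) / B^2, rounded up to the next higher integer. The choice p = .5 follows the note about using the value closest to .5. The error bound B = .05 and the function name are hypothetical.

```python
import math


def sample_size_for_proportion(z, p, bound):
    """Assumed formula: n = z^2 * p * (1 - p) / B^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / bound ** 2)


z = 1.960   # 95% confidence, infinite-df row of the t table
p = 0.5     # most conservative planning value (maximizes p * (1 - p))
B = 0.05    # hypothetical error of estimation

n_required = sample_size_for_proportion(z, p, B)
print("Required sample size:", n_required)
```

Using p = .5 gives the largest required sample, so the resulting n is sufficient whatever the true proportion turns out to be.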

Solution Summary

The solution provides a step-by-step method for calculating the regression model and the test statistics for the hypothesis testing problems. Formulas for the calculations and interpretations of the results are also included. An interactive Excel sheet is included; the user can edit the inputs and obtain the complete results for a new set of data.
