# Confidence Interval, Hypothesis Testing & Regression Analysis

See attached file for proper format.

The Skaff Appliance Company currently has over 1,000 retail outlets throughout the United States and Canada. It sells name-brand electronic products, such as TVs, stereos, VCRs, and microwave ovens. Skaff Appliance is considering opening several additional stores in other large metropolitan areas. Paul Skaff, the president, would like to study the relationship between sales at existing locations and several factors describing the store or its region: the population and percent unemployed in the region, and the advertising expense of the store. Another variable considered is 'mall', which indicates whether the existing store is located in an enclosed shopping mall: '1' indicates a mall location and '0' indicates the store is not in a mall. A random sample of 20 stores is selected.

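| Sales (000) | Population (000,000) | Percent Unemployed | Advertising Expense (000) | Mall Location |
|---|---|---|---|---|
| 5.17 | 7.50 | 5.1 | 59.0 | 0 |
| 5.78 | 8.71 | 6.3 | 62.5 | 0 |
| 4.84 | 10.00 | 4.7 | 61.0 | 0 |
| 6.00 | 7.45 | 5.4 | 61.0 | 1 |
| 6.00 | 8.67 | 5.4 | 61.0 | 1 |
| 6.12 | 11.00 | 7.2 | 12.5 | 0 |
| 6.40 | 13.18 | 5.8 | 35.8 | 0 |
| 7.10 | 13.81 | 5.8 | 59.9 | 0 |
| 8.50 | 14.43 | 6.2 | 57.2 | 1 |
| 7.50 | 10.00 | 5.5 | 35.8 | 0 |
| 9.30 | 13.21 | 6.8 | 27.9 | 0 |
| 8.80 | 17.10 | 6.2 | 24.1 | 1 |
| 9.96 | 15.12 | 6.3 | 27.7 | 1 |
| 9.83 | 18.70 | 0.5 | 24.0 | 0 |
| 10.12 | 20.20 | 5.5 | 57.2 | 1 |
| 10.70 | 15.00 | 5.8 | 44.3 | 0 |
| 10.45 | 17.60 | 7.1 | 49.2 | 0 |
| 11.32 | 19.80 | 7.5 | 23.0 | 0 |
| 11.87 | 14.40 | 8.2 | 62.7 | 1 |
| 11.91 | 20.35 | 7.8 | 55.8 | 0 |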

Managerial Report

1. Construct a confidence interval for the mean sales of the company's stores.

2. Do mean sales differ between stores located in malls and stores not located in malls?

3. Compare the mean advertising expense of stores located in malls with that of stores not located in malls.

4. What is the confidence interval for the proportion of stores whose advertising expense is greater than or equal to 50,000 (USD)?

5. Using the data given, estimate the multiple linear regression equation for the dependence of sales on Population, Unemployment Rate, Advertising Expense, and Mall Location. Answer the following questions:

a. Write down the estimated regression model and explain the meaning of each estimated regression coefficient.

b. What is the value of the multiple coefficient of determination, and what does it mean?

c. Which independent variable(s) are not significant in the model?

d. Test the overall significance of the model.

e. Does Mall Location affect the sales of the stores?

f. What is the estimated sales of a store that is located in a mall, in a region with an unemployment rate of 4% and a population of 18,000,000, and that spends 40,000 (USD) on advertising?
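For item 5, one possible starting point is an ordinary least squares fit through the normal equations. The sketch below is not the posted solution: it uses only the Python standard library, the helper names `solve` and `fit_ols` are invented for illustration, and it assumes the predictors are entered in the units of the data table (population in millions, advertising in thousands), so the store in item 5(f) becomes population 18, unemployment 4, advertising 40, mall 1. Significance tests (items 5c to 5e) would still need standard errors and t or F critical values, which are not computed here.

```python
# Skaff sample: (sales, population, pct_unemployed, advertising, mall)
DATA = [
    (5.17, 7.50, 5.1, 59.0, 0), (5.78, 8.71, 6.3, 62.5, 0),
    (4.84, 10.00, 4.7, 61.0, 0), (6.00, 7.45, 5.4, 61.0, 1),
    (6.00, 8.67, 5.4, 61.0, 1), (6.12, 11.00, 7.2, 12.5, 0),
    (6.40, 13.18, 5.8, 35.8, 0), (7.10, 13.81, 5.8, 59.9, 0),
    (8.50, 14.43, 6.2, 57.2, 1), (7.50, 10.00, 5.5, 35.8, 0),
    (9.30, 13.21, 6.8, 27.9, 0), (8.80, 17.10, 6.2, 24.1, 1),
    (9.96, 15.12, 6.3, 27.7, 1), (9.83, 18.70, 0.5, 24.0, 0),
    (10.12, 20.20, 5.5, 57.2, 1), (10.70, 15.00, 5.8, 44.3, 0),
    (10.45, 17.60, 7.1, 49.2, 0), (11.32, 19.80, 7.5, 23.0, 0),
    (11.87, 14.40, 8.2, 62.7, 1), (11.91, 20.35, 7.8, 55.8, 0),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows):
    """Return (coefficients, R^2, fitted values) for y = b0 + b1*x1 + ... + bk*xk."""
    X = [[1.0, *r[1:]] for r in rows]  # leading 1.0 is the intercept column
    y = [r[0] for r in rows]
    k = len(X[0])
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1.0 - sse / sst, fitted


beta, r_squared, fitted = fit_ols(DATA)

# Item 5(f): intercept, population 18, unemployment 4, advertising 40, mall 1
new_store = [1.0, 18.0, 4.0, 40.0, 1.0]
predicted_sales = sum(b * v for b, v in zip(beta, new_store))

names = ["intercept", "population", "unemployed", "advertising", "mall"]
for name, b in zip(names, beta):
    print(f"{name:12s} {b: .4f}")
print(f"R^2 = {r_squared:.4f}, predicted sales (000) = {predicted_sales:.3f}")
```

Solving the normal equations directly keeps the example dependency-free; with a statistics package available, a library regression routine would also report the standard errors needed for the significance questions.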

https://brainmass.com/statistics/regression-analysis/confidence-interval-hypothesis-testing-regression-analysis-419507

#### Solution Summary

The solution provides a step-by-step method for calculating confidence intervals, testing hypotheses, and performing multiple regression analysis. Formulas for the calculations and interpretations of the results are also included.

Statistics: Normality, regression and hypotheses testing problems

Please use the templates attached.

1. (a) Suppose you want to estimate the mean percentage of gain in per share value for growth-type mutual funds over a specific 2-year period. Ten mutual funds are randomly selected from the population of all the commonly listed funds. The percentage gain figures are shown below (negative values indicate losses).

11.2 4.8 -2.6 16.8 -1.9

10.1 9.6 14.9 10.1 11.2

Find a 90% confidence interval for the mean percentage gain for the population of funds. Assume that the population of percentage gains for growth-type mutual funds can be adequately approximated by a normal distribution.

(b) Now go back and use a goodness of fit test to check the assumption made in part (a) that the data can be assumed to come from a normal distribution.
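The interval in part (a) follows the usual small-sample formula, mean ± t · s/√n. A minimal sketch under the normality assumption stated in the problem is shown here; the critical value 1.833 is the tabulated t value for 9 degrees of freedom with 5% in the upper tail, hardcoded so that the example needs only the standard library.

```python
import math
import statistics

# Percentage gains for the ten sampled mutual funds
gains = [11.2, 4.8, -2.6, 16.8, -1.9, 10.1, 9.6, 14.9, 10.1, 11.2]

n = len(gains)
mean = statistics.mean(gains)
s = statistics.stdev(gains)  # sample standard deviation (n - 1 denominator)

T_CRIT = 1.833  # t table value for df = 9, upper-tail area 0.05 (90% two-sided)
margin = T_CRIT * s / math.sqrt(n)
low, high = mean - margin, mean + margin

print(f"mean = {mean:.2f}, s = {s:.3f}")
print(f"90% CI for the mean gain: ({low:.2f}, {high:.2f})")
```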

2. (a) Average total daily sales at a small food store are known to be $451.75. The store manager recently made some changes in displays of goods, order within aisles, and other changes, and she now wants to know whether average sales volume has increased. A random sample of 14 days shows Xbar = $502.95 and s = $64.00. You can assume normality. Using alpha = 4%, is the sampling result significant? Explain.

(b) A simple random sample of 350 households from a large community was selected to estimate the mean residential electricity usage per household during January of last year. Another simple random sample of 420 households was selected, independently of the first, to estimate mean residential electricity usage during January of this year. The sample results (expressed in kilowatt hours) were:

| Year | Sample size | Sample mean (kWh) | Sample SD (kWh) |
|---|---|---|---|
| Last Year | 350 | 1272 | 259 |
| This Year | 420 | 1335 | 265 |

You feel that the mean usage of electricity has increased for January of this year compared to January of last year. Carry out the appropriate hypothesis test at the 5% significance level.
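With two independent samples this large, one reasonable approach to part (b) is a one-sided two-sample z test of H0: μ(this year) = μ(last year) against H1: μ(this year) > μ(last year). The sketch below assumes that choice (a Welch t test would give nearly the same answer at these sample sizes) and uses `statistics.NormalDist` from the standard library for the normal tail area.

```python
import math
from statistics import NormalDist

# Summary statistics from the two independent samples (kWh)
n_last, mean_last, sd_last = 350, 1272.0, 259.0
n_this, mean_this, sd_this = 420, 1335.0, 265.0

# Standard error of the difference in sample means
se = math.sqrt(sd_last**2 / n_last + sd_this**2 / n_this)
z = (mean_this - mean_last) / se

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
p_value = 1 - NormalDist().cdf(z)         # one-sided p-value
reject_h0 = z > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```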

3. (a) Suppose the USGA wants to compare the mean distances associated with five different brands of golf balls when struck with a driver. A completely randomized design is employed, with Iron Byron, the USGA's robotic golfer, using a driver to hit a random sample of 10 balls of each brand in a random sequence. The distance is recorded for each hit, and the results are shown below, organized by brand.

Golf Ball Test

| Brand A | Brand B | Brand C | Brand D | Brand E |
|---|---|---|---|---|
| 251.2 | 263.2 | 269.7 | 251.6 | 248.1 |
| 245.1 | 262.9 | 263.2 | 248.6 | 247.5 |
| 248.0 | 265.0 | 277.5 | 249.4 | 242.2 |
| 251.1 | 254.5 | 267.4 | 242.0 | 245.1 |
| 260.5 | 264.3 | 270.5 | 246.5 | 247.6 |
| 250.0 | 257.0 | 265.5 | 251.3 | 260.1 |
| 253.9 | 262.8 | 270.7 | 261.8 | 247.7 |
| 244.6 | 264.4 | 272.9 | 249.0 | 245.0 |
| 254.6 | 260.6 | 275.6 | 247.1 | 241.6 |
| 248.8 | 255.9 | 266.5 | 245.9 | 249.4 |

Carry out a complete analysis using ANOVA and a 5% significance level.
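The core of the one-way ANOVA in part (a) is the F ratio of between-brand to within-brand mean squares. The sketch below computes it from the golf ball data with the standard library only; the function name `one_way_anova` is illustrative, and the critical value 2.58 is an approximate table value for F with (4, 45) degrees of freedom at the 5% level that should be checked against an F table. A complete analysis would also examine the equal-variance and normality assumptions and follow up with pairwise comparisons.

```python
# Driving distances by brand (10 balls each)
brands = {
    "A": [251.2, 245.1, 248.0, 251.1, 260.5, 250.0, 253.9, 244.6, 254.6, 248.8],
    "B": [263.2, 262.9, 265.0, 254.5, 264.3, 257.0, 262.8, 264.4, 260.6, 255.9],
    "C": [269.7, 263.2, 277.5, 267.4, 270.5, 265.5, 270.7, 272.9, 275.6, 266.5],
    "D": [251.6, 248.6, 249.4, 242.0, 246.5, 251.3, 261.8, 249.0, 247.1, 245.9],
    "E": [248.1, 247.5, 242.2, 245.1, 247.6, 260.1, 247.7, 245.0, 241.6, 249.4],
}


def one_way_anova(groups):
    """Return (F statistic, between df, within df) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(list(brands.values()))

F_CRIT = 2.58  # approximate F(0.05; 4, 45) from a table; an assumption to verify
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, critical value about {F_CRIT}")
print("Brand means differ" if f_stat > F_CRIT else "No evidence of a difference")
```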

(b) The Bradford Electric Illuminating Company is studying the relationship between kilowatt hours (thousands) used and the number of rooms in a private single-family residence. A random sample of 10 homes yielded the following data.

| Number of Rooms | Kilowatt-Hours (thousands) |
|---|---|
| 12 | 9 |
| 9 | 7 |
| 14 | 10 |
| 6 | 5 |
| 10 | 8 |
| 8 | 6 |
| 10 | 8 |
| 10 | 10 |
| 5 | 4 |
| 7 | 7 |

Carry out a complete analysis using Simple Linear Regression and a 5% significance level. In addition use your results to estimate the number of kilowatt-hours, in thousands, for a six-room house, and provide a 95% C.I. for your estimate.
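The least squares quantities for part (b) can be computed directly from the sums of squares Sxx, Sxy, and Syy. The sketch below does so for the rooms data and then estimates usage for a six-room house. It assumes the requested 95% interval is for the mean usage of six-room houses (a prediction interval for a single house would add 1 under the square root), and it hardcodes 2.306, the tabulated t value for 8 degrees of freedom with 2.5% in each tail.

```python
import math

rooms = [12, 9, 14, 6, 10, 8, 10, 10, 5, 7]
kwh = [9, 7, 10, 5, 8, 6, 8, 10, 4, 7]  # thousands of kilowatt-hours

n = len(rooms)
x_bar = sum(rooms) / n
y_bar = sum(kwh) / n
sxx = sum((x - x_bar) ** 2 for x in rooms)
syy = sum((y - y_bar) ** 2 for y in kwh)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(rooms, kwh))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = syy - slope * sxy               # residual sum of squares
s = math.sqrt(sse / (n - 2))          # standard error of the estimate
r_squared = 1 - sse / syy

T_CRIT = 2.306                        # t table value, df = 8, two-sided 95%
t_slope = slope / (s / math.sqrt(sxx))  # test statistic for H0: slope = 0

x0 = 6
y_hat = intercept + slope * x0
se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
ci = (y_hat - T_CRIT * se_mean, y_hat + T_CRIT * se_mean)

print(f"y = {intercept:.4f} + {slope:.4f} x, R^2 = {r_squared:.4f}")
print(f"t for slope = {t_slope:.3f} (critical value {T_CRIT})")
print(f"Estimate at {x0} rooms: {y_hat:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```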

4. The article "Characterization of Highway Runoff in Austin, Texas, Area" (J. of Envir. Engr., 1998: 131-137) gave a scatter plot, along with the simple regression line, of X = rainfall volume (m³) and Y = runoff volume (m³) for a particular location. The following data are from the article.

X: 5 12 14 17 23 30 40 47 55 67 72 81 96 112 127

Y: 4 10 13 15 15 25 27 46 38 46 53 70 82 99 100

Carry out a "complete" analysis using Simple Linear Regression and a 5% significance level. In addition, use your results to estimate the runoff volume for a rainfall volume of 65 m³, and provide a 95% C.I. for your estimate. Remember in your analysis to discuss the pertinent results obtained.

5. An experiment was designed by a market researcher to study the effects of two types of promotional expenditures on sales of a line of food products sold in supermarkets. Sixteen locations were selected at random for the test. The researcher collected data on three factors. The data is given below, where y = sales volume ($ ten thousands), x1 = media expenditures ($ thousands), and x2 = point-of-sale expenditures ($ thousands).

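| y | x1 | x2 |
|---|---|---|
| 8.74 | 2 | 2 |
| 10.53 | 2 | 3 |
| 10.99 | 2 | 4 |
| 11.97 | 2 | 5 |
| 12.74 | 3 | 2 |
| 12.83 | 3 | 3 |
| 14.69 | 3 | 4 |
| 15.30 | 3 | 5 |
| 16.11 | 4 | 2 |
| 16.31 | 4 | 3 |
| 16.46 | 4 | 4 |
| 17.69 | 4 | 5 |
| 19.65 | 5 | 2 |
| 18.86 | 5 | 3 |
| 19.93 | 5 | 4 |
| 20.51 | 5 | 5 |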

The following model is proposed:

Y = b0 + b1x1 + b2x2 + e

Carry out a "complete" multiple regression analysis of this model. In addition, predict y for x1 = 3 and x2 = 3, and give a 95% C.I. for both Y|X and E[Y|X].

6. (a) The article "Nonbloated Burned Clay Aggregate Concrete" (J. Materials, 1972: 555-563) reports the following data on 7-day flexural strength of nonbloated burned clay aggregate concrete samples (psi):

227 299 291 276 316 318 323 338 361 374 373 385 401 403 432 427 442 411 456 448 466 472 470 492 496 508 517 548 570 726 746 766 786 806 816 831

Test at a significance level of 5% to decide if flexural strength can be considered to be normally distributed. Use the appropriate Goodness of Fit test.

(b) The article "Measuring the Exposure of Infants to Tobacco Smoke" (N. Engl. J. Med., 1964: 1075-1078) reports on a study in which various measurements were taken both from a random sample of infants who had been exposed to household smoke and from a sample of unexposed infants. The accompanying data consist of observations on urinary concentration of cotinine, a major metabolite of nicotine. Do the data suggest that the true average cotinine level is higher in exposed infants than in unexposed infants by more than 25? Carry out a test at a significance level of 5%.

Unexposed: 8 11 12 14 20 43 111

Exposed: 35 56 83 92 128 150 176 208

You can't assume normality, so use the appropriate nonparametric test.
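One candidate for the nonparametric test in part (b) is the Wilcoxon rank-sum (Mann-Whitney) test applied after subtracting the hypothesized shift of 25 from each exposed observation. The sketch below assumes that choice and uses the large-sample normal approximation without a continuity correction, which is only rough for samples of 7 and 8; exact rank-sum tables would be the safer reference for a final answer.

```python
import math
from statistics import NormalDist

unexposed = [8, 11, 12, 14, 20, 43, 111]
exposed = [35, 56, 83, 92, 128, 150, 176, 208]

SHIFT = 25  # H0: exposed mean exceeds unexposed mean by 25; H1: by more than 25
shifted = [x - SHIFT for x in exposed]

# Mann-Whitney U: number of (shifted exposed, unexposed) pairs in which the
# shifted exposed value is larger; a tie would count as one half.
u = sum(
    1.0 if e > c else 0.5 if e == c else 0.0
    for e in shifted
    for c in unexposed
)

m, n = len(shifted), len(unexposed)
mean_u = m * n / 2
sd_u = math.sqrt(m * n * (m + n + 1) / 12)  # valid when there are no ties
z = (u - mean_u) / sd_u
p_value = 1 - NormalDist().cdf(z)  # one-sided, upper tail

print(f"U = {u}, z = {z:.3f}, approximate one-sided p-value = {p_value:.4f}")
print("Reject H0 at 5%" if p_value < 0.05 else "Do not reject H0 at 5%")
```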
