# Analysis of variance and confidence interval, chi-square

Out of the 15 study problems, I am having trouble with 6, and I want to compare my answers to yours. Can you please work the following problems, explaining how you got each answer and showing the tables from Excel? It is very important that I understand how you got the answers.

1. A sample of the reading scores of 25 sixth-graders has a mean of 82, and the standard deviation of the sample is 15. Assume that the population can be adequately approximated by a normal distribution.

(a) Find the 95% confidence interval of the mean reading scores of all sixth-graders.

(b) Find the 85% confidence interval of the mean reading scores of all sixth-graders.

2. The times (in minutes) it took six white mice to learn to run a simple maze and the times it took six brown mice to learn to run the same maze are given here.

| White Mice | Brown Mice |
|---|---|
| 18 | 25 |
| 24 | 16 |
| 20 | 19 |
| 13 | 14 |
| 15 | 16 |
| 12 | 10 |

(a) Does the color of the mice make a difference in their learning rate? Test using a significance level of 5%.

(b) Give the p-value for the test, and interpret this value.

(c) Find the 99% confidence interval for the difference of the means. Interpret this interval.

Note: You can assume that the data is normally distributed.

3. Three different relaxation techniques are given to randomly selected patients in an effort to reduce their stress levels. A special instrument has been designed to measure the percentage of stress reduction in each person. The data is shown below. You can assume normality and that good randomization and experimental procedures were used.

Relaxation Experiment

| Technique I | Technique II | Technique III |
|---|---|---|
| 3 | 12 | 15 |
| 10 | 12 | 14 |
| 5 | 17 | 18 |
| 1 | 13 | 14 |
| 13 | 18 | 20 |
| 3 | 9 | 22 |
| 4 | 14 | 16 |

Carry out a "complete" analysis using ANOVA and a 5% significance level.

4. A researcher wishes to determine if the number of hours (Y) a person exercises per week is related to their age (X). The data is shown below, and you can assume a well designed experiment was used to obtain the data.

Age (X): 18 22 26 32 35 38 52 59

Hours (Y): 10 8 5 2 4 3 1.5 1

Carry out a "complete" analysis using Simple Linear Regression and a 5% significance level. In addition, use your results to estimate the average amount of exercise for a 29 year old person, and provide a 98% C.I. for your estimate. Remember in your analysis to discuss the pertinent results obtained.

5. (a) A researcher surveyed married women and single women to ascertain whether there was a difference in the number of books each had read during the past year. The data is shown below.

Books Read

| Married | Single |
|---|---|
| 6 | 2 |
| 8 | 3 |
| 7 | 5 |
| 4 | 11 |
| 9 | 3 |
| 12 | 5 |
| 13 | 11 |
| 7 | 12 |
| 10 | 16 |
| 18 | 4 |
| 15 | 0 |
| | 1 |

You can't assume normality, so use an appropriate nonparametric method to test the claim that each group read the same number of books. Test using a significance level of 10%. Draw appropriate conclusions.

(b) You are curious what the result would be if you used a parametric technique, so please re-do the analysis using the appropriate two sample t-test. Comment on the result.

6. An experiment was conducted to study the effects of temperature of freezer (X1) and freezer storage density (X2) on number of days before flavor deterioration (Y) occurs, for a food product stored in a commercial freezer. The independent variables are measured in terms of deviations from the levels normally used; thus in the first observation the temperature setting was 10 degrees centigrade below the normal setting and the storage setting was 10 percentage points less than the normal density. Note: This coding should not concern you, and has nothing to do with carrying out the regression. The results of the study are as follows:

Y: 196 169 138 179 158 122 164 139 108

X1: -10 0 +10 -10 0 +10 -10 0 +10

X2: -10 -10 -10 0 0 0 +10 +10 +10

The following model is proposed:

Y = b0 + b1x1 + b2x2 + e

Carry out a "complete" multiple regression analysis of this model. In addition predict y for x1 = -5, and x2 = 5, and give a 95% C.I. for both Y|X, and E[Y|X].

© BrainMass Inc. brainmass.com December 24, 2021, 8:51 pm https://brainmass.com/statistics/chi-squared-test/analysis-variance-confidence-interval-chi-square-319875

## SOLUTION

Kathy Scott

Study Guide Questions (6 of 15)

Final Exam Study Guide


1. A sample of the reading scores of 25 sixth-graders has a mean of 82, and the standard deviation of the sample is 15. Assume that the population can be adequately approximated by a normal distribution.

(a) Find the 95% confidence interval of the mean reading scores of all sixth-graders.

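The confidence interval is

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},$$

where $t_{\alpha/2,\,n-1}$ is the t-critical value at (n-1) degrees of freedom. We use the t-distribution because the standard deviation is estimated from the sample.

With n = 25, the standard error is 15/√25 = 3 and $t_{0.025,\,24} = 2.064$, so

$$82 \pm 2.064(3) = 82 \pm 6.19 = (75.81,\ 88.19).$$

Always interpret. For example: we are 95% confident that the population mean reading score lies between 75.81 and 88.19.

(b) Find the 85% confidence interval of the mean reading scores of all sixth-graders.

Same as above; the only difference is the critical value. Here $t_{0.075,\,24} \approx 1.487$. Replacing 2.064 above with 1.487 gives 82 ± 4.46 = (77.54, 86.46). Interpret as in part (a).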
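The interval arithmetic for problem 1 can be checked with a short script. This is only a sketch: the helper name `t_interval` is made up for illustration, and the two critical values are typed in as assumed t-table lookups for 24 degrees of freedom (about 2.064 for 95% and about 1.487 for 85%) rather than computed.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the sample mean
    margin = t_crit * se
    return mean - margin, mean + margin

# Critical values taken from a t-table with n - 1 = 24 degrees of freedom
# (assumed table lookups, not computed here).
T_95 = 2.064   # upper 2.5% point
T_85 = 1.487   # upper 7.5% point (approximate)

ci95 = t_interval(82, 15, 25, T_95)
ci85 = t_interval(82, 15, 25, T_85)

print("95% CI:", tuple(round(v, 2) for v in ci95))
print("85% CI:", tuple(round(v, 2) for v in ci85))
```

The 85% interval is narrower than the 95% interval, as expected: less confidence requires a smaller critical value.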

2. The times (in minutes) it took six white mice to learn to run a simple maze and the times it took six brown mice to learn to run the same maze are given here.

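| White Mice | Brown Mice |
|---|---|
| 18 | 25 |
| 24 | 16 |
| 20 | 19 |
| 13 | 14 |
| 15 | 16 |
| 12 | 10 |

| | White (1) | Brown (2) |
|---|---|---|
| Mean (xbar) | 17 | 16.67 |
| Std. dev. | 4.56 | 5.05 |
| Variance | 20.8 | 25.47 |
| n | 6 | 6 |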

(a) Does the color of the mice make a difference in their learning rate? Test using a significance level of 5%.

I will test for a difference between the mean times of the two colors of mice.

Ho: $\mu_1 = \mu_2$ (no difference between the mean times)

Ha: $\mu_1 \neq \mu_2$ (significant difference between the mean times)

Test statistic = (17 - 16.67)/√(20.8/6 + 25.47/6) = 0.119

t-critical = 2.228 (two-tailed, 5% level, n1 + n2 - 2 = 10 degrees of freedom)

Since 0.119 < 2.228, do not reject Ho and conclude there is no significant difference between the mean times of the two groups of mice. Thus, color does not appear to make a difference in learning rate.

(b) Give the p-value for the test, and interpret this value.

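P-value = P(|T| > 0.119) = 0.9068 (two-tailed, using Excel).

From the t-table, P-value > 2(0.20) = 0.40. Notice that I am using n1 + n2 - 2 = 12 - 2 = 10 degrees of freedom.

Interpretation: if the two population means were equal, a difference at least this large would occur in about 91% of samples, so the data give no evidence against Ho.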
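For readers without Excel, the two-tailed p-value can be approximated by integrating the Student t density numerically. The sketch below uses Simpson's rule and only the standard library; the function names are illustrative, and the inputs (t of about 0.12 with 10 degrees of freedom) are the values from this problem.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=1000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is computed with Simpson's rule; steps must be even.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3

p_value = two_sided_p(0.12, 10)
print(round(p_value, 3))
```

A p-value this close to 1 says the observed difference is about as small as chance alone would typically produce.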

(c) Find the 99% confidence interval for the difference of the means. Interpret this interval.

Note: You can assume that the data is normally distributed.

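Just like in question 1 above, but with different values:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.005,\,10}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 0.33 \pm 3.169(2.777) = (-8.47,\ 9.13)$$

We are 99% confident that the difference between the population mean times of the two mice colors is between -8.47 and 9.13 minutes. Because the interval contains 0, it agrees with the test in part (a).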
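The summary statistics, test statistic, and interval for problem 2 can be reproduced from the raw data. This is a minimal sketch assuming a pooled-variance two-sample t-test (10 degrees of freedom); the critical values 2.228 and 3.169 are assumed t-table lookups, not computed.

```python
import math
from statistics import mean, variance

white = [18, 24, 20, 13, 15, 12]
brown = [25, 16, 19, 14, 16, 10]

n1, n2 = len(white), len(brown)
diff = mean(white) - mean(brown)

# Pooled variance (assumes both populations share a common variance).
df = n1 + n2 - 2
sp2 = ((n1 - 1) * variance(white) + (n2 - 1) * variance(brown)) / df
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t_stat = diff / se

# Table values for 10 degrees of freedom (assumed lookups).
T_CRIT_05 = 2.228   # two-tailed 5% test
T_CRIT_01 = 3.169   # 99% confidence interval

reject = abs(t_stat) > T_CRIT_05
ci99 = (diff - T_CRIT_01 * se, diff + T_CRIT_01 * se)

print("t =", round(t_stat, 3), "reject Ho:", reject)
print("99% CI:", tuple(round(v, 2) for v in ci99))
```

Because the two samples have the same size, the pooled standard error equals the unpooled one used in the hand calculation.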

3. Three different relaxation techniques are given to randomly selected patients in an effort to reduce their stress levels. A special instrument has been designed to measure the percentage of stress reduction in each person. The data is shown below. You can assume normality and that good randomization and experimental procedures were used.

Relaxation Experiment

| Technique I | Technique II | Technique III |
|---|---|---|
| 3 | 12 | 15 |
| 10 | 12 | 14 |
| 5 | 17 | 18 |
| 1 | 13 | 14 |
| 13 | 18 | 20 |
| 3 | 9 | 22 |
| 4 | 14 | 16 |

Carry out a "complete" analysis using ANOVA and a 5% significance level.

I will use Excel to run ANOVA.

Ho: mu1 = mu2 = mu3

Ha: At least one mean differs from the others

Anova: Single Factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Tech. I | 7 | 39 | 5.571428571 | 18.61905 |
| Tech. II | 7 | 95 | 13.57142857 | 9.619048 |
| Tech. III | 7 | 119 | 17 | 9.666667 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 481.5238095 | 2 | 240.7619048 | 19.05528 | 0.000036 | 3.554557 |
| Within Groups | 227.4285714 | 18 | 12.63492063 | | | |
| Total | 708.952381 | 20 | | | | |

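From the table, the test statistic is F = 19.055 and the p-value is approximately 0.000036. Since the p-value < 0.05 (equivalently, F = 19.055 > F crit = 3.555), reject Ho and conclude that at least one technique has a different mean stress reduction. The sample averages suggest Technique I (5.57) is lower than Techniques II (13.57) and III (17.00); a follow-up multiple-comparison test would be needed to confirm which pairs differ.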
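The sums of squares in the Excel ANOVA table can be rebuilt by hand, which helps show where F comes from. The sketch below uses only the standard library; `F_CRIT` is copied from the Excel output rather than computed.

```python
from statistics import mean

groups = {
    "Tech. I":   [3, 10, 5, 1, 13, 3, 4],
    "Tech. II":  [12, 12, 17, 13, 18, 9, 14],
    "Tech. III": [15, 14, 18, 14, 20, 22, 16],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)

# Between-groups: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-groups: spread of each observation around its own group mean.
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

F_CRIT = 3.554557   # 5% critical value for (2, 18) df, from the Excel table
reject = f_stat > F_CRIT

print("SSB =", round(ss_between, 4), "SSW =", round(ss_within, 4))
print("F =", round(f_stat, 3), "reject Ho:", reject)
```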

4. A researcher wishes to determine if the number of hours (Y) a person exercises per week is related to their age (X). The data is shown below, and you can assume a well designed experiment was used to obtain the data.

Age (X): 18 22 26 32 35 38 52 59

Hours (Y): 10 8 5 2 4 3 1.5 1

Carry out a "complete" analysis using Simple Linear Regression and a 5% significance level. In addition, use your results to estimate the average amount of exercise for a 29 year old person, and provide a 98% C.I. for your estimate. Remember in your analysis to discuss the pertinent results obtained.

Here is the Excel output of the regression

SUMMARY OUTPUT (Edited)

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.857198176 |
| R Square | 0.734788713 |
| Adjusted R Square | 0.690586832 |
| Standard Error | 1.789763971 |
| Observations | 8 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 53.24922 | 53.24922 | 16.62347156 | 0.006523 |
| Residual | 6 | 19.21953 | 3.203255 | | |
| Total | 7 | 72.46875 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 11.13498065 | 1.788977 | 6.224217 | 0.000794954 |
| Age (X) | -0.19354555 | 0.04747 | -4.07719 | 0.006522719 |

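From the table, the fitted equation can be written as

Yhat = 11.135 - 0.194X

The slope is significantly different from zero at the 5% level (t = -4.08, p-value = 0.0065 < 0.05), so hours of exercise is related to age: each additional year of age is associated with about 0.19 fewer hours of exercise per week. R Square = 0.735, so age explains about 73% of the variation in hours.

Estimate of the average amount of exercise for a 29 year old person, with a 98% C.I.:

Estimated Hours (Yhat) = 11.135 - 0.1935(29) = 5.52 (about 5.5 hours).

The 98% C.I. for the average hours at age 29 is approximately (3.33, 7.72), using t = 3.143 with 6 degrees of freedom. The wider interval (-0.52, 11.56) is the 98% prediction interval for an individual 29 year old. (See Excel page and the websites below for help.)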

Helpful site: http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm

http://people.stfx.ca/bliengme/ExcelTips/RegressionAnalysisConfidence2.htm
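The fit and both intervals at age 29 can be reproduced with the textbook simple-regression formulas. This sketch assumes the usual least-squares setup; `T_CRIT` is an assumed t-table lookup (upper 1% point, 6 degrees of freedom), and the variable names are illustrative.

```python
import math

age = [18, 22, 26, 32, 35, 38, 52, 59]
hours = [10, 8, 5, 2, 4, 3, 1.5, 1]

n = len(age)
x_bar = sum(age) / n
y_bar = sum(hours) / n
sxx = sum((x - x_bar) ** 2 for x in age)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, hours))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard error with n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(age, hours))
s = math.sqrt(sse / (n - 2))

x0 = 29
y_hat = intercept + slope * x0
T_CRIT = 3.143   # t_{0.01, 6}, assumed table lookup for a 98% interval

leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
half_ci = T_CRIT * s * math.sqrt(leverage)        # mean response at x0
half_pi = T_CRIT * s * math.sqrt(1 + leverage)    # single new observation
ci98 = (y_hat - half_ci, y_hat + half_ci)
pi98 = (y_hat - half_pi, y_hat + half_pi)

print("fit:", round(intercept, 3), round(slope, 4), "s =", round(s, 3))
print("98% CI for mean:", tuple(round(v, 2) for v in ci98))
print("98% PI for individual:", tuple(round(v, 2) for v in pi98))
```

The only difference between the two intervals is the extra "1 +" under the square root, which accounts for the scatter of an individual around the mean.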

5. (a) A researcher surveyed married women and single women to ascertain whether there was a difference in the number of books each had read during the past year. The data is shown below.

Books Read

| Married | Single |
|---|---|
| 6 | 2 |
| 8 | 3 |
| 7 | 5 |
| 4 | 11 |
| 9 | 3 |
| 12 | 5 |
| 13 | 11 |
| 7 | 12 |
| 10 | 16 |
| 18 | 4 |
| 15 | 0 |
| | 1 |

You can't assume normality, so use an appropriate nonparametric method to test the claim that each group read the same number of books. Test using a significance level of 10%. Draw appropriate conclusions.

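We will use a Mann-Whitney U test:

Ho: No difference between the distributions of books read for the two groups

Ha: A difference exists

U (Married) = 98

U (Single) = n1*n2 - 98 = 132 - 98 = 34

U statistic = min(98, 34) = 34

U crit(12, 11) = 38 at the 10% level (two-tailed). Reject Ho since U statistic ≤ U crit.

p-value = 0.0522 (from software)

Reject Ho since p-value < 10%. There is evidence that married and single women differ in the number of books read.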

Here is the software equivalent. Note that the 'Wilcoxon rank sum test' is the same as the Mann-Whitney test.

Wilcoxon rank sum test with continuity correction

data: Married and Single

W = 98, p-value = 0.05219

alternative hypothesis: true location shift is not equal to 0

Warning message:

In wilcox.test.default(Married, Single) :

cannot compute exact p-value with ties

Helpful paper: http://www.plantbio.ohiou.edu/epb/instruct/quantmet/lectures/lec9.pdf
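The U statistic can be obtained by direct pair counting, which makes the definition concrete: for every (married, single) pair, score 1 if the married value is larger and 0.5 for a tie. The sketch below is illustrative; the function name is made up, and `U_CRIT` is the table value quoted in the solution.

```python
married = [6, 8, 7, 4, 9, 12, 13, 7, 10, 18, 15]
single = [2, 3, 5, 11, 3, 5, 11, 12, 16, 4, 0, 1]

def u_statistic(a, b):
    """Count pairs (x in a, y in b) with x > y; each tie counts one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

u_married = u_statistic(married, single)
u_single = u_statistic(single, married)

# The two counts always add up to the number of pairs, n1 * n2.
assert u_married + u_single == len(married) * len(single)

u = min(u_married, u_single)
U_CRIT = 38          # two-tailed 10% table value for sample sizes 11 and 12
reject = u <= U_CRIT

print("U (Married) =", u_married, "U (Single) =", u_single)
print("U =", u, "reject Ho:", reject)
```

The larger count matches the W reported by the R output, which counts pairs in favor of the first sample.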

(b) You are curious what the result would be if you used a parametric technique, so please re-do the analysis using the appropriate two sample t-test. Comment on the result.

T-Test

t-Test: Two-Sample Assuming Unequal Variances

| | Married | Single |
|---|---|---|
| Mean | 9.909090909 | 6.083333333 |
| Variance | 17.69090909 | 26.08333333 |
| Observations | 11 | 12 |
| Hypothesized Mean Difference | 0 | |
| df | 21 | |
| t Stat | 1.967269364 | |
| P(T<=t) one-tail | 0.031251153 | |
| t Critical one-tail | 1.720742871 | |
| P(T<=t) two-tail | 0.062502305 | |
| t Critical two-tail | 2.079613837 | |

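Reject Ho at the 10% significance level since the two-tail p-value 0.0625 < 0.10, just like the nonparametric test. Note that the two-tail critical value in the Excel output (2.0796) is for a 5% level; at the 10% level the two-tail critical value is 1.7207, and t Stat = 1.967 exceeds it.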
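The unequal-variances (Welch) statistic and its degrees of freedom can be recomputed from the raw data. This is a sketch with the standard library only; the critical value 1.7207 is taken from the Excel output (its one-tail 5% value, which is the two-tail 10% value).

```python
import math
from statistics import mean, variance

married = [6, 8, 7, 4, 9, 12, 13, 7, 10, 18, 15]
single = [2, 3, 5, 11, 3, 5, 11, 12, 16, 4, 0, 1]

n1, n2 = len(married), len(single)
v1 = variance(married) / n1   # squared standard error of each sample mean
v2 = variance(single) / n2

t_stat = (mean(married) - mean(single)) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation; Excel reports it rounded to 21.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

T_CRIT_10 = 1.7207   # two-tailed 10% critical value at 21 df (from the Excel output)
reject = abs(t_stat) > T_CRIT_10

print("t =", round(t_stat, 4), "df =", round(df, 2), "reject Ho at 10%:", reject)
```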

6. An experiment was conducted to study the effects of temperature of freezer (X1) and freezer storage density (X2) on number of days before flavor deterioration (Y) occurs, for a food product stored in a commercial freezer. The independent variables are measured in terms of deviations from the levels normally used; thus in the first observation the temperature setting was 10 degrees centigrade below the normal setting and the storage setting was 10 percentage points less than the normal density. Note: This coding should not concern you, and has nothing to do with carrying out the regression. The results of the study are as follows:

Y: 196 169 138 179 158 122 164 139 108

X1: -10 0 +10 -10 0 +10 -10 0 +10

X2: -10 -10 -10 0 0 0 +10 +10 +10

The following model is proposed:

Y = b0 + b1x1 + b2x2 + e

Carry out a "complete" multiple regression analysis of this model. In addition predict y for x1 = -5, and x2 = 5, and give a 95% C.I. for both Y|X, and E[Y|X].

See Excel page

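Y = 152.56 - 2.85*X1 - 1.53*X2

Both predictors are significant at the 5% level (p-value = 0.000 for X1 and for X2), the overall regression is significant (F = 392.31, p-value = 0.000), and R-Sq = 99.2%, so the model fits the data very well. Days before deterioration fall by about 2.85 per degree of temperature and by about 1.53 per percentage point of storage density.

When X1 = -5 and X2 = 5, we have

Yhat = 152.56 - 2.85*(-5) - 1.53*(5) = 159.16 (159.139 with unrounded coefficients)

I used Minitab software to get the following results:

A 95% C.I. for Y|X (the prediction interval for an individual observation):

(151.571, 166.707)

A 95% C.I. for E[Y|X] (the confidence interval for the mean response):

(156.085, 162.192)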
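The Minitab coefficients and both intervals can be checked with the normal equations, solved here by a small Gauss-Jordan routine so that no external library is needed. This is only a sketch: `solve` is an illustrative helper, and `T_CRIT` is an assumed t-table lookup (upper 2.5% point, 6 degrees of freedom).

```python
import math

y  = [196, 169, 138, 179, 158, 122, 164, 139, 108]
x1 = [-10, 0, 10, -10, 0, 10, -10, 0, 10]
x2 = [-10, -10, -10, 0, 0, 0, 10, 10, 10]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(map(float, row)) + [float(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

rows = [[1.0, a, b] for a, b in zip(x1, x2)]           # design matrix X
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
beta = solve(xtx, xty)                                  # (X'X) beta = X'y

resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
s = math.sqrt(sum(e * e for e in resid) / (len(y) - 3))

x0 = [1.0, -5.0, 5.0]
fit = sum(b * v for b, v in zip(beta, x0))
h = sum(u * v for u, v in zip(x0, solve(xtx, x0)))      # x0' (X'X)^-1 x0
T_CRIT = 2.447   # t_{0.025, 6}, assumed table lookup

ci = (fit - T_CRIT * s * math.sqrt(h), fit + T_CRIT * s * math.sqrt(h))
pi = (fit - T_CRIT * s * math.sqrt(1 + h), fit + T_CRIT * s * math.sqrt(1 + h))

print("coefficients:", [round(b, 4) for b in beta], "S =", round(s, 5))
print("fit:", round(fit, 3))
print("95% CI:", tuple(round(v, 3) for v in ci))
print("95% PI:", tuple(round(v, 3) for v in pi))
```

Because X1 and X2 are coded symmetrically around zero, X'X is diagonal here, so each coefficient could also be computed on its own; the general solver is used so the same sketch works for uncoded data.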

Here is the complete MINITAB output:

----- 5/26/2010 1:42:26 AM -----

Welcome to Minitab, press F1 for help.

Regression Analysis: Y versus X1, X2

The regression equation is

Y = 153 - 2.85 X1 - 1.53 X2

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 152.556 | 0.943 | 161.72 | 0.000 |
| X1 | -2.8500 | 0.1155 | -24.67 | 0.000 |
| X2 | -1.5333 | 0.1155 | -13.27 | 0.000 |

S = 2.83006 R-Sq = 99.2% R-Sq(adj) = 99.0%

Analysis of Variance

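| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 6284.2 | 3142.1 | 392.31 | 0.000 |
| Residual Error | 6 | 48.1 | 8.0 | | |
| Total | 8 | 6332.2 | | | |

| Source | DF | Seq SS |
|---|---|---|
| X1 | 1 | 4873.5 |
| X2 | 1 | 1410.7 |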

Unusual Observations

| Obs | X1 | Y | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 5 | 0.0 | 158.000 | 152.556 | 0.943 | 5.444 | 2.04R |

R denotes an observation with a large standardized residual.

Predicted Values for New Observations

| New Obs | Fit | SE Fit | 95% CI | 95% PI |
|---|---|---|---|---|
| 1 | 159.139 | 1.248 | (156.085, 162.192) | (151.571, 166.707) |

Values of Predictors for New Observations

| New Obs | X1 | X2 |
|---|---|---|
| 1 | -5.00 | 5.00 |

Sorted data for problem 5:

Married: 18 15 13 12 10 9 8 7 7 6 4

Single: 16 12 11 11 5 5 4 3 3 2 1 0
