# Statistics: Hypothesis Testing, Significance Levels, ANOVA, Tukey-Kramer, Wilcoxon Rank Sums Test and Kruskal-Wallis Rank Test

1. Which of the following components in an ANOVA table are not additive?

a. Sum of squares

b. Degrees of freedom

c. Mean squares

d. It is not possible to tell

2. Why would you use the Tukey-Kramer procedure?

a. To test for normality

b. To test for homogeneity of variance

c. To test independence of errors

d. To test for differences in pairwise means

3. A completely randomized design

a. has only one factor with several treatment groups.

b. can have more than one factor, each with several treatment groups.

c. has one factor and one block.

d. has one factor and one block and multiple values.

4. In a one-way ANOVA, the null hypothesis is always

a. There is no treatment effect.

b. There is some treatment effect.

c. All the population means are different.

d. Some of the population means are different.

5. Interaction in an experimental design can be tested in

a. a completely randomized model.

b. a randomized block model.

c. a two-factor model.

d. all ANOVA models.

6. Suppose there is interest in comparing the median response time for three independent groups learning a specific task. The appropriate nonparametric procedure is

a. Wilcoxon Rank Sums Test.

b. Wilcoxon Signed Rank Test.

c. Kruskal-Wallis Rank Test for Differences in Medians.

d. None of the above.

7. A campus researcher wanted to investigate the factors that affect visitor travel time in a complex, multilevel building on campus. Specifically, he wanted to determine whether different building signs (building maps versus wall signage) affect the total amount of time visitors require to reach their destination and if that time depends on whether the starting location is inside or outside the building. Three subjects were assigned to each of the combinations of signs and starting locations, and travel time in seconds from beginning to destination was recorded. How should the data be analyzed?

| Signs | Starting room: Interior | Starting room: Exterior |
|---|---|---|
| Wall signs | 141, 119, 238 | 224, 339, 139 |
| Map | 85, 94, 126 | 226, 129, 130 |

a. Completely randomized design

b. Randomized block design

c. 2x2 factorial design

d. Kruskal-Wallis rank test

8. You are working with an experiment that has a single factor of interest with five groups or levels, and seven observations in each group. How many degrees of freedom are there in determining the within-group variation?

a. 4

b. 6

c. 30

d. 34

9. You are working with an experiment that has a single factor of interest with five groups or levels, and seven observations in each group. You have calculated SSA = 60 and SST = 210. At the .05 level of significance, what is the value of the upper-tail critical value from the F distribution?

a. 2.69

b. 3.00

c. 3.25

d. 5.00

10. Given a randomized block experiment having one factor containing four treatment levels and 8 blocks, MSA = 80, SSBL = 540, and F among blocks = 5.0, what is the MSBL (mean square or variance among blocks)?

a. 15.4286

b. 77.1429

c. 67.5000

d. 16.0000

11. A chef was experiencing difficulty in getting brands of pasta cooked to the appropriate firmness. She conducted an experiment with two brands of pasta cooked for either 4 minutes or 8 minutes. The response variable measured was the weight of the pasta. The results (weight in grams) for two replications of each type of pasta and cooking time are as follows:

| Pasta type | Cooking time: 4 minutes | Cooking time: 8 minutes |
|---|---|---|
| American | 265, 270 | 310, 320 |
| Italian | 250, 245 | 300, 305 |

At the .05 level of significance, what is the F statistic used to test whether there is an interaction between the type of pasta and the cooking time?

a. 1.28571

b. 7.70865

c. 24.1429

d. 28.1250
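As a minimal sketch of the arithmetic behind question 11, the interaction F statistic of a two-factor design with replication can be computed from the cell, row, column and grand means. The variable names below are illustrative and not part of the original posting; the data are the eight pasta weights from the table.

```python
# Two-factor ANOVA with replication for the pasta data (weights in grams).
# Keys are (pasta type, cooking time in minutes); each cell holds 2 replicates.
cells = {
    ("American", 4): [265, 270],
    ("American", 8): [310, 320],
    ("Italian", 4): [250, 245],
    ("Italian", 8): [300, 305],
}

def mean(values):
    return sum(values) / len(values)

types = sorted({key[0] for key in cells})
times = sorted({key[1] for key in cells})
n_rep = 2  # replicates per cell

grand = mean([y for ys in cells.values() for y in ys])
type_mean = {a: mean([y for (t, _), ys in cells.items() if t == a for y in ys])
             for a in types}
time_mean = {b: mean([y for (_, m), ys in cells.items() if m == b for y in ys])
             for b in times}
cell_mean = {key: mean(ys) for key, ys in cells.items()}

# Interaction sum of squares: what is left of each cell mean after removing
# the two main effects.
ss_ab = n_rep * sum(
    (cell_mean[(a, b)] - type_mean[a] - time_mean[b] + grand) ** 2
    for a in types for b in times
)
# Error sum of squares: variation of replicates around their own cell mean.
ss_e = sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys)

df_ab = (len(types) - 1) * (len(times) - 1)
df_e = len(types) * len(times) * (n_rep - 1)
f_interaction = (ss_ab / df_ab) / (ss_e / df_e)
print(f"SSAB = {ss_ab}, SSE = {ss_e}, F = {f_interaction:.5f}")
```

The interaction F has 1 and 4 degrees of freedom here, so the statistic would be compared with the corresponding upper-tail F critical value at the .05 level.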

12. An industrial psychologist wants to test whether the reaction times of assembly line workers are equivalent under three different learning methods. From a group of 25 new employees, 9 are randomly assigned to method A, 8 to method B and 8 to method C. After the learning period the workers are given a task to complete, and their reaction times are measured as shown in the table below. What is the appropriate test statistic to use for measuring whether there is a significant difference between the median reaction times for these learning methods at a significance level of 0.01?

| Method A | Method B | Method C |
|---|---|---|
| 2 | 1 | 5 |
| 3 | 6 | 7 |
| 4 | 8 | 11 |
| 9 | 15 | 12 |
| 10 | 16 | 13 |
| 14 | 17 | 18 |
| 19 | 21 | 24 |
| 20 | 22 | 25 |
| 23 | | |

a. χ² = 9.210

b. χ² = 5.991

c. H = 0.12

d. H = 0.64
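The Kruskal-Wallis H statistic for question 12 can be sketched as follows. The 25 tabled values are exactly the integers 1 to 25 with no ties, so each value equals its own rank in the pooled sample; the code relies on that observation, and the names are illustrative.

```python
import math

# Reaction times by learning method. The pooled values are 1..25 with no
# ties, so every value is also its rank in the combined sample.
method_ranks = {
    "A": [2, 3, 4, 9, 10, 14, 19, 20, 23],
    "B": [1, 6, 8, 15, 16, 17, 21, 22],
    "C": [5, 7, 11, 12, 13, 18, 24, 25],
}

n = sum(len(ranks) for ranks in method_ranks.values())
rank_sums = {m: sum(ranks) for m, ranks in method_ranks.items()}

# H = 12 / (n (n + 1)) * sum(T_j^2 / n_j) - 3 (n + 1)
h = (12 / (n * (n + 1))
     * sum(rank_sums[m] ** 2 / len(ranks) for m, ranks in method_ranks.items())
     - 3 * (n + 1))

# H is referred to a chi-square distribution with (groups - 1) = 2 df.
# For 2 df the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-h / 2)
print(rank_sums, round(h, 4), round(p_value, 4))
```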

13. Given a randomized block experiment having a single factor of interest with five treatment levels and seven blocks, with the following sums of squares already calculated: SSA = 60, SSBL = 75 and SST = 210, form the ANOVA summary table, then answer the following question. Which of the following most accurately represents the decision rule for testing the null hypothesis that there are no block effects at a .05 significance level?

a. If F > 2.51, reject H0

b. If F > 2.78, reject H0

c. If F > 4.00, reject H0

d. If F > 4.80, reject H0
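Forming the ANOVA summary table asked for in question 13 is mostly bookkeeping, which the sketch below makes explicit. It assumes the usual randomized block partition SST = SSA + SSBL + SSE; the variable names are illustrative.

```python
# Randomized block design: c treatment levels, r blocks.
c, r = 5, 7
ssa, ssbl, sst = 60.0, 75.0, 210.0

sse = sst - ssa - ssbl          # error sum of squares by subtraction
df_a, df_bl = c - 1, r - 1      # treatment and block degrees of freedom
df_e = df_a * df_bl             # error degrees of freedom
df_t = r * c - 1                # total degrees of freedom

msa, msbl, mse = ssa / df_a, ssbl / df_bl, sse / df_e
f_treatments = msa / mse
f_blocks = msbl / mse

print(f"{'Source':<12}{'df':>4}{'SS':>8}{'MS':>8}{'F':>7}")
print(f"{'Treatments':<12}{df_a:>4}{ssa:>8.1f}{msa:>8.3f}{f_treatments:>7.2f}")
print(f"{'Blocks':<12}{df_bl:>4}{ssbl:>8.1f}{msbl:>8.3f}{f_blocks:>7.2f}")
print(f"{'Error':<12}{df_e:>4}{sse:>8.1f}{mse:>8.3f}")
print(f"{'Total':<12}{df_t:>4}{sst:>8.1f}")
```

The decision rule for block effects then compares the block F statistic with the upper-tail F critical value for `df_bl` and `df_e` degrees of freedom, which has to be looked up or computed separately.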

14. If we use the χ² method of analysis to test for the differences among 4 proportions, the degrees of freedom are equal to:

a. 3

b. 4

c. 5

d. 1

15. If we wish to determine whether there is evidence that the proportion of successes is the same in Group 1 as in Group 2, the appropriate test to use is

a. The Z test

b. The χ² test

c. Both of the above

d. None of the above

16. A study published in the American Journal of Public Health was conducted to determine whether the use of seat belts in motor vehicles depends on ethnic status in San Diego County. A sample of 792 children treated for injuries sustained from motor vehicle accidents was obtained, and each child was classified according to (1) ethnic status (Hispanic or non-Hispanic) and (2) seat belt usage (worn or not worn) during the accident. The number of children in each category is given in the table below.

| | Hispanic | Non-Hispanic |
|---|---|---|
| Seat belts worn | 31 | 148 |
| Seat belts not worn | 283 | 330 |

Using the data from the table above, the calculated test statistic is

a. -0.9991

b. -0.1368

c. 48.1849

d. 72.8063
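A minimal sketch of the χ² test of independence for the 2x2 table in question 16 follows. It uses the uncorrected Pearson statistic (no continuity correction), which is an assumption about the intended method; names are illustrative.

```python
import math

# Rows: seat belts worn / not worn. Columns: Hispanic / non-Hispanic.
observed = [[31, 148],
            [283, 330]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# With 1 df, chi-square is the square of a standard normal variable, so the
# upper-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 4), p_value)
```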

17. Many companies use well-known celebrities as spokespersons in their TV advertisements. A study was conducted to determine whether brand awareness of female TV viewers and the gender of the spokesperson are independent. Each in a sample of 300 female TV viewers was asked to identify a product advertised by a celebrity spokesperson. The gender of the spokesperson and whether or not the viewer could identify the product was recorded. The numbers in each category are given below.

| | Male celebrity | Female celebrity |
|---|---|---|
| Identified product | 41 | 61 |
| Could not identify | 109 | 89 |

Referring to the table above, at the 5% level of significance, the critical value of the test statistic is

a. 3.8415

b. 5.9914

c. 9.4877

d. 13.2767

18. A computer used by a 24-hour banking service is supposed to randomly assign each transaction to one of 5 memory locations. A check at the end of a day's transactions gave the counts shown in the table to each of the 5 memory locations, along with the number of reported errors. The bank manager wants to test whether the proportions of errors in transactions assigned to each of the 5 memory locations differ. Using the data in the table below, the calculated value of the χ² test statistic is

| Memory location | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Number of transactions | 82 | 100 | 74 | 92 | 102 |
| Number of reported errors | 11 | 12 | 6 | 9 | 10 |

a. -0.1777

b. -0.0185

c. 1.4999

d. 1.5190
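For question 18, the χ² statistic for differences among several proportions can be sketched with the pooled error proportion. This form is algebraically the same as the Pearson statistic on the 2x5 table of errors versus non-errors; variable names are illustrative.

```python
transactions = [82, 100, 74, 92, 102]
errors = [11, 12, 6, 9, 10]

# Pooled proportion of errors across all five memory locations.
p_bar = sum(errors) / sum(transactions)

# Each location contributes (observed - expected)^2 / (n_i * p * (1 - p)),
# which combines the "error" and "no error" cells of that column.
chi2 = sum(
    (x - n_i * p_bar) ** 2 / (n_i * p_bar * (1 - p_bar))
    for x, n_i in zip(errors, transactions)
)
df = len(transactions) - 1
print(round(p_bar, 4), round(chi2, 4), df)
```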

19. The Wall Street Journal recently ran an article indicating differences in perception of sexual harassment on the job between men and women. The article claimed that women perceived the problem to be much more prevalent than did men. One question asked to both men and women was: "Do you think sexual harassment is a major problem in the American workplace?" Some 24% of the men compared to 62% of the women responded "Yes." Suppose that 150 women and 200 men were interviewed. What conclusion should be reached?

a. Using a 0.01 level of significance, there is sufficient evidence to conclude that women perceive the problem of sexual harassment on the job as much more prevalent than do men.

b. There is insufficient evidence to conclude with at least 99% confidence that women perceive the problem of sexual harassment on the job as much more prevalent than do men.

c. There is no evidence of a significant difference between the men and women in their perception.

d. More information is needed to draw any conclusions from the data set.
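One way to approach question 19 is a pooled Z test for the difference between two proportions. The sketch below assumes a one-sided, upper-tail alternative (women's proportion greater than men's), which is a reading of the article's claim rather than something the question states outright.

```python
import math

n_women, n_men = 150, 200
p_women, p_men = 0.62, 0.24

# Counts of "Yes" answers implied by the reported percentages.
yes_women = p_women * n_women
yes_men = p_men * n_men

# Pooled proportion under H0: the two population proportions are equal.
p_pool = (yes_women + yes_men) / (n_women + n_men)
std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_women + 1 / n_men))

z = (p_women - p_men) / std_err
# Upper-tail p-value of the standard normal distribution.
p_value = 0.5 * math.erfc(z / math.sqrt(2))
print(round(z, 3), p_value)
```

The p-value can then be compared directly with the 0.01 level of significance.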

20. A powerful women's group has claimed that men and women differ in attitudes about sexual discrimination. A group of 50 men (Group 1) and 40 women (Group 2) were asked if they thought sexual discrimination is a problem in the United States. Of those sampled, 11 of the men and 19 of the women did believe that sexual discrimination is a problem. Which of the following are the appropriate null and alternative hypotheses to test the group's claim?

a. $H_0: \rho_M - \rho_W \ge 0$ versus $H_1: \rho_M - \rho_W < 0$

b. $H_0: \rho_M - \rho_W \le 0$ versus $H_1: \rho_M - \rho_W > 0$

c. $H_0: \rho_M - \rho_W = 0$ versus $H_1: \rho_M - \rho_W \ne 0$

d. $H_0: \mu_M - \mu_W \le 0$ versus $H_1: \mu_M - \mu_W > 0$

https://brainmass.com/statistics/analysis-of-variance/43769

#### Solution Summary

Hypothesis Testing, Significance Levels, ANOVA, Tukey-Kramer, Wilcoxon Rank Sums Test and Kruskal-Wallis Rank Test are investigated. The solution is detailed and well presented.

NCSS assignment: hypothesis testing and regression.

Use NCSS to conduct all statistical tests and calculations. For each question, you are to edit, copy, paste and highlight the appropriate output from NCSS and produce a typed document answering the questions. Be succinct. Control the Type I error rate at the 0.05 level for all statistical test procedures and confidence intervals. At a minimum, for all hypothesis tests, be sure to:

1. State the null and alternative hypotheses

2. Choose a statistical test and calculate the test statistic

3. Calculate the p-value of the test statistic

4. Make your statistical decision

5. State your conclusion

Problem 1. Height and weight are often used in epidemiological studies as possible predictors of disease outcomes. If patients are assessed in the clinic, then heights and weights are usually measured directly. However, if people are interviewed by phone or mail, then a person's self-reported height and weight are often used instead. Suppose we conduct a study on 10 people to test the comparability of these two methods as to "location" of the population distributions of weight. Conduct a nonparametric statistical test to answer the study question. The weight data follows.

ID # Self-reported Weight (lbs.) Measured Weight (lbs.)

1 120 125

2 120 118

3 135 139

4 118 120

5 120 125

6 190 198

7 124 128

8 175 176

9 133 131

10 125 125
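For Problem 1 the data are paired, so one natural nonparametric choice is the Wilcoxon signed rank test. The sketch below computes only the signed rank sums, not a p-value. It assumes the common convention of dropping zero differences and averaging tied ranks; NCSS may handle these cases differently, and the helper `midranks` is a hypothetical name.

```python
self_reported = [120, 120, 135, 118, 120, 190, 124, 175, 133, 125]
measured = [125, 118, 139, 120, 125, 198, 128, 176, 131, 125]

# Paired differences, with zero differences dropped (assumed convention).
diffs = [s - m for s, m in zip(self_reported, measured) if s != m]

def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

# Rank the absolute differences, then sum the ranks by sign.
ranks = midranks([abs(d) for d in diffs])
w_plus = sum(rank for d, rank in zip(diffs, ranks) if d > 0)
w_minus = sum(rank for d, rank in zip(diffs, ranks) if d < 0)
print(len(diffs), w_plus, w_minus)
```

The smaller of the two sums is the usual test statistic to compare against a signed rank table or a normal approximation.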

Problem 2. Total heart weight (THW) was measured at autopsy on a group of 11 males with left-heart disease and 10 normal males. Test for a significant difference in THW between the two groups using a nonparametric statistical procedure. The autopsy data follows.

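| ID # (left-heart disease males) | THW (g) | ID # (normal males) | THW (g) |
|---|---|---|---|
| 1 | 450 | 12 | 245 |
| 2 | 760 | 13 | 350 |
| 3 | 325 | 14 | 340 |
| 4 | 495 | 15 | 300 |
| 5 | 285 | 16 | 310 |
| 6 | 450 | 17 | 270 |
| 7 | 460 | 18 | 300 |
| 8 | 375 | 19 | 360 |
| 9 | 310 | 20 | 405 |
| 10 | 615 | 21 | 290 |
| 11 | 425 | | |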
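For Problem 2 the two groups are independent, so the Wilcoxon rank sum test is a natural candidate. The sketch below pools the two samples, assigns average ranks to ties, and forms a large-sample Z statistic. The Z ignores the tie correction to the variance, so it is only an approximation; `midranks` is a hypothetical helper name.

```python
import math

disease = [450, 760, 325, 495, 285, 450, 460, 375, 310, 615, 425]
normal = [245, 350, 340, 300, 310, 270, 300, 360, 405, 290]

def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

# Rank the pooled sample; the first len(disease) ranks belong to that group.
ranks = midranks(disease + normal)
t_disease = sum(ranks[:len(disease)])
t_normal = sum(ranks[len(disease):])

# Normal approximation based on the rank sum of the smaller sample.
n1, n2 = len(normal), len(disease)
n = n1 + n2
mean_t1 = n1 * (n + 1) / 2
sd_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)  # no tie correction applied
z = (t_normal - mean_t1) / sd_t1
print(t_disease, t_normal, round(z, 3))
```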

Problem 3. A study was conducted focusing on the protein concentration of duodenal secretions from patients with cystic fibrosis. The following table relates protein concentration (mg/ml) to pancreatic function as measured by trypsin secretion.

Trypsin Secretions [u/(kg/hr)]

| Subject (≤ 50) | Protein Concentration | Subject (51-1000) | Protein Concentration | Subject (≥ 1001) | Protein Concentration |
|---|---|---|---|---|---|
| 1 | 1.7 | 10 | 1.4 | 20 | 2.9 |
| 2 | 2.0 | 11 | 2.4 | 21 | 3.8 |
| 3 | 2.0 | 12 | 2.4 | 22 | 4.4 |
| 4 | 2.2 | 13 | 3.3 | 23 | 4.7 |
| 5 | 4.0 | 14 | 4.4 | 24 | 5.0 |
| 6 | 4.0 | 15 | 4.7 | 25 | 5.6 |
| 7 | 5.0 | 16 | 6.7 | 26 | 7.4 |
| 8 | 6.7 | 17 | 7.6 | 27 | 9.4 |
| 9 | 7.8 | 18 | 9.5 | 28 | 10.3 |
| | | 19 | 11.7 | | |

a) What parametric and nonparametric statistical procedures can be used to compare the protein concentration for the three groups?

b) Perform both tests mentioned above.

c) Compare all pairs of group means/medians using both parametric and nonparametric methods. Use a Bonferroni correction to ensure a family-wise error rate of no more than 0.05.

Problem 4. The following table gives contraceptive usage among a simple random sample of 100 women age 15 - 45 discharged from a large hospital with a diagnosis of idiopathic thromboembolism. A researcher wanted to know if there is a difference in contraceptive usage among women suffering from thromboembolic disease. Conduct a statistical test to answer the researcher's question.

Contraceptive Used Observed Count

Oral Contraceptive 30

IUD 25

Diaphragm 20

None 25

Total 100
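Problem 4 can be read as a χ² goodness-of-fit test. The sketch below assumes the null hypothesis that all four categories are equally likely, since the problem does not give expected proportions; a different null would change the expected counts.

```python
import math

observed = {"Oral Contraceptive": 30, "IUD": 25, "Diaphragm": 20, "None": 25}

n = sum(observed.values())
expected = n / len(observed)  # equal usage under the assumed null hypothesis

chi2 = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

# Closed-form upper-tail probability of a chi-square variable with 3 df:
# erfc(sqrt(x / 2)) + sqrt(2 x / pi) * exp(-x / 2)
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))
print(chi2, df, round(p_value, 4))
```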

Problem 5. The following tabulation represents the outcome of a drug trial of 60 patients assigned to a new treatment as compared to 40 patients assigned to the standard treatment for a particular disease. The researcher wanted to know if the new treatment was different than the standard treatment. Conduct a statistical test to answer the researcher's question.

Treatment Group Improved Not Improved Total

New 40 20 60

Standard 25 15 40

Total 65 35 100

Problem 6. Sleep-disordered breathing is very common among adult males. Snoring is a significant contributor to these disorders, in particular with regards to sleep apnea. To estimate the prevalence of this disorder 1670 male employees, 30-60 years of age, who worked for three large state agencies in Georgia were surveyed as to their snoring status. Conduct a statistical test to assess if snoring status is related to age. The results of the survey are given in the following table.

| Age | Snore: Yes | Snore: No | Total |
|---|---|---|---|
| 30-39 | 188 | 348 | 536 |
| 40-49 | 313 | 383 | 696 |
| 50-60 | 232 | 206 | 438 |
| Total | 733 | 937 | 1670 |
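Problem 6 asks whether snoring status is related to age, which suggests a χ² test of independence on the 3x2 table. A minimal sketch, with illustrative names:

```python
import math

# Age group -> (snores, does not snore)
snore_counts = {
    "30-39": (188, 348),
    "40-49": (313, 383),
    "50-60": (232, 206),
}

rows = list(snore_counts.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(rows):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)
# For 2 df the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(round(chi2, 3), df, p_value)
```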

Problem 7. A company operates a production line for making prescription pill bottles. The relation between the speed of the line (X) and the amount of scrap (Y) for the day was studied for 15 days. The basic data for the line is given in the table below.

Day Y (lbs.) X (ft./min.)

1 218 100

2 248 125

3 360 220

4 351 205

5 470 300

6 394 255

7 332 225

8 321 175

9 410 270

10 260 170

11 241 155

12 331 190

13 275 140

14 425 290

15 367 265

a) For these data obtain estimates for the regression parameters, a and b, for the simple linear regression of Y on X and interpret the meaning of each estimator.

b) What is the fitted regression line?

c) Find the standard error of each estimator in part a.

d) Set a 95% confidence interval on a and b and interpret the meaning of these intervals.

e) Write out the ANOVA for the regression giving source, degrees-of-freedom, sum-of-squares, mean squares.

f) What is the estimated variance of the regression ($s^2_{Y|X}$)?

g) Perform an F-test for the "significance" of regression. Be sure to clearly state the hypotheses tested and your conclusions.

h) Compute $R^2$ and interpret its meaning.

i) Compute Pearson's product-moment (simple) correlation coefficient ($r$). What is its associated p-value? Interpret its meaning and assess its significance.

j) Compute Spearman's rank-order correlation coefficient ($r_s$). What is its associated p-value? Interpret its meaning and assess its significance.
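Most of parts a) through i) of Problem 7 follow from a handful of sums, which the sketch below makes explicit. It is not NCSS output. The t value used for the confidence intervals is a table lookup for 13 degrees of freedom, and Spearman's coefficient and the p-values are left out.

```python
import math

# Line speed X (ft./min.) and scrap Y (lbs.) for the 15 days in the table.
speed = [100, 125, 220, 205, 300, 255, 225, 175, 270, 170, 155, 190, 140, 290, 265]
scrap = [218, 248, 360, 351, 470, 394, 332, 321, 410, 260, 241, 331, 275, 425, 367]

n = len(speed)
x_bar = sum(speed) / n
y_bar = sum(scrap) / n
sxx = sum((x - x_bar) ** 2 for x in speed)
syy = sum((y - y_bar) ** 2 for y in scrap)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(speed, scrap))

# Least-squares estimates for Y = a + b X.
b = sxy / sxx
a = y_bar - b * x_bar

# ANOVA decomposition: total = regression + error.
ss_reg = b * sxy
ss_err = syy - ss_reg
ms_err = ss_err / (n - 2)      # estimated variance about the regression line
f_stat = ss_reg / ms_err       # F with 1 and n - 2 degrees of freedom

# Standard errors of the slope and intercept.
se_b = math.sqrt(ms_err / sxx)
se_a = math.sqrt(ms_err * (1 / n + x_bar ** 2 / sxx))

t_crit = 2.160                 # two-sided 95% t value for 13 df, from a t table
ci_b = (b - t_crit * se_b, b + t_crit * se_b)
ci_a = (a - t_crit * se_a, a + t_crit * se_a)

r_squared = ss_reg / syy
r = sxy / math.sqrt(sxx * syy)

print(f"a = {a:.3f} (SE {se_a:.3f}), b = {b:.4f} (SE {se_b:.4f})")
print(f"SSR = {ss_reg:.2f}, SSE = {ss_err:.2f}, MSE = {ms_err:.2f}, F = {f_stat:.2f}")
print(f"R^2 = {r_squared:.4f}, r = {r:.4f}")
```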
