# Nonparametric, Regression, and Correlation problems

Dentistry

In a study, 28 adults with mild periodontal disease are assessed before and 6 months after implementation of a dental-education program intended to promote better oral hygiene. After 6 months, periodontal status improved in 15 patients, declined in 8, and remained the same in 5.

9.4 Suppose there are two samples of size 6 and 7, with a rank sum of 58 in the sample of size 6. Using the Wilcoxon rank-sum test, evaluate the significance of the results, assuming there are no ties.

Infectious Disease

The distribution of white-blood-cell count is typically positively skewed, and assumptions of normality are usually not valid.

9.11 To compare the distribution of white-blood-cell counts of patients on the medical and surgical services in Table 2.11 when normality is not assumed, what test can be used?

9.12 Perform the test in Problem 9.11, and report a p-value.

Ophthalmology

An investigator wants to test a new eye drop that is supposed to prevent ocular itching during allergy season. To study the drug, she uses a contralateral design whereby for each participant one eye is randomized (using a random-number table) to get active drug (A) while the other eye gets placebo (P). The participants use the eye drops three times a day for a 1-week period and then report their degree of itching in each eye on a 4-point scale (1 = none, 2 = mild, 3 = moderate, 4 = severe) without knowing which eye drop is used in which eye. Ten participants are randomized into the study.

Refer to the data set in Tables 7.7 and 7.8

Table 7.7 Randomization assignment

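A = active drug, P = placebo

| Subject | Left eye (L) | Right eye (R) |
|---|---|---|
| 1 | A | P |
| 2 | P | A |
| 3 | A | P |
| 4 | A | P |
| 5 | P | A |
| 6 | A | P |
| 7 | A | P |
| 8 | P | A |
| 9 | A | P |
| 10 | A | P |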

Table 7.8 Itching scores reported by participants

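| Subject | Left eye (L) | Right eye (R) | Difference* |
|---|---|---|---|
| 1 | 1 | 2 | -1 |
| 2 | 3 | 3 | 0 |
| 3 | 4 | 3 | 1 |
| 4 | 2 | 4 | -2 |
| 5 | 4 | 1 | 3 |
| 6 | 2 | 3 | -1 |
| 7 | 2 | 4 | -2 |
| 8 | 3 | 2 | 1 |
| 9 | 4 | 4 | 0 |
| 10 | 1 | 2 | -1 |
| Mean | 2.60 | 2.80 | -0.20 |
| sd | 1.17 | 1.03 | 1.55 |
| N | 10 | 10 | 10 |

*Itching score left eye - itching score right eye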

7.71 What test can be used to test the hypothesis that the mean degree of itching is the same for active vs. placebo eyes?

9.38 Answer the question in Problem 7.71 using nonparametric methods.

9.39 Implement the test suggested in Problem 9.38, and report a two-sided p-value.

Diabetes

Growth during adolescence among girls with diabetes has been shown to relate to consistency in taking insulin injections. A similar hypothesis was tested in adolescent boys ages 13-17. Boys were seen at repeated visits approximately 90 days apart. Their weight and HgbA1c, a marker that reflects consistency in taking insulin injections over the past 30 days, were measured at each visit. People with diabetes have a higher-than-normal HgbA1c; the goal of insulin treatment is to lower the HgbA1c level as much as possible. To look at the relationship between change in weight and change in HgbA1c, each of 23 boys was ascertained during one 90-day interval when HgbA1c change was minimal (i.e., change of < 1%) (control period) and during another 90-day interval when HgbA1c increased by ≥ 1% (lack-of-consistency period); this is a fairly large increase, indicating lack of consistency in taking insulin injections. These data represent a subset of the data in DIABETES.DAT on the Companion Website. Weight change was compared between these intervals using the following measure:

∆ = (weight change, control period) - (weight change, lack-of-consistency period)

A frequency distribution of the data sorted in increasing order of ∆ is shown in Table 9.13. Suppose we assume normality of the change scores in Table 9.13.

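Table 9.13 (Weight change, control period) - (weight change, lack-of-consistency period) among 23 adolescent diabetic boys

| i | ∆i |
|---|---|
| 1 | -12.6 |
| 2 | -10.3 |
| 3 | -5.9 |
| 4 | -5.4 |
| 5 | -4.5 |
| 6 | -2.7 |
| 7 | -1.8 |
| 8 | +0.3 |
| 9 | +2.2 |
| 10 | +3.5 |
| 11 | +4.8 |
| 12 | +5.4 |
| 13 | +5.8 |
| 14 | +6.0 |
| 15 | +6.7 |
| 16 | +9.6 |
| 17 | +11.5 |
| 18 | +12.2 |
| 19 | +13.9 |
| 20 | +14.2 |
| 21 | +18.0 |
| 22 | +18.6 |
| 23 | +21.7 |
| Mean | 4.83 |
| sd | 9.33 |
| n | 23 |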

9.47 What test can be used to compare weight change during the control period vs. weight change during the lack-of-consistency period?

9.49 Answer the question in Problem 9.47 if we are not willing to make the assumption of normality.

9.50 Implement the test in Problem 9.49, and report a two-tailed p-value.
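One way to approach Problem 9.50 is the Wilcoxon signed-rank test with the large-sample normal approximation. The sketch below is an illustration under stated assumptions, not the textbook's worked answer: it assigns average ranks to tied absolute differences, applies a tie correction to the variance, and uses a continuity correction of 0.5. The helper name `average_ranks` is hypothetical.

```python
import math

# Differences from Table 9.13 (control period minus lack-of-consistency period).
deltas = [-12.6, -10.3, -5.9, -5.4, -4.5, -2.7, -1.8, 0.3, 2.2, 3.5, 4.8, 5.4,
          5.8, 6.0, 6.7, 9.6, 11.5, 12.2, 13.9, 14.2, 18.0, 18.6, 21.7]


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank.

    Returns the ranks (in the original order) and the size of each tie group.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


nonzero = [d for d in deltas if d != 0]  # zero differences are dropped
n = len(nonzero)
ranks, tie_sizes = average_ranks([abs(d) for d in nonzero])

# Rank sum of the positive differences.
r_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)

# Mean and tie-corrected variance of the rank sum under the null hypothesis.
mean_r = n * (n + 1) / 4
var_r = n * (n + 1) * (2 * n + 1) / 24 - sum(t**3 - t for t in tie_sizes) / 48

# Continuity-corrected z statistic and two-tailed normal p-value.
z = (abs(r_plus - mean_r) - 0.5) / math.sqrt(var_r)
p_two_sided = math.erfc(z / math.sqrt(2))

print(f"R+ = {r_plus}, z = {z:.3f}, two-tailed p = {p_two_sided:.4f}")
```

Note that -5.4 and +5.4 tie in absolute value, which is why the ranking helper handles ties rather than assuming distinct values.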

Endocrinology

A study to assess the effect of a low-fat diet on estrogen metabolism recruited 6 healthy women ages 21-32 [5]. The women were within 5% of their ideal body weight, were not participating in athletics, and were eating a typical American diet. For the first 4 weeks the women were fed a high-fat diet (40% of total calories from fat). They were then switched to a low-fat diet (25% of calories from fat) for 2 months. During the follicular phase of their menstrual cycle (days 5-7), each woman was given a sugar cube with [³H]E2 (estradiol). This was done once during the high-fat period and again after the woman had been eating the low-fat diet for 2 months. The percentage of orally administered [³H]E2 excreted in urine as 16α-hydroxylated glucuronides is given in Table 9.15.

Table 9.15 Percentage of orally administered [³H]E2 excreted in urine as glucuronides of 16α-OHE1

| Subject | High-fat diet | Low-fat diet |
|---|---|---|
| 1 | 2.55 | 1.27 |
| 2 | 2.92 | 1.60 |
| 3 | 1.71 | 0.53 |
| 4 | 4.00 | 1.02 |
| 5 | 0.77 | 0.74 |
| 6 | 1.03 | 0.67 |

9.63 What nonparametric test can be used to compare the 16α-OHE1 percentages on a high-fat diet vs. a low-fat diet?

9.64 Perform the test in Problem 9.63 and report a two-tailed p-value.

Hypertension

The Update to the Task Force Report on Blood Pressure Control in Children [14] reported the observed 90th percentile of SBP in single years of age from age 1 to 17 based on prior studies. The data for boys of average height are given in Table 11.21.

Table 11.21 90th percentile of SBP in boys ages 1-17 of average height

| Age (x) | SBP* (y) |
|---|---|
| 1 | 99 |
| 2 | 102 |
| 3 | 105 |
| 4 | 107 |
| 5 | 108 |
| 6 | 110 |
| 7 | 111 |
| 8 | 112 |
| 9 | 114 |
| 10 | 115 |
| 11 | 117 |
| 12 | 120 |
| 13 | 122 |
| 14 | 125 |
| 15 | 127 |
| 16 | 130 |
| 17 | 132 |

*90th percentile for each 1-year age group.

11.20 What is the predicted blood pressure for an average 17-year-old boy as estimated from the regression line?

11.21 What is the standard error of the estimate in Problem 11.20?
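Problems 11.20 and 11.21 can be explored with an ordinary least-squares fit of SBP on age. The sketch below is one possible reading, assuming that "standard error of the estimate" means the standard error of the fitted mean response at age 17 (not the wider standard error for predicting a single new observation). All variable names are illustrative.

```python
import math

# Table 11.21: age (years) and 90th percentile of SBP for boys of average height.
ages = list(range(1, 18))
sbp = [99, 102, 105, 107, 108, 110, 111, 112, 114,
       115, 117, 120, 122, 125, 127, 130, 132]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(sbp) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in ages)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, sbp))

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fitted value at age 17.
x0 = 17
predicted = intercept + slope * x0

# Residual standard deviation with n - 2 degrees of freedom.
residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ages, sbp))
s_yx = math.sqrt(residual_ss / (n - 2))

# Standard error of the fitted mean response at x0 (an assumed interpretation).
se_mean = s_yx * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)

print(f"slope = {slope:.4f}, intercept = {intercept:.3f}")
print(f"predicted SBP at age {x0} = {predicted:.2f}, se = {se_mean:.3f}")
```

If the standard error for a single new observation is wanted instead, the term under the square root becomes `1 + 1/n + (x0 - x_bar)**2 / sxx`.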

Pediatrics, Endocrinology

Transient hypothyroxinemia, a common finding in premature infants, is not thought to have long-term consequences or to require treatment. A study was performed to investigate whether hypothyroxinemia in premature infants is a cause of subsequent motor and cognitive abnormalities [17]. Blood thyroxine values were obtained on routine screening in the first week of life from 536 infants who weighed 2000 g or less at birth and were born at 33 weeks gestation or earlier. The data in Table 11.23 were presented concerning the relationship between mean thyroxine level and gestational age.

Table 11.23 Relationship between mean thyroxine level and gestational age among 536 premature infants

| (x) Gestational age (weeks) | (y) Mean thyroxine level (µg/dl) |
|---|---|
| ≤24* | 6.5 |
| 25 | 7.1 |
| 26 | 7.0 |
| 27 | 7.1 |
| 28 | 7.2 |
| 29 | 7.1 |
| 30 | 8.1 |
| 31 | 8.7 |
| 32 | 9.5 |
| 33 | 10.1 |

*Treated as 24 in subsequent analyses

11.49 Is there a significant association between mean thyroxine level and gestational age? Report a p-value.

Endocrinology

A 65-year-old woman with low bone density in 1992 was treated with alendronate through the year 1999. Bone density was measured irregularly over this period. The results for change in bone density of the lumbar spine are shown in Table 11.26.

Table 11.26 Change in bone density, lumbar spine, over time

| Visit (i) | Time from baseline (months) (t_i) | Bone density, lumbar spine (g/cm²) (x_i) |
|---|---|---|
| 1 | 0 | 0.797 |
| 2 | 8 | 0.806 |
| 3 | 18 | 0.817 |
| 4 | 48 | 0.825 |
| 5 | 64 | 0.837 |
| 6 | 66 | 0.841 |
| 7 | 79 | 0.886 |
| 8 | 92 | 0.881 |

11.77 The normal change in bone density over time from age 40 to age 80 is a decrease of 0.15 g/cm². Does the rate of change in this woman differ significantly from the expected age-related change?

Ophthalmology

Retinitis pigmentosa (RP) is a hereditary ocular disease in which patches of pigment appear on the retina, potentially resulting in substantial vision loss and in some cases complete blindness. An important issue is how fast the subjects decline. Visual field area, measured in degree², is an important measure of the area of vision. The visual field area for a normal person is around 11,000 degree². The longitudinal data in Table 11.31 were provided by an individual patient. Suppose the rate of change of ln (visual field) is a linear function of follow-up time.

Table 11.31 Longitudinal visual field data for one RP patient

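| Visit | Time (yr) | Visual field area (degree²) | ln (visual field area) |
|---|---|---|---|
| 1 | 0 | 3059 | 8.03 |
| 2 | 1 | 3053 | 8.02 |
| 3 | 2 | 1418 | 7.26 |
| 4 | 3 | 1692 | 7.43 |
| 5 | 4 | 1978 | 7.59 |
| 6 | 5 | 1567 | 7.36 |
| 7 | 6 | 1919 | 7.56 |
| 8 | 7 | 1998 | 7.60 |
| 9 | 11 | 1648 | 7.41 |
| 10 | 13 | 1721 | 7.45 |
| 11 | 15 | 1264 | 7.14 |
| mean | 6.09 | 1938 | 7.532 |
| sd | 4.97 | 597 | 0.280 |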

11.102 What does the intercept mean in this context? What is the estimated % decline in visual field per year?
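For Problem 11.102, a regression of ln (visual field area) on time gives a slope that converts to a percentage change per year through 1 - exp(slope). The sketch below assumes a simple least-squares fit and takes natural logs of the tabulated areas directly (so results can differ slightly from those based on the rounded ln column). Under this model the intercept is the fitted ln (visual field area) at time 0, that is, at the first visit.

```python
import math

# Table 11.31: follow-up time (years) and visual field area (degree^2).
time_yr = [0, 1, 2, 3, 4, 5, 6, 7, 11, 13, 15]
area = [3059, 3053, 1418, 1692, 1978, 1567, 1919, 1998, 1648, 1721, 1264]

log_area = [math.log(a) for a in area]

n = len(time_yr)
t_bar = sum(time_yr) / n
y_bar = sum(log_area) / n

sxx = sum((t - t_bar) ** 2 for t in time_yr)
sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(time_yr, log_area))

# Slope: change in ln(area) per year. Intercept: fitted ln(area) at time 0.
slope = sxy / sxx
intercept = y_bar - slope * t_bar

# On the original scale, each year multiplies the fitted area by exp(slope).
pct_decline_per_year = 100 * (1 - math.exp(slope))
baseline_area = math.exp(intercept)

print(f"slope = {slope:.4f} per year, intercept = {intercept:.3f}")
print(f"fitted baseline area = {baseline_area:.0f} degree^2")
print(f"estimated decline = {pct_decline_per_year:.1f}% per year")
```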

Nutrition

The assessment of diet is an important exposure for many disease outcomes. However, there is often a lot of imprecision in dietary recall. In one study, 70- to 79-year-old women were asked about the preschool diet of their children (ages 2-4) using a food frequency questionnaire (FFQ). A unique aspect of the study is that simultaneous diet record (DR) data exist on the same children recorded in real time by their mothers when they were ages 2-4 and their mothers were 20 to 40 years old. The data in Table 11.40 were available on average servings of margarine per week. The Pearson correlation between intake from the two recording methods was 0.448. Assume that FFQ and DR margarine intake are normally distributed.

Table 11.40 Margarine intake assessed by two different recording methods (servings per week, [n = 12])

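| ID | FFQ | DR |
|---|---|---|
| 340 | 7 | 0 |
| 399 | 7 | 0.5 |
| 466 | 0 | 0 |
| 502 | 0 | 0 |
| 541 | 0 | 0 |
| 554 | 7 | 2.5 |
| 558 | 7 | 3 |
| 605 | 7 | 0.5 |
| 611 | 21 | 3.7 |
| 618 | 0 | 2.5 |
| 653 | 21 | 4.1 |
| 707 | 7 | 8.5 |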

11.118 Test the hypothesis that the true Pearson correlation (ρ) is significantly different from zero (provide a two-tailed p-value).
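A common approach to Problem 11.118 is the one-sample t test for a correlation, t = r·sqrt(n - 2)/sqrt(1 - r²) with n - 2 degrees of freedom. The sketch below recomputes r from Table 11.40 and obtains the two-tailed p-value by numerically integrating the t density with Simpson's rule, so that only the standard library is needed. The helper names `t_pdf` and `two_sided_p` are hypothetical, and the integration is an approximation, not a library-grade routine.

```python
import math

# Table 11.40: margarine intake (servings per week) by FFQ and by diet record.
ffq = [7, 7, 0, 0, 0, 7, 7, 7, 21, 0, 21, 7]
dr = [0, 0.5, 0, 0, 0, 2.5, 3, 0.5, 3.7, 2.5, 4.1, 8.5]

n = len(ffq)
x_bar = sum(ffq) / n
y_bar = sum(dr) / n
sxx = sum((x - x_bar) ** 2 for x in ffq)
syy = sum((y - y_bar) ** 2 for y in dr)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ffq, dr))

# Sample Pearson correlation and its t statistic under H0: rho = 0.
r = sxy / math.sqrt(sxx * syy)
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and |t|
    return 1 - central_area


p_value = two_sided_p(t_stat, df)
print(f"r = {r:.3f}, t = {t_stat:.3f}, df = {df}, two-tailed p = {p_value:.3f}")
```

The recomputed r agrees with the 0.448 quoted in the problem statement, which is a useful check that the table was transcribed correctly.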

© BrainMass Inc. brainmass.com October 16, 2018, 9:02 pm https://brainmass.com/statistics/regression-analysis/nonparametric-regression-correlation-problems-544892

#### Solution Preview

Dentistry

In a study, 28 adults with mild periodontal disease are assessed before and 6 months after implementation of a dental-education program intended to promote better oral hygiene. After 6 months, periodontal status improved in 15 patients, declined in 8, and remained the same in 5.

9.4 Suppose there are two samples of size 6 and 7, with a rank sum of 58 in the sample of size 6. Using the Wilcoxon rank-sum test, evaluate the significance of the results, assuming there are no ties.

Solution:

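Wilcoxon rank-sum test

1. Combine the two samples and rank all the observations. The smallest observation has rank 1 and the largest has rank N (= n1 + n2).
2. Separate the samples again and add up the ranks for the smaller sample. (If n1 = n2, choose either one.)
3. The test statistic is the rank sum T for the smaller sample.

Rejection region (with the sample from Population A, of size n1, being smaller than the sample from Population B): reject H0 if

TA ≥ TU or TA ≤ TL

where TL and TU are the lower and upper critical values from the rank-sum table.

Under H0, TA has mean n1(n1 + n2 + 1)/2 and variance n1n2(n1 + n2 + 1)/12, which gives the normal approximation

Z = (TA - n1(n1 + n2 + 1)/2)/sqrt(n1n2(n1 + n2 + 1)/12)

With n1 = 6, n2 = 7, and TA = 58:

Z = (58 - 6*14/2)/sqrt(6*7*14/12) = (58 - 42)/7 = 2.285714

The two-sided P-value for this is about 0.022. With samples this small the normal approximation is only rough, so the tabulated critical values are the better guide.

The results are significant, as the P-value is less than 0.05.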
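The normal approximation for Problem 9.4 can be checked numerically. This is a minimal sketch that evaluates the Z formula above with n1 = 6, n2 = 7, and a rank sum of 58, without a continuity correction; variable names are illustrative.

```python
import math

n1, n2 = 6, 7    # sample sizes (rank sum taken from the sample of size 6)
rank_sum = 58    # observed rank sum in the smaller sample

# Mean and standard deviation of the rank sum under H0 (no ties).
mean_t = n1 * (n1 + n2 + 1) / 2
sd_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardized statistic and two-sided normal p-value.
z = (rank_sum - mean_t) / sd_t
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"mean = {mean_t}, sd = {sd_t}, z = {z:.4f}, two-sided p = {p_two_sided:.4f}")
```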

Infectious Disease

The distribution of white-blood-cell count is typically positively skewed, and assumptions of normality are usually not valid.

9.11 To compare the distribution of white-blood-cell counts of patients on the medical and surgical services in Table 2.11 when normality is not assumed, what test can be used?

9.12 Perform the test in Problem 9.11, and report a p-value.

Solution:

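Because the medical and surgical patients form two independent samples and normality is not assumed, the Wilcoxon rank-sum (Mann-Whitney) test is the appropriate choice for comparing the two distributions. The table below instead shows a chi-square computation in which each patient's white-blood-cell count is treated as an observed value and the overall mean (7.84) as the expected value; it does not compare the two services.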

| ID No. | First WBC (× 10³) following admission | Expected | O-E | (O-E)^2/E |
|---|---|---|---|---|
| 1 | 8 | 7.84 | 0.16 | 0.003265 |
| 2 | 5 | 7.84 | -2.84 | 1.028776 |
| 3 | 12 | 7.84 | 4.16 | 2.207347 |
| 4 | 4 | 7.84 | -3.84 | 1.880816 |
| 5 | 11 | 7.84 | 3.16 | 1.273673 |
| 6 | 6 | 7.84 | -1.84 | 0.431837 |
| 7 | 8 | 7.84 | 0.16 | 0.003265 |
| 8 | 7 | 7.84 | -0.84 | 0.09 |
| 9 | 7 | 7.84 | -0.84 | 0.09 |
| 10 | 12 | 7.84 | 4.16 | 2.207347 |
| 11 | 7 | 7.84 | -0.84 | 0.09 |
| 12 | 3 | 7.84 | -4.84 | 2.987959 |
| 13 | 11 | 7.84 | 3.16 | 1.273673 |
| 14 | 14 | 7.84 | 6.16 | 4.84 |
| 15 | 11 | 7.84 | 3.16 | 1.273673 |
| 16 | 9 | 7.84 | 1.16 | 0.171633 |
| 17 | 6 | 7.84 | -1.84 | 0.431837 |
| 18 | 6 | 7.84 | -1.84 | 0.431837 |
| 19 | 5 | 7.84 | -2.84 | 1.028776 |
| 20 | 6 | 7.84 | -1.84 | 0.431837 |
| 21 | 10 | 7.84 | 2.16 | 0.595102 |
| 22 | 14 | 7.84 | 6.16 | 4.84 |
| 23 | 4 | 7.84 | -3.84 | 1.880816 |
| 24 | 5 | 7.84 | -2.84 | 1.028776 |
| 25 | 5 | 7.84 | -2.84 | 1.028776 |

Chi-Square 31.55102

P-value 0.138522

Ophthalmology

7.71 What test can be used to test the hypothesis that the mean degree of itching is the same for active vs. placebo eyes?

Solution:

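Because each participant contributes one active eye and one placebo eye, the two scores are paired. If we assume the within-subject differences (active eye minus placebo eye) are normally distributed, we can use the paired t test to test the hypothesis that the mean degree of itching is the same for active vs. placebo eyes.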

9.38 Answer the question in Problem 7.71 using nonparametric methods.

Solution:

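If normality is not assumed, we can use the Wilcoxon signed-rank test on the within-subject differences (active eye minus placebo eye), because the two eyes of each participant form a matched pair. The Wilcoxon rank-sum (Mann-Whitney) test is not appropriate here, since it requires independent samples.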

9.39 Implement the test suggested in Problem 9.38, and report a two-sided p-value.

Solution:

The below is the data sorted for active and ...
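One possible way to carry out Problem 9.39 is an exact Wilcoxon signed-rank test on the active-minus-placebo differences. Table 7.8 reports left minus right, so the sketch below first uses Table 7.7 to re-express each difference as active minus placebo. With only a few nonzero differences and many ties, it enumerates every sign assignment rather than using a normal approximation. This is an illustration under those assumptions, not the posted solution; the helper name `mean_rank` is hypothetical.

```python
from itertools import product

# Table 7.7: True if the left eye received the active drug.
left_has_active = [True, False, True, True, False, True, True, False, True, True]
# Table 7.8: itching scores for the left and right eyes, subjects 1-10.
left = [1, 3, 4, 2, 4, 2, 2, 3, 4, 1]
right = [2, 3, 3, 4, 1, 3, 4, 2, 4, 2]

# Differences expressed as active eye minus placebo eye.
diffs = [(l - r) if active_left else (r - l)
         for l, r, active_left in zip(left, right, left_has_active)]
nonzero = [d for d in diffs if d != 0]  # zero differences are dropped

abs_sorted = sorted(abs(d) for d in nonzero)


def mean_rank(value):
    """Average of the 1-based positions that `value` occupies in abs_sorted."""
    positions = [i + 1 for i, a in enumerate(abs_sorted) if a == value]
    return sum(positions) / len(positions)


ranks = [mean_rank(abs(d)) for d in nonzero]
r_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
expected = sum(ranks) / 2  # mean of the positive rank sum under H0

# Exact null distribution: each rank is positive or negative with probability 1/2.
extreme = 0
for signs in product([0, 1], repeat=len(ranks)):
    rank_sum = sum(r for r, positive in zip(ranks, signs) if positive)
    if abs(rank_sum - expected) >= abs(r_plus - expected):
        extreme += 1
p_exact = extreme / 2 ** len(ranks)

print(f"differences = {diffs}")
print(f"R+ = {r_plus}, exact two-sided p = {p_exact:.4f}")
```

Enumeration is practical here because eight nonzero differences give only 2^8 = 256 sign patterns.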

#### Solution Summary

This solution gives a detailed explanation of nonparametric tests and of regression and correlation problems. The appropriate test is identified for each question with proper reasoning. The individual problems are listed in the description of the posting. The detailed solution is provided for better understanding.

Multiple choice questions from correlation and regression.

Question 1

The range of the correlation coefficient is?

a. -1 to 0.

b. 0 to 1.

c. -1 to 1.

d. None of the above.

Question 2

Which of the following values could not represent a correlation coefficient?

a. r = 0.99

b. r = 1.09

c. r = -0.73

d. r = -1.0

Question 3

Answer questions 3 - 9 from the following problem statement.

An instructor wants to show the students that there is a linear correlation between the number of hours they spent watching TV during a certain weekend and their scores on a test taken the following Monday. The number of television viewing hours and the test scores for 12 randomly selected students are shown in Table 1.

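What is the value of the correlation coefficient, r?

| Hours, x | 0 | 1 | 2 | 3 | 3 | 5 | 5 | 5 | 6 | 7 | 7 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score, y | 96 | 85 | 82 | 74 | 95 | 68 | 76 | 84 | 58 | 65 | 75 | 50 |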

a. 0.84

b. -0.831

c. -0.952

d. 1.0

Question 4

Using Table 11 in Appendix B of the text, what is the critical value if α = 0.05?

a. 0.632

b. 0.735

c. 0.708

d. 0.576

Question 5

Which of the following is true at α = 0.05?

a. |r| > critical value

b. |r| < critical value

c. |r| = critical value

d. None of the above

Question 6

What are the null and alternative hypotheses for this given problem?

a. Ho: ρ = 0; Ha: ρ ≠ 0

b. Ho: ρ ≥ 0; Ha: ρ < 0

c. Ho: ρ ≤ 0; Ha: ρ > 0

d. Ho: ρ < 0; Ha: ρ ≥ 0

Question 7

If a hypothesis test at α = 0.05 has been performed using Table 5 in Appendix B, then what are the rejection regions?

a. t < -2.447 & t > 2.447

b. t < -1.812 & t > 1.812

c. t < -2.228 & t > 2.228

d. t < -0.7 & t > 0.7

Question 8

What is the value of the standardized test statistic, t?

a. -4.73

b. -5.18

c. 5.67

d. 4.03

Question 9

Which pair of conclusions is appropriate for this given problem?

a. The correlation is not significant; reject the null hypothesis.

b. The correlation is significant; reject the null hypothesis.

c. The correlation is not significant; fail to reject the null hypothesis.

d. The correlation is significant; fail to reject the null hypothesis.

Question 10

If ŷ is the predicted value for a given x-value and b is the y-intercept then the equation of a regression line for an independent variable x and a dependent variable y is

a. ŷ = mx + b, where m = slope.

b. x = ŷ + mb, where m = slope.

c. ŷ = x/m + b, where m = slope.

d. ŷ = x + mb, where m = slope.

Question 11

Answer questions 11-14 from the following table.

The ages (in years) of seven children and the number of words in their vocabulary are given in Table 2.

What is the value of the slope, m, of the regression line for the given data?

| Age, x | 3 | 4 | 4 | 5 | 6 | 2 | 3 |
|---|---|---|---|---|---|---|---|
| Vocabulary size, y | 1100 | 1300 | 1500 | 2100 | 2600 | 460 | 1200 |

a. m = -504.474

b. m = -490.791

c. m = 510.789

d. m = 1300

Question 12

What is the value of the y-intercept, b, of the regression line for the given data?

a. -504.474

b. -490.791

c. 510.789

d. 1300

Question 13

Using the regression line equation for the given data, find the value of y for x = 4.

a. 1300

b. 1400

c. 1500

d. 1539

Question 14

What is the standard error of estimate, se, for this problem?

a. 139.2

b. 141.3

c. 1143.3

d. 1541.8

Question 15

Which of the following indicates a strong positive correlation?

a. r = 0

b. r = -0.793

c. r = 0.913

d. r = 0.45

Question 16

The coefficient of determination is the

a. ratio of the explained variation to the total deviation.

b. ratio of the unexplained deviation to the explained deviation.

c. ratio of the unexplained deviation to the total variation.

d. ratio of the explained variation to the total variation.

Question 17

If the correlation coefficient r = 0.5 then the coefficient of determination is

a. 0.10

b. 0.25

c. 1.00

d. 2.50

Question 18

Which one of the following statements is false?

a. A nonparametric test is a hypothesis test.

b. A nonparametric test requires a specific condition.

c. Nonparametric tests are easier to perform than corresponding parametric tests.

d. Nonparametric tests are less efficient than parametric tests.

Question 19

Which of the following statements is true about using the Wilcoxon signed-rank test and the Wilcoxon rank-sum test?

a. If the samples are dependent we can use a Wilcoxon signed-rank test.

b. If the samples are independent we can use a Wilcoxon rank sum test.

c. None of the above.

d. Both of A & B.

Question 20

The condition(s) for using a Kruskal-Wallis test is/are

a. The sample must be randomly selected.

b. The size of each sample must be at least 10.

c. None of the above.

d. Both of A & B.