# Correlation and Regression Review

5

Answer questions 5-8 using the following information:

Test the hypothesis that the treatment means for the samples given below are equal. Use the .01 significance level.

| Treatment 1 | Treatment 2 | Treatment 3 |
|---|---|---|
| 22 | 34 | 13 |
| 20 | 31 | 10 |
| 21 | 25 | 14 |
| 18 | 25 | 11 |
| 19 | 32 | |
| | 30 | |

The decision rule is:

Choose one answer.

a. Reject the null hypothesis if F > 5.42

b. Reject the null hypothesis if F > 6.93

c. Accept the null hypothesis if F > 26.9

d. Reject the null hypothesis if F > 99.4

6

SS total is:

Choose one answer.

a. -4,132.8

b. 755.83

c. 845.33

d. 4,132.8

7

MSE is:

Choose one answer.

a. 7.46

b. 377.92

c. 422.66

d. 2,066.4

8

The F statistic =

Choose one answer.

a. 1.00

b. 7.46

c. 50.67

d. 54.5

9

Based on the analysis of variance, we would fail to reject H₀.

Choose one answer.

a. True

b. False
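The quantities asked for in questions 5-9 can be sketched numerically. This is a minimal one-way ANOVA worked by hand, assuming the ragged rows of the data table belong to the treatments as laid out above (5, 6, and 4 observations); the variable names are illustrative only.

```python
# One-way ANOVA by hand for three treatments with unequal sample sizes.
groups = {
    "Treatment 1": [22, 20, 21, 18, 19],
    "Treatment 2": [34, 31, 25, 25, 32, 30],
    "Treatment 3": [13, 10, 14, 11],
}

all_values = [v for values in groups.values() for v in values]
n_total = len(all_values)
k = len(groups)
grand_mean = sum(all_values) / n_total

# Total variation: squared deviations of every observation from the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

# Treatment variation: squared deviations of group means, weighted by group size.
ss_treatment = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

ss_error = ss_total - ss_treatment
df_treatment = k - 1      # numerator degrees of freedom
df_error = n_total - k    # denominator degrees of freedom
mse = ss_error / df_error
f_stat = (ss_treatment / df_treatment) / mse

print(f"SS total = {ss_total:.2f}, SSE = {ss_error:.2f}")
print(f"df = ({df_treatment}, {df_error}), MSE = {mse:.2f}, F = {f_stat:.2f}")
```

The computed F can then be compared with the critical value from an F table for the printed degrees of freedom at the .01 level.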

10

In an experiment in which two of four similar units are each compressed at three different levels (light, medium, heavy) to determine resilience, the number of degrees of freedom (numerator, denominator) is:

Choose one answer.

a. (2,3)

b. (2,6)

c. (1,4)

d. (1,3)

11

Please use the following information to answer questions 11 and 12:

The following data apply to a two-factor ANOVA:

| Source | Treatment 1 | Treatment 2 | Treatment 3 |
|---|---|---|---|
| A | 12 | 14 | 8 |
| B | 9 | 11 | 9 |
| C | 7 | 8 | 8 |

SST for the data =

Choose one answer.

a. 1.36

b. 10.89

c. 31.11

d. 42.22

12

SSB for the data =

Choose one answer.

a. 20.22

b. 31.11

c. 53.33

d. 63.11
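The sums of squares in questions 11 and 12 can be sketched as follows. This assumes the common textbook convention that SST is the treatment (column) sum of squares and SSB is the block (row) sum of squares, with one observation per cell; that reading of the abbreviations is an assumption, not something the question states.

```python
# Rows are sources (blocks) A, B, C; columns are treatments 1, 2, 3.
data = {
    "A": [12, 14, 8],
    "B": [9, 11, 9],
    "C": [7, 8, 8],
}

rows = list(data.values())
n_blocks = len(rows)
n_treatments = len(rows[0])
grand_mean = sum(sum(r) for r in rows) / (n_blocks * n_treatments)

treatment_means = [sum(r[j] for r in rows) / n_blocks for j in range(n_treatments)]
block_means = [sum(r) / n_treatments for r in rows]

# Each treatment mean is based on n_blocks values, each block mean on n_treatments.
sst = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
ssb = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
ss_total = sum((v - grand_mean) ** 2 for r in rows for v in r)
sse = ss_total - sst - ssb

print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SS total = {ss_total:.2f}, SSE = {sse:.2f}")
```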

13

The regression equation:

Choose one answer.

a. can be adjusted to accommodate any number of independent variables.

b. indicates an inverse relationship between variables when a "b" coefficient has a negative sign.

c. should only be used to predict values for the dependent variable that are inside the range of the sample values.

d. both a and b.

e. all of the above.

14

The measure of explained variation is the:

Choose one answer.

a. coefficient of multiple determination.

b. coefficient of multiple nondetermination.

c. regression coefficient.

d. correlation matrix.

15

An analyst determines the relationship between the time taken to perform a computer-triggered production function (Y, in minutes), the memory required to run the function (X₁, in thousands of bytes), and the amount of input (X₂, in thousands of lines of data). The regression equation representing this relationship is determined to be:

Y' = 11.43 + 1.26X₁ + 3.11X₂

For required memory of 25,000 bytes of data, and input of 8,000 lines of data, the estimated time to run the function is:

Choose one answer.

a. 14.233 minutes

b. 67.81 minutes

c. 73.69 minutes

d. 129.43 minutes

e. not calculable without additional data.

16

A run that required 15,000 bytes of memory and 8,000 lines of input took 54 minutes; this is:

Choose one answer.

a. 13 minutes less than expected.

b. 1.2 minutes less than expected.

c. 1.2 minutes more than expected.

d. not calculable without additional data.
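Questions 15 and 16 both amount to plugging values into the fitted equation. A minimal sketch, assuming both predictors are entered in thousands (so 25,000 bytes becomes 25); the function name `predicted_time` is hypothetical.

```python
def predicted_time(memory_thousands, input_thousands):
    """Estimated run time in minutes from Y' = 11.43 + 1.26*X1 + 3.11*X2."""
    return 11.43 + 1.26 * memory_thousands + 3.11 * input_thousands

# Question 15: 25,000 bytes of memory and 8,000 lines of input.
estimate = predicted_time(25, 8)

# Question 16: residual = observed time minus predicted time.
# A negative residual means the run was faster than the equation predicts.
residual = 54 - predicted_time(15, 8)

print(f"estimate = {estimate:.2f} minutes, residual = {residual:.2f} minutes")
```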

17

Correlation analysis:

Choose one answer.

a. measures the extent to which changes in an independent variable cause changes in a dependent variable.

b. refers to a group of techniques to measure the strength of the association between two variables.

c. measures relationships in terms of strong (a positive r value); moderate (an r value close to zero); and weak (a negative r value).

d. is the development of a mathematical model to estimate the value of one variable based on the value of another.

e. both b and d.

f. all of the above.

18

An r value of -0.75 indicates:

Choose one answer.

a. weak negative correlation between variables.

b. weak positive correlation between variables.

c. strong negative correlation between variables.

d. virtually no correlation between variables.

19

Please answer questions 19-22 using the following data.

| x | 2 | 4 | 1 | 5 | 3 |
|---|---|---|---|---|---|
| y | 15 | 25 | 10 | 40 | 30 |

Σ(X²) =

Choose one answer.

a. 55

b. 3,025

c. 3,450

d. 14,400

20

Σ(Y²) =

Choose one answer.

a. 225

b. 3,450

c. 6,800

d. 14,400

21

r =

Choose one answer.

a. -0.0025

b. 0.073

c. 0.927

d. 1.08

22

The r value indicates:

Choose one answer.

a. moderately weak negative correlation between x and y.

b. moderately strong negative correlation between x and y.

c. weak positive correlation between x and y.

d. strong correlation between x and y.

23

The percentage of variation in a dependent variable that is explained by an independent variable is known as the:

Choose one answer.

a. correlation of the variables.

b. coefficient of correlation.

c. coefficient of determination.

d. regression equation.

24

For the data presented for questions 19-22, the proportion of total variation in y that is explained by x is:

Choose one answer.

a. 7%

b. 14%

c. 74%

d. 86%
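The sums, the correlation coefficient, and the coefficient of determination for the data in questions 19-22 can be sketched with the usual computational formula for Pearson's r. Variable names are illustrative.

```python
from math import sqrt

x = [2, 4, 1, 5, 3]
y = [15, 25, 10, 40, 30]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)             # sum of squared x values
sum_y2 = sum(v * v for v in y)             # sum of squared y values
sum_xy = sum(a * b for a, b in zip(x, y))  # sum of cross products

# Pearson's r from raw sums.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r_squared = r ** 2  # proportion of variation in y explained by x

print(f"sum x^2 = {sum_x2}, sum y^2 = {sum_y2}, r = {r:.3f}, r^2 = {r_squared:.2f}")
```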

25

The coefficient of determination:

Choose one answer.

a. must be a positive number.

b. cannot be larger, in absolute terms, than the coefficient of correlation.

c. must be greater, in absolute terms, than the coefficient of nondetermination.

d. is negative when the coefficient of nondetermination is negative.

e. both a and b.

f. all of the above.

26

A sample of 25 shipments of a product from a warehouse indicates that the correlation between number of defects and processing speed is 0.56. At the .05 significance level:

Choose one answer.

a. t = 1.714, indicating no positive association between the two variables.

b. t = -1.94, indicating a negative association between the two variables.

c. t = 1.94, indicating a positive association between the two variables.

d. t = 3.24, indicating a positive association between the two variables.
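The test statistic in question 26 follows from the standard t test for a correlation coefficient, t = r·√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. A minimal sketch; the function name `correlation_t` is hypothetical.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for testing whether a sample correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

n = 25
t_stat = correlation_t(0.56, n)
df = n - 2

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The result is then compared with the critical t value for 23 degrees of freedom at the chosen significance level.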

27

The least-squares regression equation:

Choose one answer.

a. is determined by minimizing total squared error between the actual Y values and the predicted values of Y.

b. is used to estimate the value of the dependent variable Y based on any value for the independent variable X.

c. determines the amount of correlation between the X and Y variables.

d. is determined by minimizing the total error between the X and Y values.

e. both a and c.

f. both b and d.

28

In the linear regression equation Y = a + bx, the slope is represented by:

Choose one answer.

a. a

b. b

c. x

d. y

29

Please answer questions 29-31 using the following data.

| x | 2 | 4 | 6 | 8 | 10 |
|---|---|---|---|---|---|
| y | 3 | 1 | 7 | 5 | 9 |

The value of b is:

Choose one answer.

a. 0.8

b. 1.0

c. 2.2

d. 4.0

30

For this data, a value of 5 for x would yield a predicted value for y of:

Choose one answer.

a. 1.24

b. 4.20

c. 6

d. 12

31

The correlation coefficient (r) for this data is:

Choose one answer.

a. -0.2

b. 0.64

c. 0.80

d. 0.89

e. none of the above.
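The slope, intercept, prediction, and correlation for questions 29-31 can be sketched with the least-squares formulas written in terms of raw sums. Variable names are illustrative.

```python
from math import sqrt

x = [2, 4, 6, 8, 10]
y = [3, 1, 7, 5, 9]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(p * q for p, q in zip(x, y))

# Least-squares slope and intercept for Y' = a + bX.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

y_at_5 = a + b * 5  # predicted y when x = 5

r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(f"b = {b:.2f}, a = {a:.2f}, y'(5) = {y_at_5:.2f}, r = {r:.2f}")
```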

32

The overall accuracy of a regression procedure may be estimated by the degree of scatter about the regression line provided by the sample results.

Choose one answer.

a. True

b. False

33

For the data given for questions 29-31, the standard error of the estimate is:

Choose one answer.

a. 1.27

b. 2.19

c. 3.45

d. not calculable without additional data.
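The standard error of the estimate measures scatter about the fitted line: the square root of the residual sum of squares divided by n − 2. A minimal sketch for the data of questions 29-31, refitting the line so the block stands alone:

```python
from math import sqrt

x = [2, 4, 6, 8, 10]
y = [3, 1, 7, 5, 9]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares fit in deviation form.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Residual (unexplained) sum of squares about the fitted line.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
std_error = sqrt(sse / (n - 2))

print(f"SSE = {sse:.1f}, standard error of estimate = {std_error:.2f}")
```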

34

The application of linear regression is based on the assumption that:

Choose one answer.

a. for each x value, there is a group of Y values that is normally distributed.

b. the means of the normal distributions of Y values all lie on the straight line of regression.

c. the standard deviations of the Y values are equal.

d. Y values are statistically independent.

e. both a and b.

f. all of the above.

35

For the data given for questions 29-31, a 95% confidence interval for Y' for an x value of 9 is:

Choose one answer.

a. [3.59, 10.81]

b. [2.86, 11.94]

c. [5.34, 9.42]

d. [5.05, 9.74]

e. [4.53, 13.47]
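A confidence interval for the mean of Y at a given x is Y' ± t·s·√(1/n + (x − x̄)² / Σ(x − x̄)²). The sketch below applies that formula to the data of questions 29-31, assuming the two-tailed 95% critical value t = 3.182 for n − 2 = 3 degrees of freedom, taken from a standard t table rather than computed.

```python
from math import sqrt

x = [2, 4, 6, 8, 10]
y = [3, 1, 7, 5, 9]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
a = mean_y - b * mean_x

sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
std_error = sqrt(sse / (n - 2))

t_crit = 3.182  # assumption: table value for 95% confidence, 3 degrees of freedom
x0 = 9
y_hat = a + b * x0
half_width = t_crit * std_error * sqrt(1 / n + (x0 - mean_x) ** 2 / sxx)
lower, upper = y_hat - half_width, y_hat + half_width

print(f"Y' = {y_hat:.2f}, interval = [{lower:.2f}, {upper:.2f}]")
```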

36

A confidence interval:

Choose one answer.

a. refers to a particular case for a given value of X.

b. determines a particular value for Y given a certain X value.

c. determines the mean value of Y for a given X.

d. increases in size as the level of confidence decreases.

e. both b and d.

f. all of the above.

37

r² measures the reduction in the total sum of squares achieved by fitting the regression line.

Choose one answer.

a. True

b. False

38

Total explained variation for the data presented in questions 29-31 is:

Choose one answer.

a. 14.4

b. 11.2

c. 17.4

d. 25.6

e. 40
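Total variation splits into explained and unexplained parts: Σ(y − ȳ)² = Σ(Y' − ȳ)² + Σ(y − Y')². A minimal sketch of that decomposition for the data of questions 29-31:

```python
x = [2, 4, 6, 8, 10]
y = [3, 1, 7, 5, 9]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x
fitted = [a + b * xi for xi in x]

ss_total = sum((yi - mean_y) ** 2 for yi in y)                      # total variation
ss_explained = sum((fi - mean_y) ** 2 for fi in fitted)             # explained by the line
ss_unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual variation

print(f"total = {ss_total:.1f}, explained = {ss_explained:.1f}, "
      f"unexplained = {ss_unexplained:.1f}")
```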

39

Please answer questions 39-41 using the following table.

| Source | DF | SS | MS |
|---|---|---|---|
| Regression | 1 | 7,582 | --- |
| Error | 29 | 524.5 | --- |
| Total | 30 | --- | --- |

Total variation is:

Choose one answer.

a. 7,057.50

b. 7,582

c. 8,106.50

d. not calculable without additional data.

40

The coefficient of determination is:

Choose one answer.

a. 0.874

b. 0.935

c. 0.967

d. 1.074

41

The standard error of the estimate is:

Choose one answer.

a. 4.253

b. 16.17

c. 18.09

d. 419
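The blanks in the ANOVA table for questions 39-41 follow from three relationships: SS total = SS regression + SS error, r² = SS regression / SS total, and standard error = √(SS error / error DF). A minimal sketch using the table's entries:

```python
from math import sqrt

ss_regression = 7582.0
ss_error = 524.5
df_error = 29

ss_total = ss_regression + ss_error   # total variation
r_squared = ss_regression / ss_total  # coefficient of determination
mse = ss_error / df_error             # mean square error
std_error = sqrt(mse)                 # standard error of the estimate

print(f"SS total = {ss_total:.2f}, r^2 = {r_squared:.3f}, "
      f"standard error = {std_error:.3f}")
```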

#### Solution Summary

The solution provides answers to multiple choice questions on correlation and regression.