# Multiple Regression and Non-Parametric Methods

See Attached File

© BrainMass Inc. brainmass.com October 25, 2018, 8:36 am https://brainmass.com/statistics/regression-analysis/multiple-regression-and-non-parametric-methods-546624

#### Solution Preview

14.20

The data are stored in a file, 1420_data.txt, as follows:

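| County | MedianIncome | MedianAge | Coastal |
|--------|--------------|-----------|---------|
| A | 48157 | 57.7 | 1 |
| B | 48568 | 60.7 | 1 |
| C | 46816 | 47.9 | 1 |
| D | 34876 | 38.4 | 0 |
| E | 35478 | 42.8 | 0 |
| F | 34465 | 35.4 | 0 |
| G | 35026 | 39.5 | 0 |
| H | 38599 | 65.6 | 0 |
| J | 33315 | 27.0 | 0 |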

Using R:

a.

Is there a linear relationship between the median income and median age?

> d1420 <- read.table("1420_data.txt", header=TRUE)

> colnames(d1420) = c("County", "MedianIncome", "MedianAge", "Coastal")

> attach(d1420)

The following object(s) are masked from 'd1420 (position 3)':

Coastal, County, MedianAge, MedianIncome

> cor.test(MedianIncome, MedianAge)

Pearson's product-moment correlation

data: MedianIncome and MedianAge

t = 2.7561, df = 7, p-value = 0.02825

alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval:

0.1099740 0.9367364

sample estimates:

cor

0.7214069

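Conclusion: Yes. There is a positive linear correlation between Median Income and Median Age (r = 0.72), and it is statistically significant at the 5% level (p-value = 0.02825).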
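The t statistic and p-value reported by cor.test can be checked by hand from the correlation coefficient, using t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. The following is a minimal sketch of that check; it assumes the nine rows listed above are entered inline rather than read from 1420_data.txt, and the names `t_stat` and `p_value` are illustrative.

```r
# Data from problem 14.20, typed inline so the sketch does not need the file
d1420 <- data.frame(
  County       = c("A", "B", "C", "D", "E", "F", "G", "H", "J"),
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0),
  Coastal      = c(1, 1, 1, 0, 0, 0, 0, 0, 0)
)

n <- nrow(d1420)
r <- cor(d1420$MedianIncome, d1420$MedianAge)  # Pearson correlation

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

print(c(r = r, t = t_stat, df = n - 2, p = p_value))
```

The printed values should agree with the cor.test output above up to rounding.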

b.

Which variable is the "dependent" variable?

Median Income is the dependent variable; Median Age is the independent (explanatory) variable.

c.

Use statistical software to determine the regression equation. Interpret the value of the slope in a simple regression equation.

> lm(MedianIncome~ MedianAge)

Call:

lm(formula = MedianIncome ~ MedianAge)

Coefficients:

(Intercept) MedianAge

22804.7 361.6

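i.e.,

MedianIncome = 361.6*MedianAge + 22804.7

Interpretation of the slope: each additional year of median age is associated with an estimated increase of about 361.6 in median income (in the units of the data, presumably dollars).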
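To make the meaning of the slope concrete, the sketch below refits the simple regression and compares predictions for two counties whose median ages differ by 10 years. The data are typed inline (an assumption, instead of reading 1420_data.txt), and the ages 40 and 50 are hypothetical values chosen only for illustration.

```r
# Data from problem 14.20, typed inline
d1420 <- data.frame(
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0)
)

# Simple linear regression of income on age
fit <- lm(MedianIncome ~ MedianAge, data = d1420)
b <- coef(fit)
print(b)

# Slope: estimated change in median income per one-year increase in median age
slope <- unname(b["MedianAge"])

# Hypothetical counties with median ages 40 and 50 (not from the source)
new_counties <- data.frame(MedianAge = c(40, 50))
pred <- predict(fit, newdata = new_counties)

# The gap between the two predictions is 10 times the slope
gap <- unname(diff(pred))
print(c(gap = gap, ten_times_slope = 10 * slope))
```

Passing the data frame through the `data` argument avoids the need for attach() in this sketch.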

d.

Include the aspect that the county is "coastal" or not in a multiple linear regression analysis using a "dummy" variable. Does it appear to be a significant influence on incomes?

> lmIncAgeCoast = lm(formula = MedianIncome ~ MedianAge + Coastal)

> summary(lmIncAgeCoast)

Call:

lm(formula = MedianIncome ~ MedianAge + Coastal)

Residuals:

Min 1Q Median 3Q Max

-0.27723 -0.14934 0.01199 0.06174 ...
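The preview output is cut off at this point, and the sketch below does not attempt to reproduce the missing lines. It only shows one way to read off the significance of the dummy variable: fit the model with the data listed above (typed inline here, an assumption) and inspect the Coastal row of the coefficient table. The 5% cutoff is a conventional choice, not something stated in the source.

```r
# Data from problem 14.20, typed inline
d1420 <- data.frame(
  MedianIncome = c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315),
  MedianAge    = c(57.7, 60.7, 47.9, 38.4, 42.8, 35.4, 39.5, 65.6, 27.0),
  Coastal      = c(1, 1, 1, 0, 0, 0, 0, 0, 0)  # dummy: 1 = coastal, 0 = not coastal
)

# Multiple regression with the dummy variable added
fit2 <- lm(MedianIncome ~ MedianAge + Coastal, data = d1420)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- summary(fit2)$coefficients
print(coef_table)

# p-value for the Coastal coefficient, holding MedianAge fixed
coastal_p <- coef_table["Coastal", "Pr(>|t|)"]

# Assumed 5% significance level
print(coastal_p < 0.05)
```

The Coastal estimate is the estimated income difference between coastal and non-coastal counties of the same median age.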

#### Solution Summary

A few good quality statistics questions on linear regression are solved using R.

Multiple choice questions:

1. _____ is a procedure for deriving a mathematical relationship, in the form of an equation between a single metric dependent variable and a single metric independent variable.

a. chi-square

b. part correlation

c. multiple regression

d. bivariate regression

2. _______ are hypothesis testing procedures that assume that the variables of interest are measured on at least an interval scale.

a. parameter tests

b. parametric tests

c. nonparametric tests

d. none of the above

3. The ______ is a symmetric bell-shaped distribution that is useful for small-sample (n < 30) testing.

a. t distribution

b. frequency distribution

c. chi-square distribution

d. F distribution

4. The degrees of freedom for the t statistic to test a hypothesis about one mean are _____.

a. n

b. n - 1

c. n1 + n2

d. n1 + n2 - 2

5. The _____ is a statistical test of the equality of the variance of two populations.

a. z test

b. t test

c. paired sample test

d. F test

6. The total variation in y is _____.

a. SSy

b. SSwithin

c. SSbetween

d. SSy

7. Which of the following statements is not correct about the alternative hypothesis?

a. There is no way to determine whether the alternative hypothesis is true.

b. The alternative hypothesis represents the conclusion for which the evidence is sought.

c. The alternative hypothesis is the opposite of the null hypothesis.

d. None of the statements are correct.

8. Hypothesis tests can be used to relate to:

a. tests of strengths

b. tests of association

c. tests of differences

d. b and c are correct

9. Also known as SSerror, _____ is the variation in Y due to the variation within each of the categories of X. This variation is not accounted for by X.

a. SSy

b. SSwithin

c. SSbetween

d. SSy
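The sums of squares named in questions 6 and 9 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it borrows the incomes from problem 14.20 as Y and treats the Coastal indicator as the categorical X, which is a choice made here rather than part of the quiz. It checks that the total variation splits into a between-category part and a within-category part.

```r
# Y and a two-category X, borrowed from problem 14.20 for illustration
y <- c(48157, 48568, 46816, 34876, 35478, 34465, 35026, 38599, 33315)
x <- factor(c(1, 1, 1, 0, 0, 0, 0, 0, 0))  # 1 = coastal, 0 = not coastal

grand_mean <- mean(y)
group_mean <- ave(y, x)  # each observation's own category mean

# Total variation in Y
ss_y <- sum((y - grand_mean)^2)

# Variation between category means (accounted for by X)
ss_between <- sum((group_mean - grand_mean)^2)

# Variation within categories (not accounted for by X)
ss_within <- sum((y - group_mean)^2)

print(c(SSy = ss_y, SSbetween = ss_between, SSwithin = ss_within))

# The decomposition SSy = SSbetween + SSwithin holds up to floating-point error
print(all.equal(ss_y, ss_between + ss_within))
```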

10. How consumers' intentions to buy a brand vary with different levels of price and different levels of distribution is best analyzed via _____.

a. n-way ANOVA

b. one-way ANOVA

c. ANCOVA

d. Regression

11. _____ is a statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables.

a. regression

b. partial correlation coefficient

c. ANOVA

d. Product moment correlation

12. The degrees of freedom for the t statistic to test a hypothesis about two independent samples are __________

a. n

b. n - 1

c. n1 + n2

d. n1 + n2 - 2

13. Univariate techniques can be classified based on _____.

a. whether the data are metric or nonmetric

b. whether one, two, or more than two samples are involved

c. whether interdependence techniques or dependence techniques are to be used

d. a and b are correct

14. A hypothesis test produces a t statistic of t = 2.30. If the researcher is using a two-tailed test with alpha = .05, how large does the sample have to be to reject the null hypothesis?

A. at least n = 8

B. at least n = 9

C. at least n = 10

D. at least n = 11
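Question 14 can be worked through by comparing t = 2.30 with the two-tailed critical value at each candidate sample size. The sketch below assumes a one-sample t test, so that df = n - 1; the question does not state the design, and that assumption drives the result. The variable names are illustrative.

```r
t_obs <- 2.30   # observed t statistic from the question
alpha <- 0.05   # two-tailed significance level from the question

# Candidate sample sizes; df = n - 1 assumes a one-sample t test
n <- 2:30
t_crit <- qt(1 - alpha / 2, df = n - 1)

# The null hypothesis is rejected when the observed t exceeds the critical value
reject <- t_obs > t_crit

# Show the sample sizes around the point where the decision changes
print(data.frame(n = n, df = n - 1, t_crit = round(t_crit, 3), reject = reject)[6:11, ])

# Smallest sample size at which t = 2.30 is significant
min_n <- min(n[reject])
print(min_n)
```

Under that assumption, the critical value is about 2.306 with df = 8 and about 2.262 with df = 9, so t = 2.30 first becomes significant at n = 10.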

15. _____ occurs when the sample results lead to the rejection of the null hypothesis that is in fact true.

a. Type I error

b. Two-tailed error

c. Type II error

d. One-tailed error