Regression analysis is a powerful and commonly used tool in business research. One important step in regression is to determine the dependent and independent variable(s).
In a bivariate regression, which variable is the dependent variable and which one is the independent variable?
What does the intercept of a regression tell? What does the slope of a regression tell?
What are some of the main uses of a regression?
Provide an example of a situation wherein a bivariate regression would be a good choice for analyzing data.
Justify your answers using examples and reasoning. Comment on the postings of at least two peers and state whether you agree or disagree with their views.
Types of Regression Analyses
There are two major types of regression analysis: simple and multiple regression analysis. Both types consist of dependent and independent variables. Simple linear regression has two variables: one dependent and one independent. Multiple regression consists of one dependent variable and two or more independent variables.
How does a multiple regression compare with a simple linear regression?
What are the various ways to determine what variables should be included in a multiple regression equation?
Compare and contrast the following processes: forward selection, backward elimination, and stepwise selection.
Consider the bivariate equation
y = a + bx
Q. In a bivariate regression, which variable is the dependent variable and which one is the independent variable?
A. The dependent variable is the variable denoted as y in the equation "y = a + bx". Its value depends on the value of the independent variable (denoted as x in the above equation).
The independent variable is the variable denoted as x in the equation "y = a + bx". Its value does not depend on any other variable within the framework of the equation, and the value of the dependent variable depends on it.
E.g., the rate of growth of bacteria 'y' depends on the ambient temperature 'x'. It can be described in a bivariate regression as
Growth_bacteria = a + b*(temperature)
Similarly, the rate of growth of the number of automobiles in a city 'y' depends on the rate of growth of the city's population 'x', i.e. growth_automobile = a + b*(growth_population)
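A minimal sketch of how the coefficients 'a' and 'b' in y = a + bx could be estimated by ordinary least squares, written in plain Python. The function name fit_bivariate and the temperature/growth readings are made up for illustration (the points were constructed to lie exactly on a straight line); they are not from any real study.

```python
def fit_bivariate(x, y):
    """Estimate a (intercept) and b (slope) in y = a + b*x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of x and y, and sum of squares of x, about their means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary to estimate a slope")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical data: ambient temperature (x) and bacterial growth rate (y)
temperature = [10, 15, 20, 25, 30]
growth = [17.0, 24.5, 32.0, 39.5, 47.0]

a, b = fit_bivariate(temperature, growth)
print(f"Growth_bacteria = {a:.2f} + {b:.2f}*(temperature)")
```

With real data the points would scatter around the line rather than fall on it, and the fitted 'a' and 'b' would be the values that minimize the sum of squared vertical distances between the observed and predicted y.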
Q. What does the intercept of a regression tell?
A. The intercept of the regression ('a' in the equation under consideration) is the expected (mean) value of the dependent variable 'y' when the independent variable 'x' is zero (x = 0).
For example, when the ambient temperature 'x' = 0, the predicted rate of growth of bacteria equals the constant 'a'.
Q. What does the slope of a regression tell?
A. The slope of the regression ('b' in the equation y = a + bx) is the expected change in the dependent variable 'y' for a one-unit change in the independent variable 'x'. The steeper the slope, i.e. the larger the absolute value of 'b', the larger the change in 'y' for every unit change in 'x'.
For example, if Growth_bacteria = 2 + 1.5*(temperature), i.e. the slope 'b' is 1.5, then the growth of bacteria increases by 1.5 units for every one-unit increase in temperature.
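The interpretation of the intercept and the slope can be checked numerically. The sketch below uses the example equation Growth_bacteria = 2 + 1.5*(temperature) from the answer above; the function name predict_growth is hypothetical.

```python
def predict_growth(temperature, a=2.0, b=1.5):
    """Predicted growth from the example equation Growth_bacteria = a + b*(temperature)."""
    return a + b * temperature


# Intercept: the predicted value of y when x = 0
print(predict_growth(0))  # 2.0

# Slope: the change in predicted y for a one-unit increase in x
print(predict_growth(21) - predict_growth(20))  # 1.5
```

The one-unit difference is the same wherever it is measured along the line, which is what makes the relationship linear.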
Q. What are some of the main uses of a regression?
A. Key uses of regression include the following:
- Forecasting and ...
This solution discusses the following concepts: understanding the construction and uses of bivariate analysis; comparison of multiple regression with simple linear regression; determining the criteria for inclusion of variables in multiple regression; understanding the procedures for selecting variables in multiple regression, i.e. forward selection, backward elimination and stepwise selection.
Multiple choice questions:
1. _____ is a procedure for deriving a mathematical relationship, in the form of an equation between a single metric dependent variable and a single metric independent variable.
b. part correlation
c. multiple regression
d. bivariate regression
2. _______ are hypothesis testing procedures that assume that the variables of interest are measured on at least an interval scale.
a. parameter tests
b. parametric tests
c. nonparametric tests
d. none of the above
3. The ______ is a symmetric bell-shaped distribution that is useful for small sample (n < 30) testing.
a. t distribution
b. frequency distribution
c. chi-square distribution
d. F distribution
4. The degrees of freedom for the t statistic to test hypothesis about one mean are _____
b. n - 1
c. n1 + n2
d. n1 + n2 -2
5. The _____ is a statistical test of the equality of the variance of two populations.
a. z test
b. t test
c. paired sample test
d. F test
6. The total variation in y is _____.
7. Which of the following statements is not correct about the alternative hypothesis?
a. There is no way to determine whether the alternative hypothesis is true.
b. The alternative hypothesis represents the conclusion for which the evidence is sought.
c. The alternative hypothesis is the opposite of the null hypothesis.
d. None of the statements are correct.
8. Hypothesis tests can be used to relate to:
a. tests of strengths
b. tests of association
c. tests of differences
d. b and c are correct
9. Also known as SSerror, _____ is the variation in Y due to the variation within each of the categories of X. This variation is not accounted for by X.
10. How consumers' intentions to buy a brand vary with different levels of price and different levels of distribution is best analyzed via _____.
a. n-way ANOVA
b. one-way ANOVA
11. _____ is a statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables.
b. partial correlation coefficient
d. Product moment correlation
12. The degrees of freedom for the t statistic to test hypothesis about two independent samples are __________
b. n - 1
c. n1 + n2
d. n1 + n2 -2
13. Univariate techniques can be classified based on _____.
a. whether the data are metric or nonmetric
b. whether one, two, or more than two samples are involved
c. whether interdependence techniques or dependence techniques are to be used
d. a and b are correct
14. A hypothesis test produces a t statistic of t = 2.30. If the researcher is using a two-tailed test with alpha = .05, how large does the sample have to be to reject the null hypothesis?
A. at least n = 8
B. at least n = 9
C. at least n = 10
D. at least n = 11
15. _____ occurs when the sample results lead to the rejection of the null hypothesis that is in fact true.
a. Type I error
b. Two-tailed error
c. Type II error
d. One-tailed error