# Linear Regression and Correlation

a. What is linear regression?

b. What can linear regression do for you - both in a general business sense and specifically to your place of employment, or circle of influence?

c. What are some of the limitations of regression analysis?

a. What is correlation analysis and why is it important to us when we are using regression analysis?

b. Provide a definition of R and R squared. What is the relationship between the two numbers?

c. What is the equation for a straight line? Give a brief example of each of the variables involved.

a. What is the difference between a strong negative and a strong positive R?

b. What does zero correlation tell you? What about a correlation of positive or negative one?

c. What is the relationship between the independent and dependent variable? Can the independent and dependent variables be interchanged?

https://brainmass.com/statistics/regression-analysis/linear-regression-and-correlation-49029

#### Solution Preview

Please see response attached for best formatting (also below), including examples as well. I hope this helps and take care.

RESPONSE:

1. a. What is linear regression?

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.

Before attempting to fit a linear model to observed data, a modeler should first determine whether or not there is a relationship between the variables of interest. This does not necessarily imply that one variable causes the other (for example, higher SAT scores do not cause higher college grades), but that there is some significant association between the two variables. A scatterplot can be a helpful tool in determining the strength of the relationship between two variables. If there appears to be no association between the proposed explanatory and dependent variables (i.e., the scatterplot does not indicate any increasing or decreasing trends), then fitting a linear regression model to the data probably will not provide a useful model. A valuable numerical measure of association between two variables is the correlation coefficient, which is a value between -1 and 1 indicating the strength of the association of the observed data for the two variables.

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0) (http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm).
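For reference, the least-squares estimates of the slope b and intercept a in Y = a + bX are commonly written as shown below. The summation notation, the index i, the sample size n, and the means x̄ and ȳ are notation introduced here for illustration and do not come from the cited source.

$$
b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
a = \bar{y} - b\,\bar{x}
$$

These are the values that minimize the sum of squared vertical distances between the observed points and the line.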

Some definitions are as follows:

· A statistical procedure for predicting the value of a dependent variable from an independent variable when the relationship between the variables can be described with a linear model. A linear regression equation can be written as Yp= mX + b, where Yp is the predicted value of the dependent variable, m is the slope of the regression line, and b is the Y-intercept of the regression line. In Microsoft Excel, the LINEST function is used to perform linear regression.

aa.uncw.edu/ward/chm255/glossary.htm

· The relation between variables when the regression equation is linear: e.g., y = ax + b

wordnet.princeton.edu/perl/webwn

· A statistical technique used to find the best-fitting linear relationship between a target (dependent) variable and its predictors (independent variables).

www.cs.ualberta.ca/~zaiane/courses/cmput690/glossary.html

· The process of finding the equation of a straight line that best fits the data.

highered.mcgraw-hill.com/sites/0072480823/student_view0/glossary.html

· In statistics, linear regression is a method of estimating the conditional expected value of one variable y given the values of some other variable or variables x. The variable of interest, y, is conventionally called the "dependent variable".

The terms "endogenous variable" and "output variable" are also used. The other variables x are called the "independent variables". The terms "exogenous variables" and "input variables" are also used. The dependent and independent variables may be scalars or vectors.

en.wikipedia.org/wiki/Linear_regression

(http://66.102.7.104/search?q=cache:XNgvkk-DG_YJ:www.lsbu.ac.uk/psycho/teaching/ppfiles/rm3-7-04-05.ppt+limitations+of+regression+analysis&hl=en)

b. What can linear regression do for you - both in a general business sense and specifically to your place of employment, or circle of influence?

i) The International Rice Research Institute in the Philippines wants to relate the grain yield of rice varieties, y, to the tiller number, x. They conducted experiments for some rice varieties and tillers (see example attached).

ii) Participatory style management (x) predicts employee behavior (y).

iii) A trendline shows the trend in a data set and is typically associated with regression analysis. Creating a trendline and calculating its coefficients allows for the quantitative analysis of the underlying data and the ability to both interpolate and extrapolate the data for forecast purposes. It is probably best to illustrate the problem with a simple example.

Consider the monthly sales shown in Table 1.

| Month | Sales |
|---|---|
| 1 | 3100 |
| 2 | 4500 |
| 3 | 4400 |
| 4 | 5400 |
| 5 | 7500 |
| 6 | 8100 |

Table 1 (see attached article)

From http://www.tushar-mehta.com/excel/tips/trendline_coefficients.htm
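The trendline idea can be sketched in a few lines of Python using the Table 1 sales figures. This is a minimal illustration, not the method used in the cited article: the function name `fit_trendline` and the one-month-ahead forecast are assumptions added here.

```python
# Least-squares trendline for the monthly sales in Table 1.
months = [1, 2, 3, 4, 5, 6]
sales = [3100, 4500, 4400, 5400, 7500, 8100]


def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


slope, intercept = fit_trendline(months, sales)

# Extrapolate one month past the observed data (hypothetical forecast step).
forecast_month_7 = slope * 7 + intercept

print(f"sales = {slope:.0f} * month + {intercept:.0f}")
print(f"forecast for month 7: {forecast_month_7:.0f}")
```

For these six points the fit works out to sales = 1000 × month + 2000, so the extrapolated value for month 7 is 9000. Forecasts that reach far beyond the observed months should be treated with caution.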

c. What are some of the limitations of regression analysis?

i) The above ...

#### Solution Summary

Based on the questions, this solution provides a detailed discussion of linear regression and correlation analysis. It also compares aspects of the independent and dependent variables.

Correlation/Linear Regression Guidelines

1. Construct a scatter plot using Excel for the given data. Determine whether there is a positive linear correlation, negative linear correlation, or no linear correlation. Complete the table and find the correlation coefficient r.

a. The data below are the ages and systolic blood pressure (measured in millimeters of mercury) of 9 randomly selected adults.

| Age, x | 38 | 41 | 45 | 48 | 51 | 53 | 57 | 61 | 65 |
|---|---|---|---|---|---|---|---|---|---|
| Pressure, y | 116 | 120 | 123 | 131 | 142 | 145 | 148 | 150 | 152 |

Part 1: Scatter plot

Part 2: Type of correlation (positive linear correlation, negative linear correlation, or no linear correlation)

Part 3: Complete the table and find the correlation coefficient r.

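| x | y | xy | x² | y² |
|---|---|---|---|---|
| 38 | 116 | | | |
| 41 | 120 | | | |
| 45 | 123 | | | |
| 48 | 131 | | | |
| 51 | 142 | | | |
| 53 | 145 | | | |
| 57 | 148 | | | |
| 61 | 150 | | | |
| 65 | 152 | | | |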

Use the last row of the table to show the column totals.

n = 9

r =
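One way to check a hand-completed table is to build the xy, x², and y² columns programmatically and apply the computational formula for r. The sketch below uses the age and pressure data from problem 1. The helper names `correlation_table` and `pearson_r` are hypothetical and are not part of the worksheet.

```python
from math import sqrt

# Data from problem 1: ages (x) and systolic blood pressures (y).
ages = [38, 41, 45, 48, 51, 53, 57, 61, 65]
pressures = [116, 120, 123, 131, 142, 145, 148, 150, 152]


def correlation_table(xs, ys):
    """Build rows of (x, y, xy, x^2, y^2) and a final tuple of column totals."""
    rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]
    totals = tuple(sum(column) for column in zip(*rows))
    return rows, totals


def pearson_r(n, totals):
    """Correlation coefficient from the column totals of the table."""
    sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


rows, totals = correlation_table(ages, pressures)
r = pearson_r(len(ages), totals)

for row in rows:
    print(*row, sep="\t")
print(*totals, sep="\t")  # last row: column totals
print(f"n = {len(ages)}, r = {r:.3f}")
```

The same two functions can be reused for the data in problem 2 by swapping in the other x and y lists.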

2. Construct a scatter plot using Excel for the given data. Determine whether there is a positive linear correlation, negative linear correlation, or no linear correlation. Complete the table and find the correlation coefficient r. The data for x and y are shown below.

| x | 11 | -6 | 8 | -3 | -2 | 1 | 5 | -5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|---|---|
| y | -5 | -3 | 4 | 1 | -1 | -2 | 0 | 2 | 3 | -4 |

Part 1: Scatter plot

Part 2: Type of correlation (positive linear correlation, negative linear correlation, or no linear correlation)

Part 3: Complete the table and find the correlation coefficient r.

Answer

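| x | y | xy | x² | y² |
|---|---|---|---|---|
| 11 | -5 | | | |
| -6 | -3 | | | |
| 8 | 4 | | | |
| -3 | 1 | | | |
| -2 | -1 | | | |
| 1 | -2 | | | |
| 5 | 0 | | | |
| -5 | 2 | | | |
| 6 | 3 | | | |
| 7 | -4 | | | |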

Use the last row of the table to show the column totals.

n = 10

r =

3. Using the r calculated in problem 1, test the significance of the correlation coefficient using α = 0.01 and the claim ρ = 0. Use the 7-step hypothesis test shown at the end of this project.

Answer:

1. H0: ρ = 0

Ha: ρ ≠ 0

2. α =

3. Find t

4. t0 =

5. Rejection region:

6. Decision:

7. Interpretation:
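The test statistic and decision steps can be sketched as follows, assuming the usual t-test for a correlation coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²), with n − 2 degrees of freedom. The critical value 3.499 is an assumption read from a standard t-table (two-tailed, α = 0.01, 7 degrees of freedom), and the variable names are illustrative only.

```python
from math import sqrt

# Data from problem 1: ages (x) and systolic blood pressures (y).
ages = [38, 41, 45, 48, 51, 53, 57, 61, 65]
pressures = [116, 120, 123, 131, 142, 145, 148, 150, 152]


def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


n = len(ages)
r = pearson_r(ages, pressures)

# Step 3: test statistic with n - 2 degrees of freedom.
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Step 4: critical value taken from a t-table (two-tailed, alpha = 0.01, df = 7).
T_CRITICAL = 3.499

# Steps 5 and 6: reject H0 when |t| falls in the rejection region.
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.2f}, critical value = ±{T_CRITICAL}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Hard-coding the critical value keeps the sketch dependency-free; a statistics package could supply it instead if one is available.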

Linear Regression

4. The data below are the ages and systolic blood pressure (measured in millimeters of mercury) of 9 randomly selected adults.

| Age, x | 38 | 41 | 45 | 48 | 51 | 53 | 57 | 61 | 65 |
|---|---|---|---|---|---|---|---|---|---|
| Pressure, y | 116 | 120 | 123 | 131 | 142 | 145 | 148 | 150 | 152 |

a. Find the equation of the regression line for the given data. Round the slope and intercept to two decimal places.

b. Using the equation found in part a, predict the pressure when the age is 50. Round to the nearest whole number.
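Parts a and b can be checked with a short script. This sketch uses the deviation form of the least-squares slope, b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², and the intercept a = ȳ − b·x̄. The function names `regression_line` and `predict` are hypothetical.

```python
# Data from problem 4: ages (x) and systolic blood pressures (y).
ages = [38, 41, 45, 48, 51, 53, 57, 61, 65]
pressures = [116, 120, 123, 131, 142, 145, 148, 150, 152]


def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x


slope, intercept = regression_line(ages, pressures)
predicted_pressure = predict(slope, intercept, 50)

# Part a: coefficients rounded to two decimal places.
print(f"y = {slope:.2f}x + {intercept:.2f}")
# Part b: prediction at age 50, rounded to the nearest whole number.
print(f"predicted pressure at age 50: {round(predicted_pressure)}")
```

Age 50 lies inside the observed range of 38 to 65, so this prediction is an interpolation rather than an extrapolation.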

Instruction to copy a graph from Excel to a Word document

1. Create the graph in Excel.

2. Put your mouse in the graph area and left click. You will see little black boxes top, bottom, sides and the corners. (If desired, you can resize your graph by dragging these boxes with your mouse.)

3. With the boxes showing, choose EDIT COPY from the top menu.

4. Go to the Word document, place your mouse pointer where you want the graph, and choose EDIT PASTE from the top menu.

5. Save your document.

Guidelines -- Hypothesis Testing Steps:

1. State H0 and Ha.

2. Specify the level of significance alpha.

3. Find the test statistic using the given data.

4. Find the critical value(s) t0. Use the method specified in the problem statement.

5. Define the rejection region using critical value(s).

6. Make a decision to reject or fail to reject the null hypothesis.

7. Interpret the decision in the context of the original claim.