I need help in conducting correlation and regression analyses using the provided SampleDataSet.xlsx.
Correlation: Compute a correlation matrix that includes all continuous variables. Identify all individual correlations that are significant at the 95 percent level.
Regression: Build a multiple regression model to explain the variability in the median school year. Describe the goodness of fit of your model and summarize your findings. Select four to seven similar independent variables from the remaining forty-nine measures and justify your selection.
For the correlation test, pick 6 or 7 variables and compute the correlation coefficients using the Data Analysis tool in Excel. Describe how everything is calculated and document your results. Clean up the data first: no blanks, and no zeros where there should be data, such as in the age data. If you want to include categorical variables such as married vs. single, you will have to assign numeric values to them, such as married = 1 and single = 0. The t-test is described below. Make sure median schooling is one of the variables, and run the correlation test between median schooling and each of the other 5 or 6 variables.
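The cleaning, dummy-coding, and correlation steps above can be sketched outside Excel as well. The following is only a minimal illustration in plain Python: the column names (median_school, median_age, median_income, marital) and every value in rows are made up and do not come from SampleDataSet.xlsx, and pearson_r is a hand-written helper rather than Excel's own routine.

```python
import math

# Hypothetical rows standing in for SampleDataSet.xlsx. All column names
# and values are made up for illustration only.
rows = [
    {"median_school": 12.0, "median_age": 31.5, "median_income": 38.0, "marital": "married"},
    {"median_school": 12.6, "median_age": 0, "median_income": 41.0, "marital": "single"},
    {"median_school": 11.2, "median_age": 28.0, "median_income": 30.5, "marital": "single"},
    {"median_school": None, "median_age": 40.2, "median_income": 52.0, "marital": "married"},
    {"median_school": 13.4, "median_age": 36.8, "median_income": 55.0, "marital": "married"},
    {"median_school": 12.3, "median_age": 33.1, "median_income": 36.0, "marital": "single"},
]

NUMERIC_COLS = ["median_school", "median_age", "median_income"]


def clean(rows, numeric_cols):
    """Drop rows where any numeric column is blank (None or "") or zero."""
    return [
        row for row in rows
        if all(row.get(col) not in (None, "", 0) for col in numeric_cols)
    ]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


cleaned = clean(rows, NUMERIC_COLS)

# Build one list per variable. The dummy is coded after cleaning so that
# single = 0 is not mistaken for a missing value.
columns = {col: [row[col] for row in cleaned] for col in NUMERIC_COLS}
columns["married"] = [1 if row["marital"] == "married" else 0 for row in cleaned]

# Correlation matrix as a nested dict: corr[a][b] is r between a and b.
names = list(columns)
corr = {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

for name in names:
    print(f"{name:>14}: r with median_school = {corr['median_school'][name]: .3f}")
```

With the real data, each row of the spreadsheet would take the place of one dictionary, and the same matrix should match what Excel's correlation output reports for the cleaned rows.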
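t test for no correlation
t = r * sqrt(n - 2) / sqrt(1 - r^2)
r = sample correlation coefficient
n = sample size
H0: rho = 0
H1: rho ≠ 0
Set alpha = .05 (two-tailed critical region)
2 1/2% in each tail
Compare t with the critical value of the t distribution with n - 2 degrees of freedom.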
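As a rough check on the hand calculation, the t statistic above can be computed as follows. This is only a sketch: the function names are made up, the values r = 0.5 and n = 27 are illustrative rather than taken from the data set, and the critical value is assumed to be looked up separately (for example from a t table or Excel's T.INV.2T function).

```python
import math


def correlation_t_stat(r, n):
    """Return (t, degrees of freedom) for the test of H0: rho = 0."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t, n - 2


def is_significant(r, n, t_critical):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    t, _ = correlation_t_stat(r, n)
    return abs(t) > t_critical


# Illustrative values only. 2.060 is the two-tailed 5% critical value
# for 25 degrees of freedom (2 1/2% in each tail).
t, df = correlation_t_stat(0.5, 27)
print(round(t, 3), df, is_significant(0.5, 27, 2.060))
```

Because the test is two-tailed, the comparison uses the absolute value of t, so a strong negative correlation is flagged in the same way as a strong positive one.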
In the regression analysis, again pick out the most significant independent variables to test their effect on median schooling. Make sure you explain how regression is performed and explain all test statistics from the regression, including which variables are the most important in explaining the median education level.
I really don't understand statistics, as it is not my strong suit. Any help on this will be appreciated, even if it's just a start in the right direction.
This solution comprises a detailed explanation of correlation and regression analysis. It gives students a clear perspective on building a regression model with more than one predictor variable.