# Statistical Analysis

Statistics is the art and science of collecting, organizing, analyzing, and interpreting data. Please study any sample dataset from a larger population. You are expected to organize, analyze and interpret the data, and make reasonable inferences and comparisons about the population based on the sample you are studying. Attached is a dataset consisting of a random sample of incoming Cymbalta Community College Freshmen from Fall 2009. The data is current as of the end of Spring 2010.

Analyze the data and write a brief report summarizing your findings. If you can provide a brief analysis, I will write the report.

1) Provide an overall explanation of the data you are analyzing; e.g., where did the data come from (Ans: Cymbalta Community College), what does each row in the dataset represent (gender, city, age, etc.), and what does each variable mean (I need help with this)?

2) Describe the information each variable provides. Include frequencies and percentages of each variable. Apply measures of central tendency, range, and variability when suitable. Describe the shape of the distribution when appropriate. Use charts or graphs.

Below is a dataset that you may use or modify any way you want.

| Student ID | Gender | City | Age | FT/PT | GPA | Race/Ethnicity |
|---|---|---|---|---|---|---|
| 50 | M | Bridgeport | 24 | P | 3.71 | Black or African-American |
| 100 | F | Bridgeport | 23 | P | 2.68 | Black or African-American |
| 150 | F | Stratford | 19 | F | 1.33 | Black or African-American |
| 200 | F | Bridgeport | 33 | P | 3.41 | Black or African-American |
| 250 | M | Bridgeport | 24 | P | 3.23 | Black or African-American |
| 300 | F | Bridgeport | 19 | F | 0 | Black or African-American |
| 350 | F | Bridgeport | 62 | P | 2.37 | Black or African-American |
| 400 | M | Bridgeport | 19 | P | 0.76 | Black or African-American |
| 450 | F | Trumbull | 18 | F | 3.87 | Black or African-American |
| 500 | M | Bridgeport | 64 | P | 0 | Choose Not to Respond |
| 550 | M | Bridgeport | 18 | P | 0 | Other |
| 600 | F | Stratford | 26 | P | 2.34 | White |
| 650 | F | Bridgeport | 20 | F | 2.45 | White |
| 700 | M | Other | 24 | P | 2.62 | White |
| 750 | M | Milford | 45 | P | 3.35 | White |
| 800 | M | Milford | 19 | F | 3 | White |
| 850 | M | Stratford | 24 | P | 4 | White |
| 900 | F | Other | 23 | P | 4 | White |
| 950 | F | Bridgeport | 36 | P | 3.5 | White |
| 1000 | F | Shelton | 20 | P | 2.7 | White |
| 1050 | M | Fairfield | 26 | P | 2 | White |
| 1100 | M | Monroe | 18 | P | 2.08 | White |
| 1150 | M | Other | 24 | F | 3.64 | White |
| 1200 | F | Stratford | 22 | P | 2.26 | Hispanic or Latino |
| 1250 | F | Bridgeport | 24 | P | 0 | Hispanic or Latino |
| 1300 | F | Stratford | 24 | P | 2.97 | Hispanic or Latino |
| 1350 | F | Stratford | 18 | F | 1.86 | Hispanic or Latino |
| 1400 | F | Milford | 20 | P | 3 | Hispanic or Latino |
| 1450 | F | Bridgeport | 20 | F | 2.3 | Hispanic or Latino |
| 1500 | M | Bridgeport | 18 | P | 3.36 | Hispanic or Latino |
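The frequencies and percentages requested in item 2 can be tabulated directly from the rows above. The following is a minimal sketch using only the Python standard library. The field names (`gender`, `status`, `city`, `race`) and the helpers `parse` and `frequency_table` are illustrative names, not part of the original posting.

```python
from collections import Counter

# Rows copied from the dataset above, one student per line:
# ID, gender, city, age, FT/PT status code, GPA, race/ethnicity.
RAW = """\
50 M Bridgeport 24 P 3.71 Black or African-American
100 F Bridgeport 23 P 2.68 Black or African-American
150 F Stratford 19 F 1.33 Black or African-American
200 F Bridgeport 33 P 3.41 Black or African-American
250 M Bridgeport 24 P 3.23 Black or African-American
300 F Bridgeport 19 F 0 Black or African-American
350 F Bridgeport 62 P 2.37 Black or African-American
400 M Bridgeport 19 P 0.76 Black or African-American
450 F Trumbull 18 F 3.87 Black or African-American
500 M Bridgeport 64 P 0 Choose Not to Respond
550 M Bridgeport 18 P 0 Other
600 F Stratford 26 P 2.34 White
650 F Bridgeport 20 F 2.45 White
700 M Other 24 P 2.62 White
750 M Milford 45 P 3.35 White
800 M Milford 19 F 3 White
850 M Stratford 24 P 4 White
900 F Other 23 P 4 White
950 F Bridgeport 36 P 3.5 White
1000 F Shelton 20 P 2.7 White
1050 M Fairfield 26 P 2 White
1100 M Monroe 18 P 2.08 White
1150 M Other 24 F 3.64 White
1200 F Stratford 22 P 2.26 Hispanic or Latino
1250 F Bridgeport 24 P 0 Hispanic or Latino
1300 F Stratford 24 P 2.97 Hispanic or Latino
1350 F Stratford 18 F 1.86 Hispanic or Latino
1400 F Milford 20 P 3 Hispanic or Latino
1450 F Bridgeport 20 F 2.3 Hispanic or Latino
1500 M Bridgeport 18 P 3.36 Hispanic or Latino
"""


def parse(raw):
    """Split each line into seven fields; race/ethnicity keeps its spaces."""
    rows = []
    for line in raw.strip().splitlines():
        sid, gender, city, age, status, gpa, race = line.split(None, 6)
        rows.append({
            "id": int(sid), "gender": gender, "city": city,
            "age": int(age), "status": status,
            "gpa": float(gpa), "race": race,
        })
    return rows


def frequency_table(rows, column):
    """Map each category to (count, percentage of all rows)."""
    counts = Counter(row[column] for row in rows)
    total = len(rows)
    return {value: (n, round(100 * n / total, 1))
            for value, n in counts.most_common()}


rows = parse(RAW)
for column in ("gender", "status", "city", "race"):
    print(column)
    for value, (n, pct) in frequency_table(rows, column).items():
        print(f"  {value:<28}{n:>3}{pct:>7.1f}%")
```

In this sample, 17 of the 30 students (56.7%) are female and 22 (73.3%) carry the status code P. The same tables can feed a bar chart for each categorical variable.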

https://brainmass.com/statistics/descriptive-statistics/statistical-analysis-principle-statistics-445452

#### Solution Preview

A detailed solution is attached.

1) Provide an overall explanation of the data you are analyzing; e.g., where did the data come from (Ans: Cymbalta Community College), what does each row in the dataset represent (gender, city, age, etc.), and what does each variable mean (I need help with this)?

Solution:

The data comes from Cymbalta Community College. It represents a random sample of incoming freshmen from Fall 2009. Each row describes one student, and the data set gives that student's gender, age, city, GPA, ethnicity, student status (full time or part time), and student identification number. Each variable in the data set provides useful information about the college's incoming student body.

Age: This variable gives us an idea of the age group of students that enroll in the college. It can be inferred from this information that ...
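To back up statements about the age variable, the usual measures of central tendency and variability can be computed for the two quantitative variables, Age and GPA. This is a sketch using Python's `statistics` module. The `summarize` helper is a hypothetical name, and the skewness measure used here (Pearson's second coefficient, 3 × (mean − median) / standard deviation) is one simple choice among several, not necessarily the one used in the attached solution.

```python
import statistics

# Age and GPA columns copied from the dataset, in row order.
ages = [24, 23, 19, 33, 24, 19, 62, 19, 18, 64,
        18, 26, 20, 24, 45, 19, 24, 23, 36, 20,
        26, 18, 24, 22, 24, 24, 18, 20, 20, 18]
gpas = [3.71, 2.68, 1.33, 3.41, 3.23, 0, 2.37, 0.76, 3.87, 0,
        0, 2.34, 2.45, 2.62, 3.35, 3, 4, 4, 3.5, 2.7,
        2, 2.08, 3.64, 2.26, 0, 2.97, 1.86, 3, 2.3, 3.36]


def summarize(values):
    """Return central tendency, spread, and a rough skewness indicator."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "n": len(values),
        "mean": mean,
        "median": median,
        "mode": statistics.mode(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "stdev": stdev,
        # Pearson's second skewness coefficient:
        # positive suggests a right tail, negative a left tail.
        "skew": 3 * (mean - median) / stdev,
    }


for name, values in (("Age", ages), ("GPA", gpas)):
    print(name)
    for key, value in summarize(values).items():
        print(f"  {key:<7}{value:>8.2f}")
```

For the ages, the mean (25.8) exceeds the median (23) because a few students in their 40s and 60s pull the mean upward, which points to a right-skewed distribution. A histogram of each variable would show the same shape visually.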

#### Statistics Problems - Regression Analysis, Autocorrelation, Multicollinearity

1. Suppose an appliance manufacturer is doing a regression analysis, using quarterly time-series data, of the factors affecting its sales of appliances. A regression equation was estimated between appliance sales (in dollars) as the dependent variable and disposable personal income and new housing starts as the independent variables. The statistical tests of the model showed large t-values for both independent variables, along with a high r² value. However, analysis of the residuals indicated that substantial autocorrelation was present.

a. What are some of the possible causes of this autocorrelation?

b. How does this autocorrelation affect the conclusions concerning the significance of the individual explanatory variables and the overall explanatory power of the regression model?

c. Given that a person uses the model for forecasting future appliance sales, how does this autocorrelation affect the accuracy of these forecasts?

d. What techniques might be used to remove this autocorrelation from the model?
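The exercise does not say how the autocorrelation in the residuals was detected. One widely used diagnostic is the Durbin-Watson statistic, sketched below in plain Python. The residual values are made up for illustration only, and `durbin_watson` and `approx_rho` are hypothetical helper names.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals.

    Values near 2 suggest no first-order autocorrelation, values well
    below 2 suggest positive autocorrelation, and values well above 2
    suggest negative autocorrelation.
    """
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def approx_rho(dw):
    """Common large-sample approximation: DW is roughly 2 * (1 - rho)."""
    return 1 - dw / 2


# Hypothetical quarterly residuals that drift slowly from positive to
# negative and back, the typical pattern of positive autocorrelation.
residuals = [1.0, 1.2, 0.9, 0.7, 0.2, -0.3,
             -0.8, -1.1, -0.9, -0.4, 0.1, 0.6]

dw = durbin_watson(residuals)
print(f"Durbin-Watson: {dw:.3f}")
print(f"Approximate first-order autocorrelation: {approx_rho(dw):.3f}")
```

A statistic far below 2, as with these made-up residuals, would be consistent with the substantial positive autocorrelation described in the exercise. In practice the value is compared against tabulated critical bounds for the sample size and number of regressors.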

2. Suppose the appliance manufacturer discussed in Exercise 1 also developed another model, again using time-series data, where appliance sales was the dependent variable and disposable personal income and retail sales of durable goods were the independent variables. Although the r² statistic is high, the manufacturer also suspects that serious multicollinearity exists between the two independent variables.

a. In what ways does the presence of this multicollinearity affect the results of the regression analysis?

b. Under what conditions might the presence of multicollinearity cause problems in the use of this regression equation in designing a marketing plan for appliance sales?
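A suspicion of multicollinearity between two independent variables can be checked with the variance inflation factor (VIF). With exactly two regressors, the VIF of each equals 1 / (1 − r²), where r is the Pearson correlation between them. The sketch below assumes that two-regressor case. The income and durable-goods figures are invented for illustration, and `pearson_r` and `vif_two_regressors` are hypothetical helper names.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """VIF shared by both regressors in a two-regressor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical quarterly series (not from the exercise): disposable
# personal income and retail sales of durable goods trending together.
income = [100, 104, 109, 113, 118, 124, 129, 135]
durable_sales = [40, 42, 44, 45, 48, 50, 52, 55]

print(f"Correlation: {pearson_r(income, durable_sales):.4f}")
print(f"VIF:         {vif_two_regressors(income, durable_sales):.1f}")
```

A VIF of 1 means the regressors are uncorrelated. A commonly quoted rule of thumb treats values above roughly 10 as a sign that the coefficient estimates may be too unstable to interpret individually, although the overall fit and forecasts can still be usable as long as the relationship between the regressors persists.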
