# Graphs, Regression Analysis, Hypothesis Tests

See the attached data.

Below is a breakdown of what exactly to use:

-One set of bivariate data - we will compare the original cost of the vehicle with its current value.

-Two or more sets from different populations - we selected populations of males and females and plan to use these two populations to determine who spends more on their car.

-A paired set of data for a hypothesis test - we plan to use the current value of the vehicle compared with the amount spent on upkeep to determine whether the car's value is affected by upkeep costs.

PROJECT:

Prepare an analysis and presentation on the topic, "Men Spend More Money Purchasing Vehicles Than Women". The data will be fictitious.

2. Gather or create data. If you "invent" the data, please footnote that fact and in your paper describe a method that would have been used to gather the data. If you find data, then document your source and indicate the sampling method and whether your sample was random or not. Either way, your data sets will need to be related to your topic and included in your appendix.

You will need a minimum of three data sets with at least 30 values in each, but you may use more as needed to complete the requirements.

You will need several different sets of data, all numeric.

-One set of bivariate data (two pieces of data taken from each unit in the sample) that is appropriate for a linear regression. There are several options for this data. This could be a trend line with one data set on a time interval axis. This could be two different but related variables, like a person's height and weight. This could be "before" and "after" data.

-Two or more sets of data that are from different populations. This data will be used in the hypothesis test of independent means to determine if the means are equal.

-A paired set of data that can be used in a "paired" hypothesis test.

-All data sets must be included in an appendix of the document.

3. For the bivariate data:

Determine which is the independent and which is the dependent variable.

Create a scatter diagram.

Determine the coefficient of correlation and interpret it.

Determine the equation of the regression line and explain.

Interpret each value in the regression equation and explain how the equation and line are related.

Graph the regression line on a scatter diagram.

Make a prediction for some value within the x-range and explain the meaning and the reliability of the prediction.

From your data set determine the point with the largest residual and explain its interpretation.
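The steps in item 3 can be sketched in Python. The cost and value figures below are invented placeholders, not the project data, and the helper names `fit_line` and `largest_residual` are hypothetical; this is a minimal sketch rather than a required method.

```python
from math import sqrt

# Hypothetical (original cost, current value) pairs in thousands of dollars.
# Invented for illustration only; substitute the real data set.
cost = [10.0, 20.0, 30.0, 40.0, 50.0]    # independent variable x
value = [6.0, 11.0, 19.0, 22.0, 32.0]    # dependent variable y


def fit_line(x, y):
    """Return (intercept a, slope b, correlation r) for the least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    b = sp / ss_x               # slope
    a = mean_y - b * mean_x     # intercept
    r = sp / sqrt(ss_x * ss_y)  # Pearson correlation
    return a, b, r


def largest_residual(x, y, a, b):
    """Return (index, residual) of the point farthest from the fitted line."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    idx = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
    return idx, residuals[idx]


a, b, r = fit_line(cost, value)
predicted = a + b * 35.0  # prediction at x = 35, inside the observed x-range
idx, resid = largest_residual(cost, value, a, b)
print(f"y = {a:.2f} + {b:.2f}x, r = {r:.3f}")
print(f"predicted value at x = 35: {predicted:.2f}")
print(f"largest residual: {resid:.2f} at x = {cost[idx]}")
```

The slope is the estimated change in y per one-unit change in x, and the intercept is the fitted y at x = 0. Predictions are more trustworthy inside the observed x-range than outside it.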

4. For the two (or more) sets of data from different populations:

Perform a hypothesis test for the difference of two independent means or an ANOVA.

Remember to show all steps including the statements of the hypotheses, conditions, computational results, decisions, and interpretation. These should be in paragraph form, not numbered.

Remember to show all steps.
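For the independent-means test, one common choice is the pooled-variance t statistic. The sketch below assumes equal population variances and uses invented purchase prices (five per group for brevity, whereas the assignment asks for at least 30); the function name `pooled_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) for two independent samples,
    assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_error, df


# Hypothetical purchase prices in thousands of dollars (illustration only).
men = [30.0, 34.0, 38.0, 42.0, 46.0]
women = [26.0, 30.0, 34.0, 38.0, 42.0]

t_stat, df = pooled_t(men, women)
print(f"t = {t_stat:.3f} with df = {df}")
```

The computed t is then compared with the critical value for the chosen alpha and df. If the equal-variance condition is doubtful, an unpooled (Welch) test is the usual alternative.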

5. For the paired data:

Perform a hypothesis test for the difference of two means.

Remember to show all steps and to explain and give an interpretation of the result.
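A paired test works on the differences within each pair rather than on the two columns separately. The following sketch uses invented paired measurements and a hypothetical function name `paired_t`; it illustrates the computation and is not the assignment's data.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    std_error = sqrt(variance(diffs) / n)  # standard error of the mean difference
    return mean(diffs) / std_error, n - 1


# Hypothetical paired values for the same four vehicles (illustration only).
first = [21.0, 30.0, 18.0, 26.0]
second = [20.0, 25.0, 13.0, 21.0]

t_stat, df = paired_t(first, second)
print(f"t = {t_stat:.3f} with df = {df}")
```

The null hypothesis here is that the mean difference is zero; the degrees of freedom are the number of pairs minus one.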

© BrainMass Inc. brainmass.com October 17, 2018, 3:11 am https://brainmass.com/statistics/regression-analysis/graphs-regression-analysis-hypothesis-tests-412448

#### Solution Preview

The text below is also attached as a Word document and as a pdf.

----------------------------------------------------------

Below is a breakdown of what exactly to use:   

-One set of bivariate data - we will be using the original cost of the vehicle compared to the current value.

-Two or more sets from different populations - we selected populations of males and females and plan to use these two populations to determine who spends more on their car.

-A paired set of data for a hypothesis test - we plan to use the current value of the vehicle compared with the amount spent on upkeep to determine whether the car's value is affected by upkeep costs.

PROJECT:   Prepare an analysis and presentation on the topic, "Men Spend More Money Purchasing Vehicles Than Women". The data will be fictitious.   

2. Gather or create data. If you "invent" the data, please footnote that fact and in your paper describe a method that would have been used to gather the data. If you find data, then document your source and indicate the sampling method and whether your sample was random or not. Either way, your data sets will need to be related to your topic and included in your appendix.   You will need a minimum of three data sets with at least 30 values in each, but you may use more as needed to complete the requirements. You will need several different sets of data, all numeric.   

I'm using the data that you provided in the Excel file. Most of the data look good and appropriate for these tests. However, I think that the data you've chosen for the third hypothesis test would be more appropriate for a regression analysis (bivariate data). The way you have it set up now, you'd be comparing the average amount of money spent on upkeep to the average value of the car. Instead, I think that we should compare the original cost to the current value. If you disagree, you can re-do the analyses with your original data, following the steps I show below, or ask me to re-do it.

* One set of bivariate data (two pieces of data taken from each unit in the sample) that is appropriate for a linear regression. There are several options for this data. This could be a trend line with one data set on a time interval axis. This could be two different but related variables, like a person's height and weight. This could be "before" and "after" data.   

* Two or more sets of data that are from different populations. This data will be used in the hypothesis test of independent means to determine if the means are equal. ...

Pearson correlation, sketch a graph, regression equation

See the attached file for proper format.

2. What information is provided by the numerical value of the Pearson correlation?

6. For the following scores:

| X | Y |
|---|---|
| 3 | 12 |
| 6 | 7 |
| 3 | 9 |
| 5 | 7 |
| 3 | 10 |

a. Compute the Pearson correlation. SP = ? SSx = ? SSy = ? r = ?

b. With a small sample, a single point can have a large effect on the magnitude of the correlation.

Change the score X = 5 to X = 0 and compute the Pearson Correlation again.

SP = ? SSx = ? SSy = ? r = ?
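The quantities asked for in problem 6 can be checked with a short script. This is a sketch using the definitional formulas (sum of products of deviations and sums of squared deviations); the function name `pearson` is an illustrative choice.

```python
from math import sqrt


def pearson(xs, ys):
    """Return (SP, SSx, SSy, r) using deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp, ss_x, ss_y, sp / sqrt(ss_x * ss_y)


y = [12, 7, 9, 7, 10]
x_original = [3, 6, 3, 5, 3]   # scores as given in part a
x_changed = [3, 6, 3, 0, 3]    # part b: the score X = 5 replaced by X = 0

sp_a, ssx_a, ssy_a, r_a = pearson(x_original, y)
sp_b, ssx_b, ssy_b, r_b = pearson(x_changed, y)
print(f"a: SP = {sp_a}, SSx = {ssx_a}, SSy = {ssy_a}, r = {r_a:.3f}")
print(f"b: SP = {sp_b}, SSx = {ssx_b}, SSy = {ssy_b}, r = {r_b:.3f}")
```

Changing a single score moves r from about -0.83 to 0, which is the sensitivity to individual points that part b describes.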

18. Sketch a graph showing the line for the equation Y = -2X + 4. On the same graph, show the line for Y = X - 4.

20. A set of n = 20 pairs of scores (X and Y values) has SSx = 25, SSy = 16, and SP = 12.5. The mean for the X values is M = 6 and the mean for the Y values is M = 4.

a. Calculate the Pearson correlation for the scores. r= ?

b. Find the regression equation for predicting Y from the X values.

b = ? a = ? Y = ?
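Problem 20 needs only the summary statistics, so the arithmetic can be written out directly. The sketch below uses the standard relations r = SP / sqrt(SSx · SSy), b = SP / SSx, and a = My - b · Mx; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics given in problem 20.
ss_x, ss_y, sp = 25.0, 16.0, 12.5
mean_x, mean_y = 6.0, 4.0

r = sp / sqrt(ss_x * ss_y)   # Pearson correlation
b = sp / ss_x                # slope of the regression line
a = mean_y - b * mean_x      # Y-intercept

print(f"r = {r}, b = {b}, a = {a}")
print(f"regression equation: Y = {b}X + {a}")
```

With these inputs the script gives r = 0.625 and the line Y = 0.5X + 1.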

22. For the following scores:

| X | Y |
|---|---|
| 1 | 2 |
| 4 | 7 |
| 3 | 5 |
| 2 | 1 |
| 5 | 14 |
| 3 | 7 |

a. Find the regression equation for predicting Y from X

SP = ? SSx = ? a = ? b = ? Y = ?

b. Use the regression equation to find a predicted Y for each X.

X = 1, Y = ?

X = 4, Y = ?

X = 3, Y = ?

X = 2, Y = ?

X = 5, Y = ?

X = 3, Y = ?

c. Find the difference between the actual Y value and the predicted Y value for each individual, square the differences, and add the squared values to obtain SSresidual.

SSresidual = ?

d. Calculate the Pearson correlation for these data. Use r² and SSy to compute SSresidual.

SP = ? SSx = ? SSy = ? r = ? r² = ? SSresidual = ?
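Parts c and d of problem 22 reach SSresidual by two routes, and a short script shows that they agree. This sketch computes it once from the squared prediction errors and once from (1 - r²) · SSy; the variable names are illustrative.

```python
from math import sqrt

xs = [1, 4, 3, 2, 5, 3]
ys = [2, 7, 5, 1, 14, 7]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)

b = sp / ss_x              # slope
a = mean_y - b * mean_x    # intercept

# Part c: sum the squared differences between actual and predicted Y.
predicted = [a + b * x for x in xs]
ss_res_direct = sum((y - p) ** 2 for y, p in zip(ys, predicted))

# Part d: use the correlation, SSresidual = (1 - r^2) * SSy.
r = sp / sqrt(ss_x * ss_y)
ss_res_from_r = (1 - r ** 2) * ss_y

print(f"Y = {b}X + {a}")
print(f"SSresidual (direct) = {ss_res_direct:.3f}")
print(f"SSresidual (from r) = {ss_res_from_r:.3f}")
```

Both routes give SSresidual = 18 for these scores, with the regression line Y = 3X - 3.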

2. The student population at the state college consists of 55% females and 45% males.

a. The college theater department recently staged a production of a modern musical. A researcher recorded the gender of each student entering the theater and found a total of 385 females and 215 males. Is the gender distribution for theater goers significantly different from the distribution for the general college? Test at the 0.05 level of significance.

df = ? fe (males) = ? χ² (males) = ?

fo = ? fe (females) = ? χ² (females) = ?

χ² critical = ? χ² total = ?

Conclusion: ???

b. The same researcher also recorded the gender of each student watching a men's basketball game in the college gym and found a total of 83 females and 97 males. Is the gender distribution for basketball fans significantly different from the distribution for the general college? Test at the 0.05 level of significance.

df = ? fe (males) = ? χ² (males) = ?

fo = ? fe (females) = ? χ² (females) = ?

χ² critical = ? χ² total = ?

Conclusion: ???
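The tests in problem 2 are chi-square goodness-of-fit tests with two categories. A minimal sketch follows; the function name `chi_square_gof` is hypothetical, and the critical value 3.84 is the standard table value for df = 1 at the 0.05 level.

```python
def chi_square_gof(observed, proportions):
    """Return the chi-square statistic for observed counts against the
    population proportions stated in the null hypothesis."""
    n = sum(observed)
    expected = [p * n for p in proportions]   # fe = p * n
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


CRITICAL_DF1 = 3.84                 # chi-square critical value, df = 1, alpha = 0.05
college = [0.55, 0.45]              # females, males in the student population

chi_theater = chi_square_gof([385, 215], college)    # part a
chi_basketball = chi_square_gof([83, 97], college)   # part b

for label, chi in [("theater", chi_theater), ("basketball", chi_basketball)]:
    decision = "reject H0" if chi > CRITICAL_DF1 else "fail to reject H0"
    print(f"{label}: chi-square = {chi:.2f}, {decision}")
```

With two categories, df = 2 - 1 = 1. Both statistics (about 20.37 and 5.75) exceed 3.84.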

4. Data from the Motor Vehicle Department indicate that 80% of all licensed drivers are older than age 25.

a. In a sample of n = 60 people who recently received speeding tickets, 38 were older than 25 years and the other 22 were age 25 or younger. Is the age distribution for this sample significantly different from the distribution for the population of licensed drivers? Use α = 0.05.

df = ? fe (older) = ? χ² (older) = ?

fo = ? fe (younger) = ? χ² (younger) = ?

χ² critical = ? χ² total = ?

Conclusion: ???

b. In a sample of n = 60 people who recently received parking tickets, 43 were older than 25 years and the other 17 were age 25 or younger. Is the age distribution for this sample significantly different from the distribution for the population of licensed drivers? Use α = 0.05.

df = ? fe (older) = ? χ² (older) = ?

fo = ? fe (younger) = ? χ² (younger) = ?

χ² critical = ? χ² total = ?

Conclusion: ???

6. Research has demonstrated that people tend to be attracted to others who are similar to themselves. One study demonstrated that individuals are disproportionately more likely to marry those with surnames that begin with the same last letter as their own (Jones, Pelham, Carvallo, & Mirenberg, 2004). The researchers began by looking at marriage records and recording the surname for each groom and the maiden name of each bride. From these records it is possible to calculate the probability of randomly matching a bride and a groom whose last names begin with the same letter. Suppose that this probability is only 6.5%. Next, a sample of n = 200 married couples is selected and the number who shared the same last initial at the time they were married is counted. The resulting observed frequencies are as follows:

| Same Initial | Different Initial | Total |
|---|---|---|
| 19 | 181 | 200 |

Do these data indicate that the number of couples with the same last initial is significantly different from what would be expected if couples were matched randomly? Test with α = 0.05.

df = ? fo = ? χ² critical = ?

fe = ? χ² calculated = ?

Conclusion : ???

8. A professor in the psychology department would like to determine whether there has been a significant change in grading practices over the years. It is known that the overall grade distribution for the department in 1985 has 14% As, 26% Bs, 31% Cs, 19% Ds, and 10% Fs. A sample of n = 200 psychology students from last semester produced the following grade distribution:

| A | B | C | D | F |
|---|---|---|---|---|
| 32 | 61 | 64 | 31 | 12 |

Do the data indicate a significant change in the grade distribution? Test at the 0.05 level of significance.

df = ? fe A = ? χ² calculated A = ? χ² total = ?

fo = ? fe B = ? χ² calculated B = ? χ² critical = ?

fe C = ? χ² calculated C = ?

fe D = ? χ² calculated D = ?

fe F = ? χ² calculated F = ?

Conclusion : ???
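For problem 8 the answer template asks for each grade's contribution separately, so a per-category breakdown is useful. This sketch uses the 1985 proportions and observed counts from the problem; the critical value 9.49 is the standard table value for df = 4 at the 0.05 level, and the variable names are illustrative.

```python
# 1985 proportions and last semester's observed counts, from the problem statement.
expected_proportions = {"A": 0.14, "B": 0.26, "C": 0.31, "D": 0.19, "F": 0.10}
observed = {"A": 32, "B": 61, "C": 64, "D": 31, "F": 12}

n = sum(observed.values())
CRITICAL_DF4 = 9.49   # chi-square critical value, df = 5 - 1 = 4, alpha = 0.05

contributions = {}
for grade, p in expected_proportions.items():
    fe = p * n                                   # expected frequency
    contributions[grade] = (observed[grade] - fe) ** 2 / fe
    print(f"{grade}: fe = {fe:.0f}, contribution = {contributions[grade]:.3f}")

chi_total = sum(contributions.values())
decision = "reject H0" if chi_total > CRITICAL_DF4 else "fail to reject H0"
print(f"chi-square total = {chi_total:.3f}, {decision}")
```

The total is about 6.68, which is below 9.49, and most of it comes from the F category.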

10. The color red is often associated with anger and male dominance. Based on this observation, Hill and Barton (2005) monitored the outcome of four combat sports (boxing, tae kwon do, Greco-Roman wrestling, and freestyle wrestling) during the 2004 Olympic games and found that participants wearing red outfits won significantly more than those wearing blue.

a. In 50 wrestling matches involving red versus blue, suppose that the red outfit won 31 times and lost 19 times. Is this result sufficient to conclude that red wins significantly more than would be expected by chance? Test at the 0.05 level of significance.

df = ? fo = ? χ² critical = ?

fe Red wins = ? χ² calculated = ?

Conclusion : ???

b. In 100 matches, suppose red won 62 times and lost 38. Is this sufficient to conclude that red wins significantly more than would be expected by chance? Again, use the α = 0.05.

df = ? fo = ? χ² critical = ?

fe Red wins = ? χ² calculated = ?

Conclusion : ???

c. Note that the winning percentage for red uniforms in part a is identical to the percentage in part b (31 out of 50 is 62%, and 62 out of 100 is also 62%). Although the two samples have identical winning percentages, one is significant and the other is not. Explain why the two samples lead to different conclusions. ???
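One way to see the point of part c is to compute both statistics side by side. The sketch below assumes the null hypothesis of a 50/50 split between red and blue; the function name `chi_square_even_split` is hypothetical, and 3.84 is the standard critical value for df = 1 at the 0.05 level.

```python
def chi_square_even_split(wins, losses):
    """Chi-square statistic against a 50/50 null hypothesis."""
    n = wins + losses
    fe = n / 2   # expected frequency for each outcome under chance
    return (wins - fe) ** 2 / fe + (losses - fe) ** 2 / fe


CRITICAL_DF1 = 3.84   # chi-square critical value, df = 1, alpha = 0.05

chi_a = chi_square_even_split(31, 19)   # part a: 50 matches
chi_b = chi_square_even_split(62, 38)   # part b: 100 matches

print(f"n = 50:  chi-square = {chi_a:.2f}, significant = {chi_a > CRITICAL_DF1}")
print(f"n = 100: chi-square = {chi_b:.2f}, significant = {chi_b > CRITICAL_DF1}")
```

With the proportions held fixed, the statistic grows in direct proportion to n: doubling the sample doubles it from 2.88 to 5.76, which crosses the 3.84 threshold. A larger sample gives the same 62% more evidential weight.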
