# Correlation, Z-score, hypothesis, probability

1. True or false.
a. The decision to use z-scores or Student's t-scores depends first on the size of the sample.
b. When there is correlation between two data sets, there is always an underlying cause.
c. The probability of an event plus the probability of the complementary event always equals 1.
d. The population parameter is always found within the margin of error of the sample statistic.
e. The median of a set of integers will always be either an integer or an integer + 1/2.
f. In modern usage, the null hypothesis is always described mathematically as an equality.
g. You can always find the mode of a set of data, whether categorical or numerical.
h. The confidence level used to determine the margin of error in polling data is typically 99%.

2. The following pairs of numbers (x, y) are the number of dead American soldiers in Iraq and the number of wounded for the months from January 2005 to December 2006.
107 498
58 415
35 371
52 596
80 575
78 511
54 477
85 541
49 545
96 605
84 400
68 412
62 287
55 342
31 498
76 432
69 442
61 458
43 523
65 586
72 790
106 767
70 543
112 690

a) Treat the numbers above as data points (x,y) and find the correlation coefficient r_x,y.
b) Find the highest threshold that r_x,y meets (95% or 99% or none) using Table A-5 where n = 24.
c) Find the equation of the regression line, y-hat = b_1x + b_0. If working by hand rather than with a computer or calculator package, round intermediate steps to six decimal places. In the final answer, round the numbers to three decimal places.
d) Find the data pair that is closest to the line and the data pair that is farthest away, where the measure of distance is |y - y-hat|.
e) Find the data pair that is closest to the line and the data pair that is farthest away, where the measure of distance is |1 - y/y-hat|.
f) Find the five-number summary for the first column and the five-number summary for the second column.
g) Rank each data set from highest (1st) to lowest (24th) and find the rank correlation. Use Table A-6 where n = 24 to determine whether this correlation passes or fails each of the thresholds (alpha = 10%, 5%, 2%, 1%).
h) Find the average and the standard deviation of set #1, treated as a sample. (Round to the nearest tenth.)
i) Find the average and the standard deviation of set #2, treated as a sample. (Round to the nearest tenth.)
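
One way to check hand calculations for parts a), c), and d) is a short script. The sketch below is only an illustration: the function names (`pearson_r`, `least_squares_line`, `abs_residuals`) and the five toy points are assumptions, not part of the problem. Substituting the 24 (x, y) pairs above for the toy lists would apply the same steps to the actual exercise.

```python
import math

def _sums(xs, ys):
    """Return (mean_x, mean_y, Sxx, Syy, Sxy) for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def pearson_r(xs, ys):
    """Correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope b1 and intercept b0 of y-hat = b1*x + b0."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b1, b0

def abs_residuals(xs, ys):
    """Distance |y - y-hat| of each point from the fitted line."""
    b1, b0 = least_squares_line(xs, ys)
    return [abs(y - (b1 * x + b0)) for x, y in zip(xs, ys)]

# Toy data (hypothetical), not the data from the problem.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]

r = pearson_r(toy_x, toy_y)
b1, b0 = least_squares_line(toy_x, toy_y)
resid = abs_residuals(toy_x, toy_y)
closest = resid.index(min(resid))    # index of the pair nearest the line
farthest = resid.index(max(resid))   # index of the pair farthest from the line

print(f"r = {r:.3f}, y-hat = {b1:.3f}x + {b0:.3f}")
print("closest pair:", (toy_x[closest], toy_y[closest]))
print("farthest pair:", (toy_x[farthest], toy_y[farthest]))
```

For part e), the same loop works with `abs(1 - y / y_hat)` in place of `abs(y - y_hat)`.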

3. We have a standard deck of 52 cards. You draw two cards from the deck. Find these probabilities rounded to four places after the decimal.
(There are four aces in a deck. There are thirteen hearts in a deck.)
p(no aces in the two cards) =
p(exactly one ace in the two cards) =
p(two aces in the two cards) =
p(no hearts in the two cards) =
p(exactly one heart in the two cards) =
p(two hearts in the two cards) =
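
Drawing two cards without replacement is a counting problem: the number of ways to pick k cards of the kind in question, times the number of ways to pick the remaining cards from the rest of the deck, divided by the number of two-card hands. The sketch below expresses that with `math.comb`; the function name `prob_k_of_kind` is an assumption introduced for illustration.

```python
from math import comb

def prob_k_of_kind(k, kind_count, draws=2, deck=52):
    """Probability of exactly k cards of one kind in `draws` cards
    drawn without replacement (hypergeometric distribution)."""
    ways = comb(kind_count, k) * comb(deck - kind_count, draws - k)
    return ways / comb(deck, draws)

for label, count in (("aces", 4), ("hearts", 13)):
    probs = [prob_k_of_kind(k, count) for k in range(3)]
    # The three cases (0, 1, or 2 of the kind) cover every outcome,
    # so their probabilities should add up to 1.
    print(label, [round(p, 4) for p in probs], "sum =", round(sum(probs), 4))
```

Checking that the three probabilities in each group sum to 1 is a quick way to catch arithmetic slips.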

4. Here is a set of numbers given on a stem and leaf plot.
5|1
4|0223467778
3|14566899
2|36788
1|03459
0|06799
a. Find the average, median, mode and standard deviation of the set as a population.
b. Find both the frequency and the relative frequency of z-scores in this set that meet the following criteria.
b1) z > 2
b2) 1 < z < 2
b3) 0 < z < 1
b4) -1 < z < 0
b5) -2 < z < -1
b6) z < -2
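
A sketch of how the plot could be turned into numbers and z-scores follows. It assumes each stem is a tens digit and each leaf a single ones digit (so `4|0223467778` begins 40, 42, 42, ...); the helper name `parse_stem_leaf` and the band table are illustrative, not part of the problem.

```python
import statistics

plot_lines = [
    "5|1",
    "4|0223467778",
    "3|14566899",
    "2|36788",
    "1|03459",
    "0|06799",
]

def parse_stem_leaf(lines):
    """Expand 'stem|leaves' rows into a sorted list of integers,
    assuming stem = tens digit(s) and each leaf = one ones digit."""
    values = []
    for line in lines:
        stem, leaves = line.split("|")
        values.extend(10 * int(stem) + int(leaf) for leaf in leaves)
    return sorted(values)

data = parse_stem_leaf(plot_lines)

# Population statistics: pstdev divides by N, not N - 1.
mu = statistics.mean(data)
sigma = statistics.pstdev(data)
z_scores = [(v - mu) / sigma for v in data]

bands = [
    ("z > 2", 2, float("inf")),
    ("1 < z < 2", 1, 2),
    ("0 < z < 1", 0, 1),
    ("-1 < z < 0", -1, 0),
    ("-2 < z < -1", -2, -1),
    ("z < -2", float("-inf"), -2),
]
freq = {name: sum(lo < z < hi for z in z_scores) for name, lo, hi in bands}
rel_freq = {name: count / len(data) for name, count in freq.items()}

print("n =", len(data), "median =", statistics.median(data),
      "mode =", statistics.mode(data))
for name in freq:
    print(f"{name}: frequency {freq[name]}, relative frequency {rel_freq[name]:.3f}")
```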

5. In a recent poll of 920 registered voters, 450 respondents said they would vote for Candidate Jones, 436 were going to vote for Candidate Chan and the rest were undecided. Give all answers with three significant digits, either .xxx or xx.x%
a. What are the percentages for each candidate and the undecided vote?
b. Using 450/920 as p, what is the standard deviation for this sample?
c. What is the margin of error for this sample, given a 95% level of confidence?
d. How large a sample size n would we need to get a margin of error of +/-2.2%? (use 450/920 as p.)
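
The calculations in parts b through d follow the usual formulas for a sample proportion. The sketch below assumes that "standard deviation for this sample" means the standard error sqrt(p(1 - p)/n) and that the 95% level uses the conventional critical value z = 1.96; both are assumptions about the intended method, and the function names are illustrative.

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

def margin_of_error(p, n, z=1.96):
    """Margin of error at the confidence level implied by z."""
    return z * proportion_se(p, n)

def required_n(p, margin, z=1.96):
    """Smallest n whose margin of error is at most `margin`."""
    return math.ceil(p * (1 - p) * (z / margin) ** 2)

n = 920
shares = {
    "Jones": 450 / n,
    "Chan": 436 / n,
    "undecided": (n - 450 - 436) / n,
}
p = shares["Jones"]

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"standard error: {proportion_se(p, n):.4f}")
print(f"margin of error: {margin_of_error(p, n):.1%}")
print("n needed for +/-2.2%:", required_n(p, 0.022))
```

Rounding n up rather than to the nearest integer keeps the margin of error at or below the target.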

6. The following pairs of numbers (x, y) are the closing prices of the 30 stocks that comprise the Dow Jones Industrial Average; the first number (x) is closing price on July 13, 2006 and the second number (y) is the closing price on December 13, 2006.
26.56 35.55
30.99 30.45
76.71 84.58
51.25 59.97
58.36 71.15
79.59 89.60
69.70 61.49
47.87 52.32
43.10 48.84
28.70 34.45
39.55 47.14
64.07 77.36
32.87 35.50
28.32 29.45
31.22 39.67
34.07 39.11
37.99 41.86
17.72 20.70
74.24 94.77
41.39 47.60
60.27 65.47
33.17 43.59
36.94 43.34
22.26 29.55
22.87 25.39
56.54 63.40
71.63 79.25
61.43 64.21
31.74 35.87
44.16 45.90
Treat these numbers as matched pairs and find the differences d = x - y. Find the average of the differences (d-bar), the standard deviation of the differences when the set is considered a sample (s_d), and the 95% confidence interval for mu_d.
Given the interval, are we 95% confident the difference we see is significant?
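
A matched-pairs interval has the form d-bar +/- t * s_d / sqrt(n), where t is the critical value for n - 1 degrees of freedom. The sketch below uses four hypothetical pairs rather than the 30 stocks, so its critical value (3.182 for 3 degrees of freedom) is not the one needed here; for 30 pairs a t table gives about 2.045 for 29 degrees of freedom. The function name `paired_ci` is illustrative.

```python
import math
import statistics

def paired_ci(xs, ys, t_crit):
    """Return (d_bar, s_d, (low, high)) for the differences d = x - y.
    t_crit must be looked up for len(xs) - 1 degrees of freedom."""
    d = [x - y for x, y in zip(xs, ys)]
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s_d / math.sqrt(len(d))
    return d_bar, s_d, (d_bar - margin, d_bar + margin)

# Hypothetical toy pairs, not the stock prices from the problem.
toy_x = [10, 12, 9, 11]
toy_y = [12, 15, 10, 14]

d_bar, s_d, (low, high) = paired_ci(toy_x, toy_y, t_crit=3.182)

# If the interval excludes 0, the mean difference is significant
# at the corresponding level.
significant = not (low <= 0 <= high)
print(f"d-bar = {d_bar:.3f}, s_d = {s_d:.3f}, CI = ({low:.3f}, {high:.3f})")
print("interval excludes 0:", significant)
```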

7. Find the z-scores that most closely correspond to the following percentages, rounding to the nearest two-hundredth.
(Nearest two-hundredth means either two places after the decimal, or three places after the decimal where the third digit is 5.
For example, counting up from zero by two-hundredths: .00, .005, .01, .015, .02, .025, ...)
a) z for the 83% cutoff point
b) z for the 23% cutoff point
c) z for the 63% cutoff point
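
Table lookups for these cutoffs can be cross-checked with the inverse of the standard normal cumulative distribution, available in Python's standard library as `statistics.NormalDist().inv_cdf`. The sketch below computes the z-score numerically and then snaps it to the nearest multiple of .005; the helper name `z_for_cutoff` is an assumption, and a printed table may occasionally lead to a neighboring value.

```python
from statistics import NormalDist

def z_for_cutoff(p, step=0.005):
    """z-score with cumulative probability p under the standard normal
    curve, rounded to the nearest multiple of `step`."""
    z = NormalDist().inv_cdf(p)
    return round(round(z / step) * step, 3)

for pct in (0.83, 0.23, 0.63):
    print(f"{pct:.0%} cutoff: z = {z_for_cutoff(pct)}")
```

Cutoffs below 50% give negative z-scores, since they lie to the left of the mean.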