# 18 Problems in Statistics

1.

a. 11 10 8 4 6 7 11 6 11 7

b. 13 23 15 13 18 13 15 14 20 20 18 17 20 13

i. Find the range, mean, variance, and standard deviation of each population data set.

ii. Without calculating, which data set has the greatest sample standard deviation? Which has the least sample standard deviation? Explain your reasoning.

iii. How are the data sets the same? How do they differ?
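As an illustration of part i, here is a minimal sketch that treats data set (a) as a population and uses Python's standard `statistics` module. The helper name `population_summary` is hypothetical; the same call works for data set (b).

```python
from statistics import mean, pvariance, pstdev

# Data set (a) from problem 1, treated as a full population
data_a = [11, 10, 8, 4, 6, 7, 11, 6, 11, 7]

def population_summary(values):
    """Return range, mean, population variance, and population std. dev."""
    return {
        "range": max(values) - min(values),
        "mean": mean(values),
        "variance": pvariance(values),   # divides by N, not N - 1
        "std_dev": pstdev(values),
    }

summary_a = population_summary(data_a)
print(summary_a)
```

For a sample rather than a population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.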

2. Water Consumption: The numbers of gallons of water consumed per day by a small village are listed. Make a frequency distribution (using five classes) for the data set. Then approximate the population mean and the population standard deviation of the data set.

167 180 192 173 145 151 174 175 178 160

195 224 244 146 162 146 177 163 149 188
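One possible way to carry out problem 2 is sketched below. It assumes the common textbook convention of taking the class width as the range divided by the number of classes, rounded up to the next whole number, and starting the first class at the minimum value; other conventions give slightly different classes and estimates.

```python
import math

gallons = [167, 180, 192, 173, 145, 151, 174, 175, 178, 160,
           195, 224, 244, 146, 162, 146, 177, 163, 149, 188]
num_classes = 5

# Assumed convention: width = range / classes, rounded up (99 / 5 -> 20)
low = min(gallons)
width = math.ceil((max(gallons) - low) / num_classes)

# Build (lower limit, upper limit, midpoint, frequency) for each class
classes = []
for k in range(num_classes):
    lower = low + k * width
    upper = lower + width - 1
    freq = sum(lower <= g <= upper for g in gallons)
    classes.append((lower, upper, (lower + upper) / 2, freq))

# Grouped-data approximations: each value is replaced by its class midpoint
n = sum(f for _, _, _, f in classes)
grouped_mean = sum(m * f for _, _, m, f in classes) / n
grouped_var = sum(f * (m - grouped_mean) ** 2 for _, _, m, f in classes) / n
grouped_sd = math.sqrt(grouped_var)   # population form: divide by N

for lower, upper, mid, freq in classes:
    print(f"{lower}-{upper}  midpoint={mid}  f={freq}")
print(grouped_mean, grouped_sd)
```

With these classes the frequencies are 8, 7, 3, 1, 1, and the grouped estimates are a mean of 174.5 and a standard deviation of about 21.9.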

3. Use frequency distribution formulas to approximate the sample mean and sample standard deviation of the following data set.

108 139 120 123 120 132 123 131 131 157 150 124 111

101 135 119 116 117 127 128 139 119 118 114 127

4. Weekly salaries (in dollars) for a sample of registered nurses are listed.

774 446 1019 795 908 667 444 960

i. Find the mean, the median, and the mode of the salaries. Which best describes a typical salary?

ii. Find the range, variance, and standard deviation of the data set. Interpret the results in the context of the real-life setting.

Confidence Intervals for the Mean (Small Samples)

5. Suppose you incorrectly used the normal distribution to find the maximum error of estimate for the given values of c, s, and n.

c = 0.95, s = 5, n = 16

i. Find the value of E using the normal distribution.

ii. Find the correct value using a t-distribution. Compare the results.
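The comparison in problem 5 can be sketched as follows, using E = (critical value) × s / √n. The z critical value is computed with the standard library; the t critical value 2.131 (df = 15, c = 0.95) is an assumption copied from a standard t-table rather than computed.

```python
from math import sqrt
from statistics import NormalDist

c, s, n = 0.95, 5, 16

# z critical value for a two-sided interval with confidence level c
z_c = NormalDist().inv_cdf((1 + c) / 2)   # about 1.96

# t critical value for df = n - 1 = 15, taken from a t-table (not computed here)
t_c = 2.131

E_normal = z_c * s / sqrt(n)   # about 2.45
E_t = t_c * s / sqrt(n)        # about 2.66

print(E_normal, E_t)
```

The t-based margin is wider, which reflects the extra uncertainty from estimating the standard deviation with a small sample.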

6. Suppose you incorrectly used the normal distribution to find the maximum error of estimate for the given values of c, s, and n.

c = 0.99, s = 3, n = 6

i. Find the value of E using the normal distribution.

ii. Find the correct value using a t-distribution. Compare the results.

7. You want to estimate the mean repair cost for dishwashers. The estimate must be within $10 of the population mean. Determine the required sample size to construct a 99% confidence interval for the population mean. Assume the population standard deviation is $22.50.
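A minimal sketch of the sample-size calculation in problem 7, using n = (z_c · σ / E)² rounded up to the next whole number. The variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist

confidence = 0.99
sigma = 22.50      # assumed population standard deviation, in dollars
max_error = 10     # required margin of error, in dollars

# Two-sided z critical value for the chosen confidence level (about 2.576)
z_c = NormalDist().inv_cdf((1 + confidence) / 2)

# Always round up: rounding down would give a margin larger than required
required_n = ceil((z_c * sigma / max_error) ** 2)
print(required_n)
```

With these inputs the unrounded value is about 33.6, so the sketch reports a required sample size of 34.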

8. The following data set represents the repair costs (in dollars) for a random sample of 30 dishwashers.

41.82 52.81 57.80 68.16 73.48 78.88 88.13 88.79

90.07 90.35 91.68 91.72 93.01 95.21 95.34 96.50

100.05 101.32 103.59 104.19 105.62 111.32 117.14 118.42

118.77 119.01 120.70 140.52 141.84 147.06

i. Assume the population of dishwasher repair costs is normally distributed.

ii. Construct a 95% confidence interval for the population variance.

iii. Construct a 95% confidence interval for the population standard deviation.

9. An auto maker estimates that the mean gas mileage of its luxury sedan is at least 25 miles per gallon. A random sample of eight such cars had a mean of 23 miles per gallon and a standard deviation of 5 miles per gallon. At α = 0.05, can you reject the auto maker's claim that the mean gas mileage of its luxury sedan is at least 25 miles per gallon? Assume the population is normally distributed.

1. State the claim mathematically. Identify H0 and Ha.

2. Determine when a type I or type II error occurs.

3. Determine whether the hypothesis test is a one-tailed or a two-tailed test and whether to use a z-test or a t-test. Explain your reasoning.

4. If necessary, find the critical value(s) and identify the rejection region(s).

5. Find the appropriate test statistic. If necessary, find the P-value.

6. Decide whether to reject or fail to reject the null hypothesis. Then interpret the decision in the context of the original claim.
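Steps 4 through 6 for problem 9 can be sketched as below, assuming a left-tailed t-test with H0: μ ≥ 25 and Ha: μ < 25. The critical value 1.895 (df = 7, one tail, α = 0.05) is copied from a standard t-table, not computed.

```python
from math import sqrt

mu_0 = 25      # claimed minimum mean mileage (mpg)
x_bar = 23     # sample mean
s = 5          # sample standard deviation
n = 8          # sample size, so df = 7

# Standardized test statistic for a one-sample t-test
t_stat = (x_bar - mu_0) / (s / sqrt(n))   # about -1.13

# Left-tailed critical value from a t-table (df = 7, alpha = 0.05)
t_critical = -1.895

reject_h0 = t_stat < t_critical
print(t_stat, reject_h0)
```

Because the statistic does not fall in the rejection region, this sketch fails to reject H0: the sample does not give enough evidence at the 5% level to reject the auto maker's claim.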

10. A state school administrator says that the standard deviation of SAT verbal test scores is 105. A random sample of 14 SAT verbal test scores has a standard deviation of 113. At α = 0.01, test the administrator's claim. What can you conclude? Assume the population is normally distributed.

1. State the claim mathematically. Identify H0 and Ha.

2. Determine when a type I or type II error occurs.

3. Determine whether the hypothesis test is a one-tailed or a two-tailed test and whether to use a z-test or a t-test. Explain your reasoning.

4. If necessary, find the critical value(s) and identify the rejection region(s).

5. Find the appropriate test statistic. If necessary, find the P-value.

6. Decide whether to reject or fail to reject the null hypothesis. Then interpret the decision in the context of the original claim.
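A claim about a standard deviation is usually tested with a chi-square statistic rather than a z or t statistic. The sketch below assumes a two-tailed test with H0: σ = 105; the critical values 3.565 and 29.819 (df = 13, α = 0.01) are copied from a standard chi-square table.

```python
n = 14           # sample size, so df = 13
s = 113          # sample standard deviation
sigma_0 = 105    # claimed population standard deviation

# Chi-square test statistic for a variance or standard deviation
chi_square = (n - 1) * s**2 / sigma_0**2   # about 15.06

# Two-tailed critical values from a chi-square table (df = 13, alpha = 0.01)
chi_left, chi_right = 3.565, 29.819

reject_h0 = chi_square < chi_left or chi_square > chi_right
print(chi_square, reject_h0)
```

The statistic lies between the two critical values, so under these assumptions the administrator's claim is not rejected at the 1% level.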

11. A tourist agency in Massachusetts claims the mean daily cost of meals and lodging for a family of four traveling in the state is $276. You work for a consumer protection advocate and want to test this claim. In a random sample of 35 families of four traveling in Massachusetts, the mean daily cost of meals and lodging is $285 and the standard deviation is $30. Do you have enough evidence to reject the agency's claim? Use a P-value and α = 0.05.

1. State the claim mathematically. Identify H0 and Ha.

2. Determine when a type I or type II error occurs.

3. Determine whether the hypothesis test is a one-tailed or a two-tailed test and whether to use a z-test or a t-test. Explain your reasoning.

4. If necessary, find the critical value(s) and identify the rejection region(s).

5. Find the appropriate test statistic. If necessary, find the P-value.

6. Decide whether to reject or fail to reject the null hypothesis. Then interpret the decision in the context of the original claim.
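The P-value approach for problem 11 can be sketched as follows. It assumes a two-tailed z-test (H0: μ = 276) with the sample standard deviation standing in for σ because n ≥ 30, which is a common textbook convention rather than the only valid choice.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 276     # claimed mean daily cost (dollars)
x_bar = 285    # sample mean
s = 30         # sample standard deviation
n = 35
alpha = 0.05

# Standardized test statistic
z = (x_bar - mu_0) / (s / sqrt(n))               # about 1.77

# Two-tailed P-value: area in both tails beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # about 0.076

reject_h0 = p_value <= alpha
print(z, p_value, reject_h0)
```

Since the P-value exceeds α, the sketch fails to reject H0; the sample does not provide enough evidence at the 5% level to reject the agency's claim.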

Measures of Regression and Prediction Intervals

12. Leisure Hours: Construct a 90% prediction interval for the median number of leisure hours per week when the median number of work hours per week is 45.1.

| Median no. of work hrs. per week, x | Median no. of leisure hrs. per week, y |
| --- | --- |
| 40.6 | 26.2 |
| 43.1 | 24.3 |
| 46.9 | 19.2 |
| 47.3 | 18.1 |
| 46.8 | 16.6 |
| 48.7 | 18.8 |
| 50.0 | 18.8 |
| 50.7 | 19.5 |
| 50.6 | 19.2 |
| 50.8 | 19.5 |

13. Error of Estimate: Find the standard error of estimate s_e and interpret the results.

14. Peanut Yield: The equation used to predict peanut yield (in pounds) is given (please refer to the attachment for the data). x1 is the number of acres planted (in thousands) and x2 is the number of acres harvested (in thousands). (Source: U.S. National Agricultural Statistics Service)

a. x1 = 1458, x2 = 1450

b. x1 = 1500, x2 = 1475

c. x1 = 1400, x2 = 1385

d. x1 = 1525, x2 = 1500

15. Rice Yield: To predict the annual rice yield (in pounds), use the equation, where x1 is the number of acres planted (in thousands) and x2 is the number of acres harvested (in thousands). (Source: U.S. National Agricultural Statistics Service)

a. x1 = 2532, x2 = 2255

b. x1 = 3581, x2 = 3021

c. x1 = 3213, x2 = 3065

d. x1 = 2758, x2 = 2714

16. Black Cherry Tree Volume: The volume (in cubic feet) of black cherry trees can be modeled by the equation, where x1 is the tree's height (in feet) and x2 is the tree's diameter (in inches). (Source: Journal of the Royal Statistical Society)

a. x1 = 70, x2 = 8.6

b. x1 = 65, x2 = 11.0

c. x1 = 83, x2 = 17.6

d. x1 = 87, x2 = 19.6

17. Earnings Per Share: The earnings per share (in dollars) for McDonald's Corporation are given by the equation where x1 represents total revenue (in billions of dollars) and x2 represents total net worth (in billions of dollars). (Source: McDonald's Corporation)

a. x1 = 11.4, x2 = 8.6

b. x1 = 8.1, x2 = 6.2

c. x1 = 10.7, x2 = 8.5

d. x1 = 7.3, x2 = 5.1

18. The table lists the personal income and outlays (both in trillions of dollars) for Americans for 11 recent years.

Construct a 95% prediction interval for personal outlays when personal income is 6.4 trillion dollars. Interpret the results.

| Personal income, x | Personal outlays, y |
| --- | --- |
| 4.5 | 3.7 |
| 4.9 | 4.0 |
| 5.0 | 4.1 |
| 5.3 | 4.3 |
| 5.6 | 4.6 |
| 5.9 | 4.8 |
| 6.2 | 5.1 |
| 6.5 | 5.4 |
| 7.0 | 5.7 |
| 7.4 | 6.1 |
| 7.8 | 6.5 |
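The prediction interval in problem 18 can be sketched with the usual simple-regression formulas: fit the least-squares line, compute the standard error of estimate s_e, and use the margin E = t_c · s_e · √(1 + 1/n + (x0 − x̄)² / Sxx). The critical value 2.262 (df = 9, 95%) is an assumption copied from a standard t-table.

```python
from math import sqrt

income = [4.5, 4.9, 5.0, 5.3, 5.6, 5.9, 6.2, 6.5, 7.0, 7.4, 7.8]
outlays = [3.7, 4.0, 4.1, 4.3, 4.6, 4.8, 5.1, 5.4, 5.7, 6.1, 6.5]
x0 = 6.4        # personal income at which to predict (trillions of dollars)
t_c = 2.262     # t-table value for df = n - 2 = 9, 95% confidence

n = len(income)
x_bar = sum(income) / n
y_bar = sum(outlays) / n

# Least-squares slope and intercept
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, outlays))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Standard error of estimate from the residuals
residual_ss = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(income, outlays))
s_e = sqrt(residual_ss / (n - 2))

# Point prediction and prediction-interval margin at x0
y_hat = intercept + slope * x0
margin = t_c * s_e * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
lower, upper = y_hat - margin, y_hat + margin

print(f"y_hat = {y_hat:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

The same steps apply to problem 12 after swapping in its data, x0 = 45.1, and the 90% critical value for df = 8.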

https://brainmass.com/statistics/descriptive-statistics/18-problems-statistics-327435

#### Solution Summary

This solution comprises detailed step-by-step calculations and analysis of the given statistics problems and gives students a clear perspective on the underlying concepts.

30 Multiple Choice Problems in Statistics

Problem A

The manager of a grocery store claims that the average time that customers spend in checkout lines is 20 minutes or less. A sample of 36 customers is taken. The average time spent in checkout lines for the sample is 24.6 minutes, and the sample standard deviation is 12 minutes. Conduct a hypothesis test (at the 0.05 level of significance) to determine if the mean waiting time for the customer population is significantly more than 20 minutes.

The observed value of the test statistic is:

a. 2.3 b. 0.38 c. -2.3 d. -0.38

The p-value is:

a. 0.5107 b. 0.0214 c. 0.0137 d. 0.4893

Problem E:

A company wants to measure the relationship between its employee productivity (measured in output/employee) and the number of employees. Sample data for the last four months are shown below. Use simple linear regression to estimate this relationship.

| Number of Employees (independent variable) | Employee Productivity (dependent variable) |
| --- | --- |
| 15 | 5 |
| 12 | 7 |
| 10 | 9 |
| 7 | 11 |

ANSWER QUESTIONS 16 THROUGH 19 BELOW.

16. The least squares estimate of the slope b1 is:

a. -0.7647 b. -0.13 c. 21.4 d. 16.41

17. The least squares estimate of the intercept b0 is:

a. -7.647 b. -0.13 c. 21.4 d. 16.41

18. The estimated employee productivity when the number of employees is 5 is:

a. 78 b. 12.59 c. 5.8 d. 32.6

19. If the sample covariance is -8.67, estimate the coefficient of correlation between the number of employees and employee productivity:

a. -0.997 b. 0.997 c. 1.23 d. 1.02
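The least-squares quantities asked for in questions 16 through 19 can be computed directly from the four data points of Problem E. This is a minimal sketch using the textbook formulas b1 = Sxy / Sxx and b0 = ȳ − b1·x̄; the variable names are illustrative.

```python
from math import sqrt

employees = [15, 12, 10, 7]       # x, independent variable
productivity = [5, 7, 9, 11]      # y, dependent variable

n = len(employees)
x_bar = sum(employees) / n
y_bar = sum(productivity) / n

# Sums of squares and cross-products about the means
s_xx = sum((x - x_bar) ** 2 for x in employees)
s_yy = sum((y - y_bar) ** 2 for y in productivity)
s_xy = sum((x - x_bar) * (y - y_bar)
           for x, y in zip(employees, productivity))

slope = s_xy / s_xx                    # b1
intercept = y_bar - slope * x_bar      # b0
predicted_at_5 = intercept + slope * 5

sample_cov = s_xy / (n - 1)            # matches the -8.67 quoted in question 19
correlation = s_xy / sqrt(s_xx * s_yy)

print(slope, intercept, predicted_at_5, sample_cov, correlation)
```

Rounded, the sketch gives a slope of about -0.7647, an intercept of about 16.41, a prediction of about 12.59 at five employees, and a correlation of about -0.997.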

Problem F:

Consumer Research is an independent agency that is collecting data on annual income (INCOME) and household size (SIZE), to predict annual credit card charges. It runs a regression analysis on the data and an incomplete MS Excel output is shown below.

ANSWER QUESTIONS 20 THROUGH 30 BELOW.

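Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.88038239 |
| R Square | |
| Adjusted R Square | |
| Standard Error | 510.495493 |
| Observations | |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 2 | 17960368.3 | | | 3.31446E-07 |
| Residual | 20 | | 260605.648 | | |
| Total | 22 | | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
| --- | --- | --- | --- | --- | --- |
| Intercept | | 352.694714 | 4.15578994 | 0.00048872 | 730.0172039 |
| INCOME | 25.062956 | 8.47147285 | 2.95851223 | 0.00776734 | 7.391781505 |
| SIZE | 408.400776 | 71.808401 | | 1.447E-05 | 258.6111461 |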

20. The sample size is:

a. 23 b. 22 c. 20 d. 21

21. The coefficient of determination is:

a. 0.88 b. 0.775 c. 0.92 d. -0.38

22. The Sum of Squares for Error (i.e., Residual) is:

a. 17960368.3 b. 5212112.97 c. 23172481.3 d. 260605.648

23. The Sum of Squares for Total (SST) is:

a. 17960368.3 b. 5212112.97 c. 23172481.3 d. 260605.648

24. The Mean Square for Regression is:

a. 17960368.3 b. 5212112.97 c. 260605.648 d. 8980184.17

25. The observed or computed F-value is:

a. 34.459 b. 0.029 c. 3.445 d. 0.29

26. The hypothesis to be tested is:

H0: B1 = B2 = 0

Ha: At least one of the B is not equal to 0.

The hypothesis is to be tested at the 5% level of significance. The null hypothesis is:

a. not rejected

b. rejected

c. the test is inconclusive

d. none of the above answers are correct

27. The hypothesis to be tested is:

H0: B1 = 0

Ha: B1 ≠ 0

The hypothesis is to be tested at the 1% level of significance. The null hypothesis is:

a. not rejected

b. rejected

c. the test is inconclusive

d. none of the above answers are correct

28. The estimate of the intercept b0 is:

a. 10010.2 b. 2810.3 c. 1465.5 d. 2641.5

29. The observed or computed t-stat (i.e., t-value) for the independent variable SIZE is:

a. 2.96 b. 3.445 c. 4.16 d. 5.687

30. What is the estimated annual credit charges if INCOME = 20, and SIZE = 3?

a. 9700 b. 12600 c. 3189 d. 5300

Problem G:

Last year, the student body of a local university consisted of 30% freshmen, 24% sophomores, 26% juniors, and 20% seniors. A sample of 300 students taken from this year's student body showed the number of students in each classification.

Freshmen 83

Sophomores 68

Juniors 85

Seniors 64

We want to know if there has been a significant change in the proportions of student classifications between the two years.

ANSWER QUESTIONS 31 THROUGH 34 BELOW.

31. The expected number of freshmen in this year is:

a. 83 b. 90 c. 30 d. 10

32. The number of degrees of freedom is:

a. 4 b. 2 c. 3 d. 1

33. The hypothesis is to be tested at the 5% level of significance. The critical chi-square value from the table equals:

a. 1.645 b. 1.96 c. 2.75 d. 7.815

34. If the chi-square value that is calculated equals 1.6615, then the null hypothesis is:

a. not rejected

b. rejected

c. the test is inconclusive

d. none of the above answers are correct
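The goodness-of-fit calculation behind Problem G can be sketched as follows: expected counts are last year's proportions times the sample size, and the statistic is the sum of (O − E)² / E. The critical value 7.815 (df = 3, α = 0.05) is taken from a standard chi-square table.

```python
observed = {"Freshmen": 83, "Sophomores": 68, "Juniors": 85, "Seniors": 64}
last_year = {"Freshmen": 0.30, "Sophomores": 0.24, "Juniors": 0.26, "Seniors": 0.20}

n = sum(observed.values())   # 300 students sampled

# Expected counts if this year's proportions matched last year's
expected = {k: n * p for k, p in last_year.items()}

# Chi-square goodness-of-fit statistic
chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

df = len(observed) - 1       # categories minus one
critical_value = 7.815       # chi-square table, df = 3, alpha = 0.05

reject_h0 = chi_square > critical_value
print(expected, chi_square, df, reject_h0)
```

The computed statistic reproduces the 1.6615 quoted in question 34, which is well below the critical value.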

Problem H: Use the following Excel Output to answer questions 35-39:

| Source | Sum of Squares | d.f. |
| --- | --- | --- |
| Between Groups | 213.88125 | 3 |
| Within Groups | 11.208333 | 20 |
| Total | 225.0895 | 23 |

35. Consider the above one-way ANOVA table. What is the treatment mean square?

A) 71.297 B) 0.5604 C) 1.297 D) 213.881 E) 9.7

36. Consider the above one-way ANOVA table. What is the mean square error?

A) 71.297 B) 0.5604 C) 1.297 D) 213.8810 E) 9.7

37. Consider the above one-way ANOVA table. How many groups (treatment levels) are included in the study?

A) 3 B) 4 C) 6 D) 20 E) 24

38. Consider the above one-way ANOVA table. If there is an equal number of observations in each group, then each group (treatment level) consists of ______ observations.

A) 3 B) 4 C) 6 D) 20 E) 24

39. What is the critical F-value at an alpha of 0.05?

A) 3.1 B) 3.86 C) 14.17 D) 4.94 E) 8.66
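The quantities in questions 35 through 38 follow from the one-way ANOVA table by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the number of groups is the between-groups d.f. plus one, and the total sample size is the total d.f. plus one. A minimal sketch:

```python
ss_between, df_between = 213.88125, 3
ss_within, df_within = 11.208333, 20
df_total = 23

# Mean squares: sum of squares divided by degrees of freedom
ms_treatment = ss_between / df_between   # about 71.29
ms_error = ss_within / df_within         # about 0.5604

f_stat = ms_treatment / ms_error

num_groups = df_between + 1              # k - 1 = 3
total_obs = df_total + 1                 # N - 1 = 23
obs_per_group = total_obs // num_groups  # assumes equal group sizes

print(ms_treatment, ms_error, f_stat, num_groups, obs_per_group)
```

The critical F-value in question 39 still has to be read from an F-table with 3 and 20 degrees of freedom; it is not computed here.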

Problem I:

Use the following to answer questions 40-42:

The following results were obtained from a simple regression analysis:

Ŷ = 37.2895 - (1.2024) * X

r = - 0.6774

40. For each unit change in X (independent variable), the estimated change in Y (dependent variable) is equal to:

A) -1.2024 B) 0.6774 C) 37.2895 D) 0.2934

41. When X (independent variable) is equal to zero, the estimated value of Y (dependent variable) is equal to:

A) -1.2024 B) 0.6774 C) 37.2895 D) 0.2934

42. __________ is the proportion of the variation explained by the simple linear regression model:

A) 0.8230 B) 0.6774 C) 0.4589 D) 0.2934 E) 37.2895

43. Given the following information about a hypothesis test of the difference between two means based on independent random samples, which one of the following is the correct rejection region at a significance level of .05?

HA: μ1 > μ2, x̄1 = 12, x̄2 = 9, s1 = 4, s2 = 2, n1 = 13, n2 = 10.

A) Reject H0 if Z > 1.96

B) Reject H0 if Z > 1.645

C) Reject H0 if t > 1.721

D) Reject H0 if t > 2.08

E) Reject H0 if t > 1.734

Problem K: Business travelers were asked to rate Miami Airport (on a scale of 1-10). Similarly, business travelers were asked to rate Los Angeles Airport. A hypothesis test (at alpha = 0.05) is conducted for any difference in the population means in the ratings. The Excel output is shown below. Use the following to answer questions 47-48:

t-Test: Two-Sample Assuming Unequal Variances

| | Miami | Los Angeles |
| --- | --- | --- |
| Mean | 6.34 | 6.72 |
| Variance | 4.677959184 | 5.63428571 |
| Observations | 50 | 50 |
| Hypothesized Mean Difference | 0 | |
| df | 97 | |
| t Stat | -0.836742811 | |
| P(T<=t) one-tail | 0.202396923 | |
| t Critical one-tail | 1.660714588 | |
| P(T<=t) two-tail | 0.404793846 | |
| t Critical two-tail | 1.984722076 | |

48. A 95% confidence interval of the difference between the mean ratings is:

a. -0.52 to 1.25

b. 1.67 to 2.43

c. -0.51 to 1.27

d. -1.28 to 0.52

e. -2.43 to 1.67
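The confidence interval in question 48 can be reconstructed from the Excel output alone. This sketch assumes the difference is taken as Miami minus Los Angeles and uses the unequal-variance standard error √(s1²/n1 + s2²/n2) together with the two-tail critical value reported in the output.

```python
from math import sqrt

mean_miami, mean_la = 6.34, 6.72
var_miami, var_la = 4.677959184, 5.63428571
n_miami, n_la = 50, 50
t_critical = 1.984722076   # "t Critical two-tail" from the Excel output

# Difference in sample means (Miami minus Los Angeles, an assumed ordering)
diff = mean_miami - mean_la

# Standard error for two independent samples with unequal variances
std_error = sqrt(var_miami / n_miami + var_la / n_la)

t_stat = diff / std_error          # should agree with "t Stat" in the output
margin = t_critical * std_error
lower, upper = diff - margin, diff + margin

print(f"t = {t_stat:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval comes out to roughly -1.28 to 0.52. It contains zero, which is consistent with the two-tail P-value of about 0.40 in the output.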

[Please refer to the attachment for details]
