# Statistics Problems on Comparisons of Data

1. During a natural gas shortage, a gas company randomly sampled residential gas meters in order to monitor daily gas consumption. On a particular day, a sample of 100 meters showed a sample mean of 250 cubic feet and a sample standard deviation of 50 cubic feet. The manager for the gas company believes that the company may not have enough reserves if daily gas consumption rises above 275 cubic feet per household per day.

a. How many degrees of freedom will we have when we find the critical t value?

b. What is the 95% confidence interval estimate of the mean gas consumption for the population?

c. The manager asks us whether she can be 99% confident that daily gas consumption is less than 275 cubic feet per household per day. What is the hypothesis we need to test? (Hint: it will be a one-sided hypothesis test.)

d. What can we conclude about this hypothesis with 99% confidence?
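
One way to check the arithmetic for parts a through d is sketched below. It uses only the summary statistics given in the problem. The two critical values are approximate t-table values for 99 degrees of freedom; they are assumptions brought in from a standard table and are not part of the problem statement. The variable names are illustrative only.

```python
import math

# Summary statistics taken from the problem statement.
n = 100
sample_mean = 250.0   # cubic feet
sample_sd = 50.0      # cubic feet
threshold = 275.0     # manager's concern level, cubic feet

# ASSUMED table values (not given in the problem): approximate t critical
# values for 99 degrees of freedom.
T_CRIT_95_TWO_SIDED = 1.984   # t_{0.025, 99}
T_CRIT_99_ONE_SIDED = 2.365   # t_{0.01, 99}

# Part a: degrees of freedom for a one-sample t procedure.
df = n - 1

# Part b: 95% confidence interval for the population mean.
standard_error = sample_sd / math.sqrt(n)
margin = T_CRIT_95_TWO_SIDED * standard_error
ci_95 = (sample_mean - margin, sample_mean + margin)

# Parts c and d: one-sided test of H0: mu >= 275 against H1: mu < 275.
t_stat = (sample_mean - threshold) / standard_error
reject_h0 = t_stat < -T_CRIT_99_ONE_SIDED

print(f"df = {df}, standard error = {standard_error:.2f}")
print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"t = {t_stat:.2f}, reject H0 at the 1% level: {reject_h0}")
```

Because n is large, replacing the t critical values with the corresponding z values (1.960 and 2.326) would change the interval only slightly and would not change the test decision.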

2. A union composed of several thousand pilots is preparing to vote on an amendment to the union contract. The collective bargaining agreement requires approval of the amendment by at least 60% of the voting members (so we are interested in a proportion). A random sample of 500 pilots yielded 320 who planned to vote yes. The union leadership is interested in the likelihood that the amendment will be approved in the upcoming vote.

a. State the appropriate null and alternative hypotheses.

b. Can the union infer at the 5% significance level that the proposed amendment will receive more than 60% yes votes?

c. What is the p-value for the test?

d. What is the 95% confidence interval estimate for the proportion of union members who will vote in favor of the amendment?
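
The sketch below works through parts b through d with a large-sample z test for a proportion. It assumes the normal approximation is acceptable (here both n·p0 and n·(1 − p0) are well above 10) and takes the hypotheses as H0: p = 0.60 against H1: p > 0.60. The variable names are illustrative; `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

# Data from the problem statement.
n = 500
yes_votes = 320
p0 = 0.60          # approval threshold under the null hypothesis
alpha = 0.05

p_hat = yes_votes / n

# Test statistic: the standard error uses p0 because it is computed under H0.
se_null = math.sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se_null

# One-sided (upper-tail) p-value.
p_value = 1 - NormalDist().cdf(z_stat)
reject_h0 = p_value < alpha

# 95% confidence interval: the standard error uses the sample proportion.
z_975 = NormalDist().inv_cdf(0.975)
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
ci_95 = (p_hat - z_975 * se_hat, p_hat + z_975 * se_hat)

print(f"p_hat = {p_hat:.2f}, z = {z_stat:.3f}, p-value = {p_value:.4f}")
print(f"Reject H0 at the 5% level: {reject_h0}")
print(f"95% CI: ({ci_95[0]:.4f}, {ci_95[1]:.4f})")
```

Note that the test and the interval use different standard errors, so a one-sided test at the 5% level and a two-sided 95% interval need not lead to the same impression about the 60% threshold.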

3. A food processor wants to compare two additives for their effects on retarding spoilage. Suppose 16 cuts of fresh meat are treated with additive A and 16 are treated with additive B, and the number of hours until spoilage begins is recorded for each of the 32 cuts of meat. The results are summarized in the table below:

| | Additive A | Additive B |
|---|---|---|
| Sample Mean | 108.7 hours | 98.7 hours |
| Sample Standard Deviation | 10.5 hours | 13.6 hours |

a. State the null and alternative hypotheses to determine if the average number of hours until spoilage begins differs for the additives A and B (a two-sided test).

b. Assume population variances are equal. Calculate the pooled variance.

c. Assume population variances are equal. Calculate the value of the test statistic.

d. Assume population variances are equal. How many degrees of freedom will we have for the test statistic?

e. What is the rejection region at alpha = 0.05? (Note that there is an upper and a lower rejection region for a two-sided test.)

f. What can we conclude at 95% confidence about the null and alternative hypotheses?
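
A minimal sketch of the pooled-variance calculation for parts b through f follows, using the summary statistics from the table. The critical value 2.042 is an approximate t-table value for 30 degrees of freedom at a two-sided alpha of 0.05; it is an assumption taken from a standard table, not from the problem. Variable names are illustrative.

```python
import math

# Summary statistics from the problem's table (hours until spoilage).
n_a, mean_a, sd_a = 16, 108.7, 10.5
n_b, mean_b, sd_b = 16, 98.7, 13.6

# ASSUMED table value (not given in the problem): t_{0.025, 30}.
T_CRIT = 2.042

# Part d: degrees of freedom under the equal-variance assumption.
df = n_a + n_b - 2

# Part b: pooled variance, weighting each sample variance by its own df.
pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df

# Part c: test statistic for H0: mu_A = mu_B against H1: mu_A != mu_B.
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean_a - mean_b) / se_diff

# Parts e and f: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > T_CRIT

print(f"pooled variance = {pooled_var:.3f}, df = {df}")
print(f"t = {t_stat:.3f}, rejection region: |t| > {T_CRIT}")
print(f"Reject H0 at alpha = 0.05: {reject_h0}")
```

With equal sample sizes the pooled variance reduces to the simple average of the two sample variances, which gives a quick hand check.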

4. At a recent Allman Brothers concert, a survey was conducted that asked a random sample of 20 people their age and how many concerts they have attended since the first of the year. The following data were collected. Note that there is one set of data, which is continued in the second table.

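| Age | 62 | 57 | 40 | 49 | 67 | 54 | 43 | 65 | 54 | 41 |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of Concerts | 6 | 5 | 4 | 3 | 5 | 5 | 2 | 6 | 3 | 1 |

| Age | 44 | 48 | 55 | 60 | 59 | 63 | 69 | 40 | 38 | 52 |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of Concerts | 3 | 2 | 4 | 5 | 4 | 5 | 4 | 2 | 1 | 3 |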

The following is a scatter diagram of the data.

[Please refer to the attachment for the scatter diagram]

An Excel output follows:

[Please refer to the attachment for the EXCEL output]

a. Based on the scatter diagram, does it appear that a linear model is appropriate?

Yes, it appears the relationship is linear.

No, the relationship does not appear to be linear.

b. Using the data above, determine the least squares regression line, with age as X (the independent variable) and number of concerts as the dependent variable (Y). Note that you do not need to do any calculations, because the coefficients can be found in the Excel output.

c. Interpret the value of the slope of the regression line.

d. Estimate the number of Allman Brothers concerts attended by a 64-year-old person.
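
Since the Excel output is only available in the attachment, the coefficients can also be recomputed directly from the 20 data points. The sketch below fits the least squares line using deviations from the means and then evaluates it at age 64. It is computed from the listed data rather than read from the Excel output, so small rounding differences are possible. Variable names are illustrative.

```python
# Data from the two tables in the problem (one data set of 20 people).
ages = [62, 57, 40, 49, 67, 54, 43, 65, 54, 41,
        44, 48, 55, 60, 59, 63, 69, 40, 38, 52]
concerts = [6, 5, 4, 3, 5, 5, 2, 6, 3, 1,
            3, 2, 4, 5, 4, 5, 4, 2, 1, 3]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(concerts) / n

# Sums of squares and cross products about the means.
s_xx = sum((x - mean_x) ** 2 for x in ages)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, concerts))

# Least squares estimates: slope = Sxy / Sxx, and the line passes
# through the point of means (mean_x, mean_y).
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Part d: point estimate for a 64-year-old.
predicted_64 = intercept + slope * 64

print(f"fitted line: concerts = {intercept:.4f} + {slope:.4f} * age")
print(f"estimated concerts at age 64: {predicted_64:.2f}")
```

The slope is the estimated change in the number of concerts attended for each additional year of age. Age 64 lies inside the observed range of 38 to 69, so the estimate is an interpolation rather than an extrapolation.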

5. The general manager of a chain of department stores believes that experience is the most important factor in determining the level of success of a salesperson. To examine this belief she records last month's sales (in $1,000s) and the years of experience of 10 randomly selected salespeople. These data are listed below.

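| Salesperson | Years of Experience | Sales ($1,000s) |
|---|---|---|
| 1 | 0 | 7 |
| 2 | 2 | 9 |
| 3 | 10 | 20 |
| 4 | 3 | 15 |
| 5 | 8 | 18 |
| 6 | 5 | 14 |
| 7 | 12 | 20 |
| 8 | 7 | 17 |
| 9 | 20 | 30 |
| 10 | 15 | 25 |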

a. Using the data above and the Excel data analysis tool, determine the least squares regression line, with years of experience as X (the independent variable) and sales (in $1,000s) as the dependent variable (Y).

b. Interpret the value of the slope of the regression line.

c. Estimate the monthly sales for a salesperson with 16 years of experience.
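
The problem asks for the Excel data analysis tool, but the same line can be cross-checked with the raw-sum ("computational") formulas. The sketch below is such a check, not a replacement for the Excel output. Variable names are illustrative.

```python
# Data from the problem's table.
experience = [0, 2, 10, 3, 8, 5, 12, 7, 20, 15]   # years
sales = [7, 9, 20, 15, 18, 14, 20, 17, 30, 25]    # last month's sales, $1,000s

n = len(experience)
sum_x = sum(experience)
sum_y = sum(sales)
sum_xx = sum(x * x for x in experience)
sum_xy = sum(x * y for x, y in zip(experience, sales))

# Computational form of the least squares estimates.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Part c: estimated monthly sales at 16 years of experience.
predicted_16 = intercept + slope * 16

print(f"fitted line: sales = {intercept:.3f} + {slope:.3f} * experience")
print(f"estimated sales at 16 years: {predicted_16:.3f} (in $1,000s)")
```

Because sales are recorded in $1,000s, the slope is read as the estimated change in monthly sales, in thousands of dollars, for each additional year of experience.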

#### Solution Summary

This solution comprises detailed explanations, step-by-step calculations, and analysis of the given problems, giving students a clear view of the underlying concepts.