# Confidence interval, population mean, level of significance

1. The union representing the Bottle Blowers of America (BBA) is considering a proposal to merge with the Teamsters Union. According to BBA union bylaws, at least three-fourths of the union membership must approve any merger. A random sample of 2,000 current BBA members reveals 1,600 plan to vote for the merger proposal. What is the estimate of the population proportion? Construct a 95 percent confidence interval for the population proportion clearly showing your steps. Based on your results, is it likely that the merger proposal will pass? Why?
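As a sketch of the arithmetic for Problem 1, the snippet below computes the sample proportion and a large-sample (normal approximation) interval. The function name `proportion_ci` is an illustrative choice, and the normal approximation is an assumption that seems reasonable for a sample of 2,000.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                                # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)           # z times the standard error
    return p_hat, p_hat - margin, p_hat + margin


p_hat, lower, upper = proportion_ci(successes=1600, n=2000, confidence=0.95)
print(f"point estimate = {p_hat:.3f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
print("entire interval above 0.75:", lower > 0.75)
```

Under these assumptions the interval runs from roughly 0.782 to 0.818. Since the whole interval lies above the three-fourths threshold, the sample is consistent with the proposal passing.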

2. The U.S. Dairy Industry wants to estimate the mean yearly consumption of milk. A sample of 36 people revealed an average yearly consumption of 60 gallons. The population standard deviation is estimated to be 20 gallons. Based on these data, what is the best point estimate for the yearly consumption of milk at the population level? Please construct and interpret the 99% confidence interval for the population mean, clearly showing all your steps. Based on your result, would it be reasonable to conclude that the population mean is 70 gallons?
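The computation for Problem 2 might be sketched as follows, treating the 20-gallon figure as a known population standard deviation so that a z-interval applies. The helper name `mean_ci_known_sigma` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence):
    """z-based confidence interval for a mean when sigma is treated as known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 2.576 for 99%
    margin = z * sigma / sqrt(n)                         # z times the standard error
    return xbar - margin, xbar + margin


milk_lower, milk_upper = mean_ci_known_sigma(xbar=60, sigma=20, n=36, confidence=0.99)
print(f"point estimate = 60 gallons")
print(f"99% CI = ({milk_lower:.2f}, {milk_upper:.2f})")
print("70 gallons inside the interval:", milk_lower <= 70 <= milk_upper)
```

With these inputs the interval is roughly 51.4 to 68.6 gallons, so a population mean of 70 gallons falls outside it.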

3. Merrill Lynch Securities and Healthcare Retirement Inc. are two large employers in downtown Toledo, Ohio. They are considering jointly offering child care for their employees. As a part of the feasibility study, they wish to estimate the mean weekly childcare cost of their employees. The following table presents the amounts spent by 20 randomly selected employees in one week.

$107 $92 $97 $95 $105 $110 $91 $99 $104 $95

$94 $101 $98 $92 $102 $100 $93 $96 $106 $90

Please construct the 90% confidence interval for the mean weekly expenses on child care for all the employees. Clearly show all your steps and interpret your result.
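A minimal sketch of the Problem 3 interval is shown below. Because the population standard deviation is unknown and the sample is small, it uses a t-interval with 19 degrees of freedom. The critical value 1.729 is hard-coded from a standard t table (an assumption of this sketch, since Python's standard library has no t distribution).

```python
from math import sqrt
from statistics import mean, stdev

# Weekly child care costs (dollars) for the 20 sampled employees.
costs = [107, 92, 97, 95, 105, 110, 91, 99, 104, 95,
         94, 101, 98, 92, 102, 100, 93, 96, 106, 90]

T_CRIT_90_DF19 = 1.729  # two-sided 90% critical value, df = 19, from a t table

cost_mean = mean(costs)                       # sample mean
cost_sd = stdev(costs)                        # sample standard deviation (n - 1)
cost_se = cost_sd / sqrt(len(costs))          # standard error of the mean
cost_lower = cost_mean - T_CRIT_90_DF19 * cost_se
cost_upper = cost_mean + T_CRIT_90_DF19 * cost_se

print(f"mean = {cost_mean:.2f}, s = {cost_sd:.2f}")
print(f"90% CI = ({cost_lower:.2f}, {cost_upper:.2f})")
```

Under these assumptions the interval is approximately $96.09 to $100.61.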

4. Heinz, a manufacturer of ketchup, uses a particular machine designed to dispense 16 ounces of ketchup into containers. A random sample of 50 recently filled containers revealed an average amount of 16.017 ounces. From many years of experience with the machine, Heinz knows the population standard deviation to be 0.15 ounces. Does the evidence from the sample of 50 indicate that the mean amount dispensed by the machine is now different from 16 ounces? Please use a significance level of 5% to conduct your test and show all your steps.
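The two-tailed z-test for Problem 4 could be set up along these lines, with H0: mean = 16 against H1: mean ≠ 16. Variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 16.0        # hypothesized mean (ounces)
xbar = 16.017      # sample mean
sigma = 0.15       # known population standard deviation
n = 50
alpha = 0.05

z_stat = (xbar - mu_0) / (sigma / sqrt(n))            # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)          # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))     # two-tailed p-value
reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.3f}, critical value = ±{z_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0:", reject_h0)
```

Here the test statistic is about 0.80, well inside ±1.96, so the sketch does not reject the hypothesis that the machine still dispenses 16 ounces on average.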

5. The McFarland Insurance Company Claims Department reports the mean cost to process a claim is $60. An industry comparison showed this amount to be larger than most other insurance companies, so the company instituted cost-cutting measures. To evaluate the effect of the cost-cutting measures, the Supervisor of the Claims Department selected a random sample of 28 claims processed last month. The sample information is reported below.

$45 $49 $55 $62 $40 $43 $61 $48 $53 $67

$63 $78 $64 $48 $54 $51 $56 $63 $69 $58

$51 $58 $59 $56 $57 $38 $76 $66

At the 1% level of significance, is it reasonable to report that the mean cost of processing a claim is now less than $60? Please clearly show your steps and explain your conclusion.
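One way to sketch the left-tailed t-test for Problem 5 (H0: mean ≥ 60, H1: mean < 60) is shown below. The critical value 2.473 for 27 degrees of freedom at the 1% level is hard-coded from a standard t table, which is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

# Processing cost (dollars) for the 28 sampled claims.
claims = [45, 49, 55, 62, 40, 43, 61, 48, 53, 67,
          63, 78, 64, 48, 54, 51, 56, 63, 69, 58,
          51, 58, 59, 56, 57, 38, 76, 66]

MU_0 = 60                 # mean cost claimed before the cost-cutting measures
T_CRIT_01_DF27 = 2.473    # one-tailed 1% critical value, df = 27, from a t table

claims_mean = mean(claims)
claims_sd = stdev(claims)                              # sample standard deviation
t_stat = (claims_mean - MU_0) / (claims_sd / sqrt(len(claims)))
reject_lower_tail = t_stat < -T_CRIT_01_DF27           # left-tailed decision rule

print(f"mean = {claims_mean:.2f}, s = {claims_sd:.2f}, t = {t_stat:.3f}")
print("reject H0:", reject_lower_tail)
```

With these data the statistic is about -1.77, which does not fall below -2.473, so at the 1% level the sample does not support reporting a mean below $60.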

6. Mary Pankhurst is the vice president for Nursing Services at St. Luke's Memorial Hospital. Recently she noticed in the job postings for nurses that unionized positions seem to offer higher wages. She decided to investigate the matter and gathered the following information for randomly selected job postings for unionized and non-unionized positions.

| Group | Sample size | Sample average wage | Sample standard deviation |
|---|---|---|---|
| Union | 40 | $23.75 | $2.25 |
| Nonunion | 45 | $22.50 | $1.90 |

Would it be reasonable for her to conclude that unionized nurses earn more? Please conduct your test at the 5% level of significance, clearly showing all your steps.
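The comparison in Problem 6 can be sketched as a right-tailed two-sample test (H0: union mean ≤ nonunion mean, H1: union mean > nonunion mean). This sketch assumes both samples are large enough (40 and 45) to use the sample standard deviations in a z-statistic; a pooled or Welch t-test is an equally defensible choice and gives a very similar statistic.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table above.
n_union, mean_union, sd_union = 40, 23.75, 2.25
n_non, mean_non, sd_non = 45, 22.50, 1.90
alpha = 0.05

# Standard error of the difference between two independent sample means.
se_diff = sqrt(sd_union**2 / n_union + sd_non**2 / n_non)
z_diff = (mean_union - mean_non) / se_diff
z_crit_one_tail = NormalDist().inv_cdf(1 - alpha)      # about 1.645
p_one_tail = 1 - NormalDist().cdf(z_diff)
reject_equal_wages = z_diff > z_crit_one_tail

print(f"z = {z_diff:.3f}, critical value = {z_crit_one_tail:.3f}, p-value = {p_one_tail:.4f}")
print("reject H0:", reject_equal_wages)
```

Under the large-sample assumption the statistic is about 2.75, which exceeds 1.645, so the sketch rejects the null hypothesis in favor of higher wages for unionized postings.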

7. Suppose you work as a marketing manager for a tobacco company and wish to understand the factors that influence the demand for cigarettes so that you can determine which group of people to target more heavily in your marketing strategy. After thinking about the question at length, you reasoned that the number of cigarettes smoked per day ($y$) by an individual may depend on years of education ($x_1$), average price per pack in the state in cents ($x_2$), age in years ($x_3$), weekly income in dollars ($x_4$), race ($x_5$), and state-level restrictions on smoking in restaurants ($x_6$). Accordingly, you specified the following multiple linear regression model relating the number of cigarettes smoked per day to these six independent (explanatory) variables.

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \varepsilon$$

where $\varepsilon$ is the error term,

$x_5 = 1$ for white, $x_5 = 0$ for non-white

$x_6 = 1$ if the state has smoking restrictions in restaurants, $x_6 = 0$ if it does not

While wondering how to obtain data to estimate the relationship you specified above, suppose a friend gave you the data named 'SMOKING' under the webpage for MBA 506. Using the data and Excel please answer the following questions.

a. Briefly explain the role of the error term and why we still include it in a multiple linear regression model like the one specified above.

b. Estimate and report the results for the multiple linear regression model specified above. Please report your results in equation form, writing the standard error in parentheses under each estimated coefficient. Also include the number of observations, total sum of squares (SST), regression sum of squares (SSR), and error sum of squares (SSE) in your reported results.

c. Please briefly explain whether the sign of the relationship between smoking and education you found makes sense. Do the same for smoking and age, smoking and price per pack, smoking and income, smoking and dummy variable for race, and smoking and dummy variable for state level restrictions on smoking.

d. Interpret the estimated coefficients for each of the six variables included in the multiple regression model.

e. At 5% level of significance, test whether each of the six estimated coefficients is statistically significant. Please clearly show all your steps and comment on whether your decision to reject or not to reject the null hypothesis makes sense.

f. At the 5% level of significance, please test whether the six independent variables jointly have a significant effect on smoking, clearly showing all your steps.

g. Calculate and interpret the coefficient of multiple determination ($R^2$) using the reported results on SST and SSR. Given the magnitude of $R^2$ and your decision about the joint significance of the effects of independent variables under f, is the model you have estimated strong enough to be used for prediction?
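For parts f and g, the relationship between the sums of squares, $R^2$, and the overall F statistic can be sketched as below. The SMOKING data set is not reproduced here, so the numbers passed in are made-up placeholders rather than results, and the function name `r_squared_and_f` is hypothetical. The computed F statistic would still need to be compared with a critical value from an F table with $k$ and $n - k - 1$ degrees of freedom.

```python
def r_squared_and_f(sst, ssr, n, k):
    """Return R^2, SSE, and the overall F statistic for a regression.

    sst: total sum of squares, ssr: regression sum of squares,
    n: number of observations, k: number of independent variables.
    """
    sse = sst - ssr                               # error sum of squares
    r2 = ssr / sst                                # share of variation explained
    f_stat = (ssr / k) / (sse / (n - k - 1))      # MSR divided by MSE
    return r2, sse, f_stat


# Placeholder inputs for illustration only, NOT taken from the SMOKING data.
r2, sse, f_stat = r_squared_and_f(sst=1000.0, ssr=250.0, n=107, k=6)
print(f"R^2 = {r2:.3f}, SSE = {sse:.1f}, F = {f_stat:.3f}")
```

To use it, replace the placeholder arguments with the SST, SSR, and number of observations reported in the Excel regression output.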

#### Solution Summary

Confidence interval, population mean, level of significance, error, regression model