Assume a significance level of α = 5%.
1. Everybody seems to disagree about why so many parts have to be fixed or thrown away after they are produced. Some say that it’s the temperature of the production process, which needs to be held constant (within a reasonable range). Others claim that it’s clearly the density of the product, and that if we could only produce a heavier material, the problems would disappear. Then there is Ole, who has been warning everyone forever to take care not to push the equipment beyond its limits. This problem would be the easiest to fix, simply by slowing down the production rate; however, this would increase costs. Interestingly, many of the workers on the morning shift think that the problem is “those inexperienced workers in the afternoon,” who, curiously, feel the same way about the morning workers.
Ever since the factory was automated, with computer network communication and bar code readers at each station, data have been piling up. You’ve finally decided to have a look. After your assistant aggregated the data by 4-hour blocks and then typed in the Morning variable, you found the following note on your desk with a printout of the data already loaded into the computer network:
Temperature actually measures temperature variability as a standard deviation during the time of measurement. Units are degrees Fahrenheit.
Density indicates the density of the final product. Units are ounces per cubic inch.
Rate indicates the rate of production. Units are units produced per hour.
Morning is an indicator variable that is 1 during morning production and is 0 during the afternoon.
Defect is the number of defects per 1,000 produced.
You decide to run a regression to determine the effect of the variables Temperature, Density, Rate, and Morning on the number of defects. Use the regression output below to answer the following questions. Each question is worth 3 pts.
a) Report the R² value and explain its interpretation (in non-technical language).
b) Explain in non-technical language the coefficient of the variable Temperature. (Be careful here.)
Results of multiple regression for Defect
Summary measures
  Multiple R      0.9482
  R-Square        0.8990
  Adj R-Square    0.8829
  StErr of Est    6.6439

ANOVA Table
  Source         df   SS          MS          F         p-value
  Explained       4   9825.7565   2456.4391   55.6493   0.0000
  Unexplained    25   1103.5355     44.1414

Regression coefficients
                 Coefficient   Std Err   t-value   p-value
  Constant          -28.7556   64.1696   -0.4481    0.6579
  Temperature        26.2422    9.0515    2.8992    0.0077
  Density            -0.5081    1.5250   -0.3332    0.7418
  Rate                0.0521    0.1256    0.4149    0.6818
  Morning            -1.7461    0.8026   -2.1756    0.0392
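
To see how the pieces of the regression output fit together, the following Python sketch recomputes the summary measures, the F statistic, and two of the t-values from the ANOVA table and the coefficient table. It is only an illustrative check under the usual multiple-regression formulas, not the software that produced the output; all variable and function names are hypothetical.

```python
import math

# Values copied from the ANOVA table in the regression output.
ss_explained = 9825.7565
ss_unexplained = 1103.5355
df_explained = 4       # number of explanatory variables (k)
df_unexplained = 25    # n - k - 1

n = df_explained + df_unexplained + 1        # 30 observations
ss_total = ss_explained + ss_unexplained

# Summary measures.
r_square = ss_explained / ss_total
multiple_r = math.sqrt(r_square)
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_unexplained

# Mean squares, standard error of estimate, and F statistic.
ms_explained = ss_explained / df_explained
ms_unexplained = ss_unexplained / df_unexplained
st_err_of_est = math.sqrt(ms_unexplained)
f_stat = ms_explained / ms_unexplained


def t_value(coefficient, std_err):
    """t-value for a coefficient: the estimate divided by its standard error."""
    return coefficient / std_err


# Coefficients and standard errors copied from the coefficient table.
t_temperature = t_value(26.2422, 9.0515)
t_morning = t_value(-1.7461, 0.8026)

print(f"Multiple R    {multiple_r:.4f}")
print(f"R-Square      {r_square:.4f}")
print(f"Adj R-Square  {adj_r_square:.4f}")
print(f"StErr of Est  {st_err_of_est:.4f}")
print(f"F             {f_stat:.4f}")
print(f"t Temperature {t_temperature:.4f}")
print(f"t Morning     {t_morning:.4f}")
```

Each printed value should agree with the corresponding entry in the output up to rounding.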
c) What does the p-value on Temperature indicate?
d) What does Temperature’s significance or lack thereof imply about controlling the production quality? In particular, should you be looking to control temperature within a reasonable range?
e) Explain in non-technical language the coefficient of the indicator variable Morning. In particular, which shift (the morning or afternoon) is producing the most defects?
f) What does Morning’s significance or lack thereof imply about controlling the production quality? What action does the p-value on Morning indicate you should take?
g) Even though you feel the above results are good, you remember your statistics teacher going on and on about the importance of looking at summary measures and graphing the variables. Reluctantly, you decide to view the summary statistics:
Summary measures for selected variables
                  Temperature   Density   Rate    Morning   Defect
  Count                    30        30      30        30       30
  Mean                    2.2      25.3   236.5     0.800     27.1
  Standard dev.           0.6       3.4    26.1     1.808     19.4
  Minimum                 1.0      19.5   177.7         0      0.0
  Maximum                 3.0      32.2   281.9        10     60.8
  Range                   2.1      12.7   104.2        10     60.8
After close inspection, you are very thankful that you listened to your statistics teacher! Why?