# Chi-square Test for Goodness of Fit

AutoWrecks, Inc. sells auto insurance. AutoWrecks keeps close tabs on its customers' driving records, updating its rates according to the trends indicated by these records. AutoWrecks' records indicate that, in a "typical" year, roughly 70% of the company's customers do not commit a moving violation, 10% commit exactly one moving violation, 15% commit exactly two moving violations, and 5% commit three or more moving violations.

This past year's driving records for a random sample of 100 AutoWrecks customers are summarized in the first row of numbers in Table 1 below. This row gives this year's observed frequencies for each moving violation category for the sample of 100 AutoWrecks customers. The second row of numbers gives the frequencies expected for a sample of 100 AutoWrecks customers if the moving violations distribution for this year is the same as the distribution for a "typical" year. The bottom row of numbers in Table 1 contains the values

$$\frac{(f_o - f_e)^2}{f_e} = \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$

for each of the moving violation categories.

Fill in the missing values of Table 1. Then, using the 0.05 level of significance, perform a test of the hypothesis that there is no difference between this year's moving violation distribution and the distribution in a "typical" year. Then complete Table 2.

Round your responses for the expected frequencies in Table 1 to at least two decimal places. Round your (fo - fe)^2/fe responses in Table 1 to at least three decimal places. Round your responses in Table 2 as specified.

| Number of moving violations | No violations | Exactly one violation | Exactly two violations | Three or more violations | Total |
|---|---|---|---|---|---|
| Observed freq (fo) | 56 | 15 | 18 | 11 | 100 |
| Expected freq (fe) | ____ | 10.00 | 15.00 | ____ | |
| (fo - fe)^2/fe | ____ | 2.500 | 0.600 | ____ | |

Type of Test Statistic:

The Value of the Test Statistic:

The critical value for the test at the 0.05 level of significance:

Can we reject the hypothesis that there is no difference between this year's moving violation distribution and the distribution in a "typical" year (using the 0.05 level of significance)?
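The arithmetic behind Table 1 and Table 2 for the AutoWrecks data can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for the example, and the critical value 7.815 is an assumption taken from a standard chi-square table for 3 degrees of freedom at the 0.05 level, not something stated in the problem.

```python
# Illustrative sketch for the AutoWrecks data; names are hypothetical.
observed = [56, 15, 18, 11]           # observed frequencies, n = 100
typical = [0.70, 0.10, 0.15, 0.05]    # "typical"-year proportions from the problem
n = sum(observed)

# Expected frequency per category: sample size times hypothesized proportion.
expected = [n * p for p in typical]

# Per-category contributions (fo - fe)^2 / fe, then their sum (the test statistic).
components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_sq = sum(components)

df = len(observed) - 1                # 4 categories give 3 degrees of freedom
critical_value = 7.815                # assumed: table value for df = 3, alpha = 0.05

# Reject the null hypothesis when the statistic exceeds the critical value.
reject = chi_sq > critical_value

print([round(fe, 2) for fe in expected])
print([round(c, 3) for c in components])
print(round(chi_sq, 3), df, reject)
```

The decision rule is one-sided: only large values of the statistic count as evidence against the "typical" year distribution.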

Depression and insomnia often go hand-in-hand, and sometimes it is unclear which of the two should be the primary subject of treatment in individuals suffering from insomnia. Mendoza & Company, a national pharmaceutical firm, has positioned itself as a specialist in the production of both antidepressants and sleeping pills. Mendoza's current business model describes the following breakdown of America's approximately 50 million adults suffering from insomnia: 17% use both antidepressants and sleeping pills regularly, 24% use only antidepressants regularly, 15% use only sleeping pills regularly, and the remaining 44% use neither antidepressants nor sleeping pills regularly.

A recent issue of the psychiatry journal Patterns contains a study on insomnia. In the study, 150 American adults suffering from insomnia (but otherwise chosen at random) were asked about their use of antidepressants and sleeping pills. The breakdown of their answers is given in the top row of numbers in Table 1 below. (These numbers are the frequencies observed for the sample of 150 insomniacs.) The second row of numbers in Table 1 gives the expected frequencies under the hypothesis that Mendoza's model is correct. The bottom row of numbers in Table 1 contains the values

$$\frac{(f_o - f_e)^2}{f_e} = \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$

for each of the categories of medication use.

Fill in the missing values of Table 1. Then, using the 0.10 level of significance, perform a test of the hypothesis that Mendoza's model is correct. Then complete Table 2.

Round your responses for the expected frequencies in Table 1 to at least two decimal places. Round your (fo - fe)^2/fe responses in Table 1 to at least three decimal places. Round your responses in Table 2 as specified.

| Antidepressant / sleeping pill use | Both | Only antidepressants | Only sleeping pills | Neither | Total |
|---|---|---|---|---|---|
| Observed freq (fo) | 27 | 28 | 19 | 76 | 150 |
| Expected freq (fe) | ____ | 36.00 | ____ | 66.00 | |
| (fo - fe)^2/fe | ____ | 1.778 | ____ | 1.515 | |

Type of Test Statistic:

The Value of the Test Statistic:

The p-value:

Can we conclude that the percentages given in Mendoza's model are incorrect (using the 0.10 level of significance)?
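Because this problem asks for a p-value rather than a critical value, a rough sketch of that calculation may help. With four categories there are 3 degrees of freedom, and for exactly 3 degrees of freedom the upper-tail chi-square probability has a closed form in terms of the complementary error function. The helper name `chi2_sf_df3` is hypothetical, and the sketch applies only to the 3-degrees-of-freedom case.

```python
import math

# Illustrative sketch for the Mendoza data; names are hypothetical.
observed = [27, 28, 19, 76]           # observed frequencies, n = 150
model = [0.17, 0.24, 0.15, 0.44]      # proportions in Mendoza's model
n = sum(observed)

expected = [n * p for p in model]
components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_sq = sum(components)

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(chi_sq)
alpha = 0.10

# Reject the model only if the p-value falls below the significance level.
reject = p_value < alpha

print([round(fe, 2) for fe in expected])
print([round(c, 3) for c in components])
print(round(chi_sq, 3), round(p_value, 3), reject)
```

A p-value above 0.10 means the sample does not give enough evidence to conclude that the model's percentages are incorrect.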

Does it seem to you that people tend to be absent more on some days of the week than on others? Recently, a major biotechnology firm collected data with the hope of determining whether or not its employees were more likely to be absent (due to personal reasons or illness) on some weekdays than on others. The firm examined a random sample of 100 employee absences.

The distribution of these 100 absences is shown in Table 1 below. The observed frequencies for each category (each weekday) are shown in the first row of numbers in Table 1. The second row of numbers contains the frequencies expected for a sample of 100 absences if employee absences at the firm are equally likely on each of the five weekdays. The bottom row of numbers in Table 1 contains the values

$$\frac{(f_o - f_e)^2}{f_e} = \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$

for each of the categories.

Fill in the missing values of Table 1. Then, using the 0.05 level of significance, perform a test of the hypothesis that employee absences at this firm are equally likely on each of the five weekdays. Then complete Table 2.

Round your responses for the expected frequencies in Table 1 to at least two decimal places. Round your (fo - fe)^2/fe responses in Table 1 to at least three decimal places. Round your responses in Table 2 as specified.

| Weekday | Monday | Tuesday | Wednesday | Thursday | Friday | Total |
|---|---|---|---|---|---|---|
| Observed freq (fo) | 21 | 7 | 18 | 25 | 29 | 100 |
| Expected freq (fe) | ____ | ____ | 20.00 | 20.00 | 20.00 | |
| (fo - fe)^2/fe | ____ | ____ | 0.200 | 1.250 | 4.050 | |

Type of Test Statistic:

The Value of the Test Statistic:

The critical value for the test at the 0.05 level of significance:

Can we conclude that absences by the firm's employees are more likely on some day(s) of the week than on others (using the 0.05 level of significance)?
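For the weekday data the null hypothesis is a uniform distribution, so every expected frequency is the sample size divided by the number of categories, and five categories give 4 degrees of freedom. The sketch below is illustrative only: the names are hypothetical, and the critical value 9.488 is an assumption taken from a standard chi-square table for 4 degrees of freedom at the 0.05 level. For an even number of degrees of freedom the tail probability has a simple closed form, which the sketch uses as a cross-check.

```python
import math

# Illustrative sketch for the absence data; names are hypothetical.
observed = [21, 7, 18, 25, 29]        # Monday through Friday, n = 100
n = sum(observed)
k = len(observed)

# Under "equally likely on each weekday", each category expects n / k absences.
expected = [n / k] * k
components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_sq = sum(components)

df = k - 1                            # 5 categories give 4 degrees of freedom
critical_value = 9.488                # assumed: table value for df = 4, alpha = 0.05
reject = chi_sq > critical_value

# Cross-check: for df = 4 the upper-tail probability is exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print([round(c, 3) for c in components])
print(round(chi_sq, 3), df, reject, round(p_value, 4))
```

The two approaches should agree: the statistic exceeds the critical value exactly when the p-value is below 0.05.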

More than one teacher has given the following advice: choose answer C when blindly guessing among four answers in a multiple choice test, since C is more often the correct answer than either A, B, or D.

Suppose that we take a random sample of 580 multiple-choice test answers (the correct answers from the instructor's answer sheet) from introductory college courses and obtain the information summarized in the first row of numbers in Table 1 below. These numbers are the observed frequencies for each of the categories A, B, C, and D for our sample of 580 correct answers. The second row of numbers in Table 1 contains the frequencies expected for a sample of 580 correct answers if a correct answer is equally likely to be A, B, C, or D. The bottom row of numbers in Table 1 contains the values

$$\frac{(f_o - f_e)^2}{f_e} = \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$

for each of the correct answer categories A, B, C, and D.

Fill in the missing values of Table 1. Then, using the 0.10 level of significance, perform a test of the hypothesis that each of A, B, C, and D is equally likely to be the correct answer on tests in these introductory college courses. Then complete Table 2.

Round your responses for the expected frequencies in Table 1 to at least two decimal places. Round your (fo - fe)^2/fe responses in Table 1 to at least three decimal places. Round your responses in Table 2 as specified.

| Correct answer | "A" | "B" | "C" | "D" | Total |
|---|---|---|---|---|---|
| Observed freq (fo) | 165 | 151 | 143 | 121 | 580 |
| Expected freq (fe) | ____ | ____ | 145.00 | 145.00 | |
| (fo - fe)^2/fe | ____ | ____ | 0.028 | 3.972 | |

Type of Test Statistic:

The Value of the Test Statistic:

The p-value:

Can we reject the hypothesis that A, B, C, and D are equally likely to be the correct answer on the tests in these introductory college courses (using the 0.10 level of significance)?
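When no closed form or statistics library is at hand, the p-value can be approximated for any number of degrees of freedom from the series expansion of the regularized lower incomplete gamma function. The sketch below takes that route for the answer-key data. The function name `chi2_sf` is hypothetical, and this series is just one of several possible numerical methods; it is adequate for the moderate statistic values seen here.

```python
import math

def chi2_sf(x, df):
    """Approximate P(X > x) for a chi-square variable with df degrees of freedom,
    using the power series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(1000):
        denom += 1.0
        term *= z / denom
        total += term
        if term < total * 1e-15:      # stop once further terms no longer matter
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# Illustrative sketch for the answer-key data; names are hypothetical.
observed = [165, 151, 143, 121]       # correct answers A, B, C, D; n = 580
n = sum(observed)
expected = [n / len(observed)] * len(observed)
chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

p_value = chi2_sf(chi_sq, df=len(observed) - 1)
reject = p_value < 0.10

print(round(chi_sq, 3), round(p_value, 3), reject)
```

Since the series takes the degrees of freedom as a parameter, the same helper could be reused for the five-category weekday data by passing `df=4`.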

Executives at The Thinking Channel have decided to test whether the educational backgrounds of the channel's viewers are different from the educational backgrounds of American adults (ages 25 and over) as a whole. The executives have the following information on the American adult population as a whole, obtained from a recent U.S. Current Population Survey:

| Highest degree earned | Less than high school | High school | College | Higher than college |
|---|---|---|---|---|
| Percent of population | 12% | 25% | 55% | 8% |

The executives also obtained data (from telephone surveys) on highest degrees earned for a random sample of 160 American adults who are Thinking Channel viewers. These data are summarized in the first row of numbers in Table 1 below. These numbers are the observed frequencies in the sample of 160 for each of the degree categories. The second row of numbers in Table 1 gives the expected frequencies under the assumption that the distribution of highest degrees earned by Thinking Channel viewers is the same as the distribution of highest degrees earned by American adults as a whole. The bottom row of numbers in Table 1 gives the values

$$\frac{(f_o - f_e)^2}{f_e} = \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$

for each of the degree categories.

Fill in the missing values of Table 1. Then, using the 0.05 level of significance, perform a test of the hypothesis that the distribution of highest degrees earned by Thinking Channel viewers is the same as the distribution of highest degrees earned by American adults as a whole. Then complete Table 2.

Round your responses for the expected frequencies in Table 1 to at least two decimal places. Round your (fo - fe)^2/fe responses in Table 1 to at least three decimal places. Round your responses in Table 2 as specified.

| Highest degree earned | Less than high school | High school | College | Higher than college | Total |
|---|---|---|---|---|---|
| Observed freq (fo) | 20 | 37 | 97 | 6 | 160 |
| Expected freq (fe) | ____ | 40.00 | 88.00 | ____ | |
| (fo - fe)^2/fe | ____ | 0.225 | 0.920 | ____ | |

Type of Test Statistic:

The Value of the Test Statistic:

The critical value for the test at the 0.05 level of significance:

Can we conclude that the distribution of highest degrees earned by Thinking Channel viewers is different from the distribution of highest degrees earned by American adults as a whole (using the 0.05 level of significance)?
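The critical value in Table 2 is normally read from a chi-square table, but it can also be recovered numerically. The sketch below, offered only as an illustration, finds the 0.05 critical value for 3 degrees of freedom by bisection on the closed-form tail probability for that case, then applies it to the Thinking Channel data. The function names are hypothetical, and the closed form is valid only for 3 degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def critical_value_df3(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection search for x with P(X > x) = alpha; the tail probability decreases in x."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > alpha:
            lo = mid                  # tail still too heavy, move right
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative sketch for the Thinking Channel data; names are hypothetical.
observed = [20, 37, 97, 6]                 # sample of 160 viewers
population = [0.12, 0.25, 0.55, 0.08]      # U.S. adult proportions from the problem
n = sum(observed)

expected = [n * p for p in population]
components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_sq = sum(components)

critical_value = critical_value_df3(0.05)
reject = chi_sq > critical_value

print([round(fe, 2) for fe in expected])
print([round(c, 3) for c in components])
print(round(chi_sq, 3), round(critical_value, 3), reject)
```

If the statistic does not exceed the critical value, the sample gives no grounds at the 0.05 level for concluding that viewers differ from American adults as a whole.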

See attached file.

#### Solution Summary

The solution provides a step-by-step method for calculating the chi-square test for goodness of fit. The formula for the calculation and interpretations of the results are also included.