The chi-squared test is a non-parametric test that uses the chi-squared statistic to compare an observed frequency distribution with an expected frequency distribution. Its general purpose is to assess whether the difference between the observed and expected frequencies could plausibly be due to chance, and in practice it is mainly used to analyze categorical data. The data must therefore be divided into categories before the test is conducted, which is why this test does not work directly on continuous data.

**The chi-squared statistic is written in the following form:**

χ² = ∑ [(observed − expected)² / expected]

For a contingency table, the statistic approximately follows a chi-squared distribution with (r-1)(c-1) degrees of freedom under the null hypothesis

Where,

r refers to the number of rows

c refers to the number of columns

**Consider the following scenario:**

There are 100 students in one class. Say 50 of those students attended class and of those 50, 30 passed and 20 failed. On the other hand, the other 50 students skipped class, and of those 50, 20 passed and 30 failed.

The hypotheses are therefore:

H0: passing or failing is independent of class attendance.

Ha: passing or failing is associated with class attendance.

Observed data:

| | Pass | Fail |
|---|---|---|
| Attended | 30 | 20 |
| Skipped | 20 | 30 |

Expected data:

| | Pass | Fail |
|---|---|---|
| Attended | 25 | 25 |
| Skipped | 25 | 25 |

**Note:** The expected values are calculated from the observed data table. Take the 30 students who attended class and passed as an example. First, multiply the row total for attended (30 + 20 = 50) by the column total for pass (30 + 20 = 50), which gives 2500. Then divide this product by the overall total for the entire table (30 + 20 + 20 + 30 = 100). The expected value is therefore 2500/100 = 25. Use the same calculation for each of the other expected values.
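The rule in the note (row total times column total, divided by the grand total) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `expected_counts` and the list-of-lists table layout are assumptions made for this example, not part of the original text.

```python
def expected_counts(observed):
    """Return the expected count for every cell of a contingency table.

    Each expected count is (row total * column total) / grand total.
    `observed` is a list of rows, each row a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r_total * c_total / grand_total for c_total in col_totals]
        for r_total in row_totals
    ]


# Observed data from the scenario: rows are Attended, Skipped;
# columns are Pass, Fail.
observed = [[30, 20], [20, 30]]
expected = expected_counts(observed)
print(expected)  # [[25.0, 25.0], [25.0, 25.0]]
```

Every cell comes out as 25 here because both row totals and both column totals equal 50.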

Thus the chi-squared value would be:

χ² = ((30-25)^2/25) + ((20-25)^2/25) + ((20-25)^2/25) + ((30-25)^2/25) = 4

Degrees of freedom = (2-1)(2-1) = 1
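As a quick check of the arithmetic, the statistic and the degrees of freedom can be reproduced with a short Python sketch. The helper names `chi_squared_statistic` and `degrees_of_freedom` are hypothetical and chosen only for this illustration. The tables are the observed and expected data from the scenario.

```python
def chi_squared_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


# Rows are Attended, Skipped; columns are Pass, Fail.
observed = [[30, 20], [20, 30]]
expected = [[25, 25], [25, 25]]

print(chi_squared_statistic(observed, expected))  # 4.0
print(degrees_of_freedom(observed))               # 1
```

Each of the four cells contributes 25/25 = 1, which is why the total is exactly 4.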

From a chi-squared distribution table with 1 degree of freedom, a statistic of 4 corresponds to a p-value of approximately 0.0455. This falls below the standard significance level of 0.05, so we reject the null hypothesis that the pass/fail outcome is independent of class attendance at the 5% significance level.
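For readers without a table to hand, the p-value can also be computed directly. The sketch below covers only the case of 1 degree of freedom, where a chi-squared variable is the square of a standard normal variable, so the upper-tail probability equals `erfc(sqrt(x / 2))`. The function name `p_value_one_df` is an assumption for this example. For other degrees of freedom, a statistics library would be the usual choice.

```python
import math


def p_value_one_df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    Valid only for df = 1: P(Z^2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(chi2 / 2))


alpha = 0.05  # significance level used in the text
p = p_value_one_df(4.0)

print(round(p, 4))  # 0.0455
print(p < alpha)    # True, so the null hypothesis is rejected
```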