Students collected data for a project for statistics class, and were asked to find the mean, median, mode, range, upper quartile, lower quartile, interquartile range, and standard deviation. Then, they were asked to plot the data in a stem-and-leaf plot,
dot plot, histogram, and box-and-whisker plot. Explain how the students would calculate quartiles and why different methods produced slightly different results. Give an example.
Quartiles are simple in concept but can be complicated in execution.
The concept of quartiles is that you arrange the data in ascending
order and divide it into four roughly equal parts. The upper
quartile is the part containing the highest data values, the upper
middle quartile is the part containing the next-highest data values,
the lower quartile is the part containing the lowest data values,
while the lower middle quartile is the part containing the next-
lowest data values.
Here's where it starts to get confusing. The terms 'quartile', 'upper
quartile' and 'lower quartile' each have two meanings. One definition
refers to the subset of all data values in each of those parts. For
example, if I say "my score was in the upper quartile on that math
test", I mean that my score was one of the values in the upper
quartile subset (i.e. the top 25% of all scores on that test).
But the terms can also refer to cut-off values between the subsets.
The 'upper quartile' (sometimes labeled Q3 or UQ) can refer to a
cut-off value between the upper quartile subset and the upper middle
quartile subset. Similarly, the 'lower quartile' (sometimes labeled Q1
or LQ) can refer to a cut-off value between the lower quartile subset
and the lower middle quartile subset.
The term 'quartiles' is sometimes used to collectively refer
to these values plus the median (which is the cut-off value between
the upper middle quartile subset and the lower middle quartile
subset). John Tukey, the statistician who invented the box-and-
whisker plot, referred to these cut-off values as 'hinges' to avoid
confusion. Unfortunately, not everyone followed his lead on that.
It gets worse. Statisticians don't agree on whether the ...
This solution defines the concept of quartiles and illustrates it with an example. It also explains how quartiles can be calculated using different methods and why those methods produce slightly different results.
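As a rough illustration of how two common conventions can disagree, the sketch below computes the quartile cut-off values for a small data set in two ways: once leaving the median out of both halves, and once including it in both halves (Tukey's hinges). The data set `scores` and the function names are made up for this example and do not come from the students' project. Python's standard `statistics.quantiles` function is shown for comparison; its two interpolation methods are different formulas that happen to agree with the two hand-rolled versions on this particular data set.

```python
from statistics import median, quantiles

def quartiles_excluding_median(values):
    """Median-of-halves; with an odd count, the median is left out of both halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data), median(data[-half:])

def quartiles_tukey_hinges(values):
    """Tukey's hinges; with an odd count, the median is included in both halves."""
    data = sorted(values)
    half = (len(data) + 1) // 2
    return median(data[:half]), median(data), median(data[-half:])

# Hypothetical data set, chosen only to make the difference visible.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(quartiles_excluding_median(scores))              # (2.5, 5, 7.5)
print(quartiles_tukey_hinges(scores))                  # (3, 5, 7)
print(quantiles(scores, n=4, method="exclusive"))      # [2.5, 5.0, 7.5]
print(quantiles(scores, n=4, method="inclusive"))      # [3.0, 5.0, 7.0]
```

The median is 5 under every method, but the lower quartile comes out as 2.5 or 3 and the upper quartile as 7.5 or 7, depending on whether the middle value is counted in the halves. With an even number of data values the two median-of-halves versions above give identical results, because there is no single middle value to include or exclude.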
Statistics: Arranging Data, Quartiles, Box plot, Mean
Problems are attached.
16. Millionaires: The ages of the 36 millionaires sampled are arranged in increasing order in the following table.
31 38 39 39 42 42 45 47 48
48 48 52 52 53 54 55 57 59
60 61 64 64 66 66 67 68 68
69 71 71 74 75 77 79 79 79
a. Determine the quartiles for the data.
b. Obtain and interpret the interquartile range.
c. Find and interpret the five number summary.
d. Calculate the lower and upper limits.
e. Identify potential outliers, if any.
f. Construct and interpret a boxplot and, if appropriate, a modified boxplot.
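One possible way to approach parts a through e is sketched below. It assumes the median-of-halves convention for the quartiles and the usual 1.5 × IQR rule for the lower and upper limits; the problem does not say which convention its textbook uses, so a different method may shift Q3 slightly. The function name `five_number_summary` is chosen only for this illustration.

```python
from statistics import median

# Ages of the 36 millionaires, copied from the table in problem 16.
ages = [
    31, 38, 39, 39, 42, 42, 45, 47, 48,
    48, 48, 52, 52, 53, 54, 55, 57, 59,
    60, 61, 64, 64, 66, 66, 67, 68, 68,
    69, 71, 71, 74, 75, 77, 79, 79, 79,
]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    return data[0], q1, median(data), q3, data[-1]

minimum, q1, q2, q3, maximum = five_number_summary(ages)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr    # assumed 1.5 x IQR rule
upper_limit = q3 + 1.5 * iqr
outliers = [age for age in ages if age < lower_limit or age > upper_limit]

print(minimum, q1, q2, q3, maximum)   # 31 48.0 59.5 68.5 79
print(iqr)                            # 20.5
print(lower_limit, upper_limit)       # 17.25 99.25
print(outliers)                       # []
```

Under these assumptions every age falls between the limits, so no potential outliers are flagged and the ordinary and modified boxplots would look the same.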
18. Gasoline prices. The US Energy Information Administration reports figures on retail gasoline prices in the Monthly Energy Review. Data are obtained by sampling 10,000 gasoline service stations from a total of more than 185,000. For the 10,000 stations sampled in June 2003, the mean price per gallon for unleaded regular gasoline was $1.49.
a. Is the mean price given here a sample mean or a population mean? Explain your answer.
b. What letter or symbol would you use to designate the mean of $1.49?
c. Is the mean price given here a statistic or a parameter? Explain your answer.
20. Millionaires. Refer to the data in problem 16. Use the technology of your choice to obtain
a. a modified boxplot for the data.
b. the five number summary of the data.