Share
Explore BrainMass

# Standard Deviation of a Sample

Excel Lab 2

For this assignment, choose a single variable data set that interests you from (except the
"Motion of Stars" data set which I'll use for my example):
http://college.hmco.com/mathematics/brase/understandable_statistics/8e/students/datasets
/sv/index.html or on the CD that came with your text. Follow the link, and click on the
Excel data file icon in the pop-up (on the far left). Save this data set in your Filespace
with a name that helps you remember what's in it.
As I mentioned above, I'll use the "Motion of Starts" data set which represents the
angular motions of stars across the sky due to the stars own velocity. A random sample of
stars from the M92 global cluster was used, and the units of the data are arc seconds per
century. (An arc second is 1/3600 of a degree across the sky). Open your data set in
Excel.
On the standard toolbar, click the Paste Function button that looks like this: and
you'll get a dialog box that looks like

Make sure that in the "Or select a category" box, that you have "Statistical" selected.
Now "Cancel" that dialog, we'll be back to it in a minute. Here's my spread sheet. Notice how I've created titled for the different statistics I want
to find and typed them in a titled column:
To find the measures of center and variation I have listed, I used the following functions
from the Insert Function dialog box:
Mean - AVERAGE(number1,number2,...)
Median - MEDIAN(number1,number2,...)
Mode - MODE(number1,number2,...)
StDev s - STDEV(number1,number2,...) - Note that this is the sample standard
deviation and assumes we have only a sample of the entire data population. This is true
for most of the data sets available from our source.
StdDev p - STDEVP(number1,number2,...) - This is the population standard deviation
which assumes the data you have is the entire population.
5% trim mean - TRIMMEAN(data range, percent as a decimal) - This function
returns a mean based on the total percentage of data removed from both the bottom and
the top of the ordered data values. If you want a 5% trimmed mean (and you do!),
implying that 5% of the bottom data and 5% of the top data will be removed before a
mean is calculated, then enter 0.10 for the percent in Excel.
Count data n - COUNT(number1,number2,...) - returns the number of data items
selected.
In my work sheet I have already found the Mean by typing =AVERAGE(A2:A85) in the
D3 cell. I could have also used the Insert Function dialog by first typing an "=" in the D3
cell and then clicking the button which allows me to then enter the range of values I
want to find the mean of. Hit "OK" and it's done. Let's step through my next computation, which is Mode. I click on cell D5 and type "="
then click . The dialog comes up, I select Mode from the list and hit "OK" to get this
new dialog:
I click in the Number 1 box and then go select my data cells A1:A85, then click OK. The
mode of my selected data shows up in the cell
Find all 7 statistics for your data set and write up your results. Explain what your data set
represents (there is an explanation on the download dialogue, you can paraphrase the wording from that). Summarize the results in an 8 row by 2 column table. To do this, go
to Table->Insert->Table and Select 8 rows and 2 columns, then OK.
Here's a good start. Yours should go something like this. Make sure to answer the extra
questions I have for you to answer.
The following is my analysis of a data set which representing the angular motions of stars
across the sky due to the stars own velocity. A random sample of stars from the M92
global cluster was used, and the units of the data are arc seconds per century. (An arc
second is 1/3600 of a degree across the sky).

Measurement Arc Seconds/Century
Mean 0.028904
Median 0.027
Mode 0.016
Sample Std Deviation 0.010681
Population Std Deviation 0.010617
5% Trim Mean 0.028421
Number of data points 84
By sorting my data from highest to lowest, I notice that the minimum value is 0.012 arc
seconds per century and maximum value is 0.06 arc seconds per century. There are no
extreme outliers in my data set which is reflected by the very slight difference between
the mean and the 5% trim mean which removes 5% of the data from each end of the
ordered data.
You should explain why your sample standard deviation is larger than the population
standard deviation. Finally, let's make a quick and dirty histogram of your data by
selecting the data in Excel then go to Tools->Data Analysis->Histogram. Select Chart
output and New Workbook:
Ignore the Bin Range and Excel will choose them for you. Just click OK and you'll get a
frequency table and a histogram that could use a few tweaks, but it's not needed. Make
sure to include your histogram in your write-up.
Here's mine (I selected a bar, then right clicked on it and selected Format data series-
>Options, and then set the Gap width to 0):

Arc Seconds per Century
0
20
40
0.012
0.022
0.033
0.044
0.054
Bin
Frequency
Frequency
It looks like data is distributed in a bell-shape and is almost symmetric (the mass is
slightly shifted to the left) around the mean at 0.027. This is to be expected, since the
mean and the median are so close. However, the mode is very low compared to both of
these measurements, so we should expect perfect symmetry (When mode, mean, and
median are very close, the distribution of the data in a histogram looks very symmetric
and bell-shaped.)
Include an analysis of your histogram in light of the seven statistics you computed before.
Is the standard deviation small compared to the range of the data, or large? Is there
anything surprising about the results? Did your trimmed mean jump significantly from
the full mean? Why? If so, should you toss out any values and do the analysis again (you
don't have to, you could just mention that it should be done and leave it at that.)

#### Solution Summary

This Solution contains calculations to aid you in understanding the Solution to this question.

\$2.19