The standard deviation of a data set is a measure of the 'spread' of a distribution of data or other numerical values; it is a measure of variability within a particular data set. To calculate the standard deviation of a sample, the variance of the sample is generally calculated first. The variance is the average squared deviation of each number from the mean, and is given by the following equation:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - m)^2$$
where m is the mean and N is the number of values within the data set of interest. For example, for the data set 1, 2, and 3, the mean is 2 and the variance is given by the following:
$$\sigma^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.667$$
The standard deviation of a sample is simply the square root of the variance, and is probably the most commonly used measure of data 'spread.'
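As a minimal sketch of the calculation described above, the following Python snippet computes the variance and standard deviation of the data set 1, 2, 3, dividing by N as in the definition given here. The function names are illustrative and not part of the exercise. Many textbooks divide by N - 1 when estimating from a sample, so check which convention your course expects.

```python
import math

def variance(values):
    """Average squared deviation from the mean (divides by N)."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n

def standard_deviation(values):
    """Square root of the variance defined above."""
    return math.sqrt(variance(values))

data = [1, 2, 3]
print(round(variance(data), 3))            # 0.667
print(round(standard_deviation(data), 3))  # 0.816
```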
For this CLC exercise:
a) Calculate the standard deviation for the experimentally obtained values of Keq and ΔH, and for the calculated values of ΔG and ΔS. Then post the results of these individual experiments, not the mean values, to the CLC group, and calculate the standard deviation for those same quantities using the entire data set generated by your CLC group.
b) How do your values compare to the values of the entire data set generated by your CLC group?
c) How 'tight' is the spread of the data?
d) Compare the values of both your personal data set and the CLC group's data set with the following accepted values reported by Pickering (Pickering 1987), shown in the table below:
ΔH / kJ mol⁻¹        17.6 ± 7.2
ΔG / kJ mol⁻¹        -5.77 ± 0.63
ΔS / J mol⁻¹ K⁻¹     77.0 ± 28
i) How do your values compare to these experimentally obtained values?
ii) What are some possible sources of error in your measurement?
iii) Were there any extreme outliers in the data set you analyzed that might affect the pooled values?
I have given you a formatted response in the word document attached, but it's pasted here for you as well.
Work together and come up with the answers to the following problem. Prepare a single group document, and submit it to the instructor by the end of Week 8.
I'll give a more detailed example with some completely made up numbers.
Terms to remember:
mean = average
deviation from mean = subtract the mean from your number (it doesn't matter if the result is negative, since you'll be squaring it, i.e., multiplying it by itself)
So, let's say we have 4 numbers: 18.5, 27.3, 20.2, and 19.8 (thus, N=4)
The average (m) is (18.5+27.3+20.2+19.8)/4 = 21.45
(don't round off until the end)
The deviation of the first value from the mean is 18.5-21.45=-2.95
The deviation of the second value from the mean is 27.3-21.45=5.85
The deviation of the third value from the mean is 20.2-21.45=-1.25
The deviation of the fourth value from the mean ...
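The steps described so far (mean first, then each value's deviation from it) can be sketched in Python as below. This is only an illustration using the made-up numbers from this example. The variable names are assumptions, and the final step divides by N to match the definition of variance quoted earlier.

```python
import math

values = [18.5, 27.3, 20.2, 19.8]
n = len(values)

# Mean (m): keep full precision and round only at the end.
m = sum(values) / n

# Signed deviation of each value from the mean (value minus mean).
deviations = [x - m for x in values]

# Squaring removes the sign, so negative deviations are not a problem.
squared = [d ** 2 for d in deviations]

# Variance as the average squared deviation, then its square root.
variance = sum(squared) / n
std_dev = math.sqrt(variance)

for x, d in zip(values, deviations):
    print(f"{x} - {m:.2f} = {d:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```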
The solution and a complete explanation are provided for how to determine standard deviation from experimental data. Experimental error from a known value is also calculated and explained.
Frequency distribution and other Statistics Concepts
Please see the attached file.
All work must be shown (step by step) in order to receive credit.
Part I : Question 10 - 13 Show all your work
1. Consider the following frequency distribution based on a set of sample data:
X 0 1 2 3 4 5
f 10 17 14 9 3 1
a) What is the sample size?
b) What is the relative frequency of X = 2?
c) What type of frequency distribution is this?
d) Generate the set of cumulative frequencies.
e) The value X = 2 occupies what positions in the ordered data set?
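One way to organize the bookkeeping for a frequency distribution like the one in question 1 is sketched below in Python. The variable names are illustrative, and positions are counted from 1 in the ordered data set.

```python
from itertools import accumulate

x_values = [0, 1, 2, 3, 4, 5]
freqs = [10, 17, 14, 9, 3, 1]

# Sample size is the sum of all frequencies.
n = sum(freqs)

# Relative frequency of each X value.
relative = [f / n for f in freqs]

# Cumulative frequencies (running totals).
cumulative = list(accumulate(freqs))

# Positions occupied by X = 2 in the ordered data set (1-based).
idx = x_values.index(2)
first_pos = cumulative[idx - 1] + 1 if idx > 0 else 1
last_pos = cumulative[idx]

print(n, cumulative, (first_pos, last_pos))
```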
2. Using the following data: -60, 10, 10, 20, -75, 5, 0, 10, 20, 0
a) Calculate the mean, mode, and variance
b) Compute the percentage value for each of the following k values using Chebyshev's rule, 1 - 1/k^2: 3/2, 5/2, 1.36, and 16.
3. a) Name the measures of spread and variability, and discuss the properties of each.
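The rule in part (b) is a one-line formula, so a short Python sketch may help when checking hand calculations. The function name is hypothetical, and the bound is only meaningful for k greater than 1.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("The bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (3 / 2, 5 / 2, 1.36, 16):
    print(f"k = {k}: at least {100 * chebyshev_lower_bound(k):.2f}%")
```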
b) Name 4 types of random samples.
4. In a government agency, 40% of the employees take public transportation to work. Also, 55% of the employees are female. It is assumed that these two characteristics are independent. Draw a tree diagram to illustrate and find the probability that an employee picked at random from this population will be:
a) Female and take public transportation to work.
b) Female and not take public transportation to work.
c) Male and take public transportation to work.
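Because the two characteristics are assumed to be independent, each branch of the tree is just a product of two probabilities. The Python sketch below enumerates the four branches; the dictionary layout and labels are illustrative choices, not part of the question.

```python
p_female = 0.55
p_transit = 0.40

# Each leaf of the tree: P(gender) * P(transport), valid under independence.
branches = {
    ("female", "public transit"): p_female * p_transit,
    ("female", "other"): p_female * (1 - p_transit),
    ("male", "public transit"): (1 - p_female) * p_transit,
    ("male", "other"): (1 - p_female) * (1 - p_transit),
}

for (gender, transport), p in branches.items():
    print(f"{gender:6s} and {transport:14s}: {p:.2f}")
```

The four leaves should add up to 1, which is a quick check on any hand-drawn tree.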
5. The minimum start-up cost for a fast food restaurant chain is as follows for several leading franchises, Units are in thousands of dollars.
Company Minimum Start-Up Cost
Kentucky Fried Chicken 80.0
Baskin-Robbins USA 134.0
Dairy Queen 405.0
Domino's Pizza 83.0
(Adapted from "Leading Franchises in 1992," The World Almanac and The Book of Facts, 1993, p.133)
a) Calculate the mean minimum start-up cost.
b) Calculate the standard deviation of the minimum start-up costs.
c) Remove the largest and smallest start-up costs from the data. How is the mean affected?
d) If the largest and smallest start-up costs are removed from the data, do you think that the standard deviation will increase or decrease? Calculate the new standard deviation with these two observations omitted and compare to part (b).
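The effect of dropping the extremes in question 5 can be explored with Python's standard `statistics` module, as in the sketch below. It uses only the four costs listed in the table above and assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes; use `statistics.pstdev` if your course divides by n.

```python
import statistics

# Minimum start-up costs in thousands of dollars, as listed in the table.
costs = {
    "Kentucky Fried Chicken": 80.0,
    "Baskin-Robbins USA": 134.0,
    "Dairy Queen": 405.0,
    "Domino's Pizza": 83.0,
}

values = sorted(costs.values())
mean_all = statistics.mean(values)
sd_all = statistics.stdev(values)

# Drop the smallest and largest observations.
trimmed = values[1:-1]
mean_trimmed = statistics.mean(trimmed)
sd_trimmed = statistics.stdev(trimmed)

print(f"all data: mean = {mean_all}, sd = {sd_all:.2f}")
print(f"trimmed:  mean = {mean_trimmed}, sd = {sd_trimmed:.2f}")
```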
6. The percentage of unemployed workers in each of 20 randomly selected cities are as follows:
3.1 4.5 8.2 1.4 6.3 1.8 2.4 8.8 1.9 2.4
5.4 3.8 7.2 3.8 4.6 7.2 2.5 4.8 3.7 4.2
a) Calculate the 20th percentile.
b) Calculate the 40th percentile.
c) Calculate the 65th percentile.
d) Calculate the 85th percentile.
7. Financial planners often recommend international mutual funds to boost the overall performance of an investor's portfolio. The year's performance figures for U.S. stock funds and for international stock funds are as follows, in percentages.
Year U.S. Stock Funds International Stock Funds
1983 21.77 27.73
1984 -1.20 -4.71
1985 28.52 44.44
1986 14.53 41.40
1987 1.17 7.02
1988 15.75 17.43
1989 25.09 21.98
1990 -6.11 -12.07
1991 36.67 12.51
1992 9.11 -4.48
1993 9.17 25.65
(Adapted from "International Aid," USA Today, October 4, 1993, p.3B.)
a) Calculate the mean, median, variance, standard deviation
b) Using part (a), describe to an investor the differences between the distributions of the performance figures for the two types of stock funds.
8. From the summarized table, please answer the following questions:
Experience               Bookkeeping   Reception   Word processing   Total
Less than one year       15            5           30                50
One to three years       5             10          5                 20
More than three years    5             15          10                30
Total                    25            30          45                100
One person's file is selected at random. Determine the probability that the selected person will fall into the following categories.
b) Less than one year of experience.
d) One to three years of experience.
e) Word processing.
f) More than three years of experience.
g) Assume the selected person has less than one year of experience. What is the probability that his or her skill is bookkeeping?
h) Determine Pr[bookkeeping and less than one year of experience]. Can the multiplication law be used to determine this probability?
i) Use the addition law to determine the probabilities for the following composite events:
1. Bookkeeping or reception 2. Word processing or reception
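The marginal, joint, and conditional probabilities asked for in question 8 all come from the same table of counts. The Python sketch below shows one way to read them off; the helper names are hypothetical, and the addition law is applied in its simple form because each file lists exactly one skill.

```python
# Counts from the table: experience level -> skill -> number of people.
table = {
    "less than one year": {"bookkeeping": 15, "reception": 5, "word processing": 30},
    "one to three years": {"bookkeeping": 5, "reception": 10, "word processing": 5},
    "more than three years": {"bookkeeping": 5, "reception": 15, "word processing": 10},
}
total = sum(sum(row.values()) for row in table.values())

def p_experience(level):
    """Marginal probability of an experience level (row total / grand total)."""
    return sum(table[level].values()) / total

def p_skill(skill):
    """Marginal probability of a skill (column total / grand total)."""
    return sum(row[skill] for row in table.values()) / total

def p_joint(level, skill):
    """Joint probability of an experience level and a skill."""
    return table[level][skill] / total

def p_skill_given_experience(skill, level):
    """Conditional probability of a skill given the experience level."""
    return table[level][skill] / sum(table[level].values())

print(p_skill_given_experience("bookkeeping", "less than one year"))
print(p_joint("less than one year", "bookkeeping"))
# If the two numbers below differ, the events are not independent, so the
# simple product P(A) * P(B) does not give the joint probability.
print(p_skill("bookkeeping") * p_experience("less than one year"))
# Addition law for mutually exclusive skills.
print(p_skill("bookkeeping") + p_skill("reception"))
```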
9. The 12-month performance of selected stock markets in the world, as measured by each country's Dow Jones Equity Market Index ending October 4, 1993, is as follows. Figures are in percentages.
Country 12-Month Performance
Hong Kong 36.3
United Kingdom 11.1
United States 13.3
Calculate and interpret each of the statistics in parts (a) through (f).
d) Standard deviation
e) Second quartile
f) 20th percentile
10. Advertising expenditures constitute one of the important components of the cost of goods sold. From the following data giving the advertising expenditures (in millions of dollars) of 50 companies, approximate the mean, median, and standard deviation of advertising expenditure.
Advertising Expenditure Number of companies
25 and under 35 5
35 and under 45 18
45 and under 55 11
55 and under 65 6
65 and under 75 10
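For grouped data like the table in question 10, the usual approximation treats every observation in a class as sitting at the class midpoint. The Python sketch below follows that approach, with two assumptions labeled in the comments: the standard deviation divides by n - 1, and the median is interpolated linearly within its class. The names are illustrative only.

```python
import math

# (lower bound, upper bound, frequency), in millions of dollars.
classes = [(25, 35, 5), (35, 45, 18), (45, 55, 11), (55, 65, 6), (65, 75, 10)]

n = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

# Approximate mean: frequency-weighted average of the class midpoints.
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Approximate sample standard deviation (assumption: divide by n - 1).
var = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
sd = math.sqrt(var)

def grouped_median(classes):
    """Median by linear interpolation inside the class holding the n/2-th value."""
    half = sum(f for _, _, f in classes) / 2
    below = 0  # cumulative frequency of the classes before the current one
    for lo, hi, f in classes:
        if below + f >= half:
            return lo + (half - below) / f * (hi - lo)
        below += f

median = grouped_median(classes)
print(f"mean = {mean:.2f}, median = {median:.2f}, sd = {sd:.2f}")
```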
11. The mean GMAT score of 65 applicants who were accepted into the MBA program of Xavier Business School was 520 with a standard deviation of 25. About how many applicants scored between 470 and 570 on the GMAT?
12. Given the following data, compute the following:
Age (in years) Frequency
20 - 24 10
25 - 29 19
30 - 34 27
35 - 39 16
40 - 44 10
45 - 49 6
50 - 54 5
55 - 59 3
e) First quartile
f) Second quartile
g) 45th percentile
13. The following table has been prepared from a survey of 100 companies in three industries:
Industry      IBM   Apple   IBM Compatible   Total
Banking       25    5       5                35
Electronics   10    20      10               40
Accounting    25    0       0                25
Total         60    25      15               100
To determine the primary type of microcomputer used by employees in the companies, suppose one of these 100 surveyed companies is chosen at random, and answer the following questions:
a) Find P(IBM)
b) Find P(IBM or Electronics)
c) Find P(Apple and Banking)
d) Find P(IBM and Apple)
Part II : Question 14-26 True/False
______14. . is not P(A/B)*P(B)
______16. is P(B/A)*P(A)
______17. The median always exists in a set of numerical data.
______19. means that A and B are mutually exclusive events.
______20. Mutually exclusive events imply that if one event occurs, the other cannot occur. An event (e.g., A) and its complement are always mutually exclusive.
______21. When events are independent, the multiplication rule is:
______22. When events are not mutually exclusive, the rule is:
______23. The rule of addition when events are mutually exclusive is:
______24. Events are dependent when the occurrence of one event changes the probability that another will occur.
______25. Events are independent when the occurrence of one event has no effect on the probability that another will occur.
______26. P(x) always satisfies 0 ≤ P(x) ≤ 1.
Part III : Multiple Choice
Questions 27- 36 refer to the following frequency distribution:
Annual Income Number of households
under $10,000 20
$10,000 - under $20,000 45
$20,000 - under $30,000 70
$30,000 - under $40,000 30
$40,000 - under $50,000 15
$50,000 - under $60,000 12
$60,000 - under $70,000 8
27. What is the frequency of the $20,000-under $30,000 class?
a. 20 b. 45 c. 70 d. 30
28. What is the width of each class?
a. $20,000 b. $5,000 c. $60,000 d. $10,000
29. What is the mid-point for the $40,000-under $50,000 class?
a. $45,000 b. $40,000 c. $42,000 d. 15
30. If we were to convert these data to a relative frequency distribution, what value would be associated with the $50,000-under $60,000 class?
a. 0.96 b. 0.06 c. 0.04 d. 0.12
31. For a cumulative frequency distribution (less than or within), what value would be associated with the $30,000-under $40,000 class?
a. 30 b. 35 c. 165 d. 135
32. For the values 10, 40, 20, 50, and 40, the value of the arithmetic mean is
a. 32 b. 40 c. 30 d. 34.5
33. For the data values, 20, 31, 45, 54, 38, 33, 22, 45, and 17, the mode is
a. 33 b. 34 c. 38 d. 45
34. The simplest measure of dispersion is the
a. Standard deviation. b. Variance.
c. Quartile deviation d. Range
35. The largest value in a set of data is 240, and the lowest value is 100. If the resulting frequency distribution is to have seven classes of equal width, what will be the class width?
a. 12 b. 7 c. 5 d. 20
36. For the values 10, 20, 30, 40, and 50, the average absolute deviation from the mean is
a. 10 b. 15 c. 12 d. 14
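The average absolute deviation in question 36 is the mean of the distances of each value from the arithmetic mean. A minimal Python sketch, with a hypothetical function name, is:

```python
def mean_absolute_deviation(values):
    """Average of |x - mean| over all values."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

print(mean_absolute_deviation([10, 20, 30, 40, 50]))  # 12.0
```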