Descriptive Statistics

Descriptive statistics is the process of summarizing data both quantitatively and visually. Quantitatively, this typically involves highlighting the main features of a collection of data, for example by calculating the mean, median and mode. Visually, data are represented through figures such as histograms and pie charts. The overarching goal of descriptive statistics is to characterize data so that their basic quantitative features can be properly understood.
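As a minimal sketch of the three summary measures named above, the following uses Python's standard `statistics` module. The sample values are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
import statistics

# Hypothetical sample, invented purely for illustration
sample = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(sample)      # sum of values divided by their count
median_value = statistics.median(sample)  # middle value (average of the two middle values here)
mode_value = statistics.mode(sample)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

The mean (5) sits above the median (4) in this sample because the single large value, 10, pulls the mean upward while leaving the median unchanged.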

In descriptive statistics, the method used to express data depends on the type of data being analyzed. Simply put, data can be separated into two general groups: categorical data and numerical data. When dealing with categorical data, the variables measured are usually either nominal or ordinal in nature. A nominal variable represents a category with no associated numerical value or order. For example, data collected on the different colours of hummingbirds in a sample would be nominal data. Ordinal variables, by contrast, have categories with an intrinsic order, usually expressed as levels or rankings. For example, suppose that in the hummingbird scenario only yellow hummingbirds were measured, and each was rated on the intensity of its yellow colouring as low, medium or high. The intensity rating, with its three ordered levels, would be an ordinal variable.

In contrast to categorical data, numerical data consist of variables that are either discrete or continuous. Discrete variables take countable values, typically whole numbers such as counts, whereas continuous variables can take any value within a range.
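The four variable types can be summarized differently, as the sketch below tries to show. All of the observations are invented for illustration (loosely following the hummingbird scenario), and the variable names are assumptions rather than part of any real data set.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations, invented for illustration only
colours = ["green", "yellow", "red", "yellow", "green", "yellow"]  # nominal
intensity = ["low", "high", "medium", "low", "medium", "medium"]   # ordinal
eggs_per_nest = [2, 2, 1, 3, 2, 2]                                 # discrete
wing_length_mm = [41.2, 39.8, 40.5, 42.1, 40.0, 41.7]              # continuous

# Nominal data: only counts per category are meaningful
colour_counts = Counter(colours)
most_common_colour = colour_counts.most_common(1)[0][0]

# Ordinal data: the levels can be ranked, so a median level makes sense
levels = ["low", "medium", "high"]
rank = {level: position for position, level in enumerate(levels)}
sorted_intensity = sorted(intensity, key=lambda level: rank[level])
median_intensity = sorted_intensity[len(sorted_intensity) // 2]  # upper-middle element

# Numerical data: arithmetic summaries such as the mean apply
mean_eggs = mean(eggs_per_nest)
mean_wing_length = mean(wing_length_mm)

print(most_common_colour, median_intensity, mean_eggs, round(mean_wing_length, 2))
```

Note that a mean colour would be meaningless, which is why the nominal variable is summarized only by counts.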

Descriptive statistics is an integral part of data analysis because it allows data to be represented visually, so that patterns and outliers can be identified. It is useful for any study that aims to describe the characteristics of the measures being collected. It is also critical to understand that descriptive statistics does not aim to draw inferences that extend beyond the sample data to the population at large. Rather, it serves solely to evaluate and visualize the data actually collected.

Categories within Descriptive Statistics

Statistics: measures of central tendency

Table 1 An insurance company evaluates many numerical variables about a person before deciding on an appropriate rate for automobile insurance. A representative from a local insurance agency selected a random sample of insured drivers and recorded, X, the number of claims each made in the last 3 years with the following results

statistical observation

A. Select a topic of interest - preferably college-related, such as the height of students or the number of hours they take to prepare for classes. B. Choose 2 variables to observe: 1) Female and 2) Male C. Collect 20 data points on each variable 1. Perhaps, you may design a questionnaire to collect your data 2. The design of you

MANOVA in Drug Selection

In this exercise, you are playing the role of a researcher who is testing a new medication designed to improve cholesterol levels. When examining cholesterol in clinical settings, we look at two numbers: low-density lipoprotein (LDL) and high-density lipoprotein (HDL). You may have heard these called "good" (HDL) and "bad" (LDL)

Dependent Sample T-test

In this activity, we are interested in finding out whether participation in a creative writing course results in increased scores of a creativity assessment. For this part of the activity, you will be using the data file "Activity 4a.sav". In this file, "Participant" is the numeric student identifier, "CreativityPre" contains cr

Correlation Coefficient for Absenteeism and Age

A personnel manager for a large corporation feels that there may be a relationship between absenteeism and age and would like to use the age of a worker to develop a model to predict the number of days absent during a calendar year. A random sample of 10 workers was selected with the results presented below: Age Da


A population has a mean of 180 and a standard deviation of 24. A sample of 64 observations will be taken. The probability that the sample mean will be between 183 and 186 is
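One way to approach this question is sketched below. It assumes the sample mean is approximately normally distributed, which is the usual central limit theorem argument for a sample of 64, and it uses `statistics.NormalDist` from the Python standard library. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the question
population_mean = 180
population_sd = 24
sample_size = 64

# Standard error of the sample mean: sigma / sqrt(n)
standard_error = population_sd / sqrt(sample_size)

# Assumed sampling distribution of the sample mean (normal approximation)
sampling_dist = NormalDist(mu=population_mean, sigma=standard_error)

# P(183 < sample mean < 186), i.e. between 1 and 2 standard errors above the mean
probability = sampling_dist.cdf(186) - sampling_dist.cdf(183)

print(standard_error, round(probability, 4))
```

Under these assumptions the standard error is 3, so the interval runs from z = 1 to z = 2 and the probability comes out at roughly 0.136.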

Coefficient of Determination Calculation

You are given the following information about y and x. Dependent Variable (y) Independent Variable (x) (4,5) (6,7) (2,9) (4,11) The coefficient of determination equals ________. Select one: A. 0.3162 B. -0.3162 C. 0.10

Regression and Point Estimates

The estimated regression line is ŷ = 11 - x. The point estimate of Y when X = 3 is Select one: A. 11 B. 14 C. 8 D. 0

The Purpose of Statistical Inference

The purpose of statistical inference is to provide information about the _______ Select one: A. sample based upon information contained in the population B. population based upon information contained in the sample C. population based upon information contained in the population D. mean of the sample based upon the mea


In a regression analysis, the error term e is a random variable with a mean or expected value of

Finding the Confidence Interval at 95% Confidence

A sample of 75 information system managers had an average hourly income of $40.75 with a standard deviation of $7.00. What is the 95% confidence interval for the average hourly wage of all information system managers?
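A rough sketch of the interval calculation follows. It uses the normal (z) critical value as a large-sample approximation, which is an assumption on my part. Because the standard deviation comes from the sample, a t critical value with 74 degrees of freedom would arguably be more appropriate and would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the question
sample_size = 75
sample_mean = 40.75
sample_sd = 7.00
confidence = 0.95

# Two-sided critical value from the standard normal (large-sample approximation)
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error: critical value times the estimated standard error
standard_error = sample_sd / sqrt(sample_size)
margin = z_critical * standard_error

lower_bound = sample_mean - margin
upper_bound = sample_mean + margin

print(round(lower_bound, 2), round(upper_bound, 2))
```

With the z approximation the interval is roughly $39.17 to $42.33 per hour.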


The following information regarding a dependent variable Y and an independent variable X is provided, where Σ denotes a sum (ΣX is the sum of the x values): ΣX = 90, Σ(Y - Ȳ)(X - X̄) = -156, ΣY = 340, Σ(X - X̄)² = 234, n = 4, Σ(Y - Ȳ)² = 1974, SSR = 104. The slope of the regression equation is
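A short sketch of how the slope could be obtained from these summary quantities, assuming the usual least-squares formulas: the slope is the sum of cross-products of deviations (-156) divided by the sum of squared X deviations (234), and the intercept is the mean of Y minus the slope times the mean of X. The variable names are illustrative.

```python
# Summary quantities stated in the question
n = 4
sum_x = 90
sum_y = 340
sum_cross_deviations = -156   # sum of (Y - mean of Y) * (X - mean of X)
sum_sq_x_deviations = 234     # sum of (X - mean of X) squared

# Least-squares slope and intercept
slope = sum_cross_deviations / sum_sq_x_deviations
intercept = sum_y / n - slope * (sum_x / n)

# Consistency check: the regression sum of squares equals slope^2 times the X deviations
ssr_check = slope ** 2 * sum_sq_x_deviations

print(round(slope, 4), round(intercept, 2), round(ssr_check, 2))
```

The check at the end reproduces the stated SSR of 104, which suggests the slope of about -0.667 is consistent with the rest of the information given.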


Random samples of size 81 are taken from an infinite population whose mean and standard deviation are 200 and 18, respectively. The distribution of the population is unknown. The mean and the standard error of the mean are?


The estimated regression line Y=11x. The point estimate of Y when x=3 is

Finding the Slope of Regression Line

The following information regarding a dependent variable Y and an independent variable X is provided, where Σ denotes a sum (ΣX is the sum of the x values): ΣX = 90, Σ(Y - Ȳ)(X - X̄) = -156, ΣY = 340, Σ(X - X̄)² = 234, n = 4, Σ(Y - Ȳ)² = 1974, SSR = 104. Find the slope of the regression equati

Sample confidence interval for a population mean with margin of error

a. Develop a 95% confidence interval for the mean number of miles driven until transmission failure for the population of automobiles with transmission failure. Provide a managerial interpretation of the interval estimate. b. Discuss the implication of your statistical findings in terms of the belief that some owners of the a

Building a confidence interval for population proportion

a. Develop 95% confidence intervals for the proportion of subscribers who have broadband access at home and the proportion of subscribers who have children. b. Would Young Professional be a good advertising outlet for online brokers? Please use statistical data. c. Would this magazine be a good place to advertise for compa

Implication of negative correlation

Assuming a linear relationship between X and Y, if the coefficient of correlation (r) equals -0.30, A. there is no correlation B. the slope (b1) is negative C. variable X is larger than variable Y D. the variance of X is negative

Linear modelling and coefficient of determination

The managers of a brokerage firm are interested in finding out if the number of clients a broker brings into the firm affects the sales generated by the broker. They sampled 12 brokers and determined the number of new clients they have enrolled in their last year and their sales amount in thousands of dollars. Clients

Hartley test and chi-square hypothesis test

Hi, Could I please have help with the attached question? It is about the Hartley test for equality of variances and chi-square hypothesis testing in descriptive statistics.

Determining the Required Sample Size

An economist is interested in studying the incomes of consumers in a particular region. The population standard deviation is known to be $1,000. A random sample of 50 individuals resulted in an average income of $15,000. What sample size would the economist need to use for a 95% confidence interval if the width of the interval s

Library of Congress

The head librarian at the Library of Congress has asked her assistant for an interval estimate of the mean number of books checked out each day. The assistant provides the following interval estimate: from 740 to 920 books per day. If the head librarian knows that the population standard deviation is 150 books checked out per da

Make use of dummy variable in the regression analysis

First, explain in your own words (no direct quotes, please) what a dummy variable is and its purpose in regression analysis. Secondly, provide an example of where you might use a dummy variable from your own professional experience. Thirdly, briefly describe how you would implement a dummy variable in a data table you intend to

Relationship Between Sampling Mean and Population Mean

A drug manufacturer knows that for a certain antibiotic, the average number of doses ordered for a patient is 20. Steve Simmons, a salesman for the company, after looking at 1 day's prescription orders for the drug in his territory, announced that the sample mean for this drug should be lower. He said, "For any sample, the mean s