Introduction to Confidence Intervals
The purpose of taking a random sample from a lot or population and computing a statistic, such as the mean from the data, is to approximate the mean of the population. How well the sample statistic estimates the underlying population value is always an issue. A confidence interval addresses this issue because it provides a range of values which is likely to contain the population parameter of interest.
Confidence intervals are constructed at a confidence level, such as 95%, selected by the user. What does this mean? It means that if the same population is sampled on numerous occasions and interval estimates are made on each occasion, the resulting intervals would bracket the true population parameter in approximately 95% of the cases. It is important to recognize that a 95% confidence level does NOT mean that there is a 95% probability that the interval contains the true mean. Common choices for the confidence level are 90%, 95%, and 99%.
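The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population values (`true_mean`, `sigma`), the sample size, and the number of trials are arbitrary assumptions, and it uses the known-standard-deviation interval for a normal population. It draws many samples, builds an interval from each, and reports the fraction of intervals that contain the true mean.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=50.0, sigma=2.0, n=10, level=0.95, trials=10_000, seed=0):
    """Fraction of simulated intervals that contain the true mean.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Standard normal quantile that leaves (1 - level) / 2 in the upper tail
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = fmean(sample)
        # Count the interval as a hit if it brackets the true mean
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials


print(f"observed coverage: {coverage():.3f}")
```

With enough trials, the observed fraction should settle close to the chosen confidence level.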
In practice, sample size calculation is often a matter of bargaining between the statistician and the researcher (physician, agriculture expert, biologist, sociologist, engineer, etc.), the attention of the former being focused mainly on the power of the statistical test and that of the latter mainly on the feasibility of the study. In addition, the use of confidence intervals is strongly recommended in reporting results, because confidence intervals convey information about the magnitude and the precision of an effect simultaneously, keeping these two aspects of measurement closely linked.
II. Confidence Interval for Normal Distribution
For a normal distribution, there are two parameters that we want to estimate: the population mean $\mu$ and the standard deviation $\sigma$.
1. Confidence Intervals for Unknown Mean and Known Standard Deviation
Example 1. A scientist measured the concentration of a certain chemical in a liquid. He obtained concentrations of 52.5 ppm, 51.7 ppm, 53.1 ppm, 50.9 ppm, 50.5 ppm, and 52.2 ppm on 6 different samples of the liquid. He knows that the standard deviation for the experimental procedure is 1.2 ppm. Assuming that the measurements follow a normal distribution, what is the confidence interval for the population mean at the 90%, 95%, and 99% confidence levels?
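The scientist wishes to estimate the true mean concentration of the chemical, $\mu$, using the mean that he obtained from his measurements, $\mu_e$. Because the measurements are assumed to be normally distributed (and, for large samples, by the Central Limit Theorem more generally), the sample mean $\mu_e$ has a normal distribution whose mean is the same as the mean of the population, $\mu_M = \mu$, and whose standard deviation is $\sigma_M = \sigma/N^{1/2}$. The sample size is $N = 6$ in this case, so $\sigma_M = \sigma/N^{1/2} = 1.2/6^{1/2} \approx 0.49$ ppm.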
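The two quantities needed for Example 1, the sample mean and the standard deviation of the sample mean, can be computed directly from the six measurements. This is a minimal sketch; the variable names are illustrative choices, not part of the original text.

```python
from math import sqrt
from statistics import fmean

# Measurements from Example 1, in ppm
measurements = [52.5, 51.7, 53.1, 50.9, 50.5, 52.2]
sigma = 1.2  # known standard deviation of the experimental procedure, in ppm

n = len(measurements)
sample_mean = fmean(measurements)   # estimate of the population mean
std_error = sigma / sqrt(n)         # standard deviation of the sample mean

print(f"N = {n}, sample mean = {sample_mean:.2f} ppm, "
      f"standard deviation of the mean = {std_error:.2f} ppm")
# prints: N = 6, sample mean = 51.82 ppm, standard deviation of the mean = 0.49 ppm
```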
For each observed value $\mu_e$ of the sample mean, the confidence interval for $\mu$ is given in the form $(L, U)$, where $L$ and $U$ are the lower and upper limits. Let's first consider a confidence level of 90%, or $\alpha = 0.1$. This means that if the experiment is repeated over and over again, and the confidence interval is computed for every experiment, in the long run about 90% of these intervals would contain the true mean. Due to symmetry, this also means that we find $\mu > U$ only 5% of the time and $\mu < L$ only 5% of the time.
So, we must estimate U large enough so the probability of finding  > U is 5%. To satisfy this condition, it has been mathematically proven that U must be e+ZM, where Z, the standard "error " or "core" in a normal distribution cutting off the upper /2 of the probability. For the lower limit, L=e-ZM.
The value $Z$ represents the point on the normal density curve $N(\mu, \sigma)$ such that the probability of observing a value greater than $\mu + Z\sigma$ is equal to $\alpha/2$. For example, if $\alpha/2 = 0.05$, the value $Z$ such that $P(x > \mu + Z\sigma) = 0.05$, or $P(x < \mu + Z\sigma) = 0.95$, is approximately 1.65. For a confidence interval with level $C$, the value $\alpha$ is equal to $(1 - C)$. ...
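As a worked illustration, the limits for Example 1 can be computed at all three confidence levels by taking the sample mean plus or minus the Z value times the standard deviation of the sample mean. The sketch below uses Python's standard-library `statistics.NormalDist` to obtain Z; the helper name `z_interval` is a hypothetical choice for this example. Note that the computed Z for the 90% level is about 1.645, which the text rounds to 1.65.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Measurements from Example 1, in ppm
measurements = [52.5, 51.7, 53.1, 50.9, 50.5, 52.2]
sigma = 1.2  # known standard deviation of the experimental procedure, in ppm


def z_interval(data, sigma, level):
    """Return (lower, upper, z) for a known-sigma interval at the given level."""
    alpha = 1 - level
    # Standard normal point with alpha / 2 of the probability above it
    z = NormalDist().inv_cdf(1 - alpha / 2)
    center = fmean(data)
    half_width = z * sigma / sqrt(len(data))
    return center - half_width, center + half_width, z


for level in (0.90, 0.95, 0.99):
    lower, upper, z = z_interval(measurements, sigma, level)
    print(f"{level:.0%}: Z = {z:.3f}, interval = ({lower:.2f}, {upper:.2f}) ppm")
```

As expected, the interval widens as the confidence level increases, since a larger Z is needed to leave less probability in the tails.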