# Determining Statistical Significance and Random Distribution

I am working on a research paper for forensics where I am studying striated toolmark patterns created by a tool on various types of materials. I have summarized my data to show the total number of matching striations (lines), the percentage of matching lines and the occurrences of consecutively matching striae (CMS). CMS has been shown through more than 40 years of research to be the key component in determining whether or not a toolmark was made by a particular tool.

From this research a criterion was developed which, briefly, states that if CMS runs of a certain size occur when microscopically comparing two toolmarks, it can be stated that the toolmarks were in fact made by the same tool. My data supports all of the past research; however, I am stuck on three points: Is it necessary to know if your data is randomly distributed? What statistical test do you use to determine if your data is randomly distributed? Lastly, how do you determine the statistical significance of two values?

Specifically, I have determined the probabilities for the CMS run combinations in the most conservative match and in the best-known non-match. The probabilities are 0.0739 and 0.0004, respectively. Although it seems likely that the difference between these values is statistically significant, I want to prove it. I was thinking of using the t-test, but I am not sure if that is the correct test to apply in my case. Your assistance would be greatly appreciated.

#### Solution Preview

Please see response attached. I hope this helps and take care.

RESPONSE:

1. Is it necessary to know if your data is randomly distributed? What statistical test do you use to determine if your data is randomly distributed?

Let's say that your research question was this: What is the likelihood that the toolmarks made by a randomly selected tool of the same type would do as good a job as the toolmarks made by the suspect tool at matching the characteristics of the evidence toolmark? In other words, you would randomly select a tool from a group of tools, meaning that each tool has an equal chance of being chosen. Random selection of the tools is then part of the research design, so the randomness of the data follows from the design itself and no separate test is necessary. See http://www.statcan.ca/english/edu/power/ch13/probability/probability.htm#size for other types of sampling (in addition to the simple random selection described above).

You probably used some type of random sampling, which is necessary if you want to generalize to the general population. If so, this requirement is already met.
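To make "each tool has an equal chance of being chosen" concrete, here is a minimal sketch of simple random selection in Python. The tool labels, the pool size of 20, and the function name `select_tools` are all hypothetical and only serve as an illustration; they are not taken from the study.

```python
import random

def select_tools(tool_ids, k, seed=None):
    """Draw a simple random sample of k tools without replacement.

    Every tool in tool_ids has the same chance of being selected.
    A seed can be supplied so the selection is reproducible and can
    be documented in the paper.
    """
    rng = random.Random(seed)
    return rng.sample(list(tool_ids), k)

# Hypothetical inventory of 20 tools of the same type.
tools = [f"tool_{i:02d}" for i in range(1, 21)]

# Select 5 tools for comparison against the evidence toolmark.
chosen = select_tools(tools, k=5, seed=42)
print(chosen)
```

Recording the seed alongside the list of selected tools is one way to show that the selection was random by design.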

2. Lastly, how do you determine the statistical significance of two values?

You have determined probabilities of 0.0739 for the CMS run combinations in the most conservative match and 0.0004 in the best-known non-match, and you want to show that the difference between them is statistically significant. If you want to compare two percentages such as these, you are correct: you would use a t-test between percents to test the significance. Note that the test needs the sample size behind each percentage, not just the two percentages themselves.
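As a rough illustration of such a comparison, the sketch below computes a pooled two-sample test between two proportions using the large-sample normal approximation, which is how this kind of test between percents is commonly calculated. Treat it as a sketch under assumptions rather than the exact method of the calculator mentioned here. The counts (739 and 4 out of 10,000 comparisons each) are placeholders chosen only so that the proportions equal 0.0739 and 0.0004; the real denominators have to come from your own data.

```python
import math

def two_proportion_test(x1, n1, x2, n2):
    """Pooled two-sample test between proportions (normal approximation).

    x1, x2: number of "successes" in each sample
    n1, n2: sample sizes
    Returns the test statistic and the two-sided p-value.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, NOT from the study: 739/10000 = 0.0739, 4/10000 = 0.0004.
z, p_value = two_proportion_test(739, 10_000, 4, 10_000)
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
```

With smaller denominators the same two percentages can give a very different result, which is why the sample sizes matter as much as the percentages.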

See the on-line percentage calculator (The Statistics Calculator: Statistical Analysis Tests At Your Fingertips) at http://www.statpac.com/statistics-calculator/percents.htm. Also see the information below about comparing percentages and related examples.

FINAL COMMENTS

I hope this helps, and take care. For a related article, see http://64.233.161.104/search?q=cache:pBKFejEwsCMJ:www.stlr.org/html/volume6/schwartz.pdf+consecutively+matching+striae+(CMS)+random+sample&hl=en.

Percents menu (excerpted from on-line source)

Percents are understood by nearly everyone, and therefore, they are the most popular statistics cited in research. Researchers are often interested in comparing two percentages to determine whether there is a significant difference between them.

The Percents menu has three selections:

· One sample t-test between percents

· Two sample t-test between percents

· Confidence interval around a percent
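The third selection, a confidence interval around a percent, can be sketched with the standard normal-approximation (Wald) interval. This is an assumption about the method, not a description of what the calculator does internally, and the numbers (120 "Yes" answers out of 200 respondents) are hypothetical.

```python
import math

def percent_confidence_interval(x, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    x: number of "successes", n: sample size,
    z: critical value (1.96 corresponds to roughly 95% confidence).
    Returns (lower, upper), clipped to the range [0, 1].
    """
    p = x / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical example: 120 of 200 respondents answered "Yes" (60%).
lower, upper = percent_confidence_interval(120, 200)
print(f"60% with approximate 95% CI: {lower:.3f} to {upper:.3f}")
```

This approximation becomes unreliable for percentages very close to 0 or 1 or for small samples, so it is best read as a first check.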

Choosing the Proper Test

There are two kinds of t-tests between percents. Which test you use depends upon whether you're comparing percentages from one or two samples.

Every percentage can be expressed as a fraction. By looking at the denominator of the fraction we can determine whether to use a one-sample or two-sample t-test between percents. If the denominators used to calculate the two percentages represent the same people, we use a one-sample t-test between percents to compare the two percents. If the denominators represent different people, we use the two-sample t-test between percents.

For example, suppose you did a survey of 200 people. Your survey asked,

Were you satisfied with the program?

___ Yes ___ ...

#### Solution Summary

In reference to presenting research findings, this solution explains whether or not a researcher needs to know if the data is randomly distributed, which statistical test(s) to use to determine if the data is randomly distributed, and how to determine the statistical significance of two values. It is supplemented with a highly informative article expanding on these ideas.