# Applied Biostatistics

See attached file for datasets on low birth weight and related risk factors.

* Must do a complete data analysis and report.

* Can use any statistical software, but R software is preferred. If software other than R is used, then detailed descriptions of techniques used are to be provided.

* Details of the requirements are included in the following document along with the dataset in .xls format.

------------------------------------------------------------------------------------------

Data Analysis Project Requirements

The final report will consist of:

1. a short description of the biological question

2. a short description of the experimental or sampling protocol used to obtain the data

3. a description of the data (number of observations, number of variables, units)

4. an electronic copy of the data set (R data file)

5. a statement of the null hypothesis

6. a justification for the choice of analysis

7. salient statistical results (most likely graphs and extracted R output)

8. statement and examination of assumptions of statistical tests used

9. statistical interpretation of results, description of effect size and power if appropriate

10. biological interpretation of results.

------------------------------------------------------------------------------------------

Data Analysis Project

(1) Biological question and hypothesis

It is generally accepted that low birth weight (less than 2500 grams or about 5.5 pounds) is an indicator of the general health of newborns and a major risk factor for perinatal and infant mortality, as well as childhood disability and other health problems.

The question to be studied for this project will consider low birth weight newborns (less than 2500 grams) and analyze the risk factors associated with low birth weight. The analysis will examine the relationship between low birth weight newborns and the age of the mother, weight of mother at the last menstrual period, smoking status of the mother during pregnancy and the number of physician visits during the first trimester.

The hypothesis postulated is that the low birth weight newborns are positively correlated with the mother's behavior during pregnancy (smoking status, number of physician visits during the first trimester) while the age of the mother and weight in pounds at the last menstrual period do not influence the incidence of low birth weight newborns.

(2) Experimental protocol

The data for this project were collected at the Baystate Medical Center in Springfield, Massachusetts, during 1986.

For each birth recorded in the database, there is information on the selected variables of interest. Data are accepted as is, without any knowledge of potential bias in capturing or recording the data. The study will ascertain whether the selected independent variables were risk factors for low birth weight newborns in the particular data set being considered.

The dataset will be summarized with descriptive statistics of the selected variables. In addition, correlation and regression analysis will be used to analyze the relationships between the dependent and independent variables.
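The descriptive and correlation steps described above could be sketched as follows. This is only an illustration in standard-library Python (the assignment prefers R, where `summary()` and `cor()` cover the same ground); the function names `describe` and `pearson_r` are made up here, and each argument is assumed to be one column of the data set read into a plain list of numbers.

```python
import math
import statistics


def describe(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length columns."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

For example, `describe(age)` would summarize the Age column and `pearson_r(smoke, birthweight)` would give the correlation between Smoke and Birthweight, assuming `age`, `smoke`, and `birthweight` are lists loaded from the data file.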

NOTE - Suggested analysis - do analysis using both approaches:

1. Using the "Low" variable (i.e. 1 if low birth weight and 0 if not), then you can set up a model: Low = Age + Weight + Smoke + Visits + interactions

- can be analyzed with logistic regression (a generalized linear model)

2. Using the "Birthweight" variable (i.e. continuous variable of actual birth weights in grams), then a possible model is:

Birthweight = Age + Weight + Smoke + Visits + interactions

- can be analyzed with multiple regression and ANCOVA (analysis of covariance, a general linear model)
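Written out as equations, the two suggested models might look like the sketch below, shown with main effects only. The coefficient symbols $\beta_i$ and $\gamma_i$ and the error term $\varepsilon$ are notation introduced here, not taken from the assignment; interaction terms (for example Age × Smoke) would enter as additional product terms.

$$
\ln\frac{P(\text{Low}=1)}{1-P(\text{Low}=1)} = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Weight} + \beta_3\,\text{Smoke} + \beta_4\,\text{Visits}
$$

$$
\text{Birthweight} = \gamma_0 + \gamma_1\,\text{Age} + \gamma_2\,\text{Weight} + \gamma_3\,\text{Smoke} + \gamma_4\,\text{Visits} + \varepsilon
$$

In the first model, $e^{\beta_3}$ is the odds ratio of low birth weight for smokers versus non-smokers with the other variables held fixed. In the second, $\gamma_3$ is the corresponding difference in mean birth weight in grams.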

------------------------------------------------------------------------------------------

Data

The data for this project were collected at the Baystate Medical Center in Springfield, Massachusetts, during 1986. These data are copyrighted by John Wiley & Sons Inc. The data were collected on 189 women, 59 of whom had low birth weight babies and 130 of whom had normal birth weight babies.

In this study, the dependent variable is whether a newborn has a low birth weight (<2500 g). The independent variables are the age of the mother in years, the weight of the mother in pounds at the last menstrual period, smoking status during pregnancy, and the number of physician visits during the first trimester.

------------------------------------------------------------------------------------------

Dataset variables and descriptions:

| Variable Name | Variable Description |
| --- | --- |
| Low | Low birth weight (1 if birth weight <2500 g, else 0) |
| Age | Age of mother in years |
| Weight | Weight of mother in pounds at the last menstrual period |
| Smoke | Smoking status of mother during pregnancy (1=Yes, 0=No) |
| Visits | Number of physician visits during the first trimester (0=none, 1=one, 2=two, etc.) |
| Birthweight | Birthweight in grams |
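Before any modeling, the variable definitions above can be used as a quick consistency check on the data. The sketch below assumes each row has been read into a dictionary keyed by the variable names; `check_record` and `count_low` are hypothetical helper names, not part of the assignment.

```python
LOW_THRESHOLD_G = 2500  # grams, from the definition of the Low variable


def check_record(rec):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    expected_low = 1 if rec["Birthweight"] < LOW_THRESHOLD_G else 0
    if rec["Low"] != expected_low:
        problems.append("Low does not match Birthweight")
    if rec["Smoke"] not in (0, 1):
        problems.append("Smoke must be 0 or 1")
    if not isinstance(rec["Visits"], int) or rec["Visits"] < 0:
        problems.append("Visits must be a non-negative integer")
    if rec["Age"] <= 0 or rec["Weight"] <= 0:
        problems.append("Age and Weight must be positive")
    return problems


def count_low(records):
    """Return (number of low birth weight births, number of other births)."""
    low = sum(r["Low"] for r in records)
    return low, len(records) - low
```

If the file matches the counts given in the Data section, `count_low` applied to all 189 records should return `(59, 130)`.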

------------------------------------------------------------------------------------------

References

M. C. McCormick, "The contribution of low birth weight to infant mortality and childhood morbidity," New England Journal of Medicine, Vol. 312, no. 2 (1985), pp. 82-90.

M. D. Overpeck, A. J. Moss, H. J. Hoffman, et al., "A comparison of the childhood health status of normal birth weight and low birth weight infants," Public Health Reports, Vol. 4, no. 1 (1989), pp. 58-70.

https://brainmass.com/statistics/regression-analysis/applied-biostatistics-349347

#### Solution Preview

See attached files.

I used the Regression tools in SPSS for the analyses. The results should correspond to any regression done in R. I've attached a Word document with my analyses and conclusions and two pdf files with the SPSS output (some of the tables and graphs from the pdf files are also in the Word file).

-------------------------------------------------

Introduction and Hypotheses

This analysis examines the relationship between low birth weight newborns and several independent variables:

* age of the mother (Age = age in years)

* weight of mother at the last menstrual period (Weight = weight in pounds)

* smoking status of the mother during pregnancy (Smoke = 1 if the mother smoked, 0 if she did not)

* number of physician visits during the first trimester (Visits = number of physician visits)

Note: the dependent variable is birth weight, recorded in two ways:

* Low = 1 if the baby has a low birth weight, 0 if it does not

* Birthweight = birth weight in grams

The hypothesis is that the low birth weight newborns are positively correlated with the mother's behavior during pregnancy (smoking status, number of physician visits during the first trimester) while the age of the mother and weight in pounds at the last menstrual period do not influence the incidence of low birth weight newborns. (Note: These are the hypotheses set forth in the assignment. These are not necessarily what I would have guessed would be true.)

Based on these predictions, we expect to see significant correlations between birth weight and mother's behavior and no significant correlation between birth weight and the mother's age or weight:

Hypotheses

(1) Low is positively correlated with smoking; Birthweight is negatively correlated with smoking

(2) Low is positively correlated with physician visits; Birthweight is negatively correlated with physician visits

(3) Low is not correlated with age of the mother; Birthweight is not correlated with age of the mother

(4) Low is not correlated with weight of the mother; Birthweight is not correlated with weight of the mother

For each variable, the null hypothesis is that there is no correlation ...
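One common way to test a null hypothesis of zero correlation is a t-test on the sample Pearson correlation. The sketch below assumes that test; the preview does not say which test was actually used. Here $r$ is the sample correlation, $\rho$ the population correlation, and $n = 189$ the number of women in the data set.

$$
H_0:\ \rho = 0, \qquad t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad \text{df} = n - 2 = 187
$$

Under $H_0$ and the usual assumptions, $t$ follows a Student t distribution with 187 degrees of freedom. When one variable is binary (Low or Smoke), $r$ is the point-biserial coefficient, and when both are binary it is the phi coefficient.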

#### Solution Summary

The solution analyses the data using logistic regression analysis and multiple regression analysis.
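For readers without SPSS, the two techniques named above can be sketched in standard-library Python. This is not the solution's own code: it is a minimal illustration that fits main effects only, with no standard errors, p-values, or safeguards against separation or collinearity. The helper names (`solve`, `design`, `fit_ols`, `fit_logistic`) are invented for this sketch. In R, `lm()` and `glm(..., family = binomial)` would be the usual tools.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design(rows):
    """Prepend an intercept column of ones to each row of predictors."""
    return [[1.0] + [float(v) for v in row] for row in rows]


def fit_ols(rows, y):
    """Multiple regression coefficients from the normal equations X'X b = X'y."""
    X = design(rows)
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def fit_logistic(rows, y, iterations=25):
    """Logistic regression coefficients by Newton-Raphson on the log-likelihood."""
    X = design(rows)
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r))))
                 for r in X]
        grad = [sum(r[i] * (yi - pi) for r, yi, pi in zip(X, y, probs))
                for i in range(p)]
        hess = [[sum(r[i] * r[j] * pi * (1.0 - pi) for r, pi in zip(X, probs))
                 for j in range(p)] for i in range(p)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

With `rows` holding one `[Age, Weight, Smoke, Visits]` list per woman, `fit_logistic(rows, low)` would return the intercept followed by the four log-odds coefficients, and `fit_ols(rows, birthweight)` the intercept followed by the four slopes in grams per unit of each predictor.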