See attached data file.
Assume that you are working on a team that has been commissioned by a large school district to collect and analyze data related to a recent curriculum experiment designed to improve student scores on statewide standardized tests. The schools in this district are predominantly large, urban schools. The district wants to know how successful the experiment was and whether the new curriculum should be incorporated district-wide. Three years ago, the school district rolled out the experimental curriculum to 100 of the 400 elementary schools in the district. Those 100 schools were selected via a simple random sample. The working budget is not large enough to collect data on the population of 400 schools, and you can only afford to collect a sample of 80 schools. Unless the question states otherwise, conduct all analyses at the 95% confidence level (α = .05).
You think that the best sampling strategy is stratified sampling. You would like to list the characteristics of schools in this district and randomly select 80 schools that roughly match the demographic characteristics of the entire population of schools in this district. Forty (40) of these would come from schools that had the experimental curriculum and 40 would come from schools that kept the old curriculum.
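The stratified design described above can be sketched in a few lines. This is a minimal illustration, not the district's actual procedure: the sampling frame, the field names `id` and `curriculum`, and the function `stratified_sample` are all hypothetical, and the sketch stratifies on curriculum only. Matching on demographic characteristics would add further strata inside each curriculum group.

```python
import random

def stratified_sample(schools, per_stratum=40, seed=0):
    """Draw an equal-size simple random sample from each curriculum stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for curriculum in ("new", "old"):
        stratum = [s for s in schools if s["curriculum"] == curriculum]
        sample.extend(rng.sample(stratum, per_stratum))  # sampling without replacement
    return sample

# Hypothetical sampling frame: 400 school IDs, the first 100 flagged "new".
frame = [{"id": i, "curriculum": "new" if i <= 100 else "old"} for i in range(1, 401)]
sample = stratified_sample(frame)
print(len(sample), sum(s["curriculum"] == "new" for s in sample))
```

Because the two strata are sampled at different rates (40 of 100 versus 40 of 300), any district-wide estimate built from this sample would need to weight the strata accordingly.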
However, people from the school board have made the following statements regarding sampling:
1. One well-meaning school board member has argued, "... because we are interested in how successful the curriculum is, we should randomly sample 80 schools with the experimental curriculum."
2. The school board member who represents a middle class neighborhood has said, "These sorts of exercises always focus on improving schools in poor areas, but what is most important is how these reforms affect the typical middle class kid at the typical middle class school. You should only sample schools whose average family income is close to the overall district-wide average."
3. Finally, one school board member represents a district where a number of textbook publishers are located. This member stated, "If the new curriculum is adopted then the textbook producers in the city will benefit by getting tons of orders for new books. You need to think about the greater good here; adopting the new curriculum would be a boon to the city in these hard financial times. You should sample the best performing schools with the new curriculum and the worst performing schools with the old curriculum in the sample...the bigger we can make the impact look, the greater the support will be to adopt the new curriculum district wide."
Describe what is wrong methodologically with each of the three suggestions you received from the various school board members. You will want to focus on the following issues:
1. sampling biases that will be introduced from such sampling methods, and
2. how the biases will influence data/data analysis
Divide the essay into three subsections, one per school board member, each discussing these issues.
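A small simulation can make the bias in the third suggestion concrete. The sketch below assumes a hypothetical population in which the new curriculum has no effect at all (every school's score change is drawn from the same distribution, with a made-up mean of 0 and standard deviation of 5) and compares a random sample with the cherry-picked one. None of these numbers come from the district's data.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population with NO true curriculum effect: every school's
# score change comes from the same distribution (mean 0, SD 5).
new_schools = [rng.gauss(0, 5) for _ in range(100)]
old_schools = [rng.gauss(0, 5) for _ in range(300)]

def mean_diff(new, old):
    """Difference in mean score change, new minus old."""
    return statistics.mean(new) - statistics.mean(old)

# Random sampling within each curriculum group.
random_diff = mean_diff(rng.sample(new_schools, 40), rng.sample(old_schools, 40))

# Suggestion 3: best 40 "new" schools versus worst 40 "old" schools.
cherry_diff = mean_diff(sorted(new_schools)[-40:], sorted(old_schools)[:40])

print(f"random sampling difference: {random_diff:.2f}")
print(f"cherry-picked difference:   {cherry_diff:.2f}")
```

Under these assumptions the cherry-picked comparison reports a large "effect" even though none exists, which is the sense in which selecting on the outcome invalidates any later hypothesis test.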
Having convinced the board members that a stratified sample is the most appropriate, you collect data from the 80 schools.
The spreadsheet provided contains the following data.
The difference between average school test scores three years ago and average school test scores today is recorded as "chgtestscores." Positive values of chgtestscores indicate an increase in test scores at the school as compared to 3 years ago, while a negative number indicates that the school is now performing worse on these tests. In addition to this variable, there are three more variables in this data set. The first is "curriculum," which can have a value of "old" or "new," where new indicates the experimental curriculum. Next is "income" ($000), which represents the average annual income (in thousands of dollars) of the households of students from each school. Finally, there is a variable called "Schools' ID #," which is simply the ID number of the elementary school.
The first step in this analysis is to generate some descriptive measures.
1. Show the breakdown of the sample by curriculum type (old vs. new/experimental) with a summary table that contains frequency data and percentage data, and with a graph.
2. Show the percentage distribution of the change in test scores across all schools with the appropriate table and graph.
3. Show the percentage distribution of income across all schools with the appropriate table and graph.
Additionally, for parts 4, 5, and 6, generate three descriptive summary tables:
4. Create one table that calculates descriptive summary results of the "change in test scores" and "income" variables across all 80 schools.
5. Create a second table that calculates descriptive summary results of the "change in test scores" and "income" for the old curriculum.
6. Create a third table that calculates descriptive summary results of the "change in test scores" and "income" for the new curriculum.
Based on the tables and graphs created in parts 1-6:
7. What preliminary conclusions can you draw regarding the effectiveness of the experimental curriculum? Remember to include data to support the conclusion.
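The descriptive tables requested above can be prototyped outside the spreadsheet. The sketch below is only illustrative: the eight rows are made-up stand-ins for the 80 schools, and the helper names (`frequency_table`, `percent_distribution`, `summarize`) and the class width of 2 points are assumptions, not part of the assignment.

```python
from collections import Counter
import statistics

# Hypothetical rows standing in for the spreadsheet:
# (curriculum, chgtestscores, income in $000).
rows = [
    ("old", -2.1, 28.0), ("old", 0.4, 55.0), ("old", -1.3, 41.5), ("old", -3.0, 19.0),
    ("new", 3.2, 30.0), ("new", 1.1, 62.0), ("new", 4.5, 22.5), ("new", 2.0, 47.0),
]

def frequency_table(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: (n, 100.0 * n / total) for k, n in counts.items()}

def percent_distribution(values, width):
    """Bin numbers into classes [lo, lo + width) and return {lo: percent}."""
    bins = Counter((v // width) * width for v in values)
    return {lo: 100.0 * n / len(values) for lo, n in sorted(bins.items())}

def summarize(values):
    """Basic descriptive statistics for one numeric variable."""
    return {"n": len(values), "mean": statistics.mean(values),
            "median": statistics.median(values), "sd": statistics.stdev(values),
            "min": min(values), "max": max(values)}

print(frequency_table([r[0] for r in rows]))
print(percent_distribution([r[1] for r in rows], width=2))
for label in ("all", "old", "new"):
    subset = [r for r in rows if label == "all" or r[0] == label]
    print(label, summarize([r[1] for r in subset]), summarize([r[2] for r in subset]))
```

The three passes of the final loop correspond to the three summary tables: all schools, old curriculum only, and new curriculum only.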
One of the criticisms leveled against the old curriculum is that it was outdated. It was so outdated, the board members argued, that it was causing standardized test scores to fall. You decide to test this hypothesis:
1. Type the null and alternative hypotheses (H0 and H1) with the appropriate numerical value and statistical symbols, along with a sentence that explicitly describes the null hypothesis and the alternative hypothesis. The hypothesized value under the null hypothesis is zero.
2. Test the hypothesis that the mean change in test scores in schools using the old curriculum was less than zero (0).
3. Calculate the p-value associated with the test statistic from part 2.
4. Interpret the results with an explanation that includes data to support the conclusion.
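The steps above amount to a one-sample, lower-tailed t-test of H0: μ = 0 against H1: μ < 0, where μ is the mean change in test scores for old-curriculum schools. The sketch below is a rough standard-library illustration, not the spreadsheet procedure the assignment expects: the function names are hypothetical, the three data values are made up, and the t distribution is integrated numerically with Simpson's rule rather than looked up in a table.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def lower_tail_t_test(values, mu0=0.0):
    """Return (t statistic, p-value) for H0: mu = mu0 versus H1: mu < mu0."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    t_stat = (statistics.mean(values) - mu0) / se
    return t_stat, t_cdf(t_stat, n - 1)

# Made-up score changes; the real input would be chgtestscores for "old" schools.
t_stat, p_value = lower_tail_t_test([-1.0, -2.0, -3.0])
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

The null hypothesis is rejected at α = .05 when the p-value is below .05; because the test is lower-tailed, only a negative t statistic can lead to rejection.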
Because the school board's primary concern is whether the experimental curriculum led to better standardized test scores, the next step is to conduct a simple analysis comparing test scores from schools with the old curriculum with the test scores from schools with the new/experimental curriculum.
1. Conduct an ANOVA to evaluate whether there is a statistically significant difference in test scores between schools with the old curriculum and schools with the new curriculum.
2. Interpret the results with an explanation that includes data to support the conclusion.
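For reference, the F statistic behind a one-way ANOVA can be computed by hand from the between-group and within-group sums of squares. The sketch below assumes a hypothetical helper `one_way_anova` and uses two tiny made-up groups; it does not compute the p-value, which would come from the F distribution with the returned degrees of freedom.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up score changes for two groups (old, new).
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [4, 5, 6]])
print(f_stat, df_b, df_w)
```

With two groups of 40 schools the degrees of freedom are 1 and 78, and the critical F value at α = .05 is roughly 3.96. With only two groups, the F statistic equals the square of the pooled two-sample t statistic, so the ANOVA and the equal-variance t-test lead to the same conclusion.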
The board member who originally wanted you to include only low-income households in the survey is still concerned about the particular effect of the experimental curriculum on schools in low-income neighborhoods. To find the answer to this, you need to run a multiple regression model.
a) Create a dummy variable for the experimental curriculum and an interaction variable that is the product of the experimental dummy and the income variable. Copy and paste the Excel spreadsheet that shows the variables and data set.
b) Estimate a multiple regression model that includes the curriculum dummy, income, and interaction variable as independent variables. Copy and paste the tables from Excel and type the regression equation.
c) Calculate predicted values for the chgtestscores variable for both the new and old curriculum for income levels of $15,000, $30,000, $60,000, and $120,000.
d) Summarize and interpret the results of the model, citing data. What do you tell the board member about the effect of the new curriculum across different income levels?
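The model in parts a) through c) has the form chgtestscores = b0 + b1·new + b2·income + b3·(new × income), where new is the 0/1 curriculum dummy. The sketch below shows how the dummy, the interaction term, the fit, and the predictions fit together. It is an illustration only: the coefficients and data are made up (noise-free, so the fit can be checked), the function names are hypothetical, and the least-squares solve uses plain normal equations rather than Excel's regression tool.

```python
def design_row(curriculum, income):
    """Build one row of regressors: intercept, dummy, income, interaction."""
    new = 1 if curriculum == "new" else 0    # dummy: 1 = experimental curriculum
    return [1.0, new, income, new * income]  # interaction = dummy * income

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(b, curriculum, income):
    """Predicted chgtestscores for a curriculum and an income in $000."""
    return sum(bi * xi for bi, xi in zip(b, design_row(curriculum, income)))

# Hypothetical, noise-free data generated from made-up coefficients; the real
# inputs are the 80 rows of the spreadsheet.
true_b = [-2.0, 6.0, 0.02, -0.05]
schools = [(c, inc) for c in ("old", "new") for inc in (15, 30, 45, 60, 90, 120)]
X = [design_row(c, inc) for c, inc in schools]
y = [sum(bi * xi for bi, xi in zip(true_b, row)) for row in X]

b = ols(X, y)
for income in (15, 30, 60, 120):  # $15,000 is entered as 15 because income is in $000
    print(income, round(predict(b, "old", income), 2), round(predict(b, "new", income), 2))
```

In this specification the estimated effect of the new curriculum at a given income is b1 + b3 × income, so the sign and size of the interaction coefficient determine whether the benefit grows or shrinks as income rises.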
Shortly after the findings are published in a report, you receive a call from a small Midwestern school district. The schools in this district are mostly small schools in rural areas and farming communities with very low populations. Unbeknownst to anyone, the school district also experimented with the exact same curriculum used in your school district at the same time; however, they are confused because their statistical findings showed nothing significant. What do you tell them and why?
The variable the school board is most interested in understanding/explaining is the change in schoolwide standardized test scores. You were also given variables that indicated the curriculum type and the income of the households of students from each school. If this were a real research project, you would surely collect other data to use as control variables. For each of the following variables, explain why you would or would not want to collect the data to use in the analysis:
a) percentage of students who are ESL (English as a Second Language) students,
b) experience of the school superintendent,
c) change in test scores over the three years prior to the experimental curriculum,
d) average change in test scores on a DIFFERENT standardized test, and
e) percentage of households with single parents.
STATISTICAL PROBLEM-Old & New Curriculum