# Statistical analysis case with regression data

Attached are an Excel file with two sets of data and a Word document explaining the case. This is a conclusions-based case that uses regression analysis, scatterplots, and related techniques.

See attached files for a description of the case study and a full problem description.

Report requirements:

1. Use statistical analysis, including regression, pivot tables, etc. Briefly describe what you discovered.

2. Before you run regressions, you should do scatterplots of each independent variable against the dependent variable.

3. Using the available variables, develop the best model you can to predict store sales. How good is this model? After developing your model, describe how you would test the remaining regression assumptions (don't actually do the tests, just state what they are).

4. A group within the planning department has developed a more subjective approach in which potential sites are classified according to an assessment of the "competitive type" of the trading zone. How well does a model just using the "competitive type" variables predict sales compared to the other model you developed?

5. If you are allowed to use any of the variables, can you build a better model? Does this new model change how you would describe locations likely to have higher sales?

6. Two sites, A and B [see attachment], are currently under consideration for the next new store opening. Which site would you recommend? Justify your choice, using the model you like best from those you have developed.

Report specifications:

Prepare a report for a non-technical manager answering the questions above. The text of the report should explain the process you followed from your initial exploration of the data to the development and assessment of the final model. Exhibits (graphs, tables, charts, regression output) should be constructed, labeled and captioned so they can be understood by a non-technical person and your conclusions from each exhibit should be described in the text of the report.

Please communicate your findings and conclusions clearly, both technically and managerially. A reasonable guideline would be about three to four pages of actual text plus exhibits.

#### Solution Preview

NOTE: There is a slight error in the answer to Question 4. The instructions (in the Word file, not the text) say, "to do this you must create and use dummy variables for COMTYPE. Remember to use only six of the seven dummy variables in your regression."

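I didn't see that part of the directions and only used one comtype variable, with this variable taking the values 0, 1, 2, 3, 4, 5, 6, or 7. Each of the different values 0 through 7 would correspond to a type of competitive environment. Entering comtype this way treats the category codes as a numeric scale, which is not appropriate for a categorical variable.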

Instead, I should have done the regression with the following dummy variables:

comtype1 (comtype1 = 1 if comtype = 1, comtype1 = 0 otherwise)

comtype2 (comtype2 = 1 if comtype = 2, comtype2 = 0 otherwise)

comtype3 (comtype3 = 1 if comtype = 3, comtype3 = 0 otherwise)

etc.

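You wouldn't include a dummy variable for the case of comtype = 0, because one of the dummy variables has to be left out. The omitted category serves as the baseline that the other categories are compared against, and including every dummy alongside the intercept would make the dummies perfectly collinear.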
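The dummy coding described above can be sketched in a few lines of Python for readers working outside SPSS. This is only an illustration: the helper name `make_dummies` and the six example codes are hypothetical and are not taken from the case data. The sketch builds one 0/1 column per comtype value and skips the baseline category (assumed here to be comtype = 0).

```python
def make_dummies(comtype_values, baseline=0):
    """Return one 0/1 indicator column per non-baseline comtype category."""
    # Every distinct code except the baseline gets its own dummy variable.
    categories = sorted(set(comtype_values) - {baseline})
    return {
        f"comtype{c}": [1 if v == c else 0 for v in comtype_values]
        for c in categories
    }


# Hypothetical comtype codes for six stores (not from the case data).
comtype = [0, 1, 2, 1, 3, 0]
dummies = make_dummies(comtype)

for name, column in dummies.items():
    print(name, column)
```

A store in the baseline category has 0 in every dummy column, so each row contains at most one 1.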

Despite this mistake, this solution should still help you with your assignment. Follow the steps for doing the regression that I explain below, but replace the single comtype variable that I use with the set of dummy variables when you do the assignment yourself.

Sorry for the confusion!

---------------------------------------------------------------------------------------------------

Please see the attached files. The Word file has the same text as below. The SPSS file has the data used in the analyses. The Excel file has the same data as the SPSS file.

Report requirements:

1. Use statistical analysis, including regression, pivot tables, etc. Briefly describe what you discovered.

We have data on the trading zones of 250 existing stores. We want to use this data to construct regression models that predict sales. We will then use the model we like best to decide which location (site A or B) would be the better choice.

I will do the calculations in SPSS, but if you prefer another statistical program, you should be able to follow my methods to do the analyses using different software.

The dependent variable (the one we want to predict) will be "sales".
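If you are following along without SPSS, the sketch below shows one way to fit a regression of sales on a single predictor and to compute R-squared, using plain Python. It is a minimal illustration under stated assumptions: the function name `fit_simple_ols`, the predictor `x`, and all the numbers are hypothetical and do not come from the case data. The actual analysis uses more than one predictor.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; also returns R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical values for one predictor and for sales (not from the case data).
x = [10.0, 12.0, 15.0, 18.0, 20.0]
sales = [210.0, 250.0, 280.0, 330.0, 360.0]

intercept, slope, r_squared = fit_simple_ols(x, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * x, R-squared = {r_squared:.3f}")
```

An R-squared close to 1 means the predictor explains most of the variation in sales in the sample. It does not by itself confirm that the regression assumptions hold.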

2. Before you run regressions, you should do scatterplots of each independent variable against the dependent ...

#### Solution Summary

The solution provides an example of how to do a regression analysis. Analyses are done in SPSS. If you don't use SPSS, you should be able to follow the methods described here using the software program of your choice.