See attached data file.
Zagat's publishes restaurant ratings for various locations in the United States. The file RESTRATE.xls contains the Zagat rating for food, decor, service, and the price per person for a sample of 53 restaurants located in New York City and 53 restaurants located in Long Island. Suppose you wanted to develop a regression model to predict the price per person based on a variable that represents the sum of the ratings for food, decor, and service.
Using PHStat2 (not plain Excel), answer the following. The list below is a minimum guideline for what you should analyze. For example, you may need tools such as confidence interval estimates or one- or two-sample tests on the data to improve the quality of your report. (Please include one or both of these.) Please explain and show all work.
PHStat is an Excel add-in; it's a free download online. The link below is particularly helpful, as it tells you which security settings to disable in order to download it, etc. It is not that different from Excel: if you can use Excel, PHStat won't be crazy new to you!
a) State your statistical objective for this data set.
b) Perform exploratory data analysis, such as numerical measures or the box-and-whisker plot for this data set.
c) Construct a scatter diagram of price against summated rating. Describe the relationship that you may see. Does this appear to have some association (linear or non-linear)?
d) Construct a scatter diagram of price against each of food, décor, and service separately. Describe the relationship that you see in each diagram. Do any of these appear to have some association?
e) From (a) and (b), does any simple linear model appear to hold? You may want to run some testing to substantiate your finding.
f) Does a multiple regression model appear to hold? You may want to run some tests to substantiate why or why not. If so, find the regression equation to predict price from location.
g) Suppose now that you want to develop a regression model to predict the price per person based on a variable that represents the sum of the ratings for food, decor, and service, and on location (New York City (Locate = 0) or Long Island (Locate = 1)).
Is the regression significant? Report the results of the appropriate test, and interpret its meaning.
Are any other variables (Neighborhood or Cuisine) useful for the regression analysis? For example, when you classify Cuisine as Asian, American, Others, can you use them as dummy variables?
h) Does summated rating have a significant impact on price, after adjusting for location? In particular, are New York City's restaurants significantly more expensive or significantly less expensive on average than those in Long Island?
i) Include an interaction term in the model and, at the 0.05 level of significance, determine whether it makes a significant contribution to the model.
j) Summarize and comment on your results.
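For parts (c) and (e), the check on whether a simple linear model holds usually comes down to the least-squares fit and a t test on the slope. The sketch below shows that arithmetic in plain Python so the numbers PHStat reports can be cross-checked by hand. The function name and the five data points are made up for illustration and are NOT taken from RESTRATE.xls.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, t statistic for b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = y_bar - b1 * x_bar             # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return b0, b1, b1 / se_b1

# Hypothetical summated ratings and prices (illustration only)
summated = [50, 55, 60, 65, 70]
price = [30, 34, 41, 44, 51]

b0, b1, t_stat = simple_linear_regression(summated, price)
print(f"price = {b0:.2f} + {b1:.2f} * summated rating, t = {t_stat:.2f}")
```

The t statistic is compared with the critical value of the t distribution with n - 2 degrees of freedom; a slope that is significantly different from zero supports a linear association.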
A step-by-step method for computing the regression model is given in the answer.
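For parts (g) through (i), one possible way to see what the interaction test does is a partial F test: fit the model with summated rating and Locate, fit it again with the extra rating x Locate column, and compare the two error sums of squares. This is only a sketch of the idea, not the PHStat procedure. The helper names (solve, ols_sse) and all data values are hypothetical, and the prices are invented rather than read from RESTRATE.xls.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols_sse(rows, y):
    """Least-squares fit with an intercept; returns (coefficients, SSE)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, sse

# Hypothetical data: Locate = 0 for New York City, 1 for Long Island
ratings = [50, 55, 60, 65, 70, 52, 58, 61, 66, 72]
locate = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
prices = [38, 42, 49, 55, 61, 30, 33, 36, 38, 43]

reduced_rows = [(r, l) for r, l in zip(ratings, locate)]
full_rows = [(r, l, r * l) for r, l in zip(ratings, locate)]  # adds interaction

beta_reduced, sse_reduced = ols_sse(reduced_rows, prices)
beta_full, sse_full = ols_sse(full_rows, prices)

# Partial F statistic for the single added term, with 1 and n - 4 degrees of freedom
n = len(prices)
f_stat = (sse_reduced - sse_full) / (sse_full / (n - 4))
print("interaction coefficient:", round(beta_full[3], 4))
print("partial F statistic:", round(f_stat, 3))
```

At the 0.05 level, the interaction term contributes significantly if the partial F statistic exceeds the F critical value with 1 and n - 4 degrees of freedom. With one added term, this is equivalent to the t test on the interaction coefficient in the regression output.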