# Regression analysis for Salary data.

See attached data file.

2. Sex discrimination. The dataset salary.dat contains salaries and other characteristics of all faculty members of a small college. The data were collected for presentation in legal proceedings in which discrimination against women in salary was at issue. All faculty members represented in the dataset hold tenured or tenure-track positions; temporary faculty are not included. The data were collected from personnel files and consist of the following:

SX = Sex, coded 1 for female and 0 for male

RK = Rank, coded 1 for Assistant Professor, 2 for Associate Professor and 3 for Full Professor

YR = Number of years in current rank

DG = Highest degree, coded 1 if Doctorate, 0 if Master's

YD = Number of years since highest degree was earned

SL = Academic year salary in dollars

In this problem, treat the rank variable RK as categorical. That is, replace the rank variable by two binary dummy variables to account for the different rank levels. For example, you might introduce one variable indicating non-tenured (Assistant Professor) or tenured (Associate/Full Professor) ranks and one variable indicating Full Professor rank. (Compare the example of echolocating and non-echolocating birds and bats discussed in class.)
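The dummy coding suggested above can be sketched in a few lines. The function name `rank_dummies` and the indicator names `tenured` and `full` are illustrative choices, not part of the dataset; only the RK codes 1, 2, 3 come from the problem statement.

```python
def rank_dummies(rk):
    """Map RK (1, 2, 3) to two 0/1 indicators: (tenured, full)."""
    if rk not in (1, 2, 3):
        raise ValueError(f"unexpected rank code: {rk!r}")
    tenured = 1 if rk >= 2 else 0   # Associate or Full Professor
    full = 1 if rk == 3 else 0      # Full Professor only
    return tenured, full

# Assistant Professor (RK = 1) is the baseline level under this coding.
for rk in (1, 2, 3):
    print(rk, rank_dummies(rk))
```

Under this coding the coefficient on `tenured` compares Associate with Assistant Professors, and the coefficient on `full` compares Full with Associate Professors, each adjusted for the other predictors in the model.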

(a) Test the hypothesis that salary adjusted for years in current rank, highest degree, and years since highest degree is the same for each of the three ranks.
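One common way to approach part (a) is a partial F-test that compares a model without the two rank dummies to a model with them. The following is a minimal sketch using NumPy on synthetic stand-in data; the generated numbers, the helper names `rss` and `partial_f`, and the coefficients used to simulate salaries are all assumptions for illustration and say nothing about salary.dat.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def partial_f(X_reduced, X_full, y):
    """F statistic for the extra columns in X_full; returns (F, df1, df2)."""
    n, p_full = X_full.shape
    df1 = p_full - X_reduced.shape[1]      # number of coefficients tested
    df2 = n - p_full                       # residual df of the full model
    rss_r, rss_f = rss(X_reduced, y), rss(X_full, y)
    f_stat = ((rss_r - rss_f) / df1) / (rss_f / df2)
    return f_stat, df1, df2

# Synthetic stand-in data (NOT the salary.dat values).
rng = np.random.default_rng(0)
n = 60
rk = rng.integers(1, 4, size=n)                    # rank codes 1..3
yr = rng.integers(0, 20, size=n).astype(float)     # years in current rank
dg = rng.integers(0, 2, size=n).astype(float)      # highest degree
yd = yr + rng.integers(0, 15, size=n)              # years since degree
tenured = (rk >= 2).astype(float)
full = (rk == 3).astype(float)
sl = 16000 + 5000 * tenured + 6000 * full + 350 * yr + rng.normal(0, 1500, size=n)

ones = np.ones(n)
X_reduced = np.column_stack([ones, yr, dg, yd])
X_full = np.column_stack([ones, yr, dg, yd, tenured, full])
f_stat, df1, df2 = partial_f(X_reduced, X_full, sl)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

The statistic is referred to an F distribution with `df1` and `df2` degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df1, df2)` gives the corresponding p-value.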

(b) Fit a linear model that predicts salary given all other variables, including the categorical dummy variables. Examine the residuals. Explain the need to transform the response, salary, to some other scale, and suggest an appropriate transformation.
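A typical reason to transform a salary response is that residual spread grows with the fitted value, which a logarithm often stabilizes. The sketch below illustrates that diagnostic on constructed data with percentage-type errors; the data, the single scaled predictor `x`, and the helper `spread_ratio` are hypothetical, and whether the log is the right choice for salary.dat has to be judged from its own residual plots.

```python
import numpy as np

def spread_ratio(x, y):
    """Ratio of residual SD in the upper vs. lower half of the fitted values."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    resid = y - fitted
    upper = fitted > np.median(fitted)
    return float(np.std(resid[upper]) / np.std(resid[~upper]))

# Constructed salaries with multiplicative errors; not real data.
x = np.linspace(0.0, 1.0, 100)                        # hypothetical scaled predictor
sign = np.where(np.arange(100) % 2 == 0, 1.0, -1.0)   # alternating error sign
salary = np.exp(10.0 + x) * (1.0 + 0.2 * sign)        # errors of +/- 20 percent

raw_ratio = spread_ratio(x, salary)
log_ratio = spread_ratio(x, np.log(salary))
print(f"spread ratio, raw scale: {raw_ratio:.2f}; log scale: {log_ratio:.2f}")
```

A ratio well above 1 on the raw scale and close to 1 on the log scale is the pattern that would support modelling log(SL) rather than SL.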

(c) After transforming the response, examine the 'new' residuals. Comment on their appearance and the adequacy of the regression model.
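Alongside a plot of residuals against fitted values, a normal probability plot is a usual check of the new residuals. As a rough numerical companion to that plot, the sketch below computes the correlation between sorted residuals and standard normal quantiles, using only the standard library. The function name `qq_correlation` and both residual vectors are made up for illustration.

```python
import math
from statistics import NormalDist, mean

def qq_correlation(residuals):
    """Correlation between sorted residuals and standard normal quantiles."""
    r = sorted(residuals)
    n = len(r)
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    q = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    r_bar, q_bar = mean(r), mean(q)
    sxy = sum((a - r_bar) * (b - q_bar) for a, b in zip(r, q))
    sxx = sum((a - r_bar) ** 2 for a in r)
    syy = sum((b - q_bar) ** 2 for b in q)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical residuals for illustration only (not from salary.dat).
symmetric = [-2.1, -1.2, -0.7, -0.3, -0.1, 0.1, 0.4, 0.6, 1.3, 2.0]
skewed = [math.exp(v) for v in symmetric]   # long right tail
print(round(qq_correlation(symmetric), 3), round(qq_correlation(skewed), 3))
```

Values near 1 indicate a nearly straight normal probability plot; a clearly lower value points to skewness or outliers that deserve a closer look before the model is called adequate.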

(d) Test the hypothesis that salary adjusted for rank, years in current rank, highest degree, and years since highest degree is the same for men and women. Summarize your findings so far in a fashion that might be useful in court.
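The hypothesis in part (d) is usually tested with a t statistic on the SX coefficient in the full model. Here is a minimal NumPy sketch on synthetic data, assuming a log-salary response. The helper `ols_coef_table`, every simulated value, and in particular the built-in sex effect of -0.10 are arbitrary assumptions used only to exercise the code; they are not findings about the college.

```python
import numpy as np

def ols_coef_table(X, y):
    """Return (coef, std_err, t_stat, df) for an OLS fit of y on X."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    df = n - p
    sigma2 = float(resid @ resid) / df            # residual variance estimate
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
    return coef, std_err, coef / std_err, df

# Synthetic stand-in data (NOT the salary.dat values).
rng = np.random.default_rng(1)
n = 80
sx = np.repeat([0.0, 1.0], n // 2)                # 0 = male, 1 = female
rk = rng.integers(1, 4, size=n)
tenured, full = (rk >= 2).astype(float), (rk == 3).astype(float)
yr = rng.integers(0, 20, size=n).astype(float)
dg = rng.integers(0, 2, size=n).astype(float)
yd = yr + rng.integers(0, 15, size=n)
log_sl = (9.7 + 0.15 * tenured + 0.20 * full + 0.015 * yr - 0.10 * sx
          + rng.normal(0, 0.05, size=n))          # -0.10 is an arbitrary assumption

X = np.column_stack([np.ones(n), tenured, full, yr, dg, yd, sx])
coef, se, t, df = ols_coef_table(X, log_sl)
pct = 100 * (np.exp(coef[-1]) - 1)                # approximate percent differential
print(f"SX coef = {coef[-1]:.3f}, SE = {se[-1]:.3f}, t = {t[-1]:.2f} on {df} df")
print(f"Estimated adjusted differential: {pct:.1f}% (women relative to men)")
```

On the log scale, exponentiating the SX coefficient turns it into a multiplicative pay ratio, which tends to be easier to state for a court than a coefficient in log dollars.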

(e) Finkelstein (1980), in a discussion of the use of regression in discrimination cases, wrote that a "variable may reflect a position or status bestowed by the employer, in which case if there is discrimination in the award of the position or status, the variable may be 'tainted'." For example, if there is discrimination in the promotion of faculty to higher ranks, using rank to adjust salaries before comparing the sexes may not be acceptable to the courts. Fit a model similar to that in parts (c) and (d) to the data, but without adjusting for the effects of rank. Summarize and compare the results of leaving out rank effects on inferences concerning differentials in pay by sex.
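The "tainted variable" point can be illustrated with a small constructed example in which pay depends only on rank, but rank is distributed unequally between the sexes. The counts, the 0.3 log-salary premium, the single `full` indicator (a simplification of the two rank dummies), and the helper `sex_coefficient` are all hypothetical and chosen purely to make the arithmetic transparent.

```python
import numpy as np

def sex_coefficient(columns, y):
    """OLS fit of y on an intercept plus `columns`; the first column must be SX."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[1])

# Constructed example, not real data: 100 men followed by 100 women.
sx = np.repeat([0.0, 1.0], 100)
full = np.concatenate([
    np.repeat([1.0, 0.0], [60, 40]),   # men: 60 Full Professors, 40 not
    np.repeat([1.0, 0.0], [20, 80]),   # women: 20 Full Professors, 80 not
])
log_sl = 10.0 + 0.3 * full             # pay depends on rank only, no direct sex term

with_rank = sex_coefficient([sx, full], log_sl)
without_rank = sex_coefficient([sx], log_sl)
print(f"SX coefficient adjusting for rank: {with_rank:.3f}")
print(f"SX coefficient without rank:       {without_rank:.3f}")
```

In this construction the rank-adjusted SX coefficient is zero while the unadjusted one is negative, so whether a pay differential appears depends entirely on whether rank is treated as a legitimate adjustment. That is the comparison part (e) asks for on the real data.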

https://brainmass.com/statistics/regression-analysis/regression-analysis-for-salary-data-382888

#### Solution Summary

A step-by-step method for fitting the regression model to the salary data is given in the answer.