    Regression analysis

    Problem: Do Hispanic employees earn more than white employees at a large company against which a lawsuit has been filed? The attached data include: 1. Employee ID, 2. Job title, 3. Ethnicity, 4. Years working.
    1. Does pay differ by ethnicity, and if so, is the difference statistically significant, and what does that mean? Consider some of the arguments that the Hispanic plaintiffs would make against the company in the lawsuit.
    2. Explain differences in pay by job title, i.e., do white employees earn more simply because they hold better job titles rather than because of ethnicity? What is the perspective of the plaintiffs (the Hispanic employees) and the perspective of the defendant (the company) when applying statistical analysis? Approach this through hypothesis testing and confidence intervals to determine whether pay differs significantly within each job title.
    3. Run a linear regression of pay scale on job title. Explain the variability in pay scale with respect to job title. Are the results fair to Hispanic employees?
    4. Perform a statistical analysis of years working and ethnicity. Is there any statistically significant relationship?
    5. Run a multiple regression of pay scale on job title and years working. Is the regression significant? How much of the variability in pay rates does this model explain? Would this analysis help the Hispanic workers or the CEO?
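
One possible way to start on questions 1 to 3 is sketched below in Python, using only the standard library. The records in ROWS, the pay values, the job titles, and the helper names (permutation_test, r_squared_by_group, pay_of) are all hypothetical and invented for illustration; they are not the attached data. A permutation test stands in here for whichever hypothesis test the course expects (a two-sample t-test would be the usual alternative), and the R^2 shortcut relies on the fact that a regression on job-title dummies alone predicts each title's mean pay.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (job_title, ethnicity, years_working, pay).
# Invented for illustration only; NOT the attached data set.
ROWS = [
    ("Analyst", "White", 2, 52.0), ("Analyst", "Hispanic", 3, 51.0),
    ("Analyst", "White", 5, 56.0), ("Analyst", "Hispanic", 6, 55.0),
    ("Manager", "White", 8, 80.0), ("Manager", "Hispanic", 9, 79.0),
    ("Manager", "White", 12, 86.0), ("Manager", "White", 10, 83.0),
    ("Clerk", "Hispanic", 1, 35.0), ("Clerk", "Hispanic", 4, 38.0),
    ("Clerk", "White", 2, 36.0), ("Clerk", "Hispanic", 7, 41.0),
]

def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns (observed mean(a) - mean(b), approximate p-value).
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)

def r_squared_by_group(rows):
    """R^2 of pay regressed on job-title dummies only.

    With only dummy predictors, the fitted value for each employee is the
    mean pay of their job title, so R^2 = 1 - SS_within / SS_total.
    """
    pay = [r[3] for r in rows]
    grand = mean(pay)
    by_title = defaultdict(list)
    for title, _, _, p in rows:
        by_title[title].append(p)
    ss_total = sum((p - grand) ** 2 for p in pay)
    ss_within = sum((p - mean(g)) ** 2 for g in by_title.values() for p in g)
    return 1.0 - ss_within / ss_total

def pay_of(rows, ethnicity, title=None):
    """Pay values for one ethnicity, optionally restricted to one job title."""
    return [p for t, e, _, p in rows
            if e == ethnicity and (title is None or t == title)]

# Question 1: overall pay gap (White minus Hispanic) and its p-value.
overall_diff, overall_p = permutation_test(
    pay_of(ROWS, "White"), pay_of(ROWS, "Hispanic"))

# Question 2: the same comparison repeated within each job title.
within = {
    t: permutation_test(pay_of(ROWS, "White", t), pay_of(ROWS, "Hispanic", t))
    for t in sorted({r[0] for r in ROWS})
}

# Question 3: share of pay variability explained by job title alone.
r2 = r_squared_by_group(ROWS)

print(f"overall gap = {overall_diff:.2f}, p = {overall_p:.3f}")
for title, (gap, p) in within.items():
    print(f"{title}: gap = {gap:.2f}, p = {p:.3f}")
print(f"R^2 (pay ~ job title) = {r2:.3f}")
```

Comparing the overall gap with the within-title gaps is what separates the two sides' arguments: plaintiffs would likely point to the overall gap and to who holds which title, while the defendant would likely point to small or insignificant gaps among employees with the same title. Within-title groups are often small, so those tests may have little power. The multiple regression in question 5 would add years working as a further predictor and is not covered by this sketch.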

