# ANOVA: Mutual Fund Performance

Mutual funds are classified as large-cap funds, medium-cap funds, or small-cap funds, depending on the capitalization of the companies in the fund. Hawaii Pacific University researchers S. Shi and M. Seiler investigated whether the average performance of a mutual fund is related to capitalization size (American Business Review, Jan. 2002). Independent random samples of 30 mutual funds were selected from each of the three fund groups, and the 90-day rate of return was determined for each fund. The data for the 90 funds were subjected to an analysis of variance, with the results shown in the ANOVA table below:

| Source | df | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Fund group | 2 | 409.566 | 204.783 | 6.965 | .002 |
| Error | 87 | 2,557.860 | 29.401 | | |
| Total | 89 | 2,967.426 | | | |

a) State the null and alternative hypotheses for the ANOVA.

b) Give the rejection region for the test using α = .01.

c) Make the appropriate conclusion using either the test statistic or the p-value.

https://brainmass.com/statistics/analysis-of-variance/anova-mutual-fund-performance-383929

#### Solution Summary

This solution gives the step by step method for ANOVA.
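As a rough check of parts a) through c), the sketch below recomputes the test from the ANOVA table. The hypotheses are H0: μ1 = μ2 = μ3 (the three capitalization groups have the same mean 90-day return) against Ha: at least two of the means differ. Because the numerator degrees of freedom equal 2, the upper-tail F probability has a simple closed form, so no statistics library is needed. The function names are illustrative and not part of the original solution.

```python
# Values copied from the ANOVA table in the problem statement.
ss_group, df_group = 409.566, 2
ss_error, df_error = 2557.860, 87
alpha = 0.01

# Mean squares and the F statistic.
ms_group = ss_group / df_group
ms_error = ss_error / df_error
f_stat = ms_group / ms_error


def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with exactly 2 numerator df.

    For df1 = 2 the survival function reduces to
    (df2 / (df2 + 2 f)) ** (df2 / 2).
    """
    return (df2 / (df2 + 2.0 * f)) ** (df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value, found by inverting the formula above."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


f_crit = f_critical_df1_2(alpha, df_error)
p_value = f_upper_tail_df1_2(f_stat, df_error)
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.3f}, critical value = {f_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these inputs the critical value comes out near 4.86, so the rejection region is F > 4.86. The observed F of 6.965 falls in that region and the p-value is below .01, so H0 is rejected: mean performance appears to differ across the capitalization groups.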


(See attached file for full problem description)

---

The owner of Maumee Ford-Mercury wants to study the relationship between the age of a car and its selling price. Listed below is a random sample of 12 used cars sold at the dealership during the last year.

a. If we want to estimate selling price based on the age of the car, which variable is the dependent variable and which is the independent variable?

The selling price is the dependent variable, and the age of the car is the independent variable.

b. Draw a scatter diagram.

c. Determine the coefficient of correlation.

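$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n-1)\,s_x s_y}$$

The hand calculation used a numerator of 82.9 and a denominator of $(12-1)(2.234)(1.968)$ and reported $r = .5833$. These figures do not agree with each other: the denominator is about 48.36, and $82.9 / 48.36 \approx 1.71$, which cannot be a correlation. The Excel output below reports Multiple R = 0.543646 with a negative slope, so $r \approx -0.544$.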

d. Determine the coefficient of determination.

$(.583)^2 = .340$ from the hand calculation; the Excel output below reports R Square = 0.295551.

e. Interpret these statistical measures. Does it surprise you that the relationship is inverse?

The correlation between age and selling price is moderate, and the scatter diagram shows considerable spread around any fitted line. A perfect linear relationship would give a coefficient of exactly -1.0 or 1.0; our value falls well inside that range. Because the labels "strong" and "weak" are imprecise, we turn to the coefficient of determination, which is the proportion of the variation in selling price that is explained by the age of the car. It is not the probability that a relationship exists. Here roughly 30% to 34% of the variation in price is accounted for by age (.340 from the hand calculation, 0.296 from the Excel output below), so most of the variation comes from other factors. The inverse relationship is not a surprise: as a car gets older, its selling price generally falls.

Use Excel to do a scatter plot and to run the regression to get answers to c and d.

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | 0.543646 |
| R Square | 0.295551 |
| Adjusted R Square | 0.225106 |
| Standard Error | 1.732105 |
| Observations | 12 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 12.58729 | 12.58729 | 4.195499 | 0.067702 |
| Residual | 10 | 30.00188 | 3.000188 | | |
| Total | 11 | 42.58917 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 11.17724 | 2.143271 | 5.215037 | 0.000393 | 6.401732 | 15.95274 | 6.401732 | 15.95274 |
| X Variable 1 | -0.47876 | 0.233734 | -2.04829 | 0.067702 | -0.99955 | 0.042036 | -0.99955 | 0.042036 |
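The answers to c and d can be read off this output, and the short sketch below shows how the pieces fit together. It recomputes R Square from the sums of squares, gives the correlation the sign of the slope (Excel's Multiple R is always reported as non-negative), and rebuilds the F statistic. All numbers are copied from the output above; the variable names are illustrative.

```python
import math

# Figures copied from the Excel SUMMARY OUTPUT for the used-car data.
ss_regression = 12.58729
ss_residual = 30.00188
ss_total = 42.58917
df_residual = 10
slope = -0.47876
slope_ci_lower, slope_ci_upper = -0.99955, 0.042036

# Coefficient of determination: share of total variation explained.
r_squared = ss_regression / ss_total

# Coefficient of correlation: square root of R^2 with the slope's sign.
r = math.copysign(math.sqrt(r_squared), slope)

# F statistic: regression mean square over residual mean square
# (the regression has 1 degree of freedom, so MS = SS).
f_stat = ss_regression / (ss_residual / df_residual)

# The 95% interval for the slope contains zero, which matches
# a p-value above 0.05 (0.067702 in the output).
ci_contains_zero = slope_ci_lower <= 0.0 <= slope_ci_upper

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}, F = {f_stat:.4f}")
print("Slope interval contains zero:", ci_contains_zero)
```

The recomputed values agree with the output: r is about -0.544 and R Square about 0.296, and the relationship is not quite significant at the 5% level.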

We are studying mutual bond funds for the purpose of investing in several funds. For this particular study, we want to focus on the assets of a fund and its five-year performance. The question is: Can the five-year rate of return be estimated based on the assets of the fund? Nine mutual funds were selected at random, and their assets and rates of return are shown below.

a. Draw a scatter diagram.

b. Compute the coefficient of correlation.

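$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n-1)\,s_x s_y} = \frac{3504.5}{(9-1)(182.943)(1.705)} = 1.404$$

A correlation must lie between -1 and +1, so at least one input to this hand calculation is wrong. The Excel output below reports Multiple R = 0.046051 with a negative slope, so $r \approx -0.046$.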

c. Compute the coefficient of determination.

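$(1.404)^2 = 1.97$ from the hand calculation, which is not a valid coefficient of determination because it must lie between 0 and 1. The Excel output below reports R Square = 0.002121.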

d. Write a brief report of your findings for parts b and c.

A coefficient of correlation must lie between -1.0 and +1.0, and a coefficient of determination between 0 and 1, so the hand-computed values of 1.404 and 1.97 cannot be correct. The Excel output below gives Multiple R = 0.046051 with a negative slope, so r is about -0.046, and R Square = 0.002121. The relationship between assets and five-year rate of return is therefore essentially nonexistent: assets account for about 0.2% of the variation in return, and the slope is not significantly different from zero (p-value 0.906). The size of a fund tells us almost nothing about its five-year performance, so assets alone should not be used to choose among these funds.

e. Determine the regression equation. Use assets as the independent variable.

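The slope is $b = r\,(s_y / s_x)$, where $s_y$ is the standard deviation of the rate of return and $s_x$ is the standard deviation of assets. The hand calculation, $b = 1.404(182.94/1.70) = 151.084$, uses the invalid value of r and inverts the ratio of standard deviations. From the Excel output below, the regression equation is $\hat{y} = 9.919827 - 0.00039x$, where x is assets and $\hat{y}$ is the estimated five-year rate of return.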

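f. For a fund with $400.0 million in sales, determine the five-year rate of return (in percent).

Using the regression equation from the Excel output below, and assuming assets are recorded in millions of dollars, $\hat{y} = 9.919827 - 0.00039(400) \approx 9.76$, or about 9.8%. Because the slope is not significantly different from zero, this estimate is barely different from what the intercept alone would give.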

Use Excel to do the scatter plot and to get the various measures by running a regression

Scatter plot above. Regression:

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | 0.046051 |
| R Square | 0.002121 |
| Adjusted R Square | -0.14043 |
| Standard Error | 1.752318 |
| Observations | 9 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 0.045679 | 0.045679 | 0.014876 | 0.906352 |
| Residual | 7 | 21.49432 | 3.070617 | | |
| Total | 8 | 21.54 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 9.919827 | 1.384941 | 7.162636 | 0.000183 | 6.644962 | 13.19469 | 6.644962 | 13.19469 |
| X Variable 1 | -0.00039 | 0.003225 | -0.12197 | 0.906352 | -0.00802 | 0.007232 | -0.00802 | 0.007232 |
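To connect this output to parts b, c, e, and f, the sketch below recovers r and R Square from the sums of squares and evaluates the fitted line at 400. It assumes that assets were entered in millions of dollars, which the output itself does not state; the function name `predicted_return` is hypothetical.

```python
import math

# Figures copied from the Excel SUMMARY OUTPUT for the bond-fund data.
ss_regression = 0.045679
ss_total = 21.54
intercept = 9.919827
slope = -0.00039  # as displayed by Excel (rounded)

# Coefficient of determination and signed correlation.
r_squared = ss_regression / ss_total
r = math.copysign(math.sqrt(r_squared), slope)


def predicted_return(assets_millions):
    """Estimated five-year rate of return (percent) from the fitted line.

    Assumes assets are measured in millions of dollars.
    """
    return intercept + slope * assets_millions


pred_400 = predicted_return(400.0)

print(f"r = {r:.4f}, R^2 = {r_squared:.6f}")
print(f"Estimated return for 400 in assets: {pred_400:.2f}%")
```

Both r and R Square are valid here (inside their permitted ranges) and very close to zero, which is why the prediction of roughly 9.8% is nearly the same as the intercept.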

Chapter 14, exercise number 22 page 510

A mortgage department of a large bank is studying its recent loans. Of particular interest is how such factors as the value of the home (in thousands of dollars), education level of the head of the household, age of the head of the household, current monthly mortgage payment (in dollars), and sex of the head of the household (male = 1, female = 0) relate to the family income. Are these variables effective predictors of the income of the household? A random sample of 25 recent loans is obtained.

a. Determine the regression equation.

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | 0.71965 |
| R Square | 0.517896 |
| Adjusted R Square | 0.496935 |
| Standard Error | 0.745724 |
| Observations | 25 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 13.74 | 13.74 | 24.70758 | 5.02E-05 |
| Residual | 23 | 12.7904 | 0.556104 | | |
| Total | 24 | 26.5304 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 35.88886 | 0.826168 | 43.44016 | 1.4E-23 | 34.1798 | 37.59792 | 34.1798 | 37.59792 |
| X Variable 1 | 0.026235 | 0.005278 | 4.970672 | 5.02E-05 | 0.015317 | 0.037153 | 0.015317 | 0.037153 |

b. What is the value of $R^2$? Comment on the value.

The value of $R^2$ is .517896, so about 51.8% of the variation in the dependent variable is accounted for by the independent variable in this regression. The adjusted $R^2$ (.496935) corrects for the sample size and the number of predictors, which makes it the better figure for comparing models with different numbers of independent variables.
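The relationship between $R^2$ and its adjusted version can be checked directly from the ANOVA sums of squares. The sketch below uses the standard adjustment, which compares residual and total variation per degree of freedom; the values are copied from the output in part a, and k = 1 reflects the single X variable that appears in that output.

```python
# Figures copied from the ANOVA section of the SUMMARY OUTPUT in part a.
ss_regression = 13.74
ss_residual = 12.7904
ss_total = 26.5304
n = 25  # observations
k = 1   # independent variables in this particular regression

# Plain R^2: explained share of total variation.
r_squared = ss_regression / ss_total

# Adjusted R^2: penalizes each extra predictor by dividing the
# residual and total sums of squares by their degrees of freedom.
adj_r_squared = 1.0 - (ss_residual / (n - k - 1)) / (ss_total / (n - 1))

print(f"R^2 = {r_squared:.6f}, adjusted R^2 = {adj_r_squared:.6f}")
```

If more of the five candidate predictors were added, k would rise and the adjustment would grow, so the adjusted figure only improves when a new variable explains enough additional variation to offset the lost degree of freedom.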

c. Conduct a global hypothesis test to determine whether any of the independent variables are different from zero.

The regression output for this part is identical to the SUMMARY OUTPUT shown under part a. For the global test, the ANOVA section reports F = 24.70758 on 1 and 23 degrees of freedom with Significance F = 5.02E-05, so the null hypothesis that the regression coefficient is zero is rejected at any conventional significance level. Note that this output contains a single X variable, so it tests only that one predictor rather than all five listed in the problem.

RESIDUAL OUTPUT

| Observation | Predicted Y | Residuals |
|---|---|---|
| 1 | 40.87351 | -0.57351 |
| 2 | 39.06329 | 0.536705 |
| 3 | 40.11269 | 0.687306 |
| 4 | 40.11269 | 0.187306 |
| 5 | 40.58492 | -0.58492 |
| 6 | 38.48613 | -0.38613 |
| 7 | 38.87965 | 1.52035 |
| 8 | 41.18833 | -0.48833 |
| 9 | 40.7161 | 0.083901 |
| 10 | 38.25001 | -1.15001 |
| 11 | 40.63739 | -0.73739 |
| 12 | 39.64046 | 0.759535 |
| 13 | 39.35188 | -1.35188 |
| 14 | 39.2207 | -0.2207 |
| 15 | 39.90281 | -0.40281 |
| 16 | 39.69293 | 0.907065 |
| 17 | 40.45375 | -0.15375 |
| 18 | 40.53245 | -0.43245 |
| 19 | 40.82104 | 0.878961 |
| 20 | 39.90281 | 0.197186 |
| 21 | 39.82411 | 0.775891 |
| 22 | 40.42751 | -0.02751 |
| 23 | 40.16516 | 0.734836 |
| 24 | 39.82411 | 0.275891 |
| 25 | 39.53552 | -1.03552 |
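The residuals tie back to the ANOVA section: their squared sum should reproduce the Residual SS, and dividing by the residual degrees of freedom and taking the square root should reproduce the Standard Error. The sketch below performs that check with the 25 residuals copied from the table above; 23 is the residual df reported in the output.

```python
import math

# Residuals copied from the RESIDUAL OUTPUT table, observations 1 to 25.
residuals = [
    -0.57351, 0.536705, 0.687306, 0.187306, -0.58492,
    -0.38613, 1.52035, -0.48833, 0.083901, -1.15001,
    -0.73739, 0.759535, -1.35188, -0.2207, -0.40281,
    0.907065, -0.15375, -0.43245, 0.878961, 0.197186,
    0.775891, -0.02751, 0.734836, 0.275891, -1.03552,
]
df_residual = 23  # n - k - 1 with n = 25 and one X variable

# Sum of squared residuals (should match Residual SS = 12.7904).
sse = sum(e * e for e in residuals)

# Standard error of estimate (should match Standard Error = 0.745724).
std_error = math.sqrt(sse / df_residual)

print(f"SSE = {sse:.4f}, standard error = {std_error:.6f}")
```

A match between these figures and the ANOVA section is a quick way to confirm that the residual listing and the summary output come from the same regression run.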

d. Conduct individual hypothesis tests to determine whether any of the independent variables can be dropped.
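For an individual test, each coefficient is divided by its standard error and the resulting t statistic is compared with a critical value on n - k - 1 degrees of freedom. The sketch below applies this to the one X variable that appears in the output under part a; the critical value 2.069 is the two-tailed 5% point for 23 degrees of freedom taken from a standard t table, and the 5% level itself is an assumption, since the exercise does not state one.

```python
# Coefficient and standard error copied from the SUMMARY OUTPUT in part a.
coefficient = 0.026235
standard_error = 0.005278

# Two-tailed 5% critical value for t with 23 df, from a standard t table.
# The 5% significance level is an assumption, not given in the exercise.
t_critical = 2.069

# Test statistic for H0: coefficient = 0.
t_stat = coefficient / standard_error

# Keep the variable if |t| exceeds the critical value.
keep_variable = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, critical value = {t_critical}")
print("Keep variable" if keep_variable else "Candidate for dropping")
```

The t statistic of about 4.97 is well beyond the critical value, so this variable would be kept. The same comparison would be repeated for each of the other predictors once they are included in the regression.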

e. If variables are dropped, recompute the regression equation and $R^2$.

Use Excel!