Revised Models

Below I fit regression models of total expenditures as a function of the number of faculty, undergraduates, graduate students, PhD fields, and government grants and endowments, using three samples of universities. As before, these models continue to under-predict UVA’s total expenditures (UVA’s budget is larger than the models predict).

The first analysis, using all of the top ARL universities from the previous analysis, produces the best fit and the clearest evidence: three of the five independent variables are significantly related to budgets, and a fourth is marginally related.

The second analysis, using only the 40 top-ranked institutions by ARL in the US, produces a weaker overall fit, with only one of the five included variables appearing related to budgets within this subsample.

The third analysis, using the 40 institutions nearest to UVA in the ARL rankings (20 above and 20 below), produces the weakest overall fit, with one of the five included variables significantly related to total expenditures and two more marginally related.

While this could change with the inclusion of the substitute variable for research expenditures, or by incorporating more information over time, I wouldn’t feel especially confident about any but the first of these models (the one using the larger sample of institutions).
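As a rough illustration of how the three samples described above could be drawn, here is a minimal sketch. The ranking table is a made-up stand-in, and the column names `inst` and `arl_rank` are assumptions; the report does not show how the samples were actually built. The sketch also ignores the restriction to US institutions and excludes the reference institution from the third sample, which is only one possible reading of "20 above and 20 below".

```r
# Hypothetical stand-in for the ranking table: 99 institutions ranked 1..99.
# The column names `inst` and `arl_rank` are assumptions for illustration.
ranks <- data.frame(
  inst     = paste0("INST", 1:99),
  arl_rank = 1:99
)
uva_rank <- 26  # UVA's rank as stated in this report

# Sample 1: every ranked institution
all_arl <- ranks

# Sample 2: the 40 highest-ranked institutions
top40 <- subset(ranks, arl_rank <= 40)

# Sample 3: 20 ranks above and 20 below the reference rank, reference excluded
near_uva <- subset(ranks, abs(arl_rank - uva_rank) <= 20 & arl_rank != uva_rank)

# Number of institutions in each sample
sapply(list(all = all_arl, top40 = top40, near_uva = near_uva), nrow)
```

Under these assumptions the third sample covers ranks 6 through 46.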

All 99 ARL Universities

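For comparison, here is the reduced regression model fit to all 99 of the top-ranked ARL universities. Note that the 92 residual degrees of freedom, with six estimated coefficients, imply that 98 observations entered the fit.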

## 
## Call:
## lm(formula = totexp ~ fac + ugrad + gradstu + phdfld + invgrant, 
##     data = arl_totexp)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -41348568  -4742393   -324124   3377200  35535409 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.176e+06  2.592e+06   1.225   0.2237    
## fac         3.159e+03  1.474e+03   2.143   0.0347 *  
## ugrad       5.374e+01  1.134e+02   0.474   0.6367    
## gradstu     6.971e+02  4.721e+02   1.476   0.1432    
## phdfld      9.399e+04  4.585e+04   2.050   0.0432 *  
## invgrant    1.392e+03  1.540e+02   9.038 2.41e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9618000 on 92 degrees of freedom
## Multiple R-squared:   0.73,  Adjusted R-squared:  0.7154 
## F-statistic: 49.76 on 5 and 92 DF,  p-value: < 2.2e-16

The estimated model is

\[ \begin{aligned} \text{Expenditures} = {} & \$3{,}175{,}901 + \$3{,}159 \times \text{faculty} + \$54 \times \text{undergraduates} + \$697 \times \text{graduate students} \\ & + \$93{,}988 \times \text{PhD fields} + \$1{,}392 \times \text{100 thousand in govt grants investment returns} \end{aligned} \]

The model's adjusted \(R^2\) is a fairly robust 0.72 (the multiple \(R^2\) is 0.73). The number of faculty, PhD fields, and grants/endowments are positively and significantly related to total expenditures. The number of graduate students is marginally related to total expenditures (p = 0.14).
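As a quick check on how the adjusted figure in the output relates to the multiple \(R^2\), the standard adjustment can be worked through by hand. This assumes \(p = 5\) predictors and \(n = 98\) observations, the latter inferred from the 92 residual degrees of freedom rather than stated in the report:

\[ \bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - (1 - 0.73)\,\frac{97}{92} \approx 0.715 \]

This is in line with the reported 0.7154; the small difference comes from using the rounded value 0.73.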

The predicted budget for UVA under this model is:

## VIRGINIA 
## 31968668
## Predicted: $31,968,668 
## Actual:    $40,027,846
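For readers who want to see the mechanics, a prediction like the one above can be produced with `lm()` and `predict()`. The sketch below uses simulated data, not the ARL data: the data frame `arl_sim` and its values are invented for illustration, while the formula and the `VIRGINIA` row name follow the output shown in this report.

```r
# Simulated stand-in for the ARL data; all values here are invented.
set.seed(1)
n <- 98
arl_sim <- data.frame(
  fac      = round(runif(n, 500, 4000)),
  ugrad    = round(runif(n, 5000, 40000)),
  gradstu  = round(runif(n, 1000, 15000)),
  phdfld   = round(runif(n, 20, 100)),
  invgrant = round(runif(n, 500, 15000))
)
arl_sim$totexp <- 3e6 + 3000 * arl_sim$fac + 50 * arl_sim$ugrad +
  700 * arl_sim$gradstu + 9e4 * arl_sim$phdfld + 1400 * arl_sim$invgrant +
  rnorm(n, sd = 9e6)
# Institution names as row names, so one row can be selected by name
rownames(arl_sim) <- c("VIRGINIA", paste0("INST", seq_len(n - 1)))

# Same formula as in the report, fit to the simulated data
fit <- lm(totexp ~ fac + ugrad + gradstu + phdfld + invgrant, data = arl_sim)

# Fitted value for one institution, compared with its observed value
pred_uva   <- predict(fit, newdata = arl_sim["VIRGINIA", ])
actual_uva <- arl_sim["VIRGINIA", "totexp"]
cat(sprintf("Predicted: $%s\nActual:    $%s\n",
            format(round(pred_uva), big.mark = ","),
            format(round(actual_uva), big.mark = ",")))
```

Because the newdata row keeps its row name, `predict()` returns a vector named `VIRGINIA`, which matches the shape of the raw output above.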

Top 40-ranked ARL institutions in US

This model is estimated on the sample of the 40 US institutions ranked highest on the ARL index.

## 
## Call:
## lm(formula = totexp ~ fac + ugrad + gradstu + phdfld + invgrant, 
##     data = arl_totexp1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -25630144  -7261803  -1950306   6956679  29889251 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8163084.7  6296980.5   1.296    0.204    
## fac            1123.5     2067.2   0.543    0.590    
## ugrad           198.5      192.4   1.032    0.310    
## gradstu         698.0      713.3   0.979    0.335    
## phdfld        77192.6    71161.2   1.085    0.286    
## invgrant       1551.6      249.1   6.228 4.36e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11450000 on 34 degrees of freedom
## Multiple R-squared:  0.6714, Adjusted R-squared:  0.6231 
## F-statistic: 13.89 on 5 and 34 DF,  p-value: 2.041e-07

\[ \begin{aligned} \text{Expenditures} = {} & \$8{,}163{,}085 + \$1{,}123 \times \text{faculty} + \$199 \times \text{undergraduates} + \$698 \times \text{graduate students} \\ & + \$77{,}193 \times \text{PhD fields} + \$1{,}552 \times \text{100 thousand in govt grants investment returns} \end{aligned} \]

Note the more modest adjusted \(R^2\) value (0.62; the multiple \(R^2\) is 0.67). Only grants/endowments remains significantly related to total expenditures in this subsample.
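To make the significance statement concrete, the t values and two-sided p-values can be recomputed from the printed estimates and standard errors. This is a minimal sketch using the rounded figures from the coefficient table above, so results agree with the output only up to rounding; the column names `t_value`, `p_value`, and `sig_05` are my own.

```r
# Estimates and standard errors copied from the top-40 coefficient table
coefs <- data.frame(
  term     = c("fac", "ugrad", "gradstu", "phdfld", "invgrant"),
  estimate = c(1123.5, 198.5, 698.0, 77192.6, 1551.6),
  std_err  = c(2067.2, 192.4, 713.3, 71161.2, 249.1)
)
df_resid <- 34  # residual degrees of freedom reported in the output

# t statistic = estimate / standard error; two-sided p-value from the t distribution
coefs$t_value <- coefs$estimate / coefs$std_err
coefs$p_value <- 2 * pt(-abs(coefs$t_value), df = df_resid)
coefs$sig_05  <- coefs$p_value < 0.05  # conventional 5% threshold

print(coefs, digits = 3)
```

Only the `invgrant` row clears the 5% threshold, which is the pattern described above.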

The UVA library budget prediction under this model:

## Predicted: $36,887,657 
## Actual:    $40,027,846

ARL Index within 20 ranks of UVA

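This model is estimated on the sample of 40 US institutions within 20 rankings of UVA (26th). Note that the 33 residual degrees of freedom, with six estimated coefficients, imply that 39 observations entered the fit.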

## 
## Call:
## lm(formula = totexp ~ fac + ugrad + gradstu + phdfld + invgrant, 
##     data = arl_totexp2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -10667918  -5723786  -1221212   3916004  19127701 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 16646648.2  4676072.0   3.560  0.00115 ** 
## fac             -140.0     1480.7  -0.095  0.92527    
## ugrad            213.3      140.9   1.513  0.13969    
## gradstu          466.6      500.8   0.932  0.35816    
## phdfld         88588.2    60384.1   1.467  0.15182    
## invgrant         859.9      210.2   4.091  0.00026 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7701000 on 33 degrees of freedom
## Multiple R-squared:  0.5003, Adjusted R-squared:  0.4246 
## F-statistic: 6.608 on 5 and 33 DF,  p-value: 0.0002314

\[ \begin{aligned} \text{Expenditures} = {} & \$16{,}646{,}648 - \$140 \times \text{faculty} + \$213 \times \text{undergraduates} + \$467 \times \text{graduate students} \\ & + \$88{,}588 \times \text{PhD fields} + \$860 \times \text{100 thousand in govt grants investment returns} \end{aligned} \]

The explanatory power of the model is reduced further in this subsample, with an adjusted \(R^2\) of 0.42 (the multiple \(R^2\) is 0.50). Only grants/endowments remains a significant predictor of expenditures, though the number of undergraduates and PhD fields are marginally related (p = 0.14 and p = 0.15).

The predicted UVA library budget is:

## Predicted: $36,111,099 
## Actual:    $40,027,846
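To put the three under-predictions side by side, the gaps can be computed directly from the figures reported above. This is a small sketch; the data frame `models` and its column names are my own, and the residual standard errors are copied from the three model summaries.

```r
# Reported actual budget, predictions, and residual standard errors
actual <- 40027846
models <- data.frame(
  sample    = c("all ARL", "top 40", "near UVA"),
  predicted = c(31968668, 36887657, 36111099),
  resid_se  = c(9618000, 11450000, 7701000)
)

# Under-prediction in dollars, as a share of actual, and in residual SE units
models$gap       <- actual - models$predicted
models$gap_pct   <- round(100 * models$gap / actual, 1)
models$gap_in_se <- round(models$gap / models$resid_se, 2)

print(models)
```

The gaps come to about $8.06 million, $3.14 million, and $3.92 million (roughly 20%, 8%, and 10% of the actual budget). Each is smaller than the corresponding model's residual standard error.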