Model for Total Expenditures: Stubbs update

Below I’ve recreated the model generated by Kendon Stubbs, as described in documents provided by Donna, using the same 10 universities and the same independent variables. UVA is excluded from the estimation, leaving nine universities, and the fitted model is then used to generate a prediction for UVA.

## 
## Call:
## lm(formula = totexp ~ fac + ugrad + gradstu + phdfld, data = arl_ks)
## 
## Residuals:
##        1        2        3        4        5        6        7        8 
##  7066644   840897   410120  4436274 -4056147  2521849  -807901 -6647013 
##        9 
## -3764722 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 18639521.0  7523974.3   2.477   0.0684 .
## fac             3074.7     3286.9   0.935   0.4025  
## ugrad            215.6      401.2   0.537   0.6195  
## gradstu         2674.1     1131.4   2.364   0.0774 .
## phdfld        -18643.4   185359.6  -0.101   0.9247  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6171000 on 4 degrees of freedom
## Multiple R-squared:  0.8607, Adjusted R-squared:  0.7215 
## F-statistic: 6.181 on 4 and 4 DF,  p-value: 0.05278

The estimated regression model is
\[ \begin{aligned} \text{Expenditures} = {} & \$18{,}639{,}521 + \$3{,}075 \times \text{faculty} + \$216 \times \text{undergraduates} \\ & + \$2{,}674 \times \text{graduate students} - \$18{,}643 \times \text{PhD fields} \end{aligned} \]
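To make the equation concrete, here is a minimal sketch that applies the estimated coefficients to a set of inputs. The coefficients are copied from the summary output; the helper name `predict_totexp` and the input values are hypothetical and are not UVA's (or any university's) actual counts.

```r
# Coefficient estimates copied from the lm() summary output.
coefs <- c(
  intercept = 18639521.0,
  fac       = 3074.7,
  ugrad     = 215.6,
  gradstu   = 2674.1,
  phdfld    = -18643.4
)

# Hypothetical helper: apply the fitted equation to one institution's inputs.
predict_totexp <- function(fac, ugrad, gradstu, phdfld) {
  unname(
    coefs["intercept"] +
      coefs["fac"] * fac +
      coefs["ugrad"] * ugrad +
      coefs["gradstu"] * gradstu +
      coefs["phdfld"] * phdfld
  )
}

# Illustrative inputs only (assumed values, not real data).
example_pred <- predict_totexp(fac = 1000, ugrad = 10000, gradstu = 5000, phdfld = 50)
print(format(round(example_pred), big.mark = ","))
```

With the fitted `lm` object and the UVA row of the data in hand, `predict()` would do the same arithmetic without retyping coefficients.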

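The model has a respectable $R^2$ of 0.86 (adjusted $R^2$ of 0.72), but only the number of graduate students is even marginally related to total expenditures (p = 0.077); no individual predictor is significant at the .05 level. And the significance of the input variables collectively, that is, the significance of the overall model, falls just short of .05 (the p-value on the model F-statistic is 0.053). With nine universities and four predictors, only four residual degrees of freedom remain, so all of these estimates are imprecise.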
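As a rough check on those figures, the adjusted $R^2$, the F-statistic, and the p-values can be recomputed from numbers printed in the summary output. This is a sketch using the rounded printed values, so results may differ from the output in the third or fourth decimal place.

```r
# Values copied from the lm() summary output.
r2 <- 0.8607          # multiple R-squared
n  <- 9               # universities used in the estimation (UVA excluded)
k  <- 4               # predictors: fac, ugrad, gradstu, phdfld
df_resid <- n - k - 1 # residual degrees of freedom

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Overall F-statistic and its upper-tail p-value.
f_stat <- (r2 / k) / ((1 - r2) / df_resid)
f_pval <- pf(f_stat, df1 = k, df2 = df_resid, lower.tail = FALSE)

# Two-sided p-value for the gradstu coefficient (estimate / std. error).
t_gradstu <- 2674.1 / 1131.4
p_gradstu <- 2 * pt(abs(t_gradstu), df = df_resid, lower.tail = FALSE)

print(round(c(adj_r2 = adj_r2, f_stat = f_stat, f_pval = f_pval,
              t_gradstu = t_gradstu, p_gradstu = p_gradstu), 4))
```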

The model's prediction for UVA expenditures is above actual UVA expenditures in this case. But the selection of these nine comparison universities is a point of concern (and prone to criticisms of data dredging, that is, selecting a sample in order to achieve a desired result).

## VIRGINIA 
## 41886709

## Predicted: $41,886,709 
## Actual:    $40,027,846
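To put the gap between the two figures in context, the sketch below computes it in dollars, as a percentage of actual expenditures, and relative to the model's residual standard error. All three inputs are copied from the output above; the residual standard error is used only as a rough yardstick and is not the standard error of the prediction, which would be larger.

```r
# Figures copied from the output above.
predicted <- 41886709   # model prediction for UVA
actual    <- 40027846   # actual UVA expenditures
resid_se  <- 6171000    # residual standard error from the lm() summary

gap       <- predicted - actual   # dollars over actual
gap_pct   <- 100 * gap / actual   # percent over actual
gap_in_se <- gap / resid_se       # gap in residual-standard-error units

cat("Gap:            $", format(gap, big.mark = ","), "\n", sep = "")
cat("Gap (% actual): ", round(gap_pct, 1), "%\n", sep = "")
cat("Gap / resid SE: ", round(gap_in_se, 2), "\n", sep = "")
```

The prediction exceeds actual expenditures by roughly 4.6%, which is well under one residual standard error.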

Interestingly, this model doesn’t generate predictions that differ notably from the 2% growth rate. This is largely because, in the revised models I generated earlier, the grants and endowments variable carries a disproportionate share of the explanatory weight.

January 30, 2019