This demonstration is a modification of the example available on the nomogram() help page. It demonstrates a nomogram for a logistic regression model using simulated data.

Simulate data for logistic regression

First we generate 1000 subjects. The variables “age”, “blood.pressure” and “cholesterol” are random draws from a Normal distribution. “sex” is just randomly sampled like a coin flip. The set.seed(17) function ensures we always generate the same “random” data. We save everything in a data frame and name it “d”.

library(rms)
n <- 1000    # define sample size
set.seed(17) # so can reproduce the results
d <- data.frame(age = rnorm(n, 50, 10),
                blood.pressure = rnorm(n, 120, 15),
                cholesterol = rnorm(n, 200, 25),
                sex = factor(sample(c('female','male'), n,TRUE)))

Next we simulate zeroes and ones. (Perhaps 1 equals “Death” and 0 equals “Live”.) To do this we first generate a linear predictor, L. It’s just a weighted function of the variables we generated. This our “true” model.

Then we convert L to probabilities using the plogis() function. This is also known as the inverse logit function. It takes values on the range \((-\infty, \infty)\) and re-scales to the probability scale, [0,1]. Finally we use the rbinom() function to simulate flipping a coin 1000 times, but with probability p at each flip. The result, y, is a series of zeroes and ones and is added to our data frame

L <- 0.01 + .1*(d$sex=='male') + -.02*d$age + 
  0.01*d$cholesterol + -0.01*d$blood.pressure
p <- plogis(L)
d$y <- rbinom(n = 1000, size = 1, prob = p)
table(d$y)
## 
##   0   1 
## 517 483

Fit a logistic regression model

Next we “work backwards” and try to recover the true model we used to generate the data. Notice we use the lrm() function from the rms package. (We have to use lrm() to make the nomogram! We can’t use glm().) We fit the “correct” model and save to f. Notice printing f generates a very informative summary that goes beyond what we get using glm(). The values in the Coef column are very close to what we specifed in our linear predictor model above.

f <- lrm(y ~ age + sex + cholesterol + blood.pressure,
         data = d)
f
## Logistic Regression Model
##  
##  lrm(formula = y ~ age + sex + cholesterol + blood.pressure, data = d)
##  
##                        Model Likelihood    Discrimination    Rank Discrim.    
##                              Ratio Test           Indexes          Indexes    
##  Obs          1000    LR chi2     23.09    R2       0.030    C       0.590    
##   0            517    d.f.            4    g        0.349    Dxy     0.179    
##   1            483    Pr(> chi2) 0.0001    gr       1.418    gamma   0.179    
##  max |deriv| 3e-07                         gp       0.085    tau-a   0.089    
##                                            Brier    0.244                     
##  
##                 Coef    S.E.   Wald Z Pr(>|Z|)
##  Intercept      -0.0115 0.8108 -0.01  0.9887  
##  age            -0.0209 0.0064 -3.28  0.0010  
##  sex=male       -0.1584 0.1284 -1.23  0.2174  
##  cholesterol     0.0080 0.0026  3.04  0.0024  
##  blood.pressure -0.0045 0.0043 -1.06  0.2892  
## 

Create a nomogram

Now we make the nomogram. To do that we first need to use the datadist() function. This creates a mini dataframe of values that are used for plotting purposes. It is required for the nomogram function. Then we set the datadist argument of the options() function to the datadist object we created. It’s a little weird to me and I’m not quite sure I understand what it’s doing!

ddist <- datadist(d)
options(datadist='ddist')

Finally we get to the nomogram() function. The first argument is the model we fit, f. The next argument, fun, is for transforming the linear predictor. The model is going to predict values on the range of \((-\infty, \infty)\), so we use the plogis function to transform to the probability scale [0,1]. And then we label the outcome “Risk of Death”. Save to object nom and call plot() on it.

nom <- nomogram(f, fun=plogis, funlabel="Risk of Death")
plot(nom)

Here’s how to use. Pick a point on the age, sex, cholesterol and blood pressure scales and get the associated Points directly above it. Then total those points, and then get the associated probability (Risk of Death) corresponding to the Total Points.

For example:

Add up the points:

50 + 10 + 10 + 0 = 70 Total Points

70 Total Points lines up a little less than 0.35. So a female age 50 with 140 cholesterol and 170 blood pressure has about 35% risk of death.

We can check our work using the rms Predict() function. The predicted probability is about 0.33. Notice yet again we use the plogis function to get the prediction on the probability scale.

Predict(f, age=50, sex='female', cholesterol=140, blood.pressure=170,
        fun=plogis)
##   age    sex cholesterol blood.pressure      yhat     lower     upper
## 1  50 female         140            170 0.3313907 0.2233398 0.4607074
## 
## Response variable (y):  
## 
## Limits are 0.95 confidence limits