Logistic Regression Review

First, let’s review:

With logistic regression, our dependent variable is binary. There are types of logistic regression where the dependent variable can take on more than two categories, such as multinomial (unordered) and ordinal (ordered) logistic regression, but they are beyond the scope of this chapter. For now, our dependent variables will take on binary values.

When we model logistic regressions, instead of plotting points on a line (as we do in linear and multivariate regression), we plot the probabilities of our \(y\) variable being one of two binary outcomes given any value of our \(x\) variable. In other words, we model the probability that \(y = 1\) given \(x\). In the form of an equation, logistic regression looks like this:

  • \(P(y = 1) = logit^{-1}(\beta_0 + \beta_1x)\)

Or:

  • \(P(y = 1) = \frac{e^{\beta_0 + \beta_1x}}{1 + e^{\beta_0 + \beta_1x}}\)

Also known as:

  • \(P(y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x)}}\)

We can also take the logits of both sides of this equation in order to discern the difference a change in the “logit scale” (the right-hand side of the below equation) makes on the “probability scale” (the left-hand side of the below equation) (Gelman and Hill):

  • \(logit(P(y = 1)) = \beta_0 + \beta_1x\)

Take the logistic regression equation Gelman and Hill use, \(P(y = 1) = logit^{-1}(-1.40 + 0.33x)\):

If we take the logit of both sides, we get: \(logit(P(y = 1)) = -1.40 + 0.33x\).

The halfway point of this logistic regression line will be where the probability that \(y = 1\) is 0.5. In order for \(P(y = 1) = 0.5\) (for the left-hand side of this equation to be \(logit(0.5)\)) the right-hand side of this equation, \(-1.40 + 0.33x\), must equal 0, as \(\frac{e^0}{1 + e^0} = \frac{1}{2} = P(y = 1)\). So, the halfway point will be where \(logit(0.5) = -1.40 + 0.33x = 0\) and where, therefore, \(x = 4.2\). And since logistic regression lines take on more S-shaped curves, the halfway point will also be where the slope of the line is at its steepest, causing the steepest change in the probability scale, \(logit(P(y = 1))\), given a change in the logit scale, \(-1.40 + 0.33x\).

Change the logit scale by 0.4 (i.e. now \(-1.40 + 0.33x = 0.4\) and \(x = 5.45\)), and \(P(y = 1)\) now equals roughly \(0.6\), as \(\frac{e^{-1.40 + 0.33(5.45)}}{1 + e^{-1.40 + 0.33(5.45)}} = 0.6 = P(y = 1)\). Thus, the new formula is \(logit(0.6) = -1.40 + 0.33(5.45) = 0.4\), which represents a 10% change in the probability scale (from \(logit(0.5)\) to \(logit(0.6)\)) given a 0.4 change in the logit scale (from \(-1.40 + 0.33x = 0\) to \(-1.40 + 0.33x = 0.4\)).

Let’s move further up the line from the halfway point, say, where \(P(y = 1) = 0.9\) and so \(-1.40 + 0.33x = 2.2\). If we change the logit scale by 0.4 again, the change in the probability scale will be more gradual, as the line is less steep here. Adding 0.4 to the logit scale, we get: \(-1.40 + 0.33x = 2.6\), \(x = 12.12\), and \(P(y = 1) = 0.93\). Thus, the new formula is \(logit(0.93) = -1.40 + 0.33(12.12) = 2.6\), which represents only a 3% change in the probability scale (from \(logit(0.9)\) to \(logit(0.93)\)) given the same 0.4 change in the logit scale (from \(-1.40 + 0.33x = 2.2\) to \(-1.40 + 0.33x = 2.6\)).
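
If you want to verify these probabilities yourself, here is a minimal sketch in R using the invlogit (inverse logit) function from the arm library, which we will meet again shortly; it assumes arm is installed and loaded:

library(arm)

# Probability at the halfway point, where the logit scale equals 0

invlogit(0)     # 0.5

# Probability after a 0.4 change in the logit scale

invlogit(0.4)   # roughly 0.60

# Probabilities further up the curve, at logit-scale values of 2.2 and 2.6

invlogit(2.2)   # roughly 0.90
invlogit(2.6)   # roughly 0.93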

While the equations for logistic regression models are akin to those for linear regression models—as you might have noticed, both equations contain the “phrase” \(\beta_0 + \beta_1x\)—interpreting the coefficients of logistic regression models is very different from interpreting those of linear regression models. Since we are plotting probabilities and since logistic regression lines take on more of an S-shape, interpreting the coefficients of logistic regression models can be confusing: if the effect a change in \(x\) has on \(P(y = 1)\) is not constant throughout the curve, how do we make any useful statement on how a change in \(x\) affects the probability that \(y = 1\)? There are a few ways available:

Interpreting Logistic Regression Coefficients

Evaluating changes at and around the mean of our x variable

If we choose this method, we first calculate the probability that \(y = 1\) at the mean of our \(x\) variable, using the formula (where \(\bar{x}\) represents the mean of our \(x\) variable):

  • \(P(y = 1) = logit^{-1}(\beta_0 + \beta_1\bar{x})\)

Using Gelman and Hill’s logistic model above, we can calculate \(P(y = 1)\) at the mean of \(x\) as:

  • \(logit^{-1}(-1.40 + 0.33\bar{x})\) where \(\bar{x} = 3.1\)

In R, we can calculate the above expression using the invlogit function, found in the arm library. The syntax looks like this:

  • invlogit\((\beta_0 + \beta_1mean(x))\)

And the evaluation of it on Gelman and Hill’s model looks like this:

invlogit(-1.40 + 0.33*3.1)
## [1] 0.4068507

In order to calculate the difference a unit change in \(x\) makes on \(P(y = 1)\), we then calculate the difference in the probabilities that \(y = 1\) using values around the mean of \(x\). Because, in Gelman and Hill’s model, the mean of \(x\) is 3.1, let’s evaluate the difference using the \(x\) values 2.6 and 3.6, each 0.5 away from the mean and one unit away from each other:

invlogit(-1.40 + 0.33*3.6) - invlogit(-1.40 + 0.33*2.6)
## [1] 0.07947516

From this calculation, we can see that a one unit change in \(x\) results in a positive change of about 8% in the probability of \(y = 1\).

Evaluating change using the derivative of the curve at the mean of our x variable

We can also calculate the derivative of the logistic regression curve at the mean of our \(x\) variable in order to find the slope of the curve (and therefore the change in \(P(y = 1)\) per change in \(x\)) at that point. Taking the derivative of our initial function, \(logit^{-1}(\beta_0 + \beta_1x)\), we get:

  • \(\frac{\beta_1e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2}\)
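
For those curious where this expression comes from, applying the quotient rule to \(\frac{e^{\beta_0 + \beta_1x}}{1 + e^{\beta_0 + \beta_1x}}\) gives:

  • \(\frac{\beta_1e^{\beta_0 + \beta_1x}(1 + e^{\beta_0 + \beta_1x}) - \beta_1e^{\beta_0 + \beta_1x} \cdot e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2} = \frac{\beta_1e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2}\)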

Going back to the Gelman and Hill model, remembering that our \(\beta_0 + \beta_1x\) is equal to \(-1.40 + 0.33x\), and remembering that we are evaluating the derivative at the mean of \(x\), which equals 3.1, our calculation of the slope of the logistic regression curve is then:

  • \(\frac{0.33e^{-1.40 + 0.33(3.1)}}{(1 + e^{-1.40 + 0.33(3.1)})^2}\)

Which equals 0.07963666 (or a roughly 8% difference in probability), a very, very similar value to the one found using the previous method.

*N.B. Gelman and Hill calculated this expression as equaling 0.13, which is a mistake. #awkward
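
As a quick sanity check, here is a minimal sketch of this derivative calculation in R, using base R’s exp function and the coefficients and mean of \(x\) from Gelman and Hill’s model above:

# Slope of Gelman and Hill's logistic regression curve at the mean of x

b0 <- -1.40
b1 <- 0.33
x_bar <- 3.1

(b1 * exp(b0 + b1 * x_bar)) / (1 + exp(b0 + b1 * x_bar))^2   # roughly 0.0796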

Evaluating change using the “Divide by 4 Rule”

Another method of interpretation is known as the “Divide by 4 Rule”. Let’s find out why this rule is named as it is:

With the “Divide by 4 Rule”, we are evaluating change at the steepest point on the logistic regression curve: the halfway point, where \(\beta_0 + \beta_1x = 0\) and \(P(y = 1) = 0.5\). Recall the derivative of the logistic regression curve laid out above:

  • \(\frac{\beta_1e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2}\)

To evaluate change at the steepest point on the logistic regression curve—in other words, to calculate the slope at the steepest point on the curve—we plug 0 in for \(\beta_0 + \beta_1x\) in this expression. Doing so yields:

  • \(\frac{\beta_1e^{0}}{(1 + e^{0})^2} = \frac{\beta_1}{(1 + 1)^2} = \frac{\beta_1}{4}\)

Thus, \(\frac{\beta_1}{4}\) is the greatest possible change in \(P(y = 1)\) given a unit change in \(x\).

To follow the “Divide by 4 Rule”, we divide our logistic regression coefficients (that are not the intercept) by 4 in order to get a sense of the upper limit of the predictive change in \(P(y = 1)\) given a unit change in \(x\). Hence, the name!

Using Gelman and Hill’s model once again, following the “Divide by 4 Rule”, we divide our \(\beta_1\), 0.33, by 4 to get 0.0825. Thus, a unit change in \(x\) results in no greater than a positive 8.25% change in \(P(y = 1)\). Note that this value of 0.0825 is very close to the values we got using the previous two methods, 0.0795 and 0.0796, respectively!
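
To see all three methods side by side, here is a minimal sketch in R (again assuming the arm library is loaded for invlogit):

# Mean method: difference in probability across a one unit change around the
# mean of x

invlogit(-1.40 + 0.33*3.6) - invlogit(-1.40 + 0.33*2.6)        # roughly 0.0795

# Derivative method: slope of the curve at the mean of x

0.33*exp(-1.40 + 0.33*3.1) / (1 + exp(-1.40 + 0.33*3.1))^2     # roughly 0.0796

# "Divide by 4 Rule": upper limit on the change in probability per unit change
# in x

0.33/4                                                         # 0.0825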

Modeling Logistic Regressions in R, A Review

For the purposes of demonstrating both classical and Bayesian logistic regression, we will be using the gapminder dataset, looking only at values from 2007, the most recent year in the dataset. Let’s say that we want to make a logistic regression model with a binary dependent variable, is_long, which, if lifeExp is greater than 71.93550 (the 50th percentile value of lifeExp), is coded as 1 and which, if lifeExp is less than or equal to 71.93550, is coded as 0. And let’s say that, using this model, we want to examine the effect gdpPercap has on the probability that life expectancy is_long. We can then define our regression as: \(P(is\_long = 1 | gdpPercap)\), or \(P(is\_long = 1) = logit^{-1}(\beta_0 + \beta_1gdpPercap)\).

Study the code (and the code comments!) below to find out how to model this logistic regression in R, using the classical method.

First, we clean and make our data, using only the variables in which we are interested and creating our binary dependent variable, is_long.

# Create a new dataset called "gapminder2007", which includes only "gapminder"
# values from the year 2007.

gapminder2007 <- gapminder %>% 
  filter(year == 2007)

# Get the quantile values for the "lifeExp" variable in order to see where the
# "cut-point" should be for our new variable, "is_long".  Here, we use the 50th
# percentile value, 71.93550, as the "cut-point" so that any `lifeExp` greater
# than that value is considered long and, conversely, any `lifeExp` less than or
# equal to that value is not.

quantile(gapminder2007$lifeExp)
##       0%      25%      50%      75%     100% 
## 39.61300 57.16025 71.93550 76.41325 82.60300
# Create a new dataset called "gapminder_lifeExp", which includes only the
# variables "lifeExp" and "gdpPercap" from the "gapminder2007" dataset we
# created above, and which creates the y-variable for our logistic regression
# model, "is_long".

gapminder_lifeExp <- gapminder2007 %>%
  select(lifeExp, gdpPercap) %>% 
  
  # Here, "is_long" is coded as 1 if "lifeExp" is greater than 71.93550, the
  # 50th percentile value of "lifeExp", and "is_long" is coded as 0 if "lifeExp"
  # is less than or equal to 71.93550.
  
  mutate(is_long = ifelse(lifeExp > 71.93550, 1, 0))

Next, we make our logistic regression model!

In the summary output, we get our model’s coefficient estimates, including its intercept (-2.682) and its gdpPercap coefficient estimate (0.0003583). Thus, our logistic regression can be written as the following: \(P(is\_long = 1) = logit^{-1}(-2.682 + 0.0003583gdpPercap)\).

In the confint output, we get 90% confidence intervals for both our model’s intercept—(-3.5038181796, -1.97558421)—and its gdpPercap coefficient—(0.0002592351, 0.00047599). I chose to set the level to 90% instead of 95%, as I think it is helpful to compare these 90% confidence intervals to the Bayesian 90% credible intervals we will see later.

# Fit the logistic regression model using the "glm" function, storing it as
# "logistic_model".  Here, we are making "is_long" a function of "gdpPercap"
# (i.e. "is_long" is our y, and "gdpPercap" is our x).  We specify the family to
# be "binomial" so that our model is logistic (in the empty parentheses, family
# = binomial() automatically supplies the argument link = "logit"), and we
# specify our data to be our "gapminder_lifeExp" dataset.

logistic_model <- glm(is_long ~ gdpPercap, family = binomial(), data = gapminder_lifeExp)

# Get important summary information about our model

summary(logistic_model)
## 
## Call:
## glm(formula = is_long ~ gdpPercap, family = binomial(), data = gapminder_lifeExp)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7546  -0.4785  -0.1905   0.2074   1.9793  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.682e+00  4.616e-01  -5.809 6.29e-09 ***
## gdpPercap    3.583e-04  6.574e-05   5.451 5.02e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 196.854  on 141  degrees of freedom
## Residual deviance:  94.914  on 140  degrees of freedom
## AIC: 98.914
## 
## Number of Fisher Scoring iterations: 7
# Get 90% confidence intervals for our model

confint(logistic_model, level = 0.90)
##                       5 %        95 %
## (Intercept) -3.5038181796 -1.97558421
## gdpPercap    0.0002592351  0.00047599

Finally, we can visualize our logistic regression model!

# Plot the "gapminder_lifeExp" dataset using ggplot, setting "gdpPercap" as the
# x-axis variable and "is_long" as the y-axis variable

ggplot(gapminder_lifeExp, aes(x = gdpPercap, y = is_long)) + 
  
  # Make the plot a jittered scatterplot, setting width to 0 so that points
  # align with their precise "gdpPercap" values, setting height to 0.05 so that
  # points do not lie on top of each other, and setting alpha to 0.5 so that we
  # can better see individual points
  
  geom_jitter(width = 0, height = .05, alpha = 0.5) + 
  
  # Add the logistic regression line, specifying the method to be "glm",
  # suppressing the plotting of standard error, and specifying the family to be
  # "binomial" so that the regression line is logistic.
  
  geom_smooth(method = "glm", se = FALSE, method.args = list(family = "binomial"))

Interpreting our logistic regression model

So we’ve made and visualized a classical logistic regression model—yay! Now, let’s apply some of the interpretive skills we’ve acquired.

The logistic regression model we just made is: \(P(is\_long = 1) = logit^{-1}(-2.682 + 0.0003583gdpPercap)\)
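
In the calculations that follow, I type the coefficient estimates in by hand, but note that you could also pull them straight out of the fitted model object; a minimal sketch (assuming logistic_model from above is still in memory):

# Extract the model's coefficient estimates as a named vector

coef(logistic_model)

# Use the extracted coefficients instead of hard-coded values, for example to
# calculate P(is_long = 1) at the mean value of "gdpPercap"

invlogit(coef(logistic_model)[1] + coef(logistic_model)[2]*mean(gapminder_lifeExp$gdpPercap))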

Using the mean method

Using the mean method outlined above (see the section entitled “Evaluating changes at and around the mean of our x variable”), we can calculate \(P(y = 1)\) at the mean value of gdpPercap, which here is about 11,680.07:

invlogit(-2.682 + 0.0003583*mean(gapminder_lifeExp$gdpPercap))
## [1] 0.818017

From this calculation, we can see that, at the mean value of gdpPercap, the probability that is_long equals 1 is about 81.80%.

We can also evaluate the effect a unit change in gdpPercap has on the probability that is_long equals 1 around the mean value of gdpPercap. Here, I will treat a difference of 5,000 international dollars as a “unit change” in gdpPercap, as the values of gdpPercap range from about 277.55 to 49,357.19, and 5,000 is roughly a tenth of the difference between those two values—the equivalent of one step on a ten-step scale. Note that the ggplot of our model above has gridlines for every 5,000 difference in gdpPercap. Also note, however, that this definition of a “unit change” is mine and is subjective.

invlogit(-2.682 + 0.0003583*(mean(gapminder_lifeExp$gdpPercap) + 2500)) - invlogit(-2.682 + 0.0003583*(mean(gapminder_lifeExp$gdpPercap) - 2500))
## [1] 0.2694233

From this calculation, we can see that a one unit change (5,000 international dollars) around the mean value of gdpPercap results in about a 26.94% change in the probability that is_long equals 1.

Using the derivative method

Using the derivative method outlined above (see the section entitled “Evaluating change using the derivative of the curve at the mean of our x variable”), we can calculate the slope of our logistic regression curve at the mean value of gdpPercap:

Recall the derivative of the logistic regression curve:

  • \(\frac{\beta_1e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2}\)

For our model, the calculation will be:

  • \(\frac{0.0003583e^{-2.682 + 0.0003583mean(gdpPercap)}}{(1 + e^{-2.682 + 0.0003583mean(gdpPercap)})^2}\)

We can perform this calculation in R like so:

# Store "exp(1)", which is e in R, as "e"

e <- exp(1)

# Calculate the slope of our logistic regression curve at the mean value of
# "gdpPercap"

(0.0003583*e^(-2.682 + 0.0003583*mean(gapminder_lifeExp$gdpPercap)))/(1 + e^(-2.682 + 0.0003583*mean(gapminder_lifeExp$gdpPercap)))^2
## [1] 5.33384e-05

From this calculation, we have determined that the slope of our logistic regression curve at the mean value of gdpPercap is 0.0000533384. This number is minuscule! But it is minuscule only because it corresponds to a 1 international dollar change in gdpPercap. Earlier, we defined a one unit change in gdpPercap to be 5,000 international dollars. Therefore, we must multiply this slope by 5,000 in order to get our adjusted slope—or the magnitude of the effect a one unit change in gdpPercap has on the probability of is_long equaling 1.

# Multiply our earlier calculation by 5,000 to get the slope adjusted to our
# definition of a one unit change in "gdpPercap"

5000*(0.0003583*e^(-2.682 + 0.0003583*mean(gapminder_lifeExp$gdpPercap)))/(1 + e^(-2.682 + 0.0003583*mean(gapminder_lifeExp$gdpPercap)))^2
## [1] 0.266692

Much better! As you can see, this slope indicates that, at the mean value of gdpPercap, a one unit change in gdpPercap results in a roughly 26.67% change in probability that is_long equals 1. This finding is relatively consistent with the one above.

Using the “Divide by 4 Rule”

Using the “Divide by 4 Rule”, we can simply divide our estimated \(\beta_1\) by 4 and then multiply that value by 5,000 in order to get the upper limit of the predictive change in the probability that is_long equals 1 given a unit change in gdpPercap. Doing so yields:

5000*(0.0003583/4)
## [1] 0.447875

This value is significantly larger than the values we calculated using the other two methods. But remember that this is the upper limit of the predictive change in probability—an approximation of the slope of our logistic regression curve at its very steepest point, the halfway point. Thus, we interpret this result to mean that a unit change in gdpPercap, here a change of 5,000 international dollars, results in no greater than a positive 44.79% change in the probability that is_long equals 1.

Modeling Bayesian Logistic Regressions in R

Let’s say that we now want to make a Bayesian logistic regression model with a binary dependent variable, is_long, which, if lifeExp is greater than 71.93550 (the 50th percentile value of lifeExp), is coded as 1 and which, if lifeExp is less than or equal to 71.93550, is coded as 0. And let’s say that, using this Bayesian model, we want to examine the effect gdpPercap has on the probability that life expectancy is_long. We can then define our regression as: \(P(is\_long = 1 | gdpPercap)\), or \(P(is\_long = 1) = logit^{-1}(\beta_0 + \beta_1gdpPercap)\).

We have already made the classical version of this model (see the section entitled “Modeling Logistic Regressions in R, A Review”). Now study the code (and the code comments!) below to find out how to model this logistic regression in R, using the Bayesian method.

As before, we first clean and make our data, using only the variables in which we are interested and creating our binary dependent variable, is_long.

# Create a new dataset called "gapminder2007", which includes only "gapminder"
# values from the year 2007.

gapminder2007 <- gapminder %>% 
  filter(year == 2007)

# Get the quantile values for the "lifeExp" variable in order to see where the
# "cut-point" should be for our new variable, "is_long".  Here, we use the 50th
# percentile value, 71.93550, as the "cut-point" so that any `lifeExp` greater
# than that value is considered long and, conversely, any `lifeExp` less than or
# equal to that value is not.

quantile(gapminder2007$lifeExp)
##       0%      25%      50%      75%     100% 
## 39.61300 57.16025 71.93550 76.41325 82.60300
# Create a new dataset called "gapminder_lifeExp", which includes only the
# variables "lifeExp" and "gdpPercap" from the "gapminder2007" dataset we
# created above, and which creates the y-variable for our logistic regression
# model, "is_long".

gapminder_lifeExp <- gapminder2007 %>%
  select(lifeExp, gdpPercap) %>% 
  
  # Here, "is_long" is coded as 1 if "lifeExp" is greater than 71.93550, the
  # 50th percentile value of "lifeExp", and "is_long" is coded as 0 if "lifeExp"
  # is less than or equal to 71.93550.
  
  mutate(is_long = ifelse(lifeExp > 71.93550, 1, 0))

Next, we make our Bayesian logistic regression model!

Whereas the summary output for our classical logistic regression model gave us single coefficient estimates, here we get a distribution of values for each of our model’s coefficients. Using the tidy function found in the broom library, we can get single point estimates for our Bayesian logistic regression model’s coefficients: -2.531 for its intercept and 0.000333044 for its gdpPercap coefficient. Thus, our logistic regression can be written as the following: \(P(is\_long = 1) = logit^{-1}(-2.531 + 0.000333044gdpPercap)\).

In the posterior_interval output, we get 90% credible intervals for both our model’s intercept—(-3.2597990602, -1.9049518275)—and its gdpPercap coefficient—(0.0002468964, 0.0004350909). Recall our classical logistic regression model’s 90% confidence intervals, (-3.5038181796, -1.97558421) for the intercept and (0.0002592351, 0.00047599) for the gdpPercap coefficient.

As you can see, these intervals are very similar to one another. However, some quick subtractions (shown below) will tell you that both 90% credible intervals are narrower than the corresponding 90% confidence intervals, meaning the 90% credible intervals for our Bayesian logistic regression model express less uncertainty than the 90% confidence intervals for our classical logistic regression model. Moreover, while a 90% confidence interval only tells us that, in repeated sampling, 90% of intervals constructed this way would contain the true value, a 90% credible interval tells us that there is a 90% probability that the true value falls within that particular interval. And therein lies the advantage of using Bayesian methods: they allow us to make direct probabilistic statements about our models’ parameters.

# Fit the logistic regression model using the "stan_glm" function, storing it as
# "bayes_logistic_model".  Here, we are making "is_long" a function of
# "gdpPercap" (i.e. "is_long" is our y, and "gdpPercap" is our x).  We specify
# the family to be "binomial" so that our model is logistic (in the empty
# parentheses, family = binomial() automatically supplies the argument link =
# "logit"), and we specify our data to be our "gapminder_lifeExp" dataset.  Set
# refresh = 0 so as to suppress all of the "SAMPLING FOR MODEL ... NOW
# (CHAIN #)" progress messages.

bayes_logistic_model <- stan_glm(is_long ~ gdpPercap, family = binomial(), data = gapminder_lifeExp, refresh = 0)

# Get important summary information about our model

summary(bayes_logistic_model)
## 
## Model Info:
##  function:     stan_glm
##  family:       binomial [logit]
##  formula:      is_long ~ gdpPercap
##  algorithm:    sampling
##  sample:       4000 (posterior sample size)
##  priors:       see help('prior_summary')
##  observations: 142
##  predictors:   2
## 
## Estimates:
##               mean   sd   10%   50%   90%
## (Intercept) -2.5    0.4 -3.1  -2.5  -2.0 
## gdpPercap    0.0    0.0  0.0   0.0   0.0 
## 
## Fit Diagnostics:
##            mean   sd   10%   50%   90%
## mean_PPD 0.5    0.0  0.5   0.5   0.5  
## 
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
## 
## MCMC diagnostics
##               mcse Rhat n_eff
## (Intercept)   0.0  1.0  2019 
## gdpPercap     0.0  1.0  1120 
## mean_PPD      0.0  1.0  3402 
## log-posterior 0.0  1.0  1382 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# Call "tidy" on "bayes_logistic_model" in order to retrieve the model's
# singular parameter estimates; store this tibble as "tidy_coefs"

tidy_coefs <- tidy(bayes_logistic_model)

# Print "tidy_coefs" so as to view the model's parameter estimates

tidy_coefs
## # A tibble: 2 x 3
##   term         estimate std.error
##   <chr>           <dbl>     <dbl>
## 1 (Intercept) -2.53     0.412    
## 2 gdpPercap    0.000332 0.0000550
# Get 90% credible intervals for our model

posterior_interval(bayes_logistic_model)
##                        5%           95%
## (Intercept) -3.2588744781 -1.8865518721
## gdpPercap    0.0002449747  0.0004338494
# Subtract the lower bounds of the 90% credible intervals from the upper bounds
# of the 90% credible intervals to get the width of the intervals

# Intercept interval

-1.9049518275 - -3.2597990602
## [1] 1.354847
# "gdpPercap" interval

0.0004350909 - 0.0002468964
## [1] 0.0001881945
# Subtract the lower bounds of the 90% confidence intervals from the upper
# bounds of the 90% confidence intervals to get the width of the intervals

# Intercept interval

-1.97558421 - -3.5038181796
## [1] 1.528234
# "gdpPercap" interval

0.00047599 - 0.0002592351
## [1] 0.0002167549

Finally, we can visualize our logistic regression model!

# Here, I define a custom function, "bayes_logistic_model_function", that takes
# the inverse logit of (the first row of the "estimate" column of "tidy_coefs"
# plus the second row of the "estimate" column of "tidy_coefs" multiplied by
# whatever x is fed into the function).  You can, if you want, type the equation
# itself into the "fun" argument below, but I define a function for it  here
# because I think it makes for cleaner code.

bayes_logistic_model_function <- function(.x) invlogit(tidy_coefs$estimate[1] + tidy_coefs$estimate[2]*.x)

# Plot the "gapminder_lifeExp" dataset using ggplot, setting "gdpPercap" as the
# x-axis variable and "is_long" as the y-axis variable

ggplot(gapminder_lifeExp, aes(x = gdpPercap, y = is_long)) +
  
  # Make the plot a jittered scatterplot, setting width to 0 so that points
  # align with their precise "gdpPercap" values, setting height to 0.05 so that
  # points do not lie on top of each other, and setting alpha to 0.5 so that we
  # can better see individual points
  
  geom_jitter(width = 0, height = .05, alpha = 0.5) + 
  
  # Add the logistic regression line using the function "stat_function", where
  # the function ("fun") of the line (the f(x)) is
  # "bayes_logistic_model_function", the function defined above.  Set the color
  # of the line to "blue" and the size of the line to 1 so that the line
  # resembles the classical logistic regression line earlier in this chapter.
  
  stat_function(fun = bayes_logistic_model_function, color = "blue", size = 1)

Interpreting our Bayesian logistic regression model

Now we’ve made and visualized a Bayesian logistic regression model—wahoo! To interpret our Bayesian logistic regression model, we can apply the same interpretive skills we have acquired throughout this chapter.

The Bayesian logistic regression model we just made is: \(P(is\_long = 1) = logit^{-1}(-2.531 + 0.000333044gdpPercap)\)
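
As with the classical model, I type the point estimates in by hand below, but you could also reuse the tidy_coefs tibble we created earlier; a minimal sketch:

# Use the stored point estimates from "tidy_coefs" instead of hard-coded
# values, for example to calculate P(is_long = 1) at the mean value of
# "gdpPercap"

invlogit(tidy_coefs$estimate[1] + tidy_coefs$estimate[2]*mean(gapminder_lifeExp$gdpPercap))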

Using the mean method - Bayes

Using the mean method outlined above (see the section entitled “Evaluating changes at and around the mean of our x variable”), we can calculate \(P(y = 1)\) at the mean value of gdpPercap, which here is about 11,680.07:

invlogit(-2.531 + 0.000333044*mean(gapminder_lifeExp$gdpPercap))
## [1] 0.7955935

From this calculation, we can see that, at the mean value of gdpPercap, the probability that is_long equals 1 is about 79.56%, 2.24% less than the probability we found using our classical logistic regression.

We can also evaluate the effect a unit change in gdpPercap has on the probability that is_long equals 1 around the mean value of gdpPercap. Here, I will again treat a difference of 5,000 international dollars as a “unit change” in gdpPercap.

invlogit(-2.531 + 0.000333044*(mean(gapminder_lifeExp$gdpPercap) + 2500)) - invlogit(-2.531 + 0.000333044*(mean(gapminder_lifeExp$gdpPercap) - 2500))
## [1] 0.270856

From this calculation, we can see that a one unit change (5,000 international dollars) around the mean value of gdpPercap results in about a 27.09% change in the probability that is_long equals 1, roughly the same as the change in probability we found using our classical logistic regression (0.15% greater).

Using the derivative method - Bayes

Using the derivative method outlined above (see the section entitled “Evaluating change using the derivative of the curve at the mean of our x variable”), we can calculate the slope of our logistic regression curve at the mean value of gdpPercap:

Recall the derivative of the logistic regression curve:

  • \(\frac{\beta_1e^{\beta_0 + \beta_1x}}{(1 + e^{\beta_0 + \beta_1x})^2}\)

For our Bayesian model, the calculation will be:

  • \(\frac{0.000333044e^{-2.531 + 0.000333044mean(gdpPercap)}}{(1 + e^{-2.531 + 0.000333044mean(gdpPercap)})^2}\)

We can perform this calculation in R like so:

# Store "exp(1)", which is e in R, as "e"

e <- exp(1)

# Calculate the slope of our logistic regression curve at the mean value of
# "gdpPercap"

(0.000333044*e^(-2.531 + 0.000333044*mean(gapminder_lifeExp$gdpPercap)))/(1 + e^(-2.531 + 0.000333044*mean(gapminder_lifeExp$gdpPercap)))^2
## [1] 5.41611e-05

From this calculation, we have determined that the slope of our logistic regression curve at the mean value of gdpPercap is 0.0000541611, slightly higher than the slope we found using our classical logistic regression. This number is still minuscule! But it is minuscule only because it corresponds to a 1 international dollar change in gdpPercap. Earlier, we defined a one unit change in gdpPercap to be 5,000 international dollars. Therefore, we must multiply this slope by 5,000 in order to get our adjusted slope—or the magnitude of the effect a one unit change in gdpPercap has on the probability of is_long equaling 1.

# Multiply our earlier calculation by 5,000 to get the slope adjusted to our
# definition of a one unit change in "gdpPercap"

5000*(0.000333044*e^(-2.531 + 0.000333044*mean(gapminder_lifeExp$gdpPercap)))/(1 + e^(-2.531 + 0.000333044*mean(gapminder_lifeExp$gdpPercap)))^2
## [1] 0.2708055

Much better! As you can see, this slope indicates that, at the mean value of gdpPercap, a one unit change in gdpPercap results in a roughly 27.08% change in probability that is_long equals 1. This finding is consistent with the one above and is roughly the same as the one we found using our classical logistic regression (0.41% greater).

Using the “Divide by 4 Rule” - Bayes

Using the “Divide by 4 Rule”, we can simply divide our estimated \(\beta_1\) by 4 and then multiply that value by 5,000 in order to get the upper limit of the predictive change in the probability that is_long equals 1 given a unit change in gdpPercap. Doing so yields:

5000*(0.000333044/4)
## [1] 0.416305

This value is significantly larger than the values we calculated using the other two methods. But remember that this is the upper limit of the predictive change in probability—an approximation of the slope of our logistic regression curve at its very steepest point, the halfway point. Thus, we interpret this result to mean that a unit change in gdpPercap, here a change of 5,000 international dollars, results in no greater than a positive 41.63% change in the probability that is_long equals 1. This upper limit is 3.16% less than the upper bound we found using our classical logistic regression model.

So…why Bayes tho?

After all this, you might be wondering…why bother with Bayes? After all, our coefficients are similar, our graphs look almost identical, and our probabilities, changes in probability, and upper limits of the predictive changes in probability are very close to one another. True! But the way in which Bayesian models are constructed means that not only are our credible intervals more useful to interpret than our confidence intervals, but our coefficients—and thus our model and all of our probability calculations (mean, derivative, “Divide by 4”) using those coefficients—come with a full posterior distribution behind them. Bayesian models are fit by running multiple Markov chains simultaneously, each drawing samples from the posterior distribution (the combination of our priors and the likelihood of the observed data) until all of the chains converge—or, in layman’s terms, agree on what the model parameters should be. Because of this, Bayesian models end up being very informative and useful to us!
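
For example, here is a minimal sketch of the kind of direct probabilistic statement a Bayesian model lets us make (it assumes the bayes_logistic_model object from above is still in memory):

# Extract the posterior draws of the model's parameters as a data frame

posterior_draws <- as.data.frame(bayes_logistic_model)

# Proportion of posterior draws in which the "gdpPercap" coefficient is
# positive, i.e. the posterior probability that a higher GDP per capita is
# associated with a higher probability of a long life expectancy

mean(posterior_draws$gdpPercap > 0)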

Sources

Note: many of the definitions of terms and explanations of concepts included in this chapter were taken or adapted from the following sources, without which this publication could not exist and which this publication thanks.

  • Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.