In order to perform Bayesian modeling in R, we first need to load in the `rstanarm` library. With the tools in the `rstanarm` library, we can then make Bayesian models like the frequentist models we made in Sections 5.1 and 5.2.

`library(rstanarm)`

For the purposes of this chapter as well as future chapters, we will use the function `stan_glm` from the `rstanarm` library to make our Bayesian models. While the Bayesian counterpart to the frequentist `lm` function—the function we have been using thus far to make our models—is `stan_lm`, we introduce you to `stan_glm` here because, as you will see in later chapters, `stan_glm` is a very versatile and useful function. Not only can you use `stan_glm` to make Bayesian linear models, but you can also use `stan_glm` to make other kinds of Bayesian models, such as Bayesian logistic models—but that’s for later.
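As a preview of that versatility, only the `family` argument changes when moving from a Bayesian linear model to a Bayesian logistic model. The following is a sketch, not code to run here, and `y`, `x`, and `data_frame_name` are placeholders:

```
# Bayesian linear model (gaussian is the default family)
# stan_glm(y ~ x, data = data_frame_name)

# Bayesian logistic model: same function, different family
# stan_glm(y ~ x, family = binomial(link = "logit"), data = data_frame_name)
```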

Let’s say we have made a Bayesian model. When examining a Bayesian model, we call the `summary` function on it—just as we do when examining a frequentist model.

However, there are several key differences between the `summary` output for a Bayesian model and the `summary` output for a frequentist model. When we call `summary` on a Bayesian model, we get model information, parameter estimates, and model diagnostics. These may sound familiar, but the estimates no longer have t-values and p-values, and the model no longer has an F-statistic and R-squared values, as they do when we call `summary` on a frequentist model. Instead of singular point estimates and fit statistics, we get a **distribution** of values for those estimates—the mean, the standard deviation, and the 10th, 50th, and 90th percentile values. We also get some other, perhaps unfamiliar, values:

In `Estimates` and `MCMC diagnostics`, we get:

- `sigma`, which is the standard deviation of the errors

In `Fit Diagnostics`, we get (again, the mean, standard deviation, and 10th, 50th, and 90th percentile values of):

- `mean_PPD`, which is the sample mean of “the posterior predictive distribution of the outcome variable”

In `MCMC diagnostics`, we get:

- `log-posterior`, which is the log of the combined posterior distributions, similar to the likelihood of a frequentist regression
- `mcse`, which is (and stands for) the Monte Carlo standard error
- `Rhat`, which indicates whether or not the model converges and, in turn, whether or not our results are reliable. Posterior distributions are sampled in chains, and the Rhat statistic is a measure of within-chain variance compared to across-chain variance. Rhat values less than 1.1 indicate model convergence, and, according to the “How to Use the rstanarm Package” document on the CRAN website, if any Rhat values exceed 1.1, `rstanarm` will give you the following warning message: “Markov chains did not converge! Do not analyze results!”
- `n_eff`, which is “a crude measure” of the effective sample size
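If you want these diagnostics as numbers rather than printed text, the object returned by `summary` can be indexed like a matrix. This is a sketch, assuming a fitted `stan_glm` model called `my_model` (a hypothetical name) and the column names shown in the summary output:

```
# The summary object behaves like a matrix, so columns can be extracted
diagnostics <- summary(my_model)
diagnostics[, "Rhat"]   # check that every value is below 1.1
diagnostics[, "n_eff"]  # effective sample size for each parameter
```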

Recall `score_model` from 5.1:

```
# Fit regression model
score_model <- lm(score ~ bty_avg, data = evals_ch6)
# Get important summary information
summary(score_model)
```

```
##
## Call:
## lm(formula = score ~ bty_avg, data = evals_ch6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9246 -0.3690 0.1420 0.3977 0.9309
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.88034 0.07614 50.96 < 2e-16 ***
## bty_avg 0.06664 0.01629 4.09 5.08e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5348 on 461 degrees of freedom
## Multiple R-squared: 0.03502, Adjusted R-squared: 0.03293
## F-statistic: 16.73 on 1 and 461 DF, p-value: 5.083e-05
```

```
# Get 90% confidence intervals (90% to compare against later 90% credible intervals)
confint(score_model, level = 0.90)
```

```
## 5 % 95 %
## (Intercept) 3.75484173 4.00583418
## bty_avg 0.03978652 0.09348755
```

Translating the above into Bayesian language to make and examine `score_model_bayes`, we perform:

```
# Fit regression model
score_model_bayes <- stan_glm(score ~ bty_avg, data = evals_ch6)
```

```
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 6.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.66 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.03964 seconds (Warm-up)
## Chain 1: 0.074889 seconds (Sampling)
## Chain 1: 0.114529 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 1e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.03503 seconds (Warm-up)
## Chain 2: 0.074875 seconds (Sampling)
## Chain 2: 0.109905 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 1.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.055952 seconds (Warm-up)
## Chain 3: 0.073196 seconds (Sampling)
## Chain 3: 0.129148 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 1.1e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.037256 seconds (Warm-up)
## Chain 4: 0.068902 seconds (Sampling)
## Chain 4: 0.106158 seconds (Total)
## Chain 4:
```

```
# Get important summary information
summary(score_model_bayes)
```

```
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: score ~ bty_avg
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 463
## predictors: 2
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 3.9 0.1 3.8 3.9 4.0
## bty_avg 0.1 0.0 0.0 0.1 0.1
## sigma 0.5 0.0 0.5 0.5 0.6
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 4.2 0.0 4.1 4.2 4.2
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 4127
## bty_avg 0.0 1.0 4045
## sigma 0.0 1.0 4180
## mean_PPD 0.0 1.0 3974
## log-posterior 0.0 1.0 2086
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
```

```
# Get credible interval
posterior_interval(score_model_bayes)
```

```
## 5% 95%
## (Intercept) 3.75531177 4.0030256
## bty_avg 0.04062741 0.0930610
## sigma 0.50730848 0.5658406
```

What are we doing here?

First, we “fit” the Bayesian regression model to the data using the `stan_glm` function and store it as `score_model_bayes`. `stan_glm` makes a “Bayesian generalized linear model” and is used as follows: `stan_glm(y ~ x, data = data_frame_name)` where:

- y is the dependent variable, followed by a tilde (~). Here, y is set to `score`.
- x is the independent variable. Here, x is set to `bty_avg`.
- the combination of y ~ x is the model formula. (Note the order of y and x.) Here, the model formula is again `score ~ bty_avg`.
- `data_frame_name` is the name of the data frame that contains the variables y and x. Here, `data_frame_name` is `evals_ch6`.

Second, we take the model stored as `score_model_bayes` and call the `summary` function on it to get important information about the model. Recall that when we called `summary` on our frequentist model, for the parameter estimates we were given the values `Estimate`, `Std. Error`, `t value`, and `Pr(>|t|)`, which is a p-value, and for the model we were provided with the values `Residual standard error`, `Multiple R-squared`, `Adjusted R-squared`, and `F-statistic`. Here, when we call `summary` on our Bayesian model, we are provided with a **distribution** of values for our parameter estimates—the mean, the standard deviation, and the 10th, 50th, and 90th percentile values—and a host of other values defined at the beginning of this chapter.

Finally, we use the function `posterior_interval` to retrieve our estimates’ *credible intervals*, typically 90% ones. For our frequentist model, we used the function `confint` to retrieve our estimates’ *confidence intervals*. Typically, a frequentist confidence interval is a 95% one, but here we set the level to 90% so that we can more easily explain the difference between a Bayesian *credible interval* and a frequentist *confidence interval*, as the distinction between the two is already subtle.

Indeed, you may have noticed that our Bayesian credible interval for our `bty_avg` coefficient—(0.04062741, 0.0930610)—is very similar to our frequentist confidence interval for our `bty_avg` coefficient—(0.03978652, 0.09348755). However, there are important differences between the two.

A 90% *credible interval* tells us that **there is a 90% probability that the true value falls within the interval**. By contrast, a frequentist 90% *confidence interval* tells us that, **in repeated sampling, the true value falls within the confidence interval 90% of the time**. Using Bayesian credible intervals, we can determine—**for any given interval**—the probability that the true value falls within that range, which is important because **“only Bayesian methods allow us to make inferences about the actual values of a parameter”**.
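This difference has a practical payoff: because the posterior is a set of draws, we can compute the probability of any statement about a parameter directly. A minimal sketch, using `score_model_bayes` from above:

```
# Extract the matrix of posterior draws (one row per iteration,
# one column per parameter)
posterior_draws <- as.matrix(score_model_bayes)
# Probability that the bty_avg coefficient is greater than zero
mean(posterior_draws[, "bty_avg"] > 0)
# Probability that it is greater than 0.05
mean(posterior_draws[, "bty_avg"] > 0.05)
```

No frequentist analogue of these statements exists; a confidence interval cannot be turned into a probability about the parameter itself.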

Recall `lifeExp_model` from 5.2:

```
# Fit regression model
lifeExp_model <- lm(lifeExp ~ continent, data = gapminder2007)
# Get important summary information
summary(lifeExp_model)
```

```
##
## Call:
## lm(formula = lifeExp ~ continent, data = gapminder2007)
##
## Residuals:
## Min 1Q Median 3Q Max
## -26.9005 -4.0399 0.2565 3.3840 21.6360
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 54.806 1.025 53.446 < 2e-16 ***
## continentAmericas 18.802 1.800 10.448 < 2e-16 ***
## continentAsia 15.922 1.646 9.675 < 2e-16 ***
## continentEurope 22.843 1.695 13.474 < 2e-16 ***
## continentOceania 25.913 5.328 4.863 3.12e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.395 on 137 degrees of freedom
## Multiple R-squared: 0.6355, Adjusted R-squared: 0.6249
## F-statistic: 59.71 on 4 and 137 DF, p-value: < 2.2e-16
```

```
# Get 90% confidence intervals (90% to compare against later 90% credible intervals)
confint(lifeExp_model, level = 0.90)
```

```
## 5 % 95 %
## (Intercept) 53.10785 56.50423
## continentAmericas 15.82176 21.78240
## continentAsia 13.19699 18.64790
## continentEurope 20.03497 25.65015
## continentOceania 17.08939 34.73753
```

Translating the above into Bayesian language to make and examine `lifeExp_model_bayes`, we perform:

```
# Fit regression model, now setting refresh equal to 0 so as to suppress all of
# the "SAMPLING FOR MODEL 'continuous' NOW (CHAIN #)" outputs
lifeExp_model_bayes <- stan_glm(lifeExp ~ continent, data = gapminder2007, refresh = 0)
# Get important summary information
summary(lifeExp_model_bayes)
```

```
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: lifeExp ~ continent
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 142
## predictors: 5
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 54.9 1.0 53.6 54.9 56.2
## continentAmericas 18.6 1.8 16.3 18.7 20.9
## continentAsia 15.8 1.6 13.7 15.8 17.9
## continentEurope 22.7 1.7 20.6 22.7 24.8
## continentOceania 25.1 5.5 17.9 25.1 32.1
## sigma 7.4 0.5 6.9 7.4 8.1
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 67.0 0.9 65.9 67.0 68.1
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 2671
## continentAmericas 0.0 1.0 2939
## continentAsia 0.0 1.0 3101
## continentEurope 0.0 1.0 3163
## continentOceania 0.1 1.0 4453
## sigma 0.0 1.0 4376
## mean_PPD 0.0 1.0 4156
## log-posterior 0.0 1.0 1990
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
```

```
# Get credible interval
posterior_interval(lifeExp_model_bayes)
```

```
## 5% 95%
## (Intercept) 53.251716 56.602464
## continentAmericas 15.634195 21.620791
## continentAsia 13.066702 18.450238
## continentEurope 19.956782 25.464973
## continentOceania 16.045921 33.832984
## sigma 6.730626 8.259066
```

What are we doing here?

Once again, we first “fit” the Bayesian regression model to the data using the `stan_glm` function and store it as `lifeExp_model_bayes`. `stan_glm` makes a “Bayesian generalized linear model” and is used as follows: `stan_glm(y ~ x, data = data_frame_name)` where:

- y is the dependent variable, followed by a tilde (~). Here, y is set to `lifeExp`.
- x is the independent variable. Here, x is set to `continent`.
- the combination of y ~ x is the model formula. (Note the order of y and x.) Here, the model formula is again `lifeExp ~ continent`.
- `data_frame_name` is the name of the data frame that contains the variables y and x. Here, `data_frame_name` is `gapminder2007`.
Second, we take the model stored as `lifeExp_model_bayes` and call the `summary` function on it to get important information about the model. Recall that when we called `summary` on our frequentist model, for the parameter estimates we were provided with the values `Estimate`, `Std. Error`, `t value`, and `Pr(>|t|)`, which is a p-value, and for the model we were given the values `Residual standard error`, `Multiple R-squared`, `Adjusted R-squared`, and `F-statistic`. Here, when we call `summary` on our Bayesian model, we are provided with a **distribution** of values for our parameter estimates—the mean, the standard deviation, and the 10th, 50th, and 90th percentile values—and a host of other values defined at the beginning of this chapter.

Finally, we use the function `posterior_interval` to retrieve our estimates’ *credible intervals*, typically 90% ones. For our frequentist model, we used the function `confint` to retrieve our estimates’ *confidence intervals*. Typically, a frequentist confidence interval is a 95% one, but here we set the level to 90% so that we can more easily explain the difference between a Bayesian *credible interval* and a frequentist *confidence interval*, as the distinction between the two is already subtle.

Indeed, you may have noticed that our Bayesian credible intervals are very similar to our frequentist confidence intervals. For example, for `continentAmericas`, they are (15.634195, 21.620791) and (15.82176, 21.78240), respectively. However, there are important differences between credible and confidence intervals.

As explained before, a 90% *credible interval* tells us that **there is a 90% probability that the true value falls within the interval**. By contrast, a frequentist 90% *confidence interval* tells us that, **in repeated sampling, the true value falls within the confidence interval 90% of the time**. Using Bayesian credible intervals, we can determine—**for any given interval**—the probability that the true value falls within that range, which is important because **“only Bayesian methods allow us to make inferences about the actual values of a parameter”**.

We can modify our Bayesian regressions by altering chains—in number, iterations, or warmup values—and by altering priors—making them adjusted as opposed to unadjusted or informative as opposed to uninformative. Within the `stan_glm` function, we can alter chains using the `chains`, `iter`, and `warmup` arguments, and we can alter priors using the `prior` argument.
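As a sketch of how these arguments fit together (the specific values below are illustrative, not recommendations):

```
# Refit the score model with explicit sampler settings and a
# weakly informative prior on the slope coefficient
score_model_bayes_custom <- stan_glm(
  score ~ bty_avg,
  data = evals_ch6,
  chains = 2,            # number of Markov chains
  iter = 4000,           # iterations per chain, including warmup
  warmup = 1000,         # warmup iterations per chain
  prior = normal(0, 1),  # normal(location, scale) prior on the slope(s)
  refresh = 0            # suppress the sampling progress output
)
```

Here `normal()` is `rstanarm`’s own prior-constructor function; other choices, such as `student_t()` or `cauchy()`, plug into `prior` the same way.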

In the case of error messages—which are common enough to warrant the inclusion of these debugging tips—below are a couple of simple methods to troubleshoot and solve the problems.

- If the error message mentions “divergent transitions after warmup”, the model is using steps in the estimator that are too big. To solve this problem, adjust the step size by increasing `adapt_delta` from its default value of 0.95 like so: `stan_glm(y ~ x, data = data_frame_name, control = list(adapt_delta = 0.99))`. Increasing `adapt_delta` decreases the step size.
- If the error message mentions a chain has “reached the maximum tree depth”, the sampler prematurely ended the iteration and is not working efficiently. To solve this problem, increase `max_treedepth` from its default value of 10 like so: `stan_glm(y ~ x, data = data_frame_name, control = list(max_treedepth = 15))`.

These errors can pose threats to the validity of our models, so it is important to know how to remedy them if they crop up.

When examining a frequentist regression, one way of assessing model fit is to look at its R-squared value, which can be found in the `summary` of the model. When examining a Bayesian regression, however, while the R-squared value can still be used to assess model fit, it is not shown in the `summary` output of a `stan_glm` model. If you want to know the R-squared value of a Bayesian regression, it can be manually calculated using the following formula:

\[R^2 = 1 - \frac{\sum_i (y_i - \widehat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}\]

For `score_model_bayes`, the calculation would be:

`1 - var(residuals(score_model_bayes)) / (var(fitted(score_model_bayes)) + var(residuals(score_model_bayes)))`

`## [1] 0.03573738`

For `lifeExp_model_bayes`, the calculation would be:

`1 - var(residuals(lifeExp_model_bayes)) / (var(fitted(lifeExp_model_bayes)) + var(residuals(lifeExp_model_bayes)))`

`## [1] 0.6308769`
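Note that `rstanarm` also provides a `bayes_R2` function, which returns a full posterior distribution of R-squared values, one per draw, rather than a single number. Summarizing its output is an alternative to the manual calculation above:

```
# Posterior distribution of R-squared (one value per posterior draw)
r2_draws <- bayes_R2(score_model_bayes)
# Summarize with a point estimate and a 90% credible interval
median(r2_draws)
quantile(r2_draws, probs = c(0.05, 0.95))
```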

There are, however, other ways to assess a model’s fit—ones specific to Bayesian regressions. Among these, the most important fall under the class of assessments called *Posterior Predictive Model Checks*.

Take `score_model_bayes`: using the `posterior_linpred` function, we are able to get predicted scores for each observation using the parameter values of our iterations. We can then compare the distribution of predicted scores against the distribution of observed scores to determine whether or not the model fits the data well.

```
# Get predicted scores
predictions <- posterior_linpred(score_model_bayes)
# Get predicted scores for first iteration
iteration1 <- predictions[1,]
# Get predicted scores for second iteration
iteration2 <- predictions[2,]
# Summarize to compare predicted scores and observed scores
summary(iteration1)
```

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.936 4.061 4.159 4.166 4.257 4.481
```

`summary(iteration2)`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.987 4.080 4.153 4.158 4.225 4.391
```

`summary(evals_ch6$score)`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.300 3.800 4.300 4.175 4.600 5.000
```

We can also use the function `pp_check` to perform a global and visual check of our model’s distribution of predicted scores. `pp_check` takes the name of our Bayesian model as its first argument and takes the “Posterior Predictive Checking” plot function (`plotfun`)—or the type of plot we want to create—as its second argument. As you will see below, “Posterior Predictive Checking” plots can take the form of density overlays, histograms, and scatterplots.

In the density overlay below, again using `score_model_bayes` as our model, the light blue lines illustrate the distribution of predicted scores, and the dark blue line illustrates the distribution of observed scores.

```
# Density overlay
pp_check(score_model_bayes, plotfun = "dens_overlay")
```
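The histogram and scatterplot variants work the same way; only the `plotfun` argument changes. A sketch, using plot names from the `bayesplot` naming scheme that `pp_check` follows:

```
# Histograms of a few posterior predictive datasets vs. the observed data
pp_check(score_model_bayes, plotfun = "hist")
# Scatterplot of the observed outcome against the average prediction
pp_check(score_model_bayes, plotfun = "scatter_avg")
```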