Seroprevalence and test characteristic meta-analysis

Bob Carpenter and Andrew Gelman

May 2020


Sensitivity, specificity, and prevalence

Let the random variable

where \(Z = 1\) indicates the individual is positive for the disease and \(Z = 0\) indicates they are negative.

The prevalence of a disease (in a given [sub]population) is the probability of an individual being positive, \[ \textrm{prevalence} = \textrm{Pr}[Z = 1]. \]

Let the random variable

where \(Y = 1\) indicates a positive test result and \(Y = 0\) a negative result.

The sensitivity of a test is its accuracy for positive individuals, \[ \textrm{sensitivity} = \textrm{Pr}[Y = 1 \mid Z = 1]. \]

The specificity of a test is its accuracy for negative individuals, \[ \textrm{specificity} = \textrm{Pr}[Y = 0 \mid Z = 0]. \]


The data (Bendavid et al. 2020) consists of 17 seroprevalence studies

Sensitivity tests (\(Z = 1\))

pos_tests tests sample_sens
78 85 0.918
27 37 0.730
25 35 0.714

Specificity tests (\(Z = 0\))

neg_tests tests sample_spec
368 371 0.992
30 30 1.000
70 70 1.000
1102 1102 1.000
300 300 1.000
311 311 1.000
500 500 1.000
198 200 0.990
99 99 1.000
29 31 0.935
146 150 0.973
105 108 0.972
50 52 0.962

Prevalence tests (\(Z\) unknown)

pos_tests tests sample_prev
50 3330 0.015

Pooled estimates

The pooled sample sensitivity is 0.828, pooled sample specificity is 0.995, and pooled sample prevalence is 0.015.

Sensitivity and specificity meta-analysis

Using sensitivity as an example (specificity is identical other than some signs), the meta-analysis assumes a test’s sensitivity and specificity vary by site with normal noise on the log odds scale. We will parameterize sensitivity and specificity using log odds, where the log odds function \(\textrm{logit}:(0, 1) \rightarrow \mathbb{R}\) is \[ \textrm{logit}(u) = \log \frac{u}{1 - u}. \] Its inverse is the logistic function, \(\textrm{logit}^{-1}:\mathbb{R} \rightarrow (0, 1),\) \[ \textrm{logit}^{-1}(v) = \frac{1}{1 + \exp(-v)}. \]

Varying sensitivity and specificity

The sensitivity of site \(k\) will be represented by the variable \(\theta_{1, k}\) (the \(1\) is for sensitivity as opposed to specificity).

Sampling distribution (likelihood)

The tests are assumed to be independent so that the number of positive tests at a site may be modeled as binomial, \[ y_k \sim \textrm{binomial_logit}(N_k, \theta_{1, k}). \] We’ve used a parameterization of the binomial’s chance of success parameter on the log odds scale, where \[ \textrm{binomial_logit}(N, \theta) = \textrm{binomial}(N, \textrm{logit}^{-1}(\theta)). \]

Hierarchical prior

The model will assume a test sensitivity log odds \(\mu_1\) and scale of variation \(\sigma_1\) such that the sensitivity log odds \(\theta_{1, k}\) at site \(k\) is distributed as \[ \theta_{1, k} \sim \textrm{normal}(\mu_1, \sigma_1) \] This will pool (or regularize) the estimates \(\theta_{1, k}\) toward \(\mu_1\), with \(\sigma_1\) controlling the scale of inter-site variation in sensitivity.

As \(\sigma_1 \rightarrow 0\), the model approaches a full pooling model where we assume there is no variation in sensitivity among the sites. As \(\sigma_1 \rightarrow \infty\), the model approaches a no-pooling model where each site’s sensitivity is estimated independently. The hierarchical model is a partial pooling model, which determines the amount of pooling to apply from the data by controlling the estimate of \(\sigma_1\) from the data.


The hyperparameters \(\mu_1, \sigma_1\) should get hyperpriors based on what we know about this disease and serum testing variation among sites. Assuming we know nothing, we could assume that the probability of a correct test, \(\textrm{logit}^{-1}(\mu_1)\) has a uniform distribution, which has an implied standard logistic distribution on the log odds scale, \[ \mu_1 \sim \textrm{logistic}(0, 1). \] The scale parameter should be consistent with zero, so we do not want to give it a lognormal or inverse-gamma family prior. We do not expect a huge variation on the log odds scale, so we can use a simple standard half-normal hyperprior on \(\sigma_1\), \[ \sigma_1 \sim \textrm{normal}_+(0, 1) \]

If more is known about the tests, these hyperpriors should be tighter. Without a lot of data (both per site and number of sites), the posteriors will be sensitive to these hyperpriors.

Predictive inference for new data at existing sites

To predict future test results at site \(k\), the probability of getting \(n\) positive tests out of the next \(N\) tests where the subject is positive, given observed (training) data \(y\), is given by the posterior predictive distribution, \[\begin{eqnarray*} p(n \mid N, k) & = & \int \underbrace{\textrm{binomial_logit}(n \mid N, \theta_{1, k})} _{\textrm{sampling uncertainty}} \cdot \underbrace{p(\theta_{1, k} \mid y)} _{\textrm{estimation uncertainty}} \, \textrm{d}{\theta_{1, k}} \\[6pt] & \approx & \frac{1}{M} \sum_{m = 1}^M \textrm{binomial_logit}(n \mid N, \theta_{1, k}^{(m)}), \end{eqnarray*}\] where \[ \theta_{1, k}^{(m)} \sim p(\theta_{1, k} \mid y) \] is a draw from the posterior (binomial-logit determines sampling uncertainty and drawing from the posterior the estimation uncertainty).

Predictive inference for new data at new sites

Suppose we gather data from a new testing site \(k'\) not in the original data. The posterior predictive distribution for new data is identical in form, \[\begin{eqnarray*} p(n \mid N, k') & = & \int \underbrace{\textrm{binomial_logit}(n \mid N, \theta_{1, k'})} _{\begin{array}{l} \textrm{sampling} \\ \textrm{uncertainty} \end{array}} \cdot \underbrace{p(\theta_{1, k'} \mid y)} _{\begin{array}{l} \textrm{estimation} \\ \textrm{uncertainty} \end{array}} \, \textrm{d}{\theta_{1, k'}} \\[6pt] & \approx & \frac{1}{M} \sum_{m = 1}^M \textrm{binomial_logit}(n \mid N, \theta_{1, k'}^{(m)}), \end{eqnarray*}\] where \(\theta_{1, k'}^{(m)} \sim p(\theta_{1, k'} \mid y)\) is a draw from the posterior.

The only difference is that now there is no data informing \(\theta_{1, k'}\), only the hyperprior parameters \(\mu_1, \sigma_1\) estimated through the meta-analysis, that is, as \(\theta_{1, k'} \sim \textrm{normal}(\mu_1, \sigma_1)\). This allows the sampling for MCMC to be refactored as \[ \theta_{1, k'}^{(m)} \sim \textrm{normal}(\mu_1^{(m)}, \sigma_1^{(m)}), \] where \(\mu_1^{(m)}, \sigma_1^{(m)} \sim p(\mu_1, \sigma_1 \mid y).\)

Putting it all together to estimate diseases prevalence

Specificity meta-analysis is identical to sensitivity meta-analysis (with some signs reversed). We can now use the predictive inference for sensitivity and specificity in our new site along with the observed number of positive tests at the new site to estimate prevalence of the disease (in the sampled (sub)population). Suppose we have a site with sensitivity \(\theta_{1, k}\) and a specificity of \(\theta_{0, k}\). If the disease prevalence is \(\pi\), the probability of getting a positive test result is \[ \mbox{Pr}[y = 1] = \pi \cdot \textrm{logit}^{-1}(\theta_{1, k}) + (1 - \pi) \cdot (1 - \textrm{logit}^{-1}(\theta_{0, k})) \] For a total of \(N\) independent tests, the number of positive tests will thus be distributed binomially with the preceding probability of success, \[ n \sim \textrm{binomial}\left(N, \pi \cdot \textrm{logit}^{-1}(\theta_{1, k}) + (1 - \pi) \cdot (1 - \textrm{logit}^{-1}(\theta_{0, k}))\right). \]

If we assume a simple uniform prior on prevalence \(\pi \in (0, 1)\), the entire model is \[\begin{eqnarray*} \pi & \sim & \textrm{uniform}(0, 1) \\ \mu_0, \mu_1 & \sim & \textrm{logistic}(0, 1) \\ \sigma_0, \sigma_1 & \sim & \textrm{normal}_+(0, 1) \\ \theta_{0, k} & \sim & \textrm{normal}(\mu_0, \sigma_0) \\ \theta_{1, k} & \sim & \textrm{normal}(\mu_1, \sigma_1) \\ n^{\textrm{pos}}_k & \sim & \textrm{binomial}(N^{\textrm{pos}}_k, \theta_{1, k}) \\ n^{\textrm{neg}}_k & \sim & \textrm{binomial}(N^{\textrm{neg}}_k, \ \theta_{0, k}) \\ n^{\textrm{unk}}_k & \sim & \textrm{binomial}\left(N^{\textrm{unk}}_k, \pi \cdot \textrm{logit}^{-1}(\theta_{1, k}) + (1 - \pi) \cdot (1 - \textrm{logit}^{-1}(\theta_{0, k})) \right) \end{eqnarray*}\] Some care is required to avoid clasing \(k\) subscripts in an impelementation.

The posterior \(p(\pi \mid n^{\textrm{pos}}, N^{\textrm{pos}}, n^{\textrm{neg}}, N^{\textrm{neg}}, n^{\textrm{unk}}, N^{\textrm{unk}})\) characterizes what we know about the prevalence of the disease given the observed sensitivity data, specificity data, and unkown status data.

Stan Model

Here is the full Stan model implementing this meta-analysis.

data {
  int<lower = 0> K_pos;
  int<lower = 0> N_pos[K_pos];
  int<lower = 0> n_pos[K_pos];

  int<lower = 0> K_neg;
  int<lower = 0> N_neg[K_neg];
  int<lower = 0> n_neg[K_neg];

  int<lower = 0> K_unk;
  int<lower = 0> N_unk[K_unk];
  int<lower = 0> n_unk[K_unk];

  real<lower = 0> sigma_sigma_logit_sens;
  real<lower = 0> sigma_sigma_logit_spec;
parameters {
  real mu_logit_sens;
  real mu_logit_spec;
  real<lower = 0> sigma_logit_sens;
  real<lower = 0> sigma_logit_spec;

  vector<offset = mu_logit_sens,
         multiplier = sigma_logit_sens>[K_pos] logit_sens;
  vector<offset = mu_logit_spec,
         multiplier = sigma_logit_spec>[K_neg] logit_spec;
  vector<offset = mu_logit_sens,
         multiplier = sigma_logit_sens>[K_unk] logit_sens_unk;
  vector<offset = mu_logit_spec,
         multiplier = sigma_logit_spec>[K_unk] logit_spec_unk;

  real<lower = 0, upper = 1> pi;
transformed parameters {
  vector[K_pos] sens = inv_logit(logit_sens);
  vector[K_neg] spec = inv_logit(logit_spec);
  vector[K_unk] sens_unk = inv_logit(logit_sens_unk);
  vector[K_unk] spec_unk = inv_logit(logit_spec_unk);
model {
  mu_logit_sens ~ normal(4, 2);
  sigma_logit_sens ~ normal(0, sigma_sigma_logit_sens);
  logit_sens ~ normal(mu_logit_sens, sigma_logit_sens);
  n_pos ~ binomial_logit(N_pos, logit_sens);

  mu_logit_spec ~ normal(4, 2);
  sigma_logit_spec ~ normal(0, sigma_sigma_logit_spec);
  logit_spec ~ normal(mu_logit_spec, sigma_logit_spec);
  n_neg ~ binomial_logit(N_neg, logit_spec);

  pi ~ uniform(0, 1);
  logit_sens_unk ~ normal(mu_logit_sens, sigma_logit_sens);
  logit_spec_unk ~ normal(mu_logit_spec, sigma_logit_spec);
  n_unk ~ binomial(N_unk, pi * sens_unk + (1 - pi) * (1 - spec_unk));

Now we compile it.

model <- stan_model("meta.stan")

and then fit

data <-
list(sigma_sigma_logit_sens = 0.01,
     sigma_sigma_logit_spec = 0.01,
     K_pos = sens_tests,
     N_pos = sens_df$tests,
     n_pos = sens_df$pos_tests,
     K_neg = spec_tests,
     N_neg = spec_df$tests,
     n_neg = spec_df$neg_tests,
     K_unk = unk_tests,
     N_unk = array(unk_df$tests, dim = c(length(unk_df$tests))),
     n_unk = array(unk_df$pos_tests, dim = c(length(unk_df$tests))))

fit <- sampling(model, data = data,
                iter = 4000,
        control = list(adapt_delta = 0.99),
        open_progress = FALSE, refresh = 0)

print(fit, probs = c(0.025, 0.5, 0.975), digits = 3, pars = c("pi"))
Inference for Stan model: meta.
4 chains, each with iter=4000; warmup=2000; thin=1; 
post-warmup draws per chain=2000, total post-warmup draws=8000.

    mean se_mean    sd  2.5%   50% 97.5% n_eff Rhat
pi 0.013       0 0.003 0.007 0.012 0.019  7795    1

Samples were drawn using NUTS(diag_e) at Mon May 18 18:14:51 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).