Biophysical model

Here we describe the biophysical model of polyclonal antibodies that motivates the approach in this package.

We recommend first reading this paper to understand the model conceptually before reading the rest of the documentation.

Hill curve model

Consider a viral protein bound by polyclonal antibodies, such as might be found in sera. We want to determine the contribution of each mutation to escaping these polyclonal antibodies, being cognizant of the fact that different antibodies target different epitopes.

The actual experimental measurable is as follows: at each concentration \(c\) of the antibody mixture, we measure \(p_v\left(c\right)\), which is the fraction of all variants \(v\) of the viral protein that escape binding or neutralization (whichever is being experimentally measured) by all antibodies in the mix. For instance, \(p_v\left(c\right)\) might be the fraction of variants \(v\) that are not neutralized in a virus neutralization deep mutational scanning experiment as in Lee et al. (2019).

We assume that antibodies in the mix can bind to one of \(E\) epitopes on the protein. Let \(U_e\left(v,c\right)\) be the fraction of the time that epitope \(e\) is not bound on variant \(v\) when the mix is at concentration \(c\). Then assuming antibodies bind independently without competition, the overall experimentally measured fraction of variants that escape binding at concentration \(c\) is simply:

\[p_v\left(c\right) = \prod_{e=1}^E U_e\left(v, c\right), \label{pv} \tag{1}\]

where \(e\) ranges over the \(E\) epitopes.

We want to write \(U_e\left(v,c\right)\) in terms of underlying physical properties like the relative concentrations of antibodies targeting different epitopes, and the affinities of these antibodies. If we assume that there is no competition among antibodies binding to different epitopes, that all antibodies targeting a given epitope have the same affinity, and that there is no cooperativity in antibody binding (Hill coefficient of antibody binding is one), then the fraction of all variants \(v\) that are not bound by an antibody targeting epitope \(e\) at concentration \(c\) is given by a Hill equation:

\[\begin{split}\begin{eqnarray} U_e\left(v, c\right) &=& \frac{1 - t_e}{1 + \left(\frac{c f_e}{K_{d,e}\left(v\right)}\right)^{n_e}} + t_e \\ &=& \frac{1 - t_e}{1 + \left[c f_e \exp \left(-\frac{\Delta G_e\left(v\right)}{RT}\right)\right]^{n_e}} + t_e \\ &=& \frac{1 - t_e}{1 + \left[c \exp \left(-\phi_e\left(v\right)\right)\right]^{n_e}} + t_e, \\ \label{Ue} \tag{2} \end{eqnarray}\end{split}\]

where \(n_e\) is the Hill coefficient for epitope \(e\), \(t_e\) is the non-neutralizable fraction for epitope \(e\), and \(\phi_e\left(v\right)\) represents the total binding activity of antibodies to epitope \(e\) against variant \(v\). Here \(RT\) is the product of the molar gas constant and the temperature, and \(K_{d,e}\left(v\right) = \exp\left(\frac{\Delta G_e\left(v\right)}{RT}\right)\) is the dissociation constant. The quantity \(\phi_e\left(v\right)\) is related to the free energy of binding \(\Delta G_e\left(v\right)\) and the fraction of antibodies \(f_e\) targeting epitope \(e\) by \(\phi_e\left(v\right) = \frac{\Delta G_e\left(v\right)}{RT} - \ln f_e\). Its value therefore depends both on the affinity of antibodies targeting epitope \(e\) (via \(\Delta G_e\left(v\right)\)) and on the abundance of antibodies with this specificity in the overall mix (via \(f_e\)), and so is a measure of the overall importance of antibodies with this specificity in the polyclonal mix. Smaller (more negative) values of \(\phi_e\left(v\right)\) correspond to a higher overall contribution of antibodies with specificity for epitope \(e\) to the activity against variant \(v\). We call \(-\phi_e\left(v\right)\) the “activity” directed against epitope \(e\).

Note that in the simplest case we have a Hill coefficient of \(n_e = 1\) and there is no non-neutralizable fraction, so \(t_e = 0\). The original reference for polyclonal focuses on this simple case; allowing \(n_e \ne 1\) and \(t_e > 0\) is an extension of it.
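As a minimal sketch of Equation (2), the fraction unbound can be written as a short Python function. The function name `unbound_fraction` and its argument names are ours, chosen for illustration; they are not part of the package's API.

```python
import math


def unbound_fraction(c, phi, n=1.0, t=0.0):
    """Fraction of epitope e left unbound on variant v, following Equation (2).

    c   : concentration of the antibody mix (non-negative)
    phi : phi_e(v); the "activity" is -phi
    n   : Hill coefficient n_e (1 in the simplest case)
    t   : non-neutralizable fraction t_e (0 in the simplest case)
    """
    return (1.0 - t) / (1.0 + (c * math.exp(-phi)) ** n) + t


# Simplest case (n = 1, t = 0): at c = exp(phi) the epitope is half bound.
print(unbound_fraction(c=1.0, phi=0.0))  # 0.5

# With t > 0 the curve plateaus at t rather than reaching zero at high concentration.
print(unbound_fraction(c=1e9, phi=0.0, t=0.1))
```

With the defaults this reduces to the simple case above; passing `n` or `t` gives the extended form.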

Below is an interactive plot showing the fraction unbound (\(U\)) as a function of the antibody activity (\(-\phi\)) for different values of the concentration and non-neutralizable fraction (\(t\)), with an adjustable Hill coefficient (\(n\)):

# NBVAL_IGNORE_OUTPUT

import altair as alt

import numpy

import pandas as pd

df = pd.concat(
    [
        pd.DataFrame({"activity": numpy.linspace(-10, 10, 100)}).assign(
            c=c,
            t=t,
            label=f"concentration = {c:.1f}, non-neutralized fraction (t) = {t:.1f}",
        )
        for c in [1, 4]
        for t in [0, 0.1]
    ]
)

n = alt.selection_point(
    fields=["n"],
    value=[{"n": 1}],
    bind=alt.binding_range(min=0.1, max=3, name="Hill coefficient (n)"),
)

to_show = alt.selection_point(
    bind="legend",
    fields=["label"],
    value=[{"label": df["label"].tolist()[0]}],
)

(
    alt.Chart(df)
    # Equation (2) with activity = -phi: U = (1 - t) / (1 + (c * exp(activity))^n) + t
    .transform_calculate(
        unbound=(1 - alt.datum["t"])
        / (1 + (alt.datum["c"] * alt.expr.exp(alt.datum["activity"])) ** n["n"])
        + alt.datum["t"]
    )
    .encode(
        x=alt.X("activity", title="antibody activity (-phi)"),
        y=alt.Y("unbound:Q", title="fraction unbound (U)"),
        color=alt.Color(
            "label:N",
            legend=alt.Legend(title="click / shift-click to select", labelLimit=500),
            scale=alt.Scale(domain=df["label"].unique()),
        ),
        tooltip=[
            alt.Tooltip(c, type="quantitative", format=".3g")
            for c in ["activity", "unbound"]
        ],
    )
    .mark_line()
    .properties(width=275, height=175)
    .add_params(n, to_show)
    .transform_filter(to_show)
)

Decomposing epitope activity into “wildtype” activity and escape of mutations

Finally, we want to frame the epitope activity \(-\phi_e\left(v\right)\) in terms of the actual quantities of biological interest. There are two quantities of biological interest:

  1. The activity of antibodies binding epitope \(e\) in the unmutated (“wildtype”) protein background, which will be denoted as \(a_{\rm{wt}, e}\).

  2. The extent of escape mediated by each amino-acid mutation \(m\) on binding of antibodies targeting epitope \(e\), which will be denoted as \(\beta_{m,e}\).

In order to infer these quantities, we make the assumption that mutations have additive effects on the free energy of binding (and so \(\phi_e\left(v\right)\)) for antibodies targeting any given epitope \(e\). Specifically, let \(a_{\rm{wt}, e}\) be the total activity against the “wildtype” protein of antibodies targeting epitope \(e\), with larger values of \(a_{\rm{wt}, e}\) indicating stronger antibody binding (or neutralization) at this epitope. Let \(\beta_{m,e}\) be the extent to which mutation \(m\) (where \(1 \le m \le M\)) reduces binding by antibodies targeting epitope \(e\), with larger values of \(\beta_{m,e}\) corresponding to more escape from binding (a value of 0 means the mutation has no effect on antibodies targeting this epitope). We can then write:

\[\phi_e\left(v\right) = -a_{\rm{wt}, e} + \sum_{m=1}^M \beta_{m,e} b\left(v\right)_m \label{phie} \tag{3}\]

where \(b\left(v\right)_m\) is 1 if variant \(v\) has mutation \(m\) and 0 otherwise.

Together, Equations (1), (2), and (3) relate the quantities of biological interest (\(a_{\rm{wt}, e}\) and \(\beta_{m,e}\)) to the experimental measurables (\(p_v\left(c\right)\)).
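To make the chain from parameters to measurable concrete, here is a small sketch that composes Equations (1) to (3) for a single variant. Everything in it is illustrative: the function names, the dictionary layout, the mutation labels `m1` and `m2`, and the numerical values are hypothetical and do not come from the package or from real data. We also assume that any mutation missing from an epitope's `betas` dictionary has an escape value of 0.

```python
import math


def phi(a_wt, betas, mutations):
    """Equation (3): phi_e(v) = -a_wt_e + sum of beta_{m,e} over mutations in v."""
    # Assumption: mutations without an entry in `betas` have no effect (beta = 0).
    return -a_wt + sum(betas.get(m, 0.0) for m in mutations)


def escape_probability(c, mutations, epitopes):
    """Equation (1): product over epitopes of the fraction unbound from Equation (2)."""
    p = 1.0
    for ep in epitopes:
        phi_e = phi(ep["a_wt"], ep["betas"], mutations)
        n = ep.get("n", 1.0)  # Hill coefficient, 1 in the simplest case
        t = ep.get("t", 0.0)  # non-neutralizable fraction, 0 in the simplest case
        p *= (1.0 - t) / (1.0 + (c * math.exp(-phi_e)) ** n) + t
    return p


# Hypothetical two-epitope serum; each mutation escapes only one epitope.
epitopes = [
    {"a_wt": 2.0, "betas": {"m1": 3.0}},
    {"a_wt": 1.0, "betas": {"m2": 2.5}},
]

for variant in ([], ["m1"], ["m1", "m2"]):
    print(variant, escape_probability(1.0, variant, epitopes))
```

In this toy example the double mutant escapes more than either the wildtype or the single mutant, because each mutation raises \(\phi_e\) at a different epitope.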

IC50s from the model

After we have fit the model parameters (activities and mutation escape values) to the experimental measurables (variant-level escape probabilities at different concentrations), for each variant we can calculate an \(IC_{50}\) (or more generally, an \(IC_x\)), which is the concentration of the polyclonal antibody mix where 50% (or x%, in the more general case) of that particular variant is neutralized.

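To do this, we just solve for the \(c\) such that \(p_v\left(c\right) = 1 - x\) where \(x\) is the fraction neutralized, so \(x = 0.5\) for an \(IC_{50}\). Because \(p_v\left(c\right)\) is a monotonically decreasing function of \(c\) (each \(U_e\left(v, c\right)\) falls as the concentration rises), any solution is unique, and we determine it numerically. A solution exists only when \(1 - x\) lies above the non-neutralizable floor \(\prod_{e=1}^E t_e\).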
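One way to carry out this numerical solve is bisection on \(\log c\), sketched below. This is our own illustration, not necessarily the root-finding method the package uses. It relies only on each \(U_e\) in Equation (2) falling as \(c\) rises. The function names, the bracket `lo`/`hi`, and the simplification that all epitopes share the same `n` and `t` are assumptions made for brevity.

```python
import math


def escape_probability(c, phis, n=1.0, t=0.0):
    """p_v(c) from Equations (1) and (2); n and t are shared across epitopes here."""
    return math.prod(
        (1.0 - t) / (1.0 + (c * math.exp(-phi)) ** n) + t for phi in phis
    )


def icx(phis, x=0.5, n=1.0, t=0.0, lo=1e-12, hi=1e12):
    """Concentration at which a fraction x of the variant is neutralized."""
    target = 1.0 - x

    def excess(c):
        # Positive while too much of the variant still escapes at concentration c.
        return escape_probability(c, phis, n, t) - target

    if excess(lo) <= 0 or excess(hi) >= 0:
        raise ValueError("no IC_x inside the bracket (check x against the floor t**E)")
    log_lo, log_hi = math.log(lo), math.log(hi)
    for _ in range(100):  # each step halves the bracket on the log scale
        mid = 0.5 * (log_lo + log_hi)
        if excess(math.exp(mid)) > 0:
            log_lo = mid  # still too much escape: the root is at higher c
        else:
            log_hi = mid
    return math.exp(0.5 * (log_lo + log_hi))


# Single epitope with n = 1 and t = 0: the IC50 has the closed form exp(phi).
print(icx([-2.0]), math.exp(-2.0))
```

For a single epitope with \(n_e = 1\) and \(t_e = 0\), setting \(p_v\left(c\right) = 0.5\) in Equation (2) gives \(c = \exp\left(\phi_e\left(v\right)\right)\), which provides a convenient check on the numerical result.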