Experimentally Informed Codon Models

Overview

Experimentally Informed Codon Models (ExpCM) describe the evolution of protein-coding genes in terms of their site-specific amino-acid preferences. These models improve on conventional non-site-specific phylogenetic substitution models because they account for the different constraints at different sites of the protein encoded by the gene.

Specifically, for each gene, we assume that we know the preference \(\pi_{r,a}\) of site \(r\) for each amino-acid \(a\) (we constrain \(1 = \sum_a \pi_{r,a}\)). Typically, these preferences might be measured in deep mutational scanning experiments. The preferences represent the effect of each amino-acid mutation at each site, normalized to sum to one. They can therefore be obtained by simply re-normalizing deep mutational scanning enrichment ratios to sum to one at each site. For a more involved description of how the preferences can be inferred from deep mutational scanning experimental data, see Bloom, BMC Bioinformatics, 16:168. Importantly, these amino-acid preferences must be obtained independently from the sequence alignment being analyzed. A deep mutational scanning experiment is an independent means of obtaining the preferences, but estimating them from the amino-acid frequencies in the alignment of homologs is not a valid approach as you are then estimating the preferences from the same sequences that you are subsequently analyzing.

The ExpCM used here are the ones defined in:

See also Bloom, Mol Biol Evol, 31:1956-1978 and Bloom, Mol Biol Evol, 31:2753-2769 for more discussion of models of this type.

These experimentally informed site-specific substitution models closely parallel those used in studies that infer the site-specific information from natural sequences; see Rodrigue and Lartillot, PNAS, 107:4629-4634. For a more general discussion of models of this form, see Halpern and Bruno, Mol Biol Evol, 15:910-917 and McCandlish and Stoltzfus, Quarterly Review of Biology, 89:225-252.

Specifically, the ExpCM implemented by phydms have the following components:

  • A reversible model of nucleotide substitution that is assumed to be constant across sites.

  • A reversible model of amino-acid substitution that differs among sites based on the site-specific amino-acid preferences.

  • A stringency parameter \(\beta\) that reflects how strongly the evolution of the gene in nature adheres to the site-specific amino-acid preferences. Under certain assumptions about the correspondence between the preferences and fitness effects of mutations, \(\beta\) probably has some connection to effective population size.

  • An \(\omega\) parameter that reflects that relative rate of nonsynonymous versus synonymous substitutions after accounting for fact that the site-specific preferences will typically retard the overall rate of nonsynonymous substitution. This \(\omega\) is not identical to the one that would be inferred using a conventional Goldman-Yang or Muse-Gaut model, since it reflects the relative rates of these two types of substitutions after accounting for the preferences. Indeed, if the preferences perfectly captured the constraints on nature evolution, we might expect \(\omega = 1\) since there would be no further reduction in the rate of nonsynonymous substitutions after accounting for the site-specific preferences.

Below is an exact definition of the ExpCM model, and an explanation of how it is used to fit site-specific measures of selection.

The ExpCM substitution model

The substitution model is in mutation-selection form. The rate of substitution \(P_{r,xy}\) from codon \(x\) to codon \(y\) at site \(r\) is

(1)\[P_{r,xy} = Q_{xy} \times F_{r,xy}\]

where \(Q_{xy}\) is the rate of mutation from x to y and \(F_{r,xy}\) represents the selection on this mutation. Note that the mutation terms are identical across sites, but the selection terms are site-specific.

The mutation rate terms \(Q_{xy}\) are given by what is essentially a HKY85 model. It consists of transition-transversion ratio \(\kappa\) and four nucleotide frequencies \(\phi_{A}\), \(\phi_{C}\), \(\phi_{G}\), and \(\phi_{T}\) (these nucleotide frequencies only constitute three independent parameters since they sum to one). The frequencies are the expected nucleotide composition in the absence of any selection on amino acids. The mutation rate terms are:

(2)\[\begin{split}Q_{xy} = \begin{cases} 0 & \mbox{if $x$ and $y$ differ by more than one nucleotide,} \\ \phi_w & \mbox{if $x$ can be converted to $y$ by a transversion of a nucleotide to $w$,} \\ \kappa \times \phi_w & \mbox{if $x$ can be converted to $y$ by a transition of a nucleotide to $w$.} \\ \end{cases}\end{split}\]

The selection terms \(F_{r,xy}\) are defined in terms of the site-specific amino acid preferences. Let \(\pi_{r,a}\) denote the preference of site \(r\) for amino acid \(a\), and let \(\operatorname{A}\left(x\right)\) denote the amino acid encoded by codon \(x\). Let \(\beta\) be a stringency parameter, and let \(\omega\) be a relative rate of nonsynonymous to synonymous mutations. Then the selection terms are:

(3)\[\begin{split}F_{r,xy} = \begin{cases} 1 & \mbox{if $\mathcal{A}\left(x\right) = \mathcal{A}\left(y\right)$} \\ \omega & \mbox{if $\mathcal{A}\left(x\right) \ne \mathcal{A}\left(y\right)$ and $\pi_{r,\mathcal{A}\left(x\right)} = \pi_{r,\mathcal{A}\left(y\right)}$} \\ \omega \times \frac{\ln\left(\left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta}\right)}{1 - \left(\left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta}\right)} & \mbox{otherwise.} \end{cases}\end{split}\]

Equation (3) differs from that described in Bloom, Mol Biol Evol, 31:2753-2769 only by the addition of the \(\omega\) parameter that differentiates nonsynonymous and synonymoust mutations; the equation in that reference was in turn based on the one originally derived by Halpern and Bruno, Mol Biol Evol, 15:910-917 (although note that the Halpern and Bruno equation contains a key typographical error in the denominator). The stringency parameter \(\beta\) has a value of greater than one if natural selection favors high-preference amino acids with greater stringency than suggested by the \(\pi_{r,a}\) values, and has a value of less than one if natural selection favors high-preference amino acids with less stringency than suggested by the \(\pi_{r,a}\) values. The \(\omega\) parameter is not a ratio of nonsynonymous to synonymous mutations, but rather their relative rates after accounting for the differing preferences among sites.

The stationary state of the model defined by the mutation terms alone is trivially observed to be

\[q_x = \phi_{x_1} \times \phi_{x_2} \times \phi_{x_3}\]

where \(x_1\), \(x_2\), and \(x_3\) are the nucleotides at positions 1, 2, and 3 of codon \(x\), and it is also trivially observed that this mutation-term only model is reversible (since direct substitution easily verifies that \(q_x \times Q_{xy} = q_y \times Q_{yx}\)).

In Bloom, Mol Biol Evol, 31:2753-2769, it is shown that the stationary state of the model defined by the selection terms alone is

\[f_{r,x} \propto \left(\pi_{r,\operatorname{A}\left(x\right)}\right)^{\beta},\]

and furthermore that this selection-term only model is reversible (the derivation in Bloom, Mol Biol Evol, 31:2753-2769 doesn’t include the \(\omega\) term, but carrying through the same derivation with this term yields the above result).

Finally, Bloom, Mol Biol Evol, 31:2753-2769 shows that given these stationary states, the overall model defined by (1) is reversible and has stationary state

\[p_{r,x} = \frac{f_{r,x} \times q_x}{\sum_y f_y \times q_y} = \frac{\left(\pi_{r,\operatorname{A}\left(x\right)}\right)^{\beta} \times q_x}{\sum_y \left(\pi_{r,\operatorname{A}\left(y\right)}\right)^{\beta} \times q_y}.\]

Therefore, assuming that the preferences \(\pi_{r,a}\) are known a priori for all amino acids \(a\) at all sites \(r\) (e.g. the preferences have been measured in a deep mutational scanning experiment), the substitution model is completely defined by giving values to the following six parameters: \(\omega\), \(\beta\), \(\kappa\), \(\phi_A\), \(\phi_C\), and \(\phi_G\). When fitting an ExpCM for a gene phylogeny, phydms assumes that these six parameters are constant across all sites, and optimizes their values by maximum likelihood.

Empirical nucleotide frequencies

The foregoing model has three free parameters (\(\phi_A\), \(\phi_C\), and \(\phi_G\)) that are related to the nucleotide frequencies and are independent of the site-specific constraints. Classically, many substitution models determine the values of these nucleotide frequency parameters by setting them equal to the empirically observed nucleotide frequencies in the alignment rather than fitting them by maximum likelihood. This procedure is computationally faster.

However, the the ExpCM described above, this cannot be done in such a simple fashion. The reason is that the \(\phi_w\) parameters give the expected frequency of nucleotide \(w\) for purely mutation-driven evolution – but in reality, the observed nucleotide frequencies are also affected by selection on the amino acid usage. That this is really the case in evolution is easily observed by noting that for many genes, the nucleotide frequencies differ greatly among the first, second, and especially third codon position. These differences are unlikely to be driven by mutation (which is probably indifferent to the reading frame in which a nucleotide falls), and is instead driven by the fact that selection on the amino-acid usage favors certain nucleotides at some codon positions since that is the only way to encode some amino acids.

However, we can still empirically estimate the \(\phi_w\) parameters from the alignment frequencies by taking the amino-acid level constraint into account. Specifically, let \(g_w\) denote the empirically observed frequency of nucleotide \(w\) in the alignment, noting that \(1 = \sum_w g_w\). Let \(\hat{\phi}_w\) be the value of \(\phi_w\) that yields a stationary state of the substitution model where nucleotide \(w\) is at frequency \(g_w\). The value of \(\hat{\phi}_w\) must then satisfy:

\[\begin{split}g_w & = & \frac{1}{L} \sum_r \sum_x \frac{1}{3} N_w\left(x\right) p_{r,x} \\ & = & \frac{1}{3L} \sum_r \frac{\sum_x N_w\left(x\right) f_{r,x} \prod_{k=0}^2 \hat{\phi}_{x_k}}{\sum_y f_{r,y} \prod_{k=0}^2 \hat{\phi}_{y_k}}.\end{split}\]

There are three independent \(g_w\) values and three independent \(\phi_w\) values (since \(1 = \sum_w g_w = \sum_w \hat{\phi}_w\), so we have three equations and three unknowns. These equations can be solved numerically for \(\hat{\phi}_w\), and these empirical estimates can be used in the place of estimating \(\phi_w\) by maximum likelihood. Such a procedure speeds the numerical optimization by reducing the number of free parameters that must be optimized.

Gamma-distributed \(\omega\)

In codon substitution models, it is common to allow \(\omega\) to be drawn from several discrete gamma-distributed categories. For instance, this is what is done for the M5 variant of the YNGKP models described by Yang, Nielsen, Goldman, and Krabbe Pederson, Genetics, 155:431-449. This can also be done for ExpCM by using the --gammaomega option. In this case, \(\omega\) is drawn from --ncats categories placed at the means of a discretized gamma distribution with shape parameter \(\alpha_{\omega}\) and inverse scale parameter \(\beta_{\omega}\). Note that the mean \(\omega\) value is then \(\alpha_{\omega} / \beta_{\omega}\).

Gamma-distributed \(\beta\)

We can also allow the ExpCM \(\beta\) parameter to be drawn from gamma-distributed categories using the --gammabeta option. In this case, \(\beta\) is drawn from --ncats categories placed at the means of a discretized gamma distribution with shape parameter \(\alpha_{\beta}\) and inverse scale parameter \(\beta_{\beta}\). Note that the mean \(\beta\) value is then \(\alpha_{\beta} / \beta_{\beta}\).

Identifying diversifying selection via site-specific \(\omega_r\) values

One type of interesting selection is diversifying selection, where there is continual pressure for amino-acid change. Such selection might be expected to occur at sites that are targeted by adaptive immunity or subjected to some other form of selection which constantly favors changes in the protein sequence. At such sites, we expect that the relative rate of nonsynonymous substitutions will be higher than suggested by the site-specific preferences \(\pi_{r,a}\) due to this diversifying selection.

To detect diversifying selection at specific sites within the framework of the ExpCM implemented in phydms, we use an approach that is highly analogous the FEL (fixed effects likelihood) method described by Kosakovsky Pond and Frost, Mol Biol Evol, 22:1208-1222. Essentially, the tree topology, branch lengths, and all shared model parameters are fixed to their maximum-likelihood values optimized over the entire gene sequence. Then for each site \(r\), we fit a site-specific ratio of the rate of synonymous versus nonsynonymous substitutions while holding all holding all the other tree and model parameters constant. Effectively, this is fitting a different \(\omega_r\) for each site, and so this analysis is indicated as --omegabysite in the phydms options.

Specifically, after fixing all of the other parameters as described above, for each site \(r\) we re-define Equation (3) as

(4)\[\begin{split}F_{r,xy} = \begin{cases} \mu_r & \mbox{if $\mathcal{A}\left(x\right) = \mathcal{A}\left(y\right)$} \\ \mu_r \times \omega_r & \mbox{if $\mathcal{A}\left(x\right) \ne \mathcal{A}\left(y\right)$ and $\pi_{r,\mathcal{A}\left(x\right)} = \pi_{r,\mathcal{A}\left(y\right)}$} \\ \mu_r \times \omega_r \times \frac{\ln\left(\left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta}\right)}{1 - \left(\left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta}\right)} & \mbox{otherwise.} \end{cases}\end{split}\]

and then fit the values for \(\mu_r\) and \(\omega_r\). After this fitting, \(\mu_r\) can be interpreted as the synonymous rate, and \(\mu_r \times \omega_r\) as the nonsynonymous rate. The reason that we fit \(\mu_r\) as well as \(\omega_r\) is to model variation is synonymous rate; this can be important for the reasons described in the Discussion of Kosakovsky Pond and Frost, Mol Biol Evol, 22:1208-1222. If you use the --omegabysite_fixsyn option to phydms then \(\mu_r\) is not fit, but rather is constrained to one.

The null hypothesis is that \(\omega_r = 1\). We compute a P-value for rejection of this null hypothesis using a \(\chi_1^2\) test to compare the likelihood obtained when fitting both \(\mu_r\) and \(\omega_r\) to that obtained when fitting only \(\mu_r\) and fixing \(\omega_r = 1\). See Kosakovsky Pond and Frost, Mol Biol Evol, 22:1208-1222 for a justification for using a \(\chi_1^2\) test for this type of analysis. Note that the P-values reported by phydms are not adjusted for multiple testing, so you will want to make such an adjustment if you are testing the hypothesis that any site has \(\omega_r \ne 1\). Note also that in many cases, the fitted value of \(\omega_r\) will either be very small (e.g. close to zero) or very large (e.g. close to \(\infty\)) – in general, it is more informative to look for sites with small P-values and then simply look to see if \(\omega\) is > or < 1.

Significant support for a value of \(\omega_r > 1\) can be taken as evidence for diversifying selection beyond that expected given the constraints encapsulated in the site-specific amino-acid preferences. Significant support for a value of \(\omega_r < 1\) can be taken as evidence for selection against amino-acid change beyond that expected given the constraints encapsulated in the site-specific amino-acid preferences. Note, however, that if the site-specific preferences don’t accurately describe the real constraints, you might get \(\omega_r \ne 1\) simply because of this fact – so you will want to examine if sites might be subject to selection that is better described by modulating the stringency parameter \(\beta\) or by invoking differential preferences, as described below.

Identifying differentially selected amino acids by fitting preferences for each site

A more complete approach is to examine each site to see the extent to which the preferences for each amino acid in nature differ from those encapsulated in the \(\pi_{r,a}\) values. The advantage of this approach is that it can identify any form of differential selection (the approach in the previous section works best when the selection in nature is more uniform across amino acids than the \(\pi_{r,a}\) values), and also that it can pinpoint specific amino acids that are favored or disfavored in natural evolution by an unexpected amount. The disadvantage is that phydms does not currently implement a good way to statistically test the significance of this type of differential selection, so although you can visualize and assess the selection it’s hard to say that any given differential selection is significant at some specific P-value threshold.

To identify differential selection, we first fix the tree / branch lengths and all parameters of the substitution model at their maximum-likelihood values obtained across the whole tree. We then define a new set of 20 preferences for each site \(r\) that we will denote as \(\hat{\pi}_{r,a}\) (note that these only represent 19 parameters, as \(1 = \sum_a \hat{\pi}_{r,a}\). We then redefine Equation (3) replacing the \(\left(\pi_{r,a}\right)^{\beta}\) values with the \(\hat{\pi}_{r,a}\) values as

(5)\[\begin{split} F_{r,xy} = \begin{cases} 1 & \mbox{if $\mathcal{A}\left(x\right) = \mathcal{A}\left(y\right)$} \\ \omega & \mbox{if $\mathcal{A}\left(x\right) \ne \mathcal{A}\left(y\right)$ and $\hat{\pi}_{r,\mathcal{A}\left(x\right)} = \hat{\pi}_{r,\mathcal{A}\left(y\right)}$} \\ \omega \times \frac{\ln\left(\hat{\pi}_{r,\mathcal{A}\left(y\right)} / \hat{\pi}_{r,\mathcal{A}\left(x\right)}\right)}{1 - \left(\hat{\pi}_{r,\mathcal{A}\left(x\right)} / \hat{\pi}_{r,\mathcal{A}\left(y\right)}\right)} & \mbox{otherwise.}. \end{cases}\end{split}\]

Simply fitting the model defined above with these 19 \(\hat{\pi}_{r,a}\) values will probably overfit the data since we are including 19 new parameters. We therefore regularize the parameters by defining a prior that favors \(\hat{\pi}_{r,a} = \left(\pi_{r,a}\right)^{\beta}\).

Although a Dirichlet prior peaked on the preferences might seem attractive, it performs poorly in practice because the maximum a posteriori is very different for small preferences depending on their exact magnitude – for instance, under a Dirichlet prior we will have very different costs of increasing the differential preference by 0.1 depending on whether the a priori peak estimate is \(10^{-3}\) or \(10^{-4}\). This is undesirable, so instead we use a prior based on the product of inverse-quadratics.

(6)\[\Pr\left(\left\{\hat{\pi}_{r,a}\right\} \mid \left\{\pi_{r,a}\right\}, \beta\right) = \prod_a \left(\frac{1}{1 + C_1 \times \left(\hat{\pi}_{r,a} - \frac{\left(\pi_{r,a}\right)^{\beta}}{\sum_{a'} \left(\pi_{r,a'}\right)^{\beta}}\right)^2}\right)^{C_2}\]

or equivalently

(7)\[\log\left[\Pr\left(\left\{\hat{\pi}_{r,a}\right\} \mid \left\{\pi_{r,a}\right\}, \beta\right)\right] = - C_2 \sum_a \log\left(1 + C_1 \times \left(\hat{\pi}_{r,a} - \frac{\left(\pi_{r,a}\right)^{\beta}}{\sum_{a'} \left(\pi_{r,a'}\right)^{\beta}}\right)^2\right)\]

where \(C_1\) and \(C_2\) are concentration parameters specified via the --diffprefconc option. Larger values of these favor \(\hat{\pi}_{r,a}\) values that more closely match the \(\left(\pi_{r,a}\right)^{\beta}\) values (and so favor smaller values of \(\Delta\pi_{r,a} = \hat{\pi}_{r,a} - \frac{\left(\pi_{r,a}\right)^{\beta}}{\sum_{a'} \left(\pi_{r,a'}\right)^{\beta}}\)).

The phydms program has reasonable defaults for these concentration parameters, but you can fine tune them with --diffprefconc.

Here is a plot of the log prior (Equation (7)) as a function of \(C_1\) and \(C_2\).

_images/diffprefs_prior_plot.png

As the plot makes clear, this prior has the desirable feature of penalizing small differential preferences proportionally more than larger ones, which is good if we think that most sites have no differential selection but some have a lot.

To obtain the \(\hat{\pi}_{r,a}\) values, the --diffprefsbysite option to phydms jointly optimizes the product of the prior in Equation (6) and the likelihood (with fixed tree / branches and other model parameters) using Equation (5). In effect, this is the maximum a posteriori estimate of the \(\hat{\pi}_{r,a}\) given the prior defined by Equation (6).

One might wonder why phydms takes the maximum a posteriori estimates rather than defining a prior with \(\left(\pi_{r,a}\right)^{\beta}\) as the mean rather than the mode, and then sampling from the posterior to obtain the posterior mean estimates. In fact, this procedure would probably be better, but MCMC sampling of the posterior is much more computationally intensive than the maximum a posteriori approach.

The actual values reported by phydms are the differential preferences, defined as

(8)\[\Delta\pi_{r,a} = \hat{\pi}_{r,a} - \frac{\left(\pi_{r,a}\right)^{\beta}}{\sum_{a'} \left(\pi_{r,a'}\right)^{\beta}}.\]

So a differential preference of \(\Delta\pi_{r,a} > 0\) suggests that natural evolution favors amino-acid \(a\) at site \(r\) more than suggested by the preferences, and a value < 0 suggests that natural evolution disfavors this amino acid. One way to summarize the total difference in preferences is the root-mean-square of the differential preferences at a site, defined as

(9)\[RMS_{\Delta\pi_{r,a}} = \sqrt{\frac{1}{N_a} \sum_a \left(\Delta\pi_{r,a}\right)^2}.\]

Another way to summarize the total difference in preferences is the half the absolute sum of the differential preferences, value:

(10)\[\frac{1}{2} \sum_a \left|\Delta\pi_{r,a}\right|\]

The theoretical maximum of this quantity is one.

Unfortunately, phydms does not currently include any method for statistically testing (e.g. P-values) the hypothesis that any given \(\Delta\pi_{r,a}\) is not equal to zero. So instead, you will have to keep in mind that these values are regularized (and so do not typically become substantially non-zero without some reasonable statistical evidence), and then manually inspect them for interesting trends.

Specifiying diversifying pressure at sites

In some cases, it might be possible to specify a priori expections for the diversifying pressure at each site. For instance, viruses benefit from amino-acid change in sites targeted by the immune system and, consequently, these sites have a higher rate of amino-acid substitution than expected given their level of inherent functional constraint. We can incorporate our expectations for diversifying pressure at specific sites into the selection terms \(F_{r,xy}\).

Let \(\delta_{r}\) be the pre-determined diversifying pressure for amino-acid change at site \(r\) in the protein. A large positive value of \(\delta_r\) corresponds to high pressure for amino-acid diversification, and negative value corresponds to expected pressure against amino-acid diversification beyond that captured in the amino-acid preferences. We then replace \(\omega\) in Equation (3) with the expression \(\omega\times\left(1+\omega_{2}\times\delta_{r}\right)\), resulting in selection terms:

(11)\[\begin{split}F_{r,xy} = \begin{cases} 1 & \mbox{if $\mathcal{A}\left(x\right) = \mathcal{A}\left(y\right)$} \\ \omega\times\left(1+\omega_{2}\times\delta_{r}\right) & \mbox{if $\mathcal{A}\left(x\right) \ne \mathcal{A}\left(y\right)$ and $\pi_{r,\mathcal{A}\left(x\right)} = \pi_{r,\mathcal{A}\left(y\right)}$} \\ \omega\times\left(1+\omega_{2}\times\delta_{r}\right) \times \frac{\ln\left(\left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta}\right)}{1 - \left(\left(\pi_{r,\mathcal{A}\left(x\right)}\right)^{\beta} / \left(\pi_{r,\mathcal{A}\left(y\right)}\right)^{\beta}\right)} & \mbox{otherwise.} \end{cases}\end{split}\]

Whereas before \(\omega\) reflected the elevation of non-synonymous substitutin rate (averaged across the entire gene) beyond that expected given the amino-acid preferences, now \(\omega\) reflects a gene-wide rate of elevated non-synonymous substitution after taking into account the expected sites of diversifying pressure (as represented by \(\delta_r\)) weighted by \(\omega_{2}\times\delta_{r}\). These new selection terms in equation Equation (11) are identical the selection terms in Equation (3) when \(\omega_{2} = 0\).

To ensure a positive value of \(\omega\times\left(1+\omega_{2}\times\delta_{r}\right)\) with \(-\infty<\delta_{r}<\infty\), \(\omega\) and \(\omega_{2}\) are constrained in the following ways:

\(\omega>0\)

\(\begin{cases} \frac{-1}{\max_r\delta_r}<\omega_{2}<\infty & \text{when }\min_r\delta_r>0\\ -\infty<\omega_{2}<\frac{-1}{\min_r\delta_r} & \text{when }\max_r\delta_r<0\\ \frac{-1}{\max_r\delta_r}<\omega_{2}<\frac{-1}{\min_r\delta_r} & \text{otherwise}\\ \end{cases}\)

This results in a total of seven free parameters: \(\omega\), \(\omega_{2}\), \(\beta\), \(\kappa\), \(\phi_A\), \(\phi_C\), and \(\phi_G\), an increase of one from the model with \(F_{r,xy}\) defined by Equation (3). A likelihood ratio test can therefore by used to test if adding the specified site-specific diversifying pressure represented by the \(\delta_r\) values actually improves the description of the gene’s evolution.