phydmslib.models module¶
Substitution models.
All models defined here are subclasses of abstract base class Model.
Nucleotides, amino acids, and codons are indexed by integers 0, 1, … using the indexing schemes defined in phydmslib.constants.
-
phydmslib.models.
DiscreteGamma
(alpha, beta, ncats)¶ Returns category means for discretized gamma distribution.
The distribution is evenly divided into categories, and the mean of each category is returned.
- Args:
- alpha (float > 0)
Shape parameter of gamma distribution.
- beta (float > 0)
Inverse scale parameter of gamma distribution.
- ncats (int > 1)
Number of categories in discretized gamma distribution.
- Returns:
- catmeans (numpy.ndarray of float, shape (ncats,))
catmeans[k] is the mean of category k where 0 <= k < ncats.
Check that we get values in Figure 1 of Yang, J Mol Evol, 39:306-314 >>> catmeans = DiscreteGamma(0.5, 0.5, 3) >>> numpy.allclose(catmeans, numpy.array([0.0603,0.4894,2.4502]),atol=1e-4) True
Make sure we get expected mean of alpha / beta >>> alpha = 0.6 >>> beta = 0.8 >>> ncats = 6 >>> catmeans = DiscreteGamma(alpha, beta, ncats) >>> numpy.allclose(catmeans.sum() / ncats, alpha / beta) True
-
class
phydmslib.models.
DistributionModel
¶ Bases:
phydmslib.models.Model
Substitution model with a parameter drawn from distribution.
This abstract base class defines required methods / attributes of substitution models with one parameter (distributedparam) drawn from a distribution. An example is that omega might be drawn from a gamma distribution.
-
abstract
M
(k, t, tips=None, gaps=None)¶ Matrix exponential M(mu * t) = exp(mu * t * P) for category k.
Similar to definition for Model abstract base class except also specifies for which category k we are returning the matrix exponential. k is an int with 0 <= k < ncats.
-
abstract property
basemodel
¶ Base model over which this model is distributed.
A DistributionModel is a base model with some property drawn from a distribution. This gives such a base model. Parameters of this base model that are not distributed among sites will be current, but the distributed parameter will not.
-
abstract property
catweights
¶ Weight assigned to each category.
Is a numpy.ndarray of float of shape (ncats,) where each entry is > 0 and the entries sum to one. catweights[k] is the weight assigned to category k where 0 <= k < ncats.
-
abstract
dM
(k, t, param, Mkt, tips=None, gaps=None)¶ Derivative of M(k, t) with respect to param.
Similar to definiton for Model abstract base class except also specifies for which category k we are returning the derivative. k is an int with 0 <= k < ncats.
Also, param can be any string in freeparams except those in distributionparams, and can also be distributedparam.
-
abstract property
d_distributionparams
¶ Derivatives of distributedparam w.r.t distributionparams.
- Returns:
- d_distributionparams
(dict keyed by params in distributionparams) d_distribution[param] is a numpy.ndarray of float of shape (ncats,), with d_distributionparam[param][k] giving the derivative of the distributedparam for category k with respect to param where param is in distributionparams.
-
abstract property
distributedparam
¶ String giving name of parameter drawn from the distribution.
This parameter must be in freeparams.
-
abstract property
distributionparams
¶ List of params giving distribution shape for distributedparam.
-
abstract
dstationarystate
(k, param)¶ Derivative of stationarystate w.r.t param for category k.
- Args:
- k (int)
Category for which we are computing stationary state.
- param (string)
A string in freeparams or distributedparam
- Returns:
- dstationarystate (numpy.ndarray of floats or zero)
If param is a float, then dstationarystate[k][r][x] is derivative of stationarystate[k][r][x] with respect to param for category k. If param is an array, then dstationarystate[k][i][r][x] is derivative of stationarystate[k][r][x] with respect to param[i] for category k.
-
abstract property
ncats
¶ Number of categories for distributed parameter.
Is an int > 1.
-
abstract
stationarystate
(k)¶ Stationary state of substitution model.
- Args:
- k (int)
Category for which we are computing stationary state.
A numpy.ndarray of floats, shape (k, nsites, N_CODON). Element stationarystate[k][r][x] is the stationary state probability of codon x at site r for category k.
-
abstract
-
class
phydmslib.models.
ExpCM
(prefs, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, phi=array([0.25, 0.25, 0.25, 0.25]), freeparams='kappa', 'omega', 'beta', 'mu', 'eta')¶ Bases:
phydmslib.models.Model
Experimentally informed codon models (ExpCM) for a gene.
See __init__ method for how to initialize an ExpCM.
Attributes should only be updated via the updateParams method, do not set attributes directly.
- Attributes (see also those inherited from Model):
- pi (numpy.ndarray of floats, shape (nsites, N_AA))
pi[r][a] is preference of site r for amino-acid a.
- kappa (float > 0)
Transition-transversion ratio.
- omega (float > 0)
Nonsynonymous to synonymous substitution ratio.
- beta (float > 0)
Stringency parameter re-scaling amino-acid preferences.
- phi (numpy.ndarray of floats, length N_NT)
Nucleotide frequency params summing to one, is computed from eta. We use eta rather than phi as the free parameter during optimization.
- eta (numpy.ndarray of floats, length N_NT - 1)
Transformation of the nucleotide frequency params in phi; all entries are > 0 and < 1. We use eta rather than phi as the free parameter during optimization.
- Prxy (numpy.ndarray of floats, shape (nsites, N_CODON, N_CODON)
Prxy[r][x][y] is substitution rate from codon x to y at r. Diagonal elements make rows sum to zero for each Prxy[r].
- prx (numpy.ndarray of floats, shape (nsites, N_CODON)
prx[r][x] is stationary state of Prxy for codon x at r. This attribute is equivalent to stationarystate.
- Qxy (numpy.ndarray of floats, shape (N_CODON, N_CODON)
Qxy[x][y] is mutation rate from x to y, diagonal undefined.
- Frxy (numpy.ndarray of floats, shape (nsites, N_CODON, N_CODON)
Frxy[r][x][y] fixation prob from x to y, diagonal undefined for each Frxy[r].
- Frxy_no_omega
Like Frxy but not multiplied by omega for non-synonymous mutations
- pi_codon (numpy.ndarray of floats, shape (nsites, N_CODON))
pi_codon[r][x] is preference of site r for amino acid encoded by codon x.
- ln_pi_codon (numpy.ndarray floats, shape (nsites, N_CODON))
Natural logarithm of pi_codon.
- piAx_piAy
(numpy.ndarray floats, shape (nsites, N_CODON, N_CODON)) piAx_piAy[r][x][y] is ratio of preference for amino acid encoded by codon x divided by pref for that encoded by y.
- piAx_piAy_beta
(numpy.ndarray floats, shape (nsites, N_CODON, N_CODON)) Equal to piAx_piAy raised to the power of beta.
- ln_piAx_piAy_beta (`numpy.ndarray of floats)
Natural logarithm of piAx_piAy_beta.
- dPrxy (dict)
Keyed by each string in freeparams, each value is numpy.ndarray of floats giving derivative of Prxy with respect to that parameter. The shape of each array (nsites, N_CODON, N_CODON) except for eta, for which it is (N_NT - 1, nsites, N_CODON, N_CODON) with the first index ranging over each element in eta.
- dprx (dict)
Keyed by strings in freeparams, each value is numpy.ndarray of floats giving derivative of prx with respect to that param, or 0 if if prx does not depend on parameter. The shape of each array is (nsites, N_CODON) except for eta, for which it is (N_NT - 1, nsites, N_CODON) with the first index ranging over each element in eta. Equivalent to dstationarystate[param].
- D (numpy.ndarray floats, shape (nsites, N_CODON))
Element r is diagonal of \(\mathbf{D_r}\) diagonal matrix, \(\mathbf{P_r} = \mathbf{A_r}^{-1} \mathbf{D_r} \mathbf{A_r}\)
- A (numpy.ndarray floats, shape (nsites, N_CODON, N_CODON))
Element r is \(\mathbf{A_r}\) matrix.
- Ainv (numpy.ndarray floats, shape (nsites, N_CODON, N_CODON))
Element r is \(\mathbf{A_r}^{-1}\) matrix.
- B (dict)
Keyed by strings in freeparams, each value is numpy.ndarray of floats giving B matrix for that parameter. The shape is (nsites, N_CODON, N_CODON) except for eta, for which it is (N_NT - 1, nsites, N_CODON, N_CODON) with the first index ranging over each element in eta.
-
ALLOWEDPARAMS
= ['kappa', 'omega', 'beta', 'eta', 'mu']¶
-
M
(t, tips=None, gaps=None)¶ See docs for method in Model abstract base class.
-
property
PARAMLIMITS
¶ See docs for Model abstract base class.
-
PARAMTYPES
= {'beta': <class 'float'>, 'eta': (<class 'numpy.ndarray'>, (3,)), 'kappa': <class 'float'>, 'mu': <class 'float'>, 'omega': <class 'float'>, 'phi': (<class 'numpy.ndarray'>, (4,)), 'pi': <class 'float'>}¶
-
__init__
(prefs, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, phi=array([0.25, 0.25, 0.25, 0.25]), freeparams='kappa', 'omega', 'beta', 'mu', 'eta')¶ Initialize an ExpCM object.
- Args:
- prefs (list)
List of dicts giving amino-acid preferences for each site. Each dict keyed by amino acid letter codes, value is pref > 0 and < 1. Must sum to 1 at each site.
- kappa, omega, beta, mu, phi
Model params described in main class doc string.
- freeparams (list of strings)
Specifies free parameters.
-
property
branchScale
¶ See docs for Model abstract base class.
-
dM
(t, param, Mt, tips=None, gaps=None)¶ See docs for method in Model abstract base class.
-
dlogprior
(param)¶ Zero for all param, as no prior defined over this model.
-
dstationarystate
(param)¶ See docs for Model abstract base class.
-
property
freeparams
¶ See docs for Model abstract base class.
-
property
logprior
¶ Is zero, as no prior is defined over this model.
-
property
mu
¶ See docs for Model abstract base class.
-
property
nsites
¶ See docs for Model abstract base class.
-
property
paramsReport
¶ See docs for Model abstract base class.
-
spielman_wr
(norm=True)¶ Calculate site-specific omega values the ExpCM.
- Args:
- norm (bool)
If True, normalize the omega_r values by the ExpCM gene-wide omega.
- Returns:
- wr (list)
list of omega_r values of length nsites
Following Spielman and Wilke, MBE, 32:1097-1108, we can predict the dN/dS value for each site r, \(\rm{spielman}\omega_r\), from the ExpCM.
When norm is False, the omega_r values are defined as \(\rm{spielman}\omega_r = \frac{\sum_x \sum_{y \in N_x}p_{r,x}P_{r,xy}} {\sum_x \sum_{y \in Nx}p_{r,x}Q_{xy}}\), where r,x,y, \(p_{r,x}\), \(P_{r,xy}\), and \(Q_{x,y}\) have the same definitions as in the main ExpCM doc string and \(N_{x}\) and differ from x by one nucleotide.
When norm is True, the omega_r values above are divided by the ExpCM omega value.
-
property
stationarystate
¶ See docs for Model abstract base class.
-
updateParams
(newvalues, update_all=False)¶ See docs for Model abstract base class.
-
class
phydmslib.models.
ExpCM_empirical_phi
(prefs, g, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, freeparams='kappa', 'omega', 'beta', 'mu')¶ Bases:
phydmslib.models.ExpCM
An ExpCM with phi calculated empirically from nucleotide freqs.
See __init__ method for how to initialize an ExpCM_empirical_phi.
The difference between and ExpCM and an ExpCM_empirical_phi is that the nucleotide frequency parameters phi are now calculated empirically from their observed frequences (denoted g) in the codon alignment such that the stationary state gives the empirically observed nucleotide frequencies.
So compared to an ExpCM, eta (the transformation of phi) is no longer a free parameter, but is computed as a function of other model attributes.
- Has all the attributes of an ExpCM plus the following:
- g (numpy.ndarray of float, length N_NT)
Empirical nucleotide frequencies in alignment, with g[m] being frequency of nucleotide m. Must sum to one.
- dphi_dbeta (numpy.ndarray of float, length N_NT)
Derivative of phi with respect to beta.
-
ALLOWEDPARAMS
= ['kappa', 'omega', 'beta', 'mu']¶
-
PARAMTYPES
= {'beta': <class 'float'>, 'eta': (<class 'numpy.ndarray'>, (3,)), 'g': (<class 'numpy.ndarray'>, (4,)), 'kappa': <class 'float'>, 'mu': <class 'float'>, 'omega': <class 'float'>, 'phi': (<class 'numpy.ndarray'>, (4,)), 'pi': <class 'float'>}¶
-
__init__
(prefs, g, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, freeparams='kappa', 'omega', 'beta', 'mu')¶ Initialize an ExpCM_empirical_phi object.
- Args:
- prefs, kappa, omega, beta, mu, freeparams
Same meaning as for an ExpCM
- g
Has the meaning described in the main class doc string.
-
class
phydmslib.models.
ExpCM_empirical_phi_divpressure
(prefs, g, divPressureValues, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, omega2=0.0, freeparams='kappa', 'omega', 'beta', 'mu', 'omega2')¶ Bases:
phydmslib.models.ExpCM_empirical_phi
ExpCM with empirical phi and a priori diversifying pressure.
See __init__ method for how to initialize.
Difference between ExpCM_empirical_phi and ExpCM_empirical_phi_divpressure`is that `omega is replaced by omega * (1 + omega2 * deltar) where deltar is the expectation of diversifying pressure at site r.
The rate of non-synonymous to synonymous change is proportional to the expectation of diversifying pressure.
- Has all the attributes of an ExpCM_empirical_phi plus the following:
- omega2 (float)
Part of the expression which replaces omega.
- divPressureValues (numpy.array of length nsites)
Array of deltar values in site order. The maximum absolute value of the array should be between 0 and 1.
-
ALLOWEDPARAMS
= ['kappa', 'omega', 'beta', 'mu', 'omega2']¶
-
PARAMTYPES
= {'beta': <class 'float'>, 'eta': (<class 'numpy.ndarray'>, (3,)), 'g': (<class 'numpy.ndarray'>, (4,)), 'kappa': <class 'float'>, 'mu': <class 'float'>, 'omega': <class 'float'>, 'omega2': <class 'float'>, 'phi': (<class 'numpy.ndarray'>, (4,)), 'pi': <class 'float'>}¶
-
__init__
(prefs, g, divPressureValues, kappa=2.0, omega=0.5, beta=1.0, mu=1.0, omega2=0.0, freeparams='kappa', 'omega', 'beta', 'mu', 'omega2')¶ Initialize an ExpCM_empirical_phi_divpressure object.
- Args:
- prefs, kappa, omega, beta, mu, g, freeparams
Same meaning as for an ExpCM_empirical_phi
- divPressureValues, omega2
Meaning described in the main class doc string.
-
class
phydmslib.models.
ExpCM_fitprefs
(prefs, prior, kappa, omega, phi, mu=1.0, origbeta=1.0, freeparams='zeta')¶ Bases:
phydmslib.models.ExpCM
An ExpCM with the preferences pi as free parameters.
The difference between ExpCM and ExpCM_fitprefs is that the amino-acid preferences pi are optimized for the latter. There can also be a regularizing prior on these preferences.
The way this class is currently implemented, it will be inefficient if there is more than one site. The recommended usage is to optimize across-site parameters with fixed preferences, then optimize preferences for each site individually with this class after fixing across-site parameters. You currently can not initialize with more than one site.
See __init__ method for how to initialize an ExpCM_fitprefs.
- Has all the attributes of an ExpCM, plus:
- zeta (numpy.ndarray of float, shape (nsites * (N_AA - 1)))
Transformation of the preferences pi; all entries are > 0 and < 1. We use zeta rather than pi as the free parameter during optimization. The entry zeta.reshape(nsites, N_AA - 1)[r][i] is value i for site r (0 <= r < nsites) where 0 <= i < N_AA - 1.
- dPrxy (dict)
Like for ExpCM, but also contains entry keyed by ‘zeta’ of of shape (nsites * (N_AA - 1), nsites, N_CODON, N_CODON) where the first index ranges over the elements in zeta.
- tildeFrxy
(numpy.ndarray of float, shape (nsites, N_CODON, N_CODON)) Contains quantities used in calculating derivative of dPrxy with respect to zeta.
- origpi (numpy.ndarray of float, shape (nsites, N_AA))
origpi[r][a] is the original preference for amino-acid a at site r as given by the calling parameter prefs and re-scaled by origbeta. This value is not updated as zeta and pi are optimized, and is rather used to keep track of where the optimization starts. Set to one if you have not already optimized a stringency parameter prior to initializing this class.
- prior (None or a tuple)
Specifies the regularizing prior over the preferences. This prior is centered on the preferences in origpi. A value of None indicates no regularizing prior. A tuple of (‘invquadratic’, c1, c2) is the inverse quadratic prior described in Bloom, Biology Direct, 12:1.
beta (the stringency parameter) is not a free parameter, and is instead fixed to 1. This is because it does not make sense to optimize a stringency parameter if you are also optimizing the preferences as they are confounded.
-
ALLOWEDPARAMS
= ['kappa', 'omega', 'eta', 'mu', 'zeta']¶
-
__init__
(prefs, prior, kappa, omega, phi, mu=1.0, origbeta=1.0, freeparams='zeta')¶ Initialize an ExpCM_fitprefs object.
The calling parameters have the meaning described in the main class doc string. The default values have changed consistent with recommended usage of this class: that you have already estimated the across-site parameters (kappa, omega, mu, phi) and so are specifying fixed values for those. You are then just optimizing the preferences in their variable-transformed form zeta from the initial values in prefs scaled by origbeta. The meaning of origbeta is described in the main class doc string.
- Additional arguments:
- origbeta (float)
The preferences in prefs are re-scaled by origbeta and used as the initial value and stored in origpi. origbeta might differ from 1 if you already optimized prefs using a fixed-preference ExpCM prior to initializing this model.
-
dlogprior
(param)¶ Value of derivative of prior depends on value of prior.
-
property
logprior
¶ Value of log prior depends on value of prior.
-
class
phydmslib.models.
ExpCM_fitprefs2
(prefs, prior, kappa, omega, phi, mu=1.0, origbeta=1.0, freeparams='zeta')¶ Bases:
phydmslib.models.ExpCM_fitprefs
An ExpCM with the preferences pi as free parameters.
This class differs from ExpCM_fitprefs in the way that it internally handles the transformation from the preferences pi to the optimization variables zeta. All other attributes are the same as for an ExpCM_fitprefs.
-
class
phydmslib.models.
GammaDistributedBetaModel
(model, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Bases:
phydmslib.models.GammaDistributedModel
Implements gamma distribution over beta for a model.
See __init__ method for how to initialize a GammaDistributedBetaModel.
Attributes should only be updated via the updateParams method; do not set attributes directly.
Attributes are inherited from GammaDistributedModel.
-
__init__
(model, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Initialize an GammaDistributedModel object.
The lambda_param is set to “beta”.
- Args:
- model ncats,`alpha_lambda`, beta_lambda, freeparams
Meaning described in main class doc string for GammaDistributedModel.
-
-
class
phydmslib.models.
GammaDistributedModel
(model, lambda_param, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Bases:
phydmslib.models.DistributionModel
Implements gamma distribution over some parameter lambda for a model.
This model can be used to take a simple substitution model that directly subclasses Model and implement a gamma distribution over one parameter lambda. For instance, if this is done for the YNGKP_M0 model parameter omega, it yields a M5 variant of the YNGKP model.
The lambda parameter is drawn from ncats categories, with the values being at the mean of each of these categories and equal weight assigned to each category.
This means that rather than optimizing lambda directly, we optimize the shape and inverse-scale parameters of its gamma distribution.
See __init__ method for how to initialize a GammaDistributedModel.
Attributes should only be updated via the updateParams method; do not set attributes directly.
- Attributes (see also those inherited from DistributionModel):
- alpha_omega (float > 0)
Gamma distribution shape parameter.
- beta_omega (float > 0)
Gamma distribution inverse-scale parameter
- lambda_param (str)
Parameter which the gamma distirubtion is implemented over.
-
M
(k, t, tips=None, gaps=None)¶ See docs for DistributionModel abstract base class.
-
property
PARAMLIMITS
¶ See docs for Model abstract base class.
-
PARAMTYPES
= {'alpha_lambda': <class 'float'>, 'beta_lambda': <class 'float'>}¶
-
__init__
(model, lambda_param, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Initialize a GammaDistributedModel.
- Args:
- model (Model)
A substitution model with distributedparam as a free parameter that does not have any other property drawn from a distribution. Any other parameter that is in freeparams of this model will also be an optimized parameter.
- ncats, lambda_param, alpha_omega, beta_omega
Meaning described in main class doc string.
- freeparams
The free parameters will be these plus anything in model.freeparams other than distributedparam.
-
property
basemodel
¶ See docs for DistributionModel abstract base class.
-
property
branchScale
¶ See docs for Model abstract base class.
-
property
catweights
¶ See docs for DistributionModel abstract base class.
-
dM
(k, t, param, Mkt, tips=None, gaps=None)¶ See docs for DistributionModel abstract base class.
-
property
d_distributionparams
¶ See docs for DistributionModel abstract base class.
-
property
distributedparam
¶ Returns name of the distributed parameter.
-
property
distributionparams
¶ Returns list of params defining distribution of distributedparam.
This list is [‘alpha_lambda’, ‘beta_lambda’].
-
dlogprior
(param)¶ Equal to value of basemodel.dlogprior.
-
dstationarystate
(k, param)¶ See docs for Model abstract base class.
-
property
freeparams
¶ See docs for Model abstract base class.
-
property
logprior
¶ Equal to value of basemodel.logprior.
-
property
mu
¶ See docs for Model abstract base class.
-
property
ncats
¶ See docs for DistributionModel abstract base class.
-
property
nsites
¶ See docs for Model abstract base class.
-
property
paramsReport
¶ See docs for Model abstract base class.
-
stationarystate
(k)¶ See docs for Model abstract base class.
-
updateParams
(newvalues, update_all=False)¶ See docs for Model abstract base class.
-
class
phydmslib.models.
GammaDistributedOmegaModel
(model, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Bases:
phydmslib.models.GammaDistributedModel
Implements gamma distribution over omega for a model.
This model can be used to implement a gamma distribution over a Models omega parameter. For instance, if this is done for the YNGKP_M0 model, it yields a M5 variant of the YNGKP model.
See __init__ method for how to initialize a GammaDistributedOmegaModel.
Attributes should only be updated via the updateParams method; do not set attributes directly.
Attributes are inherited from GammaDistributedModel.
-
__init__
(model, ncats, alpha_lambda=1.0, beta_lambda=2.0, freeparams='alpha_lambda', 'beta_lambda')¶ Initialize an GammaDistributedModel object.
The lambda_param is set to “omega”.
- Args:
- model ncats,`alpha_lambda`, beta_lambda, freeparams
Meaning described in main class doc string for GammaDistributedModel.
-
-
class
phydmslib.models.
Model
¶ Bases:
object
Substitution model abstract base class.
Specifies required methods / attributes of substitution models.
-
abstract
M
(t, tips=None, gaps=None)¶ Matrix exponential M(mu * t) = exp(mu * t * P).
If we are considering branch to tip node, you can use tips and gaps to get only the column needed for likelihood calculation. This is computationally cheaper.
- Args:
- t (float > 0)
The branch length.
- tips (None or numpy.ndarray of int, length nsites)
If None, return the full matrix exponential. Otherwise, should be an array with element r giving the codon at tip node r (just put 0 if codon is a gap and see gaps below).
- gaps (None or numpy.ndarray of int)
Only meaningful if using tips. In this case, array should list r for all sites where tip node codon is a gap. None is equivalent to no gaps (also can be empty array).
- Returns:
- M (numpy.ndarray of floats)
- If not using tips, shape is (nsites, N_CODON, N_CODON)
with M(t)[r][x][y] probability r changes from x to y in time t.
- If using tips, shape is (nsites, N_CODON) with
M(t)[r][x] probability r changes from `x to tips[r] in time t. If r is in gap, then M(t)[r][x] is one.
-
abstract property
PARAMLIMITS
¶ Dict of tuples giving lower and upper allowed param values.
For each param in freeparams, PARAMLIMITS[param] is a 2-tuple of floats giving upper and lower allowed values of param (or of each element in param if param is an array).
-
abstract property
branchScale
¶ Factor to scale branch lengths to substitutions per site.
When we use a branch length of t = 1, how many substitutions per site do we expect averaged over all sites? This method returns that number. So to scale branch lengths to units of substitution per site, multiply by this number.
-
abstract
dM
(t, param, Mt, tips=None, gaps=None)¶ Derivative of M(t) with respect to param.
- Args:
- t (float > 0)
The branch length.
- param (string in freeparams or the string ‘t’)
Differentiate with respect to this model parameter, or with respect to branch length if ‘t’.
- Mt (numpy.ndarray.)
The current value of M(t, tips, gaps). Typically this has already been pre-computed which is why it is passed here to avoid computing again. Otherwise pass None and it will be re-computed.
- tips and gaps
Same meaning as for M.
- Returns:
- dM_param (numpy.ndarray floats)
If param is a float, then dM_param[r][x][y] is derivative of M(t)[r][x][y] with respect to param. If param is an array, then dM_param[i][r][x][y] is derivative of M(t)[r][x][y] with respect to param[i]. If using tips, then dM_param[r] or dM_params[i][r] is only a single array providing the derivative of M(t, tips, gaps)[r] with respect to param or param[i].
-
abstract
dlogprior
(param)¶ Derivative of logprior with respect to param.
- Args:
param (string in freeparams)
- Returns:
Derivative of logprior with respect to param, or 0 if there is no prior defined over this model.
-
abstract
dstationarystate
(param)¶ Derivative of stationarystate with respect to param.
- Args:
param (string in freeparams)
- Returns:
- dstationarystate (numpy.ndarray of floats or zero)
If param is a float, then dstationarystate[r][x] is derivative of stationarystate[r][x] with respect to param. If param is an array, then dstationarystate[i][r][x] is derivative of stationarystate[r][x] with respect to param[i].
-
abstract property
freeparams
¶ Model parameters to be optimized (a list of strings).
-
abstract property
logprior
¶ Log prior over current model.
Is a float giving log prior over current model, or 0 if there is no prior defined over this model.
-
abstract property
mu
¶ Mutation rate, scales branch lengths in M and dM.
-
abstract property
nsites
¶ An int giving the number of codon sites.
-
abstract property
paramsReport
¶ Reports current values of independent model parameters.
Returns a dictionary keyed by parameter name with value being the parameter value. Does not include mu as this is confounded with branch lengths.
-
abstract property
stationarystate
¶ Stationary state of substitution model.
A numpy.ndarray of floats, shape (nsites, N_CODON). Element stationarystate[r][x] is the stationary state probability of codon x at site r.
-
abstract
updateParams
(newvalues, update_all=False)¶ Update model params.
This method is the only acceptable way to update model parameters; other dependent model parameters are automatically updated as needed if you use this method.
- Args:
- newvalues (dict)
Can be keyed by any parameter in freeparams.
- update_all (bool)
If True, update all dependent attributes using current values of model parameters.
-
abstract
-
class
phydmslib.models.
YNGKP_M0
(e_pw, nsites, kappa=2.0, omega=0.5, mu=1.0, freeparams='kappa', 'omega', 'mu')¶ Bases:
phydmslib.models.Model
YNGKP_M0 model from Yang et al, 2000.
See __init__ method for how to initialize an YNGKP_M0.
Attributes should only be updated via the updateParams method, do not set attributes directly.
- Attributes (see also those inherited from Model):
- kappa (float > 0)
Transition-transversion ratio.
- omega (float > 0)
Nonsynonymous to synonymous substitution ratio.
- Pxy (numpy.ndarray of floats, shape (1, CODON, N_CODON)
Pxy[0][x][y] is substitution rate from codon x to y. Diagonal elements make rows sum to zero.
- Pxy_no_omega
Like Pxy but not multiplied by omega for nonsynonymous mutations.
- dPxy (dict)
Keyed by each string in freeparams, each value is numpy.ndarray of floats giving derivative of Pxy with respect to that parameter. The shape of each array (1, N_CODON, N_CODON).
- e_pw (scipy.ndarray, shape (3, N_NT))
The empirical nucleotide frequencies for each position in a codon measured from the alignment. e_pw[p][w] give the frequency of nucleotide w at codon position p.
- Phi_x (numpy.ndarray, shape (N_CODON,))
The codon frequencies calculated from the phi frequencies. Phi_x[x] = phi[0][x0] * phi[1][x1] * phi[2][x2]
- phi (numpy.ndarray, shape (3, N_NT))
The model nucleotide frequencies for each position in a codon computed using the CF3X4 estimator. phi[p][w] is the model frequency of nucleotide w at codon position p.
Unlike the ExpCM, YNGKP_M0 does not have site-specific calculations. In order to maintain consistency, most attributes, such as Pxy and dPxy do have a single “site” dimension which is carried through the calculations. When M and dM are returned, the single site matrix is repeated so the final dimensions of M and dM are match those in the docs for the Models abstract base class.
The model nucleotide frequences, phi are computed using the empirical nucleotide frequencies e_pw and the Corrected F3X4 estimator (Pond et al, 2010). Unlike the traditional MG3X4 or GY3X4, this estimator takes into account stop codon nucleotide frequencies. The codon frequencies, Phi_x are computed from the phi values.
-
ALLOWEDPARAMS
= ['kappa', 'omega', 'mu']¶
-
M
(t, tips=None, gaps=None)¶ See docs for method in Model abstract base class.
-
property
PARAMLIMITS
¶ See docs for Model abstract base class.
-
PARAMTYPES
= {'e_pw': (<class 'numpy.ndarray'>, (3, 4)), 'kappa': <class 'float'>, 'mu': <class 'float'>, 'omega': <class 'float'>}¶
-
__init__
(e_pw, nsites, kappa=2.0, omega=0.5, mu=1.0, freeparams='kappa', 'omega', 'mu')¶ Initialize an YNGKP_M0 object.
- Args:
- kappa, omega, mu,
Model params described in main class doc string.
- freeparams (list of strings)
Specifies free parameters.
- e_pw, nsites
Meaning described in the main class doc string.
-
property
branchScale
¶ See docs for Model abstract base class.
-
dM
(t, param, Mt, tips=None, gaps=None)¶ See docs for method in Model abstract base class.
-
dlogprior
(param)¶ Zero for all param, as no prior defined over this model.
-
dstationarystate
(param)¶ See docs for Model abstract base class.
-
property
freeparams
¶ See docs for Model abstract base class.
-
property
logprior
¶ Is zero, as no prior is defined over this model.
-
property
mu
¶ See docs for Model abstract base class.
-
property
nsites
¶ See docs for Model abstract base class.
-
property
paramsReport
¶ See docs for Model abstract base class.
-
property
stationarystate
¶ See docs for Model abstract base class.
-
updateParams
(newvalues, update_all=False)¶ See docs for Model abstract base class.