The dms2_fracsurvive program estimates the fraction surviving for each mutation (see the Fraction surviving section). It processes files giving the observed counts of characters in a selected and a mock-selected condition, along with a measurement of the overall fraction of the library that survives the selection.

If you have multiple related replicates or samples (or even if you have just one), you should probably use the dms2_batch_fracsurvive program rather than running dms2_fracsurvive directly. dms2_batch_fracsurvive runs dms2_fracsurvive for you and then also makes summary plots.

Command-line usage

Estimate fraction surviving for each mutation. Part of dms_tools2 (version 2.6.6) written by the Bloom Lab.

usage: dms2_fracsurvive [-h] [--outdir OUTDIR] [--ncpus NCPUS]
                        [--use_existing {yes,no}] [-v] [--indir INDIR]
                        [--chartype {codon_to_aa}] [--aboveavg {yes,no}]
                        [--excludestop {yes,no}] [--pseudocount PSEUDOCOUNT]
                        [--mincount MINCOUNT] --name NAME --sel SEL --mock
                        MOCK --libfracsurvive LIBFRACSURVIVE [--err ERR]

Named Arguments


--outdir

Output files to this directory (create if needed).


--ncpus

Number of CPUs to use, -1 is all available.

Default: -1


--use_existing

Possible choices: yes, no

If files with names of expected output already exist, do not re-run.

Default: “no”

-v, --version

show program’s version number and exit


--indir

Input counts files in this directory.

This option can be useful if the counts files are found in a common directory. Instead of repeatedly listing that directory name, you can just provide it here.


--chartype

Possible choices: codon_to_aa

Characters for which fraction surviving selection is estimated. codon_to_aa = amino acids from codon counts.

Default: “codon_to_aa”


--aboveavg

Possible choices: yes, no

Report fracsurvive above the library average rather than direct fracsurvive values.

Default: “no”


--excludestop

Possible choices: yes, no

Exclude stop codons as a possible amino acid?

Default: “yes”


--pseudocount

Pseudocount added to each count for the sample with the smaller depth; the pseudocount for the other sample is scaled by the relative depth.

Default: 5
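The depth scaling described above can be sketched as follows. This is a minimal illustration and not the code used by dms2_fracsurvive: the function name `scaled_pseudocounts` is hypothetical, and it assumes that "depth" means the total number of counts for a sample at a site.

```python
def scaled_pseudocounts(depth_sel, depth_mock, pseudocount=5.0):
    """Return (pseudocount_sel, pseudocount_mock).

    Assumption: depth is the total count for each sample at a site.
    The shallower sample gets `pseudocount`; the deeper sample gets
    `pseudocount` multiplied by the ratio of the two depths.
    """
    if depth_sel <= 0 or depth_mock <= 0:
        raise ValueError("depths must be positive")
    if depth_sel <= depth_mock:
        return pseudocount, pseudocount * depth_mock / depth_sel
    return pseudocount * depth_sel / depth_mock, pseudocount


# Selected sample is 4-fold shallower than the mock-selected sample.
p_sel, p_mock = scaled_pseudocounts(depth_sel=1000, depth_mock=4000)
print(p_sel, p_mock)
```

Scaling in this way keeps the pseudocount a similar fraction of the total counts in both samples, so it does not bias the ratio toward the deeper one.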


--mincount

Report as NaN the fracsurvive of mutations for which both the selected and mock-selected samples have fewer than this many counts.

Default: 0
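As a rough sketch of this rule, the helper below masks a value only when both samples fall below the threshold. The function name `mask_low_counts` and its arguments are hypothetical and are not part of dms_tools2.

```python
import math


def mask_low_counts(fracsurvive, n_sel, n_mock, mincount=0):
    """Return NaN if both counts are below `mincount`, else the value.

    A mutation is masked only when BOTH the selected and the
    mock-selected counts are too low; one low count is not enough.
    """
    if n_sel < mincount and n_mock < mincount:
        return math.nan
    return fracsurvive


masked = mask_low_counts(0.2, n_sel=3, n_mock=4, mincount=10)
kept = mask_low_counts(0.2, n_sel=3, n_mock=40, mincount=10)
```

With the default of 0, no count can be below the threshold, so nothing is masked.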


--name

Name used for output files.

The output files will have a prefix equal to the name specified here. This name should only contain letters, numbers, dashes, and spaces. Underscores are not allowed as they are a LaTeX special character.
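If you generate names programmatically, a quick check like the one below can catch a bad name before the program is run. This is only a sketch of the rule stated above; the helper `is_valid_name` is hypothetical, and it assumes "letters" means ASCII letters.

```python
import re

# Letters, numbers, dashes, and spaces only (ASCII letters assumed).
_VALID_NAME = re.compile(r"[A-Za-z0-9 -]+")


def is_valid_name(name):
    """Return True if `name` uses only the allowed characters."""
    return _VALID_NAME.fullmatch(name) is not None


print(is_valid_name("replicate-1"))
print(is_valid_name("replicate_1"))
```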


--sel

Post-selection counts file or prefix used when creating this file.

The counts files have the format of the files created by programs such as dms2_bcsubamp. Specifically, they must have the following columns: ‘site’, ‘wildtype’, and then a column for each possible character (e.g., codon).


--mock

Like --sel, but for mock-selection counts.


--libfracsurvive

Overall fraction of total library surviving selection versus mock condition. Should be between 0 and 1.


--err

Like --sel, but for the error control used to correct mutation counts.
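To show how the required and optional arguments fit together, here is a sketch that assembles a command line in Python. Only the flags come from the usage message above; the directory names, sample names, and the value 0.05 are hypothetical placeholders.

```python
import shlex

# Hypothetical inputs: counts files for the samples "selected" and
# "mock" are assumed to be in the directory "counts".
cmd = [
    "dms2_fracsurvive",
    "--name", "replicate-1",
    "--indir", "counts",
    "--sel", "selected",
    "--mock", "mock",
    "--libfracsurvive", "0.05",
    "--outdir", "results",
    "--mincount", "10",
]

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it (requires dms_tools2 to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems, which matters because --name is allowed to contain spaces.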

Output files

The output files all have the prefix specified by --outdir and --name. For instance, if you use --outdir results --name replicate-1, then the output files will have the prefix ./results/replicate-1 and the suffixes described below.
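For scripts that need to locate these outputs, the naming rule can be sketched as below. The helper `output_paths` is hypothetical; the suffixes are the ones listed in this section.

```python
import os


def output_paths(outdir, name):
    """Map each output type to its expected path under `outdir`."""
    prefix = os.path.join(outdir, name)
    return {
        "log": prefix + ".log",
        "mutfracsurvive": prefix + "_mutfracsurvive.csv",
        "sitefracsurvive": prefix + "_sitefracsurvive.csv",
    }


paths = output_paths("results", "replicate-1")
for kind, path in paths.items():
    print(kind, path)
```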

Here are the specific output files:

Log file

This file has the suffix .log. It is a text file that logs the progress of the program.

Mutation fraction surviving file

This file has the suffix _mutfracsurvive.csv. It gives the fraction surviving for each mutation at each site, which is the \(F_{r,x}\) value defined in Equation (30) of the Fraction surviving section. Note that the quantity is calculated for the wildtype as well as the mutant characters at each site. Note also that if you are using --aboveavg yes, then these are the fraction surviving above the library average, denoted as \(F_{r,x}^{\rm{aboveavg}}\) in Equation (31) of the Fraction surviving section. If --mincount is greater than zero, the fraction surviving may be undefined for some mutations due to low counts; any such undefined values are shown as NaN.

Here are the first and last few lines of a _mutfracsurvive.csv file:


Note that the file is sorted from largest to smallest fraction surviving.

Site fraction surviving file

This file has the suffix _sitefracsurvive.csv. It gives several measures that summarize the fraction surviving for each site. All values in the _sitefracsurvive.csv file can be calculated from the values in the _mutfracsurvive.csv file, but the program outputs both files to make things simpler for the user.

Specifically, it gives the following quantities:

  • avgfracsurvive is the average of the mutation fraction surviving values. If any of the mutation fraction surviving values are NaN (which can happen if you use --mincount), they are not included in this average.

  • maxfracsurvive is the maximum mutation fraction surviving taken over all non-wildtype characters for each site.
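The two site-level quantities above can be recomputed from mutation-level values with a sketch like the following. It is not the dms_tools2 implementation: the function name, the tuple layout, and the example numbers are hypothetical, and it assumes that wildtype characters are excluded from the average as well as from the maximum.

```python
import math
from collections import defaultdict


def site_summaries(rows):
    """Summarize (site, wildtype, mutation, value) tuples per site.

    Assumption: wildtype rows are skipped for both statistics.
    NaN values are ignored; a site with no defined values gets NaN.
    """
    values_by_site = defaultdict(list)
    for site, wildtype, mutation, value in rows:
        if mutation == wildtype:
            continue
        values_by_site[site].append(value)

    summaries = {}
    for site, values in values_by_site.items():
        defined = [v for v in values if not math.isnan(v)]
        if defined:
            avg = sum(defined) / len(defined)
            top = max(defined)
        else:
            avg = top = math.nan
        summaries[site] = {"avgfracsurvive": avg, "maxfracsurvive": top}
    return summaries


# Made-up values for illustration only.
rows = [
    (1, "M", "M", 0.05),
    (1, "M", "A", 0.25),
    (1, "M", "C", 0.75),
    (1, "M", "D", math.nan),
    (2, "K", "A", math.nan),
    (2, "K", "R", math.nan),
]
summary = site_summaries(rows)
print(summary[1])
```

In this example, site 1 is summarized from its two defined mutant values, while site 2 has only NaN values and is therefore reported as NaN.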

Here are the first and last lines of a _sitefracsurvive.csv file:


If all mutations at a site have a mutation fraction surviving of NaN (which can be the case if --mincount is > 0), then the site values are reported as NaN.