dms2_logoplot

Overview

The dms2_logoplot program draws logo plots of per-site, per-amino-acid data; the input option you choose determines which quantity is plotted. The logos themselves are drawn with a slightly modified version of weblogo.

If you run it using --prefs to specify the input file, then the plot will visualize amino-acid preferences.

If you run it using --muteffects to specify the input file, then the plot will visualize the base-2 logarithm of each mutational effect. You can calculate mutational effects from amino-acid preferences using the function dms_tools2.prefs.prefsToMutFromWtEffects(). The mutational effect calculated by this function is the ratio of the preference for the mutant amino acid to the preference for the wildtype amino acid.
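The quantity plotted for --muteffects can be illustrated with a few lines of plain Python. This is a minimal sketch of the arithmetic only, not the dms_tools2 implementation; the preference values and the helper name log2_mut_effects are hypothetical.

```python
import math

# Hypothetical preferences at a single site (they sum to 1).
prefs = {"A": 0.5, "G": 0.25, "K": 0.125, "S": 0.125}
wildtype = "A"


def log2_mut_effects(prefs, wildtype):
    """Return log2(pref[mutant] / pref[wildtype]) for each non-wildtype character."""
    wt_pref = prefs[wildtype]
    return {aa: math.log2(p / wt_pref) for aa, p in prefs.items() if aa != wildtype}


effects = log2_mut_effects(prefs, wildtype)
print(effects)
```

A mutant with a lower preference than the wildtype gets a negative value, and one with a higher preference gets a positive value.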

If you run it using --diffsel to specify the input file, then the plot will visualize differential selection.

If you run it using --fracsurvive to specify the input file, then the plot will visualize the fraction surviving for each mutation.

If you run it using --diffprefs to specify the input file, then the plot will visualize differences in amino-acid preferences, including both positive and negative values.

Command-line usage

Create logo plot visualization. Part of dms_tools2 (version 2.6.6) written by the Bloom Lab.

usage: dms2_logoplot [-h] [--outdir OUTDIR] [--ncpus NCPUS]
                     [--use_existing {yes,no}] [-v]
                     (--prefs PREFS | --diffsel DIFFSEL | --fracsurvive FRACSURVIVE | --diffprefs DIFFPREFS | --muteffects MUTEFFECTS)
                     --name NAME [--nperline NPERLINE]
                     [--numberevery NUMBEREVERY] [--excludestop {yes,no}]
                     [--stringency STRINGENCY]
                     [--restrictdiffsel {all,positive,negative}]
                     [--diffselrange MINDIFFSEL MAXDIFFSEL]
                     [--muteffectrange MINMUTEFFECT MAXMUTEFFECT]
                     [--fracsurvivemax FRACSURVIVEMAX] [--sortsites {yes,no}]
                     [--mapmetric {kd,mw,charge,functionalgroup,singlecolor}]
                     [--colormap COLORMAP]
                     [--overlay1 FILE SHORTNAME LONGNAME]
                     [--overlay2 FILE SHORTNAME LONGNAME]
                     [--overlay3 FILE SHORTNAME LONGNAME]
                     [--underlay {yes,no}] [--scalebar BARHEIGHT LABEL]
                     [--overlaycolormap OVERLAYCOLORMAP]
                     [--letterheight LETTERHEIGHT]
                     [--ignore_extracols {yes,no}] [--sepline {yes,no}]
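As an illustration of the usage line above, the following sketch assembles a dms2_logoplot command from Python without running it. Only the flags come from the usage text; the file name prefs.csv, the name HA-prefs, the directory results, and the helper build_logoplot_command are hypothetical.

```python
import shlex


def build_logoplot_command(prefs_csv, name, outdir, nperline=70):
    """Build the argument list for a --prefs run (the command is not executed here)."""
    return [
        "dms2_logoplot",
        "--prefs", prefs_csv,
        "--name", name,
        "--outdir", outdir,
        "--nperline", str(nperline),
    ]


cmd = build_logoplot_command("prefs.csv", "HA-prefs", "results")
command_line = shlex.join(cmd)  # requires Python 3.8 or later
print(command_line)
```

The resulting list can be passed to subprocess.run(cmd, check=True) once dms_tools2 is installed and the input file exists.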

Named Arguments

--outdir

Output files to this directory (create if needed).

--ncpus

Number of CPUs to use, -1 is all available.

Default: -1

--use_existing

Possible choices: yes, no

If files with names of expected output already exist, do not re-run.

Default: “no”

-v, --version

Show program’s version number and exit.

--prefs

CSV file of amino-acid preferences.

--diffsel

CSV file of amino-acid differential selection.

--fracsurvive

CSV file of amino-acid fraction surviving.

--diffprefs

CSV file of differences in amino-acid preferences.

--muteffects

CSV file of amino-acid mutational effects.

--name

Name used for output files.

This name should contain only letters, numbers, dashes, and spaces. Underscores are not allowed because the underscore is a LaTeX special character.
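The naming rule can be checked before launching a run. The sketch below is a hypothetical pre-check, not the validation that dms2_logoplot itself performs, and it assumes that "letters" means ASCII letters.

```python
import re

# Letters, digits, dashes, and spaces only; underscores are rejected.
VALID_NAME = re.compile(r"[A-Za-z0-9\- ]+")


def is_valid_name(name):
    """Return True if name uses only the characters allowed for --name."""
    return VALID_NAME.fullmatch(name) is not None


print(is_valid_name("HA-replicate 1"))
print(is_valid_name("HA_replicate_1"))
```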

--nperline

Number of sites per line.

Default: 70

--numberevery

Number sites at this interval.

Default: 10

--excludestop

Possible choices: yes, no

Exclude stop codons as a possible amino acid?

Default: “no”

--stringency

Stringency parameter to re-scale prefs.

Default: 1
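The text above does not say how the stringency parameter re-scales the preferences. The sketch below assumes the common convention in which each preference is raised to the power of the stringency parameter and the results are renormalized to sum to one at each site; treat both that convention and the helper name rescale_prefs as assumptions.

```python
def rescale_prefs(prefs, stringency):
    """Raise each preference to the given power, then renormalize to sum to 1.

    Assumed convention: rescaled = pref ** stringency / sum(pref ** stringency).
    """
    powered = {aa: p ** stringency for aa, p in prefs.items()}
    total = sum(powered.values())
    return {aa: value / total for aa, value in powered.items()}


# Hypothetical preferences at a single site.
site_prefs = {"A": 0.5, "G": 0.25, "K": 0.25}
rescaled = rescale_prefs(site_prefs, 2)
print(rescaled)
```

Under this assumption a stringency of 1 leaves the preferences unchanged, and values above 1 sharpen the distribution toward the most preferred amino acid.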

--restrictdiffsel

Possible choices: all, positive, negative

Plot all diffsel, or only positive or negative.

Default: “all”

--diffselrange

Specify a fixed range for diffsel. Otherwise determined from data range.

--muteffectrange

Specify a fixed range for muteffects. Otherwise determined from data range.

--fracsurvivemax

Specify maximum value for fracsurvive. Otherwise determined from data range.

--sortsites

Possible choices: yes, no

Sort sites from first to last before plotting.

Default: “yes”

--mapmetric

Possible choices: kd, mw, charge, functionalgroup, singlecolor

Color amino acids by Kyte-Doolittle hydrophobicity (kd), molecular weight (mw), charge, or functional group (functionalgroup), or draw them all in a single color (singlecolor).

Default: “functionalgroup”

--colormap

matplotlib color map for amino acids when --mapmetric is ‘kd’ or ‘mw’; name of a single color when it is ‘singlecolor’.

Default: “jet”

--overlay1

Color bar above logo plot to denote a per-residue property. FILE is in CSV format with columns named site and SHORTNAME. SHORTNAME is a property name of at most 5 characters; LONGNAME is a longer name for the legend. Sites not in FILE are colored white. To show wildtype identity, set SHORTNAME and LONGNAME both to wildtype and have this column in FILE give the 1-letter wildtype amino-acid code. To show an omegabysite.txt file from phydms, give that file and set both SHORTNAME and LONGNAME to omegabysite.
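A wildtype overlay file of the kind described above could be generated as follows. This is a sketch only: the three-residue sequence is hypothetical, and numbering sites sequentially from 1 is an assumption that must match the site labels in your data file.

```python
import csv
import io

wildtype_seq = "MKT"  # hypothetical wildtype sequence

# Write a CSV with the two columns named in the option description:
# "site" and a SHORTNAME column, here "wildtype".
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["site", "wildtype"])
for site, aa in enumerate(wildtype_seq, start=1):
    writer.writerow([site, aa])

overlay_csv = buffer.getvalue()
print(overlay_csv)
```

After saving this text to a file, the file would be passed as FILE with both SHORTNAME and LONGNAME set to wildtype.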

--overlay2

Second overlay color bar.

--overlay3

Third overlay color bar.

--underlay

Possible choices: yes, no

Plot underlay rather than overlay bars.

Default: “no”

--scalebar

Plot a scale bar indicating BARHEIGHT with LABEL. Only for diffsel, fracsurvive, and muteffects.

--overlaycolormap

matplotlib color map for overlay bars (e.g., ‘jet’ or ‘YlOrRd’).

Default: “jet”

--letterheight

Relative height of letter stacks in logo plot.

Default: 1

--ignore_extracols

Possible choices: yes, no

Ignore extra columns in data.

Default: “no”

--sepline

Possible choices: yes, no

Separate positive and negative diffsel with black line?

Default: “yes”

Output files

Running dms2_logoplot produces output files in the directory specified by --outdir, with the prefix specified by --name.

There will be a log file with the suffix .log summarizing the program’s progress.

If you run with --prefs, then the logo plot will be in a file with the suffix _prefs.pdf. An example of such a logo plot is in the Doud2016 example.

If you run with --diffsel, then the logo plot will be in a file with the suffix _diffsel.pdf. An example of such a logo plot is in the Doud2017 example.

If you run with --fracsurvive, then the logo plot will be in a file with the suffix _fracsurvive.pdf.

If you run with --muteffects, then the logo plot will be in a file with the suffix _muteffects.pdf.

If you run with --diffprefs, then the logo plot will be in a file with the suffix _diffprefs.pdf.
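The naming scheme above can be summarized in a small helper that predicts where the outputs should appear. This is a sketch, assuming the suffix is appended directly to the --name value inside --outdir; the helper expected_outputs and the example values are hypothetical.

```python
import os

# Plot suffix for each input option, as listed above.
PLOT_SUFFIXES = {
    "prefs": "_prefs.pdf",
    "diffsel": "_diffsel.pdf",
    "fracsurvive": "_fracsurvive.pdf",
    "muteffects": "_muteffects.pdf",
    "diffprefs": "_diffprefs.pdf",
}


def expected_outputs(outdir, name, mode):
    """Return the expected log and plot paths for one run (assumed layout)."""
    prefix = os.path.join(outdir, name)
    return {"log": prefix + ".log", "plot": prefix + PLOT_SUFFIXES[mode]}


outputs = expected_outputs("results", "HA-prefs", "prefs")
print(outputs)
```

Checking for these paths after a run is one way to confirm that the program finished and wrote its plot.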