curvefits

Defines CurveFits to fit curves and display / plot results.

class neutcurve.curvefits.CurveFits(data, *, conc_col='concentration', fracinf_col='fraction infectivity', serum_col='serum', virus_col='virus', replicate_col='replicate', infectivity_or_neutralized='infectivity', fix_slope_first=True, init_slope=1.5, fixbottom=0, fixtop=1, fixslope=False, allow_reps_unequal_conc=False)

Bases: object

Fit and display neutcurve.hillcurve.HillCurve curves.

Args:
data (pandas DataFrame)

Tidy dataframe with data.

conc_col (str)

Column in data with concentrations of serum.

fracinf_col (str)

Column in data with fraction infectivity.

serum_col (str)

Column in data with serum name.

virus_col (str)

Column in data with name of virus being neutralized.

replicate_col (str)

Column in data with name of replicate of this measurement. Replicates cannot be named ‘average’ because the average is computed from the replicates.

fixbottom (False or float or 2-tuple)

Same meaning as for neutcurve.hillcurve.HillCurve.

fixtop (False or float or 2-tuple)

Same meaning as for neutcurve.hillcurve.HillCurve.

fixslope (False or float or 2-tuple)

Same meaning as for neutcurve.hillcurve.HillCurve.

infectivity_or_neutralized ({‘infectivity’, ‘neutralized’})

Same meaning as for neutcurve.hillcurve.HillCurve.

fix_slope_first (bool)

Same meaning as for neutcurve.hillcurve.HillCurve.

init_slope (float)

Same meaning as for neutcurve.hillcurve.HillCurve.

allow_reps_unequal_conc (bool)

Allow replicates for the same serum/virus to have unequal concentrations; otherwise all replicates for a serum/virus must have measurements at same concentrations.
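
As a rough usage sketch (not taken from the library's own examples), the tidy input could be assembled as below. The serum and virus names and all measurement values are made up for illustration, and the helper fit_curves is hypothetical; only the column names and the constructor arguments come from the signature above.

```python
# Made-up measurements: one serum, one virus, two replicates, six concentrations.
concentrations = [0.001, 0.004, 0.016, 0.064, 0.256, 1.024]
fraction_infectivity = {
    "1": [0.98, 0.93, 0.71, 0.34, 0.09, 0.02],
    "2": [1.01, 0.90, 0.66, 0.30, 0.11, 0.03],
}

# One row per measurement (tidy format), using the default column names.
rows = [
    {
        "serum": "serum-A",
        "virus": "WT",
        "replicate": replicate,
        "concentration": conc,
        "fraction infectivity": fracinf,
    }
    for replicate, values in fraction_infectivity.items()
    for conc, fracinf in zip(concentrations, values)
]


def fit_curves(rows):
    """Hypothetical helper: build a DataFrame and fit it with CurveFits."""
    # Imported lazily so the data-building part runs without these packages.
    import pandas as pd
    from neutcurve.curvefits import CurveFits

    data = pd.DataFrame(rows)
    # The keyword values shown here are simply the documented defaults.
    return CurveFits(data, fixbottom=0, fixtop=1, fixslope=False)
```

Where pandas and neutcurve are installed, calling fit_curves(rows) should return a CurveFits object; note that no replicate is named ‘average’.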

Attributes of a CurveFits include all args except data plus:
df (pandas DataFrame)

Copy of data that has only the relevant columns, additional rows with replicate_col of ‘average’ that hold replicate averages, and an added column ‘stderr’ (standard error of fraction infectivity for ‘average’ if there are multiple replicates, otherwise nan).

sera (list)

List of all serum names in serum_col of data, in order they occur in data.

viruses (dict)

For each serum in sera, viruses[serum] gives all viruses for that serum in the order they occur in data.

replicates (dict)

replicates[(serum, virus)] is list of all replicates for that serum and virus in the order they occur in data.

allviruses (list)

List of all viruses.
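
The attributes above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the rows are made up, and the standard error is assumed to use the sample standard deviation (n - 1 denominator), which the text above does not specify.

```python
import math
import statistics

rows = [
    # (serum, virus, replicate, concentration, fraction infectivity)
    ("serum-A", "WT", "1", 0.01, 0.90),
    ("serum-A", "WT", "2", 0.01, 0.80),
    ("serum-A", "WT", "1", 0.10, 0.30),
    ("serum-A", "WT", "2", 0.10, 0.20),
    ("serum-A", "mutant", "1", 0.01, 0.95),
    ("serum-B", "WT", "1", 0.01, 0.60),
]

sera = []        # serum names in order of first occurrence
viruses = {}     # serum -> viruses in order of first occurrence
replicates = {}  # (serum, virus) -> replicates in order of first occurrence
grouped = {}     # (serum, virus, concentration) -> measurements

for serum, virus, replicate, conc, fracinf in rows:
    if serum not in sera:
        sera.append(serum)
    if virus not in viruses.setdefault(serum, []):
        viruses[serum].append(virus)
    if replicate not in replicates.setdefault((serum, virus), []):
        replicates[(serum, virus)].append(replicate)
    grouped.setdefault((serum, virus, conc), []).append(fracinf)

# Rows analogous to the 'average' rows of df, with a trailing stderr value.
average_rows = []
for (serum, virus, conc), values in grouped.items():
    mean = statistics.mean(values)
    if len(values) > 1:
        # Assumption: standard error from the sample standard deviation.
        stderr = statistics.stdev(values) / math.sqrt(len(values))
    else:
        stderr = math.nan
    average_rows.append((serum, virus, "average", conc, mean, stderr))
```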

static combineCurveFits(curvefits_list, *, sera=None, viruses=None, serum_virus_replicates_to_drop=None)
Args:
curvefits_list (list)

List of CurveFits objects that are identical other than the data they contain and have unique virus/serum/replicate combinations. They can differ in fixtop and fixbottom, but then those will be set to None in the returned object.

sera (None or list)

Only keep fits for sera in this list, or keep all sera if None.

viruses (None or list)

Only keep fits for viruses in this list, or keep all viruses if None.

serum_virus_replicates_to_drop (None or list)

If a list, should specify (serum, virus, replicate) tuples; those particular fits are dropped.

Returns:
combined_fits (CurveFits)

A CurveFits object that combines all the virus/serum/replicate combinations in curvefits_list.
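
The selection rules described above (unique combinations, optional serum and virus filters, explicit drops) might be sketched as follows. The function select_combinations is a hypothetical stand-in that works on plain (serum, virus, replicate) tuples; it is not how the library combines CurveFits objects internally.

```python
def select_combinations(key_lists, sera=None, viruses=None, to_drop=None):
    """Merge (serum, virus, replicate) keys, mimicking the documented filters."""
    to_drop = set(to_drop or [])
    seen = set()
    kept = []
    for keys in key_lists:
        for key in keys:
            # Combinations must be unique across all inputs.
            if key in seen:
                raise ValueError(f"duplicate serum/virus/replicate: {key}")
            seen.add(key)
            serum, virus, _replicate = key
            if sera is not None and serum not in sera:
                continue
            if viruses is not None and virus not in viruses:
                continue
            if key in to_drop:
                continue
            kept.append(key)
    return kept


first = [("serum-A", "WT", "1"), ("serum-A", "WT", "2")]
second = [("serum-B", "WT", "1"), ("serum-B", "mutant", "1")]
kept = select_combinations(
    [first, second],
    viruses=["WT"],
    to_drop=[("serum-A", "WT", "2")],
)
```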

fitParams(*, average_only=True, no_average=False, ics=(50,), ics_precision=0, ic50_error=None)

Get data frame with curve fitting parameters.

Args:
average_only (bool)

If True, only get parameters for average across replicates.

no_average (bool)

Do not include average across replicates. Mutually incompatible with average_only.

ics (iterable)

Include ICXX for each number in this list, where the number is the percent neutralized. So if ics only contains 50, we include the IC50. If it includes 95, we include the IC95.

ics_precision (int)

Include this many digits after decimal when creating the ICXX columns.

ic50_error ({None, ‘fit_stdev’})

Include estimated error on IC50 as standard deviation of fit parameter; note that we recommend instead just taking standard error of replicate IC50s.

Returns:

A pandas DataFrame with fit parameters for each serum / virus / replicate as defined for a neutcurve.hillcurve.HillCurve. Columns:

  • ‘serum’

  • ‘virus’

  • ‘replicate’

  • ‘nreplicates’: number of replicates for average, NaN otherwise.

  • ‘icXX’: ICXX or its bound as a number, where XX is each number in ics.

  • ‘icXX_bound’: string indicating if ICXX interpolated from data, or is an upper or lower bound.

  • ‘icXX_str’: ICXX represented as string, with > or < indicating if it is an upper or lower bound.

  • ‘midpoint’: midpoint of curve, same as IC50 only if bottom and top are 0 and 1.

  • ‘midpoint_bound’: midpoint bounded by range of fit concentrations

  • ‘midpoint_bound_type’: string indicating if midpoint is interpolated from data or is an upper or lower bound.

  • ‘slope’: Hill slope of curve.

  • ‘top’: top of curve.

  • ‘bottom’: bottom of curve.

  • ‘r2’: coefficient of determination of fit

  • ‘rmsd’: root-mean square deviation of fits
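
To make the relationship between ‘midpoint’ and the ICXX values concrete, here is a minimal sketch. It assumes the four-parameter Hill form f(c) = bottom + (top - bottom) / (1 + (c / midpoint)^slope) for fraction infectivity; the helper names hill and ic are illustrative and are not part of the library.

```python
import math


def hill(c, midpoint, slope, top=1.0, bottom=0.0):
    """Fraction infectivity at concentration c under the assumed Hill form."""
    return bottom + (top - bottom) / (1 + (c / midpoint) ** slope)


def ic(percent, midpoint, slope, top=1.0, bottom=0.0):
    """Concentration at which `percent` of infectivity is neutralized."""
    target = 1 - percent / 100  # fraction infectivity remaining
    if not bottom < target < top:
        return math.nan  # the curve never reaches this value
    return midpoint * ((top - bottom) / (target - bottom) - 1) ** (1 / slope)


# With top=1 and bottom=0 the IC50 coincides with the midpoint.
ic50_default = ic(50, midpoint=0.02, slope=1.5)

# With top=0.9 the IC50 falls below the midpoint.
ic50_low_top = ic(50, midpoint=0.02, slope=1.5, top=0.9)

# The IC95 is the concentration where fraction infectivity is 0.05.
ic95 = ic(95, midpoint=0.02, slope=1.5)
```

In this sketch a target outside the open interval (bottom, top) yields nan, whereas the table above reports such cases as upper or lower bounds.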

getCurve(*, serum, virus, replicate)

Get the fitted curve for this sample.

Args:
serum (str)

Name of a valid serum.

virus (str)

Name of a valid virus for serum.

replicate (str)

Name of a valid replicate for serum and virus, or ‘average’ for the average of all replicates.

Returns:

A neutcurve.hillcurve.HillCurve.

plotAverages(*, color='black', marker='o', **kwargs)

Plot grid with a curve for each serum / virus pair.

Args:
color (str)

Color of the curves.

marker (str)

Marker for the curves.

**kwargs

Other keyword arguments that can be passed to CurveFits.plotReplicates().

Returns:

The 2-tuple (fig, axes) of matplotlib figure and 2D axes array.

plotGrid(plots, *, xlabel=None, ylabel=None, widthscale=1, heightscale=1, attempt_shared_legend=True, fix_lims=None, bound_ymin=0, bound_ymax=1, extend_lim=0.07, markersize=6, linewidth=1, linestyle='-', legendtitle=None, orderlegend=None, titlesize=14, labelsize=15, ticksize=12, legendfontsize=12, align_to_dmslogo_facet=False, despine=False, yticklocs=None, sharex=True, sharey=True, vlines=None, draw_in_bounds=False)

Plot arbitrary grid of curves.

Args:
plots (dict)

Plots to draw on grid. Keyed by 2-tuples (irow, icol), which give row and column (0, 1, … numbering) where plot should be drawn. Values are the 2-tuples (title, curvelist) where title is title for this plot (or None) and curvelist is a list of dicts keyed by:

xlabel, ylabel (None, str, or list)

Labels for x- and y-axes. If None, use conc_col and fracinf_col, respectively. If str, use this label for all axes. If list, should be same length as plots and gives axis label for each subplot.

widthscale, heightscale (float)

Scale width or height of figure by this much.

attempt_shared_legend (bool)

Share a single legend among plots if they all share in common the same label assigned to the same color / marker.

fix_lims (dict or None)

To fix axis limits, specify any of ‘xmin’, ‘xmax’, ‘ymin’, or ‘ymax’ with specified limit.

bound_ymin, bound_ymax (float or None)

Make y-axis min and max at least this small / large. Ignored if using fix_lims for that axis limit.

extend_lim (float)

For all axis limits not in fix_lims, extend this fraction of range above and below bounds / data limits.

markersize (float)

Size of point marker.

linewidth (float)

Width of line.

linestyle (str)

Line style.

legendtitle (str or None)

Title of legend.

orderlegend (None or list)

If specified, place legend labels in this order.

titlesize (float)

Size of subplot title font.

labelsize (float)

Size of axis label font.

ticksize (float)

Size of axis tick fonts.

legendfontsize (float)

Size of legend fonts.

align_to_dmslogo_facet (False or dict)

Make plot vertically alignable to dmslogo.facet_plot with same number of rows; dict should have keys for height_per_ax, hspace, tmargin, and bmargin with same meaning as dmslogo.facet_plot. Also right and left for passing to subplots_adjust.

despine (bool)

Remove top and right spines from plots.

yticklocs (None or list)

Same meaning as for neutcurve.hillcurve.HillCurve.plot().

sharex (bool)

Share x-axis scale among plots.

sharey (bool)

Share y-axis scale among plots.

vlines (dict or None)

Vertical lines to draw. Keyed by 2-tuples (irow, icol), which give row and column of plot in grid (0, 1, … numbering). Values are lists of dicts with a key ‘x’ giving the x-location of the vertical line, and optionally keys ‘linewidth’, ‘color’, and ‘linestyle’.

draw_in_bounds (bool)

Same meaning as for neutcurve.hillcurve.HillCurve.plot().

Returns:

The 2-tuple (fig, axes) of matplotlib figure and 2D axes array.
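
One plausible reading of how bound_ymin, bound_ymax, extend_lim, and fix_lims interact for the y-axis is sketched below. The function y_limits is hypothetical and is not the library's code; the vlines dictionary simply follows the structure described above, with made-up values.

```python
def y_limits(values, bound_ymin=0, bound_ymax=1, extend_lim=0.07, fix_lims=None):
    """Hypothetical y-axis limits for one set of plotted values."""
    fix_lims = fix_lims or {}
    # Make the limits at least as small / large as the bounds.
    ymin = min(values) if bound_ymin is None else min(min(values), bound_ymin)
    ymax = max(values) if bound_ymax is None else max(max(values), bound_ymax)
    # Extend by a fraction of the range, unless a limit is fixed explicitly.
    pad = extend_lim * (ymax - ymin)
    lower = fix_lims.get("ymin", ymin - pad)
    upper = fix_lims.get("ymax", ymax + pad)
    return lower, upper


# Vertical line at x = 0.05 on the subplot in row 0, column 0.
vlines = {(0, 0): [{"x": 0.05, "color": "gray", "linestyle": "--"}]}

default_limits = y_limits([0.1, 0.8])
fixed_limits = y_limits([0.1, 0.8], fix_lims={"ymin": -0.1})
```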

plotReplicates(*, ncol=4, nrow=None, sera='all', viruses='all', colors=('#999999', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7'), markers=('o', '^', 's', 'D', 'v', '<', '>', 'p', 'x'), subplot_titles='{serum} vs {virus}', show_average=False, average_only=False, attempt_shared_legend=True, **kwargs)

Plot grid with replicates for each serum / virus on same plot.

Args:
ncol, nrow (int or None)

Specify one of these to set number of columns or rows.

sera (‘all’ or list)

Sera to include on plot, in this order.

viruses (‘all’ or list)

Viruses to include on plot, in this order.

colors (iterable)

List of colors for different replicates.

markers (iterable)

List of markers for different replicates.

subplot_titles (str)

Format string to build subplot titles from serum and virus.

show_average (bool)

Include the replicate-average as a “replicate” in plots.

average_only (bool)

Show only the replicate-average on each plot. No legend in this case.

attempt_shared_legend (bool)

Do we attempt to share the same replicate key for all panels or give each its own?

**kwargs

Other keyword arguments that can be passed to CurveFits.plotGrid().

Returns:

The 2-tuple (fig, axes) of matplotlib figure and 2D axes array.
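
The way ncol / nrow and subplot_titles might translate into a grid can be sketched as follows. The row-major placement is an assumption, and grid_positions is a hypothetical helper rather than a library function; the title format string is the documented default.

```python
import math


def grid_positions(n_plots, ncol=4, nrow=None):
    """Hypothetical (irow, icol) positions, filling the grid row by row."""
    if (ncol is None) == (nrow is None):
        raise ValueError("specify exactly one of ncol or nrow")
    if ncol is None:
        # Derive the number of columns from the requested number of rows.
        ncol = math.ceil(n_plots / nrow)
    return [(i // ncol, i % ncol) for i in range(n_plots)]


pairs = [("serum-A", "WT"), ("serum-A", "mutant"), ("serum-B", "WT")]
subplot_titles = "{serum} vs {virus}"
titles = [subplot_titles.format(serum=s, virus=v) for s, v in pairs]
positions = grid_positions(len(pairs), ncol=2)
```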

plotSera(*, ncol=4, nrow=None, sera='all', viruses='all', ignore_serum_virus=None, colors=('#999999', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7'), markers=('o', '^', 's', 'D', 'v', '<', '>', 'p', 'x'), virus_to_color_marker=None, max_viruses_per_subplot=5, multi_serum_subplots=True, all_subplots=('WT', 'wt', 'wildtype', 'Wildtype', 'wild type', 'Wild type'), titles=None, vlines=None, **kwargs)

Plot grid with replicate-average of viruses for each serum.

Args:
ncol, nrow (int or None)

Specify one of these to set number of columns or rows, other should be None.

sera (‘all’ or list)

Sera to include on plot, in this order.

viruses (‘all’ or list)

Viruses to include on plot, in this order unless one is specified in all_subplots.

ignore_serum_virus (None or dict)

Specific serum / virus combinations to ignore (not plot). Key by serum, and then list viruses to ignore.

colors (iterable)

List of colors for different viruses.

markers (iterable)

List of markers for different viruses.

virus_to_color_marker (dict or None)

Optionally specify a specific color and marker for each virus as 2-tuples (color, marker). If you use this option, colors and markers are ignored.

max_viruses_per_subplot (int)

Maximum number of viruses to show on any subplot.

multi_serum_subplots (bool)

If a serum has more than max_viruses_per_subplot viruses, do we make multiple subplots for it or raise an error?

all_subplots (iterable)

If making multiple subplots for serum, which viruses do we show on all subplots? These are also shown first.

titles (None or list)

Specify custom titles for each subplot different than sera.

vlines (None or dict)

Add vertical lines to plots. Keyed by serum name, values are lists of dicts with a key ‘x’ giving x-location of vertical line, and optional keys ‘linewidth’, ‘color’, and ‘linestyle’.

**kwargs

Other keyword arguments that can be passed to CurveFits.plotGrid().

Returns:

The 2-tuple (fig, axes) of matplotlib figure and 2D axes array.
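
The interplay of max_viruses_per_subplot, multi_serum_subplots, and all_subplots can be sketched as below. This is an illustration under one assumption that the text above does not settle: viruses named in all_subplots count toward the per-subplot maximum. The helper split_viruses is hypothetical.

```python
def split_viruses(
    viruses,
    max_viruses_per_subplot=5,
    multi_serum_subplots=True,
    all_subplots=("WT", "wt", "wildtype", "Wildtype", "wild type", "Wild type"),
):
    """Hypothetical split of one serum's viruses into per-subplot lists."""
    shared = [v for v in viruses if v in all_subplots]
    unshared = [v for v in viruses if v not in all_subplots]
    if len(viruses) <= max_viruses_per_subplot:
        return [shared + unshared]  # shared viruses are shown first
    if not multi_serum_subplots:
        raise ValueError("too many viruses for a single subplot")
    # Assumption: shared viruses use up part of each subplot's allowance.
    per_plot = max_viruses_per_subplot - len(shared)
    if per_plot < 1:
        raise ValueError("max_viruses_per_subplot too small for all_subplots")
    return [
        shared + unshared[i : i + per_plot]
        for i in range(0, len(unshared), per_plot)
    ]


subplots = split_viruses(
    ["m1", "m2", "WT", "m3", "m4", "m5", "m6"], max_viruses_per_subplot=3
)
```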

plotViruses(*, ncol=4, nrow=None, sera='all', viruses='all', ignore_virus_serum=None, colors=('#999999', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7'), markers=('o', '^', 's', 'D', 'v', '<', '>', 'p', 'x'), serum_to_color_marker=None, max_sera_per_subplot=5, multi_virus_subplots=True, all_subplots=(), titles=None, vlines=None, **kwargs)

Plot grid with replicate-average of sera for each virus.

Args:
ncol, nrow (int or None)

Specify one of these to set number of columns or rows, other should be None.

sera (‘all’ or list)

Sera to include on plot, in this order, unless one is specified in all_subplots.

viruses (‘all’ or list)

Viruses to include on plot, in this order.

ignore_virus_serum (None or dict)

Specific virus / serum combinations to ignore (not plot). Key by virus, and then list sera to ignore.

colors (iterable)

List of colors for different sera.

markers (iterable)

List of markers for different sera.

serum_to_color_marker (dict or None)

Optionally specify a specific color and marker for each serum as 2-tuples (color, marker). If you use this option, colors and markers are ignored.

max_sera_per_subplot (int)

Maximum number of sera to show on any subplot.

multi_virus_subplots (bool)

If a virus has more than max_sera_per_subplot sera, do we make multiple subplots for it or raise an error?

all_subplots (iterable)

If making multiple subplots for virus, which sera do we show on all subplots? These are also shown first.

titles (None or list)

Specify custom titles for each subplot different than viruses.

vlines (None or dict)

Add vertical lines to plots. Keyed by virus name, values are lists of dicts with a key ‘x’ giving x-location of vertical line, and optional keys ‘linewidth’, ‘color’, and ‘linestyle’.

**kwargs

Other keyword arguments that can be passed to CurveFits.plotGrid().

Returns:

The 2-tuple (fig, axes) of matplotlib figure and 2D axes array.
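
The dictionary-valued arguments of plotViruses can be illustrated with a short sketch. Pairing sera with colors and markers in order is an assumption about how the defaults are consumed, and both helpers (assign_styles and drop_ignored) are hypothetical; the serum and virus names are made up.

```python
colors = ("#999999", "#E69F00", "#56B4E9", "#009E73",
          "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
markers = ("o", "^", "s", "D", "v", "<", ">", "p", "x")


def assign_styles(sera, colors, markers, serum_to_color_marker=None):
    """Hypothetical mapping of each serum to a (color, marker) 2-tuple."""
    if serum_to_color_marker is not None:
        # Explicit mapping given: colors and markers are ignored.
        return {serum: serum_to_color_marker[serum] for serum in sera}
    if len(sera) > min(len(colors), len(markers)):
        raise ValueError("not enough colors or markers for all sera")
    return dict(zip(sera, zip(colors, markers)))


def drop_ignored(virus_to_sera, ignore_virus_serum=None):
    """Remove the virus / serum combinations listed in ignore_virus_serum."""
    ignore = ignore_virus_serum or {}
    return {
        virus: [s for s in sera if s not in ignore.get(virus, [])]
        for virus, sera in virus_to_sera.items()
    }


virus_to_sera = {"WT": ["serum-A", "serum-B"], "mutant": ["serum-A", "serum-B"]}
to_plot = drop_ignored(virus_to_sera, ignore_virus_serum={"mutant": ["serum-B"]})
styles = assign_styles(["serum-A", "serum-B"], colors, markers)

# Vertical lines keyed by virus name, as described for vlines.
vlines = {"WT": [{"x": 0.05, "linestyle": "--"}]}
```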