plot

Plotting functions.

polyclonal.plot.DEFAULT_NEGATIVE_COLOR = '#E69F00'

Orange from cbPalette, given as a hex color.

Type:

str

polyclonal.plot.DEFAULT_POSITIVE_COLORS = ('#0072B2', '#CC79A7', '#009E73', '#17BECF', '#BCDB22')

French blue, wild orchid, green, light blue, olive. Colors in hex.

Type:

tuple

polyclonal.plot.TAB10_COLORS_NOGRAY = ('#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#bcbd22', '#17becf')

Tableau 10 color palette without gray.

Type:

tuple

polyclonal.plot.activity_wt_barplot(*, activity_wt_df, epitope_colors, epitopes=None, stat='activity', error_stat=None, width=110, height_per_bar=25)

Bar plot of activity against each epitope, \(a_{\rm{wt},e}\).

Parameters:
  • activity_wt_df (pandas.DataFrame) – Epitope activities in format of polyclonal.polyclonal.Polyclonal.activity_wt_df.

  • epitope_colors (dict) – Maps each epitope name to its color.

  • epitopes (array-like or None) – Include these epitopes in this order. If None, use all epitopes in order found in activity_wt_df.

  • stat (str) – Statistic in activity_wt_df to plot as activity.

  • error_stat (str or None) – Statistic in activity_wt_df to plot as error for bars.

  • width (float) – Width of plot.

  • height_per_bar (float) – Height of plot for each bar (epitope).

Returns:

Interactive plot.

Return type:

altair.Chart
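
The epitope_colors argument is a plain dict from epitope name to color. As a minimal sketch, one way to build it is to pair epitope names with the palette documented above; the helper make_epitope_colors and the epitope names are hypothetical and not part of polyclonal.

```python
# Palette values copied from the documented polyclonal.plot.DEFAULT_POSITIVE_COLORS.
DEFAULT_POSITIVE_COLORS = ("#0072B2", "#CC79A7", "#009E73", "#17BECF", "#BCDB22")


def make_epitope_colors(epitopes, palette=DEFAULT_POSITIVE_COLORS):
    """Map each epitope name to a palette color (hypothetical helper)."""
    if len(epitopes) > len(palette):
        raise ValueError("more epitopes than colors in the palette")
    # zip pairs names and colors in order and stops at the shorter sequence
    return dict(zip(epitopes, palette))


# Illustrative epitope names; real names come from your own model.
epitope_colors = make_epitope_colors(["class 1", "class 2", "class 3"])
print(epitope_colors)
```

Assuming a fitted model object (here hypothetically called model) that exposes activity_wt_df, the dict could then be passed as activity_wt_barplot(activity_wt_df=model.activity_wt_df, epitope_colors=epitope_colors).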

polyclonal.plot.color_gradient_hex(start, end, n)

Get a list of colors linearly spanning a range.

Parameters:
  • start (str) – Starting color.

  • end (str) – Ending color.

  • n (int) – Number of colors in list.

Returns:

List of hex codes for colors spanning start to end.

Return type:

list

Example

>>> color_gradient_hex('white', 'red', n=5)
['#ffffff', '#ffbfbf', '#ff8080', '#ff4040', '#ff0000']
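
To illustrate what "linearly spanning a range" means, here is a minimal pure-Python sketch that interpolates each RGB channel between two hex colors. It is not the library's implementation: the names hex_to_rgb and linear_gradient are hypothetical, and unlike color_gradient_hex this sketch accepts only six-digit hex codes, not color names such as 'white'.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of three ints in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def linear_gradient(start, end, n):
    """Return n hex colors evenly spaced from start to end (hypothetical helper)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    start_rgb, end_rgb = hex_to_rgb(start), hex_to_rgb(end)
    colors = []
    for i in range(n):
        frac = i / (n - 1)  # 0 at the start color, 1 at the end color
        # interpolate each channel, rounding half up to the nearest integer
        channels = [int(s + (e - s) * frac + 0.5) for s, e in zip(start_rgb, end_rgb)]
        colors.append("#" + "".join(f"{c:02x}" for c in channels))
    return colors


gradient = linear_gradient("#ffffff", "#ff0000", n=5)
print(gradient)
```

For white to red with n=5, this sketch yields the same five hex codes as the example above.
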

polyclonal.plot.corr_heatmap(corr_df, corr_col, sample_cols, *, group_col=None, corr_range=(0, 1), columns=3, diverging_colors=None, scheme=None)

Plot a correlation matrix as heat map from a tidy data frame of correlations.

Parameters:
  • corr_df (pandas.DataFrame) – Data to plot.

  • corr_col (str) – Column in corr_df with correlation coefficient.

  • sample_cols (str or list) – Column(s) in corr_df corresponding to sample identifiers, suffixed by “_1” and “_2” for the distinct samples. There should be entries for all pairs of samples.

  • group_col (str or None) – Column in corr_df to facet plots on, or None if no facets.

  • corr_range (tuple or None) – Range of heat map as (min, max), or None to use data range. Typically you will want to set to (0, 1) for \(r^2\) and (-1, 1) for \(r\).

  • columns (int) – Facet by group_col into this many columns.

  • diverging_colors (None or bool) – If True, mid point of color scale is set to zero. If None, select True if corr_range extends below 0.

  • scheme (None or str) – Color scheme to use, see https://vega.github.io/vega/docs/schemes/. If None, choose intelligently based on corr_range and diverging_colors.

Returns:

Heatmap(s) of correlation coefficients.

Return type:

altair.Chart
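
As a sketch of the tidy layout described for corr_df, the code below builds one row per ordered pair of samples, including each sample paired with itself. The sample names, the measurements, the column name "r", and the helper pearson_r are all made up for illustration; the "_1"/"_2" suffix convention follows the sample_cols description above.

```python
from itertools import product
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Illustrative measurements for three hypothetical samples.
measurements = {
    "rep1": [0.1, 0.5, 0.9, 1.4],
    "rep2": [0.2, 0.4, 1.0, 1.3],
    "rep3": [0.0, 0.6, 0.8, 1.5],
}

# One row for every ordered pair of samples, so all pairs are present.
corr_rows = [
    {"sample_1": s1, "sample_2": s2, "r": pearson_r(measurements[s1], measurements[s2])}
    for s1, s2 in product(measurements, repeat=2)
]
print(len(corr_rows))
```

Wrapped in a pandas.DataFrame, rows like these would presumably be passed as corr_heatmap(corr_df, corr_col="r", sample_cols="sample", corr_range=(-1, 1)); reading sample_cols="sample" as selecting the sample_1 and sample_2 columns is an assumption based on the parameter description.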

polyclonal.plot.curves_plot(curve_specs_df, name_col, *, names_to_colors=None, unbound_label='fraction not neutralized', npoints=200, concentration_range=50, height=125, width=225, addtl_tooltip_cols=None, replicate_col=None, weighted_replicates=None)

Plot Hill curves.

The curves are defined by \(U_e = \frac{1 - t_e}{1 + \left[c \exp \left(a_e\right)\right]^{n_e}} + t_e\) where \(U_e\) is the unbound fraction (plotted on y-axis), \(c\) is the concentration (plotted on x-axis), \(a_e\) is the activity, \(n_e\) is the Hill coefficient, and \(t_e\) is the non-neutralizable fraction. A different plot is made for each curve name (eg, epitope \(e\)).
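
The Hill curve above can be evaluated directly. The sketch below is a plain-Python transcription of that formula, not code taken from polyclonal; the function name unbound_fraction is hypothetical, and its arguments mirror the ‘activity’, ‘hill_coefficient’, and ‘non_neutralized_frac’ columns of curve_specs_df.

```python
import math


def unbound_fraction(c, activity, hill_coefficient=1.0, non_neutralized_frac=0.0):
    """Evaluate U_e = (1 - t_e) / (1 + [c exp(a_e)]^{n_e}) + t_e (hypothetical helper)."""
    scaled = (c * math.exp(activity)) ** hill_coefficient
    return (1 - non_neutralized_frac) / (1 + scaled) + non_neutralized_frac


# At c = exp(-a_e) the bracketed term is 1, so U_e lies halfway between t_e and 1.
midpoint = unbound_fraction(
    math.exp(-2.0), activity=2.0, hill_coefficient=1.5, non_neutralized_frac=0.1
)
print(midpoint)
```

Note that the curve runs from 1 at zero concentration down toward the non-neutralizable fraction \(t_e\) at high concentration, with its midpoint at \(c = \exp\left(-a_e\right)\).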

Parameters:
  • curve_specs_df (pandas.DataFrame) – Should have columns name_col (giving name, eg epitope), ‘activity’, ‘hill_coefficient’, and ‘non_neutralized_frac’ specifying each curve.

  • name_col (str) – Name of column in curve_specs_df giving the curve name (eg, epitope).

  • names_to_colors (dict or None) – To specify colors for each entry in name_col (eg, epitope), provide dict mapping names to colors.

  • unbound_label (str) – Label for the y-axis, \(U_e\).

  • npoints (int) – Number of points used to calculate the smoothed line that is plotted.

  • concentration_range (float or tuple) – If a float, then plot concentrations from this many fold lower than minimum \(\exp\left(-a_e\right)\) to this many fold greater than maximum \(\exp\left(-a_e\right)\). If a 2-tuple, then plot concentrations in the specified fixed range.

  • height (float) – Plot height.

  • width (float) – Plot width.

  • addtl_tooltip_cols (None or list) – Additional columns in curve_specs_df to show as tooltips.

  • replicate_col (None or str) – If there are multiple replicates with name_col, specify column with their names here and a line is plotted for each.

  • weighted_replicates (None or list) – If you want to plot only some replicates (such as ‘mean’) with a heavily weighted line and the rest with a thinner line, provide list of those to plot with heavily weighted line. None means all are heavily weighted.

Returns:

Interactive plot.

Return type:

altair.Chart
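
The two forms of concentration_range can be made concrete with a small sketch. This follows the parameter description only and is not the library's code; the function name concentration_limits is hypothetical.

```python
import math


def concentration_limits(activities, concentration_range=50):
    """Return (lower, upper) concentration limits (hypothetical helper).

    A number is treated as a fold-change around the curve midpoints exp(-a_e);
    a 2-tuple is treated as a fixed (lower, upper) range.
    """
    if isinstance(concentration_range, (int, float)):
        midpoints = [math.exp(-a) for a in activities]
        return min(midpoints) / concentration_range, max(midpoints) * concentration_range
    lower, upper = concentration_range
    return lower, upper


fold_limits = concentration_limits([0.0], concentration_range=50)
fixed_limits = concentration_limits([0.0], concentration_range=(0.1, 10))
print(fold_limits, fixed_limits)
```

With a single curve of activity 0, the midpoint is at concentration 1, so the default 50-fold setting spans concentrations from 1/50 to 50.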

polyclonal.plot.lineplot_and_heatmap(*, data_df, stat_col, category_col, alphabet=None, sites=None, addtl_tooltip_stats=None, addtl_slider_stats=None, addtl_slider_stats_as_max=None, addtl_slider_stats_hide_not_filter=None, init_floor_at_zero=True, init_site_statistic='sum', cell_size=11, lineplot_width=690, lineplot_height=100, site_zoom_bar_width=500, site_zoom_bar_color_col=None, plot_title=None, show_single_category_label=False, category_colors=None, heatmap_negative_color=None, heatmap_color_scheme=None, heatmap_color_scheme_mid_0=True, heatmap_lims_from_slider_init=True, heatmap_max_at_least=None, heatmap_min_at_least=None, heatmap_max_fixed=None, heatmap_min_fixed=None, site_zoom_bar_color_scheme='set3', slider_binding_range_kwargs=None, hide_color='silver', show_zoombar=True, show_lineplot=True, show_heatmap=True, scale_stat_col=1, rename_stat_col=None, rename_std=True, sites_to_show=None)

Lineplots and heatmaps of per-site and per-mutation values.

Parameters:
  • data_df (pandas.DataFrame) – Data to plot. Must have columns “site”, “wildtype”, “mutant”, stat_col, and category_col. The wildtype values (wildtype = mutant) should be included, but are not used for the slider filtering or included in site summary lineplot.

  • stat_col (str) – Column in data_df with statistic to plot.

  • category_col (str) – Column in data_df with category to facet plots over. You can just create a dummy column with some dummy value if you only have one category.

  • alphabet (array-like or None) – Alphabet letters in order. If None, use natsorted “mutant” col of data_df.

  • sites (array-like or None) – Sites in order to show. If None, use natsorted “site” col of data_df.

  • addtl_tooltip_stats (None or array-like) – Additional mutation-level stats to show in the heatmap tooltips. Values in addtl_slider_stats automatically included.

  • addtl_slider_stats (None or dict) – Additional stats for which to have a slider, value is initial setting. Ignores wildtype and drops it when all mutants have been dropped at site. Null values are not filtered.

  • addtl_slider_stats_as_max (None or list) – For slider stats listed here, filter for values <= rather than >= the value indicated in the slider.

  • addtl_slider_stats_hide_not_filter (None or list) – By default, mutations that fail the filters in addtl_slider_stats are removed entirely from the data set. If you just want them excluded from the lineplot but marked as hidden on the heat map (eg, gray box), add names of stats to this list. Mutations that fail one of these hiding filters are always shown as hidden on the heat map rather than fully excluded, even if they fail other filters in addtl_slider_stats.

  • init_floor_at_zero (bool) – Initial value for option to put a floor of zero on values in stat_col.

  • init_site_statistic ({'sum', 'mean', 'max', 'min', 'mean_abs', 'sum_abs'}) – Initial value for site statistic in lineplot, calculated from stat_col.

  • cell_size (float) – Size of cells in heatmap.

  • lineplot_width (float or None) – Overall width of lineplot.

  • lineplot_height (float) – Height of line plot.

  • site_zoom_bar_width (float) – Width of site zoom bar.

  • site_zoom_bar_color_col (None or str) – Column in data_df with which to color zoom bar. Must be the same for all entries for a site.

  • plot_title (str or None) – Overall plot title.

  • show_single_category_label (bool) – Show the category label if just one category.

  • category_colors (None or dict) – Map each category to its color, or None to use default. These are the colors for positive values of stat_col.

  • heatmap_negative_color (None or str) – Color used for negative values in heatmaps, or None to use default.

  • heatmap_color_scheme (None or str) – Heatmap uses this Vega scheme rather than category_colors and heatmap_negative_color.

  • heatmap_color_scheme_mid_0 (bool) – Set the heatmap color scheme so the domain mid is zero.

  • heatmap_lims_from_slider_init (bool) – If True, set the heatmap limits to span just the range of values that are shown given the initial values for any sliders in addtl_slider_stats. If False, include all values, including ones that might be filtered or hidden by the initial slider values.

  • heatmap_max_at_least (None or float) – Make heatmap color max at least this large.

  • heatmap_min_at_least (None or float) – Make heatmap color min at least this small, but still set to 0 if floor of zero selected.

  • heatmap_max_fixed (None or float) – Fix heatmap max to this value, even if it clamps data. Overrides heatmap_max_at_least.

  • heatmap_min_fixed (None or float) – Fix heatmap min to this value, even if it clamps data. Overrides heatmap_min_at_least.

  • site_zoom_bar_color_scheme (str) – If using site_zoom_bar_color_col, the Vega color scheme to use.

  • slider_binding_range_kwargs (dict) – Keyed by keys in addtl_slider_stats, with values being dicts giving keyword arguments passed to altair.binding_range (eg, ‘min’, ‘max’, ‘step’, etc).

  • hide_color (str) – Color given to any cells hidden by addtl_slider_stats_hide_not_filter.

  • show_zoombar (bool) – Show the zoom bar in the returned chart.

  • show_lineplot (bool) – Show the lineplot in the returned chart.

  • show_heatmap (bool) – Show the heatmap in the returned chart.

  • scale_stat_col (float) – Multiply numbers in stat_col by this number before plotting.

  • rename_stat_col (None or str) – If a str, rename stat_col to this. Also changes y-axis labels.

  • rename_std (bool) – If True and stat_col is being renamed via rename_stat_col, also rename any column named <stat_col>_std to <rename_stat_col>_std.

  • sites_to_show (None or dict) – If None, all sites are shown. If a dict, can be keyed by “include_range” (value a 2-tuple giving first and last site to include, inclusive), “include” (list of sites to include), or “exclude” (list of sites to exclude).

Returns:

Interactive plot.

Return type:

altair.Chart
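
The filtering rules described for addtl_slider_stats and addtl_slider_stats_as_max in lineplot_and_heatmap can be sketched in plain Python. This is an illustration of the documented rules only (values below the slider are dropped, or above it for "as max" stats; nulls are kept; wildtype rows are ignored by the filter and dropped once a site has no mutants left), not the library's implementation. The helpers passes_slider and filter_rows and the column names "escape" and "times_seen" are hypothetical.

```python
import math


def passes_slider(value, cutoff, as_max=False):
    """True if a value survives one slider; null values are never filtered."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return True
    return value <= cutoff if as_max else value >= cutoff


def is_wildtype(row):
    return row["wildtype"] == row["mutant"]


def filter_rows(rows, slider_stats, as_max=()):
    """Apply every slider to mutant rows, then keep wildtype rows only for surviving sites."""
    kept_mutants = [
        row for row in rows
        if not is_wildtype(row)
        and all(
            passes_slider(row.get(stat), cutoff, stat in as_max)
            for stat, cutoff in slider_stats.items()
        )
    ]
    surviving_sites = {row["site"] for row in kept_mutants}
    kept_wildtype = [r for r in rows if is_wildtype(r) and r["site"] in surviving_sites]
    return kept_wildtype + kept_mutants


# Illustrative data in the documented "site", "wildtype", "mutant" layout.
rows = [
    {"site": 1, "wildtype": "A", "mutant": "A", "escape": 0.0, "times_seen": None},
    {"site": 1, "wildtype": "A", "mutant": "C", "escape": 0.8, "times_seen": 5},
    {"site": 1, "wildtype": "A", "mutant": "D", "escape": 0.1, "times_seen": 1},
    {"site": 2, "wildtype": "G", "mutant": "G", "escape": 0.0, "times_seen": None},
    {"site": 2, "wildtype": "G", "mutant": "K", "escape": 0.3, "times_seen": 2},
    {"site": 3, "wildtype": "T", "mutant": "T", "escape": 0.0, "times_seen": None},
    {"site": 3, "wildtype": "T", "mutant": "S", "escape": 0.5, "times_seen": float("nan")},
]

kept = filter_rows(rows, slider_stats={"times_seen": 3})
print(sorted((r["site"], r["mutant"]) for r in kept))
```

Here site 2 disappears entirely because its only mutant fails the slider, while the site 3 mutant is retained because its times_seen value is null.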