Constrain fit parameters to a range of reasonable values¶
So far we have looked at allowing the top and bottom to either be free parameters or fixed values, and let the slope of the curve be fit as a free parameter.
But in some cases, you may want to not let these be entirely free parameters, but also not fix them—but rather constrain them to a reasonable range. For instance, maybe constraining the top to be between 0.75 and 1. Likewise, sometimes you can end up with the slope being fit to an extremely steep or shallow value, so it can also be useful to constrain the slope.
Note that when constraining these parameters, especially the slope, you need to make sure you are choosing a reasonable range. For the slope, what is a reasonable range will depend on the units used for concentration, as well as inherent properties of the serum or antibody in question.
To constrain these parameters, you can use the fixtop, fixbottom, and fixslope parameters to HillCurve
or CurveFits
. Each of these can be: - False
: fit as a free parameter - A single number: value is constrained to this number - A 2-tuple giving the min max range (eg, (0.8, 1)
), in which case the value is constrained to this range.
Here is an example.
First, import Python modules:
[1]:
import matplotlib.pyplot as plt
import neutcurve
import pandas as pd
Now we read some example data and fit it with the top and the slope either free (eg, fixtop=False
) or constrained to a range (eg, fixtop=(0.8, 1.0)
). The curves are then plotted for each type of fitting:
[2]:
# NBVAL_IGNORE_OUTPUT
data = pd.read_csv("constrain_params_range_data.csv")
fit_params = []
for group, group_data in data.groupby("serum"):
for desc, fixtop, fixslope in [
("free top", False, False),
("top constrained to [0.8, 1.0], slope to [1, 2]", (0.8, 1), (1, 2)),
]:
fits = neutcurve.CurveFits(group_data, fixtop=fixtop, fixslope=fixslope)
fig, _ = fits.plotReplicates(
sera=[group],
attempt_shared_legend=False,
legendfontsize=8,
heightscale=1.1,
widthscale=1.1,
subplot_titles="{virus}",
)
_ = fig.suptitle(
f"{group} fit with {desc}",
y=0.95,
fontsize=16,
fontweight="bold",
)
fig.tight_layout()
display(fig)
plt.close(fig)
if desc != "free top":
fit_params.append(
fits.fitParams(average_only=False, no_average=True)
.assign(fit_method=desc)
.drop(columns=["nreplicates", "ic50_str", "midpoint_bound"])
)
print("\n\n")
fit_params = pd.concat(fit_params, ignore_index=True)
fit_params = fit_params[
["fit_method"] + [c for c in fit_params.columns if c != "fit_method"]
]
if fit_params["fit_method"].nunique() == 1:
fit_params = fit_params.drop(columns="fit_method")
/home/jbloom/neutcurve/neutcurve/hillcurve.py:1177: RuntimeWarning: invalid value encountered in power
return b + (t - b) / (1 + (c / m) ** s)
/home/jbloom/neutcurve/neutcurve/hillcurve.py:1177: RuntimeWarning: divide by zero encountered in divide
return b + (t - b) / (1 + (c / m) ** s)
Tabulate the actual fit parameters for each setting:
[3]:
for serum_group, df in fit_params.groupby("serum"):
print(f"\nFit params for {serum_group}")
display(
df.drop(columns="serum").map(
lambda x: f"{x:.2g}" if isinstance(x, float) else x
)
)
Fit params for problematic slopes
virus | replicate | ic50 | ic50_bound | midpoint | midpoint_bound_type | slope | top | bottom | r2 | rmsd | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | A/Bangkok/P3755/2023 | plate11_r32_75k-CGCAAGGGATACTAAC | 0.0009 | interpolated | 0.001 | interpolated | 1 | 0.94 | 0 | 0.86 | 0.12 |
1 | A/Bangkok/P3755/2023 | plate11_r32_75k-TAGCTGGGCAAAGGCT | 0.0015 | interpolated | 0.0022 | interpolated | 1.4 | 0.8 | 0 | 0.67 | 0.19 |
2 | A/Bangkok/P3755/2023 | plate11_r32_75k-TATCATTTCATCTACA | 0.00051 | interpolated | 0.0006 | interpolated | 1 | 0.92 | 0 | 0.95 | 0.061 |
3 | A/Chipata/15-NIC-001/2023 | plate11_r32_75k-CAGTGCCATCCATCCA | 0.00085 | interpolated | 0.0011 | interpolated | 1.1 | 0.9 | 0 | 0.87 | 0.11 |
4 | A/Chipata/15-NIC-001/2023 | plate11_r32_75k-CGTATAACTGACGATT | 0.0005 | interpolated | 0.00082 | interpolated | 1 | 0.81 | 0 | 0.87 | 0.092 |
5 | A/Chipata/15-NIC-001/2023 | plate11_r32_75k-TTTCATATAATTTGAG | 0.00034 | interpolated | 0.00054 | interpolated | 1 | 0.82 | 0 | 0.95 | 0.056 |
6 | A/Oman/3011/2023 | plate11_r32_75k-AATAGGCCCAAATCCA | 0.00094 | interpolated | 0.0013 | interpolated | 1 | 0.87 | 0 | 0.52 | 0.2 |
7 | A/Oman/3011/2023 | plate11_r32_75k-TACGAAAATCAAGAGC | 0.00035 | interpolated | 0.00059 | interpolated | 1 | 0.8 | 0 | 0.35 | 0.15 |
8 | A/Oman/3011/2023 | plate11_r32_75k-TCCTTTAACTAATCGA | 0.00021 | interpolated | 0.00027 | interpolated | 1.2 | 0.87 | 0 | 0.97 | 0.043 |
Fit params for problematic tops
virus | replicate | ic50 | ic50_bound | midpoint | midpoint_bound_type | slope | top | bottom | r2 | rmsd | |
---|---|---|---|---|---|---|---|---|---|---|---|
9 | A/Catalonia/NSVH102124476/2023 | plate11_r32_75k-AATATACCGGCACTAC | 0.002 | interpolated | 0.0025 | interpolated | 2 | 0.83 | 0 | 0.83 | 0.14 |
10 | A/Catalonia/NSVH102124476/2023 | plate11_r32_75k-ACTGAACAGTATAACT | 0.0006 | interpolated | 0.00097 | interpolated | 1.1 | 0.8 | 0 | 0.93 | 0.067 |
11 | A/Catalonia/NSVH102124476/2023 | plate11_r32_75k-CTAAGGGCCTGTTCTT | 0.0012 | interpolated | 0.0014 | interpolated | 1.5 | 0.9 | 0 | 0.92 | 0.091 |
12 | A/Hong_Kong/2671/2019 | plate11_r32_75k-CGAAAACATTACAAAT | 0.00022 | interpolated | 0.00022 | interpolated | 2 | 1 | 0 | 0.85 | 0.17 |
13 | A/Hong_Kong/2671/2019 | plate11_r32_75k-CGGGAATCTCCCATAC | 0.00011 | interpolated | 0.00011 | interpolated | 2 | 1 | 0 | 0.77 | 0.16 |
14 | A/Hong_Kong/2671/2019 | plate11_r32_75k-TTCATCAAGTTGGTGC | 0.0001 | interpolated | 0.0001 | interpolated | 2 | 1 | 0 | 0.85 | 0.11 |
15 | A/Hong_Kong/4801/2014_(15/192) | plate11_r32_75k-CACAGACAATAAAAAA | 7.8e-05 | upper | 4.8e-05 | upper | 2 | 1 | 0 | 0.99 | 0.0091 |
16 | A/Hong_Kong/4801/2014_(15/192) | plate11_r32_75k-GGTTAACTTTGGAAGC | 7.8e-05 | upper | 4.4e-05 | upper | 2 | 1 | 0 | 0.93 | 0.023 |
[ ]: