Command-line interface
The easiest way to use pdb_prot_align is typically via the command-line executable that is installed with the package.
Usage is described below.
Usage
Align proteins to reference and PDB.
usage: pdb_prot_align [-h] [-v] --protsfile PROTSFILE --refprot_regex
                      REFPROT_REGEX --pdbfile PDBFILE --chain_ids CHAIN_IDS
                      [CHAIN_IDS ...] --outprefix OUTPREFIX
                      [--ignore_gaps IGNORE_GAPS] [--drop_pdb DROP_PDB]
                      [--drop_refprot DROP_REFPROT] [--mafft MAFFT]
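As an illustration of the synopsis above, the following minimal sketch assembles an argument list for the required options from Python. The helper name `build_command`, the file names, the regex, and the chain IDs are all hypothetical placeholders; only the flag names come from the usage text. The actual call is left commented out because it needs the package, mafft, and real input files.

```python
import shlex


def build_command(protsfile, refprot_regex, pdbfile, chain_ids, outprefix, mafft=None):
    """Return an argument list for pdb_prot_align (hypothetical helper)."""
    cmd = [
        "pdb_prot_align",
        "--protsfile", protsfile,
        "--refprot_regex", refprot_regex,
        "--pdbfile", pdbfile,
        "--chain_ids", *chain_ids,  # one or more chain IDs
        "--outprefix", outprefix,
    ]
    if mafft is not None:
        # Passed as ONE list element, which is the equivalent of quoting
        # the whole string on a shell command line.
        cmd += ["--mafft", mafft]
    return cmd


# All values below are made-up placeholders, not files shipped with the package.
cmd = build_command(
    protsfile="prots.fasta",
    refprot_regex="reference_strain",
    pdbfile="structure.pdb",
    chain_ids=["A", "B"],
    outprefix="results/",
    mafft="mafft --reorder",
)

# Show the equivalent shell command, with quoting applied where needed.
print(shlex.join(cmd))

# To actually run it (requires pdb_prot_align and mafft to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting mistakes, particularly for the regex and for a multi-word `--mafft` value.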
Named Arguments
- -v, --version
show program’s version number and exit
- --protsfile
input FASTA file of protein sequences
- --refprot_regex
regex for reference protein header in protsfile
- --pdbfile
input PDB file
- --chain_ids
chains in PDB file to align; all chains aligning to a site must share the same residue number and amino-acid or an error will be raised
- --outprefix
prefix for output files (can be or include a directory). The output files are: “alignment.fa” (alignment with gaps relative to reference stripped); “alignment_unstripped.fa” (non-stripped alignment with PDB chains still included); “sites.csv” (sequential sites in reference, PDB sites, PDB chains, wildtype in reference, wildtype in PDB, site entropy in bits, n effective amino acids at site, amino acid, frequency of amino acid)
- --ignore_gaps
ignore gaps (-) when calculating frequencies, number of effective amino acids, and entropy
Default: True
- --drop_pdb
drop PDB protein chains from “alignment.fa” and computation of stats in “sites.csv” output files
Default: True
- --drop_refprot
drop reference protein from “alignment.fa” and computation of stats in “sites.csv” output files
Default: False
- --mafft
path to mafft, potentially with additional args such as “mafft --reorder” (if there are multiple args, the whole string must be in quotes)
Default: “mafft”
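To make the per-site statistics mentioned for “sites.csv” and --ignore_gaps more concrete, here is a minimal sketch of how amino-acid frequencies, entropy in bits, and a number of effective amino acids could be computed for one alignment column. This is not the package's implementation. The function name `site_stats` is hypothetical, the definition of the effective number as 2 raised to the entropy in bits is an assumption, and so is treating a gap as an ordinary character when gaps are not ignored.

```python
import math
from collections import Counter


def site_stats(column, ignore_gaps=True):
    """Frequencies, entropy (bits), and effective number for one column.

    Illustrative only: the formulas here are assumptions, not taken
    from the pdb_prot_align source code.
    """
    residues = [aa for aa in column if not (ignore_gaps and aa == "-")]
    if not residues:
        raise ValueError("no characters left in column after removing gaps")
    counts = Counter(residues)
    total = len(residues)
    freqs = {aa: n / total for aa, n in counts.items()}
    # Shannon entropy with log base 2, so the result is in bits.
    entropy_bits = sum(-p * math.log2(p) for p in freqs.values())
    # Assumed definition: exponentiated entropy.
    n_effective = 2 ** entropy_bits
    return freqs, entropy_bits, n_effective


# A made-up column: two A, two G, and two gaps.
freqs, entropy_bits, n_effective = site_stats("AAGG--", ignore_gaps=True)
print(freqs, entropy_bits, n_effective)

# Counting gaps as a character changes every statistic for this column.
freqs_gaps, entropy_gaps, n_eff_gaps = site_stats("AAGG--", ignore_gaps=False)
print(freqs_gaps, entropy_gaps, n_eff_gaps)
```

In this sketch, ignoring gaps leaves two equally frequent amino acids in the example column, whereas counting gaps adds a third equally frequent character and raises both the entropy and the effective number.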