bioprov.programs package
Submodules
bioprov.programs.programs module
Module for holding preset instances of the Program class.
- bioprov.programs.programs.blastn(sample=None, db=None, query_tag='query', outformat=6, extra_flags=None)
- Parameters
sample (Sample) – Instance of BioProv.Sample.
db (str) – A string pointing to the reference database directory and title.
query_tag (str) – A tag for the query file.
outformat (int) – The output format to gather from blastn.
extra_flags (list) – A list of extra parameters to pass to BLASTN.
- Returns
Instance of PresetProgram for BLASTN.
- Return type
BioProv.PresetProgram.
- Raises
AssertionError – Path to the reference database does not exist.
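The `db` argument combines a directory and a database title, and the preset raises `AssertionError` when that path does not exist. The sketch below illustrates one way such a check and the documented arguments could map onto a BLASTN command line. The helper `build_blastn_command`, the file names, and the exact existence test are hypothetical and are not BioProv's implementation; only `-query`, `-db`, `-outfmt`, `-out`, and `-evalue` are BLASTN's own flags.

```python
import os
import tempfile


def build_blastn_command(query, db, out, outformat=6, extra_flags=None):
    """Hypothetical helper: assemble a blastn argument list.

    Assumption: `db` is "<directory>/<title>", so only the directory part
    is checked. BioProv's real check may differ.
    """
    db_dir = os.path.dirname(db) or "."
    assert os.path.isdir(db_dir), f"Database directory not found: {db_dir}"

    cmd = ["blastn", "-query", query, "-db", db,
           "-outfmt", str(outformat), "-out", out]
    # extra_flags is a list of additional command-line tokens.
    cmd.extend(extra_flags or [])
    return cmd


with tempfile.TemporaryDirectory() as tmp:
    cmd = build_blastn_command(
        query="contigs.fna",
        db=os.path.join(tmp, "refdb"),
        out="hits.tsv",
        extra_flags=["-evalue", "1e-5"],
    )
    print(" ".join(cmd))
```

Passing `extra_flags` as a list of tokens, rather than one string, avoids quoting problems when the command is later handed to a subprocess.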
- bioprov.programs.programs.blastp(sample, db, query_tag='query', outformat=6, extra_flags=None)
- Parameters
sample (Sample) – Instance of BioProv.Sample.
db (str) – A string pointing to the reference database directory and title.
query_tag (str) – A tag for the query file.
outformat (int) – The output format to gather from blastp.
extra_flags (list) – A list of extra parameters to pass to BLASTP.
- Returns
Instance of PresetProgram for BLASTP.
- Return type
BioProv.PresetProgram.
- Raises
AssertionError – Path to the reference database does not exist.
- bioprov.programs.programs.diamond(blast_type, sample, db, query_tag='query', outformat=6, extra_flags=None)
- Parameters
blast_type (str) – Which aligner to use (‘blastp’ or ‘blastx’).
sample (Sample) – Instance of BioProv.Sample.
db (str) – A string pointing to the reference database path.
query_tag (str) – A tag for the query file.
outformat (int) – The output format to gather from diamond (0, 5 or 6).
extra_flags (list) – A list of extra parameters to pass to diamond (e.g. --sensitive or --log).
- Returns
Instance of PresetProgram containing Diamond.
- Return type
BioProv.PresetProgram.
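The parameter list above restricts `blast_type` to two values and `outformat` to three. As an illustration only, the following sketch validates those constraints before assembling a DIAMOND command line. The helper `build_diamond_command` and the file names are hypothetical, and whether BioProv itself validates these arguments is not stated here; `--query`, `--db`, `--outfmt`, `--out`, and `--sensitive` are DIAMOND's own options.

```python
def build_diamond_command(blast_type, query, db, out,
                          outformat=6, extra_flags=None):
    """Hypothetical helper mirroring the documented diamond() arguments."""
    if blast_type not in ("blastp", "blastx"):
        raise ValueError(
            f"blast_type must be 'blastp' or 'blastx', got {blast_type!r}"
        )
    if outformat not in (0, 5, 6):
        raise ValueError(f"outformat must be 0, 5 or 6, got {outformat!r}")

    cmd = ["diamond", blast_type, "--query", query, "--db", db,
           "--outfmt", str(outformat), "--out", out]
    # Extra flags are appended verbatim after the fixed arguments.
    cmd.extend(extra_flags or [])
    return cmd


diamond_cmd = build_diamond_command(
    "blastp", "proteins.faa", "refdb.dmnd", "hits.tsv",
    extra_flags=["--sensitive"],
)
print(" ".join(diamond_cmd))
```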
- bioprov.programs.programs.fasttree(sample, input_tag='input', extra_flags=None)
- Parameters
sample (Sample) – Instance of BioProv.Sample.
input_tag (str) – A tag for the input multifasta file.
extra_flags (list) – A list of extra parameters to pass to FastTree.
- Returns
Instance of PresetProgram containing FastTree.
- Return type
BioProv.PresetProgram.
- bioprov.programs.programs.kaiju(_sample, output_path=None, kaijudb='', nodes='', threads=1, r1='R1', r2='R2', add_param_str='')
Run Kaiju on paired-end metagenomic data.
- Parameters
_sample – An instance of BioProv sample.
output_path – Output file of Kaiju.
kaijudb – Path to Kaiju database.
nodes – Nodes file to use with Kaiju.
threads – Threads to use with Kaiju.
r1 – Tag of forward reads.
r2 – Tag of reverse reads.
add_param_str – Any additional parameters to pass to Kaiju (in string format).
- Returns
An instance of Program, containing Kaiju.
- bioprov.programs.programs.kaiju2table(_sample, output_path=None, rank='phylum', nodes='', names='', kaiju_output='kaiju_output', add_param_str='')
Run kaiju2table to create Kaiju reports.
- Parameters
_sample – An instance of BioProv sample.
output_path – Output file of kaiju2table.
rank – Taxonomic rank to create report of.
nodes – Nodes file to use with kaiju2table.
names – Names file to use with kaiju2table.
kaiju_output – Tag of Kaiju output file.
add_param_str – Parameter string to add.
- Returns
Instance of Program containing kaiju2table.
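Unlike the `extra_flags` lists used by other presets, `add_param_str` is a single string. The sketch below shows one way such a string could be split into tokens and placed on a kaiju2table command line. The helper `build_kaiju2table_command` and the file names are hypothetical and do not reflect how BioProv builds the command; `-t`, `-n`, `-r`, `-o`, `-m`, and `-e` are kaiju2table's own options.

```python
import shlex


def build_kaiju2table_command(kaiju_output, output_path, nodes, names,
                              rank="phylum", add_param_str=""):
    """Hypothetical helper: assemble a kaiju2table argument list."""
    cmd = ["kaiju2table", "-t", nodes, "-n", names,
           "-r", rank, "-o", output_path]
    # shlex.split respects shell-style quoting inside the parameter string.
    cmd.extend(shlex.split(add_param_str))
    # The Kaiju output file is a positional argument and goes last.
    cmd.append(kaiju_output)
    return cmd


k2t_cmd = build_kaiju2table_command(
    kaiju_output="kaiju.out",
    output_path="summary.tsv",
    nodes="nodes.dmp",
    names="names.dmp",
    add_param_str="-m 1.0 -e",
)
print(" ".join(k2t_cmd))
```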
- bioprov.programs.programs.kallisto_quant(sample, index, output_dir='./', extra_flags=None)
Run kallisto’s alignment and quantification.
- Parameters
sample (Sample) – Instance of BioProv.Sample.
index (str) – A path to a kallisto index file.
output_dir (str) – A path to kallisto’s output directory.
extra_flags (list) – A list of extra parameters to pass to kallisto (e.g. --single or --plaintext).
- Returns
Instance of PresetProgram containing kallisto.
- Return type
BioProv.PresetProgram.
- bioprov.programs.programs.mafft(sample, input_tag='input', extra_flags=None)
- Parameters
sample (Sample) – Instance of BioProv.Sample.
input_tag (str) – A tag for the input fasta file.
extra_flags (list) – A list of extra parameters to pass to MAFFT.
- Returns
Instance of PresetProgram containing MAFFT.
- Return type
BioProv.PresetProgram.
- bioprov.programs.programs.muscle(sample, input_tag='input', msf=False, extra_flags=None)
- Parameters
sample (Sample) – Instance of BioProv.Sample.
input_tag (str) – A tag for the input multi-fasta file.
msf (bool) – Whether or not to have the output in msf format.
extra_flags (list) – A list of extra parameters to pass to Muscle.
- Returns
Instance of PresetProgram for Muscle.
- Return type
BioProv.PresetProgram.
- bioprov.programs.programs.prodigal(sample=None, input_tag='assembly', extra_flags=None)
- Parameters
sample – Instance of BioProv.Sample.
input_tag – A tag for the input assembly file.
extra_flags (list) – A list of extra parameters to pass to Prodigal.
- Returns
Instance of PresetProgram containing Prodigal.
- bioprov.programs.programs.prokka(_sample, output_path=None, threads=1, add_param_str='', assembly='assembly', contigs='prokka_contigs', genes='prokka_genes', proteins='prokka_proteins', feature_table='feature_table', submit_contigs='submit_contigs', sequin='sequin', genbank='genbank', gff='gff', log='prokka_log', stats='prokka_stats')
- Parameters
_sample – An instance of BioProv Sample.
output_path – Output directory of Prokka.
threads – Threads to use for Prokka.
add_param_str – Any additional parameters to be passed to Prokka (in string format).
The following parameters are the tags for each file, i.e. strings used as keys in _sample.files.
- Parameters
assembly – Input assembly file.
contigs – Output contigs.
genes – Output genes.
proteins – Output proteins.
feature_table – Output feature table.
submit_contigs – Output contigs formatted for NCBI submission.
sequin – Output sequin file.
genbank – Output GenBank .gbk file.
gff – Output .gff file.
log – Prokka log file.
stats – Prokka stats file.
- Returns
An instance of Program, containing Prokka.
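Each tag parameter above names one file attached to the sample. The sketch below maps the default output tags to the file extensions Prokka writes, using a plain dictionary as a stand-in for `_sample.files`. The helper `expected_prokka_files`, the `prefix` argument, and the directory name are hypothetical; how BioProv actually names and registers these files is an assumption not confirmed by this page.

```python
from pathlib import Path

# Default output tags from the prokka() signature, paired with the file
# extension Prokka uses for that kind of output.
PROKKA_EXTENSIONS = {
    "prokka_contigs": ".fna",
    "prokka_genes": ".ffn",
    "prokka_proteins": ".faa",
    "feature_table": ".tbl",
    "submit_contigs": ".fsa",
    "sequin": ".sqn",
    "genbank": ".gbk",
    "gff": ".gff",
    "prokka_log": ".log",
    "prokka_stats": ".txt",
}


def expected_prokka_files(output_path, prefix):
    """Hypothetical helper: map each tag to the path Prokka would write."""
    out_dir = Path(output_path)
    return {tag: out_dir / f"{prefix}{ext}"
            for tag, ext in PROKKA_EXTENSIONS.items()}


files = expected_prokka_files("prokka_out", "sample1")
for tag, path in files.items():
    print(f"{tag}: {path}")
```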
- bioprov.programs.programs.prokka_()
- Returns
Instance of PresetProgram containing Prokka.