bioprov.programs package

Submodules

bioprov.programs.programs module

Module for holding preset instances of the Program class.

bioprov.programs.programs.blastn(sample=None, db=None, query_tag='query', outformat=6, extra_flags=None)
Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • db (str) – A string pointing to the reference database directory and title.

  • query_tag (str) – A tag for the query file.

  • outformat (int) – The output format to gather from blastn.

  • extra_flags (list) – A list of extra parameters to pass to BLASTN.

Returns

Instance of PresetProgram for BLASTN.

Return type

BioProv.PresetProgram.

Raises

AssertionError – Path to the reference database does not exist.
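
As a rough illustration of how the parameters above could translate into a BLASTN invocation, the sketch below builds an argument list by hand. The helpers check_db and blastn_args are hypothetical and not part of BioProv. The database check (testing that the directory part of db exists) is an assumption about what the documented AssertionError guards against. Only the standard BLAST+ options -query, -db, -outfmt and -out are used.

```python
from pathlib import Path


def check_db(db):
    """Assumed check: the directory part of 'directory/title' must exist."""
    assert Path(db).parent.exists(), (
        f"Path to the reference database does not exist: {db}"
    )


def blastn_args(query, db, out, outformat=6, extra_flags=None):
    """Build a blastn argument list (hypothetical helper, not BioProv's API)."""
    args = [
        "blastn",
        "-query", str(query),
        "-db", str(db),
        "-outfmt", str(outformat),
        "-out", str(out),
    ]
    # Extra flags are appended verbatim, mirroring the extra_flags parameter.
    args.extend(extra_flags or [])
    return args


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(" ".join(blastn_args("query.fna", "dbs/nt", "hits.tsv",
                               extra_flags=["-num_threads", "4"])))
```

The same shape would apply to blastp by swapping the executable name, since both functions document identical parameters.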

bioprov.programs.programs.blastp(sample, db, query_tag='query', outformat=6, extra_flags=None)
Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • db (str) – A string pointing to the reference database directory and title.

  • query_tag (str) – A tag for the query file.

  • outformat (int) – The output format to gather from blastp.

  • extra_flags (list) – A list of extra parameters to pass to BLASTP.

Returns

Instance of PresetProgram for BLASTP.

Return type

BioProv.PresetProgram.

Raises

AssertionError – Path to the reference database does not exist.

bioprov.programs.programs.diamond(blast_type, sample, db, query_tag='query', outformat=6, extra_flags=None)
Parameters
  • blast_type (str) – Which aligner to use (‘blastp’ or ‘blastx’).

  • sample (Sample) – Instance of BioProv.Sample.

  • db (str) – A string pointing to the reference database path.

  • query_tag (str) – A tag for the query file.

  • outformat (int) – The output format to gather from diamond (0, 5 or 6).

  • extra_flags (list) – A list of extra parameters to pass to diamond (e.g. --sensitive or --log).

Returns

Instance of PresetProgram containing Diamond.

Return type

BioProv.PresetProgram.
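
The constraints stated above (blast_type is 'blastp' or 'blastx', outformat is 0, 5 or 6) can be sketched as explicit checks before a command is assembled. The helper diamond_args is hypothetical, and whether BioProv validates its inputs this way is an assumption. The options --query, --db, --outfmt and --out are standard DIAMOND options.

```python
VALID_BLAST_TYPES = ("blastp", "blastx")
VALID_OUTFORMATS = (0, 5, 6)


def diamond_args(blast_type, query, db, out, outformat=6, extra_flags=None):
    """Build a diamond argument list (hypothetical helper, not BioProv's API)."""
    if blast_type not in VALID_BLAST_TYPES:
        raise ValueError(f"blast_type must be one of {VALID_BLAST_TYPES}")
    if outformat not in VALID_OUTFORMATS:
        raise ValueError(f"outformat must be one of {VALID_OUTFORMATS}")
    args = [
        "diamond", blast_type,
        "--query", str(query),
        "--db", str(db),
        "--outfmt", str(outformat),
        "--out", str(out),
    ]
    # Extra flags such as --sensitive are appended unchanged.
    args.extend(extra_flags or [])
    return args


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(" ".join(diamond_args("blastp", "proteins.faa", "ref.dmnd",
                                "hits.tsv", extra_flags=["--sensitive"])))
```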

bioprov.programs.programs.fasttree(sample, input_tag='input', extra_flags=None)
Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • input_tag (str) – A tag for the input multifasta file.

  • extra_flags (list) – A list of extra parameters to pass to FastTree.

Returns

Instance of PresetProgram containing FastTree.

Return type

BioProv.PresetProgram.

bioprov.programs.programs.kaiju(_sample, output_path=None, kaijudb='', nodes='', threads=1, r1='R1', r2='R2', add_param_str='')

Run Kaiju on paired-end metagenomic data.

Parameters
  • _sample – An instance of BioProv sample.

  • output_path – Output file of Kaiju.

  • kaijudb – Path to Kaiju database.

  • nodes – Nodes file to use with Kaiju.

  • threads – Threads to use with Kaiju.

  • r1 – Tag of forward reads.

  • r2 – Tag of reverse reads.

  • add_param_str – Any additional parameters to pass to Kaiju (in string format).

Returns

An instance of Program, containing Kaiju.
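
To make the role of each parameter concrete, the sketch below maps them onto Kaiju's command-line options (-t nodes file, -f database, -i and -j paired reads, -o output, -z threads). The helper kaiju_args is hypothetical and takes file paths directly. In BioProv, r1 and r2 are tags that are looked up on the sample, and how that lookup is performed is not shown here.

```python
import shlex


def kaiju_args(r1_path, r2_path, kaijudb, nodes, output_path,
               threads=1, add_param_str=""):
    """Build a kaiju argument list (hypothetical helper, not BioProv's API)."""
    args = [
        "kaiju",
        "-t", nodes,        # nodes.dmp taxonomy file
        "-f", kaijudb,      # Kaiju database (.fmi)
        "-i", r1_path,      # forward reads
        "-j", r2_path,      # reverse reads
        "-o", output_path,
        "-z", str(threads),
    ]
    # add_param_str is a single string, so split it shell-style.
    args.extend(shlex.split(add_param_str))
    return args


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(" ".join(kaiju_args("s_R1.fastq", "s_R2.fastq", "kaiju_db.fmi",
                              "nodes.dmp", "kaiju.out", threads=4,
                              add_param_str="-a greedy -e 5")))
```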

bioprov.programs.programs.kaiju2table(_sample, output_path=None, rank='phylum', nodes='', names='', kaiju_output='kaiju_output', add_param_str='')

Run kaiju2table to create Kaiju reports.

Parameters
  • _sample – An instance of BioProv sample.

  • output_path – Output file of kaiju2table.

  • rank – Taxonomic rank to create report of.

  • nodes – Nodes file to use with kaiju2table.

  • names – Names file to use with kaiju2table.

  • kaiju_output – Tag of Kaiju output file.

  • add_param_str – Parameter string to add.

Returns

Instance of Program containing kaiju2table.

bioprov.programs.programs.kallisto_quant(sample, index, output_dir='./', extra_flags=None)

Run kallisto’s alignment and quantification.

Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • index (str) – A path to a kallisto index file.

  • output_dir (str) – A path to kallisto’s output directory.

  • extra_flags (list) – A list of extra parameters to pass to kallisto (e.g. --single or --plaintext).

Returns

Instance of PresetProgram containing kallisto.

Return type

BioProv.PresetProgram.
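
A minimal sketch of the corresponding kallisto quant command line follows. The helper kallisto_quant_args is hypothetical and takes read paths directly, whereas the BioProv function receives them through the sample. Note that kallisto's --single mode also requires -l (fragment length) and -s (standard deviation), so those would need to be included in extra_flags.

```python
def kallisto_quant_args(reads, index, output_dir="./", extra_flags=None):
    """Build a 'kallisto quant' argument list (hypothetical helper)."""
    args = ["kallisto", "quant", "-i", index, "-o", output_dir]
    # Options go before the FASTQ files, which come last on the command line.
    args.extend(extra_flags or [])
    args.extend(reads)
    return args


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    paired = kallisto_quant_args(["r1.fastq", "r2.fastq"], "tx.idx", "quant_out")
    single = kallisto_quant_args(
        ["reads.fastq"], "tx.idx", "quant_out",
        extra_flags=["--single", "-l", "200", "-s", "20"],
    )
    print(" ".join(paired))
    print(" ".join(single))
```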

bioprov.programs.programs.mafft(sample, input_tag='input', extra_flags=None)
Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • input_tag (str) – A tag for the input fasta file.

  • extra_flags (list) – A list of extra parameters to pass to MAFFT.

Returns

Instance of PresetProgram containing MAFFT.

Return type

BioProv.PresetProgram.

bioprov.programs.programs.muscle(sample, input_tag='input', msf=False, extra_flags=None)
Parameters
  • sample (Sample) – Instance of BioProv.Sample.

  • input_tag (str) – A tag for the input multi-fasta file.

  • msf (bool) – Whether or not to have the output in msf format.

  • extra_flags (list) – A list of extra parameters to pass to Muscle.

Returns

Instance of PresetProgram for Muscle.

Return type

BioProv.PresetProgram.

bioprov.programs.programs.prodigal(sample=None, input_tag='assembly', extra_flags=None)
Parameters
  • sample – Instance of BioProv.Sample.

  • input_tag (str) – A tag for the input assembly file.

  • extra_flags (list) – A list of extra parameters to pass to Prodigal.

Returns

Instance of PresetProgram containing Prodigal.

bioprov.programs.programs.prokka(_sample, output_path=None, threads=1, add_param_str='', assembly='assembly', contigs='prokka_contigs', genes='prokka_genes', proteins='prokka_proteins', feature_table='feature_table', submit_contigs='submit_contigs', sequin='sequin', genbank='genbank', gff='gff', log='prokka_log', stats='prokka_stats')
Parameters
  • _sample – An instance of BioProv Sample.

  • output_path – Output directory of Prokka.

  • threads – Threads to use for Prokka.

  • add_param_str – Any additional parameters to be passed to Prokka (in string format).

The following params are the tags for each file, meaning that they are a string present in _sample.files.keys().

Parameters
  • assembly – Input assembly file.

  • contigs – Output contigs.

  • genes – Output genes.

  • proteins – Output proteins.

  • feature_table – Output feature table.

  • submit_contigs – Output contigs formatted for NCBI submission.

  • sequin – Output sequin file.

  • genbank – Output genbank .gbk file.

  • gff – Output .gff file.

  • log – Prokka log file.

  • stats – Prokka stats file.

Returns

An instance of Program, containing Prokka.
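
The tag parameters above can be pictured as keys in a dictionary of output files. The sketch below pairs each parameter's default tag with the file extension Prokka writes for that kind of output (.fna, .ffn, .faa, .tbl, .fsa, .sqn, .gbk, .gff, .log, .txt). The helper prokka_output_files is hypothetical. The tag-to-extension pairing and the use of a sample name as the file prefix are assumptions based on Prokka's documented outputs, not a description of BioProv's internals.

```python
from pathlib import Path

# parameter name: (default tag from the signature, Prokka file extension)
PROKKA_OUTPUTS = {
    "contigs": ("prokka_contigs", ".fna"),
    "genes": ("prokka_genes", ".ffn"),
    "proteins": ("prokka_proteins", ".faa"),
    "feature_table": ("feature_table", ".tbl"),
    "submit_contigs": ("submit_contigs", ".fsa"),
    "sequin": ("sequin", ".sqn"),
    "genbank": ("genbank", ".gbk"),
    "gff": ("gff", ".gff"),
    "log": ("prokka_log", ".log"),
    "stats": ("prokka_stats", ".txt"),
}


def prokka_output_files(output_path, prefix, **tags):
    """Map each tag to its expected output path (hypothetical helper)."""
    files = {}
    for param, (default_tag, ext) in PROKKA_OUTPUTS.items():
        # A keyword argument overrides the default tag for that output.
        tag = tags.get(param, default_tag)
        files[tag] = Path(output_path) / (prefix + ext)
    return files


if __name__ == "__main__":
    # "prokka_out" and "sampleA" are placeholders for illustration only.
    for tag, path in prokka_output_files("prokka_out", "sampleA").items():
        print(f"{tag}: {path}")
```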

bioprov.programs.programs.prokka_()
Returns

Instance of PresetProgram containing Prokka.

Module contents