4.4. assemble_denovo

Assisted de novo viral genome assembly from raw reads.

4.4.1. Inputs

4.4.1.1. Required inputs

assemble_denovo.assemble.spades_n_reads
Int — Default: 10000000
Read-count threshold for subsampling reads prior to assembly. Default: 10000000.

assemble_denovo.reads_unmapped_bams
Array[File]+ — Default: None
???

assemble_denovo.reference_genome_fasta
Array[File]+ — Default: None
After de novo assembly, large contigs are scaffolded against a reference genome to determine orientation and to join contigs together, before further polishing by reads. You must supply at least one reference genome (all segments/chromosomes in a single fasta file). If more than one reference is provided, contigs will be scaffolded against all of them and the one with the most complete assembly will be chosen for downstream polishing.

assemble_denovo.trim_clip_db
File — Default: None
???
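The required inputs above can be collected into a JSON inputs file keyed by their fully qualified names. The following is a minimal sketch; the file names are hypothetical placeholders, and it assumes a WDL runner that accepts a JSON inputs file.

```python
import json

# Hypothetical file names; substitute real paths or URIs.
inputs = {
    "assemble_denovo.reads_unmapped_bams": ["sample1.cleaned.bam"],
    "assemble_denovo.reference_genome_fasta": ["reference.fasta"],
    "assemble_denovo.trim_clip_db": "trim_clip_db.fasta",
    # Documented default, written out explicitly for illustration.
    "assemble_denovo.assemble.spades_n_reads": 10000000,
}

# Serialize to the JSON text that would be saved as the inputs file.
inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

Both Array[File]+ inputs are written as non-empty lists, since each must contain at least one file.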

4.4.1.2. Other common inputs

assemble_denovo.refine.trim_coords_bed
File? — Default: None
Optional primers to trim, given in reference coordinate space (0-based BED format).

assemble_denovo.scaffold.min_length_fraction
Float? — Default: None
This step will fail with a PoorAssemblyError if the total end-to-end genome length in the output genome (inclusive of interior Ns) is less than this fraction of the length of the reference genome selected. Valid values are fractions from 0 to 1; the default value is 0.5.

assemble_denovo.scaffold.min_unambig
Float? — Default: None
This step will fail with a PoorAssemblyError if the total number of unambiguous bases in the output genome (exclusive of interior Ns) is less than this fraction of its end-to-end length (inclusive of interior Ns). Valid values are fractions from 0 to 1; the default value is 0.5.
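The two thresholds above can be illustrated with a small sketch. This is not the workflow's own code: the PoorAssemblyError class here is a local stand-in, check_assembly is a hypothetical helper, and counting only A, C, G and T as unambiguous is an assumption.

```python
class PoorAssemblyError(Exception):
    """Local stand-in for the error named in the descriptions above."""


def check_assembly(genome, ref_length, min_length_fraction=0.5, min_unambig=0.5):
    """Apply both documented checks to a single sequence string.

    genome: output sequence, interior Ns included.
    ref_length: length of the selected reference genome.
    """
    total_len = len(genome)
    # Assumption: "unambiguous" means A, C, G or T only.
    unambig = sum(1 for base in genome.upper() if base in "ACGT")
    if total_len < min_length_fraction * ref_length:
        raise PoorAssemblyError("assembly too short relative to reference")
    if unambig < min_unambig * total_len:
        raise PoorAssemblyError("too few unambiguous bases")
    return total_len, unambig
```

For example, a 10-base genome with two interior Ns passes against a 12-base reference, while a 4-base genome fails the length check.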

4.4.1.3. Advanced inputs

assemble_denovo.refine.call_consensus.mark_duplicates
Boolean — Default: false
Instead of removing duplicates, simply marks them.

assemble_denovo.refine.ivar_trim.min_keep_length
Int? — Default: None
Minimum length of read to retain after trimming (Default: 30)

assemble_denovo.refine.ivar_trim.min_quality
Int? — Default: 1
Minimum quality threshold for sliding window to pass (ivar's own default: 20; this workflow sets 1 by default, as listed above)

assemble_denovo.refine.ivar_trim.sliding_window
Int? — Default: None
Width of sliding window for quality trimming (Default: 4)

assemble_denovo.scaffold.nucmer_max_gap
Int? — Default: None
When scaffolding contigs to the reference via nucmer, this specifies the -g parameter to nucmer (the maximum allowed gap between adjacent matches in a cluster). Our default is 200 (up from the nucmer default of 90); the mummer documentation suggests it is valid to increase this up to 1000 to allow for more diversity.

assemble_denovo.scaffold.nucmer_min_cluster
Int? — Default: None
When scaffolding contigs to the reference via nucmer, this specifies the -c parameter to nucmer (minimum cluster length). Our default is the nucmer default of 65 bp.

assemble_denovo.scaffold.nucmer_min_match
Int? — Default: None
When scaffolding contigs to the reference via nucmer, this specifies the -l parameter to nucmer (the minimal size of a maximal exact match). Our default is 10 (down from nucmer default of 20) to allow for more divergence.
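The three nucmer parameters above map onto the -g, -c and -l flags. As a rough sketch, the command line could be assembled as follows; nucmer_args is a hypothetical helper, the file names are placeholders, and the workflow itself may invoke nucmer differently.

```python
def nucmer_args(reference_fasta, contigs_fasta,
                max_gap=200, min_cluster=65, min_match=10):
    """Build an argument list using the defaults documented above."""
    return [
        "nucmer",
        "-g", str(max_gap),      # nucmer_max_gap
        "-c", str(min_cluster),  # nucmer_min_cluster
        "-l", str(min_match),    # nucmer_min_match
        reference_fasta,         # reference comes first
        contigs_fasta,           # query (contigs) comes second
    ]


# Hypothetical file names, with a wider gap for more divergent samples.
print(" ".join(nucmer_args("reference.fasta", "contigs.fasta", max_gap=1000)))
```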

assemble_denovo.scaffold.replace_length
Int — Default: 55
The first and last replace_length base pairs of each segment in the output genome will be replaced with the equivalent sequences in the reference genome as a mechanism to handle common assembly errors in repetitive or inverted regions that are common to chromosome/segment ends. Valid values are any non-negative integer. Default is 55 bp.
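The end-replacement idea can be sketched as below. This is a simplified illustration, not the workflow's implementation: replace_ends is a hypothetical helper, and it assumes the assembly and reference segment are already in the same coordinate space and of equal length, so that "equivalent" positions are simply equal indices.

```python
def replace_ends(assembly, reference, replace_length=55):
    """Overwrite the first and last replace_length bases with the reference."""
    if replace_length < 0:
        raise ValueError("replace_length must be non-negative")
    if len(assembly) != len(reference):
        # Simplifying assumption of this sketch only.
        raise ValueError("sketch assumes equal-length, pre-aligned sequences")
    n = replace_length
    if n == 0:
        return assembly
    if 2 * n >= len(assembly):
        # The two replaced ends cover the whole segment.
        return reference
    end = len(assembly) - n
    return reference[:n] + assembly[n:end] + reference[end:]


print(replace_ends("NNNNACGTNNNN", "AAAACCCCGGGG", replace_length=4))
# prints AAAAACGTGGGG
```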

assemble_denovo.scaffold.scaffold_min_contig_len
Int? — Default: None
Any sequences in contigs_fasta that are shorter than this length will be ignored for scaffolding.

assemble_denovo.scaffold.scaffold_min_pct_contig_aligned
Float? — Default: None
Any contig alignments to the reference scaffold that account for less than this fraction of the contig's length will be rejected for scaffolding. Valid values are fractions from 0 to 1; the default value is 0.3.
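The aligned-fraction filter can be expressed as a one-line predicate. A minimal sketch, assuming the aligned length of each contig is already known; keep_alignment is a hypothetical name.

```python
def keep_alignment(contig_length, aligned_length, min_pct_contig_aligned=0.3):
    """Return True if the alignment covers enough of the contig to be kept.

    Alignments covering less than the given fraction are rejected.
    """
    if contig_length <= 0:
        return False
    return aligned_length / contig_length >= min_pct_contig_aligned


print(keep_alignment(1000, 200))  # 20% of the contig aligned
print(keep_alignment(1000, 300))  # exactly 30% aligned
```

With the documented default of 0.3, the first alignment is rejected and the second is kept.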

4.4.1.4. Other inputs

assemble_denovo.assemble.docker
String — Default: "quay.io/broadinstitute/viral-assemble:2.1.33.0"
???

assemble_denovo.assemble.machine_mem_gb
Int? — Default: None
???

assemble_denovo.assemble.spades_min_contig_len
Int? — Default: None
Minimum length of output contig.

assemble_denovo.assemble.spades_options
String? — Default: None
Additional options to pass to the SPAdes assembler.

assemble_denovo.deplete_blastDbs
Array[File] — Default: []
Optional list of databases to use for blastn-based depletion. Sequences in fasta format will be indexed on the fly; pre-blast-indexed databases may be provided as tarballs.

assemble_denovo.deplete_bmtaggerDbs
Array[File] — Default: []
Optional list of databases to use for bmtagger-based depletion. Sequences in fasta format will be indexed on the fly; pre-bmtagger-indexed databases may be provided as tarballs.

assemble_denovo.deplete_bwaDbs
Array[File] — Default: []
Optional list of databases to use for bwa mem-based depletion. Sequences in fasta format will be indexed on the fly; pre-bwa-indexed databases may be provided as tarballs.

assemble_denovo.deplete_taxa.clear_tags
Boolean? — Default: false
???

assemble_denovo.deplete_taxa.cpu
Int? — Default: 8
???

assemble_denovo.deplete_taxa.docker
String — Default: "quay.io/broadinstitute/viral-classify:2.1.33.0"
???

assemble_denovo.deplete_taxa.machine_mem_gb
Int? — Default: None
???

assemble_denovo.deplete_taxa.query_chunk_size
Int? — Default: None
???

assemble_denovo.deplete_taxa.tags_to_clear_space_separated
String? — Default: "XT X0 X1 XA AM SM BQ CT XN OC OP"
???

assemble_denovo.filter_to_taxon.docker
String — Default: "quay.io/broadinstitute/viral-classify:2.1.33.0"
???

assemble_denovo.filter_to_taxon.error_on_reads_in_neg_control
Boolean? — Default: false
???

assemble_denovo.filter_to_taxon.machine_mem_gb
Int? — Default: None
???

assemble_denovo.filter_to_taxon.neg_control_prefixes_space_separated
String? — Default: "neg water NTC"
???

assemble_denovo.filter_to_taxon.negative_control_reads_threshold
Int? — Default: 0
???

assemble_denovo.filter_to_taxon_db
File? — Default: None
Optional database to use to filter read set to those that match by LASTAL. Sequences in fasta format will be indexed on the fly.

assemble_denovo.merge_cleaned_reads.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.merge_cleaned_reads.reheader_table
File? — Default: None
???

assemble_denovo.merge_cleaned_reads.sample_name
String? — Default: None
???

assemble_denovo.merge_dedup_reads.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.merge_dedup_reads.reheader_table
File? — Default: None
???

assemble_denovo.merge_dedup_reads.sample_name
String? — Default: None
???

assemble_denovo.merge_taxfilt_reads.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.merge_taxfilt_reads.reheader_table
File? — Default: None
???

assemble_denovo.merge_taxfilt_reads.sample_name
String? — Default: None
???

assemble_denovo.out_basename
String — Default: basename(basename(reads_unmapped_bams[0],".bam"),".cleaned")
A filename-friendly basename for output files.
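The default expression above strips the directory, then the ".bam" suffix, then the ".cleaned" suffix from the first input BAM. A rough Python equivalent of the nested basename calls is sketched here; wdl_basename is a hypothetical helper that approximates the WDL function, and the path is a made-up example.

```python
import posixpath


def wdl_basename(path, suffix=""):
    """Approximate WDL basename(): drop the directory, then an optional suffix."""
    name = posixpath.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


def default_out_basename(reads_unmapped_bams):
    """Mirror basename(basename(reads_unmapped_bams[0], ".bam"), ".cleaned")."""
    return wdl_basename(wdl_basename(reads_unmapped_bams[0], ".bam"), ".cleaned")


print(default_out_basename(["/data/run1/sample1.cleaned.bam"]))
# prints sample1
```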

assemble_denovo.refine.align_to_ref.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.align_to_ref.machine_mem_gb
Int? — Default: None
???

assemble_denovo.refine.align_to_ref.sample_name
String — Default: basename(basename(basename(reads_unmapped_bam,".bam"),".taxfilt"),".clean")
???

assemble_denovo.refine.align_to_ref_options
Map[String,String] — Default: {"novoalign": "-r Random -l 40 -g 40 -x 20 -t 501 -k", "bwa": "-k 12 -B 1", "minimap2": ""}
???

assemble_denovo.refine.align_to_self.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.align_to_self.machine_mem_gb
Int? — Default: None
???

assemble_denovo.refine.align_to_self.sample_name
String — Default: basename(basename(basename(reads_unmapped_bam,".bam"),".taxfilt"),".clean")
???

assemble_denovo.refine.align_to_self_options
Map[String,String] — Default: {"novoalign": "-r Random -l 40 -g 40 -x 20 -t 100", "bwa": "", "minimap2": ""}
???

assemble_denovo.refine.aligner
String — Default: "minimap2"
Read aligner software to use. Options: novoalign, bwa, minimap2. Minimap2 can automatically handle Illumina, PacBio, or Oxford Nanopore reads as long as the 'PL' field in the BAM read group header is set properly (novoalign and bwa are Illumina-only).

assemble_denovo.refine.alignment_metrics.amplicon_set
String? — Default: None
???

assemble_denovo.refine.alignment_metrics.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.alignment_metrics.machine_mem_gb
Int? — Default: None
???

assemble_denovo.refine.alignment_metrics.max_amp_len
Int? — Default: 5000
???

assemble_denovo.refine.alignment_metrics.max_amplicons
Int? — Default: 500
???

assemble_denovo.refine.call_consensus.docker
String — Default: "quay.io/broadinstitute/viral-assemble:2.1.33.0"
???

assemble_denovo.refine.call_consensus.machine_mem_gb
Int? — Default: None
???

assemble_denovo.refine.isnvs_ref.docker
String — Default: "quay.io/biocontainers/lofreq:2.1.5--py38h588ecb2_4"
???

assemble_denovo.refine.isnvs_ref.out_basename
String — Default: basename(aligned_bam,'.bam')
???

assemble_denovo.refine.isnvs_self.docker
String — Default: "quay.io/biocontainers/lofreq:2.1.5--py38h588ecb2_4"
???

assemble_denovo.refine.isnvs_self.out_basename
String — Default: basename(aligned_bam,'.bam')
???

assemble_denovo.refine.ivar_trim.bam_basename
String — Default: basename(aligned_bam,".bam")
???

assemble_denovo.refine.ivar_trim.disk_size
Int — Default: 375
???

assemble_denovo.refine.ivar_trim.docker
String — Default: "andersenlabapps/ivar:1.3.1"
???

assemble_denovo.refine.ivar_trim.machine_mem_gb
Int? — Default: None
???

assemble_denovo.refine.ivar_trim.primer_offset
Int? — Default: None
???

assemble_denovo.refine.major_cutoff
Float — Default: 0.75
If the major allele is present at a frequency higher than this cutoff, we will call an unambiguous base at that position. If it is equal to or below this cutoff, we will call an ambiguous base representing all possible alleles at that position.
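The cutoff rule can be sketched for a single position as follows. This is an illustration under stated assumptions, not the consensus caller used by the workflow: call_base is a hypothetical helper, per-base read counts are assumed to be available, and "all possible alleles" is taken to mean every base observed at least once.

```python
# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
CODE_FOR_BASES = {frozenset(bases): code for code, bases in IUPAC.items()}


def call_base(counts, major_cutoff=0.75):
    """Call one consensus base from a dict of per-base read counts."""
    total = sum(counts.values())
    if total == 0:
        return "N"  # Assumption: no reads means a fully ambiguous call.
    major, major_count = max(counts.items(), key=lambda kv: kv[1])
    if major_count / total > major_cutoff:
        return major  # Strictly above the cutoff: unambiguous call.
    observed = frozenset(base for base, n in counts.items() if n > 0)
    return CODE_FOR_BASES[observed]


print(call_base({"A": 9, "G": 1}))  # 0.90 is above 0.75
print(call_base({"A": 3, "G": 1}))  # 0.75 is not above 0.75
```

The first call yields A, and the second yields R, the IUPAC code for A or G, because a frequency equal to the cutoff is treated as ambiguous.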

assemble_denovo.refine.merge_align_to_ref.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.merge_align_to_ref.reheader_table
File? — Default: None
???

assemble_denovo.refine.merge_align_to_self.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.merge_align_to_self.reheader_table
File? — Default: None
???

assemble_denovo.refine.min_coverage
Int — Default: 3
Minimum read coverage required to call a position unambiguous.

assemble_denovo.refine.novocraft_license
File? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.base_q_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.bin_large_plots
Boolean — Default: false
???

assemble_denovo.refine.plot_ref_coverage.binning_summary_statistic
String? — Default: "max"
???

assemble_denovo.refine.plot_ref_coverage.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.plot_ref_coverage.mapping_q_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.max_coverage_depth
Int? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.plot_height_pixels
Int? — Default: 850
???

assemble_denovo.refine.plot_ref_coverage.plot_only_non_duplicates
Boolean — Default: false
???

assemble_denovo.refine.plot_ref_coverage.plot_pixels_per_inch
Int? — Default: 100
???

assemble_denovo.refine.plot_ref_coverage.plot_width_pixels
Int? — Default: 1100
???

assemble_denovo.refine.plot_ref_coverage.plotXLimits
String? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.plotYLimits
String? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.read_length_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_ref_coverage.skip_mark_dupes
Boolean — Default: false
???

assemble_denovo.refine.plot_self_coverage.base_q_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_self_coverage.bin_large_plots
Boolean — Default: false
???

assemble_denovo.refine.plot_self_coverage.binning_summary_statistic
String? — Default: "max"
???

assemble_denovo.refine.plot_self_coverage.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.plot_self_coverage.mapping_q_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_self_coverage.max_coverage_depth
Int? — Default: None
???

assemble_denovo.refine.plot_self_coverage.plot_height_pixels
Int? — Default: 850
???

assemble_denovo.refine.plot_self_coverage.plot_only_non_duplicates
Boolean — Default: false
???

assemble_denovo.refine.plot_self_coverage.plot_pixels_per_inch
Int? — Default: 100
???

assemble_denovo.refine.plot_self_coverage.plot_width_pixels
Int? — Default: 1100
???

assemble_denovo.refine.plot_self_coverage.plotXLimits
String? — Default: None
???

assemble_denovo.refine.plot_self_coverage.plotYLimits
String? — Default: None
???

assemble_denovo.refine.plot_self_coverage.read_length_threshold
Int? — Default: None
???

assemble_denovo.refine.plot_self_coverage.skip_mark_dupes
Boolean — Default: false
???

assemble_denovo.refine.run_discordance.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.refine.skip_mark_dupes
Boolean — Default: false
Skip the Picard MarkDuplicates step after alignment. Setting this to true is recommended for PCR amplicon-based data. (Default: false)

assemble_denovo.rename_fasta_header.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.rename_fasta_header.out_basename
String — Default: basename(genome_fasta,".fasta")
???

assemble_denovo.renamed_reads.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.renamed_reads.reheader_table
File? — Default: None
???

assemble_denovo.rmdup_ubam.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_denovo.rmdup_ubam.machine_mem_gb
Int? — Default: None
???

assemble_denovo.rmdup_ubam.method
String — Default: "mvicuna"
Read deduplication method to use. Options: mvicuna, cdhit.

assemble_denovo.sample_original_name
String? — Default: None
A (possibly filename-unfriendly) sample name for fasta and bam headers.

assemble_denovo.scaffold.aligner
String — Default: "muscle"
Alignment tool used to align the reference sequence to the aligned contigs. Options: muscle, mafft, mummer (= nucmer). Default: muscle.

assemble_denovo.scaffold.docker
String — Default: "quay.io/broadinstitute/viral-assemble:2.1.33.0"
???

assemble_denovo.scaffold.machine_mem_gb
Int? — Default: None
???

assemble_denovo.scaffold.sample_name
String — Default: basename(basename(contigs_fasta,".fasta"),".assembly1-spades")
???

4.4.2. Outputs

assemble_denovo.aligned_bam
File
???

assemble_denovo.aligned_only_reads_bam
File
???

assemble_denovo.aligned_only_reads_fastqc
File
???

assemble_denovo.assemble_viral_assemble_version
String
???

assemble_denovo.assembly_length
Int
???

assemble_denovo.assembly_length_unambiguous
Int
???

assemble_denovo.assembly_method
String
???

assemble_denovo.assembly_preimpute_length
Int
???

assemble_denovo.assembly_preimpute_length_unambiguous
Int
???

assemble_denovo.bases_aligned
Float
???

assemble_denovo.cleaned_bam
File
???

assemble_denovo.cleaned_fastqc
File
???

assemble_denovo.contigs_fasta
File
???

assemble_denovo.coverage_plot
File
???

assemble_denovo.coverage_tsv
File
???

assemble_denovo.dedup_bam
File
???

assemble_denovo.dedup_fastqc
File
???

assemble_denovo.dedup_read_count_post
Int
???

assemble_denovo.depletion_read_count_post
Int
???

assemble_denovo.filter_read_count_post
Int
???

assemble_denovo.final_assembly_fasta
File
???

assemble_denovo.intermediate_gapfill_fasta
File
???

assemble_denovo.intermediate_scaffold_fasta
File
???

assemble_denovo.isnvs_vcf
File
???

assemble_denovo.mean_coverage
Float
???

assemble_denovo.num_libraries
Int
???

assemble_denovo.num_read_groups
Int
???

assemble_denovo.read_pairs_aligned
Int
???

assemble_denovo.reads_aligned
Int
???

assemble_denovo.replicate_concordant_sites
Int
???

assemble_denovo.replicate_discordant_indels
Int
???

assemble_denovo.replicate_discordant_snps
Int
???

assemble_denovo.replicate_discordant_vcf
File
???

assemble_denovo.scaffold_fasta
File
???

assemble_denovo.scaffold_viral_assemble_version
String
???

assemble_denovo.scaffolding_alt_contigs
File
???

assemble_denovo.scaffolding_chosen_ref_names
Array[String]
???

assemble_denovo.scaffolding_stats
File
???

assemble_denovo.subsampBam
File
???

assemble_denovo.subsample_read_count
Int
???

assemble_denovo.taxfilt_bam
File
???

assemble_denovo.taxfilt_fastqc
File
???


Generated using WDL AID (1.0.0)