4.5. assemble_refbased

Reference-based microbial consensus calling. Aligns NGS reads to a single reference genome, calls a new consensus sequence, and emits the new assembly, the reads aligned to the provided reference, the reads aligned to the new assembly, and various figures of merit, plots, and QC metrics. The user may provide unaligned reads spread across multiple input files; the workflow parallelizes alignment per input file, then merges the results before consensus calling.

4.5.1. Inputs

4.5.1.1. Required inputs

assemble_refbased.reads_unmapped_bams
Array[File]+ — Default: None
Unaligned reads in BAM format

assemble_refbased.reference_fasta
File — Default: None
Reference genome to align reads to.
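As a minimal sketch, the two required inputs can be written to an inputs JSON file such as the ones commonly passed to WDL execution engines. The file paths below are hypothetical placeholders; only the input names come from this page.

```python
import json

# Hypothetical file locations; replace with your own paths or cloud URIs.
inputs = {
    "assemble_refbased.reads_unmapped_bams": [
        "data/sample1.lane1.bam",
        "data/sample1.lane2.bam",
    ],
    "assemble_refbased.reference_fasta": "refs/reference.fasta",
}

# Array[File]+ means the list must contain at least one file.
assert len(inputs["assemble_refbased.reads_unmapped_bams"]) >= 1

inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

Each BAM listed in reads_unmapped_bams is aligned separately before the results are merged, so splitting reads across several files is what allows the alignment step to run in parallel.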

4.5.1.2. Other common inputs

assemble_refbased.sample_name
String — Default: basename(reads_unmapped_bams[0],'.bam')
Base name of output files. The 'SM' field in BAM read group headers is also rewritten to this value. Avoid spaces and other filename-unfriendly characters.
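The default value can be reproduced outside the workflow with a small helper. This is a sketch that approximates the WDL basename() function in Python; the helper name wdl_basename and the example paths are hypothetical.

```python
import posixpath

def wdl_basename(path: str, suffix: str = "") -> str:
    """Approximate WDL's basename(): drop directories, then one trailing suffix."""
    name = posixpath.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name

# Hypothetical input list with two files for the same sample.
reads_unmapped_bams = [
    "gs://bucket/run1/sampleA.l1.bam",
    "gs://bucket/run1/sampleA.l2.bam",
]

# Only the first file contributes to the default sample name.
default_sample_name = wdl_basename(reads_unmapped_bams[0], ".bam")
print(default_sample_name)  # sampleA.l1
```

Because the default is derived from the first input file alone, setting sample_name explicitly is likely the safer choice when several BAM files are supplied.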

assemble_refbased.trim_coords_bed
File? — Default: None
Optional primers to trim, in reference coordinate space (0-based BED format).
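BED intervals use a 0-based start and an exclusive end, which differs from the 1-based positions shown by many genome viewers. The sketch below parses one BED line and converts it to 1-based inclusive coordinates; the primer line and the function names are made up for illustration.

```python
from typing import Tuple

def parse_bed_line(line: str) -> Tuple[str, int, int]:
    """Return (chrom, start, end) from the first three tab-separated BED columns."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)

def bed_to_one_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, end-exclusive interval to 1-based, end-inclusive."""
    return start + 1, end

# Hypothetical primer entry: covers reference bases 30 through 54 (1-based).
chrom, start, end = parse_bed_line("ref_seq\t29\t54\tprimer_1_LEFT")
first_base, last_base = bed_to_one_based(start, end)
primer_length = end - start
print(chrom, first_base, last_base, primer_length)
```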

4.5.1.3. Advanced inputs

assemble_refbased.call_consensus.mark_duplicates
Boolean — Default: false
Mark duplicates instead of removing them.

assemble_refbased.ivar_trim.min_keep_length
Int? — Default: None
Minimum length of read to retain after trimming (ivar default when unset: 30)

assemble_refbased.ivar_trim.min_quality
Int? — Default: 1
Minimum quality threshold for sliding window to pass (ivar default: 20; this workflow sets 1 unless overridden)

assemble_refbased.ivar_trim.sliding_window
Int? — Default: None
Width of sliding window for quality trimming (ivar default when unset: 4)
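To show how the three trimming parameters interact, here is a generic sliding-window quality trim. It is an illustration of the technique only and is not taken from ivar's source code; the function name and the cut rule (truncate at the first window whose mean quality falls below the threshold) are assumptions.

```python
from typing import List, Optional

def sliding_window_trim(
    quals: List[int],
    sliding_window: int = 4,    # value quoted in the parameter description
    min_quality: int = 20,      # value quoted in the parameter description
    min_keep_length: int = 30,  # value quoted in the parameter description
) -> Optional[int]:
    """Return how many leading bases to keep, or None if the read is dropped."""
    keep = len(quals)
    for i in range(0, max(len(quals) - sliding_window + 1, 0)):
        window = quals[i:i + sliding_window]
        # Assumed rule: cut at the first window whose mean quality is too low.
        if sum(window) / len(window) < min_quality:
            keep = i
            break
    return keep if keep >= min_keep_length else None

# A 40-base read whose last five bases have very low quality.
quals = [30] * 35 + [2] * 5
print(sliding_window_trim(quals))                      # trimmed near the low-quality tail
print(sliding_window_trim(quals, min_quality=1))       # threshold of 1 leaves the read intact
print(sliding_window_trim(quals, min_keep_length=35))  # too short after trimming
```

Under this sketch, a threshold of 1 removes almost nothing, since a window is cut only when its mean quality is below 1.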

4.5.1.4. Other inputs

assemble_refbased.align_to_ref.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.align_to_ref.machine_mem_gb
Int? — Default: None
???

assemble_refbased.align_to_ref.sample_name
String — Default: basename(basename(basename(reads_unmapped_bam,".bam"),".taxfilt"),".clean")
???

assemble_refbased.align_to_ref_options
Map[String,String] — Default: {"novoalign": "-r Random -l 40 -g 40 -x 20 -t 501 -k", "bwa": "-k 12 -B 1", "minimap2": ""}
???

assemble_refbased.align_to_self.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.align_to_self.machine_mem_gb
Int? — Default: None
???

assemble_refbased.align_to_self.sample_name
String — Default: basename(basename(basename(reads_unmapped_bam,".bam"),".taxfilt"),".clean")
???

assemble_refbased.align_to_self_options
Map[String,String] — Default: {"novoalign": "-r Random -l 40 -g 40 -x 20 -t 100", "bwa": "", "minimap2": ""}
???

assemble_refbased.aligner
String — Default: "minimap2"
Read aligner software to use. Options: novoalign, bwa, minimap2. Minimap2 can automatically handle Illumina, PacBio, or Oxford Nanopore reads as long as the 'PL' field in the BAM read group header is set properly (novoalign and bwa are Illumina-only).
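Since aligner choice depends on the 'PL' field, it can help to inspect the read group headers before launching the workflow. The sketch below works on SAM header text (for example, the output of a header dump tool); the helper names and the example header are hypothetical, and the compatibility rule only encodes what the description above states.

```python
from typing import List

# Per the description above, these aligners handle Illumina reads only.
ILLUMINA_ONLY_ALIGNERS = {"novoalign", "bwa"}

def read_group_platforms(sam_header: str) -> List[str]:
    """Collect the PL value of every @RG line ('' when PL is missing)."""
    platforms = []
    for line in sam_header.splitlines():
        if not line.startswith("@RG"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        platforms.append(fields.get("PL", ""))
    return platforms

def aligner_is_compatible(aligner: str, platforms: List[str]) -> bool:
    """Check an aligner choice against the platforms found in the header."""
    if aligner in ILLUMINA_ONLY_ALIGNERS:
        return all(p.upper() == "ILLUMINA" for p in platforms)
    # minimap2 needs PL to be set on every read group.
    return aligner == "minimap2" and all(p != "" for p in platforms)

# Hypothetical header with a single Oxford Nanopore read group.
header = "@HD\tVN:1.6\tSO:unsorted\n@RG\tID:rg1\tSM:sampleA\tPL:ONT\n"
platforms = read_group_platforms(header)
print(platforms, aligner_is_compatible("bwa", platforms),
      aligner_is_compatible("minimap2", platforms))
```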

assemble_refbased.alignment_metrics.amplicon_set
String? — Default: None
???

assemble_refbased.alignment_metrics.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.alignment_metrics.machine_mem_gb
Int? — Default: None
???

assemble_refbased.alignment_metrics.max_amp_len
Int? — Default: 5000
???

assemble_refbased.alignment_metrics.max_amplicons
Int? — Default: 500
???

assemble_refbased.call_consensus.docker
String — Default: "quay.io/broadinstitute/viral-assemble:2.1.33.0"
???

assemble_refbased.call_consensus.machine_mem_gb
Int? — Default: None
???

assemble_refbased.isnvs_ref.docker
String — Default: "quay.io/biocontainers/lofreq:2.1.5--py38h588ecb2_4"
???

assemble_refbased.isnvs_ref.out_basename
String — Default: basename(aligned_bam,'.bam')
???

assemble_refbased.isnvs_self.docker
String — Default: "quay.io/biocontainers/lofreq:2.1.5--py38h588ecb2_4"
???

assemble_refbased.isnvs_self.out_basename
String — Default: basename(aligned_bam,'.bam')
???

assemble_refbased.ivar_trim.bam_basename
String — Default: basename(aligned_bam,".bam")
???

assemble_refbased.ivar_trim.disk_size
Int — Default: 375
???

assemble_refbased.ivar_trim.docker
String — Default: "andersenlabapps/ivar:1.3.1"
???

assemble_refbased.ivar_trim.machine_mem_gb
Int? — Default: None
???

assemble_refbased.ivar_trim.primer_offset
Int? — Default: None
???

assemble_refbased.major_cutoff
Float — Default: 0.75
If the major allele is present at a frequency higher than this cutoff, we will call an unambiguous base at that position. If it is equal to or below this cutoff, we will call an ambiguous base representing all possible alleles at that position.
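The rule described above can be sketched for a single position as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: it considers only the four nucleotide alleles, ignores indels, and the function name call_base is hypothetical. The ambiguity codes are the standard IUPAC nucleotide codes.

```python
from typing import Dict

# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def call_base(allele_counts: Dict[str, int], major_cutoff: float = 0.75) -> str:
    """Call one consensus base from per-allele read counts (keys in A/C/G/T)."""
    observed = {base: n for base, n in allele_counts.items() if n > 0}
    depth = sum(observed.values())
    if depth == 0:
        return "N"  # assumption: no reads means no call
    major, major_n = max(observed.items(), key=lambda kv: kv[1])
    # Strictly higher than the cutoff gives an unambiguous call.
    if major_n / depth > major_cutoff:
        return major
    # Otherwise emit the code covering every allele seen at this position.
    return IUPAC[frozenset(observed)]

print(call_base({"A": 80, "G": 20}))  # A
print(call_base({"A": 75, "G": 25}))  # R
```

Note that a frequency exactly equal to the cutoff yields an ambiguous call, matching the "equal to or below" wording.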

assemble_refbased.merge_align_to_ref.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.merge_align_to_ref.reheader_table
File? — Default: None
???

assemble_refbased.merge_align_to_self.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.merge_align_to_self.reheader_table
File? — Default: None
???

assemble_refbased.min_coverage
Int — Default: 3
Minimum read coverage required to call a position unambiguous.
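A coverage threshold of this kind can be pictured as a mask over the consensus. The sketch below assumes that positions with depth below min_coverage are written as N; the function name and the per-position depth list are hypothetical.

```python
from typing import List

def mask_low_coverage(consensus: str, depths: List[int], min_coverage: int = 3) -> str:
    """Replace bases whose read depth is below min_coverage with N."""
    if len(consensus) != len(depths):
        raise ValueError("consensus and depths must have the same length")
    return "".join(
        base if depth >= min_coverage else "N"
        for base, depth in zip(consensus, depths)
    )

# Depth 3 meets the default threshold; depths 2 and 0 do not.
print(mask_low_coverage("ACGT", [5, 3, 2, 0]))  # ACNN
```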

assemble_refbased.novocraft_license
File? — Default: None
???

assemble_refbased.plot_ref_coverage.base_q_threshold
Int? — Default: None
???

assemble_refbased.plot_ref_coverage.bin_large_plots
Boolean — Default: false
???

assemble_refbased.plot_ref_coverage.binning_summary_statistic
String? — Default: "max"
???

assemble_refbased.plot_ref_coverage.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.plot_ref_coverage.mapping_q_threshold
Int? — Default: None
???

assemble_refbased.plot_ref_coverage.max_coverage_depth
Int? — Default: None
???

assemble_refbased.plot_ref_coverage.plot_height_pixels
Int? — Default: 850
???

assemble_refbased.plot_ref_coverage.plot_only_non_duplicates
Boolean — Default: false
???

assemble_refbased.plot_ref_coverage.plot_pixels_per_inch
Int? — Default: 100
???

assemble_refbased.plot_ref_coverage.plot_width_pixels
Int? — Default: 1100
???

assemble_refbased.plot_ref_coverage.plotXLimits
String? — Default: None
???

assemble_refbased.plot_ref_coverage.plotYLimits
String? — Default: None
???

assemble_refbased.plot_ref_coverage.read_length_threshold
Int? — Default: None
???

assemble_refbased.plot_ref_coverage.skip_mark_dupes
Boolean — Default: false
???

assemble_refbased.plot_self_coverage.base_q_threshold
Int? — Default: None
???

assemble_refbased.plot_self_coverage.bin_large_plots
Boolean — Default: false
???

assemble_refbased.plot_self_coverage.binning_summary_statistic
String? — Default: "max"
???

assemble_refbased.plot_self_coverage.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.plot_self_coverage.mapping_q_threshold
Int? — Default: None
???

assemble_refbased.plot_self_coverage.max_coverage_depth
Int? — Default: None
???

assemble_refbased.plot_self_coverage.plot_height_pixels
Int? — Default: 850
???

assemble_refbased.plot_self_coverage.plot_only_non_duplicates
Boolean — Default: false
???

assemble_refbased.plot_self_coverage.plot_pixels_per_inch
Int? — Default: 100
???

assemble_refbased.plot_self_coverage.plot_width_pixels
Int? — Default: 1100
???

assemble_refbased.plot_self_coverage.plotXLimits
String? — Default: None
???

assemble_refbased.plot_self_coverage.plotYLimits
String? — Default: None
???

assemble_refbased.plot_self_coverage.read_length_threshold
Int? — Default: None
???

assemble_refbased.plot_self_coverage.skip_mark_dupes
Boolean — Default: false
???

assemble_refbased.run_discordance.docker
String — Default: "quay.io/broadinstitute/viral-core:2.1.33"
???

assemble_refbased.skip_mark_dupes
Boolean — Default: false
Skip the Picard MarkDuplicates step after alignment. Setting this to true is recommended for PCR amplicon-based data.
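For amplicon data, this recommendation can be folded into a small helper that builds the inputs. This is a sketch only: the helper build_inputs and the file paths are hypothetical, and treating "a primer BED file was supplied" as a sign of amplicon data is an assumption made for the example.

```python
import json
from typing import List, Optional

def build_inputs(
    bams: List[str],
    reference_fasta: str,
    trim_coords_bed: Optional[str] = None,
) -> dict:
    """Assemble workflow inputs; enable amplicon settings when primers are given."""
    inputs = {
        "assemble_refbased.reads_unmapped_bams": bams,
        "assemble_refbased.reference_fasta": reference_fasta,
    }
    if trim_coords_bed is not None:
        # Assumption: a primer BED file implies PCR amplicon data.
        inputs["assemble_refbased.trim_coords_bed"] = trim_coords_bed
        inputs["assemble_refbased.skip_mark_dupes"] = True
    return inputs

amplicon_inputs = build_inputs(
    ["data/sample1.bam"], "refs/reference.fasta", "refs/primers.bed"
)
print(json.dumps(amplicon_inputs, indent=2))
```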

4.5.2. Outputs

assemble_refbased.align_to_ref_fastqc
File
???

assemble_refbased.align_to_ref_isnvs_vcf
File
???

assemble_refbased.align_to_ref_merged_aligned_trimmed_only_bam
File
???

assemble_refbased.align_to_ref_merged_bases_aligned
Float
???

assemble_refbased.align_to_ref_merged_coverage_plot
File
???

assemble_refbased.align_to_ref_merged_coverage_tsv
File
???

assemble_refbased.align_to_ref_merged_read_pairs_aligned
Int
???

assemble_refbased.align_to_ref_merged_reads_aligned
Int
???

assemble_refbased.align_to_ref_per_input_aligned_flagstat
Array[File]
???

assemble_refbased.align_to_ref_per_input_reads_aligned
Array[Int]
???

assemble_refbased.align_to_ref_per_input_reads_provided
Array[Int]
???

assemble_refbased.align_to_ref_variants_vcf_gz
File
All variants in the input reads against the original reference genome. This VCF file is used to create assembly_fasta.

assemble_refbased.align_to_ref_viral_core_version
String
???

assemble_refbased.align_to_self_isnvs_vcf
File
???

assemble_refbased.align_to_self_merged_aligned_and_unaligned_bam
Array[File]
???

assemble_refbased.align_to_self_merged_aligned_only_bam
File
???

assemble_refbased.align_to_self_merged_bases_aligned
Float
???

assemble_refbased.align_to_self_merged_coverage_plot
File
???

assemble_refbased.align_to_self_merged_coverage_tsv
File
???

assemble_refbased.align_to_self_merged_mean_coverage
Float
???

assemble_refbased.align_to_self_merged_read_pairs_aligned
Int
???

assemble_refbased.align_to_self_merged_reads_aligned
Int
???

assemble_refbased.assembly_fasta
File
The new assembly / consensus sequence for this sample

assemble_refbased.assembly_length
Int
The length of the sequence described in assembly_fasta, inclusive of any uncovered regions denoted by Ns

assemble_refbased.assembly_length_unambiguous
Int
The number of called consensus bases in assembly_fasta (excludes regions of the genome that lack read coverage)
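The difference between the two length outputs can be checked by hand on a FASTA file. The sketch below assumes a single-sequence FASTA and treats "unambiguous" as A, C, G, or T only, which is an interpretation of the descriptions above rather than a confirmed definition; the function name is hypothetical.

```python
from typing import Tuple

def assembly_lengths(fasta_text: str) -> Tuple[int, int]:
    """Return (total length, count of A/C/G/T bases) for a one-sequence FASTA."""
    sequence = "".join(
        line.strip() for line in fasta_text.splitlines() if not line.startswith(">")
    ).upper()
    unambiguous = sum(1 for base in sequence if base in "ACGT")
    return len(sequence), unambiguous

# Ten bases in total: four Ns and one ambiguity code (R) are not counted.
length, length_unambiguous = assembly_lengths(">sampleA\nACGTNNNN\nRA\n")
print(length, length_unambiguous)  # 10 5
```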

assemble_refbased.assembly_mean_coverage
Float
???

assemble_refbased.assembly_method
String
???

assemble_refbased.dist_to_ref_indels
Int
???

assemble_refbased.dist_to_ref_snps
Int
???

assemble_refbased.ivar_trim_stats
Array[Map[String,String]]
???

assemble_refbased.ivar_trim_stats_tsv
Array[Array[String]]
???

assemble_refbased.ivar_version
String
???

assemble_refbased.num_libraries
Int
???

assemble_refbased.num_read_groups
Int
???

assemble_refbased.picard_metrics_alignment
File
???

assemble_refbased.picard_metrics_insert_size
File
???

assemble_refbased.picard_metrics_wgs
File
???

assemble_refbased.primer_trimmed_read_count
Array[Int]
???

assemble_refbased.primer_trimmed_read_percent
Array[Float]
???

assemble_refbased.reference_genome_length
Int
???

assemble_refbased.replicate_concordant_sites
Int
???

assemble_refbased.replicate_discordant_indels
Int
???

assemble_refbased.replicate_discordant_snps
Int
???

assemble_refbased.replicate_discordant_vcf
File
???

assemble_refbased.samtools_ampliconstats
File
???

assemble_refbased.samtools_ampliconstats_parsed
File
???

assemble_refbased.viral_assemble_version
String
???


Generated using WDL AID (1.0.0)