DRAGEN Microbial Enrichment Plus
Description
DRAGEN Microbial Enrichment Plus (DME+), formerly known as the Explify Analysis Pipeline, offers a dedicated informatics solution with flexible analysis options for the following Illumina Infectious Disease and Microbiology target-capture enrichment panel kits: the Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit (RPIP), Illumina Urinary Pathogen ID/AMR Enrichment Panel Kit (UPIP), and Illumina Viral Surveillance Panel V2 Kit (VSP V2). The application delivers easy-to-use, powerful secondary analysis of Illumina sequencing data, with workflows for sample QC, viral WGS (whole-genome sequencing), pathogen detection and quantification, and antimicrobial resistance (AMR) marker profiling. It also supports custom reference sequence analysis.
RPIP: Target-capture enrichment of >280 RNA and DNA respiratory pathogens, including SARS-CoV-2, Influenza viruses, Respiratory syncytial virus, Mycobacterium and Legionella species, and >4000 AMR markers.
UPIP: Target-capture enrichment of >170 genitourinary pathogens, including fastidious, slow-growing, and anaerobic uropathogens, sexually transmitted microorganisms, and >4000 bacterial AMR markers.
VSP V2: Target-capture enrichment for whole-genome sequencing (WGS) of 200 RNA and DNA viruses prioritized as high-risk to public health, zoonotic surveillance, and biotech, and >200 viral AMR markers.
Custom: Analyze FASTQ/FASTA read files with a custom reference sequence database.
Note that samples enriched using the Illumina Respiratory Virus Oligo Panel/Respiratory Virus Enrichment Kit (RVOP/RVEK) and Viral Surveillance Panel Kit (VSP) can also be analyzed using DME+ and the VSP V2 database.
Pipeline Steps
The following table describes the different steps performed by the pipeline, which steps apply to each panel, and whether the step is run when using a set of custom references.
| Step | Description | Panels | Custom References |
| --- | --- | --- | --- |
| Read QC | Low-quality bases are trimmed. Short and low-quality reads are discarded. It is assumed that appropriate adapter trimming has already been performed. Can be disabled. | All | Yes |
| Post-QC FASTQ Generation | Optionally creates a single FASTQ with the trimmed reads, or a set of kingdom-specific FASTQs with the trimmed reads. Disabled by default. | All | Yes |
| Dehosting | Removes human reads. | All | Yes |
| Sample QC | Sample composition analysis and enrichment factor calculation (which requires an internal control). | All | No |
| Microorganism Classification | K-mer-based analysis with configurable sensitivity. | VSP V2 | No |
| Microorganism Detection | Alignment-based analysis and consensus generation. | All | Yes |
| Microorganism Quantification | Requires an internal control. | All | No |
| Bacterial AMR Marker Analysis | Nucleotide and protein alignment, consensus generation, variant calling and annotation. | RPIP, UPIP | No |
| Viral Variant Calling | Detects variants from alignment results. | RPIP, VSP V2 | No |
| Viral AMR Marker Analysis | Variant calling and annotation. | RPIP, VSP V2 | No |
| Report Generation | Creates the AP JSON. | All | Yes |
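The step table can also be kept as a small lookup when scripting around the pipeline. The sketch below only transcribes the table; the names `PIPELINE_STEPS` and `steps_for` are illustrative and not part of DME+. It reports which steps apply to a panel, not which ones actually run, since some steps can be disabled or need an internal control.

```python
# Transcription of the pipeline steps table. All names here are illustrative.
ALL_PANELS = ("RPIP", "UPIP", "VSP V2")

# (step name, panels the step applies to, applies to custom references?)
PIPELINE_STEPS = [
    ("Read QC", ALL_PANELS, True),
    ("Post-QC FASTQ Generation", ALL_PANELS, True),
    ("Dehosting", ALL_PANELS, True),
    ("Sample QC", ALL_PANELS, False),
    ("Microorganism Classification", ("VSP V2",), False),
    ("Microorganism Detection", ALL_PANELS, True),
    ("Microorganism Quantification", ALL_PANELS, False),
    ("Bacterial AMR Marker Analysis", ("RPIP", "UPIP"), False),
    ("Viral Variant Calling", ("RPIP", "VSP V2"), False),
    ("Viral AMR Marker Analysis", ("RPIP", "VSP V2"), False),
    ("Report Generation", ALL_PANELS, True),
]


def steps_for(panel):
    """Return the step names that apply to a panel name or to "Custom"."""
    if panel == "Custom":
        return [name for name, _, custom in PIPELINE_STEPS if custom]
    if panel not in ALL_PANELS:
        raise ValueError(f"unknown panel: {panel!r}")
    return [name for name, panels, _ in PIPELINE_STEPS if panel in panels]


if __name__ == "__main__":
    for panel in (*ALL_PANELS, "Custom"):
        print(panel, "->", ", ".join(steps_for(panel)))
```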
Command Line Settings
Required Inputs
--enable-explify
Enables the DME+ pipeline. (Default=false).
--output-file-prefix
Prefix for all output files.
--output-directory
Directory for all output files.
--explify-sample-list
Input sample list .tsv file with sample IDs, FASTQs, etc.
--explify-test-panel-name
Test panel name. One of "RPIP", "UPIP", "VSP V2", or "Custom".
--explify-test-panel-version
Set to test panel version (e.g. "1.0.0").
--explify-ref-db-dir
Path to root directory for database files.
Optional Inputs
--intermediate-results-dir
Directory for temporary files. Its available space must be greater than three times the combined size of all FASTQ files.
--explify-load-db-ram
Option to load database into RAM if not on ramdisk. (Default=false).
--explify-no-read-qc
Option to turn off read QC on FASTQs before analysis. (Default=false).
--explify-internal-control
Option to set internal control from an accepted list. (Default="Enterobacteria phage T7").
--explify-internal-control-concentration
Option to set internal control concentration. (Default=12100000).
--explify-ncpus
Option to set the number of CPUs available for processing.
--explify-sensitivity-threshold
Option to set sensitivity threshold for considering a virus present. Range: 0 < Integer < 1000. Only valid for VSP V2. (Default=5).
--explify-custom-ref-fasta
Reference FASTA file. Required for Custom reference DBs.
--explify-custom-ref-bed
Reference BED file. Optional for Custom reference DBs.
--explify-viral-consensus-depth-threshold
Minimum depth at a position to include the base in the viral consensus sequence. Only relevant for RPIP and VSP V2. (Default=1).
--explify-viral-vc-depth-threshold
Minimum total depth at position to report viral variant. Only relevant for RPIP and VSP V2. (Default=5).
--explify-viral-vc-af-threshold
Minimum allele frequency to report viral variant. Only relevant for RPIP and VSP V2. (Default=0.2).
--explify-post-qc-fastq-mode
Create a single post-QC FASTQ file, or post-QC FASTQ files split by kingdom. Choices='off', 'single', 'split'. (Default=off).
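Several of the optional inputs carry constraints that are easy to check before launching a run. The following is a minimal sketch, not part of DME+: the helper names are hypothetical, and only the constraints stated above are checked (the 3x FASTQ size rule, the sensitivity threshold range and its VSP V2 restriction, and the post-QC FASTQ mode choices).

```python
import os
import shutil

POST_QC_FASTQ_MODES = ("off", "single", "split")


def min_intermediate_bytes(fastq_paths):
    """Lower bound for the intermediate results area: 3x the combined FASTQ size."""
    return 3 * sum(os.path.getsize(path) for path in fastq_paths)


def has_enough_space(intermediate_dir, fastq_paths):
    """True if the free space under intermediate_dir exceeds the lower bound."""
    return shutil.disk_usage(intermediate_dir).free > min_intermediate_bytes(fastq_paths)


def check_optional_settings(panel, sensitivity_threshold=None, post_qc_fastq_mode="off"):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if sensitivity_threshold is not None:
        is_int = isinstance(sensitivity_threshold, int) and not isinstance(
            sensitivity_threshold, bool
        )
        if panel != "VSP V2":
            problems.append("--explify-sensitivity-threshold is only valid for VSP V2")
        elif not is_int or not 0 < sensitivity_threshold < 1000:
            problems.append(
                "--explify-sensitivity-threshold must be an integer with 0 < value < 1000"
            )
    if post_qc_fastq_mode not in POST_QC_FASTQ_MODES:
        problems.append(
            "--explify-post-qc-fastq-mode must be one of: " + ", ".join(POST_QC_FASTQ_MODES)
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets a wrapper script report every issue in a single pass.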
Example Command Line
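As an illustration, the sketch below assembles a command from the required inputs listed above and prints it as a shell-quoted string. It makes several assumptions: the executable is taken to be `dragen`, the boolean option is passed as `--enable-explify true`, and every path, the prefix, and the panel version are placeholders. The function name `build_dme_command` is hypothetical.

```python
import shlex


def build_dme_command(sample_list, panel_name, panel_version, ref_db_dir,
                      output_dir, output_prefix, extra_options=()):
    """Assemble an argument list from the required DME+ inputs.

    Assumption: the executable is `dragen` and boolean options take an
    explicit value ("true"/"false").
    """
    cmd = [
        "dragen",
        "--enable-explify", "true",
        "--output-directory", output_dir,
        "--output-file-prefix", output_prefix,
        "--explify-sample-list", sample_list,
        "--explify-test-panel-name", panel_name,
        "--explify-test-panel-version", panel_version,
        "--explify-ref-db-dir", ref_db_dir,
    ]
    cmd.extend(extra_options)
    return cmd


# Placeholder values only; substitute real paths and the installed panel version.
cmd = build_dme_command(
    sample_list="/path/to/sample_list.tsv",
    panel_name="VSP V2",
    panel_version="1.0.0",
    ref_db_dir="/path/to/ref_db",
    output_dir="/path/to/output",
    output_prefix="run1",
    extra_options=["--explify-sensitivity-threshold", "5"],
)
print(shlex.join(cmd))
```

Building the command as a list and quoting it with `shlex.join` keeps a panel name that contains a space, such as "VSP V2", as a single argument.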
Input Details
Sample Input List
Applies to: --explify-sample-list
The sample input list is a column-formatted file with tab separations between the columns (i.e., a .tsv file).
Notes:
The SampleID values must be unique.
BatchID and RunID help users track and manage sample analyses. Often the BatchID is used to track libraries that were prepared together, and the RunID is used to track sequencing runs. Both can be left blank.
The ControlFlag value can be POS, NEG, BLANK, or left empty.
POS is used to indicate a positive control sample.
NEG is used to indicate a negative control sample.
BLANK is used to indicate a blank control sample (e.g. buffer only).
If there are multiple FASTQ files, they are tab delimited.
Take care when editing .tsv files. Some editors replace tabs with spaces without alerting the user.
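The notes above can be turned into a quick pre-flight check. The sketch below is not part of DME+ and rests on assumptions that should be verified against the actual sample list format: the column order is taken to be SampleID, BatchID, RunID, ControlFlag, followed by one or more FASTQ paths, and a header row is skipped only when `has_header=True`. The function name `check_sample_list` is hypothetical.

```python
ALLOWED_CONTROL_FLAGS = {"", "POS", "NEG", "BLANK"}


def check_sample_list(text, has_header=False):
    """Return a list of problems found in sample list text (empty if none).

    Assumed column order: SampleID, BatchID, RunID, ControlFlag, FASTQ path(s).
    """
    problems = []
    seen_ids = set()
    for line_no, line in enumerate(text.splitlines(), start=1):
        if (has_header and line_no == 1) or not line.strip():
            continue
        if "\t" not in line:
            # Typical symptom of an editor replacing tabs with spaces.
            problems.append(f"line {line_no}: no tab characters found")
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            problems.append(f"line {line_no}: expected at least 5 tab-separated fields")
            continue
        sample_id, _batch_id, _run_id, control_flag = fields[:4]
        fastqs = [field for field in fields[4:] if field]
        if not sample_id:
            problems.append(f"line {line_no}: empty SampleID")
        elif sample_id in seen_ids:
            problems.append(f"line {line_no}: duplicate SampleID {sample_id!r}")
        else:
            seen_ids.add(sample_id)
        if control_flag not in ALLOWED_CONTROL_FLAGS:
            problems.append(f"line {line_no}: invalid ControlFlag {control_flag!r}")
        if not fastqs:
            problems.append(f"line {line_no}: no FASTQ files listed")
    return problems
```

A typical use is to read the file with `open(path, encoding="utf-8").read()` and refuse to start the run if the returned list is not empty.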
Internal Control
Applies to: --explify-internal-control, --explify-internal-control-concentration
The user may specify one of the internal controls listed below. If NONE is specified, the internal control concentration is ignored. These are case-sensitive and must be input exactly as they appear:
Allobacillus halotolerans
Armored RNA Quant Internal Process Control
Enterobacteria phage T7 (This is the default)
Escherichia virus MS2
Escherichia virus Qbeta
Escherichia virus T4
Imtechella halotolerans
Phocid alphaherpesvirus 1
Phocine morbillivirus
Truepera radiovictrix
NONE
The internal control concentration is an integer representing the number of copies/mL of sample for the internal control.
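Because the internal control names are case-sensitive, a wrapper script may want to check them before building the command. The following is a minimal sketch with a hypothetical helper, `internal_control_options`. Leaving out the concentration option when NONE is selected is a choice made here, based on the statement that the concentration is ignored in that case.

```python
ACCEPTED_INTERNAL_CONTROLS = (
    "Allobacillus halotolerans",
    "Armored RNA Quant Internal Process Control",
    "Enterobacteria phage T7",
    "Escherichia virus MS2",
    "Escherichia virus Qbeta",
    "Escherichia virus T4",
    "Imtechella halotolerans",
    "Phocid alphaherpesvirus 1",
    "Phocine morbillivirus",
    "Truepera radiovictrix",
    "NONE",
)
DEFAULT_INTERNAL_CONTROL = "Enterobacteria phage T7"
DEFAULT_CONCENTRATION = 12100000  # copies/mL of sample


def internal_control_options(name=DEFAULT_INTERNAL_CONTROL,
                             concentration=DEFAULT_CONCENTRATION):
    """Return command-line arguments for the internal control settings."""
    # Exact, case-sensitive match against the accepted list.
    if name not in ACCEPTED_INTERNAL_CONTROLS:
        raise ValueError(f"internal control not in the accepted list: {name!r}")
    options = ["--explify-internal-control", name]
    if name != "NONE":
        if isinstance(concentration, bool) or not isinstance(concentration, int):
            raise ValueError("internal control concentration must be an integer")
        options += ["--explify-internal-control-concentration", str(concentration)]
    return options
```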