DRAGEN Recipe - Germline WGS

Overview

This recipe is for processing whole genome sequencing data for germline workflows.

Example Command Line

In most scenarios, the command line is simply the union of the command line options used in the single-caller scenarios.

  • Configure the INPUT options

  • Configure the OUTPUT options

  • Configure MAP/ALIGN depending on whether realignment is desired

  • Configure the VARIANT CALLERs based on the application

  • Configure any additional options

  • Build up the necessary options for each component separately, so that they can be re-used in the final command line.
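The component-by-component approach described above can be sketched with Bash arrays, which keep each argument intact even when a path contains spaces. This is a minimal illustration rather than an official template: all paths and sample names are hypothetical placeholders, only a subset of callers is shown, and the command is printed instead of executed.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical placeholder values for illustration; replace with real ones.
REF_DIR="/path/to/hashtable"
FASTQ_LIST="/path/to/fastq_list.csv"
SAMPLE_ID="sample1"
OUT_DIR="/path/to/output"
OUT_PREFIX="sample1"

# One array per component, so each can be reused in other command lines.
INPUT_OPTIONS=(--ref-dir "$REF_DIR" --fastq-list "$FASTQ_LIST" --fastq-list-sample-id "$SAMPLE_ID")
OUTPUT_OPTIONS=(--output-directory "$OUT_DIR" --output-file-prefix "$OUT_PREFIX")
MA_OPTIONS=(--enable-map-align true --enable-sort true --enable-duplicate-marking true)
SNV_OPTIONS=(--enable-variant-caller true)
SV_OPTIONS=(--enable-sv true)

# The final command line is the union of the selected components.
CMD=(dragen "${INPUT_OPTIONS[@]}" "${OUTPUT_OPTIONS[@]}" "${MA_OPTIONS[@]}"
     "${SNV_OPTIONS[@]}" "${SV_OPTIONS[@]}")

# Print the assembled command for review; run it with "${CMD[@]}" once it looks right.
printf '%q ' "${CMD[@]}"
echo
```

Adding or dropping a caller then amounts to adding or removing one array from the `CMD` line.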

The following partial templates can be used as starting points. Adjust them for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to the DRAGEN hash table directory
# (placeholders are quoted so the script parses before they are replaced)
DRAGEN_HASH_TABLE="<REF_DIR>"

# Path to the output directory for the DRAGEN run
OUTPUT="<OUT_DIR>"

# File prefix for DRAGEN output files
PREFIX="<OUT_PREFIX>"

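# Define the input sources: FASTQ list, FASTQ, BAM, or CRAM.
# Set the variables ($FASTQ_LIST, $FASTQ1, $BAM, ...) for the source you use and
# delete the other blocks, because 'set -u' aborts on any unset variable.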
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --bam-input $BAM \
"

INPUT_CRAM="
  --cram-input $CRAM \
"

# Select the input source; this example uses INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-enable-self-normalization true \
"

SNV_OPTIONS="
  --enable-variant-caller true \
"

SV_OPTIONS="
  --enable-sv true \
"

TARGETED_OPTIONS="
  --enable-targeted true \
"

PGX_OPTIONS="
  --enable-pgx true \
"

STR_OPTIONS="
  --repeat-genotype-enable true \
"

# The second option disables automatic merging of VNTR calls into the SV VCF.
# See the VNTR calling page for more details.
VNTR_OPTIONS="
  --enable-vntr true \
  --sv-vntr-merge false \
"

HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $TARGETED_OPTIONS \
  $PGX_OPTIONS \
  $STR_OPTIONS \
  $VNTR_OPTIONS \
  $HLA_OPTIONS \
"

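# Execute. $CMD is intentionally left unquoted so that the shell splits it into
# separate arguments; this assumes no path contains spaces or glob characters.
echo $CMD
$CMD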
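Choosing one of the four input sources can also be made explicit with a `case` statement, so that only the variables for the selected source are required. The sketch below is one possible approach, not part of the official template: `INPUT_TYPE` and `build_input_options` are hypothetical names introduced here, the default values are placeholders, and the result is only printed.

```shell
#!/bin/bash
set -euo pipefail

# INPUT_TYPE is a hypothetical selector introduced for this sketch.
INPUT_TYPE="${INPUT_TYPE:-fastq_list}"

# Placeholder defaults so the sketch runs standalone; replace with real values.
FASTQ_LIST="${FASTQ_LIST:-/path/to/fastq_list.csv}"
FASTQ_LIST_SAMPLE_ID="${FASTQ_LIST_SAMPLE_ID:-sample1}"

# Fill the global INPUT_OPTIONS array for the chosen input type.
# ${VAR:?msg} stops the script with a clear message if a required variable is unset.
build_input_options() {
  case "$1" in
    fastq_list)
      INPUT_OPTIONS=(--fastq-list "${FASTQ_LIST:?set FASTQ_LIST}"
                     --fastq-list-sample-id "${FASTQ_LIST_SAMPLE_ID:?set FASTQ_LIST_SAMPLE_ID}") ;;
    fastq)
      INPUT_OPTIONS=(--fastq-file1 "${FASTQ1:?set FASTQ1}"
                     --fastq-file2 "${FASTQ2:?set FASTQ2}"
                     --RGSM "${RGSM:?set RGSM}"
                     --RGID "${RGID:?set RGID}") ;;
    bam)
      INPUT_OPTIONS=(--bam-input "${BAM:?set BAM}") ;;
    cram)
      INPUT_OPTIONS=(--cram-input "${CRAM:?set CRAM}") ;;
    *)
      echo "unknown input type: $1" >&2
      return 1 ;;
  esac
}

build_input_options "$INPUT_TYPE"

# Print the selected input options for review.
printf '%q ' "${INPUT_OPTIONS[@]}"
echo
```

The resulting `INPUT_OPTIONS` array can then be placed after `--ref-dir` when the final command line is assembled.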

Additional Notes and Options

Optional settings for each component are listed below. For the full option list, see the command line options page.

HLA

Option                Description

enable-hla            Enable HLA typer (this setting by default will only genotype class 1 genes)

hla-enable-class-2    Extend genotyping to HLA class 2 genes
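As a small illustration of how the two HLA options relate, the sketch below always enables the HLA typer and adds class 2 genotyping only on request. `HLA_CLASS_2` is a hypothetical switch introduced for this example; the options are printed rather than passed to DRAGEN.

```shell
#!/bin/bash
set -euo pipefail

# HLA_CLASS_2 is a hypothetical switch introduced for this sketch.
HLA_CLASS_2="${HLA_CLASS_2:-false}"

# Enabling the HLA typer genotypes class 1 genes only.
HLA_OPTIONS=(--enable-hla=true)

# Extend genotyping to class 2 genes only when requested.
if [[ "$HLA_CLASS_2" == "true" ]]; then
  HLA_OPTIONS+=(--hla-enable-class-2=true)
fi

# Print one option per line for review.
printf '%s\n' "${HLA_OPTIONS[@]}"
```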
