# DRAGEN v4.5.4 Release Notes

***

## Introduction

These release notes detail the key changes to software components for the Illumina® DRAGEN™ Secondary Analysis Software v4.5.4.

Changes are relative to DRAGEN™ v4.4. If you are upgrading from a version earlier than DRAGEN™ v4.4, also review the release notes of the intervening versions for the features and bug fixes they introduced.

DRAGEN™ Installers, Resource Files, and Release Notes are available here: <https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform.html>

The DRAGEN™ User Guide is available here: <https://help.dragen.illumina.com>

The software package includes downloadable installers for Phase 3 and Phase 4 on-premises servers:

* DRAGEN™ SW for x86 Oracle 8 — `dragen-4.5.4-12.multi.el8.x86_64.run`

The following configurations for DRAGEN™ 4.5.4 are also available on request:

* AlmaLinux 8 Amazon Machine Images (AMIs) for f2 instances, available in 12 regions
* AlmaLinux 8 Microsoft Azure Image (VM) available in West US 2 for BYOL
* el8 compatible RPM packages for use with Amazon Web Services (AWS) f2 instances, for customer generated AMIs or customer generated docker images
* DRAGEN™ Kernel drivers for el8, for use with customer generated AMIs or QuickStart

DRAGEN™ v4.5.4 is also made available on:

* Illumina BaseSpace and ICA platforms
* AWS and Azure Marketplaces
  * On AWS see "DRAGEN Complete Suite"
  * On Azure see "DRAGEN Public VM Image - PAYG"

**Deprecated platforms:**

* Support for CentOS 7 ended on June 30, 2024. DRAGEN™ v4.3 is the final release with CentOS 7 installers. el7 builds are no longer generated starting with v4.5.
* AWS F1 instance types were deprecated at the end of December 2025.

***

## Contents

* [Overview](#overview)
* [Updated Resource Files](#updated-resource-files)
* [Major Features and Updates](#major-features-and-updates)
  * [Reference Genome and Multigenome Mapper](#reference-genome-and-multigenome-mapper)
  * [Germline Small Variant Caller](#germline-small-variant-caller)
  * [Germline CNV Caller](#germline-cnv-caller)
  * [Germline Structural Variant Caller](#germline-structural-variant-caller)
  * [Targeted Calling](#targeted-calling)
  * [TruPath Genome](#trupath-genome)
  * [Somatic Small Variant Caller](#somatic-small-variant-caller)
  * [Oncovirus Detection](#oncovirus-detection)
  * [Mutational Signatures](#mutational-signatures)
  * [Somatic T/N High-Specificity Mode](#somatic-tn-high-specificity-mode)
  * [Somatic CNV Cytogenetics](#somatic-cnv-cytogenetics)
  * [Amplicon Pillar Panel Support](#amplicon-pillar-panel-support)
  * [Bulk RNA](#bulk-rna)
  * [Single-Cell RNA](#single-cell-rna)
  * [CheckFingerprint](#checkfingerprint)
  * [5-Base / Methylation](#5-base--methylation)
  * [Iterative gVCF Genotyper (iGG)](#iterative-gvcf-genotyper-igg)
  * [Annotation](#annotation)
  * [Metagenomics / K-mer Classifier](#metagenomics--k-mer-classifier)
  * [16S Pipeline](#16s-pipeline)
  * [BCL Convert](#bcl-convert)
  * [Fragmentomics](#fragmentomics)
  * [Other Updates](#other-updates)
* [Known Issues](#known-issues)
* [SW Installation Procedure](#sw-installation-procedure)

***

## Overview

DRAGEN™ v4.5 delivers deeper biological insight with unmatched speed, scale, and operational simplicity. For full details on each feature or pipeline, please consult the latest Illumina DRAGEN™ Software User Guide available at <https://help.dragen.illumina.com>.

**Highlights**

* **Expanded pangenome reference** — Pangenome reference expanded to 144 globally diverse haplotypes (v12), improving accuracy across populations and difficult-to-map regions.
* **Germline accuracy leap** — \~20% reduction in germline SNP and INDEL FP+FN with personalization now enabled by default.
* **TruPath Genome launch** — Most comprehensive genome yet: improved SNV and SV calling, long-range phasing, fully phased haplotypes in segmental duplication regions, long STR allele estimation, and complex SV visualization.
* **Expanded somatic capabilities** — ML now available for somatic T/N variant calling (Beta), new oncovirus detection pipeline, mutational signatures report (SBS, DBS, ID), somatic high-specificity fingerprint mode for MRD monitoring, and cytogenetics output for somatic WGS CNV.
* **CNV innovations** — Low-pass WGS CNV support (1–10x), cytogenetics-enabled WES via BAF, and SV-informed WGS CNV by default.
* **Multiomics enhancements** — Bulk RNA PTD/ITD detection (Beta), universal annotation parser (DUAP), and major 5-base methylation accuracy and feature improvements.
* **Toolkit updates** — iGG cohort-level ML filtering, enhanced annotation with AlphaMissense and ABraOM support, k-mer classifier for microbial read binning, ORA compression on ICA, and BCL Convert AutoDetect mode.

Please review the section on [Known Issues](#known-issues) and limitations of the release.

***

## Updated Resource Files

DRAGEN™ v4.5 requires updates to key resource files to function correctly and achieve optimum performance. All resource files are available for download at the Illumina DRAGEN™ Product Files support site: <https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html>

| Resource                                                 | Description                                                                                                                                                                                 | File name(s)                                                                                                                                                                                                   |
| -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Hash Tables v12**                                      | Pre-built v12 pangenome and linear hash tables for hg38, hg19, hs37d5, chm13\_v2. Hash tables must be updated to use v4.5. Existing hash tables built with v4.4 or older are not supported. | Pangenome: `hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-12-r6.0-1.tar.gz` *(and equivalents for hg19, hs37d5, chm13\_v2)* Linear: `hg38-alt_masked.cnv.hla.methyl_cg.rna-12-r6.0-1.tar.gz` *(and equivalents)* |
| **Pangenome Reference Builder Collection v6**            | HT mask BED, Graph BED, Graph exclusion BED, Graph msVCF and FASTA files for building hg38, hg19, hs37d5, chm13\_v2 references.                                                             | `hg38-pangenome-reference-collection-v6-1.tar.gz` *(and equivalents)*                                                                                                                                          |
| **SNV Systematic Noise Baseline collection**             | A collection of somatic noise baseline BED files for hg19, hs37d5, hg38 and for WGS and WES respectively.                                                                                   | `systematic-noise-baseline-collection-2.0.0.tar`                                                                                                                                                               |
| **SV Systematic Noise Baseline collection**              | A collection of somatic noise baseline BEDPE files for WGS hg19, hs37d5, hg38.                                                                                                              | `sv-systematic-noise-baseline-collection-v3.2.0-1.tar.gz`                                                                                                                                                      |
| **Targeted Caller Systematic Noise Baseline collection** | A collection of systematic noise baseline json files for hg38, hg19 and hs37d5 for use with WES analysis.                                                                                   | `tc-systematic-noise-baseline-collection-v2.0.0-1.tar.gz`                                                                                                                                                      |
| **CNV Population SNP VCF**                               | Population SNP VCF for Somatic TO CNV for hg38, hg19, hs37d5 and chm13.                                                                                                                     | `hg38_1000G_phase1.snps.high_confidence.vcf.gz` *(and equivalents — unchanged from v4.4)*                                                                                                                      |
| **CNV panel of normals (PON) v4.5**                      | Collection of pre-constructed CNV PON files for WES.                                                                                                                                        | `CNV_PON-Twist_ILMN_Exome_2_5_Panel-DRAGEN_v4.5_v1-1.tar.gz` `CNV_PON-Twist_ILMN_Exome_FFPE_2_5_Panel-DRAGEN_v4.5_v1-1.tar.gz` `CNV_PON-Twist_ILMN_Exome_Mito_2_5_Panel-DRAGEN_v4.5_v1-1.tar.gz`               |
| **SNV Exclusion BED collection**                         | Somatic SNV ALU region exclusion BED files for hg38, hg19, hs37d5.                                                                                                                          | `bed-file-collection-1.0.0.tar.gz`                                                                                                                                                                             |
| **Microsatellite Files**                                 | Microsatellite files and panels of normals for hg19, hs37d5, hg38 and for WGS and WES respectively.                                                                                         | `microsatellite-files-v1.2.0-1.tar.gz`                                                                                                                                                                         |
| **Imputation Reference Panel and Genetic Map**           | Genetic map and reference panel for hg38.                                                                                                                                                   | `genetic_maps-hg38-2.0.tar` `irp-hg38-2.1.2.0.tar`                                                                                                                                                             |
| **ORA compression references**                           | Compression references for human, methylated and non-human.                                                                                                                                 | `oradata_homo_sapiens_V1.tar.gz` *(and species equivalents — see DRAGEN Product Files for full list)*                                                                                                          |
| **RNA gene annotation files**                            | GTF gene annotations from GENCODE.                                                                                                                                                          | `gene-annotation-files-collection-v1.0-1.tar.gz`                                                                                                                                                               |

***

## Major Features and Updates

DRAGEN v4.5 offers new features and accuracy improvements across germline and somatic variant callers, new analysis pipelines, multiomics support, and platform toolkit updates.

***

### Reference Genome and Multigenome Mapper

* **Hash Tables v12.**
  * The hash table interface is updated to format version 12 (HTv12). Hash tables must be updated to use v4.5. Existing hash tables built with v4.4 or older are not supported.
  * Pre-built hash tables for all supported human references are available at the Illumina DRAGEN™ Product Files support site and are recommended for use.
  * See the User Guide for details on how to prepare your own reference genome.
* **Pangenome reference v6 — expanded to 144 globally diverse haplotypes.**
  * The pangenome reference now incorporates 288 whole-genome haplotypes from 27 ancestries worldwide (144 individuals), up from 128 in v4.4.
  * Middle Eastern ancestry is newly represented in v4.5, expanding population coverage for variant calling accuracy in this previously underrepresented group.
  * Structural variant population haplotypes have been refined, yielding a \~2% increase in germline SV F-score compared to the v4.4 pangenome reference.
  * GRCh38 reference improvements incorporated, including corrections to known reference errors.
  * See Table 1 below for recommended reference usage.
* As of v4.4, DRAGEN exits with an error if a linear reference is provided to a component for which a pangenome reference is recommended. To suppress this error when a linear reference is intended, set `--validate-pangenome-reference=false`.
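
As a minimal sketch of the override described above, the following Python snippet assembles (but does not run) a germline command line that deliberately uses a linear reference. Only `--validate-pangenome-reference=false` is taken from these notes. The paths, the sample name, and the remaining options (`--ref-dir`, `-1`, `-2`, `--RGID`, `--RGSM`, `--enable-variant-caller`, `--output-directory`, `--output-file-prefix`) are placeholders or assumptions based on common DRAGEN usage and should be checked against the User Guide.

```python
import shlex


def linear_reference_command(ref_dir, fastq1, fastq2, out_dir, sample):
    """Assemble a germline command that intentionally uses a linear reference."""
    return [
        "dragen",
        "--ref-dir", ref_dir,              # assumed: directory of a linear hash table
        "-1", fastq1,
        "-2", fastq2,
        "--RGID", sample,
        "--RGSM", sample,
        "--enable-variant-caller", "true",
        "--output-directory", out_dir,
        "--output-file-prefix", sample,
        # From these notes: suppress the error raised for a linear reference.
        "--validate-pangenome-reference=false",
    ]


cmd = linear_reference_command(
    "/staging/ref/hg38-linear",            # placeholder paths and names
    "sample_R1.fastq.gz",
    "sample_R2.fastq.gz",
    "/staging/out",
    "sample",
)
# Dry run: print the command only. Pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the override as the last argument makes it easy to spot in logs that the linear reference was an intentional choice.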

**Table 1: v4.5 Reference Support and Recommended Use for Human Data**

| Pipeline          | hg19    | hs37d5  | hg38 | chm13   | Recommended |
| ----------------- | ------- | ------- | ---- | ------- | ----------- |
| **Germline**      | Yes     | Yes     | Yes  | *Note1* | Pangenome   |
| **Somatic**       | Yes     | Yes     | Yes  | *Note2* | Linear      |
| **RNA**           | Yes     | Yes     | Yes  | *Note1* | Linear      |
| **Methyl 5-base** | Yes     | Yes     | Yes  | No      | Pangenome   |
| **Methyl TruSeq** | Yes     | Yes     | Yes  | No      | Linear      |
| **scRNA**         | Yes     | Yes     | Yes  | *Note1* | Linear      |
| **TruPath**       | *Note3* | *Note3* | Yes  | No      | Pangenome   |

*Note1 DRAGEN™ supports the component execution; however, the component's accuracy has not been established. Validated only for SNV. Accuracy not validated for CNV, SV, Joint Genotyping, HLA, gVCFGenotyper, RNA, and scRNA. Not supported for STR, Targeted Callers, or MRJD.*

*Note2 DRAGEN™ supports the component execution; however, the component's accuracy has not been established. Not supported for Methylation.*

*Note3 Experimental use only. The component's functionality and accuracy have not been established with this reference.*
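
For pipelines that select a reference programmatically, Table 1 can be encoded as a small lookup. The sketch below is one possible encoding, not part of DRAGEN: the names `TABLE_1` and `reference_support` are hypothetical, and the status strings simply mirror the table cells, including the note labels.

```python
# Builds that every pipeline in Table 1 lists as "Yes".
_COMMON = {"hg19": "Yes", "hs37d5": "Yes", "hg38": "Yes"}

# pipeline -> (status per reference build, recommended reference type)
TABLE_1 = {
    "Germline": ({**_COMMON, "chm13": "Note1"}, "Pangenome"),
    "Somatic": ({**_COMMON, "chm13": "Note2"}, "Linear"),
    "RNA": ({**_COMMON, "chm13": "Note1"}, "Linear"),
    "Methyl 5-base": ({**_COMMON, "chm13": "No"}, "Pangenome"),
    "Methyl TruSeq": ({**_COMMON, "chm13": "No"}, "Linear"),
    "scRNA": ({**_COMMON, "chm13": "Note1"}, "Linear"),
    "TruPath": (
        {"hg19": "Note3", "hs37d5": "Note3", "hg38": "Yes", "chm13": "No"},
        "Pangenome",
    ),
}


def reference_support(pipeline, build):
    """Return (support status, recommended reference type) for a Table 1 cell."""
    status_by_build, recommended = TABLE_1[pipeline]
    return status_by_build[build], recommended


print(reference_support("TruPath", "hg38"))    # ('Yes', 'Pangenome')
print(reference_support("Somatic", "chm13"))   # ('Note2', 'Linear')
```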

***

### Germline Small Variant Caller

* **Personalization now enabled by default.**
  * Personalization builds a 2-haplotype personalized reference to impute variants used as priors in the variant caller, and creates a personalized ML model tailored to the sample.
  * Enabled by default in v4.5 for germline WGS and WES with a pangenome reference, delivering approximately **20% reduction in combined SNP and INDEL FP+FN** compared to v4.4 without personalization.
  * Delivers superior SNP and INDEL accuracy in both confident regions and difficult-to-map (dark) regions of the genome.
  * Adds less than 4 minutes to default small variant calling runs on a DRAGEN P4 server for WGS; WES and panel runtimes may increase up to \~25% due to personalization overhead.
  * Requires a pangenome v6 hash table.
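
The approximately 20% figure above is a relative reduction in the combined error count (false positives plus false negatives). The following sketch shows only the arithmetic; the counts are hypothetical and are not taken from any DRAGEN benchmark.

```python
def relative_error_reduction(baseline_fp, baseline_fn, new_fp, new_fn):
    """Relative reduction in combined errors (FP + FN) between two runs."""
    baseline_errors = baseline_fp + baseline_fn
    new_errors = new_fp + new_fn
    if baseline_errors == 0:
        raise ValueError("baseline has no errors; reduction is undefined")
    return (baseline_errors - new_errors) / baseline_errors


# Hypothetical counts: 15,000 combined errors before, 12,000 after.
reduction = relative_error_reduction(
    baseline_fp=6000, baseline_fn=9000, new_fp=5000, new_fn=7000
)
print(f"{reduction:.0%}")  # 20%
```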

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-233969aa8f213c3f0341c69a364520caadd2a94b%2Fv45_accuracy_improvement_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-14baacb907234a755980667798eb087230aaf75d%2Fv45_accuracy_improvement.png?alt=media" alt="Personalization accuracy"></picture><figcaption><p>Personalization accuracy</p></figcaption></figure></div>

<br>

***

### Germline CNV Caller

* **WGS CNV now uses SV support by default.**
  * SV evidence is incorporated into germline WGS CNV calling by default in v4.5, improving accuracy over v4.4.
* **Low-pass WGS CNV support.**
  * The germline WGS CNV caller now supports detection of large copy number alterations from low-pass WGS data (1–10x coverage).
  * Supports detection of alterations ≥ 250 kb.
  * Validated on 49 samples (2x coverage): 90% sensitivity compared to chromosomal microarray, \~4 false positive calls per sample (PASS DEL/DUP).
  * Enable with: `--cnv-enable-lowpass=true`
* **Cytogenetics-enabled WES via B-allele frequency.**
  * WES CNV now supports B-allele frequency (BAF) estimation, enabling cytogenetics-style analysis from enrichment data.
  * Enables detection of AOH/LOH regions alongside copy number changes.
  * Increases calling robustness by considering both read coverage and minor allele frequency.
  * Requires: `--cnv-population-b-allele-vcf <SNP_POP_VCF>`
* **Germline WGS CNV mosaic fraction output.**
  * The germline WGS CNV caller now outputs a mosaic fraction (MF) field in the VCF, providing quantitative estimation of mosaic events.
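
The two opt-in CNV options above could be selected by a wrapper script along the following lines. This is a sketch under stated assumptions: the helper `cnv_options` is hypothetical, `--enable-cnv` is assumed to be the general switch for the CNV caller, the VCF path is a placeholder, and any other options a complete CNV run needs are left out. The 1–10x range and the two option names come from these notes.

```python
def cnv_options(assay, mean_coverage=None, population_snp_vcf=None):
    """Return extra CNV options for a germline run (illustrative helper, not a DRAGEN API)."""
    options = ["--enable-cnv", "true"]  # assumed general CNV switch
    # Low-pass mode targets WGS data at roughly 1-10x coverage.
    if assay == "WGS" and mean_coverage is not None and 1 <= mean_coverage <= 10:
        options.append("--cnv-enable-lowpass=true")
    # BAF-based WES analysis requires a population SNP VCF.
    if assay == "WES" and population_snp_vcf is not None:
        options += ["--cnv-population-b-allele-vcf", population_snp_vcf]
    return options


print(cnv_options("WGS", mean_coverage=2))
print(cnv_options("WES", population_snp_vcf="/staging/resources/population_snps.vcf.gz"))
```

A standard-depth WGS sample (for example 35x) falls outside the low-pass range, so the helper returns only the base CNV switch for it.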

***

### Germline Structural Variant Caller

* **\~5% SV F-score improvement over v4.4.**
  * DRAGEN v4.5 improves SV F-score by approximately 5% compared to v4.4 on both HG002 NIST T2TQ100 and CMRG benchmarks.
  * Improvements driven by ML-enabled scoring for SV insertions and deletions, combined with the updated pangenome reference v6.
* **Mitochondrial (chrM) SV calling.**
  * Improved sensitivity and specificity for mitochondrial SV detection, including accurate heteroplasmy estimation via variant allele frequency (VAF).
  * Heteroplasmy estimation validated across a range of 0.14–0.76 VAF.
* **Hemizygous SV calling via haploid sex-chromosome support.**
  * DRAGEN SV now correctly handles hemizygous calls on sex chromosomes and pseudoautosomal regions (PAR).
* **Per-breakpoint VAF reporting for complex SVs.**
  * Distinct VAFs are now reported per breakpoint for complex structural variants, supporting clearer interpretation of complex rearrangements.
* **High-precision SV filters.**
  * New optional high-precision germline SV filters based on split-read and spanning read counts can now be enabled to increase call precision at the cost of some sensitivity.
  * Enable with: `--sv-enable-high-precision-filters=true` (germline) or `--sv-enable-somatic-high-precision-filters=true` (somatic).
  * Minimum read count thresholds (unique, spanning, split) are configurable via companion options, with separate thresholds for hotspot regions.
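
To make the precision and sensitivity trade-off of the high-precision filters concrete, the sketch below computes precision, recall, and F-score from call counts. The counts are hypothetical and chosen only to illustrate the direction of the effect; they do not come from a DRAGEN benchmark.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall (sensitivity), and F-score from call counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical SV call counts against a truth set of 10,000 variants.
default_run = precision_recall_f1(tp=9000, fp=1000, fn=1000)
filtered_run = precision_recall_f1(tp=8500, fp=500, fn=1500)

for label, (p, r, f1) in (("default", default_run), ("high-precision", filtered_run)):
    print(f"{label}: precision={p:.3f} recall={r:.3f} F-score={f1:.3f}")
```

In this illustration the stricter filters remove half of the false positives but also drop some true calls, so precision rises while recall falls. Whether the F-score improves depends on the data set.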

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-1558d4e74a6b3bce15e5e0826ceb1d6124cd9389%2Fv45_sv_accuracy_improvement_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-57ff5833bd137f2581fde359792e6ef8c951ba54%2Fv45_sv_accuracy_improvement.png?alt=media" alt="SV accuracy" width="400"></picture><figcaption><p>SV accuracy</p></figcaption></figure></div>

<br>

***

### Targeted Calling

#### CYP21A2 and GBA Targeted Caller

* **Improved accuracy of recombinant variant calls.**
  * Increased phasing sensitivity by reducing the read depth threshold for detecting haplotype switch sites.
  * Reduced FP/FN recombinant haplotypes by filtering phased haplotypes incompatible with depth-based recombinant variant calls.
  * Support added for calling recombinant haplotypes with a single NM\_000500.9:c.955C>T variant in CYP21A2.
  * DRAGEN v4.5 reports fewer false positive CYP21A2 recombinant variants on 3,202 samples from the 1kGP cohort without sacrificing accuracy (97.6% concordance across 84 benchmark samples).
* **Enhanced reporting of recombinant variants.**
  * Phased VCF output indicates how many copies of the gene are affected.
  * All haplotypes from the target gene and pseudogene are reported in the JSON output.
  * Target allele depths at each site within haplotypes are included for verification.

#### SMN Caller

* **SMA 2+0 silent carrier detection (new ML-driven method).**
  * New logistic regression model trained on approximately 200 biomarker SNPs associated with the SMN1 duplication haplotype.
  * Detects SMN1 2+0 silent carriers based on combined signals across all biomarker SNPs — does not rely on any single variant.
  * Outperforms all existing detection methods for silent carrier identification.
  * Supported on both WGS and WES data (using the Illumina CS/PGx Custom Enrichment Research Panel).
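
For readers unfamiliar with the approach, the sketch below shows how a generic logistic regression combines evidence from several SNPs into one probability. It is not DRAGEN's model: the three weights, the intercept, and the function name are hypothetical, whereas the model described above uses approximately 200 biomarker SNPs with trained coefficients.

```python
import math


def silent_carrier_probability(genotypes, weights, intercept):
    """Generic logistic regression: sigmoid(intercept + sum(w_i * x_i))."""
    if len(genotypes) != len(weights):
        raise ValueError("one weight is required per biomarker SNP")
    z = intercept + sum(w * g for w, g in zip(weights, genotypes))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model with three biomarker SNPs (1 = marker allele present).
weights = [1.2, 0.8, 1.5]
intercept = -3.0

p_none = silent_carrier_probability([0, 0, 0], weights, intercept)
p_one = silent_carrier_probability([1, 0, 0], weights, intercept)
p_all = silent_carrier_probability([1, 1, 1], weights, intercept)
print(f"{p_none:.3f} {p_one:.3f} {p_all:.3f}")
```

Because every SNP contributes to the same sum, no single marker decides the call on its own, which matches the combined-signal behavior described above.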

#### Star Allele Caller

* **Updated star allele definitions from ClinPGx and PharmVar.**
  * All definitions updated to latest versions. Noteworthy updates for: CYP2C19, CYP2C9, CYP3A4, CYP3A5, CYP4F2, MT-RNR1, NAT2, NUDT15, RYR1, SLCO1B1.
  * All 22 genes now fully supported for both hg19 and hg38.
* **Extended hg19 support.**
  * New hg19 support added for: UGT1A1, TPMT, CFTR, NAT2, RYR1, G6PD, MT-RNR1.
* **Customizable star allele definitions.**
  * Users can now add or remove star alleles and variants from existing gene definitions.
  * Output JSON lists all star alleles tested for each gene, improving transparency and auditability.

#### HLA Typing

* **Full resolution 3- and 4-field HLA calling with novel allele detection.**
  * DRAGEN v4.5 extends HLA genotyping to full 3- and 4-field resolution for Class I and Class II genes.
  * Novel variant detection allows identification of alleles not present in the IMGT/HLA database.
  * Alignment and variant calling are performed against an HLA reference built from IMGT allele sequences; initial genotype estimates from Expectation-Maximization (EM) are then refined with allele-specific variant calling.
  * Validated against long-read truth on 47 Class I samples and 84 Class II samples; new 3- and 4-field calls are concordant with long-read truth even on challenging Class II alleles, with slightly improved 2-field accuracy over v4.4.
  * Auto-detects reference (GRCh38 or hg19).
  * At most two alleles are returned per gene, with a full list of expected variants per gene.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-b32d332a34d789bb430411a85d63744704c2c7c0%2Fv45_hla_typing_class1_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-b7d82d38958b3e8ea3a4c462930e200823be25be%2Fv45_hla_typing_class1_accuracy.png?alt=media" alt="HLA Typing Class I accuracy"></picture><figcaption><p>HLA Typing Class I accuracy</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-968669b46663d2d0c5a2d41ee83ea67145e85161%2Fv45_hla_typing_class2_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-e4b7fa2c096eafdc058cbb7dd21daad1a00b7205%2Fv45_hla_typing_class2_accuracy.png?alt=media" alt="HLA Typing Class II accuracy"></picture><figcaption><p>HLA Typing Class II accuracy</p></figcaption></figure></div>

<br>

***

### TruPath Genome

Illumina TruPath Genome is a new whole-genome sequencing solution that encodes long-range molecular information directly on the flowcell by preserving spatial proximity between reads derived from the same original DNA molecule. When combined with DRAGEN's proximity-aware algorithms, this proximity information enables long-range analysis that extends the power of standard short-read data.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-e2fcf900eddcc81d47c38527e39680eadf1a8bd9%2Foutput_files_graphic_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-565249a933d3754c51152383566f46443c0e15af%2Foutput_files_graphic.png?alt=media" alt="TruPath Analysis Workflow"></picture><figcaption><p>TruPath Analysis Workflow</p></figcaption></figure></div>

<br>

* **Proximity-aware mapping and phasing.**
  * DRAGEN Germline integrates proximity-mapped reads from TruPath to support highly accurate read mapping, long-range phasing, and variant detection in complex and low-mappability genomic regions.
  * Activate proximity mode: `--enable-proximity=true`
  * Supports standard short-read FASTQ input and produces standard BAM/CRAM, VCF output formats.
* **Best-in-class small variant calling accuracy.**
  * TruPath with standard DNA input achieves an F-score of 99.41% on T2T-Q100 v1.1.
  * Personalization combined with proximity-aware analysis further reduces FP+FN for SNPs and INDELs.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-86bd9ec7331321b626c64033a22bd4ef896ac33d%2Fv45_trupath_vc_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-cfac20b81174e672d5068166ac2518df9b52726f%2Fv45_trupath_vc_accuracy.png?alt=media" alt="TruPath VC accuracy" width="900"></picture><figcaption><p>TruPath VC accuracy</p></figcaption></figure></div>

<br>

* **Highly accurate long-range phasing.**
  * TruPath phases up to 98% of genes with high molecular weight (HMW) input and 89% with standard input.
  * Phase blocks spanning up to millions of bases, covering entire genes for greater biological insight.
  * Phasing outperforms long-read sequencing methods on standard benchmarks.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-681d1a67099fc76ac63b083646a79840357311d8%2Fv45_trupath_phasing_performance_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-9a0611e170520dcb874ee8f6f5f2e3b101d2790b%2Fv45_trupath_phasing_performance.png?alt=media" alt="TruPath Phasing performance" width="900"></picture><figcaption><p>TruPath Phasing performance</p></figcaption></figure></div>

<br>

* **Haplotype-resolved variant calling in paralogous regions (MRJD).**
  * Multi-Region Joint Detection (MRJD) is enhanced with TruPath proximity data to support copy-number-aware, haplotype-resolved small variant calling in 15 clinically relevant paralogous gene families.
  * Genes include PMS2, SMN1–SMN2, NCF1, CYP21A2, TNXB, STRC, CYP2D6, CYP11B1–CYP11B2, CFHR1–CFHR3–CFHR4, and USP18.
  * Outputs include haplotype-resolved variant calls, phased read alignments, gene-specific copy number, and haplotype visualization.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-56caaad984412a4a80ee1062e0211c444e276efc%2FTruPath_PMS2_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-fc5e4dda6b362a04402da330984028cb2012ff75%2FTruPath_PMS2.png?alt=media" alt="TruPath PMS2 haplotype visualization" width="900"></picture><figcaption><p>TruPath PMS2 haplotype visualization</p></figcaption></figure></div>

<br>

* **Improved SV detection.**
  * TruPath enables more accurate complex and structural variant detection: F-score of 94% for SVs >50 bp.
  * Colocation maps depict long-distance genomic interactions spanning 1 kbp to genome scale — valuable for visualizing and interpreting complex SVs. This type of analysis was historically only available with Hi-C methods.
  * Colocation maps are also used to filter SV breakends lacking proximity support, substantially reducing false positive inter-chromosomal BND calls.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-a2b0bb4c176e3ea44e4d236ff7321a4fae572cf3%2Fv45_trupath_sv_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-a6a87804b14fbde1fdd13cab21b4d7b92f12618e%2Fv45_trupath_sv_accuracy.png?alt=media" alt="TruPath SV accuracy" width="400"></picture><figcaption><p>TruPath SV accuracy</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-b50d75f4b9c1a35b06d1392c7381edbdad57125f%2Fv45_trupath_colocation_plots_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-2374f6432c6d2f3475af0326bfcd8367a39b7d18%2Fv45_trupath_colocation_plots.png?alt=media" alt="TruPath colocation plots"></picture><figcaption><p>TruPath colocation plots</p></figcaption></figure></div>

<br>

* **Extended STR length estimation.**
  * TruPath improves short tandem repeat (STR) expansion length estimation, extending reliable estimates to thousands of base pairs — well beyond the fragment length barrier that limits standard short-read STR analysis.
  * Reports two phased STR alleles, compared to the single aggregate estimate in standard DRAGEN STR analysis.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-2be9dd2680fcdc57fe73e6f6c73b19b723872ded%2Fv45_trupath_str_perf_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-9a367754e73b55966a34970e369c49b586eb1c34%2Fv45_trupath_str_perf.png?alt=media" alt="TruPath STR performance"></picture><figcaption><p>TruPath STR performance</p></figcaption></figure></div>

<br>

* For full command-line options, examples, and the supported gene list, see the Illumina TruPath Genome Prep section of the User Guide.

***

### Somatic Small Variant Caller

* **Machine learning (ML) somatic variant calling for Tumor/Normal — Beta.**
  * A new ML-based SQ score recalibration model is available for somatic T/N variant calling.
  * Compatible with both WGS and WES, and with fresh frozen (FF) and FFPE sample types.
  * Yields **5–20% lower SNV and INDEL FP+FN** and higher sensitivity than non-ML calling across multiple cell line benchmarks.
  * Eliminates the FFPE noise floor: in a normal-vs-normal FFPE benchmark where 100% of calls are artifacts, DRAGEN ML removes the noise that non-ML calling reports as variants.
  * DRAGEN delivers superior SNV detection at low variant allele frequencies (5–20% VAF) and superior INDEL detection at all allele frequencies, compared to competitive solutions.
  * Operates approximately 7.4x faster than DeepSomatic with all callers enabled.
  * ML-driven recalibration is **disabled by default** in v4.5 and tagged as Beta for customer evaluation. Enable with: `--vc-enable-ml-scoring=true`
  * **Note:** Somatic ML should not be used simultaneously with mutational signatures analysis or HRD/TMB biomarkers in the same run. See the User Guide for details.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-ff40de0c9d1595aba2e5b92882461b6d34c0d064%2Fv45_somatic_snp_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-de8ddcca3b19d20b7437e349436ac864bec166e6%2Fv45_somatic_snp_accuracy.png?alt=media" alt="Somatic SNP FP+FN"></picture><figcaption><p>Somatic SNP FP+FN</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-29f9ca36a505ec1960806ab79402bdb72986ccaf%2Fv45_somatic_indel_accuracy_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-18670e0765f7c02c9b4934f957f883d71aa2c3ed%2Fv45_somatic_indel_accuracy.png?alt=media" alt="Somatic Indel FP+FN"></picture><figcaption><p>Somatic Indel FP+FN</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-fe74dd3d01e8bcbfcc1827e37afdbf355e0016a1%2Fv45_somatic_snp_vaf_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-6e3c4089a1ed39205eb53570e41a165169cbf2fb%2Fv45_somatic_snp_vaf.png?alt=media" alt="Somatic SNP VAF performance" width="900"></picture><figcaption><p>Somatic SNP VAF performance</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-d97b0436e6a01b37a69ed872ec7137fa0abf935d%2Fv45_somatic_indel_vaf_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-f56cb5becfd741e1daa404be8e505a3b7a61efd8%2Fv45_somatic_indel_vaf.png?alt=media" alt="Somatic Indel VAF performance" width="900"></picture><figcaption><p>Somatic Indel VAF performance</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-fc97e33810688982ace44da1120fd0b2c35bef84%2Fv45_somatic_ML_FP_reduction_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-e9a7707261bb192bd3abf72d7d0b67297392d49f%2Fv45_somatic_ML_FP_reduction.png?alt=media" alt="Somatic SNP ML FP reduction" width="600"></picture><figcaption><p>Somatic SNP ML FP reduction</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-2c4db2b6f5781cfacd834c8bcada2e09850c2280%2Fv45_somatic_run_time_compared_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-478ef58237ed6f9c6d8b5ce25377fff2f9d5aaf1%2Fv45_somatic_run_time_compared.png?alt=media" alt="Somatic run time comparisons" width="1259"></picture><figcaption><p>Somatic run time comparisons</p></figcaption></figure></div>

<br>

***

### Oncovirus Detection

A new integrated oncovirus detection pipeline identifies the presence of oncoviral sequences and their genomic integration sites from tumor sequencing data.

* **Supported oncoviruses:** EBV, HBV, HCV, HTLV-1, KSHV, MCPyV, and 25+ distinct HPV types.
* K-mer classification of unmapped reads against the oncovirus database identifies which reads are from each oncovirus and the most likely reference accession.
* Reported metrics per oncovirus: read count, most likely reference, and percent of the reference covered ≥ 1x.
* An oncovirus is considered detected when its read count threshold and k-mer fraction coverage threshold are both met.
* Integration site detection is performed by the DRAGEN SV caller; detected integration sites are reported in the SV VCF file.
* Validated on 198 samples across multiple tumor types: 100% sensitivity for expected oncoviruses plus detection of additional HPV subtypes due to the broader viral database.

**Usage:**

```
--enable-oncovirus-detection=true \
--oncovirus-detection-db /path/to/oncovirus-db/
```

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-f00ad8e35debff60c62fe315320977270085359b%2Fv45_oncoviral_detection_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-5f52d3146d39eedc7e426e88919d7363b8a0f458%2Fv45_oncoviral_detection.png?alt=media" alt="Oncoviral Detection" width="530"></picture><figcaption><p>Oncoviral Detection</p></figcaption></figure></div>

<br>

The oncovirus database is available for download from the DRAGEN Product Files support page. The SV caller must also be enabled for integration site detection.
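
The detection rule described above (an oncovirus is reported only when both its read count threshold and its k-mer fraction coverage threshold are met) can be sketched in Python. This is an illustration only: the class, its field names, and the threshold values are hypothetical and are not DRAGEN defaults or output columns.

```python
from dataclasses import dataclass


@dataclass
class OncovirusEvidence:
    """Per-virus evidence; field names are illustrative, not DRAGEN output columns."""
    name: str
    read_count: int
    kmer_fraction: float  # fraction of the reference's k-mers observed (0.0 to 1.0)


# Hypothetical thresholds for illustration; the real values are not stated in these notes.
MIN_READ_COUNT = 10
MIN_KMER_FRACTION = 0.05


def is_detected(evidence, min_reads=MIN_READ_COUNT, min_fraction=MIN_KMER_FRACTION):
    """Return True only when BOTH the read-count and k-mer-fraction thresholds are met."""
    return evidence.read_count >= min_reads and evidence.kmer_fraction >= min_fraction


candidates = [
    OncovirusEvidence("HPV16", read_count=1200, kmer_fraction=0.82),
    OncovirusEvidence("EBV", read_count=500, kmer_fraction=0.01),  # many reads, little coverage
    OncovirusEvidence("HBV", read_count=3, kmer_fraction=0.10),    # too few reads
]
detected = [c.name for c in candidates if is_detected(c)]
print(detected)
```

Requiring both conditions guards against calls driven by a large number of reads piling up on a short, possibly non-specific stretch of a viral reference.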

***

### Mutational Signatures

* **New Mutational Signatures analysis** identifies which COSMIC v3.5 mutational signatures are active in a tumor sample and quantifies their contributions to the observed somatic mutation burden.
* Supports SBS (single base substitution), DBS (doublet base substitution), and ID (small indel) signature classes.
* Signature contributions quantified using standard non-negative least squares (NNLS) methods; statistical significance estimated by resampling.
* Identification of indel signature ID6: ID6 is a strong small-variant-based predictor of the HRD+ phenotype, complementing DRAGEN's existing CNV-based genomic scarring HRD score.
* Automatically enabled for all WGS Tumor/Normal runs with no additional options required.
* For Tumor-Only mode or standalone analysis from an existing VCF: `--enable-mutational-signatures=true`
* **Note:** Mutational Signatures analysis is supported and recommended for WGS workflows only. Results for WES or panel inputs are not considered reliable.
* **Note:** Somatic ML (`--vc-enable-ml-scoring=true`) must not be used in the same run as mutational signatures analysis.
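
To make the NNLS step above concrete, the following Python sketch fits non-negative exposures to a toy mutation catalog. It is a generic projected-gradient NNLS solver written for illustration, not DRAGEN's implementation; the four-channel "signatures" are invented (real COSMIC SBS signatures have 96 channels), and the resampling-based significance step is not shown.

```python
def nnls_projected_gradient(signatures, counts, iterations=2000):
    """Minimise ||S x - m||^2 subject to x >= 0 by projected gradient descent.

    signatures: list of K profiles, each a list of C per-channel probabilities.
    counts:     observed mutation counts per channel (length C).
    Returns the K non-negative exposures (mutations attributed to each signature).
    """
    k, c = len(signatures), len(counts)
    # Gram matrix G = S^T S and right-hand side b = S^T m.
    gram = [[sum(signatures[i][ch] * signatures[j][ch] for ch in range(c))
             for j in range(k)] for i in range(k)]
    rhs = [sum(signatures[i][ch] * counts[ch] for ch in range(c)) for i in range(k)]
    # trace(G) bounds the largest eigenvalue of G, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[i][i] for i in range(k))
    x = [0.0] * k
    for _ in range(iterations):
        grad = [sum(gram[i][j] * x[j] for j in range(k)) - rhs[i] for i in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[i] - step * grad[i]) for i in range(k)]
    return x


# Toy example: three made-up signatures over four mutation channels.
toy_signatures = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
    [0.1, 0.7, 0.1, 0.1],
]
# Catalog generated from 100 mutations of the first signature and 50 of the second.
toy_counts = [75, 15, 15, 45]
exposures = nnls_projected_gradient(toy_signatures, toy_counts)
print([round(e, 3) for e in exposures])
```

In this toy case the third signature contributes nothing to the catalog, and the non-negativity constraint keeps its exposure at zero rather than allowing a negative compensating value.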

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-2ba23ccce49b42d9a6374df16d022bb22d8c8401%2Fv45_mutsig_ID6_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-e223736c7a945f9fff59fbce6e5dc6775249618b%2Fv45_mutsig_ID6.png?alt=media" alt="Mutational Signatures" width="451"></picture><figcaption><p>Mutational Signatures</p></figcaption></figure></div>

<br>

***

### Somatic T/N High-Specificity Mode

* A new umbrella option enables maximum-specificity filtering of somatic T/N variant calls, designed for applications requiring the highest possible precision — such as building a high-confidence variant fingerprint for MRD monitoring in plasma.
* Applies aggressive SQ threshold filtering and multiple additional filter options to eliminate false positives at the cost of some sensitivity.
* Supported for both WGS and WES workflows.
* Enable with: `--vc-high-specificity=true`
* **Note:** This mode is not supported with ML (`--vc-enable-ml-scoring=true`) in v4.5.
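
The notes in the somatic sections describe options that should not be combined with somatic ML in one run. A pre-flight check along the following lines could catch explicit conflicts before a job is submitted. The helper is hypothetical and not part of DRAGEN; it only uses the option names quoted in these release notes. It does not cover HRD/TMB, whose options are not named here, and it cannot see that Mutational Signatures is enabled automatically for WGS Tumor/Normal runs.

```python
# Option names are taken from these release notes; the checker is a hypothetical helper.
ML_FLAG = "--vc-enable-ml-scoring"
INCOMPATIBLE_WITH_ML = (
    "--enable-mutational-signatures",
    "--vc-high-specificity",
)


def _is_true(value):
    """Interpret a command-line style boolean value."""
    return str(value).strip().lower() == "true"


def find_ml_conflicts(options):
    """Return the explicitly enabled options that conflict with somatic ML.

    options: dict mapping option name to its value, e.g. {"--vc-high-specificity": "true"}.
    """
    if not _is_true(options.get(ML_FLAG, "false")):
        return []
    return [flag for flag in INCOMPATIBLE_WITH_ML if _is_true(options.get(flag, "false"))]


run_options = {
    "--vc-enable-ml-scoring": "true",
    "--vc-high-specificity": "true",
}
print(find_ml_conflicts(run_options))
```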

***

### Somatic CNV Cytogenetics

* **Somatic WGS CNV now supports cytogenetics output.**
  * Enables chromosomal segment-level copy number reporting (p-arm, q-arm, and whole-chromosome events).

***

### Amplicon Pillar Panel Support

* **Pillar panel support is expanded in v4.5.**
  * Improved and validated accuracy for three panels: oncoReveal Essential LBx panel, oncoReveal Core LBx panel, and Heme.

***

### Bulk RNA

* **PTD/ITD detection in the Bulk RNA fusion caller — Beta.**
  * The gene fusion caller now supports detection of Partial Tandem Duplications (PTD) and Internal Tandem Duplications (ITD), intragenic self-fusions relevant to oncology applications.
  * Specify genes for PTD/ITD detection: `--rna-gf-ptd-genes="KMT2A FGFR1"` (default: empty; standard intragenic events filtered)
  * Validated for KMT2A PTD detection.
  * Tagged as Beta.
* **DRAGEN Universal Annotation Parser (DUAP).**
  * DRAGEN now includes a universal annotation parser compatible with all major annotation sources: GENCODE, Ensembl, RefSeq, and custom annotations.
  * Supports both GTF and GFF3 formats.
  * Supports selective analysis using a BED file (e.g., for panel workflows).
  * Provides flexibility in defining which features are included in the analysis, with advanced filtering by transcript and intron length.
  * Reduces annotation preprocessing requirements and simplifies bring-your-own-reference workflows for RNA analysis.

***

### Single-Cell RNA

DRAGEN v4.5 introduces several new features for the PIPseq Single-Cell RNA pipeline:

* **h5ad and molecule info HDF5 output.**
  * Count matrices are now output in AnnData format (h5ad) in addition to the existing MEX sparse matrix format, for both raw and filtered barcodes (`<prefix>.scRNA.h5ad`, `<prefix>.scRNA.filtered.h5ad`). Compatible with standard downstream tools (Scanpy, Seurat).
  * A molecule info HDF5 file (`<prefix>.scRNA.moleculeInfo.h5`) is now generated, providing per-molecule read counts by barcode, gene, and IMI — enabling sequencing saturation analysis. Sequencing saturation is also reported as a new QC metric.
  * h5ad output can be disabled with `--single-cell-enable-h5ad-output=false`.
* **BAM output sorted by cell barcode (new default).**
  * The output BAM is now grouped by cell barcode by default, reducing peak memory usage for large samples. To revert to reference position-sorted BAM, set `--scrna-split-counts-by-barcode=false`.
* **Genotype-based sample demultiplexing now supported for Illumina PIPseq prep.**
  * Genotype demultiplexing, previously available for other scRNA preps, is now supported in PIPseq mode. Pooled samples can be demultiplexed using a genotype VCF containing variants differentiating the samples.
  * Enable with: `--scrna-demux-sample-vcf <VCF>` (exact sample genotypes) or `--scrna-demux-reference-vcf <VCF>` (background population VCF).
* **Cell hashing sample demultiplexing.**
  * Samples labeled with oligo-tags (HTOs in R2) can be demultiplexed using a cell-hashing reference file.
  * Enable with: `--scrna-cell-hashing-reference <ref>` and `--scrna-hto-barcode-groups <RGIDs>`.
  * Doublet detection is enabled by default (`--scrna-demux-detect-doublets=true`).
* **STAR aligner support.**
  * STAR is now available as an alternative RNA mapper within the DRAGEN scRNA pipeline, providing compatibility with legacy PIPseq workflows.
  * Enable with: `--single-cell-enable-star-mapper true`. Requires a separate STAR-formatted reference (`--single-cell-star-mapper-reference-dir`).
* **Guide RNA calling with Gaussian Mixture Model (GMM) — PIPseq CRISPR mode.**
  * When PIPseq CRISPR mode is enabled, DRAGEN now applies a GMM-based guide RNA calling step to distinguish cells truly expressing a guide RNA from noise, reducing false positive feature assignments.
  * Enabled with `--scrna-enable-pipseq-crispr-mode=true`. Control with `--scrna-crispr-guide-calling-mode=gmm` (default) or `=none` to disable.
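
As an illustration of how per-molecule read counts enable sequencing saturation analysis, the sketch below uses one widely used definition: one minus the ratio of unique molecules to total reads. This is an assumption for illustration; the exact formula DRAGEN reports is not given in these notes, and the tuple layout is hypothetical rather than the actual layout of the molecule info HDF5 file.

```python
def sequencing_saturation(molecules):
    """Estimate saturation as 1 - (unique molecules / total reads).

    molecules: iterable of (barcode, gene, imi, read_count) tuples, one per
    unique molecule. The tuple layout is illustrative only.
    """
    molecules = list(molecules)
    total_reads = sum(read_count for _, _, _, read_count in molecules)
    if total_reads == 0:
        return 0.0
    return 1.0 - len(molecules) / total_reads


toy_molecules = [
    ("AAAC", "GENE1", "imi1", 3),
    ("AAAC", "GENE2", "imi2", 1),
    ("TTTG", "GENE1", "imi3", 4),
]
print(sequencing_saturation(toy_molecules))  # 3 molecules, 8 reads
```

Under this definition, a value near 0 means most reads come from molecules seen only once, while a value near 1 means additional sequencing would mostly re-observe molecules that were already captured.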

***

### CheckFingerprint

DRAGEN v4.5 introduces a new pileup-based batch screening mode for CheckFingerprint, expanding sample identity verification capabilities. CheckFingerprint answers "is this the same person?" — complementary to DRAGEN contamination detection which asks "is this sample pure?"

* **Verify Identity.**
  * Confirm that two datasets belong to the same individual — essential for longitudinal studies or matched tumor/normal analysis.
  * Supports re-sequencing and multi-sample study workflows.
* **Validate at Scale — new in v4.5.**
  * Compare many samples simultaneously using pileup-based matching, with no need to generate VCFs first.
  * Optimized for batch QC and large cohort screening.
  * Example many-to-many comparison:

    ```
    dragen -r <ref_dir> \
    --enable-checkfingerprint true \
    --checkfingerprint-pairwise-read-files sampleA.pileup.txt \
    --checkfingerprint-pairwise-read-files sampleB.pileup.txt \
    --checkfingerprint-pairwise-read-files sampleC.pileup.txt \
    --output-directory <outdir> \
    --output-file-prefix batch
    ```
* **Flexible modes.**
  * Choose read-level, VCF-level, or pileup comparison depending on the workflow stage.
  * Applicable at any stage of analysis.
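
For large batches, the repeated `--checkfingerprint-pairwise-read-files` option in the example above can be generated programmatically. The Python sketch below only assembles the argument list, using the option names shown in that example; the helper name and the file paths are hypothetical.

```python
import shlex


def build_checkfingerprint_command(ref_dir, pileup_files, outdir, prefix):
    """Assemble the many-to-many CheckFingerprint command shown above as an argument list."""
    args = ["dragen", "-r", ref_dir, "--enable-checkfingerprint", "true"]
    for pileup in pileup_files:
        # One repeated option per sample pileup, as in the release-notes example.
        args += ["--checkfingerprint-pairwise-read-files", pileup]
    args += ["--output-directory", outdir, "--output-file-prefix", prefix]
    return args


pileups = ["sampleA.pileup.txt", "sampleB.pileup.txt", "sampleC.pileup.txt"]
command = build_checkfingerprint_command("/path/to/ref_dir", pileups, "/path/to/outdir", "batch")
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems when sample paths contain spaces, and the list can be passed directly to `subprocess.run`.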

***

### 5-Base / Methylation

* **DRAGEN ML for 5-base germline SNV calling.**
  * A new ML model trained with 5mC features improves small variant calling accuracy in 5-base germline workflows, with particular gains for PCR-free INDEL accuracy.
  * Reduces FP+FN for PCR-free samples; fixes gVCF HOMREF block likelihoods for 5-base runs.
* **Personalization enabled for 5-base workflows.**
  * Personalization (enabled by default for standard germline in v4.5) is now supported for 5-base germline analysis, delivering the same \~20% FP+FN reduction benefit for methylation WGS.
* **Allele-specific methylation reporting.**
  * Methylation levels are now reported per allele in 5-base workflows.
* **Joint Genotyper support for 5-base.**
  * Multi-sample joint genotyping is now supported for 5-base (methylation) workflows.
* **5-base CNV (ASCN) and SV support.**
  * Allele-specific copy number (ASCN) estimation is now available in 5-base workflows.
  * SV calling (germline and somatic) is now supported in 5-base runs.
* **gVCF output in tumor/normal mode.**
  * gVCF output is now generated in 5-base tumor/normal runs, enabling HOMREF block reporting alongside variant calls — required for methylation level reporting in T/N workflows.
* **Pangenome reference recommended for 5-base germline.**
  * The pangenome reference is now the recommended reference for 5-base germline analyses.

***

### Iterative gVCF Genotyper (iGG)

* **ML-based cohort variant filtering.**
  * A new ML filtering model is applied at the cohort level during iGG step 3 (msVCF generation), using cohort-wide signals to improve genotyping rate and genotype consistency.
  * Designed for population-scale cohorts and reference datasets (1,000 samples or more). Recommended for gVCF input from DRAGEN v4.0+.
  * Complements sample-level ML recalibration (MLR) as implemented in the DRAGEN variant caller.
  * Validated on the 1000 Genomes cohort: significantly reduces Mendelian errors as measured in family trios, and reduces the percentage of missing genotypes per site.
  * Filtered msVCF is the recommended input for downstream applications such as GWAS.
  * Enable with: `--gg-enable-ml-filtering=true`
* **PopGen CLI for cloud-scale cohort analysis on ICA.**
  * Illumina provides the PopGen CLI, a command-line tool that simplifies orchestration of large-scale iGG workflows on Illumina Connected Analytics (ICA).
  * Automates splitting, scheduling, and merging of large cohort runs across ICA compute resources.
  * Available in multiple ICA regions (US, CA, UK, EU, JP, KR, SG, AU, ID, IN, IL, AE). Download from the `popgen-cli-release-<region>` bundle in your ICA domain.
  * Requires Python 3.8 or later; distributed as a Python wheel package.
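
The Mendelian-error metric mentioned in the validation bullet above can be computed from trio genotypes as sketched below. This is a generic check for diploid autosomal sites, not the evaluation code used for the benchmark; the genotype representation (tuples of allele indices, `None` for a missing call) is an assumption made for the example.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one maternal and one paternal allele."""
    return any(sorted((m, f)) == sorted(child) for m, f in product(mother, father))


def mendelian_error_rate(trio_sites):
    """Fraction of fully genotyped sites where the child violates Mendelian inheritance.

    trio_sites: list of (child, mother, father) genotypes; a missing genotype is None.
    """
    informative = [site for site in trio_sites if None not in site]
    if not informative:
        return 0.0
    errors = sum(1 for child, mother, father in informative
                 if not is_mendelian_consistent(child, mother, father))
    return errors / len(informative)


toy_sites = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: one allele from each parent
    ((1, 1), (0, 0), (0, 1)),  # error: mother cannot transmit allele 1
    (None, (0, 1), (0, 1)),    # missing child genotype, excluded from the rate
    ((0, 0), (0, 1), (0, 1)),  # consistent
]
print(mendelian_error_rate(toy_sites))
```

Sites with a missing genotype are excluded here, which is why the missing-genotype percentage is tracked as a separate metric alongside the Mendelian error rate.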

***

### Annotation

* **Illumina Connected Annotations updated to v3.27.0.**
  * The annotation engine bundled with DRAGEN is updated to version 3.27.0.
  * Illumina Connected Annotations now considers variant type compatibility when retrieving annotations from supplementary databases, reducing false annotation matches for SVs.
* **Methylation annotation support.**
  * Illumina Connected Annotations can now annotate regions of interest using methylation data, enabling methylation-aware annotation for 5-base genome workflows.
* **AlphaMissense annotation support.**
  * DRAGEN Annotation (Nirvana) now supports AlphaMissense predictions, enabling pathogenicity scoring of missense variants directly from the DRAGEN annotation pipeline.
* **ABraOM population database support.**
  * The Brazilian population frequency database ABraOM is now supported as an annotation source for population allele frequencies, improving variant interpretation for Brazilian and admixed populations.
* **More accurate Transcript Consequence Prioritization.**
  * Transcript consequence prioritization logic has been improved for more accurate functional impact ranking of variants across transcripts.
* **Annotation performance improvements.**
  * Throughput and memory footprint improved for large multi-sample VCF annotation workflows.

***

### Metagenomics / K-mer Classifier

* **New k-mer classifier pipeline for microbial contamination identification and read binning.**
  * A standalone k-mer classifier pipeline is now available that distinguishes human reads from microbial reads, enabling identification and removal of microbial contamination from sequencing data.
  * Reads are classified into categories: human, viral, bacterial, fungal, parasite, ambiguous, and unclassified.
  * Uses the Illumina human and microbial binning database (available for download via wget).
  * **Outputs:**
    * Category Summary TSV — summarizes the percentage of reads classified to each category, giving an overview of sample composition.
    * Category-specific FASTQs — input reads are split into per-category FASTQ files, enabling downstream analyses with microbial reads removed.
  * The classifier is also used internally by the oncovirus detection pipeline.
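
The Category Summary output described above reports the percentage of reads per category. The following Python sketch shows that computation on per-read labels. The category names come from the list above; the function, its input format, and the returned dictionary are hypothetical and do not reflect the actual TSV column layout.

```python
from collections import Counter

# Category names as listed in these release notes.
CATEGORIES = ("human", "viral", "bacterial", "fungal", "parasite", "ambiguous", "unclassified")


def category_summary(read_labels):
    """Return the percentage of reads assigned to each category.

    read_labels: iterable with one category label per read (illustrative input format).
    """
    counts = Counter(read_labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected categories: {sorted(unknown)}")
    total = sum(counts.values())
    return {cat: (100.0 * counts[cat] / total if total else 0.0) for cat in CATEGORIES}


toy_labels = ["human"] * 8 + ["bacterial"] + ["unclassified"]
summary = category_summary(toy_labels)
print(summary["human"], summary["bacterial"], summary["viral"])
```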

***

### 16S Pipeline

* **New 16S rRNA pipeline for microbial community profiling.**
  * DRAGEN v4.5 introduces a new 16S pipeline that uses a k-mer-based approach to profile bacteria and archaea from amplicon sequencing data, classifying reads to the appropriate taxonomic level based on k-mer evidence.
  * The **Refseq-RDP-v1 database** is available for download and contains 14,676 bacterial and 660 archaeal full-length 16S rRNA gene sequences. Custom reference sequences can be used instead, enabling analyses such as fungal ITS profiling.
  * **Outputs:**
    * Sample Summary CSV — number and percentage of reads assigned to each taxon.
    * Per-taxonomic-level CSVs — summarize classification results across samples at each taxonomic rank.
    * JSON per sample — analysis metadata.
  * 2×500 bp sequencing unlocks greater species-level classification accuracy: all expected species are detected with higher read assignment rates compared to 2×150 bp.
  * Benchmarked on the ZymoBIOMICS Microbial Community Standard (D6305).

***

### BCL Convert

New features released in DRAGEN v4.4.7 are also present in v4.5:

* **AutoDetect mode for sample sheet generation.**
  * BCL Convert has included AutoDetect mode since DRAGEN v4.4.7. It can automatically fix indexes, identify adapter sequences and demultiplexing parameters from the run data, and detect samples, thereby reducing manual sample sheet configuration.
  * DRAGEN v4.5 includes bug fixes and robustness improvements for AutoDetect mode.

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-89fd0eb5106261739bf62e81128755239fc4170b%2Fv45_bcl_autodetect_flow_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-4c19e1b23b33c6191e03dd432b90ba4e59bdcd3d%2Fv45_bcl_autodetect_flow.png?alt=media" alt="BCL Autodetect Flow"></picture><figcaption><p>BCL Autodetect Flow</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-f7f30801d604eda8f7191cec4ebd9c1cab506d08%2Fv45_bcl_autodetect_new_samples_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-25e6b6fddaf1497470a1c57fca7da97bbc26d001%2Fv45_bcl_autodetect_new_samples.png?alt=media" alt="BCL Autodetect Detect new samples"></picture><figcaption><p>BCL Autodetect Detect new samples</p></figcaption></figure></div>

<br>

<div align="center"><figure><picture><source srcset="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-70246f9ee25c55a142dccd8cde3984128cf3864d%2Fv45_bcl_autodetect_updated_samplesheet_dark.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-70028e9773e7b95d6b4fc32a16947c532d42b21e%2Fv45_bcl_autodetect_updated_samplesheet.png?alt=media" alt="BCL Autodetect Correct samplesheet"></picture><figcaption><p>BCL Autodetect Correct samplesheet</p></figcaption></figure></div>

<br>

* **Improved demultiplexing performance.**
  * Throughput improvements for large runs (>1,000 samples).
* **No-lane-splitting with per-lane FASTQ output.**
  * BCL Convert supports `no-lane-splitting` in combination with a `Lane` specification. When both are used, BCL Convert outputs one pair of FASTQ files per sample, covering the lanes specified for that sample.

***

### Fragmentomics

The three DRAGEN Fragmentomics components (fragment profile, end motif frequency, and window protection score) can now be enabled independently or combined in a single run:

* **Standalone Window Protection Score (WPS).**
  * WPS now has its own enable flag (`--enable-fragmentomics-wps true`) and can run without a target region file. Previously, WPS only ran when `--fragmentomics-wps-target-file` was provided.
* **Automatic window generation for WPS.**
  * DRAGEN now automatically tiles regions of interest into sliding windows using `--fragmentomics-wps-window-size` (default: 120 bp), with optional left/right padding via `--fragmentomics-wps-region-left-padding` and `--fragmentomics-wps-region-right-padding`. Previously, users had to supply pre-tiled windows manually.
* **Dedicated fragment-size filtering for end motif and WPS.**
  * New per-component fragment size filters: `--fragmentomics-end-motif-fragment-min-size` / `--fragmentomics-end-motif-fragment-max-size` for end motif analysis, and `--fragmentomics-wps-fragment-min-size` / `--fragmentomics-wps-fragment-max-size` for WPS.
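
For readers unfamiliar with the metric, the sketch below computes a window protection score using the commonly cited definition: fragments that fully span a window count +1, and fragments with an endpoint inside the window count -1. It is a simplified illustration, not DRAGEN's implementation; the coordinate and edge conventions are assumptions, and the `min_size` / `max_size` arguments only mimic the role of the fragment-size filter options listed above.

```python
def window_protection_score(fragments, center, window_size=120, min_size=None, max_size=None):
    """Compute WPS for one window centered at `center`.

    fragments: list of (start, end) pairs with inclusive coordinates (assumed convention).
    """
    win_start = center - window_size // 2
    win_end = win_start + window_size - 1  # inclusive window end
    score = 0
    for start, end in fragments:
        size = end - start + 1
        # Optional fragment-size filtering, analogous to the per-component filters.
        if min_size is not None and size < min_size:
            continue
        if max_size is not None and size > max_size:
            continue
        if start <= win_start and end >= win_end:
            score += 1  # fragment completely spans (protects) the window
        elif win_start <= start <= win_end or win_start <= end <= win_end:
            score -= 1  # fragment has an endpoint inside the window
    return score


toy_fragments = [(0, 300), (50, 250), (10, 220), (100, 130), (160, 400)]
print(window_protection_score(toy_fragments, center=150))
print(window_protection_score(toy_fragments, center=150, min_size=50))
```

In the second call the 31 bp fragment is filtered out before scoring, which shows how the size filters change the score for the same window.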

***

### Other Updates

#### Licensing

* **Environment variable support for license credentials.**
  * License credential options `--lic-credentials` and `--lic-instance-id-location` can now be supplied via environment variables, simplifying deployment in containerized and automated environments.
  * Environment variable: `DRAGEN_LICENSE_CREDENTIALS_FILE=<path to config file>`
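
In automated environments, the variable can be set for a single DRAGEN process without changing the parent environment. The Python sketch below shows one way to do this; the helper name and the credentials path are hypothetical, and only the variable name comes from these release notes.

```python
import os


def dragen_environment(credentials_file, base_env=None):
    """Return a copy of the environment with the DRAGEN license credentials variable set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DRAGEN_LICENSE_CREDENTIALS_FILE"] = credentials_file
    return env


parent_env = {"PATH": "/usr/bin"}
child_env = dragen_environment("/path/to/license.cfg", base_env=parent_env)
print(child_env["DRAGEN_LICENSE_CREDENTIALS_FILE"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching DRAGEN, so the credentials path never needs to appear on the command line.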

#### Deprecations and Removals

* **VNTR caller deprecated.** The VNTR (Variable Number Tandem Repeat) caller has been deprecated and is no longer officially supported. The feature is disabled by default and may be removed in a future release.
* **CentOS 7 / el7 support removed.** el7 installers are no longer produced for v4.5 or later.

***

## Known Issues

The following known issues exist in DRAGEN™ v4.5.4. Where applicable, a workaround is provided.

| Component                        | Summary                                                                                                                                                                                                                                                                                                       | Resolution / Workaround                                                                                                                                                                                                            |
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Germline SNV**                 | Some variants may have a lower QUAL score than expected based on read support and mapping quality. These variants are still called correctly; the QUAL score discrepancy is an artifact of the ML training data and does not affect overall accuracy.                                                         | For information only. No workaround required.                                                                                                                                                                                      |
| **SV**                           | DRAGEN SV does not call fold-back inversions unless the two sides of the inversion are sufficiently far apart (hundreds of bp or more).                                                                                                                                                                       | No workaround. Detection of fold-back inversions is planned for a future release.                                                                                                                                                  |
| **SV**                           | High-depth, low-quality FFPE WGS samples running the SV workflow may encounter an out-of-memory exception and abort.                                                                                                                                                                                          | Reduce the number of noisy false positive SV candidates being reconstructed by increasing the minimum number of supporting reads with `--sv-min-candidate-spanning-count`. This issue is under investigation for a future release. |
| **SV**                           | In the somatic pipeline, Tumor-Only SV VCF records include both `SOMATIC` and `SOMATIC_EVENT` tags, which may be confusing.                                                                                                                                                                                   | No workaround. The field naming will be unified in a future release to match the convention used by the SNV caller.                                                                                                                |
| **CNV**                          | When reporting mosaic CNV events using combined depth+BAF and depth-only resolutions (Cytogenetics mode), calls may be discordant between the two resolution outputs for rare co-occurring AOH/LOH + DEL events. This behavior was also present in v4.4.                                                      | No workaround for the call discordance. The dual-resolution output format is under review for a future release.                                                                                                                    |
| **STR / Repeat Genotyping**      | False negative INDEL calls may occur in STR regions when reads do not span the entire repeat region. This is a known limitation of short-read-based STR genotyping, not a regression from v4.4.                                                                                                               | No workaround. A fix is planned for the next release.                                                                                                                                                                              |
| **5-Base / Methylation**         | Methylation reporting at multiallelic sites in somatic tumor/normal 5-base workflows produces incorrect results. This is not a regression from v4.4; the v4.5 release is improved relative to v4.4 for methylation reporting overall.                                                                         | No workaround. A fix is planned for the next release.                                                                                                                                                                              |
| **5-Base / Methylation**         | The methylation CX report is not generated when running from an aligned BAM (split-run mode with `--enable-map-align=false`). The report is generated correctly in end-to-end runs, and in mapping-only runs.                                                                                                 | Regenerate the methylation report by remapping the BAM file through the mapping phase of a DRAGEN 5-base run.                                                                                                                      |
| **5-Base / Methylation**         | Methylation reporting consistency issues exist in gVCF output when using the cytosine-report option alongside different CpG context options (C vs CG), which may result in discordant reported values at some positions.                                                                                      | No workaround. A fix is planned for the next release.                                                                                                                                                                              |
| **5-Base / Methylation**         | 5mC methylation reporting is disabled for INDEL positions in v4.5. Methylation levels at INDEL sites are not reported in the (g)VCF output.                                                                                                                                                                   | 5mC reporting for INDELs will be implemented in a future release.                                                                                                                                                                  |
| **5-Base / Methylation (UMI)**   | The 5-base Somatic Tumor/Normal WGS with UMI recipe has increased runtime compared to v4.4. This is an intentional design change to improve UMI collapsing accuracy by selecting more reads for downstream analysis.                                                                                          | No workaround. The runtime increase is expected.                                                                                                                                                                                   |
| **5-Base (Runtime)**             | The 5-Base Tumor/Normal Solid WGS recipe has increased runtime compared to v4.4. This is due to the addition of gVCF output in the tumor/normal workflow, which is required for improved methylation level reporting.                                                                                         | No workaround. The runtime increase is expected.                                                                                                                                                                                   |
| **Paralog Caller (Star Allele)** | Overlapping small variants in GBA and CYP2D6 may produce inconsistencies in VCF format representation for specific star allele genotypes. Not all customers will be affected. No customer reports have been received for the currently affected sites.                                                        | No workaround. This will be addressed in a future release.                                                                                                                                                                         |
| **Paralog Caller (Star Allele)** | CYP2D6 and CYP2B6 phenotype annotations are not generated for genotypes containing three or more copies (e.g., ×3, ×4) of a star allele. This issue was also present in v4.4.                                                                                                                                 | No workaround. Affected genotypes will not receive phenotype annotations. This will be addressed in a future release.                                                                                                              |
| **HLA**                          | For one HLA typing sample from the 1000 Genomes Project (NA19463), DRAGEN incorrectly calls HLA-B\*58:02 instead of the expected HLA-B\*58:01 allele. This issue was reported by a customer for v4.4 and is not a new regression in v4.5. | No workaround. Under investigation for a future release. |
| **TruPath / MRJD**               | Haplotype-resolved small variant calling in paralogous gene regions on TruPath data, using Multi-Region Joint Detection (MRJD), shows variable accuracy across replicates. | MRJD accuracy will be improved in future releases. |
| **RNA**                          | Gene names and biotypes are not included in RNA quantification output files by default. Only gene or transcript IDs are emitted in the default configuration.                                                                                                                                                 | To include gene names and biotypes, use the following options: `--rna-quantification-gene-attributes` and `--rna-quantification-transcript-attributes`. A fix to make these available by default is planned for the next release.  |
| **RNA**                          | If `--rna-annotation-gene-attributes` or `--rna-annotation-transcript-attributes` is set to an incorrect value, the run may fail with an error message that appears only at the end of the run rather than during startup validation. | Verify attribute names against the GTF annotation file before running. The error handling will be improved in a future release. |
| **16S**                          | 16S species-level classification accuracy for *Listeria monocytogenes* decreases at longer read lengths (e.g., 2×300 bp vs 2×150 bp), where reads are more likely to be incorrectly classified as *Listeria innocua*.                                                                                         | For information only. No workaround.                                                                                                                                                                                               |
| **TMB (Tumor-Only)**             | Tumor-only (T/O) TMB is less reliable than tumor/normal (T/N) TMB. Germline variants may be counted as somatic variants, inflating the reported TMB value above the true value. This is a known limitation documented in the User Guide and was present in v4.4. | For more accurate TMB reporting, use the tumor/normal workflow when a matched normal sample is available. |
| **ORA Compression**              | When running ORA compression or decompression without the `--force` option and an output file already exists, the run correctly aborts with an error but may intermittently produce a segfault exit in addition to the expected error code. This issue was also present in v4.4.                              | Use `--force` when re-running ORA compression over existing output files, or remove existing output files before re-running.                                                                                                       |
| **combine-samples-by-name**      | The `--combine-samples-by-name` input mode does not work correctly in v4.5 due to a regression introduced in v4.3. An alternate input method is available.                                                                                                                                                    | Use the `--fastq-list` or explicit sample input methods instead of `--combine-samples-by-name`.                                                                                                                                    |
| **Somatic SNV / UMI**            | When running a Tumor+UMI/Normal workflow from BAM or CRAM input with `--tumor-normal-has-umi=tumor`, DRAGEN will crash. This issue does not affect FASTQ-input workflows, which are the recommended and most common input mode.                                                                               | Use FASTQ input for Tumor+UMI/Normal workflows. A fix is planned for a future release.                                                                                                                                             |
| **BAM list (All Callers mode)**  | When running DRAGEN in All Callers mode with `--bam-list` or `--cram-list` input, the CNV caller may not be invoked because CNV option parsing does not recognize `--bam-list` as valid input. This causes a downstream assertion failure when the Star Allele Caller expects a CNV VCF that was not generated. | As a workaround, run the CNV caller as a separate step using direct BAM/CRAM file inputs rather than `--bam-list`. A fix is planned for the next release. |
| **QC Metrics (GC)**              | DRAGEN GC metrics may differ from equivalent Picard GC metrics by more than ±1%. The DRAGEN implementation uses a different algorithm than Picard. This is a long-standing behavior difference and not a regression introduced in v4.5.                                                                       | No workaround. Customers comparing DRAGEN GC metrics against Picard-based thresholds should account for this systematic difference.                                                                                                |

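The RNA known issue above recommends verifying attribute names against the GTF annotation file before running. One way to do this is to list the attribute keys that actually occur in the file. The following is a minimal sketch, assuming a standard GTF layout (nine tab-separated columns, with attributes in column 9 written as `key "value";` pairs). The function name `gtf_attribute_keys` is hypothetical and is not part of DRAGEN.

```bash
# Hypothetical helper (not part of DRAGEN): print the distinct attribute keys
# found in column 9 of a GTF file, one per line.
# Assumes attribute values do not themselves contain ";" characters.
gtf_attribute_keys() {
  awk -F'\t' '
    /^#/ { next }                          # skip header/comment lines
    NF >= 9 {
      n = split($9, pairs, ";")            # attributes are ";"-separated
      for (i = 1; i <= n; i++) {
        # default whitespace split: kv[1] is the key, kv[2..] the value
        if (split(pairs[i], kv, " ") > 0) print kv[1]
      }
    }
  ' "$1" | LC_ALL=C sort -u
}
```

For example, `gtf_attribute_keys annotation.gtf | grep -x gene_name` (with `annotation.gtf` standing in for your own file) prints `gene_name` only if that attribute key is present in the annotation.
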
***

## SW Installation Procedure

### Prerequisites

* DRAGEN™ v4.5 software supports on-premises **Phase 3** or **Phase 4** DRAGEN™ servers.
  * **Note:** For on-premises analyses, TruPath analysis requires a Phase 4 DRAGEN server due to FPGA memory limitations. For reference, Phase 4 servers have a serial number beginning with the letters "AC".
* Supported operating system: **AlmaLinux 8 / Oracle Linux 8 (el8-compatible)**.
* Root (sudo) privileges are required for installation.

### Download

DRAGEN™ v4.5 software installers are available at:

<https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/downloads.html>

Download the `.run` installer appropriate for your platform:

```
dragen-4.5.4-12.multi.el8.x86_64.run
```

### Verify Installer Integrity

Verify the integrity of the downloaded installer before proceeding:

```bash
sudo sh dragen-4.5.4-12.multi.el8.x86_64.run --check
```
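
In scripted deployments, the integrity check can be used to gate the installation step. The sketch below assumes that the installer's `--check` option exits with a non-zero status when verification fails; this behavior is not confirmed in these release notes. The function name `check_then_install` and the `SUDO` variable are hypothetical.

```bash
# Hypothetical helper: run the installer only if its integrity check succeeds.
# Assumption: `--check` returns a non-zero exit status on verification failure.
# Set SUDO="" to run without sudo (for example, when already root).
check_then_install() {
  local installer="$1"
  local sudo_cmd="${SUDO-sudo}"
  if $sudo_cmd sh "$installer" --check; then
    $sudo_cmd sh "$installer"
  else
    echo "Integrity check failed for $installer; not installing." >&2
    return 1
  fi
}
```

A call such as `check_then_install dragen-4.5.4-12.multi.el8.x86_64.run` would then stop before installation if the check does not pass.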

### Install DRAGEN™ Software

DRAGEN™ v4.5 uses the multi-version installer. Multiple compatible versions may coexist on the same server.

```bash
sudo sh dragen-4.5.4-12.multi.el8.x86_64.run
```

After installation, DRAGEN™ v4.5 is available at:

```
/opt/dragen/4.5.4/bin/dragen
```

To add DRAGEN™ v4.5 to your PATH for the current user, add the following to `~/.bashrc`:

```bash
export PATH="/opt/dragen/4.5.4/bin:$PATH"
```
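
Because `~/.bashrc` can be sourced more than once per session, a plain `export` may add the same directory to `PATH` repeatedly. As an optional alternative, the following sketch prepends the directory only when it is not already present. The function name `prepend_path_once` is hypothetical.

```bash
# Hypothetical helper: prepend a directory to PATH only if it is not already
# listed, so repeated sourcing of ~/.bashrc does not duplicate the entry.
prepend_path_once() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH; nothing to do
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

prepend_path_once "/opt/dragen/4.5.4/bin"
```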

To view all installed DRAGEN™ versions on the server:

```bash
/usr/bin/dragen_versions
```

> **Note:** Installing a multi-version package will remove any previously installed single-version DRAGEN™ package (v4.2 or older). Installing a new multi-version package will NOT remove existing multi-version packages.

### Install Resource Files (Hash Tables)

DRAGEN™ v4.5 requires updated hash tables built to format version 12 (HTv12). Existing hash tables from DRAGEN™ v4.4 or earlier are **not compatible** and must be replaced.

Download the appropriate pre-built hash tables for your reference(s) from:

<https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html>

Extract hash tables to your reference directory (e.g., `/data/reference/`):

```bash
tar xzvf hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-12-r6.0-1.tar.gz -C /data/reference/
```
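
Hash table archives are large, so a truncated download may only surface partway through extraction. The sketch below first confirms that the archive can be read end to end, then creates the destination directory and extracts into it. The function name `extract_hash_table` is hypothetical, and the extra listing pass adds one full read of the archive.

```bash
# Hypothetical helper: verify that a .tar.gz archive is readable before
# extracting it into the reference directory.
extract_hash_table() {
  local archive="$1" dest="$2"
  # Listing the archive reads the whole gzip stream and fails on corruption.
  if ! tar tzf "$archive" > /dev/null; then
    echo "Cannot read archive: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  tar xzf "$archive" -C "$dest"
}
```

For example: `extract_hash_table hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-12-r6.0-1.tar.gz /data/reference/`.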

### Run System Self-Test

After installation, verify that the system is functioning correctly:

```bash
/opt/dragen/4.5.4/self_test/self_test.sh
```

The self-test takes approximately 12 minutes. A `PASS` result confirms that the DRAGEN™ server, FPGA, and software are operating correctly.

If the self-test reports a `FAIL` result after installation, contact Illumina Technical Support.
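
For unattended installations, it can be useful to keep a log of the self-test and turn the result into an exit status. The following is a sketch only: it assumes that the words `PASS` or `FAIL` appear in the self-test output, which is inferred from the description above rather than from a documented output format. The function name `run_self_test` and the default log file name are hypothetical.

```bash
# Hypothetical wrapper: run the self-test, save its output to a log file, and
# return 0 on PASS, 1 on FAIL, 2 if neither marker is found.
# Assumption: the self-test prints the word PASS or FAIL in its output.
run_self_test() {
  local script="${1:-/opt/dragen/4.5.4/self_test/self_test.sh}"
  local log="${2:-self_test_$(date +%Y%m%d_%H%M%S).log}"
  "$script" 2>&1 | tee "$log"
  if grep -qw 'FAIL' "$log"; then
    echo "Self-test reported FAIL; see $log" >&2
    return 1
  elif grep -qw 'PASS' "$log"; then
    return 0
  fi
  echo "No PASS/FAIL marker found in $log" >&2
  return 2
}
```

The saved log can be attached when contacting Illumina Technical Support about a failed self-test.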

### Licensing

DRAGEN™ v4.5 requires valid Illumina licenses. Licenses are verified on each run.

* For network-connected servers: licenses are automatically verified via `https://license.dragen.illumina.com`.
* For dark-site (air-gapped) servers: a license quota file must be installed. See the DRAGEN™ User Guide licensing section for instructions.

For license installation assistance, contact your Illumina sales representative or Illumina Technical Support.

### Additional Resources

| Resource                   | URL                                                                                                     |
| -------------------------- | ------------------------------------------------------------------------------------------------------- |
| DRAGEN™ User Guide         | <https://help.dragen.illumina.com>                                                                      |
| DRAGEN™ Downloads          | <https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/downloads.html>     |
| DRAGEN™ Product Files      | <https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html> |
| Server Site Prep Guide     | <https://support.illumina.com/downloads/illumina-dragen-server-site-prep-guide.html>                    |
| Illumina Technical Support | <https://support.illumina.com>                                                                          |

***

| Revision | Date       | Description     |
| -------- | ---------- | --------------- |
| 01       | March 2026 | Initial release |

***

*This document is proprietary. © Illumina, Inc. All rights reserved.*
