Illumina Connected Annotations

Last updated 11 months ago

Illumina Connected Annotations, also known as Illumina Annotation Engine (IAE) or Nirvana, provides translational research-grade annotation of genomic variants (SNVs, MNVs, insertions, deletions, indels, STRs, gene fusions, and SVs, including CNVs). It can be run as a stand-alone package or integrated into larger software tools that require variant annotation.

Users can annotate VCF files by enabling annotation on the DRAGEN command-line or by running the standalone tool.

Illumina Connected Annotations takes VCF files as input and produces a structured JSON representation of all annotation and sample information (as extracted from the VCF). It supports multiple alternate alleles and multiple samples.

NOTE: Before running Annotations, the external data sources, gene models, and reference genome must be downloaded from the Illumina annotation server.

By default, the Annotations binaries are located in the /opt/dragen/<VERSION>/share/nirvana directory. This directory includes two files: the Downloader and Nirvana (Illumina Connected Annotations).

Limitations

Illumina Connected Annotations and the Downloader are compatible with the following platforms:

  • CentOS 7, Oracle Linux 8, and other modern Linux distributions running on x64 processors.

Download Data Files

For more up-to-date and detailed documentation, see Illumina Connected Annotations Download Data.

To store annotation data files, create a top-level directory. After the download, this directory contains three subdirectories:

  • Cache contains gene models.

  • SupplementaryAnnotation contains external data sources like dbSNP and gnomAD.

  • References contains the reference genome.

The Downloader accepts the following command-line options.

Option   Value                     Example   Description
--ga     GRCh37, GRCh38, or Both   GRCh38    Genome assembly
--out    output directory          ~/Data    Top-level output directory

Download data files as follows.

  1. To create a data directory, enter the following command. This example creates the Data directory in your home directory.

mkdir ~/Data
  2. Download the files for a genome assembly. This example downloads the genome assembly GRCh38.

<INSTALL_PATH>/share/nirvana/Downloader --ga GRCh38 --out ~/Data

You can use the same command to resynchronize the data sources with the Illumina Connected Annotations servers, including the following actions:

  • Remove obsolete files, such as old versions of data sources, from the output directory.

  • Download newer files.

The Downloader prints output similar to the following:

---------------------------------------------------------------------------
Downloader                                          (c) 2024 Illumina, Inc.
                                                                     3.23.0
---------------------------------------------------------------------------

- downloading manifest... 37 files.

- downloading file metadata:
  - finished (00:00:00.8).

- downloading files (22.123 GB):
  - downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (GRCh38)
  - downloading MITOMAP_20200224.nsa.idx (GRCh38)
  - downloading ClinVar_20200302.nsa.idx (GRCh38)
  - downloading REVEL_20160603.nsa.idx (GRCh38)
  - downloading phyloP_hg38.npd.idx (GRCh38)
  - downloading ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38)
  - downloading MITOMAP_SV_20200224.nsi (GRCh38)
  - downloading dbSNP_151_globalMinor.nsa.idx (GRCh38)
  - downloading ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38)
  - downloading PrimateAI_0.2.nsa.idx (GRCh38)
  - downloading ClinGen_disease_validity_curations_20191202.nga (GRCh38)
  - downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38)
  - downloading SpliceAi_1.3.nsa.idx (GRCh38)
  - downloading dbSNP_153.nsa.idx (GRCh38)
  - downloading TOPMed_freeze_5.nsa.idx (GRCh38)
  - downloading MITOMAP_20200224.nsa (GRCh38)
  - downloading gnomAD_2.1.nsa.idx (GRCh38)
  - downloading ClinGen_20160414.nsi (GRCh38)
  - downloading gnomAD_gene_scores_2.1.nga (GRCh38)
  - downloading 1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38)
  - downloading MultiZ100Way_20171006.pcs (GRCh38)
  - downloading 1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38)
  - downloading ClinVar_20200302.nsa (GRCh38)
  - downloading OMIM_20200409.nga (GRCh38)
  - downloading Both.transcripts.ndb (GRCh38)
  - downloading REVEL_20160603.nsa (GRCh38)
  - downloading PrimateAI_0.2.nsa (GRCh38)
  - downloading dbSNP_151_globalMinor.nsa (GRCh38)
  - downloading Both.sift.ndb (GRCh38)
  - downloading Both.polyphen.ndb (GRCh38)
  - downloading Homo_sapiens.GRCh38.Nirvana.dat
  - downloading 1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38)
  - downloading phyloP_hg38.npd (GRCh38)
  - downloading SpliceAi_1.3.nsa (GRCh38)
  - downloading TOPMed_freeze_5.nsa (GRCh38)
  - downloading dbSNP_153.nsa (GRCh38)
  - downloading gnomAD_2.1.nsa (GRCh38)
  - finished (00:04:10.1).

Description                                                     Status
---------------------------------------------------------------------------
1000_Genomes_Project_(SV)_Phase_3_v5a.nsi (GRCh38)                OK
1000_Genomes_Project_Phase_3_v3_plus.nsa (GRCh38)                 OK
1000_Genomes_Project_Phase_3_v3_plus.nsa.idx (GRCh38)             OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma (GRCh38)        OK
1000_Genomes_Project_Phase_3_v3_plus_refMinor.rma.idx (...        OK
Both.polyphen.ndb (GRCh38)                                        OK
Both.sift.ndb (GRCh38)                                            OK
Both.transcripts.ndb (GRCh38)                                     OK
ClinGen_20160414.nsi (GRCh38)                                     OK
ClinGen_Dosage_Sensitivity_Map_20190507.nga (GRCh38)              OK
ClinGen_Dosage_Sensitivity_Map_20200131.nsi (GRCh38)              OK
ClinGen_disease_validity_curations_20191202.nga (GRCh38)          OK
ClinVar_20200302.nsa (GRCh38)                                     OK
ClinVar_20200302.nsa.idx (GRCh38)                                 OK
Homo_sapiens.GRCh38.Nirvana.dat                                   OK
MITOMAP_20200224.nsa (GRCh38)                                     OK
MITOMAP_20200224.nsa.idx (GRCh38)                                 OK
MITOMAP_SV_20200224.nsi (GRCh38)                                  OK
MultiZ100Way_20171006.pcs (GRCh38)                                OK
OMIM_20200409.nga (GRCh38)                                        OK
PrimateAI_0.2.nsa (GRCh38)                                        OK
PrimateAI_0.2.nsa.idx (GRCh38)                                    OK
REVEL_20160603.nsa (GRCh38)                                       OK
REVEL_20160603.nsa.idx (GRCh38)                                   OK
SpliceAi_1.3.nsa (GRCh38)                                         OK
SpliceAi_1.3.nsa.idx (GRCh38)                                     OK
TOPMed_freeze_5.nsa (GRCh38)                                      OK
TOPMed_freeze_5.nsa.idx (GRCh38)                                  OK
dbSNP_151_globalMinor.nsa (GRCh38)                                OK
dbSNP_151_globalMinor.nsa.idx (GRCh38)                            OK
dbSNP_153.nsa (GRCh38)                                            OK
dbSNP_153.nsa.idx (GRCh38)                                        OK
gnomAD_2.1.nsa (GRCh38)                                           OK
gnomAD_2.1.nsa.idx (GRCh38)                                       OK
gnomAD_gene_scores_2.1.nga (GRCh38)                               OK
phyloP_hg38.npd (GRCh38)                                          OK
phyloP_hg38.npd.idx (GRCh38)                                      OK
---------------------------------------------------------------------------

Peak memory usage: 52.3 MB
Time: 00:04:12.2

NOTE: If the DRAGEN server does not have an internet connection, the Downloader executable can be copied to a non-DRAGEN server that is connected to the internet to download the annotation data. Once the download has completed, the annotation data can then be copied locally to the DRAGEN server for subsequent annotation.
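
Once a download (or a copy from another machine) has finished, the directory layout described in this section can be sanity-checked. The following is a minimal sketch, not part of the Annotations package: `check_annotation_dirs` is a hypothetical helper name, and it only tests that the three subdirectories exist, not that their contents are complete.

```shell
# Hypothetical helper (an assumption for illustration, not an Illumina tool):
# verify that a data directory contains the three subdirectories
# Cache, SupplementaryAnnotation, and References.
check_annotation_dirs() {
    local data_dir="$1"
    local missing=0
    local sub
    for sub in Cache SupplementaryAnnotation References; do
        if [ ! -d "${data_dir}/${sub}" ]; then
            # Report each missing subdirectory on stderr.
            echo "missing: ${data_dir}/${sub}" >&2
            missing=1
        fi
    done
    # Return 0 only when all three subdirectories were found.
    return "$missing"
}

# Usage, with the example directory from this page:
# check_annotation_dirs ~/Data && echo "layout looks complete"
```

A non-zero return value suggests that the download or copy did not complete, in which case rerunning the Downloader command resynchronizes the directory.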

Annotate Files (via DRAGEN command-line)

To automatically annotate output VCFs, please add the following command-line arguments:

Argument                        Example                      Description
--enable-variant-annotation     true                         Enables annotation if the pipeline supports it
--variant-annotation-data       /path/to/your/NirvanaData    Location of the downloaded Nirvana annotation files
--variant-annotation-assembly   GRCh38                       Genome assembly, either GRCh37 or GRCh38. For hg19, use GRCh37

All the command-line arguments shown together:

--enable-variant-annotation=true --variant-annotation-data=/path/to/your/NirvanaData --variant-annotation-assembly=GRCh38
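
When these arguments are assembled in a script, it can help to validate the assembly name first. The sketch below assumes a hypothetical helper named `annotation_args`; only the three argument names and the GRCh37/GRCh38/hg19 rule come from the table above.

```shell
# Hypothetical helper (illustration only): print the three DRAGEN
# annotation arguments for a given data directory and assembly.
annotation_args() {
    local data_dir="$1"
    local assembly="$2"
    # Per the table above, hg19 is handled by using GRCh37.
    if [ "$assembly" = "hg19" ]; then
        assembly="GRCh37"
    fi
    case "$assembly" in
        GRCh37|GRCh38) ;;
        *)
            echo "unsupported assembly: $assembly" >&2
            return 1
            ;;
    esac
    printf '%s %s %s\n' \
        "--enable-variant-annotation=true" \
        "--variant-annotation-data=${data_dir}" \
        "--variant-annotation-assembly=${assembly}"
}

# Usage: print the arguments, then append them to an existing DRAGEN command.
# annotation_args /path/to/your/NirvanaData GRCh38
```

The helper prints the arguments on a single line separated by spaces, so a data path that itself contains spaces would need extra quoting when the output is reused.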

Annotate Files (via standalone Illumina Connected Annotations tool)

  1. If you have not generated a VCF file, download a VCF file using the following command.

curl -O https://raw.githubusercontent.com/HelixGrind/DotNetMisc/master/TestFiles/HiSeq.10000.vcf.gz

Annotations supports uncompressed VCF files and bgzip compressed VCF files. VCF files that have been compressed by standard gzip are not supported.

  2. To annotate the file, enter the following command:

<INSTALL_PATH>/share/nirvana/Nirvana -c ~/Data/Cache/ \
-r ~/Data/References/Homo_sapiens.GRCh38.Nirvana.dat \
--sd ~/Data/SupplementaryAnnotation/GRCh38 -i HiSeq.10000.vcf.gz -o HiSeq.10000
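
Because VCF files compressed with standard gzip are not supported, it can be useful to check a compressed input before annotating it. The sketch below relies on the BGZF header layout: a bgzip file is a gzip file whose extra field carries a subfield with the identifier bytes `B` and `C`. The helper name `is_bgzip` is hypothetical, and the check covers only the common case where that subfield comes first in the extra field.

```shell
# Hypothetical helper (illustration only): succeed if the file starts with
# a BGZF header, that is, gzip magic (1f 8b), deflate (08), the FEXTRA
# flag (04), and the 'B' 'C' subfield identifier (42 43) at bytes 13-14.
is_bgzip() {
    local header
    # Read the first 14 bytes and render them as one hex string.
    header=$(head -c 14 "$1" | od -An -tx1 | tr -d ' \n')
    case "$header" in
        # 8 header bytes (16 hex digits) sit between the flag and 'BC'.
        1f8b0804????????????????4243) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, with the example file from step 1:
# is_bgzip HiSeq.10000.vcf.gz || echo "not bgzip-compressed" >&2
```

If the check fails for a `.vcf.gz` file, decompressing it and recompressing it with bgzip is the usual remedy.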

The Nirvana command accepts the following command-line options:

Option   Value       Example                                             Description
-c       directory   ~/Data/Cache/                                       Cache directory
-r       path        ~/Data/References/Homo_sapiens.GRCh38.Nirvana.dat   Reference file path
--sd     directory   ~/Data/SupplementaryAnnotation/GRCh38               Supplementary annotation directory
-i       path        HiSeq.10000.vcf.gz                                  Input VCF path
-o       prefix      HiSeq.10000                                         Output path prefix

Using the example above, Annotations writes its results to a file called HiSeq.10000.json.gz and prints the following console output.

---------------------------------------------------------------------------
Illumina Connected Annotations                      (c) 2024 Illumina, Inc.
                                                                     3.23.0
---------------------------------------------------------------------------

Initialization                                         Time     Positions/s
---------------------------------------------------------------------------
Cache                                               00:00:01.9
SA Position Scan                                    00:00:00.4       23,867

Reference                                Preload    Annotation   Variants/s
---------------------------------------------------------------------------
chr1                                    00:00:00.4  00:00:03.7        2,651

Summary                                                Time         Percent
---------------------------------------------------------------------------
Initialization                                      00:00:02.3       25.7 %
Preload                                             00:00:00.4        5.4 %
Annotation                                          00:00:03.7       41.5 %

Peak memory usage: 1.284 GB
Time: 00:00:08.0
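
To take a quick look at the result, the start of the compressed file can be printed without unpacking it to disk. This is a minimal sketch that assumes the `.json.gz` output can be read by standard `gzip -dc`; `peek_json` is a hypothetical helper name, and the default of 300 bytes is arbitrary.

```shell
# Hypothetical helper (illustration only): print the first N bytes of a
# gzip-compatible JSON file. N defaults to 300 when not given.
peek_json() {
    local file="$1"
    local nbytes="${2:-300}"
    # Decompress to stdout and stop after the requested number of bytes.
    gzip -dc "$file" | head -c "$nbytes"
}

# Usage, with the output file from the example above:
# peek_json HiSeq.10000.json.gz 500
```

Reading only the first bytes keeps the check fast even for large annotation files, since decompression stops as soon as `head` has enough input.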

JSON Output File

Annotations produces an output file in JSON format. Refer to Illumina Connected Annotations JSON for a detailed description of the JSON file.

Version History

Annotations binaries have been included with DRAGEN since v3.5. The table below indicates which version of the Annotations binaries was included with each DRAGEN release, along with its AI annotation capabilities.

The Annotations binaries distributed with DRAGEN cannot be changed. Newer versions of Annotations are backward compatible and can therefore annotate output files from older DRAGEN releases.

DRAGEN version(s)          Annotations version   AI annotations
4.3                        3.23                  spliceAI, primateAI3D
3.9, 3.10, 4.0, 4.1, 4.2   3.16.1                spliceAI, primateAI
3.8                        3.14                  spliceAI, primateAI
3.6, 3.7                   3.9.0                 spliceAI, primateAI
3.5                        3.6.0                 spliceAI, primateAI

Illumina Connected Annotations Download Data
Illumina Connected Annotations JSON