Illumina Connected Annotations

Illumina Connected Annotations, also known as Illumina Annotation Engine (IAE) or Nirvana, provides translational research-grade annotation of genomic variants: SNVs, MNVs, insertions, deletions, indels, STRs, gene fusions, and SVs (including CNVs). It can be run as a stand-alone package or integrated into larger software tools that require variant annotation.

Users can annotate VCF files by enabling annotation on the DRAGEN command line or by running the standalone tool.

The input to Illumina Connected Annotations is a VCF file, and the output is a structured JSON representation of all annotation and sample information (as extracted from the VCF). Illumina Connected Annotations handles multiple alternate alleles and multiple samples.

NOTE: Before running Annotations, the external data sources, gene models, and reference genome need to be downloaded from our annotation server.

By default, the Annotations binaries are located in the /opt/dragen/<VERSION>/share/nirvana directory. This directory includes two files: the Downloader and Nirvana (Illumina Connected Annotations).

Limitations

Illumina Connected Annotations and the Downloader are compatible with the following platforms:

  • CentOS 7, Oracle 8 and other modern Linux distributions using x64 processors.

Download Data Files

For more up-to-date and detailed documentation, please visit Illumina Connected Annotations Download Data.

To store annotation data files, create a top-level directory. Once the Downloader has populated it, the directory contains three subdirectories:

  • Cache contains gene models.

  • SupplementaryAnnotation contains external data sources like dbSNP and gnomAD.

  • References contains the reference genome.

The following command-line options are used.

| Option | Value | Example | Description |
| --- | --- | --- | --- |
| --ga | GRCh37, GRCh38, or Both | GRCh38 | Genome assembly |
| --out | output directory | ~/Data | Top-level output directory |

Download data files as follows.

  1. To create a data directory, enter the following command. This example creates the Data directory in your home directory.

  2. Download the files for a genome assembly. This example downloads the genome assembly GRCh38.
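The two steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact commands from the product documentation: it assumes the Downloader is an executable in the default binaries directory, keeps `<VERSION>` as a placeholder that you must replace, and uses only the `--ga` and `--out` options listed in the table. The names `NIRVANA_DIR`, `run_or_print`, and `download_annotation_data` are illustrative helpers, not part of the tool.

```shell
# Assumption: default binaries directory; replace <VERSION> with your DRAGEN version.
NIRVANA_DIR="${NIRVANA_DIR:-/opt/dragen/<VERSION>/share/nirvana}"

# Run the command if its executable exists; otherwise print it (dry run).
run_or_print() {
    if [ -x "$1" ]; then
        "$@"
    else
        printf 'Would run: %s\n' "$*"
    fi
}

# $1 = genome assembly (GRCh37, GRCh38, or Both), $2 = top-level data directory
download_annotation_data() {
    # Step 1: create the data directory (no error if it already exists).
    mkdir -p "$2" || return 1
    # Step 2: download, or resynchronize, the data files for the assembly.
    run_or_print "$NIRVANA_DIR/Downloader" --ga "$1" --out "$2"
}

download_annotation_data GRCh38 "$HOME/Data"
```

Because the helper falls back to printing the command when the Downloader is not found, the sketch can be tried on a machine without DRAGEN installed before running it for real.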

You can use the same command to resynchronize the data sources with the Illumina Connected Annotations servers, including the following actions:

  • Remove obsolete files, such as old versions of data sources, from the output directory.

  • Download newer files.

The following is the created output:

NOTE: If the DRAGEN server does not have an internet connection, the Downloader executable can be copied to a non-DRAGEN server that is connected to the internet to download the annotation data. Once the download has completed, the annotation data can then be copied locally to the DRAGEN server for subsequent annotation.
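After copying the annotation data to the DRAGEN server (for example with `rsync` or `scp`), a quick sanity check can confirm that the three subdirectories described earlier arrived. This is a hedged sketch: `check_annotation_data` is a hypothetical helper, and it checks only that the directories exist, not that their contents are complete.

```shell
# $1 = top-level annotation data directory on the DRAGEN server
# Returns 0 if all three expected subdirectories exist, 1 otherwise.
check_annotation_data() {
    status=0
    for sub in Cache SupplementaryAnnotation References; do
        if [ ! -d "$1/$sub" ]; then
            printf 'Missing subdirectory: %s\n' "$1/$sub" >&2
            status=1
        fi
    done
    return "$status"
}

if check_annotation_data "$HOME/Data"; then
    echo "Annotation data layout looks complete."
else
    echo "Annotation data layout is incomplete; copy the data again."
fi
```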

Annotate Files (via DRAGEN command-line)

To automatically annotate output VCFs, please add the following command-line arguments:

| Argument | Example | Description |
| --- | --- | --- |
| --enable-variant-annotation | true | Enables annotation if the pipeline supports it |
| --variant-annotation-data | /path/to/your/NirvanaData | The location where you downloaded the Nirvana annotation files |
| --variant-annotation-assembly | GRCh38 | The genome assembly, either GRCh37 or GRCh38. hg19 is handled properly by using GRCh37 |

All the command-line arguments shown together:
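As a minimal sketch, the three arguments from the table can be assembled like this. The helper name `dragen_annotation_args` and the `<other DRAGEN options>` placeholder are illustrative assumptions; the rest of the DRAGEN command line depends on your pipeline and is not shown. The sketch prints the command rather than running it.

```shell
# Prints the three annotation arguments for a DRAGEN command line.
# $1 = annotation data directory, $2 = genome assembly
dragen_annotation_args() {
    case "$2" in
        GRCh37|hg19) assembly=GRCh37 ;;   # hg19 data is annotated as GRCh37
        GRCh38)      assembly=GRCh38 ;;
        *) printf 'Unsupported assembly: %s\n' "$2" >&2; return 1 ;;
    esac
    printf '%s %s %s %s %s %s\n' \
        --enable-variant-annotation true \
        --variant-annotation-data "$1" \
        --variant-annotation-assembly "$assembly"
}

# <other DRAGEN options> stands for the rest of your pipeline's command line.
echo "dragen <other DRAGEN options> $(dragen_annotation_args /path/to/your/NirvanaData GRCh38)"
```

The printed line ends with `--enable-variant-annotation true --variant-annotation-data /path/to/your/NirvanaData --variant-annotation-assembly GRCh38`. Paths containing spaces would need extra quoting, which this sketch does not handle.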

Annotate Files (via standalone Illumina Connected Annotations tool)

  1. If you have not generated a VCF file, download a VCF file using the following command.

Annotations supports uncompressed VCF files and bgzip compressed VCF files. VCF files that have been compressed by standard gzip are not supported.
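Because bgzip and standard gzip files both use the `.gz` extension, it can help to check the file header before annotating. The sketch below relies on the BGZF header layout (gzip magic bytes, the extra-field flag, and a `BC` extra subfield at byte offset 12) and assumes `BC` is the first extra subfield, which is how bgzip writes it. `vcf_compression` is a hypothetical helper, not part of Annotations.

```shell
# Prints "bgzip", "gzip", or "uncompressed" for the file given as $1.
vcf_compression() {
    # First 16 bytes as one contiguous lowercase hex string.
    header=$(head -c 16 "$1" | od -An -tx1 | tr -d ' \n')
    case "$header" in
        # 1f 8b = gzip magic, 08 = deflate, 04 = extra-field flag,
        # 8 bytes of other header fields, then 42 43 = "BC" subfield.
        1f8b0804????????????????4243*) echo bgzip ;;
        # Any other gzip stream: not supported as annotation input.
        1f8b*) echo gzip ;;
        *) echo uncompressed ;;
    esac
}

# Classify every file passed on the command line.
for vcf in "$@"; do
    printf '%s\t%s\n' "$vcf" "$(vcf_compression "$vcf")"
done
```

A file reported as `gzip` needs to be decompressed and recompressed with bgzip before it can be annotated.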

  2. To annotate the file, enter the following command:
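A minimal sketch of the annotation step, built only from the options and example values documented on this page. It assumes Nirvana is a directly executable file in the default binaries directory, with `<VERSION>` left as a placeholder to replace. `NIRVANA_DIR`, `run_or_print`, and `annotate_vcf` are illustrative names, not part of the tool.

```shell
# Assumption: default binaries directory; replace <VERSION> with your DRAGEN version.
NIRVANA_DIR="${NIRVANA_DIR:-/opt/dragen/<VERSION>/share/nirvana}"

# Run the command if its executable exists; otherwise print it (dry run).
run_or_print() {
    if [ -x "$1" ]; then
        "$@"
    else
        printf 'Would run: %s\n' "$*"
    fi
}

# $1 = top-level data directory, $2 = input VCF, $3 = output path prefix
annotate_vcf() {
    run_or_print "$NIRVANA_DIR/Nirvana" \
        -c "$1/Cache/" \
        -r "$1/References/Homo_sapiens.GRCh38.Nirvana.dat" \
        --sd "$1/SupplementaryAnnotation/GRCh38" \
        -i "$2" \
        -o "$3"
}

annotate_vcf "$HOME/Data" HiSeq.10000.vcf.gz HiSeq.10000
```

The sketch hard-codes the GRCh38 reference and supplementary annotation paths from the examples; annotating GRCh37 data would require the corresponding GRCh37 paths.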

The following are the available command line options:

| Option | Value | Example | Description |
| --- | --- | --- | --- |
| -c | directory | ~/Data/Cache/ | Cache directory |
| -r | path | ~/Data/References/Homo_sapiens.GRCh38.Nirvana.dat | Reference file path |
| --sd | directory | ~/Data/SupplementaryAnnotation/GRCh38 | Supplementary annotation directory |
| -i | path | HiSeq.10000.vcf.gz | Input VCF path |
| -o | prefix | HiSeq.10000 | Output path prefix |

Using the example above, Annotations generates an output file named HiSeq.10000.json.gz.

JSON Output File

Annotations produces an output file in JSON format. Please refer to Illumina Connected Annotations JSON for a detailed description of the JSON file.
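Since the output is compressed, a quick way to look at the start of the file is to decompress it to standard output. This is a hedged sketch: `preview_json` is a hypothetical helper, and it makes no assumption about the JSON structure beyond the file being gzip-compatible.

```shell
# $1 = compressed JSON output file, $2 = number of lines to show (default 5)
preview_json() {
    gzip -dc "$1" | head -n "${2:-5}"
}

# Preview the example output if it exists in the current directory.
if [ -f HiSeq.10000.json.gz ]; then
    preview_json HiSeq.10000.json.gz
fi
```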

Version History

Annotations binaries have been included with DRAGEN since v3.5. The table below indicates which version of the Annotations binaries was included with each DRAGEN release, along with its AI annotation capabilities.

The Annotations binaries distributed with DRAGEN cannot be changed. Newer versions of Annotations are backward compatible and can therefore annotate output files from older DRAGEN releases.

| DRAGEN version(s) | Annotations version | AI annotations |
| --- | --- | --- |
| 4.3 | 3.23 | spliceAI, primateAI3D |
| 3.9, 3.10, 4.0, 4.1, 4.2 | 3.16.1 | spliceAI, primateAI |
| 3.8 | 3.14 | spliceAI, primateAI |
| 3.6, 3.7 | 3.9.0 | spliceAI, primateAI |
| 3.5 | 3.6.0 | spliceAI, primateAI |
