DRAGEN Server App

Analysis on DRAGEN Server

Prerequisites

  • DRAGEN Phase 3 or 4 server

  • DRAGEN License

  • Network storage server

DRAGEN server

A DRAGEN Phase 4 server is recommended, especially for datasets from NovaSeq X instruments. The server has 12 TB of intermediate data storage space, which is enough to fully process a NovaSeq X 25B flow cell.

The DRAGEN Phase 3 server has 6 TB of intermediate data storage space, which can accommodate flow cells from the NovaSeq 6000 or 6000 Dx instruments.

DRAGEN license

The Heme pipeline uses the standard DRAGEN license without requiring any special licenses.

NFS and CIFS file servers

The Heme pipeline is designed to stream data from a network file server onto the DRAGEN server, complete the analysis using the /staging area of the high-performance SSD, and then stream the analysis output back to the network file server.

The network file server can be mounted on the DRAGEN server using the NFS or CIFS (SMB 1.0) protocol. If the SMB protocol is used, SMB 2.0 or higher with Active Directory support is recommended.
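As a quick sanity check before launching an analysis, the mount type of the network share can be read from the Linux mount table. The sketch below is illustrative only and is not part of the Heme pipeline: it assumes a Linux host that exposes `/proc/mounts`, and the function names and the `/mnt/heme_data` mount point used in the usage note are hypothetical.

```python
from typing import Optional

# File system types that Linux typically reports for NFS and CIFS/SMB mounts.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs"}


def find_mount_type(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the file system type mounted at mount_point, or None if absent."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line lists: device, mount point, file system type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # a later entry shadows an earlier one
    return fs_type


def is_network_mount(mounts_text: str, mount_point: str) -> bool:
    """True if mount_point is mounted with an NFS or CIFS file system type."""
    return find_mount_type(mounts_text, mount_point) in NETWORK_FS_TYPES
```

To use it, read the mount table with `open("/proc/mounts").read()` and pass the text together with the mount point, for example `is_network_mount(text, "/mnt/heme_data")`.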

Starting from BCL Files

If starting from BCL (*.bcl) files, the Heme pipeline requires the run folder to contain certain files and folders.

The run folder contains data from the sequencing run. Make sure that the folder contains the following files and folders:

| Folder/File | Description |
| --- | --- |
| Config folder | Configuration files. |
| Data folder | *.bcl files. |
| Images folder | [Optional] Raw sequencing image files. |
| Interop folder | Interop metric files. |
| Logs folder | [Optional] Sequencing system log files. |
| RTALogs folder | Real-Time Analysis (RTA) log files. |
| RunInfo.xml file | Run information. |
| RunParameters.xml file | Run parameters. |
| SampleSheet.csv file | Sample information. If you want to use a sample sheet that is not in the run folder or a sample sheet named something other than SampleSheet.csv, provide the full path. |
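The required entries listed above can be verified with a short pre-flight script. This is a minimal sketch, not a tool shipped with the pipeline: the function name is hypothetical, the optional Images and Logs folders are skipped, and names are compared case-insensitively on the assumption that the capitalization on disk may differ from the table.

```python
from pathlib import Path
from typing import List, Optional

# Required entries taken from the run folder table; optional ones are omitted.
REQUIRED_DIRS = ["Config", "Data", "Interop", "RTALogs"]
REQUIRED_FILES = ["RunInfo.xml", "RunParameters.xml"]


def missing_run_folder_entries(
    run_folder: str, sample_sheet: Optional[str] = None
) -> List[str]:
    """Return the required run folder entries that were not found."""
    root = Path(run_folder)
    # Index the folder contents by lowercase name for case-insensitive lookup.
    present = {p.name.lower(): p for p in root.iterdir()} if root.is_dir() else {}
    missing = []
    for name in REQUIRED_DIRS:
        entry = present.get(name.lower())
        if entry is None or not entry.is_dir():
            missing.append(name + "/")
    for name in REQUIRED_FILES:
        entry = present.get(name.lower())
        if entry is None or not entry.is_file():
            missing.append(name)
    # Use the explicit sample sheet path if given, else SampleSheet.csv in the folder.
    sheet = Path(sample_sheet) if sample_sheet else present.get("samplesheet.csv")
    if sheet is None or not sheet.is_file():
        missing.append(sample_sheet if sample_sheet else "SampleSheet.csv")
    return missing
```

An empty list means every required entry was found. Pass `sample_sheet` only when the sample sheet lives outside the run folder or has a different name.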

Starting from FASTQ Files

The following inputs are required for running the Heme pipeline using FASTQ (*.fastq) files.

  • Full path to an existing FASTQ folder.

  • The FASTQ folder structure conforms to the folder structure in FASTQ File Organization.

  • The sample sheet is in the FASTQ folder path, or you can set the path to the sample sheet with the --sampleSheet override command line option.

Make sure there is sufficient disk space for the analysis to complete. Refer to the --help command line argument details for disk space requirements.
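Free space on the analysis volume can be checked programmatically before a run starts. The following sketch uses only the Python standard library. The helper name is hypothetical, and the required size must come from the --help output rather than from this example.

```python
import shutil

TB = 10**12  # one terabyte in bytes (decimal)


def has_free_space(path: str, required_bytes: int) -> bool:
    """True if the file system holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes
```

For example, `has_free_space("/staging", 2 * TB)` reports whether the /staging area has at least 2 TB free. The 2 TB figure is a placeholder, not a documented requirement.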

Use BCL Convert to produce FASTQ files for the Heme pipeline. Using bcl2fastq does not produce the same results and is discouraged.

FASTQ File Organization

Store FASTQ files in individual subfolders, each corresponding to a specific Sample_ID, and keep file pairs together in the same folder. Alternatively, store all FASTQ files together in a single flat folder.

The Heme pipeline requires separate FASTQ files per sample. Do not merge FASTQ files.

The instrument generates two FASTQ files per flow cell lane, so a flow cell with four lanes produces eight FASTQ files per sample.

The following example shows the FASTQ file naming convention:

Sample1_S1_L001_R1_001.fastq.gz

  • Sample1 represents the Sample ID.

  • The S in S1 means sample, and the 1 in S1 is based on the order of samples in the sample sheet, so S1 is the first sample.

  • L001 represents the flow cell lane number.

  • The R in R1 means Read, so R1 refers to Read 1.
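The naming convention above can be parsed with a regular expression, which also makes it easy to confirm that each lane has both files of a pair. This is an illustrative sketch under stated assumptions: the function and class names are hypothetical, the trailing `_001` is matched literally as it appears in the example, and only R1 and R2 files are recognized.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional

# Pattern for names such as Sample1_S1_L001_R1_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample_id>.+)_S(?P<sample_number>\d+)"
    r"_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)


class FastqName(NamedTuple):
    sample_id: str      # Sample ID, e.g. "Sample1"
    sample_number: int  # position of the sample in the sample sheet
    lane: int           # flow cell lane number
    read: int           # 1 for Read 1, 2 for Read 2


def parse_fastq_name(file_name: str) -> Optional[FastqName]:
    """Split a FASTQ file name into its fields, or return None if it does not match."""
    match = FASTQ_NAME.match(file_name)
    if match is None:
        return None
    return FastqName(
        match["sample_id"],
        int(match["sample_number"]),
        int(match["lane"]),
        int(match["read"]),
    )


def unpaired_lanes(file_names: Iterable[str]) -> Dict[str, List[int]]:
    """Map each Sample ID to the lanes that lack either the R1 or the R2 file."""
    reads = defaultdict(set)
    for name in file_names:
        parsed = parse_fastq_name(name)
        if parsed is not None:  # names that do not match are ignored
            reads[(parsed.sample_id, parsed.lane)].add(parsed.read)
    problems = defaultdict(list)
    for (sample_id, lane), seen in sorted(reads.items()):
        if seen != {1, 2}:
            problems[sample_id].append(lane)
    return dict(problems)
```

Passing the file names from a FASTQ folder to `unpaired_lanes` returns an empty dictionary when every lane of every sample has both reads.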
