Data Management

Copying data to local /staging drive

  • Copy the run folder or FASTQ folder to the staging folder on the DRAGEN server, using the following recommended organization: /staging/runs/{RunID}. You can copy the folder onto the DRAGEN server using Linux commands such as rsync. The sample sheet within the run folder is used unless a different one is specified on the command line.

  • The run folder must be intact.

  • If the analysis output folder differs from the default, provide its path.
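
The copy step above can be sketched as a small helper that builds the rsync command for the recommended /staging/runs/{RunID} layout. This is a minimal sketch under stated assumptions: the function name, the use of the source folder name as the RunID, and the example source path are all hypothetical, and only the widely available `-a` (archive) rsync option is used.

```python
from pathlib import PurePosixPath
from typing import List

# Recommended organization from the text: /staging/runs/{RunID}
STAGING_RUNS = PurePosixPath("/staging/runs")


def build_rsync_command(source_run_folder: str) -> List[str]:
    """Build an rsync argument list that copies a run folder into /staging/runs/{RunID}."""
    source = PurePosixPath(source_run_folder)
    run_id = source.name  # assumption: the folder name is the RunID
    if not run_id:
        raise ValueError("source_run_folder must name a run folder")
    destination = STAGING_RUNS / run_id
    # Trailing slashes make rsync copy the folder contents into the destination
    # instead of nesting the source folder one level deeper.
    return ["rsync", "-a", f"{source}/", f"{destination}/"]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on the DRAGEN server. For a hypothetical source such as /mnt/nas/runs/RUN123, the command copies into /staging/runs/RUN123/.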

Analysis output directory

Before running the analysis, confirm that the directory the software writes its output to is empty and does not contain results from previous analyses.
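
One way to perform this check before launching is a short pre-flight script. The sketch below is illustrative only: the function name is hypothetical, and it reports whether a path is an existing directory with no entries. It does not reflect how the software itself validates the output directory.

```python
import os


def is_empty_directory(path: str) -> bool:
    """Return True only if path is an existing directory that contains no entries."""
    if not os.path.isdir(path):
        return False
    # scandir is lazy, so this stops at the first entry instead of listing everything.
    with os.scandir(path) as entries:
        return next(entries, None) is None
```

A missing path returns False here, so the caller can decide whether to create the directory or stop.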

Storage Requirements

The DRAGEN server provides an NVMe SSD in the /staging directory to use as the software output directory. Network-attached storage is required for long-term storage.

When running the Heme pipeline, use the default settings or set the -analysisFolder command line option to a directory in /staging to make sure the DRAGEN server processes read and write data on the NVMe SSD.

Before beginning analysis, develop a strategy to copy data from the DRAGEN server to network-attached storage. Delete output data on the DRAGEN server as soon as possible.

The following are the run folder output size estimates and the minimum free space requirements for fastq.gz or fastq.ora output format.

| Sequencing System | Run Folder Output (Gb) | Minimum Disk Space .gz (Gb) | Minimum Disk Space .ora (Gb) |
| --- | --- | --- | --- |
| NovaSeq 6000/6000Dx (RUO) S4 Flow Cell | ~2000 | 4000 | 2500 |
| NovaSeq X 10B | ~2000 | 4000 | 2500 |
| NovaSeq X 25B | ~4250 | 8500 | 5300 |
| Other Instruments | ~2000 | 4000 | 2500 |

When launching the analysis, the software checks that the minimum disk space required is available. If the minimum disk space is not available, the software shows an error message and prevents analysis from starting. If disk space is exhausted during a run, the run shows an error and stops analyzing.
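
To catch a shortage before launching, the minimum values from the table above can be compared against the free space on the output drive. The following is a rough sketch, not the software's own check: the function names are hypothetical, and it assumes the table values are gigabytes of 10^9 bytes. Systems not listed fall back to the "Other Instruments" row.

```python
import shutil

# Minimum free space from the table, keyed by (sequencing system, FASTQ format).
# Assumption: table values are interpreted as gigabytes (10**9 bytes).
MIN_FREE_GB = {
    ("NovaSeq 6000/6000Dx (RUO) S4 Flow Cell", "gz"): 4000,
    ("NovaSeq 6000/6000Dx (RUO) S4 Flow Cell", "ora"): 2500,
    ("NovaSeq X 10B", "gz"): 4000,
    ("NovaSeq X 10B", "ora"): 2500,
    ("NovaSeq X 25B", "gz"): 8500,
    ("NovaSeq X 25B", "ora"): 5300,
    ("Other Instruments", "gz"): 4000,
    ("Other Instruments", "ora"): 2500,
}


def required_free_gb(system: str, fastq_format: str) -> int:
    """Look up the minimum free space; unlisted systems use the 'Other Instruments' row."""
    fallback = MIN_FREE_GB[("Other Instruments", fastq_format)]
    return MIN_FREE_GB.get((system, fastq_format), fallback)


def has_enough_free_space(path: str, required_gb: int) -> bool:
    """Compare free space on the filesystem holding path with the requirement."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9
```

For example, `has_enough_free_space("/staging", required_free_gb("NovaSeq X 25B", "gz"))` reports whether /staging currently has the 8500 listed for that configuration.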

Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

Data streaming from Network Filesystem

Analysis of data stored on a network file system (NFS) may be slow when multiple DRAGEN servers read from and write to it simultaneously. However, streaming large datasets directly from NFS is advisable when transferring data to local /staging takes a significant amount of time, especially for NovaSeq X 25B flow cells. Consult your system administrator about network considerations for the DRAGEN server.
