Other scRNA Prep

Overview

This recipe is for processing general single-cell RNA workflows.

Example Command Line

  • Configure the INPUT options

  • Configure the OUTPUT options

  • Configure the SCRNA MAP/ALIGN options

  • Configure the SCRNA options

We recommend using a linear (non-pangenome) reference for single-cell RNA analysis. For more details, refer to DRAGEN Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

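# Define the input sources, either a FASTQ list or FASTQ files.
# FASTQ_LIST, FASTQ_LIST_SAMPLE_ID, FASTQ1, FASTQ2, RGSM, and RGID must be set beforehand;
# with "set -u", referencing an unset variable aborts the script.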
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

# Select the input source. This example uses INPUT_FASTQ_LIST.
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

# RNA alignment requires an annotation file in GTF format.
GTF=<GTF_PATH>

# The single-cell RNA pipeline requires map-align to be true.
# Map-align output can be optionally enabled. Output format options are SAM, BAM, and CRAM (set to BAM here).
SCRNA_MAP_OPTIONS="
  --enable-rna true \
  --enable-map-align true \
  --annotation-file $GTF \
  --enable-map-align-output true \
  --output-format BAM \
"

# Single-cell RNA options:

# The barcode+UMI source can be set to qname, read1, read2, or fastq. Here it is set to read1.
UMI_SRC=read1

# Barcode and UMI positions should be provided in the form <startPos>_<endPos> for each barcode. Connect multiple barcode sequence positions with a "+".
# For example, a library with the cell-barcode split into three blocks of 9 bp separated by fixed linker sequences and an 8 bp UMI would be set to:
# BARCODE_POS=0_8+21_29+43_51
# UMI_POS=52_59
BARCODE_POS=<BARCODE_POS>
UMI_POS=<UMI_POS>

# A known barcode sequence list can be optionally provided.
BARCODE_SEQUENCE_LIST=<BARCODE_SEQ_LIST_PATH>

# Cell filtering can be done by setting a threshold using the fixed, ratio, or inflection approach.
# Filtering can be done by umi (default) or by read (the argument is optional but included here for clarity).
# Here we set the threshold using the ratio approach and filter by umi.
FILTER_THRESHOLD=ratio
FILTER_BY=umi

SCRNA_OPTIONS="
  --enable-single-cell-rna true \
  --umi-source $UMI_SRC \
  --scrna-barcode-position $BARCODE_POS \
  --scrna-umi-position $UMI_POS \
  --scrna-barcode-sequence-list $BARCODE_SEQUENCE_LIST \
  --single-cell-threshold $FILTER_THRESHOLD \
  --single-cell-threshold-filterby $FILTER_BY \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $SCRNA_MAP_OPTIONS \
  $SCRNA_OPTIONS
"

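# Execute
# The option strings above contain embedded newlines; replace them with spaces
# so that the whole command is passed to bash as a single line.
CMD="${CMD//$'\n'/ }"
echo "$CMD"
bash -c "$CMD"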
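
As an alternative to assembling the command in a string, the options can be collected in a Bash array so that each value is passed as exactly one argument, even when a path contains spaces. The following is a minimal sketch under stated assumptions, not the recommended DRAGEN invocation: the paths and the RUN switch are hypothetical placeholders, and only options that already appear in the template above are used.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical placeholder values, for illustration only.
REF_DIR="/path/to/ref dir"
OUT_DIR="/path/to/out"
GTF="/path/to/genes.gtf"

# Each array element becomes exactly one argument; no word splitting occurs.
cmd=(
  dragen
  --ref-dir "$REF_DIR"
  --output-directory "$OUT_DIR"
  --enable-rna true
  --enable-map-align true
  --annotation-file "$GTF"
  --enable-single-cell-rna true
)

# Print a shell-quoted copy of the command for logging.
printf '%q ' "${cmd[@]}"
echo

# Dry run by default; set the (hypothetical) RUN=1 switch to execute.
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
fi
```

Because the array is expanded with "${cmd[@]}", no second round of shell parsing is needed, which avoids the quoting pitfalls of passing a command string to bash -c.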

For more details on the single-cell RNA options, refer to the DRAGEN Single-Cell RNA User Guide.
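
Before launching a run, the barcode and UMI position strings described in the template comments can be sanity-checked by summing the block lengths. The sketch below assumes zero-based, inclusive coordinates, which is inferred from the template example where 0_8 denotes a 9 bp block; position_length is a hypothetical helper and not part of DRAGEN.

```shell
#!/bin/bash
set -euo pipefail

# Sum the lengths of all blocks in a position string such as 0_8+21_29+43_51.
# Assumption: coordinates are zero-based and inclusive, so 0_8 covers 9 bases.
position_length() {
  local total=0 block start end
  local IFS='+'              # split the argument on "+" into blocks
  for block in $1; do
    start=${block%%_*}       # text before the underscore
    end=${block##*_}         # text after the underscore
    total=$(( total + end - start + 1 ))
  done
  echo "$total"
}

position_length "0_8+21_29+43_51"   # prints 27 (three 9 bp barcode blocks)
position_length "52_59"             # prints 8  (8 bp UMI)
```

If the totals do not match the expected barcode and UMI lengths for the library, the position strings are likely off by one or refer to the wrong blocks.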
