DRAGEN Fragmentomics
Fragmentomics is the study of fragmentation patterns of cell-free DNA (cfDNA), including circulating tumor DNA (ctDNA). DNA molecules are released into plasma from various tissues and cell types. Fragmentation features of cell-free DNA, such as fragment sizes and end motifs, carry characteristics of their tissue of origin. Studies have shown that fragmentation features differ between cancer-derived and noncancer-derived cell-free DNA, and genome-wide fragment profiles of cell-free DNA have proven to be a powerful tool for inferring cancer status and tissue of origin [1]. DRAGEN supports three fragmentomics components, which can be run independently or combined in a single run:
Fragment profile
End motif frequency
Window protection score (WPS)

The fragmentomics workflow processes aligned reads from the mapper, calculates per-read metrics, and tabulates them into per-bin or per-target-region metrics.
For the fragment profile, DRAGEN first reads chromosome sizes from the reference genome; only autosomes and chromosomes X and Y are considered. The genome is divided into bins of the size specified by the user, and aligned reads are processed sequentially. A read is counted only if it is 1) mapped, 2) mate-mapped, 3) not a PCR duplicate, 4) a primary alignment, and 5) has a mapping quality at or above the minimum MAPQ specified by the user. Reads whose template length falls within the short-fragment size range are counted as short fragments, and reads whose template length falls within the long-fragment size range are counted as long fragments. The fragment profile is the ratio of short-fragment to long-fragment counts in each genomic bin. The short-fragment counts, long-fragment counts, and their ratio are normalized against the GC bias of each genomic bin using the GC correction module from the DRAGEN CNV component.
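The filtering and counting steps for the fragment profile can be sketched in Python. This is only an illustration, not DRAGEN's implementation: the dictionary-based read representation, the bin size, the MAPQ threshold, and both fragment-size ranges are hypothetical values chosen for the example, and GC correction is omitted.

```python
from collections import defaultdict

# Hypothetical settings for illustration; DRAGEN's defaults are not stated here.
BIN_SIZE = 5_000_000
MIN_MAPQ = 30
SHORT_RANGE = (100, 150)  # assumed inclusive short-fragment size range
LONG_RANGE = (151, 220)   # assumed inclusive long-fragment size range


def passes_filters(read, min_mapq=MIN_MAPQ):
    """Apply the five read-level criteria: mapped, mate-mapped,
    not a duplicate, primary alignment, and MAPQ at or above the minimum."""
    return (
        read["mapped"]
        and read["mate_mapped"]
        and not read["duplicate"]
        and read["primary"]
        and read["mapq"] >= min_mapq
    )


def fragment_profile(reads, bin_size=BIN_SIZE):
    """Return {(chrom, bin_index): (short, long, ratio)} with no GC correction."""
    counts = defaultdict(lambda: [0, 0])
    for read in reads:
        if not passes_filters(read):
            continue
        size = abs(read["template_length"])
        key = (read["chrom"], read["pos"] // bin_size)
        if SHORT_RANGE[0] <= size <= SHORT_RANGE[1]:
            counts[key][0] += 1
        elif LONG_RANGE[0] <= size <= LONG_RANGE[1]:
            counts[key][1] += 1
    profile = {}
    for key, (short, long_) in counts.items():
        # The ratio is undefined when a bin has no long fragments.
        ratio = short / long_ if long_ else None
        profile[key] = (short, long_, ratio)
    return profile
```

Returning `None` for bins without long fragments is a choice made for this sketch; how DRAGEN reports such bins is not described here.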
End motif frequency calculation is enabled with --enable-fragmentomics-end-motif true. The motif length is controlled by --fragmentomics-end-motif-len. Unmapped reads, duplicates, and secondary alignments are excluded from end motif frequency calculation. The first x bases at the 5' end of each read, where x is specified by --fragmentomics-end-motif-len, are tabulated into a frequency dictionary with the sequences as keys and their counts as values. If the first x bases contain any N characters, the read is ignored. After all reads are processed, the frequency table is sorted alphabetically by sequence. End motif analysis also supports dedicated fragment-size filtering through --fragmentomics-end-motif-fragment-min-size and --fragmentomics-end-motif-fragment-max-size.
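The tabulation described above can be illustrated with a minimal Python sketch. It assumes that each input sequence is already given in 5'-to-3' read orientation and that the read-level filters have been applied beforehand; the function name, the default motif length of 4, and the handling of reads shorter than the motif are assumptions made for the example, not DRAGEN behavior.

```python
from collections import Counter


def end_motif_frequencies(read_sequences, motif_len=4):
    """Count the first motif_len bases of each read and return the
    counts as a dict sorted alphabetically by motif sequence."""
    counts = Counter()
    for seq in read_sequences:
        motif = seq[:motif_len].upper()
        # Skip reads shorter than the motif (assumption) or containing N.
        if len(motif) < motif_len or "N" in motif:
            continue
        counts[motif] += 1
    return dict(sorted(counts.items()))
```

For example, calling `end_motif_frequencies(["CCCAGT", "ACGTAA", "CCCATT", "NCGTAC"])` counts CCCA twice and ACGT once, ignores the read starting with N, and lists ACGT before CCCA.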
Window protection score (WPS) calculation is enabled with --enable-fragmentomics-wps true. A target region file should generally be specified with --fragmentomics-wps-target-file. If no target region file is provided, DRAGEN runs WPS across full chromosomes, which is not recommended because WPS signals are typically sparse genome-wide and analysis is usually intended for selected regions of interest. The target region file must be a BED-format text file with at least three columns (chrom, start, end). Each row defines a region of interest (ROI). DRAGEN automatically tiles each ROI with sliding windows whose size is controlled by --fragmentomics-wps-window-size (default = 120). Optional flanking bases can be added with --fragmentomics-wps-region-left-padding and --fragmentomics-wps-region-right-padding. Reads are counted at each window based on 5' end position and strand orientation; reads fully spanning the window are also tracked. After all reads are processed, DRAGEN reports the WPS and related per-window count metrics. WPS analysis also supports dedicated fragment-size filtering through --fragmentomics-wps-fragment-min-size and --fragmentomics-wps-fragment-max-size.
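The tiling and per-window counting can be sketched as follows. This is a simplified illustration under several assumptions that are not confirmed by this document: the window step size, the half-open coordinate convention, and the scoring formula. In particular, the score here follows the commonly used definition of WPS (reads fully spanning the window minus reads with an end inside it); the exact formula DRAGEN applies is not stated above. Function names and the `(start, end, strand)` read tuples are hypothetical.

```python
def tile_windows(start, end, window_size=120, step=1, left_pad=0, right_pad=0):
    """Yield half-open (window_start, window_end) windows over a padded ROI.
    The step size is an assumption; it is not specified in this document."""
    padded_start = max(0, start - left_pad)
    padded_end = end + right_pad
    for w_start in range(padded_start, padded_end - window_size + 1, step):
        yield w_start, w_start + window_size


def wps_for_window(reads, w_start, w_end):
    """Count reads for one window. Each read is (start, end, strand) with
    half-open coordinates; strand is '+' or '-'."""
    forward = reverse = span = 0
    for start, end, strand in reads:
        # The 5' end is the leftmost base for '+' and the rightmost for '-'.
        five_prime = start if strand == "+" else end - 1
        if start < w_start and end > w_end:
            span += 1
        elif w_start <= five_prime < w_end:
            if strand == "+":
                forward += 1
            else:
                reverse += 1
    return {
        "ForwardCount": forward,
        "ReverseCount": reverse,
        "FullySpanCount": span,
        # Assumed formula: spanning reads minus reads ending in the window.
        "WPS": span - (forward + reverse),
    }
```

A positive score under this definition suggests that more reads cover the window than terminate in it, which is the usual interpretation of a protected region.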
Supported assays and DRAGEN modes
DRAGEN Fragmentomics currently supports Tumor-only and Normal-only sequencing data from TSO500/WES/WGS ctDNA assays. The results for Tumor-Normal pair data are undefined because ctDNA data are derived from a mixture of tumor and normal DNA. Therefore, users should avoid running Fragmentomics in Tumor-Normal mode.
Command-Line Options
Component enablement options:
Enable fragment profile calculation:
Enable end motif calculation: --enable-fragmentomics-end-motif true
Enable WPS calculation: --enable-fragmentomics-wps true
Optional settings:
Target regions for window protection score
The target regions file is used only for window protection score calculation. The file must be in BED format with at least three columns (chrom, start, end); additional annotation columns such as a transcript ID are permitted. Each row defines a region of interest. DRAGEN automatically tiles each region into sliding windows based on --fragmentomics-wps-window-size, with optional left and right padding applied before tiling.
If a target region set is not readily available, common regions of interest include transcription start sites, promoters, and DHS/open chromatin regions. These genomic features can be accessed through the GENCODE gene annotations: https://www.gencodegenes.org/human/.
Note: The example above shows two TSS intervals from GENCODE (ENST00000958539.1 and ENST00000889171.1) each padded by 120 bp on both sides (e.g., TSS at chr1:1615439–1615440 expanded to chr1:1615319–1615560). Because the padding is already incorporated into the coordinates, --fragmentomics-wps-region-left-padding and --fragmentomics-wps-region-right-padding should be left at their defaults (0) unless additional flanking sequence is desired.
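The padding arithmetic in the note can be reproduced with a short Python sketch. The helper names are hypothetical and only illustrate how a padded BED row might be prepared before passing the file to DRAGEN; the optional name column stands in for an annotation such as a transcript ID.

```python
def pad_interval(chrom, start, end, padding):
    """Expand an interval by `padding` bases on both sides, clamping at 0."""
    return chrom, max(0, start - padding), end + padding


def to_bed_line(chrom, start, end, name=None):
    """Format a tab-separated BED row; `name` is an optional fourth column."""
    fields = [chrom, str(start), str(end)]
    if name is not None:
        fields.append(name)
    return "\t".join(fields)
```

Padding the TSS interval chr1:1615439-1615440 by 120 bp with `pad_interval("chr1", 1615439, 1615440, 120)` gives start 1615319 and end 1615560, matching the coordinates in the note.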
Exclude regions for fragment profile
Users can provide a blocklist of regions, such as low-mappability regions, so that reads in those regions are removed from the fragment profile calculation. The file is in BED format with three columns.
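As a rough illustration of how a blocklist might be applied, the following Python sketch loads a three-column BED file and tests whether a read interval overlaps any listed region. The overlap rule (any overlap excludes the read) and the function names are assumptions for this example; the criterion DRAGEN uses to decide that a read falls in an excluded region is not described here.

```python
def load_bed(lines):
    """Parse BED rows into {chrom: [(start, end), ...]}, skipping blank,
    comment, and header lines."""
    regions = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def overlaps_blocklist(regions, chrom, start, end):
    """Return True if the half-open interval [start, end) overlaps any
    blocklisted region on the same chromosome (assumed exclusion rule)."""
    return any(
        start < b_end and b_start < end
        for b_start, b_end in regions.get(chrom, [])
    )
```

A linear scan is adequate for a small blocklist; a sorted list or interval tree would be a more suitable structure for large files.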
Example command-line options for FASTQ input of WGS ctDNA
The following example enables all three fragmentomics components in a single run.
Fragmentomics Output
DRAGEN outputs the fragment profile file, end motif frequency file, and WPS file for whichever components are enabled.
The fragment profile file is in the following format:
The end motif frequency file is in the following format:
The WPS file includes the following columns:
Chr: Chromosome name
windowStart: Start coordinate of the WPS window
windowEnd: End coordinate of the WPS window
windowCenter: Center coordinate of the WPS window
ForwardCount: Number of forward-mapped reads with a 5' end within the window
ReverseCount: Number of reverse-mapped reads with a 5' end within the window
FullySpanCount: Number of mapped reads fully spanning the window
WPS: Window protection score for the window
TotalCount: Total read count in the window
RatioForward: Ratio of ForwardCount to TotalCount
RatioReverse: Ratio of ReverseCount to TotalCount
The WPS file is in the following format:
Reference
[1] Y. M. Dennis Lo, Diana S. C. Han, Peiyong Jiang, Rossa W. K. Chiu. Epigenetics, fragmentomics, and topology of cell-free DNA in liquid biopsies. Science. 2021. DOI: 10.1126/science.aaw3616