Force Genotyping
DRAGEN supports force genotyping (ForceGT) for small variant calling. Use --vc-forcegt-vcf to specify a VCF file containing variants to force genotype. The input list of small variants can be a *.vcf or *.vcf.gz file.
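As a rough illustration, the option can be added to an ordinary small variant calling run. The sketch below only assembles an argument list in Python; every flag other than --vc-forcegt-vcf is an assumption based on a typical germline FASTQ invocation, and all paths and the sample name are placeholders. Check the flags against the DRAGEN version in use before running anything.

```python
import shlex


def build_forcegt_command(ref_dir, fastq1, fastq2, forcegt_vcf, out_dir, prefix):
    """Assemble a DRAGEN argument list with force genotyping enabled.

    Only --vc-forcegt-vcf comes from this page; the remaining flags are
    assumed from a typical germline run and may need adjusting.
    """
    return [
        "dragen",
        "-r", ref_dir,                      # reference hash table directory
        "-1", fastq1,                       # first FASTQ file
        "-2", fastq2,                       # second FASTQ file
        "--RGID", prefix,
        "--RGSM", prefix,
        "--enable-variant-caller", "true",
        "--vc-forcegt-vcf", forcegt_vcf,    # *.vcf or *.vcf.gz
        "--output-directory", out_dir,
        "--output-file-prefix", prefix,
    ]


# Placeholder paths, for illustration only.
cmd = build_forcegt_command(
    "/staging/ref", "/data/s1_R1.fastq.gz", "/data/s1_R2.fastq.gz",
    "/data/sites.vcf.gz", "/staging/out", "sample1",
)
print(shlex.join(cmd))
```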
Supported Modes
Germline: Supported. When using joint genotyping with the --vc-forcegt-vcf option, the output joint VCF contains only variants tagged with FGT. Without this option, FGT-tagged variants are skipped.
Somatic: Supported in both Tumor-Only (T/O) and Tumor-Normal (T/N) modes.
Input Requirements
DRAGEN supports only a single ForceGT VCF input file. The input VCF must:
Be a valid VCF 4.2 file (minimum 8 tab-delimited columns, sorted by contig and position).
Have a header that lists the same contig names as the reference used for variant calling. Every variant must refer to one of these contig names.
Contain normalized variants (parsimonious and left-aligned).
Not contain multinucleotide or complex variants (e.g., AT → C). These are variants that require more than one substitution, insertion, or deletion to go from the REF allele to the ALT allele; they are ignored.
Not contain deletions longer than 50 bp; these are filtered out.
Duplicate entries (same POS, REF, ALT) are ignored.
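The content rules above can be checked before a run. The following is a minimal sketch, not DRAGEN's own logic: it assumes records are already normalized and are given as hypothetical (chrom, pos, ref, alt) tuples, and it treats any REF/ALT pair that is not a simple SNV, insertion, or deletion as complex. The duplicate key includes the contig as well as POS, REF, and ALT, which is an assumption.

```python
def classify(ref: str, alt: str) -> str:
    """Classify a normalized REF/ALT pair by the kind of edit it encodes."""
    if len(ref) == 1 and len(alt) == 1:
        return "snv"
    if len(ref) == 1 and alt.startswith(ref):
        return "insertion"
    if len(alt) == 1 and ref.startswith(alt):
        return "deletion"
    return "complex"  # MNVs and anything needing more than one edit


def precheck(records, max_del_len=50):
    """Split (chrom, pos, ref, alt) tuples into kept and skipped lists."""
    seen = set()
    kept, skipped = [], []
    for rec in records:
        _chrom, _pos, ref, alt = rec
        kind = classify(ref, alt)
        if rec in seen:
            skipped.append((rec, "duplicate"))
        elif kind == "complex":
            skipped.append((rec, "multinucleotide or complex"))
        elif kind == "deletion" and len(ref) - len(alt) > max_del_len:
            skipped.append((rec, f"deletion longer than {max_del_len} bp"))
        else:
            seen.add(rec)
            kept.append(rec)
    return kept, skipped


records = [
    ("chr1", 100, "A", "G"),
    ("chr1", 100, "A", "G"),             # duplicate of the previous record
    ("chr1", 200, "AT", "C"),            # complex
    ("chr1", 300, "C", "CG"),            # insertion
    ("chr1", 400, "A" + "T" * 51, "A"),  # 51 bp deletion
    ("chr1", 500, "A" + "T" * 50, "A"),  # 50 bp deletion, within the limit
]
kept, skipped = precheck(records)
for rec, reason in skipped:
    print(rec[0], rec[1], reason)
```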
Example of normalization:
# Wrong (not parsimonious):
chrX 153592402 GC GCG
# Correct (parsimonious):
chrX 153592403 C CG
A nonnormalized variant will cause undefined behavior in DRAGEN.
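The parsimony half of normalization can be sketched as trimming bases shared by REF and ALT. The helper below is hypothetical and deliberately incomplete: it does not left-align, because that requires the reference sequence, so it is not a substitute for a full normalization tool. Applied to the GC → GCG record at position 153592402, it yields C → CG at position 153592403.

```python
def make_parsimonious(pos: int, ref: str, alt: str):
    """Trim bases shared by REF and ALT; does NOT left-align."""
    # Trim shared trailing bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases and advance POS for each base removed.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


print(make_parsimonious(153592402, "GC", "GCG"))  # (153592403, 'C', 'CG')
```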
Output Behavior
The output VCF contains both regular variant calls and ForceGT variants. Each variant is tagged in the INFO field to indicate its origin:
Regular call only (not in ForceGT input): (none)
ForceGT only (not called by pipeline): FGT
Both regular and ForceGT (germline): FGT;NML
Both regular and ForceGT (somatic): FGT;SOM
Notes:
NML (normal): Indicates the variant was independently called by the pipeline in germline mode and is present in the ForceGT input.
SOM (somatic): Indicates the variant was independently called by the pipeline in somatic mode and is present in the ForceGT input.
NML and SOM only appear paired with FGT, never alone.
FILTER and INFO field behavior:
If a ForceGT variant matches a regular call with the same POS, REF, ALT, it inherits all FILTER and INFO fields from the regular call.
If a ForceGT variant is at a novel site (no regular call), FILTER and INFO fields are calculated independently for that variant.
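When post-processing the output, the origin of each record can be recovered from these tags. The sketch below assumes the INFO column is available as a raw string; the function names are illustrative and not part of any DRAGEN tooling.

```python
def info_flags(info: str) -> set:
    """Return the set of INFO keys, including value-less flags."""
    if info in ("", "."):
        return set()
    return {item.split("=", 1)[0] for item in info.split(";")}


def call_origin(info: str) -> str:
    """Map the FGT, NML, and SOM tags to the origin of a record."""
    flags = info_flags(info)
    if "FGT" not in flags:
        return "regular call only"
    if "NML" in flags:
        return "regular and ForceGT (germline)"
    if "SOM" in flags:
        return "regular and ForceGT (somatic)"
    return "ForceGT only"


for info in ("DP=30;MQ=60", "FGT;DP=12", "FGT;NML;DP=30", "DP=40;FGT;SOM"):
    print(info, "->", call_origin(info))
```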
Genotype Reporting
All variants in the ForceGT input VCF are genotyped and included in the output with the following GT values:
No coverage at position: ./. | ./. | ./.
Coverage but no ALT-supporting reads: 0/0 | 0/0 | 0/0
Coverage with ALT-supporting reads: 0/1, 1/1, etc. | 0/1 | 0/1 or 1/1
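Read in reverse, the GT values listed above indicate what evidence was found at a force-genotyped site. A small sketch of that reading, using a hypothetical helper that accepts the GT string from the sample column:

```python
def forcegt_evidence(gt: str) -> str:
    """Interpret a GT string according to the conditions listed above."""
    alleles = gt.replace("|", "/").split("/")
    if all(a == "." for a in alleles):
        return "no coverage"
    if all(a == "0" for a in alleles):
        return "coverage, no ALT-supporting reads"
    return "ALT-supporting reads"


for gt in ("./.", "0/0", "0/1", "1/1"):
    print(gt, "->", forcegt_evidence(gt))
```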
ForceGT and Multiallelic Sites
In somatic mode, --vc-split-multiallelic-calls is enabled by default, which outputs multiallelic variants on separate lines. It is not recommended to disable this option.
ForceGT variants are combined into a single output line with regular calls only when they have an exact match (same POS, REF, and ALT). Otherwise, a separate ForceGT call is emitted.
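The exact-match rule can be pictured as a merge keyed on the variant itself. The sketch below is an illustration of the rule as stated, not DRAGEN's implementation; it represents variants as hypothetical (chrom, pos, ref, alt) tuples, includes the contig in the key as an assumption, and takes the mode tag (NML or SOM) as a parameter.

```python
def merge_calls(regular, forcegt, mode_tag="NML"):
    """Return sorted (variant, tags) pairs, one per output line.

    A ForceGT variant shares a line with a regular call only when the
    (chrom, pos, ref, alt) key matches exactly.
    """
    regular_keys = set(regular)
    forcegt_keys = set(forcegt)
    merged = []
    for key in sorted(regular_keys | forcegt_keys):
        if key in regular_keys and key in forcegt_keys:
            tags = f"FGT;{mode_tag}"   # called by the pipeline and forced
        elif key in forcegt_keys:
            tags = "FGT"               # forced only
        else:
            tags = ""                  # regular call only
        merged.append((key, tags))
    return merged


# Same position, different ALT: two separate output lines.
print(merge_calls([("chr1", 1000, "G", "A")], [("chr1", 1000, "G", "T")]))
# Exact match: one combined line carrying both tags.
print(merge_calls([("chr1", 1000, "G", "A")], [("chr1", 1000, "G", "A")], "SOM"))
```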
Example 1: ForceGT variant differs from regular call
Both variants are output on separate lines:
Example 2: ForceGT variant matches regular call exactly
Combined into a single line with both tags:
Example 3: Multiallelic site with partial ForceGT overlap
If the pipeline calls a multiallelic site (e.g., G→A and G→T) and ForceGT input contains only G→A:
Target BED Filtering
If a target BED file is provided via --vc-target-bed, only ForceGT variants overlapping the BED regions are included in the output.
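To anticipate which ForceGT variants survive this filter, an overlap test can be sketched as follows. The exact overlap semantics DRAGEN applies are not stated here, so this sketch assumes a variant overlaps a region when any of its REF bases fall inside a BED interval. It relies on the standard conventions that BED intervals are 0-based and half-open while VCF POS is 1-based; the data structures are hypothetical.

```python
def overlaps_bed(pos: int, ref: str, regions) -> bool:
    """True if the REF span of a variant intersects any (start, end) interval."""
    start0 = pos - 1           # convert 1-based VCF POS to 0-based
    end0 = start0 + len(ref)   # half-open end covering all REF bases
    return any(start0 < r_end and r_start < end0 for r_start, r_end in regions)


def filter_forcegt(variants, bed):
    """Keep (chrom, pos, ref, alt) tuples that overlap the BED regions.

    bed maps contig name -> list of (start, end) intervals.
    """
    return [v for v in variants if overlaps_bed(v[1], v[2], bed.get(v[0], []))]


bed = {"chr1": [(999, 1010)]}
variants = [
    ("chr1", 1000, "G", "A"),    # first base of the region
    ("chr1", 1011, "C", "T"),    # one base past the region
    ("chr1", 998, "ATT", "A"),   # deletion whose REF span reaches the region
    ("chr2", 1000, "G", "A"),    # contig without regions
]
print(filter_forcegt(variants, bed))
```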