wf-somatic-variation documentation

By EPI2ME Labs
1 min read

Somatic variation workflow

This repository contains a nextflow workflow to identify somatic variation in a paired normal/tumor sample. This workflow currently perform:

  • Alignment QC and statistics.
  • Somatic short variant calling (SNV and Indels).

Introduction

This workflow enables analysis of somatic variation using the following tools:

  1. ClairS

Quickstart

The workflow uses nextflow to manage compute and software resources, as such nextflow will need to be installed before attempting to run the workflow.

The workflow can currently be run using either Docker or Singularity to provide isolation of the required software. Both methods are automated out-of-the-box provided either Docker or Singularity is installed.

It is not required to clone or download the git repository in order to run the workflow. For more information on running EPI2ME Labs workflows visit our website.

Workflow options

To obtain the workflow, having installed nextflow, users can run:

nextflow run epi2me-labs/wf-somatic-variation --help

to see the options for the workflow.

Somatic short variant calling

The workflow currently implements a deconstructed version of ClairS (v0.1.0) to identify somatic variants in a paired tumor/normal sample. This workflow allows to take advantage of the parallel nature of Nextflow, providing the best performance in high-performance, distributed systems.

Currently, ClairS supports the following basecalling models:

Indel calling

Currently, indel calling is supported only for dna_r10 basecalling models. When the user specify an r9 model the workflow will automatically skip the indel processes and perform only the SNV calling.

Output folder The output directory has the following structure:

output/
├── GRCh38_no_alt_chr17.fa
├── GRCh38_no_alt_chr17.fa.fai
├── ref_cache
├── execution # Execution reports
│   ├── report.html
│   ├── timeline.html
│   └── trace.txt
├── qc
│   └── SAMPLE
│      ├── coverage
│      │ ├── SAMPLE_normal.mosdepth.global.dist.txt
│      │ ├── SAMPLE_normal.mosdepth.summary.txt
│      │ ├── SAMPLE_normal.per-base.bed.gz
│      │ ├── SAMPLE_normal.regions.bed.gz
│      │ ├── SAMPLE_normal.thresholds.bed.gz
│      │ ├── SAMPLE_tumor.mosdepth.global.dist.txt
│      │ ├── SAMPLE_tumor.mosdepth.summary.txt
│      │ ├── SAMPLE_tumor.per-base.bed.gz
│      │ ├── SAMPLE_tumor.regions.bed.gz
│      │ └── SAMPLE_tumor.thresholds.bed.gz
│   └── readstats
│      ├── SAMPLE_normal.flagstat.tsv
│      ├── SAMPLE_normal.readstats.tsv.gz
│      ├── SAMPLE_tumor.flagstat.tsv
│      └── SAMPLE_tumor.readstats.tsv.gz
├── snp # ClairS outputs
│ ├── SAMPLE # ClairS outputs for SAMPLE
│ │   ├── spectra # Mutational spectra for the workflow; for now, it only works for the SNVs
│ │   │   └── SAMPLE_spectrum.csv
│ │   ├── varstats # Bcftools stats output
│ │   │   └── SAMPLE.stats
│ │   └── vcf # VCF outputs
│ │   ├── SAMPLE_somatic_mutype.vcf.gz
│ │   ├── SAMPLE_somatic_mutype.vcf.gz.tbi
│ │   ├── germline # Clair3 Germline calling for both tumor and normal bams
│ │   │   ├── tumor
│ │   │   │   ├── SAMPLE_tumor_germline.vcf.gz
│ │   │   │   └── SAMPLE_tumor_germline.vcf.gz.tbi
│ │   │   └── normal
│ │   │   ├── SAMPLE_normal_germline.vcf.gz
│ │   │   └── SAMPLE_normal_germline.vcf.gz.tbi
│ │   ├── indels # VCF containing the indels from ClairS
│ │   │   ├── SAMPLE_somatic_indels.vcf.gz
│ │   │   └── SAMPLE_somatic_indels.vcf.gz.tbi
│ │   └── snv # VCF containing the SNVs from ClairS
│ │   ├── SAMPLE_somatic_snv.vcf.gz
│ │   └── SAMPLE_somatic_snv.vcf.gz.tbi
│ ├── info # Runtime info
│ │   ├── params.json
│ │   └── versions.txt
│ └── reports # Output report for the workflow
├── SAMPLE.wf-somatic-snp-report.html
├── SAMPLE.wf-somatic-variation-readQC-report.html
├── params.json
└── versions.txt

The primary outputs are:

  1. output/snp/SAMPLE/vcf/SAMPLE_somatic_mutype.vcf.gz: the final VCF file with SNVs and, if r10, InDels
  2. output/snp/SAMPLE/spectra/SAMPLE_spectrum.csv: the mutation spectrum for the sample
  3. output/snp/SAMPLE/vcf/germline/[tumor/normal]: the germline calls for both the tumor and normal bam files
  4. output/*.html: the reports of the SNV pipeline

Share

EPI2ME Labs

EPI2ME Labs

Senior Button Pusher

Quick Links

TutorialsWorkflowsOpen DataContact

Social Media

© 2020 - 2023 Oxford Nanopore Technologies plc. All rights reserved. Registered Office: Gosling Building, Edmund Halley Road, Oxford Science Park, OX4 4DQ, UK | Registered No. 05386273 | VAT No 336942382. Oxford Nanopore Technologies, the Wheel icon, EPI2ME, Flongle, GridION, Metrichor, MinION, MinIT, MinKNOW, Plongle, PromethION, SmidgION, Ubik and VolTRAX are registered trademarks of Oxford Nanopore Technologies plc in various countries. Oxford Nanopore Technologies products are not intended for use for health assessment or to diagnose, treat, mitigate, cure, or prevent any disease or condition.