wf-amplicon documentation

By EPI2ME Labs

wf-amplicon

Introduction

This Nextflow workflow provides a simple way to analyse Oxford Nanopore reads generated from amplicons.

It requires the raw reads and a reference FASTA file containing one sequence per amplicon. After filtering (based on read length and quality) and trimming, reads are aligned to the reference using minimap2. Variants are then called with Medaka. Results include an interactive HTML report and VCF files containing the called variants.

For now, the reference FASTA file must contain exactly one sequence per amplicon. An option to provide a whole-genome reference together with pairs of primers might be added in the future if users request it.
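
Before launching the workflow, it can be worth checking that the reference really holds one record per amplicon. The following is a minimal sketch of such a check, not part of the workflow itself; the file contents and the sequence names (amplicon_1, amplicon_2) are made up for illustration.

```shell
# Made-up reference with two amplicons, for illustration only.
workdir=$(mktemp -d)
cat > "$workdir/references.fa" <<'EOF'
>amplicon_1
ACGTACGTACGTACGT
>amplicon_2
TTGACCAGTTGACCAG
EOF

# Each FASTA record starts with a '>' header line.
n_records=$(grep -c '^>' "$workdir/references.fa")

# Count distinct sequence names (the header text up to the first space).
n_unique=$(grep '^>' "$workdir/references.fa" | cut -d ' ' -f 1 | sort -u | wc -l | tr -d ' ')

echo "records: $n_records, unique names: $n_unique"
```

If the two numbers differ, at least one sequence name is repeated. The record count should also match the number of amplicons you expect.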

Quickstart

The workflow relies on the following dependencies:

  • Nextflow for managing compute and software resources.
  • Either Docker or Singularity to provide isolation of the required software.

It is not necessary to clone or download the git repository in order to run the workflow. For more information on running EPI2ME Labs workflows, visit our website.

Workflow options

If you have Nextflow installed, you can run the following to obtain the workflow:

nextflow run epi2me-labs/wf-amplicon --help

This will show the workflow’s command line options.

Usage

Basic usage is as follows:

nextflow run epi2me-labs/wf-amplicon \
--fastq $input \
--reference references.fa \
--threads 4

$input can be a single FASTQ file, a directory containing FASTQ files, or a directory containing barcoded sub-directories which in turn contain FASTQ files. A sample sheet can be supplied with --sample_sheet. When analysing an individual sample, its name can be set with --sample.
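
To make the third input layout concrete, the sketch below builds an empty example directory tree and lists the FASTQ files in it. The directory names (barcode01, barcode02) and the file name reads.fastq are assumptions chosen for illustration; the source does not prescribe them.

```shell
# Build a throwaway example layout; all names here are hypothetical.
run_dir=$(mktemp -d)
mkdir -p "$run_dir/barcode01" "$run_dir/barcode02"
: > "$run_dir/barcode01/reads.fastq"
: > "$run_dir/barcode02/reads.fastq"

# One sub-directory per barcode, each holding its own FASTQ file(s).
n_fastq=$(find "$run_dir" -name '*.fastq' | wc -l | tr -d ' ')
echo "FASTQ files found: $n_fastq"
```

With a layout like this, the top-level directory is the value that would be passed to --fastq.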

Relevant options for filtering of raw reads are:

  • --min_read_length
  • --max_read_length
  • --min_read_qual
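
To illustrate what a length filter does, the sketch below counts the reads in a tiny made-up FASTQ file whose length falls within a minimum and maximum. It is not the workflow's own implementation: the thresholds are arbitrary, and treating both bounds as inclusive is an assumption.

```shell
# Tiny FASTQ with three reads of length 4, 8 and 12 (made-up data).
workdir=$(mktemp -d)
cat > "$workdir/demo.fastq" <<'EOF'
@r1
ACGT
+
IIII
@r2
ACGTACGT
+
IIIIIIII
@r3
ACGTACGTACGT
+
IIIIIIIIIIII
EOF

# Arbitrary example thresholds, not workflow defaults.
min_len=5
max_len=10

# A FASTQ record spans four lines; the sequence is line 2 of each record.
# Bounds are treated as inclusive here (an assumption).
kept=$(awk -v min="$min_len" -v max="$max_len" \
    'NR % 4 == 2 && length($0) >= min && length($0) <= max { n++ } END { print n + 0 }' \
    "$workdir/demo.fastq")

echo "reads kept: $kept"
```

Only the 8-base read passes, so the script reports one read kept.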

After filtering and trimming with Porechop, reads can optionally be downsampled. You can control the number of reads to keep per sample with --reads_downsampling_size.

Haploid variants are then called with Medaka. You can set the minimum coverage a variant needs to exceed in order to be included in the results with --min_coverage. The workflow selects the appropriate Medaka model based on the basecaller configuration that was used to process the signal data. You can use the parameter --basecaller_cfg to provide this information (e.g. dna_r10.4.1_e8.2_400bps_hac) to the workflow. Alternatively, you can choose the Medaka model directly with --medaka_model.
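
The downsampling and variant-calling options described above can be combined in a single invocation. The sketch below assembles such a command and prints it for review instead of running it. All values are illustrative assumptions rather than workflow defaults, and "reads" is a hypothetical input directory.

```shell
# All values below are illustrative assumptions, not workflow defaults.
# "reads" is a hypothetical input directory.
set -- nextflow run epi2me-labs/wf-amplicon \
    --fastq reads \
    --reference references.fa \
    --reads_downsampling_size 1000 \
    --min_coverage 20 \
    --basecaller_cfg dna_r10.4.1_e8.2_400bps_hac

# Join the arguments into one line so the command can be reviewed.
preview="$*"
echo "$preview"

# To launch the workflow for real, run: "$@"
```

To pick the Medaka model directly, --basecaller_cfg and its value would be swapped for --medaka_model followed by the model name.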

If you want to use Singularity instead of Docker, add -profile singularity.

Key outputs

  • Interactive HTML report detailing the results.
  • VCF files with variants called by Medaka.

© 2020 - 2023 Oxford Nanopore Technologies plc. All rights reserved. Registered Office: Gosling Building, Edmund Halley Road, Oxford Science Park, OX4 4DQ, UK | Registered No. 05386273 | VAT No 336942382. Oxford Nanopore Technologies, the Wheel icon, EPI2ME, Flongle, GridION, Metrichor, MinION, MinIT, MinKNOW, Plongle, PromethION, SmidgION, Ubik and VolTRAX are registered trademarks of Oxford Nanopore Technologies plc in various countries. Oxford Nanopore Technologies products are not intended for use for health assessment or to diagnose, treat, mitigate, cure, or prevent any disease or condition.