Monkeypox Workflow
Matt Parker
June 01, 2022

Monkeypox (MPX) is a double-stranded DNA virus. There is an ongoing outbreak of the West African clade of the virus in multiple countries. Data has been generated for a number of these cases using Oxford Nanopore Technologies sequencing. Here we describe wf-mpx, a decentralised workflow to analyse this data on device, anywhere.

Data Analysis

A dive into the many excellent community posts on virological.org indicated that:

  • people were mapping to existing references,
  • creating a consensus based on this mapping,
  • or they were creating de novo assemblies;
  • and in either case, performing some manual review.

We wanted to empower those users who are keen to sequence MPX using Oxford Nanopore Technologies devices but don’t have the expertise or resources to put together an analysis workflow. We have therefore released wf-mpx. Releasing this workflow in its nascent state means that anyone with ONT Monkeypox data, be it metagenomic or something more targeted, can get a draft consensus using EPI2ME Labs.

wf-mpx is by no means a comprehensive workflow for the creation of Monkeypox consensus sequences or assemblies, but it might get you started analysing your data.

You should be particularly careful using this workflow if you have amplicon or other targeted data: no trimming of adapters or primers is carried out by this workflow.

If you have any issues, thoughts, or suggestions please don’t hesitate to raise an issue for us on GitHub: epi2me-labs/wf-mpx.

Workflow Details

You can run the workflow in two ways:

  1. In EPI2ME Labs - you can click the workflow and complete the path to your fastq files. You can download EPI2ME Labs from here
  2. On the command line:
    nextflow run epi2me-labs/wf-mpx --fastq <PATH_TO_FOLDER_OF_FASTQ_FILES>
    

Workflow Steps

The workflow takes a single folder of fastq files (more coming soon) and:

  • Maps the reads using minimap2 to a reference from a choice of:
    • ON568298.1 - German sequence described here
    • MT903344.1 - Monkeypox virus isolate MPXV-UK_P2 NCBI
    • MN648051.1 - Monkeypox virus strain Israel_2018 NCBI
    • ON563414.1 - USA Center for Disease Control sequence NCBI
  • Assesses coverage
  • Keeps only reads mapping to the reference, to exclude potential human reads
  • Calls variants with respect to that reference using medaka
  • Filters out variants with <20x depth
  • Creates a draft consensus using bcftools from the variants and reference:
    • Coverage <20x is masked with ‘N’
    • Deletions are represented by ‘-’
    • Insertions are in lowercase
  • Produces an independent de-novo assembly using flye and medaka
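
The masking rule in the consensus step can be illustrated with a short sketch. This is not the workflow's own code (wf-mpx builds its consensus with bcftools); the function name `mask_low_coverage` and the per-position depth list are hypothetical, and the 20x threshold is taken from the list above, assuming that positions with exactly 20x are kept.

```python
from typing import List


def mask_low_coverage(sequence: str, depths: List[int], min_depth: int = 20) -> str:
    """Replace every base whose depth is below min_depth with 'N'.

    Hypothetical helper for illustration only; wf-mpx itself uses bcftools.
    """
    if len(sequence) != len(depths):
        raise ValueError("sequence and depths must have the same length")
    return "".join(
        base if depth >= min_depth else "N"
        for base, depth in zip(sequence, depths)
    )


# Positions 3 and 5 (1-based) have depth below 20, so they are masked.
consensus = mask_low_coverage("ACGTAC", [25, 30, 19, 20, 5, 40])
print(consensus)  # ACNTNC
```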

Sample Report

The report contains a few useful plots for quality control of your data, which are described in more detail below. An example can be found here.

Read summary

This section contains two basic plots to show your read length distribution and the read quality scores. These are useful for troubleshooting your experiment.

Genome coverage

This plot shows the depth of coverage at each position along the Monkeypox virus reference you chose to map reads to. This plot also shows the location of:

  • SNPs: grey dots
  • Insertion/Deletions: blue bars
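
To make "depth of coverage at each position" concrete, here is a minimal sketch of how per-position depth could be derived from alignment intervals. It is an illustration under stated assumptions, not how the workflow computes its plot: the function `depth_per_position` is hypothetical, and alignments are assumed to be 0-based, half-open `(start, end)` intervals on the reference, ignoring gaps within a read.

```python
from typing import Iterable, List, Tuple


def depth_per_position(
    alignments: Iterable[Tuple[int, int]], reference_length: int
) -> List[int]:
    """Count how many alignments cover each reference position.

    Uses a difference array: +1 where an alignment starts, -1 where it ends,
    then a running sum gives the depth at each position.
    """
    deltas = [0] * (reference_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, reference_length)
        if start >= end:
            continue  # alignment lies outside the reference
        deltas[start] += 1
        deltas[end] -= 1

    depths = []
    running = 0
    for delta in deltas[:reference_length]:
        running += delta
        depths.append(running)
    return depths


# Three overlapping reads on a 10-base reference.
depths = depth_per_position([(0, 5), (2, 8), (4, 10)], 10)
print(depths)  # [1, 1, 2, 2, 3, 2, 2, 2, 1, 1]
```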

Variant Context

It has been noted that the mutations identified in the genome appear to be in a context that would suggest APOBEC3 host enzyme action. This plot categorises SNPs by their context to help highlight this observation. More information can be found in this excellent post by Áine O’Toole & Andrew Rambaut.
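
As a rough illustration of what categorising SNPs by context can mean, the sketch below labels a SNP by the dinucleotide change commonly associated with APOBEC3 editing: C to T where the preceding base is T (TC to TT), and its reverse-strand equivalent, G to A where the following base is A (GA to AA). The function `apobec3_context` and its category labels are assumptions made for this example and may not match the categories used in the wf-mpx report.

```python
def apobec3_context(reference: str, position: int, ref: str, alt: str) -> str:
    """Classify a SNP at a 0-based position by its dinucleotide context.

    Returns 'TC>TT', 'GA>AA', or 'other'. Hypothetical helper for illustration.
    """
    reference = reference.upper()
    ref, alt = ref.upper(), alt.upper()
    if reference[position] != ref:
        raise ValueError("ref allele does not match the reference at this position")

    # Forward strand: C>T with a T immediately upstream.
    if ref == "C" and alt == "T" and position > 0 and reference[position - 1] == "T":
        return "TC>TT"
    # Reverse strand seen on the forward strand: G>A with an A immediately downstream.
    if (
        ref == "G"
        and alt == "A"
        and position + 1 < len(reference)
        and reference[position + 1] == "A"
    ):
        return "GA>AA"
    return "other"


toy_reference = "ATCGGAT"
print(apobec3_context(toy_reference, 2, "C", "T"))  # TC>TT
print(apobec3_context(toy_reference, 4, "G", "A"))  # GA>AA
print(apobec3_context(toy_reference, 3, "G", "A"))  # other
```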

All Variants

This section lists all of the variants called by medaka, filtered only by depth (variants with <20x depth are removed).
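
A depth-only filter of this kind can be sketched in a few lines. This is an assumption-laden illustration rather than the workflow's implementation: it assumes each variant record is a tab-separated VCF line whose INFO column carries depth under the standard `DP` key, and the helper names `info_depth` and `filter_by_depth` are hypothetical.

```python
from typing import Iterable, List, Optional


def info_depth(vcf_line: str) -> Optional[int]:
    """Return the DP value from the INFO column of a VCF record, if present."""
    fields = vcf_line.rstrip("\n").split("\t")
    for entry in fields[7].split(";"):  # INFO is the 8th VCF column
        key, _, value = entry.partition("=")
        if key == "DP" and value.isdigit():
            return int(value)
    return None


def filter_by_depth(vcf_lines: Iterable[str], min_depth: int = 20) -> List[str]:
    """Keep header lines and records whose DP is at least min_depth."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        depth = info_depth(line)
        if depth is not None and depth >= min_depth:
            kept.append(line)
    return kept


# Made-up records on a placeholder contig, for demonstration only.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "ref\t100\t.\tC\tT\t50\tPASS\tDP=35",
    "ref\t200\t.\tG\tA\t40\tPASS\tDP=12",
]
for line in filter_by_depth(records):
    print(line)
```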

Flye Assembly

This plot shows the contigs produced by flye when attempting to assemble the reads.

Software Versions & Workflow Parameters

These sections detail the versions of the tools used in this workflow and the parameters set at execution.

Test Data

The git repository for wf-mpx includes test data provided by GSTT: Adela Medina, Luke Snell, Themis Charalampous, Rahul Batra, Jonathon Edgeworth. This can be found at wf-mpx/test_data/fastq/barcode01. The original source data can also be found on SRA here.


Tags

#workflows #nextflow

© 2020 - 2022
Oxford Nanopore Technologies
All Rights Reserved.