wf-flu documentation

By EPI2ME Labs

wf-flu | Influenza Typing Workflow

This repository contains a Nextflow workflow that takes targeted Oxford Nanopore Technologies (ONT) Influenza sequencing data and produces typing information.

Introduction

Influenza is a single-stranded RNA virus with a 13.5-14.5 kb genome, split into 8 segments that encode 10-14 proteins (depending on the strain).

The virus is classified using two proteins found on the outer surface of the viral envelope. You’ve probably heard of H1N1 Influenza, for example: the H stands for hemagglutinin and the N for neuraminidase.

The Oxford Nanopore Technologies protocol listed here amplifies segments of the Influenza Type A and Type B genomes. Using this analysis workflow, users can determine the most likely strain of Influenza to which the sample being sequenced belongs.

Data Analysis

Workflow steps:

  1. Concatenate reads and filter out reads shorter than 200 bases
  2. Align reads to reference (minimap2)
  3. Coverage calculations (samtools)
  4. Call variants with medaka
  5. Make a (coverage masked) consensus with bcftools
  6. Type with abricate
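
As an illustration of step 1 only, the sketch below concatenates reads from several FASTQ inputs and drops reads shorter than 200 bases. It is a minimal sketch and not the workflow's own implementation: the function names `parse_fastq` and `concatenate_and_filter` are hypothetical, and it assumes uncompressed FASTQ with exactly four lines per record.

```python
from itertools import chain
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)

MIN_READ_LENGTH = 200  # threshold taken from step 1 above


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from 4-line FASTQ records.

    Assumption: records are not wrapped across multiple lines.
    """
    rows: List[str] = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(rows) - 3, 4):
        # rows[i + 2] is the '+' separator line and is skipped
        yield rows[i], rows[i + 1], rows[i + 3]


def concatenate_and_filter(
    sources: Iterable[Iterable[str]], min_length: int = MIN_READ_LENGTH
) -> Iterator[FastqRecord]:
    """Chain reads from every source and keep those of at least min_length bases."""
    records = chain.from_iterable(parse_fastq(src) for src in sources)
    for header, seq, qual in records:
        if len(seq) >= min_length:
            yield header, seq, qual
```

Each source can be any iterable of lines, so open file handles can be passed in directly.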

Downsampling

Downsampling is optional.

For every segment in the reference genome:

  1. Get the segment length
  2. Work out the range +/- 10% of that length
  3. Keep only reads that align to that segment and whose length is within +/- 10% of the segment length
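
The length window described above can be sketched as follows. This is an illustrative sketch, not the workflow's code: the function names, the `(read_id, segment, read_length)` input shape, and the example segment length are all assumptions. Integer arithmetic is used so that the +/- 10% boundaries are exact.

```python
from typing import Dict, Iterable, List, Tuple

TOLERANCE_PERCENT = 10  # the +/- 10% window from the steps above


def within_window(read_length: int, segment_length: int,
                  percent: int = TOLERANCE_PERCENT) -> bool:
    """True if read_length lies within +/- percent of segment_length (inclusive)."""
    return abs(read_length - segment_length) * 100 <= segment_length * percent


def select_reads(alignments: Iterable[Tuple[str, str, int]],
                 segment_lengths: Dict[str, int]) -> List[str]:
    """Return IDs of reads whose length fits the window of the segment they align to.

    alignments: hypothetical (read_id, segment_name, read_length) tuples.
    Reads aligned to a segment missing from segment_lengths are dropped.
    """
    kept = []
    for read_id, segment, read_length in alignments:
        segment_length = segment_lengths.get(segment)
        if segment_length is not None and within_window(read_length, segment_length):
            kept.append(read_id)
    return kept
```

For a hypothetical 1000-base segment the window is 900 to 1100 bases, so a 900-base read is kept and an 899-base read is not.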

Typing

Typing is carried out with abricate, using the insaflu database, which contains the following sequences:

| Database | Gene | Accession | Details |
|----------|------|-----------|---------|
| insaflu | M1 | MK576795 | Type_A MK576795 A/England/7821/2019 2019/01/03 7 (MP) |
| insaflu | M1 | AF100378 | Type_B AF100378.1 Influenza B virus B/Yamagata/16/88 segment 7 M1 matrix protein (M) and BM2 protein (BM2) genes, complete cds |
| insaflu | HA | FJ966974 | H1 FJ966974.1 Influenza A virus (A/California/07/2009(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | HA | L11142 | H2 L11142.1 Influenza A virus (A/Singapore/1/57 (H2N2)) hemagglutinin (HA) gene, complete cds |
| insaflu | HA | MK576794 | H3 MK576794 A/England/7821/2019 2019/01/03 4 (HA) |
| insaflu | HA | AF285883 | H4 AF285883.2 Influenza A virus (A/Swine/Ontario/01911-2/99 (H4N6)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | HA | EF541403 | H5 EF541403.1 Influenza A virus (A/Viet Nam/1203/2004(H5N1)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | HA | AB295613 | H15 AB295613.1 Influenza A virus (A/duck/Australia/341/83(H15N8)) HA gene for haemagglutinin, complete cds |
| insaflu | NA | GQ377078 | N1 GQ377078.1 Influenza A virus (A/California/07/2009(H1N1)) segment 6 neuraminidase (NA) gene, complete cds |
| insaflu | NA | MK576796 | N2 MK576796 A/England/7821/2019 2019/01/03 6 (NA) |
| insaflu | NA | AB295614 | N8 AB295614.1 Influenza A virus (A/duck/Australia/341/83(H15N8)) NA gene for neuraminidase, complete cds |
| insaflu | HA | AY338459 | H7 AY338459.1 Influenza A virus (A/Netherlands/219/2003(H7N7)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | HA | CY014659 | H8 CY014659.1 Influenza A virus (A/turkey/Ontario/6118/1968(H8N4)) segment 4, complete sequence |
| insaflu | HA | CY014694 | H13 CY014694.1 Influenza A virus (A/gull/Maryland/704/1977(H13N6)) segment 4, complete sequence |
| insaflu | HA | CY018765 | Yamagata CY018765.1 Influenza B virus (B/Yamagata/16/1988) segment 4, complete sequence |
| insaflu | HA | CY103892 | H17 CY103892.1 Influenza A virus (A/little yellow-shouldered bat/Guatemala/060/2010(H17N10)) hemagglutinin (HA) gene, complete cds |
| insaflu | NA | CY103894 | N10 CY103894.1 Influenza A virus (A/little yellow-shouldered bat/Guatemala/060/2010(H17N10)) neuraminidase (NA) gene, complete cds |
| insaflu | NA | CY125730 | N3v2 CY125730.1 Influenza A virus (A/Mexico/InDRE7218/2012(H7N3)) neuraminidase (NA) gene, complete cds |
| insaflu | HA | CY125945 | H18 CY125945.1 Influenza A virus (A/flat-faced bat/Peru/033/2010(H18N11)) hemagglutinin (HA) gene, complete cds |
| insaflu | NA | CY125947 | N11 CY125947.1 Influenza A virus (A/flat-faced bat/Peru/033/2010(H18N11)) neuraminidase-like protein (NA) gene, complete cds |
| insaflu | HA | CY130078 | H12 CY130078.1 Influenza A virus (A/duck/Alberta/60/1976(H12N5)) hemagglutinin (HA) gene, complete cds |
| insaflu | HA | CY130094 | H14 CY130094.1 Influenza A virus (A/mallard/Astrakhan/263/1982(H14N5)) hemagglutinin (HA) gene, complete cds |
| insaflu | NA | CY130096 | N5 CY130096.1 Influenza A virus (A/mallard/Astrakhan/263/1982(H14N5)) neuraminidase (NA) gene, complete cds |
| insaflu | HA | DQ376624 | H6 DQ376624.1 Influenza A virus (A/chicken/Taiwan/0705/99(H6N1)) hemagglutinin (HA) gene, complete cds |
| insaflu | HA | EU293864 | H16 EU293864.1 Influenza A virus (A/black-headed gull/Turkmenistan/13/76(H16N3)) hemagglutinin (HA) gene, complete cds |
| insaflu | HA | FJ183474 | H10 FJ183474.1 Influenza A virus (A/mallard/Bavaria/3/2006(H10N7)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | NA | FJ183475 | N7 FJ183475.1 Influenza A virus (A/mallard/Bavaria/3/2006(H10N7)) segment 6 neuraminidase (NA) gene, complete cds |
| insaflu | NA | GQ907296 | N3v1 GQ907296.1 Influenza A virus (A/black headed gull/Mongolia/1756/2006(H16N3)) segment 6 neuraminidase (NA) gene, complete cds |
| insaflu | HA | GU052203 | H11 GU052203.1 Influenza A virus (A/duck/England/1/1956(H11N6)) segment 4 hemagglutinin (HA) gene, complete cds |
| insaflu | NA | KC853765 | N9 KC853765.1 Influenza A virus (A/Hangzhou/1/2013(H7N9)) segment 6 neuraminidase (NA) gene, complete cds |
| insaflu | HA | KX879589 | H9 KX879589.1 Influenza A virus (A/swine/Hong Kong/9/98(H9N2)) segment 4 hemagglutinin (HA) gene, partial cds |
| insaflu | HA | M58428 | Victoria M58428.1 Influenza B/Victoria/2/87, hemagglutinin (seg 4), RNA |
| insaflu | NA | EU429793 | N4 EU429793.1 Influenza A virus (A/turkey/Ontario/6118/1968(H8N4)) segment 6 neuraminidase (NA) mRNA, complete cds |
| insaflu | NA | EU429795 | N6 EU429795.1 Influenza A virus (A/duck/England/1/1956(H11N6)) segment 6 neuraminidase (NA) mRNA, complete cds |
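
To illustrate how hits against these sequences could be combined into a type and subtype, the sketch below pairs each hit's gene (M1, HA or NA) with the label that begins its Details entry (for example Type_A, H1, N1 or Victoria). This is only a sketch under assumptions: the `(gene, label)` input shape and the function name `summarise_typing` are hypothetical, and the workflow's actual reporting logic may differ.

```python
from typing import Dict, Iterable, Tuple

UNDETERMINED = "undetermined"


def summarise_typing(hits: Iterable[Tuple[str, str]]) -> Tuple[str, str]:
    """Combine (gene, label) hits into a (type, subtype) pair.

    gene is one of "M1", "HA", "NA"; label is the leading token of the
    matching Details entry. Only the first hit per gene is used, which
    assumes the caller has already ordered hits best-first.
    """
    labels: Dict[str, str] = {}
    for gene, label in hits:
        labels.setdefault(gene, label)

    flu_type = labels.get("M1", UNDETERMINED)  # Type_A or Type_B
    ha = labels.get("HA")
    na = labels.get("NA")
    if ha and na:
        subtype = ha + na  # e.g. H1 + N1
    else:
        # Type B lineages (Yamagata, Victoria) are listed under HA only
        subtype = ha or na or UNDETERMINED
    return flu_type, subtype
```

For example, hits of `("M1", "Type_A")`, `("HA", "H1")` and `("NA", "N1")` give `("Type_A", "H1N1")`.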

Quickstart

The workflow uses nextflow to manage compute and software resources. Thus, nextflow will need to be installed before attempting to run the workflow.

The workflow can currently be run using either Docker or Singularity to provide isolation of the required software. Both methods are automated out-of-the-box, provided either Docker or Singularity is installed.

It is not required to clone or download the git repository in order to run the workflow. For more information on running EPI2ME Labs workflows visit our website.

Workflow options

To obtain the workflow, having installed nextflow, users can run:

nextflow run epi2me-labs/wf-flu --help

to see the options for the workflow.

Workflow outputs

The workflow creates several files that are useful for interpretation and analysis:

  • Per run:
    • wf-flu-report.html: Easy-to-use HTML report for all samples in the run
    • wf-flu-results.csv: Typing results in CSV format for onward processing
  • Per sample:
    • <SAMPLE_NAME>.stats: Read stats
    • <SAMPLE_NAME>.bam: Alignment of reads to reference
    • <SAMPLE_NAME>.bam.bai: BAM index
    • <SAMPLE_NAME>.annotate.filtered.vcf: medaka called variants
    • <SAMPLE_NAME>.draft.consensus.fasta: Consensus FASTA
    • <SAMPLE_NAME>.insaflu.typing.txt: abricate typing results
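
A quick way to check that a run produced the files listed above is sketched below. The file names and suffixes come from the list; the helper names and the recursive search are assumptions, since the exact layout of the output directory is not described here.

```python
from pathlib import Path
from typing import Iterable, List

PER_RUN_FILES = ["wf-flu-report.html", "wf-flu-results.csv"]
PER_SAMPLE_SUFFIXES = [
    ".stats",
    ".bam",
    ".bam.bai",
    ".annotate.filtered.vcf",
    ".draft.consensus.fasta",
    ".insaflu.typing.txt",
]


def expected_outputs(sample_names: Iterable[str]) -> List[str]:
    """List the file names expected for a run with the given samples."""
    names = list(PER_RUN_FILES)
    for sample in sample_names:
        names.extend(sample + suffix for suffix in PER_SAMPLE_SUFFIXES)
    return names


def missing_outputs(out_dir: str, sample_names: Iterable[str]) -> List[str]:
    """Return expected file names not found anywhere under out_dir."""
    # Search recursively because the directory layout is an assumption.
    present = {p.name for p in Path(out_dir).rglob("*") if p.is_file()}
    return [name for name in expected_outputs(sample_names) if name not in present]
```

An empty list from `missing_outputs` means every expected file name was found somewhere under the output directory.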

© 2020 - 2023 Oxford Nanopore Technologies plc. All rights reserved. Registered Office: Gosling Building, Edmund Halley Road, Oxford Science Park, OX4 4DQ, UK | Registered No. 05386273 | VAT No 336942382. Oxford Nanopore Technologies, the Wheel icon, EPI2ME, Flongle, GridION, Metrichor, MinION, MinIT, MinKNOW, Plongle, PromethION, SmidgION, Ubik and VolTRAX are registered trademarks of Oxford Nanopore Technologies plc in various countries. Oxford Nanopore Technologies products are not intended for use for health assessment or to diagnose, treat, mitigate, cure, or prevent any disease or condition.