Targeted BRCA Gene Analysis with Oxford Nanopore

By Matt Parker
Published in How Tos
June 19, 2023

Oxford Nanopore Technologies products are not intended for use for health assessment or to diagnose, treat, mitigate, cure or prevent any disease or condition.

wf-human-variation is our “do it all” Nextflow workflow for the identification of human DNA variation from Oxford Nanopore Technologies’ long read sequencing data. With its latest update, ClinVar variant annotation is now also included.

This post will show that our human variation workflow can also process targeted sequence data. In the following text we will analyse targeted BRCA1 and BRCA2 sequence data that was generated in a research setting using Oxford Nanopore Technologies’ adaptive sampling methodology. The analysis will be performed with our EPI2ME software running on a laptop computer. While the BRCA gene enrichment here was prepared using adaptive sampling, this “how to” would also apply to capture and/or PCR based methods of enrichment. The analysis described here could similarly be applied to large panels of genes of interest in research environments.

Data was generated using DNA from the cell line NA14636 provided by the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research.

The NA14636 sample is from a 56-year-old female with a family history of breast cancer. This individual was diagnosed with the disease at 50. Certain mutations in BRCA1 and BRCA2 are known to increase the risk of hereditary breast and ovarian cancer, so we sequenced the sample to see if we could detect any key BRCA1 and BRCA2 mutations.

The sample was prepared and sequenced using our ligation sequencing kit and the newer Ligation Sequencing Kit V14 on a GridION for 72 hours, with adaptive sampling of BRCA1 and BRCA2.

Analysis

Let’s dive in and see how we can get from our long Oxford Nanopore sequencing reads to a list of interesting variants.

If you haven’t already downloaded EPI2ME, it’s freely available for all major operating systems here: https://labs.epi2me.io/downloads/. There are some prerequisites, but the application will take you through these.

Inputs

Our first task is to make sure we have the input files the workflow requires. The required inputs to wf-human-variation are the following:

  • a BED file of regions we are interested in analysing (target regions)
  • a BAM file of mapped or unmapped sequencing reads (generated by MinKNOW, dorado, or another tool)
  • a FASTA reference genome

BAM file(s)

If you don’t have a mapped or unmapped BAM file to use as input, please see Appendix 1 for instructions.

BED file

If you have performed adaptive sampling you can simply use the BED file you used as input into MinKNOW when setting up your sequencing run. If not, you can easily make one: a BED file is a tab-separated file with 3 mandatory columns giving the chromosome, start and end position of your region. You can add additional columns, for instance for the name and size of the region. More details on the BED format are available here.

The BED file for the analysis described here looks like this:

chr13 32305479 32409671 BRCA2 104192
chr17 43034294 43135363 BRCA1 101069
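Before launching the workflow it can be worth checking that the BED file is well formed. The following is a minimal Python sketch of such a check, not part of wf-human-variation; the names `Region` and `read_bed` are hypothetical. It checks the three mandatory columns and computes each region's size as end minus start.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str


def read_bed(text: str) -> List[Region]:
    """Parse BED-formatted text and check the three mandatory columns."""
    regions = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # tolerate tabs or spaces between columns
        if len(fields) < 3:
            raise ValueError(f"line {line_number}: expected at least 3 columns")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end <= start:
            raise ValueError(f"line {line_number}: need 0 <= start < end")
        name = fields[3] if len(fields) > 3 else ""
        regions.append(Region(chrom, start, end, name))
    return regions


# The two target regions used in this post, written with tab separators.
bed_text = (
    "chr13\t32305479\t32409671\tBRCA2\t104192\n"
    "chr17\t43034294\t43135363\tBRCA1\t101069\n"
)
for region in read_bed(bed_text):
    print(region.name, region.chrom, region.end - region.start)
```

For the two regions above, the computed sizes agree with the fifth column of the BED file (104192 and 101069).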

FASTA reference genome

Download the human reference genome - we’ll use hg38 from UCSC in this example:

https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet/hg38.analysisSet.fa.gz
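Chromosome names in the BED file need to match the sequence names in the reference (for example `chr13` rather than `13`). As a rough sketch, assuming a gzip-compressed FASTA like the one above, the Python below collects the sequence names from the FASTA headers and reports any BED chromosomes that are absent. The function names are hypothetical, and the demonstration uses a tiny in-memory FASTA rather than the real genome.

```python
import gzip
import io
from typing import Iterable, Set


def fasta_contig_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names (the first word after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def missing_contigs(bed_chroms: Iterable[str], fasta_names: Set[str]) -> Set[str]:
    """Return the BED chromosome names that are not in the reference."""
    return set(bed_chroms) - fasta_names


# Tiny made-up FASTA, compressed in memory, standing in for the real reference.
demo_fasta = gzip.compress(b">chr13 demo\nACGT\n>chr17 demo\nACGT\n")
with gzip.open(io.BytesIO(demo_fasta), "rt") as handle:
    names = fasta_contig_names(handle)

print(missing_contigs(["chr13", "chr17"], names))  # nothing missing
print(missing_contigs(["13", "17"], names))        # names without the chr prefix
```

To run the same check on the downloaded reference, open it with `gzip.open("hg38.analysisSet.fa.gz", "rt")` instead; scanning a whole genome this way takes a little while.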

Running wf-human-variation

We can easily analyse data from small panels locally on a reasonably powered laptop, or on a GridION or PromethION device using the EPI2ME interface. Our workflow for the analysis of human variation, wf-human-variation, is available in the application and is easy to install and run (Figure 1).

Figure 1 - Bioinformatics workflows of all flavours are available in EPI2ME.

  1. In “Workflow Options” choose “Snp” - annotation of SNPs is now carried out automatically (>= v1.6.0).

  2. In “Input Options” choose:

  • the BAM file you created with wf-alignment or any pre-generated BAM file you wish to analyse.
  • the FASTA reference genome
  • the BED file defining the regions you wish to analyse
  3. Under “Small variant calling options” choose “Phase VCF” - ONT long reads mean we can determine the haplotype of the variants called.

  4. You can also alter the resources given to the workflow under “Multiprocessing Options” and “Extra Configuration”.

  5. Click “Launch workflow”.
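In a VCF file, phasing shows up in the genotype (GT) field: the VCF specification uses `|` between alleles for a phased genotype and `/` for an unphased one, and defines an optional `PS` (phase set) key. The Python sketch below reads these from the FORMAT and sample columns of a record. The function names and example values are made up for illustration, and whether `PS` is present depends on the phasing tool.

```python
from typing import Dict, Optional, Tuple


def sample_fields(format_column: str, sample_column: str) -> Dict[str, str]:
    """Pair the keys of the FORMAT column with the values of a sample column."""
    return dict(zip(format_column.split(":"), sample_column.split(":")))


def phase_info(format_column: str, sample_column: str) -> Tuple[bool, Optional[str]]:
    """Return (is_phased, phase_set) for one sample of one VCF record."""
    fields = sample_fields(format_column, sample_column)
    phased = "|" in fields.get("GT", "")
    phase_set = fields.get("PS") if phased else None
    if phase_set == ".":  # "." marks a missing value in VCF
        phase_set = None
    return phased, phase_set


# Made-up FORMAT and sample columns, for illustration only.
print(phase_info("GT:GQ:PS", "0|1:20:1000"))
print(phase_info("GT:GQ", "0/1:20"))
```

Variants that share a phase set and have the same allele order (for example `0|1` and `0|1`) sit on the same haplotype.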

Results

Using ClinVar to annotate the variants found by wf-human-variation (Figure 2) highlights one pathogenic variant in BRCA1: a 1 bp insertion at nucleotide 5677 in exon 24 (5677insA). This results in a frameshift and truncation at codon 1853 (Y1853X). This matches the information provided by Coriell for this sample.

ClinVar Annotation
Figure 2 - ClinVar annotated variants (benign and likely benign excluded).
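A similar filter can be applied to the VCF file directly. The Python sketch below keeps only records with a ClinVar significance other than benign or likely benign. It assumes the annotated VCF carries ClinVar's `CLNSIG` key in the INFO column, which is worth confirming in the header of your own output. The function names are hypothetical and the example records are made up, not taken from this sample.

```python
import re
from typing import Iterable, Iterator, List

EXCLUDED = {"benign", "likely_benign"}


def clnsig_terms(info: str) -> List[str]:
    """Extract lower-cased CLNSIG terms from a VCF INFO string."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "CLNSIG" and value:
            # Combined values such as Benign/Likely_benign are split apart.
            return [term.lower() for term in re.split(r"[|/,]", value) if term]
    return []


def keep_record(vcf_line: str) -> bool:
    """True if the record is annotated and not purely benign or likely benign."""
    columns = vcf_line.rstrip("\n").split("\t")
    if len(columns) < 8:
        return False
    terms = clnsig_terms(columns[7])
    return bool(terms) and not all(term in EXCLUDED for term in terms)


def filter_vcf(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged, plus the records that pass keep_record."""
    for line in lines:
        if line.startswith("#") or keep_record(line):
            yield line


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "chrTest\t100\t.\tA\tAT\t.\tPASS\tCLNSIG=Pathogenic\n",
    "chrTest\t200\t.\tC\tT\t.\tPASS\tCLNSIG=Benign/Likely_benign\n",
    "chrTest\t300\t.\tG\tA\t.\tPASS\tDP=10\n",
]
for line in filter_vcf(example):
    print(line, end="")
```

Records without any ClinVar annotation are dropped as well, since the aim here is a short list of annotated variants to review.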

Other important data is output by the workflow (Figure 3), including summaries of the variants and of the alignment in HTML format.

EPI2ME Small Variant Report
Figure 3 - EPI2ME small variant report.

You can also take a look at the raw data such as the VCF file for the variants (Figure 4). Clicking “Open folder” at the bottom of the page will open a file explorer to see the raw data.

Output Files
Figure 4 - Useful output files are created by the workflow.

Conclusion

We hope that you found this quick tutorial on how to analyse targeted human Oxford Nanopore sequencing research data with wf-human-variation and EPI2ME useful. If you have any comments, questions or suggestions, don’t hesitate to let us know using the usual channels.

Appendix 1 - BAM file generation

You have two options if you have FASTQ data:

  1. Align your reads
  2. Make an unmapped BAM

1. Alignment with wf-alignment

You’ll need the reference genome FASTA from above; we can align our sequencing reads from our FASTQ files to this reference with wf-alignment to generate some statistics on our sequencing data and also create our BAM file for wf-human-variation.

In EPI2ME Labs install the wf-alignment workflow if you haven’t already done so. Click “Run this workflow”.

Under “Input Options”, choose the path to your FASTQ files and the path to your reference genome (hg38.analysisSet.fa.gz).

You can also increase the resources given to the workflow under “Misc Options” and “Extra Configuration”.

Click “Launch workflow”

2. Make an unmapped BAM

If you are reasonably familiar with the command line you can make what’s called an “unmapped” BAM file using samtools:

samtools import -o unmapped_reads.bam -O BAM <FASTQ>

Where <FASTQ> is the path to your FASTQ file(s). This command will produce an unmapped BAM file called unmapped_reads.bam.
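One way to sanity-check the conversion is to count the reads in the FASTQ input and compare that number with the record count of the BAM file, which `samtools view -c unmapped_reads.bam` reports. The Python sketch below does the FASTQ side only. It assumes the usual four lines per record, and the function name is hypothetical.

```python
import gzip
import os
import tempfile


def count_fastq_records(path: str) -> int:
    """Count reads in a FASTQ file, assuming four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Blank lines (for example a trailing newline) are ignored.
        line_count = sum(1 for line in handle if line.strip())
    if line_count % 4 != 0:
        raise ValueError(f"{path}: {line_count} lines is not a multiple of 4")
    return line_count // 4


# Demonstration on a tiny made-up FASTQ file containing two reads.
demo = "@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n"
with tempfile.TemporaryDirectory() as tmp_dir:
    fastq_path = os.path.join(tmp_dir, "demo.fastq")
    with open(fastq_path, "w") as handle:
        handle.write(demo)
    print(count_fastq_records(fastq_path))
```

The same function also reads gzip-compressed files whose names end in `.gz`.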

Appendix 2 - Running on the command line

Just like with our desktop application, running wf-human-variation on the command line is easy.

A good place to start is to list the parameters that can be used to run the workflow:

nextflow run epi2me-labs/wf-human-variation --help

To recreate the analysis above, these are the options to use:

nextflow run epi2me-labs/wf-human-variation --bam <PATH_TO_READS> --ref <PATH_TO_FASTA> --bed <PATH_TO_BED> --phase_vcf --snp
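If you launch the workflow from a script, it can help to check that the input files exist before starting Nextflow. The Python sketch below assembles the same command as above, using only the options shown in this post. The file names `reads.bam` and `brca.bed` are placeholders and the function names are hypothetical.

```python
import shlex
from pathlib import Path
from typing import List


def build_command(bam: str, ref: str, bed: str) -> List[str]:
    """Assemble the wf-human-variation command used in this post."""
    return [
        "nextflow", "run", "epi2me-labs/wf-human-variation",
        "--bam", bam,
        "--ref", ref,
        "--bed", bed,
        "--phase_vcf",
        "--snp",
    ]


def missing_inputs(*paths: str) -> List[str]:
    """Return the input paths that do not exist on disk."""
    return [path for path in paths if not Path(path).exists()]


# Placeholder file names; replace them with your own paths.
inputs = ("reads.bam", "hg38.analysisSet.fa.gz", "brca.bed")
command = build_command(*inputs)
print(shlex.join(command))

absent = missing_inputs(*inputs)
if absent:
    print("Missing input files:", ", ".join(absent))
# When all inputs are present, the workflow could be started with:
# subprocess.run(command, check=True)
```

Passing the command as a list keeps paths containing spaces intact when it is handed to `subprocess.run`.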

Matt Parker

Associate Director, Clinical Bioinformatics
