Updated Tumor Normal Pair Benchmark Dataset

By Sarah Griffiths
Published in Data Releases
March 07, 2024

We are pleased to announce an updated data release for the COLO829/COLO829BL tumour/normal pair. Sequencing was performed with the 5 kHz sampling rate upgrade to the Ligation Sequencing Kit 14. This release complements and supersedes our previous release.

These reference samples were sequenced on four PromethION flow cells: two for the tumour sample and two for the normal sample. They should provide a valuable resource for cancer researchers.

Data location

As with previous releases, the new dataset is available for anonymous download from an Amazon Web Services S3 bucket. The bucket is part of the Open Data on AWS project, which enables sharing and analysis of a wide range of data.

The data is located in the bucket at:

s3://ont-open-data/colo829_2024.03/

See the tutorials page for information on downloading the dataset.
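As a minimal sketch of what anonymous access can look like, the Python below splits the S3 URI above into bucket and prefix and builds an unsigned listing request. It assumes the bucket permits public listing through the standard S3 `ListObjectsV2` REST call at the virtual-hosted address `https://<bucket>.s3.amazonaws.com/`; the helper names are hypothetical and are not part of any Oxford Nanopore tooling. The tutorials page remains the authoritative guide.

```python
from urllib.parse import urlencode, urlparse

DATASET_URI = "s3://ont-open-data/colo829_2024.03/"


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def listing_url(uri: str, max_keys: int = 1000) -> str:
    """Build an unsigned ListObjectsV2 URL (assumes public listing is allowed)."""
    bucket, prefix = split_s3_uri(uri)
    query = urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def cli_listing_command(uri: str) -> list[str]:
    """Equivalent AWS CLI invocation; --no-sign-request skips credentials."""
    return ["aws", "s3", "ls", "--no-sign-request", uri]


if __name__ == "__main__":
    print(listing_url(DATASET_URI))
    print(" ".join(cli_listing_command(DATASET_URI)))
```

The URL can be fetched with any HTTP client and returns an XML listing. For bulk transfers the AWS CLI with `--no-sign-request` is usually the more practical route.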

Sample preparation

COLO829 melanoma fibroblasts (ATCC CRL-1974) and COLO829BL normal B lymphoblasts (ATCC CRL-1980) were cultured for three days in RPMI-1640 medium with 10% fetal bovine serum and 1% antibiotic-antimycotic, incubated at 37°C with 5% CO2. DNA was extracted from five million cells with the DNeasy Blood and Tissue Kit (Qiagen, Cat. No. 69504) following the manufacturer's instructions. Sequencing libraries were prepared following the Oxford Nanopore Ligation Sequencing Kit instructions. Size selection was done using BluePippin, and 20 fmol of library was loaded onto each R10.4.1 PromethION flow cell. Sequencing was performed on a PromethION 24 instrument running MinKNOW software version 23.11.5.

Sequencing outputs

Four flow cells were used to sequence the samples to high depth:

Genome     Description  Preparation  Flowcell  Yield / Gbase
COLO829    Tumour       DNeasy       PAU61426  137
COLO829    Tumour       DNeasy       PAU59949  136
COLO829BL  Normal       DNeasy       PAU59807  167
COLO829BL  Normal       DNeasy       PAU61427  131
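
To put the yields in context, the short sketch below totals the per-flow-cell figures from the table and converts them to a rough depth of coverage. The genome size of 3.1 Gbase is an assumption, not a figure from this release, and raw yield overstates aligned depth, so the results should be read as upper-bound estimates only.

```python
# Yields in Gbase per flow cell, copied from the table above.
YIELDS_GBASE = {
    "COLO829": {"PAU61426": 137, "PAU59949": 136},
    "COLO829BL": {"PAU59807": 167, "PAU61427": 131},
}

# Assumption: approximate human genome size in Gbase (not stated in the post).
GENOME_SIZE_GBASE = 3.1


def total_yield(sample: str) -> int:
    """Sum the yield across all flow cells for one sample."""
    return sum(YIELDS_GBASE[sample].values())


def approx_depth(sample: str) -> float:
    """Rough upper bound on mean depth: total yield divided by genome size."""
    return total_yield(sample) / GENOME_SIZE_GBASE


if __name__ == "__main__":
    for sample in YIELDS_GBASE:
        print(f"{sample}: {total_yield(sample)} Gbase, ~{approx_depth(sample):.0f}x")
```

Under these assumptions the tumour sample totals 273 Gbase (roughly 88x) and the normal sample 298 Gbase (roughly 96x).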

For each flow cell used in the sequencing, the primary sequencer outputs are available as .pod5 files. We also provide sequencing reads in BAM format produced by our wf-basecalling workflow. Reads are aligned to the GRCh38 human reference.

Figure: Sequencing summary metrics for the SUP COLO829/COLO829BL tumour/normal pair sequenced with Oxford Nanopore Technologies' PromethION instrument.

Analysis

As with the previous release, the analyses presented here were performed using the latest versions of our workflows:

  • wf-basecalling
  • wf-somatic-variation

The somatic variant calling workflow uses ClairS to call variants in the tumour sample, eliminating variants that are also found in the paired normal sample.
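
The idea of removing variants shared with the normal sample can be illustrated with a deliberately naive set subtraction. This is not how ClairS works internally (it is a dedicated somatic caller, not a simple filter), and the `Variant` type, the `subtract_normal` helper and the example coordinates are all made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record: position plus alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def subtract_normal(
    tumour: Iterable[Variant], normal: Iterable[Variant]
) -> List[Variant]:
    """Keep tumour variants that are absent from the paired normal sample."""
    normal_set = set(normal)
    return [v for v in tumour if v not in normal_set]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    tumour_calls = [
        Variant("chr1", 1000, "A", "T"),  # also in normal: likely germline
        Variant("chr2", 2000, "G", "C"),  # tumour only: candidate somatic
    ]
    normal_calls = [Variant("chr1", 1000, "A", "T")]
    for variant in subtract_normal(tumour_calls, normal_calls):
        print(variant)
```

A real somatic caller also has to account for sequencing error, tumour purity and coverage differences between the two samples, which a plain subtraction ignores.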

The variant calling workflow was run using data from both the COLO829 (tumour) flowcells and the single COLO829BL (normal). The results of the workflow are present at:

s3://ont-open-data/colo829_2024.03/analysis/wf_somatic_variation/

Further information

For additional information regarding these data please contact support@nanoporetech.com.

We hope that these data and analyses provide a useful resource to the community.
