Throughout its life, the Oxford Nanopore Open Data project has aimed to provide researchers with freely available benchmark datasets from Oxford Nanopore Technologies’ sequencing devices. The data is openly accessible and highly available through the Registry of Open Data on AWS.
Until now, all the data made available has been created and provided by Oxford Nanopore. Today we would also like to announce our intention to work with and support the Oxford Nanopore Community in extending the Oxford Nanopore Open Data project to include additional data from the sequencing of model organisms. Please reach out to us through support@nanoporetech.com.
As a first community collaboration, the Oxford Nanopore Open Data project now hosts raw sequencing device data (fast5 files) from a D. melanogaster strain, kindly provided by Bernard Kim at Stanford University.
Bernard Kim is a postdoctoral fellow in the lab of Dmitri Petrov (petrov.stanford.edu) at Stanford University. As a part of the Petrov Lab and in collaboration with numerous fly labs and researchers across the world, Bernard leads a series of projects using genomics tools to unravel the mechanisms of evolutionary change occurring at the scale of family Drosophilidae (of which there are >4,500 species). An essential component of this work is the inexpensive and rapid generation of thousands of accurate genome assemblies as an open-source genomic resource spanning the entire diverse group, using long-read sequencing on Oxford Nanopore devices.
The R10.4.1 Kit 14 chemistry dataset, made available by the Petrov lab, was generated for the D. melanogaster Berkeley Drosophila Genome Project iso-1 strain to demonstrate that the goals above can be achieved with nanopore sequencing alone. It is released to the scientific community without limitations on its use.
Bernard has already deposited his initial basecalling results under the bioproject PRJNA914057 (SRA / ENA). The EPI2ME Labs teams will endeavour to release updated basecalls when new software is released.
As with all data in the Oxford Nanopore Open Data project, Bernard’s MinKNOW device outputs (ionic current measurements in fast5 files, summary files, and logs) can be found at:
s3://ont-open-data/contrib/melanogaster_bkim_2023.01/flowcells/
and are accessible using the AWS Command Line Interface or any other S3 client. For a (very brief) tutorial see our Open Data Tutorials page.
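As a rough illustration of access through the AWS Command Line Interface, the sketch below builds the `aws s3 ls` and `aws s3 sync` commands for the prefix quoted above. It assumes the AWS CLI is installed and that the bucket permits unsigned requests (hence `--no-sign-request`); the helper names and the local destination directory are hypothetical and not part of any Oxford Nanopore tooling. The sync command defaults to `--dryrun`, so it only reports what would be copied.

```python
import shutil
import subprocess

# S3 prefix quoted in the announcement.
DATASET_URI = "s3://ont-open-data/contrib/melanogaster_bkim_2023.01/flowcells/"


def build_ls_command(uri=DATASET_URI, anonymous=True):
    """Return the argument list for listing an S3 prefix with the AWS CLI."""
    cmd = ["aws", "s3", "ls"]
    if anonymous:
        # Assumption: the bucket allows unsigned (credential-free) requests.
        cmd.append("--no-sign-request")
    cmd.append(uri)
    return cmd


def build_sync_command(dest, uri=DATASET_URI, anonymous=True, dry_run=True):
    """Return the argument list for copying an S3 prefix to a local directory."""
    cmd = ["aws", "s3", "sync"]
    if anonymous:
        cmd.append("--no-sign-request")
    if dry_run:
        # With --dryrun the CLI prints the planned transfers without copying.
        cmd.append("--dryrun")
    cmd += [uri, dest]
    return cmd


def run(cmd):
    """Run a prepared command and return its standard output as text."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("AWS CLI not found on PATH")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the commands rather than running them; pass either list to run()
    # to execute it. The destination directory name is a hypothetical example.
    print(" ".join(build_ls_command()))
    print(" ".join(build_sync_command("./melanogaster_bkim_2023.01")))
```

Raw fast5 data can be large, so it may be worth listing the prefix and reviewing the dry-run output before calling `build_sync_command` with `dry_run=False`.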
For more information regarding the precise details of the dataset please contact Bernard at the Petrov lab.
Excited to finally get R10.4.1+Q20 & looks like @nanopore only assembly is the way to go. Major yield & accuracy boost a bonus to our upcoming effort to sequence all of family Drosophilidae (@PetrovADmitri @danrdanny @danielrmatute @DarrenObbard @_hgellert @mbeisen et al) pic.twitter.com/MO8rhRx9of
— Bernard Kim (@Bernard_Y_Kim) December 21, 2022