We have released Kraken2 server software that enables Kraken2-based taxonomic classification of DNA sequences in a gRPC server-client architecture. With this architectural change, the typically large and memory-intensive Kraken2 database only has to be loaded once, which eases a substantial bottleneck when sequence data is being processed in real time. The Kraken2 server software is available through GitHub, and pre-compiled binaries are available through conda. The software is tagged at revision v0.0.5 and will be preinstalled in appropriate workflow containers. This metagenomics server is a critical component of a new real-time implementation of wf-metagenomics that will be released in the next few days.
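
The benefit of loading the database once can be illustrated with a minimal Python sketch. This is not the Kraken2 server code or its gRPC interface: the names `load_toy_database` and `ToyClassificationServer` are hypothetical, and the exact k-mer lookup is a deliberately simplified stand-in for Kraken2's classification algorithm. It only shows the pattern of paying the load cost at start-up and then answering many requests.

```python
from collections import Counter


def load_toy_database():
    # Hypothetical stand-in for an expensive database load (assumption:
    # a real database would be far larger and slower to read).
    return {"ACGT": "taxon_A", "CGTA": "taxon_A", "TTTT": "taxon_B"}


class ToyClassificationServer:
    """Long-lived object that holds the database for many requests."""

    def __init__(self, loader):
        self.loads = 0
        self._db = loader()  # the load cost is paid once, at start-up
        self.loads += 1

    def classify(self, sequence, k=4):
        # Vote over exact k-mer matches; a toy rule, not Kraken2's method.
        hits = Counter()
        for i in range(len(sequence) - k + 1):
            taxon = self._db.get(sequence[i:i + k])
            if taxon is not None:
                hits[taxon] += 1
        return hits.most_common(1)[0][0] if hits else "unclassified"


server = ToyClassificationServer(load_toy_database)
# Each incoming read is a cheap request against the already-loaded database.
results = [server.classify(read) for read in ["ACGTA", "TTTTT", "GGGGG"]]
```

However many reads arrive, `server.loads` stays at 1, which is the property that matters when batches of reads are classified as they come off the sequencer.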

The release of the real-time wf-metagenomics workflow mentioned above is imminent. The re-engineered workflow steps away from the post-run data analysis dogma and encourages the analysis of sequence data as it is being produced; the workflow is thus more akin to the EPI2ME WIMP product. The technical implementation of these reinvigorated analysis capabilities is described in detail in an accompanying blog post. In summary, the new functionality leverages the watchPath capabilities of Nextflow and utilises the Kraken2 server mentioned above. We hope that this workflow might be used as a template for the development of other workflows that can explore datasets as they are generated. The updated workflow will be available in a few days' time with the tagged version v2.0.0.
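
The idea of analysing files as they appear can be sketched in plain Python. This is an analogy only, not Nextflow code and not how the workflow is implemented: the helper `poll_new_files` and the `batch_*.fastq` file names are assumptions made for illustration, and a simple polling step stands in for Nextflow's watchPath behaviour.

```python
import tempfile
from pathlib import Path


def poll_new_files(directory, seen, pattern="*.fastq"):
    """Return files matching `pattern` that have not been reported before."""
    fresh = sorted(
        p for p in Path(directory).glob(pattern) if p.name not in seen
    )
    seen.update(p.name for p in fresh)  # remember what has been handed out
    return fresh


with tempfile.TemporaryDirectory() as tmp:
    seen = set()
    # A first batch of reads is written by the (simulated) sequencer.
    Path(tmp, "batch_0.fastq").write_text("")
    first = [p.name for p in poll_new_files(tmp, seen)]
    # A second batch appears later; only the new file is returned.
    Path(tmp, "batch_1.fastq").write_text("")
    second = [p.name for p in poll_new_files(tmp, seen)]
```

Each poll yields only the files that arrived since the previous one, so downstream classification can run per batch instead of waiting for the sequencing run to finish.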

Our workflow for whole human genome analysis, wf-human-variation, has been updated to v0.2.0. This release adds the ability to report regions of 5mC base modification in the CpG sequence context. When base-calling is performed using a model that includes base modifications, the modified-base information is written to the BAM output files. This per-base, per-read information is now distilled into a more informative BED file of genomic regions using the modbam2bed software. The BED information can be presented as tracks using genome browsers such as IGV or JBrowse. Please also see the workflow CHANGELOG for additional information.
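
To make the distillation step concrete, the following Python sketch aggregates per-read modification calls into per-position summaries. It is not modbam2bed and does not reproduce its output columns: the function `summarise_calls`, the input tuples, and the row layout (chromosome, start, end, coverage, percent modified) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def summarise_calls(calls):
    """Collapse (chrom, pos, is_modified) calls into BED-like rows."""
    counts = defaultdict(lambda: [0, 0])  # (chrom, pos) -> [modified, total]
    for chrom, pos, is_modified in calls:
        counts[(chrom, pos)][1] += 1
        if is_modified:
            counts[(chrom, pos)][0] += 1
    rows = []
    for (chrom, pos), (modified, total) in sorted(counts.items()):
        percent = round(100.0 * modified / total, 1)
        # BED intervals are zero-based and half-open: [pos, pos + 1).
        rows.append((chrom, pos, pos + 1, total, percent))
    return rows


# Hypothetical calls: three reads cover chr1:100, one read covers chr1:250.
calls = [
    ("chr1", 100, True),
    ("chr1", 100, False),
    ("chr1", 100, True),
    ("chr1", 250, False),
]
bed_rows = summarise_calls(calls)
```

One row per genomic position, rather than one record per read, is what makes the result convenient to load as a track in a genome browser.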

Our wf-single-cell workflow (previously known as sockeye) has been updated and now provides a matrix of gene transcript × cell barcode counts. The inclusion of transcript-resolution data further demonstrates the value of long-read transcriptomic sequence data in the study of single cells. Further information on the update is provided in the workflow's CHANGELOG. The workflow has been released with tagged version v0.1.2.
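
As a rough picture of what such a matrix holds, the Python sketch below counts reads per transcript and cell barcode. It is not the workflow's implementation or output format: the function `count_matrix`, the transcript names, and the barcodes are hypothetical, and real data would normally be stored in a sparse format.

```python
from collections import Counter


def count_matrix(assignments):
    """Build a transcript x barcode count matrix from read assignments."""
    counts = Counter(assignments)  # (transcript, barcode) -> read count
    transcripts = sorted({t for t, _ in assignments})
    barcodes = sorted({b for _, b in assignments})
    # Rows are transcripts, columns are cell barcodes; absent pairs count 0.
    matrix = [[counts[(t, b)] for b in barcodes] for t in transcripts]
    return transcripts, barcodes, matrix


# Hypothetical per-read assignments of (transcript, cell barcode).
assignments = [
    ("tx1", "AAAC"),
    ("tx1", "AAAC"),
    ("tx2", "AAAC"),
    ("tx1", "GGGT"),
]
transcripts, barcodes, matrix = count_matrix(assignments)
```

Here `matrix[0][0]` is the number of reads assigned to `tx1` in the cell with barcode `AAAC`.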

The EPI2ME Labs team would welcome recommendations for new workflows, tutorials, and functionality that you would like to see included in the product.