Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.
Its fluent DSL simplifies the implementation and the deployment of complex parallel and reactive workflows on clouds and clusters.

Prerequisites

In this guide we will walk through the usage of EPI2ME Labs Nextflow workflows from the command line on Linux, macOS, or Windows (through WSL2). It is assumed that Nextflow has been installed either as part of the EPI2ME Labs desktop application or, in the case of macOS and Linux, as described in our Installation guide.

EPI2ME Labs workflows can currently be run using either Docker or Singularity to provide isolation of the required software. Each of these methods is automated out-of-the-box provided Docker or Singularity is installed.
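
Before running a workflow it can be useful to confirm that one of these container runtimes is actually on your PATH. The snippet below is a minimal sketch of such a check; the function name detect_container_runtime is our own illustration and is not part of Nextflow or EPI2ME Labs.

```shell
# Report which supported container runtime is available on PATH.
# Prints "docker", "singularity", or "none".
detect_container_runtime() {
    if command -v docker >/dev/null 2>&1; then
        echo "docker"
    elif command -v singularity >/dev/null 2>&1; then
        echo "singularity"
    else
        echo "none"
    fi
}

runtime=$(detect_container_runtime)
echo "Container runtime found: ${runtime}"
```

If the result is "none", install Docker or Singularity before continuing. Note that this only checks that the executable exists, not that the Docker daemon is running.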

EPI2ME Labs workflows no longer support conda as a means of managing their software environments. This decision was taken after many issues were encountered (both by ourselves and by end users) while using conda for dependency management.

Generic Workflow instructions

The code behind all EPI2ME Labs workflows is hosted publicly on our GitHub space: https://github.com/epi2me-labs/. Workflow projects are prefixed with wf-. For example, the code powering our ARTIC-based SARS-CoV-2 analysis is available at: https://github.com/epi2me-labs/wf-artic.

For the most part, users will not need to interact directly with the GitHub code repositories, as Nextflow can automatically manage workflows hosted on GitHub.

The instructions below use wf-artic as an exemplar workflow. For other workflows, simply replace wf-artic with wf-name-of-workflow.

Downloading and Running Workflows

With the prerequisites installed, users can run:

nextflow run epi2me-labs/wf-artic --help

to see the options for a specific workflow. The help message will display all common options available for adjusting the workflow's behaviour. See Configuration and tuning below for information on changing how workflows are run.

The command below uses test data available from the GitHub repository, which can be obtained with git clone https://github.com/epi2me-labs/wf-artic. The test_data paths in the command are relative, so run it from inside the cloned wf-artic directory.

To run the workflow using Docker containers, supply the -profile standard argument to nextflow run:

# run the pipeline with the test data
OUTPUT=my_artic_output
nextflow run epi2me-labs/wf-artic \
    -w ${OUTPUT}/workspace \
    -profile standard \
    --fastq test_data/sars-samples-demultiplexed/ \
    --samples test_data/sample_sheet \
    --out_dir ${OUTPUT}

Configuration and tuning

This section provides some minimal guidance for changing common options; see the Nextflow documentation for further details.

The default settings for the workflow are described in the configuration file nextflow.config found within the git repository. The default configuration defines an executor that will use up to a specified maximum number of CPU cores (four at the time of writing) and amount of RAM (eight gigabytes).

If the workflow is being run on a device other than a GridION, the maximum memory and number of CPU cores may be adjusted to match the resources available on that device. This can be done by creating a file my_config.cfg in the working directory with the following contents:

executor {
    $local {
        cpus = 4
        memory = "8 GB"
    }
}

and running the workflow providing the -c (config) option, e.g.:

# run the pipeline with custom configuration
nextflow run epi2me-labs/wf-artic \
    -c my_config.cfg \
    ...

Settings in the my_config.cfg file will override the corresponding settings in the default configuration file. See the Nextflow documentation for more information concerning customized configuration.
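
Rather than typing the configuration by hand, the same file contents can be generated from the shell. The following is a minimal sketch: make_local_config is a hypothetical helper name of our own, and the memory value of 8 is simply the default quoted above, not a recommendation.

```shell
# Print a Nextflow config that caps the local executor's resources.
# Usage: make_local_config <cpus> <memory_gb>
make_local_config() {
    cat <<EOF
executor {
    \$local {
        cpus = $1
        memory = "$2 GB"
    }
}
EOF
}

# Count online CPU cores; fall back to the default of 4 if getconf is unavailable.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)

# 8 (GB) is the default quoted above; choose a value that fits your machine.
make_local_config "${cpus}" 8
```

Redirecting the output, for example make_local_config 8 16 > my_config.cfg, produces a file that can be passed to nextflow run with the -c option. You may prefer to request fewer cores than the machine has so that other software remains responsive.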

Updating workflows

Periodically when running workflows, users may find that a message is displayed indicating that an update to the workflow is available.

To update a workflow, for example wf-artic, simply run:

nextflow pull epi2me-labs/wf-artic
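
If you use several workflows, the pull command can be wrapped in a small loop. This is only a sketch: update_workflows is a hypothetical helper of our own, and wf-name-of-workflow is the same placeholder used earlier in this guide.

```shell
# Pull the latest version of each named EPI2ME Labs workflow.
# Usage: update_workflows wf-artic wf-name-of-workflow
update_workflows() {
    for wf in "$@"; do
        echo "Updating epi2me-labs/${wf}"
        nextflow pull "epi2me-labs/${wf}"
    done
}
```

Calling update_workflows wf-artic is equivalent to the single command shown above.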

Managing disk space

When running workflows with large data sets, intermediate steps can take up considerable disk space. It may therefore be worth pointing the work (-w) and output (--out_dir) parameters at a directory with plenty of free disk space. When working on a GridION this may be on a mounted drive.
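
To compare candidate locations before choosing one, free space can be queried with the standard df utility. The helper below is a minimal sketch (free_gb is our own name); it assumes the filesystem name reported by df contains no spaces.

```shell
# Print the free space, in whole gigabytes, on the filesystem holding a directory.
# Usage: free_gb <directory>
free_gb() {
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

echo "Free space in current directory: $(free_gb .) GB"
```

Running free_gb against a mounted drive path reports the space available there, which can help decide where -w and --out_dir should point.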

After running a few workflows you may want to clear up intermediate and log files created as part of the workflow that are stored in the work directory.

To clean up the work directory, run the following from the directory where you ran your workflow command:

nextflow clean -f
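
Because the clean is irreversible, it can be worth previewing what would be removed first. The sketch below wraps the preview and the removal in one function; clean_last_run and its --force argument are our own illustration, while nextflow log and the -n (dry run) and -f (force) options of nextflow clean are standard Nextflow commands. As far as we are aware, nextflow clean acts on the most recent run unless a run name is given.

```shell
# Preview, and optionally perform, a clean-up of Nextflow work files.
# Usage: clean_last_run [--force]
clean_last_run() {
    # List the runs recorded in this launch directory.
    nextflow log
    # Dry run: print what would be removed without deleting anything.
    nextflow clean -n
    # Only delete when explicitly asked to.
    if [ "${1:-}" = "--force" ]; then
        nextflow clean -f
    fi
}
```

Run clean_last_run to inspect the list, then clean_last_run --force once you are happy with it.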

Tidying up Docker

If you want to remove old Docker images, type docker images to find the ID of the image to delete, then run docker rmi <image-id>.

If you want to remove old Docker containers, type docker ps -a to find the container ID, then run docker rm <container-id>.
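
Docker also provides bulk prune commands as an alternative to removing items one at a time. The sketch below uses them inside a hypothetical helper named tidy_docker. Be aware that these commands affect every stopped container and dangling image on the machine, not only those created by EPI2ME Labs workflows.

```shell
# Remove all stopped containers and all dangling (untagged) images.
# The -f flag skips Docker's confirmation prompt.
tidy_docker() {
    docker container prune -f
    docker image prune -f
}
```

By default docker image prune removes only dangling images; tagged images that are merely unused are kept, so docker rmi is still needed for those.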

Useful links

Nextflow: the workflow management system used by EPI2ME Labs workflows.
Docker: a software container platform that can optionally be used by EPI2ME Labs workflows.