EPI2ME Labs workflows are built using the Nextflow workflow framework:
Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.
Its fluent DSL simplifies the implementation and the deployment of complex parallel and reactive workflows on clouds and clusters.
In this guide we will walk through usage of EPI2ME Labs Nextflow workflows from the command line on Linux, macOS, or Windows (through WSL2). It is assumed that Nextflow has been installed either as part of the EPI2ME Labs desktop application or, in the case of macOS and Linux, as described in our Installation guide.
EPI2ME Labs workflows can currently be run using either Docker or Singularity to provide isolation of the required software. Each of these methods is automated out of the box, provided Docker or Singularity is installed.
EPI2ME Labs workflows no longer support conda as a means of managing their software environments. This decision was taken after many issues, encountered both by ourselves and by end users, while using conda for dependency management.
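As a quick pre-flight check, the sketch below reports which of the two supported container engines can be found on the PATH. The function name detect_container_engine is our own and is not part of Nextflow or EPI2ME Labs; it only tests that the executable exists, not that the Docker daemon is running.

```shell
#!/bin/sh
# Hypothetical helper: print the first supported container engine on PATH.
# Checks for the executable only; it does not verify that Docker is running.
detect_container_engine() {
    for candidate in docker singularity; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "none"
    return 1
}

engine=$(detect_container_engine) || true
echo "Container engine found: $engine"
```

If the result is none, install Docker or Singularity before attempting to run a workflow.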
The code behind all EPI2ME Labs workflows is hosted publicly on our GitHub space: https://github.com/epi2me-labs/. Workflow projects are prefixed with wf-.
For example, the code powering our ARTIC-based SARS-CoV-2 analysis is available at: https://github.com/epi2me-labs/wf-artic.
For the most part, users will not need to interact directly with the GitHub code repositories, as Nextflow can automatically manage workflows available on GitHub.
The instructions below use wf-artic as an exemplar workflow; for other workflows simply replace wf-artic with wf-name-of-workflow.
With the prerequisites installed, users can run:
nextflow run epi2me-labs/wf-artic --help
to see the options for a specific workflow. The help message will display all common options available for adjusting the workflow's behaviour. See Configuration and tuning below for information on changing how workflows are run.
To run the workflow using Docker containers, supply the -profile standard argument to nextflow run.
The command below uses test data available from the GitHub repository, which can be obtained with git clone https://github.com/epi2me-labs/wf-artic:
# run the pipeline with the test data
OUTPUT=my_artic_output
nextflow run epi2me-labs/wf-artic \
    -w ${OUTPUT}/workspace \
    -profile standard \
    --fastq test_data/sars-samples-demultiplexed/ \
    --samples test_data/sample_sheet \
    --out_dir ${OUTPUT}
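Because the example command refers to the test data by relative path, it is worth confirming that the paths resolve before launching. The sketch below is one way to do that; check_inputs is a hypothetical helper of our own, and it assumes the command is being run from inside the cloned wf-artic directory.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm the inputs exist before launching.
# Paths match the example command and are relative to the current directory.
check_inputs() {
    fastq_dir=$1
    sample_sheet=$2
    status=0
    if [ ! -d "$fastq_dir" ]; then
        echo "Missing FASTQ directory: $fastq_dir" >&2
        status=1
    fi
    if [ ! -e "$sample_sheet" ]; then
        echo "Missing sample sheet: $sample_sheet" >&2
        status=1
    fi
    return $status
}

if check_inputs test_data/sars-samples-demultiplexed test_data/sample_sheet; then
    echo "Inputs found; ready to run the workflow."
else
    echo "Inputs not found; check the current directory."
fi
```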
This section provides some minimal guidance for changing common options; see the Nextflow documentation for further details.
The default settings for the workflow are described in the configuration file nextflow.config found within the git repository. The default configuration defines an executor that will use up to a specified maximum number of CPU cores (four at the time of writing) and amount of RAM (eight gigabytes). If the workflow is being run on a device other than a GridION, these limits may be adjusted to match the CPU cores and memory available. This can be done by creating a file my_config.cfg in the working directory with the following contents:
executor {
    $local {
        cpus = 4
        memory = "8 GB"
    }
}
and running the workflow with the -c (config) option, e.g.:
# run the pipeline with custom configuration
nextflow run epi2me-labs/wf-artic \
    -c my_config.cfg \
    ...
The settings in the my_config.cfg file will override the corresponding settings in the default configuration file. See the Nextflow documentation for more information concerning customized configuration.
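The configuration file can also be generated from the shell, which is convenient when the limits differ between machines. The sketch below writes the same executor block with the values supplied as arguments; write_executor_config is a hypothetical helper of our own, and the values 8 and 16 are purely illustrative.

```shell
#!/bin/sh
# Hypothetical helper: write a Nextflow config that caps the local executor.
# Arguments: number of CPUs, memory in gigabytes, output file name.
write_executor_config() {
    cpus=$1
    mem_gb=$2
    out_file=$3
    cat > "$out_file" <<EOF
executor {
    \$local {
        cpus = ${cpus}
        memory = "${mem_gb} GB"
    }
}
EOF
}

# Example: allow the workflow to use 8 cores and 16 GB of RAM.
write_executor_config 8 16 my_config.cfg
cat my_config.cfg
```

The resulting file is passed to nextflow run with -c my_config.cfg exactly as for a hand-written one.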
Periodically when running workflows, users may find that a message is displayed indicating that an update to the workflow is available.
To update the workflow (e.g. the wf-artic workflow), simply run:
nextflow pull epi2me-labs/wf-artic
When running workflows with large data sets, intermediate steps can take up considerable disk space, so it may be worth pointing the work (-w) and output (--out_dir) parameters at a directory with plenty of disk space.
When working on a GridION this may be on a mounted drive.
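To decide where to place the work and output directories, it can help to check how much space a candidate location has free. The sketch below uses the POSIX form of df; free_gb is a hypothetical helper of our own, and the target directory is an assumption to be replaced with the location you intend to use.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in whole gigabytes, on the
# filesystem holding the given directory (POSIX df output, 1 KiB blocks).
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Assumed target: the current directory; substitute a mounted drive as needed.
target=.
available=$(free_gb "$target")
echo "Free space under $target: ${available} GB"
```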
After running a few workflows you may want to clear up intermediate and log files created as part of the workflow that are stored in the work directory.
To clean up the work directory, run the following from the directory where you launched the workflow:
nextflow clean -f
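Since the forced clean cannot be undone, it may be worth previewing what would be removed first. The sketch below wraps Nextflow's own dry-run option (-n) in a small function; preview_clean is a hypothetical name of our own, and the function does nothing if Nextflow is not on the PATH.

```shell
#!/bin/sh
# Hypothetical wrapper: preview a clean-up before committing to it.
preview_clean() {
    if ! command -v nextflow >/dev/null 2>&1; then
        echo "nextflow not found on PATH; nothing to preview."
        return 0
    fi
    nextflow log        # list previous runs recorded in this directory
    nextflow clean -n   # dry run: print what would be removed
}
```

Call preview_clean from the directory where the workflow was launched, and follow it with nextflow clean -f once the listing looks right.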
To remove old Docker images, type docker images to find the ID of the image to delete, and then run docker rmi <image-id>.
To remove old Docker containers, type docker ps -a to find the container ID, and then run docker rm <container-id>.