Pipeline overview

scdrake offers two pipelines - one for single-sample, and second one for integration of multiple samples (which were processed by the single-sample pipeline before). As for now, each pipeline consists of two subpipelines (referred to as stages), and two stages common to both single-sample and integration pipelines.

A more detailed diagram with target structure can be found here.

Each stage has its own config, plus there is a main config for each pipeline. You can read more about configs in a separate vignette("scdrake_config"). Each stage also outputs a report in HTML format with rich graphics.

Advanced users might be interested in looking into source code of scdrake’s plans (files named plans_*.R).

Pipeline steps are mostly based on recommendations given in a great book Orchestrating Single-Cell Analysis with Bioconductor.

Example pipeline output

You can inspect output from the pipeline here.

The used datasets are:

PBMC 1k (v3 chemistry, Cell Ranger 3.0.0)
PBMC 3k (v2 chemistry, Cell Ranger 1.1.0)

All credits for these datasets go to 10x Genomics. Visit https://www.10xgenomics.com/resources/datasets for more information.

Pipelines

Single-sample pipeline

This is a pipeline for processing a single-sample.

Stages

Stage 01_input_qc: reading in data, filtering, quality control -> vignette("stage_input_qc")
Stage 02_norm_clustering: normalization, HVG selection, dimensionality reduction, clustering, cell type annotation -> vignette("stage_norm_clustering")

Integration pipeline

This is a pipeline to integrate multiple samples processed by the single-sample pipeline. Just for clarification, an individual sample is also denoted as a batch.

More information can be found in OSCA

Stages

Stage 01_integration: reading in data and integration -> vignette("stage_integration")
Stage 02_int_clustering: post-integration clustering and cell annotation -> vignette("stage_int_clustering")

Stage `02_int_clustering`

This stage basically reproduces the clustering and cell type annotation steps in the 02_norm_clustering stage of the single-sample pipeline. The only difference is the user selection of a final integration method which will be used downstream. HVGs, reduced dimensions, and selected markers are already computed in the previous stage (01_integration).

Common stages

Some stages are common to both single-sample and integration pipelines.

Stage `cluster_markers`

A stage for calculation, visualization and reporting of cell cluster markers (“global markers”).

-> vignette("stage_cluster_markers")

Stage `contrasts`

A stage for calculation, visualization and reporting of differentially expressed markers (“contrasts”). This stage is basically the same as the cluster_markers stage, but all output is related to individual comparisons of levels of cell groupings. Hence “contrasts”, a term known from bulk RNA-seq where sample groups are compared -> they are put in contrast.

-> vignette("stage_contrasts")

Signpost

Guides:
- Using the Docker image: https://bioinfocz.github.io/scdrake/articles/scdrake_docker.html (or vignette("scdrake_docker"))
- 01 Quick start (single-sample pipeline): vignette("scdrake")
- 02 Integration pipeline guide: vignette("scdrake_integration")
- Advanced topics: vignette("scdrake_advanced")
- Extending the pipeline: vignette("scdrake_extend")
- drake basics: vignette("drake_basics")
  - Or the official drake book: https://books.ropensci.org/drake/
General information:
- Pipeline overview: vignette("pipeline_overview")
- FAQ & Howtos: vignette("scdrake_faq")
- Command line interface (CLI): vignette("scdrake_cli")
- Config files (internals): vignette("scdrake_config")
- Environment variables: vignette("scdrake_envvars")
General configs:
- Pipeline config -> vignette("config_pipeline")
- Main config -> vignette("config_main")
Pipelines and stages:
- Single-sample pipeline:
  - Stage 01_input_qc: reading in data, filtering, quality control -> vignette("stage_input_qc")
  - Stage 02_norm_clustering: normalization, HVG selection, dimensionality reduction, clustering, cell type annotation -> vignette("stage_norm_clustering")
- Integration pipeline:
  - Stage 01_integration: reading in data and integration -> vignette("stage_integration")
  - Stage 02_int_clustering: post-integration clustering and cell annotation -> vignette("stage_int_clustering")
- Common stages:
  - Stage cluster_markers -> vignette("stage_cluster_markers")
  - Stage contrasts (differential expression) -> vignette("stage_contrasts")

Document generated: 2024-09-14 14:01:15 UTC+0000

Example pipeline output

Pipelines

Single-sample pipeline

Stages

Integration pipeline

Stages

Stage 02_int_clustering

Common stages

Stage cluster_markers

Stage contrasts

Signpost

^{Document generated:
2024-09-14 14:01:15 UTC+0000}

Stage `02_int_clustering`

Stage `cluster_markers`

Stage `contrasts`