Pipeline overview
Document generated: 2024-09-14 14:01:15 UTC+0000
Source:vignettes/pipeline_overview.Rmd
pipeline_overview.Rmd
scdrake offers two pipelines - one for single-sample, and second one for integration of multiple samples (which were processed by the single-sample pipeline before). As for now, each pipeline consists of two subpipelines (referred to as stages), and two stages common to both single-sample and integration pipelines.
A more detailed diagram with target structure can be found here.
Each stage has its own config, plus there is a main config for each
pipeline. You can read more about configs in a separate
vignette("scdrake_config")
. Each stage also outputs a
report in HTML format with rich graphics.
Advanced users might be interested in looking into source code of
scdrake’s plans (files
named plans_*.R
).
Pipeline steps are mostly based on recommendations given in a great book Orchestrating Single-Cell Analysis with Bioconductor.
Example pipeline output
You can inspect output from the pipeline here.
The used datasets are:
All credits for these datasets go to 10x Genomics. Visit https://www.10xgenomics.com/resources/datasets for more information.
Pipelines
Single-sample pipeline
This is a pipeline for processing a single-sample.
Stages
- Stage
01_input_qc
: reading in data, filtering, quality control ->vignette("stage_input_qc")
- Stage
02_norm_clustering
: normalization, HVG selection, dimensionality reduction, clustering, cell type annotation ->vignette("stage_norm_clustering")
Integration pipeline
This is a pipeline to integrate multiple samples processed by the single-sample pipeline. Just for clarification, an individual sample is also denoted as a batch.
More information can be found in OSCA
Stages
- Stage
01_integration
: reading in data and integration ->vignette("stage_integration")
- Stage
02_int_clustering
: post-integration clustering and cell annotation ->vignette("stage_int_clustering")
Stage 02_int_clustering
This stage basically reproduces the clustering and cell type
annotation steps in the 02_norm_clustering
stage of the
single-sample pipeline. The only difference is the user
selection of a final integration method which will be used
downstream. HVGs, reduced dimensions, and selected markers are
already computed in the previous stage
(01_integration
).
Common stages
Some stages are common to both single-sample and integration pipelines.
Stage cluster_markers
A stage for calculation, visualization and reporting of cell cluster markers (“global markers”).
Stage contrasts
A stage for calculation, visualization and reporting of
differentially expressed markers (“contrasts”). This stage is basically
the same as the cluster_markers
stage, but all output is
related to individual comparisons of levels of cell groupings. Hence
“contrasts”, a term known from bulk RNA-seq where sample groups are
compared -> they are put in contrast.
-> vignette("stage_contrasts")
Signpost
- Guides:
- Using the Docker image: https://bioinfocz.github.io/scdrake/articles/scdrake_docker.html
(or
vignette("scdrake_docker")
) - 01 Quick start (single-sample pipeline):
vignette("scdrake")
- 02 Integration pipeline guide:
vignette("scdrake_integration")
- Advanced topics:
vignette("scdrake_advanced")
- Extending the pipeline:
vignette("scdrake_extend")
-
drake basics:
vignette("drake_basics")
- Or the official drake book: https://books.ropensci.org/drake/
- Using the Docker image: https://bioinfocz.github.io/scdrake/articles/scdrake_docker.html
(or
- General information:
- Pipeline overview:
vignette("pipeline_overview")
- FAQ & Howtos:
vignette("scdrake_faq")
- Command line interface (CLI):
vignette("scdrake_cli")
- Config files (internals):
vignette("scdrake_config")
- Environment variables:
vignette("scdrake_envvars")
- Pipeline overview:
- General configs:
- Pipeline config ->
vignette("config_pipeline")
- Main config ->
vignette("config_main")
- Pipeline config ->
- Pipelines and stages:
- Single-sample pipeline:
- Stage
01_input_qc
: reading in data, filtering, quality control ->vignette("stage_input_qc")
- Stage
02_norm_clustering
: normalization, HVG selection, dimensionality reduction, clustering, cell type annotation ->vignette("stage_norm_clustering")
- Stage
- Integration pipeline:
- Stage
01_integration
: reading in data and integration ->vignette("stage_integration")
- Stage
02_int_clustering
: post-integration clustering and cell annotation ->vignette("stage_int_clustering")
- Stage
- Common stages:
- Stage
cluster_markers
->vignette("stage_cluster_markers")
- Stage
contrasts
(differential expression) ->vignette("stage_contrasts")
- Stage
- Single-sample pipeline: