Contrasts stage (`contrasts`)
Document generated: 2024-09-14 14:01:55 UTC+0000
Source: vignettes/stage_contrasts.Rmd
A stage for the calculation, visualization, and reporting of
differentially expressed markers (“contrasts”). This stage is essentially
the same as the cluster_markers stage, but all output relates to
individual comparisons between levels of cell groupings. Hence
“contrasts”, a term known from bulk RNA-seq, where sample groups are
compared, that is, put in contrast.
⚙️ Config files: config/single_sample/contrasts.yaml, config/integration/contrasts.yaml
📋 HTML report target (in config/pipeline.yaml):
DRAKE_TARGETS: ["report_contrasts"]
The contrasts config is stored in the config/single_sample/contrasts.yaml
and config/integration/contrasts.yaml files (the location of this file
differs between the single-sample and integration pipelines).
As with the pipeline config, the directory containing this file is read
from environment variables:
- SCDRAKE_SINGLE_SAMPLE_CONFIG_DIR for the single-sample pipeline.
- SCDRAKE_INTEGRATION_CONFIG_DIR for the integration pipeline.
Options named in lowercase are set when scdrake is loaded or attached.
The directory actually used then depends on whether you run
run_single_sample_r() or run_integration_r().
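The lookup described above can be sketched in base R. This is only an illustration: contrasts_config_path() is a hypothetical helper, not part of scdrake, and the package resolves the directory internally when a pipeline is run.

```r
# Hypothetical helper (not part of scdrake): build the path to contrasts.yaml
# from the environment variable that belongs to the chosen pipeline.
contrasts_config_path <- function(pipeline = c("single_sample", "integration")) {
  pipeline <- match.arg(pipeline)
  env_var <- switch(
    pipeline,
    single_sample = "SCDRAKE_SINGLE_SAMPLE_CONFIG_DIR",
    integration = "SCDRAKE_INTEGRATION_CONFIG_DIR"
  )
  # unset = NA lets us tell a missing variable apart from an empty one.
  config_dir <- Sys.getenv(env_var, unset = NA)
  if (is.na(config_dir)) {
    stop("Environment variable ", env_var, " is not set.")
  }
  file.path(config_dir, "contrasts.yaml")
}

# Example values only; use the directories of your own project.
Sys.setenv(SCDRAKE_SINGLE_SAMPLE_CONFIG_DIR = "config/single_sample")
Sys.setenv(SCDRAKE_INTEGRATION_CONFIG_DIR = "config/integration")
single_sample_path <- contrasts_config_path("single_sample")
integration_path <- contrasts_config_path("integration")
print(single_sample_path)
print(integration_path)
```

A quick check like this can help confirm which contrasts.yaml will be picked up before starting a pipeline run.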
Parameters for this stage are almost identical to those for the
cluster_markers stage (see vignette("stage_cluster_markers")). You just
need to replace CLUSTER_MARKERS with CONTRASTS in parameter names 🙂
The parameters that differ are described below.
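As a small illustration of the renaming rule, the prefix swap can be written in base R. Only the two parameter names mentioned in this vignette are used, and the variable names are made up for the example; always check the result against the actual config, since the parameter sets are almost, but not exactly, identical.

```r
# Parameter names of the cluster_markers stage that are mentioned in this vignette.
cluster_markers_names <- c(
  "CLUSTER_MARKERS_SOURCES",
  "CLUSTER_MARKERS_SOURCES_DEFAULTS"
)

# Replace the leading CLUSTER_MARKERS prefix with CONTRASTS.
contrasts_names <- sub("^CLUSTER_MARKERS", "CONTRASTS", cluster_markers_names)
print(contrasts_names)
```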
CONTRASTS_SOURCES:
  - dea_cluster_int_graph_louvain_r0.4:
      source_column: "cluster_int_graph_louvain_r0.4"
      description: "DEA of all groups in graph-based clustering (Louvain alg. with r = 0.4) with blocking on batch"
      contrasts: "all"
    dea_cluster_int_graph_louvain_r0.8:
      source_column: "cluster_int_graph_louvain_r0.8"
      description: "DEA of some groups in graph-based clustering (Louvain alg. with r = 0.8)"
      contrasts:
        - target: "2"
          reference: "1"
        - target: "3"
          reference: "2"
          name: "cl3_vs_cl4"
Type: list of named lists
This parameter is similar to CLUSTER_MARKERS_SOURCES, but takes an
additional parameter, contrasts, which specifies the levels of
source_column that will be compared (tested for differential expression).
- If "all", all combinations of levels in source_column will be compared. For example, if source_column is "cluster_kmeans_k3" (k-means with k = 3), it contains three levels, and the following comparisons (contrasts) will be made: 1_vs_2, 1_vs_3, and 2_vs_3. Note that this can lead to an excessive number of comparisons for graph-based clustering.
- If a list of lists, the following character scalar items can/must be set:
  - target (required): target level of source_column.
  - reference (required): reference level of source_column.
  - name (optional): unique contrast name. If not specified, it is created as {target}_vs_{reference}.
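To make the "all" case concrete, the enumeration of pairwise contrasts can be sketched in base R. all_contrasts() is a hypothetical helper, not an scdrake function. That the first level of each pair acts as the target is an assumption inferred from the 1_vs_2 example and the {target}_vs_{reference} naming pattern.

```r
# Hypothetical helper (not part of scdrake): enumerate all pairwise contrasts
# between the levels of a cell grouping.
all_contrasts <- function(levels) {
  pairs <- utils::combn(levels, 2)  # one column per unordered pair of levels
  data.frame(
    target = pairs[1, ],     # assumption: the first level of a pair is the target
    reference = pairs[2, ],
    name = paste0(pairs[1, ], "_vs_", pairs[2, ]),  # default {target}_vs_{reference}
    stringsAsFactors = FALSE
  )
}

# Three levels, as in the "cluster_kmeans_k3" example.
k3 <- all_contrasts(c("1", "2", "3"))
print(k3)

# The number of comparisons grows quadratically with the number of levels:
# a clustering with 20 clusters already yields choose(20, 2) = 190 contrasts.
n_contrasts <- choose(20, 2)
print(n_contrasts)
```

The quadratic growth is the reason why listing selected target and reference pairs explicitly may be preferable for graph-based clusterings with many clusters.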
CONTRASTS_SOURCES_DEFAULTS
Same as CLUSTER_MARKERS_SOURCES_DEFAULTS, but the following parameters
for statistical tests cannot be used, and hardcoded values are used for
them internally:
- LFC_DIRECTION ("any")
- PVAL_TYPE ("any")
- MIN_PROP (null)
Here you can find descriptions of the most important targets for this
stage. For a full overview, you have to inspect the source code of the
get_contrasts_subplan() function.
As with the config for this stage, target names mirror those in the
cluster_markers stage, with cluster_markers replaced by contrasts (with
a few exceptions, see below). For details, refer to
vignette("stage_cluster_markers").
Tibbles with parameters
contrasts_params, contrasts_test_params, contrasts_heatmap_params, contrasts_plot_params, contrasts_dimred_plot_params
Tibbles with cluster markers (contrasts) test results
contrasts_raw: a tibble, same as cluster_markers_raw. Used to extract results for contrasts of interest.
contrasts: a tibble with extracted results (in the markers column) for contrasts of interest. Contrasts are identified by the columns target, reference, or contrast_name, and their cell grouping of origin by source_column. The name column comes from the list names in CONTRASTS_SOURCES.
contrasts_out: same as contrasts, but markers are coerced to dataframes and column names are normalized to snake_case.
contrasts_for_tables: same as contrasts_out, but the markers dataframes are prepared for HTML output. See cluster_markers_for_tables in vignette("stage_cluster_markers").