
Threshold Adjacency Matrices Based on Shuffled Network Quantiles
Source: R/cutoff_adjacency.R
cutoff_adjacency.Rd
Applies a cutoff to weighted adjacency matrices using a percentile
estimated from shuffled versions of the original expression matrices.
Supports inference methods "GENIE3", "GRNBoost2",
and "JRF".
Usage
cutoff_adjacency(
  count_matrices,
  weighted_adjm_list,
  n,
  method = c("GENIE3", "GRNBoost2", "JRF"),
  quantile_threshold = 0.99,
  weight_function = "mean",
  nCores = 1,
  grnboost_modules = NULL,
  debug = FALSE
)
Arguments
- count_matrices
A MultiAssayExperiment object containing expression data from multiple experiments or conditions.
- weighted_adjm_list
A SummarizedExperiment object containing weighted adjacency matrices (one per experiment) to threshold.
- n
Integer. Number of shuffled replicates generated per original expression matrix.
- method
Character string. One of "GENIE3", "GRNBoost2", or "JRF".
- quantile_threshold
Numeric. The quantile used to define the cutoff. Default is 0.99.
- weight_function
Character string or function used to symmetrize adjacency matrices (for example "mean" or "max"). Default is "mean".
- nCores
Integer. Number of CPU cores to use for parallelization. Default is 1. Note: JRF uses a C implementation and does not use this parameter.
- grnboost_modules
Python modules needed for GRNBoost2 if using reticulate. Default is NULL.
- debug
Logical. If TRUE, prints detailed progress messages. Default is FALSE.
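To illustrate what the weight_function argument is meant to control, the following base R sketch symmetrizes a small directed weight matrix by combining each pair W[i, j] and W[j, i]. The helper symmetrize_sketch and the toy matrix are hypothetical and written only for this illustration; the package's own symmetrize() may handle names, sparse inputs, or custom functions differently.

```r
# Hypothetical helper, not the package's symmetrize(): combine W[i, j] and
# W[j, i] with the chosen function so that the result is symmetric.
symmetrize_sketch <- function(W, weight_function = "mean") {
  stopifnot(is.matrix(W), nrow(W) == ncol(W))
  f <- match.fun(weight_function)  # accepts a function or its name
  Wt <- t(W)
  # Walk both matrices element by element (column-major order) and apply
  # f to each pair of opposite-direction weights.
  matrix(
    mapply(function(a, b) f(c(a, b)), W, Wt),
    nrow = nrow(W),
    dimnames = dimnames(W)
  )
}

# Toy directed weights: rows are regulators, columns are targets (assumed).
genes <- c("g1", "g2", "g3")
W <- matrix(
  c(0.0, 0.2, 0.6,
    0.4, 0.0, 0.1,
    0.0, 0.3, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(genes, genes)
)

S_mean <- symmetrize_sketch(W, "mean")
S_max  <- symmetrize_sketch(W, "max")
```

With "mean", a strong edge in one direction is diluted when the reverse edge is weak; with "max", the stronger direction determines the undirected weight.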
Value
A SummarizedExperiment object where each assay is a binary (thresholded) adjacency matrix corresponding to an input weighted matrix. Metadata includes cutoff values and method parameters.
Details
For each input expression matrix, n shuffled versions are
generated by randomly permuting each gene’s expression across cells.
Network inference is performed on the shuffled matrices, and a cutoff
is determined as the specified quantile (quantile_threshold) of
the resulting edge weights. The original weighted adjacency matrices
are then thresholded using these estimated cutoffs.
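The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: absolute Pearson correlation stands in for GENIE3, GRNBoost2, or JRF; null weights are pooled across the n shuffled replicates; genes are in rows and cells in columns; and an edge is kept when its weight is greater than or equal to the cutoff. The helpers score_edges and shuffle_genes are hypothetical names.

```r
set.seed(1)

# Toy expression matrix: genes in rows, cells in columns (assumed layout).
expr <- matrix(rpois(5 * 40, lambda = 5), nrow = 5,
               dimnames = list(paste0("g", 1:5), NULL))

# Stand-in edge score: absolute Pearson correlation between genes.
# The real function would run the chosen inference method here instead.
score_edges <- function(m) {
  w <- abs(cor(t(m)))
  diag(w) <- 0
  w
}

# Permute each gene's values across cells independently. This breaks
# gene-gene dependence while keeping each gene's marginal distribution.
shuffle_genes <- function(m) {
  out <- t(apply(m, 1, sample))
  dimnames(out) <- dimnames(m)
  out
}

n <- 20
quantile_threshold <- 0.95

# Pool the off-diagonal edge weights from the n shuffled replicates
# (pooling is an assumption of this sketch).
null_weights <- unlist(lapply(seq_len(n), function(i) {
  w <- score_edges(shuffle_genes(expr))
  w[upper.tri(w)]
}))

# The cutoff is the requested quantile of the null edge weights.
cutoff <- unname(quantile(null_weights, probs = quantile_threshold))

# Threshold the original weights; using >= is an assumption.
weighted <- score_edges(expr)
binary <- (weighted >= cutoff) * 1L
```

Because the cutoff comes from networks inferred on data with no real gene-gene structure, roughly a fraction 1 - quantile_threshold of null edges would exceed it, which gives the threshold an interpretation similar to an empirical false positive rate.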
Parallelization is handled via BiocParallel.
The methods are based on:
- GENIE3: Random Forest-based inference (Huynh-Thu et al., 2010).
- GRNBoost2: Gradient boosting trees using arboreto (Moerman et al., 2019).
- JRF: Joint Random Forests across multiple conditions (Petralia et al., 2015).
Examples
data(toy_counts)
# Infer networks (toy_counts is already a MultiAssayExperiment)
networks <- infer_networks(
  count_matrices_list = toy_counts,
  method = "GENIE3",
  nCores = 1
)
head(networks[[1]])
#>   regulatoryGene targetGene    weight
#> 1          HLA-B        FTL 0.2167896
#> 2           CD74      CXCR4 0.1708179
#> 3          HLA-A      HLA-B 0.1630234
#> 4            FTL       FTH1 0.1516025
#> 5          HLA-B      HLA-A 0.1439875
#> 6           FTH1        FTL 0.1377228
# Generate adjacency matrices
wadj_se <- generate_adjacency(networks)
swadj_se <- symmetrize(wadj_se, weight_function = "mean")
# Apply cutoff
binary_se <- cutoff_adjacency(
  count_matrices = toy_counts,
  weighted_adjm_list = swadj_se,
  n = 1,
  method = "GENIE3",
  quantile_threshold = 0.95,
  nCores = 1,
  debug = TRUE
)
#> [Method: GENIE3] Matrix 1 → Cutoff = 0.06502
#> [Method: GENIE3] Matrix 2 → Cutoff = 0.06652
#> [Method: GENIE3] Matrix 3 → Cutoff = 0.06565
head(binary_se[[1]])
#> [1] "ACTG1" "ARPC2" "ARPC3" "BTF3" "CD3D" "CD3E"