Threshold Adjacency Matrices Based on Shuffled Network Quantiles

Thresholds weighted adjacency matrices using a cutoff taken as a quantile of the edge weights inferred from shuffled versions of the original expression matrices. Supported inference methods are "GENIE3", "GRNBoost2", and "JRF".

Usage

cutoff_adjacency(
  count_matrices,
  weighted_adjm_list,
  n,
  method = "GENIE3",
  quantile_threshold = 0.99,
  weight_function = "mean",
  nCores = 1,
  grnboost_modules = NULL,
  debug = FALSE
)

Arguments

count_matrices: A list of expression matrices (genes × cells), Seurat objects, or SingleCellExperiment objects.
weighted_adjm_list: A list of weighted adjacency matrices (one per expression matrix) to threshold.
n: Integer. Number of shuffled replicates generated per original expression matrix.
method: Character string. One of "GENIE3", "GRNBoost2", or "JRF".
quantile_threshold: Numeric. The quantile used to define the cutoff. Default is 0.99.
weight_function: Character string or function used to symmetrize adjacency matrices ("mean", "max", etc.).
nCores: Integer. Number of CPU cores to use for parallelization. Default is 1. Note: JRF uses a C implementation and ignores this parameter.
grnboost_modules: Python modules required by GRNBoost2, as imported through reticulate. Default is NULL.
debug: Logical. If TRUE, prints detailed progress messages. Default is FALSE.
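
As an illustration of what the weight_function argument controls, the sketch below combines each pair of directed weights w[i, j] and w[j, i] into a single undirected weight. The helper symmetrize_sketch() is hypothetical and written only for this example; it is not the package's symmetrize() and may differ from it in details such as the handling of the diagonal.

```r
# Hypothetical helper for illustration only (not the package's symmetrize()).
symmetrize_sketch <- function(w, weight_function = "mean") {
  f <- match.fun(weight_function)  # accepts a function or its name
  # Pair w[i, j] with w[j, i] element by element and combine with f.
  combined <- mapply(function(a, b) f(c(a, b)), w, t(w))
  matrix(combined, nrow = nrow(w), dimnames = dimnames(w))
}

# Toy directed weights: A -> B is 0.75, B -> A is 0.25.
w <- matrix(c(0, 0.25, 0.75, 0), nrow = 2,
            dimnames = list(c("A", "B"), c("A", "B")))

sym_mean <- symmetrize_sketch(w, "mean")  # off-diagonal entries become 0.5
sym_max  <- symmetrize_sketch(w, max)     # off-diagonal entries become 0.75
```

With "mean", an edge supported in only one direction is down-weighted; with "max", the stronger direction decides the undirected weight.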

Value

A list of binary (thresholded) adjacency matrices, each corresponding to an input weighted matrix.

Details

For each input expression matrix, n shuffled versions are generated by randomly permuting each gene’s expression across cells. Network inference is performed on the shuffled matrices, and a cutoff is determined as the specified quantile (quantile_threshold) of the resulting edge weights. The original weighted adjacency matrices are then thresholded using these estimated cutoffs.
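
The procedure above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: toy_infer() is a hypothetical stand-in that uses absolute correlation in place of GENIE3, GRNBoost2, or JRF; the null edge weights from all n shuffles are pooled before the quantile is taken; and an edge is kept when its weight is strictly greater than the cutoff.

```r
set.seed(1)

# Toy expression matrix: 6 genes x 50 cells (hypothetical data).
expr <- matrix(rpois(6 * 50, lambda = 5), nrow = 6,
               dimnames = list(paste0("g", 1:6), NULL))

# Stand-in for network inference (assumption): absolute gene-gene correlation.
toy_infer <- function(m) {
  w <- abs(cor(t(m)))
  diag(w) <- 0
  w
}

# Permute each gene's expression values across cells, one gene at a time.
shuffle_genes <- function(m) {
  out <- t(apply(m, 1, sample))
  dimnames(out) <- dimnames(m)
  out
}

# Infer a network on each of n shuffled matrices and pool the edge weights.
n <- 3
null_weights <- unlist(lapply(seq_len(n), function(i) {
  w <- toy_infer(shuffle_genes(expr))
  w[upper.tri(w)]
}))

# The cutoff is the chosen quantile of the null edge weights.
cutoff <- unname(quantile(null_weights, probs = 0.99))

# Threshold the observed weights into a binary adjacency matrix.
observed <- toy_infer(expr)
binary <- (observed > cutoff) * 1L
```

Shuffling each gene independently preserves every gene's marginal distribution while breaking gene-gene dependence, so the weights inferred from shuffled data approximate what the inference method reports when no real regulation is present.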

Parallelization is handled via BiocParallel.
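
As a rough picture of how shuffled replicates could be distributed with BiocParallel, the sketch below maps a hypothetical run_replicates() helper over a list of toy matrices using bplapply(). Which unit of work the package actually parallelizes is not stated here, so the split shown (one task per expression matrix) is an assumption. The sketch falls back to lapply() when BiocParallel is not installed.

```r
set.seed(42)

# Two toy expression matrices, 4 genes x 10 cells each (hypothetical data).
expr_list <- list(a = matrix(rnorm(40), nrow = 4),
                  b = matrix(rnorm(40), nrow = 4))

# Hypothetical helper: build n gene-wise shuffled copies of one matrix.
# The shuffling function is defined inside so the task is self-contained
# when it is shipped to a worker.
run_replicates <- function(m, n) {
  shuffle_genes <- function(x) t(apply(x, 1, sample))
  lapply(seq_len(n), function(i) shuffle_genes(m))
}

if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # SerialParam() runs in the current process; a multi-worker backend such
  # as BiocParallel::MulticoreParam(workers = 2) could be substituted.
  param <- BiocParallel::SerialParam()
  shuffled <- BiocParallel::bplapply(expr_list, run_replicates, n = 2,
                                     BPPARAM = param)
} else {
  shuffled <- lapply(expr_list, run_replicates, n = 2)
}
```

Each element of shuffled holds the n replicates for one input matrix, which would then be passed to the chosen inference method.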

The methods are based on:

GENIE3: Random Forest-based inference (Huynh-Thu et al., 2010).
GRNBoost2: Gradient boosting trees using arboreto (Moerman et al., 2019).
JRF: Joint Random Forests across multiple conditions (Petralia et al., 2015).

Examples

data(count_matrices)

networks <- infer_networks(
    count_matrices_list = count_matrices,
    method = "GENIE3",
    nCores = 1
)
head(networks[[1]])
#>   regulatoryGene targetGene    weight
#> 1          ARPC2      ARPC3 0.2108356
#> 2          HLA-A       CD74 0.1884532
#> 3          HLA-E        FOS 0.1632567
#> 4          ARPC3      ARPC2 0.1600790
#> 5           CD3E       CD3D 0.1556287
#> 6          ARPC2      HLA-E 0.1488873

wadj_list <- generate_adjacency(networks)
swadj_list <- symmetrize(wadj_list, weight_function = "mean")

binary_list <- cutoff_adjacency(
    count_matrices = count_matrices,
    weighted_adjm_list = swadj_list,
    n = 2,
    method = "GENIE3",
    quantile_threshold = 0.99,
    nCores = 1,
    debug = TRUE
)
#> [Method: GENIE3] Matrix 1 → Cutoff = 0.09270
#> [Method: GENIE3] Matrix 2 → Cutoff = 0.10499
#> [Method: GENIE3] Matrix 3 → Cutoff = 0.09768
head(binary_list[[1]])
#>       ACTG1 ARPC2 ARPC3 BTF3 CD3D CD3E CD74 CFL1 COX4I1 COX7C CXCR4 EEF1A1
#> ACTG1     0     0     0    0    0    0    0    1      0     0     0      0
#> ARPC2     0     0     1    0    0    0    0    0      0     0     0      0
#> ARPC3     0     1     0    0    0    0    0    0      0     0     0      0
#> BTF3      0     0     0    0    0    0    0    0      0     0     0      0
#> CD3D      0     0     0    0    0    1    0    0      0     0     0      0
#> CD3E      0     0     0    0    1    0    0    0      0     0     0      0
#>       EEF1D EEF2 EIF1 EIF3K EIF4A2 FOS FTH1 FTL GNB2L1 HLA-A HLA-B HLA-C HLA-E
#> ACTG1     0    0    0     0      0   0    0   0      0     0     0     0     0
#> ARPC2     0    0    0     0      0   0    0   0      0     0     0     0     1
#> ARPC3     0    0    0     0      0   0    0   0      0     0     0     0     0
#> BTF3      0    0    0     0      0   0    0   0      0     0     0     0     0
#> CD3D      0    0    0     0      0   0    0   0      0     0     0     0     0
#> CD3E      0    0    0     0      0   0    0   0      0     0     0     0     0
#>       JUN JUNB MYL12B MYL6 NACA PABPC1 PFN1 TMSB4X UBA52 UBC
#> ACTG1   0    0      0    0    0      0    0      0     0   0
#> ARPC2   0    0      0    0    0      0    0      0     0   0
#> ARPC3   0    0      0    0    0      0    0      0     0   0
#> BTF3    0    0      0    0    0      0    0      0     0   0
#> CD3D    0    0      0    0    0      0    0      0     0   0
#> CD3E    0    0      0    0    0      0    0      0     0   0