Applies a cutoff to weighted adjacency matrices using a percentile estimated from shuffled versions of the original expression matrices. Supports inference methods "GENIE3", "GRNBoost2", and "JRF".

Usage

cutoff_adjacency(
  count_matrices,
  weighted_adjm_list,
  n,
  method = c("GENIE3", "GRNBoost2", "JRF"),
  quantile_threshold = 0.99,
  weight_function = "mean",
  nCores = 1,
  grnboost_modules = NULL,
  debug = FALSE
)

Arguments

count_matrices

A MultiAssayExperiment object containing expression data from multiple experiments or conditions.

weighted_adjm_list

A SummarizedExperiment object containing weighted adjacency matrices (one per experiment) to threshold.

n

Integer. Number of shuffled replicates generated per original expression matrix.

method

Character string. One of "GENIE3", "GRNBoost2", or "JRF".

quantile_threshold

Numeric. The quantile used to define the cutoff. Default is 0.99.

weight_function

Character string or function used to symmetrize adjacency matrices ("mean", "max", etc.).

nCores

Integer. Number of CPU cores to use for parallelization. Default is 1, as shown in Usage. Note: JRF uses a C implementation and ignores this parameter.

grnboost_modules

Python modules required by GRNBoost2, as imported through reticulate. Only relevant when method is "GRNBoost2". Default is NULL.

debug

Logical. If TRUE, prints detailed progress messages. Default is FALSE.
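
To illustrate what the weight_function argument refers to, the following base R sketch combines each pair of directed weights (i, j) and (j, i) into one undirected weight. It is not the package's implementation: symmetrize_sketch and the toy matrix w are hypothetical names introduced here, and the assumption is that "mean" or "max" is applied to the two directed weights of each gene pair.

```r
# Hypothetical directed weight matrix for three genes (illustration only)
genes <- c("A", "B", "C")
w <- matrix(c(0.0, 0.8, 0.1,
              0.2, 0.0, 0.5,
              0.3, 0.1, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(genes, genes))

# Assumed behaviour: apply weight_function to the pair (w[i, j], w[j, i])
symmetrize_sketch <- function(m, weight_function = "mean") {
  f <- match.fun(weight_function)  # accepts a function or its name
  out <- m
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      out[i, j] <- f(c(m[i, j], m[j, i]))
    }
  }
  out
}

sym_mean <- symmetrize_sketch(w, "mean")
sym_max  <- symmetrize_sketch(w, max)
```

With "mean", the A-B weight becomes the average of 0.8 and 0.2; with max, the larger of the two directed weights is kept.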

Value

A SummarizedExperiment object where each assay is a binary (thresholded) adjacency matrix corresponding to an input weighted matrix. Metadata includes cutoff values and method parameters.

Details

For each input expression matrix, n shuffled versions are generated by randomly permuting each gene’s expression across cells. Network inference is performed on the shuffled matrices, and a cutoff is determined as the specified quantile (quantile_threshold) of the resulting edge weights. The original weighted adjacency matrices are then thresholded using these estimated cutoffs.
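
The procedure above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's code: absolute Pearson correlation stands in for GENIE3, GRNBoost2, or JRF, the expression data are simulated, the helper names toy_weights and shuffle_rows are hypothetical, and the strict comparison (weight > cutoff) is an assumption about how ties are handled.

```r
set.seed(42)

# Simulated expression matrix: 5 genes (rows) x 50 cells (columns)
expr <- matrix(rpois(5 * 50, lambda = 5), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))

# Stand-in for network inference (assumption): absolute correlation
# between genes, with self-edges set to zero
toy_weights <- function(m) {
  w <- abs(cor(t(m)))
  diag(w) <- 0
  w
}

# Permute each gene's expression across cells, independently per gene
shuffle_rows <- function(m) {
  out <- t(apply(m, 1, sample))
  dimnames(out) <- dimnames(m)
  out
}

# Infer a network on each of n shuffled matrices and pool the edge weights
n <- 10
null_weights <- unlist(lapply(seq_len(n), function(i) {
  w <- toy_weights(shuffle_rows(expr))
  w[upper.tri(w)]
}))

# Cutoff = chosen quantile of the null (shuffled) edge weights
quantile_threshold <- 0.99
cutoff <- quantile(null_weights, probs = quantile_threshold, names = FALSE)

# Threshold the original weighted adjacency matrix into a binary one
binary <- (toy_weights(expr) > cutoff) * 1L
```

Because shuffling breaks gene-gene dependencies while keeping each gene's marginal distribution, the pooled weights approximate what the inference method reports when no true edges exist; only original edges exceeding that level are retained.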

Parallelization is handled via BiocParallel.

The methods are based on:

  • GENIE3: Random Forest-based inference (Huynh-Thu et al., 2010).

  • GRNBoost2: Gradient boosting trees using arboreto (Moerman et al., 2019).

  • JRF: Joint Random Forests across multiple conditions (Petralia et al., 2015).

Examples

data(toy_counts)


# Infer networks (toy_counts is already a MultiAssayExperiment)
networks <- infer_networks(
    count_matrices_list = toy_counts,
    method = "GENIE3",
    nCores = 1
)
head(networks[[1]])
#>   regulatoryGene targetGene    weight
#> 1          HLA-B        FTL 0.2167896
#> 2           CD74      CXCR4 0.1708179
#> 3          HLA-A      HLA-B 0.1630234
#> 4            FTL       FTH1 0.1516025
#> 5          HLA-B      HLA-A 0.1439875
#> 6           FTH1        FTL 0.1377228

# Generate adjacency matrices
wadj_se <- generate_adjacency(networks)
swadj_se <- symmetrize(wadj_se, weight_function = "mean")

# Apply cutoff
binary_se <- cutoff_adjacency(
    count_matrices = toy_counts,
    weighted_adjm_list = swadj_se,
    n = 1,
    method = "GENIE3",
    quantile_threshold = 0.95,
    nCores = 1,
    debug = TRUE
)
#> [Method: GENIE3] Matrix 1 → Cutoff = 0.06502
#> [Method: GENIE3] Matrix 2 → Cutoff = 0.06652
#> [Method: GENIE3] Matrix 3 → Cutoff = 0.06565
head(binary_se[[1]])
#> [1] "ACTG1" "ARPC2" "ARPC3" "BTF3"  "CD3D"  "CD3E"