CellChat — Cell-Cell Communication Analysis
Overview
CellChat is an R package that infers and visualizes intercellular signaling networks from single-cell RNA-seq data. Starting from a normalized expression matrix and cluster labels, CellChat identifies ligand-receptor interactions supported by CellChatDB — a manually curated database of over 2,000 validated ligand-receptor pairs in human and mouse. Communication probability is modeled using the law of mass action, combining expression levels of ligands, receptors, and cofactors. CellChat aggregates pair-level probabilities into pathway-level signaling networks and quantifies each cell group's role as a signal sender, receiver, mediator, or influencer. The result is a rich, interpretable picture of which cell types talk to which, through which signaling pathways, and how these patterns change between conditions.
When to Use
- Characterizing which cell types are the dominant senders or receivers of paracrine and autocrine signals in a tissue atlas or disease sample
- Identifying specific ligand-receptor pairs mediating communication between a cell population of interest (e.g., tumor cells → T cells, fibroblasts → epithelial cells)
- Comparing intercellular signaling networks between two conditions (e.g., healthy vs. diseased, treatment vs. control) to find rewired or lost communication
- Discovering pathway-level signaling programs (e.g., MHC-II, COLLAGEN, VEGF) enriched in a particular cell-cell interaction
- Prioritizing targets for perturbation experiments by ranking signaling pathways by their aggregate communication strength or network centrality
- Use liana (Python/R) instead when you want a pure-Python workflow or a consensus ranking across multiple ligand-receptor databases (CellChat, CellPhoneDB, Connectome, NicheNet)
- Use NicheNet (R) instead when you need ligand-to-target gene regulatory inference — predicting which ligands from sender cells regulate which target genes in receiver cells
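As a rough illustration of the pathway-ranking use case above, the base-R sketch below sums a toy sender × receiver × pathway probability array to rank pathways and to find the dominant sender and receiver. The array, its values, and the variable names are hypothetical and chosen only for illustration; this is not CellChat's own ranking code.

```r
# Toy communication-probability array: sender x receiver x pathway.
# All values are invented for illustration.
groups   <- c("Fibroblast", "Tumor", "T_cell")
pathways <- c("COLLAGEN", "VEGF", "MHC-II")
prob <- array(0, dim = c(3, 3, 3),
              dimnames = list(groups, groups, pathways))
prob["Fibroblast", "Tumor",      "COLLAGEN"] <- 0.30
prob["Fibroblast", "T_cell",     "COLLAGEN"] <- 0.10
prob["Tumor",      "Fibroblast", "VEGF"]     <- 0.15
prob["Tumor",      "T_cell",     "VEGF"]     <- 0.05
prob["T_cell",     "Tumor",      "MHC-II"]   <- 0.02

# Aggregate strength per pathway: sum over all sender-receiver pairs
pathway_strength <- apply(prob, 3, sum)
ranked <- sort(pathway_strength, decreasing = TRUE)
print(ranked)

# Outgoing (sender) and incoming (receiver) strength per cell group
outgoing <- apply(prob, 1, sum)
incoming <- apply(prob, 2, sum)
print(outgoing)
print(incoming)
```

Summed probability is only one possible ranking criterion; network centrality scores would need the full signaling graph rather than these marginal sums.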
Prerequisites
- R packages: CellChat (>= 2.0), Seurat (>= 4.0, for Seurat-based input), NMF, ggplot2, ggalluvial, igraph, dplyr, patchwork, reticulate (optional)
- Data requirements: Normalized scRNA-seq expression matrix (genes × cells) and a cell group identity vector (cluster labels or cell types). Raw counts are acceptable if they are normalized first (CellChat provides normalizeData() for this).
- Species: CellChatDB available for human and mouse; other species require custom database construction
- Memory: 8 GB RAM minimum for datasets with 10,000–50,000 cells; 32 GB+ recommended for larger datasets
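The data requirements above can be sanity-checked before any CellChat call. The sketch below uses only base R on a toy matrix: it applies library-size normalization followed by log1p (a common scRNA-seq convention, assumed here rather than taken from CellChat's source) and verifies that there is exactly one label per cell, in matching order. The objects `counts`, `labels`, and `scale_factor` are hypothetical.

```r
# Toy raw count matrix: 4 genes x 6 cells (values invented for illustration)
counts <- matrix(1:24, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))
labels <- factor(c("T_cell", "T_cell", "B_cell", "B_cell", "Tumor", "Tumor"))
names(labels) <- colnames(counts)

# Library-size normalization followed by log1p
scale_factor <- 10000
lib_size  <- colSums(counts)
data_norm <- log1p(t(t(counts) / lib_size) * scale_factor)

# Checks: one label per cell, same order, no missing or non-finite values
stopifnot(ncol(data_norm) == length(labels))
stopifnot(identical(colnames(data_norm), names(labels)))
stopifnot(!anyNA(labels), all(is.finite(data_norm)))

# Metadata in the shape used throughout this guide: one row per cell
meta <- data.frame(labels = labels, row.names = names(labels))
print(table(meta$labels))
```

A cell with zero total counts would produce non-finite values here, so such cells should be filtered out before normalization.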
# Install CellChat from GitHub (the package is not distributed on CRAN)
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("BiocNeighbors", "ComplexHeatmap"))
install.packages("devtools")
devtools::install_github("jinworks/CellChat")
# Core dependencies
install.packages(c("NMF", "ggplot2", "ggalluvial", "igraph",
                   "dplyr", "patchwork", "circlize", "RColorBrewer"))
Quick Start
library(CellChat)
library(Seurat)
# Assume `seurat_obj` is a processed Seurat object with cell type identities in Idents()
# Log-normalized expression; in Seurat v5 use layer = "data" instead of slot = "data"
data.input <- GetAssayData(seurat_obj, assay = "RNA", slot = "data")
meta <- data.frame(labels = Idents(seurat_obj), row.names = names(Idents(seurat_obj)))
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels")
cellchat@DB <- CellChatDB.human # or CellChatDB.mouse
cellchat <- subsetData(cellchat)
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)
cellchat <- computeCommunProb(cellchat, type = "triMean")
cellchat <- filterCommunication(cellchat, min.cells = 10)
cellchat <- computeCommunProbPathway(cellchat)
cellchat <- aggregateNet(cellchat)
# Quick summary
print(cellchat)
# e.g. "An object of class CellChat created from a single dataset
# with 8 cell groups and 312 inferred ligand-receptor pairs"
Workflow
Step 1: Create CellChat Object
Build a CellChat object from either a Seurat object or a normalized expression matrix with accompanying metadata.
library(CellChat)
library(Seurat)
# --- Option A: from a Seurat object ---
# seurat_obj must have cell type identities set with Idents() or in meta.data
data.input <- GetAssayData(seurat_obj, assay = "RNA", slot = "data") # log-normalized
meta <- data.frame(
  labels = Idents(seurat_obj),
  row.names = colnames(seurat_obj)
)
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels")
# --- Option B: from a normalized matrix directly ---
# data.input: genes-by-cells normalized matrix (dgCMatrix or dense matrix)
# identity: factor of cell group labels, ordered like colnames(data.input)
# Row names of meta must match the cell names in data.input
meta <- data.frame(labels = identity, row.names = colnames(data.input))
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels")
cat("Cell groups:", levels(cellchat@idents), "\n")
cat("Number of cells:", ncol(data.input), "\n")
# Cell groups: B_cell Endothelial Fibroblast Macrophage NK T_cell Tumor
# Number of cells: 12847
Step 2: Set CellChatDB and Subset Interactions
Load the species-appropriate ligand-receptor database and optionally subset to a signaling category of interest.
# Load database for the appropriate species
CellChatDB <- CellChatDB.human # use CellChatDB.mouse for mouse data
# Inspect available signaling categories
unique(CellChatDB$interaction$annotation)
# [1] "Secreted Signaling" "ECM-Receptor" "Cell-Cell Contact"
# (CellChatDB v2 additionally lists "Non-protein Signaling")
# Option 1: Use all interactions (recommended for discovery)
cellchat@DB <- CellChatDB
# Option 2 (run instead of Option 1): keep secreted ligand-receptor pairs only (reduces noise)
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling",
                           key = "annotation")
cellchat@DB <- CellChatDB.use
# Subset the CellChat data slots to only genes in the database
cellchat <- subsetData(cellchat)
cat("Genes retained after database subset:", nrow(cellchat@data.signaling), "\n")
# Genes retained after database subset: 1842
Step 3: Identify Over-Expressed Genes and Interactions
For each cell group, identify ligands and receptors that are significantly over-expressed compared to other groups.
# Identify over-expressed genes per cell group (Seurat-style Wilcoxon rank-sum test)
cellchat <- identifyOverExpressedGenes(cellchat)
# Map over-expressed genes to ligand-receptor pairs in CellChatDB
cellchat <- identifyOverExpressedInteractions(cellchat)
# Inspect the candidate ligand-receptor pairs retained for inference.
# Note: subsetCommunication() becomes usable only after computeCommunProb() (Step 4),
# because source, target, and prob do not exist yet at this point.
lr.sig <- cellchat@LR$LRsig
cat("Candidate ligand-receptor pairs:", nrow(lr.sig), "\n")
head(lr.sig[, c("interaction_name", "pathway_name", "ligand", "receptor")], 5)
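To make the over-expression test in this step concrete, here is a minimal base-R sketch on simulated data. It runs a Wilcoxon rank-sum test for one hypothetical ligand gene in one cell group against all other cells, then applies simple filters on the fraction of expressing cells and the log fold change. The thresholds and the one-sided alternative are illustrative assumptions, not CellChat's exact settings or implementation.

```r
set.seed(42)
# Simulated log-normalized expression of one hypothetical ligand gene
expr_group  <- rnorm(50,  mean = 2.0, sd = 0.5)  # cells in the group of interest
expr_others <- rnorm(150, mean = 1.0, sd = 0.5)  # all remaining cells

# One-sided Wilcoxon rank-sum test: is expression higher in the group?
wt <- wilcox.test(expr_group, expr_others, alternative = "greater")

# Additional filters: fraction of expressing cells and log fold change
pct_expressing <- mean(expr_group > 0)
log_fc <- mean(expr_group) - mean(expr_others)

# Illustrative thresholds (assumed for this sketch)
is_over_expressed <- wt$p.value < 0.05 && pct_expressing > 0 && log_fc > 0
cat("Over-expressed in group:", is_over_expressed, "\n")
```

Repeating such a test for every database gene in every cell group yields the per-group gene sets from which candidate ligand-receptor pairs are assembled.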
Step 4: Infer Cell-Cell Communication Probabilities
Compute communication probability for each ligand-receptor pair between every ordered pair of cell groups using the law of mass action. CellChat accounts for multi-subunit complexes and co-stimulatory/co-inhibitory cofactors.
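The core of the calculation can be sketched in a few lines of base R. The example below summarizes ligand and receptor expression per group with Tukey's trimean and passes their product through a Hill-type saturation function, which is the mass-action idea in its simplest form. The function names, the toy expression vectors, and the half-saturation constant `Kh = 0.5` are assumptions made for this sketch; it deliberately omits multi-subunit complexes, cofactors, and population-size weighting.

```r
# Tukey's trimean: (Q1 + 2 * median + Q3) / 4
trimean <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  (q[1] + 2 * q[2] + q[3]) / 4
}

# Hill-type saturation of the ligand-receptor product.
# Kh is an assumed half-saturation constant for this sketch.
hill_prob <- function(ligand, receptor, Kh = 0.5) {
  lr <- ligand * receptor
  lr / (Kh + lr)
}

# Toy expression of one ligand in a sender group and one receptor in a receiver group
ligand_sender     <- c(0, 0, 1, 2, 2, 3, 4, 4)
receptor_receiver <- c(0, 1, 1, 1, 2, 2, 2, 3)

L <- trimean(ligand_sender)      # group-level ligand expression
R <- trimean(receptor_receiver)  # group-level receptor expression
p <- hill_prob(L, R)
cat("L =", L, " R =", R, " p =", round(p, 3), "\n")

# A ligand detected in fewer than 25% of cells has a trimean of zero,
# so the resulting probability is zero even though its mean is positive
sparse_ligand <- c(0, 0, 0, 0, 0, 0, 0, 5)
p_sparse <- hill_prob(trimean(sparse_ligand), R)
cat("mean =", mean(sparse_ligand), " trimean-based p =", p_sparse, "\n")
```

The sparse-ligand case shows why the choice of averaging method matters: the trimean suppresses genes expressed in only a small fraction of a group, while a plain or lightly trimmed mean would retain them.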
# Compute pairwise communication probability
# type = "triMean": Tukey's trimean, (Q1 + 2 × median + Q3) / 4; a robust average
#   that is zero when fewer than 25% of cells in a group express the gene
# type = "truncatedMean": trimmed mean; the trimmed fraction is set with `trim`
cellchat <- computeCommunProb(
  cellchat,
  type = "triMean",        # default; yields fewer but stronger interactions
  trim = 0.1,              # fraction to trim (only used if type = "truncatedMean")
  nboot = 100,             # number of label permutations for p-value estimation
  seed.use = 42,
  population.size = TRUE   # weight by cell-group size (package default is FALSE)
)
# Filter out interactions with too few cells in sender or receiver groups
cellchat <- filterCommunication(cellchat, min.cells = 10)
# Summary of retained interactions
df.net <- subsetCommunication(cellchat)
cat("Interactions after f