Cell-Cell Communication with CellChat
A complete guide to inferring, visualizing, and interpreting cell-cell communication from scRNA-seq data — from biological concepts to publication figures, including critical caveats that most tutorials skip.
What is Cell-Cell Communication?
Cells do not act in isolation. Every cell in a tissue is constantly sending and receiving molecular signals — secreting proteins, displaying surface ligands, responding to hormones. This coordinated communication orchestrates development, homeostasis, immune responses, and disease. CellChat is a computational tool for inferring these conversations from scRNA-seq gene expression data.
The three components of a signaling interaction
Three types of signaling CellChat models
Paracrine: one cell type secretes a ligand that binds receptors on a different cell type. This is the classic cell-cell communication scenario, e.g. macrophage → T cell via MIF–CD44.
Autocrine: a cell type signals to itself by expressing both ligand and receptor. Common in tumors and activated immune cells. CellChat captures this; not every tool does.
Juxtacrine (contact-dependent): membrane-bound ligands that require direct cell-cell contact, e.g. Notch–Jagged signaling. CellChat includes these but cannot verify spatial proximity without spatial data.
CellChat infers potential communication based on co-expression of ligands and receptors across cell populations. It does not prove that a signal was actually sent, received, or had a functional effect. It is a hypothesis generator, not a confirmation. Always validate key interactions experimentally or with orthogonal data (e.g. perturbation experiments, protein-level data).
How CellChat Works — The Model
CellChat models communication probability using the law of mass action — borrowed from biochemistry. The idea: the probability of a signaling interaction scales with how much ligand is present in the sender cell and how much receptor is present in the receiver cell.
The probability model
For each sender cell type i and receiver cell type j, and each ligand-receptor pair in CellChatDB, CellChat computes:
P(i→j, L-R) ∝ average(ligand expression in i) × average(receptor expression in j)
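For multi-subunit complexes: the expression of the ligand or receptor is the geometric mean of its subunit expressions, so the complex scores zero if any subunit is absent (all subunits must be present). For cofactors: included as multiplicative terms, with agonists and co-stimulatory receptors raising the score and antagonists and co-inhibitory receptors lowering it. The product is then passed through a Hill function to keep values bounded.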
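To make the model concrete, here is a minimal sketch in base R of the score for a single ligand-receptor pair. It is a simplification under stated assumptions, not CellChat's implementation: agonists, antagonists, co-receptors, and population-size weighting are left out, the function names are made up for illustration, and the Hill constant Kh = 0.5 is assumed.

```r
# Simplified sketch of the score for one L-R pair (sender i -> receiver j).
# Assumptions: no agonists/antagonists/co-receptors, Hill coefficient 1, Kh = 0.5.
geo_mean <- function(x) prod(x)^(1 / length(x))

comm_prob <- function(ligand_subunits, receptor_subunits, Kh = 0.5) {
  L <- geo_mean(ligand_subunits)     # average ligand expression in the sender group
  R <- geo_mean(receptor_subunits)   # average receptor expression in the receiver group
  (L * R) / (Kh + L * R)             # Hill function keeps the score below 1
}

# Two-subunit receptor, both subunits expressed
p_full <- comm_prob(ligand_subunits = 1.2, receptor_subunits = c(0.8, 0.5))

# Same receptor with one subunit missing: the geometric mean collapses to zero
p_missing <- comm_prob(ligand_subunits = 1.2, receptor_subunits = c(0.8, 0))
```

The second call shows why a complex with one silent subunit cannot produce an inferred interaction, however high the other subunit is.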
CellChat averages expression across all cells within a cell type. This means it captures cell-type-level signals, not single-cell resolution. A ligand expressed by 5% of fibroblasts will contribute far less than one expressed by 80%; with the default triMean method, a gene detected in fewer than 25% of a group's cells averages to zero. This averaging is both a strength (noise reduction) and a limitation (misses rare subpopulations unless you annotate them separately).
CellChatDB — The Interaction Database
CellChat is only as good as its database. CellChatDB is a manually curated literature-supported database of ligand-receptor interactions, organized into signaling pathways. Understanding its structure is essential for interpreting results correctly.
Always inspect the database before running your analysis. Check which pathways it covers, whether your gene names match (human vs mouse), and consider subsetting to Secreted Signaling only if you're running on dissociated cells without spatial info — contact-dependent interactions can't truly be validated without proximity data.
    library(CellChat)

    # ── Explore CellChatDB ───────────────────────────────────────────
    CellChatDB <- CellChatDB.human   # or CellChatDB.mouse

    # What interaction categories are available?
    unique(CellChatDB$interaction$annotation)
    # "Secreted Signaling" "Cell-Cell Contact" "ECM-Receptor"
    # (newer database versions may list additional categories)

    # How many interactions total? (the count depends on the database version)
    nrow(CellChatDB$interaction)

    # What pathways exist?
    unique(CellChatDB$interaction$pathway_name) |> head(20)
    # "TGFb" "MIF" "SPP1" "CXCL" "CCL" "IL1" "NOTCH" "WNT" "EGF" ...

    # Inspect a specific pathway
    # (cofactors are stored in the agonist / antagonist columns)
    CellChatDB$interaction |>
      subset(pathway_name == "TGFb",
             select = c(ligand, receptor, agonist, antagonist, annotation))

    # ── Subset DB (recommended for dissociated cells) ─────────────────
    # Only use secreted signaling if no spatial info available
    CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling", key = "annotation")

    # Or use all interactions (default); uncomment to override the subset above
    # CellChatDB.use <- CellChatDB
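A quick way to check whether your gene names match the database is to compare the two symbol sets directly. The sketch below uses toy vectors as stand-ins (both are hypothetical); in a real run you would assume `db_genes` comes from the ligand and receptor columns of the database and `data_genes` from the row names of your expression matrix.

```r
# Toy stand-ins (hypothetical): replace with database symbols and rownames(data.input)
db_genes   <- c("TGFB1", "TGFBR1", "TGFBR2", "MIF", "CD74", "CXCR4")
data_genes <- c("TGFB1", "TGFBR1", "Tgfbr2", "MIF", "CD74", "ACTB")

matched   <- intersect(db_genes, data_genes)
unmatched <- setdiff(db_genes, data_genes)
coverage  <- length(matched) / length(db_genes)

# Genes that match only when case is ignored hint at a human/mouse naming mix-up
case_only <- unmatched[toupper(unmatched) %in% toupper(data_genes)]
```

Low coverage, or many case-only matches, usually means the wrong species database was selected or the matrix uses a different identifier type.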
The Atopic Dermatitis Dataset
Atopic dermatitis (AD) is a chronic inflammatory skin disease. This dataset compares normal (nonlesional, NL) and diseased (lesional, LS) human skin, making it ideal for both single-condition CellChat analysis and the more powerful comparison analysis between conditions.
scRNA-seq data from human skin biopsies. Two biological conditions: nonlesional skin (NL, effectively normal) and lesional skin (LS, actively inflamed AD). The example data covers fibroblasts, dendritic cells, and T cells, split into 12 labeled subpopulations. Pre-annotated with cell type labels and ready to use directly with CellChat.
The 12 fine-grained cell type labels
The dataset uses fine-grained cell annotations — not just "fibroblast" but 4 distinct fibroblast subtypes, not just "T cell" but inflammatory vs resting vs CD40LG+ subtypes. This granularity is what makes CellChat results biologically meaningful.
AD is driven by immune-stromal crosstalk — T cells signaling to keratinocytes, macrophages activating fibroblasts. This makes the expected biology rich and interpretable. Two conditions (NL vs LS) enable the most powerful CellChat feature: comparison analysis to identify which communication pathways are amplified in disease.
Setup & Installation
    # ── Install CellChat v2 ──────────────────────────────────────────
    # Requires several dependencies
    install.packages(c("BiocManager", "devtools"))

    # Bioconductor dependencies
    BiocManager::install(c("BiocNeighbors", "ComplexHeatmap"))

    # NMF (for pattern recognition)
    install.packages("NMF")

    # CellChat from GitHub
    devtools::install_github("jinworks/CellChat")

    # ── Load libraries ────────────────────────────────────────────────
    library(CellChat)
    library(patchwork)
    library(Seurat)   # only needed if you start from a Seurat object
    library(dplyr)
    library(ggplot2)
    options(stringsAsFactors = FALSE)

    # ── Download data from Figshare ───────────────────────────────────
    # Human skin example data covering NL (nonlesional) and LS (lesional)
    # https://figshare.com/projects/Example_data_for_.../157272
    # File: data_humanSkin_CellChat.rda (loaded in the next section)
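CellChat draws its chord diagrams with the R package circlize, so netVisual_chord_gene() does not need Python. The optional Python dependency is umap-learn, which netEmbedding() calls through reticulate during pathway similarity analysis. If that step fails, either point reticulate at a Python installation that has umap-learn (reticulate::use_python("/path/to/python")) or, in recent CellChat versions, pass umap.method = "uwot" to stay in pure R. All other plots work without Python.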
Create the CellChat Object
CellChat needs two things: a normalized count matrix and cell type labels. The official dataset comes as a single .rda file containing both — no Seurat object needed.
    # ── Load the official CellChat dataset ───────────────────────────
    # Download data_humanSkin_CellChat.rda from Figshare first
    load("data/data_humanSkin_CellChat.rda")

    # Contains one object: data_humanSkin with $data and $meta slots
    data.input <- data_humanSkin$data   # genes × cells, log-normalized
    meta       <- data_humanSkin$meta   # barcode → condition + labels

    head(meta)
    #                         labels condition
    # AAACCTGAGAAGGACA-1 Inflam. FIB        LS
    # AAACCTGAGACTAGAT-1   FBN1+ FIB        LS

    # 12 fine-grained cell types
    unique(meta$labels)
    # Inflam. FIB · FBN1+ FIB · APOE+ FIB · COL11A1+ FIB
    # cDC1 · cDC2 · LC · Inflam. DC
    # CD40LG+ TC · Inflam. TC · TC · NKT

    # ── Subset to LS (lesional) condition ────────────────────────────
    cell.use   <- rownames(meta)[meta$condition == "LS"]
    data.input <- data.input[, cell.use]
    meta       <- meta[cell.use, ]

    # ── Create CellChat object ────────────────────────────────────────
    cellchat <- createCellChat(
      object   = data.input,
      meta     = meta,
      group.by = "labels"   # the 12-label column
    )
    levels(cellchat@idents)   # confirm 12 cell types
    table(cellchat@idents)    # n cells per type: check no group <50

    # ── Set DB and preprocess ─────────────────────────────────────────
    cellchat@DB <- CellChatDB.human
    cellchat <- subsetData(cellchat)
    cellchat <- identifyOverExpressedGenes(cellchat)
    cellchat <- identifyOverExpressedInteractions(cellchat)
    # ── Optional: project onto PPI network ────────────────────────────
    # Smooths expression via high-confidence protein-protein interactions
    # Recovers signal for lowly expressed ligands/receptors
    # Recommended when: high sparsity, rare cell types, low depth
    # (newer CellChat releases name this step smoothData(); check your installed version)
    cellchat <- projectData(cellchat, PPI.human)
    # PPI.human built into CellChat (PPI.mouse also available)

    # Re-run after projection to update over-expressed interactions
    cellchat <- identifyOverExpressedInteractions(cellchat)

    # To actually use the projected data, set raw.use = FALSE in computeCommunProb()
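Since CellChat expects log-normalized data, it is worth sanity-checking the matrix before building the object. The helper below is a hypothetical heuristic, not a CellChat function: it assumes raw counts are all integers and that log-normalized values are positive, non-integer, and small. It is written for a dense matrix; for a sparse dgCMatrix you would inspect the non-zero values in `mat@x` instead.

```r
# Hypothetical helper, not part of CellChat: flags matrices that look like raw counts
# or scaled data rather than log-normalized expression.
looks_log_normalized <- function(mat) {
  vals <- as.numeric(mat)
  vals <- vals[vals != 0]
  all_integer <- all(vals == round(vals))
  !all_integer && max(vals) < 20 && min(vals) > 0
}

set.seed(1)
raw_counts <- matrix(rpois(200, lambda = 3), nrow = 20)   # toy genes x cells
log_norm   <- log1p(sweep(raw_counts, 2, colSums(raw_counts), "/") * 1e4)

check_raw <- looks_log_normalized(raw_counts)
check_log <- looks_log_normalized(log_norm)
```

Here `check_raw` comes out FALSE and `check_log` TRUE. A heuristic like this only catches the obvious mistakes; it cannot tell you which normalization was applied.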
If you don't have cell type labels, the CellChat authors describe grouping cells from a low-dimensional embedding (e.g. PCA or pseudotime) by building a shared neighbor graph and defining groups from it. createCellChat() itself still expects one label per cell (its do.sparse argument only controls sparse-matrix storage), so in practice cluster the cells first, for example with Seurat's FindNeighbors() and FindClusters(), and pass the cluster IDs through group.by. Useful for trajectory analysis, e.g. grouping iPSC → intermediate → neuron stages without pre-labeling.
Using "Fibroblast" as a single label would completely miss the distinct signaling roles of the 4 fibroblast subtypes in this dataset — Inflam. FIB has different L-R patterns than COL11A1+ FIB. Conversely, groups with <50 cells produce noisy estimates. Aim for biologically meaningful subtypes with at least 50–100 cells each.
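Checking group sizes takes two lines of base R. This sketch uses a toy label vector (hypothetical counts); with the tutorial data you would pass `meta$labels` instead, and the 50-cell threshold is simply the rule of thumb from the text.

```r
# Toy labels (hypothetical counts); in the tutorial this would be meta$labels
labels <- c(rep("Inflam. FIB", 120), rep("cDC2", 75), rep("NKT", 18))

group_sizes <- table(labels)
too_small   <- names(group_sizes)[group_sizes < 50]
```

Any group listed in `too_small` is a candidate for merging with a related subtype or for dropping from the analysis.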
Infer Communication Probabilities
This is the core computation — CellChat calculates a communication probability for every sender × receiver × L-R pair combination, then runs permutation tests to determine statistical significance.
    # ── Compute communication probability ────────────────────────────
    # type = "triMean": Tukey's trimean of expression per group, robust to outlier cells
    # population.size = TRUE: weights by cell group proportions (affects magnitude)
    cellchat <- computeCommunProb(
      cellchat,
      type = "triMean",
      population.size = TRUE
    )

    # ── Filter out low-cell-number interactions ───────────────────────
    # Removes communication involving cell groups with fewer than 10 cells
    cellchat <- filterCommunication(cellchat, min.cells = 10)

    # ── Compute pathway-level communication ──────────────────────────
    # Aggregates L-R level → signaling pathway level
    cellchat <- computeCommunProbPathway(cellchat)

    # ── Aggregate communication networks ─────────────────────────────
    # count: number of significant interactions
    # weight: total communication probability
    cellchat <- aggregateNet(cellchat)

    # ── Inspect results ───────────────────────────────────────────────
    df.net <- subsetCommunication(cellchat)
    head(df.net)
    # columns include: source | target | ligand | receptor | prob | pval | pathway_name

    # How many significant interactions total?
    nrow(df.net)   # typically 100s to 1000s
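The permutation test behind the p-values can be illustrated with a toy example in base R. This is a sketch of the idea only, under simplifying assumptions: the data are simulated, the score is a plain product of group means rather than CellChat's full model, and 1000 permutations are used for a smooth estimate.

```r
set.seed(42)
# Toy data (simulated): one ligand and one receptor measured in 300 cells
groups   <- rep(c("A", "B", "C"), each = 100)
ligand   <- c(rexp(100, 1 / 2), rexp(100, 5), rexp(100, 5))   # high in group A
receptor <- c(rexp(100, 5), rexp(100, 1 / 2), rexp(100, 5))   # high in group B

# Simplified A -> B score: mean ligand in A times mean receptor in B
score <- function(g) mean(ligand[g == "A"]) * mean(receptor[g == "B"])

observed <- score(groups)
permuted <- replicate(1000, score(sample(groups)))   # shuffle the group labels
p_value  <- mean(permuted >= observed)
```

Shuffling labels destroys any link between cell identity and expression, so the permuted scores show what the statistic looks like when group membership carries no information. An observed score far above that null distribution gives a small p-value.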
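triMean (default): Tukey's trimean of expression within each group, (Q1 + 2 × median + Q3) / 4. It returns zero when fewer than about 25% of a group's cells express the gene, so it yields fewer but stronger interactions. More conservative, fewer false positives.
truncatedMean: a trimmed mean with a user-set fraction, e.g. trim = 0.1 discards 10% of cells from each end before averaging. A smaller trim captures signals from even small fractions of expressing cells, which is useful if your cell types are heterogeneous.
For most analyses, stick with triMean.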
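The difference between the two averaging methods is easy to see on a toy group. The sketch below assumes the trimean is the quartile-weighted average and that the truncated mean behaves like base R's `mean(x, trim = ...)`; the expression vector is made up.

```r
# Tukey's trimean: weighted average of the three quartiles
tri_mean <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  (q[1] + 2 * q[2] + q[3]) / 4
}

# Toy group of 100 cells (hypothetical): only 20% express the gene
expr <- c(rep(0, 80), rep(2, 20))

avg_trimean   <- tri_mean(expr)            # all three quartiles are 0
avg_truncated <- mean(expr, trim = 0.10)   # drops 10% of cells from each end
avg_plain     <- mean(expr)
```

With 20% of cells expressing, the trimean is zero while the 10% truncated mean is still positive, which is why switching methods can change the number of inferred interactions substantially.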
Pathway-Level Analysis
Individual L-R pairs are noisy. CellChat groups them into signaling pathways (e.g. all TGFβ interactions together) and analyzes patterns at this pathway level.
    library(NMF)          # used by identifyCommunicationPatterns()
    library(ggalluvial)   # used by netAnalysis_river()

    # ── What pathways are active? ─────────────────────────────────────
    cellchat@netP$pathways   # list of significant pathways

    # ── Visualize all pathways as heatmap ────────────────────────────
    # Rows = pathways, columns = cell types, color = relative signaling strength
    # The heatmap reads centrality scores, so compute them first
    cellchat <- netAnalysis_computeCentrality(cellchat, slot.name = "netP")
    netAnalysis_signalingRole_heatmap(cellchat, pattern = "outgoing")   # or "incoming"

    # ── Signaling pattern recognition ────────────────────────────────
    # NMF-based: groups cells with similar outgoing signaling patterns
    # k = number of patterns (try k = 3, 4, 5 and pick best; selectK() can guide the choice)
    cellchat <- identifyCommunicationPatterns(cellchat, pattern = "outgoing", k = 4)

    # Sankey plot: cell types ↔ patterns ↔ pathways
    netAnalysis_river(cellchat, pattern = "outgoing")

    # Do same for incoming signals
    cellchat <- identifyCommunicationPatterns(cellchat, pattern = "incoming", k = 4)
    netAnalysis_river(cellchat, pattern = "incoming")

    # ── Pathway similarity (manifold learning) ────────────────────────
    # UMAP of pathways: similar pathways cluster together
    cellchat <- computeNetSimilarity(cellchat, type = "functional")
    cellchat <- netEmbedding(cellchat, type = "functional")
    cellchat <- netClustering(cellchat, type = "functional")
    netVisual_embedding(cellchat, type = "functional")
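Pathway-level aggregation can be pictured as summing the probabilities of all L-R pairs that belong to the same pathway for each sender-receiver pair. The base R sketch below works on a toy table with made-up values, shaped like the L-R level results; it illustrates the idea and is not the package's internal code.

```r
# Toy L-R level results (hypothetical values)
lr <- data.frame(
  source       = c("Inflam. FIB", "Inflam. FIB", "Inflam. FIB", "cDC2"),
  target       = c("Inflam. TC", "Inflam. TC", "cDC2", "Inflam. TC"),
  pathway_name = c("CXCL", "CXCL", "MIF", "MIF"),
  prob         = c(0.10, 0.05, 0.20, 0.08)
)

# Sum L-R probabilities within each sender x receiver x pathway combination
pathway_prob <- aggregate(prob ~ source + target + pathway_name, data = lr, FUN = sum)

cxcl_fib_tc <- pathway_prob$prob[pathway_prob$pathway_name == "CXCL"]
```

The two CXCL pairs between the same groups collapse into one pathway-level value, which is what makes pathway results less noisy than individual L-R pairs.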
Network Analysis — Roles & Centrality
CellChat uses social network analysis metrics to quantify each cell type's role in the communication network. This tells you which cell types are dominant senders, receivers, or mediators.
    # ── Compute network centrality ────────────────────────────────────
    # out-degree / in-degree: sender and receiver strength
    # betweenness: how often a cell type lies on shortest paths (mediator)
    # information centrality: overall influence on the flow
    cellchat <- netAnalysis_computeCentrality(cellchat, slot.name = "netP")

    # ── 2D scatter: sender vs receiver roles ─────────────────────────
    # Each dot is a cell type
    # x-axis = outgoing communication strength
    # y-axis = incoming communication strength
    netAnalysis_signalingRole_scatter(cellchat)

    # ── Role per specific pathway ─────────────────────────────────────
    # Which cell types are senders/receivers in these pathways?
    # (names must appear in cellchat@netP$pathways)
    netAnalysis_signalingRole_heatmap(cellchat, signaling = c("TGFb", "MIF", "SPP1"))

    # ── Summary: number and weight of interactions ────────────────────
    # Circle plot: node size = n cells, edge width = number of interactions
    groupSize <- as.numeric(table(cellchat@idents))
    netVisual_circle(cellchat@net$count,
      vertex.weight = groupSize,
      weight.scale = TRUE,
      label.edge = FALSE,
      title.name = "Number of interactions"
    )
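The simplest of these centrality measures, weighted out-degree and in-degree, are just row and column sums of the communication matrix. The sketch below uses a made-up 3 × 3 weight matrix with shortened group names to show how sender and receiver strength fall out of it.

```r
# Toy weight matrix (hypothetical): rows = senders, columns = receivers
w <- matrix(c(0.0, 0.6, 0.3,
              0.1, 0.0, 0.2,
              0.0, 0.1, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("FIB", "DC", "TC"), c("FIB", "DC", "TC")))

outgoing <- rowSums(w)   # weighted out-degree: sender strength
incoming <- colSums(w)   # weighted in-degree: receiver strength

dominant_sender   <- names(which.max(outgoing))
dominant_receiver <- names(which.max(incoming))
```

These two vectors are what the sender-versus-receiver scatter puts on its x and y axes.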
Visualization Deep Dive
CellChat has an unusually rich visualization toolkit. Each plot type reveals different aspects of the communication network. Here's what each one actually shows and when to use it.
    # ── 1. Circle plot: all interactions overview ─────────────────────
    groupSize <- as.numeric(table(cellchat@idents))
    netVisual_circle(cellchat@net$weight,
      vertex.weight = groupSize,
      weight.scale = TRUE,
      label.edge = FALSE,
      title.name = "Interaction strength"
    )

    # ── 2. Heatmap: sender × receiver matrix ──────────────────────────
    netVisual_heatmap(cellchat, measure = "weight")
    netVisual_heatmap(cellchat, measure = "count")

    # ── 3. Explore a specific pathway ────────────────────────────────
    # Which pathways are significant?
    cellchat@netP$pathways

    # Circle plot for one pathway
    netVisual_aggregate(cellchat,
      signaling = c("TGFb"),
      layout = "circle",
      vertex.receiver = seq(1, 4)   # only used when layout = "hierarchy"
    )

    # Chord diagram for L-R pairs within a pathway
    netVisual_chord_gene(cellchat,
      sources.use = 1,          # cell type index or name
      targets.use = c(2, 3),
      signaling = c("TGFb")
    )

    # ── 4. Bubble plot: L-R pairs between two groups ─────────────────
    # Group names must match levels(cellchat@idents)
    netVisual_bubble(cellchat,
      sources.use = "Inflam. FIB",
      targets.use = c("cDC2", "Inflam. TC"),
      remove.isolate = FALSE
    )

    # ── 5. Gene expression of L-R pair ───────────────────────────────
    plotGeneExpression(cellchat,
      signaling = "TGFb",
      enriched.only = TRUE   # only show overexpressed
    )
netVisual_chord_gene() is drawn with circlize in pure R, but chord diagrams get crowded quickly when many L-R pairs or cell groups are shown. When that happens, use netVisual_aggregate(layout = "circle") or netVisual_bubble() instead. They convey similar information in a more compact form.
Merge Two Conditions
The most powerful CellChat analysis is comparing communication between two biological conditions. For atopic dermatitis: what changes between normal skin (NL) and diseased skin (LS)?
    # ── Run full pipeline on BOTH conditions ─────────────────────────
    # (run the steps from "Create the CellChat Object" through
    #  "Network Analysis" separately for NL and LS)
    # Then save both objects:
    # cellchat_NL, cellchat_LS

    # ── Merge into one object ─────────────────────────────────────────
    object.list <- list(NL = cellchat_NL, LS = cellchat_LS)
    cellchat_merged <- mergeCellChat(
      object.list,
      add.names = names(object.list),
      cell.prefix = TRUE
    )

    # ── Compare total interaction counts ─────────────────────────────
    compareInteractions(cellchat_merged,
      show.legend = FALSE,
      group = c(1, 2)
    )
    # Bar chart: LS typically has more interactions than NL

    # ── Compare as circle plots side by side ─────────────────────────
    par(mfrow = c(1, 2))
    for (i in seq_along(object.list)) {
      netVisual_circle(object.list[[i]]@net$weight,
        vertex.weight = as.numeric(table(object.list[[i]]@idents)),
        weight.scale = TRUE,
        label.edge = FALSE,
        title.name = paste0(names(object.list)[i], " - Interaction strength")
      )
    }
Differential Interactions — What Changes in Disease?
    # ── Differential interaction heatmap ─────────────────────────────
    # Red = more in LS (disease), Blue = more in NL (normal)
    netVisual_heatmap(cellchat_merged,
      measure = "weight",
      comparison = c(1, 2)   # NL vs LS
    )

    # ── Which pathways gain/lose strength in disease? ─────────────────
    rankNet(cellchat_merged,
      mode = "comparison",
      stacked = TRUE,
      do.stat = TRUE
    )
    # Horizontal bar chart: NL vs LS information flow per pathway
    # Look for: inflammatory pathways such as IL signaling, CCL, SPP1 increasing in LS

    # ── Bubble plot of differentially expressed L-R pairs ─────────────
    # Group names must match levels(cellchat_merged@idents$joint)
    netVisual_bubble(cellchat_merged,
      sources.use = "Inflam. FIB",
      targets.use = "Inflam. TC",
      comparison = c(1, 2),
      angle.x = 45
    )
    # Side-by-side bubble: NL vs LS for the same cell pair

    # ── Chord diagrams: NL vs LS for specific pathway ─────────────────
    # Chord plots take one object at a time, so loop over the per-condition objects
    object.list <- list(NL = cellchat_NL, LS = cellchat_LS)
    par(mfrow = c(1, 2), xpd = TRUE)
    for (i in seq_along(object.list)) {
      netVisual_aggregate(object.list[[i]],
        signaling = "CCL",
        layout = "chord",
        signaling.name = paste("CCL", names(object.list)[i])
      )
    }
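Conceptually, the differential heatmap is the element-wise difference between the two conditions' sender × receiver matrices. The sketch below illustrates this with two made-up weight matrices and shortened group names; it assumes both conditions share the same cell groups in the same order.

```r
groups <- c("FIB", "DC", "TC")

# Toy weight matrices (hypothetical): rows = senders, columns = receivers
w_NL <- matrix(c(0.0, 0.20, 0.1,
                 0.1, 0.00, 0.1,
                 0.0, 0.10, 0.0),
               nrow = 3, byrow = TRUE, dimnames = list(groups, groups))
w_LS <- matrix(c(0.0, 0.50, 0.3,
                 0.1, 0.00, 0.3,
                 0.0, 0.05, 0.0),
               nrow = 3, byrow = TRUE, dimnames = list(groups, groups))

diff_w <- w_LS - w_NL   # positive = stronger in LS, negative = stronger in NL

# Locate the sender -> receiver pair with the largest gain in LS
idx <- which(diff_w == max(diff_w), arr.ind = TRUE)
top_gain <- c(sender   = rownames(diff_w)[idx[1, "row"]],
              receiver = colnames(diff_w)[idx[1, "col"]])
```

Positive cells correspond to the red entries of the heatmap and negative cells to the blue ones.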
Shifted Sender/Receiver Roles Between Conditions
Beyond which pathways change, CellChat can identify whether cell types switch roles between conditions — a cell type that was primarily a receiver in normal tissue might become a dominant sender in disease.
    # ── Scatter: signaling role shift NL vs LS ───────────────────────
    # One scatter per condition; each dot = cell type
    # (centrality must have been computed on each object before merging)
    object.list <- list(NL = cellchat_NL, LS = cellchat_LS)
    gg <- lapply(names(object.list), function(nm) {
      netAnalysis_signalingRole_scatter(object.list[[nm]], title = nm)
    })
    patchwork::wrap_plots(plots = gg)
    # Compare each cell type's position across the two panels: NL → LS
    # Look for: fibroblasts becoming stronger senders in LS

    # ── Identify "gained" vs "lost" interactions ─────────────────────
    # Red edges = increased in LS, blue edges = decreased
    netVisual_diffInteraction(cellchat_merged,
      comparison = c(1, 2),
      measure = "weight",
      weight.scale = TRUE
    )

    # ── Manifold learning: are signaling patterns similar across conditions? ─
    cellchat_merged <- computeNetSimilarityPairwise(cellchat_merged, type = "functional")
    cellchat_merged <- netEmbedding(cellchat_merged, type = "functional")
    cellchat_merged <- netClustering(cellchat_merged, type = "functional")

    # 2D UMAP of signaling pathways, NL vs LS colored
    netVisual_embeddingPairwise(cellchat_merged, type = "functional")
    # Pathways close together = functionally similar signaling roles
Pitfalls & Overinterpretation
CellChat is one of the most widely misused tools in single-cell biology. Understanding what results actually mean — and what they don't — is as important as running the analysis.
CellChat identifies cell type pairs where ligand and receptor are co-expressed and statistically enriched. This does not mean the ligand was actually secreted, reached the receiver cell, bound the receptor, or triggered downstream signaling. Co-expression is necessary but not sufficient for communication. Every biologically important finding needs experimental validation.
The communication probability is a relative score, not an absolute measure. Don't report "macrophages signal strongly to T cells" — report "macrophage → T cell MIF signaling is inferred with high probability in lesional skin vs nonlesional."
If you label all T cells as "T cell", you'll miss CD4/CD8/Treg-specific communication. If you annotate rare populations based on <50 cells, estimates will be noisy. Cell annotation quality directly determines result quality.
CellChat's trimmed mean expression calculation assumes log-normalized counts (the standard Seurat "data" slot). Running on raw counts or scaled counts will give incorrect probability estimates and non-comparable results.
Notch, Ephrin, and other contact-dependent interactions in CellChatDB require physical cell-cell contact. scRNA-seq can't verify proximity. If you don't have spatial data, subset to "Secreted Signaling" only to avoid misleading contact-dependent inferences.
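If NL has 3,000 cells and LS has 8,000 cells, raw communication probabilities are not directly comparable. Set population.size = TRUE in computeCommunProb() so each probability is weighted by the relative sizes of the sender and receiver groups, and use the same settings for both conditions.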
CellChat may infer hundreds of significant interactions. Don't list them all. Focus on: (1) pathways with known biology in your tissue, (2) interactions that change between conditions, (3) top interactions by probability within key cell type pairs.
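Narrowing a long results table down to the interactions worth reporting is ordinary data-frame work. The sketch below uses a toy table with made-up values, shaped like the output of subsetCommunication(); the 0.05 cutoff and the choice of three rows are arbitrary illustrations.

```r
# Toy results table (hypothetical values)
df <- data.frame(
  source       = c("Inflam. FIB", "Inflam. FIB", "cDC2", "cDC2", "Inflam. TC"),
  target       = c("Inflam. TC", "cDC2", "Inflam. TC", "Inflam. FIB", "cDC2"),
  pathway_name = c("CXCL", "MIF", "MIF", "CCL", "CCL"),
  prob         = c(0.12, 0.30, 0.08, 0.02, 0.05),
  pval         = c(0.00, 0.00, 0.01, 0.20, 0.00),
  stringsAsFactors = FALSE
)

# Keep significant rows, then rank by communication probability
sig  <- df[df$pval < 0.05, ]
top3 <- head(sig[order(-sig$prob), ], 3)
```

Filtering further by `source` and `target` restricts the ranking to the cell type pairs that matter for your biological question.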
CellChat vs NicheNet vs CellPhoneDB
CellChat is not the only tool for cell-cell communication inference. Choosing the right tool depends on your biological question.
| Feature | CellChat v2 | NicheNet | CellPhoneDB v3 |
|---|---|---|---|
| Primary goal | Network-level communication | Ligand activity prediction | L-R pair enumeration |
| Multi-subunit receptors | ✓ CellChatDB | Partial | ✓ |
| Statistical test | Permutation test | Regulatory network | Permutation test |
| Comparison analysis | ✓ Built-in | Manual | Partial |
| Spatial data support | ✓ CellChat v2/v3 | ❌ | Partial |
| Network visualization | Rich (chord, circle, heatmap) | Basic | Basic |
| Predicts downstream genes | ❌ | ✓ Key differentiator | ❌ |
| Best use case | Network overview, comparison | Mechanism hunting | Quick L-R list |
| Language | R | R | Python / R |
CellChat: your first tool. Best for network-level overview, identifying dominant pathways, and comparing conditions.
NicheNet: the follow-up when you want to know which sender ligands most plausibly drive gene expression changes in the receiver. Mechanistically richer, but it rests on more assumptions.
CellPhoneDB: quick, simple L-R enumeration. Good for generating hypotheses without the full network framework.
Many groups now use CellChat and NicheNet together: CellChat for overview, NicheNet for mechanism.
Publication Figures
A typical CellChat paper figure panel includes 4–6 plots that together tell a complete story: overview → pathway-level → specific interactions → comparison.
Panel A: Circle plot overview (NL vs LS side by side), which establishes more communication in disease.
Panel B: rankNet comparison, showing which pathways increase in LS.
Panel C: Role scatter per condition, showing which cell types shift roles.
Panel D: Bubble plot for a key cell pair (e.g. Macrophage → Keratinocyte), NL vs LS.
Panel E: Chord diagram for the top upregulated pathway in LS.
Panel F: Violin plots of key ligand/receptor gene expression validating the inference.
The strongest CellChat papers pair computational inference with at least one validation: protein-level evidence (flow cytometry, IHC, ELISA), functional assay (blocking antibody, ligand stimulation experiment), or orthogonal computational support (the same interaction found by CellPhoneDB or NicheNet independently). Pick your top 1–2 pathways and validate them — reviewers will ask.