
RNA Velocity: Computational Inference of Cellular State Dynamics from Single-Cell Transcriptomics
RNA velocity is a computational method that leverages single-cell transcriptomic data to infer the direction and speed of cellular state transitions by analyzing the dynamics of unspliced and spliced mRNA. The core principles are outlined below:
Biological Basis
mRNA biogenesis involves two key stages: transcription (yielding unspliced pre-mRNA) and splicing (producing mature mRNA). Because unspliced mRNA is produced before its spliced counterpart, the relative abundance of the two forms carries kinetic information about cellular state transitions. By quantifying how far the observed unspliced-to-spliced ratio deviates from its expected steady-state value, the rate of change of a gene's expression can be estimated: an excess of unspliced mRNA suggests up-regulation, and a deficit suggests down-regulation.
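The kinetic picture above is commonly written as a pair of rate equations for a single gene. The sketch below assumes constant rates and uses u and s for unspliced and spliced abundance (symbols introduced here for illustration); α, β, and γ are the transcription, splicing, and degradation rates named in the model descriptions later in this document.

```latex
\begin{aligned}
\frac{du}{dt} &= \alpha - \beta\,u, \\
\frac{ds}{dt} &= \beta\,u - \gamma\,s .
\end{aligned}
```

Under these assumptions, setting both derivatives to zero gives the steady-state relation u = (γ/β) s. The velocity ds/dt = βu − γs is positive when a cell holds more unspliced mRNA than this relation predicts and negative when it holds less.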
Mathematical Models
Differential equations are employed to model transcriptional kinetics:
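- Deterministic steady-state models (e.g., velocyto): Assume that cells at the extremes of a gene's expression range are in steady state, and estimate the ratio of degradation rate to splicing rate (γ/β) by linear regression of unspliced on spliced abundance. Velocity is the residual between the observed unspliced level and the level predicted by the steady-state ratio.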
- Stochastic models: Treat transcription, splicing, and degradation as probabilistic events, and improve robustness by fitting both first- and second-order moments of the counts.
- Dynamical models (e.g., scVelo): Use an expectation-maximization (EM) algorithm to iteratively optimize the kinetic parameters (transcription rate α, splicing rate β, degradation rate γ) and to infer a per-cell latent time that reflects progression through differentiation.
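The steady-state approach in the list above can be illustrated on synthetic data. The following is a minimal sketch, not the velocyto implementation: it simulates one gene being switched on, fits the steady-state slope by regression through the origin on cells in the extreme quantiles, and takes velocity as the residual. The function names, the quantile of 0.05, and the parameter values are assumptions chosen for illustration; with β = 1 the fitted slope approximates γ.

```python
import numpy as np

def simulate_induction(alpha, beta, gamma, times):
    """Analytic solution of du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s
    for a gene switched on at t = 0 with u(0) = s(0) = 0 (needs beta != gamma)."""
    times = np.asarray(times, dtype=float)
    u = alpha / beta * (1 - np.exp(-beta * times))
    s = (alpha / gamma * (1 - np.exp(-gamma * times))
         - alpha / (gamma - beta) * (np.exp(-beta * times) - np.exp(-gamma * times)))
    return u, s

def fit_steady_state_ratio(u, s, quantile=0.05):
    """Slope of a regression through the origin (u on s), using only cells
    in the lowest and highest quantiles of total abundance."""
    total = u + s
    lo, hi = np.quantile(total, [quantile, 1 - quantile])
    extreme = (total <= lo) | (total >= hi)
    return float(u[extreme] @ s[extreme] / (s[extreme] @ s[extreme]))

# Hypothetical gene: alpha = 2, beta = 1, gamma = 0.5, sampled at 200 time points.
times = np.linspace(0.0, 20.0, 200)
u, s = simulate_induction(alpha=2.0, beta=1.0, gamma=0.5, times=times)

ratio = fit_steady_state_ratio(u, s)   # estimate of gamma / beta
velocity = u - ratio * s               # residual from the steady-state line

print(f"fitted ratio: {ratio:.3f} (true gamma/beta = 0.5)")
print("early cells have positive velocity:", bool(np.all(velocity[1:50] > 0)))
```

In this toy setting, cells sampled early in the induction carry excess unspliced mRNA and therefore receive positive velocities, while cells near the end sit close to the fitted line. Real data add noise, normalization, and neighborhood smoothing, none of which is modeled here.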
Applications
- Cell fate prediction: Velocity vector fields reveal differentiation trajectories (e.g., from progenitors to terminal states).
- Key gene identification: Genes whose dynamics best explain a state transition are ranked as putative regulatory drivers (e.g., transcription factors).
- Temporal scaling: The magnitude of a velocity vector indicates differentiation speed, while the coherence of a cell's velocity with the velocities of its neighbors serves as a measure of prediction confidence.
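The speed and coherence measures mentioned in the list above can be sketched in a few lines. This is a simplified proxy under stated assumptions, not the exact computation of any particular tool: speed is the Euclidean norm of each cell's velocity vector, and coherence is the mean cosine similarity between a cell's velocity and those of its neighbors. The function name and the toy neighbor lists are hypothetical.

```python
import numpy as np

def speed_and_coherence(velocity, neighbors):
    """velocity: (n_cells, n_genes) array; neighbors: (n_cells, k) integer indices.
    Returns per-cell speed and mean cosine similarity with neighbor velocities."""
    velocity = np.asarray(velocity, dtype=float)
    speed = np.linalg.norm(velocity, axis=1)
    # Unit vectors; cells with zero velocity keep a zero vector.
    unit = velocity / np.where(speed > 0, speed, 1.0)[:, None]
    # Cosine similarity between each cell and each of its k neighbors.
    cos = np.einsum("ig,ikg->ik", unit, unit[neighbors])
    return speed, cos.mean(axis=1)

# Toy data: cells 0 and 1 agree in direction, cells 2 and 3 point in opposite directions.
toy_velocity = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, -1.0]])
toy_neighbors = np.array([[1], [0], [3], [2]])

speed, coherence = speed_and_coherence(toy_velocity, toy_neighbors)
print("speed:    ", speed)
print("coherence:", coherence)
```

A coherence near 1 means neighboring cells agree on direction, while values near 0 or below flag regions where the inferred direction is unreliable.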
Tools & Limitations
- Tools: velocyto (steady-state model) and scVelo (dynamical model that extends to transient states and heterogeneous populations).
- Limitations: Sensitivity to data quality (e.g., whether the sequencing protocol captures enough unspliced, intron-containing reads) and potential biases from model assumptions (e.g., assuming steady state in transient processes).
Example: In pancreatic endocrine development data, positive velocity for Cpe reflects its up-regulation along the path toward β cells, while negative velocity for Adk reflects its down-regulation as cells leave the ductal state. Projecting velocity vectors onto a UMAP embedding visualizes these differentiation paths.
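The projection step can be sketched as follows. This is a simplified version of the transition-probability idea used for embedding velocities, not the code of velocyto or scVelo: each cell's velocity is compared (by cosine similarity) with the displacement to each neighbor in gene space, the similarities are turned into transition probabilities with an exponential kernel, and the embedded velocity is the probability-weighted mean of unit displacements in the embedding, after subtracting the uniform expectation. The function name, the kernel width sigma, and the toy data are assumptions for illustration.

```python
import numpy as np

def project_velocity(expr, velocity, embedding, neighbors, sigma=0.1):
    """Project gene-space velocities onto a low-dimensional embedding.

    expr, velocity: (n_cells, n_genes); embedding: (n_cells, n_dims);
    neighbors: (n_cells, k) integer indices of each cell's neighbors.
    """
    expr = np.asarray(expr, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    embedding = np.asarray(embedding, dtype=float)
    n_cells, k = neighbors.shape
    projected = np.zeros_like(embedding)
    for i in range(n_cells):
        nbrs = neighbors[i]
        # Displacement from cell i to each neighbor in gene space.
        delta = expr[nbrs] - expr[i]
        denom = np.linalg.norm(delta, axis=1) * np.linalg.norm(velocity[i])
        cos = np.divide(delta @ velocity[i], denom, out=np.zeros(k), where=denom > 0)
        # Exponential kernel turns similarities into transition probabilities.
        prob = np.exp(cos / sigma)
        prob /= prob.sum()
        # Unit displacement to each neighbor in the embedding.
        step = embedding[nbrs] - embedding[i]
        norm = np.linalg.norm(step, axis=1, keepdims=True)
        step = np.divide(step, norm, out=np.zeros_like(step), where=norm > 0)
        # Subtract the uniform expectation so random directions cancel out.
        projected[i] = ((prob - 1.0 / k)[:, None] * step).sum(axis=0)
    return projected

# Toy data: five cells on a line, all moving in the +x direction.
toy_expr = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]], dtype=float)
toy_vel = np.tile([1.0, 0.0], (5, 1))
toy_nbrs = np.array([[1, 2], [0, 2], [1, 3], [2, 4], [3, 2]])

arrows = project_velocity(toy_expr, toy_vel, toy_expr, toy_nbrs)
print(arrows)
```

In this toy case the interior cells receive arrows pointing in the +x direction. The two end cells receive zero-length arrows because all of their neighbors lie on one side, which illustrates that embedded velocities depend on the neighborhood graph as well as on the velocities themselves.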