Metabolic Modeling

Metabolic modeling sits at the intersection of biochemistry, systems biology, and computational science, and it touches essentially every research area in this library. The cost-driving questions in cellular agriculture (what minimal medium supports growth? what flux distribution maximizes biomass while suppressing lactate? which gene knockouts increase yield in suspension culture?) are all metabolic-modeling questions in disguise. This page is a cross-cutting methodology overview rather than a matrix-column research area; the relevant papers and tools are catalogued in the matrix under Media Optimization, Cellular Engineering, Bioprocess Control, and AI Tooling / Methodology.

Constraint-based modeling (FBA family)

The dominant paradigm for whole-cell metabolic analysis is constraint-based modeling. Starting from a genome-scale metabolic reconstruction (encoded as a stoichiometric matrix covering every known reaction in the organism), one solves a linear program for the flux distribution that maximizes biomass (or another objective) subject to steady-state mass balance, thermodynamic (reaction-directionality), and capacity constraints. Flux balance analysis (FBA), flux variability analysis (FVA), and Monte Carlo flux sampling are the core algorithms; strain-design extensions (OptKnock, OptForce, OptCouple) layer mixed-integer programming on top to identify gene knockouts that redirect flux toward desired products.
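As a reference point, FBA and FVA can be written compactly in conventional notation. The symbols below are the usual textbook choices, not names defined on this page: S is the m × n stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (typically selecting the biomass reaction), v^lb and v^ub the lower and upper flux bounds (which encode reaction directionality and capacity limits), and γ an assumed user-chosen fraction of the optimum between 0 and 1.

```latex
% FBA: a single linear program
\begin{aligned}
Z^{*} \;=\; \max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
                        & v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}.
\end{aligned}

% FVA: two linear programs per reaction i = 1, \dots, n
\begin{aligned}
v_i^{\min},\; v_i^{\max} \;=\; \min_{v},\; \max_{v} \quad & v_i \\
\text{subject to} \quad & S v = 0, \\
                        & v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \\
                        & c^{\top} v \ge \gamma Z^{*}, \qquad 0 \le \gamma \le 1.
\end{aligned}
```

The equality S v = 0 is the steady-state mass balance: every internal metabolite is produced as fast as it is consumed. FVA reuses the FBA feasible region and adds one constraint that keeps the objective within a fraction γ of its optimum, then reports the range each flux can take.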

The open-source stack for this work is mature and largely centered on the openCOBRA consortium: see COBRApy, COBRA Toolbox, Memote, Escher, BiGG Models, RAVEN Toolbox, StrainDesign, and CNApy in Software.md. The substrate for cell-ag work, species-specific genome-scale metabolic models (GEMs), is fragmented but growing. Each GEM is catalogued on its species page in the Datasets/ directory: the bovine BtaSBML2986, chicken iES1300, porcine pcPigMNet2025, salmon SALARECON, and the CHO iCHO family of reconstructions. Recent applied work in this area includes the CHO-focused data-driven GEM reduction approach (Antonakoudis & Richelle 2026, npj Sys Bio Apps) and the multi-omics cultivated-meat framework #60 (Mathieu et al. 2025, Trends in Food Science & Technology).

Kinetic and dynamic modeling

A complementary paradigm models metabolism as a system of ordinary differential equations (ODEs) with enzyme kinetic parameters, simulating concentration trajectories rather than steady-state fluxes. This approach captures regulatory dynamics, time-dependent responses to perturbations, and bioreactor behavior that steady-state constraint-based models cannot, at the cost of needing measured kinetic parameters that are often unavailable. The principal open-source tools are COPASI and Tellurium, which wraps libRoadRunner; the BioModels repository at EMBL-EBI hosts thousands of curated SBML models. Kinetic parameters themselves are catalogued in BRENDA.
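To make the ODE view concrete, here is a minimal sketch of a single irreversible Michaelis-Menten reaction (substrate to product) integrated with a hand-written fourth-order Runge-Kutta step. It uses only the Python standard library and does not use COPASI, Tellurium, or libRoadRunner; the function names and the parameter values (`vmax`, `km`, initial concentration) are illustrative assumptions, not measured kinetics.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]  # (substrate, product) concentrations, e.g. in mM


def michaelis_menten(state: State, vmax: float, km: float) -> State:
    """Return (dS/dt, dP/dt) for one irreversible Michaelis-Menten reaction."""
    s, _p = state
    rate = vmax * s / (km + s)
    return (-rate, rate)


def rk4_step(f: Callable[[State], State], state: State, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(state, k3)))
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(
    s0: float, vmax: float, km: float, t_end: float, dt: float
) -> List[Tuple[float, float, float]]:
    """Return the trajectory as a list of (time, substrate, product) rows."""
    n_steps = int(round(t_end / dt))
    state: State = (s0, 0.0)
    trajectory = [(0.0, state[0], state[1])]
    for i in range(1, n_steps + 1):
        state = rk4_step(lambda st: michaelis_menten(st, vmax, km), state, dt)
        trajectory.append((i * dt, state[0], state[1]))
    return trajectory


if __name__ == "__main__":
    # Illustrative parameter values only.
    for t, s, p in simulate(s0=10.0, vmax=1.0, km=2.0, t_end=20.0, dt=0.01)[::500]:
        print(f"t={t:6.2f}  S={s:8.4f}  P={p:8.4f}")
```

A genome-scale kinetic model has the same shape, only with hundreds of state variables and rate laws, which is where the demand for measured parameters comes from.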

ML-based and hybrid approaches

A third paradigm, increasingly active since 2024, uses ML directly on omics or imaging data to predict metabolic phenotypes, often as a complement to, rather than a replacement for, mechanistic models. Single-cell foundation models (scGPT, Geneformer, scFoundation) capture pathway membership and co-expression structure but do not directly predict fluxes; bridging foundation-model embeddings to flux predictions is an active research frontier. The recent State (Adduri et al. 2025) and OmicsLM (Sypetkowski et al. 2026) papers represent two flavors of this trend: perturbation prediction and multimodal omics reasoning, respectively. Bayesian-optimization approaches (e.g. Narayanan et al. 2025, Nat Comms, MIT Love lab) demonstrate that data-efficient ML can drive media-formulation work without requiring a full GEM.

The AI agent layer

The newest layer — and the one most rapidly evolving — wraps the above tools with large-language-model agents that can plan multi-step metabolic-reasoning workflows, call FBA / FVA / kinetic primitives as functions, and interpret results. Closest in spirit to a “ChatCOBRA” today are Talk2Biomodels (Wehling et al. 2025) for kinetic SBML models, Genesis (Tiukova et al. 2024, King group) for systems-biology automation, AutonoMS (Brunnsåker et al. 2025) for LLM + symbolic logic + automated cell culture, Lila (Singh et al. 2023, Carbonell group) for microbial strain design, and the Saez-Rodriguez MCP servers (Ruscone et al. 2025) for gene-regulatory and Boolean-network modeling. Adjacent infrastructure includes Biomni, TxAgent, ToolUniverse, PaperQA / PaperQA2, and the AI Scientist family from Sakana — none of which is metabolic-modeling-specific but all of which provide patterns transferable to cell-ag agentic workflows.
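The "call primitives as functions" pattern can be sketched as a small tool registry: each modeling primitive is registered under a name, and the agent emits a JSON request that a dispatcher routes to the matching function. Everything here is hypothetical and for illustration only: the `tool` decorator, the `run_fba` and `run_fva` stubs, and the JSON request shape are assumptions, not the interface of any tool named above or of any MCP server.

```python
import json
from typing import Any, Callable, Dict

# Registry mapping tool names to plain Python callables (hypothetical design).
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def tool(fn: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Register a function so a planner can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def run_fba(model_id: str, objective: str) -> Dict[str, Any]:
    # Stub: a real tool would load the model and solve the FBA linear program.
    return {"model_id": model_id, "objective": objective, "status": "stub"}


@tool
def run_fva(model_id: str, fraction_of_optimum: float = 1.0) -> Dict[str, Any]:
    # Stub: a real tool would return a (min, max) flux range per reaction.
    return {
        "model_id": model_id,
        "fraction_of_optimum": fraction_of_optimum,
        "status": "stub",
    }


def dispatch(request_json: str) -> str:
    """Execute one JSON tool call and return a JSON reply for the agent."""
    request = json.loads(request_json)
    name = request.get("tool")
    if name not in TOOLS:
        return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**request.get("arguments", {}))
    except TypeError as exc:  # missing or unexpected arguments
        return json.dumps({"ok": False, "error": str(exc)})
    return json.dumps({"ok": True, "tool": name, "result": result})


if __name__ == "__main__":
    call = {"tool": "run_fba", "arguments": {"model_id": "toy", "objective": "biomass"}}
    print(dispatch(json.dumps(call)))
```

Returning errors as structured JSON rather than raising lets the language model see what went wrong and retry with corrected arguments, which is the main design point of the sketch.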

Open gaps for cell-ag

Despite the activity above, several gaps remain conspicuous for cellular agriculture specifically: no MCP server dedicated to COBRApy or BiGG Models yet exists; no benchmark suite scores AI agents on cell-ag-relevant metabolic-engineering tasks (the equivalent of Arc Institute’s Virtual Cell Challenge for FBA); no published bridge connects single-cell foundation-model embeddings to flux predictions; and species-specific GEMs for bovine, porcine, avian, and fish species are scattered across preprints rather than centralized in a BiGG-style canonical home. The cultivated-meat industry has reported significant cost wins from internal AI stacks (Pythag Tech, DeepLife, Magic Valley), but these are proprietary. Credible open competitors that close the gap between published GEMs, open-source FBA tools, and active-learning media formulation are still emerging. The repository tracks this evolving landscape; contributors are encouraged to add new GEMs, agentic tooling, and applied work as it appears.

Further reading