Insilijo Science
GeMMA · microbiome × metabolome

Community-scale integration of metagenomics and metabolomics.

GeMMA reconciles the metabolic pathways encoded in a microbial community's genomes with the pathways observed in the metabolome — and tells you, at pathway, taxon, and reaction resolution, where they disagree. Built for paired microbiome–metabolome studies in human health, agricultural, and environmental contexts.

Request a demo → See an example analysis
capabilities

Four linked analyses on the same community model.

Concordance

Capacity — fraction of a pathway's reactions present in the community model, per sample. Activity — fraction of that pathway's metabolites detected in the metabolome, per sample. Every pathway in every sample gets one of four labels: active, silent (capacity without activity), exogenous (activity without capacity), or absent.

Silent pathways are the interesting ones. They mark places where the community has the capacity to do something but the metabolome shows no sign of it. The usual explanations are regulation, substrate limitation, or metabolites the measurement panel does not cover.
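The four-way labelling can be sketched in a few lines. This is a minimal illustration, not GeMMA's implementation: the function names, the toy reaction and metabolite IDs, and the 0.5 cut-off are all assumptions made for the example, since the page does not state how the fractions are thresholded.

```python
from typing import Set


def fraction_present(required: Set[str], observed: Set[str]) -> float:
    """Fraction of `required` items that also appear in `observed`."""
    if not required:
        return 0.0
    return len(required & observed) / len(required)


def concordance_label(capacity: float, activity: float, threshold: float = 0.5) -> str:
    """Map a (capacity, activity) pair to one of the four labels.

    The 0.5 threshold is a hypothetical default chosen for this sketch.
    """
    has_capacity = capacity >= threshold
    has_activity = activity >= threshold
    if has_capacity and has_activity:
        return "active"
    if has_capacity:
        return "silent"      # capacity without activity
    if has_activity:
        return "exogenous"   # activity without capacity
    return "absent"


# Toy pathway for one sample (all identifiers are made up).
pathway_reactions = {"R1", "R2", "R3", "R4"}
pathway_metabolites = {"M1", "M2", "M3"}
model_reactions = {"R1", "R2", "R3", "R9"}   # reactions in the community model
detected_metabolites = {"M7"}                # metabolites seen in the metabolome

capacity = fraction_present(pathway_reactions, model_reactions)
activity = fraction_present(pathway_metabolites, detected_metabolites)
label = concordance_label(capacity, activity)
print(capacity, activity, label)
```

Here three of the four pathway reactions are in the model but none of its metabolites are detected, so the pathway is labelled silent.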

Steiner subnetworks

The minimally connected reaction–metabolite subgraph linking every measured metabolite through the community's reaction network, with edges weighted toward actively carrying reactions. Transport, exchange, and demand reactions, along with currency metabolites (ATP, NAD, H₂O, CoA, …), are excluded from the underlying graph: they're biochemical plumbing that would turn every network view into a hairball.

The result is a dense but readable backbone of the metabolism that actually reaches the measured compounds.
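To make the idea concrete, here is a small sketch of a Steiner-style backbone on a toy reaction–metabolite graph. It uses a simple shortest-path attachment heuristic, which is an assumption for illustration and may differ from the solver GeMMA uses. The helper names, node IDs, and edge weights are hypothetical; lower weight is taken to mean a more active reaction.

```python
import heapq
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Graph = Dict[str, Dict[str, float]]


def build_graph(edges: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected weighted graph from (u, v, weight) triples."""
    graph: Graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def drop_nodes(graph: Graph, excluded: Set[str]) -> Graph:
    """Copy the graph without the excluded nodes or any edge touching them."""
    return {u: {v: w for v, w in nbrs.items() if v not in excluded}
            for u, nbrs in graph.items() if u not in excluded}


def steiner_heuristic(graph: Graph, terminals: Iterable[str]) -> Set[FrozenSet[str]]:
    """Grow a tree from one terminal, repeatedly attaching the nearest
    remaining terminal along its cheapest path. Approximate, not exact."""
    remaining = set(terminals)
    start = min(remaining)            # deterministic starting terminal
    remaining.discard(start)
    tree_nodes, tree_edges = {start}, set()
    while remaining:
        # Multi-source Dijkstra from every node already in the tree.
        dist = {n: 0.0 for n in tree_nodes}
        prev: Dict[str, str] = {}
        heap = [(0.0, n) for n in tree_nodes]
        heapq.heapify(heap)
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue              # stale heap entry
            if u in remaining:
                target = u
                break
            for v, w in graph.get(u, {}).items():
                if d + w < dist.get(v, float("inf")):
                    dist[v], prev[v] = d + w, u
                    heapq.heappush(heap, (d + w, v))
        if target is None:
            raise ValueError("some terminals are not connected")
        node = target                 # walk back to the tree, adding the path
        while node not in tree_nodes:
            tree_edges.add(frozenset((node, prev[node])))
            tree_nodes.add(node)
            node = prev[node]
        remaining.discard(target)
    return tree_edges


edges = [("glc", "R_glyc", 1.0), ("R_glyc", "pyr", 1.0), ("pyr", "R_ldh", 1.0),
         ("R_ldh", "lac", 1.0), ("glc", "R_slow", 5.0), ("R_slow", "lac", 5.0),
         ("atp", "R_glyc", 0.1), ("atp", "R_ldh", 0.1)]   # atp is a currency shortcut
graph = drop_nodes(build_graph(edges), excluded={"atp"})
backbone = steiner_heuristic(graph, terminals={"glc", "lac"})
print(sorted(tuple(sorted(edge)) for edge in backbone))
```

With ATP left in, the cheapest route between the two measured metabolites would cut through the currency node. Removing it first forces the backbone through pyruvate, which is the biochemically meaningful path.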

Punch differential

Per-taxon scoring that contrasts composition change with metabolic-activity change between two sample groups. Each taxon lands in one of four quadrants: driver (enriched + more active), passenger (enriched but not more active), protective (depleted but still more active per-capita — keystone loss), depleted (less of both).

Protective-quadrant taxa are the ones abundance-only differential methods miss, because their signal sits in per-capita activity rather than in composition. They are a direct source of keystone-candidate hypotheses.
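The quadrant assignment itself reduces to two sign tests. The sketch below assumes each taxon already has a composition change and a per-capita activity change between the two groups, and that zero is the cut-off for both; the function name, the taxon names, and the numbers are hypothetical.

```python
from collections import Counter


def punch_quadrant(delta_abundance: float, delta_activity: float) -> str:
    """Assign a taxon to a quadrant from the signs of its two changes.

    Using zero as the boundary on both axes is an assumption of this sketch.
    """
    enriched = delta_abundance > 0
    more_active = delta_activity > 0
    if enriched and more_active:
        return "driver"
    if enriched:
        return "passenger"
    if more_active:
        return "protective"   # depleted, yet more active per capita
    return "depleted"


# Toy (delta_abundance, delta_activity) pairs; not values from any study.
taxa = {
    "taxon_A": (1.2, 0.8),
    "taxon_B": (0.9, -0.1),
    "taxon_C": (-0.7, 0.5),
    "taxon_D": (-1.1, -0.4),
}

quadrants = {name: punch_quadrant(*deltas) for name, deltas in taxa.items()}
counts = Counter(quadrants.values())
print(quadrants)
print(counts)
```

An abundance-only test would flag taxon_C as simply depleted; the second axis is what moves it into the protective quadrant.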

Taxonomic rollup

Every analysis is repeated at each rank from strain through phylum. Scores aggregate correctly up the taxonomy, so a signal that looks weak at genus can reveal itself as a coherent family-level shift — and vice versa.

GTDB and NCBI name-shard ambiguity (Clostridium / Clostridium P / Clostridium Q) is collapsed at the parent genus before scoring, so the table doesn't split a single biological signal across three rows.
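Collapsing name shards before scoring might look like the sketch below. It assumes a shard is marked by a trailing space or underscore followed by a single capital letter, as in the Clostridium example, and that abundances are combined by summation; both are assumptions of this illustration, and the helper names and values are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict

# Trailing " P" or "_P" style placeholder suffix (assumed shard pattern).
SHARD_SUFFIX = re.compile(r"[ _][A-Z]$")


def parent_genus(name: str) -> str:
    """Strip a single-letter shard suffix, leaving the parent genus name."""
    return SHARD_SUFFIX.sub("", name.strip())


def collapse_shards(abundance: Dict[str, float]) -> Dict[str, float]:
    """Sum abundances of all shards under their parent genus."""
    collapsed: Dict[str, float] = defaultdict(float)
    for name, value in abundance.items():
        collapsed[parent_genus(name)] += value
    return dict(collapsed)


# Toy relative abundances for one sample.
sample = {
    "Clostridium": 0.10,
    "Clostridium P": 0.05,
    "Clostridium_Q": 0.02,
    "Akkermansia": 0.30,
}

collapsed = collapse_shards(sample)
print(collapsed)
```

The three Clostridium rows become one, so downstream scoring sees a single genus rather than three partial signals.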

worked example

Erawijantari et al. 2020 — faecal microbiome & metabolome after gastric cancer surgery.

42 post-gastrectomy patients vs 54 healthy controls. Total gastrectomy removes the stomach's acid barrier, exposing the colonic community to substrates that don't normally reach it. The phenotype should be a community-wide metabolic activation, not a few-taxon bloom. GeMMA surfaces exactly that — and names the taxa doing it.

Headline — genus rank, default parameters

  • Principal-component separation by group is unambiguous: PC1 explains 15.5% of variance, Welch t = −10.24, p = 5.4 × 10⁻¹⁴ between Gastrectomy and Healthy.
  • 26 of 37 testable genera (≈70%) show positive Δpunch, the community-wide metabolic-activity elevation the acid-barrier-loss mechanism predicts.
  • Punch quadrant composition: 16 driver · 10 passenger · 10 protective · 1 depleted. The depleted quadrant is effectively empty, which rules out "the community collapsed" as an explanation.
  • Punch and ALDEx2 largely disagree on which taxa are significant: ALDEx2 picks up abundance shifts, punch picks up per-capita activity shifts. A community-wide secretome elevation is a distributed phenomenon; it lives in the shape of the Δpunch distribution, not at an FDR tail.
  • Concordance runs over 137 pathways; silent labels concentrate in pathways the model predicts but the metabolomics panel doesn't cover, which is the expected coverage-limit signal rather than a biological gap.
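For readers who want to see what the Welch statistic in the first headline item measures, here is a minimal standard-library sketch. The PC1 scores are toy numbers invented for the example, not data from the study, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's unequal-variance t statistic and its approximate degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch–Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Toy PC1 scores for two groups (made up for illustration).
group_a = [-2.1, -1.8, -2.5, -1.9]
group_b = [1.0, 1.4, 0.7, 1.2, 0.9]

t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 2), round(dof, 2))
```

A negative t simply means the first group sits lower on PC1 than the second; the magnitude reflects how far apart the group means are relative to their pooled uncertainty.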

Top-|Δpunch| genera (biological reading)

A mix of short-chain-fatty-acid producers and mucin specialists — consistent with the substrate-exposure story.

  • Fusicatenibacter — SCFA producer, expanded under increased carbohydrate exposure.
  • Gemmiger — butyrate producer responding to altered colonic substrate supply.
  • Anaerostipes — lactate-utilising butyrate producer; picks up where upstream fermenters leave off.
  • Akkermansia — classic mucin degrader thriving in the altered gut environment.
  • Eubacterium — broad SCFA producer genus reflecting community-wide activation.

Numbers above were reproduced end-to-end on platform commit 5683ce3 with default UI parameters (genus rank, punch permutation FDR ≤ 0.2). Validation record in docs/case_studies/erawijantari_validation.md; figures embedded in the case-study guide are the ones produced by this run.

Cohort

Samples: 96 paired
Microbiome: MetaPhlAn species
Metabolomics: Commercial LC-MS, HMDB-annotated
Contrast: Gastrectomy vs Healthy
Runtime: ≤ 2 min
Rank used: genus (37 taxa)

The dataset is a built-in demo on the Forge platform, where the full analysis runs in two minutes or less. Running the same analysis on your own paired microbiome–metabolome cohort is fully reproducible: the RNG seed, pinned VMH vocabulary, and container checksums are captured on every run.

positioning

How GeMMA differs from the methods you already use.

Pitched for the sceptical bioinformatician. GeMMA isn't a replacement for your differential-abundance or pathway-prediction toolchain; it answers a question the others don't ask. Where another tool does something GeMMA doesn't, we say so.

Capability | GeMMA | MaAsLin2 | HUMAnN3 | MIMOSA2 | MelonnPan
Compositional differential abundance (feature ↔ metadata) | ✓ via CLR+Wilcoxon | ✓ GLM / mixed models (broader) | - | - | -
Pathway-level abundance from shotgun reads | community-weighted reactions | - | ✓ MetaCyc, read-derived | - | -
Joint microbiome ↔ metabolome framing | ✓ mechanistic (GSMM) | - | - | ✓ CMP-regression framing | ✓ elastic-net prediction
Identifies keystone-candidate taxa composition misses | ✓ protective quadrant via residuals | - | - | - | -
Concordance between capacity and observation | ✓ silent / active / exogenous / absent | - | - | partial | -
Mechanistic subnetwork (not just pathway list) | ✓ Steiner tree over measured metabolites | - | - | - | -
Requires paired metabolomics | yes (core framing) | no | no | yes | training cohort only

MaAsLin2 and HUMAnN3 answer composition and functional-prediction questions without using your metabolomics; MIMOSA2 and MelonnPan model the microbiome ↔ metabolome coupling statistically. GeMMA is the only one that uses a genome-scale metabolic reconstruction of the actual community — and therefore the only one where a silent pathway is mechanistically distinguishable from an unmapped one.

how to engage

Three paths, depending on what you need.

Self-serve on Forge

Upload your data, configure the run, read the results. Suited to teams with their own bioinformatics capacity who want the tool under their control.

Request Forge access →

Commercial engagement

We run the analysis end-to-end against your cohort and deliver interpretation, figures, and methods text ready for your manuscript or internal decision. Typically 4–8 weeks from data receipt to deliverable.

Schedule a scoping call →

Academic collaboration

For PIs who want GeMMA in a paper. Co-authorship, methods writing, figure preparation, reviewer-response support.

Discuss a collaboration →
about

Built for publication-grade reproducibility.

Every GeMMA run captures an RNG seed, is anchored to a pinned VMH metabolite vocabulary (SHA-verified on load), and records its parameters on the analysis record. Benchmarks across 14 curated microbiome–metabolome studies and a sensitivity sweep on the core parameters ship as management commands — intended as a reviewer's first stop.
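The checksum-on-load and seeded-RNG pattern described here can be sketched with the standard library alone. This is an illustration of the general technique, not GeMMA's code: the function names, the file name, the file contents, and the seed value are all hypothetical.

```python
import hashlib
import random
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_pinned(path: Path, expected_sha256: str) -> bytes:
    """Return the file contents only if the digest matches the pinned value."""
    if sha256_of(path) != expected_sha256:
        raise ValueError(f"checksum mismatch for {path.name}")
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    vocab = Path(tmp) / "vocabulary.tsv"          # hypothetical file name
    vocab.write_bytes(b"id\tname\nglc_D\tD-glucose\n")
    pinned = sha256_of(vocab)                     # digest recorded at pin time
    data = load_pinned(vocab, pinned)             # loads cleanly

    vocab.write_bytes(b"tampered")                # simulate a changed file
    try:
        load_pinned(vocab, pinned)
        tamper_detected = False
    except ValueError:
        tamper_detected = True

# A recorded seed makes any stochastic step repeatable.
seed = 20240101                                   # hypothetical seed value
first = random.Random(seed).random()
second = random.Random(seed).random()
print(tamper_detected, first == second)
```

Failing loudly on a digest mismatch is the point of the design: a silently updated vocabulary would otherwise change results without changing any recorded parameter.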

Preprint in preparation. For the method's underlying work and related essays see insilijo.github.io.