Omics data,
without the
data wrangling.

Scattered biomedical datasets: structured, queryable, and AI-ready.

Built for
Comp bio · Bioinformatics · Translational · BD / Strategy · Agents
The Data Problem

The data is free.
The context isn’t.

Key omics datasets live in fifteen places.
GEO · GWAS Catalog · dbGaP · PubMed · ArrayExpress · GTEx · Open Targets · +8 more
The same biology shows up under five names.
Alzheimer's disease · AD · late-onset AD · MONDO:0004975 · DOID:10652 · EFO_0000249
Getting to usable data takes longer than the analysis itself.
week 1: find · week 4: map · week 8: harmonize · then the decision

We make public omics data usable.

The Solution

An expanding set of products,
built to work together.

Each structures a different part of the public biomedical data landscape. Together, they form Devano’s Data Layer: harmonized, queryable, and ready for scientists, workflows, and AI agents.

1.3M samples
GEOx

Gene Expression Omnibus, structured for discovery. Study and sample metadata harmonized across disease, tissue, perturbation, arm role, and experimental design — so GEO becomes queryable instead of free text.

9M associations
QTLx

QTL studies and association tables, structured from papers and supplements. Tissues, cohorts, modalities, and analysis types are harmonized so variant-to-function signals can be searched and compared across studies.

186k studies
GWASx

GWAS Catalog, structured for downstream use. Traits, diseases, ancestry, sample sizes, and association fields are parsed and ontology-mapped so queries work across inconsistent study labels.

live updates
LITx

Biomedical literature, continuously scored and structured. Papers are mapped to genes, diseases, topics, and custom criteria so teams can track what matters without manual triage.

One data layer across multiple omics.

Query across products
Start with a disease, gene, tissue, variant, or paper and move across GEOx, QTLx, GWASx, and LITx without rebuilding the same mappings each time.
Shared structure across sources
Disease, tissue, gene, trait, and study labels are harmonized across products, so the same biology does not fragment across five names.
Literature stays connected
New papers are scored, tagged, and linked back to structured data products, so public-data workflows stay current instead of freezing at the last release.
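
The shared-structure idea can be illustrated with a minimal sketch: a lookup that resolves the Alzheimer's labels listed earlier on this page to one ontology ID. The `SYNONYMS` table and the `normalize_disease` helper are hypothetical names written for illustration; they are not Devano's implementation or API, and a real mapping would rely on curated ontology cross-references rather than a hand-written dictionary.

```python
from typing import Optional

# Hypothetical label-to-ontology lookup; not Devano's actual mapping logic.
# The labels and IDs are the Alzheimer's examples listed earlier on this page.
CANONICAL_ID = "MONDO:0004975"

SYNONYMS = {
    "alzheimer's disease": CANONICAL_ID,
    "ad": CANONICAL_ID,
    "late-onset ad": CANONICAL_ID,
    "mondo:0004975": CANONICAL_ID,
    "doid:10652": CANONICAL_ID,
    "efo_0000249": CANONICAL_ID,
}


def normalize_disease(label: str) -> Optional[str]:
    """Return the canonical ontology ID for a label, or None if it is unknown."""
    # Normalize case, surrounding whitespace, and curly apostrophes before lookup.
    key = label.strip().lower().replace("\u2019", "'")
    return SYNONYMS.get(key)


labels = ["Alzheimer's disease", "AD", "late-onset AD", "DOID:10652", "EFO_0000249"]
resolved = {normalize_disease(label) for label in labels}
print(resolved)  # every label collapses to the same ID
```

Once every source label resolves to one ID, a query keyed on that ID reaches all the studies, however they were originally annotated.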
Devano’s Data Layer
PD
Parkinson's disease
MONDO:0005180 · neurodegenerative disorder
GEOx · 690+ PD samples
QTLx · eQTL, sQTL
GWASx · rs356182
LITx · SNCA pathway
GEOx · Substantia nigra · 13 studies · case/control
QTLx · SN eQTL/sQTL · Guelfi 2020, N=117
GWASx · rs356182 · SNCA · p < 10^-169, N=2.5M
LITx · SNCA top-ranked · major PD GWAS
GEOx, QTLx, GWASx, and LITx: harmonized by disease, tissue, gene, and provenance. Structured for scientists, workflows, and agents.
Data Products In Action

Usable data, traceable to the source.

Queries that normally require scraping, cleaning, and manual review return structured results in one call. Every output stays linked to the source study, cohort, paper, and curation status behind it.

GEOx What it is
GEO is one of the richest public transcriptomics resources, but its metadata was built for archiving, not analysis.
Devano structures study and sample context across disease, tissue, perturbation, arm role, and experimental design so teams can query GEO directly instead of manually interpreting free text.
37.3K studies indexed
1.26M samples with structured metadata
SNCA in Parkinson's
3 case-control contrasts · human substantia nigra
GSE114517 · PD+dementia vs ctrl · SN · 17/12 · inferred
GSE133101 · PD vs healthy · SN · 15/10 · inferred
GSE114918 · SNc DA neurons · LCM · inferred
Matched on age, sex, RIN where reported. Curation status preserved per contrast.
Ontology MONDO:0005180 · Parkinson disease BTO:0000143 · substantia nigra

Powerful on their own. Stronger together.

Connect

Plug in any way you work.

Wire it to an agent, call it from code, or query it directly. Same structured data, same provenance, same context — however your team works.

Agents need more than access.
They need usable context.

Connect any agent to Devano and it can call structured tools instead of scraping raw sources. Search studies, pull contrasts, traverse associations, and return source-linked outputs with provenance attached.

Connect to agents
agent · tool calls
geo-index MCP
search_studies(disease="Parkinson", tissue="substantia nigra")
GSE114517 lncRNA LINC-PINT in PD substantia nigra
GSE133101 circRNAs in human PD brains
GSE114918 RNA-seq of SNc dopamine neurons
list_contrasts(disease="Parkinson", contrast_type="case_control")
GSE114517 · 17 cases / 12 controls
GSE133101 · 15 cases / 10 controls
find_similar_contrasts(contrast_id=2775)
PD · SN seed
PD · SN nuclei 0.91
LBD · SN 0.91
AD · SN 0.88

One agent. Different questions. Different shapes of structured context.
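
As a rough sketch of what sits under a tool call like the ones above: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The tool name and arguments below are copied from the example; the `build_tool_call` helper and the request `id` are illustrative assumptions, and the schema Devano's server actually exposes may differ.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request in the shape MCP uses to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # arbitrary ID chosen for this illustration
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Tool name and arguments taken from the search_studies example above.
request = build_tool_call(
    "search_studies",
    {"disease": "Parkinson", "tissue": "substantia nigra"},
)
print(json.dumps(request, indent=2))
```

In practice an agent framework or MCP client library builds this message for you; the sketch only shows why a structured tool is easier for an agent to call than a scraped web page.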

Integrate into workflows
pipeline.py
from devano.mcp import geo_index
# Find every PD case-control contrast in substantia nigra
contrasts = geo_index.list_contrasts(
  disease="Parkinson",
  tissue="substantia nigra",
  contrast_type="case_control",
)
for c in contrasts:
  print(f"{c.gse_id}: {c.arm_a_count} cases / {c.arm_b_count} controls")
✓ 3 contrasts · status: direct/inferred · 0.3s
GSE114517: 17 cases / 12 controls
GSE133101: 15 cases / 10 controls
GSE114918: 5 cases / 26 controls
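
A possible next step in such a pipeline, sketched without the Devano client so it runs on its own. The `Contrast` dataclass is a hypothetical stand-in that reuses the field names from the snippet above (`gse_id`, `arm_a_count`, `arm_b_count`) and the counts from its output. The `filter_by_arm_size` helper and its threshold of 10 are arbitrary illustrations, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contrast:
    """Hypothetical stand-in for a returned contrast record."""
    gse_id: str
    arm_a_count: int  # cases
    arm_b_count: int  # controls


# Counts copied from the example output above.
contrasts = [
    Contrast("GSE114517", 17, 12),
    Contrast("GSE133101", 15, 10),
    Contrast("GSE114918", 5, 26),
]


def filter_by_arm_size(items, min_arm_size):
    """Keep contrasts where both arms have at least min_arm_size samples."""
    return [c for c in items if min(c.arm_a_count, c.arm_b_count) >= min_arm_size]


kept = filter_by_arm_size(contrasts, min_arm_size=10)
for c in kept:
    print(f"{c.gse_id}: {c.arm_a_count} cases / {c.arm_b_count} controls")
```

Because the arm counts arrive as typed fields rather than free text, a sample-size filter like this is one line instead of a parsing step.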

One call instead of
another custom scraper.

Replace scattered GEO scraping, hand-curated case/control matching, and ad hoc metadata cleanup with structured calls to Devano products. Results come back with normalized fields, source links, and provenance attached.

Work with the complete data landscape, not whatever your pipeline happened to reach.

Why Devano

Not a knowledge base. Not a score. Infrastructure.

Biology has never had a live public-data layer. Devano builds one.
Structured, connected data — ready for scientists and agents to use directly.

Other genetic insight platforms

  • Static, versioned releases

    Periodic snapshots. New studies and current literature unavailable until the next release.

  • Fixed UI, fixed questions

    You browse their interface. Custom queries require custom engineering.

  • Pre-scored, pre-interpreted

    Evidence aggregated into a number. Provenance and raw associations abstracted away.

  • Not agent-queryable

    Designed for humans clicking a web interface. AI agents can't call it or cross-reference across indexes.

Devano
  • Continuously improved

    Data is continuously updated, corrected, and refined — not frozen in periodic releases.

  • Cross-index in one query

    GWAS, QTL, GEO, and literature pre-aligned to a single ontology. One call replaces scattered scraping and ad hoc metadata cleanup.

  • Full provenance, raw associations

    Every finding traces to source study, cohort, platform, and variant. Answers you can defend.

  • Plug in any way you work

    Query via MCP agent, direct API, or code. Structured data, no fixed interface, no fixed questions.

  • Built by scientists who ran these analyses themselves

    Decades at 23andMe, UCLA, Stanford, and Google, with years spent doing exactly the SOFT parsing, metadata curation, and batch correction this index eliminates.

Performance

How Devano compares.

A structural comparison of how different approaches handle study discovery, schema accuracy, cross-index linkage, and agent access.

Devano: ✓ curated · ✓ validated · ✓ unified · ✓ live · ✓ MCP
DIY agent (PubMed + raw): partial discovery · error-prone · no cross-index · manual updates · build it yourself
Genetics data platforms: limited scope · ✓ schema · siloed · versioned · partial agent
Manual research: incomplete · varies · manual · lag · no agents

Put biological data to work.

Start a project or talk with our team.

Piloting with biotech and academic teams • Enterprise-grade security • Expert implementation support