Somewhere in 200+ million published papers, the solution to your problem already exists.

It was published in a journal you didn't think to look at, using a different vocabulary, in a field you're not considering. We're building the system to find it.

All knowledge, connected.

The Problem

Solutions hiding in plain sight

The knowledge that could solve your hardest problem may already exist. Just not in your field.

Materials Science ↔ Marine Biology

Sea cucumbers switch between rigid and soft states in seconds by regulating protein bridges between collagen fibrils in their skin, with no structural damage to the tissue.

Variable-stiffness nanocomposites for neural implants: stiff enough to insert through the skull, soft enough to match brain tissue once inside, reducing scarring. (1) Capadona JR, Shanmuganathan K, Tyler DJ, Rowan SJ, Weder C. Stimuli-Responsive Polymer Nanocomposites Inspired by the Sea Cucumber Dermis. Science 319(5868):1370–1374, 2008.

Distributed Systems ↔ Entomology

Ant colonies route food using pheromone trails that self-reinforce successful paths and fade from failed ones, with no central coordinator needed.

Fault-tolerant distributed routing algorithms that adapt to network failures without a single point of control. (2) Di Caro G, Dorigo M. AntNet: Distributed Stigmergetic Control for Communications Networks. Journal of Artificial Intelligence Research 9:317–365, 1998.

Robotics ↔ Biomechanics

Cockroaches stabilize over rough terrain using passive mechanical compliance. Their legs absorb and redirect forces without neural input.

Legged robot designs that don't need to compute every step, cutting the cost and latency of locomotion control. (3) Full RJ, Koditschek DE. Templates and Anchors: Neuromechanical Hypotheses of Legged Locomotion on Land. J. Experimental Biology 202:3325–3332, 1999. See also: Altendorfer R et al. RHex: A Biologically Inspired Hexapod Runner. Autonomous Robots 11:207–213, 2001.

Each of these connections was found by someone who happened to read outside their field. Most are never found at all. No researcher can monitor millions of papers across thousands of disciplines. The answer to your problem may be sitting in a journal you have never heard of, described in a vocabulary you would not recognize, published decades before you started asking the question. If months of literature search could be compressed into minutes of queries, research in every field would move faster.


David Fajgenbaum was diagnosed with a rare, fatal disease and given a 35% chance of survival. Rather than accept this, he spent nights systematically mining the published medical literature, field by field, keyword by keyword, looking for patterns no specialist had connected. He found a drug approved for organ transplants that targets the same cellular pathway driving his disease. It saved his life.

His CROWN database now does this systematically for rare diseases. (4) Fajgenbaum DC. Chasing My Cure (Ballantine Books, 2019). CROWN (Collaboration to Overcome Rare Diseases) database: cdcnetwork.org

Omnigraph builds the knowledge graph to do it for everything: ingesting the literature at scale, extracting structured knowledge across disciplines, and surfacing the connections no one thought to look for.

How It Works

The pipeline

01

Ingest

Papers from open-access databases (OpenAlex, PubMed, arXiv, Semantic Scholar), ingested at scale.
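The ingest step can be sketched against the OpenAlex works endpoint, which the pipeline already names. A minimal Python sketch using OpenAlex's documented cursor pagination; the filter expression and contact address are illustrative placeholders, not the project's actual configuration:

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"

def build_works_url(filter_expr: str, cursor: str = "*",
                    per_page: int = 200,
                    mailto: str = "you@example.org") -> str:
    """Build one page of an OpenAlex /works query.

    OpenAlex uses cursor pagination: pass cursor="*" for the first
    page, then the `meta.next_cursor` value from each response to get
    the next. `mailto` is a placeholder; OpenAlex asks callers to
    include a contact email.
    """
    params = {
        "filter": filter_expr,
        "per-page": per_page,
        "cursor": cursor,
        "mailto": mailto,
    }
    return f"{OPENALEX_WORKS}?{urlencode(params)}"

# Illustrative query: works from 2020 onward (the concept filter and
# date are made-up examples, not the project's real ingest criteria).
url = build_works_url("from_publication_date:2020-01-01")
```

Fetching each page and following `meta.next_cursor` until it is null walks the full result set.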

02

Extract

AI reads each abstract and extracts structured knowledge: entities, capabilities, and challenges, with their domain context.
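A minimal sketch of what one parsed extraction might look like. The JSON schema and field names are assumptions modeled on the entities/capabilities/challenges description above, not the project's actual schema; the model call itself is omitted:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Extraction:
    """Structured knowledge pulled from one abstract."""
    domain: str
    entities: list = field(default_factory=list)
    capabilities: list = field(default_factory=list)
    challenges: list = field(default_factory=list)

def parse_extraction(raw: str) -> Extraction:
    """Validate the JSON an extraction model returns for one abstract."""
    data = json.loads(raw)
    return Extraction(
        domain=data["domain"],
        entities=data.get("entities", []),
        capabilities=data.get("capabilities", []),
        challenges=data.get("challenges", []),
    )

# A hypothetical model response for the sea-cucumber abstract:
raw = '''{
  "domain": "marine biology",
  "entities": ["sea cucumber dermis", "collagen fibrils"],
  "capabilities": ["reversible stiffness switching"],
  "challenges": ["avoiding structural damage during state change"]
}'''
record = parse_extraction(raw)
```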

03

Embed

Semantic embeddings capture conceptual meaning across vocabularies, so "pheromone signaling" and "decentralized load balancing" land near each other in vector space.
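Cosine similarity over embedding vectors is the standard way to measure that closeness. A toy sketch in plain Python; the three-dimensional vectors are made-up illustrations, since real embeddings have hundreds of dimensions and come from a trained model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-d vectors: two concepts that share meaning across
# vocabularies land close together...
pheromone_signaling = [0.9, 0.1, 0.4]
load_balancing = [0.8, 0.2, 0.5]
# ...and an unrelated concept lands far away.
granite_weathering = [0.1, 0.9, 0.1]

near = cosine(pheromone_signaling, load_balancing)
far = cosine(pheromone_signaling, granite_weathering)
```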

04

Build the graph

Entities, capabilities, and challenges become nodes and edges. The graph grows as new papers are processed.
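A minimal sketch of that node-and-edge structure using a plain adjacency map; the typed node keys and edge semantics are assumptions based on the description above, not the project's actual graph model:

```python
from collections import defaultdict

def add_extraction(graph, entities, capabilities, challenges):
    """Fold one paper's extraction into a typed adjacency map.

    Assumed edge semantics: an entity exhibits a capability, and a
    challenge is linked to capabilities as candidate solution matches.
    """
    for entity in entities:
        for cap in capabilities:
            graph[("entity", entity)].add(("capability", cap))
    for challenge in challenges:
        for cap in capabilities:
            graph[("challenge", challenge)].add(("capability", cap))
    return graph

graph = defaultdict(set)
# The sea-cucumber paper contributes an entity and a capability...
add_extraction(graph,
               entities=["sea cucumber dermis"],
               capabilities=["reversible stiffness switching"],
               challenges=[])
# ...and the graph keeps growing as later papers are processed.
add_extraction(graph,
               entities=["stimuli-responsive nanocomposite"],
               capabilities=["reversible stiffness switching"],
               challenges=["minimizing implant scarring"])
```

Because both papers mention the same capability, the two entities now share a neighbor, which is exactly the kind of bridge the discovery step looks for.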

05

Discover connections

Three complementary methods: semantic similarity across embedding space, property-based graph traversal, and entity-neighbor exploration. Candidates are ranked and surfaced for human review.
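The semantic-similarity leg of this step can be sketched as a ranked scan over cross-domain node pairs; the graph-traversal and neighbor-exploration legs are omitted, and all names and vectors below are toy examples:

```python
import math
from itertools import combinations

def _cos(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def rank_cross_domain(nodes, embeddings, top_k=5):
    """Rank node pairs from *different* domains by similarity.

    Same-domain pairs are skipped: only cross-domain candidates are
    interesting here. Returns (name_a, name_b, score), best first.
    """
    scored = []
    for (name_a, dom_a), (name_b, dom_b) in combinations(nodes, 2):
        if dom_a == dom_b:
            continue
        scored.append(
            (name_a, name_b, _cos(embeddings[name_a], embeddings[name_b]))
        )
    scored.sort(key=lambda t: -t[2])
    return scored[:top_k]

# Toy data: the entomology/systems pair should outrank the rest.
nodes = [
    ("pheromone trail", "entomology"),
    ("packet router", "distributed systems"),
    ("ant nest", "entomology"),
]
embeddings = {
    "pheromone trail": [0.9, 0.1],
    "packet router": [0.8, 0.2],
    "ant nest": [0.1, 0.9],
}
ranked = rank_cross_domain(nodes, embeddings)
```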

On the roadmap
06

Intelligent discovery

Graph neural networks trained on the structure of known cross-domain transfers, learning which patterns in the graph predict real transfer value. Moves from retrieval to prediction.

07

Validate

Bayesian networks score and rank candidates against a model of prior successful transfers. LLMs generate specific mechanistic hypotheses: not just "these fields are similar" but "this mechanism could solve that problem, here is why."

08

Surface the frontier

Identify research gaps: problems that appear across many domains but have no solution anywhere in the literature. Map where cross-domain investment would have the highest impact.

The Proving Ground

Starting with evolutionary biology ↔ engineering

Biology and engineering are separated by extreme vocabulary fragmentation, yet the transfer potential is exceptional. Living systems have spent hundreds of millions of years solving the same problems engineers face: locomotion, structural integrity, sensing, communication, self-repair. The open-access data is there, and the examples above show the pattern already works.

Domain-agnostic by design: biology↔engineering today, medicine↔materials science tomorrow.

500K+

papers ingested (Year 1 target)

100+

validated cross-domain connections

Public

Explorer UI, open to all researchers

Year One Goals

What we aim to accomplish

These are directions, not claims of completion.

  1. Ingest and process 500,000+ papers across 7 domains
  2. Extract tens of thousands of entities, capabilities, and challenges
  3. Discover and validate 100+ novel cross-domain connections
  4. Produce at least one connection compelling enough to spark a real collaboration
  5. Open-source the complete pipeline: ingestion, extraction, graph construction, discovery
  6. Launch a public Explorer UI
  7. Publish a weekly Discovery Feed

Why Now

Three things make this possible today

Open academic data at scale

OpenAlex (launched 2022) provides free API access to 250 million+ works with full metadata. (5) Priem J, Piwowar H, Orr R. OpenAlex: A fully-open index of the world's research output. bioRxiv, 2022. doi:10.1101/2022.01.01.474734 The corpus required to build this graph now exists and is openly accessible.

AI that can read and extract

Large language models can extract structured knowledge from unstructured abstracts: entities, relationships, and mechanisms, at a quality that was not achievable before 2023.

Embeddings that bridge vocabulary gaps

"Ant colony pheromone signaling" and "decentralized serverless load balancing" land near each other in vector space. Semantic similarity can now cross disciplinary vocabulary barriers.

The tools exist. The data exists. Someone just needs to connect them.