Academia · Highly Complex
Research Paper Discovery Engine
Real CITES edges, traversed and ranked by meaning
Vector Search · Typed CITES Edges · Cross-field Traversal · Hybrid Query
100 Papers · 10 Fields · Typed Citation Edges · All Tests Passed
The Scenario
Each paper's abstract is embedded as a vector, and each citation is authored as a typed CITES edge taken from the paper's reference list, with provenance recorded. A hybrid query seeds by abstract similarity, traverses the CITES edges (including across disciplines), and ranks the surviving frontier by meaning. A machine learning paper reaches a neuroscience paper because a real citation path connects them, and the neuroscience paper ranks highly because the two abstracts genuinely overlap. Every cross-field connection is explainable from the citations behind it.
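The seed, traverse, and rank steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the engine's implementation: the `hybrid_query` and `cosine` helpers, the toy three-dimensional embeddings, and the paper IDs are all hypothetical, and the traversal is limited to a single outgoing hop.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_query(query_vec, embeddings, cites, seed_k=2, top_k=3):
    """Seed by similarity, follow outgoing CITES edges one hop, rank the frontier."""
    # 1. Seed: the seed_k papers whose abstracts are closest to the query.
    seeds = sorted(
        embeddings,
        key=lambda p: cosine(query_vec, embeddings[p]),
        reverse=True,
    )[:seed_k]
    # 2. Traverse: every paper cited by a seed forms the frontier.
    frontier = {cited for s in seeds for cited in cites.get(s, [])}
    # 3. Rank: order the frontier by similarity to the query.
    ranked = sorted(
        frontier,
        key=lambda p: cosine(query_vec, embeddings[p]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical toy data: paper ID -> abstract embedding.
embeddings = {
    "ml_1": [1.0, 0.0, 0.1],
    "ml_2": [0.9, 0.1, 0.0],
    "neuro_1": [0.7, 0.7, 0.0],
    "chem_1": [0.0, 0.1, 1.0],
}
# Hypothetical CITES edges: citing paper -> cited papers.
cites = {"ml_1": ["neuro_1", "chem_1"], "ml_2": ["neuro_1"]}

results = hybrid_query([1.0, 0.0, 0.0], embeddings, cites)
print(results)
```

In this toy data the neuroscience paper is reachable only because the machine learning seeds cite it, and it outranks the chemistry paper because its embedding is closer to the query.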
Key Results
- Cross-field connections follow real CITES edges, then rank by meaning
- Every connection is explainable from the citations behind it
- Citations authored or bulk-imported with provenance, not inferred
- 10-field coverage with citation-grounded inter-field bridges
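One way to read "explainable from the citations behind it" is that a connection between two papers can be returned as the chain of CITES edges that links them, each carrying its provenance. The sketch below assumes a simple in-memory edge map and a hypothetical `explain_connection` helper that uses breadth-first search; the paper IDs are made up, and the engine itself may expose this differently.

```python
from collections import deque

def explain_connection(edges, start, goal):
    """Return the shortest chain of CITES edges from start to goal, or None."""
    # edges maps a citing paper to a list of (cited_paper, provenance) pairs.
    parents = {start: None}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        if paper == goal:
            break
        for cited, provenance in edges.get(paper, []):
            if cited not in parents:
                parents[cited] = (paper, provenance)
                queue.append(cited)
    if goal not in parents:
        return None  # no citation path, so no connection to explain
    # Walk back from the goal to the start to recover the path.
    path = []
    node = goal
    while parents[node] is not None:
        prev, provenance = parents[node]
        path.append((prev, "CITES", node, provenance))
        node = prev
    path.reverse()
    return path

# Hypothetical edges spanning three fields.
edges = {
    "ml_paper": [("stats_paper", {"source": "refs"})],
    "stats_paper": [("neuro_paper", {"source": "refs"})],
}

path = explain_connection(edges, "ml_paper", "neuro_paper")
for hop in path:
    print(hop)
```

Because the path is built only from stored edges, a pair of papers with no citation chain between them yields `None` rather than an inferred link.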
The Code
Everything above, in a few lines of Python.
python
# Assumes `client`, `paper_a`, `paper_b`, `citation_rows`, and `paper_embedding` are already defined.
# Citations are typed CITES edges, authored or bulk-imported with provenance.
client.graph.put_edge("papers", source=paper_a, target=paper_b,
                      edge_type="CITES", provenance={"source": "refs"})
client.graph.bulk_import_edges("papers", citation_rows, format="csv")

# Cross-field discovery: seed by similarity, traverse CITES, rank by meaning.
results = (
    client.graph.query("papers")
    .vector_similar(paper_embedding, k=20)
    .traverse("CITES", direction="outgoing")
    .vector_rank(paper_embedding, k=20)
    .return_nodes()
)