LLM Extraction (Bring Your Own Key)

Turn text chunks into typed entities and edges, on your terms

“What if you could turn raw text into a typed graph, on your own model and your own key?”

Model: OpenAI-compatible · Key: you own it · Cost: preview before spend · Default: off

Authoring every edge by hand does not scale when your structure is buried in prose. The relationships are there in the documents, but pulling them out one by one is slow. LLM extraction reads the text and proposes the typed entities and edges for you, so you start from a draft graph instead of a blank one.

It is bring-your-own-key. You point a hybrid collection at any OpenAI-compatible model and supply your own API key, base URL, and model name. SwarnDB does not embed a model or bill you for tokens; you own the key, the spend, and the choice of provider. You can also set an ontology, optionally seeded from a base template, so the extraction proposes the entity and edge types you actually care about.

Spending is predictable. Before any extraction runs, cost_preview estimates what a set of chunks will cost, so you see the bill before you commit to it rather than after. Nothing is extracted until you choose to start.

What comes back is not trusted blindly. Extracted entities and edges arrive as first-class typed edges with full provenance, and they enter the same curation loop as everything else: verify the ones you trust, reject the ones you do not, and re-extract with adjusted policies when needed. The whole feature is off by default and only active on hybrid collections you explicitly configure.

How LLM Extraction Works

Configure

Point the collection at any OpenAI-compatible model with set_llm_config: base URL, your own API key, model name, and generation settings. You supply and own the key; SwarnDB never embeds a model or bills you for tokens.

Set ontology

Define the entity and edge types you care about with set_ontology, optionally seeded from a base template. The extraction proposes structure that fits your domain instead of a generic guess.

Preview cost

Run cost_preview on your chunks to estimate the spend before committing. You see the bill in advance, so extraction never surprises you with a cost after the fact.

Extract & curate

Call start_extraction to produce typed entities and edges with full provenance, then verify, reject, or re-extract. The output enters the same curation loop as hand-authored edges, so human judgment stays in control.

What You Can Do

Capabilities

Any OpenAI-Compatible Model, Your Key

Point a hybrid collection at any OpenAI-compatible endpoint with your own base URL, API key, and model name. You own the key and the spend; SwarnDB does not embed a model or bill you for tokens. Use the provider and model that fit your budget and policy.

set_llm_config.py

client.extraction.set_llm_config(
    "articles",
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
    model_name="openai/gpt-4o-mini",
    temperature=0.0,
    max_tokens=2048,
)
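Because you own the key, you may prefer to keep it out of source files. The following is a minimal sketch, assuming the key and endpoint live in environment variables; the helper name and the variable names (LLM_BASE_URL, LLM_API_KEY, LLM_MODEL_NAME) are hypothetical and not part of SwarnDB. It only builds the keyword arguments shown in the example above.

```python
import os


def llm_config_from_env(env=os.environ):
    """Build keyword arguments for set_llm_config from environment variables.

    The variable names are assumptions made for this sketch; use whatever
    names your deployment already defines.
    """
    required = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "base_url": env["LLM_BASE_URL"],
        "api_key": env["LLM_API_KEY"],
        "model_name": env["LLM_MODEL_NAME"],
        # Generation settings mirror the example above.
        "temperature": 0.0,
        "max_tokens": 2048,
    }
```

With a connected client, the result could then be passed along as client.extraction.set_llm_config("articles", **llm_config_from_env()).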

Cost Preview Before You Spend

Estimate the cost of extracting a set of chunks before any tokens are spent. cost_preview returns an estimate so you can decide whether to proceed, set budgets, and avoid surprises. Nothing is extracted until you explicitly start it.

cost_preview.py

# `chunks` is assumed to hold the text chunks you prepared earlier; it is not defined here.
estimate = client.extraction.cost_preview("articles", chunks)
print(f"Estimated cost: ${estimate.estimated_cost_usd}")

Typed Entities & Edges with Provenance

Extraction produces typed entities and edges that carry full provenance: the source document, the source chunk, the model, and a confidence value. The result is the same first-class typed edge you would author by hand, so you can trace every extracted fact back to the text it came from.

start_extraction.py

client.extraction.set_ontology("articles", base_template="research-papers", replace=False)
# `chunks` is assumed to hold the text chunks you prepared earlier; it is not defined here.
result = client.extraction.start_extraction("articles", chunks)
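To make the provenance fields concrete, here is a small sketch of how extracted edges might be modeled and traced on the client side. The ExtractedEdge shape and its field names are assumptions drawn from the description above (source document, source chunk, model, confidence); they are not SwarnDB's actual schema, and the shape of `result` is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedEdge:
    # Hypothetical shape: field names are illustrative, not SwarnDB's schema.
    source: str
    edge_type: str
    target: str
    source_document: str
    source_chunk: str
    model: str
    confidence: float


def edges_by_document(edges):
    """Group edges by the document they were extracted from."""
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge.source_document].append(edge)
    return dict(grouped)
```

Grouping by document is one convenient review order: a curator can open a single source and check every edge proposed from it in one pass.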

Verify / Reject / Re-Extract Loop

Extracted edges are proposals, not verdicts. Verify the ones you trust, reject the ones you do not, and re-extract with adjusted policies when the output is not what you want. Human judgment stays in control of the graph.
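Since each proposal carries a confidence value, a reviewer's queue can be ordered before anyone looks at it. The sketch below is an assumption-heavy illustration: the Proposal shape, the triage helper, and the thresholds 0.9 and 0.5 are all hypothetical. It only sorts proposals into buckets and does not verify or reject anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    # Hypothetical shape: an edge identifier plus the extraction confidence.
    edge_id: str
    confidence: float


def triage(proposals, verify_at=0.9, reject_below=0.5):
    """Sort proposals into review buckets by confidence.

    Nothing is verified or rejected here; the buckets only order the
    reviewer's queue. Thresholds are arbitrary defaults for the sketch.
    """
    if reject_below > verify_at:
        raise ValueError("reject_below must not exceed verify_at")
    buckets = {"verify_candidates": [], "needs_review": [], "reject_candidates": []}
    for proposal in proposals:
        if proposal.confidence >= verify_at:
            buckets["verify_candidates"].append(proposal)
        elif proposal.confidence < reject_below:
            buckets["reject_candidates"].append(proposal)
        else:
            buckets["needs_review"].append(proposal)
    return buckets
```

Keeping the buckets as candidates rather than decisions leaves the final verify or reject call with a person, in line with the loop described above.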

Off by Default

Extraction is off by default and only active on hybrid collections you explicitly configure. You supply and own the key, choose the model, and decide when to run it. A collection you do not configure for extraction is never touched.

A Draft Graph, On Your Own Key, That You Curate

Structure is often locked inside prose. A corpus of papers is full of CITES relationships; a pile of incident reports is full of who-did-what-to-whom; a knowledge base is full of entities and the links between them. Authoring all of that by hand is slow, and skipping it leaves the graph empty. LLM extraction reads the text and hands you a draft graph to work from.

It is bring-your-own-key, deliberately. You configure the collection with your own base URL, API key, and model name, so you choose the provider, control the spend, and keep the key. SwarnDB does not ship a model or charge you for tokens. You can also set an ontology, optionally from a base template, so the extraction proposes the entity and edge types your domain actually uses instead of a generic schema.

Cost comes first. cost_preview estimates what extracting a set of chunks will cost before a single token is spent, so you commit to a known number rather than discovering the bill afterward. Only when you call start_extraction does anything run.

And the output is a starting point, not the final word. Extracted entities and edges arrive as first-class typed edges with full provenance, and they enter the same verify, reject, and re-extract loop as everything else in the graph. You keep what is right, drop what is wrong, and re-extract when you need to. The feature stays off until you turn it on, on a hybrid collection you configure, with a key you own.

Insight: Off by default, your key, your model, your spend. Preview the cost first, then verify, reject, or re-extract, so the graph reflects your judgment, not raw model output.

byok_extraction.py

from swarndb import SwarnDBClient

with SwarnDBClient(host="localhost", port=50051) as client:
    # Bring your own OpenAI-compatible model and key.
    client.extraction.set_llm_config(
        "articles",
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",
        model_name="openai/gpt-4o-mini",
        temperature=0.0,
        max_tokens=2048,
    )
    client.extraction.set_ontology("articles", base_template="research-papers", replace=False)

    # See the bill before you spend a token.
    # `chunks` is assumed to hold the text chunks you prepared earlier; it is not defined here.
    estimate = client.extraction.cost_preview("articles", chunks)
    print(f"Estimated cost: ${estimate.estimated_cost_usd}")

    # Then extract typed entities and edges with provenance, and curate them.
    result = client.extraction.start_extraction("articles", chunks)
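If you want a number to compare the preview against, a back-of-envelope estimate is easy to compute yourself. The sketch below is not how cost_preview works internally; it assumes per-token pricing, a rough four characters per token for English text, and prices you look up from your own provider. It also ignores prompt overhead such as instructions and ontology text, so treat it as a lower bound for a sanity check.

```python
def rough_cost_usd(chunks, usd_per_million_input, usd_per_million_output,
                   expected_output_tokens_per_chunk, chars_per_token=4):
    """Approximate spend for sending `chunks` through a per-token-priced model.

    All parameters are supplied by the caller; none are read from SwarnDB.
    """
    # Input side: total characters converted to tokens with a rough ratio.
    input_tokens = sum(len(chunk) for chunk in chunks) / chars_per_token
    # Output side: a guessed completion size per chunk.
    output_tokens = expected_output_tokens_per_chunk * len(chunks)
    return (input_tokens * usd_per_million_input
            + output_tokens * usd_per_million_output) / 1_000_000
```

A large gap between this figure and the preview is a prompt to check the configured model name and max_tokens before starting extraction.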

Complete Example

Everything above, in one script.

llm_extraction_complete.py

from swarndb import SwarnDBClient

with SwarnDBClient(host="localhost", port=50051) as client:
    # 1. Point a hybrid collection at your own OpenAI-compatible model.
    client.extraction.set_llm_config(
        "articles",
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",
        model_name="openai/gpt-4o-mini",
        temperature=0.0,
        max_tokens=2048,
    )

    # 2. Set the entity and edge types you care about.
    client.extraction.set_ontology("articles", base_template="research-papers", replace=False)

    # 3. Preview the cost before spending a token.
    #    `chunks` is assumed to hold the text chunks you prepared earlier; it is not defined here.
    estimate = client.extraction.cost_preview("articles", chunks)
    print(f"Estimated cost: ${estimate.estimated_cost_usd}")

    # 4. Extract typed entities and edges with provenance, then curate them.
    result = client.extraction.start_extraction("articles", chunks)

Start building with LLM Extraction (Bring Your Own Key)

Clone the repo and explore this feature in minutes.
