Retail & E-commerce

E-commerce: Product Recommendations

Typed BOUGHT_WITH edges, price-filtered in one call

What if 'also bought' edges and semantic search lived in one engine, with the price filter applied in the same call?

"Customers also bought" is not a similarity guess; it is a fact about behavior. The strongest recommendation signal in e-commerce is the typed relationship between two products: this was bought with that, this was viewed in the same session as that. Those are edges you own, derived from order and clickstream events, not something a vector store should be left to infer.

SwarnDB lets you store both halves in one place. Embed each product description for semantic search; the product vector's id doubles as its graph node id. Then author the behavioral edges: BOUGHT_WITH and VIEWED_WITH, typed and directed, created from your order and session logs and bulk-imported as CSV or JSONL. Every edge carries provenance, so a "frequently bought together" claim traces back to the events that produced it.

The payoff is a recommendation that combines structure and meaning in one query. Seed candidates by description similarity, walk the BOUGHT_WITH edges to reach genuinely co-purchased items, then rank the surviving frontier by relevance, with a server-side price filter applied in the same call. Filter-then-search means you fix the qualified set (in stock, under $50) first, then rank exactly within it. One database, one round trip.

The Traditional Approach

The fragmented stack most teams cobble together today.

Amazon Personalize or Recombee: managed recommendation service ($$)
A graph store: explicit 'bought together' and 'viewed together' relationships
Elasticsearch: product search with keyword matching
PostgreSQL: product catalog and metadata storage
Custom ML pipeline plus ETL to derive and sync co-purchase edges

Behavioral edges live in one store, product vectors in another, joined in app code
Separate services for search, recommendations, and filtering require app-level stitching
Managed recommendation services cost $5,000-$50,000/month at scale
Price-filtered recommendations need multiple service calls and client-side joining
No single audit trail tying a 'bought together' edge back to the orders that produced it
Keeping the co-purchase graph in sync with the catalog is a standing ETL job

The SwarnDB Approach

One database. Every capability built in.

Typed Behavioral Edges

BOUGHT_WITH and VIEWED_WITH are explicit, directed, typed edges derived from your order and session events. Author them with put_edge or bulk-import them as CSV/JSONL. Each carries provenance back to the events that produced it.

Hybrid Query

Seed by description similarity, walk the behavioral edges to reach genuinely co-purchased items, then rank the surviving frontier by relevance. vector_similar, then traverse, then vector_rank, in one composable query.

Filter-Then-Search

Price ranges, categories, brands, and stock applied server-side as a pre-filter. Fix the qualified set first (in stock, $20 to $50), then rank exactly within it, returning the true top-k among qualified items.

Edge Provenance and Bulk Import

Bulk-import millions of co-purchase edges from CSV or JSONL. Every edge knows its type, confidence, and source events, so a 'frequently bought together' claim is auditable.

Behavioral edges and product vectors in one store, one object per product
'Also bought' is an authored, event-derived typed edge, not a similarity guess
Search, graph traversal, and price filter in one call, one round trip
Filter-then-search returns the true top-k among qualified items
Bulk-import co-purchase edges from CSV or JSONL with full provenance
No managed ML service and no second graph store to keep in sync

Section 01

01. Authored 'Also Bought' Edges

"Customers also bought" should reflect what customers actually bought together, not what two descriptions happen to sound like. So the relationship is an explicit, typed edge you derive from behavior. Roll up your order history into co-purchase pairs and write each as a BOUGHT_WITH edge; roll up session logs into VIEWED_WITH edges. Both are directed, typed, and carry provenance back to the events that produced them.

You do not author these one at a time. Bulk-import millions of co-purchase edges as CSV or JSONL in a single call. The product they hang off is the same object as the product vector, because the vector's id is the graph node id. There is no separate graph store to sync and no foreign key to maintain.
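The roll-up itself happens before anything reaches the database. A minimal sketch, in plain Python, of counting co-purchase pairs from an order log and shaping them as CSV rows. The order data, the `min_count` threshold, and the column names (`source`, `target`, `edge_type`, `co_count`) are assumptions for illustration; the column layout that `bulk_import_edges` expects is not specified here, so check it before importing.

```python
import csv
import io
from collections import Counter
from itertools import permutations

# Hypothetical order log: order id -> product ids in that order.
orders = {
    "order-1": ["speaker", "dress", "sunscreen"],
    "order-2": ["speaker", "dress"],
    "order-3": ["dress", "sunscreen"],
}

def copurchase_counts(orders):
    """Count, for each ordered pair (a, b), how many orders contained both."""
    counts = Counter()
    for items in orders.values():
        # set() so a product bought twice in one order counts once;
        # permutations yields (a, b) and (b, a) because edges are directed.
        for a, b in permutations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

def to_rows(counts, min_count=2):
    """Keep pairs seen in at least min_count orders, as flat edge rows."""
    return [
        {"source": a, "target": b, "edge_type": "BOUGHT_WITH", "co_count": n}
        for (a, b), n in sorted(counts.items())
        if n >= min_count
    ]

def rows_to_csv(rows):
    """Render edge rows as CSV text with a header row (assumed columns)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["source", "target", "edge_type", "co_count"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = to_rows(copurchase_counts(orders))
csv_text = rows_to_csv(rows)
```

The `min_count` threshold keeps one-off coincidences out of the graph. The same rows could be written as JSONL with one `json.dumps(row)` per line.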

Semantic similarity still has a job: it seeds and ranks. A brand-new product with no purchase history yet has a description, so description similarity gives it a sensible starting set of candidates until behavioral edges accumulate. But the headline "frequently bought together" claim is grounded in real events, and you can always trace it back through the edge's provenance.
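To make the cold-start case concrete, here is a small sketch of seeding by description similarity alone. It uses toy 3-dimensional vectors and a hand-written cosine function in place of real embeddings and the database's own search; the product names and numbers are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy embeddings standing in for real description embeddings.
catalog = {
    "outdoor-speaker": [0.9, 0.1, 0.0],
    "bluetooth-headphones": [0.6, 0.6, 0.1],
    "summer-dress": [0.0, 0.2, 0.9],
}

# A brand-new product: it has a description embedding but no edges yet.
new_product = [0.85, 0.2, 0.05]

def seed_candidates(query, catalog, k=2):
    """Return the k catalog ids whose embeddings are closest to the query."""
    ranked = sorted(catalog, key=lambda pid: cosine(query, catalog[pid]), reverse=True)
    return ranked[:k]

seeds = seed_candidates(new_product, catalog)
```

The new product lands next to the other audio gear even though no order mentions it yet. Once BOUGHT_WITH edges exist for it, those seeds become the starting points for traversal.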

Key insight: 'Frequently bought together' is an authored, event-derived typed edge, not a similarity guess. Every edge traces back to the orders that produced it.

also_bought.py

from swarndb import SwarnDBClient

client = SwarnDBClient(host="localhost", port=50051)

# Hybrid mode: product vectors and a typed product graph in one store.
client.collections.create(
    "products", dimension=384, distance_metric="cosine", mode="hybrid"
)

# Store a product. The insert id is its graph node id.
# product_embedding is the 384-dimension embedding of the product description.
speaker = client.vectors.insert("products",
    vector=product_embedding,
    metadata={"name": "Waterproof Outdoor Speaker", "price": 79.99,
              "category": "Electronics"}
)

# "Also bought" is a typed edge derived from orders, with provenance.
# dress_id is the node id of the co-purchased dress, returned by an earlier insert.
client.graph.put_edge("products", source=speaker, target=dress_id,
                      edge_type="BOUGHT_WITH",
                      provenance={"source": "orders", "co_count": 312})

# Bulk-import co-purchase edges from your order history.
client.graph.bulk_import_edges("products", copurchase_rows, format="csv")

Section 02

02. Cross-Category Recommendations

Cross-category recommendations are the holy grail of e-commerce. When a customer buying running shoes also sees a hydration pack, a fitness tracker, and a meal prep container, that is a coherent lifestyle recommendation that can increase cart size. Those connections are real because customers genuinely buy across categories, and your order history records it.

SwarnDB turns that history into typed BOUGHT_WITH edges that span categories. A recommendation query then scopes by structure and ranks by meaning: seed candidates by description similarity, walk the co-purchase edges to reach items genuinely bought alongside the seed, and rank the surviving frontier by relevance. The result reaches across categories because the edges do, and stays relevant because the ranking does.
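The three stages can be illustrated on an in-memory toy graph. This is a sketch of the plan's logic, not of SwarnDB's internals: the vectors, the edge list, and the choice to exclude seeds from the frontier unless an edge reaches them are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy 2-d embeddings: first axis "audio gear", second axis "apparel / leisure".
vectors = {
    "speaker": [1.0, 0.0],
    "headphones": [0.9, 0.1],
    "cooler": [0.6, 0.4],
    "dress": [0.1, 0.9],
    "novel": [0.0, 1.0],
}

# Directed BOUGHT_WITH edges: source -> products bought alongside it.
bought_with = {
    "speaker": ["dress", "cooler"],
    "headphones": ["cooler"],
    "novel": ["dress"],
}

def recommend(query, seed_k, rank_k):
    # 1. Seed: the seed_k products most similar to the query.
    seeds = sorted(vectors, key=lambda p: cosine(query, vectors[p]), reverse=True)[:seed_k]
    # 2. Traverse: follow outgoing BOUGHT_WITH edges from every seed.
    frontier = {target for seed in seeds for target in bought_with.get(seed, [])}
    # 3. Rank: order the frontier by similarity to the query, keep rank_k.
    return sorted(frontier, key=lambda p: cosine(query, vectors[p]), reverse=True)[:rank_k]

recs = recommend(vectors["speaker"], seed_k=2, rank_k=2)
```

The dress appears in the result only because an edge leads to it; by similarity alone it would rank near the bottom. The cooler ranks first because it is both co-purchased and close to the query.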

Every cross-category suggestion is explainable. You can show that this dress is recommended with this speaker because N customers bought them together, with the edge provenance to back it up, not because two product descriptions happened to share the word "outdoor." Structure earns the cross-sell; meaning keeps it relevant.
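As a sketch of what that explanation could look like in application code, the helper below turns an edge record into a sentence. The record shape mirrors the `put_edge` arguments used on this page; how an edge and its provenance are read back from the database is not covered here, so the dictionary is hand-built and the `explain` function is hypothetical.

```python
# Hand-built edge record; field names mirror the put_edge arguments.
edge = {
    "source": "speaker-001",
    "target": "dress-042",
    "edge_type": "BOUGHT_WITH",
    "provenance": {"source": "orders", "co_count": 312},
}

# Display names keyed by node id (illustrative ids and names).
names = {
    "speaker-001": "Waterproof Outdoor Speaker",
    "dress-042": "Summer Dress",
}

def explain(edge, names):
    """Build a human-readable reason for a recommendation from edge provenance."""
    provenance = edge["provenance"]
    return (
        f"{names[edge['target']]} is recommended with {names[edge['source']]} "
        f"because {provenance['co_count']} {provenance['source']} contained both."
    )

reason = explain(edge, names)
```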

Key insight: Cross-sells follow real co-purchase edges, then rank by relevance. Every suggestion is explainable from the orders behind its edge.

cross_category.py

# Cross-category recommendations: scope by structure, rank by meaning
result = (
    client.graph.query("products")
    .vector_similar(speaker_embedding, k=20)
    .traverse("BOUGHT_WITH", direction="outgoing")
    .vector_rank(speaker_embedding, k=10)
    .return_nodes()
)

# Each result was genuinely bought alongside the seed product
# AND ranks well by description relevance. Cross-category by behavior,
# not by a coincidental word overlap.
for node in result.nodes:
    print(node.id, node.label)

Section 03

03. Price-Filtered in One Call

Showing "recommended products under $50, in stock" usually means stitching a search service, a recommendation service, and a price filter together in app code. SwarnDB does it in one query, with the filter applied server-side and correctly.

The key is filter-then-search. SwarnDB fixes the qualified set first (in stock, priced $20 to $50), then ranks exactly within it. You get the true top-k among qualified items, not a top-k computed over everything and then trimmed, which silently drops qualified matches that ranked just outside the unfiltered top-k and can leave you with fewer than k results. The price constraint is a first-class part of the plan, not a post-filter.
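The difference is easy to see on a toy catalog. The sketch below compares the two orderings in plain Python; the products, prices, and relevance scores are invented, and the functions illustrate the idea rather than how SwarnDB executes a query.

```python
# Hypothetical catalog rows: (product id, price, relevance score for one query).
catalog = [
    ("speaker-pro", 129.0, 0.95),
    ("speaker-max", 89.0, 0.93),
    ("speaker-mini", 45.0, 0.90),
    ("speaker-go", 39.0, 0.70),
    ("speaker-lite", 25.0, 0.60),
]

def in_budget(item, low=20.0, high=50.0):
    """True when the item's price falls inside the inclusive range."""
    return low <= item[1] <= high

def by_score(items):
    """Sort items by relevance score, best first."""
    return sorted(items, key=lambda item: item[2], reverse=True)

def post_filter(catalog, k):
    """Rank everything, keep the top k, then drop items outside the range."""
    return [item[0] for item in by_score(catalog)[:k] if in_budget(item)]

def filter_then_search(catalog, k):
    """Fix the qualified set first, then rank within it."""
    qualified = [item for item in catalog if in_budget(item)]
    return [item[0] for item in by_score(qualified)[:k]]

trimmed = post_filter(catalog, k=3)
exact = filter_then_search(catalog, k=3)
```

With `k=3`, the post-filter returns a single product, because two of the three best-scoring items are over budget. Filter-then-search returns all three in-budget products, ranked.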

Combined with the hybrid builder, one call seeds by similarity, walks the co-purchase edges, ranks the frontier by relevance, and respects the price filter throughout. One round trip, one store, the correct answer.

Key insight: Filter-then-search fixes the in-budget set first, then ranks within it, returning the true top-k among qualified items. One round trip, not three services.

one_call.py

from swarndb.search import Filter

# Filter-then-search: fix the qualified set, then rank within it.
# One call returns the true top-k among in-budget products.
results = client.search.query("products",
    vector=product_embedding, k=10,
    filter=Filter.between("price", 20.0, 50.0)
)

# Or combine with the graph: co-purchase edges, ranked by relevance.
# Note: as written, this builder call does not attach the price filter.
recs = (
    client.graph.query("products")
    .vector_similar(product_embedding, k=50)
    .traverse("BOUGHT_WITH", direction="outgoing")
    .vector_rank(product_embedding, k=10)
    .return_nodes()
)
# One round trip. One database. The correct top-k.

Section 04

04. The Full Recommendation Pipeline

A complete "You Might Also Like" section needs only the typed co-purchase edges plus one composable query. First, derive BOUGHT_WITH and VIEWED_WITH edges from your order and session logs and bulk-import them. Then run the hybrid builder: seed candidates by description similarity, traverse the behavioral edges to reach co-purchased items, and rank the surviving frontier by relevance.

The result is a feed that spans categories where customers genuinely cross-shop, stays relevant through the vector ranking, and respects a price filter applied in the same call. Every recommendation is auditable: you can trace it back to the co-purchase events behind its edge.

This is a complete recommendation flow over one store. No managed recommendation service, no second graph database, no ETL job keeping a co-purchase graph aligned with the catalog. The semantic matching and the behavioral traversal happen in one engine, over one copy of the data.

Key insight: One composable query over typed co-purchase edges produces an auditable cross-category feed. No managed ML service, no second store, no ETL sync.

recommend.py

# Full "You Might Also Like" pipeline, one composable query

# Behavioral edges derived from logs and bulk-imported:
#   client.graph.bulk_import_edges("products", copurchase_rows, format="csv")
#   client.graph.bulk_import_edges("products", coview_rows, format="jsonl")

feed = (
    client.graph.query("products")
    .vector_similar(product_embedding, k=50)
    .traverse("BOUGHT_WITH", direction="outgoing")
    .vector_rank(product_embedding, k=200)
    .return_nodes()
)

# Result: items genuinely bought alongside the seed, ranked by relevance.
# Cross-category where customers cross-shop, auditable to the orders behind it.

SwarnDB vs Traditional Stack

A side-by-side look at the traditional approach versus SwarnDB.

Capability	Traditional Stack	SwarnDB
'Also bought'	Edges in a separate graph or ML store	Typed BOUGHT_WITH edges, same engine
Stack Complexity	Search + recs + filter + graph services	One database, one object per product
Edge source	Opaque ML output	Authored from orders, with provenance
Price-Filtered Recs	Multi-service round trips	Filter-then-search, one call
Edge ingest	Standing ETL sync job	Bulk import CSV / JSONL

Key Metrics

Metric	Value
'Also Bought'	Typed Edges
Edge Source	Order Events
Search + Graph	One Query
Price Filter	Filter-Then-Search
Edge Import	CSV / JSONL
Provenance	Per-Edge

The Code

Everything above, in a few lines of Python.

recommendations.py

from swarndb import SwarnDBClient
from swarndb.search import Filter

client = SwarnDBClient(host="localhost", port=50051)

# Hybrid mode: product vectors and a typed product graph in one store.
client.collections.create(
    "products", dimension=384, distance_metric="cosine", mode="hybrid"
)

# Store products. Each insert id is the product's graph node id.
node_ids = {}
for product in catalog:
    node_ids[product["name"]] = client.vectors.insert("products",
        vector=product["embedding"],
        metadata={"name": product["name"], "price": product["price"],
                  "category": product["category"]}
    )

# Author behavioral edges from orders / sessions, then bulk-import.
# a and b are the node ids of two co-purchased products (values in node_ids).
client.graph.put_edge("products", source=a, target=b,
                      edge_type="BOUGHT_WITH",
                      provenance={"source": "orders", "co_count": 312})
client.graph.bulk_import_edges("products", copurchase_rows, format="csv")

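# product_embedding is the embedding of the product being viewed.
# Price-filtered search: fix the in-budget set, then rank within it.
results = client.search.query("products",
    vector=product_embedding, k=10,
    filter=Filter.between("price", 20.0, 50.0)
)

# Graph recommendations: scope by structure, rank by meaning.
feed = (
    client.graph.query("products")
    .vector_similar(product_embedding, k=50)
    .traverse("BOUGHT_WITH", direction="outgoing")
    .vector_rank(product_embedding, k=200)
    .return_nodes()
)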

Try it yourself

Clone the repo, spin up SwarnDB, and run this use case in minutes.
