Hydrate vs Bytebell

Bytebell answers "where is the auth code?". Hydrate answers "what did we decide about it last Tuesday?". These are permanently different questions, and the honest pitch for a serious team is to run both.

Bytebell answers

Where is the authentication code? How does it connect to the API layer? Which repos import parse_file? What changed in the last commit that touched the auth module?

Layer 1 — code structure

Hydrate answers

What did we decide about the authentication implementation last Tuesday? Why did we abandon the session-token approach? What JWT convention did Tom establish? What did Dick and Harry need to know before touching the auth layer?

Layer 3 — team decisions

The graph tells the agent what the code is. Hydrate tells it what the team decided. You cannot derive one from the other — and on a serious project both questions get asked daily.

At a glance — different layers, different problems

Dimension	Hydrate	Bytebell
Core problem	Session continuity	Codebase archaeology
What it indexes	Session transcripts, decisions, pinned facts	Files, classes, functions, imports, per-file semantics
When value peaks	Ongoing project; returning to work; team coordination	First session in unfamiliar 10k+ file codebase
Capture	Automatic Stop hook	`bytebell index <url>` per repo
Injection	Automatic UserPromptSubmit hook	AI calls 3 MCP tools
Storage	SQLite	Neo4j + Mongo + Redis
Runtime	Single Go binary	Bun + Docker (3 services)
BYOK key required	-	OpenRouter (at ingest)
Team sync	`team push/pull`	OSS: single-tenant · Enterprise: multi-tenant
Hydration packs	`.hpack`	-
Compact-survival	PreCompact + SessionStart	-
Cross-vendor MCP	✓ `hydrate-mcp`	✓ `127.0.0.1:8080/mcp`
Codebase structural graph	-	✓ Neo4j with AST + imports
Per-file LLM semantics	-	`purpose` / `summary` / `businessContext`
Licence	Closed source (v0.2.0 beta)	AGPL-3.0 + non-commercial · Enterprise separate

✓ present · - not present

What is Bytebell?

ByteBell/bytebell-oss is a local knowledge graph for your codebase, served over MCP. It indexes a repo into a Neo4j graph where every file node carries LLM-generated purpose, summary, and businessContext fields, so Claude Code, Cursor, and other MCP-capable agents can answer questions about your code without reading the whole repo into context.

The architecture is grounded in recent research. The README cites RepoGraph (ICLR 2025, +32.8% on SWE-bench), CodexGraph (NAACL 2025), CGM (43% on SWE-bench Lite), and several more. The converging finding is that purely structural retrieval (AST / call-graph) and purely semantic retrieval (embeddings) each leave substantial performance on the table, and that combining them at index time recovers it.

Bytebell binds to 127.0.0.1 only. No telemetry, no auth, no remote network surface. The OSS edition is AGPL-3.0 with a non-commercial clause; commercial deployment requires the separately-licensed Enterprise edition.

How to install Bytebell

# Prerequisites: Bun >= 1.1, Docker, OpenRouter API key
bytebell set openrouter-api-key sk-or-…
bytebell set openrouter-model anthropic/claude-sonnet-4.6
bytebell boot
bytebell index https://github.com/your/repo
bytebell ls

Then point your MCP client at http://127.0.0.1:8080/mcp. The boot step pulls Mongo, Neo4j, and Redis via docker-compose; data persists in named volumes across reboots. Configuration lives at ~/.bytebell/config.json (mode 0600); there's no .env, and bytebell set is the only sanctioned write path.

Where Bytebell is genuinely stronger

Repository structural graph. Per-file imports, calls, class hierarchy. Bytebell knows that auth/middleware.go imports internal/jwt which extends crypto/ed25519. Hydrate has none of this — Hydrate's data is session-derived, not code-derived.
LLM-enriched semantic surface per file. purpose, summary, businessContext fields close the "vocabulary gap" — the model can match what a developer means, not just what the code spells.
Research-grounded design. The README's bibliography is a real one. If you're sold by the recent literature on hybrid structure-plus-semantic retrieval, Bytebell is the most direct application of it.
Enterprise deployment shape. SSO / SCIM, audit logging, multi-tenant patterns, connectors to Confluence / Jira / Notion / GitHub Enterprise are documented in the Enterprise edition.

Where Hydrate is stronger

Session memory. This is the deepest distinction. Bytebell knows the codebase structure. It has no mechanism to know that last Tuesday your team decided to use JWT over session cookies, or that you tried the GraphQL approach and abandoned it. Those decisions live in session transcripts, not import graphs.
Automatic capture, zero developer overhead. Hydrate's Stop hook fires on every session without any command. Bytebell needs bytebell index <url> per repo.
Automatic injection before the prompt. Hydrate's UserPromptSubmit hook injects ranked context before the model reads the prompt. Bytebell's three MCP tools require the model to call them — meaning the model has to decide to remember.
Team sync + canon propagation. Hydrate has team push/pull for canon propagation across machines, plus .hpack archives. Bytebell OSS is single-tenant per Neo4j instance.
Compact-survival. Hydrate intercepts the PreCompact event and writes a snapshot. On the public benchmark (n=30, 3 complexity buckets) Hydrate recovers 27% of in-flight tasks across compaction; tools without this hook recover 0%.
Zero LLM cost to operate. Hydrate's distill is local Go (TF-IDF plus sentence scoring), no LLM call. Over a 26-hour orchestration sprint Hydrate compressed 25.5M raw tokens to 142K stored summary at $0 compression cost. Bytebell calls OpenRouter on every file at ingest.
Single binary install. Hydrate is one Go binary plus SQLite. Bytebell needs Bun, Docker, Mongo, Neo4j, and Redis.
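
The "local Go, no LLM call" distill step in the list above can be pictured with a small extractive summarizer. Hydrate is closed source, so this is only a minimal sketch of TF-IDF sentence scoring in general, not Hydrate's implementation. The function names (tokenize, scoreSentences, topSentences), the choice to treat each sentence as its own document, and the mean-IDF scoring rule are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
	"unicode"
)

// tokenize lowercases text and splits on anything that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// scoreSentences treats each sentence as a document and returns, per sentence,
// the mean IDF of its tokens (equivalent to summing length-normalised TF * IDF).
func scoreSentences(sentences []string) []float64 {
	tokens := make([][]string, len(sentences))
	docFreq := map[string]int{} // number of sentences that contain each term
	for i, s := range sentences {
		tokens[i] = tokenize(s)
		seen := map[string]bool{}
		for _, t := range tokens[i] {
			if !seen[t] {
				seen[t] = true
				docFreq[t]++
			}
		}
	}
	n := float64(len(sentences))
	scores := make([]float64, len(sentences))
	for i, toks := range tokens {
		if len(toks) == 0 {
			continue // empty sentence keeps score 0
		}
		var sum float64
		for _, t := range toks {
			sum += math.Log(n / float64(docFreq[t]))
		}
		scores[i] = sum / float64(len(toks))
	}
	return scores
}

// topSentences keeps the k highest-scoring sentences, returned in original order.
func topSentences(sentences []string, k int) []string {
	if k < 0 {
		k = 0
	}
	if k > len(sentences) {
		k = len(sentences)
	}
	scores := scoreSentences(sentences)
	idx := make([]int, len(sentences))
	for i := range idx {
		idx[i] = i
	}
	// Stable sort so that ties keep the earlier sentence first.
	sort.SliceStable(idx, func(a, b int) bool { return scores[idx[a]] > scores[idx[b]] })
	keep := idx[:k]
	sort.Ints(keep)
	out := make([]string, 0, k)
	for _, i := range keep {
		out = append(out, sentences[i])
	}
	return out
}

func main() {
	// Hypothetical transcript lines, invented for this example.
	transcript := []string{
		"We looked at the auth layer",
		"We looked at the API layer",
		"Decision: use JWT with Ed25519 signatures instead of session cookies",
	}
	for _, s := range topSentences(transcript, 1) {
		fmt.Println(s)
	}
}
```

In this sketch, the sentence built from terms that appear nowhere else (the decision) outranks the lines that repeat common wording, and nothing in it calls a model.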

When to pick each

Scenario	Better choice
First session in unfamiliar 10k+ file codebase	Bytebell
Cross-repo dependency tracing	Bytebell
Ongoing project, 2+ weeks of session history	Hydrate
Team of 3+ devs, decision propagation	Hydrate
Mid-session compaction-recovery	Hydrate
Onboarding a new dev	Both (they compose)
Privacy-sensitive / air-gap	Both (they compose)
Cost-sensitive (no ongoing API spend)	Hydrate
Commercial deployment, AGPL-incompatible	Hydrate (or Bytebell Enterprise)

Run both — the strongest joint pitch in the space

A Claude Code session can speak to both MCP servers simultaneously without configuration conflict:

{
  "mcpServers": {
    "bytebell": { "type": "http", "url": "http://127.0.0.1:8080/mcp" },
    "hydrate":  { "command": "hydrate-mcp" }
  }
}

The result: instant structural understanding of any codebase (Bytebell) plus persistent memory of every decision made while working in it (Hydrate). Two complementary memory layers, one session.
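
One reason the two entries in the config above do not collide is that they use different transports: the Bytebell entry is reached over HTTP at a URL, while the Hydrate entry names a local command for the client to launch. The sketch below checks a config of that shape. It assumes the common MCP client convention that an entry with "command" is a stdio server; the type and function names (server, mcpConfig, transports) are hypothetical and belong to neither tool.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// sampleConfig repeats the configuration shown above.
const sampleConfig = `{
  "mcpServers": {
    "bytebell": { "type": "http", "url": "http://127.0.0.1:8080/mcp" },
    "hydrate":  { "command": "hydrate-mcp" }
  }
}`

// server mirrors the two entry shapes in the config:
// an HTTP entry carries "url", a locally launched entry carries "command".
type server struct {
	Type    string `json:"type"`
	URL     string `json:"url"`
	Command string `json:"command"`
}

type mcpConfig struct {
	Servers map[string]server `json:"mcpServers"`
}

// transports maps each server name to "http" or "stdio" and rejects
// entries that set both or neither of url and command.
func transports(raw []byte) (map[string]string, error) {
	var cfg mcpConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(cfg.Servers))
	for name, s := range cfg.Servers {
		switch {
		case s.URL != "" && s.Command == "":
			out[name] = "http"
		case s.Command != "" && s.URL == "":
			out[name] = "stdio" // assumption: command entries are stdio servers
		default:
			return nil, fmt.Errorf("server %q: expected exactly one of url or command", name)
		}
	}
	return out, nil
}

func main() {
	kinds, err := transports([]byte(sampleConfig))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	names := make([]string, 0, len(kinds))
	for name := range kinds {
		names = append(names, name)
	}
	sort.Strings(names) // map order is random, so sort for stable output
	for _, name := range names {
		fmt.Printf("%s: %s\n", name, kinds[name])
	}
}
```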

Brutally honest — what Hydrate doesn't do that Bytebell does

No codebase graph. Hydrate can't answer "which files import parse_file?" or "what calls validateJWT?".
No per-file LLM semantics. Hydrate doesn't know what auth/middleware.go is for.
No academic-citation depth. Bytebell's README cites a dozen recent papers; Hydrate's positioning is empirical (the benchmarks) rather than research-cited.

If your problem is "agent doesn't understand my codebase", Bytebell is the right tool. If your problem is "agent doesn't remember what my team decided", that's Hydrate. The honest answer is to run both.

Install Hydrate

brew install gethydrate/hydrate/hydrate
hydrate install-hooks
