Skip to content

RavenRustRAG

RavenRustRAG is a local-first RAG (Retrieval-Augmented Generation) engine written in Rust. It provides millisecond document indexing, microsecond query latency, and ships as a single static binary with zero runtime dependencies.

What is RAG?

Retrieval-Augmented Generation combines document retrieval with LLM prompting. Instead of feeding an entire corpus to a language model, RAG retrieves the most relevant chunks and provides them as context. This reduces hallucinations, keeps responses grounded in your data, and works within token limits.

Key Features

  • Vector search with cosine similarity over local embeddings
  • HNSW index for O(log n) approximate nearest neighbor search
  • Hybrid search combining BM25 keyword matching with vector retrieval (RRF fusion)
  • Knowledge graph for entity-relationship traversal and graph-vector fusion
  • Semantic splitting via embedding cosine similarity between sentences
  • Lock-free caching with DashMap for concurrent embedding cache access
  • Memory-mapped SQLite with 256 MB mmap for zero-copy reads
  • HTTP API (Axum) with Bearer auth, rate limiting, CORS, and OpenAPI schema
  • MCP server for AI assistant integration (Claude, Copilot, Cursor)
  • CLI with 20 commands for scripting and automation
  • File watching with debounce for automatic re-indexing
  • Incremental indexing via SHA-256 content fingerprinting
  • Docker support with minimal scratch-based images (~15 MB)

Architecture

RavenRustRAG is a Cargo workspace with 10 crates:

Crate Purpose
raven-core Document, Chunk, SearchResult, Config, RavenError, cosine similarity
raven-embed Embedder trait + Ollama/OpenAI/vLLM/LiteLLM/ONNX/Dummy backends + DashMap cache
raven-store VectorStore trait + SQLite (WAL, mmap) / Memory / HNSW stores
raven-split Splitter trait + Text/Sentence/Token/Semantic splitters
raven-load File loaders (txt, md, csv, json, html, pdf, docx) + JSONL export
raven-search DocumentIndex, BM25, KnowledgeGraph, Reranker, MultiQuery, Eval
raven-server Axum HTTP API (auth, CORS, rate limit, SSE streaming, WebSocket, metrics, OpenAPI)
raven-mcp MCP server (stdio JSON-RPC, tools + resources + prompts)
raven-cli CLI binary (20 commands)
ravenrustrag Top-level library crate with stable public API re-exports

Getting Started

See Installation to get the binary, then Quick Start to index your first documents.