Changelog

All notable changes to RavenRustRAG are documented here.

This project follows Semantic Versioning.

[1.0.2] — 2026-05-08

Added

  • OpenAIGenerator — OpenAI Chat Completions backend for LLM generation (supports vLLM, LiteLLM, TGI, LocalAI, llama.cpp)
  • vllm and litellm as first-class backend names in create_embedder and create_generator
  • Homebrew tap (brew install egkristi/tap/ravenrag) with auto-update on release

Changed

  • Binary renamed from raven to ravenrag across all platforms
  • create_generator now supports "openai", "vllm", "litellm" in addition to "ollama"

1.0.0 — 2026-05-05

Completed Phases

All development phases are complete. RavenRustRAG is functionally superior to the Python RavenRAG v0.7.0.

Phase 4: Polish & Release

  • Publishing: crates.io metadata, publish workflow, Homebrew formula, AUR PKGBUILD, MCP marketplace listing
  • Quality: 80%+ test coverage, Codecov badge, CLI integration tests, HTTP server integration tests, verified benchmarks
  • Architecture: top-level library crate with builder API
  • CLI: ravenrag init, ravenrag diff
  • Robustness: embeddings versioning, --read-only mode, MCP JSON Schema validation
  • Performance: HNSW index replacing O(n) scan
  • API Stability: stable public API surface, ONNX MSRV split, v1.0 release

Phase 5: Advanced Features

  • LLM Generation: POST /ask with SSE streaming citations
  • ONNX Runtime: local embedding, cross-encoder reranking, quantized model support
  • MCP: resources capability, prompts capability, --filter scoping
  • CLI: --explain scoring, ravenrag backup
  • Advanced: incremental BM25, async SQLite, binary/quantized embedding storage, WebSocket streaming, config hot-reload, plugin system

Phase 6: Distribution & Packaging

  • Windows: winget, Chocolatey, Scoop, portable ZIP
  • macOS: Homebrew
  • Linux: APT/deb, RPM, AUR, Zypper, APK, Snap, Flatpak, Nix, AppImage, static musl binary
  • Cross-platform: crates.io, GitHub Releases, Docker multi-arch, OCI, Helm chart, curl install script

Added

  • POST /ask SSE streaming citations with typed events: source, token, error, done
  • ravenrag serve --read-only mode for production deployments
  • MCP resources/list + resources/read capabilities (raven://index/stats)
  • MCP prompts/list + prompts/get capabilities (rag_answer, summarize_index)
  • ravenrag mcp --filter <expr> to restrict exposed tools
  • MCP JSON Schema constraints on tools/list (additionalProperties, min/max bounds)
  • ravenrag query --explain for detailed scoring breakdown
  • ravenrag backup <file> via SQLite backup API
  • Embeddings versioning — model + dimensions stored, dimension mismatch rejected
  • HNSW index auto-rebuilds at store open, eliminates O(n) flat scan (#79)
  • Stable public API surface via ravenrustrag crate re-exports (#83)
  • CI coverage percentage threshold gate at 70% (#81)
  • CI ONNX feature gate job (#80)
  • Verified benchmark numbers in README (#82)
  • HttpEmbedder for custom embedding backends via generic HTTP API (plugin system) (#77)
  • WebSocket endpoint /ws for real-time streaming search and prompt (#76)
  • 37 new tests across all crates, including Unicode edge cases (#53)
  • ravenrag diff command to show changes since last index (#78)
  • ravenrag ask command for local LLM question-answering via Ollama (#63)
  • Generator trait and OllamaGenerator for LLM text generation with streaming
  • ravenrag completions command for shell completion generation (bash, zsh, fish, elvish, PowerShell) (#62)
  • ravenrag status command for rich index health dashboard (#74)
  • --dry-run mode for ravenrag index (#71)
  • Colored CLI output with term highlighting (#73)
  • Schema migration system with versioned upgrades for SqliteStore (#60)
  • Property-based tests with proptest for core, split, and search crates (#58)
  • Stress tests for concurrent indexing and large document handling (#59)
  • Fuzz targets for text splitter, all loaders, and cosine similarity (#57)
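
The typed events listed for POST /ask (source, token, error, done) map naturally onto Server-Sent Events framing. The sketch below shows one way such frames could be serialized. The `AskEvent` type, its method names, and the plain-string payloads are illustrative assumptions, not the crate's actual API; the real payloads may well be JSON.

```rust
/// Hypothetical event type mirroring the four event names in the changelog.
#[derive(Debug, Clone, PartialEq)]
pub enum AskEvent {
    Source(String),
    Token(String),
    Error(String),
    Done,
}

impl AskEvent {
    /// SSE event name, as listed in the changelog entry.
    fn name(&self) -> &'static str {
        match self {
            AskEvent::Source(_) => "source",
            AskEvent::Token(_) => "token",
            AskEvent::Error(_) => "error",
            AskEvent::Done => "done",
        }
    }

    /// Payload text (assumed to be a plain string here).
    fn payload(&self) -> &str {
        match self {
            AskEvent::Source(s) | AskEvent::Token(s) | AskEvent::Error(s) => s.as_str(),
            AskEvent::Done => "",
        }
    }

    /// Serialize as one SSE frame: an `event:` line, one `data:` line per
    /// payload line, and a blank line that terminates the frame.
    pub fn to_sse_frame(&self) -> String {
        let mut frame = format!("event: {}\n", self.name());
        // Splitting an empty payload still yields one empty item,
        // so every frame carries at least one `data:` line.
        for line in self.payload().split('\n') {
            frame.push_str("data: ");
            frame.push_str(line);
            frame.push('\n');
        }
        frame.push('\n');
        frame
    }
}
```

Calling `AskEvent::Token("hi".to_string()).to_sse_frame()` yields `event: token`, then `data: hi`, then a blank line. Multi-line payloads are split across several `data:` lines, which is how SSE carries embedded newlines.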

Changed

  • create_embedder and create_cached_embedder now support "http" backend (#77)
  • All Cargo.toml files now include crates.io metadata (homepage, repository, keywords, categories) (#52)

Fixed

  • Unicode text splitter bug where multi-byte chars at chunk boundaries produced empty chunks (#53)
  • ravenrag diff macOS path canonicalization issue with /var/folders vs /private/var/folders (#78)

Added

  • mdBook + MkDocs documentation site
  • HNSW integration in SqliteStore for O(log n) vector search (#64)
  • VectorStore::get_by_doc_id() for efficient parent-child retrieval (#65)
  • Split read/write connections in SqliteStore for concurrent reads (#69)
  • LRU cache eviction via moka crate (#68)
  • Auto-detect embedding dimension from model (#67)
  • Full raven.toml config wiring in CLI (#66)
  • DashMap lock-free embedding cache (#47)
  • Memory-mapped I/O for SQLite (256 MB mmap) (#48)
  • CI concurrency groups to prevent queue flooding (#46)
  • Knowledge graph CLI commands (ravenrag graph build, ravenrag graph query) (#45)
  • Multi-target release binaries (linux-amd64, linux-musl, linux-arm64, darwin-amd64, darwin-arm64) (#54)
  • GitHub Actions CI with fmt, clippy, test, MSRV, bench stages
  • Docker workflow with GHCR publishing
  • CodeQL security scanning

Changed

  • Default --extensions for CLI index/watch commands now includes all supported formats (txt,md,csv,json,html,pdf,docx)
  • EmbeddingCache internals: Mutex replaced with DashMap + AtomicU64
  • SqliteStore uses separate read/write connections for better concurrency
  • Semaphore error in batch embedding now returns a proper RavenError instead of panicking

Fixed

  • SqliteStore search now uses HNSW for O(log n) instead of O(n) brute-force (#64)
  • query_parent() no longer loads entire database (#65)
  • Embedding dimension no longer hardcoded to 768 (#67)
  • CLI now reads and applies raven.toml configuration (#66)
  • Clippy compliance: removed unwrap in library code
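
Two entries in this changelog concern embedding dimensions: the dimension is detected rather than hardcoded to 768, and a stored model + dimensions record is used to reject mismatched vectors. A minimal sketch of that idea follows. `EmbeddingVersion`, `VersionError`, and their methods are hypothetical names chosen for illustration and are not taken from the RavenRustRAG source.

```rust
use std::fmt;

/// Hypothetical record of which model produced the vectors in an index.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbeddingVersion {
    pub model: String,
    pub dimensions: usize,
}

/// Hypothetical error returned when a vector does not fit the index.
#[derive(Debug, PartialEq)]
pub enum VersionError {
    DimensionMismatch { stored: usize, incoming: usize },
}

impl fmt::Display for VersionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            VersionError::DimensionMismatch { stored, incoming } => write!(
                f,
                "embedding dimension mismatch: index stores {}, got {}",
                stored, incoming
            ),
        }
    }
}

impl std::error::Error for VersionError {}

impl EmbeddingVersion {
    /// Derive the dimension from a sample embedding instead of hardcoding it.
    pub fn detect(model: &str, sample: &[f32]) -> Self {
        EmbeddingVersion {
            model: model.to_string(),
            dimensions: sample.len(),
        }
    }

    /// Reject any vector whose length differs from the stored dimension.
    pub fn check(&self, vector: &[f32]) -> Result<(), VersionError> {
        if vector.len() == self.dimensions {
            Ok(())
        } else {
            Err(VersionError::DimensionMismatch {
                stored: self.dimensions,
                incoming: vector.len(),
            })
        }
    }
}
```

Checking at write time keeps vectors from two different models out of the same index, where similarity scores between them would be meaningless.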

0.1.0-alpha.1 — Initial Release

Added

  • Cargo workspace with 9 crates
  • Core types: Document, Chunk, SearchResult, Config, RavenError
  • Embedding backends: Ollama, OpenAI, DummyEmbedder
  • Embedding cache with configurable max size
  • CachedEmbedder wrapper
  • Text splitters: TextSplitter, SentenceSplitter, TokenSplitter, SemanticSplitter
  • File loaders: txt, md (with frontmatter), csv, json, jsonl, html
  • JSONL export/import
  • Vector stores: SqliteStore (WAL, persistent), MemoryStore (testing)
  • HNSW approximate nearest neighbor index
  • Metadata filtering on search results
  • DocumentIndex pipeline orchestrator with a builder-pattern API
  • BM25 keyword search with persistent term storage
  • Hybrid search with RRF (Reciprocal Rank Fusion)
  • Knowledge graph: entity extraction, graph building, traversal, graph-vector fusion
  • Multi-query expansion via keyword extraction
  • Reranker trait with KeywordReranker
  • Evaluation metrics: MRR, NDCG, Precision@k, Recall@k
  • Watch mode with debounce for live re-indexing
  • Multi-collection router
  • Parent-child retrieval
  • Axum HTTP server with auth, CORS, rate limiting, OpenAPI schema, metrics
  • MCP server (stdio JSON-RPC) with search, prompt, info, index tools
  • CLI: index, query, prompt, serve, watch, graph, info, clear, export, import, mcp, doctor, benchmark
  • Incremental indexing via content fingerprinting
  • Criterion benchmarks
  • Pre-commit hooks (fmt, clippy, test)
  • Docker multi-stage build (Alpine builder, scratch runtime)
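
For readers unfamiliar with the hybrid search entry above: Reciprocal Rank Fusion merges several ranked lists by giving each document the score sum of 1 / (k + rank) over the lists it appears in, with rank starting at 1. The sketch below is a generic, standalone version of that formula. The function name and signature are assumptions and do not reflect the crate's internals; k = 60 is the constant commonly used in the literature, not a value confirmed by this changelog.

```rust
use std::collections::HashMap;

/// Fuse several rankings (best result first) with Reciprocal Rank Fusion.
/// Returns (doc_id, fused_score) pairs sorted by descending score.
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();

    for ranking in rankings {
        for (position, doc_id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            let rank = (position + 1) as f64;
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();

    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With a vector ranking of `["a", "b", "c"]`, a keyword ranking of `["b", "d", "a"]`, and k = 60, document `b` comes out ahead of `a` because it places first and second rather than first and third. Only ranks enter the formula, so the raw BM25 and cosine scores never need to be put on a common scale.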