
Configuration

RavenRustRAG can be configured via a TOML config file, environment variables, or CLI flags.

Config File

Create a raven.toml in the project root or pass --config <path>:

[embedder]
backend = "ollama"
model = "nomic-embed-text"
url = "http://localhost:11434"

[splitter]
kind = "text"
chunk_size = 512
chunk_overlap = 50

[pipeline]
embed_batch_size = 64
store_batch_size = 100

[server]
host = "127.0.0.1"
port = 8484
# api_key = "your-secret"
# cors_origins = ["http://localhost:3000"]
request_timeout_secs = 60
rate_limit_per_second = 100
max_query_length = 10000
public_stats = false

Environment Variables

| Variable | Purpose | Default |
|----------|---------|---------|
| RAVEN_API_KEY | API authentication key (server) | — (no auth) |
| RAVEN_DB | Default database path | ./raven.db |
| RAVEN_MODEL | Default embedding model | nomic-embed-text |
| RAVEN_HOST | Server bind address | 127.0.0.1 |
| RAVEN_PORT | Server port | 8484 |
| RAVEN_EMBED_BACKEND | Embedding backend (ollama, openai, vllm, litellm, http, onnx) | ollama |
| RAVEN_EMBED_URL | Custom embedding service URL | — |
| RAVEN_CORS_ORIGINS | Allowed CORS origins (comma-separated) | * |
| RAVEN_RATE_LIMIT | Rate limit per second | 100 |
| RAVEN_REQUEST_TIMEOUT | Request timeout in seconds | 60 |
| RAVEN_MAX_QUERY_LENGTH | Max query string length | 10000 |
| RAVEN_LOG_FORMAT | Log output format (text or json) | text |
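For example, to override the port and embedding endpoint via the environment for one shell session (the serve invocation is commented out and shown only for context; the subcommand name is an assumption, not confirmed by this page):

```shell
# Override server and embedder settings; these take effect for any
# ravenrag invocation started from this shell.
export RAVEN_PORT=9090
export RAVEN_EMBED_BACKEND=ollama
export RAVEN_EMBED_URL=http://localhost:11434
# ravenrag serve   # would now bind to 127.0.0.1:9090 (hypothetical subcommand)
```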

Precedence

CLI flags > Environment variables > Config file > Defaults
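The rule above can be sketched with shell parameter expansion, resolving a single setting (the port, as an illustration) from highest to lowest priority:

```shell
# Resolve the effective port: CLI flag first, then RAVEN_PORT, then the
# config file value, then the built-in default.
config_port=8484          # from raven.toml
cli_port=9100             # from a --port 9100 flag (illustrative value)
effective_port=${cli_port:-${RAVEN_PORT:-$config_port}}
echo "$effective_port"    # the CLI flag wins
```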

Embedding Backends

Ollama (default)

Local inference via Ollama. Requires a running Ollama server with the model already pulled (e.g. ollama pull nomic-embed-text).

OLLAMA_NO_CLOUD=1 ollama serve

Recommended models:

  • nomic-embed-text (768 dimensions, fast, good quality)
  • mxbai-embed-large (1024 dimensions, higher quality)

OpenAI

Uses the OpenAI embeddings API. Requires OPENAI_API_KEY environment variable.

ravenrag index ./docs --backend openai --model text-embedding-3-small

vLLM

Uses a vLLM server with OpenAI-compatible API (default: http://localhost:8000/v1).

ravenrag index ./docs --backend vllm --model nomic-embed-text --url http://localhost:8000/v1

LiteLLM

Uses LiteLLM proxy (default: http://localhost:4000/v1). Supports 100+ model providers through a single interface.

ravenrag index ./docs --backend litellm --model nomic-embed-text --url http://localhost:4000/v1

Database

RavenRustRAG uses SQLite as its vector store with these optimizations enabled by default:

  • WAL mode for concurrent read access
  • mmap (256 MB) for zero-copy reads
  • 64 MB page cache
  • 5 second busy timeout for write contention

The database file is portable and can be copied between machines with the same architecture.
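The settings above correspond to the following SQLite PRAGMAs, shown here for reference only; RavenRustRAG applies the equivalent configuration itself when it opens the database:

```sql
PRAGMA journal_mode = WAL;     -- concurrent readers alongside a single writer
PRAGMA mmap_size = 268435456;  -- 256 MB of memory-mapped I/O
PRAGMA cache_size = -65536;    -- negative value means size in KiB (64 MB)
PRAGMA busy_timeout = 5000;    -- wait up to 5000 ms on a locked database
```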