
code-indexer

Avoid making your AI run blind grep loops on every reply. Use code-indexer to index once, then retrieve semantically via MCP.

What it is

code-indexer is a Docker-only MCP server for multi-repo code retrieval.

Pipeline:

  • Tree-sitter chunking + textual fallback
  • Dense embeddings
  • Sparse retrieval
  • Reranking
  • Incremental indexing (mtime + size + content hash)
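The incremental step can be pictured as a cheap-first change check: compare mtime and size, and fall back to a content hash only when they differ. A minimal sketch, not the server's actual implementation; `file_signature` and `needs_reindex` are hypothetical helper names:

```python
import hashlib
from pathlib import Path

def file_signature(path: Path) -> tuple[float, int, str]:
    """Signature used to decide whether a file must be re-chunked."""
    stat = path.stat()
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return (stat.st_mtime, stat.st_size, digest)

def needs_reindex(path: Path, state: dict[str, tuple]) -> bool:
    """Cheap checks first (mtime, size); content hash only as a tiebreaker."""
    prev = state.get(str(path))
    if prev is None:
        return True
    stat = path.stat()
    if (stat.st_mtime, stat.st_size) == prev[:2]:
        return False  # unchanged by cheap checks; skip hashing entirely
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest != prev[2]
```

This is why the "incremental no-change" benchmark below is so much faster than a full rebuild: unchanged files never reach the hashing or embedding stages.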

Model stack and where to find alternatives

Default models:

Find other options:

Requirements

  • Docker + Docker Compose
  • Optional GPU runtime:
    • NVIDIA: NVIDIA Container Toolkit
    • AMD ROCm: ROCm-compatible Linux host
    • Intel iGPU: /dev/dri access (runtime support depends on backend libraries)

Quick start

  1. Create local config files.

Windows (PowerShell):

Copy-Item .env.example .env
Copy-Item config/codebases.example.yaml config/codebases.yaml

Linux (bash):

cp .env.example .env
cp config/codebases.example.yaml config/codebases.yaml

  2. Start (CPU/default).

Windows (PowerShell):

docker compose up --build -d

Linux (bash):

docker compose up --build -d

  3. MCP endpoint:
  • http://localhost:${MCP_PORT}/mcp (default: http://localhost:8000/mcp)

Single source configuration

  • .env is the single source of truth for:
    • model IDs (EMBEDDING_MODEL, SPARSE_MODEL, RERANKER_MODEL)
    • runtime tuning (CHUNK_SIZE, USE_RERANKER, etc.)
  • config/codebases.yaml is only for:
    • codebase paths (codebases)
    • file scope (default_include_extensions, default_exclude_dirs)
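Putting the split together, a minimal `config/codebases.yaml` might look like the sketch below. The top-level keys are the ones named in this section; the per-codebase fields (`id`, `path`) and the extension/dir values are illustrative assumptions, so check `config/codebases.example.yaml` for the real schema:

```yaml
# config/codebases.yaml — paths and file scope only; models and tuning live in .env
codebases:
  - id: payments            # assumed field name; see codebases.example.yaml
    path: /workspaces/payments
default_include_extensions: [".py", ".ts", ".go"]
default_exclude_dirs: ["node_modules", ".git", "dist"]
```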

GPU backend support

NVIDIA

Windows (PowerShell):

docker compose --compatibility -f docker-compose.yml -f docker-compose.nvidia.yml up --build -d

Linux (bash):

docker compose --compatibility -f docker-compose.yml -f docker-compose.nvidia.yml up --build -d

AMD ROCm

Uses mcp-server/Dockerfile.rocm.

Windows (PowerShell):

docker compose -f docker-compose.yml -f docker-compose.amd.yml up --build -d

Linux (bash):

docker compose -f docker-compose.yml -f docker-compose.amd.yml up --build -d

Intel iGPU (experimental)

Windows (PowerShell):

docker compose -f docker-compose.yml -f docker-compose.intel.yml up --build -d

Linux (bash):

docker compose -f docker-compose.yml -f docker-compose.intel.yml up --build -d

Notes:

  • Intel/AMD acceleration depends on backend libraries in your image/runtime.
  • If acceleration is unavailable, set devices to CPU:
    • EMBEDDING_DEVICE=cpu
    • SPARSE_DEVICE=cpu
    • RERANK_DEVICE=cpu

Path rules for register_codebase

Always pass container-visible paths.

  • Use /host/... for paths outside HOST_CODEBASES_ROOT
  • Use /workspaces/... for paths inside HOST_CODEBASES_ROOT
  • Do not pass raw host paths like C:\repo\api or /home/user/repo

Examples:

  • C:\Users\you\repos\api -> /host/Users/you/repos/api
  • /home/you/repos/api -> /host/home/you/repos/api
  • HOST_CODEBASES_ROOT=C:/repos + C:/repos/api -> /workspaces/api

MCP client config

{
  "mcpServers": {
    "code-indexer": {
      "transport": "streamable-http",
      "url": "http://localhost:<MCP_PORT>/mcp"
    }
  }
}

MCP tools

  • list_codebases: list codebases and indexed chunk counts
  • register_codebase: register a codebase path at runtime
  • reindex_codebase: synchronous reindex (supports full_reindex)
  • delete_codebase_index: delete vectors for one codebase index
  • delete_all_indexes: delete all vectors from the collection
  • start_index_job: start async indexing and return a job_id
  • cancel_index_job: request safe cancellation of one indexing job
  • stop_codebase_indexing: request safe cancellation of all active jobs for one codebase
  • get_index_job_status: poll async job status/result
  • list_index_jobs: list recent jobs
  • search_code: hybrid semantic search + rerank
  • read_code_file: read exact file lines

Query policy for AI agents

For best retrieval quality, always send search_code.query in English.

Suggested system instruction:

When calling MCP tool search_code, always write query in English.

Example AI interactions

Example 1:

User:

Index my repo and find where retry policy is implemented.

AI tool calls:

register_codebase(codebase_id="payments", path="/host/Users/you/repos/payments", index_now=true)
search_code(query="where retry policy is implemented", codebase_ids=["payments"], search_mode="code_only", limit=5)

Example 2 (large repo, timeout-safe):

User:

Reindex the monorepo now.

AI tool calls:

start_index_job(codebase_id="monorepo")
get_index_job_status(job_id="...", include_result=true)
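The timeout-safe pattern in Example 2 boils down to a short polling loop. A sketch under stated assumptions: `call_tool` stands in for whatever MCP client you use, and the terminal status values (`completed`, `failed`, `cancelled`) are assumptions rather than documented constants:

```python
import time

def wait_for_index_job(call_tool, codebase_id: str,
                       poll_interval: float = 2.0, timeout: float = 3600.0) -> dict:
    """Start an async index job and poll until it reaches a terminal state."""
    job = call_tool("start_index_job", codebase_id=codebase_id)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = call_tool("get_index_job_status",
                           job_id=job["job_id"], include_result=True)
        if status["status"] in ("completed", "failed", "cancelled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"index job {job['job_id']} did not finish in {timeout}s")
```

Because start_index_job returns in milliseconds (see Benchmarks), the MCP client never sits inside a long synchronous call.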

Example 3 (Portuguese user, English retrieval query):

User:

Onde o fallback textual é usado no chunking? ("Where is the textual fallback used in chunking?")

AI tool call:

search_code(query="where textual fallback is used in chunking", codebase_ids=["rag-test"], search_mode="code_only")

Benchmarks

Measured on 2026-02-28 on this same repository (code-indexer), using MCP tools and default models, on an NVIDIA GeForce RTX 4060 8GB.

Dataset/profile:

  • Files indexed: 14
  • Chunks after full rebuild: 218
  • USE_TREE_SITTER=true
  • USE_SPARSE=true
  • USE_RERANKER=true

Indexing

Scenario Runs Min (s) Avg (s) P50 (s) Max (s)
Full rebuild (full_reindex=true) 1 7.386 7.386 7.386 7.386
Incremental no-change 5 0.068 0.074 0.069 0.096
Async incremental end-to-end 1 0.140 0.140 0.140 0.140
Async time-to-first-response (start_index_job) 1 0.008 0.008 0.008 0.008

Search

Scenario Runs Min (s) Avg (s) P50 (s) Max (s)
search_code code_only 8 0.487 0.501 0.496 0.518
search_code mixed 8 0.857 0.879 0.873 0.917
search_code docs_only 8 0.388 0.391 0.390 0.397

Comparisons

  • Incremental no-change vs full rebuild: about 99.8x faster.
  • Async indexing first response: about 8 ms.

Large Repo Performance (10k+ files)

If indexing takes hours, tune for throughput first:

  1. Use async indexing tools (avoid MCP client timeout):
    • start_index_job(codebase_id="my-repo")
    • Poll with get_index_job_status(job_id="...")
  2. Restrict scope in config/codebases.yaml:
    • keep only required extensions in include_extensions
    • aggressively exclude heavy folders in exclude_dirs
  3. Increase indexing throughput in .env:
    • INDEX_CHUNK_BUFFER_SIZE=512 (or 1024)
    • QDRANT_UPSERT_BATCH_SIZE=256 (or 512)
    • QDRANT_WRITE_WAIT=false
    • INDEX_FILE_WORKERS=4 (or up to CPU cores)
    • INDEX_MAX_PENDING_FUTURES=16 (or workers * 4)
    • EMBEDDING_BATCH_SIZE=32 (or 64 if VRAM allows)
  4. If indexing speed is priority over recall, disable sparse indexing:
    • USE_SPARSE=false
  5. Enable progress logs:
    • INDEX_PROGRESS_EVERY_FILES=250
    • then run docker compose logs -f mcp-server
  6. For minified/single-line files, keep dedupe based on character ranges:
    • DEDUPE_CHAR_OVERLAP_THRESHOLD=0.65
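The character-range dedupe in step 6 can be pictured as comparing chunk spans by overlap ratio: two chunks whose character ranges overlap beyond the threshold are treated as duplicates. A rough sketch, not the actual implementation; the default mirrors DEDUPE_CHAR_OVERLAP_THRESHOLD=0.65:

```python
def char_overlap_ratio(a: tuple[int, int], b: tuple[int, int]) -> float:
    """Overlap between two (start, end) character ranges, relative to the smaller range."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    smaller = min(a[1] - a[0], b[1] - b[0])
    return overlap / smaller if smaller else 0.0

def is_duplicate(a: tuple[int, int], b: tuple[int, int],
                 threshold: float = 0.65) -> bool:
    return char_overlap_ratio(a, b) >= threshold
```

Measuring against character ranges rather than line ranges is what keeps this useful for minified or single-line files, where one "line" can hold many logical chunks.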

Results returned by reindex_codebase / start_index_job now include:

  • files_scanned
  • flush_batches
  • index_chunk_buffer_size
  • qdrant_upsert_batch_size
  • index_file_workers
  • duration_seconds

Troubleshooting

Missing session ID on all calls

  1. Confirm client transport is streamable-http
  2. Set MCP_STATELESS_HTTP=true
  3. Restart stack

Windows (PowerShell):

docker compose up -d --build

Linux (bash):

docker compose up -d --build

failed to connect to SSE stream ... /mcp: EOF

Cause:

  • MCP server process is alive, but endpoint is not ready yet (commonly blocked by long startup indexing).

Fix:

  1. Keep MCP_TRANSPORT=streamable-http
  2. Set AUTO_INDEX_ON_STARTUP=false for immediate startup, or keep it enabled with async startup on the latest version
  3. Start indexing via async tools after server is up:
    • start_index_job(...)
    • get_index_job_status(...)

path not found in register_codebase

  • Check mount roots (HOST_FILESYSTEM_ROOT, HOST_CODEBASES_ROOT)
  • Send /host/... or /workspaces/... path, not host-native path

Persistence

  • Qdrant vectors: qdrant_data
  • Models/cache/state: model_cache
  • Incremental state: ${INDEX_STATE_DIR} (default /models/index_state)

About

🚀 MCP-native, Docker-only code retrieval engine with hybrid vectors, reranking, async-safe indexing, and GPU acceleration for massive multi-repo codebases.
