bug: Worktree dependency_locations empty for monorepo root — causes ~1GB node_modules duplication per worktree #1901

@BlackStar1453

Description

Checklist

  • I searched existing issues and this hasn't been reported

Area

Backend

Operating System

macOS

Version

2.7.6

What happened?

I noticed that my Auto-Claude project directory was consuming abnormally large disk space. After investigating, I found that each worktree had its own full copy of node_modules (~1GB+) instead of sharing it via symlink from the main project.

The root cause is in ProjectAnalyzer._find_and_analyze_services() (apps/backend/analysis/analyzers/project_analyzer_module.py, lines 85-126). In monorepo projects, this method iterates over the subdirectories of service_locations to discover services such as apps/frontend and apps/backend, but it never analyzes the project root itself as a service.

In a monorepo like Auto-Claude where the root has package.json + node_modules (npm/pnpm workspaces pattern), the root-level node_modules is the largest dependency directory but is completely invisible to the analyzer. This causes project_index.json's dependency_locations field to be empty for root-level dependencies.

Data flow of the bug:

ServiceAnalyzer._detect_dependency_locations()  → correctly detects per-service deps ✅
ProjectAnalyzer._find_and_analyze_services()    → finds apps/frontend, apps/backend ✅
                                                → misses root package.json + node_modules ❌
ProjectAnalyzer._aggregate_dependency_locations() → aggregates from services → missing root deps
→ project_index.json.dependency_locations = [] (for root)
→ worktree setup can't symlink root node_modules
→ agent runs npm install in worktree → creates full ~1GB node_modules copy
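
The missing step in the flow above can be sketched as a root-level check that runs alongside the per-service detection. This is only an illustration under stated assumptions, not the analyzer's actual code: the function name detect_root_dependency_locations and the LOCKFILES mapping are hypothetical, and the entry shape is copied from the expected project_index.json output in this issue.

```python
from pathlib import Path

# Hypothetical lockfile -> package manager mapping (an assumption, not from the codebase).
LOCKFILES = {
    "pnpm-lock.yaml": "pnpm",
    "yarn.lock": "yarn",
    "package-lock.json": "npm",
}


def detect_root_dependency_locations(project_root: Path) -> list[dict]:
    """Return dependency_locations entries for the project root, if it is a Node project."""
    if not (project_root / "package.json").is_file():
        return []  # the root is not a Node package, so there is nothing to report

    # Pick the package manager from the first lockfile found; "npm" is an assumed default.
    package_manager = next(
        (pm for lock, pm in LOCKFILES.items() if (project_root / lock).is_file()),
        "npm",
    )
    return [{
        "type": "node_modules",
        "path": "node_modules",
        "exists": (project_root / "node_modules").is_dir(),
        "service": "root",
        "package_manager": package_manager,
    }]
```

An aggregation step could then merge these entries with the per-service ones so that the root node_modules is no longer dropped.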

Impact: With multiple worktrees (common when running parallel tasks), disk usage grows by ~1GB per worktree unnecessarily.

Steps to reproduce

  1. Use Auto-Claude on a monorepo project that has a root-level package.json + node_modules (e.g., Auto-Claude itself)
  2. Create a task that triggers worktree creation
  3. Check .auto-claude/project_index.json: dependency_locations will be missing the root node_modules
  4. Check the worktree directory — it has a full real node_modules instead of a symlink

Or verify directly:

cd apps/backend && .venv/bin/python -c "
from pathlib import Path
from analysis.analyzers.project_analyzer_module import ProjectAnalyzer
pa = ProjectAnalyzer(Path('.').resolve().parent.parent)
result = pa.analyze()
print('dependency_locations:', result.get('dependency_locations'))
# Root node_modules will be missing from the list
"
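
Step 4 of the reproduction can also be checked from Python. The helper below is a minimal sketch; classify_node_modules is a hypothetical name, and the worktree path is passed in by the caller because the worktree location is not stated in this issue.

```python
from pathlib import Path


def classify_node_modules(worktree: Path) -> str:
    """Report whether <worktree>/node_modules is a symlink, a real directory, or absent."""
    candidate = worktree / "node_modules"
    # Check is_symlink() first: is_dir() follows symlinks and would also be true for a link.
    if candidate.is_symlink():
        return "symlink"
    if candidate.is_dir():
        return "real directory"
    return "missing"
```

A result of "real directory" for a worktree indicates the duplication described above; "symlink" indicates the dependency directory is shared with the main project.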

Expected behavior

project_index.json should include root-level node_modules in dependency_locations:

{
  "dependency_locations": [
    {"type": "node_modules", "path": "node_modules", "exists": true, "service": "root", "package_manager": "npm"},
    {"type": "node_modules", "path": "apps/frontend/node_modules", "exists": true, "service": "frontend"},
    {"type": "venv", "path": "apps/backend/.venv", "exists": true, "service": "backend"}
  ]
}

This would allow setup_worktree_dependencies() to correctly symlink all dependency directories, saving ~1GB per worktree.
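
To show why the root entry matters, here is a rough sketch of how a setup step might consume dependency_locations. It is not the real setup_worktree_dependencies(); link_dependencies is a hypothetical stand-in, and it assumes every listed directory can simply be symlinked (how the actual code treats each dependency type is not shown in this issue).

```python
from pathlib import Path


def link_dependencies(project_root: Path, worktree: Path, locations: list[dict]) -> list[str]:
    """Symlink each existing dependency directory into the worktree; return the linked paths."""
    linked = []
    for loc in locations:
        # Resolve to an absolute path so the link stays valid from inside the worktree.
        source = (project_root / loc["path"]).resolve()
        target = worktree / loc["path"]
        # Skip sources that do not exist and never overwrite anything already in the worktree.
        if not source.is_dir() or target.exists() or target.is_symlink():
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.symlink_to(source, target_is_directory=True)
        linked.append(loc["path"])
    return linked
```

With an empty or incomplete dependency_locations list, a loop like this has nothing to link for the root, which matches the behavior reported here: the agent then falls back to a full npm install inside the worktree.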

Proposal: Use pnpm to fundamentally solve this

The current fix (PR incoming) patches the analyzer to include root-level dependencies. However, a more fundamental solution would be to adopt pnpm as the package manager.

pnpm uses a global content-addressable store (the exact location depends on the pnpm version and OS; ~/.pnpm-store/ on older versions) where each package version is stored once on disk, with node_modules populated via hardlinks/symlinks to the store. This means:

  1. Zero duplication across worktrees — all worktrees share the same global store via hardlinks
  2. No symlink management needed: pnpm install in a worktree is near-instant and uses negligible extra disk space
  3. Faster installs — packages are cached globally, making fresh installs in worktrees very fast
  4. Strict dependency resolution — prevents phantom dependencies (accessing packages not explicitly declared)
  5. Already supported — Auto-Claude detects pnpm-workspace.yaml as a monorepo indicator (project_analyzer_module.py line 43)
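
The disk-space claim in points 1 and 2 rests on how hard links work. The sketch below illustrates only that filesystem mechanism; it is not pnpm's implementation. The directory names are made up, and it assumes a filesystem that supports hard links with all paths on the same volume.

```python
import os
import tempfile
from pathlib import Path


def hardlink_demo() -> tuple[bool, int]:
    """Hard-link one file into two fake worktrees; return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        store_file = base / "store" / "pkg-1.0.0.js"  # stands in for a file in the store
        store_file.parent.mkdir()
        store_file.write_text("module.exports = 42;\n")

        copies = []
        for name in ("worktree-a", "worktree-b"):
            target = base / name / "node_modules" / "pkg.js"
            target.parent.mkdir(parents=True)
            os.link(store_file, target)  # new directory entry pointing at the same data
            copies.append(target)

        # All three paths should share one inode, so the content is stored only once.
        inodes = {p.stat().st_ino for p in [store_file, *copies]}
        return len(inodes) == 1, store_file.stat().st_nlink
```

Each additional worktree adds directory entries rather than another copy of the file contents, which is why the per-worktree cost stays small as long as the store and the worktrees live on the same filesystem.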

The project already uses npm workspaces (package.json has a workspaces field). Migration to pnpm would be straightforward:

  • Replace package-lock.json with pnpm-lock.yaml
  • Add pnpm-workspace.yaml
  • Update CI/CD scripts and install:all to use pnpm
  • The entire worktree dependency symlink logic in both setup.py and worktree-handlers.ts could potentially be simplified

This would eliminate the root cause rather than patching around it.

Logs / Screenshots

Before fix (dependency_locations missing root node_modules):

dependency_locations: [
  {type: node_modules, path: apps/frontend/node_modules, service: frontend},
  {type: venv, path: apps/backend/.venv, service: backend}
]

After fix — root node_modules correctly included:

dependency_locations: [
  {type: node_modules, path: apps/frontend/node_modules, service: frontend},
  {type: venv, path: apps/backend/.venv, service: backend},
  {type: node_modules, path: node_modules, service: root, package_manager: npm}
]

Metadata

    Status

    In progress
