feat(sitemap): bounded concurrency + default --limit 200 #23

Merged
arberx merged 1 commit into main from arberx/sitemap-large on Apr 26, 2026

arberx commented Apr 26, 2026

Summary

  • Audit sitemap pages with a 5-worker concurrency pool instead of fully sequentially. Large sitemaps (e.g. Teladoc's 1,800-URL sitemap.xml) used to trigger ~90 min sweeps that looked hung; the new pool and a default --limit 200 together keep the common case under ~2 min while staying polite to a single origin.
  • Split pagesSkipped into pagesFiltered (non-HTML URLs dropped) and pagesTruncated (URLs beyond --limit), expose effectiveLimit in the report, and add an onPlan callback so the CLI can print a stderr notice up front when truncation fires; formatters now spell out what was skipped and how to opt out.
  • Bump the version to 1.5.0 (backwards-compatible: existing JSON consumers keep pagesSkipped as the sum of pagesFiltered and pagesTruncated, and --limit 9999 restores the old "audit everything" behavior).
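
For readers unfamiliar with the pattern, here is a minimal sketch of what an order-preserving, bounded-concurrency map might look like. Only the name mapWithConcurrency and the "order/cap" properties come from this PR; the signature, the worker loop, and the error handling below are assumptions, not the merged implementation.

```typescript
// Hypothetical sketch: the signature and internals are assumed, not taken from the PR diff.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }

  // Results are written by input index, so output order matches input order
  // even when tasks finish out of order.
  const results = new Array<R>(items.length);

  // Shared cursor. Reading and incrementing it happens synchronously between
  // awaits, so two workers never claim the same index.
  let next = 0;

  async function worker(): Promise<void> {
    while (true) {
      const i = next++;
      if (i >= items.length) return;
      results[i] = await fn(items[i], i);
    }
  }

  // Never start more workers than there are items; at most `limit` calls to
  // `fn` are in flight at any moment.
  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In this sketch a rejected task rejects the whole call while the other workers finish their in-flight item; whether the real helper behaves that way is not stated in the PR.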

Test plan

  • pnpm run typecheck
  • pnpm lint
  • pnpm test (94 passed, including new mapWithConcurrency order/cap tests)
  • pnpm run build
  • Smoke test against a large real sitemap before publishing

…cation

- Audit sitemap pages with a 5-worker pool instead of fully sequential to
  cut wall time without hammering the origin.
- Default --limit to 200 when unset so large sitemaps (e.g. 1800 URLs)
  don't trigger hour-long sweeps; existing --limit overrides remain.
- Split pagesSkipped into pagesFiltered (non-HTML) + pagesTruncated (limit)
  in the report; expose effectiveLimit and an onPlan callback. Formatters
  now spell out what was skipped and why; CLI prints a stderr notice up
  front when truncation fires.
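
The accounting described above can be sketched roughly as follows. The field names pagesFiltered, pagesTruncated, pagesSkipped, and effectiveLimit and the default of 200 come from this PR; the planAudit function, the AuditPlan shape, and the extension-based non-HTML check are hypothetical stand-ins, since the PR text does not say how non-HTML URLs are detected.

```typescript
// Hypothetical sketch: planAudit, AuditPlan, and NON_HTML are illustrative names only.
const DEFAULT_LIMIT = 200; // default --limit stated in the PR

interface AuditPlan {
  urls: string[];          // URLs that will actually be audited
  effectiveLimit: number;  // the limit in force (explicit or default)
  pagesFiltered: number;   // non-HTML URLs dropped
  pagesTruncated: number;  // HTML URLs beyond the limit
  pagesSkipped: number;    // kept for existing JSON consumers: filtered + truncated
}

// Assumed heuristic: treat common asset extensions as non-HTML.
const NON_HTML = /\.(pdf|png|jpe?g|gif|svg|webp|zip|xml|json|css|js)$/i;

function planAudit(allUrls: readonly string[], limit?: number): AuditPlan {
  const effectiveLimit = limit ?? DEFAULT_LIMIT;

  // Filter first, then truncate, so the limit counts auditable pages only.
  const htmlUrls = allUrls.filter((u) => !NON_HTML.test(new URL(u).pathname));
  const urls = htmlUrls.slice(0, effectiveLimit);

  const pagesFiltered = allUrls.length - htmlUrls.length;
  const pagesTruncated = htmlUrls.length - urls.length;

  return {
    urls,
    effectiveLimit,
    pagesFiltered,
    pagesTruncated,
    pagesSkipped: pagesFiltered + pagesTruncated,
  };
}
```

A caller could check pagesTruncated > 0 on the returned plan to decide whether to print the up-front stderr notice, which is the role the PR assigns to the onPlan callback.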
arberx merged commit d9adee4 into main on Apr 26, 2026
2 checks passed