
Conversation

DrorDvash commented Dec 29, 2025

Hey! 👋

This PR adds performance improvements that help BOFHound handle larger AD environments more efficiently, along with "resume" functionality and bug fixes.
I built it with AI assistance during a red team engagement; I was dealing with a huge domain (millions of objects), so this was needed.

What's New

Parallel ACL Processing

ACL parsing now uses multiple CPU cores instead of running single-threaded. On a test dataset, this reduced ACL processing time from hours to minutes.

Before: Single python.exe process maxing one core
After: Multiple worker processes utilizing available cores

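A minimal sketch of the parallel approach, assuming a picklable per-object `parse_acl` helper and a flat list of objects (names here are illustrative, not BOFHound's actual internals):

```python
import multiprocessing as mp
from multiprocessing import Pool

def default_worker_count():
    """Use roughly 90% of available cores, matching the PR's --workers default."""
    return max(1, int(mp.cpu_count() * 0.9))

def parse_acl(obj):
    """Placeholder for per-object ACL parsing; must be picklable to run in worker processes."""
    return obj  # a real implementation would resolve the security descriptor here

def process_acls(objects, workers=None):
    workers = workers or default_worker_count()
    # chunksize keeps IPC overhead low when there are millions of small tasks
    with Pool(processes=workers) as pool:
        return pool.map(parse_acl, objects, chunksize=256)
```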

Local Cache for Faster Updates

BOFHound now saves processed data to a local SQLite cache (bofhound_cache.db). On the next run, it remembers what was already done and only processes new/changed objects.

What gets cached:

  • Parsed LDAP objects (users, groups, computers, etc.)
  • Computed ACL relationships
  • SID-to-object mappings for ACL resolution

On subsequent runs: Only new objects get parsed and their ACLs computed. Cached objects are skipped entirely.
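A minimal sketch of the skip-if-cached pattern, keyed on object SID/DN as noted in the commit list below (table and column names are illustrative, not necessarily those used inside `bofhound_cache.db`):

```python
import sqlite3

def open_cache(path="bofhound_cache.db"):
    conn = sqlite3.connect(path)
    # key is the object's SID or DN; parsed_json holds the already-processed object
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_objects (key TEXT PRIMARY KEY, parsed_json TEXT)"
    )
    return conn

def process_incremental(conn, raw_objects, parse_fn):
    """Parse only objects whose SID/DN key is not already cached."""
    results = []
    for key, raw in raw_objects:             # (sid_or_dn, raw_ldap_entry) pairs
        row = conn.execute(
            "SELECT parsed_json FROM processed_objects WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            results.append(row[0])           # cache hit: skip parsing entirely
            continue
        parsed = parse_fn(raw)               # cache miss: parse and remember it
        conn.execute(
            "INSERT OR REPLACE INTO processed_objects VALUES (?, ?)", (key, parsed)
        )
        results.append(parsed)
    conn.commit()
    return results
```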

Use cases:

  • Collecting data over time: First run processes everything. A week later, collect more LDAP data and run again on the same folder. Cache skips what it already saw, only processes the new stuff.
  • Multiple domains: Process domain1, save results. Then process domain2 with --context-from domain1_output/ to reuse domain1's SID mappings without recomputing them.
  • Iterative checks: Run once, review results, collect more data (like certificates), run again to add more objects without redoing everything.

[Screenshot: using the --context-from flag]

Better Visibility

New progress indicators show what's happening during long runs:

  • Real-time progress bar with ETA
  • Objects/second processing rate
  • CPU core utilization info
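The PR does not name the library behind the progress output; as an assumption, an equivalent readout (live bar, rate, ETA) could be produced with `tqdm`, for example:

```python
from tqdm import tqdm

def process_with_progress(objects, parse_acl, workers):
    print(f"Using {workers} worker processes")
    results = []
    # tqdm prints a live bar with elapsed time, ETA, and objects/second
    for obj in tqdm(objects, desc="Parsing ACLs", unit="obj"):
        results.append(parse_acl(obj))
    return results
```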

New CLI Options

--workers N        Number of parallel workers (default: ~90% of CPU cores)
--no-cache         Disable caching for this run
--cache-stats      Show cache statistics and exit
--context-from     Load SID mappings from a previous run's cache
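For example, the new flags could be combined like this (input/output arguments omitted; invocations are illustrative only):

bofhound --workers 16                     # cap ACL parsing at 16 worker processes
bofhound --no-cache                       # ignore the cache and re-parse everything
bofhound --cache-stats                    # print cache statistics and exit
bofhound --context-from domain1_output/   # reuse SID mappings from a previous run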

Bug Fixes

  • Fixed crash on malformed certificate data (invalid base64 in the cACertificate attribute); a sketch of the graceful handling follows below
  • Fixed crash when certificate chain building encounters null certificates

These were discovered when testing against real-world AD data that contained edge cases not present in lab environments.
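For the malformed-certificate fix above, a minimal sketch of the graceful-handling idea (assumed shape, not the PR's exact code):

```python
import base64
import binascii
import logging

def safe_decode_cert(b64_value):
    """Return DER bytes for a cACertificate value, or None if the data is malformed."""
    try:
        return base64.b64decode(b64_value, validate=True)
    except (binascii.Error, ValueError, TypeError) as exc:
        logging.warning("Skipping malformed certificate data: %s", exc)
        return None  # callers skip None entries instead of crashing
```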

Disclaimer

I was in the same spot as in #39, so I jumped in to make it easier.
I ran numerous tests and fixed the bugs I encountered, but this is still not bulletproof or perfect. It was built with vibe coding. It noticeably improves performance and QoL for me, and I hope it helps you too.


- Add SQLite object cache to skip already-processed objects (by SID/DN)
- Add performance CLI options: --no-cache, --cache-file, --workers, --cache-stats
- Improve object initialization and enable parallel-ready ACL processing
- Update README and gitignore for cleaner workflows
- Fix context-from to work with directory paths
- Handle bad certificate data gracefully instead of crashing
- Fix certificate chain issues
- Show worker count in a single line
- Set worker count to about 90% of CPU cores by default
- Use double quotes outer, single quotes inner for f-strings so they work in PowerShell, bash, zsh, and other shells
@DrorDvash changed the title from "Performance Improvements - Multiprocessing, Local Cache DB, Bug Fixes and more.." to "Performance Improvements - Multiprocessing, local cache DB, bug fixes and more.." on Dec 29, 2025
Replace linear searches with dictionary lookups for delegation
and OU membership resolution. This provides significant performance
improvements on large datasets by reducing each relationship lookup
from O(n) to O(1), i.e. overall resolution from roughly O(n²) to O(n).
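A minimal sketch of that dictionary-lookup change (illustrative object shape; the real code indexes BOFHound's parsed objects):

```python
def index_by_dn(objects):
    """Single pass to build a DN -> object index: O(n), done once."""
    return {obj["distinguishedName"].upper(): obj for obj in objects}

def resolve_ou_membership(objects):
    dn_index = index_by_dn(objects)
    for obj in objects:
        dn = obj["distinguishedName"]
        if "," not in dn:
            continue                               # no parent container
        parent_dn = dn.split(",", 1)[1].upper()    # strip the leading RDN
        parent = dn_index.get(parent_dn)           # O(1) instead of scanning every object
        if parent is not None:
            parent.setdefault("contains", []).append(dn)
```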
@DrorDvash DrorDvash force-pushed the feature/performance-improvements branch from cd9f570 to 4674d61 Compare December 30, 2025 20:35
DrorDvash (author) commented Dec 31, 2025

Well, I must share these results.
I ran bofhound and measured the performance.
Tested on a production dataset: millions of LDAP objects and millions of ACL relationships, on a powerful server with 80 CPU cores (72 workers). I know this server's specs don't match typical user hardware, but the multiprocessing and caching will save you time on any multi-core machine.

What used to take 2 days now finishes in under 3 hours.

PR Baseline (commit 2bc8cb7):

Total runtime: 49h 52m (~2 days 2 hours)
Log parsing: 1h 9m
ACL processing: 43h
Delegation resolution: 35m
OU resolution: 46m
Cache update: 1h 16m
JSON write: 2h 51m

After additional PR optimizations (commit a4e345a):

Total runtime: 2h 44m
Log parsing: 9m (7.7× faster)
ACL processing: 32m at ~1,585 obj/sec (80× faster)
Delegation + OU resolution: ~7 seconds combined
Cache update: 15m (5× faster)
JSON write: 1h 16m

115 GB of data were parsed in total.


@Tw1sm Tw1sm self-assigned this Dec 31, 2025
@Tw1sm Tw1sm added the performance Related to speed/resource consumption label Dec 31, 2025
@DrorDvash DrorDvash force-pushed the feature/performance-improvements branch from 7f47228 to dc22e69 Compare January 14, 2026 08:56
@DrorDvash DrorDvash force-pushed the feature/performance-improvements branch from dc22e69 to 1f60b20 Compare January 14, 2026 09:16
