Refactor utils: improve robustness, memory efficiency, and unify Gaussian noise functions #113

Open
agentksimha wants to merge 25 commits into humanai-foundation:main from agentksimha:refactor/utils

agentksimha commented Mar 20, 2026

This PR makes the utility functions more robust, removes duplicated code, and improves performance. `pdf_to_images` now validates its paths and has better error handling. `count_lines_in_file` is now memory-efficient: it streams the file and handles missing files, permission errors, encoding errors, and other runtime errors. `add_gaussian_noise` and `add_black_gaussian_noise` are merged into a single function with a `mode` parameter, and all usages are updated accordingly. Overall, this reduces code duplication, improves efficiency for large files, and makes the augmentation pipeline more flexible and maintainable.
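For reviewers who want a concrete picture of the two main changes, here is a minimal sketch of what a streaming line counter and a unified noise function could look like. It is not the code in this PR. The `None` return on failure, the `mode` values `"standard"` and `"black"`, the flat list of 0-255 pixel values, and the reading of "black" noise as noise that only darkens pixels are all assumptions made for illustration.

```python
import random
from pathlib import Path


def count_lines_in_file(path, encoding="utf-8"):
    """Count lines by streaming the file rather than reading it whole.

    Returns the line count, or None if the file cannot be read.
    (Returning None is an assumed convention; the PR may raise or log.)
    """
    file_path = Path(path)
    if not file_path.is_file():
        return None
    try:
        # Iterating the file object yields one line at a time, so memory
        # use stays roughly constant regardless of file size.
        with file_path.open("r", encoding=encoding) as handle:
            return sum(1 for _ in handle)
    except (PermissionError, UnicodeDecodeError, OSError):
        return None


def add_gaussian_noise(pixels, mean=0.0, std=10.0, mode="standard", rng=None):
    """Add Gaussian noise to a flat list of 0-255 pixel intensities.

    mode="standard": signed noise is added to each pixel.
    mode="black": ASSUMPTION, noise may only darken a pixel.
    """
    if mode not in ("standard", "black"):
        raise ValueError(f"unknown mode: {mode!r}")
    rng = rng or random.Random()
    noisy = []
    for value in pixels:
        delta = rng.gauss(mean, std)
        if mode == "black":
            # Force the perturbation to be non-positive (darkening only).
            delta = -abs(delta)
        # Clamp to the valid 8-bit intensity range.
        noisy.append(min(255, max(0, round(value + delta))))
    return noisy
```

Under these assumptions, a call site that previously used `add_black_gaussian_noise(pixels)` would become `add_gaussian_noise(pixels, mode="black")`, which is the kind of call-site update the description refers to.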

2 participants