Refactor codebase: package restructure, pyproject.toml migration, and full internal refactor #1

Open
alexwolson wants to merge 14 commits into optimal-uoft:main from alexwolson:main

@alexwolson

This pull request consolidates two major refactoring efforts into a single, cohesive update. It modernizes the repository structure and dependency management, while also performing a deep internal refactor to improve correctness, performance, validation, and long-term maintainability.


1. Repository & Packaging Modernization

Package structure

  • Introduced a proper Python package under neurcam/
    • model.py, loss.py, and __init__.py
  • Removed legacy flat files (NeurCAM.py, Loss.py)
  • Added py.typed marker to support static type checking

Dependency management

  • Replaced environment.yml with pyproject.toml
  • Switched to the hatchling build backend
  • Defined core runtime dependencies (torch, numpy, pandas, scikit-learn, etc.)
  • Added optional dependency groups:
    • dev for development tooling
    • wrds for optional integrations
  • Documented CUDA-enabled PyTorch installation paths
  • Standardized installation workflows:
pip install -e .
# or
uv pip install -e .

Import path update

# Before
from NeurCAM import NeurCAM

# After
from neurcam import NeurCAM

2. Compatibility & Deprecation Fixes

  • Stopped passing the verbose argument to ReduceLROnPlateau, which is deprecated in PyTorch 2.x
  • Fixed progress bar method calls when verbose=False
  • Added missing type annotation for verbose
  • Updated scheduler type hints to non-deprecated forms using forward references (e.g., "torch.optim.lr_scheduler.LRScheduler")
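
As a rough illustration of the scheduler changes listed above, the sketch below builds a ReduceLROnPlateau without the verbose argument and annotates the return type with a string forward reference. The helper name make_plateau_scheduler, the factor and patience values, and the stand-in loss values are assumptions for illustration, not code taken from this PR.

```python
import torch


def make_plateau_scheduler(
    optimizer: torch.optim.Optimizer,
) -> "torch.optim.lr_scheduler.LRScheduler":
    # Hypothetical helper: no `verbose` argument is passed, because it is
    # deprecated in PyTorch 2.x.
    return torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=0
    )


param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = make_plateau_scheduler(optimizer)

lr_history = []
for epoch_loss in [1.0, 1.0]:  # stand-in losses; the second shows no improvement
    scheduler.step(epoch_loss)
    # Read the learning rate back explicitly instead of relying on `verbose`.
    lr_history.append(optimizer.param_groups[0]["lr"])
```

With patience=0, the second non-improving loss triggers a reduction, so anything that used to be printed by verbose can be logged from lr_history instead.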

3. Full Internal Code Refactor

Correctness fixes

  • Moved scheduler.step() to run once per epoch (after the batch loop) instead of once per batch
  • Simplified torch.cdist() usage in FuzzyCMeansLoss by removing unnecessary unsqueeze() / squeeze()
  • Resolved forward-reference issues in type hints
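
The first two correctness fixes can be sketched as follows. This is a minimal illustration, not the PR's training loop: StepLR is swapped in for the real scheduler because its decay is deterministic, and final_lr, the dummy parameter, and the tensor shapes are all assumptions.

```python
import torch


def final_lr(step_per_batch: bool, epochs: int = 3, batches: int = 4) -> float:
    """Train a dummy parameter and return the learning rate at the end."""
    param = torch.nn.Parameter(torch.ones(1))
    optimizer = torch.optim.SGD([param], lr=0.1)
    # StepLR halves the learning rate on every scheduler.step() call.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
    for _ in range(epochs):
        for _ in range(batches):
            optimizer.zero_grad()
            (param ** 2).sum().backward()
            optimizer.step()
            if step_per_batch:
                scheduler.step()  # previous behavior: decays once per batch
        if not step_per_batch:
            scheduler.step()  # fixed behavior: decays once per epoch
    return optimizer.param_groups[0]["lr"]


lr_per_epoch = final_lr(step_per_batch=False)  # 3 decays in total
lr_per_batch = final_lr(step_per_batch=True)   # 12 decays in total

# torch.cdist accepts 2-D inputs directly, so the extra batch dimension
# added by unsqueeze() and removed by squeeze() is not needed.
points = torch.randn(6, 3)
centers = torch.randn(2, 3)
dist_old = torch.cdist(points.unsqueeze(0), centers.unsqueeze(0)).squeeze(0)
dist_new = torch.cdist(points, centers)
```

Stepping per batch applies the decay epochs × batches times, which shrinks the learning rate far faster than the schedule intends; the two cdist calls return the same pairwise distance matrix.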

Validation improvements

  • Added parameter validation across constructors:
    • NeurCAM (e.g., k, m, epochs, batch_size, etc.)
    • NeurCAMModel (e.g., input_dim, repr_dim, etc.)
    • FuzzyCMeansLoss (e.g., m)
  • Added validation for:
    • kl_weight (must be non-negative)
    • single_feature_channels and pairwise_feature_channels (must be non-negative)
    • PyTorch device specifiers (now accepts any valid value such as "cuda:0", "cuda:1", "mps", "cpu", etc.)
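
A minimal sketch of what constructor-level validation along these lines might look like. The function name validate_neurcam_params is hypothetical, and the rules for k, epochs, batch_size, and m (positive integers, m greater than 1 as is standard for a fuzzy c-means fuzzifier) are assumptions; only the non-negativity rules for kl_weight and the channel counts come from the description above. Device validation is left out to keep the sketch dependency-free.

```python
from numbers import Real


def validate_neurcam_params(
    k: int,
    m: float,
    epochs: int,
    batch_size: int,
    kl_weight: float = 0.0,
    single_feature_channels: int = 0,
    pairwise_feature_channels: int = 0,
) -> None:
    """Hypothetical validator; raises ValueError on the first bad argument."""
    # Assumed rule: counts must be positive integers (bool is rejected).
    for name, value in (("k", k), ("epochs", epochs), ("batch_size", batch_size)):
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    # Assumed rule: the fuzzifier m must be strictly greater than 1.
    if not isinstance(m, Real) or m <= 1:
        raise ValueError(f"m must be a real number greater than 1, got {m!r}")
    # Stated in the PR description: kl_weight must be non-negative.
    if not isinstance(kl_weight, Real) or kl_weight < 0:
        raise ValueError(f"kl_weight must be non-negative, got {kl_weight!r}")
    # Stated in the PR description: channel counts must be non-negative.
    for name, value in (
        ("single_feature_channels", single_feature_channels),
        ("pairwise_feature_channels", pairwise_feature_channels),
    ):
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")


# Valid arguments pass silently; invalid ones raise ValueError.
validate_neurcam_params(k=3, m=1.5, epochs=10, batch_size=32)
```

Failing early in the constructor means a bad argument surfaces with a named parameter in the message rather than as a shape or NaN error deep inside training.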

Performance & code quality

  • Vectorized _o1_valid_cuts and _o2_valid_cuts to remove loop-based logic and reduce duplication
  • Optimized tensor initialization in _forward to avoid unnecessary .to() calls
  • Added comprehensive input validation in _prepare_data
  • Simplified dictionary membership checks in _separated_forward_o1
  • Improved readability of _build_projection_network using extend()
  • Applied Black formatting consistently across the codebase
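
Three of the smaller cleanups above follow common PyTorch and Python idioms, sketched here under assumptions. The function build_projection_layers, its Linear plus ReLU layout, the tensor shapes, and the cache dictionary are all hypothetical stand-ins; the actual _build_projection_network, _forward, and _separated_forward_o1 in the PR may differ.

```python
import torch
from torch import nn


def build_projection_layers(
    input_dim: int, hidden_dims: list[int], repr_dim: int
) -> nn.Sequential:
    """Hypothetical builder that grows the layer list with extend()."""
    layers: list[nn.Module] = []
    prev = input_dim
    for width in hidden_dims:
        # One extend() call adds the linear layer and its activation together.
        layers.extend([nn.Linear(prev, width), nn.ReLU()])
        prev = width
    layers.append(nn.Linear(prev, repr_dim))
    return nn.Sequential(*layers)


device = torch.device("cpu")
batch = torch.randn(4, 8, device=device)

# Allocate directly on the target device and dtype instead of creating the
# tensor on the default device and then calling .to(device).
accumulator = torch.zeros(batch.shape[0], 2, device=batch.device, dtype=batch.dtype)

network = build_projection_layers(8, [16, 16], 2).to(device)
output = accumulator + network(batch)

# Membership test against the dict itself rather than dict.keys().
cache = {"feature_0": 1}
has_feature = "feature_0" in cache
```

Passing device= at construction avoids an intermediate allocation and copy, which matters most when the tensor is created on every forward pass.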

4. Status

  • All tests pass.
  • This PR intentionally combines structural, packaging, and internal refactors to avoid intermediate broken states and to present a clean, modernized baseline for future development.

Copilot AI and others added 14 commits January 20, 2026 17:51
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
…ize-files

Refactor repository structure and migrate to pyproject.toml
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Co-authored-by: alexwolson <8996640+alexwolson@users.noreply.github.com>
Refactor codebase: vectorize operations, add validation, fix deprecations