Core code to-do list #18

@sam-may

Here is my current to-do list of improvements and new features for the core code.

Legend:
‼️ highest priority
❗ higher priority
🐢 lower priority/solution is time-intensive

  • 1. DataFetcher
    • 🐢 1.1 Scale automatically when running over a very large number of histograms (avoid excessive memory usage)
  • 2. Algorithms/training
    • ‼️ 2.1 Train only with good runs by default
    • ‼️ 2.2 Implement flattening of 2d histograms for PCAs (merge Si's code)
    • 2.3 Autoencoders
      • ❗ 2.3.1 Make training a single AutoEncoder per histogram the default behavior
      • ❗ 2.3.2 Make algorithms and training configurable through JSON input (rather than only through command-line arguments)
  • 3. Assessment
    • ‼️ 3.1 Make SSE histograms for good/bad runs in addition to train/test set
    • ❗ 3.2 Make SSE histograms both in per-events set format and per-algorithm format
    • ❗ 3.3 Plotting of 2d histograms
    • ❗ 3.4 Function for ROC curve plots
    • ❗ 3.5 Function to make a summary table of AUC and TPR/FPR values
    • 🐢 3.6 Switch from yahist to boost, mplhep
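
To make items 2.1, 2.2, 3.1 and 3.4 more concrete, here is a rough sketch of how they could fit together. It is not the repo's implementation: it assumes histograms arrive as a NumPy array of shape `(n_runs, nx, ny)`, uses a plain SVD as a stand-in for whatever PCA class the core code uses, and every function name (`flatten_hists`, `fit_pca`, `sse`, `roc_auc`) is hypothetical. The data is synthetic.

```python
import numpy as np

def flatten_hists(hists):
    """Flatten a stack of 2D histograms (n_runs, nx, ny) into (n_runs, nx * ny)."""
    hists = np.asarray(hists, dtype=float)
    return hists.reshape(hists.shape[0], -1)

def fit_pca(x, n_components):
    """Return (mean, components) from an SVD of the mean-centered data."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def sse(x, mean, components):
    """Sum of squared errors between each row and its PCA reconstruction."""
    centered = x - mean
    reco = centered @ components.T @ components
    return ((centered - reco) ** 2).sum(axis=1)

def roc_auc(scores_good, scores_bad):
    """AUC as the fraction of (good, bad) pairs where the bad run has the
    higher score; ties count as 0.5."""
    g = np.asarray(scores_good)[:, None]
    b = np.asarray(scores_bad)[None, :]
    return float(((b > g) + 0.5 * (b == g)).mean())

# Synthetic example: "bad" runs get an artificial spike in one bin.
rng = np.random.default_rng(0)
good = rng.poisson(100, size=(50, 4, 5))
bad = rng.poisson(100, size=(10, 4, 5))
bad[:, 0, 0] += 200

x_good, x_bad = flatten_hists(good), flatten_hists(bad)
mean, comps = fit_pca(x_good, n_components=3)  # train on good runs only (2.1)
sse_good = sse(x_good, mean, comps)            # inputs for good/bad SSE histograms (3.1)
sse_bad = sse(x_bad, mean, comps)
auc = roc_auc(sse_good, sse_bad)               # one number for the summary table (3.5)
```

The pairwise AUC is quadratic in the number of runs, so it is only meant for small checks like this one; a threshold scan would be the natural choice for the actual ROC curve plots in 3.4.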
