Add COMPLETED job state for jobs that finish with partial errors #1134

@mihow

Context

From the PR #1121 review: @mihow suggested simplifying the logic that determines whether a Job is in the FAILURE state:

I am thinking we should simplify the logic determining if a Job is in the FAILURE state. Let's just show the counts. Really we need a new state like "COMPLETED" instead of Celery's SUCCESS & FAILURE states. "Completed with errors". Then we can remove a number of checks related to the stage status & overall status.

Problem

Currently jobs use Celery's SUCCESS and FAILURE states, but real-world ML processing jobs often finish with some images failing (bad crops, missing files, timeouts) while the majority succeed. The current approach uses a failure ratio threshold to decide between SUCCESS and FAILURE, which requires threading a complete_state parameter through the progress stages and adds complexity.

Proposal

Add a COMPLETED (or COMPLETED_WITH_ERRORS) state to the Job status choices. A job that finishes processing all images would be COMPLETED regardless of individual failures. The UI would show the actual counts (processed, failed, detections, classifications) and let the user judge the outcome.

This would allow removing:

  • The failure ratio threshold logic in ami/jobs/tasks.py
  • The complete_state parameter threading through _update_job_progress
  • Various checks related to per-stage status determining overall status
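
A rough sketch of how the proposed state could behave, written as plain Python so it runs without Django or Celery. Everything here is hypothetical and not taken from the codebase: the `JobState` enum, the `JobCounts` dataclass, and the `resolve_final_state` and `summarize` helpers are illustrative names only. It assumes `processed` counts every image the job attempted (including failed ones), and that a job which stops before attempting all images still ends in FAILURE; the issue does not specify either point.

```python
from dataclasses import dataclass
from enum import Enum


class JobState(str, Enum):
    """Hypothetical subset of job states; COMPLETED is the proposed addition."""

    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    COMPLETED = "COMPLETED"


@dataclass
class JobCounts:
    """Counts the UI would display instead of a pass/fail verdict."""

    processed: int = 0  # assumption: every image attempted, failed ones included
    failed: int = 0
    detections: int = 0
    classifications: int = 0


def resolve_final_state(total_images: int, counts: JobCounts) -> JobState:
    """Decide the final state without any failure-ratio threshold.

    Assumption: a job that attempted every image is COMPLETED no matter how
    many images failed; a job that stopped early is treated as FAILURE.
    """
    if counts.processed >= total_images:
        return JobState.COMPLETED
    return JobState.FAILURE


def summarize(counts: JobCounts) -> str:
    """Build the counts line that lets the user judge the outcome."""
    return (
        f"{counts.processed} processed, {counts.failed} failed, "
        f"{counts.detections} detections, {counts.classifications} classifications"
    )


if __name__ == "__main__":
    counts = JobCounts(processed=100, failed=12, detections=340, classifications=340)
    print(resolve_final_state(100, counts).value)
    print(summarize(counts))
```

With this shape, the final state depends only on whether the job ran to the end, so nothing like a complete_state parameter needs to be passed between progress stages.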
