Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multiple validation jobs #3465

Merged
merged 5 commits into from
Mar 12, 2025

Conversation

sjanssen2
Copy link
Contributor

Addressing #3379:
With the proposed changes, both validation jobs of a deblur command successfully run through in my local setup. I suspect two mistakes of the current code:

  1. maybe a copy&paste error of lines 1108ff as they seem to be identical to 1091ff
    p = Process(target=launcher['function'],
    args=(plugin_env_script,
    plugin_start_script,
    url,
    self.id,
    job_dir))
    p.start()
    job_id = p.pid
    if dependent_jobs_list:
    # for now, treat dependents as independent when
    # running locally. This means they will not be
    # organized into n 'queues' or 'chains', and
    # will all run simultaneously.
    for dependent in dependent_jobs_list:
    p = Process(target=launcher['function'],
    args=(plugin_env_script,
    plugin_start_script,
    url,
    self.id,
    job_dir))
    p.start()
  2. even though the dependent jobs where "submitted" as child processes, they were not "registered" by Qiita's DB, such that Qiita did not "poll" for successful execution and thus could never set status to "success" of these job(s) even though the actual process did run through and finished with exit 0

@coveralls
Copy link

Coverage Status

coverage: 92.376% (-0.01%) from 92.389%
when pulling 769e382 on jlab:fix_multiple_validation_jobs
into d49339f on qiita-spots:dev.

@antgonza
Copy link
Member

Got it, thank you for the fix. Do you think it will be possible to add a retroactive test that brakes without the fix and passes now with it?

@sjanssen2
Copy link
Contributor Author

We would need to have a plugin that generates multiple artifacts as a test case. Adding the environment, registering all plugins to be "active", executing the workflow up to this point runs approx 15min on my laptop. Not sure if we want to invest that much compute in github actions.
Maybe we can create a mock plugin that just "touches" files for two ore more artifacts?!

@antgonza
Copy link
Member

That's right and after doing a quick search in the code base I think this is happening in test_processing_job.py ProcessingJobTest.test_complete_multiple_outputs. Maybe the only thing missing for the test is changing the value of qiita_config.job_scheduler_dependency_q_cn from 1 to 2; what do you think?

@sjanssen2
Copy link
Contributor Author

I think the problem with the existing test is that it manually sets the state of the dependent job here to "running":

obs2._set_status('running')
which in a running qiita instance should indirectly be done by the submit() function. But the code could be a primer?!

@sjanssen2
Copy link
Contributor Author

The existing test does not seem to actually execute jobs. Creation of the biom files is faked here:

fd, fp1 = mkstemp(suffix="_table.biom")
self._clean_up_files.append(fp1)
close(fd)
with open(fp1, 'w') as f:
f.write('\n')
fd, fp2 = mkstemp(suffix="_table.biom")
self._clean_up_files.append(fp2)
close(fd)
with open(fp2, 'w') as f:
f.write('\n')

Since the submit() function is creating true child processes, I am not sure if we really want to invest in writing a mock to test this behaviour. Maybe an alternative is to just test the DB fingerprint of what a submit would change?! But than it is not truly testing submit. I don't see an easy way forward :-/

@antgonza
Copy link
Member

OK, thank you for checking. I agree with your assessment.

@antgonza antgonza merged commit d0e03de into qiita-spots:dev Mar 12, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants