7 changes: 4 additions & 3 deletions simexpal/base.py
@@ -453,16 +453,17 @@ def query_slurm(self):
         if not self.slurm_queried:
             # -h omits the header line.
             # -r outputs one job array element per line.
-            output = subprocess.check_output(['squeue', '-h', '-r'])
+            # --format=%i,%t outputs jobid,status
+            output = subprocess.check_output(['squeue', '-h', '-r', '--format=%i,%t'])
             output = output.decode().splitlines()

             self.slurm_queried = True

             if len(output) > 0:
-                output = [entry.split() for entry in output]
+                output = [entry.split(',') for entry in output]
                 for entry in output:
                     entry_jobid = entry[0]
-                    entry_state = entry[4]
+                    entry_state = entry[1]
Collaborator
Doesn't this require that len(output) >= 2?

                     if entry_state == 'PD':
                         status = Status.SUBMITTED
                     elif entry_state in ['R', 'CG']:
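
A minimal sketch of how the comma-separated `squeue` output could be parsed defensively, in the spirit of the review question above about entry length. The helper name `parse_squeue_lines` and the sample lines are hypothetical and not part of simexpal; the sketch only assumes the `jobid,state` line shape produced by `--format=%i,%t`.

```python
def parse_squeue_lines(lines):
    """Map job id -> state code for lines shaped like 'jobid,state'.

    Hypothetical helper for illustration; not simexpal's implementation.
    """
    states = {}
    for line in lines:
        # partition() never raises; `sep` is empty when no comma is present.
        jobid, sep, state = line.strip().partition(',')
        if not sep or not jobid or not state:
            # Skip blank or malformed lines instead of failing on entry[1].
            continue
        states[jobid] = state
    return states


# Made-up sample output, including a blank and a malformed line.
sample = "1001_0,PD\n1001_1,R\n\nmalformed\n1002,CG\n"
print(parse_squeue_lines(sample.splitlines()))
# → {'1001_0': 'PD', '1001_1': 'R', '1002': 'CG'}
```

Skipping malformed lines is one possible design choice; raising an error with the offending line would be a reasonable alternative if silent drops are undesirable.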
17 changes: 17 additions & 0 deletions tests/examples/slurm-pr-183/experiments.yml
Collaborator
What's with the tests/examples/slurm-pr-183 folder structure? Do you want to leave it as this? Or what would be the actual folder you want to put this in?

@@ -0,0 +1,17 @@
+experiments:
+  - name: slow
+    args: [python, 'slow.py', '@EXTRA_ARGS@']
+    stdout: out
+
+instances:
+  - repo: local
+    items:
+      - name: bar
+        files: []
+        extra_args: []
+
+variants:
+  - axis: 'foo'
+    items:
+      - name: '1'
+        extra_args: ['1']
4 changes: 4 additions & 0 deletions tests/examples/slurm-pr-183/readme.md
@@ -0,0 +1,4 @@
+# Test setup for PR #183
+Issue: `simex e list` lists experiments as 'broken' even though they are running fine in slurm.
+We currently do not have integration tests in simex, but the experiment setup in this folder can be used manually to test for this bug and similar problems if the need arises.
+Usage: run `runtest.sh` and confirm that the output shows the correct status (i.e. not 'broken').
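
As a complement to the manual setup described in the readme, the `squeue` call could in principle be exercised without a Slurm cluster by stubbing `subprocess.check_output`. The following is only a sketch under assumptions: `query_states` is a hypothetical stand-in for the relevant part of `query_slurm`, and the fake output is made up.

```python
import subprocess
from unittest import mock


def query_states():
    # Hypothetical stand-in for the squeue call made in query_slurm.
    output = subprocess.check_output(['squeue', '-h', '-r', '--format=%i,%t'])
    lines = output.decode().splitlines()
    # Keep only lines that contain a comma, split into (jobid, state).
    return dict(line.split(',', 1) for line in lines if ',' in line)


def check_with_fake_squeue():
    fake = b"42_0,PD\n42_1,R\n"  # made-up squeue output
    # Replace check_output so no real squeue binary is needed.
    with mock.patch('subprocess.check_output', return_value=fake) as stub:
        states = query_states()
    stub.assert_called_once_with(['squeue', '-h', '-r', '--format=%i,%t'])
    return states


print(check_with_fake_squeue())
# → {'42_0': 'PD', '42_1': 'R'}
```

This does not replace running against a real scheduler; it only checks that the parsing matches the requested output format.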
16 changes: 16 additions & 0 deletions tests/examples/slurm-pr-183/runtest.sh
@@ -0,0 +1,16 @@
+simex e purge --failed -f
Collaborator
Perhaps use an interpreter line here?:

#!/bin/bash


+rm -r output
+rm -r aux
+rm .simex.cache
+
+simex e list
+simex e launch --launcher scaling
+simex e list
+
+squeue
+squeue -r
+
+sleep 3s
+
+simex e list
3 changes: 3 additions & 0 deletions tests/examples/slurm-pr-183/slow.py
@@ -0,0 +1,3 @@
+import time
+
+time.sleep(20)