Status: Open
Labels: bug (Something isn't working)
Description
In identify_problems, this stanza: if pending_job['REASON'] in ['QOSGrpNodeLimit', 'QOSGrpCpuLimit', 'QOSGrpGRES']: raises an error when there are no running jobs in the QoS. That is, a job is pending because of, say, QOSGrpCpuLimit, yet no other jobs are running in that QoS.
This should never happen but was happening for a user, perhaps related to Slurm issues after a downtime.
Traceback (most recent call last):
  File "sq.py", line 527, in
  File "sq.py", line 466, in display_queued_jobs
  File "pandas/core/frame.py", line 7547, in apply
  File "pandas/core/apply.py", line 180, in get_result
  File "pandas/core/apply.py", line 255, in apply_standard
  File "pandas/core/apply.py", line 284, in apply_series_generator
  File "sq.py", line 418, in inner
TypeError: sequence item 0: expected str instance, float found
Here's the view from pdb:
Traceback (most recent call last):
  File "sq.py", line 530, in <module>
    if slurm_info.has_current_jobs(username) and (not args.all_jobs):
  File "sq.py", line 470, in display_queued_jobs
    df['PROBLEMS'] = df.apply(identify_problems(slurm_info), axis=1)
  File "/global/home/users/paciorek/.conda/envs/sq/lib/python3.8/site-packages/pandas/core/frame.py", line 7547, in apply
    return op.get_result()
  File "/global/home/users/paciorek/.conda/envs/sq/lib/python3.8/site-packages/pandas/core/apply.py", line 180, in get_result
    return self.apply_standard()
  File "/global/home/users/paciorek/.conda/envs/sq/lib/python3.8/site-packages/pandas/core/apply.py", line 255, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/global/home/users/paciorek/.conda/envs/sq/lib/python3.8/site-packages/pandas/core/apply.py", line 284, in apply_series_generator
    results[i] = self.f(v)
  File "sq.py", line 422, in inner
    qos_running_jobs_str = ', '.join(qos_running_jobs['JOBID'] + ' (' + qos_running_jobs.apply(lambda x: filter_keys(qos_resource_limit, parse_tres_queue_job(x)), axis=1).apply(display_grp_tres) + ')')
TypeError: sequence item 0: expected str instance, float found
The .join call fails because of the structure of its input when there are no running jobs in the QoS: as the TypeError shows, the first item it receives is a float rather than a str.
I have a freeze directory for this example.
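One possible guard, sketched in plain Python rather than against the actual sq.py code: the helper name format_running_jobs, the "none" placeholder, and the use of lists in place of the pandas objects are all assumptions made for illustration. It reproduces the TypeError from the traceback and shows a join that tolerates an empty QoS or a missing (NaN) job ID.

```python
import math


def format_running_jobs(job_ids, usages):
    """Join 'JOBID (usage)' entries, tolerating empty or NaN input.

    Hypothetical helper: sq.py builds this string from pandas objects;
    plain sequences are used here to keep the sketch self-contained.
    """
    entries = []
    for job_id, usage in zip(job_ids, usages):
        # NaN is a float, and str.join rejects floats, so skip it.
        if isinstance(job_id, float) and math.isnan(job_id):
            continue
        entries.append(f"{job_id} ({usage})")
    # With no running jobs, return a placeholder instead of raising.
    return ", ".join(entries) if entries else "none"


# Reproduce the reported error: str.join raises on a float item.
try:
    ", ".join([float("nan")])
except TypeError as exc:
    error_message = str(exc)

no_jobs = format_running_jobs([], [])
two_jobs = format_running_jobs(["123", "456"], ["cpu=4", "cpu=8"])
```

In sq.py itself, the equivalent might be an early check such as testing whether qos_running_jobs is empty before building qos_running_jobs_str, though that would need to be confirmed against the freeze directory.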