Hi,
We recently upgraded our LSF version from 8 to 10, and SJM no longer works.
There is no error message, but the process appears to crash with a core dump: a binary file with a name like "core.123456" is produced, and it is much bigger than the original job file.
$ cat example4.sjm
job_begin
name jobA
time 1h
memory 500m
queue normal
project CompBio
cmd_begin
echo "hello from job jobA";
cmd_end
job_end
job_begin
name jobB
time 30m
memory 1g
queue normal
project CompBio
cmd_begin
echo "hello from job jobB";
cmd_end
job_end
order jobB after jobA
$ ~/app/SJM/src/sjm example4.sjm
Status file: example4.sjm.status
Log file: example4.sjm.status.log
Running jobs in the background....
$ ls -lrt
total 7200
...
-rw-r--r-- 1 xxxxx xxxxx 305 Mar 6 14:05 example4.sjm
-rw-r--r-- 1 xxxxx xxxxx 339 Mar 7 12:14 example4.sjm.status
-rw-r--r-- 1 xxxxx xxxxx 83 Mar 7 12:14 example4.sjm.status.log
-rw------- 1 xxxxx xxxxx 11247616 Mar 7 12:14 core.52031
We did some debugging, and the problem seems to be in the job submission call:
LS_LONG_INT jobId = lsb_submit(&req, &reply);
but we are still not sure what exactly went wrong.
Could you provide some insights?
Thanks a lot,