Release v1.0.0

@huerni released this 24 Oct 06:13
· 37 commits to master since this release
9bba5f4

Overview
This is the first General Availability (GA) release of CraneSched and is considered ready for production use.
For CraneSched documentation, see CraneSched-document.

New Features

  • Submit Batch Jobs via cbatch: Users submit a complete job script with cbatch, and the system schedules and executes it.
    • Support specifying the resources required for the job, including memory, number of cores, parallel tasks per node, number of nodes needed, etc.
    • Support specifying job execution parameters, including specifying/excluding certain compute nodes, specifying cluster partition type, QoS configuration, repeat execution count, timeout duration, environment variables, etc.
    • Support specifying task output information, including task name, the account and user associated with the task, email notification method, and redirection of the execution log and error log.
  • Submit Interactive Jobs via calloc and crun: Users specify task resources on the command line, and the task is launched on a compute node. With calloc, users must log into the compute node manually, while crun connects to it automatically.
    • Support specifying the resources required for the job, including memory, number of cores, parallel tasks per node, number of nodes needed, etc.
    • Support specifying job execution parameters, including specifying/excluding certain compute nodes, specifying cluster partition type, QoS configuration, timeout duration, environment variables, etc.
    • Support specifying task output information, including task name, associated account, log level, etc.
  • Cancel Jobs via ccancel: Support unified job cancellation based on conditions such as submission account, submission username, task name, task ID, node, cluster partition, task status, etc.
  • View Job Queue via cqueue: Support filtering query results based on conditions such as submission account, user, task name, task ID, cluster partition, QoS configuration, task status, etc.
  • View Completed Job Queue via cacct: Support filtering query results based on conditions such as submission account, user, task name, task ID, cluster partition, QoS configuration, task status, execution time, submission time, end time, etc.
  • View Node and Partition Status via cinfo: Support filtering query results based on node response status, partition of the node, node work status, etc.; support querying at fixed intervals.
  • Dynamically View/Modify Node/Partition/Task Status via ccontrol: Support viewing detailed information of nodes/tasks/partitions; modifying job timeout, priority, etc.; modifying node status; pausing and resuming tasks, etc.
  • Manage User and Account Information via cacctmgr: Support adding, deleting, modifying, and querying accounts/users/QoS/partitions; banning/unbanning users/accounts.
  • Job Monitor Hook
  • Support for Plugin Module
  • Device Support
  • Support for IPv6
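
The condition-based filtering described for ccancel, cqueue, and cacct can be pictured with a small Python sketch. This is only an illustration of the idea, not CraneSched code: the `Job` record, its field names, and the `filter_jobs` helper are all hypothetical, and the assumption that multiple conditions combine with a logical AND is ours rather than something these release notes state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative, not CraneSched's schema.
    task_id: int
    name: str
    user: str
    account: str
    partition: str
    state: str


def filter_jobs(jobs, **conditions):
    """Return the jobs whose fields equal every given condition (AND is assumed)."""
    unknown = set(conditions) - set(Job.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown filter field(s): {sorted(unknown)}")
    return [
        job for job in jobs
        if all(getattr(job, field) == value for field, value in conditions.items())
    ]


# Made-up sample data for the sketch.
jobs = [
    Job(1, "train", "alice", "lab_a", "CPU", "Running"),
    Job(2, "eval", "alice", "lab_a", "GPU", "Pending"),
    Job(3, "train", "bob", "lab_b", "CPU", "Pending"),
]

# Select alice's pending jobs, e.g. as candidates for a bulk cancellation.
pending_alice = filter_jobs(jobs, user="alice", state="Pending")
```

Calling `filter_jobs` with no conditions returns every job, which mirrors an unfiltered queue listing; for the actual option names and matching rules of each command, consult the CraneSched documentation.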