3 changes: 3 additions & 0 deletions .changelog/26974.txt
@@ -0,0 +1,3 @@
+```release-note:bug
+core: Fixed a bug where GC batch sizes for jobs resulted in excessively large Raft logs
+```
6 changes: 4 additions & 2 deletions nomad/core_sched.go
@@ -216,8 +216,10 @@ OUTER:
 
 // jobReap contacts the leader and issues a reap on the passed jobs
 func (c *CoreScheduler) jobReap(jobs []*structs.Job, leaderACL string) error {
-	// Call to the leader to issue the reap
-	for _, req := range c.partitionJobReap(jobs, leaderACL, structs.MaxUUIDsPerWriteRequest) {
+	// Call to the leader to issue the reap with a batch size intended to be
+	// similar to the GC by batches of UUIDs for evals, allocs, and nodes
+	// (limited by structs.MaxUUIDsPerWriteRequest)
Comment on lines +219 to +221 (Member Author):
By the way, I think we'll want to revisit this limit if we adopt raft-wal in Nomad 1.12.0, but let's cross that bridge when we get to it
+	for _, req := range c.partitionJobReap(jobs, leaderACL, 2048) {
 		var resp structs.JobBatchDeregisterResponse
 		if err := c.srv.RPC(structs.JobBatchDeregisterRPCMethod, req, &resp); err != nil {
 			c.logger.Error("batch job reap failed", "error", err)
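For context on the batching pattern above, here is a minimal sketch (not the actual Nomad implementation; the `job` and `batchDeregisterRequest` types and this simplified `partitionJobReap` signature are stand-ins) of how splitting one large reap into requests of at most 2048 jobs keeps any single Raft-applied write bounded:

```go
package main

import "fmt"

// job is a simplified stand-in for structs.Job; only the fields needed to
// identify a job for deregistration are shown.
type job struct {
	Namespace string
	ID        string
}

// batchDeregisterRequest is a simplified stand-in for a batch deregister
// request; the real structs.JobBatchDeregisterRequest has a different shape.
type batchDeregisterRequest struct {
	JobIDs    []string
	LeaderACL string
}

// partitionJobReap splits jobs into requests of at most batchSize entries so
// that no single request (and therefore no single Raft log entry produced by
// applying it) carries an unbounded payload.
func partitionJobReap(jobs []*job, leaderACL string, batchSize int) []*batchDeregisterRequest {
	var requests []*batchDeregisterRequest
	for start := 0; start < len(jobs); start += batchSize {
		end := start + batchSize
		if end > len(jobs) {
			end = len(jobs)
		}
		req := &batchDeregisterRequest{LeaderACL: leaderACL}
		for _, j := range jobs[start:end] {
			req.JobIDs = append(req.JobIDs, j.Namespace+"/"+j.ID)
		}
		requests = append(requests, req)
	}
	return requests
}

func main() {
	jobs := make([]*job, 5000)
	for i := range jobs {
		jobs[i] = &job{Namespace: "default", ID: fmt.Sprintf("job-%d", i)}
	}
	// With a batch size of 2048, 5000 jobs are reaped in 3 requests
	// (2048 + 2048 + 904), each of which would be applied to Raft separately.
	for i, req := range partitionJobReap(jobs, "acl-token", 2048) {
		fmt.Printf("request %d: %d jobs\n", i, len(req.JobIDs))
	}
}
```

In the actual change, each partition is sent through the JobBatchDeregisterRPCMethod call shown in the diff; the chunking step is what bounds the size of each Raft log entry.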