Skip to content

Conversation

@ClaytonPassmore
Copy link
Contributor

What

Extends the ActiveRecordBatchEnumerator to support relations with composite primary keys.

Accomplishes this by introducing a utility class, ActiveRecordBatchEnumerator::ColumnManager, which makes it a bit more convenient to ensure primary key values are being plucked and stripped out of the yielded cursor when appropriate.

Why

I was surprised to find that I couldn't batch over relations with composite primary keys in the maintenance_tasks gem!

This is a frequently encountered use case for multi-tenant apps. Having it supported would be helpful for me (and hopefully others)!

The logic that handled this for single column primary keys was fine, but throwing multiple primary key columns into the mix greatly increased the complexity. I figured pulling it out into a separate class would make it a bit more maintainable.

Changed behaviour

There is a change in behaviour being introduced here:

Previously we were short-circuiting when plucking a single primary key column. This "short circuit" was skipping a call to #serialize_column_values.

That definitely caused some inconsistency, because for relations that had a single primary key column which also happened to be a timestamp, the timestamp wouldn't be serialized when only the primary key was being selected. But if multiple columns were selected, the timestamp would be serialized.

I have changed the logic to no longer short circuit, and instead run the values through #serialize_column_values every time. If that's an issue and you would rather keep the old (inconsistent) behaviour, let me know and I can tweak that.

Notes on maintenance_tasks support

It appears that the maintenance tasks gem is still plagued by a multi-column cursor deserialization bug - Shopify/maintenance_tasks#1226. So even though you can now iterate over a composite primary key relation in batches, resuming the task from a specific cursor position won't work in the maintenance tasks gem just yet.

There's nothing actionable about that issue for this gem - it's just worth noting for anyone poking around trying to figure out if composite primary keys are supported.

Notes on code abstraction

It wasn't until after I finished writing this code that I realized there was a bunch of duplication between the batch enumerator cursor logic and the ActiveRecordCursor class.

I see in #91 that there was an intention to go back and pull out some of the common logic but it seems like that never happened.

The utility class I added has a bit of overlap with ActiveRecordCursor, and could potentially be used in that class (though you'd likely want to move the utility class out of the batch enumerator namespace).

Extends the active record batch enumerator to support models with
composite primary keys. Accomplishes this by introducing a utility
class, `ActiveRecordBatchEnumerator::ColumnManager`, which makes it a
bit more convenient to ensure primary key values are being plucked and
stripped out of the yielded cursor when appropriate.
@ClaytonPassmore
Copy link
Contributor Author

That's a lot of red - I'll see if I can fix this up shortly.

@Mangara
Copy link
Contributor

Mangara commented Nov 27, 2025

Thanks for the detailed contribution!

That's a lot of red - I'll see if I can fix this up shortly.

Composite primary keys were introduced in Rails 7.1, so we can skip the tests on earlier versions. I'm a little concerned that some of the tests appear to be flaky even on later versions.

* Fixed issues with yardoc comments
* Fixed tests that failed due to order schema not being loaded
* Skip composite primary key tests until ActiveRecord 7.1
@ClaytonPassmore
Copy link
Contributor Author

@Mangara happy to contribute!

I found the source of the flakey specs. It turned out to be fairly innocuous. Since the Order model doesn't get seeded in the test helper, if one of the tests that track queries for the order model run first without the Order schema ever being loaded, then the query matching gets messed up because ActiveRecord does a quick query to load the schema in the middle of the test. 😄

@ClaytonPassmore
Copy link
Contributor Author

ClaytonPassmore commented Nov 27, 2025

Sorry for all the failed runs. I see the issue here. Should be an easy fix. I'll see if I can do a local run with older versions of Rails to make sure this is resolved once and for all.

Rails 7.1 introduced support for primary keys, along with support for
querying multi-columns values - e.g.

```
Order.where([:store_id, :id] => [[1, 1], [1, 2]])
```

Since this gem supports older versions of Rails, when a model only has a
single column primary key, just use the regular tried-and-true syntax -
e.g.

```
Product.where(id: [1, 2])
```
Copy link
Contributor

@Mangara Mangara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Can you add a CHANGELOG entry, under "Main (unreleased)"?

@Mangara Mangara merged commit a69702c into Shopify:main Nov 28, 2025
23 checks passed
@ClaytonPassmore ClaytonPassmore deleted the feature/batch-enum-composite-pkey branch November 28, 2025 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants