hdgarrood commented Dec 3, 2025

Fixes #1611. With these changes, we can migrate an arbitrary number of entities with just 4 queries.

Before submitting your PR, check that you've:

  • Documented new APIs with Haddock markup
  • Added @since declarations to the Haddock
  • Ran fourmolu on any changed files (restyled will do this for you, so
    accept the suggested changes if it makes them)
  • Adhered to the code style (see the .editorconfig and fourmolu.yaml files for details)

After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Bump the version number if there isn't an (unreleased) entry in the Changelog
  • Check that CI passes (or, if it fails, that it fails for reasons unrelated to your change, like CI timeouts)

import qualified Database.PostgreSQL.Simple.Types as PG

import qualified Blaze.ByteString.Builder.Char8 as BBB
import Control.Arrow
Contributor Author

This file was getting pretty big, so I wanted to move all of the migrations stuff into a separate one. A lot of the code here is unchanged: in essence, all I've done is extract the querying part out so that it happens first, and so that we pull all the data we need at once (with 4 queries) rather than doing N+1s.

backend <- ask
pure (SqlBackend.connPrepare backend)

-- NB: we do not perform these migrations in main.hs

The test suite is kind of cursed in that it leaves the schema in the test DB. This means that if there's a bug in the code that determines which migrations need to be applied, you only see test failures if you run the tests twice in a row (if you're starting from a fresh DB).

I think I've made things slightly worse here by doing this... but I also think it's important to directly exercise the migrator like this.


let
    expected =
        SchemaState

This almost acts like a golden test. If you want to change the expected output, the easiest thing to do is to just rerun the test, eyeball the "but got: ..." to make sure it makes sense, and then copy it into the code here.

collectSchemaState
    :: (Text -> IO Statement) -> [EntityNameDB] -> IO (Either Text SchemaState)
collectSchemaState getStmt entityNames = runExceptT $ do
    existence <- getTableExistence getStmt entityNames

Each of these four functions performs exactly one query, and these are all of the queries the migrator now needs to perform.

        (errs, _) -> throwError (T.intercalate "\n" errs)
  where
    getTableExistenceSql =
        "SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname != 'pg_catalog'"

These are all basically the same queries as before, but instead of doing `where tablename = ?` once per table, I'm doing `where tablename = ANY (?)` and substituting in the full list of table names.
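A minimal sketch of the batching idea, assuming the batched query returns flat (table, column) rows. The names `Row`, `batchedColumnsSql`, and `groupByTable` are hypothetical and do not come from the PR, and the SQL text is only illustrative of the `= ANY (?)` shape rather than one of the PR's actual queries.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical row shape for illustration: (table name, column name).
type Row = (String, String)

-- Illustrative only: a single query covering every table, with the whole
-- list of table names bound to the one placeholder as an array.
batchedColumnsSql :: String
batchedColumnsSql =
    unwords
        [ "SELECT table_name, column_name"
        , "FROM information_schema.columns"
        , "WHERE table_name = ANY (?)"
        ]

-- Group the flat result set by table name, so that each entity can look up
-- its own columns afterwards without issuing another query.
-- `flip (++)` keeps the columns of each table in the order they were returned.
groupByTable :: [Row] -> Map.Map String [String]
groupByTable rows =
    Map.fromListWith (flip (++)) [(table, [column]) | (table, column) <- rows]
```

With this shape, the number of queries stays constant however many entities are migrated; the per-entity work becomes a `Map.lookup` on the grouped result.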

        ([], xs) -> pure $ Map.unionsWith Map.union xs
        (errs, _) -> throwError (T.intercalate "\n" errs)
  where
    -- TODO: should this filter by schema?

I noticed that the existing query doesn't filter by schema, and I feel like it maybe should? But that's one for later, I think.

allDefs
entity
(newcols, udspair)
(map dubiouslyRemoveReferences essColumns, Map.toList essConstraints)

This function is basically the same as before except for this one line. In the previous version, we were relying on getColumn to implicitly remove these references (or rather, not fetch them in the first place); now we're doing it explicitly here.
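As a rough illustration of what "doing it explicitly" could look like, here is a sketch using a simplified, hypothetical `Column` record; it is not persistent's real `Column` type, and `removeReference` is a stand-in name, not the PR's `dubiouslyRemoveReferences`.

```haskell
-- Hypothetical, simplified stand-in for a database column description.
data Column = Column
    { colName :: String
    , colReference :: Maybe String
    -- ^ name of the table this column references, if any
    }
    deriving (Eq, Show)

-- Drop the reference from a column fetched from the database, leaving
-- everything else untouched.
removeReference :: Column -> Column
removeReference col = col{colReference = Nothing}

-- Apply the same step to every column of an existing table before it is
-- compared against the desired definition.
removeReferences :: [Column] -> [Column]
removeReferences = map removeReference
```

The point of the explicit step is that the batched queries can now fetch reference information once for everything, while the comparison still sees the same reference-free columns that the old per-column lookup produced.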

-- otherwise no-op, `getAlters` will handle dropping this for us.
oldCol

-- | Indicates whether a Postgres Column is safe to drop.

Everything from here downwards is exactly the same as it was before

@hdgarrood hdgarrood marked this pull request as ready for review December 4, 2025 17:41
@hdgarrood hdgarrood changed the title from "wip: avoid N+1 in postgresql migrations" to "Avoid N+1 in postgresql migrations" Dec 4, 2025
Collaborator

@parsonsmatt parsonsmatt left a comment


nice!

Contributor Author

hdgarrood commented Dec 10, 2025

@parsonsmatt I found that, with the version of these changes that you previously approved, running against the work codebase produced a few mismatches when the tables in question had multiple (sometimes redundant) foreign key constraints on the same table, because the previous version of this PR didn't handle that case well. In particular, persistent could suggest migrations that fail to apply. It assumed that a given column could only have one FK constraint, so if a column happened to have two identical FK constraints on it, it could suggest dropping the redundant one and creating the other, not realising that the other already exists, which would of course fail.

The most recent commits I've added resolve this problem by restoring the current behaviour, which is that we ignore all FK constraints on existing tables when migrating, except for the specific case where:

  • We have a simple FK constraint for a single column (i.e. the kind you get when you use FooId as a column type - a ColumnReference, not a ForeignDef)
  • An FK constraint with the exact same name as the one indicated by the EntityDef already exists in the database
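The rule above can be sketched as a small pure function. This is a simplified model under stated assumptions, not the PR's implementation: `DesiredFk`, `desiredIsSimple`, and `consideredFks` are hypothetical names, and constraints are represented by their names only.

```haskell
-- Hypothetical, simplified description of an FK constraint that the
-- entity definition asks for.
data DesiredFk = DesiredFk
    { desiredName :: String
    , desiredIsSimple :: Bool
    -- ^ True for a single-column reference (the FooId-as-column-type kind),
    -- False for a composite foreign key definition
    }

-- Given the constraints the entity definition asks for and the names of the
-- FK constraints already present on an existing table, return the existing
-- constraints that the migrator takes into account. Everything else on the
-- existing table is ignored, so it is never proposed for dropping or
-- re-creation.
consideredFks :: [DesiredFk] -> [String] -> [String]
consideredFks desired existing =
    [name | name <- existing, name `elem` simpleNames]
  where
    simpleNames = [desiredName d | d <- desired, desiredIsSimple d]
```

Under this model, a redundant duplicate constraint with a different name on the same column is simply not considered, which avoids the failing drop-and-recreate suggestion described earlier.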

@hdgarrood
Contributor Author

Okay, although this now works, it appears to be even slower than the previous version. So that needs fixing first too.


Successfully merging this pull request may close these issues.

Batch migration performance
