graph/db: use /*SLICE:<field_name>*/ to optimise various graph queries #10081
base: master
Conversation
Summary of Changes
Hello @ellemouton, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces significant optimizations to graph database queries by leveraging SQL's IN clause with a custom /*SLICE:<field_name>*/ directive. To manage potentially large parameter lists, a new ExecutePagedQuery helper has been implemented, allowing these batch queries to be processed in a paginated fashion. This enhancement aims to reduce the number of database calls and improve overall query performance, especially for operations involving multiple graph entities.
Highlights
- Database Query Optimization: Implemented the /*SLICE:<field_name>*/ directive to allow for efficient WHERE x IN <list> SQL queries, significantly reducing the number of database calls by batching multiple lookups into a single query (a sketch of the runtime substitution follows this list).
- Paginated Query Execution: Introduced a new ExecutePagedQuery helper function that enables queries with large lists of parameters to be executed in paginated batches. This prevents database parameter limits from being hit and ensures robust handling of extensive data sets.
- Refactored Graph Store Queries: Applied the new paginated query mechanism to key methods within the SQLStore, specifically FetchChanInfos and PruneGraph, to improve their performance and efficiency when processing multiple channel IDs or outpoints.
- SQLC Tooling Enhancement: Added a post-generation script to scripts/gen_sqlc_docker.sh to correctly handle sqlc.slice() directives for Postgres. This ensures proper parameter substitution in the generated Go code, addressing a limitation in sqlc's default behavior for IN clauses.
- Test Suite Improvements: Updated various graph tests to use require.NoError, require.Len, and require.Empty from testify/require for more concise and robust error and assertion handling, improving test readability and maintainability.
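To make the mechanism concrete, here is a minimal, self-contained Go sketch of the substitution idea. The query text, field name, and variable names are assumptions for illustration only and are not the exact code in this PR: the generated query keeps the /*SLICE:outpoints*/ marker, and before execution the marker is swapped for the right number of positional parameters.

package main

import (
	"fmt"
	"strings"
)

func main() {
	// Hypothetical generated query: the SLICE marker is left in place by
	// sqlc and replaced at execution time.
	query := `SELECT * FROM channels WHERE outpoint IN (/*SLICE:outpoints*/?)`

	// The slice of values we want to match against.
	outpoints := []string{"aaaa:0", "bbbb:1", "cccc:0"}

	// Build the "$1,$2,$3" placeholder list and the matching args slice.
	placeholders := make([]string, len(outpoints))
	args := make([]any, len(outpoints))
	for i, op := range outpoints {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = op
	}

	// Splice the placeholders into the query in place of the marker.
	expanded := strings.Replace(
		query, "/*SLICE:outpoints*/?", strings.Join(placeholders, ","), 1,
	)

	// Prints:
	// SELECT * FROM channels WHERE outpoint IN ($1,$2,$3) [aaaa:0 bbbb:1 cccc:0]
	fmt.Println(expanded, args)
}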
Code Review
This pull request introduces a generic, paginated query helper to optimize database queries. This is a great performance improvement that reduces the number of database calls by using IN
clauses, while also preventing issues with an unbound number of parameters. The implementation is solid, with good test coverage for the new functionality.
result := "" | ||
for i := diff + 1; i <= numTotalArgs; i++ { | ||
if i == numTotalArgs { | ||
result += fmt.Sprintf("$%d", i) | ||
|
||
continue | ||
} | ||
result += fmt.Sprintf("$%d,", i) | ||
} | ||
return result |
The current implementation of makeQueryParams uses string concatenation inside a loop, which can be inefficient in Go as it leads to multiple memory allocations. For better performance, especially given that this function can be called with a page size of 100 or more, I recommend using strings.Builder. This will pre-allocate a buffer and avoid re-allocations in the loop.
if numListArgs == 0 {
	return ""
}

var b strings.Builder

// Pre-allocate a rough estimation of the buffer size to avoid
// re-allocations. A parameter like $1000, takes 6 bytes.
b.Grow(numListArgs * 6)

diff := numTotalArgs - numListArgs
for i := 0; i < numListArgs; i++ {
	if i > 0 {
		b.WriteString(",")
	}
	fmt.Fprintf(&b, "$%d", i+diff+1)
}

return b.String()
Yeah, looks like this is indeed way more performant.
Here are the unit and benchmark tests I added for this (to validate Gemini's claim):
// BenchmarkMakeQueryParams benchmarks the makeQueryParams function for
// various argument sizes. This helps to ensure the function performs
// efficiently when generating SQL query parameter strings for different
// input sizes.
func BenchmarkMakeQueryParams(b *testing.B) {
	cases := []struct {
		totalArgs int
		listArgs  int
	}{
		{totalArgs: 5, listArgs: 2},
		{totalArgs: 10, listArgs: 3},
		{totalArgs: 50, listArgs: 10},
		{totalArgs: 100, listArgs: 20},
	}

	for _, c := range cases {
		name := fmt.Sprintf("totalArgs=%d/listArgs=%d", c.totalArgs,
			c.listArgs)

		b.Run(name, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_ = makeQueryParams(c.totalArgs, c.listArgs)
			}
		})
	}
}

// TestMakeQueryParams tests the makeQueryParams function for various
// argument sizes and verifies the output matches the expected SQL
// parameter string. The function is assumed to generate a comma-separated
// list of parameters in the form $N for use in SQL queries.
func TestMakeQueryParams(t *testing.T) {
	testCases := []struct {
		totalArgs int
		listArgs  int
		expected  string
	}{
		{totalArgs: 5, listArgs: 2, expected: "$4,$5"},
		{totalArgs: 10, listArgs: 3, expected: "$8,$9,$10"},
		{totalArgs: 1, listArgs: 1, expected: "$1"},
		{totalArgs: 3, listArgs: 0, expected: ""},
		{totalArgs: 4, listArgs: 4, expected: "$1,$2,$3,$4"},
	}

	for _, tc := range testCases {
		result := makeQueryParams(tc.totalArgs, tc.listArgs)
		require.Equal(
			t, tc.expected, result,
			"unexpected result for totalArgs=%d, listArgs=%d",
			tc.totalArgs, tc.listArgs,
		)
	}
}
nice! cool, updated. It did mean needing to change the replacement script a little so that we can handle the error that now needs to be returned from the helper function 👍
Force-pushed from a3f3f2e to 9ea2ec6.
This copies the workaround introduced in the taproot-assets code base and will allow us to use `WHERE x in <list>` type queries.
Force-pushed from 59ccaf4 to 2661ab2.
Along with a test for it. This helper will allow us to easily create a pagination wrapper for queries that will make use of the new /*SLICE:<field_name>*/ directive. The next commit will add a test showing this.
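As a rough illustration of the kind of helper this commit describes, here is a minimal sketch of a generic paginated query helper. The name, signature, and behaviour are assumptions for illustration only and may differ from the actual sqldb.ExecutePagedQuery implementation in this PR.

package sqldb

import "context"

// executePagedQuerySketch splits a large argument list into fixed-size pages,
// runs the supplied query once per page, and hands every returned row to the
// callback. This keeps the number of SQL parameters per statement bounded.
func executePagedQuerySketch[A, R any](ctx context.Context, pageSize int,
	args []A, query func(context.Context, []A) ([]R, error),
	cb func(context.Context, R) error) error {

	for start := 0; start < len(args); start += pageSize {
		end := start + pageSize
		if end > len(args) {
			end = len(args)
		}

		rows, err := query(ctx, args[start:end])
		if err != nil {
			return err
		}

		for _, row := range rows {
			if err := cb(ctx, row); err != nil {
				return err
			}
		}
	}

	return nil
}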
Here, a new query (GetChannelsByOutpoints) is added which makes use of the /*SLICE:outpoints*/ directive & added workaround. This is then used in a test to demonstrate how the ExecutePagedQuery helper can be used to wrap a query like this such that calls are done in pages. The query that has been added will also be used by live code paths in an upcoming commit.
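A hypothetical usage sketch for the query described above, combining it with the paged helper sketched earlier. The wrapper name, db type, generated query signature, and page size are assumptions; only the GetChannelsByOutpoints query name and the sqlc.GetChannelsByOutpointsRow row type are taken from this PR.

// forEachChannelByOutpoint is a hypothetical wrapper showing how the paged
// helper and the generated GetChannelsByOutpoints query could be combined;
// it is not the actual SQLStore code from this PR.
func forEachChannelByOutpoint(ctx context.Context, db *sqlc.Queries,
	outpointStrings []string,
	handle func(sqlc.GetChannelsByOutpointsRow) error) error {

	return executePagedQuerySketch(
		ctx, 100, outpointStrings,
		func(ctx context.Context, page []string) (
			[]sqlc.GetChannelsByOutpointsRow, error) {

			// Assumed generated signature: one call per page of
			// outpoints, returning the matching channel rows.
			return db.GetChannelsByOutpoints(ctx, page)
		},
		func(ctx context.Context,
			row sqlc.GetChannelsByOutpointsRow) error {

			return handle(row)
		},
	)
}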
Force-pushed from b89b18b to 5d9bf9b.
/gemini review
Code Review
This pull request introduces the use of the /*SLICE:<field_name>*/ directive to optimize graph queries by enabling the use of WHERE x IN <list of x type> SQL queries. It also adds an ExecutePagedQuery helper to handle these queries in a paginated fashion. The code changes include modifications to the graph database schema, query functions, and test cases to support this new feature.
This commit adds a new GetChannelsByOutpoints query which takes a slice of outpoint strings. This lets us then update PruneGraph to use paginated calls to GetChannelsByOutpoints instead of making one DB call per outpoint.
Remove a TODO by making use of the new sqldb.ExecutePagedQuery to fetch channels in batches rather than one by one.
Force-pushed from 5d9bf9b to bf32adb.
Very nice! I assume all of these changed methods have sufficient unit test coverage already?
result := "" | ||
for i := diff + 1; i <= numTotalArgs; i++ { | ||
if i == numTotalArgs { | ||
result += fmt.Sprintf("$%d", i) | ||
|
||
continue | ||
} | ||
result += fmt.Sprintf("$%d,", i) | ||
} | ||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks like this is indeed way more performant.
Here are the unit and benchmark tests I added for this (to validate Gemini's claim):
// BenchmarkMakeQueryParams benchmarks the makeQueryParams function for
// various argument sizes. This helps to ensure the function performs
// efficiently when generating SQL query parameter strings for different
// input sizes.
func BenchmarkMakeQueryParams(b *testing.B) {
cases := []struct {
totalArgs int
listArgs int
}{
{totalArgs: 5, listArgs: 2},
{totalArgs: 10, listArgs: 3},
{totalArgs: 50, listArgs: 10},
{totalArgs: 100, listArgs: 20},
}
for _, c := range cases {
name := fmt.Sprintf("totalArgs=%d/listArgs=%d", c.totalArgs,
c.listArgs)
b.Run(name, func(b *testing.B) {
for i := 0; i < b.N; i++ {
_ = makeQueryParams(c.totalArgs, c.listArgs)
}
})
}
}
// TestMakeQueryParams tests the makeQueryParams function for various
// argument sizes and verifies the output matches the expected SQL
// parameter string. The function is assumed to generate a comma-separated
// list of parameters in the form $N for use in SQL queries.
func TestMakeQueryParams(t *testing.T) {
testCases := []struct {
totalArgs int
listArgs int
expected string
}{
{totalArgs: 5, listArgs: 2, expected: "$4,$5"},
{totalArgs: 10, listArgs: 3, expected: "$8,$9,$10"},
{totalArgs: 1, listArgs: 1, expected: "$1"},
{totalArgs: 3, listArgs: 0, expected: ""},
{totalArgs: 4, listArgs: 4, expected: "$1,$2,$3,$4"},
}
for _, tc := range testCases {
result := makeQueryParams(tc.totalArgs, tc.listArgs)
require.Equal(
t, tc.expected, result,
"unexpected result for totalArgs=%d, listArgs=%d",
tc.totalArgs, tc.listArgs,
)
}
}
@@ -7,6 +7,11 @@ import (
	"testing"
)

// isSQLite is false if the build tag is set to test_db_postgres. It is used in
// tests that compile for both SQLite and Postgres databases to determine
// which database implementation is being used.
Can we add a TODO here noting that with sqldbv2 there will be a DatabaseType on the BaseDB struct that will allow us to query this at runtime?
@@ -2404,9 +2391,19 @@ func (s *SQLStore) PruneGraph(spentOutputs []*wire.OutPoint,
	}

	closedChans = append(closedChans, info)
Do we also need to do a list query for the DeleteChannel call above? Probably not, as we don't expect thousands of channels to be closed by a single block's worth of transactions...
	return fmt.Errorf("unable to fetch channel: %w",
		err)
}

// Define the callback function for processing each channel
nit: missing punctuation in some comments here and below.
// ChannelUpdateInfo. This allows us to quickly delete channels that we
// already know about.
for _, chanInfo := range chansInfo {
	infoLookup[chanInfo.ShortChannelID.ToUint64()] = chanInfo
I wonder if we also need to re-fill this map in the reset callback?
@@ -2446,7 +2496,7 @@ func (s *SQLStore) PruneGraph(spentOutputs []*wire.OutPoint,
// NOTE: this fetches channels for all protocol versions.
func (s *SQLStore) forEachChanInOutpoints(ctx context.Context, db SQLQueries,
	outpoints []*wire.OutPoint, cb func(ctx context.Context,
-	row sqlc.GetChannelsByOutpointsRow) error) error {
+	row sqlc.GetChannelsByOutpointsRow) error) error {
nit: there seems to be some back-and-forth in the formatting of these... I know GoLand sometimes does things it shouldn't which is quite annoying...
In this PR, the /*SLICE:<field_name>*/ directive is made use of so that we can perform WHERE x IN <list of x type> SQL queries. This allows us to perform fewer DB calls. We don't want to give our SQL library an unbounded number of parameters, though, so this PR also adds an ExecutePagedQuery helper that lets us call these new types of SQL queries in a paginated fashion.