Skip to content

Conversation

@nhenneaux
Copy link

@nhenneaux nhenneaux commented Nov 24, 2025

Describe your changes

Fixing following errors occurring when too much peers as a source by batching nft rule add by 200 sources
3240 peers does not lead to error while 3280 fails.

2025-11-24T15:56:59.2487119Z ERRO client/internal/acl/manager.go:75: Failed to apply route ACLs: 82 errors occurred:
	* add route rule: add route rule: apply source: source: create or get ipset: failed to add for key nb-8aecb703: flush error: conn.Receive: netlink receive: no such file or directory
	* add route rule: add route rule: apply source: source: create or get ipset: failed to add for key nb-8aecb703: flush error: conn.Receive: netlink receive: no such file or directory
	(...)

OS rhel9.7, kernel 5.14.0-570.60.1.el9_6.x86_64, nftables v1.0.9 (Old Doc Yak #3).

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)
    Bug fix

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • Refactor

    • Initialization now applies very large prefix lists in batched chunks for more reliable handling; per-batch logging and improved error messages report progress and failures.
  • Tests

    • Added compatibility tests for a very large (6k+) prefix set and for an empty prefix set; the large-scale test was accidentally duplicated.

✏️ Tip: You can customize this high-level summary in your review settings.

@CLAassistant
Copy link

CLAassistant commented Nov 24, 2025

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 24, 2025

Walkthrough

Replaces single bulk ipset initialization with batched adds in createIpSet: add up to maxElements initially, flush, then add remaining prefixes in sublists (≤ maxElements) with per-batch flushes and logs. Adds two tests for large (6,375) and empty prefix sets; tests are duplicated in the file.

Changes

Cohort / File(s) Summary
IpSet creation batching
client/firewall/nftables/router_linux.go
Introduces maxPrefixesSet (1500) and computes maxElements = maxPrefixesSet * 2. Changes createIpSet to add an initial slice of elements up to maxElements, flush, then iterate remaining elements in sublists of ≤ maxElements calling SetAddElements, flushing and logging after each batch; errors and logs include sublist counts and contextual info.
Large-prefix & empty-prefix tests (duplicated)
client/firewall/nftables/manager_linux_test.go
Adds TestNftablesManagerCompatibilityWithIptablesFor6kPrefixes (builds 6,375 prefixes) and TestNftablesManagerCompatibilityWithIptablesForEmptyPrefixes; each test skips when nftables/iptables-save absent, creates/initializes manager, applies AddRouteFiltering, verifies iptables-save output, and cleans up. The test functions are duplicated within the file.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant C as createIpSet
    participant K as kernel ipset
    participant L as logger

    rect rgba(70,130,180,0.06)
    C->>K: Create set (metadata)
    end

    rect rgba(60,179,113,0.06)
    C->>K: AddElements(initialElements ≤ maxElements)
    K-->>C: OK / error
    C->>K: Flush set
    C-->>L: Log initial creation (count)
    end

    alt remaining elements exist
        loop for each sublist (≤ maxElements)
            rect rgba(255,215,0,0.04)
            C->>K: SetAddElements(sublist)
            K-->>C: OK / error (error includes sublist count)
            C->>K: Flush set
            C-->>L: Log sublist progress (count)
            end
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Verify batching math and boundary conditions (maxPrefixesSet → maxElements) and off-by-one behavior.
  • Confirm flush ordering and impact of partial failures on ipset state and rollback considerations.
  • Inspect error wrapping/messages for useful context (which batch failed, prefix counts).
  • Review duplicated tests for accidental duplication and environment assumptions (nftables + iptables-save).

Suggested reviewers

  • mlsmaycon
  • pappz

Poem

🐰 I sorted prefixes, chunk by chunk,
A first small heap, then batches in a bunch.
I flushed each batch and logged with pride,
If one batch trips, I point the side.
Hoppy nets — tidy routes worldwide.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: fixing nftables peer source limitations through batching, which is the primary purpose of this bug fix.
Description check ✅ Passed The description provides the required information: issue context with specific error details, reproduction steps, environment details (RHEL 9.7, nftables v1.0.9), the batching solution (200 sources), and properly completes the template with bug fix checkbox and documentation justification.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1bff5cf and b1243d1.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • client/firewall/nftables/router_linux.go

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

417-424: Consider more idiomatic batching loop.

The loop logic is correct but uses subEnd += maxSize with external declaration, which is less common than the standard pattern. This makes it slightly harder to verify correctness at a glance.

Consider the more idiomatic approach:

-	var subEnd int
 	maxSize := 200
 	for subStart := 0; subStart < len(sources); subStart += maxSize {
-		subEnd += maxSize
+		subEnd := subStart + maxSize
 		if subEnd > len(sources) {
 			subEnd = len(sources)
 		}
 		subSources := sources[subStart:subEnd]

This keeps subEnd scoped to the loop iteration and makes the relationship between subStart and subEnd explicit.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2097306 and 7d25873.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/firewall/nftables/router_linux.go (2)
client/firewall/manager/firewall.go (1)
  • Network (63-66)
client/firewall/manager/set.go (2)
  • Set (15-18)
  • NewPrefixSet (42-58)
🔇 Additional comments (2)
client/firewall/nftables/router_linux.go (2)

337-340: LGTM: Clean refactoring to extract source handling logic.

The extraction into applySources() improves code organization. The error handling is correct since the helper already wraps errors appropriately.


405-415: LGTM: Single source handling is correct.

The special case for 0.0.0.0/0 (skipping source matching) and the direct prefix usage for other single sources are both handled correctly.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d25873 and d685d8e.

📒 Files selected for processing (1)
  • client/firewall/nftables/manager_linux_test.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/firewall/nftables/manager_linux_test.go (8)
client/firewall/nftables/router_linux_test.go (1)
  • NFTABLES (28-28)
client/firewall/create_linux.go (1)
  • NFTABLES (28-28)
client/firewall/nftables/manager_linux.go (1)
  • Create (47-67)
client/firewall/iptables/manager_linux.go (1)
  • Create (39-61)
client/proto/daemon.pb.go (3)
  • Network (2079-2088)
  • Network (2101-2101)
  • Network (2116-2118)
client/firewall/manager/firewall.go (2)
  • Network (63-66)
  • ActionAccept (57-57)
client/firewall/manager/protocol.go (1)
  • ProtocolTCP (9-9)
client/firewall/manager/port.go (1)
  • Port (10-16)
🔇 Additional comments (1)
client/firewall/nftables/manager_linux_test.go (1)

9-9: strconv import is correctly scoped and used

The new strconv import is used only in the added test and looks appropriate; no issues here.

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch 2 times, most recently from c0eb4ab to d5172b4 Compare November 25, 2025 04:53
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5172b4 and e13e2c8.

📒 Files selected for processing (1)
  • client/firewall/nftables/manager_linux_test.go (1 hunks)

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from e13e2c8 to d87ef88 Compare November 25, 2025 09:15
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
client/firewall/nftables/manager_linux_test.go (1)

389-437: Test logic looks good; consider renaming away from “10kPrefixes”.

The test setup (nftables/iptables checks, manager lifecycle, ~6k prefixes generation, route filter + iptables-save verification) looks sound and exercises the intended large-prefix path.

Minor nit: the function name says ...For10kPrefixes but the current loop bounds generate 24×254 = 6,096 prefixes. Either adjust the name to something more generic (e.g. ForManyPrefixes / For6kPrefixes) or tweak the counts to better match the name.

client/firewall/nftables/router_linux.go (1)

332-352: Critical: batched source sets are combined with AND semantics in a single rule, so rules with >200 sources will never match.

applySources currently batches sources into groups of 200, creates a firewall.Set per batch, and then appends the resulting Lookup expressions from applyNetwork into a single rule:

  • For multiple sources, each batch produces something equivalent to ip saddr @set_i as a separate match in the same rule (via getIpSetExprs).
  • In nftables, multiple match statements in a single rule are conjunctive: if any match fails, evaluation proceeds to the next rule, not to the next match within the same rule. All matches in a rule must succeed for the verdict to be reached. (manpages.opensuse.org)
  • Since the batched sets are disjoint subsets, no source address can be contained in all of them. A packet’s source can be in at most one set_i, so at least one Lookup will fail and the rule will never match for any source once you cross the batching threshold.

Net effect: for len(sources) > 200, the added route rule becomes effectively dead — it passes creation and avoids the netlink size error, but no traffic will ever match it. That’s a regression in correctness compared to “failing loudly” on too many sources.

To restore correct OR semantics while respecting size limits, you’ll need to restructure this so that each batch is not a separate Lookup within the same rule, for example:

  • Create one route rule per batch (same destination/proto/ports/action, different source set), and adjust the rules bookkeeping to track multiple nftables rules for a single logical route rule key; or
  • Reduce to a single set per route rule by keeping the large set but inserting elements in smaller chunks (e.g., via SetAddElements), so the kernel never receives an oversized netlink message, but the final rule still uses just one Lookup.

As-is, this change breaks matching for any route with more than 200 source prefixes and should be fixed before merging.

In nftables, if a rule contains multiple separate match statements like `ip saddr @set1` and `ip saddr @set2`, does the rule require both matches to succeed (logical AND), or is there any built-in OR semantics within a single rule? Please confirm the behavior and provide an example from official docs or man pages.

Also applies to: 414-448

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

414-448: Clarify behavior for empty sources and 0.0.0.0/0, and tighten batching loop.

Minor points in applySources:

  1. Empty sources slice semantics

    • For len(sources) == 0, the “multiple sources” branch runs but the loop doesn’t execute, so exprs stays empty and you effectively create a rule that matches any source.
    • For a single 0.0.0.0/0, you also intentionally emit no source expressions.
      If “no sources” and “0.0.0.0/0” are both supposed to mean “all sources”, it might be worth making that explicit (either in code comments or via an early return for len(sources) == 0) so callers aren’t surprised.
  2. Batching loop clarity

    • Using a running subEnd with subEnd += maxSize works but is slightly harder to read than computing it from subStart:
      for subStart := 0; subStart < len(sources); subStart += maxSize {
          subEnd := subStart + maxSize
          if subEnd > len(sources) {
              subEnd = len(sources)
          }
          subSources := sources[subStart:subEnd]
          ...
      }

    This avoids any risk of accidental misuse of subEnd outside the loop and makes the chunking logic self-evident.

Both are small, non-blocking cleanups, but they would make the intent clearer for future readers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e13e2c8 and d87ef88.

📒 Files selected for processing (2)
  • client/firewall/nftables/manager_linux_test.go (1 hunks)
  • client/firewall/nftables/router_linux.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
client/firewall/nftables/manager_linux_test.go (5)
client/firewall/nftables/manager_linux.go (1)
  • Create (56-76)
client/firewall/iptables/manager_linux.go (1)
  • Create (39-61)
client/iface/iface.go (1)
  • DefaultMTU (28-28)
client/firewall/manager/firewall.go (2)
  • Network (63-66)
  • ActionAccept (57-57)
client/firewall/manager/protocol.go (1)
  • ProtocolTCP (9-9)
client/firewall/nftables/router_linux.go (2)
client/firewall/manager/firewall.go (1)
  • Network (63-66)
client/firewall/manager/set.go (2)
  • Set (15-18)
  • NewPrefixSet (42-58)

@braginini
Copy link
Contributor

great stuff @nhenneaux, thank you! The team is looking into that

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from d87ef88 to 6e8d485 Compare November 25, 2025 10:49
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
client/firewall/nftables/router_linux.go (1)

48-51: Batched source sets are still ANDed within a single rule; >maxPrefixesSet sources will never match

The new batching via maxPrefixesSet and applySources avoids oversized ipsets, but as currently wired it breaks matching semantics for more than maxPrefixesSet sources:

  • applySources chunks sources and, for each chunk, builds a Set and calls applyNetwork, which yields a Payload + Lookup pair per Set.
  • All of these expressions are concatenated into a single exprs slice and used in one nftables rule in AddRouteFiltering.
  • In nftables, sequential Lookup/Cmp expressions in the same rule are effectively AND’ed: the packet must be contained in every set to reach the verdict. With disjoint chunks, any given source IP can only belong to one Set, so once you exceed maxPrefixesSet prefixes the rule can never match.

This means route ACLs with more than 500 source prefixes will silently stop matching instead of just being split across batches.

To restore correct OR semantics while still avoiding kernel limits, you likely need to change the batching strategy, for example:

  1. Keep a single Set per logical rule and batch element insertion only.
    • Let applySources (or applyNetwork) still use one firewall.Set for all sources, and move the batching into createIpSet: create an empty set and then add elements in multiple SetAddElements calls, each under the kernel’s safe limit.
  2. OR via multiple rules instead of multiple Sets in one rule.
    • For each chunk, build its own Set and emit a separate nftables rule with identical destination/protocol/ports/action but a different source Set.
    • If you go this route, you’ll need to adjust how r.rules and DeleteRouteRule model a logical route rule (one high-level rule mapping to N nftables rules + ipsets), otherwise you’ll leak rules/sets or fail to delete all of them.

As written, this is a functional regression for large source lists and should be fixed before relying on batching in production. Based on nftables expression semantics, not on this repository’s code.

Also applies to: 335-355, 416-451

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

416-430: Clarify single 0.0.0.0/0 and empty-sources handling in applySources

The special-casing for a single source has a couple of edge cases worth tightening up:

  • For len(sources) == 1 and sources[0].Bits() == 0, you leave source at its zero value and still call applyNetwork(source, sources, true). Depending on how firewall.Network.IsPrefix/IsSet are implemented, this may be treated as “no source constraint” (which matches the intent of 0.0.0.0/0) or as an invalid network; please confirm this is equivalent to the previous 0.0.0.0/0 behavior and can’t return an error.
  • When len(sources) == 0, the function falls into the “multiple sources” branch and returns an empty exprs slice, effectively creating a rule with no source restriction. If an empty sources list is not a valid input, consider explicitly rejecting it (e.g., return nil, fmt.Errorf("no source prefixes provided")) to avoid accidental “match any source” rules.

Making these edge cases explicit will reduce surprises and make the intent around 0.0.0.0/0 and empty inputs clearer.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d87ef88 and 6e8d485.

📒 Files selected for processing (2)
  • client/firewall/nftables/manager_linux_test.go (1 hunks)
  • client/firewall/nftables/router_linux.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/firewall/nftables/router_linux.go (2)
client/firewall/manager/firewall.go (1)
  • Network (63-66)
client/firewall/manager/set.go (2)
  • Set (15-18)
  • NewPrefixSet (42-58)
🔇 Additional comments (1)
client/firewall/nftables/manager_linux_test.go (1)

389-437: Test is well-structured and unique; minor naming suggestion

Verification confirms this test exists as a single definition (no duplication). The structure is solid—proper feature guards, iptables-save verification, preallocated prefix slice, and correct loop bounds.

One optional tidbit: the function name says "6kPrefixes" but generates 6,096 prefixes exactly ((25–1)×(255–1) = 24×254). If you want to be precise, either adjust the constant values slightly or add a brief comment noting the actual count. This is purely a naming clarity suggestion and not a blocker.

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from 67ee4e1 to e2af8d7 Compare November 25, 2025 11:13
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
client/firewall/nftables/router_linux.go (1)

416-450: Batched source sets still use AND semantics; rules won’t match when len(sources) > maxPrefixesSet.

applySources currently creates multiple firewall.Set instances (one per subSources batch) and appends each batch’s Lookup expressions into a single rule via applyNetwork/getIpSetExprs. In nftables, multiple match expressions in one rule are combined with logical AND, so this effectively requires the packet source to belong to every set in the rule.

Because each batch contains disjoint prefixes, a given source address can only be in one of these sets; it will pass the Lookup for its own set and fail all others, so the rule never matches once you cross a single batch. This is the same logical issue highlighted in the earlier review: batching into multiple sets but keeping them in one rule creates Set1 AND Set2 AND … semantics instead of the desired union/OR.

Consequence:

  • For len(sources) <= maxPrefixesSet, behavior is fine (single set).
  • For len(sources) > maxPrefixesSet, the rule is installed without netlink errors, but traffic will never match it, effectively breaking route ACLs for large source lists.

To fix this you need OR semantics across batches, for example:

  • Keep one nftables set per rule and batch element insertion at the netlink level (e.g., create the set empty, then call SetAddElements in chunks), or
  • Create multiple rules, each referencing a single set for a batch of prefixes but sharing the same destination/proto/ports/action, so the chain provides OR semantics via multiple matching rules rather than multiple lookups in one rule.

Whichever approach you choose, ensure that:

  • ruleKey / DeleteRouteRule / decrementSetCounter remain consistent with how many nft rules and sets are actually created.
  • Existing tests with large prefix counts verify actual data‑plane behavior (packets from sources beyond the first batch must still match).
🧹 Nitpick comments (2)
client/firewall/nftables/router_linux.go (2)

48-51: Clarify maxPrefixesSet rationale and comment.

The comment mentions a ~3.2k/3.3k threshold, while maxPrefixesSet is set to 500. That’s a big safety margin but the relation isn’t obvious from the comment alone. Consider briefly explaining why 500 was chosen (e.g., “conservative per‑set limit to keep netlink messages small enough”) or adjusting the comment so future readers don’t try to infer a direct mapping between the 3.2k/3.3k numbers and this constant.


352-355: Preserve error context when applySources fails.

Here you return err directly, whereas most of the file wraps errors with context (e.g., "apply destination: %w", "convert protocol to number: %w"). For consistency and easier debugging, consider wrapping this as well:

-	exprs, err := r.applySources(sources)
-	if err != nil {
-		return nil, err
-	}
+	exprs, err := r.applySources(sources)
+	if err != nil {
+		return nil, fmt.Errorf("apply sources: %w", err)
+	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e8d485 and e2af8d7.

📒 Files selected for processing (2)
  • client/firewall/nftables/manager_linux_test.go (1 hunks)
  • client/firewall/nftables/router_linux.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • client/firewall/nftables/manager_linux_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
client/firewall/nftables/router_linux.go (2)
client/firewall/manager/firewall.go (1)
  • Network (63-66)
client/firewall/manager/set.go (2)
  • Set (15-18)
  • NewPrefixSet (42-58)

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from e2af8d7 to 68d634e Compare November 25, 2025 11:34
…ailing at 3277 prefixes on nftables v1.0.9 with Ubuntu 24.04.3 LTS, 6.14.0-35-generic netbirdio#35~24.04.1-Ubuntu
@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from 68d634e to 778a79a Compare November 25, 2025 15:23
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
client/firewall/nftables/router_linux.go (1)

415-449: Critical: batched source sets are AND‑ed inside a single rule; rules with >maxPrefixesSet sources will never match.

When len(sources) > maxPrefixesSet, applySources:

  • Splits sources into sub‑slices of at most maxPrefixesSet.
  • For each sub‑slice, builds a firewall.Set and calls applyNetwork with network.Set set.
  • applyNetworkgetIpSetgetIpSetExprs, which returns two expressions per set: a Payload (load addr) followed by a Lookup on that set.
  • All these expressions are appended into a single exprs slice, which AddRouteFiltering then uses to build one nftables rule.

Resulting rule shape (simplified):

  • Payload; Lookup(set_batch_1);
  • Payload; Lookup(set_batch_2);
  • ...
  • destination / proto / ports / verdict.

In nftables, expressions within a rule have AND semantics: if any Lookup (or Cmp) fails, evaluation of that rule aborts and processing moves to the next rule. Since each batch set contains a disjoint subset of prefixes, a source IP can only be in one of these sets. Requiring it to pass all Lookup expressions means no packet can ever satisfy the rule once there is more than one batch.

Practically, this means that for any route ACL whose sources length exceeds maxPrefixesSet, the constructed rule will never match any traffic, silently breaking route filtering for large peer sets.

To get OR semantics across batches, you’ll need to change the structure, for example:

  • Generate one nftables rule per batch (each with a single set lookup) that shares the same destination/proto/ports/action, and adjust how r.rules and DeleteRouteRule track multiple nft rules per logical ACL; or
  • Use a different nftables construct that provides OR semantics across multiple sets, and reflect that in applySources.

Given this affects correctness for large deployments, fixing the OR/AND semantics is critical before relying on batching in production.

Would you double‑check nftables rule semantics here and confirm that a single rule with multiple Lookup expressions indeed behaves as AND (i.e., aborts on first failed lookup), and that your tests cover actual packet matching for len(sources) > maxPrefixesSet, not just successful rule creation?

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

48-51: Hard‑coded batching limit: comment is good, consider making behavior easier to tune.

The maxPrefixesSet=3000 choice is justified in the comment, but it is still a magic number tied to observed behavior (3277). If this limit is expected to vary between kernels/set types over time, consider either:

  • exposing it as a package‑level variable or config knob, or
  • expanding the comment to capture how it was derived (e.g. kernel version, set type, interval usage), so future maintainers know when they might need to revisit it.

This is non‑blocking, just about maintainability.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68d634e and 778a79a.

📒 Files selected for processing (2)
  • client/firewall/nftables/manager_linux_test.go (1 hunks)
  • client/firewall/nftables/router_linux.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
client/firewall/nftables/router_linux.go (2)
client/firewall/manager/firewall.go (1)
  • Network (63-66)
client/firewall/manager/set.go (2)
  • Set (15-18)
  • NewPrefixSet (42-58)
🔇 Additional comments (3)
client/firewall/nftables/manager_linux_test.go (2)

389-437: LGTM! Past feedback has been addressed.

The test correctly:

  • Checks for iptables-save availability before running (line 394-396)
  • Uses efficient AddrFrom4 prefix generation instead of string parsing
  • Has accurate capacity pre-allocation: (octet2Count-1)*(octet3Count-1) = 24*254 = 6,096 matches the actual loop iterations
  • Uses a reasonable naming convention ("6kPrefixes" for 6,096 prefixes)

This test validates the batching fix for large prefix sets.


439-478: Good edge case coverage.

This test validates that AddRouteFiltering handles an empty prefix slice gracefully, which is an important edge case for the new applySources batching logic.

client/firewall/nftables/router_linux.go (1)

352-355: Refactor to applySources improves readability and error context.

Delegating source handling from AddRouteFiltering into applySources cleans up the main function and the new error message with the total len(sources) provides useful context when debugging failures. This change itself looks sound and keeps the rest of the rule construction logic unchanged.

@nhenneaux
Copy link
Author

nhenneaux commented Nov 25, 2025

I reproduced the issue with failing at 3277 prefixes on nftables v1.0.9 with Ubuntu 24.04.3 LTS,  6.14.0-35-generic #35~24.04.1-Ubuntu, 3276 prefixes work. It reproduces with TestNftablesManagerCompatibilityWithIptablesFor6kPrefixes and it is fixed with new batching mechanism. I still would like to confirm whether the CodeRabbit comment is relevant or not on the AND/OR semantic between ipset

Initial issue is on rhel9.7, kernel 5.14.0-570.60.1.el9_6.x86_64, nftables v1.0.9 (Old Doc Yak #3).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 778a79a and 1e6c68d.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (2 hunks)
🔇 Additional comments (2)
client/firewall/nftables/router_linux.go (2)

48-51: LGTM!

The maxPrefixesSet constant with a safety margin is appropriate for addressing the nftables limit issue described in the PR.


500-513: LGTM! Batching approach correctly adds elements to a single Set.

The batching logic properly adds all elements to the same nftables Set in chunks, avoiding the AND semantics issue flagged in the previous review. The loop correctly uses min() for bounds checking and includes appropriate logging and error handling.

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from 1e6c68d to cc31226 Compare November 26, 2025 13:30
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

498-498: Consider using leveled logging for consistency.

Lines 498 and 512 use log.Printf, which bypasses the logging framework's level filtering. The rest of the codebase uses structured logging (log.Debugf, log.Infof, log.Tracef).

Consider this change for consistency:

-	log.Printf("Created new ipset: %s with %d initial prefixes (total prefixes %d)", setName, len(initialElements)/2, len(elements)/2)
+	log.Infof("Created new ipset: %s with %d initial prefixes (total prefixes %d)", setName, len(initialElements)/2, len(elements)/2)

And similarly on line 512:

-		log.Printf("Added new prefixes (%d) in ipset: %s", nSubPrefixes, setName)
+		log.Infof("Added new prefixes (%d) in ipset: %s", nSubPrefixes, setName)

This allows operators to control log verbosity via level configuration. Given that this fix handles large peer counts (thousands), Infof is appropriate for tracking batch progress.

Also applies to: 512-512

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e6c68d and cc31226.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (2 hunks)
🔇 Additional comments (2)
client/firewall/nftables/router_linux.go (2)

48-51: LGTM: Conservative batch limit with appropriate margin.

The maxPrefixesSet constant of 1500 provides a reasonable safety margin below the observed failure threshold of 1638 prefixes. Grouping refreshRulesMapError into the const block is a clean organizational improvement.


487-514: LGTM: Batching logic correctly implements OR semantics.

The batching implementation is sound and correctly solves the nftables limitation:

  1. AddSet creates the set with the initial batch (up to 3000 elements)
  2. SetAddElements adds subsequent batches to the same set (line 506)
  3. All elements end up in one set, so lookups match if the IP is in any batch (OR semantics)

This directly addresses the reported failure at ~3,280 peers by chunking element additions. The use of min(maxElements, nElements) on line 490 correctly prevents slice bounds panics for small input sets.

Note: The past review comment flagging "Critical: AND semantics" was based on a misunderstanding. The code does not create multiple sets per rule; it populates a single set incrementally, which is the correct approach.

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from cc31226 to 1bff5cf Compare November 26, 2025 13:34
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
client/firewall/nftables/router_linux.go (1)

500-513: Consider reducing log verbosity for multiple batches.

The batching logic is correct and handles large prefix sets properly. However, line 512 uses log.Infof for each batch, which could create significant log volume when many batches are needed (e.g., 4+ Info messages for 6,375 prefixes).

Consider one of these approaches:

  1. Change line 512 to log.Debugf to reduce verbosity
  2. Add a summary log at the end instead of per-batch logging
  3. Keep log.Infof only for the final batch

Example diff to reduce verbosity:

-		log.Infof("Added new prefixes (%d) in ipset: %s", nSubPrefixes, setName)
+		log.Debugf("Added new prefixes (%d) in ipset: %s", nSubPrefixes, setName)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc31226 and 1bff5cf.

📒 Files selected for processing (1)
  • client/firewall/nftables/router_linux.go (2 hunks)
🔇 Additional comments (2)
client/firewall/nftables/router_linux.go (2)

48-51: LGTM! Reasonable safety margin for batching.

The constant maxPrefixesSet = 1500 provides a safe margin below the observed failure threshold. Moving refreshRulesMapError into the same const block is a reasonable cleanup.


487-498: LGTM! Initial batch creation is correct.

The initial batch logic correctly handles the case where total elements are fewer than maxElements using min(). The error message and logging are clear and informative.

@nhenneaux
Copy link
Author

I addressed the issue on multiple ipset not working with OR semantic (I haven't found resources confirming CodeRabbit but it makes sense) It also addresses the issue with other IpSet with too much elements. Limit seems to be 1638 prefixes (3276 elements), it fails above.

@nhenneaux nhenneaux force-pushed the fix(router)-nft-tables-limit-number-of-peers-source branch from 1bff5cf to b1243d1 Compare November 26, 2025 13:41
@sonarqubecloud
Copy link

@nhenneaux
Copy link
Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 26, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@nhenneaux
Copy link
Author

Hi @braginini

We have tested the fix in our environment and it seems to work as expected without the error preventing acl with huge number of peers. Could you check this to integrate the fix in the upstream? 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants