fix: replace permanent blacklisting with clock skew tolerance window for WebSocket telemetry (#596)#617

Open
ionfwsrijan wants to merge 3 commits into
KanishJebaMathewM:mainfrom
ionfwsrijan:fix/issue-596-websocket-clock-skew

@ionfwsrijan ionfwsrijan commented Jun 18, 2026

Contributor

Problem

The WebSocket tracking handler stored device timestamps in Redis with a 24-hour TTL. If a device sent a timestamp ahead of the server (e.g., clock 10 minutes fast), that future value poisoned the sequence cache: once the device clock was corrected, every subsequent legitimate ping carried an older timestamp, was treated as out of order, and was silently dropped for 24 hours. Device clock drift from NTP corrections or mobile OS sleep cycles therefore caused effectively permanent false-positive blacklisting.

Solution

  1. Added CLOCK_SKEW_TOLERANCE_MS config (default ±5 minutes), configurable via process.env.CLOCK_SKEW_TOLERANCE_MS
  2. Inserted a server-side timestamp validation gate before the out-of-order sequencer: if |deviceTime - serverTime| > tolerance, the update is ignored with a warning and the Redis sequence key is never updated
  3. If within tolerance, normal out-of-order detection proceeds as before
  4. Documented the new env var in .env.example
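
The gate in step 2 can be illustrated with a minimal sketch. The function name `checkClockSkew` and the constant `DEFAULT_TOLERANCE_MS` are hypothetical and are not taken from tracker.js; only the comparison `|deviceTime - serverTime| > tolerance` and the 300000 ms default come from this PR.

```javascript
// Hypothetical helper illustrating the skew gate; not the code in tracker.js.
const DEFAULT_TOLERANCE_MS = 300000; // 5 minutes, the default stated in this PR

function checkClockSkew(deviceEpochMs, serverNowMs, toleranceMs = DEFAULT_TOLERANCE_MS) {
  const skewMs = Math.abs(deviceEpochMs - serverNowMs);
  // Comparisons against NaN are false, so an unparseable timestamp is rejected too.
  return { skewMs, accepted: skewMs <= toleranceMs };
}

// The scenario from the Problem section: a device clock running 10 minutes fast.
const serverNow = Date.UTC(2026, 5, 18, 12, 0, 0);
const tenMinutesFast = checkClockSkew(serverNow + 10 * 60 * 1000, serverNow);
const oneMinuteSlow = checkClockSkew(serverNow - 60 * 1000, serverNow);
console.log(tenMinutesFast.accepted, oneMinuteSlow.accepted); // false true
```

In this sketch a ping exactly at the tolerance boundary is accepted, which matches the strict `>` comparison described in step 2.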

Files changed

  • backend/api/src/sockets/tracker.js — added clock skew validation before seqKey write
  • backend/api/.env.example — documented CLOCK_SKEW_TOLERANCE_MS

Closes #596

Summary by CodeRabbit

  • Bug Fixes
    • Improved WebSocket tracking by validating incoming device timestamps against a configurable clock-skew tolerance; out-of-window telemetry is ignored.
    • Updated customer logout to use the backend logout endpoint with user headers, ensuring local sign-out always completes.
  • Configuration
    • Added CLOCK_SKEW_TOLERANCE_MS to example environment settings (default: 300000 ms).
  • Tests
    • Enhanced bid error-path integration tests with wallet/escrow setup.
    • Updated tracker unit test timing to be relative to the current time.

@github-actions

Contributor

🎉 Thank you for your contribution! Your pull request has been received and will be reviewed shortly.

If you enjoy the project, please consider giving the repository a ⭐. You can also follow my GitHub profile to stay updated on future open-source projects.

Thanks for being part of the community! 🚀

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Walkthrough

Adds a configurable CLOCK_SKEW_TOLERANCE_MS environment variable (default 300000 ms) to guard the tracker WebSocket handler against device clock drift. In handleLocationPing, the absolute difference between the device epoch and server time is computed; telemetry exceeding the tolerance is dropped with a [TRUXIFY CLOCK SKEW] warning and does not reach sequence-cache logic. The variable is documented in .env.example, and the Redis sequence test is updated to use relative timestamps. Additionally, two bid-acceptance error-path test cases are updated with wallet-address mocks and escrow deposit configuration, and the customer app logout flow is refactored from bearer-token to header-based auth, while redundant driver app imports and null checks are cleaned up.

Changes

Clock Skew Filtering for WebSocket Telemetry

| Layer / File(s) | Summary |
| --- | --- |
| Env variable declaration and module-level constant (backend/api/.env.example, backend/api/src/sockets/tracker.js) | .env.example adds a # WebSocket Tracking section documenting CLOCK_SKEW_TOLERANCE_MS=300000 with inline comments describing max allowable clock drift. tracker.js reads the same env var at module load with an identical 300000 ms fallback default. |
| Clock-skew validation gate in handleLocationPing (backend/api/src/sockets/tracker.js) | After timestamp parsing, computes Math.abs(deviceEpoch - Date.now()); if the result exceeds CLOCK_SKEW_TOLERANCE_MS, logs a [TRUXIFY CLOCK SKEW] warning and returns early, discarding the telemetry update before any sequence-cache or buffer write. |
| Clock skew test timestamp alignment (backend/api/test/unit/tracker.test.js) | Redis sequence-gate test now uses a relative device timestamp (60 seconds in the past) instead of a fixed epoch, ensuring correct out-of-order telemetry rejection validation without clock-skew tolerance interference. |

Bid Acceptance Test Setup for Wallet Mocking

| Layer / File(s) | Summary |
| --- | --- |
| Wallet setup in bid-acceptance error tests (backend/api/test/integration/bids.test.js) | Two error-path test cases ("load offer is no longer available" and "order is no longer pending") now insert customer and driver profiles with polygon_wallet_address, add polygon_wallet_address to driver_details, and configure mockEscrowDeposit to resolve with a transaction hash before calling the bid-accept endpoint. |

Flutter Mobile App Refactoring and Cleanup

| Layer / File(s) | Summary |
| --- | --- |
| Customer profile service logout flow rewrite (apps/customer/lib/services/profile_service.dart) | ProfileService.logout switches from session-token-based _httpClient.post with Authorization header and 5-second timeout to _apiClient.post with x-user-id/x-user-role headers; adds an early return when userId is null, and ensures local sign-out always occurs in the finally block regardless of the backend logout result. |
| Driver app imports and null-check simplification (apps/driver/lib/screens/profile_screen.dart, apps/driver/lib/services/marketplace_repository.dart) | Removes a redundant package:http/http.dart import from profile_screen.dart; simplifies marketplace_repository.dart's newRecord validation from an explicit null check (newRecord != null) to a direct isNotEmpty check (newRecord.isNotEmpty). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~13 minutes

Possibly related PRs

  • KanishJebaMathewM/Truxify#559: Updates bid-acceptance escrow deposit and refund behavior, with overlapping changes to escrow mocks and assertions in bids.test.js.
  • KanishJebaMathewM/Truxify#604: Adds escrow precondition changes that require both wallet addresses before escrow runs, aligning with this PR's wallet-address and escrowDeposit mock configuration updates.

Suggested labels

level:intermediate, type:testing, flutter, customer-app, driver-app

Poem

🐇 A rabbit checks the clock with care,
Device time drifted — far from fair!
Five minutes skewed? That ping's ignored,
The telemetry buffer safely stored.
Clocks in sync, the map stays true —
Hop along, fresh data's due! 🗺️

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues check | ⚠️ Warning | PR partially addresses issue #596: implements clock skew detection [596] via CLOCK_SKEW_TOLERANCE_MS, but does not complete all four required fixes; server-time sequencing [596], null/undefined coordinate checks [596], and the circuit breaker mechanism [596] are missing. | Implement remaining fixes from #596: use server timestamps for sequence detection, change coordinate validation from falsy to null/undefined checks, and add circuit breaker for stuck sequences. |
| Out of Scope Changes check | ⚠️ Warning | PR contains out-of-scope changes: test file modifications in bids.test.js adding wallet/escrow setup and marketplace_repository.dart changes to null checks are unrelated to issue #596's WebSocket telemetry clock skew problem. | Remove out-of-scope changes from bids.test.js wallet setup and marketplace_repository.dart isNotEmpty check; focus PR solely on clock skew tolerance implementation. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the main change: replacing permanent blacklisting with a clock skew tolerance window for WebSocket telemetry, matching the clock skew validation implementation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/api/src/sockets/tracker.js (1)

342-354: ⚡ Quick win

Add explicit regression assertions for skew-drop behavior and sequence-key immutability.

The new gate is critical to #596 behavior. Please add/extend unit coverage to assert:

  1. out-of-window skew returns early and does not call Redis set,
  2. in-window skew still reaches normal sequence logic.
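
The two assertions requested above could be sketched roughly as follows. Everything here is an assumption made for illustration: `createFakeRedis` is a hand-rolled stub that models only `get` and `set`, and `handlePingForTest` is a stand-in for `handleLocationPing`, whose real signature and sequence logic are not shown in this conversation.

```javascript
// Fake Redis client that records every call; only get/set are modeled (assumption).
function createFakeRedis() {
  const store = new Map();
  const calls = { get: 0, set: 0 };
  return {
    calls,
    async get(key) { calls.get += 1; return store.has(key) ? store.get(key) : null; },
    async set(key, value) { calls.set += 1; store.set(key, String(value)); return 'OK'; },
  };
}

// Hypothetical stand-in for handleLocationPing: skew gate first, then sequence logic.
async function handlePingForTest(redis, { deviceId, deviceEpochMs }, nowMs, toleranceMs) {
  if (!(Math.abs(deviceEpochMs - nowMs) <= toleranceMs)) return 'skew-dropped';
  const seqKey = `seq:${deviceId}`; // key format is illustrative only
  const last = await redis.get(seqKey);
  if (last !== null && deviceEpochMs <= Number(last)) return 'out-of-order';
  await redis.set(seqKey, deviceEpochMs);
  return 'accepted';
}

async function runSkewRegressionChecks() {
  const now = Date.now(); // relative timestamps, as in the updated tracker test
  const tolerance = 300000;
  const redis = createFakeRedis();

  // 1. Out-of-window skew: must return early without touching Redis.
  const far = await handlePingForTest(redis, { deviceId: 'd1', deviceEpochMs: now + 600000 }, now, tolerance);
  const callsAfterReject = { ...redis.calls };

  // 2. In-window skew: must reach the sequence logic and write the key.
  const near = await handlePingForTest(redis, { deviceId: 'd1', deviceEpochMs: now - 60000 }, now, tolerance);
  return { far, callsAfterReject, near, callsAfterAccept: { ...redis.calls } };
}
```

In a real suite the same checks would presumably run against the exported handler with the project's existing Redis mock rather than a local stand-in.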
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/api/src/sockets/tracker.js` around lines 342 - 354, Add explicit unit
test coverage for the clock skew validation logic (the CLOCK_SKEW_VALIDATION
gate section) to ensure regression prevention. Create test cases that assert:
when the calculated skewMs exceeds CLOCK_SKEW_TOLERANCE_MS, the function returns
early without calling any Redis set operations, and when skewMs is within the
tolerance threshold, the function continues execution to the normal sequence key
logic. This validates both the reject path (out-of-window skew) and the accept
path (in-window skew) work correctly together.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/api/src/sockets/tracker.js`:
- Line 11: The CLOCK_SKEW_TOLERANCE_MS constant uses the || operator which
treats the parsed value of "0" as falsy and would incorrectly fall back to the
default, and it does not validate that the parsed value is non-negative. Replace
the || logic with explicit validation to ensure the parsed integer from
process.env.CLOCK_SKEW_TOLERANCE_MS is finite and greater than or equal to zero,
falling back to the default 300000 only when validation fails or the environment
variable is unset.
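
One way the explicit validation described above might look, as a sketch only: `parseClockSkewTolerance` and `DEFAULT_CLOCK_SKEW_TOLERANCE_MS` are hypothetical names, and the real line 11 of tracker.js is not reproduced in this thread.

```javascript
// Hypothetical replacement for the `parseInt(...) || 300000` pattern.
const DEFAULT_CLOCK_SKEW_TOLERANCE_MS = 300000;

function parseClockSkewTolerance(raw, fallback = DEFAULT_CLOCK_SKEW_TOLERANCE_MS) {
  // Unset or blank: use the default.
  if (raw === undefined || raw === null || String(raw).trim() === '') return fallback;
  const parsed = Number.parseInt(raw, 10);
  // Accept 0 explicitly; reject NaN and negative values.
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
}

const CLOCK_SKEW_TOLERANCE_MS = parseClockSkewTolerance(process.env.CLOCK_SKEW_TOLERANCE_MS);
```

Note that `Number.parseInt` still accepts a numeric prefix such as "12abc" as 12; a stricter parser would be a separate design choice.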


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 61a4e604-48ef-4852-8dc5-282bf05dafbf

📥 Commits

Reviewing files that changed from the base of the PR and between 4656ed1 and e06b5cc.

📒 Files selected for processing (2)
  • backend/api/.env.example
  • backend/api/src/sockets/tracker.js

Comment thread backend/api/src/sockets/tracker.js
@ionfwsrijan ionfwsrijan force-pushed the fix/issue-596-websocket-clock-skew branch from e06b5cc to 12506e4 Compare June 18, 2026 15:46

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/api/src/sockets/tracker.js (1)

331-333: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix coordinate presence validation to allow valid 0 values.

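At Line 331, if (!latitude || !longitude) rejects any coordinate whose latitude or longitude is 0, such as (0, 0) or any point on the equator or prime meridian, because 0 is falsy. Validate null/undefined and numeric bounds instead.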

Suggested fix
-  if (!latitude || !longitude) {
+  const lat = Number(latitude);
+  const lng = Number(longitude);
+  if (
+    // Number(null) is 0, so check presence before the numeric checks
+    latitude == null || longitude == null ||
+    !Number.isFinite(lat) ||
+    !Number.isFinite(lng) ||
+    lat < -90 || lat > 90 ||
+    lng < -180 || lng > 180
+  ) {
     return ws.send(JSON.stringify({ error: 'Missing mandatory tracking parameters (lat, lng).' }));
   }
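
As a standalone illustration of the same idea, the check can be pulled into a small helper. The name `parseCoordinates` is hypothetical and does not appear in tracker.js; the sketch also treats an empty string as missing, which is an assumption about the payload.

```javascript
// Hypothetical helper: returns { lat, lng } for valid input, otherwise null.
function parseCoordinates(latitude, longitude) {
  // Reject missing values first: Number(null) and Number('') are both 0,
  // which would otherwise pass the finite and range checks.
  const missing = (v) => v === null || v === undefined || v === '';
  if (missing(latitude) || missing(longitude)) return null;

  const lat = Number(latitude);
  const lng = Number(longitude);
  const inRange =
    Number.isFinite(lat) && Number.isFinite(lng) &&
    lat >= -90 && lat <= 90 &&
    lng >= -180 && lng <= 180;
  return inRange ? { lat, lng } : null;
}
```

With this shape, `parseCoordinates(0, 0)` yields a valid point while `parseCoordinates(null, 10)` and `parseCoordinates(91, 0)` are both rejected.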
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/api/src/sockets/tracker.js` around lines 331 - 333, The coordinate
validation in the tracking handler currently rejects valid zero values because
the condition `if (!latitude || !longitude)` treats 0 as falsy. Replace this
validation logic to explicitly check for null and undefined values instead,
allowing 0 as a valid coordinate. Additionally, consider adding numeric bounds
validation to ensure the coordinates are within acceptable ranges for latitude
and longitude values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e3799ca6-4a42-4fd2-9e4e-0a672b13c760

📥 Commits

Reviewing files that changed from the base of the PR and between e06b5cc and 12506e4.

📒 Files selected for processing (2)
  • backend/api/.env.example
  • backend/api/src/sockets/tracker.js
✅ Files skipped from review due to trivial changes (1)
  • backend/api/.env.example

…or tests

Upstream wallet validation gate returns 422 when wallets are missing,
breaking tests that expect 500 from the subsequent RPC error.

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
apps/customer/lib/services/profile_service.dart (1)

53-54: 💤 Low value

Replace print with structured logging.

Using print() is not recommended for production Flutter apps. Consider using developer.log (already imported in ApiClient) or a logging framework for consistency with the rest of the codebase.

+import 'dart:developer' as developer;
+
 ...
     } catch (e) {
-      print('Backend logout failed: $e');
+      developer.log('Backend logout failed: $e', name: 'ProfileService');
     } finally {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/customer/lib/services/profile_service.dart` around lines 53 - 54, The
catch block that handles backend logout failures is using print() for error
logging, which is not recommended for production Flutter applications. Replace
the print() statement with developer.log(), which is already imported in
ApiClient, or use the logging framework consistent with the rest of the
codebase. Update the error handling in the catch block to use the appropriate
structured logging method instead of the print() call, ensuring the error
details are still captured and logged properly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 362a98c1-8c89-4c0c-80c7-336f736a4d38

📥 Commits

Reviewing files that changed from the base of the PR and between 16915f1 and 38e3eb8.

📒 Files selected for processing (4)
  • apps/customer/lib/services/profile_service.dart
  • apps/driver/lib/screens/profile_screen.dart
  • apps/driver/lib/services/marketplace_repository.dart
  • backend/api/test/unit/tracker.test.js
💤 Files with no reviewable changes (1)
  • apps/driver/lib/screens/profile_screen.dart

@ionfwsrijan

Contributor Author

@KanishJebaMathewM Please review this and add labels


Development

Successfully merging this pull request may close these issues.

WebSocket Telemetry Uses Device Timestamps for Out-of-Order Detection — Clock Skew Causes Permanent Telemetry Blacklisting

1 participant