Skip to content

feat: Introduce route saving, environmental data computation, and scoring functionalities with a new data processing API.#19

Merged
kaihere14 merged 57 commits into
kaihere14:mainfrom
GURUDAS-DEV:main
Feb 23, 2026
Merged

feat: Introduce route saving, environmental data computation, and scoring functionalities with a new data processing API.#19
kaihere14 merged 57 commits into
kaihere14:mainfrom
GURUDAS-DEV:main

Conversation

@GURUDAS-DEV

@GURUDAS-DEV GURUDAS-DEV commented Feb 22, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

Release Notes

  • New Features

    • Git merge driver for simplified conflict resolution workflow
    • API error handling with structured JSON responses
  • Improvements

    • Windows compatibility for Pathway functionality
    • Enhanced data fetching reliability with retry logic and timeouts
    • Improved breakpoint calculation with duplicate detection
    • Configurable deployment parameters for hosts, CORS, and service timeouts
  • Chores

    • Code cleanup and inline documentation consolidation

GURUDAS-DEV and others added 30 commits February 14, 2026 12:20
- Added HomeMap component to display a Mapbox map with geolocation features.
- Updated Home page to render the HomeMap component.
- Introduced PrivateLayout component for consistent layout structure.
- Updated TypeScript configuration to include additional type definitions.
- Added Mapbox GL and related dependencies to package.json and package-lock.json.
- Created global.d.ts for Mapbox CSS module declaration.
…nation selection with search and geolocation.
…RouteItemClient components, and configure Next.js image domains.
GURUDAS-DEV and others added 26 commits February 15, 2026 18:54
…ature

- Updated type definitions to include �qiScore in RouteData.
- Modified API response handling to extract AQI data from backend.
- Enhanced UI component to display AQI information with color-coded categories.
- Implemented AQI score calculation logic in the backend, including pollutant breakdown.
- Added error handling for AQI data fetching and display.
- Improved breakpoint calculation to optimize API calls for AQI data.
- Updated documentation to reflect changes in AQI integration and testing procedures.
- Added scoring transformers in 	ransformers.py to calculate weather, AQI, and traffic scores.
- Created functions for computing overall route scores based on environmental data.
- Introduced batch processing for multiple routes in compute_batch_scores.
- Established a cron job scheduler for periodic route score computation.
- Developed a Pathway client for communication with the scoring server.
- Added Redis utility for caching breakpoints.
- Created MongoDB schemas for storing breakpoints and routes.
- Documented backend architecture and data flow in README.md.
- Added tests for new functionalities in the 	ests directory.
- Updated requirements.txt with necessary dependencies.
…ute environmental scoring, integrated with client UI.
…Pathway with Python fallback and integrate with client and server components.
…xpress API server with data computation scheduler.
…ring functionalities with a new data processing API.
@vercel

vercel Bot commented Feb 22, 2026

Copy link
Copy Markdown
Contributor

@GURUDAS-DEV is attempting to deploy a commit to the kaihere14's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented Feb 22, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This PR introduces a Git merge driver configuration and comprehensive enhancements to API error handling, compute utilities, and data pipeline integration. Changes include new Django middleware for API exception handling, Windows compatibility guards for Pathway, retry/timeout mechanisms for external API calls, refactored batching logic for concurrent processing, and removal of documentation comments throughout the codebase.

Changes

Cohort / File(s) Summary
Git Configuration & Merge Strategy
.gitattributes, scripts/README.md, scripts/git-merge-theirs.sh, scripts/git-merge-theirs.ps1, scripts/setup-merge-theirs.sh, scripts/setup-merge-theirs.ps1
Configures a custom Git merge driver ("theirs") that prefers incoming changes during merges. Includes shell and PowerShell implementations with setup scripts and documentation for Unix and Windows platforms.
Django API Middleware & Configuration
data-processing/dataProcessingServer/api/middleware.py, data-processing/dataProcessingServer/dataProcessingServer/settings.py, data-processing/dataProcessingServer/api/pathway/pipeline.py, data-processing/dataProcessingServer/api/views.py
Adds ApiExceptionMiddleware to catch unhandled exceptions on API paths and return JSON responses. Extends host origin and CORS configuration to support multiple environment variables. Introduces Windows guard for Pathway availability and conditional schema definition.
Schema & Comment Cleanup
server/src/Schema/breakPoints.ts, server/src/Schema/route.schema.ts, server/src/index.ts, server/src/routes/savedRoutes.routes.ts, server/src/routes/score.routes.ts, server/src/utils/scheduler/computeData.scheduler.ts
Removes descriptive header comments and documentation blocks from schema definitions and route handlers without altering functional behavior or exports.
AQI Data Fetching & Processing
server/src/utils/compute/aqi.compute.ts, server/src/controllers/score.controller.ts
Implements retry logic with timeout and AbortController, adds API response validation, and introduces typed structures (AQIAPIResponse, PointAQIResult, RouteAQIResult) for standardized data aggregation and error handling.
Breakpoint Calculation
server/src/utils/compute/breakPoint.compute.ts
Adds helper functions for dynamic breakpoint count calculation, coordinate deduplication with tolerance, and breakpoint extraction. Refactors breakpoint generation with per-route accumulation and fallback for duplicate coordinates.
Weather Data Fetching & Concurrency
server/src/utils/compute/weather.compute.ts
Replaces all-at-once fetch strategy with concurrency-limited batching, adds explicit API response validation, implements error handling via try/catch, and updates result aggregation with per-point tracking.
Route Caching & Breakpoint Assembly
server/src/controllers/savedRoutes.controllers.ts
Adds UUID format validation for cache path lookup with silent failure fallback. Simplifies breakpoint construction logic from nested loops to single-pass iteration using forEach, reduces separate fallback branches.
Controller Signature & Return Types
server/src/controllers/score.controller.ts
Updates getScoreController signature to multiline parameter list with explicit Promise return type annotation; preserves logical behavior and error handling.
Pathway Integration & Health Checks
server/src/utils/scheduler/pathwayClient.ts
Modifies sendToPathway to include request body with usePathway flag and explicit error handling. Improves checkPathwayHealth with configurable timeout via environment variable, URL normalization, and structured response handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

Poem

🐰 Git commits merge like fresh carrots now,
With "theirs" preferred—no merge-conflict row!
Retries fetch weather and air so fine,
While breakpoints calculate, perfectly aligned. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: it introduces route saving, environmental data computation (AQI/weather), and scoring functionalities with a new data processing API across multiple files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
server/src/utils/compute/aqi.compute.ts (2)

83-86: ⚠️ Potential issue | 🟠 Major

Non-OK HTTP responses bypass the retry loop.

When response.ok is false (e.g., HTTP 429 Too Many Requests, 503 Service Unavailable), the function returns null immediately instead of continuing the retry loop. Only exceptions (network failures, timeouts) actually trigger retries. This significantly reduces the value of the retry mechanism, since transient server errors from AQICN won't be retried.

Use continue instead of return null for retryable status codes, and reserve immediate return null for non-retryable client errors (4xx except 429).

Proposed fix
       if (!response.ok) {
-        console.error(`AQI API error for (${lat}, ${lon}): ${response.status}`);
-        return null;
+        const retryable = response.status === 429 || response.status >= 500;
+        console.error(
+          `AQI API error for (${lat}, ${lon}): ${response.status}${retryable ? ' (will retry)' : ''}`
+        );
+        if (!retryable) {
+          return null;
+        }
+        if (attempt < AQI_MAX_RETRIES) {
+          await new Promise((resolve) =>
+            setTimeout(resolve, 500 * (attempt + 1))
+          );
+        }
+        continue;
       }

Note: The same pattern exists in fetchWeatherForPoint in weather.compute.ts (though that function lacks retries entirely). If retries are added there, the same consideration applies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/aqi.compute.ts` around lines 83 - 86, The current
AQI fetch block in the function handling the HTTP response (the response.ok
check in aqi.compute.ts) immediately returns null on any non-OK status, which
bypasses the retry loop; change that logic so retryable server responses (5xx)
and 429 result in continue to let the retry loop attempt again, while only
non-retryable client errors (4xx except 429) immediately return null; ensure the
loop still returns null after exhausting retries and preserve the existing error
logging (reference the response.ok check and the surrounding retry loop in the
AQI fetch function).

90-96: ⚠️ Potential issue | 🟡 Minor

API-level error status also bypasses retries.

Similar to the HTTP status issue above: when apiResponse.status !== "ok", the function returns null immediately. Some AQICN API-level errors may be transient. Consider using continue for these as well, or at minimum logging the attempt number for observability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/aqi.compute.ts` around lines 90 - 96, The API-level
status check inside aqi.compute.ts currently returns null when
apiResponse.status !== "ok" (in the block referencing lat, lon and apiResponse),
which bypasses retries; change that behavior to treat non-"ok" API responses as
transient by using continue to advance the retry loop instead of returning, and
add a log that includes the attempt/retry number (use the existing retry/attempt
loop variable, or introduce one if missing) along with lat/lon and
apiResponse.status for observability; update the block that currently does
console.error(...) and return null to log the attempt and continue the retry
loop (or fall through to the next retry) so transient API errors are retried.
server/src/controllers/savedRoutes.controllers.ts (2)

65-103: ⚠️ Potential issue | 🟠 Major

Route persisted before breakpoints — partial write on insertMany failure.

Route.create (line 65) commits the route to MongoDB before breakpoints are inserted (line 137). If insertMany throws, the outer catch returns a 500, but the route document already exists. The client sees an error and may retry, creating a duplicate route.

Consider wrapping the route creation + breakpoint insertion in a MongoDB transaction, or inserting breakpoints inside their own try/catch so the route save still succeeds (and breakpoints can be backfilled later).

Option A: isolate breakpoint insertion failure so the route save succeeds
-    if (breakPointDocs.length > 0) {
-      await BreakPoint.insertMany(breakPointDocs);
-    }
+    if (breakPointDocs.length > 0) {
+      try {
+        await BreakPoint.insertMany(breakPointDocs);
+      } catch (bpErr) {
+        console.error("Failed to insert breakpoints:", bpErr);
+        // Route is still saved; breakpoints can be recomputed later
+      }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 65 - 103, The
route is created via Route.create before breakpoints are persisted, so if
insertMany (the breakpoint insert) fails the route remains and clients may retry
causing duplicates; modify the flow to either wrap Route.create + breakpoint
insertion in a MongoDB session/transaction (use startSession(),
session.withTransaction() and pass { session } to both Route.create and the
breakpoints insert) or ensure Route.create is only called after
routeBreakpointsData is computed and inserted (i.e.,
computeBreakpoints/redis.get first, then insert breakpoints with insertMany
inside a try/catch that logs the error but still allows a successful route
save), referencing Route.create, routeBreakpointsData, computeBreakpoints,
redis.get and the insertMany breakpoint insertion logic so the operation is
atomic or the route save is not left partially written on insert failure.

77-78: ⚠️ Potential issue | 🟡 Minor

Bug in UUID regex: extra dash in the third character class.

The third group [0-9-a-f]{4} contains a literal dash between 9 and a, making it match - as a valid character inside hex groups. This allows strings that aren't valid UUIDs to pass validation.

🐛 Proposed fix
-      const uuidRegex =
-        /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9-a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+      const uuidRegex =
+        /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 77 - 78, The
uuidRegex defined in savedRoutes.controllers.ts has an accidental literal dash
in the third character class (`[0-9-a-f]{4}`) which allows '-' as a valid hex
digit; update the uuidRegex variable to remove the extra dash so each group uses
`[0-9a-f]` (i.e., the full pattern should be eight-4-4-4-12 hex digits with
hyphens between) and preserve the `i` flag for case-insensitivity.
server/src/utils/compute/breakPoint.compute.ts (1)

129-136: ⚠️ Potential issue | 🟠 Major

Max-routes cap (3) is lower than the schema validator's cap (5).

computeBreakpoints throws for more than 3 routes (line 134), but route.schema.ts allows up to 5 route options (line 155 of the schema). When a user saves a route with 4 or 5 options, the throw is silently caught in the controller (lines 100-102) and no breakpoints are generated at all — not even for the first 3 options.

Either raise this limit to 5 to match the schema, or process only the first 3 routes instead of throwing.

Option: process up to the limit instead of throwing
-  if (routes.length > 3) {
-    throw new Error("Maximum 3 routes allowed");
-  }
+  const maxRoutes = 5;
+  const routesToProcess = routes.slice(0, maxRoutes);

Then iterate over routesToProcess instead of routes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/breakPoint.compute.ts` around lines 129 - 136,
computeBreakpoints currently rejects inputs >3 while route.schema.ts allows up
to 5, causing saved routes with 4–5 options to produce no breakpoints; fix by
either (A) aligning the cap with the schema: change the check in
computeBreakpoints to allow up to 5 routes (match route.schema.ts), or (B)
process only the first 3 routes to preserve current behavior: create a
routesToProcess = routes.slice(0, 3) and iterate over routesToProcess instead of
routes in computeBreakpoints so extra options are ignored but the first entries
still produce breakpoints.
server/src/utils/scheduler/computeData.scheduler.ts (3)

96-96: ⚠️ Potential issue | 🟡 Minor

trafficValue is hardcoded to 0

This will produce inaccurate route scores if the Pathway model uses traffic as a meaningful input. At minimum, add a TODO so it isn't silently ignored.

🔧 Suggested annotation
-      trafficValue: 0,
+      trafficValue: 0, // TODO: replace with real-time traffic data
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` at line 96, The field
trafficValue in computeData.scheduler.ts is hardcoded to 0 and will silently
ignore traffic input for route scoring; update the code around the object that
sets trafficValue to either compute the real traffic metric (calling the
appropriate traffic lookup or Pathway input) or, at minimum, replace the literal
0 with a clear TODO comment and a thrown/noted placeholder (e.g., TODO: compute
trafficValue from TrafficService/Pathway inputs) so future readers know this
must be implemented; locate the assignment to trafficValue in the computeData
scheduler function and add the TODO and/or the actual computation using the
existing traffic APIs or Pathway model inputs.

182-207: ⚠️ Potential issue | 🟠 Major

Accumulated results are discarded; outer catch silently swallows system-level errors

Two related observability problems:

  1. The results array (lines 182–204) is populated on every batch iteration but is never logged, returned, or otherwise acted upon. Failed routes are completely invisible once runBatchScoring returns.
  2. The outer catch block (lines 205–207) swallows genuine system errors such as a DB connection failure on line 144. The inline comment ("individual route errors are captured per-task") is misleading — per-task errors are caught, but fatal outer failures are silently dropped.
🔧 Proposed fix
+    const failed = results.filter((r) => !r.success);
+    if (failed.length > 0) {
+      console.error(
+        `[runBatchScoring] ${failed.length}/${results.length} route(s) failed`,
+        failed.map((r) => ({ routeId: r.routeId, error: r.error }))
+      );
+    }
   } catch (error) {
-    /* batch scoring failed silently — individual route errors are captured per-task */
+    console.error("[runBatchScoring] Fatal error during batch scoring run:", error);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 182 - 207,
The results array built in runBatchScoring (results) is never surfaced and the
outer catch swallows system errors; change runBatchScoring to (1) capture
batch-level errors by replacing the silent catch with catch (err) and log or
rethrow the error (include context like "runBatchScoring failed" and the err)
instead of swallowing it, and (2) return or persist the accumulated results (the
results variable) so callers can observe per-task successes/failures; ensure
processRoute remains handling per-task errors but do not suppress fatal errors
(e.g., DB connection failures) — either log them via your logger and rethrow or
propagate the error to callers along with the results.

12-14: ⚠️ Potential issue | 🟠 Major

NaN interval possible when env var is set to a non-numeric string

The truthy guard (process.env.SCHEDULE_INTERVAL_MS ? ...) only fires the fallback when the variable is absent/empty. If it is set to an invalid value (e.g. "disabled", "abc"), parseInt returns NaN, and setInterval(fn, NaN) in Node.js coerces the delay to 0, effectively creating a tight scheduling loop that will hammer MongoDB and the Pathway service continuously.

🔧 Proposed fix
-const SCHEDULE_INTERVAL_MS = process.env.SCHEDULE_INTERVAL_MS
-  ? parseInt(process.env.SCHEDULE_INTERVAL_MS, 10)
-  : 30 * 60 * 1000;
+const DEFAULT_INTERVAL_MS = 30 * 60 * 1000;
+const _parsedInterval = parseInt(process.env.SCHEDULE_INTERVAL_MS ?? "", 10);
+const SCHEDULE_INTERVAL_MS =
+  Number.isFinite(_parsedInterval) && _parsedInterval > 0
+    ? _parsedInterval
+    : DEFAULT_INTERVAL_MS;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 12 - 14,
The SCHEDULE_INTERVAL_MS parsing can yield NaN for non-numeric env values
(causing setInterval to use 0); update the parsing logic where
SCHEDULE_INTERVAL_MS is defined to parse the env var (parseInt or Number) and
then validate the result (e.g., Number.isFinite or !Number.isNaN) and fall back
to the default 30*60*1000 when invalid; ensure any downstream use
(setInterval(fn, SCHEDULE_INTERVAL_MS)) receives a safe positive integer
(optionally enforce a minimum threshold) and reference the SCHEDULE_INTERVAL_MS
constant and any usages of setInterval in computeData.scheduler.ts to locate
changes.
data-processing/dataProcessingServer/api/pathway/pipeline.py (1)

196-200: ⚠️ Potential issue | 🟡 Minor

Failed route results are silently dropped.

When json.loads(row["score_json"]) fails, the route is skipped with continue and no log is emitted. This could cause a response with fewer routes than requested, and the caller would have no indication which routes failed or why.

🐛 Proposed fix — log on failure
         try:
             score_data = json.loads(row["score_json"])
             route_scores.append(score_data)
-        except (json.JSONDecodeError, KeyError):
-            continue
+        except (json.JSONDecodeError, KeyError) as e:
+            logger.warning("Failed to parse score result for a route: %s", e)
+            continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
196 - 200, The code silently drops failed parses around
json.loads(row["score_json"]) causing missing route entries; update the except
block that currently catches (json.JSONDecodeError, KeyError) to log the failure
before continuing: include the exception details and identifying context from
the row (e.g., route id or whatever identifier is in row) so callers can see
which score_json failed; reference the existing logger instance (e.g., logger or
process_logger) and keep the continue to preserve behavior after logging.
data-processing/dataProcessingServer/dataProcessingServer/settings.py (1)

1-11: ⚠️ Potential issue | 🟡 Minor

Update docstring to reflect the actual Django version (5.2.11, not 6.0.2).

The auto-generated docstring and documentation links reference Django 6.0.2, but the project uses Django 5.2.11 (as pinned in requirements.txt). The requirements.txt even includes a comment acknowledging this mismatch. Update the docstring and URLs to prevent developers from consulting incorrect documentation.

Fix
-Generated by 'django-admin startproject' using Django 6.0.2.
+Generated by 'django-admin startproject' using Django 5.2.11.
 
 For more information on this file, see
-https://docs.djangoproject.com/en/6.0/topics/settings/
+https://docs.djangoproject.com/en/5.2/topics/settings/
 
 For the full list of settings and their values, see
-https://docs.djangoproject.com/en/6.0/ref/settings/
+https://docs.djangoproject.com/en/5.2/ref/settings/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py` around
lines 1 - 11, The module docstring at the top of settings.py incorrectly
references Django 6.0.2 and docs URLs for 6.0; update the docstring text and the
documentation links to reference Django 5.2.11 (the version pinned in
requirements.txt) and point to the corresponding docs (e.g.,
https://docs.djangoproject.com/en/5.2/topics/settings/ and
https://docs.djangoproject.com/en/5.2/ref/settings/); make this change directly
in the top-level string in settings.py so developers consult the correct version
and links.
🧹 Nitpick comments (15)
server/src/utils/compute/aqi.compute.ts (2)

167-203: Same inter-batch delay recommendation as weather.compute.ts.

No delay between batches here either. AQICN also has rate limits, and since fetchAQIForPoint already includes retries, adding a small inter-batch pause would complement the retry strategy and reduce the chance of triggering rate limits in the first place.

Also, the step comments ("2." on Line 167, "3." on Line 205) appear to be left over from a previous version — there's no "1." step comment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/aqi.compute.ts` around lines 167 - 203, Add a small
inter-batch delay when iterating batches to avoid hitting AQICN rate limits:
after processing each batch in the loop that slices allPointTasks (the code
using CONCURRENCY_LIMIT, pointResultsMap, and batchResults), await a short
sleep/delay before continuing to the next slice so retries in fetchAQIForPoint
have room to succeed; implement or reuse a sleep helper and call it after
storing batchResults. Also clean up the step comments around this block (the
"2." and the subsequent "3." comments) so they reflect the current sequence or
are removed.

3-22: Redundant | undefined on optional properties.

Fields marked with ? are already T | undefined in TypeScript. The explicit | undefined union on dominentpol (Line 5), iaqi (Line 15), and time (Line 21) is redundant and adds visual noise.

Proposed cleanup
 export interface AQIData {
   aqi: number;
-  dominentpol?: string | undefined;
+  dominentpol?: string;
   iaqi?:
     | {
         pm25?: { v: number };
@@ ...
       }
-    | undefined;
+    ;
   time?:
     | {
         s: string;
         tz: string;
       }
-    | undefined;
+    ;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/aqi.compute.ts` around lines 3 - 22, The AQIData
interface contains redundant "| undefined" unions on optional properties; update
the interface by removing the explicit "| undefined" for the optional members
(dominentpol, iaqi, and time) so they remain declared using the optional
modifier (e.g., dominentpol?: string; iaqi?: { ... }; time?: { s: string; tz:
string; };) while preserving the existing nested shapes in the iaqi and time
types.
server/src/utils/compute/weather.compute.ts (1)

114-161: Consider adding an inter-batch delay for API rate-limit protection.

The batching logic correctly limits concurrency to 5 parallel requests, but there is no delay between consecutive batches. With many routes/points, this can fire requests as fast as each batch resolves, potentially hitting OpenWeatherMap's rate limits (especially on free-tier keys). A small await sleep(...) between batches would add resilience.

Also, the PointTask interface (Lines 108-112) and the entire batching/aggregation pattern are nearly identical to aqi.compute.ts. If this pattern grows to more compute modules, extracting a shared generic batch-fetcher utility would reduce duplication.

Example inter-batch delay
     for (let i = 0; i < allPointTasks.length; i += CONCURRENCY_LIMIT) {
+      if (i > 0) {
+        await new Promise((resolve) => setTimeout(resolve, 200));
+      }
       const batch = allPointTasks.slice(i, i + CONCURRENCY_LIMIT);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/weather.compute.ts` around lines 114 - 161, Add a
small inter-batch delay to protect against API rate limits: after processing
each slice of allPointTasks with the existing CONCURRENCY_LIMIT loop and before
continuing to the next batch, await a shared sleep/delay function (e.g.,
sleep(ms)) to pause for a configurable interval; update anywhere that calls
fetchWeatherForPoint (within the batch.map) to respect this pacing. Also
consider extracting the repeated batching pattern (PointTask,
PointWeatherResult, the for loop that fills pointResultsMap, and
CONCURRENCY_LIMIT) into a reusable helper (e.g., a generic batchFetcher utility)
so aqi.compute.ts and weather.compute.ts can reuse the same logic and
configuration.
server/src/controllers/savedRoutes.controllers.ts (1)

152-174: Minor: cascade delete is not atomic with route deletion.

If BreakPoint.deleteMany (line 167) fails after Route.findOneAndDelete succeeds, orphaned breakpoints remain. This is existing code and low-severity since orphan breakpoints are harmless, but worth noting for future cleanup (e.g., a TTL index or periodic sweep).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 152 - 174,
The deleteRoute handler can leave orphaned BreakPoint documents if
BreakPoint.deleteMany fails after Route.findOneAndDelete succeeds; wrap the
deletion of the Route and the BreakPoint.deleteMany in a mongoose transaction
using a session: start a session, session.startTransaction(), perform
Route.findOneAndDelete(..., { session }) and BreakPoint.deleteMany({ routeId:
route._id }, { session }), then commitTransaction and endSession, and on error
abortTransaction and endSession before returning the 500; reference the
deleteRoute function and the Route.findOneAndDelete and BreakPoint.deleteMany
calls when implementing this change.
server/src/utils/compute/breakPoint.compute.ts (1)

63-67: Potential off-by-one: safeIndex clamping may collapse breakpoints on short arrays.

When totalCoords is small (e.g., 3) and count is 3, multiple iterations can compute the same safeIndex because Math.floor(fraction * totalCoords) yields the same value after clamping to [1, totalCoords - 2]. The dedup fallback (offset search) will handle this, but on very short coordinate arrays, most breakpoints may be skipped with only a console.warn.

This isn't necessarily a bug given the >= 2 guard, but be aware that for routes with very few coordinates the function may silently return far fewer breakpoints than requested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/breakPoint.compute.ts` around lines 63 - 67, The
loop that computes breakpoints uses index = Math.floor(fraction * totalCoords)
then clamps to safeIndex = Math.max(1, Math.min(index, totalCoords - 2)), which
collapses many iterations to the same safeIndex on short coordinate arrays;
change the index calculation and clamping to produce more evenly distributed,
unique indices: compute index against (totalCoords - 1) (e.g.
Math.round(fraction * (totalCoords - 1))) and clamp to the intended bounds (or
when totalCoords - 2 < count, explicitly generate the set of available interior
indices 1..totalCoords-2 and pick from them in order or by stepping) so the
breakPoint.compute.ts loop (the variables fraction, index, safeIndex and the
dedup fallback/offset search) yields distinct breakpoints up to the available
number instead of collapsing to a single clipped index.
server/src/utils/scheduler/computeData.scheduler.ts (3)

163-164: Double type assertion as unknown as { travelMode: string } signals a schema type gap

This indicates the Mongoose document type for route options does not declare travelMode. Aligning the schema type (or the RouteOption interface) would eliminate the unsafe cast and surface any future mismatches at compile time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 163 - 164,
The double type assertion on rawTravelMode (const rawTravelMode = (option as
unknown as { travelMode: string }).travelMode) indicates the Mongoose/TypeScript
types for route option documents are missing travelMode; update the RouteOption
interface/schema to include travelMode (or refine the Mongoose document type
used for option) and remove the unsafe 'as unknown as { travelMode: string }'
cast so rawTravelMode can be accessed via a properly typed property on option
(update any type definitions used in computeData.scheduler.ts and related models
to surface compile-time errors instead of runtime casts).

144-144: Unbounded Route.find({}) will load every route document into memory

With no limit or cursor-based pagination, a large dataset will exhaust heap before the batch loop even starts.

♻️ Suggested approach
-    const routes = await Route.find({}).lean();
+    const routes = await Route.find({}).limit(500).lean(); // or use cursor-based pagination

For larger scale, prefer a cursor() / batchSize() approach or paginate in chunks to keep memory bounded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` at line 144, The
unbounded Route.find({}).lean() call that assigns to routes will load all route
documents into memory; change it to a cursor/batching pattern (e.g., use
Route.find({}).lean().cursor() or add batchSize() and iterate with an async
loop) and process each batch/stream chunk instead of materializing routes array;
update the code around the routes variable and the surrounding function
(computeData.scheduler logic that iterates routes) to consume the cursor/batches
and free memory between batches.

212-226: setInterval handle is discarded — scheduler cannot be stopped

There is no way to clear the interval for graceful shutdown (e.g. SIGTERM/SIGINT) or in test teardown.

♻️ Proposed fix
-export function initScheduler(): void {
-  setInterval(async () => {
+export function initScheduler(): NodeJS.Timeout {
+  const handle = setInterval(async () => {
     if (_isRunning) {
       return;
     }
     _isRunning = true;
     try {
       await runBatchScoring();
     } catch {
       /* scheduler tick failed — _isRunning reset in finally */
     } finally {
       _isRunning = false;
     }
   }, SCHEDULE_INTERVAL_MS);
+  return handle;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 212 - 226,
The scheduler created in initScheduler() discards the setInterval handle so it
cannot be cleared for graceful shutdown or test teardown; store the interval ID
returned by setInterval in a module-scoped variable (e.g., _schedulerIntervalId)
and export a stopScheduler() function that calls
clearInterval(_schedulerIntervalId) and resets the ID, and update
initScheduler() to assign that ID (still using SCHEDULE_INTERVAL_MS) so callers
can stop the scheduler (SIGTERM/SIGINT handlers or tests) by invoking
stopScheduler().
data-processing/dataProcessingServer/api/pathway/pipeline.py (1)

152-155: pw.debug.table_from_markdown with complex JSON cell values is fragile.

The markdown table format is designed for simple debug/test data. Serialized JSON strings (weather/AQI data) as cell values work only because json.dumps doesn't emit newlines by default, and pipe characters are escaped. However, any data containing literal \n or unusual characters could silently corrupt rows. For a hackathon this is acceptable, but for production consider pw.debug.table_from_pandas with a DataFrame, which avoids string-format fragility entirely.

Also applies to: 254-276

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
152 - 155, Replace fragile use of pw.debug.table_from_markdown in pipeline.py
(the input_table creation that calls pw.debug.table_from_markdown and the
similar block around lines 254-276) with pw.debug.table_from_pandas: construct a
pandas.DataFrame from the existing table_data (instead of calling
_markdown_from_data) and pass that DataFrame to pw.debug.table_from_pandas with
schema=RouteInputSchema (and any other schema usages). This avoids
serialized-JSON-in-cell fragility while keeping schema enforcement via
pw.debug.table_from_pandas on the same variables/functions: input_table,
pw.debug.table_from_markdown, _markdown_from_data.
server/src/utils/scheduler/pathwayClient.ts (1)

117-117: usePathway: false is hardcoded — Pathway pipeline is never used from this client.

This effectively makes the full Pathway pipeline (run_batch_pipeline) dead code from the Node.js server's perspective. If this is intentional for the hackathon (favoring reliability), consider adding a comment or making it configurable via an environment variable for easier toggling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/pathwayClient.ts` at line 117, The request body in
pathwayClient.ts currently hardcodes usePathway: false which prevents the
Pathway pipeline (e.g., run_batch_pipeline) from ever being exercised; change
this to be configurable (read an env var like USE_PATHWAY or accept a parameter
on the function that builds the request) and use that variable instead of false,
or if intentional, add a clear inline comment explaining it's disabled for
reliability; update the code paths that call the client so callers can enable
pathway when desired and ensure the request payload uses that configurable flag
(reference: the JSON body containing routes and usePathway in pathwayClient.ts
and the run_batch_pipeline Pathway pipeline).
data-processing/dataProcessingServer/dataProcessingServer/settings.py (1)

81-84: DJANGO_ALLOW_ALL_HOSTS bypasses host validation entirely.

Setting ALLOWED_HOSTS = ['*'] disables Django's host header validation, which is a defense against HTTP Host header attacks. The env-var gate is fine for controlled environments, but ensure this is never enabled in production deployments. Consider adding a log warning when this flag is active.

🛡️ Proposed fix — warn when wildcard hosts are active
+import logging as _logging
+_settings_logger = _logging.getLogger(__name__)
+
 _allow_all_hosts = os.getenv('DJANGO_ALLOW_ALL_HOSTS', '').lower() in ('true', '1', 'yes')

 if _allow_all_hosts:
     ALLOWED_HOSTS = ['*']
+    _settings_logger.warning("DJANGO_ALLOW_ALL_HOSTS is enabled — host header validation is disabled.")
 elif _allowed_hosts_from_env:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py` around
lines 81 - 84, The current DJANGO_ALLOW_ALL_HOSTS env flag sets _allow_all_hosts
to True and assigns ALLOWED_HOSTS = ['*'], which bypasses host validation;
update the settings so that when _allow_all_hosts is true you still set
ALLOWED_HOSTS = ['*'] but also emit a clear runtime warning (use Python logging
or Django's logger) that this is enabled and should never be used in production,
and optionally include the environment value and a suggestion to unset it;
reference the existing symbols _allow_all_hosts, DJANGO_ALLOW_ALL_HOSTS and
ALLOWED_HOSTS in settings.py so the warning is logged immediately after the
wildcard assignment.
server/src/controllers/score.controller.ts (2)

260-262: Silent Redis catch blocks hinder observability.

Three separate catch blocks silently discard Redis errors. While the graceful degradation approach is correct, adding at least a console.warn would make Redis connectivity issues visible in logs without affecting request flow.

♻️ Example for one of the blocks (apply similarly to others)
-    } catch {
-      /* redis set failed — serve response without caching searchId */
+    } catch (err) {
+      console.warn("[ScoreController] Redis set failed, continuing without cache:", err);
     }

Also applies to: 272-274, 470-472

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/score.controller.ts` around lines 260 - 262, The three
empty Redis catch blocks in score.controller.ts should be changed to capture the
error and emit a warning so connectivity issues are observable; replace each
bare "catch { ... }" with "catch (err) { /* serve response without caching
searchId */ console.warn('Redis set failed in <functionName>:', err)" or, if a
request-scoped logger exists (e.g., processLogger or this.logger), use that
instead (e.g., processLogger.warn('Redis set failed in <functionName>:', err));
apply the same change to the other two catch blocks to log the error and
preserve graceful degradation.

341-343: dominentpol is overwritten on each iteration — last value wins.

If AQI points along the route report different dominant pollutants, only the last non-falsy value is used. Consider using the most frequent value (mode) or the one corresponding to the highest AQI reading for a more representative result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/score.controller.ts` around lines 341 - 343, The loop
currently assigns dominentpol = point.aqi.dominentpol on each iteration so the
last non-falsy value wins; change this to compute a representative pollutant
instead: collect occurrences of point.aqi.dominentpol (or track the pollutant
associated with the highest point.aqi.aqi value) during the iterations and after
the loop set dominentpol to the mode or the pollutant tied to the max AQI.
Update the code around the dominentpol assignment (the loop handling point and
point.aqi) to accumulate counts or compare point.aqi.aqi against a running max
and then set dominentpol once from that aggregate result.
scripts/README.md (1)

3-3: Missing Markdown heading syntax (##) on "Setup" and "Notes".

Both section titles render as plain paragraphs in the GitHub UI instead of navigable headings.

-Setup
+## Setup

 - Unix / Git Bash:
   - Run: `./scripts/setup-merge-theirs.sh`
 - Windows PowerShell:
   - Run: `./scripts/setup-merge-theirs.ps1`

-Notes
+## Notes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/README.md` at line 3, Convert the plain text section titles "Setup"
and "Notes" into proper Markdown headings by prefixing each with "##" so they
render as navigable headings in GitHub; update the lines containing the literal
"Setup" and "Notes" in scripts/README.md to "## Setup" and "## Notes"
respectively, preserving the following content paragraphs.
scripts/setup-merge-theirs.ps1 (1)

1-4: Replace Write-Host with Write-Output or Write-Information.

PSScriptAnalyzer flags Write-Host because it writes directly to the host and cannot be captured, redirected, or suppressed in non-interactive hosts (and in PowerShell prior to v5.0).

🔧 Proposed fix
-Write-Host "Registering merge.theirs driver..."
+Write-Output "Registering merge.theirs driver..."
 git config --local merge.theirs.name "Keep theirs (incoming) during merges"
 git config --local merge.theirs.driver "powershell -ExecutionPolicy Bypass -File ./scripts/git-merge-theirs.ps1 %O %A %B"
-Write-Host "Registered merge.theirs driver (local git config)."
+Write-Output "Registered merge.theirs driver (local git config)."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/setup-merge-theirs.ps1` around lines 1 - 4, Replace the three
Write-Host calls in scripts/setup-merge-theirs.ps1 with a non-host-specific
cmdlet (use Write-Output for simple stdout or Write-Information if you want
informational stream support) so output can be captured/redirected; update the
two lines that currently call Write-Host "Registering merge.theirs driver..."
and Write-Host "Registered merge.theirs driver (local git config)." to use the
chosen cmdlet and keep the same message text, leaving the git config commands
and the merge driver setting (git config --local merge.theirs.name and git
config --local merge.theirs.driver) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitattributes:
- Around line 1-2: The .gitattributes entry currently uses a global pattern "*
merge=theirs" which silently forces the "theirs" driver for every file; update
this to narrow or remove that pattern by replacing "* merge=theirs" with
targeted patterns (e.g., use specific globs like "*.lock merge=theirs" or other
generated artifact patterns) or remove the line entirely so only explicit
per-file rules apply; ensure the change touches the unique symbol "*
merge=theirs" in the file so only intended auto-resolved file types get the
custom merge driver.

In `@data-processing/dataProcessingServer/api/middleware.py`:
- Around line 17-30: ApiExceptionMiddleware is returning JsonResponse before
CorsMiddleware runs, causing missing CORS headers; move ApiExceptionMiddleware
to be listed after CorsMiddleware in the MIDDLEWARE setting (so CorsMiddleware
runs first) rather than trying to add headers here. Edit settings.py and reorder
the MIDDLEWARE list so "corsheaders.middleware.CorsMiddleware" appears before
your "dataProcessingServer.api.middleware.ApiExceptionMiddleware" (keep
references to those exact names).

In `@scripts/git-merge-theirs.ps1`:
- Line 6: The current pipeline uses Get-Content -Raw with Set-Content -Encoding
UTF8 which treats the file as text and writes a UTF-8 BOM (corrupting binaries);
replace that pipeline with a binary-safe file copy using Copy-Item so the script
copies bytes without re-encoding or normalizing line endings (use Copy-Item with
the source variable $B and destination $A and pass the overwrite/force option to
ensure the target is replaced).

In `@scripts/setup-merge-theirs.ps1`:
- Line 3: The merge driver command in the git config (git config --local
merge.theirs.driver) currently points to ".\scripts\git-merge-theirs.ps1" which
won't run under sh; change the driver to invoke PowerShell explicitly (use pwsh
for PowerShell 7+ or powershell for Windows PowerShell) and use a POSIX-safe
path to the script (e.g., ./scripts/git-merge-theirs.ps1) so the shell can
execute the interpreter with the script and passed %O %A %B arguments.

In `@server/src/controllers/score.controller.ts`:
- Around line 479-483: The catch block in the route handler currently swallows
all errors; modify the catch to accept an error parameter (e.g., catch (err))
and log the error before returning the 500 response so failures inside
computeWeather, computeAQI, or the map scoring logic are recorded; use the
project's logger if available (e.g., logger.error or req.log.error) or fallback
to console.error and include contextual info like the handler name (score
controller) and relevant inputs to aid debugging, then keep the existing
res.status(500).json({ success: false, message: "Failed to compute route scores"
}) response.

In `@server/src/utils/compute/breakPoint.compute.ts`:
- Around line 28-36: The calculateBreakpointCount function returns non-monotonic
breakpoint counts (e.g., 400 -> 4 but 501 -> 3); fix calculateBreakpointCount so
counts never decrease as distance grows — for example update the final else
branch logic used by calculateBreakpointCount to return distance < 750 ? 4 : 5
(or otherwise raise the base for >500) so thresholds are monotonic, and add a
short comment on the distance parameter units (meters vs km) above the function
for future maintainers.
- Around line 1-4: Remove the duplicated Coordinate interface from
breakPoint.compute.ts and instead import the exported Coordinate from
weather.compute.ts; specifically delete the local "interface Coordinate { lat:
number; lon: number; }" and add an import for Coordinate (matching how
aqi.compute.ts imports it) so all code in breakPoint.compute.ts references the
single shared Coordinate type from weather.compute.ts.

In `@server/src/utils/scheduler/computeData.scheduler.ts`:
- Around line 228-230: runManualBatchScoring currently calls runBatchScoring
directly and bypasses the module's concurrency guard; update
runManualBatchScoring to check the existing _isRunning boolean (the same flag
used by the scheduled tick) before invoking runBatchScoring, set _isRunning=true
when starting, call await runBatchScoring(), and ensure _isRunning is reset to
false in a finally block so manual triggers cannot run concurrently with the
scheduled runner.

In `@server/src/utils/scheduler/pathwayClient.ts`:
- Around line 164-166: The timeout calculation using parseInt on
PATHWAY_TIMEOUT_MS can produce NaN and cause setTimeout to behave like 0; update
the logic in pathwayClient.ts where the constant timeout is computed to parse
the env var (e.g., via Number or parseInt), then validate the result (use
Number.isFinite / Number.isInteger and ensure > 0) and fall back to the safe
default 30000 when invalid or non-positive; ensure any later use of timeout (the
setTimeout call in the request/abort flow) uses this validated value so requests
are not aborted immediately.
- Around line 105-106: sendToPathway constructs url =
`${baseUrl}/api/compute-scores/` without normalizing baseUrl, so a baseUrl with
a trailing slash can produce a double-slash URL; update sendToPathway to strip
trailing slashes from baseUrl (or reuse the same normalization logic used by
checkPathwayHealth) before concatenation (e.g., normalize baseUrl via
baseUrl.replace(/\/+$/, '') or call the existing normalization helper) so the
final url is `${normalizedBaseUrl}/api/compute-scores/`; ensure references to
baseUrl and url in sendToPathway and any related timeout logic remain unchanged.

---

Outside diff comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 196-200: The code silently drops failed parses around
json.loads(row["score_json"]) causing missing route entries; update the except
block that currently catches (json.JSONDecodeError, KeyError) to log the failure
before continuing: include the exception details and identifying context from
the row (e.g., route id or whatever identifier is in row) so callers can see
which score_json failed; reference the existing logger instance (e.g., logger or
process_logger) and keep the continue to preserve behavior after logging.

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 1-11: The module docstring at the top of settings.py incorrectly
references Django 6.0.2 and docs URLs for 6.0; update the docstring text and the
documentation links to reference Django 5.2.11 (the version pinned in
requirements.txt) and point to the corresponding docs (e.g.,
https://docs.djangoproject.com/en/5.2/topics/settings/ and
https://docs.djangoproject.com/en/5.2/ref/settings/); make this change directly
in the top-level string in settings.py so developers consult the correct version
and links.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 65-103: The route is created via Route.create before breakpoints
are persisted, so if insertMany (the breakpoint insert) fails the route remains
and clients may retry causing duplicates; modify the flow to either wrap
Route.create + breakpoint insertion in a MongoDB session/transaction (use
startSession(), session.withTransaction() and pass { session } to both
Route.create and the breakpoints insert) or ensure Route.create is only called
after routeBreakpointsData is computed and inserted (i.e.,
computeBreakpoints/redis.get first, then insert breakpoints with insertMany
inside a try/catch that logs the error but still allows a successful route
save), referencing Route.create, routeBreakpointsData, computeBreakpoints,
redis.get and the insertMany breakpoint insertion logic so the operation is
atomic or the route save is not left partially written on insert failure.
- Around line 77-78: The uuidRegex defined in savedRoutes.controllers.ts has an
accidental literal dash in the third character class (`[0-9-a-f]{4}`) which
allows '-' as a valid hex digit; update the uuidRegex variable to remove the
extra dash so each group uses `[0-9a-f]` (i.e., the full pattern should be
eight-4-4-4-12 hex digits with hyphens between) and preserve the `i` flag for
case-insensitivity.

In `@server/src/utils/compute/aqi.compute.ts`:
- Around line 83-86: The current AQI fetch block in the function handling the
HTTP response (the response.ok check in aqi.compute.ts) immediately returns null
on any non-OK status, which bypasses the retry loop; change that logic so
retryable server responses (5xx) and 429 result in continue to let the retry
loop attempt again, while only non-retryable client errors (4xx except 429)
immediately return null; ensure the loop still returns null after exhausting
retries and preserve the existing error logging (reference the response.ok check
and the surrounding retry loop in the AQI fetch function).
- Around line 90-96: The API-level status check inside aqi.compute.ts currently
returns null when apiResponse.status !== "ok" (in the block referencing lat, lon
and apiResponse), which bypasses retries; change that behavior to treat non-"ok"
API responses as transient by using continue to advance the retry loop instead
of returning, and add a log that includes the attempt/retry number (use the
existing retry/attempt loop variable, or introduce one if missing) along with
lat/lon and apiResponse.status for observability; update the block that
currently does console.error(...) and return null to log the attempt and
continue the retry loop (or fall through to the next retry) so transient API
errors are retried.

In `@server/src/utils/compute/breakPoint.compute.ts`:
- Around line 129-136: computeBreakpoints currently rejects inputs >3 while
route.schema.ts allows up to 5, causing saved routes with 4–5 options to produce
no breakpoints; fix by either (A) aligning the cap with the schema: change the
check in computeBreakpoints to allow up to 5 routes (match route.schema.ts), or
(B) process only the first 3 routes to preserve current behavior: create a
routesToProcess = routes.slice(0, 3) and iterate over routesToProcess instead of
routes in computeBreakpoints so extra options are ignored but the first entries
still produce breakpoints.

In `@server/src/utils/scheduler/computeData.scheduler.ts`:
- Line 96: The field trafficValue in computeData.scheduler.ts is hardcoded to 0
and will silently ignore traffic input for route scoring; update the code around
the object that sets trafficValue to either compute the real traffic metric
(calling the appropriate traffic lookup or Pathway input) or, at minimum,
replace the literal 0 with a clear TODO comment and a thrown/noted placeholder
(e.g., TODO: compute trafficValue from TrafficService/Pathway inputs) so future
readers know this must be implemented; locate the assignment to trafficValue in
the computeData scheduler function and add the TODO and/or the actual
computation using the existing traffic APIs or Pathway model inputs.
- Around line 182-207: The results array built in runBatchScoring (results) is
never surfaced and the outer catch swallows system errors; change
runBatchScoring to (1) capture batch-level errors by replacing the silent catch
with catch (err) and log or rethrow the error (include context like
"runBatchScoring failed" and the err) instead of swallowing it, and (2) return
or persist the accumulated results (the results variable) so callers can observe
per-task successes/failures; ensure processRoute remains handling per-task
errors but do not suppress fatal errors (e.g., DB connection failures) — either
log them via your logger and rethrow or propagate the error to callers along
with the results.
- Around line 12-14: The SCHEDULE_INTERVAL_MS parsing can yield NaN for
non-numeric env values (causing setInterval to use 0); update the parsing logic
where SCHEDULE_INTERVAL_MS is defined to parse the env var (parseInt or Number)
and then validate the result (e.g., Number.isFinite or !Number.isNaN) and fall
back to the default 30*60*1000 when invalid; ensure any downstream use
(setInterval(fn, SCHEDULE_INTERVAL_MS)) receives a safe positive integer
(optionally enforce a minimum threshold) and reference the SCHEDULE_INTERVAL_MS
constant and any usages of setInterval in computeData.scheduler.ts to locate
changes.

---

Duplicate comments:
In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 115-116: Move the ApiExceptionMiddleware entry so it comes after
CorsMiddleware in the MIDDLEWARE list; specifically, ensure
'api.middleware.ApiExceptionMiddleware' is located below the existing
CorsMiddleware entry (e.g., 'corsheaders.middleware.CorsMiddleware') so that
error responses generated by ApiExceptionMiddleware still have CORS headers
applied. Verify the order by checking the MIDDLEWARE list where
ApiExceptionMiddleware and CorsMiddleware are defined.

---

Nitpick comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 152-155: Replace fragile use of pw.debug.table_from_markdown in
pipeline.py (the input_table creation that calls pw.debug.table_from_markdown
and the similar block around lines 254-276) with pw.debug.table_from_pandas:
construct a pandas.DataFrame from the existing table_data (instead of calling
_markdown_from_data) and pass that DataFrame to pw.debug.table_from_pandas with
schema=RouteInputSchema (and any other schema usages). This avoids
serialized-JSON-in-cell fragility while keeping schema enforcement via
pw.debug.table_from_pandas on the same variables/functions: input_table,
pw.debug.table_from_markdown, _markdown_from_data.

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 81-84: The current DJANGO_ALLOW_ALL_HOSTS env flag sets
_allow_all_hosts to True and assigns ALLOWED_HOSTS = ['*'], which bypasses host
validation; update the settings so that when _allow_all_hosts is true you still
set ALLOWED_HOSTS = ['*'] but also emit a clear runtime warning (use Python
logging or Django's logger) that this is enabled and should never be used in
production, and optionally include the environment value and a suggestion to
unset it; reference the existing symbols _allow_all_hosts,
DJANGO_ALLOW_ALL_HOSTS and ALLOWED_HOSTS in settings.py so the warning is logged
immediately after the wildcard assignment.

In `@scripts/README.md`:
- Line 3: Convert the plain text section titles "Setup" and "Notes" into proper
Markdown headings by prefixing each with "##" so they render as navigable
headings in GitHub; update the lines containing the literal "Setup" and "Notes"
in scripts/README.md to "## Setup" and "## Notes" respectively, preserving the
following content paragraphs.

In `@scripts/setup-merge-theirs.ps1`:
- Around line 1-4: Replace the three Write-Host calls in
scripts/setup-merge-theirs.ps1 with a non-host-specific cmdlet (use Write-Output
for simple stdout or Write-Information if you want informational stream support)
so output can be captured/redirected; update the two lines that currently call
Write-Host "Registering merge.theirs driver..." and Write-Host "Registered
merge.theirs driver (local git config)." to use the chosen cmdlet and keep the
same message text, leaving the git config commands and the merge driver setting
(git config --local merge.theirs.name and git config --local
merge.theirs.driver) unchanged.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 152-174: The deleteRoute handler can leave orphaned BreakPoint
documents if BreakPoint.deleteMany fails after Route.findOneAndDelete succeeds;
wrap the deletion of the Route and the BreakPoint.deleteMany in a mongoose
transaction using a session: start a session, session.startTransaction(),
perform Route.findOneAndDelete(..., { session }) and BreakPoint.deleteMany({
routeId: route._id }, { session }), then commitTransaction and endSession, and
on error abortTransaction and endSession before returning the 500; reference the
deleteRoute function and the Route.findOneAndDelete and BreakPoint.deleteMany
calls when implementing this change.

In `@server/src/controllers/score.controller.ts`:
- Around line 260-262: The three empty Redis catch blocks in score.controller.ts
should be changed to capture the error and emit a warning so connectivity issues
are observable; replace each bare "catch { ... }" with "catch (err) { /* serve
response without caching searchId */ console.warn('Redis set failed in
<functionName>:', err)" or, if a request-scoped logger exists (e.g.,
processLogger or this.logger), use that instead (e.g., processLogger.warn('Redis
set failed in <functionName>:', err)); apply the same change to the other two
catch blocks to log the error and preserve graceful degradation.
- Around line 341-343: The loop currently assigns dominentpol =
point.aqi.dominentpol on each iteration so the last non-falsy value wins; change
this to compute a representative pollutant instead: collect occurrences of
point.aqi.dominentpol (or track the pollutant associated with the highest
point.aqi.aqi value) during the iterations and after the loop set dominentpol to
the mode or the pollutant tied to the max AQI. Update the code around the
dominentpol assignment (the loop handling point and point.aqi) to accumulate
counts or compare point.aqi.aqi against a running max and then set dominentpol
once from that aggregate result.

In `@server/src/utils/compute/aqi.compute.ts`:
- Around line 167-203: Add a small inter-batch delay when iterating batches to
avoid hitting AQICN rate limits: after processing each batch in the loop that
slices allPointTasks (the code using CONCURRENCY_LIMIT, pointResultsMap, and
batchResults), await a short sleep/delay before continuing to the next slice so
retries in fetchAQIForPoint have room to succeed; implement or reuse a sleep
helper and call it after storing batchResults. Also clean up the step comments
around this block (the "2." and the subsequent "3." comments) so they reflect
the current sequence or are removed.
- Around line 3-22: The AQIData interface contains redundant "| undefined"
unions on optional properties; update the interface by removing the explicit "|
undefined" for the optional members (dominentpol, iaqi, and time) so they remain
declared using the optional modifier (e.g., dominentpol?: string; iaqi?: { ...
}; time?: { s: string; tz: string; };) while preserving the existing nested
shapes in the iaqi and time types.

In `@server/src/utils/compute/breakPoint.compute.ts`:
- Around line 63-67: The loop that computes breakpoints uses index =
Math.floor(fraction * totalCoords) then clamps to safeIndex = Math.max(1,
Math.min(index, totalCoords - 2)), which collapses many iterations to the same
safeIndex on short coordinate arrays; change the index calculation and clamping
to produce more evenly distributed, unique indices: compute index against
(totalCoords - 1) (e.g. Math.round(fraction * (totalCoords - 1))) and clamp to
the intended bounds (or when totalCoords - 2 < count, explicitly generate the
set of available interior indices 1..totalCoords-2 and pick from them in order
or by stepping) so the breakPoint.compute.ts loop (the variables fraction,
index, safeIndex and the dedup fallback/offset search) yields distinct
breakpoints up to the available number instead of collapsing to a single clipped
index.

In `@server/src/utils/compute/weather.compute.ts`:
- Around line 114-161: Add a small inter-batch delay to protect against API rate
limits: after processing each slice of allPointTasks with the existing
CONCURRENCY_LIMIT loop and before continuing to the next batch, await a shared
sleep/delay function (e.g., sleep(ms)) to pause for a configurable interval;
update anywhere that calls fetchWeatherForPoint (within the batch.map) to
respect this pacing. Also consider extracting the repeated batching pattern
(PointTask, PointWeatherResult, the for loop that fills pointResultsMap, and
CONCURRENCY_LIMIT) into a reusable helper (e.g., a generic batchFetcher utility)
so aqi.compute.ts and weather.compute.ts can reuse the same logic and
configuration.

In `@server/src/utils/scheduler/computeData.scheduler.ts`:
- Around line 163-164: The double type assertion on rawTravelMode (const
rawTravelMode = (option as unknown as { travelMode: string }).travelMode)
indicates the Mongoose/TypeScript types for route option documents are missing
travelMode; update the RouteOption interface/schema to include travelMode (or
refine the Mongoose document type used for option) and remove the unsafe 'as
unknown as { travelMode: string }' cast so rawTravelMode can be accessed via a
properly typed property on option (update any type definitions used in
computeData.scheduler.ts and related models to surface compile-time errors
instead of runtime casts).
- Line 144: The unbounded Route.find({}).lean() call that assigns to routes will
load all route documents into memory; change it to a cursor/batching pattern
(e.g., use Route.find({}).lean().cursor() or add batchSize() and iterate with an
async loop) and process each batch/stream chunk instead of materializing routes
array; update the code around the routes variable and the surrounding function
(computeData.scheduler logic that iterates routes) to consume the cursor/batches
and free memory between batches.
- Around line 212-226: The scheduler created in initScheduler() discards the
setInterval handle so it cannot be cleared for graceful shutdown or test
teardown; store the interval ID returned by setInterval in a module-scoped
variable (e.g., _schedulerIntervalId) and export a stopScheduler() function that
calls clearInterval(_schedulerIntervalId) and resets the ID, and update
initScheduler() to assign that ID (still using SCHEDULE_INTERVAL_MS) so callers
can stop the scheduler (SIGTERM/SIGINT handlers or tests) by invoking
stopScheduler().

In `@server/src/utils/scheduler/pathwayClient.ts`:
- Line 117: The request body in pathwayClient.ts currently hardcodes usePathway:
false which prevents the Pathway pipeline (e.g., run_batch_pipeline) from ever
being exercised; change this to be configurable (read an env var like
USE_PATHWAY or accept a parameter on the function that builds the request) and
use that variable instead of false, or if intentional, add a clear inline
comment explaining it's disabled for reliability; update the code paths that
call the client so callers can enable pathway when desired and ensure the
request payload uses that configurable flag (reference: the JSON body containing
routes and usePathway in pathwayClient.ts and the run_batch_pipeline Pathway
pipeline).

Comment thread .gitattributes
Comment on lines +1 to +2
# Use the custom 'theirs' merge driver for all files to prefer incoming changes.
* merge=theirs

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

* merge=theirs globally silences ALL merge conflicts — high data-loss risk.

Assigning a merge driver in .gitattributes controls which driver handles specific files — and you should replace * with the file extension you actually want the custom driver applied to. The current glob * applies the "always take theirs" strategy to every file in the repository — source code, configuration, data files, binaries, and even .gitattributes itself.

Specific consequences:

  • Any developer who runs the setup script will have all merge conflicts silently resolved by accepting the incoming version, with no conflict markers and no opportunity to review.
  • Developers who haven't run the setup will silently fall back to the standard 3-way text merge, creating inconsistent behavior across the team.
  • This is committed to the repository, meaning the effect is permanent for any contributor who runs setup.

Suggest narrowing the pattern to only files that genuinely require automatic conflict resolution (e.g., auto-generated lock files like *.lock, build artifacts, etc.) or removing this entirely in favour of an explicit per-file opt-in:

-# Use the custom 'theirs' merge driver for all files to prefer incoming changes.
-* merge=theirs
+# Use the custom 'theirs' merge driver only for auto-generated files.
+package-lock.json merge=theirs
+*.lock merge=theirs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitattributes around lines 1 - 2, The .gitattributes entry currently uses a
global pattern "* merge=theirs" which silently forces the "theirs" driver for
every file; update this to narrow or remove that pattern by replacing "*
merge=theirs" with targeted patterns (e.g., use specific globs like "*.lock
merge=theirs" or other generated artifact patterns) or remove the line entirely
so only explicit per-file rules apply; ensure the change touches the unique
symbol "* merge=theirs" in the file so only intended auto-resolved file types
get the custom merge driver.

Comment on lines +17 to +30
def __call__(self, request):
try:
return self.get_response(request)
except Exception as exc:
# Only intercept API routes
path = getattr(request, 'path', '') or ''
if path.startswith('/api/'):
if settings.DEBUG:
tb = traceback.format_exc()
return JsonResponse({'success': False, 'message': 'Internal server error', 'error': str(exc), 'trace': tb}, status=500)
else:
return JsonResponse({'success': False, 'message': 'Internal server error'}, status=500)
# Re-raise for non-API routes so Django can handle normally
raise

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Error responses bypass CorsMiddleware — browser will block cross-origin error bodies.

Because ApiExceptionMiddleware sits above CorsMiddleware in MIDDLEWARE (see settings.py line 116-118), when an exception is caught here the returned JsonResponse short-circuits back without passing through CorsMiddleware. The browser will block the frontend from reading the JSON error body due to missing CORS headers.

Two possible fixes:

Option A (preferred): Move the middleware below CorsMiddleware in settings.py:

 MIDDLEWARE = [
     'corsheaders.middleware.CorsMiddleware',
+    'api.middleware.ApiExceptionMiddleware',
     'django.middleware.security.SecurityMiddleware',
     ...
-    'api.middleware.ApiExceptionMiddleware',
 ]

Option B: Manually add CORS headers in the error response here, but this duplicates django-cors-headers logic and is harder to maintain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/middleware.py` around lines 17 - 30,
ApiExceptionMiddleware is returning JsonResponse before CorsMiddleware runs,
causing missing CORS headers; move ApiExceptionMiddleware to be listed after
CorsMiddleware in the MIDDLEWARE setting (so CorsMiddleware runs first) rather
than trying to add headers here. Edit settings.py and reorder the MIDDLEWARE
list so "corsheaders.middleware.CorsMiddleware" appears before your
"dataProcessingServer.api.middleware.ApiExceptionMiddleware" (keep references to
those exact names).

Write-Error "git-merge-theirs.ps1: missing 'their' file"
exit 1
}
Get-Content -Raw -LiteralPath $B | Set-Content -LiteralPath $A -Encoding UTF8

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and confirm the code in question
fd git-merge-theirs.ps1

Repository: kaihere14/BreathClean

Length of output: 93


🌐 Web query:

PowerShell 5.1 Set-Content -Encoding UTF8 BOM behavior

💡 Result:

In Windows PowerShell 5.1, Set-Content -Encoding UTF8 always writes UTF‑8 with a BOM (EF BB BF). There is no utf8NoBOM option in 5.1’s -Encoding parameter set. [1], [2]

Implications:

  • If you need BOM-less UTF‑8 in 5.1, you must use a .NET API (e.g., System.Text.UTF8Encoding($false) with WriteAllText/WriteAllLines). [2]
  • In PowerShell 7+, the behavior differs: UTF‑8 is BOM-less by default, and a BOM is requested explicitly with -Encoding utf8BOM. [1]

Sources

  1. Stack Overflow: “PowerShell 5.1, Output to a text file with Out-File / Set-Content and utf8” (mklement0) — explains PS 5.1 -Encoding utf8 => BOM; PS 7+ defaults to BOM-less UTF-8 and supports utf8BOM. https://stackoverflow.com/questions/77987322/
  2. 2Pint Software: “Why is ‘Add-Content’ bad in PowerShell 5.1?” — notes -Encoding UTF8 creates “UTF8 with BOM” in PS 5.1 and suggests .NET workarounds. https://2pintsoftware.com/news/details/why-is-add-content-bad-in-powershell-51

🌐 Web query:

PowerShell Get-Content -Raw binary safe text processing

💡 Result:

Get-Content -Raw is not binary-safe: it reads the file as text, decoding bytes using an encoding, and returns one big string (preserving embedded newlines rather than splitting into lines). (powershell.org)

If you need binary-safe processing

Read bytes, not strings:

  • PowerShell 7+:

    [byte[]]$bytes = Get-Content -LiteralPath $path -AsByteStream -Raw

    -AsByteStream returns a byte stream and ignores -Encoding. (learn.microsoft.com)

  • Windows PowerShell 5.1:

    [byte[]]$bytes = Get-Content -LiteralPath $path -Encoding Byte -Raw

    -Encoding Byte reads raw bytes. (stackoverflow.com)

Then do explicit decoding/encoding only where appropriate, e.g.:

$text = [Text.Encoding]::UTF8.GetString($bytes)   # decode explicitly
# ...text processing...
$newBytes = [Text.Encoding]::UTF8.GetBytes($text) # re-encode explicitly
[IO.File]::WriteAllBytes($path, $newBytes)

If the file is text (but you want reliable text processing)

Be explicit about encoding (especially on Windows PowerShell 5.1, where Get-Content may assume the legacy ANSI “Default” code page when there’s no BOM). (learn.microsoft.com)

$text = Get-Content -LiteralPath $path -Raw -Encoding utf8
# ...process...
Set-Content -LiteralPath $path -Encoding utf8 -Value $text

Notes:

  • -Raw loads the whole file into memory (great for whole-file regex/replace; not great for huge files). (learn.microsoft.com)

Citations:


🌐 Web query:

PowerShell Copy-Item binary safe preserve encoding

💡 Result:

  • Binary-safe copy (no encoding changes): Use Copy-Item. It copies files as files/bytes; text encoding isn’t involved unless you read/write the content as text. (learn.microsoft.com)

    Copy-Item -LiteralPath $src -Destination $dst -Force
  • What breaks “encoding preservation”: Piping file content through cmdlets like Get-Content/Out-File converts bytes → .NET strings → bytes again, and PowerShell does not preserve the original encoding automatically; you must specify it. (stackoverflow.com)

  • If you must round-trip bytes (still binary-safe): Read as bytes, then write bytes (no encoding). Get-Content supports -AsByteStream. (learn.microsoft.com)

    $bytes = Get-Content -LiteralPath $src -AsByteStream -Raw
    [System.IO.File]::WriteAllBytes($dst, $bytes)
  • If you must rewrite text and keep a specific encoding: Specify -Encoding explicitly on read/write (and be aware Windows PowerShell 5.1 vs PowerShell 7+ differ in UTF-8 BOM behavior). (stackoverflow.com)

  • If you also need to preserve timestamps/ACLs/attributes: Prefer robocopy with appropriate flags (e.g., copying data + attributes + timestamps + security). (learn.microsoft.com)

Citations:


Replace Get-Content | Set-Content with Copy-Item for binary-safe merge operation.

Set-Content -Encoding UTF8 on Windows PowerShell 5.1 writes UTF-8 with a 3-byte BOM (EF BB BF), corrupting files and breaking downstream tools. Additionally, Get-Content -Raw processes the file as text, normalizing line endings and mangling binary content; the equivalent Unix script uses cat which is a raw byte copy.

Use Copy-Item, which is both binary-safe and encoding-agnostic:

🛡️ Proposed fix
-Get-Content -Raw -LiteralPath $B | Set-Content -LiteralPath $A -Encoding UTF8
+Copy-Item -LiteralPath $B -Destination $A -Force
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Get-Content -Raw -LiteralPath $B | Set-Content -LiteralPath $A -Encoding UTF8
Copy-Item -LiteralPath $B -Destination $A -Force
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/git-merge-theirs.ps1` at line 6, The current pipeline uses
Get-Content -Raw with Set-Content -Encoding UTF8 which treats the file as text
and writes a UTF-8 BOM (corrupting binaries); replace that pipeline with a
binary-safe file copy using Copy-Item so the script copies bytes without
re-encoding or normalizing line endings (use Copy-Item with the source variable
$B and destination $A and pass the overwrite/force option to ensure the target
is replaced).

@@ -0,0 +1,4 @@
Write-Host "Registering merge.theirs driver..."
git config --local merge.theirs.name "Keep theirs (incoming) during merges"
git config --local merge.theirs.driver ".\scripts\git-merge-theirs.ps1 %O %A %B"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Driver command won't execute — .ps1 files need an explicit PowerShell interpreter.

Git invokes the merge driver command through sh (via Git for Windows / MSYS2 on Windows). A bare .\scripts\git-merge-theirs.ps1 ... path is not executable by sh — the shell has no handler for .ps1 files and the backslash path is also invalid in a POSIX shell context.

The driver path must invoke the PowerShell interpreter explicitly:

-git config --local merge.theirs.driver ".\scripts\git-merge-theirs.ps1 %O %A %B"
+git config --local merge.theirs.driver "powershell -ExecutionPolicy Bypass -File ./scripts/git-merge-theirs.ps1 %O %A %B"

Use pwsh instead of powershell if targeting PowerShell 7+.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
git config --local merge.theirs.driver ".\scripts\git-merge-theirs.ps1 %O %A %B"
git config --local merge.theirs.driver "powershell -ExecutionPolicy Bypass -File ./scripts/git-merge-theirs.ps1 %O %A %B"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/setup-merge-theirs.ps1` at line 3, The merge driver command in the
git config (git config --local merge.theirs.driver) currently points to
".\scripts\git-merge-theirs.ps1" which won't run under sh; change the driver to
invoke PowerShell explicitly (use pwsh for PowerShell 7+ or powershell for
Windows PowerShell) and use a POSIX-safe path to the script (e.g.,
./scripts/git-merge-theirs.ps1) so the shell can execute the interpreter with
the script and passed %O %A %B arguments.

Comment on lines +479 to 483
} catch {
res
.status(500)
.json({ success: false, message: "Failed to compute route scores" });
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Outer catch swallows all errors silently — no logging.

If computeWeather, computeAQI, or the .map() scoring logic throws, the error is discarded with no trace. This makes debugging production failures extremely difficult. At minimum, log the error.

🐛 Proposed fix
-  } catch {
-    res
-      .status(500)
-      .json({ success: false, message: "Failed to compute route scores" });
+  } catch (error) {
+    console.error("[ScoreController] Failed to compute route scores:", error);
+    res
+      .status(500)
+      .json({ success: false, message: "Failed to compute route scores" });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/score.controller.ts` around lines 479 - 483, The catch
block in the route handler currently swallows all errors; modify the catch to
accept an error parameter (e.g., catch (err)) and log the error before returning
the 500 response so failures inside computeWeather, computeAQI, or the map
scoring logic are recorded; use the project's logger if available (e.g.,
logger.error or req.log.error) or fallback to console.error and include
contextual info like the handler name (score controller) and relevant inputs to
aid debugging, then keep the existing res.status(500).json({ success: false,
message: "Failed to compute route scores" }) response.

Comment on lines 1 to 4
interface Coordinate {
lat: number;
lon: number;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check all definitions of the Coordinate interface in the codebase
rg -n "interface Coordinate" --type=ts -C2

Repository: kaihere14/BreathClean

Length of output: 465


🏁 Script executed:

#!/bin/bash
# Also search for type Coordinate definitions (in case it's defined as a type instead of interface)
rg -n "type Coordinate" --type=ts -C2

Repository: kaihere14/BreathClean

Length of output: 755


🏁 Script executed:

#!/bin/bash
# Search for imports and usages of Coordinate to understand the scope
rg -n "Coordinate" server/src/utils/compute/ --type=ts

Repository: kaihere14/BreathClean

Length of output: 3150


Remove duplicate Coordinate interface and import from weather.compute.ts.

The Coordinate interface in breakPoint.compute.ts is identical to the exported interface in weather.compute.ts. The aqi.compute.ts file already correctly imports it from weather.compute.ts. Update breakPoint.compute.ts to import Coordinate from weather.compute.ts instead of redefining it locally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/breakPoint.compute.ts` around lines 1 - 4, Remove
the duplicated Coordinate interface from breakPoint.compute.ts and instead
import the exported Coordinate from weather.compute.ts; specifically delete the
local "interface Coordinate { lat: number; lon: number; }" and add an import for
Coordinate (matching how aqi.compute.ts imports it) so all code in
breakPoint.compute.ts references the single shared Coordinate type from
weather.compute.ts.

Comment on lines 28 to 36
function calculateBreakpointCount(distance: number): number {
if (distance < 100) {
return 3;
} else if (distance >= 100 && distance <= 500) {
// Scale between 3-4 based on distance
return distance < 300 ? 3 : 4;
} else {
// For very long routes, use 3-4 points to prevent timeouts
return distance < 750 ? 3 : 4;
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Non-monotonic breakpoint count: 300–500 → 4 but 501–749 → 3.

The current logic yields:

Distance Count
< 100 3
100–299 3
300–500 4
501–749 3
≥ 750 4

A 501-unit route produces fewer breakpoints than a 400-unit route. This looks like a logic error in the else branch — the > 500 case likely should use a higher base (e.g., return distance < 750 ? 4 : 5;) or at minimum match the preceding tier.

🐛 Possible fix (adjust thresholds so count is non-decreasing)
 function calculateBreakpointCount(distance: number): number {
   if (distance < 100) {
     return 3;
   } else if (distance >= 100 && distance <= 500) {
     return distance < 300 ? 3 : 4;
   } else {
-    return distance < 750 ? 3 : 4;
+    return distance < 750 ? 4 : 5;
   }
 }

Please also consider documenting the expected unit for distance (meters? km?) so future maintainers can verify these thresholds.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function calculateBreakpointCount(distance: number): number {
if (distance < 100) {
return 3;
} else if (distance >= 100 && distance <= 500) {
// Scale between 3-4 based on distance
return distance < 300 ? 3 : 4;
} else {
// For very long routes, use 3-4 points to prevent timeouts
return distance < 750 ? 3 : 4;
}
}
function calculateBreakpointCount(distance: number): number {
if (distance < 100) {
return 3;
} else if (distance >= 100 && distance <= 500) {
return distance < 300 ? 3 : 4;
} else {
return distance < 750 ? 4 : 5;
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/compute/breakPoint.compute.ts` around lines 28 - 36, The
calculateBreakpointCount function returns non-monotonic breakpoint counts (e.g.,
400 -> 4 but 501 -> 3); fix calculateBreakpointCount so counts never decrease as
distance grows — for example update the final else branch logic used by
calculateBreakpointCount to return distance < 750 ? 4 : 5 (or otherwise raise
the base for >500) so thresholds are monotonic, and add a short comment on the
distance parameter units (meters vs km) above the function for future
maintainers.

Comment on lines 228 to 230
export async function runManualBatchScoring(): Promise<void> {
await runBatchScoring();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

runManualBatchScoring bypasses the _isRunning concurrency guard

A manual trigger can run concurrently with the scheduled interval tick, causing duplicate DB reads and writes against the same route documents.

🔧 Proposed fix
 export async function runManualBatchScoring(): Promise<void> {
+  if (_isRunning) {
+    return;
+  }
+  _isRunning = true;
+  try {
     await runBatchScoring();
+  } finally {
+    _isRunning = false;
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 228 - 230,
runManualBatchScoring currently calls runBatchScoring directly and bypasses the
module's concurrency guard; update runManualBatchScoring to check the existing
_isRunning boolean (the same flag used by the scheduled tick) before invoking
runBatchScoring, set _isRunning=true when starting, call await
runBatchScoring(), and ensure _isRunning is reset to false in a finally block so
manual triggers cannot run concurrently with the scheduled runner.

Comment on lines 105 to +106
const url = `${baseUrl}/api/compute-scores/`;
const timeout = 30000; // 30 second timeout
const timeout = 30000;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

baseUrl not normalized — inconsistent with checkPathwayHealth.

checkPathwayHealth (line 162) strips trailing slashes from baseUrl, but sendToPathway does not. A trailing-slash baseUrl would produce a double-slash URL (http://host//api/compute-scores/), which may 404 on some servers.

🐛 Proposed fix
 export async function sendToPathway(
   baseUrl: string,
   routes: PathwayRouteInput[]
 ): Promise<PathwayResponse> {
-  const url = `${baseUrl}/api/compute-scores/`;
-  const timeout = 30000;
+  const normalizedBaseUrl = baseUrl.replace(/\/+$/, "");
+  const url = `${normalizedBaseUrl}/api/compute-scores/`;
+  const timeout = process.env.PATHWAY_TIMEOUT_MS
+    ? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
+    : 30000;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const url = `${baseUrl}/api/compute-scores/`;
const timeout = 30000; // 30 second timeout
const timeout = 30000;
const normalizedBaseUrl = baseUrl.replace(/\/+$/, "");
const url = `${normalizedBaseUrl}/api/compute-scores/`;
const timeout = process.env.PATHWAY_TIMEOUT_MS
? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
: 30000;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/pathwayClient.ts` around lines 105 - 106,
sendToPathway constructs url = `${baseUrl}/api/compute-scores/` without
normalizing baseUrl, so a baseUrl with a trailing slash can produce a
double-slash URL; update sendToPathway to strip trailing slashes from baseUrl
(or reuse the same normalization logic used by checkPathwayHealth) before
concatenation (e.g., normalize baseUrl via baseUrl.replace(/\/+$/, '') or call
the existing normalization helper) so the final url is
`${normalizedBaseUrl}/api/compute-scores/`; ensure references to baseUrl and url
in sendToPathway and any related timeout logic remain unchanged.

Comment on lines +164 to +166
const timeout = process.env.PATHWAY_TIMEOUT_MS
? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
: 30000;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

parseInt of an invalid PATHWAY_TIMEOUT_MS yields NaN, causing setTimeout to fire immediately.

If the env var is set to a non-numeric string, parseInt returns NaN. setTimeout(..., NaN) behaves like setTimeout(..., 0), aborting the request instantly.

🛡️ Proposed fix — validate the parsed value
-  const timeout = process.env.PATHWAY_TIMEOUT_MS
-    ? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
-    : 30000;
+  const envTimeout = process.env.PATHWAY_TIMEOUT_MS
+    ? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
+    : NaN;
+  const timeout = Number.isFinite(envTimeout) && envTimeout > 0 ? envTimeout : 30000;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const timeout = process.env.PATHWAY_TIMEOUT_MS
? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
: 30000;
const envTimeout = process.env.PATHWAY_TIMEOUT_MS
? parseInt(process.env.PATHWAY_TIMEOUT_MS, 10)
: NaN;
const timeout = Number.isFinite(envTimeout) && envTimeout > 0 ? envTimeout : 30000;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/pathwayClient.ts` around lines 164 - 166, The
timeout calculation using parseInt on PATHWAY_TIMEOUT_MS can produce NaN and
cause setTimeout to behave like 0; update the logic in pathwayClient.ts where
the constant timeout is computed to parse the env var (e.g., via Number or
parseInt), then validate the result (use Number.isFinite / Number.isInteger and
ensure > 0) and fall back to the safe default 30000 when invalid or
non-positive; ensure any later use of timeout (the setTimeout call in the
request/abort flow) uses this validated value so requests are not aborted
immediately.

@kaihere14 kaihere14 merged commit 77147c6 into kaihere14:main Feb 23, 2026
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants