Akearney6 issue125 #127

Open

akearney6 wants to merge 4 commits into dev_devayani from akearney6_issue125

Conversation

@akearney6
Collaborator

No description provided.

@akearney6 akearney6 requested a review from Devayani1612 March 2, 2026 20:50
@akearney6 akearney6 requested review from cubap and thehabes as code owners March 2, 2026 20:50
@thehabes thehabes left a comment (Collaborator)

Working demo where I see results! Note: I switched https://store.rerum.io/v1/ to https://devstore.rerum.io/v1/ for my test.

Image

#125 specifies paged search results. I only get one page of 50 results and can never see beyond the 50th result, so it is not paging.

The content extends beyond the screen size, so the UI is cut off awkwardly.

Image

Be aware that #126 is also introducing a search UI, so search.html may be redundant. Work with the team to decide what to use as the final interface.

Here is the output from the Static Review which contains issues you should address.

Static Review Comments

Branch: akearney6_issue125
Review Date: 2026-03-02
Reviewer: Claude Code assisted by Bryan - @thehabes

Bryan and Claude make mistakes. Verify all issues and suggestions. Avoid unnecessary scope creep.

Category       Issues Found
🔴 Critical    1
🟠 Major       1
🟡 Minor       4
🔵 Suggestions 2

Critical Issues 🔴

Issue 1: XSS via innerHTML with unsanitized RERUM API data

File: web/js/searchTool.js:240-255
Category: Security — Cross-Site Scripting (XSS)

Problem:

The renderResults() function interpolates data from the RERUM API directly into HTML strings via .innerHTML without any sanitization. The affected fields are item.type, item.id, and item.target.

RERUM is a public annotation store — anyone can create annotations with arbitrary data. An attacker could craft an annotation where type, @id, or target contains malicious HTML/JavaScript. When a user's search returns that annotation, the payload executes in their browser.

There are two distinct attack vectors:

  1. HTML injection in display text (e.g., type = <img src=x onerror=alert(document.cookie)>)
  2. javascript: URI injection in href attributes (e.g., @id = javascript:alert(document.cookie))

Current Code (searchTool.js lines 240-245):

// item.type and item.id injected raw
const typeText = item.type ? `Type: ${item.type}` : "Type: (unknown)";
const idLink =
  item.id && typeof item.id === "string"
    ? `<a href="${item.id}" target="_blank" rel="noopener noreferrer">${item.id}</a>`
    : "(no @id)";
meta.innerHTML = `${typeText}<br />@id: ${idLink}`; // UNSAFE

And lines 250-255:

// item.target injected raw
targetEl.innerHTML = `Target: <a href="${item.target}" ...>${item.target}</a>`; // UNSAFE

Suggested Fix:

Replace all .innerHTML assignments with safe DOM construction. Create a helper to safely build links:

function isSafeUrl(url) {
  if (typeof url !== "string") return false;
  try {
    const parsed = new URL(url);
    return ["http:", "https:"].includes(parsed.protocol);
  } catch {
    return false;
  }
}

function createSafeLink(url, displayText) {
  if (isSafeUrl(url)) {
    const a = document.createElement("a");
    a.href = url;
    a.textContent = displayText || url;
    a.target = "_blank";
    a.rel = "noopener noreferrer";
    return a;
  }
  const span = document.createElement("span");
  span.textContent = displayText || url || "(invalid URL)";
  return span;
}

Then replace the rendering block (~lines 238-264) to build all elements via the DOM API using textContent, createElement, and appendChild instead of string interpolation into .innerHTML.

How to Verify:

  1. Create a test annotation in RERUM with type set to <img src=x onerror=alert('XSS')>
  2. Search for text that would match it
  3. Confirm before fix: alert fires. After fix: the raw text displays harmlessly
  4. Also test with @id set to javascript:alert('XSS') — after fix, link should not render as clickable
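
The URL-vetting half of the fix can also be sanity-checked outside the browser, since Node's global `URL` matches the browser API. A small sketch (the helper is copied from the suggested fix above; the payloads mirror the verification steps):

```javascript
// Standalone copy of the isSafeUrl helper from the suggested fix.
function isSafeUrl(url) {
  if (typeof url !== "string") return false;
  try {
    const parsed = new URL(url);
    return ["http:", "https:"].includes(parsed.protocol);
  } catch {
    return false;
  }
}

// Legitimate RERUM ids pass.
console.log(isSafeUrl("https://devstore.rerum.io/v1/id/abc123")); // true
// javascript: URIs are rejected, defusing attack vector 2.
console.log(isSafeUrl("javascript:alert('XSS')")); // false
// Non-strings and unparseable values are rejected.
console.log(isSafeUrl(null));        // false
console.log(isSafeUrl("not a url")); // false
```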

Major Issues 🟠

Issue 2: Rate limit slot consumed before search completes

File: web/js/searchTool.js:106-109
Category: Logic Error

Problem:

checkRateLimit() records the search timestamp (lines 107-108) before the search actually executes. If the fetch() call fails due to a network error, the user has consumed a rate-limit slot for nothing. With the 5-per-minute limit, a flaky network could lock users out after 5 failed attempts.

Current Code:

// Record successful slot reservation.
rateLimitState.lastSearchTime = now;
rateLimitState.recentSearches.push(now);
return { ok: true };

Suggested Fix:

Split checkRateLimit() into a check and a record step. Only record after a successful fetch:

function checkRateLimit() {
  const now = Date.now();
  rateLimitState.recentSearches = rateLimitState.recentSearches.filter(
    (t) => now - t <= 60 * 1000
  );
  if (now - rateLimitState.lastSearchTime < 1000) {
    return {
      ok: false,
      reason: "You can run at most 1 search per second.",
      retryAfterMs: 1000 - (now - rateLimitState.lastSearchTime)
    };
  }
  if (rateLimitState.recentSearches.length >= 5) {
    const oldest = rateLimitState.recentSearches[0];
    const retryAfterMs = 60 * 1000 - (now - oldest);
    return {
      ok: false,
      reason: "You can run at most 5 searches per minute.",
      retryAfterMs: retryAfterMs > 0 ? retryAfterMs : 0
    };
  }
  return { ok: true };
}

function recordSearch() {
  const now = Date.now();
  rateLimitState.lastSearchTime = now;
  rateLimitState.recentSearches.push(now);
}

Then in the submit handler, call recordSearch() after performRerumSearch() succeeds (but before rendering):

const { results, fromCache } = await performRerumSearch({ ... });
if (!fromCache) {
  recordSearch(); // Only count actual API calls against the limit
}

How to Verify:

  1. Disconnect network (DevTools -> Network -> Offline)
  2. Attempt 5+ searches — all should fail gracefully
  3. Reconnect network — next search should proceed immediately without a rate-limit block
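
The check/record split can also be exercised deterministically without a browser. The sketch below reimplements the pair with an injectable clock (`makeRateLimiter` and `nowFn` are my names, not code from the PR) to show that unrecorded failures never consume a slot:

```javascript
// Minimal sketch of the check/record split with an injectable clock
// so the behavior can be simulated deterministically.
function makeRateLimiter(nowFn = Date.now) {
  const state = { lastSearchTime: 0, recentSearches: [] };

  function checkRateLimit() {
    const now = nowFn();
    state.recentSearches = state.recentSearches.filter((t) => now - t <= 60 * 1000);
    // Guard on lastSearchTime so the very first call (simulated clock near 0)
    // isn't throttled by the per-second rule.
    if (state.lastSearchTime && now - state.lastSearchTime < 1000) {
      return { ok: false, reason: "You can run at most 1 search per second." };
    }
    if (state.recentSearches.length >= 5) {
      return { ok: false, reason: "You can run at most 5 searches per minute." };
    }
    return { ok: true };
  }

  function recordSearch() {
    const now = nowFn();
    state.lastSearchTime = now;
    state.recentSearches.push(now);
  }

  return { checkRateLimit, recordSearch };
}

// Simulate 5 attempts, 2s apart: 3 fetches succeed and are recorded,
// the last 2 "fail" and are never recorded.
let clock = 0;
const limiter = makeRateLimiter(() => clock);
for (let i = 0; i < 5; i++) {
  clock += 2000;
  const check = limiter.checkRateLimit();
  const fetchSucceeded = i < 3; // pretend the last two fetches fail
  if (check.ok && fetchSucceeded) limiter.recordSearch();
}
clock += 2000;
// Only 3 slots were consumed, so the next search is still allowed.
console.log(limiter.checkRateLimit().ok); // true
```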

Minor Issues 🟡

Issue 3: No cache size limit — localStorage can fill up

File: web/js/searchTool.js:63-69
Category: Logic Error / Resource Exhaustion

Problem:

The setCachedResults() function stores every search result set in queryCache and persists it to localStorage without any limit on the number of entries. Each RERUM search can return up to 50 JSON annotation objects, each potentially several KB. Over time, the cache will grow until localStorage hits its ~5MB limit, at which point persistCache() will silently fail and the cache becomes unusable. This also affects other features on the same origin that use localStorage.

Current Code:

function setCachedResults(key, results) {
  queryCache[key] = {
    timestamp: Date.now(),
    results
  };
  persistCache();
}

Suggested Fix:

Add a max-entries eviction policy (e.g., keep the 20 most recent entries):

const MAX_CACHE_ENTRIES = 20;

function setCachedResults(key, results) {
  queryCache[key] = {
    timestamp: Date.now(),
    results
  };

  // Evict oldest entries if cache exceeds limit.
  const keys = Object.keys(queryCache);
  if (keys.length > MAX_CACHE_ENTRIES) {
    keys
      .sort((a, b) => (queryCache[a].timestamp || 0) - (queryCache[b].timestamp || 0))
      .slice(0, keys.length - MAX_CACHE_ENTRIES)
      .forEach((k) => delete queryCache[k]);
  }

  persistCache();
}

How to Verify:

  1. Run 25+ unique searches
  2. Check localStorage.getItem("rerumSearchCache_v1") in DevTools
  3. Confirm only the most recent 20 entries are stored
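
The eviction policy is easy to check outside the browser too. This sketch stubs out persistCache (no localStorage in Node) and adds an explicit timestamp parameter so the ordering is deterministic; otherwise the logic matches the suggested fix:

```javascript
const MAX_CACHE_ENTRIES = 20;
let queryCache = {};
const persistCache = () => {}; // stub: the real code writes to localStorage

// Same eviction logic as the suggested fix, with an explicit timestamp
// so this simulation doesn't depend on wall-clock resolution.
function setCachedResults(key, results, timestamp = Date.now()) {
  queryCache[key] = { timestamp, results };

  const keys = Object.keys(queryCache);
  if (keys.length > MAX_CACHE_ENTRIES) {
    keys
      .sort((a, b) => (queryCache[a].timestamp || 0) - (queryCache[b].timestamp || 0))
      .slice(0, keys.length - MAX_CACHE_ENTRIES)
      .forEach((k) => delete queryCache[k]);
  }

  persistCache();
}

// 25 unique "searches": the 5 oldest entries get evicted.
for (let i = 0; i < 25; i++) setCachedResults(`query-${i}`, [], i);
console.log(Object.keys(queryCache).length); // 20
console.log("query-4" in queryCache);        // false (evicted)
console.log("query-24" in queryCache);       // true (most recent)
```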

Issue 4: Redundant defer on module script

File: web/search.html:9
Category: Code Hygiene

Problem:

<script type="module"> scripts are deferred by default per the HTML spec. The defer attribute has no effect and is misleading.

Current Code:

<script src="./js/searchTool.js" type="module" defer></script>

Suggested Fix:

<script src="./js/searchTool.js" type="module"></script>

Note: tools.html has the same pattern — could be fixed in a separate cleanup.


Issue 5: Parameter shadows outer variable

File: web/js/searchTool.js:315
Category: Code Hygiene

Problem:

The inner function setSearching(isSearching) uses a parameter named isSearching which shadows the outer closure variable isSearching (declared on line 308). While functionally correct, this is a maintenance trap — a future developer might mistakenly reference the outer variable thinking it's the parameter, or vice versa.

Suggested Fix:

Rename the parameter:

function setSearching(busy) {
  searchButton.disabled = busy;
  loadMoreButton.disabled = busy;
  form.querySelectorAll("input,button").forEach((el) => {
    if (el === clientFilter || el === clearButton) return;
    if (busy) {
      el.setAttribute("aria-busy", "true");
    } else {
      el.removeAttribute("aria-busy");
    }
  });
}

Issue 6: No debounce on client-side filter

File: web/js/searchTool.js:474-476
Category: Performance

Problem:

The input event on client-filter calls renderResults() on every keystroke. With large result sets (up to hundreds of annotations after multiple "Load more" clicks), this re-creates the entire DOM list on every character typed, which could cause noticeable lag.

Current Code:

clientFilter.addEventListener("input", () => {
  renderResults(lastResults, clientFilter.value);
});

Suggested Fix:

let filterTimeoutId = null;
clientFilter.addEventListener("input", () => {
  if (filterTimeoutId) clearTimeout(filterTimeoutId);
  filterTimeoutId = setTimeout(() => {
    renderResults(lastResults, clientFilter.value);
  }, 250);
});
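
If this pattern ends up repeated (say, a filter input on another page), the timeout bookkeeping can be wrapped in a small reusable helper. `debounce` here is a generic sketch, not an existing function in the codebase:

```javascript
// Generic debounce: collapses a burst of calls into one trailing call.
function debounce(fn, delayMs) {
  let timeoutId = null;
  return (...args) => {
    if (timeoutId) clearTimeout(timeoutId);
    timeoutId = setTimeout(() => fn(...args), delayMs);
  };
}

// Three rapid "keystrokes" → the renderResults-style work runs once.
let renders = 0;
const onInput = debounce(() => { renders += 1; }, 50);
onInput();
onInput();
onInput();
setTimeout(() => console.log(renders), 200); // 1
```

The listener then reduces to `clientFilter.addEventListener("input", debounce(() => renderResults(lastResults, clientFilter.value), 250));`.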

Suggestions 🔵

Suggestion 1: Purge expired cache entries on load

File: web/js/searchTool.js:19-31

loadCache() restores the entire cache from localStorage but doesn't clean up expired entries. Adding a sweep in loadCache() would reduce memory usage on page load:

function loadCache() {
  try {
    const raw = window.localStorage.getItem(CACHE_STORAGE_KEY);
    if (!raw) return;
    const parsed = JSON.parse(raw);
    if (parsed && typeof parsed === "object") {
      const now = Date.now();
      for (const key of Object.keys(parsed)) {
        if (!parsed[key] || now - parsed[key].timestamp > CACHE_TTL_MS) {
          delete parsed[key];
        }
      }
      queryCache = parsed;
      persistCache();
    }
  } catch (e) {
    console.warn("Unable to load search cache from localStorage.", e);
    queryCache = {};
  }
}

Suggestion 2: Add aria-live region for search status

File: web/search.html:100

The #search-status element updates dynamically but screen readers won't announce the changes unless it's an ARIA live region:

<p id="search-status" class="text-small" aria-live="polite"></p>

Similarly, #results-summary could benefit:

<p id="results-summary" class="text-small" aria-live="polite"></p>


Inline comment on the cached-results lookup:

const cacheKey = makeCacheKey(searchText, usePhrase, limit, skip);

const cached = getCachedResults(cacheKey);
if (cached) {

The problem here is that you are returning stale results.

To reproduce...

Go to search.html and perform a text search for "lorem"; the results will display and be cached.

Then, in a new tab, go to https://tinydev.rerum.io and create the following Annotation

{
	"type":"Annotation",
	"body": {
		"value" : "this is a lorem ipsum test"
	},
	"target": "https://example.org/item/1"
}

Then go back to search.html. Refresh the page. Repeat the text search for "lorem". It will pull the results out of cache, and so the new Annotation you just created does not appear in the search results.


Inline comment on the cache TTL constant:

* Client-side protections for RERUM search API.
*/

const CACHE_TTL_MS = 20 * 60 * 1000; // 20 minutes

ISSUE: With a 20-minute cache TTL, if a user searches for "lorem", gets results, and then another user creates a new annotation containing "lorem" in RERUM, the first user won't see it when they refresh and search again (unless 20 minutes have passed).

Example scenario that breaks the feature:

  1. User searches for "test" → gets 50 results (cached for 20 min)
  2. Another user creates a new annotation: {"body": {"value": "test annotation"}}
  3. The same user refreshes and searches "test" again
  4. The new annotation does NOT appear because the cached results are returned

SUGGESTED FIXES (pick one):

  A) Reduce CACHE_TTL_MS from 20 minutes to 5 minutes
  B) Add a "Refresh Results" button to clear the cache manually
  C) Only cache results within a single page session (don't persist to localStorage)

What's the intended behavior? Should users always see the latest data, or is stale data acceptable?
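
Of the options, C is the smallest change: keep the cache in a page-scoped Map and drop the localStorage persistence entirely, so a refresh always fetches fresh results from RERUM. A hedged sketch (function names match the PR's helpers, but the implementation and the example cache key are mine):

```javascript
// Option C sketch: a session-scoped cache that lives only as long as the page.
const CACHE_TTL_MS = 5 * 60 * 1000; // shortened per option A as well
const queryCache = new Map();

function setCachedResults(key, results) {
  queryCache.set(key, { timestamp: Date.now(), results });
}

function getCachedResults(key) {
  const entry = queryCache.get(key);
  if (!entry) return null;
  if (Date.now() - entry.timestamp > CACHE_TTL_MS) {
    queryCache.delete(key); // expired within the session
    return null;
  }
  return entry.results;
}

// Arbitrary illustrative key; the real makeCacheKey format may differ.
setCachedResults("lorem|phrase:false|50|0", [{ id: "a" }]);
console.log(getCachedResults("lorem|phrase:false|50|0").length); // 1
console.log(getCachedResults("missing")); // null
```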
