feat(catalog): expose upstream zh descriptions via locale-aware API #98

Merged
linkai0924 merged 2 commits into main from feat/item-description-i18n on May 25, 2026

@papysans
Collaborator

@papysans papysans commented May 22, 2026

Summary

Upstream catalog/index.json already ships per-entry description_zh alongside description (≈99.8% coverage); the ingest path now stores both in a new descriptions jsonb column on capability_items and capability_versions, mirroring the JSONB locale-map pattern item_categories already uses.

API endpoints resolve description per request locale (?lang= query > Accept-Language > en) and surface the raw descriptions map so clients can re-resolve on locale switch without re-fetching.
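
The precedence described above can be sketched roughly as follows. This is a minimal illustration using plain net/http, not the PR's gin-based handlers/locale.go; the names resolve, resolveFromRequest, and normalize are hypothetical, and the normalization rule (any zh variant becomes zh, everything else becomes en) is an assumption based on the behavior documented later in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const defaultLocale = "en"

// normalize collapses a raw language tag to a supported locale.
// Assumption: only "zh" and "en" are supported; unknown tags become "en".
func normalize(tag string) string {
	tag = strings.ToLower(strings.TrimSpace(tag))
	if tag == "zh" || strings.HasPrefix(tag, "zh-") || strings.HasPrefix(tag, "zh_") {
		return "zh"
	}
	return defaultLocale
}

// resolve applies the precedence: lang query value, then the first
// Accept-Language tag (q-values ignored), then the default locale.
func resolve(langParam, acceptLanguage string) string {
	if q := strings.TrimSpace(langParam); q != "" {
		return normalize(q)
	}
	if h := strings.TrimSpace(acceptLanguage); h != "" {
		first := strings.SplitN(h, ",", 2)[0]
		first = strings.SplitN(first, ";", 2)[0]
		return normalize(first)
	}
	return defaultLocale
}

// resolveFromRequest is a thin adapter for a standard-library request.
func resolveFromRequest(r *http.Request) string {
	return resolve(r.URL.Query().Get("lang"), r.Header.Get("Accept-Language"))
}

func main() {
	r := httptest.NewRequest(http.MethodGet, "/api/items?lang=en", nil)
	r.Header.Set("Accept-Language", "zh-CN,en;q=0.9")
	fmt.Println(resolveFromRequest(r)) // the query parameter outranks the header
}
```

Keeping the precedence logic in a function of two strings makes it testable without building a request.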

What changed

  • Migration (20260522000000_…): descriptions jsonb NOT NULL DEFAULT '{}' on items + versions
  • Ingest (catalog_ingest_service.go): catalogEntry.DescriptionZh field + buildDescriptionsJSON / descriptionsJSONEqual helpers; integral-replacement semantics, i.e. the stored map is overwritten as a whole rather than merged (upstream is the source of truth: drop a zh translation upstream and the next ingest clears it from the DB)
  • Locale resolution (new handlers/locale.go): ResolveLocale (?lang= > Accept-Language > en) + PickDescription (fallback chain) + list/search bypass helpers
  • API: ItemResponse.Descriptions + buildItemResponse(c, …) signature; list (ListItems / ListMyItems), recommend (GetTrending / GetNewAndNoteworthy), search (SemanticSearch / HybridSearch / FindSimilar) all wired up
  • Search SQL: SELECT columns now include descriptions
  • Embedding unchanged: vector still generated from English description only — no vector-space drift
  • Test fixtures (sqlite CREATE TABLE in 3 test files) updated; 24 new unit tests (9 ingest + 15 locale)
  • Drive-by fix: cmd/migrate/main.go imports google/uuid (caught by Docker build — local go build ./internal/... was masking it)
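
To illustrate the ingest semantics listed above, here is a hedged sketch of how a locale map might be built, compared, and replaced wholesale. The function names buildDescriptions and descriptionsEqual are stand-ins for the PR's buildDescriptionsJSON and descriptionsJSONEqual, whose actual signatures are not shown in this thread; the []byte representation is likewise an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// buildDescriptions returns the locale -> text map as JSON, including a key
// only when upstream supplied a non-empty string for it.
func buildDescriptions(en, zh string) []byte {
	m := map[string]string{}
	if en != "" {
		m["en"] = en
	}
	if zh != "" {
		m["zh"] = zh
	}
	b, _ := json.Marshal(m) // marshalling a map[string]string cannot fail
	return b
}

// descriptionsEqual compares two JSON objects by content, ignoring key order
// and whitespace. Empty input is treated as an empty object.
func descriptionsEqual(a, b []byte) bool {
	var ma, mb map[string]string
	if len(a) > 0 && json.Unmarshal(a, &ma) != nil {
		return false
	}
	if len(b) > 0 && json.Unmarshal(b, &mb) != nil {
		return false
	}
	if len(ma) == 0 && len(mb) == 0 {
		return true
	}
	return reflect.DeepEqual(ma, mb)
}

func main() {
	stored := []byte(`{"zh":"你好","en":"Hello"}`)
	next := buildDescriptions("Hello", "") // upstream dropped description_zh
	if !descriptionsEqual(stored, next) {
		stored = next // replace the whole map; do not merge with the old one
	}
	fmt.Println(string(stored))
}
```

Comparing by content rather than by raw bytes avoids spurious updates when only key order differs, while the wholesale overwrite is what lets a deletion upstream propagate to the database.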

Backward compatibility

description text column kept as the resolved-English fallback. Callers that omit Accept-Language continue to see the English string (zero API contract change). Frontend client (PR zgsm-sangfor/opencode#512) consumes the new descriptions map directly.
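
The fallback behavior that keeps this backward compatible (requested locale, then en, then the legacy text column) could look something like the sketch below. The name pickDescription and its signature are illustrative assumptions, not the PR's actual PickDescription.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pickDescription walks the fallback chain: requested locale, then "en",
// then the legacy plain-text column.
func pickDescription(descriptions []byte, locale, legacy string) string {
	var m map[string]string
	if len(descriptions) > 0 {
		// Decode errors are ignored; absent keys fall through to the next step.
		_ = json.Unmarshal(descriptions, &m)
	}
	if s := m[locale]; s != "" {
		return s
	}
	if s := m["en"]; s != "" {
		return s
	}
	return legacy
}

func main() {
	both := []byte(`{"en":"Hello","zh":"你好"}`)
	fmt.Println(pickDescription(both, "zh", "legacy text"))
	fmt.Println(pickDescription([]byte(`{"en":"Hello"}`), "zh", "legacy text"))
	fmt.Println(pickDescription([]byte(`{}`), "zh", "legacy text"))
}
```

Rows ingested before the migration carry an empty map, so they resolve to the legacy column and existing clients see no change.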

Test plan

  • go test ./internal/handlers/ ./internal/services/ (24 new tests, plus existing pass; 2 preexisting failures in marketplace_test/usage_test unrelated to this change)
  • Staging deployed (rc2): migration applied, ingest run with metadataUpdated=3315 (≈ every row got zh backfill, exactly as expected for first-time i18n ingest), failed=0, incomplete=0
  • curl -H 'Accept-Language: zh' /api/items returns Chinese description; curl /api/items (no header) returns English (back-compat verified)
  • Reviewer: verify swagger.json regen post-merge (currently blocked by preexisting models.MemoryFile swag parse error — separate issue)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added multi-language support for item descriptions in English and Chinese
    • Implemented automatic language detection based on user preferences and browser settings
    • API now returns localized descriptions tailored to user locale
  • Documentation

    • Updated documentation with internationalization details for description handling and locale resolution

Review Change Stack

papysans added 2 commits May 22, 2026 17:28
Upstream catalog/index.json already ships per-entry description_zh
alongside description (en); the ingest path now stores them in a new
descriptions jsonb column on capability_items and capability_versions,
mirroring the JSONB locale-map pattern item_categories already uses.

API endpoints resolve description per request locale (?lang= query
> Accept-Language > en) and surface the raw descriptions map so
clients can re-resolve on locale switch without re-fetching.

- migration: add descriptions jsonb DEFAULT '{}' on items + versions
- ingest: catalogEntry.DescriptionZh + integral-replacement write
- handlers/locale.go: ResolveLocale + PickDescription + list helpers
- buildItemResponse + bypass list/search/recommend paths wired up
- search SELECT columns include descriptions for vector/hybrid paths
- test fixtures (sqlite CREATE TABLE) updated; 24 new unit tests pass

Resolved description field stays backwards compatible — callers
that omit Accept-Language continue to see the English string.
…anizations

cmd/migrate/main.go references uuid.New() at line 1261 in
backfillOrganizations but never imported the package. Caught when
the staging Docker build attempted RUN go build ./cmd/migrate
and failed; local go build ./internal/... had been masking it.
@coderabbitai

coderabbitai Bot commented May 22, 2026

Walkthrough

This PR implements multilingual description support for catalog items. It adds a descriptions JSONB column to persist localized text (English and Chinese), updates the ingest pipeline to process upstream description variants, and modifies API handlers to resolve and return locale-aware descriptions based on client request context (query parameters and Accept-Language headers).

Changes

Catalog Internationalization (i18n) for Item Descriptions

| Layer | File(s) | Summary |
| --- | --- | --- |
| Database schema and persistent models | server/migrations/20260522000000_add_descriptions_jsonb_to_items.sql, server/internal/models/models.go, server/cmd/migrate/main_test.go, server/internal/handlers/registry_test.go, server/internal/services/scan_service_test.go | Migration adds descriptions JSONB NOT NULL DEFAULT '{}' to capability_items and capability_versions; models gain Descriptions fields; test fixtures updated with matching schema. |
| Locale resolution infrastructure | server/internal/handlers/locale.go, server/internal/handlers/locale_test.go | ResolveLocale extracts and normalizes locale from ?lang query param or Accept-Language header (query takes precedence, header uses first tag only, falls back to en); PickDescription selects best-match description from JSON map with fallback chain: requested locale → default en → legacy text field. Helper functions for list-based resolution; comprehensive test coverage for all normalization and fallback paths. |
| Catalog ingest service integration | server/internal/services/catalog_ingest_service.go, server/internal/services/catalog_ingest_service_test.go | Upstream catalogEntry now carries DescriptionZh; buildDescriptionsJSON constructs locale→text JSONB (includes en/zh keys only when non-empty); descriptionsJSONEqual performs semantic JSON comparison (order-insensitive). Ingest diffs descriptions using semantic equality; updateItem and insertItem set Descriptions from upstream; capability version rows include populated Descriptions. Tests verify integrity: locale deletion on upstream re-ingest clears the stored key instead of merging. |
| Item response and handler integration | server/internal/handlers/capability_item.go, server/internal/handlers/capability_registry.go | ItemResponse struct adds Descriptions field; buildItemResponse now accepts request context, resolves locale, selects description via PickDescription, and exposes raw descriptions JSON. All item response callsites (CreateItem, GetItem, UpdateItem*, createItemFrom*) pass context for locale resolution. List endpoints (ListItems, ListAllItems, ListMyItems) call ResolveItemListLocale to update item descriptions in-place before serialization. |
| Search and recommendation endpoints | server/internal/services/search_service.go, server/internal/handlers/search.go, server/internal/handlers/recommend.go | Search service queries (SemanticSearch, HybridSearch, FindSimilar) extended to select descriptions column; handlers post-process results via resolveSearchResultLocale/resolveSearchItemSliceLocale before JSON return. Recommendation handlers (GetTrending, GetNewAndNoteworthy) apply ResolveItemListLocale before pagination and response. |
| Documentation and supporting fixes | docs/CATALOG_INGEST.md, server/cmd/migrate/main.go | Docs section explains i18n architecture: upstream description/description_zh/description_original fields, JSONB storage in descriptions, retention of description as English default for backward compatibility and embeddings. API section documents locale precedence (?lang > Accept-Language > en), header normalization (zh-CN/zh-TW/zh-Hant → zh, en-US → en, others → en), response structure (localized description + raw descriptions map). UUID import added to main.go for compilation of backfill logic. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant APIHandler
  participant Database
  participant ResolveLocale
  participant PickDescription
  
  Client->>APIHandler: GET /item/123?lang=zh<br/>(or Accept-Language: zh-CN)
  APIHandler->>Database: fetch item (description + descriptions JSON)
  Database-->>APIHandler: CapabilityItem{description: "...", descriptions: {"en": "...", "zh": "..."}}
  APIHandler->>ResolveLocale: extract from context
  ResolveLocale-->>APIHandler: "zh"
  APIHandler->>PickDescription: descriptions JSON + "zh"
  PickDescription-->>APIHandler: "中文描述"
  APIHandler-->>Client: ItemResponse{description: "中文描述", descriptions: {...}}
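
As a rough companion to the diagram, the sketch below shows a response that carries both the resolved description and the raw map, and how a consumer could switch locale from the body it already holds. The itemResponse struct is a hypothetical, trimmed-down shape (the id field and JSON key names are assumptions), and switchLocale is an illustrative helper rather than anything from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// itemResponse is a hypothetical, trimmed-down response body.
type itemResponse struct {
	ID           string          `json:"id"`
	Description  string          `json:"description"`  // resolved for the request locale
	Descriptions json.RawMessage `json:"descriptions"` // raw locale -> text map
}

// encode serializes the resolved text together with the untouched map.
func encode(id, resolved string, raw []byte) ([]byte, error) {
	return json.Marshal(itemResponse{ID: id, Description: resolved, Descriptions: raw})
}

// switchLocale re-resolves from an already received body, so a locale
// change needs no second request. It falls back to the resolved field.
func switchLocale(body []byte, locale string) (string, error) {
	var resp struct {
		Description  string            `json:"description"`
		Descriptions map[string]string `json:"descriptions"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if s := resp.Descriptions[locale]; s != "" {
		return s, nil
	}
	return resp.Description, nil
}

func main() {
	body, err := encode("123", "中文描述", []byte(`{"en":"English text","zh":"中文描述"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	en, err := switchLocale(body, "en")
	if err != nil {
		panic(err)
	}
	fmt.Println(en)
}
```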

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • XDfield/costrict-web#97: Directly related UUID import fix in migrate main.go to resolve compilation during migrations.
  • XDfield/costrict-web#58: Introduces TagService infrastructure that is now leveraged by updated buildItemResponse to conditionally fetch item tags when not preloaded.

Poem

🐰 Descriptions bloom in many tongues,
从英文到中文,数据常驻。
Query lang or Accept-Language calls,
Locale resolution finds the perfect text,
And JSONB holds them all.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 32.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: adding Chinese descriptions from upstream sources and exposing them through locale-aware API endpoints. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain modules listed in go.work or their selected dependencies"




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
server/internal/handlers/locale_test.go (2)

25-87: ⚡ Quick win

Add tests for query parameter normalization and unsupported query locale.

The test suite covers Accept-Language normalization (e.g., zh-CN → zh) but doesn't verify that query parameters undergo the same normalization. Additionally, there's no test confirming that an unsupported query locale (e.g., ?lang=ja) falls back to en.

🧪 Suggested test additions
func TestResolveLocale_QueryParamNormalizes(t *testing.T) {
	c := newTestContext(t, "/items?lang=zh-CN", "")
	if got := ResolveLocale(c); got != "zh" {
		t.Errorf("?lang=zh-CN should normalize to zh: got %q", got)
	}
}

func TestResolveLocale_QueryParamUnsupportedFallsBack(t *testing.T) {
	c := newTestContext(t, "/items?lang=ja", "")
	if got := ResolveLocale(c); got != "en" {
		t.Errorf("?lang=ja should fall back to en: got %q", got)
	}
}

These tests confirm that query parameters respect the same normalization and fallback rules as headers, preventing regressions in the API contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/internal/handlers/locale_test.go` around lines 25 - 87, Add two unit
tests to verify query parameter locale normalization and fallback: implement
TestResolveLocale_QueryParamNormalizes and
TestResolveLocale_QueryParamUnsupportedFallsBack using newTestContext to create
requests like "/items?lang=zh-CN" and "/items?lang=ja" and assert
ResolveLocale(c) returns "zh" for the normalized zh-CN case and "en" for the
unsupported ja case; this ensures ResolveLocale handles query param
normalization and fallback the same way as Accept-Language header handling.

89-128: ⚡ Quick win

Add tests for the list resolution functions.

ResolveItemListLocale and ResolveCapabilityItemPointersLocale are used by list, search, and recommendation endpoints to rewrite descriptions in-place, but they lack test coverage. Testing these functions ensures the in-place mutation logic and nil-pointer handling work correctly.

🧪 Suggested test additions
func TestResolveItemListLocale(t *testing.T) {
	items := []models.CapabilityItem{
		{
			Description:  "fallback",
			Descriptions: datatypes.JSON([]byte(`{"en":"Hello","zh":"你好"}`)),
		},
		{
			Description:  "another fallback",
			Descriptions: datatypes.JSON([]byte(`{"en":"Goodbye"}`)),
		},
	}
	c := newTestContext(t, "/items", "zh-CN")
	ResolveItemListLocale(c, items)
	
	if items[0].Description != "你好" {
		t.Errorf("items[0].Description: got %q, want 你好", items[0].Description)
	}
	if items[1].Description != "Goodbye" {
		t.Errorf("items[1].Description: got %q, want Goodbye (fallback to en)", items[1].Description)
	}
}

func TestResolveCapabilityItemPointersLocale(t *testing.T) {
	item1 := &models.CapabilityItem{
		Description:  "fallback",
		Descriptions: datatypes.JSON([]byte(`{"en":"Hello","zh":"你好"}`)),
	}
	item2 := &models.CapabilityItem{
		Description:  "another fallback",
		Descriptions: datatypes.JSON([]byte(`{"en":"Goodbye"}`)),
	}
	items := []*models.CapabilityItem{item1, nil, item2}
	
	c := newTestContext(t, "/items", "zh-CN")
	ResolveCapabilityItemPointersLocale(c, items)
	
	if item1.Description != "你好" {
		t.Errorf("item1.Description: got %q, want 你好", item1.Description)
	}
	if item2.Description != "Goodbye" {
		t.Errorf("item2.Description: got %q, want Goodbye (fallback to en)", item2.Description)
	}
}

These tests verify that in-place mutation works correctly, nil pointers are handled gracefully, and locale fallback logic is applied consistently across list endpoints.
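
For context on what such tests would exercise, here is a minimal sketch of in-place list resolution with nil-pointer skipping. The item type stands in for models.CapabilityItem, and resolveList / resolvePointers take a locale string directly instead of a gin context; both simplifications are assumptions made to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// item is a hypothetical stand-in for models.CapabilityItem.
type item struct {
	Description  string
	Descriptions []byte // JSON object: locale -> text
}

// localized returns the text for locale, then "en", then the current
// Description as the last resort.
func localized(it item, locale string) string {
	var m map[string]string
	_ = json.Unmarshal(it.Descriptions, &m) // decode errors are ignored; absent keys fall through
	if s := m[locale]; s != "" {
		return s
	}
	if s := m["en"]; s != "" {
		return s
	}
	return it.Description
}

// resolveList rewrites Description in place for a slice of values.
func resolveList(items []item, locale string) {
	for i := range items { // index into the slice; a range copy would discard the write
		items[i].Description = localized(items[i], locale)
	}
}

// resolvePointers does the same for a slice of pointers, skipping nil entries.
func resolvePointers(items []*item, locale string) {
	for _, it := range items {
		if it == nil {
			continue
		}
		it.Description = localized(*it, locale)
	}
}

func main() {
	vals := []item{
		{Description: "fallback", Descriptions: []byte(`{"en":"Hello","zh":"你好"}`)},
		{Description: "another fallback", Descriptions: []byte(`{"en":"Goodbye"}`)},
	}
	resolveList(vals, "zh")
	fmt.Println(vals[0].Description, vals[1].Description)

	ptrs := []*item{nil, {Description: "legacy only"}}
	resolvePointers(ptrs, "zh")
	fmt.Println(ptrs[1].Description)
}
```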

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/internal/handlers/locale_test.go` around lines 89 - 128, Add unit
tests for ResolveItemListLocale and ResolveCapabilityItemPointersLocale that
exercise in-place mutation, nil-pointer handling, and locale fallback: create
sample CapabilityItem values with Description and Descriptions (datatypes.JSON)
covering cases where the requested locale exists, falls back to "en", and where
a pointer in the slice is nil; call ResolveItemListLocale(c, items) for a
[]models.CapabilityItem and ResolveCapabilityItemPointersLocale(c, itemsPtrs)
for []*models.CapabilityItem using newTestContext(t, "/items", "zh-CN"); assert
that items' Description fields are updated to the expected localized string
("你好" when zh present, "Goodbye" when only en present) and that nil entries are
skipped without panics.
server/internal/handlers/locale.go (1)

17-39: ⚡ Quick win

Consider respecting Accept-Language q-value fallback preferences.

The current implementation honors only the first tag in Accept-Language, ignoring quality values and secondary preferences. A user sending Accept-Language: ja,zh;q=0.9 would receive en (fallback) instead of their second preference zh. While the current behavior is clearly documented and reasonable for a first iteration, a more user-friendly approach would iterate through the tag list until a supported locale is found.

♻️ Optional enhancement to respect fallback preferences
 func ResolveLocale(c *gin.Context) string {
 	if c == nil {
 		return DefaultLocale
 	}
 	if q := strings.TrimSpace(c.Query("lang")); q != "" {
 		return normalizeLocale(q)
 	}
 	if h := strings.TrimSpace(c.GetHeader("Accept-Language")); h != "" {
-		// Accept-Language: zh-CN,en;q=0.9 → take "zh-CN" (first tag), ignore q values.
-		first := strings.SplitN(h, ",", 2)[0]
-		first = strings.SplitN(first, ";", 2)[0]
-		return normalizeLocale(first)
+		// Accept-Language: ja,zh-CN,en;q=0.9 → try each tag until we find a supported one.
+		tags := strings.Split(h, ",")
+		for _, tag := range tags {
+			tag = strings.TrimSpace(strings.SplitN(tag, ";", 2)[0])
+			if normalized := normalizeLocale(tag); normalized != DefaultLocale || tag == "" || strings.HasPrefix(strings.ToLower(tag), "en") {
+				return normalized
+			}
+		}
 	}
 	return DefaultLocale
 }

This allows unsupported first-choice languages (e.g., ja) to fall through to supported secondary choices (e.g., zh), improving UX for multilingual users.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/internal/handlers/locale.go` around lines 17 - 39, ResolveLocale
currently only takes the first tag from Accept-Language and ignores q-values;
update ResolveLocale to parse the Accept-Language header into slices of language
tags with their q weights, sort them by descending q (default q=1 when absent),
then iterate the sorted tags calling normalizeLocale(tag) and return the first
normalized value that represents a supported locale (i.e., not falling back to
DefaultLocale), otherwise continue to the next tag and finally return
DefaultLocale; you can reuse normalizeLocale and/or add a small helper like
isSupportedLocale to detect real-supported results rather than trusting the
first raw tag.
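
The q-value approach described in this comment (parse tags with weights, sort by descending q, return the first supported locale) might be sketched as below. This is only an illustration of the suggested direction, not code from the PR: pickFromHeader, primary, and the supported set are hypothetical names, and treating q=0 as "not acceptable" follows the usual HTTP reading of quality values.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// supported lists the locales the API can serve (assumption: en and zh only).
var supported = map[string]bool{"en": true, "zh": true}

// primary returns the lower-cased primary subtag: "zh-CN" -> "zh".
func primary(tag string) string {
	tag = strings.ToLower(strings.TrimSpace(tag))
	if i := strings.IndexAny(tag, "-_"); i >= 0 {
		tag = tag[:i]
	}
	return tag
}

// pickFromHeader returns the highest-weighted supported locale in an
// Accept-Language header, or "en" when none of the tags is supported.
func pickFromHeader(header string) string {
	type pref struct {
		tag string
		q   float64
	}
	var prefs []pref
	for _, part := range strings.Split(header, ",") {
		fields := strings.Split(part, ";")
		tag := strings.TrimSpace(fields[0])
		if tag == "" {
			continue
		}
		q := 1.0 // a tag without an explicit weight defaults to 1
		for _, p := range fields[1:] {
			p = strings.TrimSpace(p)
			if strings.HasPrefix(p, "q=") {
				if f, err := strconv.ParseFloat(p[2:], 64); err == nil {
					q = f
				}
			}
		}
		if q <= 0 {
			continue // q=0 marks the language as not acceptable
		}
		prefs = append(prefs, pref{tag, q})
	}
	// Stable sort keeps header order among tags with equal weight.
	sort.SliceStable(prefs, func(i, j int) bool { return prefs[i].q > prefs[j].q })
	for _, p := range prefs {
		if loc := primary(p.tag); supported[loc] {
			return loc
		}
	}
	return "en"
}

func main() {
	fmt.Println(pickFromHeader("ja,zh;q=0.9"))
	fmt.Println(pickFromHeader("en;q=0.5,zh-TW;q=0.8"))
}
```

An explicit supported set sidesteps the ambiguity in the inline diff above, where a normalized result of en cannot be told apart from a fallback.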

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 68004327-1c28-4e75-9b50-d409783750ce

📥 Commits

Reviewing files that changed from the base of the PR and between 1ae3200 and 3d83ba1.

📒 Files selected for processing (16)
  • docs/CATALOG_INGEST.md
  • server/cmd/migrate/main.go
  • server/cmd/migrate/main_test.go
  • server/internal/handlers/capability_item.go
  • server/internal/handlers/capability_registry.go
  • server/internal/handlers/locale.go
  • server/internal/handlers/locale_test.go
  • server/internal/handlers/recommend.go
  • server/internal/handlers/registry_test.go
  • server/internal/handlers/search.go
  • server/internal/models/models.go
  • server/internal/services/catalog_ingest_service.go
  • server/internal/services/catalog_ingest_service_test.go
  • server/internal/services/scan_service_test.go
  • server/internal/services/search_service.go
  • server/migrations/20260522000000_add_descriptions_jsonb_to_items.sql

item_type TEXT NOT NULL,
name TEXT NOT NULL,
description TEXT,
descriptions TEXT NOT NULL DEFAULT '{}',

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep test schema parity for capability_versions.descriptions.

Line 48 updates capability_items, but newMigrateTestDB still creates capability_versions without descriptions. That drift can hide/introduce migration-path failures when version rows are persisted through model paths.

🧪 Suggested schema parity patch
 		`CREATE TABLE IF NOT EXISTS capability_versions (
 			id TEXT PRIMARY KEY,
 			item_id TEXT NOT NULL,
 			revision INTEGER NOT NULL,
+			descriptions TEXT NOT NULL DEFAULT '{}',
 			content TEXT NOT NULL,
 			content_md5 TEXT DEFAULT '',
 			metadata TEXT DEFAULT '{}',
 			commit_msg TEXT,
 			created_by TEXT NOT NULL,
 			created_at DATETIME
 		)`,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/cmd/migrate/main_test.go` at line 48, The test DB schema created by
newMigrateTestDB is missing the descriptions column on capability_versions while
capability_items was updated; update newMigrateTestDB to add "descriptions TEXT
NOT NULL DEFAULT '{}'" to the CREATE TABLE for capability_versions so the test
schema matches the production change to capability_items.descriptions, ensuring
migration tests exercise the same column shape for version rows.

@linkai0924 linkai0924 merged commit 59d2f04 into main May 25, 2026
6 checks passed