Skip to content

Conversation

Roffenlund
Copy link
Contributor

@Roffenlund Roffenlund commented Aug 27, 2025

Ensures all headings (# through ######) get unique ids for anchor navigation.

  • Enabled id attributes on headings (h1–h6) in ALLOWED_ATTRIBUTES.
  • Integrated anchors_plugin.

Refs. TS-2595

Summary by CodeRabbit

  • New Features

    • Markdown headings (H1–H6) now include slugified anchor IDs for direct linking; pages auto-scroll to matching headings on load or when anchors are clicked.
  • Chores

    • Added a Markdown plugin dependency to enable heading anchors.
  • Tests

    • Updated test expectations to reflect headings now including id attributes.

Add the plugins pacakge for MarkdownIt which includes the anchors plugin
for automatically generating anchor id's to headings.

Refs. TS-2595
Add anchors_plugin to the markdownfiy logic.

Add "id" to the list of allowed attributes for h1-h6 html tags.

Refs. TS-2595
Copy link

coderabbitai bot commented Aug 27, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds the mdit-py-plugins dependency, enables the anchors plugin with a custom slugger in MarkdownIt (producing heading ids prefixed with user-content-), allows id attributes on h1h6 during sanitization, updates server and frontend tests to expect these heading ids, and adds client-side in-page anchor handling in the base template.

Changes

Cohort / File(s) Summary
Dependency management
django/pyproject.toml
Added Poetry dependency mdit-py-plugins = "0.4.2" under [tool.poetry.dependencies].
Markdown rendering & slugging
django/thunderstore/markdown/templatetags/markdownify.py
Added slugger(text: str) and imported anchors_plugin from mdit_py_plugins.anchors; initialized MarkdownIt("gfm-like") with .use(anchors_plugin, slug_func=slugger, max_level=6) to generate heading ids prefixed with user-content-.
HTML sanitization config
django/thunderstore/markdown/allowed_tags.py
Expanded ALLOWED_ATTRIBUTES for h1h6 from ["align"] to ["align", "id"].
Server-side prerender tests
django/thunderstore/api/cyberstorm/tests/test_markdown.py
Updated expected prerendered HTML to include id="user-content-very-strong-header" on the <h1> output.
Frontend API tests
django/thunderstore/frontend/api/experimental/tests/test_markdown.py
Updated MARKDOWN_RENDERED expected string to include id="user-content-test-markdown" on <h1> and adjusted paragraph whitespace.
Client-side anchor behavior
django/thunderstore/frontend/templates/base.html
Added slugger(text) JS, scrollToHash(hash) logic, interception of in-page # clicks, and handlers for DOMContentLoaded/hashchange to locate slugged or existing ids and scroll into view.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant View as Server View / Template
    participant MarkdownTag as markdownify tag
    participant MD as MarkdownIt ("gfm-like")
    participant Anchors as anchors_plugin
    participant Sanitizer as HTML sanitizer

    View->>MarkdownTag: render_markdown(markdown)
    MarkdownTag->>MD: init MarkdownIt("gfm-like")
    MarkdownTag->>MD: use(Anchors, slug_func=slugger, max_level=6)
    MD->>Anchors: detect headings
    Anchors-->>MD: attach generated id attrs (e.g. user-content-...)
    MD-->>MarkdownTag: HTML with heading ids
    MarkdownTag->>Sanitizer: sanitize (allow id on h1–h6)
    Sanitizer-->>View: sanitized HTML
Loading
sequenceDiagram
    autonumber
    participant Browser as User Browser
    participant Page as base.html JS
    participant DOM as Document

    Browser->>Page: page load / hashchange
    Page->>DOM: slugger(hash) -> try getElementById(slugged or raw)
    DOM-->>Page: element found?
    alt element found
        Page->>DOM: element.scrollIntoView()
        Page->>Browser: history.pushState/update URL
    else not found
        Page-->>Browser: no-op
    end

    Browser->>Page: user clicks internal #anchor link
    Page->>DOM: intercept, compute slug, scroll as above
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I hop through headings, tidy and quick,
Casting tiny ids like a playful trick.
Links find their carrots, paths snug and true,
From server seed to browser view.
A rabbit's nod — anchors stitched anew 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 36cfbaf and 9175d98.

📒 Files selected for processing (1)
  • django/thunderstore/frontend/templates/base.html (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • django/thunderstore/frontend/templates/base.html
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build docker image
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Build docker image
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch add-markdown-heading-anchor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Aug 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.24%. Comparing base (f06b9b4) to head (9175d98).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1177   +/-   ##
=======================================
  Coverage   92.23%   92.24%           
=======================================
  Files         331      331           
  Lines       10088    10094    +6     
  Branches      927      927           
=======================================
+ Hits         9305     9311    +6     
  Misses        657      657           
  Partials      126      126           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
django/thunderstore/markdown/allowed_tags.py (1)

1-10: Remove invalid HTML tags h7 and h8 from ALLOWED_TAGS.

They’re not standard elements and won’t be emitted by Markdown. Small cleanup.

 ALLOWED_TAGS = [
     "h1",
     "h2",
     "h3",
     "h4",
     "h5",
     "h6",
-    "h7",
-    "h8",
     "br",
django/thunderstore/markdown/templatetags/markdownify.py (1)

6-6: Enabling anchors_plugin at init is good; consider guarding against cross-page id collisions.

If there are existing DOM ids elsewhere (templates, components), heading ids could collide. Optional: prefix slugs (e.g., md-). Keeps headings unique across the full document without hurting UX.

-from mdit_py_plugins.anchors import anchors_plugin
+from mdit_py_plugins.anchors import anchors_plugin, slugify as _slugify
@@
-md = MarkdownIt("gfm-like").use(anchors_plugin, max_level=6)
+def _prefixed_slugify(s, idx):
+    return f"md-{_slugify(s, idx)}"
+
+md = MarkdownIt("gfm-like").use(anchors_plugin, max_level=6, slugify=_prefixed_slugify)

Add a quick test to ensure ids survive Bleach and remain distinct:

def test_markdown_headings_have_unique_ids():
    html = render_markdown("# A\n\n# A\n\n## B")
    from lxml import html as lh
    doc = lh.fromstring(html)
    ids = [h.get("id") for h in doc.xpath("//h1|//h2|//h3|//h4|//h5|//h6")]
    assert all(ids), ids
    assert len(ids) == len(set(ids)), ids

Also applies to: 15-15

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f06b9b4 and 5c5167c.

⛔ Files ignored due to path filters (1)
  • django/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • django/pyproject.toml (1 hunks)
  • django/thunderstore/markdown/allowed_tags.py (1 hunks)
  • django/thunderstore/markdown/templatetags/markdownify.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build docker image
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Test pytest (5)
  • GitHub Check: Test pytest (6)
  • GitHub Check: Test pytest (4)
  • GitHub Check: Test pytest (3)
  • GitHub Check: Test pytest (2)
  • GitHub Check: Test pytest (1)
🔇 Additional comments (2)
django/thunderstore/markdown/allowed_tags.py (1)

82-87: Allowing id on h1–h6 is correct and aligns with anchors_plugin.

This enables heading anchors to survive Bleach sanitization. LGTM.

django/pyproject.toml (1)

35-35: Anchors plugin import verified at runtime

The mdit-py-plugins = "0.4.2" pin has been validated locally: importing anchors_plugin succeeds under markdown-it-py 3.0.0 as expected. No further changes are needed—please ensure your lockfile and CI images are updated to include the new wheel.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
django/thunderstore/api/cyberstorm/tests/test_markdown.py (2)

66-74: Reduce brittleness: assert 200 and ignore trailing whitespace

Avoid failures from incidental newline differences and assert status explicitly.

     response = api_client.get(
         f"/api/cyberstorm/package/{v.package.namespace}/{v.package.name}/latest/readme/",
     )
-    actual = response.json()
-    expected_html = (
-        '<h1 id="very-strong-header">Very <strong>strong</strong> header</h1>\n'
-    )
-    assert actual["html"] == expected_html
+    assert response.status_code == 200
+    actual = response.json()
+    expected_html = '<h1 id="very-strong-header">Very <strong>strong</strong> header</h1>'
+    assert actual["html"].strip() == expected_html

62-74: Optional: add coverage for duplicate and non-ASCII headings

Helps prevent regressions in slug/uniqueness behavior.

Example tests to consider:

@pytest.mark.django_db
def test_readme_heading_ids__deduplicate_on_conflict(api_client: APIClient) -> None:
    v = PackageVersionFactory(readme="# Title\n# Title")
    r = api_client.get(f"/api/cyberstorm/package/{v.package.namespace}/{v.package.name}/latest/readme/")
    assert r.status_code == 200
    html = r.json()["html"]
    assert 'id="title"' in html
    # Ensure a distinct id was generated for the duplicate
    assert html.count('id="title') >= 2 and 'id="title-' in html

@pytest.mark.django_db
def test_readme_heading_ids__non_ascii_slug(api_client: APIClient) -> None:
    v = PackageVersionFactory(readme="# Über café")
    r = api_client.get(f"/api/cyberstorm/package/{v.package.namespace}/{v.package.name}/latest/readme/")
    assert r.status_code == 200
    html = r.json()["html"]
    # Expect a stable, URL-safe slug; adjust assertion if your slugify rules differ
    assert 'id=' in html
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5c5167c and 4022395.

📒 Files selected for processing (1)
  • django/thunderstore/api/cyberstorm/tests/test_markdown.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Build docker image
  • GitHub Check: Build docker image
🔇 Additional comments (1)
django/thunderstore/api/cyberstorm/tests/test_markdown.py (1)

70-73: LGTM: heading id expectation is correct

The expected anchor id matches the slug generated from the heading text and preserves inline .

Refs. TS-2595
@Roffenlund Roffenlund force-pushed the add-markdown-heading-anchor branch from 4022395 to ec17152 Compare August 27, 2025 14:02
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
django/thunderstore/frontend/api/experimental/tests/test_markdown.py (3)

31-32: Reduce brittleness of HTML assertion

Comparing the whole HTML string is fragile to minor formatting changes. Assert on key parts instead.

Apply this diff:

-    assert response.json() == {"html": MARKDOWN_RENDERED}
+    data = response.json()
+    html = data["html"]
+    assert 'id="test-markdown"' in html
+    assert "<p>this is a description</p>" in html

46-46: Remove debug print from test

Leftover print adds noise to CI logs.

Apply this diff:

-    print(response.content)

1-53: Add coverage for more cases (h2–h6, duplicates, punctuation/unicode)

Extend tests to ensure ids are present and unique across levels and repeated headings; also cover punctuation/unicode.

Example additional tests:

import pytest

@pytest.mark.django_db
@pytest.mark.parametrize("heading_tag, md", [
    ("h1", "# A Title"),
    ("h2", "## Sub Title"),
    ("h3", "### Deeper"),
    ("h4", "#### Deeper"),
    ("h5", "##### Deeper"),
    ("h6", "###### Deeper"),
])
def test_headings_all_levels_have_ids(api_client, heading_tag, md):
    resp = api_client.post(
        reverse("api:experimental:frontend.render-markdown"),
        json.dumps({"markdown": md}),
        content_type="application/json",
    )
    html = resp.json()["html"]
    assert f"<{heading_tag} id=" in html

@pytest.mark.django_db
def test_duplicate_headings_get_unique_ids(api_client):
    md = "# Foo\n\n# Foo\n"
    resp = api_client.post(
        reverse("api:experimental:frontend.render-markdown"),
        json.dumps({"markdown": md}),
        content_type="application/json",
    )
    html = resp.json()["html"]
    # Expect two different ids containing 'foo'
    ids = sorted(set(__import__("re").findall(r'<h1 id="([^"]+)"', html)))
    assert len(ids) == 2 and ids[0].startswith("foo") and ids[1].startswith("foo") and ids[0] != ids[1]

@pytest.mark.django_db
def test_heading_slug_punctuation_and_unicode(api_client):
    md = "# Héllo, world! — v2.0"
    resp = api_client.post(
        reverse("api:experimental:frontend.render-markdown"),
        json.dumps({"markdown": md}),
        content_type="application/json",
    )
    html = resp.json()["html"]
    assert "<h1 " in html and 'id="' in html  # do not lock to a specific slug algorithm
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4022395 and ec17152.

📒 Files selected for processing (2)
  • django/thunderstore/api/cyberstorm/tests/test_markdown.py (1 hunks)
  • django/thunderstore/frontend/api/experimental/tests/test_markdown.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • django/thunderstore/api/cyberstorm/tests/test_markdown.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Build docker image
  • GitHub Check: Build docker image
🔇 Additional comments (1)
django/thunderstore/frontend/api/experimental/tests/test_markdown.py (1)

15-17: LGTM: expected heading id matches anchors plugin slugging

The updated expectation for <h1 id="test-markdown"> looks correct and aligned with enabling anchors.

@x753 x753 self-requested a review August 29, 2025 07:15
Copy link
Contributor

@x753 x753 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give me a bit more time to look at this, no merging yet

@Roffenlund
Copy link
Contributor Author

Roffenlund commented Aug 29, 2025

Give me a bit more time to look at this, no merging yet

@x753 Here is the source code of the plugin in case you want to explore potential issue.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
django/thunderstore/frontend/templates/base.html (2)

35-39: Harden slugger for broader browser support and trim/normalize input.

Unicode property escapes are widely supported but not universal; add a safe ASCII fallback. Also trim input to avoid leading/trailing dashes.

-            function slugger(text) {
-                let slug = text.replace(/\s+/g, "-");
-                slug = slug.replace(/[^\p{L}\p{N}-]+/gu, "");
-                return "user-content-" + slug.toLowerCase();
-            }
+            function slugger(text) {
+                let slug = String(text).trim();
+                try {
+                    slug = slug.replace(/\s+/g, "-").replace(/[^\p{L}\p{N}-]+/gu, "");
+                } catch (_) {
+                    // Fallback for older engines without Unicode property escapes
+                    slug = slug.replace(/\s+/g, "-").replace(/[^A-Za-z0-9-]+/g, "");
+                }
+                return "user-content-" + slug.toLowerCase();
+            }

68-70: Consider relying on native hash navigation and only augment when needed.

Native behavior already scrolls to elements with matching id. Keeping the listeners is fine, but ensure they only act when native scrolling fails (handled by your updated scrollToHash returning false). Optionally, remove the DOMContentLoaded call if you find double-scrolling in some browsers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ec17152 and 949c8ac.

📒 Files selected for processing (2)
  • django/thunderstore/frontend/templates/base.html (1 hunks)
  • django/thunderstore/markdown/templatetags/markdownify.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • django/thunderstore/markdown/templatetags/markdownify.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Build docker image
  • GitHub Check: Build docker image
🔇 Additional comments (1)
django/thunderstore/frontend/templates/base.html (1)

33-70: Verify parity between backend and frontend slug algorithms.

The server uses anchors_plugin with a custom slugger (per PR description). Any drift from this JS slugger will break cross-linking. Please confirm both produce identical ids, including for diacritics, punctuation, emoji, CJK, and duplicate-heading suffixing behavior.

Would you like me to generate a small test comparing Python slug outputs against this JS version over a seeded set of headings?

x753 added 2 commits September 1, 2025 14:13
Update markdown tests to account for user-content slug change
Improve anchor navigation by attempting to find normal anchors first
rather than user-content prepended anchors
@x753
Copy link
Contributor

x753 commented Sep 25, 2025

TODO: Put the javascript somewhere more reasonable than base.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants