Merge pull request #169 from kate-goldenring/sitemap-generator-script
Add sitemap generator script to use from GH action
kate-goldenring authored Oct 31, 2024
2 parents ed1f826 + 725ada2 commit 9937d6a
Showing 3 changed files with 63 additions and 4 deletions.
8 changes: 5 additions & 3 deletions .github/workflows/mdbook.yml
@@ -30,7 +30,6 @@ jobs:
     runs-on: ubuntu-latest
     env:
       MDBOOK_VERSION: 0.4.21
-      SITEMAP_GEN_VERSION: 0.2.0
       ALERTS_VERSION: 0.6.7
       PUBLISH_DOMAIN: component-model.bytecodealliance.org
     steps:
@@ -40,17 +39,20 @@ jobs:
           curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf -y | sh
           rustup update
           cargo install --version ${MDBOOK_VERSION} mdbook
-          cargo install --version ${SITEMAP_GEN_VERSION} mdbook-sitemap-generator
           cargo install --version ${ALERTS_VERSION} mdbook-alerts
       - name: Setup Pages
         id: pages
         uses: actions/configure-pages@v3
       - name: Build with mdBook
         run: mdbook build component-model
+      - name: Setup Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: 3.10
       - name: Generate sitemap
         run: |
           cd component-model
-          mdbook-sitemap-generator -d ${PUBLISH_DOMAIN} -o book/sitemap.xml
+          python3 ../scripts/generate_sitemap.py --domain "component-model.bytecodealliance.org" --higher-priority "design" --output-path book/sitemap.xml
           cd ..
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v2
1 change: 0 additions & 1 deletion CONTRIBUTING.md
@@ -14,7 +14,6 @@ This repository also makes use of mdBook plugins. To install mdBook and the plug
 
 ```console
 cargo install --version 0.4.21 mdbook
-cargo install --version 0.2.0 mdbook-sitemap-generator
 cargo install --version 0.6.7 mdbook-alerts
 ```
 
58 changes: 58 additions & 0 deletions scripts/generate_sitemap.py
@@ -0,0 +1,58 @@
import os
from urllib.parse import urljoin
from datetime import datetime
import argparse

def parse_summary():
    """Parse URLs from the SUMMARY.md file."""
    with open("src/SUMMARY.md", "r") as file:
        for line in file:
            if "](" in line:
                url = line.split("](")[1].split(")")[0]
                # Swap the .md extension for .html (mdBook renders each chapter to an .html page)
                if url.endswith(".md"):
                    url = url[:-3] + ".html"
                yield url

def determine_priority(url_path, higher_priority_section):
    """Determine the priority based on the URL path and specified higher priority section."""
    if url_path.count("/") <= 1:  # Pages directly under the base URL
        return "1.0"
    elif higher_priority_section and url_path.startswith(f"./{higher_priority_section}"):  # Pages in the specified higher priority section
        return "0.8"
    else:
        return "0.5"  # All other pages

def generate_sitemap(domain, output_path, higher_priority_section):
    """Generate a sitemap XML file from SUMMARY.md structure."""
    domain = "https://" + domain
    urls = parse_summary()
    urls = [""] + list(urls)  # Add base URL to the list of URLs

    sitemap = '<?xml version="1.0" encoding="UTF-8"?>\n'
    sitemap += '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'

    for url in urls:
        full_url = urljoin(domain, url)
        priority = determine_priority(url, higher_priority_section)

        sitemap += " <url>\n"
        sitemap += f" <loc>{full_url}</loc>\n"
        sitemap += " <changefreq>weekly</changefreq>\n"
        sitemap += f" <priority>{priority}</priority>\n"
        sitemap += " </url>\n"

    sitemap += "</urlset>"

    # Write the sitemap to the specified output path
    with open(output_path, "w") as file:
        file.write(sitemap)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate a sitemap for mdBook")
    parser.add_argument("-d", "--domain", required=True, help="Domain for the mdBook site (e.g., component-model.bytecodealliance.org)")
    parser.add_argument("-o", "--output-path", default="sitemap.xml", help="Output path for the sitemap file")
    parser.add_argument("-p", "--higher-priority", help="Subsection path (e.g., 'design') to assign a higher priority of 0.8")
    args = parser.parse_args()

    generate_sitemap(args.domain, args.output_path, args.higher_priority)
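
To make the priority rule in scripts/generate_sitemap.py easier to follow, here is a minimal sketch that runs it on a few links. `determine_priority` is reproduced from the script; the sample paths are hypothetical and only assume the `./section/page.html` form that `parse_summary()` would yield for `./`-prefixed links in SUMMARY.md.

```python
from urllib.parse import urljoin


def determine_priority(url_path, higher_priority_section):
    """Same rule as determine_priority in scripts/generate_sitemap.py."""
    if url_path.count("/") <= 1:  # at most one slash: treated as top level
        return "1.0"
    elif higher_priority_section and url_path.startswith(f"./{higher_priority_section}"):
        return "0.8"
    else:
        return "0.5"


BASE = "https://component-model.bytecodealliance.org"

# Hypothetical links (not taken from the real SUMMARY.md); "" is the site root.
samples = [
    "",
    "./introduction.html",
    "./design/why.html",
    "./language-support/rust.html",
]

# Pair each absolute URL with the priority the script would assign it.
rows = [(urljoin(BASE, path), determine_priority(path, "design")) for path in samples]

for loc, priority in rows:
    print(priority, loc)
```

Because the rule counts slashes, it depends on links carrying the leading `./`: a link written as `design/why.html` contains only one slash and would be given `1.0` rather than `0.8`.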
