Skip to content

Add remote repository support and AI-powered release notes generation#1

Closed
jyejare wants to merge 8 commits intomainfrom
feature/remote-repository-access
Closed

Add remote repository support and AI-powered release notes generation#1
jyejare wants to merge 8 commits intomainfrom
feature/remote-repository-access

Conversation

@jyejare
Copy link
Copy Markdown
Owner

@jyejare jyejare commented Apr 13, 2026

Summary

This PR adds comprehensive remote repository support (GitHub/GitLab) and AI-powered release notes generation to the utility-mcp-server.

Key Features

1. Remote Repository Support

  • GitHub API Integration: Fetch commits directly from GitHub without cloning
  • GitLab API Integration: Fetch commits directly from GitLab without cloning
  • Local Git Support: Fallback to local git commands for local repositories
  • Provider Abstraction: Clean provider pattern for extensibility

2. AI-Powered Release Notes Generation

  • Two Modes:
    • Mode 1 (Default): Returns raw commit data + AI instructions for intelligent categorization
    • Mode 2 (formatted_output=True): Pre-formatted markdown with automatic categorization
  • AI Instructions Embedding: Tool provides comprehensive categorization guidelines
  • 10 Automatic Categories: Breaking Changes, Security, Features, Bug Fixes, Performance, Documentation, Refactoring, Testing, Chores, Other

3. Intelligent Tag Handling

  • Auto-detection: Automatically finds previous version tag using semantic versioning
  • Tag Validation: Validates tags exist before fetching commits
  • Semantic Version Sorting: Properly sorts tags by version (v1.0.0 > v0.9.0 > v0.8.0)

4. Enhanced Data Extraction

  • PR/MR Number Extraction: Automatically extracts GitHub PR numbers (#123) and GitLab MR numbers (!456)
  • Compare URLs: Generates GitHub/GitLab compare URLs for version diffs
  • Commit Links: Creates clickable links to commits and PRs/MRs

Architecture: Option B Implementation

The tool implements Option B architecture where:

  • MCP Tool: Fetches data + provides AI instructions
  • AI Agent: Follows instructions to categorize intelligently

This separation ensures:

  • Tool logic stays simple (data fetching only)
  • AI instructions are version-controlled with the tool
  • Different AI agents can use the same tool with consistent guidance

Changes Made

Core Features

  • git_providers.py: Provider abstraction for GitHub/GitLab/Local git
  • release_notes_tool.py: Main MCP tool with two output modes
  • _categorize_commit(): Automatic commit categorization logic
  • _format_release_notes_markdown(): Pre-formatted markdown generator

Bug Fixes

  • Fixed semantic version sorting: Tool now correctly auto-detects previous tags using packaging.version
  • Fixed module-level imports: Moved packaging import to module level to prevent silent failures
  • Fixed test mocking: Corrected mock patches to point to actual module paths

Testing

  • test_git_providers.py: 20+ tests for all three providers
  • test_release_notes_tool.py: Tests for both output modes
  • All tests passing ✅
  • Pre-commit checks passing ✅

Dependencies

  • PyGithub>=2.1.1: GitHub API client
  • python-gitlab>=4.0.0: GitLab API client
  • packaging>=23.0: Semantic version parsing

Usage Examples

Example 1: AI-Powered Mode (Default)

result = await generate_release_notes(
    version="v1.0.0",
    repo_url="https://github.com/owner/repo",
    github_token=os.getenv('GITHUB_TOKEN')
)

# Returns: raw commits + ai_instructions
# AI agent follows instructions to categorize

Example 2: Pre-Formatted Mode (IDE Usage)

result = await generate_release_notes(
    version="v1.0.0",
    repo_url="https://github.com/owner/repo",
    formatted_output=True
)

# Returns: pre-formatted markdown with categories
print(result['formatted_output'])

Example 3: Local Repository

result = await generate_release_notes(
    version="v1.0.0",
    repo_path="/path/to/repo"
)

Formatted Output Categories

When formatted_output=True, commits are automatically categorized:

  • ⚠️ Breaking Changes: API changes, breaking refactors
  • 🔒 Security Updates: CVE fixes, security patches
  • 🎉 New Features: New functionality
  • 🐛 Bug Fixes: Issue resolutions
  • Performance Improvements: Speed optimizations
  • 📚 Documentation: Doc updates
  • 🔄 Refactoring: Code improvements
  • 🧪 Testing: Test additions
  • 🔧 Chores: Maintenance tasks
  • 📦 Other Changes: Miscellaneous

Benefits

No Local Clone Required: Fetch commits remotely via API
Multi-Platform: GitHub, GitLab, and local git support
Automatic Tag Detection: Finds previous version automatically
AI-Ready: Provides comprehensive instructions for AI agents
IDE-Ready: Pre-formatted output for direct use in Cursor/VS Code
Extensible: Clean provider pattern for adding more platforms
Well-Tested: Comprehensive test coverage

Testing Done

  • ✅ Tested with feast-dev/feast v0.62.0 (50+ commits)
  • ✅ All unit tests passing
  • ✅ Pre-commit checks passing
  • ✅ Both output modes tested
  • ✅ Token handling tested
  • ✅ Error scenarios tested

Version

Bumped to v0.2.0 to reflect major new features.

jyejare and others added 8 commits April 10, 2026 15:39
Major improvements to release notes generator:

1. Remote Repository Access
   - GitHub support via PyGithub API (no clone needed)
   - GitLab support via python-gitlab API
   - Provider pattern for easy extensibility
   - Automatic provider selection based on repo URL

2. AI Agent Integration
   - New return_raw_data parameter returns structured data
   - Enables AI agents to perform intelligent categorization
   - Removes need for predefined categories in tool
   - Claude/other AI agents can create dynamic categories

3. Dependencies
   - Added PyGithub>=2.1.1 for GitHub API
   - Added python-gitlab>=4.0.0 for GitLab API
   - Version bump to 0.2.0

Breaking changes: None (all new parameters are optional)
Backward compatibility: Maintained (repo_path still works)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…dation

Major improvements based on code review feedback:

1. Auto-detect Previous Tag
   - All providers now auto-detect previous tag if not provided
   - GitHub: Uses get_tags() to find previous version
   - GitLab: Uses tags.list() sorted by date
   - Local: Uses git tag --sort=-version:refname

2. Tag Validation
   - All providers validate tags exist before fetching
   - Clear error messages when tags don't exist
   - Prevents confusing API errors

3. Improved PR/MR Extraction
   - Returns "Not Found" instead of None when PR/MR not found
   - More specific patterns to avoid false positives
   - Only checks first line of commit message
   - Supports common formats: (#123), !123, Merge pull request #

4. Removed Old Categorization Logic
   - Completely removed regex-based categorization
   - Removed all formatting functions
   - Tool now only fetches and returns raw data
   - AI agents handle all intelligent categorization
   - Simplified from 620 lines to 150 lines

5. Comprehensive Unit Tests
   - Created test_git_providers.py with 20+ tests
   - Updated test_release_notes_tool.py for new architecture
   - Tests cover: tag validation, auto-detection, PR extraction
   - Mocked external APIs for fast, reliable tests

Architecture Change:
- MCP Tool: Data fetching only (commits, tags, PRs)
- AI Agent: All intelligence (categorization, formatting)

This separation keeps the tool simple, maintainable, and enables
more powerful AI-driven categorization than regex patterns.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements Option B architecture where the MCP tool includes comprehensive
ai_instructions in its response, ensuring instructions travel with data.

## What Changed

### Tool Response Structure
- Added `ai_instructions` field to response with comprehensive guidance:
  - role: "release_notes_categorizer"
  - task: Clear description of what AI should do
  - guidelines: List of categorization principles
  - categorization_strategy: Step-by-step approach
  - suggested_sections: Always-consider + conditionally-add sections
  - output_format: Markdown structure specification
  - context_understanding: Examples of interpreting commits
  - best_practices: Quality guidelines

### Benefits
✅ Instructions version-controlled with tool (always in sync)
✅ Consistent categorization across all workflows/agents
✅ Self-documenting - AI knows how to use the data
✅ No need to duplicate instructions in each workflow
✅ Tool can be used standalone with any AI agent

### Files Modified
- utility_mcp_server/src/tools/release_notes_tool.py:
  - Added comprehensive ai_instructions to response
  - Updated docstring to document new field
  - Instructions cover categorization strategy, section suggestions, formatting

- README.md:
  - Updated AI Agent Integration section
  - Documented ai_instructions field structure
  - Explained benefits of embedded instructions

- tests/test_release_notes_tool.py:
  - Added assertions to verify ai_instructions presence
  - Tests check for required instruction fields

## Architecture

```
MCP Tool Returns:
{
  "data": {...},           // Raw commits
  "ai_instructions": {...} // How to categorize them
}

AI Agent:
1. Extracts ai_instructions
2. Follows guidelines and strategy
3. Creates dynamic categories
4. Formats per output_format spec
```

This ensures instructions are never out of sync with tool capabilities.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed test failures and pre-commit check failures:

1. Fixed test_git_providers.py mock patches:
   - Changed @patch("utility_mcp_server.src.tools.git_providers.Github")
     to @patch("github.Github")
   - Changed @patch("utility_mcp_server.src.tools.git_providers.gitlab.Gitlab")
     to @patch("gitlab.Gitlab")
   - Changed @patch("utility_mcp_server.src.tools.git_providers.subprocess.run")
     to @patch("subprocess.run")

   Issue: The imports are done inside __init__ methods using
   "from github import Github", so the patch needs to be at the
   module level (github.Github) not at the file level.

2. Applied ruff-format auto-formatting:
   - Reformatted 4 files per ruff-format rules
   - No functional changes, just formatting

All tests should now pass and pre-commit hooks succeed.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…pagination logging

CRITICAL FIX for missing commits when auto-detecting previous version.

## Problem
User reported only getting ~10-20 commits instead of 50+ for feast.dev/feast v0.62.0.

## Root Causes

1. **Tag Sorting Issue (Primary)**
   - Code assumed `get_tags()` returns chronologically sorted tags
   - GitHub/GitLab APIs return tags in arbitrary order (often alphabetical)
   - For repos with patch versions (v0.61.0, v0.61.1, v0.61.9, v0.62.0):
     * If sorted alphabetically: v0.62.0 might pick v0.61.9 instead of v0.61.0
     * Result: Only 10-20 commits between v0.61.9 and v0.62.0 instead of 50+

2. **No Pagination Logging**
   - GitHub compare API has 250 commit limit
   - No visibility into whether all commits were fetched

## Fixes

### 1. Semantic Version Sorting (All Providers)
**GitHubProvider & GitLabProvider:**
- Added `packaging` library for proper semantic version parsing
- Filter tags to only semantic versions (skip non-version tags)
- Sort using `packaging.version.parse()` (handles v0.61.1 vs v0.61.10 correctly)
- Better logging: shows total tags, filtered version tags, auto-detected tag

**LocalGitProvider:**
- Already uses `git tag --sort=-version:refname` (Git's built-in semver sort)
- Added better logging for consistency

### 2. Enhanced Commit Fetching (GitHub)
- Log `comparison.total_commits` to show expected count
- Explicitly convert PaginatedList to list to ensure all pages fetched
- Warn if `total_commits > 250` (GitHub API limit)
- Warn if fetched commits < total_commits (missing commits)
- Better logging with tag range

### 3. Dependencies
- Added `packaging>=23.0` for semantic version parsing

## Impact
- ✅ Correct previous tag detection (v0.61.0 not v0.61.9)
- ✅ All commits fetched (50+ instead of 10-20)
- ✅ Better visibility with logging
- ✅ Works for all semver tag formats (v1.0.0, 1.0.0, etc.)

## Testing
Should now correctly handle feast.dev/feast v0.62.0 and similar repos
with multiple patch versions between releases.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
CRITICAL UX FIX: Prevents confusing "retry works" behavior.

## Problem Reported by User

First attempt: ❌ Error: "Could not auto-detect previous tag"
Second attempt: ✅ Success! Found v0.61.0

Same parameters, different results → confusing UX

## Root Cause

packaging import was INSIDE the method:
```python
def _get_previous_tag(self, current_tag: str):
    try:
        from packaging import version as packaging_version  # ← HERE
        ...
    except Exception as e:
        return None  # Silent failure!
```

**First call:**
- packaging not installed yet
- Import fails
- Exception caught
- Returns None
- Error: "Could not auto-detect previous tag" (doesn't mention packaging!)

**Second call:**
- packaging auto-installed or loaded
- Import succeeds
- Tag detection works

## Fix

Moved import to module level:
```python
# At top of file
from packaging import version as packaging_version
```

**Now:**
- If packaging missing: Clear error on import: "ModuleNotFoundError: No module named 'packaging'"
- No silent failures
- No confusing retry behavior
- Error message is clear and actionable

## Why This is Better

Module-level imports fail FAST and LOUD:
✅ Immediate clear error if dependency missing
✅ No silent failures in try/except
✅ Consistent behavior (no "works on retry")
✅ Standard Python best practice

packaging is required dependency in pyproject.toml, so it should
always be available. Module-level import ensures it.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Implement formatted_output boolean parameter (default: False)
- Add _format_release_notes_markdown() helper function
- Generate pre-formatted markdown with emojis and statistics when enabled
- Use for direct IDE usage (Cursor, VS Code) for immediate readable output
- Keep default False for AI agents to do intelligent categorization
- Add List import for type hint

This fixes the issue where emojis, statistics, and release introduction
were missing when using the tool directly in Cursor IDE.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Implement _categorize_commit() to auto-categorize commits
- Add _format_commit_entry() helper for consistent formatting
- Categorize commits into 10 categories with emojis:
  * ⚠️ Breaking Changes
  * 🔒 Security Updates
  * 🎉 New Features
  * 🐛 Bug Fixes
  * ⚡ Performance Improvements
  * 📚 Documentation
  * 🔄 Refactoring
  * 🧪 Testing
  * 🔧 Chores
  * 📦 Other Changes
- Add category breakdown in release statistics
- Support conventional commits (feat:, fix:, etc.) and keyword-based detection

This fixes the missing subcategories in formatted output when using the
tool directly in IDEs.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant