Conversation

@blairhyy-amazon (Contributor) commented Oct 16, 2025

Fixes

Summary

Changes

This PR implements interactive batch processing for service auditing in the CloudWatch Application Signals MCP server. The changes introduce a new workflow that automatically processes large service lists in manageable batches, allowing users to interactively decide whether to investigate findings or continue processing.

Key additions:

  • New batch processing utilities (batch_processing_utils.py) - Core logic for creating, managing, and processing audit batches (a rough sketch follows this list)
  • New batch tools (batch_tools.py) - MCP tools for continuing batch processing
  • Enhanced audit_services function - Integrated batch processing for large service lists (>5 services)
  • Interactive workflow - When findings are discovered, users can choose to investigate specific issues or continue processing
  • Memory management - Session cleanup functionality to free resources
  • Comprehensive test coverage - Full test suites for both batch processing utilities and tools
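
For illustration, here is a minimal sketch of the batching idea behind batch_processing_utils.py. The helper name create_batch_session and most session fields are assumptions; _batch_sessions, the created_at timestamp, and the uuid-based session id match the snippets quoted later in the review, and the batch size of 5 comes from this description.

```python
import uuid
from datetime import datetime, timezone

BATCH_SIZE = 5                           # the PR audits large service lists in batches of 5
_batch_sessions: dict[str, dict] = {}    # in-memory session store keyed by session id

def create_batch_session(services: list[str]) -> str:
    """Split a service list into batches and register a session (hypothetical helper)."""
    batches = [services[i:i + BATCH_SIZE] for i in range(0, len(services), BATCH_SIZE)]
    session_id = str(uuid.uuid4())
    _batch_sessions[session_id] = {
        'created_at': datetime.now(timezone.utc).isoformat(),
        'batches': batches,
        'next_batch_index': 0,
    }
    return session_id
```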

User experience

Before this change:

  • Users auditing many services (>5) had to wait a long time for the audit to finish and then received all of the results at once, which was overwhelming
  • No way to process services incrementally or interactively

After this change:

  • Batch processing: Large service lists are processed in batches of 5
  • Interactive decision making: When findings are discovered in a batch, users are presented with options to either investigate specific findings or continue processing the next batch (see the sketch after this list)
  • Flexible workflow: Users maintain control over the audit process while benefiting from automated batching
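
Roughly, the interactive loop looks like the following from the calling agent's side. This is only a sketch: call_tool and ask_user are stand-ins for the MCP client and the user prompt, and continue_batch_audit, along with the parameter and field names, is a placeholder, not necessarily what batch_tools.py exposes.

```python
from typing import Any

# Stand-ins for the real MCP client and interactive prompt; illustrative only.
def call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    raise NotImplementedError('placeholder for an MCP tool call')

def ask_user(prompt: str) -> str:
    raise NotImplementedError('placeholder for an interactive prompt')

def run_interactive_audit(all_services: list[str]) -> None:
    # The first call audits the initial batch of 5 and returns a session id if more remain.
    result = call_tool('audit_services', {'services': all_services})
    while result.get('session_id') and result.get('has_more_batches'):
        if result.get('findings'):
            choice = ask_user('Investigate these findings now, or continue with the next batch?')
            if choice == 'investigate':
                break  # hand off to the targeted investigation tools
        result = call_tool('continue_batch_audit', {'session_id': result['session_id']})
```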

Checklist

If an item doesn't apply to your change, leave it unchecked.

  • I have reviewed the contributing guidelines
  • I have performed a self-review of this change
  • Changes have been tested
  • Changes are documented

Is this a breaking change? (Y/N)

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

codecov bot commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 98.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.51%. Comparing base (620db3f) to head (0fdd7a0).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...awslabs/cloudwatch_appsignals_mcp_server/server.py | 89.47% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1530      +/-   ##
==========================================
+ Coverage   89.46%   89.51%   +0.05%     
==========================================
  Files         724      726       +2     
  Lines       50966    51082     +116     
  Branches     8145     8162      +17     
==========================================
+ Hits        45596    45727     +131     
+ Misses       3459     3446      -13     
+ Partials     1911     1909       -2     

☔ View full report in Codecov by Sentry.

@blairhyy-amazon force-pushed the 2nd_batch branch 4 times, most recently from a84f673 to f735675 (October 20, 2025, 19:17)
@yiyuan-he (Contributor) left a comment:

LGTM

sessions_by_age = sorted(_batch_sessions.items(), key=lambda x: x[1].get('created_at', ''))

# Remove oldest sessions until we're under the limit
excess_count = len(_batch_sessions) - MAX_BATCH_SESSIONS
A contributor commented:

It looks like the max session limit is set to 1 by default. Does that mean the MCP server doesn't support more than one session at the same time?

excess_count = len(_batch_sessions) - MAX_BATCH_SESSIONS
for i in range(excess_count):
    session_id, _ = sessions_by_age[i]
    del _batch_sessions[session_id]
A contributor commented:

is there a race condition risk here?
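
In general, if tool calls can run concurrently (for example in threads), eviction like the snippet above can be guarded with a lock so the sort, count, and delete happen atomically. A sketch under that assumption follows; the store and limit names are copied from the snippet, while the lock is hypothetical and not part of the PR, and it may be unnecessary if the server handles requests on a single event loop.

```python
import threading

_batch_sessions: dict[str, dict] = {}   # in-memory session store, as in the snippet above
MAX_BATCH_SESSIONS = 1                  # default limit mentioned in this thread
_sessions_lock = threading.Lock()       # hypothetical guard; not part of the PR

def _evict_oldest_sessions() -> None:
    """Evict the oldest sessions while holding the lock so concurrent calls cannot interleave."""
    with _sessions_lock:
        sessions_by_age = sorted(_batch_sessions.items(), key=lambda x: x[1].get('created_at', ''))
        excess_count = len(_batch_sessions) - MAX_BATCH_SESSIONS
        for i in range(excess_count):
            session_id, _ = sessions_by_age[i]
            del _batch_sessions[session_id]
```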

    Returns:
        Session ID for tracking the batch processing
    """
    session_id = str(uuid.uuid4())
@mxiamxia (Contributor) commented on Oct 24, 2025:

Ideally, the MCP server should be stateless. It also looks like you're generating a random session id as the cache key on each call. Does the cache actually work?
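
In general, a cache keyed by a freshly generated uuid only works if that id is returned in the tool result and echoed back by the client on the follow-up call; otherwise each call just creates an orphaned entry. A minimal sketch of that round trip, with function names and fields that are illustrative only, not the PR's actual API:

```python
import uuid

_batch_sessions: dict[str, dict] = {}  # same kind of in-memory store as in the snippets above
BATCH_SIZE = 5

def audit_services_batched(services: list[str]) -> dict:
    """Start a batched audit and hand the session id back to the caller (illustrative)."""
    session_id = str(uuid.uuid4())
    _batch_sessions[session_id] = {'remaining': services[BATCH_SIZE:]}
    # The id must go back to the client; it is the only handle to the cached state.
    return {'session_id': session_id, 'audited': services[:BATCH_SIZE],
            'has_more_batches': bool(services[BATCH_SIZE:])}

def continue_batch_audit(session_id: str) -> dict:
    """Resume a session using the id the client echoes back (illustrative)."""
    session = _batch_sessions.get(session_id)
    if session is None:
        return {'error': 'Unknown or expired session id; start a new audit.'}
    next_batch = session['remaining'][:BATCH_SIZE]
    session['remaining'] = session['remaining'][BATCH_SIZE:]
    return {'session_id': session_id, 'audited': next_batch,
            'has_more_batches': bool(session['remaining'])}
```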
