Is this related to an existing feature request or issue?
Based on the existing AWS Observability Kiro Power, adapted into the agent-plugins marketplace format.
Summary
This RFC proposes a new aws-observability plugin that provides a comprehensive AWS observability platform combining CloudWatch Logs, Metrics, Alarms, Application Signals (APM), CloudTrail security auditing, and automated codebase observability gap analysis. The plugin integrates four MCP servers from AWS Labs and provides eight reference files covering incident response, log analysis, alerting setup, performance monitoring, security auditing, observability gap analysis, Application Signals enablement, and CloudTrail data source selection.
Use case
AI coding agents today lack integrated access to AWS observability tooling. When developers need to troubleshoot production incidents, analyze logs, monitor performance, audit security events, or assess codebase observability gaps, they must manually switch between multiple AWS consoles and tools.
Key use cases:
- Incident response: Quickly triage production incidents by correlating alarms, logs, traces, metrics, and recent changes across CloudWatch, Application Signals, and CloudTrail
- Log analysis: Query CloudWatch Logs using Logs Insights syntax with pattern detection, anomaly analysis, and multi-log-group support
- Performance monitoring: Monitor microservices health via Application Signals APM with SLOs, distributed tracing, and service dependency maps
- Security auditing: Investigate security incidents and perform compliance audits using CloudTrail with a prioritized data source strategy (Lake > CloudWatch Logs > Lookup Events API)
- Alerting setup: Configure intelligent CloudWatch alarms using AWS best-practice recommendations with composite alarms and anomaly detection
- Observability gap analysis: Audit codebases across Python, Java, JavaScript/TypeScript, Go, Ruby, and C#/.NET for missing logging, metrics, tracing, error handling, and health checks
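The prioritized CloudTrail data source strategy from the security auditing use case can be sketched as a simple fallback chain. This is a minimal illustration only: the enum, the function name, and the two availability flags are hypothetical, and the RFC does not specify how availability is detected.

```python
from enum import Enum


class CloudTrailSource(Enum):
    # Declared from most to least preferred, following the order
    # "Lake > CloudWatch Logs > Lookup Events API" given in this RFC.
    LAKE = "CloudTrail Lake"
    CLOUDWATCH_LOGS = "CloudWatch Logs"
    LOOKUP_EVENTS = "Lookup Events API"


def select_source(lake_available: bool, logs_available: bool) -> CloudTrailSource:
    """Return the highest-priority data source that is available.

    Both flags are hypothetical inputs for this sketch.
    """
    if lake_available:
        return CloudTrailSource.LAKE
    if logs_available:
        return CloudTrailSource.CLOUDWATCH_LOGS
    # Last resort in the priority order.
    return CloudTrailSource.LOOKUP_EVENTS
```

An agent following this order would only fall back to the Lookup Events API when neither of the richer sources is configured.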
Proposal
Plugin structure
plugins/aws-observability/
├── .claude-plugin/
│   └── plugin.json                              # Plugin manifest
├── .mcp.json                                    # 4 MCP server definitions
└── skills/
    └── aws-observability/
        ├── SKILL.md                             # Main skill (~155 lines, auto-triggers)
        └── references/
            ├── alerting-setup.md
            ├── application-signals-setup.md
            ├── cloudtrail-data-source-selection.md
            ├── incident-response.md
            ├── log-analysis.md
            ├── observability-gap-analysis.md
            ├── performance-monitoring.md
            └── security-auditing.md
MCP servers
| Server | Type | Purpose |
| --- | --- | --- |
| awslabs.cloudwatch-mcp-server | stdio | CloudWatch Logs, Metrics, Alarms, log group analysis |
| awslabs.cloudwatch-applicationsignals-mcp-server | stdio | Application Signals APM, SLOs, distributed tracing |
| awslabs.cloudtrail-mcp-server | stdio | CloudTrail security auditing, API activity tracking |
| awslabs.aws-documentation-mcp-server | stdio | Official AWS documentation search and access |
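As a rough illustration of what the four server definitions might look like, the sketch below generates a dictionary in the common `mcpServers` layout and prints it as JSON. The layout, the `AWS_PROFILE` and `AWS_REGION` variable names, and the `build_mcp_config` helper are assumptions for this example; the actual `.mcp.json` shipped with the plugin may differ.

```python
import json

# Package names are taken from the MCP servers table in this RFC.
SERVERS = [
    "awslabs.cloudwatch-mcp-server",
    "awslabs.cloudwatch-applicationsignals-mcp-server",
    "awslabs.cloudtrail-mcp-server",
    "awslabs.aws-documentation-mcp-server",
]


def build_mcp_config(profile: str = "default", region: str = "us-east-1") -> dict:
    """Return a dict shaped like an .mcp.json file (assumed layout)."""
    return {
        "mcpServers": {
            name: {
                "type": "stdio",
                "command": "uvx",             # servers are uvx-installable from PyPI
                "args": [f"{name}@latest"],   # @latest as described in this RFC
                # Assumed environment variable names.
                "env": {"AWS_PROFILE": profile, "AWS_REGION": region},
            }
            for name in SERVERS
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(), indent=2))
```

For simplicity the sketch gives every server the same environment, although a documentation-only server may not need AWS credentials at all.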
Skill design
The SKILL.md follows progressive disclosure:
- Initial load (~155 lines): Prerequisites, configuration, capability overview, reference file index with load conditions, quick start examples, essential log query patterns, and best practices
- On-demand references (8 files): Loaded only when the agent needs deep domain knowledge for a specific workflow (e.g., incident response, security auditing)
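The progressive-disclosure idea can be pictured as lazy loading: nothing under `references/` is read until a workflow asks for it. The sketch below is a hypothetical model of that behavior, not how any agent runtime actually implements it; the `SkillContext` class and its methods are invented for illustration, while the file names come from the plugin structure in this RFC.

```python
from pathlib import Path

# Reference file stems, taken from the plugin structure in this RFC.
REFERENCE_NAMES = [
    "alerting-setup",
    "application-signals-setup",
    "cloudtrail-data-source-selection",
    "incident-response",
    "log-analysis",
    "observability-gap-analysis",
    "performance-monitoring",
    "security-auditing",
]


class SkillContext:
    """Hypothetical holder for reference content loaded so far."""

    def __init__(self, skill_dir: Path):
        self.skill_dir = skill_dir
        self.loaded: dict[str, str] = {}  # reference name -> file content

    def load(self, name: str) -> str:
        """Read a reference file the first time it is requested, then cache it."""
        if name not in REFERENCE_NAMES:
            raise KeyError(f"unknown reference: {name}")
        if name not in self.loaded:
            path = self.skill_dir / "references" / f"{name}.md"
            self.loaded[name] = path.read_text(encoding="utf-8")
        return self.loaded[name]
```

Under this model an incident-response session would pull in only `incident-response.md`, leaving the other seven files out of the context.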
User experience
Before: Users must manually navigate AWS Console, run CLI commands, and context-switch between CloudWatch, X-Ray, CloudTrail, and documentation.
After: Users describe their intent naturally (e.g., "investigate the high error rate on my API", "audit my CloudTrail for IAM changes", "check my codebase for observability gaps") and the agent auto-triggers the aws-observability skill, loads relevant references, and uses the MCP servers to execute the workflow.
Prerequisites
- AWS CLI configured with credentials
- Python 3.10+ and uv installed
- Required IAM permissions: cloudwatch:*, logs:*, xray:*, cloudtrail:*, application-signals:*, synthetics:Get*, s3:GetObject, s3:ListBucket, iam:Get*
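For readers who want to turn the permission list into an IAM policy document, a minimal sketch follows. The actions are copied from the list above; the single `Allow` statement and `"Resource": "*"` are assumptions made for brevity, and the `build_policy` helper is hypothetical.

```python
import json

# Actions copied from the prerequisites list in this RFC.
ACTIONS = [
    "cloudwatch:*",
    "logs:*",
    "xray:*",
    "cloudtrail:*",
    "application-signals:*",
    "synthetics:Get*",
    "s3:GetObject",
    "s3:ListBucket",
    "iam:Get*",
]


def build_policy(actions=ACTIONS) -> dict:
    """Return an IAM policy document granting the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: not scoped to specific resources
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```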
Out of scope
- AWS resource provisioning or modification: This plugin is read-only for observability data; it does not create, modify, or delete AWS resources
- Custom dashboard creation: The plugin queries data but does not create CloudWatch Dashboards or other persistent UI artifacts
- Automated remediation: The plugin identifies issues and provides recommendations but does not automatically fix them
- Non-AWS observability platforms: Integration with Datadog, Splunk, Grafana, or other third-party monitoring tools
- Cost Explorer integration: While referenced in some workflows, Cost Explorer MCP server integration is not included in this initial version
Potential challenges
- IAM permissions breadth: The plugin requires broad permissions across CloudWatch, X-Ray, CloudTrail, Application Signals, and S3. Users with restricted IAM policies may encounter partial functionality. Mitigation: Clear prerequisites documentation and graceful error handling guidance in reference files.
- Reference file size: Some reference files (security-auditing.md, performance-monitoring.md, incident-response.md) exceed the 100-line guideline from DESIGN_GUIDELINES.md due to the breadth of query patterns and workflows. Mitigation: Content is organized with clear headings for selective loading; the SKILL.md itself stays well under 300 lines.
- MCP server availability: All four MCP servers are published on PyPI as uvx-installable packages. If any server introduces breaking changes, the plugin may need updates. Mitigation: The server definitions reference @latest so that new releases are picked up automatically.
- Region and profile configuration: The default configuration uses the default AWS profile and the us-east-1 region. Users must manually update the env vars in .mcp.json to use a different profile or region. Mitigation: The configuration section in SKILL.md provides clear instructions.
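Switching profile or region amounts to rewriting the `env` block of each server entry. The helper below is one way to do that on a parsed `.mcp.json`; it assumes the file uses a top-level `mcpServers` mapping and the `AWS_PROFILE` and `AWS_REGION` variable names, neither of which is confirmed by this RFC, and `with_aws_env` is a hypothetical name.

```python
import copy


def with_aws_env(config: dict, profile: str, region: str) -> dict:
    """Return a copy of a parsed .mcp.json with profile and region replaced.

    The input dict is left unchanged.
    """
    updated = copy.deepcopy(config)
    for server in updated.get("mcpServers", {}).values():
        env = server.setdefault("env", {})
        env["AWS_PROFILE"] = profile  # assumed variable name
        env["AWS_REGION"] = region    # assumed variable name
    return updated
```

A caller would typically read the file with `json.load`, pass the result through `with_aws_env`, and write it back with `json.dump`.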
Dependencies and Integrations
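Dependencies (all from AWS Labs):
- awslabs.cloudwatch-mcp-server
- awslabs.cloudwatch-applicationsignals-mcp-server
- awslabs.cloudtrail-mcp-server
- awslabs.aws-documentation-mcp-server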
Integration with existing plugins:
- Complements deploy-on-aws by providing post-deployment monitoring and troubleshooting capabilities
- The CloudTrail security auditing capability pairs well with infrastructure changes made via the deploy plugin
Alternative solutions
- Individual MCP server setup without a plugin: Users could manually configure each MCP server and write their own prompts. The plugin adds value through curated skill descriptions, progressive-disclosure reference files, and pre-built workflow patterns that guide the agent through complex multi-tool observability tasks.
- Separate plugins per capability: The plugin could be split into aws-cloudwatch, aws-application-signals, and aws-cloudtrail plugins. However, observability workflows frequently span multiple tools (e.g., incident response correlates alarms + logs + traces + CloudTrail changes), making a unified plugin more effective.