Add smus-upgrade-agent #32
Conversation
spark-upgrade-agent/POWER.md (Outdated)
| "I want to upgrade my PySpark application from Spark 3.3 to 3.5. | ||
| The project is located at /path/to/my/spark/project. | ||
| Can you help me create an upgrade plan?" |
Can we use the AWS Docs example? We are missing the EMR on EC2 / EMR Serverless application information. You can also use the example from https://github.com/awslabs/mcp/tree/main/src/sagemaker-unified-studio-spark-upgrade-mcp-server.
Update all the prompts and usage mentioned in the above doc.
+1. Please explicitly mention the source/target EMR version (not the Spark version).
I copied almost everything from the link that @naikvaib provided.
Tbh @yuxiaorun, it's not obvious that users should think about it as EMR versions rather than Spark versions - the name is literally "Apache Spark upgrade agent", not "AWS EMR upgrade agent". So I think we should do some work to unify that on our end and simplify it for customers instead of expecting them to know that caveat. I think we should make the agent smart enough to disambiguate between Spark and EMR versions for the customer.
Lmk what you think
I also added a bunch of references and quotes from public docs that I thought would best steer Kiro here
I think we can name it "Apache Spark upgrade agent for Amazon EMR" (same as the blog)? We currently only support EMR as our validation platform, and the EMR version implicitly tells us the Spark/Java/Python version.
+1 on EMR since it's in the public docs title.
Note that the Kiro dev team POC requested that we keep the format of the other powers, which focus on the concept of a "power" - which (I think) means something that Kiro can do. For example, "Deploy[ing] web apps with Netlify" is a power. So I went with "Upgrade Spark applications on Amazon EMR".
spark-upgrade-agent/POWER.md (Outdated)
> 1. **Planning**: Analyzes project structure and generates upgrade plans
> 2. **Compile and Build**: Updates build environment, dependencies, and fixes build failures
> 3. **Spark Code Edit Tools**: Applies targeted code updates for version compatibility
> 4. **Execute & Validation**: Submits remote validation jobs to EMR and monitors execution
Can we mention EMR on EC2 and EMR Serverless explicitly?
The AWS Labs and public docs don't mention this explicitly here. I think it will be inferred from the rest of the doc.
This change adds a Kiro Power for the Apache Spark upgrade agent - an MCP server with tools that assist customers with Spark application upgrades on Amazon EMR on EC2 and EMR Serverless.
I tested this PR by:
1. navigating to the Power tab in the Kiro IDE
2. clicking "Add Custom Power"
3. selecting the "spark-upgrade-agent" directory
4. clicking the power in the "installed" powers list to open the power's landing page
5. clicking "Try power", which opens a dialogue with Kiro to configure and enable the power

Here is the dialogue:
Me: I just installed the spark-upgrade-agent power and want to use it.

Kiro: I'll help you get started with the spark-upgrade-agent power! Let me activate it first to understand its capabilities.

[Activated Kiro power spark-upgrade-agent]

Great! I've activated the spark-upgrade-agent power. Let me give you an overview and walk through the onboarding process.

What This Power Does

The Spark Upgrade Agent accelerates Apache Spark version upgrades for EMR applications. Instead of spending months manually updating code, it uses AI to:

- Automatically transform code for Spark version compatibility (PySpark & Scala)
- Resolve dependency conflicts and update build configurations
- Submit validation jobs to EMR to test your upgraded code
- Validate data quality to ensure output correctness
- Support EMR EC2 and EMR Serverless platforms

It's a conversational tool - you describe what you want to upgrade, and it handles the heavy lifting.

Let's Go Through Onboarding

Step 1: Validate Prerequisites

Let me check what you have installed:

Command: aws --version && echo "---" && python3 --version && echo "---" && uv --version && echo "---" && aws sts get-caller-identity && echo "---" && echo "AWS_PROFILE: ${AWS_PROFILE:-default}"

    aws-cli/2.28.4 Python/3.13.4 Linux/5.10.245-243.979.amzn2int.x86_64 exe/x86_64.amzn.2
    ---
    Python 3.12.6
    ---
    uv 0.6.17
    ---
    An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid

Good news - you have AWS CLI, Python 3.12, and uv installed! However, your AWS credentials aren't configured properly. Let me help you set that up.

Current Status:
- ✅ AWS CLI: v2.28.4
- ✅ Python: 3.12.6 (meets 3.10+ requirement)
- ✅ uv: 0.6.17
- ❌ AWS Credentials: Not configured

Step 2: Configure AWS Credentials

Before we can deploy the CloudFormation stack, you need to configure your AWS credentials. You have a few options:

Configure AWS CLI credentials:

    aws configure

You'll need your AWS Access Key ID, Secret Access Key, and preferred region.

Or use environment variables:

    export AWS_ACCESS_KEY_ID=your_access_key
    export AWS_SECRET_ACCESS_KEY=your_secret_key
    export AWS_DEFAULT_REGION=us-east-1

Or if you're using IAM roles (on EC2/ECS), they should be automatically available.

Once you've configured your credentials, run:

    aws sts get-caller-identity

This should return your AWS account details.

Step 3: Deploy CloudFormation Stack

After your credentials are working, you'll need to:

1. Log into the AWS Console
2. Navigate to the Spark Troubleshooting Agent Setup Page
3. Deploy the CloudFormation stack with these key parameters:
   - SparkUpgradeIAMRoleName: Name for the IAM role
   - EnableEMREC2: true (if using EMR on EC2)
   - EnableEMRServerless: true (if using EMR Serverless)
   - StagingBucketPath: S3 path for artifacts (or leave empty to auto-generate)
4. After deployment completes, note the Role ARN and Region from the stack outputs.

Step 4: Configure AWS CLI Profile

Once you have the Role ARN from the CloudFormation stack, let me know and I'll help you configure the spark-upgrade-profile and update your MCP configuration.

Simple Example (After Setup)

Once everything is configured, here's how you'd use it:

"I want to upgrade my PySpark application from Spark 3.3 to 3.5. The project is located at /home/myuser/spark-projects/etl-pipeline. Can you help me create an upgrade plan?"

The agent would then:

1. Analyze your project structure
2. Identify compatibility issues
3. Generate an upgrade plan
4. Apply automated transformations (with your approval)
5. Update dependencies
6. Submit validation jobs to EMR
7. Validate data quality

Next Steps:

1. Configure your AWS credentials
2. Deploy the CloudFormation stack
3. Share the Role ARN and Region with me
4. I'll configure the MCP profile for you

Let me know when you're ready to proceed or if you need help with any of these steps!
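(Aside: Step 3 above walks through the console, but the same deployment can be scripted. Below is a minimal, hedged boto3 sketch - the stack name, template URL, and output keys are placeholders I made up for illustration; in practice you would use the template linked from the EMR setup page and the parameter values shown in the transcript.)

```python
# Sketch only: deploy the Spark Upgrade Agent setup stack with boto3.
# StackName and TemplateURL are hypothetical placeholders.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

cfn.create_stack(
    StackName="spark-upgrade-agent-setup",  # assumed name
    TemplateURL="https://example.com/spark-upgrade-agent.yaml",  # placeholder; use the official template
    Parameters=[
        {"ParameterKey": "SparkUpgradeIAMRoleName", "ParameterValue": "spark_upgrade_role"},
        {"ParameterKey": "EnableEMREC2", "ParameterValue": "true"},
        {"ParameterKey": "EnableEMRServerless", "ParameterValue": "true"},
        {"ParameterKey": "StagingBucketPath", "ParameterValue": ""},  # empty = auto-generate
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],  # the stack creates a named IAM role
)

# Wait for creation, then read the Role ARN and Region from the outputs.
cfn.get_waiter("stack_create_complete").wait(StackName="spark-upgrade-agent-setup")
outputs = cfn.describe_stacks(StackName="spark-upgrade-agent-setup")["Stacks"][0]["Outputs"]
print({o["OutputKey"]: o["OutputValue"] for o in outputs})
```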
Me: please try the aws creds call again

Kiro: I'll check your AWS credentials now.

Command: aws sts get-caller-identity

    {
      "UserId": "AROA2PY2DM4KYKVBOBDYA:arusignu-Isengard",
      "Account": "721072383765",
      "Arn": "arn:aws:sts::721072383765:assumed-role/Admin/arusignu-Isengard"
    }

Great! Your AWS credentials are configured and working. You're authenticated as an Admin role in account 721072383765. Now let me check your current AWS profile:

Command: echo ${AWS_PROFILE:-default}

    arusignu

Perfect! You're using the arusignu profile. Here's what we've confirmed so far:

Prerequisites Status:
- ✅ AWS CLI: Installed and configured
- ✅ AWS Credentials: Working (profile: arusignu, account: 721072383765)
- ✅ Python 3.10+: Need to verify
- ✅ uv package manager: Need to verify

Let me check the remaining prerequisites:

Command: python3 --version && uv --version

    Python 3.12.6
    uv 0.6.17

Excellent! All prerequisites are met:

Prerequisites Status:
- ✅ AWS CLI: Installed
- ✅ AWS Credentials: Configured (profile: arusignu)
- ✅ Python 3.12.6: Installed
- ✅ uv 0.6.17: Installed

Next Steps: You now need to deploy the CloudFormation stack to create the IAM role with proper permissions for the Spark Upgrade Agent. Here's what to do:

Deploy the CloudFormation Stack:

1. Log into the AWS Console with your current role
2. Navigate to: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/spark-troubleshooting-agent-setup.html#spark-troubleshooting-agent-setup-resources
3. Deploy the stack to your desired region
4. Configure the parameters (especially SparkUpgradeIAMRoleName, EnableEMREC2, EnableEMRServerless)
5. Wait for deployment to complete

After deployment, provide me with:
- The region where you deployed the stack
- The IAM role ARN from the CloudFormation outputs

Once you have those, I'll configure the AWS CLI profile and update your MCP configuration automatically. Would you like to proceed with the CloudFormation deployment, or do you already have a role configured with sagemaker-unified-studio-mcp permissions?

Me: I deployed the stack. The role is `arn:aws:iam::721072383765:role/spark_upgrade_role` and the region is `us-east-1`

Kiro: Perfect! Let me configure everything for you now.

Step 3: Configure AWS CLI Profile

I'll set up the spark-upgrade-profile with your role:

Command: aws configure set profile.spark-upgrade-profile.source_profile arusignu && aws configure set profile.spark-upgrade-profile.role_arn arn:aws:iam::721072383765:role/spark_upgrade_role && aws configure set profile.spark-upgrade-profile.region us-east-1 && echo "Profile configured successfully"

    Profile configured successfully

Great! Now let me verify the profile works:

Command: AWS_PROFILE=spark-upgrade-profile aws sts get-caller-identity

    {
      "UserId": "AROA2PY2DM4KT7OFDXN33:botocore-session-1765998498",
      "Account": "721072383765",
      "Arn": "arn:aws:sts::721072383765:assumed-role/spark_upgrade_role/botocore-session-1765998498"
    }

Excellent! The profile is working correctly.
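(Aside: the same identity check can be done from Python. A minimal sketch, assuming the spark-upgrade-profile created above exists in your local AWS config:)

```python
# Verify the assume-role profile works, mirroring the
# `AWS_PROFILE=spark-upgrade-profile aws sts get-caller-identity` call above.
import boto3

session = boto3.Session(profile_name="spark-upgrade-profile")
identity = session.client("sts").get_caller_identity()

# Expect an ARN like arn:aws:sts::<account>:assumed-role/spark_upgrade_role/...
print(identity["Account"], identity["Arn"])
```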
Now let me update your MCP configuration:

Command: cat ~/.kiro/settings/mcp.json

    {
      "mcpServers": {
        "builder-mcp": {
          "command": "/home/arusignu/.toolbox/bin/builder-mcp",
          "args": [],
          "env": {},
          "disabled": false,
          "autoApprove": []
        }
      },
      "powers": {
        "mcpServers": {
          "power-postman-postman": {
            "url": "https://mcp.postman.com/minimal",
            "headers": {
              "Authorization": "Bearer ${POSTMAN_API_KEY}"
            }
          },
          "power-spark-troubleshooting-agent-sagemaker-unified-studio-mcp-troubleshooting": {
            "type": "stdio",
            "command": "uvx",
            "args": [
              "mcp-proxy-for-aws@latest",
              "https://sagemaker-unified-studio-mcp.us-east-1.api.aws/spark-troubleshooting/mcp",
              "--service", "sagemaker-unified-studio-mcp",
              "--profile", "smus-mcp-profile",
              "--region", "us-east-1",
              "--read-timeout", "180"
            ],
            "timeout": 180000,
            "disabled": false
          },
          "power-spark-troubleshooting-agent-sagemaker-unified-studio-mcp-code-rec": {
            "type": "stdio",
            "command": "uvx",
            "args": [
              "mcp-proxy-for-aws@latest",
              "https://sagemaker-unified-studio-mcp.us-east-1.api.aws/spark-code-recommendation/mcp",
              "--service", "sagemaker-unified-studio-mcp",
              "--profile", "smus-mcp-profile",
              "--region", "us-east-1",
              "--read-timeout", "180"
            ],
            "timeout": 180000,
            "disabled": false
          },
          "power-spark-upgrade-agent-spark-upgrade": {
            "type": "stdio",
            "command": "uvx",
            "args": [
              "mcp-proxy-for-aws@latest",
              "https://sagemaker-unified-studio-mcp.{AWS_REGION}.api.aws/spark-upgrade/mcp",
              "--service", "sagemaker-unified-studio-mcp",
              "--profile", "spark-upgrade-profile",
              "--region", "{AWS_REGION}",
              "--read-timeout", "180"
            ],
            "timeout": 180000,
            "disabled": false
          }
        }
      }
    }

I can see the spark-upgrade-agent is already configured in your MCP settings, but it has placeholder values {AWS_REGION}. Let me update it with the correct region:

Command: cat ~/.kiro/settings/mcp.json | python3 -c "import sys, json; data = json.load(sys.stdin); data['powers']['mcpServers']['power-spark-upgrade-agent-spark-upgrade']['args'] = [arg.replace('{AWS_REGION}', 'us-east-1') for arg in data['powers']['mcpServers']['power-spark-upgrade-agent-spark-upgrade']['args']]; print(json.dumps(data, indent=2))" > ~/.kiro/settings/mcp.json.tmp && mv ~/.kiro/settings/mcp.json.tmp ~/.kiro/settings/mcp.json && echo "MCP configuration updated successfully"

    MCP configuration updated successfully

Perfect! Let me verify the configuration:

Command: cat ~/.kiro/settings/mcp.json | grep -
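(Aside: for readability, here is the {AWS_REGION} substitution from the one-liner above as a small standalone script. It is a sketch that assumes the mcp.json layout shown above; it rewrites the file in place, so keep a backup.)

```python
# Replace the {AWS_REGION} placeholder in the spark-upgrade server entry of
# ~/.kiro/settings/mcp.json, equivalent to the python3 -c one-liner above.
import json
from pathlib import Path

path = Path.home() / ".kiro" / "settings" / "mcp.json"
data = json.loads(path.read_text())

server = data["powers"]["mcpServers"]["power-spark-upgrade-agent-spark-upgrade"]
server["args"] = [arg.replace("{AWS_REGION}", "us-east-1") for arg in server["args"]]

path.write_text(json.dumps(data, indent=2))
print("MCP configuration updated successfully")
```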