A Model Context Protocol (MCP) server that provides AI assistants with comprehensive access to AWS HealthOmics services for genomic workflow management, execution, and analysis.
AWS HealthOmics is a purpose-built service for storing, querying, and analyzing genomic, transcriptomic, and other omics data. This MCP server enables AI assistants to interact with HealthOmics workflows through natural language, making genomic data analysis more accessible and efficient.
This MCP server provides tools for:
- Create and validate workflows: Support for WDL, CWL, and Nextflow workflow languages
- Lint workflow definitions: Validate WDL and CWL workflows using industry-standard linting tools
- Version management: Create and manage workflow versions with different configurations
- Package workflows: Bundle workflow definitions into deployable packages
- Start and monitor runs: Execute workflows with custom parameters and monitor progress
- Task management: Track individual workflow tasks and their execution status
- Resource configuration: Configure compute resources, storage, and caching options
- Performance analysis: Analyze workflow execution performance and resource utilization
- Failure diagnosis: Comprehensive troubleshooting tools for failed workflow runs
- Log access: Retrieve detailed logs from runs, engines, tasks, and manifests
- Genomics file search: Intelligent discovery of genomics files across S3 buckets, HealthOmics sequence stores, and reference stores
- Pattern matching: Advanced search with fuzzy matching against file paths and object tags
- File associations: Automatic detection and grouping of related files (BAM/BAI indexes, FASTQ pairs, FASTA indexes)
- Relevance scoring: Smart ranking of search results based on match quality and file relationships
- Multi-region support: Get information about AWS regions where HealthOmics is available
- ListAHOWorkflows - List available HealthOmics workflows with pagination support
- CreateAHOWorkflow - Create new workflows with WDL, CWL, or Nextflow definitions from local ZIP files, S3 URIs, or base64-encoded content, with optional container registry mappings
- GetAHOWorkflow - Retrieve detailed workflow information and export definitions
- CreateAHOWorkflowVersion - Create new versions of existing workflows from local ZIP files, S3 URIs, or base64-encoded content, with optional container registry mappings
- ListAHOWorkflowVersions - List all versions of a specific workflow
- LintAHOWorkflowDefinition - Lint single WDL or CWL workflow files using miniwdl and cwltool, accepting local file paths, S3 URIs, or inline content
- LintAHOWorkflowBundle - Lint multi-file WDL or CWL workflow bundles with import/dependency support, accepting local directories, ZIP files, S3 prefixes, or inline dictionaries
- PackageAHOWorkflow - Package workflow files into base64-encoded ZIP format, accepting local file paths, S3 URIs, or inline content
- StartAHORun - Start workflow runs with custom parameters, resource configuration, and optional VPC networking mode with a named configuration
- ListAHORuns - List workflow runs with filtering by status and date ranges
- GetAHORun - Retrieve detailed run information including status and metadata
- ListAHORunTasks - List tasks for specific runs with status filtering
- GetAHORunTask - Get detailed information about specific workflow tasks
- AnalyzeAHORunPerformance - Analyze workflow run performance and resource utilization
- DiagnoseAHORunFailure - Comprehensive diagnosis of failed workflow runs with remediation suggestions
- GetAHORunLogs - Access high-level workflow execution logs and events
- GetAHORunEngineLogs - Retrieve workflow engine logs (STDOUT/STDERR) for debugging
- GetAHORunManifestLogs - Access run manifest logs with runtime information and metrics
- GetAHOTaskLogs - Get task-specific logs for debugging individual workflow steps
- SearchGenomicsFiles - Intelligent search for genomics files across S3 buckets, HealthOmics sequence stores, and reference stores with pattern matching, file association detection, and relevance scoring
- CreateAHORunGroup - Create a new run group with optional resource limits (maxCpus, maxGpus, maxDuration, maxRuns) and tags
- GetAHORunGroup - Retrieve detailed information about a specific run group
- ListAHORunGroups - List available run groups with optional name filtering and pagination
- UpdateAHORunGroup - Update an existing run group's name or resource limits
- CreateAHORunCache - Create a new run cache with a cache behavior (CACHE_ALWAYS or CACHE_ON_FAILURE), S3 URI for cache storage, and optional name, description, tags, and cross-account bucket owner ID
- GetAHORunCache - Retrieve detailed information about a specific run cache including configuration, status, and metadata
- ListAHORunCaches - List available run caches with optional filtering by name, status, or cache behavior, with pagination support
- UpdateAHORunCache - Update an existing run cache's cache behavior, name, or description
- CreateAHOSequenceStore - Create a new sequence store with optional encryption, description, fallback location, and tags
- ListAHOSequenceStores - List sequence stores with optional name filtering and pagination
- GetAHOSequenceStore - Get detailed information about a specific sequence store
- UpdateAHOSequenceStore - Update a sequence store's name, description, or fallback location (manages ETags internally)
- ListAHOReadSets - List read sets in a sequence store with filtering by sample ID, subject ID, reference ARN, status, file type, and date range
- GetAHOReadSetMetadata - Get detailed metadata for a specific read set including sequence information and file details
- StartAHOReadSetImportJob - Import genomic files from S3 into a sequence store with batch support
- GetAHOReadSetImportJob - Get status and details of a read set import job including per-source statuses
- ListAHOReadSetImportJobs - List import jobs for a sequence store with pagination
- StartAHOReadSetExportJob - Export read sets from a sequence store to S3 with batch support
- GetAHOReadSetExportJob - Get status and details of a read set export job
- ListAHOReadSetExportJobs - List export jobs for a sequence store with pagination
- ActivateAHOReadSets - Activate archived read sets for analysis access
- ListAHOReferenceStores - List reference stores with optional name filtering and pagination
- GetAHOReferenceStore - Get detailed information about a specific reference store
- ListAHOReferences - List references in a reference store with optional name and status filtering
- GetAHOReferenceMetadata - Get detailed metadata for a specific reference including file information
- StartAHOReferenceImportJob - Import reference files from S3 into a reference store with batch support
- GetAHOReferenceImportJob - Get status and details of a reference import job including per-source statuses
- ListAHOReferenceImportJobs - List import jobs for a reference store with pagination
- CreateAHOConfiguration - Create a new HealthOmics configuration for workflow runs with optional run settings, description, and tags
- GetAHOConfiguration - Get detailed information about a specific configuration including run settings and status
- ListAHOConfigurations - List available configurations with pagination support
- DeleteAHOConfiguration - Delete a configuration
- GetAHOSupportedRegions - List AWS regions where HealthOmics is available
This MCP server enables AI assistants like Kiro, Cline, Cursor, and Windsurf to help users with AWS HealthOmics genomic workflow management. Here's how to effectively use these tools:
AWS HealthOmics is designed for genomic data analysis workflows. Key concepts:
- Workflows: Computational pipelines written in WDL, CWL, or Nextflow that process genomic data
- Runs: Executions of workflows with specific input parameters and data
- Tasks: Individual steps within a workflow run
- Storage Types: STATIC (fixed storage) or DYNAMIC (auto-scaling storage)
- Creating Workflows:
  - From local files: Use `PackageAHOWorkflow` to bundle workflow files, then use the base64-encoded ZIP with `CreateAHOWorkflow`
  - From S3: Store your workflow definition ZIP file in S3 and reference it using the `definition_uri` parameter
  - Validate workflows with appropriate language syntax (WDL, CWL, Nextflow)
  - Include parameter templates to guide users on required inputs
  - Choose the appropriate method based on your workflow storage preferences
- S3 URI Support:
  - Both `CreateAHOWorkflow` and `CreateAHOWorkflowVersion` support S3 URIs as an alternative to base64-encoded ZIP files
  - Benefits of S3 URIs:
    - Better for large workflow definitions (no base64 encoding overhead)
    - Easier integration with CI/CD pipelines that store artifacts in S3
    - Reduced memory usage during workflow creation
    - Direct reference to existing S3-stored workflow definitions
  - Requirements:
    - The S3 URI must start with `s3://`
    - The S3 bucket must be in the same region as the HealthOmics service
    - Appropriate S3 permissions must be configured for the HealthOmics service
  - Usage: Specify either `definition_source` (local ZIP path, S3 URI, or base64 content) or `definition_uri`, but not both. The legacy `definition_zip_base64` parameter is still accepted as a deprecated alias.
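As a rough illustration of what packaging produces, the following Python sketch bundles a workflow directory into the base64-encoded ZIP string that `CreateAHOWorkflow` accepts. The helper name and layout are illustrative, not the server's actual implementation.

```python
import base64
import io
import zipfile
from pathlib import Path


def package_workflow(workflow_dir: str) -> str:
    """Bundle a workflow directory into a base64-encoded ZIP string
    (what PackageAHOWorkflow produces; illustrative sketch only)."""
    buf = io.BytesIO()
    root = Path(workflow_dir)
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the workflow root so imports resolve
                zf.write(path, path.relative_to(root))
    return base64.b64encode(buf.getvalue()).decode("ascii")
```

The resulting string can be passed as base64 content via `definition_source`; for large workflows, uploading the ZIP to S3 and using an S3 URI avoids the encoding overhead entirely.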
- Version Management:
  - Create new versions for workflow updates rather than modifying existing ones
  - Use descriptive version names that indicate changes or improvements
  - List versions to help users choose the appropriate one
  - Both base64 ZIP and S3 URI methods are supported for version creation
- Starting Runs:
  - Always specify the required parameters: workflow_id, role_arn, name, output_uri
  - Choose an appropriate storage type (DYNAMIC is recommended for most cases)
  - Use meaningful run names for easy identification
  - Configure caching when appropriate to save costs and time
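Under the hood, starting a run maps to the `start_run` API of the boto3 `omics` client. A hedged sketch — the run name, parameter keys, and all IDs/ARNs below are placeholders for your own resources:

```python
def start_alignment_run(omics, workflow_id, role_arn, output_uri):
    """Start a HealthOmics workflow run. `omics` is a boto3 'omics'
    client, e.g. boto3.client("omics"). Parameter keys depend on the
    workflow's parameter template; the ones here are hypothetical."""
    response = omics.start_run(
        workflowId=workflow_id,
        roleArn=role_arn,               # execution role with S3/logs access
        name="na12878-alignment",       # meaningful name for identification
        outputUri=output_uri,           # e.g. "s3://my-genomics-data/runs/"
        storageType="DYNAMIC",          # auto-scaling storage, recommended default
        parameters={"fastq_r1": "s3://my-genomics-data/input/sample_R1.fastq.gz"},
    )
    return response["id"]
```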
- Monitoring Runs:
  - Use `ListAHORuns` with status filters to track active workflows
  - Check individual run details with `GetAHORun` for comprehensive status
  - Monitor tasks with `ListAHORunTasks` to identify bottlenecks
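A simple polling loop over the API behind `GetAHORun` might look like the sketch below. The terminal-status set reflects the HealthOmics run lifecycle; confirm it against the service documentation before relying on it:

```python
import time

# Plausible terminal states for a HealthOmics run (verify against the API docs).
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "DELETED"}


def wait_for_run(omics, run_id, poll_seconds=30):
    """Poll omics.get_run until the run reaches a terminal state.
    `omics` is a boto3 'omics' client; sketch, not part of the server."""
    while True:
        status = omics.get_run(id=run_id)["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
```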
When workflows fail, follow this diagnostic approach:
- Start with `DiagnoseAHORunFailure`: This comprehensive tool provides:
  - Failure reasons and error analysis
  - Failed task identification
  - Log summaries and recommendations
  - Actionable troubleshooting steps
- Access Specific Logs:
  - Run Logs: High-level workflow events and status changes
  - Engine Logs: Workflow engine STDOUT/STDERR for system-level issues
  - Task Logs: Individual task execution details for specific failures
  - Manifest Logs: Resource utilization and workflow summary information
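These logs live in CloudWatch Logs and can also be read directly. The log group and stream naming below (`/aws/omics/WorkflowLog`, `run/<runId>/task/<taskId>`) reflect HealthOmics' CloudWatch layout as I understand it; verify the names in your account before depending on them:

```python
def fetch_task_log_tail(logs, run_id, task_id, limit=50):
    """Fetch the newest log events for a workflow task.
    `logs` is a boto3 CloudWatch Logs client (boto3.client("logs")).
    Group/stream names are assumptions drawn from HealthOmics' layout."""
    events = logs.get_log_events(
        logGroupName="/aws/omics/WorkflowLog",
        logStreamName=f"run/{run_id}/task/{task_id}",
        limit=limit,
        startFromHead=False,  # tail from the newest events
    )["events"]
    return [e["message"] for e in events]
```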
- Performance Analysis:
  - Use `AnalyzeAHORunPerformance` to identify resource bottlenecks
  - Review task resource utilization patterns
  - Optimize workflow parameters based on analysis results
The MCP server includes built-in workflow linting capabilities for validating WDL and CWL workflows before deployment:
- Lint Workflow Definitions:
  - Single files: Use `LintAHOWorkflowDefinition` for individual workflow files
  - Multi-file bundles: Use `LintAHOWorkflowBundle` for workflows with imports and dependencies
  - Syntax errors: Catch parsing issues before deployment
  - Missing components: Identify missing inputs, outputs, or steps
  - Runtime requirements: Ensure tasks have proper runtime specifications
  - Import resolution: Validate imports and dependencies between files
  - Best practices: Get warnings about potential improvements
- Supported Formats:
  - WDL: Uses miniwdl for comprehensive validation
  - CWL: Uses cwltool for standards-compliant validation
- No Additional Installation Required: Both miniwdl and cwltool are included as dependencies and available immediately after installing the MCP server.
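For reference, the equivalent command-line invocations of the bundled linters can be sketched as follows. The server uses these tools internally; this wrapper is illustrative, and the CLI commands shown (`miniwdl check`, `cwltool --validate`) are the tools' standard validation entry points:

```python
import subprocess


def lint_command(path):
    """Choose the bundled linter for a workflow file: miniwdl for WDL,
    cwltool for CWL, matching the formats the server validates."""
    if path.endswith(".wdl"):
        return ["miniwdl", "check", path]
    if path.endswith(".cwl"):
        return ["cwltool", "--validate", path]
    raise ValueError(f"unsupported workflow file: {path}")


def lint_workflow(path):
    """Run the linter and return (passed, combined output)."""
    result = subprocess.run(lint_command(path), capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr
```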
The MCP server includes a powerful genomics file search tool that helps users locate and discover genomics files across multiple storage systems:
- Multi-Storage Search:
  - S3 Buckets: Search configured S3 bucket paths for genomics files
  - HealthOmics Sequence Stores: Discover read sets and their associated files
  - HealthOmics Reference Stores: Find reference genomes and associated indexes
  - Unified Results: Get combined, deduplicated results from all storage systems
- Intelligent Pattern Matching:
  - File Path Matching: Search against S3 object keys and HealthOmics resource names
  - Tag-Based Search: Match against S3 object tags and HealthOmics metadata
  - Fuzzy Matching: Find files even with partial or approximate search terms
  - Multiple Terms: Support for multiple search terms with logical matching
- Automatic File Association:
  - BAM/CRAM Indexes: Automatically group BAM files with their .bai indexes and CRAM files with .crai indexes
  - FASTQ Pairs: Detect and group R1/R2 read pairs using standard naming conventions (_R1/_R2, _1/_2)
  - FASTA Indexes: Associate FASTA files with their .fai, .dict, and BWA index collections
  - Variant Indexes: Group VCF/GVCF files with their .tbi and .csi index files
  - Complete File Sets: Identify complete genomics file collections for analysis pipelines
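The grouping conventions above can be sketched in a few lines of Python. The regex and suffix lists here are simplified relative to the server's actual detection logic:

```python
import re
from collections import defaultdict


def pair_fastqs(keys):
    """Group R1/R2 FASTQ files by the standard _R1/_R2 and _1/_2 conventions."""
    groups = defaultdict(list)
    for key in keys:
        # Strip the read-number token to obtain a shared base name.
        base = re.sub(r"_R?[12](?=[._])", "", key)
        groups[base].append(key)
    return {b: sorted(ks) for b, ks in groups.items() if len(ks) == 2}


def find_index(key, all_keys):
    """Locate a companion index file (BAM -> .bai, CRAM -> .crai, VCF -> .tbi/.csi)."""
    candidates = {
        ".bam": [key + ".bai", key[:-4] + ".bai"],
        ".cram": [key + ".crai"],
        ".vcf.gz": [key + ".tbi", key + ".csi"],
    }
    for suffix, indexes in candidates.items():
        if key.endswith(suffix):
            return next((i for i in indexes if i in all_keys), None)
    return None
```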
- Smart Relevance Scoring:
  - Pattern Match Quality: Higher scores for exact matches, lower for fuzzy matches
  - File Type Relevance: Boost scores for files matching the requested type
  - Associated Files Bonus: Increase scores for files with complete index sets
  - Storage Accessibility: Consider storage class (Standard vs. Glacier) in scoring
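A toy version of such a scoring function, with invented weights, might look like this (the server's actual weights and fuzzy matcher are not documented here):

```python
from difflib import SequenceMatcher


def relevance_score(key, terms, requested_type=None, file_type=None,
                    has_indexes=False, storage_class="STANDARD"):
    """Rank a file against search terms: exact substring matches score
    highest, fuzzy matches lower; bonuses for matching file type and
    complete index sets; archived storage is penalized. Weights invented."""
    score = 0.0
    lowered = key.lower()
    for term in terms:
        t = term.lower()
        if t in lowered:
            score += 1.0  # exact path match
        else:
            score += 0.5 * SequenceMatcher(None, t, lowered).ratio()  # fuzzy match
    if requested_type and file_type == requested_type:
        score += 0.5   # file-type relevance boost
    if has_indexes:
        score += 0.25  # complete index-set bonus
    if storage_class in ("GLACIER", "DEEP_ARCHIVE"):
        score -= 0.5   # archived data requires retrieval first
    return score
```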
- Comprehensive File Metadata:
  - Access Paths: S3 URIs or HealthOmics S3 access point paths for direct data access
  - File Characteristics: Size, storage class, last modified date, and file type detection
  - Storage Information: Archive status and retrieval requirements
  - Source System: Clear indication of whether files are from S3, sequence stores, or reference stores
- Configuration and Setup:
  - S3 Bucket Configuration: Set the `GENOMICS_SEARCH_S3_BUCKETS` environment variable to a comma-separated list of bucket paths
  - Example: `GENOMICS_SEARCH_S3_BUCKETS=s3://my-genomics-data/,s3://shared-references/hg38/`
  - Permissions: Ensure appropriate S3 and HealthOmics read permissions
  - Performance: Searches run in parallel across storage systems for optimal response times
- Performance Optimizations:
  - Smart S3 API Usage: Minimizes S3 API calls by 60-90% through intelligent caching and batching
  - Lazy Tag Loading: Only retrieves S3 object tags when needed for pattern matching
  - Result Caching: Caches search results to eliminate repeated S3 calls for identical searches
  - Batch Operations: Retrieves tags for multiple objects in parallel batches
  - Configurable Performance: Tune cache TTLs, batch sizes, and tag search behavior for your use case
  - Path-First Matching: Prioritizes file path matching over tag matching to reduce API calls
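The caching behavior described above, including a TTL of 0 disabling the cache entirely, can be illustrated with a minimal TTL cache. This is a sketch, not the server's implementation:

```python
import time


class TTLCache:
    """Minimal time-to-live cache of the kind used for tag and result
    caching (illustrative only)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() < expires:
            return value
        self._store.pop(key, None)  # expired entry
        return None

    def put(self, key, value):
        if self.ttl <= 0:
            return  # a TTL of 0 disables caching, as documented
        self._store[key] = (value, time.monotonic() + self.ttl)
```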
- Find FASTQ Files for a Sample:
  - User: "Find all FASTQ files for sample NA12878"
  - Use `SearchGenomicsFiles` with `file_type="fastq"` and `search_terms=["NA12878"]`
  - Returns R1/R2 pairs automatically grouped together
  - Includes file sizes and storage locations
- Locate Reference Genomes:
  - User: "Find human reference genome hg38 files"
  - Use `SearchGenomicsFiles` with `file_type="fasta"` and `search_terms=["hg38", "human"]`
  - Returns FASTA files with associated .fai, .dict, and BWA indexes
  - Provides S3 access point paths for HealthOmics reference stores
- Search for Alignment Files:
  - User: "Find BAM files from the 1000 Genomes project"
  - Use `SearchGenomicsFiles` with `file_type="bam"` and `search_terms=["1000", "genomes"]`
  - Returns BAM files with their .bai index files
  - Ranked by relevance with complete file metadata
- Discover Variant Files:
  - User: "Locate VCF files containing SNP data"
  - Use `SearchGenomicsFiles` with `file_type="vcf"` and `search_terms=["SNP"]`
  - Returns VCF files with associated .tbi index files
  - Includes both S3 and HealthOmics store results
The genomics file search includes several optimizations to minimize S3 API calls and improve performance:
- For Path-Based Searches (Recommended):

  ```bash
  # Use specific file/sample names in search terms;
  # this enables path matching without tag retrieval.
  GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH=true  # Keep enabled for fallback
  GENOMICS_SEARCH_RESULT_CACHE_TTL=600       # Cache results for 10 minutes
  ```

- For Tag-Heavy Environments:

  ```bash
  # Optimize batch sizes for your dataset.
  GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE=200  # Larger batches for better performance
  GENOMICS_SEARCH_TAG_CACHE_TTL=900       # Longer tag cache for frequently accessed objects
  ```

- For Cost-Sensitive Environments:

  ```bash
  # Disable tag search if only path matching is needed.
  GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH=false  # Eliminates all tag API calls
  GENOMICS_SEARCH_RESULT_CACHE_TTL=1800       # Longer result cache to reduce repeated searches
  ```

- For Development/Testing:

  ```bash
  # Disable caching for immediate results during development.
  GENOMICS_SEARCH_RESULT_CACHE_TTL=0     # No result caching
  GENOMICS_SEARCH_TAG_CACHE_TTL=0        # No tag caching
  GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE=50  # Smaller batches for testing
  ```
Performance Impact: These optimizations can reduce S3 API calls by 60-90% and improve search response times by 5-10x compared to the unoptimized implementation.
- Workflow Development:
  - User: "Help me create a new genomic variant calling workflow"
  - Option A: Use `PackageAHOWorkflow` to bundle files, then `CreateAHOWorkflow` with the base64 ZIP
  - Option B: Upload the workflow ZIP to S3, then `CreateAHOWorkflow` with an S3 URI
  - Validate syntax and parameters
  - Choose the method based on workflow size and storage preferences
- Production Execution:
  - User: "Run my alignment workflow on these FASTQ files"
  - Use `SearchGenomicsFiles` to find FASTQ files for the run
  - Use `StartAHORun` with appropriate parameters
  - Monitor with `ListAHORuns` and `GetAHORun`
  - Track task progress with `ListAHORunTasks`
- Troubleshooting:
  - User: "My workflow failed, what went wrong?"
  - Use `DiagnoseAHORunFailure` for comprehensive analysis
  - Access specific logs based on the failure type
  - Provide actionable remediation steps
- Performance Optimization:
  - User: "How can I make my workflow run faster?"
  - Use `AnalyzeAHORunPerformance` to identify bottlenecks
  - Review resource utilization patterns
  - Suggest optimization strategies
- Workflow Validation:
  - User: "Check if my WDL workflow is valid"
  - Use `LintAHOWorkflowDefinition` for single files
  - Use `LintAHOWorkflowBundle` for multi-file workflows with imports
  - Check for missing inputs, outputs, or runtime requirements
  - Validate import resolution and dependencies
  - Get detailed error messages and warnings
- IAM Permissions: Ensure proper IAM roles with HealthOmics permissions
- Regional Availability: Use `GetAHOSupportedRegions` to verify service availability
- Cost Management: Monitor storage and compute costs, especially with STATIC storage
- Data Security: Follow genomic data handling best practices and compliance requirements
- Resource Limits: Be aware of service quotas and limits for concurrent runs
When tools return errors:
- Check AWS credentials and permissions
- Verify resource IDs (workflow_id, run_id, task_id) are valid
- Ensure proper parameter formatting and required fields
- Use diagnostic tools to understand failure root causes
- Provide clear, actionable error messages to users
Install using uvx:

```bash
uvx awslabs.aws-healthomics-mcp-server
```

Or install from source:

```bash
git clone <repository-url>
cd mcp/src/aws-healthomics-mcp-server
uv sync
uv run -m awslabs.aws_healthomics_mcp_server.server
```

- `AWS_REGION` - AWS region for HealthOmics operations (default: us-east-1)
- `AWS_PROFILE` - AWS profile for authentication
- `FASTMCP_LOG_LEVEL` - Server logging level (default: WARNING)
- `HEALTHOMICS_DEFAULT_MAX_RESULTS` - Default maximum number of results for paginated API calls (default: 10)
- `GENOMICS_SEARCH_S3_BUCKETS` - Comma-separated list of S3 bucket paths to search for genomics files (e.g., "s3://my-genomics-data/,s3://shared-references/")
- `GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH` - Enable/disable S3 tag-based searching (default: true)
  - Set to `false` to disable tag retrieval and use only path-based matching
  - Significantly reduces S3 API calls when tag matching is not needed
- `GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE` - Maximum objects to retrieve tags for in a single batch (default: 100)
  - Larger values improve performance for tag-heavy searches but use more memory
  - Smaller values reduce memory usage but may increase API call latency
- `GENOMICS_SEARCH_RESULT_CACHE_TTL` - Result cache TTL in seconds (default: 600)
  - Set to `0` to disable result caching
  - Caches complete search results to eliminate repeated S3 calls for identical searches
- `GENOMICS_SEARCH_TAG_CACHE_TTL` - Tag cache TTL in seconds (default: 300)
  - Set to `0` to disable tag caching
  - Caches individual object tags to avoid duplicate retrievals across searches
- `GENOMICS_SEARCH_MAX_CONCURRENT` - Maximum concurrent S3 bucket searches (default: 10)
- `GENOMICS_SEARCH_TIMEOUT_SECONDS` - Search timeout in seconds (default: 300)
- `GENOMICS_SEARCH_ENABLE_HEALTHOMICS` - Enable/disable HealthOmics sequence/reference store searches (default: true)
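A sketch of reading these variables with their documented defaults; the helper function itself is hypothetical, but the variable names and defaults come from the list above:

```python
import os


def load_search_config(env=os.environ):
    """Read the genomics-search environment variables with the
    documented defaults (illustrative helper, not the server's code)."""
    def flag(name, default):
        return env.get(name, str(default)).strip().lower() == "true"

    return {
        "buckets": [b for b in env.get("GENOMICS_SEARCH_S3_BUCKETS", "").split(",") if b.strip()],
        "tag_search": flag("GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH", True),
        "tag_batch_size": int(env.get("GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE", "100")),
        "result_cache_ttl": int(env.get("GENOMICS_SEARCH_RESULT_CACHE_TTL", "600")),
        "tag_cache_ttl": int(env.get("GENOMICS_SEARCH_TAG_CACHE_TTL", "300")),
        "max_concurrent": int(env.get("GENOMICS_SEARCH_MAX_CONCURRENT", "10")),
        "timeout_seconds": int(env.get("GENOMICS_SEARCH_TIMEOUT_SECONDS", "300")),
        "healthomics": flag("GENOMICS_SEARCH_ENABLE_HEALTHOMICS", True),
    }
```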
Note for Large S3 Buckets: When searching very large S3 buckets (millions of objects), the genomics file search may take longer than the default MCP client timeout. If you encounter timeout errors, increase the MCP server timeout by adding a `"timeout"` property to your MCP server configuration (e.g., `"timeout": 300000` for five minutes, specified in milliseconds). This is particularly important when using the search tool with extensive S3 bucket configurations or when `GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH=true` is used with large datasets. The `"timeout"` value should always exceed `GENOMICS_SEARCH_TIMEOUT_SECONDS` so that the MCP timeout does not preempt the genomics search timeout.
- `AGENT` - Agent identifier appended to the User-Agent string on all boto3 API calls as `agent/<value>` (optional)
  - Use case: Attributing API calls to specific AI agents for traceability via CloudTrail and AWS service logs
  - Behavior: When set, the value is sanitized to visible ASCII characters (0x20-0x7E), stripped of leading/trailing whitespace, lowercased, and appended to the User-Agent header as `agent/<value>`
  - Validation: Empty, whitespace-only, or values that become empty after sanitization are treated as unset
  - Example: `export AGENT=KIRO` produces `User-Agent: ... agent/kiro`
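The documented sanitization steps can be expressed directly in Python; this is an illustrative reimplementation, not the server's code:

```python
def sanitize_agent(raw):
    """Reproduce the documented AGENT handling: keep visible ASCII
    (0x20-0x7E), strip whitespace, lowercase; empty results mean unset.
    Returns the 'agent/<value>' User-Agent suffix, or None if unset."""
    if raw is None:
        return None
    visible = "".join(ch for ch in raw if 0x20 <= ord(ch) <= 0x7E)
    value = visible.strip().lower()
    return f"agent/{value}" if value else None
```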
The following environment variables are primarily intended for testing scenarios, such as integration testing against mock service endpoints:
- `HEALTHOMICS_SERVICE_NAME` - Override the AWS service name used by the HealthOmics client (default: omics)
  - Use case: Testing against mock services or alternative implementations
  - Validation: Cannot be empty or whitespace-only; falls back to the default with a warning if invalid
  - Example: `export HEALTHOMICS_SERVICE_NAME=omics-mock`
- `HEALTHOMICS_ENDPOINT_URL` - Override the endpoint URL used by the HealthOmics client
  - Use case: Integration testing against local mock services or alternative endpoints
  - Validation: Must begin with `http://` or `https://`; ignored with a warning if invalid
  - Example: `export HEALTHOMICS_ENDPOINT_URL=http://localhost:8080`
  - Note: Only affects the HealthOmics client; other AWS services use default endpoints
Important: These testing configuration variables should only be used in development and testing environments. In production, always use the default AWS HealthOmics service endpoints for security and reliability.
This server requires AWS credentials with appropriate permissions for HealthOmics operations. Configure using:
- AWS CLI: `aws configure`
- Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
- IAM roles (recommended for EC2/Lambda)
- AWS profiles: Set the `AWS_PROFILE` environment variable
The following IAM permissions are required:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "omics:ListWorkflows",
        "omics:CreateWorkflow",
        "omics:GetWorkflow",
        "omics:CreateWorkflowVersion",
        "omics:ListWorkflowVersions",
        "omics:StartRun",
        "omics:ListRuns",
        "omics:GetRun",
        "omics:ListRunTasks",
        "omics:GetRunTask",
        "omics:CreateRunGroup",
        "omics:GetRunGroup",
        "omics:ListRunGroups",
        "omics:UpdateRunGroup",
        "omics:CreateRunCache",
        "omics:GetRunCache",
        "omics:ListRunCaches",
        "omics:UpdateRunCache",
        "omics:ListSequenceStores",
        "omics:ListReadSets",
        "omics:GetReadSetMetadata",
        "omics:ListReferenceStores",
        "omics:ListReferences",
        "omics:GetReferenceMetadata",
        "omics:CreateSequenceStore",
        "omics:GetSequenceStore",
        "omics:UpdateSequenceStore",
        "omics:StartReadSetImportJob",
        "omics:GetReadSetImportJob",
        "omics:ListReadSetImportJobs",
        "omics:StartReadSetExportJob",
        "omics:GetReadSetExportJob",
        "omics:ListReadSetExportJobs",
        "omics:StartReadSetActivationJob",
        "omics:GetReferenceStore",
        "omics:StartReferenceImportJob",
        "omics:GetReferenceImportJob",
        "omics:ListReferenceImportJobs",
        "omics:CreateConfiguration",
        "omics:GetConfiguration",
        "omics:ListConfigurations",
        "omics:DeleteConfiguration",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetObjectTagging",
        "s3:HeadBucket"
      ],
      "Resource": [
        "arn:aws:s3:::*genomics*",
        "arn:aws:s3:::*genomics*/*",
        "arn:aws:s3:::*omics*",
        "arn:aws:s3:::*omics*/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::*:role/HealthOmicsExecutionRole*"
    }
  ]
}
```

Note: The S3 permissions above use wildcard patterns for genomics-related buckets. In production, replace these with the specific bucket ARNs you want to search. For example:
```json
{
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket",
    "s3:GetObject",
    "s3:GetObjectTagging",
    "s3:HeadBucket"
  ],
  "Resource": [
    "arn:aws:s3:::my-genomics-data",
    "arn:aws:s3:::my-genomics-data/*",
    "arn:aws:s3:::shared-references",
    "arn:aws:s3:::shared-references/*"
  ]
}
```

See the Kiro IDE documentation or the Kiro CLI documentation for details.
For global configuration, edit `~/.kiro/settings/mcp.json`. For project-specific configuration, edit `.kiro/settings/mcp.json` in your project directory.
Add to your Kiro MCP configuration (`~/.kiro/settings/mcp.json`):
```json
{
  "mcpServers": {
    "aws-healthomics": {
      "command": "uvx",
      "args": ["awslabs.aws-healthomics-mcp-server"],
      "timeout": 300000,
      "env": {
        "AWS_REGION": "us-east-1",
        "AWS_PROFILE": "your-profile",
        "HEALTHOMICS_DEFAULT_MAX_RESULTS": "10",
        "AGENT": "kiro",
        "GENOMICS_SEARCH_S3_BUCKETS": "s3://my-genomics-data/,s3://shared-references/",
        "GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH": "true",
        "GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE": "100",
        "GENOMICS_SEARCH_RESULT_CACHE_TTL": "600",
        "GENOMICS_SEARCH_TAG_CACHE_TTL": "300"
      }
    }
  }
}
```

For integration testing against mock services:
```json
{
  "mcpServers": {
    "aws-healthomics-test": {
      "command": "uvx",
      "args": ["awslabs.aws-healthomics-mcp-server"],
      "timeout": 300000,
      "env": {
        "AWS_REGION": "us-east-1",
        "AWS_PROFILE": "test-profile",
        "HEALTHOMICS_SERVICE_NAME": "omics-mock",
        "HEALTHOMICS_ENDPOINT_URL": "http://localhost:8080",
        "GENOMICS_SEARCH_S3_BUCKETS": "s3://test-genomics-data/",
        "GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH": "false",
        "GENOMICS_SEARCH_RESULT_CACHE_TTL": "0",
        "FASTMCP_LOG_LEVEL": "DEBUG"
      }
    }
  }
}
```

Configure according to your client's documentation, using:
- Command: `uvx`
- Args: `["awslabs.aws-healthomics-mcp-server"]`
- Environment variables as needed
For Windows users, the MCP server configuration format is slightly different:
```json
{
  "mcpServers": {
    "awslabs.aws-healthomics-mcp-server": {
      "disabled": false,
      "timeout": 300000,
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "--from",
        "awslabs.aws-healthomics-mcp-server@latest",
        "awslabs.aws-healthomics-mcp-server.exe"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1",
        "GENOMICS_SEARCH_S3_BUCKETS": "s3://my-genomics-data/,s3://shared-references/",
        "GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH": "true",
        "GENOMICS_SEARCH_MAX_TAG_BATCH_SIZE": "100",
        "GENOMICS_SEARCH_RESULT_CACHE_TTL": "600",
        "GENOMICS_SEARCH_TAG_CACHE_TTL": "300"
      }
    }
  }
}
```

For testing scenarios on Windows:
```json
{
  "mcpServers": {
    "awslabs.aws-healthomics-mcp-server-test": {
      "disabled": false,
      "timeout": 300000,
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "--from",
        "awslabs.aws-healthomics-mcp-server@latest",
        "awslabs.aws-healthomics-mcp-server.exe"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "DEBUG",
        "AWS_PROFILE": "test-profile",
        "AWS_REGION": "us-east-1",
        "HEALTHOMICS_SERVICE_NAME": "omics-mock",
        "HEALTHOMICS_ENDPOINT_URL": "http://localhost:8080",
        "GENOMICS_SEARCH_S3_BUCKETS": "s3://test-genomics-data/",
        "GENOMICS_SEARCH_ENABLE_S3_TAG_SEARCH": "false",
        "GENOMICS_SEARCH_RESULT_CACHE_TTL": "0"
      }
    }
  }
}
```

```bash
git clone <repository-url>
cd aws-healthomics-mcp-server
uv sync
```

```bash
# Run tests with coverage
uv run pytest --cov --cov-branch --cov-report=term-missing

# Run specific test file
uv run pytest tests/test_server.py -v
```

```bash
# Format code
uv run ruff format

# Lint code
uv run ruff check

# Type checking
uv run pyright
```

Contributions are welcome! Please see the contributing guidelines for more information.
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.