Restore EMR Spark plugin startup scripts #625
Bring back the EMR startup helper and companion configuration files that remained on legacy-main after the branch split. Signed-off-by: liyuan <yuali@nvidia.com>
Add the standard Apache license headers required by CI for the restored EMR Python and shell scripts. Signed-off-by: liyuan <yuali@nvidia.com>
Greptile Summary
This PR restores the EMR Spark RAPIDS startup tooling (a Python cluster-creation script, two cgroup bootstrap shell scripts, and one JSON configuration file each for EMR 6 and EMR 7) that existed on the legacy-main branch but was absent from main after the branch split.
Confidence Score: 5/5
Safe to merge: all files are new additions with no changes to existing code paths. Every issue raised in prior review rounds has been addressed: placeholder syntax is consistent between the JSON configs and the Python replacements, the unsupported-instance guard is present, paths are resolved relative to __file__, the config is written to a temp file rather than overwriting the source, the S3 upload return value is checked before proceeding, and all required EC2 attributes are included in the cluster command. No new defects were found. No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[User invokes emr-spark-plugin-startup.py] --> B{Parse CLI args}
B --> C{Release label contains emr-7?}
C -- Yes --> D[config-emr7.json + cgroup-bootstrap-emr7.sh]
C -- No --> E[config-emr6.json + cgroup-bootstrap-emr6.sh]
D --> F{worker_instance in g4dn_instance_map?}
E --> F
F -- No --> G[Print error and return]
F -- Yes --> H[Compute exec_cores and task_gpu_amount]
H --> I[Read config JSON, replace placeholders]
I --> J[Upload bootstrap script to S3]
J -- Failed --> K[Return early]
J -- OK --> L[Write config to NamedTemporaryFile]
L --> M[aws emr create-cluster with ec2-attributes and bootstrap-actions]
M -- Success --> N[Print cluster ID]
M -- Error --> O[Print stderr]
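The first branches of the flowchart (file selection by release label, the worker-instance guard, and the executor settings) can be sketched as follows. This is a minimal illustration, not the script's actual code: the map contents are an illustrative subset, the function names are hypothetical, and the formulas for exec_cores and task_gpu_amount are assumptions, since the PR text does not show how they are computed.

```python
from typing import Optional, Tuple

# Illustrative subset only; the real g4dn_instance_map in the script may differ.
G4DN_INSTANCE_MAP = {
    "g4dn.xlarge": {"vcpus": 4, "gpus": 1},
    "g4dn.2xlarge": {"vcpus": 8, "gpus": 1},
    "g4dn.12xlarge": {"vcpus": 48, "gpus": 4},
}


def select_files(release_label: str) -> Tuple[str, str]:
    """Pick the config and bootstrap script that match the EMR major version."""
    if "emr-7" in release_label:
        return "config-emr7.json", "cgroup-bootstrap-emr7.sh"
    return "config-emr6.json", "cgroup-bootstrap-emr6.sh"


def executor_settings(worker_instance: str) -> Optional[Tuple[int, float]]:
    """Return (exec_cores, task_gpu_amount), or None for unsupported types."""
    spec = G4DN_INSTANCE_MAP.get(worker_instance)
    if spec is None:
        # Mirrors the "Print error and return" branch of the flowchart.
        print(f"Unsupported worker instance type: {worker_instance}")
        return None
    # Assumption: one executor per GPU, with the vCPUs split evenly among them.
    exec_cores = spec["vcpus"] // spec["gpus"]
    # Assumption: each task claims an equal fraction of its executor's GPU.
    task_gpu_amount = 1 / exec_cores
    return exec_cores, task_gpu_amount
```

Returning None for an unknown instance type lets the caller stop before any AWS call is made, which matches the early-return branch in the diagram.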
Reviews (4): Last reviewed commit: "Resolve EMR config paths relative to scr..."
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Validate supported worker instance types, pass the key pair and subnet into the EMR cluster attributes, and write substituted EMR configuration to a temporary file instead of modifying the source JSON. Signed-off-by: liyuan <yuali@nvidia.com>
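The temporary-file approach described in this commit might look roughly like the sketch below. The helper name and the `{{EXEC_CORES}}` placeholder syntax are hypothetical; the PR text does not show the placeholder format used in the JSON configs.

```python
import json
import tempfile


def write_substituted_config(template_text: str, replacements: dict) -> str:
    """Substitute placeholders and write the result to a new temp file.

    The source template is never modified; the caller gets the temp path.
    """
    for placeholder, value in replacements.items():
        template_text = template_text.replace(placeholder, str(value))
    # Fail early if substitution produced invalid JSON.
    json.loads(template_text)
    # delete=False keeps the file on disk after the handle is closed,
    # so a later command can read it by path.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
        tmp.write(template_text)
        return tmp.name
```

Because the file is created with delete=False, the caller is responsible for removing it once the cluster request has been submitted.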
Return upload status from the S3 helper and skip cluster creation when the bootstrap action cannot be uploaded. Signed-off-by: liyuan <yuali@nvidia.com>
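A sketch of this pattern is shown below, assuming the upload goes through the `aws s3 cp` CLI command; the PR text does not say which upload mechanism the script uses. The function names and the injectable `run` parameter are hypothetical and exist only so the logic can be exercised without AWS access.

```python
import subprocess


def upload_to_s3(local_path: str, s3_uri: str, run=subprocess.run) -> bool:
    """Copy a local file to S3 and report whether the copy succeeded."""
    try:
        result = run(
            ["aws", "s3", "cp", local_path, s3_uri],
            capture_output=True,
            text=True,
        )
    except OSError as exc:
        # Raised, for example, when the aws CLI is not installed.
        print(f"Could not run aws CLI: {exc}")
        return False
    if result.returncode != 0:
        print(result.stderr)
        return False
    return True


def create_cluster_if_uploaded(local_path, s3_uri, create_cluster, run=subprocess.run):
    """Call create_cluster() only when the bootstrap script upload succeeded."""
    if not upload_to_s3(local_path, s3_uri, run=run):
        return None  # Skip cluster creation: the bootstrap action is missing.
    return create_cluster()
```

Checking the boolean before creating the cluster avoids launching nodes whose bootstrap action would point at an object that was never uploaded.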
Load the EMR configuration files and upload bootstrap scripts using paths anchored to the startup script directory so the helper works from any current working directory. Signed-off-by: liyuan <yuali@nvidia.com>
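Anchoring paths to the script directory can be sketched as below. The helper name is hypothetical; in the script itself the first argument would presumably be `__file__`.

```python
from pathlib import Path


def resolve_beside(script_file: str, name: str) -> Path:
    """Return the absolute path of `name` in the same directory as the script."""
    return Path(script_file).resolve().parent / name


# Typical use inside the startup script (assumed, not taken from the PR):
# config_path = resolve_beside(__file__, "config-emr7.json")
```

Resolving against the script location rather than the current working directory means the companion files are found regardless of where the user invokes the helper from.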
Merged, since this just cherry-picks the missing #476.
Summary
Test plan
- emr-spark-plugin-startup.py with Python AST.
- config-emr6.json and config-emr7.json with Python JSON parser.
- bash -n on both EMR cgroup bootstrap scripts.
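The three checks in the test plan could be reproduced with something like the following sketch. The helper names are hypothetical and the exact commands the author ran are not shown in the PR; this only illustrates syntax-level validation with `ast.parse`, `json.loads`, and `bash -n`.

```python
import ast
import json
import subprocess


def python_syntax_ok(source: str) -> bool:
    """True if the text parses as Python (no code is executed)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def json_ok(text: str) -> bool:
    """True if the text is well-formed JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def shell_syntax_ok(path: str) -> bool:
    """True if `bash -n` accepts the script; -n checks syntax without running it."""
    result = subprocess.run(["bash", "-n", path], capture_output=True)
    return result.returncode == 0
```

These checks confirm only that the files parse; they do not exercise cluster creation or the bootstrap actions themselves.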