Feat: RTF + mTAN encoder integration: feature preparation, inference, and embedding storage by KeshavMajithia · Pull Request #484 · boom-astro/boom

KeshavMajithia · 2026-06-03T18:23:48Z

Adds RTF and mTAN encoder inference to the ZTF enrichment pipeline. When a ZTF alert is processed, BOOM now computes a 128D RTF embedding and a 2D mTAN embedding alongside the existing ACAI/BTSBot classifications, and stores them in the alert document under an embeddings field.

Changes

src/enrichment/models/rtf.rs — prepare_features() constructs the (1, 257, 37) photometry tensor, (1, 257) padding mask, and (1, 3, 63, 63) CHW cutout tensor from raw alert data. The 37 channels match the training pipeline: log time deltas, logflux, band one-hots, and 30 alert metadata keys.

src/enrichment/models/mtan.rs — prepare_features() filters to g/r bands, merges nearby observations, normalizes magnitudes and time to [0,1], and pads to 200 steps. pool_embedding() mean-pools qz0_mean across 50 query times to produce the final 2D vector. Skips alerts with fewer than 3 observations.

src/enrichment/ztf.rs — Adds ZtfAlertEmbeddings struct, compute_embeddings() method, and integrates it into process_alerts(). Both models fail gracefully (log a warning and store None for that embedding).

k8s/08-boom-scheduler-ztf.yaml — Adds an initContainer that downloads all 8 ONNX model files (5 ACAI, BTSBot, RTF, mTAN) from boom-astro/boomEncoders into an emptyDir volume mounted at /app/data/models.

MongoDB schema after this change

{
  "classifications": { "acai_h": 0.95, ... },
  "embeddings": {
    "rtf": [0.123, -0.456, ...],   // 128 floats
    "mtan": [0.789, -0.012]        // 2 floats
  }
}

…lder

- rtf.rs: prepare_features() builds (1,257,37) photometry tensor, (1,257) pad mask, and (1,3,63,63) CHW cutout from ZTF alert data. 30 metadata channels match ALERT_META_KEYS from training pipeline. - mtan.rs: prepare_features() filters g/r photometry, merges nearby obs, normalizes magnitudes and time, pads to 200 steps. Adds pool_embedding() to mean-pool qz0_mean into 2D vector. - ztf.rs: ZtfAlertEmbeddings struct, compute_embeddings() method, integrated into process_alerts() to store embeddings in MongoDB alongside existing classifications.

…ning

Adds an initContainer to the ZTF scheduler k8s deployment that pulls rtf_embed.onnx and mtan_embed.onnx from boom-astro/boomEncoders into an emptyDir volume mounted at /app/data/models. The main container reads these model files at startup to initialize ONNX inference.

…iner

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR introduces RTF and mTAN encoder embeddings into the ZTF enrichment pipeline, adjusts Kafka topic handling to avoid an async deletion race, and makes select filter/schema endpoints publicly accessible.

Changes:

Add RTF + mTAN ONNX encoder model support and persist per-alert embeddings during enrichment.
Avoid Kafka topic deletion to prevent “ghost topic” races during initialization.
Expose filter test and schema-related endpoints without authentication and add a ZTF scheduler deployment that downloads ONNX models at startup.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
src/kafka/base.rs	Skips topic deletion to avoid Kafka async delete race during initialization.
src/enrichment/ztf.rs	Adds embeddings schema + computes/persists RTF/mTAN embeddings per alert.
src/enrichment/models/rtf.rs	New RTF encoder wrapper and feature preparation for ONNX inference.
src/enrichment/models/mtan.rs	New mTAN encoder wrapper, feature preparation, and pooling logic.
src/enrichment/models/mod.rs	Registers new models and adds them to `SharedModels`.
src/api/routes/filters.rs	Removes explicit auth requirement for filter test endpoints.
src/api/auth.rs	Adds filter test and schema routes to the public allowlist.
k8s/08-boom-scheduler-ztf.yaml	Adds a scheduler deployment that downloads ONNX models via initContainer.
Dockerfile	Switches Kafka download URL to Apache archive.
.github/workflows/build-fork.yaml	Adds workflow to build/push image for specific branches.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+            results.push(Some(ZtfAlertEmbeddings {
+                rtf: rtf_embedding,
+                mtan: mtan_embedding,
+            }));


+            // RTF embedding
+            let rtf_embedding = match RtfModel::prepare_features(&item.alert, &item.cutouts) {
+                Ok((x, pad_mask, images)) => {
+                    match models.rtf_embed.lock().unwrap().embed(&x, &pad_mask, &images) {


+                    match models
+                        .mtan_embed
+                        .lock()
+                        .unwrap()
+                        .embed_raw(&x, &time_steps, &query_times)


+            // The topic and data directory are inconsistent.
+            // NOTE: We intentionally skip delete_topic here because Kafka
+            // deletes topics asynchronously, causing a race condition where
+            // initialize_topic later sees a ghost topic with 0 partitions.
+            // Instead, we let initialize_topic handle it after downloading.
+            // warn!("recreating topic {}", topic_name);
+            // delete_topic(&self.server_url(), &topic_name).await?;


 pub async fn post_filter_test(
    db: web::Data<Database>,
    body: web::Json<FilterTestRequest>,
-    current_user: Option<web::ReqData<User>>,
 ) -> HttpResponse {


+              # ACAI classifiers (~35 KB each)
+              for variant in acai_h acai_n acai_v acai_o acai_b; do
+                echo "Downloading ${variant}.d1_dnn_20201130.onnx ..."
+                curl -fsSL "${HF_BASE}/${variant}.d1_dnn_20201130.onnx" -o "/models/${variant}.d1_dnn_20201130.onnx"


+
+              # BTSBot classifier (~900 KB)
+              echo "Downloading btsbot-v1.0.1.onnx ..."
+              curl -fsSL "${HF_BASE}/btsbot-v1.0.1.onnx" -o /models/btsbot-v1.0.1.onnx


+
+              # RTF encoder (8.7 MB)
+              echo "Downloading rtf_embed.onnx ..."
+              curl -fsSL "${HF_BASE}/rtf_embed.onnx" -o /models/rtf_embed.onnx


+
+              # mTAN encoder (383 KB)
+              echo "Downloading mtan_embed.onnx ..."
+              curl -fsSL "${HF_BASE}/mtan_embed.onnx" -o /models/mtan_embed.onnx


+            let mut j = i;
+            while j < points.len() && (points[j].jd - t).abs() <= MERGE_TOL_DAYS {
+                let band = points[j].band_idx;
+                mp.mag[band] = points[j].mag as f32;
+                mp.mask[band] = 1.0;
+                j += 1;
+            }


# Conflicts: # src/enrichment/ztf.rs

…dings - compute_embeddings: replace .lock().unwrap() with match on lock() to gracefully handle poisoned mutex instead of panicking the worker - compute_embeddings: return None when both RTF and mTAN fail, so MongoDB doesn't get { rtf: null, mtan: null } written needlessly - mtan.rs: add comment explaining last-write-wins merge matches the Python training pipeline for numerical consistency - k8s: remove stale NOTE about uploading ACAI/BTSBot (already done)

petebachant · 2026-06-09T13:32:48Z

I'm curious about the inclusion of a Kubernetes manifest here. AFAIK, Kubernetes is not used at Caltech or UMN. So far, models have been saved in this repo using Git LFS and baked into the Docker images.

mcoughlin · 2026-06-09T21:17:59Z

@petebachant @antoine-le-calloch I do worry about the difference in deployment methods and how we can effectively develop / add unit tests for some of these forthcoming features. k8s is how we will deploy testing instances at NRP, so we need to support that too.

petebachant · 2026-06-10T14:35:52Z

@petebachant @antoine-le-calloch I do worry about the difference in deployment methods and how we can effectively develop / add unit tests for some of these forthcoming features. k8s is how we will deploy testing instances at NRP, so we need to support that too.

Can we write up the use case, i.e., who will be running it, what they'll be testing, etc.? We could convert from Docker Compose to Helm, but if it's just to test ML models or something, the solution would look different.

KeshavMajithia added 15 commits April 30, 2026 22:44

feat: make filter test endpoints public for babamul-web filter tester

43788bf

ci: add fork build workflow for NRP image

15c211b

ci: retry build (Apache CDN 503)

935bb43

fix: use archive.apache.org for Kafka download (CDN 503)

3dbc5aa

fix: skip async topic deletion to prevent 0-partition race condition

aee7f15

feat: add schema endpoint to public routes and invalidate docker cache

604d70e

fix: remove hardcoded auth checks from public filter endpoints

6164c8b

feat: add /filters/schemas/{ZTF,LSST} to public routes for filter bui…

946dccc

…lder

Add RTF and mTAN ONNX encoder middleware

7a3be6b

Trigger GitHub action for this branch

4301846

ci: retrigger build

d8d1695

fix: use public re-export path for Candidate, suppress unused var war…

0e06df4

…ning

feat: download all ONNX models (ACAI, BTSBot, RTF, mTAN) in initConta…

c7971ab

…iner

Copilot AI review requested due to automatic review settings June 3, 2026 18:23

Copilot AI reviewed Jun 3, 2026

View reviewed changes

KeshavMajithia added 2 commits June 3, 2026 23:58

Merge remote-tracking branch 'origin/main' into feat/rtf-mtan-encoders

d6c12e3

# Conflicts: # src/enrichment/ztf.rs

KeshavMajithia added 2 commits June 10, 2026 18:23

Merge remote-tracking branch 'origin/main' into feat/rtf-mtan-encoders

ee898da

fix: update AlertCutout import path after upstream refactor

2428402

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feat: RTF + mTAN encoder integration: feature preparation, inference, and embedding storage#484

Feat: RTF + mTAN encoder integration: feature preparation, inference, and embedding storage#484
KeshavMajithia wants to merge 19 commits into
boom-astro:mainfrom
KeshavMajithia:feat/rtf-mtan-encoders

KeshavMajithia commented Jun 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

petebachant commented Jun 9, 2026

Uh oh!

mcoughlin commented Jun 9, 2026

Uh oh!

petebachant commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

KeshavMajithia commented Jun 3, 2026

Changes

MongoDB schema after this change

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

petebachant commented Jun 9, 2026

Uh oh!

mcoughlin commented Jun 9, 2026

Uh oh!

petebachant commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants