
Host galaxy association module #411

Open
mcoughlin wants to merge 10 commits into main from feature/host-galaxy-association

Conversation

@mcoughlin

Collaborator

This PR adds a host galaxy association module with DLR + Bayesian scoring.

Based on implementation in https://github.com/alexandergagliano/Prost.

Copilot AI review requested due to automatic review settings March 12, 2026 11:30

Copilot AI left a comment

Contributor


Pull request overview

Adds a new host-galaxy association utility (DLR + Bayesian scoring) and wires it into the ZTF/LSST enrichment workers, driven by new config options and LS_DR10 crossmatches.

Changes:

  • Introduces utils::host with DLR computation, candidate scoring, and association output structs.
  • Stores host_galaxy association results back into alert documents during ZTF/LSST enrichment when enabled.
  • Extends config.yaml crossmatch configuration to include LS_DR10 and adds a host_galaxy config block.
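The overview mentions candidate scoring but not how scores become probabilities. As a rough, generic illustration only (this is not the code in `src/utils/host.rs`), one common pattern is to normalize per-candidate weights together with a weight for the hypothesis that the true host is not among the candidates. The names `normalize_posteriors` and `no_host_weight` are hypothetical, and the sketch assumes each weight is already a prior times a likelihood computed elsewhere.

```rust
/// Hypothetical helper, not the PR's API: turn unnormalized weights
/// (prior x likelihood, one per candidate galaxy) plus a weight for the
/// "host not among the candidates" hypothesis into posterior probabilities.
/// Returns (per-candidate posteriors, no-host posterior).
fn normalize_posteriors(weights: &[f64], no_host_weight: f64) -> Option<(Vec<f64>, f64)> {
    // A usable weight is finite and non-negative.
    fn valid(w: f64) -> bool {
        w.is_finite() && w >= 0.0
    }

    // Reject NaN, infinite, or negative weights before any arithmetic.
    if !weights.iter().all(|w| valid(*w)) || !valid(no_host_weight) {
        return None;
    }

    // The normalization constant covers every hypothesis, including "no host".
    let total: f64 = weights.iter().sum::<f64>() + no_host_weight;
    if !total.is_finite() || total <= 0.0 {
        return None;
    }

    let per_candidate: Vec<f64> = weights.iter().map(|w| w / total).collect();
    Some((per_candidate, no_host_weight / total))
}
```

With weights `[3.0, 1.0]` and a no-host weight of `1.0`, the candidates get posteriors of about 0.6 and 0.2, and the no-host hypothesis gets about 0.2.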

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `src/utils/mod.rs` | Exposes new host utility module. |
| `src/utils/host.rs` | Implements host association logic + unit tests. |
| `src/enrichment/ztf.rs` | Projects `cross_matches`, runs host association, writes `host_galaxy` into Mongo updates. |
| `src/enrichment/lsst.rs` | Runs host association from `cross_matches`, writes `host_galaxy` into Mongo updates. |
| `src/enrichment/mod.rs` | Re-exports host association types. |
| `src/conf.rs` | Adds `host_galaxy` to `AppConfig`. |
| `config.yaml` | Enables host association and adds LS_DR10 crossmatch configs. |


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI commented Mar 12, 2026

Contributor

@mcoughlin I've opened a new pull request, #412, to work on those changes. Once the pull request is ready, I'll request review from you.

mcoughlin and others added 4 commits March 12, 2026 06:36
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

@Theodlz Theodlz left a comment

Collaborator


Just wondering, since it is not in the PR description: how is this meant to be used? Right now we crossmatch with NED, for example, nothing fancy. I want to figure out how to avoid ending up with too many places for people to look for potential host galaxies.

First off, since this is purely positional (I suppose?), shouldn't we have that info at the object level? If so, and since the crossmatches only need to happen once per object and we do them in the alert worker, should we move this there? Then in the enrichment worker we could just compute a hosted boolean feature (à la star, near brightstar) and use that to build the babamul topics, for example.

@github-actions

Throughput results (ba81fe4b967d26d96cb4fd8b180c85d5b8d8e432):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 159.6 | 183.2 | -12.00% |

@github-actions

Throughput results (e76f78cc9670e230707c0250031ad3c16c3ba3b8):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 181.3 | 183.2 | -1.00% |

@github-actions

Throughput results (ec18f1edbbcd753c11ceef4840cb8589fcef8038):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 186.8 | 183.2 | 1.00% |

@github-actions

Throughput results (11268a9342453f8469060441e32725eb847a0c3a):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 193.0 | 183.2 | 5.00% |

@github-actions

Throughput results (806ae0733c076ea3d28a8b80676b928e96763ac9):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 187.1 | 183.2 | 2.00% |

@github-actions

Throughput results (b32982720144eeb398a893677da397094962e05e):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 186.5 | 183.2 | 1.00% |

Copilot AI and others added 2 commits March 12, 2026 06:54
#412)

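Non-finite `shape_e1`/`shape_e2` (NaN, ±∞) would pass through the
existing `shape_r` guard and corrupt `axis_ratio`, `b_arcsec`, DLR, and
the Bayesian posteriors. The `e.min(0.999)` clamp does not guard against
this: Rust's `f64::min` and `f64::max` return the other operand when one
is NaN, so a NaN or infinite ellipticity is silently clamped to 0.999
instead of being rejected, while any expression that uses `shape_e1` or
`shape_e2` directly still evaluates to NaN.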

## Changes

- **Early rejection**: return `None` when `shape_e1` or `shape_e2` is
non-finite, before any arithmetic.
- **Post-computation guard**: return `None` if derived `q`, `a_arcsec`,
or `b_arcsec` are non-finite or ≤ 0 (defensive against edge cases like a
NaN `min_b`).
- **Tests**: `test_tractor_shape_non_finite_ellipticity` covering NaN
`e1`, NaN `e2`, both NaN, `+∞ e1`, `−∞ e2`.

```rust
// Before: non-finite input was never rejected
let e = (shape_e1 * shape_e1 + shape_e2 * shape_e2).sqrt(); // NaN if inputs are NaN
let e_clamped = e.min(0.999); // f64::NAN.min(0.999) == 0.999: NaN is clamped, not rejected

// After: reject early
if !shape_e1.is_finite() || !shape_e2.is_finite() {
    return None;
}
// ...and defensively after derivation:
if !q.is_finite() || q <= 0.0 || !b.is_finite() || b <= 0.0 {
    return None;
}
```
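For reference, a self-contained sketch of the guarded conversion might look like the following. It is not the code in `src/utils/host.rs`: `tractor_axes` is a hypothetical name, and the sketch assumes the usual Tractor convention of total ellipticity e = sqrt(e1² + e2²) with axis ratio q = (1 − e) / (1 + e), and that `shape_r` can stand in for the semi-major axis in arcsec. One detail worth checking against the real code: Rust's `f64::min` returns the non-NaN operand, so the `0.999` clamp alone cannot catch NaN input, which is why the explicit `is_finite` checks come first.

```rust
/// Hypothetical helper (not the PR's function): convert Tractor-style shape
/// parameters into (semi-major, semi-minor) axes in arcsec.
/// Returns None for non-finite or non-positive input.
fn tractor_axes(shape_r: f64, shape_e1: f64, shape_e2: f64) -> Option<(f64, f64)> {
    // Existing-style guard on the radius.
    if !shape_r.is_finite() || shape_r <= 0.0 {
        return None;
    }
    // Early rejection: NaN or infinite ellipticity components.
    if !shape_e1.is_finite() || !shape_e2.is_finite() {
        return None;
    }

    // Total ellipticity, clamped below 1 so the axis ratio stays positive.
    let e = shape_e1.hypot(shape_e2).min(0.999);
    // Axis ratio b/a in the Tractor ellipticity convention.
    let q = (1.0 - e) / (1.0 + e);

    // Assumption for this sketch: shape_r is used as the semi-major axis.
    let a = shape_r;
    let b = q * a;

    // Defensive check after derivation.
    if !q.is_finite() || q <= 0.0 || !b.is_finite() || b <= 0.0 {
        return None;
    }
    Some((a, b))
}
```

A round source (`shape_e1 = shape_e2 = 0`) keeps `b == a`, while any NaN or infinite component yields `None` before it can reach the DLR or posterior computation.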


---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mcoughlin <2291947+mcoughlin@users.noreply.github.com>
@mcoughlin

Collaborator Author

@Theodlz The idea is that GHOST gives you a metric for associating transients with hosts that takes morphology into account. Because most galaxies appear as ellipses on the sky, shape matters a lot when doing the association.
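To make the point about shape concrete, here is a minimal sketch of a directional light radius (DLR) calculation. It is an illustration under stated assumptions, not the module's code: the function names are hypothetical, the galaxy is modelled as an ellipse with semi-axes `a` and `b` in arcsec, and `gamma` is the angle between the major axis and the direction to the transient.

```rust
/// Hypothetical helper: radius of an ellipse with semi-axes `a` and `b`
/// (arcsec) along a direction making angle `gamma` (radians) with the
/// major axis. Returns None for non-finite or non-positive input.
fn directional_light_radius(a: f64, b: f64, gamma: f64) -> Option<f64> {
    if !(a.is_finite() && b.is_finite() && gamma.is_finite()) || a <= 0.0 || b <= 0.0 {
        return None;
    }
    // Polar form of an ellipse centred on the origin:
    // r(gamma) = a*b / sqrt((a sin gamma)^2 + (b cos gamma)^2)
    let denom = ((a * gamma.sin()).powi(2) + (b * gamma.cos()).powi(2)).sqrt();
    Some(a * b / denom)
}

/// Hypothetical helper: angular separation divided by the DLR, so the
/// offset is measured in units of the galaxy's extent toward the transient.
fn normalized_offset(sep_arcsec: f64, a: f64, b: f64, gamma: f64) -> Option<f64> {
    if !sep_arcsec.is_finite() || sep_arcsec < 0.0 {
        return None;
    }
    directional_light_radius(a, b, gamma).map(|dlr| sep_arcsec / dlr)
}
```

For a galaxy with `a = 4.0` and `b = 1.0`, a transient 2 arcsec away has a normalized offset of 0.5 along the major axis (`gamma = 0`) but about 2.0 along the minor axis (`gamma = π/2`), even though the angular separation is identical. A purely positional crossmatch would treat both cases the same.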

@github-actions

Throughput results (694708dfaf461f75e6572bb1dc2a3376c2f5f422):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 187.7 | 183.2 | 2.00% |

@github-actions

Throughput results (5f0d67a43930498575d778f1fa3a1c07e2c60a17):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 185.6 | 183.2 | 1.00% |

@github-actions

Throughput results (471c531c20602b07d28458e5f60e56f0607d8f76):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 189.2 | 183.2 | 3.00% |

@github-actions

Throughput results (9f897e58f1f7f7e18b0465a5ce9ff559b85b8a98):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 196.6 | 183.2 | 7.00% |

@mcoughlin mcoughlin requested a review from Theodlz March 12, 2026 12:41