[WIP] Add watchlist support for filters.#457

Open
antoine-le-calloch wants to merge 14 commits into main from add_watchlist
@antoine-le-calloch antoine-le-calloch commented May 1, 2026

A watchlist is a Mongo collection prefixed with watchlist_. It acts as:

  1. A queryable catalog (via API, subject to per-user ACL)
  2. A crossmatch source for alerts
  3. A filter gate

Access is controlled via per-user ACLs.


How crossmatch works

When a new alert is ingested, it is crossmatched against every watchlist
configured under crossmatch.<survey>. On a positional match (within the
configured radius), the alert's objectId is $addToSet-ed onto the
matched watchlist document, under matching_<survey>_objects:

// in the watchlist_<name> document
matching_ztf_objects:  [ "ZTF...",  ]
matching_lsst_objects: [  ]

So the watchlist itself accumulates the list of survey objects that fell
within its entries. This is the array a bound filter later $lookups
against (see Behavior). Alerts that existed before a watchlist was added
are not matched by live ingest; backfill them with the reprocess_crossmatch
binary (Setup step 4).
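
To make the positional match concrete, here is a minimal sketch in Python (chosen for brevity; it is not the code from this PR). It assumes watchlist entries and alerts carry `ra` / `dec` in degrees and that the radius is given in arcseconds; those field names and units, and the helper names, are illustrative assumptions. Only the `$addToSet` onto `matching_<survey>_objects` is taken from the description above.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def crossmatch_update(survey, object_id):
    """Build the update document that records a match on a watchlist entry."""
    return {"$addToSet": {f"matching_{survey}_objects": object_id}}


def match_alert(alert, entries, survey, radius_arcsec):
    """Return (entry _id, update document) pairs for entries within the radius."""
    radius_deg = radius_arcsec / 3600.0
    return [
        (entry["_id"], crossmatch_update(survey, alert["objectId"]))
        for entry in entries
        if angular_separation_deg(
            alert["ra"], alert["dec"], entry["ra"], entry["dec"]
        ) <= radius_deg
    ]
```

Each returned pair would correspond to one `update_one({"_id": ...}, update)` call against the watchlist collection.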


Access Control

  • Public catalogs → always accessible
  • watchlist_* catalogs → accessible only if:
    • user is admin, or
    • listed in User.watchlist_access

Unauthorized users receive 404. (Watchlists are queryable via the
find / count / pipeline endpoints, but not via cone_search.)
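
The access rule above can be sketched as a single predicate. This is an illustrative Python version, not the PR's code: `catalog_accessible` is the helper name mentioned in the review summary, but the signature, the `is_admin` flag and the dict-shaped user are assumptions made for the example.

```python
WATCHLIST_PREFIX = "watchlist_"


def catalog_accessible(catalog, user=None):
    """Return True if `user` (None = unauthenticated) may query `catalog`."""
    if not catalog.startswith(WATCHLIST_PREFIX):
        return True  # public catalogs are always accessible
    if user is None:
        return False  # watchlists are never visible without an authenticated user
    return bool(user.get("is_admin")) or catalog in user.get("watchlist_access", [])


def status_for(catalog, user=None):
    """Unauthorized access is reported as 404 rather than 403."""
    return 200 if catalog_accessible(catalog, user) else 404
```

Returning 404 instead of 403 means an unauthorized user cannot tell a private watchlist apart from a collection that does not exist.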


Setup

  1. Create a Mongo collection: watchlist_<name>

  2. Add it to config.yaml under crossmatch.<survey>, with a
    projection selecting the fields to surface on matched alerts

  3. Restart alert workers

  4. Backfill existing alerts: run the reprocess_crossmatch binary with the
    watchlist name(s) passed via --catalogs (comma-separated; each must be
    declared under crossmatch.<survey>). It loops over the watchlist entries
    and $addToSets the object_ids of every alerts_aux record within radius
    onto the watchlist document (matching_<survey>_objects). The operation is
    idempotent, so it is safe to re-run alongside live ingest:

    reprocess_crossmatch --survey ztf --catalogs watchlist_supernovas

  5. Grant access:

    PATCH /users/{user_id}/watchlist_access
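
The idempotency claim in step 4 follows from `$addToSet` semantics: a value is appended only if the array does not already contain it. The sketch below is an in-memory Python stand-in for that operator (not MongoDB itself, and not code from this PR), showing that a backfill and live ingest can overlap without producing duplicates. The object IDs are placeholders.

```python
def add_to_set(doc, field, values):
    """In-memory stand-in for MongoDB's $addToSet applied to each value."""
    existing = doc.setdefault(field, [])
    for value in values:
        if value not in existing:  # already-present values are left untouched
            existing.append(value)
    return doc


entry = {"_id": 1}
# Backfill pass, a live-ingest match, then the backfill re-run.
add_to_set(entry, "matching_ztf_objects", ["obj_a", "obj_b"])
add_to_set(entry, "matching_ztf_objects", ["obj_b", "obj_c"])
add_to_set(entry, "matching_ztf_objects", ["obj_a", "obj_b"])
```

After all three calls the array holds each object ID exactly once, in first-seen order.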

Using Watchlists in Filters

Filters can bind to a watchlist:

{
  "watchlist": "watchlist_<name>"
}

Validation (at filter creation) ensures:

  • correct watchlist_ prefix
  • collection exists + user has access
  • watchlist is configured for crossmatch on the filter's survey
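
As a rough illustration of the three checks, the following Python sketch validates a binding against plain sets. The function name, its parameters and the error strings are hypothetical; the real validation lives in the Rust route handler and may differ in order and wording.

```python
def validate_watchlist_binding(name, *, collections, allowed, configured):
    """Return an error message, or None if the binding is valid.

    collections: names of collections that exist in the database
    allowed:     watchlists the requesting user may access
    configured:  catalogs declared under crossmatch.<survey> for the filter's survey
    """
    if not name.startswith("watchlist_"):
        return "watchlist name must start with 'watchlist_'"
    # Missing and inaccessible are reported identically so that the response
    # does not reveal whether a private watchlist exists.
    if name not in collections or name not in allowed:
        return "watchlist not found"
    if name not in configured:
        return "watchlist is not configured for crossmatch on this survey"
    return None
```

Folding "does not exist" and "no access" into one message is an assumption that mirrors the 404 behavior described under Access Control.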

Behavior

When a filter binds a watchlist, the loader injects:

  • A $lookup joining the alert objectId against the watchlist
    document's matching_<survey>_objects array

  • A $match keeping only alerts with at least one match

  • A $set surfacing the matched watchlist docs (projected to the
    configured fields) under annotations.watchlist:

    annotations.watchlist = [ { …projected fields… },  ]

→ only watchlist-matched alerts pass, each carrying its watchlist data.
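
One plausible shape for the injected stages, written as Python dicts in the usual aggregation-pipeline form. This is a sketch, not the pipeline the loader actually emits: the temporary field `_watchlist_matches`, the `objectId` local field and the function name are assumptions, and combining `localField` / `foreignField` with a `pipeline` in `$lookup` requires MongoDB 5.0 or later.

```python
def watchlist_stages(watchlist, survey, projection):
    """Build the $lookup / $match / $set stages for a watchlist-bound filter."""
    tmp = "_watchlist_matches"  # hypothetical temporary field name
    return [
        {
            "$lookup": {
                "from": watchlist,
                "localField": "objectId",
                # Equality against an array field matches any element of it.
                "foreignField": f"matching_{survey}_objects",
                "pipeline": [{"$project": projection}],
                "as": tmp,
            }
        },
        # Keep only alerts whose lookup produced at least one watchlist doc.
        {"$match": {f"{tmp}.0": {"$exists": True}}},
        # Surface the projected watchlist docs on the alert.
        {"$set": {"annotations.watchlist": f"${tmp}"}},
    ]
```

For example, `watchlist_stages("watchlist_supernovas", "ztf", {"name": 1})` yields stages that could be placed ahead of the user's own filter pipeline; a real implementation would probably also drop the temporary field afterwards with `$unset`.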


Kafka Routing

All filter output goes to the standard results topics
(ZTF_alerts_results / LSST_alerts_results) regardless of watchlist.
There is no dedicated per-watchlist topic; isolation is handled at the
data level (the $match) and via ACL at filter creation.
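
In other words, the destination topic depends only on the survey. A tiny Python sketch of that rule, with a hypothetical function name and lookup table (only the two topic names come from the description):

```python
RESULTS_TOPICS = {"ztf": "ZTF_alerts_results", "lsst": "LSST_alerts_results"}


def output_topic(survey, watchlist=None):
    """Return the results topic; the watchlist binding is deliberately ignored."""
    return RESULTS_TOPICS[survey.lower()]
```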


Summary

Watchlists provide:

  • data-level filtering (via crossmatch + $lookup/$match)
  • annotation enrichment (matched docs in annotations.watchlist)
  • access control (via ACL, enforced at filter creation)

All with a single convention: watchlist_*.

@antoine-le-calloch antoine-le-calloch self-assigned this May 1, 2026
Copilot AI review requested due to automatic review settings May 1, 2026 05:10
@antoine-le-calloch antoine-le-calloch moved this to In Progress in BOOM Dev May 1, 2026

Copilot AI left a comment

Pull request overview

Adds “watchlist” support across the API and filter workers so filters can be bound to watchlist_* catalogs, enforcing per-user access control and routing matched alerts to watchlist-named Kafka topics.

Changes:

  • Introduces watchlist ACLs on User and enforces watchlist visibility/access across catalog + query endpoints.
  • Extends filters with optional watchlist binding, injects crossmatch gating in filter pipelines, and routes output per watchlist/topic.
  • Updates filter worker plumbing/tests to handle per-topic fan-out.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/test_ztf.rs | Adjusts assertions for new “alerts grouped by topic” filter output shape. |
| tests/test_lsst.rs | Adjusts assertions for new “alerts grouped by topic” filter output shape. |
| tests/test_filter.rs | Updates build_loaded_filter calls to include a default Kafka topic parameter. |
| src/utils/testing.rs | Updates test filter insertion to include watchlist: None. |
| src/filter/ztf.rs | Updates ZTF worker to load filters with a default topic and return alerts grouped by destination topic. |
| src/filter/lsst.rs | Updates LSST worker to load filters with a default topic and return alerts grouped by destination topic. |
| src/filter/mod.rs | Re-exports group_alerts_by_topic helper. |
| src/filter/base.rs | Adds Filter.watchlist, LoadedFilter.output_topic, watchlist $match injection at load time, per-topic grouping helper, and per-topic Kafka send loop. |
| src/bin/api.rs | Registers the new PATCH /users/{id}/watchlist_access route. |
| src/bin/add_filter.rs | Adds CLI support for binding a filter to a watchlist (basic prefix + existence validation). |
| src/api/routes/users.rs | Adds watchlist_access to User, implements catalog access helper, and introduces admin endpoint to set watchlist ACLs. |
| src/api/routes/filters.rs | Validates watchlist bindings on filter creation (prefix, existence/access, crossmatch config) and persists watchlist on filters. |
| src/api/routes/catalogs.rs | Enforces catalog visibility/access for listing/indexes/sample; hides unauthorized watchlists. |
| src/api/routes/queries/pipeline.rs | Enforces catalog access checks with the authenticated user. |
| src/api/routes/queries/find.rs | Enforces catalog access checks with the authenticated user. |
| src/api/routes/queries/count.rs | Enforces catalog access checks with the authenticated user. |
| src/api/routes/queries/cone_search.rs | Switches to catalog_accessible but currently uses unauthenticated access mode (None). |
| src/api/db.rs | Ensures admin user initialization preserves/sets watchlist_access. |
| src/api/catalogs.rs | Adds watchlist prefix constant and centralizes “visible vs accessible” catalog checks (including ACL gating). |

Comment thread src/bin/api.rs
Comment thread src/api/routes/queries/cone_search.rs
Comment thread src/bin/add_filter.rs
Comment thread src/api/routes/users.rs Outdated
Comment thread src/filter/base.rs Outdated
Comment thread src/filter/base.rs
Comment thread src/filter/base.rs Outdated
github-actions Bot commented May 1, 2026

Throughput results (8b25bd667279d0c68f3731bd33dfb5d4a0815ba3):

| New wall time | Baseline wall time | Difference |
| --- | --- | --- |
| 244.4 | 244.9 | 0% |

github-actions Bot commented Jun 8, 2026

Throughput results (8ca5c8cd9659a79016caa5a51c7bbb6b743774ef):

| Storage | New wall time | Baseline wall time | Difference |
| --- | --- | --- | --- |
| mongo | 248.5 | 246.3 | 0% |
| s3 | 273.9 | 275.1 | 0% |

Baseline run: 27139027476

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

Comment thread src/filter/base.rs
Comment thread src/filter/base.rs
Comment thread src/filter/base.rs
Comment thread src/bin/add_filter.rs
@github-actions

Throughput results (c662c3483f50bad5ef27237fda48b94f04bc2a89):

| Storage | New wall time | Baseline wall time | Difference |
| --- | --- | --- | --- |
| mongo | 248.0 | 234.1 | 5.00% |
| s3 | 294.2 | 294.1 | 0% |

Baseline run: 27294071934

@github-actions

Throughput results (9619e75df5b9d676b1618b691961ee03f6e5e920):

| Storage | New wall time | Baseline wall time | Difference |
| --- | --- | --- | --- |
| mongo | 249.0 | 234.1 | 6.00% |
| s3 | 294.1 | 294.1 | 0% |

Baseline run: 27294071934
