change PYON format to be JSON #64

jordens · 2025-06-27T14:50:18Z

This addresses several long-standing issues:

No more eval() in decode() which is a step towards making the stack safer
PYON v2 now is JSON. This makes it easier for external non-Python code (e.g. our JS/TS or Rust code) to talk to sipyco/ARTIQ interfaces. The format is compatible with JSON-RPC v1 class hinting but that fact is not used, guaranteed, or tested.
Support for custom types (superseding PYON: add plugin support #22 and allowing pyon: Add xarray object encoding and decoding. #27 to be implemented by the user)
Performance improvements: for the mid-size unittest data which is very heavy on low performing special cases, the relative time is 0.9 for encode() and 0.25 for decode() (lower than one is faster than before). For the large twitter.json dataset which is pure JSON the relative time is 0.3 for encode() and 0.07 for decode(). This has a significant impact since the ratio of decode() per encode() tends to be larger than one (in ARTIQ). encode() performance could probably be improved further with orjson.

The format is a bit more verbose, especially with pretty printing.

A PYON v1 -> v2 converter is included.

A data loss scenario in store_file() has been fixed that may have been involved in some corruption cases of the UI state files.

Several more unittests have been added.

Testing in ARTIQ is ongoing but apart from the migration and documentation I don't see a technical hurdle there. The API is backwards compatible, but since the wire format changes, it'll need some coordination.

Obviously this breaks using existing data (ARTIQ: dataset db, ui state files, devargs overrides, PYON arguments, PYON literals in experiments/CLI tools/GUI fields, HDF5 file contents, ...) or talking to sipyco interfaces running PYON v1.
Auto-migration would only be one-way, only works for files, doesn't work on the two-way sipyco interfaces, and roll-back is tricky since the user doesn't really track what's being migrated.
A transparent v1 fall-back is also only possible/helpful for decode().
One could do shadow or proxy interfaces for sipyco rpc/sync_struct etc. to ease the migration in distributed/heterogeneous deployments but that is quite a bit of work to implement and ultimately the coherent upgrade will require the same amount of user work.

This replaces non-JSON serde with JSON-compliant wrappers. The wrapping is fundamentally compatible with the class hinting of JSON-RPC v1 and looks like `{"__jsonclass__": ["CONSTRUCTOR", ARGUMENTS]}`. JSON-RPC v1 properties are ignored. Types that are native to JSON (None, bool, float, int, str, list, Dict[str, Any]) are encoded without hints. Also gets rid of the `eval()` and associated risks. The intrinsic DOS attacks remain unmitigated (c.f. Python JSON warnings). This changes the errors reported on parsing invalid PYON. Support for several more numpy scalar data types has been added.

used in artiq dashboard

dnadlinger · 2025-07-07T16:44:05Z

Ouch, the lack of backwards-compatibility across RPC would make this the single most widely breaking change in the last ~8 years for us. Of course, the big advantage of having an RPC boundary between components is that it decouples their versioning and implementation, but that's out of the window if the format changes, so any such change should be carefully considered.

To illustrate, we have many shared pieces of infrastructure between experiments/labs that use sipyco.pc_rpc for communication – some of them embedded devices still running Python 3.5 and pyon from artiq.protocols –, for which a concerted upgrade is challenging as it would involve multiple labs with their own timelines, and modifying well-matured infrastructure that hasn't been touched in a long time.

As you say, "one could" implement transparent fallback for RPC as well – in my opinion, this is the only sensible option, and this change should be rolled back until then.

jbqubit · 2025-07-07T20:11:14Z

I agree with @dnadlinger that transparent fallback to the original protocol is important.

jordens · 2025-07-07T22:34:37Z

To set a couple things straight: Python 3.5 has been EOL for five years. Sipyco broke compat with Python 3.6 three years ago already. Merging this doesn't impact anyone yet as no downstream user has switched.

It's unreasonable to demand continued support for pieces of postulated, untestable, and proprietary abandonware from eight years ago (kudos for the euphemism "well-matured infrastructure"). It's disproportionately unfair to all other current and future users who need the performance, safety, interoperability, and flexibility benefits this brings and have invested into an agile and maintainable infrastructure for everyone's benefit. It shifts the burden of your own local infrastructural idiosyncracies and inflexibility upstream to future development that would need to maintain v1 interfaces/v1-v2 proxies.

Conversely it seems plausible that a lab that has chosen to set up and maintain such an infrastructure also has educated and resourceful expert users well versed in maintaining their own forks and able to choose between any of the other options of (a) reducing their code baggage and organizing an upgrade of their legacy applications (b) adding to their ARTIQ patch set to stick with PYON v1, or (c) implement and maintain a v1-v2 proxy.
If you are unable to touch, wrap, or maintain your own infrastructure you are guaranteed to become terribly stuck eventually and unable to follow.

I'll think about a proxy or keeping a v1 client option but I have zero love for it. You need to be aware that you are likely the only one truly desiring, testing, maintaining, and depending on it in such a way. I suspect almost everyone else demanding it doesn't have a case. And I can see no reason why even you would need a transparent fallback. What you describe is served with running a pc_rpc proxy or setting a pc_rpc client option. And I see no need to revert this PR. The transparent fallback I described is already in the ARTIQ PR.

dnadlinger · 2025-07-08T00:22:14Z

As the former maintainer of an infrastructure-type open-source project – probably considerably more widely used than ARTIQ –, I do believe I understand the tension between agility/avoidance of technical debt on the developer side and backwards-compatibility on the user side the quite well. Accordingly, for mostly self-contained, even somewhat gratuitous changes in storage formats, APIs, etc., I had zero complaints even if they caused code churn for our local tree – for which I insist on the diff being kept minimal for exactly that reason –, or if they needed special precaution in libraries like ndscan that need to work across a range of ARTIQ versions.

However, this change is uniquely cross-cutting in that it "virally" requires every single component in the distributed system to be upgraded at the same time, in a way that's neither forward or backward-compatible. You will be aware that one can't load two versions of a given package (sipyco) into the same Python process.

Contrary to what you are insinuating, it is widely recognised that such "all-or-nothing upgrades" can be an infeasible proposition even in the most professionally-run distributed systems; see e.g. the support of many messaging libraries (Protobuf/Thrift/…) for schema evolution features such as optional fields. Or take most mature, widely-used library/infrastructure-type projects out there as an example (such as Python itself): backwards-incompatible changes go through a carefully considered phase of deprecations before eventual removal. In situations where a change absolutely can't be made in a way that allows a graceful overlap period of coexistence (unavoidable name collisions, etc.), people tend to go to great lengths to provide a smooth migration path (even making such undesirable compromises like deprecating A in favour of B only to then rename B back to A with another deprecation period). In contrast, this attempt doesn’t seem particularly thought through, beyond “that is quite a bit of work” and “I have zero love for it”. Welcome to the fun world of working on software that has a non-trivial user base who interact with it on a source-code level.

I do understand the benefits of infrastructure agility. Most things here are on reproducible, easily modified environments through Nix or Poetry (-> uv), precisely so we can thoroughly test and roll out new patches quickly. But this makes an all-at-once migration only marginally more feasible. Changing the RPC protocol while deliberately avoiding any chance for compatibility forces a complete split of the universe into a before and after (with all the planning issues in terms of lab schedules, and even discounting that, annoyances with external dependencies, when bisecting issues across the change, momentarily downgrading for one experiment but not another), for no reason other than momentary convenience. Keep supporting both seamlessly for a major release, deprecate old PYON in the next, and remove it in the one after.

And I can see no reason why even you would need a transparent fallback.

Because users might end up wanting to mix old and new code in the same Python process/environment – they might be consuming drivers/libraries/… from various sources, some of which are aware of the change and some of which are not. Requiring an extra option to be passed or renamed function to be called to achieve backwards compatibility is the worst of both worlds from this perspective. Having the new implementations in sipyco.{pyon,pc_rpc}_v2, or behind a protocol=2 flag or similar would be the preferred "polarity". This way, only code that is aware of the new protocol version needs to, and can appropriately, deal with potential incompatibilities, such that client code can be incrementally migrated over even if you wanted to avoid having a transparent fallback.

And I see no need to revert this PR. The transparent fallback I described is already in the ARTIQ PR.

I think it is clear that this PR falls short, and all but blocks anybody from tracking and testing ARTIQ master outside the smallest, most monolithic environments. Just backing it out until you’ve found a way to address the compatibility issue is the obvious way to go, just like when breaking accidentally breaking master in the good old days before automated pre-commit-checks. I get that you are excited about https://github.com/quartiq/vscode-artiq et al. (and moving from almost-JSON to actually-JSON is a welcome change in the abstract!), but the wider ARTIQ community shouldn’t be left to pick up the pieces.

jordens added 5 commits June 26, 2025 18:01

pyon: support custom types

fbf30e1

pyon: add flush+fsync to store_file

b6d6ce1

pyon: add v1-to-v2 converter

d1ad0a6

pyon: assert __jsonclass__ absence

8562b9f

jordens changed the title ~~change PYON format to be JSON compatible~~ change PYON format to be JSON Jun 27, 2025

jordens added 5 commits June 27, 2025 18:28

pyon: style, docs

200905c

pyon: test v1, repr, various errors, keyword passthrough

360a042

pyon: keep _numpy_scalar name

202dabc

used in artiq dashboard

pyon: test all npscalars

723dd43

pyon: tweak errors, use reshape-copy

87f2dd8

jordens merged commit db78b05 into master Jul 7, 2025

This was referenced Jul 7, 2025

sipyco with PYON v2 m-labs/artiq#2804

Merged

PYON: add plugin support #22

Closed

pyon: Add xarray object encoding and decoding. #27

Open

jordens deleted the json branch July 8, 2025 12:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

change PYON format to be JSON #64

change PYON format to be JSON #64

Uh oh!

jordens commented Jun 27, 2025 •

edited

Loading

Uh oh!

dnadlinger commented Jul 7, 2025 •

edited

Loading

Uh oh!

jbqubit commented Jul 7, 2025

Uh oh!

jordens commented Jul 7, 2025

Uh oh!

dnadlinger commented Jul 8, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

change PYON format to be JSON #64

change PYON format to be JSON #64

Uh oh!

Conversation

jordens commented Jun 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dnadlinger commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbqubit commented Jul 7, 2025

Uh oh!

jordens commented Jul 7, 2025

Uh oh!

dnadlinger commented Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jordens commented Jun 27, 2025 •

edited

Loading

dnadlinger commented Jul 7, 2025 •

edited

Loading

dnadlinger commented Jul 8, 2025 •

edited

Loading