Respect autosave setting in RTC backend #479

Darshan808 · 2025-04-25T19:15:32Z

Fixes jupyterlab/jupyterlab#14619

Previously, documents were always written to disk on changes, even if autosave was disabled. This PR fixes that by:

Sending the autosave setting from each client via document awareness.
Skipping disk writes on doc changes if all connected clients have autosave disabled.
Writing to disk only when autosave is enabled by at least one client.

The related PR in jupyterlab re-enables manual save in RTC mode. This allows users to save explicitly when autosave is off. On manual save, the frontend sends a save_to_disc message via the document's WebSocket provider, triggering a backend save.

Note: I didn't find a way to determine which client made the change. So if any connected client has autosave enabled, the document will be saved.
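To make the rule above concrete, here is a minimal sketch of the decision, not the code in this PR. The function name `should_save_on_change` and the shape of `client_states` (client ID mapped to that client's awareness state) are assumptions made for illustration; only the `autosave` awareness field comes from the description.

```python
from typing import Any, Mapping


def should_save_on_change(client_states: Mapping[int, Mapping[str, Any]]) -> bool:
    """Return True if at least one connected client has autosave enabled.

    `client_states` is a hypothetical mapping from client ID to that
    client's awareness state; it is not the actual backend data structure.
    """
    # Assumption: a client that has not reported a value is treated as
    # having autosave enabled, mirroring the frontend's default of `true`.
    return any(state.get("autosave", True) for state in client_states.values())


# Two clients, only one of which has autosave enabled: the document is saved.
print(should_save_on_change({1: {"autosave": False}, 2: {"autosave": True}}))
# Every client has autosave disabled: changes are not written to disk.
print(should_save_on_change({1: {"autosave": False}, 2: {"autosave": False}}))
```

With no connected clients the sketch returns False; whether the backend should save in that case is left open here.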

Feedback on the approach is welcome!

github-actions · 2025-04-25T19:15:46Z

👈 Launch a Binder on branch Darshan808/jupyter-collaboration/fix-autosave

projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py

krassowski · 2025-04-25T20:26:06Z

packages/docprovider-extension/src/filebrowser.ts

+      // Force autosave to be true by default initially
+      if (docmanagerSettings) {
+        void docmanagerSettings.set('autosave', true);
+      }


Wouldn't this change the user's autosave setting? I think we could just take the value as-is because autosave is the default in lab and notebook.

Darshan808 · 2025-04-26

Oh, if autosave is set to true by default, then we don't need this.
Let me go ahead and remove it.

krassowski · 2025-04-25T20:29:34Z

> Note: I didn't find a way to determine which client made the change. So if any connected client has autosave enabled, the document will be saved.

I think this makes sense. If others agree, I wonder if we should make it clear in the UI or at the very least document it somehow.

davidbrochart · 2025-04-26T08:55:42Z

Having autosave enabled when at least one client wants it, and disabled only when all clients want it disabled, doesn't make sense to me; that's why there was no choice but to have autosave enabled in the first place. I have never seen this kind of behavior anywhere. Correct me if I'm wrong, but all collaborative applications have autosave enabled (Google Docs...)?

Darshan808 · 2025-04-26T11:30:16Z

I think we can decide how to handle autosave when multiple clients are working on a document. However, what makes the most sense to me is to be able to disable autosave when the extension is installed, but I am working alone on the document.

krassowski · 2025-04-27T12:10:18Z

> Having autosave enabled when at least one client wants it, and disabled only when all clients want it disabled, doesn't make sense to me; that's why there was no choice but to have autosave enabled in the first place. I have never seen this kind of behavior anywhere.

The difference is that you can install docprovider without using RTC, just to have:

server-side execution (and offline execution notifications which follows)
history timeline
better completion with jupyter-ai

Forcing non-RTC users to use autosave in that scenario is not good because they may run (and in fact have run, which is why we opened this PR) into I/O limitations, with large enough notebooks stalling high-performance workloads.

So I think what we want to achieve is to:

respect autosave preferences (on/off, interval) when only one user is connected (regardless of the number of windows open?)
possibly force autosave (if need be) when multiple users (not clients) connect; this could be enabled by a plugin in the collaboration-ui package, which could be disabled if users really don't like it and which could add a flair in the UI (toolbar? status bar?) to indicate that autosave is active, with an explanation on hover that it was auto-enabled because of multi-user RTC.

> Correct me if I'm wrong but all collaborative applications have autosave enabled (Google Docs...)?

Quoting my earlier comment from a month ago jupyterlab/jupyterlab#14619 (comment):

I think the pattern is to enforce autosave in cloud-synced documents, not in collaborative documents. Yes, most collaborative documents are cloud-synced nowadays, but in cases where they are not, the user still has to deal with their file system limitations, and auto-save can lead to inadvertent side effects, for example if users have file system watch scripts, such as expensive anti-virus scans performed on each modification (which is sometimes enforced, with no way for the user to disable it).

Databricks also ran into issues with autosaving large notebooks and automatically disables autosave for notebooks larger than 8 MB; see https://kb.databricks.com/notebooks/notebook-autosave

Here is an interesting pattern from OnlyOffice:

How saving works

You can decide when you want your changes sent to Document Server. Find the Autosaving option in the File tab -> Advanced settings:

If autosaving is on, your changes are sent to Document Server (the editors cache) automatically.
If it’s off you need to click the Save button to save your changes in the editors’ cache.

Saving during co-editing

The editors have two co-editing modes – Fast and Strict and they do have influence on autosaving.

In Strict mode, you lock the paragraph you are working on. Others can’t see your changes until you click the Save button, and you can’t see theirs. In this mode, when you click Save your changes are sent to Document Server as usual.

In Fast mode, you can see everything your co-authors are typing in real-time. In this mode, you don’t need to click Save at all – all the changes are saved automatically the second you stop typing. The Save button remains inactive.

https://www.onlyoffice.com/blog/2020/04/save-and-force-save-in-onlyoffice-never-lose-a-document

Another one is Collabora Office, where auto-save is enabled by default but can be disabled (and the interval can be configured).

krassowski · 2025-05-02T10:39:55Z

@davidbrochart did you have time to think about it a bit more? Any other thoughts/suggestions?

davidbrochart · 2025-05-05T10:03:09Z

projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py

@@ -291,6 +292,16 @@ async def on_message(self, message):
        """
        On message receive.
        """
+        if message == "save_to_disc":


I think that we should use a custom message type (see here). Maybe 2 followed by save?
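As a rough illustration of the suggestion, the sketch below tags a message with a leading type byte and recognizes a save request. Types 0 and 1 are the sync and awareness types used by the y-websocket protocol; the `CUSTOM = 2` value, the `save` payload, and both helper names are hypothetical, and the real framing (for example a length prefix before the payload) may differ.

```python
from enum import IntEnum


class MessageType(IntEnum):
    SYNC = 0       # document sync messages
    AWARENESS = 1  # awareness (presence) messages
    CUSTOM = 2     # hypothetical: the custom type suggested above


def encode_custom(command: str) -> bytes:
    """Prefix a UTF-8 command with the custom message type byte."""
    return bytes([MessageType.CUSTOM]) + command.encode("utf-8")


def is_save_request(message: bytes) -> bool:
    """Return True if `message` is a custom message carrying `save`."""
    return (
        len(message) > 0
        and message[0] == MessageType.CUSTOM
        and message[1:] == b"save"
    )


message = encode_custom("save")
print(message, is_save_request(message))
```

Keeping the type in the first byte lets a handler dispatch on it before looking at the payload, so sync and awareness messages are never mistaken for commands.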

davidbrochart · 2025-05-05T10:09:36Z

packages/docprovider/src/ydrive.ts

@@ -123,7 +126,7 @@ export class RtcContentProvider
      const provider = this._providers.get(key);

      if (provider) {
-        // Save is done from the backend
+        provider.wsProvider?.ws?.send('save_to_disc');


I think that we should use a custom message type (see here). Maybe 2 followed by save?
Also, should we wait for a reply indicating that the file has indeed been saved? Otherwise the following get will probably not return the state of the saved file.

Darshan808 · 2025-05-06

> Otherwise the following get will probably not return the state of the saved file.

True, but since the signal below is fired after each save from the server (due to the hash change) and the contents model is automatically updated with the new values, it may not be necessary to wait for the reply here to update the contents model.

jupyter-collaboration/packages/docprovider/src/ydrive.ts

Lines 198 to 200 in 9059310

this._ydriveFileChanged.emit({
  type: 'save',
  newValue: { ...model, hash: hashChange.newValue },

mlucool · 2025-05-08T13:42:41Z

FWIW I strongly agree with the idea that you don't always want auto-save with RTC. It seems to me that many things are benefiting from RTC's design of moving state to the backend that are not collaborative environments. In those cases, users would like Jupyter to work as it did before, but also allow for many of the benefits of moving state to the server side.

davidbrochart · 2025-05-08T14:05:07Z

Technically, no state is moved to the server when using CRDTs; the state is just distributed among all peers, the server being just one of them. But yes, in a future where jupyter-collaboration makes it into Jupyter core, one will use them even when "collaborating with oneself", and I can see an interest in deciding when to save to disk.

krassowski · 2025-05-08T16:26:00Z

packages/docprovider/src/ydrive.ts

+    const autosave =
+      (this._docmanagerSettings?.composite?.['autosave'] as boolean) ?? true;
+
+    sharedModel.awareness.setLocalStateField('autosave', autosave);


Should this also include autosaveInterval?

Darshan808 · 2025-05-13

Do you think we should also send the autosaveInterval to the backend?

The reason I ask is that currently, JupyterLab (frontend) automatically calls save after every autosaveInterval, which in turn triggers save_to_disc on the backend.
However, in the current implementation, when autosave is enabled the backend saves the document to disk on every document change, regardless of the configured autosave interval.

If we want to respect the autosave interval properly, I think we could consider removing the _on_document_change function on the backend.

jupyter-collaboration/projects/jupyter-server-ydoc/jupyter_server_ydoc/rooms.py

Lines 235 to 237 in 9059310

def _on_document_change(self, target: str, event: Any) -> None:
    """
    Called when the shared document changes.

Here's why:

When autosave is off, only manual saves are possible, so observing document changes for saving might not be necessary.

When autosave is on, the client will already call save at the configured interval, which will trigger save_to_disc. Thus, observing document changes on the backend may also be redundant in this case.

One potential caveat is that if multiple clients with different autosaveInterval values are connected to the same document, save_to_disc will still be called for each of them when their individual autosave timers trigger.
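To illustrate this caveat with made-up numbers, the sketch below lists when each client's timer would fire over a fixed window. The intervals (30 s and 120 s), the 240 s window, and the helper `save_calls` are all hypothetical and only meant to show how the calls add up.

```python
from typing import List, Sequence, Tuple


def save_calls(intervals_s: Sequence[int], window_s: int) -> List[Tuple[int, int]]:
    """Return sorted (time, client_index) pairs at which each client's
    autosave timer fires within the first `window_s` seconds."""
    calls = []
    for index, interval in enumerate(intervals_s):
        t = interval
        while t <= window_s:
            calls.append((t, index))
            t += interval
    return sorted(calls)


# Hypothetical setup: client 0 autosaves every 30 s, client 1 every 120 s.
calls = save_calls([30, 120], window_s=240)
print(len(calls))                  # 10 save requests reach the backend
print(len({t for t, _ in calls}))  # at 8 distinct moments
```

In this example the backend receives two requests at both 120 s and 240 s, so some form of de-duplication might be worth discussing alongside the interval question.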

Would love to hear your thoughts on this approach and whether you see any concerns or alternatives.

CC: @davidbrochart

krassowski · 2025-05-13

Maybe we should move the discussion to a new issue and address this in a separate PR?

Darshan808 · 2025-05-13

I think that is reasonable. We could create a new issue after this PR gets merged.

Darshan808 added 5 commits April 18, 2025 15:36

initial-try

c9d8dc8

prevent-always-autosave

789c874

fix-unused-import

40fba54

check_for_manual_save

19a0335

force-autosave-by-default

45b9ec2

Darshan808 mentioned this pull request Apr 25, 2025

Enable save in collaborative mode jupyterlab/jupyterlab#17508

Open

Darshan808 self-assigned this Apr 25, 2025

Darshan808 added the enhancement New feature or request label Apr 25, 2025

krassowski reviewed Apr 25, 2025

projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py (resolved)

krassowski reviewed Apr 25, 2025


Darshan808 added 4 commits April 26, 2025 16:35

Merge branch 'main' into fix-autosave

cecb9eb

remove-force-autosave

de5267e

fix-type-hints

28c1ac9

fix-autosave-in-tests

72fd92b

add-tests

7e84bf8

danyeaw mentioned this pull request May 1, 2025

Weekly Team Meetings: Jan–Jun 2025 jupyterlab/frontends-team-compass#266

Open

davidbrochart reviewed May 5, 2025


Darshan808 added 2 commits May 6, 2025 21:11

follow-y-protocol

6b94adb

add-update-value

9ddd24d

Darshan808 requested a review from davidbrochart May 6, 2025 15:51

krassowski reviewed May 8, 2025
