Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not that atomic, unfortunately #4

Open
staltz opened this issue Jan 20, 2020 · 6 comments
Open

Not that atomic, unfortunately #4

staltz opened this issue Jan 20, 2020 · 6 comments

Comments

@staltz
Copy link

staltz commented Jan 20, 2020

I've been receiving bug reports for Manyverse (hence we can assume Android is the OS), consistently about the same bug (and crash) related to atomic-file used in ssb-conn (specifically, in ssb-conn-db) which look like:

SyntaxError: Unexpected token , in JSON at position 132
    at JSON.parse (<anonymous>)
    at Object.decode (node_modules/atomic-file/codec/json.js:6:17)
    at /atomic-file/index.js:43:25

I myself have also experienced this on my Android device. I have opened conn.json on my Android device to see how it looks like at those declared positions, and it's usually some extraneous }\n,}}, (or something like that) added to the end of the file.

I don't know how those extra characters get added to the file, but I know that they are getting added, and this happens in a variety of devices. My suspicion is that the fs.rename hack isn't quite working, perhaps because Android is not POSIX compatible or because rename isn't always atomic.

@dominictarr
Copy link
Collaborator

can this be reproduced? reading those links I don't really see hard evidence that mv isn't ment to be atomic. POSIX is huge, so not being posix compliant is easy, and it sounds like even the android standard AtomicFile https://developer.android.com/reference/android/support/v4/util/AtomicFile.html seems like it does the same thing.

mv being atomic is really fundamental, so much stuff would break if you can't do that. I'd need to see much stronger evidence before I can believe that doesn't work. but there could be something else, like somehow the temp file is written twice, then copied?

@dominictarr
Copy link
Collaborator

sorry, I mean moved.

maybe check the hash of the temp file before moving?

@staltz
Copy link
Author

staltz commented Jan 29, 2020

I don't have a reproducible test case but I do know that it is happening to several people on several devices on Android. My usage of atomic-file is here: https://github.com/staltz/ssb-conn-db/blob/master/src/index.ts grep for _stateFile

@dominictarr
Copy link
Collaborator

If you where writing to the statefile from multiple processes it could maybe have a problem, or if you created more than one AtomicFile object with the same file?

that won't happen if you call set twice on the same object because it only does one write at a time.

@dominictarr
Copy link
Collaborator

how many bug reports have you received and when did they start? this code has not changed in ages, so it doesn't make sense that it has a bug but there have been no reports of bugs before.

@staltz
Copy link
Author

staltz commented Jan 30, 2020

If you where writing to the statefile from multiple processes it could maybe have a problem,

Definitely just one process. Manyverse runs as just one process, even node.js runs as a thread, not a separate process. Also, even if there were multiple SSB apps on Android, they would not share the same .ssb folder because at the moment there isn't a "home" ~/ equivalent folder, each app has its own folder which is strongly guaranteed by the OS that no other apps can read into it.

or if you created more than one AtomicFile object with the same file?

The code I showed https://github.com/staltz/ssb-conn-db/blob/master/src/index.ts actually can create two different AtomicFile objects, but these are for two clearly different files: gossip.json and conn.json.

how many bug reports have you received and when did they start?

I think it has "always" occurred in Manyverse now and then. The most recent report was this: https://gitlab.com/staltz/manyverse/issues/685 (note that that bug report is much more vague than this atomic-file issue, it's a win that we're able to have diagnose it at the current level of (im)precision even if we don't have a deterministic reproduction) for Manyverse version 0.1911.19, which had this package-lock.json, ssb-conn-db using [email protected], but flumeview-reduce uses its own copy of [email protected], for different files, I believe.

But the first report goes way back to October 2018 https://gitlab.com/staltz/manyverse/issues/190 (this was the first version of Manyverse), with the following package-lock.json using ssb-gossip and [email protected], so this rules out the specific code in ssb-conn-db since it was ssb-gossip back then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants