Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for windows NTFS compression on monerod startup (fixes #9617) #9664

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

eversinc33
Copy link
Contributor

Fixes #9617

If the data-dir is compressed and the lmdb folder is (check for both, because if data-dir is compressed and lmdb doesnt exist yet, it is likely that the compression is configured to be applied recursively):

image

If just the db folder is compressed:

image

If not compressed at all:

image

Just logging an ERROR message, just like if Fat32 is used, not aborting, because for a small blockchain this not be a critical issue.

Can we assume that "lmdb" stays the name of the db folder? Since it is hardcoded here. Otherwise, one would need to define it somewhere central. I assumed so, because BlockchainLMDB::get_db_name() also has it hardcoded.

@nahuhh
Copy link

nahuhh commented Dec 30, 2024

Should probably only need to check the data-dir/lmdb

example .bitmonero/lmdb.

Can we assume that "lmdb" stays the name of the db folder?

Testnet & stagenet will end up in subfolders like .bitmonero/testnet/lmdb

@eversinc33
Copy link
Contributor Author

eversinc33 commented Dec 30, 2024

Should probably only need to check the data-dir/lmdb

If there is no edge case where data-dir is created, but lmdb isnt yet, sure. Or I could change it to only check the data direcotry if the database directory ("lmdb") doesnt exist.

Testnet & stagenet will end up in subfolders like .bitmonero/testnet/lmdb

That should be care of with data_dir / "lmdb"

@eversinc33
Copy link
Contributor Author

Changed so that only the database directory is checked, and if it doesnt exist, it falls back to checking the data-dir.

@iamamyth
Copy link

Either the software supports a compressed database/data dir, or it doesn't (I'm not certain as to the policy here, though compression + memory mapping seems like a bad idea); logging an ERROR and proceeding with startup doesn't make sense to me, and doesn't "fix" much of anything.

@eversinc33
Copy link
Contributor Author

My reasoning was that this error only seems to become a problem if the blockchain exceeds 70GB (see #9569 (comment)), so its not a reason to halt if e.g. running for a smaller chain such as the testnet (or if the on disk chain becomes smaller in the future due to e.g. further pruning improvements). Im very open to discussions as I am just getting familiar with the codebase.

@iamamyth
Copy link

iamamyth commented Dec 30, 2024

I think you're on the right track here, but, conceptually, the blockchain grows forever, so allowing startup in a configuration which will eventually break wastes the feature detection and proves quite irritating for users (mainnet is the most critical use case and it exceeds the failure threshold). I may have uncovered the underlying problem: "If you compress a file that is larger than 30 gigabytes, the compression may not succeed." (https://learn.microsoft.com/en-us/windows/win32/fileio/file-compression-and-decompression). Rather than simply warning the user, or aborting, I suggest transparently fixing the problem in the BlockchainLMDB class:

  1. When creating the data directory in open, disable compression on the directory.
  2. Ensure the db file exists (i.e. touch the path). If the db file is compressed, prior to mdb_env_open, disable compression on the db file. This would fix existing databases going forward, and guards against cases where users put the system into an effectively broken state by fiddling with file attributes.

You can find info on the relevant windows APIs here:

Edit: Removed "maybe" qualifier from (2), as I think it's necessary because some users clone databases/data directories across systems, so the design shouldn't ignore existing, compressed database files.

@eversinc33
Copy link
Contributor Author

Very good points, I think this is the better approach. I will start implementing this in the new year and update this PR

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch from ceff273 to 50adf42 Compare January 1, 2025 20:55
@eversinc33
Copy link
Contributor Author

Force pushed with the new approach.

image

Disables compression on both the directory as well as the blockchain file if set. If not, continues silently. Let me know what you think.

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch from 50adf42 to ce21549 Compare January 1, 2025 21:00
@eversinc33 eversinc33 marked this pull request as draft January 2, 2025 08:32
@eversinc33
Copy link
Contributor Author

Thank you again for the great feedback. Please see the latest commit.

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch 2 times, most recently from ee9edf6 to 9ad2712 Compare January 2, 2025 13:34
@eversinc33 eversinc33 marked this pull request as ready for review January 2, 2025 13:35
@iamamyth
Copy link

iamamyth commented Jan 2, 2025

A clarification on handling directory-level compression for other reviewers: One could argue that if the directory already exists, the program should preserve its compression attribute; in fact, I suggested as much in earlier comments. However, upon further consideration, I think it makes sense to choose the most intrusive option (always uncompress the directory) because:

  1. It helps ensure consistent daemon behavior (as bug reports, and the Windows docs themselves, have illustrated, Windows filesystem compression is unreliable and non-transparent).
  2. If the choice proves unworkable, a future PR can downgrade it, whereas starting with the least intrusive option (mixed compression modes) needlessly adds complexity.

@iamamyth
Copy link

iamamyth commented Jan 2, 2025

Can you confirm that this modification runs normally on Windows and results in an an unencrypted data directory + db post-shutdown in the following startup scenarios?

  • Empty data directory
    • Encrypted
    • Not encrypted
  • Complete mainnet data directory
    • Encrypted
    • Not encrypted

@eversinc33
Copy link
Contributor Author

eversinc33 commented Jan 3, 2025

Can you confirm that this modification runs normally on Windows and results in an an unencrypted data directory + db post-shutdown in the following startup scenarios?

* Empty data directory
  
  * Encrypted
  * Not encrypted

* Complete mainnet data directory
  
  * Encrypted
  * Not encrypted

Yes, tested those configurations, as well as data-dir compressed but database-file not compressed and vice versa, with no issues.

Should I write an additional unit-test? How is the procedure, should unit-tests be provided for any added functionality? I develop from Linux, so I cross-compile and test manually, but I can set up a development environment on Windows to write a unit test if desired.

@iamamyth
Copy link

iamamyth commented Jan 3, 2025

In most projects, I'd add tests for new behavior. Here, the tests ordinarily don't run on windows, so extra tests don't do much. If you want to add tests, you have two mutually exclusive options:

  1. Add windows-only tests to tests/unit_test/blockchain_db.cpp (probably simpler and a technically reasonable choice).
  2. Add windows-only functional tests to tests/functional_tests (reflects the regression/integration aspect of the tests).

Copy link

@iamamyth iamamyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a failing build. It looks like you need to explicitly add:

#include <boost/filesystem/fstream.hpp>

Everything else looks good. Thanks for your work on this issue!

@iamamyth
Copy link

iamamyth commented Jan 3, 2025

One last thing: Probably a good idea to squash commits for merge.

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch from 85baef4 to feccf6b Compare January 3, 2025 20:02
@eversinc33
Copy link
Contributor Author

Fixed and squashed. Thanks again for the good feedback!

@selsta
Copy link
Collaborator

selsta commented Jan 14, 2025

Please also open this PR against release-v0.18 branch.

@tobtoht
Copy link
Collaborator

tobtoht commented Jan 14, 2025

Lots of extraneous whitespace in this patch.

@iamamyth
Copy link

Lots of extraneous whitespace in this patch.

@tobtoht Was this posted on the wrong PR? I don't see any extraneous whitespace.

@selsta
Copy link
Collaborator

selsta commented Jan 14, 2025

There are a lot of spaces at the end of some lines, would be good to fix before merging.

@nahuhh
Copy link

nahuhh commented Jan 14, 2025

It seems to emulate codestyle from 1377-1402

@iamamyth
Copy link

Now I see the whitespace (using CLI), I think the github web UI got worse and made it really hard to see. @eversinc33 Can you remove the extraneous end-of-line whitespace and squash?

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch from feccf6b to 12a48e8 Compare January 14, 2025 20:10
@eversinc33
Copy link
Contributor Author

Now I see the whitespace (using CLI), I think the github web UI got worse and made it really hard to see. @eversinc33 Can you remove the extraneous end-of-line whitespace and squash?

Sorry, not sure where that came from. Squashed and pushed

@eversinc33
Copy link
Contributor Author

Please also open this PR against release-v0.18 branch.

See #9702

@selsta
Copy link
Collaborator

selsta commented Jan 15, 2025

@eversinc33 Could you check again? This line for example still has whitespaces at the end of the line.

https://github.com/monero-project/monero/pull/9664/files#diff-94841b820d7dfb0ef2f631d635c246ea152fba2cf868b62a0dc084f04e9e2f6cR1364

@eversinc33 eversinc33 force-pushed the check-ntfs-compression branch from 12a48e8 to 4e54b19 Compare January 15, 2025 07:30
@eversinc33
Copy link
Contributor Author

eversinc33 commented Jan 15, 2025

Weird. I force pushed again, can you check now?

Also: Would there be interested in adding a trailing-whitespace check / linting into the CI pipeline?

Copy link
Collaborator

@selsta selsta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows NTFS compression causes crash during sync
6 participants