Skip to content

[fix] Use filepath instead of XDocument for md5hash (submarine)#16021

Open
Vam-Jam wants to merge 2 commits intoFakeFishGames:masterfrom
Vam-Jam:fix/SubHashTask
Open

[fix] Use filepath instead of XDocument for md5hash (submarine)#16021
Vam-Jam wants to merge 2 commits intoFakeFishGames:masterfrom
Vam-Jam:fix/SubHashTask

Conversation

@Vam-Jam
Copy link

@Vam-Jam Vam-Jam commented May 3, 2025

Prior to this commit, it would load an XML file (ignoring white spaces, deserializing overhead etc), write the new XDocument into a string (Loading the game would peak at 150MB just for this function alone). It would then try to strip whitespaces again (which allocates another 50-70mb as it constructs a string builder with half the current string size) and finally hash it.

Instead, read the FilePath, strip whitespaces and hash. Very small load times test showed a reduction of roughly 3 seconds, most likely wont be as big in release mode. But at very least, it will make GC more happy :)

This was tested on Linux with .NET 6


I've added a new commit that changes file hash calculations to stream the file, instead of loading them into memory as a string, stripping and then hashing. Unless there's a good reason for stripping whitespace from files before we hash, streaming them is more efficient.

Vam-Jam added 2 commits May 3, 2025 14:20
Prior to this commit, it would load an XML file (ignoring white spaces),
write XDocument into a string (Loading the game would peak at 150MB just
for this function alone). It would then try to strip whitespaces again
(which allocates another 50-70mb as it constructs a string builder with
half the current string size) and finally hash it.

Instead, read the FilePath (which could be cached by the OS already),
strip whitespaces and hash.

With this change, this function no longer allocates enough to appear in
DPA.
@Regalis11 Regalis11 self-requested a review May 5, 2025 11:06
@Regalis11 Regalis11 added the Code Programming task label May 5, 2025
Copy link
Collaborator

@Regalis11 Regalis11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! Finally got around to taking a proper look at this, and it does seem like a very nice improvement. I have one question though: in the description you say "instead, read the FilePath, strip whitespaces and hash". At what point is stripping the whitespaces done? Unless I'm missing something, that's no longer done at any point - the only place in the code where we seem to strip whitespaces is the CalculateForString method, which is no longer used. The StringHashOptions enum seems to also be unused now.

While I think it might be safe to get rid of stripping the whitespace and just do byte-perfect hash comparisons, I think there might be some oversight here if your intention was to strip whitespaces at some point during the hash calculation.

@Vam-Jam
Copy link
Author

Vam-Jam commented Aug 28, 2025

Sorry about the confusion (I forgot to cross-out the original description out). Originally whitespace were still being stripped because the original commit would read the file and pass it into CalculateForString with IgnoreWhitespace. The optimization was just skipping the XML parsing step (which also stripped whitespaces as XDocument defaults would do this automatically iirc).

With the latest commit, it's no longer relevant as we don't parse the xml anymore when trying to figure out the hash value, and we don't load the entire file into memory. There's no reason to strip whitespaces anymore, unless there's some weird edge case on mac/windows/linux (?). Happy to get rid of CalculateForString and StringHashOptions too, or mark them as obsolete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Code Programming task

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants