[fix] Use filepath instead of XDocument for md5hash (submarine)#16021
Vam-Jam wants to merge 2 commits into FakeFishGames:master
Prior to this commit, it would load an XML file (ignoring whitespace) and write the XDocument into a string (loading the game would peak at 150 MB for this function alone). It would then try to strip whitespace again (which allocates another 50-70 MB, as it constructs a string builder with half the current string size) and finally hash it. Instead, read the FilePath (which could be cached by the OS already), strip whitespaces and hash. With this change, this function no longer allocates enough to appear in DPA.
Thank you for the PR! Finally got around to taking a proper look at this, and it does seem like a very nice improvement. I have one question though: in the description you say "instead, read the FilePath, strip whitespaces and hash". At what point is stripping the whitespaces done? Unless I'm missing something, that's no longer done at any point - the only place in the code where we seem to strip whitespaces is the CalculateForString method, which is no longer used. The StringHashOptions enum seems to also be unused now.
While I think it might be safe to get rid of stripping the whitespace and just do byte-perfect hash comparisons, I think there might be some oversight here if your intention was to strip whitespaces at some point during the hash calculation.
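To make the trade-off concrete, here is a minimal sketch of what byte-perfect comparison means in practice. `Md5Hex` and `StripWhitespace` are hypothetical helpers written for this illustration and are not taken from the codebase. It assumes .NET 5 or later and two copies of the same XML that differ only in line endings.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: MD5 of the UTF-8 bytes of a string, as uppercase hex.
static string Md5Hex(string text)
{
    byte[] hash = MD5.HashData(Encoding.UTF8.GetBytes(text));
    return Convert.ToHexString(hash);
}

// Hypothetical stand-in for the whitespace stripping discussed above.
static string StripWhitespace(string text) =>
    new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray());

// The same document with Unix and Windows line endings.
string unixFile = "<Submarine>\n  <Item />\n</Submarine>";
string windowsFile = unixFile.Replace("\n", "\r\n");

// Byte-exact hashing treats the two copies as different content.
Console.WriteLine(Md5Hex(unixFile) == Md5Hex(windowsFile)); // False

// Hashing after stripping whitespace treats them as identical.
Console.WriteLine(Md5Hex(StripWhitespace(unixFile)) == Md5Hex(StripWhitespace(windowsFile))); // True
```

So dropping the stripping step is only safe if every client ends up with byte-identical files, for example when nothing rewrites line endings on checkout or download.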
Sorry about the confusion (I forgot to cross out the original description). Originally whitespace was still being stripped because the original commit would read the file and pass it into With the latest commit, it's no longer relevant, as we don't parse the XML anymore when computing the hash value, and we don't load the entire file into memory. There's no reason to strip whitespace anymore, unless there's some weird edge case on Mac/Windows/Linux (?). Happy to get rid of
Prior to this commit, it would load an XML file (ignoring whitespace, with deserialization overhead etc.) and write the new XDocument into a string (loading the game would peak at 150 MB for this function alone). It would then try to strip whitespace again (which allocates another 50-70 MB, as it constructs a string builder with half the current string size) and finally hash it.
Instead, read the FilePath, strip whitespaces and hash. A very small load-time test showed a reduction of roughly 3 seconds; it most likely won't be as big in release mode. But at the very least, it will make the GC happier :)
This was tested on Linux with .NET 6
I've added a new commit that changes file hash calculations to stream the file instead of loading it into memory as a string, stripping whitespace, and then hashing. Unless there's a good reason for stripping whitespace from files before we hash, streaming them is more efficient.
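For readers following along, a minimal sketch of what streaming the hash can look like in .NET. This is an illustration under assumptions, not the code from the commit: `Md5OfFile` is a hypothetical name, and the temporary file merely stands in for a submarine file.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (not the PR's actual code): hash a file's raw bytes
// by streaming it, so the whole file is never held in memory as a string.
static string Md5OfFile(string path)
{
    using var md5 = MD5.Create();
    using var stream = File.OpenRead(path);
    // ComputeHash(Stream) reads the stream to its end and hashes what it reads.
    byte[] hash = md5.ComputeHash(stream);
    return Convert.ToHexString(hash);
}

// Usage: write a small temporary file and hash it.
string tmp = Path.GetTempFileName();
File.WriteAllText(tmp, "<Submarine name=\"Example\" />");
Console.WriteLine(Md5OfFile(tmp));
File.Delete(tmp);
```

Because no XDocument and no intermediate string are built, the large allocations described earlier do not occur. The result depends on the exact bytes on disk, including whitespace and line endings.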