-
Notifications
You must be signed in to change notification settings - Fork 1.9k
chksum: run 256K benchmark on demand #17946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
mcmilk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fix the cosmetic issue.
But we will need to dig a bit deeper into the problem, why the bs256k variable is zero ;-)
amotin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I find it OK. I don't think saving here makes much sense if we may lose consistency. But before we merge this I think it would be good to understand the original reported problem, so that we would not hide it deeper, ending up with sub-optimal implementation selection.
ZFS-CI-Type: linux Signed-off-by: Alexx Saver <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Fix openzfs#17945
mcmilk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may rework the chksum_init() and maybe other functions.
I would try to fix this also.
|
I think zfs_chksum.c has gotten out of control, having spent too long even grokking it (including the new state machine, yay) let alone trying to find the underlying bug. 😁 So, I think the underlying bug is that the optimization in I think the real fix is to wrap this code: ... in ... but there's a lot of complication and smell here for something that optimizes a function that will normally only get called once ever. IMVHO. (If it were up to me I'd consider reverting the optimization.) |
|
Let us first get the code into a working state, worthy of the 2.4 release. No surprises. No experiments. @mcmilk, after that, you can redo it according to some new concept. @adamdmoss, thanks. I will take a look. |
ZFS-CI-Type: linux Signed-off-by: Alexx Saver <[email protected]> Co-authored-by: Adam Moss <[email protected]>
ca2011b to
950eb73
Compare
|
Well, now it works. We may leave it as it is for now. That is, without taking the governor into account. |
|
|
||
| #define AT_STARTUP 0 | ||
| #define AT_BENCHMARK 1 | ||
| #define AT_DONE 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove the AT_DONE state - when each reading of the benchmark file should create new statistics.
Motivation and Context
#17563 speeds up system boot by ensuring the 256K benchmark runs during boot, while others run on demand. However, for some reason, the 256K benchmark does not always run during boot. This patch forces the 256K benchmark to be run on demand as well, which should resolve #17945.
Description
One of two things must be true: either the 256K benchmark is not being run when the system boots, or the results of that benchmark are not being taken into account. If you force the 256K benchmark to run on demand, the data is displayed correctly.
It may not quite fit the concept, but it seems to solve the problem. Probably @mcmilk has some more thoughts on the topic.
However, there is an objection to the current concept. If it is true that the 256K benchmark result obtained during boot should be provided on demand, then values should become inconsistent when the governor is changed.
Probably we do not want the 256K benchmark values from the upper table to end up in the lower one.
We could try even more aggressively.
However, this is an intuitive line of reasoning. It has not been tested yet because of the issue, which is probably more important.
How Has This Been Tested?
The patch was tested on top of three current branches. If the 256K benchmark results were all zeros, they display correctly after applying the patch. However, it should be noted that the scenario where the 256K benchmark still runs during system boot and then runs again on demand was not tested. It could be that this PR does not solve the root cause of the problem. In any case, there is hope that this can be fixed without a major rewrite of the code.
Types of changes
Checklist:
Signed-off-by.