PT-2378 - extended FP precision in pt-table-sync #863

Open
wants to merge 3 commits into
base: 3.x


hpoettker
Contributor

The PR is intended to resolve this issue: https://perconadev.atlassian.net/browse/PT-2378

The tests added with pt-2378.t all fail with the current code base. With the proposed change, they run successfully. They test that both REPLACE and UPDATE statements are generated by pt-table-sync such that floating point numbers that are not intended to change indeed don't change.

The code is my own creation and it can be distributed under the GPL2 licence.

  • The contributed code is licensed under GPL v2.0
  • Contributor Licence Agreement (CLA) is signed
  • util/update-modules has been run
  • Documentation updated
  • Test suite update

@hpoettker hpoettker force-pushed the PT-2378_table_sync_with_more_fp_precision branch from c0f3c96 to 687f24b Compare September 15, 2024 20:18
@hpoettker
Contributor Author

I updated the PR such that the change is also included in lib/Quoter.pm.

This affected the source code of many tools. However, apart from pt-table-sync, the only tool that actually calls quote_val is pt-archiver, and pt-archiver does not set the is_float flag, so it isn't affected by the change. The change is therefore fully covered by the two tests in this PR.

@hpoettker hpoettker force-pushed the PT-2378_table_sync_with_more_fp_precision branch from 687f24b to 6cbe827 Compare September 15, 2024 22:26
@hpoettker hpoettker force-pushed the PT-2378_table_sync_with_more_fp_precision branch from 6cbe827 to ad350a2 Compare November 14, 2024 23:45
@hpoettker
Contributor Author

Thanks so much for the massive effort to adjust the code base to MySQL 8.4!

I've rebased the changes on the latest commit in 3.x.

Collaborator

@svetasmirnova svetasmirnova left a comment

The proposed change caused t/pt-table-sync/float_precision.t to fail because extra digits were added to the value 31.6: it was formatted as 31.600000000000001. As a result, the test "Sync rows with float values (bug 1229861)" failed.

I suggest making the fix more complicated. Please check my latest commit and let me know if you agree or not.
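
The extra digits described above can be reproduced outside the test suite. The following is a minimal sketch, not code from the PR; the helper name `fmt` is hypothetical and only wraps `sprintf`. It prints the same value with 15 and with 17 significant digits.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number with a given count of significant digits.
sub fmt {
    my ($digits, $value) = @_;
    return sprintf("%.${digits}g", $value);
}

for my $value (31.6, 0.1) {
    # 15 digits gives the short form; 17 digits exposes the binary rounding error.
    print "$value: 15 digits => ", fmt(15, $value),
          ", 17 digits => ", fmt(17, $value), "\n";
}
# 31.6: 15 digits => 31.6, 17 digits => 31.600000000000001
# 0.1: 15 digits => 0.1, 17 digits => 0.10000000000000001
```

Always formatting with 17 digits therefore changes the generated SQL even for values whose short form is already exact enough.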

hpoettker and others added 3 commits December 27, 2024 08:06
pt-table-sync now uses up to 17 decimal digits when writing
floating point numbers in the generated SQL statements.
This is necessary to prevent unintended data changes.
…ements with insufficient precision

- Modified proposed patch, so it does not add extra digits
use serialization error as criterion for additional precision
@hpoettker hpoettker force-pushed the PT-2378_table_sync_with_more_fp_precision branch from 37b9e2c to 634ed46 Compare December 27, 2024 07:08
@hpoettker
Contributor Author

I completely agree on the problem. With the extended precision, the number 0.1 also gets printed as 0.10000000000000001. Both strings are deserialized by MySQL into the same floating point number, so both representations are technically fine. But 0.1 is definitely better for the user experience and more consistent with the output of earlier versions of pt-table-sync.

I don't think it can be decided purely from the length of the string representation of a floating point number whether the full precision is required. Here is a counterexample for testing against a length of 15:

> $x = 1 + 1111000 / 9999999
> print($x)
1.11110001111
> print(length($x))
13
> printf("%.17g", $x)
1.1111000111100011
> print($x - 1.11110001111)
1.11022302462516e-15
> print($x - 1.1111000111100011)
0

I've changed the criterion to be a non-zero serialization error with the default format, which should work more broadly and captures the intent.
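
A minimal sketch of that criterion, for illustration only: the helper name `format_float` is hypothetical, and the actual change in quote_val in lib/Quoter.pm may be structured differently. The idea is to keep Perl's default formatting whenever it reads back as the same number, and to fall back to 17 significant digits otherwise.

```perl
use strict;
use warnings;

# Hypothetical helper, not the code from lib/Quoter.pm.
sub format_float {
    my ($value) = @_;

    # Perl's default stringification uses 15 significant digits.
    my $short = "$value";

    # Zero serialization error: the short form reads back as the same number.
    return $short if $short == $value;

    # Otherwise write 17 significant digits.
    return sprintf('%.17g', $value);
}

my $x = 1 + 1111000 / 9999999;
print format_float(0.1),  "\n";   # 0.1
print format_float(31.6), "\n";   # 31.6
print format_float($x),   "\n";   # 1.1111000111100011
```

With this check, 0.1 and 31.6 keep their short form, while the counterexample above gets the longer representation even though its default string is only 13 characters long.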

And I've rebased on the latest commit of 3.x. Please take another look.

@svetasmirnova svetasmirnova self-requested a review December 27, 2024 15:08