Table Space Bloat Due to Gh-ost #1484

Open
xiehaopeng opened this issue Dec 20, 2024 · 0 comments

Description:
We have observed significant tablespace bloat after using gh-ost to alter a table. The row count is essentially unchanged before and after the alter, but the tablespace grew by 60%. The amount of growth depends on the size of a single row, and it is most pronounced when a row is about 4500 bytes.

We believe the primary cause is that gh-ost interleaves "copy rows" and "apply events" during the migration. When INSERTs run concurrently on the original table, the ghost table receives a mix of sequential inserts (from the row copy) and out-of-order inserts (from applied binlog events). This mixed pattern frequently triggers InnoDB's insert-point split strategy. After such a split, the leaf pages contain unused space (page-level fragmentation), which wastes space and inflates the table size.
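To illustrate why a row size of about 4500 bytes could make this so visible, here is a rough back-of-the-envelope sketch. It assumes the default 16 KiB InnoDB page size and ignores page headers, per-row overhead, and secondary indexes, so the numbers are only indicative.

```python
# Rough page-utilization arithmetic; not a model of InnoDB internals.
PAGE_SIZE = 16 * 1024   # assumed: default innodb_page_size of 16 KiB
ROW_SIZE = 4500         # row size reported in this issue


def rows_per_page(page_size: int, row_size: int) -> int:
    """Maximum number of whole rows that fit in one page (overhead ignored)."""
    return page_size // row_size


def utilization(rows: int, row_size: int, page_size: int) -> float:
    """Fraction of the page occupied by `rows` rows."""
    return rows * row_size / page_size


max_rows = rows_per_page(PAGE_SIZE, ROW_SIZE)
for rows in range(max_rows, 0, -1):
    pct = utilization(rows, ROW_SIZE, PAGE_SIZE) * 100
    print(f"{rows} row(s) per page: {pct:.0f}% of the page used")
```

Under these assumptions at most three rows fit in a page, so every row missing from a leaf page after a split costs roughly 27% of that page. A table whose pages drop from three rows to an average of two would need about 1.5 times as many leaf pages for the same data.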

Steps to Reproduce:

  • Create a large test table of about 50GB in which each row is roughly 4500 bytes.
  • Rebuild the table with a native MySQL ALTER so that it starts out compact.
  • Write a script that simulates low-frequency INSERT traffic against the table.
  • Start the traffic simulation script, then run gh-ost while the script is still running.
  • After gh-ost completes, compare the size of the original table with the size of the temporary table.
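The comparison in the last step could be done along these lines. This is a minimal sketch: the schema and table names are hypothetical, and the `%s` placeholders assume a DB-API driver such as PyMySQL. The sizes in `information_schema.TABLES` are estimates for InnoDB, so comparing the `.ibd` file sizes on disk is a reasonable cross-check.

```python
def size_query(schema: str, tables: list[str]) -> tuple[str, list[str]]:
    """Build a parameterized query for the on-disk size of the given tables."""
    placeholders = ", ".join(["%s"] * len(tables))
    sql = (
        "SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE "
        "FROM information_schema.TABLES "
        f"WHERE TABLE_SCHEMA = %s AND TABLE_NAME IN ({placeholders})"
    )
    return sql, [schema, *tables]


def growth_percent(before_bytes: int, after_bytes: int) -> float:
    """Relative growth of the table after the migration, in percent."""
    return (after_bytes - before_bytes) / before_bytes * 100.0


# Hypothetical names: "big_table" is the migrated table, and "_big_table_del"
# is assumed to be the old copy that gh-ost left behind after cut-over.
sql, params = size_query("test", ["big_table", "_big_table_del"])
```

Passing `sql` and `params` to `cursor.execute` returns one row per table, and `growth_percent` can then be applied to the `DATA_LENGTH + INDEX_LENGTH` totals of the old and new tables.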

A Possible Improvement Idea:
Building on the feature for ignoring binlog events introduced in "feat binlog apply optimization" (#1378), gh-ost could dynamically extend the maximum boundary of the row copy and ignore INSERT events whose unique key is large, that is, beyond the range copied so far. This may avoid the problem.
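One possible reading of this idea, as a toy sketch only: it is not gh-ost code, it does not reflect how #1378 is implemented, and `CopyState` and `handle_insert_event` are hypothetical names. It assumes a single integer unique key and says nothing about UPDATE or DELETE events on rows that were skipped.

```python
from dataclasses import dataclass


@dataclass
class CopyState:
    copied_up_to: int  # highest key the row copy has already processed
    range_max: int     # upper boundary the row copy is planned to reach


def handle_insert_event(state: CopyState, key: int) -> str:
    """Decide whether an INSERT binlog event should be applied or ignored."""
    if key <= state.copied_up_to:
        # The row copy has already passed this key, so only the binlog
        # applier can bring the row into the ghost table.
        return "apply"
    # The key is ahead of the copy cursor: extend the copy boundary so the
    # row copy will reach it, and skip the out-of-order insert.
    state.range_max = max(state.range_max, key)
    return "ignore"


state = CopyState(copied_up_to=100, range_max=200)
decisions = [handle_insert_event(state, k) for k in (50, 150, 250)]
```

In this sketch the rows that were ignored reach the ghost table later through the sequential row copy, so they no longer arrive as out-of-order inserts.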
