-
Notifications
You must be signed in to change notification settings - Fork 6.1k
bindinfo: add last_used_date to track bindinfo usage frequency #63409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #63409 +/- ##
================================================
+ Coverage 72.7332% 75.5088% +2.7755%
================================================
Files 1848 1895 +47
Lines 498692 514696 +16004
================================================
+ Hits 362715 388641 +25926
+ Misses 113968 102798 -11170
- Partials 22009 23257 +1248
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
2 similar comments
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest-required |
pkg/session/bootstrap.go
Outdated
| source VARCHAR(10) NOT NULL DEFAULT 'unknown', | ||
| sql_digest varchar(64) DEFAULT NULL, | ||
| plan_digest varchar(64) DEFAULT NULL, | ||
| last_used_time TIMESTAMP DEFAULT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also specify the precision, @henrybw just updated the precision to 6 to fix some unstable problems: https://github.com/pingcap/tidb/pull/63524/files
pkg/bindinfo/utils.go
Outdated
| ) | ||
|
|
||
| func updateBindingUsageInfoToStorage(sPool util.DestroyableSessionPool, bindings []*Binding) error { | ||
| err := callWithSCtx(sPool, true, func(sctx sessionctx.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In one of our customer's workload, there are around 100,000 bindings, is this OK to execute 100,000 update statements in one transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to limit the number of update statements in this transaction to around 200 or 500 for safety I think. Better to confirm this with our Txn Team members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will confirm this with the transaction team.
Additionally, if a cluster really has that many binds, there must also be many TiDB instances. In fact, we need to consider the impact on the entire cluster from multiple instances. So as long as it is sufficiently scattered, with small batches and executed by multiple instances, it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The slow write here actually won't have much impact. DBAs won't delete a bindinfo just because it hasn't been used for a day or two; that would be quite risky.
|
/retest |
henrybw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to load usage info from storage into the bindings cache? I think we need to also update readBindingsFromStorage() to read the last_used_time column into the Binding structures.
| randomDuration( | ||
| bindinfo.MinCheckIntervalForUpdateBindingUsageInfo, | ||
| bindinfo.MaxCheckIntervalForUpdateBindingUsageInfo, | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Picking random durations seems unnecessarily complicated. What's wrong with picking just one check interval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the customer's cluster has many TiDB nodes, especially for SaaS clusters, we need to avoid having all TiDB nodes execute batch updates at the same time, which could impact the customer's business. Additionally, updating bindinfo is not urgent, so it's fine to have larger time intervals between updates. Moreover, the code for this is not extensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current code, there is also a random interval inside the deltaUpdateTickerWorker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
time-and-fate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation, if the tidb server shuts down in 6 hours, the usage info will never get a chance to be recorded.
I think we need to confirm with PM that this is the expected behavior.
pkg/bindinfo/utils.go
Outdated
| } | ||
| binding.ResetUsageInfo() | ||
| } | ||
| if cnt > updateBindingUsageInfoBatchSize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cnt is never updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated.
pkg/bindinfo/utils.go
Outdated
| lastUsedTime := ts.UTC().Format(types.TimeFormat) | ||
| _, _, err := execRows( | ||
| sctx, | ||
| "UPDATE mysql.bind_info SET last_used_time = CONVERT_TZ(%?, '+00:00', @@TIME_ZONE) WHERE sql_digest = %?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be a table full scan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update. I use plan_digest and sql_digest as the condition. then it can use the digest_index to update.
+---------------+---------+------+--------------------------------------------------------------+---------------+
| id | estRows | task | access object | operator info |
+---------------+---------+------+--------------------------------------------------------------+---------------+
| Update_3 | N/A | root | | N/A |
| └─Point_Get_1 | 1.00 | root | table:bind_info, index:digest_index(plan_digest, sql_digest) | |
+---------------+---------+------+--------------------------------------------------------------+---------------+
Yes, it is only for the user to check whether the bind is in use. |
24a0f3e to
d043c76
Compare
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a last_used_date column to the mysql.bind_info table to track binding usage frequency, helping customers identify unused bindings that may be putting pressure on TiDB.
Key changes:
- Adds new database schema column and upgrade mechanism for version 253
- Implements usage tracking that updates binding last-used timestamps when bindings are matched
- Adds periodic storage updates with randomized intervals to prevent thundering herd problems
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/session/upgrade.go | Adds version 253 upgrade function to add last_used_date column |
| pkg/session/bootstrap.go | Updates bind_info table schema to include new column |
| pkg/bindinfo/binding.go | Adds usage tracking fields and methods to Binding struct |
| pkg/bindinfo/utils.go | Implements batch update logic for writing usage info to storage |
| pkg/bindinfo/binding_cache.go | Adds interface method for updating usage info to storage |
| pkg/domain/domain.go | Adds periodic worker to update binding usage info with randomized intervals |
| pkg/bindinfo/tests/bind_usage_info_test.go | Comprehensive test coverage for new usage tracking functionality |
| Multiple test files | Updates existing tests to handle new column in insert statements |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
henrybw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there.
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
henrybw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for diligently working through this. LGTM
|
/unhold |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: D3Hunter, henrybw, Leavrth, qw4990, you06, yudongusa The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
|
@hawkingrei: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/cherrypick release-8.5 |
Signed-off-by: ti-chi-bot <[email protected]>
|
@hawkingrei: new pull request created to branch In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
Signed-off-by: ti-chi-bot <[email protected]>
What problem does this PR solve?
Issue Number: close #63407
Problem Summary:
What changed and how does it work?
Currently, some customers have created a large number of bindings. However, it is difficult to determine whether these bindings are still in use. The sheer volume of bindings also puts pressure on TiDB. Therefore, we need a way to identify and mark the bindings that are not in use.
Check List
Tests
start a tidb with master which is started by script.
TikvandPDis started bytiup playground nightly --mode tikv-slim.kill this TiDB and start changed tidb verson.
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.