[Enhancement Request] Fine-Grained GTID Support for Improved Read-After-Write Performance #472
Proposal For Read-After-Write Performance Improvement
Introduction
Hi, I find this project very cool and would like to become a contributor. I wrote this proposal based on the ReadAfterWrite Consistency Document, and the changes I made are marked in bold. This is only a preliminary version, and I hope to discuss the optimization logic fully with community members before designing the code implementation. Looking forward to your reply : )
Goals
Design Details
Step 1: Get the GTID after a write operation without an extra network round trip
Starting from MySQL 5.7, the MySQL protocol implements a mechanism to collect the GTIDs to be sent over the wire in the response packet. This feature lets us acquire GTIDs without introducing additional network round trips. To enable the feature:
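In stock MySQL this mechanism is normally switched on by setting the session_track_gtids system variable to OWN_GTID and negotiating the CLIENT_SESSION_TRACK capability at connect time. The GTID then arrives in the session-state block of the OK packet. The Go sketch below shows one way to pull it out of a raw OK packet payload. The function names are hypothetical and the packet layout is my reading of the MySQL protocol documentation, so treat it as an illustration rather than WeScale's actual parser.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	serverSessionStateChanged = 0x4000 // SERVER_SESSION_STATE_CHANGED status flag
	sessionTrackGTIDs         = 0x03   // SESSION_TRACK_GTIDS entry type
)

var errShort = errors.New("packet too short")

// readLenEncInt decodes a length-encoded integer at the start of b and
// returns the value and the number of bytes consumed.
func readLenEncInt(b []byte) (uint64, int, error) {
	if len(b) == 0 {
		return 0, 0, errShort
	}
	var size int
	switch {
	case b[0] < 0xfb:
		return uint64(b[0]), 1, nil
	case b[0] == 0xfc:
		size = 2
	case b[0] == 0xfd:
		size = 3
	case b[0] == 0xfe:
		size = 8
	default:
		return 0, 0, errors.New("not a length-encoded integer")
	}
	if len(b) < 1+size {
		return 0, 0, errShort
	}
	var v uint64
	for i := 0; i < size; i++ {
		v |= uint64(b[1+i]) << (8 * uint(i))
	}
	return v, 1 + size, nil
}

// readLenEncBytes reads a length-encoded string and returns it with the remainder of b.
func readLenEncBytes(b []byte) (val, rest []byte, err error) {
	n, used, err := readLenEncInt(b)
	if err != nil {
		return nil, nil, err
	}
	if uint64(len(b)-used) < n {
		return nil, nil, errShort
	}
	end := used + int(n)
	return b[used:end], b[end:], nil
}

// gtidFromOKPacket returns the GTID text tracked in an OK packet payload,
// or "" when the packet carries no GTID entry. Only the 0x00 header is handled.
func gtidFromOKPacket(p []byte) (string, error) {
	if len(p) == 0 || p[0] != 0x00 {
		return "", errors.New("not an OK packet")
	}
	b := p[1:]
	for i := 0; i < 2; i++ { // skip affected_rows and last_insert_id
		_, used, err := readLenEncInt(b)
		if err != nil {
			return "", err
		}
		b = b[used:]
	}
	if len(b) < 4 { // status flags (2 bytes) + warnings (2 bytes)
		return "", errShort
	}
	status := uint16(b[0]) | uint16(b[1])<<8
	if status&serverSessionStateChanged == 0 {
		return "", nil
	}
	_, b, err := readLenEncBytes(b[4:]) // skip the human-readable info string
	if err != nil {
		return "", err
	}
	state, _, err := readLenEncBytes(b)
	if err != nil {
		return "", err
	}
	for len(state) > 0 {
		typ := state[0]
		data, rest, err := readLenEncBytes(state[1:])
		if err != nil {
			return "", err
		}
		state = rest
		if typ != sessionTrackGTIDs || len(data) == 0 {
			continue
		}
		gtids, _, err := readLenEncBytes(data[1:]) // data[0] is the encoding specification
		if err != nil {
			return "", err
		}
		return string(gtids), nil
	}
	return "", nil
}

func main() {
	gtid := "ab73d556-cd43-11ed-9608-6967c6ac0b32:7"
	entry := append([]byte{0x00, byte(len(gtid))}, gtid...)
	state := append([]byte{sessionTrackGTIDs, byte(len(entry))}, entry...)
	pkt := append([]byte{0x00, 0x01, 0x00, 0x02, 0x40, 0x00, 0x00, 0x00, byte(len(state))}, state...)
	got, err := gtidFromOKPacket(pkt)
	fmt.Println(got, err)
}
```

The sample packet in main is hand-built for the demonstration; a real client would pass the payload it read from the connection after a write.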
Step 2: Manage the latest GTID and update time for each table in the last t seconds
The code below only illustrates the method:

    import (
        "sync"
        "time"
    )

    // LatestGTIDEntry holds the GTID of the latest write to a table and the time it was recorded.
    type LatestGTIDEntry struct {
        GTID       string
        UpdateTime time.Time
    }

    // LatestGTIDManager manages the latest GTID and update time for each table.
    type LatestGTIDManager struct {
        latestGTIDs map[string]LatestGTIDEntry // Key is the table name, value is the LatestGTIDEntry struct.
        expireTime  time.Duration              // The expiration time for GTID entries.
        mu          sync.RWMutex               // Mutex for read-write synchronization.
        wg          sync.WaitGroup             // WaitGroup to wait for the cleanup goroutine to finish.
        done        chan struct{}              // Closed by Stop to tell the cleanup goroutine to exit.
    }

    // NewLatestGTIDManager creates a new instance of LatestGTIDManager.
    func NewLatestGTIDManager(expireTime time.Duration) *LatestGTIDManager {
        return &LatestGTIDManager{
            latestGTIDs: make(map[string]LatestGTIDEntry),
            expireTime:  expireTime,
            done:        make(chan struct{}),
        }
    }

    // UpdateGTID updates the latest GTID and update time for a given table.
    func (m *LatestGTIDManager) UpdateGTID(tableName, gtid string) {
        m.mu.Lock()
        defer m.mu.Unlock()
        m.latestGTIDs[tableName] = LatestGTIDEntry{
            GTID:       gtid,
            UpdateTime: time.Now(),
        }
    }

    // GetLatestGTID retrieves the latest GTID for a given table.
    // If the table is not found or the GTID has expired, it returns an empty string and false.
    func (m *LatestGTIDManager) GetLatestGTID(tableName string) (string, bool) {
        m.mu.RLock()
        defer m.mu.RUnlock()
        entry, ok := m.latestGTIDs[tableName]
        if !ok || time.Since(entry.UpdateTime) > m.expireTime {
            return "", false
        }
        return entry.GTID, true
    }

    // startCleaner starts a goroutine to periodically clean up expired GTID entries.
    func (m *LatestGTIDManager) startCleaner() {
        m.wg.Add(1)
        go func() {
            defer m.wg.Done()
            ticker := time.NewTicker(m.expireTime)
            defer ticker.Stop()
            for {
                select {
                case <-m.done:
                    // Stop was called; without this case the loop would never exit.
                    return
                case <-ticker.C:
                    m.mu.Lock()
                    now := time.Now()
                    for tableName, entry := range m.latestGTIDs {
                        if now.Sub(entry.UpdateTime) > m.expireTime {
                            delete(m.latestGTIDs, tableName)
                        }
                    }
                    m.mu.Unlock()
                }
            }
        }()
    }

    // Stop signals the cleanup goroutine to exit and waits for it to finish.
    // It must be called at most once.
    func (m *LatestGTIDManager) Stop() {
        close(m.done)
        m.wg.Wait()
    }

Depending on the consistency level, the LatestGTIDManager may be initialized in the client's session or in a global in-memory data structure.

    // Initialize LatestGTIDManager with an expiration time of 10 seconds.
    gm := NewLatestGTIDManager(10 * time.Second)
    gm.startCleaner()

Step 3: Store the GTID in WeSQL WeScale sessions
After parsing the response packet and extracting the GTIDs, WeSQL WeScale stores them in memory. If the operation is a write, the LatestGTIDManager updates the latest GTID and write time for the table that was written.

    gm.UpdateGTID("my_table", "abcdefg-1234567-890")
When a read operation happens, we use the LatestGTIDManager to get the latest GTID for the table to be read:

    Latest_GTID_for_Table_to_be_Read, ok := gm.GetLatestGTID("my_table")

Two situations can occur at this point:
Step 4: Select a MySQL follower for reading
A … Therefore, GTIDs from Step 1 will update …
During the routing phase of a read operation, it will use the … As long as the picked MySQL instance contains the …
Step 5: Ensure write requests have been propagated to the follower MySQL
All the follower MySQL instances may be lagging, or the … We can either send the read operation to the leader, or send the read operation to the follower with a …
We can use a multi-statement to save one network round trip:

    -- for example, if the user's SQL is: select * from t1;
    -- the actual SQL sent to the follower may be a multi-statement like this:
    select WAIT_FOR_EXECUTED_GTID_SET('ab73d556-cd43-11ed-9608-6967c6ac0b32:7', 3);select * from t1;

We need to handle the MySQL protocol carefully when using multi-statements; otherwise the MySQL connection may break.
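Steps 4 and 5 can be sketched together in Go: prefer a follower whose executed GTID set already contains the GTID of the last write to the table, and otherwise prefix the read with WAIT_FOR_EXECUTED_GTID_SET. The follower type, routeRead, gtidSetContains, and the "leader" placeholder are hypothetical names for illustration. The GTID-set parser is deliberately simplified (plain uuid:interval sets only, no tagged GTIDs), and the GTID is assumed to come from the server rather than from user input.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// gtidSetContains reports whether a GTID set such as "uuid:1-5:8,uuid2:3"
// includes the single GTID "uuid:n". Malformed input simply yields false.
func gtidSetContains(set, gtid string) bool {
	i := strings.LastIndex(gtid, ":")
	if i < 0 {
		return false
	}
	uuid := gtid[:i]
	n, err := strconv.ParseInt(gtid[i+1:], 10, 64)
	if err != nil {
		return false
	}
	for _, part := range strings.Split(set, ",") {
		fields := strings.Split(strings.TrimSpace(part), ":")
		if !strings.EqualFold(fields[0], uuid) {
			continue
		}
		for _, iv := range fields[1:] {
			lo, hi, isRange := strings.Cut(iv, "-")
			start, err1 := strconv.ParseInt(lo, 10, 64)
			end := start
			var err2 error
			if isRange {
				end, err2 = strconv.ParseInt(hi, 10, 64)
			}
			if err1 == nil && err2 == nil && start <= n && n <= end {
				return true
			}
		}
	}
	return false
}

// follower is a hypothetical view of a replica: its name and the executed
// GTID set last observed for it.
type follower struct {
	Name         string
	ExecutedGTID string
}

// routeRead returns the instance to read from and the SQL to send to it.
func routeRead(userSQL, gtid string, followers []follower, waitSeconds int) (string, string) {
	for _, f := range followers {
		if gtidSetContains(f.ExecutedGTID, gtid) {
			return f.Name, userSQL // already caught up, no wait needed
		}
	}
	if len(followers) == 0 {
		return "leader", userSQL // the leader always has the write
	}
	// Every follower appears to lag: wait on the first one via a multi-statement.
	wait := fmt.Sprintf("select WAIT_FOR_EXECUTED_GTID_SET('%s', %d);", gtid, waitSeconds)
	return followers[0].Name, wait + userSQL
}

func main() {
	u := "ab73d556-cd43-11ed-9608-6967c6ac0b32"
	fs := []follower{{"f1", u + ":1-5"}, {"f2", u + ":1-7"}}
	fmt.Println(routeRead("select * from t1;", u+":7", fs, 3))
	fmt.Println(routeRead("select * from t1;", u+":9", fs, 3))
}
```

Falling back to the first follower is an arbitrary choice made for brevity; a real router could just as well pick the least-lagging follower or go to the leader.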
Thank you for your interest in this topic. If you would like to proceed, please feel free to send an email to [email protected].
Cool! I have sent you an email about the idea of …
Hi, I am very interested in this issue. Yesterday, I sent an email outlining some of my thoughts and ideas. I look forward to the opportunity to discuss them with you further. Thank you for your time and consideration!
Background
The current implementation of the read_after_write consistency feature in the system relies on waiting for the execution of the last global transaction identifier (GTID), indiscriminately applying this method across SQL operations regardless of their data dependencies. This broad-stroke approach leads to unnecessarily high latency and decreased throughput for read-after-write operations, particularly when these operations do not interact with the same table. The lack of differentiation significantly hinders performance, especially in use cases where operations could otherwise proceed in parallel without data consistency issues.
Proposal
Implement Table-Level Read-After-Write Support: Introduce the capability for the system to intelligently discern operations across different tables, allowing for parallel processing of read-after-write operations where there are no direct data dependencies. This refinement is anticipated to substantially lower wait times for operations not confined to the same table, enhancing responsiveness.
Provide Configuration Options for Global and Table Levels: Offer users the ability to adjust read-after-write settings specifically for global and table levels. This granularity in configuration would empower users to tailor performance optimization strategies more precisely to their application's operational characteristics and requirements.
Performance Analysis for Global and Table Level Settings: Undertake a comprehensive analysis to evaluate the performance implications of utilizing global versus table-level settings for read-after-write operations. The insights gained from this analysis would equip users with the knowledge to make informed decisions, optimizing their configurations for either broader or more targeted performance improvements based on their specific scenarios.
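To make the difference between the two levels concrete, here is a minimal Go sketch of how a read could decide which GTIDs to wait for under each setting. Scope, tracker, recordWrite, and gtidsToWaitFor are hypothetical names, not existing WeScale APIs, and the tracker is not safe for concurrent use.

```go
package main

import "fmt"

// Scope is a hypothetical configuration knob for read-after-write consistency.
type Scope int

const (
	ScopeGlobal Scope = iota // wait for the last write, whatever table it touched
	ScopeTable               // wait only for writes to the tables being read
)

// tracker remembers the GTID of the last write overall and per table.
type tracker struct {
	last     string
	perTable map[string]string
}

// recordWrite notes that a write to the given table produced the given GTID.
func (t *tracker) recordWrite(table, gtid string) {
	t.last = gtid
	t.perTable[table] = gtid
}

// gtidsToWaitFor returns the GTIDs a read on the given tables must wait for.
// An empty result means the read can go to any follower immediately.
func (t *tracker) gtidsToWaitFor(scope Scope, tables []string) []string {
	if scope == ScopeGlobal {
		if t.last == "" {
			return nil
		}
		return []string{t.last}
	}
	var out []string
	for _, tb := range tables {
		if g, ok := t.perTable[tb]; ok {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	tr := &tracker{perTable: map[string]string{}}
	tr.recordWrite("t1", "ab73d556-cd43-11ed-9608-6967c6ac0b32:7")

	// A read on t2 right after a write to t1.
	fmt.Println(len(tr.gtidsToWaitFor(ScopeGlobal, []string{"t2"})))
	fmt.Println(len(tr.gtidsToWaitFor(ScopeTable, []string{"t2"})))
}
```

With the global setting the read on t2 still waits for the write to t1, while the table-level setting lets it proceed without waiting, which is the latency saving the proposal is after.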