Update glossary.md #32
Walkthrough
The changes introduce a series of new entries to the glossary.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (2)
glossary.md (2)
218-221: Improve clarity and fix punctuation
The entry contains redundant phrasing and needs punctuation fixes.
### PiTR
-Point in Time Recovery (PiTR) is a database feature that allows the user to restore to a specific point in time (for example just before an accidental `DELETE` statement).
+Point-in-Time Recovery (PiTR) is a database feature that allows restoring a database to a specific moment (for example, just before an accidental `DELETE` statement).
🧰 Tools
🪛 LanguageTool
[style] ~219-~219: This phrase is redundant. Consider writing “Point” or “time”.
Context: ...r isolated from the network. ### PiTR Point in Time Recovery (PiTR) is a database feature t...
(MOMENT_IN_TIME)
[style] ~220-~220: This phrase is redundant. Consider writing “point” or “time”.
Context: ...llows the user to restore to a specific point in time (for example just before an accidental ...
(MOMENT_IN_TIME)
[uncategorized] ~220-~220: Possible missing comma found.
Context: ...estore to a specific point in time (for example just before an accidental DELETE stat...
(AI_HYDRA_LEO_MISSING_COMMA)
171-171: Standardize terminology formatting
For consistency, standardize the capitalization of terms across the glossary.
-OnLine Analytical Processing
+Online Analytical Processing
-OnLine Transaction Processing
+Online Transaction Processing
Also applies to: 179-179
🧰 Tools
🪛 LanguageTool
[uncategorized] ~171-~171: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...AP OnLine Analytical Processing (OLAP) are describing database workloads that most...
(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)
🪛 GitHub Check: vale
[warning] 171-171:
[vale] reported by reviewdog 🐶
[PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion.
Raw Output:
{"message": "[PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion.", "location": {"path": "glossary.md", "range": {"start": {"line": 171, "column": 193}}}, "severity": "INFO"}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
glossary.md
(11 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
glossary.md (1)
Pattern **/*.md: As a technical writer in the database area, you need to first learn the changes in this PR, and then improve the content to make it clear, concise, and easy to understand in English according to your database knowledge.
🪛 LanguageTool
glossary.md
[uncategorized] ~49-~49: Possible missing comma found.
Context: ...rmance. ### CF CF is short for Column Family as used by RocksDB / TiKV. ### Coalesc...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~61-~61: A punctuation mark might be missing here.
Context: ...E) is part of the SQL standard and uses [WITH](/sql-statements/sql-statement-wit...
(AI_EN_LECTOR_MISSING_PUNCTUATION)
[uncategorized] ~79-~79: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...the latest features but doesn't provide long term support. See [TiDB Versioning](/release...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~86-~86: You might be missing the article “the” here.
Context: ...backups and standby clusters. ### DXF Distributed eXecution Framework (DXF) is the framew...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[misspelling] ~107-~107: Did you mean the phrasal verb “clean up” instead of the noun ‘cleanup’?
Context: ...rbage Collection (GC) is the process to cleanup unused resources. See [GC](/garbage-col...
(CLEAN_UP)
[misspelling] ~123-~123: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...DS Instance Metadata Service (IMDS) is a AWS service that can be used to manage ...
(EN_A_VS_AN)
[style] ~137-~137: Consider replacing this phrase with the adverb “securely” to avoid wordiness.
Context: ...he storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP ...
(IN_A_X_MANNER)
[uncategorized] ~141-~141: Possible missing preposition found.
Context: ...vices. ### KV Key-Value (KV) is a way storing information that allows easy store and ...
(AI_EN_LECTOR_MISSING_PREPOSITION)
[uncategorized] ~150-~150: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...uthentication-plugin-status). ### LTS Long Term Support (LTS) are software versions tha...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~179-~179: This verb may not be in the correct tense. Consider changing the tense to fit the context better.
Context: ...P OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal wit...
(AI_EN_LECTOR_REPLACEMENT_VERB_TENSE)
[duplication] ~183-~183: Possible typo: you repeated a word
Context: ...a situation where a system fails due to a a lack of available memory. See [Troubles...
(ENGLISH_WORD_REPEAT_RULE)
[uncategorized] ~216-~216: Possible missing article found.
Context: ...arners is vastly different from that of leader. Followers in pending cannot be elected...
(AI_HYDRA_LEO_MISSING_A)
[uncategorized] ~216-~216: You might be missing the article “the” here.
Context: ... state that a peer ceases to respond to leader for a long time, which usually means th...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~308-~308: You might be missing the article “the” here.
Context: ...ls, see TSO. ## U ### URI Uniform Resource Identifier (URI) is a uniform ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~312-~312: You might be missing the article “the” here.
Context: ...ipedia for more information. ### UUID Universally Unique Identifier (UUID) is a 128-bit (...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 GitHub Check: vale
glossary.md
[failure] 137-137:
[vale] reported by reviewdog 🐶
[Vale.Avoid] Avoid using 'GCP'.
Raw Output:
{"message": "[Vale.Avoid] Avoid using 'GCP'.", "location": {"path": "glossary.md", "range": {"start": {"line": 137, "column": 129}}}, "severity": "ERROR"}
🔇 Additional comments (3)
glossary.md (3)
33-35: LGTM!
The definition is clear, concise, and includes proper documentation reference.
115-118: LGTM!
The HTAP definition is comprehensive, technically accurate, and includes both internal implementation details and an external reference.
Line range hint 1-313: Verify documentation links
Let's verify that all documentation links are valid and accessible.
### OLAP
OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting. The characteristics of this is read heavy queries that process many rows.
Fix grammar and style issues
Several grammar issues need to be addressed:
-OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting.
+Online Analytical Processing (OLAP) describes database workloads that mostly deal with analytical tasks like reporting.
-OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal with transactional workloads
+Online Transaction Processing (OLTP) describes database workloads that mostly deal with transactional operations
-Out of Memory (OOM) is a situation where a system fails due to a a lack of available memory.
+Out of Memory (OOM) is a situation where a system fails due to a lack of available memory.
-Followers in pending cannot be elected as leader.
+Followers in pending state cannot be elected as the leader.
-Uniform Resource Identifier (URI) is a uniform way
+A Uniform Resource Identifier (URI) is a uniform way
-Universally Unique Identifier (UUID) is a 128-bit
+A Universally Unique Identifier (UUID) is a 128-bit
Also applies to: 179-179, 183-183, 216-216, 308-308, 312-312
### DXF
Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details
Fix punctuation and verify documentation link
The entry is missing proper punctuation and needs link verification.
### DXF
-Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details
+Distributed eXecution Framework (DXF) is the framework used by TiDB to speed up index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details.
### CF
CF is short for Column Family as used by RocksDB / TiKV.
🛠️ Refactor suggestion
Enhance the Column Family definition
The current definition is too brief. Consider expanding it to explain what a Column Family is and its role in RocksDB/TiKV.
### CF
-CF is short for Column Family as used by RocksDB / TiKV.
+CF is short for Column Family, a feature in RocksDB and TiKV that allows storing different types of data separately. Column Families provide logical separation of data, enabling different storage configurations and independent compaction for different types of data within the same database.
/gemini review
Code Review
This pull request adds several new terms to the glossary, enhancing the documentation. The additions seem relevant and well-defined. However, I have a few suggestions to improve clarity and consistency, referencing the provided style guide.
Merge Readiness
The changes are valuable additions to the glossary. However, addressing the suggestions regarding clarity, consistency, and style guide adherence would further improve the quality of the documentation. I recommend addressing the review comments before merging. I am unable to approve this pull request, and users should have others review and approve this code before merging.
### BR
BR is the Backup and Restore tool for TiDB. See [BR Overview](/br/backup-and-restore-overview.md) for more information.
Consider rephrasing this definition to be more descriptive of what BR does, rather than just stating what it is. For example, "BR is a tool used to back up and restore TiDB clusters, enabling disaster recovery and data migration."
Also, consider using active voice: "You can use BR to..."
-BR is the Backup and Restore tool for TiDB. See [BR Overview](/br/backup-and-restore-overview.md) for more information.
+BR is a tool used to back up and restore TiDB clusters, enabling disaster recovery and data migration. You can use BR to...
### DR
Disaster Recovery (DR) describes solutions that can be used to recover from a disaster in the future. This includes things like backups and standby clusters.
This definition could be more descriptive. Consider rephrasing to explain the purpose of Disaster Recovery in the context of TiDB. For example, "Disaster Recovery (DR) encompasses strategies and solutions for restoring TiDB clusters to a functional state after a disruptive event, including data backups, replication, and failover mechanisms."
-Disaster Recovery (DR) describes solutions that can be used to recover from a disaster in the future. This includes things like backups and standby clusters.
+Disaster Recovery (DR) encompasses strategies and solutions for restoring TiDB clusters to a functional state after a disruptive event, including data backups, replication, and failover mechanisms.
### CF
CF is short for Column Family as used by RocksDB / TiKV.
It might be helpful to expand on the use of Column Families in RocksDB/TiKV for readers unfamiliar with the concept. For example, "CF refers to Column Family, a feature used by RocksDB and TiKV to logically group key-value pairs for better performance and storage management."
-CF is short for Column Family as used by RocksDB / TiKV.
+CF refers to Column Family, a feature used by RocksDB and TiKV to logically group key-value pairs for better performance and storage management.
### KMS
Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. Various TiDB components can use this to manage the keys that are used for storage encryption and related services.
Consider adding a sentence about how KMS is used in TiDB, e.g. "TiDB uses KMS to manage encryption keys for data at rest."
-Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. Various TiDB components can use this to manage the keys that are used for storage encryption and related services.
+Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. TiDB uses KMS to manage encryption keys for data at rest.
### OLAP
OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting. The characteristics of this is read heavy queries that process many rows.
Consider rephrasing this definition to be more descriptive of what OLAP workloads are in the context of TiDB. For example, "OnLine Analytical Processing (OLAP) refers to database workloads that primarily involve complex analytical queries, such as reporting and data warehousing, often characterized by read-heavy operations and large data scans."
-OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting. The characteristics of this is read heavy queries that process many rows.
+OnLine Analytical Processing (OLAP) refers to database workloads that primarily involve complex analytical queries, such as reporting and data warehousing, often characterized by read-heavy operations and large data scans.
### RPC
Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between different components such as TiDB, TiKV and TiFlash.
Consider rephrasing this sentence to use active voice: "TiDB uses gRPC for communication between its components."
-Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between different components such as TiDB, TiKV and TiFlash.
+Remote Procedure Call (RPC) is a way for software components to communicate. TiDB uses gRPC for communication between its components such as TiDB, TiKV and TiFlash.
### SST
Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB.
Consider adding a brief explanation of what RocksDB is, e.g. "RocksDB is an embedded key-value store used by TiKV."
-Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB.
+Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB, an embedded key-value store used by TiKV.
### TPS
Transactions Per Second (TPS) is a performance metric of a database.
Consider rephrasing this definition to be more descriptive of what TPS measures. For example, "Transactions Per Second (TPS) is a key performance indicator (KPI) that measures the number of transactions a database system can process per second."
-Transactions Per Second (TPS) is a performance metric of a database.
+Transactions Per Second (TPS) is a key performance indicator (KPI) that measures the number of transactions a database system can process per second.
### URI
Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
Consider rephrasing this definition to be more descriptive of what URI is. For example, "Uniform Resource Identifier (URI) is a string of characters that identifies a resource, enabling uniform identification of resources."
-Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
+Uniform Resource Identifier (URI) is a string of characters that identifies a resource, enabling uniform identification of resources.
Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB.
Consider rephrasing this definition to be more descriptive of what UUID is. For example, "Universally Unique Identifier (UUID) is a 128-bit identifier used to uniquely identify information in computer systems, ensuring uniqueness across space and time."
-Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB.
+Universally Unique Identifier (UUID) is a 128-bit identifier used to uniquely identify information in computer systems, ensuring uniqueness across space and time.
/bot-review
### Bucket
A [Region](#regionpeerraft-group) is logically divided into several small ranges called bucket. TiKV collects query statistics by buckets and reports the bucket status to PD. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).
The sentence is not clear enough. It is recommended to clarify the relationship between Region and bucket, and to supplement the explanation of how buckets are used.
-A [Region](#regionpeerraft-group) is logically divided into several small ranges called bucket. TiKV collects query statistics by buckets and reports the bucket status to PD. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).
+A [Region](#regionpeerraft-group) is logically divided into several small ranges called buckets. TiKV collects query statistics by these buckets and reports the bucket status to PD. This helps in optimizing query performance and load balancing. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).
### CF
CF is short for Column Family as used by RocksDB / TiKV.
The definition of CF is too brief and lacks context. It is recommended to provide more information about its role and significance in RocksDB/TiKV.
-CF is short for Column Family as used by RocksDB / TiKV.
+CF is short for Column Family as used by RocksDB / TiKV. In the context of RocksDB and TiKV, a Column Family is a logical separation within a database that allows for different settings and optimizations for different types of data. This feature enables more efficient data management and retrieval.
### DM
Data Migration is the tool that allows MySQL to TiDB migration by reading data from a source instance and applying it to a target MySQL instance. See [DM Overview](/dm/dm-overview.md) for more information.
The sentence is unclear and could be improved for better readability. It is recommended to clarify the process of data migration from MySQL to TiDB.
-Data Migration is the tool that allows MySQL to TiDB migration by reading data from a source instance and applying it to a target MySQL instance. See [DM Overview](/dm/dm-overview.md) for more information.
+Data Migration (DM) is the tool that facilitates the migration from MySQL to TiDB by reading data from a source MySQL instance and applying it to a target TiDB instance. See [DM Overview](/dm/dm-overview.md) for more information.
### DXF
Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details
The sentence is missing a period at the end.
-Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details
+Distributed eXecution Framework (DXF) is the framework used by TiDB to speed up index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details.
### GTID
Global Transactions ID's (GTIDs) are used by recent MySQL versions binary log to indicate what transactions have been replicated and which have not. This information can be used by DM.
The sentence contains a grammatical error. "ID's" should be "IDs".
-Global Transactions ID's (GTIDs) are used by recent MySQL versions binary log to indicate what transactions have been replicated and which have not. This information can be used by DM.
+Global Transaction IDs (GTIDs) are used by recent MySQL versions' binary log to indicate what transactions have been replicated and which have not. This information can be used by DM.
### RU
Request Unit (RU) is used in TiDB to describe the unit for the resource usage. This is used with [Resource Control](/tidb-resource-control.md) to manage resource usage.
The sentence is missing a comma after 'usage'.
-Request Unit (RU) is used in TiDB to describe the unit for the resource usage. This is used with [Resource Control](/tidb-resource-control.md) to manage resource usage.
+Request Unit (RU) is used in TiDB to describe the unit for the resource usage. This is used with [Resource Control](/tidb-resource-control.md) to manage resource usage.
### SST
Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB.
The definition of SST is unclear and lacks context. It is recommended to provide a more detailed explanation of what SST is and its role in RocksDB.
-Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB.
+Static Sorted Table, Sorted String Table, or Sorted Sequence Table (SST) is a file storage format used by RocksDB. It organizes data in a sorted manner to optimize read and write operations, enhancing the performance of the database.
### TPS
Transactions Per Second (TPS) is a performance metric of a database.
The definition of TPS is too brief and lacks context. Consider providing more details about its significance in database performance.
-Transactions Per Second (TPS) is a performance metric of a database.
+Transactions Per Second (TPS) is a performance metric of a database that measures the number of transactions a system can process in one second. It is a critical indicator of the system's throughput and efficiency.
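As a rough illustration of the expanded TPS definition (transactions processed divided by elapsed time), here is a minimal Python sketch. The function name `transactions_per_second` and the sample numbers are hypothetical; they are not part of the glossary or of any TiDB API.

```python
def transactions_per_second(committed: int, elapsed_seconds: float) -> float:
    """Return throughput as committed transactions divided by elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return committed / elapsed_seconds


# Hypothetical sample: 12,000 transactions committed over a 60-second window.
print(transactions_per_second(12_000, 60.0))  # 200.0
```

In practice the measurement window and what counts as a "transaction" depend on the benchmark or monitoring tool, so the glossary entry may want to stay tool-agnostic.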
### URI
Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
The explanation of URI could be expanded to provide more context about its use in TiDB.
-Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
+Uniform Resource Identifier (URI) is a uniform way of describing a resource, which is used in TiDB to identify and access various resources within the system. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
### UUID
Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB.
The explanation of UUID should clarify its importance and usage in TiDB.
-Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB.
+Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to uniquely identify records in a database, ensuring data integrity and consistency. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB.
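To make the "128-bit (16 byte)" wording in the UUID entry concrete, the following minimal sketch uses Python's standard-library `uuid` module. It is only an illustration of the size claim and assumes a random (version 4) UUID; it says nothing about how TiDB itself generates or stores UUIDs.

```python
import uuid

# Generate a random (version 4) UUID with the standard library.
u = uuid.uuid4()

print(len(u.bytes))  # 16 bytes, that is, 128 bits
print(len(str(u)))   # 36 characters in the canonical hyphenated text form
```

The gap between the 16-byte binary form and the 36-character text form is one reason storage format matters when UUIDs are used as keys.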
✅ AI review completed, 21 comments generated.
What is changed, added or deleted? (Required)
The doc changes come from a draft PR for testing purposes.
Which TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions.
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?
Summary by CodeRabbit