Commit 860e861

text: Add README.md for vt transactions
Signed-off-by: Andres Taylor <[email protected]>
1 parent 030f025 commit 860e861

2 files changed, +92 -0 lines changed

README.md

+6
@@ -5,6 +5,7 @@ The `vt` binary encapsulates several utility tools for Vitess, providing a compr
## Tools Included
- **`vt test`**: A testing utility using the same test files as the [MySQL Test Framework](https://github.com/mysql/mysql-server/tree/8.0/mysql-test). It compares the results of identical queries executed on both MySQL and Vitess (vtgate), helping to ensure compatibility.
- **`vt keys`**: A utility that analyzes query logs and provides information about queries, tables, joins, and column usage.
- **`vt transactions`**: A tool that analyzes query logs to identify transaction patterns and outputs a JSON report detailing these patterns.
- **`vt trace`**: A tool that generates execution traces for queries without comparing against MySQL. It helps analyze query behavior and performance in Vitess environments.
- **`vt summarize`**: A tool used to summarize or compare trace logs or key logs for easier human consumption.

@@ -116,6 +117,11 @@ This command generates a `keys-log.json` file that contains a detailed analysis
This command summarizes the key analysis, providing insight into which tables and columns are used across queries, and how frequently they are involved in filters, groupings, and joins.
[Here](https://github.com/vitessio/vt/blob/main/go/summarize/testdata/keys-summary.md) is an example summary report.

## Transaction Analysis with `vt transactions`
The `vt transactions` command is designed to analyze query logs and identify patterns of transactional queries.
It processes the logs to find sequences of queries that form transactions and outputs a JSON report summarizing these patterns.
For usage details and how to read the output, see the [vt transactions documentation](./go/transactions/README.md).

## Using `--backup-path` Flag

The `--backup-path` flag allows `vt test` and `vt trace` to initialize tests from a database backup rather than an empty database.

go/transactions/README.md

+86
@@ -0,0 +1,86 @@
# VT Transactions

The `vt transactions` command is a sub-command of the `vt` toolset, designed to analyze query logs, identify transaction patterns, and produce a JSON report summarizing these patterns.
This tool is particularly useful for understanding complex transaction behaviors, optimizing database performance, choosing a sharding strategy, and auditing transactional queries.

## Usage

The basic usage of `vt transactions` is:

```bash
vt transactions querylog.log > report.json
```

* `querylog.log`: The input query log file. This can be in various formats, such as SQL files, slow query logs, MySQL general query logs, or VTGate query logs.
* `report.json`: The output JSON file containing the transaction patterns.

### Supported Input Types

`vt transactions` supports different input file formats through the `--input-type` flag:
* Default: Assumes the input is a SQL file (such as a SQL script) or a slow query log.
* MySQL General Query Log: Use `--input-type=mysql-log` for MySQL general query logs.
* VTGate Query Log: Use `--input-type=vtgate-log` for VTGate query logs.
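
The flag choices above can be sketched as a small Python helper that assembles the command line for each input type. This is only an illustration: the function name `build_vt_transactions_argv` is hypothetical, and placing the flag before the file name is an assumption, since this README does not specify argument order.

```python
from typing import List, Optional

# Input types named in this README. None stands for the default
# (SQL file or slow query log), where no flag is passed.
KNOWN_INPUT_TYPES = {None, "mysql-log", "vtgate-log"}


def build_vt_transactions_argv(log_path: str, input_type: Optional[str] = None) -> List[str]:
    """Return an argv list for `vt transactions` (hypothetical helper).

    Assumption: the flag may appear before the positional log path.
    """
    if input_type not in KNOWN_INPUT_TYPES:
        raise ValueError(f"unsupported input type: {input_type!r}")
    argv = ["vt", "transactions"]
    if input_type is not None:
        argv.append(f"--input-type={input_type}")
    argv.append(log_path)
    return argv


if __name__ == "__main__":
    # Print the command for a MySQL general query log without running it.
    print(" ".join(build_vt_transactions_argv("general.log", "mysql-log")))
```

The resulting list can be passed to `subprocess.run` with `stdout` redirected to a file to reproduce the `> report.json` redirection from the usage example.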

## Understanding the JSON Output

The output JSON file contains an array of transaction patterns, each summarizing a set of queries that commonly occur together within transactions. Here’s a snippet of the JSON output showing a single pattern:

```json
{
  "query-signatures": [
    "update pos_reports where id = :0 set `csv`, `error`, intraday, pos_type, ...",
    "update pos_date_requests where cache_key = :1 set cache_value"
  ],
  "predicates": [
    "pos_date_requests.cache_key = ?",
    "pos_reports.id = ?"
  ],
  "count": 223
}
```

### Fields Explanation

* `query-signatures`: An array of generalized query patterns involved in the transaction. Placeholders like `:0`, `:1`, etc., represent variables in the queries.
* `predicates`: An array of predicates (conditions) extracted from the queries, generalized to identify patterns.
* `count`: The number of times this transaction pattern was observed in the logs.
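
As a rough illustration of working with these fields, the following Python sketch loads a report and lists the most frequently observed patterns. It assumes the report is a top-level JSON array of pattern objects, as described above; adjust the loading step if the actual file wraps the array in another object. The function names and the sample data are hypothetical.

```python
import json
from typing import Any, Dict, List


def load_patterns(text: str) -> List[Dict[str, Any]]:
    """Parse report text, assuming a top-level JSON array of pattern objects."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of transaction patterns")
    return data


def top_patterns(patterns: List[Dict[str, Any]], limit: int = 5) -> List[Dict[str, Any]]:
    """Return up to `limit` patterns, highest `count` first."""
    return sorted(patterns, key=lambda p: p.get("count", 0), reverse=True)[:limit]


if __name__ == "__main__":
    # Made-up sample data in the shape described by this README.
    sample = """
    [
      {"query-signatures": ["update t1 where id = :0 set a"],
       "predicates": ["t1.id = ?"], "count": 12},
      {"query-signatures": ["update t2 where k = :0 set b"],
       "predicates": ["t2.k = ?"], "count": 40}
    ]
    """
    for pattern in top_patterns(load_patterns(sample)):
        print(pattern["count"], pattern["query-signatures"])
```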

### Understanding Predicates

The `predicates` array lists the conditions used in the transactional queries, with variables generalized for pattern recognition.
* Shared Variables: A variable that is used across different predicates within a transaction is assigned a numerical placeholder (e.g., 0, 1, 2). Predicates that carry the same number use the same variable or value.
* Unique Variables: Variables that are unique to a single predicate are represented with a `?`.

### Example Explained

Consider the following predicates array:

```json
{
  "predicates": [
    "timesheets.day = ?",
    "timesheets.craft_id = ?",
    "timesheets.store_id = ?",
    "dailies.day = 0",
    "dailies.craft_id = 1",
    "dailies.store_id = 2",
    "tickets.day = 0",
    "tickets.craft_id = 1",
    "tickets.store_id = 2"
  ]
}
```

* Shared Values: A value that appears in more than one predicate is assigned a numerical placeholder (0, 1, 2), and every predicate using that value carries the same number.
  * For example, `dailies.craft_id = 1` and `tickets.craft_id = 1` share the same variable or value (represented as 1).
* Unique Values: A value that appears in only one predicate is represented with `?`, indicating a unique or less significant variable in the pattern.
  * For example, `timesheets.day = ?` represents a unique value for `day`.

This numbering helps identify the relationships between different predicates in the transaction patterns and can be used to optimize queries or understand transaction scopes.
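
To make the relationships in the example above concrete, here is a minimal Python sketch that groups predicates by their shared placeholder number. It only understands the simple `table.column = placeholder` shape shown in the example. The regular expression and the function name `group_shared_predicates` are assumptions for illustration, not part of `vt`.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Matches "table.column = placeholder", where the placeholder is "?" or a number.
# Other operators or predicate shapes are not handled by this sketch.
PREDICATE_RE = re.compile(r"^\s*(\w+)\.(\w+)\s*=\s*(\?|\d+)\s*$")


def group_shared_predicates(predicates: List[str]) -> Tuple[Dict[int, List[str]], List[str]]:
    """Split predicates into columns sharing a numbered value and unique ones."""
    shared: Dict[int, List[str]] = defaultdict(list)
    unique: List[str] = []
    for pred in predicates:
        match = PREDICATE_RE.match(pred)
        if match is None:
            continue  # skip shapes this sketch does not understand
        table, column, placeholder = match.groups()
        name = f"{table}.{column}"
        if placeholder == "?":
            unique.append(name)
        else:
            shared[int(placeholder)].append(name)
    return dict(shared), unique


if __name__ == "__main__":
    example = [
        "timesheets.day = ?", "timesheets.craft_id = ?", "timesheets.store_id = ?",
        "dailies.day = 0", "dailies.craft_id = 1", "dailies.store_id = 2",
        "tickets.day = 0", "tickets.craft_id = 1", "tickets.store_id = 2",
    ]
    shared, unique = group_shared_predicates(example)
    for number, columns in sorted(shared.items()):
        print(number, columns)
    print("unique:", unique)
```

For the example array, placeholder 0 groups `dailies.day` with `tickets.day`, which shows at a glance that both tables are filtered on the same day within the transaction.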

## Practical Use Cases

* Optimization: Identify frequently occurring transactions to optimize database performance.
* Sharding Strategy: When implementing horizontal sharding, it’s crucial to ensure that as many transactions as possible are confined to a single shard. The insights from `vt transactions` can help in choosing appropriate sharding keys for your tables to achieve this.
* Audit: Analyze transactional patterns for security audits or compliance checks.
* Debugging: Understand complex transaction behaviors during development or troubleshooting.
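
For the sharding use case, one possible starting point is to rank column names by how many observed transactions use them in shared (numbered) predicates. The Python sketch below is a heuristic under stated assumptions, not something `vt` does itself: it assumes patterns have the `predicates` and `count` fields described above, that predicates look like `table.column = N`, and that a column name shared across tables is a candidate worth inspecting. The function name `rank_shared_columns` and the sample data are hypothetical.

```python
import re
from collections import Counter
from typing import Any, Dict, List, Tuple

# Matches only shared predicates, i.e. "table.column = <number>".
SHARED_RE = re.compile(r"^\s*\w+\.(\w+)\s*=\s*\d+\s*$")


def rank_shared_columns(patterns: List[Dict[str, Any]]) -> List[Tuple[str, int]]:
    """Sum pattern counts per column name used in a shared predicate."""
    totals: Counter = Counter()
    for pattern in patterns:
        # Count each column once per pattern, however many tables use it.
        columns = set()
        for pred in pattern.get("predicates", []):
            match = SHARED_RE.match(pred)
            if match:
                columns.add(match.group(1))
        for column in columns:
            totals[column] += pattern.get("count", 0)
    return totals.most_common()


if __name__ == "__main__":
    # Made-up patterns in the shape described by this README.
    patterns = [
        {"predicates": ["dailies.store_id = 0", "tickets.store_id = 0",
                        "tickets.day = ?"], "count": 10},
        {"predicates": ["dailies.day = 0", "tickets.day = 0",
                        "dailies.store_id = 1", "tickets.store_id = 1"], "count": 3},
    ]
    for column, total in rank_shared_columns(patterns):
        print(column, total)
```

A high total only suggests that transactions often tie several tables together through that column. Whether it makes a good sharding key still depends on data distribution and the rest of the workload.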