Enhance dbinfo sub-command with column names and global variables #70

Merged on Nov 29, 2024 (12 commits)
50 changes: 38 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,25 @@
# VT Utilities

The `vt` binary encapsulates several utility tools for Vitess, providing a comprehensive suite for testing, summarizing, and query analysis.
The `vt` binary encapsulates several utility tools for Vitess, providing a comprehensive suite for testing, summarizing,
and query analysis.

## Tools Included
- **`vt test`**: A testing utility using the same test files as the [MySQL Test Framework](https://github.com/mysql/mysql-server/tree/8.0/mysql-test). It compares the results of identical queries executed on both MySQL and Vitess (vtgate), helping to ensure compatibility.
- **`vt keys`**: A utility that analyzes query logs and provides information about queries, tables, joins, and column usage.
- **`vt transactions`**: A tool that analyzes query logs to identify transaction patterns and outputs a JSON report detailing these patterns.
- **`vt trace`**: A tool that generates execution traces for queries without comparing against MySQL. It helps analyze query behavior and performance in Vitess environments.

- **`vt test`**: A testing utility using the same test files as
the [MySQL Test Framework](https://github.com/mysql/mysql-server/tree/8.0/mysql-test). It compares the results of
identical queries executed on both MySQL and Vitess (vtgate), helping to ensure compatibility.
- **`vt keys`**: A utility that analyzes query logs and provides information about queries, tables, joins, and column
usage.
- **`vt transactions`**: A tool that analyzes query logs to identify transaction patterns and outputs a JSON report
detailing these patterns.
- **`vt trace`**: A tool that generates execution traces for queries without comparing against MySQL. It helps analyze
query behavior and performance in Vitess environments.
- **`vt summarize`**: A tool used to summarize or compare trace logs or key logs for easier human consumption.
- **`vt dbinfo`**: A tool that provides information about the database schema, including row counts, useful column
attributes, and a relevant subset of global variables.

## Installation

You can install `vt` using the following command:

```bash
@@ -18,7 +28,8 @@ go install github.com/vitessio/vt/go/vt@latest

## Testing Methodology

To verify compatibility and correctness, the testing strategy involves running identical queries on both MySQL and vtgate, followed by a comparison of results. The process includes:
To verify compatibility and correctness, the testing strategy involves running identical queries on both MySQL and
vtgate, followed by a comparison of results. The process includes:

1. **Query Execution**: Each test query is executed on both MySQL and vtgate.
2. **Result Comparison**: The returned data, result set structure (column types, order), and errors are compared.
@@ -27,7 +38,9 @@ To verify compatibility and correctness, the testing strategy involves running i
This dual-testing strategy ensures high confidence in vtgate's compatibility with MySQL.

### Sharded Testing Strategy
Vitess operates in a sharded environment, presenting unique challenges, especially during schema changes (DDL). The `vt test` tool handles these by converting DDL statements into VSchema commands.

Vitess operates in a sharded environment, presenting unique challenges, especially during schema changes (DDL). The
`vt test` tool handles these by converting DDL statements into VSchema commands.

Here's an example of running `vt test`:

@@ -60,7 +73,9 @@ vt trace --vschema=t/vschema.json --backup-path=/path/to/backup --number-of-shar
```

`vt trace` accepts most of the same configuration flags as `vt test`, including:
- `--sharded`: Enable auto-sharded mode - uses primary keys as sharding keys. Not a good idea for a production environment, but can be used to ensure that all queries work in a sharded environment.

- `--sharded`: Enable auto-sharded mode - uses primary keys as sharding keys. Not a good idea for a production
environment, but can be used to ensure that all queries work in a sharded environment.
- `--vschema`: Specify the VSchema configuration
- `--backup-path`: Initialize from a backup
- `--number-of-shards`: Specify the number of shards to bring up
@@ -69,6 +84,7 @@ vt trace --vschema=t/vschema.json --backup-path=/path/to/backup --number-of-shar
Both `vt trace` and `vt keys` support different input file formats through the `--input-type` flag:

Example using different input types:

```bash
# Analyze SQL file or slow query log
vt trace slow-query.log > trace-log.json
@@ -90,7 +106,7 @@ vt summarize trace-log1.json trace-log2.json # Compare two traces
## Key Analysis Workflow

`vt keys` analyzes query logs and outputs detailed information about table and column usage and joins in queries.
This data can be summarized using `vt summarize`.
This data can be summarized using `vt summarize`.
Here's a typical workflow:

1. **Run `vt keys` to analyze queries**:
@@ -106,17 +122,26 @@ Here's a typical workflow:
vt trace --input-type=vtgate-log vtgate-querylog.log > trace-log.json
```

This command generates a `keys-log.json` file that contains a detailed analysis of table and column usage from the queries.
This command generates a `keys-log.json` file that contains a detailed analysis of table and column usage from the
queries.

2. **Summarize the `keys-log` using `vt summarize`**:

```bash
vt summarize keys-log.json
```

This command summarizes the key analysis, providing insight into which tables and columns are used across queries, and how frequently they are involved in filters, groupings, and joins.
This command summarizes the key analysis, providing insight into which tables and columns are used across queries,
and how frequently they are involved in filters, groupings, and joins.
[Here](https://github.com/vitessio/vt/blob/main/go/summarize/testdata/keys-summary.md) is an example summary report.

If you have access to the running database, you can use `vt dbinfo > dbinfo.json` and pass it to `summarize` so
that the analysis can take into account the additional information from the database schema and configuration:

```bash
vt summarize keys-log.json dbinfo.json
```

## Transaction Analysis with vt transactions
The `vt transactions` command is designed to analyze query logs and identify patterns of transactional queries.
It processes the logs to find sequences of queries that form transactions and outputs a JSON report summarizing these patterns.
@@ -156,4 +181,5 @@ Vitess Tester is licensed under the Apache 2.0 license. See the [LICENSE](./LICE

## Acknowledgments

Vitess Tester started as a fork from [pingcap/mysql-tester](https://github.com/pingcap/mysql-tester). We thank the original authors for their foundational work.
Vitess Tester started as a fork from [pingcap/mysql-tester](https://github.com/pingcap/mysql-tester). We thank the
original authors for their foundational work.
12 changes: 6 additions & 6 deletions go/cmd/schema.go → go/cmd/dbinfo.go
@@ -20,23 +20,23 @@ import (
"github.com/spf13/cobra"
"vitess.io/vitess/go/mysql"

"github.com/vitessio/vt/go/schema"
"github.com/vitessio/vt/go/dbinfo"
)

func schemaCmd() *cobra.Command {
func dbinfoCmd() *cobra.Command {
var vtParams mysql.ConnParams

cmd := &cobra.Command{
Use: "schema ",
Use: "dbinfo ",
Short: "Loads info from the database including row counts",
Example: "vt schema",
Example: "vt dbinfo",
Args: cobra.ExactArgs(0),
RunE: func(_ *cobra.Command, _ []string) error {
cfg := schema.Config{
cfg := dbinfo.Config{
VTParams: vtParams,
}

return schema.Run(cfg)
return dbinfo.Run(cfg)
},
}

2 changes: 1 addition & 1 deletion go/cmd/root.go
@@ -37,7 +37,7 @@ func Execute() {
root.AddCommand(testerCmd())
root.AddCommand(tracerCmd())
root.AddCommand(keysCmd())
root.AddCommand(schemaCmd())
root.AddCommand(dbinfoCmd())
root.AddCommand(transactionsCmd())

err := root.Execute()
76 changes: 54 additions & 22 deletions go/schema/schema.go → go/dbinfo/dbinfo.go
Original file line number Diff line number Diff line change
@@ -14,12 +14,10 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

package schema
package dbinfo

import (
"context"
"encoding/json"
"fmt"
"io"
"os"

@@ -47,14 +45,24 @@ func run(out io.Writer, cfg Config) error {
return err
}

type TableColumn struct {
Name string `json:"name"`
Type string `json:"type"`
KeyType string `json:"keyType,omitempty"`
IsNullable bool `json:"isNullable,omitempty"`
Extra string `json:"extra,omitempty"`
}

type TableInfo struct {
Name string `json:"name"`
Rows int `json:"rows"`
Name string `json:"name"`
Rows int `json:"rows"`
Columns []*TableColumn `json:"columns"`
}

type Info struct {
FileType string `json:"fileType"`
Tables []TableInfo `json:"tables"`
FileType string `json:"fileType"`
Tables []*TableInfo `json:"tables"`
GlobalVariables map[string]string `json:"globalVariables"`
}
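
To make the resulting JSON layout concrete, the following sketch copies the three struct definitions from this diff into a standalone program and marshals a made-up value. The table name, column names and types, row count, and the global variable are hypothetical sample data, not output from a real database, and `sampleInfo` and `encode` are helper names invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Struct definitions copied from the new go/dbinfo/dbinfo.go in this diff.
type TableColumn struct {
	Name       string `json:"name"`
	Type       string `json:"type"`
	KeyType    string `json:"keyType,omitempty"`
	IsNullable bool   `json:"isNullable,omitempty"`
	Extra      string `json:"extra,omitempty"`
}

type TableInfo struct {
	Name    string         `json:"name"`
	Rows    int            `json:"rows"`
	Columns []*TableColumn `json:"columns"`
}

type Info struct {
	FileType        string            `json:"fileType"`
	Tables          []*TableInfo      `json:"tables"`
	GlobalVariables map[string]string `json:"globalVariables"`
}

// sampleInfo builds a hypothetical value; none of this data comes from a real database.
func sampleInfo() *Info {
	return &Info{
		FileType: "dbinfo",
		Tables: []*TableInfo{{
			Name: "users",
			Rows: 42,
			Columns: []*TableColumn{
				{Name: "id", Type: "int", KeyType: "PRI"},
				{Name: "email", Type: "varchar", IsNullable: true},
			},
		}},
		GlobalVariables: map[string]string{"sql_mode": "STRICT_TRANS_TABLES"},
	}
}

// encode renders the value as compact JSON.
func encode(info *Info) (string, error) {
	b, err := json.Marshal(info)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(sampleInfo())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because of the `omitempty` tags, a column with `IsNullable: false` or an empty `KeyType`/`Extra` has those keys left out entirely, so a consumer of the file would need to treat a missing `isNullable` as `false`.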

func Get(cfg Config) (*Info, error) {
@@ -66,29 +74,53 @@ func Get(cfg Config) (*Info, error) {
DbName: cfg.VTParams.DbName,
}

vtConn, err := mysql.Connect(context.Background(), vtParams)
dbh := NewDBHelper(vtParams)
ts, err := dbh.getTableSizes()
if err != nil {
return nil, err
}
defer vtConn.Close()
queryTableSizes := "SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = '%s' and table_type = 'BASE TABLE'"
qr, err := vtConn.ExecuteFetch(fmt.Sprintf(queryTableSizes, cfg.VTParams.DbName), -1, false)

var tableInfo []*TableInfo
tableMap := make(map[string]*TableInfo)

for tableName, tableRows := range ts {
tableMap[tableName] = &TableInfo{
Name: tableName,
Rows: tableRows,
}
}

tc, err := dbh.getColumnInfo()
if err != nil {
return nil, err
}
var tables []TableInfo
for _, row := range qr.Rows {
tableName := row[0].ToString()
tableRows, _ := row[1].ToInt64()
tables = append(tables, TableInfo{
Name: tableName,
Rows: int(tableRows),
})

for tableName, columns := range tc {
ti, ok := tableMap[tableName]
if !ok {
ti = &TableInfo{
Name: tableName,
}
tableMap[tableName] = ti
}
ti.Columns = columns
}

for tableName := range tableMap {
tableInfo = append(tableInfo, tableMap[tableName])
}

globalVariables, err := dbh.getGlobalVariables()
if err != nil {
return nil, err
}
schemaInfo := &Info{
Tables: tables,

dbInfo := &Info{
FileType: "dbinfo",
Tables: tableInfo,
GlobalVariables: globalVariables,
}
return schemaInfo, nil
return dbInfo, nil
}
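
One property of the merge logic above is worth noting: Go does not define map iteration order, so building the `Tables` slice by ranging over `tableMap` can yield a different table order on each run, unless the result is sorted somewhere not visible in this excerpt. If stable output matters, for example when diffing two `dbinfo.json` files, sorting by table name is one option. The sketch below is not part of this PR; `mergeTables`, `table`, and the sample inputs are hypothetical names used only for illustration, and columns are reduced to plain strings to keep it short.

```go
package main

import (
	"fmt"
	"sort"
)

// table is a simplified, hypothetical stand-in for TableInfo.
type table struct {
	Name    string
	Rows    int
	Columns []string
}

// mergeTables combines row counts and column lists keyed by table name,
// then returns the result sorted by name so the order is deterministic.
func mergeTables(sizes map[string]int, columns map[string][]string) []*table {
	byName := make(map[string]*table)
	for name, rows := range sizes {
		byName[name] = &table{Name: name, Rows: rows}
	}
	for name, cols := range columns {
		t, ok := byName[name]
		if !ok {
			// Table has column info but no row count; keep it with Rows == 0.
			t = &table{Name: name}
			byName[name] = t
		}
		t.Columns = cols
	}

	result := make([]*table, 0, len(byName))
	for _, t := range byName {
		result = append(result, t)
	}
	// Sorting removes the dependence on map iteration order.
	sort.Slice(result, func(i, j int) bool { return result[i].Name < result[j].Name })
	return result
}

func main() {
	sizes := map[string]int{"orders": 10, "customers": 3}
	columns := map[string][]string{
		"orders":   {"id", "customer_id"},
		"archived": {"id"},
	}
	for _, t := range mergeTables(sizes, columns) {
		fmt.Printf("%s rows=%d columns=%v\n", t.Name, t.Rows, t.Columns)
	}
}
```

The same two-pass shape as in `Get` is kept here (row counts first, then columns, creating an entry when a table appears only in the column data); the only addition is the final `sort.Slice` call.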

func Load(fileName string) (*Info, error) {