-
Notifications
You must be signed in to change notification settings - Fork 3
Database configuration and migration #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
6148a0b
to
2250ed3
Compare
2250ed3
to
12d3063
Compare
7d2c0c1
to
6e94ae8
Compare
6e94ae8
to
ed959fb
Compare
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Outdated
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Outdated
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Outdated
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Show resolved
Hide resolved
wp-includes/sqlite-ast/class-wp-sqlite-information-schema-reconstructor.php
Outdated
Show resolved
Hide resolved
Great start! And also a good idea to use a cascade of data sources for backfilling. There's some more way to go, though — I've commented on a bunch of potential issues |
a8a5604
to
231bab2
Compare
f514c0b
to
21476a5
Compare
21476a5
to
c67f6c4
Compare
|
||
jobs: | ||
verify-version: | ||
name: Verify that plugin version matches SQLITE_DRIVER_VERSION constant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about this slightly longer name?
name: Verify that plugin version matches SQLITE_DRIVER_VERSION constant | |
name: Assert the WordPress plugin header declares the same version as the SQLITE_DRIVER_VERSION constant |
$parser = new WP_MySQL_Parser( self::$mysql_grammar, $tokens ); | ||
$ast = $parser->parse(); | ||
|
||
// TODO: Translate and execute all queries in the SQL input string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we don't need to support that:
- $wpdb uses mysqli_query
mysqli_query
doc page says "mysqli::query -- mysqli_query — Performs a query on the database"- There'e a separate mysqli_multi_query function to run more than one query.
Does this mean WordPress actually doesn't support multiple queries in here?
Also, dbDelta supports multiple queries by exploding the string by ;
🙈 The query parser would make it much safer once we can contribute to core again. cc @dmsnell
Also, it would be useful to have a first-class API for looping over multiple queries in a string – or even in a stream. It's not for this PR for sure, but it seems like we have all the parts that we need. We'd just need to detect when it fails due to a syntax error vs incomplete input and add a method such as append_bytes()
and presto. Then we could just run any, arbitrarily large mysql dump.
@@ -2182,7 +2236,7 @@ private function translate( $node ): ?string { | |||
'%s AS %s', | |||
$value, | |||
$this->quote_sqlite_identifier( | |||
'@@' . ( $type_token ? "$type_token->value." : '' ) . $original_name | |||
'@@' . ( $type_token ? "{$type_token->get_value()}." : '' ) . $original_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: slightly clearer code style
'@@' . ( $type_token ? "{$type_token->get_value()}." : '' ) . $original_name | |
'@@' . ( $type_token ? $type_token->get_value().'' : '' ) . $original_name |
@@ -3481,10 +3540,9 @@ private function unquote_sqlite_identifier( string $quoted_identifier ): string | |||
$first_byte = $quoted_identifier[0] ?? null; | |||
if ( '"' === $first_byte || '`' === $first_byte ) { | |||
$unquoted = substr( $quoted_identifier, 1, -1 ); | |||
} else { | |||
$unquoted = $quoted_identifier; | |||
return str_replace( $first_byte . $first_byte, $first_byte, $unquoted ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a state machine for this? Or is this actually this easy for all possible scenarios? E.g. quadruple""""quotes
, backslashes without \""NO_BACKSLASH_ESCAPES\""
etc.?
FROM sqlite_schema | ||
WHERE type = 'table' | ||
AND name != ? | ||
AND name NOT LIKE ? ESCAPE '\' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL about ESCAPE
* @return string[] The names of tables in the information schema. | ||
*/ | ||
private function get_information_schema_table_names(): array { | ||
$tables_table = $this->schema_builder->get_table_name( false, 'tables' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always nervous about having to remember which calls produce a contextually-safe part of the query and which don't. Is there anything we could do to the name to make this clear? As in get_quoted_table_name
or get_escaped_table_name()
or so?
Alternatively – and for the future – we might want to consider our own flavor of prepared statements that supports binding table names. But that's way beyond the scope here.
Tangentially related, @dmsnell taught me that backticks can be a part of MySQL tables names! So a table can be called wp_plugin_`meta`_data
* | ||
* @return array<string, WP_Parser_Node> The WordPress CREATE TABLE statements. | ||
*/ | ||
private function get_wp_create_table_statements(): array { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, so much better than what WordPress core does:
private function generate_create_table_statement( string $table_name ): string { | ||
// Columns. | ||
$columns = $this->driver->execute_sqlite_query( | ||
sprintf( 'PRAGMA table_xinfo("%s")', $table_name ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto my point from earlier about safe vs unsafe inputs. Can we either explicitly quote this, even if we don't have to, just to make tell the future reader "yup, all safe, nothing to look for here"?
|
||
// Key columns. | ||
$key_columns = $this->driver->execute_sqlite_query( | ||
'SELECT * FROM pragma_index_info("' . $key_info['name'] . '")' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double quote is a valid part of a table name and perhaps also a key name in SQLite, let's quote eagerly.
sqlite> create table "tricky_""table""_name" (id int);
sqlite> .tables
tricky_"table"_name
} | ||
|
||
// Unquoted string literal. E.g.: abc | ||
return $this->escape_mysql_string_literal( $default_value ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this happen in practice? Or should this be an error? I'm worried about converting things like DEFAULT NOW()
to DEFAULT "NOW()"
} | ||
|
||
/** | ||
* 5. Otherwise, the affinity is NUMERIC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great documentation
private function escape_mysql_string_literal( string $literal ): string { | ||
// See: https://www.php.net/manual/en/mysqli.real-escape-string.php | ||
return "'" . addcslashes( $literal, "\0\n\r'\"\Z" ) . "'"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmsnell can you poke any holes in this? We assume everything is UTF-8, and if it isn't then GIGO.
* @return string The quoted identifier value. | ||
*/ | ||
private function quote_sqlite_identifier( string $unquoted_identifier ): string { | ||
return '`' . str_replace( '`', '``', $unquoted_identifier ) . '`'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you told me once that double backticks are the only way to escape a backtick so this seems fine. Can we mention that in the docstring with a documentation link? It just looks suspiciously easy and I don't want to mislead anyone in the future into searching for a security issue here.
$first_byte = $quoted_identifier[0] ?? null; | ||
if ( '"' === $first_byte || '`' === $first_byte ) { | ||
$unquoted = substr( $quoted_identifier, 1, -1 ); | ||
return str_replace( $first_byte . $first_byte, $first_byte, $unquoted ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto my state machine question from earlier. Should we use the parser for this?
I left some more notes. We're getting close here. Thank you for persevering @JanJakes! This is the last stretch before we can finally enjoy the new parser everywhere ❤️ I know I'm going deep on a piece of code we only need for a one-off migration, but a) it still matters and b) I think we'll reuse it later on for migrating between the database backends. |
This PR extracts existing SQLite database configuration to a
WP_SQLite_Configurator
and implementsWP_SQLite_Information_Schema_Reconstructor
with the ability to backfill missing table data in information schema.How it works:
WP_SQLite_Configurator::ensure_database_configured()
), while acquiring anEXCLUSIVE
lock to prevent race conditions (multiple requests triggering the same migration).WP_SQLite_Information_Schema_Reconstructor
.wp_get_db_schema()
._mysql_data_types_cache
, if it exists, otherwise directly from the SQLite schema using SQLite column affinity rules.