-
-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] @graphile/lds2 #662
Comments
I guess I'll get this party starting with an immediate comment on my own proposal. Having read everything meticulously again, I think there might be a way to prevent breaking changes by implementing point nr. 3 (watch capability) and using the additional schema information to retrieve the pk in case of I do have to say that I am not familiar with the technical implementation of this watching, so I am not sure how much (duplicate) effort it would be to incorporate this in either |
Excellent outline 🙌 Most of this week (except Thursday) is dedicated to OSS from me, so ping me on Discord if you want to chat about this in real time so we can try and make some headway. I'm on UK time (UTC+1). QuestionsQ1: Do we need include-pk?I think you're saying that when you have If we can fix this particular issue in Q2: What specifically do you think we need to monitor for in watch mode?The list of columns/PK obviously, but not sure what else? We could/should add watch mode support to Q3: What's involved in adding virtual table support; can we add this to
|
Link to the previous Discord discussion: https://discord.com/channels/489127045289476126/489717163280826374/756222484172243054 |
Thanks! I'll try and see if I can make some time this week.
I already opened up an issue in the @eulerto Just pinging you to this discussion, if you have any useful ideas on how to tackle detecting the version of wal2json that'd be very helpful!
I think we can easily solve this using said watch capability in combination with the logic you're describing to detect that there's more in
I would also argue that we make sure that we watch the relations of any partitioned tables, or other constructs (timescale) that look like it. This will allow us to correctly announce for the parent table as well for any child ones. I built a small fix for timescale once to make sure that a change in a chunk of a table would also trigger the table itself:
Above fix was quite trivial, because I could just register another callback on the system table that was keeping track of this relation between hypertables and their chunks. So on startup of the plugin I would gather all links, and afterwards place a callback to listen to any changes on this system table so I would stay up to date: this.subscriptions = {
'["_timescaledb_catalog", "hypertable"]': [
[updateHyperTableChunks, undefined],
],
};
I believe it could be as simple as this: this.subscriptions = {
'["pg_catalog", "pg_inherits"]': [
[updateTablePartitions, undefined],
],
}; The table |
Regarding Q3; are you suggesting that pg_catalog and pg_inherits are streamed via logical decoding too, and that we could subscribe to this? That's pretty cool! What I meant, though, is we should support these partitioned/hyper tables via PgTablesPlugin and the like. |
Hmm, I just tried this locally but I couldn't listen on any pg_catalog.* table. It probably only works because the way timescale does it isn't regarded as being a system table. |
@wesselvdv To keep this moving forward; can I suggest that you raise a PR just to solve the problem of Question 1; i.e. look at the length of In general, I'm thinking that we could/should use a flag rather than a separate version in order to support different wal2json features. This is slightly complicated by LDS being independent of Graphile Engine, but I think we can still achieve it 👍 |
@benjie I agree, what's the best approach for incorporating the watch capabilities? I have yet to take a look at |
The LDS module is completely standalone to graphile-build (you can use it without caring about GraphQL at all), so it would have to implement its own watch mode. Rather than having watch mode, I think I would do just-in-time introspection of the tables. So when the first request comes through for a table, quickly look up it's primary keys, and cache it. Then all the |
@benji Wouldn't it be more useful to register events for ddl changes, and store these changes in a central audit table. Said audit table can then be listened upon using the in place callback system. I just tested the below, seems to work nicely. I'll try to incorporate timescale hypertable detection in this aswel. create schema graphile_lds_watch;
CREATE TABLE graphile_lds_watch.tables
(
"table" regclass primary key NOT NULL,
primary_keys TEXT[] NOT NULL DEFAULT '{}'::TEXT[],
parent REGCLASS NULL,
last_updated TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE OR REPLACE FUNCTION graphile_lds_watch.track_primary_keys()
RETURNS event_trigger AS
$$
DECLARE
r RECORD;
primary_keys TEXT[];
parent regclass;
pg_version numeric = current_setting('server_version_num');
BEGIN
FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
IF r.command_tag = 'ALTER TABLE' OR r.command_tag = 'CREATE TABLE' then
primary_keys := (
SELECT array_agg(a.attname)
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = ANY (i.indkey)
WHERE i.indisprimary
AND i.indrelid = r.objid
);
IF pg_version >= 10000 THEN
parent := (
SELECT inhparent::regclass
FROM pg_catalog.pg_inherits
WHERE inhrelid = r.objid
);
end if;
INSERT
INTO graphile_lds_watch.tables("table", primary_keys, parent, last_updated)
VALUES (r.object_identity, coalesce(primary_keys, '{}'::TEXT[]), parent,
statement_timestamp())
ON CONFLICT ("table")
DO UPDATE SET primary_keys = excluded.primary_keys,
parent = excluded.parent;
end if;
END LOOP;
END ;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER track_ddl ON ddl_command_end
EXECUTE PROCEDURE graphile_lds_watch.track_primary_keys();
CREATE EVENT TRIGGER track_ddl_drops ON sql_drop
EXECUTE PROCEDURE graphile_lds_watch.track_primary_keys();
DROP TABLE IF EXISTS attributes;
CREATE TABLE IF NOT EXISTS attributes
(
id serial NOT NULL,
date TIMESTAMPTZ DEFAULT 'infinity'::TIMESTAMPTZ NOT NULL,
PRIMARY KEY (id, date)
);
ALTER TABLE attributes
DROP CONSTRAINT attributes_pkey;
ALTER TABLE attributes
ADD CONSTRAINT attributes_pkey_two PRIMARY KEY (id);
DROP TABLE IF EXISTS measurement;
CREATE TABLE IF NOT EXISTS measurement
(
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
DROP TABLE IF EXISTS measurement_y2006m02;
CREATE TABLE IF NOT EXISTS measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
DROP TABLE IF EXISTS measurement_y2006m03;
CREATE TABLE IF NOT EXISTS measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
DROP TABLE IF EXISTS measurement_y2007m11;
CREATE TABLE IF NOT EXISTS measurement_y2007m11 PARTITION OF measurement
FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
SELECT * FROM graphile_lds_watch.tables; |
I'm not sure I see the value of mutating the database on every mutation in the database to be able to track mutations in the database; could you expand? In PostGraphile the event triggers come through to Node which then checks the latest state of the DB; I'd expect something similar to work well here? I'm not super comfortable with any tool mutating the database, there has to be a really good reason to do it. Event triggers in particular are frustrating because you can't easily omit them from pg_dumps so they pollute user's schema dumps. |
@wesselvdv Are you still interested in working on this enhancement? I'm looking to make some optimizations to the live-query system (as described here), and this enhancement appears to be a necessary step for entries 1 and 3 in that thread. Specifically, they rely on having the full data of the changed row (for efficient matching against active live-queries), which requires |
Feature description
The new package
@graphile/lds2
is in essence a copy of the current implementation with two changes (1 & 2) and one addition (3):Minimal required version of wal2json is at this time of writing 2.3, due to requiring the
include-pk
parameter to be available. This is required for@graphile/subscriptions-lds
to properly work in cases whereREPLICA IDENTITY FULL
is enabled on a table.Dropping support for
version-format 1
, and moving toversion-format 2
which no longer reports changes on a transaction level, but on a tuple level. Additional bonus is that in my opinion the json object is structured more like what the industry standard is today.Adding watch capability much to what I believe
graphile-build
has for building it schema. Currently@graphile/lds
is not able to actively listen to schema changes, adding this would allow us to circumvent the need for theinclude-pk
parameter and as such also keep supportingversion-format 1
and wal2json <= 2.3, yet this is not the main goal of this addition.I believe the most useful purpose of adding a watch capability is to be able to support subscriptions on native pg >=11 partitioned tables and timescale hypertables. These table structures are providing a
virtual
table that one can doinsert
,update
,delete
,select
upon, but in actuality these operations are redirected to one or more sub tables (partitions). wal2json does NOT report changes on this (virtual
) parent table level, but on the actual subtable that the change occurred. This makes it impossible to subscribe to these type of tables with the current implementation of@graphile/lds
. I am aware of the fact that native pg >= 11 partitioned tables are NOT very well supported inpostgraphile
atm (Only the partitions are showing up in the schema, not the virtual table), but due to the different technical implementation of timescale hypertables these are actually coming out in the generated schema just fine (No partitions are shown, just the parent table).Having this watch capability allows us to figure out that when a change occurs in one of these sub tables, we also need to fabricate a same change for the (
virtual
) parent table.Motivating example
Motivation for this alternative version is twofold, being that the the current implementation of
@graphile/lds
:Assumes that the primary key information of a change is always included in the property
oldkeys
if the propertyoldkeys
is included. This is NOT always the case due to the possibilty of enablingREPLICA IDENTITY FULL
on a table which changes the contents ofoldkeys
from not only containing the primary key(s), but to containing the entire record as it was before said change (much like theOLD
value in update/delete triggers). As such, enabling this feature on a table breaks the current implementation of@graphile/lds
.Fixing the above issue requires the use of a parameter
include-pk
which is only available in version >=2.3 of wal2json, and as of right now we haven't been able to find a method to detect which version of wal2json is installed. The only way seems to be that oftry {} catch {}
the code with the parameter and falling back on one without if it fails with a parameter unknown error. Problem I have with this approach is that we'd have to resort to string matching to determine the cause, which feels a bit too iffy to me.Is built on
version-format 1
of wal2json which produces a JSON object per transaction. All of the new/old tuples are available in the JSON object. Whereas the newerversion-format 2
produces a JSON object per tuple, supports thetruncate
operation and provides in my opinion a more logical constructed object.Below are both formats as defined in
typescript
definitions, these are both the default format which can be augmented with additional metadata based on parameters. One of which is the much neededinclude-pk
which is only available in wal2json >= 2.3format-version 1
:format-version 2
:As we discussed in Discord fixing/adding the above 2 points would result in such breaking changes that are whole new package is warranted. This new package
@graphile/lds2
would have more strict requirements on which version of wal2json is used.Breaking changes
None, because it is a new package. Yet, in order to make this package useful an update to
@graphile/subscription-lds
or alternative package that does the same but using@graphile/lds2
is required.Supporting development
I [tick all that apply]:
The text was updated successfully, but these errors were encountered: