Simplest MVP transaction filtering #398
lgtm, just a small comment about the not
How do we plan to share this crate with aptos-core?
@bowenyang007 sorry I don't understand the question
Data service needs to use this right? I'm curious if we're going to copy over the code or upload it to crates.io. A few additional questions:
@bowenyang007 I figured we'd do it the way we do everything else, i.e.
I can move stuff to the readme and add an example, sure
a28c57c to 13a63d8
Lots of small things, mostly around how the builders work in certain cases. Overall structure still looks great, molto bene.
if !self.data.is_allowed(&item.data) {
    return false;
}

true
Nit, could be collapsed to just returning self.data.is_allowed(&item.data) directly.
It could! I left it as-is to make the eventual merge of advanced filtering a bit easier haha, but I can change this too
I don't mind so much, explicit is good also.
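For reference, a minimal sketch of the two forms discussed in this thread. The types here (DataFilter, Item, ItemFilter) are hypothetical stand-ins, not the crate's actual definitions; the point is only that the explicit and collapsed bodies return the same value while there is a single check.

```rust
// Hypothetical stand-ins for the filter and item types discussed above.
struct DataFilter {
    allowed_prefix: String,
}

impl DataFilter {
    fn is_allowed(&self, data: &str) -> bool {
        data.starts_with(&self.allowed_prefix)
    }
}

struct Item {
    data: String,
}

struct ItemFilter {
    data: DataFilter,
}

impl ItemFilter {
    // Explicit form, as in the diff: leaves room for further checks later.
    fn is_allowed_explicit(&self, item: &Item) -> bool {
        if !self.data.is_allowed(&item.data) {
            return false;
        }

        true
    }

    // Collapsed form suggested in the review comment.
    fn is_allowed_collapsed(&self, item: &Item) -> bool {
        self.data.is_allowed(&item.data)
    }
}

fn main() {
    let filter = ItemFilter {
        data: DataFilter { allowed_prefix: "0x1".to_string() },
    };
    for data in ["0x1::coin", "0x2::other"] {
        let item = Item { data: data.to_string() };
        // Both forms agree for matching and non-matching items.
        assert_eq!(
            filter.is_allowed_explicit(&item),
            filter.is_allowed_collapsed(&item)
        );
    }
}
```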
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Serialize)]
#[serde(deny_unknown_fields)]
#[derive(derive_builder::Builder)]
#[builder(setter(into, strip_option), default)]
I wonder if default is a footgun in this case, it might be best to have people explicitly set (even to None) all the fields for this one? I can see people forgetting to set address and only setting module for example. Not too opinionated, up to you.
Same comment for the entry function filter and the like.
AFAIR the issue is that the builder requires default, unfortunately.
I'm not an expert in the library though, and there are a lot of config options. The biggest thing I wish we could do is call the validation function when calling "build", and have that return our error type. Maybe there is a way, and maybe I can do it without a lot of boilerplate; I can take a look.
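As an illustration of the "validate on build" idea, here is a hand-written builder sketch. It does not use derive_builder, and ModuleFilter, ModuleFilterBuilder, and FilterError are hypothetical names chosen for the example. derive_builder's documentation describes a build_fn(validate = "...") attribute that may give similar behavior, but that has not been checked against this crate's setup.

```rust
use std::fmt;

// Hypothetical filter with two optional fields; not the crate's actual type.
#[derive(Clone, Debug, Default, PartialEq)]
struct ModuleFilter {
    address: Option<String>,
    module: Option<String>,
}

// Hypothetical error type returned from `build`.
#[derive(Debug, PartialEq)]
enum FilterError {
    NoFieldsSet,
}

impl fmt::Display for FilterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FilterError::NoFieldsSet => write!(f, "at least one field must be set"),
        }
    }
}

impl ModuleFilter {
    // A filter with no fields set is treated as invalid in this sketch.
    fn validate(&self) -> Result<(), FilterError> {
        if self.address.is_none() && self.module.is_none() {
            return Err(FilterError::NoFieldsSet);
        }
        Ok(())
    }
}

// Every field defaults to None, so partially filled builders are accepted.
#[derive(Default)]
struct ModuleFilterBuilder {
    address: Option<String>,
    module: Option<String>,
}

impl ModuleFilterBuilder {
    fn address(mut self, address: impl Into<String>) -> Self {
        self.address = Some(address.into());
        self
    }

    fn module(mut self, module: impl Into<String>) -> Self {
        self.module = Some(module.into());
        self
    }

    // Validation runs inside `build`, so callers get our error type directly.
    fn build(self) -> Result<ModuleFilter, FilterError> {
        let filter = ModuleFilter {
            address: self.address,
            module: self.module,
        };
        filter.validate()?;
        Ok(filter)
    }
}

fn main() {
    // Setting only one field is allowed.
    let partial = ModuleFilterBuilder::default().module("coin").build();
    assert!(partial.is_ok());

    // Setting nothing is rejected at build time.
    let empty = ModuleFilterBuilder::default().build();
    assert_eq!(empty, Err(FilterError::NoFieldsSet));
}
```

With this shape, defaults stay in place for partial completion, while a completely empty filter is caught when build is called rather than later.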
if !(self.address.is_allowed(&module.address)
    && self.function.is_allowed(&module.name))
Not 100% sure I'm understanding this one right. In this block, address or function could be None. I believe None by default resolves to false for is_allowed based on the trait below, which I don't think is what we want here: if the user hasn't set a value for address then we shouldn't evaluate it as a filter?
Is there a unit test that explicitly checks the case where only one of address or function is set? That would put my mind at ease hahah.
Ah I get it, is_allowed is false if the value is None but true if the filter is None, perfect. In which case, I suppose if self.address.is_some() || self.function.is_some() is just an optimization, we don't actually need it?
Exactly- we don't explicitly need it, it's just an optimization to save on some ops if we don't need it
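The semantics agreed on in this thread can be sketched as follows. This is a simplified, hypothetical model (Exact, is_allowed_opt, ModuleFilter, and Module are illustrative names, not the crate's trait or types): an unset filter allows everything, a set filter rejects a missing value, and the is_some() guard only skips work.

```rust
// Hypothetical leaf filter: exact string match.
struct Exact(String);

impl Exact {
    fn is_allowed(&self, value: &str) -> bool {
        self.0 == value
    }
}

// Unset filter: allow. Set filter with a missing value: reject.
fn is_allowed_opt(filter: &Option<Exact>, value: Option<&str>) -> bool {
    match (filter, value) {
        (None, _) => true,
        (Some(_), None) => false,
        (Some(f), Some(v)) => f.is_allowed(v),
    }
}

// Hypothetical filter and item shapes for the example.
struct ModuleFilter {
    address: Option<Exact>,
    function: Option<Exact>,
}

struct Module {
    address: Option<String>,
    name: Option<String>,
}

impl ModuleFilter {
    fn is_allowed(&self, module: &Module) -> bool {
        // The guard is only a shortcut: with both filters unset,
        // the checks below would return true anyway.
        if self.address.is_some() || self.function.is_some() {
            return is_allowed_opt(&self.address, module.address.as_deref())
                && is_allowed_opt(&self.function, module.name.as_deref());
        }
        true
    }
}

fn main() {
    // Only `address` is set; `function` is left as None.
    let filter = ModuleFilter {
        address: Some(Exact("0x1".to_string())),
        function: None,
    };
    let module = Module {
        address: Some("0x1".to_string()),
        name: Some("transfer".to_string()),
    };
    assert!(filter.is_allowed(&module));

    // A set filter rejects an item whose value is missing.
    let no_address = Module { address: None, name: Some("transfer".to_string()) };
    assert!(!filter.is_allowed(&no_address));
}
```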
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Serialize)]
#[serde(deny_unknown_fields)]
#[derive(derive_builder::Builder)]
#[builder(setter(strip_option), default)]
I figure this shouldn't be default, because there is only one field, so you must set it, otherwise it'll always be invalid.
It's required for the builder derive to allow partial completions; otherwise, if you only set one field, it yells at you about the unset fields even though the overall builder is valid.
I think there are some improvements we can make with it. I tried not to get too deep into it to avoid this PR's continued growth hahaha
There are two types of transaction filtering we will support in the future:
1. Per stream configuration: The downstream declares what txns they want to receive.
2. Global configuration: At the data service level we refuse to include full txns for all streams.
This PR implements the second of these, using @CapCap's work here: aptos-labs/aptos-indexer-processors#398. Rather than not sending txns at all if they match the blocklist filters, we just omit the writesets and events. Not sending the txns entirely would cause issues with processors, which today assume that they will receive all txns.
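A rough sketch of the "omit writesets and events instead of dropping the txn" behavior described above. The Transaction struct, the sender-based blocklist, and both function names are assumptions made for illustration; the real data service types and filter matching are not shown in this thread.

```rust
// Hypothetical, simplified transaction shape for illustration only.
#[derive(Clone, Debug, PartialEq)]
struct Transaction {
    version: u64,
    sender: String,
    events: Vec<String>,
    write_set: Vec<String>,
}

// Hypothetical blocklist check: match on sender address.
fn matches_blocklist(txn: &Transaction, blocked_senders: &[&str]) -> bool {
    blocked_senders.contains(&txn.sender.as_str())
}

// Keep every transaction in the stream, but strip the payload of blocked ones.
fn strip_blocked(txns: Vec<Transaction>, blocked_senders: &[&str]) -> Vec<Transaction> {
    txns.into_iter()
        .map(|mut txn| {
            if matches_blocklist(&txn, blocked_senders) {
                txn.events.clear();
                txn.write_set.clear();
            }
            txn
        })
        .collect()
}

fn main() {
    let txns = vec![
        Transaction {
            version: 1,
            sender: "0xa".to_string(),
            events: vec!["e1".to_string()],
            write_set: vec!["w1".to_string()],
        },
        Transaction {
            version: 2,
            sender: "0xb".to_string(),
            events: vec!["e2".to_string()],
            write_set: vec!["w2".to_string()],
        },
    ];

    let out = strip_blocked(txns, &["0xb"]);

    // No versions are skipped, so downstream processors still see every txn.
    assert_eq!(out.len(), 2);
    assert_eq!(out[0].events.len(), 1);
    assert!(out[1].events.is_empty() && out[1].write_set.is_empty());
}
```

Keeping the version sequence contiguous is the design point: processors that assume they receive all txns keep working, and only the bulky parts of blocked txns are withheld.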
f1d3f28 to 8f62517
8f62517 to 020a761
020a761 to 5fc070b
* [GRPC] Enable data service ZSTD and update crate that uses old tonic (#13621)
* replace println with tracing
* [GRPC] Simple Transaction Filtering
* Improve transaction filter comments, exports, README, fix lz4 in tests
* [Data Service] Implement simple upstream transaction filtering (#13699)

There are two types of transaction filtering we will support in the future:
1. Per stream configuration: The downstream declares what txns they want to receive.
2. Global configuration: At the data service level we refuse to include full txns for all streams.
This PR implements the second of these, using @CapCap's work here: aptos-labs/aptos-indexer-processors#398. Rather than not sending txns at all if they match the blocklist filters, we just omit the writesets and events. Not sending the txns entirely would cause issues with processors, which today assume that they will receive all txns.

---------

Co-authored-by: Max Kaplan <[email protected]>
Co-authored-by: yuunlimm <[email protected]>
Co-authored-by: CapCap <[email protected]>
Co-authored-by: Daniel Porteous <[email protected]>
an MVP version of the transaction filtering crate.
Performance
On a random 3 MB proto of 1000 txns during taptos, running the "everything" filter (looped) gives:
Random 10 MB proto from April 25th with "everything" filter:
Graffio 100 MB proto ("everything" filter, but with
or
):
More advanced, feature-complete version here: #392