feat(uptime): snuba table and configuration for new dataset #6686
Conversation
This PR has a migration; here is the generated SQL:
-- start migrations
-- forward migration uptime_monitor_checks : 0001_uptime_monitor_checks
Local op: CREATE TABLE IF NOT EXISTS uptime_monitor_checks_local (organization_id UInt64, project_id UInt64, environment LowCardinality(Nullable(String)), uptime_subscription_id UInt64, uptime_check_id UUID, scheduled_check_time DateTime, timestamp DateTime, _sort_timestamp DateTime, duration UInt64, region_id Nullable(UInt16), check_status LowCardinality(String), check_status_reason LowCardinality(Nullable(String)), http_status_code UInt16, trace_id UUID, retention_days UInt16) ENGINE ReplicatedReplacingMergeTree('/clickhouse/tables/uptime_monitor_checks/{shard}/default/uptime_monitor_checks_local', '{replica}') PRIMARY KEY (organization_id, project_id, uptime_subscription_id, _sort_timestamp, uptime_check_id, trace_id) ORDER BY (organization_id, project_id, uptime_subscription_id, _sort_timestamp, uptime_check_id, trace_id) PARTITION BY (toMonday(_sort_timestamp)) TTL _sort_timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192;
Distributed op: CREATE TABLE IF NOT EXISTS uptime_monitor_checks_dist (organization_id UInt64, project_id UInt64, environment LowCardinality(Nullable(String)), uptime_subscription_id UInt64, uptime_check_id UUID, scheduled_check_time DateTime, timestamp DateTime, _sort_timestamp DateTime, duration UInt64, region_id Nullable(UInt16), check_status LowCardinality(String), check_status_reason LowCardinality(Nullable(String)), http_status_code UInt16, trace_id UUID, retention_days UInt16) ENGINE Distributed(`cluster_one_sh`, default, uptime_monitor_checks_local, cityHash64(reinterpretAsUInt128(trace_id)));
Local op: ALTER TABLE uptime_monitor_checks_local ADD INDEX IF NOT EXISTS bf_trace_id trace_id TYPE bloom_filter GRANULARITY 1;
-- end forward migration uptime_monitor_checks : 0001_uptime_monitor_checks
-- backward migration uptime_monitor_checks : 0001_uptime_monitor_checks
Distributed op: DROP TABLE IF EXISTS uptime_monitor_checks_dist;
Local op: DROP TABLE IF EXISTS uptime_monitor_checks_local;
-- end backward migration uptime_monitor_checks : 0001_uptime_monitor_checks
) -> anyhow::Result<(Vec<UptimeMonitorCheckRow>, f64)> {
    let monitor_message: UptimeMonitorCheckMessage = serde_json::from_slice(payload)?;

    let rows = vec![UptimeMonitorCheckRow {
relevant upstream code: https://github.com/getsentry/uptime-checker/blob/e120f3ff6856d5f47cabb883efa11fae03c8c7b6/src/types/result.rs#L61
need to tweak this consumer before turning it on.
# do i actually need primary key to be different than sorting key?
primary_key="(organization_id, project_id, uptime_subscription_id, _sort_timestamp, uptime_check_id, trace_id)",
order_by="(organization_id, project_id, uptime_subscription_id, _sort_timestamp, uptime_check_id, trace_id)",
for our query pattern, it probably makes sense to include project_id and subscription_id in the order_by key; uptime_check_id corresponds to event_id.
local_table_name = f"{table_prefix}_local"
dist_table_name = f"{table_prefix}_dist"

# what about all the fancy codecs? do we need those?
should i use all the codecs that the spans table uses? or does it matter?
Start with no codec and we'll see where we need them. By default, everything is ZSTD(1).
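For context on the codec question: a column-level codec, if one ever turned out to be needed, would look roughly like the sketch below. This is a hypothetical illustration only; the `codecs` modifier name is assumed from other Snuba migrations, and with no codec set everything already gets table-level ZSTD(1).

```python
from snuba.clickhouse.columns import Column, DateTime, UInt
from snuba.migrations.columns import MigrationModifiers as Modifiers

# Hypothetical columns only: delta-encode the duration and timestamp before
# the default ZSTD(1) pass, if profiling ever shows the default isn't enough.
codec_columns = [
    Column("duration", UInt(64, Modifiers(codecs=["Delta", "ZSTD(1)"]))),
    Column("timestamp", DateTime(Modifiers(codecs=["DoubleDelta", "ZSTD(1)"]))),
]
```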
should i split this PR up? what would be the right way to do that? could i leave the processor off the config to start?
scripts/load_uptime_checks.py
What's the script for?
i'm using it to load data locally to test row size, compression ratio, and querying the data using the EAP API.
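For readers without the script handy, a minimal sketch of that kind of local loader might look like this. It is an assumption of what scripts/load_uptime_checks.py does, not its actual contents: the ClickHouse URL and row values are made up, and it writes through ClickHouse's HTTP interface so sizes and compression can then be checked in system.parts / system.columns.

```python
import json
import uuid
from datetime import datetime, timedelta

import requests

CLICKHOUSE_URL = "http://localhost:8123"  # assumed local dev ClickHouse
TABLE = "uptime_monitor_checks_local"


def generate_rows(n: int) -> list[dict]:
    now = datetime.utcnow()
    rows = []
    for i in range(n):
        ts = (now - timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M:%S")
        rows.append(
            {
                "organization_id": 1,
                "project_id": 1,
                "environment": "production",
                "uptime_subscription_id": i % 100,
                "uptime_check_id": str(uuid.uuid4()),
                "scheduled_check_time": ts,
                "timestamp": ts,
                "_sort_timestamp": ts,
                "duration": 125,
                "region_id": 1,
                "check_status": "success",
                "check_status_reason": None,
                "http_status_code": 200,
                "trace_id": str(uuid.uuid4()),
                "retention_days": 90,
            }
        )
    return rows


def load(n: int = 10_000) -> None:
    # Insert synthetic rows via the ClickHouse HTTP interface.
    body = "\n".join(json.dumps(r) for r in generate_rows(n))
    resp = requests.post(
        CLICKHOUSE_URL,
        params={"query": f"INSERT INTO {TABLE} FORMAT JSONEachRow"},
        data=body.encode("utf-8"),
    )
    resp.raise_for_status()


if __name__ == "__main__":
    load()
```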
stream_loader:
  processor: UptimeMonitorChecksProcessor
  default_topic: snuba-uptime-monitor-checks
i believe we can actually reuse the uptime-results topic and don't need our own separate one for snuba here.
Column("organization_id", UInt(64)), | ||
Column("project_id", UInt(64)), | ||
Column("environment", String(Modifiers(nullable=True, low_cardinality=True))), | ||
Column("uptime_subscription_id", UInt(64)), |
what about uptime_subscription_region_id or project_uptime_subscription_id?
snuba/snuba_migrations/uptime_monitor_checks/0001_uptime_monitor_checks.py
columns=columns,
engine=table_engines.ReplacingMergeTree(
    # do i actually need primary key to be different than sorting key?
    primary_key="(organization_id, project_id, uptime_subscription_id, _sort_timestamp, uptime_check_id, trace_id)",
do we actually need trace_id here?
The EAP RPC has a way to query by trace_id. If you're not putting a trace_id somewhere, the requests on this endpoint will be slower.
Also, you don't need to set both the sort key and the primary key. What we usually want is for the primary key to be as small as possible (it's kept in RAM), and there's no need to be overly specific since, at some point, there is so little data left to filter compared to what needs to be scanned.
In your case, I would put the timestamp right after project_id because, after that, how many uptime_subscription_id values will you have? If it's something like under 8k, it might fit into one part, and scanning the primary key and then the part is not faster than just scanning the part.
The sort key can have more fields since it dictates how the data is laid out on disk and the behavior of the ReplacingMergeTree, so you can be as specific as possible.
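Translated into the migration's table engine config, the suggestion above would look roughly like the sketch below. It is not the final migration; the StorageSetKey name is an assumption, and only the primary_key / order_by split is the point.

```python
from snuba.clusters.storage_sets import StorageSetKey
from snuba.migrations import table_engines

# Short primary key (kept in RAM) ending at the timestamp; longer sort key
# controlling on-disk layout and ReplacingMergeTree deduplication.
engine = table_engines.ReplacingMergeTree(
    storage_set=StorageSetKey.UPTIME_MONITOR_CHECKS,  # key name assumed
    primary_key="(organization_id, project_id, _sort_timestamp)",
    order_by="(organization_id, project_id, _sort_timestamp, "
    "uptime_subscription_id, uptime_check_id, trace_id)",
)
```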
The EAP RPC has a way to query by trace_id. If you're not putting a trace_id somewhere, the requests on this endpoint will be slower.
i've added a bloom filter index in the migration.
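For reference, the bloom filter index that appears in the generated SQL above is declared in the migration as an index operation, roughly like the sketch below; the parameter names follow similar Snuba migrations and are assumptions, not copied from this PR.

```python
from snuba.clusters.storage_sets import StorageSetKey
from snuba.migrations import operations
from snuba.migrations.operations import OperationTarget

# Mirrors the generated "ALTER TABLE ... ADD INDEX bf_trace_id" statement.
add_trace_id_index = operations.AddIndex(
    storage_set=StorageSetKey.UPTIME_MONITOR_CHECKS,  # key name assumed
    table_name="uptime_monitor_checks_local",
    index_name="bf_trace_id",
    index_expression="trace_id",
    index_type="bloom_filter",
    granularity=1,
    target=OperationTarget.LOCAL,
)
```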
Also, you don't need to set the sort key and the primary key. What we usually want is the primary key to be as small as possible (it's in RAM) and there's no need to be overly specific since, at some point, there will be so little data compared to what needs to be scanned.
thank you. modified.
snuba/snuba_migrations/uptime_monitor_checks/0001_uptime_monitor_checks.py
snuba/utils/streams/topics.py
@@ -45,6 +45,7 @@ class Topic(Enum):
    PROFILE_CHUNKS = "snuba-profile-chunks"

    REPLAYEVENTS = "ingest-replay-events"
    UPTIME_MONITOR_CHECKS = "snuba-uptime-monitor-checks"
change this to uptime-results
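If the checker's existing output topic is reused, the enum entry would look roughly like this; the topic name comes from the review comment above and is an assumption, not something confirmed in this PR.

```python
from enum import Enum


class Topic(Enum):
    # Reuse the uptime-checker's existing output topic rather than adding a
    # Snuba-specific snuba-uptime-monitor-checks topic (name assumed).
    UPTIME_RESULTS = "uptime-results"
```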
need to actually figure out if we need to enrich stuff from postgres, or we modify uptime checker
You should put the storage config in the EAP and you won't need an entity since you're going to use the EAP RPC.
You need to split the PRs as Snuba won't let you add a migration with other code. Put the migration in another PR and keep the configuration and the processors here.
migration moved to #6690
done.
@@ -178,6 +181,11 @@ def __init__(
        storage_sets_keys={StorageSetKey.PROFILE_CHUNKS},
        readiness_state=ReadinessState.PARTIAL,
    ),
    MigrationGroup.UPTIME_MONITOR_CHECKS: _MigrationGroup(
TODO: determine if we should change the StorageSetKey here
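Mirroring the PROFILE_CHUNKS entry in the diff above, the registration for the new group would look roughly like this; the StorageSetKey name and readiness state are placeholders, per the TODO.

```python
    MigrationGroup.UPTIME_MONITOR_CHECKS: _MigrationGroup(
        # StorageSetKey name and readiness state are assumptions; the TODO
        # above notes the storage set key may still change.
        storage_sets_keys={StorageSetKey.UPTIME_MONITOR_CHECKS},
        readiness_state=ReadinessState.PARTIAL,
    ),
```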
are there any further tests i need to add? do i need to add a test in python similar to
❌ 1 test failed.
@@ -2,4 +2,4 @@

def test_dlq() -> None:
    assert len(get_dlq_topics()) == 9
Feel free to delete this test, IMO that's just annoying to deal with as you add new datasets.
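If the test is kept at all, a less brittle variant could assert non-emptiness (or membership of a specific topic) instead of an exact count, so adding a new dataset doesn't break it. This is just a sketch reusing the same get_dlq_topics() helper the existing test already calls.

```python
def test_dlq() -> None:
    # Assert non-emptiness rather than an exact count so that adding a new
    # dataset (and its DLQ topic) does not require touching this test.
    assert len(get_dlq_topics()) > 0
```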
closing as this wasn't in the right order
Context: https://github.com/getsentry/team-uptime/issues/23
migration PR: #6690