MariaDB data offloading #598
How would purging work? Also, can we set a different TTL per user type?
A cron job that checks the disk usage and drops part of the oldest data if it's above a threshold seems most reliable. Theoretically, we wouldn't even need to define a storage duration; it would just store as much data as possible. If we wanted a different TTL per user type, we'd combine this with predetermined expiration (again enforced by a scheduled job), and the disk-check job would become just a fallback. As a rough estimate, 1 month of data at the current usage level is about 2 TB (MariaDB's compression might reduce that somewhat).
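A minimal sketch of what that fallback job could look like, assuming a hypothetical `measurement_results` table with an indexed `created_at` column and the `mariadb` Node.js client; the 2 TB limit, batch size, and all names are placeholders, and a per-user-type TTL would be a similar scheduled `DELETE` filtered by user type:

```ts
// Disk-usage fallback purge, run on a schedule (e.g. every few hours).
// Assumed (not from the issue): table `measurement_results`, indexed column
// `created_at`, and a soft limit of ~2 TB for the whole schema.
import mariadb from 'mariadb';

const SIZE_LIMIT_BYTES = 2 * 1024 ** 4; // ~2 TB soft limit (placeholder)
const BATCH_SIZE = 10_000;              // small batches keep each DELETE short

const pool = mariadb.createPool({
	host: 'localhost',
	user: 'measurements',
	password: process.env.DB_PASSWORD,
	database: 'measurements',
	connectionLimit: 2,
});

// Rough size estimate from table statistics; good enough for a threshold check.
const currentSizeBytes = async (): Promise<number> => {
	const [ row ] = await pool.query(`
		SELECT COALESCE(SUM(data_length + index_length), 0) AS bytes
		FROM information_schema.tables
		WHERE table_schema = 'measurements'
	`);

	return Number(row.bytes);
};

// Delete the oldest rows in batches until we are back under the threshold.
export const purgeIfNeeded = async (): Promise<void> => {
	while (await currentSizeBytes() > SIZE_LIMIT_BYTES) {
		const result = await pool.query(
			'DELETE FROM measurement_results ORDER BY created_at ASC LIMIT ?',
			[ BATCH_SIZE ],
		);

		if (result.affectedRows === 0) {
			break; // nothing left to delete
		}
	}
};
```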
Feels icky doing it with cron. Wouldn't it also lock the tables and create issues when it runs?
Partitioning by days or weeks solves the delete problem, as dropping partitions is fast (unlike row-by-row DELETE statements).
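For reference, a sketch of how the partition rotation could be scheduled, assuming the same hypothetical `measurement_results` table partitioned by day with `PARTITION BY RANGE (TO_DAYS(created_at))` and a composite primary key `(id, created_at)` (the partitioning column has to be part of every unique key); error handling for already-existing or missing partitions is omitted:

```ts
import mariadb from 'mariadb';

const pool = mariadb.createPool({
	host: 'localhost',
	user: 'measurements',
	password: process.env.DB_PASSWORD,
	database: 'measurements',
});

const DAY_MS = 24 * 60 * 60 * 1000;
const isoDate = (date: Date): string => date.toISOString().slice(0, 10);
const partitionName = (date: Date): string => `p${isoDate(date).replace(/-/g, '')}`; // e.g. p20240102

// Run once a day: append tomorrow's partition, drop the one past the TTL.
export const rotatePartitions = async (ttlDays: number): Promise<void> => {
	const now = Date.now();
	const tomorrow = new Date(now + DAY_MS);
	const dayAfter = new Date(now + 2 * DAY_MS);
	const expired = new Date(now - ttlDays * DAY_MS);

	// RANGE partitions can only be appended at the end, so this runs ahead of time.
	await pool.query(`ALTER TABLE measurement_results ADD PARTITION (
		PARTITION ${partitionName(tomorrow)} VALUES LESS THAN (TO_DAYS('${isoDate(dayAfter)}'))
	)`);

	// Dropping a partition is a metadata operation and discards its rows
	// without the long table locks a bulk DELETE would take.
	await pool.query(`ALTER TABLE measurement_results DROP PARTITION ${partitionName(expired)}`);
};
```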
As a follow-up to #269 (comment), we may want to offload data older than 1 hour from Redis to another storage so that we can keep the measurement results stored for longer. S3 and similar alternatives would be very expensive because of request-based pricing, so MariaDB seems like the best option. All queries would be single-row lookups by primary key, so a server with 16–32 GB of RAM and 2 TB+ of fast storage should be sufficient.
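To illustrate the single-row lookup pattern, a minimal sketch of the read path under this setup, assuming hypothetical key and table/column names (`gp:measurement:<id>`, `measurement_results`) and the `ioredis` and `mariadb` clients; fresh results are read from Redis, everything older falls back to a primary-key lookup in MariaDB:

```ts
import Redis from 'ioredis';
import mariadb from 'mariadb';

const redis = new Redis();
const pool = mariadb.createPool({
	host: 'localhost',
	user: 'measurements',
	password: process.env.DB_PASSWORD,
	database: 'measurements',
});

// Assumed schema: `id` is the primary key, `result` stores the JSON document as text.
export const getMeasurement = async (id: string): Promise<object | null> => {
	// Hot path: results from the last hour are still in Redis.
	const cached = await redis.get(`gp:measurement:${id}`);

	if (cached) {
		return JSON.parse(cached);
	}

	// Cold path: a single-row lookup by primary key, which is what keeps the
	// MariaDB server requirements modest even with 2 TB+ of data on disk.
	const rows = await pool.query(
		'SELECT result FROM measurement_results WHERE id = ? LIMIT 1',
		[ id ],
	);

	return rows.length ? JSON.parse(rows[0].result) : null;
};
```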