[GSoC Proposal Draft] - Digvijay Rawat - SQL Adapter for Background Jobs #240
Hi @Digvijay-x1
Overall, you have a very strong grasp of the core problem. I appreciate that you explicitly noted this project is about adding a data durability layer for stateless environments, rather than trying to build a distributed queue. Your approach to handling graceful shutdowns with SIGTERM and mitigating thundering herds on boot using
There are a few edge cases I'd love for you to clarify or rethink for your next revision:
1. Database Connections & Fiber Concurrency
There is a bit of a contradiction in the proposal regarding Active Record. In the Solution section, you mention "no new dependencies beyond activerecord", but your code snippets use the raw pg gem, and your timeline mentions building an Active Record adapter later. We need to align on exactly which approach you are proposing as the primary deliverable.
2. Dead Letter Queue
Your current DB cleanup strategy relies on hard deletes upon completion, which is great for keeping the table lean. However, the proposal is currently missing a strategy for poison pills. For example, in your
3. SQLite Support
Your proposal relies on
Looking forward to your updated version! Let me know if you have any questions about this feedback.
Hi @Digvijay-x1 I love this! Added a couple of suggestions, but this is already very strong! Make sure to submit your proposal before March 31!
I can see the appeal here - we could use the SQL backend to remove one of the trade-offs
This is also very different from how
I think you do need a separate table. The limit for
Great thinking, but this will be problematic in real-world deployments. This approach essentially hides some of the tables from the schema file. Additionally, it assumes the DB credentials given to the app process have the DDL permissions, which is not necessarily the case. If an app process only has DML permissions, this line will crash the app on boot. Instead, there will need to be a command to generate the migrations for the SQL backend.
I have the same suggestion here as with the polling loop - think through the abstractions and try to avoid the two-way dependency where the backend schedules the tasks in the queue.
SQL Adapter for Background Jobs
mentored by @rsamoilov & @cuneyter
Introduction
Rage is a Ruby web framework built for speed, using fibers and the Iodine HTTP server. Its built-in background job system, Rage::Deferred, is intentionally lightweight and works well for single-node deployments.
The problem is durability in cloud-native environments. Today, deferred jobs are persisted to local disk. In Kubernetes and similar setups, local storage is ephemeral and coordination through file locks is node-local. This creates a real risk of lost jobs during pod restarts.
This proposal introduces a SQL backend for Rage::Deferred that keeps Rage's execution model intact while adding shared durability and multi-pod coordination.
The backend will use Active Record as the database abstraction layer, so Rage does not need separate backend implementations per SQL engine.
Problem Understanding
What Rage does today (from the codebase)
Current Rage::Deferred behavior has a clean separation between queue logic and backend storage: the backend exposes a small contract (add, remove, pending_tasks).
So the existing dependency direction is one-way: queue -> backend.
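To make the contract concrete, here is a minimal sketch of a backend that satisfies the three methods named above. It is not Rage's actual disk backend: the class name MemoryBackend, the in-memory hash, and the id generation are assumptions made purely for illustration. Only the method names and the [task_id, context, publish_at] tuple shape come from this proposal.

```ruby
# Hypothetical in-memory backend illustrating the three-method contract.
# Method names follow the proposal; the storage layout is an assumption.
class MemoryBackend
  def initialize
    @tasks = {}
    @next_id = 0
  end

  # Persist a task and return its id.
  def add(task, publish_at: nil, task_id: nil)
    task_id ||= (@next_id += 1).to_s
    @tasks[task_id] = [task, publish_at]
    task_id
  end

  # Forget a task once it has completed.
  def remove(task_id)
    @tasks.delete(task_id)
    nil
  end

  # Return [task_id, context, publish_at] tuples for the queue to schedule.
  def pending_tasks
    @tasks.map { |id, (context, publish_at)| [id, context, publish_at] }
  end
end
```

The backend never calls back into the queue: the queue asks for pending_tasks and decides what to schedule, which is the one-way dependency described above.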
Why this is a problem in Kubernetes
The disk backend relies on per-node file locking (flock) and local files. flock does not coordinate across nodes/pods.
That means Rage::Deferred jobs can be silently lost in normal cloud operations (rollouts, evictions, OOM kills).
Design Goals
Technical Approach
1. Keep one-way architecture (Queue -> Backend)
A key update based on mentor feedback:
2. Backend responsibilities
Rage::Deferred::Backends::Sql will handle the backend contract operations (add, remove, pending_tasks).
Queue/deferred lifecycle code will handle:
3. Schema design
I am proposing three tables:
Notes:
owner_id IS NULL means unclaimed.
4. Enqueue and execution flow
Immediate task:
- add with owner_id = current_worker_id.
- schedule in-process (Iodine.run_after(nil)).
- remove on completion.
Delayed task:
- add with owner_id = NULL and publish_at.
This keeps SQL as a durability/coordination layer, not an execution engine.
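The difference between the two paths comes down to who owns the row at insert time. The sketch below shows that decision in isolation. The helper name task_row and the id and context column names are assumptions for illustration. owner_id and publish_at are the columns named in this proposal.

```ruby
# Hypothetical helper: builds the row a SQL backend might insert for a task.
# owner_id and publish_at come from the proposal; id and context are assumed names.
def task_row(task_id, context, worker_id:, publish_at: nil)
  delayed = !publish_at.nil?
  {
    id: task_id,
    context: context,
    publish_at: publish_at,
    # Immediate tasks are pre-claimed by the enqueuing worker;
    # delayed tasks stay unclaimed (NULL) until a worker claims them.
    owner_id: delayed ? nil : worker_id
  }
end
```

Pre-claiming immediate tasks means the enqueuing worker can run them in-process without a separate claim round trip, while delayed tasks remain visible to every worker.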
5. Active Record claiming flow
Task claiming will be implemented through Active Record transactions and locking APIs, keeping the backend implementation framework-level instead of adapter-specific.
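As a rough illustration of what a claim could look like at the database level, the sketch below builds a PostgreSQL statement that claims due, unclaimed rows in one step. It is written as plain SQL inside Ruby to stay dependency-free and is not the final Active Record implementation. The helper name claim_sql and the id and context columns are assumptions, and the bind parameters $1 (worker id) and $2 (current time) are placeholders. With Active Record, the inner SELECT would roughly correspond to a relation that uses lock("FOR UPDATE SKIP LOCKED").

```ruby
# Hypothetical helper: returns a PostgreSQL statement that claims up to
# `limit` due, unclaimed tasks for the worker bound to $1.
# $2 is the current time. Column names other than owner_id/publish_at are assumed.
def claim_sql(limit)
  <<~SQL
    UPDATE rage_deferred_tasks
    SET owner_id = $1
    WHERE id IN (
      SELECT id
      FROM rage_deferred_tasks
      WHERE owner_id IS NULL
        AND publish_at <= $2
      ORDER BY publish_at
      LIMIT #{Integer(limit)}
      FOR UPDATE SKIP LOCKED
    )
    RETURNING id, context, publish_at
  SQL
end
```

Rows locked by another worker's in-flight claim are skipped rather than waited on, so two workers running this at the same time should not receive the same task.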
FOR UPDATE SKIP LOCKED is passed through Active Record locking APIs to avoid double-claiming across workers.
5.1 Contract mapping for add/remove/pending_tasks
To stay aligned with the existing backend contract, SQL behavior maps directly to these methods:
- add(task, publish_at:, task_id:) -> add: row in rage_deferred_tasks; owner_id is the current worker for immediate tasks, NULL for delayed tasks.
- remove(task_id) -> remove
- pending_tasks -> pending_tasks: returns [task_id, context, publish_at] tuples for queue scheduling.
6. Crash recovery and lifecycle hooks
Iodine already exposes worker lifecycle events (:on_start, :on_finish).
Plan:
On worker start (on_start):
On worker finish (on_finish):
- owned tasks are released (owner_id = NULL),
Hard crash path:
7. Dead-letter handling
For deserialization/infrastructure failures:
- the task is recorded in rage_deferred_dead_tasks with failure metadata.
- it is removed from rage_deferred_tasks.
This creates a clear operator workflow: inspect, requeue manually if needed, or purge by retention policy.
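The move itself is small enough to model with two hashes standing in for the two tables. This is only a sketch: the helper name move_to_dead and the metadata fields (error_class, error_message, failed_at) are assumptions, since the proposal does not yet fix the dead-letter columns. In a real backend the insert and the delete would be expected to run in a single transaction.

```ruby
# Hypothetical model of dead-lettering. `tasks` and `dead_tasks` are hashes
# standing in for rage_deferred_tasks and rage_deferred_dead_tasks.
# The metadata field names are assumptions, not part of the proposal.
def move_to_dead(tasks, dead_tasks, task_id, error, now: Time.now.utc)
  context = tasks.delete(task_id)
  return false if context.nil? # nothing to move

  dead_tasks[task_id] = {
    context: context,
    error_class: error.class.name,
    error_message: error.message,
    failed_at: now
  }
  true
end
```

Keeping the original context alongside the failure metadata is what makes the manual requeue step possible later.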
8. Migration strategy (instead of runtime DDL)
Another important update from feedback: no create_tables call during runtime boot.
Rage already has migration tooling and task loading, so the schema will be delivered through migrations:
- rage g migration create_rage_deferred_sql_tables
- rage db:migrate
This avoids assuming production app DB users have DDL permission.
Configuration API
Rage::Configuration::Deferred currently supports :disk and nil only.
The project will extend that API with :sql while preserving existing behavior.
The backend will use ActiveRecord::Base.connection_pool.with_connection for DB operations.
Testing Plan
Unit tests
Integration tests
SKIP LOCKED,
E2E validation
A test Rage app on Kind with 2+ pods and PostgreSQL to validate:
Milestones & Timeline
- add/remove/pending_tasks/claim + heartbeat/worker registry primitives.
- :sql backend option, option parsing, migration generator/docs.
Deliverables
- Rage::Deferred::Backends::Sql implemented with Active Record.
- :sql support in deferred backend configuration.
Validation Criteria
The feature is considered complete when all of the following pass:
About Me
I'm an undergraduate student who started learning Ruby while preparing for my college technical society, where we maintain campus systems like ERP, event sites, and internal admin tools. Since our ERP stack uses Ruby on Rails, Ruby became my natural starting point.
In my second year, I began contributing to Ruby open source projects and found myself increasingly interested in infrastructure-level work. This Rage SQL adapter project fits that interest perfectly: it solves a practical reliability gap for production deployments while staying close to framework internals.
I want to work on this because it combines distributed coordination, failure recovery, and API design in a way that is both challenging and directly useful to Rage users.
1. How much time would you be able to devote to the project?
I have summer break from May to July (21 May to 19 July), during which I can dedicate about 40 hours/week for 8 weeks (around 320 hours).
From 27 July to 15 September (about 7 weeks), my semester will be active, and I can dedicate around 20 hours/week (around 140 hours).
Overall, this is enough time for the project, and I can adjust hours if needed near deadlines.
2. What other obligations might you need to work around during the summer?
3. How often, and through which channel(s), do you plan on communicating with your mentor?
I plan to share daily progress updates and keep communication regular.
If selected for Rage in GSoC, I will focus on producing high-quality contributions, staying active in mentor/community communication, and continuing long-term contributions even after the program.
Thanks and regards,
Digvijay Rawat (Digvijay-x1)