
Searching for job ID fails on Heroku due to 30s request limit #1618

Open
francois opened this issue Mar 12, 2025 · 2 comments · May be fixed by #1619

Comments

@francois
Contributor

We are on 4.9.3 and have run good_job:update and db:migrate.

On the dashboard, searching for a job ID fails on Heroku because the query against our 120k-row good_jobs table takes far longer than 30 seconds to complete. Of course, it is possible to open any job, then hack the URL and replace the UUID with the job ID we have on hand. This works, and takes less than 30s, but it is error-prone.

The actual query that runs is:

SELECT "good_jobs"."id", "good_jobs"."queue_name", "good_jobs"."priority",
       "good_jobs"."serialized_params", "good_jobs"."scheduled_at",
       "good_jobs"."performed_at", "good_jobs"."finished_at", "good_jobs"."error",
       "good_jobs"."created_at", "good_jobs"."updated_at", "good_jobs"."active_job_id",
       "good_jobs"."concurrency_key", "good_jobs"."cron_key", "good_jobs"."cron_at",
       "good_jobs"."batch_id", "good_jobs"."batch_callback_id",
       "good_jobs"."executions_count", "good_jobs"."job_class",
       "good_jobs"."error_event", "good_jobs"."labels",
       "good_jobs"."locked_by_id", "good_jobs"."locked_at"
FROM "good_jobs"
WHERE ((to_tsvector('english', id::text)
     || to_tsvector('english', COALESCE(active_job_id::text, ''))
     || to_tsvector('english', serialized_params)
     || to_tsvector('english', COALESCE(serialized_params->>'arguments', ''))
     || to_tsvector('english', COALESCE(error, ''))
     || to_tsvector('english', COALESCE(array_to_string(labels, ' '), '')))
    @@ websearch_to_tsquery('38979b0b-af58-4bf4-969e-017b10fa3bb7'))
ORDER BY ts_rank((to_tsvector('english', id::text)
     || to_tsvector('english', COALESCE(active_job_id::text, ''))
     || to_tsvector('english', serialized_params)
     || to_tsvector('english', COALESCE(serialized_params->>'arguments', ''))
     || to_tsvector('english', COALESCE(error, ''))
     || to_tsvector('english', COALESCE(array_to_string(labels, ' '), ''))),
    websearch_to_tsquery('38979b0b-af58-4bf4-969e-017b10fa3bb7')) DESC,
  created_at DESC, id DESC
LIMIT 25

In our database, this means scanning 125,000 rows, calling to_tsvector() on six expressions per row, and converting JSONB, UUID and TEXT columns back to TEXT. That's almost a million to_tsvector calls!

There are two ways forward:

  1. Add a full-text search index
  2. Teach JobsFilter to recognize a "common" case (a bare UUID) and issue a more performant query for it

I'm worried that adding an index may negatively impact performance. We can't forget that GoodJob's tables live inside the application's own database, and if GoodJob consumes more CPU/RAM/disk than the application's data, then GoodJob would be considered a bad citizen.
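The UUID fast path (option 2) could look something like this. This is only a sketch: `UUID_PATTERN` and `uuid_search_term?` are names I made up, not GoodJob's actual JobsFilter internals.

```ruby
# Sketch of the "recognize a UUID" fast path (option 2). A search term that
# is a bare UUID can skip full-text search entirely and instead be matched
# directly against the id / active_job_id columns, which are indexed.

# \h matches a hex digit; the anchors ensure the *whole* term is a UUID,
# e.g. "38979b0b-af58-4bf4-969e-017b10fa3bb7".
UUID_PATTERN = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

def uuid_search_term?(term)
  !!term&.strip&.match?(UUID_PATTERN)
end
```

A filter could then branch on it, e.g. `uuid_search_term?(q) ? jobs.where(id: q.strip).or(jobs.where(active_job_id: q.strip)) : jobs.search_text(q)` (relation and method names hypothetical).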

francois added a commit to francois/good_job that referenced this issue Mar 12, 2025
Because there are no indices for full-text search, searching for a job
ID takes an inordinate amount of time. This commit examines the search
query and if the only item is a UUID, it specifically searches for a job
matching that ID.

Fixes bensheldon#1618
@francois
Contributor Author

On our production cluster, we preserve job records for only 4 days. That is enough to cover Saturday, Sunday and Monday when we come back on a Tuesday after a 3-day weekend.

Search is a really interesting feature, but unfortunately it isn't quite usable for us at the moment.

@francois
Contributor Author

francois commented Mar 12, 2025

Ick, array_to_string is only STABLE, not IMMUTABLE. A full-text search index cannot be built on the existing expression; the query would have to change to support one.

# create index index_good_jobs_full_text on good_jobs using gin (
    ((to_tsvector('english', id::text)
   || to_tsvector('english', COALESCE(active_job_id::text, ''))
   || to_tsvector('english', serialized_params)
   || to_tsvector('english', COALESCE(serialized_params->>'arguments', ''))
   || to_tsvector('english', COALESCE(error, ''))
   || to_tsvector('english', COALESCE(array_to_string(labels, ' '), ''))))
  );
ERROR:  functions in index expression must be marked IMMUTABLE

# \df+ array_to_string
                                                                                                                    List of functions
   Schema   |      Name       | Result data type | Argument data types  | Type | Volatility | Parallel |  Owner   | Security | Access privileges | Language |   Internal name    |                              Description
------------+-----------------+------------------+----------------------+------+------------+----------+----------+----------+-------------------+----------+--------------------+------------------------------------------------------------------------
 pg_catalog | array_to_string | text             | anyarray, text       | func | stable     | safe     | postgres | invoker  |                   | internal | array_to_text      | concatenate array elements, using delimiter, into text
 pg_catalog | array_to_string | text             | anyarray, text, text | func | stable     | safe     | postgres | invoker  |                   | internal | array_to_text_null | concatenate array elements, using delimiter and null string, into text
(2 rows)
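For what it's worth, the usual PostgreSQL workaround (not something GoodJob ships today) is to wrap array_to_string in a SQL function explicitly declared IMMUTABLE, which is safe for text[] because the output depends only on the input values, and then build the index on the wrapper. The function name below is mine, and the search query itself would have to call the same wrapper for the planner to use the index:

```sql
-- Hypothetical workaround sketch: an IMMUTABLE wrapper around the
-- STABLE array_to_string, so it can appear in a GIN index expression.
CREATE FUNCTION immutable_array_to_string(text[], text)
RETURNS text
LANGUAGE sql IMMUTABLE PARALLEL SAFE
AS $$ SELECT array_to_string($1, $2) $$;

-- The index expression would then use
--   COALESCE(immutable_array_to_string(labels, ' '), '')
-- in place of the array_to_string call that triggered the ERROR above.
```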

I was curious to see how large the index would be. Original database size:

# select pg_size_pretty(pg_total_relation_size('good_jobs'));
 pg_size_pretty
----------------
 109 MB

After adding the index, but without labels:

# select pg_size_pretty(pg_total_relation_size('good_jobs'));
 pg_size_pretty
----------------
 159 MB
# select count(*) from good_jobs;
 count
--------
 133982

We don't use labels yet.
