Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for specific time scheduled crawls skipping if too many crawlers #597

Merged
merged 7 commits into from
Jan 10, 2025

Conversation

mattnowzari
Copy link
Contributor

@mattnowzari mattnowzari commented Jan 7, 2025

A fix for specific time scheduled crawls being skipped if too many crawlers exist.

See 'Related Pull Requests' below for the work to introduce a configurable poll_interval value in ent-search.

Great explanation by @navarone-feekery here that includes an RCA, how to reproduce, and three solutions to tackle this problem, two of which are addressed in this PR.

  1. The Time.now here should probably be set at the beginning of the iteration over connector settings
  2. We should add debug logging to connectors-ruby here for when a schedule is skipped because it was deemed not available to sync. Having this debug log would have saved me a few hours in SDH.

Checklists

Pre-Review Checklist

  • Covered the changes with automated tests
  • Tested the changes locally
  • Added a label for each target release version (example: v7.13.2, v7.14.0, v8.0.0)
  • Considered corresponding documentation changes
  • Contributed any configuration settings changes to the configuration reference

Changes Requiring Extra Attention

  • Security-related changes (encryption, TLS, SSRF, etc)
  • New external service dependencies added.

Related Pull Requests

https://github.com/elastic/ent-search/pull/8256

Release Note

For Elastic Internal Use Only

  • Considered corresponding documentation changes to contribute separately
  • New configuration settings added in this PR follow the official guidelines
  • Built gems (both connectors_utility and connectors_service) and included into Enterprise Search and tested that Enterprise Search works well with new gem versions. Instruction can be found here

@mattnowzari mattnowzari marked this pull request as ready for review January 9, 2025 13:41
Copy link
Contributor

@navarone-feekery navarone-feekery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just left a few possible test cases that could be covered.

@seanstory
Copy link
Member

Nit: your PR title is Closes <issue> but there's going to need to be a follow-up ent-search PR to expose this setting in enterprise-search.yml, right? So you probably don't want the issue to close as soon as this PR merges. Instead, you may want to use Related to <issue> or Part of <issue> instead

…s skipped + added an expect() to an existing test to check if debug logs are presented there as well + revised debug log wording to include the poll_interval value
Copy link
Contributor

@navarone-feekery navarone-feekery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing the tests! I just have some ruby-specific comments, but the implementation looks really good :D

…he correct messages + moved those expect() lines out of the before(:each) blocks and into it blocks
Copy link
Contributor

@navarone-feekery navarone-feekery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@mattnowzari mattnowzari merged commit 8eb8850 into main Jan 10, 2025
2 checks passed
@mattnowzari mattnowzari deleted the mnowzari_crawler_schedule branch January 10, 2025 15:42
Copy link

💔 Failed to create backport PR(s)

Status Branch Result
8.17 The branch "8.17" is invalid or doesn't exist

To backport manually run:
backport --pr 597 --autoMerge --autoMergeMethod squash

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants