functionaltests: restart Integrations Server after ingestion #15356
This removes the previous implementation with document count checks and the logic that was trying to ensure ingestion was completed. The method now relies only on restarting the APM Server.
This pull request does not have a backport label. Could you fix it @endorama? 🙏
functionaltests/main_test.go
Outdated
case "production":
	return "https://api.elastic-cloud.com"
Won't this panic?
- 'pro'
yes. I updated it in e528ec5 and also moved to using constants to reduce the chance of this occurring. In parallel I'm discussing how to make this clearer in #14935 (comment)
Use constants to reduce confusion and chance of error when dealing with evaluating the test target environment.
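A minimal sketch of the idea behind this commit, not the code from e528ec5: the target names are declared once as constants, and an unrecognised target yields an error instead of a panic. The identifiers `targetQA`, `targetProd`, `endpointFor` and the QA URL are hypothetical; only the production URL and the `'pro'` value appear in the review thread above.

```go
package main

import "fmt"

// Hypothetical constant names; the PR does not show the real ones.
// "pro" is assumed to be the value the workflow passes in.
const (
	targetQA   = "qa"
	targetProd = "pro"
)

const (
	// Placeholder URL, not a real endpoint.
	qaEndpoint   = "https://qa.example.invalid"
	prodEndpoint = "https://api.elastic-cloud.com"
)

// endpointFor maps a test target to its API endpoint.
// Unknown targets return an error rather than panicking.
func endpointFor(target string) (string, error) {
	switch target {
	case targetQA:
		return qaEndpoint, nil
	case targetProd:
		return prodEndpoint, nil
	default:
		return "", fmt.Errorf("unknown target %q", target)
	}
}

func main() {
	// "production" is the mismatched spelling from the outdated diff.
	for _, t := range []string{targetProd, "production"} {
		ep, err := endpointFor(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(t, "->", ep)
	}
}
```

With a single definition of each target, a mismatch such as `"production"` versus `'pro'` surfaces as an explicit error at the call site.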
* expose deployment id
* add ecclient package
* restart apm-server after ingest
* refactor RunBlockingWait to restart APM Server

This removes the previous implementation with document count checks and the logic that was trying to ensure ingestion was completed. The method now relies only on restarting the APM Server.

(cherry picked from commit c230240)

# Conflicts:
#	functionaltests/go.sum
Motivation/summary
This is an alternative to #15317.
Instead of expanding the logic that waits for ingestion to complete by monitoring data streams and document counts, rely on a side effect of shutting down the APM Server. On shutdown, APM Server flushes all its data, including the aggregations for data already ingested.
We use this behaviour by triggering a restart of the Integrations Server once the test data has been ingested, which ensures ingestion is complete before we check document counts and data stream details.
I ran 5 tests, and in most cases the restart took about 1 minute. In one case it took 6 minutes, and in one case it took more than 10 minutes (the code times out at 10 minutes).
In one case the test failed due to a race condition when checking that the lazy rollover had happened: it had not happened at the time of the check, but I confirmed manually that it did happen afterwards. This suggests that lazy rollover may happen later than expected (at ingestion after an upgrade). That behaviour is not affected by this change, but further investigation is required to prevent flakiness.
Checklist
How to test these changes
Functional tests pipeline for this branch should be green: https://github.com/elastic/apm-server/actions/workflows/functional-tests.yml?query=branch%3Afunctionaltests-restartapmserver
Related issues
#14100