functionaltests: restart Integrations Server after ingestion #15356

Merged
endorama merged 10 commits into main from functionaltests-restartapmserver on Jan 24, 2025

Member

@endorama endorama commented Jan 23, 2025

Motivation/summary

This is an alternative to #15317.

Instead of expanding the logic that waits for ingestion to complete by monitoring data streams and document counts, rely on a side effect of shutting down APM Server: on shutdown, APM Server flushes all its data, including the aggregations for data already ingested.

We use this behaviour by triggering a restart of the Integrations Server once the test data has been ingested. This ensures ingestion is complete before we move on to checking document counts and data stream details.

I ran 5 tests, and in most cases the restart took ~1 minute. In 1 case it took 6 minutes, and in 1 case it took more than 10 minutes (but the code has a timeout at 10 minutes).

In 1 case the test failed due to a race condition when checking that the lazy rollover had happened: it had not happened at the time of the check, but I confirmed manually that it did happen afterwards. This suggests lazy rollover may happen later than expected (at ingestion after an upgrade). It is not affected by this change, but further investigation is required to prevent flakiness.

Checklist

How to test these changes

Functional tests pipeline for this branch should be green: https://github.com/elastic/apm-server/actions/workflows/functional-tests.yml?query=branch%3Afunctionaltests-restartapmserver

Related issues

#14100

This removes the previous implementation with document
count checks and the logic that was trying to ensure
ingestion was completed.
The method now relies only on restarting the APM Server.
@endorama endorama requested a review from a team as a code owner January 23, 2025 22:29
Contributor

mergify bot commented Jan 23, 2025

This pull request does not have a backport label. Could you fix it @endorama? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.17 is the label to automatically backport to the 7.17 branch.
  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • backport-8.x is the label to automatically backport to the 8.x branch.

Contributor

mergify bot commented Jan 23, 2025

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Jan 23, 2025
Comment on lines 124 to 125
case "production":
	return "https://api.elastic-cloud.com"
Contributor

Won't this panic?

Member Author

Yes. I updated it in e528ec5 and also moved to using constants to reduce the chance of this occurring. In parallel, I'm discussing how to make this clearer in #14935 (comment)

Use constants to reduce confusion and chance of error when
dealing with evaluating the test target environment.
@endorama endorama enabled auto-merge (squash) January 24, 2025 14:12
@endorama endorama merged commit c230240 into main Jan 24, 2025
14 checks passed
@endorama endorama deleted the functionaltests-restartapmserver branch January 24, 2025 14:21
mergify bot pushed a commit that referenced this pull request Jan 24, 2025
* expose deployment id

* add ecclient package

* restart apm-server after ingest

* refactor RunBlockingWait to restart APM Server

This removes the previous implementation with document
count checks and the logic that was trying to ensure
ingestion was completed.
The method now relies only on restarting the APM Server.

(cherry picked from commit c230240)

# Conflicts:
#	functionaltests/go.sum
Labels
backport-8.x Automated backport to the 8.x branch with mergify