CheckIn bulker 10 second window is a problem #5793

At the moment the flush interval for `checkin.NewBulk` is `10 * time.Second`. As I understand it, that means that after a check-in the Elastic Agent document may not be updated for up to 10 seconds. I believe this could make it hard to ensure that the `.fleet-agents` document for that Elastic Agent is actually correct by the next check-in.

The following scenario could occur, especially when there are multiple Fleet Servers. Let's assume there are two: Fleet Server 1 and Fleet Server 2.

- Elastic Agent checks in on Fleet Server 1.
  - It has a status that needs to be written and updated, but it also has actions that it needs to handle.
- Fleet Server 1 sends a response immediately, because there is an action to deliver.
  - At this point the check-in bulk write has not occurred (it only runs every 10 seconds, and when that interval fires depends on when Fleet Server started).
- Elastic Agent checks in on Fleet Server 2 (round-robin).
  - It has a different status that needs to be written and no actions to handle, so it stays connected.
  - Now another check-in bulk write needs to occur, but that happens on Fleet Server 2's own 10 second interval (which is not in sync with Fleet Server 1's).

This means it is possible for Fleet Server 2 to flush its bulk write before Fleet Server 1, and for Fleet Server 1 to flush its own afterwards. The first check-in then overwrites the second check-in, when it shouldn't, because the second check-in is newer and should take precedence over the first one.
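To make the ordering problem concrete, here is a minimal sketch of the scenario above. It is not Fleet Server code: the `store`, `server`, `checkIn`, and `flush` names are hypothetical stand-ins, the "index" is just an in-memory map, and I'm assuming the write is last-write-wins with no version or timestamp check.

```go
package main

import (
	"fmt"
	"time"
)

// pendingCheckin is a hypothetical stand-in for a buffered check-in update.
type pendingCheckin struct {
	status     string
	receivedAt time.Time // when the agent actually checked in
}

// store is a hypothetical in-memory stand-in for the .fleet-agents index,
// keyed by agent ID. Writes are last-write-wins (assumed, no version check).
type store map[string]string

// server models one Fleet Server with its own buffer of pending check-ins.
type server struct {
	pending map[string]pendingCheckin
}

func newServer() *server {
	return &server{pending: make(map[string]pendingCheckin)}
}

// checkIn buffers the status instead of writing it immediately.
func (s *server) checkIn(agentID, status string, at time.Time) {
	s.pending[agentID] = pendingCheckin{status: status, receivedAt: at}
}

// flush writes every buffered check-in to the store and clears the buffer.
// It ignores receivedAt, which is what lets older data win.
func (s *server) flush(st store) {
	for id, p := range s.pending {
		st[id] = p.status
	}
	s.pending = make(map[string]pendingCheckin)
}

// simulate replays the scenario and returns the final stored status.
func simulate() string {
	st := store{}
	fs1, fs2 := newServer(), newServer()
	t0 := time.Unix(0, 0)

	fs1.checkIn("agent-1", "DEGRADED", t0)                  // first check-in
	fs2.checkIn("agent-1", "HEALTHY", t0.Add(2*time.Second)) // second, newer check-in

	fs2.flush(st) // Fleet Server 2's interval happens to fire first
	fs1.flush(st) // Fleet Server 1's interval fires later, with older data
	return st["agent-1"]
}

func main() {
	fmt.Println(simulate()) // prints "DEGRADED": the older check-in won
}
```

The stored status ends up being the one from the first check-in, even though the second check-in arrived later.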

I believe we need to ensure that the document is written as soon as possible upon the initial check-in, which would prevent this from happening. The periodic writes that show the Elastic Agent is still connected to the long-poll endpoint can keep using the 10 second window, but the initial check-in should not.
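As a rough sketch of what I mean (not the actual `checkin` package API: `writer`, `memWriter`, `bulker`, and the `initial` flag are all hypothetical names for illustration), the initial check-in would bypass the buffer, while the keep-alive updates stay batched until the next flush:

```go
package main

import (
	"fmt"
	"sync"
)

// writer is a hypothetical abstraction over the write to .fleet-agents.
type writer interface {
	Write(agentID, status string)
}

// memWriter records writes in memory so the behavior can be observed.
type memWriter struct {
	writes []string
}

func (m *memWriter) Write(agentID, status string) {
	m.writes = append(m.writes, agentID+"="+status)
}

// bulker buffers keep-alive updates but writes initial check-ins right away.
type bulker struct {
	mu      sync.Mutex
	w       writer
	pending map[string]string
}

func newBulker(w writer) *bulker {
	return &bulker{w: w, pending: make(map[string]string)}
}

// CheckIn writes immediately when initial is true; otherwise it buffers the
// update until the next Flush (the 10 second window in the real service).
func (b *bulker) CheckIn(agentID, status string, initial bool) {
	b.mu.Lock()
	if !initial {
		b.pending[agentID] = status
		b.mu.Unlock()
		return
	}
	// Drop any older buffered update so it cannot overwrite this one later.
	delete(b.pending, agentID)
	b.mu.Unlock()
	b.w.Write(agentID, status)
}

// Flush writes all buffered updates; a ticker would call this periodically.
func (b *bulker) Flush() {
	b.mu.Lock()
	batch := b.pending
	b.pending = make(map[string]string)
	b.mu.Unlock()
	for id, status := range batch {
		b.w.Write(id, status)
	}
}

func main() {
	w := &memWriter{}
	b := newBulker(w)
	b.CheckIn("agent-1", "HEALTHY", true)  // written immediately
	b.CheckIn("agent-1", "HEALTHY", false) // buffered keep-alive
	fmt.Println(len(w.writes))             // 1 write so far
	b.Flush()
	fmt.Println(len(w.writes)) // 2 writes after the flush
}
```

This only covers the single-server side. A buffered keep-alive on another Fleet Server could still land late, so the real fix may also need some ordering check on the write itself.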
