Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure other tasks are processed with tasklists that contain both bad xfrs (that cause reload to exit) and other tasks #390

Merged
merged 10 commits into from
Oct 23, 2024

Conversation

wtoorop
Copy link
Member

@wtoorop wtoorop commented Oct 18, 2024

The issue is that updating the database from transfers and other tasks that change the database (such as adding zones or changing patterns) are processed in so called task list. Especially applying transfers are executed batch-wise periodically as configured by xfrd-reload-timeout: (default every second). If a zone is added it is appended to the tasklist. If there would have been an "apply transfer task" before that other (adding a zone) task, then that will be executed first. If the "apply transfer task" fails, the reload will be aborted (because the database could be corrupt) and the effect of any tasks before or after the transfer are undone. This can lead to a situation that the transfer daemon (xfrd) thinks a zone is configured, but the zone is not configured in the server process (main).

This PR replaces PR #332 . It removes the additional fork by letting serve childs send NOTIFY messages directly to the xfrd process instead of letting main or backup-main proxy for it.

wtoorop added 9 commits May 25, 2024 13:15
This test is to make sure that the effects of other tasks are effectuated when
they are processed together with other tasks in a single tasklist.  The test
does a reload for a single bad update (from xfr) together with an addzone in a
single tasklist. The addzone should be effectuated.
So that the backup main has all other tasks processed when one of the transfers
fails (causing an exit)
Directly without main or backup-main proxying for it.
... when configured to do so
Copy link
Member

@wcawijngaards wcawijngaards left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different notify communication could be an improvement, if it is easier. The task separation hopefully stops state mismatch. And the code looks nice.

@wtoorop
Copy link
Member Author

wtoorop commented Oct 22, 2024

Just for the record. A write to a pipe is guaranteed to be atomic when the written size is less then PIPE_BUF (according to POSIX). The size of PIPE_BUF is at least 512 (according to POSIX).

An alternative (simpler?) solution could be to use a single pipe to the xfrd where all the serve children are writing on the other end of that single pipe, and then communicate only the name of the zone and an optional serial, instead of the whole NOTIFY message, which is together less than 512.

I think that with such a solution the necessity for a "NOTIFY per zone" buffering mechanism becomes a bit more urgent though, since the serve children loose their individual pipe buffer.

Copy link

@k0ekk0ek k0ekk0ek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning up the forward mode 👍

wtoorop added a commit that referenced this pull request Oct 23, 2024
@wtoorop wtoorop merged commit 1f2a015 into master Oct 23, 2024
6 checks passed
@wtoorop wtoorop deleted the bugfix/bad-xfr-with-other-tasks2 branch January 3, 2025 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants