Skip to content

Conversation

@gao-yan
Copy link
Member

@gao-yan gao-yan commented Apr 1, 2020

This introduces the new behavior as discussed from:

#2012

Any static/random delays that are introduced by pcmk_delay_base/max
configured for the corresponding fencing resources will be added to
priority-fencing-delay. This delay should be significantly greater than, safely twice,
the maximum pcmk_delay_base/max. By default, priority fencing delay is
disabled.

@kgaillot
Copy link

kgaillot commented Apr 1, 2020

Looks good to me. Why would the priority delay need to be twice the other? I would think anything greater would work.

@wenningerk , thoughts?

@wenningerk
Copy link
Contributor

Looks good to me. Why would the priority delay need to be twice the other? I would think anything greater would work.

@wenningerk , thoughts?

Don't see a reason why it would have to be double either.
Significantly larger so that the difference between priority-fencing-delay and pcmk_delay_base/max is comfortably larger than any jitter in fencing-start should be enough.

@gao-yan
Copy link
Member Author

gao-yan commented Apr 1, 2020

Define "significantly" :-) Users would ask the same. If users set a relatively big value of pcmk_delay_base/max, it means they expect relatively big "difference" in terms of delay so that it's safe enough to prevent double-fencing.

Considering the corner cases we are trying to cover here, there should be corresponding "difference" gets reflected by priority-fencing-delay - pcmk_delay_base/max. Well, "twice" is kind of a simple calculable reference for users ... Do you think of any better recommendations? Or just "twice" rather than "at least twice"?

@wenningerk
Copy link
Contributor

I think twice is good as a simple recommendation. It just shouldn't read as if there was a
sophisticated theory behind it ;-)
"To be on the safe side choose priority-fencing-delay double pcmk_delay_base/max (to assure
that ...)"

@gao-yan gao-yan force-pushed the combine-priority-fencing-delay branch 2 times, most recently from 48cedc5 to 347d105 Compare April 1, 2020 21:58
@gao-yan
Copy link
Member Author

gao-yan commented Apr 1, 2020

Now it's "This delay should be significantly greater than, safely twice, the maximum pcmk_delay_base/max." What do you think?

@wenningerk
Copy link
Contributor

Content-wise I'm fine with it. Language details I leave for the native-speakers to comment on ;-)

@gao-yan
Copy link
Member Author

gao-yan commented Apr 2, 2020

Ok. @kgaillot must have better ideas on the phrasing :-)

@kgaillot
Copy link

kgaillot commented Apr 2, 2020

Works for me.

@wenningerk , any comments on overall PR?

@wenningerk
Copy link
Contributor

Looking good to me.
2 things where I'm not sure if the most comprehensible route was taken. But that is probably just my personal feelings:

  • couple of times there is 'combined' used for a delay resulting from base/max & priority-delay
    Implies for me that there might be some algorithm behind it like 'max' or whatever.
    A simple 'added' instead would remove that ambiguity.
  • I know it is documented what '-1' for the delay does mean here
    depends on what the more natural connotation with '-1' is ... rather disable (= no delay)
    or take defaults/what is coming from somewhere else
    But given that we are adding delays, using '-1' for absolutely no delay is probably the better
    solution.

@gao-yan
Copy link
Member Author

gao-yan commented Apr 2, 2020

Looking good to me.
2 things where I'm not sure if the most comprehensible route was taken. But that is probably just my personal feelings:

  • couple of times there is 'combined' used for a delay resulting from base/max & priority-delay
    Implies for me that there might be some algorithm behind it like 'max' or whatever.
    A simple 'added' instead would remove that ambiguity.

You mean you prefer the word "added" rather than "combined" to be used in the doc/metadata, code comments and commit descriptions? Sure, if you think that's better :-)

  • I know it is documented what '-1' for the delay does mean here
    depends on what the more natural connotation with '-1' is ... rather disable (= no delay)
    or take defaults/what is coming from somewhere else
    But given that we are adding delays, using '-1' for absolutely no delay is probably the better
    solution.

It's what it exactly does invoking fence_with_delay() with "-1".

Or do you mean "-1" for priority-fencing-delay should disable all delays including pcmk_delay_base/max? Is it a condition that an option so-called "priority-fencing-delay" should cover?

@gao-yan
Copy link
Member Author

gao-yan commented Apr 2, 2020

  • I know it is documented what '-1' for the delay does mean here
    depends on what the more natural connotation with '-1' is ... rather disable (= no delay)
    or take defaults/what is coming from somewhere else
    But given that we are adding delays, using '-1' for absolutely no delay is probably the better
    solution.

It's what it exactly does invoking fence_with_delay() with "-1".

Or do you mean "-1" for priority-fencing-delay should disable all delays including pcmk_delay_base/max? Is it a condition that an option so-called "priority-fencing-delay" should cover?

Or do you mean the descriptions about the meaning of value "-1" for the delay parameters/variables in the code comments/doc?

@wenningerk
Copy link
Contributor

You mean you prefer the word "added" rather than "combined" to be used in the doc/metadata, code comments and commit descriptions? Sure, if you think that's better :-)

Yes - directly naming the operation done ...

It's what it exactly does invoking fence_with_delay() with "-1".

Or do you mean "-1" for priority-fencing-delay should disable all delays including pcmk_delay_base/max? Is it a condition that an option so-called "priority-fencing-delay" should cover?

Scratch my comment about '-1' it is fine as it is.

gao-yan added 6 commits April 3, 2020 17:39
…bled

This commit also documents the upcoming new behavior as discussed from:

ClusterLabs#2012

Any static/random delays that are introduced by `pcmk_delay_base/max`
configured for the corresponding fencing resources will be added to this
delay. This delay should be significantly greater than, safely twice,
the maximum `pcmk_delay_base/max`. By default, priority fencing delay is
disabled.
… have equal priority

In any cases, priority-fencing-delay won't take precedence over any
configured pcmk_delay_base/max.
…uested fencing delay

Requested fencing delay doesn't take precedence over any configured
pcmk_delay_base/max.

A delay value -1 now means disable also any static/random fencing delays
from pcmk_delay_base/max. It's not used by any consumers for now.
This commit also documents the current behavior in the help:

- Any static/random delays from pcmk_delay_base/max will be added to
requested fencing delay.

- A delay value -1 now means disable also any static/random fencing
delays from pcmk_delay_base/max.
…y_base is added

This commit also updates log patterns for the log changes.
@gao-yan gao-yan force-pushed the combine-priority-fencing-delay branch from 347d105 to 6ab6d38 Compare April 3, 2020 17:09
@gao-yan gao-yan changed the title Feature: priority-fencing-delay combines with any delays from pcmk_delay_base/max Feature: any delays from pcmk_delay_base/max are added to priority-fencing-delay Apr 3, 2020
@kgaillot kgaillot merged commit eb73f22 into ClusterLabs:master Apr 3, 2020
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 19, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 20, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 20, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 21, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 21, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 21, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 24, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 25, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Mar 27, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Jul 3, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Jul 23, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Jul 23, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Jul 24, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
nrwahl2 added a commit to nrwahl2/pacemaker that referenced this pull request Aug 6, 2025
To replace crm_element_value_int().

Note that, like the other recently added getter functions, this leaves
dest unchanged on error. crm_element_value_int() set dest to -1.

This should indirectly improve some of the callers, which were not
error-checking crm_element_value_int()'s return code and simply assumed
the value was valid and not -1.

And IMO, it's better not to log errors or warnings within the getter
function as crm_element_value_int() was doing. The caller has more
context and can log errors or warnings if appropriate. Sometimes, a
parse failure may be expected as a legitimate possibility.

The one case I found questionable was the getting of PCMK__XA_ST_DELAY
in create_remote_stonith_op(). If that value is set to -1, then all
delays are disabled. crm_element_value_int() sets the value to -1 upon a
parse error, and this function was not doing any error checking before
now. So I wondered if we were relying on the parse error defaulting to
-1.

However, I don't think it's a problem. It looks like that attribute is
only ever getting set by crm_xml_add_int() -- so even if it's set based
on user input, it's set to a valid integer. The -1 value seems to be
intended only for when it's set explicitly to -1. We simply haven't
error-checked the parsing before now.

See commit a0cbd52 and associated commits in PR ClusterLabs#2027.

Signed-off-by: Reid Wahl <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants