-
Notifications
You must be signed in to change notification settings - Fork 348
Replace uses of strtok() with g_strsplit() and friends #3967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
For readability. Similarly for the use of pcmk__str_eq() -- using strcmp() in add_possible_values_default() was not a bug in this case, as both arguments were guaranteed to be non-NULL, but this was not obvious. Signed-off-by: Reid Wahl <[email protected]>
For readability and reliability. strtok() modifies its argument string, but according to the C11 standard (per the following Stack Overflow answer), the setlocale() return value should not be modified by the program. * https://stackoverflow.com/a/29116455 Signed-off-by: Reid Wahl <[email protected]>
The name is overly long and uses the public API prefix ("services_" with a single underscore). Also rename the lpc variable to "i" and scope it to the loop, rename the "root" argument to "dir", and use bool instead of gboolean. I'm guessing that the "os" prefix, as well as the separate "services_linux.c" file, were meant to allow different implementations of certain functions depending on the OS. GLib does similar things. However, the commit that introduced it all (cc2110d) sheds no light on this question. Signed-off-by: Reid Wahl <[email protected]>
* namelist was not being freed when entries == 0, indicating an empty directory. * A namelist entry was not being freed if stat() failed. Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
For readability Signed-off-by: Reid Wahl <[email protected]>
The name is overly long and uses the public API prefix ("services_" with a single underscore). Use bool instead of gboolean, and rename the "root" argument to "dirs". "dirs" is a colon-delimited list of directories. I'm guessing that the "os" prefix, as well as the separate "services_linux.c" file, were meant to allow different implementations of certain functions depending on the OS. GLib does similar things. However, the commit that introduced it all (cc2110d) sheds no light on this question. Signed-off-by: Reid Wahl <[email protected]>
For readability versus strtok(). Signed-off-by: Reid Wahl <[email protected]>
Previously, the libcrmservice code supported initdir being a colon-delimited list of directories. However, this was not documented in the configure help text or comments. All of the documentation referred to initdir being a single directory. The default behavior, if not building with "--with-initdir=<dir>", was to check a list of directories and set INITDIR to the first one that existed. Signed-off-by: Reid Wahl <[email protected]>
Its sole remaining caller did nothing except call it. Move the code there. Signed-off-by: Reid Wahl <[email protected]>
The resources_os_list_ocf_providers() is listing top-level subdirectories of each directory in PCMK__OCF_RA_PATH. Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
...up. * Give it a shorter and clearer name that doesn't look like public API. * Drop arguments that always take the same value. (There is only a single caller.) * Avoid a fixed-size buffer where we don't control the input. * Use g_strsplit() for readability instead of strtok(). Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
...in get_directory_list(). To do this, copy the current services__list_dir() function into a static helper called gdl_helper(). This is the only call to services__list_dir() that may take a false value for the executable argument. Signed-off-by: Reid Wahl <[email protected]>
It's always true Signed-off-by: Reid Wahl <[email protected]>
Previously, files of type other than directory and regular file would be listed unconditionally. Now, list only executable regular files if exec_files is true, and list only directories if exec_files is false. Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
To services__list_ocf_providers(), in order to follow the library's naming conventions. Signed-off-by: Reid Wahl <[email protected]>
To services__list_ocf_agents(), in order to follow the library's naming conventions. Also defunctionize list_provider_agents(). It seems more straightforward to have it inline. Signed-off-by: Reid Wahl <[email protected]>
And return bool instead of gboolean. More to come. Signed-off-by: Reid Wahl <[email protected]>
For readability Signed-off-by: Reid Wahl <[email protected]>
And assert on memory errors Signed-off-by: Reid Wahl <[email protected]>
Nothing uses this yet Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
And compare strchr() return value against NULL instead of 0. Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
This is pretty big regardless. So depending on the speed of reviewing it, we could break off the fencer commits into a later PR, and maybe work on sharing code with I don't love the idea of pulling this parsing code into |
One
Edit: Added a commit to fix this in the test |
It's still confusing, but hopefully a little bit less so. When delay_base_s is set, it means we've found our mapping. So we use that as a loop exit condition. We no longer need valptr, since we never change what value points to. Now that value isn't changing, it will be easier to feel confident about using g_strsplit_set() in place of strtok(). Note the similarity to pcmk__scan_nvpair(). Signed-off-by: Reid Wahl <[email protected]>
Instead of strtok(). This seems clearer. Skip empty mappings so that we don't log a "malformed" error. strtok() skipped contiguous sequences of delimiters, so we wouldn't try to parse an empty mapping. g_strsplit_set() returns an empty string between contiguous delimiters. Strip leading and trailing whitespace before parsing the param value, and don't try to parse as a sequence of mappings (delimited by ';', ' ', and '\t') if there are no more delimiters after stripping. This avoids logging an error when the value is formatted like a single interval: no colon, no semicolon, no spaces or tabs except optionally at the beginning or end of the string. It seems reasonable not to strip leading or trailing semicolons. We can ignore leading or trailing whitespace, but a leading or trailing semicolon seems like a clear indicator that this is supposed to be a mapping. This will result an error for values like ";3s " or " 3s;" but not for " 3s ". Signed-off-by: Reid Wahl <[email protected]>
To reduce some nesting Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
Seems cleaner since it isolates the "mappings" variable and g_strsplit_set() call. Signed-off-by: Reid Wahl <[email protected]>
An ISO 8601 interval can include a colon character. I can't think of any valid reason why a user would specify an interval this way for a fencing delay, but there doesn't seem to be any reason to exclude mapping values that have a colon. The "make sure there's no colon" check seemed to be an artifact of the way the parsing code worked before we started using g_strsplit(), etc. Signed-off-by: Reid Wahl <[email protected]>
Commit f057778 fixed an off-by-one error that a test relied on to avoid randomness in the output. Also add a new test to ensure that pcmk_delay_base is capped by pcmk_delay_max. Signed-off-by: Reid Wahl <[email protected]>
...to i. Also rename max to len. Signed-off-by: Reid Wahl <[email protected]>
This is a static function with only one caller, and that caller always passes the address of a struct member. Signed-off-by: Reid Wahl <[email protected]>
Signed-off-by: Reid Wahl <[email protected]>
It's about to go out of scope, so it doesn't get reused. It also doesn't get freed because the aliases table has taken ownership of it. Signed-off-by: Reid Wahl <[email protected]>
1c0b21a
to
bf6eb20
Compare
This should make absolutely no difference but uses the correct semantics. Signed-off-by: Reid Wahl <[email protected]>
This reverts e95198f. Backslash escapes are useful in C and on the command line, but it's not clear how they would be useful in a pcmk_host_map value. Backslash escapes are not used in XML, where the configuration is stored. The removed code did the following upon finding a backslash: * If we're not inside the value part of a host-to-port mapping, skip both a backslash and the character after it. * Otherwise, skip the backslash and keep the character after it (unless that character is also a backslash). This doesn't seem to make sense. Any characters that are special in XML attribute values would need to be escaped using XML entities. Other characters, including backslashes, are allowed as literals. * https://www.w3.org/TR/REC-xml/#NT-AttValue Signed-off-by: Reid Wahl <[email protected]>
Previously, if we hit a parsing error in pcmk_host_map, we logged a debug message and continued trying to parse the rest of the value. It's hard to be confident about the resulting behavior. Now, we stop parsing if we encounter an error. Also as part of this commit, we use g_strsplit_set() twice. At the first level, we use it to split the list of node-name-to-port mappings in pcmk_host_map, using ';', ' ', and '\t' as delimiters. At the second level, we use it to split each mapping, using '=' and ':' as delimiters. Support for '=' is likely to be removed in a future release. I greatly struggled to reason about the existing code based on "last = i + 1". It felt as if there were lots of potential edge cases. The behavior may not be identical as of this commit, but it should be more robust and predictable, and the code should be easier to maintain. Signed-off-by: Reid Wahl <[email protected]>
bf6eb20
to
b936817
Compare
* into fewer than two mappings avoids a "Malformed mapping" error message | ||
* below. | ||
*/ | ||
if (g_strv_length(mappings) < 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong. There could be a single mapping -- though if there are no delimiters, we don't want to throw an error if it's an interval instead of a mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this still need to be addressed?
g_string_append_c(buf, '"'); | ||
g_string_append(buf, value); | ||
g_string_append(buf, *value); | ||
g_string_append_c(buf, '"'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These calls could be replaced with:
g_string_append_printf(buf, "\"%s\"", *value);
if (!found_default && (strcmp(value, option->default_value) == 0)) { | ||
if (!found_default | ||
&& pcmk__str_eq(*value, option->default_value, | ||
pcmk__str_none)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I definitely don't care about, but just a suggestion to potentially make this a little more legible. You could rearrange this block as:
if (!pcmk__str_eq(*value, option->default_value, pcmk__str_none)) {
continue;
}
if (!found_default) {
...
}
The idea being that we bring the second condition back a little to the left and then maybe it all fits on one line.
|
||
#include <crm_internal.h> | ||
|
||
#include <stdbool.h> // true, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have mentioned this a long time ago, but I don't think listing what we import from stdbool.h is at all worth it. It's also not worth changing in this PR either.
|
||
for (char *dir = strtok(dirs, ":"); !found && (dir != NULL); | ||
dir = strtok(NULL, ":")) { | ||
dirs = g_strsplit(PCMK__OCF_RA_PATH, ":", 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that if PCMK__OCF_RA_PATH
is NULL, this will assert instead of return false like we previously did. I have not gone back and checked previous patches to see if this is taken into consideration in those.
lib/services/services_ocf.c
Outdated
if (dirs == NULL) { | ||
return ENOMEM; | ||
} | ||
gchar **dirs = g_strsplit(PCMK__OCF_RA_PATH, ":", 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to my previous comment - here we'll assert, but it's better than what we used to do if PCMK__OCF_RA_PATH
is NULL.
* into fewer than two mappings avoids a "Malformed mapping" error message | ||
* below. | ||
*/ | ||
if (g_strv_length(mappings) < 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this still need to be addressed?
# Resulting delay will be 1 second (capped by pcmk_delay_max) | ||
test.add_cmd("stonith_admin", | ||
args='--output-as=xml -R true3 -a fence_dummy -o mode=pass -o "pcmk_host_list=node1 node2 node3"') | ||
args='--output-as=xml -R true3 -a fence_dummy -o mode=pass -o "pcmk_host_list=node1 node2 node3" -o pcmk_delay_base=2 -o pcmk_delay_max=1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little surprised that the argument to pcmk_host_list
doesn't have to be surrounded by some quotes here (and everywhere else) since it contains spaces.
daemons/fenced/fenced_commands.c
Outdated
for (; lpc <= max; lpc++) { | ||
switch (hostmap[lpc]) { | ||
len = strlen(hostmap); | ||
for (int i = 0; i <= len; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be fine with you also getting rid of len
and just putting the call to strlen here since it's a pretty short line. Any halfway decent compiler should be smart enough to generate reasonable code here. If you don't want to, that's fine too.
} | ||
} | ||
value[k] = '\0'; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs (doc/sphinx/Pacemaker_Explained/fencing.rst) say this about pcmk_host_map
:
- A mapping of node names to ports for devices that do not understand the
node names. For example, ``node1:1;node2:2,3`` tells the cluster to use
port 1 for ``node1`` and ports 2 and 3 for ``node2``. If
``pcmk_host_check`` is explicitly set to ``static-list``, either this or
``pcmk_host_list`` must be set. The port portion of the map may contain
special characters such as spaces if preceded by a backslash *(since 2.1.2)*.
I'm unclear on what a "port" is in this context, and what characters are valid in the definition of a port. I'd like to make sure that if we are expected to support characters such as spaces in port names, that the new parsing code handles that without backslashes. Also, since this has been present since 2.1.2, I'm a little concerned about breaking whoever is using this right now regardless of how little sense it makes or how it is currently mangled in the XML.
|
||
if (pcmk__str_empty(nvpair[0]) || pcmk__str_empty(nvpair[1])) { | ||
crm_err(PCMK_FENCING_HOST_MAP ": Malformed mapping '%s'", *mapping); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're changing behavior... I wonder if it makes sense to return an empty mapping in the case where it's malformed. I can see going either way with it. If there's a parse error, who knows what they screwed up so maybe the entire thing is invalid. On the other hand, maybe we should try to use what we can from before the error occurred (what we're doing right now) to implement the user's configuration as much as possible.
Most of this seems fairly straightforward, but the fencer changes related to
pcmk_delay_base
andpcmk_host_map
could use some testing. The existing code was unintelligible to me at first, so it required a LOT of small, incremental changes.