std.posix.getenv: early-return comparison #23265
Instead of parsing the full key and value for each environment variable before checking the key for (case-insensitive) equality, we skip to the next environment variable once it's no longer possible for the key to match. This makes getting environment variables about 2x faster across the board on Windows. Note: We still have to scan to find the end of each environment variable, even the ones that are skipped (we only know where it ends by a NUL terminator), so this strategy doesn't provide the same speedup on Windows as it does on POSIX (ziglang#23265)
This will cause a regression when looking up environment variable names containing '='. An easy fix would be adding something like: if (std.mem.indexOfScalar(u8, key, '=') != null) return null; (see also the standalone test in #23272 for a relevant test case)
Apologies, I believe that POSIX does not allow environment variable names with an embedded '=': https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html
Correct, but users can still request the lookup of invalid environment variable names. With the changes in this PR, calling Instead, looking up any name with zig/lib/libc/musl/src/env/getenv.c Lines 7 to 8 in 9c9d393
I see what you mean. The reference implementation would return |
const std = @import("std");
pub fn main() void {
std.log.info("FOO: {s}", .{ std.c.getenv("FOO") orelse "" });
std.log.info("FOO=ABC: {s}", .{ std.c.getenv("FOO=ABC") orelse "" });
}
> FOO="ABC=123" zig run getenv-demo
info: FOO: ABC=123
info: FOO=ABC: ABC=123
If by reference implementation you mean the implementation on (not efficient, but correct behavior) EDIT: Sorry, I think I misunderstood what you were saying. If so, ignore this |
Terribly sorry if I'm misunderstanding the code, but by "reference" I mean Please see my understanding of what happens
// name == "FOO=ABC"
size_t l = __strchrnul(name, '=') /* == (name + 3) */ - name; // == 3
if (l && !name[l] && __environ) // 3 && !'=' && __environ
for (char **e = __environ; *e; e++)
// **e == &("FOO=ABC=123")
if (!strncmp(name, *e, l) /* strncmp("FOO=ABC", "FOO=ABC=123", 3) == 0 */ && l[*e] /* "FOO=ABC=123"[3] */ == '=')
return *e + l+1; // "FOO=ABC=123"[4:] == "ABC=123"
return 0;
No worries, the code is hard to understand. You're misinterpreting the Here's my rewrite of your comment for that line: if (l && !name[l] && __environ) // 3 && name[3] == 0 && __environ so in the case of |
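To make the corrected reading of `!name[l]` concrete, here is a minimal C sketch modeled on the musl snippet quoted above. It is not musl's code: `lookup_env` is a hypothetical helper that takes the environment array as a parameter instead of reading `__environ`, and it uses `strcspn` in place of `__strchrnul`. Under those assumptions, it shows why a name containing '=' is rejected before any entry is compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc function): looks `name` up in a
 * NULL-terminated array of "KEY=VALUE" strings, following the same
 * structure as the musl snippet quoted in the conversation. */
static const char *lookup_env(const char *const *env, const char *name)
{
    /* Offset of the first '=' in `name`, or strlen(name) if there is none.
     * This plays the role of __strchrnul(name, '=') - name. */
    size_t l = strcspn(name, "=");

    /* name[l] is NUL only when `name` has no '='. Empty names and names
     * with an embedded '=' are therefore rejected up front. */
    if (l == 0 || name[l] != '\0' || env == NULL)
        return NULL;

    for (const char *const *e = env; *e; e++) {
        /* Match the first l bytes, then require '=' right after them. */
        if (strncmp(name, *e, l) == 0 && (*e)[l] == '=')
            return *e + l + 1;
    }
    return NULL;
}
```

With an environment holding only "FOO=ABC=123", looking up "FOO" yields "ABC=123", while looking up "FOO=ABC" yields NULL, because `name[3]` is '=' rather than NUL.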
Thank you for clarifying, pushed the suggested change to match This still, I think, puts a spotlight on the fact that |
I don't think they will behave differently, but it is a good idea to test for this. I'll make the standalone test added in #23272 also compile version(s) with libc linked and make sure it tests |
At least on macOS I'm observing behaviour reported in #23265 (comment). Including with
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
const char *v = getenv("FOO");
printf("FOO: %s\n", v ? v : "");
v = getenv("FOO=ABC");
printf("FOO=ABC: %s\n", v ? v : "");
return 0;
}
Hm, you seem to be right, and musl might be the odd one out here. MinGW and MSVC libc on Windows returns non-null for the The musl behavior seems obviously more correct to me, though, since as you quoted from the POSIX spec before:
FWIW, EDIT: Was hoping to maybe find a musl commit where this behavior was introduced but it's been there since the earliest commit: https://git.musl-libc.org/cgit/musl/commit/src/env/getenv.c?id=0b44a0315b47dd8eced9f3b7f31580cf14bbfc01 |
Made a follow-up issue for |
if (!mem.eql(u8, this_key, key)) continue;
while (line[line_i] != 0) : (line_i += 1) {
if (line_i == key.len) break;
if (line[line_i] != key[line_i]) break;
I think we can label the outer loop and continue :outer; instead, which allows us to skip the line_i != key.len check below.
I experimented with this, and interestingly it results in slower code with distinctly bad tails.
I'm running with this diff applied
diff --git a/lib/std/posix.zig b/lib/std/posix.zig
index 0d4ad3abaa..e09daf0703 100644
--- a/lib/std/posix.zig
+++ b/lib/std/posix.zig
@@ -2010,13 +2010,13 @@ pub fn getenv(key: []const u8) ?[:0]const u8 {
}
if (builtin.link_libc) {
var ptr = std.c.environ;
- while (ptr[0]) |line| : (ptr += 1) {
+ environ: while (ptr[0]) |line| : (ptr += 1) {
var line_i: usize = 0;
while (line[line_i] != 0) : (line_i += 1) {
if (line_i == key.len) break;
- if (line[line_i] != key[line_i]) break;
+ if (line[line_i] != key[line_i]) continue :environ;
}
- if ((line_i != key.len) or (line[line_i] != '=')) continue;
+ if (line[line_i] != '=') continue;
return mem.sliceTo(line + line_i + 1, 0);
}
A typical hyperfine invocation looks like the one below, running on an M3 Mac. Different --warmup values don't really change the picture (I tried values up to several dozen).
> hyperfine --warmup 3 ./getenv-zig-speedup ./getenv-zig-label1
Benchmark 1: ./getenv-zig-speedup
Time (mean ± σ): 117.0 ms ± 0.1 ms [User: 116.5 ms, System: 0.3 ms]
Range (min … max): 116.9 ms … 117.6 ms 25 runs
Benchmark 2: ./getenv-zig-label1
Time (mean ± σ): 133.1 ms ± 19.9 ms [User: 132.5 ms, System: 0.3 ms]
Range (min … max): 117.0 ms … 177.6 ms 22 runs
Summary
./getenv-zig-speedup ran
1.14 ± 0.17 times faster than ./getenv-zig-label1
I don't think I understand LLVM enough to articulate why exactly this happens, other than that manual control flow isn't trivial to optimize around.
If anyone would like to try to reproduce this effect on other systems, it would be much appreciated.
if (!mem.eql(u8, key, this_key)) continue;
while (ptr[line_i] != 0) : (line_i += 1) {
if (line_i == key.len) break;
if (ptr[line_i] != key[line_i]) break;
Likewise
Addresses the issue described in #22917.
Possibly interferes with @andrewrk's work on https://github.com/ziglang/zig/tree/main branch.
The original implementation (https://github.com/ziglang/zig/blob/aa3db7cc15/lib/std/posix.zig#L2004) iterates over each environment variable until the end of its name (until =), and only then compares the entire name to key.
Since some environment variable names can be quite long (e.g. GHOSTTY_SHELL_INTEGRATION_NO_SUDO=1), these lengths add up. Put simply, in order to find a key in environ, it has to iterate over the cumulative length of every environment variable name before it.
The proposed implementation functionally does what strncmp would do: it stops iterating and moves on to the next variable at the first character mismatch with key.
See the benchmarks below.
Disclaimer: this was tested on macOS, with a variation of #23264 fix applied.
To address the elephant in the room: this loop is duplicated, but I'm hesitant to refactor it in order to deduplicate because of the // TODO see https://github.com/ziglang/zig/issues/4524 preamble and ongoing work to fix it.
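For readers who want to see the two strategies side by side, here is a rough C sketch of the lookup described in this PR. It is an illustration under assumptions, not the Zig code in lib/std/posix.zig: both function names are hypothetical, the environment is passed in as a NULL-terminated array, and the '=' guard in the early-exit version corresponds to the fix suggested earlier in the conversation.

```c
#include <stddef.h>
#include <string.h>

/* Baseline (hypothetical name): scan each entry up to '=' first,
 * and only then compare the whole name against `key`. */
static const char *getenv_scan_then_compare(const char *const *env,
                                            const char *key, size_t key_len)
{
    for (; *env; env++) {
        const char *line = *env;
        size_t name_len = 0;
        while (line[name_len] != '\0' && line[name_len] != '=')
            name_len++;
        if (line[name_len] != '=')
            continue; /* entry without '=', skip it */
        if (name_len != key_len || memcmp(line, key, key_len) != 0)
            continue;
        return line + name_len + 1;
    }
    return NULL;
}

/* Early-exit variant (hypothetical name): stop at the first byte that
 * differs from `key`, so long non-matching names cost almost nothing. */
static const char *getenv_early_exit(const char *const *env,
                                     const char *key, size_t key_len)
{
    /* Guard from the review discussion: a key containing '=' can never
     * be a valid name, and without this check "FOO=ABC" would match
     * the entry "FOO=ABC=123". */
    if (memchr(key, '=', key_len) != NULL)
        return NULL;

    for (; *env; env++) {
        const char *line = *env;
        size_t i = 0;
        while (line[i] != '\0') {
            if (i == key_len) break;       /* consumed the whole key */
            if (line[i] != key[i]) break;  /* first mismatch: give up */
            i++;
        }
        if (i != key_len || line[i] != '=')
            continue;
        return line + i + 1;
    }
    return NULL;
}
```

Both functions return the same result for valid keys. The difference is the amount of work per skipped entry: the baseline walks every name to its '=', while the early-exit version usually stops after one or two bytes.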