Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.posix.getenv: early-return comparison #23265

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

g-logunov
Copy link

Addresses the issue described in #22917.

Possibly interferes with @andrewrk work on https://github.com/ziglang/zig/tree/main branch.

Original implementation https://github.com/ziglang/zig/blob/aa3db7cc15/lib/std/posix.zig#L2004 for each environment variable iterates until the end of its name (until =), and only then compares entire name to key.
Since some of the environment variables could be quite long (i.e. GHOSTTY_SHELL_INTEGRATION_NO_SUDO=1), these sizes add up.
Simply - in order to find a key in environ, it has to iterate over cumulative sizes of each env variable name before it.

Proposed implementation functionally does what strncmp would do: stops iterating and moves to the next variable on first character mismatch with key.

See the benchmarks below.
Disclaimer: this was tested on macOS, with a variation of #23264 fix applied.

// getenv-c.zig
const std = @import("std");

pub fn main() void {
    for (0..10_000_000) |_| {
        _ = std.mem.doNotOptimizeAway(std.c.getenv("FOOBAR"));
    }
}
// getenv-zig.zig
const std = @import("std");

pub fn main() void {
    for (0..10_000_000) |_| {
        _ = std.mem.doNotOptimizeAway(std.posix.getenv("FOOBAR"));
    }
}
> env | wc -l
      43
> hyperfine --warmup 3 ./getenv-c ./getenv-zig-aa3db7cc15 ./getenv-zig-speedup
Benchmark 1: ./getenv-c
  Time (mean ± σ):     294.1 ms ±   3.5 ms    [User: 293.1 ms, System: 0.8 ms]
  Range (min … max):   288.7 ms … 299.3 ms    10 runs

Benchmark 2: ./getenv-zig-aa3db7cc15
  Time (mean ± σ):      1.907 s ±  0.022 s    [User: 1.901 s, System: 0.005 s]
  Range (min … max):    1.881 s …  1.941 s    10 runs

Benchmark 3: ./getenv-zig-speedup
  Time (mean ± σ):     130.8 ms ±   0.5 ms    [User: 129.8 ms, System: 0.8 ms]
  Range (min … max):   129.7 ms … 131.7 ms    22 runs

Summary
  ./getenv-zig-speedup ran
    2.25 ± 0.03 times faster than ./getenv-c
   14.58 ± 0.17 times faster than ./getenv-zig-aa3db7cc15

To address the elephant in the room: this loop is duplicated, but I'm hesitant to refactor it in order to deduplicate because of
// TODO see https://github.com/ziglang/zig/issues/4524 preamble and ongoing work to fix it.

squeek502 added a commit to squeek502/zig that referenced this pull request Mar 17, 2025
Instead of parsing the full key and value for each environment variable before checking the key for (case-insensitive) equality, we skip to the next environment variable once it's no longer possible for the key to match.

This makes getting environment variables about 2x faster across the board on Windows.

Note: We still have to scan to find the end of each environment variable, even the ones that are skipped (we only know where it ends by a NUL terminator), so this strategy doesn't provide the same speedup on Windows as it does on POSIX (ziglang#23265)
squeek502 added a commit to squeek502/zig that referenced this pull request Mar 17, 2025
Instead of parsing the full key and value for each environment variable before checking the key for (case-insensitive) equality, we skip to the next environment variable once it's no longer possible for the key to match.

This makes getting environment variables about 2x faster across the board on Windows.

Note: We still have to scan to find the end of each environment variable, even the ones that are skipped (we only know where it ends by a NUL terminator), so this strategy doesn't provide the same speedup on Windows as it does on POSIX (ziglang#23265)
@squeek502
Copy link
Collaborator

squeek502 commented Mar 22, 2025

This will cause a regression with regards to looking up environment variable names with = in them. Such environment variable names should always lead to getenv returning null (for example: if you set FOO=ABC=123, calling getenv("FOO=ABC") should return null, not 123)

An easy fix would be adding something like:

if (std.mem.indexOfScalar(u8, key, '=') != null) return null;

(see also the standalone test in #23272 for a relevant test case)

@g-logunov
Copy link
Author

Apologies, I believe that POSIX does not allow for environment variables with embedded =.

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html

These strings have the form name=value; names shall not contain
any bytes that have the encoded value of the character '='.
For values to be portable across systems conforming to POSIX.1-2024,
the value shall be composed of bytes that have the encoded value of
characters from the portable character set (except NUL and as indicated below).

@squeek502
Copy link
Collaborator

squeek502 commented Mar 22, 2025

Correct, but users can still request the lookup of invalid environment variable names. With the changes in this PR, calling getenv("FOO=ABC") will return a non-null value if the environment variable FOO is set to ABC=<something>.

Instead, looking up any name with = in it should always return null. Here's the musl implementation for reference which checks for = in the name:

size_t l = __strchrnul(name, '=') - name;
if (l && !name[l] && __environ)

@g-logunov
Copy link
Author

I see what you mean. The reference implementation would return ABC=123 to FOO=ABC request, I believe, because it considers "prefix before first =" as total match.

@g-logunov
Copy link
Author

const std = @import("std");

pub fn main() void {
    std.log.info("FOO: {s}", .{ std.c.getenv("FOO") orelse "" });
    std.log.info("FOO=ABC: {s}", .{ std.c.getenv("FOO=ABC") orelse "" });
}
> FOO="ABC=123" zig run getenv-demo
info: FOO: ABC=123
info: FOO=ABC: ABC=123

@squeek502
Copy link
Collaborator

squeek502 commented Mar 23, 2025

If by reference implementation you mean the implementation on master currently, then the mem.eql would fail because it'd be comparing FOO to FOO=ABC so null would end up being returned (after looping through the entire environment block).

(not efficient, but correct behavior)

EDIT: Sorry, I think I misunderstood what you were saying. If so, ignore this

@g-logunov
Copy link
Author

Terribly sorry if I'm misunderstanding the code, but by "reference" I mean musl implementation from above

Please see my understanding of what happens

// name == "FOO=ABC"
size_t l = __strchrnul(name, '=') /* == (name + 3) */ - name; // == 3
if (l && !name[l] && __environ) // 3 && !'=' && __environ
    for (char **e = __environ; *e; e++)
        // **e == &("FOO=ABC=123")
        if (!strncmp(name, *e, l) /* strncmp("FOO=ABC", "FOO=ABC=123", 3) == 0 */ && l[*e] /* "FOO=ABC=123"[3] */ == '=')
            return *e + l+1; // "FOO=ABC=123"[4:] == "ABC=123"
return 0;

@squeek502
Copy link
Collaborator

squeek502 commented Mar 23, 2025

Terribly sorry if I'm misunderstanding the code

No worries, the code is hard to understand. You're misinterpreting the !name[l] condition, which checks that __strchrnul returned the position of the NUL terminator.

Here's my rewrite of your comment for that line:

if (l && !name[l] && __environ) // 3 && name[3] == 0 && __environ

so in the case of FOO=ABC, !name[3] aka name[3] == 0 would be false.

@g-logunov
Copy link
Author

g-logunov commented Mar 23, 2025

Thank you for clarifying, pushed the suggested change to match null behaviour.

This still, I think, puts a spotlight on the fact that std.c.getenv, and therefore std.posix.getenvZ (may) behave differently than musl and std.posix.getenv for keys with embedded =.

@squeek502
Copy link
Collaborator

squeek502 commented Mar 23, 2025

This still, I think, puts a spotlight on the fact that std.c.getenv, and therefore std.posix.getenvZ (may) behave differently than musl and std.posix.getenv for keys with embedded =.

I don't think they will behave differently, but it is a good idea to test for this. I'll make the standalone test added in #23272 also compile version(s) with libc linked and make sure it tests std.c.getenv/std.posix.getenvZ as well.

@g-logunov
Copy link
Author

I don't think they will behave differently, but it is a good idea to test for this. I'll make the standalone test added in #23272 also compile version(s) with libc linked and make sure it tests std.c.getenv/std.posix.getenvZ as well.

At least on macOS I'm observing behaviour reported in #23265 (comment).

Including with

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    const char *v = getenv("FOO");
    printf("FOO: %s\n", v ? v : "");
    v = getenv("FOO=ABC");
    printf("FOO=ABC: %s\n", v ? v : "");
    return 0;
}

@squeek502
Copy link
Collaborator

squeek502 commented Mar 23, 2025

Hm, you seem to be right, and musl might be the odd one out here. MinGW and MSVC libc on Windows returns non-null for the FOO=ABC case, as does glibc on Linux. Only musl returns null.

The musl behavior seems obviously more correct to me, though, since as you quoted from the POSIX spec before:

names shall not contain any bytes that have the encoded value of the character '='.

FWIW, GetEnvironmentVariableW on Windows returns ERROR_ENVVAR_NOT_FOUND

EDIT: Was hoping to maybe find a musl commit where this behavior was introduced but it's been there since the earliest commit: https://git.musl-libc.org/cgit/musl/commit/src/env/getenv.c?id=0b44a0315b47dd8eced9f3b7f31580cf14bbfc01

@squeek502
Copy link
Collaborator

squeek502 commented Mar 23, 2025

Made a follow-up issue for = in environment variable names: #23331

if (!mem.eql(u8, this_key, key)) continue;
while (line[line_i] != 0) : (line_i += 1) {
if (line_i == key.len) break;
if (line[line_i] != key[line_i]) break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can label the outer loop and continue :outer; instead which allows us to skip the line_i != key.len check below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented with this, and interestingly it results in slower code with distinctly bad tails.

I'm running with this diff applied

diff --git a/lib/std/posix.zig b/lib/std/posix.zig
index 0d4ad3abaa..e09daf0703 100644
--- a/lib/std/posix.zig
+++ b/lib/std/posix.zig
@@ -2010,13 +2010,13 @@ pub fn getenv(key: []const u8) ?[:0]const u8 {
     }
     if (builtin.link_libc) {
         var ptr = std.c.environ;
-        while (ptr[0]) |line| : (ptr += 1) {
+        environ: while (ptr[0]) |line| : (ptr += 1) {
             var line_i: usize = 0;
             while (line[line_i] != 0) : (line_i += 1) {
                 if (line_i == key.len) break;
-                if (line[line_i] != key[line_i]) break;
+                if (line[line_i] != key[line_i]) continue :environ;
             }
-            if ((line_i != key.len) or (line[line_i] != '=')) continue;
+            if (line[line_i] != '=') continue;

             return mem.sliceTo(line + line_i + 1, 0);
         }

Typical hyperfine invocation looks like below, running on M3 mac. Different --warmup values don't really change the picture (tried until dozens)

> hyperfine --warmup 3 ./getenv-zig-speedup ./getenv-zig-label1
Benchmark 1: ./getenv-zig-speedup
  Time (mean ± σ):     117.0 ms ±   0.1 ms    [User: 116.5 ms, System: 0.3 ms]
  Range (min … max):   116.9 ms … 117.6 ms    25 runs

Benchmark 2: ./getenv-zig-label1
  Time (mean ± σ):     133.1 ms ±  19.9 ms    [User: 132.5 ms, System: 0.3 ms]
  Range (min … max):   117.0 ms … 177.6 ms    22 runs

Summary
  ./getenv-zig-speedup ran
    1.14 ± 0.17 times faster than ./getenv-zig-label1

I don't think I understand LLVM enough to articulate why exactly this happens, other than that manual control flow isn't trivial to optimize around.

If anyone would like to try and reproduce this effect on other systems, would be much appreciated.

if (!mem.eql(u8, key, this_key)) continue;
while (ptr[line_i] != 0) : (line_i += 1) {
if (line_i == key.len) break;
if (ptr[line_i] != key[line_i]) break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants