Perl 5.40 on Solaris can not parse setlocale(3C) output #23195

Closed
vlmarek opened this issue Apr 14, 2025 · 3 comments

@vlmarek
Contributor

vlmarek commented Apr 14, 2025

Description

Perl 5.40 built on recent Solaris 11 reports an error:

locale.c: 3407: panic: Can't change locale for LC_ALL (6) from '/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/C' to '/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/cs_CZ.UTF-8/C' 
Called via locale.c: 8561; errno=22
Called by locale.c: 9232

This happens only when composite locales are reported by setlocale(3C). In my case:

$ locale
LANG=cs_CZ.UTF-8
LC_CTYPE=cs_CZ.UTF-8
LC_NUMERIC=cs_CZ.UTF-8
LC_TIME=cs_CZ.UTF-8
LC_COLLATE=cs_CZ.UTF-8
LC_MONETARY=cs_CZ.UTF-8
LC_MESSAGES=C
LC_ALL=

The problem is that setlocale(3C) reports a slash at the beginning. Here is an excerpt from the man page:

       A  null  pointer  for  locale  directs setlocale() to query the current
       global locale setting and return a pointer  to  the  string  associated
       with  the  category  for  the current global locale. If the category is
       LC_ALL and the current global locale is a composite locale, the  string
       will  have  locale names for LC_CTYPE, LC_NUMERIC, LC_TIME, LC_COLLATE,
       LC_MONETARY, and LC_MESSAGES categories, and in  that  order,  concate-
       nated  together  where  each  category's locale name is prefixed with a
       slash ('/' or 0x2F) character, for instance,  "/en_US.UTF-8/C/C/C/C/C".
       Otherwise, the string will have a locale name such as "en_US.UTF-8".

The function S_parse_LC_ALL_string does not expect that slash and thinks there is an additional (empty) locale definition at the start of the string. It then exits with an error because the number of locales detected in the string no longer matches the expected count.

Steps to Reproduce
Compile on Solaris and force setlocale(3C) to report a composite locale, for example via:

export LC_CTYPE=cs_CZ.UTF-8
export LC_NUMERIC=cs_CZ.UTF-8
export LC_TIME=cs_CZ.UTF-8
export LC_COLLATE=cs_CZ.UTF-8
export LC_MONETARY=cs_CZ.UTF-8
export LC_MESSAGES=C

Run perl -e 0.

Expected behavior
perl exits cleanly without any error.

Perl configuration

$ /usr/perl5/5.40/bin/perl -V
Summary of my perl5 (revision 5 version 40 subversion 1) configuration:
   
  Platform:
    osname=solaris
    osvers=2.11
    archname=i86pc-solaris-thread-multi-64
    uname='sunos ulx-0 5.11 11.4.81.193.0 i86pc i386 i86pc non-virtualized '
    config_args='-de -Dmksymlinks -Ulocincpth= -Dbin=/usr/perl5/5.40/bin -Dcc=gcc -Dcf_by=perl-bugs [email protected] -Dlibperl=libperl.so -Duseshrplib -Dusedtrace -Duse64bitall -Dusethreads -Dmyhostname=localhost -Dmydomain=foobar.example.org -Dprefix=/usr/perl5/5.40 -Dprivlib=/usr/perl5/5.40/lib -Dsitelib=/usr/perl5/site_perl/5.40 -Dsiteprefix=/usr/perl5/5.40 -Dvendorlib=/usr/perl5/vendor_perl/5.40 -Dvendorprefix=/usr/perl5/5.40 -Dlibpth=/lib/64 /usr/lib/64 -Doptimize=-O3   '
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='gcc'
    ccflags ='-D_REENTRANT -m64 -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE64_SOURCE -D_FORTIFY_SOURCE=2'
    optimize='-O3 '
    cppflags='-D_REENTRANT -m64 -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong'
    ccversion=''
    gccversion='14.2.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='gcc'
    ldflags =' -m64 -fstack-protector-strong -L/usr/gnu/lib '
    libpth=/lib/64 /usr/lib/64 /usr/gcc/14/lib /usr/lib /usr/gnu/lib /usr/ccs/lib
    libs=-lpthread -lsocket -lnsl -lgdbm -ldb -ldl -lm -lc
    perllibs=-lpthread -lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='  -R /usr/perl5/5.40/lib/i86pc-solaris-thread-multi-64/CORE'
    cccdlflags='-fPIC'
    lddlflags=' -shared -m64 -L/usr/gnu/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under solaris
  Compiled at Apr  8 2025 23:48:26
  @INC:
    /usr/perl5/site_perl/5.40/i86pc-solaris-thread-multi-64
    /usr/perl5/site_perl/5.40
    /usr/perl5/vendor_perl/5.40/i86pc-solaris-thread-multi-64
    /usr/perl5/vendor_perl/5.40
    /usr/perl5/5.40/lib/i86pc-solaris-thread-multi-64
    /usr/perl5/5.40/lib
@vlmarek
Contributor Author

vlmarek commented Apr 14, 2025

I seem to have problems uploading a patch which fixes the issue, so I'll just paste it here.

--- perl-5.40.1/locale.c
+++ perl-5.40.1/locale.c
@@ -1303,6 +1303,17 @@ S_parse_LC_ALL_string(pTHX_ const char *
 
     Size_t index;           /* Our internal index for the current category */
     const char * s = string;
+
+#if defined(__sun__)
+    /* Solaris setlocale(3C) returns composite locale prefixed by slash. For example
+     * "/en_US.UTF-8/C/C/C/C/C". See man page. We must remove it or this
+     * function will think that there is additional empty locale at the
+     * beginning of the string and the number of detected locales will not
+     * match expected LC_ALL_INDEX_. */
+    if (s == instr(s, separator)) {
+        s += separator_len;
+    }
+#endif
     const char * e = s + strlen(string);
     const char * category_end = NULL;
     const char * saved_first = NULL;

I can create a pull request if you are inclined to accept it.

Thank you

@khwilliamson
Contributor

Thank you for your comprehensive analysis.

Yes, I'd be inclined to accept the patch. The one thing I might want to change is to not do this just for Solaris.

Locale names are documented as being opaque. It would be legal for libc to spell them entirely with control characters. That said, it is a basic human desire to name things. POSIX tried not to provide a way of finding the current locale, but was eventually forced to add that capability. (Although the positional notation used by Solaris and others is close to not being human readable, since you have to have memorized, or look up, which position means which category.)

So parsing a locale is fraught with the possibility of getting it wrong. Yet all systems known to me use either the name=value form or the positional form. This leading slash is an additional wrinkle. My guess is that a leading separator string on any system is more likely to mean what it does on Solaris than to indicate an empty category (which would be illegal, since there always has to be a locale). Therefore, I'd be inclined to pull out the #ifdef.

@khwilliamson
Contributor

This was fixed by #23199
