Commit 1ecf83e

perlapi: Add extensive strftime documentation
Due to the differences in various systems' implementations, I think it is a good idea to more fully document the vagaries I have discovered, and how perl resolves them.
1 parent 2da6a35 commit 1ecf83e

File tree

2 files changed

+125
-74
lines changed

ext/POSIX/lib/POSIX.pod

Lines changed: 6 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -1866,49 +1866,17 @@ Identical to the string form of C<$!>, see L<perlvar/$ERRNO>.
18661866
=item C<strftime>
18671867

18681868
Convert date and time information to string based on the current
1869-
underlying locale of the program (except for any daylight savings time).
1870-
Returns the string.
1869+
underlying locale of the program.
1870+
Returns the string in a mortalized SV; set to an empty string on error.
18711871

1872-
Synopsis:
1873-
1874-
strftime(fmt, sec, min, hour, mday, mon, year,
1875-
wday = -1, yday = -1, isdst = 0)
1876-
1877-
The month (C<mon>) begins at zero,
1878-
I<e.g.>, January is 0, not 1. The
1879-
year (C<year>) is given in years since 1900, I<e.g.>, the year 1995 is 95; the
1880-
year 2001 is 101. Consult your system's C<strftime()> manpage for details
1881-
about these and the other arguments.
1872+
my $sv = strftime(fmt, sec, min, hour, mday, mon, year,
1873+
wday = -1, yday = -1, isdst = -1)
18821874

18831875
The C<wday> and C<yday> parameters are both ignored. Their values are
18841876
always determinable from the other parameters.
18851877

1886-
C<isdst> should be C<1> or C<0>, depending on whether or not daylight
1887-
savings time is in effect for the given time or not.
1888-
1889-
If you want your code to be portable, your format (C<fmt>) argument
1890-
should use only the conversion specifiers defined by the ANSI C
1891-
standard (C99, to play safe). These are C<aAbBcdHIjmMpSUwWxXyYZ%>.
1892-
But even then, the B<results> of some of the conversion specifiers are
1893-
non-portable. For example, the specifiers C<aAbBcpZ> change according
1894-
to the locale settings of the user, and both how to set locales (the
1895-
locale names) and what output to expect are non-standard.
1896-
The specifier C<c> changes according to the timezone settings of the
1897-
user and the timezone computation rules of the operating system.
1898-
The C<Z> specifier is notoriously unportable since the names of
1899-
timezones are non-standard. Sticking to the numeric specifiers is the
1900-
safest route.
1901-
1902-
The arguments, except for C<isdst>, are made consistent as though by
1903-
calling C<mktime()> before calling your system's C<strftime()> function.
1904-
To get correct results, you must set C<isdst> to be the proper value.
1905-
When omitted, the function assumes daylight savings is not in effect.
1906-
1907-
The string for Tuesday, December 12, 1995 in the C<C> locale.
1908-
1909-
$str = POSIX::strftime( "%A, %B %d, %Y",
1910-
0, 0, 0, 12, 11, 95, 2 );
1911-
print "$str\n";
1878+
More details on the behavior and the specification of the other
1879+
parameters are described in L<perlapi/sv_strftime_ints>.
19121880

19131881
=item C<strlen>
19141882

locale.c

Lines changed: 119 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -8154,11 +8154,12 @@ S_maybe_override_codeset(pTHX_ const char * codeset,
81548154

81558155
/*
81568156
=for apidoc_section $time
8157-
=for apidoc sv_strftime_tm
8158-
=for apidoc_item sv_strftime_ints
8157+
=for apidoc sv_strftime_ints
8158+
=for apidoc_item sv_strftime_tm
81598159
=for apidoc_item my_strftime
81608160
8161-
These implement the libc strftime().
8161+
These implement libc strftime(), overcoming various deficiencies it has; sooner
8162+
or later you will come to regret using it directly instead of these.
81628163
81638164
On failure, they return NULL, and set C<errno> to C<EINVAL>.
81648165
@@ -8167,70 +8168,152 @@ handle the UTF-8ness of the current locale, the input C<fmt>, and the returned
81678168
result. Only if the current C<LC_TIME> locale is a UTF-8 one (and S<C<use
81688169
bytes>> is not in effect) will the result be marked as UTF-8.
81698170
8171+
For these, the caller assumes ownership of the returned SV with a reference
8172+
count of 1.
8173+
81708174
C<my_strftime> is kept for backwards compatibility. Knowing if its result
81718175
should be considered UTF-8 or not requires significant extra logic.
81728176
81738177
Note that all three functions are always executed in the underlying
81748178
C<LC_TIME> locale of the program, giving results based on that locale.
81758179
8176-
The functions differ as follows:
8177-
8178-
C<sv_strftime_tm> takes a pointer to a filled-in S<C<struct tm>> parameter. It
8179-
ignores the values of the C<wday> and C<yday> fields in it. The other fields
8180-
give enough information to accurately calculate these values, and are used for
8181-
that purpose.
8180+
The stringified C<fmt> parameter in all three has the same meaning as in the
8181+
system libc C<strftime>. The available conversion specifications vary by platform. These
8182+
days, every specification listed in the ANSI C99 standard should be usable
8183+
everywhere. These are C<a A b B c d H I j m M p S U w W x X y Y Z %>.
81828184
8183-
The caller assumes ownership of the returned SV with a reference count of 1.
8185+
But note that the B<results> of some of the conversion specifiers are
8186+
non-portable. For example, the specifiers C<a A b B c p Z> change according
8187+
to the locale settings of the user, and both how to set locales (the
8188+
locale names) and what output to expect are not standardized.
8189+
The specifier C<c> changes according to the timezone settings of the
8190+
user and the timezone computation rules of the operating system.
8191+
The C<Z> specifier is notoriously unportable since the names of
8192+
timezones are not standardized. Sticking to the numeric specifiers is the
8193+
safest route.
81848194
8185-
C<sv_strftime_ints> takes a bunch of integer parameters that together
8186-
completely define a given time. It calculates the S<C<struct tm>> to pass to
8187-
libc strftime(), and calls that function.
8195+
At the time of this writing, for example, C<%s> is not available on
8196+
Windows-like systems.
81888197
8189-
The value of C<isdst> is used as follows:
8198+
The functions differ as follows:
81908199
81918200
=over
81928201
8193-
=item 0
8202+
=item *
81948203
8195-
No daylight savings time is in effect
8204+
The C<fmt> parameter and the return from C<my_strftime> are S<C<char *>>
8205+
instead of the S<C<SV *>> in the other two functions. This means the
8206+
UTF-8ness of the format and result are unspecified. The result MUST be
8207+
arranged to be FREED BY THE CALLER.
81968208
8197-
=item E<gt>0
8209+
=item *
81988210
8199-
Check if daylight savings time is in effect, and adjust the results
8200-
accordingly.
8211+
C<sv_strftime_ints> and C<my_strftime> take a bunch of integer parameters that
8212+
together completely define a given time. They calculate the S<C<struct tm>>
8213+
to pass to libc strftime(), and call that function. See below for the meaning
8214+
of the parameters.
82018215
8202-
=item E<lt>0
8216+
C<sv_strftime_tm> takes a pointer to an already filled-in S<C<struct tm>>
8217+
parameter, so avoids that calculation.
82038218
8204-
This value is reserved for internal use by the L<POSIX> module for backwards
8205-
compatibility purposes.
8219+
=item *
82068220
8207-
=back
8221+
C<my_strftime> takes two extra parameters that are ignored, being kept only
8222+
for historical reasons. These are C<wday> and C<yday>.
82088223
8209-
The caller assumes ownership of the returned SV with a reference count of 1.
8224+
=back
82108225
8211-
C<my_strftime> is like C<sv_strftime_ints> except that:
8226+
The C99 Standard calls for S<C<struct tm>> to contain at least these fields:
8227+
8228+
int tm_sec; // seconds after the minute — [0, 60]
8229+
int tm_min; // minutes after the hour — [0, 59]
8230+
int tm_hour; // hours since midnight — [0, 23]
8231+
int tm_mday; // day of the month — [1, 31]
8232+
int tm_mon; // months since January — [0, 11]
8233+
int tm_year; // years since 1900
8234+
int tm_wday; // days since Sunday — [0, 6]
8235+
int tm_yday; // days since January 1 — [0, 365]
8236+
int tm_isdst; // Daylight Saving Time flag
8237+
8238+
C<tm_wday> and C<tm_yday> are output only; the other fields give enough
8239+
information to accurately calculate these, and are internally used for that
8240+
purpose.
8241+
8242+
The numbers enclosed in the square brackets above give the maximum legal
8243+
ranges for values in the corresponding field. Those ranges are restricted for
8244+
some inputs. For example, not all months have 31 days, but all hours have 60
8245+
minutes. If you set a number that is outside the corresponding range, perl
8246+
and the libc functions will automatically normalize it to be inside the range,
8247+
adjusting other values as necessary. For example, specifying February 29 is
8248+
the same as saying March 1 for non-leap years; and using a minute value of 60
8249+
will instead change that to a 0, and increment the hour, which in turn, if the
8250+
hour was 23, will roll it over to 0 and increment the day, and so on.
8251+
8252+
Each parameter to C<sv_strftime_ints> and C<my_strftime> populates the
8253+
similarly-named field in this structure.
8254+
8255+
A value of 60 is legal for C<tm_sec>, but only for those moments when an
8256+
official leap second has been declared. It is undefined behavior to use it
8257+
otherwise, and the behavior does vary depending on the implementation.
8258+
Some implementations take your word for it that this is a leap second, leaving
8259+
it as the 61st second of the given minute; some roll it over to be the 0th
8260+
second of the following minute; some treat it as 0. Some non-conforming
8261+
implementations always roll it over to the next minute, regardless of whether
8262+
an actual leap second is occurring or not. (And yes, it is a real problem
8263+
that different computers have a different conception of what the current time
8264+
is; you can search the internet for details.)
8265+
8266+
There is no limit (outside the size of C<int>) for the value of C<tm_year>,
8267+
but sufficiently negative values (for earlier than 1900) may have different
8268+
results on different systems and locales. Some libc implementations may know
8269+
when a given locale adopted the Gregorian calendar, and adjust for that.
8270+
Others will not. (And some countries didn't adopt the Gregorian calendar
8271+
until after 1900.)
8272+
8273+
The treatment of the C<isdst> field has varied over previous Perl versions,
8274+
and has been buggy (both by perl and by some libc implementations), but is now
8275+
aligned, as best we can, with the POSIX Standard, as follows:
82128276
82138277
=over
82148278
8215-
=item The C<fmt> parameter and the return are S<C<char *>> instead of
8216-
S<C<SV *>>.
8279+
=item C<isdst> is 0
82178280
8218-
This means the UTF-8ness of the result is unspecified. The result MUST be
8219-
arranged to be FREED BY THE CALLER).
8281+
The function is to assume that daylight savings time is not in effect. This
8282+
should now always work properly, as perl uses its own implementation in this
8283+
case, avoiding non-conforming libc ones.
82208284
8221-
=item The C<is_dst> parameter is ignored.
8285+
=item C<isdst> is E<gt>0
82228286
8223-
Daylight savings time is never considered to be in effect.
8287+
The function is to assume that daylight savings time is in effect, though some
8288+
underlying libc implementations treat this as a hint instead of a mandate.
82248289
8225-
=item It has extra parameters C<yday> and C<wday> that are ignored.
8290+
=item C<isdst> is E<lt>0
82268291
8227-
These exist only for historical reasons; the values for the corresponding
8228-
fields in S<C<struct tm>> are calculated from the other arguments.
8292+
The function is to try to calculate for itself whether daylight savings time is in
8293+
effect. More recent libc implementations are better at this than earlier
8294+
ones.
82298295
82308296
=back
82318297
8232-
Note that all three functions are always executed in the underlying C<LC_TIME>
8233-
locale of the program, giving results based on that locale.
8298+
Some libc implementations have extra fields in S<C<struct tm>>. The two that
8299+
perl handles are:
8300+
8301+
int tm_gmtoff; // Seconds East of UTC [%z]
8302+
const char * tm_zone; // Timezone abbreviation [%Z]
8303+
8304+
These are both output only. Using the respective conversion specifications
8305+
(enclosed in the square brackets) in the C<fmt> parameter is a portable way to
8306+
gain access to these values, working both on systems that have and don't have
8307+
these fields.
8308+
8309+
Example, in the C<C> locale:
8310+
8311+
my_strftime( "%A, %B %d, %Y", 0, 0, 0, 12, 11, 95, 0, 0, -1 );
8312+
8313+
returns
8314+
8315+
"Tuesday, December 12, 1995"
8316+
82348317
=cut
82358318
*/
82368319
