Skip to content

<locale>: Repair std::collate<unsigned short> #5361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

muellerj2
Copy link
Contributor

@muellerj2 muellerj2 commented Mar 25, 2025

This repairs collate<unsigned short> by adding two missing unsigned short specializations for _LStrcoll and _LStrxfrm when we're building the STL. Fixes #5236.

There are similar specializations for a few functions in <xlocale>. I guess _LStrcoll and _LStrxfrm were missed because they are located in a different header.

Since collate is the only user of these functions, I also moved them from <xlocinfo> to <locale> just above collate.

As for why this also fixes collate<wchar_t> = collate<unsigned short> under non-native wchar_t when linking to the DLL, this is because the the locale facets are constructed in the DLL here:

STL/stl/src/wlocale.cpp

Lines 58 to 88 in f2a2933

// moved from locale to ease subsetting
using _Tu1 = ctype<unsigned short>;
using _Tu2 = num_get<unsigned short>;
using _Tu3 = num_put<unsigned short>;
using _Tu4 = numpunct<unsigned short>;
using _Tu5 = collate<unsigned short>;
using _Tu6 = messages<unsigned short>;
using _Tu7 = money_get<unsigned short>;
using _Tu8 = money_put<unsigned short>;
using _Tu9 = moneypunct<unsigned short, false>;
using _Tu10 = moneypunct<unsigned short, true>;
using _Tu11 = time_get<unsigned short>;
using _Tu12 = time_put<unsigned short>;
using _Tu13 = codecvt<unsigned short, char, _Mbstatet>;
void __CLRCALL_OR_CDECL locale::_Locimp::_Makeushloc(const _Locinfo& lobj, locale::category cat, _Locimp* ptrimp,
const locale* ptrloc) { // setup wide part of a new locale
ADDFAC(_Tu1, cat, ptrimp, ptrloc);
ADDFAC(_Tu2, cat, ptrimp, ptrloc);
ADDFAC(_Tu3, cat, ptrimp, ptrloc);
ADDFAC(_Tu4, cat, ptrimp, ptrloc);
ADDFAC(_Tu5, cat, ptrimp, ptrloc);
ADDFAC(_Tu6, cat, ptrimp, ptrloc);
ADDFAC(_Tu7, cat, ptrimp, ptrloc);
ADDFAC(_Tu8, cat, ptrimp, ptrloc);
ADDFAC(_Tu9, cat, ptrimp, ptrloc);
ADDFAC(_Tu10, cat, ptrimp, ptrloc);
ADDFAC(_Tu11, cat, ptrimp, ptrloc);
ADDFAC(_Tu12, cat, ptrimp, ptrloc);
ADDFAC(_Tu13, cat, ptrimp, ptrloc);
}

Thus, collate's virtual functions originate from the DLL as well. These virtual functions call _LStrcoll and _LStrxfrm, but the specializations for unsigned short were missing since the DLL is built with native wchar_t.

In the new test, we have to skip collate::transform() tests when there is IDL mismatch between TU and linked DLL.

@muellerj2 muellerj2 requested a review from a team as a code owner March 25, 2025 18:48
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Mar 25, 2025
@StephanTLavavej StephanTLavavej added the bug Something isn't working label Mar 25, 2025
@StephanTLavavej StephanTLavavej self-assigned this Mar 25, 2025
@StephanTLavavej StephanTLavavej removed their assignment Apr 16, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Apr 16, 2025
@StephanTLavavej
Copy link
Member

Thanks! I updated the product code's preprocessor logic, pushed significant changes to the test, and updated the PR description accordingly - please meow if I messed something up.

The general principle that we now follow (although our legacy codebase didn't) is that while /Zc:wchar_t- should work (fake wchar_t), normal users of /Zc:wchar_t (real wchar_t) should not experience unsigned short as anything special. Only the STL's separately compiled code has to worry about real and fake wchar_t simultaneously.

@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Apr 22, 2025
@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit 42cee97 into microsoft:main Apr 22, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Apr 22, 2025
@StephanTLavavej
Copy link
Member

Thanks for fixing the world's best library with the world's worst compiler option! 😹 💚 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

<locale>: std::collate<wchar_t> does not respect collation order when compiled with /MD(d) /Zc:wchar_t-
2 participants