clang-cl /EHa with MSVC std::variant emits linker error LNK2019: unresolved external symbol #93251

Closed
StephanTLavavej opened this issue May 23, 2024 · 29 comments · Fixed by #128866
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. needs-reduction Large reproducer that should be reduced into a simpler form platform:windows

Comments

@StephanTLavavej
Member

Repros with Clang 17.0.3 and VS 2022 17.11 Preview 1. MSVC accepts, but Clang rejects.

C:\Temp>"C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Auxiliary\Build\vcvarsall.bat" x64
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.11.0-pre.1.0
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'

C:\Temp>clang-cl -v
clang version 17.0.3
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\Llvm\x64\bin

C:\Temp>type meow.cpp
#include <variant>
using namespace std;

struct UDT {
    UDT() {}
    ~UDT() {}
};

int main() {
    using V = variant<int, double, UDT>;
    [[maybe_unused]] V a{1729};
    [[maybe_unused]] V b{3.14};
    [[maybe_unused]] V c{UDT{}};
}
C:\Temp>cl /EHsc /nologo /W4 /std:c++17 /MTd /Od meow.cpp
meow.cpp

C:\Temp>cl /EHa /nologo /W4 /std:c++17 /MTd /Od meow.cpp
meow.cpp

C:\Temp>clang-cl /EHsc /nologo /W4 /std:c++17 /MTd /Od meow.cpp

C:\Temp>clang-cl /EHa /nologo /W4 /std:c++17 /MTd /Od meow.cpp
meow-aeb2d1.obj : error LNK2019: unresolved external symbol "public: __cdecl std::_Variant_storage_<0,double,struct UDT>::~_Variant_storage_<0,double,struct UDT>(void)" (??1?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@XZ) referenced in function "int `public: __cdecl std::_Variant_storage_<0,int,double,struct UDT>::_Variant_storage_<0,int,double,struct UDT><1,double,0>(struct _Variant_storage_<0,int,double,struct UDT>::integral_constant<unsigned __int64,1>,double &&)'::`1'::dtor$3" (?dtor$3@?0???$?0$00N$0A@@?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@U?$integral_constant@_K$00@1@$$QEAN@Z@4HA)
meow-aeb2d1.obj : error LNK2019: unresolved external symbol "public: __cdecl std::_Variant_storage_<0,struct UDT>::~_Variant_storage_<0,struct UDT>(void)" (??1?$_Variant_storage_@$0A@UUDT@@@std@@QEAA@XZ) referenced in function "int `public: __cdecl std::_Variant_storage_<0,double,struct UDT>::_Variant_storage_<0,double,struct UDT><1,struct UDT,0>(struct _Variant_storage_<0,double,struct UDT>::integral_constant<unsigned __int64,1>,struct UDT &&)'::`1'::dtor$3" (?dtor$3@?0???$?0$00UUDT@@$0A@@?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@U?$integral_constant@_K$00@1@$$QEAUUDT@@@Z@4HA)
meow.exe : fatal error LNK1120: 2 unresolved externals
clang-cl: error: linker command failed with exit code 1120 (use -v to see invocation)

Reduced from the original user report DevCom-10647850 "Linking bug of std::unordered_map with std::variant Value Type with latest MS Visual Studio Community 2022".

@EugeneZelenko EugeneZelenko added clang-cl `clang-cl` driver. Don't use for other compiler parts and removed new issue labels May 23, 2024
@jneuhaus20

Has anyone had any luck working around this? I have this exact same problem in a larger codebase. We're converting to clang to move towards Linux, so we'll have to wean ourselves off of SEH eventually, but it'd be real nice to keep it on Windows for the transition.

@Endilll Endilll added clang Clang issues not falling into any other category platform:windows and removed clang-cl `clang-cl` driver. Don't use for other compiler parts labels Jul 31, 2024
@phoebewang
Contributor

I did an experiment. With only [[maybe_unused]] V a{1729};, we can find a

define linkonce_odr dso_local void @"??1?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@XZ" ...

in dumped IR.
With

    [[maybe_unused]] V a{1729};
    [[maybe_unused]] V b{3.14};

There are

define linkonce_odr dso_local void @"??1?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@XZ" ...
declare dso_local void @"??1?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@XZ" ...

Here, the declare of "??1?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@XZ" causes the unresolved external error.
In the original case, there are

define linkonce_odr dso_local void @"??1?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@XZ" ...
declare dso_local void @"??1?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@XZ" ...
declare dso_local void @"??1?$_Variant_storage_@$0A@UUDT@@@std@@QEAA@XZ" ...

The destructor

"??1?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@XZ"

is called once by

"??1?$_Variant_base@HNUUDT@@@std@@QEAA@XZ"

in IR built with /EHsc. But with /EHa, it's called 4 more times in ehcleanup by

"??1?$_Variant_base@HNUUDT@@@std@@QEAA@XZ"
"??$?0$0A@H$0A@@?$_Variant_base@HNUUDT@@@std@@QEAA@U?$in_place_index_t@$0A@@1@$$QEAH@Z"
"??$?0$00N$0A@@?$_Variant_base@HNUUDT@@@std@@QEAA@U?$in_place_index_t@$00@1@$$QEAN@Z"
"??$?0$01UUDT@@$0A@@?$_Variant_base@HNUUDT@@@std@@QEAA@U?$in_place_index_t@$01@1@$$QEAUUDT@@@Z"

Similarly, in the /EHa case

"??1?$_Variant_storage_@$0A@NUUDT@@@std@@QEAA@XZ"

is called twice by

"??$?0$00N$0A@@?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@U?$integral_constant@_K$00@1@$$QEAN@Z"
"??$?0$01UUDT@@$0A@@?$_Variant_storage_@$0A@HNUUDT@@@std@@QEAA@U?$integral_constant@_K$01@1@$$QEAUUDT@@@Z"

in ehcleanup block only.

I think this explains the difference between define and declare, i.e., Clang won't generate the function definition if it's only used by ehcleanup. It is true even if you specify -Xclang -disable-llvm-passes. So I believe it is a Clang front end issue.

CC @AaronBallman

@AaronBallman AaronBallman added the clang:codegen IR generation bugs: mangling, exceptions, etc. label Dec 6, 2024
@AaronBallman
Collaborator

I think this explains the difference between define and declare, i.e., Clang won't generate the function definition if it's only used by ehcleanup. It is true even if you specify -Xclang -disable-llvm-passes.

Thank you for the investigation!

So I believe it is a Clang front end issue.

CC @rjmccall @efriedma-quic @rnk for awareness (or to see if you have any bandwidth for addressing this one, as it seems pretty important)

@EugeneZelenko EugeneZelenko removed the clang Clang issues not falling into any other category label Dec 6, 2024
@efriedma-quic
Collaborator

At first glance, this doesn't really make sense to me. The only way I can see that we wouldn't emit the definition of a destructor is if the destructor isn't odr-used. I have no idea how you could declare a variable in a way that would make CodeGen try to call a destructor that isn't odr-used. (If the destructor is in fact odr-used, some core infrastructure would have to be very fundamentally broken to avoid triggering deferred emission.)

If someone could reduce a testcase that doesn't require including all of std::variant, that would be helpful.

@shafik shafik added the needs-reduction Large reproducer that should be reduced into a simpler form label Dec 6, 2024
@AaronBallman
Collaborator

CC @Endilll for help with reducing, in case he's got bandwidth

@momo5502
Contributor

I am in need of a fix. Is anyone actively working on this right now? If not, I can maybe have a look at it to see if I can fix it.

@Endilll
Contributor

Endilll commented Feb 24, 2025

I am in need of a fix. Is anyone actively working on this right now? If not, I can maybe have a look at it to see if I can fix it.

Chances are no one does, so go ahead and submit a PR!

@efriedma-quic
Collaborator

I don't have time to look at this, but I'm happy to answer questions.

@momo5502
Contributor

momo5502 commented Feb 25, 2025

I started digging into the issue. First of all, the destructor exists in the source:

Image

The fact that it's missing is therefore not related to that.

I took a look at the IR code and it looks a bit weird:

Image

The destructor call is needed for the SEH scope, that is emitted due to /EHa. However, the scope is empty. So, at least to my understanding, the destructor would never be called, as there is nothing within that scope that can ever throw.

The scope is generated for this constructor call:

Image

The scope begin is emitted, then the constructor body (which is an empty compound statement representing the red circle), and then the scope ends.

I think, ideally, the seh scope should not be generated, if the body of the scope is empty. However, I don't think it's that easy to do.

I also checked the AST. It seems that no body is attached to the destructor declaration. However, I'm not very familiar with the clang frontend, so it might take a while for me to understand what's going on there. As the destructor exists in source, I assume it also generates a body for it. However it probably knows the destructor is not needed and thus discards the body at some point. I have not been able to confirm that yet, though.

In the end, there are multiple ways to fix this:

  1. instruct the frontend to generate and retain the destructor
  2. prevent generating empty SEH scopes
  3. remove empty SEH scopes in the backend

Option 2 technically sounds like the cleanest, however I'm not sure if I have enough knowledge to implement that.
Option 3 sounds like a last resort. I think I know how to do that, but option 1 is probably my way to go here.

I guess I'll further look into the AST generation to try to understand what's the deal with the destructor here.

@phoebewang
Contributor

I'd prefer not to create the problem at the beginning. I noticed the function body has noexcept on it. Is it possible to start from not generating SEH scopes for noexcept?

@momo5502
Contributor

I'd prefer not to create the problem at the beginning. I noticed the function body has noexcept on it. Is it possible to start from not generating SEH scopes for noexcept?

That's a good idea. I will try to implement that tomorrow

@efriedma-quic
Collaborator

It seems that no body is attached to the destructor declaration

That would be because it's not considered "used", and therefore isn't instantiated.

However, the scope is empty. So, at least to my understanding, the destructor would never be called, as there is nothing within that scope that can ever throw.

Sema doesn't reason about that sort of thing.


My best guess is that MarkBaseAndMemberDestructorsReferenced calls MarkFunctionReferenced, but somehow it's not considered an odr-use.

@efriedma-quic
Collaborator

I'd prefer not to create the problem at the beginning. I noticed the function body has noexcept on it. Is it possible to start from not generating SEH scopes for noexcept?

That seems like a complicated optimization. And I'm not sure it really fixes the underlying issue.

@momo5502
Contributor

momo5502 commented Feb 26, 2025

My best guess is that MarkBaseAndMemberDestructorsReferenced calls MarkFunctionReferenced, but somehow it's not considered an odr-use.

Yes, that's exactly what's happening. I adjusted NeedDefinition within MarkFunctionReferenced to always generate definitions for destructors with /EHa and the error was "fixed":

Image

However, that does not feel like a proper fix to me and I'm certain this causes way more instantiations than necessary. But I lack the knowledge to confirm that.

I'd prefer not to create the problem at the beginning. I noticed the function body has noexcept on it. Is it possible to start from not generating SEH scopes for noexcept?

That seems like a complicated optimization. And I'm not sure it really fixes the underlying issue.

I'm not sure if it's that complicated. I think I almost got it. There's only an assertion throwing somewhere and I need to figure out why. I'll try to finish this fix and then both of you can judge whether it's good or not 😂

I think, to really be able to judge whether this is a proper fix or not, it would be good to have a reduced sample that triggers this issue without all the bloat std::variant brings. However, I have not managed to reduce the sample yet.

@momo5502
Contributor

momo5502 commented Feb 26, 2025

I created a PR that stops generating SEH scopes for noexcept functions. It fixes the sample @StephanTLavavej provided.
However, @8051Enthusiast provided me with a drastically reduced sample that still triggers the bug:

template <class...> struct VS {};
template <class _First, class... _Rest> struct VS<_First, _Rest...> {
  union {
    VS<_Rest...> _Tail;
  };
  ~VS() { /* empty */ }
  VS(long) {};
  VS(short) : _Tail(long{}) {}
};
int main() { VS<int, int>(short{}); }

On top of that, I'm not sure if, in general, noexcept functions should not have SEH scopes, as I'm sure noexcept does not account for potential memory corruptions that one might want to catch using EHa.

It seems that not even the NeededForEHa fix works here, as MarkFunctionReferenced is not called at all for the destructor in question.

@phoebewang
Contributor

On top of that, I'm not sure if, in general, noexcept functions should not have SEH scopes, as I'm sure noexcept does not account for potential memory corruptions that one might want to catch using EHa.

I think the concern here makes sense. According to LangRef, the safe way should be to check if there's a C++ object in the scope and the object must have a non-trivial destructor.

@momo5502
Contributor

momo5502 commented Feb 26, 2025

On top of that, I'm not sure if, in general, noexcept functions should not have SEH scopes, as I'm sure noexcept does not account for potential memory corruptions that one might want to catch using EHa.

I think the concern here makes sense. According to LangRef, the safe way should be to check if there's a C++ object in the scope and the object must have a non-trivial destructor.

I think it's already done this way, isn't it? We have an object in scope and the object has a non-trivial destructor. Namely the one that has no definition.

From what I can see, SEH scopes are always generated, whether necessary or not. As Eli mentioned, Sema does not reason about that, which explains why they are generated.

What's interesting is that a lot of SEH scopes are useless in the initial sample, not only the one causing problems. Even with optimizations enabled, these scopes do not get removed:

Image

Would anyone be opposed to me implementing cleanup logic somewhere (maybe at the end of the optimization or in the backend) to just get rid of those unnecessary scopes?

I think this might also be a fix/workaround for this issue, unless there is a way the destructor is not instantiated in a case where SEH scopes are needed. However, it's hard for me to judge if this can occur.

@momo5502
Contributor

momo5502 commented Feb 26, 2025

Ok, I have a sample that triggers this bug, even without /EHa, so while cleaning up SEH scopes might be good, it is not a fix for this bug.

void somefunc();

template <class...> struct VS {};
template <class _First, class... _Rest> struct VS<_First, _Rest...> {
  union {
    VS<_Rest...> _Tail;
  };
  ~VS() { /* empty */ }
  VS(long) {};
  VS(short) : _Tail(long{}) { somefunc(); }
};
int main() { VS<int, int>(short{}); }

somefunc may raise a regular exception and thus creates the need for the destructor instantiation, even without EHa.
However, it's not instantiated:

Image

I guess this makes this purely a frontend issue. I'm not sure if I have the expertise to fix this.
Removing the union fixes this by the way. So it's definitely related to that.
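
For comparison, a minimal sketch of the non-union case, where the behavior is not in question: once a member is fully constructed, an exception leaving the constructor body destroys it during unwinding, so the member's destructor must exist. The names Tail, Holder, and tail_dtor_calls_after_throw are hypothetical and only illustrate the point; they come neither from the STL nor from Clang.

```cpp
#include <stdexcept>

// Hypothetical names for illustration only.
static int g_tail_dtor_calls = 0;

struct Tail {
    ~Tail() { ++g_tail_dtor_calls; }
};

struct Holder {
    Tail tail;  // plain member, no anonymous union involved
    Holder() : tail() {
        // tail is fully constructed at this point, so unwinding must destroy it.
        throw std::runtime_error("constructor body failed");
    }
    ~Holder() {}
};

// Returns how often ~Tail ran while the exception left Holder's constructor.
int tail_dtor_calls_after_throw() {
    g_tail_dtor_calls = 0;
    try {
        Holder h;
    } catch (const std::runtime_error&) {
        // ~Holder does not run here: the Holder object was never fully constructed.
    }
    return g_tail_dtor_calls;
}
```

Calling tail_dtor_calls_after_throw() should report exactly one ~Tail call. The bug discussed in this thread is that the equivalent cleanup for a member of an anonymous union refers to a destructor that was never instantiated.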

@phoebewang
Contributor

Oh, I mistook a non-trivial destructor for an empty definition. So I agree that in the /EHa case we should remove empty scopes, given that no exception can be triggered by an empty scope. But I want to make sure whether the scope is empty at the time the scope intrinsics are generated or only becomes empty after optimization. If it is the former, we should not generate it in the first place; if it is the latter, we should disable optimizations within the scope, because there might be some memory operations moved out/eliminated, which is not expected with /EHa.

@momo5502
Contributor

there might be some memory operations moved out/eliminated, which is not expected with /EHa.

Good point. I didn't check that.

However, I have a potential fix. I'm not sure if it is valid. I'm essentially marking union members as referenced:

Image

@momo5502
Contributor

momo5502 commented Feb 26, 2025

It seems that my PR breaks tests. The destructor is instantiated in too many cases. Seems like checking whether the constructor may throw (or if EHa is enabled) is required. GCC does this correctly and only instantiates the destructor when the constructor actually throws.

I can try to have a look at how that could be done tomorrow, but I'm afraid I lack the frontend knowledge to achieve that on my own.

I've seen @zygoloid's name pop up at a few places that were related to unions and destructor instantiation. Maybe you can provide more input :D

@yurybura

@momo5502 Thanks for your investigation.
The following issue may also be related to the union's destructor instantiation #81774.

@efriedma-quic
Collaborator

Maybe something is forgetting to push an ExpressionEvaluationContext? With the right context, isOdrUseContext should return OdrUseContext::Used, so MarkFunctionReferenced should do the right thing, I think.

@momo5502
Contributor

Maybe something is forgetting to push an ExpressionEvaluationContext? With the right context, isOdrUseContext should return OdrUseContext::Used, so MarkFunctionReferenced should do the right thing, I think.

In the reduced sample, MarkFunctionReferenced is not called for the destructor in question.

@efriedma-quic
Collaborator

Okay, I see what's happening.

CodeGenFunction::EmitInitializerForField emits, for each initialized field, a cleanup for that field.

Sema::SetCtorInitializers, on the other hand, just calls MarkBaseAndMemberDestructorsReferenced, which iterates over the bases/fields of the class. Which is the same in most cases, but not for an anonymous struct/union: MarkBaseAndMemberDestructorsReferenced can't look inside to find the actual field that's getting initialized. Sema::SetCtorInitializers needs to iterate over the baseOrMemberInitializers instead of the CXXRecordDecl.
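
The mismatch described above can be sketched with a toy model. This is not Clang code: Field, Record, dtors_from_fields, dtors_from_initializers, and make_vs_model are hypothetical stand-ins, and the model assumes that the field walk skips anonymous unions, as the explanation above suggests.

```cpp
#include <string>
#include <vector>

// Toy model only. Field and Record are hypothetical, not Clang's AST classes.
struct Field {
    std::string name;
    bool needs_dtor;             // class type whose destructor must exist
    bool is_anonymous_union;
    std::vector<Field> members;  // populated only for an anonymous union
};

struct Record {
    std::vector<Field> fields;             // direct fields, in declaration order
    std::vector<std::string> initialized;  // fields named by the mem-initializers
};

// Strategy A: walk the direct fields. The anonymous union is skipped,
// so a field nested inside it is never seen.
std::vector<std::string> dtors_from_fields(const Record& r) {
    std::vector<std::string> out;
    for (const Field& f : r.fields) {
        if (f.is_anonymous_union) continue;
        if (f.needs_dtor) out.push_back(f.name);
    }
    return out;
}

// Finds a field by name, looking inside anonymous unions as well.
const Field* find_field(const std::vector<Field>& fields, const std::string& name) {
    for (const Field& f : fields) {
        if (f.is_anonymous_union) {
            if (const Field* hit = find_field(f.members, name)) return hit;
        } else if (f.name == name) {
            return &f;
        }
    }
    return nullptr;
}

// Strategy B: walk the initializers, which name the fields that actually get constructed.
std::vector<std::string> dtors_from_initializers(const Record& r) {
    std::vector<std::string> out;
    for (const std::string& name : r.initialized) {
        const Field* f = find_field(r.fields, name);
        if (f && f->needs_dtor) out.push_back(f->name);
    }
    return out;
}

// Models: struct VS { union { VSX<int> _Tail; }; VS(short) : _Tail() {} };
Record make_vs_model() {
    Field tail{"_Tail", true, false, {}};
    Field anon{"", false, true, {tail}};
    return Record{{anon}, {"_Tail"}};
}
```

In this model, the field walk finds no destructor to mark for the VS record, while the initializer walk finds _Tail, which is the field that code generation would emit a cleanup for.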

@efriedma-quic
Collaborator

Cleaned up testcase; should produce an error, but no error with clang:

template <class T> struct VSX {
  ~VSX() { static_assert(sizeof(T) != 4); }
};
struct VS {
  union {
    VSX<int> _Tail;
  };
  ~VS() { }
  VS(short);
};
VS::VS(short) : _Tail() { }
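
The testcase above relies on a general rule: a member function of a class template, destructors included, is instantiated only when it is used, so a dependent static_assert in its body fires only at that point. A small sketch of that rule, using an ordinary member function and hypothetical names (Probe, check, use_without_check, use_with_check):

```cpp
template <class T> struct Probe {
    int value = 0;
    // The condition depends on T, so it is checked only when check()
    // is instantiated for a concrete T.
    void check() const {
        static_assert(sizeof(T) != 1, "check() was instantiated for a 1-byte T");
    }
};

// Probe<char> is instantiated, but Probe<char>::check is never used,
// so its body and the failing static_assert are never instantiated.
int use_without_check() {
    Probe<char> p;
    return p.value;
}

// Probe<int>::check is used and therefore instantiated; the assertion
// holds because int is larger than one byte on common platforms.
int use_with_check() {
    Probe<int> p;
    p.check();
    return p.value;
}
```

Adding a call to check() on a Probe<char> would make the translation unit ill-formed. By the same reasoning, the static_assert in ~VSX<int> should fire once the destructor is treated as used by VS::VS(short), which is why the absence of an error points at a missing use.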

@momo5502
Contributor

momo5502 commented Feb 27, 2025

Sema::SetCtorInitializers needs to iterate over the baseOrMemberInitializers instead of the CXXRecordDecl.

Thanks for the hint. I did that and this seems to be almost it. A few tests are still failing. And I think the reason is because the destructors must be instantiated only if the constructor body may throw, or if EHa is enabled.

I don't know how to determine if the body may throw. Is that even possible at this point? Has the body already been generated when SetCtorInitializers is called? Do I maybe have to defer the destructor instantiation to somewhere within ActOnFinishFunctionBody to be able to reason about the body?

Or is that even necessary? The sample you provided, @efriedma-quic, does not throw and still needs destructor instantiation. So maybe the code is right and the tests need to be adjusted?

@momo5502
Contributor

I think I figured it out. The existing tests are all passing, while all the samples from here are fixed. I will clean up my PR and add a test case, then you can destroy me 😂
#128866

cor3ntin pushed a commit that referenced this issue Mar 13, 2025
Initializing fields, that are part of an anonymous union, in a
constructor, requires their destructors to be instantiated.

In general, initialized members within non-delegating constructors, need
their destructor instantiated.

This fixes #93251
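
A possible reading of why the commit message singles out non-delegating constructors, sketched under standard C++ rules: when the body of a delegating constructor throws, the object is already complete, so the object's own destructor runs instead of per-member cleanups. The names Owner and owner_dtor_calls_after_delegating_throw are hypothetical.

```cpp
#include <stdexcept>

// Hypothetical names for illustration only.
static int g_owner_dtor_calls = 0;

struct Owner {
    // Target constructor: completes normally.
    explicit Owner(int) {}
    // Delegating constructor: the object is complete once Owner(int) returns,
    // so an exception thrown from this body runs ~Owner.
    Owner() : Owner(0) { throw std::runtime_error("delegating body failed"); }
    ~Owner() { ++g_owner_dtor_calls; }
};

// Returns how often ~Owner ran while the exception left the delegating constructor.
int owner_dtor_calls_after_delegating_throw() {
    g_owner_dtor_calls = 0;
    try {
        Owner o;
    } catch (const std::runtime_error&) {
    }
    return g_owner_dtor_calls;
}
```

Here ~Owner runs once, so the delegating constructor itself needs no separate cleanup for individual members.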
frederik-h pushed a commit to frederik-h/llvm-project that referenced this issue Mar 18, 2025