[pull] main from llvm:main#5765

Open
pull[bot] wants to merge 195 commits into Ericsson:main from llvm:main

pull[bot] commented May 8, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


NewSigma and others added 30 commits May 8, 2026 15:18
Previously, calling a function with a host/device mismatch inside a
discarded `if constexpr` branch triggered an error. This patch recognizes
that discarded statements are never instantiated and allows such code.
[libc] Include CPU model in overlay CI sccache key

The overlay CI compiles opt_host memory tests with `-march=native`,
which generates object files specific to the runner CPU model. sccache
treats `-march=native` as a literal string in its hash key, so cached
`.o` files compiled on one CPU model get served to runners with a
different CPU. When the cached binary uses instructions the current CPU
lacks, the test crashes with SIGILL.

## Symptoms

The `memcmp_opt_host`, `memmove_opt_host`, `memset_opt_host`,
`bcmp_opt_host`, and `bzero_opt_host` tests crash when SIMD code paths
are first exercised. Simple tests like `CmpZeroByte` pass because they
use small sizes that do not enter SIMD routines. The failures are fully
reproducible on reruns because the cache stays poisoned.

## Evidence

Three consecutive runs of the same fwide PR (#196157), same code:

| Run | Azure Region | Cache Hits | Cache Misses | Result |
|-----|-------------|-----------|-------------|--------|
| [25512875679](https://github.com/llvm/llvm-project/actions/runs/25512875679/job/74876008545) | westus3 | 9 | 5354 | PASS |
| [25524024922](https://github.com/llvm/llvm-project/actions/runs/25524024922/job/74916241365) | northcentralus | 5345 | 0 | CRASH |
| [25524839613](https://github.com/llvm/llvm-project/actions/runs/25524839613/job/74965830435) | westus | 5345 | 0 | CRASH |

The first run had a nearly empty cache and compiled everything locally
(0.17% hit rate). An intermediate [syscall-unistd
run](https://github.com/llvm/llvm-project/actions/runs/25517783708/job/74893495220)
in eastus then populated the cache with object files compiled for that
region's CPU. Subsequent runs on different hardware got 100% cache hits
and crashed because the cached `.o` files use instructions their CPUs
lack.

## Fix

Added a "Detect CPU model" step that reads the CPU model string from
`/proc/cpuinfo` (Linux) or `sysctl` (macOS) and appends it to the
sccache cache key. Runners with different CPUs now get separate cache
buckets.
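As a rough illustration of the idea (not the actual workflow step, which is not shown here), the sketch below reads a CPU model string and folds it into a cache key. The helper names `parse_cpuinfo_model`, `detect_cpu_model`, and `cache_key` are hypothetical, as are the choice of the `model name` field, the `machdep.cpu.brand_string` sysctl key, and hashing the string with SHA-256 to keep the key short.

```python
import hashlib
import platform
import subprocess


def parse_cpuinfo_model(text):
    """Return the first 'model name' value from /proc/cpuinfo-style text."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None  # some architectures do not expose this field


def detect_cpu_model():
    """Best-effort CPU model string; falls back to 'unknown'."""
    system = platform.system()
    if system == "Linux":
        try:
            with open("/proc/cpuinfo", encoding="utf-8") as f:
                model = parse_cpuinfo_model(f.read())
        except OSError:
            model = None
        return model or "unknown"
    if system == "Darwin":
        try:
            # Assumption: this sysctl key is the one of interest on macOS.
            out = subprocess.run(
                ["sysctl", "-n", "machdep.cpu.brand_string"],
                capture_output=True, text=True, check=True,
            )
            return out.stdout.strip() or "unknown"
        except (OSError, subprocess.CalledProcessError):
            return "unknown"
    return "unknown"


def cache_key(base, cpu_model):
    """Append a short digest of the CPU model to a base cache key."""
    digest = hashlib.sha256(cpu_model.encode("utf-8")).hexdigest()[:12]
    return f"{base}-{digest}"
```

With a key built this way, two runners that report different model strings never share a bucket, so `-march=native` objects built on one CPU are not served to another.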

Assisted-by: Automated tooling, human reviewed.
…192041)

libclc emits a configure warning on Windows:
  clang: error: no such file or directory: '/clang:--target=amdgcn-amd-amdhsa-llvm'
  clang: error: no such file or directory: '/clang:-print-target-triple'
  CMake Warning at CMakeLists.txt:239 (message):
    Failed to execute `llvm-project/build/bin/clang.exe
    /clang:--target=amdgcn-amd-amdhsa-llvm /clang:-print-target-triple` to
    normalize target triple.

Switch to checking CMAKE_C_COMPILER_FRONTEND_VARIANT because:
- CMAKE_C_SIMULATE_ID=MSVC: true for both clang and clang-cl.
- CMAKE_C_COMPILER_FRONTEND_VARIANT=MSVC: true for clang-cl; false for clang.
Replace the explicit specialization lists in `__is_signed_integer_v` and
`__is_unsigned_integer_v` with detection using `is_integral`,
`is_signed`, and `is_unsigned`. This covers `_BitInt(N)` for any N, in
addition to all standard and extended integer types. Character types and
`bool` are excluded via `__is_character_or_bool_v`.

This unblocks `<bit>` operations (`popcount`, `countl_zero`, `rotl`,
etc.) for `_BitInt(N)`.

Part of the [_BitInt(N) libc++
effort](https://discourse.llvm.org/t/bitint-n-support-in-libc-investigations-possible-improvements-looking-for-guidance/90063).

Assisted-by: Claude (Anthropic)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
No codegen and instruction.
It may be ratified in the future. riscv/riscv-isa-manual#2598
The cost of sub-reductions is either the cost of *mlslb + *mlslt, or the
cost of a dot operation with 2 negations:
```
       partial_reduce_umls acc, lhs, rhs
  <=> -partial_reduce_umla -acc, lhs, rhs
```
(codegen for this was added by #186809)
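The identity above can be checked numerically with a small model of a partial reduction. This is only a sketch: `partial_reduce_mla` and `partial_reduce_mls` are hypothetical helpers, and the assumption that each accumulator lane sums a group of adjacent products is made purely for illustration.

```python
def partial_reduce_mla(acc, lhs, rhs):
    """Add groups of lhs[i] * rhs[i] products into the accumulator lanes."""
    ratio = len(lhs) // len(acc)
    return [
        a + sum(lhs[i * ratio + j] * rhs[i * ratio + j] for j in range(ratio))
        for i, a in enumerate(acc)
    ]


def partial_reduce_mls(acc, lhs, rhs):
    """Subtract groups of lhs[i] * rhs[i] products from the accumulator lanes."""
    ratio = len(lhs) // len(acc)
    return [
        a - sum(lhs[i * ratio + j] * rhs[i * ratio + j] for j in range(ratio))
        for i, a in enumerate(acc)
    ]


def mls_via_negated_mla(acc, lhs, rhs):
    """Compute the subtracting form as -mla(-acc, lhs, rhs): two negations."""
    negated_acc = [-a for a in acc]
    return [-x for x in partial_reduce_mla(negated_acc, lhs, rhs)]


acc = [10, 20]
lhs = [1, 2, 3, 4, 5, 6, 7, 8]
rhs = [8, 7, 6, 5, 4, 3, 2, 1]
assert partial_reduce_mls(acc, lhs, rhs) == mls_via_negated_mla(acc, lhs, rhs)
```

The same equality holds under wrap-around integer arithmetic, since negation distributes over addition modulo a power of two.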

The cost model was previously a bit of a hack: sub-reductions were
expanded and therefore expensive, although we made the expansion cost
artificially cheaper so that it would still be a candidate for cdot
instructions.
In private ZA functions without any instructions that require "active"
ZA we can omit all ZA setup (and saves/restores). This is equivalent to
removing the `__arm_new("za/zt0")` attribute when ZA state is unused.
Add fwide function and tests. Part 1/11. All build file changes are in
part 11.

Assisted by Gemini
This patch adds some extra state collection methods to DebuggerBase and
implements them for DAP only. These methods are used to fetch a
stacktrace without variable information, and to populate variable
information into a StepIR containing only a stacktrace. These methods
are currently unused, making this patch NFC, but this is a necessary
precursor to the new script model, where we examine the stacktrace to
determine what variable info we will collect.

As part of the stacktrace-collection function, we also fetch the
instruction address for each stack frame, if it is made available by the
debugger; to enable this, this patch adds a new value with default
`None` to `FrameIR`.
Now that #194607 has landed, we use a normalized triple in the README for
the SPIRV targets. Previously `spirv-mesa3d-` and `spirv64-mesa3d-` were
used, and those are normalized to `spirv-unknown-mesa3d` and
`spirv64-unknown-mesa3d` by the following command in
`runtimes/CMakeLists.txt`:

```console
$ clang --target=spirv-mesa3d- -print-target-triple
spirv-unknown-mesa3d
```

This is because in `llvm/lib/TargetParser/Triple.cpp` the term `mesa3d`
is recognized as an OS and placed in third position. The install path
for `libclc.spv` there ends up in `spirv-unknown-mesa3d/libclc.spv`.

With this change we suggest using triples that "survive" the
normalization:

```console
$ clang --target=spirv-mesa-mesa3d -print-target-triple
spirv-mesa-mesa3d
```

See also this discussion:
#194607 (comment)
…r. (#196366)

When building with precompiled headers and Xcode as the generator, it adds
`obj.lldbDAP.dir/${BUILD_TYPE}/cmake_pch.xxx` but does not generate it,
causing the build to fail.
This might have to do with `add_llvm_library` adding a source file
`Dummy.c` to any object it creates when Xcode is the generator, and with
the `lldbDAP` object not declaring its LINK_LIBS and LINK_COMPONENTS.
… shift amounts only generate poison (#196489)

Matches ValueTracking / GISel implementations - although testing options are limited until DAG has actual uses of UndefPoisonKind::UndefOnly
Introduce G_WRAPPER_RIP, which is the same node as in DAG. It is required
to make legalization possible when a load from a stub is needed to obtain
a pointer to a global value. It also avoids manual selection in
X86InstructionSelector.

Also added a missing check on X86SelectAddress failure.
…185425)

Extends llvm-objdump to print the information of images contained in
nested OffloadBinaries. For example, for a binary compiled with #185413
it shows
```
 $llvm-objdump --offloading ./a.out

./a.out:        file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            elf
arch
triple          spirv64-intel
producer        openmp
image size      43104 bytes
  [Nested OffloadBinary format detected]
  Number of inner images: 1
  kind            spir-v
  arch
  triple          spirv64-intel
  producer        openmp
  image size      42944 bytes
```  

New tests are added for clang-linker-wrapper and llvm-offload-binary using
this new functionality.

Depends on #185413

---------

Co-authored-by: Yury Plyakhin <[email protected]>
…195816)

On macOS, LLDB's test suite randomly receives SIGHUP signals that stop
the test suite early. The source of these SIGHUPs seems to be a bug in
the kernel (most likely job control).

The exact steps for reproducing this are not clear, but I have a set of
three tests of which two need to run concurrently for this to trigger:

* TestDAP_runInTerminal
* TestDAP_launch_io_integratedTerminal
* TestDAP_launch_stdio_redirection_and_console

I was also running UBSan on this build which may or may not be necessary
to make this random failure more persistent.

When these tests run, macOS job control will send SIGHUP to the process
group of the spawned subprocesses in that test. As LIT is in the same
process group, it also receives the SIGHUP and shuts down.

This patch just uses Python's API for forcing the spawned subprocess to
its own process group. This won't stop the SIGHUP and only prevents it
from reaching LIT.
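A minimal sketch of the technique on POSIX systems, assuming `start_new_session=True` as the mechanism (the patch may use a different `subprocess` option; the helper names here are hypothetical). The child reports its own process group so the separation from the parent can be observed.

```python
import os
import subprocess
import sys


def spawn_in_own_group(argv):
    """Start a child in a new session, and therefore a new process group."""
    return subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # POSIX only: the child calls setsid()
    )


def child_process_group():
    """Return (child pid, child process group id) for a short-lived child."""
    child_code = "import os; print(os.getpgid(0))"
    proc = spawn_in_own_group([sys.executable, "-c", child_code])
    out, _ = proc.communicate()
    return proc.pid, int(out.strip())
```

Because the child leads its own group, a signal delivered to that group does not reach the parent (here, the role LIT plays in the test suite).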

The SIGHUP itself doesn't seem to affect the DAP test itself. My theory
here is that the SIGHUP is received during test shutdown (or after it
was shut down), so that's why it doesn't cause any visible failures in
any of the tests.
Fixes: #191735

Teach AArch64 LowerVECTOR_SHUFFLE to recognize byte shuffles that are a
zero fill right shift and lower them to EXT with a zero vector. Adds a
regression test too.

Change-Id: Iffe97ff7e35cfaff790f537b4f1f5ba9aded4f92
Currently, compiling a `target reduction` results in the following assert
for the spirv64-intel target:

> Assertion `New->getType() == getType() && "replaceUses of value with
new value of different type!"' failed.

This patch fixes it by adding an address space cast where necessary to
make the types of the expressions match.

Assisted-by: claude-sonnet-4-5
Reverts #177160

The new test is timing out on the AArch64 Linux buildbot
(https://lab.llvm.org/buildbot/#/builders/59/builds/34166) and on my own
machine.

I suspect something to do with the requested terminal size. If what we
get is smaller than requested, it could time out waiting for expected
program output.
This patch introduces `_LIBCPP_{BEGIN/END}_EXPLICIT_ABI_ANNOTATIONS` and
marks everything within an
`_LIBCPP_{BEGIN,END}_UNVERSIONED_NAMESPACE_STD` (and any derivatives
like `_LIBCPP_{BEGIN,END}_NAMESPACE_STD`) implicitly by default. This
allows us to drop `_LIBCPP_HIDE_FROM_ABI` in most of the code base,
except for functions which shouldn't be `_LIBCPP_HIDE_FROM_ABI`.

This patch doesn't remove any `_LIBCPP_HIDE_FROM_ABI`s, since we have
over 13k of them in the code base. Actually dropping them will happen
over some time to avoid too many merge conflicts.
In expert scheduling mode, change the VA_VDST counts to match the
hardware implementation. The inserted waits were conservatively correct
before. This just makes them more precise in some cases.
…licting instructions. (#195132)

Essentially uses ModRef analysis in place of getClobberingMemoryAccess()
because the former has more accurate information about how in-loop
accesses and the hoist/sink target relate.
The using statement inside the lambda is redundant with the same using 4
lines up.

No behavior change.
This fixes a warning in omptest about a set but unused variable. The var
was intended to control whether colored logging output is created.
That logic has been moved into the `Logger` itself.
BOLT currently supports LDR relaxation for LDRXl and LDRWl. Add support
for LDR relaxation on LDRSWl.
This PR excludes the unsupported parts (NAN, -NAN, INFINITY, -INFINITY)
from the following 2 compiler-rt tests on z/OS.
```
compiler-rt/test/builtins/Unit/compiler_rt_scalbnl_test.c
compiler-rt/test/builtins/Unit/compiler_rt_logbl_test.c
```
Added ErrorOr-returning syscall wrappers for access, chdir, dup, dup2,
dup3, faccessat, fchdir, fsync, lseek, readlink, readlinkat, rename,
rmdir, and unlinkat.

Migrated the Linux entrypoint implementations in src/unistd/linux/ and
src/stdio/linux/rename.cpp to use them.

Replaced internal::lseekimpl() with linux_syscalls::lseek() in the
File infrastructure and deleted the now-unused lseekImpl.h.

Assisted-by: Automated tooling, human reviewed.
… helpers (#195911)

This is a follow-up to #194879 to ensure that the helpers for reduction use the right calling convention (in particular that they are marked as spir_func for SPIRV).

Assisted by Claude Sonnet 4.5.
MaskRay and others added 30 commits May 9, 2026 23:38
xxHash64 is a legacy, pre-XXH3 hash whose only non-test caller in the
monorepo is llvm::getKCFITypeID. #196774 accidentally exposed the API.
…apping (#195995)

This way we can remove a few workarounds of lambda expressions where
outer template arguments of concepts have to be preserved through
ImplicitConceptSpecializationDecls.

Fixes #193944
Use the [name of the imported
targets](https://cmake.org/cmake/help/latest/module/CheckSymbolExists.html)
when testing the libraries during cmake configuration. This removes the
need to also set `CMAKE_REQUIRED_INCLUDES` and
`CMAKE_REQUIRED_DEFINITIONS` and reflects more modern CMake usage where
targets are preferred over variables.

This is already the case when checking libcurl in the same file.
…DEF -> CTTZ_ZERO_POISON/CTLZ_ZERO_POISON/CTTZ_ELTS_ZERO_POISON (#196732)

DAG/GISel are ambiguous about whether zero input results in
UNDEF/POISON, unlike the rest of LLVM, which makes it clear it's POISON.

I've tried to clean this up once and for all by ensuring
SelectionDAG::canCreateUndefOrPoison does an includesPoison(Kind) check,
renaming the opcodes (including the VP variants) and updating as many
comments/tests as possible (I may still have missed some...).
Opaque pointer inttoptr was recording ptr as a pointee type, so
OpConvertUToPtr was emitted as pointer-to-pointer and then bitcasted
back. Please see an example below.

LLVM IR:
```
%p = inttoptr i64 %x to ptr addrspace(1)
%a = call ptr addrspace(1) @llvm.ptr.annotation(... %p ...)
call spir_func void @prefetch(ptr addrspace(1) %a, ...)
```

SPIR-V (before the change):
```
%p2 = OpConvertUToPtr %_ptr_CrossWorkgroup__ptr_CrossWorkgroup_uchar %x
%p1 = OpBitcast %_ptr_CrossWorkgroup_uchar %p2
OpFunctionCall ... %p1 ...
```

Skip assigning a pointee type for inttoptr when the destination is
untyped; the fallback later recovers the correct single pointer type.
…2430)

Invalidate cached values in register context data structures on every
thread stop.

NativeRegisterContextRegisterInfo::InvalidateAllRegisters performs no
operation by default. Subclasses may override it to clear cached values
within their register context data structures whenever a thread stops.

This change intends to set up the necessary infrastructure to support
caching of the thread context in NativeRegisterContextWindows_arm64,
which will improve read performance. Currently, the thread context is
retrieved for every read or write operation.
…artial vector reductions" (#196796)

Reverts #195119 while reported assertions are investigated.
…` attributes (#196144)

Adds a new warning that is emitted when a parameter is marked as
`[[clang::lifetimebound]]` but is not returned in one way or another
(tracked via `OriginEscapeFact`).

Closes #182935
Set ParentState::AlignedTo for ternary operands.
…196803)

Change
  OpenMPGroupprivate      -> OmpGroupprivateDirective
  OpenMPThreadprivate     -> OmpThreadprivateDirective
  OpenMPRequiresConstruct -> OmpRequiresDirective
  OpenMPUtilityConstruct  -> OmpUtilityDirective
Add patterns to match post-increment truncating stores from lane 0 of
wide integer vectors (v4i32/v2i64) to narrower types (i8/i16/i32). This
avoids transferring the value through a GPR when storing.

Also remove the pre-legalization early-exit in `combineStoreValueFPToInt`
as it prevented the optimization from applying in some cases.
…d-code-paths (#191625)

Part of the work in #183462.

Closes #183464.

Splitting the check into two more focused checks was considered during
discussion, but since clang-tidy does not support one-to-many aliases, a
single name covering both behaviors was chosen instead, one that is
clearer than `multiway-paths-covered`.

---------

Co-authored-by: Zeyi Xu <[email protected]>
…ndition [NFC] (#196795)

As it is no longer possible to combine bundles and conditions since
#160460, reflect that in CreateAssumption.
### Summary

Part of: #185382

Lower the `vuzp` intrinsics described in:
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#unzip-elements

This is a follow-up to: #195527

Lower `NEON::BI__builtin_neon_vuzp_v` and
`NEON::BI__builtin_neon_vuzpq_v` in CIRGenBuiltinAArch64.cpp by porting
the existing incubator logic
(clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp): two bitcasts
on the input vectors, then two rounds of cir.vec.shuffle generating the
deinterleave (even/odd) shuffle patterns with indices 2*i+vi, each
stored via ptr_stride on the sret base pointer.
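The index pattern 2*i+vi can be illustrated with a small model in which the shuffle selects lanes from the concatenation of the two inputs. This is a sketch of the index arithmetic only; the `vuzp` helper below is hypothetical and does not model bitcasts, `cir.vec.shuffle`, or the sret stores.

```python
def vuzp(a, b):
    """Deinterleave two equal-length vectors into (even lanes, odd lanes)."""
    n = len(a)
    concat = a + b  # a shuffle indexes into both inputs back to back
    # Round vi = 0 picks the even lanes, round vi = 1 picks the odd lanes.
    return [[concat[2 * i + vi] for i in range(n)] for vi in range(2)]


even, odd = vuzp([0, 1, 2, 3], [4, 5, 6, 7])
print(even)  # [0, 2, 4, 6]
print(odd)   # [1, 3, 5, 7]
```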
vfneg is not available on zvfhmin or zvfbfmin; it is expected to expand
to integer operations instead of being unrolled to scalar operations.
The general expandFNEG already handles that in most cases, except for
fixed vector types that are not promotable. We need to find a better
heuristic to gate this.
vfabs is not available on zvfhmin or zvfbfmin; it is expected to expand
to integer operations instead of being unrolled to scalar operations.
The general expandFABS already handles that in most cases, except for
fixed vector types that are not promotable. We need to find a better
heuristic to gate this.
vfsgnj is not available on zvfhmin or zvfbfmin; it is expected to expand
to integer operations instead of being unrolled to scalar operations.
The general expandFCOPYSIGN already handles that in most cases, except
for fixed vector types that are not promotable. We need to find a better
heuristic to gate this.
Byte constants are equivalent to integer constants when stored to
memory. Replacing them in store instructions reduces IR differences and
enables existing optimizations over integer constants.
PR #167254 inappropriately introduced llvm/Support/Hash.{h,cpp} for the
KCFI helpers. The name is misleading — it has nothing to do with the
generic hashing facility in llvm/ADT/Hashing.h — and KCFI is a
CodeGen/IR feature that does not belong in the foundational Support
layer.

Move the files to llvm/lib/Transforms/Utils/KCFIHash.cpp, alongside
setKCFIType, which is the only existing KCFI helper in TransformUtils.

Also relocate the deprecated pre-xxh3 xxHash64 implementation into
KCFIHash.cpp, the sole user. clang/test/CodeGen/kcfi-generalize.c and
kcfi-normalize.c are end-to-end regression tests for the xxHash64 output
…r macro (#195427)

```
  // a.cc
  static void foo(int x) {
    switch (x) {
  #define GENERIC(n) case n:
  #include "types.def"   // -isystem header invokes a user macro
      break;
    }
  }

  // sys/types.def
  #define MID(name) GENERIC(name)
  MID(0)
  MID(1)
  MID(2)
```

```
$ clang -fprofile-instr-generate -fcoverage-mapping -isystem sys -c a.cc
Assertion `SystemHeadersCoverage ||
           !SM.isInSystemHeader(SM.getSpellingLoc(Loc))' failed.
```

Commit 702a2b6 ("[Coverage] Rework !SystemHeadersCoverage")
replaced the system-header skip in gatherFileIDs with this assertion,
which trips here because `SM.isInSystemHeader(SM.getSpellingLoc(Loc))`
is true for tokens spelled in the system header.

This patch adds back the pre-#91446 condition but folds it with
the macro-token remap `if` statement.

Fixes #179316/#195422.
Claude Opus 4.7 identified clang/lib/Parse/ParseExpr.cpp, created a
minimal reproducer with cvise, and wrote the initial version of this
CodeGen patch. (An earlier session papered over the bug by patching
llvm-cov instead, which I abandoned.)
`ClassifiedToken` is used in only the implementation of
`UseTrailingReturnTypeCheck`. Move it into the unnamed namespace of the
cpp file instead of it being in the header.
This fixes 2f4c387.

Co-authored-by: Google Bazel Bot <[email protected]>
cos() does `using namespace range_reduction_double_internal;` and
range_reduction_double_internal after 51e9430 contains

    using LIBC_NAMESPACE::fputil::DoubleDouble;
    using Float128 = LIBC_NAMESPACE::fputil::DyadicFloat<128>;

So the local using statements for DoubleDouble and Float128 shadowed
these. Just remove the local using statements.

No behavior change.