[pull] main from llvm:main #5765
Open
pull[bot] wants to merge 195 commits into Ericsson:main from llvm:main
Previously, calling a host-device mismatch function inside a discarded `if constexpr` branch would trigger an error. This patch recognizes that discarded statements are never instantiated and allows such code.
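The language rule this patch relies on can be illustrated in plain C++17, without any host/device attributes. The sketch below is only an analogy: `describe` and `check_describe` are hypothetical names, and the "mismatching" call is modeled by a member call that would be ill-formed if the discarded branch were instantiated.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Returns value + 1 for integral T, otherwise the size of the container.
template <typename T>
int describe(const T& value) {
    if constexpr (std::is_integral_v<T>) {
        return static_cast<int>(value) + 1;
    } else {
        // For T = int this branch is a discarded statement: it is never
        // instantiated, so `value.size()` on an int is not diagnosed.
        return static_cast<int>(value.size());
    }
}

// Small self-check exercising both branches.
inline void check_describe() {
    assert(describe(41) == 42);
    assert(describe(std::string("abc")) == 3);
}
```

Instantiating `describe<int>` compiles even though `int` has no `size()` member, because the `else` branch is discarded for that instantiation.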
[libc] Include CPU model in overlay CI sccache key

The overlay CI compiles opt_host memory tests with `-march=native`, which generates object files specific to the runner CPU model. sccache treats `-march=native` as a literal string in its hash key, so cached `.o` files compiled on one CPU model get served to runners with a different CPU. When the cached binary uses instructions the current CPU lacks, the test crashes with SIGILL.

## Symptoms

The `memcmp_opt_host`, `memmove_opt_host`, `memset_opt_host`, `bcmp_opt_host`, and `bzero_opt_host` tests crash when SIMD code paths are first exercised. Simple tests like `CmpZeroByte` pass because they use small sizes that do not enter SIMD routines. The failures are fully reproducible on reruns because the cache stays poisoned.

## Evidence

Three consecutive runs of the same fwide PR (#196157), same code:

| Run | Azure Region | Cache Hits | Cache Misses | Result |
|-----|--------------|------------|--------------|--------|
| [25512875679](https://github.com/llvm/llvm-project/actions/runs/25512875679/job/74876008545) | westus3 | 9 | 5354 | PASS |
| [25524024922](https://github.com/llvm/llvm-project/actions/runs/25524024922/job/74916241365) | northcentralus | 5345 | 0 | CRASH |
| [25524839613](https://github.com/llvm/llvm-project/actions/runs/25524839613/job/74965830435) | westus | 5345 | 0 | CRASH |

The first run had a nearly empty cache and compiled everything locally (0.17% hit rate). An intermediate [syscall-unistd run](https://github.com/llvm/llvm-project/actions/runs/25517783708/job/74893495220) in eastus then populated the cache with object files compiled for that region's CPU. Subsequent runs on different hardware got 100% cache hits and crashed because the cached `.o` files use instructions their CPUs lack.

## Fix

Added a "Detect CPU model" step that reads the CPU model string from `/proc/cpuinfo` (Linux) or `sysctl` (macOS) and appends it to the sccache cache key. Runners with different CPUs now get separate cache buckets.

Assisted-by: Automated tooling, human reviewed.
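The actual fix lives in the CI workflow, which is not shown here. As a rough illustration of the idea only, the following C++ sketch extracts the `model name` field from `/proc/cpuinfo`-style text and appends it to a cache key. The function names, the `unknown` fallback, and the key layout are all assumptions made for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Returns the value of the first "model name" line, or "unknown" if absent.
std::string cpu_model_from_cpuinfo(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("model name", 0) != 0) continue;  // must start the line
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;
        const auto first = line.find_first_not_of(" \t", colon + 1);
        if (first == std::string::npos) continue;
        return line.substr(first);
    }
    return "unknown";
}

// Appends the CPU model to a base key, replacing whitespace with '_'.
std::string cache_key(const std::string& base, const std::string& cpu_model) {
    std::string key = base + "-";
    for (char c : cpu_model)
        key += (c == ' ' || c == '\t') ? '_' : c;
    return key;
}

// Small self-check with made-up cpuinfo text.
inline void check_cache_key() {
    std::istringstream in("processor\t: 0\nmodel name\t: Example CPU 9000\n");
    assert(cache_key("overlay", cpu_model_from_cpuinfo(in)) ==
           "overlay-Example_CPU_9000");
}
```

With a key built this way, two runners reporting different model strings never share a cache bucket, at the cost of a lower hit rate across heterogeneous hardware.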
…192041)

libclc has a configure warning on Windows:

clang: error: no such file or directory: '/clang:--target=amdgcn-amd-amdhsa-llvm'
clang: error: no such file or directory: '/clang:-print-target-triple'
CMake Warning at CMakeLists.txt:239 (message):
Failed to execute `llvm-project/build/bin/clang.exe /clang:--target=amdgcn-amd-amdhsa-llvm /clang:-print-target-triple` to normalize target triple.

Switch to checking CMAKE_C_COMPILER_FRONTEND_VARIANT because
- CMAKE_C_SIMULATE_ID=MSVC: true for both clang and clang-cl.
- CMAKE_C_COMPILER_FRONTEND_VARIANT=MSVC: true for clang-cl; false for clang.
Replace the explicit specialization lists in `__is_signed_integer_v` and `__is_unsigned_integer_v` with detection using `is_integral`, `is_signed`, and `is_unsigned`. This covers `_BitInt(N)` for any N, in addition to all standard and extended integer types. Character types and `bool` are excluded via `__is_character_or_bool_v`. This unblocks `<bit>` operations (`popcount`, `countl_zero`, `rotl`, etc.) for `_BitInt(N)`. Part of the [_BitInt(N) libc++ effort](https://discourse.llvm.org/t/bitint-n-support-in-libc-investigations-possible-improvements-looking-for-guidance/90063).

Assisted-by: Claude (Anthropic)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
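A minimal sketch of the detection approach described above, written against the public `<type_traits>` names rather than libc++'s internal ones. The unprefixed trait names are hypothetical stand-ins for `__is_signed_integer_v`, `__is_unsigned_integer_v`, and `__is_character_or_bool_v`, and the real implementation may differ in details such as cv-qualifier handling.

```cpp
#include <type_traits>

// Hypothetical stand-in for libc++'s internal __is_character_or_bool_v.
template <class T>
inline constexpr bool is_character_or_bool_v =
    std::is_same_v<std::remove_cv_t<T>, bool> ||
    std::is_same_v<std::remove_cv_t<T>, char> ||
    std::is_same_v<std::remove_cv_t<T>, wchar_t> ||
#ifdef __cpp_char8_t
    std::is_same_v<std::remove_cv_t<T>, char8_t> ||
#endif
    std::is_same_v<std::remove_cv_t<T>, char16_t> ||
    std::is_same_v<std::remove_cv_t<T>, char32_t>;

// Detection replaces an explicit list of specializations.
template <class T>
inline constexpr bool is_signed_integer_v =
    std::is_integral_v<T> && std::is_signed_v<T> && !is_character_or_bool_v<T>;

template <class T>
inline constexpr bool is_unsigned_integer_v =
    std::is_integral_v<T> && std::is_unsigned_v<T> && !is_character_or_bool_v<T>;

static_assert(is_signed_integer_v<int> && is_signed_integer_v<signed char>);
static_assert(is_unsigned_integer_v<unsigned long> && is_unsigned_integer_v<unsigned char>);
static_assert(!is_signed_integer_v<char> && !is_unsigned_integer_v<char>);
static_assert(!is_unsigned_integer_v<bool> && !is_unsigned_integer_v<char16_t>);
static_assert(!is_signed_integer_v<double>);
```

The exclusion step matters because `bool` and the character types are integral types too, so `is_integral` combined with `is_unsigned` alone would accept them.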
No codegen and instruction. It may be ratified in the future. riscv/riscv-isa-manual#2598
The cost of sub-reductions is either the cost of *mlslb + *mlslt, or the
cost of a dot operation with 2 negations:
```
partial_reduce_umls acc, lhs, rhs
<=> -partial_reduce_umla -acc, lhs, rhs
```
(codegen for this was added by #186809)
The cost model was previously a bit of a hack, since sub-reductions were
expanded and therefore expensive, although we made the expansion cost
artificially cheaper so that it would still be a candidate for cdot
instructions.
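The negation identity quoted above can be checked on a scalar model. The sketch below collapses the vector accumulator into a single 64-bit value, which is a simplification made for this example; the function names are hypothetical and not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum of element-wise products over the common length of lhs and rhs.
std::int64_t dot(const std::vector<std::int32_t>& lhs,
                 const std::vector<std::int32_t>& rhs) {
    std::int64_t sum = 0;
    const std::size_t n = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<std::int64_t>(lhs[i]) * rhs[i];
    return sum;
}

// Scalar models of multiply-accumulate and multiply-subtract reductions.
std::int64_t reduce_mla(std::int64_t acc, const std::vector<std::int32_t>& lhs,
                        const std::vector<std::int32_t>& rhs) {
    return acc + dot(lhs, rhs);
}
std::int64_t reduce_mls(std::int64_t acc, const std::vector<std::int32_t>& lhs,
                        const std::vector<std::int32_t>& rhs) {
    return acc - dot(lhs, rhs);
}

// The rewrite: a subtracting reduction as an adding one with two negations.
std::int64_t mls_via_mla(std::int64_t acc, const std::vector<std::int32_t>& lhs,
                         const std::vector<std::int32_t>& rhs) {
    return -reduce_mla(-acc, lhs, rhs);
}

inline void check_mls_identity() {
    const std::vector<std::int32_t> lhs{1, 2, 3}, rhs{4, 5, 6};
    assert(reduce_mls(100, lhs, rhs) == 68);
    assert(mls_via_mla(100, lhs, rhs) == reduce_mls(100, lhs, rhs));
}
```

Algebraically, `-((-acc) + dot) = acc - dot`, which is why the cost of a sub-reduction can be modeled as one dot operation plus two negations.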
…gcn_rcp, and amdgcn_sqrt (#195099)
In private ZA functions without any instructions that require "active"
ZA we can omit all ZA setup (and saves/restores). This is equivalent to
removing the `__arm_new("za/zt0")` attribute when ZA state is unused.
Add fwide function and tests. Part 1/11. All build file changes are in part 11. Assisted by Gemini
This patch adds some extra state collection methods to DebuggerBase and implements them for DAP only. These methods are used to fetch a stacktrace without variable information, and to populate variable information into a StepIR containing only a stacktrace. These methods are currently unused, making this patch NFC, but this is a necessary precursor to the new script model, where we examine the stacktrace to determine what variable info we will collect. As part of the stacktrace-collection function, we also fetch the instruction address for each stack frame, if it is made available by the debugger; to enable this, this patch adds a new value with default `None` to `FrameIR`.
Now that #194607 landed we use a normalized triple in the README for the SPIRV targets. Before, `spirv-mesa3d-` and `spirv64-mesa3d-` were being used, and those are normalized to `spirv-unknown-mesa3d` and `spirv64-unknown-mesa3d` by the following command in `runtimes/CMakeLists.txt`:

```console
$ clang --target=spirv-mesa3d- -print-target-triple
spirv-unknown-mesa3d
```

This is because in `llvm/lib/TargetParser/Triple.cpp` the term `mesa3d` is recognized as an OS and placed in third position. The install path for `libclc.spv` then ends up in `spirv-unknown-mesa3d/libclc.spv`. With this change we suggest using triples that "survive" the normalization:

```console
$ clang --target=spirv-mesa-mesa3d -print-target-triple
spirv-mesa-mesa3d
```

See also this discussion: #194607 (comment)
…r. (#196366) When building with precompiled headers and Xcode as a generator, it adds `obj.lldbDAP.dir/${BUILD_TYPE}/cmake_pch.xxx` but does not generate one, causing the build to fail. This might have to do with `add_llvm_library` adding a source file `Dummy.c` to any object it creates if using Xcode as a generator, and the `lldbDAP` object not declaring its LINK_LIBS and LINK_COMPONENTS.
… shift amounts only generate poison (#196489) Matches ValueTracking / GISel implementations - although testing options are limited until DAG has actual uses of UndefPoisonKind::UndefOnly
…nly generate poison (#196492)
Introduce G_WRAPPER_RIP; it is the same node as in DAG. It is required to make legalization possible when a load from a stub is required to obtain a pointer to a global value. It also avoids manual selection in X86InstructionSelector. Also added a missing check on X86SelectAddress failure.
…185425) Extends llvm-objdump to print the information of images contained in nested OffloadBinaries. For example, for a binary compiled with #185413 it shows
```
$llvm-objdump --offloading ./a.out
./a.out: file format elf64-x86-64
OFFLOADING IMAGE [0]:
kind elf
arch
triple spirv64-intel
producer openmp
image size 43104 bytes
[Nested OffloadBinary format detected]
Number of inner images: 1
kind spir-v
arch
triple spirv64-intel
producer openmp
image size 42944 bytes
```
New tests are added for clang-linker-wrapper and llvm-offload-binary using this new functionality.

Depends on #185413

---------

Co-authored-by: Yury Plyakhin <[email protected]>
…195816) On macOS, LLDB's test suite randomly receives SIGHUP signals that stop the test suite early. The source of these SIGHUPs seems to be a bug in the kernel (most likely job control). The exact steps for reproducing this are not clear, but I have a set of three tests of which two need to run concurrently for this to trigger:

* TestDAP_runInTerminal
* TestDAP_launch_io_integratedTerminal
* TestDAP_launch_stdio_redirection_and_console

I was also running UBSan on this build, which may or may not be necessary to make this random failure more persistent. When these tests run, macOS job control will send SIGHUP to the process group of the spawned subprocesses in that test. As LIT is in the same process group, it also receives the SIGHUP and shuts down.

This patch just uses Python's API for forcing the spawned subprocess into its own process group. This won't stop the SIGHUP; it only prevents it from reaching LIT. The SIGHUP itself doesn't seem to affect the DAP test itself. My theory here is that the SIGHUP is received during test shutdown (or after it was shut down), which is why it doesn't cause any visible failures in any of the tests.
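The patch itself goes through Python's subprocess API. The same isolation can be sketched at the POSIX level in C++, which is what the following example does; it is an analogy rather than the code in the patch, and the function names are hypothetical. A forked child calls `setpgid(0, 0)` so that signals sent to the parent's process group no longer reach it, and vice versa.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that moves itself into a new process group, then reports
// through its exit status whether that group differs from the parent's.
bool child_gets_own_process_group() {
    const pid_t parent_group = getpgrp();
    const pid_t child = fork();
    if (child < 0) return false;  // fork failed
    if (child == 0) {
        if (setpgid(0, 0) != 0) _exit(2);  // become leader of a new group
        _exit(getpgrp() != parent_group ? 0 : 1);
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

inline void check_process_group() {
    assert(child_gets_own_process_group());
}
```

A signal delivered to one process group is not delivered to members of another, so a test runner that stays in the original group is shielded from a SIGHUP aimed at the child's group.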
Fixes: #191735

Teach AArch64 LowerVECTOR_SHUFFLE to recognize byte shuffles that are a zero-fill right shift and lower them to EXT with a zero vector. Adds a regression test too.

Change-Id: Iffe97ff7e35cfaff790f537b4f1f5ba9aded4f92
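To see why this lowering is valid, here is a small byte-level model in C++. It assumes the usual EXT semantics (take 16 consecutive bytes from the concatenation of two vectors, starting at byte `n` of the first) and little-endian lane numbering; the helper names are hypothetical and this is not the LLVM matching code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Model of EXT: bytes n..n+15 of the concatenation lo:hi (lo's bytes first).
Bytes16 ext(const Bytes16& lo, const Bytes16& hi, std::size_t n) {
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i)
        out[i] = (i + n < 16) ? lo[i + n] : hi[i + n - 16];
    return out;
}

// Byte shuffle that shifts lanes down by n and fills the top with zeros.
Bytes16 shift_right_zero_fill(const Bytes16& v, std::size_t n) {
    Bytes16 out{};  // value-initialized, so unfilled lanes stay zero
    for (std::size_t i = 0; i + n < 16; ++i)
        out[i] = v[i + n];
    return out;
}

inline void check_ext_equivalence() {
    Bytes16 v{};
    for (std::size_t i = 0; i < 16; ++i)
        v[i] = static_cast<std::uint8_t>(i + 1);
    const Bytes16 zero{};
    for (std::size_t n = 0; n <= 16; ++n)
        assert(ext(v, zero, n) == shift_right_zero_fill(v, n));
}
```

When the second operand is all zeros, every lane that would read past the end of the first vector reads a zero instead, which is exactly the zero-fill shift pattern.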
Currently, compiling a `target reduction` results in the following assert for the spirv64-intel target:

> Assertion `New->getType() == getType() && "replaceUses of value with new value of different type!"' failed.

This patch fixes it by adding an addrspace cast where necessary to make the types of the expressions match.

Assisted-by: claude-sonnet-4-5
Reverts #177160 The new test is timing out on the AArch64 Linux buildbot (https://lab.llvm.org/buildbot/#/builders/59/builds/34166) and on my own machine. I suspect something to do with the requested terminal size. If what we get is smaller than requested, it could time out waiting for expected program output.
This patch introduces `_LIBCPP_{BEGIN/END}_EXPLICIT_ABI_ANNOTATIONS` and
marks everything within an
`_LIBCPP_{BEGIN,END}_UNVERSIONED_NAMESPACE_STD` (and any derivatives
like `_LIBCPP_{BEGIN,END}_NAMESPACE_STD`) implicitly by default. This
allows us to drop `_LIBCPP_HIDE_FROM_ABI` in most of the code base,
except for functions which shouldn't be `_LIBCPP_HIDE_FROM_ABI`.
This patch doesn't remove any `_LIBCPP_HIDE_FROM_ABI`s, since we have
over 13k of them in the code base. Actually dropping them will happen
over some time to avoid too many merge conflicts.
In expert scheduling mode, change the VA_VDST counts to match the hardware implementation. The inserted waits were conservatively correct before. This just makes them more precise in some cases.
…licting instructions. (#195132) Essentially uses ModRef analysis in place of getClobberingMemoryAccess() because the former has more accurate information as to how in loop accesses and the hoist/sink target relate.
The using statement inside the lambda is redundant with the same using 4 lines up. No behavior change.
This fixes a warning in omptest about a set but unused variable. The var was intended to control whether colored logging output is created. That logic has been moved into the `Logger` itself.
BOLT currently supports LDR relaxation for LDRXl and LDRWl. Add support for LDR relaxation on LDRSWl.
This PR excludes the unsupported parts (NAN, -NAN, INFINITY, -INFINITY) from the following 2 compiler-rt tests on z/OS.
```
compiler-rt/test/builtins/Unit/compiler_rt_scalbnl_test.c
compiler-rt/test/builtins/Unit/compiler_rt_logbl_test.c
```
Added ErrorOr-returning syscall wrappers for access, chdir, dup, dup2, dup3, faccessat, fchdir, fsync, lseek, readlink, readlinkat, rename, rmdir, and unlinkat. Migrated the Linux entrypoint implementations in src/unistd/linux/ and src/stdio/linux/rename.cpp to use them. Replaced internal::lseekimpl() with linux_syscalls::lseek() in the File infrastructure and deleted the now-unused lseekImpl.h. Assisted-by: Automated tooling, human reviewed.
xxHash64 is a legacy, pre-XXH3 hash whose only non-test caller in the monorepo is llvm::getKCFITypeID. #196774 accidentally exposed the API.
Use the [name of the imported targets](https://cmake.org/cmake/help/latest/module/CheckSymbolExists.html) when testing the libraries during cmake configuration. This removes the need to also set `CMAKE_REQUIRED_INCLUDES` and `CMAKE_REQUIRED_DEFINITIONS` and reflects more modern CMake usage where targets are preferred over variables. This is already the case when checking libcurl in the same file.
This is part one of removing the `hicpp-*` checks. RFC: https://discourse.llvm.org/t/rfc-regarding-the-current-status-of-hicpp-checks/89883 Part of #183462
…DEF -> CTTZ_ZERO_POISON/CTLZ_ZERO_POISON/CTTZ_ELTS_ZERO_POISON (#196732) DAG/GISel are ambiguous about whether zero-input results in UNDEF/POISON, unlike the rest of LLVM, which makes it clear it's POISON. I've tried to clean this up once and for all by ensuring SelectionDAG::canCreateUndefOrPoison does an includesPoison(Kind) check, renaming the opcodes (including the VP variants) and updating as many comments/tests as possible (I may still have missed some...).
Opaque pointer inttoptr was recording ptr as a pointee type, so OpConvertUToPtr was emitted as pointer-to-pointer and then bitcasted back. Please see an example below.

LLVM IR:
```
%p = inttoptr i64 %x to ptr addrspace(1)
%a = call ptr addrspace(1) @llvm.ptr.annotation(... %p ...)
call spir_func void @prefetch(ptr addrspace(1) %a, ...)
```

SPIR-V (before the change):
```
%p2 = OpConvertUToPtr %_ptr_CrossWorkgroup__ptr_CrossWorkgroup_uchar %x
%p1 = OpBitcast %_ptr_CrossWorkgroup_uchar %p2
OpFunctionCall ... %p1 ...
```

Skip assigning the pointee type for inttoptr when the destination is untyped; the fallback later recovers the correct single pointer type.
…2430) Invalidate cached values in register context data structures on every thread stop. NativeRegisterContextRegisterInfo::InvalidateAllRegisters performs no operation by default. Subclasses may override it to clear cached values within their register context data structures whenever a thread stops. This change intends to set up the necessary infrastructure to support caching of the thread context in NativeRegisterContextWindows_arm64, which will improve read performance. Currently, the thread context is retrieved for every read or write operation.
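The pattern described above (a no-op hook in the base class that a caching subclass overrides) can be sketched as follows. Only `InvalidateAllRegisters` is a name taken from the commit message; every other class, member, and the `ThreadContext` struct are hypothetical and do not mirror the real LLDB types.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct ThreadContext { std::uint64_t pc = 0; };  // hypothetical register snapshot

class RegisterContextBase {
public:
    virtual ~RegisterContextBase() = default;
    virtual void InvalidateAllRegisters() {}  // no operation by default
};

class CachingRegisterContext : public RegisterContextBase {
public:
    // Fetches the thread context only when no cached copy is available.
    std::uint64_t ReadPC() {
        if (!cached_) cached_ = FetchThreadContext();
        return cached_->pc;
    }
    void InvalidateAllRegisters() override { cached_.reset(); }
    // Stand-in for a thread stop: the thread moved, so the cache is stale.
    void SimulateStopAt(std::uint64_t pc) {
        current_pc_ = pc;
        InvalidateAllRegisters();
    }
    int fetch_count() const { return fetches_; }

private:
    ThreadContext FetchThreadContext() {
        ++fetches_;  // counts the expensive round trips
        return ThreadContext{current_pc_};
    }
    std::optional<ThreadContext> cached_;
    std::uint64_t current_pc_ = 0x1000;
    int fetches_ = 0;
};

inline void check_register_cache() {
    CachingRegisterContext ctx;
    assert(ctx.ReadPC() == 0x1000 && ctx.ReadPC() == 0x1000);
    assert(ctx.fetch_count() == 1);  // second read was served from the cache
    ctx.SimulateStopAt(0x2000);
    assert(ctx.ReadPC() == 0x2000 && ctx.fetch_count() == 2);
}
```

Invalidating on every stop is what makes the cache safe: between stops the thread does not run, so a cached snapshot cannot go stale.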
Set ParentState::AlignedTo for ternary operands.
…196803)

Change
OpenMPGroupprivate -> OmpGroupprivateDirective
OpenMPThreadprivate -> OmpThreadprivateDirective
OpenMPRequiresConstruct -> OmpRequiresDirective
OpenMPUtilityConstruct -> OmpUtilityDirective
Add patterns to match post-increment truncating stores from lane 0 of wide integer vectors (v4i32/v2i64) to narrower types (i8/i16/i32). This avoids transferring the value through a GPR when storing. Also remove the pre-legalization early-exit in `combineStoreValueFPToInt`, as it prevented the optimization from applying in some cases.
…d-code-paths (#191625) Part of the work in #183462. Closes #183464. Splitting the check into two more focused checks was considered during discussion, but since clang-tidy does not support one-to-many aliases, a single name covering both behaviors was chosen instead that is more clear than `multiway-paths-covered`. --------- Co-authored-by: Zeyi Xu <[email protected]>
…nctions in modernize-use-string-view" (#196387)
### Summary

Part of #185382.

Lower the `vuzp` intrinsics listed in: https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#unzip-elements

This is a follow-up to #195527.

Lower `NEON::BI__builtin_neon_vuzp_v` and `NEON::BI__builtin_neon_vuzpq_v` in CIRGenBuiltinAArch64.cpp by porting the existing incubator logic (clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp): two bitcasts on the input vectors, then two rounds of cir.vec.shuffle generating the deinterleave (even/odd) shuffle patterns with indices 2*i+vi, each stored via ptr_stride on the sret base pointer.
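The `2*i+vi` index pattern can be made concrete with a small model. The sketch below is not the CIR code: it uses `std::vector<int>` for lanes, and `uzp_mask` and `shuffle` are hypothetical helpers. Mask indices address the concatenation of the two inputs, with the first input's lanes first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shuffle mask for output vi (0 = even lanes, 1 = odd lanes) over the
// concatenation of two inputs with num_elts lanes each.
std::vector<int> uzp_mask(std::size_t num_elts, int vi) {
    std::vector<int> mask;
    mask.reserve(num_elts);
    for (std::size_t i = 0; i < num_elts; ++i)
        mask.push_back(static_cast<int>(2 * i) + vi);
    return mask;
}

// Applies a mask to the concatenation of a and b (a's lanes come first).
std::vector<int> shuffle(const std::vector<int>& a, const std::vector<int>& b,
                         const std::vector<int>& mask) {
    std::vector<int> out;
    out.reserve(mask.size());
    for (int idx : mask) {
        const auto u = static_cast<std::size_t>(idx);
        out.push_back(u < a.size() ? a[u] : b[u - a.size()]);
    }
    return out;
}

inline void check_uzp() {
    const std::vector<int> a{10, 11, 12, 13}, b{20, 21, 22, 23};
    assert((shuffle(a, b, uzp_mask(4, 0)) == std::vector<int>{10, 12, 20, 22}));
    assert((shuffle(a, b, uzp_mask(4, 1)) == std::vector<int>{11, 13, 21, 23}));
}
```

For four lanes the two masks are `{0, 2, 4, 6}` and `{1, 3, 5, 7}`, so the first result collects the even-indexed lanes of the concatenated inputs and the second collects the odd-indexed ones.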
vfneg is not available on zvfhmin or zvfbfmin; it's expected to expand to integer operations instead of unrolling to scalar operations. The general expandFNEG already handles that in most cases, except for fixed vector types that are not promotable; we need to find a better heuristic to gate this.
vfabs is not available on zvfhmin or zvfbfmin; it's expected to expand to integer operations instead of unrolling to scalar operations. The general expandFABS already handles that in most cases, except for fixed vector types that are not promotable; we need to find a better heuristic to gate this.
vfsgnj is not available on zvfhmin or zvfbfmin; it's expected to expand to integer operations instead of unrolling to scalar operations. The general expandFCOPYSIGN already handles that in most cases, except for fixed vector types that are not promotable; we need to find a better heuristic to gate this.
Byte constants are equivalent to integer constants when stored to memory. Replacing them in store instructions reduces IR differences and enables existing optimizations over integer constants.
PR #167254 inappropriately introduced llvm/Support/Hash.{h,cpp} for the KCFI helpers. The name is misleading: it has nothing to do with the generic hashing facility in llvm/ADT/Hashing.h, and KCFI is a CodeGen/IR feature that does not belong in the foundational Support layer. Move the files to llvm/lib/Transforms/Utils/KCFIHash.cpp, alongside setKCFIType, which is the only existing KCFI helper in TransformUtils. Also relocate the deprecated pre-xxh3 xxHash64 implementation into KCFIHash.cpp, the sole user. clang/test/CodeGen/kcfi-generalize.c and kcfi-normalize.c are end-to-end regression tests for the xxHash64 output.
…r macro (#195427)
```
// a.cc
static void foo(int x) {
  switch (x) {
#define GENERIC(n) case n:
#include "types.def" // -isystem header invokes a user macro
    break;
  }
}

// sys/types.def
#define MID(name) GENERIC(name)
MID(0)
MID(1)
MID(2)
```
```
$ clang -fprofile-instr-generate -fcoverage-mapping -isystem sys -c a.cc
Assertion `SystemHeadersCoverage || !SM.isInSystemHeader(SM.getSpellingLoc(Loc))' failed.
```
Commit 702a2b6 ("[Coverage] Rework !SystemHeadersCoverage") replaced the system-header skip in gatherFileIDs with this assertion, which trips as `SM.isInSystemHeader(SM.getSpellingLoc(Loc))` is false. This patch adds back the pre-#91446 condition but folds it with the macro-token remap `if` statement.

Fixes #179316/#195422.

Clang Opus 4.7 identified clang/lib/Parse/ParseExpr.cpp, created a minimal reproducer with cvise, and wrote the initial version of this CodeGen patch. (An earlier session papered over the bug by patching llvm-cov instead, which I abandoned.)
`ClassifiedToken` is used in only the implementation of `UseTrailingReturnTypeCheck`. Move it into the unnamed namespace of the cpp file instead of it being in the header.
This fixes 2f4c387. Co-authored-by: Google Bazel Bot <[email protected]>
No behavior change.
cos() does `using namespace range_reduction_double_internal;`, and range_reduction_double_internal after 51e9430 contains `using LIBC_NAMESPACE::fputil::DoubleDouble;` and `using Float128 = LIBC_NAMESPACE::fputil::DyadicFloat<128>;`. So the local using statements for DoubleDouble and Float128 shadowed these. Just remove the local using statements. No behavior change.
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)