[pull] main from dotnet:main #122

Open
wants to merge 2,241 commits into
base: main
@pull pull bot commented Sep 18, 2024

See Commits and Changes for more details.


Created by pull[bot]


@pull pull bot added the ⤵️ pull label Sep 18, 2024
saucecontrol and others added 29 commits March 19, 2025 17:28
* improve support for 512-bit Vector<T>

* fix size check

* add MaxVectorTBitWidth configs to jitstress-isas-x86 pipeline

* clamp Vector<T> size to largest accelerated fixed-sized vector

* move PreferredVectorBitWidth logic to VM

* tidying

* tidying2
* unblock long xplat intrinsics on x86

* tidying

* tidying2

* remove CreateScalarUnsafe opt for small loads

* skip more redundant casts for CreateScalar of small types

* use temp reg for CreateScalar float SSE fallback

* formatting patch

* simplify storeind containment of ToScalar

* don't use temp reg for CreateScalar float SSE fallback

* skip cast on other memory loads

* use proper containment check

* add more validation, remove CreateSequence restriction

* use appropriate helpers for decomposing ToScalar
* [Mono]: Fix additional stack-walks to be async safe.

Mono's stack-walker can be signal/async safe when needed,
making sure it won't allocate additional memory or take internal
runtime locks. It is, however, up to the caller of the stack-walking APIs
to decide if it should be signal and/or async safe. Currently this is
controlled using two different flags, MONO_UNWIND_SIGNAL_SAFE and
mono_thread_info_is_async_context.

This is problematic since callers want signal-safe stack-walking, but
unless both are set, the walk will not be fully signal safe.

dotnet/android#9365 hit a couple of scenarios
described here:

dotnet/android#9365 (comment)

that end up deadlocking because they walked the stack from within a
signal handler, or dumped the stack of a suspended thread holding the
runtime loader lock, without making the stack-walk async safe.

The fix makes sure that calls to the stack-walk APIs can be made signal and/or
async safe, and that the identified areas use the correct set of flags given
the state of threads when stack-walking.

* Add signal async safe stack unwind option.

* Assert that walk_stack_full_llvm_only is only called in llvm only mode.

* Correct some bool usage.

* Make signal safe unwind option, signal async safe.

Mono's current MONO_UNWIND_SIGNAL_SAFE was not fully signal safe since
it was not async safe, which could lead to taking the loader lock. This
fixes MONO_UNWIND_SIGNAL_SAFE to be signal async safe. It also
changes current uses of MONO_UNWIND_SIGNAL_SAFE to MONO_UNWIND_NONE, since
they were equal before this fix, meaning old calls using
MONO_UNWIND_SIGNAL_SAFE behave identically using MONO_UNWIND_NONE,
so there is no regression.
…tant `indices` (#99596)

* Squash into 1 commit

* Remove internal dependency on ShuffleUnsafe's behaviour wrt high bit

* Optimise some codegen

- Optimise comparison in `gtNewSimdShuffleNodeVariable` for xarch
- Optimise for constant vector in Vector256.Shuffle{Unsafe} when have AVX2 only

* jit format

* jit format

* Simplify logic for using Shuffle for ShuffleUnsafe

* Move `ShuffleUnsafeModified` out of `Base64Helper`

- This was requested feedback via Discord

* Remove unnecessary `CompExactlyDependsOn` and `using`s

* Update SearchValues.cs

* Support AVX-512/AVX-10.1 acceleration of Shuffle V128<ulong/long/double>

* Additional optimisation for V512 constant index shuffle

- When 128-bit lanes are not crossed, emit vpshufb instead of vperm*
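
The lane-crossing condition in the bullet above can be sketched as follows. This is only an illustration, assuming byte-sized elements and in-range constant indices; `crossesLanes` is a hypothetical helper and not the JIT's code in gentree.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the JIT's code): returns true if any constant
// byte index selects a source element outside the 128-bit (16-byte) lane
// that the destination element lives in. Indices are assumed to be in range.
bool crossesLanes(const std::vector<std::uint8_t>& indices)
{
    constexpr std::size_t kLaneBytes = 16;
    for (std::size_t dst = 0; dst < indices.size(); ++dst)
    {
        if (dst / kLaneBytes != indices[dst] / kLaneBytes)
        {
            return true;
        }
    }
    return false;
}
```

When the helper returns false, every element stays inside its own lane, which is the case where an in-lane byte shuffle such as `vpshufb` can express the permutation.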

* jit format & typo

* Fix operand order

* Changes to `IsValidForShuffle` & jit format

- Ensure we use IsValidForShuffle correctly (i.e., ensure all cases that could be emitted at some point are able to be done)
- jit format changes

* jit format

* Update hwintrinsicxarch.cpp

* Update hwintrinsicxarch.cpp

* Update gentree.cpp

- Implement preferring `vshufpd`/`vpshufd`/`vshufps` (when same size & possible & no zeroing) over `vpshufb` (when no crossing lane) over `vpermd`/`vpermps`/`vpermpd`/the ushort & byte equivalents in constant index case

Update gentree.cpp

- Continuation of previous commit, for the variable index shuffle stuff

Update gentree.cpp

Update gentree.cpp

Fix typo

* Make `op2DupSafe` be consistently ordered

- Adjust code such that `op2DupSafe` is consistently ordered with respect to `retNode` (it happens to end up before now) - this is the likely cause of the test failures

* jit format

* Use `compIsEvexOpportunisticallySupported` instead of explicit AVX-10 in remaining places

* jit format

* Update gentree.cpp

- Ensure `op1` side effects are done before `op2` side effects

* Update Vector128.cs

* Use `BlockNonDeterministicIntrinsics` instead of `CompExactlyDependsOn`

- As per feedback

* Ensure V128<byte> ShuffleUnsafe is not regressed on mono

* Update gentree.cpp

* Rename `ShuffleUnsafe` to `ShuffleNative`

- And do a full review of the logic & comments of all my code
- Fix 3 bugs (2 pre-existing)
- Simplify/improve some code

* jit format

* jit format again

* um

* Move assertion on arm64 to correct spot

* Normalize indices for `vpshufb` in all cases of constant indices Shuffle

- This is a nice-to-have, not a correctness issue - it matches what we're doing throughout the rest of the function, and keeps the asm more readable imo

* jit format

* Revert last pair of commits

- There's other places where they're not normalised. could be done, but not really worth the trouble probably

* Feedback

* Address some feedback & a bug fix

- Implement a bunch of feedback
- Fix an oversight in `IsValidForShuffle` where we don't treat something as variable indices that gets emitted as such

* Ensure ShuffleNative's behaviour with reflection/function pointers/etc. is the same as a normal call

- Not a particularly pretty solution to the problem, but it should work correctly at least

* jit format & compilation error

* jit format & fix build

* jit format & fix build

* jit format

* Update hwintrinsicxarch.cpp

* Update gentree.cpp

Missing `!= nullptr` & typo in comment & add additional comments in a few places.

* Fix mono code

* Mono: also cast arguments

* Move mono impl back into c#

- This should fix the mono issues (wasm impl assumes indices are constant)
- Additionally ensure V128.Shuffle(byte/sbyte) is vectorised for all mono platforms with ssse3 or arm64 advsimd or packedsimd, when given variable indices

* Update Vector128.cs

* um

---------

Co-authored-by: Tanner Gooding <[email protected]>
Co-authored-by: Egor Bogatov <[email protected]>
…ts.TypesTest` when `DataSetXmlSerializationIsSupported` is set (#113084)

* Conditionally run the test
* Reduce tpdiff when regMaskTP has more than 64 registers by working on SingleTypeRegSet instead.

* Handle liveRegs locally as SingleTypeRegSet.

* Addressing review comments.

* Revert "Handle liveRegs locally as SingleTypeRegSet."

This reverts commit 65acca3.

* Fix formatting.
…ch (#112479)

* Import ReverseEndianness for LA64

* Eliminate cast under bswap16
Enable inlining of method with EH. Inlinee EH clauses are integrated
into the root method EH table at the appropriate point (mid-table if
the call site is in an EH region; at table end otherwise).
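
The placement rule above can be pictured with a small sketch. It assumes an EH table ordered so that nested clauses precede their enclosing clause, and it models clauses as plain strings; `integrateInlineeClauses` and its parameters are hypothetical names for illustration, not the JIT's API.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sketch: splice the inlinee's clauses into the root table.
// If the call site sits inside an EH region, `enclosingIndex` is the table
// index of that region and the new clauses go immediately before it
// (mid-table). Otherwise they are appended at the end of the table.
// `enclosingIndex`, when present, is assumed to be a valid table index.
std::vector<std::string> integrateInlineeClauses(std::vector<std::string> rootTable,
                                                 const std::vector<std::string>& inlineeClauses,
                                                 std::optional<std::size_t> enclosingIndex)
{
    const std::size_t at = enclosingIndex.value_or(rootTable.size());
    rootTable.insert(rootTable.begin() + static_cast<std::ptrdiff_t>(at),
                     inlineeClauses.begin(), inlineeClauses.end());
    return rootTable;
}
```

For example, inserting clauses `X`, `Y` with an enclosing index of 1 into a table `A`, `B` yields `A`, `X`, `Y`, `B`; with no enclosing region the result is `A`, `B`, `X`, `Y`.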

Don't try to inline managed methods with unmanaged calling conventions.
This mainly copes with x86 where unmanaged calling conventions use reversed
arg order, but I've disabled it in general. No diffs as these methods seem
to always include EH.

Remove uses of `compXcptnsCount` as this goes stale whenever we clone or
remove EH, or (eventually) inline methods with EH. Instead, rely on
`compHndBBtabCount`.

Defer allocating x86's shadow SP var and area until later in jitting,
so this reflects any changes in EH table structure. In particular we
often are able to eliminate EH in part or all together and this saves
a low-offset allocation and so leads to some nice code size savings on
x86.

Also on x86 remove the runtime-dependent catch class case from the
computation for keeping this alive, as we now transform such into
runtime lookups in filters (that may well keep this alive).

Once we can inline methods with EH, IL ranges are no longer a reliable indicator
of mutual-protect try regions. Instead, after importation, we can rely on
mutual-protect trys having the same start and end blocks.

Also update another case where we were using `info.compXcptnsCount` in morph
to decide if we needed a frame pointer. This lets us simplify the logic around
frame pointers and EH (though I still think we're making up our minds too early).

Contributes to #108900.
This change decouples the "bv" space (used for the connection graph
and related data) from the "lclnum" space, so that the graph nodes can
represent other entities that are not locals.

Conditional escape analysis introduced pseudo-locals, so we already had
this notion, but it was implicit, and extending it further (for say
fields) was proving awkward. So now it is explicit.

The one tricky aspect is that as the analysis proceeds we may introduce
new locals that we want to track, so we need to anticipate this and
leave bv space for them, since our bv length must be known in advance.

Only a subset of locals need to be analyzed, so in some (many?) cases
the overall BV lengths should decrease, though there is now a bit more
work to interpret what the bits mean.

We reuse the tracked var concept and data structures for locals, eg to
map from bv index to local num; for other entities the mappings are
one-to-one so reverse mapping doesn't require table lookups.
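
One way to picture the decoupled index space is the sketch below. The layout (tracked locals first, then a reserved block for locals created mid-analysis, then other entities) and every name in it are assumptions made for illustration; the actual JIT data structures may differ.

```cpp
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical layout, not the JIT's actual data structures:
//   [0, localCapacity)                       tracked locals (table lookups)
//   [localCapacity, localCapacity + pseudos) other entities (plain offsets)
class BvIndexSpace
{
public:
    // `reservedLocals` leaves room for locals introduced while the analysis
    // runs, because the bit vector length must be fixed in advance.
    BvIndexSpace(std::size_t initialLocals, std::size_t reservedLocals, std::size_t pseudoCount)
        : m_localCapacity(initialLocals + reservedLocals), m_pseudoCount(pseudoCount)
    {
    }

    std::size_t length() const { return m_localCapacity + m_pseudoCount; }

    // Returns the bv index for a local, assigning the next free one on first
    // use; returns nullopt once the reserved space is exhausted.
    std::optional<std::size_t> trackLocal(unsigned lclNum)
    {
        auto it = m_lclToBv.find(lclNum);
        if (it != m_lclToBv.end())
        {
            return it->second;
        }
        if (m_bvToLcl.size() == m_localCapacity)
        {
            return std::nullopt;
        }
        m_bvToLcl.push_back(lclNum);
        m_lclToBv.emplace(lclNum, m_bvToLcl.size() - 1);
        return m_bvToLcl.size() - 1;
    }

    // Locals need a table to map back; other entities only need arithmetic.
    unsigned localOf(std::size_t bvIndex) const { return m_bvToLcl.at(bvIndex); }
    std::size_t pseudoIndex(std::size_t pseudoNum) const { return m_localCapacity + pseudoNum; }
    std::size_t pseudoOf(std::size_t bvIndex) const { return bvIndex - m_localCapacity; }

private:
    std::size_t m_localCapacity;
    std::size_t m_pseudoCount;
    std::vector<unsigned> m_bvToLcl;
    std::unordered_map<unsigned, std::size_t> m_lclToBv;
};
```

Because only analyzed locals receive an index, the vector can be shorter than the full local table, at the cost of one extra lookup when interpreting a bit.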
…d extra Implemented Interface (#113721)

* #108964 [Mono] ves_icall_RuntimeType_GetInterfaces generates an extra Implemented Interface

* Added the regression test

---------

Signed-off-by: Medha Tiwari <[email protected]>
* Target NetFrameworkCurrent in tests and remove old mentions

... of previous .NET Framework versions in docs / tests.

NetFrameworkMinimum (net462) -> NetFrameworkCurrent (net481) so that we can
start consuming xunit v3 which requires at least net472. Use
NetFrameworkCurrent as the actual runtime that is then used is
.NET Framework 4.8.1 anyway (innerloop & CI).

* Revert source-generator ref assembly change

* Try to fix VoidMainWithExitCodeApp test

* Fix typo

---------

Co-authored-by: Alexander Köplinger <[email protected]>
This class does not seem to be necessary after the unification of the crypto assemblies.
…der/decoder (#113223)

Turn GcInfoEncoder, GcInfoDecoder, and GcSlotDecoder into templates that accept a type argument that provides encoding traits
Give each target's GcInfoEncoding a unique name so that you can see which one is being used in the debugger
Remove no-op normalization of safe point counts and interruptible range counts
Remove (DE)NORMALIZE_REGISTER since it's a no-op
* Prevent OS core dump creation for intentionally crashing tests

There are three coreclr tests that intentionally run a crashing secondary
process. While the CreateDump invocation on crash for these tests was
already disabled, the OS core dump creation was still happening.
In CI this was causing test machines to run out of disk space.
This change disables OS core dump creation for those tests.
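
For reference, disabling OS core dumps on Linux / macOS generally comes down to setting the core file size limit to zero with `setrlimit`. The snippet below is a minimal native sketch of that idea; `disableCoreDumps` is a hypothetical name, and how the test library actually invokes `setrlimit` is not shown here.

```cpp
#include <sys/resource.h>

// Minimal sketch for Linux/macOS: set the core file size limit to zero so
// that no core file is written when this process crashes. Resource limits
// are inherited by child processes, so a secondary process started
// afterwards gets the same limit. Returns true on success.
bool disableCoreDumps()
{
    struct rlimit limit{};
    limit.rlim_cur = 0; // soft limit
    limit.rlim_max = 0; // hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Lowering the hard limit is irreversible for an unprivileged process, so a call like this belongs only in processes that are meant to crash.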

Close #113652

* Reflect PR

* call the setrlimit explicitly on Linux / macOS only
* fix missing reference to the CoreCLRTestLibrary.csproj in the ParallelCrash.csproj.
  I had accidentally put it into ParallelCrashTester.csproj instead.

* Update src/tests/Common/CoreCLRTestLibrary/Utilities.cs

Co-authored-by: Aaron Robinson <[email protected]>

---------

Co-authored-by: Aaron Robinson <[email protected]>
Adds cDAC support for runtime Frame types:
* `TransitionFrame` (and subclasses)
* `FuncEvalFrame`
* `ResumableFrame` (and subclasses)
* `FaultingExceptionFrame`
* `HijackFrame`
This partially implements ML-KEM for Linux on top of OpenSSL 3.5. The current supported operations are key generation, key import and export, and key encapsulation and decapsulation.
Add Await overloads for configured task awaitables.
rzikm and others added 30 commits April 4, 2025 09:16
* [Test Failure] SslStreamDisposeTest.Dispose_ParallelWithHandshake_ThrowsODE on Unix
Fixes #113833

* fixup! [Test Failure] SslStreamDisposeTest.Dispose_ParallelWithHandshake_ThrowsODE on Unix Fixes #113833

* fixup! fixup! [Test Failure] SslStreamDisposeTest.Dispose_ParallelWithHandshake_ThrowsODE on Unix Fixes #113833

* Update src/libraries/System.Net.Security/tests/FunctionalTests/SslStreamDisposeTest.cs

Co-authored-by: Copilot <[email protected]>

* Fix build

---------

Co-authored-by: Copilot <[email protected]>
* Tweak disabled test for future investigation

Seems to pass with 50000 but fails with 100000.

* Run bridge tests also in tarjan only mode

Initially, we were running the bridge tests as comparison between tarjan and new bridge output. In order for the output to match, certain optimizations need to be disabled for the tarjan bridge and this configuration would be different from standard production configuration. Address this by also running the tests in tarjan only mode. While we don't verify the correctness of the generated SCCs in this configuration, we could still catch certain assertions in the implementation. For example, FauxHeavyNodeWithCycles would lead to crashes in tarjan only configuration.
…utable (#113924)

* Make COR_PRF_DISABLE_OPTIMIZATIONS and COR_PRF_DISABLE_INLINING mutable flags

* On module initialization, capture the profiler JIT flags so they apply consistently on that module

* Change old macro usages and make them call Module::AreJITOptimizationsDisabled() instead

* Test for dynamic assignment of COR_PRF_DISABLE_OPTIMIZATIONS and COR_PRF_DISABLE_INLINING

* Update src/coreclr/vm/ceeload.cpp

Co-authored-by: Jan Kotas <[email protected]>

---------

Co-authored-by: Jan Kotas <[email protected]>
…111408)

While running an APC callback we are not allowed to SetIP, because when CET is enabled the APC callback checks on resume that the IP has not changed.
To avoid this problem, we use the APC to suspend the thread, then enable single-stepping and continue thread execution. This exits the APC callback and pauses at the single step, where we are allowed to SetIP to FuncEvalHijack to run FuncEvals.
When using Type.GetType, always throw FileLoadException for invalid assembly names no matter what throwOnError is set to.
When using Assembly.GetType, throw only when throwOnError is set, and throw ArgumentException (not FileLoadException).

Fixes #113534
…114227)

Presence of `.cctor` in `Thread` can cause circular dependency if Lock needs to block while Thread .cctor has not run yet.

1. Lock needs to wait on a WaitHandle
2. WaitHandle needs Thread.CurrentThread
3. If Thread's .cctor has not run yet, it needs to run.
(It is unusual for this to be the first use of Thread, but the activation pattern in #113949 made it possible.)
4. The .cctor needs to take a Lock, so we go back to step 1.

Fixes: #113949
…#114270)

Remove nonexistent namespace check and avoid comparing the same 25 characters over and over for every System.Runtime.Intrinsics method.
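
The idea of not re-comparing the shared prefix can be sketched like this: check the 25-character `System.Runtime.Intrinsics` prefix once, then match only the remainder. `stripIntrinsicsPrefix` is a hypothetical helper for illustration and not the code changed by this commit.

```cpp
#include <string_view>

// Hypothetical sketch: test the shared namespace prefix a single time and
// hand back the remainder, so later comparisons only look at the suffix.
// Returns true and sets `rest` when `ns` starts with the prefix.
bool stripIntrinsicsPrefix(std::string_view ns, std::string_view& rest)
{
    constexpr std::string_view kPrefix = "System.Runtime.Intrinsics";
    static_assert(kPrefix.size() == 25, "prefix length referenced in the commit message");

    if (ns.substr(0, kPrefix.size()) != kPrefix)
    {
        return false;
    }
    rest = ns.substr(kPrefix.size());
    return true;
}
```

A caller would then compare `rest` against short suffixes such as `.X86` or `.Arm` instead of repeating the full namespace comparison for every candidate.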

---------

Co-authored-by: Copilot <[email protected]>
Azure Linux has a regression in crypto stack that leads to intermittent hangs.

Workaround dotnet/dnceng#5329
… TYP_MASK (#113864)

* optimize ConditionalSelect with const zero

* remove superfluous LowerNode calls

* Revert "remove superfluous LowerNode calls"

This reverts commit 6dae598.

* just remove one LowerNode call

* Revert "just remove one LowerNode call"

This reverts commit 17886e7.
* fix the mask values returned by Helper

* Address review feedback from copilot

* Remove unrequired cast

* Fix MaskBothSet method

* Make sure to use unsigned version for GetMask*
#113286)

This is a combination of @am11's work in PR #109087 and some work to just rewrite the Windows X86 helpers in assembly.
  - For non 64-bit platforms, remove the helpers
  - For Unix platforms, rely on an implementation which uses an FCall to native code to invoke the various operations.
  - For Windows X86 on CoreCLR, rewrite the helpers in assembly with a tail-call approach for throwing.
    - Also it was noted that the existing helpers do a fair bit of unnecessary stack manipulation, and the manual rewrite avoids all of that. Maybe this will actually be faster. (Turns out it is about the same on performance. The real win would come from re-ordering the argument order, but that's a much more invasive change in the JIT, which I judge to be too risky)
cmake 4.0 no longer sets the CMAKE_OSX_SYSROOT variable for macOS targets: https://cmake.org/cmake/help/v4.0/release/4.0.html#other-changes

> Builds targeting macOS no longer choose any SDK or pass an -isysroot flag to the compiler by default. Instead, compilers are expected to choose a default macOS SDK on their own. In order to use a compiler that does not do this, users must now specify -DCMAKE_OSX_SYSROOT=macosx when configuring their build.

We need to stop passing the variable to swiftc in that case and rely on the default behavior.
Adds `ExportPkcs8PrivateKey`, `TryExportPkcs8PrivateKey`, and `TryExportPkcs8PrivateKeyCore` to `MLKem`.
…ence-packages build 20250404.2 (#114283)

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 10.0.617601 -> To Version 10.0.620402

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
* Update maintenance-packages versions

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20250404.2

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 10.0.617601 -> To Version 10.0.620402

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Part of #107749. The latter half of the JIT frontend contains numerous optimizations that rely on block weights to compute the profitability of their transformations, and may benefit from more consistent profile data. Morph and flow opts frequently redirect flow out of blocks and into other ones, and the corresponding tweaks to the profile are usually too local to reduce/increase flow along affected paths. This gives downstream phases inaccurate ideas about which blocks are newly cold/hot, and other profile transformations (such as fgExpandRarelyRunBlocks) can further propagate inaccuracies. Thus, re-running synthesis shortly after morph seems like an opportune and cheap place to fix the profile. Like the late profile synthesis run, we aren't interested in changing edge likelihoods here -- we just want to propagate changes in block weights through the flowgraph.
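
As a rough illustration of propagating block weights while leaving edge likelihoods untouched, consider the sketch below. It is heavily simplified: it assumes a loop-free flowgraph whose blocks are already in topological order, and `Edge` and `propagateWeights` are hypothetical names rather than the JIT's profile synthesis code.

```cpp
#include <cstddef>
#include <vector>

struct Edge
{
    std::size_t target;     // index of the successor block
    double      likelihood; // fraction of the source block's weight on this edge
};

// Simplified sketch: blocks are assumed to be listed in topological order
// (no loops) and block 0 is the method entry. Edge likelihoods are treated
// as fixed inputs; only the block weights are recomputed.
std::vector<double> propagateWeights(const std::vector<std::vector<Edge>>& successors,
                                     double entryWeight)
{
    std::vector<double> weights(successors.size(), 0.0);
    if (weights.empty())
    {
        return weights;
    }
    weights[0] = entryWeight;
    for (std::size_t block = 0; block < successors.size(); ++block)
    {
        for (const Edge& edge : successors[block])
        {
            weights[edge.target] += weights[block] * edge.likelihood;
        }
    }
    return weights;
}
```

Handling loops requires more than this single pass, so the sketch only conveys the direction of the data flow: weights follow from the entry weight and the existing likelihoods.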

I also included some dead code cleanup I meant to do in a previous PR.
…semblyName (#114289)

Apply the fix to NativeAOT TypeNameResolver too

Fixes #114288
Update the coredistools.h header file with the source copy
in dotnet/jitutils. A new coredistools binary will hopefully
soon be updated to match. Nonetheless, enabling the build enables
the new coredistools binaries but does not require it.

Co-authored-by: Jan Kotas <[email protected]>
* Remove unused and unnecessary CLRConfigs.

Co-authored-by: Jan Kotas <[email protected]>
This code makes the current stack state match the stack state expected by a bblock that we transition to. The code was accidentally offsetting from the current stack pointer, rather than from the base of the stack.
Avoids issues with source-build overwriting them.
* Removing unnecessary comments

* Further reducing the number of methods
)

* add a test with comment that explains the unexpected behavior

* adjust the AssemblyNameInfoFuzzer

* mention normalization as well