Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype gchandle #2

Open
wants to merge 54 commits into
base: main
Choose a base branch
from

Conversation

jkotas
Copy link
Owner

@jkotas jkotas commented Mar 14, 2025

No description provided.

// Promote the object and let the bridge indicate when the
// object is actually dead.
Ops* pOps = (Ops*)lp1;
pOps->pfn(&pObj, pOps->sc, 0);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we want to be promoting when collecting the list of potentially dead objects.

Promoting an object can end up promoting other objects recursively. Some of these other objects may be participating in the GC bridge as well, and so we would miss them when building the graph.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promoting the objects should be a separate pass.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs more investigation. I need the GC team or more time to learn a better approach in this area.

_ASSERTE(lp1 != NULL);
_ASSERTE(lp2 != NULL);

Object *pObj = VolatileLoad((PTR_Object*)pObjRef);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not need to be VolatileLoad. Nothing is touching these references while this is running.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. This was copy/paste from other callbacks.

return (void*)instHandle;
}

extern "C" BOOL QCALLTYPE JavaMarshal_GetContext(
Copy link
Owner Author

@jkotas jkotas Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an FCall - if perf of this call matters.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

return;
}

g_mcrargs = mcrargs;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a subtle race condition. If the system is overloaded and it did not get a chance to run since the last turn of the crank, we can still see g_BridgeReady set to TRUE, but g_mcrargs is still going to point to the old payload that we are going to leak by overwriting the point.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire thread operation is completely dirty and a simple prototype. It all needs to be redone.

@@ -2562,6 +2562,10 @@ static PCODE PreStubWorker_Preemptive(
// more we can do except fail fast. The reverse P/Invoke isn't
// going to work.
CREATETHREAD_IF_NULL_FAILFAST(currentThread, W("Failed to setup new thread during reverse P/Invoke"));

// [TODO] Ensure thread is in Preemptive mode.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is a hack that will be reverted.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a hack, at least I don't think it is. There is something wrong in this code path and I'm not sure yet what it is, but it seems like a subtle bug. We are creating a thread above in CREATETHREAD_IF_NULL_FAILFAST and it is in COOP mode. I'm unclear how anything I am doing can influence this at present. I put this in a separate commit to investigate tomorrow.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are missing HasStarted call when setting up the new thread. Creating a new thread in native CoreCLR is a complicated handshake. You may want to create the thread on the managed side - it is much easier and less bug prone to do that way, and it can be reused for NAOT.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked. Thanks.

@AaronRobinsonMSFT
Copy link

@jkotas I applied some of the feedback that was structural. The handshake for the bridge still needs to be redone and we can consider the FCall for GetContext, I think that is reasonable. The GC logic remains as-is since it all needs collaboration with the GC team.

Comment on lines 14 to 17
#if !NATIVEAOT
private static Thread? s_bridgeThread;
#endif

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if !NATIVEAOT
private static Thread? s_bridgeThread;
#endif

Nit: It does not need to be stored in a static.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in, just allocate it and don't worry about it?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, allocate it, start it and don't worry about it.

am11 and others added 20 commits March 16, 2025 06:04
In particular we could create `LCL_FLD` or `LCL_VAR` nodes for address
exposed locals without marking them with `GTF_GLOB_REF`. This would
result in not creating some necessary copies.
Loop cloning with EH had a bug when:
* the loop contained a try T0 that was within a handler H1
* the associated try T1 was also within the loop
* the entire loop was within another try T2

Here T0's containing try is T2, and its enclosing try index was properly
adjusted when the EH table expands as part of cloning. We were doing another
index adjustment later which lead to an out of bounds index.

Also cleaned up the code that detects which try regions need cloning as in the
above we only need to consider T1. When we go to clone it we also will clone H1
 and hence T2.

Also fixed/added some more dumping, and some new test cases.

This fix addresses a problem that came up stress testing dotnet#112998. Since the
underlying bug does not require inlining to create the EH setup above,
I am fixing it separately.
…ence-packages build 20250314.1 (dotnet#113593)

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 10.0.615601 -> To Version 10.0.616401

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
* Fix lifetime of data passed to native code during SslStream handshake on Windows.

* Remove unnecessary init

* Fix init
….6998.89 (dotnet#113577)

* Automated bump of chrome version

* Update V8 version in BrowserVersions.props

Revert v8 version update (V8 for Linux is still not available and a newer stable version was not published yet).

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ilona Tomkowicz <32700855+ilonatommy@users.noreply.github.com>
The marshalling rule is that `Marshal.PtrToStructure` will use the parameterless constructor to create instance and refuse working if there isn't one, but p/invoke will use `GetUninitializedObject` and ignore any constructors.
Co-authored-by: Jakub Dražka <kubadrazka@microsoft.com>
Co-authored-by: campersau <buchholz.bastian@googlemail.com>
Co-authored-by: pavelsavara <pavel.savara@gmail.com>
This API is still generally broken because the partial implementation from dotnet/corert#8144 is still only partial, but this should either return correct result or throw. Here we can compute the correct result. Full fix still tracked at dotnet#89157.
…ents (dotnet#113499)

LSRA can assign different registers to the same local over its lifetime.
When this happens for a parameter, there is special logic to ensure that
the parameter gets its right initial register assignment. This handling
was missing for mapped parameter register targets.
…otnet#113501)

No longer used with recent work to switch to new ABI info.
dotnet#113525)

* Do not set SSL/TLS protocols to WinHTTP and let the OS chose.

* Remove SecurityProtocol.cs.

* Feedback
* Switch to AwesomeAssertions

* React to brekaing API changes
- Parse optional properties in global.json:
  ```
  {
      "sdk": {
          "paths": [ "path/to/sdk/root", "$host$" ],
          "errorMessage": "<message>"
      }
  }
  ```
  - `paths` can be absolute or relative to the global.json. `$host$` is a special value indicating the running `dotnet` path
  - `errorMessage` is shown on SDK resolution failure (instead of the host's default SDK resolution error message).
- Make `hostfxr` search for SDKs with custom paths if specified and error out with custom message if specified
  - `dotnet <command>` - global.json based on process working directory, runs command using resolved SDK respecting global.json paths
  - `hostfxr_resolve_sdk2` - global.json based on working directory argument, returns resolved SDK respecting global.json paths
  - `--info` and `--list-sdks` - global.json based on process working directory, prints found SDKs respecting global.json paths
  - `hostfxr_get_available_sdks` and `hostfxr_get_dotnet_environment_info` - global.json based on process working directory, returns found SDKs respecting global.json paths
- Update error and tracing messages to include information about configured search paths
MichalStrehovsky and others added 28 commits March 17, 2025 15:18
…otnet#113413)

Native layout is a parallel form of metadata that was introduced in .NET Native mostly because in the structuring of that compiler, the decisions about reflection metadata and decisions about native layout happened in different components written in different languages (reflection metadata in C# during IL2IL transforms, native layout in C++ in NUTC).

Native layout has its own representation of methods and fields. There is a bunch of code to read and reconcile those two formats at runtime. This then introduces problems such as in dotnet#91381 where updating one format necessitates updating the other one.

Instead of adding more code to native layout generation/reading/reconciliation to account for custom modifiers, I'm just deleting ~2000 lines of code responsible for native layout metadata from the product instead.

One thing I needed to adjust was that native layout metadata is more compact than reflection metadata - so I added a new way of emitting reflection metadata for methods where the generated method won't have any custom attributes or parameter names. This hobbled method is not reachable with trim safe code. If someone gets a hold of it, more things will be broken than just the ability to invoke it.
* LoongArch is a NEW ISA which is different from MIPS

reference:https://docs.kernel.org/arch/loongarch/introduction.html
Introduction to LoongArch

* Update src/coreclr/jit/codegenlinear.cpp

---------

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Adeel Mujahid <3840695+am11@users.noreply.github.com>
* EGPR-encoding-part-1

* EGPR-encoding-part-2

* bug fix

* revert LSRA changes.

* formatting.

* Make sure KMOV does not use EVEX under X86
* Refactor geometric type converters for improved parsing

This commit refactors the parsing logic for string representations of geometric types (Point, Rectangle, Size, SizeF) by using `Span<Range>` to enhance performance and reduce memory allocations.

Exception handling has been improved with more specific error messages for parsing failures.

Additionally, tests have been updated to reflect changes in expected values and formats, covering a wider range of cases, including edge cases and whitespace handling.

The syntax for array initialization has been modernized to use collection initializers for better readability.

Overall, these changes enhance the robustness, readability, and performance of the type conversion code.

* Refactor CanConvertTo methods for clarity

This commit refactors the `CanConvertTo` method in the `PointConverter`, `RectangleConverter`, `SizeConverter`, and `SizeFConverter` classes.

The changes enhance readability and maintainability by replacing the use of an array to hold converted values with direct string interpolation for constructing the return value. This results in clearer and more concise code.

* Optimize ConvertFrom methods in converter classes

Refactor the `ConvertFrom` method in `PointConverter`, `RectangleConverter`, `SizeConverter`, and `SizeFConverter` to use `AsSpan().Trim()` for improved performance by reducing string allocations.
…tnet#111644)

* handle containment for EVEX-encoded shift instructions

* use tuple type instead of instruction list
This is consistently failing on osx after updating Helix to osx13.
The fix replaces the native implementation of the native format decoder with the clone of the managed implementation.

Fixes dotnet#113609
* First round of changes to build on wasm

* More changes to build on wasm

* Next round of changes

It builds now

* Fix debug build

* Fix paste typo

* Feedback

* Feedback

It is not needed anymore

* Feedback

This is not needed anymore as FEATURE_METADATA_UPDATER is now disabled on wasm

* Feedback

UMEntryThunkCode was removed from main recently

* Feedback

Disable FEATURE_VIRTUAL_STUB_DISPATCH and enable FEATURE_CACHED_INTERFACE_DISPATCH

* Feedback

Remove old host changes

* Build runtime on CI

* Inline few functions in header files

* Feedback

Forgot to remove this as well

* Add missing file

* Feedback

Add assert to Thread::VirtualUnwindCallFrame

* Remove unwanted change

* Remove unwanted change
* Add util code for working with IL opcodes

The opcode name, its argument type and a method for decoding the opcode enum value from IL byte stream.

* Implement move opcodes

These will be used for arg/local loads/stores and various other move operations.

* Implement conv opcodes

With some floating point exceptions that require special handling

* Traverse IL and build the initial set of basic blocks

This is done as an initial iteration over the IL code, with the main objective of creating the initial set of basic blocks and mapping from il offset to basic block. In future changes, when we compile the code, these basic blocks will be linked together, according to the control flow resulting from IL instructions.

* Add simple array data structure

The interpreter will require additional data structures in the future: linked list, hashtable, bitset.

* Finish basic block support and implement branches

From the initial set of basic blocks we will start generating code for the current method. Each basic block has a stack state, representing the state of the IL stack at the moment when we enter. Codegen starts from a dummy entry bblock which has an empty stack state. As we iterate over instructions, we check if a new bblock starts at this offset (this is determined based on the m_ppOffsetToBB mapping created in a previous commit). If we are starting generating code inside a new bblock, various things need to be done. We need to check if the basic block is a fall through from the previous bblock (in which case we initialize the stack state of this new bblock). Also if the bblock is not fall through, we can only generate code into it if the stack state is initialized. Otherwise, we will skip through its instructions and, once the traverse the entire IL, we will do a reiteration generating code only in the bblocks that haven't yet been emitted.

When emitting the final code, one basic block at a time, we update the real native offset of each basic block. When we need to emit a branch, in case we know the offset of the target bblock, we emit it directly in the code. Otherwise we add a relocation record in an array. Once we finish generating code for all basic blocks, we traverse the relocation array and patch all recorded instruction slots with the now known branch offset.

* Add support for ldarg + ldloc

Add a new InterpTypeByRef, to be used to precisely track vars that might contain interior pointers and need to be reported to the GC. We didn't have this type on mono.

* Add binary and unary operators

* branch floating point comparisons

squash into branches

* Add support for direct calls

Unlike other instructions, INTOP_CALL doesn't have a fixed number of sVars, since the method to be called can have any signature. Instead it will hold a -1 terminated array of sVars. The arguments will be placed one after the other, at an offset referred to as callArgsOffset, representing the only source offset for the call instruction. Vars that are sources for a call will be allocated in this sequential position by the var offset allocator, which is not yet included as part of this commit.

INTOP_CALL instruction receives an additional fixed argument, which is a tagged `MethodDesc*`. Rather than embedding the pointer into the instruction stream, this is passed as an index into a table. This will both reduce code memory use, since identical data will be stored at the same index, as well as provide us with an unique slot to atomically patch the data. During first execution of a call, we would observe that the data item is a tagged pointer, suggesting this is a MethodDesc*, in which case we try to obtain the actual method code, triggering method compilation if necessary. Once this is done, the MethodDesc pointer will be replaced with the actual code pointer. In the future, this pointer can be either interpreter IR or jit compiled code.

Given we currently only have support to specify a single method to be interpreted, in order to expand on this, this commit introduces a temporary hack behavior where once we enter the interpreter, we don't exit it for methods from the same module. In the future we might want to switch this to passing an env var specifying which assembly to be interpreted and which to be jitted.

* Add the offset allocator for vars

The var offset allocator allocates for each var a positive offset. All instructions will end up referencing this offset instead of the var index. During execution, these offsets are added to the frame's stack pointer to obtain the actual location on the interpreter stack for the value.

We have three logical areas on the interpreter stack where vars reside. The global vars space, the local vars space followed at last by the param area space. In the global vars space, each global var will have an offset allocated to it, that its constant throughout method execution. This space is first filled by the argument vars (which must be at the beginning of the stack, according to the interpreter ccall convention). Then we allocate each IL local in this space. Once we finish generating code for the method, we enter the actual var offset allocator.

First we detect which vars are used in multiple basic blocks, these vars will be marked as global and have an unique offset allocated. The space following the global vars will be used for allocating vars used in a single basic block. We will traverse each bblock and, as a local var is defined, we will allocate it at the current offset. We have liveStart and liveEnd computed for each local var (measured as instruction indexes in the corresponding basic block). Based on these liveness markers, we determine when to pop vars from the set of active vars (vars that are currently live).

Variables that are arguments to a call will be allocated instead in the param area. Given we don't know the exact offset of the param area during bblock iteration (because it follows the local vars space, which is determined only after traversing all bblocks), these vars will be initially allocated with an offset relative within the param area. At the very end, we will offset these with the param area offset. Vars passed to calls have to die immediately following the call. If the call arg var can't satisify this constraint (for example the var is global, or a local var that could be referenced following the call), we will create a new temporary var that we copy the arg into, and this new var will serve as the call arg instead.

* Add verbose compilation logging

This will have to be enabled via an env var, rather than statically in code.

* Add a temporary intrinsic for Environment.FailFast

Because we currently don't have support for interoping between interpreter and jit, in order to signal test failure in interpreter test we will just crash instead.

* Add a few sanity tests to ensure new interp functionality mostly works

Minor fixes to ensure tests work. Add support for returns of any type. Add opcode for loading full I4.

* Address PR feedback

* Add gc state transition
…3625)

OSR is meant to be used when we can't escape, but these tailcalls are
natural points where we're actually able to escape. So prefer call
counting + tiering + potential transition into the tier 1 code in these
situations.
* Link peer's X509 stack handle to parent SSL safe handle

* Update src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.Ssl.cs

Co-authored-by: Kevin Jones <vcsjones@github.com>

* Add test for dispose parallel with handshake.

---------

Co-authored-by: Kevin Jones <vcsjones@github.com>
…ible (dotnet#113632)

* Skip QIs for 'IReferenceTracker' when possible

* Use 'try/finally' and just 'IntPtr'
…t#111778)

* enable containment for rorx/sarx/shlx/shrx

* update emitExtractEvexPrefix
Collect the handles and place them in a pseudo SCC to
demonstrate the pattern.
The VM now allocates a thread for the bridge to execute
the bridge callback.
a reverse P/Invoke scenario when Preemptive is expected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet