-
Notifications
You must be signed in to change notification settings - Fork 209
[RuntimeAsync] Switch the implementation to the new shape of runtime async methods. #2818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
All tests work again, except These two hit some asserts in JIT. I suspect wrapped/unwrapped mismatches in return values, but it could take me quite some time to trace why that happens. Like something is picking up @jakobbotsch - could you take a look what is going on with these tests? |
@@ -55,17 +55,27 @@ EXTERN_C VOID STDCALL NDirectImportThunk(); | |||
#define METHOD_TOKEN_RANGE_BIT_COUNT (24 - METHOD_TOKEN_REMAINDER_BIT_COUNT) | |||
#define METHOD_TOKEN_RANGE_MASK ((1 << METHOD_TOKEN_RANGE_BIT_COUNT) - 1) | |||
|
|||
enum class AsyncMethodType | |||
enum class AsyncMethodKind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, but why do you prefer "kind" over "type"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think inside the type system implementation referring to someting as "type" when it is not a type adds extra confusion.
"Kind" seems slightly better, especially when we end up with fields/locals called "type".
src/coreclr/vm/method.hpp
Outdated
}; | ||
|
||
struct AsyncMethodData | ||
{ | ||
AsyncMethodType type; | ||
AsyncMethodKind type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be renamed as well then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I missed a few places. Mostly was looking at test failures after things started failing after the change.
src/coreclr/vm/method.hpp
Outdated
{ | ||
LIMITED_METHOD_DAC_CONTRACT; | ||
if (!HasAsyncMethodData()) | ||
return false; | ||
AsyncMethodType asyncType = GetAddrOfAsyncMethodData()->type; | ||
return asyncType == AsyncMethodType::AsyncToTask || asyncType == AsyncMethodType::TaskToAsync; | ||
auto asyncType = GetAddrOfAsyncMethodData()->type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto asyncType = GetAddrOfAsyncMethodData()->type; | |
AsyncMethodKind asyncType = GetAddrOfAsyncMethodData()->type; |
and potentially a further rename of the variable.
// We use "async2" for runtime async methods that return "Unwrapped" values (i.e. T instead of Task<T>) | ||
// The type of promise is typically captured in a modreq. | ||
// CONSIDER: We probably need a better name for the concept, but it is hard to beat shortness of "async2" | ||
inline bool IsAsync2Method() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see good reason to have both this and IsAsyncHelperMethod
. It seems like one of them should just be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Thunk" and "Helper" were the same thing before this change and then had to be sorted out on what we actually want in every place where "IsThunk" was used.
Now we have the same situation with "IsHelper" and "IsAsync2". They are the same in current implementation, but fundamentally do not need to be. We may end up with a scenario where we have to declare something that is an "IsAsync2" but not a helper. - perhaps for internal helpers like UnsafeAwaitAwaiterFromRuntimeAsync
There are quite a few calls to IsAsync2Method
and even more to IsAsyncHelperMethod
I'd like to keep these two separate for now. It is easy to fold then into one when we completely sure.
src/coreclr/vm/method.hpp
Outdated
|
||
return asyncType == AsyncMethodType::Async; | ||
return asyncType == AsyncMethodKind::Async; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be AsyncImplHelper
, since you changed the meaning of Async
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is a bug.
_ASSERTE(false); | ||
pNewMethod->SetAsyncMethodKind(AsyncMethodKind::Async); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like some kind of TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is dead code now. I tried to limit diffs initially - to see what has changed, but this can be now removed.
src/coreclr/vm/prestub.cpp
Outdated
CorElementType etype; | ||
IfFailThrow(pSig.PeekElemType(&etype)); | ||
|
||
if ((etype & ELEMENT_TYPE_GENERICINST) != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ((etype & ELEMENT_TYPE_GENERICINST) != 0) | |
if (etype == ELEMENT_TYPE_GENERICINST) |
This is not a flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can it be an assert? I think it should be impossible for this not to be a Task<T>
or ValueTask<T>
instantiation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I thought of trying to keep the capability of parsing the old (modreq'd) signatures - in case we need that for some internal reasons, but checking for ELEMENT_TYPE_GENERICINST
here is not the way. I think it would not work if we have a modreq'd signature that returns Task<T>
as we would unwrap both modreq and Task wrappers.
Yes. It would be simpler if this is just an assert.
If we need to parse something else (I have more and more doubts) we would just have a different helper.
It looks like Roslyn is producing illegal IL. E.g. for simple-eh, Roslyn is storing an // Token: 0x06000003 RID: 3 RVA: 0x000020B4 File Offset: 0x000002B4
.method public hidebysig static
class [System.Runtime]System.Threading.Tasks.Task`1<int32> Handler () cil managed securitymitigations
{
// Header Size: 12 bytes
// Code Size: 20 (0x14) bytes
// LocalVarSig Token: 0x11000002 RID: 2
.maxstack 1
.locals init (
[0] class [System.Runtime]System.Threading.Tasks.Task`1<int32>
)
.try
{
/* 0x000002C0 1F2A */ IL_0000: ldc.i4.s 42
/* 0x000002C2 281600000A */ IL_0002: call int32 modreq([System.Runtime]System.Threading.Tasks.Task`1) Async2SimpleEH::Throw(int32)
/* 0x000002C7 0A */ IL_0007: stloc.0 // <- this stores int into Task<int>
/* 0x000002C8 DE08 */ IL_0008: leave.s IL_0012
} // end .try
catch Async2SimpleEH/IntegerException
{
/* 0x000002CA 7B01000004 */ IL_000A: ldfld int32 Async2SimpleEH/IntegerException::Value
/* 0x000002CF 0A */ IL_000F: stloc.0
/* 0x000002D0 DE00 */ IL_0010: leave.s IL_0012
} // end handler
/* 0x000002D2 06 */ IL_0012: ldloc.0
/* 0x000002D3 2A */ IL_0013: ret
} // end of method Async2SimpleEH::Handler |
Right. These locals are introduced very late - in emit phase, after async rewrite would have happened, so they pick the wrapped return type. |
Co-authored-by: Jakob Botsch Nielsen <[email protected]>
src/tests/async/collectible-alc.cs
Outdated
@@ -17,14 +17,20 @@ public static void TestEntryPoint() | |||
AsyncEntryPoint().Wait(); | |||
} | |||
|
|||
[MethodImpl(MethodImplOptions.NoInlining)] | |||
private static void CollectOnce() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a workaround for an assert failure.
The assert is caused by this PR (as in - it does not happen without the changes, but I am not familiar enough with JIT codebase to tell right away how it is related to the changes. Seems like something related to PInvoke setup in an async2 method . . .
without the workaround:
Assert failure(PID 55592 [0x0000d928], Thread: 16608 [0x40e0]): Assertion failed 'comp->fgFirstBBisScratch()' in 'Async2CollectibleAlc:AsyncEntryPoint()' during 'Lowering nodeinfo' (IL size 49; hash 0x76f7a2c6; FullOpts)
File: E:\dotnetA2\runtimelab\src\coreclr\jit\lower.cpp:5905
Image: E:\dotnetA2\runtimelab\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\corerun.exe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a new issue, it reproduces also before this PR with DOTNET_TieredCompilation=0
, so we can ignore this particular issue.
However, the fact that it reproduces here means that tiered compilation is not working with the new scheme. The new thunks seem to be handled differently from metadata functions with respect to PGO, optimizations, etc.. Can you take a look at that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for the pinvoke issue is up in #2847.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new thunks seem to be handled differently from metadata functions with respect to PGO, optimizations, etc.. Can you take a look at that?
The thunks were explicitly excluded from tiering. perhaps because thunk methods are too small.
Exclusion of thunks is one of a few places that should not switch to IsAsyncHelperMethod()
and should continue using IsAsyncThunkMethod
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default everything that was using IsAsyncThunkMethod()
switched to IsAsyncHelperMethod()
, but in this case it was incorrect and effectively disabled tiering of runtime async methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to validate that PGO works now, but the new Roslyn prototype fails to compile my test example:
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
using Xunit;
public class Async2Pgo
{
[Fact]
public static void EntryPoint()
{
AsyncEntryPoint().Wait();
}
internal static async2 Task<int> AsyncEntryPoint()
{
int[] arr = Enumerable.Range(0, 100_000).ToArray();
int sum = 0;
for (int i = 0; i < 4; i++)
{
for (int j = 0; j < 100; j++)
sum += await AggregateDelegateAsync(arr, new AggregateSum(), 0);
await Task.Delay(100);
}
return sum;
}
private class AggregateSum : I<int>
{
#pragma warning disable CS1998
public async2 Task<int> Aggregate(int a, int b) => a + b;
}
public interface I<T>
{
public Task<T> Aggregate(T seed, T val);
}
public static async2 Task<T> AggregateDelegateAsync<T>(T[] arr, I<T> aggregate, T seed)
{
foreach (T val in arr)
seed = await aggregate.Aggregate(seed, val);
return seed;
}
}
C:\Users\Jakob\.nuget\packages\microsoft.net.compilers.toolset\4.13.0-async-11\tasks\netcore\Microsoft.CSharp.Core.targets(89,5): er
ror : Process terminated. System.NullReferenceException: Object reference not set to an instance of an object. [C:\dev\dotnet\runtim
elab\src\tests\async\pgo.csproj]
C:\Users\Jakob\.nuget\packages\microsoft.net.compilers.toolset\4.13.0-async-11\tasks\netcore\Microsoft.CSharp.Core.targets(89,5): er
ror : at Microsoft.CodeAnalysis.CSharp.Symbols.Async2ThunkForAsyncMethod.get_ReturnTypeWithAnnotations() in C:\dev\dotnet\async-r
oslyn-repo\src\Compilers\CSharp\Portable\Symbols\AsyncThunks.cs:line 83 [C:\dev\dotnet\runtimelab\src\tests\async\pgo.csproj]
C:\Users\Jakob\.nuget\packages\microsoft.net.compilers.toolset\4.13.0-async-11\tasks\netcore\Microsoft.CSharp.Core.targets(89,5): er
ror : at Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol.get_ReturnType() in C:\dev\dotnet\async-roslyn-repo\src\Compilers\CSh
arp\Portable\Symbols\MethodSymbol.cs:line 251 [C:\dev\dotnet\runtimelab\src\tests\async\pgo.csproj]
C:\Users\Jakob\.nuget\packages\microsoft.net.compilers.toolset\4.13.0-async-11\tasks\netcore\Microsoft.CSharp.Core.targets(89,5): er
ror : at Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol.Microsoft.Cci.ISignature.GetType(EmitContext context) in C:\dev\dotne
t\async-roslyn-repo\src\Compilers\CSharp\Portable\Emitter\Model\MethodSymbolAdapter.cs:line 245 [C:\dev\dotnet\runtimelab\src\tests\
async\pgo.csproj]
...
It works fine with the previous prototype, and PGO kicks in with full hoisting and opts:
G_M19365_IG03: ;; offset=0x0018
test rbx, rbx
je SHORT G_M19365_IG07
mov r8, 0x7FF8320A6DB0 ; Async2Pgo+AggregateSum
cmp qword ptr [rbx], r8
jne SHORT G_M19365_IG07
add rdx, 16
align [0 bytes for IG04]
;; size=24 bbWeight=0.99 PerfScore 5.72
G_M19365_IG04: ;; offset=0x0030
mov r11d, dword ptr [rdx]
add r9d, r11d
add rdx, 4
dec ecx
jne SHORT G_M19365_IG04
;; size=14 bbWeight=165.40 PerfScore 620.24
G_M19365_IG05: ;; offset=0x003E
mov eax, r9d
xor ecx, ecx
;; size=5 bbWeight=1.00 PerfScore 0.50
G_M19365_IG06: ;; offset=0x0043
add rsp, 32
pop rbx
pop rsi
pop rdi
ret
This reverts commit ef45803.
I am still looking at reenabling tiered compilation. The part where we go back from native code -> method def -> method desc is not right for async methods. We start with implementation code and arrive to the method desc for a thunk, thus tiering up fails. The problem is understood, but need to figure what/where the right fix would be. |
Tiered compilation works again. All current tests pass. I will look at Roslyn prototype issues with the PGO sample above next. |
@jakobbotsch - the PGO test should build now. I've added a test in the Roslyn prototype branch, and it builds, but I have not yet try running it. All existing tests pass though. |
@jakobbotsch I have added your sample as a testcase. It looks like tiering, PGO and OSR are working (if I read the following correctly). ; Assembly listing for method Async2Pgo:AsyncEntryPoint():int (Tier1-OSR)
; Emitting BLENDED_CODE for X64 with AVX512 - Windows
; Tier1-OSR code
; OSR variant for entry point 0x32
; async2
; optimized code
; optimized using Dynamic PGO
; rsp based frame
; fully interruptible
; with Dynamic PGO: fgCalledCount is 116.0628
; 5 inlinees with PGO data; 15 single block inlinees; 2 inlinees without PGO data
; Final local variable assignments
;
; V00 AsyncCont [V00,T10] ( 6, 3.86) ref -> rcx single-def "Async continuation arg"
; V01 loc1 [V01,T08] ( 3, 98.56) ref -> rdi class-hnd <int[]>
; V02 loc2 [V02,T07] ( 5, 197.13) int -> rsi
; V03 loc3 [V03,T13] ( 5, 4.32) int -> rbp
; V04 loc4 [V04,T04] ( 5, 298.56) int -> rbx
;* V05 loc5 [V05 ] ( 0, 0 ) int -> zero-ref
; V06 OutArgs [V06 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V07 tmp1 [V07 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "NewObj constructor temp" <Async2Pgo+AggregateSum>
;* V08 tmp2 [V08 ] ( 0, 0 ) int -> zero-ref "Inline return value spill temp"
; V09 tmp3 [V09,T05] ( 3, 295.10) ref -> rax class-hnd "Inline stloc first use temp" <int[]>
;* V10 tmp4 [V10 ] ( 0, 0 ) int -> zero-ref "Inline stloc first use temp"
; V11 tmp5 [V11,T03] ( 2,32932.47) int -> r8 "Inline stloc first use temp"
; V12 tmp6 [V12,T00] ( 4,66259.19) int -> rcx "Inlining Arg"
;* V13 tmp7 [V13 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline stloc first use temp" <System.Runtime.CompilerServices.TaskAwaiter>
;* V14 tmp8 [V14 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inline ldloca(s) first use temp" <System.Threading.CancellationToken>
;* V15 tmp9 [V15 ] ( 0, 0 ) struct ( 8) zero-ref "Inlining Arg" <System.Threading.CancellationToken>
; V16 tmp10 [V16,T12] ( 7, 5.47) ref -> r14 class-hnd "Inline return value spill temp" <System.Threading.Tasks.Task>
;* V17 tmp11 [V17 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "Inlining Arg" <System.Threading.CancellationToken>
; V18 tmp12 [V18,T11] ( 2, 5.76) ref -> rbx class-hnd exact "Inlining Arg" <System.TimeProvider+SystemTimeProvider>
;* V19 tmp13 [V19 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "NewObj constructor temp" <System.Threading.Tasks.Task+DelayPromiseWithCancellation>
; V20 tmp14 [V20,T09] ( 3, 8.64) ref -> r14 class-hnd exact "NewObj constructor temp" <System.Threading.Tasks.Task+DelayPromise>
;* V21 tmp15 [V21 ] ( 0, 0 ) ubyte -> zero-ref "Inline return value spill temp"
;* V22 tmp16 [V22 ] ( 0, 0 ) ref -> zero-ref class-hnd "Inlining Arg" <System.Threading.CancellationTokenSource>
;* V23 tmp17 [V23 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op "NewObj constructor temp" <System.Runtime.CompilerServices.TaskAwaiter>
;* V24 tmp18 [V24 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
;* V25 tmp19 [V25 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
;* V26 tmp20 [V26 ] ( 0, 0 ) ref -> zero-ref class-hnd "Inlining Arg" <System.Threading.Tasks.Task>
;* V27 tmp21 [V27 ] ( 0, 0 ) int -> zero-ref "Inlining Arg"
;* V28 tmp22 [V28 ] ( 0, 0 ) ref -> zero-ref class-hnd "Inlining Arg" <System.Threading.Tasks.Task>
;* V29 tmp23 [V29 ] ( 0, 0 ) ref -> zero-ref "field V13.m_task (fldOffset=0x0)" P-INDEP
;* V30 tmp24 [V30 ] ( 0, 0 ) ref -> zero-ref "field V14._source (fldOffset=0x0)" P-INDEP
;* V31 tmp25 [V31 ] ( 0, 0 ) ref -> zero-ref "field V15._source (fldOffset=0x0)" P-INDEP
;* V32 tmp26 [V32 ] ( 0, 0 ) ref -> zero-ref "field V17._source (fldOffset=0x0)" P-INDEP
; V33 tmp27 [V33,T14] ( 2, 2.88) ref -> rcx "field V23.m_task (fldOffset=0x0)" P-INDEP
; V34 cse0 [V34,T06] ( 3, 295.10) int -> rdx "CSE #01: aggressive"
; V35 rat0 [V35,T01] ( 4,49496.68) byref -> rax "Strength reduced derived IV"
; V36 rat1 [V36,T02] ( 4,49496.68) int -> rdx "Trip count IV"
; V37 rat2 [V37,T15] ( 3, 2.30) ref -> rcx "returned continuation"
; V38 rat3 [V38,T16] ( 6, 0 ) ref -> rbx "new continuation"
; V39 rat4 [V39,T18] ( 3, 0 ) ref -> r15 "object[] for continuation"
; V40 rat5 [V40,T17] ( 4, 0 ) ref -> rax "byte[] for continuation"
; V41 rat6 [V41,T19] ( 3, 0 ) ref -> rdx "byte[] for continuation"
; V42 rat7 [V42,T20] ( 3, 0 ) ref -> rcx "object[] for continuation"
;
; Lcl frame size = 40
G_M18330_IG01: ;; offset=0x0000
sub rsp, 88
mov qword ptr [rsp+0x138], r15
mov qword ptr [rsp+0x130], r14
mov qword ptr [rsp+0x128], rdi
mov qword ptr [rsp+0x120], rsi
mov qword ptr [rsp+0x118], rbx
mov rcx, gword ptr [rsp+0x150]
mov rdi, gword ptr [rsp+0x100]
mov esi, dword ptr [rsp+0xFC]
mov ebp, dword ptr [rsp+0xF8]
mov ebx, dword ptr [rsp+0xF4]
;; size=81 bbWeight=1 PerfScore 15.25
G_M18330_IG02: ;; offset=0x0051
test rcx, rcx
je SHORT G_M18330_IG06
;; size=5 bbWeight=1 PerfScore 1.25
G_M18330_IG03: ;; offset=0x0056
mov rdx, gword ptr [rcx+0x10]
cmp dword ptr [rdx+0x10], 0
jl SHORT G_M18330_IG06
jmp G_M18330_IG16
;; size=15 bbWeight=0.86 PerfScore 6.89
G_M18330_IG04: ;; offset=0x0065
mov ecx, dword ptr [r14+0x34]
and ecx, 0x11000000
cmp ecx, 0x1000000
jne G_M18330_IG13
;; size=22 bbWeight=1.44 PerfScore 5.04
G_M18330_IG05: ;; offset=0x007B
inc ebp
cmp ebp, 100
jge G_M18330_IG14
xor ebx, ebx
;; size=13 bbWeight=1.44 PerfScore 2.52
G_M18330_IG06: ;; offset=0x0088
cmp ebx, 100
jge SHORT G_M18330_IG11
;; size=5 bbWeight=1.44 PerfScore 1.80
G_M18330_IG07: ;; offset=0x008D
xor ecx, ecx
mov rax, rdi
mov edx, dword ptr [rax+0x08]
test edx, edx
jle SHORT G_M18330_IG10
;; size=12 bbWeight=98.56 PerfScore 369.60
G_M18330_IG08: ;; offset=0x0099
add rax, 16
align [3 bytes for IG09]
;; size=7 bbWeight=97.97 PerfScore 48.99
G_M18330_IG09: ;; offset=0x00A0
mov r8d, dword ptr [rax]
add ecx, r8d
add rax, 4
dec edx
jne SHORT G_M18330_IG09
;; size=14 bbWeight=16466.24 PerfScore 61748.39
G_M18330_IG10: ;; offset=0x00AE
add esi, ecx
inc ebx
cmp ebx, 100
jl SHORT G_M18330_IG07
;; size=9 bbWeight=98.56 PerfScore 172.48
G_M18330_IG11: ;; offset=0x00B7
mov rcx, 0x2464F000238 ; const ptr
mov rbx, gword ptr [rcx]
mov rcx, 0x7FF814DFAD00 ; System.Threading.Tasks.Task+DelayPromise
call CORINFO_HELP_NEWSFAST
mov r14, rax
mov rcx, r14
mov r8, rbx
mov edx, 1
call [System.Threading.Tasks.Task+DelayPromise:.ctor(uint,System.TimeProvider):this]
mov rcx, r14
test dword ptr [rcx+0x34], 0x1600000
jne G_M18330_IG04
;; size=64 bbWeight=1.44 PerfScore 16.91
G_M18330_IG12: ;; offset=0x00F7
mov rcx, r14
call [System.Runtime.CompilerServices.RuntimeHelpers:UnsafeAwaitAwaiterFromRuntimeAsync[System.Runtime.CompilerServices.TaskAwaiter](System.Runtime.CompilerServices.TaskAwaiter)]
test rcx, rcx
jne SHORT G_M18330_IG17
jmp G_M18330_IG04
;; size=19 bbWeight=1.15 PerfScore 7.49
G_M18330_IG13: ;; offset=0x010A
mov rcx, r14
xor edx, edx
call [System.Runtime.CompilerServices.TaskAwaiter:HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task,int)]
jmp G_M18330_IG05
;; size=16 bbWeight=0.00 PerfScore 0.01
G_M18330_IG14: ;; offset=0x011A
mov eax, esi
xor ecx, ecx
;; size=4 bbWeight=0.01 PerfScore 0.00
G_M18330_IG15: ;; offset=0x011E
add rsp, 280
pop rbx
pop rsi
pop rdi
pop r14
pop r15
pop rbp
ret
;; size=16 bbWeight=0.01 PerfScore 0.04
G_M18330_IG16: ;; offset=0x012E
mov rdx, gword ptr [rcx+0x10]
mov esi, dword ptr [rdx+0x18]
mov ebp, dword ptr [rdx+0x1C]
mov rcx, gword ptr [rcx+0x18]
mov rdi, gword ptr [rcx+0x10]
mov r14, gword ptr [rcx+0x18]
jmp G_M18330_IG04
;; size=27 bbWeight=0 PerfScore 0.00
G_M18330_IG17: ;; offset=0x0149
mov edx, 2
mov r8d, 16
call CORINFO_HELP_ALLOC_CONTINUATION
mov rbx, rax
mov rcx, 0x7FF814B7A850 ; code for (dynamicClass):IL_STUB_AsyncResume_AsyncEntryPoint_Tier1OSR(System.Object):System.Object
mov qword ptr [rbx+0x20], rcx
mov rcx, 0x400000000
mov qword ptr [rbx+0x28], rcx
mov r15, gword ptr [rbx+0x18]
lea rcx, bword ptr [r15+0x10]
mov rdx, rdi
call CORINFO_HELP_ASSIGN_REF
lea rcx, bword ptr [r15+0x18]
mov rdx, r14
call CORINFO_HELP_ASSIGN_REF
mov rax, gword ptr [rbx+0x10]
mov dword ptr [rax+0x10], 50
mov dword ptr [rax+0x18], esi
mov dword ptr [rax+0x1C], ebp
mov rcx, rbx
;; size=95 bbWeight=0 PerfScore 0.00
G_M18330_IG18: ;; offset=0x01A8
add rsp, 280
pop rbx
pop rsi
pop rdi
pop r14
pop r15
pop rbp
ret
;; size=16 bbWeight=0 PerfScore 0.00
; Total bytes of code 440, prolog size 81, PerfScore 62396.66, instruction count 109, allocated bytes for code 440 (MethodHash=022db865) for method Async2Pgo:AsyncEntryPoint():int (Tier1-OSR)
; ============================================================ |
@@ -719,6 +719,15 @@ namespace | |||
|
|||
COR_ILMETHOD_DECODER* pHeader = NULL; | |||
COR_ILMETHOD* ilHeader = pConfig->GetILHeader(); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we do here:
when pConfig tries to match the native code to global IL it will follow native code
->methoddef
->methoddesc
->IL
chain.
However, we have two methoddescs associated with the methoddef and the one that it the "primary" one (as in - stored in the def->desc mapping dictionary) is the thunk. It's signature matches metadata, but its IL is runtime-provided, thus there would be no global IL for it.
In a case if we get NULL
IL, we check if the original methoddesc was actually an async "implementation" helper that is not a wrapper/thunk. In such case we ask the methoddesc directly for its IL as that would be the IL we are looking for.
src/tests/async/PGO.cs
Outdated
@@ -0,0 +1,52 @@ | |||
// Licensed to the .NET Foundation under one or more agreements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All other test files in this directory are lower-case. Can you please make this match?
src/tests/async/PGO.cs
Outdated
for (int j = 0; j < 100; j++) | ||
sum += await AggregateDelegateAsync(arr, new AggregateSum(), 0); | ||
|
||
await Task.Delay(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wait might not be enough to get us out of the tiering start-up mode (the tiering system has heuristic to wait with starting the tier-up while it thinks the app is in start-up mode).
The original test has i < 4
to potentially tier up through 4 tiers (in reality it should only need 3), but the longer wait to make sure the tiering system kicks in between each round of calls to the previous tier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not see OSR scenario to happen with just 4 iterations. Perhaps not enough call count?
With 100 iterations it does happen, but to make the test run in reasonable time, an individual iteration needs to be sorter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to worry about testing OSR in this test. We already have coverage of OSR from other tests in the tree. Perhaps we could write a more explicit one too, but I think it would look different from this one. (No sleep is necessary, but OSR triggers after 10000 hits to a backedge, so a simple loop can do it).
This test just needs to ensure that PGO information is properly saved and restored from tier0-instrumentation into tier1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can revert to the original 4 iteration/100 millisecond
Can you add Also it looks like you haven't pushed the changes to Roslyn to https://github.com/dotnet/roslyn/commits/demos/async2-experiment1/ yet. |
I thought I did. Something did not work though. It was a big change since I merged in changes from main. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now! I validated that I see what I expect in the PGO test.
Not that familiar with the VM bits, so you may want to ping @davidwrighton offline to get a review.
The most material change on the VM side is that for async2 methods, while we keep the invariant that the "primary" methoddesc is the one declared in metadata, it is now the Task-returning one. What bothered me a bit is the number of cases that would throw/assert/bail if they see the non-primary/helper methoddesc. Some of those are probably never supposed to see such scenarios, but some may be just NYI and may eventually need some support. Reviewing all those places at some point could be a good idea. I will definitely talk to David about this and further steps. |
Thanks for reviewing! I think I will merge this, so it does not block any further changes, and if any more follow up is needed, that would be separate changes. |
7624a38
into
dotnet:feature/async2-experiment
Re: https://github.com/dotnet/runtime/blob/main/docs/design/specs/runtime-async.md
It would not be too hard to switch to an ordinary attribute if we decide otherwise, but dealing with MethodImpl is somewhat simpler, so I made an assumption that we will use MethodImpl.
Before this change we had the notion of "Thunk" that covered two concepts
Before this change these were one thing - "Thunk". A Thunk was always serving both purposes.
The most laborious part of this change was teasing apart a single Thunk concept into a Helper that provides a different signature and a Thunk - a method that forwards to another.
These things are not necessarily the same now. We can have a method with a signature provided in metadata yet its method body would forward to a synthetic entry point. The other way is possible too - A helper with synthetic Async2 signature, but with a body constructed by JIT-ting actual IL.