fix: stabilize A5 validation samples #642

Open
HecreReed wants to merge 10 commits into hw-native-sys:main from HecreReed:a5-six-fixes

Conversation

@HecreReed
Collaborator

Summary

  • fix A5 quant/quant_asym sample shapes and goldens to match per-row scale semantics
  • make mgather/mscatter validation generation deterministic and compare mscatter outputs by indices
  • skip partarg in remote validation when the vendored pto-isa lacks TPARTARG intrinsics
  • simplify the A5 abs sample to avoid unsupported dynamic partition dims in board validation
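A minimal sketch of what "per-row scale semantics" could look like in a golden generator, assuming one scale per row and, for the asymmetric case, one offset per row. The formula `round(x * scale) + offset`, the clamp ranges, and the helper name `quantize_per_row` are assumptions for illustration; they are not taken from quant_golden.py or quant_asym_golden.py. Python's `round` rounds halves to even, which may differ from the rounding mode used on the board.

```python
def quantize_per_row(rows, scales, offsets=None):
    """Quantize each row with its own scale (and optional offset).

    Assumed formula (not confirmed by the PR):
        q = clamp(round(x * scale[row]) + offset[row])
    Symmetric output is clamped to the int8 range, asymmetric to uint8.
    """
    if offsets is None:
        lo, hi = -128, 127            # symmetric case, i8 destination
        offsets = [0] * len(rows)
    else:
        lo, hi = 0, 255               # asymmetric case, ui8 destination
    out = []
    for row, scale, offset in zip(rows, scales, offsets):
        out.append([max(lo, min(hi, round(x * scale) + offset)) for x in row])
    return out


# Two rows, each with its own scale; the last element of row 1 saturates.
src = [[0.5, -1.0, 2.0], [10.0, -20.0, 300.0]]
sym = quantize_per_row(src, scales=[2.0, 1.0])
asym = quantize_per_row(src, scales=[2.0, 1.0], offsets=[128, 128])
```

The point of the sketch is only that the scale and offset vectors have one entry per row, so the sample shapes and the golden must agree on that layout.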

Validation

  • python3 -m py_compile test/npu_validation/scripts/generate_testcase.py test/samples/Abs/abs.py test/samples/Mgather/mgather.py test/samples/Mscatter/mscatter.py test/samples/Quant/quant.py test/samples/Quant/quant_asym.py test/samples/Quant/quant_golden.py test/samples/Quant/quant_asym_golden.py
  • bash -n test/npu_validation/scripts/run_remote_npu_validation.sh
  • A5 board on 192.168.1.52: abs / quant / quant_asym passed earlier in the same validation workspace
  • A5 board on 192.168.1.52: mgather passed after regenerating testcase from runop output
  • A5 board on 192.168.1.52: mscatter passed after fixing indexed compare generation
  • A5 board on 192.168.1.52: partarg skips as expected because vendored pto-isa is missing TPARTARGMAX/TPARTARGMIN
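The partarg skip described above amounts to checking whether the vendored headers mention the required intrinsics. The real check lives in run_remote_npu_validation.sh; the Python sketch below only illustrates the idea, and the helper name `missing_symbols`, the `.h`/`.hpp` filter, and the stub header are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

REQUIRED_SYMBOLS = ("TPARTARGMAX", "TPARTARGMIN")


def missing_symbols(include_root, symbols=REQUIRED_SYMBOLS):
    """Return the symbols that occur in no .h/.hpp file under include_root."""
    remaining = set(symbols)
    for path in Path(include_root).rglob("*"):
        if path.suffix not in {".h", ".hpp"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Keep only the symbols that this file does not mention as a whole word.
        remaining = {
            s for s in remaining
            if not re.search(rf"\b{re.escape(s)}\b", text)
        }
        if not remaining:
            break
    return sorted(remaining)


# Demo on a throwaway tree that declares only one of the two intrinsics.
with tempfile.TemporaryDirectory() as root:
    stub = Path(root) / "pto_stub.hpp"
    stub.write_text("void TPARTARGMAX();\n", encoding="utf-8")
    demo_missing = missing_symbols(root)

# A non-empty result would translate into SKIP rather than FAIL.
should_skip_partarg = bool(demo_missing)
```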

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds support for MSCATTER and MGATHER operations in the NPU validation framework, including logic for index generation and result comparison. It also updates quantization samples to utilize per-row scaling and offsets, simplifying the golden reference generation. Furthermore, the remote validation script now includes a check for required ISA symbols to skip unsupported test cases. Feedback was provided regarding the heuristic used to identify scatter indices, noting an inconsistency with existing logic that could lead to incorrect operand identification.
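To make the "deterministic index generation" and "compare by indices" ideas concrete, here is a small sketch. It assumes duplicate-free indices drawn from a fixed seed and a comparison restricted to the positions the scatter actually wrote; the function names are hypothetical and do not mirror the signatures in generate_testcase.py.

```python
import random


def make_indices(dst_len, count, seed=0):
    """Deterministic, duplicate-free scatter indices (same seed, same list)."""
    return random.Random(seed).sample(range(dst_len), count)


def scatter(dst, src, indices):
    """Reference scatter: out[indices[i]] = src[i]; other slots keep dst."""
    out = list(dst)
    for value, idx in zip(src, indices):
        out[idx] = value
    return out


def mismatches_at_indices(output, golden, indices):
    """Return the scattered positions where output and golden disagree."""
    return [i for i in sorted(set(indices)) if output[i] != golden[i]]


indices = make_indices(dst_len=16, count=4)
src = [11, 22, 33, 44]
golden = scatter([0] * 16, src, indices)
# Simulate a device buffer whose untouched slots hold unrelated data.
device_out = scatter([-7] * 16, src, indices)
```

A full-buffer compare would reject `device_out` because of the untouched slots, while the indexed compare looks only at the four written positions.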

Comment on lines +1886 to +1892
for p in reversed(init_ptrs):
    p_dtype = _np_dtype_for_cpp(p["cpp_type"])
    if p.get("role") == "input" and (
        p_dtype.startswith("np.int") or p_dtype.startswith("np.uint")
    ):
        mscatter_indices_input = p
        break

medium

The heuristic for identifying mscatter_indices_input uses reversed(init_ptrs), which selects the last integer input as the indices. This is inconsistent with the tscatter logic (which selects the first integer input) and may be incorrect depending on the operand order of the MSCATTER operation. In mscatter.py, arg0 appears to be the indices and arg1 the data; if so, this heuristic will misidentify arg1 as the indices, potentially breaking the compare_bin_at_indices logic later. Consider if init_ptrs (picking the first) would be more appropriate or if a more robust identification method is needed.
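The ambiguity is easy to reproduce in isolation. The mangled kernel name that appears in the board logs, `_Z18mscatter_kernel_2dPiS_S_`, demangles to three `int*` parameters, so a dtype-only filter cannot tell indices from data by itself. The sketch below uses a simplified, hypothetical parameter record (`name`, `role`, `np_dtype`) rather than the real `init_ptrs` entries, and the operand order is assumed to be the one suggested in the comment above.

```python
def pick_indices_input(ptrs, pick_last=False):
    """Pick an integer-typed input as the indices operand.

    pick_last=False mimics a "first integer input" rule,
    pick_last=True mimics iterating over reversed(ptrs).
    """
    candidates = [
        p for p in ptrs
        if p["role"] == "input"
        and p["np_dtype"].startswith(("np.int", "np.uint"))
    ]
    if not candidates:
        return None
    return candidates[-1] if pick_last else candidates[0]


# Assumed operand order: arg0 = indices, arg1 = data, arg2 = output.
ptrs = [
    {"name": "arg0", "role": "input", "np_dtype": "np.int32"},
    {"name": "arg1", "role": "input", "np_dtype": "np.int32"},
    {"name": "arg2", "role": "output", "np_dtype": "np.int32"},
]

first_choice = pick_indices_input(ptrs)
last_choice = pick_indices_input(ptrs, pick_last=True)
```

With all operands integer-typed, the two rules disagree, which suggests identifying the indices by operand position or by an explicit role tag instead of by dtype.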

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

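Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.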

@reedhecre

reedhecre commented May 8, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: #642 fix: stabilize A5 validation samples
  • Author: HecreReed
  • Base/Head: main / a5-six-fixes
  • Head SHA: 731793656a69
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-05-13T10:45:44Z
  • Previous Head SHA: 447c37baa265
  • Status: failed at codex-review (exit=1)

Summary

Review failed at stage codex-review: exit=1

Findings

No structured findings were generated because the review process failed early.

Log Tail

  7628	- `INT8_SYM`: symmetric quantization; `dst` element type must be `i8`.
  7629	- `INT8_ASYM`: asymmetric quantization; `dst` element type must be `ui8`.
exec
/bin/bash -lc 'grep -n "role" -n test/npu_validation/scripts/generate_testcase.py | head -n 80' in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo succeeded in 0ms:
1624:                    "role": (
1648:    data_ptrs = [p for p in params if p["kind"] == "ptr" and p["role"] != "ffts"]
1649:    ffts_ptrs = [p for p in params if p["kind"] == "ptr" and p["role"] == "ffts"]
1651:    output_ptrs = [p for p in data_ptrs if p["role"] == "output"]
1760:        if p["role"] == "ffts":
1880:            if p.get("role") == "input":
1888:            if p.get("role") == "input" and (
1903:        is_output = p.get("role") == "output"
1905:        is_tscatter_indices = kernel_has_tscatter and p.get("role") == "input" and is_integer and size == elem_count
1914:            and p.get("role") == "input"
1918:        is_tgatherb_offset = kernel_has_tgatherb and p.get("role") == "input" and is_integer and size < elem_count
1919:        is_tgatherb_src = kernel_has_tgatherb and p.get("role") == "input" and not is_tgatherb_offset
2267:            if p.get("role") == "input" and (p_dtype.startswith("np.int") or p_dtype.startswith("np.uint")):
exec
/bin/bash -lc "nl -ba test/npu_validation/scripts/generate_testcase.py | sed -n '1500,1785p'" in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo succeeded in 0ms:
  1500	    shared_validation_runtime = sample_root.parent / "validation_runtime.py"
  1501	
  1502	    raw_kernel = input_cpp.read_text(encoding="utf-8")
  1503	    raw_kernel_for_analysis = raw_kernel
  1504	    kernel_info = _describe_kernel_source(raw_kernel_for_analysis)
  1505	    # pto.tcmp / pto.tcmps produce packed predicate masks and leave parts of the
  1506	    # logical u8 tile undefined. This can make byte-wise compares flaky.
  1507	    has_packed_pred_mask = re.search(r"\bTCMPS?\s*\(", raw_kernel_for_analysis) is not None
  1508	    has_dav_cube = "__DAV_CUBE__" in raw_kernel
  1509	    has_dav_vec = "__DAV_VEC__" in raw_kernel
  1510	    has_intra_block_sync = "set_intra_block(" in raw_kernel or "wait_intra_block(" in raw_kernel
  1511	    has_mixed_section_sync = has_dav_cube and has_dav_vec and has_intra_block_sync
  1512	    has_cube_only_section = has_dav_cube and not has_dav_vec
  1513	    has_vec_only_section = has_dav_vec and not has_dav_cube
  1514	
  1515	    is_mixed_kernel = kernel_info["kind"] == "mixed"
  1516	
  1517	    if aicore_arch is None:
  1518	        if is_mixed_kernel:
  1519	            sv = (soc_version or "").lower()
exec
/bin/bash -lc "nl -ba test/npu_validation/scripts/generate_testcase.py | sed -n '2180,2275p'" in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo succeeded in 0ms:
  2180	    -Wl,-z,now
  2181	)
  2182	
  2183		set(CMAKE_CCE_COMPILE_OPTIONS
  2184		    -xcce
  2185		    -fenable-matrix
  2186		    --cce-aicore-enable-tl
  2187		{cce_enable_print_opt}
  2188		{cce_print_define_opt}
  2189		    -fPIC
  2190		    -Xhost-start -Xhost-end
  2191		{cce_stack_size_opt}\
  2192		    "SHELL:-mllvm -cce-aicore-function-stack-size=0x8000"
  2193		    "SHELL:-mllvm -cce-aicore-record-overflow=true"
  2194	    "SHELL:-mllvm -cce-aicore-addr-transform"
  2195	    "SHELL:-mllvm -cce-aicore-dcci-insert-for-scalar=false"
  2196	)
  2197	
  2198	set(CMAKE_CPP_COMPILE_OPTIONS
  2199	    -xc++
exec
/bin/bash -lc "grep -RIn \"MGATHER\\|MSCATTER\" test/samples | head -n 200" in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo
exec
/bin/bash -lc "grep -RIn \"pto.mgather\\|pto.mscatter\" test/lit | head -n 200" in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo succeeded in 0ms:

exec
/bin/bash -lc "grep -RIn \"MakeTensorViewOp(.*layout=t.layout_dn\\|TileBufType.get(\\[_SHAPE\\[0\\], 1\\]\" test/samples | head -n 200" in /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/repo succeeded in 30ms:

 succeeded in 16ms:

Reconnecting... 1/5 (unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: 653dbc1e-72e7-4eec-884c-c7c5c6d71837)
Reconnecting... 2/5 (unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: 2e0820b7-7258-4745-9fec-8f003c016e38)
Reconnecting... 3/5 (unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: ff391acd-4267-41ad-96f6-d9ceaf8b399e)
Reconnecting... 4/5 (unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: 3d7b218f-d330-4601-8271-7a9acb2c9619)
Reconnecting... 5/5 (unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: 83715f42-2b40-4885-b0a0-924272b5bdf6)
ERROR: unexpected status 403 Forbidden: insufficient balance, url: https://codex.0u0o.com/responses, request id: 19202b05-22ef-43f3-8fc8-4f2c23ba148c
Warning: no last agent message; wrote empty content to /tmp/ptoas-pr-review-monitor/runs/20260513_184003_pr642/codex_last_message.json
tokens used
98,720
===== END STAGE codex-review rc=1 @ 2026-05-13 18:45:44 =====

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: 9fab97bfdd30
  • Result summary: OK 4 / FAIL 1 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260508_102905_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260508_102905_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 1448460] 2026-05-08-10:34:09.272.332 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 30, there is an aivec error exception, core id is 0, error code = 334, dump info: pc start: 0x100040800000, current: 0x1000408000f0, sc error info: 0xffffffffffff, su error info: 0xe6f7d23d139c7bd7,0xcc3fd0e410009bfd, mte error info: 0x1fd3f5c60007eff1, vec error info: 0x408001e000390037, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=61, report_stream_id=61, task_id=0, flip_num=0, fault kernel_name=_Z18mscatter_kernel_2dPiS_S_, fault kernel info ext=_Z18mscatter_kernel_2dPiS_S_, program id=0, hash=279618682955286547.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-08 10:34:44] ERROR: testcase failed (exit 1): mscatter

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

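Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.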

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: 9fab97bfdd30
  • Result summary: OK 4 / FAIL 1 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260508_143304_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260508_143304_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 177569] 2026-05-08-14:36:02.662.566 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 1, there is an aivec error exception, core id is 0, error code = 334, dump info: pc start: 0x100040800000, current: 0x1000408000f0, sc error info: 0xffffffffffff, su error info: 0xe6f7d23d139c7b97,0xcc3fd0e410009bf5, mte error info: 0x1fd3f5c600076ff1, vec error info: 0x408001e000390037, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z18mscatter_kernel_2dPiS_S_, fault kernel info ext=_Z18mscatter_kernel_2dPiS_S_, program id=0, hash=279618682955286547.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-08 14:36:08] ERROR: testcase failed (exit 1): mscatter

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

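Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.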

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: d8e11d0e28f1
  • Result summary: OK 0 / FAIL 5 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260509_101705_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • quant (run, exit=1)
  • quant_asym (run, exit=1)
  • mscatter (run, exit=1)
  • mgather (run, exit=1)
  • abs (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

quant

stage=run info=exit=1

[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/npu_validation/Quant/quant/main.cpp:79)
[ERROR] RecentErrMsg: [PID: 220619] 2026-05-09-10:19:24.696.599 Invalid_Argument(EE1001): The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
        Solution: 1.Check the input parameter range of the function. 2.Check the function invocation relationship.
        TraceBack (most recent call last):
        TsdOpen failed. devId=1, tdt error=1[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2618]
        Check param failed, dev can not be NULL![FUNC:DeviceRetain][FILE:runtime.cc][LINE:3536]
        Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3153]
        Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3184]
        Check param failed, context can not be null.[FUNC:SetDevice][FILE:api_impl.cc][LINE:3321]
        rtSetDevice execution failed, reason=device retain error[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
        open device 1 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
        ctx is NULL![FUNC:GetDevErrMsg][FILE:api_impl.cc][LINE:6120]
        The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
[2026-05-09 10:19:25] ERROR: testcase failed (exit 1): quant
quant_asym

stage=run info=exit=1

[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/npu_validation/Quant/quant_asym/main.cpp:83)
[ERROR] RecentErrMsg: [PID: 221534] 2026-05-09-10:19:27.933.285 Invalid_Argument(EE1001): The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
        Solution: 1.Check the input parameter range of the function. 2.Check the function invocation relationship.
        TraceBack (most recent call last):
        TsdOpen failed. devId=1, tdt error=1[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2618]
        Check param failed, dev can not be NULL![FUNC:DeviceRetain][FILE:runtime.cc][LINE:3536]
        Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3153]
        Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3184]
        Check param failed, context can not be null.[FUNC:SetDevice][FILE:api_impl.cc][LINE:3321]
        rtSetDevice execution failed, reason=device retain error[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
        open device 1 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
        ctx is NULL![FUNC:GetDevErrMsg][FILE:api_impl.cc][LINE:6120]
        The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
[2026-05-09 10:19:28] ERROR: testcase failed (exit 1): quant_asym
[2026-05-09 10:19:28] SKIP: partarg (pto-isa missing TPARTARG intrinsics)
mscatter

stage=run info=exit=1

[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:79)
[ERROR] RecentErrMsg: [PID: 222041] 2026-05-09-10:19:31.142.318 Invalid_Argument(EE1001): The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
        Solution: 1.Check the input parameter range of the function. 2.Check the function invocation relationship.
        TraceBack (most recent call last):
        TsdOpen failed. devId=1, tdt error=1[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2618]
        Check param failed, dev can not be NULL![FUNC:DeviceRetain][FILE:runtime.cc][LINE:3536]
        Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3153]
        Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3184]
        Check param failed, context can not be null.[FUNC:SetDevice][FILE:api_impl.cc][LINE:3321]
        rtSetDevice execution failed, reason=device retain error[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
        open device 1 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
        ctx is NULL![FUNC:GetDevErrMsg][FILE:api_impl.cc][LINE:6120]
        The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
[2026-05-09 10:19:31] ERROR: testcase failed (exit 1): mscatter
mgather

stage=run info=exit=1

[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/npu_validation/Mgather/mgather/main.cpp:79)
[ERROR] RecentErrMsg: [PID: 222537] 2026-05-09-10:19:34.322.085 Invalid_Argument(EE1001): The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
        Solution: 1.Check the input parameter range of the function. 2.Check the function invocation relationship.
        TraceBack (most recent call last):
        TsdOpen failed. devId=1, tdt error=1[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2618]
        Check param failed, dev can not be NULL![FUNC:DeviceRetain][FILE:runtime.cc][LINE:3536]
        Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3153]
        Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3184]
        Check param failed, context can not be null.[FUNC:SetDevice][FILE:api_impl.cc][LINE:3321]
        rtSetDevice execution failed, reason=device retain error[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
        open device 1 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
        ctx is NULL![FUNC:GetDevErrMsg][FILE:api_impl.cc][LINE:6120]
        The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
[2026-05-09 10:19:34] ERROR: testcase failed (exit 1): mgather
abs

stage=run info=exit=1

[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/npu_validation/Abs/abs/main.cpp:75)
[ERROR] RecentErrMsg: [PID: 223031] 2026-05-09-10:19:37.513.638 Invalid_Argument(EE1001): The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
        Solution: 1.Check the input parameter range of the function. 2.Check the function invocation relationship.
        TraceBack (most recent call last):
        TsdOpen failed. devId=1, tdt error=1[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2618]
        Check param failed, dev can not be NULL![FUNC:DeviceRetain][FILE:runtime.cc][LINE:3536]
        Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3153]
        Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:3184]
        Check param failed, context can not be null.[FUNC:SetDevice][FILE:api_impl.cc][LINE:3321]
        rtSetDevice execution failed, reason=device retain error[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
        open device 1 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
        ctx is NULL![FUNC:GetDevErrMsg][FILE:api_impl.cc][LINE:6120]
        The argument is invalid.Reason: rtGetDevMsg execution failed, the context is a null pointer.
[2026-05-09 10:19:37] ERROR: testcase failed (exit 1): abs
[2026-05-09 10:19:38] === SUMMARY ===
[2026-05-09 10:19:38] OK=0 FAIL=5 SKIP=1
[2026-05-09 10:19:38] RESULTS_TSV=/tmp/ptoas-board-monitor-a5/runs/20260509_101705_manual_pr642/remote_npu_validation_results.tsv
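The logs in this thread show two distinct failure shapes: `aclrtSetDevice` failing with 507033 before any kernel runs (all five cases in the run above), and `aclrtSynchronizeStream` failing with 507035 after a vector core exception. A rough triage sketch that separates the two from the first `[ERROR]` line follows; the category labels and the helper name `triage` are hypothetical, and the mapping is inferred only from the log text quoted here.

```python
import re

# Matches lines such as:
#   [ERROR] aclrtSetDevice(deviceId) failed: 507033 (/path/main.cpp:79)
ACL_FAILURE = re.compile(r"\[ERROR\] (acl\w+)\([^)]*\) failed: (\d+)")


def triage(log_text):
    """Classify a testcase log by the first failing ACL call it reports."""
    match = ACL_FAILURE.search(log_text)
    if match is None:
        return ("unknown", None, None)
    call, code = match.group(1), int(match.group(2))
    if call == "aclrtSetDevice":
        kind = "device-setup"      # board could not be opened; kernel never ran
    elif call == "aclrtSynchronizeStream":
        kind = "kernel-run"        # kernel launched and then faulted
    else:
        kind = "other"
    return (kind, call, code)


setup_log = "[ERROR] aclrtSetDevice(deviceId) failed: 507033 (/tmp/x/main.cpp:79)"
run_log = "[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/x/main.cpp:99)"
setup_result = triage(setup_log)
run_result = triage(run_log)
```

Separating the two makes it easier to tell a run where every case failed at device setup from a run that actually exercised the mscatter kernel.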

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

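Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.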

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: d8e11d0e28f1
  • Result summary: OK 4 / FAIL 1 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260509_141705_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260509_141705_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 79475] 2026-05-09-14:20:03.840.387 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 4, there is an aivec error exception, core id is 0, error code = 334, dump info: pc start: 0x100040800000, current: 0x1000408000f0, sc error info: 0xffffffffffff, su error info: 0xe6f7d23d139c5bb7,0xcc3fd0e410009bfd, mte error info: 0x2051a, vec error info: 0x408001e000390037, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z18mscatter_kernel_2dPiS_S_, fault kernel info ext=_Z18mscatter_kernel_2dPiS_S_, program id=0, hash=279618682955286547.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-09 14:20:09] ERROR: testcase failed (exit 1): mscatter

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

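Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.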

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: d8e11d0e28f1
  • Result summary: OK 3 / FAIL 2 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260511_095505_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=1)
  • mgather (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260511_095505_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 644054] 2026-05-11-09:58:05.147.557 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 21, there is an aivec error exception, core id is 0, error code = 334, dump info: pc start: 0x100040800000, current: 0x1000408000f0, sc error info: 0xffffffffffff, su error info: 0xe6f7d23d139c0038,0x8040000010009bfd, mte error info: 0x20064, vec error info: 0x408001e000390037, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z18mscatter_kernel_2dPiS_S_, fault kernel info ext=_Z18mscatter_kernel_2dPiS_S_, program id=0, hash=279618682955286547.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-11 09:58:09] ERROR: testcase failed (exit 1): mscatter
mgather

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260511_095505_manual_pr642/npu_validation/Mgather/mgather/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 645512] 2026-05-11-09:58:25.232.651 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 22, there is an aivec error exception, core id is 0, error code = 334, dump info: pc start: 0x100040800000, current: 0x100040800108, sc error info: 0xffffffffffff, su error info: 0xe6f7d23d139c0038,0x8040000010009bfd, mte error info: 0x20064, vec error info: 0x408001f000390033, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z17mgather_kernel_2dPiS_S_, fault kernel info ext=_Z17mgather_kernel_2dPiS_S_, program id=0, hash=14980436151442853146.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-11 09:58:30] ERROR: testcase failed (exit 1): mgather

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

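Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.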

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: cccf60b487b2
  • Result summary: OK 4 / FAIL 1 / SKIP 1
  • Log: /root/ptoas-board-monitor-a5/logs/20260511_114205_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Selected cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=1)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260511_114205_manual_pr642/npu_validation/Mscatter/mscatter/main.cpp:99)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 850267] 2026-05-11-11:45:03.707.560 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 23, there is an aivec error exception, core id is 54, error code = 334, dump info: pc start: 0x100040800000, current: 0x1000408000f0, sc error info: 0xffffffffffff, su error info: 0x82b87afe01fa0038,0x80400000c800ed8d, mte error info: 0xf7ddfec00001fbff, vec error info: 0x408001e000390037, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(334) errorStr: The data returned by the BIU to the VEC is incorrect. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=60, report_stream_id=60, task_id=0, flip_num=0, fault kernel_name=_Z18mscatter_kernel_2dPiS_S_, fault kernel info ext=_Z18mscatter_kernel_2dPiS_S_, program id=0, hash=279618682955286547.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-05-11 11:45:08] ERROR: testcase failed (exit 1): mscatter

@HecreReed marked this pull request as ready for review May 13, 2026 03:33
@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

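Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.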

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

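Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.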

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 3600a473c4c1
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_114305_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: prepare-pto-isa-vendor / exit=128

Log tail

_ONLY_CASES], test/samples/Xors/xors_golden.py [not in RUN_ONLY_CASES], test/samples/Xors/xors_compare.py [not in RUN_ONLY_CASES], test/samples/Xors/xors.py [not in RUN_ONLY_CASES], test/samples/Xor/xor_golden.py [not in RUN_ONLY_CASES], test/samples/Xor/xor_compare.py [not in RUN_ONLY_CASES], test/samples/Xor/xor.py [not in RUN_ONLY_CASES], ... (+532 more)

===== STAGE sample-build-and-test @ 2026-05-13 11:45:12 =====
bash test/samples/runop.sh --enablebc all
PTOAS_OUT_DIR=/tmp/ptoas-board-monitor-a5/runs/20260513_114305_manual_pr642/payload/test/samples
========== SUMMARY ==========
Abs(abs.py)  OK   generated: abs-pto.cpp
Mgather(mgather.py) OK   generated: mgather-pto.cpp
Mscatter(mscatter.py) OK   generated: mscatter-pto.cpp
Partarg(partarg.py) OK   generated: partarg-pto.cpp
Quant(quant_asym.py) OK   generated: quant_asym-pto.cpp
Quant(quant.py) OK   generated: quant-pto.cpp
-----------------------------
OK=6  FAIL=0  SKIP=0
=============================
===== END STAGE sample-build-and-test rc=0 @ 2026-05-13 11:45:19 =====

===== STAGE prepare-pto-isa-vendor @ 2026-05-13 11:45:19 =====
set -euo pipefail
unset ALL_PROXY all_proxy HTTPS_PROXY https_proxy HTTP_PROXY http_proxy NO_PROXY no_proxy GIT_PROXY_COMMAND
git -c http.proxy= -c https.proxy= clone 'https://gitcode.com/cann/pto-isa.git' '/root/ptoas-board-monitor-a5/cache/pto-isa-vendor-tmp-i2q_pcl6/repo'
cd '/root/ptoas-board-monitor-a5/cache/pto-isa-vendor-tmp-i2q_pcl6/repo'
git -c http.proxy= -c https.proxy= checkout -f '3a82aa3868efb11462c7ab32d9fa32fa06b579fc'
git rev-parse HEAD
Cloning into '/root/ptoas-board-monitor-a5/cache/pto-isa-vendor-tmp-i2q_pcl6/repo'...
fatal: reference is not a tree: 3a82aa3868efb11462c7ab32d9fa32fa06b579fc
===== END STAGE prepare-pto-isa-vendor rc=128 @ 2026-05-13 11:45:23 =====
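
`fatal: reference is not a tree` typically means the requested commit object is not present in the fresh clone, so `checkout` has nothing to resolve. One way a script could detect this before attempting the checkout is `git cat-file -e`, which exits with status 0 only when the object exists. The sketch below is illustrative: `commit_exists` and its `run` parameter are hypothetical names and are not part of the board monitor.

```python
import subprocess
from typing import Callable


def commit_exists(
    repo_dir: str,
    sha: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if `sha` resolves to a commit in the clone at `repo_dir`.

    Hypothetical helper. `run` is injectable so the check can be exercised
    without a real repository.
    """
    # `<sha>^{commit}` asks git to peel the object to a commit;
    # `cat-file -e` reports existence through the exit status only.
    result = run(
        ["git", "-C", repo_dir, "cat-file", "-e", f"{sha}^{{commit}}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


# Example use (not executed here):
#   if not commit_exists(repo, pinned_sha):
#       raise SystemExit(f"pinned pto-isa commit {pinned_sha} not found in clone")
```

A failed check would point at the pinned commit or the remote it is fetched from, rather than at the samples under test.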

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

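Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.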

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 451047bce534
  • Result summary: OK 4 / FAIL 2 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_114905_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=2)
  • mgather (run, exit=2)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/npu/a5/MScatter.hpp:389:13: error: static assertion failed due to requirement '(kIdxValidR == 1 && kIdxValidC == kSrcValidR) || (kIdxValidR == kSrcValidR && kIdxValidC == 1)': MSCATTER Coalesce::Row requires index tile valid shape [1, R] or [R, 1] matching TileSrc::ValidRow.
            static_assert(
            ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/npu/a5/MScatter.hpp:424:5: note: in instantiation of function template specialization 'pto::MScatterCheck<pto::Coalesce::Row, pto::ScatterAtomicOp::None, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MScatterCheck<Mode, Atomic>(table, src, indices);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:5: note: in instantiation of function template specialization 'pto::MSCATTER_IMPL<pto::Coalesce::Row, pto::ScatterAtomicOp::None, pto::ScatterOOB::Undefined, pto::ScatterConflict::Last, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MAP_INSTR_IMPL(MSCATTER, dst, src, indexes);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:23:34: note: expanded from macro 'MAP_INSTR_IMPL'
#define MAP_INSTR_IMPL(API, ...) API##_IMPL(__VA_ARGS__)
                                 ^
<scratch space>:278:1: note: expanded from here
MSCATTER_IMPL
^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/npu_validation/Mscatter/mscatter/mscatter_kernel.cpp:105:3: note: in instantiation of function template specialization 'pto::MSCATTER<pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  MSCATTER(v17, v18, v21);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/mscatter_kernel.dir/build.make:76: CMakeFiles/mscatter_kernel.dir/mscatter_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/mscatter_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-05-13 11:52:25] ERROR: testcase failed (exit 2): mscatter

mgather

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/npu/a5/MGather.hpp:263:13: error: static assertion failed due to requirement '(kIdxValidR == 1 && kIdxValidC == kDstValidR) || (kIdxValidR == kDstValidR && kIdxValidC == 1)': MGATHER Coalesce::Row requires index tile valid shape [1, R] or [R, 1] matching TileDst::ValidRow.
            static_assert(
            ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/npu/a5/MGather.hpp:294:5: note: in instantiation of function template specialization 'pto::MGatherCheck<pto::Coalesce::Row, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MGatherCheck<Mode>(dst, table, indices);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:1701:5: note: in instantiation of function template specialization 'pto::MGATHER_IMPL<pto::Coalesce::Row, pto::GatherOOB::Undefined, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MAP_INSTR_IMPL(MGATHER, dst, src, indexes);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:23:34: note: expanded from macro 'MAP_INSTR_IMPL'
#define MAP_INSTR_IMPL(API, ...) API##_IMPL(__VA_ARGS__)
                                 ^
<scratch space>:277:1: note: expanded from here
MGATHER_IMPL
^
/tmp/ptoas-board-monitor-a5/runs/20260513_114905_manual_pr642/npu_validation/Mgather/mgather/mgather_kernel.cpp:100:3: note: in instantiation of function template specialization 'pto::MGATHER<pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  MGATHER(v18, v11, v15);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/mgather_kernel.dir/build.make:76: CMakeFiles/mgather_kernel.dir/mgather_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/mgather_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-05-13 11:52:27] ERROR: testcase failed (exit 2): mgather
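
Both static assertions above state the same shape rule for `Coalesce::Row`: the index tile's valid shape must be `[1, R]` or `[R, 1]`, where `R` is the valid row count of the source tile (MSCATTER) or the destination tile (MGATHER). The sketch below transcribes that condition from the assertion text only. The function name is hypothetical and not part of pto-isa.

```python
def row_coalesce_index_shape_ok(
    idx_valid_rows: int, idx_valid_cols: int, data_valid_rows: int
) -> bool:
    """Mirror of the Coalesce::Row static_assert condition quoted in the log.

    `data_valid_rows` stands for TileSrc::ValidRow (MSCATTER) or
    TileDst::ValidRow (MGATHER). Hypothetical helper for illustration.
    """
    return (idx_valid_rows == 1 and idx_valid_cols == data_valid_rows) or (
        idx_valid_rows == data_valid_rows and idx_valid_cols == 1
    )


# The failing instantiation paired a 32x32 index tile with a 32x32 data tile.
print(row_coalesce_index_shape_ok(32, 32, 32))  # False
print(row_coalesce_index_shape_ok(1, 32, 32))   # True
print(row_coalesce_index_shape_ok(32, 1, 32))   # True
```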

@HecreReed
Collaborator Author

/run a5 abs quant quant_asym mgather mscatter partarg

@reedhecre

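Received /run a5 abs quant quant_asym mgather mscatter partarg; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.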

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: d900ee1280cf
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_152805_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: HecreReed
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
  • Failed stage: sample-build-and-test / exit=1

Log tail

        ~~~~~^^
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_152805_manual_pr642/repo/test/samples/Mgather/mgather.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.alloc_tile' op expects result row-major none_box tile row byte size (cols * sizeof(dtype)) to be 32-byte aligned, but got 4 bytes
 note: unknown: see current operation: %8 = "pto.alloc_tile"() <{operandSegmentSizes = array<i32: 0, 0, 0>}> : () -> !pto.tile_buf<vec, 32x1xi32>
Traceback (most recent call last):
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_152805_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 73, in <module>
    print(build())
          ~~~~~^^
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_152805_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.alloc_tile' op expects result row-major none_box tile row byte size (cols * sizeof(dtype)) to be 32-byte aligned, but got 4 bytes
 note: unknown: see current operation: %10 = "pto.alloc_tile"() <{operandSegmentSizes = array<i32: 0, 0, 0>}> : () -> !pto.tile_buf<vec, 32x1xi32>
========== SUMMARY ==========
Abs(abs.py)  OK   generated: abs-pto.cpp
Mgather(mgather.py) FAIL python failed: mgather.py
Mscatter(mscatter.py) FAIL python failed: mscatter.py
Partarg(partarg.py) OK   generated: partarg-pto.cpp
Quant(quant_asym.py) OK   generated: quant_asym-pto.cpp
Quant(quant.py) OK   generated: quant-pto.cpp
-----------------------------
OK=4  FAIL=2  SKIP=0
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-05-13 15:30:19 =====
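
The verifier message gives the rule directly: for a row-major `none_box` tile, the row byte size `cols * sizeof(dtype)` must be a multiple of 32 bytes. A `32x1xi32` tile has 1 * 4 = 4 bytes per row, so it is rejected. A small sketch of that arithmetic follows. The helper names are hypothetical, and `i32` is assumed to be 4 bytes wide.

```python
ROW_ALIGN_BYTES = 32  # alignment quoted in the pto.alloc_tile verifier message


def row_bytes(cols: int, dtype_size: int) -> int:
    """Bytes occupied by one row of a row-major tile."""
    return cols * dtype_size


def row_is_aligned(cols: int, dtype_size: int) -> bool:
    """True when the row byte size is a multiple of ROW_ALIGN_BYTES."""
    return row_bytes(cols, dtype_size) % ROW_ALIGN_BYTES == 0


I32_SIZE = 4  # assumed size of i32 in bytes

print(row_bytes(1, I32_SIZE), row_is_aligned(1, I32_SIZE))    # 4 False  (32x1xi32)
print(row_bytes(32, I32_SIZE), row_is_aligned(32, I32_SIZE))  # 128 True (1x32xi32)
print(row_bytes(8, I32_SIZE), row_is_aligned(8, I32_SIZE))    # 32 True
```

Under this rule alone, a column-vector index tile of `i32` cannot be allocated as `Rx1`; the narrowest aligned `i32` row is 8 columns.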

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: d900ee1280cf
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_153419_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: root
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Failed stage: sample-build-and-test / exit=1

Log tail

        ~~~~~^^
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_153419_manual_pr642/repo/test/samples/Mgather/mgather.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.alloc_tile' op expects result row-major none_box tile row byte size (cols * sizeof(dtype)) to be 32-byte aligned, but got 4 bytes
 note: unknown: see current operation: %8 = "pto.alloc_tile"() <{operandSegmentSizes = array<i32: 0, 0, 0>}> : () -> !pto.tile_buf<vec, 32x1xi32>
Traceback (most recent call last):
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_153419_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 73, in <module>
    print(build())
          ~~~~~^^
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_153419_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.alloc_tile' op expects result row-major none_box tile row byte size (cols * sizeof(dtype)) to be 32-byte aligned, but got 4 bytes
 note: unknown: see current operation: %10 = "pto.alloc_tile"() <{operandSegmentSizes = array<i32: 0, 0, 0>}> : () -> !pto.tile_buf<vec, 32x1xi32>
========== SUMMARY ==========
Abs(abs.py)  OK   generated: abs-pto.cpp
Mgather(mgather.py) FAIL python failed: mgather.py
Mscatter(mscatter.py) FAIL python failed: mscatter.py
Partarg(partarg.py) OK   generated: partarg-pto.cpp
Quant(quant.py) OK   generated: quant-pto.cpp
Quant(quant_asym.py) OK   generated: quant_asym-pto.cpp
-----------------------------
OK=4  FAIL=2  SKIP=0
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-05-13 15:36:33 =====

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 914864c2fcc3
  • Result summary: OK 5 / FAIL 1 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_154617_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: root
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Failed stage: board-validation / exit=1

Failed cases

  • mgather (run, exit=2)

@reedhecre

A5 board test failure details: PR #642

mgather

stage=run info=exit=2

[ERROR] Mismatch: golden_v3.bin vs v3.bin, max diff=31.0 at idx=992 (golden=31, out=0, dtype=int32)
[ERROR] compare failed
[2026-05-13 15:50:11] ERROR: testcase failed (exit 2): mgather
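
The flat index in the mismatch message can be mapped back to tile coordinates. Assuming `v3.bin` holds the 32x32 row-major `int32` buffer that appears in the kernel types elsewhere in this thread, idx=992 is row 31, column 0. The sketch below reports the largest mismatch together with its coordinates. It is not the project's compare script, and `first_max_diff` is a hypothetical name.

```python
def first_max_diff(golden, out, cols):
    """Return (flat_idx, row, col, abs_diff) for the largest mismatch.

    Returns None when the buffers are identical. Ties keep the first index.
    """
    if len(golden) != len(out):
        raise ValueError("buffers differ in length")
    best = None
    for i, (g, o) in enumerate(zip(golden, out)):
        diff = abs(g - o)
        if diff and (best is None or diff > best[3]):
            best = (i, i // cols, i % cols, diff)
    return best


# Reproduce the shape of the reported failure: golden=31, out=0 at idx=992.
golden = [0] * 1024
golden[992] = 31
out = [0] * 1024
print(first_max_diff(golden, out, cols=32))  # (992, 31, 0, 31)
```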

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 72767a6bff40
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_160304_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: root
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Failed stage: sample-build-and-test / exit=1

Log tail

60513_160304_manual_pr642/repo/test/samples/Mgather/mgather.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.mgather' op expects dst and idx static row dimensions to match
 note: unknown: see current operation: "pto.mgather"(%6, %8, %9) <{gatherOob = #pto<gather_oob undefined>}> : (!pto.partition_tensor_view<32x32xi32>, !pto.tile_buf<vec, 1x32xi32>, !pto.tile_buf<vec, 32x32xi32>) -> ()
Traceback (most recent call last):
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_160304_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 73, in <module>
    print(build())
          ~~~~~^^
  File "/tmp/ptoas-board-monitor-a5/runs/20260513_160304_manual_pr642/repo/test/samples/Mscatter/mscatter.py", line 68, in build
    m.operation.verify()
    ~~~~~~~~~~~~~~~~~~^^
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: unknown: 'pto.mscatter' op expects src and idx static row dimensions to match
 note: unknown: see current operation: "pto.mscatter"(%9, %10, %8) <{scatterAtomicOp = #pto<scatter_atomic_op none>, scatterOob = #pto<scatter_oob undefined>}> : (!pto.tile_buf<vec, 32x32xi32>, !pto.tile_buf<vec, 1x32xi32>, !pto.partition_tensor_view<32x32xi32>) -> ()
========== SUMMARY ==========
Abs(abs.py)  OK   generated: abs-pto.cpp
Mgather(mgather.py) FAIL python failed: mgather.py
Mscatter(mscatter.py) FAIL python failed: mscatter.py
Partarg(partarg.py) OK   generated: partarg-pto.cpp
Quant(quant.py) OK   generated: quant-pto.cpp
Quant(quant_asym.py) OK   generated: quant_asym-pto.cpp
-----------------------------
OK=4  FAIL=2  SKIP=0
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-05-13 16:05:17 =====

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 81215cb20a39
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_162045_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: root
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Failed stage: build-ptoas / exit=1

Log tail

 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -Wimplicit-fallthrough -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17   -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS  -fno-exceptions -funwind-tables -fno-rtti -Werror -MD -MT lib/PTO/IR/CMakeFiles/obj.PTOIR.dir/PTO.cpp.o -MF lib/PTO/IR/CMakeFiles/obj.PTOIR.dir/PTO.cpp.o.d -o lib/PTO/IR/CMakeFiles/obj.PTOIR.dir/PTO.cpp.o -c /tmp/ptoas-board-monitor-a5/runs/20260513_162045_manual_pr642/repo/lib/PTO/IR/PTO.cpp
/tmp/ptoas-board-monitor-a5/runs/20260513_162045_manual_pr642/repo/lib/PTO/IR/PTO.cpp: In function ‘llvm::LogicalResult verifyMGatherMScatterTileShape(mlir::Operation*, mlir::Type, mlir::Type, llvm::StringRef)’:
/tmp/ptoas-board-monitor-a5/runs/20260513_162045_manual_pr642/repo/lib/PTO/IR/PTO.cpp:3043:22: error: ‘isKnownUnitExtent’ was not declared in this scope
 3043 |       idxRowMajor && isKnownUnitExtent(idxValid[0]) &&
      |                      ^~~~~~~~~~~~~~~~~
/tmp/ptoas-board-monitor-a5/runs/20260513_162045_manual_pr642/repo/lib/PTO/IR/PTO.cpp:3044:7: error: ‘hasCompatibleKnownExtent’ was not declared in this scope
 3044 |       hasCompatibleKnownExtent(idxValid[1], dataValid[0]);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~
[64/69] Building CXX object tools/ptoas/CMakeFiles/pto-opt.dir/ptoas.cpp.o
[65/69] Building CXX object lib/PTO/Transforms/CMakeFiles/obj.PTOTransforms.dir/PTOToEmitC.cpp.o
ninja: build stopped: subcommand failed.
===== END STAGE build-ptoas rc=1 @ 2026-05-13 16:21:36 =====

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 3d27037785a3
  • Result summary: OK 5 / FAIL 1 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_162706_manual_pr642.log
  • Manual command: /run a5 abs quant quant_asym mgather mscatter partarg
  • Triggered by: root
  • Specified cases: abs,quant,quant_asym,mgather,mscatter,partarg
  • Failed stage: board-validation / exit=1

Failed cases

  • mgather (run, exit=2)

@reedhecre

A5 board test failure details: PR #642

mgather

stage=run info=exit=2

[ERROR] Mismatch: golden_v3.bin vs v3.bin, max diff=1023.0 at idx=991 (golden=1023, out=0, dtype=int32)
[ERROR] compare failed
[2026-05-13 16:31:13] ERROR: testcase failed (exit 2): mgather

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 1d13f09ea829
  • Result summary: OK 0 / FAIL 2 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_172307_manual_pr642.log
  • Manual command: /run a5 mgather mscatter
  • Triggered by: root
  • Specified cases: mgather,mscatter
  • Failed stage: board-validation / exit=1

Failed cases

  • mscatter (run, exit=2)
  • mgather (run, exit=2)

@reedhecre

A5 board test failure details: PR #642

mscatter

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/npu/a5/MScatter.hpp:402:13: error: static assertion failed due to requirement 'kIdxValidR == kSrcValidR': MSCATTER Coalesce::Elem requires index tile ValidRow == source tile ValidRow.
            static_assert(kIdxValidR == kSrcValidR,
            ^             ~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/npu/a5/MScatter.hpp:424:5: note: in instantiation of function template specialization 'pto::MScatterCheck<pto::Coalesce::Elem, pto::ScatterAtomicOp::None, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MScatterCheck<Mode, Atomic>(table, src, indices);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:1737:5: note: in instantiation of function template specialization 'pto::MSCATTER_IMPL<pto::Coalesce::Elem, pto::ScatterAtomicOp::None, pto::ScatterOOB::Undefined, pto::ScatterConflict::Last, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MSCATTER_IMPL<Mode>(dst, src, indexes);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/npu_validation/Mscatter/mscatter/mscatter_kernel.cpp:105:3: note: in instantiation of function template specialization 'pto::MSCATTER<pto::Coalesce::Elem, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  MSCATTER<pto::Coalesce::Elem>(v17, v18, v21);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/mscatter_kernel.dir/build.make:76: CMakeFiles/mscatter_kernel.dir/mscatter_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/mscatter_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-05-13 17:25:27] ERROR: testcase failed (exit 2): mscatter

mgather

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/npu/a5/MGather.hpp:273:13: error: static assertion failed due to requirement 'kIdxValidR == kDstValidR': MGATHER Coalesce::Elem requires index tile ValidRow == destination tile ValidRow.
            static_assert(kIdxValidR == kDstValidR,
            ^             ~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/npu/a5/MGather.hpp:294:5: note: in instantiation of function template specialization 'pto::MGatherCheck<pto::Coalesce::Elem, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MGatherCheck<Mode>(dst, table, indices);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/payload/pto-isa/include/pto/common/pto_instr.hpp:1710:5: note: in instantiation of function template specialization 'pto::MGATHER_IMPL<pto::Coalesce::Elem, pto::GatherOOB::Undefined, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MGATHER_IMPL<CMode>(dst, src, indexes);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/npu_validation/Mgather/mgather/mgather_kernel.cpp:101:3: note: in instantiation of function template specialization 'pto::MGATHER<pto::Coalesce::Elem, pto::Tile<pto::TileType::Vec, int, 32, 32, pto::BLayout::RowMajor, 32, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::GlobalTensor<int, pto::Shape<1, 1, 1, 32, 32>, pto::Stride<1024, 1024, 1024, 32, 1>, pto::Layout::ND>, pto::Tile<pto::TileType::Vec, int, 1, 32, pto::BLayout::RowMajor, 1, 32, pto::SLayout::NoneBox, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  MGATHER<pto::Coalesce::Elem>(v18, v11, v15);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/mgather_kernel.dir/build.make:76: CMakeFiles/mgather_kernel.dir/mgather_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/mgather_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-05-13 17:25:29] ERROR: testcase failed (exit 2): mgather
[2026-05-13 17:25:29] === SUMMARY ===
[2026-05-13 17:25:29] OK=0 FAIL=2 SKIP=0
[2026-05-13 17:25:29] RESULTS_TSV=/tmp/ptoas-board-monitor-a5/runs/20260513_172307_manual_pr642/remote_npu_validation_results.tsv
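
With `Coalesce::Elem` the quoted assertions require the index tile's ValidRow to equal the data tile's ValidRow (source for MSCATTER, destination for MGATHER). The failing instantiations paired a 1x32 index tile with a 32x32 data tile. The sketch below transcribes only that condition. The function name is hypothetical, and any further constraints in pto-isa that do not appear in this log are not modeled.

```python
def elem_coalesce_index_rows_ok(idx_valid_rows: int, data_valid_rows: int) -> bool:
    """Mirror of the Coalesce::Elem static_assert: kIdxValidR == kSrc/DstValidR."""
    return idx_valid_rows == data_valid_rows


DATA_ROWS = 32  # ValidRow of the 32x32 data tile in the log

# Index tile shapes (rows, cols) that appear in this thread.
for idx_shape in [(1, 32), (32, 32)]:
    ok = elem_coalesce_index_rows_ok(idx_shape[0], DATA_ROWS)
    print(idx_shape, "accepted" if ok else "rejected")
# (1, 32) rejected
# (32, 32) accepted
```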

@HecreReed
Collaborator Author

/run a5 mgather,mscatter

@reedhecre

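Received /run a5 mgather mscatter; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.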

@reedhecre

A5 board test succeeded

  • Trigger: manual
  • Source commit: e35ea85cff68
  • Result summary: OK 2 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260513_181006_manual_pr642.log
  • Results TSV: /root/ptoas-board-monitor-a5/logs/20260513_181006_manual_pr642.tsv
  • Manual command: /run a5 mgather mscatter
  • Triggered by: HecreReed
  • Specified cases: mgather,mscatter
  • Trigger comment: fix: stabilize A5 validation samples #642 (comment)
