Traceback (most recent call last):
  File "/xpu-perf/micro_perf/core/backend.py", line 400, in perf
    latency_us, _ = self.core_perf(op_instance, 2, 2, tensor_list, profiling=False)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xpu-perf/micro_perf/backends/GPU/backend_gpu.py", line 196, in core_perf
    op_instance.core_run(tensor_list[index])
  File "/xpu-perf/micro_perf/core/op.py", line 171, in core_run
    return self._run_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xpu-perf/micro_perf/core/ops/llm_ops.py", line 3164, in vendor_impl_run
    smooth_per_token_dynamic_quant(
  File "/xpu-perf/micro_perf/core/utils.py", line 691, in smooth_per_token_dynamic_quant
    hidden_states = hidden_states.contiguous().view(ori_shape[0], -1).to(torch.float32)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
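Cause:
The code does not handle the expert-parallel case in which an expert on the local rank is assigned 0 tokens in a given forward pass. In that case hidden_states has 0 elements and ori_shape[0] is 0, so view(ori_shape[0], -1) cannot infer the size of the -1 dimension.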
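
One possible way to avoid the ambiguity is to compute the second dimension explicitly instead of passing -1. The sketch below is only an illustration: the helper name explicit_2d_shape is hypothetical and not part of xpu-perf, and it assumes ori_shape is a tuple-like shape (a torch.Size also works, since it is a tuple subclass).

```python
import math


def explicit_2d_shape(ori_shape):
    """Return (rows, cols) for flattening every dimension after the first.

    Hypothetical helper, not part of xpu-perf. For non-empty tensors this
    matches what view(ori_shape[0], -1) would infer. Unlike -1, it stays
    well defined when ori_shape[0] == 0, because cols is computed from the
    trailing dimensions rather than from the element count.
    """
    rows = ori_shape[0]
    # math.prod of an empty sequence is 1, so a 1-D shape maps to (rows, 1).
    cols = math.prod(ori_shape[1:])
    return rows, cols


# Shape of an expert's input when it received 0 tokens (hidden size assumed 4096).
print(explicit_2d_shape((0, 4096)))
# A regular 3-D input.
print(explicit_2d_shape((3, 4, 5)))
```

Under these assumptions, the failing line in smooth_per_token_dynamic_quant could become hidden_states.contiguous().view(*explicit_2d_shape(ori_shape)).to(torch.float32). Whether the rest of the function also needs an early return for empty input depends on code not shown in the traceback.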