yucai-intel
Contributor

Fixes the following issues exposed by test/test_nn.py::TestNNDeviceTypeXPU::test_avg_pool_large_tensor2_xpu:

  1. A segmentation fault caused by a data type conversion error that produced an invalid memory address.
  2. A calculation error caused by data overflow.
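
For readers unfamiliar with this class of bug, the following is a minimal host-side sketch of how a flat tensor offset can exceed the 32-bit range. It is not the kernel code from this PR: the helper names and the 4 x 1024 x 1024 x 1024 shape are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: element count of an NCHW tensor, computed in 64 bits.
inline int64_t numel_nchw(int64_t n, int64_t c, int64_t h, int64_t w) {
  return n * c * h * w;
}

// Hypothetical helper: flat offset of element (n, c, h, w) in a contiguous
// NCHW tensor with C channels, height H, and width W.
inline int64_t flat_index_nchw(int64_t n, int64_t c, int64_t h, int64_t w,
                               int64_t C, int64_t H, int64_t W) {
  return ((n * C + c) * H + h) * W + w;
}

// True when a 64-bit offset can be stored in a signed 32-bit index losslessly.
inline bool fits_in_int32(int64_t v) {
  return v >= std::numeric_limits<int32_t>::min() &&
         v <= std::numeric_limits<int32_t>::max();
}

// Value a 32-bit unsigned index would hold after truncation.
// Conversion to an unsigned type is well-defined (modulo 2^32).
inline uint32_t truncate_to_u32(int64_t v) {
  return static_cast<uint32_t>(v);
}
```

With the assumed shape, the element count is 2^32, which truncates to 0 in a 32-bit index, and the offset of element (3, 0, 0, 0) is 3 * 2^30, which no longer fits in a signed 32-bit integer. An offset that wraps in this way can point outside the buffer, which is consistent with both symptoms listed above.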

@yucai-intel
Contributor Author

[image]

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes two issues in the AveragePool2dKernel implementation for XPU devices: a segmentation fault and a calculation error, both of which were causing test failures.

  • Replaced direct index access with XPU_KERNEL_LOOP macros for safer kernel iteration
  • Changed data types from index_t to int64_t so that index arithmetic does not overflow on large tensors
  • Added group_size limit to prevent exceeding hardware capabilities
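
The first two changes above can be sketched on the host as follows. This is an illustration under stated assumptions, not the PR's kernel: it assumes XPU_KERNEL_LOOP follows the usual grid-stride pattern (each work item starts at its global id and advances by the total number of work items), and `PoolCoord`, `decompose`, and `grid_stride_indices` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Output coordinates recovered from a flat NCHW output index.
struct PoolCoord {
  int64_t n, c, ph, pw;
};

// Decompose a flat output index using 64-bit arithmetic throughout,
// so large outputs do not wrap around.
inline PoolCoord decompose(int64_t index, int64_t channels,
                           int64_t pooled_height, int64_t pooled_width) {
  PoolCoord p;
  p.pw = index % pooled_width;
  p.ph = (index / pooled_width) % pooled_height;
  p.c = (index / pooled_width / pooled_height) % channels;
  p.n = index / pooled_width / pooled_height / channels;
  return p;
}

// Host-side stand-in for one work item of a grid-stride loop:
// start at the work item's global id, advance by the global range,
// and stop once the index reaches `total`.
inline std::vector<int64_t> grid_stride_indices(int64_t global_id,
                                                int64_t global_range,
                                                int64_t total) {
  std::vector<int64_t> visited;
  for (int64_t i = global_id; i < total; i += global_range) {
    visited.push_back(i);
  }
  return visited;
}
```

The bounds check inside the loop means a work item never touches an index at or beyond `total`, even when the launch range does not divide the element count evenly.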

const int64_t height,
const int64_t width,
const int64_t pooled_height,
const int pooled_width,

Copilot AI Sep 23, 2025

Inconsistent data type usage: pooled_width uses int while other dimension parameters use int64_t. This inconsistency could lead to overflow issues or unexpected behavior when dealing with large tensors. Consider changing to const int64_t pooled_width for consistency.
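
A minimal sketch of the narrowing risk described in this comment, assuming a 32-bit int: when a 64-bit dimension is passed to an int parameter, the conversion happens silently at the call site. The helper `narrow_to_int` is hypothetical and only shows one way to make the lossy case explicit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: store `value` in `*out` only if it fits in int.
// Returns false and leaves `*out` untouched when the conversion would lose data.
inline bool narrow_to_int(int64_t value, int* out) {
  if (value < std::numeric_limits<int>::min() ||
      value > std::numeric_limits<int>::max()) {
    return false;
  }
  *out = static_cast<int>(value);
  return true;
}
```

Declaring the parameter as int64_t, as the comment suggests, avoids the conversion altogether, so no such check is needed.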

const int64_t height_;
const int64_t width_;
const int64_t pooled_height_;
const int pooled_width_;

Copilot AI Sep 23, 2025

Inconsistent data type in member variable: pooled_width_ uses int while other dimension member variables use int64_t. This should be const int64_t pooled_width_ to match the constructor parameter and prevent potential overflow issues.

Contributor

@CuiYifeng CuiYifeng left a comment

Please check whether test_avg_pool_large_tensor2_xpu is enabled in CI.
The rest of this PR looks good to me.

Comment on lines 299 to 300
const uint32_t group_size =
std::min(static_cast<int>(syclMaxWorkItemsPerSubSlice()), 1024);
Contributor

@CuiYifeng CuiYifeng Sep 23, 2025

This hard-coded limit may have a negative impact on some platforms, especially future ones.

Contributor

May I ask why 1024 is needed here?

Contributor Author

Removed
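
To make the concern in this thread concrete, here is a small host-side sketch of what the clamp does. The parameter `device_max_work_items` is an assumption standing in for the value returned by syclMaxWorkItemsPerSubSlice(), which depends on the hardware; `clamped_group_size` and `num_groups` are hypothetical helpers, not functions from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamped variant, mirroring the snippet under review:
// the result never exceeds `cap`, whatever the device reports.
inline uint32_t clamped_group_size(int64_t device_max_work_items, int cap) {
  return static_cast<uint32_t>(
      std::min<int64_t>(device_max_work_items, cap));
}

// Number of work-groups needed to cover `total` elements,
// rounded up and computed in 64 bits.
inline int64_t num_groups(int64_t total, int64_t group_size) {
  return (total + group_size - 1) / group_size;
}
```

On a device that reports more than 1024 work items, the clamped size stays at 1024, so the constant rather than the hardware decides the work-group size. That is the platform dependence the reviewers raised.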

@CuiYifeng CuiYifeng requested a review from guangyey September 24, 2025 08:56
Contributor

@guangyey guangyey left a comment

One nit, otherwise, LGTM.
