Conversation

jiafei96

fix for #1178

@kevin-t-tang

kevin-t-tang commented Aug 14, 2025

Looks good. One suggestion regarding a runtime exception:
undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_lS0_lllib

--- a/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
+++ b/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
@@ -74,8 +74,8 @@ namespace gptq_marlin {
 torch::Tensor gptq_marlin_gemm(torch::Tensor& a, torch::Tensor& b_q_weight,
     torch::Tensor& b_scales, torch::Tensor& g_idx,
     torch::Tensor& perm, torch::Tensor& workspace,
-    int64_t num_bits, int64_t size_m, int64_t size_n,
-    int64_t size_k, bool is_k_full) {
+    int64_t num_bits, torch::Tensor size_m_tensor, int64_t size_m, int64_t size_n,
+    int64_t size_k, int sms, bool is_k_full) {
     TORCH_CHECK_NOT_IMPLEMENTED(false,
         "marlin_gemm(..) requires CUDA_ARCH >= 8.0");
     return torch::empty({ 1, 1 });

@jiafei96
Author

Looks Good, and one suggestion for runtime exception: undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_lS0_lllib

I sincerely apologize. This patch only addresses the compilation issue. Since vLLMMarlin isn't supported, I had commented it out in my local environment. Alternatively, we could move the import vLLMMarlin statement to its actual usage location. I'll update the patch shortly.

@jiafei96 force-pushed the fix_rocm_build_error branch from f87f015 to 7bee717 on August 28, 2025 at 08:12
@jiafei96
Author

Looks Good, and one suggestion for runtime exception: undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_lS0_lllib

@kevin-t-tang Just updated. Could you please review? Thanks!
