Merged
Changes from 41 commits
44 commits
f11e7cc
[tensilelite] G1-G3+G5: tensilelite-host as a co-exported shared library
davidd-amd Jun 3, 2026
e5d0f9c
[tensilelite][hipsparselt] G6+G7: delete dead Tensile-fork glue; drop…
davidd-amd Jun 3, 2026
938eefb
[tensilelite] G2/G5: co-export tensilelite-host library-only (headers…
davidd-amd Jun 3, 2026
8cac881
[tensilelite][hipsparselt] fix the dedup for hipSPARSELt + SOVERSION …
davidd-amd Jun 3, 2026
df9bb76
[tensilelite] G2/G3: install curated TensileLite public-header FILE_S…
davidd-amd Jun 3, 2026
a67ddcf
[tensilelite] drop dead non-HIP DataTypes paths; stop leaking TENSILE…
davidd-amd Jun 3, 2026
a020fa9
[tensilelite] un-gate TENSILE_DEFAULT_SERIALIZATION; expose load API …
davidd-amd Jun 4, 2026
7158843
[tensilelite] install the rocisa enum.hpp the public Tensile headers …
davidd-amd Jun 4, 2026
a39bf8f
[tensilelite] drop dead TENSILE_HIDDEN_BEGIN/END no-op macros
davidd-amd Jun 16, 2026
6d8a4a3
[tensilelite] add public/private headers via a single target_sources …
davidd-amd Jun 16, 2026
06319a2
[tensilelite] drop explanatory comments flagged in review
davidd-amd Jun 16, 2026
18081a1
[tensilelite][hipblaslt][hipsparselt] keep origami out of the install…
davidd-amd Jun 16, 2026
839a143
[tensilelite] stop installing the Tensile/* host headers; keep the li…
davidd-amd Jun 16, 2026
9c6312c
[tensilelite] use the generated export macro directly; drop TENSILE_API
davidd-amd Jun 16, 2026
9460d12
[origami] run interface header-set verification during the build
davidd-amd Jun 16, 2026
7a94148
[hipblaslt] disable RocRoller for static builds so the export resolves
davidd-amd Jun 16, 2026
2c3e51d
[hipblaslt] export RocRoller as a static lib instead of disabling it …
davidd-amd Jun 16, 2026
c1153f6
[tensilelite] export client-needed public-header symbols from tensile…
davidd-amd Jun 17, 2026
f229395
[tensilelite] add structured client diagnostics facility
davidd-amd Jun 20, 2026
9ae357d
[tensilelite] mirror structured diagnostics in the python test harness
davidd-amd Jun 20, 2026
f3a9d04
[tensilelite] fix Windows dllexport error on TensorDescriptor operator<<
davidd-amd Jun 20, 2026
c695496
[tensilelite] stop the shared lib re-exporting static LLVM symbols
davidd-amd Jun 20, 2026
51f9d51
[hipsparselt] EXCLUDE_FROM_ALL on the hipBLASLt subdirectory
davidd-amd Jun 24, 2026
5c60cab
[tensilelite] add device-library codegen cmake modules (P3 G1/G2)
davidd-amd Jun 24, 2026
952b504
[tensilelite] wire device-library codegen export (P3 G2/G3)
davidd-amd Jun 24, 2026
b6e10da
[tensilelite] install + co-export the public Tensile header surface (…
davidd-amd Jun 24, 2026
7e00c78
[hipsparselt] consume hipBLASLt via find_package on TheRock (P3 G4, a…
davidd-amd Jun 24, 2026
9848563
[tensilelite] restore tensilelite:: alias; drive type from BUILD_SHAR…
davidd-amd Jun 24, 2026
38bff9c
[tensilelite] address CMake review: roc:: alias, simplify shared-libs…
davidd-amd Jun 25, 2026
6e9632d
Merge remote-tracking branch 'origin/develop' into users/davidd-amd/t…
davidd-amd Jun 25, 2026
d9ccb2a
[tensilelite] export fileToMsgObject across the shared-lib boundary
davidd-amd Jun 25, 2026
4273d91
[tensilelite][hipblaslt] keep origami out of the installed package su…
davidd-amd Jun 25, 2026
28fa92e
[tensilelite] drop dead GlobalParameters keys from gfx950 multi-DU YAMLs
davidd-amd Jun 25, 2026
e5dad83
[tensilelite] gate tensilelite-host shared build behind TENSILELITE_B…
davidd-amd Jun 26, 2026
48b0cb8
[hipsparselt] ship Tensile device library by gating embedded-dep inst…
davidd-amd Jun 26, 2026
56929fa
[hipsparselt] drop redundant origami/stinkytofu install gating
davidd-amd Jun 27, 2026
c33c000
[tensilelite] fix joblib 'Set changed size during iteration' in logic…
davidd-amd Jun 28, 2026
d5a26ea
Merge branch 'develop' into users/davidd-amd/tensilelite-shared-p2-te…
davidd-amd Jun 28, 2026
84f135b
[tensilelite] provision codegen Python a priori; keep the cmake expor…
davidd-amd Jun 26, 2026
58c50c1
Merge remote-tracking branch 'origin/develop' into users/davidd-amd/t…
davidd-amd Jun 29, 2026
22c59a1
[tensilelite] drop the structured client diagnostics facility from th…
davidd-amd Jun 29, 2026
e310076
[tensilelite][hipsparselt] address CMake review on codegen export & deps
davidd-amd Jun 30, 2026
8a1552e
Merge branch 'develop' into users/davidd-amd/tensilelite-shared-p2-te…
davidd-amd Jun 30, 2026
779092f
Merge branch 'develop' into users/davidd-amd/tensilelite-shared-p2-te…
davidd-amd Jul 1, 2026
30 changes: 20 additions & 10 deletions projects/hipblaslt/CMakeLists.txt
@@ -211,6 +211,11 @@ if(HIPBLASLT_ENABLE_DEVICE AND NOT WIN32)
endif()
endif()

set(HIPBLASLT_CODEGEN_ROOT "" CACHE STRING "Root of the codegen (tensilelite) Python sources. Defaults to the in-tree 'tensilelite' directory; override to use an installed/alternate tensile.")
if(NOT HIPBLASLT_CODEGEN_ROOT)
set(HIPBLASLT_CODEGEN_ROOT "${hipblaslt_SOURCE_DIR}/tensilelite")
endif()

if(HIPBLASLT_BUNDLE_PYTHON_DEPS)
set(HIPBLASLT_PYTHON_DEPS "_rocisa")
set(hipblaslt_python_dev Development.Module)
@@ -271,13 +276,6 @@ if(NOT ROCM_LIBS_SUPERBUILD)
endif()
endif()

# INTERFACE library that owns the public hipblaslt API headers (the in-tree
# `library/include` subtree plus its build-tree counterpart, where
# `hipblaslt-export.h` and `hipblaslt-version.h` are generated). Routing the
# include directories through a target lets consumers pick them up via the
# build graph rather than via `target_include_directories(<tgt> BEFORE
# PRIVATE .../library/include ...)`. `hip::host` is exposed as an INTERFACE
# link because the in-tree headers `#include <hip/...>`.
add_library(hipblaslt-headers INTERFACE)
add_library(hipblaslt::headers ALIAS hipblaslt-headers)
target_include_directories(hipblaslt-headers
@@ -336,7 +334,7 @@ if(HIPBLASLT_ENABLE_HOST)
roc::${hipblas_target}
PRIVATE
hip::device
tensilelite::tensilelite-host
roc::tensilelite-host
${CMAKE_DL_LIBS}
${rocTracer}
)
@@ -426,8 +424,12 @@ if(HIPBLASLT_ENABLE_HOST)

rocm_install_targets(TARGETS hipblaslt)

set(_hipblaslt_export_targets roc::hipblaslt)
if(TENSILELITE_BUILD_SHARED_LIBS)
list(APPEND _hipblaslt_export_targets roc::tensilelite-host)
endif()
rocm_export_targets(
TARGETS roc::hipblaslt
TARGETS ${_hipblaslt_export_targets}
DEPENDS PACKAGE hip
DEPENDS PACKAGE ${hipblas_target}
NAMESPACE roc::
@@ -436,9 +438,17 @@ if(HIPBLASLT_ENABLE_HOST)
configure_file(
"${CMAKE_CURRENT_SOURCE_DIR}/cmake/hipblaslt-config.cmake.in"
"${CMAKE_CURRENT_BINARY_DIR}/hipblaslt-config.cmake"
COPYONLY
@ONLY
)

if(NOT WIN32)
install(
FILES "${CMAKE_CURRENT_SOURCE_DIR}/cmake/HipBLASLtCodegen.cmake"
DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/hipblaslt"
COMPONENT devel
)
endif()

if( LEGACY_HIPBLAS_DIRECT )
rocm_package_add_dependencies(DEPENDS "hipblas >= 0.50.0")
else()
4 changes: 2 additions & 2 deletions projects/hipblaslt/clients/CMakeLists.txt
@@ -26,7 +26,7 @@ target_link_libraries(hipblaslt-clients-common
${LAPACK_LIBRARIES}
PRIVATE
hip::device
tensilelite::tensilelite-host
roc::tensilelite-host
)

if(HIPBLASLT_ENABLE_MXDATAGENERATOR)
@@ -111,7 +111,7 @@ if(BUILD_TESTING OR HIPBLASLT_BUILD_TESTING)
PRIVATE
hip::device
hipblaslt::hipblaslt-clients-common
tensilelite::tensilelite-host
roc::tensilelite-host
GTest::gtest
OpenMP::OpenMP_CXX
${CMAKE_DL_LIBS}
@@ -59,7 +59,7 @@
// only C++ test binary TheRock builds, ships, and runs for hipBLASLt is this client
// `hipblaslt-test`
// (HIPBLASLT_BUILD_TESTING), executed by test/therock/test_hipblaslt.py. Because
// hipblaslt-test already links tensilelite::tensilelite-host (via
// hipblaslt-test already links roc::tensilelite-host (via
// hipblaslt-clients-common), this white-box unit test can include
// <Tensile/CachingLibrary.hpp> directly and run with no new build dependency.
// Placing the regression here is what makes it actually execute in CI and guard
127 changes: 127 additions & 0 deletions projects/hipblaslt/cmake/HipBLASLtCodegen.cmake
@@ -0,0 +1,127 @@
# Copyright Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT

include_guard(GLOBAL)

function(hipblaslt_create_device_library)
set(_opts "")
set(_one
TARGET LOGIC_PATH OUTPUT_DIR CXX_COMPILER OFFLOAD_BUNDLER JOBS LOGIC_FILTER
ASAN YAML_FORMAT NO_COMPRESS EXPERIMENTAL LAZY_LOAD ASM_COMMENTS KEEP_BUILD_TMP ASM_DEBUG)
set(_multi ARCHES)
cmake_parse_arguments(_cdl "${_opts}" "${_one}" "${_multi}" ${ARGN})

if(_cdl_UNPARSED_ARGUMENTS)
message(FATAL_ERROR "hipblaslt_create_device_library: unexpected arguments: ${_cdl_UNPARSED_ARGUMENTS}")
endif()
if(NOT _cdl_LOGIC_PATH)
message(FATAL_ERROR "hipblaslt_create_device_library: LOGIC_PATH is required")
endif()
if(NOT _cdl_OUTPUT_DIR)
message(FATAL_ERROR "hipblaslt_create_device_library: OUTPUT_DIR is required")
endif()
if(NOT HIPBLASLT_PYTHON_COMMAND)
message(FATAL_ERROR "hipblaslt_create_device_library: HIPBLASLT_PYTHON_COMMAND is not set")
endif()
if(NOT HIPBLASLT_CODEGEN_ROOT)
message(FATAL_ERROR "hipblaslt_create_device_library: HIPBLASLT_CODEGEN_ROOT is not set")
endif()

if(NOT _cdl_TARGET)
set(_cdl_TARGET "tensilelite-device-libraries")
endif()
if(NOT _cdl_ARCHES)
set(_cdl_ARCHES ${GPU_TARGETS})
endif()
if(NOT _cdl_ARCHES)
message(FATAL_ERROR "hipblaslt_create_device_library: no ARCHES given and GPU_TARGETS is empty")
endif()
if(NOT _cdl_CXX_COMPILER)
set(_cdl_CXX_COMPILER "${CMAKE_CXX_COMPILER}")
endif()
if(NOT DEFINED _cdl_LAZY_LOAD)
set(_cdl_LAZY_LOAD ON)
endif()

file(MAKE_DIRECTORY "${_cdl_OUTPUT_DIR}/library")

list(JOIN _cdl_ARCHES "$<SEMICOLON>" _arches_semi)
set(_opts_list "--architecture=${_arches_semi}" "--cxx-compiler=${_cdl_CXX_COMPILER}")
if(_cdl_OFFLOAD_BUNDLER)
list(APPEND _opts_list "--offload-bundler=${_cdl_OFFLOAD_BUNDLER}")
endif()
if(_cdl_ASAN)
list(APPEND _opts_list "--address-sanitizer")
endif()
if(_cdl_JOBS)
list(APPEND _opts_list "--jobs=${_cdl_JOBS}")
endif()
if(_cdl_KEEP_BUILD_TMP)
list(APPEND _opts_list "--keep-build-tmp")
endif()
if(_cdl_ASM_DEBUG)
list(APPEND _opts_list "--asm-debug")
endif()
if(_cdl_YAML_FORMAT)
list(APPEND _opts_list "--library-format=yaml")
endif()
if(_cdl_LOGIC_FILTER)
list(APPEND _opts_list "--logic-filter=${_cdl_LOGIC_FILTER}")
endif()
if(_cdl_NO_COMPRESS)
list(APPEND _opts_list "--no-compress")
endif()
if(_cdl_EXPERIMENTAL)
list(APPEND _opts_list "--experimental")
endif()
if(NOT _cdl_LAZY_LOAD)
list(APPEND _opts_list "--no-lazy-library-loading")
endif()
if(NOT _cdl_ASM_COMMENTS)
list(APPEND _opts_list "--disable-asm-comments")
endif()

set(_known_bugs "${HIPBLASLT_CODEGEN_ROOT}/Tensile/TensileLogic/known_bugs.yaml")
set(_logic_stamp "${CMAKE_CURRENT_BINARY_DIR}/${_cdl_TARGET}-TensileLogic.stamp")
add_custom_command(
OUTPUT "${_logic_stamp}"
COMMENT "Validating library logic (TensileLogic --check-all) for ${_cdl_TARGET} ..."
COMMAND ${HIPBLASLT_PYTHON_COMMAND}
"${HIPBLASLT_CODEGEN_ROOT}/Tensile/bin/TensileLogic"
"${_cdl_LOGIC_PATH}"
--known-bugs
"${_known_bugs}"
--check-all
COMMAND ${CMAKE_COMMAND} -E touch "${_logic_stamp}"
DEPENDS ${HIPBLASLT_PYTHON_DEPS} "${_known_bugs}"
VERBATIM
USES_TERMINAL
)

set(_output_stamp "${CMAKE_CURRENT_BINARY_DIR}/${_cdl_TARGET}.stamp")
set(_tcl_command
${HIPBLASLT_PYTHON_COMMAND} -m Tensile.TensileCreateLibrary
${_opts_list}
"${_cdl_LOGIC_PATH}"
"${_cdl_OUTPUT_DIR}"
HIP
)
add_custom_command(
OUTPUT "${_output_stamp}"
COMMENT "Building device libraries to ${_cdl_OUTPUT_DIR} ..."
COMMAND ${_tcl_command}
COMMAND ${CMAKE_COMMAND} -E touch "${_output_stamp}"
DEPENDS ${HIPBLASLT_PYTHON_DEPS} "${_logic_stamp}"
VERBATIM
USES_TERMINAL
)

block(SCOPE_FOR VARIABLES)
list(JOIN _tcl_command " " _formatted_tcl)
message(STATUS "Device lib build command (${_cdl_TARGET}): ${_formatted_tcl}")
endblock()

add_custom_target(${_cdl_TARGET} ALL
DEPENDS "${_output_stamp}"
)
endfunction()
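
A minimal sketch of how a downstream project might call this function after `find_package(hipblaslt)`. It assumes the installed package ships `HipBLASLtCodegen.cmake` (the install rule in this PR is guarded by `NOT WIN32`) and that `HIPBLASLT_PYTHON_COMMAND` and `HIPBLASLT_CODEGEN_ROOT` resolve. The project name, the target name `my-device-libraries`, the `logic` directory, and the `gfx942` architecture are hypothetical placeholders, not part of this change.

```cmake
# The function uses block(), which needs CMake 3.25 or newer.
cmake_minimum_required(VERSION 3.25)
project(my_consumer LANGUAGES CXX)  # hypothetical consumer project

find_package(hipblaslt REQUIRED)

# The codegen module is only included by the package config when it was
# installed next to it, so check that the function actually exists.
if(NOT COMMAND hipblaslt_create_device_library)
  message(FATAL_ERROR "This hipblaslt install does not provide the codegen module")
endif()

hipblaslt_create_device_library(
  TARGET       my-device-libraries                 # hypothetical target name
  LOGIC_PATH   "${CMAKE_CURRENT_SOURCE_DIR}/logic" # hypothetical logic YAML directory
  OUTPUT_DIR   "${CMAKE_CURRENT_BINARY_DIR}/Tensile"
  ARCHES       gfx942                              # hypothetical; falls back to GPU_TARGETS if omitted
  ASM_COMMENTS ON                                  # omit to pass --disable-asm-comments
)
```

Only `LOGIC_PATH` and `OUTPUT_DIR` are mandatory. Per the defaults in the function above, omitting `CXX_COMPILER` falls back to `CMAKE_CXX_COMPILER`, and omitting `LAZY_LOAD` leaves lazy loading enabled.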
24 changes: 24 additions & 0 deletions projects/hipblaslt/cmake/hipblaslt-config.cmake.in
@@ -18,3 +18,27 @@ block(SCOPE_FOR VARIABLES)
message(FATAL_ERROR "Do not export targets with hip::device as an interface link library")
endif()
endblock()

if(EXISTS "${CMAKE_CURRENT_LIST_DIR}/HipBLASLtCodegen.cmake")
if(NOT HIPBLASLT_PYTHON_COMMAND OR NOT HIPBLASLT_CODEGEN_ROOT)
find_package(Python3 QUIET COMPONENTS Interpreter)
endif()
if(NOT HIPBLASLT_PYTHON_COMMAND AND Python3_EXECUTABLE)
set(HIPBLASLT_PYTHON_COMMAND "${Python3_EXECUTABLE}")
endif()
if(NOT HIPBLASLT_CODEGEN_ROOT AND Python3_EXECUTABLE)
execute_process(
COMMAND "${Python3_EXECUTABLE}" -c "import os, Tensile; print(os.path.dirname(Tensile.__path__[0]))"
OUTPUT_VARIABLE _hipblaslt_codegen_root
OUTPUT_STRIP_TRAILING_WHITESPACE
ERROR_QUIET
RESULT_VARIABLE _hipblaslt_codegen_rc
)
if(_hipblaslt_codegen_rc EQUAL 0 AND _hipblaslt_codegen_root)
set(HIPBLASLT_CODEGEN_ROOT "${_hipblaslt_codegen_root}")
endif()
unset(_hipblaslt_codegen_root)
unset(_hipblaslt_codegen_rc)
endif()
include("${CMAKE_CURRENT_LIST_DIR}/HipBLASLtCodegen.cmake")
endif()
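
The block above only probes for Python when `HIPBLASLT_PYTHON_COMMAND` or `HIPBLASLT_CODEGEN_ROOT` is unset, so a consumer can likely pin both before calling `find_package`. A sketch under that assumption follows; both paths are hypothetical. Note that the root is the directory that contains the `Tensile` package, not the package directory itself, because the codegen function resolves `${HIPBLASLT_CODEGEN_ROOT}/Tensile/bin/TensileLogic`.

```cmake
# Hypothetical paths; substitute the interpreter and site-packages directory
# of the environment where the Tensile Python package is installed.
set(HIPBLASLT_PYTHON_COMMAND "/opt/venv/bin/python3")
set(HIPBLASLT_CODEGEN_ROOT   "/opt/venv/lib/python3.12/site-packages")

# With both variables set, the config file skips the Python3 lookup and the
# `import Tensile` probe and goes straight to including the codegen module.
find_package(hipblaslt REQUIRED)

message(STATUS "hipBLASLt codegen root: ${HIPBLASLT_CODEGEN_ROOT}")
```

Pinning the values this way avoids depending on whichever `python3` the lookup happens to find first, at the cost of the caller having to keep the two paths consistent with each other.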
2 changes: 1 addition & 1 deletion projects/hipblaslt/cmake/hipblaslt_python.cmake
@@ -24,7 +24,7 @@ function(hipblaslt_configure_bundled_python_command python_binary_dir asan_optio
endif()
set(_python_path
"${python_binary_dir}"
"${hipblaslt_SOURCE_DIR}/tensilelite"
"${HIPBLASLT_CODEGEN_ROOT}"
)
list(JOIN _python_path "${_ds}" _python_path)

112 changes: 19 additions & 93 deletions projects/hipblaslt/device-library/CMakeLists.txt
@@ -1,6 +1,8 @@
# Copyright Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT

include(HipBLASLtCodegen)

set(TENSILELITE_BUILD_PARALLEL_LEVEL "" CACHE STRING "Number of CPU cores to use for building device libraries (will use nproc if unset).")
set(TENSILELITE_KEEP_BUILD_TMP OFF CACHE STRING "Keep temporary build directory for device libraries (turning this ON bloats the build size).")
set(TENSILELITE_ASM_DEBUG "" CACHE STRING "Keep debug information for built code objects.")
@@ -12,104 +14,28 @@ set(TENSILELITE_OFFLOADBUNDLER "" CACHE STRING "Path to clang-offload-bundler.")

set(HIPBLASLT_LIBLOGIC_PATH "" CACHE STRING "Path to library logic files (will use 'library' if unset).")
set(HIPBLASLT_TENSILE_LIBPATH "${hipblaslt_BINARY_DIR}/Tensile" CACHE STRING "Path to output the device gemm libraries.")
file(MAKE_DIRECTORY "${HIPBLASLT_TENSILE_LIBPATH}/library")

list(JOIN GPU_TARGETS "$<SEMICOLON>" TENSILELITE_GPU_TARGETS_SEMI_ESCAPED)

set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--architecture=${TENSILELITE_GPU_TARGETS_SEMI_ESCAPED}")
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--cxx-compiler=${CMAKE_CXX_COMPILER}")
if(TENSILELITE_OFFLOADBUNDLER)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--offload-bundler=${TENSILELITE_OFFLOADBUNDLER}")
endif()
if(HIPBLASLT_ENABLE_ASAN)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--address-sanitizer")
endif()
if(TENSILELITE_BUILD_PARALLEL_LEVEL)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--jobs=${TENSILELITE_BUILD_PARALLEL_LEVEL}")
endif()
if(TENSILELITE_KEEP_BUILD_TMP)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--keep-build-tmp")
endif()
if(TENSILELITE_ASM_DEBUG)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--asm-debug")
endif()
if(HIPBLASLT_ENABLE_YAML)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--library-format=yaml")
endif()
if(TENSILELITE_LOGIC_FILTER)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--logic-filter=${TENSILELITE_LOGIC_FILTER}")
endif()
if(TENSILELITE_NO_COMPRESS)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--no-compress")
endif()
if(TENSILELITE_EXPERIMENTAL)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--experimental")
endif()
if(NOT HIPBLASLT_ENABLE_LAZY_LOAD)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--no-lazy-library-loading")
endif()
if(NOT TENSILELITE_ENABLE_ASM_COMMENTS)
set(TENSILELITE_BUILD_OPTS ${TENSILELITE_BUILD_OPTS} "--disable-asm-comments")
endif()

if(NOT HIPBLASLT_LIBLOGIC_PATH)
set(HIPBLASLT_LIBLOGIC_PATH "${hipblaslt_SOURCE_DIR}/library")
endif()

# Pre-build gate: validate all library logic YAMLs (WorkGroup, MatrixInstruction,
# WorkGroupMappingXCC vs CU count, etc.) before generating .dat files. Fails build
# if any solution fails validation so bad logic is never compiled.
set(HIPBLASLT_TENSILELOGIC_KNOWN_BUGS
"${hipblaslt_SOURCE_DIR}/tensilelite/Tensile/TensileLogic/known_bugs.yaml"
)
# Stamp DEPENDS include known_bugs.yaml but not every library logic YAML (thousands
# of files; CONFIGURE_DEPENDS globs are costly). After editing logic only, run
# scripts/run_tensile_logic_check.py or touch the stamp input to force re-validation.
set(TENSILELOGIC_STAMP "${CMAKE_CURRENT_BINARY_DIR}/TensileLogic.stamp")
add_custom_command(
OUTPUT "${TENSILELOGIC_STAMP}"
COMMENT "Validating library logic (TensileLogic --check-all) ..."
COMMAND ${HIPBLASLT_PYTHON_COMMAND}
"${hipblaslt_SOURCE_DIR}/tensilelite/Tensile/bin/TensileLogic"
"${HIPBLASLT_LIBLOGIC_PATH}"
--known-bugs
"${HIPBLASLT_TENSILELOGIC_KNOWN_BUGS}"
--check-all
COMMAND ${CMAKE_COMMAND} -E touch "${TENSILELOGIC_STAMP}"
DEPENDS ${HIPBLASLT_PYTHON_DEPS} "${HIPBLASLT_TENSILELOGIC_KNOWN_BUGS}"
VERBATIM
USES_TERMINAL
)

set(output_stamp "${CMAKE_CURRENT_BINARY_DIR}/Tensile.stamp")
set(TENSILE_CREATE_LIBRARY_COMMAND
${HIPBLASLT_PYTHON_COMMAND} -m Tensile.TensileCreateLibrary
${TENSILELITE_BUILD_OPTS}
${HIPBLASLT_LIBLOGIC_PATH}
"${HIPBLASLT_TENSILE_LIBPATH}"
HIP
)

add_custom_command(
OUTPUT "${output_stamp}"
COMMENT "Building device libraries to ${HIPBLASLT_TENSILE_LIBPATH} ..."
COMMAND ${TENSILE_CREATE_LIBRARY_COMMAND}
COMMAND ${CMAKE_COMMAND} -E touch "${output_stamp}"
DEPENDS ${HIPBLASLT_PYTHON_DEPS} "${TENSILELOGIC_STAMP}"
# Because the command can contain special characters
VERBATIM
# Because this can be very long running and difficult to debug deadlocks
# without streaming.
USES_TERMINAL
)

block(SCOPE_FOR VARIABLES)
list(JOIN TENSILE_CREATE_LIBRARY_COMMAND " " FORMATTED_TCL)
message(STATUS "Device lib build command: ${FORMATTED_TCL}")
endblock()

add_custom_target(tensilelite-device-libraries ALL
DEPENDS "${output_stamp}"
hipblaslt_create_device_library(
TARGET tensilelite-device-libraries
LOGIC_PATH "${HIPBLASLT_LIBLOGIC_PATH}"
OUTPUT_DIR "${HIPBLASLT_TENSILE_LIBPATH}"
ARCHES ${GPU_TARGETS}
CXX_COMPILER "${CMAKE_CXX_COMPILER}"
OFFLOAD_BUNDLER "${TENSILELITE_OFFLOADBUNDLER}"
JOBS "${TENSILELITE_BUILD_PARALLEL_LEVEL}"
LOGIC_FILTER "${TENSILELITE_LOGIC_FILTER}"
ASAN "${HIPBLASLT_ENABLE_ASAN}"
YAML_FORMAT "${HIPBLASLT_ENABLE_YAML}"
NO_COMPRESS "${TENSILELITE_NO_COMPRESS}"
EXPERIMENTAL "${TENSILELITE_EXPERIMENTAL}"
LAZY_LOAD "${HIPBLASLT_ENABLE_LAZY_LOAD}"
ASM_COMMENTS "${TENSILELITE_ENABLE_ASM_COMMENTS}"
KEEP_BUILD_TMP "${TENSILELITE_KEEP_BUILD_TMP}"
ASM_DEBUG "${TENSILELITE_ASM_DEBUG}"
)

if(HIPBLASLT_ENABLE_EXTOPS)