Skip to content

ManagedSharedPtr #298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 50 commits into
base: develop
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
50 commits
Select commit Hold shift + click to select a range
6147df2
Basic reference counting on host; host & device construction in make_…
mdavis36 Apr 4, 2024
f489a15
reinterpret cast record pointer on copy; adding accessors; custom del…
mdavis36 Apr 9, 2024
c18723a
SharedPointerRecord header; Generalizing Record w/ void* pointers; Fo…
mdavis36 Apr 9, 2024
75749a2
MSPtr counter takes on ownership of the pointer record creation and l…
mdavis36 Apr 9, 2024
067b66a
polymorphic object host->device copy testing; Working on getting Shar…
mdavis36 Apr 22, 2024
c4db6a1
Pushing latest changes to ManagedSharedPtr interface
mdavis36 Apr 30, 2024
5cabef2
Fixing Umpire copies; Using umpire allocators in makeSharedPtr.
mdavis36 May 1, 2024
7b0176e
Recursive polymorphic object copies.
mdavis36 May 7, 2024
435172a
Better copies between host + device; Register Touch considers const a…
mdavis36 May 8, 2024
acf00ad
Proper destruction on host & device; Correct deallocation w/ Umpire; …
mdavis36 May 9, 2024
c4bcb9b
CHAIPoly type tag for polymorphic types and defining their h/d copy b…
mdavis36 May 9, 2024
c5b9255
Making ManagedSharedPtr CHAICopyable (ISSUE: umpire throws error upon…
mdavis36 Jun 3, 2024
de62d4e
makeSharedPtrRecord takes pointers & spaces as initializer list.
mdavis36 Jun 3, 2024
f237611
Cleaning up unnecessary logic in SharedPointerRecord.
mdavis36 Jun 4, 2024
6e19aa8
Usind Def Ctor & Dtor to init and free elements of a ManagedArray tha…
mdavis36 Jun 4, 2024
1b4d1fa
Non GPU build compiling and passing basic tests...
mdavis36 Jun 5, 2024
d81d019
empty map assertions for shared_ptr tests.
mdavis36 Jun 5, 2024
32f81be
ManagedArray Size updates from pointer record on host call.
mdavis36 Jun 5, 2024
d97b556
Turn off copy ctor debug output.
mdavis36 Jun 5, 2024
c9d4bd0
Guarding GPU related work that throws an error when built without GPU…
mdavis36 Jun 11, 2024
eb72574
Guarding tests when not built w/ CUDA/HIP
mdavis36 Jun 26, 2024
be02184
Squash warnings when not building w/ GPU support.
mdavis36 Jul 9, 2024
79cb51f
Squash warnings when building Werror or pedantic.
mdavis36 Jul 9, 2024
2a53f53
Revert changes to managed_ptr_tests
mdavis36 Sep 9, 2024
eb23581
RajaExec plugin construction needs to be static.
mdavis36 Nov 21, 2024
68de020
HIP agnostic test macros.
mdavis36 Dec 12, 2024
9bb1281
HIP support for ManagedSharedPtr
mdavis36 Dec 17, 2024
a54519d
Merge branch 'develop' into feature/ManagedSharedPtr
mdavis36 Feb 3, 2025
6dcbce4
Pointing submodules to the same as develop.
mdavis36 Feb 3, 2025
c88613a
raja-chai-launch @ develop
mdavis36 Feb 3, 2025
338db55
More updates w/ develop.
mdavis36 Feb 3, 2025
5ab9ac3
Fixing clang + cuda build on blueos.
mdavis36 Feb 4, 2025
abf6011
Better comments for initInner & freeInner; Remove redudant PR inst.
mdavis36 Feb 4, 2025
6706d35
Pulling out common device helper functions and generalizing some devi…
mdavis36 Feb 4, 2025
e682c10
Squashing warnings in managed_shared_ptr_tests.
mdavis36 Feb 4, 2025
640f871
Seperate xnack host-config for amdclang
mdavis36 Feb 4, 2025
6bfafc9
Separate host-configs for amdclang w/o +xnack.
mdavis36 Feb 4, 2025
2d80d44
Squashing wanrings for amdclang HIP builds.
mdavis36 Feb 4, 2025
ccc0d75
Merge branch 'warn/amdclang' into feature/ManagedSharedPtr
mdavis36 Feb 4, 2025
5119b29
Squashing warnings for amdclang in ManagedSharedPtr.
mdavis36 Feb 4, 2025
bc7d2e3
Direct CHAI_UNUSED_VAR declaration.
mdavis36 Feb 4, 2025
20dcd64
Merge branch 'warn/amdclang' into feature/ManagedSharedPtr
mdavis36 Feb 5, 2025
3bbed6a
Merge branch 'develop' into feature/ManagedSharedPtr
mdavis36 Feb 7, 2025
07e804c
Squash warning from sync call not returning.
mdavis36 Feb 12, 2025
0c30375
Merge branch 'develop' into feature/ManagedSharedPtr
mdavis36 Feb 12, 2025
292a000
Cleaning up commented code moved to ChaiManager.hpp; Commenting out c…
mdavis36 Feb 25, 2025
ec7e084
Removing commented code blocks unnecessary for MSPtr.
mdavis36 Feb 25, 2025
0c181e3
Removing debug print statements from MSPtr development.
mdavis36 Feb 25, 2025
42b4830
Adding ChaiManager.hpp to chai_headers.
mdavis36 Feb 25, 2025
8aa65e0
Merge branch 'develop' into feature/ManagedSharedPtr
mdavis36 May 1, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 1 addition & 113 deletions src/chai/ArrayManager.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,122 +7,10 @@
#ifndef CHAI_ArrayManager_HPP
#define CHAI_ArrayManager_HPP

#include "chai/config.hpp"
#include "chai/ChaiMacros.hpp"
#include "chai/ExecutionSpaces.hpp"
#include "chai/PointerRecord.hpp"
#include "chai/Types.hpp"

#if defined(CHAI_ENABLE_RAJA_PLUGIN)
#include "chai/pluginLinker.hpp"
#endif

#include <unordered_map>

#include "umpire/Allocator.hpp"
#include "umpire/util/MemoryMap.hpp"

#if defined(CHAI_ENABLE_CUDA)
#include <cuda_runtime_api.h>
#endif
#if defined(CHAI_ENABLE_HIP)
#include "hip/hip_runtime_api.h"
#endif
#include "chai/ChaiManager.hpp"

namespace chai
{
// CHAI_GPU_ERROR_CHECK macro
#if defined(CHAI_ENABLE_CUDA) || defined(CHAI_ENABLE_HIP)

#ifdef CHAI_ENABLE_GPU_ERROR_CHECKING

#ifdef CHAI_ENABLE_CUDA
inline void gpuErrorCheck(cudaError_t code, const char *file, int line, bool abort=true)
{
if (code != cudaSuccess) {
fprintf(stderr, "[CHAI] GPU Error: %s %s %d\n", cudaGetErrorString(code), file, line);
if (abort) {
exit(code);
}
}
}
#elif defined(CHAI_ENABLE_HIP)
inline void gpuErrorCheck(hipError_t code, const char *file, int line, bool abort=true)
{
if (code != hipSuccess) {
fprintf(stderr, "[CHAI] GPU Error: %s %s %d\n", hipGetErrorString(code), file, line);
if (abort) {
exit(code);
}
}
}
#endif


#define CHAI_GPU_ERROR_CHECK(code) { gpuErrorCheck((code), __FILE__, __LINE__); }
#else // CHAI_ENABLE_GPU_ERROR_CHECKING
#define CHAI_GPU_ERROR_CHECK(code) code
#endif // CHAI_ENABLE_GPU_ERROR_CHECKING

#endif

// wrapper for hip/cuda synchronize
inline void synchronize() {
#if defined (CHAI_ENABLE_HIP) &&!defined(__HIP_DEVICE_COMPILE__)
CHAI_GPU_ERROR_CHECK(hipDeviceSynchronize());
#elif defined (CHAI_ENABLE_CUDA) &&!defined(__CUDA_ARCH__)
CHAI_GPU_ERROR_CHECK(cudaDeviceSynchronize());
#endif
}

#if defined(CHAI_GPUCC) || defined(CHAI_ENABLE_GPU_SIMULATION_MODE)

// wrapper for hip/cuda free
CHAI_HOST inline void gpuFree(void* buffer) {
#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
free(buffer);
#elif defined (CHAI_ENABLE_HIP)
CHAI_GPU_ERROR_CHECK(hipFree(buffer));
#elif defined (CHAI_ENABLE_CUDA)
CHAI_GPU_ERROR_CHECK(cudaFree(buffer));
#endif
}

// wrapper for hip/cuda malloc
CHAI_HOST inline void gpuMalloc(void** devPtr, size_t size) {
#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
*devPtr = (void*)malloc(size);
#elif defined (CHAI_ENABLE_HIP)
CHAI_GPU_ERROR_CHECK(hipMalloc(devPtr, size));
#elif defined (CHAI_ENABLE_CUDA)
CHAI_GPU_ERROR_CHECK(cudaMalloc(devPtr, size));
#endif
}

// wrapper for hip/cuda managed malloc
CHAI_HOST inline void gpuMallocManaged(void** devPtr, size_t size) {
#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
*devPtr = (void*)malloc(size);
#elif defined (CHAI_ENABLE_HIP)
CHAI_GPU_ERROR_CHECK(hipMallocManaged(devPtr, size));
#elif defined (CHAI_ENABLE_CUDA)
CHAI_GPU_ERROR_CHECK(cudaMallocManaged(devPtr, size));
#endif
}

// wrapper for hip/cuda mem copy
CHAI_HOST inline void gpuMemcpy(void* dst, const void* src, size_t count, gpuMemcpyKind kind) {
#if defined(CHAI_ENABLE_GPU_SIMULATION_MODE)
memcpy(dst, src, count);
#elif defined (CHAI_ENABLE_HIP)
CHAI_GPU_ERROR_CHECK(hipMemcpy(dst, src, count, kind));
#elif defined (CHAI_ENABLE_CUDA)
CHAI_GPU_ERROR_CHECK(cudaMemcpy(dst, src, count, kind));
#endif
}

#endif //#if defined(CHAI_GPUCC)

/*!
* \brief Singleton that manages caching and movement of ManagedArray objects.
*
Expand Down
2 changes: 2 additions & 0 deletions src/chai/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ set (chai_headers
ArrayManager.hpp
ArrayManager.inl
ChaiMacros.hpp
ChaiManager.hpp
ExecutionSpaces.hpp
ManagedArray.hpp
ManagedArray.inl
Expand All @@ -29,6 +30,7 @@ if(CHAI_DISABLE_RM)
endif ()

set (chai_sources
SharedPtrManager.cpp
ArrayManager.cpp)

set (chai_depends
Expand Down
10 changes: 10 additions & 0 deletions src/chai/ChaiMacros.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,11 @@
#define gpuMemcpyDeviceToDevice cudaMemcpyDeviceToDevice
#define gpuMemcpyDefault cudaMemcpyDefault

#define gpuSuccess cudaSuccess
#define gpuError_t cudaError_t
#define gpuGetErrorString cudaGetErrorString
#define gpuDeviceSynchronize cudaDeviceSynchronize

// NOTE: Cannot have if defined(__HIPCC__) in the condition below, since __HIPCC__ comes from the included header hip_runtime below.
#elif defined(CHAI_ENABLE_HIP)

Expand All @@ -48,6 +53,11 @@
#define gpuMemcpyDeviceToDevice hipMemcpyDeviceToDevice
#define gpuMemcpyDefault hipMemcpyDefault

#define gpuSuccess hipSuccess
#define gpuError_t hipError_t
#define gpuGetErrorString hipGetErrorString
#define gpuDeviceSynchronize hipDeviceSynchronize

#else

#define CHAI_HOST
Expand Down
28 changes: 28 additions & 0 deletions src/chai/ChaiManager.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
//////////////////////////////////////////////////////////////////////////////
// Copyright (c) 2016-24, Lawrence Livermore National Security, LLC and CHAI
// project contributors. See the CHAI LICENSE file for details.
//
// SPDX-License-Identifier: BSD-3-Clause
//////////////////////////////////////////////////////////////////////////////
#ifndef CHAI_ChaiManager_HPP
#define CHAI_ChaiManager_HPP

#include "chai/ChaiMacros.hpp"
#include "chai/ExecutionSpaces.hpp"
#include "chai/Types.hpp"

#include "chai/PointerRecord.hpp"

#if defined(CHAI_ENABLE_RAJA_PLUGIN)
#include "chai/pluginLinker.hpp"
#endif

#include <unordered_map>

#include "umpire/Allocator.hpp"
#include "umpire/util/MemoryMap.hpp"


#include "chai/util/DeviceHelpers.hpp"

#endif // CHAI_ChaiManager_HPP
24 changes: 22 additions & 2 deletions src/chai/ManagedArray.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -373,14 +373,14 @@ class ManagedArray : public CHAICopyable
// shenanigan reasons need to be defined here.
#if !defined(CHAI_DISABLE_RM)
// if T is a CHAICopyable, then it is important to initialize all the
// ManagedArrays to nullptr at allocation, since it is extremely easy to
// elements with default constructors, since it is extremely easy to
// trigger a moveInnerImpl, which expects inner values to be initialized.
template <bool B = std::is_base_of<CHAICopyable, T>::value,
typename std::enable_if<B, int>::type = 0>
CHAI_HOST bool initInner(size_t start = 0)
{
for (size_t i = start; i < m_size/sizeof(T); ++i) {
m_active_base_pointer[i] = nullptr;
new (&m_active_base_pointer[i]) T();
}
return true;
}
Expand All @@ -392,6 +392,26 @@ class ManagedArray : public CHAICopyable
{
return false;
}

// if T is a CHAICopyable, then it is important to free all the
// CHAICopyable containers, which expect inner values to be initialized.
template <bool B = std::is_base_of<CHAICopyable, T>::value,
typename std::enable_if<B, int>::type = 0>
CHAI_HOST bool freeInner(size_t start = 0)
{
for (size_t i = start; i < m_size/sizeof(T); ++i) {
m_active_base_pointer[i].~T();
}
return true;
}

// Do not deep initialize if T is not a CHAICopyable.
template <bool B = std::is_base_of<CHAICopyable, T>::value,
typename std::enable_if<!B, int>::type = 0>
CHAI_HOST bool freeInner(size_t = 0)
{
return false;
}
#endif
protected:
/*!
Expand Down
7 changes: 7 additions & 0 deletions src/chai/ManagedArray.inl
Original file line number Diff line number Diff line change
Expand Up @@ -249,12 +249,14 @@ CHAI_HOST void ManagedArray<T>::reallocate(size_t elems)
// trigger a moveInnerImpl, which expects inner values to be initialized.
if (initInner(old_size/sizeof(T))) {
// if we are active on the GPU, we need to send any newly initialized inner members to the device
#if defined(CHAI_ENABLE_CUDA) || defined(CHAI_ENABLE_HIP)
if (m_pointer_record->m_last_space == GPU && old_size < m_size) {
umpire::ResourceManager & umpire_rm = umpire::ResourceManager::getInstance();
void *src = (void *)(((char *)(m_pointer_record->m_pointers[CPU])) + old_size);
void *dst = (void *)(((char *)(m_pointer_record->m_pointers[GPU])) + old_size);
umpire_rm.copy(dst,src,m_size-old_size);
}
#endif
}

CHAI_LOG(Debug, "m_active_ptr reallocated at address: " << m_active_pointer);
Expand All @@ -276,6 +278,8 @@ CHAI_HOST void ManagedArray<T>::free(ExecutionSpace space)
if (m_pointer_record == &ArrayManager::s_null_record) {
m_pointer_record = m_resource_manager->makeManaged((void *)m_active_base_pointer,m_size,space,true);
}
freeInner();

m_resource_manager->free(m_pointer_record, space);
m_active_pointer = nullptr;
m_active_base_pointer = nullptr;
Expand All @@ -302,6 +306,9 @@ CHAI_HOST void ManagedArray<T>::reset()
template<typename T>
CHAI_INLINE
CHAI_HOST_DEVICE size_t ManagedArray<T>::size() const {
#if !defined(CHAI_DEVICE_COMPILE)
if (!m_is_slice) m_size = m_pointer_record->m_size;
#endif
return m_size/sizeof(T);
}

Expand Down
Loading