Adversarr/mathprim

mathprim: a lightweight Tensor(View) library

mathprim (short for "math primitives") is a glue-style library of optimized numerical algorithms.

Current Backends:

  1. CPU
  2. CUDA

Features:

  1. Key features:
    1. Zero-cost abstractions.
    2. Eigen interoperability via the Map class.
  2. BLAS, optimized for speed:
    1. CPU: handmade, Eigen, cblas
    2. GPU: cuBLAS
  3. Parallelism:
    1. CPU: OpenMP, sequential
    2. GPU: CUDA
  4. Sparse matrices (mainly SPD):
    1. SpMV for CSR and CSC matrices.
    2. cuSPARSE SpMV support.
    3. Direct solver: CHOLMOD.
    4. Iterative solvers: CG/PCG.
    5. Preconditioners:
      1. Diagonal (Jacobi): CPU/CUDA
      2. IC, ILU: CPU (Eigen's implementation) and CUDA (cuSPARSE).
      3. FSAI0 (hand-crafted).
  5. Fully customizable optimizers:
    1. GD/SGD, with momentum (or Nesterov momentum)
    2. AdamW
    3. L-BFGS, with customizable:
      1. line searcher (defaults to backtracking)
      2. preconditioner (defaults to scaled identity)

Future work:

  1. PyTorch bindings, especially for SPD sparse matrices.
  2. A tiny neural-network library, cf. tiny-cuda-nn.
  3. In-place operations on views; faster GEMMs on CPU/CUDA.
  4. CUDA: graph capture, multiple streams, ...

Recipes

View over an existing raw buffer.

// First dimension is dynamic (keep_dim), the last two are static.
shape_t<keep_dim, 3, 2> shape(4, 3, 2);
int p[24];
for (int i = 0; i < 24; ++i) {
  p[i] = i + 1;
}

// Wrap the raw pointer in a non-owning CPU view.
auto view = make_view<device::cpu>(p, shape);
auto value0 = view(0, 0, 0);  // == 1

View from an owned buffer.

auto buf = make_buffer<int>(4, 3, 2);
auto view = buf.view();
for (int i = 0; i < 4; ++i) {
  for (int j = 0; j < 3; ++j) {
    for (int k = 0; k < 2; ++k) {
      buf.data()[i * 6 + j * 2 + k] = i * 6 + j * 2 + k + 1;
      EXPECT_EQ(view(i, j, k), i * 6 + j * 2 + k + 1);
    }
  }
}

for (const auto &vi : view) {
  // vi is a (3, 2) sub-view along the first dimension.
  for (int j = 0; j < 3; ++j) {
    for (int k = 0; k < 2; ++k) {
      EXPECT_EQ(vi(j, k), j * 2 + k + 1);
    }
  }
  break;  // only check the first slice
}

CUDA parallelism

__global__ void set_value(float *ptr, int size) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < size) {
    ptr[idx] = static_cast<float>(idx);
  }
}

// Create a buffer on the CUDA device; the first dimension is dynamic.
auto buf = make_buffer<float, device::cuda>(shape_t<-1, 4>(10, 4));
auto view = buf.view();
// One thread per element (one single-thread block each, for brevity).
set_value<<<view.size(), 1>>>(buf.data(), buf.size());

// Parallelism: run a device lambda over the view's shape; each invocation
// receives a multi-dimensional index that can be structured-bound.
par::cuda parfor;
parfor.run(view.shape(), [view] __device__(auto idx) {
  auto [i, j] = idx;
  printf("Lambda view[%d, %d] = %f\n", i, j, view(i, j));
});
// A 4-D launch over (10, 4, 1, 1); the view is still indexed with (i, j).
parfor.run(dshape<4>(10, 4, 1, 1), [view] __device__(auto idx) {
  auto [i, j, k, l] = idx;
  printf("Lambda view[%d, %d, %d, %d] = %f\n", i, j, k, l, view(i, j));
});