MueLu: Initial integration of Kokkos Tuning into MueLu #13883

Closed
24 commits
db86544
MueLu: Initial version of the KokkosTuning interface that compiles, b…
csiefer2 Mar 11, 2025
7ea6384
MueLu: Adding kokkos tune support to driver
csiefer2 Mar 12, 2025
ce6c8fc
Minor fixes to tuning interface (and debugging statements)
csiefer2 Mar 13, 2025
1402960
MueLu: Adding nested list support to KTI
csiefer2 Mar 13, 2025
04433ab
Minor fixes to tuning interface (and debugging statements)
csiefer2 Mar 13, 2025
bf79bc2
MueLu: Getting the finally working version of the KokkosTuning MueLu …
csiefer2 Mar 14, 2025
434bc24
MueLu: Adding KokkosTuning unit test
csiefer2 Mar 14, 2025
32a7d36
STK: swap macro order
jhux2 Mar 10, 2025
cd19c0f
MueLu DriverCore: Fix types
cgcgcg Mar 10, 2025
d89f4cd
Xpetra: Fix unreachable return
cgcgcg Mar 10, 2025
d0d9115
Xpetra: Fix for BlockedCrsMatrix
cgcgcg Feb 25, 2025
cd6d1aa
Xpetra CrsMatrixWrap: Fix return type for getLocalMatrixHost
cgcgcg Feb 26, 2025
e341a76
Teuchos: Remove GCC diagnostic pragmas
cgcgcg Feb 26, 2025
9c1f03e
MueLu: Fix compile issues on Windows
cgcgcg Feb 26, 2025
33161ef
Bump github/codeql-action from 3.28.10 to 3.28.11
dependabot[bot] Mar 10, 2025
42cad4d
MueLu CoalesceDropFactory_kokkos: Fix debug code
cgcgcg Mar 11, 2025
bc1d455
MueLu RefMaxwell: Pass "distance laplacian algo" option in Kokkos cod…
cgcgcg Mar 11, 2025
abde905
MueLu Driver: Fix build issue with complex
cgcgcg Mar 11, 2025
1aa4c60
MueLu MeshTyingBlocked_NodeBased: Fix type for complex build
cgcgcg Mar 11, 2025
d039e90
STK: Snapshot 03-11-25 10:17 from Sierra 5.23.7-239-g3b36ebca
hpacella Mar 11, 2025
f7d2a22
intrepid2: update SerialQR_Internal namespacing
ndellingwood Mar 12, 2025
65f2d9b
Shards: Remove unimplemented methods
cgcgcg Jan 20, 2025
fd56e72
ShyLU-Basker : compile errors with complex variables (#13870)
iyamazaki Mar 13, 2025
3cfada4
Panzer: add a variant of setupExodusFile() with a vector<IossProperty…
tkordenbrock Mar 12, 2025
4 changes: 2 additions & 2 deletions .github/workflows/codeql.yml
@@ -45,7 +45,7 @@ jobs:
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

- name: Initialize CodeQL
uses: github/codeql-action/init@b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d # v3.28.10
uses: github/codeql-action/init@6bb031afdd8eb862ea3fc1848194185e076637e5 # v3.28.11
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
@@ -108,6 +108,6 @@ jobs:
ninja -j 16

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d # v3.28.10
uses: github/codeql-action/analyze@6bb031afdd8eb862ea3fc1848194185e076637e5 # v3.28.11
with:
category: "/language:${{matrix.language}}"
2 changes: 1 addition & 1 deletion .github/workflows/scorecards.yml
@@ -66,6 +66,6 @@ jobs:

# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d # v3.28.10
uses: github/codeql-action/upload-sarif@6bb031afdd8eb862ea3fc1848194185e076637e5 # v3.28.11
with:
sarif_file: results.sarif
13 changes: 11 additions & 2 deletions packages/amesos2/src/Amesos2_ShyLUBasker_FunctionMap.hpp
@@ -41,16 +41,25 @@ namespace Amesos2 {
* \brief Pass function calls to ShyLUBasker based on data type.

*/
#ifdef HAVE_TEUCHOS_COMPLEX
#ifdef HAVE_TEUCHOS_INST_COMPLEX_DOUBLE
template <>
struct FunctionMap<ShyLUBasker,Kokkos::complex<double>>
{
static std::complex<double> * convert_scalar(Kokkos::complex<double> * pData) {
return reinterpret_cast<std::complex<double> *>(pData);
}
};
#endif // HAVE_TEUCHOS_INST_COMPLEX_DOUBLE

#endif // HAVE_TEUCHOS_COMPLEX
#ifdef HAVE_TEUCHOS_INST_COMPLEX_FLOAT
template <>
struct FunctionMap<ShyLUBasker,Kokkos::complex<float>>
{
static std::complex<float> * convert_scalar(Kokkos::complex<float> * pData) {
return reinterpret_cast<std::complex<float> *>(pData);
}
};
#endif // HAVE_TEUCHOS_INST_COMPLEX_FLOAT

// if not specialized, then assume generic conversion is fine
template <typename scalar_t>
8 changes: 4 additions & 4 deletions packages/amesos2/src/Amesos2_ShyLUBasker_TypeMap.hpp
@@ -63,8 +63,8 @@ struct TypeMap<ShyLUBasker,double>
template <>
struct TypeMap<ShyLUBasker,std::complex<float> >
{
typedef std::complex<double> dtype;
typedef Kokkos::complex<double> type;
typedef std::complex<float> dtype;
typedef Kokkos::complex<float> type;
};

template <>
@@ -77,8 +77,8 @@ struct TypeMap<ShyLUBasker,std::complex<double> >
template <>
struct TypeMap<ShyLUBasker,Kokkos::complex<float> >
{
typedef std::complex<double> dtype;
typedef Kokkos::complex<double> type;
typedef std::complex<float> dtype;
typedef Kokkos::complex<float> type;
};

template <>
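The two TypeMap hunks above change the `std::complex<float>` and `Kokkos::complex<float>` specializations so that `dtype` and `type` stay in single precision instead of widening to `double`. The following is a minimal sketch of why that matters when pointers are later reinterpreted between the two types. It assumes a hypothetical `DeviceComplex<T>` stand-in for `Kokkos::complex<T>` and hypothetical `WidenedTag`, `FixedTag`, `DemoTypeMap`, and `float_mapping_ok` names; none of these are Amesos2 identifiers.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Hypothetical stand-in for Kokkos::complex<T>: two contiguous values of T.
template <typename T>
struct DeviceComplex { T re; T im; };

// Hypothetical tags: one mimics the old (widening) mapping, one the new mapping.
struct WidenedTag {};
struct FixedTag {};

template <typename Tag, typename Scalar>
struct DemoTypeMap;

// Old behaviour sketched: a single-precision scalar mapped to double-precision types.
template <>
struct DemoTypeMap<WidenedTag, std::complex<float>> {
  typedef std::complex<double> dtype;
  typedef DeviceComplex<double> type;
};

// New behaviour sketched: the mapped types keep the scalar's precision.
template <>
struct DemoTypeMap<FixedTag, std::complex<float>> {
  typedef std::complex<float> dtype;
  typedef DeviceComplex<float> type;
};

// True when the mapping preserves both the scalar type and its storage size,
// which is the precondition for reinterpreting a buffer of one as the other.
template <typename Tag>
constexpr bool float_mapping_ok() {
  return std::is_same<typename DemoTypeMap<Tag, std::complex<float>>::dtype,
                      std::complex<float>>::value &&
         sizeof(typename DemoTypeMap<Tag, std::complex<float>>::type) ==
             sizeof(std::complex<float>);
}

static_assert(float_mapping_ok<FixedTag>(), "fixed mapping keeps single precision");
static_assert(!float_mapping_ok<WidenedTag>(), "widened mapping changes element size");
```

With the widened mapping, an element of the mapped type is twice the size of the user's scalar, so any pointer conversion between the two would misread the buffer.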
109 changes: 102 additions & 7 deletions packages/amesos2/test/solvers/ShyLUBasker_UnitTests.cpp
@@ -136,6 +136,12 @@ namespace {
const global_size_t INVALID = OrdinalTraits<global_size_t>::invalid();
RCP<const Comm<int> > comm = getDefaultComm();
const size_t rank = comm->getRank();
if (rank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::Initialization with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

// create a Map
const size_t numLocal = 10;
RCP<Map<LO,GO,Node> > map = rcp( new Map<LO,GO,Node>(INVALID,numLocal,0,comm) );
@@ -183,6 +189,11 @@ namespace {
const global_size_t INVALID = OrdinalTraits<global_size_t>::invalid();
RCP<const Comm<int> > comm = getDefaultComm();
const size_t rank = comm->getRank();
if (rank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::SymbolicFactorization with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}
// create a Map
const size_t numLocal = 10;
RCP<Map<LO,GO,Node> > map = rcp( new Map<LO,GO,Node>(INVALID,numLocal,0,comm) );
@@ -217,6 +228,11 @@ namespace {
const global_size_t INVALID = OrdinalTraits<global_size_t>::invalid();
RCP<const Comm<int> > comm = getDefaultComm();
const size_t rank = comm->getRank();
if (rank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::NumericFactorization with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}
// create a Map
const size_t numLocal = 10;
RCP<Map<LO,GO,Node> > map = rcp( new Map<LO,GO,Node>(INVALID,numLocal,0,comm) );
@@ -257,6 +273,12 @@ namespace {
const size_t numVecs = 1;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();
const size_t rank = comm->getRank();
if (rank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::Solve with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

// NDE: Beginning changes towards passing parameter list to shylu basker
// for controlling various parameters per test, matrix, etc.
@@ -325,6 +347,11 @@ namespace {
Array<Mag> xhatnorms(numVecs), xnorms(numVecs);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (rank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " << xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
}

@@ -338,6 +365,12 @@ namespace {
const size_t numVecs = 1;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();
const size_t rank = comm->getRank();
if (rank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::SolveTrans with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

// NDE: Beginning changes towards passing parameter list to shylu basker
// for controlling various parameters per test, matrix, etc.
@@ -405,6 +438,11 @@ namespace {
Array<Mag> xhatnorms(numVecs), xnorms(numVecs);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (rank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " << xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
}

@@ -479,9 +517,13 @@ namespace {
using Scalar = SCALAR;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();

size_t myRank = comm->getRank();
const global_size_t numProcs = comm->getSize();
if (myRank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::NonContigGID with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

// Unit test created for 2 processes
if ( numProcs == 2 ) {
@@ -621,6 +663,11 @@ namespace {
Array<Mag> xhatnorms(numVectors), xnorms(numVectors);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (myRank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " << xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
} // end if numProcs = 2
}
@@ -636,6 +683,12 @@ namespace {
//typedef ScalarTraits<Mag> MT;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();
size_t myRank = comm->getRank();
if (myRank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::ComplexSolve with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

RCP<MAT> A =
Tpetra::MatrixMarket::Reader<MAT>::readSparseFile("../matrices/amesos2_test_mat4.mtx",comm);
@@ -692,6 +745,11 @@ namespace {
Array<Mag> xhatnorms(1), xnorms(1);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (myRank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " << xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
}

@@ -706,6 +764,12 @@ namespace {
const size_t numVecs = 7;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();
size_t myRank = comm->getRank();
if (myRank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::ComplexSolve2 with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

RCP<MAT> A =
Tpetra::MatrixMarket::Reader<MAT>::readSparseFile("../matrices/amesos2_test_mat2.mtx",comm);
@@ -738,6 +802,11 @@ namespace {
Array<Mag> xhatnorms(numVecs), xnorms(numVecs);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (myRank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " <<xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
}

@@ -752,6 +821,12 @@ namespace {
const size_t numVecs = 7;

RCP<const Comm<int> > comm = Tpetra::getDefaultComm();
size_t myRank = comm->getRank();
if (myRank==0) {
std::cout << std::endl
<< " >> UnitTest for ShyLUBasker::ComplexSolve2Trans with Scalar = "
<< ST::name() << " <<" << std::endl << std::endl;
}

RCP<MAT> A =
Tpetra::MatrixMarket::Reader<MAT>::readSparseFile("../matrices/amesos2_test_mat3.mtx",comm);
@@ -776,7 +851,7 @@ namespace {
= Amesos2::create<MAT,MV>("ShyLUBasker", A, Xhat, B);

Teuchos::ParameterList amesos2_params("Amesos2");
amesos2_params.sublist("ShyLUBasker").set("Trans","CONJ","Solve with conjugate-transpose");
amesos2_params.sublist("ShyLUBasker").set("transpose",true,"Solve with conjugate-transpose");

solver->setParameters( rcpFromRef(amesos2_params) );
solver->symbolicFactorization().numericFactorization().solve();
@@ -788,16 +863,35 @@ namespace {
Array<Mag> xhatnorms(numVecs), xnorms(numVecs);
Xhat->norm2(xhatnorms());
X->norm2(xnorms());
if (myRank==0) {
for (int i=0; i<xnorms.size(); i++)
std::cout << "err[" << i << "] = " << xnorms[i] << " - " << xhatnorms[i]
<< " = " << xnorms[i]-xhatnorms[i] << std::endl;
}
TEST_COMPARE_FLOATING_ARRAYS( xhatnorms, xnorms, 0.005 );
}


/*
* Instantiations
*/
#ifdef HAVE_TPETRA_INST_COMPLEX_FLOAT
# define UNIT_TEST_GROUP_ORDINAL_COMPLEX_FLOAT(LO, GO) \
TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve, float, LO, GO ) \
TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve2, float, LO, GO ) \
/*TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve2Trans, float, LO, GO ) */
#else
# define UNIT_TEST_GROUP_ORDINAL_COMPLEX_FLOAT(LO, GO)
#endif

#ifdef HAVE_TPETRA_INST_COMPLEX_DOUBLE
# define UNIT_TEST_GROUP_ORDINAL_COMPLEX_DOUBLE(LO, GO) \
TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve, double, LO, GO ) \
TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve2, double, LO, GO ) \
/*TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, ComplexSolve2Trans, double, LO, GO ) */
#else
# define UNIT_TEST_GROUP_ORDINAL_COMPLEX_DOUBLE(LO, GO)
//#endif
#endif

#ifdef HAVE_TPETRA_INST_FLOAT
# define UNIT_TEST_GROUP_ORDINAL_FLOAT( LO, GO ) \
@@ -818,11 +912,12 @@ namespace {
TEUCHOS_UNIT_TEST_TEMPLATE_3_INSTANT( ShyLUBasker, SolveTrans, SCALAR, LO, GO )

#define UNIT_TEST_GROUP_ORDINAL_ORDINAL( LO, GO ) \
UNIT_TEST_GROUP_ORDINAL_FLOAT(LO, GO) \
UNIT_TEST_GROUP_ORDINAL_DOUBLE(LO, GO) \
UNIT_TEST_GROUP_ORDINAL_COMPLEX_DOUBLE(LO,GO)
UNIT_TEST_GROUP_ORDINAL_FLOAT(LO, GO) \
UNIT_TEST_GROUP_ORDINAL_DOUBLE(LO, GO) \
UNIT_TEST_GROUP_ORDINAL_COMPLEX_DOUBLE(LO,GO) \
UNIT_TEST_GROUP_ORDINAL_COMPLEX_FLOAT(LO,GO)

#define UNIT_TEST_GROUP_ORDINAL( ORDINAL ) \
#define UNIT_TEST_GROUP_ORDINAL( ORDINAL ) \
UNIT_TEST_GROUP_ORDINAL_ORDINAL( ORDINAL, ORDINAL )

//Add JDB (10-19-215)
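Each solve test in this file ends by comparing the 2-norms of `Xhat` and `X` with `TEST_COMPARE_FLOATING_ARRAYS` at a tolerance of 0.005, and the newly added output prints the difference of those norms. Below is a rough standalone sketch of that kind of check. It assumes a relative-difference criterion (the exact formula applied by the Teuchos macro is not shown in this diff), and `norm2` and `norms_agree` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Euclidean (2-) norm of a dense vector.
inline double norm2(const std::vector<double>& v) {
  double sum = 0.0;
  for (double x : v) sum += x * x;
  return std::sqrt(sum);
}

// Assumed criterion: relative difference of two norms is within tol.
// Two zero norms are treated as agreeing.
inline bool norms_agree(double a, double b, double tol) {
  const double scale = std::max(std::abs(a), std::abs(b));
  if (scale == 0.0) return true;
  return std::abs(a - b) / scale <= tol;
}
```

Agreement of norms is a necessary condition for `Xhat` to match `X`, not a sufficient one: the vectors `{3, 4}` and `{4, 3}` have the same 2-norm.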
20 changes: 15 additions & 5 deletions packages/intrepid2/src/Projection/Intrepid2_ProjectionTools.hpp
@@ -524,9 +524,14 @@ class ProjectionTools {
auto w0_host = Kokkos::create_mirror_view(Kokkos::subview(work, 0, Kokkos::ALL()));

//computing QR of A0. QR is saved in A0 and tau0
KokkosBatched::SerialQR_Internal::invoke(A0_host.extent(0), A0_host.extent(1),
A0_host.data(), A0_host.stride_0(), A0_host.stride_1(),
tau0_host.data(), tau0_host.stride_0(), w0_host.data());
#if KOKKOS_VERSION >= 40599
KokkosBatched::Impl::SerialQR_Internal::invoke
#else
KokkosBatched::SerialQR_Internal::invoke
#endif
(A0_host.extent(0), A0_host.extent(1),
A0_host.data(), A0_host.stride_0(), A0_host.stride_1(),
tau0_host.data(), tau0_host.stride_0(), w0_host.data());

Kokkos::deep_copy(A0_device, A0_host);
Kokkos::deep_copy(A0, A0_device);
@@ -580,8 +585,13 @@ class ProjectionTools {
A(i,j) = A(j,i);

//computing QR of A. QR is saved in A and tau
KokkosBatched::SerialQR_Internal::invoke(A.extent(0), A.extent(1),
A.data(), A.stride_0(), A.stride_1(), tau.data(), tau.stride_0(), w.data());
#if KOKKOS_VERSION >= 40599
KokkosBatched::Impl::SerialQR_Internal::invoke
#else
KokkosBatched::SerialQR_Internal::invoke
#endif
(A.extent(0), A.extent(1),
A.data(), A.stride_0(), A.stride_1(), tau.data(), tau.stride_0(), w.data());

auto b = Kokkos::subview(elemRhs, ic, Kokkos::ALL());
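Both hunks above select between `KokkosBatched::Impl::SerialQR_Internal` and `KokkosBatched::SerialQR_Internal` with a `KOKKOS_VERSION` check placed directly at the call site. One alternative, sketched here and not what the diff does, is to resolve the name once through a version-gated alias so the call expressions stay unbroken. `DEMO_LIB_VERSION`, the `demo` namespace, `DemoSerialQR`, and `call_qr` are hypothetical stand-ins so the example compiles without Kokkos.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for KOKKOS_VERSION.
#ifndef DEMO_LIB_VERSION
#define DEMO_LIB_VERSION 40599
#endif

namespace demo {
// Old location of the routine.
struct SerialQR_Internal {
  static int invoke(int m, int n) { return m * n; }
  static const char* location() { return "demo"; }
};
namespace Impl {
// New location of the routine after the namespace change.
struct SerialQR_Internal {
  static int invoke(int m, int n) { return m * n; }
  static const char* location() { return "demo::Impl"; }
};
}  // namespace Impl
}  // namespace demo

// Pick the right name once; every call site then uses the alias.
#if DEMO_LIB_VERSION >= 40599
using DemoSerialQR = demo::Impl::SerialQR_Internal;
#else
using DemoSerialQR = demo::SerialQR_Internal;
#endif

inline int call_qr(int m, int n) { return DemoSerialQR::invoke(m, n); }
```

The trade-off is one extra name to maintain in exchange for keeping each call free of preprocessor branches.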

@@ -53,6 +53,8 @@
#include <BelosTpetraAdapter.hpp>
#endif

#include "Kokkos_Core.hpp"

namespace MueLu {

/*!
@@ -95,7 +97,7 @@ class ShiftedLaplacian : public BaseClass {
, Nullspace_("constant")
, numLevels_(5)
, coarseGridSize_(100)
, omega_(2.0 * M_PI)
, omega_(2.0 * Kokkos::numbers::pi_v<double>)
, iters_(500)
, blksize_(1)
, tol_(1.0e-4)
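The hunk above replaces `M_PI` with `Kokkos::numbers::pi_v<double>` in the `omega_` initializer. `M_PI` is a POSIX extension rather than part of standard C++, so some toolchains do not define it by default. Below is a minimal sketch of the same idea without a Kokkos dependency. It assumes a hypothetical `demo_numbers::pi_v` variable template (C++14 or later) in place of the Kokkos constant, and `default_omega` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

namespace demo_numbers {
// Hypothetical typed constant, in the style of std::numbers::pi_v (C++20).
template <typename T>
constexpr T pi_v = static_cast<T>(3.141592653589793238462643383279502884L);
}  // namespace demo_numbers

// Default angular frequency 2*pi, computed in the requested precision
// without relying on the non-standard M_PI macro.
template <typename T>
constexpr T default_omega() {
  return T(2) * demo_numbers::pi_v<T>;
}
```

A typed constant also avoids the implicit `double` that a macro literal carries into single-precision code.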
12 changes: 12 additions & 0 deletions packages/muelu/doc/UsersGuide/masterList.xml
@@ -1987,6 +1987,18 @@ optimized neighbor discovery for Importer construction.</description>
</parameter>
</amgx>

<tuning>
<parameter>
<name>kokkos tuning: muelu parameter mapping</name>
<type>\parameterlist</type>
<description>Sublist for Kokkos Tuning of MueLu Parameters</description>
<visible>false</visible>
<comment-ML>not supported by ML</comment-ML>
</parameter>
</tuning>



<debug>

<parameter>
2 changes: 2 additions & 0 deletions packages/muelu/doc/UsersGuide/paramlist_hidden.tex
@@ -447,6 +447,8 @@

\cba{amgx:params}{\parameterlist}{Sublist for listing AMGX configuration parameters}

\cba{kokkos tuning: muelu parameter mapping}{\parameterlist}{Sublist for Kokkos Tuning of MueLu Parameters}

\cbb{debug: graph level}{int}{-2}{Output dependency graph on level X (use -1 for all levels).}

\cbb{maxwell1: mode}{string}{"standard"}{Specifying the order of solve of the block system. Allowed values are: "standard" (default), "refmaxwell"}