Merge pull request #8 from CNugteren/proper_library
Re-structured the includes and added install targets
CNugteren committed Apr 12, 2015
2 parents e1206b1 + cdb2f0c commit 8f2bad2
Showing 26 changed files with 196 additions and 161 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG
@@ -1,4 +1,10 @@

Version 1.3.2
- Now prints OpenCL version when running on a device
- Added install targets to CMake
- Moved header files around and renamed the main include to "cltune.h"
- Catches OpenCL exceptions and skips those configurations

Version 1.3.1
- Fixed simulated annealing's random number generation
- Added new FindOpenCL CMake script
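The changelog entry "Catches OpenCL exceptions and skips those configurations" describes a skip-on-failure loop. A minimal sketch of that pattern follows, assuming a hypothetical `DeviceError` exception type and a hypothetical `RunAll` helper. Neither name comes from CLTune or cl.hpp, and the actual handling inside the tuner may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an OpenCL error; not a CLTune or cl.hpp type.
struct DeviceError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Runs every configuration and returns the timings of those that succeeded.
// A configuration that throws DeviceError is counted and skipped, so one bad
// configuration does not abort the whole tuning run.
std::vector<double> RunAll(const std::vector<std::function<double()>>& configurations,
                           std::size_t* skipped) {
  assert(skipped != nullptr);
  std::vector<double> timings;
  *skipped = 0;
  for (const auto& run : configurations) {
    try {
      timings.push_back(run());
    } catch (const DeviceError&) {
      ++(*skipped);  // Record the failure and move on to the next configuration
    }
  }
  return timings;
}
```

Only the dedicated exception type is caught here, so unrelated errors such as `std::bad_alloc` still propagate to the caller.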
44 changes: 35 additions & 9 deletions CMakeLists.txt
@@ -25,17 +25,28 @@
# CMake project
cmake_minimum_required(VERSION 2.8)
project("cltune" C CXX)
set(cltune_VERSION_MAJOR 1)
set(cltune_VERSION_MINOR 3)
set(cltune_VERSION_PATCH 2)

# Options
option(ENABLE_SAMPLES "Enable compilation of sample programs" ON)
option(ENABLE_TESTS "Enable compilation of the Google tests" OFF)
option(SAMPLES "Enable compilation of sample programs" ON)
option(TESTS "Enable compilation of the Google tests" OFF)

# ==================================================================================================

# RPATH settings
set(CMAKE_SKIP_BUILD_RPATH false) # Use, i.e. don't skip the full RPATH for the build tree
set(CMAKE_BUILD_WITH_INSTALL_RPATH false) # When building, don't use the install RPATH already
set(CMAKE_INSTALL_RPATH "") # The RPATH to be used when installing
set(CMAKE_INSTALL_RPATH_USE_LINK_PATH false) # Don't add the automatically determined parts

# ==================================================================================================

# Compiler-version check
if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "GNU")
if (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 4.7)
message(FATAL_ERROR "GCC version must be at least 4.7 (for C++11)")
if (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 4.9)
message(FATAL_ERROR "GCC version must be at least 4.9 (for full C++11 compatibility)")
endif()
endif()

@@ -69,13 +80,26 @@ set(TUNER
src/searchers/random_search.cc
src/searchers/annealing.cc)

# Links the library
# Creates and links the library
add_library(cltune SHARED ${TUNER})
target_link_libraries(cltune ${OPENCL_LIBRARIES})

# Installs the library
install(TARGETS cltune DESTINATION lib)
install(FILES
include/cltune.h
include/cl.hpp
DESTINATION include)
install(FILES
include/cltune/opencl.h
include/cltune/memory.h
include/cltune/string_range.h
include/cltune/kernel_info.h
DESTINATION include/cltune)

# ==================================================================================================
# Optional: Enable compilation of sample programs
if (ENABLE_SAMPLES)
# Optional: Enables compilation of sample programs
if (SAMPLES)

# Adds sample programs
add_executable(sample_simple samples/simple.cc)
@@ -85,10 +109,12 @@ if (ENABLE_SAMPLES)
target_link_libraries(sample_gemm cltune ${OPENCL_LIBRARIES} ${OpenMP_LIBRARY})
target_link_libraries(sample_gemm_annealing cltune ${OPENCL_LIBRARIES} ${OpenMP_LIBRARY})

# Note: these are not installed because they depend on their separate OpenCL kernel files

endif()
# ==================================================================================================
# Optional: Enable compilation of the Google tests
if (ENABLE_TESTS)
# Optional: Enables compilation of the Google tests
if (TESTS)

# Enables Google Test tests (source-code is shipped with the project)
add_subdirectory(external/gtest-1.7.0)
66 changes: 30 additions & 36 deletions README.md
@@ -1,72 +1,66 @@

CLTune: An automatic OpenCL kernel tuner
CLTune: Automatic OpenCL kernel tuning
================

CLTune is a C++ library which can be used to automatically tune your OpenCL kernels. How does this
work? The only thing you'll need to provide is a tuneable kernel and a list of allowed parameters
and values.
CLTune is a C++ library which can be used to automatically tune your OpenCL kernels. The only thing you'll need to provide is a tuneable kernel and a list of allowed parameters and values.

For example, if you would perform loop unrolling or local memory tiling through a pre-processor define, just remove the define from your kernel code, pass the kernel to CLTune and tell it what the name of your parameter(s) are and what values you want to try. CLTune will take care of the rest: it will iterate over all possible permutations, test them, and report the best combination.

For example, if you would perform loop unrolling or local memory tiling through a pre-
processor define, just remove the define from your kernel code, pass the kernel to CLTune and tell
it what the name of your parameter(s) are and what values you want to try. CLTune will take care of
the rest: it will iterate over all possible permutations, test them, and report the best
combination.

Compilation
-------------

CLTune can be compiled as a shared library using CMake. The pre-requisites are:

* CMake version 2.8 or higher
* A C++11 compiler [_tested with icc, gcc, and clang_]
* An OpenCL library [_tested with the Apple OpenCL framework, the NVIDIA CUDA SDK, and the AMD APP
SDK_]
* A C++11 compiler, for example:
- GCC 4.9.0 or newer
- Clang 3.3 or newer
- ICC 14.0 or newer
* An OpenCL library. CLTune has been tested with:
- Apple OpenCL
- NVIDIA CUDA SDK
- AMD APP SDK

An example of an out-of-source build follows (starting from the root of the cltune folder):
An example of an out-of-source build (starting from the root of the cltune folder):

mkdir build
cd build
cmake ..
make
sudo make install

A custom installation folder can be specified when calling CMake:

cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/directory ..

You can then link your own programs against the CLTune library. An example for a Linux-system
follows:
You can then link your own programs against the CLTune library. An example for a Linux-system:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libcltune.so
g++ example.cc -o example -L/path/to/libcltune.so -lcltune -lOpenCL


Example of using the tuner
-------------

Before we start using the tuner, we'll have to create one. The constructor takes two arguments:
the first specifying the OpenCL platform number, and the second the device ID on that platform:
Before we start using the tuner, we'll have to create one. The constructor takes two arguments: the first specifying the OpenCL platform number, and the second the device ID on that platform:

cltune::Tuner my_tuner(0, 1); // Tuner on device 1 of OpenCL platform 0

Now that we have a tuner, we can add a tuning kernel. This is done by providing the path to an
OpenCL kernel (first argument), the name of the kernel (second argument), a list of global thread
dimensions (third argument), and a list of local thread or workgroup dimensions (fourth argument).
Here is an example:
Now that we have a tuner, we can add a tuning kernel. This is done by providing the path to an OpenCL kernel (first argument), the name of the kernel (second argument), a list of global thread dimensions (third argument), and a list of local thread or workgroup dimensions (fourth argument). Here is an example:

auto id = my_tuner.AddKernel("path/to/kernel.opencl", "my_kernel", {1024,512}, {16,8});

Notice that the AddKernel function returns an integer: it is the ID of the added kernel. We'll need
this ID when we want to add tuning parameters to this kernel. Let's say that our kernel has two
pre-processor parameters named `PARAM_1` and `PARAM_2`:
Notice that the AddKernel function returns an integer: it is the ID of the added kernel. We'll need this ID when we want to add tuning parameters to this kernel. Let's say that our kernel has two pre-processor parameters named `PARAM_1` and `PARAM_2`:

my_tuner.AddParameter(id, "PARAM_1", {16, 24});
my_tuner.AddParameter(id, "PARAM_2", {0, 1, 2, 3, 4});

Now that we've added a kernel and its parameters, we can add another one if we wish. When we're
done, there are a couple of things left to be done. Let's start with adding an reference kernel.
This reference kernel can provide the tuner with the ground-truth and is optional - only when it is
provided will the tuner perform verification checks to ensure correctness.
Now that we've added a kernel and its parameters, we can add another one if we wish. When we're done, there are a couple of things left to be done. Let's start with adding a reference kernel. This reference kernel can provide the tuner with the ground-truth and is optional - only when it is provided will the tuner perform verification checks to ensure correctness.

my_tuner.SetReference("path/to/reference.opencl", "my_reference", {8192}, {128});

The tuner also needs to know which arguments the kernels take. Scalar arguments can be provided
as-is and are passed-by-value, whereas arrays have to be provided as C++ `std::vector`s. That's
right, we won't have to create OpenCL buffers, CLTune will handle that for us! Here is an example:
The tuner also needs to know which arguments the kernels take. Scalar arguments can be provided as-is and are passed-by-value, whereas arrays have to be provided as C++ `std::vector`s. That's right, we won't have to create OpenCL buffers, CLTune will handle that for us! Here is an example:

auto my_variable = 900;
std::vector<float> input_vector(8192);
@@ -81,17 +75,17 @@ Now that we've configured the tuner, it is time to start it and ask it to report
my_tuner.Tune(); // Starts the tuner
my_tuner.PrintToScreen(); // Prints the results


Other examples
-------------

Two examples are included as part of the CLTune distribution. They illustrate some more advanced
features, such as modifying the thread dimensions based on the parameters and adding user-defined
parameter constraints. The examples are compiled when providing `-ENABLE_SAMPLES=ON` to CMake
(default option). The two included examples are:
Examples are included as part of the CLTune distribution. They illustrate some more advanced features, such as modifying the thread dimensions based on the parameters and adding user-defined parameter constraints. The examples are compiled when providing `-DSAMPLES=ON` to CMake (default option). The included examples are:

* `simple.cc` providing a basic example of matrix-vector multiplication
* `gemm.cc` providing a more advanced and heavily tuned implementation of matrix-matrix
multiplication or SGEMM
* `gemm_annealing.cc` demonstrating an alternative search technique: simulated annealing


Development and tests
-------------
@@ -104,7 +98,7 @@ licensed under the MIT license by SURFsara, (c) 2014. The contributing authors s
* Cedric Nugteren

CLTune is packaged with Google Test 1.7.0 and a custom test suite. The tests will be compiled when
providing the `-DENABLE_TESTS=ON` option to CMake. Running the tests goes as follows:
providing the `-DTESTS=ON` option to CMake. Running the tests goes as follows:

./unit_tests
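
The README text in this diff says CLTune "will iterate over all possible permutations, test them, and report the best combination". What it describes is a full search over the Cartesian product of the parameter values. The sketch below illustrates that idea in plain C++ without OpenCL; the `Parameter` struct, the `Configuration` alias, the `FullSearch` function, and the cost callback are all hypothetical names for illustration and are not CLTune's implementation (which lives in `src/searchers/`).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// One tunable parameter: a name and the values to try (hypothetical type).
struct Parameter {
  std::string name;
  std::vector<int> values;
};

// A configuration holds one chosen value per parameter, in parameter order.
using Configuration = std::vector<int>;

// Visits every combination of parameter values exactly once and returns the
// configuration with the lowest cost (e.g. a measured kernel run time).
Configuration FullSearch(const std::vector<Parameter>& parameters,
                         const std::function<double(const Configuration&)>& cost) {
  for (const auto& parameter : parameters) {
    assert(!parameter.values.empty());  // Assumption: every parameter has a value
  }
  std::vector<std::size_t> index(parameters.size(), 0);
  Configuration best;
  double best_cost = std::numeric_limits<double>::infinity();
  while (true) {
    // Builds the configuration selected by the current indices
    Configuration current;
    for (std::size_t p = 0; p < parameters.size(); ++p) {
      current.push_back(parameters[p].values[index[p]]);
    }
    const double current_cost = cost(current);
    if (current_cost < best_cost) {
      best_cost = current_cost;
      best = current;
    }
    // Advances like an odometer: bump the last index and carry on overflow
    bool advanced = false;
    for (std::size_t p = parameters.size(); p-- > 0;) {
      if (++index[p] < parameters[p].values.size()) {
        advanced = true;
        break;
      }
      index[p] = 0;
    }
    if (!advanced) { return best; }  // Every index wrapped around: search is done
  }
}
```

With the README's `PARAM_1` in {16, 24} and `PARAM_2` in {0, 1, 2, 3, 4}, this visits 2 × 5 = 10 configurations. The search space grows multiplicatively with each added parameter, which is presumably why the distribution also ships random-search and simulated-annealing searchers.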

16 changes: 5 additions & 11 deletions include/tuner/tuner.h → include/cltune.h
@@ -28,23 +28,17 @@
//
// =================================================================================================

#ifndef CLTUNE_TUNER_TUNER_H_
#define CLTUNE_TUNER_TUNER_H_
#ifndef CLTUNE_CLTUNE_H_
#define CLTUNE_CLTUNE_H_

#include <string>
#include <vector>
#include <stdexcept>
#include <memory>
#include <functional>

// Include other classes
#include "tuner/internal/memory.h"
#include "tuner/internal/opencl.h"
#include "tuner/internal/kernel_info.h"
#include "tuner/internal/string_range.h"
#include "tuner/internal/searchers/full_search.h"
#include "tuner/internal/searchers/random_search.h"
#include "tuner/internal/searchers/annealing.h"
#include "cltune/memory.h"
#include "cltune/kernel_info.h"

namespace cltune {
// =================================================================================================
@@ -218,5 +212,5 @@ class Tuner {
// =================================================================================================
} // namespace cltune

// CLTUNE_TUNER_TUNER_H_
// CLTUNE_CLTUNE_H_
#endif
@@ -28,20 +28,17 @@
//
// =================================================================================================

#ifndef CLBLAS_TUNER_KERNEL_INFO_H_
#define CLBLAS_TUNER_KERNEL_INFO_H_
#ifndef CLTUNE_KERNEL_INFO_H_
#define CLTUNE_KERNEL_INFO_H_

#include <string>
#include <vector>
#include <iostream>
#include <stdexcept>
#include <functional>

// The C++ OpenCL wrapper
#include "cl.hpp"

// Include other classes and structures
#include "tuner/internal/string_range.h"
#include "cltune/string_range.h"

namespace cltune {
// =================================================================================================
@@ -162,5 +159,5 @@ class KernelInfo {
// =================================================================================================
} // namespace cltune

// CLBLAS_TUNER_KERNEL_INFO_H_
// CLTUNE_KERNEL_INFO_H_
#endif
11 changes: 4 additions & 7 deletions include/tuner/internal/memory.h → include/cltune/memory.h
@@ -27,18 +27,15 @@
//
// =================================================================================================

#ifndef CLBLAS_TUNER_MEMORY_H_
#define CLBLAS_TUNER_MEMORY_H_
#ifndef CLTUNE_MEMORY_H_
#define CLTUNE_MEMORY_H_

#include <string>
#include <vector>
#include <stdexcept>
#include <memory>

// The C++ OpenCL wrapper
#include "tuner/internal/opencl.h"

#include "cl.hpp"
#include "cltune/opencl.h"

namespace cltune {
// =================================================================================================
@@ -82,5 +79,5 @@ class Memory {
// =================================================================================================
} // namespace cltune

// CLBLAS_TUNER_MEMORY_H_
// CLTUNE_MEMORY_H_
#endif
16 changes: 9 additions & 7 deletions include/tuner/internal/opencl.h → include/cltune/opencl.h
@@ -27,14 +27,13 @@
//
// =================================================================================================

#ifndef CLBLAS_TUNER_OPENCL_H_
#define CLBLAS_TUNER_OPENCL_H_
#ifndef CLTUNE_OPENCL_H_
#define CLTUNE_OPENCL_H_

#include <string>
#include <vector>
#include <stdexcept>

// The C++ OpenCL wrapper
#include "cl.hpp"

namespace cltune {
@@ -43,6 +42,12 @@ namespace cltune {
// See comment at top of file for a description of the class
class OpenCL {
public:

// Messages printed to stdout (in colours)
static const std::string kMessageFull;

// Types of devices to consider
const cl_device_type kDeviceType = CL_DEVICE_TYPE_ALL;

// Converts an unsigned integer to a string by first casting it to a long long integer. This is
// required for older compilers that do not fully implement std::to_string (part of C++11).
@@ -58,9 +63,6 @@ class OpenCL {
};
};

// Types of devices to consider
const cl_device_type kDeviceType = CL_DEVICE_TYPE_ALL;

// Initializes the OpenCL platform, device, and creates a context and a queue
explicit OpenCL(const size_t platform_id, const size_t device_id);

@@ -97,5 +99,5 @@ class OpenCL {
// =================================================================================================
} // namespace cltune

// CLBLAS_TUNER_OPENCL_H_
// CLTUNE_OPENCL_H_
#endif
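
The comment in the `opencl.h` hunk above describes converting an unsigned integer to a string by first casting it to `long long`, as a workaround for standard libraries that do not provide every `std::to_string` overload. The helper body itself is collapsed in this diff view, so the following is only a sketch of the idea; the function name `SizeToString` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Converts a size to a string through an explicit cast to long long, so that
// the call always resolves to the std::to_string(long long) overload.
inline std::string SizeToString(const std::size_t value) {
  // The cast is only value-preserving when the input fits in a long long
  assert(value <= static_cast<std::size_t>(std::numeric_limits<long long>::max()));
  return std::to_string(static_cast<long long>(value));
}
```

For example, `SizeToString(1024)` yields the string `"1024"`.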
@@ -28,12 +28,12 @@
//
// =================================================================================================

#ifndef CLBLAS_TUNER_SEARCHER_H_
#define CLBLAS_TUNER_SEARCHER_H_
#ifndef CLTUNE_SEARCHER_H_
#define CLTUNE_SEARCHER_H_

#include <vector>

#include "tuner/internal/kernel_info.h"
#include "cltune/kernel_info.h"

namespace cltune {
// =================================================================================================
@@ -71,5 +71,5 @@ class Searcher {
// =================================================================================================
} // namespace cltune

// CLBLAS_TUNER_SEARCHER_H_
// CLTUNE_SEARCHER_H_
#endif