poly

Let's play with 236 implementations of exp(x). This repository holds the code for my exponential paper, which shows how to make an exponential faster than the vendor exp(x); all details are in the paper. The idea is to factorize the polynomial used to evaluate exp(x), choose an evaluation method for each factor, and benchmark the result. Because processors execute out of order, you never know in advance which combination will be fastest. Surprise!
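
To make the idea concrete, here is a minimal sketch of "one evaluation method per factor". It is not taken from the generated files: the polynomial `P = A * B`, its coefficients, and the names `horner`, `estrin3`, `p_expanded`, and `p_factored` are all made up for illustration, and Horner and Estrin are only two plausible choices of method.

```cpp
#include <array>
#include <cstddef>

// Horner: c[0] + x*(c[1] + x*(c[2] + ...)), one long dependency chain.
template <std::size_t N>
double horner(const std::array<double, N>& c, double x) {
    static_assert(N > 0, "need at least one coefficient");
    double r = c[N - 1];
    for (std::size_t i = N - 1; i-- > 0;)
        r = r * x + c[i];
    return r;
}

// Estrin for a cubic: (c0 + c1*x) + x^2*(c2 + c3*x).
// The two brackets do not depend on each other, so an out-of-order
// core can compute them in parallel.
inline double estrin3(const std::array<double, 4>& c, double x) {
    const double x2 = x * x;
    return (c[0] + c[1] * x) + x2 * (c[2] + c[3] * x);
}

// Hypothetical toy polynomial (NOT an exp approximation):
// A(x) = 1 + 2x + 3x^2 + 4x^3, B(x) = 2 - x + x^2, P(x) = A(x) * B(x).
constexpr std::array<double, 4> A{{1.0, 2.0, 3.0, 4.0}};
constexpr std::array<double, 3> B{{2.0, -1.0, 1.0}};
constexpr std::array<double, 6> P{{2.0, 3.0, 5.0, 7.0, -1.0, 4.0}};

// Same polynomial, two evaluation strategies.
inline double p_expanded(double x) { return horner(P, x); }
inline double p_factored(double x) { return estrin3(A, x) * horner(B, x); }
```

Both functions return the same value up to rounding; which one is faster depends on the processor, which is exactly what the benchmarks here measure.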

At least it works on x86. I have not tested on Power for a long time because I no longer have a machine available.

Minimum Requirements:

GCC > 4.9 (primary compiler because inlining is better)
Intel Compiler (to compare against the vendor implementation)
Power/x86 system
Linux system
cmake > 2.9
a machine that understands x86/PPC ASM and GCC inline assembly

Directory tree:

  poly -- bench (contains the benchmarks for latency/throughput/ulp + header for the timer library)
          -- latency (latency benchmark)
          -- throughput (throughput benchmark)
          -- lib (contains implementation of exp, scalar/vector version)
             -- exp
                -- scalar (implementation of the exp scalar version) 
                -- vector (implementation of the exp vector version)          
             -- poly
                -- scalar (implementation of the polynomial evaluation scalar version) 
                -- vector (implementation of the polynomial evaluation vector version)          
             -- tool
                -- scalar (implementation of 2^k and the branching part for the scalar version)
                -- vector (implementation of 2^k and the branching part for the vector version)
          -- ulp (ulp benchmark)
       -- cyme (DSL for the vector version)
       -- dot (contains ASM - DAG in Graphviz format/AT&T)
       -- llc (tiny library to measure the throughput, reads hardware counters)
       -- poly (contains the program that generates all variations of the exp implementation for the poly/lib directory)

Compilation

  mkdir b
  cd b
  cmake ..
  make // can take a long time, > 3000 files to compile

Modification

  ccmake .
  POLY_CYME build the vector version using the cyme DSL (ON by default)
  POLY_BENCH build the throughput/ulp/latency benchmarks (ON by default)
  POLY_TEST build the tests (ON by default)
  CMAKE_BUILD_TYPE DEBUG (default) / RELEASE (mandatory for performance measurements)

Run (all)

  run.sh b exp > out // b for the build directory and exp for the results

Run (by hand)

All numbers are fictitious

Latency

   ./b/bench/latency/scalar_vector_latency_ed10 poly // run ed10 for the polynomial scalar and vector version
   ./b/bench/latency/scalar_vector_latency_ed10 exp  // run ed10 for the exp scalar and vector version
   ./b/bench/latency/scalar_vector_latency_ed10 tool  // run the 2^k and the boundary
   [ewart@super_machine b]$ ./b/bench/latency/scalar_vector_latency_ed10 poly
   scalar::poly 35.0671
   vector::poly 30.2791
   [ewart@super_machine b]$ ./b/bench/latency/scalar_vector_latency_ed10 exp
   scalar::exp 75.0317
   vector::exp 61.4446
   [ewart@super_machine b]$ ./b/bench/latency/scalar_vector_latency_ed10 tool
   scalar::twok 36.006
   scalar::boundary 26.0249
   vector::twok 31.0046 // 2^k
   vector::boundary 20.96371 // the boundary condition
   [ewart@super_machine b]$ ./b/bench/latency/scalar_vector_latency_ed10 vendor
   imf 76.9489 // Intel scalar version
   svml4d 75.166 // Intel vector version
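
The `tool` run above times two helpers, 2^k and the boundary check. The sketch below shows what such helpers usually look like, assuming the classic range reduction exp(x) = 2^k * exp(r) with k = round(x / ln 2). Nothing here is copied from the lib directory: `twok`, `exp_sketch`, and the thresholds 709 and -708 are illustrative, and `std::exp(r)` stands in for the polynomial evaluation.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Build 2^k for -1022 <= k <= 1023 by writing the biased exponent
// (k + 1023) straight into bits 52..62 of an IEEE 754 double.
inline double twok(int k) {
    const std::uint64_t bits = static_cast<std::uint64_t>(k + 1023) << 52;
    double r;
    std::memcpy(&r, &bits, sizeof r);
    return r;
}

// exp(x) = 2^k * exp(r), with k = round(x / ln 2) and r = x - k * ln 2.
inline double exp_sketch(double x) {
    // Boundary (branching) part. The thresholds are deliberately coarse:
    // they keep k inside the range accepted by twok and ignore subnormals.
    if (x > 709.0) return std::numeric_limits<double>::infinity();
    if (x < -708.0) return 0.0;

    const double ln2 = 0.6931471805599453;
    const int k = static_cast<int>(std::nearbyint(x / ln2));
    const double r = x - k * ln2;   // |r| <= ln(2)/2, up to rounding
    return twok(k) * std::exp(r);   // std::exp is a stand-in for the polynomial
}
```

A production version would split ln 2 into a high and a low part to keep `r` accurate; this sketch skips that step.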

ULP

   ./b/bench/ulp/exp/exp_scalar_ulp_ed10
   3 // the ulp is 3 compared to std::exp (IEEE std)
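
For readers unfamiliar with ULP, here is a minimal sketch of how such a number can be measured, assuming IEEE 754 doubles. It is not the code of the ulp benchmark: `ordered_bits`, `ulp_distance`, and `max_ulp` are hypothetical names, and the even sampling of the interval is my own choice.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Map a double to an integer that grows monotonically with its value,
// so that neighbouring doubles map to neighbouring integers.
inline std::int64_t ordered_bits(double x) {
    std::int64_t i;
    std::memcpy(&i, &x, sizeof i);
    return i < 0 ? std::numeric_limits<std::int64_t>::min() - i : i;
}

// Number of representable doubles between a and b (0 if they are equal).
inline std::uint64_t ulp_distance(double a, double b) {
    const std::uint64_t ua = static_cast<std::uint64_t>(ordered_bits(a));
    const std::uint64_t ub = static_cast<std::uint64_t>(ordered_bits(b));
    return ordered_bits(a) > ordered_bits(b) ? ua - ub : ub - ua;
}

// Worst ULP distance between a candidate f and std::exp on n >= 2
// evenly spaced points of [lo, hi].
template <class F>
std::uint64_t max_ulp(F f, double lo, double hi, int n) {
    std::uint64_t worst = 0;
    for (int i = 0; i < n; ++i) {
        const double x = lo + (hi - lo) * i / (n - 1);
        worst = std::max(worst, ulp_distance(f(x), std::exp(x)));
    }
    return worst;
}
```

NaN inputs are not handled; for exp on a finite range that is not a concern.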

Throughput

   ./bench/throughput/exp/exp_scalar_throughput_ed10
   4.63

Postprocess (all)

  perl pp_exp.pl out > out.html

For the record: the directories poly/lib/scalar and poly/lib/vector contain the implementation of every exp(x). All these files are generated by main.cpp of the lib directory.

If your machine is very experimental, you may switch off POLY_CYME, POLY_BENCH and POLY_TEST. You will then get a library for every implementation of exp, and you are free to write a simple benchmark with it.
