Advanced topics
dav1312 edited this page Dec 7, 2022 · 28 revisions
Notes:
- Stop all other applications when measuring the speedup of Stockfish
- Run at least 20 default benches (depth 13) for each build of Stockfish to get accurate measurements
- A speedup of 0.3% could be meaningless (i.e. within the measurement noise)
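As a minimal sketch of the second note, the helper below runs the default bench several times and prints one nodes-per-second value per line. The function name `collect_nps` is hypothetical, and the sketch assumes the bench summary is written to stderr on a line containing `Nodes/second`, which is the same line that `bench_parallel.sh` extracts.

```shell
#!/bin/bash

# collect_nps BINARY RUNS (hypothetical helper)
# Runs the default bench RUNS times and prints one nps value per line.
collect_nps () {
    local binary=$1 runs=$2 k
    for ((k = 1; k <= runs; k++)); do
        # Send stderr into the pipe, discard stdout, and keep only the
        # digits of the "Nodes/second" summary line (assumed format).
        "${binary}" bench 2>&1 > /dev/null | grep 'Nodes/second' | grep -Eo '[0-9]+'
    done
}
```

For example, `collect_nps ./stockfish 20 > nps.txt` would store 20 samples for later comparison; `./stockfish` is a placeholder path.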
To measure the speedup of several builds of Stockfish, use one of these applications:
- All OS:
  - bash script `bench_parallel.sh` (run `bash bench_parallel.sh` for the help):
#!/bin/bash

_bench () {
${1} << EOF > /dev/null 2>> ${2}
bench 16 1 ${depth} default depth
EOF
}
# _bench function customization example
# setoption name SyzygyPath value C:\table_bases\wdl345;C:\table_bases\dtz345
# bench 128 4 ${depth} default depth

if [[ ${#} -ne 4 ]]; then
cat << EOF
usage: ${0} ./stockfish_base ./stockfish_test depth n_runs
fast bench:
${0} ./stockfish_base ./stockfish_test 13 10
slow bench:
${0} ./stockfish_base ./stockfish_test 20 10
EOF
exit 1
fi

sf_base=${1}
sf_test=${2}
depth=${3}
n_runs=${4}

# preload of CPU/cache/memory
printf "preload CPU"
(_bench ${sf_base} sf_base0.txt)&
(_bench ${sf_test} sf_test0.txt)&
wait

# temporary files initialization
: > sf_base0.txt
: > sf_test0.txt
: > sf_temp0.txt

# bench loop: SMP bench with background subshells
for ((k=1; k<=${n_runs}; k++)); do
    printf "\rrun %3d /%3d" ${k} ${n_runs}

    # swap the execution order to avoid bias
    if [ $((k%2)) -eq 0 ]; then
        (_bench ${sf_base} sf_base0.txt)&
        (_bench ${sf_test} sf_test0.txt)&
        wait
    else
        (_bench ${sf_test} sf_test0.txt)&
        (_bench ${sf_base} sf_base0.txt)&
        wait
    fi
done

# text processing to extract nps values
cat sf_base0.txt | grep second | grep -Eo '[0-9]{1,}' > sf_base1.txt
cat sf_test0.txt | grep second | grep -Eo '[0-9]{1,}' > sf_test1.txt

for ((k=1; k<=${n_runs}; k++)); do
    echo ${k} >> sf_temp0.txt
done

printf "\rrun sf_base sf_test diff\n"
paste sf_temp0.txt sf_base1.txt sf_test1.txt | awk '{printf "%3d %8d %8d %8+d\n", $1, $2, $3, $3-$2}'
#paste sf_temp0.txt sf_base1.txt sf_test1.txt | awk '{printf "%3d\t%8d\t%8d\t%7+d\n", $1, $2, $3, $3-$2}'
paste sf_base1.txt sf_test1.txt | awk '{printf "%d\t%d\t%d\n", $1, $2, $2-$1}' > sf_temp0.txt

# compute: sample mean, 1.96 * std of sample mean (95% of samples), speedup
# std of sample mean = sqrt(NR/(NR-1)) * (std population) / sqrt(NR)
cat sf_temp0.txt | awk '{sum1 += $1 ; sumq1 += $1**2 ;sum2 += $2 ; sumq2 += $2**2 ;sum3 += $3 ; sumq3 += $3**2 } END {printf "\nsf_base = %8d +/- %d\nsf_test = %8d +/- %d\ndiff = %8d +/- %d\nspeedup = %.6f\n\n",
sum1/NR , 1.96 * sqrt(sumq1/NR - (sum1/NR)**2)/sqrt(NR-1) ,
sum2/NR , 1.96 * sqrt(sumq2/NR - (sum2/NR)**2)/sqrt(NR-1) ,
sum3/NR , 1.96 * sqrt(sumq3/NR - (sum3/NR)**2)/sqrt(NR-1) ,
(sum2 - sum1)/sum1 }'

# remove temporary files
rm -f sf_base0.txt sf_test0.txt sf_temp0.txt sf_base1.txt sf_test1.txt
- Windows only:
  - FishBench: Latest release Fishbench v6.0
  - Buildtester: Latest release Buildtester 1.4.7.0
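The error margin printed by `bench_parallel.sh` can be reproduced on its own. The sketch below uses the same formula as the script, 1.96 times the population standard deviation divided by sqrt(NR-1), on a single column of numbers read from stdin. The function name `mean_ci` and the guard for fewer than two samples are additions for illustration and are not part of the original script.

```shell
#!/bin/bash

# mean_ci (hypothetical helper): reads one number per line from stdin and
# prints "mean +/- margin", where margin = 1.96 * std of the sample mean.
mean_ci () {
    awk '{ sum += $1; sumq += $1 * $1 }
         END {
             if (NR < 2) { print "need at least 2 samples"; exit 1 }
             mean = sum / NR
             var = sumq / NR - mean * mean   # population variance
             if (var < 0) var = 0            # guard against rounding error
             printf "%.1f +/- %.1f\n", mean, 1.96 * sqrt(var) / sqrt(NR - 1)
         }'
}
```

For instance, `printf '10\n20\n' | mean_ci` prints `15.0 +/- 9.8`. A difference between two builds that is smaller than its margin is within the measurement noise.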
There is a branch with an MPI cluster implementation of Stockfish, which allows Stockfish to run on clusters of compute nodes connected by a high-speed network. See https://github.com/official-stockfish/Stockfish/pull/1571 for some discussion of the initial implementation and https://github.com/official-stockfish/Stockfish/pull/1931 for some early performance results.
Feedback on this branch is welcome! Here are some git commands for people interested in testing this MPI/Cluster idea:
- If you don't have the cluster branch yet in your local git repository, you can download the latest state of the `official-stockfish/cluster` branch (see also https://github.com/official-stockfish/Stockfish/tree/cluster), then switch to it with the following commands:
git fetch official cluster:cluster
git checkout -f cluster
- After switching to the cluster branch as above, see the README.md for detailed instructions on how to compile and run the branch. TL;DR:
make clean
make -j ARCH=x86-64-modern COMPILER=mpic++ build
mpirun -np 4 ./stockfish bench
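The `git fetch official cluster:cluster` command above only works if the local repository has a remote named `official`. As a sketch, assuming that remote should point at the official-stockfish repository on GitHub, the hypothetical helper `ensure_remote` below adds it only when it is missing and leaves an existing remote untouched.

```shell
#!/bin/bash

# ensure_remote NAME URL (hypothetical helper)
# Adds the remote NAME pointing at URL unless a remote NAME already exists.
# Must be run inside a git repository.
ensure_remote () {
    local name=$1 url=$2
    if ! git remote get-url "${name}" > /dev/null 2>&1; then
        git remote add "${name}" "${url}"
    fi
}
```

Running `ensure_remote official https://github.com/official-stockfish/Stockfish.git` before the fetch would set up the remote; the URL here is an assumption derived from the repository links above.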