Skip to content

Update dependencies to achieve better performance#25

Open
Alexsandruss wants to merge 5 commits into
conda-forge:mainfrom
Alexsandruss:dev/better-deps
Open

Update dependencies to achieve better performance#25
Alexsandruss wants to merge 5 commits into
conda-forge:mainfrom
Alexsandruss:dev/better-deps

Conversation

@Alexsandruss
Copy link
Copy Markdown

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Target of this pull request is to change dependencies from GNU OpenMP and OpenBLAS to LLVM OpenMP and MKL on Linux to improve Faiss performance like it's done in original "pytorch" conda channel recipe.

Small performance comparison of dependencies:

Number of neighbors Number of rows Number of columns searching time (stock: gomp + openblas), s searching time (llvm-omp + mkl), s Speedup with llvm-omp + mkl
5 10000 16 0.407 0.202 2.01
5 10000 64 0.427 0.178 2.40
5 20000 16 1.262 0.567 2.23
5 20000 64 1.262 0.572 2.20
5 50000 16 7.371 3.521 2.09
5 50000 64 7.438 3.444 2.16
32 10000 16 0.504 0.253 1.99
32 10000 64 0.485 0.255 1.90
32 20000 16 1.509 0.726 2.08
32 20000 64 1.484 0.730 2.03
32 50000 16 7.767 3.693 2.10
32 50000 64 8.058 3.773 2.14

OS: CentOS Linux 7
SW: Python 3.8.6, numpy 1.19.5, faiss 1.6.5, MKL 2020.4, LLVM-OpenMP 11.0.1, GNU OpenMP 9.3.0, OpenBLAS 3.9.0.
HW: Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz, 2 sockets X 28 cores per socket, 376 GB of RAM.

Benchmark script:

import numpy as np
import faiss
from timeit import default_timer as timer

def measure_function(f, *args):
    times = []
    for i in range(5):
        t0 = timer()
        f(*args)
        t1 = timer()
        times.append(t1 - t0)
    return sum(times) / len(times)

rows_range = [10000, 20000, 50000]
columns_range = [16, 64]
k_range = [5, 32]

print("k,rows,cols,time[s]")
for k in k_range:
    for rows in rows_range:
        for cols in columns_range:
            data = np.random.uniform(size=(rows * 2, cols)).astype(np.float32)
            train_data, test_data = data[:rows,:], data[rows:,:]
            index = faiss.IndexFlatL2(cols)
            index.add(train_data)
            time = measure_function(index.search, test_data, k)
            print("{},{},{},{}".format(k, rows, cols, round(time, 6)))

@conda-forge-linter
Copy link
Copy Markdown

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@isuruf
Copy link
Copy Markdown
Member

isuruf commented Jan 26, 2021

MKL can be selected at install time. See https://conda-forge.org/docs/maintainer/knowledge_base.html#switching-blas-implementation

@h-vetinari
Copy link
Copy Markdown
Member

I'd be fine with either of:

  • using track_features to weigh down non-MKL
  • prioritizing mkl via buildnumber bump like e.g. here
  • adding a post-link script (c.f. e.g. here) with a message that recommends installing the mkl-version of blas/lapack.
    @jakirkham also suggested this a while ago, but I haven't gotten around to it yet.

Don't think it's possible to switch the openmp-implementation in the environment like for blas, right @isuruf?

@jakirkham
Copy link
Copy Markdown
Member

@jakirkham
Copy link
Copy Markdown
Member

As conda-forge already allows users to choose between different BLAS and OpenMP implementations, I'm not sure I see the need to restrict this on a feedstock basis. Instead we should focus on educating users on these options they have available and help them make informed choices about those options based on their needs

@h-vetinari
Copy link
Copy Markdown
Member

@Alexsandruss
Thanks again for trying to improve the packaging here! 🙃

Would you like to try any of the options I suggested? "Pinning" blas/openmp directly is not the best approach, as explained by the various comments in this thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants