Description
I could have sworn this used to be much faster:
julia> using FFTW
julia> FFTW.set_num_threads(8)
julia> a = randn(ComplexF64, 512, 512);
julia> using BenchmarkTools
julia> @btime fft(a);
  26.528 ms (98490 allocations: 10.73 MiB)
julia> FFTW.set_num_threads(1)
julia> @btime fft(a);
  5.165 ms (6 allocations: 4.00 MiB)
Compare with FFTW installed via Python (SciPy), benchmarked at https://github.com/andrej5elin/howto_fftw_apple_silicon, where 4 threads take about 500 µs in double precision on slightly weaker hardware.
MKL under Rosetta is also significantly (>10x) faster than FFTW.jl according to those benchmarks. Am I missing something?
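To check whether the gap comes from planning and allocation overhead rather than the transform itself, the same comparison can be repeated with a precomputed plan applied into a preallocated output. This is only a minimal sketch: `time_planned_fft` is a hypothetical helper name (not part of FFTW.jl), the plan is built on a scratch array on the assumption that `FFTW.MEASURE` planning may overwrite its input, and the thread counts are just the ones mentioned above.

```julia
using FFTW
using LinearAlgebra: mul!

# Hypothetical helper (not FFTW.jl API): best-of-`reps` wall time in seconds
# for applying a precomputed plan with `nthreads` FFTW threads.
function time_planned_fft(a::Matrix{ComplexF64}, nthreads::Integer; reps::Integer = 20)
    FFTW.set_num_threads(nthreads)               # set before the plan is created
    scratch = similar(a)                         # planning input; contents may be clobbered
    p = plan_fft(scratch; flags = FFTW.MEASURE)  # plan once, outside the timed region
    out = similar(a)                             # preallocated output buffer
    mul!(out, p, a)                              # warm-up call (compilation)
    return minimum(@elapsed(mul!(out, p, a)) for _ in 1:reps)
end

a = randn(ComplexF64, 512, 512)
for n in (1, 4, 8)
    t = time_planned_fft(a, n)
    println(n, " thread(s): ", round(t * 1e3; digits = 3), " ms")
end
```

If the planned, preallocated timings still get worse as the thread count goes up, the slowdown is in the threaded execution itself rather than in per-call planning or output allocation.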
julia> versioninfo()
Julia Version 1.10.0-rc1
Commit 5aaa9485436 (2023-11-03 07:44 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
  Threads: 11 on 8 virtual cores
Environment:
  JULIA_PKG_DEVDIR = /Users/abradley/Dropbox/Julia/Dev
  JULIA_NUM_THREADS = 8
  JULIA_PKG_SERVER = us-west.pkg.julialang.org
  JULIA_EDITOR = code