| 1 | +List of features / changes made / release notes, in reverse chronological order |
| 2 | + |
| 3 | +* Moved second half of onedim_fseries_kernel() to GPU (with a simple heuristic |
| 4 | + based on nf1 to switch between the CPU and the GPU version). |
| 5 | +* Melody fixed a bug where MAX_NF was 0 due to typecasting 1e11 to int (thanks |
| 6 | + to Elliot Slaughter for catching that). |
| 7 | +* Melody fixed kernel evaluation so it is done w*d times, not w^d times; this |
| 8 | + speeds up 2D a little and 3D quite a lot (PR#130). |
| 9 | +* Melody added 1D support for both type 1 (GM-sort and SM methods) and type 2 |
| 10 | + (GM-sort), in C++/CUDA with their test executables (but not the Python interface). |
| 11 | + |
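The kernel-evaluation fix listed above relies on the spreading kernel being a product of 1D kernels. The Python sketch below illustrates that idea only; it is not the cuFINUFFT CUDA code. The function names, the `beta = 2.3 * w` choice, and the sample offsets are assumptions made for illustration. The naive version calls the 1D kernel inside the loop over all w^d grid points, while the separable version evaluates it w times per dimension and reuses the values.

```python
import math
from itertools import product


def es_kernel(x, w, beta):
    """1D exp(sqrt()) kernel; zero outside |x| <= w/2 (illustrative form)."""
    t = 1.0 - (2.0 * x / w) ** 2
    return math.exp(beta * math.sqrt(t)) if t >= 0.0 else 0.0


def weights_naive(offsets, w, beta):
    """Call the 1D kernel afresh at every one of the w^d grid points."""
    d = len(offsets)
    calls = 0
    out = {}
    for idx in product(range(w), repeat=d):
        val = 1.0
        for dim in range(d):
            val *= es_kernel(offsets[dim] + idx[dim], w, beta)
            calls += 1
        out[idx] = val
    return out, calls


def weights_separable(offsets, w, beta):
    """Evaluate the 1D kernel w times per dimension, then form products."""
    d = len(offsets)
    ker1d = [[es_kernel(offsets[dim] + i, w, beta) for i in range(w)]
             for dim in range(d)]
    calls = w * d  # one kernel call per entry of ker1d
    out = {}
    for idx in product(range(w), repeat=d):
        val = 1.0
        for dim in range(d):
            val *= ker1d[dim][idx[dim]]
        out[idx] = val
    return out, calls


if __name__ == "__main__":
    w = 6
    beta = 2.3 * w  # assumed shape parameter, for illustration only
    offsets = (-2.7, -2.5, -2.9)  # hypothetical offsets to the first grid point
    naive, n_naive = weights_naive(offsets, w, beta)
    sep, n_sep = weights_separable(offsets, w, beta)
    print(n_naive, n_sep)  # 648 18
```

Both functions return the same w^d weights; only the number of exponential and square-root evaluations differs, which is why the saving grows with dimension.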
| 12 | +v 1.2 (02/17/21) |
| 13 | + |
| 14 | +* Warning: The following Python interface changes are not backwards compatible |
| 15 | + with v 1.1 (see examples/example2d1,2many.py for updated usage): |
| 16 | + |
| 17 | + - Made opts a kwarg dict instead of an object: |
| 18 | + def __init__(self, ... , opts=None, dtype=np.float32) |
| 19 | + => def __init__(self, ... , dtype=np.float32, **kwargs) |
| 20 | + - Renamed arguments in plan creation `__init__`: |
| 21 | + ntransforms => n_trans, tol => eps |
| 22 | + - Changed order of arguments in plan creation `__init__`: |
| 23 | + def __init__(self, ... ,isign, eps, ntransforms, opts, dtype) |
| 24 | + => def __init__(self, ... ,ntransforms, eps, isign, opts, dtype) |
| 25 | + - Removed M in `set_pts` arguments: |
| 26 | + def set_pts(self, M, kx, ky=None, kz=None) |
| 27 | + => def set_pts(self, kx, ky=None, kz=None) |
| 28 | + |
| 29 | +* Python: added multi-gpu support (in beta) |
| 30 | +* Python: added more unit tests (wrong input, kwargs, multi-gpu) |
| 31 | +* Fixed various memory leaks |
| 32 | +* Added index bound check in 2D spread kernels (Spread_2d_Subprob(_Horner)) |
| 33 | +* Added spread/interp tests to `make check` |
| 34 | +* Fixed user request tolerance (eps) to kernel width (w) calculation |
| 35 | +* Default kernel evaluation method set to 0, i.e. exp(sqrt()), since it is faster |
| 36 | +* Removed outdated benchmark codes, cleaner spread/interp tests |
| 37 | + |
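For readers unfamiliar with the tolerance-to-width calculation mentioned above, the Python sketch below shows one common heuristic: roughly one unit of kernel width per requested digit of accuracy, plus one. This changelog does not state the exact formula cuFINUFFT uses, so the formula, the `MIN_WIDTH` and `MAX_WIDTH` limits, and the function name are all assumptions made for illustration.

```python
import math

MIN_WIDTH = 2   # assumed smallest usable kernel width
MAX_WIDTH = 16  # assumed largest supported kernel width


def kernel_width(eps):
    """Map a requested tolerance eps to a kernel width w (illustrative heuristic).

    Returns (w, clipped), where clipped is True if eps was too small for
    MAX_WIDTH and the width had to be capped.
    """
    if not eps > 0.0:
        raise ValueError("eps must be positive")
    # About one digit of accuracy per unit of width, plus one.
    w = math.ceil(-math.log10(eps / 10.0))
    clipped = w > MAX_WIDTH
    return min(max(w, MIN_WIDTH), MAX_WIDTH), clipped


if __name__ == "__main__":
    for eps in (5e-3, 2e-6, 1e-20):
        print(eps, kernel_width(eps))
```

Returning the `clipped` flag rather than raising lets the caller decide whether a too-small eps should be a warning or an error.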
| 38 | +v 1.1 (09/22/20) |
| 39 | + |
| 40 | +* Python: extended the mode tuple to 3D and reordered it from the C/Python |
| 41 | + ndarray.shape style input (nZ, nY, nX) to the Fortran (F) order expected by the |
| 42 | + low level library (nX, nY, nZ). |
| 43 | +* Added bound checking on the bin size |
| 44 | +* Dual-precision support of spread/interp tests |
| 45 | +* Improved documentation of spread/interp tests |
| 46 | +* Added a dummy call to cuFFTPlan1d to avoid timing the constant cost of the cuFFT |
| 47 | + library. |
| 48 | +* Added heuristic decision of maximum batch size (number of vectors with the |
| 49 | + same nupts to transform at the same time) |
| 50 | +* Reported execution throughput in the test codes |
| 51 | +* Fixed timing in the test codes |
| 52 | +* Professionalized handling of too-small-eps (requested tolerance) |
| 53 | +* Rewrote README.md and added cuFINUFFT logo. |
| 54 | +* Support of advanced Makefile usage, e.g. make -site=olcf_summit |
| 55 | +* Removed FFTW dependency |
| 56 | + |
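The mode-tuple reordering noted in the v 1.1 list above amounts to reversing the shape tuple. The sketch below shows that step in isolation; the helper name `to_library_order` is hypothetical and is not part of the cuFINUFFT Python interface.

```python
def to_library_order(shape):
    """Convert a C/Python ndarray.shape style tuple, e.g. (nZ, nY, nX),
    to the (nX, nY, nZ) order, for 1D, 2D or 3D inputs."""
    if not 1 <= len(shape) <= 3:
        raise ValueError("only 1D, 2D and 3D mode tuples are handled")
    return tuple(reversed(shape))


if __name__ == "__main__":
    print(to_library_order((4, 8, 16)))  # (16, 8, 4)
```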
| 57 | +v 1.0 (07/29/20) |