Commit 9afe77d

doc trouble type
1 parent 213adb3 commit 9afe77d

4 files changed, +151 -3 lines changed

Diff for: docs/refs.rst

+1-1
@@ -6,7 +6,7 @@ References
 Please cite the following two papers if you use this software:

 [FIN]
-A parallel non-uniform fast Fourier transform library based on an ``exponential of semicircle'' kernel.
+A parallel non-uniform fast Fourier transform library based on an "exponential of semicircle" kernel.
 A. H. Barnett, J. F. Magland, and L. af Klinteberg.
 SIAM J. Sci. Comput. 41(5), C479-C504 (2019). `arxiv version <https://arxiv.org/abs/1808.06736>`_

Diff for: docs/trouble.rst

+1-1
@@ -40,7 +40,7 @@ If FINUFFT is slow (eg, less than $10^6$ nonuniform points per second), here is

 - Try printing debug output to see step-by-step progress by FINUFFT. Do this by setting ``opts.debug`` to 1 or 2 then looking at the timing information.

-- Try reducing the number of threads, either those available via OpenMP, or via ``opts.nthreads``, perhaps down to 1 thread, to make sure you are not having collisions between threads, or slowdown due to thread overheads. Hyperthreading (more threads than physical cores) rarely helps much. Thread collisions are possible if large problems are run with a large number of (say more than 64) threads. Another ase causing slowness is very many repetitions of small problems; see ``test/manysmallprobs`` which exceeds $10^7$ points/sec with one thread via the guru interface, but can get ridiculously slower with many threads; see https://github.com/flatironinstitute/finufft/issues/86
+- Try reducing the number of threads, either those available via OpenMP, or via ``opts.nthreads``, perhaps down to 1 thread, to make sure you are not having collisions between threads, or slowdown due to thread overheads. Hyperthreading (more threads than physical cores) rarely helps much. Thread collisions are possible if large problems are run with a large number of (say more than 64) threads. Another case causing slowness is very many repetitions of small problems; see ``test/manysmallprobs`` which exceeds $10^7$ points/sec with one thread via the guru interface, but can get ridiculously slower with many threads; see https://github.com/flatironinstitute/finufft/issues/86

 - Try setting a crude tolerance, eg ``tol=1e-3``. How many digits do you actually need? This has a big effect in higher dimensions, since the number of flops scales like $(\log 1/\epsilon)^d$, but not quite as big an effect as this scaling would suggest, because in higher dimensions the flops/RAM ratio is higher.

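The tolerance advice in the hunk above can be made concrete with a rough cost model. The sketch below assumes the flop count is proportional to $(\log 1/\epsilon)^d$ with everything else held fixed; the function name `relative_flops` is hypothetical and is not part of FINUFFT. As the documentation text itself notes, measured speedups are smaller than this scaling suggests, so the numbers are an optimistic estimate rather than a prediction.

```python
import math


def relative_flops(tol_fine, tol_crude, dim):
    """Ratio of modeled flop counts: fine tolerance over crude tolerance.

    Assumes cost ~ (log(1/tol))**dim, the scaling quoted in the docs.
    Hypothetical helper for illustration only; not a FINUFFT function.
    """
    if not (0 < tol_fine < 1 and 0 < tol_crude < 1):
        raise ValueError("tolerances must lie strictly between 0 and 1")
    if dim not in (1, 2, 3):
        raise ValueError("dim must be 1, 2 or 3")
    # -log(tol) equals log(1/tol) and avoids forming the reciprocal.
    return (-math.log(tol_fine) / -math.log(tol_crude)) ** dim


# Relaxing tol from 1e-12 to 1e-3 shrinks log(1/tol) by a factor of about 4,
# so the modeled saving grows with dimension (about 4x, 16x, 64x).
for dim in (1, 2, 3):
    print(dim, round(relative_flops(1e-12, 1e-3, dim), 2))
```

Under this model the payoff from a cruder tolerance is largest in 3D, which matches the advice to ask how many digits are actually needed before tuning anything else.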
Diff for: finufft-manual.pdf

-97 Bytes
Binary file not shown.

Diff for: matlab/finufft_plan.m

+149-1
@@ -1,4 +1,152 @@
-
+% FINUFFT_PLAN is a class which wraps the guru interface to FINUFFT.
+%
+% Full documentation is given in ../finufft-manual.pdf and online at
+% http://finufft.readthedocs.io
+% Also see examples in the matlab/examples and matlab/test directories.
+%
+% PROPERTIES
+% mwptr - opaque pointer to a C++ finufft_plan object (see MWrap manual),
+% whose properties cannot be accessed directly
+% floatprec - either 'double' or 'single', tracks what precision of C++
+% library is being called
+% type, dim, n_modes, n_trans, nj, nk - other plan parameters
+% Note: the user should never alter these plan properties directly! Rather,
+% the below methods should be used to create, use, and destroy plans.
+%
+% METHODS
+% finufft_plan - create guru plan object for one/many general nonuniform FFTs.
+% setpts - process nonuniform points for general FINUFFT transform(s).
+% execute - execute single or many-vector FINUFFT transforms in a plan.
+%
+% General notes:
+% * use delete(plan) to remove a plan after use.
+% * See ERRHANDLER, VALID_*, and this code for warning/error IDs.
+%
+%
+%
+% =========== Detailed description of guru methods ==========================
+%
+% 1) FINUFFT_PLAN create guru plan object for one/many general nonuniform FFTs.
+%
+% plan = finufft_plan(type, n_modes_or_dim, isign, ntrans, eps)
+% plan = finufft_plan(type, n_modes_or_dim, isign, ntrans, eps, opts)
+%
+% Creates a finufft_plan MATLAB object in the guru interface to FINUFFT, of
+% type 1, 2 or 3, and with given numbers of Fourier modes (unless type 3).
+%
+% Inputs:
+% type transform type: 1, 2, or 3
+% n_modes_or_dim if type is 1 or 2, the number of Fourier modes in each
+% dimension: [ms] in 1D, [ms mt] in 2D, or [ms mt mu] in 3D.
+% Its length sets the dimension, which must be 1, 2 or 3.
+% If type is 3, in contrast, its *value* fixes the dimension
+% isign if >=0, uses + sign in exponential, otherwise - sign.
+% eps relative precision requested (generally between 1e-15 and 1e-1)
+% opts optional struct with optional fields controlling the following:
+% opts.debug: 0 (silent, default), 1 (timing breakdown), 2 (debug info).
+% opts.spread_debug: spreader: 0 (no text, default), 1 (some), or 2 (lots)
+% opts.spread_sort: 0 (don't sort NU pts), 1 (do), 2 (auto, default)
+% opts.spread_kerevalmeth: 0: exp(sqrt()), 1: Horner ppval (faster)
+% opts.spread_kerpad: (iff kerevalmeth=0) 0: don't pad to mult of 4, 1: do
+% opts.fftw: FFTW plan mode, 64=FFTW_ESTIMATE (default), 0=FFTW_MEASURE, etc
+% opts.upsampfac: sigma. 2.0 (default), or 1.25 (low RAM, smaller FFT)
+% opts.spread_thread: for ntrans>1 only. 0:auto, 1:seq multi, 2:par, etc
+% opts.maxbatchsize: for ntrans>1 only. max blocking size, or 0 for auto.
+% opts.nthreads: number of threads, or 0: use all available (default)
+% opts.floatprec: library precision to use, 'double' (default) or 'single'.
+% for type 1 and 2 only, the following opts fields are also relevant:
+% opts.modeord: 0 (CMCL increasing mode ordering, default), 1 (FFT ordering)
+% opts.chkbnds: 0 (don't check NU points valid), 1 (do, default)
+% Outputs:
+% plan finufft_plan object (opaque pointer)
+%
+% Notes:
+% * For type 1 and 2, this does the FFTW planning and kernel-FT precomputation.
+% * For type 3, this does very little, since the FFT sizes are not yet known.
+% * By default all threads are planned; control how many with opts.nthreads.
+% * The vectorized (many vector) plan, ie ntrans>1, can be much faster
+% than repeated calls with the same nonuniform points. Note that here the I/O
+% data ordering is stacked rather than interleaved. See ../docs/matlab.rst
+% * For more details about the opts fields, see ../docs/opts.rst
+%
+%
+% 2) SETPTS process nonuniform points for general FINUFFT transform(s).
+%
+% plan.setpts(xj)
+% plan.setpts(xj, yj)
+% plan.setpts(xj, yj, zj)
+% plan.setpts(xj, [], [], s)
+% plan.setpts(xj, yj, [], s, t)
+% plan.setpts(xj, yj, zj, s, t, u)
+%
+% When plan is a finufft_plan MATLAB object, brings in nonuniform point
+% coordinates (xj,yj,zj), and additionally in the type 3 case, nonuniform
+% frequency target points (s,t,u). Empty arrays may be passed in the case of
+% unused dimensions. For all types, sorting is done to internally store a
+% reindexing of points, and for type 3 the spreading and FFTs are planned.
+% The nonuniform points may be used for multiple transforms.
+%
+% Inputs:
+% xj vector of x-coords of all nonuniform points
+% yj empty (if dim<2), or vector of y-coords of all nonuniform points
+% zj empty (if dim<3), or vector of z-coords of all nonuniform points
+% s vector of x-coords of all nonuniform frequency targets
+% t empty (if dim<2), or vector of y-coords of all frequency targets
+% u empty (if dim<3), or vector of z-coords of all frequency targets
+% Input/Outputs:
+% plan finufft_plan object
+%
+% Notes:
+% * For type 1 and 2, the values in xj (and if nonempty, yj and zj) must
+% lie in the interval [-3pi,3pi). For type 1 they are "sources", but for
+% type 2, "targets". In contrast, for type 3 there are no restrictions other
+% than the resulting size of the internal fine grids.
+% * s (and t and u) are only relevant for type 3, and may be omitted otherwise
+% * The matlab vectors xj,... and s,... should not be changed before calling
+% future execute calls, because the plan stores only pointers to the
+% arrays (they are not duplicated internally).
+% * The precision (double/single) of all inputs must match that chosen at the
+% plan stage using opts.floatprec, otherwise an error is raised.
+%
+%
+% 3) EXECUTE execute single or many-vector FINUFFT transforms in a plan.
+%
+% result = plan.execute(data_in);
+%
+% For plan a previously created finufft_plan object also containing all
+% needed nonuniform point coordinates, do a single (or if ntrans>1 in the
+% plan stage, multiple) NUFFT transform(s), with the strengths or Fourier
+% coefficient inputs vector(s) from data_in. The result of the transform(s)
+% is returned as a (possibly multidimensional) array.
+%
+% Inputs:
+% plan finufft_plan object
+% data_in strengths (types 1 or 3) or Fourier coefficients (type 2)
+% vector, matrix, or array of appropriate size. For type 1 and 3,
+% this is either a length-M vector (where M is the length of xj),
+% or an (M,ntrans) matrix when ntrans>1. For type 2, in 1D this is
+% length-ms, in 2D size (ms,mt), or in 3D size (ms,mt,mu), or
+% each of these with an extra last dimension ntrans if ntrans>1.
+% Outputs:
+% result vector of output strengths at targets (types 2 or 3), or array
+% of Fourier coefficients (type 1), or, if ntrans>1, a stack of
+% such vectors or arrays, of appropriate size.
+% Specifically, if ntrans=1, for type 1, in 1D
+% this is a length-ms column vector, in 2D a matrix of size
+% (ms,mt), or in 3D an array of size (ms,mt,mu); for types 2 and 3
+% it is a column vector of length M (the length of xj in type 2),
+% or nk (the length of s in type 3). If ntrans>1 it is a stack
+% of such objects, ie, it has an extra last dimension ntrans.
+%
+% Notes:
+% * The precision (double/single) of all inputs must match that chosen at the
+% plan stage using opts.floatprec, otherwise an error is raised.
+%
+%
+% 4) To deallocate (delete) a nonuniform FFT plan, use delete(plan)
+%
+% This deallocates all stored FFTW plans, nonuniform point sorting arrays,
+% kernel Fourier transforms arrays, etc.
 classdef finufft_plan < handle

 properties

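The size rules stated in the EXECUTE section of the help text above can be summarized as a small lookup. The following is a language-neutral sketch written in Python, not MATLAB code and not part of FINUFFT: the helper name `execute_shapes` and its argument names are hypothetical, and vectors are written as 1-tuples such as `(M,)` even though MATLAB returns them as column vectors.

```python
def execute_shapes(nufft_type, n_modes, M, nk=None, ntrans=1):
    """Return (input_shape, output_shape) for execute, per the help text.

    nufft_type: 1, 2 or 3.
    n_modes:    mode counts (ms,), (ms, mt) or (ms, mt, mu); unused for type 3.
    M:          number of nonuniform points (length of xj).
    nk:         number of frequency targets (length of s); type 3 only.
    ntrans:     number of transforms sharing the same points.
    Hypothetical helper for illustration only; not a FINUFFT function.
    """
    if nufft_type not in (1, 2, 3):
        raise ValueError("type must be 1, 2 or 3")
    if ntrans < 1:
        raise ValueError("ntrans must be at least 1")

    if nufft_type == 3:
        if nk is None:
            raise ValueError("type 3 needs nk, the number of targets")
        # Strengths at M points in, values at nk frequency targets out.
        shape_in, shape_out = (M,), (nk,)
    else:
        modes = tuple(n_modes)
        if not 1 <= len(modes) <= 3:
            raise ValueError("dimension must be 1, 2 or 3")
        if nufft_type == 1:
            # Strengths at M points in, Fourier coefficients out.
            shape_in, shape_out = (M,), modes
        else:
            # Fourier coefficients in, values at M targets out.
            shape_in, shape_out = modes, (M,)

    if ntrans > 1:
        # Stacked ordering: the transform index is the extra last dimension.
        shape_in += (ntrans,)
        shape_out += (ntrans,)
    return shape_in, shape_out


# A 2D type 1 plan with four stacked transforms over 1000 points.
print(execute_shapes(1, (50, 60), 1000, ntrans=4))
```

Checking sizes this way before calling `plan.execute` makes the stacked (rather than interleaved) data ordering explicit, which is the detail the help text flags for the `ntrans>1` case.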