@@ -38,10 +38,6 @@ if not index.unique then
    key_def_inst = key_def_inst:merge(key_def.new(space.index[0].parts))
end

- -- Create a merger context.
- -- NB: It worth to cache it.
- local ctx = merger.context.new(key_def_inst)
-
-- Prepare M sources.
local sources = {}
for _, conn in ipairs(connects) do
@@ -52,13 +48,13 @@ for _, conn in ipairs(connects) do
end

-- Merge.
- local merger_inst = merger.new(ctx, sources)
+ local merger_inst = merger.new(key_def_inst, sources)
local res = merger_inst:select()
```

## How to form key parts

- The merger expects that each input tuple stream is sorted in the order that
+ The merger expects that each input tuple stream is sorted in the order that is
acquired for a result (via key parts and the `reverse` flag). It performs a
kind of the merge sort: chooses a source with a minimal / maximal tuple on each
step, consumes a tuple from this source and repeats.
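To make the rule above concrete, here is a minimal sketch of forming key parts the way the changed code in the first hunk does. The space and index names (`customers`, `age`) are hypothetical; it assumes a running Tarantool instance with the `key_def` module available.

``` lua
local key_def = require('key_def')

-- Hypothetical space with a secondary index used for the select.
local space = box.space.customers
local index = space.index.age

-- Start from the parts of the index the data is selected by.
local key_def_inst = key_def.new(index.parts)

-- A non-unique index does not give a total order by itself, so merge in
-- the primary index parts as a tie-breaker.
if not index.unique then
    key_def_inst = key_def_inst:merge(key_def.new(space.index[0].parts))
end
-- key_def_inst now orders tuples exactly as the index iteration does.
```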
@@ -202,10 +198,10 @@ limit and GT iterator (with a key extracted from a last fetched tuple).
Note: such way to implement a cursor / a pagination will work smoothly only
with unique indexes. See also #3898.

- More complex scenarious are possible: using futures (`is_async = true`
- parameters of net.box methods) to fetch a next chunk while merge a current one
- or, say, call a function with several return values (some of them need to be
- skipped manually in a `gen` function to let merger read tuples).
+ More complex scenarios are possible: using futures (the `is_async = true`
+ option of net.box methods) to fetch a next chunk while merging a current one
+ or, say, calling a function with several return values (some of them need to
+ be skipped manually in a `gen` function to let the merger read tuples).

Note: When using `is_async = true` net.box option one can lean on the fact that
net.box writes an answer w/o yield: a partial result cannot be observed.
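The futures approach mentioned above can be sketched as follows. The connection address and space name are hypothetical; the `buffer` and `is_async` options of net.box `select` are standard Tarantool API, but this is only an illustration of the prefetch idea, not a complete pagination loop.

``` lua
local buffer = require('buffer')
local net_box = require('net.box')

local conn = net_box.connect('localhost:3301') -- hypothetical address
local buf = buffer.ibuf()

-- Fire the request for the next chunk asynchronously; the response will be
-- written into the buffer without decoding tuples into Lua objects.
local future = conn.space.customers:select(nil, {
    limit = 100,
    buffer = buf,
    is_async = true,
})

-- ... merge the previously fetched chunk here, while the request is in
-- flight ...

-- Block until the next chunk is fully written into the buffer; since
-- net.box writes an answer without yielding, no partial result is visible.
future:wait_result()
```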
@@ -250,6 +246,9 @@ indexes) and use vshard API on a client.
-- See chunked_example_fast/frontend.lua.
```

+ In this example we also cache key_def instances to reuse them for processing
+ results from the same space and index.
+
## Multiplexing requests

Consider the case when a network latency between storage machines and frontend
@@ -261,7 +260,7 @@ one network request. We'll consider approach when a storage function returns
many box.space.<...>:select(<...>) results instead of one.

One needs to skip the iproto_data header, two array headers and then run a merger N
- times on the same buffers (with the same or different contexts). No extra data
+ times on the same buffers (with the same or different key_defs). No extra data
copies, no tuples decoding into a Lua memory.

``` lua
@@ -278,7 +277,7 @@ copies, no tuples decoding into a Lua memory.

## Cascading mergers

- The idea is simple: a merger instance itself is a merger source.
+ The idea is simple: a merger instance itself is a merge source.

The example below is synthetic to be simple. Real cases when cascading can be
profitable likely involve additional layers of Tarantool instances between a
@@ -291,7 +290,7 @@ behaviour for a source and a merger looks as the good property of the API.
<... requires ...>

local sources = <... 100 sources ...>
- local ctx = merger.context.new(key_def.new(<...>))
+ local key_def_inst = key_def.new(<...>)

-- Create 10 mergers with 10 sources in each.
local middleware_mergers = {}
@@ -300,10 +299,46 @@ for i = 1, 10 do
    for j = 1, 10 do
        current_sources[j] = sources[(i - 1) * 10 + j]
    end
-     middleware_mergers[i] = merger.new(ctx, current_sources)
+     middleware_mergers[i] = merger.new(key_def_inst, current_sources)
end

- -- Note: Using different contexts will lead to extra copying of
- -- tuples.
- local res = merger.new(ctx, middleware_mergers):select()
+ local res = merger.new(key_def_inst, middleware_mergers):select()
```
+
+ ## When are comparisons fast?
+
+ ### In short
+
+ If tuples are from a local space and a key_def for a merger is created using
+ parts of an index from the space (see the 'How to form key parts' section
+ above), then comparisons will be fast (and no extra tuple creations occur).
+
+ If tuples are received from net.box, stored into a buffer and created with a
+ buffer source, then everything is okay too.
+
+ When tuples are created from Lua tables, comparisons will be fast too, but
+ this case possibly means that extra work is performed to decode a tuple into
+ a Lua table (say, in net.box) and then to encode it into a new tuple in a
+ merge source.
+
+ When tuples are created with `box.tuple.new()`, comparisons will likely be
+ slow.
+
+ ### In detail
+
+ First, some background information. Tuples can be created with different
+ tuple formats. A format in particular defines which fields have precalculated
+ offsets (these offsets are stored within a tuple). When there is a
+ precalculated offset, reading the field is faster: it does not require
+ decoding the whole msgpack data up to the field. When a tuple is obtained
+ from a space, all indexed fields (all fields that are part of an index from
+ this space) have offsets. When a tuple is created with `box.tuple.new(<...>)`,
+ it has no offsets.
+
+ Merge source types differ in how tuples are obtained. A buffer source always
+ creates tuples itself. A tuple or a table source can pass existing tuples
+ through or create tuples from Lua tables.
+
+ When a merger acquires a tuple from a source, it passes a tuple format, which
+ can be used to create a tuple. So when a tuple is created by a source, field
+ accesses will be fast and so comparisons will be fast. When a tuple is passed
+ through a source, it is possible that it lacks some offsets, so comparisons
+ can be slow. In this case it is the user's responsibility to provide tuples
+ with the needed offsets to make the merge faster.
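The offsets difference described above can be illustrated with a minimal sketch. The space name is hypothetical; it assumes a running Tarantool instance where `customers` has a primary index on field 1.

``` lua
-- A tuple fetched from a space carries precalculated offsets for all
-- indexed fields of that space.
local from_space = box.space.customers:get(1)

-- A tuple built with box.tuple.new() has no precalculated offsets: each
-- field access may decode the msgpack data from the beginning of the tuple.
local from_new = box.tuple.new({1, 'alice'})

-- Both support field access, but comparisons over indexed fields of
-- from_space are cheaper, which is what a merger benefits from.
```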
0 commit comments