@@ -27,7 +27,8 @@ the new commands.

## Command Basics

- Commands have a basic request/response headers which look like:
+ Commands and responses start with a two character code then a set of flags, and
+ potentially value data. Flags may have token data attached to them.
```
set request:
ms foo 2 T90 F1\r\n
@@ -43,7 +44,7 @@ response:
hi\r\n

delete request:
- md foo I \r\n
+ md foo\r\n

response:
HD\r\n
@@ -63,29 +64,40 @@ documentation.

Standard GET:

- `mg foo t f v`
+ ```
+ mg foo f v\r\n -- ask for client flags, value
+ VA 2 f30\r\n -- get length, client flags, value
+ hi\r\n
+ ```
+
+ ... will return client flags, value. Add ` k ` to also get the key back.
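
As a rough illustration of how a client might pick apart the response header shown above, here is a minimal Python sketch. The function name `parse_meta_header` and the shape of the returned dict are invented for this example and do not come from any memcached client library. It assumes that only `VA` responses carry a value length before the flags.

```python
def parse_meta_header(line: bytes) -> dict:
    """Split a meta response header such as b"VA 2 f30 c3\\r\\n" into its parts."""
    tokens = line.rstrip(b"\r\n").decode("ascii").split(" ")
    code = tokens[0]
    rest = tokens[1:]
    size = None
    if code == "VA":
        # Assumption: only VA responses carry a value length before the flags.
        size = int(rest[0])
        rest = rest[1:]
    # Each flag is a single character, optionally followed by token data.
    flags = {tok[0]: tok[1:] for tok in rest if tok}
    return {"code": code, "size": size, "flags": flags}


if __name__ == "__main__":
    print(parse_meta_header(b"VA 2 f30 c3\r\n"))
    print(parse_meta_header(b"HD\r\n"))
```

The caller would then read `size` bytes plus the trailing `\r\n` to get the value itself.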

GETS (get with CAS):

- `mg foo t f c v`
+ ```
+ mg foo f c v\r\n
+ VA 2 f30 c3\r\n -- also gets the CAS value back
+ hi\r\n
+ ```

TOUCH (just update TTL, no response data):

- `mg foo T30`
-
- ... will update the TTL to be 30 seconds from now.
+ ```
+ mg foo T30\r\n -- update the TTL to be 30 seconds from now.
+ HD\r\n -- no flags or value requested, get HD return code
+ ```

GAT (get and touch):

- `mg foo t f v T90`
+ `mg foo f v T90`

- ... will fetch standard data and update the TTL to be 90 seconds from now.
+ ... will fetch client flags, value and update the TTL to be 90 seconds from now.

GATS (get and touch with CAS):

- `mg foo t f c v T100`
+ `mg foo f c v T100`

- ... same as above, but also gets the CAS value for later use.
+ ... same as above, but also gets the CAS value.

## New MetaData Flags

@@ -380,7 +392,138 @@ pipeline a bunch of sets together but don't want all of the "ST" code
responses, pass the 'q' flag with it. If a set results in a code other than
"ST" (ie; "EX" for a failed CAS), the response will still be returned.

- Any syntax errors will still result in a response as well (CLIENT_ERROR).
+ Any syntax errors will still result in a response as well (`CLIENT_ERROR`).
+
+ ## Data consistency with CAS overrides
+
+ ### Data version or time based consistency
+
+ After version 1.6.27 the meta protocol supports directly providing a CAS value
+ during mutation operations. By default the CAS value (or CAS id) for an item
+ is generated directly by memcached using a globally incrementing counter, but
+ we can now override this with the `E` flag.
+
+ For example, if the data you are caching has a "version id" or "row version"
+ we can provide that:
+
+ ```
+ ms foo 2 E73 -- directly provide the CAS value
+ hi
+ HD
+ mg foo c v -- later able to retrieve it
+ VA 2 c73
+ hi
+ ```
+
+ We have directly set this value's CAS id to `73`. Now we can use standard CAS
+ operations to update the data. For example, attempting to update the data with
+ an older version will now fail:
+
+ ```
+ ms foo 2 C72 E73
+ hi
+ EX
+ ```
+
+ The above could be the result of a race condition: two processes are trying to
+ move the data from version 72 to 73 at the same time. Since the underlying
+ version is already 73, the second command will fail.
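
The exchange above can be mimicked with a small in-memory model. This is only a sketch for reasoning about the semantics, not how memcached is implemented: `ToyCache`, its method names, and the `NF` result for a compare against a missing key are assumptions made for the example.

```python
class ToyCache:
    """In-memory stand-in for a single server (hypothetical, for illustration)."""

    def __init__(self):
        self.items = {}   # key -> (value, cas)
        self.counter = 0  # stands in for the server's global CAS counter

    def ms(self, key, value, compare_cas=None, explicit_cas=None):
        """Model of `ms` with optional C (compare) and E (explicit CAS) flags."""
        current = self.items.get(key)
        if compare_cas is not None:
            if current is None:
                return "NF"  # assumed: nothing to compare against
            if current[1] != compare_cas:
                return "EX"  # stored CAS differs from the one supplied
        if explicit_cas is None:
            self.counter += 1
            new_cas = self.counter
        else:
            new_cas = explicit_cas
        self.items[key] = (value, new_cas)
        return "HD"

    def cas_of(self, key):
        """Return the stored CAS id, or None if the key is missing."""
        item = self.items.get(key)
        return None if item is None else item[1]


if __name__ == "__main__":
    cache = ToyCache()
    print(cache.ms("foo", b"hi", explicit_cas=73))                  # HD
    print(cache.cas_of("foo"))                                      # 73
    print(cache.ms("foo", b"hi", compare_cas=72, explicit_cas=73))  # EX
```

The second writer loses because it compares against 72 while the stored CAS id is already 73.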
+
+ Note that any command which can generate a new CAS id also accepts the `E`
+ flag. For example: delete with invalidate, ma for incr/decr, and so on.
+
+ #### Time and types
+
+ Anything that fits in an 8 byte _incrementing_ number can be used as a CAS id:
+ versions, clocks, HLCs, and so on, so long as the next number is higher than
+ the previous number.
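
One way to fit a hybrid logical clock into 8 bytes is sketched below. The 48/16 bit split between milliseconds and a logical counter is an assumption chosen for the example, as is the function name `hlc_to_cas`. Any layout works as long as later events always map to larger numbers.

```python
COUNTER_BITS = 16    # assumed split: 48 bits of milliseconds, 16 bits of counter
MAX_CAS = 2**64 - 1  # largest value that fits in 8 bytes


def hlc_to_cas(millis: int, counter: int) -> int:
    """Pack (wall-clock milliseconds, logical counter) into one 64-bit integer."""
    if millis < 0:
        raise ValueError("millis must be non-negative")
    if not 0 <= counter < (1 << COUNTER_BITS):
        raise ValueError("counter does not fit in COUNTER_BITS")
    cas = (millis << COUNTER_BITS) | counter
    if cas > MAX_CAS:
        raise ValueError("timestamp does not fit in 8 bytes")
    return cas


if __name__ == "__main__":
    # Same millisecond, higher counter -> higher CAS id.
    print(hlc_to_cas(1000, 0) < hlc_to_cas(1000, 1))
    # A later millisecond always beats any counter from an earlier one.
    print(hlc_to_cas(1000, 65535) < hlc_to_cas(1001, 0))
```

Putting the timestamp in the high bits keeps the ordering dominated by time, with the counter only breaking ties within the same millisecond.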
+
+ ### Data consistency across multiple pools
+
+ If you are attempting to keep multiple pools of memcached servers in sync, we
+ can use the CAS override to help improve our consistency results. Please note
+ this system is not strict.
+
+ #### Leader and follower pools
+
+ Assume you have pools A, B, and C, with one pool designated as the "leader". We
+ can provide general consistency which can also be repaired:
+
+ ```
+ -- against pool A:
+ mg foo c v\r\n
+ VA 2 c50\r\n
+ hi\r\n
+ -- we fetched a value. we have a new row version (or timestamp):
+ ms foo 2 C50 E51\r\n
+ ih\r\n
+ HD\r\n
+ -- Success. Now, against pools B and C we issue "blind sets":
+ -- pool B:
+ ms foo 2 E51\r\n
+ ih\r\n
+ HD\r\n
+ -- pool C:
+ ms foo 2 E51\r\n
+ ih\r\n
+ HD\r\n
+ ```
+
+ If all is well, all copies will have the same CAS ID as the data in pool A. This
+ is, again, not strict, but can help verify whether the data is consistent. If the
+ data in pool A has gone missing, you can decide on quorum or highest ID to
+ repair data from B/C. Or simply not allow data to change until A has been
+ repaired, but in the meantime data can be read from B/C.
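
The request side of the leader/follower flow above could be generated along these lines. This is a sketch only: `build_ms` and `leader_follower_plan` are hypothetical helpers, not part of any client library, and they only build the bytes to send. Sending them, and stopping when the leader returns something other than `HD`, is left to the caller.

```python
def build_ms(key, value, compare_cas=None, new_cas=None):
    """Build an `ms` request with optional C (compare) and E (explicit CAS) flags."""
    flags = []
    if compare_cas is not None:
        flags.append(f"C{compare_cas}")
    if new_cas is not None:
        flags.append(f"E{new_cas}")
    header = " ".join(["ms", key, str(len(value))] + flags)
    return header.encode("ascii") + b"\r\n" + value + b"\r\n"


def leader_follower_plan(key, value, leader_cas, new_cas, leader, followers):
    """Return (pool, request) pairs: a CAS-guarded set on the leader, then blind sets."""
    plan = [(leader, build_ms(key, value, compare_cas=leader_cas, new_cas=new_cas))]
    for pool in followers:
        # Followers get a "blind set": no C flag, same explicit CAS id.
        plan.append((pool, build_ms(key, value, new_cas=new_cas)))
    return plan


if __name__ == "__main__":
    for pool, request in leader_follower_plan("foo", b"ih", 50, 51, "A", ["B", "C"]):
        print(pool, request)
```

Only the leader request carries the `C` flag, so a lost race is detected once, on pool A, before any follower is touched.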
+
+ #### Full cross consistency
+
+ It is difficult if not impossible to guarantee consistency across multiple
+ pools. You can attempt this by:
+
+ ```
+ -- against pool A:
+ mg foo c v\r\n -- if we need the value. else drop the v.
+ -- against pools B/C:
+ mg foo c\r\n -- don't necessarily need the value
+ -- use the same set command across all three hosts:
+ ms foo 2 Cnnnn E90\r\n
+ hi\r\n
+ ```
+
+ If this fails on any host, do the whole routine again. If a value is being
+ frequently updated this can be problematic or permanently blocking.
+
+ You can augment this with the stale/win flags by picking a "leader" pool and
+ issuing an invalidation:
+
+ ```
+ md foo I E74\r\n -- invalidate the key if it exists, prep it for the new ver
+ HD\r\n
+ mg foo c v\r\n
+ VA 2 c75 X W\r\n -- data is stale (X) and we atomically win (W)
+ hi\r\n
+ ```
+
+ Now this client has the exclusive right to update this value across all pools.
+ If updates to pools B/C end up coming _out of order_ the highest CAS value
+ should eventually succeed if the updates are asynchronous:
+
+ - If pool B starts at version 70, then gets updated to 75 as above
+ - A later update to 73 will fail (CAS too old)
+ - If pool C starts at version 70, gets updated to 73, then gets updated to 75
+ - The final result will be the correct version
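
The two cases in the list above can be checked with a tiny simulation. It assumes, as the list describes, that a pool rejects an update whose version is not higher than the one it already holds. How that check is expressed on the wire is not modelled here, and the function names are made up for the example.

```python
from itertools import permutations


def apply_update(stored: int, incoming: int) -> int:
    """Keep the higher version; an older or equal update is rejected."""
    return incoming if incoming > stored else stored


def final_version(start: int, updates) -> int:
    """Apply updates in the given arrival order and return the resulting version."""
    version = start
    for incoming in updates:
        version = apply_update(version, incoming)
    return version


if __name__ == "__main__":
    # Both arrival orders of the updates 73 and 75 end at version 75.
    for order in permutations([73, 75]):
        print(order, "->", final_version(70, order))
```

Because the rule only ever moves a pool forward, the arrival order of asynchronous updates does not change where each pool ends up.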
+
+ If the client waits for either quorum (2/3 in this case) or all (3/3) hosts to
+ respond before responding to its user, the data should be the same.
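
A client-side check for "quorum or all" might look like the following sketch. The helper names and the idea of collecting one return code per pool in a dict are assumptions for illustration. Only `HD` is counted as a successful update.

```python
def majority(pool_count: int) -> int:
    """Smallest number of pools that forms a majority (2 of 3, 3 of 5, ...)."""
    return pool_count // 2 + 1


def enough_acks(results: dict, required: int) -> bool:
    """results maps pool name -> return code; count pools that answered HD."""
    acks = sum(1 for code in results.values() if code == "HD")
    return acks >= required


if __name__ == "__main__":
    results = {"A": "HD", "B": "HD", "C": "EX"}
    print(enough_acks(results, majority(len(results))))  # True: 2 of 3 succeeded
    print(enough_acks(results, len(results)))            # False: not all 3 succeeded
```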
+
+ Problems where this fails should be relatively obscure, for example:
+ - Pool A is our leader, we get win flag (W) and successfully update it.
+ - We update Pools B, C, but in the meantime pool A has failed.
+ - Depending on how new leaders are elected and how long updates take, we can
+ end up with inconsistent data.
+
+ In reality host or pool election is slow and cross-stream updates are
+ relatively fast, so such failures should be unusual.

## Probabilistic Hot Cache
