Commit e89358e

Document E overrides for meta
updates general text a little as well.
1 parent 377b6ad commit e89358e

1 file changed

+155
-12
lines changed

content/protocols/meta.md

Lines changed: 155 additions & 12 deletions
@@ -27,7 +27,8 @@ the new commands.
2727

2828
## Command Basics
2929

30-
Commands have a basic request/response headers which look like:
30+
Commands and responses start with a two character code then a set of flags, and
31+
potentially value data. Flags may have token data attached to them.
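As a rough illustration of that layout, here is a minimal Python sketch that splits a response header into its code, value length, and flags. The function name `parse_meta_header` is hypothetical, and the sketch assumes (based only on the examples in this document) that `VA` is the one code followed by a length field.

```python
def parse_meta_header(line: bytes):
    """Split a meta response header into (code, size, flags)."""
    text = line.decode("ascii").rstrip("\r\n")
    parts = text.split(" ")
    code = parts[0]          # two character code, e.g. "VA" or "HD"
    rest = parts[1:]
    size = None
    # Assumption: only "VA" responses carry a value, so only they
    # have a length field directly after the code.
    if code == "VA":
        size = int(rest[0])
        rest = rest[1:]
    # Each flag is a single character, optionally followed by a token.
    flags = {part[0]: part[1:] for part in rest if part}
    return code, size, flags


print(parse_meta_header(b"VA 2 f30 c3\r\n"))
print(parse_meta_header(b"HD\r\n"))
```

Flags without a token (such as `W`) map to an empty string, so the presence of a flag can be checked with `"W" in flags`.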
3132
```
3233
set request:
3334
ms foo 2 T90 F1\r\n
@@ -43,7 +44,7 @@ response:
4344
hi\r\n
4445
4546
delete request:
46-
md foo I\r\n
47+
md foo\r\n
4748
4849
response:
4950
HD\r\n
@@ -63,29 +64,40 @@ documentation.
6364

6465
Standard GET:
6566

66-
`mg foo t f v`
67+
```
68+
mg foo f v\r\n -- ask for client flags, value
69+
VA 2 f30\r\n -- get length, client flags, value
70+
hi\r\n
71+
```
72+
73+
... will return client flags, value. Add `k` to also get the key back.
6774

6875
GETS (get with CAS):
6976

70-
`mg foo t f c v`
77+
```
78+
mg foo f c v\r\n
79+
VA 2 f30 c3\r\n -- also gets the CAS value back
80+
hi\r\n
81+
```
7182

7283
TOUCH (just update TTL, no response data):
7384

74-
`mg foo T30`
75-
76-
... will update the TTL to be 30 seconds from now.
85+
```
86+
mg foo T30\r\n -- update the TTL to be 30 seconds from now.
87+
HD\r\n -- no flags or value requested, get HD return code
88+
```
7789

7890
GAT (get and touch):
7991

80-
`mg foo t f v T90`
92+
`mg foo f v T90`
8193

82-
... will fetch standard data and update the TTL to be 90 seconds from now.
94+
... will fetch client flags, value and update the TTL to be 90 seconds from now.
8395

8496
GATS (get and touch with CAS):
8597

86-
`mg foo t f c v T100`
98+
`mg foo f c v T100`
8799

88-
... same as above, but also gets the CAS value for later use.
100+
... same as above, but also gets the CAS value.
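The request lines above all share one shape: `mg`, the key, then a list of flags. A small sketch of a builder, assuming a hypothetical helper named `build_mg` that does no key validation:

```python
def build_mg(key: str, *flags: str) -> bytes:
    """Build an mg request line: command, key, flags, terminated by CRLF."""
    return " ".join(["mg", key, *flags]).encode("ascii") + b"\r\n"


# The request lines shown in this section:
print(build_mg("foo", "f", "v"))               # standard GET
print(build_mg("foo", "f", "c", "v"))          # GETS
print(build_mg("foo", "T30"))                  # TOUCH
print(build_mg("foo", "f", "v", "T90"))        # GAT
print(build_mg("foo", "f", "c", "v", "T100"))  # GATS
```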
89101

90102
## New MetaData Flags
91103

@@ -380,7 +392,138 @@ pipeline a bunch of sets together but don't want all of the "ST" code
380392
responses, pass the 'q' flag with it. If a set results in a code other than
381393
"ST" (ie; "EX" for a failed CAS), the response will still be returned.
382394

383-
Any syntax errors will still result in a response as well (CLIENT_ERROR).
395+
Any syntax errors will still result in a response as well (`CLIENT_ERROR`).
396+
397+
## Data consistency with CAS overrides
398+
399+
### Data version or time based consistency
400+
401+
After version 1.6.27 the meta protocol supports directly providing a CAS value
402+
during mutation operations. By default the CAS value (or CAS id) for an item
403+
is generated directly by memcached using a globally incrementing counter, but
404+
we can now override this with the `E` flag.
405+
406+
For example, if the data you are caching has a "version id" or "row version"
407+
we can provide that:
408+
409+
```
410+
ms foo 2 E73 -- directly provide the CAS value
411+
hi
412+
HD
413+
mg foo c v -- later able to retrieve it
414+
VA 2 c73
415+
hi
416+
```
417+
418+
We have directly set this value's CAS id to `73`. Now we can use standard CAS
419+
operations to update the data. For example, attempting to update the data with
420+
an older version will now fail:
421+
422+
```
423+
ms foo 2 C72 E73
424+
hi
425+
EX
426+
```
427+
428+
The above could be the result of a race condition: two processes are trying to
429+
move the data from version 72 to 73 at the same time. Since the underlying
430+
version is already 73, the second command will fail.
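To make the compare-then-override behaviour concrete, here is a toy in-memory model in Python. It is not how memcached is implemented; `ToyCache` and its method names are hypothetical, and the `NF` result for a compare against a missing key is an assumption of this sketch.

```python
import itertools


class ToyCache:
    """In-memory stand-in for one memcached server (illustration only)."""

    def __init__(self):
        self._items = {}                    # key -> (value, cas_id)
        self._counter = itertools.count(1)  # default CAS id source

    def ms(self, key, value, compare_cas=None, new_cas=None):
        """Mimic `ms key <len> [C<compare_cas>] [E<new_cas>]`."""
        current = self._items.get(key)
        if compare_cas is not None:
            if current is None:
                return "NF"  # assumption: compare against a missing key
            if current[1] != compare_cas:
                return "EX"  # item exists but its CAS id did not match
        # E overrides the generated CAS id; otherwise use the counter.
        cas = new_cas if new_cas is not None else next(self._counter)
        self._items[key] = (value, cas)
        return "HD"

    def mg(self, key):
        """Return (value, cas_id), or None on a miss."""
        return self._items.get(key)


cache = ToyCache()
print(cache.ms("foo", "hi", new_cas=73))                  # first writer
print(cache.ms("foo", "hi", compare_cas=72, new_cas=73))  # racing writer
print(cache.mg("foo"))
```

The second `ms` call mirrors `ms foo 2 C72 E73`: the stored CAS id is already 73, so the compare against 72 is rejected and the stored item is left untouched.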
431+
432+
Note that any command which can generate a new CAS id also accepts the `E`
433+
flag, e.g. delete with invalidate, ma for incr/decr, and so on.
434+
435+
#### Time and types
436+
437+
Anything that fits in an 8 byte _incrementing_ number can be used as a CAS id:
438+
versions, clocks, HLCs, and so on, so long as the next number is higher than
439+
the previous number.
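One way a clock could be turned into such a number is sketched below in Python. The bit layout (milliseconds shifted left by 16 bits) and the `ClockCas` class are assumptions made for illustration. This is a single-process generator, not a full hybrid logical clock.

```python
import time

COUNTER_BITS = 16  # hypothetical split: milliseconds above, counter below


class ClockCas:
    """Produce strictly increasing ids that fit in 8 bytes."""

    def __init__(self, now_ms=lambda: time.time_ns() // 1_000_000):
        self._now_ms = now_ms
        self._last = 0

    def next_id(self) -> int:
        candidate = self._now_ms() << COUNTER_BITS
        # If the clock stalled or stepped backwards, bump the previous
        # id instead so the sequence never decreases or repeats.
        if candidate <= self._last:
            candidate = self._last + 1
        self._last = candidate
        return candidate


gen = ClockCas()
first, second = gen.next_id(), gen.next_id()
print(first < second, second < 2**64)
```

The id returned by `next_id()` would be passed as the token of the `E` flag. Ids generated by separate processes are only ordered as well as their clocks agree.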
440+
441+
### Data consistency across multiple pools
442+
443+
If you are attempting to keep multiple pools of memcached servers in sync, we
444+
can use the CAS override to help improve our consistency results. Please note
445+
this system is not strict.
446+
447+
#### Leader and follower pools
448+
449+
Assume you have pools A, B, and C, and one pool is designated as the "leader". We
450+
can provide general consistency which can also be repaired:
451+
452+
```
453+
-- against pool A:
454+
mg foo c v\r\n
455+
VA 2 c50\r\n
456+
hi\r\n
457+
-- we fetched a value. we have a new row version (or timestamp):
458+
ms foo 2 C50 E51\r\n
459+
ih\r\n
460+
HD\r\n
461+
-- Success. Now, against pools B and C we issue "blind sets":
462+
-- pool B:
463+
ms foo 2 E51\r\n
464+
ih\r\n
465+
HD\r\n
466+
-- pool C:
467+
ms foo 2 E51\r\n
468+
ih\r\n
469+
HD\r\n
470+
```
471+
472+
If all is well, all copies will have the same CAS ID as data in pool A. This is,
473+
again, not strict, but can help verify if data is consistent or not. If the
474+
data in pool A has gone missing, you can decide on quorum or highest ID to
475+
repair data from B/C. Or simply not allow data to change until A has been
476+
repaired, but in the meantime data can be read from B/C.
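The two repair policies mentioned above (highest ID or quorum) could be sketched as follows. This is a Python illustration with hypothetical function names. It assumes the client has already fetched each pool's CAS id with `mg foo c` and recorded `None` for a miss.

```python
from collections import Counter


def highest_cas(replicas):
    """Return the pool holding the highest CAS id, or None if all missed."""
    present = {pool: cas for pool, cas in replicas.items() if cas is not None}
    if not present:
        return None
    return max(present, key=present.get)


def quorum_cas(replicas, needed):
    """Return a CAS id reported by at least `needed` pools, else None."""
    counts = Counter(cas for cas in replicas.values() if cas is not None)
    for cas, seen in counts.most_common():
        if seen >= needed:
            return cas
    return None


# Pool A lost the item; B and C still agree on version 51.
observed = {"A": None, "B": 51, "C": 51}
print(highest_cas(observed), quorum_cas(observed, needed=2))
```

Which policy to prefer depends on the application. Highest ID favours the newest write that landed anywhere, while quorum favours the version most pools agree on.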
477+
478+
#### Full cross consistency
479+
480+
It is difficult if not impossible to guarantee consistency across multiple
481+
pools. You can attempt this by:
482+
483+
```
484+
-- against pool A:
485+
mg foo c v\r\n -- if we need the value. else drop the v.
486+
-- against pools B/C:
487+
mg foo c\r\n -- don't necessarily need the value
488+
-- use the same set command across all three hosts:
489+
ms foo 2 Cnnnn E90\r\n
490+
hi\r\n
491+
```
492+
493+
If this fails on any host, do the whole routine again. If a value is being
494+
frequently updated this can be problematic or permanently blocking.
495+
496+
You can augment this with the stale/win flags by picking a "leader" pool and
497+
issuing an invalidation:
498+
499+
```
500+
md foo I E74\r\n -- invalidate the key if it exists, prep it for the new ver
501+
HD\r\n
502+
mg foo c v\r\n
503+
VA 2 c75 X W\r\n -- data is stale (X) and we atomically win (W)
504+
hi\r\n
505+
```
506+
507+
Now this client has the exclusive right to update this value across all pools.
508+
If updates to pools B/C end up coming _out of order_ the highest CAS value
509+
should eventually succeed if the updates are asynchronous:
510+
511+
- If pool B starts at version 70, then gets updated to 75 as above
512+
- A later update to 73 will fail (CAS too old)
513+
- If pool C starts at version 70, gets updated to 73, then gets updated to 75
514+
- The final result will be the correct version
515+
516+
If the client waits for either quorum (2/3 in this case) or all (3/3) hosts to
517+
respond before responding to its user, the data should be the same.
518+
519+
Problems where this fails should be relatively obscure, for example:
520+
- Pool A is our leader, we get win flag (W) and successfully update it.
521+
- We update Pools B, C, but in the meantime pool A has failed.
522+
- Depending on how new leaders are elected and how long updates take, we can
523+
end up with inconsistent data.
524+
525+
In reality host or pool election is slow and cross-stream updates are
526+
relatively fast, so it should be unusual.
384527

385528
## Probabilistic Hot Cache
386529
