This repository was archived by the owner on Nov 4, 2024. It is now read-only.

Commit 5cff991

Documents update and annotation completion (#346)
Signed-off-by: Jiayu Wu <[email protected]>
1 parent 61bb464 commit 5cff991

11 files changed, +446 -163 lines changed

doc/benchmark.md

+10-1
Original file line number | Diff line number | Diff line change
@@ -4,6 +4,15 @@ To test performance of KVDK, you can run our benchmark tool "bench", the tool is
44

55
You can manually run individual benchmarks by following the examples shown below, or simply run our basic benchmark script "scripts/run_benchmark.py" to test all the basic read/write performance.
66

7+
To run the script, you should first build KVDK, then run:
8+
9+
```
10+
scripts/run_benchmark.py [data_type] [key distribution]
11+
```
12+
13+
data_type: Which data type to benchmark; it can be string/sorted/hash/list/blackhole/all.
14+
15+
key distribution: Distribution of keys in the benchmark workloads; it can be random/zipf/all.
716
## Fill data to new instance
817

918
To test performance, we first need to fill key-value pairs into the KVDK instance. Since KVDK does not support cross-socket access yet, we need to bind the bench program to a NUMA node:
@@ -20,7 +29,7 @@ Explanation of arguments:
2029

2130
-space: PMem space allocated to the KVDK instance.
2231

23-
-max_access_threads: Max concurrent access threads of the KVDK instance, set it to the number of the hyper-threads for performance consideration.
32+
-max_access_threads: Max number of concurrent access threads in the KVDK instance; set it to the number of hyper-threads for best performance. You can call the KVDK API from any number of threads, but if more threads run in parallel than max_access_threads, performance will degrade due to synchronization cost.
2433

2534
-type: Type of key-value pairs to benchmark, it can be "string", "hash" or "sorted".
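
The advice above to match max_access_threads to the number of hyper-threads can be sketched as follows. This is a minimal sketch, not part of KVDK or its bench tool: `PickAccessThreads` and `SuggestMaxAccessThreads` are hypothetical helper names, and the fallback of 64 is only an assumption borrowed from the default shown in `include/kvdk/configs.hpp`. When the process is bound to a single NUMA node, the reported count may still need to be adjusted by hand.

```c++
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical helper (not a KVDK function): choose a thread count from a
// reported number of hardware threads, falling back when it is unknown (0).
inline std::uint64_t PickAccessThreads(unsigned hardware_threads,
                                       std::uint64_t fallback) {
  return hardware_threads != 0 ? hardware_threads : fallback;
}

// Hypothetical helper: query the C++ runtime for the hardware thread count.
// std::thread::hardware_concurrency() is only a hint and may return 0.
inline std::uint64_t SuggestMaxAccessThreads(std::uint64_t fallback) {
  return PickAccessThreads(std::thread::hardware_concurrency(), fallback);
}
```

The returned value could then be passed as the `-max_access_threads` argument, or assigned to `kvdk::Configs::max_access_threads` before opening an instance.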
2635

doc/user_doc.md

+68-30
@@ -1,9 +1,9 @@
11
KVDK
22
=======
33

4-
KVDK(Key-Value Development Kit) is a Key-Value store for Persistent memory(PMem).
4+
KVDK(Key-Value Development Kit) is a Key-Value store for Persistent Memory (PMem).
55

6-
KVDK supports both sorted and unsorted KV-Pairs.
6+
KVDK supports basic read and write operations on both sorted and unsorted KV-Pairs. It also supports some advanced features, such as **backup**, **checkpoint**, **expire key**, **atomic batch write** and **transactions**.
77

88
Code snippets in this user document are from `./examples/tutorial/cpp_api_tutorial.cpp`, which is built as `./build/examples/tutorial/cpp_api_tutorial`.
99

@@ -70,7 +70,7 @@ int main()
7070
`kvdk::Status` indicates status of KVDK function calls.
7171
Functions return `kvdk::Status::Ok` if such a function call is a success.
7272
If exceptions are raised during function calls, other `kvdk::Status` is returned,
73-
such as `kvdk::Status::MemoryOverflow`.
73+
such as `kvdk::Status::MemoryOverflow` when there is not enough memory to allocate.
7474

7575
## Close a KVDK instance
7676

@@ -97,26 +97,45 @@ int main()
9797
```
9898

9999
## Data types
100-
KVDK currently supports string type for both keys and values.
101-
### Strings
102-
All keys and values in a KVDK instance are strings.
100+
KVDK currently supports the raw string, sorted collection, hash collection and list data types.
101+
102+
### Raw String
103+
104+
All keys and values in a KVDK instance are strings. You can store or read key-value pairs directly in the global namespace via the Get, Put, Delete and Modify operations; we call these string type data in KVDK.
103105

104106
Keys are limited to have a maximum size of 64KB.
105107

106-
A string value can be at max 64MB in length by default. The maximum length can be configured when initializing a KVDK instance.
108+
A value can be at max 64MB in length by default. The maximum length can be configured when initializing a KVDK instance.
109+
110+
### Collections
111+
112+
Instead of using raw strings, you can organize key-value pairs into a collection. Each collection has its own namespace.
113+
114+
Currently we have three types of collection:
115+
116+
#### Sorted Collection
117+
118+
KV pairs in a Sorted Collection are stored in order (lexicographical order by default). They can be iterated forward or backward by an iterator, starting from an arbitrary point (at a key or between two keys). They can also be accessed directly via the SortedGet, SortedPut and SortedDelete operations.
119+
120+
#### Hash Collection
121+
122+
A Hash Collection is like Raw String with its own namespace. You can access KV pairs via the HashGet, HashPut, HashDelete and HashModify operations.
123+
124+
In the current version, operations on a hash collection perform similarly to those on a sorted collection, which is much slower than raw string, so we recommend preferring raw string or sorted collection.
125+
126+
#### List
107127

108-
## Collections
109-
All Key-Value pairs(KV-Pairs) are organized into collections.
128+
A List is a list of string elements. You can access elements at the front or back via ListPushFront, ListPushBack, ListPopFront and ListPopBack, or operate on elements by index via ListInsertAt, ListInsertBefore, ListInsertAfter and ListErase. Note that operations by index take O(n) time, while operations on the front and back take only O(1).
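
The cost difference between end operations and indexed operations can be illustrated with a standard linked list. This is a toy sketch built on `std::list`, not KVDK's List implementation; `ToyList`, `PushFront`, `PushBack` and `InsertAt` are hypothetical names chosen only to echo the operations listed above.

```c++
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for a list of string elements; this is std::list, not KVDK.
using ToyList = std::list<std::string>;

// Front/back operations only touch the ends of the list: O(1).
inline void PushFront(ToyList& l, const std::string& e) { l.push_front(e); }
inline void PushBack(ToyList& l, const std::string& e) { l.push_back(e); }

// Inserting at an index must first walk `index` links from the head: O(n).
// Returns false if `index` is past the end of the list.
inline bool InsertAt(ToyList& l, std::size_t index, const std::string& e) {
  if (index > l.size()) return false;
  auto pos = std::next(l.begin(), static_cast<std::ptrdiff_t>(index));
  l.insert(pos, e);
  return true;
}
```

In this sketch the insertion itself is constant time; the linear cost comes entirely from walking to the requested index, which is why workloads that mostly push and pop at the ends are the cheaper ones.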
110129

111-
There is an anonymous global collection with KV-Pairs directly accessible via Get, Put, Delete operations. The anonymous global collection is unsorted.
130+
### Namespace
112131

113-
Users can also create named collections.
132+
Each collection has its own namespace, so you can store the same key in every collection. However, collection names and raw string keys share one namespace, so you can't use the same name for a collection and a string key; otherwise an error status (Status::WrongType) will be returned.
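
The namespace rule can be modelled with a few in-memory maps. This is a toy model under stated assumptions, not KVDK code: `ToyEngine`, `ToyStatus` and `ToyType` are hypothetical names, and returning `NotFound` for a missing collection is a choice made for this sketch only.

```c++
#include <cassert>
#include <map>
#include <string>

// Toy model only: these names mirror the prose above and are NOT KVDK's API.
enum class ToyStatus { Ok, NotFound, WrongType };
enum class ToyType { String, SortedCollection };

class ToyEngine {
 public:
  // Store a raw string. Fails if `key` already names a collection, because
  // string keys and collection names share the global namespace.
  ToyStatus Put(const std::string& key, const std::string& value) {
    auto it = global_.find(key);
    if (it != global_.end() && it->second != ToyType::String) {
      return ToyStatus::WrongType;
    }
    global_[key] = ToyType::String;
    strings_[key] = value;
    return ToyStatus::Ok;
  }

  // Create a sorted collection. Fails if `name` is already a raw string key.
  ToyStatus SortedCreate(const std::string& name) {
    auto it = global_.find(name);
    if (it != global_.end() && it->second != ToyType::SortedCollection) {
      return ToyStatus::WrongType;
    }
    global_[name] = ToyType::SortedCollection;
    collections_[name];  // default-constructs an empty collection if absent
    return ToyStatus::Ok;
  }

  // Keys inside a collection live in that collection's own namespace.
  ToyStatus SortedPut(const std::string& name, const std::string& key,
                      const std::string& value) {
    auto it = collections_.find(name);
    if (it == collections_.end()) return ToyStatus::NotFound;
    it->second[key] = value;
    return ToyStatus::Ok;
  }

 private:
  std::map<std::string, ToyType> global_;  // shared by strings and collections
  std::map<std::string, std::string> strings_;
  std::map<std::string, std::map<std::string, std::string>> collections_;
};
```

With this model, the same key can be put into two different collections, while creating a collection whose name is already a string key (or the reverse) yields `ToyStatus::WrongType`.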
114133

115-
KVDK currently supports sorted named collections. Users can iterate forward or backward starting from an arbitrary point(at a key or between two keys) by an iterator. Elements can also be directly accessed via SortedGet, SortedPut, SortedDelete operations.
134+
## API Examples
116135

117-
## Reads and Writes in Anonymous Global Collection
136+
### Reads and Writes with String type
118137

119-
A KVDK instance provides Get, Put, Delete methods to query/modify/delete entries.
138+
A KVDK instance provides Get, Put, Delete methods to query/modify/delete raw string kvs.
120139

121140
The following code performs a series of Get, Put and Delete operations.
122141

@@ -125,7 +144,7 @@ int main()
125144
{
126145
... Open a KVDK instance as described in "Open a KVDK instance" ...
127146

128-
// Reads and Writes on Anonymous Global Collection
147+
// Reads and Writes String KV
129148
{
130149
std::string key1{"key1"};
131150
std::string key2{"key2"};
@@ -173,11 +192,11 @@ int main()
173192
}
174193
```
175194

176-
## Reads and Writes in a Named Collection
195+
### Reads and Writes in a Sorted Collection
177196

178197
A KVDK instance provides SortedGet, SortedPut, SortedDelete methods to query/modify/delete sorted entries.
179198

180-
The following code performs a series of SortedGet, SortedPut and SortedDelete operations, which also initialize a named collection implicitly.
199+
The following code performs a series of SortedGet, SortedPut and SortedDelete operations on a sorted collection.
181200

182201
```c++
183202
int main()
@@ -194,9 +213,13 @@ int main()
194213
std::string value2{"value2"};
195214
std::string v;
196215

216+
// You must create sorted collections before you do any operations on them
217+
status = engine->SortedCreate(collection1);
218+
assert(status == kvdk::Status::Ok);
219+
status = engine->SortedCreate(collection2);
220+
assert(status == kvdk::Status::Ok);
221+
197222
// Insert key1-value1 into "my_collection_1".
198-
// Implicitly create a collection named "my_collection_1" in which
199-
// key1-value1 is stored.
200223
status = engine->SortedPut(collection1, key1, value1);
201224
assert(status == kvdk::Status::Ok);
202225

@@ -206,8 +229,6 @@ int main()
206229
assert(v == value1);
207230

208231
// Insert key1-value2 into "my_collection_2".
209-
// Implicitly create a collection named "my_collection_2" in which
210-
// key1-value2 is stored.
211232
status = engine->SortedPut(collection2, key1, value2);
212233
assert(status == kvdk::Status::Ok);
213234

@@ -236,8 +257,13 @@ int main()
236257
status = engine->SortedDelete(collection1, key1);
237258
assert(status == kvdk::Status::Ok);
238259

239-
printf("Successfully performed SortedGet, SortedPut, SortedDelete operations on named "
240-
"collections.\n");
260+
// Destroy sorted collections
261+
status = engine->SortedDestroy(collection1);
262+
assert(status == kvdk::Status::Ok);
263+
status = engine->SortedDestroy(collection2);
264+
assert(status == kvdk::Status::Ok);
265+
266+
printf("Successfully performed SortedGet, SortedPut, SortedDelete operations.\n");
241267
}
242268

243269
... Do something else with KVDK instance ...
@@ -246,17 +272,18 @@ int main()
246272
}
247273
```
248274

249-
## Iterating a Named Collection
250-
The following example demonstrates how to iterate through a named collection. It also demonstrates how to iterate through a range defined by Key.
275+
### Iterating a Sorted Collection
276+
The following example demonstrates how to iterate through a sorted collection at a consistent view of data. It also demonstrates how to iterate through a range defined by Key.
251277

252278
```c++
253279
int main()
254280
{
255281
... Open a KVDK instance as described in "Open a KVDK instance" ...
256282

257-
// Iterating a Sorted Named Collection
283+
// Iterating a Sorted Collection
258284
{
259285
std::string sorted_collection{"my_sorted_collection"};
286+
engine->SortedCreate(sorted_collection);
260287
// Create toy keys and values.
261288
std::vector<std::pair<std::string, std::string>> kv_pairs;
262289
for (int i = 0; i < 10; ++i) {
@@ -282,7 +309,9 @@ int main()
282309
// Sort kv_pairs for checking the order of "my_sorted_collection".
283310
std::sort(kv_pairs.begin(), kv_pairs.end());
284311

285-
// Iterate through collection "my_sorted_collection"
312+
// Iterate through collection "my_sorted_collection". The iter is
313+
// created on a consistent view of the data at creation time, i.e. all
314+
// modifications made after you create the iter won't be observed
286315
auto iter = engine->SortedIteratorCreate(sorted_collection);
287316
iter->SeekToFirst();
288317
{
@@ -320,7 +349,7 @@ int main()
320349
}
321350
}
322351

323-
printf("Successfully iterated through a sorted named collections.\n");
352+
printf("Successfully iterated through a sorted collection.\n");
324353
engine->SortedIteratorRelease(iter);
325354
}
326355

@@ -330,7 +359,7 @@ int main()
330359
}
331360
```
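
The "consistent view" behaviour described in the example above can be pictured with a toy iterator that captures its data when it is created. This is only an illustration: it copies a whole `std::map`, which is a deliberately simple substitute and not how KVDK produces its view. `ToySorted`, `ToySnapshotIterator` and `CollectKeys` are hypothetical names.

```c++
#include <cassert>
#include <map>
#include <string>
#include <vector>

using ToySorted = std::map<std::string, std::string>;

// Toy iterator that copies the collection when it is created, so later
// changes to the source collection are never observed through it.
class ToySnapshotIterator {
 public:
  explicit ToySnapshotIterator(const ToySorted& source)
      : view_(source), pos_(view_.begin()) {}
  // Copying would leave pos_ pointing into another object's view_.
  ToySnapshotIterator(const ToySnapshotIterator&) = delete;
  ToySnapshotIterator& operator=(const ToySnapshotIterator&) = delete;

  void SeekToFirst() { pos_ = view_.begin(); }
  // Position at the first key that is >= target.
  void Seek(const std::string& target) { pos_ = view_.lower_bound(target); }
  bool Valid() const { return pos_ != view_.end(); }
  void Next() { ++pos_; }
  const std::string& Key() const { return pos_->first; }
  const std::string& Value() const { return pos_->second; }

 private:
  ToySorted view_;                  // private copy taken at creation time
  ToySorted::const_iterator pos_;   // current position inside view_
};

// Walk the whole view from the first key and collect the keys in order.
inline std::vector<std::string> CollectKeys(ToySnapshotIterator& it) {
  std::vector<std::string> keys;
  for (it.SeekToFirst(); it.Valid(); it.Next()) keys.push_back(it.Key());
  return keys;
}
```

If a key is added to or erased from the source map after the iterator is constructed, `CollectKeys` still returns the keys that existed at construction time.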
332361

333-
## Atomic Updates
362+
### Atomic Updates
334363
KVDK supports organizing a series of Put, Delete operations into a `kvdk::WriteBatch` object as an atomic operation. If KVDK fails to apply the `kvdk::WriteBatch` object as a whole, e.g. the system shuts down while applying the batch, it will roll back to the state right before applying the `kvdk::WriteBatch`.
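
The all-or-nothing semantics can be sketched in memory before looking at the real API. This toy uses copy-and-swap on a `std::map`, which is a stand-in technique and not KVDK's persistent rollback mechanism. `ToyWriteBatch`, `ToyStore`, `ApplyBatch` and the `fail_after` parameter (which simulates an interruption) are all hypothetical.

```c++
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class ToyOpKind { Put, Delete };
struct ToyOp {
  ToyOpKind kind;
  std::string key;
  std::string value;
};

// Toy batch: records operations without applying them.
class ToyWriteBatch {
 public:
  void Put(const std::string& key, const std::string& value) {
    ops_.push_back({ToyOpKind::Put, key, value});
  }
  void Delete(const std::string& key) {
    ops_.push_back({ToyOpKind::Delete, key, ""});
  }
  const std::vector<ToyOp>& ops() const { return ops_; }

 private:
  std::vector<ToyOp> ops_;
};

using ToyStore = std::map<std::string, std::string>;

// Apply every operation to a staged copy and publish it with one swap.
// `fail_after` simulates an interruption after that many operations; in that
// case the function returns false and `store` is left exactly as it was.
inline bool ApplyBatch(ToyStore& store, const ToyWriteBatch& batch,
                       std::size_t fail_after = static_cast<std::size_t>(-1)) {
  ToyStore staged = store;
  std::size_t applied = 0;
  for (const ToyOp& op : batch.ops()) {
    if (applied == fail_after) return false;
    if (op.kind == ToyOpKind::Put) {
      staged[op.key] = op.value;
    } else {
      staged.erase(op.key);
    }
    ++applied;
  }
  store.swap(staged);
  return true;
}
```

An interrupted batch leaves the store in its pre-batch state, and a completed batch makes all of its writes visible together, which is the guarantee the paragraph above describes.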
335364

336365
```c++
@@ -387,7 +416,12 @@ A KVDK instance can be accessed by multiple read and write threads safely. Synch
387416
Users can configure KVDK to adapt to their system environment by setting up a `kvdk::Configs` object and passing it to 'kvdk::Engine::Open' when initializing a KVDK instance.
388417

389418
### Max Access Threads
390-
Maximum number of access threads is specified by `kvdk::Configs::max_access_threads`. Defaulted to 48. It's recommended to set this number to the number of threads provided by CPU.
419+
The maximum number of internal access threads in KVDK is specified by `kvdk::Configs::max_access_threads`. It defaults to 64. It's recommended to set this number to the number of threads provided by the CPU.
420+
421+
You can call the KVDK API from any number of threads, but if more threads run in parallel than max_access_threads, performance will degrade due to synchronization cost.
422+
423+
### Clean Threads
424+
KVDK reclaims the space of updated/deleted data in the background with a dynamic number of clean threads. You can specify the maximum number of clean threads with `kvdk::Configs::clean_threads`, which defaults to 8. You can configure more clean threads in delete-intensive workloads to avoid exhausting space.
391425

392426
### PMem File Size
393427
`kvdk::Configs::pmem_file_size` specifies the space allocated to a KVDK instance. Defaulted to 2^38Bytes = 256GB.
@@ -418,3 +452,7 @@ Specified by `kvdk::Configs::hash_bucket_num`. Greater number will improve perfo
418452

419453
### Buckets per Slot
420454
Specified by `kvdk::Configs::num_buckets_per_slot`. Smaller number will improve performance by reducing lock contentions and improving caching at the cost of greater DRAM space. Please read Architecture Documentation for details before tuning this parameter.
455+
456+
## Advanced features and more API
457+
458+
Please read examples/tutorial for more API and advanced features in KVDK.

engine/hash_collection/hash_list.hpp

+11
@@ -82,6 +82,17 @@ class HashList : public Collection {
8282
// Notice: the key being deleted should already be locked by the engine
8383
WriteResult Delete(const StringView& key, TimestampType timestamp);
8484

85+
// Modify value of "key" in the hash list
86+
//
87+
// Args:
88+
// * modify_func: customized function to modify existing value of key. See
89+
// definition of ModifyFunc (types.hpp) for more details.
90+
// * modify_args: customized arguments of modify_func.
91+
//
92+
// Return:
93+
// Status::Ok if the modification succeeds.
94+
// Status::Abort if the modify function aborts the modification.
95+
// Returns other non-Ok status on any error.
8596
WriteResult Modify(const StringView key, ModifyFunc modify_func,
8697
void* modify_args, TimestampType timestamp);
8798

engine/kv_engine.cpp

+9-9
@@ -1226,10 +1226,10 @@ Status KVEngine::batchWriteRollbackLogs() {
12261226
return Status::Ok;
12271227
}
12281228

1229-
Status KVEngine::GetTTL(const StringView str, TTLType* ttl_time) {
1229+
Status KVEngine::GetTTL(const StringView key, TTLType* ttl_time) {
12301230
*ttl_time = kInvalidTTL;
1231-
auto ul = hash_table_->AcquireLock(str);
1232-
auto res = lookupKey<false>(str, ExpirableRecordType);
1231+
auto ul = hash_table_->AcquireLock(key);
1232+
auto res = lookupKey<false>(key, ExpirableRecordType);
12331233

12341234
if (res.s == Status::Ok) {
12351235
ExpireTimeType expire_time;
@@ -1266,15 +1266,15 @@ Status KVEngine::TypeOf(StringView key, ValueType* type) {
12661266
if (res.s == Status::Ok) {
12671267
switch (res.entry_ptr->GetIndexType()) {
12681268
case PointerType::Skiplist: {
1269-
*type = ValueType::SortedSet;
1269+
*type = ValueType::SortedCollection;
12701270
break;
12711271
}
12721272
case PointerType::List: {
12731273
*type = ValueType::List;
12741274
break;
12751275
}
12761276
case PointerType::HashList: {
1277-
*type = ValueType::HashSet;
1277+
*type = ValueType::HashCollection;
12781278
break;
12791279
}
12801280
case PointerType::StringRecord: {
@@ -1289,7 +1289,7 @@ Status KVEngine::TypeOf(StringView key, ValueType* type) {
12891289
return res.s == Status::Outdated ? Status::NotFound : res.s;
12901290
}
12911291

1292-
Status KVEngine::Expire(const StringView str, TTLType ttl_time) {
1292+
Status KVEngine::Expire(const StringView key, TTLType ttl_time) {
12931293
auto thread_holder = AcquireAccessThread();
12941294

12951295
int64_t base_time = TimeUtils::millisecond_time();
@@ -1298,10 +1298,10 @@ Status KVEngine::Expire(const StringView str, TTLType ttl_time) {
12981298
}
12991299

13001300
ExpireTimeType expired_time = TimeUtils::TTLToExpireTime(ttl_time, base_time);
1301-
auto ul = hash_table_->AcquireLock(str);
1301+
auto ul = hash_table_->AcquireLock(key);
13021302
auto snapshot_holder = version_controller_.GetLocalSnapshotHolder();
13031303
// TODO: maybe have a wrapper function(lookupKeyAndMayClean).
1304-
auto lookup_result = lookupKey<false>(str, ExpirableRecordType);
1304+
auto lookup_result = lookupKey<false>(key, ExpirableRecordType);
13051305
if (lookup_result.s == Status::Outdated) {
13061306
return Status::NotFound;
13071307
}
@@ -1313,7 +1313,7 @@ Status KVEngine::Expire(const StringView str, TTLType ttl_time) {
13131313
ul.unlock();
13141314
version_controller_.ReleaseLocalSnapshot();
13151315
lookup_result.s = Modify(
1316-
str,
1316+
key,
13171317
[](const std::string* old_val, std::string* new_val, void*) {
13181318
new_val->assign(*old_val);
13191319
return ModifyOperation::Write;

engine/kv_engine.hpp

+2-2
@@ -77,13 +77,13 @@ class KVEngine : public Engine {
7777
// 1. Expire assumes that str is not duplicated among all types, which is not
7878
// implemented yet
7979
// 2. Expire is not compatible with checkpoint for now
80-
Status Expire(const StringView str, TTLType ttl_time) final;
80+
Status Expire(const StringView key, TTLType ttl_time) final;
8181
// Get time to expire of key
8282
//
8383
// Notice:
8484
// Expire assumes that str is not duplicated among all types, which is not
8585
// implemented yet
86-
Status GetTTL(const StringView str, TTLType* ttl_time) final;
86+
Status GetTTL(const StringView key, TTLType* ttl_time) final;
8787

8888
Status TypeOf(StringView key, ValueType* type) final;
8989

include/kvdk/configs.hpp

+4-5
@@ -19,9 +19,6 @@ enum class LogLevel : uint8_t {
1919
None,
2020
};
2121

22-
// A snapshot indicates a immutable view of a KVDK engine at a certain time
23-
struct Snapshot {};
24-
2522
// Configs of created sorted collection
2623
// For correctness of encoding, please add new config field in the end of the
2724
// existing fields
@@ -31,12 +28,14 @@ struct SortedCollectionConfigs {
3128
};
3229

3330
struct Configs {
31+
// TODO: rename to concurrent internal threads
32+
//
3433
// Max number of concurrent threads read/write the kvdk instance internally.
35-
// Set it >= your CPU core number to get best performance
34+
// Set it to the number of the hyper-threads to get best performance
3635
//
3736
// Notice: you can call the KVDK API with any number of threads, but if you run
3837
// more parallel threads than max_access_threads, the performance will be
39-
// damaged due to synchronization cost
38+
// degraded due to synchronization cost
4039
uint64_t max_access_threads = 64;
4140

4241
// Size of PMem space to store KV data, this is not scalable in current
