Merge pull request #654 from IntersectMBO/wenkokke/package-description

wenkokke · web-flow · commit 1a19e33568a5 · 2025-03-31T18:17:57.000Z
doc: add package description
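
The Performance section added in this diff gives two concrete size formulas: the Bloom filter size under `AllocFixed` (bits per physical entry times the number of physical entries) and the compact index size (approximately 66 bits per memory page). The sketch below only illustrates that arithmetic. It does not use the lsm-tree API; the function names and the sample numbers (entry count, bits per entry, page count) are hypothetical values chosen for illustration.

```haskell
module Main where

-- | Total Bloom filter size in bits under a fixed allocation:
-- bits per physical entry multiplied by the number of physical entries.
-- (Hypothetical helper, not part of the lsm-tree API.)
bloomFilterBitsFixed :: Integer -> Integer -> Integer
bloomFilterBitsFixed bitsPerPhysicalEntry numPhysicalEntries =
  bitsPerPhysicalEntry * numPhysicalEntries

-- | Approximate compact index size in bits: about 66 bits per memory page
-- (64 key bits, 1 clash bit, 1 overflow-page bit), ignoring tie breakers.
-- (Hypothetical helper, not part of the lsm-tree API.)
compactIndexBits :: Integer -> Integer
compactIndexBits numPages = 66 * numPages

-- | Round a bit count up to whole bytes.
bitsToBytes :: Integer -> Integer
bitsToBytes bits = (bits + 7) `div` 8

main :: IO ()
main = do
  let entries      = 1000000 :: Integer -- assumed number of physical entries
      bitsPerEntry = 10      :: Integer -- assumed AllocFixed argument
      pages        = 25000   :: Integer -- assumed number of memory pages
  putStrLn ("Bloom filters: "
    ++ show (bitsToBytes (bloomFilterBitsFixed bitsPerEntry entries)) ++ " bytes")
  putStrLn ("Compact index: "
    ++ show (bitsToBytes (compactIndexBits pages)) ++ " bytes")
```

With these assumed inputs, the Bloom filters take 1250000 bytes and the compact index takes 206250 bytes, which shows how both terms grow linearly in the number of physical entries and pages.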
diff --git a/lsm-tree.cabal b/lsm-tree.cabal
@@ -1,8 +1,117 @@
 cabal-version:      3.4
 name:               lsm-tree
 version:            0.1.0.0
-synopsis:           Log-structured merge-tree
-description:        Log-structured merge-tree.
+synopsis:           Log-structured merge-trees
+description:
+  This package contains an efficient implementation of on-disk key–value storage as a log-structured merge-tree, or LSM-tree.
+  An LSM-tree is a data structure for key–value mappings, similar to "Data.Map", but optimized for large tables with a high insertion volume.
+  It has support for:
+
+  *   Basic key–value operations, such as lookup, insert, and delete.
+  *   Range lookups, which efficiently retrieve the values for all keys in a given range.
+  *   Monoidal upserts (or \"mupserts\") which combine the stored and new values.
+  *   BLOB storage which associates a large auxiliary BLOB with a key.
+  *   Durable on-disk persistence and rollback via named snapshots.
+  *   Cheap table duplication where all duplicates can be independently accessed and modified.
+  *   High-performance lookups on SSDs using I\/O batching and parallelism.
+
+  This package exports two modules:
+
+  *   "Database.LSMTree.Simple"
+
+      This module exports a simplified API which picks sensible defaults for a number of configuration parameters.
+
+      It does not support mupserts or BLOBs, due to their unintuitive interaction; see [Mupserts and BLOBs](#mupsertsandblobs).
+
+      If you are looking at this package for the first time, it is strongly recommended that you start by reading this module.
+
+  *   "Database.LSMTree"
+
+      This module exports the full API.
+
+  == Mupserts and BLOBs #mupsertsandblobs#
+
+  The interaction between mupserts and BLOBs is unintuitive.
+  A mupsert updates the value associated with the key by combining the old and new value with a user-specified function.
+  However, this does not apply to any BLOB value associated with the key, which is simply overwritten by the new BLOB value.
+
+  == Portability #portability#
+
+  * This package only supports 64-bit, little-endian systems.
+  * On Windows, the package has only been tested with NTFS filesystems.
+  * On Linux, executables using this package, including test and benchmark suites, must be compiled with the [@-threaded@](https://downloads.haskell.org/ghc/latest/docs/users_guide/phases.html#ghc-flag-threaded) GHC flag, which links against the threaded RTS.
+
+  == Concurrency #concurrency#
+
+  LSM-trees can be used concurrently, but with a few restrictions:
+
+  *   Each session locks its session directory.
+      This means that a database cannot be accessed from different processes at the same time.
+  *   Tables can be used concurrently, and concurrent use of read operations such as lookups is deterministic.
+      However, concurrent use of write operations such as insert or delete with any other operation results in a race condition.
+
+  == Performance #performance#
+
+  The worst-case time and space complexities are given in [big-O notation](http://en.wikipedia.org/wiki/Big_O_notation).
+  The time cost of operations on LSM-trees is generally dominated by the number of disk I\/O actions.
+  As such, the worst-case complexity of basic operations refers to the number of disk I\/O actions.
+
+  TODO: Describe the time complexity of the basic operations.
+
+  The in-memory size of an LSM-tree is described in terms of the variable \(n\), which refers to the number of /physical/ database entries.
+  A /physical/ database entry is any key–operation pair, e.g., @Insert k v@ or @Delete k@, whereas a /logical/ database entry is determined by all physical entries with the same key.
+
+  The worst-case in-memory size of an LSM-tree is \(O(n)\).
+
+  *   The worst-case size of the write buffer is \(O(1)\).
+
+      The maximum size of the write buffer depends on the write buffer allocation strategy, which is determined by the @'confWriteBufferAlloc'@ field of @'TableConfig'@.
+      Regardless of write buffer allocation strategy, the size of the write buffer may never exceed 4GiB.
+
+      [@AllocNumEntries maxEntries@]:
+        The maximum size of the write buffer is the maximum number of entries multiplied by the average size of a key–operation pair.
+
+  *   The worst-case size of the Bloom filters is \(O(n)\).
+
+      The total size of all Bloom filters depends on the Bloom filter allocation strategy, which is determined by the @'confBloomFilterAlloc'@ field of @'TableConfig'@.
+
+      [@AllocFixed bitsPerPhysicalEntry@]:
+          The total size of all Bloom filters is the number of bits per physical entry multiplied by the number of physical entries.
+      [@AllocRequestFPR requestedFPR@]:
+          TODO: How does one determine the bloom filter size using @AllocRequestFPR@?
+
+  *   The worst-case size of the indexes is \(O(n)\).
+
+      The total size of all indexes depends on the index type, which is determined by the @'confFencePointerIndex'@ field of @'TableConfig'@.
+      The size of the various indexes is described in reference to the size of the database in [/memory pages/](https://en.wikipedia.org/wiki/Page_%28computer_memory%29).
+
+      [@OrdinaryIndex@]:
+          An ordinary index stores the maximum serialised key for each memory page.
+          The total size of all indexes is proportional to the average size of one serialised key per memory page.
+      [@CompactIndex@]:
+          A compact index stores the 64 most significant bits of the minimum serialised key for each memory page, as well as 1 bit per memory page to resolve clashes, 1 bit per memory page to mark overflow pages, and a negligible amount of memory for tie breakers.
+          The total size of all indexes is approximately 66 bits per memory page.
+
+  The total size of an LSM-tree must not exceed \(2^{41}\) physical entries.
+  Violation of this condition /is/ checked and will throw a 'TableTooLargeError'.
+
+  == Implementation
+
+  The implementation of LSM-trees in this package draws inspiration from:
+
+  *   Chris Okasaki.
+      1998.
+      \"Purely Functional Data Structures.\"
+      [doi:10.1017/CBO9780511530104](https://doi.org/10.1017/CBO9780511530104)
+  *   Niv Dayan, Manos Athanassoulis, and Stratos Idreos.
+      2017.
+      \"Monkey: Optimal Navigable Key-Value Store.\"
+      [doi:10.1145/3035918.3064054](https://doi.org/10.1145/3035918.3064054)
+  *   Subhadeep Sarkar, Dimitris Staratzis, Zichen Zhu, and Manos Athanassoulis.
+      2021.
+      \"Constructing and analyzing the LSM compaction design space.\"
+      [doi:10.14778/3476249.3476274](https://doi.org/10.14778/3476249.3476274)
+
 license:            Apache-2.0
 license-file:       LICENSE
 author: