Revert "Merge pull request rqlite#1149 from rqlite/delete-docs"
This reverts commit 2652f29, reversing
changes made to 5af24e3.
otoolep committed Jan 7, 2023
1 parent 4727431 commit 1174a0c
Showing 20 changed files with 1,345 additions and 1 deletion.
156 changes: 156 additions & 0 deletions DOC/AUTO_CLUSTERING.md
@@ -0,0 +1,156 @@
# Automatic clustering
This document describes various ways to dynamically form rqlite clusters, which is particularly useful for automating your deployment of rqlite.

> :warning: **This functionality was introduced in version 7.0. It does not exist in earlier releases.**
## Contents
* [Quickstart](#quickstart)
* [Automatic Bootstrapping](#automatic-bootstrapping)
* [Using DNS for Bootstrapping](#using-dns-for-bootstrapping)
* [DNS SRV](#dns-srv)
* [Kubernetes](#kubernetes)
* [Consul](#consul)
* [etcd](#etcd)
* [Next steps](#next-steps)
* [Customizing your configuration](#customizing-your-configuration)
* [Running multiple different clusters](#running-multiple-different-clusters)
* [Design](#design)

## Quickstart

### Automatic Bootstrapping
While [manually creating a cluster](https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGMT.md) is simple, it does suffer one drawback -- you must start one node first, and with different options, so it can become the Leader. _Automatic Bootstrapping_, in contrast, allows you to start all the nodes at once, and in a very similar manner. **You just need to know the network addresses of the nodes ahead of time**.

For simplicity, let's assume you want to run a 3-node rqlite cluster. The network addresses of the nodes are `$HOST1`, `$HOST2`, and `$HOST3`. To bootstrap the cluster, use the `-bootstrap-expect` option like so:

Node 1:
```bash
rqlited -node-id 1 -http-addr=$HOST1:4001 -raft-addr=$HOST1:4002 \
-bootstrap-expect 3 -join http://$HOST1:4001,http://$HOST2:4001,http://$HOST3:4001 data
```
Node 2:
```bash
rqlited -node-id 2 -http-addr=$HOST2:4001 -raft-addr=$HOST2:4002 \
-bootstrap-expect 3 -join http://$HOST1:4001,http://$HOST2:4001,http://$HOST3:4001 data
```
Node 3:
```bash
rqlited -node-id 3 -http-addr=$HOST3:4001 -raft-addr=$HOST3:4002 \
-bootstrap-expect 3 -join http://$HOST1:4001,http://$HOST2:4001,http://$HOST3:4001 data
```

`-bootstrap-expect` should be set to the number of nodes that must be available before the bootstrapping process will commence, in this case 3. You also set `-join` to the HTTP URLs of all 3 nodes in the cluster. **It's also required that each launch command has the same values for `-bootstrap-expect` and `-join`.**

After the cluster has formed, you can launch more nodes with the same options. A node will always attempt a normal cluster-join using the given join addresses first, before trying the bootstrap approach.
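
For example, a fourth node (with hypothetical network address `$HOST4`) could be added later using the same style of command:
```bash
rqlited -node-id 4 -http-addr=$HOST4:4001 -raft-addr=$HOST4:4002 \
-bootstrap-expect 3 -join http://$HOST1:4001,http://$HOST2:4001,http://$HOST3:4001 data
```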

#### Docker
With Docker you can launch every node identically:
```bash
docker run rqlite/rqlite -bootstrap-expect 3 -join http://$HOST1:4001,http://$HOST2:4001,http://$HOST3:4001
```
where `$HOST[1-3]` are the expected network addresses of the containers.

__________________________

### Using DNS for Bootstrapping
You can also use the Domain Name System (DNS) to bootstrap a cluster. This is similar to automatic clustering, but doesn't require you to specify the network addresses of other nodes at the command line. Instead you create a DNS record for the host `rqlite.local`, with an [A Record](https://www.cloudflare.com/learning/dns/dns-records/dns-a-record/) for each rqlite node's IP address.

To launch a node with node ID `$ID` and network address `$HOST`, using DNS for cluster bootstrap, execute the following (example) command:
```bash
rqlited -node-id $ID -http-addr=$HOST:4001 -raft-addr=$HOST:4002 \
-disco-mode=dns -disco-config='{"name":"rqlite.local"}' -bootstrap-expect 3 data
```
You would launch other nodes similarly, setting `$ID` and `$HOST` as required for each node. In the example above, resolving `rqlite.local` should result in 3 IP addresses.
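
For example (the IP addresses below are purely illustrative), you could verify the record with `dig`:
```bash
# resolving the bootstrap hostname should return one A record per node
dig +short rqlite.local
# 10.0.0.1
# 10.0.0.2
# 10.0.0.3
```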

#### DNS SRV
Using [DNS SRV](https://www.cloudflare.com/learning/dns/dns-records/dns-srv-record/) gives you more control over the rqlite node address details returned by DNS, including the HTTP port each node is listening on. This means that, unlike with simple DNS records, each rqlite node can listen on a different HTTP port. Simple DNS records are probably good enough for most situations, however.

To launch a node using DNS SRV bootstrap, execute the following (example) command:
```bash
rqlited -node-id $ID -http-addr=$HOST:4001 -raft-addr=$HOST:4002 \
-disco-mode=dns-srv -disco-config='{"name":"rqlite.local","service":"rqlite-svc"}' -bootstrap-expect 3 data
```
You would launch other nodes similarly, setting `$ID` and `$HOST` as required for each node. In the example above, rqlite will look up SRV records at `_rqlite-svc._tcp.rqlite.local`.
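
For example (the targets and ports below are purely illustrative), the SRV records might look like this when queried with `dig`:
```bash
# each SRV record can point at a different host and HTTP port
dig +short _rqlite-svc._tcp.rqlite.local SRV
# 0 0 4001 node1.rqlite.local.
# 0 0 4003 node2.rqlite.local.
# 0 0 4005 node3.rqlite.local.
```
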
__________________________

### Kubernetes
DNS-based approaches can be quite useful for many deployment scenarios, in particular systems like Kubernetes. To learn how to deploy rqlite on Kubernetes, check the [Kubernetes deployment guide](https://github.com/rqlite/rqlite/blob/master/DOC/KUBERNETES.md).
__________________________

### Consul
Another approach uses [Consul](https://www.consul.io/) to coordinate clustering. The advantage of this approach is that you do not need to know the network addresses of all nodes ahead of time.

Let's assume your Consul cluster is running at `http://example.com:8500`. Let's also assume that you are going to run 3 rqlite nodes, each node on a different machine. Launch your rqlite nodes as follows:

Node 1:
```bash
rqlited -node-id $ID1 -http-addr=$HOST1:4001 -raft-addr=$HOST1:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' data
```
Node 2:
```bash
rqlited -node-id $ID2 -http-addr=$HOST2:4001 -raft-addr=$HOST2:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' data
```
Node 3:
```bash
rqlited -node-id $ID3 -http-addr=$HOST3:4001 -raft-addr=$HOST3:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' data
```

These three nodes will automatically find each other, and cluster. You can start the nodes in any order and at any time. Furthermore, the cluster Leader will continually update Consul with its address. This means other nodes can be launched later and automatically join the cluster, even if the Leader changes. Refer to the [_Next Steps_](#next-steps) documentation below for further details on Consul configuration.
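
For example, a node added some time later (the ID `$ID4` and address `$HOST4` are hypothetical placeholders) uses exactly the same style of command, discovering the current Leader via Consul:
```bash
rqlited -node-id $ID4 -http-addr=$HOST4:4001 -raft-addr=$HOST4:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' data
```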

#### Docker
It's even easier with Docker, as you can launch every node almost identically:
```bash
docker run rqlite/rqlite -disco-mode=consul-kv -disco-config '{"address":"example.com:8500"}'
```
__________________________

### etcd
A third approach uses [etcd](https://www.etcd.io/) to coordinate clustering. Autoclustering with etcd is very similar to using Consul. As with Consul, the advantage of this approach is that you do not need to know the network addresses of all the nodes ahead of time.

Let's assume etcd is available at `example.com:2379`.

Node 1:
```bash
rqlited -node-id $ID1 -http-addr=$HOST1:4001 -raft-addr=$HOST1:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints":["example.com:2379"]}' data
```
Node 2:
```bash
rqlited -node-id $ID2 -http-addr=$HOST2:4001 -raft-addr=$HOST2:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints":["example.com:2379"]}' data
```
Node 3:
```bash
rqlited -node-id $ID3 -http-addr=$HOST3:4001 -raft-addr=$HOST3:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints":["example.com:2379"]}' data
```
Like with Consul autoclustering, the cluster Leader will continually report its address to etcd. Refer to the [_Next Steps_](#next-steps) documentation below for further details on etcd configuration.

#### Docker
```bash
docker run rqlite/rqlite -disco-mode=etcd-kv -disco-config '{"endpoints":["example.com:2379"]}'
```

## Next Steps
### Customizing your configuration
For detailed control over Discovery configuration, `-disco-config` can be either an actual JSON string, or a path to a file containing a JSON-formatted configuration. The former option may be more convenient if the configuration you need to supply is very short, as in the examples above.

The examples above demonstrate simple configurations, and many real deployments may require more detailed configuration. For example, your Consul system might only be reachable over HTTPS. To more fully configure rqlite for Discovery, consult the relevant configuration specification below. You must create a JSON-formatted configuration which matches that described in the source code. A sketch of supplying the configuration as a file follows the links below.

- [Full Consul configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/consul/config.go)
- [Full etcd configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/etcd/config.go)
- [Full DNS configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/dns/config.go)
- [Full DNS SRV configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/dnssrv/config.go)
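
As a sketch, the configuration can be supplied as a file rather than an inline string. Only the `address` field below appears in the examples above; the `scheme` field is an assumption, so check the Consul configuration description linked above for the exact fields your deployment needs:
```bash
# consul.json -- field names other than "address" are assumptions; see the Consul config spec above
cat > consul.json <<'EOF'
{
  "address": "example.com:8500",
  "scheme": "https"
}
EOF

rqlited -node-id $ID -http-addr=$HOST:4001 -raft-addr=$HOST:4002 \
-disco-mode consul-kv -disco-config consul.json data
```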

#### Running multiple different clusters
If you wish a single Consul or etcd key-value system to support multiple rqlite clusters, then set the `-disco-key` command line argument to a different value for each cluster. To run multiple rqlite clusters with DNS, use a different domain name per cluster.
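
For example, two clusters sharing the same Consul system could be launched with different keys (the key names below are illustrative):
```bash
# nodes of the first cluster
rqlited -node-id $ID -http-addr=$HOST:4001 -raft-addr=$HOST:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' -disco-key cluster-a data

# nodes of the second cluster
rqlited -node-id $ID -http-addr=$HOST:4001 -raft-addr=$HOST:4002 \
-disco-mode consul-kv -disco-config '{"address":"example.com:8500"}' -disco-key cluster-b data
```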

## Design
When using Automatic Bootstrapping, each node notifies all other nodes of its existence. The first node to have a record of enough nodes (set by `-bootstrap-expect`) forms the cluster. Only one node can bootstrap the cluster; any other node that attempts to do so later will fail, and will instead become a Follower in the new cluster.

When using either Consul or etcd for automatic clustering, rqlite uses the key-value store of each system. In each case the Leader atomically sets its HTTP URL, allowing other nodes to discover it. To prevent multiple nodes updating the Leader key at once, nodes use a check-and-set operation, only updating the Leader key if its value has not changed since it was last read by the node. See [this blog post](https://www.philipotoole.com/rqlite-7-0-designing-node-discovery-and-automatic-clustering/) for more details on the design.

For DNS-based discovery, the rqlite nodes simply resolve the hostname, and use the returned network addresses, once the number of returned addresses is at least as great as the `-bootstrap-expect` value. Clustering then proceeds as though the network addresses were passed at the command line via `-join`.
34 changes: 34 additions & 0 deletions DOC/BACKUPS.md
@@ -0,0 +1,34 @@
# Backups

rqlite supports hot backing up a node. You can retrieve and write a copy of the underlying SQLite database to a file via the CLI:
```
127.0.0.1:4001> .backup bak.sqlite3
backup file written successfully
```
This command will write the SQLite database file to `bak.sqlite3`.

You can also access the rqlite API directly, via an HTTP `GET` request to the endpoint `/db/backup`. For example, using `curl`, and assuming the node is listening on `localhost:4001`, you could retrieve a backup as follows:
```bash
curl -s -XGET localhost:4001/db/backup -o bak.sqlite3
```
Note that if the node is not the Leader, the node will transparently forward the request to the Leader, wait for the backup data from the Leader, and return it to the client. If, instead, you want a backup of the SQLite database of the actual node that receives the request, add `noleader` to the URL as a query parameter.
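
For example, to retrieve the receiving node's own copy of the database, regardless of whether it is the Leader:
```bash
curl -s -XGET 'localhost:4001/db/backup?noleader' -o bak.sqlite3
```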

If you do not wish a Follower to transparently forward a backup request to the Leader, add `redirect` to the URL as a query parameter. In that case, if a Follower receives a backup request, the Follower will respond with [HTTP 301 Moved Permanently](https://en.wikipedia.org/wiki/HTTP_301) and include the address of the Leader in the `Location` header of the response. It is then up to the client to re-issue the request to the Leader.
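
For example, `curl` can be told to follow the redirect to the Leader itself:
```bash
# -L makes curl follow the Location header in the 301 response
curl -s -L -XGET 'localhost:4001/db/backup?redirect' -o bak.sqlite3
```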

In either case the generated file can then be used to restore a node (or cluster) using the [restore API](https://github.com/rqlite/rqlite/blob/master/DOC/RESTORE_FROM_SQLITE.md).

## Generating a SQL text dump
You can dump the database in SQL text format via the CLI as follows:
```
127.0.0.1:4001> .dump bak.sql
SQL text file written successfully
```
The API can also be accessed directly:
```bash
curl -s -XGET localhost:4001/db/backup?fmt=sql -o bak.sql
```

## Backup isolation level
The isolation offered by binary backups is `READ COMMITTED`. This means that any changes made by transactions that take place during the backup will be reflected in the backup immediately once each transaction is committed, but not before.

See the [SQLite documentation](https://www.sqlite.org/isolation.html) for more details.
58 changes: 58 additions & 0 deletions DOC/BULK.md
@@ -0,0 +1,58 @@
# Bulk API
The bulk API allows multiple updates or queries to be executed in a single request. Both non-parameterized and parameterized requests are supported by the Bulk API. The API does not support mixing parameterized and non-parameterized forms in a single request.

A bulk update is contained within a single Raft log entry, so round-trips between nodes are kept to a minimum. This should result in much better throughput, if it is possible to use this kind of update. You can also ask rqlite to do the batching for you automatically, through the use of [_Queued Writes_](https://github.com/rqlite/rqlite/blob/master/DOC/QUEUED_WRITES.md). This relieves the client of doing any batching before transmitting a request to rqlite.

## Updates
Bulk updates are supported. To execute multiple statements in one HTTP call, simply include the statements in the JSON array:

_Non-parameterized example:_
```bash
curl -XPOST 'localhost:4001/db/execute?pretty&timings' -H "Content-Type: application/json" -d "[
\"INSERT INTO foo(name) VALUES('fiona')\",
\"INSERT INTO foo(name) VALUES('sinead')\"
]"
```
_Parameterized example:_
```bash
curl -XPOST 'localhost:4001/db/execute?pretty&timings' -H "Content-Type: application/json" -d '[
["INSERT INTO foo(name) VALUES(?)", "fiona"],
["INSERT INTO foo(name) VALUES(?)", "sinead"]
]'
```

The response is of the form:

```json
{
"results": [
{
"last_insert_id": 1,
"rows_affected": 1,
"time": 0.00759015
},
{
"last_insert_id": 2,
"rows_affected": 1,
"time": 0.00669015
}
],
"time": 0.869015
}
```
### Atomicity
Because a bulk operation is contained within a single Raft log entry, and only one Raft log entry is ever processed at one time, a bulk operation will never be interleaved with other requests.

### Transaction support
You may still wish to set the `transaction` flag when issuing a bulk update. This ensures that if any error occurs while processing the bulk update, all changes will be rolled back.
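
For example, a sketch of a bulk update wrapped in a transaction, assuming `transaction` is passed as a query parameter as with other rqlite write requests:
```bash
curl -XPOST 'localhost:4001/db/execute?pretty&timings&transaction' -H "Content-Type: application/json" -d '[
["INSERT INTO foo(name) VALUES(?)", "fiona"],
["INSERT INTO foo(name) VALUES(?)", "sinead"]
]'
```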

## Queries
If you want to execute more than one query per HTTP request then perform a POST, and place the queries in the body of the request as a JSON array. For example:

```bash
curl -XPOST 'localhost:4001/db/query?pretty' -H "Content-Type: application/json" -d '[
"SELECT * FROM foo",
"SELECT * FROM bar"
]'
```
Parameterized statements are also supported.
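
For example, a parameterized bulk query uses the same nested-array form as parameterized updates:
```bash
curl -XPOST 'localhost:4001/db/query?pretty' -H "Content-Type: application/json" -d '[
["SELECT * FROM foo WHERE name=?", "fiona"],
["SELECT * FROM bar WHERE id=?", 1]
]'
```
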
38 changes: 38 additions & 0 deletions DOC/CLI.md
@@ -0,0 +1,38 @@
# Command Line Interface
rqlite comes with a CLI, which makes it easier to interact with a rqlite system. It is installed in the same directory as the node binary `rqlited`. Since rqlite is built on SQLite, you should consult the [SQLite query language documentation](https://www.sqlite.org/lang.html) for full details on what is supported.

> **⚠ WARNING: Only enter one command at a time at the CLI. Don't enter multiple commands at once, separated by `;`.**
> While it may work, mixing reads and writes to the database in a single CLI command results in undefined behavior.

An example session is shown below.
```sh
$ rqlite
127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
0 row affected (0.000362 sec)
127.0.0.1:4001> .tables
+------+
| name |
+------+
| foo |
+------+
127.0.0.1:4001> .schema
+---------------------------------------------------------------+
| sql |
+---------------------------------------------------------------+
| CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT) |
+---------------------------------------------------------------+
127.0.0.1:4001> INSERT INTO foo(name) VALUES("fiona")
1 row affected (0.000117 sec)
127.0.0.1:4001> SELECT * FROM foo
+----+-------+
| id | name |
+----+-------+
| 1 | fiona |
+----+-------+
127.0.0.1:4001> quit
bye~
```
You can connect the CLI to any node in a cluster, and it will automatically forward its requests to the leader if needed. Pass `-h` to `rqlite` to learn more.

## History
Command history is stored and reloaded between sessions, in a hidden file in the user's home directory named `.rqlite_history`. By default the 100 most recent commands are stored, though this value can be set explicitly via the environment variable `RQLITE_HISTFILESIZE`.
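
For example, to keep a longer history (the value below is illustrative):
```bash
export RQLITE_HISTFILESIZE=500
rqlite
```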