Add tutorial on integrating kubo with index-provider (#372)
* Add tutorial on integrating kubo with index-provider

---------

Co-authored-by: Masih H. Derkani <[email protected]>
ischasny and masih authored May 31, 2023
1 parent 648d6ea commit 7e76f92
Showing 1 changed file with 104 additions and 0 deletions.
104 changes: 104 additions & 0 deletions README.md
@@ -117,6 +117,110 @@ Delegated Routing server is off by default. To enable it, add the following conf
}
```

#### Configuring Kubo to advertise content onto IPNI

Kubo supports HTTP delegated routing as of [v0.18.0](https://github.com/ipfs/kubo/releases/tag/v0.18.0). This section gives configuration examples and a few tips for getting Kubo to advertise its CIDs to
IPNI systems such as `cid.contact` through `index-provider`. Delegated Routing is still experimental, and its configuration may change from version to version.
Treat this section as a starting point for configuring your node to use IPNI; for comprehensive information, refer to the [Kubo documentation](https://docs.ipfs.tech/install/command-line/). Keep the following points in mind:

* The `index-provider` delegated routing server should run continuously as a "sidecar" to the Kubo node. `index-provider` can be restarted safely, but while it is down no new CIDs flow from Kubo to IPNI.
* Use Kubo v0.18.0 or later, which supports HTTP delegated routing, because `index-provider` no longer supports Reframe.
* Kubo advertises its data in snapshots, which means that all CIDs managed by Kubo are reprovided to the configured routers every 12/24 hours (configurable). This mechanism is similar to how the Distributed Hash Table (DHT) works. During the reproviding process, there may be significant communication between the involved processes. In between reprovides, Kubo also sends new individual CIDs to the configured routers.
* Kubo requires `index-provider` only for publishing its CIDs to IPNI. Kubo can perform IPNI lookups natively without the need for a sidecar (refer to Kubo docs on `auto` routers).
* `index-provider` must be publicly reachable. IPNI connects to it to fetch advertisement chains; if it cannot connect, CIDs will not appear in IPNI.
Ensure that your firewall allows incoming connections on the `ProviderServer` port specified in the `index-provider` configuration.

To configure `index-provider` to expose the delegated routing server, use the following configuration:

```
"DelegatedRouting": {
  "ListenMultiaddr": "/ip4/0.0.0.0/tcp/50617",
  "ProviderID": "PEER ID OF YOUR IPFS NODE",
  "Addrs": [] // List of multiaddresses that you'd like to be advertised to IPNI. If not specified, Swarm addresses of the Kubo node will be used.
}
```

Configure Kubo to publish into both DHT and IPNI:
```
"Routing": {
  "Methods": {
    "find-peers": {
      "RouterName": "WanDHT"
    },
    "find-providers": {
      "RouterName": "ParallelHelper"
    },
    "get-ipns": {
      "RouterName": "WanDHT"
    },
    "provide": {
      "RouterName": "ParallelHelper"
    },
    "put-ipns": {
      "RouterName": "WanDHT"
    }
  },
  "Routers": {
    "IndexProvider": {
      "Parameters": {
        "Endpoint": "http://127.0.0.1:50617",
        "MaxProvideBatchSize": 10000,
        "MaxProvideConcurrency": 1
      },
      "Type": "http"
    },
    "ParallelHelper": {
      "Parameters": {
        "Routers": [
          {
            "IgnoreErrors": true,
            "RouterName": "IndexProvider",
            "Timeout": "30m"
          },
          {
            "IgnoreErrors": true,
            "RouterName": "WanDHT",
            "Timeout": "30m"
          }
        ]
      },
      "Type": "parallel"
    },
    "WanDHT": {
      "Parameters": {
        "AcceleratedDHTClient": false,
        "Mode": "auto",
        "PublicIPNetwork": true
      },
      "Type": "dht"
    }
  },
  "Type": "custom"
},
```
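
A custom routing configuration is easy to break with a misspelled router name, since every entry under `Methods` and every child of a `parallel` router refers to another router by name. The following is a minimal sketch of a consistency check, assuming the `Routing` section has already been loaded into a Python dictionary (for example from the JSON printed by `ipfs config show`). The function name `undefined_router_refs` is hypothetical and not part of Kubo or `index-provider`.

```python
from typing import List


def undefined_router_refs(routing: dict) -> List[str]:
    """Return references to router names that are not defined under "Routers"."""
    routers = routing.get("Routers", {})
    missing = []
    # Every entry in "Methods" must point at a defined router.
    for method, spec in routing.get("Methods", {}).items():
        name = spec.get("RouterName")
        if name not in routers:
            missing.append(f"Methods.{method} -> {name}")
    # Composite routers (such as "parallel") list their children by name too.
    for parent, spec in routers.items():
        for child in spec.get("Parameters", {}).get("Routers", []):
            name = child.get("RouterName")
            if name not in routers:
                missing.append(f"Routers.{parent} -> {name}")
    return missing


# Illustrative input: "WanDHT" is referenced by the parallel router but never defined.
sample = {
    "Methods": {"provide": {"RouterName": "ParallelHelper"}},
    "Routers": {
        "IndexProvider": {"Type": "http", "Parameters": {"Endpoint": "http://127.0.0.1:50617"}},
        "ParallelHelper": {
            "Type": "parallel",
            "Parameters": {
                "Routers": [
                    {"RouterName": "IndexProvider"},
                    {"RouterName": "WanDHT"},
                ]
            },
        },
    },
}
print(undefined_router_refs(sample))
```

An empty list means every referenced router is defined. The check only covers naming; it does not validate parameter values or router types.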

With the above configuration, Kubo advertises its CIDs to both the DHT and IPNI and uses both for `find-providers` lookups. Additionally, set the following flag in the Kubo config to enable batch reprovides (especially useful for larger nodes):
```json
"Experimental": {
  "AcceleratedDHTClient": true
},
```

After adding a new file to your Kubo node, you should see `index-provider` logs starting to appear immediately. If that doesn't happen, it's likely that Kubo has been configured incorrectly.

`index-provider` publishes announcements about new advertisements on a libp2p pub/sub topic that IPNI systems such as `cid.contact` listen on. Once IPNI sees a new announcement,
it reaches out to `index-provider` to download the advertisement chain and index the content. Keep the following in mind:
* There might be a delay before IPNI picks up an announcement from libp2p pub/sub, depending on the network, the number of hops, and so on.
* There might be a delay before IPNI reaches out to `index-provider`, depending on how busy the overall system is.
* If no communication has been received from IPNI within a reasonable amount of time, `index-provider` is most likely not reachable from the Internet. You can verify whether
it's reachable by using the `index-provider` CLI, for example `provider ls ad --provider-addr-info=/ip4/76.21.23.45/tcp/24001/p2p/12D3KooWPNbEgjdBNeaCGpsgCrPRETe4uBZf1ShFXSdN18ys` (replace with the correct
multiaddress and peer ID of your `index-provider`). Run this command from a different machine than the one hosting `index-provider`.
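
As a quick first check before reaching for the CLI, you can test whether the TCP port in the multiaddress accepts connections at all. The sketch below is only an illustration: `parse_ip4_tcp` and `tcp_reachable` are hypothetical helper names, only `/ip4/.../tcp/...` multiaddresses are handled, and a successful TCP connection does not prove that the libp2p handshake or advertisement fetch will succeed.

```python
import socket
from typing import Tuple


def parse_ip4_tcp(multiaddr: str) -> Tuple[str, int]:
    """Extract (host, port) from /ip4/<host>/tcp/<port>[/p2p/<peer id>]."""
    parts = multiaddr.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "ip4" or parts[2] != "tcp":
        raise ValueError(f"unsupported multiaddr: {multiaddr}")
    return parts[1], int(parts[3])


def tcp_reachable(multiaddr: str, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to the multiaddr's host and port succeeds."""
    host, port = parse_ip4_tcp(multiaddr)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unroutable: treat all of these as "not reachable".
        return False


# Parsing only; no network traffic is generated by this line.
print(parse_ip4_tcp("/ip4/76.21.23.45/tcp/24001"))
```

As with the CLI command, call `tcp_reachable` with your own multiaddress from a machine outside the network that hosts `index-provider`; a check from the same host says nothing about the firewall.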

Here are a few additional configuration options to consider:

* `ChunkSize`: `index-provider` publishes advertisements with a certain number of CIDs in each chunk. An advertisement needs to accumulate enough CIDs before it gets published. You can reduce the `ChunkSize` parameter to publish data more quickly. The default value is 1000.
* `AdFlushFrequency`: `index-provider` can publish advertisements before they are full, based on the `AdFlushFrequency` parameter. In other words, an advertisement is published either when it reaches the `ChunkSize` or after the specified `AdFlushFrequency`, whichever comes first. Lowering this value publishes data more quickly. The default is 10 minutes.
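
The interaction between the two options can be illustrated with a toy model. This is not `index-provider`'s implementation: the `AdBuffer` class is hypothetical, time is passed in explicitly as seconds, and measuring the flush interval from the previous publication is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdBuffer:
    chunk_size: int = 1000        # mirrors the ChunkSize default quoted above
    flush_seconds: float = 600.0  # mirrors the 10-minute AdFlushFrequency default
    pending: List[str] = field(default_factory=list)
    published: List[List[str]] = field(default_factory=list)
    last_flush: float = 0.0

    def add(self, cid: str, now: float) -> None:
        """Buffer a CID and publish as soon as the chunk is full."""
        self.pending.append(cid)
        if len(self.pending) >= self.chunk_size:
            self._flush(now)

    def tick(self, now: float) -> None:
        """Publish a partially filled chunk once the flush interval has elapsed."""
        if self.pending and now - self.last_flush >= self.flush_seconds:
            self._flush(now)

    def _flush(self, now: float) -> None:
        self.published.append(self.pending)
        self.pending = []
        self.last_flush = now


buf = AdBuffer(chunk_size=3, flush_seconds=10.0)
for t, cid in enumerate(["cid-a", "cid-b", "cid-c", "cid-d"]):
    buf.add(cid, now=float(t))  # the third CID fills the chunk and triggers a publish
buf.tick(now=20.0)              # the leftover CID goes out on the timer
print(buf.published)
```

In this model, a node that adds content slowly waits up to the flush interval for each publication, which is why lowering either value shortens the delay before data appears.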

### Embedding index provider integration

The [root go module](go.mod) offers a set of reusable libraries that can be used to embed index