feat: built-in content blocking based on IPIP-383 #10161

Merged
merged 22 commits into from
Oct 28, 2023

hsanjuan commented Oct 3, 2023

Fixes #8492.

This introduces "nopfs" as a preloaded plugin into Kubo.

It automatically makes Kubo watch *.deny files (syntax from IPIP-383) found in:

- /etc/ipfs/denylists
- $XDG_CONFIG_HOME/ipfs/denylists
- $IPFS_PATH/denylists

(files need to be present before boot in order to be watched).

Debug logging can be enabled with `GOLOG_LOG_LEVEL="nopfs=debug"`.

Every blocked request is logged to the "nopfs-blocks" logger, so requests for
blocked content can be logged by setting
`GOLOG_LOG_LEVEL="nopfs-blocks=warn"`:

```
WARN (...) QmRFniDxwxoG2n4AcnGhRdjqDjCM5YeUcBE75K8WXmioH3: blocked (test.deny:9)
```

Interactive/gateway users will also receive errors as responses, but with less detail:

```
Error: /ipfs/QmQvjk82hPkSaZsyJ8vNER5cmzKW7HyGX5XVusK7EAenCN is blocked and cannot be provided
```

One particularity to keep in mind is that GetMany() will silently drop blocked
blocks from the response (warnings are logged). AddMany() will act
similarly and avoid adding blocked blocks.
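
The GetMany() behaviour can be pictured with a small, self-contained sketch. This is not the nopfs wrapper: `getMany`, the string keys and the `deny` set are hypothetical stand-ins, used only to show that the caller receives fewer results than requested and no error.

```go
package main

import "fmt"

// getMany is a hypothetical stand-in for a block-fetching call. It streams
// every requested key that is not in deny, and silently skips the rest
// after passing a warning to warn.
func getMany(keys []string, deny map[string]bool, warn func(string)) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, k := range keys {
			if deny[k] {
				warn(k + ": blocked")
				continue // dropped from the response, no error returned
			}
			out <- k
		}
	}()
	return out
}

func main() {
	deny := map[string]bool{"cidB": true} // made-up denylist content
	warn := func(msg string) { fmt.Println("WARN", msg) }
	for k := range getMany([]string{"cidA", "cidB", "cidC"}, deny, warn) {
		fmt.Println("got", k)
	}
}
```

A caller that needs to know which blocks were withheld would have to compare the results against the request, or read the warnings in the log.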

The code implementing all this is actually in nopfs:

- https://github.com/ipfs-shipyard/nopfs (main library)
- https://github.com/ipfs-shipyard/nopfs/tree/master/ipfs (wrappers)

The interpretation of the list rules and block detection is well tested, but a
general review might be in order.
@hsanjuan hsanjuan requested a review from a team as a code owner October 3, 2023 14:42
@hsanjuan hsanjuan self-assigned this Oct 3, 2023
@lidel lidel self-assigned this Oct 3, 2023
@lidel lidel self-requested a review October 3, 2023 17:10

lidel commented Oct 19, 2023

Added docs. I am now testing and applying the changes caused by the recent boxo refactor, and will open the relevant PRs later today.

lidel added a commit to ipfs-shipyard/nopfs that referenced this pull request Oct 19, 2023
This applies changes necessary after ipfs/boxo#459
we need this to unblock ipfs/kubo#10161

lidel commented Oct 19, 2023

I've added tests and the /ipfs/cid/* rule works as expected on CLI, but fails on gateway (CI log)

Repro: blocking the root CID of wikipedia, and everything under it:

$ cat  $IPFS_PATH/denylists/text.deny
/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/*

It correctly blocks things in CLI:

$ ipfs resolve /ipns/en.wikipedia-on-ipfs.org/wiki/
Error: /ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki is blocked and cannot be provided

but the gateway returns blocked payload, bypasses the rule somehow:

$ curl -is http://127.0.0.1:8080/ipns/en.wikipedia-on-ipfs.org/wiki/ | grep HTTP
HTTP/1.1 200 OK
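
The intended effect of the `/ipfs/<cid>/*` rule in this repro (the root CID and everything under it are blocked) can be sketched as a simple prefix check. This is a simplified illustration, not the nopfs matcher: the function `matches` is hypothetical, it compares paths as plain strings, and treating a rule without the `/*` suffix as an exact match is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether path is covered by rule.
// A rule ending in "/*" covers the prefix itself and every path below it.
// Assumption: any other rule is compared as an exact path.
func matches(rule, path string) bool {
	path = strings.TrimSuffix(path, "/")
	if strings.HasSuffix(rule, "/*") {
		prefix := strings.TrimSuffix(rule, "/*")
		return path == prefix || strings.HasPrefix(path, prefix+"/")
	}
	return path == strings.TrimSuffix(rule, "/")
}

func main() {
	rule := "/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/*"
	root := "/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze"
	fmt.Println(matches(rule, root))          // root CID itself
	fmt.Println(matches(rule, root+"/wiki/")) // path under the root
	fmt.Println(matches(rule, "/ipfs/QmQvjk82hPkSaZsyJ8vNER5cmzKW7HyGX5XVusK7EAenCN"))
}
```

Note that an `/ipns/...` request only reaches a check like this after it has been resolved to an `/ipfs/...` path, which is why the resolver in use matters for the gateway case.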

lidel added 2 commits October 20, 2023 01:23
CLI works as expected, gateway does not respect the rule
(needs investigation)
hsanjuan commented

> but the gateway returns blocked payload, bypasses the rule somehow:
>
> $ curl -is http://127.0.0.1:8080/ipns/en.wikipedia-on-ipfs.org/wiki/ | grep HTTP
> HTTP/1.1 200 OK

Yes, I think this is because the gateway here is created without passing the Kubo resolver, so it creates one internally. This is why I added the WithResolver() option. On rainbow this currently works as expected:

curl http://127.0.0.1:8090/ipns/en.wikipedia-on-ipfs.org/wiki/
failed to resolve /ipns/en.wikipedia-on-ipfs.org/wiki/: /ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze is blocked and cannot be provided

We can fix this once the latest boxo has bubbled up to Kubo.

hacdias commented Oct 20, 2023

@hsanjuan you can use Boxo's latest commit in main here. We allow Kubo to use a commit version of Boxo and then at release time we ensure that it has a tagged version.

hsanjuan commented

> @hsanjuan you can use Boxo's latest commit in main here. We allow Kubo to use a commit version of Boxo and then at release time we ensure that it has a tagged version.

ah ok, I thought path changes etc had not been bubbled. Can do.

cosmetic change to error message, this should be more robust
hsanjuan commented

@lidel I updated nopfs to a commit that has both ipfs-shipyard/nopfs#28 and ipfs-shipyard/nopfs#27 merged. Eventually we should produce tags for boxo, nopfs and bubble that.

this ensures we test something other than default sha256

Ref. ipfs-shipyard/nopfs#28

adds missing tests for "no fetch" gateways one can expose,
in both cases the offline mode is done by passing custom
blockservice/exchange into path resolver, which means
global path resolver that has nopfs intercept is not used,
and the content blocking does not happen on these gateways.

needs to be fixed, but at least now we have tests that
fail until it is fixed.

lidel commented Oct 27, 2023

Close, but not ready yet: content blocking is not applied to "no fetch" gateways.

c99068e adds missing tests for the "no fetch" gateways one can expose. In both cases, the "no fetch" (non-recursive) mode is implemented by passing a custom blockservice/exchange into the path resolver, which means the global path resolver that has the nopfs intercept is not used, and content blocking does not happen on these gateways.

Needs to be fixed before merge, but at least now we have tests that fail on this PR to surface the problem.

I'll look into this on Friday, but if I run out of time before code freeze, this might slip from 0.24 to 0.25.

lidel added a commit that referenced this pull request Oct 27, 2023
this fixes the problem described in
#10161 (comment)
by adding explicit offline path resolvers that are backed
by offline exchange, and using them in NoFetch gateways
instead of the default online ones
lidel added a commit to ipfs/boxo that referenced this pull request Oct 28, 2023
this is the minimum we need right now to make
content blocking from
ipfs/kubo#10161
return HTTP 410 on rule match
requires ipfs/boxo#497
which is based on top of the boxo already used in kubo master
to avoid issues caused by later commits in boxo main

@lidel lidel left a comment

Switched this PR to commits that are in main branches of boxo and nopfs.
TODOs are addressed, and tests pass. Gateway returns HTTP 410.

I am merging this to ensure it is included in 0.24-rc1, allowing the wider community to test and provide feedback.
We will switch to tagged release versions before the final 0.24.

Once again, thank you @hsanjuan for your work on NOpfs and the spec, which made this possible.

@lidel lidel changed the title from "plugin: Add support for content blocking directly in Kubo" to "feat: built-in content blocking based on IPIP-383" Oct 28, 2023
@lidel lidel merged commit a0f34b1 into master Oct 28, 2023
26 checks passed
@lidel lidel deleted the nopfs branch October 28, 2023 03:34