bug: resolver.cloudflare-eth.com is down; change default ENS resolver? #771

Open
lidel opened this issue Dec 23, 2024 · 2 comments
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@lidel
Member

lidel commented Dec 23, 2024

Problem

DNSLinks that use ENS (websites on .eth TLD) are broken for users that run boxo/gateway with default settings (incl. Kubo, Rainbow, IPFS Desktop).

The Cloudflare resolver at https://resolver.cloudflare-eth.com/dns-query is currently broken and returns no DNSLink results (note the empty "Answer":[] at the end):

$ curl -s -H "accept: application/dns-json" "https://resolver.cloudflare-eth.com/dns-query?name=_dnslink.vitalik.eth&type=TXT"
{"AD":true,"CD":false,"RA":true,"RD":true,"TC":false,"Status":3,"Question":[{"name":"_dnslink.vitalik.eth.","type":16}],"Answer":[]

We've reported the outage to Cloudflare, but if it is not fixed by January, when the team is back from holidays, we should consider removing .eth support from implicit defaults in Boxo (and Kubo), or switching the implicit default to a different DoH resolver.

Solution

We could make things more robust by supporting fallbacks (ipfs/kubo#8173), but for that we need more than one resolver, and it seems ENS has only one stable resolver atm.

So the options are:

Solution (A): do nothing, wait for resolver.cloudflare-eth.com to be fixed

This happens once a year on average. Acceptable? 🤷

Solution (B): remove default resolver for .eth

This would break all ENS websites, and require all end users to choose or set up their own resolver.
Probably not what we want given ENS+IPFS use in the wider ecosystem, but writing this down here for completeness.

Solution (C): switch URL to https://dns.eth.limo/dns-query

The DoH endpoint at https://dns.eth.limo/dns-query is a good candidate: it seems to be well maintained and returns a non-empty DNSLink in "Answer":

$ curl -s -H "accept: application/dns-json" "https://dns.eth.limo/dns-query?name=_dnslink.vitalik.eth&type=TXT"
{"Status":"0","TC":false,"Question":[{"name":"_dnslink.vitalik.eth","type":16}],"Answer":[{"name":"_dnslink.vitalik.eth","data":"dnslink=/ipfs/bafybeifvusbh4iunpvwjlowu47sxnt4hjlebx46kxi4yz5zdsoecfpkkei","type":16,"ttl":300}]}
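To make the difference between the two responses concrete, here is a minimal Python sketch that pulls DNSLink values out of a DoH JSON body. The function name `extract_dnslinks` is made up for illustration and this is not how boxo does it. It assumes only the fields visible in the two outputs quoted in this issue, and it normalizes "Status", since Cloudflare reports it as a number (3) while eth.limo reports it as a string ("0").

```python
import json

TXT = 16  # DNS record type number for TXT, as seen in the "type" fields above


def extract_dnslinks(body: str) -> tuple[int, list[str]]:
    """Return (status, dnslink_paths) parsed from a DoH JSON response body."""
    reply = json.loads(body)
    # "Status" is the DNS RCODE: 0 is NOERROR, 3 is NXDOMAIN.
    # It may arrive as an int or as a string, so normalize it.
    status = int(reply.get("Status", -1))
    links = []
    for record in reply.get("Answer") or []:
        # Assumption: some servers wrap TXT data in double quotes, so strip them.
        data = str(record.get("data", "")).strip('"')
        if record.get("type") == TXT and data.startswith("dnslink="):
            links.append(data[len("dnslink="):])
    return status, links
```

Fed the eth.limo body above, this should yield status 0 and one `/ipfs/...` path. Fed the Cloudflare body (with a closing brace added, since the pasted output appears cut off), it should yield status 3 and an empty list.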

(this is not a final decision; consider this commit a way of kicking off a conversation about what we should do in 2025 to minimize issues with proxied naming systems like ENS)

Solution (D): ?

Other ideas welcome.

@lidel lidel added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Dec 23, 2024
@aschmahmann
Contributor

A few other options:

Solution (D): Mandate multiple resolvers for defaults

Insist that any naming system that wants to be included by default in boxo provide multiple (2 or 3?) independently operated trusted DNS resolvers and then add some logic in boxo to be able to try the others should one of them fail consistently.

Some notes:

  • It can't just be a check for a 4xx / 5xx response or a SERVFAIL DNS response, because some failures (like the Cloudflare one above) manifest differently (e.g. an incorrect NXDOMAIN), which makes this more complicated
  • This is roughly the same as the comment on kubo#8173 (Add DNS Fallback Resolvers), but taking into account that some failures might not be obvious
  • The extra logic here might be reusable for something like querying IPNI providers
  • We'd have to consider removing .crypto support unless their community was willing to operate 1-2 more resolvers
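As a rough illustration of the first note, here is a Python sketch of fallback logic that does not take an empty or NXDOMAIN answer from a single resolver at face value. Everything in it is hypothetical: `resolve_with_fallback` and the injected `query` callable are names made up for this example, and boxo would need its own Go design. The assumption is that `query(resolver_url, name)` returns the list of DNSLink values it found and raises on transport, HTTP, or SERVFAIL errors.

```python
from typing import Callable, Iterable, Optional

# Hypothetical query signature: (resolver_url, name) -> DNSLink values.
# Assumed to raise an exception on transport / HTTP / SERVFAIL failures.
QueryFn = Callable[[str, str], list[str]]


def resolve_with_fallback(
    name: str, resolvers: Iterable[str], query: QueryFn
) -> Optional[list[str]]:
    """Return the first non-empty answer, or None if every reply was empty."""
    got_reply = False
    for url in resolvers:
        try:
            links = query(url, name)
        except Exception:
            continue  # obvious failure: move on to the next resolver
        got_reply = True
        if links:
            return links
        # Empty answer or NXDOMAIN: this may be a silent failure like the
        # Cloudflare outage, so keep asking the remaining resolvers.
    if not got_reply:
        raise RuntimeError("all resolvers failed")
    return None  # every resolver that replied agreed there is no DNSLink
```

The trade-off in this sketch is that a name that truly does not exist costs one query per resolver, and a positive answer from any single resolver is trusted. Tracking which resolver "fails consistently" would need extra state that is not shown here.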

Solution (E): Have the dweb.link maintainers maintain public DNSLink infra that can be more agile / adaptable to issues

This basically hides the existing problem behind a party that is likely to be more closely aligned with the boxo maintainers. On its own this trades one centralized party for another. Having the centralized party more closely aligned with the library and its dependents can be helpful, but it also seems reasonable that responsibility for the continuity of these naming systems should reside with their communities. Continuity questions are ones we'd have to bring up when considering new defaults.

Solution (F): Use default Ethereum JSON RPC endpoints rather than default ENS resolvers

There are many more Ethereum JSON RPC providers than ENS providers so we could use a library (e.g. https://github.com/wealdtech/go-ens) to do the ENS translation (and handle CCIP, etc.).

The two major downsides here are:

  1. Given folks have built businesses around being ETH RPC providers, it seems unlikely we'll have a stable one we can include by default here, and requiring an ETH RPC endpoint to be configured is not what we want from default behavior. This is likely what kills this idea ... although if this turns out to be incorrect that'd be great.
  2. It adds both maintenance and gatekeeping burden for the boxo maintainers. To be fair, it could be argued that we have some of this already by virtue of including defaults (as shown by us needing to worry about this Cloudflare outage)

Solution (G): Implement verifiable decentralized ENS resolution

Instead of using a DoH resolver for ENS, implement a way of fetching ENS data verifiably from a distributed set of peers.

Personally I'd like to see this happen, but I suspect that realistically it's a lot of work and would require some coordination and funding from the ENS community to make it happen. While I tend to agree with not wanting to pick "winners", perform gatekeeping, or deal with extra maintenance work, I feel less bad about giving some preferential treatment to protocols where:

  1. There is adoption of the protocol
  2. We can promote verifiable and resilient systems over trusted and brittle proxies

Realistically, I suspect the best path forward for now is:

  1. If this outage goes on for a while and the eth.limo folks are ok with it, go with option C
  2. Take a look at the viability / ease of implementing D, since that should be the easiest way to gain resiliency
  3. I'd be curious whether G has enough interest to be a thing, and we can discuss it with the ENS community (e.g. on their forums, or ping them about this issue), but it's definitely not a near-term solution

@MicahZoltu

For (G) I believe the recently launched Portal Network is likely at least part of the solution, combined with some sort of light client. The Portal Network is a state storage system that allows participants to distribute all state over a large decentralized network. You would still need a light client that can do ENS resolution (which likely requires running an EVM). In theory, one should be able to build a light client to achieve these goals, but I don't believe one exists as of yet.


3 participants