Identify each node client with libp2p UserAgent #3

alrevuelta · 2022-10-26T10:57:44Z

Background

Since we have multiple waku implementations that will eventually coexist in the same network, it really comes in handy to be able to identify the so-called "client diversity" (a term taken from the Ethereum community). This metric allows having an idea of the number of nodes of each type that the network contains: nwaku, js-waku, go-waku and waku-rs.

This metric can be used to estimate how diverse the network is, and can help in decision making:

- What if just 1% of the nodes run nwaku, while being the reference implementation?
- What if a critical bug is detected in a client run by 90% of the peers?
- What if no one uses any client beyond nwaku?
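
As a rough illustration of the metric described above, here is a minimal Python sketch that turns the agent strings observed from connected peers into a per-client share. The function name `client_shares`, the `"unknown"` bucket and the sample data are assumptions made for this example; they are not taken from any Waku implementation.

```python
from collections import Counter

# Client identifiers proposed in this issue.
KNOWN_CLIENTS = {"nwaku", "js-waku", "go-waku", "waku-rs"}


def client_shares(agent_strings):
    """Return the fraction of observed peers per client id.

    Agent strings that are not one of the proposed client ids (for example,
    because an operator overrode the default) are grouped under "unknown".
    """
    counts = Counter(
        agent if agent in KNOWN_CLIENTS else "unknown" for agent in agent_strings
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {client: count / total for client, count in counts.items()}


# Hypothetical sample: agent strings seen by a monitoring node.
observed = [
    "nwaku", "nwaku", "go-waku", "js-waku",
    "custom-agent", "nwaku", "go-waku", "js-waku",
]
shares = client_shares(observed)
for client, share in sorted(shares.items()):
    print(f"{client}: {share:.1%}")
```

Because the string can be overridden and is only visible after connecting to a peer, the resulting shares are an estimate of the sampled peers, not a census of the network.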

Solution

In order to be able to estimate the diversity, it's suggested that each implementation advertises its client-id using the userAgent field from libp2p (i.e. .withAgentVersion(agentString) for nim-libp2p), where client-id is:

- nwaku
- js-waku
- go-waku
- waku-rs

Some nuances:

- Since we want to be privacy-preserving and we don't want this information to be used against the node, just the client type is advertised, without its release, version, OS or any other sensitive information.
- This follows the nimbus rationale (an Ethereum consensus client).
- A user of waku must be free to update this string easily if they don't want to display this information.
- This will allow us to estimate the diversity of the network, but this data must be taken with a grain of salt.

Tracking issues

Acceptance

- Advertised userAgent reflects the client (nwaku, js-waku, go-waku, waku-rs).
- userAgent can be configured with a cli flag, in case someone doesn't want to reveal it.
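
The two acceptance criteria can be sketched as follows: a per-client default that any operator can replace with a single flag. This is a minimal Python sketch, not how any of the clients implements it. The flag name `--agent-string`, the constant `DEFAULT_AGENT` and the function `parse_agent_string` are hypothetical names chosen for this example.

```python
import argparse

# Default client id for this (hypothetical) implementation. Each client
# would ship its own default: "nwaku", "js-waku", "go-waku" or "waku-rs".
DEFAULT_AGENT = "nwaku"


def parse_agent_string(argv):
    """Return the agent string to advertise, honouring an optional override."""
    parser = argparse.ArgumentParser(description="user agent override sketch")
    # "--agent-string" is an assumed flag name, not a documented option.
    parser.add_argument(
        "--agent-string",
        default=DEFAULT_AGENT,
        help="value advertised as the libp2p user agent",
    )
    args = parser.parse_args(argv)
    return args.agent_string


# No flag given: the client-specific default is advertised.
print(parse_agent_string([]))
# Flag given: the operator's value replaces the default.
print(parse_agent_string(["--agent-string", "anonymous"]))
```

The returned value would then be handed to the libp2p builder of the respective implementation (for nim-libp2p, the `.withAgentVersion(agentString)` call mentioned in the Solution section).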

cc @fryorcraken @richard-ramos @jm-clius @bernardoaraujor

richard-ramos · 2022-10-26T12:05:14Z

go-waku: waku-org/go-waku#348

fryorcraken · 2022-10-26T23:57:18Z

@kaiserd Any thoughts on this re privacy?

@alrevuelta we are explicitly excluding the version, is this correct?

Also note that this is only useful when connecting directly to a node.
Which means it may not be that useful to have an idea of the network topology.

However, it could be useful to investigate odd behaviour encountered when monitoring our own nodes.

alrevuelta · 2022-10-27T06:10:38Z

@fryorcraken Yes, version and other sensitive information are not displayed, just whether it's nim/go/rust/js. It's also a parameter that can be easily overridden with a config flag, so if anyone is really concerned, they can easily change it.

This follows what nimbus is doing with their ethereum client, and it really helps the Ethereum network to know the so-called client diversity. I believe at some point this can come in very handy.

Agree that we only get this information after connecting to a node, but the networkmonitor will take care of it. See waku-org/nwaku#1290

As discussed in waku-org/nwaku#1010 exposing the version is a no go for some of the stakeholders, so it won't be displayed, but note that in the protocol we advertise some kind of version, which ofc can't be avoided, i.e. /vac/waku/relay/2.0.0-beta2. Less verbose than the release version but still. But yeah, this goes beyond this :)

kaiserd · 2022-10-27T08:08:07Z

At first glance, I do not see any directly exploitable issues.
Still, this gives attackers additional information.
It is basically a trade-off between benefits for logging and anonymity (though the anonymity cost is low).

I'd not publish the user-agent, but I do not have strong arguments against it.
(I would be strongly against publishing the version, but following the discussion, this is already ruled out as an option.)
I'd rather err on the side of not publishing information even though it might not be too useful for attacks.

Potential implications I can think of right now:

- Published user-agent info can reduce the anonymity set when trying to track nodes over sessions
  - right now this can be done using the PeerID anyways, but in the future...
- Published user-agent info might reduce a bit of attacker uncertainty in mass deanonymization attacks
  - if the full graph is not known to the attacker, knowing that nodes have labels can help
  - might also help in graph learning attacks, where the attacker tries to infer the gossip-sub topology
- One specific implementation might be easier to exploit, regardless of the specific version
  - also, the latest version, or a specific version might be weak, and knowing the user-agent narrows down the search space

@fryorcraken

alrevuelta · 2022-10-27T08:33:20Z

@kaiserd Thanks for the input. imho the benefits of having this information totally outweigh the cons, as explained above in the Background section. The cost of anonymity is low and can be lowered to 0 since this is a flag that can be changed easily. If at some point we detect that no one is using the default nwaku/etc flag, we can remove it. But having this overview of what's in the network is really useful when scaling, and these metrics can help us take some strategic decisions.

jm-clius · 2022-10-27T09:15:49Z

IMO, I think the tradeoff is acceptable here if (a) we exclude any version information and (b) the setting is easily overridable in config - both of which this issue and implementations adhere to.

kaiserd · 2022-10-27T09:21:22Z

The cost of anonymity is low and can be lowered to 0

the setting is easily overridable in config

This further reduces the anonymity set of nodes that do not override. (Just to be aware of, not arguing strongly against the proposal.)
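
To make the anonymity-set point concrete, here is a small Python sketch with made-up numbers. It assumes, as a simplification, that the agent string is the only attribute distinguishing peers, so a node's anonymity set is the group of peers advertising the same string. The function name `anonymity_set_sizes` and the node counts are hypothetical.

```python
from collections import Counter


def anonymity_set_sizes(agent_strings):
    """Map each advertised agent string to the number of peers sharing it."""
    return dict(Counter(agent_strings))


# Hypothetical network of 100 nodes, all advertising the default value.
no_overrides = ["nwaku"] * 100
# Same network after 10 operators override the default with a shared value.
with_overrides = ["nwaku"] * 90 + ["custom"] * 10

print(anonymity_set_sizes(no_overrides))    # {'nwaku': 100}
print(anonymity_set_sizes(with_overrides))  # {'nwaku': 90, 'custom': 10}
```

In this toy case, the nodes that keep the default drop from a set of 100 to a set of 90, and the nodes that override form a set of only 10 (or smaller, if they each pick a different string).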

alrevuelta · 2022-10-27T09:24:35Z

This further reduces the anonymity set of nodes that do not override. (Just to be aware of, not arguing strongly against the proposal.)

Yep good point, but assuming that only a very small subset uses the flag.

bernardoaraujor · 2022-10-27T10:11:44Z

waku-rs: bernardoaraujor/waku-rs#3

fryorcraken · 2022-10-28T04:34:28Z

I am not sure I see a benefit in setting a permanent, overridable flag here.

What if just 1% of the nodes run nwaku, while being the reference implementation?

and? What action/decision would you take? The only way to receive actionable feedback here is for
- us using nwaku and dogfooding so we are aware of caveats/flaws
- discussing with user base to understand choice/preferences

What if a critical bug is detected in a client run by 90% of the peers?

Are you saying that if a critical bug is found in a client not run by 90% of the peers then maybe we'll delay the resolution?

What if no one uses any client beyond nwaku?

Same as the first point.

User agent in Ethereum beacon node makes sense to help each different team measure their market share. Client diversity is important for Ethereum for sustainability and robustness. I don't think this is the case for Waku.

I would prefer if we expose an API to override the default user agent value while keeping the current user agent value as it is (and/or override it to waku for everyone).

Once the API is exposed, I would actually like to encourage the Status client team to set the status-web, status-desktop and status-mobile value when the software is built in debug/develop/dogfooding mode so we can easily investigate odd queries (e.g. Waku Store) and know which client they come from.

alrevuelta · 2022-10-28T06:18:16Z

What if just 1% of the nodes run nwaku, while being the reference implementation?

and? What action/decision would you take? The only way to receive actionable feedback here is for
- us using nwaku and dogfooding so we are aware of caveats/flaws
- discussing with user base to understand choice/preferences

Isn't this a useful metric for knowing the state of the network? Not saying 1% is good or bad, but it can help us raise the issue in the community. Difficult to configure? Bad performance? Missing features? It's not about taking a specific decision, it's about awareness.

Are you saying that if a critical bug is found in a client not run by 90% of the peers then maybe we'll delay the resolution?

Of course, we won't delay the solution. But a bug is not fixed immediately, and while the bug is present in the network, knowing the share of clients it affects is useful imho. For this use case, having the release v.x.y.z of each client in the userAgent would be really useful, and it's something that really helps the Ethereum network when forking. But it was agreed that the version is too much, so it was discarded.

User agent in Ethereum beacon node makes sense to help each different team measure their market share. Client diversity is important for Ethereum for sustainability and robustness. I don't think this is the case for Waku.

As we scale and onboard new operators, I think it's also the case for Waku. Beyond client diversity (Ethereum aims for 20% share of each), in our case perhaps we don't care about an equal share for each client, but having this kind of overview of what's in the network is useful. Same as knowing the number of peers, their location, etc.

I would prefer if we expose an API to override the default user agent value while keeping the current user agent value as it is (and/or override it to waku for everyone).

In the proposed solution, the user agent can be configured with a cli flag, but I would like to insist that the default value is different for each client. Anyone is free to override it with just one flag.

Once the API is exposed, I would actually like to encourage the Status client team to set the status-web, status-desktop and status-mobile value when the software is built in debug/develop/dogfooding mode so we can easily investigate odd queries (e.g. Waku Store) and know which client they come from.

Perhaps this goes against what was discussed in #1242. I think it's too specific. Regardless, I think it's out of scope for this and I would not try to enforce it, up to them.

alrevuelta · 2022-11-17T12:08:08Z

Thanks everyone! Only waku-rs is left but don't see a lot of activity in the repo. Not sure if we should close this.

jm-clius · 2022-11-17T12:52:41Z

Let's close, as the three main clients have been updated.

bernardoaraujor · 2022-11-17T20:12:38Z

Thanks everyone! Only waku-rs is left but don't see a lot of activity in the repo. Not sure if we should close this.

sorry everyone! waku-rs is a side-project for me, and I haven't had the time to tackle this yet.

feel free to close it, I'll report here when I implement it.

bernardoaraujor mentioned this issue Oct 26, 2022

advertise client-id using the userAgent field from libp2p bernardoaraujor/waku-rs#3

Open

fryorcraken mentioned this issue Nov 7, 2022

Default Libp2p Agent Version to js-waku and make it easily overridable waku-org/js-waku#1009

Closed

2 tasks

danisharora099 mentioned this issue Nov 10, 2022

feat!: add support for adding/setting user agent waku-org/js-waku#1016

Merged

fryorcraken closed this as completed Nov 18, 2022
