Benjamin Kaduk's ballot comments #156

Open
huitema opened this issue Mar 10, 2022 · 5 comments

@huitema
Owner

huitema commented Mar 10, 2022

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dprive-dnsoquic-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dprive-dnsoquic/


DISCUSS:

I have a 0-RTT-related topic that I'd like to discuss, as the current
situation isn't entirely clear to me. In particular, TLS 1.3 provides
(and QUIC inherits) a mechanism for a server to advertise that it just
does not support 0-RTT at all, via the (absence of the) "early_data"
extension. This meshes nicely with the guidance in RFC 8446 that 0-RTT is
to only be used cautiously, and only with specific request from the
application. However, this specification diverges from that requirement
for application opt-in (per §9.1), and so when I read the directive in
§5.5 that "servers MUST adopt one of the following behaviors", I am forced
to wonder if the absence of an "abort the connection, because you do not
enable early data at all" option is intended to forbid a server from
taking that approach and thus require servers to implement and enable
0-RTT at runtime.
I hope that the intent was just for the §5.5 listing to be predicated on
the server using 0-RTT at all, but it's hard to reach that conclusion from
the existing text, so I have to seek clarification.


COMMENT:

Thanks to Phillip Hallam-Baker for the secdir review. I did want to
reiterate one of his comments, regarding the potential for harmful
interaction between use of DoQ (or really, any encrypted DNS transport)
and captive portals. While this would accordingly have been best placed
in something generic to DNS privacy mechanisms, such as RFC 9076 or RFC
8932, I think there might still be room to mention it here. I could
attempt to craft some text, if there is interest.

I made a pull request with some editorial suggestions at
#154

Section 1

The specific non-goals of this document are:
[...]
2. No attempt to support server-initiated transactions, which are
used only in DNS Stateful Operations (DSO) [RFC8490].

RFC 8490 is a proposed standard, so excluding it maybe is a bit in
conflict with claiming that this is a "general-purpose transport for DNS",
absent some other argument that DSO is a special-purpose tool.

Section 5.1.2

DoQ connections MUST NOT use UDP port 53. This recommendation
against use of port 53 for DoQ is to avoid confusion between DoQ and
the use of DNS over UDP [RFC1035].

Just to clarify: this prohibition is intended to apply even if there would
otherwise be mutual agreement to use port 53?

Section 5.2

DNS traffic follows a simple pattern in which the client sends a
query, and the server provides one or more responses (multiple
responses can occur in zone transfers).

Is this true even for DSO server-initiated transactions?

The client MUST select the next available client-initiated
bidirectional stream for each subsequent query on a QUIC connection,
in conformance with the QUIC transport specification [RFC9000].

Just to note: RFC 9000 does not require the client to use the "next
available" stream, instead saying that "a stream ID that is used out of
order results in all streams of that type with lower-numbered stream IDs
also being opened". So this "MUST select the next available" is a new
requirement for DoQ, and it's not entirely clear to me that it's required
for interop (though it is more efficient than any alternatives).
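To make the stream-numbering point concrete, here is a minimal sketch of in-order allocation. It relies only on the RFC 9000 rule that the two low bits of a stream ID encode the stream type, with 0b00 meaning client-initiated bidirectional. The class and function names are hypothetical and not taken from the draft or from any QUIC library.

```python
class ClientBidiStreamAllocator:
    """Hands out client-initiated bidirectional QUIC stream IDs in order."""

    def __init__(self) -> None:
        # Number of streams of this type opened so far (hypothetical state).
        self._next_index = 0

    def next_stream_id(self) -> int:
        # Low two bits 0b00 = client-initiated bidirectional, so the
        # IDs of this type are 0, 4, 8, ...
        stream_id = self._next_index << 2
        self._next_index += 1
        return stream_id


def implicitly_opened(highest_used: int, new_id: int) -> list:
    """Stream IDs of the same type that are also opened when new_id is
    used out of order. Pass highest_used=-4 if no stream was used yet."""
    return list(range(highest_used + 4, new_id, 4))
```

With this sketch, a client that has used streams 0 through 8 and then jumps to stream 20 also opens streams 12 and 16, which then count against the peer's stream limit without carrying a query.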

Section 5.2.1

This has implications for proxying DoQ message to and from other
transports. For example, proxies may have to manage the fact that
DoQ can support a larger number of outstanding queries on a single
connection than e.g., DNS over TCP because DoQ is not limited by the
Message ID space. This issue already exists for DoH, where a Message
ID of 0 is recommended.

I'm not sure how often this motivating text is relevant. The ID field
seems to be 16 bits, thus enabling 65k outstanding queries on a single
connection -- how often is there a need to have that many queries
outstanding at once? It looks like the motivation presented in RFC 8484
for setting the ID to zero is to improve caching, as otherwise queries
identical at the DNS level would be cached as separate requests by HTTP.
I agree, of course, that the ID field is redundant with the QUIC stream ID
and that it should be set to zero, I am just not sure if the number of
outstanding queries is a relevant motivation for doing so.

(It also looks like RFC 8484 refers to this value as the "DNS ID" rather than
"Message ID". I guess our options for consistent terminology are somewhat
limited, though.)
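As an illustration of the proxying concern in the quoted text, the sketch below shows one way a proxy might map DoQ queries, which are identified by stream ID and carry a Message ID of 0, onto a single upstream TCP connection where the 16-bit ID field has to distinguish outstanding queries. The class name, the `id_space` parameter, and the sequential assignment policy are assumptions made for this example; a real proxy would more likely randomize the IDs.

```python
class UpstreamIdAllocator:
    """Assigns 16-bit DNS message IDs for queries relayed from DoQ streams."""

    def __init__(self, id_space: int = 1 << 16) -> None:
        # The DNS header ID field is 16 bits wide, hence 65536 values.
        self._id_space = id_space
        self._in_use = {}  # upstream message ID -> DoQ stream ID
        self._cursor = 0

    def assign(self, doq_stream_id: int):
        """Return a free message ID, or None if all IDs are outstanding."""
        if len(self._in_use) >= self._id_space:
            # The caller has to queue the query or open another connection.
            return None
        while self._cursor in self._in_use:
            self._cursor = (self._cursor + 1) % self._id_space
        message_id = self._cursor
        self._in_use[message_id] = doq_stream_id
        self._cursor = (self._cursor + 1) % self._id_space
        return message_id

    def release(self, message_id: int) -> int:
        """Free the ID when the response arrives; return the DoQ stream ID."""
        return self._in_use.pop(message_id)
```

The `None` branch is the case the draft text alludes to: it is reached only when 65536 queries are outstanding on one upstream connection at the same time.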

Section 5.3

The following error codes are defined for use when abruptly
terminating streams, aborting reading of streams, or immediately
closing connections:

Should we say that these are what QUIC calls "application error code"s?
(Subsequent occurrences of the phrase "error code" might be modified to
"application error code" as well.)

Section 5.3.2

set to DOQ_INTERNAL_ERROR. [...]

Is there any further guidance to give on when a DNS SERVFAIL response vs
QUIC RESET_STREAM is preferred (or is the guidance really always to issue
RESET_STREAM)?

Section 5.3.3

It is noted that the restrictions on use of the above EDNS(0) options
has implications for proxying message from TCP/DoT/DoH over DoQ.

Was it already rejected to spend a sentence mentioning that such proxying
would involve translating the messages per the needs of the different
protocols on the different connections?

Section 5.5

Servers MUST NOT execute non replayable transactions received in
0-RTT data. Servers MUST adopt one of the following behaviors:

I think we should clarify whether "execute" means "take any action in
response to" or just "send a response message for". (I think it needs to
be the former.)

Section 6.4

Implementations MUST protect against the traffic analysis attacks
described in Section 9.5 by the judicious injection of padding. This

I think this is already overtaken by events, but a MUST-level requirement
seems overbearing here. My understanding is that providing complete
protection against these types of attack is still an open research
question....

could be done either by padding individual DNS messages using the
EDNS(0) Padding Option [RFC7830] or by padding QUIC packets (see
Section 8.6 of [RFC9000], the QUIC transport specification.

There is no Section 8.6 in RFC 9000.
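For the EDNS(0) Padding Option path mentioned in the quoted text, a minimal sketch of the length computation follows. It assumes a block-length padding policy (the approach recommended in RFC 8467, which this thread does not cite) and assumes the message already contains an OPT record. The function name and the block sizes in the usage note are illustrative only.

```python
# An EDNS(0) option starts with a 2-byte option code and a 2-byte length.
PADDING_OPTION_HEADER = 4


def padding_length(message_len: int, block_size: int) -> int:
    """Number of padding data bytes to put in the Padding option so that
    the final message length is a multiple of block_size.

    message_len is the size of the DNS message, including its OPT record,
    before the Padding option is appended (an assumption of this sketch).
    """
    return -(message_len + PADDING_OPTION_HEADER) % block_size
```

For example, a 100-byte query padded to a 128-byte block needs 24 bytes of padding data, since 100 + 4 + 24 = 128. This hides exact lengths inside a block but, as noted above, does not give complete protection against traffic analysis.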

Section 6.5.2

Clients that want to maintain long duration DoQ connections SHOULD
use the idle timeout mechanisms defined in Section 10.1 of [RFC9000],
the QUIC transport specification. Clients and servers MUST NOT send
the edns-tcp-keepalive EDNS(0) Option [RFC7828] in any messages sent
on a DoQ connection (because it is specific to the use of TCP/TLS as
a transport).

Should we make some statement (analogous to what RFC 7828 does) that if
such an option is received it MUST be ignored? In the absence of such
guidance I can imagine implementors feeling a need to enforce the "MUST
NOT send" on the receiving end.

Section 6.7

[RFC9103] specifies zone transfer over TLS (XoT) and includes updates
to [RFC1995] (IXFR), [RFC5936] (AXFR) and [RFC7766]. [...]

I note that there is currently no "Updates:" header to indicate this
relationship.

  • DoQ implementations SHOULD

    • use the same QUIC connection for both AXFR and IXFR requests to
      the same primary

    • pipeline such requests (if they pipeline XFR requests in
      general) and MAY intermingle them

    • send the response(s) for each request as soon as they are
      available i.e. responses MAY be sent intermingled

Given the "SHOULD use the same QUIC connection", what does MAY-level
guidance to "intermingle such requests" mean, in a QUIC context? Each DoQ
request is on a separate QUIC stream, so I do not see any opportunity for
intermingling other than by virtue of being in the same QUIC connection,
which is already a SHOULD. This is in contrast to a TCP or TLS situation,
where there is only a single data stream and intermingling has some
natural meaning (or meanings, for the response case specifically, where it
might apply to overall responses (composed of multiple response messages)
or individual response messages).

Section 8

The discussion in §6.5.2 about resource management could be security
relevant at times, if we wanted to backreference it.

The security considerations of DoQ should be comparable to those of
DoT [RFC7858]. DoT as specified in [RFC7858] only addresses the stub

The security considerations section of RFC 7858 includes a MUST-level
requirement to adhere to the recommendations of BCP 195. Does such a
MUST-level requirement apply to DoQ as well? (I note that BCP 195 is
currently listed as only an informative reference, which would need to
change if a MUST-level requirement was added.)

to recursive resolver scenario, but the considerations about person-
in-the-middle attacks, middleboxes and caching of data from clear
text connections also apply for DoQ to the resolver to authoritative
server scenario. [...]

RFC 7858 also lists a fourth consideration, traffic analysis or
side-channel leaks. Do we want to forward-reference §9.5 for completeness
(or even take the secdir reviewer's suggestion of coalescing the privacy
considerations into the security considerations section as confidentiality
considerations)?

Section 9.1

The prevention on allowing replayable transactions in 0-RTT data
expressed in Section 5.5 blocks the most obvious risks of replay

Is the parity of negations correct here ("prevention on allowing")? I see
§5.5 prohibiting execution of non-replayable transactions in 0-RTT data,
i.e., allowing replayable ones.

Section 10.4

Provisional reservations share the range of values larger than 0x3f
with some permanent registrations. This is by design, to enable
conversion of provisional registrations into permanent registrations
without requiring changes in deployed systems. (This design is
aligned with the principles set in Section 22 of [RFC9000].)

Do we want to specifically call out the guidance on selecting specific
codepoints from §22.1.2 of RFC 9000? (Or is it seen as not applicable
here?)

Section 12.1

We currently only specifically reference RFC 6891 in one place, to mention
that its provision for specifying maximum UDP message size is not relevant
for DoQ. However, since we do define and require (in some cases) use of a
new "Too Early" EDNS(0) error code, it seems that the solution should be
to reference it from more places, rather than to demote it to an
informative reference.

Similarly, we only reference RFC 8914 in the IANA considerations where we
allocate the codepoint, and would likely benefit from sprinkling an
additional reference or two in the main body of the text.

RFC 7828, on the other hand, seems to only be mentioned to say that you
MUST NOT use it, which would probably be fine as an informative reference.

RFC 7873 is referenced for "similar to the DNS Cookies mechanism", which
also sounds solely informative.

[I-D.ietf-dnsop-rfc8499bis]
Hoffman, P. and K. Fujiwara, "DNS Terminology", Work in
Progress, Internet-Draft, draft-ietf-dnsop-rfc8499bis-03,

It's kind of surprising to see DoQ electing to take a normative dependency
on this draft that is not even in WGLC yet. Wouldn't that risk incurring
substantial (unbounded) delays?

Section 12.2

A SHOULD-level requirement to implement the anti-replay mechanisms from
RFC 8446 seems to promote it to normative status, per
https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/

@huitema
Owner Author

huitema commented Mar 18, 2022

0RTT discuss issue is addressed in PR #158
Editorial comments in PR #154 have been approved.

@huitema
Owner Author

huitema commented Mar 19, 2022

Most remaining issues are addressed in PR #166. The following points are debatable:

Thanks to Phillip Hallam-Baker for the secdir review. I did want to
reiterate one of his comments, regarding the potential for harmful
interaction between use of DoQ (or really, any encrypted DNS transport)
and captive portals. While this would accordingly have been best placed
in something generic to DNS privacy mechanisms, such as RFC 9076 or RFC
8932, I think there might still be room to mention it here. I could
attempt to craft some text, if there is interest.

This is not addressed in the new draft. We are very reluctant to start documenting this very specific deployment issue in a transport draft. There is work in progress in the ADD WG, which will address DoH as well as DoQ. Maybe we should just wait for that.

Section 1

The specific non-goals of this document are:
[...]
2. No attempt to support server-initiated transactions, which are
used only in DNS Stateful Operations (DSO) [RFC8490].

RFC 8490 is a proposed standard, so excluding it maybe is a bit in
conflict with claiming that this is a "general-purpose transport for DNS",
absent some other argument that DSO is a special-purpose tool.

DSO is a special-purpose tool because it defines a new state model for a session-based connection that overrides RFC7766 (the default behaviour for DNS-over-TCP), and that new state model is what enables server-initiated transactions. To our knowledge it has only been implemented for DNS Service Discovery (which drove its initial development) and is not used for any of the scenarios covered in this draft.

The client MUST select the next available client-initiated
bidirectional stream for each subsequent query on a QUIC connection,
in conformance with the QUIC transport specification [RFC9000].

Just to note: RFC 9000 does not require the client to use the "next
available" stream, instead saying that "a stream ID that is used out of
order results in all streams of that type with lower-numbered stream IDs
also being opened". So this "MUST select the next available" is a new
requirement for DoQ, and it's not entirely clear to me that it's required
for interop (though it is more efficient than any alternatives).

Opening streams in order is definitely best practice. Not doing so interferes with mechanisms limiting the number of open streams. The new draft clarifies that the server should not enforce in-order processing. Queries may arrive out of order due, for example, to packet losses and retransmissions.

Section 5.2.1

This has implications for proxying DoQ message to and from other
transports. For example, proxies may have to manage the fact that
DoQ can support a larger number of outstanding queries on a single
connection than e.g., DNS over TCP because DoQ is not limited by the
Message ID space. This issue already exists for DoH, where a Message
ID of 0 is recommended.

I'm not sure how often this motivating text is relevant. The ID field
seems to be 16 bits, thus enabling 65k outstanding queries on a single
connection -- how often is there a need to have that many queries
outstanding at once? It looks like the motivation presented in RFC 8484
for setting the ID to zero is to improve caching, as otherwise queries
identical at the DNS level would be cached as separate requests by HTTP.
I agree, of course, that the ID field is redundant with the QUIC stream ID
and that it should be set to zero, I am just not sure if the number of
outstanding queries is a relevant motivation for doing so.

(It also looks like RFC 8484 refers to this value as the "DNS ID" rather than
"Message ID". I guess our options for consistent terminology are somewhat
limited, though.)

There was a fair bit of discussion about that in reviews, and the current text is the result of these discussions. And yes, RFC 8484 also zeroes "the ID in the DNS header" -- which is how it is defined in RFC 1035.

Section 5.3.3

It is noted that the restrictions on use of the above EDNS(0) options
has implications for proxying message from TCP/DoT/DoH over DoQ.

Was it already rejected to spend a sentence mentioning that such proxying
would involve translating the messages per the needs of the different
protocols on the different connections?

The current text is already the result of many discussions...

Section 6.4

Implementations MUST protect against the traffic analysis attacks
described in Section 9.5 by the judicious injection of padding. This

I think this is already overtaken by events, but a MUST-level requirement
seems overbearing here. My understanding is that providing complete
protection against these types of attack is still an open research
question....

There was pretty strong consensus on "must do something", knowing full well that it is not perfect.

Section 6.5.2

Clients that want to maintain long duration DoQ connections SHOULD
use the idle timeout mechanisms defined in Section 10.1 of [RFC9000],
the QUIC transport specification. Clients and servers MUST NOT send
the edns-tcp-keepalive EDNS(0) Option [RFC7828] in any messages sent
on a DoQ connection (because it is specific to the use of TCP/TLS as
a transport).

Should we make some statement (analogous to what RFC 7828 does) that if
such an option is received it MUST be ignored? In the absence of such
guidance I can imagine implementors feeling a need to enforce the "MUST
NOT send" on the receiving end.

It is already specified as an error condition in {{Protocol-Errors}}, so yes, implementers are absolutely going to enforce "MUST NOT send." No ambiguity there.
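A receiver-side check along these lines could look like the sketch below. It assumes the receiver has already extracted the RDATA of the OPT record from the message; the function names are hypothetical. The option code 11 is the one assigned to edns-tcp-keepalive by RFC 7828, and the option layout (2-byte code, 2-byte length, data) is the standard EDNS(0) one.

```python
import struct

EDNS_TCP_KEEPALIVE = 11  # option code assigned by RFC 7828


def edns_option_codes(opt_rdata: bytes) -> list:
    """Return the option codes found in the RDATA of an OPT record."""
    codes = []
    offset = 0
    while offset < len(opt_rdata):
        if offset + 4 > len(opt_rdata):
            raise ValueError("truncated EDNS option header")
        # Each option is: 2-byte code, 2-byte length, then the data.
        code, length = struct.unpack_from("!HH", opt_rdata, offset)
        offset += 4
        if offset + length > len(opt_rdata):
            raise ValueError("truncated EDNS option data")
        codes.append(code)
        offset += length
    return codes


def violates_doq_keepalive_rule(opt_rdata: bytes) -> bool:
    """True if the message carries the option that DoQ peers MUST NOT send."""
    return EDNS_TCP_KEEPALIVE in edns_option_codes(opt_rdata)
```

What the receiver does when this returns true is governed by the draft's protocol-errors section; the sketch only detects the condition.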

Section 6.7

[RFC9103] specifies zone transfer over TLS (XoT) and includes updates
to [RFC1995] (IXFR), [RFC5936] (AXFR) and [RFC7766]. [...]

I note that there is currently no "Updates:" header to indicate this
relationship.

It seems it does. Looking at https://www.ietf.org/rfc/rfc9103.txt, the header includes an update line.

The discussion in §6.5.2 about resource management could be security
relevant at times, if we wanted to backreference it.

The security considerations of DoQ should be comparable to those of
DoT [RFC7858]. DoT as specified in [RFC7858] only addresses the stub

The QUIC security considerations include a discussion of Slowloris attacks (section 21.6). Isn't that sufficient?

RFC 7858 also lists a fourth consideration, traffic analysis or
side-channel leaks. Do we want to forward-reference §9.5 for completeness
(or even take the secdir reviewer's suggestion of coalescing the privacy
considerations into the security considerations section as confidentiality
considerations)?

Maybe not. The draft does have a section about traffic analysis and mitigations, which covers DoQ-specific issues. Side-channel discussions could easily turn into a rat-hole, with few actionable results. We would then have to distinguish between voluntary side channels, such as emitting a series of queries with very specific timing, and involuntary side channels, in which a third party tweaks the messages to carry some signal. The former is not really actionable, and the latter is mostly a problem for QUIC itself rather than DoQ.

Do we want to specifically call out the guidance on selecting specific
codepoints from §22.1.2 of RFC 9000? (Or is it seen as not applicable
here?)

Not really applicable. 22.1.2 is concerned with the extra overhead caused by long numbers. This is mostly an issue for frequently used code points, like frame types, which could be used on every packet. We only have code points for error conditions, and it doesn't matter very much whether those code points encode in 1, 2, 4 or even 8 bytes.
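The 1, 2, 4 or 8 byte figures come from QUIC's variable-length integer encoding (RFC 9000, Section 16), in which the two most significant bits of the first byte give the length. A minimal sketch of that encoding, with hypothetical function names:

```python
def varint_length(value: int) -> int:
    """Bytes needed to encode value as a QUIC variable-length integer."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("value does not fit in a QUIC varint")
    if value < 1 << 6:
        return 1
    if value < 1 << 14:
        return 2
    if value < 1 << 30:
        return 4
    return 8


def encode_varint(value: int) -> bytes:
    """Encode value; the top two bits of the first byte carry the length."""
    length = varint_length(value)
    prefix = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}[length]
    return (value | (prefix << (8 * length - 2))).to_bytes(length, "big")
```

Code points up to 0x3f fit in one byte, and anything larger takes at least two. That is the overhead Section 22.1.2 is concerned with, and it only matters for values that appear on every packet.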

@kaduk
Contributor

kaduk commented Mar 19, 2022

Thanks for all the commentary here, I appreciate all the responses even if I will select just a few to specifically reply to.

The specific non-goals of this document are:
[...]
2. No attempt to support server-initiated transactions, which are
used only in DNS Stateful Operations (DSO) [RFC8490].
RFC 8490 is a proposed standard, so excluding it maybe is a bit in
conflict with claiming that this is a "general-purpose transport for DNS",
absent some other argument that DSO is a special-purpose tool.

DSO is a special-purpose tool because it defines a new state model for a session-based connection that overrides RFC7766 (the default behaviour for DNS-over-TCP), and that new state model is what enables server-initiated transactions. To our knowledge it has only been implemented for DNS Service Discovery (which drove its initial development) and is not used for any of the scenarios covered in this draft.

That convinces me. Thanks.

This has implications for proxying DoQ message to and from other transports. For example, proxies may have to manage the fact that DoQ can support a larger number of outstanding queries on a single connection than e.g., DNS over TCP because DoQ is not limited by the Message ID space. This issue already exists for DoH, where a Message ID of 0 is recommended.

I'm not sure how often this motivating text is relevant. The ID field seems to be 16 bits, thus enabling 65k outstanding queries on a single connection -- how often is there a need to have that many queries outstanding at once? It looks like the motivation presented in RFC 8484 for setting the ID to zero is to improve caching, as otherwise queries identical at the DNS level would be cached as separate requests by HTTP. I agree, of course, that the ID field is redundant with the QUIC stream ID and that it should be set to zero, I am just not sure if the number of outstanding queries is a relevant motivation for doing so.

There was a bit of discussion later on about the terminology used to describe the ID field, and the requirement to zero it, as having gotten extensive WG discussion. I just want to highlight here that my primary comment relates to the text about "limited by the Message ID space". I think the actual behaviors specified are good, I'm just not sure whether the number of queries outstanding on a connection is ever actually limited by the Message ID space in practice. (But I also am not going to insist on any change; I comment here only to ensure that my intent was understood.)

[RFC9103] specifies zone transfer over TLS (XoT) and includes updates
to [RFC1995] (IXFR), [RFC5936] (AXFR) and [RFC7766]. [...]
I note that there is currently no "Updates:" header to indicate this
relationship.

It seems it does. Looking at https://www.ietf.org/rfc/rfc9103.txt, the header includes an update line.

Oops, that's my mistake. I was reading too fast and misread the quoted bit as saying that dnsoquic was including updates to the listed RFCs. dnsoquic has no Updates: headers for those RFCs (which is correct, because RFC 9103 should and does have them instead).

The security considerations of DoQ should be comparable to those of
DoT [RFC7858]. DoT as specified in [RFC7858] only addresses the stub

The QUIC security considerations include a discussion of Slowloris attacks (section 21.6). Isn't that sufficient?

I think something got jumbled here and I'm not sure what the intent of the reply was. It looks like my question was whether we intended the BCP 195 guidance to apply to DoQ, and if so, whether that should be mentioned specifically.

@huitema
Owner Author

huitema commented Mar 19, 2022

On the 65K limit: I have seen traces of DNS root servers in which some individual IP addresses were sending tens of thousands of messages per second. I have also seen traces showing big clusters of servers with addresses in the same /24 or /48. It is not hard to imagine such resolvers eventually hitting a 65K limit. Granted, that's a bit of a stretch today, but it could happen. Those are also the resolvers most worried about the Kaminsky attack. We see a variety of tactics used to blunt such attacks, but QUIC would be a very good fit, eventually, once the tech is proven.

The intro of BCP 195 says "It is expected that the TLS 1.3 specification will resolve many of the vulnerabilities listed in this document." QUIC embeds TLS 1.3 and has no mechanism to negotiate down to TLS 1.2 or others. I don't think that an explicit reference to BCP 195 is necessary.

But yes, things got jumbled. The Slowloris mention was a response to something else:

The discussion in §6.5.2 about resource management could be security
relevant at times, if we wanted to backreference it.

I looked at whether I could work out a reference to 6.5.2 in the security section, but it did not really seem to fit. Plus, the relevant attacks really are variants of Slowloris and similar denial of service attacks, which are already addressed in the QUIC security review.

@kaduk
Contributor

kaduk commented Mar 19, 2022

Okay, thanks. I think we can consider these all resolved, then.
