IPv6 Hop-by-Hop & Destination Option #56
base: master
Changes from 11 commits
3771848
10355fd
fe346e4
d4972e5
703fd49
cc70169
800c506
7456957
c5c1a52
9e75a1f
06a5ea0
4638a0f
4601746
5ff83a8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -494,19 +494,140 @@ corruption at preceding hops. | |
|
|
||
| ## Header Location | ||
|
|
||
| We describe three encapsulation formats in this specification, covering | ||
| We describe five encapsulation formats in this specification, covering | ||
| different deployment scenarios, with and without network virtualization: | ||
|
|
||
| 1. *INT over TCP/UDP* - A shim header is inserted following TCP/UDP | ||
| 2. *INT over IPv6* - INT Headers are carried in IPv6 packets as a Hop-by-Hop option. | ||
| 3. *INT over TCP/UDP* - A shim header is inserted following TCP/UDP | ||
| header. INT Headers are carried between this shim header and TCP/UDP payload. | ||
| This approach doesn’t rely on any tunneling/virtualization mechanism and is | ||
| versatile to apply INT to both native and virtualized traffic. | ||
| 2. *INT over VXLAN* - VXLAN generic protocol extensions | ||
| 4. *INT over VXLAN* - VXLAN generic protocol extensions | ||
| (draft-ietf-nvo3-vxlan-gpe) are used to carry INT Headers between | ||
| the VXLAN header and the encapsulated VXLAN payload. | ||
| 3. *INT over Geneve* - Geneve is an extensible tunneling framework, allowing | ||
| 5. *INT over Geneve* - Geneve is an extensible tunneling framework, allowing | ||
| Geneve options to be defined for INT Headers. | ||
|
|
||
| ### INT over IPv6 | ||
|
|
||
| In case the traffic being monitored is not encapsulated by any virtualization | ||
| header, INT over VXLAN or INT over Geneve is not helpful. Instead, | ||
| the INT metadata can be carried as Hop-by-Hop option data. | ||
|
|
||
| IPv6 Hop-by-Hop Option format for carrying INT Header and | ||
| Metadata: | ||
|
|
||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Nxt HDR = UL | HbyH Ext Len | Padding|(MBZ) | | | ||
mickeyspiegel marked this conversation as resolved.
|
||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ I | ||
| | Option Type | Opt Data Len | Reserved (MBZ) | N | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ T | ||
| | Variable Option Data (INT Metadata Headers and Metadata) | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
|
|
||
|
|
||
| Nxt HDR: 8-bit selector. Identifies the type of header immediately following the | ||
| Hop-by-Hop or Destination Options header. Uses the same values as the IPv4 Protocol | ||
| field [IANA-PN]. | ||
|
|
||
| HbyH Ext Len: 8-bit unsigned integer. Length of the Hop-by-Hop or Destination | ||
| Options header in 8-octet units, not including the first 8 octets. | ||
|
||
|
|
||
| Option Type: 8-bit identifier of the type of option. | ||
|
|
||
| 001xxxxx 8-bit identifier of the type of option. xxxxx=TBD_IANA_INT_HOP_BY_HOP_OPTION_IPV6. | ||
| 001xxxxx 8-bit identifier of the type of option. xxxxx=TBD_IANA_INT_DESTINATION_OPTION_IPV6. | ||
|
Looking at the IANA registry, there are a total of 32 code points of which 17 have already been allocated. The registration procedure is IESG Approval, IETF Review or Standards Action. IOAM is asking for 4 code points, which seems unlikely. The chances for INT to get any code points are not high.
I agree.
I see two options:
I see another problem with the corresponding IETF IOAM IPv6 draft. The text says that "a router MUST drop packets which contain extension headers carrying IOAM data-fields", to "ensure that the IOAM data does not unintentionally get forwarded outside the IOAM domain." However, they asked for an Option Type codepoint starting with "00", which means when the option type is unrecognized, "skip over this option and continue processing the header". If the text is correct, then they should ask for any of the other codepoint prefixes "01" (discard the packet), "10" (discard and send ICMP parameter problem, code 2, back to the packet's source address), or "11" (discard and send ICMP only if the packet's destination address was not a multicast address).
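The prefix semantics described in this comment can be sketched as a small lookup. This is only an illustration: the function names are hypothetical, and the "chg" bit helper reflects the general IPv6 option encoding rather than anything stated in this thread.

```python
# Action a node takes when it does NOT recognize an option, keyed by the
# two highest-order bits of the 8-bit Option Type (as quoted in the comment above).
UNRECOGNIZED_ACTIONS = {
    0b00: "skip over this option and continue processing the header",
    0b01: "discard the packet",
    0b10: "discard and send ICMP Parameter Problem, Code 2, to the source",
    0b11: "discard; send ICMP only if the destination was not multicast",
}


def unrecognized_option_action(option_type: int) -> str:
    """Return the handling implied by the top two bits of an Option Type."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("Option Type is an 8-bit value")
    return UNRECOGNIZED_ACTIONS[option_type >> 6]


def may_change_en_route(option_type: int) -> bool:
    """Third-highest-order bit: 1 means the option data may change en route."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("Option Type is an 8-bit value")
    return bool(option_type & 0x20)
```

With the "001" prefix used in the draft text, `unrecognized_option_action` returns the "skip" behavior, which is the mismatch with a "MUST drop" requirement that the comment points out.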
I will close loop with IETF and address this comment.
Whatever we do, two codepoints will not fly. At a minimum we would have to go with TBD_IANA_INT_OPTION_IPV6 (not distinguishing between INT hop-by-hop and INT destination), which would later get resolved to either the experimental hop-by-hop options codepoint or whatever IOAM has assigned. If we go with IOAM then the INT Type values might need to be shifted to avoid conflicts. I also wonder if we should use xxx or yyy for the first 3 bits as well given the other open issue I stated above. |
||
|
|
||
| Opt Data Len: 8-bit unsigned integer. Length of the Reserved and Option Data fields of this | ||
| option, in octets. | ||
|
|
||
| Reserved (MBZ): 16-bit field; must be filled with zeroes. | ||
mickeyspiegel marked this conversation as resolved.
|
||
| Variable Opt Data: INT Header and Metadata, multiple of four octets in length. | ||
|
|
||
| Padding: 16-bit pad. Needed to ensure that the length of the complete | ||
| Hop-by-Hop or Destination Options header is an integer multiple of 8 octets. | ||
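The length fields defined above can be illustrated with a minimal packing sketch. Several things here are assumptions rather than part of the proposal: the function name is hypothetical, `0x3E` is used only as a stand-in experimental option type whose top bits are "001" (the real code point is TBD), and the sketch rejects INT data whose size would leave the header off an 8-octet boundary, since the text does not say how the fixed 16-bit pad handles that case.

```python
import struct

# Assumption: stand-in experimental option type with the "001" prefix;
# the code point in the proposal is still TBD with IANA.
INT_OPTION_TYPE = 0x3E


def build_int_hbh_header(next_header: int, int_data: bytes) -> bytes:
    """Pack Nxt HDR, Ext Len, Padding, then a single INT option."""
    if len(int_data) % 4 != 0:
        raise ValueError("INT header and metadata must be a multiple of 4 octets")

    # Opt Data Len covers the 2-octet Reserved field plus the option data.
    opt_data_len = 2 + len(int_data)
    if opt_data_len > 255:
        raise ValueError("option data does not fit in the 8-bit Opt Data Len")

    # First word (4) + Option Type and Opt Data Len (2) + Reserved and data.
    total_len = 4 + 2 + opt_data_len
    if total_len % 8 != 0:
        # Assumption: the fixed pad cannot absorb this, so the sketch rejects it.
        raise ValueError("header length must be a multiple of 8 octets")

    # Ext Len is in 8-octet units, not including the first 8 octets.
    hdr_ext_len = total_len // 8 - 1

    first_word = struct.pack("!BBH", next_header, hdr_ext_len, 0)  # Padding = 0
    option = struct.pack("!BBH", INT_OPTION_TYPE, opt_data_len, 0) + int_data
    return first_word + option
```

For example, 8 octets of INT data with an upper-layer Next Header of 17 (UDP) gives a 16-octet header with Ext Len 1 and Opt Data Len 10.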
|
||
|
|
||
|
|
||
| The format of the IPv6 packet with Hop-by-Hop option is shown below: | ||
|
|
||
| 0 1 2 3 | ||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| |Version| Traffic Class | Flow Label | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Payload Length |Nxt HDR = HbyH | Hop Limit | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Source IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Destination IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Nxt HDR = UL | HbyH Ext Len | Padding | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ I | ||
| | Option Type | Opt Data Len | Reserved (MBZ) | N | ||
mickeyspiegel marked this conversation as resolved.
|
||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ T | ||
| | Variable Option Data (INT Data) | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Payload + Padding (L4/ESP/...) | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
|
|
||
| The format of the IPv6 packet with Hop-by-Hop option for INT-MD (Embedded Metadata) is shown below: | ||
|
|
||
| 0 1 2 3 | ||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| |Version| Traffic Class | Flow Label | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Payload Length | Nxt HDR = HbyH| Hop Limit | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | (Outer) Source IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | (Outer) Destination IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Nxt HDR = IPv6| HbyH Ext Len | Padding|(MBZ) | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Option Type | Opt Data Len | Reserved (MBZ) | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | INT-Type - MD | Length | Reserved | Next Protocol | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-| | | ||
| |Ver = 2|Rep|C|E|M| Reserved | Hop ML |RemainingHopCnt| I | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ N | ||
| | Instruction Bitmap | Domain Specific ID | T | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | ||
| | DS Flags | DS Instruction | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | ||
| | Variable Option Data (INT DATA) | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Payload Original Packet | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
|
|
||
| The format of the IPv6 packet with Hop-by-Hop option for INT-MX (Direct Export) is shown below: | ||
|
|
||
| 0 1 2 3 | ||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| |Version| Traffic Class | Flow Label | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Payload Length | Nxt HDR = HbyH| Hop Limit | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | (Outer) Source IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | (Outer) Destination IPv6 Address | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Nxt HDR = IPv6| HbyH Ext Len | Padding|(MBZ) | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Option Type | Opt Data Len | Reserved (MBZ) | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | INT-Type - MX | Length | Reserved | Next Protocol | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-| | | ||
| |Ver = 2|Rep|C|E|M| Reserved | Hop ML |RemainingHopCnt| I | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ N | ||
| | Instruction Bitmap | Domain Specific ID | T | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | ||
| | DS Flags | DS Instruction | | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | ||
| | Payload Original Packet | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
|
|
||
| ### INT over TCP/UDP | ||
|
|
||
| In case the traffic being monitored is not encapsulated by any virtualization | ||
|
|
@@ -752,12 +873,12 @@ hop-by-hop INT header must fit in a single Geneve option. | |
| In this section, we define the format for INT hop-by-hop metadata headers, | ||
| and the metadata itself. | ||
|
|
||
| INT Metadata Header and Metadata Stack: | ||
| INT Metadata Header and Metadata Stack (Version = 1): | ||
| ` | ||
| 0 1 2 3 | ||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Ver |Rep|C|E|M| Reserved | Hop ML |RemainingHopCnt| | ||
| |Ver = 1|Rep|C|E|M| Reserved | Hop ML |RemainingHopCnt| | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| | Instruction Bitmap | Reserved | | ||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
|
|
@@ -821,7 +942,7 @@ The original packet must have C bit set to 0. | |
| switch(es) set the M bit based on knowledge of the network topology | ||
| and "Switch ID, Ingress port ID, Egress port ID" tuples in the INT | ||
| metadata stack. | ||
| - R: Reserved bits. | ||
| - R (10b): Reserved bits. | ||
| - Hop ML (5b): Per-hop Metadata Length, the length of metadata in 4-Byte words | ||
| to be inserted at each INT hop. | ||
| - While the largest value of Per-hop Metadata Length is 31, an INT-capable | ||
|
|
||
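As a rough illustration of the first 32-bit word of the INT metadata header in the hunk above, the sketch below unpacks its fields. The widths for R (10b) and Hop ML (5b) come from the text; the widths for Ver (4b), Rep (2b), C/E/M (1b each) and RemainingHopCnt (8b) are read off the diagram and should be treated as assumptions, as should all the names used here.

```python
import struct
from typing import NamedTuple


class IntMdWord(NamedTuple):
    ver: int
    rep: int
    c: int
    e: int
    m: int
    hop_ml: int
    remaining_hop_cnt: int


def parse_int_md_word(data: bytes) -> IntMdWord:
    """Decode the first 4 octets of the INT metadata header (network byte order)."""
    if len(data) < 4:
        raise ValueError("need at least 4 octets")
    (word,) = struct.unpack("!I", data[:4])
    return IntMdWord(
        ver=(word >> 28) & 0xF,
        rep=(word >> 26) & 0x3,
        c=(word >> 25) & 0x1,
        e=(word >> 24) & 0x1,
        m=(word >> 23) & 0x1,
        # Bits 22..13 are the 10 reserved bits and are ignored here.
        hop_ml=(word >> 8) & 0x1F,
        remaining_hop_cnt=word & 0xFF,
    )


def per_hop_metadata_bytes(hop_ml: int) -> int:
    """Hop ML counts 4-byte words, so each hop adds hop_ml * 4 octets."""
    if not 0 <= hop_ml <= 31:
        raise ValueError("Hop ML is a 5-bit value")
    return hop_ml * 4
```

With the largest Hop ML value of 31, each hop would insert 124 octets of metadata.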
Should we add INT over IPv6 after the other three encaps? Especially because the text in the paragraph is referring to scenarios where "INT over VXLAN or Geneve is not helpful".
I followed what was done earlier. TCP/UDP was listed first and it referenced encaps. I just stuck to that. I am fine with changing the order.
I may be woefully out of date on IPv6 extension header behavior, but regarding the option '"INT over IPv6" - INT Headers are carried in the IPv6 packets as Hop-by-Hop option.', I had thought that switches in practice have to punt packets with an IPv6 Hop-by-Hop extension header to the slow path, e.g. software forwarding on a general purpose CPU.
I did a quick search and found that RFC 7045 (published Dec 2013) says this in Section 2.2 "Hop-by-Hop Options":
> The IPv6 Hop-by-Hop Options header SHOULD be processed by
> intermediate forwarding nodes as described in [RFC2460]. However, it
> is to be expected that high-performance routers will either ignore it
> or assign packets containing it to a slow processing path. Designers
> planning to use a hop-by-hop option need to be aware of this likely
> behaviour.
Is there really a desire to put INT data into a header that will likely result in slow path processing in the network?