Conversation

@zmerp zmerp commented Aug 1, 2025

Refactors the Stream socket, splitting the UDP and TCP implementations and reducing the size of packet/shard prefixes.
Also removes buffer zero-initialization by using MaybeUninit slices with socket2 on the receive side.

Tested on Quest 3/Windows

@zmerp zmerp force-pushed the protocol-cleanup branch 2 times, most recently from d3cde78 to fe2718b Compare August 1, 2025 15:42

Collaborator @The-personified-devil left a comment

I didn't really review the TCP- or UDP-specific code, but here are my takes on the rest of it.

impl<H: Serialize> StreamSender<H> {
    pub fn get_buffer(&mut self, header: &H) -> Result<Buffer<H>> {
        let mut buffer = self.used_buffers.pop().unwrap_or_default();
        buffer.resize(self.payload_offset, 0);

Collaborator

I really don't know what the use of reusing buffers here is; it feels like a premature optimization. Unless we're using a really awful allocator (which I don't think is the default on any target we have, and even then we could just change it), the allocator can do this recycling work just as efficiently for us.

It makes sense when we're trying to cache the allocation size for e.g. control packets, because there we'd incur frequent resizes as the packet grows, but here it's just one allocation either way. Especially because, as it currently stands, a big video buffer could be handed to a small audio packet (which probably doesn't waste that much either way, because the allocator would cache it too, but it does make the memory less reusable and could increase the maximum memory usage).

The only caveat is that this is troublesome with non-constant-size headers, because those could actually grow, which would blow up; but the byte size of the struct should be a not-too-overblown upper bound for that size. In that case it makes sense to have an API to directly get a buffer of the correct size, even though copying around the prefix probably isn't really costly. (This is also relevant for the control socket, by the way: at the sizes we're dealing with, allocating 50 percent more really shouldn't incur extra cost and may well be more efficient than doing in-house recycling.)

Member Author

Especially because as it currently stands a big video buffer could be given to a small audio packet

Nope, buffers are recycled with per-stream queues.

As you said, we need to reallocate the vector multiple times while receiving shards, copying the data multiple times. We may be able to do an optimization where we preallocate a sufficient size, for example the size of the biggest packet sent so far. In the case of an anomalously huge packet this is bad, but it's also bad to keep just one of those huge buffers around. We may need to do some statistics, for example using the 99th-percentile size, or discarding buffers that we consider anomalously sized.

/// If the requested range extends beyond the valid range, new space will be allocated
/// NB: the offset parameter is applied on top of the internal offset of the buffer
#[must_use]
pub fn get_range_mut(&mut self, range: Range<usize>) -> &mut [u8] {

Collaborator

Since we're reusing buffers, I think this could really use a comment stating that the buffer starts off with zero non-prefix memory, so calling this doesn't risk leaving unused memory at the end. I was confused by that for a while.

Member Author

alright, fair

Comment on lines +199 to +210
fn wrapping_cmp(lhs: u32, rhs: u32) -> Ordering {
    let diff = lhs.wrapping_sub(rhs);
    if diff == 0 {
        Ordering::Equal
    } else if diff < u32::MAX / 2 {
        Ordering::Greater
    } else {
        // if diff > u32::MAX / 2, it means the sub operation wrapped
        Ordering::Less
    }
}

Collaborator

I don't see how this is any better than a normal cmp; I'm pretty sure a normal cmp would have the desired effect. Especially because this isn't really correct: if lhs is u32::MAX and rhs is 1, then this will be wrong. I really don't see what the special use of this is, except horrible jank.

Member Author

Ah, for some reason I was treating u32::MAX as an exclusive upper bound, while it is inclusive. In any case we need this method: values near u32::MAX are functionally less than small values.

Member Author @zmerp, Aug 4, 2025

Actually, the off-by-one error here is really inconsequential; we just need a rough midpoint of the u32 range. It's very unlikely that we would get packets from the network with a (wrapping) distance near u32::MAX / 2, which is the point of this method. Mind that packets are numbered per-stream.
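
A quick demonstration of the wrapping comparison's behavior at the wrap-around point (the function body is copied from the diff above):

```rust
use std::cmp::Ordering;

// Wrapping sequence-number comparison, as in the diff above: a value
// that recently wrapped past 0 is considered newer than one near MAX.
fn wrapping_cmp(lhs: u32, rhs: u32) -> Ordering {
    let diff = lhs.wrapping_sub(rhs);
    if diff == 0 {
        Ordering::Equal
    } else if diff < u32::MAX / 2 {
        Ordering::Greater
    } else {
        // diff > u32::MAX / 2 means the subtraction wrapped
        Ordering::Less
    }
}

fn main() {
    // Across the wrap point: 1 comes "after" u32::MAX in sequence order,
    // while a plain u32 cmp would say the opposite.
    assert_eq!(wrapping_cmp(1, u32::MAX), Ordering::Greater);
    assert_eq!(wrapping_cmp(u32::MAX, 1), Ordering::Less);
    // Away from the wrap point it behaves like an ordinary cmp.
    assert_eq!(wrapping_cmp(10, 5), Ordering::Greater);
    assert_eq!(wrapping_cmp(5, 5), Ordering::Equal);
}
```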

}
}

fn socket_peek(socket: &mut Socket, buffer: &mut [u8]) -> ConResult<usize> {

Collaborator

Pretty sure socket2 implements peek now; no need to have this.

Member Author

socket2 has supported peek for 5 years (checked via GitHub blame). But if you look, the implementations are different: the premade implementation was missing some flags, which was causing unnecessary errors and connection drops.

@zmerp zmerp force-pushed the protocol-cleanup branch from fe2718b to 1a70790 Compare August 11, 2025 15:44
@zmerp zmerp force-pushed the protocol-cleanup branch 2 times, most recently from 299bac0 to fb4b572 Compare October 8, 2025 14:03
@zmerp zmerp force-pushed the protocol-cleanup branch 3 times, most recently from c662089 to 9a39de7 Compare October 20, 2025 15:50
@zmerp zmerp marked this pull request as ready for review October 20, 2025 15:51
2 participants