Receiving websocket payloads larger than ~4K yields garbled data #214
(BTW, I'll also gladly accept the news that I'm somehow running afoul of some necessary yet missing websocket handling 😟)
When I attempt to send much larger payloads (say, 20 MB), even stranger things happen:
...with pages and pages of log output indicating that, in this case, tons of 99-byte messages were received. That larger message often seems to be something close to 131000 bytes (presumably 131072, i.e. 128 KiB), which seems just as suspiciously round a number as the ~4K threshold I noted in my original report. 🤔
This is almost certainly a problem in Dream or websocket/af. I unfortunately won't be able to look at this immediately, but I am reading the very helpful information and will try to address this soon.
I fished around in both Dream and websocketaf hoping to track down the specific cause of this, but without ultimate success. Some tidbits that might be helpful, though:
My hunch is that websocketaf is what's misbehaving. If Dream's streams were at fault, then things like large multipart form uploads should be breaking too; and that "broken close frame" report is probably not something that Dream's higher-level websocket support could provoke?
Edit: actually, doing my own framing doesn't work either. As before, the correct amount of data is sent (in this case, per 3 KB message), but various parts of those "frames" contain, AFAICT, random data. Maybe there is a timing issue provoked by "large" amounts of data going through a websocket connection per some unit of time, which affects both large multi-frame messages and largish numbers of smaller messages sent very quickly? I'm declaring defeat on my end for now.
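For concreteness, the application-level framing mentioned in that edit amounts to something like the sketch below. This is my illustration, not the author's code; `send_in_pieces` and its `send` parameter are hypothetical names, with `send` standing in for `Dream.send` or any `string -> unit Lwt.t` function.

```ocaml
(* Hypothetical illustration of the application-level framing described above:
   split a large payload into pieces of at most [piece_size] bytes (e.g. 3000)
   and send each piece as its own websocket message, instead of letting the
   websocket layer fragment one large message. *)
let send_in_pieces ~send ~piece_size payload =
  let length = String.length payload in
  let rec loop offset =
    if offset >= length then Lwt.return_unit
    else
      let len = min piece_size (length - offset) in
      Lwt.bind (send (String.sub payload offset len)) (fun () ->
        loop (offset + len))
  in
  loop 0
```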
I've recently been debugging this, and it appears to be an upstream Websocket/af issue, as probably also suggested by anmonteiro/httpun-ws#48, because it can be worked around by changing the buffer size in Websocket/af. As far as I can tell, Dream already receives garbled data for fragments after the first, and I don't think Dream is clobbering, or can clobber, Websocket/af's internal buffer. So I don't see how Dream can fix the reassembly of these messages. cc @anmonteiro
Yes, it's definitely a problem upstream of Dream, and @anmonteiro and I have sporadically gone back and forth on the subject. I think I currently owe him a minimal repro app, which I hope to assemble in the next week or so.
In case it's useful, here is the diff from my "instrumented" Dream that I used to observe the issue and check that Dream gets garbled buffers already from Websocket/af:

```diff
diff --git a/src/http/shared/dune b/src/http/shared/dune
index 31b3c06..7b9d528 100644
--- a/src/http/shared/dune
+++ b/src/http/shared/dune
@@ -5,6 +5,7 @@
bigstringaf
dream-pure
dream-httpaf.dream-websocketaf
+ cstruct
)
(preprocess (pps lwt_ppx))
(instrumentation (backend bisect_ppx)))
diff --git a/src/http/shared/websocket.ml b/src/http/shared/websocket.ml
index 358d577..fab93a9 100644
--- a/src/http/shared/websocket.ml
+++ b/src/http/shared/websocket.ml
@@ -134,6 +134,7 @@ let websocket_handler stream socket =
log "Unknown frame type with length %i" length); *)
read ~data ~flush ~ping ~pong ~close ~exn
| Some (`Data properties, payload) ->
+ prerr_endline "assigning current_payload";
current_payload := Some (properties, payload);
read ~data ~flush ~ping ~pong ~close ~exn
end
@@ -141,6 +142,7 @@ let websocket_handler stream socket =
Websocketaf.Payload.schedule_read
payload
~on_read:(fun buffer ~off ~len ->
+ Printf.eprintf "on_read: %s %i\n%!" (Cstruct.to_string ~off ~len:16 (Cstruct.of_bigarray buffer)) len;
match !last_chunk with
| None ->
last_chunk := Some (buffer, off, len);
@@ -150,6 +152,7 @@ let websocket_handler stream socket =
let binary = binary = `Binary in
data last_buffer last_offset last_length binary false)
~on_eof:(fun () ->
+ prerr_endline "clearning current_payload";
current_payload := None;
match !last_chunk with
| None ->
diff --git a/src/pure/message.ml b/src/pure/message.ml
index e96da42..86422c0 100644
--- a/src/pure/message.ml
+++ b/src/pure/message.ml
@@ -315,6 +315,7 @@ let receive_fragment stream =
in
let text_or_binary = if binary then `Binary else `Text in
let end_of_message = if fin then `End_of_message else `Continues in
+ Printf.eprintf "data prefix: %s\n%!" (String.sub string 0 16);
Lwt.wakeup_later
resolver (Some (string, text_or_binary, end_of_message)))
@@ -344,16 +345,20 @@ let receive_full stream =
| None ->
Lwt.return (Some (acc, text_or_binary))
| Some (fragment, _, `End_of_message) ->
+ Printf.eprintf "End continuations %i %s\n%!" (String.length fragment) (String.sub fragment 0 16);
Lwt.return (Some (acc ^ fragment, text_or_binary))
| Some (fragment, _, `Continues) ->
+ Printf.eprintf "Interior continuation %i\n%!" (String.length fragment);
receive_continuations text_or_binary (acc ^ fragment)
in
match%lwt receive_fragment stream with
| None ->
Lwt.return_none
| Some (fragment, text_or_binary, `End_of_message) ->
+ Printf.eprintf "Full message %i\n%!" (String.length fragment);
Lwt.return (Some (fragment, text_or_binary))
| Some (fragment, text_or_binary, `Continues) ->
+ Printf.eprintf "First fragment %i %s\n%!" (String.length fragment) (String.sub fragment 0 16);
receive_continuations text_or_binary fragment
 let receive stream =
```

This should apply over 313fe46. We were testing this with your repro case by opening a browser tab and asking it to send 4200 bytes to the server.
Commenting here for both this one and #230. The general situation of my websocketaf fork is that it's an experiment to check whether it could work. Unfortunately, and as you've noticed in Dream, it's clearly not in a state where it should be used at all right now (unlike h2/httpaf, which are running production workloads). I suppose the former doesn't matter when it's indeed used in Dream, and we should strive to make the library better. Unfortunately, I don't have any free time to spare on websocketaf right now – there seems to be interest in using it from other community folks too. I'm happy to review PRs or otherwise talk about a paid engagement to make websocketaf ready for Dream.
Hello, I'm also hit by this bug. I'd be happy to help with it! @anmonteiro commented here:
However, I still see the problem occurring on Dream (master branch), which seems to have included all commits from @anmonteiro's fork. Am I missing something that would have addressed the garbled data? If the issue is still open, I would appreciate some pointers on how to investigate it!
I finally found the root cause of this issue and have fixed it in anmonteiro/httpun-ws#68 |
Here's a minimal reproduction that lets one send arbitrarily-sized websocket payloads to the server; sizes > ~4K result in garbled (seemingly arbitrary) data:
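The repro code itself isn't reproduced above; for orientation, a hedged sketch of the kind of server involved, written against Dream's documented websocket API (Dream.websocket, Dream.receive, Dream.close_websocket) rather than the original repro app, might look like this:

```ocaml
(* A sketch, not the original repro: a Dream server that logs the length and a
   prefix of every websocket message it receives. Requires lwt_ppx for
   match%lwt. *)
let () =
  Dream.run
  @@ Dream.logger
  @@ Dream.router [
    Dream.get "/websocket" (fun _ ->
      Dream.websocket (fun websocket ->
        let rec loop () =
          match%lwt Dream.receive websocket with
          | Some message ->
            (* Log how much arrived and what the first bytes look like; with
               all-'!' payloads, any other byte here is garbled data. *)
            Dream.log "received %d bytes, prefix %S"
              (String.length message)
              (String.sub message 0 (min 16 (String.length message)));
            loop ()
          | None ->
            Dream.close_websocket websocket
        in
        loop ()));
  ]
```

Pointing a browser-side client at /websocket and sending a payload of a few thousand '!' characters should make any garbling visible in the logged prefixes.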
A sample output log looks like this (note that the messages being sent by the client consist entirely of exclamation points, just to easily see where the bad data starts):
Of course, maybe there's another perfectly reasonable cause, but it sure feels like the websocket facility is successfully handling the first "frame" of data (which is probably 4096 bytes, including some header data, yielding something like 4000 bytes of expected data), but then only increases the extent of the resulting string without actually copying over the payload of websocket "frames" >= 2.
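To make that suspected failure mode concrete: correct reassembly of a fragmented message has to copy every fragment's payload into the growing result, not merely extend its length. Here is an illustrative sketch of that copying (my own, not websocket/af's or Dream's actual code; `receive_fragment` is a hypothetical stand-in for whatever yields successive fragments):

```ocaml
(* Illustrative only: reassembling a fragmented message. A Buffer makes the
   per-fragment copy explicit; every fragment's bytes must be appended, not
   just the first fragment's. *)
let reassemble receive_fragment =
  let buffer = Buffer.create 4096 in
  let rec loop () =
    match receive_fragment () with
    | None -> None
    | Some (payload, `Continues) ->
      Buffer.add_string buffer payload;
      loop ()
    | Some (payload, `End_of_message) ->
      Buffer.add_string buffer payload;
      Some (Buffer.contents buffer)
  in
  loop ()
```

If the copy for fragments after the first were skipped while the length bookkeeping still advanced, the result would look exactly like the output described above: a correctly-sized string whose tail is whatever happened to be in the buffer already.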