Description
A friend of mine has an interesting case where hackney cannot reliably establish an SSL session through an HTTP proxy. I have access to packet traces from the wire which I unfortunately cannot share freely, but they let us narrow down the difference between the successful and the failing connection attempts. When the proxy's whole HTTP response to the CONNECT request was contained in a single TCP segment, the SSL connection was established successfully. When the response was split across 2 TCP segments, ssl_connect/2 returned a terminal {error,{tls_alert,"record overflow"}}. This error usually means that ssl_connect read garbage from the socket.
We spent some time investigating how hackney checks whether a tunnel has been established successfully, and what criteria the RFCs define. From https://tools.ietf.org/html/rfc7231#section-4.3.6: "Any 2xx (Successful) response indicates that the sender (and all inbound proxies) will switch to tunnel mode immediately after the blank line that concludes the successful response's header section". hackney's https://github.com/benoitc/hackney/blob/master/src/hackney_http_connect.erl has a function check_response(Socket) which basically waits for a segment of any size to arrive on the socket, then uses check_status/1 to pattern match the response code and the supported protocol version, and discards the rest of what was read from the socket. This check is prone to break in real-life scenarios, because a single read from the socket is not guaranteed to return the complete HTTP response (including the terminating blank line) and may leave the rest of the response in the socket buffer (either because it was not returned by that one gen_tcp:recv, or because it simply arrived later). ssl_connect is then called to upgrade the session to TLS: it sends the client hello and tries to read the server hello records, but instead reads the rest of the HTTP response still lingering in the socket buffer, and errors out.
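To make the failure mode concrete, here is a minimal sketch of the "read until the blank line" rule from the RFC. It is written in Python rather than Erlang purely for illustration and is not hackney's code: `read_connect_response`, `FakeSocket`, and the `max_header_bytes` limit are hypothetical names, and the only assumption about the socket is that it has a `recv(bufsize)` method that returns `b""` when the peer closes.

```python
HEADER_END = b"\r\n\r\n"  # blank line that concludes the header section


def read_connect_response(sock, max_header_bytes=65536):
    """Read the proxy's reply to CONNECT up to and including the blank line.

    Returns (status_code, leftover), where leftover holds any bytes that
    arrived after the blank line and therefore already belong to the tunnel.
    """
    buf = b""
    # Keep reading: one recv() may return only part of the response.
    while HEADER_END not in buf:
        if len(buf) > max_header_bytes:
            raise ValueError("proxy response headers too large")
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed before the end of the headers")
        buf += chunk

    head, _, leftover = buf.partition(HEADER_END)
    status_line = head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/1."):
        raise ValueError("malformed status line: %r" % status_line)
    status = int(parts[1])
    if not 200 <= status < 300:
        raise ConnectionError("proxy refused CONNECT with status %d" % status)
    return status, leftover


class FakeSocket:
    """Hypothetical stand-in for a socket: each recv() returns one segment."""

    def __init__(self, segments):
        self._segments = list(segments)

    def recv(self, bufsize):
        return self._segments.pop(0) if self._segments else b""


# The response arrives in two TCP segments, as in the failing trace.
split = FakeSocket([b"HTTP/1.1 200 Connection established\r\nVia: 1.1 p",
                    b"roxy\r\n\r\n"])
status, leftover = read_connect_response(split)
```

With this approach the TLS handshake would start only after the whole header section has been consumed, so no part of the HTTP response is left in the socket buffer for the TLS layer to misread as a record.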
I am not committing to provide a patch yet, as this does not hurt me directly yet ;-) but a complete implementation of HTTP response parsing would be appreciated by anyone who has hit this bug.