Commit 2e768fa

perf: skip H2 for full-tunnel batch requests
Full-tunnel batches already coalesce N ops into one HTTP request, so H2 stream multiplexing adds no benefit: there's nothing to multiplex. Worse, H2 introduces measurable regressions:

- Long-poll batches complete at 16–17s (LONGPOLL_DEADLINE + latency) instead of timing out at 10s on H1. Each idle poll holds an Apps Script execution slot 60% longer, reducing available concurrency.
- NonRetryable error path (RequestSent::Maybe) silently drops batches with no retry, a data loss the H1 path doesn't have.
- POOL_MIN_H2_FALLBACK trims the H1 pool from 8 to 2 once H2 lands, starving tunnel batches that still need the H1 pool.

A/B tested on Pixel 6 Pro (30 batch samples each):

| Metric         | H2 (stock v1.9.20) | H1 (this PR) | v1.9.14  |
|----------------|--------------------|--------------|----------|
| 16–17s batches | 8–10/30            | 0/30         | 0/30     |
| 10s timeouts   | 0                  | 4/30         | 5/30     |
| Active RTTs    | 1.4–2.4s           | 1.3–2.2s     | 1.4–2.3s |

Changes:

- tunnel_batch_request_to: skip h2_relay_request, go straight to H1 pool acquire(). Removes the H2 try/fallback/NonRetryable block.
- run_pool_refill: always maintain POOL_MIN (8) connections. The H2-era POOL_MIN_H2_FALLBACK (2) trim starved tunnel batches; with tunnel traffic on H1 the pool must stay at full capacity.

H2 multiplexing remains active for relay mode (non-full) where it genuinely helps: each browser request is a separate HTTP call that benefits from stream multiplexing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
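
The "60% longer" slot-hold figure in the commit message follows from the 16s lower bound against the 10s H1 timeout. A minimal sketch of that arithmetic, for illustration only: the function names are hypothetical, and the slot count used in the usage note is an assumed number, not a value from this commit or from Apps Script.

```rust
/// Fractional increase in how long one idle poll holds an execution slot
/// (0.6 means 60% longer).
fn extra_hold_fraction(h1_secs: f64, h2_secs: f64) -> f64 {
    (h2_secs - h1_secs) / h1_secs
}

/// Idle polls that `slots` concurrent execution slots can complete per
/// minute when each poll holds its slot for `hold_secs`.
fn polls_per_minute(slots: u32, hold_secs: f64) -> f64 {
    f64::from(slots) * 60.0 / hold_secs
}
```

With an assumed 30 slots, `polls_per_minute(30, 10.0)` gives 180 idle polls per minute on H1 and `polls_per_minute(30, 16.0)` gives 112.5 on H2, which is the concurrency loss the first bullet describes.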
1 parent 9611279 commit 2e768fa

1 file changed

Lines changed: 12 additions & 66 deletions

src/domain_fronter.rs

@@ -983,24 +983,20 @@ impl DomainFronter {
         }
     }

-    /// Background loop that keeps the h1 fallback pool warm.
+    /// Background loop that keeps the h1 pool warm.
     ///
-    /// Target depends on whether the h2 fast path is active:
-    /// - h2 disabled (or peer refused ALPN h2): keep `POOL_MIN` (8)
-    /// sockets so the per-request acquire never pays a cold handshake
-    /// — the pre-h2 default behavior.
-    /// - h2 active: keep just `POOL_MIN_H2_FALLBACK` (2). All real
-    /// traffic rides the multiplexed h2 connection; the h1 pool only
-    /// exists to cover the case where h2 dies and we need to fall
-    /// back instantly without a cold handshake.
+    /// Always maintains `POOL_MIN` (8) connections. Full-tunnel mode
+    /// uses the h1 pool for all batch traffic (h2 is skipped for
+    /// tunnel batches), so the pool must stay at full capacity
+    /// regardless of h2 status. Relay mode also benefits from a warm
+    /// pool as h1 fallback.
     ///
     /// A connection only counts toward the minimum if it has at least
     /// 20 s of TTL remaining — nearly-expired entries don't help.
     /// Checks every `POOL_REFILL_INTERVAL_SECS`, evicts expired entries,
     /// and opens replacements one at a time so there's no burst.
     pub async fn run_pool_refill(self: Arc<Self>) {
         const MIN_REMAINING_SECS: u64 = 20;
-        const POOL_MIN_H2_FALLBACK: usize = 2;
         loop {
             tokio::time::sleep(Duration::from_secs(POOL_REFILL_INTERVAL_SECS)).await;

@@ -1010,24 +1006,7 @@ impl DomainFronter {
                 pool.retain(|e| e.created.elapsed().as_secs() < POOL_TTL_SECS);
             }

-            // Decide target. We treat "h2 active right now" as having a
-            // fresh, non-poisoned cell. h2_disabled is the sticky flag
-            // (peer never agreed to h2); a transient cell-poison after
-            // h2 success briefly drops back to the larger target until
-            // ensure_h2 reopens.
-            let target = if self.h2_disabled.load(Ordering::Relaxed) {
-                POOL_MIN
-            } else {
-                let cell = self.h2_cell.lock().await;
-                let h2_alive = cell
-                    .as_ref()
-                    .map(|c| {
-                        c.created.elapsed().as_secs() < H2_CONN_TTL_SECS
-                            && !c.dead.load(Ordering::Relaxed)
-                    })
-                    .unwrap_or(false);
-                if h2_alive { POOL_MIN_H2_FALLBACK } else { POOL_MIN }
-            };
+            let target = POOL_MIN;

             // Count only connections with enough life left.
             // Refill one at a time to avoid bursting TLS handshakes.
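
The counting rule in the `run_pool_refill` doc comment (an entry counts only if it has at least 20 s of TTL left, and the target is now always `POOL_MIN`) can be sketched in isolation. This is a sketch under stated assumptions, not the code in `src/domain_fronter.rs`: `PoolEntry`, `healthy_count`, and `refill_deficit` are hypothetical names, and the `POOL_TTL_SECS` value is an assumed placeholder because the diff does not show it.

```rust
use std::time::{Duration, Instant};

const POOL_MIN: usize = 8; // value quoted in the diff
const MIN_REMAINING_SECS: u64 = 20; // value quoted in the diff
const POOL_TTL_SECS: u64 = 45; // ASSUMPTION: real value is not shown in the diff

/// Hypothetical stand-in for a pooled h1 connection.
struct PoolEntry {
    created: Instant,
}

/// Count entries that still have at least MIN_REMAINING_SECS of TTL left.
fn healthy_count(pool: &[PoolEntry], now: Instant) -> usize {
    pool.iter()
        .filter(|e| {
            let age = now.saturating_duration_since(e.created).as_secs();
            age + MIN_REMAINING_SECS < POOL_TTL_SECS
        })
        .count()
}

/// Connections still missing to reach POOL_MIN. The caller would open
/// them one at a time to avoid a burst of TLS handshakes.
fn refill_deficit(pool: &[PoolEntry], now: Instant) -> usize {
    POOL_MIN.saturating_sub(healthy_count(pool, now))
}
```

Before this commit the same deficit would have been computed against a target of 2 whenever h2 was alive, so a pool with three healthy sockets reported no deficit even though tunnel batches could fall back onto it.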
@@ -2876,44 +2855,11 @@ impl DomainFronter {

         let path = format!("/macros/s/{}/exec", script_id);

-        // h2 fast path. A batch carries N stateful tunnel ops — each
-        // `data`/`udp_data`/`connect` may have already executed
-        // upstream when the response framing failed. Replaying the
-        // whole batch on h1 risks duplicating every op in it. Only
-        // fall back when h2 definitely never sent. Honors
-        // user-configured batch_timeout so a slow but legitimate
-        // batch isn't cut off at an arbitrary fixed cap.
-        match self
-            .h2_relay_request(&path, payload.clone(), self.batch_timeout)
-            .await
-        {
-            Ok((status, _hdrs, _resp_body)) if is_h2_fronting_refusal_status(status) => {
-                // Edge rejected the batch before forwarding. Safe to
-                // fall back: no batched op reached Apps Script, so
-                // replaying via h1 won't double-fire any of them.
-                self.sticky_disable_h2_for_fronting_refusal(status, "tunnel batch")
-                    .await;
-                // fall through to h1
-            }
-            Ok((status, _hdrs, resp_body)) => {
-                return self.finalize_batch_response(script_id, status, resp_body);
-            }
-            Err((e, RequestSent::No)) => {
-                tracing::debug!(
-                    "h2 batch request pre-send failure: {} — falling back to h1",
-                    e
-                );
-            }
-            Err((e, RequestSent::Maybe)) => {
-                tracing::warn!(
-                    "h2 batch request post-send failure: {} — \
-                     not replaying on h1 to avoid duplicating batched ops",
-                    e
-                );
-                return Err(e);
-            }
-        }
-
+        // Skip h2 for tunnel batches. Batched ops are already coalesced
+        // into one HTTP request so h2 multiplexing adds no benefit.
+        // The h1 pool path is simpler and avoids h2-specific overhead
+        // (ready timeout, NonRetryable errors, concurrent stream
+        // contention with long-poll batches).
         let mut entry = self.acquire().await?;

         let req_head = format!(
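
For reviewers, the removed h2-first block reduces to a four-way decision, sketched below under stated assumptions. `RequestSent` mirrors the variant names visible in the removed code; `H2Outcome`, `NextStep`, and `old_h2_first_step` are hypothetical names introduced only for this illustration, and `edge_refused` stands in for the result of `is_h2_fronting_refusal_status`, whose status list the diff does not show.

```rust
/// Mirrors the variants used in the removed code: did the request
/// possibly reach the peer before the failure?
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RequestSent {
    No,
    Maybe,
}

/// Hypothetical summary of what the h2 attempt produced.
#[derive(Debug, Clone, Copy)]
enum H2Outcome {
    Response { edge_refused: bool },
    Failed(RequestSent),
}

/// Hypothetical summary of what the caller did next.
#[derive(Debug, PartialEq, Eq)]
enum NextStep {
    ReturnResponse,
    FallBackToH1,
    DropBatch,
}

/// Decision table of the removed block. Only a post-send failure drops
/// the batch, because replaying it could duplicate ops already executed.
fn old_h2_first_step(outcome: H2Outcome) -> NextStep {
    match outcome {
        H2Outcome::Response { edge_refused: true } => NextStep::FallBackToH1,
        H2Outcome::Response { edge_refused: false } => NextStep::ReturnResponse,
        H2Outcome::Failed(RequestSent::No) => NextStep::FallBackToH1,
        H2Outcome::Failed(RequestSent::Maybe) => NextStep::DropBatch,
    }
}
```

The `DropBatch` arm is the data-loss path named in the commit message. With tunnel batches going straight to `acquire()`, none of these branches exist on the batch path.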
