-
Notifications
You must be signed in to change notification settings - Fork 35
Offline Mode: Reject external http requests when network is offline #1491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Rahul for improving this! I think the approach is good. Ideally, we would prefer that the queries failed fast as you mention in your description, but this simulates that behaviour through the http header, which works. I have tested it in Windows and Mac and it works as expected, it does not return any timeouts. LGTM!
Windows in offline mode
CleanShot.2025-06-20.at.12.51.40.mp4
Windows when turning on the internet connection
CleanShot.2025-06-20.at.12.53.26.mp4
Also, I would be interested in getting feedback from other team members
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one!
Really an improvement compared to the initial approach.
LGTM 👍
@adamziel, does this problem affect Playground in the browser or Playground CLI? |
*/ | ||
export async function isOnline(hostname: string = 'google.com'): Promise<boolean> { | ||
try { | ||
await dns.promises.lookup(hostname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any chance this would still be served from a dns cache when offline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is definitely possible. However, I haven't faced any issues during the development and testing of this PR, so I cannot confirm definitively.
@wojtekn interesting! Playground CLI and web both go through this code path that disables network access during the installation: As for the initial |
Note while this PR solves the problem for
Interesting! What's the specific causation link between not supporting the DNS resolution and the requests not failing quickly? I'm not sure how they're related, and we could support DNS resolution in Node if that's worthwhile. Also, note that while PHP may not be DNS-aware, we still do DNS resolution in the outbound network proxy: If that doesn't fail quickly, it seems like a bug in that proxy. If it can be fixed there, it would solve for all possible ways of talking to the network. |
Thanks for flagging this, @adamziel. I didn't think about that. I think we should be able to set
So, the following were my observations while setting up the site in offline mode. I added HTTP requests profiling in WordPress site using
Here are the actual logs, the request timeout was set to 30s:
Ideally, we want to fail these requests under ~20 ms as native PHP when we are offline. I hope this helps to understand where I am coming from.
Interesting. Could the local DNS cache cause this, or might there be another factor at play? |
Nice @gavande1! I'd just be careful to only do that for the installer, not for the entire site – many plugins use curl and fopen directly.
Lovely! I'd start with checking how well
^ it's mouthful and it's also the equivalent of just
In that script, you might try connecting to, say, https://w.org when your computer is offline. If it takes a long time, that's the issue right there. From there, I'd play with that proxy file I've linked to earlier and see if it, indeed, takes 30 seconds before it communicates the failure. If yes, that's where we can fix it. If not, we're back to the drawing board as the problem might be somewhere else. You might also try it on node v23 with JSPI enabled – it changes how all async calls are handled:
|
@gavande1 I've played with this and this patch seems to be helping: diff --git a/isomorphic-git b/isomorphic-git
--- a/isomorphic-git
+++ b/isomorphic-git
@@ -1 +1 @@
-Subproject commit cdca7e5dbf9bc4654eab3465ceab64a54ab30a76
+Subproject commit cdca7e5dbf9bc4654eab3465ceab64a54ab30a76-dirty
diff --git a/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts b/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
index a1205bc252..7db7e936ea 100644
--- a/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
+++ b/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
@@ -207,6 +207,7 @@ async function onWsConnect(client: any, request: http.IncomingMessage) {
// Send empty binary data to notify requester that connection was
// initiated
client.send([]);
+ await new Promise(resolve => setTimeout(resolve, 0))
client.close(3000);
return;
} This script seems to be consistently failing in 30 milliseconds: <?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://wordpress.org');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 3);
$time = microtime(true);
echo "Before curl exec\n";
$content = curl_exec($ch);
echo "After curl exec (".(microtime(true) - $time).")\n";
curl_close($ch);
var_dump(strlen($content)); Whereas without the await part, it sometimes fails and sometimes hangs until the 3 seconds timeout runs out. One possible explanation is that PHP expects the socket to go through a certain state transition between "connection initiated" and "connection closed," and closing the connection synchronously doesn't give it a chance to notice that. Another explanation is that Node is doing some socket-level optimizations and skips a step if we close right after sending an empty buffer. Either way, skipping one event loop tick seems to remedy the problem. Would you be willing to take that patch for a spin and confirm it helped? You could even edit the Studio |
Thanks for taking a look at it and playing around.
Sure thing. Let me try and get back to you. Hoping for the best. |
@adamziel I tested this briefly locally by applying a monkey patch. I didn't see any improvements in the Studio in offline mode. I will test it again tomorrow with a fresh mind and share detailed results. |
Hi @adamziel, I tested the proposed changes by cloning the Playground repository and running it locally. As you mentioned, when using the PHP tester script directly, the PHP request fails immediately as expected:
However, when I try to run the same through Studio, it does not seem to reflect any change. I suspect I might be doing something wrong when trying to apply the patch in Studio. It looks like I cannot directly modify I also tried pointing "@php-wasm/node": "file:/Users/USER/projects/wordpress-playground/dist/packages/php-wasm/node",
"@php-wasm/scopes": "file:/Users/USER/projects/wordpress-playground/dist/packages/php-wasm/scopes",
"@php-wasm/universal": "file:/Users/USER/projects/wordpress-playground/dist/packages/php-wasm/universal",
"@php-wasm/node-polyfills": "file:/Users/USER/projects/wordpress-playground/dist/packages/php-wasm/node-polyfills", But that did not help either. The PHP requests in Studio still take the full timeout duration. At this point, I am unsure whether my local changes are being applied correctly when running Studio with the patched version or if the patch is applied correctly, but requests fail as before. Do you have any suggestions on how I can confirm or debug this more effectively? Thanks! |
@gavande1 the code is probably a part of |
Thanks @adamziel for the tip. The code was indeed in I was looking at the |
@gavande1 Oh, too bad that didn't help :( How can I build Studio locally? I'd like to reproduce the issue somehow. Btw, if you add a console.log() beside adding that patch, would the logged message show up in the terminal? It's a relatively easy way of verifying whether the updated code is running. |
Hey @adamziel, here is how you can run Studio locally:
This will launch the Studio app locally. If you make changes to the Node code, you will need to run
Yes, I tested it with a
|
Thank you! I'm on a meetup this week. I'll look into it as much as I can, but it may go much slower until Monday. |
No problem. FYI, I have been trying to understand the more advanced aspects of the PHP wasm environment, such as how WebSocket communication works. But it seems too much to grasp at once. Do we have an architecture doc where I can learn more about the internals? |
Yes we do - here: https://wordpress.github.io/wordpress-playground/developers/architecture |
Thank you 👍 |
@adamziel Just checking if you've had a chance to look at this. If not, no worries. |
@gavande1 I just took this for a spin. I didn't have enough time available to run it in Studio so I've continued with Playground CLI. I've reproduced the slow admin loading – it happened in two scenarios:
Here's a patch that solves for 1 and outputs detailed debug information to see if 2 might be the case on your computer. Would you please apply it and share the outcomes as well as the CLI output log? I'll be AFK until the end of the week, but @mho22 might be able to help while I'm away. diff --git a/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts b/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
index a1205bc252..af899cc309 100644
--- a/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
+++ b/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts
@@ -15,7 +15,7 @@ import { WebSocketServer } from 'ws';
import { debugLog } from './utils';
function log(...args: any[]) {
- debugLog('[WS Server]', ...args);
+ console.log('[WS Server]', ...args);
}
const lookup = util.promisify(dns.lookup);
@@ -120,7 +120,7 @@ export function initOutboundWebsocketProxyServer(
async function onWsConnect(client: any, request: http.IncomingMessage) {
const clientAddr = client?._socket?.remoteAddress || client.url;
const clientLog = function (...args: any[]) {
- log(' ' + clientAddr + ': ', ...args);
+ log(`[${new Date().getTime()}] ` + ' ' + clientAddr + ': ', ...args);
};
clientLog(
@@ -191,7 +191,9 @@ async function onWsConnect(client: any, request: http.IncomingMessage) {
} as any);
client.on('error', function (a: string | Buffer) {
clientLog('WebSocket client error: ' + a);
- target.end();
+ if (target) {
+ target.end();
+ }
});
// Resolve the target host to an IP address if it isn't one already
@@ -204,10 +206,12 @@ async function onWsConnect(client: any, request: http.IncomingMessage) {
clientLog('resolved ' + reqTargetHost + ' -> ' + reqTargetIp);
} catch (e) {
clientLog("can't resolve " + reqTargetHost + ' due to:', e);
- // Send empty binary data to notify requester that connection was
- // initiated
client.send([]);
- client.close(3000);
+ setTimeout(() => {
+ // Send empty binary data to notify requester that connection was
+ // initiated
+ client.close(3000);
+ })
return;
}
} else {
@@ -238,7 +238,12 @@ async function onWsConnect(client: any, request: http.IncomingMessage) {
});
target.on('error', function (e: any) {
clientLog('target connection error', e);
- target.end();
- client.close(3000);
+ setTimeout(() => {
+ client.send([]);
+ client.close(3000);
+ if (target) {
+ target.end();
+ }
+ })
});
} @mho22, here's the Playground CLI command I've used to trigger this. You need to run it once when online, and then it will also work when you go offline. DEV=1 node --disable-warning=ExperimentalWarning --experimental-strip-types --experimental-transform-types --import ./packages/meta/src/node-es-module-loader/register.mts ./packages/playground/cli/src/cli.ts server --php=7.4 --wp=trunk |
Thanks @adamziel for looking into it. I am going to give it a try today and share my findings later. |
@adamziel @mho22, I applied the changes into CleanShot.2025-07-10.at.13.43.02.mp4
If it helps in any way, I have uploaded the entire console output to the following gist. https://gist.github.com/gavande1/086e3c24abb6086d5784f932f1163cc5 |
I am sorry @gavande1, I'm trying to reproduce the issue since two hours now but I can't. I followed those steps :
I have never faced any timing issue identical to yours. ![]() There is a RSS Error obviously but no freeze.
There's probably a missing step. |
Hi @mho22, thanks for trying that out, and apologies for the misunderstanding. Could you please try the same on the trunk branch? This branch already fixes the issue by applying the fix in the Studio repository. But Adam and I were trying to fix the issue on upstream so that other projects can benefit from the fix as well. |
Aha, silly me! I’ll try that now. Edit: Ok! I see the difference now. Thanks. Let's dig into this. |
Could you try this on your side?
} catch (e) {
clientLog("can't resolve " + reqTargetHost + ' due to:', e);
- client.send([]);
setTimeout(() => {
// Send empty binary data to notify requester that connection was
// initiated
+ client.send([]);
client.close(3000);
})
return;
} It may sound overly simple, but it's worth a try. |
I tried it and it worked for the first couple of requests, but then requests started to take full time to time out. I have attached the video of the behaviour. The last reload took a full 30 seconds. CleanShot.2025-07-10.at.18.26.19.mp4 |
We did make some progress. I think what's happening now is related to multiple websocket servers bound at the same time.
I'm investigating. |
Interesting observation. I also noticed that, but ignored it, thinking it might be due to multiple workers creating their own connections. But I think it makes sense now. Ideally, we would want a single WebSocket server, right? |
I think I've found something : each instance of the Dashboard is trying to connect to the first created webserver. This is not problematic when you load the dashboard every, let's say, 5 seconds, one at a time. But when you load two dashboards at the same time, the first one will run normally because it reaches the webserver first, while the second does not and waits until timeout. The following logs shows that :
As you can see every call here are handled by
But the first one remains the To log the Connection handled by server message I added these lines in const serverHost = client?._socket?.localAddress;
const serverPort = client?._socket?.localPort;
clientLog(`Connection handled by server at ${serverHost}:${serverPort}`); |
I also tried to comment out the entire // packages/php-wasm/node/src/lib/networking/with-networking.ts
async function withNetworking(phpModuleArgs = {}) {
const [inboundProxyWsServerPort, outboundProxyWsServerPort] = await findFreePorts(2);
log( "Inbound Proxy Ws Server Port : " + inboundProxyWsServerPort);
log( "Outbound Proxy Ws Server Port : " + outboundProxyWsServerPort);
const outboundNetworkProxyServer = await initOutboundWebsocketProxyServer(
outboundProxyWsServerPort
);
return {
...phpModuleArgs,
outboundNetworkProxyServer,
// websocket: {
// ...phpModuleArgs["websocket"] || {},
// url: (_, host, port) => {
// const query = new URLSearchParams({
// host,
// port
// }).toString();
// return `ws://127.0.0.1:${outboundProxyWsServerPort}/?${query}`;
// },
// subprotocol: "binary",
// decorator: addSocketOptionsSupportToWebSocketClass,
// serverDecorator: addTCPServerToWebSocketServerClass.bind(
// null,
// inboundProxyWsServerPort
// )
// }
};
} |
Without those lines, network requests won't go out at all - so there's no timeout but not in a useful way. The concurrency insight is interesting. I wonder if it's about the wasm server only using a single worker. I'll look into this again today |
@adamziel Certainly, I was aware that commenting out those lines wasn't the solution, but I wanted to document that discovery. Let me know if I can help. |
Yes, and the documenting is very useful indeed as it narrows down possible causes. Thank you! |
I've started drafting a test harness for the outbound network proxy in WordPress/wordpress-playground#2370 and found one failing scenario and one scenario that times out. Concurrent connections, weirdly, work just fine. I'll get these tests to pass and share the findings. |
I traced it down to WordPress/wordpress-playground#2260 – a fix that landed in Playground 1.1.3 (Studio uses 1.1.0). That's why I couldn't reproduce this with Playground CLI. The network proxy patches are still relevant and I'll land them in core, but to fix the problem reported here it seems like you only have to update to the latest Playground packages – which you'll get for free as a part of migration to Playground CLI. |
Related issues
Fixes STU-550
My previous PR #1478, has following problems:
Proposed Changes
I propose a solution of rejecting external requests when the network is offline. In this new approach, the network status is determined via a DNS query, which adds minimal overhead (~8ms). It doesn't require server restarts when the network status is changed.
How it Works
A custom header
STUDIO_IS_OFFLINE
is attached when the network is offline. If this header is detected in the mu-plugin, it intercepts all external requests and rejects them.Why this change?
AFAIU, PHP WASM does not have native DNS resolution capability, so it cannot immediately detect offline status.
As a result:
SSL_ERROR_SYSCALL
after a few seconds.This is the primary reason why the WordPress admin is significantly slower in offline mode.
Native PHP has DNS resolution capability and can fail fast when a request cannot be resolved. Other tools, such as Local by Flywheel and Laravel Herd, since they use native PHP (with DNS resolution), offline requests fail quickly as expected; hence, they don't require additional handling for offline mode.
Testing Instructions
Pre-merge Checklist