- 2 workers acting as HTTP servers, each listening on a different URL path on the same hostname
- 1 worker acting as the tail worker for both HTTP servers, handling logging and metrics reporting for each request
I'm currently on a trial of the Enterprise plan and discovered, through Logpush, that roughly 0.0003% of requests have a high EdgeTimeToFirstByteMs, ranging from 500 ms to 7 s. The logs also show that these outliers had CacheCacheStatus: hit.
The 2 main workers use node:diagnostics_channel to pass information to the tail worker. This seems related to the TTFB issue, so I'm posting the relevant part of the code here. More details and the full code can be shared once this is picked up by the Cloudflare support team.
Observed behavior
Across all requests, an average of 0.0003% have a TTFB over 500 ms, up to 7 s.
According to Cloudflare technical support's initial findings, the high-TTFB cases appear to be caused by subrequests from our workers to a third-party metrics vendor, reaching the IAD node.
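For anyone who wants to count these outliers the same way, here is a minimal sketch of the filter, assuming the Logpush output is newline-delimited JSON and includes the EdgeTimeToFirstByteMs and CacheCacheStatus fields. The function name, the 500 ms default threshold, and the record shape are illustrative assumptions, not our production tooling.

```typescript
// Only the two Logpush fields discussed above are modelled; real records carry more.
interface LogpushRecord {
  EdgeTimeToFirstByteMs: number
  CacheCacheStatus: string
}

interface OutlierSummary {
  total: number
  outliers: LogpushRecord[]
  ratio: number
}

// Hypothetical helper: parses NDJSON and returns the cache hits whose TTFB
// is at or above the threshold, together with their share of all records.
function summarizeSlowCacheHits(ndjson: string, thresholdMs = 500): OutlierSummary {
  const records = ndjson
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as LogpushRecord)

  const outliers = records.filter(
    (r) => r.CacheCacheStatus === 'hit' && r.EdgeTimeToFirstByteMs >= thresholdMs,
  )

  return {
    total: records.length,
    outliers,
    ratio: records.length === 0 ? 0 : outliers.length / records.length,
  }
}
```

Multiplying `ratio` by 100 gives a percentage comparable to the 0.0003% figure above.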
Expected behavior
All requests should have a consistent TTFB.
Based on our code review, the workers' code makes no subrequest to the third-party metrics vendor, except in the tail worker. No such subrequests should be observed from the 2 main workers.
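One way to double-check the code review is to record the hostname of every outbound subrequest a worker makes. The sketch below is a hypothetical debugging aid and not part of our deployed code: `withSubrequestLog`, `FetchLike`, and the `seenHosts` array are names introduced here for illustration.

```typescript
// Minimal input shape: a URL string, a URL object, or anything with a `url` field (e.g. a Request).
type FetchInput = string | URL | { url: string }
type FetchLike = (input: FetchInput, init?: unknown) => Promise<unknown>

// Hypothetical wrapper: pushes the target hostname into `seenHosts`
// synchronously, then delegates to the wrapped fetch implementation unchanged.
function withSubrequestLog(fetchImpl: FetchLike, seenHosts: string[]): FetchLike {
  return (input, init) => {
    const url =
      typeof input === 'string' ? input : input instanceof URL ? input.href : input.url
    seenHosts.push(new URL(url).hostname)
    return fetchImpl(input, init)
  }
}
```

If the main workers routed their outbound calls through such a wrapper, the metrics vendor's hostname should never appear in `seenHosts`.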
Steps to reproduce
A minimal working subset of your worker code:
// inside the fetch() function of index.ts
const originStart = Date.now()
const cache = caches.default
const cached = await cache.match(request.url)
if (cached) {
  const respHeaders = new Headers(cached.headers)
  respHeaders.set('Access-Control-Allow-Origin', '*')
  respHeaders.set('Vary', 'Origin')
  const resp = new Response(cached.body, {
    headers: respHeaders,
    status: cached.status,
  })
  sentinel.event({
    event: eventNameRedacted,
    request,
    response: resp,
    payload: {
      ...reqCtx,
      duration: Date.now() - originStart,
    },
  })
  sentinel.event({
    event: eventNameRedacted,
    request,
    response: resp,
    payload: reqCtx,
  })
  return resp
}
// inside the sentinel.ts
// datadog imported for calling a util method for generating the tags which doesn't make any HTTP request
import { Channel, channel } from 'node:diagnostics_channel'
import { SentinelEventType, SentinelMessage } from './types'
import { Datadog } from '../../shared/datadog'
import { DatadogPayload } from '../../shared/datadog.d'

export class Sentinel {
  private readonly createdAt: Date
  private readonly serviceName: string
  private readonly channel: Channel

  constructor(serviceName: string) {
    this.createdAt = new Date()
    this.serviceName = serviceName
    this.channel = channel(serviceName)
  }

  event(message: {
    event: SentinelEventType
    request?: Request
    response?: Response
    payload: DatadogPayload
  }): void {
    if (!Boolean(message.payload.duration)) {
      message.payload.duration = Date.now() - this.createdAt.getTime()
    }
    this.channel.publish({
      event: message.event,
      duration: message.payload.duration,
      log: {
        request: {
          // tag fields redacted
        },
        response: {
          // tag fields redacted
        },
      },
      metric: {
        // tag fields redacted
      },
    } as SentinelMessage)
  }
}
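For context on the receiving side, this is a rough sketch of how a tail worker could pick up the messages published above. It assumes the trace items passed to the tail handler expose a `diagnosticsChannelEvents` array with `channel` and `message` fields, as I understand the @cloudflare/workers-types definitions. The local `TraceItemLike` types, the `collectSentinelMessages` helper, and the `'service-a'` channel name are illustrative assumptions, not our actual tail worker.

```typescript
// Local stand-ins for the runtime's trace types (assumed shape; verify against your workers-types version).
interface DiagnosticChannelEventLike {
  timestamp: number
  channel: string
  message: unknown
}

interface TraceItemLike {
  scriptName: string | null
  diagnosticsChannelEvents: DiagnosticChannelEventLike[]
}

// Hypothetical helper: gathers every message published on the given channel
// across all trace items delivered to the tail handler.
function collectSentinelMessages(items: TraceItemLike[], channelName: string): unknown[] {
  const messages: unknown[] = []
  for (const item of items) {
    for (const evt of item.diagnosticsChannelEvents) {
      if (evt.channel === channelName) {
        messages.push(evt.message)
      }
    }
  }
  return messages
}

// Hypothetical tail worker: this is the only place where a request to the
// metrics vendor would be made. 'service-a' stands in for the serviceName
// passed to the Sentinel constructor.
const tailWorker = {
  async tail(items: TraceItemLike[]): Promise<void> {
    const messages = collectSentinelMessages(items, 'service-a')
    console.log(`received ${messages.length} sentinel message(s)`)
  },
}
```

In this arrangement the main workers only call `publish()`, so any vendor-bound subrequest would be expected to originate from the tail handler alone.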
Which Cloudflare product(s) does this pertain to?
Workers Runtime
What version(s) of the tool(s) are you using?
3.91.0 [wrangler], 5.7.2 [typescript], 4.20241202.0 [@cloudflare/workers-types]
What version of Node are you using?
v20.18.0
What operating system and version are you using?
macOS 15.2
Describe the Bug
My workers have the following setup:
A minimal working subset of your wrangler.toml:
Commands used to start your local dev server, including custom env and cli args:
Not relevant, issue observed on edge.
Steps to be performed in the browser, curl commands, or a test we can run that reliably fails (at least a percent of the time):
This is a part of the enterprise support. Details are confidential.