Hi,

I'm having an issue that I can't reproduce locally; it happens in the following scenario:
- I have two cog models deployed on Replicate (as Deployments).
- One of them, at some point, calls the other (see snippet below).
- They were built and deployed with cog==0.13.7, replicate==1.0.4, the cog CLI 0.14.3, Python 3.11, and Ubuntu 22.04.
Here's how I call one deployment from the other:
```python
from replicate.helpers import base64_encode_file

vectorizer_deployment = replicate.deployments.get(VECTORIZER_DEPLOYMENT)

with open(img_path, "rb") as f:
    b64 = base64_encode_file(f)

prediction = vectorizer_deployment.predictions.create(
    input={"images": [b64]},
)
logger.debug(f"{prediction.id=}")
prediction.wait()  # this line hangs forever after 30-ish GET requests
```
The called deployment does complete the inference, and I can see its status as succeeded on Replicate.

In the logs of the calling deployment, I can see about 30 GET requests, all looking like:

```
INFO:httpx:HTTP Request: GET https://api.replicate.com/v1/predictions/7atmc23wmsrga0cp7ag9y5s6pm "HTTP/1.1 200 OK"
```
I have investigated the replicate Python client source code: the `prediction.wait()` method calls the `.reload()` method, which itself performs the GET requests.

I've tried increasing the env var `REPLICATE_POLL_INTERVAL`, but to no effect.
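In the meantime, as a way to at least fail loudly instead of hanging forever, here is a minimal sketch of a bounded wait. It assumes only the `prediction.status` and `prediction.reload()` members that `wait()` itself uses; the helper name and the timeout/interval defaults are my own choices, not part of the client:

```python
import time

def wait_with_timeout(prediction, timeout=600.0, interval=2.0):
    """Poll a Replicate prediction until it reaches a terminal state,
    raising TimeoutError instead of hanging past the deadline.

    Sketch only: relies on prediction.status / prediction.reload(),
    the same attributes the client's wait() polls internally.
    """
    deadline = time.monotonic() + timeout
    while prediction.status not in ("succeeded", "failed", "canceled"):
        if time.monotonic() > deadline:
            raise TimeoutError(
                f"prediction {prediction.id} still {prediction.status!r} "
                f"after {timeout}s"
            )
        time.sleep(interval)
        prediction.reload()
    return prediction
```

This doesn't explain why `wait()` stops polling, but it would at least surface the hang as an exception with the prediction id, which makes the deployment logs easier to correlate with the Replicate dashboard.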
The strange thing is, as said above, it works locally, i.e.:

- When I run the main endpoint locally in Python (as `predictor.predict(...)`), everything works well.
- When I run locally with `cog predict -i ...`, inference goes through, but after it completes I get this error log:

```
{"logger": "cog.server.worker", "timestamp": "2025-04-15T19:11:52.878929Z", "exception": "Traceback (most recent call last):\n  File \"/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py\", line 299, in _consume_events\n    self._consume_events_inner()\n  File \"/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py\", line 337, in _consume_events_inner\n    ev = self._events.recv()\n         ^^^^^^^^^^^^^^^^^^^\n  File \"/root/.pyenv/versions/3.11.10/lib/python3.11/multiprocessing/connection.py\", line 251, in recv\n    return _ForkingPickler.loads(buf.getbuffer())\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\nTypeError: URLPath.__init__() missing 3 required keyword-only arguments: 'source', 'filename', and 'fileobj'", "severity": "ERROR", "message": "unhandled error in _consume_events"}
```
So far I'm clueless as to why everything suddenly hangs, which makes my whole project unusable. I'm guessing it's due to the deployed environment.
@zeke @erbridge @meatballhat @aron @mattt
Thanks for your help!