API call from deployment to deployment hangs forever #424


Open
Clement-Lelievre opened this issue Apr 15, 2025 · 0 comments


Clement-Lelievre commented Apr 15, 2025

Hi,

I'm hitting an issue that I can't reproduce locally; it happens in the following scenario:

  • I have two cog models deployed on Replicate (as Deployments)
  • at some point, one of them calls the other (see snippet below)
  • both were built and deployed with cog==0.13.7, replicate==1.0.4, the cog CLI 0.14.3, Python 3.11, and Ubuntu 22.04

Here's how I call one deployment from the other (VECTORIZER_DEPLOYMENT, img_path, and logger are defined elsewhere in my code):

```python
import replicate
from replicate.helpers import base64_encode_file

vectorizer_deployment = replicate.deployments.get(VECTORIZER_DEPLOYMENT)

with open(img_path, "rb") as f:
    b64 = base64_encode_file(f)

prediction = vectorizer_deployment.predictions.create(
    input={"images": [b64]},
)
logger.debug(f"{prediction.id=}")
prediction.wait()  # this line hangs forever after 30-ish GET requests
```

The called deployment does complete the inference, and I can see its status as succeeded on Replicate.
In the logs of the calling deployment, I see about 30 GET requests, all of the form INFO:httpx:HTTP Request: GET https://api.replicate.com/v1/predictions/7atmc23wmsrga0cp7ag9y5s6pm "HTTP/1.1 200 OK"

I've looked into the replicate Python client source: the prediction.wait() method calls the .reload() method, which in turn performs those GET requests.
I've also tried increasing the REPLICATE_POLL_INTERVAL env var, to no effect.
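As a stopgap, the wait-and-reload polling described above can be bounded with a hard deadline, so a stuck poll fails loudly instead of hanging forever. The helper below is an illustrative sketch, not part of the replicate client; with the real client, `poll` would be a function that calls `prediction.reload()` and returns `prediction.status`:

```python
import time


def poll_until_done(poll, is_done, timeout=600.0, interval=2.0, sleep=time.sleep):
    """Call poll() every `interval` seconds until is_done(state) is true.

    Hypothetical workaround helper: raises TimeoutError once `timeout`
    seconds have elapsed, instead of polling indefinitely the way an
    unbounded wait() loop would.
    """
    deadline = time.monotonic() + timeout
    state = poll()
    while not is_done(state):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout}s")
        sleep(interval)
        state = poll()
    return state
```

With the replicate client this could be driven by something like `poll_until_done(lambda: (prediction.reload(), prediction.status)[-1], lambda s: s in {"succeeded", "failed", "canceled"})`, mirroring the terminal states a prediction can reach.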

The strange thing is, as said above, it works locally:

  • when I run the main endpoint locally in Python (i.e. predictor.predict(...)), everything works fine
  • when I run locally with cog predict -i ..., inference goes through, but after it completes I get this error log:
    unhandled error in _consume_events (logger cog.server.worker, severity ERROR, timestamp 2025-04-15T19:11:52.878929Z), with this traceback:

    ```
    Traceback (most recent call last):
      File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 299, in _consume_events
        self._consume_events_inner()
      File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 337, in _consume_events_inner
        ev = self._events.recv()
             ^^^^^^^^^^^^^^^^^^^
      File "/root/.pyenv/versions/3.11.10/lib/python3.11/multiprocessing/connection.py", line 251, in recv
        return _ForkingPickler.loads(buf.getbuffer())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: URLPath.__init__() missing 3 required keyword-only arguments: 'source', 'filename', and 'fileobj'
    ```

So far I'm clueless as to why everything suddenly hangs, which makes my whole project unusable. I assume it's something about the deployed environment.

@zeke @erbridge @meatballhat @aron @mattt

Thanks for your help.

@Clement-Lelievre Clement-Lelievre changed the title API call hangs forever, extremely annoying API call from deployment to deployment hangs forever Apr 16, 2025