Replies: 2 comments
-
For my 2 vCPU machine, I use this option as
I use it in my Django app because memory usage increases after a while, so I want to rectify this issue automatically.
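For context, here is a minimal sketch of how this is commonly set in a gunicorn config file; the module name myapp.wsgi and all of the numbers are illustrative assumptions, not the poster's actual values:

# gunicorn.conf.py -- illustrative values for a small 2 vCPU machine
wsgi_app = "myapp.wsgi:application"   # hypothetical Django WSGI module
workers = 2                           # roughly one sync worker per vCPU
max_requests = 1000                   # recycle each worker after ~1000 requests
max_requests_jitter = 50              # stagger restarts so workers do not all recycle at once

The jitter setting only spreads the restarts out over time; each worker is still killed and re-forked once it reaches its limit.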
-
@benoitc We are using 4 workers and 2 maximum requests in our primary environment. In another environment we use 1 worker and 0 maximum requests, to accommodate certain testing I am told. After some analysis today we are moving to N workers and 0 maximum requests, which is the point of this discussion. Our memory usage remains stable across multiple requests, and memory leaks do not appear to be an issue. Perhaps you will have some time to comment on the following; it would be appreciated.

Why Use Maximum Requests

Why do we use this configuration? It was chosen years ago, and it is not clear today why 4 workers / 2 maximum requests was picked. It is being revisited now because of some infrastructure and environment changes. From the program history it seems there were memory issues in our containers at some point, and it was thought that these might be due to memory leaks, so the choice was made to recycle the workers using maximum requests.

A Current Example

One of the tests run recently, to better understand what happens when a worker is recycled, uses one worker and N=4 maximum requests. This configuration was chosen for its simplicity and the ability to reach the maximum-requests limit quickly. It was also easy to monitor the gunicorn worker and watch it restart with a different PID. The setup is sketched just below.
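A configuration along these lines reproduces the test; this is only a sketch, and the WSGI module name is a hypothetical placeholder rather than anything taken from the discussion:

# gunicorn.conf.py for the restart test -- sketch only, not the actual project config
wsgi_app = "myapp.wsgi:application"   # hypothetical application module
workers = 1                           # a single worker makes the restart easy to watch
max_requests = 4                      # the worker is recycled after 4 requests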
Here is what was observed at the start of the gunicorn server:

Starting gunicorn with 1 workers and 4 max-requests
2024-12-04 13:48:17 [8] [gunicorn.error:278] [INFO] Starting gunicorn 22.0.0
2024-12-04 13:48:17 [8] [gunicorn.error:278] [INFO] Listening at: http://0.0.0.0:6327 (8)
2024-12-04 13:48:17 [8] [gunicorn.error:278] [INFO] Using worker: sync
2024-12-04 13:57:55 [gunicorn.error:278] [INFO] Autorestarting worker after current request.
... The application continues logging.
2024-12-04 13:57:59 [ERROR] traceback [File "/opt/miniconda3/lib/python3.11/concurrent/futures/thread.py", line 169, in submit;
raise RuntimeError('cannot schedule new futures after')]
2024-12-04 13:58:00 [144] [gunicorn.error:278] [INFO] Booting worker with pid: 144

This behavior is similar to reference (1) below, but there they have 2 workers and suggest that as the 1st worker reaches its maximum requests, the 2nd worker, which is handling a request, submits a task to a ThreadPoolExecutor and a failure occurs.

Observation

What I see happening here is that as the request comes in (N=4), gunicorn signals at 13:57:55 that it is autorestarting the worker. Five seconds later gunicorn signals that it is booting the new worker. Somewhere within those 5 seconds an error is thrown. That makes sense to me, because the worker that was handling the request is no longer alive to handle it. Whether there is 1 worker or many workers, any time a worker reaches its max requests this sort of behavior will exhibit itself, because the worker handling that particular request is no longer alive. A minimal reproduction of the executor error is sketched below.
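For illustration, here is a minimal sketch (not the application's actual code, and using an explicit shutdown() call rather than a worker exit) of how a ThreadPoolExecutor refuses new work once it is shutting down, which is the kind of RuntimeError shown in the traceback above:

from concurrent.futures import ThreadPoolExecutor

def handle_request(payload):
    # Stand-in for whatever work the application hands to the executor.
    return f"processed {payload}"

executor = ThreadPoolExecutor(max_workers=2)

# While the worker process is alive, submissions succeed.
print(executor.submit(handle_request, "request-1").result())

# Simulate the executor going away, e.g. when the worker is recycled.
executor.shutdown(wait=True)

# Any submission after shutdown raises RuntimeError
# ("cannot schedule new futures after shutdown").
try:
    executor.submit(handle_request, "request-2")
except RuntimeError as exc:
    print(f"RuntimeError: {exc}")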
Two Questions

Some Related Issues
-
Serious question, but why are so many people using the option
--max-requests
to run gunicorn? This option was added to Gunicorn to offer a temporary workaround for application code that leaks memory, but it seems people are using it for other purposes.
Please let me know. There may be a better way than forcing gunicorn to kill that worker.