RabbitMQ docker image takes more time to start when using home dir in an EFS mount #471
Replies: 8 comments 5 replies
| The Bitnami chart/image isn't derived from this image (see https://github.com/bitnami/bitnami-docker-rabbitmq/blob/master/3.8/debian-10/Dockerfile). The discussion in that thread is pretty informative for troubleshooting the issue, notably bitnami/charts#4936 (comment) and michaelklishin's comment about using  | 
| @michaelklishin @wglambert I used debug logging and strace, and here are my findings. Please have a look when you have some time. Startup is stuck for 7 minutes at the point shown here:
[root@ip-172-31-27-174 fs1]# docker run --cap-add SYS_PTRACE -v /mnt/efs/fs1:/var/lib/rabbitmq -v /mnt/efs/fs1/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf --hostname my-rabbit --name some-rabbit rabbitmq:3.8.11
Unable to find image 'rabbitmq:3.8.11' locally
3.8.11: Pulling from library/rabbitmq
d519e2592276: Pull complete
d22d2dfcfa9c: Pull complete
b3afe92c540b: Pull complete
cd4e41ce9500: Pull complete
e2741828ce46: Extracting [=======================================>           ]  25.56MB/32.73MB
e2741828ce46: Pull complete
6cf1935b659a: Pull complete
3df71d67553c: Pull complete
ac4f52d15541: Pull complete
0af823fd61c8: Pull complete
85579530757b: Pull complete
Digest: sha256:52e73c649b3ef628fb2b0dafd5b043c0b397bd188a0326a6514d37662d84b425
Status: Downloaded newer image for rabbitmq:3.8.11
Configuring logger redirection
Strace outputs:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
rabbitmq     1  0.2  0.0   4636   888 ?        Ss   03:55   0:00 /bin/sh /opt/rabbitmq/sbin/rabbitmq-server
rabbitmq    16  2.7  4.9 1686864 50052 ?       Sl   03:55   0:02 /usr/local/lib/erlang/erts-11.1.7/bin/beam.smp -W w -MBas ageffcbf -MHas ageffcbf -MBlmbcs 512 -MHlmbcs 512 -MMmcs 30 -P 1048576
rabbitmq    23  0.0  0.0   4528   880 ?        Ss   03:55   0:00 erl_child_setup 1024
rabbitmq    48  0.0  0.0   8280    88 ?        S    03:55   0:00 /usr/local/lib/erlang/erts-11.1.7/bin/epmd -daemon
rabbitmq    68  0.0  0.1   8272  1180 ?        Ss   03:55   0:00 inet_gethost 4
rabbitmq    69  0.0  0.1  10392  1716 ?        S    03:55   0:00 inet_gethost 4
root        70  0.5  0.3  20264  3840 pts/0    Ss   03:56   0:00 bash
root       330  0.0  0.3  36160  3284 pts/0    R+   03:57   0:00 ps -aux
root@my-rabbit:/# strace -p 1
strace: Process 1 attached
rt_sigsuspend([], 8
root@my-rabbit:/# strace -p 16
strace: Process 16 attached
select(0, NULL, NULL, NULL, NULL
root@my-rabbit:/# strace -p 23
strace: Process 23 attached
select(5, [3 4], NULL, NULL, NULL
root@my-rabbit:/# strace -p 48
strace: Process 48 attached
select(7, [3 4 5], NULL, NULL, {tv_sec=0, tv_usec=147862}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0} | 
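For reading the trace above: the repeated `select(...) = 0 (Timeout)` lines from PID 48 (epmd) are five-second timeouts expiring, which suggests that process is idle rather than blocked on the filesystem. The sketch below, in Python, tallies such lines from captured strace output. `idle_seconds` and the `sample` list are illustrative names, not part of any tool. Note also that `strace -p` on its own attaches to a single thread, so for the multi-threaded `beam.smp` (PID 16), adding `-f` would trace its other threads as well, which is where file I/O is more likely to show up.

```python
import re

# Matches completed select() calls that timed out, as printed by strace, e.g.
# select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)
TIMEOUT_RE = re.compile(
    r"select\(.*\{tv_sec=(\d+), tv_usec=(\d+)\}\) = 0 \(Timeout\)"
)


def idle_seconds(strace_lines):
    """Return (count, total_seconds) for select() calls that timed out."""
    count = 0
    total = 0.0
    for line in strace_lines:
        match = TIMEOUT_RE.search(line)
        if match:
            count += 1
            # Convert the timeval (seconds + microseconds) to seconds.
            total += int(match.group(1)) + int(match.group(2)) / 1_000_000
    return count, total


# Sample lines copied from the epmd trace; the last one is an unfinished call
# and is deliberately not counted.
sample = [
    "select(7, [3 4 5], NULL, NULL, {tv_sec=0, tv_usec=147862}) = 0 (Timeout)",
    "select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout)",
    "select(7, [3 4 5], NULL, NULL, {tv_sec=5, tv_usec=0}",
]
count, total = idle_seconds(sample)
print(f"{count} timed-out select() calls, about {total:.1f}s spent waiting")
```

If nearly all of the traced time lands in timeouts like these, the traced thread is probably not the one doing the slow work.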
| Sounds similar to helm/charts#1711 (so EFS is just NFS behind the scenes?). Although it is about Elasticsearch, this post seems relevant, since RabbitMQ likely also cares about filesystem performance: 
 | 
| I have the same issue with rabbitmq:latest | 
| I'm able to run RabbitMQ 3.8.11 just fine persisting to our internal NFS, but when EFS gets involved, startup times turn nasty. Interestingly, runtime performance is fine. @michaelklishin would you have any ideas? Just a note: I spent a significant amount of time with AWS support before we found this issue, and we can confirm that EFS itself is functioning fine. For some reason RabbitMQ just doesn't write to it very quickly. Watching the EFS during startup, it seems to be writing 400 MB of quorum queue data very slowly. We're not using this feature yet, so I'm not too familiar with it, but this happens on every startup: it deletes the data and rewrites it. | 
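One way to put a number on "won't write very quickly" is to time small synced writes against each candidate directory. The Python sketch below is a generic probe and makes no claim about how RabbitMQ itself writes quorum queue data. The function name `time_small_synced_writes` and the file count and size defaults are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def time_small_synced_writes(directory, n_files=50, size=4096):
    """Write n_files small files under directory, fsync each one, and
    return the elapsed wall-clock seconds. The files are removed afterwards."""
    payload = b"\0" * size
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        start = time.perf_counter()
        for i in range(n_files):
            path = os.path.join(scratch, f"probe-{i}.bin")
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                # Force the data to storage so caching does not hide latency.
                os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    return elapsed


# Baseline against the local temp directory.
local_dir = tempfile.gettempdir()
local_secs = time_small_synced_writes(local_dir)
print(f"{local_dir}: {local_secs:.3f}s for 50 synced 4 KiB writes")
```

Calling the same function with the EFS mount point from the report (`/mnt/efs/fs1`) and comparing it to the local result would show whether per-file sync latency, rather than raw throughput, accounts for the difference.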
| Hi team. 
 We would test further with higher throughput and operations per second on the EFS side. | 
| Reducing the  | 
| Was this issue ever fixed? I'm seeing the same problem but with EBS / ext4 storage, not EFS. | 
Describe the bug
The RabbitMQ Docker image takes a long time to start (5-10 minutes) when an EFS mount is used as the home directory location (/var/lib/rabbitmq).
To Reproduce
Steps to reproduce the behavior:
Create an EFS file system.
Create an EC2 instance and mount the EFS file system at a path (/mnt/efs/fs1).
Run without using an EFS mount:
docker run --hostname my-rabbit --name some-rabbit rabbitmq:3.8.11
Starts normally. Logs attached with log.file.level = debug: without_efs.log
Run using an EFS mount for the home directory:
docker run -v /mnt/efs/fs1:/var/lib/rabbitmq --hostname my-rabbit --name some-rabbit rabbitmq:3.8.11
Takes longer to start. Logs attached with log.file.level = debug: efs.log
Additional Information
This is also reproducible in Kubernetes (using the Bitnami RabbitMQ chart):
bitnami/charts#4936