[VPP-1467] spurious segfaults running make test with dlmalloc enabled #2931

Open
vvalderrv opened this issue Feb 1, 2025 · 2 comments
Description

The issue cannot be reproduced when dlmalloc is disabled, i.e. when CMakeCache.txt contains:

        //Use dlmalloc memory allocator.

        VPP_USE_DLMALLOC:BOOL=OFF
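To confirm which allocator setting a given build tree actually uses, the cache entry can be read programmatically. The following is a minimal sketch, not part of the original report: the helper names `read_cmake_cache` and `dlmalloc_enabled` are hypothetical, and the cache path is an assumption based on the build directory printed in the CMake output of this report.

```python
from pathlib import Path


def read_cmake_cache(text):
    """Parse CMakeCache.txt content into {name: (type, value)}."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks and the two comment styles CMake writes
        # ('//' help text and '#' header lines).
        if not line or line.startswith("//") or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name, _, kind = key.partition(":")
        entries[name] = (kind, value)
    return entries


def dlmalloc_enabled(text):
    """Return True/False for VPP_USE_DLMALLOC, or None if the entry is absent."""
    entry = read_cmake_cache(text).get("VPP_USE_DLMALLOC")
    if entry is None:
        return None
    return entry[1].strip().upper() in {"ON", "1", "TRUE", "YES", "Y"}


if __name__ == "__main__":
    # Assumed location: the build directory named in the CMake output.
    cache = Path("/vpp/build-root/build-vpp_debug-native/vpp/CMakeCache.txt")
    if cache.exists():
        print("VPP_USE_DLMALLOC enabled:", dlmalloc_enabled(cache.read_text()))
    else:
        print("CMakeCache.txt not found at", cache)
```

Running this against both a crashing and a non-crashing build tree would show whether the allocator option is really the only difference between them.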

 


root@vpp:/vpp# cd /vpp && make build && make test-debug

make[1]: Entering directory '/vpp/build-root'

@@@@ Arch for platform 'vpp' is native @@@@

@@@@ Finding source for external @@@@

@@@@ Makefile fragment found in /vpp/build-data/packages/external.mk @@@@

@@@@ Source found in /vpp/build @@@@

@@@@ Arch for platform 'vpp' is native @@@@

@@@@ Finding source for vpp @@@@

@@@@ Makefile fragment found in /vpp/build-data/packages/vpp.mk @@@@

@@@@ Source found in /vpp/src @@@@

@@@@ Configuring external: nothing to do @@@@

@@@@ Building external: nothing to do @@@@

@@@@ Installing external: nothing to do @@@@

@@@@ Configuring vpp: nothing to do @@@@

@@@@ Building vpp in /vpp/build-root/build-vpp_debug-native/vpp @@@@

[1/1] Re-running CMake...

-- Looking for ccache

-- Looking for ccache - found

Marvell MUSDK not found - marvell_plugin disabled

-- Looking for mbedTLS

-- Found mbedTLS in /usr/include

-- Found DPDK 18.08.0 in /vpp/build-root/install-vpp_debug-native/external/include/dpdk

-- DPDK depends on IPSec MB library

-- Configuration:

VPP version         : 18.10-rc0~586-gaeedb7f

VPP library version : 18.10

GIT toplevel dir    : /vpp

C flags             : -march=corei7 -mtune=corei7-avx -g -O0 -DCLIB_DEBUG -DFORTIFY_SOURCE=2 -fstack-protector-all -fPIC -Werror

Linker flags        : -g -O0 -DCLIB_DEBUG -DFORTIFY_SOURCE=2 -fstack-protector-all -fPIC -Werror

Target processor    : x86_64

Build type          :

Prefix path         : /opt/vpp/external/x86_64;/vpp/build-root/install-vpp_debug-native/external

Install prefix      : /vpp/build-root/install-vpp_debug-native/vpp

-- Configuring done

-- Generating done

-- Build files have been written to: /vpp/build-root/build-vpp_debug-native/vpp

[2/2] Linking C executable bin/vpp

@@@@ Installing vpp @@@@

[1/1] Install the project...

-- Install configuration: ""

make[1]: Leaving directory '/vpp/build-root'

make -C /vpp/build-root PLATFORM=vpp TAG=vpp_debug vpp-install

make[1]: Entering directory '/vpp/build-root'

@@@@ Arch for platform 'vpp' is native @@@@

@@@@ Finding source for external @@@@

@@@@ Makefile fragment found in /vpp/build-data/packages/external.mk @@@@

@@@@ Source found in /vpp/build @@@@

@@@@ Arch for platform 'vpp' is native @@@@

@@@@ Finding source for vpp @@@@

@@@@ Makefile fragment found in /vpp/build-data/packages/vpp.mk @@@@

@@@@ Source found in /vpp/src @@@@

@@@@ Configuring external: nothing to do @@@@

@@@@ Building external: nothing to do @@@@

@@@@ Installing external: nothing to do @@@@

@@@@ Configuring vpp: nothing to do @@@@

@@@@ Building vpp in /vpp/build-root/build-vpp_debug-native/vpp @@@@

ninja: no work to do.

@@@@ Installing vpp: nothing to do @@@@

make[1]: Leaving directory '/vpp/build-root'

make -C test TEST_DIR=/vpp/test VPP_TEST_BUILD_DIR=/vpp/build-root/build-vpp_debug-native VPP_TEST_BIN=/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp VPP_TEST_PLUGIN_PATH=/vpp/build-root/install-vpp_debug-native/vpp/lib/vpp_plugins:/vpp/build-root/install-vpp_debug-native/vpp/lib64/vpp_plugins VPP_TEST_INSTALL_PATH=/vpp/build-root/install-vpp_debug-native/ LD_LIBRARY_PATH=/vpp/build-root/install-vpp_debug-native/vpp/lib/:/vpp/build-root/install-vpp_debug-native/vpp/lib64/ EXTENDED_TESTS= PYTHON= OS_ID=ubuntu CACHE_OUTPUT= test

make[1]: Entering directory '/vpp/test'

bash: line 1: 17751 Segmentation fault      python sanity_run_vpp.py


- Sanity check failed, cannot run vpp

Makefile:143: recipe for target 'sanity' failed

make[1]: *** [sanity] Error 1

make[1]: Leaving directory '/vpp/test'

Makefile:395: recipe for target 'test-debug' failed

make: *** [test-debug] Error 2

root@vpp:/vpp#
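The shell reports that the sanity step died with a segmentation fault rather than exiting with an ordinary error status. As a hedged illustration of how a wrapper could tell those two cases apart, the sketch below decodes a child's return code using the standard `subprocess` and `signal` modules. The helpers `describe_exit` and `run_and_describe` are hypothetical and are not taken from the VPP test framework.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as the negated
        # signal number (for example -11 for SIGSEGV).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run argv and describe how it ended."""
    return describe_exit(subprocess.run(argv).returncode)


if __name__ == "__main__":
    # Hypothetical invocation; the script name is the one shown in the log.
    print(run_and_describe([sys.executable, "sanity_run_vpp.py"]))
```

Separating "killed by SIGSEGV" from a non-zero exit status makes it clearer whether the failure is a crash in native code or a check that failed cleanly.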


root@vpp:/vpp# valgrind -v --leak-check=full --show-leak-kinds=all /vpp/build-root/install-vpp-native/vpp/bin/vpp -c /etc/vpp/startup.conf

==5391== Memcheck, a memory error detector

==5391== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.

==5391== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info

==5391== Command: /vpp/build-root/install-vpp-native/vpp/bin/vpp -c /etc/vpp/startup.conf

==5391==

-5391- Valgrind options:

-5391-    -v

-5391-    --leak-check=full

-5391-    --show-leak-kinds=all

-5391- Contents of /proc/version:

-5391-   Linux version 4.4.0-21-generic (buildd@lgw01-21) (gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2) ) #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016

-5391-

-5391- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-avx

-5391- Page sizes: currently 4096, max supported 4096

-5391- Valgrind library directory: /usr/lib/valgrind

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/bin/vpp

-5391- Reading syms from /lib/x86_64-linux-gnu/ld-2.23.so

-5391-   Considering /lib/x86_64-linux-gnu/ld-2.23.so ..

-5391-   .. CRC mismatch (computed aa979a42 wanted 9019bbb7)

-5391-   Considering /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.23.so ..

-5391-   .. CRC is valid

-5391- Reading syms from /usr/lib/valgrind/memcheck-amd64-linux

-5391-   Considering /usr/lib/valgrind/memcheck-amd64-linux ..

-5391-   .. CRC mismatch (computed eea41ea9 wanted 2009db78)

-5391-    object doesn't have a symbol table

-5391-    object doesn't have a dynamic symbol table

-5391- Scheduler: using generic scheduler lock implementation.

-5391- Reading suppressions file: /usr/lib/valgrind/default.supp

==5391== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-5391-by-root-on-???

==5391== embedded gdbserver: writing to   /tmp/vgdb-pipe-to-vgdb-from-5391-by-root-on-???

==5391== embedded gdbserver: shared mem   /tmp/vgdb-pipe-shared-mem-vgdb-5391-by-root-on-???

==5391==

==5391== TO CONTROL THIS PROCESS USING vgdb (which you probably

==5391== don't want to do, unless you know exactly what you're doing,

==5391== or are doing some strange experiment):

==5391==   /usr/lib/valgrind/../../bin/vgdb --pid=5391 ...command...

==5391==

==5391== TO DEBUG THIS PROCESS USING GDB: start GDB like this

==5391==   /path/to/gdb /vpp/build-root/install-vpp-native/vpp/bin/vpp

==5391== and then give GDB the following command

==5391==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=5391

==5391== --pid is optional if only one valgrind process is running

==5391==

-5391- REDIR: 0x401cfd0 (ld-linux-x86-64.so.2:strlen) redirected to 0x3809e181 (???)

-5391- Reading syms from /usr/lib/valgrind/vgpreload_core-amd64-linux.so

-5391-   Considering /usr/lib/valgrind/vgpreload_core-amd64-linux.so ..

-5391-   .. CRC mismatch (computed 2567ccf6 wanted 49420590)

-5391-    object doesn't have a symbol table

-5391- Reading syms from /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so

-5391-   Considering /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so ..

-5391-   .. CRC mismatch (computed 0e27c9a8 wanted ac585421)

-5391-    object doesn't have a symbol table

==5391== WARNING: new redirection conflicts with existing -- ignoring it

-5391-     old: 0x0401cfd0 (strlen              ) R-> (0000.0) 0x3809e181 ???

-5391-     new: 0x0401cfd0 (strlen              ) R-> (2007.0) 0x04c31020 strlen

-5391- REDIR: 0x401b920 (ld-linux-x86-64.so.2:index) redirected to 0x4c30bc0 (index)

-5391- REDIR: 0x401bb40 (ld-linux-x86-64.so.2:strcmp) redirected to 0x4c320d0 (strcmp)

-5391- REDIR: 0x401dd30 (ld-linux-x86-64.so.2:mempcpy) redirected to 0x4c35270 (mempcpy)

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/lib/libvlibmemory.so.18.10

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/lib/libvnet.so.18.10

-5391- Reading syms from /lib/x86_64-linux-gnu/libdl-2.23.so

-5391-   Considering /lib/x86_64-linux-gnu/libdl-2.23.so ..

-5391-   .. CRC mismatch (computed 39227170 wanted ab6e2c22)

-5391-   Considering /usr/lib/debug/lib/x86_64-linux-gnu/libdl-2.23.so ..

-5391-   .. CRC is valid

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/lib/libvlib.so.18.10

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/lib/libsvm.so.18.10

-5391- Reading syms from /lib/x86_64-linux-gnu/libpthread-2.23.so

-5391-   Considering /usr/lib/debug/.build-id/ce/17e023542265fc11d9bc8f534bb4f070493d30.debug ..

-5391-   .. build-id is valid

-5391- Reading syms from /vpp/build-root/install-vpp-native/vpp/lib/libvppinfra.so.18.10

-5391- Reading syms from /lib/x86_64-linux-gnu/libc-2.23.so

-5391-   Considering /lib/x86_64-linux-gnu/libc-2.23.so ..

-5391-   .. CRC mismatch (computed 7a8ee3e4 wanted a5190ac4)

-5391-   Considering /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.23.so ..

-5391-   .. CRC is valid

-5391- Reading syms from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0

-5391-    object doesn't have a symbol table

-5391- Reading syms from /lib/x86_64-linux-gnu/libm-2.23.so

-5391-   Considering /lib/x86_64-linux-gnu/libm-2.23.so ..

-5391-   .. CRC mismatch (computed e8c3647b wanted c3efddac)

-5391-   Considering /usr/lib/debug/lib/x86_64-linux-gnu/libm-2.23.so ..

-5391-   .. CRC is valid

-5391- Reading syms from /lib/x86_64-linux-gnu/librt-2.23.so

-5391-   Considering /lib/x86_64-linux-gnu/librt-2.23.so ..

-5391-   .. CRC mismatch (computed 734d0439 wanted 09d6393c)

-5391-   Considering /usr/lib/debug/lib/x86_64-linux-gnu/librt-2.23.so ..

-5391-   .. CRC is valid

-5391- REDIR: 0x6680a00 (libc.so.6:strcasecmp) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x667c280 (libc.so.6:strcspn) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x6682cf0 (libc.so.6:strncasecmp) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x667e6f0 (libc.so.6:strpbrk) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x667ea80 (libc.so.6:strspn) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x668014b (libc.so.6:memcpy@GLIBC_2.2.5) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x667acd0 (libc.so.6:strcmp) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x667e400 (libc.so.6:rindex) redirected to 0x4c308a0 (rindex)

-5391- REDIR: 0x667c720 (libc.so.6:strlen) redirected to 0x4c30f60 (strlen)

-5391- REDIR: 0x667cb20 (libc.so.6:strncmp) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)

-5391- REDIR: 0x6736a90 (libc.so.6:_strncmp_sse42) redirected to 0x4c317f0 (_strncmp_sse42)

-5391- REDIR: 0x6675130 (libc.so.6:malloc) redirected to 0x4c2db20 (malloc)

-5391- REDIR: 0x667f060 (libc.so.6:_GI_strstr) redirected to 0x4c354d0 (_strstr_sse2)

-5391- REDIR: 0x6675d10 (libc.so.6:calloc) redirected to 0x4c2faa0 (calloc)

-5391- REDIR: 0x667c8c0 (libc.so.6:strnlen) redirected to 0x4c30ee0 (strnlen)

-5391- REDIR: 0x6685470 (libc.so.6:_GI_memcpy) redirected to 0x4c32b00 (_GI_memcpy)

-5391- REDIR: 0x667f860 (libc.so.6:memchr) redirected to 0x4c32170 (memchr)

-5391- REDIR: 0x66756c0 (libc.so.6:realloc) redirected to 0x4c2fce0 (realloc)

-5391- REDIR: 0x66754f0 (libc.so.6:free) redirected to 0x4c2ed80 (free)

==5391== Invalid read of size 1

==5391==    at 0x63D400F: mspace_malloc (dlmalloc.c:4339)

==5391==    by 0x63D58CB: mspace_get_aligned (dlmalloc.c:4225)

==5391==    by 0x63D931C: clib_mem_alloc_aligned_at_offset (mem.h:118)

==5391==    by 0x63D931C: clib_mem_alloc_aligned (mem.h:142)

==5391==    by 0x63D931C: clib_spinlock_init (lock.h:59)

==5391==    by 0x63D931C: clib_mem_init (mem_dlmalloc.c:221)

==5391==    by 0x406D52: main (main.c:260)

==5391==  Address 0x370 is not stack'd, malloc'd or (recently) free'd

==5391==

==5391==

==5391== Process terminating with default action of signal 11 (SIGSEGV)

==5391==  Access not within mapped region at address 0x370

==5391==    at 0x63D400F: mspace_malloc (dlmalloc.c:4339)

==5391==    by 0x63D58CB: mspace_get_aligned (dlmalloc.c:4225)

==5391==    by 0x63D931C: clib_mem_alloc_aligned_at_offset (mem.h:118)

==5391==    by 0x63D931C: clib_mem_alloc_aligned (mem.h:142)

==5391==    by 0x63D931C: clib_spinlock_init (lock.h:59)

==5391==    by 0x63D931C: clib_mem_init (mem_dlmalloc.c:221)

==5391==    by 0x406D52: main (main.c:260)

==5391==  If you believe this happened as a result of a stack

==5391==  overflow in your program's main thread (unlikely but

==5391==  possible), you can try to increase the size of the

==5391==  main thread stack using the --main-stacksize= flag.

==5391==  The main thread stack size used in this run was 8388608.

==5391==

==5391== HEAP SUMMARY:

==5391==     in use at exit: 365 bytes in 22 blocks

==5391==   total heap usage: 45 allocs, 23 frees, 6,861 bytes allocated

==5391==

==5391== Searching for pointers to 22 not-freed blocks

==5391== Checked 527,952 bytes

==5391==

==5391== 47 bytes in 1 blocks are still reachable in loss record 1 of 3

==5391==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)

==5391==    by 0x667C4D9: strndup (strndup.c:43)

==5391==    by 0x407001: main (main.c:159)

==5391==

==5391== 142 bytes in 20 blocks are still reachable in loss record 2 of 3

==5391==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)

==5391==    by 0x667C4D9: strndup (strndup.c:43)

==5391==    by 0x40706C: main (main.c:178)

==5391==

==5391== 176 bytes in 1 blocks are still reachable in loss record 3 of 3

==5391==    at 0x4C2FD5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)

==5391==    by 0x40717B: main (main.c:188)

==5391==

==5391== LEAK SUMMARY:

==5391==    definitely lost: 0 bytes in 0 blocks

==5391==    indirectly lost: 0 bytes in 0 blocks

==5391==      possibly lost: 0 bytes in 0 blocks

==5391==    still reachable: 365 bytes in 22 blocks

==5391==         suppressed: 0 bytes in 0 blocks

==5391==

==5391== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

==5391==

==5391== 1 errors in context 1 of 1:

==5391== Invalid read of size 1

==5391==    at 0x63D400F: mspace_malloc (dlmalloc.c:4339)

==5391==    by 0x63D58CB: mspace_get_aligned (dlmalloc.c:4225)

==5391==    by 0x63D931C: clib_mem_alloc_aligned_at_offset (mem.h:118)

==5391==    by 0x63D931C: clib_mem_alloc_aligned (mem.h:142)

==5391==    by 0x63D931C: clib_spinlock_init (lock.h:59)

==5391==    by 0x63D931C: clib_mem_init (mem_dlmalloc.c:221)

==5391==    by 0x406D52: main (main.c:260)

==5391==  Address 0x370 is not stack'd, malloc'd or (recently) free'd

==5391==

==5391== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Segmentation fault

(env) root@vpp:/vpp#
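The Valgrind run repeats the same invalid read three times, so it can help to pull out just the fault address and call chain. The sketch below is an illustration only: the function names are hypothetical and the regular expressions are fitted to the Memcheck lines shown in this report, not to every Valgrind output format. It also flags fault addresses below the page size (4096 here, per the "Page sizes" line), which often, though not always, indicates a field access through a NULL base pointer. That reading is an interpretation and is not stated in the report.

```python
import re

# Matches frame lines such as:
#   ==5391==    at 0x63D400F: mspace_malloc (dlmalloc.c:4339)
FRAME_RE = re.compile(r"==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\w+) \(([^)]*)\)")
# Matches the line that gives the faulting address.
ADDR_RE = re.compile(r"==\d+==\s+Address (0x[0-9A-Fa-f]+) is not stack'd")


def first_invalid_access(log):
    """Return (address, frames) for the first invalid read/write, or None."""
    lines = [l for l in log.splitlines() if l.strip()]
    for i, line in enumerate(lines):
        if "Invalid read" not in line and "Invalid write" not in line:
            continue
        frames = []
        for follow in lines[i + 1:]:
            m = FRAME_RE.match(follow)
            if m:
                frames.append((m.group(1), m.group(2)))
                continue
            # First non-frame line ends the stack; it may carry the address.
            a = ADDR_RE.match(follow)
            return (int(a.group(1), 16) if a else None, frames)
        return (None, frames)
    return None


def looks_like_null_offset(address, page_size=4096):
    """True if the address falls inside the first (normally unmapped) page."""
    return address is not None and address < page_size
```

Applied to the log above, this would report address 0x370 with `mspace_malloc` as the innermost frame and `clib_mem_init` and `main` further out, which points at allocator initialization rather than at any particular test.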

Assignee

Paul Vinciguerra

Reporter

Paul Vinciguerra

Comments

  • klementsekera (Wed, 19 Jun 2019 12:01:03 +0000): Paul, I see this in my CMakeCache.txt:

//Use dlmalloc memory allocator.

VPP_USE_DLMALLOC:BOOL=ON

so I assume DLMALLOC has been ON for quite some time, yet I have never seen this crash.

Is this still an issue for you?

  • jhahn (Sun, 17 Feb 2019 23:21:09 +0000): Klement Sekera, is this still an issue in 19.01?

Original issue: https://jira.fd.io/browse/VPP-1467

