[VPP-238] fd.io/master crash during enic interface bringup #1492
Comments
My only guess is that something else is not getting configured right. I don't know enough about the queuing mechanism yet to have deeper insight.
Sean, I don't see any problems in the latest log output. I don't see any reason why VPP wouldn't pass traffic. I pulled the latest VPP code from master, and it passes traffic on my test setup. My conf file is:
unix {
  interactive
  exec /home/neescoba/proj/dpdk-confs/vpp_loj2.cmd
}
where the vpp_log2.cmd file being executed just contains the commands to bring up the interfaces. As you are probably much more familiar with VPP, do you have any theories about why VPP isn't passing any traffic? Just a note for the future: increasing the size of the WQ descriptor ring to 1024 and the RQ descriptor ring to 4096 may improve performance.
Still not working. I set wq to 2, rq to 2, and cq to 4 in the adapter policy, and now I see no traffic at all with the latest VPP master as of this posting.
Certainly, I will test that out, and I agree that if it is an invalid configuration it should definitely warn.
It looks like your VIC is configured with only 1 RQ. Can you change this so that the VIC is configured with 2x the number of RQs that DPDK will be using? When scattered RX was introduced, we started requiring that 2x the number of RQs be configured in the VIC. Your VPP configuration isn't using scattered rx, but right now, that 2x RQs is still required even if the application will not be using scattered rx. I'm leaning towards adding a check and printing a message that having only one RQ configured in the VIC is not a valid configuration. Thanks for reporting this problem.
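The rule described in this comment (the VIC must expose twice as many RQs as DPDK will use, so a single RQ is not valid) could be enforced with a check along the following lines. This is only a sketch: the function name `sketch_check_vic_rq_count` and the message text are hypothetical and are not taken from the enic PMD.

```c
#include <stdio.h>

/* Hypothetical check, not the actual enic PMD code.
 * vic_rq_count: number of RQs configured on the VIC (adapter policy).
 * Returns 0 if the count is usable, -1 otherwise. */
static int sketch_check_vic_rq_count(unsigned int vic_rq_count)
{
    /* Each DPDK rx queue needs two VIC RQs, so fewer than 2 RQs
     * leaves DPDK with no usable rx queue at all. */
    if (vic_rq_count < 2) {
        fprintf(stderr,
                "VIC has %u RQ(s) configured; at least 2 are required "
                "(2 per DPDK rx queue)\n", vic_rq_count);
        return -1;
    }
    return 0;
}
```

Failing early with a message like this would turn the crash into a visible configuration error.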
Found what may be the issue. When rx is 1, max_rx_queues gets set to 0 by the /2. This causes initialization of intr to get skipped and leads to the 0 deref.
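The failure mode suggested here can be modeled in a few lines of C. The structures and names below (`sketch_enic`, `sketch_intr`, `sketch_setup`) are hypothetical stand-ins, and the condition that guards the interrupt setup is an assumption made for illustration; the real enic code is organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real enic structures are different. */
struct sketch_intr {
    void *ctrl;                 /* would point at mapped registers */
};

struct sketch_enic {
    uint16_t max_rx_queues;
    struct sketch_intr intr;
};

static char sketch_ctrl_regs[64];   /* placeholder for the register area */

static void sketch_setup(struct sketch_enic *enic, uint16_t vic_rq_count)
{
    enic->intr.ctrl = NULL;

    /* Integer division truncates, so 1 / 2 == 0. */
    enic->max_rx_queues = vic_rq_count / 2;

    /* Assumption: interrupt setup only runs when there are rx queues. */
    if (enic->max_rx_queues > 0)
        enic->intr.ctrl = sketch_ctrl_regs;
}
```

With a `vic_rq_count` of 1 the pointer stays NULL, which is consistent with the `ctrl = 0x0` shown by gdb and with the write to `addr=0x20` in frame #0 of the backtrace.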
startup.conf (full listing at the end of the comments below)
Description
Bringing an interface up causes a crash because enic->intr is not set up correctly
(gdb) p enic->intr
$1 = {index = 0, vdev = 0x0, ctrl = 0x0}
root@devstack-vpp0:/home/localadmin# vppctl sh int
TenGigabitEthernet8/0/0 1 down
TenGigabitEthernetd/0/0 2 down
TenGigabitEthernete/0/0 3 down
local0 0 down
root@devstack-vpp0:/home/localadmin# vppctl
vpp# sh int
TenGigabitEthernet8/0/0 1 down
TenGigabitEthernetd/0/0 2 down
TenGigabitEthernete/0/0 3 down
local0 0 down
vpp# set interface ip address TenGigabitEthernet8/0/0 13.11.1.10/24
vpp# set interface state TenGigabitEthernet8/0/0 up
#0 0x000000000075c6ad in iowrite32 (val=0, addr=0x20) at /home/localadmin/src/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.07/drivers/net/enic/enic_compat.h:113
#1 0x000000000075c6d7 in vnic_intr_unmask (intr=0x7ff73ffceb78) at /home/localadmin/src/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.07/drivers/net/enic/base/vnic_intr.h:70
#2 0x00000000007687e2 in enic_enable (enic=0x7ff73ffcd180) at /home/localadmin/src/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.07/drivers/net/enic/enic_main.c:472
#3 0x0000000000750667 in enicpmd_dev_start (eth_dev=0xfe0e40 <rte_eth_devices>) at /home/localadmin/src/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.07/drivers/net/enic/eni2
#4 0x00000000008948dc in rte_eth_dev_start (port_id=0 '\000') at /home/localadmin/src/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.07/lib/librte_ether/rte_ethdev.c:1103
#5 0x00007ffff7114ae7 in dpdk_interface_admin_up_down (vnm=0x10627c0 <vnet_main>, hw_if_index=1, flags=1) at /home/localadmin/src/vpp/build-data/../vnet/vnet/devices/dpdk/device.9
#6 0x00007ffff6d0ee88 in vnet_sw_interface_set_flags_helper (vnm=0x10627c0 <vnet_main>, sw_if_index=1, flags=1, helper_flags=0) at /home/localadmin/src/vpp/build-data/../vnet/vne0
#7 0x00007ffff6d0effd in vnet_sw_interface_set_flags (vnm=0x10627c0 <vnet_main>, sw_if_index=1, flags=1) at /home/localadmin/src/vpp/build-data/../vnet/vnet/interface.c:466
#8 0x00007ffff6d16af9 in set_state (vm=0x1062160 <vlib_global_main>, input=0x7fffc5030d10, cmd=0x7fffc4ef92bc) at /home/localadmin/src/vpp/build-data/../vnet/vnet/interface_cli.c1
#9 0x00007ffff74b28f5 in vlib_cli_dispatch_sub_commands (vm=0x1062160 <vlib_global_main>, cm=0x10623c8 <vlib_global_main+616>, input=0x7fffc5030d10, parent_command_index=201) at 3
#10 0x00007ffff74b2803 in vlib_cli_dispatch_sub_commands (vm=0x1062160 <vlib_global_main>, cm=0x10623c8 <vlib_global_main+616>, input=0x7fffc5030d10, parent_command_index=3) at /h1
#11 0x00007ffff74b2803 in vlib_cli_dispatch_sub_commands (vm=0x1062160 <vlib_global_main>, cm=0x10623c8 <vlib_global_main+616>, input=0x7fffc5030d10, parent_command_index=0) at /h1
#12 0x00007ffff74b2bda in vlib_cli_input (vm=0x1062160 <vlib_global_main>, input=0x7fffc5030d10, function=0xac8ac2 <shmem_cli_output>, function_arg=140736498699488) at /home/local7
#13 0x0000000000ac8d32 in vl_api_cli_request_t_handler (mp=0x305d9f20) at /home/localadmin/src/vpp/build-data/../vpp/vpp-api/api.c:3413
#14 0x00007ffff7bc777a in vl_msg_api_handler_with_vm_node (am=0x1062940 <api_main>, the_msg=0x305d9f20, vm=0x1062160 <vlib_global_main>, node=0x7fffc5028000) at /home/localadmin/s1
#15 0x00007ffff79acea7 in memclnt_process (vm=0x1062160 <vlib_global_main>, node=0x7fffc5028000, f=0x0) at /home/localadmin/src/vpp/build-data/../vlib-api/vlibmemory/memory_vlib.c2
#16 0x00007ffff74d9278 in vlib_process_bootstrap (_a=140736504110368) at /home/localadmin/src/vpp/build-data/../vlib/vlib/main.c:1191
#17 0x00007ffff63f9ecc in clib_calljmp () at /home/localadmin/src/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110
#18 0x00007fffc5559cf0 in ?? ()
#19 0x00007ffff74d93a3 in vlib_process_startup (vm=0x248070c927ff0700, p=0xffffffffffffffff, f=0x7fffc5028000) at /home/localadmin/src/vpp/build-data/../vlib/vlib/main.c:1213
#20 0x0000000001062280 in vlib_global_main ()
#21 0x00041d166eba1d47 in ?? ()
#22 0x00007fffc5028000 in ?? ()
#23 0x00007fffc4d53378 in ?? ()
#24 0x00007fffc4d53330 in ?? ()
#25 0x0000000000000004 in ?? ()
#26 0x00007fffc4d53378 in ?? ()
#27 0x00007fffc5028000 in ?? ()
#28 0x00007fffc504511c in ?? ()
#29 0x0000000000000000 in ?? ()
Assignee
Nelson Escobar
Reporter
sean chandler
Comments
wq 1 wq desc 1024
rq 2 rq desc 4096
EAL: Detected 48 lcore(s)
EAL: Probing VFIO support...
PMD: bnxt_rte_pmd_init() called for (null)
[New Thread 0x7fff9a230700 (LWP 6872)]
EAL: PCI device 0000:0d:00.0 on NUMA socket 0
EAL: probe driver: 1137:43 rte_enic_pmd
PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:aa:2b:12 wq/rq 1024/4096 mtu 1500, max mtu:9004
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss no intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 1 rq 2 cq 4 intr 6
DPDK physical memory layout:
Segment 0: phys:0x100000000, len:1073741824, virt:0x7ff700000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0
Segment 1: phys:0x2080000000, len:1073741824, virt:0x7fe680000000, socket_id:1, hugepage_sz:1073741824, nchannel:0, nrank:0
[New Thread 0x7fff99a2f700 (LWP 6873)]
PMD: rte_enic_pmd: Scatter rx mode disabled
PMD: rte_enic_pmd: Scatter rx mode not being used
PMD: rte_enic_pmd: Using 1024 rx descriptors (sop 1024, data 0)
PMD: rte_enic_pmd: vNIC resources used: wq 1 rq 2 cq 2 intr 0
PMD: rte_enic_pmd: MTU (1518) is greater than value configured in NIC (1500)
PMD: rte_enic_pmd: MTU changed from 1500 to 1518
When I use the defaults and add my patch to allow one rx queue, I see packets, but this happens:
00:02:27:369695: dpdk-input
TenGigabitEthernetd/0/0 rx queue 0
buffer 0x42003f7d: current data 0, length 118, free-list 0, totlen-nifb 0, trace 0x0
PKT MBUF: port 1, nb_segs 1, pkt_len 118
IP4: 52:54:00:0a:39:5a -> 00:25:b5:aa:2b:12 802.1q vlan 1311
ICMP: 13.11.1.1 -> 13.11.1.99
ICMP echo_request checksum 0x9ad
00:02:27:369729: error-drop
dpdk-input: Rx L4 checksum errors
EAL: Detected 48 lcore(s)
EAL: Probing VFIO support...
PMD: bnxt_rte_pmd_init() called for (null)
[New Thread 0x7fff9a230700 (LWP 27438)]
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: Device is blacklisted, not initializing
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL: Device is blacklisted, not initializing
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL: probe driver: 1137:43 rte_enic_pmd
PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:aa:2b:11 wq/rq 256/512 mtu 1500, max mtu:9004
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 2 rq 2 cq 4 intr 6
EAL: PCI device 0000:0d:00.0 on NUMA socket 0
EAL: probe driver: 1137:43 rte_enic_pmd
PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:aa:2b:12 wq/rq 256/512 mtu 1500, max mtu:9004
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 2 rq 2 cq 4 intr 6
EAL: PCI device 0000:0e:00.0 on NUMA socket 0
EAL: probe driver: 1137:43 rte_enic_pmd
PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:aa:2b:13 wq/rq 256/512 mtu 1500, max mtu:9004
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 2 rq 2 cq 4 intr 6
DPDK physical memory layout:
Segment 0: phys:0x300000000, len:1073741824, virt:0x7fef40000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0
Segment 1: phys:0x2280000000, len:1073741824, virt:0x7fdec0000000, socket_id:1, hugepage_sz:1073741824, nchannel:0, nrank:0
[New Thread 0x7fff99a2f700 (LWP 27439)]
PMD: rte_enic_pmd: WQ 0 - number of tx desc in cmd line (1024)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: Scatter rx mode disabled
PMD: rte_enic_pmd: Scatter rx mode not being used
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: vNIC resources used: wq 1 rq 2 cq 2 intr 0
PMD: rte_enic_pmd: MTU (1518) is greater than value configured in NIC (1500)
PMD: rte_enic_pmd: MTU changed from 1500 to 1518
PMD: rte_enic_pmd: WQ 0 - number of tx desc in cmd line (1024)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: Scatter rx mode disabled
PMD: rte_enic_pmd: Scatter rx mode not being used
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: vNIC resources used: wq 1 rq 2 cq 2 intr 0
PMD: rte_enic_pmd: MTU (1518) is greater than value configured in NIC (1500)
PMD: rte_enic_pmd: MTU changed from 1500 to 1518
PMD: rte_enic_pmd: WQ 0 - number of tx desc in cmd line (1024)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: Scatter rx mode disabled
PMD: rte_enic_pmd: Scatter rx mode not being used
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: vNIC resources used: wq 1 rq 2 cq 2 intr 0
PMD: rte_enic_pmd: MTU (1518) is greater than value configured in NIC (1500)
PMD: rte_enic_pmd: MTU changed from 1500 to 1518
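The log lines above report requested descriptor counts being reduced to what the adapter policy allows (1024 tx descriptors requested versus 256 in the policy, and rx descriptors adjusted down to 512). A minimal sketch of that kind of clamping follows; `sketch_clamp_desc` is a hypothetical helper, and the actual PMD may apply further adjustments that are not modeled here.

```c
#include <stdint.h>

/* Hypothetical helper, not the actual enic PMD code: limit a requested
 * descriptor count to the maximum set in the adapter policy. */
static uint32_t sketch_clamp_desc(uint32_t requested, uint32_t policy_max)
{
    return requested > policy_max ? policy_max : requested;
}
```

This is why raising the WQ and RQ ring sizes has to be done in the adapter policy itself: a larger value requested by the application is simply cut back to the policy value.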
vpp/vpp/conf/startup.conf
unix {
  nodaemon
  log /tmp/vpp.log
  full-coredump
}
api-trace {
  on
}
api-segment {
  gid vpp
}
root@devstack-vpp0:/home/localadmin/dpdk# git diff
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 47b07c9..dc3241e 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -440,7 +440,10 @@ static void enicpmd_dev_info_get(struct rte_eth_dev *eth_dev,
ENICPMD_FUNC_TRACE();
unix {
  nodaemon
  cli-listen localhost:5002
  log /tmp/vpp.log
  full-coredump
}
api-trace {
  on
}
dpdk {
  dev 0000:08:00.0
  dev 0000:0d:00.0 {vlan-strip-offload off}
  dev 0000:0e:00.0 {vlan-strip-offload off}
  socket-mem 8192,8192
  no-multi-seg
  num-mbufs 512000
}
api-segment {
  gid vpp
}
Original issue: https://jira.fd.io/browse/VPP-238