-
Notifications
You must be signed in to change notification settings - Fork 861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for PACKET_FANOUT on Linux #869
base: master
Are you sure you want to change the base?
Conversation
Hi all, are there any issues with merging this PR? |
I wonder if there's a more generic way to expose this, as many network stacks provide similar capabilities. Ones I definitely know of are DPDK and XDP (technically Linux, too, but you configure it by other means), but I expect most other mainstream *nixes to have equivalent features. |
@Oppen
From what I gather, libpcap currently has initial support for DPDK, while it provides no support for XDP. |
@tsnoam I see your point, but I think masking implementation details is more or less the point of using libpcap as an abstraction. I'm OK-ish with marking it as an unstable API, although I have my doubts on that concept (eventually it has users and you can never drop it without facing their wrath). Also, it may be interesting to take a look at this patchset. |
|
Also, thanks for the review. I pushed and updated version of the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsnoam I see your point, but I think masking implementation details is more or less the point of using libpcap as an abstraction. I'm OK-ish with marking it as an unstable API, although I have my doubts on that concept (eventually it has users and you can never drop it without facing their wrath).
Note though that I'm not a maintainer. My opinions and suggestions are just that, and carry no weight towards acceptance. Just in case you feel discouraged by my reluctance to make it platform-specific at the API level.
Also, it may be interesting to take a look at this patchset.1. Again, I get what you're saying about properly wrapping all various technologies into a single API, hiding implementation details from the user. However, waiting until everything was properly learned and consolidated may take too long and practically stops progress. 2. I looked at the patchset you've sent. It seems to be already merged into the kernel. If you were referring to the fact that there are more functions to wrap: well, this is exactly the point: you make small progress by slowly adding functions to competing APIs (packet socket vs dpdk vs ... ). One day you reach a point where you know enough of all technologies and feel comfortable wrapping them into a single API. But until then, why wait?
- I agree.
- That wasn't a counter argument or a suggestion for you to implement it now. It was just a "look at this cool thing", that may interest you because you're interested in something similar :)
I know it's merged, but the patch has a nice explanation about what it does, and I think it's a good way to point to features.
Oh, thanks for the reference then :-) In any case, I feel that we do have an understanding that this PR is worthy of finding itself into upstream, with the open question of whether it should be marked as "unstable" pending maintainers decision. |
I think it's definitely worthy of getting upstream. That doesn't mean improving it (whatever that means) is out of the question. |
.SH DESCRIPTION | ||
On Linux network interface devices, | ||
.B pcap_set_fanout_linux() | ||
sets the fanout configuration to be used in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain what the affect of having packet fanout mode?
When would I turn it on and why? Is there is a downside to it? What if every application turned it on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the AF_PACKET fanout mode, packet reception can be load balanced among processes/threads.
When "configuring" the socket to be used, you would be required to set a group_id, use the same group_id to achieve the required load balancing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't tell me. Put it in the manpage!
@tsnoam @mcr @guyharris I tried building libpcap from your branch. And I followed the doc. Everything compiles well and I test it using 3 parallel processes. So I changed my pcap_open_live to the following in each process: pcap_handle = pcap_create((char*)pcap_file, err_open);
set = pcap_set_fanout_linux(pcap_handle, 1, 0, root_idx);
pcap_set_snaplen(pcap_handle, snaplen);
pcap_set_promisc(pcap_handle, promisc);
pcap_set_timeout(pcap_handle, 500);
pcap_activate(pcap_handle); Which prints the desired output: process (group) id: 0 fanout set: 0
process (group) id: 1 fanout set: 0
process (group) id: 2 fanout set: 0 However, when checking if FANOUT worked, I observe that each process still receives all packets. |
@aouinizied it's very strange. I'm using the same construction in production:
pcap_set_fanout_linux here do the same as
Does your root_idx is unique? |
@aouinizied looks like I found the problem:
I think it will be better to use:
because the third parameter is mode. I think it will be better to remove enable because it's not used in code. |
@xpahos Thanks for the feedback. I will replace PACKET_FANOUT_LB by 0 for FLOW_HASH mode, test, and get back to you. |
@xpahos checked that each root_idx is unique but the same observations. set = pcap_set_fanout_linux(pcap_handle, 1, 0, root_idx) The only difference with your setup is the mode. You set it to PACKET_FANOUT_LB whereas I use PACKET_FANOUT_HASH as the third parameter. |
@aouinizied yes, you right. PACKET_FANOUT_HASH should send packets only to one unique socket per each address port mapping. |
@xpahos another detail that may help reproducing is that I'm using: int rv_handle = pcap_next_ex(pcap_handle, &hdr, &data); to poll the received packets. But this should not affect fanout right? |
@xpahos I found what was wrong. And it's clearly stated in the doc.
In my example replacing root_idx by 0 solve it and now I see traffic correctly load-balanced across my processes. @tsnoam Thank you for this great contribution! A detail that maybe need to be added to the doc is that for PACKET_FANOUT_HASH there is a known bug on some Linux kernel versions. The bug consists in that a src->dst flow traffic will be hashed with a different hash than dst->src and this will result in inconsistency for bidirectional flow-based apps. |
@aouinizied Sorry for the delay in response, I'm happy to see that you've eventually fixed the problem. Anyhow, most of the thanks are to @xpahos who had made the original PR, I have based on their work. @xpahos that's a good catch regarding the |
Hi. I wanted to try out fanout support in tcpdump. Is this mode controlled by a command line option to tcpdump or is it always enabled in this branch? |
@tsnoam is it possible to rebase your branch? Many thanks for your contribution. |
@aouinizied have you encountered problems trying to rebase yourself? |
@tsnoam no but I'm using your fork (until it gets merged within libpcap master) within my project (https://www.nfstream.org/docs/#building-nfstream-from-sources) and seems to be outdated. updating your branch with your rebase would be really helpful. |
Hi @aouinizied , As the branch can be directly merged to master, i mostly wait for the maintainers to pick it up. Personally, my systems work in production based on my branch, but I do not have the capacity at the moment to test a rebase over latest master. |
@mcr Anything I can do to help, let me know! |
.SH DESCRIPTION | ||
On Linux network interface devices, | ||
.B pcap_set_fanout_linux() | ||
sets the fanout configuration to be used in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't tell me. Put it in the manpage!
Thank you for proposing these changes. If you consider them ready for merging, please rebase them on the current master branch and squash the commits together into one or more commits, so each change completely progress the working tree from one consistent working state to another. |
This work is based on PR the-tcpdump-group#674 while taking into consideration the review comments which caused the PR to be closed by the original author, @xpahos.
@mcr I've added the documentation as requested. I hope it is clear enough. If not, I'll be happy to fix as necessary. @infrastation I do consider this change as ready for merging (FWIW, I've been using this code for 2+ years in production). As requested, I've squashed all commits and rebased it over current master. Regarding CI failure in AppVeyor, it seems unrelated to my change ( |
@guyharris @tsnoam any plan to merge this PR? Thanks, |
@guyharris any news regarding this PR? |
REMINDER: @infrastation @mcr any estimation of when this PR will be merged? Zied |
This work is based on PR #674 while taking into consideration the review
comments which caused the PR to be closed by the original author,
@xpahos.