Implement Ice Lake kernel #94

Open
k0ekk0ek opened this issue Aug 4, 2023 · 3 comments
@k0ekk0ek
Contributor

k0ekk0ek commented Aug 4, 2023

So far, we support only SSE4.1 and AVX2, but AVX-512 may greatly improve speed. An initial port of simd.h won't require much work and halves the number of operations for the scanner. I expect AVX-512 to improve parsing of certain data types, base16 sequences and base64 sequences, although we can worry about those at a later stage and start off by just including AVX2 operations and go from there.
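To make the "halves the number of operations" point concrete, here is a minimal sketch of one scanner primitive: finding every occurrence of a byte in a 64-byte block and returning the positions as a 64-bit mask. The function names (`find_scalar`, `find_avx2`, `find_avx512`, `find`) are hypothetical and not taken from simd.h. The sketch assumes GCC or Clang on x86-64 and uses a runtime check so that it falls back to plain C elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_SIMD 1
#endif

/* Portable reference: bit i of the result is set if block[i] == c. */
static uint64_t find_scalar(const uint8_t block[64], uint8_t c)
{
  uint64_t mask = 0;
  for (int i = 0; i < 64; i++)
    mask |= (uint64_t)(block[i] == c) << i;
  return mask;
}

#ifdef HAVE_X86_SIMD
/* AVX2: two loads, two compares, two movemasks, then a shift and an OR. */
__attribute__((target("avx2")))
static uint64_t find_avx2(const uint8_t block[64], uint8_t c)
{
  const __m256i needle = _mm256_set1_epi8((char)c);
  const __m256i lo = _mm256_loadu_si256((const __m256i *)block);
  const __m256i hi = _mm256_loadu_si256((const __m256i *)(block + 32));
  const uint64_t m0 = (uint32_t)_mm256_movemask_epi8(_mm256_cmpeq_epi8(lo, needle));
  const uint64_t m1 = (uint32_t)_mm256_movemask_epi8(_mm256_cmpeq_epi8(hi, needle));
  return m0 | (m1 << 32);
}

/* AVX-512BW: one load and one compare that yields the 64-bit mask directly. */
__attribute__((target("avx512bw")))
static uint64_t find_avx512(const uint8_t block[64], uint8_t c)
{
  const __m512i input = _mm512_loadu_si512(block);
  return _mm512_cmpeq_epi8_mask(input, _mm512_set1_epi8((char)c));
}
#endif

/* Pick the widest variant the running CPU supports. */
static uint64_t find(const uint8_t block[64], uint8_t c)
{
  assert(block != NULL);
#ifdef HAVE_X86_SIMD
  if (__builtin_cpu_supports("avx512bw"))
    return find_avx512(block, c);
  if (__builtin_cpu_supports("avx2"))
    return find_avx2(block, c);
#endif
  return find_scalar(block, c);
}
```

Fewer instructions per block does not by itself guarantee a faster scanner, so a port along these lines would still need to be benchmarked.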

@k0ekk0ek k0ekk0ek self-assigned this Aug 4, 2023
@lemire
Collaborator

lemire commented Aug 4, 2023

> An initial port of simd.h won't require much work and halves the number of operations for the scanner.

In many instances, AVX-512 can be twice as fast on the same hardware. But if you merely do a straight port, the likely outcome is that the performance won't be improved. It is not the wider registers that actually help most. E.g., Zen 4 still uses 256-bit operations internally and is competitive with Ice Lake. It is somewhat misleading to think of AVX-512 as just wider registers (though it is that). AVX-512 requires "from the ground up" design to really shine. Of course, it is not a daring research question: simdjson shines with AVX-512. And it is not super hard... but it is not a refactoring problem.

@k0ekk0ek
Contributor Author

k0ekk0ek commented Aug 5, 2023

As a first project I'm hoping to port the scanner (or stage1 in simdjson terms) to get an initial kernel started. At least, I expect that part to be relatively straightforward. After that I hope to implement faster parsing of base16 sequences, hoping that compress will make a big difference there. Over the last week I added many RRs and a lot of them use hex encoding, e.g. EUI48 and EUI64 (or MAC addresses), which are encoded as xx-xx-xx-xx-xx-xx (this is relatively straightforward in SSE too), but also just plain DS records. I have some ideas for making that better. But I'm not very experienced with AVX-512 yet, so this may prove harder than I expect 😅
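As an illustration of the compress idea for hex-encoded fields, the following is a rough sketch of an EUI48 parser that strips the dashes with a byte compress and then decodes the twelve remaining digits. It is not the project's implementation. All function names are hypothetical, the hex decoding is left scalar for brevity, and the vector path assumes GCC or Clang on x86-64 with AVX-512 VBMI2, VL and BW available at run time (a plain C loop is used otherwise).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_SIMD 1
#endif

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hex_value(uint8_t c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  c |= 0x20; /* fold upper case onto lower case */
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Portable reference: copy every byte except '-' to out, return the count. */
static size_t strip_dashes_scalar(const char *text, size_t len, uint8_t out[32])
{
  size_t n = 0;
  for (size_t i = 0; i < len && i < 32; i++)
    if (text[i] != '-')
      out[n++] = (uint8_t)text[i];
  return n;
}

#ifdef HAVE_X86_SIMD
/* Same result in the first n bytes, using a masked load and a byte compress. */
__attribute__((target("avx512vbmi2,avx512vl,avx512bw")))
static size_t strip_dashes_vbmi2(const char *text, size_t len, uint8_t out[32])
{
  /* Masked load: bytes at or beyond len are not accessed and read as zero. */
  const __mmask32 valid = len >= 32 ? 0xFFFFFFFFu : (1u << len) - 1u;
  const __m256i input = _mm256_maskz_loadu_epi8(valid, text);
  /* One bit per valid byte that is not a dash. */
  const __mmask32 keep =
    _mm256_mask_cmpneq_epi8_mask(valid, input, _mm256_set1_epi8('-'));
  /* Pack the kept bytes to the front, zero the rest. */
  _mm256_storeu_si256((__m256i *)out, _mm256_maskz_compress_epi8(keep, input));
  return (size_t)__builtin_popcount(keep);
}
#endif

/* Parse "xx-xx-xx-xx-xx-xx" into 6 octets. Returns 0 on success, -1 on error. */
static int parse_eui48(const char *text, uint8_t eui[6])
{
  uint8_t digits[32];
  size_t n;

  assert(text != NULL && eui != NULL);
  if (strlen(text) != 17)
    return -1;
  for (size_t i = 2; i < 17; i += 3) /* dashes must sit at 2, 5, 8, 11, 14 */
    if (text[i] != '-')
      return -1;
#ifdef HAVE_X86_SIMD
  if (__builtin_cpu_supports("avx512vbmi2") &&
      __builtin_cpu_supports("avx512vl") &&
      __builtin_cpu_supports("avx512bw"))
    n = strip_dashes_vbmi2(text, 17, digits);
  else
#endif
    n = strip_dashes_scalar(text, 17, digits);
  if (n != 12) /* a dash in a digit position leaves fewer than 12 digits */
    return -1;
  for (size_t i = 0; i < 6; i++) {
    const int hi = hex_value(digits[2 * i]);
    const int lo = hex_value(digits[2 * i + 1]);
    if (hi < 0 || lo < 0)
      return -1;
    eui[i] = (uint8_t)(hi << 4 | lo);
  }
  return 0;
}
```

The same pattern (compare to build a mask, compress to drop separators) might carry over to other separator-delimited hex formats, though whether it beats the existing SSE approach would have to be measured.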

@lemire
Collaborator

lemire commented Aug 5, 2023

I recommend requiring VBMI2. That's what you get in Ice Lake and Zen 4.
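A requirement like this is usually enforced when the kernel is selected at run time. The sketch below shows one possible way to do that with the GCC/Clang `__builtin_cpu_supports` builtin. The `kernel_t` enum, the kernel names, and the exact set of AVX-512 subsets checked alongside VBMI2 are assumptions made for illustration, not the project's actual dispatch code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel identifiers, ordered from least to most capable. */
typedef enum {
  KERNEL_FALLBACK,
  KERNEL_SSE41,
  KERNEL_AVX2,
  KERNEL_ICELAKE
} kernel_t;

/* Human-readable name for a kernel (names are illustrative). */
static const char *kernel_name(kernel_t kernel)
{
  static const char *const names[] = { "fallback", "sse4.1", "avx2", "icelake" };
  assert((size_t)kernel < sizeof names / sizeof names[0]);
  return names[kernel];
}

/* Choose the most capable kernel the running CPU supports. */
static kernel_t select_kernel(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  __builtin_cpu_init();
  /* Assumption: the Ice Lake kernel needs F, BW and VL in addition to VBMI2. */
  if (__builtin_cpu_supports("avx512f") &&
      __builtin_cpu_supports("avx512bw") &&
      __builtin_cpu_supports("avx512vl") &&
      __builtin_cpu_supports("avx512vbmi2"))
    return KERNEL_ICELAKE;
  if (__builtin_cpu_supports("avx2"))
    return KERNEL_AVX2;
  if (__builtin_cpu_supports("sse4.1"))
    return KERNEL_SSE41;
#endif
  return KERNEL_FALLBACK;
}
```

With a check of this shape, a CPU that has AVX-512F but lacks VBMI2 would simply land on the AVX2 kernel.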

@k0ekk0ek k0ekk0ek added the enhancement New feature or request label Nov 23, 2023