Fix for incorrect thread count reporting on zen based processors #3
Some reference material from AMD on this matter: https://gpuopen.com/learn/cpu-core-count-detection-windows/ and https://gpuopen.com/wp-content/uploads/2018/05/gdc_2018_sponsored_optimizing_for_ryzen.pdf (see page 25)
Not sure if this was directed at me, but a function that's called But generally speaking SMT improves the performance of a product, thus should not be disabled by default.
@samklop you just removed code that checks result of
And how could it possibly be "not correct"? Also, this sample is not ready to be used "as is", see below.
I fully understand why this guidance and sample were created. But as stated, "Remember to profile!": this sample is just example code that should never be used "as is" in production. It doesn't help that the sample was poorly written, in 2017, with Zen(1) in mind. The check is only for the "Bulldozer" family, and everything else is implicitly treated like Zen(1) under the guidance linked above. If you run this code 10 years from now on a brand-new architecture, it will still apply guidance from the Zen1 era (and won't use the 4-way SMT that we may have by then). Or even now, one may ask, does this still hold on Zen3? One cannot assume anything about future architectures, and optimizations such as affinity or thread count need to be done per application and with a specific CPU family in mind. Generally, using the logical core count is the universal approach, and any fine-tuning needs to be profiled, not blindly applied. That said, this PR is obviously wrong, because the main point of this sample code is to show how to select a different thread count based on CPU family, and you removed that. It is the developer's job to apply the code to their product properly to maximize performance.
I think GPUOpen overestimates the mental capabilities of game developers if they truly expect all of them to profile this stuff, since even a AAA game developer like CDPR didn't bother to do it.
Please just remove this repository. It is utterly bad.
Thank you.
The main point of this code, according to the readme and its name, is to detect logical and physical core counts. It does just that. The problem is that it then makes an arbitrary decision to treat some AMD CPUs differently. What this sample code should do is just return the number of cores and that's it. Or, at the very least, always return the logical core count as the default thread count, because that's the only safe default. Everything else is harmful and misleading. It's naive to think that developers would test code before copying it, especially when that requires complex testing on different hardware. CDPR is just one recent example of that. The PR is definitely correct in that it removes the arbitrary logic and just makes the code do what it is supposed to be doing: return core counts.
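To make the "logical core count as the only safe default" suggestion concrete, here is a minimal sketch using only the C++ standard library. The helper name `default_worker_threads` is hypothetical and is not part of the sample discussed in this thread. The only extra behavior assumed is a fallback to one thread, since `std::thread::hardware_concurrency()` is allowed to return 0 when the count cannot be determined.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper, not part of the GPUOpen sample.
// Takes the logical processor count reported by the platform and
// returns it unchanged, except that 0 ("unknown") becomes 1 so that
// callers always get a usable worker count.
unsigned default_worker_threads(unsigned reported_logical)
{
    return std::max(1u, reported_logical);
}

// Convenience overload that asks the standard library for the
// logical processor count. No vendor or family checks are applied.
unsigned default_worker_threads()
{
    return default_worker_threads(std::thread::hardware_concurrency());
}
```

Any per-family adjustment would then be an explicit, profiled override layered on top of this value rather than something baked into the default.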
While I understand the sentiment, what you are saying is not exactly correct. The code has separate value for The sample does already return just cores and just threads. The function in question does a 3rd thing, which is something else. This repo should be just removed. It doesn't serve any useful purpose to anybody. A way better approach is to make a blog post, with some examples and a discussion.
Can't imagine how many applications and games are limited on AMD CPUs only because of this mistake 👀
Simple: fire up any application that makes use of GPUOpen libraries and check whether it fails to account for "SMT threads", aka the second thread of every core.
I totally agree. In my response I was kind of trying to understand (and explain) why this repo even exists, but it should never be used in production as a copy-paste snippet. As an appendix to a blog post it is fine; it shouldn't be a GitHub repo, but an attachment to said blog post at best. But guys, none of this matters. There is loads of bad code on the net. Just don't copy it into your application. Think first. It is unfortunate that it's branded under GPUOpen libraries, but what can you do...
The issue with Cyberpunk caused us to revisit what this sample code recommends to developers, and we have been addressing it, both by working with developers who have already used the code in production and through a future update to this code in which we want to make it clearer how to work with it and what advice it gives. So we've been working on it, and I'll chase up where we are with a public update. That means we won't merge this pull request, but we are working on it.
This fixes issue #2
What this does:
Removes the code that sets the thread count to the physical core count on AMD processors that are not based on the Bulldozer architecture
I have tested my changes on a Ryzen 5 3600
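The change described above can be sketched as a pair of pure functions. This is an illustration reconstructed from the behavior described in this thread, not the repository's actual code: `CpuInfo`, `thread_count_before`, and `thread_count_after` are hypothetical names, and the sketch assumes the Bulldozer check is a comparison against CPUID family 0x15 on an "AuthenticAMD" vendor string.

```cpp
#include <cstdint>
#include <string>

// Hypothetical description of a CPU; the real sample queries these
// values from the OS and CPUID rather than taking them as input.
struct CpuInfo {
    std::string vendor;        // CPUID vendor string, e.g. "AuthenticAMD"
    std::uint32_t family;      // CPUID display family
    unsigned physical_cores;   // number of physical cores
    unsigned logical_cores;    // number of logical processors (SMT included)
};

// AMD "Bulldozer" family microarchitectures report family 0x15.
constexpr std::uint32_t kBulldozerFamily = 0x15;

// Behavior as described before the PR (assumed): AMD CPUs outside the
// Bulldozer family get the physical core count, everything else gets
// the logical count.
unsigned thread_count_before(const CpuInfo& cpu)
{
    if (cpu.vendor == "AuthenticAMD" && cpu.family != kBulldozerFamily) {
        return cpu.physical_cores;
    }
    return cpu.logical_cores;
}

// Behavior after the PR (assumed): always the logical count.
unsigned thread_count_after(const CpuInfo& cpu)
{
    return cpu.logical_cores;
}
```

For a Ryzen 5 3600 (family 0x17, 6 cores, 12 logical processors), `thread_count_before` yields 6 and `thread_count_after` yields 12, which is the difference the PR is about.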