Use logical count for everything newer than dozer#4

Closed
Artoria2e5 wants to merge 2 commits into GPUOpen-LibrariesAndSDKs:master from Artoria2e5:master

@Artoria2e5

It probably doesn't make sense to kill the whole core detection completely, but we can still make everything a bit better by not limiting ourselves to dozer only. After all, Ken Mitchell 2017 does state that SMT is useful most of the time.

Fixes #2. Should be more in the spirit of @ThomasTheGerman's suggestion.

-if ((0 == strcmp(vendor, "AuthenticAMD")) && (0x15 == getCpuidFamily())) {
-    // AMD "Bulldozer" family microarchitecture
+if ((0 == strcmp(vendor, "AuthenticAMD")) && (0x15 <= getCpuidFamily())) {
+    // AMD "Bulldozer" family microarchitecture or newer
Author

okay, maybe not this..

@Artoria2e5
Author

Wait, when did AMD start having a logical/physical core difference anyway? Isn't it just dozer with its clusters? And what is this function "usually" good for, if SMT usually gives better performance? Shouldn't a README change be enough?

Eff this.

@Artoria2e5 Artoria2e5 closed this Dec 13, 2020
@SylwesterZarebski

SylwesterZarebski commented Dec 14, 2020

This patch is wrong. It should return the logical core count when the family is greater than or equal to 0x15, not less than or equal.
Ryzen has family 0x17 (23).

Example: https://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%207%201700.html
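
For readers wondering where a family value like 0x17 comes from, here is a minimal sketch of the standard CPUID leaf 1 decoding. The helper name `display_family` is hypothetical and is not taken from this repository (the body of `getCpuidFamily()` is not shown in this thread), and the sample signatures in the comments are assumed typical values for a Zen 1 and a Bulldozer part rather than numbers quoted from the linked page.

```c
#include <stdint.h>

/* Hypothetical helper, not the repository's getCpuidFamily().
 * Decodes the "display family" from the EAX value of CPUID leaf 1:
 *   - base family is bits 11:8
 *   - extended family is bits 27:20, added only when base family is 0xF
 * Assumed example signatures:
 *   0x00800F11 (Zen 1)      -> 0xF + 0x08 = 0x17
 *   0x00600F12 (Bulldozer)  -> 0xF + 0x06 = 0x15
 */
static unsigned display_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xFu;
    unsigned ext  = (eax >> 20) & 0xFFu;
    return (base == 0xFu) ? base + ext : base;
}
```

With this decoding, any family from Bulldozer onward compares as greater than or equal to 0x15, which is the property the patch relies on.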

@e0x70i

e0x70i commented Dec 15, 2020

> This patch is wrong. It should return the logical core count when the family is greater than or equal to 0x15, not less than or equal.
> Ryzen has family 0x17 (23).
>
> Example: https://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%207%201700.html

That's exactly what the patch does. (0x15 <= getCpuidFamily()) is true only when the CPU family is greater than or equal to 0x15.
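
The constant-on-the-left comparison is easy to misread, so a small sketch may help. It restates the two conditions from the diff as standalone predicates. The function names are hypothetical, and the family is passed in as a parameter instead of calling `getCpuidFamily()`, purely so the logic can be checked in isolation.

```c
#include <stdbool.h>
#include <string.h>

/* Condition before the patch: only family 0x15 (Bulldozer) matches. */
static bool use_logical_before(const char *vendor, unsigned family)
{
    return (0 == strcmp(vendor, "AuthenticAMD")) && (0x15 == family);
}

/* Condition after the patch: 0x15 <= family reads as "family >= 0x15",
 * so Bulldozer (0x15) and newer families such as 0x17 match. */
static bool use_logical_after(const char *vendor, unsigned family)
{
    return (0 == strcmp(vendor, "AuthenticAMD")) && (0x15 <= family);
}
```

Under these assumptions, family 0x17 fails the old check and passes the new one, family 0x15 passes both, and family 0x14 fails both.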

@SylwesterZarebski

Sorry, you're right, I swapped the sides in my head ;-)

Development

Successfully merging this pull request may close these issues.

getDefaultThreadCount faulty

3 participants