@ShadowLive ShadowLive commented Nov 28, 2020

Added support for Apple Silicon ARM64/OPENCL (RC5-72).

ARM64 Native (this fork)
[Nov 28 10:37:15 UTC] RC5-72 benchmark summary :
Default core : #-1 (undefined) 0 keys/sec
Fastest core : #1 (CL 1-pipe) 952,068,808 keys/sec

Existing AMD64 under Rosetta 2
[Nov 28 10:39:10 UTC] RC5-72 benchmark summary :
Default core : #-1 (undefined) 0 keys/sec
Fastest core : #1 (CL 1-pipe) 881,571,409 keys/sec

Added support for Apple Silicon ARM64 (OGR-NG).

ARM64 Native (this fork)
[Nov 30 13:12:32 UTC] OGR-NG: Benchmark for core #0 (FLEGE 2.0)
0.00:00:16.98 [109,275,989 nodes/sec]

Existing AMD64 under Rosetta 2
[Nov 30 13:13:43 UTC] OGR-NG benchmark summary :
Default core : #2 (cj-asm-sse2) 60,832,450 nodes/sec
Fastest core : #1 (cj-asm-generic) 88,987,193 nodes/sec

Added support for Apple Silicon ARM64 (RC5-72 ANSI).

This is slower than the AMD64 build under Rosetta 2 because the ScalarFusion core is not functional on Apple ARM64 at this time.

ARM64 Native (this fork)
[Nov 30 13:15:16 UTC] RC5-72 benchmark summary :
Default core : #-1 (undefined) 0 keys/sec
Fastest core : #0 (ANSI 4-pipe) 7,213,400 keys/sec

Existing AMD64 under Rosetta 2
[Nov 30 13:18:03 UTC] RC5-72 benchmark summary :
Default core : #3 (GO 2-pipe d) 11,192,248 keys/sec
Fastest core : #1 (KBE-64 3-pipe) 11,864,297 keys/sec
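
To put the three native-versus-Rosetta pairs above in relative terms, here is a minimal sketch in C that divides one throughput figure by the other. The names `speedup` and `print_speedups` are hypothetical and not part of the dnetc source; the inputs are the "Fastest core" rates (and the single OGR-NG native rate) quoted above.

```c
#include <stdio.h>

/* Hypothetical helper (not from dnetc): ratio of native throughput to
 * Rosetta 2 throughput. A value above 1.0 means the native build is faster. */
double speedup(double native_rate, double rosetta_rate)
{
    return native_rate / rosetta_rate;
}

/* Prints the ratio for each benchmark pair quoted in this description. */
void print_speedups(void)
{
    printf("RC5-72 OpenCL: %.2fx\n", speedup(952068808.0, 881571409.0));
    printf("OGR-NG:        %.2fx\n", speedup(109275989.0, 88987193.0));
    printf("RC5-72 ANSI:   %.2fx\n", speedup(7213400.0, 11864297.0));
}
```

By this measure the native build runs at roughly 1.08x (RC5-72 OpenCL) and 1.23x (OGR-NG) the Rosetta 2 rate, and at about 0.61x for RC5-72 ANSI, consistent with the ScalarFusion note above.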

Notes:

  • Confirmed it compiles using Clang 12.
    Apple clang version 12.0.0 (clang-1200.0.32.27)
    Target: arm64-apple-darwin20.1.0

  • Confirmed all tests passed using './dnetc -test'

  • Updated temperature.cpp to allow ARM64, but have not confirmed that the temperature can be read on Apple Silicon.

Still to do:

  • Get ARM64 ScalarFusion functional on macOS.
  • Get temperature reading working on Apple Silicon.
  • CPUINFO cannot currently be read by the ARM64 client.

Added .DS_Store to .gitignore
It doesn't seem the arm64 & OpenCL combination was originally planned. Added an exception for ARM64 & OpenCL.
Updated configure to include macosx-arm64-opencl. Although arm64 support is technically in macOS 11, not Mac OS X, it has more in common with Mac OS X. Consider adding a new OS to the list and updating.
Added OPENCL and ARM64 as a valid combination.
Added a (bad) assumption that if you're using an Apple arm64 processor with OpenCL it's not iOS.
Turn OGR back on in configure
Determined arm64 in add_sources is not required with OpenCL.
Tidied up macOS definitions to a single line.
Added ARM64 & ARM to CPU list rather than directly under APPLE
Fixed spacing
OSX: leading underscore required on assembler functions that are going to be global and visible to C/C++ code.
Added macosx-arm64 support for OGR-NG cores and ANSI RC5-72 cores. ScalarFusion not currently functional.
ScalarFusion not currently passing tests for macOS ARM64.
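
The leading-underscore note in the commit list above reflects how Mach-O names C symbols: the compiler prepends `_` to every external C identifier, so a hand-written assembler routine has to export `_name` for C/C++ code to link against `name`. A minimal sketch in C, assuming a hypothetical macro `ASM_SYMBOL` and an illustrative routine name `example_core` (neither is taken from the dnetc source):

```c
#include <string.h>

/* On Apple platforms (Mach-O) external C symbols carry a leading
 * underscore; on ELF targets such as Linux they do not. */
#if defined(__APPLE__)
#define ASM_SYMBOL_PREFIX "_"
#else
#define ASM_SYMBOL_PREFIX ""
#endif

/* Hypothetical macro: builds the label an assembler file must export
 * so that C/C++ code can call the routine by its plain name. */
#define ASM_SYMBOL(name) ASM_SYMBOL_PREFIX name

/* Returns the label to use for the illustrative routine "example_core". */
const char *example_core_label(void)
{
    return ASM_SYMBOL("example_core");
}
```

In an assembler source the same prefix would typically be applied both to the `.globl` directive and to the label itself.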
@ShadowLive ShadowLive changed the title from "Added support for ARM64/OpenCL (RC5-72) on macOS 11 (Apple Silicon)" to "Added support for ARM64/OpenCL (RC5-72) and ARM64 (OGR-NG & ANSI RC5-72) on macOS 11 (Apple Silicon)" on Nov 30, 2020
@SunsetAsh
Contributor

The current ARM64 core shipping in the private branch is MONIKA, which has replaced the (slow) ScalarFusion. MONIKA's performance is shaky on some cores (it works well on Kryo, poorly on licensable cores) but if you'd like to get in touch about making it work, feel free to hop on the IRC channel.

@ertyu
Member

ertyu commented Jul 18, 2025

I got the MONIKA core going; here are the results:

dnetc v2.9116-525-CTR-16021318 for Mac OS X (Darwin 23.6.0).
Please provide the entire version descriptor when submitting bug reports.
The distributed.net bug report pages are at http://bugs.distributed.net/

[Jul 18 04:45:52 UTC] RC5-72: using core #0 (ANSI 4-pipe).
[Jul 18 04:46:11 UTC] RC5-72: Benchmark for core #0 (ANSI 4-pipe)
0.00:00:17.00 [8,778,491 keys/sec]
[Jul 18 04:46:11 UTC] RC5-72: using core #1 (ANSI 2-pipe).
[Jul 18 04:46:31 UTC] RC5-72: Benchmark for core #1 (ANSI 2-pipe)
0.00:00:16.17 [8,709,204 keys/sec]
[Jul 18 04:46:31 UTC] RC5-72: using core #2 (ANSI 1-pipe).
[Jul 18 04:46:49 UTC] RC5-72: Benchmark for core #2 (ANSI 1-pipe)
0.00:00:16.19 [8,124,599 keys/sec]
[Jul 18 04:46:49 UTC] RC5-72: using core #3 (KS-MONIKA 4-pipe).
[Jul 18 04:47:08 UTC] RC5-72: Benchmark for core #3 (KS-MONIKA 4-pipe)
0.00:00:16.11 [10,159,636 keys/sec]
[Jul 18 04:47:08 UTC] RC5-72: using core #4 (KS-MONIKA 2-pipe).
[Jul 18 04:47:26 UTC] RC5-72: Benchmark for core #4 (KS-MONIKA 2-pipe)
0.00:00:16.08 [5,103,393 keys/sec]
[Jul 18 04:47:26 UTC] RC5-72 benchmark summary :
Default core : #-1 (undefined) 0 keys/sec
Fastest core : #3 (KS-MONIKA 4-pipe) 10,159,636 keys/sec

dnetc v2.9112-521-CTR-16020314 for Mac OS X (Darwin 23.6.0).
Please provide the entire version descriptor when submitting bug reports.
The distributed.net bug report pages are at http://bugs.distributed.net/

[Jul 18 04:41:40 UTC] Automatic processor type detection found
an Intel Xeon 56xx processor.
[Jul 18 04:41:40 UTC] RC5-72: using core #0 (SNJL 3-pipe).
[Jul 18 04:41:59 UTC] RC5-72: Benchmark for core #0 (SNJL 3-pipe)
0.00:00:16.05 [10,244,592 keys/sec]
[Jul 18 04:41:59 UTC] RC5-72: using core #1 (KBE-64 3-pipe).
[Jul 18 04:42:19 UTC] RC5-72: Benchmark for core #1 (KBE-64 3-pipe)
0.00:00:17.05 [11,903,789 keys/sec]
[Jul 18 04:42:19 UTC] RC5-72: using core #2 (GO 2-pipe c).
[Jul 18 04:42:38 UTC] RC5-72: Benchmark for core #2 (GO 2-pipe c)
0.00:00:16.08 [9,547,581 keys/sec]
[Jul 18 04:42:38 UTC] RC5-72: using core #3 (GO 2-pipe d).
[Jul 18 04:42:58 UTC] RC5-72: Benchmark for core #3 (GO 2-pipe d)
0.00:00:17.05 [11,224,523 keys/sec]
[Jul 18 04:42:58 UTC] RC5-72 benchmark summary :
Default core : #3 (GO 2-pipe d) 11,224,523 keys/sec
Fastest core : #1 (KBE-64 3-pipe) 11,903,789 keys/sec
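
For comparing runs like the two above, the rate can be pulled out of each benchmark result line programmatically. A minimal sketch in C, assuming the line format shown in these logs (a bracketed, comma-grouped number followed by a unit); `parse_rate` is a hypothetical helper, not a dnetc function.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: extracts the bracketed rate from a benchmark
 * result line such as "0.00:00:16.11 [10,159,636 keys/sec]".
 * Returns 0 if no '[' is found or no digits follow it. */
unsigned long long parse_rate(const char *line)
{
    const char *p = strchr(line, '[');
    unsigned long long rate = 0;
    int digits = 0;

    if (p == NULL)
        return 0;

    /* Accumulate digits, skipping the comma thousands separators,
     * and stop at the first other character (the space before the unit). */
    for (++p; *p != '\0'; ++p) {
        if (isdigit((unsigned char)*p)) {
            rate = rate * 10 + (unsigned long long)(*p - '0');
            digits++;
        } else if (*p != ',') {
            break;
        }
    }
    return digits > 0 ? rate : 0;
}
```

Timestamped lines that begin with `[Jul 18 ...]` yield 0 here because the bracket is not followed by a digit, so only the result lines contribute a rate.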
