Release 23.05
Summary
New Features
- Added support for dynamic weights in CpuAcc and GpuAcc FullyConnected workloads (see the sketch after this list).
- Added support for Crops in CpuAcc and GpuAcc BatchToSpaceND workloads.
- Added support for int8 and changed the Compute Library kernel used for Fp32 in CpuAcc and GpuAcc Batch MatMul workloads.
- Added an Opaque TfLite Delegate which provides the same operator coverage as the existing/classic TfLite Delegate. More information can be found in the TfLite Delegate section below.
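For reference, the block below is a minimal sketch of building a FullyConnected layer with dynamic (non-constant) weights through the armnn::INetwork API, as mentioned in the first item above. Tensor shapes, names and the weight layout are illustrative only; this is not code taken from the Arm NN repository.

```cpp
// Illustrative only: a FullyConnected layer whose weights arrive through a normal
// (non-constant) network input instead of a ConstantLayer. Shapes are made up.
#include <armnn/Descriptors.hpp>
#include <armnn/INetwork.hpp>
#include <armnn/Tensor.hpp>
#include <armnn/Types.hpp>

armnn::INetworkPtr BuildFullyConnectedWithDynamicWeights()
{
    using namespace armnn;
    INetworkPtr network = INetwork::Create();

    FullyConnectedDescriptor desc;
    desc.m_BiasEnabled           = false;
    desc.m_TransposeWeightMatrix = true;   // weights laid out as [outputs, inputs] (illustrative)
    desc.m_ConstantWeights       = false;  // weights are supplied at inference time

    IConnectableLayer* input   = network->AddInputLayer(0, "input");
    IConnectableLayer* weights = network->AddInputLayer(1, "weights");
    IConnectableLayer* fc      = network->AddFullyConnectedLayer(desc, "fc");
    IConnectableLayer* output  = network->AddOutputLayer(0, "output");

    input->GetOutputSlot(0).SetTensorInfo(TensorInfo({ 1, 8 }, DataType::Float32));
    weights->GetOutputSlot(0).SetTensorInfo(TensorInfo({ 4, 8 }, DataType::Float32));
    fc->GetOutputSlot(0).SetTensorInfo(TensorInfo({ 1, 4 }, DataType::Float32));

    // Slot 0 carries the data, slot 1 the weights (slot 2 would carry the bias if enabled).
    input->GetOutputSlot(0).Connect(fc->GetInputSlot(0));
    weights->GetOutputSlot(0).Connect(fc->GetInputSlot(1));
    fc->GetOutputSlot(0).Connect(output->GetInputSlot(0));

    return network;
}
```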
TfLite Parser
- Added support for CEIL and SPACE_TO_DEPTH operators.
- Fixed bug where calculated output shape wasn't being recorded in ParseSqueeze.
- Fixed segfault in ParseTransposeConv2d when output_shape is not constant.
- Fixed bug where negative axis was being read incorrectly in ParseMean.
- Calculate explicit padding for Transpose Convolution using the output shape, if specified (see the sketch after this list).
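The explicit-padding change above follows from the standard transpose-convolution relationship outputSize = (inputSize - 1) * stride + kernelSize - (padLow + padHigh), for dilation 1 and no output padding. The helper below only illustrates that arithmetic and is not the parser's actual code; how the total is split between the low and high side is a convention.

```cpp
// Illustration of the arithmetic only (not the parser's code): derive explicit padding
// for a transpose convolution when the output size is known, assuming dilation 1, no
// output padding, and outputSize <= (inputSize - 1) * stride + kernelSize.
#include <cstdint>
#include <utility>

std::pair<uint32_t, uint32_t> CalcTransposeConvPadding(uint32_t inputSize,
                                                       uint32_t outputSize,
                                                       uint32_t kernelSize,
                                                       uint32_t stride)
{
    // outputSize = (inputSize - 1) * stride + kernelSize - (padLow + padHigh)
    const uint32_t totalPadding = (inputSize - 1) * stride + kernelSize - outputSize;

    // Convention: split evenly, remainder on the high side.
    const uint32_t padLow  = totalPadding / 2;
    const uint32_t padHigh = totalPadding - padLow;
    return { padLow, padHigh };
}
```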
ONNX Parser
- Added support for dimensions > 2 to MatMul/FullyConnected.
Arm NN Serializer/Deserializer
- Added support for CEIL.
Bug Fixes
- Fixed the compare-output feature in ExecuteNetwork.
- Fixed gcc 13 compiler errors.
Other Changes
- Added ElementwiseBinaryLayer to replace the Add, Div, Sub, Maximum, Mul and Minimum layers (see the sketch after this list).
- Updated build Android NDK guide (BuildGuideAndroidNDK.md).
- Set default quantization parameter scale to 1.0, instead of 0.0.
- Fp16ToFp32 and Fp32ToFp16 convert workloads now use arm_compute::NECast in the CpuAcc backend, when available. This should generally be faster.
- Added Opaque TfLite Delegate build option to the build-tool.
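For reference, the block below is a minimal sketch of expressing an addition with the new ElementwiseBinaryLayer in place of the dedicated Add layer, assuming the armnn::INetwork graph-building API; connections and tensor infos are omitted for brevity and the layer name is illustrative.

```cpp
// Illustrative only: an addition expressed with the new ElementwiseBinaryLayer.
// BinaryOperation also covers Div, Maximum, Minimum, Mul and Sub.
#include <armnn/Descriptors.hpp>
#include <armnn/INetwork.hpp>

armnn::IConnectableLayer* AddAdditionLayer(armnn::INetwork& network)
{
    armnn::ElementwiseBinaryDescriptor desc;
    desc.m_Operation = armnn::BinaryOperation::Add;

    // Connect input slots 0 and 1 and the single output slot as for any other layer.
    return network.AddElementwiseBinaryLayer(desc, "add");
}
```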
Known Issues
- Intermittent issue with Dma Buf memory import on the GPU. This is fixed in Mali Driver r30p0.
- There might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation.
ABI/API Changes
The following front-end API changes have occurred during the implementation of 23.05 that users should be aware of before upgrading. Note: No ABI-breaking change occurred in ArmNN Core (libarmnn.so), so the major version has not changed; only the minor version has been bumped (32.0.0 → 32.1.0).
Feature | SHA | Gerrit Review | Resultant ABI/API changes |
---|---|---|---|
Implement Pimpl Idiom for Delegate Options | 1bae865 | https://review.mlplatform.org/c/ml/armnn/+/9358 | Size of class DelegateOptions has been changed from 488 bytes to 8 bytes. Layout of parameter's stack of several functions has been changed and therefore parameters at higher positions in the stack may be incorrectly initialized by applications. Size of class Delegate has been changed from 552 bytes to 72 bytes. Size of field m_Options has been changed from 488 bytes to 8 bytes. The fields or parameters of such data type may be incorrectly initialized or accessed by old client applications. |
Implement Pimpl Idiom for OptimizerOptions | c5ee0d7 | https://review.mlplatform.org/c/ml/armnn/+/9369 | The following functions have been changed to accept an OptimizerOptionsOpaque argument instead of the unstable OptimizerOptions. Note: OptimizerOptionsOpaque will accept an OptimizerOptions to its constructor, so this is not an API break, only an ABI break. DelegateOptions::DelegateOptions [C1] ( enum armnn::Compute computeDevice, struct armnn::OptimizerOptions const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C2] ( enum armnn::Compute computeDevice, struct armnn::OptimizerOptions const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C1] ( std::vector<armnn::BackendId> const& backends, struct armnn::OptimizerOptions const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C2] ( std::vector<armnn::BackendId> const& backends, struct armnn::OptimizerOptions const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::SetOptimizerOptions ( struct armnn::OptimizerOptions const& optimizerOptions ) Replacement functions: DelegateOptions::DelegateOptions [C1] ( enum armnn::Compute computeDevice, armnn::OptimizerOptionsOpaque const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C2] ( enum armnn::Compute computeDevice, armnn::OptimizerOptionsOpaque const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C1] ( std::vector<armnn::BackendId> const& backends, armnn::OptimizerOptionsOpaque const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::DelegateOptions [C2] ( std::vector<armnn::BackendId> const& backends, armnn::OptimizerOptionsOpaque const& optimizerOptions, armnn::Optional<armnn::LogSeverity> const& logSeverityLevel, armnn::Optional<std::function<void(arm::pipe::ProfilingGuid, unsigned int, armnn::ITensorHandle*)> > const& func ) DelegateOptions::SetOptimizerOptions ( armnn::OptimizerOptionsOpaque const& optimizerOptions ) |
Add constant version of IConnectableLayer::GetConstantTensorsByRef | aeec3ce | https://review.mlplatform.org/c/ml/armnn/+/9196 | For class IConnectableLayer, a pure virtual method GetConstantTensorsByRef() const has been added. Applications will not provide the implementation for this pure virtual method and will therefore cause a crash in the library when it tries to call this method. Note: The pure virtual function was added at the end, so it will not result in an ABI break. However, you should usually not add new virtual functions for any reason, even to leaf classes, if the class is intended to remain binary compatible on Windows. Doing so may reorder existing virtual functions and break binary compatibility. As we do not target support for Windows, we are not going to consider this an ABI break for ArmNN Core. |
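For illustration, the sketch below shows one way existing code might migrate to OptimizerOptionsOpaque, relying on the constructor noted above that accepts an OptimizerOptions. It assumes OptimizerOptionsOpaque is declared alongside OptimizerOptions in armnn/INetwork.hpp; this is a sketch, not code from the Arm NN repository.

```cpp
// Sketch under the assumption that OptimizerOptionsOpaque lives next to
// OptimizerOptions in armnn/INetwork.hpp.
#include <armnn/INetwork.hpp>

void ConfigureOptimizerOptions()
{
    // Existing code can keep populating the legacy struct...
    armnn::OptimizerOptions legacyOptions;
    legacyOptions.m_ReduceFp32ToFp16 = true;

    // ...and wrap it in the new opaque type, whose constructor accepts OptimizerOptions.
    // The wrapped object can then be passed to the updated call sites, e.g.
    // DelegateOptions::SetOptimizerOptions, per the replacement signatures above.
    armnn::OptimizerOptionsOpaque opaqueOptions(legacyOptions);
    (void)opaqueOptions;
}
```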
The following back-end API changes have occurred during the implementation of 23.05 that users should be aware of before upgrading.
Feature | SHA | Gerrit Review | Resultant ABI/API changes |
---|---|---|---|
Remove GetGraph and include of Graph.hpp header from public header | c1c5f2a | https://review.mlplatform.org/c/ml/armnn/+/9351 | Size of class OptimizationViews has been changed from 360 bytes to 88 bytes. Field m_Graph has been removed from this type. The fields or parameters of such data type may be incorrectly initialized or accessed by old client applications. |
TfLite Delegate
- The existing TfLite Delegate has now been renamed to "Classic" TfLite Delegate to accommodate the new Opaque TfLite Delegate. There has been a file restructure because of this.
- The Opaque TfLite Delegate provides the same operator coverage as the existing/classic TfLite Delegate. A list of these supported operators can be found in the TfLite Delegate section of the documentation. A rough usage sketch follows below.
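In the sketch below, the TfLite calls are the standard TfLite C API; the Arm NN factory names (armnnOpaqueDelegate::TfLiteArmnnOpaqueDelegateCreate/Delete), the DelegateOptions usage and the include path are assumptions modelled on the classic delegate's armnnDelegate::TfLiteArmnnDelegateCreate API, and should be checked against the headers shipped with this release.

```cpp
// Sketch only: the Arm NN factory names and header below are assumptions; the TfLite
// calls are the standard C API. Error checking omitted for brevity.
#include <tensorflow/lite/c/c_api.h>
#include <armnn/BackendId.hpp>   // for armnn::Compute
#include <armnn_delegate.hpp>    // assumed header exposing the delegate options/factory

int RunWithOpaqueDelegate(const char* modelPath)
{
    TfLiteModel* model = TfLiteModelCreateFromFile(modelPath);
    TfLiteInterpreterOptions* options = TfLiteInterpreterOptionsCreate();

    // Assumed factory call, mirroring the classic armnnDelegate::TfLiteArmnnDelegateCreate.
    armnnDelegate::DelegateOptions delegateOptions(armnn::Compute::CpuAcc);
    auto* opaqueDelegate = armnnOpaqueDelegate::TfLiteArmnnOpaqueDelegateCreate(delegateOptions);
    TfLiteInterpreterOptionsAddDelegate(options, opaqueDelegate);

    TfLiteInterpreter* interpreter = TfLiteInterpreterCreate(model, options);
    TfLiteInterpreterAllocateTensors(interpreter);
    TfLiteInterpreterInvoke(interpreter);

    TfLiteInterpreterDelete(interpreter);
    TfLiteInterpreterOptionsDelete(options);
    TfLiteModelDelete(model);
    armnnOpaqueDelegate::TfLiteArmnnOpaqueDelegateDelete(opaqueDelegate);  // assumed name
    return 0;
}
```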
New features
- Added support for CEIL, EXPAND_DIMS and SQUEEZE operators.
Bug Fixes
- Fixed layer support for Comparison, ElementwiseBinary and LogicalBinary operators by expanding the TensorShape before verifying support, if required.
- Fixed handling of a negative size value in the Slice operator.
- Calculate explicit padding for Transpose Convolution using output shape, if specified.
Build Dependencies
Tools | Supported Version |
---|---|
Git | 2.17.1 or later |
SCons | 2.4.1 (Ubuntu) and 2.5.1 (Debian) |
CMake | 3.5.1 (Ubuntu) and 3.7.2 (Debian) |
TensorFlow | 2.12.0 (SHA 6f692f73cb2043b4a0b0446539cd8c15b3dd9220) |
ONNX | 1.6.0 |
FlatBuffers | 2.0.6 |
Protobuf | 3.12.0 |
Android NDK | r25 |
mapbox/variant | 1.2.0 |
cxxopts | 3.1.1 (SHA eb787304d67ec22f7c3a184ee8b4c481d04357fd) |
doctest | 2.4.6 |
fmt | 7.0.1 |
ghc | 1.3.2 |
half | 1.12.0 |
stb | 2.16 |
xxd | 1.10 |