Draft

81 commits
64cd4f0
Conv2D Bias Adaptation
diaconuccalin Jul 29, 2025
6eb7422
Added PULPOpen support for Conv2D and partially working DW Conv2D. Fix…
diaconuccalin Aug 7, 2025
201e06a
DW 2D Float Conv for PULPOpen platform now working. Updated im2col bu…
diaconuccalin Sep 18, 2025
5892f05
Optimized the PULPOpen DW 2D fp32 Convolution and fixed the bias vers…
diaconuccalin Sep 19, 2025
46595f7
Updated float reshape with skip connection test to a smaller one
diaconuccalin Sep 19, 2025
4d2d235
Fixed generic platform alias_of bug
diaconuccalin Sep 22, 2025
72f809e
Fixed the PULPOpen FloatGemmTemplate (identical issue to the generic …
diaconuccalin Sep 22, 2025
cdcadb1
Working TinyViT Demo test. Added it to the CI pipeline. Added float s…
diaconuccalin Sep 22, 2025
a15c940
Added GEMM batched fix to MatMul template
diaconuccalin Sep 23, 2025
99489cd
Fixed formatting
diaconuccalin Sep 23, 2025
7c27cc8
Fixes to avoid warnings
diaconuccalin Sep 23, 2025
829d4c9
Fix formatting
diaconuccalin Sep 23, 2025
108ac02
Merge fix
diaconuccalin Sep 23, 2025
8085944
Dynamic buffer calculation fix. Other fixes
diaconuccalin Sep 24, 2025
fda36d1
Reformat
diaconuccalin Sep 24, 2025
8f95458
Added back CI tests removed by merge
diaconuccalin Sep 24, 2025
9506588
Updated changelog file
diaconuccalin Sep 24, 2025
accc43e
Applied fixes suggested in the PR review
diaconuccalin Sep 24, 2025
73b6a95
Post-merge fixes
diaconuccalin Oct 13, 2025
9523f7c
Quickfix
diaconuccalin Oct 13, 2025
a32ac63
PR fixes
diaconuccalin Oct 17, 2025
b2c4ae7
Addressed PR review. Minor fix for aliasing in reshape parser
diaconuccalin Oct 21, 2025
3d78267
Minor fix based on PR review
diaconuccalin Oct 21, 2025
eee3945
Post-rebase fixes
diaconuccalin Oct 27, 2025
4161de9
More post-rebase fixes
diaconuccalin Oct 27, 2025
48b1ef1
Quickfix
diaconuccalin Oct 27, 2025
6eb8aca
Fix aliases parameter issues
diaconuccalin Oct 27, 2025
ed81a92
Bring alias changes to PULP reshape
diaconuccalin Oct 27, 2025
0e7fe40
Alias quickfix
diaconuccalin Oct 27, 2025
7994c36
Fixes based on CodeRabbit PR Reviews
diaconuccalin Oct 27, 2025
533c761
Removed alias leftovers
diaconuccalin Oct 28, 2025
e4cafd1
Reduced code duplication in PULP Conv2D parsers
diaconuccalin Oct 28, 2025
d882ed5
Remove duplicate code in the PULPOpen reshape template
diaconuccalin Oct 29, 2025
104ac5e
Restrained n_cores usage to PULP
diaconuccalin Oct 30, 2025
240b6f4
Quickfix
diaconuccalin Oct 30, 2025
f0825df
Fix based on CodeRabbit review
diaconuccalin Oct 30, 2025
2bef17a
Make IntegerDataTypes a tuple
lukamac Oct 31, 2025
bfc3519
Change _mapNode to _selectEngine
lukamac Oct 31, 2025
1f79e15
Revert removal of PULPReshapeBindings
lukamac Oct 31, 2025
dec4383
Add n_cores to PULPClusterEngine and a hacky way to annotate the oper…
lukamac Oct 31, 2025
5cc8e3a
Remove n_cores annotation from context and parser
lukamac Oct 31, 2025
aa3d411
Trailing whitespace fixes
lukamac Oct 31, 2025
af017fc
Revert unnecessary changes
lukamac Oct 31, 2025
56b7406
Rename --n_cores flag to --cores to align with the preexisting siracu…
lukamac Oct 31, 2025
f36d58e
Use parsed args instead of an extra argument to testRunner
lukamac Nov 1, 2025
b6443cd
Revert removal of header guard and addition of full license in the he…
lukamac Nov 1, 2025
fe40743
Fix missing space between `--cores` argument and previous args
lukamac Nov 1, 2025
240c6d2
Conv2D Bias Adaptation
diaconuccalin Jul 29, 2025
98395c4
Added PULPOpen support for Conv2D and partially working DW Conv2D. Fix…
diaconuccalin Aug 7, 2025
dbf083b
DW 2D Float Conv for PULPOpen platform now working. Updated im2col bu…
diaconuccalin Sep 18, 2025
c8c470b
Optimized the PULPOpen DW 2D fp32 Convolution and fixed the bias vers…
diaconuccalin Sep 19, 2025
d2f4e4b
Fixed generic platform alias_of bug
diaconuccalin Sep 22, 2025
f00459e
Fixed the PULPOpen FloatGemmTemplate (identical issue to the generic …
diaconuccalin Sep 22, 2025
e2b7583
Fixed formatting
diaconuccalin Sep 23, 2025
18926b7
Fixes to avoid warnings
diaconuccalin Sep 23, 2025
1addcb0
Merge fix
diaconuccalin Sep 23, 2025
ad73844
Dynamic buffer calculation fix. Other fixes
diaconuccalin Sep 24, 2025
8c8698b
Applied fixes suggested in the PR review
diaconuccalin Sep 24, 2025
c916ad0
Post-merge fixes
diaconuccalin Oct 13, 2025
4e91c86
Included bias handling in tile constraint handling for conv 2D. Fix i…
diaconuccalin Sep 30, 2025
a1fd1dd
Fixed input cube computation for 2D conv tiling
diaconuccalin Sep 30, 2025
a06fa0e
Added tiling bias handling for regular Conv2D in PULPOpen
diaconuccalin Oct 2, 2025
9abee3e
Added core number information to the network container, for buffer al…
diaconuccalin Oct 3, 2025
5adf8cc
Added non-RQ FP DW Conv2D tiling-ready binding and constraints. Test …
diaconuccalin Oct 8, 2025
e8d18c9
Bias support for DW 2d Conv tiling. im2col buffer computation fix. co…
diaconuccalin Oct 8, 2025
845d46f
Added broadcasting handling for MatMul tile constraint. Fixed formatting
diaconuccalin Oct 9, 2025
97989e5
Added prototypes for PULPOpen Slice and ReduceMean tiling constraints
diaconuccalin Oct 9, 2025
ca692ca
Added geometrical constraint and serialization for the slice operation
diaconuccalin Oct 10, 2025
316228c
Added geometrical constraints and tiling serialization to the ReduceM…
diaconuccalin Oct 10, 2025
26f55df
Updated PULP GELU function to generic version (while keeping the para…
diaconuccalin Oct 10, 2025
876d45c
Fix core number value passing. Minor PULP GELU clean up
diaconuccalin Oct 11, 2025
4a1c8d1
Fixed PULP DW 2D Conv geometrical tile constraints
diaconuccalin Oct 11, 2025
c398c66
Fixes in DWConv2D and MatMul tile constraints
diaconuccalin Oct 11, 2025
ec21316
Fixed formatting
diaconuccalin Oct 11, 2025
c12c451
MatMul constraint fix
diaconuccalin Oct 12, 2025
d96ba8a
Fixed unique item check
diaconuccalin Oct 22, 2025
19a04dd
Format fix
diaconuccalin Oct 22, 2025
2cc6177
PR cleanup
diaconuccalin Nov 4, 2025
66a6c43
Quick change
diaconuccalin Nov 4, 2025
cd386b3
Quickfix
diaconuccalin Nov 4, 2025
a490245
Quickfix
diaconuccalin Nov 4, 2025
10 changes: 10 additions & 0 deletions .github/workflows/ci-platform-siracusa.yml
@@ -53,7 +53,15 @@ jobs:
testBacktracking
testFloatAdder
testFloatGEMM

testFloat2DConvolution
testFloat2DConvolutionBias
testFloat2DConvolutionZeroBias

testFloat2DDWConvolution
testFloat2DDWConvolutionBias
testFloat2DDWConvolutionZeroBias

testFloatLayerNorm
testFloatRelu
testFloatMaxPool
@@ -64,6 +72,7 @@
Quant
Dequant
testFloatReduceSum
testFloatReshapeWithSkipConnection
testFloatSoftmaxGrad
testFloatSoftmaxCrossEntropy
testFloatSoftmaxCrossEntropyGrad
@@ -87,4 +96,5 @@
CCT/CCT_1_16_16_8
CCT/CCT_2_32_32_128_Opset20
testTrainCCT/CCT1_Classifier_Training/CCT_1_16_16_8
testFloatDemoTinyViT
num-cores: 8
20 changes: 16 additions & 4 deletions CHANGELOG.md
@@ -4,6 +4,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid
## Unreleased (Planned Release Target: v0.2.1)

### List of Pull Requests
- TinyViT on non-tiled Siracusa [#117](https://github.com/pulp-platform/Deeploy/pull/117)
- Support Fully Asynchronous DMAs [#114](https://github.com/pulp-platform/Deeploy/pull/114)
- Disallow shape inference [#128](https://github.com/pulp-platform/Deeploy/pull/128)
- Remove memory-aware node bindings [#123](https://github.com/pulp-platform/Deeploy/pull/123)
@@ -24,6 +25,13 @@ This file contains the changelog for the Deeploy project. The changelog is divid
- Fix bias hoisting in generic GEMM with no bias [#126](https://github.com/pulp-platform/Deeploy/pull/126)

### Added
- PULP 2D FP DW conv Im2Col template and kernel, with bias support.
- Bias support for PULP 2D FP regular conv Im2Col in template & kernel.
- PULP FP DW conv 2D parser.
- FP conv 2D (simple & DW), reshape & skip connection, and TinyViT demo tests to the non-tiled Siracusa CI pipeline.
- FP bindings and mappings for PULP slice, DW conv 2D, and reduce mean operations.
- FP PULP DW conv lowering optimization pass, similar to the existing one for the integer version.
- RemoveEmptyConvBiasPass to the PULP optimizer.
- Add manual type inference feature (CLI: `--input-type-map`/`--input-offset-map`) to resolve ambiguities when test inputs are not representative enough
- Added a `testTypeInferenceDifferentTypes` test case to validate type inference for different input types
- Added `_mangleNodeNames` function to avoid duplicate node mappings
@@ -60,6 +68,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid
- Added new waiting-strategy logic with fine-grained `PerTensorWaitingStrategy`

### Changed
- Reduced size of reshape & skip connection test, for non-tiled Siracusa memory compatibility.
- Replaced platform-specific tags (`*-amd64`, `*-arm64`) with direct digest references in `Noelware/docker-manifest-action`.
- mchan HAL is now reduced to bare-bones
- refactor of the IntrospectiveCodeTransformation to work on the Mako template
@@ -97,6 +106,9 @@ This file contains the changelog for the Deeploy project. The changelog is divid
- Refactored DMA code generation (`SnitchDma`, `Mchan`) to correctly overlap transfers and compute in double-buffering mode

### Fixed
- Fixed bug for non-batched elements in the PULPOpen FP GEMM and matmul templates.
- Added underscore to the beginning of closure names to avoid naming issues when they start with unsupported first characters (like numbers).
- Data types in the PULPOpen FP add and mul templates.
- Prevent node duplication for graphs generated via GraphSurgeon
- Resolved issue with missing `id` in the `Build Cache for Docker` step, used in the `Inject build-cache` step.
- Fix license CI check and prevent potential issues with `jq` installation
@@ -185,9 +197,9 @@ This release containing major architectural changes, new platform support, enhan


### Added
- BatchNorm kernel
- ConvTranspose kernel
- MaxPool1D kernel
- BatchNorm kernel
- ConvTranspose kernel
- MaxPool1D kernel
- Template for 1D Convolution
- Support for float32 data type in the previous kernels
- Float binding for Pad1D kernel
@@ -326,7 +338,7 @@

### Changed
- FloatConvTemplate file
- Platform.py file
- Platform.py file
- Bump the CMake version to 3.24 as required for the chimera-sdk
- Bump GVSoC's version and add chimera simulation target
- Rename the generic source util to utils to avoid name collision with chimera-sdk
@@ -155,7 +155,8 @@ def apply(self,
executionBlock: ExecutionBlock,
name: str,
verbose: CodeGenVerbosity = _NoVerbosity) -> Tuple[NetworkContext, ExecutionBlock]:
self.closureName = name + self.closureSuffix
# Prepend underscore to avoid name issues when beginning with problematic characters (like numbers)
self.closureName = "_" + name + self.closureSuffix
self.functionCall = executionBlock.generate(ctxt)
self._generateClosureStruct(ctxt, executionBlock)
ctxt = self._generateClosureCtxt(ctxt, name)
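The underscore prefix in the closure diff above matters because C identifiers may not begin with a digit, while ONNX node names can. A minimal sketch of the idea (the helper name is illustrative, not Deeploy's API):

```python
def mangleClosureName(name: str, suffix: str) -> str:
    # ONNX node names may start with a digit (e.g. "0_conv"), which is
    # not a legal first character for a C identifier; a leading "_" is.
    return "_" + name + suffix

# "0_conv" alone is not a valid identifier; the mangled name is.
assert not "0_conv".isidentifier()
assert mangleClosureName("0_conv", "_closure").isidentifier()
```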
10 changes: 5 additions & 5 deletions Deeploy/CommonExtensions/DataTypes.py
@@ -87,11 +87,11 @@ class float64_t(FloatImmediate):

SignedIntegerDataTypes: Tuple[Type[IntegerImmediate], ...] = (int8_t, int16_t, int32_t, int64_t)
UnsignedIntegerDataTypes: Tuple[Type[IntegerImmediate], ...] = (uint8_t, uint16_t, uint32_t, uint64_t)
IntegerDataTypes: Tuple[Type[IntegerImmediate], ...] = (sorted((
*SignedIntegerDataTypes,
*UnsignedIntegerDataTypes,
),
key = lambda _type: _type.typeWidth))
IntegerDataTypes: Tuple[Type[IntegerImmediate], ...] = tuple(
sorted((
*SignedIntegerDataTypes,
*UnsignedIntegerDataTypes,
), key = lambda _type: _type.typeWidth))
FloatDataTypes: Tuple[Type[FloatImmediate], ...] = (bfloat16_t, float16_t, float32_t, float64_t)


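The `IntegerDataTypes` fix above corrects a subtle Python pitfall: wrapping `sorted(...)` in plain parentheses does not produce a tuple, because `sorted` returns a list and the parentheses are mere grouping. A toy illustration (type widths stand in for the real type classes):

```python
signed = (8, 16, 32, 64)
unsigned = (8, 16, 32, 64)

# Plain parentheses are only grouping -- this is still a list:
merged = (sorted((*signed, *unsigned)))
assert isinstance(merged, list)

# tuple(...) yields the immutable tuple the type annotation promises:
merged = tuple(sorted((*signed, *unsigned)))
assert isinstance(merged, tuple)
```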
@@ -2,11 +2,9 @@
#
# SPDX-License-Identifier: Apache-2.0

from typing import Any, Union

import onnx_graphsurgeon as gs

from Deeploy.DeeployTypes import CodeGenVerbosity, NetworkContext, NetworkDeployer, ONNXLayer, _NoVerbosity
from Deeploy.DeeployTypes import CodeGenVerbosity, DeploymentEngine, NetworkContext, NetworkDeployer, _NoVerbosity


class NetworkDeployerWrapper(NetworkDeployer):
@@ -68,8 +66,8 @@ def generateBufferAllocationCode(self) -> str:
return self._innerObject.generateBufferAllocationCode()

# MultiEngineDeployer augment
def _mapNode(self, node: gs.Node) -> Union[ONNXLayer, Any]:
return self._innerObject._mapNode(node)
def _selectEngine(self, node: gs.Node) -> DeploymentEngine:
return self._innerObject._selectEngine(node)

def _printMemorySummary(self):
return self._innerObject._printMemorySummary()
11 changes: 6 additions & 5 deletions Deeploy/DeeployTypes.py
@@ -325,15 +325,15 @@ def fromNode(cls, node: gs.Node):
return (cls(name = node.name, shape = node.shape if not isinstance(node, gs.Constant) else node.values.shape))

def has_live_aliases(self, ctxt: NetworkContext) -> bool:
"""Checks whether this VariableBuffer has any live ancestors, i.e. buffers that are still live and are aliased by this buffer.
"""Checks whether this VariableBuffer has any live aliases, i.e. buffers that are still live and are aliased by this buffer.
Parameters
----------
ctxt : NetworkContext
Current NetworkContext
Returns
-------
bool
True if this VariableBuffer has any live ancestors, False otherwise
True if this VariableBuffer has any live aliases, False otherwise
"""
# Do a breadth-first search across the aliasing double-linked list
live = self._live
@@ -2562,10 +2562,10 @@ def codeTransform(self, verbose: CodeGenVerbosity = _NoVerbosity):
self.ctxt = layer.codeTransform(self.ctxt, verbose)
self.transformed = True

def _mapNode(self, node: gs.Node) -> Union[ONNXLayer, Any]:
def _selectEngine(self, node: gs.Node) -> DeploymentEngine:
for engine in self.Platform.engines:
if node.op in engine.Mapping:
return engine.Mapping[node.op](node)
return engine
raise RuntimeError(f"No mapping found for node {node.name} with op type {node.op}")

def _bindLayers(self):
@@ -2582,7 +2582,8 @@ def _bindLayers(self):
flatSchedule += subGraph

for node in flatSchedule:
layer = self._mapNode(node)
engine = self._selectEngine(node)
layer = engine.Mapping[node.op](node)
if isinstance(layer, ONNXLayer):
log.debug(f" {SUCCESS_MARK} Bind {node.name} to layer {layer.__class__.__name__}")
self.layerBinding[layer.node.name] = layer
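The rename from `_mapNode` to `_selectEngine` splits engine selection from layer construction: the deployer first picks the engine whose mapping supports the op, then instantiates the layer from that engine's `Mapping`. A self-contained sketch of the pattern (toy classes, not Deeploy's real ones):

```python
class Engine:
    def __init__(self, name: str, mapping: dict):
        self.name = name
        self.Mapping = mapping  # op type -> layer factory

def selectEngine(engines: list, op: str) -> Engine:
    # Return the first engine that knows how to map this op.
    for engine in engines:
        if op in engine.Mapping:
            return engine
    raise RuntimeError(f"No mapping found for op {op}")

engines = [
    Engine("generic", {"Add": lambda node: ("AddLayer", node)}),
    Engine("cluster", {"Conv": lambda node: ("ConvLayer", node)}),
]

# Selection and construction are now two separate steps, so callers
# can inspect the chosen engine (e.g. its core count) before binding:
engine = selectEngine(engines, "Conv")
layer = engine.Mapping["Conv"]("conv0")
```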
@@ -2,13 +2,13 @@
#
# SPDX-License-Identifier: Apache-2.0

from typing import Any, Callable, Dict, Type, Union
from typing import Callable, Dict, Type

import onnx_graphsurgeon as gs

from Deeploy.AbstractDataTypes import Pointer
from Deeploy.CommonExtensions.NetworkDeployers.NetworkDeployerWrapper import NetworkDeployerWrapper
from Deeploy.DeeployTypes import DeploymentPlatform, NetworkDeployer, ONNXLayer, Schedule, TopologyOptimizer
from Deeploy.DeeployTypes import DeploymentEngine, DeploymentPlatform, NetworkDeployer, Schedule, TopologyOptimizer
from Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses import \
EngineColoringPass, EngineMapper

@@ -48,14 +48,14 @@ def lower(self, graph: gs.Graph) -> gs.Graph:
) == 0, f"Missing engine color for nodes {[node.name for node in uncoloredNodes]} with operations {uncoloredOperations}"
return graph

def _mapNode(self, node: gs.Node) -> Union[ONNXLayer, Any]:
def _selectEngine(self, node: gs.Node) -> DeploymentEngine:
assert "engine" in node.attrs, f"Node {node.name} doesn't have an engine color."
engineName = node.attrs["engine"]
assert isinstance(engineName, str) and engineName in self.engineDict, \
f"Node {node.name} has an invalid engine {engineName} assigned."
engine = self.engineDict[engineName]
assert node.op in engine.Mapping, f"No mapping found for {node.op} in engine {engine.name}"
return engine.Mapping[node.op](node)
return engine


class EngineColoringDeployerWrapper(EngineColoringDeployer, NetworkDeployerWrapper):
@@ -8,6 +8,7 @@


class _FloatReduceMeanTemplate(NodeTemplate):
# WARNING: Currently only supports single axis reducing!

def __init__(self, templateStr):
super().__init__(templateStr)
1 change: 1 addition & 0 deletions Deeploy/Targets/Generic/Templates/ReduceMeanTemplate.py
@@ -8,6 +8,7 @@


class _ReduceMeanTemplate(NodeTemplate):
# WARNING: Currently only supports single axis reducing!

def __init__(self, templateStr):
super().__init__(templateStr)
1 change: 1 addition & 0 deletions Deeploy/Targets/Generic/Templates/SliceTemplate.py
@@ -10,6 +10,7 @@


class _SliceTemplate(NodeTemplate):
# WARNING: Currently only supports single axis slicing!

def __init__(self, templateStr):
super().__init__(templateStr)
51 changes: 36 additions & 15 deletions Deeploy/Targets/PULPOpen/Bindings.py
@@ -9,13 +9,13 @@
from Deeploy.CommonExtensions.CodeTransformationPasses.Closure import ClosureGeneration, MemoryAwareClosureGeneration
from Deeploy.CommonExtensions.CodeTransformationPasses.MemoryAllocation import ArgumentStructGeneration, \
MemoryManagementGeneration, MemoryPassthroughGeneration
from Deeploy.CommonExtensions.DataTypes import IntegerDataTypes, SignedIntegerDataTypes, float32_t, int8_t, int32_t, \
uint8_t
from Deeploy.CommonExtensions.DataTypes import FloatDataTypes, IntegerDataTypes, SignedIntegerDataTypes, float32_t, \
int8_t, int32_t, int64_t, uint8_t
from Deeploy.DeeployTypes import CodeTransformation, NodeBinding, NodeTemplate
from Deeploy.FutureExtension.Bindings.AutoFutureBinding import AutoFutureBinding
from Deeploy.FutureExtension.CodeTransformationPasses.FutureCodeTransformation import FutureGeneration
from Deeploy.Targets.Generic.Templates import AddTemplate, ConcatTemplate, DequantTemplate, FloatReduceSumTemplate, \
GatherTemplate, QuantTemplate, RQSiGELUTemplate, iHardswishTemplate
from Deeploy.Targets.Generic.Templates import AddTemplate, ConcatTemplate, DequantTemplate, FloatReduceMeanTemplate, \
FloatReduceSumTemplate, GatherTemplate, QuantTemplate, RQSiGELUTemplate, SliceTemplate, iHardswishTemplate
from Deeploy.Targets.Generic.TypeCheckers import AddChecker, ConcatChecker, ConvChecker, DequantChecker, \
GatherChecker, GELUChecker, GEMMChecker, HardswishChecker, LayerNormChecker, MatMulChecker, MulChecker, \
QuantChecker, ReduceMeanChecker, ReluChecker, ReshapeChecker, RQAddChecker, RQHardswishChecker, SGDChecker, \
@@ -27,11 +27,11 @@
from Deeploy.Targets.PULPOpen.DataTypes import PULPDMAFuture
from Deeploy.Targets.PULPOpen.DMA.L3Dma import l3DmaHack
from Deeploy.Targets.PULPOpen.DMA.MchanDma import MchanDma
from Deeploy.Targets.PULPOpen.Templates import ConvTemplate, FloatAddTemplate, FloatConvTemplate, FloatGELUTemplate, \
FloatGemmTemplate, FloatLayernormTemplate, FloatMatMulTemplate, FloatMaxPoolTemplate, FloatMulTemplate, \
FloatReluTemplate, FloatSoftmaxTemplate, GEMMTemplate, MatrixVectorTemplate, MaxPool2DTemplate, MulTemplate, \
ReduceMeanTemplate, RequantShiftTemplate, ReshapeTemplate, RQAddTemplate, RQSiHardswishTemplate, SGDTemplate, \
SliceTemplate, SoftmaxCrossEntropyLossTemplate, TallGEMMTemplate, TransposeTemplate, UniformRequantShiftTemplate, \
from Deeploy.Targets.PULPOpen.Templates import ConvTemplate, DMASliceTemplate, FloatAddTemplate, FloatConvTemplate, \
FloatGELUTemplate, FloatGemmTemplate, FloatLayernormTemplate, FloatMatMulTemplate, FloatMaxPoolTemplate, \
FloatMulTemplate, FloatReluTemplate, FloatSoftmaxTemplate, GEMMTemplate, MatrixVectorTemplate, MaxPool2DTemplate, \
MulTemplate, ReduceMeanTemplate, RequantShiftTemplate, ReshapeTemplate, RQAddTemplate, RQSiHardswishTemplate, \
SGDTemplate, SoftmaxCrossEntropyLossTemplate, TallGEMMTemplate, TransposeTemplate, UniformRequantShiftTemplate, \
iRMSNormTemplate, iSoftmaxTemplate
from Deeploy.Targets.PULPOpen.TypeCheckers import PULPConvChecker, PULPLinearChecker, PULPMaxPoolChecker, \
PULPRequantShiftChecker
@@ -148,16 +148,24 @@
PointerClass(uint8_t),
PointerClass(uint8_t),
PointerClass(uint8_t)
], [PULPDMAFuture(underlyingType = type)]), SliceTemplate.referenceTemplate, MemoryAwareForkTransformer)
], [PULPDMAFuture(underlyingType = type)]), DMASliceTemplate.referenceTemplate, MemoryAwareForkTransformer)
for type in IntegerDataTypes
]

PULPSliceBindings = [
NodeBinding(
SliceChecker([
PointerClass(type),
PointerClass(uint8_t),
PointerClass(uint8_t),
PointerClass(uint8_t),
PointerClass(uint8_t)
], [PointerClass(type)]), SliceTemplate.referenceTemplate, ForkTransformer) for type in FloatDataTypes
]

PULPReshapeBindings = [
NodeBinding(ReshapeChecker([PointerClass(type), PointerClass(int32_t)], [PointerClass(type)]),
ReshapeTemplate.referenceTemplate, SkipTransformer) for type in IntegerDataTypes
] + [
NodeBinding(ReshapeChecker([PointerClass(float32_t), PointerClass(type)], [PointerClass(float32_t)]),
ReshapeTemplate.referenceTemplate, SkipTransformer) for type in IntegerDataTypes
NodeBinding(ReshapeChecker([PointerClass(type), PointerClass(int64_t)], [PointerClass(type)]),
ReshapeTemplate.referenceTemplate, SkipTransformer) for type in IntegerDataTypes + FloatDataTypes
]

PULPRQAddBindings = [
@@ -225,6 +233,14 @@
ForkTransformer)
]

PULPFloatDWConv2DBindings = [
NodeBinding(
ConvChecker(
[PointerClass(float_type), PointerClass(float_type),
PointerClass(float_type)], [PointerClass(float_type)]), FloatConvTemplate.referenceDW2DIm2ColTemplate,
ForkTransformer) for float_type in FloatDataTypes
]

PULPRQSMatrixVecBindings = [
NodeBinding(
PULPLinearChecker([PointerClass(type1),
@@ -276,6 +292,11 @@
PULPReduceMeanBindings = [
NodeBinding(ReduceMeanChecker([PointerClass(type)], [PointerClass(type)]), ReduceMeanTemplate.referenceTemplate,
ClusterTransformer) for type in IntegerDataTypes
] + [
NodeBinding(ReduceMeanChecker([PointerClass(float_type), PointerClass(integer_type)], [PointerClass(float_type)]),
FloatReduceMeanTemplate.referenceTemplate, ClusterTransformer)
for integer_type in SignedIntegerDataTypes
for float_type in FloatDataTypes
]

PULPReduceSumBindings = [
24 changes: 19 additions & 5 deletions Deeploy/Targets/PULPOpen/Deployer.py
@@ -15,6 +15,7 @@
from Deeploy.DeeployTypes import ConstantBuffer, DeploymentPlatform, NodeTemplate, TopologyOptimizer, VariableBuffer
from Deeploy.Targets.Generic.TopologyOptimizationPasses.Passes import ReshapeConstOptPass, TransposeConstOptPass, \
TransposeMergePass, TransposeNoPermOptPass, TransposeSplitPass
from Deeploy.Targets.PULPOpen.Platform import PULPClusterEngine
from Deeploy.Targets.PULPOpen.TopologyOptimizationPasses.Passes import RQAddTransposeSquashPass

_L3AllocTemplate = NodeTemplate("""
@@ -63,19 +64,32 @@ def __init__(self,

self.extNameCount = 0

def bind(self):
def annotateNCores(self) -> None:
for layer in self.layerBinding.values():
node = layer.node
engine = self._selectEngine(node)
opRepr = layer.mapper.parser.operatorRepresentation
if isinstance(engine, PULPClusterEngine):
opRepr["n_cores"] = engine.n_cores

def bind(self) -> bool:
# SCHEREMO: THIS IS A STOP GAP SOLUTION. DONT REUSE. I MEAN IT. I WILL FIND YOU.
# SCHEREMO: The BindingOptimizationPass system is fairly fragile;
# it was designed this way because implementing further topology optimizations after
# parsing is very involved. If there are further use-cases, we should consider making this effort,
# but if there is only very few cases, this solution is okay.
autoTransposePass = AutoTransposeMergePass()
#self.ctxt, self.layerBinding = autoTransposePass.apply(self.ctxt, self.graph, self.layerBinding)

# LMACAN: THIS IS A STOP GAP SOLUTION. DONT REUSE. I MEAN IT. I WILL FIND YOU.
self.annotateNCores()

# SCHEREMO: THIS IS A STOP GAP SOLUTION. DONT REUSE. I MEAN IT. I WILL FIND YOU.
ret = super().bind()
if ret:
self.ctxt.hoistGlobalDefinition("cluster_dev", "extern struct pi_device cluster_dev;")
return ret
if not super().bind():
return False

self.ctxt.hoistGlobalDefinition("cluster_dev", "extern struct pi_device cluster_dev;")
return True

def _l3ConstBuffer(self) -> List[VariableBuffer]:
return [
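The new `annotateNCores` step in the Deployer diff uses the selected engine to stamp the core count into each operator representation before binding, so templates can size per-core buffers. A hypothetical mini-version of the same pattern (class and helper names are illustrative simplifications):

```python
class PULPClusterEngine:
    def __init__(self, n_cores: int):
        self.n_cores = n_cores

def annotate_n_cores(layers, select_engine) -> None:
    # Copy the engine's core count into each operator representation,
    # but only for layers mapped onto the cluster engine.
    for layer in layers:
        engine = select_engine(layer)
        if isinstance(engine, PULPClusterEngine):
            layer["opRepr"]["n_cores"] = engine.n_cores

layers = [{"op": "Conv", "opRepr": {}}]
annotate_n_cores(layers, lambda layer: PULPClusterEngine(n_cores = 8))
```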