Commit 5fb69ea
[NPU] Model marshalling without weights copies (#31939)
### Details:
* Extends the [OV StreamSerializer & XmlSerializer](#31639) to allow
passing an `ov::Model` through the driver without copying its weights
into a separate buffer.
* The purpose of this PR is to reduce memory consumption by avoiding
weight duplication.
* This feature will be disabled by default for the CiD interface (at
least for a while). The changes are first meant to be integrated in the
upcoming CiP interface.
* The implementation follows and adapts the sample provided in
[this PR](#31969).
* Two config options are introduced:
`intel_npu::use_base_model_serializer`, a switch between the old and new
serialization algorithms, and
`intel_npu::serialization_weights_size_threshold`, which controls which
weights are copied into a separate buffer and which ones have only
metadata (memory location & size) stored as runtime information. More
concretely, weights smaller than this value are copied.
* `vcl_serializer.hpp` is meant to contain all operations required to
prepare an `ov::Model` as input for the VCL interface. This covers
choosing between the new and old model serializer, as well as I/O and
config serialization. `xml_serializer.hpp` is a more generic, weightless
implementation of the OV serializer: it makes no weights copies unless
`serialization_weights_size_threshold` is used.
* Roughly how this works:
  * The plugin passes through all `ov::Constant` nodes and places weights
  metadata (`intel_npu::WeightsPointerAttribute`) as runtime information
  on the nodes whose buffers are not smaller than
  `serialization_weights_size_threshold` (weights below the threshold
  are copied, as stated in the description of the config option).
  * The new `intel_npu::StreamSerialize` is then called, which uses
  `intel_npu::XmlSerializer` to serialize the model. Note that
  `StreamSerialize` uses a slightly different format within the buffer
  (metadata containing offsets & sizes, custom data, weights & the XML
  graph); see `ov::pass::StreamSerialize` for details.
  * `intel_npu::XmlSerializer` does not write weights into its dedicated
  buffer if the `WeightsPointerAttribute` is found on the current
  `ov::Constant` node. Instead, the weights metadata is written as
  runtime information by calling the visit method corresponding to the
  attribute.
  * The deserializer can distinguish between the two cases (weights
  copied vs. weights stored as metadata) by looking for this attribute
  in the serialized buffer.
* See the ticket for some performance reports.
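
The flow described in the list above can be illustrated with a minimal, self-contained sketch. It does not use the OpenVINO API: `ConstantView`, `WeightsPointer`, `SerializedConstant`, `SerializedModel`, `serialize` and `resolve` are all hypothetical names invented for illustration, and the real `intel_npu::XmlSerializer` writes an XML graph rather than a vector of structs. The only behavior taken from the description is the rule that weights smaller than the threshold are copied, while the rest are represented by metadata (memory location & size) only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an ov::Constant: a view over weights owned elsewhere.
struct ConstantView {
    const std::uint8_t* data;
    std::size_t size;
};

// Hypothetical stand-in for the weights metadata (memory location & size).
struct WeightsPointer {
    const std::uint8_t* ptr;
    std::size_t size;
};

// One serialized entry: either an offset into the copied-weights buffer
// or pointer metadata referring to the original buffer.
struct SerializedConstant {
    bool has_pointer;        // true: only metadata was stored, no copy made
    WeightsPointer pointer;  // meaningful when has_pointer is true
    std::size_t offset;      // meaningful when has_pointer is false
    std::size_t size;
};

struct SerializedModel {
    std::vector<SerializedConstant> constants;
    std::vector<std::uint8_t> weights;  // holds only the copied (small) weights
};

// Writer side: copy weights below the threshold, store metadata for the rest.
SerializedModel serialize(const std::vector<ConstantView>& constants, std::size_t threshold) {
    SerializedModel out;
    for (const auto& c : constants) {
        SerializedConstant entry{};
        entry.size = c.size;
        if (c.size < threshold) {
            entry.has_pointer = false;
            entry.offset = out.weights.size();
            out.weights.insert(out.weights.end(), c.data, c.data + c.size);
        } else {
            entry.has_pointer = true;
            entry.pointer = {c.data, c.size};
        }
        out.constants.push_back(entry);
    }
    return out;
}

// Reader side: the presence of pointer metadata tells the two cases apart.
const std::uint8_t* resolve(const SerializedModel& model, std::size_t index) {
    const SerializedConstant& e = model.constants[index];
    return e.has_pointer ? e.pointer.ptr : model.weights.data() + e.offset;
}
```

In this sketch, a constant that is at least `threshold` bytes long resolves to the very same address it was serialized from, so the original buffer must outlive the serialized model. That lifetime requirement is an assumption of the sketch, not a statement about the plugin.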
### Related PRs:
* [Sample for extending the serialization algorithm](#31969)
* [The PR that made the OV serializer extensible](#31639)
### Tickets:
- *CVS-173711*

1 parent 9b3e405 commit 5fb69ea

File tree: 23 files changed, +724 -290 lines changed
- src
- core/xml_util
- plugins/intel_npu
- src
- al/include/intel_npu
- config
- common/include/intel_npu/common
- compiler_adapter
- include
- src
- plugin/src
- tests/functional
- behavior
- npu_driver_compiler_adapter
- ov_infer_request
- internal/compiler_adapter