[P3] Width Mismatch for Heterogeneous Submodule Interface Arrays During PyMTL Verilog Translation #289

This issue reports a potential limitation or bug in PyMTL's Verilog translation of a heterogeneous array of CGRA instances. The generated Verilog appears to collapse the per-instance interface array widths into a single uniform width, which causes out-of-range indexing for the larger instances.

### Reproduction Branch

[VectorCGRA at issue287-pymtl-bitwidth-mismatch](https://github.com/tancheng/VectorCGRA/tree/issue287-pymtl-bitwidth-mismatch)

### Reproduction Steps

```shell
cd /path/to/VectorCGRA
# upstream: https://github.com/tancheng/VectorCGRA
git fetch upstream
git checkout -b issue287-pymtl-bitwidth-mismatch upstream/issue287-pymtl-bitwidth-mismatch
mkdir build && cd build
pytest ../multi_cgra/test/MeshMultiCgraTemplateRTL_test.py::test_mesh_multi_hetero_cgra -vs --tb=long --test-verilog --dump-vtb --dump-vcd > ../hetero_cgra.log
```

After running:

- the `build/` directory contains the generated Verilog, such as `MeshMultiCgraTemplateRTL__<hash_id>__pickled.v`
- `hetero_cgra.log` (written one level above `build/`) contains the detailed failure

### Target Architecture (Heterogeneous Multi-CGRA)

The test architecture is a `2x2` multi-CGRA:

- CGRA0: `2x2`
- CGRA1: `3x3`
- CGRA2: `2x2`
- CGRA3: `2x2`

https://github.com/tancheng/VectorCGRA/blob/46c1f00d3c06dc534e16fb29e9155702372ae1cc/multi_cgra/test/arch_multi_hetero_cgra_override.yaml#L3-L20

### Expected Behavior

For each CGRA instance, boundary interfaces should match that instance’s tile shape:
https://github.com/tancheng/VectorCGRA/blob/46c1f00d3c06dc534e16fb29e9155702372ae1cc/cgra/CgraTemplateRTL.py#L131-L139

So in this architecture:
- CGRA0/2/3 boundary width should be `2`
- CGRA1 boundary width should be `3`
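
The expected per-CGRA shapes follow from applying the override to the defaults. The sketch below is only an illustration: it mirrors the YAML values as plain dicts, the helper name `expected_shapes` is hypothetical, and the row-major CGRA numbering (`id = cgra_y * columns + cgra_x`) is an assumption that happens to match the CGRA0..CGRA3 list above.

```python
# Illustrative only: values copied from the test YAML, not read from the file.
multi_cgra_defaults = {"rows": 2, "columns": 2}
cgra_defaults = {"rows": 2, "columns": 2}
cgra_overrides = [{"cgra_x": 1, "cgra_y": 0, "rows": 3, "columns": 3}]


def expected_shapes():
    """Return {cgra_id: (rows, columns)} after applying the overrides."""
    num_cols = multi_cgra_defaults["columns"]
    shapes = {}
    for y in range(multi_cgra_defaults["rows"]):
        for x in range(num_cols):
            # Assumed row-major numbering of CGRAs in the mesh.
            shapes[y * num_cols + x] = (cgra_defaults["rows"],
                                        cgra_defaults["columns"])
    for override in cgra_overrides:
        cgra_id = override["cgra_y"] * num_cols + override["cgra_x"]
        shapes[cgra_id] = (override["rows"], override["columns"])
    return shapes


print(expected_shapes())
```

Under these assumptions only CGRA1 ends up with a `3x3` shape, so only its boundary arrays should have three entries.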


### Observed Behavior

However, during translation, the generated Verilog appears to use a uniform width of 2 for these boundary arrays across **all CGRAs**, including CGRA1.

For example, at line 34233 of `MeshMultiCgraTemplateRTL__<hash_id>.v`:

```verilog
  logic [0:0] cgra__recv_data_on_boundary_south__val [0:3][0:1];
```

- `[0:3]` indexes 4 CGRAs
- inner `[0:1]` gives width 2 for each
- but CGRA1 requires width 3 (`[0:2]`)

This leads to out-of-range accesses, e.g. in `hetero_cgra.log`:

```shell
Selection index out of range: 2
E           outside 1:0
E                         : ... In instance MeshMultiCgraTemplateRTL___05Ff665d7f96c8c724c
E           15206 |   assign cgra__recv_data_on_boundary_south__val[1][2] = 1'd0;
E                 |                                                   ^
```
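
The failing index can be derived directly from the numbers above. This is a minimal sketch with a hypothetical helper (not part of PyMTL or VectorCGRA) that lists every `(cgra, index)` pair a per-instance array needs but a uniformly declared inner dimension does not provide:

```python
def out_of_range_indices(per_instance_lengths, declared_length):
    """List (instance, index) pairs that exceed the declared inner dimension."""
    return [
        (instance, index)
        for instance, length in enumerate(per_instance_lengths)
        for index in range(declared_length, length)
    ]


# Boundary lengths for CGRA0..CGRA3, and the inner dimension `[0:1]` (length 2)
# seen in the generated declaration.
print(out_of_range_indices([2, 3, 2, 2], declared_length=2))  # [(1, 2)]
```

The single offending pair corresponds to the `[1][2]` selection reported in the log.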



### Current Workaround/Solution

A practical workaround is to pad every CGRA's boundary interface arrays to the maximum CGRA dimensions (`max_tile_rows`, `max_tile_cols`) and ground the unused ports.

This works functionally, but wastes wires/ports on the smaller CGRAs.
I think the current solution is acceptable, since modifying PyMTL's Verilog translation logic would probably take too much effort. We should keep this case in mind when instantiating heterogeneous multi-CGRAs later.
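
As a rough sketch of the padding idea (plain Python, no PyMTL; `pad_plan` and the dictionary keys are hypothetical names, not code from the repository), the padded length and the indices to ground for each CGRA could be computed like this:

```python
def pad_plan(shapes):
    """For each CGRA, give the padded boundary lengths and the unused indices.

    `shapes` maps cgra_id -> (rows, columns). North/south boundaries are
    sized by columns and west/east boundaries by rows.
    """
    max_tile_rows = max(rows for rows, _ in shapes.values())
    max_tile_cols = max(cols for _, cols in shapes.values())
    plan = {}
    for cgra_id, (rows, cols) in shapes.items():
        plan[cgra_id] = {
            "north_south_len": max_tile_cols,
            "west_east_len": max_tile_rows,
            # Indices beyond the real tile shape carry no data and get grounded.
            "grounded_north_south": list(range(cols, max_tile_cols)),
            "grounded_west_east": list(range(rows, max_tile_rows)),
        }
    return plan


shapes = {0: (2, 2), 1: (3, 3), 2: (2, 2), 3: (2, 2)}
for cgra_id, entry in pad_plan(shapes).items():
    print(cgra_id, entry)
```

With the shapes from this issue, every boundary array gets length 3; the `2x2` CGRAs ground index 2 on each side, and the `3x3` CGRA grounds nothing.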

Configuration excerpt linked under Target Architecture (`arch_multi_hetero_cgra_override.yaml`):

	multi_cgra_defaults:
	  rows: 2
	  columns: 2

	cgra_defaults:
	  rows: 2
	  columns: 2
	  configMemSize: 16

	tile_defaults:
	  num_registers: 16
	  fu_types: ["add", "mul", "div", "fadd", "fmul", "fdiv", "logic", "cmp", "sel", "type_conv", "vfmul", "fadd_fadd", "fmul_fadd", "grant", "loop_control", "phi", "constant", "mem", "return", "mem_indexed", "alloca", "shift"]

	cgra_overrides:
	  - cgra_x: 1
	    cgra_y: 0
	    rows: 3
	    columns: 3

Interface declaration excerpt linked under Expected Behavior (`CgraTemplateRTL.py`):

	if is_multi_cgra:
	  s.recv_data_on_boundary_north = [RecvIfcRTL(DataType) for _ in range(per_cgra_columns)]
	  s.send_data_on_boundary_north = [SendIfcRTL(DataType) for _ in range(per_cgra_columns)]
	  s.recv_data_on_boundary_south = [RecvIfcRTL(DataType) for _ in range(per_cgra_columns)]
	  s.send_data_on_boundary_south = [SendIfcRTL(DataType) for _ in range(per_cgra_columns)]
	  s.recv_data_on_boundary_west = [RecvIfcRTL(DataType) for _ in range(per_cgra_rows)]
	  s.send_data_on_boundary_west = [SendIfcRTL(DataType) for _ in range(per_cgra_rows)]
	  s.recv_data_on_boundary_east = [RecvIfcRTL(DataType) for _ in range(per_cgra_rows)]
	  s.send_data_on_boundary_east = [SendIfcRTL(DataType) for _ in range(per_cgra_rows)]
