Enable WITH_LAYER_crop for NCNN for tiled VAE decode #29

Merged
nihui merged 1 commit into nihui:master from nphuracm:enable-crop-layer
Feb 14, 2026
Conversation

@nphuracm
Contributor

The program consistently segfaulted when running tiled VAE decode at the end of inference. The failing call chain was: tiled decode calls copy_cut_border, which calls create_layer(LayerType::Crop), which returned NULL because layer_to_index did not find the Crop type in the layer_registry. The contents of that registry are defined by these build-time flags, and it turns out the required Crop type was not enabled at build time.
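To illustrate the failure mode described above, here is a minimal sketch of a registry whose entries are selected by build-time flags. Every name in it (the `TOY_WITH_LAYER_*` macros, `ToyLayer`, `toy_layer_to_index`, `toy_create_layer`) is hypothetical and only mimics the shape of the problem; it is not NCNN's actual source.

```cpp
#include <cstring>

// Hypothetical build-time switches, standing in for flags of the WITH_LAYER_* kind.
#define TOY_WITH_LAYER_convolution 1
#define TOY_WITH_LAYER_crop 0 // the layer type is compiled out

struct ToyLayer
{
    const char* type;
};

typedef ToyLayer* (*ToyLayerCreator)();

#if TOY_WITH_LAYER_convolution
static ToyLayer* create_toy_convolution() { return new ToyLayer{"Convolution"}; }
#endif
#if TOY_WITH_LAYER_crop
static ToyLayer* create_toy_crop() { return new ToyLayer{"Crop"}; }
#endif

struct ToyRegistryEntry
{
    const char* name;
    ToyLayerCreator creator;
};

// Only the layer types enabled at build time end up in the table.
static const ToyRegistryEntry toy_layer_registry[] = {
#if TOY_WITH_LAYER_convolution
    {"Convolution", create_toy_convolution},
#endif
#if TOY_WITH_LAYER_crop
    {"Crop", create_toy_crop},
#endif
};

// Returns the registry index for `type`, or -1 if the type is not registered.
int toy_layer_to_index(const char* type)
{
    const int count = sizeof(toy_layer_registry) / sizeof(toy_layer_registry[0]);
    for (int i = 0; i < count; i++)
    {
        if (std::strcmp(toy_layer_registry[i].name, type) == 0)
            return i;
    }
    return -1;
}

// Returns a new layer owned by the caller, or nullptr if the type is not registered.
ToyLayer* toy_create_layer(const char* type)
{
    const int index = toy_layer_to_index(type);
    if (index < 0)
        return nullptr;
    return toy_layer_registry[index].creator();
}
```

With `TOY_WITH_LAYER_crop` set to 0, `toy_create_layer("Crop")` returns `nullptr`, and any caller that uses the result without checking it would crash in the way this PR describes. Setting the flag to 1 puts the entry back in the table.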

This commit enables WITH_LAYER_crop, and the program now runs to completion (see the log below).

This may also reveal a broader lack of graceful error handling in this repo, since the null pointer is not caught before it is dereferenced. However, that is outside the scope of this PR.
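As a rough sketch of what catching the null pointer could look like, the snippet below checks the factory result and returns an error code instead of dereferencing it. `ToyCropLayer`, `toy_create_crop`, and `toy_cut_border` are made-up names for illustration, and the one-dimensional "crop" is a simplification; this is not how the repo's copy_cut_border is implemented.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <vector>

// Hypothetical stand-in for a crop layer: drops `border` elements from each end.
struct ToyCropLayer
{
    std::vector<int> forward(const std::vector<int>& src, std::size_t border) const
    {
        if (src.size() < 2 * border)
            return {};
        return std::vector<int>(src.begin() + border, src.end() - border);
    }
};

// Hypothetical factory: yields an empty pointer when the layer type
// is not available in the current build.
std::unique_ptr<ToyCropLayer> toy_create_crop(bool crop_enabled)
{
    if (!crop_enabled)
        return nullptr;
    return std::unique_ptr<ToyCropLayer>(new ToyCropLayer());
}

// Returns 0 on success, or -1 if the crop layer could not be created.
// `dst` is left untouched on failure.
int toy_cut_border(const std::vector<int>& src, std::vector<int>& dst,
                   std::size_t border, bool crop_enabled)
{
    std::unique_ptr<ToyCropLayer> crop = toy_create_crop(crop_enabled);
    if (!crop)
    {
        std::fprintf(stderr, "toy_cut_border: Crop layer is not available in this build\n");
        return -1;
    }
    dst = crop->forward(src, border);
    return 0;
}
```

The caller can then report the missing layer and exit cleanly rather than segfaulting.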

I do not rule out the possibility that this setting was intentional and that a better fix is available.

There may also be other missing layer types that should be enabled and included in the registry but are not.
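One way to look for other gaps of this kind would be to compare the layer types the program needs against the ones actually registered, once at startup. The helper below is a sketch under that assumption; `find_missing_layers` is a hypothetical name, and the layer lists in the usage note are illustrative, not taken from this repo.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the entries of `required` that do not appear in `registered`,
// preserving the order of `required`.
std::vector<std::string> find_missing_layers(
    const std::vector<std::string>& registered,
    const std::vector<std::string>& required)
{
    std::vector<std::string> missing;
    for (const std::string& name : required)
    {
        if (std::find(registered.begin(), registered.end(), name) == registered.end())
            missing.push_back(name);
    }
    return missing;
}
```

For example, calling it with registered types `{"Convolution", "GroupNorm"}` and required types `{"Convolution", "Crop", "GroupNorm"}` would report `"Crop"` as missing before any inference work starts.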

This resolves #28.

[New Thread 0x7fffea6be6c0 (LWP 628471)]
[New Thread 0x7fffe9ebd6c0 (LWP 628472)]
[0 AMD Radeon RX 580 2048SP (RADV POLARIS10)]  queueC=1[4]  queueT=0[1]  rebar=1  r-score=64
[0 AMD Radeon RX 580 2048SP (RADV POLARIS10)]  fp16-p/s/u/a=1/1/1/0  int8-p/s/u/a=1/1/1/1  bf16-p/s=1/0
[0 AMD Radeon RX 580 2048SP (RADV POLARIS10)]  subgroup=64(64~64)  ops=1/1/1/1/1/1/1/1/1/1
[0 AMD Radeon RX 580 2048SP (RADV POLARIS10)]  fp16-cm=0  int8-cm=0  bf16-cm=0  fp8-cm=0
prompt = A half-length portrait in the warm light of a convenience store late at night. An East Asian beauty, holding milk, meets your gaze in front of the freezer.
negative-prompt = 
output-path = out.png
model = /mnt/ntfshdd/SD_REPOS_MIRROR/z-image-ncnn/z-image-turbo
image-size = 1024 x 1024
steps = 9
seed = 1576066655
gpu-id = 0
low_vram = 1
vae_tile_size = 512 x 512
[New Thread 0x7fffe8ffeb40 (LWP 628487)]
[New Thread 0x7fffe27febc0 (LWP 628488)]
[New Thread 0x7fffe1ffcc40 (LWP 628489)]
[New Thread 0x7fffe17facc0 (LWP 628490)]
[New Thread 0x7fffe0ff8d40 (LWP 628491)]
[New Thread 0x7fffcbffedc0 (LWP 628492)]
[New Thread 0x7fffcb7fce40 (LWP 628493)]
[New Thread 0x7fffcaffaec0 (LWP 628494)]
[New Thread 0x7fffca7f8f40 (LWP 628495)]
[New Thread 0x7fffc9ff6fc0 (LWP 628496)]
[New Thread 0x7fffc97f5040 (LWP 628497)]
[New Thread 0x7fffc8ff30c0 (LWP 628498)]
[New Thread 0x7fffabfff140 (LWP 628499)]
[New Thread 0x7fffab7fd1c0 (LWP 628500)]
[New Thread 0x7fffaaffb240 (LWP 628501)]
[New Thread 0x7fffaa7f92c0 (LWP 628502)]
[New Thread 0x7fffa9ff7340 (LWP 628503)]
num_patches = 64 x 64
step 0 done
step 1 done
step 2 done
step 3 done
step 4 done
step 5 done
step 6 done
step 7 done
step 8 done
overwrite built-in layer type GroupNorm
[Thread 0x7fffe9ebd6c0 (LWP 628472) exited]
[Thread 0x7fffea6be6c0 (LWP 628471) exited]
[Thread 0x7fffe27febc0 (LWP 628488) exited]
[Thread 0x7fffe1ffcc40 (LWP 628489) exited]
[Thread 0x7fffe8ffeb40 (LWP 628487) exited]
[Thread 0x7fffe17facc0 (LWP 628490) exited]
[Thread 0x7fffe0ff8d40 (LWP 628491) exited]
[Thread 0x7fffcbffedc0 (LWP 628492) exited]
[Thread 0x7fffcb7fce40 (LWP 628493) exited]
[Thread 0x7fffcaffaec0 (LWP 628494) exited]
[Thread 0x7fffc9ff6fc0 (LWP 628496) exited]
[Thread 0x7fffca7f8f40 (LWP 628495) exited]
[Thread 0x7fffc97f5040 (LWP 628497) exited]
[Thread 0x7fffc8ff30c0 (LWP 628498) exited]
[Thread 0x7fffab7fd1c0 (LWP 628500) exited]
[Thread 0x7fffabfff140 (LWP 628499) exited]
[Thread 0x7fffaaffb240 (LWP 628501) exited]
[Thread 0x7fffaa7f92c0 (LWP 628502) exited]
[Thread 0x7fffa9ff7340 (LWP 628503) exited]
[Inferior 1 (process 628466) exited normally]

@nihui
Owner

nihui commented Feb 14, 2026

good catch!
Thanks for your contribution !

@nihui nihui merged commit 34595c5 into nihui:master Feb 14, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

Consistent Segmentation Fault
