SSIM implementation #1988

MatthiasLen · 2023-08-08T11:30:42Z

Hi all,

Is the SSIM implementation in src/torchmetrics/functional/image/ssim.py really stable?

With a Gaussian kernel and a small sigma, e.g. sigma=0.2, the 2D kernel size can be [1,1] (or [1,1,1] in 3D), which leads to padding values of 0, see Line 127 ff.:

    pad_h = (gauss_kernel_size[0] - 1) // 2
    pad_w = (gauss_kernel_size[1] - 1) // 2

This in turn leads to degenerate tensors in Line 166 ff.:

    if is_3d:
        ssim_idx = ssim_idx_full_image[..., pad_h:-pad_h, pad_w:-pad_w, pad_d:-pad_d]
    else:
        ssim_idx = ssim_idx_full_image[..., pad_h:-pad_h, pad_w:-pad_w]

because the slice 0:-0 is empty, and finally to NaN values in the return statement:

    return ssim_idx.reshape(ssim_idx.shape[0], -1).mean(-1)
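
The empty slice comes from plain Python slice semantics rather than anything tensor-specific: `-0` is just `0`, so `pad:-pad` becomes `0:0`. A minimal sketch with lists (the helper name `cropped_range` is made up for illustration and is not part of torchmetrics); for a tensor, the mean over the resulting empty selection is what yields NaN:

```python
def cropped_range(length: int, pad: int) -> range:
    """Indices selected by seq[pad:-pad] on an axis of the given length."""
    # slice.indices resolves start/stop exactly as Python indexing would
    return range(*slice(pad, -pad).indices(length))


values = list(range(6))
print(values[1:-1])  # pad = 1: the interior [1, 2, 3, 4]
print(values[0:-0])  # pad = 0: -0 == 0, so this is values[0:0] == []

for pad in (0, 1, 5):
    # number of entries that survive the crop on an axis of length 20
    print(pad, len(cropped_range(20, pad)))
```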

A simple solution would be to avoid the negative indexing, e.g. by modifying the respective line as follows:

        s = ssim_idx_full_image.shape
        ssim_idx = ssim_idx_full_image[..., pad_h:s[-3]-pad_h, pad_w:s[-2]-pad_w, pad_d:s[-1]-pad_d]
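
The same idea could cover the 2D and 3D branches at once by building the index from explicit end positions. This is only a sketch: `crop_index` is a hypothetical helper, not an existing torchmetrics function, and it assumes the spatial axes are the trailing ones.

```python
def crop_index(spatial_shape, pads):
    """Index tuple that trims pads[i] entries from both ends of the i-th trailing axis.

    The end of each slice is written as n - p instead of -p, so p == 0 keeps the full axis.
    """
    if len(spatial_shape) != len(pads):
        raise ValueError("need exactly one pad value per spatial axis")
    return (Ellipsis, *(slice(p, n - p) for n, p in zip(spatial_shape, pads)))


# With zero padding the whole axis is kept instead of an empty range.
print(crop_index((20, 200, 200), (0, 0, 0)))
print(crop_index((20, 200, 200), (5, 5, 5)))
```

Under these assumptions the 3D line would read `ssim_idx_full_image[crop_index(ssim_idx_full_image.shape[-3:], (pad_h, pad_w, pad_d))]`, and the 2D line would use `shape[-2:]` with two pads.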

Similarly, for large sigmas (i.e. large kernels), the removal of the padding as implemented in Line 166 ff. can lead to NaN values in the aforementioned setting. A sufficient condition for it to work would be, for example:

    assert 2 * pad_h < target.shape[2]
    assert 2 * pad_w < target.shape[3]
    assert 2 * pad_d < target.shape[4]

This is because the valid convolution in Line 149 eats up the padding, and afterwards a margin of the size of the padding is removed again in Line 166 ff. Is this the intended behaviour? To first order, this appears to be a complicated variant of directly applying a valid convolution to the unpadded tensor.
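
To make the size bookkeeping concrete, here is a rough sketch that tracks one axis through the pad, valid convolution and crop steps and compares it with a valid convolution on the unpadded axis. It assumes an odd kernel of size 2 * pad + 1 and stride 1; both function names are invented for this illustration.

```python
def size_after_pad_conv_crop(n: int, pad: int) -> int:
    """Axis length after padding by pad, a valid convolution, then cropping pad per side."""
    kernel = 2 * pad + 1                # assumed odd kernel, so pad == (kernel - 1) // 2
    padded = n + 2 * pad                # padding adds pad entries on each side
    after_conv = padded - kernel + 1    # valid convolution with stride 1 gives back n
    return after_conv - 2 * pad         # cropping the margin leaves n - 2 * pad


def size_after_valid_conv(n: int, pad: int) -> int:
    """Axis length after a valid convolution applied directly to the unpadded axis."""
    kernel = 2 * pad + 1
    return n - kernel + 1               # n - 2 * pad as well


for n, pad in [(20, 5), (200, 5), (8, 5)]:
    print(n, pad, size_after_pad_conv_crop(n, pad), size_after_valid_conv(n, pad))
```

Both routes leave n - 2 * pad entries, which is positive exactly when 2 * pad < n, i.e. the condition in the asserts above.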

IMHO this is an important issue when applying this in medical imaging. E.g. in MRI we often encounter the case where the z dimension is small (a stack of few MRI slices, each slice having a high resolution). In the current implementation, a relatively high fraction of slices may be removed for the final SSIM calculation.

For example: sigma=1.5, gaussian_kernel=True --> padding approx. 5 --> the 5 top and 5 bottom slices of the SSIM map will be ignored due to Line 166 ff. With a stack of 20 slices, e.g. an image tensor of size 20x200x200, this feels limiting. Any thoughts?
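
As a rough back-of-the-envelope check of that example, assuming a padding of exactly 5 on every axis (the helper `kept_fraction` is made up for this sketch):

```python
from math import prod


def kept_fraction(shape, pad):
    """Fraction of SSIM-map entries left after trimming pad entries from both ends of each axis."""
    kept = [max(n - 2 * pad, 0) for n in shape]  # an axis cannot keep fewer than 0 entries
    return prod(kept) / prod(shape)


shape = (20, 200, 200)  # slices x height x width, as in the example above
pad = 5                 # assumed padding for sigma=1.5

print("slices kept:", shape[0] - 2 * pad, "of", shape[0])
print("fraction of SSIM map kept:", kept_fraction(shape, pad))
```

Under these assumptions half of the slices, and a bit more than half of all voxels, do not contribute to the final mean.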

Best, Matthias
