
Some technical questions. #5

Open
lyf1212 opened this issue Jan 8, 2025 · 2 comments
lyf1212 commented Jan 8, 2025

Thank you for your brilliant work! I have some questions after reading it:

  1. Why not LayerNorm? As revealed by Continuous Exposure Learning for Low-Light Enhancement Using Neural ODE [1], LayerNorm inherently generalizes across diverse illumination distributions, which would probably benefit your IIM.
  2. Why can the initial IIM provide invariant features? As shown in your code, all kernels are initialized from a Gaussian distribution, not with ones. I thought those $w_i$ should be 1 in Eq. (7) to ensure illumination invariance from the beginning.
  3. Why not zero-shot generalization? You retrain the IIM on COCO to examine generalizability, but have you tried it in a zero-shot manner, without retraining? That is, directly transferring the fused features produced for the object detection task to the segmentation task.

Looking forward to your reply!

[1] Continuous Exposure Learning for Low-Light Enhancement Using Neural ODE, submission to ICLR 2025.

MingboHong (Owner) commented:

Hi, thank you for your valuable questions and for showing interest in our project :)

Q: Why not LayerNorm?
A: I align with your point: many researchers suggest not using BN for image restoration tasks, and LN seems a better choice there. However, LN typically needs a fixed input shape, which may impede the flexibility of detection tasks (such as multi-scale training/testing).
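As a minimal sketch of the fixed-shape issue (assuming a PyTorch-style setup; the layer names and shapes below are illustrative, not from the repo): `nn.LayerNorm` bakes the normalized shape, including the spatial dimensions, into the module, so a layer built for one resolution rejects another, whereas a channel-only alternative such as GroupNorm accepts any spatial size.

```python
import torch
import torch.nn as nn

# LayerNorm over (C, H, W) fixes the spatial size at construction time.
ln = nn.LayerNorm([64, 32, 32])
# GroupNorm only fixes the channel count, so spatial size is free to vary.
gn = nn.GroupNorm(num_groups=8, num_channels=64)

x_small = torch.randn(2, 64, 32, 32)
x_large = torch.randn(2, 64, 48, 48)  # e.g., another scale in multi-scale training

print(ln(x_small).shape)  # works: input matches the baked-in shape
print(gn(x_small).shape)  # works
print(gn(x_large).shape)  # still works: GroupNorm is spatially shape-agnostic

try:
    ln(x_large)  # fails: this LayerNorm was built for 32x32 inputs
except RuntimeError as err:
    print("LayerNorm error:", err)
```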

Q: Why can the initial IIM provide invariant features?
A: The initialization method probably does not affect the IIM. As you can see in Eq. (7):

[image of Eq. (7)]

This equation contains only intrinsic properties (i.e., illumination-invariant terms), which means you can use various initialization methods for the IIM from the beginning. (But zero initialization may not be a good choice.)
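A hedged numerical sketch of why the particular random values should not matter, assuming (as the "zero-mean" setting discussed later in this thread suggests) that the module operates on log-images with zero-mean kernels: in log space $I = R \cdot L$ becomes $\log I = \log R + \log L$, so any zero-mean kernel annihilates a locally constant illumination term, whatever the kernel was initialized to. All names below are hypothetical, not from the repo.

```python
import numpy as np

rng = np.random.default_rng(0)

reflectance = rng.uniform(0.1, 1.0, size=(8, 8))  # intrinsic surface property R
log_R = np.log(reflectance)

w = rng.normal(size=(3, 3))  # Gaussian initialization, as in the repo's code
w -= w.mean()                # zero-mean constraint: sum(w) == 0

def center_response(log_img, w):
    """Kernel response on the central 3x3 patch of a log-image."""
    patch = log_img[3:6, 3:6]
    return float((patch * w).sum())

for L in (0.05, 0.5, 5.0):         # three global illumination levels
    log_I = log_R + np.log(L)      # illumination is additive in log space
    print(f"L={L}: response={center_response(log_I, w):+.6f}")
# All three responses coincide: sum(w) = 0 cancels the log L offset,
# regardless of which random values w started from.
```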

Q: Why not zero-shot generalization?
A: Actually, we did run some zero-shot experiments, as shown on OpenReview, but the results were not remarkable.

lyf1212 (Author) commented Jan 18, 2025

Thank you for such an enthusiastic response!

Yes, LayerNorm requires a fixed spatial shape, hampering adaptability to variable image inputs. By the way, wouldn't InstanceNorm, which normalizes each channel independently over the spatial axes, be a better choice? It normalizes all channels to a similar distribution, which is probably more invariant and stable under diverse illumination. Inside a deep CNN, each channel represents basic pixel information of the input; see the sketch below.
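A small sketch of that property (again assuming PyTorch; the tensors and tolerance are illustrative): InstanceNorm standardizes each channel with its own spatial statistics, so it is invariant, up to the eps term in the denominator, to a global per-channel gain (a crude model of a uniform illumination change), and it accepts any input resolution.

```python
import torch
import torch.nn as nn

inorm = nn.InstanceNorm2d(num_features=3, affine=False)

x      = torch.rand(1, 3, 40, 60)  # "normal-light" image, arbitrary H x W
x_dark = 0.5 * x                   # same scene under a uniform gain drop

out, out_dark = inorm(x), inorm(x_dark)
# The gain is normalized away; only the tiny eps in the denominator differs.
print(torch.allclose(out, out_dark, atol=1e-3))  # True

# Variable input sizes are fine: no spatial shape is baked into the module.
print(inorm(torch.rand(1, 3, 77, 123)).shape)
```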

As for the initialization issue, I made a mistake. Under the setting of Eq. (5) and the "zero-mean" constraint, it indeed contains only basic surface properties, which are illumination-invariant.
However, Eq. (5) is still confusing; the straightforward derivation should be

[image of the suggested derivation]

although Eq. (5) seems reasonable in terms of the final result.

As for zero-shot generalization, I really meant "training on only normal-light images and testing on low-light images". Since large-scale, high-quality annotated normal-light images are easy to access, if one could construct a mapping from "illumination-invariant features" to segmentation/detection results, the robustness should intuitively be better.

In brief, your work has indeed made significant contributions to low-light downstream tasks. Thank you for your insightful thinking and diligent efforts!
