Why does reproducing the AI training fail? Valid loss is extremely large #13

linchengsi · 2019-07-04T14:07:06Z

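Hello! I downloaded your code and ran train_CNN_CTU64.py in HEVC-Complexity-Reduction-master\ETH-CNN_Training_AI directly. The readme says it can be run as is, so I made no modifications.
However, the result plots I got are very different from the ones already in the models folder. Mine are as follows.

One of the plots you provided (already in the models folder) is as follows.

As you can see, my valid loss and other curves look abnormal, with extremely large values. I have only just started studying the code and do not yet understand what causes this. Could you explain? Thank you!

(My setup: Windows 10, GTX1060MQ, Python 3.7, TensorFlow 1.13.1, run from PyCharm.)
My email is [email protected]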

tianyili2017 · 2019-07-08T11:39:55Z

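Hello,
The program can indeed be run directly, but because the full database is too large, the bundled data files are only a small part of all the data and are meant for testing whether the program runs end to end. Training on them for 1,000,000 iterations easily leads to overfitting. I also plan to download the code again and run it to see whether I get results like yours. If you want to reproduce the paper's results, I recommend following the training instructions: generate the complete data files from the original YUV videos and labels, then train. Thank you for pointing out a problem with this program and for following our work!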

tianyili2017 · 2019-07-09T02:41:08Z

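Hello. I re-ran the program, and the results are very similar to your plots, which shows that overfitting does occur on the small sample. I plan to add a note to the training instructions: the demo sets are only for getting the program to run and can be used to verify that TensorFlow and the rest of the environment are installed correctly, but they are not for reproducing the paper's results. To obtain results close to the paper's, please generate the complete training data first. Thank you for your understanding!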
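
For readers who want to spot this kind of overfitting while a run is still in progress, here is a minimal sketch of an early-stopping check on the validation loss. It is not part of the repository's training script; the class name `EarlyStopping` and the parameters `patience` and `min_delta` are hypothetical and chosen only for illustration.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` checks.

    Hypothetical helper for illustration; not taken from the repository.
    """

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # consecutive non-improving checks tolerated
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best = float("inf")      # lowest validation loss seen so far
        self.bad_checks = 0           # consecutive checks without improvement

    def update(self, valid_loss):
        """Record one validation loss; return True when training should stop."""
        if valid_loss < self.best - self.min_delta:
            self.best = valid_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


# Example: validation loss falls, then grows as the model overfits.
stopper = EarlyStopping(patience=3)
stop_at = None
for step, loss in enumerate([1.0, 0.8, 0.7, 0.9, 1.5, 3.0, 9.0]):
    if stopper.update(loss):
        stop_at = step
        break
```

A check like this only flags the symptom. As the replies above explain, reproducing the paper's results still requires training on the complete data.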

zhoushuairan · 2019-08-12T08:20:25Z

Hello, after downloading your code I ran HEVC-Complexity-Reduction-master\HM-16.5_Test_AI\build\HM_vc10.sln and got the following output:
Tensor("Conv2D:0", shape=(?, 1, 1, 1), dtype=float32)
Tensor("ResizeNearestNeighbor:0", shape=(?, 16, 16, 1), dtype=float32)
Tensor("LeakyRelu:0", shape=(?, 4, 4, 16), dtype=float32)
Tensor("LeakyRelu_1:0", shape=(?, 2, 2, 24), dtype=float32)
Tensor("LeakyRelu_2:0", shape=(?, 1, 1, 32), dtype=float32)
Tensor("Conv2D_4:0", shape=(?, 2, 2, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_1:0", shape=(?, 32, 32, 1), dtype=float32)
Tensor("LeakyRelu_3:0", shape=(?, 8, 8, 16), dtype=float32)
Tensor("LeakyRelu_4:0", shape=(?, 4, 4, 24), dtype=float32)
Tensor("LeakyRelu_5:0", shape=(?, 2, 2, 32), dtype=float32)
Tensor("Conv2D_8:0", shape=(?, 4, 4, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_2:0", shape=(?, 64, 64, 1), dtype=float32)
Tensor("LeakyRelu_6:0", shape=(?, 16, 16, 16), dtype=float32)
Tensor("LeakyRelu_7:0", shape=(?, 8, 8, 24), dtype=float32)
Tensor("LeakyRelu_8:0", shape=(?, 4, 4, 32), dtype=float32)
Tensor("concat:0", shape=(?, 2688), dtype=float32)
Tensor("cond/Merge:0", shape=(?, 64), dtype=float32)
Tensor("cond_1/Merge:0", shape=(?, 48), dtype=float32)
Tensor("cond_2/Merge:0", shape=(?, 1), dtype=float32)
Tensor("cond_3/Merge:0", shape=(?, 128), dtype=float32)
Tensor("cond_4/Merge:0", shape=(?, 96), dtype=float32)
Tensor("cond_5/Merge:0", shape=(?, 4), dtype=float32)
Tensor("cond_7/Merge:0", shape=(?, 256), dtype=float32)
Tensor("cond_8/Merge:0", shape=(?, 192), dtype=float32)
Tensor("cond_9/Merge:0", shape=(?, 16), dtype=float32)
D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv frame 501/501 416x240

Predicting Time : 7.986 sec.

HM software: Encoder Version [16.5] (including RExt)[Windows][VS 1900][64 bit]

python video_to_cu_depth.py D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv 416 240 32

Input File : D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv
Bitstream File : str.bin
Reconstruction File : rec.yuv
Real Format : 416x240 50Hz
Internal Format : 416x240 50Hz
Sequence PSNR output : Linear average only
Sequence MSE output : Disabled
Frame MSE output : Disabled
Cabac-zero-word-padding : Enabled
Frame/Field : Frame based coding
Frame index : 0 - 99 (100 frames)
Profile : main
CU size / depth / total-depth : 64 / 4 / 4
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 3
Max RQT depth intra : 3
Min PCM size : 8
Motion search range : 64
Intra period : 1
Decoding refresh type : 0
QP : 32.00
Max dQP signaling depth : 0
Cb QP Offset : 0
Cr QP Offset : 0
QP adaptation : 0 (range=0)
GOP size : 1
Input bit depth : (Y:8, C:8)
MSB-extended bit depth : (Y:8, C:8)
Internal bit depth : (Y:8, C:8)
PCM sample bit depth : (Y:8, C:8)
Intra reference smoothing : Enabled
diff_cu_chroma_qp_offset_depth : -1
extended_precision_processing_flag : Disabled
implicit_rdpcm_enabled_flag : Disabled
explicit_rdpcm_enabled_flag : Disabled
transform_skip_rotation_enabled_flag : Disabled
transform_skip_context_enabled_flag : Disabled
cross_component_prediction_enabled_flag: Disabled
high_precision_offsets_enabled_flag : Disabled
persistent_rice_adaptation_enabled_flag: Disabled
cabac_bypass_alignment_enabled_flag : Disabled
log2_sao_offset_scale_luma : 0
log2_sao_offset_scale_chroma : 0
Cost function: : Lossy coding (default)
RateControl : 0
Max Num Merge Candidates : 5

TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 SQP:0 ASR:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2 WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0

Non-environment-variable-controlled macros set as follows:

                            RExt__DECODER_DEBUG_BIT_STATISTICS =   0
                                  RExt__HIGH_BIT_DEPTH_SUPPORT =   0
                        RExt__HIGH_PRECISION_FORWARD_TRANSFORM =   0
                                    O0043_BEST_EFFORT_DECODING =   0

               Input ChromaFormatIDC =   4:2:0
   Output (internal) ChromaFormatIDC =   4:2:0

So I ran video_to_cu_depth.py under D:\HM\HM-16.5_Test_AI\bin on its own and got:
File "video_to_cu_depth.py", line 120, in
assert len(sys.argv)==5
AssertionError
Why does this happen? I would be very grateful for an answer!
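
The failing line, `assert len(sys.argv)==5`, requires exactly four command-line arguments after the script name. In the encoder log above, the script is invoked as `python video_to_cu_depth.py <yuv file> 416 240 32`, so launching it with no arguments trips the assertion. The sketch below shows the same check with a usage message in place of a bare assert. The function `parse_args` is hypothetical and not the repository's code, and reading the arguments as file path, width, height, and QP is an assumption inferred from the log (416x240, QP 32.00).

```python
def parse_args(argv):
    """Return (yuv_path, width, height, qp) parsed from an argv-style list.

    Hypothetical helper; the argument meanings are assumed from the encoder log.
    Exits with a usage message when the argument count is wrong.
    """
    if len(argv) != 5:
        raise SystemExit(
            "usage: python video_to_cu_depth.py <yuv_path> <width> <height> <qp>"
        )
    yuv_path = argv[1]
    width, height, qp = (int(a) for a in argv[2:5])
    return yuv_path, width, height, qp


# Mirrors the invocation shown in the log (path shortened for illustration).
args = parse_args(["video_to_cu_depth.py", "BasketballPass_416x240_50.yuv",
                   "416", "240", "32"])
```

In a real script the list passed in would be `sys.argv`. Running the script by itself from an IDE or the command line without those four values fails the length check.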
