
Error showing up during detection #4

Open
tsathya98 opened this issue Apr 16, 2021 · 10 comments

@tsathya98

Hi,
When I try to run detect.py with:
"python detect.py --pic-dir images/chess_pictures --model-path output_model/chess/best_model_tiny_0.985/1 --class-names dataset/chess.names --nms-score-threshold 0.1",
the following error shows up:

Traceback (most recent call last):
  File "detect.py", line 119, in
    main(args)
  File "detect.py", line 95, in main
    model = tf.keras.models.load_model(args.model_path)
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\save.py", line 212, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 147, in load
    keras_loader.finalize_objects()
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 612, in finalize_objects
    self._reconstruct_all_models()
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 631, in _reconstruct_all_models
    self._reconstruct_model(model_id, model, layers)
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 677, in _reconstruct_model
    created_layers) = functional_lib.reconstruct_from_config(
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1285, in reconstruct_from_config
    process_node(layer, node_data)
  File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1222, in process_node
    nest.flatten(inbound_node.outputs)[inbound_tensor_index])
IndexError: list index out of range

I used the default dataset (chess) for training and tried to run detection with the saved model.
The TensorFlow version is 2.4.1.
Can someone help me resolve this issue? Thanks!!

@MidnessX

I'm running into the same issue. So far I haven't been able to pinpoint the cause.

@Hasankanso

Same problem... is this related to tensorflow/tensorflow#21894?

@MidnessX

MidnessX commented Apr 26, 2021

I managed to fix the problem!

The issue lies in the way the model is exported. More precisely, line 294 in train.py calls the Keras API method for exporting models.
This appears to be problematic because the network implementation uses custom layers which are probably not strictly adherent to the Keras specification.

Changing model.save(best_model_path) into tf.saved_model.save(model, best_model_path) fixed the issue for me.
This works because it uses the lower-level TensorFlow SavedModel API rather than the Keras one. More info in the official TensorFlow documentation.

At the same time, change line 95 in detect.py from tf.keras.models.load_model(args.model_path) to tf.saved_model.load(args.model_path).

Lastly, there's one more issue that needs to be fixed in detect.py: the normalization at line 29 converts the batch to float64, while the network expects float32 inputs. Change that line from img = img / 255 to img = tf.image.convert_image_dtype(img, tf.float32).

After all these changes, I am able to train the network and load it back without any issues.
I would have opened a pull request rather than explaining my findings in this post, but I can't, since my code also includes other modifications strictly related to what I will be using the network for. I hope @wangermeng2021 can fix the issue in the next commit! :)
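For reference, the three changes above can be sketched as follows. This is a toy example, not the repository's actual code: the model below is a placeholder Keras model standing in for the YOLO network, and the save path is made up.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Placeholder model standing in for the network built in train.py.
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs, outputs)

# train.py, around line 294: export with the low-level SavedModel API
# instead of model.save(best_model_path).
best_model_path = os.path.join(tempfile.mkdtemp(), "best_model")
tf.saved_model.save(model, best_model_path)

# detect.py, around line 95: load with the matching low-level API
# instead of tf.keras.models.load_model(...).
loaded = tf.saved_model.load(best_model_path)

# detect.py, around line 29: plain division promotes the image to float64,
# while convert_image_dtype yields float32 scaled into [0, 1].
img = np.random.randint(0, 256, size=(1, 4), dtype=np.uint8)
img_f64 = img / 255                                      # dtype: float64
img_f32 = tf.image.convert_image_dtype(img, tf.float32)  # dtype: float32
print(img_f64.dtype, img_f32.dtype)
```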

@Hasankanso

The issue lies in the way the model is exported. More precisely, line 294 in train.py calls the Keras API method for exporting models.
This appears to be problematic because the network implementation uses custom layers which are probably not strictly adherent to the Keras specification.

Changing model.save(best_model_path) into tf.saved_model.save(model, best_model_path) fixed the issue for me.
This works because it uses the lower-level TensorFlow SavedModel API rather than the Keras one. More info in the official TensorFlow documentation.

does this mean I have to train the model again?

@MidnessX

The issue lies in the way the model is exported. More precisely, line 294 in train.py calls the Keras API method for exporting models.
This appears to be problematic because the network implementation uses custom layers which are probably not strictly adherent to the Keras specification.
Changing model.save(best_model_path) into tf.saved_model.save(model, best_model_path) fixed the issue for me.
This works because it uses the lower-level TensorFlow SavedModel API rather than the Keras one. More info in the official TensorFlow documentation.

does this mean I have to train the model again?

I'm afraid so.

@hzk7287

hzk7287 commented Apr 28, 2021

I think to fix the problem you have to add one line,
"batch_img_tensor = tf.convert_to_tensor(batch_img)",
and use "batch_img_tensor" instead of "batch_img" as the input of "detect_batch_img".

@tsathya98
Author

I managed to fix the problem!

The issue lies in the way the model is exported. More precisely, line 294 in train.py calls the Keras API method for exporting models.
This appears to be problematic because the network implementation uses custom layers which are probably not strictly adherent to the Keras specification.

Changing model.save(best_model_path) into tf.saved_model.save(model, best_model_path) fixed the issue for me.
This works because it uses the lower-level TensorFlow SavedModel API rather than the Keras one. More info in the official TensorFlow documentation.

At the same time, change line 95 in detect.py from tf.keras.models.load_model(args.model_path) to tf.saved_models.load(args.model_path).

Lastly, there's one more issue that needs to be fixed in detect.py: the normalization at line 29 converts the batch to float64, while the network expects float32 inputs. Change that line from img = img / 255 to img = tf.image.convert_image_dtype(img, tf.float32).

After all these changes, I am able to train the network and load it back without any issues.
I would have opened a pull request rather than explaining my findings in this post, but I can't, since my code also includes other modifications strictly related to what I will be using the network for. I hope @wangermeng2021 can fix the issue in the next commit! :)

UPDATE:
Thanks to the solutions given by @MidnessX, the problem I faced while running detect.py has been fixed!!!
I was able to train the model and run detection as well.

@Hasankanso

At the same time, change line 95 in detect.py from tf.keras.models.load_model(args.model_path) to tf.saved_models.load(args.model_path).

To save some time for others: it's saved_model and not saved_models, i.e. tf.saved_model.load(args.model_path).

@MidnessX

MidnessX commented May 3, 2021

At the same time, change line 95 in detect.py from tf.keras.models.load_model(args.model_path) to tf.saved_models.load(args.model_path).

To save some time for others: it's saved_model and not saved_models, i.e. tf.saved_model.load(args.model_path).

Thank you, I didn't catch the error. I've updated my post with the correct module name.

@wangermeng2021
Owner

I managed to fix the problem!

The issue lies in the way the model is exported. More precisely, line 294 in train.py calls the Keras API method for exporting models.
This appears to be problematic because the network implementation uses custom layers which are probably not strictly adherent to the Keras specification.

Changing model.save(best_model_path) into tf.saved_model.save(model, best_model_path) fixed the issue for me.
This works because it uses the lower-level TensorFlow SavedModel API rather than the Keras one. More info in the official TensorFlow documentation.

At the same time, change line 95 in detect.py from tf.keras.models.load_model(args.model_path) to tf.saved_model.load(args.model_path).

Lastly, there's one more issue that needs to be fixed in detect.py: the normalization at line 29 converts the batch to float64, while the network expects float32 inputs. Change that line from img = img / 255 to img = tf.image.convert_image_dtype(img, tf.float32).

After all these changes, I am able to train the network and load it back without any issues.
I would have opened a pull request rather than explaining my findings in this post, but I can't, since my code also includes other modifications strictly related to what I will be using the network for. I hope @wangermeng2021 can fix the issue in the next commit! :)

Thanks for the solutions, the problem has been fixed!
