Can I just sub the ASL model/labels files? #4
-
I am working on a Raspberry Pi 4 with a picamera and have a working mobilenet_v1_1.0_224_quant.tflite classifier program, which I tried "blindly" with the ASL model.tflite and labels.txt. It seems to produce "nothing" when I fill the image with all black, but it classifies nearly everything else as "C". Is there something more I need to do to get this running?
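For what it's worth, here is roughly how I'm sanity-checking what the substituted model expects before feeding it frames. This is just a sketch assuming the tflite_runtime package is installed; "model.tflite" stands in for whatever ASL model file you downloaded.

```python
# Sketch: inspect the substituted model's expected input/output
# (tflite_runtime assumed installed; model path is a placeholder).
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# If these don't match what the existing classifier program feeds the model
# (e.g. uint8 vs. float32, or 224x224 vs. 320x320), you can get nonsense
# results like "everything is C" without any error being raised.
print("input shape:", input_details[0]["shape"])    # e.g. [1 224 224 3]
print("input dtype:", input_details[0]["dtype"])    # quantized models expect uint8
print("output shape:", output_details[0]["shape"])  # should match the number of lines in labels.txt
```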
Replies: 2 comments 1 reply
-
In the demo image, it shows "Frame, Crop, View" parameters with "Crop 224x224", and the MobileNet doc mentions this dimension ("Our primary network (width multiplier 1, 224 × 224)"), but elsewhere the doc gives the input as 320x320: "Both MobileNet models are trained and evaluated with … The input resolution of both models is 320 × 320."

I don't know whether the app uses a segmentation step to "find a hand" and then applies a 320x320 or 224x224 crop around "the hand", but it seems like I need to, at a minimum, try cropping out the center 320x320 of the picamera's 640x480 image and see what happens. The actual ASL doc, however, states: "The model takes an input image of size 224x224 with three channels per pixel (RGB - Red Green Blue)."

Perhaps I will have to crop out the center 224x224 of the image, but since I don't have a preview ability on my robot, it will be difficult to know when my hand is in the proper position - which would require an additional segmentation "find the hand" step. I so wanted this to be easy!
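In case it is useful, this is the center crop I have in mind, assuming the picamera frame arrives as a 480x640x3 NumPy array (swap 224 for 320 to try the other size):

```python
import numpy as np

def center_crop(frame, size=224):
    """Return a size x size square cut from the center of an HxWx3 frame."""
    h, w = frame.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return frame[top:top + size, left:left + size]

# Placeholder for a 640x480 RGB frame captured from the picamera.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
crop = center_crop(frame, size=224)   # shape (224, 224, 3)
```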
-
Hey, @slowrunner! You can use this notebook for reference. The input image should have the shape (1, 224, 224, 3).
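A minimal sketch of feeding a 224x224 crop to the model with that shape, assuming tflite_runtime and a placeholder model path; check the notebook for the exact preprocessing (dtype and any scaling) the ASL model expects:

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="model.tflite")  # placeholder path
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# crop: a (224, 224, 3) RGB image, e.g. the center crop from above.
crop = np.zeros((224, 224, 3), dtype=np.uint8)   # placeholder image
input_tensor = np.expand_dims(crop, axis=0)      # -> (1, 224, 224, 3)

# Cast to whatever dtype the model reports (uint8 for quantized models,
# float32 with scaling for float models -- confirm against the notebook).
input_tensor = input_tensor.astype(input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], input_tensor)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])[0]
print("top class index:", int(np.argmax(scores)))
```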