How to get the video-level "weak" label #3
Comments
Hi, we didn't use all 1000 ImageNet classes, only ~20 selected audio-related classes. We then normalize the class probabilities over these selected classes, so you can get multiple labels with class probability larger than the threshold. Also, the 0.3 threshold is just empirical. Thanks for your interest!
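For anyone else reading this thread, the procedure described in the reply above could be sketched roughly as follows. This is only a minimal illustration, not the authors' code: the function name `weak_labels`, the class indices, and the choice to renormalize per frame before max-pooling are all assumptions, since the thread does not list the ~20 classes or say in which order the two steps are applied.

```python
import numpy as np

def weak_labels(frame_probs, selected_classes, threshold=0.3):
    """Return (labels, scores) for one video.

    frame_probs: array of shape (num_frames, num_classes) holding per-frame
        softmax probabilities, e.g. from an ImageNet-pretrained classifier.
    selected_classes: indices of the audio-related classes to keep
        (hypothetical here; the actual classes are not given in this thread).
    threshold: the empirical cut-off mentioned above (0.3).
    """
    frame_probs = np.asarray(frame_probs, dtype=np.float64)
    selected = np.asarray(selected_classes)
    # Keep only the selected classes and drop everything else.
    sub = frame_probs[:, selected]
    # Renormalize per frame so the kept probabilities sum to 1
    # (assumption: normalization happens before pooling).
    sub = sub / np.clip(sub.sum(axis=1, keepdims=True), 1e-12, None)
    # Max-pool over frames to get one score per selected class.
    scores = sub.max(axis=0)
    # Every class whose pooled score exceeds the threshold is a weak label.
    labels = [int(c) for c in selected[scores > threshold]]
    return labels, scores

# Toy example: 2 frames, 5 classes, of which classes 1 and 3 are "selected".
probs = np.array([
    [0.60, 0.30, 0.05, 0.03, 0.02],
    [0.50, 0.02, 0.10, 0.30, 0.08],
])
labels, scores = weak_labels(probs, selected_classes=[1, 3])
print(labels)  # → [1, 3]
```

In the toy example no selected class exceeds 0.3 in the raw probabilities, but after restricting to the selected subset and renormalizing, both classes pass the threshold, which is how one video can end up with several weak labels.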
@rhgao
Dear Mr. Gao
Hi, we use all the collected basis vectors to initialize W, namely M x K with M = 3000 and K = 25. The value 3,000 is just a hyperparameter, and a larger number of basis vectors could potentially lead to better results.
Thanks. Could you please share your train loss/mAP and val loss/mAP? My train loss is about 0.0001 and my train mAP is about 0.72. My val loss is about 0.1 and my val mAP is 0.65 after 300 iterations; batchSize and Valsize are the same as yours. Is that normal?
Dear Mr. Gao
Thank you so much for the great work. However, I ran into some problems when implementing this code.
As described in your article, "For the visual frames, we use an ImageNet pre-trained ResNet-152 network [34] to make object category predictions, and we max-pool over predictions of all frames to obtain a video-level prediction. The top labels (with class probability larger than a threshold = 0.3) are used as weak "labels" for the unlabeled video."
However, when I use the pre-trained ResNet-152 network, I only get one category prediction larger than the threshold. How can I get multiple labels from the pre-trained ResNet-152 network?
Should I train an object detection network, a multi-class multi-label network, or use some other solution? Thank you for your assistance.
Best regards!