Summary

A Saturday-and-half-Sunday project to find the strangest objects in a sample of about 10k of my pictures, using the pre-trained Inception-v3 deep convolutional network by Google.

I specifically look for "unusual" objects, focusing on the objects that score a low count in a natural language corpus.

High-level process description

Setup

I first got Inception-v3 up and running. Inception-v3 is a deep convolutional network by Google that comes pre-trained on the ImageNet Large Scale Visual Recognition Challenge data.

Inception-v3 runs on the TensorFlow framework by Google. Setting up the framework as of Feb 2016 is moderately complicated: the installation is simple, but actually running the real examples requires some heavy Stack Overflow and GitHub digging.

Luckily, there is a Docker container that works pretty much out of the box: https://hub.docker.com/r/atong01/imagenet-tensorflow/ . After about half a day of trouble getting TensorFlow to work on my own, I finally used the container and had the process working within an hour.

After that, I made an export of around 10k of my pictures.

Finally, I downloaded a file with the 1/3 million most frequent words, all lowercase, with counts, from Peter Norvig's page here http://norvig.com/ngrams/ . This file would allow me to score the "strangeness" of the labels.
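The lookup against the word-count file can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each line of the file holds a word and its count separated by whitespace, and the names `load_counts` and `corpus_count` are hypothetical. The two sample counts are taken from the results further down.

```python
import io


def load_counts(lines):
    """Build a word -> count dict from lines of the form 'word<whitespace>count'.

    The file layout is an assumption; malformed lines are skipped.
    """
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        word, count = parts
        counts[word.lower()] = int(count)
    return counts


def corpus_count(word, counts):
    """Return the corpus count for a word, or None if the word is not listed."""
    return counts.get(word.lower())


# Tiny in-memory stand-in for the real file.
sample = io.StringIO("ashcan\t17521\nbinoculars\t4487365\n")
counts = load_counts(sample)
print(corpus_count("ashcan", counts))
```

With the real file, `load_counts` would be given an open file handle instead of the in-memory sample.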

The coding part

I made three scripts:

  1. runAdjusted.py, which runs the object recognition. This script outputs a text blurb listing the objects found in the picture, each with its confidence level. See an example here https://www.tensorflow.org/versions/r0.7/tutorials/image_recognition/index.html

  2. extractToken.py, which extracts a single one-word label from the object recognition text.

  3. howFrequent.py which takes the word of the previous point and looks it up in the "word corpus count" file.

The three commands piped together give the filename, the object detected with the highest confidence, and the "word corpus count" of the found object.
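The token-extraction step might look roughly like the sketch below. It is an assumption-laden illustration rather than the contents of extractToken.py: it assumes the recognition blurb has one line per candidate in the form `label, synonym, ... (score = 0.12345)` as in the linked TensorFlow tutorial, and it takes the first word of the first label of the highest-scoring line. That last choice is a guess, based on tokens such as "triumphal" and "punching" appearing in the results. The function name `top_token` and the scores in the example are made up.

```python
import re

# One candidate per line: "label, synonym, ... (score = 0.12345)" (assumed format).
LINE_RE = re.compile(r"^(?P<labels>.+?)\s*\(score = (?P<score>[0-9.]+)\)\s*$")


def top_token(blurb):
    """Return (token, score) for the highest-scoring line, or None if none parse."""
    best = None
    for line in blurb.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore log lines or anything else that is not a candidate
        score = float(match.group("score"))
        if best is None or score > best[1]:
            best = (match.group("labels"), score)
    if best is None:
        return None
    # Keep the first synonym only, then its first word, lowercased.
    words = best[0].split(",")[0].split()
    if not words:
        return None
    return words[0].lower(), best[1]


# Illustrative blurb; the scores are invented for the example.
blurb = "stupa, tope (score = 0.02)\ntriumphal arch (score = 0.81)\n"
result = top_token(blurb)
```

Reducing a multi-word label to its first word keeps the corpus lookup simple, at the price of tokens that are no longer object names on their own.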

The running part

Once the container is installed, download this project. From the directory of the project, run:

docker run -it -v $PWD:/root/tmp -v $PWD/images1:/root/tmp/images1 -v $PWD/images2:/root/tmp/images2 -v $PWD/images3:/root/tmp/images3 -v $PWD/images4:/root/tmp/images4 atong01/imagenet-tensorflow

Then, from "inside" the container:

cd /root/tmp/
sh ./processAllImages1.sh > resultsImages1.csv

(you can run multiple processes, "processAllImages2" operates on the "images2" directory and so on up to "processAllImages4")

This will process all the images in the ./images1/ directory.

Each line of output gives the file name, the recognised object and the "word corpus" count of that word.

(You will want to redirect the output to a .csv file for further processing, as in the command above.)

I put the images in three separate directories and ran three processes in parallel, writing the results to .csv files.

After a night of processing, I collated the three .csv files into one for analysis.
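Collating the result files and ranking them by strangeness could be done along these lines. This is a sketch under assumptions, not the procedure actually used: it assumes each row is `filename,label,count` separated by commas, the function name `collate` is hypothetical, and the file names in the example are invented (the counts come from the results below).

```python
import csv
import io


def collate(sources):
    """Merge rows from several CSV sources and sort them by ascending corpus count.

    Each source is any iterable of CSV lines (an open file, a StringIO, ...).
    Rows that do not have exactly three fields or a numeric count are skipped.
    """
    rows = []
    for source in sources:
        for record in csv.reader(source):
            if len(record) != 3:
                continue
            filename, label, count = (field.strip() for field in record)
            if not count.isdigit():
                continue  # e.g. labels that were not found in the corpus file
            rows.append((filename, label, int(count)))
    # Lowest corpus count first: the "strangest" objects come out on top.
    rows.sort(key=lambda row: row[2])
    return rows


# In-memory stand-ins for the per-directory result files.
part1 = io.StringIO("a.jpg,binoculars,4487365\nb.jpg,ashcan,17521\n")
part2 = io.StringIO("c.jpg,axolotl,57021\n")
ranked = collate([part1, part2])
```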

More about ImageNet and convolutional networks

The ImageNet Large Scale Visual Recognition Challenge is built on ImageNet, a big dataset [Deng et al. 2009], and has been used since 2010 as a benchmark for object recognition in images. Since 2012, the benchmark has been dominated by convolutional networks. Although convolutional networks have been around for a long time, the past five years or so have seen incredible progress in how fast they can be trained. This is mostly due to the advent of "big data" (which readied the whole IT industry for workflows based on the transfer and analysis of huge quantities of data) and the availability of server farms equipped with graphics cards (where both training and running these networks can be done quickly and cheaply).

For a great overview, see any of Yann LeCun's (a key player in the field) videos on the matter e.g. https://www.youtube.com/watch?v=M7smwHwdOIA

Results (the strangest objects in my pics)

The objects are ordered in decreasing strangeness (an object is more strange if it has a lower count in the Natural Language Corpus Data).

ashcan (corpus count: 17521 )

consomme (corpus count: 24689 )

chainlink (corpus count: 28909 )

toyshop (corpus count: 35622 )

toyshop (corpus count: 35622 )

toyshop (corpus count: 35622 )

toyshop (corpus count: 35622 )

jackfruit (corpus count: 41792 )

jackfruit (corpus count: 41792 )

trolleybus (corpus count: 48586 )

axolotl (corpus count: 57021 )

komondor (corpus count: 67610 )

kuvasz (corpus count: 72446 )

drumstick (corpus count: 89753 )

snowplow (corpus count: 96186 ). It's actually a police car but come on.

dugong (corpus count: 103769 ). It's probably a whale but could be a dugong with some imagination.

maillot (corpus count: 111683 )

meerkat (corpus count: 113040 ) pic of an Argos catalog page.

washbasin (corpus count: 114082 )

stupa (corpus count: 117550 )

sandbar (corpus count: 136849 )

speedboat (corpus count: 141990 )

breastplate (corpus count: 154648 ). Still from the video "All is full of love".

triumphal (corpus count: 155633 )

megalith (corpus count: 163808 )

loupe (corpus count: 168244 )

shoji (corpus count: 231018 )

flagpole (corpus count: 270789 )

airship (corpus count: 296641 )

airship (corpus count: 296641 )

breakwater (corpus count: 307474 )

geyser (corpus count: 313513 ) albeit an urban one.

boathouse (corpus count: 324233 )

photocopier (corpus count: 341379 )

trifle (corpus count: 349545 )

ladle (corpus count: 372653 )

bannister (corpus count: 399566 )

tricycle (corpus count: 400029 ) Not quite a tricycle but giving it a pass since it did find a curious object indeed.

rotisserie (corpus count: 459806 )

streetcar (corpus count: 471936 )

odometer (corpus count: 482886 )

oscilloscope (corpus count: 535361 )

albatross (corpus count: 553191 )

kimono (corpus count: 586714 )

grasshopper (corpus count: 595548 )

hermit (corpus count: 828204 )

eel (corpus count: 956144 )

forklift (corpus count: 1006085 )

confectionery (corpus count: 1138693 )

parachute (corpus count: 1311422 )

carousel (corpus count: 1369112 )

hamster (corpus count: 1444712 )

cloak (corpus count: 1450056 )

broccoli (corpus count: 1625392 )

flamingo (corpus count: 1638181 )

joystick (corpus count: 1687753 )

gong (corpus count: 2128445 ) not quite but semantically close.



wig (corpus count: 2151939 ) highly impressive that a convnet can tell the difference between hair and a wig.











pelican (corpus count: 2199254 )

ox (corpus count: 2258722 )

maze (corpus count: 2404306 )

wreck (corpus count: 2447935 )

altar (corpus count: 2627835 )

radiator (corpus count: 2850529 )

knot (corpus count: 2980542 )

jigsaw (corpus count: 3237104 )

jigsaw (corpus count: 3237104 )

feather (corpus count: 3331460 )

throne (corpus count: 3392423 )

boxer (corpus count: 3592432 )

goose (corpus count: 3739382 )

sock (corpus count: 3748769 )

sock (corpus count: 3748769 )

binoculars (corpus count: 4487365 )

Funny mis-labelings

warplane (corpus count: 43421 ) yes but no.

mousetrap (corpus count: 204797 ). Many small mechanical things are classified as mousetraps.

hippopotamus (corpus count: 223453 )

miniskirt (corpus count: 254335 )

guillotine (corpus count: 307824 )

baboon (corpus count: 359080 )

crayfish (corpus count: 419647 )

rotisserie (corpus count: 459806 )

barbershop (corpus count: 481271 ). How would you classify this one?

strainer (corpus count: 503244 ). (it's the core of an induction coil).

croquet (corpus count: 571392 )

corkscrew (corpus count: 581475 ) (robotic surgeon, pic I took from a book)

clog (corpus count: 787644 )

barbell (corpus count: 819496 )



punching (corpus count: 1000845 ) (somewhat similar to a punching bag?)

teapot (corpus count: 1069708 ) (don't ask what this is)

parachute (corpus count: 1311422 )

joystick (corpus count: 1687753 ) (again, don't ask why we did this in our work time)

toaster (corpus count: 2092936 )

cello (corpus count: 2358848 ) hmmm no

bathing (corpus count: 2738196 )

submarine (corpus count: 2766790 ) hmmmm quite

kite (corpus count: 2808067 )

racer (corpus count: 3625077 ) (yes but slow)

limousine (corpus count: 4007578 ). Many "group pictures" are classified as limousines.

violin (corpus count: 5327802 ) no no no!

More on found objects

Inception-v3 processed 9726 images. The most frequently and least frequently found objects are:

Most found: pier 390, window 178, lakeside 177, spotlight 150, seashore 147, web 139, ...

Least found: ..., vulture 1, walking 1, warthog 1, whippet 1, whiptail 1, whiskey 1, yellow 1, zebra 1, zucchini 1

The objects-occurrences.csv file lists them all.
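A tally like this can be produced with a few lines of Python. The sketch below is only an illustration of the counting step, not how objects-occurrences.csv was actually generated; the function name `label_frequencies` and the sample labels are made up.

```python
from collections import Counter


def label_frequencies(labels):
    """Return (label, occurrences) pairs, most frequent first."""
    return Counter(labels).most_common()


# Illustrative input: one detected label per processed image.
labels = ["pier", "window", "pier", "zebra", "pier", "window"]
frequencies = label_frequencies(labels)
for label, occurrences in frequencies:
    print(label, occurrences)
```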