Sanjay Jayaram's solution to image search for the OA#1

Open
sanjaycal wants to merge 2 commits into adanomad:main from sanjaycal:main

@sanjaycal

Setup:

Because I built a Python web server, Python >= 3.10 must be installed on the machine to run this.
My environment setup script also assumes a Linux machine, though it might work on a Mac.
To set up the environment for the web server, run the setupServer.sh file in the pythonWebServer folder,
then start the server with python3 main.py.

Next, start the npm app with npm run dev in the main folder. I have only tested in dev mode, so that is part of my setup process.

Ideally, the machine running these servers has an NVIDIA GPU to accelerate the image embeddings.

Explanation of the approach

I moved the computation to a Python web server, which does all of the heavy embedding work with a CLIP model.
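A minimal sketch of that split, to make the division of labor concrete. Everything in it is an assumption for illustration, not the code in this PR: the use of the standard-library http.server, the JSON field names image_url and query, the port, and the helper names score_image_against_text, build_response, and run are all hypothetical. The CLIP scoring step is left as a placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

THRESHOLD = 0.2  # cutoff taken from the snippet under review


def score_image_against_text(image_url: str, query: str) -> float:
    """Hypothetical placeholder for the CLIP step.

    A real version would embed the image and the query text and
    return their similarity score.
    """
    raise NotImplementedError("plug the CLIP model in here")


def build_response(payload: dict, scorer=score_image_against_text) -> dict:
    """Turn a request payload into the JSON-serializable reply."""
    score = scorer(payload["image_url"], payload["query"])
    return {"score": score, "match": score > THRESHOLD}


class MatchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body sent by the JavaScript front end.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(build_response(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Serve until interrupted (host and port are assumed values)."""
    HTTPServer((host, port), MatchHandler).serve_forever()
```

Keeping build_response separate from the HTTP handler means the matching logic can be exercised with a stub scorer, without loading a model or opening a socket.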

The process

First of all, I spent far too long trying to get this to work on the JS side, but JS does not have nearly as mature an ML ecosystem as Python. After failing to get JS inference of a CLIP model to work, I decided to build the Python web server instead, and finished it in less than a quarter of the time I had spent trying to do everything in JS. Once the computation lived in the Python web server and the JavaScript simply called it, things became much easier, since I have far more experience with ML and other AI projects in Python than in JavaScript.

image_features /= image_features.norm(dim=-1, keepdim=True)
text_features /= text_features.norm(dim=-1, keepdim=True)
text_probs = image_features @ text_features.T
return text_probs.item() > 0.2
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How have you determined 0.2 as a threshold?

Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of distance is this?
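
For context on the question above: the snippet divides both feature vectors by their L2 norm before taking the product, so the score is a cosine similarity (higher means closer) rather than a distance. The pure-Python sketch below mirrors that arithmetic on plain lists. The function names are hypothetical, and the 0.2 cutoff is copied from the snippet; how it was chosen is not stated in this PR.

```python
import math

THRESHOLD = 0.2  # cutoff copied from the snippet under review


def l2_normalize(vec):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Dot product of the two unit-length vectors, in [-1, 1]."""
    a_hat, b_hat = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_hat, b_hat))


def is_match(image_vec, text_vec, threshold=THRESHOLD):
    """Same decision rule as the snippet: similarity above the cutoff."""
    return cosine_similarity(image_vec, text_vec) > threshold
```

Because of the normalization, the score depends only on the angle between the two vectors and not on their magnitudes, so scaling either embedding leaves the result unchanged.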
