Created image search functionality using MobileNet Model #9

Open
ryanrahman27 wants to merge 4 commits into adanomad:main from ryanrahman27:ryan-rahman-imagesearch

@ryanrahman27

The ImageSearch feature lets users search for images inside a PDF document using text queries. After the PDF is uploaded, images are extracted with PDF.js and converted into embeddings via MobileNet. The user's search query is converted into an embedding with the Universal Sentence Encoder. To find the most relevant images, I compare the query embedding against each image embedding using cosine similarity and return the closest matches.
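
For illustration, the comparison step could look roughly like the sketch below. It is not the code from this PR: `cosineSimilarity`, `rankBySimilarity`, `RankedMatch`, and the `topK` parameter are hypothetical names, and it assumes the query and image embeddings are plain number arrays of the same length.

```typescript
// Hypothetical helpers for illustration only; names are not taken from this PR.

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface RankedMatch {
  index: number; // position of the image in the input array
  score: number; // cosine similarity with the query embedding
}

// Scores every candidate against the query and returns the topK best matches,
// highest score first.
function rankBySimilarity(
  query: number[],
  candidates: number[][],
  topK: number = 3
): RankedMatch[] {
  return candidates
    .map((embedding, index) => ({ index, score: cosineSimilarity(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Returning the index alongside the score makes it straightforward to map each match back to the extracted image it came from.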

One challenge was loading the MobileNet model from TensorFlow Hub: the server returned an invalid response. I worked around this by downloading the model and serving it locally from within the project, which avoids the external URL entirely. I still ran into some problems getting the MobileNet model fully functional during testing, but the main goal was to demonstrate the implementation approach, and the logic needed to handle real-world image search is in place.

Comment on lines +17 to +27
// Step 1: Extract images from the PDF using the helper function
const images = await extractImagesFromPDF(pdfUrl);

// Step 2: Generate embeddings for each image
const imageEmbeddings = await Promise.all(images.map(async (img) => getImageEmbedding(img)));

// Step 3: Generate text query embedding
const queryEmbedding = await getTextEmbedding(searchQuery);

// Step 4: Perform search by comparing the query embedding with image embeddings
const matchingResults = searchEmbeddings(queryEmbedding, imageEmbeddings);
Contributor

Nice, good clarity. Have you thought about doing this on the server side?

Author

Thank you! I did originally plan to do this on the server side in Node.js, since I thought it would work just as well, if not better. In the end I kept it consistent with the other components of the web app and wrote it in TypeScript on the frontend.
