Created image search functionality using MobileNet Model #9
Open
ryanrahman27 wants to merge 4 commits into adanomad:main
sunapi386 reviewed Oct 3, 2024
Comment on lines +17 to +27
// Step 1: Extract images from the PDF using the helper function
const images = await extractImagesFromPDF(pdfUrl);

// Step 2: Generate embeddings for each image
const imageEmbeddings = await Promise.all(images.map(async (img) => getImageEmbedding(img)));

// Step 3: Generate text query embedding
const queryEmbedding = await getTextEmbedding(searchQuery);

// Step 4: Perform search by comparing the query embedding with image embeddings
const matchingResults = searchEmbeddings(queryEmbedding, imageEmbeddings);
Contributor
Nice, good clarity. Have you thought about doing this on the server side?
Author
Thank you! I originally planned to do this on the server side in Node.js, since I thought it would work just as well, if not better. In the end I kept it consistent with the other components of the web app and wrote it in TypeScript on the frontend.
The ImageSearch feature lets users search for images inside a PDF document with a text query. After the PDF is uploaded, images are extracted with PDF.js and converted into embeddings with MobileNet. The user's search query is converted into an embedding with the Universal Sentence Encoder. The query embedding is then compared against each image embedding using cosine similarity, and the images with the highest scores are returned as the closest matches.
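The comparison step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `Embedding`, `SearchHit`, `cosineSimilarity`, `rankBySimilarity`, and the `topK` parameter are hypothetical names, and the sketch assumes the query and image embeddings have already been brought to the same length (it throws otherwise).

```typescript
// Hypothetical types and names for illustration; not taken from the PR.
type Embedding = number[];

interface SearchHit {
  index: number; // position of the image in the input array
  score: number; // cosine similarity, in the range [-1, 1]
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so report no similarity instead of NaN.
  return denom === 0 ? 0 : dot / denom;
}

// Score every image embedding against the query and keep the topK best,
// sorted from most to least similar.
function rankBySimilarity(
  query: Embedding,
  images: Embedding[],
  topK = 5
): SearchHit[] {
  return images
    .map((embedding, index) => ({
      index,
      score: cosineSimilarity(query, embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Returning the index together with the score lets the caller map each hit back to the extracted image, and the explicit length check surfaces a mismatch between the two embedding models early instead of silently producing a meaningless score.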
One challenge was loading the MobileNet model from TensorFlow Hub: the server returned an invalid response. I worked around this by downloading the model and serving it locally within the project, which avoids the external URL entirely. The MobileNet model was still not fully functional during testing, so the primary focus of this PR is to demonstrate the implementation process; the logic needed for real-world image search is in place.