Database and search submission#22

Open
0289 wants to merge 3 commits into adanomad:main from 0289:database-and-search

@0289 0289 commented Jul 4, 2025

Instructions

Installation remains largely unchanged. This application requires both a Google Cloud API key and an OpenAI API key. Instructions for obtaining each can be found below. Afterwards, put both keys into your .env.local file under the variable names "GOOGLE_API_KEY=" and "OPEN_AI_KEY=".

Google API Key
https://console.cloud.google.com/apis/credentials
Navigate to this button
image

OpenAI Key
https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key
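
Once both keys are in place, a quick check can confirm that they are visible to the running process. The following is a Python sketch for illustration only and is not code from this PR. The helper name `missing_keys` is hypothetical, and it assumes the values from the local env file have already been loaded into the process environment by whatever tooling the project uses.

```python
import os

# Variable names as given in the instructions above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPEN_AI_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or empty in `env`.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```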

Implementation

NOTE: This implementation is incomplete

Implemented Features:

  • Google OCR when scanning PDFs, for text recognition in images
  • Embedding of text
  • Database to store embeddings

Not Implemented Features:

  • Uploading PDFs to the database
  • Searching across multiple PDFs based on embeddings

First, Google OCR was implemented by converting the PDF pages into images and sending them to the OCR service in an API call. The server handles this: it receives the images from the client and returns the recognized text along with bounding boxes describing where each piece of text is located. The text is then overlaid onto the original PDF at the bounding box positions. The most challenging part of this section was understanding how files are passed between the client, the server, and the Google API. Since each of them had access to and accepted different data formats, it was hard to match everything up.
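
To illustrate the overlay step, here is a minimal Python sketch of mapping an OCR bounding box from image pixels to PDF page coordinates. It is not the code in this PR. It assumes the OCR boxes are in pixel coordinates with the origin at the top-left of the rendered image, and that the target is PDF user space in points with the origin at the bottom-left. The names `Box` and `image_box_to_pdf` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle with (x0, y0) and (x1, y1) as opposite corners."""
    x0: float
    y0: float
    x1: float
    y1: float


def image_box_to_pdf(box: Box, image_size: tuple[int, int],
                     page_size: tuple[float, float]) -> Box:
    """Map a pixel-space box (origin top-left) to PDF points (origin bottom-left)."""
    img_w, img_h = image_size
    page_w, page_h = page_size
    sx = page_w / img_w  # horizontal scale: points per pixel
    sy = page_h / img_h  # vertical scale: points per pixel
    # Flip the vertical axis: the top edge of the box in the image
    # becomes the larger y value in PDF space.
    return Box(
        x0=box.x0 * sx,
        y0=page_h - box.y1 * sy,
        x1=box.x1 * sx,
        y1=page_h - box.y0 * sy,
    )


# A US Letter page (612 x 792 points) rendered at twice its size in pixels.
pdf_box = image_box_to_pdf(Box(100, 200, 300, 250), (1224, 1584), (612.0, 792.0))
print(pdf_box)
```

If the overlay is drawn in a viewer that also uses a top-left origin, the vertical flip is unnecessary and only the scaling applies.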

Second, the text embedding was implemented in a similar way, through a call to the OpenAI API. Embeddings are stored on a per-page basis in an SQLite database, and all database access goes through API calls. Unfortunately, the upload and search functionality is not implemented. The intended design is as follows: on upload, an embedding is created for the text of each page and stored in the database. Then, whenever a search is performed, the query is embedded in the same way, and the pages whose embeddings have the smallest Euclidean distance to the query embedding are ranked first in the results.
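
The planned upload and search flow could look roughly like the Python sketch below. It is an illustration of the idea, not the implementation in this PR. The table name `page_embeddings`, its columns, and all function names are hypothetical. Embeddings are assumed to already be plain lists of floats (the call to the embedding API is left out), and they are stored as float32 BLOBs, which is one possible encoding. The search is a brute-force scan over every stored page.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats as a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack(): decode a float32 BLOB back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def create_db(path=":memory:"):
    """Open the database and create the per-page embedding table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_embeddings ("
        " pdf_name TEXT NOT NULL,"
        " page INTEGER NOT NULL,"
        " embedding BLOB NOT NULL,"
        " PRIMARY KEY (pdf_name, page))"
    )
    return conn


def store_page(conn, pdf_name, page, embedding):
    """Upload step: store (or overwrite) the embedding for one page."""
    conn.execute(
        "INSERT OR REPLACE INTO page_embeddings VALUES (?, ?, ?)",
        (pdf_name, page, pack(embedding)),
    )


def search(conn, query_embedding, limit=5):
    """Search step: return (distance, pdf_name, page) tuples, closest first."""
    rows = conn.execute("SELECT pdf_name, page, embedding FROM page_embeddings")
    scored = [
        (math.dist(query_embedding, unpack(blob)), pdf_name, page)
        for pdf_name, page, blob in rows
    ]
    scored.sort()
    return scored[:limit]


conn = create_db()
store_page(conn, "a.pdf", 1, [0.0, 0.0])
store_page(conn, "a.pdf", 2, [1.0, 1.0])
store_page(conn, "b.pdf", 1, [3.0, 4.0])
for distance, pdf_name, page in search(conn, [0.0, 0.0], limit=2):
    print(pdf_name, page, round(distance, 3))
```

A linear scan like this is fine for a handful of PDFs. A larger collection would likely need an index or a vector search extension, which is outside the scope of this sketch.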

Overall, I tried to keep all processes consistent with the existing repository. This includes code formatting, placing additions where they make sense, and using APIs the same way the existing code does in similar cases.

@0289 0289 changed the title from "Database and search" to "Database and search submission" Jul 4, 2025