This example builds an embedding index from Google Drive files. It continuously updates the index as files are added, updated, or deleted in the source folders, so the index stays in sync with them.
Before running the example, you need to:
- Install Postgres if you don't have one.
- Prepare for Google Drive:
  - Set up a service account in Google Cloud, and download the credential file.
  - Share folders containing files you want to import with the service account's email address.
  See Setup for Google Drive for more details.
- Create a .env file with your credential file and folder IDs. Start by copying .env.example, then edit the copy to fill in your credential file path and folder IDs:
  cp .env.example .env
  $EDITOR .env
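Before moving on, it can help to confirm that the .env file actually contains the values the example needs. The following is a minimal sketch of such a check, not part of the example itself. The helper names `parse_env` and `missing_keys` are hypothetical, and the key names in `REQUIRED` are assumptions: take the real ones from .env.example.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


# Assumed key names for illustration only; check .env.example for the real ones.
REQUIRED = ["GOOGLE_SERVICE_ACCOUNT_CREDENTIAL", "GOOGLE_DRIVE_ROOT_FOLDER_IDS"]
```

To use it, read the file and pass its text in, for example `missing_keys(parse_env(open(".env").read()), REQUIRED)`. An empty list means every listed key has a value.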
Install dependencies:
pip install -e .
Setup:
python main.py cocoindex setup
Run:
python main.py
While running, it keeps watching the source folders for changes and updates the index automatically. At the same time, it accepts queries from the terminal and searches the up-to-date index.
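To make the idea of keeping an index in sync concrete, here is a toy sketch of incremental updating: compare two snapshots of a source folder, then touch only the entries that changed. It illustrates the general technique under simplifying assumptions and is not how CocoIndex implements it. The names `diff_snapshots` and `sync_index` are hypothetical, snapshots are plain dicts mapping a file ID to its content, and `embed` stands in for a real embedding function.

```python
from typing import Callable


def diff_snapshots(old: dict[str, str], new: dict[str, str]):
    """Classify file IDs as added, updated, or deleted between two snapshots."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    updated = sorted(k for k in new.keys() & old.keys() if new[k] != old[k])
    return added, updated, deleted


def sync_index(index: dict, old: dict[str, str], new: dict[str, str],
               embed: Callable[[str], object]):
    """Apply only the changes between snapshots to the index, in place."""
    added, updated, deleted = diff_snapshots(old, new)
    for file_id in deleted:
        index.pop(file_id, None)  # remove entries for files that no longer exist
    for file_id in added + updated:
        index[file_id] = embed(new[file_id])  # (re)compute only changed files
    return added, updated, deleted
```

Unchanged files are never re-embedded, which is what makes continuous updating cheap compared with rebuilding the whole index on every change.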
CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3-minute video tutorial about CocoInsight: Watch on YouTube.
Run CocoInsight to understand your RAG data pipeline:
python main.py cocoindex server -c https://cocoindex.io
You can also add a -L flag to make the server keep updating the index to reflect source changes at the same time:
python main.py cocoindex server -c https://cocoindex.io -L
Then open the CocoInsight UI at https://cocoindex.io/cocoinsight.