Scene-Localization in Dense Images-via-Natural Language Queries

This project aims to build a system that can identify and localize specific sub-scenes within a single dense image based on a natural language query describing one of the events occurring in the scene.

# Setup and Installation

Follow these steps to set up the environment and run the project.

Step 1: Clone the Repository

First, clone this repository to your local machine:

git clone https://github.com/prakhar14-op/Scene-Localization-in-Dense-Images-via-Natural-Language-Queries
cd Scene-Localization-in-Dense-Images-via-Natural-Language-Queries

Step 2: Create Folders and Download Weights

Create a weights directory in the main project folder, then download both model weight files and place them in it:
- https://drive.google.com/file/d/160I2AMxnQ_ZT_MOO4dBA6jPH6BwKXFeV/view?usp=sharing
- https://drive.google.com/file/d/1WMfdrdaiIPmRKIAx0i9q_NCIf7Xed2mC/view?usp=sharing
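
Because the download links do not state the file names, a quick check that the weights folder actually contains both files can save a failed run later. The following is a minimal sketch, not part of the repository: the helper name `list_weight_files` is hypothetical, and it only counts files rather than checking specific names.

```python
from pathlib import Path


def list_weight_files(weights_dir="weights"):
    """Return the sorted names of regular files in the weights folder.

    Returns an empty list when the folder does not exist.
    """
    folder = Path(weights_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    files = list_weight_files()
    # Step 2 downloads two files, so fewer than two suggests a missed download.
    if len(files) < 2:
        print(f"Expected 2 weight files, found {len(files)}: {files}")
    else:
        print("Weight files found:", files)
```

Run it from the main project folder so that the relative path "weights" resolves correctly.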

Step 3: Set Up Python Environment

This project requires Python 3.9. It is highly recommended to use a virtual environment.

Create the environment (on Windows):

py -3.9 -m venv venv

Activate it (on Windows):

.\venv\Scripts\activate
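
To confirm that the activated environment really uses Python 3.9, a short check like the one below can be run inside it. This is an illustrative sketch; the function name `is_required_python` is hypothetical and not part of the project.

```python
import sys

REQUIRED = (3, 9)  # major.minor version stated in this README


def is_required_python(version_info=None):
    """Return True when the major.minor version matches REQUIRED."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == REQUIRED


if __name__ == "__main__":
    found = ".".join(str(n) for n in sys.version_info[:3])
    status = "OK" if is_required_python() else "expected 3.9.x"
    print(f"Python {found}: {status}")
```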

Step 4: Install Dependencies

With the virtual environment activated, install all required packages.

pip install -r requirements.txt
pip install -e GroundingDINO/
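
After both install commands finish, it can help to verify that the key packages are importable before running the main script. The sketch below assumes the import names `torch` and `groundingdino`; the latter is the import name used by the upstream GroundingDINO package, while the exact contents of requirements.txt are not shown here, so adjust the list as needed. The helper `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; edit this list to match requirements.txt.
    missing = missing_modules(["torch", "groundingdino"])
    print("Missing modules:", missing if missing else "none")
```

An empty result only shows that the packages can be located, not that they were built correctly, so a build error on Windows may still surface when the script first runs.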

Note: If you encounter build errors on Windows, you may need to install the Microsoft C++ Build Tools.

How to Run

Make sure your virtual environment is activated.
Run the main script from your terminal:

python aims.py

The script will prompt you to enter the path to an image and your text query.
The results (an annotated image and a cropped image) will be saved in a new results folder.
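
To illustrate how a detected region can become the cropped image, here is a minimal sketch of the box-to-pixel conversion. It assumes the detector returns boxes as normalized (cx, cy, w, h) values, which is the format GroundingDINO's inference utilities produce; the function name `cxcywh_to_pixel_box` is hypothetical, and the actual logic in aims.py may differ.

```python
def cxcywh_to_pixel_box(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (left, top, right, bottom)."""
    cx, cy, w, h = box
    left = (cx - w / 2) * image_width
    top = (cy - h / 2) * image_height
    right = (cx + w / 2) * image_width
    bottom = (cy + h / 2) * image_height

    def clamp(value, upper):
        # Round to whole pixels and keep the coordinate inside the image.
        return max(0, min(upper, round(value)))

    return (
        clamp(left, image_width),
        clamp(top, image_height),
        clamp(right, image_width),
        clamp(bottom, image_height),
    )


if __name__ == "__main__":
    # A box centered in an 800x600 image, covering half of each dimension.
    print(cxcywh_to_pixel_box((0.5, 0.5, 0.5, 0.5), 800, 600))  # (200, 150, 600, 450)
```

The returned tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it can be passed to that method directly.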

