
Commit 1e17bc3

Moved the multimodal_retrieval under the community directory (#259)
* multimodal retrieval by fciannella
* Fixed the images path in the README.md file
* Changed the location of the repository
1 parent e8f0254 commit 1e17bc3


53 files changed: 2693 additions, 0 deletions
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
mongodbdata/
Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
1+
# Introduction
2+
3+
This is a multimodal retrieval tool that uses long-context models. You can ingest HTML documents and ask questions about them. The tool can find answers inside the images and tables of a document.
4+
5+
Here is an example:
6+
7+
![Finding an answer inside a table](assets/table_example.png)
8+
![Finding an answer inside a chart](assets/image_example.png)
9+
10+
The tool uses either an OpenAI vision model or an NVIDIA vision model (Llama 3.2 90B).
11+
12+
13+
## Setup details
14+
15+
There are two setups that need to be spun up:
16+
17+
- LangGraph, which runs the agent
18+
- MongoDB and LangServe, which run the database and some supporting services, together with a Gradio UI for testing them
19+
20+
The Gradio UI lets you ingest HTML documents; you can then query the agent served by LangGraph.
21+
22+
23+
24+
# QuickStart
25+
26+
In this setup, the LangGraph agent runs in dev mode on the host machine, and the rest of the stack runs in Docker containers configured through Docker Compose.
27+
You can also launch LangGraph in containers with `langgraph up`; in that case you don't need the extra `.env.lg` file (see below).
28+
29+
## LangGraph setup on the host machine
30+
31+
Run these commands from the root of the repository (the directory containing the `langgraph.json` and `docker-compose.yml` files).
32+
33+
Create and activate a virtual environment, then install the dependencies:
34+
35+
```shell
# Create the virtual environment (the name must match the one activated below)
python3 -m venv lg-venv
source ./lg-venv/bin/activate
pip install -r requirements.txt
```
40+
41+
42+
## Create the env files
43+
44+
You need to create two env files: `.env` for Docker Compose and `.env.lg` for the LangGraph agent.
45+
46+
### .env
47+
48+
Create a .env file in the root directory of this repository (the one with the `langgraph.json` and `docker-compose.yml` files)
49+
50+
```shell
MONGO_INITDB_ROOT_USERNAME=admin
MONGO_INITDB_ROOT_PASSWORD=secret
MONGO_HOST=mongodb
MONGO_PORT=27017
OPENAI_API_KEY=
LANGCHAIN_API_KEY=
LANGSMITH_API_KEY=
LANGGRAPH_CLOUD_LICENSE_KEY=
NVIDIA_API_KEY=
IMAGES_HOST=localhost
AGENTS_HOST=
AGENTS_PORT=2024
```
64+
65+
Normally LANGCHAIN_API_KEY and LANGSMITH_API_KEY have the same value.
66+
67+
AGENTS_HOST is the IP address of the host where you are running docker. It could be the IP address of your PC for instance.
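Several values in the listing above are shipped empty and have to be filled in by hand. As a minimal sketch (not part of the repository; the helper names `parse_env` and `empty_keys` are hypothetical), a small stdlib-only check can list which keys are still blank. Which of them your setup actually requires depends on the model provider and launch mode you choose.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def empty_keys(values: dict) -> list:
    """Return the keys whose value is an empty string, in file order."""
    return [key for key, value in values.items() if value == ""]


def check_env_file(path: str = ".env") -> list:
    """Read an env file from disk and report its empty keys."""
    return empty_keys(parse_env(Path(path).read_text()))
```

Calling `check_env_file(".env")` from the repository root returns the keys that still need a value; an empty list means every key in the file has been set.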
68+
69+
### .env.lg
70+
71+
This file is needed because LangGraph runs in dev mode on the host machine: for the LangGraph agent to reach MongoDB, `MONGO_HOST` must be set to `localhost`.
72+
73+
It should be located in the root of the repository (the one with the `langgraph.json` and `docker-compose.yml` files)
74+
75+
```shell
MONGO_INITDB_ROOT_USERNAME=admin
MONGO_INITDB_ROOT_PASSWORD=secret
MONGO_HOST=localhost
MONGO_PORT=27017
OPENAI_API_KEY=
LANGCHAIN_API_KEY=
LANGSMITH_API_KEY=
LANGGRAPH_CLOUD_LICENSE_KEY=
NVIDIA_API_KEY=
IMAGES_HOST=localhost
AGENTS_HOST=localhost
AGENTS_PORT=2024
```
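The two env files are nearly identical, so it is easy to lose track of what actually differs. The sketch below compares the non-secret settings copied from the two listings; `env_diff` is a hypothetical helper, not something the repository provides.

```python
def env_diff(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Non-secret settings from `.env` (read by Docker Compose).
DOCKER_ENV = {
    "MONGO_HOST": "mongodb",
    "MONGO_PORT": "27017",
    "IMAGES_HOST": "localhost",
    "AGENTS_HOST": "",  # left empty in the listing; set it to your host's IP
    "AGENTS_PORT": "2024",
}

# Non-secret settings from `.env.lg` (read by the LangGraph agent on the host).
HOST_ENV = {
    "MONGO_HOST": "localhost",
    "MONGO_PORT": "27017",
    "IMAGES_HOST": "localhost",
    "AGENTS_HOST": "localhost",
    "AGENTS_PORT": "2024",
}

for key, (docker_value, host_value) in env_diff(DOCKER_ENV, HOST_ENV).items():
    print(f"{key}: .env={docker_value!r} .env.lg={host_value!r}")
```

Only `MONGO_HOST` and `AGENTS_HOST` differ between the two listings; the API keys are expected to be the same in both files.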
89+
90+
# Launch the MongoDB and Gradio services
91+
92+
Update the `.env` file with your API keys.
93+
94+
Launch the Docker Compose services:
95+
96+
```shell
docker compose up --build
```
99+
Then connect to `http://localhost:7860` to ingest documents.
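The containers can take a while to build and start, so the UI may not respond immediately. As a rough sketch, the hypothetical helper below polls a TCP port until it accepts a connection. An open port only shows that something is listening on 7860, not that the UI has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll host:port until a TCP connection succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 7860)` returns `True` once the port is reachable and `False` if it is still closed after a minute.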
100+
101+
# Launch LangGraph
102+
103+
```bash
langgraph dev --host 0.0.0.0
```
106+
107+
## Test LangGraph
108+
109+
```bash
curl --request POST \
  --url http://localhost:2024/runs/stream \
  --header 'Content-Type: application/json' \
  --data '{
    "assistant_id": "agent",
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "What is the harness?"
        }
      ]
    },
    "metadata": {},
    "config": {
      "configurable": {
        "collection_name": "test",
        "document_id": "8eb8f7396e4fe72595e6577c35a7a587"
      }
    },
    "multitask_strategy": "reject",
    "stream_mode": [
      "values"
    ]
  }'
```
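The same request can be issued from Python, which is convenient when scripting several questions against one document. The sketch below mirrors the curl call field by field using only the standard library; `build_stream_request` and `stream_run` are hypothetical names, and the `document_id` must be one that was actually ingested into your own collection.

```python
import json
import urllib.request


def build_stream_request(question, document_id, collection_name="test",
                         base_url="http://localhost:2024"):
    """Build (but do not send) the POST request used in the curl example."""
    payload = {
        "assistant_id": "agent",
        "input": {"messages": [{"role": "user", "content": question}]},
        "metadata": {},
        "config": {
            "configurable": {
                "collection_name": collection_name,
                "document_id": document_id,
            }
        },
        "multitask_strategy": "reject",
        "stream_mode": ["values"],
    }
    return urllib.request.Request(
        url=f"{base_url}/runs/stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_run(request):
    """Send the request and yield each line of the response as it arrives."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            yield raw.decode("utf-8").rstrip("\n")
```

With `langgraph dev` running, `for line in stream_run(build_stream_request("What is the harness?", "8eb8f7396e4fe72595e6577c35a7a587")): print(line)` prints whatever the server streams back.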
137+
138+
139+

community/multimodal_retrieval/__init__.py

Whitespace-only changes.
