English | 中文
A lightweight, single-file web-search enhancement module for GenericAgent-style agents. It provides Tavily search, OpenAI-compatible deep search, parallel search, web content extraction, and site map discovery.
This project was ported and simplified from an internal GenericAgent integration inspired by openclaw-websearch-plugin, with secrets and local-only configuration removed for public release.
- Tavily Search:
tavily_search- Basic / advanced search depth
- General / news topic support
- Sticky multi-key failover for
401/429style authorization or rate-limit failures
- Grok / OpenAI-compatible deep search:
grok_search- Works with an OpenAI-compatible
/v1/chat/completionsendpoint - Optional model override and platform focus
- Adds current time context for time-sensitive queries
- Works with an OpenAI-compatible
- Parallel search:
dual_search- Runs Tavily and Grok-compatible search concurrently
- Separates Tavily-only and Grok-only keyword arguments to avoid parameter pollution
- Web content extraction:
ws_fetch- Tavily Extract first
- Firecrawl v2
/scrapefallback when configured
- Site map discovery:
ws_map- Tavily Map first
- Optional local same-domain BFS fallback when
beautifulsoup4is installed
For local development:
pip install -e .Or copy ga_search_enhanced.py into your agent/tool directory.
Python 3.10+ is required.
Optional dependency for local site-map fallback:
pip install -e ".[map-fallback]"Configuration is intentionally lazy-loaded. The module works in plain Python and can also be used inside a GenericAgent-style environment.
Configuration precedence:
- Environment variables
- Local
web-search.envbesidega_search_enhanced.py - Optional lazy
keychainadapter, if a compatible module exists
Example web-search.env:
TAVILY_API_KEYS=tvly-key1,tvly-key2
GROK_API_KEY=your-grok-or-openai-compatible-key
GROK_API_URL=http://127.0.0.1:8000
GROK_MODEL=grok-3-mini
FIRECRAWL_API_KEY=fc-your-key
FIRECRAWL_API_URL=https://api.firecrawl.dev/v2Supported key aliases include:
- Tavily:
TAVILY_API_KEYS,TAVILY_API_KEY,tavilyApiKeys - Grok/OpenAI compatible:
GROK_API_KEY,GROK_API_URL,GROK_MODEL,grokApiKey,grokApiUrl,grokModel - Firecrawl:
FIRECRAWL_API_KEY,FIRECRAWL_API_URL,firecrawlApiKey,firecrawlApiUrl
web-search.env is ignored by git. Never commit real API keys.
from ga_search_enhanced import tavily_search, grok_search, dual_search, ws_fetch, ws_map
print(tavily_search("latest Python release", depth="advanced", max_results=5))
print(grok_search("Analyze the latest Python release impact", platform="GitHub and official docs"))
print(dual_search("Compare current LLM web search APIs", max_results=3, model="grok-3-mini"))
print(ws_fetch(["https://example.com"]))
print(ws_map(url="https://example.com", depth=2, limit=10))tavily_search(query, depth="basic", max_results=5, topic="general", days=None, include_answer=True, include_raw=False)
Searches with Tavily Search API.
Behavior:
- Empty
queryreturns{"error": "query is required"}. - Invalid
depthis normalized to"basic". - Invalid
topicis normalized to"general". max_resultsis clamped to1..20.- Multiple Tavily keys are sticky: the current key is reused until a
401or429style failure, then the pool fails over to the next key.
Typical success shape:
{
"result": "## Answer\n...\n\n## Sources\n...",
"details": {...}, # Tavily response
"raw": {...}, # same raw Tavily response for compatibility
}Typical error shape:
{"error": "No Tavily API keys configured"}Calls an OpenAI-compatible chat completion endpoint at:
{GROK_API_URL}/v1/chat/completions
Behavior:
- Empty
queryreturns{"error": "query is required"}. - Missing key returns
{"error": "Grok API key not configured"}. modeloverrides configuredGROK_MODEL.platformappends a focus instruction to the user message.- Time-sensitive queries receive a current date/time prefix.
Typical success shape:
{
"result": "...model answer...",
"model": "...",
"usage": {...},
"details": {...},
}Runs tavily_search and grok_search concurrently.
Keyword routing:
- Tavily kwargs:
depth,max_results,topic,days,include_answer,include_raw - Grok kwargs:
model,platform tavily={...}andgrok={...}override routed kwargs for each side.
Return shape:
{
"tavily": {...},
"grok": {...},
"combined": "## Tavily Results\n...\n\n## Grok Results\n...",
}Each side may independently contain either result or error.
Extracts web page content.
Fallback decision:
- Try Tavily Extract with configured Tavily keys.
- If Tavily extraction fails or returns no content, try Firecrawl v2
/scrapewhenFIRECRAWL_API_KEYis configured. - If all methods fail, return an error.
urls may be a string or a list of strings.
Typical success shape:
{
"result": "# https://example.com\n\n...",
"source": "tavily", # or "firecrawl"
"failed": [],
}Typical error shape:
{"error": "All extraction methods failed (Tavily Extract and FireCrawl)"}ws_map(url=None, depth=1, breadth=20, limit=50, instructions=None, start_url=None, max_pages=None, same_domain=True)
Discovers URLs for a site.
Behavior:
start_urlis accepted as an alias forurl.max_pagesis accepted as an alias affectinglimitandbreadth.depthis clamped to1..5.breadthandlimitare clamped to1..100.
Fallback decision:
- Try Tavily Map with configured Tavily keys.
- If Tavily Map fails and
beautifulsoup4is installed, run a simple local BFS crawler. - If
beautifulsoup4is not installed, return the Tavily error.
Typical Tavily success shape:
{
"result": "# Site Map: ...",
"url": "https://example.com",
"urls": ["https://example.com", "..."],
"details": {...},
"source": "tavily",
}Typical local fallback shape:
{
"result": "# Site Map: ...",
"url": "https://example.com",
"urls": ["https://example.com", "..."],
"source": "local_bfs",
"tavily_error": "...",
}- In plain Python, use environment variables or a local
web-search.env. - In GenericAgent-style runtimes, a compatible lazy
keychainmodule may provide secrets. - The package never imports the keychain at module import time, so ordinary
import ga_search_enhancedremains safe outside GA. - Secrets are not printed by the module.
The included tests are offline and use monkeypatched fake HTTP responses. They do not call real external APIs.
python3 -m pytest -qThe current test matrix covers:
- Empty argument validation
- Tavily sticky key failover
- Grok/OpenAI-compatible response parsing
- Firecrawl fallback for
ws_fetch - Local BFS fallback for
ws_map - Public exports and package metadata expectations
- No real API keys are included in this repository.
web-search.env,.env, and local secret files are ignored.- Do not paste tokens into issues, commits, logs, or test output.
- The module returns structured errors instead of raising for normal user/configuration failures.
This repository currently does not ship a standalone license file. Review the code and upstream constraints before reuse or redistribution.
GA WebSearch Plugin / GA Search Enhanced 是一个面向 GenericAgent 风格智能体的轻量级联网搜索增强模块。它以单文件形式提供搜索、深度检索、网页内容抓取和站点结构映射能力,便于复制、集成和二次开发。
该项目来自内部 GenericAgent 集成实践,并参考了 openclaw-websearch-plugin 的使用场景;公开版本已移除本地私有配置、密钥和个人环境依赖。
- Tavily 搜索:
tavily_search- 支持 basic / advanced 搜索深度
- 支持 general / news 主题
- 支持多 Tavily Key 粘滞复用,并在
401/429类授权或限流失败时切换
- Grok / OpenAI 兼容深度搜索:
grok_search- 支持 OpenAI 兼容的
/v1/chat/completions接口 - 支持模型覆盖和平台聚焦参数
- 对时间敏感问题自动注入当前时间上下文
- 支持 OpenAI 兼容的
- 并行搜索:
dual_search- 同时调用 Tavily 与 Grok/OpenAI 兼容接口
- 自动拆分 Tavily/Grok 专属参数,避免参数污染
- 网页内容抓取:
ws_fetch- 优先使用 Tavily Extract
- 配置 Firecrawl 后可降级到 Firecrawl v2
/scrape
- 站点地图发现:
ws_map- 优先使用 Tavily Map
- 安装
beautifulsoup4后可使用同域本地 BFS 降级方案
开发模式安装:
pip install -e .也可以直接把 ga_search_enhanced.py 复制到你的 Agent 工具目录中使用。
要求 Python 3.10 或更高版本。
本地站点地图降级能力的可选依赖:
pip install -e ".[map-fallback]"配置采用懒加载方式。模块既可以在普通 Python 环境中使用,也可以在 GenericAgent 风格环境中使用。
配置优先级:
- 环境变量
ga_search_enhanced.py同级目录下的web-search.env- 如果存在兼容模块,则懒加载可选
keychain适配器
示例 web-search.env:
TAVILY_API_KEYS=tvly-key1,tvly-key2
GROK_API_KEY=your-grok-or-openai-compatible-key
GROK_API_URL=http://127.0.0.1:8000
GROK_MODEL=grok-3-mini
FIRECRAWL_API_KEY=fc-your-key
FIRECRAWL_API_URL=https://api.firecrawl.dev/v2支持的配置别名包括:
- Tavily:
TAVILY_API_KEYS、TAVILY_API_KEY、tavilyApiKeys - Grok/OpenAI 兼容接口:
GROK_API_KEY、GROK_API_URL、GROK_MODEL、grokApiKey、grokApiUrl、grokModel - Firecrawl:
FIRECRAWL_API_KEY、FIRECRAWL_API_URL、firecrawlApiKey、firecrawlApiUrl
web-search.env 已加入 .gitignore,请不要提交真实 API Key。
from ga_search_enhanced import tavily_search, grok_search, dual_search, ws_fetch, ws_map
print(tavily_search("最新 Python 版本", depth="advanced", max_results=5))
print(grok_search("分析最新 Python 版本的影响", platform="GitHub 和官方文档"))
print(dual_search("对比当前主流 LLM 联网搜索 API", max_results=3, model="grok-3-mini"))
print(ws_fetch(["https://example.com"]))
print(ws_map(url="https://example.com", depth=2, limit=10))tavily_search(query, depth="basic", max_results=5, topic="general", days=None, include_answer=True, include_raw=False)
通过 Tavily Search API 搜索。
行为契约:
- 空
query返回{"error": "query is required"}。 - 非法
depth会被归一化为"basic"。 - 非法
topic会被归一化为"general"。 max_results会被限制在1..20。- 多 Tavily Key 采用粘滞复用:当前 key 会一直使用到出现
401或429类失败,再切换到下一个 key。
典型成功结构:
{
"result": "## Answer\n...\n\n## Sources\n...",
"details": {...},
"raw": {...},
}调用 OpenAI 兼容聊天补全接口:
{GROK_API_URL}/v1/chat/completions
行为契约:
- 空
query返回{"error": "query is required"}。 - 未配置 key 返回
{"error": "Grok API key not configured"}。 model会覆盖配置中的GROK_MODEL。platform会追加平台聚焦指令。- 时间敏感问题会自动添加当前日期时间。
典型成功结构:
{
"result": "...模型回答...",
"model": "...",
"usage": {...},
"details": {...},
}并行运行 tavily_search 与 grok_search。
参数路由:
- Tavily 参数:
depth、max_results、topic、days、include_answer、include_raw - Grok 参数:
model、platform tavily={...}与grok={...}可分别覆盖两侧参数
返回结构:
{
"tavily": {...},
"grok": {...},
"combined": "## Tavily Results\n...\n\n## Grok Results\n...",
}两侧结果可能分别包含 result 或 error。
抓取网页内容。
降级决策:
- 先使用 Tavily Extract。
- 如果 Tavily 抓取失败或没有内容,并且已配置
FIRECRAWL_API_KEY,则调用 Firecrawl v2/scrape。 - 如果全部失败,返回错误结构。
urls 可以是字符串或字符串列表。
典型成功结构:
{
"result": "# https://example.com\n\n...",
"source": "tavily", # 或 "firecrawl"
"failed": [],
}ws_map(url=None, depth=1, breadth=20, limit=50, instructions=None, start_url=None, max_pages=None, same_domain=True)
发现站点 URL。
行为契约:
start_url是url的兼容别名。max_pages是影响limit与breadth的兼容别名。depth限制在1..5。breadth和limit限制在1..100。
降级决策:
- 先使用 Tavily Map。
- 如果 Tavily Map 失败且已安装
beautifulsoup4,则运行简单同域 BFS 爬取。 - 如果未安装
beautifulsoup4,返回 Tavily 错误。
典型返回结构:
{
"result": "# Site Map: ...",
"url": "https://example.com",
"urls": ["https://example.com", "..."],
"source": "tavily", # 或 "local_bfs"
}- 普通 Python 环境建议使用环境变量或
web-search.env。 - GenericAgent 风格环境可以通过兼容
keychain模块提供密钥。 - 模块不会在 import 阶段导入 keychain,因此脱离 GA 也能安全导入。
- 模块不会打印密钥。
仓库内置离线测试,通过 monkeypatch 模拟 HTTP 响应,不会调用真实外部 API:
python3 -m pytest -q当前测试矩阵覆盖:
- 空参数校验
- Tavily 粘滞 key 失败切换
- Grok/OpenAI 兼容响应解析
ws_fetch的 Firecrawl 降级ws_map的本地 BFS 降级- 公开导出与包元数据预期
- 仓库不包含真实 API 密钥。
web-search.env、.env和本地密钥文件已加入忽略规则。- 不要在 issue、commit 或日志中粘贴 token。
- 常规用户输入或配置错误会以结构化
error返回,而不是直接抛出异常。
本仓库当前不随附独立许可文件。复用或再分发前,请自行审查代码及上游约束。