Adding internet and research paper tools for open world search tasks by mr-sarthakgupta · Pull Request #75 · skydiscover-ai/skydiscover

mr-sarthakgupta · 2026-06-01T14:05:46Z

Adding internet and research paper tools for open world search tasks

the tools allow the agent to search papers and understand existing methods before trying introducing alterations to the solution set.

I've used these tools on my own project for determining protein aggregation laws using a skydiscover derivative and found them really useful!

These tools are heavily inspired by huggingface's https://github.com/huggingface/ml-intern

gemini-code-assist

Code Review

This pull request expands the agentic capabilities of the codebase by introducing several new tools: web_search (via DuckDuckGo and Semantic Scholar), fetch_webpage (for downloading and parsing webpages), research_papers (for discovering and reading ML papers), and run_command (for executing sandboxed shell commands). The review feedback highlights critical security and robustness issues, including a safety check bypass in run_command when unsafe is enabled, backwards rate-limiting logic for unauthenticated Semantic Scholar requests, and opportunities to improve HTML parsing, timeout handling, and arXiv ID normalization.

gemini-code-assist · 2026-06-01T14:08:41Z

+    if not unsafe:
+        safety_err = _check_command_safety(command)
+        if safety_err:
+            return f"Error: {safety_err}", False


When unsafe is True, the safety check is completely bypassed, allowing arbitrary command execution (including blocked executables like rm, curl, etc.). To prevent this security bypass, _check_command_safety should always be called, passing the unsafe flag to enforce the appropriate safety constraints.

safety_err = _check_command_safety(command, unsafe=unsafe) if safety_err: return f"Error: {safety_err}", False

gemini-code-assist · 2026-06-01T14:08:42Z

+        # Rate limit only when authenticated (1 req/s for search, 10 req/s for others)
+        if S2_API_KEY:
+            min_interval = 1.0 if "search" in path else 0.1
+            elapsed = time.monotonic() - _s2_last_request
+            if elapsed < min_interval:
+                await asyncio.sleep(min_interval - elapsed)


The rate-limiting logic is backwards: it only throttles requests when S2_API_KEY is present (authenticated). Unauthenticated requests have much stricter rate limits on Semantic Scholar and will immediately trigger 429 Too Many Requests if not throttled. We should always enforce a rate limit, defaulting to a safer 1.0 second interval when unauthenticated.

# Rate limit always, defaulting to 1.0s when unauthenticated min_interval = 1.0 if not S2_API_KEY or "search" in path else 0.1 elapsed = time.monotonic() - _s2_last_request if elapsed < min_interval: await asyncio.sleep(min_interval - elapsed)

gemini-code-assist · 2026-06-01T14:08:42Z

+def _validate_arxiv_id(args: dict) -> str | None:
+    """Return arxiv_id or None if missing."""
+    return args.get("arxiv_id")


LLMs frequently extract arXiv IDs with prefixes like arxiv: or arXiv:. If these are passed directly to the APIs or URLs, the requests will fail. Normalizing the ID by stripping any case-insensitive arxiv: prefix in _validate_arxiv_id makes the tool much more robust.

Suggested change

def _validate_arxiv_id(args: dict) -> str | None:

"""Return arxiv_id or None if missing."""

return args.get("arxiv_id")

def _validate_arxiv_id(args: dict) -> str | None:

"""Return arxiv_id or None if missing."""

val = args.get("arxiv_id")

if val and isinstance(val, str):

return re.sub(r"^(arxiv:)+", "", val, flags=re.IGNORECASE).strip()

return val

gemini-code-assist · 2026-06-01T14:08:42Z

+            except httpx.RequestError:
+                continue


If _parse_paper_html raises an unexpected parsing exception (e.g., AttributeError or TypeError due to malformed HTML), it will propagate out of the loop and fail the entire tool call. Catching general Exception alongside httpx.RequestError ensures we robustly fall back to the next URL or the abstract fallback.

Suggested change

except httpx.RequestError:

continue

except (httpx.RequestError, Exception):

continue

gemini-code-assist · 2026-06-01T14:08:42Z

+def html_to_text(html_content: str) -> str:
+    """Convert HTML to clean plain text."""
+    extractor = _TextExtractor()
+    try:
+        extractor.feed(html_content)
+    except Exception:
+        # If the parser chokes, fall back to stripping all tags.
+        return re.sub(r"<[^>]+>", " ", html_content)
+    return extractor.get_text()


Using a custom HTMLParser subclass is highly sensitive to mismatched or unclosed tags in real-world HTML, which can cause the parser to skip large portions of the webpage. Since beautifulsoup4 is already a dependency of the project, we should use it to decompose unwanted tags and extract text reliably.

Suggested change

def html_to_text(html_content: str) -> str:

"""Convert HTML to clean plain text."""

extractor = _TextExtractor()

try:

extractor.feed(html_content)

except Exception:

# If the parser chokes, fall back to stripping all tags.

return re.sub(r"<[^>]+>", " ", html_content)

return extractor.get_text()

def html_to_text(html_content: str) -> str:

"""Convert HTML to clean plain text."""

try:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, "html.parser")

for element in soup(_SKIP_TAGS):

element.decompose()

return soup.get_text(separator="\n").strip()

except Exception:

return re.sub(r"<[^>]+>", " ", html_content)

gemini-code-assist · 2026-06-01T14:08:42Z

+            resp = requests.get(
+                hit.url,
+                headers={"User-Agent": USER_AGENT},
+                timeout=15,
+                allow_redirects=True,
+            )


A timeout of 15 seconds per page fetch can make the web_search tool extremely slow and unresponsive if multiple top hits are slow or hanging. Reducing the timeout to 5 seconds ensures the search returns quickly while still capturing responsive pages.

Suggested change

resp = requests.get(

hit.url,

headers={"User-Agent": USER_AGENT},

timeout=15,

allow_redirects=True,

)

resp = requests.get(

hit.url,

headers={"User-Agent": USER_AGENT},

timeout=5,

allow_redirects=True,

)

shubham3-ucb · 2026-06-06T17:31:16Z

@mr-sarthakgupta nice work, this is a really useful addition. I pulled it down and ran the tools, web_search and fetch_webpage both worked with no key needed, and the safe-mode design for run_command makes sense. A few things before the merge is approved:

Lint is failing. black wants to reformat 5 of the new files (agentic_generator.py, run_command_tool.py, web_search_tool.py, papers_tool.py, fetch_webpage_tool.py). A quick black skydiscover/ and a push should turn CI green.
In safe mode, run_command can still read files outside the codebase root. The docstring says absolute paths outside the root are blocked, but there is no actual path check, so cat/ls/head/find/grep happily take an absolute path. I tested it with codebase_root set to a temp dir:
```
cat /etc/passwd   -> exit 0, returns the file
ls /Users/<me>    -> lists my home dir
```
Could we resolve the path args and reject anything outside codebase_root? Or if that is out of scope for now, just drop the "absolute paths are blocked" line so the docstring matches the behavior.
env is on the safe allowlist, so the agent can dump the whole environment (including OPENAI_API_KEY) straight into the model context. I confirmed it prints the key. I would pull env and printenv out of the allowlist.
Small one: fetch_webpage correctly blocks non-http(s) schemes (file:// is rejected), but there is no host check, so localhost and 169.254.169.254 (cloud metadata) still get hit. I confirmed both are attempted rather than refused. A guard for loopback, link-local, and private ranges would be good if this ever runs on shared infra.

None of these are hard to fix and the feature itself is great. This should be good to merge once the lint and the path/env bits are handled. Thanks!

mr-sarthakgupta · 2026-06-06T17:46:52Z

Hi Shubham, thanks for the feedback!

Yes, these features do have some holes right now, I mostly duct-taped them for my own use. Would be fixing the issues in the coming days within this PR. Also, I might open another PR for symbolic regression soon!

PS: Some time ago I had sent an email to you with some questions about sky-discover, while a few of them have been answered with my own experiments but would still really appreciate if you could take a look if you find any time! (I sent it using the ID mrsarthakgupta@gmail.com)

mr-sarthakgupta · 2026-06-06T17:51:22Z

Also, another small problem I faced (not related to this PR or topic), I wasn't able to find a way to run EvoX and Ada-Evolve with each other, they seem like alternatives by looking at the code structure but are at separate layers in terms of search logic and I believe I also saw some results on the website or someplace with both being used

add all tools

1eaf211

gemini-code-assist Bot reviewed Jun 1, 2026

View reviewed changes

additional edits

6e851e0

lynnliu030 requested review from lynnliu030 and shubham3-ucb June 5, 2026 05:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding internet and research paper tools for open world search tasks#75

Adding internet and research paper tools for open world search tasks#75
mr-sarthakgupta wants to merge 2 commits into
skydiscover-ai:mainfrom
mr-sarthakgupta:add-more-tools

mr-sarthakgupta commented Jun 1, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Uh oh!

shubham3-ucb commented Jun 6, 2026 •

edited

Loading

Uh oh!

mr-sarthakgupta commented Jun 6, 2026 •

edited

Loading

Uh oh!

mr-sarthakgupta commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mr-sarthakgupta commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

shubham3-ucb commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mr-sarthakgupta commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mr-sarthakgupta commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mr-sarthakgupta commented Jun 1, 2026 •

edited

Loading

shubham3-ucb commented Jun 6, 2026 •

edited

Loading

mr-sarthakgupta commented Jun 6, 2026 •

edited

Loading