Skip to content

fix: sources_count + MCP v1.26 compat#157

Open
sonar5JR wants to merge 2 commits intoteng-lin:mainfrom
sonar5JR:main
Open

fix: sources_count + MCP v1.26 compat#157
sonar5JR wants to merge 2 commits intoteng-lin:mainfrom
sonar5JR:main

Conversation

@sonar5JR
Copy link

@sonar5JR sonar5JR commented Mar 6, 2026

Summary

Two bug fixes for the NotebookLM MCP server:

1. Fix sources_count always returning 0 in Notebook.from_api_response

The list_notebooks response includes the full sources array at data[1], but Notebook.from_api_response() never extracted it. This caused sources_count to always default to 0, even when notebooks had sources.

Fix: Added extraction of len(data[1]) to populate sources_count correctly.

2. Fix MCP server startup on mcp v1.26.0+

The FastMCP constructor in mcp v1.26.0 renamed the description parameter to instructions, causing a TypeError on server startup.

Fix: Updated mcp_server.py to use instructions instead of description.

Testing

  • Verified list_notebooks now correctly returns source counts (e.g., 5, 19, 70) matching the browser UI
  • Confirmed MCP server starts successfully and registers all 39 tools

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses two critical issues within the NotebookLM Model Context Protocol (MCP) server: ensuring compatibility with recent mcp library updates and fixing an incorrect source count display for notebooks. It also includes a new script to facilitate connection testing. These changes improve the robustness and accuracy of the MCP server's integration with NotebookLM.

Highlights

  • MCP Server Compatibility: Updated the FastMCP constructor in mcp_server.py to use the instructions parameter instead of description, ensuring compatibility with mcp library version 1.26.0 and above.
  • Notebook Source Count Fix: Corrected an issue in Notebook.from_api_response where sources_count was always returning 0. The method now accurately extracts the number of sources from the API response.
  • New Connection Test Script: Introduced a new utility script, test_connection.py, to quickly verify the NotebookLM client's connection and list available notebooks.
Changelog
  • mcp_server.py
    • Added the initial implementation of the NotebookLM Model Context Protocol server, including various tools for notebook, source, chat, artifact, and research management.
    • Updated the FastMCP constructor to use the instructions parameter for compatibility with mcp v1.26.0+.
  • src/notebooklm/types.py
    • Updated the Notebook.from_api_response method to correctly parse and set the sources_count based on the API response data.
  • test_connection.py
    • Added a new script to test the NotebookLM client connection and list notebooks.
Activity
  • No specific review comments or activity have been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@sonar5JR
Copy link
Author

sonar5JR commented Mar 6, 2026

Glad I could help with this, it is an awesome, awesome thing and I appreciate all of you.

@sonar5JR sonar5JR closed this Mar 6, 2026
@sonar5JR sonar5JR reopened this Mar 6, 2026
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request primarily addresses two bug fixes: correctly calculating sources_count for notebooks and ensuring compatibility with MCP v1.26.0 by updating a parameter name. It also introduces a new mcp_server.py file to expose the NotebookLM API as MCP tools, granting full programmatic access to notebooks, sources, chat, and artifacts. Critically, this new MCP server introduces several path traversal vulnerabilities in its artifact download tools. These tools accept an output_path from the LLM and write files to the local filesystem without sanitization, which could be exploited via prompt injection to overwrite sensitive user files. Beyond this, the mcp_server.py file could benefit from improvements in client lifecycle management, removal of direct sys.path manipulation, refactoring to reduce code duplication, and ensuring proper serialization of all data types to enhance its overall maintainability and robustness.

Comment on lines +804 to +819
async def download_audio(
notebook_id: str, output_path: str, artifact_id: str | None = None
) -> str:
"""Download an Audio Overview to a local file.

Args:
notebook_id: The notebook ID.
output_path: Local file path to save the audio (e.g., "podcast.mp3").
artifact_id: Specific artifact ID. If None, downloads the first completed audio.

Returns the output file path.
"""
client = await get_client()
path = await client.artifacts.download_audio(notebook_id, output_path, artifact_id=artifact_id)
return json.dumps({"downloaded": str(path)})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The download_audio tool accepts an output_path argument without any sanitization. This allows an attacker to perform a path traversal attack via prompt injection, potentially overwriting sensitive files on the user's machine (e.g., ~/.bashrc, ~/.ssh/authorized_keys). Please sanitize the output_path to ensure it remains within a safe directory and does not contain directory traversal sequences like ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 127c76d — added

_sanitize_output_path()
to all 8 download tools. It rejects .. traversal, strips absolute paths/drive letters, confines writes to a configurable ALLOWED_DOWNLOAD_DIR (default ./downloads/), and blocks sensitive directories (.ssh, .gnupg, System32, etc.). Thanks for catching this — it was a critical oversight.

Comment on lines +822 to +837
async def download_video(
notebook_id: str, output_path: str, artifact_id: str | None = None
) -> str:
"""Download a Video Overview to a local file.

Args:
notebook_id: The notebook ID.
output_path: Local file path to save the video (e.g., "overview.mp4").
artifact_id: Specific artifact ID. If None, downloads the first completed video.

Returns the output file path.
"""
client = await get_client()
path = await client.artifacts.download_video(notebook_id, output_path, artifact_id=artifact_id)
return json.dumps({"downloaded": str(path)})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The download_video tool accepts an output_path argument without any sanitization. This allows an attacker to perform a path traversal attack via prompt injection, potentially overwriting sensitive files on the user's machine. Please sanitize the output_path to ensure it remains within a safe directory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed together with all download tools in 127c76d.

Comment on lines +840 to +862
async def download_quiz(
notebook_id: str,
output_path: str,
output_format: str = "json",
artifact_id: str | None = None,
) -> str:
"""Download a quiz to a local file.

Args:
notebook_id: The notebook ID.
output_path: Local file path to save the quiz.
output_format: Format — "json", "markdown", or "html".
artifact_id: Specific artifact ID. If None, downloads the first quiz.

Returns the output file path.
"""
client = await get_client()
path = await client.artifacts.download_quiz(
notebook_id, output_path, output_format=output_format, artifact_id=artifact_id
)
return json.dumps({"downloaded": str(path)})


Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The download_quiz tool accepts an output_path argument without any sanitization. This allows an attacker to perform a path traversal attack via prompt injection, potentially overwriting sensitive files on the user's machine. Please sanitize the output_path to ensure it remains within a safe directory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed together with all download tools in 127c76d.

Comment on lines +864 to +886
async def download_flashcards(
notebook_id: str,
output_path: str,
output_format: str = "json",
artifact_id: str | None = None,
) -> str:
"""Download flashcards to a local file.

Args:
notebook_id: The notebook ID.
output_path: Local file path to save the flashcards.
output_format: Format — "json", "markdown", or "html".
artifact_id: Specific artifact ID. If None, downloads the first flashcards.

Returns the output file path.
"""
client = await get_client()
path = await client.artifacts.download_flashcards(
notebook_id, output_path, output_format=output_format, artifact_id=artifact_id
)
return json.dumps({"downloaded": str(path)})


Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The download_flashcards tool accepts an output_path argument without any sanitization. This allows an attacker to perform a path traversal attack via prompt injection, potentially overwriting sensitive files on the user's machine. Please sanitize the output_path to ensure it remains within a safe directory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed together with all download tools in 127c76d.

Comment on lines +888 to +906
async def download_slide_deck(
notebook_id: str, output_path: str, artifact_id: str | None = None
) -> str:
"""Download a slide deck to a local file (PDF or PPTX).

Args:
notebook_id: The notebook ID.
output_path: Local file path to save the slides (e.g., "slides.pdf" or "slides.pptx").
artifact_id: Specific artifact ID. If None, downloads the first slide deck.

Returns the output file path.
"""
client = await get_client()
path = await client.artifacts.download_slide_deck(
notebook_id, output_path, artifact_id=artifact_id
)
return json.dumps({"downloaded": str(path)})


Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The download_slide_deck tool accepts an output_path argument without any sanitization. This allows an attacker to perform a path traversal attack via prompt injection, potentially overwriting sensitive files on the user's machine. Please sanitize the output_path to ensure it remains within a safe directory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed together with all download tools in 127c76d.

Comment on lines +16 to +17
# Add src to path so we can import notebooklm
sys.path.insert(0, str(__import__("pathlib").Path(__file__).parent / "src"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Modifying sys.path directly is fragile and can lead to issues depending on the execution context. It's better to rely on standard Python packaging.

Comment on lines +27 to +33
async def get_client() -> NotebookLMClient:
"""Get or create the NotebookLM client singleton."""
global _client
if _client is None or not _client.is_connected:
_client = await NotebookLMClient.from_storage()
await _client.__aenter__()
return _client

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The get_client function calls _client.__aenter__() to open the client connection but there is no corresponding call to _client.__aexit__() to close it. This can lead to resource leaks, such as open network connections not being properly terminated when the server shuts down. A better approach is to manage the client's lifecycle along with the server's lifecycle.

Many modern async frameworks (like FastAPI, which FastMCP may be based on) support a lifespan context manager. Consider using one to manage the client's connection.

from contextlib import asynccontextmanager
from mcp.server.fastmcp import FastMCP

@asynccontextmanager
async def lifespan(app: FastMCP):
    """Manage the NotebookLM client lifecycle."""
    global _client
    _client = await NotebookLMClient.from_storage()
    await _client.__aenter__()
    yield
    if _client:
        await _client.__aexit__(None, None, None)

# When creating the mcp server
mcp = FastMCP(
    "NotebookLM",
    instructions="...",
    lifespan=lifespan,
)

# And simplify get_client
async def get_client() -> NotebookLMClient:
    """Get the initialized NotebookLM client singleton."""
    global _client
    if _client is None:
        # This should not happen if lifespan is used correctly
        raise RuntimeError("NotebookLMClient not initialized.")
    return _client

If FastMCP doesn't support a lifespan argument, you should find another way to ensure _client.__aexit__ is called on server shutdown.

Comment on lines +36 to +50
def _serialize(obj: Any) -> Any:
"""Serialize dataclass/object to JSON-safe dict."""
if obj is None:
return None
if isinstance(obj, (str, int, float, bool)):
return obj
if isinstance(obj, list):
return [_serialize(item) for item in obj]
if isinstance(obj, dict):
return {k: _serialize(v) for k, v in obj.items()}
if hasattr(obj, "__dataclass_fields__"):
return {k: _serialize(getattr(obj, k)) for k in obj.__dataclass_fields__}
if hasattr(obj, "__dict__"):
return {k: _serialize(v) for k, v in obj.__dict__.items() if not k.startswith("_")}
return str(obj)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _serialize function does not explicitly handle datetime objects. It will fall back to str(obj), which produces a string representation that is not a standard format like ISO 8601. This can make parsing the JSON on the client side more difficult. It's better to add specific handling for datetime objects to serialize them to ISO 8601 format.

Suggested change
def _serialize(obj: Any) -> Any:
"""Serialize dataclass/object to JSON-safe dict."""
if obj is None:
return None
if isinstance(obj, (str, int, float, bool)):
return obj
if isinstance(obj, list):
return [_serialize(item) for item in obj]
if isinstance(obj, dict):
return {k: _serialize(v) for k, v in obj.items()}
if hasattr(obj, "__dataclass_fields__"):
return {k: _serialize(getattr(obj, k)) for k in obj.__dataclass_fields__}
if hasattr(obj, "__dict__"):
return {k: _serialize(v) for k, v in obj.__dict__.items() if not k.startswith("_")}
return str(obj)
def _serialize(obj: Any) -> Any:
"""Serialize dataclass/object to JSON-safe dict."""
if obj is None:
return None
if isinstance(obj, (str, int, float, bool)):
return obj
if isinstance(obj, __import__("datetime").datetime):
return obj.isoformat()
if isinstance(obj, list):
return [_serialize(item) for item in obj]
if isinstance(obj, dict):
return {k: _serialize(v) for k, v in obj.items()}
if hasattr(obj, "__dataclass_fields__"):
return {k: _serialize(getattr(obj, k)) for k in obj.__dataclass_fields__}
if hasattr(obj, "__dict__"):
return {k: _serialize(v) for k, v in obj.__dict__.items() if not k.startswith("_")}
return str(obj)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Good catch — that's a valid improvement. I kept this PR scoped to the two bug fixes (sources_count + MCP v1.26 compat) to keep it clean, but happy to address this in a follow-up PR if the maintainer wants."


Returns success status.
"""
from notebooklm import ChatGoal, ChatResponseLength

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Placing imports inside functions is generally discouraged. It can lead to repeated import overhead and makes it harder to see all of the file's dependencies at a glance. These imports should be moved to the top of the file. This applies to several other functions in this file as well (e.g., list_artifacts, generate_audio).

Comment on lines +577 to +593
qty = None
if quantity:
qty_map = {
"less": QuizQuantity.LESS,
"default": QuizQuantity.DEFAULT,
"more": QuizQuantity.MORE,
}
qty = qty_map.get(quantity.lower())

diff = None
if difficulty:
diff_map = {
"easy": QuizDifficulty.EASY,
"medium": QuizDifficulty.MEDIUM,
"hard": QuizDifficulty.HARD,
}
diff = diff_map.get(difficulty.lower())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for mapping quantity and difficulty strings to their corresponding enums is duplicated in generate_quiz and generate_flashcards. This duplicated code can be extracted into a helper function to improve maintainability.

For example:

def _parse_quiz_options(quantity: str | None, difficulty: str | None) -> tuple[QuizQuantity | None, QuizDifficulty | None]:
    """Parse quantity and difficulty strings into enums."""
    from notebooklm import QuizDifficulty, QuizQuantity

    qty = None
    if quantity:
        qty_map = {
            "less": QuizQuantity.LESS,
            "default": QuizQuantity.DEFAULT,
            "more": QuizQuantity.MORE,
        }
        qty = qty_map.get(quantity.lower())

    diff = None
    if difficulty:
        diff_map = {
            "easy": QuizDifficulty.EASY,
            "medium": QuizDifficulty.MEDIUM,
            "hard": QuizDifficulty.HARD,
        }
        diff = diff_map.get(difficulty.lower())
    
    return qty, diff

You can then call this helper from both generate_quiz and generate_flashcards.

Addresses critical CWE-22 path traversal vulnerability identified by
Gemini Code Assist in PR review. All 8 download_* functions now pass
output_path through _sanitize_output_path() which:

- Rejects '..' traversal sequences
- Strips absolute paths (drive letters, leading slashes)
- Confines writes to ALLOWED_DOWNLOAD_DIR (default: ./downloads/)
- Blocks writes to sensitive directories (.ssh, .gnupg, System32, etc.)
- Configurable via NOTEBOOKLM_DOWNLOAD_DIR env var
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant