-
Notifications
You must be signed in to change notification settings - Fork 81
feat: support for agent v1 #520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis update introduces a comprehensive refactor and enhancement of the agent-related configuration and client code within the Deepgram SDK. Key changes include the renaming and restructuring of configuration classes (e.g., Changes
Sequence Diagram(s)sequenceDiagram
participant UserScript
participant DeepgramClient
participant AgentWebSocketClient
participant AgentProvider
UserScript->>DeepgramClient: Initialize with API key and options
UserScript->>AgentWebSocketClient: Connect with SettingsOptions
AgentWebSocketClient->>AgentProvider: Send Settings (with nested provider config)
AgentProvider-->>AgentWebSocketClient: Acknowledge settings
UserScript->>AgentWebSocketClient: Register event handlers
UserScript->>AgentWebSocketClient: Stream audio data (if applicable)
AgentWebSocketClient->>AgentProvider: Transmit audio/data/events
AgentProvider-->>AgentWebSocketClient: Emit events (audio, text, state)
AgentWebSocketClient-->>UserScript: Invoke registered handlers with event data
UserScript->>AgentWebSocketClient: Close connection
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
```bash | ||
# Install deps | ||
pipenv install | ||
# Build package | ||
pipenv run python3 -m build | ||
# Install package from local build | ||
pipenv install ./dist/deepgram_sdk-0.0.0.tar.gz | ||
# Try an example! | ||
DEEPGRAM_API_KEY=<key> pipenv run python3 examples/agent/async_simple/main.py | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this in this PR because I don't have my command history on the new laptop, and had to do quite a bit of digging to figure out how to make this work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me. I use PipEnv and always need to check my notes on how to run it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🔭 Outside diff range comments (1)
deepgram/clients/agent/v1/websocket/client.py (1)
181-189
: 🛠️ Refactor suggestion
start()
rejectsNone
foroptions
, breaking previous behaviour
options
is declared asOptional[SettingsOptions]
, but the subsequentisinstance
/elif
chain ends with an unconditionalelse: raise DeepgramError("Invalid options type")
.
PassingNone
– which used to be valid for callers that wanted to send the Settings message later – will now raise an exception and make the call site fail at runtime.If
None
is no longer acceptable, the type hint should be changed toSettingsOptions
(non-optional) and the docstring updated.
IfNone
should stay legal, add an early guard and skip the type-based conversion instead of raising.- options: Optional[SettingsOptions] = None, + # `None` means: “I will send the Settings message manually later”. + options: Optional[SettingsOptions] = None, ... - else: - raise DeepgramError("Invalid options type") + elif options is None: + self._logger.notice("No SettingsOptions supplied – caller will send later") + self._settings = None + else: + raise DeepgramError( + f"Invalid options type {type(options).__name__}; " + "expected SettingsOptions | dict | str | None" + )Also applies to: 199-204
🧹 Nitpick comments (6)
deepgram/clients/__init__.py (1)
367-382
: Provider model refactored with granular provider typesThe agent configuration has been refactored from a general-purpose
Provider
class to specific provider types (ListenProvider
,SpeakProvider
,ThinkProvider
) and a newEndpoint
abstraction.This change provides better separation of concerns and more explicit configuration options for each capability of the agent. The granular provider model allows for more targeted configuration and clearer API boundaries.
deepgram/clients/agent/v1/websocket/client.py (2)
229-231
: Safety check can dereferenceNone
whenmodel
is missing
self._settings.agent.listen.provider
is guaranteed by defaults, butmodel
may still beNone
if the caller intentionally leaves it unset (e.g. to let
the server decide).
The current condition:if self._settings.agent.listen.provider.keyterms is not None and \ self._settings.agent.listen.provider.model is not None and \ not self._settings.agent.listen.provider.model.startswith("nova-3"):will raise
AttributeError
ifprovider
itself isNone
(possible if a
malformed payload sneaks throughfrom_dict
). A defensive guard eliminates the
risk.-if self._settings.agent.listen.provider.keyterms is not None and \ - self._settings.agent.listen.provider.model is not None and \ - not self._settings.agent.listen.provider.model.startswith("nova-3"): +listen_provider = getattr(self._settings.agent.listen, "provider", None) +if ( + listen_provider + and listen_provider.keyterms + and listen_provider.model + and not listen_provider.model.startswith("nova-3") +):
191-197
: Consider redacting sensitive data in INFO logs
self._logger.info("settings: %s", options)
prints the entire Settings object,
which may contain organisation-specific prompts, model identifiers or future
secret fields. Moving this to DEBUG, or implementing a redaction helper (e.g.
masking long strings), protects users who run the SDK with verbose logging in
production.deepgram/clients/agent/v1/websocket/options.py (3)
215-223
:SpeakProvider.__getitem__
assumes lists forvoice
/language
, conflicting with schema
voice
is a singularCartesiaVoice
instance, andlanguage
/
language_code
are strings, but the getter transforms them into lists. This
breaks symmetry (SpeakProvider.from_dict(...).voice
returnslist
, not
CartesiaVoice
) and surprises downstream consumers.- if "voice" in _dict: - _dict["voice"] = [ - CartesiaVoice.from_dict(voice) for voice in _dict["voice"] - ] - if "language" in _dict: - _dict["language"] = [str(language) for language in _dict["language"]] + if "voice" in _dict and isinstance(_dict["voice"], dict): + _dict["voice"] = CartesiaVoice.from_dict(_dict["voice"])(The same pattern occurs in
Think
,Listen
, andSpeak
classes and could be
refactored via a helper.)
78-81
:Endpoint.method
isOptional[str]
but default is"POST"
Either drop
Optional
(preferred) or make the defaultNone
. Staying
consistent avoids accidentalNone
assignment later.
364-370
:Language
dataclass shadows the built-intype
keywordAlthough legal, using the attribute name
type
invites confusion with
Python’s built-in. Consider renaming tocode
orlang
for clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
.github/CONTRIBUTING.md
(1 hunks)deepgram/__init__.py
(1 hunks)deepgram/client.py
(1 hunks)deepgram/clients/__init__.py
(1 hunks)deepgram/clients/agent/__init__.py
(1 hunks)deepgram/clients/agent/client.py
(2 hunks)deepgram/clients/agent/enums.py
(1 hunks)deepgram/clients/agent/v1/__init__.py
(1 hunks)deepgram/clients/agent/v1/websocket/__init__.py
(1 hunks)deepgram/clients/agent/v1/websocket/async_client.py
(7 hunks)deepgram/clients/agent/v1/websocket/client.py
(7 hunks)deepgram/clients/agent/v1/websocket/options.py
(7 hunks)examples/agent/async_simple/main.py
(2 hunks)examples/agent/simple/main.py
(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
deepgram/clients/agent/v1/websocket/__init__.py (2)
Learnt from: dvonthenen
PR: deepgram/deepgram-python-sdk#426
File: deepgram/clients/listen/v1/websocket/__init__.py:8-8
Timestamp: 2024-07-01T19:21:39.778Z
Learning: Unused imports in `deepgram/clients/listen/v1/websocket/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
Learnt from: dvonthenen
PR: deepgram/deepgram-python-sdk#426
File: deepgram/clients/listen/v1/websocket/__init__.py:8-8
Timestamp: 2024-10-09T02:19:46.086Z
Learning: Unused imports in `deepgram/clients/listen/v1/websocket/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
🧬 Code Graph Analysis (7)
deepgram/client.py (1)
deepgram/clients/agent/v1/websocket/options.py (13)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)Listen
(257-270)ListenProvider
(145-160)Speak
(274-294)SpeakProvider
(177-223)Header
(21-27)Function
(94-127)Think
(227-253)ThinkProvider
(164-173)Agent
(298-318)Audio
(346-360)Endpoint
(73-90)
deepgram/clients/agent/v1/websocket/__init__.py (1)
deepgram/clients/agent/v1/websocket/options.py (13)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)Listen
(257-270)ListenProvider
(145-160)Speak
(274-294)SpeakProvider
(177-223)Header
(21-27)Function
(94-127)Think
(227-253)ThinkProvider
(164-173)Agent
(298-318)Audio
(346-360)Endpoint
(73-90)
deepgram/__init__.py (1)
deepgram/clients/agent/v1/websocket/options.py (22)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)UpdateSpeakOptions
(424-430)InjectAgentMessageOptions
(437-443)FunctionCallResponse
(450-457)AgentKeepAlive
(464-469)Listen
(257-270)ListenProvider
(145-160)Speak
(274-294)SpeakProvider
(177-223)Header
(21-27)Item
(31-37)Properties
(41-52)Parameters
(56-69)Function
(94-127)Think
(227-253)ThinkProvider
(164-173)Agent
(298-318)Input
(322-328)Output
(332-342)Audio
(346-360)Endpoint
(73-90)
deepgram/clients/agent/v1/websocket/async_client.py (3)
deepgram/clients/agent/v1/websocket/options.py (3)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)check
(391-404)deepgram/utils/verboselogs/__init__.py (1)
notice
(150-153)deepgram/clients/common/v1/websocket_response.py (1)
ErrorResponse
(41-51)
deepgram/clients/agent/v1/__init__.py (1)
deepgram/clients/agent/v1/websocket/options.py (13)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)Listen
(257-270)ListenProvider
(145-160)Speak
(274-294)SpeakProvider
(177-223)Header
(21-27)Function
(94-127)Think
(227-253)ThinkProvider
(164-173)Agent
(298-318)Audio
(346-360)Endpoint
(73-90)
deepgram/clients/agent/v1/websocket/client.py (3)
deepgram/clients/agent/v1/websocket/options.py (3)
SettingsOptions
(373-404)UpdatePromptOptions
(411-417)check
(391-404)deepgram/utils/verboselogs/__init__.py (1)
notice
(150-153)deepgram/clients/common/v1/websocket_response.py (1)
ErrorResponse
(41-51)
deepgram/clients/agent/v1/websocket/options.py (2)
deepgram/clients/common/v1/shared_response.py (1)
BaseResponse
(16-44)deepgram/clients/agent/enums.py (1)
AgentWebSocketEvents
(10-37)
🪛 GitHub Actions: Check - static
deepgram/clients/agent/v1/websocket/options.py
[error] 136-136: mypy: Incompatible types in assignment (expression has type "None", variable has type "str") [assignment]
[error] 139-139: mypy: Incompatible types in assignment (expression has type "None", variable has type "str") [assignment]
🔇 Additional comments (24)
.github/CONTRIBUTING.md (1)
51-65
: Great addition of building documentation!The new "Building Locally" section provides clear step-by-step instructions for contributors to build and test the package. This is particularly valuable given the extensive refactoring of agent-related classes and options introduced in this PR.
deepgram/clients/agent/enums.py (1)
32-33
: Renamed WebSocket events for consistency with option classesThese enum value renamings align with the broader restructuring of agent configuration classes in this PR:
SettingsConfiguration
→Settings
(matchesSettingsOptions
)UpdateInstructions
→UpdatePrompt
(matchesUpdatePromptOptions
)This is a breaking change as noted in the PR description, but improves API consistency.
deepgram/client.py (1)
350-352
: Updated imports to match agent v1 provider architectureThese import changes reflect the significant restructuring of the agent configuration model:
Renamed options classes:
SettingsConfigurationOptions
→SettingsOptions
UpdateInstructionsOptions
→UpdatePromptOptions
Replaced generic
Provider
andContext
with specialized provider classes:
ListenProvider
,SpeakProvider
,ThinkProvider
- Added
Endpoint
abstractionThe new structure enables more granular configuration of agent capabilities through specialized provider classes.
Also applies to: 358-360, 367-368, 372-373
deepgram/clients/agent/v1/websocket/__init__.py (1)
29-30
: Updated imports to expose new agent v1 classesThese import changes are consistent with the broader refactoring in this PR and expose the restructured agent configuration classes to users of this namespace:
Renamed option classes:
SettingsConfigurationOptions
→SettingsOptions
UpdateInstructionsOptions
→UpdatePromptOptions
Added specialized provider classes:
ListenProvider
,SpeakProvider
,ThinkProvider
Endpoint
abstractionThese changes maintain consistency with the new agent v1 architecture throughout the codebase.
Also applies to: 37-39, 46-47, 51-52
deepgram/clients/agent/__init__.py (3)
34-36
: Class renames align with restructured agent configuration modelThe renaming from
SettingsConfigurationOptions
toSettingsOptions
andUpdateInstructionsOptions
toUpdatePromptOptions
represents part of a broader restructuring of the agent configuration API.
42-45
: Provider-specific classes replace generic ProviderThe introduction of specialized provider classes (
ListenProvider
,SpeakProvider
,ThinkProvider
) replaces the previous genericProvider
class, allowing for more granular configuration options specific to each agent capability.Also applies to: 51-52
56-57
: Added Endpoint abstraction supports custom endpoint configurationThe new
Endpoint
class provides a formal way to configure custom endpoints for agent interactions.deepgram/clients/agent/client.py (2)
33-36
: Updates to import aliases maintain API consistencyThe import aliases from
.v1
have been updated to reflect the renamed classes and new provider types, ensuring consistency with the broader API restructuring.Also applies to: 41-44, 50-51, 55-56
82-84
: Export aliases updated to match restructured importsThe export aliases at the bottom of the file have been consistently updated to match the restructured imports from the
.v1
module.Also applies to: 90-92, 99-100, 104-104
examples/agent/async_simple/main.py (2)
14-15
: Updated import to use new SettingsOptions classThe import statement has been updated to use the new
SettingsOptions
class, which replacesSettingsConfigurationOptions
.
113-121
: Example updated to use new agent configuration structureThe example demonstrates the new provider-based configuration structure:
- Uses nested
provider
properties to access model configuration- Renames
instructions
toprompt
to match the new API terminology- Adds new configurations like
greeting
,listen.provider.keyterms
, and top-levellanguage
These changes showcase how to correctly use the new agent configuration model.
deepgram/__init__.py (3)
336-338
: Updated top-level exports for agent configuration optionsThe exports have been updated to use the new class names
SettingsOptions
andUpdatePromptOptions
, ensuring consistency with the new API structure.
344-347
: Added provider-specific classes and endpoint abstraction to exportsThe specialized provider classes (
ListenProvider
,SpeakProvider
,ThinkProvider
) and theEndpoint
abstraction have been added to the exports, making them available through the top-level package.Also applies to: 353-354, 358-359
334-359
: Consider adding migration documentation for this breaking changeWhile the code changes are consistent and complete, the introduction of provider-based nesting and renamed fields constitutes a breaking change that will require client code updates.
Consider adding migration guides or documentation that explains how to update from the previous API to the new structure, especially highlighting the path changes (e.g.,
think.model
→think.provider.model
) and renamed fields (e.g.,instructions
→prompt
).examples/agent/simple/main.py (1)
11-11
: Import updated to use new SettingsOptions classThe agent configuration class has been renamed from
SettingsConfigurationOptions
toSettingsOptions
.deepgram/clients/agent/v1/__init__.py (2)
38-40
: Core settings classes renamedThe agent settings classes have been renamed for clarity and consistency:
SettingsConfigurationOptions
→SettingsOptions
UpdateInstructionsOptions
→UpdatePromptOptions
46-61
: Provider model refactored with explicit provider typesThe agent architecture has been refactored to use specific provider types instead of a general-purpose
Provider
:
ListenProvider
: Manages speech recognition configurationSpeakProvider
: Manages speech synthesis configurationThinkProvider
: Manages language model configurationEndpoint
: New abstraction for custom API integrationsThis refactoring improves code organization and API clarity while allowing for more granular configuration of each agent capability.
deepgram/clients/agent/v1/websocket/async_client.py (7)
34-40
: Import statements updated to reflect renamed classesImports have been updated to use the new class names:
SettingsOptions
replacesSettingsConfigurationOptions
UpdatePromptOptions
replacesUpdateInstructionsOptions
82-82
: Class attribute type updatedThe
_settings
attribute type has been updated toSettingsOptions
.
182-182
: Method signature updatedThe
start
method parameter type has been updated to useSettingsOptions
.
198-201
: Settings validation using updated class typeThe instance check and validation now use
SettingsOptions
.
216-225
: Settings object instantiation updatedThe code for instantiating settings objects from different input types has been updated to use
SettingsOptions
.
228-229
: Validation logic updated for nested provider structureThe validation logic for keyterms has been updated to check through the nested provider structure:
- Now validates using
self._settings.agent.listen.provider.keyterms
andself._settings.agent.listen.provider.model
This ensures that keyterms validation works with the new nested configuration structure.
280-288
: Log message and error message references updatedReferences to "ConfigurationSettings" have been changed to "Settings" in log messages and error descriptions.
This maintains consistency with the class renaming from
SettingsConfigurationOptions
toSettingsOptions
.
SettingsOptions, | ||
UpdatePromptOptions, | ||
UpdateSpeakOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking change: Core settings classes renamed
The renaming of SettingsConfigurationOptions
to SettingsOptions
and UpdateInstructionsOptions
to UpdatePromptOptions
represents a breaking change for client code. These core classes are used throughout the agent API for configuration.
When upgrading, clients will need to update imports and instantiations of these classes. The UpdatePromptOptions
rename also indicates a conceptual shift from "instructions" to "prompt" for clarity.
options: SettingsOptions = SettingsOptions() | ||
options.agent.think.provider.type = "open_ai" | ||
options.agent.think.model = "gpt-4o-mini" | ||
options.agent.think.instructions = "You are a helpful AI assistant." | ||
|
||
options.agent.think.provider.model = "gpt-4o-mini" | ||
options.agent.think.prompt = "You are a helpful AI assistant." | ||
options.greeting = "Hello, this is a text to speech example using Deepgram." | ||
options.agent.listen.provider.keyterms = ["hello", "goodbye"] | ||
options.agent.listen.provider.model = "nova-3" | ||
options.language = "en" | ||
if dg_connection.start(options) is False: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Configuration structure updated to use nested provider model
The agent configuration has been significantly restructured:
- Uses
SettingsOptions
instead ofSettingsConfigurationOptions
- Provider-specific configuration is now nested (e.g.,
agent.think.provider.type
) - Renamed
instructions
toprompt
- Added new top-level configuration fields (
greeting
,language
) - Listen configuration now uses provider-specific fields
This is a breaking change that requires clients to update their configuration structure. The new model is more organized but requires careful migration.
@@ -98,7 +98,7 @@ def __init__(self, config: DeepgramClientOptions): | |||
self._config = config | |||
|
|||
# needs to be "wss://agent.deepgram.com/agent" | |||
self._endpoint = "agent" | |||
self._endpoint = "v1/agent/converse" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WebSocket endpoint path updated
The WebSocket endpoint has been changed from "agent"
to "v1/agent/converse"
, explicitly versioning the API.
This endpoint change is a breaking change that impacts connection handling. The new path uses versioning (v1
), which is a good practice for API lifecycle management.
if "endpoint" in _dict: | ||
_dict["endpoint"] = [ | ||
Endpoint.from_dict(endpoint) for endpoint in _dict["endpoint"] | ||
] | ||
return _dict[key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Function.__getitem__
treats endpoint
as a list but the field is singular
endpoint
is declared as Optional[Endpoint]
, yet the getter converts it using
a list comprehension, which will fail with TypeError: 'Endpoint' object is not iterable
. Convert the single dict directly.
- if "endpoint" in _dict:
- _dict["endpoint"] = [
- Endpoint.from_dict(endpoint) for endpoint in _dict["endpoint"]
- ]
+ if "endpoint" in _dict and isinstance(_dict["endpoint"], dict):
+ _dict["endpoint"] = Endpoint.from_dict(_dict["endpoint"])
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if "endpoint" in _dict: | |
_dict["endpoint"] = [ | |
Endpoint.from_dict(endpoint) for endpoint in _dict["endpoint"] | |
] | |
return _dict[key] | |
if "endpoint" in _dict and isinstance(_dict["endpoint"], dict): | |
_dict["endpoint"] = Endpoint.from_dict(_dict["endpoint"]) | |
return _dict[key] |
mode: str = field( | ||
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | ||
) | ||
id: str = field( | ||
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CartesiaVoice.mode
and id
default to None
but are typed str
– Mypy error
mypy
rightfully complains that assigning None
to a str
violates the type
contract (see pipeline failure). Mark the fields as Optional[str]
or provide
non-None
defaults.
- mode: str = field(
+ mode: Optional[str] = field(
...
- id: str = field(
+ id: Optional[str] = field(
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
mode: str = field( | |
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | |
) | |
id: str = field( | |
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | |
) | |
mode: Optional[str] = field( | |
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | |
) | |
id: Optional[str] = field( | |
default=None, metadata=dataclass_config(exclude=lambda f: f is None) | |
) |
🧰 Tools
🪛 GitHub Actions: Check - static
[error] 136-136: mypy: Incompatible types in assignment (expression has type "None", variable has type "str") [assignment]
[error] 139-139: mypy: Incompatible types in assignment (expression has type "None", variable has type "str") [assignment]
""" | ||
This class defines the provider for the Think model. | ||
""" | ||
|
||
type: Optional[str] = field( | ||
type: str = field(default="open_ai") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@naomi-lgbt I don't think think.provider
has an actual default in the API. You might want to check with @RyanChimienti.
The API Spec doesn't currently have a default think.provider
set. If we find this is the default I can specify that accordingly in the API Spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct John, it doesn't.
""" | ||
This class defines the provider for the Think model. | ||
""" | ||
|
||
type: Optional[str] = field( | ||
type: str = field(default="open_ai") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type: str = field(default="open_ai") | |
type: str = field |
examples/agent/no_mic/chatlog.txt
Outdated
@@ -0,0 +1,14 @@ | |||
Connection opened: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably should add this file to gitignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
examples/agent/no_mic/output_0.wav
is excluded by!**/*.wav
📒 Files selected for processing (2)
examples/agent/no_mic/chatlog.txt
(1 hunks)examples/agent/no_mic/main.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/agent/no_mic/chatlog.txt
🔇 Additional comments (4)
examples/agent/no_mic/main.py (4)
10-21
: Replace ad-hoc prints & unused imports with structured logging
verboselogs
is imported but never used, and a series ofprint()
statements are scattered throughout the file for diagnostic purposes. Prefer the std-liblogging
module (orverboselogs
if you really need verbose levels) so that:
- log levels can be configured from the command line / env vars
- output can be redirected or silenced by downstream users
- unit tests aren’t polluted with console noise
-import requests -import wave -import io -import time -from datetime import datetime -from deepgram.utils import verboselogs - -# Add debug prints for imports -print("Checking imports...") +import logging +import requests +import wave +import io +import time +from datetime import datetime + +# Configure once, near the top-level of your application +logging.basicConfig(level=logging.INFO) +log = logging.getLogger(__name__) + +# Example replacement +log.debug("Imports successful")Later
print(...)
calls can be replaced withlog.debug/info/warning/error(...)
as appropriate.
[ suggest_optional_refactor ]
57-77
:convert_audio_to_linear16
neither re-samples nor is it ever invokedThe helper promises “convert … to linear16” but:
- It simply writes/reads the same frames – no resampling or format conversion is done.
- The function is never called, leaving dead code in the example.
If you genuinely need resampling, use a DSP library (e.g.
pydub
,scipy
,soundfile
) or the Deepgram SDK’s built-in conversion. Otherwise, remove the function to reduce cognitive load.[ suggest_optional_refactor ]
226-235
: Download loop lacks basic robustness
response.raise_for_status()
is omitted – silent HTTP 4xx/5xx will yield empty audio.- The stream is never closed (
with requests.get(..., stream=True) as response:
).- If the file is large, you may want back-pressure or to respect
response.iter_content(None)
when the WebSocket is paused.-response = requests.get(url, stream=True) -for chunk in response.iter_content(chunk_size=8192): +with requests.get(url, stream=True, timeout=15) as response: + response.raise_for_status() + for chunk in response.iter_content(chunk_size=8192): if chunk: dg_connection.send(chunk)[ suggest_essential_refactor ]
234-241
: Arbitrarytime.sleep(5)
may end the script before the agent finishesA fixed delay is brittle – network latency and model inference time vary.
Instead, either:
- Wait for a specific terminating event (
AgentAudioDone
+Close
) before exiting, or- Call a (hypothetical)
dg_connection.wait_until_closed()
if provided by the SDK.[ suggest_optional_refactor ]
Proposed changes
Types of changes
What types of changes does your code introduce to the community Python SDK?
Put an
x
in the boxes that applyChecklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.Further comments
Summary by CodeRabbit
.gitignore
to exclude generated audio and chat log files.requests
library to example dependencies.