Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(provider): add Google VertexAI support #24

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

onyedikachi-david
Copy link

@onyedikachi-david onyedikachi-david commented Dec 30, 2024

Add Google VertexAI Provider Support

This PR adds support for Google VertexAI models (Gemini) as a new provider in the Hub, allowing users to route their LLM requests to Google's models through our unified API interface.

Definition of Done

  • Implementation of VertexAI provider
    • Chat completions endpoint with Gemini models
    • Text completions endpoint
    • Embeddings endpoint with textembedding-gecko
    • Tool/function calling support
    • Streaming support
    • Multi-modal support
  • Unit tests
    • Provider configuration tests
    • Request/response conversion tests
    • Error handling tests
    • Model format conversion tests
  • Integration tests
    • Chat completions test with recorded responses
    • Completions test with recorded responses
    • Embeddings test with recorded responses
    • Tool calling test with recorded responses
    • Test cassettes for offline testing
    • Quota retry mechanism
  • Documentation
    • Configuration examples in README
    • Authentication methods (API key and service account)
    • Model support documentation
    • Usage examples with OpenAI SDK
  • Configuration
    • Example configuration in config-example.yaml
    • Support for both API key and service account auth
    • Required parameters (project_id)
    • Optional parameters (location, credentials_path)
  • Error Handling
    • Proper status code mapping from Google API
    • Informative error messages
    • Quota limit handling with configurable retries
    • Authentication error handling
  • Code Review and Approval

Changes Made

  1. Added new VertexAI provider implementation

    • Full support for Gemini models (chat, completion)
    • Text embeddings with textembedding-gecko
    • Tool/function calling with proper mapping
    • Streaming support for real-time responses
    • Multi-modal capabilities for image+text inputs
  2. Added comprehensive test suite

    • Unit tests for core functionality
    • Integration tests with recorded responses
    • Test cassettes for offline testing
    • Quota retry mechanism for rate limits
    • Both auth methods covered
  3. Updated documentation

    • Detailed VertexAI setup instructions
    • Authentication configuration guide
    • Model compatibility list
    • Usage examples with OpenAI SDK
    • Configuration parameter reference
  4. Added robust configuration support

    • API key authentication
    • Service account authentication (JSON key file)
    • Project ID configuration
    • Regional endpoint configuration
    • Default values for optional parameters

Testing

  • Unit tests: cargo test
  • Integration tests with credentials:
    # Service Account Auth
    export VERTEXAI_CREDENTIALS_PATH="../credentials/vertexai-key.json"
    
    # Record new responses
    RECORD_MODE=1 cargo test
    
    # Replay mode (default)
    cargo test
  • Test retry configuration:
    export RETRY_DELAY=60  # Seconds between retries

Security Considerations

  • Credentials stored outside repository
  • Test cassettes cleaned of sensitive data
  • Support for both API key and service account auth
  • Environment variable support for credentials
  • No hardcoded sensitive values

Notes

  • Default location is us-central1 (configurable)
  • Automatic retry on quota limits (configurable)
  • Test cassettes provided for offline development
  • Compatible with existing OpenAI SDK clients

Fixes #19
/claim #19

Signed-off-by: David Anyatonwu <[email protected]>
@CLAassistant
Copy link

CLAassistant commented Dec 31, 2024

CLA assistant check
All committers have signed the CLA.

…tting

- Changed project_id retrieval to use expect for mandatory parameter.
- Updated location retrieval to use unwrap_or for default value.
- Modified endpoint formatting to dynamically include location in the URL for both chat and embeddings requests.
- Refactored test provider setup to use constants for project_id and location.
@onyedikachi-david
Copy link
Author

onyedikachi-david commented Dec 31, 2024

@galkleinman This PR is ready for review.

running 9 tests
test providers::vertexai::tests::test_chat_completions_with_api_key ... ignored, Requires valid API key which is not available yet
test providers::vertexai::provider::tests::test_gemini_request_conversion ... ok
test providers::vertexai::provider::tests::test_gemini_response_conversion ... ok
test providers::vertexai::provider::tests::test_provider_new_missing_project_id - should panic ... ok
test providers::vertexai::provider::tests::test_provider_new ... ok
test providers::vertexai::tests::test_chat_completions ... ok
test providers::vertexai::tests::test_chat_completions_with_tools ... ok
test providers::vertexai::tests::test_embeddings ... ok
test providers::vertexai::tests::test_completions ... ok

test result: ok. 8 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 3.32s

     Running unittests src/main.rs (target/debug/deps/hub-e2c6417f88e881cd)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

   Doc-tests hub

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
image

- Introduced a new method `validate_location` to sanitize and validate location input, defaulting to "us-central1" if invalid.
- Updated the provider initialization to utilize the new location validation method.
- Added extensive unit tests for the provider, covering various scenarios including location validation, request conversion, and handling of empty messages.
- Ensured that invalid characters in location parameters are filtered out correctly.
- Enhanced tests to verify the precedence of API key over credentials path in configuration.

This commit improves the robustness of the VertexAIProvider by ensuring valid location formats and enhancing test coverage.
Copy link
Member

@nirga nirga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @onyedikachi-david - left a lot of comments. You're code isn't following idiomatic rust constructs and it's not following other providers already implemented in this repo. You're also missing integration (black box) tests which tests the whole system - not just the provider.

@@ -37,3 +46,44 @@ pub struct ChatCompletionChunk {
#[serde(skip_serializing_if = "Option::is_none")]
pub usage: Option<Usage>,
}

impl ChatCompletionChunk {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the From trait, and be part of gemini provider and not here (see other providers)

use crate::models::vertexai::GeminiChatResponse;

#[derive(Deserialize, Serialize, Clone, Debug, Default)]
pub struct Delta {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between this and ChoiceDelta?

use crate::models::tool_choice::{SimpleToolChoice, ToolChoice};
use crate::models::usage::Usage;

#[derive(Debug, Serialize, Deserialize)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the similar structure we have in other providers - this should be under providers/vertexai/models.rs

pub total_token_count: i32,
}

impl GeminiChatRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use the From trait

println!("Using API key authentication");
Ok(self.config.api_key.clone())
} else {
println!("Using service account authentication");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use print, use logs API in the proper level (probably debug)

.collect();

if sanitized.is_empty() {
"us-central1".to_string() // Default if invalid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?


if status.is_success() {
if payload.stream.unwrap_or(false) {
Err(StatusCode::BAD_REQUEST) // Streaming not supported yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add streaming support

payload: CompletionRequest,
model_config: &ModelConfig,
) -> Result<CompletionResponse, StatusCode> {
// For Gemini, we'll use the chat endpoint for completions as well
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? if there's no chat endpoint for gemini this needs to be unimplemented

}
}

#[cfg(test)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be here, it belongs to the test file


}

#[cfg(test)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Add Google VertexAI Provider Support
3 participants