Open
Description
Add Health Check Endpoint to Python SDK
Summary
Add the /healthz endpoint to the Python SDK to enable production monitoring and service health checks.
API Endpoint
Endpoint: GET /healthz
Current API Implementation (healthz.py:14-66):
@router.get("/healthz")
async def health_check():
    """Health check endpoint that includes memory monitoring"""
    return {
        "status": "healthy",
        "database": "healthy" if db_health else "unhealthy",
        "rate_limiter": "healthy" if rate_limiter_health else "unhealthy",
        "llm_providers": llm_health,
        "memory": {
            "status": memory_status,
            "process_memory_mb": round(memory_stats["process_memory_mb"], 1),
            "system_memory_percent": round(memory_stats["system_memory_percent"], 1)
        }
    }
Response Example:
{
    "status": "healthy",
    "database": "healthy",
    "rate_limiter": "healthy",
    "llm_providers": {
        "openai": true,
        "groq": true,
        "cerebras": true
    },
    "memory": {
        "status": "OK",
        "process_memory_mb": 245.3,
        "system_memory_percent": 67.8
    }
}
Proposed SDK Implementation
def health_check(self) -> dict:
    """
    Check the health status of the ScrapeGraph API service.

    Returns:
        dict: Health status including:
            - status: Overall service status ("healthy" or "unhealthy")
            - database: Database connection status
            - rate_limiter: Rate limiter status
            - llm_providers: Status of each LLM provider
            - memory: Memory usage statistics

    Raises:
        requests.exceptions.HTTPError: If the service is unhealthy (503)

    Example:
        >>> client = Client(api_key="your_api_key")
        >>> health = client.health_check()
        >>> if health["status"] == "healthy":
        ...     print("✓ Service is operational")
        ...     print(f"Memory usage: {health['memory']['process_memory_mb']} MB")
        ... else:
        ...     print("✗ Service is experiencing issues")

    Example with retry logic:
        >>> import time
        >>>
        >>> def wait_for_healthy_service(client, max_retries=5):
        ...     for attempt in range(max_retries):
        ...         try:
        ...             health = client.health_check()
        ...             if health["status"] == "healthy":
        ...                 return True
        ...         except Exception:
        ...             pass  # Treat request errors as "not healthy yet"
        ...         time.sleep(2 ** attempt)  # Exponential backoff
        ...     return False
    """
    response = self._make_request(
        "GET",
        f"{self.base_url}/healthz"
    )
    return response
Use Cases
1. Production Monitoring
client = Client(api_key=API_KEY)
# Regular health checks
health = client.health_check()
if health["status"] != "healthy":
    send_alert(f"Service unhealthy: {health}")
2. Pre-flight Checks
def process_batch(urls):
    client = Client(api_key=API_KEY)
    # Check service health before batch processing
    health = client.health_check()
    if health["status"] != "healthy":
        raise ServiceUnavailableError("API service is not healthy")
    for url in urls:
        client.smartscraper(url=url, user_prompt="Extract data")
3. Circuit Breaker Pattern
class CircuitBreaker:
    def __init__(self, client):
        self.client = client
        self.failures = 0
        self.max_failures = 3

    def call_api(self, func, *args, **kwargs):
        if self.failures >= self.max_failures:
            # Circuit is open: probe the service before allowing another call
            try:
                health = self.client.health_check()
            except Exception as exc:
                raise CircuitOpenError("Circuit breaker open") from exc
            # Checked outside the try so this error is not swallowed and re-labelled
            if health["status"] != "healthy":
                raise CircuitOpenError("Service still unhealthy")
            self.failures = 0
        try:
            result = func(*args, **kwargs)
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            raise
4. Status Dashboard
def get_service_metrics():
    client = Client(api_key=API_KEY)
    health = client.health_check()
    return {
        "overall_status": health["status"],
        "components": {
            "database": health["database"],
            "rate_limiter": health["rate_limiter"],
            "llm_providers": health["llm_providers"]
        },
        "memory_usage_mb": health["memory"]["process_memory_mb"],
        "memory_status": health["memory"]["status"]
    }
Implementation Checklist
- Add health_check() method to Client class in scrapegraph_py/client.py
- Add comprehensive docstring with examples
- Add response type hints (TypedDict or Pydantic model)
- Create unit tests in tests/
- Update API documentation
- Add usage examples to README
- Update CHANGELOG
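For the type-hint item in the checklist, one possible shape is sketched here with TypedDict. The field names are taken from the response example in this issue; the class names MemoryHealth and HealthCheckResponse and the helper missing_health_keys are hypothetical and not part of the SDK.

```python
from typing import Dict, List, TypedDict, get_type_hints


class MemoryHealth(TypedDict):
    """Shape of the "memory" object in the /healthz response."""
    status: str
    process_memory_mb: float
    system_memory_percent: float


class HealthCheckResponse(TypedDict):
    """Shape of the full /healthz response (names are assumptions)."""
    status: str
    database: str
    rate_limiter: str
    llm_providers: Dict[str, bool]
    memory: MemoryHealth


def missing_health_keys(payload: dict) -> List[str]:
    """Return the expected keys that are absent from payload, in declaration order."""
    missing = [key for key in get_type_hints(HealthCheckResponse) if key not in payload]
    memory = payload.get("memory")
    if isinstance(memory, dict):
        missing += [
            f"memory.{key}" for key in get_type_hints(MemoryHealth) if key not in memory
        ]
    return missing
```

A TypedDict keeps the return value a plain dict, so existing code that indexes health["status"] would keep working; a Pydantic model would add runtime validation at the cost of changing the return type.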
Test Cases
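The test cases in this section call the live API. An offline variant could patch the request layer instead; the sketch here assumes health_check() delegates to self._make_request("GET", ...) as in the proposed implementation. StubClient is a hypothetical stand-in for the real Client, and its base_url is a placeholder.

```python
from unittest import mock

# Payload copied from the response example in this issue.
HEALTHY_PAYLOAD = {
    "status": "healthy",
    "database": "healthy",
    "rate_limiter": "healthy",
    "llm_providers": {"openai": True, "groq": True, "cerebras": True},
    "memory": {"status": "OK", "process_memory_mb": 245.3, "system_memory_percent": 67.8},
}


class StubClient:
    """Hypothetical stand-in that mirrors only the proposed health_check()."""

    base_url = "https://api.example.invalid/v1"  # placeholder, not the real base URL

    def _make_request(self, method, url):
        raise RuntimeError("network access is not expected in unit tests")

    def health_check(self):
        return self._make_request("GET", f"{self.base_url}/healthz")


def test_health_check_calls_healthz_once():
    """health_check() issues one GET to /healthz and returns the payload unchanged."""
    client = StubClient()
    with mock.patch.object(client, "_make_request", return_value=HEALTHY_PAYLOAD) as fake:
        health = client.health_check()
    fake.assert_called_once_with("GET", f"{client.base_url}/healthz")
    assert health["status"] == "healthy"
    assert health["memory"]["process_memory_mb"] == 245.3
```

In the real test suite the same patch could be applied to a Client instance, provided the method name _make_request matches the final implementation.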
def test_health_check_success():
    """Test successful health check"""
    client = Client(api_key=TEST_API_KEY)
    health = client.health_check()
    assert "status" in health
    assert "database" in health
    assert "llm_providers" in health
    assert "memory" in health
    assert health["status"] in ["healthy", "unhealthy"]


def test_health_check_response_structure():
    """Test health check response has expected structure"""
    client = Client(api_key=TEST_API_KEY)
    health = client.health_check()
    # Verify top-level keys
    assert "status" in health
    assert "database" in health
    assert "rate_limiter" in health
    assert "llm_providers" in health
    assert "memory" in health
    # Verify memory structure
    assert "status" in health["memory"]
    assert "process_memory_mb" in health["memory"]
    assert "system_memory_percent" in health["memory"]


def test_health_check_no_auth_required():
    """Test that health check works without authentication"""
    # Health check endpoint may not require authentication
    # depending on API design
    client = Client(api_key="")
    try:
        health = client.health_check()
    except Exception:
        # If auth is required, this test documents that
        return
    # Asserted outside the try so a failed assertion is not swallowed
    assert "status" in health
Benefits
For Users:
- Production monitoring and alerting
- Better error handling with retry logic
- Circuit breaker pattern implementation
- Pre-flight checks before batch operations
- Integration with health check systems (Kubernetes, Docker, monitoring tools)
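As a rough illustration of the Kubernetes and Docker integration, a probe script could map the health result to a process exit code, since both Kubernetes exec probes and Docker HEALTHCHECK treat exit code 0 as success. The helper name probe_exit_code is hypothetical, and the sketch assumes health_check() returns the dict described in this issue or raises on failure.

```python
import sys
from typing import Callable, Mapping


def probe_exit_code(check: Callable[[], Mapping]) -> int:
    """Return 0 when the health payload reports "healthy", otherwise 1."""
    try:
        health = check()
    except Exception as exc:
        # Any failure (network error, HTTP 503, malformed response) counts as unhealthy.
        print(f"health check failed: {exc}", file=sys.stderr)
        return 1
    return 0 if health.get("status") == "healthy" else 1


# Intended use in a probe script (assumes the proposed SDK method exists):
#     client = Client(api_key=API_KEY)
#     sys.exit(probe_exit_code(client.health_check))
```

Passing the check as a callable keeps the helper independent of the SDK, so it can be exercised without network access.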
For Project:
- Production-ready SDK
- Better developer experience
- Reduced support burden (users can diagnose issues)
- Industry best practice
Priority
MEDIUM - Nice-to-have for production users, but not blocking core functionality.
Labels: enhancement, python-sdk