28 commits
ca48a87
feat(autofix): Add GitLab repository support
dnplkndll Jan 28, 2026
09227b6
chore: Add CodeRabbit AI code review configuration
dnplkndll Jan 28, 2026
a47cb01
ci: Add GitHub Actions workflow for GCP Artifact Registry
dnplkndll Jan 28, 2026
f761d6b
ci: Update cloudbuild.yaml with fallback for models
dnplkndll Jan 28, 2026
7ba41cf
ci: Fix tests workflow for fork compatibility
dnplkndll Jan 28, 2026
227c4ea
ci: Make linting workflow fork-compatible
dnplkndll Jan 28, 2026
cfe77fe
fix: Address pre-commit linting issues
dnplkndll Jan 28, 2026
7a23e4b
fix: Add _build_file_tree_string to BaseRepoClient for mypy
dnplkndll Jan 28, 2026
8cde311
style: Format base_repo_client.py with black
dnplkndll Jan 28, 2026
781e618
fix(ci): Use requirements hash for Docker cache key
dnplkndll Jan 28, 2026
ed29f07
fix(deps): Modernize langfuse and openai, migrate to uv
dnplkndll Jan 28, 2026
f99179b
feat: Migrate to langfuse 3.x and openai 2.x
dnplkndll Jan 29, 2026
a90811a
feat: Switch VCR cassette encryption to kencove-prod KMS
dnplkndll Jan 29, 2026
eb3e84c
fix(security): Address CodeRabbit security review findings
dnplkndll Jan 29, 2026
bb72a01
fix(langfuse): Complete langfuse 3.x API migration
dnplkndll Jan 29, 2026
42eca1a
style: Fix import ordering (isort)
dnplkndll Jan 29, 2026
22011dc
chore: Remove untracked files from repo
dnplkndll Jan 29, 2026
bbe3225
fix(langfuse): Replace DatasetItemClient.observe() with run()
dnplkndll Jan 29, 2026
31ccc58
fix(mypy): Fix all pre-existing mypy type errors
dnplkndll Jan 29, 2026
c47169f
chore: Remove untracked files
dnplkndll Jan 29, 2026
629cc14
chore: Add local config files to gitignore
dnplkndll Jan 29, 2026
1e74b5c
style: Fix black formatting
dnplkndll Jan 29, 2026
20e9ec7
fix: Update Claude model to sonnet-4 and fix VCR cassettes
dnplkndll Jan 29, 2026
f2e0d54
chore: Ignore docstring linting rules in flake8 config
dnplkndll Jan 29, 2026
aba5762
ci: Pass GITHUB_TOKEN to test container from GH_PAT secret
dnplkndll Jan 29, 2026
a5b9b6a
fix: Use correct git diff patch type format in tests
dnplkndll Jan 29, 2026
8e58006
fix: Add redis dependency required by python-gitlab
dnplkndll Jan 29, 2026
8b974dd
fix: Add async-timeout dependency required by redis
dnplkndll Jan 29, 2026
132 changes: 132 additions & 0 deletions .coderabbit.yaml
@@ -0,0 +1,132 @@
+# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
+# CodeRabbit Configuration for Seer - Sentry's AI/ML Service
+# Documentation: https://docs.coderabbit.ai/guides/configure-coderabbit
+
+language: en-US
+
+early_access: true
+
+reviews:
+  # Enable high-quality reviews
+  high_level_summary: true
+  high_level_summary_placeholder: "@coderabbitai summary"
+
+  # Review profile - assertive for thorough code review
+  profile: assertive
+
+  # Request changes when issues found
+  request_changes_workflow: true
+
+  # Collapse walkthrough for cleaner PR comments
+  collapse_walkthrough: true
+
+  # Enable poem in reviews (fun touch)
+  poem: false
+
+  # Review status - show in PR
+  review_status: true
+
+  # Auto-review settings
+  auto_review:
+    enabled: true
+    auto_incremental_review: true
+    drafts: false # Don't review draft PRs
+    base_branches:
+      - main
+      - master
+
+  # Path-based review instructions
+  path_instructions:
+    - path: "src/seer/automation/**/*.py"
+      instructions: |
+        Focus on:
+        - LLM prompt injection vulnerabilities
+        - Proper error handling for external API calls (GitHub, GitLab, OpenAI, Anthropic)
+        - Resource cleanup (temp directories, file handles)
+        - Timeout handling for long-running operations
+        - Type safety with abstract base classes
+
+    - path: "src/seer/automation/codebase/**/*.py"
+      instructions: |
+        This is the repository client layer. Pay attention to:
+        - Consistent return types between GitHub and GitLab implementations
+        - Proper authentication token handling (never log tokens)
+        - Rate limiting considerations
+        - Branch/commit SHA validation
+
+    - path: "src/seer/automation/agent/**/*.py"
+      instructions: |
+        This is the LLM client layer. Check for:
+        - Multi-provider compatibility (Anthropic, OpenAI, Google)
+        - Proper streaming/timeout handling
+        - Token counting and context limits
+        - Fallback logic between regions/models
+
+    - path: "tests/**/*.py"
+      instructions: |
+        Ensure tests:
+        - Use real database connections, not mocks (per project guidelines)
+        - Don't test logging or mock behavior
+        - Use dependency injection for isolation
+        - Have meaningful assertions
+
+    - path: "**/*migration*.py"
+      instructions: |
+        Database migrations require extra scrutiny:
+        - Check for data loss risks
+        - Verify rollback capability
+        - Consider performance on large tables
+
+    - path: "src/seer/configuration.py"
+      instructions: |
+        Configuration changes:
+        - Ensure no secrets have default values
+        - Check for proper type annotations
+        - Verify environment variable naming consistency
+
+  # Tools configuration
+  tools:
+    # Enable AST-based analysis
+    ast-grep:
+      enabled: true
+
+    # Python-specific tools
+    ruff:
+      enabled: true
+
+    # Security scanning
+    semgrep:
+      enabled: true
+
+    # Shell script checking
+    shellcheck:
+      enabled: true
+
+    # GitHub Actions validation
+    actionlint:
+      enabled: true
+
+    # Markdown linting
+    markdownlint:
+      enabled: true
+
+    # YAML validation
+    yamllint:
+      enabled: true
+
+    # Biome for any JS/TS
+    biome:
+      enabled: true
+
+chat:
+  auto_reply: true
+
+# Knowledge base for better context
+knowledge_base:
+  opt_out: false
+  learnings:
+    scope: auto
+  issues:
+    scope: auto
+  pull_requests:
+    scope: auto
117 changes: 117 additions & 0 deletions .github/workflows/build-push-gcp.yml
@@ -0,0 +1,117 @@
+name: Build and Push to GCP Artifact Registry
+
+on:
+  push:
+    branches:
+      - main
+    paths-ignore:
+      - '**.md'
+      - '.coderabbit.yaml'
+  pull_request:
+    branches:
+      - main
+    types: [closed]
+  workflow_dispatch:
+    inputs:
+      tag:
+        description: 'Image tag (defaults to commit SHA)'
+        required: false
+        type: string
+
+env:
+  GCP_PROJECT_ID: kencove-prod
+  GCP_REGION: us-central1
+  REPOSITORY: kencove-docker-repo
+  IMAGE_NAME: seer
+
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    # Only run on push to main, manual trigger, or merged PRs
+    if: |
+      github.event_name == 'push' ||
+      github.event_name == 'workflow_dispatch' ||
+      (github.event_name == 'pull_request' && github.event.pull_request.merged == true)
+
+    permissions:
+      contents: read
+      id-token: write # Required for Workload Identity Federation
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Authenticate to Google Cloud
+        id: auth
+        uses: google-github-actions/auth@v2
+        with:
+          credentials_json: ${{ secrets.GCP_SA_KEY }}
+          # Alternative: Use Workload Identity Federation (more secure, requires GCP setup)
+          # workload_identity_provider: 'projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID'
+          # service_account: 'SA_NAME@PROJECT_ID.iam.gserviceaccount.com'
Comment on lines +47 to +54
🧹 Nitpick | 🔵 Trivial

Consider Workload Identity Federation for improved security.

The workflow uses credentials_json with a service account key stored in secrets. While functional, Workload Identity Federation provides better security through short-lived tokens without storing long-lived credentials. The commented alternative shows the pattern to follow when GCP setup permits.

🤖 Prompt for AI Agents
In @.github/workflows/build-push-gcp.yml around lines 47 - 54, Replace usage of
long-lived service account key in the "Authenticate to Google Cloud" step (id:
auth, uses: google-github-actions/auth@v2) by switching from credentials_json to
Workload Identity Federation: remove or stop using credentials_json and instead
set workload_identity_provider to the pool/provider resource and service_account
to the GCP SA email; ensure the runner/GCP setup (OIDC provider, IAM binding) is
configured and update the workflow secrets/env as needed to supply the provider
string and service account name rather than a JSON key.
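
A minimal sketch of the Workload Identity Federation variant described in this comment, built from the commented-out lines already present in the workflow. PROJECT_NUMBER, POOL_ID, PROVIDER_ID, SA_NAME and PROJECT_ID are placeholders, not values from this PR, and the sketch assumes the OIDC provider and IAM binding have already been set up on the GCP side.

```yaml
permissions:
  contents: read
  id-token: write  # lets the job request an OIDC token for the exchange

steps:
  - name: Authenticate to Google Cloud
    id: auth
    uses: google-github-actions/auth@v2
    with:
      # Placeholders: substitute the real pool/provider resource and SA email.
      workload_identity_provider: 'projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID'
      service_account: 'SA_NAME@PROJECT_ID.iam.gserviceaccount.com'
```

With this shape no `GCP_SA_KEY` secret is needed for the step; the provider string and service account email are identifiers rather than credentials.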


+      - name: Configure Docker for Artifact Registry
+        run: gcloud auth configure-docker ${{ env.GCP_REGION }}-docker.pkg.dev --quiet
+
+      - name: Generate image tags
+        id: tags
+        env:
+          # Pass user-controlled inputs through env vars to prevent script injection
+          INPUT_TAG: ${{ inputs.tag }}
+          HEAD_REF: ${{ github.head_ref }}
+        run: |
+          REGISTRY="${{ env.GCP_REGION }}-docker.pkg.dev/${{ env.GCP_PROJECT_ID }}/${{ env.REPOSITORY }}/${{ env.IMAGE_NAME }}"
+          SHA_SHORT=$(git rev-parse --short HEAD)
+
+          # Use input tag if provided, otherwise use commit SHA
+          if [ -n "$INPUT_TAG" ]; then
+            CUSTOM_TAG="$INPUT_TAG"
+          else
+            CUSTOM_TAG="${SHA_SHORT}"
+          fi
+
+          # Build tags list
+          TAGS="${REGISTRY}:${CUSTOM_TAG}"
+          TAGS="${TAGS},${REGISTRY}:${SHA_SHORT}"
+
+          # Add 'latest' tag only on main branch push
+          if [ "${{ github.ref }}" == "refs/heads/main" ] && [ "${{ github.event_name }}" == "push" ]; then
+            TAGS="${TAGS},${REGISTRY}:latest"
+          fi
+
+          # Add branch name tag for PRs
+          if [ "${{ github.event_name }}" == "pull_request" ]; then
+            BRANCH_TAG=$(echo "$HEAD_REF" | sed 's/[^a-zA-Z0-9]/-/g' | cut -c1-50)
+            TAGS="${TAGS},${REGISTRY}:${BRANCH_TAG}"
+          fi
+
+          echo "tags=${TAGS}" >> $GITHUB_OUTPUT
+          echo "sha_short=${SHA_SHORT}" >> $GITHUB_OUTPUT
+          echo "Generated tags: ${TAGS}"
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          platforms: linux/amd64
+          push: true
+          tags: ${{ steps.tags.outputs.tags }}
+          build-args: |
+            SEER_VERSION_SHA=${{ steps.tags.outputs.sha_short }}
+            SENTRY_ENVIRONMENT=production
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+
+      - name: Output image info
+        run: |
+          echo "### Docker Image Published :rocket:" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "**Registry:** ${{ env.GCP_REGION }}-docker.pkg.dev/${{ env.GCP_PROJECT_ID }}/${{ env.REPOSITORY }}/${{ env.IMAGE_NAME }}" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "**Tags:**" >> $GITHUB_STEP_SUMMARY
+          echo '${{ steps.tags.outputs.tags }}' | tr ',' '\n' | while read tag; do
+            echo "- \`${tag}\`" >> $GITHUB_STEP_SUMMARY
+          done
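
The `Generate image tags` step in this workflow passes `inputs.tag` through an environment variable, which guards against script injection but does not check that the value is a legal Docker tag. A minimal sketch of a validation step that could run before it, assuming the default bash shell on `ubuntu-latest`; the step name and the `tag_re` variable are hypothetical and not part of this PR.

```yaml
- name: Validate image tag  # hypothetical step, not in the PR
  env:
    INPUT_TAG: ${{ inputs.tag }}
  run: |
    # Docker tags: at most 128 chars from [A-Za-z0-9_.-], not starting with '.' or '-'.
    tag_re='^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$'
    # An empty INPUT_TAG is fine: the workflow then falls back to the commit SHA.
    if [[ -n "$INPUT_TAG" && ! "$INPUT_TAG" =~ $tag_re ]]; then
      echo "Invalid image tag: ${INPUT_TAG}" >&2
      exit 1
    fi
```

Failing early here gives a clearer error than letting the later build-and-push step reject the tag.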
7 changes: 4 additions & 3 deletions .github/workflows/linting.yml
@@ -48,17 +48,18 @@ jobs:
           # --show-diff-on-failure will display what needs to be fixed without making changes
           xargs pre-commit run --show-diff-on-failure --files

+      # Auto-fix requires Sentry's internal GitHub App - skip for forks
       - name: Get auth token
         id: token
-        if: ${{ steps.pre-commit_results.outcome == 'failure' }}
+        if: ${{ steps.pre-commit_results.outcome == 'failure' && vars.SENTRY_INTERNAL_APP_ID != '' }}
+        continue-on-error: true
         uses: getsentry/action-github-app-token@d4b5da6c5e37703f8c3b3e43abb5705b46e159cc # v3.0.0
         with:
           app_id: ${{ vars.SENTRY_INTERNAL_APP_ID }}
           private_key: ${{ secrets.SENTRY_INTERNAL_APP_PRIVATE_KEY }}

       - name: Apply any pre-commit fixed files
-        if: ${{ steps.pre-commit_results.outcome == 'failure' }}
-        # note: this runs "always" or else it's skipped when pre-commit fails
+        if: ${{ steps.pre-commit_results.outcome == 'failure' && steps.token.outputs.token != '' }}
         uses: getsentry/action-github-commit@5972d5f578ad77306063449e718c0c2a6fbc4ae1 # v2.1.0
         with:
           github-token: ${{ steps.token.outputs.token }}
50 changes: 27 additions & 23 deletions .github/workflows/tests.yml
@@ -31,23 +31,25 @@ jobs:
         uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3

       - id: "auth"
-        uses: google-github-actions/auth@3a3c4c57d294ef65efaaee4ff17b22fa88dd3c69 # v1
+        uses: google-github-actions/auth@v2
+        continue-on-error: true
         with:
-          workload_identity_provider: "projects/868781662168/locations/global/workloadIdentityPools/prod-github/providers/github-oidc-pool"
-          service_account: "[email protected]"
-          token_format: "id_token"
-          id_token_audience: "610575311308-9bsjtgqg4jm01mt058rncpopujgk3627.apps.googleusercontent.com"
-          id_token_include_email: true
-          create_credentials_file: true
+          credentials_json: ${{ secrets.GCP_SA_KEY }}

+      - name: Compute requirements hash
+        id: req-hash
+        run: echo "hash=$(sha256sum requirements.txt | cut -c1-8)" >> $GITHUB_OUTPUT
+
       - name: Build and push Docker image
         run: |
           make .env
+          # Use requirements hash in cache key to invalidate when deps change
+          CACHE_KEY="ghcr.io/${{ github.repository_owner }}/seer:cache-${{ steps.req-hash.outputs.hash }}"
           docker buildx bake --file docker-compose.yml --file docker-compose-cache.json \
-            --set *.cache-to=type=registry,ref=ghcr.io/getsentry/seer:cache,mode=max \
-            --set *.cache-from=type=registry,ref=ghcr.io/getsentry/seer:cache \
+            --set *.cache-to=type=registry,ref=${CACHE_KEY},mode=max \
+            --set *.cache-from=type=registry,ref=${CACHE_KEY} \
             --set *.output=type=registry \
-            --set *.tags=ghcr.io/getsentry/seer:cache-${{ github.sha }}
+            --set *.tags=ghcr.io/${{ github.repository_owner }}/seer:cache-${{ github.sha }}

   typecheck:
     needs: [build_and_push]
@@ -70,7 +72,7 @@ jobs:

       - name: Pull pre-built Docker image
         run: |
-          docker pull ghcr.io/getsentry/seer:${IMAGE_TAG}
+          docker pull ghcr.io/${{ github.repository_owner }}/seer:${IMAGE_TAG}
           cp docker-compose.ci.yml docker-compose.override.yml

       - name: Create blank .env
@@ -105,21 +107,17 @@ jobs:
           password: ${{ secrets.GITHUB_TOKEN }}

       - id: "auth"
-        uses: google-github-actions/auth@3a3c4c57d294ef65efaaee4ff17b22fa88dd3c69 # v1
+        uses: google-github-actions/auth@v2
+        continue-on-error: true
         with:
-          workload_identity_provider: "projects/868781662168/locations/global/workloadIdentityPools/prod-github/providers/github-oidc-pool"
-          service_account: "[email protected]"
-          token_format: "id_token"
-          id_token_audience: "610575311308-9bsjtgqg4jm01mt058rncpopujgk3627.apps.googleusercontent.com"
-          id_token_include_email: true
-          create_credentials_file: true
+          credentials_json: ${{ secrets.GCP_SA_KEY }}
Comment on lines 109 to +113
⚠️ Potential issue | 🟠 Major

Don’t mask GCP auth failures on non-fork runs.

continue-on-error: true can hide credential breakage on pushes, silently degrading test coverage. Prefer skipping the step when the secret is missing, and letting it fail otherwise.

🔧 Proposed fix
-      - id: "auth"
-        uses: google-github-actions/auth@v2
-        continue-on-error: true
-        with:
-          credentials_json: ${{ secrets.GCP_SA_KEY }}
+      - id: "auth"
+        if: ${{ secrets.GCP_SA_KEY != '' }}
+        uses: google-github-actions/auth@v2
+        with:
+          credentials_json: ${{ secrets.GCP_SA_KEY }}
🤖 Prompt for AI Agents
In @.github/workflows/tests.yml around lines 103 - 107, The auth step with id
"auth" currently uses continue-on-error: true which masks failures; change the
step to run only when the secret exists and remove continue-on-error so real
auth failures surface: replace continue-on-error: true with an if condition such
as if: ${{ secrets.GCP_SA_KEY != '' }} on the step that uses
google-github-actions/auth@v2 (id "auth") so the step is skipped when the secret
is missing (e.g., forks) but will run and fail normally when the secret is
present.
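
One caveat on the proposed fix: GitHub's documentation states that secrets cannot be referenced directly in `if:` conditionals, so `if: ${{ secrets.GCP_SA_KEY != '' }}` on a step is typically rejected when the workflow is validated. A sketch of the documented workaround, which maps the secret's presence to a job-level environment variable first. The job name `test` and the variable name `HAS_GCP_SA_KEY` are assumptions for illustration; the diff does not show the real job name.

```yaml
jobs:
  test:  # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Expands to 'true' or 'false'; only presence is recorded, not the secret value.
      HAS_GCP_SA_KEY: ${{ secrets.GCP_SA_KEY != '' }}
    steps:
      - id: "auth"
        # Skipped where the secret is absent (e.g. forks); fails normally otherwise.
        if: ${{ env.HAS_GCP_SA_KEY == 'true' }}
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
```

This keeps the intent of the comment: no `continue-on-error`, so a broken credential surfaces as a failed step when the secret is configured.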


       - name: Set up Cloud SDK
         uses: google-github-actions/setup-gcloud@e30db14379863a8c79331b04a9969f4c1e225e0b # v1

       - name: Pull pre-built Docker image
         run: |
-          docker pull ghcr.io/getsentry/seer:${IMAGE_TAG}
+          docker pull ghcr.io/${{ github.repository_owner }}/seer:${IMAGE_TAG}
           cp docker-compose.ci.yml docker-compose.override.yml

       - name: Create blank .env
@@ -138,16 +136,22 @@ jobs:

       - name: Fetch models
         if: github.event_name == 'push'
+        continue-on-error: true
         run: |
           rm -rf ./models
-          gcloud storage cp -r gs://sentry-ml/seer/models ./
+          gcloud storage cp -r gs://sentry-ml/seer/models ./ || {
+            echo "Models not accessible, using NO_REAL_MODELS mode"
+            mkdir -p models
+            echo "# Placeholder" > models/.keep
+          }
Comment on lines 137 to +146
⚠️ Potential issue | 🟠 Major

Fail on main if model fetch breaks.

On push, the fallback to placeholders can silently drop real-model coverage. For main (or release) branches, this should fail fast; allow fallback only for non-critical branches.

🔧 Proposed fix
       - name: Fetch models
         if: github.event_name == 'push'
-        continue-on-error: true
         run: |
           rm -rf ./models
-          gcloud storage cp -r gs://sentry-ml/seer/models ./ || {
-            echo "Models not accessible, using NO_REAL_MODELS mode"
-            mkdir -p models
-            echo "# Placeholder" > models/.keep
-          }
+          if gcloud storage cp -r gs://sentry-ml/seer/models ./; then
+            :
+          elif [[ "${{ github.ref }}" == "refs/heads/main" ]]; then
+            echo "Models fetch failed on main; aborting."
+            exit 1
+          else
+            echo "Models not accessible, using NO_REAL_MODELS mode"
+            mkdir -p models
+            echo "# Placeholder" > models/.keep
+          fi
🤖 Prompt for AI Agents
In @.github/workflows/tests.yml around lines 131 - 140, The "Fetch models" step
currently uses continue-on-error: true causing pushes to main/release to
silently fall back to placeholders; change the step named "Fetch models" so that
continue-on-error is conditional: set continue-on-error to an expression that is
false for main and release branches and true otherwise (use github.ref checks,
e.g. github.ref == 'refs/heads/main' or startsWith(github.ref,
'refs/heads/release') in the GitHub Actions expression) so pushes to
main/release fail fast while non-critical branches still allow the placeholder
fallback.
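
The agent prompt describes a different shape from the proposed fix: instead of branching inside the shell script, make `continue-on-error` itself an expression. A sketch of that variant, assuming release branches live under `refs/heads/release` as the prompt suggests (the PR does not confirm a release branch naming scheme).

```yaml
- name: Fetch models
  if: github.event_name == 'push'
  # Tolerate a failed fetch everywhere except main and release branches.
  continue-on-error: ${{ github.ref != 'refs/heads/main' && !startsWith(github.ref, 'refs/heads/release') }}
  run: |
    rm -rf ./models
    # No '|| { ... }' fallback here: it would make the command succeed
    # and the continue-on-error expression above would never matter.
    gcloud storage cp -r gs://sentry-ml/seer/models ./
```

On tolerated failures the `models` directory is simply absent, which the `Set test environment flags` step in this PR already treats as the `NO_REAL_MODELS` case through its `-d "./models"` check.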


       - name: Set test environment flags
         run: |
-          if [[ "${{ github.event_name }}" == "push" ]]; then
-            echo "EXTRA_COMPOSE_TEST_OPTIONS=-e NO_SENTRY_INTEGRATION=1 -e CI=1" >> $GITHUB_ENV
+          # Check if models directory has real models (not just placeholder)
+          if [[ -d "./models" && $(find ./models -type f ! -name '.keep' ! -name '.gitignore' | head -1) ]]; then
+            echo "EXTRA_COMPOSE_TEST_OPTIONS=-e NO_SENTRY_INTEGRATION=1 -e CI=1 -e GITHUB_TOKEN=${{ secrets.GH_PAT }}" >> $GITHUB_ENV
           else
-            echo "EXTRA_COMPOSE_TEST_OPTIONS=-e NO_REAL_MODELS=1 -e NO_SENTRY_INTEGRATION=1 -e CI=1" >> $GITHUB_ENV
+            echo "EXTRA_COMPOSE_TEST_OPTIONS=-e NO_REAL_MODELS=1 -e NO_SENTRY_INTEGRATION=1 -e CI=1 -e GITHUB_TOKEN=${{ secrets.GH_PAT }}" >> $GITHUB_ENV
           fi

       - name: Test with pytest
4 changes: 4 additions & 0 deletions .gitignore
@@ -151,3 +151,7 @@ dmypy.json

 # Cassettes
 tests/**/cassettes/
+
+# Local config files
+CLAUDE.md
+entrypoint.sh