Skip to content

Conversation

@FicoHu
Copy link
Collaborator

@FicoHu FicoHu commented Jan 5, 2026

Summary

  • Add detailed debug logging in shared/utils/crypto.py to troubleshoot AES key inconsistency between backend and executor
  • Log GIT_TOKEN_AES_KEY and GIT_TOKEN_AES_IV partial info (first 4 and last 4 chars) on initialization
  • Log encrypt/decrypt function calls with partial content info for debugging

Test plan

  • Deploy backend and executor with this change
  • Check logs to compare GIT_TOKEN_AES_KEY values between backend and executor
  • Verify if both services are using the same encryption key

Summary by CodeRabbit

  • Chores
    • Enhanced verbose logging for encryption and decryption operations to improve debugging and monitoring capabilities.

✏️ Tip: You can customize this high-level summary in your review settings.

Add detailed logging in shared/utils/crypto.py to help diagnose
AES key inconsistency issues between backend and executor:
- Log GIT_TOKEN_AES_KEY and GIT_TOKEN_AES_IV info on initialization
- Log encrypt/decrypt function calls with partial content info
- This helps identify if backend and executor are using different keys
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Walkthrough

Debug logging enhancements added to encryption and decryption helper functions in shared/utils/crypto.py. The changes introduce verbose logging to track key loading, encryption input/output characteristics, and decryption operations without modifying function signatures or public API behavior.

Changes

Cohort / File(s) Summary
Debug Logging Enhancement
shared/utils/crypto.py
Added verbose debug logging to _get_encryption_key, encrypt_sensitive_data, and decrypt_sensitive_data functions to track key characteristics (length, first/last characters), encryption inputs/outputs, and decryption operations. Warning log added to decrypt error path.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • cc-yafei
  • Micro66

Poem

🐰 Through keys and ciphers, now we see,
With logs that whisper 'neath the bytes with glee,
First chars and lengths dance in the debug light,
A bunny cryptographer logs it just right! 🔐

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately describes the main change: adding debug logging to troubleshoot AES key mismatches in the crypto utilities.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Fix all issues with AI Agents 🤖
In @shared/utils/crypto.py:
- Around line 122-126: The logging inside decrypt_sensitive_data currently uses
logger.info and prints a partial ciphertext sample; change the call to
logger.debug and remove any plaintext/ciphertext snippets—log only non-sensitive
metadata such as the encrypted_text length and a safe status message. Locate the
logger call that references encrypted_text and replace it with a logger.debug
that reports length and status (no substrings or content of encrypted_text).
- Around line 67-71: The logger call inside encrypt_sensitive_data currently
prints plaintext characters (first 4 chars), which must be removed; update the
logger.info in encrypt_sensitive_data to never include any portion of plain_text
and instead log only non-sensitive metadata such as length, type, and/or a safe
SHA-256 hex digest for correlation (compute hash from plain_text using
hashlib.sha256 and log the hex digest), and ensure any exception paths also
avoid printing plaintext.
- Around line 92-98: The log in encrypt_sensitive_data currently uses
logger.info and prints a ciphertext sample; change this to logger.debug and stop
logging partial ciphertext to avoid leaking data — log only a success message
and the encrypted length (e.g., logger.debug(f"encrypt_sensitive_data success -
encrypted length={len(encrypted_str)}")) and remove the substring sample; keep
the return encrypted_str as-is.
- Around line 34-41: The current logger.info calls expose partial AES key/IV
material (aes_key, aes_iv) which is a security risk; remove those prints and
instead log non-sensitive verification data: compute and log a SHA-256 (or other
secure) hash of aes_key and aes_iv (or a combined HMAC/checksum) and/or a
boolean indicating successful load and source, and emit a timestamped
key-rotation or load event; update the code around the logger.info calls
referencing aes_key and aes_iv to log only the hash values or flags and never
raw or partial secret material.
- Around line 150-156: The logger.info call in shared/utils/crypto.py currently
logs parts of the decrypted plaintext (variable decrypted_str) which leaks
secrets; remove any logging of decrypted content (including the first 4
characters) from the decryption path (e.g., inside decrypt_sensitive_data or
where decrypted_str is produced) and replace it with a non-sensitive success log
that only records operation status and safe metadata (e.g., "decryption
succeeded", length as an integer without printing content, or a boolean),
ensuring logger.info no longer references decrypted_str or any substring of it.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd9dca0 and bfd1560.

📒 Files selected for processing (1)
  • shared/utils/crypto.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{py,ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{py,ts,tsx,js,jsx}: All code comments MUST be written in English
File size MUST NOT exceed 1000 lines - split into multiple sub-modules if exceeded
Function length SHOULD NOT exceed 50 lines (preferred)

Files:

  • shared/utils/crypto.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code MUST follow PEP 8, use Black formatter with line length 88, and isort for imports
Python code MUST include type hints
Python functions and classes MUST have descriptive names and docstrings for public functions/classes
Python MUST extract magic numbers to named constants

Files:

  • shared/utils/crypto.py
🧠 Learnings (1)
📚 Learning: 2025-12-18T02:09:09.776Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-18T02:09:09.776Z
Learning: Backend MUST encrypt Git tokens and API keys using AES-256-CBC encryption

Applied to files:

  • shared/utils/crypto.py
🪛 Ruff (0.14.10)
shared/utils/crypto.py

98-98: Consider moving this statement to an else block

(TRY300)


156-156: Consider moving this statement to an else block

(TRY300)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E Tests (Shard 3/3)
  • GitHub Check: E2E Tests (Shard 1/3)
  • GitHub Check: E2E Tests (Shard 2/3)
  • GitHub Check: Test wegent CLI Integration

Comment on lines +34 to +41
# Debug logging: print key info for troubleshooting encryption mismatch
logger.info(
f"Loaded encryption keys from environment variables - "
f"GIT_TOKEN_AES_KEY: first 4 chars='{aes_key[:4]}', last 4 chars='{aes_key[-4:]}', length={len(aes_key)}"
)
logger.info(
f"GIT_TOKEN_AES_IV: first 4 chars='{aes_iv[:4]}', last 4 chars='{aes_iv[-4:]}', length={len(aes_iv)}"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRITICAL: Never log encryption key material, even partially.

Logging any portion of encryption keys (first/last characters) is a severe security vulnerability that:

  • Aids cryptanalysis and reduces effective key entropy
  • Violates cryptographic security principles and compliance standards (PCI-DSS, NIST, SOC 2)
  • Exposes secrets if logs are compromised, aggregated to external services, or accessed by unauthorized personnel
  • Creates an audit trail of sensitive cryptographic material

For troubleshooting key mismatches between services, use these safer alternatives:

  • Hash comparison: Log sha256(key) instead of partial key content—services can verify matching hashes without exposing key material
  • Key rotation events: Log when keys are rotated or updated (timestamp only)
  • Environment validation: Log boolean flags indicating whether keys are loaded from expected sources
  • Checksum/version tags: Use non-reversible identifiers
🔎 Safer alternative using hash-based verification
+import hashlib
+
 def _get_encryption_key():
     """Get or initialize encryption key and IV from environment variables"""
     global _aes_key, _aes_iv
     if _aes_key is None:
         # Load keys from environment variables
         aes_key = os.environ.get('GIT_TOKEN_AES_KEY', '12345678901234567890123456789012')
         aes_iv = os.environ.get('GIT_TOKEN_AES_IV', '1234567890123456')
         _aes_key = aes_key.encode('utf-8')
         _aes_iv = aes_iv.encode('utf-8')
-        # Debug logging: print key info for troubleshooting encryption mismatch
+        # Log hash for key verification (safe for troubleshooting)
+        key_hash = hashlib.sha256(aes_key.encode('utf-8')).hexdigest()[:16]
+        iv_hash = hashlib.sha256(aes_iv.encode('utf-8')).hexdigest()[:16]
         logger.info(
-            f"Loaded encryption keys from environment variables - "
-            f"GIT_TOKEN_AES_KEY: first 4 chars='{aes_key[:4]}', last 4 chars='{aes_key[-4:]}', length={len(aes_key)}"
+            f"Loaded encryption keys - "
+            f"GIT_TOKEN_AES_KEY hash: {key_hash}, length={len(aes_key)}"
         )
         logger.info(
-            f"GIT_TOKEN_AES_IV: first 4 chars='{aes_iv[:4]}', last 4 chars='{aes_iv[-4:]}', length={len(aes_iv)}"
+            f"GIT_TOKEN_AES_IV hash: {iv_hash}, length={len(aes_iv)}"
         )
     return _aes_key, _aes_iv
🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 34-41, The current logger.info calls
expose partial AES key/IV material (aes_key, aes_iv) which is a security risk;
remove those prints and instead log non-sensitive verification data: compute and
log a SHA-256 (or other secure) hash of aes_key and aes_iv (or a combined
HMAC/checksum) and/or a boolean indicating successful load and source, and emit
a timestamped key-rotation or load event; update the code around the logger.info
calls referencing aes_key and aes_iv to log only the hash values or flags and
never raw or partial secret material.

Comment on lines +67 to +71
# Debug logging: print plain text info
logger.info(
f"encrypt_sensitive_data called - plain_text length={len(plain_text)}, "
f"first 4 chars='{plain_text[:4] if len(plain_text) >= 4 else plain_text}'"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRITICAL: Never log sensitive plaintext data (git tokens, API keys).

This function encrypts git tokens and API keys—logging the first 4 characters of plaintext secrets is a severe security vulnerability:

  • Exposes partial credentials in logs accessible to developers, operators, log aggregation services, and backups
  • Violates the security requirement that sensitive data should only exist encrypted in persistent storage
  • Creates a breach vector if logs are compromised
  • Non-compliant with PCI-DSS, GDPR (secrets as PII), and SOC 2 controls

For debugging encryption issues, log only non-sensitive metadata:

  • Data length and type (already logged)
  • Success/failure status
  • Timing information
  • Hash of the input (if absolutely necessary for correlation)

Never log actual secret content, even partially.

🔎 Safer logging approach
-    # Debug logging: print plain text info
-    logger.info(
-        f"encrypt_sensitive_data called - plain_text length={len(plain_text)}, "
-        f"first 4 chars='{plain_text[:4] if len(plain_text) >= 4 else plain_text}'"
+    # Log encryption attempt with non-sensitive metadata only
+    logger.debug(
+        f"encrypt_sensitive_data called - input length={len(plain_text)}"
     )
🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 67-71, The logger call inside
encrypt_sensitive_data currently prints plaintext characters (first 4 chars),
which must be removed; update the logger.info in encrypt_sensitive_data to never
include any portion of plain_text and instead log only non-sensitive metadata
such as length, type, and/or a safe SHA-256 hex digest for correlation (compute
hash from plain_text using hashlib.sha256 and log the hex digest), and ensure
any exception paths also avoid printing plaintext.

Comment on lines +92 to +98
encrypted_str = base64.b64encode(encrypted_bytes).decode('utf-8')
# Debug logging: print encryption success info
logger.info(
f"encrypt_sensitive_data success - encrypted length={len(encrypted_str)}, "
f"first 10 chars='{encrypted_str[:10]}'"
)
return encrypted_str
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use logger.debug() for debug logging instead of logger.info().

Debug information should use logger.debug() level to prevent it from appearing in production logs by default. The info level is intended for significant operational events, not verbose debugging output.

Additionally, logging partial ciphertext samples, while less critical than logging plaintext, can still provide information useful for cryptanalysis. Consider logging only length and success status.

🔎 Recommended changes
     # Return base64 encoded encrypted data
     encrypted_str = base64.b64encode(encrypted_bytes).decode('utf-8')
-    # Debug logging: print encryption success info
-    logger.info(
-        f"encrypt_sensitive_data success - encrypted length={len(encrypted_str)}, "
-        f"first 10 chars='{encrypted_str[:10]}'"
+    # Log encryption success with metadata only
+    logger.debug(
+        f"encrypt_sensitive_data success - output length={len(encrypted_str)}"
     )
     return encrypted_str
-except Exception as e:
+except Exception as e:
     logger.error(f"Failed to encrypt sensitive data: {str(e)}")
     raise

Note: The static analysis tool (Ruff TRY300) suggests moving the return to an else block, but the current structure is acceptable for this use case.

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.14.10)

98-98: Consider moving this statement to an else block

(TRY300)

🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 92-98, The log in encrypt_sensitive_data
currently uses logger.info and prints a ciphertext sample; change this to
logger.debug and stop logging partial ciphertext to avoid leaking data — log
only a success message and the encrypted length (e.g.,
logger.debug(f"encrypt_sensitive_data success - encrypted
length={len(encrypted_str)}")) and remove the substring sample; keep the return
encrypted_str as-is.

Comment on lines +122 to +126
# Debug logging: print encrypted text info
logger.info(
f"decrypt_sensitive_data called - encrypted_text length={len(encrypted_text)}, "
f"first 10 chars='{encrypted_text[:10] if len(encrypted_text) >= 10 else encrypted_text}'"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use logger.debug() for debug logging and avoid logging ciphertext samples.

Same issues as the encryption logging:

  • Use logger.debug() instead of logger.info() for debug-only information
  • Avoid logging partial ciphertext samples; log only non-sensitive metadata (length, status)
🔎 Recommended changes
-    # Debug logging: print encrypted text info
-    logger.info(
-        f"decrypt_sensitive_data called - encrypted_text length={len(encrypted_text)}, "
-        f"first 10 chars='{encrypted_text[:10] if len(encrypted_text) >= 10 else encrypted_text}'"
+    # Log decryption attempt with metadata only
+    logger.debug(
+        f"decrypt_sensitive_data called - input length={len(encrypted_text)}"
     )
🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 122-126, The logging inside
decrypt_sensitive_data currently uses logger.info and prints a partial
ciphertext sample; change the call to logger.debug and remove any
plaintext/ciphertext snippets—log only non-sensitive metadata such as the
encrypted_text length and a safe status message. Locate the logger call that
references encrypted_text and replace it with a logger.debug that reports length
and status (no substrings or content of encrypted_text).

Comment on lines +150 to +156
decrypted_str = decrypted_bytes.decode('utf-8')
# Debug logging: print decryption success info
logger.info(
f"decrypt_sensitive_data success - decrypted length={len(decrypted_str)}, "
f"first 4 chars='{decrypted_str[:4] if len(decrypted_str) >= 4 else decrypted_str}'"
)
return decrypted_str
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRITICAL: Never log decrypted sensitive data.

This is identical to the pre-encryption logging issue—logging the first 4 characters of decrypted plaintext exposes git tokens and API keys in logs. This completely undermines the encryption security model.

After decryption, log only success status and metadata, never the actual decrypted content.

🔎 Safer logging approach
     # Return decrypted string
     decrypted_str = decrypted_bytes.decode('utf-8')
-    # Debug logging: print decryption success info
-    logger.info(
-        f"decrypt_sensitive_data success - decrypted length={len(decrypted_str)}, "
-        f"first 4 chars='{decrypted_str[:4] if len(decrypted_str) >= 4 else decrypted_str}'"
+    # Log decryption success with non-sensitive metadata only
+    logger.debug(
+        f"decrypt_sensitive_data success - output length={len(decrypted_str)}"
     )
     return decrypted_str
-except Exception as e:
+except Exception as e:
     logger.warning(f"Failed to decrypt sensitive data: {str(e)}")
     # Return the original text as fallback for backward compatibility
     return encrypted_text
🧰 Tools
🪛 Ruff (0.14.10)

156-156: Consider moving this statement to an else block

(TRY300)

🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 150-156, The logger.info call in
shared/utils/crypto.py currently logs parts of the decrypted plaintext (variable
decrypted_str) which leaks secrets; remove any logging of decrypted content
(including the first 4 characters) from the decryption path (e.g., inside
decrypt_sensitive_data or where decrypted_str is produced) and replace it with a
non-sensitive success log that only records operation status and safe metadata
(e.g., "decryption succeeded", length as an integer without printing content, or
a boolean), ensuring logger.info no longer references decrypted_str or any
substring of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants