-
Notifications
You must be signed in to change notification settings - Fork 67
fix: add debug logging for AES key mismatch troubleshooting #842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add detailed logging in shared/utils/crypto.py to help diagnose AES key inconsistency issues between backend and executor: - Log GIT_TOKEN_AES_KEY and GIT_TOKEN_AES_IV info on initialization - Log encrypt/decrypt function calls with partial content info - This helps identify if backend and executor are using different keys
📝 WalkthroughWalkthroughDebug logging enhancements added to encryption and decryption helper functions in Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Fix all issues with AI Agents 🤖
In @shared/utils/crypto.py:
- Around line 122-126: The logging inside decrypt_sensitive_data currently uses
logger.info and prints a partial ciphertext sample; change the call to
logger.debug and remove any plaintext/ciphertext snippets—log only non-sensitive
metadata such as the encrypted_text length and a safe status message. Locate the
logger call that references encrypted_text and replace it with a logger.debug
that reports length and status (no substrings or content of encrypted_text).
- Around line 67-71: The logger call inside encrypt_sensitive_data currently
prints plaintext characters (first 4 chars), which must be removed; update the
logger.info in encrypt_sensitive_data to never include any portion of plain_text
and instead log only non-sensitive metadata such as length, type, and/or a safe
SHA-256 hex digest for correlation (compute hash from plain_text using
hashlib.sha256 and log the hex digest), and ensure any exception paths also
avoid printing plaintext.
- Around line 92-98: The log in encrypt_sensitive_data currently uses
logger.info and prints a ciphertext sample; change this to logger.debug and stop
logging partial ciphertext to avoid leaking data — log only a success message
and the encrypted length (e.g., logger.debug(f"encrypt_sensitive_data success -
encrypted length={len(encrypted_str)}")) and remove the substring sample; keep
the return encrypted_str as-is.
- Around line 34-41: The current logger.info calls expose partial AES key/IV
material (aes_key, aes_iv) which is a security risk; remove those prints and
instead log non-sensitive verification data: compute and log a SHA-256 (or other
secure) hash of aes_key and aes_iv (or a combined HMAC/checksum) and/or a
boolean indicating successful load and source, and emit a timestamped
key-rotation or load event; update the code around the logger.info calls
referencing aes_key and aes_iv to log only the hash values or flags and never
raw or partial secret material.
- Around line 150-156: The logger.info call in shared/utils/crypto.py currently
logs parts of the decrypted plaintext (variable decrypted_str) which leaks
secrets; remove any logging of decrypted content (including the first 4
characters) from the decryption path (e.g., inside decrypt_sensitive_data or
where decrypted_str is produced) and replace it with a non-sensitive success log
that only records operation status and safe metadata (e.g., "decryption
succeeded", length as an integer without printing content, or a boolean),
ensuring logger.info no longer references decrypted_str or any substring of it.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
shared/utils/crypto.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{py,ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{py,ts,tsx,js,jsx}: All code comments MUST be written in English
File size MUST NOT exceed 1000 lines - split into multiple sub-modules if exceeded
Function length SHOULD NOT exceed 50 lines (preferred)
Files:
shared/utils/crypto.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Python code MUST follow PEP 8, use Black formatter with line length 88, and isort for imports
Python code MUST include type hints
Python functions and classes MUST have descriptive names and docstrings for public functions/classes
Python MUST extract magic numbers to named constants
Files:
shared/utils/crypto.py
🧠 Learnings (1)
📚 Learning: 2025-12-18T02:09:09.776Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-18T02:09:09.776Z
Learning: Backend MUST encrypt Git tokens and API keys using AES-256-CBC encryption
Applied to files:
shared/utils/crypto.py
🪛 Ruff (0.14.10)
shared/utils/crypto.py
98-98: Consider moving this statement to an else block
(TRY300)
156-156: Consider moving this statement to an else block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests (Shard 3/3)
- GitHub Check: E2E Tests (Shard 1/3)
- GitHub Check: E2E Tests (Shard 2/3)
- GitHub Check: Test wegent CLI Integration
| # Debug logging: print key info for troubleshooting encryption mismatch | ||
| logger.info( | ||
| f"Loaded encryption keys from environment variables - " | ||
| f"GIT_TOKEN_AES_KEY: first 4 chars='{aes_key[:4]}', last 4 chars='{aes_key[-4:]}', length={len(aes_key)}" | ||
| ) | ||
| logger.info( | ||
| f"GIT_TOKEN_AES_IV: first 4 chars='{aes_iv[:4]}', last 4 chars='{aes_iv[-4:]}', length={len(aes_iv)}" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Never log encryption key material, even partially.
Logging any portion of encryption keys (first/last characters) is a severe security vulnerability that:
- Aids cryptanalysis and reduces effective key entropy
- Violates cryptographic security principles and compliance standards (PCI-DSS, NIST, SOC 2)
- Exposes secrets if logs are compromised, aggregated to external services, or accessed by unauthorized personnel
- Creates an audit trail of sensitive cryptographic material
For troubleshooting key mismatches between services, use these safer alternatives:
- Hash comparison: Log
sha256(key)instead of partial key content—services can verify matching hashes without exposing key material - Key rotation events: Log when keys are rotated or updated (timestamp only)
- Environment validation: Log boolean flags indicating whether keys are loaded from expected sources
- Checksum/version tags: Use non-reversible identifiers
🔎 Safer alternative using hash-based verification
+import hashlib
+
def _get_encryption_key():
"""Get or initialize encryption key and IV from environment variables"""
global _aes_key, _aes_iv
if _aes_key is None:
# Load keys from environment variables
aes_key = os.environ.get('GIT_TOKEN_AES_KEY', '12345678901234567890123456789012')
aes_iv = os.environ.get('GIT_TOKEN_AES_IV', '1234567890123456')
_aes_key = aes_key.encode('utf-8')
_aes_iv = aes_iv.encode('utf-8')
- # Debug logging: print key info for troubleshooting encryption mismatch
+ # Log hash for key verification (safe for troubleshooting)
+ key_hash = hashlib.sha256(aes_key.encode('utf-8')).hexdigest()[:16]
+ iv_hash = hashlib.sha256(aes_iv.encode('utf-8')).hexdigest()[:16]
logger.info(
- f"Loaded encryption keys from environment variables - "
- f"GIT_TOKEN_AES_KEY: first 4 chars='{aes_key[:4]}', last 4 chars='{aes_key[-4:]}', length={len(aes_key)}"
+ f"Loaded encryption keys - "
+ f"GIT_TOKEN_AES_KEY hash: {key_hash}, length={len(aes_key)}"
)
logger.info(
- f"GIT_TOKEN_AES_IV: first 4 chars='{aes_iv[:4]}', last 4 chars='{aes_iv[-4:]}', length={len(aes_iv)}"
+ f"GIT_TOKEN_AES_IV hash: {iv_hash}, length={len(aes_iv)}"
)
return _aes_key, _aes_iv🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 34-41, The current logger.info calls
expose partial AES key/IV material (aes_key, aes_iv) which is a security risk;
remove those prints and instead log non-sensitive verification data: compute and
log a SHA-256 (or other secure) hash of aes_key and aes_iv (or a combined
HMAC/checksum) and/or a boolean indicating successful load and source, and emit
a timestamped key-rotation or load event; update the code around the logger.info
calls referencing aes_key and aes_iv to log only the hash values or flags and
never raw or partial secret material.
| # Debug logging: print plain text info | ||
| logger.info( | ||
| f"encrypt_sensitive_data called - plain_text length={len(plain_text)}, " | ||
| f"first 4 chars='{plain_text[:4] if len(plain_text) >= 4 else plain_text}'" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Never log sensitive plaintext data (git tokens, API keys).
This function encrypts git tokens and API keys—logging the first 4 characters of plaintext secrets is a severe security vulnerability:
- Exposes partial credentials in logs accessible to developers, operators, log aggregation services, and backups
- Violates the security requirement that sensitive data should only exist encrypted in persistent storage
- Creates a breach vector if logs are compromised
- Non-compliant with PCI-DSS, GDPR (secrets as PII), and SOC 2 controls
For debugging encryption issues, log only non-sensitive metadata:
- Data length and type (already logged)
- Success/failure status
- Timing information
- Hash of the input (if absolutely necessary for correlation)
Never log actual secret content, even partially.
🔎 Safer logging approach
- # Debug logging: print plain text info
- logger.info(
- f"encrypt_sensitive_data called - plain_text length={len(plain_text)}, "
- f"first 4 chars='{plain_text[:4] if len(plain_text) >= 4 else plain_text}'"
+ # Log encryption attempt with non-sensitive metadata only
+ logger.debug(
+ f"encrypt_sensitive_data called - input length={len(plain_text)}"
)🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 67-71, The logger call inside
encrypt_sensitive_data currently prints plaintext characters (first 4 chars),
which must be removed; update the logger.info in encrypt_sensitive_data to never
include any portion of plain_text and instead log only non-sensitive metadata
such as length, type, and/or a safe SHA-256 hex digest for correlation (compute
hash from plain_text using hashlib.sha256 and log the hex digest), and ensure
any exception paths also avoid printing plaintext.
| encrypted_str = base64.b64encode(encrypted_bytes).decode('utf-8') | ||
| # Debug logging: print encryption success info | ||
| logger.info( | ||
| f"encrypt_sensitive_data success - encrypted length={len(encrypted_str)}, " | ||
| f"first 10 chars='{encrypted_str[:10]}'" | ||
| ) | ||
| return encrypted_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Use logger.debug() for debug logging instead of logger.info().
Debug information should use logger.debug() level to prevent it from appearing in production logs by default. The info level is intended for significant operational events, not verbose debugging output.
Additionally, logging partial ciphertext samples, while less critical than logging plaintext, can still provide information useful for cryptanalysis. Consider logging only length and success status.
🔎 Recommended changes
# Return base64 encoded encrypted data
encrypted_str = base64.b64encode(encrypted_bytes).decode('utf-8')
- # Debug logging: print encryption success info
- logger.info(
- f"encrypt_sensitive_data success - encrypted length={len(encrypted_str)}, "
- f"first 10 chars='{encrypted_str[:10]}'"
+ # Log encryption success with metadata only
+ logger.debug(
+ f"encrypt_sensitive_data success - output length={len(encrypted_str)}"
)
return encrypted_str
-except Exception as e:
+except Exception as e:
logger.error(f"Failed to encrypt sensitive data: {str(e)}")
raiseNote: The static analysis tool (Ruff TRY300) suggests moving the return to an else block, but the current structure is acceptable for this use case.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.14.10)
98-98: Consider moving this statement to an else block
(TRY300)
🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 92-98, The log in encrypt_sensitive_data
currently uses logger.info and prints a ciphertext sample; change this to
logger.debug and stop logging partial ciphertext to avoid leaking data — log
only a success message and the encrypted length (e.g.,
logger.debug(f"encrypt_sensitive_data success - encrypted
length={len(encrypted_str)}")) and remove the substring sample; keep the return
encrypted_str as-is.
| # Debug logging: print encrypted text info | ||
| logger.info( | ||
| f"decrypt_sensitive_data called - encrypted_text length={len(encrypted_text)}, " | ||
| f"first 10 chars='{encrypted_text[:10] if len(encrypted_text) >= 10 else encrypted_text}'" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Use logger.debug() for debug logging and avoid logging ciphertext samples.
Same issues as the encryption logging:
- Use
logger.debug()instead oflogger.info()for debug-only information - Avoid logging partial ciphertext samples; log only non-sensitive metadata (length, status)
🔎 Recommended changes
- # Debug logging: print encrypted text info
- logger.info(
- f"decrypt_sensitive_data called - encrypted_text length={len(encrypted_text)}, "
- f"first 10 chars='{encrypted_text[:10] if len(encrypted_text) >= 10 else encrypted_text}'"
+ # Log decryption attempt with metadata only
+ logger.debug(
+ f"decrypt_sensitive_data called - input length={len(encrypted_text)}"
)🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 122-126, The logging inside
decrypt_sensitive_data currently uses logger.info and prints a partial
ciphertext sample; change the call to logger.debug and remove any
plaintext/ciphertext snippets—log only non-sensitive metadata such as the
encrypted_text length and a safe status message. Locate the logger call that
references encrypted_text and replace it with a logger.debug that reports length
and status (no substrings or content of encrypted_text).
| decrypted_str = decrypted_bytes.decode('utf-8') | ||
| # Debug logging: print decryption success info | ||
| logger.info( | ||
| f"decrypt_sensitive_data success - decrypted length={len(decrypted_str)}, " | ||
| f"first 4 chars='{decrypted_str[:4] if len(decrypted_str) >= 4 else decrypted_str}'" | ||
| ) | ||
| return decrypted_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Never log decrypted sensitive data.
This is identical to the pre-encryption logging issue—logging the first 4 characters of decrypted plaintext exposes git tokens and API keys in logs. This completely undermines the encryption security model.
After decryption, log only success status and metadata, never the actual decrypted content.
🔎 Safer logging approach
# Return decrypted string
decrypted_str = decrypted_bytes.decode('utf-8')
- # Debug logging: print decryption success info
- logger.info(
- f"decrypt_sensitive_data success - decrypted length={len(decrypted_str)}, "
- f"first 4 chars='{decrypted_str[:4] if len(decrypted_str) >= 4 else decrypted_str}'"
+ # Log decryption success with non-sensitive metadata only
+ logger.debug(
+ f"decrypt_sensitive_data success - output length={len(decrypted_str)}"
)
return decrypted_str
-except Exception as e:
+except Exception as e:
logger.warning(f"Failed to decrypt sensitive data: {str(e)}")
# Return the original text as fallback for backward compatibility
return encrypted_text🧰 Tools
🪛 Ruff (0.14.10)
156-156: Consider moving this statement to an else block
(TRY300)
🤖 Prompt for AI Agents
In @shared/utils/crypto.py around lines 150-156, The logger.info call in
shared/utils/crypto.py currently logs parts of the decrypted plaintext (variable
decrypted_str) which leaks secrets; remove any logging of decrypted content
(including the first 4 characters) from the decryption path (e.g., inside
decrypt_sensitive_data or where decrypted_str is produced) and replace it with a
non-sensitive success log that only records operation status and safe metadata
(e.g., "decryption succeeded", length as an integer without printing content, or
a boolean), ensuring logger.info no longer references decrypted_str or any
substring of it.
Summary
shared/utils/crypto.pyto troubleshoot AES key inconsistency between backend and executorGIT_TOKEN_AES_KEYandGIT_TOKEN_AES_IVpartial info (first 4 and last 4 chars) on initializationTest plan
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.