- NO BREAKING CHANGES - Existing LLVM pipeline MUST continue to work
- BACKWARD COMPATIBLE - All existing passes (OLLVM, MLIR) must work unchanged
- OPTIONAL FEATURE - ClangIR/Polygeist support is an OPT-IN enhancement
- DEFAULT BEHAVIOR - System defaults to current working pipeline
C/C++ Source
↓
Clang → LLVM IR
↓
mlir-translate --import-llvm → MLIR
↓
MLIR Passes (string-encrypt, symbol-obfuscate, crypto-hash, constant-obfuscate)
↓
mlir-translate --mlir-to-llvmir → LLVM IR
↓
OLLVM Passes (optional)
↓
Clang → Binary
C/C++ Source
↓
┌─────────────────────────────────┐
│ FRONTEND CHOICE (NEW!) │
│ ├─ Option 1: Clang (default) │ ← EXISTING (NO CHANGE)
│ ├─ Option 2: ClangIR (new) │ ← NEW FEATURE
│ └─ Option 3: Polygeist (new) │ ← NEW FEATURE
└─────────────────────────────────┘
↓
High-Level MLIR (ClangIR or Polygeist dialect)
↓
MLIR Passes (string-encrypt, symbol-obfuscate, crypto-hash, constant-obfuscate)
↓
Lower to LLVM Dialect MLIR
↓
mlir-translate --mlir-to-llvmir → LLVM IR
↓
OLLVM Passes (optional)
↓
Clang → Binary
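
The two diagrams above differ only in the stages that run before the MLIR passes. A minimal sketch of that difference follows, assuming stage labels taken from the diagrams; `pipeline_stages` and `_COMMON_TAIL` are hypothetical names used for illustration, not part of the existing code base.

```python
from enum import Enum
from typing import List


class MLIRFrontend(str, Enum):
    CLANG = "clang"          # default, existing behavior
    CLANGIR = "clangir"      # opt-in
    POLYGEIST = "polygeist"  # opt-in


# Stages every frontend shares once the MLIR passes have run (labels only).
_COMMON_TAIL = [
    "mlir-translate --mlir-to-llvmir",
    "OLLVM passes (optional)",
    "clang -> binary",
]


def pipeline_stages(frontend: MLIRFrontend = MLIRFrontend.CLANG) -> List[str]:
    """Return ordered stage labels for a frontend (illustrative, runs nothing)."""
    if frontend is MLIRFrontend.CLANG:
        # Existing path: LLVM IR is imported into MLIR before the passes.
        head = ["clang -> LLVM IR", "mlir-translate --import-llvm", "MLIR passes"]
    else:
        # New paths: the frontend emits high-level MLIR directly.
        head = [
            f"{frontend.value} -> high-level MLIR",
            "MLIR passes",
            "lower to LLVM dialect",
        ]
    return head + _COMMON_TAIL
```

Calling `pipeline_stages()` with no argument returns the current pipeline, which mirrors the requirement that the default stays unchanged.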
File: Dockerfile.test
# Add ClangIR and Polygeist repositories (LLVM 22.0.0)
# This is ADDITIVE ONLY - doesn't change existing tools

Risk:
File: cmd/llvm-obfuscator/core/config.py
from dataclasses import dataclass
from enum import Enum

# Add new enum (doesn't break existing code)
class MLIRFrontend(str, Enum):
    CLANG = "clang"          # DEFAULT - existing behavior
    CLANGIR = "clangir"      # NEW
    POLYGEIST = "polygeist"  # NEW

# Add new field to ObfuscationConfig (with default)
@dataclass
class ObfuscationConfig:
    # ... existing fields ...
    mlir_frontend: MLIRFrontend = MLIRFrontend.CLANG  # DEFAULT to existing

Risk:
File: cmd/llvm-obfuscator/core/obfuscator.py
def _compile(self, source, destination, config, ...):
    # EXISTING CODE PATH (default)
    if config.mlir_frontend == MLIRFrontend.CLANG:
        # Current implementation - NO CHANGES
        self._compile_with_clang_llvm(...)  # Existing code
    # NEW CODE PATHS (opt-in)
    elif config.mlir_frontend == MLIRFrontend.CLANGIR:
        self._compile_with_clangir(...)  # New function
    elif config.mlir_frontend == MLIRFrontend.POLYGEIST:
        self._compile_with_polygeist(...)  # New function

Risk:
File: cmd/llvm-obfuscator/core/obfuscator.py
def _compile_with_clangir(self, ...):
    """NEW FUNCTION - doesn't touch existing code"""
    # 1. clangir → MLIR
    # 2. Apply MLIR passes (same as existing)
    # 3. Lower to LLVM IR
    # 4. Continue with existing pipeline
    pass

def _compile_with_polygeist(self, ...):
    """NEW FUNCTION - doesn't touch existing code"""
    # 1. polygeist → MLIR
    # 2. Apply MLIR passes (same as existing)
    # 3. Lower to LLVM IR
    # 4. Continue with existing pipeline
    pass

Risk:
# Test 1: Existing MLIR pipeline (string-encrypt)
python3 -m cli.obfuscate compile test.c \
--enable-string-encrypt \
--output ./test1
# MUST WORK - no changes to this path
# Test 2: Existing OLLVM pipeline
python3 -m cli.obfuscate compile test.c \
--enable-flattening \
--output ./test2
# MUST WORK - no changes to this path
# Test 3: Combined MLIR + OLLVM
python3 -m cli.obfuscate compile test.c \
--enable-constant-obfuscate \
--enable-crypto-hash \
--enable-flattening \
--output ./test3
# MUST WORK - no changes to this path

# Test 4: ClangIR pipeline (new)
python3 -m cli.obfuscate compile test.c \
--mlir-frontend clangir \
--enable-constant-obfuscate \
--output ./test4
# NEW FEATURE - should work with ClangIR
# Test 5: Polygeist pipeline (new)
python3 -m cli.obfuscate compile test.c \
--mlir-frontend polygeist \
--enable-crypto-hash \
--output ./test5
# NEW FEATURE - should work with Polygeist

- Default behavior unchanged - No --mlir-frontend flag = existing behavior
- Existing passes work - All MLIR/OLLVM passes unchanged
- No API changes - Existing CLI flags work identically
- Isolated new code - ClangIR/Polygeist in separate functions
- Conditional execution - New code only runs when explicitly requested
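
The isolation and conditional-execution guarantees above can be sketched as a dispatch table that records which path ran. This is an illustration only: `DispatchSketch` and its `calls` list are hypothetical, and the real `_compile` takes more arguments than shown here.

```python
from enum import Enum
from typing import List


class MLIRFrontend(str, Enum):
    CLANG = "clang"
    CLANGIR = "clangir"
    POLYGEIST = "polygeist"


class DispatchSketch:
    """Hypothetical stand-in for the obfuscator; records which path ran."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def compile(self, frontend=MLIRFrontend.CLANG) -> None:
        # MLIRFrontend(...) raises ValueError for unknown values, so a typo
        # can never fall through silently to one of the new code paths.
        frontend = MLIRFrontend(frontend)
        handlers = {
            MLIRFrontend.CLANG: self._compile_with_clang_llvm,
            MLIRFrontend.CLANGIR: self._compile_with_clangir,
            MLIRFrontend.POLYGEIST: self._compile_with_polygeist,
        }
        handlers[frontend]()

    def _compile_with_clang_llvm(self) -> None:
        self.calls.append("clang")  # existing path

    def _compile_with_clangir(self) -> None:
        self.calls.append("clangir")  # new, opt-in

    def _compile_with_polygeist(self) -> None:
        self.calls.append("polygeist")  # new, opt-in
```

With no argument, only the existing path is recorded; the new functions run only when their frontend is named explicitly.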
- Dockerfile changes - MITIGATION: Only ADD tools, don't REMOVE or MODIFY
- Config schema - MITIGATION: New fields have safe defaults
- Pipeline logic - MITIGATION: Existing path in separate function, untouched
- Add ClangIR build from source (LLVM 22.0.0)
- Add Polygeist build from source (LLVM 22.0.0)
- Keep all existing tools (clang, mlir-opt, etc.)
- Add MLIRFrontend enum
- Add mlir_frontend field with default = CLANG
- Update from_dict() to parse new field (optional)
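
One way the from_dict() update could treat the new field as optional is sketched below. The config class is trimmed to the single new field, and both the `from_dict` signature and the `"mlir_frontend"` dictionary key are assumptions; the existing method may look different.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping


class MLIRFrontend(str, Enum):
    CLANG = "clang"
    CLANGIR = "clangir"
    POLYGEIST = "polygeist"


@dataclass
class ObfuscationConfig:
    # Existing fields omitted; only the new field is shown in this sketch.
    mlir_frontend: MLIRFrontend = MLIRFrontend.CLANG

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "ObfuscationConfig":
        # A missing key falls back to the existing Clang frontend, so
        # configuration files written before this change parse unchanged.
        raw = data.get("mlir_frontend", MLIRFrontend.CLANG.value)
        return cls(mlir_frontend=MLIRFrontend(raw))
```

An unrecognized value raises `ValueError` from the enum constructor, which keeps a misspelled frontend from being accepted silently.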
- Extract current MLIR pipeline to _compile_with_clang_llvm()
- No logic changes, just code movement
- Test that existing behavior still works
- Implement _compile_with_clangir()
- Only executes when frontend = CLANGIR
- No impact on existing code
- Implement _compile_with_polygeist()
- Only executes when frontend = POLYGEIST
- No impact on existing code
- Document new --mlir-frontend flag
- Emphasize default behavior (clang)
- Provide migration guide
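
For the documentation, the flag's default could be illustrated with a small parser sketch. It assumes an `argparse`-based CLI; the real `cli.obfuscate` entry point may use a different library and has a `compile` subcommand that is left out here, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a trimmed-down parser that only models --mlir-frontend."""
    parser = argparse.ArgumentParser(prog="obfuscate")
    parser.add_argument("source", help="C/C++ source file")
    parser.add_argument(
        "--mlir-frontend",
        choices=["clang", "clangir", "polygeist"],
        default="clang",  # omitting the flag keeps the existing pipeline
        help="frontend used to produce MLIR (default: clang)",
    )
    return parser
```

Parsing `["test.c"]` yields `mlir_frontend == "clang"`, so existing command lines behave as before; a value outside `choices` makes `argparse` exit with a usage error.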
If anything breaks:
- Git revert - All changes in isolated commits
- Feature flag - Can disable ClangIR/Polygeist via config
- Default safe - System defaults to working state
| File | Change Type | Risk | Rollback |
|---|---|---|---|
| Dockerfile.test | ADDITIVE | LOW | Remove new RUN commands |
| config.py | ADDITIVE | VERY LOW | Remove new enum/field |
| obfuscator.py | REFACTOR + ADD | LOW | Git revert |
| MLIR_INTEGRATION_GUIDE.md | UPDATE | NONE | N/A |
✅ MUST PASS:
- All existing tests pass without changes
- Existing CLI commands work identically
- No changes to MLIR pass implementations
- No changes to OLLVM pass implementations
- Default behavior (no flags) produces same output
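
The last criterion could be checked by comparing a binary built before the change with one built after it. The sketch below assumes the default pipeline is deterministic for the chosen flags; if a pass is randomized, a byte-for-byte comparison would not apply and a behavioral test would be needed instead. `file_digest` and `same_output` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_output(baseline: Path, candidate: Path) -> bool:
    """True when both build outputs are byte-for-byte identical."""
    return file_digest(baseline) == file_digest(candidate)
```

A regression script might build `test.c` with no frontend flag on the old and new code and assert `same_output(old_binary, new_binary)`.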
✅ NICE TO HAVE:
- ClangIR frontend works for simple C files
- Polygeist frontend works for simple C files
- New frontends combine with MLIR passes (constant-obfuscate, crypto-hash)
- ✅ Review this plan with user
- ⏭️ Implement Step 1 (Dockerfile updates)
- ⏭️ Implement Step 2 (config.py updates)
- ⏭️ Implement Step 3 (obfuscator.py refactor)
- ⏭️ Test existing pipeline (regression tests)
- ⏭️ Implement Step 4 (ClangIR support)
- ⏭️ Implement Step 5 (Polygeist support)
- ⏭️ Final integration testing
CRITICAL: At EVERY step, we run regression tests to ensure existing functionality is NOT broken.
Version: 1.0.0 | Target LLVM: 22.0.0 | Backward Compatible: YES ✅