The Toxicity Analyzer identifies modules that cannot be safely snapshotted and restored.
Some Python code creates state that cannot be reset via memory snapshots:
- Threading: Background threads, locks, condition variables
- Networking: Open sockets, connections
- Subprocesses: Child processes, file descriptors
- FFI: C extensions with global state
Tach detects these patterns statically and marks affected tests as "toxic", forcing them to run in isolated processes that exit after each test.
flowchart TB
subgraph Analysis["LOCAL ANALYSIS"]
Scan["Scan .py files"]
Parse["Parse AST"]
Detect["Detect toxic patterns"]
Report["ToxicityReport"]
end
subgraph Graph["GRAPH PROPAGATION"]
Build["Build dependency graph"]
Propagate["Fixed-point iteration"]
Tag["Tag all reachable modules"]
end
subgraph Output["OUTPUT"]
Safe["Safe Tests<br/>(Hypervisor Mode)"]
Toxic["Toxic Tests<br/>(Isolation Mode)"]
end
Analysis --> Graph --> Output
Result of analyzing a single file.
pub struct ToxicityReport {
    pub is_toxic: bool,
    pub reasons: Vec<String>,
    pub imports: Vec<String>,
}

| Field | Description |
|---|---|
| `is_toxic` | Whether the file contains toxic patterns |
| `reasons` | Human-readable explanations |
| `imports` | All detected imports (for graph construction) |
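To make the shape of a report concrete, here is a minimal sketch of local analysis. Tach itself is written in Rust; this version uses Python's standard `ast` module purely for illustration. The names `analyze_source` and `is_toxic_name`, the abbreviated `TOXIC_MODULES` set, and the submodule prefix matching are assumptions of the sketch, not Tach's API.

```python
import ast
from dataclasses import dataclass, field

# Illustrative subset of the toxic module lists documented on this page.
TOXIC_MODULES = {"threading", "socket", "ctypes", "concurrent.futures", "torch"}


@dataclass
class ToxicityReport:
    """Python mirror of the three fields of the Rust struct."""
    is_toxic: bool = False
    reasons: list[str] = field(default_factory=list)
    imports: list[str] = field(default_factory=list)


def is_toxic_name(module: str) -> bool:
    # Assumption: a submodule of a toxic module (e.g. torch.nn) is also toxic.
    return any(module == t or module.startswith(t + ".") for t in TOXIC_MODULES)


def analyze_source(source: str) -> ToxicityReport:
    report = ToxicityReport()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]  # relative imports without a module are skipped
        else:
            continue
        for name in names:
            report.imports.append(name)  # every import feeds graph construction
            if is_toxic_name(name):
                report.is_toxic = True
                report.reasons.append(f"Imports toxic module '{name}'")
    return report


report = analyze_source(
    "import os\nimport socket\nfrom concurrent.futures import ThreadPoolExecutor\n"
)
print(report)
```

Every import is recorded in `imports`, toxic or not, because the graph stage needs the full edge list; only listed modules contribute to `reasons`.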
Data stored in each graph node.
pub struct ModuleNode {
    pub name: String,
    pub path: PathBuf,
    pub is_toxic: bool,
    pub reasons: Vec<String>,
}

The dependency graph for toxicity propagation.
pub struct ToxicityGraph {
    graph: DiGraph<ModuleNode, ()>,
    name_to_node: HashMap<String, NodeIndex>,
    path_to_node: HashMap<PathBuf, NodeIndex>,
}

Uses `petgraph::graph::DiGraph`, where an edge A -> B means "A imports B".
const TOXIC_STD_LIB: &[&str] = &[
    "threading",
    "_thread",
    "multiprocessing",
    "socket",
    "ctypes",
    "signal",
    "concurrent.futures",
];

const TOXIC_EXTERNAL_MODULES: &[&str] = &[
    "grpc",
    "pandas",     // OpenMP threads
    "tensorflow", // CUDA state
    "torch",      // CUDA state
    "cv2",        // OpenCV threads
    "gevent",     // Greenlets
    "cffi",
];

| Pattern | Example | Reason |
|---|---|---|
| `__import__` | `__import__("threading")` | Runtime module loading |
| `exec` | `exec("import socket")` | Arbitrary code execution |
| `importlib.import_module` | `importlib.import_module("ctypes")` | Dynamic imports |
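The three dynamic patterns in the table can be recognized as call expressions in the AST. The following is a hedged Python sketch of that idea (Tach's real detector is Rust). The function name `dynamic_import_reasons` is hypothetical, and flagging every such call regardless of its argument is an assumption of the sketch.

```python
import ast


def dynamic_import_reasons(source: str) -> list[str]:
    """Return one reason per dynamic-loading call found in `source`."""
    reasons = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id == "__import__":
            reasons.append("Runtime module loading via __import__")
        elif isinstance(func, ast.Name) and func.id == "exec":
            reasons.append("Arbitrary code execution via exec")
        elif (
            isinstance(func, ast.Attribute)
            and func.attr == "import_module"
            and isinstance(func.value, ast.Name)
            and func.value.id == "importlib"
        ):
            reasons.append("Dynamic import via importlib.import_module")
    return reasons


sample = '''
import importlib
mod = __import__("threading")
exec("import socket")
ffi = importlib.import_module("ctypes")
'''
reasons = dynamic_import_reasons(sample)
for reason in reasons:
    print(reason)
```

Because the argument of a dynamic import is often not a literal, treating the call itself as the signal is the conservative choice in this sketch.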
from threading import *  # Toxic - imports Thread, Lock, etc.

Star imports from toxic modules are aggressively marked toxic.
import threading
t = threading.Thread(target=fn)  # Toxic call detected

Direct calls to functions from toxic modules are detected even with aliasing.
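One way to see how star imports and aliased calls can both be caught is to record which local names refer to toxic modules, then check every call against that table. This is a Python sketch under stated assumptions: `find_toxic_usage` and the abbreviated `TOXIC_MODULES` set are hypothetical, and the Rust implementation may resolve aliases differently.

```python
import ast

TOXIC_MODULES = {"threading", "socket", "ctypes"}  # illustrative subset


def find_toxic_usage(source: str) -> list[str]:
    tree = ast.parse(source)
    aliases = {}   # local name -> toxic module or member it refers to
    findings = []

    # Pass 1: collect aliases and flag star imports.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name in TOXIC_MODULES:
                    aliases[alias.asname or alias.name] = alias.name
        elif isinstance(node, ast.ImportFrom) and node.module in TOXIC_MODULES:
            for alias in node.names:
                if alias.name == "*":
                    findings.append(f"Star import from toxic module '{node.module}'")
                else:
                    local = alias.asname or alias.name
                    aliases[local] = f"{node.module}.{alias.name}"

    # Pass 2: flag calls made through any recorded alias.
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in aliases
        ):
            findings.append(f"Call to {aliases[func.value.id]}.{func.attr}")
        elif isinstance(func, ast.Name) and func.id in aliases:
            findings.append(f"Call to {aliases[func.id]}")
    return findings


print(find_toxic_usage("import threading as th\nt = th.Thread(target=print)\n"))
print(find_toxic_usage("from threading import *\n"))
```

A star import is flagged immediately because the names it binds cannot be enumerated without loading the module, so no alias table entry could cover them.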
Toxicity propagates transitively through the import graph:
graph TD
A[test_user.py] --> B[auth.py]
B --> C[crypto_utils.py]
C --> D[ctypes]
style D fill:#f66
style C fill:#f96
style B fill:#fc6
style A fill:#ff6
subgraph Legend
L1[Directly Toxic]
L2[Transitively Toxic]
end
1. Build directed graph: Module -> Imports
2. Analyze each module for LOCAL toxicity
3. Fixed-point iteration:
REPEAT:
FOR each edge (from, to):
IF to.is_toxic AND NOT from.is_toxic:
from.is_toxic = true
from.reasons.push("Imports toxic module '{to.name}'")
UNTIL no changes
4. Result: Complete transitive closure of toxicity
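As a worked example, the pseudocode can be run on the `test_user.py` chain from the diagram. This Python sketch uses a plain dictionary as the graph. The `propagate` signature here, the `"Locally toxic"` reason string, and the extra `test_math`/`math_utils` modules are hypothetical and exist only to show a clean branch staying clean.

```python
def propagate(imports: dict[str, list[str]], locally_toxic: set[str]) -> dict[str, list[str]]:
    """Return {module: reasons} for every toxic module, direct or transitive."""
    reasons = {name: ["Locally toxic"] for name in locally_toxic}
    changed = True
    while changed:  # fixed-point iteration: stop when a full pass changes nothing
        changed = False
        for importer, imported in imports.items():
            for target in imported:
                if target in reasons and importer not in reasons:
                    reasons[importer] = [f"Imports toxic module '{target}'"]
                    changed = True
    return reasons


# Edge "A -> B" means "A imports B", as in the dependency graph.
imports = {
    "test_user": ["auth"],
    "auth": ["crypto_utils"],
    "crypto_utils": ["ctypes"],
    "test_math": ["math_utils"],  # hypothetical clean branch
    "math_utils": [],
}
result = propagate(imports, locally_toxic={"ctypes"})
for module in sorted(result):
    print(module, result[module])
```

With this edge ordering the chain needs three passes to reach `test_user`, plus a final pass that confirms nothing changed. Each module is marked at most once, so the loop always terminates.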
The propagate method is an internal helper called by build():
impl ToxicityGraph {
    /// Private method - called internally by build()
    fn propagate(&mut self) {
        loop {
            let mut changed = false;
            // Collect edges to avoid borrow issues
            let edges: Vec<(NodeIndex, NodeIndex)> = self
                .graph
                .edge_indices()
                .filter_map(|e| self.graph.edge_endpoints(e))
                .collect();
            for (from_idx, to_idx) in edges {
                let to_toxic = self.graph[to_idx].is_toxic;
                let to_name = self.graph[to_idx].name.clone();
                if to_toxic && !self.graph[from_idx].is_toxic {
                    self.graph[from_idx].is_toxic = true;
                    self.graph[from_idx]
                        .reasons
                        .push(format!("Imports toxic module '{}'", to_name));
                    changed = true;
                }
            }
            if !changed {
                break;
            }
        }
    }
}

sequenceDiagram
participant Disc as Discovery
participant Tox as Toxicity
participant Sched as Scheduler
participant Work as Worker
Disc->>Tox: TestModule[]
Tox->>Tox: analyze_all()
Tox->>Tox: build_graph()
Tox->>Tox: propagate()
Tox->>Sched: RunnableTest[] with is_toxic
loop For each test
Sched->>Work: TestPayload{is_toxic}
alt is_toxic = false
Work->>Work: Apply Seccomp
Work->>Work: Run test
Work->>Work: Reset memory
else is_toxic = true
Work->>Work: Skip Seccomp
Work->>Work: Run test
Work->>Work: exit(0)
end
end
Imports inside `if TYPE_CHECKING:` blocks are skipped:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import threading  # NOT toxic - only for type hints

Currently, all imports are detected regardless of runtime conditions:
import sys

if sys.platform == "win32":
    import ctypes  # Still marked toxic

This is conservative but safe.
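Both behaviors can be illustrated with a single traversal that prunes `if TYPE_CHECKING:` bodies and descends into every other conditional. This is a Python sketch, not Tach's Rust code. `runtime_imports` is a hypothetical name, and visiting the `else` branch of a `TYPE_CHECKING` guard is an assumption.

```python
import ast


def runtime_imports(source: str) -> list[str]:
    """Collect imported module names, skipping `if TYPE_CHECKING:` bodies."""
    found = []

    def is_type_checking(test: ast.expr) -> bool:
        # Matches both `TYPE_CHECKING` and `typing.TYPE_CHECKING`.
        return (isinstance(test, ast.Name) and test.id == "TYPE_CHECKING") or (
            isinstance(test, ast.Attribute) and test.attr == "TYPE_CHECKING"
        )

    def visit(node: ast.AST) -> None:
        if isinstance(node, ast.If) and is_type_checking(node.test):
            # The guarded body never runs; the else branch does (assumption).
            for child in node.orelse:
                visit(child)
            return
        if isinstance(node, ast.Import):
            found.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.append(node.module)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return found


sample = '''
import sys
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import threading

if sys.platform == "win32":
    import ctypes
'''
print(runtime_imports(sample))
```

The platform check is not evaluated, so `ctypes` is still reported, while `threading` under the type-checking guard is not.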
Analyzes a single Python file for local toxicity.
pub fn analyze_file(source: &str, path: &Path) -> ToxicityReport

| Parameter | Description |
|---|---|
| `source` | Python source code as a string |
| `path` | Path to the file (used for error messages) |
Returns a ToxicityReport directly (not wrapped in Result).
Constructs the dependency graph from all project files.
pub fn build(paths: &[PathBuf], project_root: &Path) -> Self

| Parameter | Description |
|---|---|
| `paths` | List of Python file paths to analyze |
| `project_root` | Root directory for module name resolution |
This method:
- Indexes all files (path to module name)
- Analyzes each file for local toxicity
- Builds import edges
- Propagates toxicity transitively
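The indexing and edge-building steps can be sketched as follows. This is an illustration in Python, not the Rust `build()` itself. The helpers `module_name` and `build_edges` are hypothetical, and the `__init__.py` handling and POSIX-style paths are assumptions about how a path maps to a dotted module name.

```python
from pathlib import PurePosixPath


def module_name(path: str, project_root: str) -> str:
    """Map a file path to a dotted module name relative to the project root."""
    relative = PurePosixPath(path).relative_to(project_root)
    parts = list(relative.with_suffix("").parts)
    if parts and parts[-1] == "__init__":
        parts.pop()  # a package is named after its directory
    return ".".join(parts)


def build_edges(imports_by_path: dict[str, list[str]], project_root: str) -> list[tuple[str, str]]:
    """Return (importer, imported) edges between modules inside the project."""
    known = {module_name(path, project_root) for path in imports_by_path}
    edges = []
    for path, imports in imports_by_path.items():
        importer = module_name(path, project_root)
        for imported in imports:
            if imported in known:  # external modules get no node in this sketch
                edges.append((importer, imported))
    return edges


files = {
    "/repo/tests/test_user.py": ["pkg.auth", "os"],
    "/repo/pkg/auth.py": ["ctypes"],
    "/repo/pkg/__init__.py": [],
}
print(build_edges(files, "/repo"))
```

In this sketch an external toxic import such as `ctypes` creates no edge. It is expected to have already marked `pkg.auth` as locally toxic, and propagation then carries that mark along the project-internal edge.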
Queries whether a module is toxic (including transitively).
pub fn is_toxic(&self, path: &Path) -> bool

| Test Type | Seccomp | After Execution | Worker Fate |
|---|---|---|---|
| Safe | Applied | Memory reset | Continues in pool |
| Toxic | Skipped | `exit(0)` | Replaced |
Toxic workers skip Seccomp because they may legitimately need:
- `fork`/`exec` for subprocess tests
- `socket` for network tests
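The execution table can also be read as a small decision function. The sketch below encodes it in Python for illustration only. `ExecutionPlan`, `plan_for`, and `partition` are hypothetical names, and the real scheduler and worker are not implemented this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPlan:
    apply_seccomp: bool
    after_execution: str
    worker_fate: str


def plan_for(is_toxic: bool) -> ExecutionPlan:
    """Encode the two rows of the execution table."""
    if is_toxic:
        # Toxic tests may need fork/exec or sockets, so the filter is skipped
        # and the worker exits instead of being reset.
        return ExecutionPlan(False, "exit(0)", "replaced")
    return ExecutionPlan(True, "memory reset", "continues in pool")


def partition(tests: list[tuple[str, bool]]) -> tuple[list[str], list[str]]:
    """Split (name, is_toxic) pairs into safe and isolated test names."""
    safe = [name for name, toxic in tests if not toxic]
    isolated = [name for name, toxic in tests if toxic]
    return safe, isolated


safe, isolated = partition([("test_math", False), ("test_user", True)])
print(safe, plan_for(False))
print(isolated, plan_for(True))
```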
- Discovery Engine - How modules are found
- Iron Dome - How Seccomp is applied
- Scheduler - How tests are dispatched
This implementation is informed by the following research papers (see docs/pdfs/txt/ for full text):
| Paper | Key Contribution |
|---|---|
| Fork Safety of Python C-Extensions | Orphaned lock scenarios, async-signal-safety, "Poison Fork" triggers (OpenMP, CUDA, gRPC) |
| Rust Static Analysis for Toxic Python Modules | Taxonomy of import-time toxicity, ruff_python_parser integration, fixed-point iteration |
| Python Monorepo Zygote Tree Design | Toxicity propagation rules, contagion model ("if A imports toxic B, A is toxic") |
Implementation Note: Tach uses `rustpython-parser` for AST analysis. The research paper analyzed `ruff_python_parser` as an alternative, but the implementation chose `rustpython-parser` for API stability.
- Orphaned Locks: `fork()` only clones the calling thread; background threads (BLAS workers, gRPC pollers) vanish, leaving mutexes permanently locked
- POSIX Constraint: Post-fork, only async-signal-safe functions are safe to call, and the Python interpreter is NOT async-signal-safe
- Detection Patterns: `threading.Thread().start()`, `ssl.create_default_context()`, `multiprocessing.Pool()` at module scope (depth=0)
- C-Extension Blindspot: Static analysis cannot see into compiled `.so` files; consider `ld-linux.so` auditing for thread spawning detection
- `if __name__ == "__main__"` Guard: Must not flag code inside this guard as toxic (only runs when executed as main)
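The module-scope rule and the main-guard exemption can be combined in one traversal. This is a Python sketch under stated assumptions: `import_time_calls`, `dotted_name`, and `is_main_guard` are hypothetical helpers, aliases are not resolved, and decorators and default arguments (which do run at import time) are ignored for brevity.

```python
import ast
from typing import Optional

WATCHED_CALLS = {"threading.Thread", "ssl.create_default_context", "multiprocessing.Pool"}


def dotted_name(node: ast.AST) -> Optional[str]:
    """Return 'a.b' for Name/Attribute chains, or None for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def is_main_guard(node: ast.AST) -> bool:
    # Matches the common spelling `if __name__ == "__main__":` only.
    if not isinstance(node, ast.If) or not isinstance(node.test, ast.Compare):
        return False
    test = node.test
    return (
        isinstance(test.left, ast.Name)
        and test.left.id == "__name__"
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and isinstance(test.comparators[0], ast.Constant)
        and test.comparators[0].value == "__main__"
    )


def import_time_calls(source: str) -> list[str]:
    hits = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            return  # bodies run only when called, not at import time
        if is_main_guard(node):
            return  # runs only when the file is executed as a script
        if isinstance(node, ast.Call) and dotted_name(node.func) in WATCHED_CALLS:
            hits.append(dotted_name(node.func))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return hits


sample = '''
import threading, multiprocessing

threading.Thread(target=print).start()

def helper():
    return multiprocessing.Pool()

if __name__ == "__main__":
    multiprocessing.Pool()
'''
print(import_time_calls(sample))
```

Only the top-level `threading.Thread(...)` call is reported. The `Pool()` calls inside `helper` and under the main guard do not execute on import, so the sketch leaves them unflagged.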
See Research Overview for complete analysis.