[REVIEW] containment: add OT/ICS safety and operations evidence gates #1643

@yZangEren

Skill Being Reviewed

Skill name: containment
Skill path: skills/incident-response/containment/

False Positive Analysis

Benign code/configuration that can be misclassified:

Incident: suspicious engineering-workstation beaconing in an OT cell
Asset role: HMI jump host for a running batch process
Current plant state: safety interlocks healthy; process enters its controlled shutdown window in 45 minutes
Operations decision: keep controller network stable, isolate the IT/VPN ingress path, move the HMI to monitored read-only access, and prepare manual-mode fallback with the process engineer

Why this is a false positive:

The current containment matrix can make "full network isolation" look like the safest high-severity answer for a non-business-critical host. In OT/ICS, the first-order constraints are human safety, environmental safety, and process stability. Pulling network links, blocking controller traffic, or powering off an HMI or engineering workstation can create a safety or availability incident even when the cyber threat is real. The skill needs an OT/ICS branch that asks for process state, safety interlock status, operator/engineering approval, and manual fallback before recommending IT-style isolation.
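A minimal sketch of what such a branch could check, assuming the skill can represent a proposed action and the evidence collected so far. The names `DISRUPTIVE_ACTIONS`, `OtGateEvidence`, and `gate_decision` are hypothetical and are not part of the existing skill.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical action labels; the real skill may name these differently.
DISRUPTIVE_ACTIONS = {"full_network_isolation", "power_off", "block_controller_traffic"}


@dataclass
class OtGateEvidence:
    process_state: Optional[str] = None          # e.g. "running batch"
    interlocks_healthy: Optional[bool] = None     # None means "not yet confirmed"
    operations_approver: Optional[str] = None     # named operator or control engineer
    manual_fallback_ready: Optional[bool] = None  # None means "not yet confirmed"


def gate_decision(action: str, evidence: OtGateEvidence) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons). Non-disruptive actions pass without the gate."""
    if action not in DISRUPTIVE_ACTIONS:
        return True, []
    reasons = []
    if not evidence.process_state:
        reasons.append("process state not recorded")
    if evidence.interlocks_healthy is not True:
        reasons.append("safety interlock status not confirmed healthy")
    if not evidence.operations_approver:
        reasons.append("no named operator/engineering approval")
    if evidence.manual_fallback_ready is not True:
        reasons.append("manual fallback not prepared")
    return (not reasons), reasons
```

In the scenario above, isolating the IT/VPN ingress path would pass immediately, while full isolation of the HMI jump host would be held until all four items are recorded.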

Coverage Gaps

Missed variant 1: PLC/DCS control path containment without process-safety evidence

Observed activity:
  engineering workstation opens an unexpected outbound session
  same workstation is also the only online programming path to PLC-12

Proposed containment:
  disable switchport immediately

Missing evidence:
  process state, controller mode, safety interlock status, manual operating capability,
  approved maintenance window, and named control engineer approval

Why it should be caught:

The existing skill has strong host/network containment guidance, but it does not distinguish IT endpoints from OT assets whose connectivity may be part of a real-time control loop or safety process. A containment plan should require a plant/operations decision gate before disconnecting PLC, DCS, HMI, historian, safety-instrumented-system, or engineering-station paths.
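One way to express that decision gate is a role check followed by an evidence checklist. The sketch below is illustrative only: the role labels and evidence keys are assumptions derived from the lists in this variant, and `missing_evidence` is a hypothetical helper.

```python
# Hypothetical role labels for assets whose paths need a plant/operations gate.
OT_CONTROL_ROLES = {"plc", "dcs", "hmi", "historian", "sis", "engineering_station"}

# Evidence keys taken from the "Missing evidence" list in this variant.
REQUIRED_EVIDENCE = (
    "process_state",
    "controller_mode",
    "safety_interlock_status",
    "manual_operating_capability",
    "approved_maintenance_window",
    "control_engineer_approval",
)


def missing_evidence(asset_roles, provided):
    """List evidence still needed before disconnecting the asset.

    Assets with no OT control role return an empty list, so ordinary
    IT endpoints keep the existing containment flow.
    """
    roles = {role.lower() for role in asset_roles}
    if not roles & OT_CONTROL_ROLES:
        return []
    return [key for key in REQUIRED_EVIDENCE if not provided.get(key)]
```

For the workstation that is the only programming path to PLC-12, `missing_evidence(["engineering_station"], {"process_state": "running"})` would list the five remaining items, which argues against disabling the switchport immediately.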

Missed variant 2: ransomware/wiper response severs OT monitoring and backup telemetry

Action:
  block all SMB, RDP, WMI, DNS, and outbound traffic between OT and IT zones

Side effect:
  historian replication, OT jump-server authentication, log forwarding,
  and backup-status telemetry all stop at the same time

Why it should be caught:

The skill rightly favors aggressive containment for destructive malware, but OT environments need a staged communication matrix: which control traffic must remain, which IT ingress paths must close, which telemetry/logging paths stay one-way, and how backups and historian data are protected. Otherwise containment can blind responders or interrupt the process being protected.
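A staged communication matrix could be as simple as a lookup that a proposed block list is reviewed against. The flow names and dispositions below are hypothetical placeholders; a real site would build them from its own asset and traffic inventory.

```python
# Hypothetical flows and dispositions: "keep", "close", or "one_way".
COMM_MATRIX = {
    "controller_traffic": "keep",
    "jump_server_auth": "keep",
    "it_vpn_ingress": "close",
    "rdp_from_it": "close",
    "smb_it_ot": "close",
    "log_forwarding": "one_way",
    "historian_replication": "one_way",
    "backup_status_telemetry": "one_way",
}


def review_block_plan(flows_to_block):
    """Split a proposed block list into approved closures and conflicts.

    A flow conflicts when the matrix says it must be kept, must stay
    one-way, or has not been classified at all.
    """
    approved, conflicts = [], []
    for flow in flows_to_block:
        disposition = COMM_MATRIX.get(flow, "unclassified")
        if disposition == "close":
            approved.append(flow)
        else:
            conflicts.append((flow, disposition))
    return approved, conflicts
```

Treating unclassified flows as conflicts is a deliberate choice in this sketch: a blanket "block everything" rule then surfaces the telemetry and authentication paths it would sever instead of silently cutting them.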

Edge Cases

  • A safety-instrumented system may need to remain independent and untouched even while adjacent Windows hosts are isolated.
  • Some legacy controllers cannot tolerate active scanning, sudden network ACL changes, or broadcast/multicast disruption.
  • A site may choose a controlled process shutdown instead of immediate isolation when an abrupt disconnect is more dangerous than the current cyber activity.
  • Vendor remote access may be the compromise path, but it may also be needed for safe recovery; the plan should require explicit allowlisted break-glass approval and session recording if retained.
  • OT containment validation must include operator-visible process state, controller communications, alarm state, and historian/log continuity, not only "C2 blocked."
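The validation point in the last edge case can be sketched as a checklist in which "C2 blocked" is only one item. The check names are assumptions based on that bullet, and `validate_containment` is a hypothetical helper.

```python
# Hypothetical check names mirroring the validation items listed above.
OT_VALIDATION_CHECKS = (
    "c2_blocked",
    "operator_confirms_process_state",
    "controller_comms_healthy",
    "alarm_state_reviewed",
    "historian_log_continuity",
)


def validate_containment(results):
    """Return (passed, failed_checks).

    A check that is absent from `results` counts as failed, so an
    unreported item can never be mistaken for a passing one.
    """
    failed = [check for check in OT_VALIDATION_CHECKS if results.get(check) is not True]
    return (not failed), failed
```

A report containing only `{"c2_blocked": True}` would fail this validation and list the four OT-specific checks that still need evidence.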

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add an OT/ICS containment branch with safety-first decision gates, asset-role classification, operations/engineering approval, manual-mode fallback, staged segmentation, vendor remote-access controls, and OT-specific validation evidence.

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| Semgrep | No | This is operational response planning, not source-code syntax. |
| CodeQL | No | CodeQL does not model plant safety, controller mode, or containment authority. |
| NIST SP 800-82 Rev. 3 | Partial | Provides OT security guidance and emphasizes unique safety, reliability, and operational constraints, but this skill must encode those constraints into containment output. |
| CISA ICS incident response recommended practice | Partial | Addresses ICS-specific incident response capability and process-engineering challenges, but reviewers need concrete evidence gates in the containment skill. |

Overall Assessment

Strengths:

The skill is already useful for standard IT incidents: it covers NIST-style containment decision factors, network isolation, credential revocation, ATT&CK technique mapping, wiper/destructive-malware urgency, validation, rollback criteria, and output structure.

Needs improvement:

The skill treats containment mostly as an IT/network/identity decision. OT/ICS incidents need safety and process-continuity gates before recommending disruptive containment. Without those gates, a plan can be "cyber-correct" but operationally unsafe.

Priority recommendations:

  1. Add an OT/ICS Safety and Operations Gate before full isolation, shutdown, controller network ACL changes, or vendor remote-access changes.
  2. Extend the output template with fields for OT asset role, process state, safety interlock status, control engineer / operations approval, manual-mode fallback, maintenance window, and telemetry continuity.
  3. Add OT validation checks: controller communication health, alarm state, historian/log continuity, operator confirmation, backup isolation status, and post-containment process stability.
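As an illustration of recommendation 2, the extra template fields could be rendered so that gaps are visible in the plan itself. This is a sketch under the assumption that the output template is plain "Field: value" text; `OT_TEMPLATE_FIELDS` and `render_ot_section` are hypothetical names.

```python
# Field labels taken from recommendation 2; the rendering format is an assumption.
OT_TEMPLATE_FIELDS = (
    "OT asset role",
    "Process state",
    "Safety interlock status",
    "Control engineer / operations approval",
    "Manual-mode fallback",
    "Maintenance window",
    "Telemetry continuity",
)


def render_ot_section(values):
    """Render the OT fields as 'Field: value' lines, flagging gaps as MISSING."""
    return "\n".join(
        f"{field}: {values.get(field) or 'MISSING'}" for field in OT_TEMPLATE_FIELDS
    )
```

Printing `render_ot_section({"OT asset role": "HMI jump host"})` would show the asset role and six `MISSING` lines, which gives a reviewer an immediate view of what has not been confirmed.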

Sources Checked

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: Can provide preferred payment details privately after maintainer acceptance.
