Add C/C++ language support for systems-level code validation #9

@rudra496

Summary

Add C and C++ language support. This is critical because AI-generated C/C++ code can contain dangerous memory safety issues that memory-managed languages largely rule out.

Unique C/C++ patterns to detect

Security (Memory Safety)

  • strcpy(), strcat(), sprintf() — buffer overflow
  • malloc() without free() — memory leaks
  • scanf("%s", buf) — stack buffer overflow
  • gets() — always dangerous, removed in C11
  • memcpy() without size validation
  • Use-after-free patterns
  • Double-free patterns
  • Null pointer dereference
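
A minimal sketch of how the call-based items in this list might be matched line by line. The rule IDs, messages, and the `scan_memory_safety` helper are hypothetical names chosen for illustration, not part of the existing tool. Items such as use-after-free or double-free need data-flow tracking and are deliberately left out.

```python
import re

# Hypothetical rule table: (rule id, compiled regex, message).
# \b keeps the patterns from matching bounded variants such as
# strncpy, snprintf, or fgets.
UNSAFE_CALL_RULES = [
    ("c-strcpy", re.compile(r"\b(strcpy|strcat|sprintf)\s*\("),
     "unbounded copy/format; possible buffer overflow"),
    ("c-gets", re.compile(r"\bgets\s*\("),
     "gets() cannot bound its input"),
    ("c-scanf-s", re.compile(r'\bscanf\s*\(\s*"[^"]*%s'),
     "scanf with %s and no field width"),
]


def scan_memory_safety(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, rule id, message) for each suspicious call."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Crude comment stripping: also truncates string literals that
        # contain "//". A real implementation would need a tokenizer.
        code = line.split("//", 1)[0]
        for rule_id, pattern, message in UNSAFE_CALL_RULES:
            if pattern.search(code):
                findings.append((lineno, rule_id, message))
    return findings


SAMPLE = '''\
#include <stdio.h>
#include <string.h>
int main(void) {
    char buf[16];
    gets(buf);
    scanf("%s", buf);
    fgets(buf, sizeof buf, stdin); // bounded, not flagged
    strcpy(buf, "hello");
    return 0;
}
'''

for lineno, rule_id, message in scan_memory_safety(SAMPLE):
    print(f"line {lineno}: [{rule_id}] {message}")
```

Matching per line keeps reporting simple but misses calls split across lines, which is one reason an optional compiler-backed pass may still be worthwhile.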

Security (General)

  • system(), popen() — command injection
  • Hardcoded file paths (/etc/passwd, /tmp/)
  • Insecure rand() — predictable random numbers
  • setuid without proper checks
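
One way the general-security checks could be approximated, assuming the same line-based regex approach. The pattern names and `scan_general_security` are hypothetical. The sketch flags `system()`/`popen()` only when the first argument is not a string literal, on the assumption that a constant command is much less likely to be injectable.

```python
import re

# First argument does not start with a double quote, i.e. it is not a
# plain string literal. The [\s"] lookahead stops \s* from backtracking
# onto a space in front of the quote.
DYNAMIC_COMMAND = re.compile(r'\b(system|popen)\s*\(\s*(?![\s"])')
# rand() with no arguments; \b avoids matching srand().
WEAK_RANDOM = re.compile(r"\brand\s*\(\s*\)")
# String literals that begin with one of the paths named in the issue.
HARDCODED_PATH = re.compile(r'"(/etc/|/tmp/)[^"]*"')


def scan_general_security(source: str) -> list[tuple[int, str]]:
    """Return (line number, rule id) pairs for the three heuristics above."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if DYNAMIC_COMMAND.search(line):
            findings.append((lineno, "dynamic-command"))
        if WEAK_RANDOM.search(line):
            findings.append((lineno, "weak-random"))
        if HARDCODED_PATH.search(line):
            findings.append((lineno, "hardcoded-path"))
    return findings


SAMPLE = '''\
system("ls");
system(user_cmd);
int token = rand();
FILE *f = fopen("/tmp/session", "w");
'''

print(scan_general_security(SAMPLE))
```

The setuid item is omitted here because judging whether checks are "proper" needs more context than a single line provides.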

Hallucinations

  • Non-existent headers (#include <magic_lib.h>)
  • Invented POSIX functions
  • Fake Win32 API calls
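
Hallucinated headers could be caught by comparing angle-bracket includes against an allowlist. The sketch below assumes a small hand-picked set of standard C and POSIX headers; a real list would be much longer and probably loaded from data. `KNOWN_HEADERS` and `unknown_headers` are illustrative names.

```python
import re

# Deliberately small allowlist (an assumption for this sketch): a few
# standard C headers plus some common POSIX ones.
KNOWN_HEADERS = frozenset({
    "assert.h", "ctype.h", "errno.h", "limits.h", "math.h", "signal.h",
    "stdbool.h", "stddef.h", "stdint.h", "stdio.h", "stdlib.h",
    "string.h", "time.h",
    "fcntl.h", "pthread.h", "unistd.h",
})

# Only <...> includes are checked; "..." includes are treated as
# project-local files and skipped.
INCLUDE_RE = re.compile(r"^\s*#\s*include\s*<([^>]+)>", re.MULTILINE)


def unknown_headers(source: str, known=KNOWN_HEADERS) -> list[str]:
    """Return angle-bracket headers that are not in the allowlist."""
    return [h for h in INCLUDE_RE.findall(source) if h not in known]


SAMPLE = '#include <stdio.h>\n#include <magic_lib.h>\n#include "local.h"\n'
print(unknown_headers(SAMPLE))
```

The same allowlist idea could extend to POSIX function names and Win32 API calls, at the cost of maintaining larger symbol tables.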

Logic

  • Off-by-one errors in loops
  • Uninitialized variables
  • Missing return statements
  • Integer overflow/underflow
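
Logic issues are harder to catch with regex than unsafe calls. As a rough illustration only, the sketch below flags `for` loops whose condition uses `<=` against a name or number that also appears as an array size in the same file. The helper name and both patterns are hypothetical, and the heuristic will produce both false positives and false negatives.

```python
import re

# "type name[SIZE];" with SIZE a single identifier or integer literal.
# This also matches non-declarations such as "return a[N];".
ARRAY_DECL = re.compile(r"\b\w+\s+(\w+)\s*\[\s*(\w+)\s*\]\s*;")
# "for (...; var <= BOUND; ...)" on one line.
LOOP_LE = re.compile(r"\bfor\s*\([^;]*;\s*\w+\s*<=\s*(\w+)\s*;")


def suspicious_loop_bounds(source: str) -> list[int]:
    """Return line numbers of `<=` loops bounded by a declared array size."""
    sizes = {size for _name, size in ARRAY_DECL.findall(source)}
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = LOOP_LE.search(line)
        if match and match.group(1) in sizes:
            flagged.append(lineno)
    return flagged


SAMPLE = '''\
int a[N];
for (int i = 0; i <= N; i++) a[i] = 0;
for (int i = 0; i < N; i++) a[i] = 0;
'''

print(suspicious_loop_bounds(SAMPLE))
```

Uninitialized variables, missing returns, and integer overflow are probably better served by compiler diagnostics than by patterns like this.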

Approach

Because building a full C/C++ AST in Python is complex, consider:

  1. Pattern-based regex analysis (estimated to cover roughly 80% of cases)
  2. Optional integration with clang via subprocess for full AST analysis
  3. Focus on the most dangerous patterns first
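
The optional clang step might look like the following sketch, which shells out to `clang -fsyntax-only` and returns its diagnostics, falling back to `None` when clang is not installed so the regex pass can still run alone. The function names are hypothetical. For a full AST, clang can also emit JSON via `-Xclang -ast-dump=json`, which is not shown here.

```python
import shutil
import subprocess
from typing import Optional


def clang_syntax_check_cmd(path: str, clang: str = "clang") -> list[str]:
    """Build the argument list for a parse-only clang run with warnings on."""
    return [clang, "-fsyntax-only", "-Wall", "-Wextra", path]


def run_clang_check(path: str) -> Optional[str]:
    """Return clang's diagnostics for `path`, or None if clang is missing."""
    clang = shutil.which("clang")
    if clang is None:
        return None  # caller falls back to regex-only analysis
    result = subprocess.run(
        clang_syntax_check_cmd(path, clang),
        capture_output=True,
        text=True,
        check=False,   # a non-zero exit just means diagnostics were emitted
        timeout=30,    # arbitrary limit chosen for this sketch
    )
    return result.stderr
```

Passing the command as a list rather than a shell string avoids the very command-injection pattern this issue proposes to detect.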

Acceptance Criteria

  • C files (.c, .h) and C++ files (.cpp, .hpp, .cc) auto-detected
  • At least 10 C/C++-specific security patterns detected
  • Memory safety issue detection (unique value proposition)
  • Tests with real vulnerable code samples (OWASP examples)
  • Documentation updated
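
The auto-detection criterion could be met with a simple extension lookup, sketched below. `detect_language` and the returned labels are assumptions, not the tool's existing API. Note that `.h` is mapped to C as the criterion states, although such headers are also used in C++ projects.

```python
from pathlib import PurePath
from typing import Optional

# Extension sets taken from the acceptance criteria above.
C_EXTENSIONS = {".c", ".h"}
CPP_EXTENSIONS = {".cpp", ".hpp", ".cc"}


def detect_language(filename: str) -> Optional[str]:
    """Return "c", "cpp", or None based on the file extension.

    The comparison is case-sensitive on purpose: an uppercase ".C" is
    conventionally C++ on Unix, so lowercasing would misclassify it.
    """
    suffix = PurePath(filename).suffix
    if suffix in C_EXTENSIONS:
        return "c"
    if suffix in CPP_EXTENSIONS:
        return "cpp"
    return None


for name in ("src/main.c", "util.hpp", "parser.cc", "README.md"):
    print(name, "->", detect_language(name))
```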

Difficulty

Advanced — C/C++ have complex syntax and dangerous patterns that need careful regex or external tool integration.
