VulnGuard uses Semgrep (Semantic Grep) to perform scanning. Semgrep's high performance, large user base, and extensive database of community-curated rules make it a powerful tool for detecting vulnerabilities. Semgrep offers multiple analysis modes, including pattern-based static analysis and taint analysis (tracking data from sources to sinks). VulnGuard comes with various open-source rules enabled by default, and also allows users to add or import their own Semgrep rules and repositories.
Semgrep Rules are stored in YAML files, and follow this format.
For more examples, refer to the list of VulnGuard's default Semgrep rules.
For information on how to add user-defined Semgrep rules/repositories, see the description of custom rules later in this document.
For simpler rules that may not need Semgrep, Regex rules can be created. Regex rules perform Regex pattern matching on code files to look for vulnerabilities.
Regex rules are stored in YAML files and follow this general format:
rules:
  # id (Mandatory) - ID for the Regex rule.
  - id: <string>
    # message (Mandatory) - Description to be provided to the user when the Regex pattern is matched.
    message: <string>
    # severity (Mandatory) - Either "INFO", "WARNING", or "ERROR".
    severity: <string>
    # regex (Mandatory) - Regex pattern to be checked.
    regex: <string>
    # case_sensitive (Optional) - Whether the Regex pattern provided should be compiled case sensitive.
    case_sensitive: <boolean>
    # fix (Optional) - Fix to be applied to the line in which the Regex pattern is matched.
    fix: <string>
    # reference (Optional) - Link for the user to find out more about the vulnerability.
    reference: <string>
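As a rough illustration of what checking a rule against this format could involve, the sketch below validates one already-parsed rule (a plain dictionary). It is not VulnGuard's actual validator: the function name `validate_regex_rule` is hypothetical, it only covers the flat format with a single `regex` field, and it compiles patterns with Python's `re` module, whose syntax may differ from the engine VulnGuard uses.

```python
import re

VALID_SEVERITIES = {"INFO", "WARNING", "ERROR"}
MANDATORY_FIELDS = ("id", "message", "severity", "regex")


def validate_regex_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one parsed rule (empty if it looks valid)."""
    problems = []
    # Every mandatory field must be present and be a non-empty string.
    for field in MANDATORY_FIELDS:
        if not isinstance(rule.get(field), str) or not rule[field]:
            problems.append(f"missing or empty mandatory field: {field}")
    # Severity must be one of the three documented values.
    severity = rule.get("severity")
    if isinstance(severity, str) and severity and severity not in VALID_SEVERITIES:
        problems.append(f"invalid severity: {severity}")
    # The optional case_sensitive field must be a boolean when given.
    if "case_sensitive" in rule and not isinstance(rule["case_sensitive"], bool):
        problems.append("case_sensitive must be a boolean")
    # The pattern must compile (assumption: Python regex syntax).
    pattern = rule.get("regex")
    if isinstance(pattern, str) and pattern:
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"regex does not compile: {exc}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a rule file at once.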
VulnGuard also supports the use of the regex_and, regex_or, and regex_not fields, in addition to the nesting of multiple Regex patterns to form a Regex "tree". VulnGuard will iterate through files line-by-line, and whenever a line matches a Regex pattern/tree, the line will be highlighted to the user.
Example 1
# This rule is equivalent to the condition (A || B || C), where A, B, and C are Regex patterns.
rules:
  - id: 'example-1'
    message: 'Example 1'
    severity: 'INFO'
    regex_or:
      - regex: A
      - regex: B
      - regex: C
Example 2
# This rule is equivalent to the condition (A && B && (C || !D)), where A, B, C, and D are Regex patterns.
rules:
  - id: 'example-2'
    message: 'Example 2'
    severity: 'INFO'
    regex_and:
      - regex: A
      - regex: B
      - regex_or:
        - regex: C
        - regex_not: D
Note: VulnGuard does not support the matching of multi-line Regex patterns since Regex checking is done on a line-by-line basis.
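To make the tree semantics concrete, here is a minimal sketch of how a Regex tree could be evaluated line by line. It is an illustration, not VulnGuard's implementation: the helper names `matches` and `scan` are hypothetical, `regex_not` is assumed to hold a single pattern as in Example 2, `case_sensitive` is assumed to default to true, and Python's `re` module stands in for whatever regex engine VulnGuard actually uses.

```python
import re


def matches(node: dict, line: str, case_sensitive: bool = True) -> bool:
    """Evaluate one node of a Regex tree against a single line of code."""
    flags = 0 if case_sensitive else re.IGNORECASE
    if "regex" in node:
        return re.search(node["regex"], line, flags) is not None
    if "regex_not" in node:
        return re.search(node["regex_not"], line, flags) is None
    if "regex_and" in node:
        return all(matches(child, line, case_sensitive) for child in node["regex_and"])
    if "regex_or" in node:
        return any(matches(child, line, case_sensitive) for child in node["regex_or"])
    raise ValueError(f"unsupported node: {node!r}")


def scan(rule: dict, source: str) -> list[int]:
    """Return the 1-based numbers of the lines that match the rule."""
    case_sensitive = rule.get("case_sensitive", True)  # assumed default
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if matches(rule, line, case_sensitive)
    ]


# Example 2 from above, written as the dictionary a YAML parser would produce.
EXAMPLE_2 = {
    "id": "example-2",
    "message": "Example 2",
    "severity": "INFO",
    "regex_and": [
        {"regex": "A"},
        {"regex": "B"},
        {"regex_or": [{"regex": "C"}, {"regex_not": "D"}]},
    ],
}
```

Calling `scan(EXAMPLE_2, "AB\nABD\nABCD\nA")` returns `[1, 3]`: line 1 contains A and B without D, line 3 contains A, B, and C, while line 2 contains D without C and line 4 lacks B. Because each line is tested in isolation, a pattern spanning two lines can never match, which is the limitation described in the note above.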
For more examples, see VulnGuard's default Regex rules.
For information on how to add user-defined Regex rules, see the description of custom rules later in this document.
Default Semgrep rules include:
- p/owasp-top-ten
- p/nodejsscan
- p/javascript
- p/expressjs
- p/react
- p/eslint-plugin-security
- p/xss
- p/sql-injection
- p/r2c-security-audit
Default Regex rules are curated from various sources, and include:
Users can add their own custom rules to be used as part of Semgrep/Regex scanning. This configuration persists through shutdowns and restarts of VSCode.
To add a custom rule, simply navigate to the VulnGuard Dashboard, click the "Plus" icon, and navigate to the YAML file containing the Semgrep/Regex rules to register it with VulnGuard.
When developing software, it is important to ensure that imported dependencies are safe and not malicious. This is especially true for web development, since web applications expose a larger attack surface and are therefore vulnerable to a wider range of threats. This makes them much more susceptible to supply chain attacks, where attackers try to compromise software by targeting the less secure modules/packages it uses. Because npm packages are widely trusted despite receiving only limited vulnerability scanning, it is important to check modules/packages in greater detail.
While it is important to check imported modules, VulnGuard only scans the packages defined in package.json of the project (i.e. the "top-level" packages) for vulnerabilities. Checking every module imported by an application is not necessarily optimal, since many imported modules have their own dependencies, and so on. Although checking all of these transitive dependencies may seem beneficial, in reality the vast majority of vulnerabilities found in them are "unreachable" and do not affect the application. In addition, a secure "top-level" package should be actively maintained and able to resolve any malicious/vulnerable packages used in its own codebase. Conversely, any insecure "top-level" package would be flagged by VulnGuard and highlighted to the developer. Reducing the number of modules checked also lowers the runtime of the dependency checks, so developers get their results faster.
When packages are installed or added, package.json is modified as part of the process. VulnGuard scans all dependency packages listed in package.json whenever package.json is updated, and if any packages are detected as malicious, warnings are shown to the developer when viewing package.json.
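As a small illustration of what "top-level" means here, the sketch below collects the packages a project declares directly in package.json. It is only an approximation of the idea: the function name `top_level_packages` is hypothetical, and including `devDependencies` alongside `dependencies` is an assumption, since the text above does not say which sections VulnGuard reads.

```python
import json


def top_level_packages(package_json_text: str) -> dict[str, str]:
    """Map each directly declared package name to its declared version range."""
    manifest = json.loads(package_json_text)
    packages: dict[str, str] = {}
    # Assumption: both sections count as "top-level" packages.
    for section in ("dependencies", "devDependencies"):
        packages.update(manifest.get(section, {}))
    return packages
```

Transitive dependencies never appear in this result, because they are recorded in the lockfile and in each package's own manifest rather than in the project's package.json.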
VulnGuard's Dependency Checking builds upon the work done by Spaceraccoon in npm-scan and other SDC (Simple Dependency Check) tools. These tools use heuristics to determine whether a package is likely malicious. Using heuristics is advantageous, since a check can be performed immediately, without waiting for a vulnerability to be reported and then published as a CVE. That reporting process could take a month or more, which is too long: by the time the CVE is published, the vulnerability could already have been exploited. As such, VulnGuard uses heuristics for dependency checking, and the following documents the various heuristics used to determine if a package is malicious:
Time-Related Heuristics (checks against npm)
Obfuscation-Related Heuristics
- No Source Code Repository for package (sdc-check)
- JJEncode Code Obfuscation (npm-scan)
- Unicode Code Obfuscation (js-x-ray)
- Package main module export is minified (npm-scan)
Behavior-Related Heuristics
- OS Scripts (.sh, .bat, etc.) found within packages (sdc-check)
- Install Scripts found in manifest scripts (npm-scan)
- Shell Commands found in manifest scripts (sdc-check)
- Fetching of Content Security Policy (CSP) (npm-scan)
- Creation of Child Processes (npm-scan)
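To give a feel for how two of the behavior-related heuristics might work, the sketch below inspects the `scripts` section of a package manifest for install hooks and shell commands. It is not the logic used by npm-scan, sdc-check, or VulnGuard: the function name `script_findings` and the short command list are hypothetical, chosen only for illustration. The hook names `preinstall`, `install`, and `postinstall` are npm's standard install lifecycle scripts.

```python
import re

# npm runs these lifecycle scripts automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")
# Hypothetical, deliberately small list of commands treated as suspicious.
SHELL_COMMAND = re.compile(r"\b(curl|wget|bash|sh|powershell|chmod)\b")


def script_findings(manifest: dict) -> list[str]:
    """Return one human-readable finding per heuristic hit in the manifest scripts."""
    findings = []
    for name, command in manifest.get("scripts", {}).items():
        if name in INSTALL_HOOKS:
            findings.append(f"install script: {name}")
        if SHELL_COMMAND.search(command):
            findings.append(f"shell command in script: {name}")
    return findings
```

A hit is only a signal, not proof: many legitimate packages use install scripts, so findings like these are best shown to the developer as warnings.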
Keeping VulnGuard's scan runtime low keeps the Extension responsive to code changes made by the developer, so that they are notified as soon as possible when they introduce a vulnerability. The following documents the ways that VulnGuard tries to achieve a low scan runtime:
- All rule validation and compilation is done during initialization before any scans are performed.
- All scans performed are done asynchronously on a per-file-per-rule basis.
- The whole project is scanned only once, during initialization; afterwards, files are rescanned only when the developer modifies them.
- For dependency checking, only the top-level packages are checked for vulnerabilities.
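The first two points can be sketched as follows: rules are compiled once up front, and one asynchronous task is created per (file, rule) pair. This is a structural illustration only, written with Python's `asyncio` and hypothetical names (`scan_file_with_rule`, `scan_all`). VulnGuard is a VSCode extension and would use its own runtime's asynchronous primitives, and `asyncio` tasks on their own interleave work without running it in parallel.

```python
import asyncio
import re

Finding = tuple[str, str, int]  # (file path, rule id, 1-based line number)


async def scan_file_with_rule(path: str, text: str, rule_id: str, pattern: re.Pattern) -> list[Finding]:
    """Scan one file with one pre-compiled rule."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            findings.append((path, rule_id, number))
    await asyncio.sleep(0)  # yield to the event loop so other tasks can run
    return findings


async def scan_all(files: dict[str, str], rules: dict[str, str]) -> list[Finding]:
    """Run every rule against every file, one task per (file, rule) pair."""
    # Compile every rule once, before any scan starts.
    compiled = {rule_id: re.compile(pattern) for rule_id, pattern in rules.items()}
    tasks = [
        scan_file_with_rule(path, text, rule_id, pattern)
        for path, text in files.items()
        for rule_id, pattern in compiled.items()
    ]
    results = await asyncio.gather(*tasks)
    return sorted(finding for batch in results for finding in batch)
```

Rescanning a single modified file then amounts to calling `scan_all` with just that file and the same rule set.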
- Semgrep is not supported on Windows at the moment, and so is automatically disabled on Windows environments.
- Semgrep can be automatically installed through the VSCode Extension using either homebrew or pip. If the installation fails, however, refer to the Semgrep Docs on how to configure Semgrep for your system.
- The demo was conducted in Linux to showcase VulnGuard's Semgrep functionality.
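The platform check and installer choice described in these notes could look roughly like the sketch below. The function names `semgrep_enabled` and `pick_installer` are hypothetical, and trying Homebrew before pip is an assumption; the text above only says that either can be used.

```python
import shutil
import sys
from typing import Callable, Optional


def semgrep_enabled(platform: str = sys.platform) -> bool:
    """Semgrep scanning is switched off on Windows environments."""
    return not platform.startswith("win")


def pick_installer(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[list[str]]:
    """Return an install command for Semgrep, or None if no installer is found."""
    # Assumption: Homebrew is preferred when both installers are present.
    if which("brew"):
        return ["brew", "install", "semgrep"]
    if which("pip"):
        return ["pip", "install", "semgrep"]
    return None
```

Passing the lookup function in as a parameter keeps the choice testable without touching the real system PATH.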