@valentijnscholten valentijnscholten commented Sep 30, 2025

Tags are a first-class citizen in Defect Dojo, so good performance is important.

Currently the django-tagulous library doesn't provide a way to "bulk add tags to a collection of model instances".
We've raised a [feature request](https://github.com/radiac/django-tagulous/issues/190) and [a starter/draft PR](https://github.com/radiac/django-tagulous/pull/191) upstream, but we are unsure if or when it will be merged, or whether it will be accepted at all.

Currently bulk adding of tags happens during:

  • imports/reimports
  • bulk edit
  • product tag inheritance [1]

This results in lots of database queries. If we skip the two tags in the performance unit test, we save between ~100 and ~180 queries per run. And that's for a scan report with only 16 findings.

valentijn:~/dd$ ./run-unittest.sh --test-case unittests.test_importers_performance 2>&1 | grep Assertion
AssertionError: 592 != 679 : 592 queries executed, 679 expected
AssertionError: 498 != 606 : 498 queries executed, 606 expected
AssertionError: 592 != 679 : 592 queries executed, 679 expected
AssertionError: 503 != 611 : 503 queries executed, 611 expected
AssertionError: 597 != 684 : 597 queries executed, 684 expected
AssertionError: 509 != 617 : 509 queries executed, 617 expected

In this context I feel it's justifiable that we add a "bulk add tag(s) to models" method to Defect Dojo.
This PR introduces that method and performs tagging in batches of 1000 findings. It needs 3 queries per batch.
The current upstream implementation needs 3 queries per finding, i.e. 3000 queries per batch.
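
As a rough illustration of the arithmetic above (this is not the code in this PR), the sketch below compares the two query counts. The helper names `queries_per_finding`, `queries_per_batch` and `batched` are hypothetical; only the batch size of 1000 and the figure of 3 queries come from the description.

```python
import math

BATCH_SIZE = 1000      # batch size stated in the PR description
QUERIES_PER_UNIT = 3   # 3 queries per finding (upstream) or per batch (this PR)


def queries_per_finding(num_findings: int) -> int:
    """Query count when every finding is tagged individually."""
    return QUERIES_PER_UNIT * num_findings


def queries_per_batch(num_findings: int, batch_size: int = BATCH_SIZE) -> int:
    """Query count when findings are tagged in batches."""
    return QUERIES_PER_UNIT * math.ceil(num_findings / batch_size)


def batched(items, batch_size: int = BATCH_SIZE):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    for n in (16, 1000, 10_000):
        print(n, queries_per_finding(n), queries_per_batch(n))
```

Under these assumptions, a single full batch of 1000 findings costs 3 queries instead of 3000.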

The PR adds test cases for various scenarios.
The PR doesn't use (m)any django-tagulous internals, mainly Django ORM methods.

I considered creating a fork of django-tagulous, but that would add a big maintenance burden.

With the [~10k findings sample file](https://github.com/DefectDojo/django-DefectDojo/blob/bugfix/unittests/scans/jfrog_xray_unified/very_many_vulns.json) the import time without dedupe is reduced from ~372s to ~190s (on my laptop). So almost 50% faster.

[1] The bulk method is used in import, reimport and bulk edit. Product Tag Inheritance has not been touched yet; it deserves a PR of its own once we feel the bulk tag add is the way forward.

@valentijnscholten valentijnscholten added this to the 2.51.0 milestone Sep 30, 2025
@github-actions github-actions bot added the settings_changes (needs changes to settings.py based on changes in settings.dist.py included in this PR) and unittests labels Sep 30, 2025
@valentijnscholten valentijnscholten marked this pull request as ready for review September 30, 2025 16:59

dryrunsecurity bot commented Sep 30, 2025

DryRun Security

This pull request introduces three security issues: a log injection vulnerability where unsanitized user-controlled tags are logged allowing arbitrary log entry injection, a path traversal in the unittest scan import that concatenates an unchecked scan_file into a file path enabling reading files outside the intended directory, and an arbitrary module import where a crafted scan_file component can cause import_module to resolve unintended modules (potentially outside dojo.tools), allowing access to sensitive modules or unexpected code execution.

Log Injection in dojo/finding/views.py
Vulnerability: Log Injection
Description: The tags variable, derived from user-controlled input via form.cleaned_data["tags"], is directly logged using a %s format string without prior sanitization of newline characters. This allows an authenticated attacker to inject arbitrary log entries into the application's debug logs.

logger.debug("bulk_edit: adding tags to %d findings: %s", finds.count(), tags)
# Delegate parsing and handling of strings/iterables to helper
bulk_add_tags_to_instances(tag_or_tags=tags, instances=finds, tag_field_name="tags")
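
One possible mitigation for the finding above, shown only as a minimal sketch and not as code from this PR: escape carriage returns and newlines before the value reaches the logger. The helper name `sanitize_for_log` is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def sanitize_for_log(value: object) -> str:
    """Return str(value) with CR and LF escaped so it stays on one log line."""
    text = str(value)
    return text.replace("\r", "\\r").replace("\n", "\\n")


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    tags = "tag1\nFAKE LOG ENTRY"
    # The injected newline is written as a literal backslash-n sequence.
    logger.debug("bulk_edit: adding tags: %s", sanitize_for_log(tags))
```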

Path Traversal in dojo/management/commands/import_unittest_scan.py
Vulnerability: Path Traversal
Description: The scan_file argument, provided via the command line, is directly concatenated with the base path unittests/scans using pathlib.Path's / operator. There is no sanitization or normalization performed on scan_file to prevent path traversal sequences (e.g., ../) or absolute paths. This allows an attacker to craft a scan_file value that causes the application to access and read arbitrary files outside the intended unittests/scans directory when scan_path.open() is called.

scan_path = Path("unittests/scans") / scan_file
if not scan_path.exists():
    msg = f"Scan file not found: {scan_path}"
    raise CommandError(msg)
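
A common way to guard against this kind of traversal, sketched here under the assumption of Python 3.9+ (for `Path.is_relative_to`), is to resolve the candidate path and confirm it still lies under the base directory. The function name `resolve_scan_path` is hypothetical and is not part of the PR.

```python
from pathlib import Path


def resolve_scan_path(scan_file: str, base_dir: str = "unittests/scans") -> Path:
    """Resolve scan_file under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute scan_file discards base, so resolve and re-check.
    candidate = (base / scan_file).resolve()
    if not candidate.is_relative_to(base):
        msg = f"Scan file must be inside {base}: {scan_file}"
        raise ValueError(msg)
    return candidate


if __name__ == "__main__":
    print(resolve_scan_path("some_tool/report.json"))
    try:
        resolve_scan_path("../../etc/passwd")
    except ValueError as exc:
        print("rejected:", exc)
```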

Arbitrary Module Import in dojo/management/commands/import_unittest_scan.py
Vulnerability: Arbitrary Module Import
Description: The import_unittest_scan management command constructs a module name (module_name) directly from the first path component of the user-provided scan_file argument. This module_name is then used in an f-string to dynamically import a module: import_module(f"dojo.tools.{module_name}.parser"). If an attacker provides a scan_file like ../some_module/scan.json, module_name becomes ".." (two dots). The import_module call then attempts to import dojo.tools...parser, which Python resolves as dojo.parser (one level up from dojo.tools). This allows an attacker to bypass the intended dojo.tools namespace and potentially import any module available on the Python path, including sensitive internal modules or modules that could lead to code execution if their top-level code has side effects.

module = import_module(f"dojo.tools.{module_name}.parser")
# Find the parser class
parser_class = None
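
One way to address this finding, given as a hedged sketch rather than the PR's implementation, is to accept the first path component only if it is a plain Python identifier before building the dotted module path. The function name `parser_module_for` is hypothetical; the `dojo.tools.<name>.parser` pattern is taken from the snippet above.

```python
from pathlib import PurePosixPath


def parser_module_for(scan_file: str) -> str:
    """Build the parser module path from the first component of scan_file."""
    parts = PurePosixPath(scan_file).parts
    # Reject empty input, "..", "/" and anything else that is not an identifier.
    if not parts or not parts[0].isidentifier():
        msg = f"Invalid scan type in path: {scan_file!r}"
        raise ValueError(msg)
    return f"dojo.tools.{parts[0]}.parser"


if __name__ == "__main__":
    print(parser_module_for("some_tool/scan.json"))
    try:
        parser_module_for("../some_module/scan.json")
    except ValueError as exc:
        print("rejected:", exc)
```

The returned string could then be passed to import_module, so only names directly under dojo.tools are ever requested.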


All finding details can be found in the DryRun Security Dashboard.

Contributor

@mtesauro mtesauro left a comment

Approved

4 participants