
fix(LicenseInfo): harden XML parsers against XXE in SPDXParser and AbstractCLIParser #3891

Open
saiteja-in wants to merge 1 commit into eclipse-sw360:main from saiteja-in:fix/xxe-fix

@saiteja-in

Commit 738366d2 hardened AbstractCLIParser.getDocument() against XXE attacks, but two other
XML parsing locations in the same module were missed. This PR applies the same OWASP-recommended
security features to close the remaining XXE attack surface in the licenseinfo module.

  • No new dependencies added.

Issue: #3890

Changes

1. SPDXParser.openAsSpdx(): DocumentBuilderFactory hardening

File: backend/licenseinfo/src/main/java/org/eclipse/sw360/licenseinfo/parsers/SPDXParser.java

Before (vulnerable):

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

After (hardened):

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newDefaultInstance();
dbFactory.setNamespaceAware(true);
dbFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbFactory.setXIncludeAware(false);
dbFactory.setExpandEntityReferences(false);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
  • Changed newInstance() to newDefaultInstance() (matches getDocument() pattern)
  • Added all six OWASP-recommended hardening settings: four parser features plus setXIncludeAware(false) and setExpandEntityReferences(false) (identical to AbstractCLIParser.getDocument())
  • Kept setNamespaceAware(true) — required for SPDX RDF parsing (rdf:, spdx: namespace prefixes)
  • disallow-doctype-decl is the strongest defense: rejects all DOCTYPE declarations at parse time
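
The effect of these settings can be checked in isolation. The following is a minimal sketch, not the SW360 code: the class `HardenedDomSketch` and its helper methods are hypothetical names introduced for illustration, and it assumes the JDK's built-in parser (which `newDefaultInstance()` selects on Java 9+). It shows that a namespaced RDF-style document still parses, while a document carrying any DOCTYPE, even one with only an internal entity, is rejected.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of the SW360 code base.
class HardenedDomSketch {

    // Applies the same six settings listed in the PR description.
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newDefaultInstance();
        f.setNamespaceAware(true); // still needed for rdf:/spdx: prefixes
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        f.setFeature("http://xml.org/sax/features/external-general-entities", false);
        f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        f.setXIncludeAware(false);
        f.setExpandEntityReferences(false);
        return f;
    }

    // Parses an in-memory XML string with the hardened factory.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = newHardenedFactory().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) throws Exception {
        try {
            parse(xml);
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String benign = "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"/>";
        Document doc = parse(benign);
        System.out.println("namespace: " + doc.getDocumentElement().getNamespaceURI());

        // Internal-only DOCTYPE: no external reference, yet still refused.
        String withDoctype = "<!DOCTYPE r [<!ENTITY e \"x\">]><r>&e;</r>";
        System.out.println("DOCTYPE rejected: " + isRejected(withDoctype));
    }
}
```

Because `disallow-doctype-decl` refuses the document before any entity is resolved, the remaining five settings act as defense in depth in case that feature is unavailable or later relaxed.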

2. AbstractCLIParser.hasThisXMLRootElement(): XMLInputFactory hardening

File: backend/licenseinfo/src/main/java/org/eclipse/sw360/licenseinfo/parsers/AbstractCLIParser.java

Before (vulnerable):

XMLInputFactory xmlif = XMLInputFactory.newFactory();

After (hardened):

XMLInputFactory xmlif = XMLInputFactory.newFactory();
xmlif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
xmlif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
  • XMLInputFactory (StAX) uses a different API than DocumentBuilderFactory (DOM)
  • IS_SUPPORTING_EXTERNAL_ENTITIES = false — disables external entity resolution
  • SUPPORT_DTD = false — disables DTD processing entirely (strongest protection for StAX)
  • These are the OWASP-recommended hardening properties for XMLInputFactory
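
As a rough illustration of the StAX side, the sketch below configures an `XMLInputFactory` with the two properties above and reads the name of the first start element. It is an assumption-laden stand-in, not the body of `hasThisXMLRootElement()`: the class `HardenedStaxSketch`, the method `rootElementName`, and the `<cli>` sample document are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of the SW360 code base.
class HardenedStaxSketch {

    // Builds a StAX factory with the two hardening properties from the PR.
    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory f = XMLInputFactory.newFactory();
        f.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        f.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return f;
    }

    // Returns the local name of the first start element, or null if none is found.
    static String rootElementName(String xml) throws XMLStreamException {
        XMLStreamReader reader = newHardenedFactory().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample input, not a real CLI file.
        System.out.println(rootElementName("<cli><license/></cli>"));
    }
}
```

Setting the properties on the factory, rather than on each reader, means every reader created from it inherits the hardened configuration.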

3. SPDXParserTest — test consistency + XXE regression test

File: backend/licenseinfo/src/test/java/org/eclipse/sw360/licenseinfo/parsers/SPDXParserTest.java

  • Updated testAddSPDXContentToCLI to use the same hardened DocumentBuilderFactory configuration as production code (ensures tests validate behavior with the secured parser, not a different config)
  • Added testXXEAttackIsBlocked() — a regression test that:
    1. Constructs a malicious RDF payload with a <!DOCTYPE> XXE entity pointing to file:///etc/passwd
    2. Feeds it through the SPDXParser.getLicenseInfos() public API
    3. Asserts the parser rejects the payload (either throws SW360Exception or returns non-SUCCESS status)
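
The shape of such a regression check can be sketched without JUnit or the SW360 classes. The example below is an assumption, not the test added in this PR: it targets a temporary file holding a marker string instead of /etc/passwd so it runs on any platform, parses the payload with a hardened `DocumentBuilderFactory` directly rather than through `SPDXParser.getLicenseInfos()`, and treats both outcomes named above (rejection, or a parse that does not expose the file content) as a pass. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; not the test added in this PR.
class XxeRegressionSketch {

    // Returns true only if the external entity content reaches the parsed document.
    static boolean leaksSecret() throws Exception {
        String secret = "XXE-MARKER-12345";
        Path target = Files.createTempFile("xxe-target", ".txt");
        try {
            Files.writeString(target, secret);

            // Malicious RDF-style payload: the entity points at the temp file.
            String payload = "<?xml version=\"1.0\"?>\n"
                    + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + target.toUri() + "\">]>\n"
                    + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
                    + "<rdf:Description>&xxe;</rdf:Description></rdf:RDF>";

            // Same hardening as described for openAsSpdx().
            DocumentBuilderFactory f = DocumentBuilderFactory.newDefaultInstance();
            f.setNamespaceAware(true);
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            f.setFeature("http://xml.org/sax/features/external-general-entities", false);
            f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            f.setXIncludeAware(false);
            f.setExpandEntityReferences(false);

            try {
                Document doc = f.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8)));
                String text = doc.getDocumentElement().getTextContent();
                return text != null && text.contains(secret);
            } catch (SAXException rejected) {
                return false; // payload refused before any entity was resolved
            }
        } finally {
            Files.deleteIfExists(target);
        }
    }

    public static void main(String[] args) throws Exception {
        if (leaksSecret()) {
            throw new AssertionError("XXE payload exposed file content");
        }
        System.out.println("XXE payload blocked");
    }
}
```

Checking for the marker string, rather than only for an exception, keeps the check meaningful even if a future configuration silently ignores the DOCTYPE instead of rejecting it.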

Suggested Reviewers

@aliahmed, @nickvonkaenel — as the authors of the prior XXE fix (commit 738366d2) and licenseinfo module maintainers.
