Validate that string is base64 #648

maraspr · 2026-01-07T12:12:04Z

By default, b64decode discards all characters that are not in the base64 alphabet before doing the padding check. If the remaining characters happen to pass this check, the input is decoded without error, so non-base64 input can be processed as base64 by mistake.
See the Python documentation for base64.b64decode.
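To illustrate the behaviour described above, here is a minimal sketch. The input string is a made-up example, not taken from a real report; it was chosen so that the characters left after discarding '<' and '>' happen to have a valid length.

```python
import base64
import binascii

# Hypothetical input: a short XML-like string that is clearly not base64.
not_base64 = "<abc>d</abc>"

# Default mode: '<' and '>' are discarded, leaving "abcd/abc" (8 characters,
# all in the base64 alphabet, since '/' is part of it), so decoding succeeds.
lenient = base64.b64decode(not_base64)

# Strict mode: any character outside the alphabet raises binascii.Error.
try:
    base64.b64decode(not_base64, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

print(len(lenient), strict_rejected)
```

With validate=True the caller gets an exception it can handle, instead of garbage bytes that fail later in a less obvious place.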

The following XML snippet fails to parse when it is gzip-compressed and then passed to parsedmarc as test.xml.gz:

<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
  <version>1.0</version>
  <report_metadata>
    <org_name>xxxxxx.xx</org_name>
    <email>[email protected]</email>
    <extra_contact_info>[email protected]</extra_contact_info>
    <report_id>[email protected]</report_id>
    <date_range>
      <begin>1111111111</begin>
      <end>1111111111</end>
    </date_range>
  </report_metadata>
  <policy_published>
    <domain>xxx.xxxxxx.xx</domain>
    <adkim>r</adkim>
    <aspf>r</aspf>
    <p>reject</p>
    <sp></sp>
    <pct>100</pct>
  </policy_published>
  <record>
    <row>
      <source_ip>10.100.10.1</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>xxx.xxxxxx.xx</header_from>
      <envelope_from></envelope_from>
    </identifiers>
    <auth_results>
      <dkim>
        <domain>xxx.xxxxxx.xx</domain>
        <selector>x1</selector>
        <result>pass</result>
      </dkim>
      <spf>
        <domain>xx1.xx.xxx.xx</domain>
        <scope>helo</scope>
        <result>none</result>
      </spf>
    </auth_results>
  </record>
</feedback>

with the following error message:

ERROR:cli.py:1580:Failed to parse test.xml.gz - not a valid report.

This pull-request should fix this issue by setting validate=True in b64decode where applicable.

maraspr · 2026-01-08T16:56:29Z

Sorry about not catching that test.

The problem seems to be newlines appearing in the content variable. It is not a very elegant solution, but removing the newlines before calling b64decode works. I committed the following fix to my fork. Should I create a new pull request?

diff --git a/parsedmarc/__init__.py b/parsedmarc/__init__.py
index cf8197c..60c4be7 100644
--- a/parsedmarc/__init__.py
+++ b/parsedmarc/__init__.py
@@ -892,7 +892,11 @@ def extract_report(content: Union[bytes, str, BinaryIO]) -> str:
     try:
         if isinstance(content, str):
             try:
-                file_object = BytesIO(b64decode(content, validate=True))
+                file_object = BytesIO(
+                    b64decode(
+                        content.replace("\n", "").replace("\r", ""), validate=True
+                    )
+                )
             except binascii.Error:
                 return content
             header = file_object.read(6)
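As a standalone sketch of the same idea, outside parsedmarc: the helper name decode_base64_strict is hypothetical and only mirrors the strip-then-validate step from the diff above. The line-wrapped sample is produced with base64.encodebytes, on the assumption that wrapped base64 is where the newlines come from.

```python
import base64
import binascii
from typing import Optional


def decode_base64_strict(text: str) -> Optional[bytes]:
    """Hypothetical helper: decoded bytes, or None if text is not base64."""
    # validate=True rejects CR/LF as well, so strip line breaks first.
    stripped = text.replace("\n", "").replace("\r", "")
    try:
        return base64.b64decode(stripped, validate=True)
    except binascii.Error:
        return None


# encodebytes wraps its output with newlines, like line-wrapped base64 input.
payload = b"x" * 100
wrapped = base64.encodebytes(payload).decode("ascii")

# Without stripping, strict validation fails on the newlines.
try:
    base64.b64decode(wrapped, validate=True)
    raw_strict_ok = True
except binascii.Error:
    raw_strict_ok = False

print(raw_strict_ok)
print(decode_base64_strict(wrapped) == payload)
print(decode_base64_strict("<abc>d</abc>"))
```

Only CR and LF are stripped here, so other whitespace such as spaces or tabs would still be rejected by validate=True.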

seanthegeek · 2026-01-08T16:59:30Z

Please do

maraspr · 2026-01-08T17:06:17Z

#649

Validate that a string is base64-encoded before trying to base64 decode it. (PRs #648 and #649)

Validate that string is base64

6e75ad7

maraspr force-pushed the master branch from 19f1c0f to 6e75ad7 Compare January 7, 2026 23:21

seanthegeek merged commit 792079a into domainaware:master Jan 8, 2026
0 of 5 checks passed

maraspr mentioned this pull request Jan 8, 2026

remove newlines before b64decode #649

Merged

seanthegeek added a commit that referenced this pull request Jan 8, 2026

9.0.9

0e3a4b0

Validate that a string is base64-encoded before trying to base64 decode it. (PRs #648 and #649)
