-
Notifications
You must be signed in to change notification settings - Fork 6
✨(feature): Image proxy for CSP compliance and privacy protection #398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add image proxification feature to protect user privacy by proxying external images through the application server. This prevents tracking pixels from leaking user IP addresses and browsing behavior.
Changes:
- Add _proxy_images_in_html method to MessageSerializer to rewrite external image URLs
- Handle both external HTTP(S) URLs and CID (Content-ID) inline attachments
- Create ImageProxyViewSet endpoint at /api/mailboxes/{id}/image-proxy/
- Add per-domain configuration via _proxy_external_images custom attribute
- Add size limits via _proxy_max_image_size_mb custom attribute (default 5MB)
- Add beautifulsoup4 dependency for HTML parsing
- Images are proxied on-demand without caching (simple implementation)
Benefits:
- Protects user privacy (IP address not exposed to external servers)
- Breaks tracking pixels in emails
- Enables strict CSP policy (img-src 'self')
- Per-domain control via MailDomain.custom_attributes
Fix issues preventing the image proxy from working correctly:
- Change OpenApiParameter to OpenApiResponse in response schema definitions
- Use ViewSet.list() method instead of custom action for proper routing
- Add API version prefix to generated proxy URLs in serializer
The image proxy now correctly responds at:
/api/v1.0/mailboxes/{mailbox_id}/image-proxy/?url={encoded_url}
…-outside-toplevel Move imports to top of file to fix pylint warnings: - Move BeautifulSoup, quote, and settings imports to module level - Remove import-outside-toplevel violations Replace custom_attributes with environment variables: - Add PROXY_EXTERNAL_IMAGES setting (default: False) - Add PROXY_MAX_IMAGE_SIZE_MB setting (default: 5) - Use settings instead of mailbox.domain.custom_attributes - Simpler configuration via environment variables
Fixed the image proxy URL rewriting for inline images (CID references) to use the correct blob download endpoint format. The previous implementation was generating non-existent endpoint URLs.
Changes:
- Updated CID image URL from `/api/messages/{id}/attachments/{index}` to `/api/v1.0/blob/msg_{id}_{index}/download/`
- This matches the existing blob download endpoint that handles the `msg_*` format
- Ensures inline images in HTML emails display correctly when image proxying is enabled
…d caching - Add lazy loading to all images (CID and external) for better initial page load - Enable HTTP/2 in nginx for parallel image request handling - Add Cache-Control headers (30 days) to blob downloads for browser caching - Fix TypeScript typing for DomPurify hook parameter These changes significantly improve performance for long emails with many images by reducing initial load time and enabling efficient browser caching.
WalkthroughAdds server-side external-image proxying: HTML rewriting for image src, a new ImageProxyViewSet to fetch and serve images with SSRF and size checks, new settings and URL routing, caching headers for blobs, a URL-safety utility, a BeautifulSoup dependency, and small frontend IMG lazy-loading/type annotations. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant Frontend
participant Serializer
participant ImageProxyViewSet
participant ExternalServer
Client->>Frontend: Request email view
Frontend->>Serializer: Fetch HTML body
Serializer->>Serializer: Parse HTML (BeautifulSoup)
Serializer->>Serializer: Rewrite IMG src -> proxy URL or blob URL
Serializer->>Frontend: Return modified HTML
Frontend->>Client: Render HTML (IMG loading="lazy")
Note over Client,ImageProxyViewSet: When browser requests proxied image
Client->>ImageProxyViewSet: GET /mailboxes/{id}/image-proxy?url=...
ImageProxyViewSet->>ImageProxyViewSet: validate mailbox, feature flag, validate_url_safety
ImageProxyViewSet->>ExternalServer: Fetch image (10s, no redirects)
ExternalServer-->>ImageProxyViewSet: Stream image
ImageProxyViewSet->>ImageProxyViewSet: enforce size, check content-type, magic bytes
ImageProxyViewSet->>Client: 200 + image bytes + Cache-Control + X-Proxied-From
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/backend/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
src/backend/core/api/serializers.py(2 hunks)src/backend/core/api/viewsets/blob.py(2 hunks)src/backend/core/api/viewsets/image_proxy.py(1 hunks)src/backend/core/urls.py(3 hunks)src/backend/messages/settings.py(1 hunks)src/backend/pyproject.toml(1 hunks)src/frontend/src/features/layouts/components/thread-view/components/thread-message/message-body.tsx(2 hunks)src/nginx/servers.conf.erb(1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/backend/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
src/backend/**/*.py: Follow Django/PEP 8 style with a 100-character line limit
Use descriptive, snake_case names for variables and functions
Use Django ORM for database access; avoid raw SQL unless necessary for performance
Use Django’s built-in user model and authentication framework
Prefer try-except blocks to handle exceptions in business logic and views
Log expected and unexpected actions with appropriate log levels
Capture and report exceptions to Sentry; use capture_exception() for custom errors
Do not log sensitive information (tokens, passwords, financial/health data, PII)
Files:
src/backend/core/api/serializers.pysrc/backend/messages/settings.pysrc/backend/core/urls.pysrc/backend/core/api/viewsets/blob.pysrc/backend/core/api/viewsets/image_proxy.py
src/backend/{templates/**/*.html,**/serializers.py}
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
Use Django templates for HTML and DRF serializers for JSON responses
Files:
src/backend/core/api/serializers.py
src/backend/**/{settings.py,middleware.py}
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
Use middleware judiciously for cross-cutting concerns (authentication, logging, caching)
Files:
src/backend/messages/settings.py
src/backend/**/settings.py
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
src/backend/**/settings.py: Leverage Django’s caching framework (e.g., Redis/Memcached) where appropriate
Use Django’s cache framework with a backend like Redis or Memcached to reduce DB load
Optimize static file handling using Django’s staticfiles pipeline (e.g., WhiteNoise)
Files:
src/backend/messages/settings.py
src/backend/**/urls.py
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
Define clear, RESTful URL patterns using Django’s URL dispatcher
Files:
src/backend/core/urls.py
🧠 Learnings (2)
📚 Learning: 2025-09-02T10:12:12.835Z
Learnt from: CR
PR: suitenumerique/messages#0
File: .cursor/rules/django-python.mdc:0-0
Timestamp: 2025-09-02T10:12:12.835Z
Learning: Applies to src/backend/**/urls.py : Define clear, RESTful URL patterns using Django’s URL dispatcher
Applied to files:
src/backend/core/urls.py
📚 Learning: 2025-09-02T10:12:12.835Z
Learnt from: CR
PR: suitenumerique/messages#0
File: .cursor/rules/django-python.mdc:0-0
Timestamp: 2025-09-02T10:12:12.835Z
Learning: Applies to src/backend/**/views.py : Use Django REST Framework viewsets for API endpoints
Applied to files:
src/backend/core/urls.pysrc/backend/core/api/viewsets/image_proxy.py
🧬 Code graph analysis (3)
src/backend/core/api/serializers.py (2)
src/backend/core/models.py (1)
get_parsed_field(1273-1275)src/backend/core/tests/api/test_messages_import_labels_en.py (2)
user(26-28)mailbox(32-36)
src/backend/core/urls.py (1)
src/backend/core/api/viewsets/image_proxy.py (1)
ImageProxyViewSet(21-135)
src/backend/core/api/viewsets/image_proxy.py (1)
src/backend/core/api/permissions.py (1)
IsAuthenticated(18-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test-back
- GitHub Check: build-front
- GitHub Check: lint-back
🔇 Additional comments (2)
src/frontend/src/features/layouts/components/thread-view/components/thread-message/message-body.tsx (2)
62-62: LGTM - Type safety improvement.Adding the
Elementtype annotation improves type safety and makes the hook parameter's type explicit.
73-89: LGTM - Well-structured lazy loading and CID transformation.The restructured logic correctly applies lazy loading to all images while keeping CID transformation conditional. This approach:
- Improves performance by enabling native browser lazy loading for all images (external proxied images and CID attachments)
- Optimizes by skipping CID transformation when no CID attachments exist
- Provides a fallback for CID handling when backend proxying is disabled
- Maintains security by running transformations in the
afterSanitizeAttributeshook (post-sanitization)The CSP policy (line 18) correctly permits images from
'self',data:, and the API origin, which aligns with both proxied external images and blob download URLs for CID attachments.
Fixes critical vulnerabilities identified by CodeRabbit: - Block IP addresses and validate DNS to prevent SSRF attacks - Stream content in chunks to prevent memory exhaustion - Validate fetched content is actually an image (defense-in-depth) - Disable redirects to prevent SSRF bypass - Remove nginx http2 directive (requires SSL) - Return SVG placeholder for blocked images instead of errors Fixes suitenumerique#187
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/backend/core/utils.py (4)
80-88: DNS resolution may block request threads.The
getaddrinfocall performs synchronous DNS resolution, which can take several seconds if the nameserver is slow or unresponsive. In a web application context, this blocks the worker thread handling the HTTP request.Consider adding a timeout mechanism or performing DNS resolution in a separate thread pool, though the security trade-off may be acceptable for this use case.
91-91: Rename unused loop variable to indicate intent.The
familyvariable from thegetaddrinfotuple is intentionally unused. Following Python conventions, prefix it with an underscore to indicate this is intentional.As per static analysis hints.
Apply this diff:
- for family, _, _, _, sockaddr in addr_info: + for _family, _, _, _, sockaddr in addr_info:
125-126: Bare except is acceptable here but consider logging unexpected exceptions.The bare
exceptclause catches all exceptions includingSystemExitandKeyboardInterrupt, which is generally discouraged. However, in this security-critical validation function, failing closed (rejecting the URL) is the correct behavior.Consider logging unexpected exceptions for debugging while maintaining the fail-closed behavior:
- except Exception: - return False, "Invalid URL" + except Exception as e: + logger.warning("Unexpected error validating URL safety: %s", e, exc_info=True) + return False, "Invalid URL"
24-78: Document DNS rebinding limitation in docstring; TOCTOU is an accepted trade-off in DNS-based validation.The TOCTOU concern is valid and inherent to all DNS-based SSRF protections. Between
validate_url_safety()and the actual HTTP request, DNS could change. This is a known limitation rather than a bug—connection-level protections are optional enhancements but not mandatory.Update the docstring as suggested to document this limitation:
Blocked addresses include: - Any direct IP address (e.g., http://192.168.1.1/) - Private IP ranges (RFC1918: 10.x.x.x, 172.16-31.x.x, 192.168.x.x) - Loopback addresses (127.x.x.x, ::1) - Link-local addresses (169.254.x.x, fe80::/10) - Multicast and reserved addresses - Cloud provider metadata endpoints (169.254.169.254, fd00:ec2::254) + + Note: This function is vulnerable to DNS rebinding attacks (time-of-check + to time-of-use) where DNS resolution can change between validation and + actual request. This is an inherent limitation of DNS-based validation.The existing implementation and usage in
image_proxy.pyare correct and properly protected.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/backend/core/api/viewsets/image_proxy.py(1 hunks)src/backend/core/utils.py(2 hunks)src/nginx/servers.conf.erb(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/nginx/servers.conf.erb
🧰 Additional context used
📓 Path-based instructions (1)
src/backend/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/django-python.mdc)
src/backend/**/*.py: Follow Django/PEP 8 style with a 100-character line limit
Use descriptive, snake_case names for variables and functions
Use Django ORM for database access; avoid raw SQL unless necessary for performance
Use Django’s built-in user model and authentication framework
Prefer try-except blocks to handle exceptions in business logic and views
Log expected and unexpected actions with appropriate log levels
Capture and report exceptions to Sentry; use capture_exception() for custom errors
Do not log sensitive information (tokens, passwords, financial/health data, PII)
Files:
src/backend/core/utils.pysrc/backend/core/api/viewsets/image_proxy.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
PR: suitenumerique/messages#0
File: .cursor/rules/django-python.mdc:0-0
Timestamp: 2025-09-02T10:12:12.835Z
Learning: Applies to src/backend/**/settings.py : Optimize static file handling using Django’s staticfiles pipeline (e.g., WhiteNoise)
📚 Learning: 2025-09-02T10:12:12.835Z
Learnt from: CR
PR: suitenumerique/messages#0
File: .cursor/rules/django-python.mdc:0-0
Timestamp: 2025-09-02T10:12:12.835Z
Learning: Applies to src/backend/**/views.py : Use Django REST Framework viewsets for API endpoints
Applied to files:
src/backend/core/api/viewsets/image_proxy.py
🧬 Code graph analysis (1)
src/backend/core/api/viewsets/image_proxy.py (2)
src/backend/core/utils.py (1)
validate_url_safety(24-126)src/backend/core/api/permissions.py (1)
IsAuthenticated(18-25)
🪛 Ruff (0.14.2)
src/backend/core/utils.py
91-91: Loop control variable family not used within loop body
Rename unused family to _family
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test-back
🔇 Additional comments (13)
src/backend/core/utils.py (2)
3-6: LGTM!The imports are appropriate for implementing SSRF protection with URL parsing, DNS resolution, and IP address validation.
117-118: Good defense-in-depth for cloud metadata endpoints.While 169.254.169.254 should already be caught by
is_link_localand fd00:ec2::254 byis_private, the explicit check adds clarity and ensures these critical endpoints are always blocked even if the underlying library behavior changes.src/backend/core/api/viewsets/image_proxy.py (11)
1-21: LGTM!Imports are appropriate for an image proxy viewset with SSRF protection, content validation, and proper API documentation.
23-33: LGTM!Using
ViewSet(notModelViewSet) is appropriate for this non-CRUD operation, and the customIsAuthenticatedpermission correctly gates access.
65-77: LGTM!The mailbox validation and access control are correctly implemented. The method name "list" follows DRF conventions for GET requests on collection endpoints in nested routes.
79-83: LGTM!The feature flag check ensures the proxy endpoint is only available when explicitly enabled via settings.
85-91: LGTM!Correctly extracts and URL-decodes the image URL parameter. Using
unquote(notunquote_plus) is appropriate for URL decoding.
111-121: LGTM!The HTTP request configuration is secure and well-designed:
- Timeout prevents hanging requests
- Streaming prevents memory exhaustion
allow_redirects=Falseblocks redirect-based SSRF bypasses- Custom User-Agent aids logging/debugging
Note: The TOCTOU issue flagged in
validate_url_safetyremains (DNS can change between validation and fetch), but this is an inherent limitation of DNS-based SSRF protection.
123-140: LGTM!The content-type and Content-Length validation correctly addresses previous review concerns:
- Safely handles malformed Content-Length headers
- Uses Content-Length only as a hint (actual enforcement during streaming)
- Content-type check provides initial filtering before download
The later
python-magicvalidation (line 168) provides defense-in-depth against spoofed content-type headers.
142-165: LGTM! Excellent defense against memory exhaustion.The chunked streaming with size enforcement correctly addresses previous security concerns:
- Reads in small 8KB chunks with frequent size checks
- Enforces limit independently of Content-Length header
- Logs security events appropriately
- Prevents memory exhaustion from malicious or broken servers
This is a textbook implementation of safe streaming.
167-183: LGTM! Strong defense-in-depth with magic number validation.Using
python-magicto validate actual file content (not just HTTP headers) is excellent defense-in-depth. This prevents serving malicious content disguised with a fakeimage/*Content-Type header.Note: The SVG duplication was flagged separately.
185-192: LGTM!The success response is well-configured:
- 30-day caching reduces server load for static email images
- X-Proxied-From header aids debugging and provides transparency
- Uses validated content-type and content
194-198: LGTM!Error handling correctly catches all requests exceptions and returns appropriate 502 status for upstream failures. Logging provides operational visibility.
| responses={ | ||
| 200: OpenApiResponse(description="Image content"), | ||
| 400: OpenApiResponse(description="Invalid request"), | ||
| 403: OpenApiResponse(description="Forbidden"), | ||
| 413: OpenApiResponse(description="Image too large"), | ||
| 502: OpenApiResponse(description="Failed to fetch external image"), | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add 404 response to OpenAPI schema.
The implementation returns 404 when the mailbox is not found (lines 70-72), but this response code is not documented in the OpenAPI schema.
Apply this diff:
responses={
+ 404: OpenApiResponse(description="Mailbox not found"),
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| responses={ | |
| 200: OpenApiResponse(description="Image content"), | |
| 400: OpenApiResponse(description="Invalid request"), | |
| 403: OpenApiResponse(description="Forbidden"), | |
| 413: OpenApiResponse(description="Image too large"), | |
| 502: OpenApiResponse(description="Failed to fetch external image"), | |
| }, | |
| responses={ | |
| 200: OpenApiResponse(description="Image content"), | |
| 400: OpenApiResponse(description="Invalid request"), | |
| 403: OpenApiResponse(description="Forbidden"), | |
| 404: OpenApiResponse(description="Mailbox not found"), | |
| 413: OpenApiResponse(description="Image too large"), | |
| 502: OpenApiResponse(description="Failed to fetch external image"), | |
| }, |
🤖 Prompt for AI Agents
In src/backend/core/api/viewsets/image_proxy.py around lines 57 to 63, the
OpenAPI responses mapping is missing the 404 entry even though the handler
returns 404 when a mailbox is not found (lines 70-72); add a 404:
OpenApiResponse(description="Not found" or "Mailbox not found") entry to the
responses dict so the OpenAPI schema documents the 404 case.
| # SSRF protection: validate URL before making any request | ||
| is_safe, error_message = validate_url_safety(url) | ||
| if not is_safe: | ||
| logger.warning("Blocked unsafe URL: %s - %s", url, error_message) | ||
| # Return placeholder image instead of JSON error for better UX | ||
| svg_placeholder = """<svg xmlns="http://www.w3.org/2000/svg" width="400" height="100" viewBox="0 0 400 100"> | ||
| <rect width="100%" height="100%" fill="#f8f9fa"/> | ||
| <text x="50%" y="50%" text-anchor="middle" dominant-baseline="middle" | ||
| font-family="system-ui, -apple-system, sans-serif" font-size="14" fill="#6c757d"> | ||
| 🚫 Image blocked for security reasons | ||
| </text> | ||
| </svg>""" | ||
| return HttpResponse( | ||
| svg_placeholder, | ||
| content_type="image/svg+xml", | ||
| status=403, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Extract duplicate SVG placeholder to a module-level constant.
The SVG placeholder for blocked images appears here and again at lines 172-178. This violates DRY principles and makes updates error-prone.
Apply this diff:
logger = logging.getLogger(__name__)
+
+# Placeholder SVG shown when images are blocked for security or validation reasons
+BLOCKED_IMAGE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="400" height="100" viewBox="0 0 400 100">
+ <rect width="100%" height="100%" fill="#f8f9fa"/>
+ <text x="50%" y="50%" text-anchor="middle" dominant-baseline="middle"
+ font-family="system-ui, -apple-system, sans-serif" font-size="14" fill="#6c757d">
+ 🚫 Image blocked for security reasons
+ </text>
+</svg>"""
class ImageProxyViewSet(ViewSet):Then use it at line 98 and line 172:
- svg_placeholder = """<svg xmlns="http://www.w3.org/2000/svg" width="400" height="100" viewBox="0 0 400 100">
- <rect width="100%" height="100%" fill="#f8f9fa"/>
- <text x="50%" y="50%" text-anchor="middle" dominant-baseline="middle"
- font-family="system-ui, -apple-system, sans-serif" font-size="14" fill="#6c757d">
- 🚫 Image blocked for security reasons
- </text>
-</svg>"""
return HttpResponse(
- svg_placeholder,
+ BLOCKED_IMAGE_SVG,
content_type="image/svg+xml",
status=403,
)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/backend/core/api/viewsets/image_proxy.py around lines 93-109 and again at
lines ~172-178, the SVG placeholder is duplicated; extract that SVG string into
a single module-level constant (e.g., BLOCKED_IMAGE_SVG) defined near the top of
the file, then replace the inline SVG literals at line 98 and line 172 with
references to that constant when constructing the HttpResponse (preserve
content_type="image/svg+xml" and status=403). Ensure the constant is a plain
triple-quoted string and update any imports or linter notes if necessary.
imghdr was removed in Python 3.13. Use python-magic (already a dependency) instead for more robust MIME type detection.
|
@mosa-riel awesome contribution, thanks!! A couple high-level notes:
@jbpenrath any other remarks? |
jbpenrath
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job ❤️
Can you add unit tests for backend utils and new endpoint ?
|
Good comments, thank you for the insights. I created this so I have a working version for here, but I would be happy to integrate more of your comments. I think we need to add some level of user consent to load external images. Even though the user IP can't leak because we proxy the request, many external images contain individualized tracking links that can report that an email was read. That is true. I don't think that is hard to implement. Let me update the PR later. I'm not sure we should parse the email content to rewrite the image srcs on the backend. We already have a parsing hook on the frontend (modified in this patch), it would be a good place to implement the rewrite instead? This could then be done dynamically based on user consent. What parsing hook are you referring too? I don't really have a preference here, I thought it was nice to not give any external URL to the frontend at all. The consent could just trigger a new fetch from the backend with a tag to add the images. we could also parse attributes like image dimensions (to block trackers that are 1px for instance). isn't image/svg+xml a vector for injecting scripts or other nasty stuff? It would probably be better to have a stricter mime type whitelist I will take these remarks and update the PR. @jbpenrath any other remarks? |
I meant this one: https://github.com/suitenumerique/messages/pull/398/files#diff-e00c6a9c9e319c3b53ce318c4e9ab5366c2bb3fee498726013fb5a07ee3af76fR74 The only advantage to parsing on the frontend I see (but I could be missing something!) would be to create opaque proxy URLs so that the endpoint doesn't accept any URL. But even that seems fragile as you could just send yourself an email with any image URL you want. So I think it's best to leave the email untouched from the backend and do the rewrite from the frontend in that hook.
Indeed. To be tested
I didn't mean for the placeholder image, I meant for embedded external images in the email with this check: |
Summary
This PR adds an image proxy feature that enables strict Content Security Policy (CSP) enforcement while maintaining email functionality, and provides privacy protection by proxying external images through the application server.
Closes #187
Motivation
Primary goal: CSP Compliance
Modern web applications need strict Content Security Policy headers to prevent XSS attacks. However, emails often contain external images that violate a strict
img-src 'self'CSP policy.By proxying all external images through the application server, all image requests originate from the same domain, enabling:
Secondary goal: Privacy Protection
External images in emails are commonly used as tracking pixels. The image proxy prevents exposure of user IP addresses and reading behavior to third-party servers.
Security
The image proxy implements defense-in-depth security measures to prevent Server-Side Request Forgery (SSRF) attacks:
SSRF Protection:
Memory Protection:
Content Validation:
Implementation
Backend:
GET /api/v1.0/mailboxes/{mailbox_id}/image-proxy/?url={encoded_url}core/utils.pyMessageSerializerto proxy external images and handle CID attachmentsPROXY_EXTERNAL_IMAGESandPROXY_MAX_IMAGE_SIZE_MBenvironment variablesFrontend:
Dependencies:
beautifulsoup4for HTML parsingpython-magicfor MIME type validationURL rewriting
External images:
CID attachments:
Benefits
img-src 'self'security policyConfiguration
Summary by CodeRabbit
New Features
Improvements