[🔒 Security]: Add comprehensive input validation and sanitization for all API endpoints #751

@shreyansh-tech21

About the Issue

Several API endpoints accept user input without proper validation:

  1. /api/analyze (POST): Accepts file uploads without validating the file content. Only the extension is checked, not the leading magic bytes, so a user can upload a malicious payload disguised as a .png file.

  2. /api/weather (GET): Accepts lat, lon, and city parameters with no range validation. A request with lat=999 or lon=-9999 is passed directly to the external weather APIs.

  3. /admin/models/register (POST): Accepts a file path from the request body and passes it to os.path.exists(). This could be exploited via path traversal to probe the server's filesystem structure.

  4. /api/chat (POST): No input length validation. A very long message string could trigger regex denial of service (ReDoS) in the keyword matching patterns.

Motivation

As an open-source project that processes user-uploaded files and makes external API calls, Agri-Vision must validate all inputs to prevent:

  • Path traversal attacks on model registration
  • ReDoS attacks on the chatbot
  • File upload abuse (disguised malicious files)
  • API abuse with invalid coordinates consuming rate limits

Proposed Solution

1. File content validation (magic bytes)

MAGIC_BYTES = {
    b'\x89PNG': 'png',
    b'\xff\xd8\xff': 'jpg',
    b'GIF87a': 'gif',
    b'GIF89a': 'gif',
}

def validate_image_content(file_storage) -> bool:
    """Validate file is actually an image by checking magic bytes."""
    header = file_storage.read(8)
    file_storage.seek(0)
    return any(header.startswith(magic) for magic in MAGIC_BYTES)
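
To try the check without a running server, it can be fed in-memory streams. The following is only a sketch under a few assumptions: the name `looks_like_image` is hypothetical, `io.BytesIO` stands in for the uploaded file object (anything with `read` and `seek` should work), and the PNG entry uses the full 8-byte signature rather than the 4-byte prefix shown above.

```python
import io

# Leading bytes ("magic numbers") of the accepted image formats.
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG, full 8-byte signature
    b"\xff\xd8\xff",        # JPEG
    b"GIF87a",              # GIF, 1987 version
    b"GIF89a",              # GIF, 1989 version
)


def looks_like_image(stream) -> bool:
    """Return True if the stream starts with a known image signature."""
    header = stream.read(8)
    stream.seek(0)  # rewind so later code can still read the whole file
    # bytes.startswith accepts a tuple of candidate prefixes
    return header.startswith(IMAGE_SIGNATURES)


real_png = io.BytesIO(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
disguised = io.BytesIO(b"<script>alert(1)</script>")  # e.g. saved as leaf.png
print(looks_like_image(real_png), looks_like_image(disguised))
```

A header match only shows that the file begins like an image. It does not prove the rest of the file is well formed, so it complements the extension check rather than replacing a full decode.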

2. Coordinate range validation

def validate_coordinates(lat: float, lon: float) -> bool:
    return -90 <= lat <= 90 and -180 <= lon <= 180
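
Query parameters arrive as strings, so the range check needs a parsing step in front of it. A minimal sketch, assuming a hypothetical helper named `parse_coordinates` that is not part of the existing code:

```python
import math
from typing import Optional, Tuple


def parse_coordinates(raw_lat, raw_lon) -> Optional[Tuple[float, float]]:
    """Convert query-string values to floats and range-check them.

    Returns (lat, lon) on success, or None if either value is missing,
    non-numeric, non-finite, or outside the valid range.
    """
    try:
        lat, lon = float(raw_lat), float(raw_lon)
    except (TypeError, ValueError):
        return None  # missing (None) or not a number
    if not (math.isfinite(lat) and math.isfinite(lon)):
        return None  # float() accepts "nan" and "inf", so reject them here
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return lat, lon
    return None


print(parse_coordinates("28.6", "77.2"))   # (28.6, 77.2)
print(parse_coordinates("999", "-9999"))   # None
print(parse_coordinates("abc", "10"))      # None
```

Returning None for every failure keeps the route handler to a single branch that responds with a 400 before any external weather API is called.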

3. Path traversal prevention

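import os

def validate_model_path(path: str) -> bool:
    """Ensure model path resolves to a location inside the models/ directory."""
    # realpath resolves ".." segments and symlinks before comparing.
    models_dir = os.path.realpath("models")
    real_path = os.path.realpath(path)
    # A plain startswith() check would also accept siblings such as "models_evil/".
    try:
        return os.path.commonpath([real_path, models_dir]) == models_dir
    except ValueError:  # e.g. paths on different Windows drives
        return False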
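
One caveat with containment checks: a plain string-prefix comparison treats a sibling directory such as models_backup/ as if it were inside models/. The sketch below compares whole path components instead. It assumes a hypothetical helper `is_inside_directory` that takes the base directory as a parameter so it can be tried against a temporary directory, and the file name `crop.h5` is made up for illustration.

```python
import os
import tempfile


def is_inside_directory(path: str, base_dir: str) -> bool:
    """Return True if path resolves to base_dir or something beneath it."""
    # realpath collapses ".." segments and follows symlinks
    base = os.path.realpath(base_dir)
    target = os.path.realpath(path)
    try:
        # commonpath compares whole components, so "models_backup"
        # is not mistaken for a child of "models"
        return os.path.commonpath([base, target]) == base
    except ValueError:
        return False  # e.g. paths on different Windows drives


with tempfile.TemporaryDirectory() as root:
    models = os.path.join(root, "models")
    os.mkdir(models)
    for candidate in ("crop.h5", "../models_backup/crop.h5", "../../etc/passwd"):
        full = os.path.join(models, candidate)
        print(candidate, is_inside_directory(full, models))
```

Only the first candidate should be reported as inside the directory. The other two resolve to locations outside models/ once the ".." segments are collapsed.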

4. Chat input length limit

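MAX_CHAT_MESSAGE_LENGTH = 500

@app.route("/api/chat", methods=["POST"])
def api_chat():
    # get_json(silent=True) returns None for a missing or malformed body
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        return jsonify({"reply": "Request body must be a JSON object."}), 400
    message = str(data.get("message", ""))
    if len(message) > MAX_CHAT_MESSAGE_LENGTH:
        return jsonify({"reply": f"Message too long. Please keep it to {MAX_CHAT_MESSAGE_LENGTH} characters or fewer."}), 400
    ...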
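
The length check is easier to unit-test when it lives in a function that does not depend on Flask. A rough sketch, assuming a hypothetical helper `extract_chat_message` that the route would call with the parsed JSON body; the extra type and emptiness checks are illustrative additions, not part of the proposal above.

```python
from typing import Any, Optional, Tuple

MAX_CHAT_MESSAGE_LENGTH = 500


def extract_chat_message(payload: Any) -> Tuple[Optional[str], Optional[str]]:
    """Return (message, None) for acceptable input, else (None, error)."""
    if not isinstance(payload, dict):
        return None, "Request body must be a JSON object."
    message = payload.get("message", "")
    if not isinstance(message, str):
        return None, "Field 'message' must be a string."
    message = message.strip()
    if not message:
        return None, "Message must not be empty."
    # Reject oversized input before any regex sees it
    if len(message) > MAX_CHAT_MESSAGE_LENGTH:
        return None, (
            f"Message too long. Please keep it to "
            f"{MAX_CHAT_MESSAGE_LENGTH} characters or fewer."
        )
    return message, None


print(extract_chat_message({"message": "Is my wheat healthy?"}))
print(extract_chat_message({"message": "x" * 501})[1])
```

Capping the length bounds the input that the keyword patterns have to scan. It does not by itself make a pathological pattern safe, so the patterns are still worth reviewing.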

Files to Change

| File | Change |
| --- | --- |
| app.py | Add validate_image_content() to /analyze and /api/analyze |
| app.py | Add validate_coordinates() to /api/weather |
| app.py | Add validate_model_path() to /admin/models/register |
| app.py | Add message length check to /api/chat |
| tests/test_app.py | Add tests for each validation: invalid magic bytes, out-of-range coords, path traversal attempts, oversized chat messages |

Impact

  • Security: Prevents multiple attack vectors (path traversal, ReDoS, file upload abuse)
  • Reliability: Catches invalid inputs early with clear error messages
  • Risk: Low, since the validation logic is purely additive
  • Backwards compatibility: Legitimate requests are unaffected

Labels

security, enhancement, good first issue, hacktoberfest
