A professional Python-based tool that processes DefensePro forensics data from CSV files (raw or zipped) and generates comprehensive HTML and PDF reports with interactive visualizations for both technical and sales audiences.
- Memory-Efficient Processing: Handles files from 1MB to 1GB+ using chunked processing
- Intelligent Date Parsing: Automatically detects and handles multiple date formats
- Month-to-Month Trends: Analyzes complete calendar months for accurate trend analysis
- Interactive Visualizations: Professional charts with configurable themes using Plotly
- Comprehensive Chart Customization: 6 color palettes and 11 configurable chart types
- Dual Output Formats: Generates both HTML (interactive) and PDF reports
- Cross-Platform: Works on Windows, Linux, and macOS
- Comprehensive Analysis: Both holistic (entire dataset) and trend (monthly) analysis
- Batch Processing: Process multiple files automatically
- Professional Reports: Executive summaries and detailed technical analysis
- Python 3.8 or higher
- 8-16GB RAM recommended for large files
- Internet connection for initial setup (package installation)
- Clone or download this repository

      cd SE_new_report

- Create your configuration file

  The script will automatically create config.py from config_example.py on first run. Alternatively, you can create it manually:

      # Windows PowerShell
      Copy-Item config_example.py config.py
      # Linux/Mac
      cp config_example.py config.py

  Note: config.py is not tracked by git, so your custom settings will remain local.

- Install dependencies

      pip install -r requirements.txt

- Install Playwright for PDF generation (optional but recommended)

      playwright install chromium
- Create virtual environment

      python -m venv forensics_env
      # Windows
      forensics_env\Scripts\activate
      # Linux/Mac
      source forensics_env/bin/activate

- Create your configuration file

      # Windows PowerShell
      Copy-Item config_example.py config.py
      # Linux/Mac
      cp config_example.py config.py

- Install dependencies

      pip install -r requirements.txt
      playwright install chromium
Core dependencies automatically installed:
- polars - High-performance data processing
- plotly - Interactive visualizations
- jinja2 - HTML templating
- playwright - PDF generation (or weasyprint as fallback)
- python-dateutil - Flexible date parsing
- tqdm - Progress bars
- click - Command-line interface
- psutil - Memory monitoring
SE_new_report/
├── analyzer.py # Main orchestrator script
├── data_processor.py # CSV parsing and data processing
├── report_generator.py # HTML/PDF generation
├── visualizations.py # Chart creation logic
├── utils.py # Helper functions
├── config_example.py # Configuration template (tracked in git)
├── config.py # Your configuration (auto-created, not tracked)
├── requirements.txt # Python dependencies
├── README.md # This file
├── forensics_input/ # Input directory (place files here)
│   └── .gitkeep
└── report_files/ # Output directory
    └── .gitkeep
Place your DefensePro forensics export files in the forensics_input/ directory:
- CSV files: Direct forensics exports
- ZIP files: Compressed forensics exports (will be automatically extracted)
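The automatic ZIP extraction mentioned above could be sketched roughly as follows, using only the standard library. The helper name `extract_csv_members` and the choice to extract only `.csv` members are assumptions for illustration; the tool's own extraction code may work differently.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def extract_csv_members(zip_path: Path, dest_dir: Path) -> List[Path]:
    """Extract every .csv member of zip_path into dest_dir and return the paths.

    Hypothetical helper, not part of the tool's documented API.
    """
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Skip anything that is not a CSV (notes, directories, etc.).
            if member.lower().endswith(".csv"):
                extracted.append(Path(archive.extract(member, path=dest_dir)))
    return extracted


# Demo inside a temporary directory, which is removed automatically afterwards.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    sample_zip = tmp_path / "export.zip"
    with zipfile.ZipFile(sample_zip, "w") as archive:
        archive.writestr("forensics.csv", "Start Time,Attack Name\n")
        archive.writestr("notes.txt", "ignored")
    out_dir = tmp_path / "out"
    out_dir.mkdir()
    print([p.name for p in extract_csv_members(sample_zip, out_dir)])
```

Extracting into a `TemporaryDirectory` means the unpacked files are removed when the block exits, which is one simple way to get automatic cleanup.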
Expected CSV columns (some may be missing depending on device):
- S.No, Start Time, End Time, Device IP Address, Threat Category
- Attack Name, Policy Name, Action, Attack ID, Source IP Address
- Source Port, Destination IP Address, Destination Port, Direction
- Protocol, Radware ID, Duration, Total Packets, Packet Type
- Total Mbits, Max pps, Max bps, Physical Port, Risk, VLAN Tag
- Footprint, Device Name, Device Type, Workflow Rule Process
- Activation Id, Protected Object
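Because some columns may be missing, a header check before processing can be useful. The sketch below assumes the four required columns named in the troubleshooting notes (Start Time, Attack Name, Source IP Address, Destination IP Address); the function name `missing_required_columns` is hypothetical and this is not the tool's actual validation code.

```python
import csv
import io
from typing import List

# Required columns as named in this README's troubleshooting notes.
REQUIRED_COLUMNS = [
    "Start Time",
    "Attack Name",
    "Source IP Address",
    "Destination IP Address",
]


def missing_required_columns(csv_stream) -> List[str]:
    """Return the required column names that are absent from the header row."""
    header = next(csv.reader(csv_stream), [])
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


sample = io.StringIO("S.No,Start Time,Attack Name,Source IP Address\n1,a,b,c\n")
print(missing_required_columns(sample))  # ['Destination IP Address']
```

Only the header row is read, so the check is cheap even for very large exports.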
Basic usage (processes all files in forensics_input/):

    python analyzer.py

Advanced usage:

    # Specify custom directories
    python analyzer.py --input-dir /path/to/files --output-dir /path/to/reports
    # Generate only HTML reports
    python analyzer.py --format html
    # Generate only PDF reports
    python analyzer.py --format pdf
    # Enable verbose logging
    python analyzer.py --verbose
    # Help
    python analyzer.py --help

Reports are generated in the report_files/ directory:
- {filename}_report.html - Interactive HTML report
- {filename}_report.pdf - PDF version for sharing
- batch_summary_{timestamp}.html - Summary when processing multiple files
- Total events, date range, daily averages
- Top attack types and trends
- Business impact assessment
- Security events over time
- Top attack types by month
- Attack volume trends (Mbits, PPS, BPS)
- Attack intensity heatmap by hour and month
- Attack type distribution
- Top source IP addresses
- Protocol distribution
- Daily attack timeline
- Top 10 attack types with percentages
- Top 10 source IPs
- Top 10 targeted destinations
- Data quality and methodology notes
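A "Top 10 with percentages" table of the kind listed above can be computed with a simple counter. This is a minimal sketch under stated assumptions: the function name `top_n_with_share` is hypothetical, and percentages are taken relative to the events that remain after exclusions, which may not match how the tool computes them. The excluded values in the demo follow the changelog note about dropping 0.0.0.0 and Multiple from the Top 10 source list.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_n_with_share(
    values: Iterable[str], n: int = 10, exclude: Tuple[str, ...] = ()
) -> List[Tuple[str, int, float]]:
    """Return (value, count, percent) for the n most frequent values."""
    counts = Counter(v for v in values if v not in exclude)
    total = sum(counts.values())
    # most_common() is empty when total is 0, so no division by zero occurs.
    return [
        (name, count, round(100.0 * count / total, 1))
        for name, count in counts.most_common(n)
    ]


sources = ["10.0.0.1", "10.0.0.1", "10.0.0.2", "0.0.0.0", "Multiple"]
print(top_n_with_share(sources, n=10, exclude=("0.0.0.0", "Multiple")))
```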
The tool uses config.py for all settings. On first run, this file is automatically created from config_example.py if it doesn't exist. Your config.py file is not tracked by git, allowing you to maintain custom settings without affecting version control.
To reset to default settings, simply delete config.py and run the script again, or manually copy from config_example.py.
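The create-on-first-run behaviour described above amounts to a guarded file copy. The following sketch shows the idea; `ensure_config` is a hypothetical name and the tool's real bootstrap logic may differ.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_config(project_dir: Path) -> bool:
    """Copy config_example.py to config.py if config.py is missing.

    Returns True when a new config.py was created, False if one already existed.
    """
    config = project_dir / "config.py"
    template = project_dir / "config_example.py"
    if config.exists():
        return False
    shutil.copyfile(template, config)
    return True


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "config_example.py").write_text("CHUNK_SIZE = 50_000\n")
    print(ensure_config(project))  # True: config.py was created
    print(ensure_config(project))  # False: the existing config.py is left alone
```

Because an existing config.py is never overwritten, deleting it is enough to get the defaults back on the next run.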
Edit config.py to adjust:
- CHUNK_SIZE: Rows processed per chunk (default: 50,000)
- MAX_MEMORY_USAGE_GB: Memory warning threshold (default: 2GB)
The tool provides extensive chart customization options in config.py:
Color Themes: Choose from 6 professionally designed color palettes:
ACTIVE_COLOR_PALETTE: Switch between 'radware_corporate', 'professional_blue', 'modern_minimal', 'vibrant_corporate', 'high_contrast', 'colorblind_friendly'
Chart Types: Configure visualization types for each chart:
CHART_PREFERENCES: Set chart types (line, bar, pie, donut, heatmap, area, etc.)
Individual Overrides: Customize specific chart colors:
CHART_COLOR_ASSIGNMENTS: Override colors for individual charts while keeping global theme
Example - Switch to Professional Blue theme:

    ACTIVE_COLOR_PALETTE = 'professional_blue'  # Instead of 'radware_corporate'

Example - Change chart types:

    CHART_PREFERENCES = {
        'monthly_events_trend': {
            'default_type': 'line',  # Change from 'bar' to 'line'
            # ... other configuration
        },
        'attack_type_distribution': {
            'default_type': 'donut',  # Change from 'pie' to 'donut'
            # ... other configuration
        }
    }

- Small files (< 10MB): Process in seconds
- Medium files (10-100MB): Process in 1-2 minutes
- Large files (100MB-1GB): Process in 2-15 minutes
- Very large files (> 1GB): May take 15+ minutes
- Tool automatically uses chunked processing
- Memory usage scales with chunk size, not file size
- Monitor memory warnings in verbose mode
- Reduce chunk size if memory warnings appear
- Close other applications when processing large files
- Use SSD storage for better I/O performance
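To illustrate why memory use follows chunk size rather than file size, here is a minimal chunked-aggregation sketch using only the standard library. It is not the tool's implementation (which is described as using polars); `iter_chunks` and `count_attacks` are hypothetical names, and the default of 50,000 rows mirrors the documented CHUNK_SIZE default.

```python
import csv
import io
from collections import Counter
from itertools import islice
from typing import Iterator, List


def iter_chunks(reader: csv.DictReader, chunk_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most chunk_size rows until the reader is exhausted."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def count_attacks(stream, chunk_size: int = 50_000) -> Counter:
    """Count events per attack name while holding only one chunk in memory."""
    totals = Counter()
    for chunk in iter_chunks(csv.DictReader(stream), chunk_size):
        totals.update(row["Attack Name"] for row in chunk)
    return totals


data = io.StringIO("Attack Name\nSYN Flood\nUDP Flood\nSYN Flood\n")
print(count_attacks(data, chunk_size=2))
```

Only the running totals and the current chunk are kept, so lowering `chunk_size` lowers peak memory at the cost of more iterations.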
1. "No CSV or ZIP files found"
- Ensure files are in the forensics_input/ directory
- Check file extensions (.csv or .zip)
- Verify file permissions
2. "Missing required columns"
- Check CSV has required columns: Start Time, Attack Name, Source IP Address, Destination IP Address
- Verify CSV is a DefensePro forensics export (not traffic data)
3. "Failed to parse dates"
- Check date format in CSV (common: MM.DD.YYYY HH:MM:SS)
- Tool auto-detects formats but may need manual verification
4. "PDF generation failed"
- Install Playwright: pip install playwright && playwright install chromium
- Alternative: Install WeasyPrint: pip install weasyprint
- Manual fallback: Open HTML in browser and use "Print to PDF"
5. "Memory errors"
- Reduce CHUNK_SIZE in config.py
- Close other applications
- Process files individually instead of batch
6. "Chart generation failed"
- Install visualization dependencies: pip install plotly kaleido
- Check internet connection for initial Plotly setup
- Enable verbose logging: python analyzer.py --verbose
- Check log output for specific error messages
- Verify file formats match expected CSV structure
- Test with smaller files first
- CSV files: Direct DefensePro forensics exports
- ZIP files: Compressed CSV files (auto-extracted)
Tool automatically detects common formats:
- MM.DD.YYYY HH:MM:SS (primary format)
- DD.MM.YYYY HH:MM:SS
- YYYY-MM-DD HH:MM:SS
- Various delimiter combinations (., /, -)
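One way to auto-detect among the formats listed above is to try each candidate against a sample of timestamps and keep the first one that parses all of them. The sketch below assumes that approach; `CANDIDATE_FORMATS` and `detect_date_format` are illustrative names and not the tool's DATE_FORMATS or detection logic.

```python
from datetime import datetime
from typing import Iterable, List, Optional

# Candidate formats in priority order; the first entry mirrors the
# "primary format" above. Illustrative list only.
CANDIDATE_FORMATS = [
    "%m.%d.%Y %H:%M:%S",
    "%d.%m.%Y %H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%m/%d/%Y %H:%M:%S",
]


def detect_date_format(samples: Iterable[str]) -> Optional[str]:
    """Return the first candidate format that parses every sample, else None."""
    values: List[str] = [s.strip() for s in samples if s.strip()]
    if not values:
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            for value in values:
                datetime.strptime(value, fmt)
        except ValueError:
            continue  # At least one sample does not fit; try the next format.
        return fmt
    return None


# The second sample has day 25, which rules out month-first parsing.
print(detect_date_format(["01.02.2025 10:00:00", "25.02.2025 11:30:00"]))
```

Note the inherent ambiguity: if every sampled day is 12 or lower, both month-first and day-first formats parse, and the priority order decides. That is the case where manual verification of the detected format matters most.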
- Tested: Up to 10 million rows
- Recommended: Under 1GB per file
- Memory: Scales with chunk size, not file size
- Complete months only: Excludes partial months at dataset boundaries
- Fair comparisons: Ensures accurate trend analysis
- Minimum requirement: At least 1 complete month of data
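The complete-months rule can be expressed as a small date calculation. The sketch below is a strict, day-level version: a month counts only if the data spans from its first day through its last day. The function names are hypothetical, and the tool itself appears to apply additional heuristics (see the changelog entries on first and last month identification), so treat this as an approximation of the idea rather than the actual algorithm.

```python
from datetime import date, timedelta
from typing import List, Tuple


def next_month(year: int, month: int) -> Tuple[int, int]:
    """Return the (year, month) that follows the given one."""
    return (year + 1, 1) if month == 12 else (year, month + 1)


def complete_months(first_day: date, last_day: date) -> List[Tuple[int, int]]:
    """Return (year, month) pairs fully covered by [first_day, last_day]."""
    year, month = first_day.year, first_day.month
    # A month that starts after its 1st is partial, so begin with the next one.
    if first_day.day != 1:
        year, month = next_month(year, month)
    months = []
    # Last day of the current month = first day of the next month minus one day.
    while date(*next_month(year, month), 1) - timedelta(days=1) <= last_day:
        months.append((year, month))
        year, month = next_month(year, month)
    return months


# Data from 15 Oct 2025 to 20 Jan 2026: only November and December are complete.
print(complete_months(date(2025, 10, 15), date(2026, 1, 20)))
# [(2025, 11), (2025, 12)]
```

An empty result corresponds to the "at least 1 complete month" requirement not being met.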
- Entire dataset: Uses all available data regardless of month boundaries
- Comprehensive statistics: Full picture of security posture
- No time filtering: Maximum data utilization
- Intelligent parsing: Handles missing columns gracefully
- Validation: Verifies data integrity throughout processing
- Transparency: Reports data quality issues and exclusions
- Local processing: All data remains on your machine
- No network transmission: Data never sent to external services
- Temporary files: Automatically cleaned up after processing
- Memory management: Sensitive data cleared from memory
When reporting issues, please include:
- Input file characteristics (size, date range, format)
- Error messages from verbose logging
- System specifications (OS, Python version, RAM)
- Command used and expected vs actual behavior
Tool performance varies by:
- Hardware: CPU, RAM, storage type
- File characteristics: Size, date range, data density
- System load: Other running applications
This tool is designed for internal use with DefensePro forensics data. Please ensure compliance with your organization's data handling policies.
| Version | Change/Fixes/Features |
|---|---|
| v2.0.6 | - 2/27/26 - Fixed month exclusion caused by date format mismatch: detection now correctly identifies %H:%M (no-seconds) timestamps instead of always assuming %H:%M:%S. Added no-seconds variants to DATE_FORMATS. Fixed last month (e.g. December) being incorrectly excluded when data ends within last 7 days of month. Fixed duplicate log output by clearing existing handlers before setup. Improved Attack Volume Trends chart spacing. |
| v2.0.5 | - 11/27/25 - Fixed date identification with / separator (e.g. 11/27/2025). Improved full-month identification when logged events do not start on the 1st of the month. Cosmetic change: removed "Top 5" from the "Security Events by Policy" headline. Removed 0.0.0.0 and Multiple from Top 10 Sources and Top 10 attacked destinations. |
| v2.0.4 | - 11/7/25 - Fixed EXCLUDE_FILTER - was referenced twice in config.py example |
| v2.0.3 | - 11/3/25 - Fixed 2 charts - Top 5 Attacks by Max Bandwidth and PPS. Added config_example.py and removed config.py from git tracking. Enhanced identification of the last full month. |
| v2.0.2 | - 10/24/25 - Fixed CHART_PREFERENCES variables |
| v2.0.1 | - 10/24/25 - Major UX Enhancement: Added user-configurable control for chart types, layouts, and colors. Improved config.py architecture for intuitive customization. Introduced color palettes with individual chart override support if needed. Updated documentation with customization examples. |
| v2.0.0 | - 10/23/25 - Added new charts: (1) Top 5 attacks by Gbps, (2) Top 5 attacks by PPS, (3) Security Events by Policy |
| v1.1.9 | - 10/21/25 - Enhanced identification of the first complete month (challenge with Packet Anomalies unfiltered) |
| v1.1.8 | - 10/21/25 - Enhanced Attack Type Distribution pie chart visualization and style to avoid overlap between categories and title |
| v1.1.7 | - 10/21/25 - Updated height of the bar charts so the legend does not overlap the x-axis text<br>- Added automatic font adjustment if the text is too long (longest attack duration, for example)<br>- Added days to longest attack duration<br>- Added excluded events text to the executive summary HTML |
| v1.1.6 | - 10/21/25 - Removed zoom from bar charts, removed vertical zoom from Daily Attack events |
| v1.1.5 | - 10/21/25 - Bugfix: inconsistent statistics for Max Gbps and Max PPS in Summary statistics and Volume trends |
| v1.1.4 | - 10/20/25 - Enhancement to better identify complete months |
| v1.1.3 | - Bugfix: automatic detection of the date format |
| v1.1.2 | - Added configurable chart customizations |
| v1.1.1 | - Fixed/enhanced accuracy in automatic date identification<br>- Added FORCE_DATE_FORMAT variable in config.py<br>- Fixed csv processing using lazy method<br>- Removed Data Quality Notes section |
| v1.1.0 | - Added configurable output format (html, pdf, both) in config.py variable OUTPUT_FORMATS<br>- Added support for column header name variations for both 'Total Packets' and 'Total Packets Dropped', also 'Total Mbits' and 'Total Mbits Dropped'<br>- Added logic: if VOLUME_UNIT is MB -> show Mbps, if GB -> show Gbps, if TB -> show Gbps<br>- Added cdn mode, which reduced HTML size from 37MB to 116KB<br>- Added expandable details for Summary statistics |
| v1.0.1 | - Added filtering. Use new EXCLUDE_FILTERS var under config.py |
| v1.0.0 | - Initial release<br>- Support for CSV and ZIP input files<br>- HTML and PDF report generation<br>- Interactive Plotly visualizations<br>- Memory-efficient processing<br>- Cross-platform compatibility<br>- Batch processing support |
- Memory-Efficient Processing: Intelligent chunked processing handles files from MB to GB+ sizes
- Smart Date Detection: Automatic format recognition with manual override capabilities
- Complete Month Analysis: Sophisticated algorithm identifies and analyzes complete calendar months for accurate trending
- Data Quality Validation: Built-in validation and cleansing with transparent reporting
- Professional Color Palettes: 6 scientifically designed themes including corporate branding, accessibility, and colorblind-friendly options
- Flexible Chart Types: 11 fully configurable visualization types (including line, bar, pie, donut, heatmap, area, stacked, horizontal)
- Granular Customization: Individual chart color overrides while maintaining global theme consistency
- Modern Configuration Architecture: Clean separation of settings and logic with immediate hot-reload capability
- Multi-Dimensional Trending: Month-over-month analysis of attack patterns, volumes, and intensities
- Attack Profiling: Detailed breakdown by type, source, protocol, policy, and temporal patterns
- Performance Metrics: Bandwidth utilization, packet rates, and volume analysis with configurable units
- Executive Reporting: Professional summaries with expandable technical details
- One-Click Theming: Instantly switch color schemes across entire report suite
- Cross-Platform Compatibility: Seamless operation on Windows, Linux, and macOS
- Dual Output Formats: Interactive HTML and print-ready PDF with consistent styling
- Batch Processing: Multiple file analysis with consolidated summary reporting
- Command-Line Flexibility: Comprehensive CLI with format control and verbose logging
- Configuration-Driven: All customization through centralized, well-documented configuration files
- Performance Optimized: CDN-based chart delivery reduces file sizes from 37MB to 116KB
- Extensible Design: Modular architecture supports easy addition of new chart types and analysis methods
- Professional Deployment: Ready for enterprise environments with comprehensive troubleshooting documentation
    # 1. Place forensics CSV files in forensics_input/
    # 2. Run analysis
    python analyzer.py --verbose
    # 3. Open generated reports in report_files/
    # HTML: Interactive charts and analysis
    # PDF: Print-ready version for sharing

For additional help: python analyzer.py --help