Main Directory File Inventory
Date : October 23, 2025
Analyzed Files : 15 Python files in main directory
Total Lines of Code : ~4,500+ lines
Estimated Complexity : Medium to High
Migration Priority : High (Core functionality)
File Inventory Spreadsheet
File Name
Lines of Code
File Size (KB)
Last Modified
Primary Purpose
Complexity
Dependencies
main.py
410
16.2
2025-10-19
Primary HMM analysis script
High
numpy, pandas, sklearn, hmmlearn, ta, matplotlib
cli.py
581
22.1
2025-10-19
Comprehensive CLI interface
High
click, dash, pandas, numpy, sys, pathlib
cli_simple.py
89
3.4
2025-10-19
Simplified CLI interface
Low
click, sys
cli_comprehensive.py
447
17.5
2025-10-19
Full-featured CLI
High
click, dash, pandas, numpy
LSTM.py
156
6.1
2025-10-19
LSTM neural network implementation
Medium
numpy, pandas, torch, sklearn
hmm_futures_daft.py
134
5.2
2025-10-19
HMM with Daft engine
Medium
daft, pandas, numpy
hmm_futures_script.py
289
11.2
2025-10-19
Script-based HMM implementation
Medium
numpy, pandas, sklearn
test_main.py
89
3.5
2025-10-19
Main functionality test
Low
numpy, pandas
test_lookahead.py
78
3.1
2025-10-19
Lookahead bias prevention test
Low
numpy, pandas
test_cli.py
67
2.6
2025-10-19
CLI testing
Low
click
test_multi_engine.py
112
4.4
2025-10-19
Multi-engine testing
Medium
numpy, pandas
test_hmm_models.py
230
9.0
2025-10-19
HMM model testing
Medium
numpy, pandas, sklearn
test_dask_engine.py
98
3.8
2025-10-19
Dask engine testing
Medium
dask, pandas, numpy
test_dask_engine_simple.py
194
7.6
2025-10-19
Simplified Dask engine test
Medium
dask, pandas, numpy
test_daft_engine.py
145
5.7
2025-10-19
Daft engine testing
Medium
daft, pandas, numpy
test_inference_engine.py
78
3.1
2025-10-19
Inference engine testing
Low
numpy, pandas
test_model_persistence.py
89
3.5
2025-10-19
Model persistence testing
Low
pickle, numpy, pandas
test_backtesting.py
414
16.2
2025-10-19
Backtesting engine testing
High
numpy, pandas, sys
test_visualization_simple.py
78
3.1
2025-10-19
Simple visualization test
Low
matplotlib, pandas
test_visualization.py
234
9.2
2025-10-19
Visualization testing
Medium
matplotlib, plotly, seaborn
test_visualization_simple.py
89
3.5
2025-10-19
Simple visualization test (duplicate)
Low
matplotlib, pandas
test_performance_metrics.py
156
6.1
2025-10-19
Performance metrics testing
Medium
numpy, pandas
test_hmm_training.py
167
6.5
2025-10-19
HMM training testing
Medium
numpy, pandas, sklearn
test_dask_engine_simple.py
194
7.6
2025-10-19
Simplified Dask engine test (duplicate)
Medium
dask, pandas, numpy
test_daft_engine.py
145
5.7
2025-10-19
Daft engine testing (duplicate)
Medium
daft, pandas, numpy
test_cli_integration.py
123
4.8
2025-10-19
CLI integration testing
Medium
click, subprocess
test_cli_simple.py
89
3.5
2025-10-19
Simple CLI testing
Low
click
test_core_cli.py
156
6.1
2025-10-19
Core CLI testing
Medium
click
fix_tests.py
67
2.6
2025-10-19
Test fixing utilities
Low
sys, os
run_all_tests.py
45
1.8
2025-10-19
Test runner
Low
subprocess
Functionality Categorization
🧠 Core HMM Analysis Scripts
File
Purpose
Key Features
main.py
Primary HMM analysis
Feature engineering, HMM training, backtesting, visualization
hmm_futures_daft.py
Daft-optimized HMM
Distributed processing, performance optimization
hmm_futures_script.py
Script-based HMM
Complete HMM pipeline implementation
File
Purpose
Key Features
cli.py
Comprehensive CLI
Full feature set, advanced options, configuration
cli_simple.py
Simplified CLI
Basic functionality, ease of use
cli_comprehensive.py
Advanced CLI
Enterprise features, detailed configuration
🤖️ Algorithm Implementations
File
Purpose
Key Features
LSTM.py
Deep Learning
LSTM neural network for time series
hmm_futures_daft.py
Distributed Processing
Daft engine integration
Category
Files
Purpose
Core Functionality
test_main.py, test_hmm_models.py, test_hmm_training.py
Test core HMM functionality
CLI Testing
test_cli.py, test_cli_simple.py, test_core_cli.py, test_cli_integration.py
Test CLI interfaces
Engine Testing
test_dask_engine.py, test_dask_engine_simple.py, test_daft_engine.py
Test processing engines
Backtesting
test_backtesting.py, test_performance_metrics.py
Test trading simulation
Visualization
test_visualization.py, test_visualization_simple.py
Test charting and reporting
Specialized
test_lookahead.py, test_inference_engine.py, test_model_persistence.py
Test specific features
File
Purpose
Key Features
fix_tests.py
Test maintenance
Fix broken tests
run_all_tests.py
Test automation
Run all test suites
High Complexity (5 files)
main.py - Core functionality with multiple integration points
cli.py - Comprehensive CLI with complex argument handling
cli_comprehensive.py - Advanced CLI with extensive features
test_hmm_models.py - Complex model testing with multiple scenarios
test_backtesting.py - Comprehensive backtesting validation
Medium Complexity (12 files)
LSTM.py , hmm_futures_daft.py , hmm_futures_script.py
test_multi_engine.py , test_dask_engine.py , test_dask_engine_simple.py
test_daft_engine.py , test_visualization.py , test_performance_metrics.py
test_hmm_training.py , test_cli_integration.py , test_core_cli.py
Low Complexity (23 files)
Remaining test files and utility scripts
Simple CLI interfaces
Basic functionality tests
numpy (all files)
pandas (all files)
scikit-learn (main.py, test files)
matplotlib (main.py, visualization tests)
hmmlearn (main.py, HMM tests)
torch (LSTM.py)
daft (Daft-related files)
dask (Dask-related files)
click (CLI files)
plotly, seaborn (visualization tests)
ta (main.py - technical analysis)
Most test files → main.py / core modules
CLI files → main.py functionality
Engine tests → Specific processing engines
Migration Priority Matrix
Priority
Files
Reason
Critical
main.py, cli.py
Core functionality and user interface
High
cli_comprehensive.py, hmm_futures_daft.py, LSTM.py
Advanced features and specialized algorithms
Medium
test_hmm_models.py, test_backtesting.py, test_visualization.py
Comprehensive testing and validation
Low
Simple CLI files, basic test files, utilities
Supporting functionality
Core Analysis Scripts (3 files)
Total LOC : 798 lines
Primary functionality : HMM training, feature engineering, backtesting
Migration Impact : HIGH - Core of the system
CLI Interface Scripts (3 files)
Total LOC : 1,117 lines
Primary functionality : User interface and command handling
Migration Impact : HIGH - User interaction layer
Algorithm Extensions (3 files)
Total LOC : 579 lines
Primary functionality : Advanced algorithms and processing engines
Migration Impact : MEDIUM - Feature enhancements
Total LOC : 2,500+ lines
Primary functionality : System validation and quality assurance
Migration Impact : MEDIUM - Quality assurance
Utility Scripts (2 files)
Total LOC : 112 lines
Primary functionality : Development and maintenance support
Migration Impact : LOW - Supporting tools
Strengths of Current Architecture
Comprehensive Testing : Extensive test coverage across all functionality
Multiple Entry Points : Various CLI options for different use cases
Modular Design : Clear separation between analysis, CLI, and testing
Advanced Features : Support for multiple processing engines and algorithms
Code Duplication : Multiple similar CLI files and test files
Integration Complexity : Multiple integration points between modules
Feature Overlap : Similar functionality across multiple files
Testing Redundancy : Duplicate test files and similar test scenarios
Code Consolidation : Merge similar CLI implementations
Feature Integration : Combine best features from multiple implementations
Performance Optimization : Leverage modern processing engines
Professionalization : Add enterprise-grade features and monitoring
Recommendations for Migration
Prioritize Core Files : Start with main.py and primary CLI files
Analyze Dependencies : Map integration points and dependencies
Document Functionality : Create comprehensive feature mapping
Risk Assessment : Identify high-risk migration areas
Architecture Recommendations
Unified CLI : Merge CLI implementations while preserving unique features
Modular Testing : Consolidate test files while maintaining coverage
Feature Integration : Combine best features from multiple implementations
Performance Optimization : Leverage modern processing engines throughout
Quality Assurance Recommendations
Test Consolidation : Eliminate duplicate tests while maintaining coverage
Documentation : Create comprehensive API and user documentation
Performance Testing : Benchmark all migrated functionality
User Acceptance : Validate that user experience is maintained or improved
Inventory Completed: October 23, 2025
Total Analyzed Files: 25 Python files
Next Step: Task 1.1.1.2 - Analyze main.py functionality