A Rust application that synchronizes data from PostgreSQL (with optional S3 storage) to the Recall Network.
The Recall Data Synchronizer bridges centralized data storage with the Recall Network by:
- Reading object metadata from PostgreSQL's `object_index` table
- Fetching object data from S3 (when configured) or directly from PostgreSQL
- Writing data to the Recall Network with preserved structure
- Maintaining synchronization state to resume from interruptions
- Query the PostgreSQL `object_index` table for new/updated objects
- Fetch object data from S3 (if configured) or PostgreSQL
- Write to the Recall Network with key format `[competition_id/][agent_id/]<data_type>/<uuid>`
- Track progress in a local SQLite database (one full cycle is sketched below)
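Putting the cycle together, a hedged sketch in Rust (every name here is an illustrative stand-in, not the crate's actual API; the `anyhow` crate is assumed):

```
use anyhow::Result;

// Illustrative stand-ins for the synchronizer's moving parts.
struct ObjectMeta { recall_key: String }

trait MetadataDb { fn fetch_since(&self, batch: u32) -> Result<Vec<ObjectMeta>>; }
trait PayloadStore { fn fetch(&self, obj: &ObjectMeta) -> Result<Vec<u8>>; }
trait RecallSink { fn put(&self, key: &str, data: Vec<u8>) -> Result<()>; }
trait SyncState { fn mark_synced(&mut self, obj: &ObjectMeta) -> Result<()>; }

fn sync_cycle(
    db: &dyn MetadataDb,       // PostgreSQL object_index queries
    store: &dyn PayloadStore,  // S3 or direct-from-PostgreSQL payloads
    recall: &dyn RecallSink,   // Recall Network writer
    state: &mut dyn SyncState, // local SQLite progress tracking
    batch_size: u32,
) -> Result<()> {
    // 1. Query object_index for a batch of new/updated objects.
    for obj in db.fetch_since(batch_size)? {
        // 2. Fetch the payload from S3 (if configured) or PostgreSQL.
        let data = store.fetch(&obj)?;
        // 3. Write to Recall under [competition_id/][agent_id/]<data_type>/<uuid>.
        recall.put(&obj.recall_key, data)?;
        // 4. Record progress so an interrupted run can resume here.
        state.mark_synced(&obj)?;
    }
    Ok(())
}
```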
- S3 Mode: Object metadata in PostgreSQL, data in S3
- Direct Mode: Both metadata and data in PostgreSQL (mode selection is sketched below)
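For illustration, the two modes might be modeled like this hedged sketch (not the crate's actual types):

```
// Where object payloads live; metadata is always in PostgreSQL.
enum StorageMode {
    // Payloads fetched from S3 via object_key.
    S3 { bucket: String },
    // Payloads read from the data column of object_index.
    Direct,
}

impl StorageMode {
    // The docs above imply S3 mode applies whenever S3 is configured.
    fn from_config(s3_bucket: Option<String>) -> Self {
        match s3_bucket {
            Some(bucket) => StorageMode::S3 { bucket },
            None => StorageMode::Direct,
        }
    }
}
```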
```
# With configuration file
docker run -it --rm \
  -v ./config.toml:/app/config.toml:ro \
  -v ./sync-data:/data \
  textile/recall-synchronizer:latest \
  run --config /app/config.toml
```
```
# Start continuous sync (every 60 seconds)
docker run -it --rm \
  -v ./config.toml:/app/config.toml:ro \
  -v ./sync-data:/data \
  textile/recall-synchronizer:latest \
  start --interval 60
```

```
# Clone and setup
git clone <repository-url>
cd recall-synchronizer
cp config.example.toml config.toml

# Start development environment
docker-compose up -d

# Run once
cargo run -- run

# Run continuously (every 60 seconds)
cargo run -- start --interval 60

# Reset sync state
cargo run -- reset
```

Create a `config.toml` file:
```
[database]
url = "postgresql://user:pass@localhost:5432/db"

[sync_storage]
db_path = "./sync-state.db"

[sync]
batch_size = 100

[recall]
network = "localnet"
private_key = "your-private-key"
config_path = "./networks.toml"

# Optional S3 configuration
[s3]
endpoint = "https://s3.amazonaws.com"
region = "us-east-1"
bucket = "your-bucket"
access_key_id = "your-key"
secret_access_key = "your-secret"
```

Run synchronization once:
```
cargo run -- run [--since <TIMESTAMP>]
```

- `--since`: Sync objects created after this timestamp (RFC3339 format)
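For reference, an RFC3339 timestamp looks like `2024-01-15T00:00:00Z`. A quick sketch of validating one in Rust, assuming the `chrono` crate (shown for illustration; the synchronizer's own parsing may differ):

```
use chrono::{DateTime, FixedOffset};

fn main() {
    // The format --since expects: date, time, and an explicit offset.
    let ts: DateTime<FixedOffset> =
        DateTime::parse_from_rfc3339("2024-01-15T00:00:00Z")
            .expect("valid RFC3339 timestamp");
    println!("syncing objects created after {ts}");
}
```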
Run continuously at the specified interval:

```
cargo run -- start --interval <SECONDS> [--since <TIMESTAMP>]
```

- `--interval`: Seconds between sync runs (required)
- `--since`: Initial timestamp filter
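Conceptually, `start` wraps the one-shot sync in a timer loop. A minimal sketch of that loop, assuming the `tokio` and `anyhow` crates, with a hypothetical `run_sync_once` standing in for the actual sync pass:

```
use std::time::Duration;

// Hypothetical stand-in for one full sync pass.
async fn run_sync_once() -> anyhow::Result<()> {
    Ok(())
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut ticker = tokio::time::interval(Duration::from_secs(60));
    loop {
        ticker.tick().await; // the first tick fires immediately
        if let Err(e) = run_sync_once().await {
            eprintln!("sync run failed: {e}"); // log and keep the loop alive
        }
    }
}
```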
Clear synchronization state:
```
cargo run -- reset
```

The `object_index` table stores object metadata:

- `id`: UUID primary key
- `object_key`: S3 key (required in S3 mode)
- `competition_id`: Optional UUID
- `agent_id`: Optional UUID
- `data_type`: String identifier (e.g., "LOGS", "METRICS")
- `data`: Binary data (required in direct mode)
- `created_at`: Timestamp used for sync ordering
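As a hedged sketch, one row could be modeled in Rust like this (field types inferred from the column descriptions above; the `uuid` and `chrono` crates are assumed, and the crate's internal model may differ):

```
use chrono::{DateTime, Utc};
use uuid::Uuid;

// One object_index row, as described above. Illustrative only.
struct ObjectIndexRow {
    id: Uuid,                   // primary key
    object_key: Option<String>, // S3 key; required only in S3 mode
    competition_id: Option<Uuid>,
    agent_id: Option<Uuid>,
    data_type: String,          // e.g. "LOGS", "METRICS"
    data: Option<Vec<u8>>,      // payload; required only in direct mode
    created_at: DateTime<Utc>,  // drives sync ordering
}
```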
Objects are stored with structured keys:
```
[competition_id/][agent_id/]<data_type>/<uuid>
```
Optional segments are omitted if not present.
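To make the optional-segment rule concrete, here is a hypothetical key builder (illustrative only; the `uuid` crate is assumed):

```
// Builds [competition_id/][agent_id/]<data_type>/<uuid>,
// omitting the optional segments when absent.
fn recall_key(
    competition_id: Option<uuid::Uuid>,
    agent_id: Option<uuid::Uuid>,
    data_type: &str,
    id: uuid::Uuid,
) -> String {
    let mut key = String::new();
    if let Some(c) = competition_id {
        key.push_str(&format!("{c}/"));
    }
    if let Some(a) = agent_id {
        key.push_str(&format!("{a}/"));
    }
    key.push_str(&format!("{data_type}/{id}"));
    key
}
```

With neither optional ID present, an object simply lands under `<data_type>/<uuid>`.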
```
# Fast tests (in-memory only)
make test-fast

# Integration tests (PostgreSQL + SQLite)
make test

# With coverage
make test-coverage
```

Format and lint checks:

```
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
```

Pre-built images are available on Docker Hub:

- `textile/recall-synchronizer:latest`
- `textile/recall-synchronizer:<version>`
See DOCKER.md for detailed Docker usage.
- Single Database Type: Uses PostgreSQL with different schemas for S3/Direct modes
- No Per-Competition Tracking: Simplified to global synchronization progress
- Batch Processing: Configurable batch size for efficient processing
- Resilient State Management: SQLite for tracking sync progress with atomic operations (sketched below)
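As an illustration of the last point, an atomic SQLite checkpoint could look like this sketch (assuming the `rusqlite` crate and a hypothetical single-row `sync_state` table; the real schema may differ):

```
use rusqlite::{params, Connection};

// Records the newest synced timestamp. The UPSERT is a single
// statement, so a crash cannot leave a half-written checkpoint.
fn mark_progress(conn: &Connection, last_synced_at: &str) -> rusqlite::Result<()> {
    conn.execute(
        "INSERT INTO sync_state (id, last_synced_at) VALUES (1, ?1)
         ON CONFLICT(id) DO UPDATE SET last_synced_at = excluded.last_synced_at",
        params![last_synced_at],
    )?;
    Ok(())
}
```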