Implement S3 Lifecycle Policy for Temporary Audio Cleanup and Error Handling #166

vadanrod14 · 2025-02-17T13:22:15Z

Problem
The current implementation in aws_services.py creates temporary audio files
in S3 but deletes them only after a successful transcription. If transcription
fails or the process is interrupted, these files remain in S3 indefinitely,
which can:

- Lead to unnecessary storage costs
- Create potential security risks with stored audio files
- Violate data retention policies

Current Behavior
- Audio files are uploaded to the audiotranscribetemp bucket
- Files are only deleted after successful transcription
- Failed transcriptions leave orphaned files
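
The gap described above can be illustrated with a minimal sketch. It is not the code in aws_services.py: the function name `transcribe_with_cleanup` and the `transcribe` callable are hypothetical, and `s3_client` is assumed to be a boto3 S3 client (or anything exposing `delete_object`). The point is that deletion is attempted in a `finally` block, so it runs whether transcription succeeds or raises.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def transcribe_with_cleanup(
    s3_client: Any,
    bucket: str,
    key: str,
    transcribe: Callable[[str, str], str],  # hypothetical: runs the transcription job
) -> str:
    """Run transcribe(bucket, key) and always attempt to delete the temp object."""
    try:
        return transcribe(bucket, key)
    finally:
        # Runs on success and on failure, so a failed transcription
        # no longer leaves the uploaded audio behind.
        try:
            s3_client.delete_object(Bucket=bucket, Key=key)
        except Exception:
            # A failed delete is logged, not raised, so it cannot mask the
            # transcription result or the original transcription error.
            logger.warning(
                "Could not delete s3://%s/%s; relying on bucket expiration if configured",
                bucket,
                key,
                exc_info=True,
            )
```

A process that is killed outright never reaches the `finally` block, which is why in-process cleanup alone does not fully solve the problem.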

Implement automatic cleanup of temporary audio files in S3:
- Add 24-hour lifecycle policy for automatic file deletion
- Configure cleanup of incomplete multipart uploads
- Improve error handling and logging around file management
- Create new S3Service class for centralized S3 operations
- Add fallback to lifecycle policy when manual deletion fails

Fixes GroupLang#166
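
As a rough sketch of the first two items, the lifecycle rule could be built and applied with boto3's `put_bucket_lifecycle_configuration`. The function names, the rule ID, and the choice to apply the rule to the whole bucket are assumptions for illustration, not the code from the linked pull request. S3 expresses expiration in whole days, so the 24-hour policy maps to `Days: 1`; S3 rounds expiry to the next midnight UTC and removes objects asynchronously, so actual deletion can lag somewhat beyond 24 hours.

```python
from typing import Any, Dict, Optional


def build_lifecycle_configuration(
    expire_days: int = 1, abort_multipart_days: int = 1
) -> Dict[str, Any]:
    """Build a rule that expires objects and aborts stale multipart uploads."""
    return {
        "Rules": [
            {
                "ID": "expire-temp-audio",  # hypothetical rule name
                "Status": "Enabled",
                # Empty prefix: the rule applies to every object in the bucket.
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": expire_days},
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_multipart_days
                },
            }
        ]
    }


def apply_lifecycle_policy(bucket: str, s3_client: Optional[Any] = None) -> None:
    """Attach the lifecycle configuration to the given bucket."""
    if s3_client is None:
        import boto3  # imported lazily so the builder works without boto3

        s3_client = boto3.client("s3")
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )
```

Calling `apply_lifecycle_policy("audiotranscribetemp")` once at deploy time would be enough. Note that this call replaces any lifecycle configuration already on the bucket, so existing rules would need to be merged into the `Rules` list first.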

minimalProviderAgentMarket mentioned this issue Feb 17, 2025

Implement S3 Lifecycle Policy for Temporary Audio Cleanup and Error Handling (Fixes #166) #170

Open

vadanrod14 closed this as completed Feb 18, 2025
