Implement S3 Lifecycle Policy for Temporary Audio Cleanup and Error Handling #166

Closed
vadanrod14 opened this issue Feb 17, 2025 · 0 comments · May be fixed by #170

Problem
The current implementation in aws_services.py creates temporary audio files
in S3 but deletes them only after a successful transcription. If transcription
fails or the process is interrupted, these files remain in S3 indefinitely,
which can:

- Lead to unnecessary storage costs
- Create potential security risks from stored audio files
- Violate data retention policies

Current Behavior
- Audio files are uploaded to the audiotranscribetemp bucket
- Files are deleted only after a successful transcription
- Failed transcriptions leave orphaned files
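
The orphaning described above happens because deletion sits only on the success path. A minimal sketch of one way to make cleanup run on both paths is shown here. The function name `transcribe_with_cleanup` and the `transcribe` callable are hypothetical and not taken from aws_services.py; `s3_client` is assumed to be a boto3 S3 client (or anything exposing `delete_object`).

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def transcribe_with_cleanup(
    s3_client: Any,
    bucket: str,
    key: str,
    transcribe: Callable[[str, str], str],
) -> str:
    """Run a transcription and attempt to delete the temp object afterwards.

    Hypothetical helper: `transcribe` stands in for whatever aws_services.py
    actually calls; it receives the bucket and key and returns the transcript.
    """
    try:
        return transcribe(bucket, key)
    finally:
        # Runs whether transcription succeeded or raised.
        try:
            s3_client.delete_object(Bucket=bucket, Key=key)
        except Exception:
            # Deletion failure should not mask the transcription outcome.
            logger.warning(
                "Could not delete s3://%s/%s after transcription",
                bucket,
                key,
                exc_info=True,
            )
```

This does not cover a process that is killed outright (the `finally` block never runs in that case), which is why a bucket-level expiry rule is still useful as a backstop.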

minimalProviderAgentMarket added a commit to minimalProviderAgentMarket/grouplang-secretary-bot that referenced this issue Feb 17, 2025
Implement automatic cleanup of temporary audio files in S3:
- Add 24-hour lifecycle policy for automatic file deletion
- Configure cleanup of incomplete multipart uploads
- Improve error handling and logging around file management
- Create new S3Service class for centralized S3 operations
- Add fallback to lifecycle policy when manual deletion fails

Fixes GroupLang#166
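
For illustration, the lifecycle rule described in the commit message could look roughly like the sketch below. This is not the code from the referenced commit: the helper names and the rule ID are hypothetical, and it assumes the rule should apply to every object in the bucket. `put_bucket_lifecycle_configuration` is the boto3 S3 client call; the client is passed in so the sketch itself does not require AWS credentials.

```python
from typing import Any, Dict

TEMP_AUDIO_BUCKET = "audiotranscribetemp"  # bucket name from the issue


def build_lifecycle_configuration(days: int = 1) -> Dict[str, Any]:
    """Build a lifecycle configuration that expires objects after `days`.

    S3 lifecycle rules are expressed in whole days, so a 24-hour policy
    becomes Days=1.
    """
    return {
        "Rules": [
            {
                "ID": "expire-temp-audio",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: all objects
                "Expiration": {"Days": days},
                # Also clean up multipart uploads that never completed.
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": days
                },
            }
        ]
    }


def apply_lifecycle_policy(s3_client: Any, bucket: str = TEMP_AUDIO_BUCKET) -> None:
    """Apply the rule to the bucket.

    Note: this call replaces any lifecycle configuration already set on
    the bucket.
    """
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )
```

With a real client this would be invoked as `apply_lifecycle_policy(boto3.client("s3"))`. S3 evaluates expiry in whole days and removes objects asynchronously, so a file may persist somewhat longer than 24 hours; manual deletion after transcription remains the faster path, with the rule acting as the fallback.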