LCA-25
Created: 2025-12-30 • Updated: 2025-12-30
Comments (2)
agent · 2025-12-30
## Implementation Complete
### Root Cause
The transcription endpoint called `local_stt.transcribe(audio_content)` without passing `chunk_duration`, causing the entire audio file to be loaded into memory at once.
### Solution
Added automatic chunking for large audio files:
- Files >10MB are automatically processed in 10-minute chunks with 15-second overlap
- Configurable via environment variables:
  - `LIBRECHAT_AUDIO_CHUNK_DURATION` (default: 600 seconds)
  - `LIBRECHAT_AUDIO_CHUNK_SIZE_THRESHOLD` (default: 10MB)
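
For illustration, the decision and the overlap arithmetic described above could be sketched roughly as follows. This is not the code in `routes.py`: the helper names `chunk_duration_for` and `chunk_windows` are hypothetical, the threshold is assumed to be expressed in bytes with 10MB read as 10 × 1024 × 1024, and the way parakeet-mlx actually applies the 15-second overlap internally may differ.

```python
import os
from typing import List, Mapping, Optional, Tuple

# Defaults taken from the comment above. The units (seconds, bytes) and the
# binary interpretation of "10MB" are assumptions made for this sketch.
DEFAULT_CHUNK_DURATION = 600.0
DEFAULT_SIZE_THRESHOLD = 10 * 1024 * 1024
OVERLAP_SECONDS = 15.0


def chunk_duration_for(
    size_bytes: int, env: Mapping[str, str] = os.environ
) -> Optional[float]:
    """Return a chunk duration for files above the threshold, else None.

    Hypothetical helper: None means "transcribe in a single pass".
    """
    threshold = int(env.get("LIBRECHAT_AUDIO_CHUNK_SIZE_THRESHOLD", DEFAULT_SIZE_THRESHOLD))
    duration = float(env.get("LIBRECHAT_AUDIO_CHUNK_DURATION", DEFAULT_CHUNK_DURATION))
    return duration if size_bytes > threshold else None


def chunk_windows(
    total_seconds: float,
    chunk_duration: float = DEFAULT_CHUNK_DURATION,
    overlap: float = OVERLAP_SECONDS,
) -> List[Tuple[float, float]]:
    """List (start, end) windows in which each chunk overlaps the previous one."""
    if overlap >= chunk_duration:
        raise ValueError("overlap must be shorter than chunk_duration")
    step = chunk_duration - overlap  # each new chunk starts `overlap` seconds early
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_duration, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return windows
```

Under these assumptions, a 25-minute recording would be split into three windows, with each window starting 15 seconds before the previous one ends.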
### Changes
- `routes.py`: Added chunk detection logic and configuration constants
- `test_transcriptions.py`: Added TestTranscriptionChunking class with tests for both small and large files
### Testing
- All 36 STT tests pass
- Chunking logic verified through mock tests
- Leverages existing parakeet-mlx chunked processing support
Note: The change was not tested against the actual long audio files (gateway paths) because remote file access was not available.
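
As a rough idea of what the mock-based checks might look like, the sketch below replaces the STT backend with a `unittest.mock.Mock` and asserts on how `transcribe` is called. The function `transcribe_upload` is a hypothetical stand-in for the endpoint logic, not the actual code in `routes.py`, and the tests are not copied from `TestTranscriptionChunking`; only the `transcribe(audio_content)` call and the `chunk_duration` argument come from the description above.

```python
from unittest.mock import Mock


def transcribe_upload(
    stt,
    audio_content: bytes,
    threshold: int = 10 * 1024 * 1024,
    chunk_duration: float = 600.0,
):
    """Hypothetical stand-in for the endpoint's call into the STT backend."""
    if len(audio_content) > threshold:
        # Large upload: ask the backend to process the audio in chunks.
        return stt.transcribe(audio_content, chunk_duration=chunk_duration)
    # Small upload: single-pass transcription, as before the fix.
    return stt.transcribe(audio_content)


def test_small_file_is_not_chunked():
    stt = Mock()
    transcribe_upload(stt, b"x" * 10, threshold=100)
    stt.transcribe.assert_called_once_with(b"x" * 10)


def test_large_file_passes_chunk_duration():
    stt = Mock()
    transcribe_upload(stt, b"x" * 101, threshold=100)
    stt.transcribe.assert_called_once_with(b"x" * 101, chunk_duration=600.0)
```

Tests of this kind confirm only that the chunking path is selected; they say nothing about memory use or transcription quality on real long recordings.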
agent · 2025-12-30
Try this one for an even longer file: /home/gateway/YouTube/videos/JKrLyZngHIY/audio.transcription.mp3