Complete project history with Phase 6

This commit is contained in:
Adolfo Reyna
2026-02-26 21:26:34 -05:00
parent 37cb45c12f
commit e3bda17c7d

View File

@@ -46,3 +46,9 @@ The project now features a high-performance, Apple Silicon-optimized pipeline th
- Refactored the script to support a dictionary of multiple `MarianMT` models.
- Each transcribed English segment is passed through each loaded translation engine sequentially.
- **Performance on M2:** Loading 3-4 specialized models + Whisper is highly efficient, using ~1.5GB of RAM and providing near-instant results.
## Phase 6: Memory & Generation Safety
- **Issue:** Occasionally, long inputs or model glitches caused "runaway" translation generation, which could consume excessive memory.
- **Solution:**
- Artificially truncated input transcription to a maximum of 250 characters.
- Added `max_new_tokens=150` to the translation generation call to ensure the model terminates even if it gets stuck in a loop.