Complete history.md with Phase 7

This commit is contained in:
Adolfo Reyna
2026-02-26 21:44:02 -05:00
parent ae2b54da9b
commit 72b2d0001f

View File

@@ -52,3 +52,10 @@ The project now features a high-performance, Apple Silicon-optimized pipeline th
- **Solution:**
- Artificially truncated input transcription to a maximum of 250 characters.
- Added `max_new_tokens=150` to the translation generation call to ensure the model terminates even if it gets stuck in a loop.
## Phase 7: Multilingual Detection & Bridge Translation
- **Goal:** Support input in any language, detect it, and translate to English + others.
- **Approach:**
- Switched to `whisper-small-mlx` (multilingual).
- **Hub-and-Spoke Model:** If a non-English language is detected, Whisper's `task="translate"` is used to create an English "bridge" text, which is then fed into the specialized MarianMT models.
- **Outcome:** Full support for multilingual input with centralized translation.