Complete history.md with Phase 7
@@ -52,3 +52,10 @@ The project now features a high-performance, Apple Silicon-optimized pipeline th
- **Solution:**
- Artificially truncated input transcription to a maximum of 250 characters.
- Added `max_new_tokens=150` to the translation generation call to ensure the model terminates even if it gets stuck in a loop.
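
The two safeguards above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `truncate_transcription` and `translate` are hypothetical names, and the `tokenizer` and `model` arguments are assumed to be Hugging Face objects such as `MarianTokenizer` and `MarianMTModel`, loaded elsewhere.

```python
MAX_INPUT_CHARS = 250   # character cap on the transcription, from the notes above
MAX_NEW_TOKENS = 150    # generation cap, from the notes above


def truncate_transcription(text: str, limit: int = MAX_INPUT_CHARS) -> str:
    """Hard-cut the transcription to at most `limit` characters."""
    return text[:limit]


def translate(text: str, tokenizer, model) -> str:
    """Translate one string with a bounded generation length.

    `tokenizer` and `model` are assumed to follow the Hugging Face
    interface (callable tokenizer, `model.generate`, `batch_decode`).
    """
    batch = tokenizer(
        [truncate_transcription(text)],
        return_tensors="pt",
        padding=True,
    )
    # max_new_tokens bounds the output, so a repetition loop cannot run forever.
    generated = model.generate(**batch, max_new_tokens=MAX_NEW_TOKENS)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

A plain slice can cut a word in half; cutting at the last whitespace before the limit would be a gentler variant if that matters for translation quality.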
## Phase 7: Multilingual Detection & Bridge Translation
- **Goal:** Accept input in any language, detect the language, and translate it into English and the other target languages.
- **Approach:**
- Switched to `whisper-small-mlx` (multilingual).
- **Hub-and-Spoke Model:** If a non-English language is detected, Whisper's `task="translate"` is used to create an English "bridge" text, which is then fed into the specialized MarianMT models.
- **Outcome:** Full support for multilingual input with centralized translation.
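
The hub-and-spoke routing described in Phase 7 might look something like the sketch below. It is written under assumptions, not taken from the repository: `bridge_translate` is a hypothetical name, `transcribe` stands in for a thin wrapper around the Whisper call (for example `mlx_whisper.transcribe(path, path_or_hf_repo=..., task=task)`, which returns a dict with `"text"` and `"language"` keys), and `translators` maps target-language codes to functions wrapping the English-to-X MarianMT models.

```python
from typing import Callable, Dict

# Hypothetical signatures used only for this sketch.
Transcriber = Callable[[str, str], dict]   # (audio_path, task) -> {"text", "language"}
Translator = Callable[[str], str]          # English text -> target-language text


def bridge_translate(
    audio_path: str,
    transcribe: Transcriber,
    translators: Dict[str, Translator],
) -> dict:
    """Route any input language through an English 'bridge' text."""
    # First pass: plain transcription, which also reports the detected language.
    first = transcribe(audio_path, "transcribe")
    language = first["language"]
    source_text = first["text"].strip()

    if language == "en":
        # English input is already the hub; no bridge step needed.
        english = source_text
    else:
        # Non-English input: ask Whisper to translate into English.
        english = transcribe(audio_path, "translate")["text"].strip()

    # Spokes: every target model only ever sees English input.
    translations = {code: fn(english) for code, fn in translators.items()}
    return {
        "language": language,
        "source_text": source_text,
        "english": english,
        "translations": translations,
    }
```

The appeal of this layout is that N target languages need only N English-to-X models rather than one model per language pair. The trade-off is a second Whisper pass for non-English audio, and any error in the bridge text carries over into every downstream translation.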