feat: add Voicebox as open-source ElevenLabs alternative

Free, local-first voice synthesis studio (MIT license) powered by Qwen3-TTS. Voice cloning from short samples, local REST API, no per-character costs. 4-5x faster on Apple Silicon via MLX.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:23:33 -08:00
commit d4e6ef796c
parent f2a755f750
1 changed file with 43 additions and 0 deletions
--- a/skills/ad-creative/references/generative-tools.md
+++ b/skills/ad-creative/references/generative-tools.md
@@ -381,6 +381,46 @@ Ultra-low latency voice generation built for real-time applications.

 ---

+### Voicebox (Open Source)
+
+Free, local-first voice synthesis studio powered by Qwen3-TTS. The open-source alternative to ElevenLabs.
+
+**Best for:** Free voice cloning, local/private generation, zero-cost batch production
+**API:** Local REST API at `http://localhost:8000`
+**Pricing:** Free (MIT license). Runs entirely on your machine.
+**Stack:** Tauri (Rust) + React + FastAPI (Python)
+
+**Capabilities:**
+- Voice cloning from short audio samples via Qwen3-TTS
+- Multi-language support (English, Chinese, more planned)
+- Multi-track timeline editor for composing conversations
+- 4-5x faster inference on Apple Silicon via MLX Metal acceleration
+- Local REST API for programmatic generation
+- No cloud dependency — all processing on-device
+
+**Ad creative use cases:**
+- Free voice cloning for a brand spokesperson across all ad variations
+- Batch generate voiceovers without per-character costs
+- Private/local generation when ad content is sensitive or pre-launch
+- Prototype voice variations before committing to a paid service
+
+**API example:**
+```bash
+curl -X POST http://localhost:8000/generate \
+  -H "Content-Type: application/json" \
+  -d '{"text": "Stop wasting hours on manual reporting.", "profile_id": "abc123", "language": "en"}'
+```
+
+**Install:** Desktop apps for macOS and Windows at [voicebox.sh](https://voicebox.sh), or build from source:
+```bash
+git clone https://github.com/jamiepine/voicebox.git
+cd voicebox && make setup && make dev
+```
+
+**Docs:** [GitHub](https://github.com/jamiepine/voicebox)
+
+---
+
 ### Other Voice Tools

 | Tool | Best For | Differentiator | API |
@@ -405,6 +445,7 @@ Ultra-low latency voice generation built for real-time applications.
 | **PlayHT** | Very good | Yes | 140+ | <300ms | ~$0.10-0.20 |
 | **Fish Audio** | Good | Yes | 13+ | ~200ms | ~$0.05-0.10 |
 | **WellSaid** | Very good | No (actor voices) | English | ~300ms | Custom pricing |
+| **Voicebox** | Good | Yes (local) | 2+ | Local | Free (open source) |

 ### Choosing a Voice Tool

@@ -417,6 +458,8 @@ Need voiceover for ads?
 ├── Need multilingual (same ad, many languages)?
 │   ├── Most languages → PlayHT (140+)
 │   └── Best quality → ElevenLabs (29+)
+├── Need free / open source / local?
+│   └── Voicebox (MIT, runs on your machine)
 ├── Need cheap, fast, good-enough?
 │   └── OpenAI TTS ($0.015/min)
 ├── Need commercially-safe licensing?