Qwen3 TTS Voice Clone
No training required; a 10–20s clip creates a custom voice. Create the voice first, then synthesize with qwen3-tts-vc-realtime.
Voice clone
Sample outputs
Cherry
Sunny, upbeat, friendly young woman
Serena
Gentle, warm young woman
Ethan
Standard Mandarin with a slight northern accent; sunny, warm, energetic
Chelsie
Anime-style virtual girlfriend
Momo
Playful, cute, teasing tone
Vivian
Spunky, cute, a little feisty
Model overview
Voice cloning workflow
Provide a short clip, create a custom voice, then synthesize speech.
10–20s clip
Recommended 10–20s, max 60s.
Format & sample rate
WAV/MP3/M4A, ≥24kHz, mono, <10MB.
Clean speech
At least 3s continuous clear reading; no noise or singing.
Create then synthesize
Create the voice, then synthesize with the same target_model that was used when the voice was created.
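The clip requirements above can be checked locally before uploading. The following is a minimal sketch, not part of the service: the function name check_wav_clip and its constants are hypothetical, it uses only Python's standard library, and it handles WAV files only (MP3 and M4A would need a third-party decoder). It treats the 60s maximum, the 24kHz minimum, mono, the 10MB limit, and the 3s speech minimum as hard errors, and the 10–20s recommendation as a warning. It cannot judge whether the speech is clean.

```python
import os
import wave

# Limits taken from the requirements listed above.
MAX_BYTES = 10 * 1024 * 1024      # file must be smaller than 10 MB
MIN_RATE_HZ = 24_000              # sample rate must be at least 24 kHz
MAX_SECONDS = 60.0                # hard upper bound on clip length
MIN_SPEECH_SECONDS = 3.0          # a shorter clip cannot hold 3 s of speech
RECOMMENDED_SECONDS = (10.0, 20.0)


def check_wav_clip(path):
    """Return (errors, warnings) for a WAV clip; both empty means it passes.

    Hypothetical helper: it checks size, channel count, sample rate, and
    duration only. Noise, singing, and speech continuity are not checked.
    """
    errors, warnings = [], []

    if os.path.getsize(path) >= MAX_BYTES:
        errors.append("file must be smaller than 10 MB")

    with wave.open(path, "rb") as clip:
        channels = clip.getnchannels()
        rate = clip.getframerate()
        seconds = clip.getnframes() / rate

    if channels != 1:
        errors.append("audio must be mono")
    if rate < MIN_RATE_HZ:
        errors.append("sample rate must be at least 24 kHz")

    low, high = RECOMMENDED_SECONDS
    if seconds > MAX_SECONDS:
        errors.append("clip must be at most 60 s long")
    elif seconds < MIN_SPEECH_SECONDS:
        errors.append("clip is too short to contain 3 s of speech")
    elif not low <= seconds <= high:
        warnings.append("clip length is outside the recommended 10-20 s")

    return errors, warnings
```

Calling check_wav_clip("my_voice.wav") on a hypothetical local file returns two lists; an empty errors list means the clip meets the measurable requirements, while anything in warnings is advisory.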
Synthesis examples (preset voices)
These samples use preset voices, not cloned ones; actual results depend on your input.
Synthesis example · Cherry
Synthesis example · Dylan
Voice clone FAQ
Key requirements and workflow questions.
Keep exploring
Want to try image/video generation?
Same interaction style and parameter design, with more models coming.
