Qwen3 TTS Voice Clone
No training required; a 10–20s clip creates a custom voice. Create the voice first, then synthesize with qwen3-tts-vc-realtime.
Sample output
Cherry
Sunny, upbeat, friendly young woman
Serena
Gentle, warm young woman
Ethan
Standard Mandarin with a slight northern accent; sunny, warm, energetic
Chelsie
Anime-style virtual girlfriend
Momo
Playful, cute, teasing tone
Vivian
Spunky, cute, a little feisty
Model overview
Voice cloning workflow
Provide a short clip, create a custom voice, then synthesize speech.
10–20s clip
Recommended 10–20s, max 60s.
Format & sample rate
WAV/MP3/M4A, ≥24kHz, mono, <10MB.
Clean speech
At least 3s continuous clear reading; no noise or singing.
Create then synthesize
Create the voice, then synthesize with the same target_model that was specified when the voice was created.
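The clip requirements above can be checked locally before uploading. The following is a minimal sketch, not part of the service: it handles WAV files only (via Python's standard `wave` module), the function name `check_wav_clip` is hypothetical, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption. It cannot detect noise or singing, and it treats the 3s minimum as a lower bound on total duration rather than measuring continuous clear speech.

```python
import os
import wave

# Limits taken from the requirements listed above.
MAX_BYTES = 10 * 1024 * 1024      # "<10MB" (assumed to mean 10 MiB)
MIN_RATE_HZ = 24_000              # ">=24kHz"
MAX_SECONDS = 60.0                # "max 60s"
MIN_SECONDS = 3.0                 # "at least 3s" of clear reading
RECOMMENDED_SECONDS = (10.0, 20.0)


def check_wav_clip(path):
    """Return (errors, warnings) for a WAV reference clip.

    Hypothetical helper: errors are hard requirement violations,
    warnings flag clips outside the recommended 10-20s range.
    """
    errors, warnings = [], []

    if os.path.getsize(path) >= MAX_BYTES:
        errors.append("file must be smaller than 10 MB")

    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate

    if channels != 1:
        errors.append(f"audio must be mono, found {channels} channels")
    if rate < MIN_RATE_HZ:
        errors.append(f"sample rate must be at least 24 kHz, found {rate} Hz")

    if seconds > MAX_SECONDS:
        errors.append(f"clip must be at most 60 s, found {seconds:.1f} s")
    elif seconds < MIN_SECONDS:
        errors.append(f"clip must be at least 3 s, found {seconds:.1f} s")
    elif not (RECOMMENDED_SECONDS[0] <= seconds <= RECOMMENDED_SECONDS[1]):
        warnings.append(f"10-20 s is recommended, found {seconds:.1f} s")

    return errors, warnings
```

A call such as `errors, warnings = check_wav_clip("reference.wav")` returns two lists; an empty `errors` list means the file passes the format checks covered here. MP3 and M4A clips would need a third-party decoder to inspect.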
Synthesis examples (preset voices)
Preset-voice synthesis examples (not cloned); actual results depend on your input.
Synthesis example · Cherry
Synthesis example · Dylan
Voice clone FAQ
Key requirements and workflow questions.
