What Text-to-Speech is in VICIdial
Text-to-Speech turns typed text into spoken audio in VICIdial, but it depends on Cepstral being installed and a system setting being enabled.
Text-to-Speech (TTS) is the feature that takes typed text and turns it into spoken audio a caller can hear. Instead of recording every prompt yourself, you type the words and the system generates the audio file. It is handy when wording changes often or when you want a message to read live data aloud. This post explains what TTS is in VICIdial and what has to be in place before it works.
What a TTS entry is
A TTS entry is a named record that holds the text you want spoken and the settings for how it is spoken. You manage entries from the Text To Speech section of the admin. The list there shows each entry's TTS ID, name, active status, and the beginning of its prompt text, with a link to modify each one. A TTS entry on its own is just a definition; it produces audio when something in your call flow references it.
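To make the "named record" idea concrete, here is a minimal sketch of an entry as a small data structure. The class name, field names, and the 30-character preview length are assumptions chosen for illustration; they mirror the columns the admin list shows, not VICIdial's actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class TtsEntry:
    """Hypothetical model of one TTS entry; field names are illustrative."""
    tts_id: str
    name: str
    active: bool
    text: str

    def list_row(self, preview_chars: int = 30) -> str:
        """Build a list row: ID, name, active status, start of the prompt text."""
        preview = self.text[:preview_chars]
        if len(self.text) > preview_chars:
            preview += "..."
        status = "Y" if self.active else "N"
        return f"{self.tts_id} | {self.name} | {status} | {preview}"


# Example entry; the ID and wording are made up for this sketch.
entry = TtsEntry(
    "HOLIDAY",
    "Holiday hours notice",
    True,
    "Our office is closed today for the holiday.",
)
print(entry.list_row())
```

Note that nothing here produces sound. Like a real entry, the record is only a definition until a step in the call flow references it.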
What TTS depends on
TTS is not built into a plain VICIdial install. Two things must be true before any entry produces sound.
- Cepstral must be installed and configured on the system by an administrator. Cepstral is the engine that actually synthesizes the speech.
- The System Settings option for TTS must be enabled. Without that flag the entries exist but never generate audio.
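The two prerequisites above can be expressed as a small preflight check. This is a sketch under stated assumptions, not a VICIdial tool: it assumes Cepstral's command-line binary is named `swift` and is on the PATH, and it leaves reading the System Settings flag to you, since how you query it depends on your install. The function name is hypothetical.

```python
import shutil
from typing import List, Optional


def tts_prerequisite_problems(engine_path: Optional[str],
                              setting_enabled: bool) -> List[str]:
    """Return a list of reasons TTS cannot produce audio; empty means ready."""
    problems = []
    if not engine_path:
        problems.append("Cepstral engine not found on this system")
    if not setting_enabled:
        problems.append("TTS is not enabled in System Settings")
    return problems


# Assumption: the Cepstral CLI is called "swift"; which() returns None if absent.
engine = shutil.which("swift")
# Assumption: you looked up the System Settings flag yourself; hard-coded here.
enabled = False

for problem in tts_prerequisite_problems(engine, enabled):
    print("BLOCKED:", problem)
```

Both conditions are checked independently, so a system that is missing the engine and the setting reports both problems in one pass.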
How a TTS prompt gets rendered
When a call flow reaches a TTS entry, the text is sent to Cepstral, which creates an audio file, and that file is played to the caller. This is why TTS can sit anywhere audio plays: an IVR (interactive voice response), a Call Menu, or any dialplan step that points at the entry.
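The text-to-file step described above can be sketched in a few lines. This is an illustration, not VICIdial's internal code: it assumes Cepstral's `swift` command accepts `-n` for the voice and `-o` for the output file, and the function names, file naming scheme, and voice name are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def build_render_command(text: str, voice: str, out_dir: str,
                         tts_id: str) -> Tuple[List[str], Path]:
    """Build the engine command line and the audio path it would write."""
    out_path = Path(out_dir) / f"tts_{tts_id}.wav"
    # Assumption: Cepstral CLI is "swift"; -n selects the voice, -o the output.
    cmd = ["swift", "-n", voice, "-o", str(out_path), text]
    return cmd, out_path


def render(text: str, voice: str, out_dir: str, tts_id: str) -> Path:
    """Run the engine and return the audio file a call-flow step would play."""
    cmd, out_path = build_render_command(text, voice, out_dir, tts_id)
    subprocess.run(cmd, check=True)  # raises if the engine is missing or fails
    return out_path


# Building the command does not require Cepstral to be installed.
cmd, path = build_render_command("We are closed today.", "Allison", "/tmp", "HOLIDAY")
print(cmd)
```

Splitting the command builder from the function that runs it keeps the sketch inspectable on a machine without Cepstral, while `render` shows where the audio file actually gets created.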
sequenceDiagram
participant C as Call flow
participant V as VICIdial
participant E as Cepstral
participant K as Caller
C->>V: Reach TTS entry
V->>E: Send TTS text
E->>V: Return audio file
V->>K: Play audio to caller
When to use it
Reach for TTS when the wording changes often, when you need to read a lead's details back, or when recording a human voice for every variant is not worth it. For fixed messages a recorded prompt usually sounds better, so many teams mix the two. A holiday-hours notice or a name read back to the caller is a natural fit for TTS, while your main brand greeting is often worth recording properly once.
It is also worth knowing what TTS is not. It is not a replacement for the audio store, and it does not change how calls are routed; it simply produces an audio file that a step in your call flow plays. Treat it as one more source of audio sitting next to your recorded prompts rather than a separate subsystem.
If you are choosing between recorded audio and synthesized speech, the audio store overview covers how recorded prompts are stored.
Once Cepstral is in place you can create your first entry; the next step is covered in adding a TTS entry. For the wider picture of prompts, music on hold, and synthesized audio, see the audio and TTS guide. If you would rather not manage Cepstral licensing yourself, VICIfast ships a ready dialer in under 40 seconds; see the plans for details.
About VICIfast LLC
VICIfast LLC operates a managed VICIdial hosting + BYOI service for outbound and inbound call centers. We run the dialers, the carriers, the recordings pipeline, and the compliance plumbing so operators don’t have to.
Citing this article
VICIfast Engineering. “What Text-to-Speech is in VICIdial”. VICIfast LLC, June 27, 2026. Retrieved from https://vicifast.com/blog/what-is-vicidial-text-to-speech