by Guy Michaels
Making Your Voice Heard: A Client’s Guide to Audio Formats for Telephone Menu Systems (IVR)
Here are two versions of the same recording. The first is the original high-quality recording, followed by the reduced-quality format that is common in IVR recordings. Read on to find out why IVR recordings are processed in this way!
When you choose a voiceover artist for your telephone menu system, your IVR (Interactive Voice Response), the goal is always clear, professional communication. But beyond selecting the perfect voice, I know that understanding the audio specifications these systems require is crucial to ensuring your callers hear exactly what you intend. After all, the first impression your phone system makes is vital for customer experience.
As a pro voice actor, I often guide my clients through these specific requirements to ensure their message is delivered perfectly. Let’s demystify IVR audio formats, the fundamentals of IVR recordings and why I believe they’re so important for your contact centre or call centre.
Why Telephone Audio Has Specific Requirements (and why it’s a good thing!)
In an era of high-fidelity digital audio, I know it might seem counterintuitive that telephone menus often sound a bit “lo-fi.” However, this isn’t a limitation; I see it as a deliberate design for efficiency and widespread compatibility. These pre-recorded voice messages are designed for the very specific needs of telephony.
Many IVR systems, particularly those that integrate with traditional telephone networks (PSTN), rely on established standards. These standards prioritise clear speech at minimal bandwidth. Even modern VoIP (Voice over IP) systems benefit from these formats, as they ensure seamless communication across diverse phone infrastructures without unnecessary complexity or processing overhead. The result? A reliable, cost-effective message system that delivers clear messages. As a British voiceover artist, I’ve seen these requirements become standard practice across many international projects. Your IVR voice recordings are truly the foundation of your customer support.
The Standard IVR Audio Specifications You Need to Know
For the vast majority of IVR systems and telephone menus, you’ll typically need audio files with these key characteristics for your voice prompts:
- Sample Rate: 8000 Hz (8 kHz). This is the standard telephony sample rate. It captures frequencies up to 4 kHz, which covers the essential range of human speech while keeping file sizes small. Anything higher would simply be discarded by the phone system.
- Bit Depth: 8-bit. This resolution is perfectly adequate for voice prompts in a telephony environment, contributing to smaller file sizes and efficient streaming.
- Channels: Mono. Phone calls are mono; there’s no benefit to stereo audio for these systems.
- File Format: WAV (.wav) with G.711 Encoding. The WAV container is standard, and inside, the audio is almost always encoded using G.711, a companded form of Pulse Code Modulation (PCM). There are two main variants:
  - G.711 A-Law: This is the standard codec used in Europe and the UK for IVR voice messages.
  - G.711 u-Law (or Mu-Law): This is the standard codec used in North America and Japan for IVR voice messages.
These specifications ensure that your greeting, menu options, and any other voice messages are perfectly suited to your PBX or contact centre IVR setup.
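For the technically curious, here is a rough sketch of the idea behind the u-Law variant: quiet parts of the voice are given more of the available 8-bit range than loud parts. It uses the textbook continuous companding curve with a constant of 255. Real G.711 encoders use a segmented approximation of this curve, so please treat the function names and the output as illustrative only, not as a bit-exact codec.

```python
import math

MU = 255  # companding constant of the textbook u-Law curve

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1.0, 1.0] onto the logarithmic u-Law curve."""
    x = max(-1.0, min(1.0, x))  # clamp out-of-range input
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress: recover the linear sample value."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

if __name__ == "__main__":
    # Quiet samples are boosted far more than loud ones before quantising.
    for level in (0.01, 0.1, 0.5, 1.0):
        print(f"linear {level:>4} -> companded {mu_law_compress(level):.3f}")
```

The practical upshot is that soft consonants and breaths keep more detail than they would with plain 8-bit audio, which is part of why telephone speech stays intelligible at such a small data rate.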
The Importance of Professional Source IVR Recordings
While the final IVR audio file will be in a specific 8-bit, 8 kHz format, I can’t stress enough that the quality of the initial voice recording is paramount. I always ensure my clients’ voice-over artist records and processes the audio at the highest possible fidelity (e.g., 44.1 kHz or 48 kHz sample rate, 16-bit or 24-bit depth).
Why this matters: When you start with a pristine, high-resolution recording, the down-conversion to the required IVR format retains significantly more clarity, presence, and overall quality. A voice actor who records with professional equipment in a treated environment provides a clean foundation. Any noise, distortion, or poor vocal technique present in a lower-quality source recording will carry through the conversion and often become far more apparent in the final compressed IVR file. I truly believe a high-quality source ensures your message is conveyed with maximum impact. This is precisely why, as a voiceover expert, I prioritise the initial recording quality above all else. You want an amazing voice that resonates, not a generic IVR sound.
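To put the master and IVR figures side by side, here is a small back-of-the-envelope calculation of the uncompressed data rates involved. The function names and the 30-second prompt length are my own illustrative choices, and WAV header overhead is ignored.

```python
def bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw audio data rate in bytes per second."""
    return sample_rate_hz * bit_depth * channels // 8

def prompt_size_bytes(seconds: float, sample_rate_hz: int, bit_depth: int) -> int:
    """Approximate size of a mono prompt, excluding the file header."""
    return int(seconds * bytes_per_second(sample_rate_hz, bit_depth))

if __name__ == "__main__":
    master = prompt_size_bytes(30, 48_000, 24)  # studio master
    ivr = prompt_size_bytes(30, 8_000, 8)       # G.711-style IVR file
    print(f"30 s master: {master:,} bytes")
    print(f"30 s IVR:    {ivr:,} bytes")
    print(f"The master holds {master // ivr}x more data")
```

The IVR version works out at 8,000 bytes per second (64 kbit/s), so the conversion throws away a great deal of data. That is why I want the material going into it to be as clean as possible.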
Converting Audio to IVR Formats: Your Options
Once you receive the high-quality recordings from your voice-over artist, you or your audio engineer will need to convert them to the specific IVR format. Many professional audio applications can do this, but for common tasks, dedicated editing software is often best.
Here’s a general two-step process, followed by some software suggestions:
- Start with a High-Quality Mono WAV: Your voice-over artist should provide you with a clean, mono WAV file at a standard sample rate (e.g., 44.1 kHz or 48 kHz) and a higher bit depth (16-bit or 24-bit). This is your ‘master’ for conversion.
- Convert to IVR Standard (8 kHz, 8-bit, G.711 WAV): Use an audio editor to perform the necessary resampling and encoding.
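If you have many prompts to convert, the same two steps can be scripted. The sketch below assumes the free command-line tool ffmpeg is installed and on your PATH. That is my suggestion for batch work, not one of the editors covered in this guide. The function names and file names are hypothetical.

```python
import shutil
import subprocess

# ffmpeg encoder names for the two G.711 variants
CODECS = {"alaw": "pcm_alaw", "ulaw": "pcm_mulaw"}

def build_ffmpeg_command(source: str, target: str, law: str = "alaw") -> list[str]:
    """Build the ffmpeg arguments for an 8 kHz, mono, G.711 WAV."""
    if law not in CODECS:
        raise ValueError(f"law must be one of {sorted(CODECS)}")
    return [
        "ffmpeg", "-y",       # overwrite the target if it already exists
        "-i", source,         # high-quality master WAV
        "-ar", "8000",        # resample to 8 kHz
        "-ac", "1",           # force mono
        "-c:a", CODECS[law],  # A-Law or u-Law encoding
        target,
    ]

def convert(source: str, target: str, law: str = "alaw") -> None:
    """Run the conversion; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(source, target, law), check=True)

if __name__ == "__main__":
    # Hypothetical file names, shown for illustration only.
    print(" ".join(build_ffmpeg_command("welcome_master.wav", "welcome_ivr.wav", "ulaw")))
```

Use "alaw" for UK and European systems and "ulaw" for North America and Japan, matching the variants described earlier.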
Converting IVR recordings using Ocenaudio (Free, User-Friendly):
- Open your high-quality WAV in Ocenaudio.
- Go to Edit > Convert Sample Type...
- In the dialog, set Sample Rate to 8000 Hz and ensure Channels is Mono. Click “OK.”
- Go to File > Export...
- Set Format to WAV (Microsoft).
- For Encoding, select A-Law (for UK/Europe) or u-Law (for North America/Japan).
- Ensure Channels is Mono. Save the file.
Converting IVR recordings using Audacity (Free, Feature-Rich):
- Open your high-quality WAV in Audacity.
- In the bottom-left corner, find the “Project Rate (Hz)” dropdown and set it to 8000.
- Go to File > Export > Export Audio...
- Set Format to WAV (Microsoft).
- For Encoding, select A-Law or u-Law from the available options (e.g., “U-Law, 8-bit PCM” or “A-Law, 8-bit PCM”).
- Ensure Channels is Mono. Save the file.
Converting IVR recordings using Adobe Audition (Professional, Paid):
- Open your high-quality WAV in Audition.
- Go to Edit > Convert Sample Type... (or equivalent, depending on version).
- Set Sample Rate to 8000 Hz.
- Go to File > Export > File...
- In the export dialog, set Format to WAV.
- Under Format Settings / Codec, choose the specific G.711 variant (A-Law or u-Law).
- Ensure Channels is Mono. Save the file.
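Whichever editor you use, it is worth confirming that the exported file really is in the format your phone system expects. The sketch below reads the format information from a WAV file's header, where the standard format codes are 6 for A-Law and 7 for u-Law. The function names are my own, and it assumes a straightforward RIFF/WAVE file.

```python
import struct

# Standard WAVE format codes for the encodings discussed in this guide
FORMAT_TAGS = {1: "Linear PCM", 6: "A-Law", 7: "u-Law"}

def read_wav_format(data: bytes) -> dict:
    """Extract encoding, channels, sample rate and bit depth from WAV bytes."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return {
                "encoding": FORMAT_TAGS.get(tag, f"unknown ({tag})"),
                "channels": channels,
                "sample_rate": rate,
                "bits": bits,
            }
        pos += 8 + size + (size & 1)  # chunks are padded to even lengths
    raise ValueError("no fmt chunk found")

def is_ivr_ready(info: dict) -> bool:
    """True if the file matches the 8 kHz, 8-bit, mono, G.711 spec."""
    return (
        info["encoding"] in ("A-Law", "u-Law")
        and info["sample_rate"] == 8000
        and info["channels"] == 1
        and info["bits"] == 8
    )
```

To check an exported prompt, read its bytes (for example with open("welcome_ivr.wav", "rb"), where the file name is only a placeholder) and pass them to read_wav_format, then to is_ivr_ready.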
By understanding these essential specifications and the importance of starting with high-quality source audio, I’m confident you can ensure your telephone menu system sounds clear and professional and guides your callers effectively. This helps you make the best first impression with your contact centre. If you’re looking for a British voice actor who understands these nuances, or a recording service to record your IVR, please don’t hesitate to get in touch. Avoid the pitfalls of an AI voice; nothing can replace the warmth of a human voice. Let me help you create an IVR system that truly shines, giving callers essential information and guiding them to the correct department for their query, even while they wait in a queue.