It is common knowledge, or at least it should be, that placing a microphone as close as possible to the person speaking will provide the best sound quality. A few other important factors that come into play are room acoustics, microphone type, and digital signal processing. Mic placement is certainly top priority and always has been. Yet, there is a growing trend for placing microphones on the ceiling…far, far away.
There is another audio component to consider when designing a conferencing space, and it is the deal breaker for ceiling microphones. How is the audio being encoded? Audio codecs play an extremely important role in the overall sound quality of a conferencing room, and the type of audio codec used will be the difference between subpar sound and a great experience.
What the heck is an audio codec? Basically, without getting into the nuts and bolts, it takes the sound waves that are picked up by the microphones, encodes them, sends them along to the far end, and decodes them for playback. Of course, it’s a little more complicated than that, but for the sake of simplicity we will stop there. There are several different kinds of audio codecs, used for several different applications. We will focus on a few common codecs that apply to 99% of all conference rooms that involve audio conferencing.
Generally, humans with good ears can hear frequencies between 20 Hz and 20 kHz. 20 Hz is the deep bass that makes the hair on your arms vibrate, whereas 20 kHz is the high-pitched piercing sound. The part of that range that matters most for speech and intelligibility falls roughly between 300 Hz and 3 kHz. This is the frequency range to which human hearing is most sensitive, and it is approximately the range used by the standard telephony narrowband codec, G.711.
The G.711 audio codec has been the standard codec for telephony since 1972. The G.711 codec passes frequencies from 300 Hz up to 3.4 kHz and uses a sample rate of 8 kHz. Sampling refers to how often the original signal is measured at a constant time interval. An 8 kHz sample rate takes 8,000 slices of the waveform every second. These samples are reconstructed on the far end and give the listener a representation of the original audio wave. 8,000 samples per second might sound like a lot, but it’s not. For comparison, CD quality audio is sampled at 44.1 kHz, and DVD audio can be sampled at 96 kHz. With its narrow bandwidth and low sample rate, the G.711 codec provides “toll quality” audio. Okay for a phone handset, not okay for a ceiling microphone.
Wideband audio codecs provide much improved audio quality, and G.722 is the standard wideband audio codec. It provides a larger frequency range, hence “wideband”, and a higher sample rate: 50 Hz through 7 kHz and 16 kHz, respectively. The larger frequency response provides a more natural sound, and the 16 kHz sample rate delivers a truer representation of the original sound wave.
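To put the two sample rates side by side, here is a minimal Python sketch. It relies on one fact not spelled out above: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit), which is why each codec's passband sits a little below that ceiling. The function names are made up for illustration.

```python
# Sample rates quoted in the article, in hertz.
SAMPLE_RATES_HZ = {
    "G.711 (narrowband)": 8_000,
    "G.722 (wideband)": 16_000,
}


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def sample_spacing_us(sample_rate_hz):
    """Time between consecutive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz


for name, rate in SAMPLE_RATES_HZ.items():
    print(
        f"{name}: {rate} samples/s, one sample every "
        f"{sample_spacing_us(rate):.1f} us, ceiling of "
        f"{nyquist_limit_hz(rate):.0f} Hz"
    )
```

At 8 kHz the ceiling is 4 kHz, just above G.711's 3.4 kHz upper edge; at 16 kHz it is 8 kHz, just above G.722's 7 kHz.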
Wideband audio codecs are not a new technology. They have been used in video conferencing forever, going back to ISDN lines. Wideband audio is used for Skype, Zoom, BlueJeans, Apple FaceTime, and most other soft conferencing applications. VoIP phone systems offer wideband audio. Over the past few years, wideband audio has even been implemented by cell phone carriers. All the big players now offer wideband audio. The difference in sound quality is noticeable right away, from the first syllable.
Multi-purpose rooms with movable tables are all the rage. Too often, integrators sell the idea to the customer that they can make this multipurpose room a usable audio conferencing space. The integrator will hang some ceiling microphone arrays, throw in a nice expensive DSP, and promise the customer the room will be awesome. The room is never awesome. Complaints start to roll in. The far-end users can’t hear. It sounds like there’s an echo. It sounds tinny. It sounds hollow. It sounds like the call is going through a soup can with a string attached to it. The audio programmer is called back to site to make some tweaks. There are no tweaks. It can’t be EQ’d. It’s not the gain structure, and it’s not a level adjustment. It’s the codec, and it can’t be fixed. The programmer sits in the room, waiting for divine intervention. In the end, no one is happy.
The not-so-funny thing is the same room probably sounds pretty darn good on a video call. Wideband audio is much more forgiving than narrowband audio. Even though the meat of the human voice falls into the narrowband frequency range, there are still frequencies and harmonics that are outside the 300 Hz – 3.4 kHz range. These frequencies are really important for overall intelligibility. Frequencies below 300 Hz provide the fullness and depth, and the frequencies above 3.4 kHz are responsible for clarity and brightness. Ever notice how difficult it is to spell words or names over the phone? “F” like Frank, “S” like Sam. The letter “s” and the letter “f” sound nearly identical on a phone call. The hissing, or sibilance, used to make the “ess” sound falls above the 3.4 kHz upper limit of narrowband audio. That frequency is cut off, erased from existence.
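The effect of the two passbands can be sketched in a few lines of Python. The band edges come from the figures quoted in this article; the three example voice components and their frequencies are rough assumptions chosen only for illustration, not measurements. Real codecs also roll off gradually rather than cutting like a brick wall, so treat this as a simplification.

```python
# Passbands quoted in the article, in hertz (low edge, high edge).
PASSBANDS_HZ = {
    "narrowband": (300, 3_400),
    "wideband": (50, 7_000),
}

# ASSUMPTION: rough, illustrative frequencies for parts of a voice.
EXAMPLE_COMPONENTS_HZ = {
    "chest fullness": 150,
    "vowel energy": 1_000,
    "sibilance of 's'": 5_000,
}


def passes(freq_hz, band):
    """True if the frequency lies inside the named passband."""
    low, high = PASSBANDS_HZ[band]
    return low <= freq_hz <= high


def surviving_components(band):
    """Names of the example components that make it through the band."""
    return [
        name
        for name, freq in EXAMPLE_COMPONENTS_HZ.items()
        if passes(freq, band)
    ]


for band in PASSBANDS_HZ:
    print(band, "keeps:", surviving_components(band))
```

Under these assumed values, narrowband keeps only the vowel energy, while wideband keeps all three components.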
Narrowband audio was designed to provide adequate sound using a phone handset. The handset microphone is inches away from the person’s mouth. Narrowband audio is not adequate for ceiling microphones that are 10 feet from the person speaking. Ceiling microphones are just too far away, plain and simple. By the time the sound wave reaches the mic, the signal is already degraded due to the inverse square law. The mic also begins to pick up more reflections as opposed to direct sound. More reflections, degraded signal, low sample rate, narrow bandwidth: a recipe for disaster. Beamforming mic arrays, voice tracking microphones, and steerable lobes may help, marginally. The sound still won’t be great. Good luck explaining to the customer why the $3,000 ceiling mic sounds like a bad speakerphone.
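The inverse square law can be put into numbers with a short Python sketch. It assumes an idealized point source in a free field, where direct-sound level falls by 20·log10 of the distance ratio (about 6 dB per doubling of distance). The 2-inch handset distance is an assumption standing in for "inches away"; the 10-foot ceiling distance is from the paragraph above.

```python
import math


def level_drop_db(near_distance, far_distance):
    """Drop in direct-sound level (dB) when the mic moves from the near
    distance to the far distance. Both distances use the same unit."""
    return 20 * math.log10(far_distance / near_distance)


HANDSET_INCHES = 2        # ASSUMPTION: typical handset mic distance
CEILING_INCHES = 10 * 12  # 10 feet, as in the article

drop = level_drop_db(HANDSET_INCHES, CEILING_INCHES)
print(f"Direct sound is about {drop:.1f} dB weaker at the ceiling mic")
```

With these numbers the direct sound arrives roughly 35.6 dB weaker at the ceiling, while the room reflections do not fall off in the same way, so the mic hears proportionally more room and less voice.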
Even though wideband audio is far superior, there are requirements that need to be met in order to experience true HD audio. All devices in the chain must be compatible. If there is a VoIP system involved, it will need to be configured to use G.722 audio. The far-end user must also have G.722 enabled on their VoIP system. If a calling bridge is involved, it too must be wideband audio compliant. If there is an analog phone line anywhere in the system, forget about wideband audio. POTS lines do not support it. If a call hits the public switched telephone network (PSTN), sorry, no luck here either. Experiencing HD audio on a cell phone requires a compatible device on both ends as well, and at least 3G service or Wi-Fi. If any device involved in the call does not meet these requirements, the devices will negotiate down to the lowest common codec, narrowband G.711.
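The "negotiate down" behavior can be modeled with a toy Python sketch. This is not how any real VoIP stack is implemented; it is a simplified model that assumes each device in the chain advertises a set of supported codecs and the call uses the best codec that every device shares. The `negotiate` function and the device lists are hypothetical.

```python
# Codecs from the article, best first.
PREFERENCE = ["G.722", "G.711"]


def negotiate(devices):
    """Return the most preferred codec supported by every device in the
    call path, or None if the devices share no codec at all."""
    common = set(PREFERENCE)
    for supported in devices:
        common &= set(supported)
    for codec in PREFERENCE:
        if codec in common:
            return codec
    return None


# Hypothetical call paths: each entry is one device's supported codecs.
all_wideband = [{"G.722", "G.711"}, {"G.722", "G.711"}, {"G.722", "G.711"}]
one_pstn_leg = [{"G.722", "G.711"}, {"G.711"}, {"G.722", "G.711"}]

print(negotiate(all_wideband))  # every device supports G.722
print(negotiate(one_pstn_leg))  # one narrowband-only leg drags the call down
```

The second case is the one that bites: a single narrowband-only leg is enough to put the whole call on G.711.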
It is best practice to stay away from ceiling microphones for audio conferencing, regardless of room size or type. It’s impossible to guarantee 100% wideband audio compatibility across every device in a call. It only takes one cell phone call into the room to “dumb” everything down to narrowband audio. Follow these guidelines for a better conferencing experience, a satisfied customer, and most importantly, a much happier audio programmer.
Anthony Ferraro is a Project Manager at Synergy Media Group.