The direction of the headphone market used to be dictated by engineers with acoustical design expertise. But with Bluetooth headphones and earphones having taken over the mass market -- and all of them having at least a Bluetooth receiver and an internal amplifier, and likely a digital signal processor, too -- the chips inside the headphones now have at least as big an influence on the products as acoustical or even industrial design does. And the leader in those chips is Qualcomm -- partly because it owns the aptX audio codec, which audio enthusiasts consider de rigueur for any high-quality Bluetooth audio product.
To find out what’s on the horizon for headphones and earphones, and what today’s toughest design challenges are, I talked with Qualcomm’s Chris Havell. His title is senior director of audio product marketing, but he says he’s generally known in the company as the “head of Bluetooth audio.”
This interview’s much longer than what I normally place in this column, but Havell had so many interesting things to say that I decided to let it run long.
Brent Butterworth: What are the trends you see coming in headphones and earphones?
Chris Havell: Certainly a majority of consumers seem to be purchasing true wireless [earphones]. There’s still a place for headphones, because sometimes you want to be immersed in very good sound, and they can be more comfortable to wear. Now we’re trying to get a lot of headphone features into true wireless, and trying to make true wireless hit the level of performance you have with a wired connection.
A key thing there is ANC [active noise canceling], and how you deliver really good ANC with earphones, where it’s much more of a challenge. That’s because headphones can have quite a good seal around your ear, but earphones are trying to fit in your ear canal and we all have different-sized ear canals. We’re looking into adaptive ANC so the ANC can optimize itself for however well the earphone fits. ANC is especially important for earphones, because when you’re wearing something very close to the eardrum, at high SPL you can damage your hearing. So we’re trying to make it so you can hear things without having the volume up.
BB: We’ve seen a lot of concern about latency and lip sync now that people are watching movies and YouTube on their phones using Bluetooth headphones. I know Qualcomm is addressing that in some of the latest variants of aptX, but it’s a bigger problem with true wireless because you have to add in the extra time it takes for the earpieces to “talk to each other.” Are you working on that?
CH: With the growth of 5G, we’re going to see a lot more online gaming and video, and getting the latency down to an imperceptible level is quite a challenge with true wireless. A lot of it has to do with the robustness of the wireless connection. The Bluetooth connection can pick up interference from other RF sources, like Wi-Fi. If your Wi-Fi is running on the same 2.4GHz band as Bluetooth, you inevitably have contention on that link, which can corrupt the audio if we don’t take special measures to protect it. Now you’re using your tablet on that same Wi-Fi network to stream video. To allow for that, and for unbroken audio, you have to buffer the audio and have a time-to-play stamp on it. What’s typically been done is to use a large buffer for the audio so you can allow for data retries if the packet gets corrupted in transmission, but a large buffer can mean that you end up with a lot of latency.
For video watching, you can do more on the handset [phone] to time the video so it syncs with the audio. And if you can improve the Bluetooth wireless connection -- potentially through more compression and better use of bandwidth -- you can reduce the buffer size and thus reduce the latency. A key element of aptX Adaptive is looking at the robustness of the connection to the phone, adapting the data coding rate so as to reduce the chance of corrupted audio. In good RF conditions you can also adapt so as to reduce audio buffer size and hence reduce latency as well.
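The buffer-versus-latency trade-off Havell describes can be sketched in a few lines. This is a hypothetical back-of-envelope model, not Qualcomm’s implementation; the packet and retry intervals are assumed values chosen only for illustration.

```python
# Hypothetical sketch: the latency floor of a buffered Bluetooth audio link
# is roughly the playout buffer depth, and the buffer must be deep enough to
# absorb the worst-case run of packet retransmissions.

PACKET_INTERVAL_MS = 10.0    # assumed audio packet cadence
RETRY_INTERVAL_MS = 2.5      # assumed time per retransmission attempt

def min_buffer_ms(max_retries_per_packet: int, burst_packets: int) -> float:
    """Smallest playout buffer (ms) that survives a burst of corrupted
    packets, each needing up to max_retries_per_packet retransmissions."""
    worst_case_delay = burst_packets * max_retries_per_packet * RETRY_INTERVAL_MS
    return worst_case_delay + PACKET_INTERVAL_MS  # keep one packet in hand

# A clean link (few retries needed) can run a shallow, low-latency buffer;
# a contested 2.4GHz band forces a deeper buffer and therefore more latency.
clean_rf = min_buffer_ms(max_retries_per_packet=1, burst_packets=2)
busy_rf = min_buffer_ms(max_retries_per_packet=8, burst_packets=6)
```

The point of the sketch is the direction of the relationship, not the numbers: improving link robustness (fewer retries needed) is what lets a codec like aptX Adaptive shrink the buffer and, with it, the latency.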
BB: I’ve noticed in the comments on my Bluetooth latency video that a couple of people got perfect sync when I inserted 300 milliseconds of audio latency, which I assume meant there was lag in the video from their source device. Is it possible to address that?
CH: Latency has a few elements involved. The source device is one. It’s playing video but sending out audio on some other route. A laptop might think it’s sending audio to a speaker, but it’s rerouted over a Bluetooth wireless connection, where it has to get packetized and sent off to something else. Some optimization can be done if the system knows where the audio is headed to. And the Bluetooth audio device can report back to the source that “I’ve got a delay of 200ms,” and some phones or tablets can then use that signal to add a delay on the video applications so it syncs.
But that doesn’t work for gaming, because if you delay the video by 200ms, something has already happened by the time you see the video. For gaming, you have to drop into a very low-latency mode. When you’re in the zone with a game, you’re more captured by the video than by the audio, and you won’t notice so much if the audio quality is reduced -- and you can reduce the latency by compressing the signal more and sending lower amounts of data. With aptX Adaptive, you can have 200ms or 300ms of latency and have really great audio for movies and just delay the video, while with a game you can afford to squeeze the audio quality a bit in order to get as short a latency as possible. aptX Adaptive parses all these cues to give you the closest possible experience to a wired connection.
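The content-aware trade-off Havell outlines can be sketched as a simple profile selector. The mode names, buffer depths, and bitrates below are invented placeholders, not aptX Adaptive’s actual parameters.

```python
# Illustrative sketch of the trade-off described above: for movies, run a
# deep buffer and high bitrate and let the source delay the video to match;
# for games, shrink the buffer and compress the audio harder.

def link_profile(content: str, rf_quality: float) -> dict:
    """Pick a buffer depth (ms) and audio bitrate (kbps) for a content type.
    rf_quality ranges from 0.0 (hostile RF) to 1.0 (clean RF)."""
    if content == "game":
        # Latency wins: squeeze the audio, run a shallow buffer.
        return {"buffer_ms": 40, "bitrate_kbps": int(140 + 140 * rf_quality)}
    # Movies and music: quality wins; the source can delay video to sync.
    return {"buffer_ms": 250, "bitrate_kbps": int(280 + 140 * rf_quality)}

movie = link_profile("movie", rf_quality=0.9)
game = link_profile("game", rf_quality=0.9)
```

In both modes, cleaner RF conditions buy back some bitrate; the content type decides which end of the latency-quality axis to sacrifice.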
BB: Are you doing anything to help improve fundamental sound quality?
CH: The key elements for that are delivering the highest resolution of audio you can within the given RF environment -- so you’re not getting audio glitches -- then applying noise-reduction processing to remove outside noise, plus whatever DSP [digital signal processing] post-processing you want to do to make it sound great. We provide a really flexible DSP plus really flexible ANC plus the best codec for any particular audio carriage.
We are continuing to look at how we can get even higher-resolution audio across that connection, within the Bluetooth wireless standard. The latest chip we’ve just announced, the QCC514X, has an improved radio, so the connection’s better, and you have a better chance of being able to transmit reliably at a higher data rate. We need to get to a point where you don’t look at the audio as being lossy.
BB: Could you describe some of the DSP tools you offer with these chips?
CH: We’ve got a complete end-to-end tool chain with different blocks of DSP capabilities you can use, or manufacturers with lots of expertise can write their own.
The first thing you’re going to look at with DSP tuning is, you’ve got a certain set of characteristics associated with the drive unit, which will not have a flat response. You need to get the driver’s response flat before you can start building on top of that. So we have a DSP block that comes before the driver. Then you can start adding things like bass boost. With a 5mm or 6mm driver, you’re not getting a whole lot of bass, so you really need that boosted; with a larger driver -- say, an 11mm driver -- then your bass response is much more powerful. Then we have a whole bank of filters so the manufacturers can EQ the headphones to get the sound they want. We also have a spatializer on top of that, to create more of an immersive sound, and then some final post-processing.
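The tuning chain Havell outlines -- flatten the driver’s response, add bass boost, then voicing EQ -- can be sketched as a cascade of biquad filters. This is an illustrative sketch in plain Python using the standard RBJ Audio EQ Cookbook peaking-EQ formula; the sample rate, center frequencies, gains, and Q values are made-up placeholders, not Qualcomm tunings.

```python
import math

# Each stage of the tuning chain is modeled as one peaking-EQ biquad.

def peaking_biquad(fs, f0, gain_db, q):
    """Return normalized (b, a) coefficients for one peaking-EQ section,
    per the RBJ Audio EQ Cookbook."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * A, -2 * math.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * math.cos(w0), 1 - alpha / A]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def run_cascade(x, sections):
    """Run samples x through each biquad in turn (direct form I)."""
    for b, a in sections:
        y = []
        x1, x2, y1, y2 = 0.0, 0.0, 0.0, 0.0
        for s in x:
            out = b[0]*s + b[1]*x1 + b[2]*x2 - a[1]*y1 - a[2]*y2
            x2, x1 = x1, s
            y2, y1 = y1, out
            y.append(out)
        x = y
    return x

# Hypothetical tuning for a small 6mm driver: boost the bass the driver
# can't produce on its own, then tame an assumed presence peak.
chain = [peaking_biquad(48000, 80, 6.0, 0.7),     # bass boost
         peaking_biquad(48000, 3000, -2.0, 1.4)]  # voicing EQ
out = run_cascade([1.0] + [0.0] * 63, chain)      # impulse through the chain
```

A real product would add the driver-flattening stage first and a spatializer and post-processing after, as Havell describes; the structure is the same -- more sections in the cascade.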
BB: Are you doing anything that will bring the amplifiers built into SoCs [systems on a chip] used for headphones closer to the result you’d get with a good standalone headphone amp?
CH: There’s a lot of development in that area as well. As you build a class-D or class-H amp at the back end of the chip, you find that it takes a disproportionate amount of area on the chip. The way you design it has a big impact on THD+N [total harmonic distortion plus noise], etc. With a 5mm or 6mm drive unit, you’re not going to hear much of the benefit, but with an 11mm or 12mm driver, you can really benefit from a better amp. It’s a continuous development process, particularly as you’re going down through technology nodes.
BB: So it’s much like what a traditional high-end amplifier designer does, playing around with different circuit layouts and signal paths and component positioning?
CH: Yes, but on a piece of silicon. It’s a continuous evolution within the constraints of the piece of silicon you’ve got.
BB: So what’s interesting about the new chip you mentioned?
CH: We have two, the QCC514X and QCC304X, the QCC514X being for the premium end of the market. One of the key things we added is true wireless mirroring, so in essence your earbuds are mirror copies of each other; they’re doing the same thing. If you took one out and put it back in the case, you wouldn’t hear a glitch in a phone call or music playback in the other earpiece. The second earpiece is listening in on everything going on with the phone. Only one is connected to the phone [and sending audio to the other one], but the other one is ready to connect to the phone. So if you have your phone in your pocket and have a strong connection to one of the earpieces, when you take the phone out of your pocket, it can reconfigure the connection on the fly.
BB: So the need to hold your phone strategically to get a reliable connection is going away?
CH: Definitely, and for a host of reasons: true wireless mirroring, improvements in radio performance, improvements in aptX Adaptive. It’s a very complicated process to achieve something very simple for the end user.
BB: There’s still a huge range in performance with noise-canceling headphones -- some of them work amazingly well, and some barely do anything. What are you doing to bring manufacturers up to a higher level of performance with ANC?
CH: There are two elements: technology and education. A lot of the performance comes down to acoustics design. It’s not like you just put a chip in and you have great ANC. You have to think about the microphone spec, the driver spec, and where things are physically placed relative to each other. It’s garbage in, garbage out -- if you’re not getting good microphone input, measuring from the outside and inside of the headphones, you’re not going to get a good ANC experience. And if you have a small drive unit, you don’t have very good bass performance, so you’re not going to be able to use that for feedback to remove the low-frequency noise. We have gained quite a lot of experience in designing these products, and we’ve worked with several customers on designing the acoustics of ANC headphones.
We’re also building in dedicated ANC hardware and tuning tools, so it’s not just a function of the DSP. You have to think in terms of latency -- how quickly can you react to higher-frequency noise? It’s a balance between how much you attenuate signals and how that affects the rest of the audioband. I assume you know the term “waterbed noise”?
BB: No, that one’s new to me.
CH: If you think of a waterbed, when you push one side down, the other side pushes up. So with ANC, attenuating low frequencies can push up higher frequencies, increasing the high-frequency noise. When you put ANC headphones on and hear a high-pitched hum, that’s waterbed noise. In order to cancel that, you have to control the ANC process -- how it handles feedback, feed-forward, and attenuation.
We’re providing the tools for manufacturers to change the coefficients of the ANC filters on the fly, so they can create different ANC presets that consumers can access. If you’re on a plane with a lot of low-frequency noise, you might want a different ANC setting than if you’re working in a noisy office. I have seen some headphones do it dynamically, adjusting themselves automatically, but some consumers may find that annoying depending on how well it’s done. Giving them control through an app or a button on the headset is a good way of doing it because they understand what’s going on.
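The preset mechanism Havell describes -- swapping ANC filter coefficients on the fly -- can be sketched as follows. The preset names and coefficient values here are invented placeholders; real ANC filters are tuned per product against measured acoustics.

```python
# Sketch of user-selectable ANC presets, each a different set of filter
# coefficients. Values are arbitrary stand-ins, not real ANC tunings.

ANC_PRESETS = {
    # preset name -> hypothetical feed-forward biquad coefficients (b, a)
    "airplane": ([0.95, -1.85, 0.91], [1.0, -1.90, 0.92]),  # deep LF cut
    "office":   ([0.90, -1.70, 0.82], [1.0, -1.75, 0.84]),  # gentler tilt
}

class AncController:
    """Holds the active preset; a real device would push the coefficients
    into ANC hardware without interrupting the audio stream."""

    def __init__(self):
        self.active = None
        self.coeffs = None

    def load_preset(self, name: str) -> None:
        if name not in ANC_PRESETS:
            raise KeyError(f"unknown ANC preset: {name}")
        self.coeffs = ANC_PRESETS[name]
        self.active = name

anc = AncController()
anc.load_preset("airplane")
```

Exposing the switch through an app or a button, as Havell suggests, keeps the user in control of which trade-off is active.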
BB: Are you doing anything to address the eardrum suck issue?
CH: Yes. Ultimately it’s a matter of how much silicon you allocate to doing it. In order to remove high-frequency noise, you have to sample at much higher frequencies, which means more silicon and more memory.
BB: So you’re able to run the ANC at higher frequencies by sampling and monitoring the signal and adjusting the filters on the fly to eliminate feedback when it occurs? Like the automatic feedback destroyer in a P.A. system?
BB: One of the problems with true wireless is fitting controls onto the tiny earpieces, and when you push on the controls you’re shoving the earpieces farther into your ears. I know manufacturers are looking at using voice command instead of buttons, but they’re struggling to get the earphones to respond as reliably as smart speakers do. Are you doing anything to bring headphones and earphones up to that level of performance on voice command?
CH: One of the challenges is the amount of processing required. With wake words [such as “Alexa” or “Hey Google”], using a model of the word you’re listening for, the more memory you can allocate for your model, the more accurately you can identify the wake word. Especially when you’re dealing with people speaking in different intonations and accents, the bigger the model, the better. A smart speaker can have a lot of memory and run at gigahertz levels of processing because it’s plugged into the wall. We have lots of challenges in terms of the amount of memory and processing we can put into earphones, but we continue to increase the performance of the CPUs, putting in more memory so we can accommodate bigger wake-word models.
Then other things come into play: how well you can manage the microphone inputs, and how you can eliminate noise from the surroundings to present a cleaner captured voice command. The more noise you can eliminate, the better the accuracy. Also, we’re working on ways for the microphones to know it’s you who’s speaking, so you can eliminate false triggers and don’t have to worry about someone shouting the wake word when they’re right near you.
BB: Is there anything else really new and wild on the horizon?
CH: Yes, you can also pick up a lot of things in the ear through sensors. We’ve already seen things like the Bragi Dash, which put a huge number of sensors in the ear. That could be especially useful right now [in the time of the coronavirus pandemic] -- your earphones could give you early warning if you have a slight fever. Things like this could lead to very interesting applications going forward.
. . . Brent Butterworth