Thanks TCG!
Glad to be a part.
A customer advised us yesterday of a new interview with the Revel speaker designer- Kevin Voecks, whom I understand to be intelligent, thoughtful and a generally good guy. I respect his accomplishments and how hard he has worked.
In this interview, Kevin states quite clearly the position of those convinced that time coherence in speakers is not necessary. He also states the standard reasoning for not using first-order crossovers.
This magazine would probably bust me for printing any quotes from the interview, so I shall paraphrase quite accurately:
Time-coherence isn’t important.
My response: "You are right, if you are listening to test tones/square waves to prove this to yourself- sounds unfamiliar in daily life. Which is exactly how those tests were run to 'justify' a lack of time coherence.
It would seem the best experiment to answer this question would use familiar music or voice as the test signal, spanning the particular crossover point in question, like 3kHz, and played through electrostatic headphones (which are time coherent). Then introduce the time delays of a high-order crossover at 3kHz, which is easy to do for a cost of about $300 for the electronic crossover, dividing and recombining the signal.
From experience I know, as do many others, including many, many professional studio engineers, that there is no way to miss the resulting time-domain distortion- unless, of course, you haven't really heard that much live, unamplified music, which is indeed possible in the hi-fi and engineering worlds."
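For a feel of how large "the time delays of a high-order crossover at 3kHz" might be, here is a rough sketch. It assumes, purely for illustration, a 4th-order Linkwitz-Riley crossover; the post does not name a specific filter type. The summed low-pass and high-pass outputs of that filter behave as a 2nd-order all-pass with Q = 1/sqrt(2): flat in amplitude, but with a delay that varies with frequency. The function name is hypothetical.

```python
import math

def lr4_sum_group_delay(f_hz, fc_hz):
    """Group delay in seconds of the summed outputs of an (assumed) 4th-order
    Linkwitz-Riley crossover, modeled as a 2nd-order all-pass, Q = 1/sqrt(2)."""
    q = 1.0 / math.sqrt(2.0)
    w = 2.0 * math.pi * f_hz    # frequency of interest, rad/s
    w0 = 2.0 * math.pi * fc_hz  # crossover frequency, rad/s
    a = w0 / q
    # tau(w) = -d(phase)/dw for H(s) = (s^2 - a*s + w0^2) / (s^2 + a*s + w0^2)
    return 2.0 * a * (w0**2 + w**2) / ((w0**2 - w**2) ** 2 + (a * w) ** 2)

if __name__ == "__main__":
    for f in (300, 1000, 3000, 10000, 30000):
        tau_us = lr4_sum_group_delay(f, 3000) * 1e6
        print(f"{f:>6} Hz: {tau_us:7.1f} microseconds of delay")
```

Under these assumptions the delay is about 150 microseconds at and below 3kHz and falls to a few microseconds well above it, so the harmonics of a voice spanning the crossover are shifted in time relative to each other. Other filter types and orders would give different numbers.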
Timbre is the biggest thing to get right in a speaker.
My response: "Yes, indeed. And you go on to make a lot of good points. However, you leave out that a big part of what forms timbre (texture) is the way the harmonics combine/add/cancel as time passes- we can see that on a 'scope. This is what forms the overall wave envelope! Change their relative timing, change the timbre."
High-order crossovers prevent dynamic compression compared to 1st-order crossovers.
My response: "So use better drivers that have the very least dynamic compression and can handle the real-world, musical, out-of-band signals presented by a slow-rolloff 1st-order crossover w/o noticeable distortion. There aren't many of them, and they aren't cheap, but linearity with power applied is one spec that helps define the very best drivers. The Dynaudio and Peerless raw-driver companies used to publish this info graphically, showing how their drivers reacted over an input range of 0.1 watt to hundreds of watts, to differentiate their drivers from the competition."
Loudspeakers that use low-order crossovers inevitably run into these problems.
Wish I'd known this back in 1973, when Hewlett/Packard engineers I worked with demonstrated just the opposite to me, helping me get started in design. I do not see or hear any off-axis problems either, unless you measure only one pure test tone at one very specific point in space- and a pure, single tone is not music.
There are a lot of room problems in the bass with speakers.
And our experiments show that those mostly disappear when the customary 5-8 foot equivalent distance (roughly 4.5 to 7 ms of time delay) in the typical woofer or subwoofer crossover is removed. How 'bout that! That is a time delay, present in most speakers, equivalent to the round-trip distance to the wall behind the speaker and back again. Hmmmm. Is it the wall reflection or the speaker?... It measures just like a wall reflection/standing wave problem- must be, if we can't hear time-domain distortion...
Here is the link to the full interview: Voecks' interview
Oh well- live to fight another day. But let me leave you with a thought experiment, relative to time coherence:
What does it mean to the band when someone is 10% late? A 3 minute song would end 18 seconds late- several bars late. The beat goes from 105/minute down to 95. Seems easily audible, right?
10%: on a fairly rapid dance beat of 105 beats/minute (waltz time is 88), one quarter note is struck every 60/105 seconds, or every 0.57 seconds. Struck and silenced (nearly) and then struck again. All in 0.57 seconds. So the strike and release of the note might itself be only 0.2 seconds. 10% late would be 0.057 seconds late, or 57 milliseconds. On a quarter note.
Except any musician can easily play 16th notes.
Which means one note struck every 0.57/4 seconds, every 0.14 seconds, or completed in less than 0.1 seconds from start to release. And 10% late would be 0.014 seconds or 14 ms.
Unless it was a 32nd-note run- where 10% late would be 7ms (milliseconds).
I imagine the reproduction system shouldn't screw that up by 10% if at all possible. That is 10% of the 14ms, or 0.0014 second- 1.4ms. Which is just about, or less than, the time delay one has with side-mounted woofers- which many accuse of sounding slow.
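The rhythm arithmetic above can be checked with a few lines of Python. This is only a sketch of the same calculation; the function name and the 10% figures are taken from the thought experiment, nothing here is a measurement.

```python
def note_spacing_seconds(bpm, notes_per_beat=1):
    """Time between note onsets, with one beat taken as one quarter note."""
    return 60.0 / bpm / notes_per_beat

BPM = 105  # the "fairly rapid dance beat" from the thought experiment

if __name__ == "__main__":
    for name, per_beat in (("quarter", 1), ("16th", 4), ("32nd", 8)):
        spacing = note_spacing_seconds(BPM, per_beat)
        late = 0.10 * spacing  # a player who is 10% late on each note
        print(f"{name:>7} notes: one every {spacing * 1000:6.1f} ms, "
              f"10% late = {late * 1000:5.1f} ms")

    # A playback system that adds another 10% on top of the 16th-note lateness
    playback_error = 0.10 * 0.10 * note_spacing_seconds(BPM, 4)
    print(f"10% of the 16th-note lateness: {playback_error * 1000:.2f} ms")
```

This gives roughly 571, 143 and 71 ms between notes, 57, 14 and 7 ms for "10% late", and about 1.4 ms for the last figure.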
That was for rhythm. For two harmonics combining, we can easily hear the timbre of the voice change with only one microsecond of time delay injected into an otherwise time-coherent design. This change in the sound does not show up on any sinewave test, any pink noise, any MLS test, any FFT, any impulse test, nor on any swept-filter test. But we can hear it- and have demonstrated this to customers, our dealers, and reviewers. For a decade.
This delay of about a millionth of a second is equivalent to sitting lower in the chair by ~ 2 inches, or instead, just moving the mid back an eighth of an inch- which was our experiment (so one's relationship to the front of the cabinet and other drivers and the room and the chair-back did not change).
The mid was crossed over at 3kHz, which has a period of 1/3000 of a second. Of which a millionth of a second is about 1/333rd- about 1 degree's worth. Which cannot lead to sinewave cancellation of any audible degree, nor can be read at a 100kHz test-instrument sampling rate. But it can be heard reliably, by inexperienced listeners who are simply asked if they hear a difference and if so, what the difference was. All of the hundreds who have tried have agreed.
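For anyone who wants to run the numbers on delay, distance and phase, here is a small sketch. It assumes a speed of sound of 343 m/s (dry air at about 20 C) and a path-length change equal to the driver offset, as it would be on-axis; both are assumptions for illustration, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 C
METERS_PER_INCH = 0.0254

def delay_from_distance(inches):
    """Acoustic delay in seconds for a path-length change given in inches."""
    return inches * METERS_PER_INCH / SPEED_OF_SOUND_M_PER_S

def phase_shift_degrees(delay_s, freq_hz):
    """Phase shift in degrees that a pure time delay produces at freq_hz."""
    return 360.0 * freq_hz * delay_s

if __name__ == "__main__":
    one_us = 1e-6
    print(f"1 microsecond at 3 kHz: "
          f"{phase_shift_degrees(one_us, 3000):.2f} degrees")

    eighth = delay_from_distance(0.125)
    print(f"1/8 inch of path: {eighth * 1e6:.1f} microseconds, "
          f"{phase_shift_degrees(eighth, 3000):.1f} degrees at 3 kHz")
```

One microsecond at 3kHz comes out to about 1.08 degrees, matching the "1/333rd" figure. Under the stated assumptions, an eighth of an inch of path length works out closer to 9 microseconds (about 10 degrees at 3kHz), so the "about a millionth of a second" figure is perhaps best read as an order-of-magnitude estimate. Either way it is a small fraction of one cycle.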
My view is that when the phase response of the speaker is twisted grossly, then one cannot hear these small changes I just described. That type of speaker is so out of phase it doesn't matter much if one stands or sits. But you also cannot play aggressive or complicated music loudly on that speaker. Ever wonder why "audiophile" recordings are so bland? Don't want to upset those phase anomalies by having multiple tones in that high-voice crossover range... can't play Janis Joplin loudly without upsetting the women in your life- hurting their ears.
Try Janis on excellent headphones- no problem at all. Or on time-coherent speakers.
Thanks for reading this. 
Best to all,
Roy Johnson
Green Mountain Audio