I don't have time to write the longer reply I've been wanting to, but after spending a bit of time on background research and listening to the RACE implementation of ambio in my main system, I have a few observations that may be useful.
First, it seems that it would be useful to come to some degree of common agreement (as far as that's possible in audio) on the basic mechanisms of the ear and how we localize. This summary is about the best I've found. It's not 'original work', but it agrees on all the important points with the other sources I've found:
http://www.aip.org/pt/nov99/locsound.html
The important points seem to be
- ITD (really IPD since it's phase rather than time that is sensed) and ILD are both important localization mechanisms
- ITD is dominant up to ~1 kHz or so; ILD is effective at basically all frequencies, but is dominant from 3 kHz on up
- ITD is rather delicate, and can be disrupted by reflections; ILD is relatively stable
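As a rough sanity check on why ITD/IPD runs out of steam somewhere around 1 kHz, here is a small sketch. It is not from the linked article: it uses Woodworth's spherical-head approximation, and the head radius and speed of sound are my own assumed values. The "ambiguity" frequency is simply where half a period equals the ITD, i.e. where the interaural phase difference reaches 180 degrees and no longer says which ear leads.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (not from the article)
SPEED_OF_SOUND = 343.0   # m/s at room temperature (assumed)

def woodworth_itd(azimuth_deg):
    """ITD in seconds for a distant source, spherical-head approximation.

    Intended for azimuths between 0 (straight ahead) and 90 degrees (fully to one side).
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

def ipd_ambiguity_hz(itd_s):
    """Frequency at which the ITD equals half a period (phase difference of 180 degrees)."""
    return 1.0 / (2.0 * itd_s)

for az in (15, 30, 60, 90):
    itd = woodworth_itd(az)
    print(f"{az:3d} deg: ITD = {itd * 1e6:6.1f} us, "
          f"phase ambiguous above ~{ipd_ambiguity_hz(itd):6.0f} Hz")
```

With these assumed numbers, a source fully to one side gives an ITD of roughly 650 us, so the phase cue becomes ambiguous somewhere below 1 kHz, which is at least in the same ballpark as the figure above.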
With this context and (hopefully) a basic understanding of both Blumlein and XTC, we can make some 'obvious' deductions
- conventional stereo works because, as Blumlein showed, it can capture/reproduce the ITD effects within the important frequency range. This isn't quite perfect - a 60-degree speaker geometry can only reproduce it up to about 750 Hz and it probably doesn't capture the leading edge exactly, but it's quite good, particularly in properly mic'd recordings
- the observation in the article that ITD is sensitive to reflections/ambient sound is consistent with the basic experience that stereo works best in large, well-treated spaces, where these effects are well controlled.
- Blumlein doesn't capture ILD particularly well, although it does to a degree. The presence of crosstalk will definitely confuse this cue. In a 'good' stereo implementation this is benign, as the ear will ignore marginal or inconsistent cues if it has effective dominant cues due to ITD.
- comb filtering will be a problem at higher frequencies with conventional stereo, and in general is so highly dependent on the physical geometry that it's not exactly predictable.
- XTC can be extremely effective in preserving the ILD at most frequencies. This is very important in the 3 kHz+ range, and (some degree of inference here) doesn't seem to hurt at other frequencies.
- [speculation] XTC *may* not preserve the ITD cues properly. By cancelling the crosstalk component, it will definitely reduce the phase difference that gets established at the ears. However, cancellation is only partial, so some phase difference will still remain, and it will additionally create an ILD. Now, in 'natural' soundfields the ILD is generally not present at these frequencies, but as headphone listening shows, it can still be an effective cue when it is present.
- XTC (assuming it's largely reliant on ILD) should be more tolerant of reflections and ambient interference
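For anyone who wants to see what the crosstalk cancellation amounts to, here is a minimal sketch of the recursive structure as I understand RACE to be commonly described: each output channel is its input minus a delayed, attenuated copy of the *other output*. This is a simplification, not the actual implementation I listened to. The function name `race` and the default delay and attenuation are my own placeholders, and the band-limiting of the cancellation path that real implementations apply is left out.

```python
def race(left, right, delay_samples=3, atten_db=2.5):
    """Simplified recursive crosstalk cancellation on two equal-length sample lists.

    delay_samples: assumed crosstalk path delay (3 samples is ~68 us at 44.1 kHz).
    atten_db:      assumed attenuation of the cancellation signal, in dB.
    Both defaults are illustrative guesses, not recommended settings.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1 for the recursion to be causal")
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")

    gain = 10 ** (-atten_db / 20.0)   # dB attenuation -> linear gain below 1
    n = len(left)
    out_l = [0.0] * n
    out_r = [0.0] * n
    for i in range(n):
        j = i - delay_samples
        prev_l = out_l[j] if j >= 0 else 0.0   # earlier left OUTPUT
        prev_r = out_r[j] if j >= 0 else 0.0   # earlier right OUTPUT
        # Each channel subtracts a delayed, attenuated copy of the opposite output,
        # so every cancellation signal is itself cancelled one delay later.
        out_l[i] = left[i] - gain * prev_r
        out_r[i] = right[i] - gain * prev_l
    return out_l, out_r

# A single impulse in the left channel produces alternating, decaying
# cancellation pulses that bounce between the two outputs.
impulse_l, impulse_r = race([1.0] + [0.0] * 9, [0.0] * 10)
print(impulse_l)
print(impulse_r)
```

In this sketch, raising `atten_db` makes the cancellation less aggressive, and setting it very high leaves plain stereo.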
After spending some time listening to the AudioMulch RACE implementation, my overwhelming impression is that it performs amazingly well in my extremely poor room. With the speakers simply set up and some 2" foam at the first reflection points, the stage width was 100-110 degrees, and there was no sense of physical constraint in the soundstage - apparent depth was present, and the sense of ambient space was very good. In contrast, my conventional stereo setup in the same room (7'x16') is marginal - decent lateral localization but not terribly expansive, with not much depth sensation. Additionally, I *think* tonality and low-level detail were better, but I reserve comment on that until I can do a more reasonable A/B.
It wasn't all roses, though. Some recordings definitely had weird effects - some being '3 blob' - hard left, hard right, and center. These were the minority, though - most were greatly improved, and only a few were distractingly worse than stereo. Playing around with the settings suggests that it's possible to back off their aggressiveness to get better results on the poor recordings, although that greatly complicates the playback interface.
So, I remain very interested in the approach. I can see why folks with the luxury of a large room may find it uninteresting, but IMHO its potential for dramatically better spatial presentation in smaller spaces is compelling.