transaural adventures

skrivis · « **on:** 19 Apr 2005, 10:42 pm »

I'm interested in the process of stereo recording/playback, and also in variants and attempts to fix the built-in problems.

One of the basic problems is that stereo microphones "hear" a sound source twice, once per microphone. Playing this back on a standard stereo set of loudspeakers makes your ears hear the sound 4 times. L channel to left speaker to left ear. R channel to right speaker to right ear. Plus... left speaker to right ear and right speaker to left ear.

This is crosstalk, and transaural processing attempts to compensate for it. Carver's Sonic Holography was an attempt at it, and so is the McShane Ambience Retrieval System (MARS).

The basic idea is that, since you will hear the left speaker in your right ear and that's wrong, we'll put an inverse "left" signal into the right speaker and it will all cancel out when it reaches your ears.
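To make the cancellation idea concrete, here is a minimal Python sketch at a single frequency. It assumes an idealized symmetric model that is not from any particular product: each speaker reaches the near ear at gain 1 and the far ear with a hypothetical attenuation `gain` and extra delay `delay_s`. Real heads and rooms are messier, so the numbers are placeholders.

```python
import cmath

def crosstalk_term(freq_hz, gain=0.7, delay_s=0.0002):
    """Far-ear path relative to the near-ear path (assumed: attenuated and delayed)."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)

def speaker_feeds(left, right, x):
    """Invert the 2x2 path matrix [[1, x], [x, 1]] to get the speaker signals."""
    det = 1 - x * x
    return (left - x * right) / det, (right - x * left) / det

def at_ears(spk_left, spk_right, x):
    """What each ear receives: its own speaker plus crosstalk from the other one."""
    return spk_left + x * spk_right, x * spk_left + spk_right

x = crosstalk_term(1000.0)
plain = at_ears(1.0, 0.0, x)                      # ordinary stereo, left channel only
fixed = at_ears(*speaker_feeds(1.0, 0.0, x), x)   # with the inverse signal mixed in
print("right ear, plain stereo:", abs(plain[1]))
print("right ear, cancelled:   ", abs(fixed[1]))
```

Note that the right speaker's feed contains an inverted, scaled copy of the left channel, which is the "inverse left signal" described above. A practical filter has to do this across all frequencies at once, which is why long FIR filters get used.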

There are some commercial products that license technology from SRS Labs that do some of this, but they also tend to lump in other "features" that probably don't do much for accuracy.

There are some software DVD players, and also a plugin for Winamp, that do SRS or transaural processing too, but they only do it in real-time, and they only output to a sound card.

I don't particularly want to listen to a sound card. They're pretty grotty, and not what I'd consider high fidelity.

So I looked for a way to process audio and burn it to a CD so I could play it on my stereo system, which has good electronics, DAC, speakers, etc.

What I was able to do on UNIX is rip the audio CD to wav files using cdparanoia. You get 16-bit little-endian wav files, one per track.

Then you use BruteFIR to process the files. BruteFIR doesn't know about wav files, just raw "format," so you have to take the beginning of the wav file, copy it to another file, then append the output of BruteFIR onto that.
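For anyone who would rather not splice headers by hand, the same round trip can be sketched with Python's standard `wave` module. This is an alternative to the copy-the-header trick, not what I actually did: it writes a fresh header, which also keeps the length fields correct if the processed output is not exactly the same size as the input. The defaults assume 16-bit little-endian interleaved stereo at 44.1 kHz, and the file paths are whatever you choose.

```python
import wave

def wav_to_raw(wav_path, raw_path):
    """Strip the WAV header; return (channels, sample_width, rate) for later."""
    with wave.open(wav_path, "rb") as w:
        params = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        frames = w.readframes(w.getnframes())
    with open(raw_path, "wb") as f:
        f.write(frames)
    return params

def raw_to_wav(raw_path, wav_path, channels=2, sample_width=2, rate=44100):
    """Wrap headerless PCM (e.g. processed output) in a fresh WAV header."""
    with open(raw_path, "rb") as f:
        frames = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(frames)
```

Run `wav_to_raw` on each ripped track, process the raw file, then run `raw_to_wav` on the result before burning.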

BruteFIR comes with some example filters, amongst which is a speaker crosstalk cancellation filter. It's designed for playback on "stereo dipoles," which are normal stereo speakers with no physical separation between them.

So, process the files with BruteFIR, burn the resulting wav files to CD, pop the CD in and play it.

I tried it with my speakers at their normal distance apart and the result is rather muddled. No problem, I'm not surprised.

Next I put my speakers right next to each other. Bingo, it now images like nobody's business.

This worked quite well for some classical recordings (I tried older Telarc ones), but not so well with multiple monaural studio recordings. With those the result, while interesting, isn't necessarily an improvement.

Take this all with a grain of salt. It might work differently for you with your system. But, if you have a UNIX box (Linux will work, although I used Solaris) and a CD burner, it will only cost you some time and blank CDs to try it out.

You can find BruteFIR on the web, as well as info on transaural sound. There is info on stereo dipoles on a site about Ambiophonics.

_scotty_ · « **Reply #1 on:** 20 Apr 2005, 12:20 am »

How far away were you from your loudspeakers, and was the sweet spot any larger or smaller as a result of the processing? Also, what kind of speakers are you using?
Scotty

csero · « **Reply #2 on:** 20 Apr 2005, 01:10 am »

Quote from: _scotty_

How far away were you from your loudspeakers, and was the sweet spot any larger or smaller as a result of the processing? Also, what kind of speakers are you using?
Scotty

Let me chime in. I'm listening to transaural, and only transaural now. No more stereo for me.

I don't feel any disadvantage, even for multitrack mono recordings, but to achieve that you need careful setup.

First, it is a sit-down-and-listen solution; do not expect the same sound everywhere in the room (just like in the original concert hall). Except... see at the end.

The distance depends on the DSP algorithm. The crosstalk cancellation is usually designed for 10-20 deg speaker separation. The smaller the speaker separation, the less sensitive the cancellation is to head position, but also the less effective it is at lower frequencies, which is not necessarily bad. 10 deg means ~1' distance between the speakers and 6' distance from the speakers. The sweet spot is reasonable side to side, and quite long, and you do not feel your head needs to be in a vice, as is the case with directional stereo speakers.
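The geometry behind "10 deg means ~1' between the speakers at 6'" is simple trigonometry. A small Python sketch, assuming the listener sits on the center line and the spacing is measured between the acoustic centers of the two speakers (the function names are just illustrative):

```python
import math

def span_deg(spacing, distance):
    """Angle between the two speakers as seen by the listener (same units for both)."""
    return math.degrees(2 * math.atan((spacing / 2) / distance))

def spacing_for(span_degrees, distance):
    """Center-to-center spacing that gives the requested angle at that distance."""
    return 2 * distance * math.tan(math.radians(span_degrees) / 2)

print(round(span_deg(1.0, 6.0), 1))      # 1 ft apart, heard from 6 ft away
print(round(spacing_for(10.0, 6.0), 2))  # spacing in feet for 10 deg at 6 ft
```

One foot at six feet works out to roughly 9.5 degrees, so the rule of thumb is close. The same function shows how quickly the angle shrinks if you sit further back without widening the pair.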

Speaker selection is not critical; just make sure the acoustical centers of the LO-MID-HI drivers are vertically in line. As long as this is true, it is simply scary how similarly two very different speakers can image. I experimented with everything from RS Minimuses to big electrostats, and besides the obvious tonal and distortion differences, they actually created the same image. Why should it be different anyway?

Because of the ear's HF localization abilities, an even better, more believable image can be produced if you duplicate the front speaker pair behind you and feed it with the same signal. Why? I can explain, if you are interested, but it is a very long story. It also helps to make it very head-position independent: you can actually turn around without a change in the image. Try this with stereo!

The last link in the chain is fixing the dryness of the original recording with lots of generated or recorded ambience channels. This can make the whole reproduction scarily realistic, and it also helps when you don't want to sit in the sweet spot. If the ambience level is right and diffuse enough, you feel you are at the actual venue, but the front stage is not well defined, just like in real life in a very bad seat off to the side. If you move to the sweet spot, nothing changes; the front stage just gets more defined.
If you want, we have arrived at ambiophonics, but officially it is a hybrid transaural-ambient reproduction method.

But HOW??? That is the hard part. Besides computers, there is very little available. It is definitely a DIY job.

For transaural the cheapest, most accessible experiment is the following:

It can be found for $40 on the net. Don't laugh, just remove the DSP card from that toy woofer and use two real speakers and a quality woofer. Not a complete solution, but you might be surprised. (It is a 10 deg DSP code with a 120 Hz XO.)

JoshK · « **Reply #3 on:** 20 Apr 2005, 01:26 am »

I've heard csero's system on day t+1 or 2, it is scary realistic as he says. Forget hi-fi, if you really want to sit and listen to performances and reality is paramount there is no looking back. Everything else is second best if even that.

skrivis · « **Reply #4 on:** 20 Apr 2005, 01:39 am »

Quote from: csero

But HOW??? That is the hard part. Besides computers, there is very little available. It is definitely a DIY job.

For transaural the cheapest, most accessible experiment is the following:

This is why I detailed my use of BruteFIR. It was fairly easy to get it working and try it out. Changing the parameters of the processing is more complex, but it's extremely flexible. People have even been using BruteFIR to implement a digital active crossover.

If you've got a Linux system and a CD burner, you're in business and can start experimenting. If you like the effect, then you can look into a hardware solution...

One advantage of doing it the way that I did is that you're not dependent on the quality of the hardware processor. I don't know what you get for $40, but I'm skeptical that it's any better than most sound cards.

skrivis · « **Reply #5 on:** 20 Apr 2005, 01:45 am »

Quote from: _scotty_

How far away were you from your loudspeakers, and was the sweet spot any larger or smaller as a result of the processing? Also, what kind of speakers are you using?
Scotty

I usually have about 6' in between speakers and 6-8' from each speaker to the listening position. I placed the speakers right next to each other for use as a "stereo dipole."

The sweet spot was actually a bit wider in terms of imaging effects. You could move around quite a bit and still localize instruments. Their position didn't shift either.

My speakers are Fried C/3-Ls. (The satellite part of the Valhalla system.) Hiquphon .75" tweeter, 6" Vifa woven Kevlar mid-woofer on a t-line.

Electronics are all AVA OmegaStar.

JoshK · « **Reply #6 on:** 20 Apr 2005, 01:48 am »

Quote from: _scotty_

How far away were you from your loudspeakers, and was the sweet spot any larger or smaller as a result of the processing? Also, what kind of speakers are you using?
Scotty

I felt the sweet spot was incredibly large, as large as any hi-fi system I have heard. Beyond a certain point the volume was diminished quite a bit because Frank's speakers are very directional (as are mine), but it was still very believable. What's more, the off-axis presentation is very consistent with what sound sounds like in a hall when you move about off axis, something no hi-fi system I have heard to date is capable of.

JoshK · « **Reply #7 on:** 20 Apr 2005, 01:52 am »

The only thing stopping me from going completely transaural is placement of the rear speakers. In my room there is no place for them, so I am not sure how to place them. Also, the front speakers being close together is going to be an issue with the wife, as she wants to put a fireplace inset in the space that is currently between the speakers of my stereo setup. I am a believer even if I am not yet an adopter.

skrivis · « **Reply #8 on:** 20 Apr 2005, 02:04 am »

Quote from: JoshK

I've heard csero's system on day t+1 or 2, it is scary realistic as he says. Forget hi-fi, if you really want to sit and listen to performances and reality is paramount there is no looking back. Everything else is second best if even that.

It was a post of csero's that got me started on trying transaural digital processing.

I'm still not positive that it is more "real" than normal stereo. It's different, but normal stereo can produce a very nice effect too. And, if nothing else, placing two speakers side-by-side is going to do odd things to the directivity, possibly increase baffle-step and diffraction problems, change the tonal balance...

From my brief exposure to it, it really does seem dependent on the original recording.

It sure is fun to play with though!

JoshK · « **Reply #9 on:** 20 Apr 2005, 02:33 am »

Are you using ambience channels? I didn't find it very believable 'til the ambience channels were added.

csero · « **Reply #10 on:** 20 Apr 2005, 02:44 am »

Quote from: JoshK

Are you using ambience channels? I didn't find it very believable 'til the ambience channels were added.

Yes, because my room is very dry, which is bad if you want to reproduce "only what is on the record".
The recordings are usually too dry, and in a dead or "treated" room that gives an unnatural, although accurate, front stage. Increasing the liveness of the room degrades crosstalk cancellation, and also gives more of the feeling that you are in a suburban media room rather than a concert hall.
The added ambience not only fixes this but also increases liveliness and bass impact without being unrealistically loud, as in most hi-fi systems.

The rear 2 speakers are just the icing on the cake; you can live without them.

csero · « **Reply #11 on:** 20 Apr 2005, 02:55 am »

Quote from: skrivis

This is why I detailed my use of BruteFIR. It was fairly easy to get it working and try it out. Changing the parameters of the processing is more complex, but it's extremely flexible. People have even been using BruteFIR to implement a digital active crossover.

If you've got a Linux system and a CD burner, you're in business and can start experimenting. If you like the effect, then you can look into a hardware solution...

One advantage of doing it the way that I did is that you're not dependent on the ...

I just found computers unfriendly in the listening chain.
There is nothing wrong with the Creative: it is just a standard Motorola 56k DSP with digital and analog inputs and reasonable-quality CS DAC chips. The FIR algorithm is reasonable and quite long, which gives good crosstalk cancellation down to around 600 Hz. Below that you actually need the crosstalk, at least for "normal" stereo records.
AFAIK BruteFIR is for 10 deg speaker separation, so your speakers are too close. Try 12". Feed the canceler with the L channel only and find the spot where the music sounds from the extreme left only.
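If you process files offline, a left-only test track is easy to make. A minimal Python sketch using only the standard library, assuming a 16-bit stereo WAV as input (the function names are made up for illustration):

```python
import array
import wave

def mute_right(frames):
    """Zero the right channel of interleaved 16-bit stereo PCM bytes."""
    samples = array.array("h")
    samples.frombytes(frames)
    for i in range(1, len(samples), 2):   # odd indexes hold the right channel
        samples[i] = 0
    return samples.tobytes()

def make_left_only(src_path, dst_path):
    """Copy a 16-bit stereo WAV, silencing the right channel."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo input")
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(mute_right(frames))
```

Run the left-only file through the canceler, play it, and adjust speaker spacing or seat position until the sound appears to come from the far left.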

csero · « **Reply #12 on:** 20 Apr 2005, 03:02 am »

Quote from: skrivis

I'm still not positive that it is more "real" than normal stereo. It's different, but normal stereo can produce a very nice effect too. And, if nothing else, placing two speakers side-by-side is going to do odd things to the directivity, possibly increase baffle-step and diffraction problems, change the tonal balance...

From my brief exposure to it, it really does seem dependent on the original recording.

It sure is fun to play with though!

Normal stereo is not real!!! It is just an effect we got used to.

Putting the 2 speakers together actually decreases the diffraction, baffle step and channel imbalance problems. What is bad is the increased off-axis energy, which is also tonally incorrect. That's why you need a deader room.

The thing transaural does not like, IMHO, is a multimiked single instrument panned separately to several positions with fake reverb added.

skrivis · « **Reply #13 on:** 20 Apr 2005, 11:00 am »

Quote from: JoshK

Are you using ambience channels? I didn't find it very believable 'til the ambience channels were added.

No, I wanted to keep it as simple as possible at first.

skrivis · « **Reply #14 on:** 20 Apr 2005, 04:23 pm »

Quote from: csero

I just found computers unfriendly in the listening chain.
There is nothing wrong with the Creative: it is just a standard Motorola 56k DSP with digital and analog inputs and reasonable-quality CS DAC chips. The FIR algorithm is reasonable and quite long, which gives good crosstalk cancellation down to around 600 Hz. Below that you actually need the crosstalk, at least for "normal" stereo records.
AFAIK BruteFIR is for 10 deg speaker separation, so your speakers are too close. Try 12". Feed the canceler with the L channel only and find the spot where the music sounds from the extreme left only.

BruteFIR is fast enough to process in real-time, but I didn't use it that way.

The filter included with BruteFIR is for 10 deg, but it should be quite possible to set up whatever you want.

I did separate my speakers a bit last night and found that things at the sides of the soundstage improved, but things in the center seemed more muddled.

I also took a look on the net for the Creative PS2000, and didn't find it for $40.

Too bad it's set up to feed a subwoofer. I'd much rather just run it full-range.

csero · « **Reply #15 on:** 20 Apr 2005, 04:37 pm »

Quote from: skrivis

The filter included with BruteFIR is for 10 deg, but it should be quite possible to set up whatever you want...

...if you have the right filter curve. 10 or 15 deg seems to be the best compromise between usable FR range, overhead requirement, off-axis response and head position sensitivity.

I do not feel any muddiness in the middle, but I do not use BruteFIR; I heard it but found HW solutions better. It is also strange, as the crossfeed processes the centered sources the least. Maybe you miss the center-image comb filtering of stereo.

Quote from: skrivis

I also took a look on the net for the Creative PS2000, and didn't find it for $40.

Too bad it's set up to feed a subwoofer. I'd much rather just run it full-range.

I bought mine for $30 in a local computer shop. Yes, it is not ideal, but as an experiment it was worth it. There are much better solutions, but they are harder to get or pricier.

dwk · « **Reply #16 on:** 20 Apr 2005, 05:33 pm »

Quote from: skrivis

BruteFIR is fast enough to process in real-time, but I didn't use it that way.

The filter included with BruteFIR is for 10 deg, but it should be quite possible to set up whatever you want.

I did separate my speakers a bit last night and found that things at the sides of the soundstage improved, but things in the center seemed more muddled.

Sending 16-bit output from BruteFIR can be dicey. There's a lot of math going on, and keeping the resolution at 24 bits usually helps. I hope you at least dithered the output.
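For reference, this is roughly what dithering on the way down to 16 bits looks like. The sketch below is plain Python and uses TPDF (triangular) dither, which is a common choice but an assumption here; it is not a description of BruteFIR's own output stage.

```python
import math
import random

def to_int16(samples, dither=True, rng=None):
    """Quantize floats in [-1.0, 1.0) to 16-bit integers, optionally with TPDF dither."""
    rng = rng or random.Random()
    out = []
    for s in samples:
        v = s * 32768.0
        if dither:
            # Two uniform values summed give triangular noise spanning +/- 1 LSB.
            v += (rng.random() - 0.5) + (rng.random() - 0.5)
        q = math.floor(v + 0.5)                   # round to the nearest integer
        out.append(max(-32768, min(32767, q)))    # clip to the 16-bit range
    return out

quiet = [0.25 / 32768.0] * 20000                  # a level of one quarter LSB
plain = to_int16(quiet, dither=False)
dithered = to_int16(quiet, rng=random.Random(1))
print("mean without dither:", sum(plain) / len(plain))
print("mean with dither:   ", sum(dithered) / len(dithered))
```

Without dither a signal below half an LSB simply vanishes; with dither its average level survives as noise-like fluctuation, which is the reason for dithering after a long chain of filter math.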

Quote from: skrivis

I also took a look on the net for the Creative PS2000, and didn't find it for $40.

just from a google:
http://www.cdromshop.com/cdshop/desc/p.18403.html

Hey csero (or anyone else), did you ever try the 'optimal source distribution' crosstalk method? Seems to me it came up here a year ago or so. The idea is that different frequency ranges have different angular separation needs. So, array your drivers out in an arc horizontally, with the tweeters narrowly spaced (5-10 deg), spanning out to the subs at +-90. If you had a continuous tapering of frequencies, the crosstalk signal would simply be a 90-deg shifted version of the opposite channel. It's more complicated if you only have a 3- or 4-way, but not too bad: doable with BruteFIR and my Delta-1010.
I have been meaning to try this, but my main speaker projects never seem to leave me any time.
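A back-of-envelope way to see why the angle should shrink as frequency rises: find the frequency at which the far-ear path lags the near-ear path by exactly 90 degrees for a given speaker span. The Python sketch below uses a free-field model with no head shadowing, a hypothetical ear spacing of 0.17 m and a speed of sound of 343 m/s. It is my own simplification, not the published design procedure.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, room temperature (assumed)
EAR_SPACING = 0.17       # m, hypothetical effective ear spacing

def quarter_wave_freq(span_deg):
    """Frequency at which the far-ear path lags the near-ear path by 90 degrees."""
    half = math.radians(span_deg / 2)
    extra_path = EAR_SPACING * math.sin(half)   # far-field path-length difference
    return SPEED_OF_SOUND / (4 * extra_path)

for span in (6, 10, 30, 90, 180):
    print(span, "deg span ->", round(quarter_wave_freq(span)), "Hz")
```

Under these assumptions narrow spans land in the treble and a +-90 placement lands in the low hundreds of Hz, which matches the general shape of the tweeters-in, woofers-out arc. The exact numbers depend heavily on the assumed ear spacing and on the real head.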

csero · « **Reply #17 on:** 20 Apr 2005, 06:53 pm »

Quote from: dwk

just from a google:

Hey csero (or anyone else), did you ever try the 'optimal source distribution' crosstalk method? Seems to me it came up here a year ago or so. The idea is that different frequency ranges have different angular separation needs. So, array your drivers out in an arc horizontally, with the tweeters narrowly spaced (5-10 deg), spanning out to the subs at +-90. If you had a continuous tapering of frequencies, the crosstalk signal would si ...

I heard a realization with BruteFIR that was very good, but since then I have realized that XTC alone is not enough. If you have to weigh which one is more important, the XTC or the ambience, it is a close call, but neither is enough alone. Besides, in the range below 500 Hz it is no use having XTC, as you actually need the LF crosstalk for standard stereo records. It is enough to put the 2 woofers, without XTC, at the two sides, or even better to put lots of woofers around the room.
Also, at the extreme top end even OPSODIS cannot create psychoacoustically correct HF without the back pair and/or ambience channels.

skrivis · « **Reply #18 on:** 20 Apr 2005, 06:57 pm »

Quote from: dwk

just from a google:
http://www.cdromshop.com/cdshop/desc/p.18403.html

Hey csero (or anyone else), did you ever try the 'optimal source distribution' crosstalk method? Seems to me it came up here a year ago or so. The idea is that different frequency ranges have different angular separation needs. So, array your drivers out in an arc horizontally, with the tweeters narrowly spaced (5-10 deg), spanning out to the subs at +-90. If you had a continuous tapering of frequencies, the crosstalk signal would si ...

Thanks for the link.

dwk · « **Reply #19 on:** 20 Apr 2005, 07:54 pm »

Quote from: csero

If you have to weigh which one is more important, the XTC or the ambience, it is a close call, but neither is enough alone.

Any pointers to ambience extraction/ synthesis algorithms? I'd assume it starts with (L-R)*filter, but I'd guess that the filter is tricky - it'd need both amplitude and phase characteristics carefully tailored.
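Just to pin down what I mean by the (L-R)*filter starting point, here is a bare-bones Python sketch. The taps are a placeholder (a short delay at reduced level), not a tuned design, and it does nothing about the amplitude and phase tailoring that is the hard part.

```python
def fir(signal, taps):
    """Direct-form FIR convolution, output the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def ambience(left, right, taps):
    """Filter the difference signal; anything common to both channels drops out."""
    diff = [l - r for l, r in zip(left, right)]
    return fir(diff, taps)

# Placeholder taps: a 3-sample delay at half level, purely for illustration.
TAPS = [0.0, 0.0, 0.0, 0.5]

centered = ambience([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], TAPS)   # identical channels
hard_left = ambience([1.0, 0.0, 0.0, 0.0, 0.0], [0.0] * 5, TAPS)
print(centered)
print(hard_left)
```

A centered (identical in both channels) source produces no ambience output at all, while anything that differs between channels comes through filtered, which is why the difference signal is the usual place to start.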