recorded sound is so widely variable that NO one stance can cover the waterfront. multitracking is accomplished in many different ways: multiMONO, multiSTEREO, minimalist miking with a stereo pair and rear ambience mics, AND spotlight mics on soloists like vocalists and guitarists who might not make it into the mix otherwise.
and then there is the creative multichannel mixing that comes NOWHERE near reality. add sound processing like compression/limiting, gating, etc., ad infinitum.
still, now and again, an accurate recording, or a semblance of one, makes its way to the consumer. room ambience, studio room sound, and instrument and voice locations can actually be delivered. with live recordings, the audience can appear to come from its normal position, back near the listener.
all this in stereo, and it DOES work better with phase-coherent, well set up loudspeakers, good (improved) room acoustics included.
we are often given a BLOB of sound that could just as well be presented by a mono setup that has speakers aimed at the wall, away from the listener.
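for anyone who wants to put a rough number on the BLOB, here is a minimal sketch (my own illustration, not a standard meter) that compares the energy in the side signal (left minus right) with the energy in the mid signal (left plus right). a ratio near zero means the two channels are nearly identical and the recording is effectively mono. the function name and the two test tones are hypothetical, and a real check would read samples from an actual recording.

```python
import math

def side_to_mid_ratio(left, right):
    """Return side energy divided by mid energy for two equal-length channels."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    # mid = what both channels share, side = what differs between them
    mid_energy = sum(((l + r) / 2) ** 2 for l, r in zip(left, right))
    side_energy = sum(((l - r) / 2) ** 2 for l, r in zip(left, right))
    if mid_energy == 0:
        return math.inf if side_energy > 0 else 0.0
    return side_energy / mid_energy

# hypothetical test signals: two sine tones, 1000 samples each
n = 1000
tone = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
other = [math.sin(2 * math.pi * 7 * i / n) for i in range(n)]

mono_blob = side_to_mid_ratio(tone, tone)   # identical channels, no side content
wide = side_to_mid_ratio(tone, other)       # unrelated content in each channel

print(mono_blob, wide)
```

a low ratio only says the channels carry the same signal. it says nothing about depth, tonality, or what the speakers and room do with it.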
vertical imaging is real and captured within some recordings but greatly affected by the speakers themselves and room acoustics.
depth in mono is very evident too; the sound is not just a point at the speaker. some instruments were closer to the mic and others farther away, and that can make you wonder if the recording is in stereo.
tonality and deep bass extension also contribute to imaging and soundstage.
in addition to all this conjecture, recordists, both amateur and professional, sometimes do and sometimes don't recognize when they have produced exceedingly accurate recordings.
let's not be so quick to make pronouncements that will prove to be the very fallacy you are attempting to debunk. experienced listeners keep an open mind about most of this subject.