Parts of number 1 can be measured, such as channel separation, rise times, impulse response, damping factors and so on, but it is largely subjective in the sense that any instrument is only as resolving as the ears the listener was born with. A simple example is that some children are born with a form of deafness that affects how they hear consonant sounds. Consonant sounds are high frequency in nature and vowel sounds low. Because they show no obvious signs of deafness, unscreened children are then mistaken for being below average in intelligence: the deafness is observed as a lack of comprehension because they often mistake one consonant sound for another, turning sentences into a jumble. They get by through empathy, being sensitive to the speaker's tone. This is an extreme example, but differentiating a horn section from a string section playing the same sustained note requires two things. The system must be able to portray the differences in timbre between the two, all of which can be considered micro because they happen in the smallest voltage changes and, as a result, the smallest excursions. And the listener must be able, in the most basic sense, to hear it. As I age I'm becoming like those children as my HF perception slowly wanes. At some point I'll begin hearing things en masse without the ability to deconstruct and reconstruct. It doesn't mean, however, that I cannot enjoy music. All it means is that someday soundstaging will not be as important as it is now, and I too will get by with tone.
Signal symmetry is a must, whether you are trying to make a ball out of pink noise, stacking up instruments in a mono recording or listening to a full phantom stereo stage. It's what feeds No.3. For example, if a preamp or source is for any reason putting out a weaker signal in one channel, the vocalist's position will move off center towards the stronger channel. Clearly it is not just needed for test tones.
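To put rough numbers on the off-center vocalist, here is a minimal sketch in Python. The level difference in dB is plain arithmetic. The angle estimate uses the stereophonic tangent law, which is my own choice of an idealized textbook model and not something measured on any particular system. The function names and the 30 degree speaker half-angle are assumptions for illustration only.

```python
import math

def level_difference_db(left_gain, right_gain):
    """Interchannel level difference in dB (positive means left is stronger)."""
    return 20.0 * math.log10(left_gain / right_gain)

def phantom_angle_deg(left_gain, right_gain, half_angle_deg=30.0):
    """Estimate the phantom image angle with the tangent law (an idealized model).

    Assumes speakers at +/- half_angle_deg from the listener (30 is an assumed,
    typical stereo triangle). Positive result means the image shifts left.
    """
    ratio = (left_gain - right_gain) / (left_gain + right_gain)
    tan_theta = ratio * math.tan(math.radians(half_angle_deg))
    return math.degrees(math.atan(tan_theta))

if __name__ == "__main__":
    # Hypothetical case: right channel at half the voltage of the left.
    print(level_difference_db(1.0, 0.5))
    print(phantom_angle_deg(1.0, 0.5))
```

With matched channels the estimate sits dead center, and as one channel drops the image slides towards the stronger speaker until it collapses onto it. Real rooms and real ears will not follow the formula exactly, so treat it as a ballpark only.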
The "85% of the music comes from the midrange" camp is right in that the enjoyment of music comes from an even higher percentage than that; however, we are talking about the attempt to simulate spatial cues. While the higher frequencies play a great part in localization, the bass carries the recorded long reverberation times, which determine the perceived hall size. This makes not just low-frequency extension important but its linearity as well. The difference in presentation can then be between creating the illusion of the performers being in the room with you and the illusion of you being at the venue. I listen mostly to classical music, so having a hundred folks with me in the room isn't my cup of tea.
I'm not being clever or anything; these are all part of AES- and EBU-accepted principles applied to the recording, manufacturing, broadcasting and reproduction processes, the last of which includes acoustics. All manufacturers try to achieve these goals; the room coupling part, however, is all up to us.