Yeah, he also cites a study where folks in a double-blind test didn't hear the difference. In studies involving human subjects, though, there needs to be careful replication and close examination of the experimental setup before we can draw any conclusions. The jury is still well out on that topic, and there's a lot of good observational evidence that there is an audible difference as well.
Observational evidence is subject to confirmation bias, and is therefore not relevant. Also, a cited double-blind study suggesting there is no audible difference between hi-rez and Redbook is much stronger proof than anything I have ever seen offered to the contrary. Why immediately cast doubt on its findings?
That's silly. Maybe there's confirmation bias, maybe there isn't. Just because it might be present doesn't invalidate every observation. Maybe the observer expects to hear no difference, or expects the Redbook file to sound better. I agree that observational evidence is subject to problems, but an old, badly designed study is not necessarily dispositive either. One study never is.

In this case, the study is weak. The methods section is wholly inadequate to allow another researcher to reproduce the results. I've reviewed numerous papers in my field (geochemistry), and this would get sent right back by me or most other reviewers. The testing of listeners' hearing abilities is stated to have been "informal," and the listeners were different groups in different settings. They report the bulk outcomes of tests with variable equipment and changing clusters of listeners. It's a shambles. There is no way to determine which listeners listened to which setup under which conditions. Furthermore, they see a significant (at face value) difference between male and female listeners, but they don't discuss it seriously. Do other researchers find such profound differences between male and female auditory perception? If not, what's going on here? If yes, why, and how might it bear on the conclusions? (A sketch of the kind of check I'd expect follows this post.)

The methods are so badly described that it's impossible to know what the subjects were asked. The authors report the results in terms of the percentage of "correct" answers, but this is meaningless. What counted as a correct answer? Not only is this a biased way to describe their experimental results, we don't even know whether "correct" means that a) the subjects could detect a difference between the two sources or b) the subjects thought the hi-res file sounded better. The methods section states that their goal is to test SACD and DVD-A for their "superiority to CD audio". This implies that the test subjects were asked whether the hi-res source sounded better to them. In the results section, however, they say that what they are really asking is whether people can hear the difference between a hi-res source and the same source passed through a 16/44.1 downsampling device.

They don't specify whether specific songs show different results in this test, or whether outcomes differ depending on whether the source was SACD or DVD-A. They also don't validate whether removing the A/B switch and downsampler completely from the system has any effect. It's an obvious concern that even when the 16/44.1 downsampling isn't engaged, that extra gear in the signal path (and plugged into the power line) could degrade the sound enough that the hi-res playback is compromised overall. There are so many holes in this study that documenting them all is tedious.

There are also only two peer-reviewed publications cited in this paper. Two, plus two conference proceedings, all from this one journal's previous issues. This raises a number of red flags. Again, a serious peer review would have asked the authors to cite, at a minimum, all of the previous research relevant to their methodology (even if nobody has performed an identical experiment, they are making choices in study design and in testing of human hearing that have definitely been tackled before).

My final concern is that the authors obviously approach this work with a bias. They state bluntly that they consider it "well past time to settle the matter scientifically." And they state in their conclusions that "the burden of proof has now shifted."
This type of swaggering and boastfulness certainly raises questions about their motives, and since the supposedly double-blind methodology is not described, and they admit to monitoring the outcomes throughout the year during which they conducted the tests, the legitimacy of the study is weakened.

Finally, they state clearly at the end that they only tested the effect of their 16/44.1 downsampling device. When they tested actual CDs against their DVD-A or SACD equivalents, they report that "virtually all of the SACD and DVD-A recordings sounded better than most CDs." This doesn't fit their preferred conclusion, however, so without any evidence they dismiss it with the statement: "These recordings seem to have been made with great care and manifest affection, by engineers trying to please themselves and their peers. They sound like it, label after label." Every one of them sounds better because it was re-mastered by genius craftsmen? Since they provide no information about which recordings they are talking about anywhere in the study, we have no way of validating that broad statement.

Obviously the means exist now, almost six years later, to do a much better job with a study like this. As it is, this one is little better than our collective observations on the matter.
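To make the male/female complaint concrete: before calling that gap meaningful, a reviewer would at minimum expect a simple comparison of the two groups' correct-answer rates. Here's a minimal sketch in Python; the counts are entirely hypothetical, since the paper doesn't report the raw numbers.

```python
# Minimal sketch: two-proportion z-test for a difference between two groups'
# "correct answer" rates. All counts below are hypothetical placeholders;
# the paper under discussion does not report the raw numbers.
from math import sqrt, erf

def two_proportion_z(correct_a, total_a, correct_b, total_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = correct_a / total_a
    p_b = correct_b / total_b
    p_pool = (correct_a + correct_b) / (total_a + total_b)  # pooled rate under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical example: men 60/100 correct, women 50/100 correct.
z, p = two_proportion_z(60, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.3f}")  # z ≈ 1.42, p ≈ 0.16 -- not significant on these numbers
```

The point isn't these particular numbers; it's that without the raw counts, readers can't even run a five-line check like this, let alone judge the claimed male/female difference.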
What would those missed marks be, exactly?

And are you also suggesting you need a super-high-end player to notice a difference? Do you know which player they used?

What about the parts that had nothing to do with playback systems, just about 16-bit in general, the human ear, and 16-bit's underrated capability?
What would you do if a "much better study," as you say, once again said the same things? Did you actually read the article? It's not like it was written in '96.
An interesting question is how to design a study that would test whether the theory is supported by observation. I'm sure we've all run across problems in A/B tests, double-blind and otherwise, in our direct experience or in our reading. What are those problems, and how could we design the experiment around them? I'll post more after I get home, but some are:

1. Choosing the people to participate in the test. Can they hear differences that are in fact measurable? Maybe we need to screen them more carefully.
2. Choosing the music. Are all passages equally revealing of potential differences? How do we choose music that will not bias the results?
3. Validating the equipment setup. Does the equipment at least measure linearly in frequency response from end to end (and in total), so it doesn't impose issues of its own? Maybe we just have to repeat the experiment with different setups, because we honestly don't yet have a suite of measurements that captures the differences we reliably hear between systems.
4. Room acoustics. We should probably do at least one run of tests in something as close as we can get to an anechoic chamber, then proceed to real rooms if something can be detected in the anechoic environment.
5. Statistical methods. What would be considered a significant result given our sample population? What result is expected from chance, and by how much do we need to exceed it to conclude people are hearing something? (See the sketch after this list.)

If we focus on how we might do this correctly, it could be fun and informative. The hypothesis about Nyquist limits in the original article could then get an impartial hearing.
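On point 5, here's a minimal sketch in Python of the chance calculation for a single listener's ABX run, assuming pure guessing gives a 50% hit rate per trial; the trial counts are made up for illustration.

```python
# Minimal sketch: how far above chance does an ABX score need to be?
# Assumes each trial is an independent 50/50 guess under the null hypothesis.
from math import comb

def p_at_least(correct, trials, p_chance=0.5):
    """Probability of scoring `correct` or more out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical example: a listener gets 14 of 20 trials right (chance expects 10).
print(f"p = {p_at_least(14, 20):.3f}")  # p ≈ 0.058 -- suggestive, but not below the usual 0.05
```

And if we pool many listeners or many runs, the threshold has to be adjusted for multiple comparisons, which is one reason the aggregated percentages in the study we've been discussing are so hard to interpret.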
Some industry reflections, this one's from Naim, in the latest Gramophone magazine. Sorry, no text file; hope you can read it.

Marius

PS: wish the industry would provide us with the means to answer his final caution...