Offset listening test results


Offset listening test results
« on: 30 Mar 2019, 05:42 pm »
I would like to get more feedback, but I think I have enough to make some conclusions.

Offset is the leading and trailing nulls (zeros) in every digital music track: the preamble and post-amble.  These are not music data and do not create any detectable delays.


Parsing the data (below):

Frame 1:   ratings: 4,4,2,4,1,3,4 - average is 3.14

Frame 1a: ratings: 3,3,1,3,2,2,2 - average is 2.29

Frame 2:   ratings: 1,1,4,2,4,1,1,1 - average is 1.875

Frame 2a: ratings: 2,2,3,1,3,4,3 - average is 2.57

Therefore, this puts the tracks in order from best to worst:

Frame 2 - best ---------------this is the 16/44.1 original
Frame 1a        ----------------this is the 24/96 modified
Frame 2a        ----------------this is the 24/96 original
Frame 1 - worst --------------this is the 16/44.1 modified
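
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the averages from the ratings listed above and sorts the tracks from lowest average to highest. It assumes a lower rating is better, as the ordering implies; the variable names are only for illustration.

```python
# Ratings as listed in the post (assumption: lower rating = better).
ratings = {
    "Frame 1":  [4, 4, 2, 4, 1, 3, 4],
    "Frame 1a": [3, 3, 1, 3, 2, 2, 2],
    "Frame 2":  [1, 1, 4, 2, 4, 1, 1, 1],
    "Frame 2a": [2, 2, 3, 1, 3, 4, 3],
}

# Mean rating per track.
averages = {name: sum(vals) / len(vals) for name, vals in ratings.items()}

# Track names sorted from lowest (best) to highest (worst) average.
order = sorted(averages, key=averages.get)

for name in order:
    print(f"{name}: {averages[name]:.3f}")
```

Note that Frame 2 has eight ratings while the other tracks have seven, so its average is taken over one extra listener.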

The only discrepancy is the middle two, because the order should have been:

Frame 2 - best
Frame 2a
Frame 1a
Frame 1 - worst

It tells me that maybe the upsampling software that was used to get the 24/96 file was not very good.

Conclusions:  It's a small data set and yet the results are fairly consistent. 

1) Pretty much everyone hated Frame 1 and most liked Frame 2.  These were the altered and unaltered 16/44.1 tracks, respectively.

2) Only 1 in 9 respondents did not hear differences, although one said the differences were subtle.

3) 88.9% of these respondents heard a difference when comparing the original track to one with 200 offset nulls removed from the preamble and added to the post-amble.

4) Based on this small sample I believe I can conclude that offset is critical and should not be altered when ripping or when converting between lossless formats/encodings.  If it does change, this could be one reason why different rippers sound different or different formats sound different.

5) I did not see any trends for USB or Ethernet or certain DACs either, so this tends to point to the software as the villain.
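
To make the manipulation in point 3 concrete, here is a rough Python sketch of moving 200 nulls from the preamble to the post-amble. It assumes the "nulls" are zero-valued samples of a single channel held in a plain list; `shift_offset` is a hypothetical helper name, not the tool used to prepare the test tracks.

```python
def shift_offset(samples, n=200):
    """Drop n leading zero samples and append n zeros at the end.

    Assumption: the first n samples are all nulls (zeros), so no music
    data is discarded and the track length stays the same.
    """
    if len(samples) < n or any(s != 0 for s in samples[:n]):
        raise ValueError("the first n samples must all be zero")
    return samples[n:] + [0] * n

# Toy track: 300 nulls of preamble, three "music" samples, 100 nulls of post-amble.
track = [0] * 300 + [5, -7, 3] + [0] * 100
shifted = shift_offset(track)

print(len(shifted) == len(track))          # total length is unchanged
print(track.index(5), shifted.index(5))    # where the music starts, before and after
```

The music samples themselves are untouched; only their position relative to the start of the file changes.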

Here is the raw data from respondents:

1) Mag

Okay I gave it a shot. The longer I listened the more uncertain I became of differences.

Listened with Windows Media Player on i7 computer, usb to BDA-1, SP2 internal dac, mixer, Bryston SST2, Model T passive
speakers and approx 92 decibels.

Differences between the tracks were subtle.

Frame 1 seemed to be sterile sounding but accurate, Rated it 4

Frame 1a sounded fuller with more harmonics, seemed softer with some ringing of the notes. Rated it 2 guessing 16/44

Frame 2 seemed fuller sounding, similar to 1a; ringing of notes seemed more sustained. Rated it 1 guessing this is the 24/96

Frame 2a seemed similar to frame 1 guessing it to be 24/96. Rated 4

2) paul79

I have not had time yet to pick these apart all the way, but I find frame 2 to be the best of the bunch on my system.
It has longer decay, more impact, more weight, more body, strings ring better, and her moaning throughout is more obvious.
Very fluid and emotional. This last aspect especially, is not translated as well with the other versions.

2a was immediately dry sounding.



FRAME2/FRAME1a sounds fine, but it's like the recording microphone is closer/hotter, which is not as natural as a normal listening
seat several meters away and therefore lacks that natural room ambience. FRAME1/FRAME2a have this better diffuse ambience that
sounds as if the recording microphone distance is more natural to a listening seat some meters away, which to my ears gives a better
natural sound, more diffuse and pleasing, with a tiny bit less resolution on notes than FRAME2/FRAME1a.

Playback software v23 64bit JRiver / R128 algorithm volume leveling / Equal loudness ISO226-2003 calibrated / track material
rates are played native over WASAPI 32bit to Khadas Tone Board DAC. Transducer domain for the test was headphones, as well as a
Neurochrome HP-1 amp and DSP corrected HD650 cans.

4) cat6man


1 and 3 are 44.1k
2 and 4 are 96k

in my test, 1 >> 3 and 2 >> 4

frame 1 and frame 1a are best, frame 2 and frame 2a are worst

WAV files on QNAP nas ==> hqplayer embedded (no processing at all) on ubuntu NUC ==> NAA on cubox in totaldac ==> totaldac
re-clocker ==> totalDAC d1-direct

reclocker is usb in, aes/ebu out to dac

5) Rocoa

I've listened the four tracks right now (eyes closed) some times and the difference is clearly audible.
The system is: Jeff Rowland Aeris DAC, Corus preamp (+PSU), Model 625 S2 power amp., Avalon speakers, Cardas Clear and
Clear Beyond cables. As source an MacBook Air. Tellurium Q Black USB cable (very analog like and time coherent cable).
Between Aeris and USB cable an iFi iPurifier 3. Audirvana Plus.

Well, the most live sound track to me is, no doubt, track 4. I can listen much better the decays, and the piano's wood.
More ambience and harmonics, more body and depth, better image, perhaps less attack than previous one, but a more relaxed
and natural sound to my taste. The worst is the first track 1. The sound is more artificial, mechanic, edgy (less like a
grand piano, like a clavecin as overstatement).

So, my preferences: 4, 3, 2, 1 (2a, 2, 1a, 1).

Best - frame 2a
 ----  frame 2
 ----  frame 1a
 ----  frame 1

6) WGH

My DAC is a Van Alstine Fet Valve Hybrid, I don't think it re-clocks. Software is the latest JRiver 64bit running on a
custom built server that includes a Paul Pang USB card with linear regulated power supplies on the USB card and SSD.
The USB to SPDIF converter is an upgraded Kingrex UC192 with its own linear power supply. The custom JMaxwell Data Only
USB cable contains only 3 wires, data plus, data minus and a ground wire, it does not have a 5v. lead. Speakers are
Salk HT2-TL with RAAL tweeter.

I downloaded the tracks and converted to .flac with dBpoweramp so I could tag the tracks, that way I could use Gizmo
remote to switch tracks from my couch. All my files are flac, high res downloads are flac so this is a more real life test.

By looking at file size and properties it was easy to see which files are 24/96. I listened to the tracks before looking
and my impressions didn't change.

1 and 1a sounded like the original recording, 1a might be more revealing.
2 seems to have more ringing/janglyness
2a is a little smoother than 2 but the overhang still is there.


1) frame2
- Deep soundstage
- Not "ringy"
- Very focused
- Good attack and decay

2) frame2a
- Deep soundstage
- Not "ringy"
- Crisp attack
- Good focus
- Good decay
- A bit duller sound quality

3) frame1a
- Kind of "ringy", like sustain pedal down
- 3-D effect, but may be contrived
- Attack a bit muted
- Warm sounding
- Deep soundstage
- A bit less focused

4) frame1
- Up close, not as much depth or venue echo
- Good attack
- A bit woolly
- Poor focus

 My wife

1) frame2
- Deep soundstage
- Not "ringy"
- Clean trills
- Notes don't run into each other
- Brighter overall sound quality

2) frame2a
- Not "ringy"
- A bit duller sound quality
- Sounds like damper pedal is down

3) frame1a
- Kind of "ringy", like sustain pedal down
- Warm sounding
- Deep soundstage
- A bit less focused

4) frame1
- Poor decay
- Dry sounding
- Boring

9) Only one respondent did not hear any difference in any of the tracks.

Steve N.
« Last Edit: 31 Mar 2019, 01:34 am by audioengr »


  • Industry Participant
  • Posts: 39
    • Tom Christiansen Audio
Re: Offset listening test results
« Reply #1 on: 30 Mar 2019, 06:56 pm »
Sorry to burst your bubble, but your results aren't statistically significant. I plopped your data into SPSS and ran a Friedman Test (K Related Samples test). The test did not detect a difference in the ratings of the tracks (p = 0.418). Put another way, if listeners really had no preference, there would be a 41.8% chance of seeing differences at least this large from chance alone. Most scientific research requires a 5% chance or lower (p ≤ 0.05) for a result to be considered statistically significant.

Participant #8 was excluded from the analysis as they only provided one data point.
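
For anyone without SPSS, below is a rough Python sketch of the Friedman statistic using only the standard library. It assumes the ratings in the first post are listed in the same respondent order for every track, that each respondent's four ratings form a complete 1-4 ranking, and that the eighth Frame 2 rating belongs to the excluded participant; none of that is confirmed in the thread. It uses the chi-square approximation with 3 degrees of freedom, so the p-value need not match the SPSS figure exactly.

```python
import math

# Ratings per track, one entry per respondent.
# Assumptions: same respondent order in every list, and the eighth
# Frame 2 rating (the participant with a single data point) is dropped.
frame_1  = [4, 4, 2, 4, 1, 3, 4]
frame_1a = [3, 3, 1, 3, 2, 2, 2]
frame_2  = [1, 1, 4, 2, 4, 1, 1]
frame_2a = [2, 2, 3, 1, 3, 4, 3]

# One row per respondent: (Frame 1, Frame 1a, Frame 2, Frame 2a).
rows = list(zip(frame_1, frame_1a, frame_2, frame_2a))

def friedman_statistic(rows):
    """Friedman chi-square for rows that are already complete rankings 1..k."""
    n, k = len(rows), len(rows[0])
    for row in rows:
        if sorted(row) != list(range(1, k + 1)):
            raise ValueError("each row must be a ranking 1..k without ties")
    rank_sums = [sum(col) for col in zip(*rows)]
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

stat = friedman_statistic(rows)
p_value = chi2_sf_3df(stat)  # valid here because 4 tracks give 3 degrees of freedom
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out at about 0.39, which is not the same number as the SPSS output but is likewise far above the 0.05 threshold.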



Re: Offset listening test results
« Reply #2 on: 31 Mar 2019, 01:15 am »

Wait.  What?  Science has been finally employed here?   :thumb:


Re: Offset listening test results
« Reply #3 on: 6 Apr 2019, 05:22 pm »
When the results show 8 out of 9 hearing a difference, I don't consider this a null result.

When a majority of listeners select the same track as the best track, I don't consider this a null result.

You have to consider that every system and software is different.  Nothing here was controlled, and yet these results are still compelling.  It may not meet the standard of statistical significance, but it's a lot better than just a few anecdotes, which is what we had before.