You are correct! I got my own terminology wrong! (A good cite: Oppenheim and Schafer, Discrete-Time Signal Processing, pp. 101-111; this used to be considered the "bible" of discrete-time processing.) So, "oversampling" is sampling above the Nyquist rate, which is twice the frequency of the highest frequency component. In audio, the highest frequency component is assumed to be 20 kHz, so the Nyquist rate is 40 kHz. Upsampling would then take the 40 kHz samples and interpolate between them to produce a higher sample rate; it adds samples, not new frequency content. In practice, audio uses a 44.1 kHz sampling rate instead of 40 kHz. 96 kHz is a little more than twice this rate (about 2.18 times), so there would be roughly twice the number of "required" samples (note to self -- why isn't this exactly twice the 44.1 kHz?).
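For what it's worth, the sample-rate arithmetic above can be checked with a small Python sketch. The function names here are my own, made up for illustration, and the 2x upsampler uses plain linear interpolation only to show the idea of adding samples between existing ones; a real upsampler would normally use a proper low-pass interpolation filter instead.

```python
from fractions import Fraction


def nyquist_rate(highest_freq_hz):
    """Nyquist rate: twice the highest frequency component."""
    return 2 * highest_freq_hz


def rate_ratio(new_rate_hz, old_rate_hz):
    """Exact ratio between two sample rates, as a reduced fraction."""
    return Fraction(new_rate_hz, old_rate_hz)


def upsample_2x_linear(samples):
    """Double the sample rate by inserting the midpoint between neighbours.

    Linear interpolation is an assumption made for simplicity here;
    it is not how high-quality audio resamplers work.
    """
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2)  # new sample halfway between a and b
    out.append(samples[-1])  # keep the final original sample
    return out


if __name__ == "__main__":
    print(nyquist_rate(20_000))              # Nyquist rate for 20 kHz audio
    print(rate_ratio(96_000, 44_100))        # 96 kHz relative to 44.1 kHz
    print(float(rate_ratio(96_000, 44_100)))
    print(upsample_2x_linear([0, 2, 4]))
```

The ratio of 96 kHz to 44.1 kHz reduces to 320/147, which is why it does not come out as a clean factor of two.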
Then, what would zero upsampling or zero oversampling be?