I asked this same question back in 2013 and received a bunch of technical answers that didn't really settle it for me. The best responses I've heard over the years were from Danny and Hobbs. I tried searching this forum for them, but I couldn't find them.
My non-technical understanding is that a large capacitor stores more energy and discharges more slowly, which can smear the signal. A bypass cap is a much smaller capacitor wired in parallel with the large one. The idea is that the signal can travel through the smaller cap, which mitigates the detrimental effects of stored energy in the "fat cap." An analogy is a highway bypass -- it's designed to provide faster, more efficient travel and reduce traffic congestion.
Interestingly, after I wrote this, I asked ChatGPT about bypass caps in speaker crossovers, and it offered the same analogy:
Imagine your speaker crossover as a traffic manager at an intersection, directing different sounds (like bass, midrange, and treble) to different lanes or drivers (the speaker drivers). Normally, each sound type (bass, midrange, treble) is routed to its own speaker driver (e.g., a woofer or tweeter) so that it plays optimally for that driver. Now, a bypass capacitor is like a shortcut that allows some high-frequency sounds (treble) to bypass the usual traffic signals (the crossover circuit). It provides the treble a shorter path to the tweeter, helping it reach the speaker driver more directly. In simple terms, the bypass capacitor on the highs, for example, ensures that the treble reaches the tweeter without delay or unwanted interference, leading to cleaner and more accurate sound.
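For anyone who would rather poke at rough numbers than analogies, here is a minimal Python sketch of the parallel-cap arrangement. It is only a toy model. Each capacitor is treated as an ideal capacitance in series with a small resistance (ESR) and a small inductance (ESL). Every component value below (10 uF main cap, 0.1 uF bypass, 8 ohm tweeter, the ESR and ESL figures) is a made-up assumption for illustration, not a measurement of any real part. It prints the impedance of the big cap alone versus the big cap plus bypass at a few audio frequencies, and the first-order high-pass corner frequency with and without the bypass.

```python
import math


def series_rlc_impedance(freq_hz, cap_f, esr_ohm, esl_h):
    """Impedance of a capacitor modeled as ESR + ESL + ideal C in series."""
    omega = 2 * math.pi * freq_hz
    return complex(esr_ohm, omega * esl_h - 1.0 / (omega * cap_f))


def parallel(z1, z2):
    """Combined impedance of two impedances wired in parallel."""
    return (z1 * z2) / (z1 + z2)


def corner_frequency(resistance_ohm, cap_f):
    """-3 dB point of a first-order RC high-pass: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2 * math.pi * resistance_ohm * cap_f)


# Hypothetical values, chosen only to illustrate the arithmetic.
BIG = dict(cap_f=10e-6, esr_ohm=0.05, esl_h=30e-9)      # the "fat cap"
BYPASS = dict(cap_f=0.1e-6, esr_ohm=0.01, esl_h=5e-9)   # the small bypass cap
TWEETER_OHMS = 8.0                                      # tweeter treated as a plain resistor

for freq in (1_000, 5_000, 20_000):
    z_big = series_rlc_impedance(freq, **BIG)
    z_both = parallel(z_big, series_rlc_impedance(freq, **BYPASS))
    print(f"{freq:>6} Hz  big cap alone: {abs(z_big):.4f} ohm"
          f"  with bypass: {abs(z_both):.4f} ohm")

# Ideal capacitances in parallel simply add.
total_cap = BIG["cap_f"] + BYPASS["cap_f"]
print(f"corner without bypass: {corner_frequency(TWEETER_OHMS, BIG['cap_f']):.1f} Hz")
print(f"corner with bypass:    {corner_frequency(TWEETER_OHMS, total_cap):.1f} Hz")
```

With these made-up values the bypass adds about 1% to the total capacitance. The model only covers capacitance, ESR, and ESL. It says nothing about dielectric behavior or the other things people often credit when they describe a bypass cap's effect, so treat it as a starting point for experimenting with your own values, not as a verdict.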