@sagittariansrock: Your explanation differs from the standard one: it does not utilize the full sensor area of larger sensors. It is based on the subject filling the same absolute area of the sensor, regardless of the total sensor area.
That is the reach-limited argument. That is the ONE AND ONLY case where smaller sensors can achieve the same image quality as a larger sensor. However, it SEVERELY handicaps the larger sensors. The fair comparison is when your subject is framed the same, which means that for progressively larger sensors, a greater absolute area of sensor covers the subject. In that case, everything Orangutang, Lee Jay, and I have stated is true. With identical framing, there is no circumstance where smaller sensors, regardless of their pixel size, can ever outperform a larger sensor.
There are real-world use cases where a limited reach is an actual problem. I already posted a topic on that, demonstrating the differences between a 5D III and a 7D, and the 7D does indeed maintain the IQ edge (I really need to try that on a day with better seeing, or find a good terrestrial subject to compare.) But in the "normative" case, you buy a larger sensor to use the greater area to get better IQ. I mean, that's the entire point. That's where improved IQ comes from.
Before I found the equivalence article, I used to think the same thing...that pixel size mattered. But it simply doesn't. Not at lower and midrange ISO settings anyway. At really high ISO settings, the game does change a bit. Spatially, information in an incoming wavefront is sparser when you're working in really low light, or at a really small aperture, or in any other circumstance where you NEED something like ISO 12800 or higher. Sparser data ultimately renders smaller pixels useless, since you just don't have complete enough information to render a whole picture. Then, pixel size really does start to matter. Or, conversely, downsampling your image becomes more important for reducing noise.
For the ultra high ISO use cases, I would actually love to see Canon create a sensor that had some kind of dynamic binning. At low ISO, use full maximum resolution, then have a configurable option to switch to a hardware binning mode of 2x2 for, say, ISO 6400 through 25600, and maybe even have an additional 4x4 binning option for ISO 51200 through 400k or whatever. I think that would be awesome, since you can't really get clean high resolution at ultra high ISO anyway.
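To put a rough number on why binning helps at ultra high ISO, here is a minimal sketch. It assumes shot noise only (read noise and everything else in a real sensor are ignored), and the photon count per pixel is a made-up figure for illustration. Under that assumption, summing an n x n block collects n*n times the photons, so the SNR of the binned pixel goes up by a factor of n.

```python
import math

def shot_noise_snr(photons_per_pixel: float, bin_factor: int = 1) -> float:
    """SNR of one output pixel when bin_factor x bin_factor native pixels are summed.

    Assumption: shot (photon) noise only, so noise = sqrt(signal).
    """
    total_photons = photons_per_pixel * bin_factor ** 2
    return total_photons / math.sqrt(total_photons)

# Hypothetical low-light case: 25 photons land on each native pixel.
for n in (1, 2, 4):
    print(f"{n}x{n} binning: SNR = {shot_noise_snr(25.0, n):.1f}")
```

Downsampling in software gives about the same gain under this shot-noise-only assumption; where hardware binning could differ is in read noise, which this sketch does not model.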
However, fundamentally, in a fair or normative situation where you're utilizing all the sensor area you can (i.e. assuming identical framing), and for the same aperture used, larger sensors gather more light per subject area. If you read the equivalence article, when he gets down into the myths, he clearly covers how, for a given FoV, you need to use a narrower aperture on a larger sensor to make image quality equivalent. For 80mm f/4 on FF, you would need 50mm f/2.5 on APS-C (that is 4 divided by 1.6, the scale factor between FF and Canon APS-C...it does not take pixel size into account at all), or 40mm f/2 on 4/3rds: http://www.josephjamesphotography.com/equivalence/#1
1) f/2 = f/2 = f/2
This is perhaps the single most misunderstood concept when comparing formats. Saying "f/2 = f/2 = f/2" is like saying "50mm = 50mm = 50mm". Just as the effect of 50mm is not the same on different formats, the effect of f/2 is not the same on different formats.
Everyone knows what the effect of the focal length is -- in combination with the sensor size, it tells us the AOV (diagonal angle-of-view). Many are also aware that f-ratio affects both DOF and exposure. It is important, however, to understand that the exposure (the density of light falling on the sensor -- photons / mm²) is merely a component of the total amount of light falling on the sensor (photons): Total Light = Exposure x Effective Sensor Area, and it is the total amount of light falling on the sensor, as opposed to the exposure, which is the relevant measure.
Within a format, the same exposure results in the same total light, so the two terms can be used interchangeably, much like mass and weight when measuring in the same acceleration field. For example, it makes no difference whether I say I weigh 180 pounds or have a mass of 82 kg, as long as all comparisons are done on Earth. But it makes no sense at all to say that, since I weigh 180 lbs on Earth, I'm more massive than an astronaut who weighs 30 lbs on the moon, since we both have a mass of 82 kg.
The reason that the total amount of light falling on the sensor, as opposed to the density of light falling on the sensor (exposure), is the relevant measure is because the total amount of light falling on the sensor, combined with the sensor efficiency, determines the amount of noise and DR (dynamic range) of the photo.
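The "Total Light = Exposure x Effective Sensor Area" relation quoted above can be sketched in a few lines. The sensor dimensions below are approximate nominal values that I am assuming for illustration (36 x 24 mm for FF, roughly 22.3 x 14.9 mm for Canon APS-C, roughly 17.3 x 13.0 mm for 4/3), and the exposure value is an arbitrary made-up number.

```python
# Approximate nominal sensor dimensions in mm (assumed values for illustration).
SENSOR_MM = {
    "FF": (36.0, 24.0),
    "APS-C 1.6x": (22.3, 14.9),
    "4/3": (17.3, 13.0),
}

def total_light(exposure_photons_per_mm2: float, sensor: str) -> float:
    """Total Light = Exposure x Effective Sensor Area (whole sensor used)."""
    width, height = SENSOR_MM[sensor]
    return exposure_photons_per_mm2 * width * height

# Same exposure (same f-ratio, shutter speed, and scene) on every format.
exposure = 1000.0  # hypothetical photons per mm^2
for name in SENSOR_MM:
    ratio = total_light(exposure, "FF") / total_light(exposure, name)
    print(f"{name}: total light = {total_light(exposure, name):.0f}, FF collects {ratio:.2f}x as much")
```

At the same exposure, the FF sensor ends up with roughly 2.6 times the total light of the 1.6x sensor, which is about the square of the crop factor.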
For a given scene, perspective (subject-camera distance), framing (AOV), and shutter speed, both the DOF and the total amount of light falling on the sensor are determined by the diameter of the aperture. For example, 80mm on FF, 50mm on 1.6x, and 40mm on 4/3 will have the same AOV (40mm x 2 = 50mm x 1.6 = 80mm). Likewise, 80mm f/4, 50mm f/2.5, and 40mm f/2 will have the same aperture diameter (80mm / 4 = 50mm / 2.5 = 40mm / 2 = 20mm). Thus, if we took a pic of the same scene from the same position with those settings, all three systems would produce a photo with the same perspective, framing, DOF, and put the same total amount of light on the sensor, which would result in the same total noise for equally efficient sensors (the role of the ISO in all this is simply to adjust the brightness of the LCD playback and/or OOC jpg).
Thus, settings that have the same AOV and aperture diameter are called "Equivalent" since they result in Equivalent photos. Hence, saying f/2 on one format is the same as f/2 on another format is just like saying that 50mm on one format is the same as 50mm on another format.
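The equivalence arithmetic in the quote above is easy to check yourself. This is just a sketch of the scaling described there: divide both the focal length and the f-number by the crop factor, and the aperture diameter (focal length / f-number) stays the same. The function name and the crop factors of 1.6 and 2.0 are my own choices for illustration.

```python
def equivalent_settings(focal_ff_mm: float, f_number_ff: float, crop_factor: float):
    """Return (focal length, f-number, aperture diameter) equivalent to the FF settings.

    Focal length and f-number both scale by 1 / crop_factor, so the
    aperture diameter (focal length / f-number) does not change.
    """
    focal = focal_ff_mm / crop_factor
    f_number = f_number_ff / crop_factor
    aperture_diameter = focal / f_number
    return focal, f_number, aperture_diameter

# The example from the quote: 80mm f/4 on FF.
for name, crop in (("FF", 1.0), ("APS-C 1.6x", 1.6), ("4/3", 2.0)):
    focal, f_number, diameter = equivalent_settings(80.0, 4.0, crop)
    print(f"{name}: {focal:.0f}mm f/{f_number:.1f}, aperture diameter {diameter:.0f}mm")
```

All three lines come out with the same 20mm aperture diameter, which is the whole point: same AOV plus same aperture diameter means the same DOF and the same total light. Note that pixel size never enters the calculation.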