Anyhow, what you are saying is that the larger pixel's larger well doesn't help because if the well is Z times larger it also collects Z times the light and therefore saturates equally fast?
Correct, for the most part. Take four smaller pixels that fit into the area of one larger pixel: the larger pixel has a deeper well, but it also receives proportionally more light. Sum the wells of the four smaller pixels and they are equivalent to the single larger pixel's well, and together they receive the same amount of light.
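The arithmetic above can be sketched with some made-up numbers (the well depth, area, and flux figures here are purely illustrative, not from any real sensor):

```python
# One large pixel vs. four small pixels covering the same area.
# Well depth scales with area, and so does the light collected,
# so both fill the same fraction of their wells per exposure.

large_area = 4.0              # arbitrary units
small_area = 1.0              # four of these tile the large pixel
full_well_per_area = 10_000   # electrons per unit area (made up)
photon_flux = 2_500           # photons per unit area per exposure (made up)

large_well = full_well_per_area * large_area    # 40_000 e-
large_signal = photon_flux * large_area         # 10_000

small_well = full_well_per_area * small_area    # 10_000 e-
small_signal = photon_flux * small_area         # 2_500

# Fraction of the well filled is identical, so both saturate together:
print(large_signal / large_well)   # 0.25
print(small_signal / small_well)   # 0.25
```

So the deeper well buys no extra saturation headroom by itself; it is exactly offset by the extra light landing on the larger pixel.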
Actually, there's going to be some wastage around each pixel, even with the better microlenses used today.
The relationship is not simply "Z times larger -> Z times the light," though. What helps smaller pixels compete with the light-gathering ability of a single monolithic pixel is the ability to sample that light and derive more information from it.
There are lots of other questions that often stymie efforts to compare sensors.
- It's rare (if not unheard of) for two cameras built for the same market segment to have sensors where one has pixels 1/4, or 1/16, or whatever the size of the other's, and even if such cameras existed, much more affects noise than the sensor's own qualities (shot noise, for one, is a property of the light itself).
Bottom line - sensor size has the largest impact on image noise. Reducing the MP count of the sensor will not substantially reduce the noise.
However, at the highest ISOs (i.e., amplifier gains), there are so few photons per pixel that the baseline noise of the sensor, amplifier chain, and A-to-D converter becomes more visible. That is when reducing the MP count (i.e., using larger pixels for a given sensor size) becomes rational.
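A small sketch of why the crossover happens, using a made-up read-noise figure (the 5 e- RMS value is hypothetical, though it is in the right ballpark for older sensors): shot noise grows as the square root of the signal, while read noise is a fixed floor added at readout.

```python
import math

# Signal-to-noise ratio with Poisson shot noise plus a fixed read-noise
# floor, added in quadrature.
def snr(photons, read_noise):
    shot = math.sqrt(photons)                       # Poisson shot noise
    return photons / math.sqrt(shot**2 + read_noise**2)

read_noise = 5.0   # electrons RMS, a made-up but plausible figure

# Bright exposure: shot noise dominates, read noise barely matters.
print(snr(10_000, read_noise))   # ~99.9, vs. 100.0 with no read noise

# Dim, high-ISO exposure: read noise is now as large as shot noise.
print(snr(25, read_noise))       # ~3.5, vs. 5.0 with no read noise
```

At 10,000 photons the read-noise floor costs almost nothing; at 25 photons it costs roughly 30% of the SNR, which is exactly the regime where larger pixels (fewer reads per unit area) start to pay off.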
If you consider the proposition, you'll see that the effects of a larger sensor and a larger pixel are essentially identical - here the "secondary" characteristics, like the ADC and the rest of the circuitry chain, come into play.
Much of the noise seen in your average digital camera shot is read noise, which (as I mentioned earlier) ought to be comparable to that of other cameras in the same range - but only because those cameras were produced with more or less similar technology, at a similar date.
In other words - shot noise shouldn't change, since two cameras with the same sensor size can't receive wildly different numbers of photons from the same exposure, beyond the standard quantum (Poisson) variation. (And this is increasingly the case for newer sensors, whose microlenses are closing whatever light-collection gap there was with larger-pixel sensors, however big that gap had the potential to be.)
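A toy Monte Carlo makes the point that shot noise over a fixed patch of sensor doesn't depend on how many pixels divide it up. All the numbers here are invented, and the Gaussian draw is just the usual large-mean approximation to Poisson photon statistics:

```python
import math
import random

random.seed(42)
mean_total = 40_000   # expected photons landing on one patch (made up)

def patch_noise(pixels, trials=2000):
    """Std. dev. of the summed signal when the patch is split into
    `pixels` wells, each sampled independently."""
    per_pixel = mean_total / pixels
    totals = []
    for _ in range(trials):
        # Gaussian approximation to Poisson shot noise per pixel.
        totals.append(sum(random.gauss(per_pixel, math.sqrt(per_pixel))
                          for _ in range(pixels)))
    mean = sum(totals) / trials
    var = sum((t - mean) ** 2 for t in totals) / trials
    return math.sqrt(var)

# Both come out near sqrt(40_000) = 200, independent of pixel count:
print(patch_noise(1))
print(patch_noise(4))
```

Splitting the light four ways and summing it back gives the same total variance, which is why shot noise is a property of the exposure and sensor area, not of the pixel count.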
And so, to be clear, the main route to improvement in digital photographic noise lies in improving the efficiency of that capture of photons and their conversion to data.* There isn't (and won't be) a percentage-basis breakdown of the contributors to noise - it changes from manufacturer to manufacturer and over time, with every little change - but shot noise can be assumed more or less constant, and the ADC is a major contributor. Some significant, if subtle, shortcomings in the photon capture and conversion chain were addressed only quite recently: astrophotography with Rebels older than 2009 could show some rather bad noise characteristics, and at a certain age you even start to see banding along the edges of the sensor. The fixes to the photon-to-data conversion process don't seem to have had any ill effects elsewhere.
The final question - Nikon has obviously had a slightly different perspective here. Moving from 16 megapixels to, say, 12 isn't enough to make a dramatic difference in the efficiency of converting photons to data (certainly nothing like a 4:1 reduction in the amount of raw data), nor in shot noise. On the other hand, there is some merit to the idea that read noise does not have a simple linear relationship to pixel size - so perhaps even a small reduction in pixel count could buy a disproportionate improvement elsewhere. It doesn't seem too convincing an argument, however. The situation where the ADC's noise overwhelms shot noise badly enough to be undesirable - because of how few photons are captured, or how long the exposure runs - shows up mostly in astrophotography and similar shooting (for which headroom is of course desirable - photojournalism, sports, night landscapes, everybody wants more ISO latitude), and once again the situation there isn't as bad as it used to be. Maybe they were just a bit late to that party.
You have put it simply and convincingly - at some level there are too few photons, and read noise becomes much more important. I have to stress that this isn't a result of the pixel size but of the readout chain's characteristics - and the problem there is that improvements in that capture chain come only with time; they are not a parameter that can simply be altered arbitrarily to fit, like pixel pitch. Differing amounts of shot noise stress the photon-to-data circuitry differently, but it strikes me as true that newer systems are good enough that the extra reduction in shot noise is no longer worth the loss of data. Varying pixel pitch to fit the characteristics of the photon-to-data conversion is worth considering, but it should not be the first thing suggested.
Unfortunately, all too often amateur (and even pro) photographers, and even some members of the camera industry, latch onto pixel pitch (or, confusingly, "pixel density") as the be-all measure, when really it is the intersection of two variables: the amount of photons collected versus the ability to render them to data. It would be just as accurate, and perhaps less misleading, to say that high noise is the result of a manufacturer failing to keep its technology competitive. It would be more accurate still to say that the tradeoffs of increasing pixel pitch outweigh those of keeping it the same, because data is lost.
Unfortunately, the battle lines between Canon and Nikon especially seemed drawn here, because you do not get a Nikon image just by simple scaling of a Canon image - many people would be bothered by the complex scaling needed to go from a larger image to a smaller one while retaining the same amount of data (which, to be ideal, would require combining 4 pixels into 1). Nikon held out the hope of doing this on the sensor itself (perhaps not deliberately as marketing, but it was interpreted as such). Even if this was justified by a nonlinear improvement (for their technology) in visible noise over a Canon-sized pixel, the effects of this thinking are lamentable.
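The 4-pixels-into-1 tradeoff can be put in numbers. This sketch (signal and read-noise figures are made up) shows why binning after readout never quite recovers the large-pixel result: signals add linearly, independent noise adds in quadrature, and read noise is paid once per pixel *read*.

```python
import math

signal_per_small = 2_500   # electrons in each small pixel (made up)
read_noise = 3.0           # electrons RMS per pixel read (made up)

# One large pixel of the same total area: one read, 4x the signal.
large_signal = 4 * signal_per_small
large_noise = math.sqrt(large_signal + read_noise**2)

# Four small pixels binned after readout: same signal, four reads,
# so four read-noise contributions added in quadrature.
binned_signal = 4 * signal_per_small
binned_noise = math.sqrt(binned_signal + 4 * read_noise**2)

print(large_signal / large_noise)    # SNR of the large pixel
print(binned_signal / binned_noise)  # slightly lower: extra read noise
```

At these well-lit levels the gap is tiny, because shot noise dominates both; starve the pixels of photons and the 2x read-noise penalty of the binned version becomes the visible difference.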
I think the goalposts have been moved now - if the new sensors actually do double the resolution of previous cameras, the read noise will also have to improve by a great leap just to keep pace.
*(Some of the larger sensors get away with lower quantum efficiencies: the Panasonic GH1 has a QE around 55%, while the Pentax 645D's is about 41% and the slightly older 1D Mark IV's is 44%. I expect some of the medium format backs have QEs substantially lower still - a generation or more behind these sensors, which are all meant to be competitive, even the 645D's - but they can leverage both larger pixels and more sensor surface area to overcome the older technology.)
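The footnote's area-versus-QE tradeoff is simple multiplication. This sketch uses the QE figures quoted above with approximate sensor areas (Four Thirds at about 17.3 x 13 mm, the 645D at 44 x 33 mm) and a made-up photon flux:

```python
flux = 1_000   # photons per mm^2 for some fixed exposure (made up)

# Approximate sensor areas in mm^2; QE figures as quoted above.
sensors = {
    "GH1":  {"area": 17.3 * 13.0, "qe": 0.55},   # ~225 mm^2
    "645D": {"area": 44.0 * 33.0, "qe": 0.41},   # ~1452 mm^2
}

# Electrons captured scale with area * QE, so the much larger but less
# efficient 645D sensor still collects several times more charge.
for name, s in sensors.items():
    print(name, s["area"] * s["qe"] * flux)
```

The 645D's sensor ends up with roughly five times the captured charge despite the lower QE - which is exactly how more surface area papers over an older capture chain.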