First off, those are A real analysis would involve starting with the RAW files themselves (Juza doesn't make those available, just the JPGs), and analyzing the raw data itself using IRIS, Rawnalyze, or dcraw. Even that's a flawed comparison
Yes, those images are not ideal, but they're far better than anything else I've seen (and by the way, his interpretation of the photos is opposite - he's comparing noise at pixel level). It would be really great if you could show a better comparison.
But you're ignoring the practical point. Those images are taken at a very high ISO, where one sensor has 2.5 times the pixel density of the other. So, what does it take for all the factors that you list to cause a visible noise difference? ISO 1 million or a sensor with 100 MP? Does that make any practical difference today?
Even the RAW photos from IR taken with the D3x and the D3s at ISO 12800 show very little difference in noise (and only in the shadows, in favor of the newer one, of course), and one has twice the pixel density of the other. So, what does it take for the noise differences to be more than barely visible?
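To put rough numbers on the pixel-level vs. same-output-size distinction, here is a minimal sketch, not a measurement. It assumes a simple model with only photon shot noise and per-pixel read noise, and assumes both sensors have the same read noise and efficiency. The read noise of 3 electrons, the signal levels, and the 2-vs-5 pixels per output patch (a 2.5x density ratio) are all made-up illustrative values.

```python
import math

READ_NOISE_E = 3.0     # assumed read noise per pixel, in electrons (illustrative)
PIXELS_LOW = 2         # sensor pixels per output patch, low-density sensor (assumed)
PIXELS_HIGH = 5        # same patch on a sensor with 2.5x the pixel density (assumed)

def snr(signal_e, n_pixels, read_noise_e=READ_NOISE_E):
    """SNR of a patch that sums n_pixels pixels and collects signal_e electrons in total."""
    shot_var = signal_e                      # Poisson shot noise: variance equals the mean
    read_var = n_pixels * read_noise_e ** 2  # independent read noise adds in variance
    return signal_e / math.sqrt(shot_var + read_var)

def compare(patch_signal_e):
    """Return (per-pixel SNR ratio, same-output-size SNR ratio), high density / low density."""
    per_pixel = (snr(patch_signal_e / PIXELS_HIGH, 1)
                 / snr(patch_signal_e / PIXELS_LOW, 1))
    same_size = snr(patch_signal_e, PIXELS_HIGH) / snr(patch_signal_e, PIXELS_LOW)
    return per_pixel, same_size

# Hypothetical signal levels: electrons collected over the whole output patch.
for label, signal in (("midtones", 200.0), ("shadows", 20.0)):
    per_pixel, same_size = compare(signal)
    print(f"{label}: per-pixel ratio {per_pixel:.2f}, same-size ratio {same_size:.2f}")
```

Under these assumptions the high-density sensor looks much worse pixel for pixel than it does once both images are viewed at the same size, and whatever gap remains at the same size comes from read noise, so it is concentrated in the shadows. How large that gap is on real cameras depends on the actual read noise, which this sketch does not know.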
I'm not sure if we are even debating the same subject?! It makes no difference to me (or to anyone who screams at Canon to put only 10 MP on a FF sensor) if the noise differences can only be quantified with statistical means and aren't visible to the naked eye.
Later edit:
Actually, to make this simpler: I admit that a higher pixel density generates noisier images, but I want the people who want Canon to put only 10 MP on a FF sensor to see what difference that would make: invisible to the human eye (unless you're looking for it).