True, but one of the advantages of higher pixel counts is that prints can be made larger and examined more closely, and another is the extra room to crop. The first effectively reduces the acceptable CoC; the second effectively increases enlargement, thus reducing the CoC. Both mean that less blur can be tolerated before it becomes visible.
If you choose to change the enlargement criteria, by making bigger prints, reducing viewing distances, cropping, etc., then obviously you need to change the CoC criteria and the acceptable amount of blur, but that doesn't alter the fact that pixel size is irrelevant with respect to motion blur (or diffraction) for the same-sized image.
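To put rough numbers on that scaling, here's a minimal sketch assuming the common (but arbitrary) criterion of 0.2 mm blur on a print viewed from 25 cm; all the figures are illustrative:

```python
# Minimal sketch: how the acceptable on-sensor circle of confusion (CoC)
# tightens as prints get bigger or viewing distances get shorter.
# Assumes the common 0.2 mm-on-print-at-25 cm criterion, which is a
# convention, not a law.

def acceptable_coc_mm(sensor_width_mm, print_width_mm, viewing_distance_mm):
    """Acceptable on-sensor CoC for a given print size and viewing distance."""
    coc_on_print = 0.2 * (viewing_distance_mm / 250.0)  # scale the 0.2 mm @ 25 cm rule
    enlargement = print_width_mm / sensor_width_mm
    return coc_on_print / enlargement

# 36 mm wide (full-frame) sensor:
print(acceptable_coc_mm(36, 250, 250))  # ~0.029 mm: the classic ~0.03 mm figure
print(acceptable_coc_mm(36, 500, 250))  # double the print size: CoC budget halves
print(acceptable_coc_mm(36, 500, 500))  # step back proportionally: back to ~0.029 mm
```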
I suspect most people (I'm sure there are exceptions) don't judge critical focus based on their intended final output. Rather, they view the image at 100% (most likely with a loupe tool). Therefore, comparing two images shot on different bodies with differently-sized pixels, with the subject projected onto the image plane at the same physical size, the image from the higher resolution/smaller pixel sensor will appear larger, and thus more subject to the perception of blur.
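To make that concrete, a quick sketch (the pixel pitches are illustrative round numbers, not specs of particular bodies): the same physical blur trail on the sensor spans more pixels on the denser sensor, so it looks worse at 100%.

```python
# The same 10 µm blur trail on the sensor covers more pixels on a
# higher-resolution body, so it is more conspicuous at 100% view.
# Pixel pitches below are illustrative, not exact specs.

blur_um = 10.0  # length of the motion-blur trail on the sensor, in micrometres

for body, pitch_um in [("~24 MP full frame", 6.0), ("~61 MP full frame", 3.8)]:
    print(f"{body}: blur spans {blur_um / pitch_um:.1f} px at 100% view")
```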
I still think, from my understanding of what happens on the surface of the sensor, that the size of the pixels doesn't play as important a role here as the resolution does. Assuming that:
1. on current sensors, the geometry of the individual subpixels spreads them evenly across the surface
2. three or (rather four, on current sensors) subpixels make up one real pixel
3. luminosity information from the subpixels making up a pixel, which sit some distance apart, is interpolated
4. there are microlenses reducing the influence of the real geometry of the subpixels
I would rather say that if we want to observe a difference in contrast between the final pixels in order to catch motion blur, then
1. if four subpixels act as one pixel
2. and those subpixels are evenly spread across the surface
then only the resolution of the final pixels (and not of the subpixels) determines the sensor's capability to catch motion blur, whether caused by camera shake or by subject movement; a toy simulation is sketched below.
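Here's a rough 1-D toy model of that claim (no real demosaicing, just averaging pairs of subpixels into output pixels, so everything is illustrative): a hard edge is smeared by a small motion blur, sampled at subpixel resolution, then binned two subpixels to one output pixel. The contrast penalty from the blur shrinks once the subpixels are averaged down:

```python
import numpy as np

# Toy 1-D model: does subpixel-scale blur survive once subpixels are
# averaged into output pixels? Everything here is illustrative; real
# demosaicing is far more involved than plain 2:1 binning.

def render_edge(n_sub, blur_sub):
    """Hard edge at the centre, smeared by a box blur blur_sub subpixels wide."""
    x = np.zeros(n_sub)
    x[n_sub // 2:] = 1.0
    if blur_sub > 1:
        x = np.convolve(x, np.ones(blur_sub) / blur_sub, mode="same")
    return x

def max_step(signal):
    """Largest step between neighbouring samples: a crude sharpness measure."""
    return np.max(np.abs(np.diff(signal)))

sharp = render_edge(64, 1)
blurred = render_edge(64, 2)  # blur of two subpixels = one output pixel

for label, sig in [("sharp  ", sharp), ("blurred", blurred)]:
    binned = sig.reshape(-1, 2).mean(axis=1)  # two subpixels -> one output pixel
    print(f"{label}: subpixel step {max_step(sig):.2f}, binned step {max_step(binned):.2f}")
# The sharp edge keeps a full step either way; the blurred edge recovers
# much of its step after binning, i.e. subpixel-scale blur is largely
# hidden at output-pixel resolution.
```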
This can change somewhat depending on whether detection happens in an area near the photo's white balance or in an area of one pure RGB component. In the latter case the other subpixels can remain "blind" whether or not real movement has occurred, but I'd say such a case is rather rare.
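As a toy illustration of that "blind" case (assuming an RGGB layout; the numbers are made up): a pure-red vertical edge can shift by one subpixel column without changing the red photosites' readings at all, because they sample only every second column:

```python
import numpy as np

# Illustrative only: R sites on an RGGB grid sit on every second column,
# so a pure-red edge moving by one subpixel column can go undetected.

w = 8
red_cols = np.arange(0, w, 2)  # R photosites occupy the even columns

def red_samples(edge_col):
    scene = (np.arange(w) >= edge_col).astype(float)  # pure-red vertical edge
    return scene[red_cols]                            # what the R sites record

print(red_samples(3))  # edge at column 3 -> [0. 0. 1. 1.]
print(red_samples(4))  # edge moved one column -> [0. 0. 1. 1.], identical
```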
Looking at the real sensor's geometry and its ability to detect motion blur, I think the resolution determines the real pixel size as the area bounded by its subpixels (though not in terms of its light-capturing capabilities), so the higher the resolution, the smaller the real pixels (even if somewhat empty in the middle), and the greater the ability to detect blur. At lower resolutions, the subpixels acting together as one bigger pixel interpolate the movement over a larger area of the sensor's surface, so their sensitivity to "detect" the movement is lower.
Does it make sense?