As a side note, since it would take an increase to 82% Q.E. for the 7D II to gain a true ONE stop improvement in high ISO performance, we can never hope to see a true two stop improvement. Neither the 7D II, nor any successor, nor any new pro-grade APS-C line of cameras from Canon or anyone else, will ever perform as well as a FF sensor that has larger pixels. So long as the average pixel size for FF sensors remains larger than the average pixel size for APS-C sensors, FF sensors will always perform better at high ISO. Nothing we can do about that...it's just physics.
Well, there are a few tricks that Canon could do. For example, if a camera used a series of fast exposures, it could do motion-vector analysis on various parts of the image, then combine them programmatically after compensating for camera and subject motion, resulting in roughly the same image as you'd get with the shorter shot length (blur-wise), but with the SNR of the longer total exposure. However, that's way beyond the realm of sensor tech.
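Here's a toy sketch of that idea in Python. Everything in it is made up for illustration: the "scene" is a synthetic gradient, the motion vectors are known integer shifts rather than the output of a real block-matching step, and the noise is plain Gaussian read noise. The point is just that after alignment, averaging N sub-frames cuts uncorrelated noise by roughly sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: a smooth gradient, shot as 8 short sub-exposures, each
# offset by a (known) camera-motion shift and corrupted by read noise.
scene = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
n_frames, noise_sigma = 8, 0.05
shifts = [(i % 3 - 1, i % 2) for i in range(n_frames)]  # toy motion vectors

frames = [np.roll(scene, s, axis=(0, 1)) + rng.normal(0, noise_sigma, scene.shape)
          for s in shifts]

# "Motion vector analysis" stands in for a real block-matching step here:
# shift each frame back by its motion vector, then average the stack.
aligned = [np.roll(f, (-s[0], -s[1]), axis=(0, 1)) for f, s in zip(frames, shifts)]
stacked = np.mean(aligned, axis=0)

single = np.std(frames[0] - np.roll(scene, shifts[0], axis=(0, 1)))
print(f"single-frame noise: {single:.4f}")
print(f"stacked noise:      {np.std(stacked - scene):.4f}")  # ~noise_sigma / sqrt(8)
```

A real implementation would have to estimate the motion per region (and deal with occlusions), which is exactly the hard part that puts this beyond plain sensor tech.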
How would that work with a selectable shutter speed, though? I mean, if I as the photographer chose a 1/1250s shutter speed, a single exposure that long is going to be better than multiple separate exposures blended together. You'll lose light in the interframe time as well, so gain would have to be higher...
You're assuming a mechanical shutter. Consider a vertically stacked sensor that can push its values down to a buffer deeper in the silicon or, for simplicity, an interline transfer design. You can then sample the image with no rolling shutter (better for video) and no delay between shots. If a mechanical shutter is desirable for some reason, open it before the first frame and close it at the end.
It matters not whether the shutter is mechanical or electronic. What matters is that the PHOTOGRAPHER selects the EXPOSURE TIME (shutter speed). Shutter speed is shutter speed, regardless of whether the shutter is mechanical or electronic. If the photographer chooses a 1/2000th shutter speed, then that is AS LONG AS the camera can expose. Trying to make a better exposure by taking several short exposures within that 1/2000th window is likely impossible. At the very least, there is going to be some lag time for read or "ship the charge off to a buffer" between each partial exposure. That lag time is going to cost you light. Because shutter speed is a user selectable quantity of time, gathering light for that total time is the best we can do.
Other way around. The user typically chooses an exposure time with the primary goal of avoiding blur from camera or subject motion. If the user allows it, however, the camera could use a much longer exposure than the one selected, dicing that long exposure up into pieces of the user-specified length and compensating for motion. The result approximates an exposure of the user-chosen duration while gaining accuracy in portions of the image that did not change significantly or changed only trivially (e.g. shifting one way or the other), similar to the way MPEG compression reduces data rate by describing portions of one frame in terms of adjacent frames.
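To make the per-region idea concrete, here's a crude sketch: average each block across the sub-frames only where it didn't change significantly, and keep the base frame's pixels where it did. The block size, the change threshold, and the change test itself are all arbitrary stand-ins for what a real motion-analysis step would do:

```python
import numpy as np

# Toy selective stacking: denoise static blocks, leave changed blocks alone.
def selective_stack(frames, block=8, threshold=0.1):
    base = frames[0]
    out = base.copy()
    mean = np.mean(frames, axis=0)
    h, w = base.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            regions = [f[y:y+block, x:x+block] for f in frames]
            # crude "motion" test: how far does each sub-frame stray from the base?
            if max(np.abs(r - regions[0]).mean() for r in regions) < threshold:
                out[y:y+block, x:x+block] = mean[y:y+block, x:x+block]
    return out

rng = np.random.default_rng(1)
scene = np.zeros((16, 16))
frames = [scene + rng.normal(0, 0.03, scene.shape) for _ in range(6)]
for f in frames[1:]:
    f[:8, :8] += 0.5  # this block "moves" (changes) between sub-frames
result = selective_stack(frames)
# The changed block stays as shot in the base frame; static blocks get denoised.
```

MPEG-style motion vectors would let you average even the blocks that merely shifted, rather than skipping them, which is where most of the extra gain would come from.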
BTW, with an electronic shutter, there should be very little (if any) gap between frames. Some CCDs with electronic shutters can dump hundreds or even thousands of frames per second, which means that the gap can't be much more than single-digit or perhaps double-digit microseconds, either of which would almost certainly be completely ignorable.
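Quick back-of-envelope on what a gap like that would cost. The numbers are illustrative (16 sub-frames of 1/2000 s each, with an assumed 10 microsecond dead time between consecutive frames), but they show why the loss is ignorable:

```python
import math

sub_exposure = 1 / 2000            # seconds per sub-frame
gap = 10e-6                        # assumed interframe dead time
n = 16
collected = n * sub_exposure       # light actually gathered
lost = (n - 1) * gap               # light falling during the gaps
fraction_lost = lost / (collected + lost)
stops_lost = math.log2((collected + lost) / collected)
print(f"fraction of light lost: {fraction_lost:.2%}")   # under 2%
print(f"exposure cost in stops: {stops_lost:.3f}")      # a few hundredths of a stop
```

Compared against the roughly one-stop swings being argued about upthread, a few hundredths of a stop is noise.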
As far as I can tell, the hard part is not the sensor side; it's being able to dump ten times as many RAW-sized images to the flash card so that such post-processing would even be possible. It's almost certainly infeasible right now, but I'd expect it to be pretty easy to do in just a few years. It could be substantially longer before cameras would have fast enough CPUs to do that sort of processing internally, of course. Alternatively, it might be possible sooner with the use of some sort of perverse RAW-MPEG encoding in which each subsequent frame in the set is described relative to the first, but the compute power required would be... considerable.
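For what the "RAW-MPEG" framing would buy you, here's the simplest possible version: store the first sub-frame whole and each later frame as a difference from it. The numbers are invented (14-bit-ish values, small frame-to-frame differences), and a real encoder would entropy-code the deltas rather than just measure them, but the shrunken dynamic range is what makes them compressible:

```python
import numpy as np

rng = np.random.default_rng(2)
base = rng.integers(0, 16384, size=(32, 32), dtype=np.int32)  # 14-bit-ish RAW
# later sub-frames differ from the base only by a little noise (assumed)
frames = [base + rng.integers(-4, 5, size=base.shape, dtype=np.int32)
          for _ in range(8)]

deltas = [f - base for f in frames]
# A real encoder would entropy-code these; here we just note the range shrinks.
print("full-frame range:", int(base.max() - base.min()))
print("delta range:     ", int(max(d.max() - d.min() for d in deltas)))
```

Of course, this is the easy direction; the compute-hungry part is the motion compensation that keeps the deltas small when the camera or subject moves.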
As an added bonus, with an electronic shutter, the camera could examine a few shots before and after the moment the user presses the shutter button (the way an iPhone does), choose the least smeared, and default to using that one as the base frame for correction purposes. Whether shots taken before the lens fully focuses are useful or not is a different question, but I figure that by the time we see something like I'm describing, we'll probably also have light-field sensors that will make those shots almost usable.... Or not. Hard to say.
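Picking the "least smeared" frame is actually the tractable part: a common proxy for sharpness is the variance of a Laplacian-filtered image, since blur suppresses the high frequencies the Laplacian responds to. Sketch below with synthetic frames (the metric, not the imagery, is the point):

```python
import numpy as np

def laplacian_variance(img):
    # discrete Laplacian via shifted copies; higher variance = sharper
    lap = (-4 * img
           + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return lap.var()

rng = np.random.default_rng(3)
sharp = rng.random((64, 64))
# crude horizontal "motion blur": average each pixel with its neighbors
blurred = (sharp + np.roll(sharp, 1, 1) + np.roll(sharp, -1, 1)) / 3

frames = [blurred, sharp, blurred]
best = max(range(len(frames)), key=lambda i: laplacian_variance(frames[i]))
print("base frame index:", best)  # the sharp frame wins
```

The camera would run something like this over the ring buffer of pre/post shots and anchor the whole correction pipeline on the winner.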