Computational cost still tends to dive according to Moore's law. Sensor development seems much slower. I think it is realistic to assume that we will be more dependent on fancy processing in the future than we are now.
Already today, the "average consumer" is well served by the 6-24MP in the average appliance. Normal HD, but in 3:2 format, is about 2.5MP. In 4:3 format about 2.8MP. And if the resulting image in a "Full HD" presentation size is sharp, the image quality is considered good. The average image use seems to be about 1024px width...
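A quick way to check those HD figures, as a minimal sketch. I'm assuming "Full HD" means a 1920px wide frame; the function name is just for illustration.

```python
def megapixels(width_px, aspect_w, aspect_h):
    """Pixel count in MP for a frame of the given width and aspect ratio."""
    height_px = width_px * aspect_h / aspect_w
    return width_px * height_px / 1e6

HD_WIDTH = 1920  # assumption: "Full HD" presentation width in pixels

print(round(megapixels(HD_WIDTH, 3, 2), 2))  # 3:2 frame -> 2.46
print(round(megapixels(HD_WIDTH, 4, 3), 2))  # 4:3 frame -> 2.76
```

Both land close to the 2.5MP and 2.8MP quoted above, well below what even a 6MP camera delivers.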
So we're already oversampling the images, in practice. It's slightly different for the photo enthusiast and the more discerning customer - where often more resolution is better resolution. To put this into context, a full-spread ad in a normal-to-good quality offset magazine takes a reasonable 240 dpi of input to create a good rip into the print raster. That's about 10MP (add bleed, 12MP) (*1).
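To make the print arithmetic concrete, here is a rough sketch. The page size is my assumption (two A4 pages side by side, 420 x 297 mm, with 3 mm bleed on every edge); real magazine trim sizes vary, so the result is only a ballpark.

```python
MM_PER_INCH = 25.4

def print_megapixels(width_mm, height_mm, dpi, bleed_mm=0.0):
    """Input pixels (in MP) needed to fill a printed area at a given dpi."""
    w_px = (width_mm + 2 * bleed_mm) / MM_PER_INCH * dpi
    h_px = (height_mm + 2 * bleed_mm) / MM_PER_INCH * dpi
    return w_px * h_px / 1e6

# Assumed spread: two A4 pages side by side; assumed bleed: 3 mm per edge.
spread = print_megapixels(420, 297, dpi=240)
spread_with_bleed = print_megapixels(420, 297, dpi=240, bleed_mm=3)

print(f"spread: {spread:.1f} MP, with bleed: {spread_with_bleed:.1f} MP")
```

With those assumptions the numbers come out in the 11-12MP range, the same ballpark as the figures above; a smaller trim size pulls them down accordingly.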
The added cost of a 40MP sensor isn't so much in the manufacture of the sensor plate as in the peripheral equipment. The sensor may get 10% more expensive when production has stabilized, but the ancillaries still have to be twice as fast as before to get the same fps - meaning twice the buffer memory and off-sensor bandwidth, twice the number of cores in the ASIC PIC and so on. That adds up to a lot more than the sensor cost increase.
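The "twice as fast" part is plain proportionality. A small sketch, assuming a 20MP baseline, 10 fps and 14-bit raw samples (all three are my illustrative numbers, not measurements of any real camera):

```python
def readout_rate_mb_s(megapixels, fps, bits_per_pixel=14):
    """Raw off-sensor data rate in megabytes per second."""
    return megapixels * fps * bits_per_pixel / 8

# Assumed baseline: 20MP; the 40MP case doubles the pixel count at the same fps.
baseline = readout_rate_mb_s(20, fps=10)
doubled = readout_rate_mb_s(40, fps=10)

print(baseline, doubled, doubled / baseline)
```

Whatever baseline you pick, the ratio stays at 2x as long as the fps and bit depth are held constant, and buffer memory for a given burst length scales the same way.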
Do you know the corresponding number for "3CCD" video cameras? How are they for color accuracy (the low-level sensor/optics, not the processed compressed video output)?
The trichroic prisms they use are very efficient, but to get reasonable color accuracy a thin-film additional color filter is often applied at the prism endpoints, before each sensor. To get reasonable color accuracy (actually "resistance to metamerism failures") you can approach about 75-80% light energy bandwidth preservation - visible light delivered to the sensors. (This is where the Foveon inherently fails - it has no mechanism for increasing SML separation, it HAS to use all incoming energy. It has no way to use additional filtering.) Then you can multiply that with the average efficiency of energy conversion in the 500-600nm range, and get an end result of about 40% full-bandwidth QE. About three times higher than a normal Bayer, as expected.
The reason why you HAVE to use additional filtering to get a good correlation between recorded color and human perceived color is that you have to find an LTI stable way (preferably a simple matrix multiplication) to make the sensory input correspond to the biochemical light response of the human eye (SML response).
http://en.wikipedia.org/wiki/Cone_cell
The main problems with prismatic solutions aren't efficiency or color. They are the production cost (and a very much higher cost for lenses) and the angle sensitivity.
Minimum BFD (back focal distance) is about 2.2x image height, increasing the need for retrofocus wide angles to almost 10mm longer register distances than in an SLR type camera (about 55mm from the sensor to the last lens vertex for an FF camera!). This means that anything shorter than an 85mm lens would have to be constructed basically like the 24/1.4's and 35/1.4's. And that's expensive.
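The back focal distance rule of thumb is easy to put into numbers. A minimal sketch, assuming "image height" means the 24 mm short side of a full-frame sensor (that reading is my assumption):

```python
def min_back_focal_distance_mm(image_height_mm, factor=2.2):
    """Rule-of-thumb minimum BFD for a prism block: factor x image height."""
    return factor * image_height_mm

FF_IMAGE_HEIGHT_MM = 24.0  # assumption: short side of a 36 x 24 mm sensor

print(round(min_back_focal_distance_mm(FF_IMAGE_HEIGHT_MM), 1))  # 52.8
```

That is in line with the "about 55mm" quoted for an FF camera once you allow a little mechanical clearance.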
Large aperture color problems. The dichroic mirror surfaces vary in separation bandwidth depending on the angle of incident light. An F1.4 lens has an absolute minimum 65° ray angle from edge to edge of the exit pupil....
The number you quoted on CFA earlier was 30-40%, so I guess that is the loss that can be attributed to Bayer alone?
I find it surprising that we still use the same basic CFA as was suggested in the 70s. Various alternative CFAs have been suggested, but have never really "caught on". I don't know if this is because Bryce got it right the first time, or because the cost of doing anything out-of-the-ordinary is too high (see e.g. x-trans vs Adobe raw development).
Yes, 30-40% average channel response, multiplied by the average surface bandwidth - which is also around 30-40%. That gives about 10-15% overall system efficiency (compared to a maximum that is not 100%, but about 75-80%, if you want "human perception color response").
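The multiplication behind that estimate, as a small sketch. The 40% figure used for the comparison is the full-bandwidth QE I quoted for the prism solution; the function name is only illustrative.

```python
def system_efficiency(channel_response, surface_bandwidth):
    """Overall efficiency as the product of the two average factors."""
    return channel_response * surface_bandwidth

low = system_efficiency(0.30, 0.30)
high = system_efficiency(0.40, 0.40)
print(f"Bayer: {low:.0%} to {high:.0%}")

PRISM_QE = 0.40  # the approximate 3-sensor prism figure quoted earlier
print(f"prism advantage: {PRISM_QE / high:.1f}x to {PRISM_QE / low:.1f}x")
```

The end points come out at 9% and 16%, so "about 10-15%" is a fair rounding, and the prism advantage brackets the "about three times" mentioned earlier.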
Mr Bayer got it right, because he didn't complicate a very easily defined problem. System limitations:
- Use a production practical layout for photocells; that's square or hexagonal cells. Square/octagonal combinations have been found to be counterproductive.
- Maximize the luminance resolution - that is mostly based on the green spectra (M-cones at ~550nm, perceptually achromatic rod cells at ~500nm)
- Make it rotationally invariant and preferably in symmetric layout schemes.
- Make the system balanced between luminance and chrominance statistical accuracy (noise types)
Symmetrical layout: 2x2 or 3x3 (4x4 too much?) groups with square cells, triangle layout with hexagonal cells.
Luminance resolution: have more green than blue or red input area. Green cell layout has to be symmetric
Noise considerations: have approximately twice the amount of green as either red or blue input area
There aren't too many layouts to consider...
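To illustrate how the 2x2 group meets the "twice the green" constraint, here is a minimal sketch that tiles the usual RGGB arrangement over a small mosaic and counts the area per color. The helper names are made up for this example.

```python
from collections import Counter

# The familiar 2x2 Bayer group: green on one diagonal, red and blue on the other.
BAYER_TILE = [["R", "G"],
              ["G", "B"]]

def tile_mosaic(tile, rows, cols):
    """Repeat a small tile over a rows x cols grid of photocells."""
    return [[tile[r % len(tile)][c % len(tile[0])] for c in range(cols)]
            for r in range(rows)]

def color_fractions(mosaic):
    """Fraction of the total cell area taken by each filter color."""
    counts = Counter(cell for row in mosaic for cell in row)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

fractions = color_fractions(tile_mosaic(BAYER_TILE, 4, 4))
print(fractions)  # green 0.5, red 0.25, blue 0.25
```

Green gets half the area and red and blue a quarter each, and the green cells form a checkerboard, which is about as symmetric as a square grid allows.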
(*1) At National Geographic (for whom I was part of designing their first in-line print quality inspection cameras, now many, many moons ago...) they generally accept that their 300 dpi input recommendation for advertisement and art input is way over the top. In ABX blind tests (with loupe!), screens top out at about 175 lpi raster frequency on good quality paper. That's where the blind testers start to fail in recognizing the higher resolution image with statistical ABX comparisons in more than 50% of the samples. As software and algorithms have improved, we now use 1.33x lpi to get the needed dpi input, where we had to use almost 2x before (the old "you need twice the resolution on the original to get maximum print quality" dogma).
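The lpi-to-dpi rule in numbers, as a minimal sketch (the function name is just for illustration):

```python
def required_input_dpi(screen_lpi, factor=1.33):
    """Input resolution needed for a given halftone screen frequency."""
    return screen_lpi * factor

print(round(required_input_dpi(175), 2))            # current 1.33x rule -> 232.75
print(round(required_input_dpi(175, factor=2), 2))  # old 2x dogma -> 350
```

So a 175 lpi screen needs roughly 233 dpi of input with the 1.33x factor, which is where the 240 dpi figure comes from, against 350 dpi under the old rule.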