Thursday, April 07, 2011

New CFA, Rectangular Pixels Proposed

A Seattle-based startup, Image Algorithmics, proposes a new, better CFA combined with rectangular pixels. The regular Bayer CFA relies on 3 colors and square pixels.

The proposed CFA works best with rectangular pixels with an aspect ratio of 1.41:1 and has 4 colors, each of which is a linear combination of red, green and blue.

Although the physical pixels are rectangular, the color image processing seems to convert them into the usual square-shaped pixels.

In comparison with the Bayer CFA with the same pixel count, same resolution, same chrominance bandwidth and demosaicked with the popular AHD algorithm, the proposed pattern is claimed to have the following advantages:
  1. 7.6dB PSNR improvement on the Kodak image set
  2. Even greater, 10dB PSNR, improvement on realistic images
  3. Greatly reduced artifacts
  4. One quarter as much increase in MSE due to noise. This is because of:
    a. lighter filter colors that let in more light
    b. no directional sensing algorithms to get confused by noise
    c. numerically stable demosaicing, unlike CMY patterns
  5. Uniform quantum efficiency - hence resistant to sensor saturation
  6. Low complexity demosaicing using separable filters, can be used in phone camera video
  7. Simple noise characterization resulting in more effective post demosaicking noise reduction
  8. Chromatic Aberration correction can be performed after demosaicing with an order of magnitude less increase in MSE than Bayer

The main principles of the proposed CFA-pixel-algorithm combination are discussed in an SPIE IS&T Electronic Imaging 2011 paper here.


  1. "In comparison with the Bayer CFA with the same pixel count, same resolution..."

    Do the overall sensor dimensions remain the same in this comparison?

    For example, if the sensor with square pixels has a pixel size 3um x 3um, is the rectangular pixel 3.57um x 2.52um?

    Or is it now 4.24um x 3um?

  2. What is the benefit of using rectangular pixels anyway?

  3. To the first poster, pixel count and area remain the same so the new pixel dimensions would be 3.57x2.52um.

    To the second poster, by altering the pixel aspect ratio we alter the aspect ratio of its spectrum. We use the CFA spectrum shape that most efficiently packs the image luminance and chrominance spectra into it. For more details see or our paper linked from there.
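The pixel dimensions quoted in this exchange follow from keeping the pixel area fixed while changing the aspect ratio. A minimal sketch of that arithmetic, assuming the quoted 1.41:1 ratio means sqrt(2):1 (the function name is illustrative and not taken from the paper):

```python
import math


def rectangular_pixel(square_side_um, aspect_ratio=math.sqrt(2)):
    """Return (width, height) in um of a pixel with the same area as a square pixel."""
    area = square_side_um ** 2
    # width = aspect_ratio * height and width * height = area
    height = math.sqrt(area / aspect_ratio)
    width = aspect_ratio * height
    return width, height


width, height = rectangular_pixel(3.0)
print(f"{width:.2f} um x {height:.2f} um")  # 3.57 um x 2.52 um
```

The other figure raised in the question, 4.24um x 3um, would instead keep the 3um height and widen the pixel, which increases the pixel area by a factor of about 1.41.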

  4. Interesting approach.
    The obvious drawback is of course the large elementary cell.
    Haven't read the paper yet but I'm assuming the demosaicing algorithm is proprietary too.

    So far manufacturers have stayed away from anything other than Bayer - even though Bayer alternatives have been proposed (and patented) in the past.

    OTOH, the continuously increasing pixel densities could finally make a Bayer alternative successful.

  5. Demosaicing is done by a very simple and non-proprietary demodulate+filter algorithm. This algorithm requires very little computation and can be employed by the most demanding mobile applications.

    So far Bayer is indeed dominant, but alternatives have been tried. Sony has tried an RGB+Emerald color in the past. Nikon and others have sold cameras with CMY filters. Most of these CFAs did not provide a compelling advantage and were abandoned.

    The previous "frequency domain" designs yielded under 2dB advantage at the expense of reduced chrominance resolution. The present frequency domain design achieves roughly 10dB advantage while retaining half as much chrominance bandwidth as luminance. Half chrominance bandwidth is a popular image encoding standard and will lead to no further loss of chrominance if cascaded with such a CFA.

    Separately, Kodak is pushing its RGBW CFAs that promise increased sensitivity.
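The actual filters are not given in this thread, so the following is only a one-dimensional toy illustration of the demodulate+filter idea. It assumes a luminance signal at baseband, a single chrominance signal modulated onto a carrier at the Nyquist frequency, and a 3-tap low-pass kernel. All three are assumptions made for illustration, not details from the paper.

```python
def demodulate_and_filter(samples):
    """Split samples[n] = luma[n] + chroma[n] * (-1)**n into (luma, chroma).

    Toy 1-D model: the carrier (-1)**n and the 3-tap kernel are assumptions
    made for illustration, not the filters used by the proposed CFA.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    carrier = [(-1) ** i for i in range(n)]

    # Demodulate: multiplying by the carrier moves chroma down to baseband
    # and pushes luma up to the Nyquist frequency.
    shifted = [s * c for s, c in zip(samples, carrier)]

    def reflect(i):
        # Mirror out-of-range indices back into the signal.
        if i < 0:
            return -i
        if i >= n:
            return 2 * (n - 1) - i
        return i

    # Low-pass filter with [1/4, 1/2, 1/4], which has a zero at Nyquist,
    # so the shifted luma is removed and the chroma remains.
    chroma = [
        0.25 * shifted[reflect(i - 1)]
        + 0.5 * shifted[i]
        + 0.25 * shifted[reflect(i + 1)]
        for i in range(n)
    ]

    # Re-modulate the chroma estimate and subtract it to obtain luma.
    luma = [s - ch * c for s, ch, c in zip(samples, chroma, carrier)]
    return luma, chroma


# Constant luma 0.5 and constant chroma 0.2, multiplexed into one signal.
samples = [0.5 + 0.2 * (-1) ** i for i in range(8)]
luma, chroma = demodulate_and_filter(samples)
```

For constant luma and chroma the recovery is exact. For slowly varying signals it is only approximate, and a practical two-dimensional design would presumably use longer separable filters.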

  6. Kodak is not pushing anything except daisies.
    RGBW is an old idea in general and probably not protected by IP except for specific implementations and specific algorithms that are embodied in patent applications (not yet granted) now owned by Omnivision.

    I certainly don't know a lot about color science, but I thought one fundamental idea was that capture should be matched to display. I understand that the digital soft form of an image could be in any color representation but ultimately you either have to print it or display it. I'd like to hear the Singh brothers' comments on this.

  7. Matching the capture and display color responses would work well if the spectral responses of R, G and B did not overlap. Unfortunately they do, and this causes complications. Consider the example of a monochromatic light source whose wavelength is at the peak of our green response. This light is going to stimulate all three types of cones, and not just green. A camera with the same spectral response as the eye will capture the same information. Now if you were to display the captured values directly on an output device with RGB responses matched to our eyes, it will output some R, B in addition to G, which is clearly wrong. Now the human visual system has 3 sources of R, B - one from the green light stimulation and two others from R, B lights, resulting in a less saturated green.

    Because of this spectral overlap, 3 primary displays can never reproduce all colors. Some people have tried to get around this problem by mapping the 3 primary input to 4 or even 6 primary output. They promise realistic display of Gold, among other colors.

    Returning to our CFA, the objective is to capture RGB as closely as possible to what the eye does. To achieve this the spectral response of colors should be exactly as specified. For example one of our colors is yellow. This color's spectral response should be the sum of red and green spectral responses in equal parts. To the extent this is achieved, our demosaicker will directly output the correct color. Otherwise we will get into the realm of color correction.

  8. Thanks for your answer and making us all think about this.

    I think in your example you left out the step of correcting for the eye response in the display but again, I am just a beginner here.

    Meanwhile, I was trying to figure out how you can specify optimum filters without including the spectral response of the silicon? Is it embedded somewhere in your calculations? I think not but just wanted to ask. I wonder if you are assuming 100% QE across the full spectrum.

    And again, I still believe optimization should include the display but maybe I am optimizing the wrong thing.

    As far as the rectangular sampling goes, what is the effect of fill factor (or microlens response) on the results? One cannot, in practice, do perfect sampling of either square or rectangular pixels.

    I suppose mathematically it is possible to sample the image on a rectangular grid and produce an output image pixelation on some other periodicity in X and Y. But, I suppose the practicalities of digital computation might affect this. Any thoughts?

  9. Color correction in the display can certainly fix the problem in my clunky example. Sorry about that. I am not a color expert either so I'll refrain from making definitive statements on matching camera and display except to say there is no special consideration in manufacturing the proposed CFA colors. If we start with the primaries of a Bayer CFA, however (in)accurate they might be, and realize our colors as the given linear combination of them, then our demosaicker will output the primaries with the same accuracy as the Bayer demosaicker.

    Our proposed colors are the combined spectral responses of the filter and silicon. I should have been more clear about that.

    As for the aspect ratio, it is the periodicity along X and Y that is critical for accurate color encoding, and not the exact shape of the pixels. The exact shape of the pixel determines the box filtering and the fill factor. Box filtering, in turn, determines how strong your OLPF should be and what kind of post demosaicing sharpening you are going to need. Fill factor determines the sensitivity. These factors are similar to those for square pixel Bayer sensors.

    Vladimir had also raised the question of optical and electrical crosstalk between pixels and their dependence on pixel location as well as the variation in spectral response of filters and silicon as a function of pixel location. All of these complications are quite similar to those for Bayer sensors and no novel approach is needed for them.

  10. I have been privately asked a very good fundamental question: since the width of the proposed rectangular pixel is greater than the width of the square Bayer pixel, won't the proposed sensor capture less resolution?

    The answer is no. The reason is quite simple: while our sensor is Nyquist limited in the horizontal direction, commercial Bayer sensors are OLPF limited to less than their (and our) Nyquist limit.

    The finer vertical pitch of our CFA is not used to capture more vertical resolution; it is instead used to capture color information. Vertical and horizontal resolutions are both limited to the horizontal Nyquist limit by our OLPF.
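To put numbers on the pitch argument, here is a small sketch using the example dimensions from earlier in the thread (3um square versus 3.57um x 2.52um rectangular). It relies only on the textbook relation that the Nyquist limit is one cycle per two sample pitches; the helper name is illustrative.

```python
def nyquist_cycles_per_mm(pitch_um):
    """Nyquist limit of a sampling grid with the given pitch, in cycles/mm."""
    pitch_mm = pitch_um / 1000.0
    return 1.0 / (2.0 * pitch_mm)


square = nyquist_cycles_per_mm(3.0)
rect_horizontal = nyquist_cycles_per_mm(3.57)
rect_vertical = nyquist_cycles_per_mm(2.52)

print(f"square, 3um pitch:            {square:.1f} cycles/mm")           # 166.7
print(f"rectangular, 3.57um (horiz.): {rect_horizontal:.1f} cycles/mm")  # 140.1
print(f"rectangular, 2.52um (vert.):  {rect_vertical:.1f} cycles/mm")    # 198.4
```

The horizontal Nyquist limit of the rectangular grid is lower than that of the square grid. The claim in the comment is that the OLPF of a commercial Bayer sensor already limits resolution to below this value, which this arithmetic alone does not verify.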

  11. As you use somewhat complementary color filters, the sampling frequency in R/G/B is a little bit complicated. If you decompose into R/G/B, maybe there is not much difference between the H and V directions.

  12. @ I have been privately asked a very good fundamental question

    OK. But wouldn't the horizontal resolution be different than the vertical resolution?

    Is that an issue at all?

  13. Decomposing into RGB will simplify the pattern though there's still a difference in the H and V directions. In fact, the frequency domain design process first determines the patterns separately for R, G, B and then adds them up to get the final colors.

    As for the horizontal and vertical resolutions, they are limited to the same value by the OLPF.

  14. OLPF...old school. Are they used with small pixel sensors?

  15. Any optical/mechanical technique that filters out high frequencies will work for both Bayer and the proposed CFA. Many low end camera optics are too poor to resolve fine detail, so their sensors do not have an OLPF. Most high end cameras, on the other hand, do have an OLPF.

    Even if the proposed design is used with sharp optics without an OLPF, the demosaicer's (digital) filters won't let it output higher vertical resolution than horizontal. Instead, it will produce artifacts, somewhat like Bayer does.

  16. @ As for the horizontal and vertical resolutions, they are limited to the same value by the OLPF.

    Makes sense.

    Considering that a Bayer sensor 'discards' two thirds (?) of the incoming light, I always wondered if there's a way to improve on this.

    Seems like this new CFA provides a very elegant (and scientifically solid) solution to the problem of letting more light hit the sensor and at the same time achieving the same (?) color separation.

    Good luck, guys.

  17. Thanks!

    Color separation of the proposed CFA is just as good as Bayer and its sensitivity is greater. However, the increased sensitivity is a lucky side effect since our primary objective was to improve fine detail reconstruction and reduce artifacts.

  18. It is well known to use a complementary filter array. Interline CCDs use this because of interlaced scanning and even-odd line summing. There have been many attempts to use complementary filters with CMOS. As a rule of thumb, the complementary filter gives 2X light input. But the problem is that in order to get R/G/B, there are some differential operations which inevitably increase the noise. I think that the final result should be very similar. Of course under low light conditions, you will get more photons per pixel, so the luma is better.

  19. Conversion of CMY to RGB increases noise, as you pointed out. This is because CMY does not form an orthogonal basis in the RGB color space. We use a different color space that does.

    The main source of our performance lead is not increased light - our filters let in less light than CMY but a bit more than RGB. It is improved encoding of color signals in the SPATIAL frequency domain, and not the ELECTROMAGNETIC frequency domain.
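The noise penalty of converting CMY to RGB discussed in the two comments above can be illustrated with a small calculation. This sketch assumes ideal filters (C = G + B, M = R + B, Y = R + G) and shot-noise-limited pixels whose variance equals their mean signal. Both are simplifying assumptions, and the sketch does not model the color space used by the proposed CFA.

```python
# Ideal complementary filters (an assumption): each sample sums two primaries.
CMY_FROM_RGB = [
    [0, 1, 1],  # C = G + B
    [1, 0, 1],  # M = R + B
    [1, 1, 0],  # Y = R + G
]
# Inverse of the matrix above, e.g. R = (-C + M + Y) / 2.
RGB_FROM_CMY = [
    [-0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5],
    [0.5, 0.5, -0.5],
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recovered_rgb_variance(rgb):
    """Variance of RGB recovered from CMY samples under shot (Poisson) noise."""
    cmy = [dot(row, rgb) for row in CMY_FROM_RGB]
    # Shot noise: the variance of each CMY sample equals its mean signal.
    # Independent noise adds through the squared conversion coefficients.
    return [sum(c * c * v for c, v in zip(row, cmy)) for row in RGB_FROM_CMY]


gray = [1.0, 1.0, 1.0]
print(dot(CMY_FROM_RGB[0], CMY_FROM_RGB[1]))  # 1, so the rows are not orthogonal
print(recovered_rgb_variance(gray))           # [1.5, 1.5, 1.5]
```

Under the same assumptions, a direct R, G or B sample of this gray patch would have variance 1.0, so the subtraction in the conversion raises the noise even though each CMY pixel collects twice the light.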

  20. @commercial Bayer sensors are OLPF limited TO LESS THAN their (and our) Nyquist limit...

    Would you be so kind as to share any reference/research supporting this claim?

  21. This reminded me of the work of Hirakawa and Wolfe, see here:

