# A 12MP 16-Focal Plane CMOS Image Sensor with 1.75µm Pixel: Architecture and Implementation

Kwang-Bo Cho<sup>1</sup>, Nick Tu<sup>1</sup>, John Brummer<sup>1</sup>, Khandaker Azad<sup>1</sup>, Leo Hsu<sup>1</sup>, Vivian Wang<sup>1</sup>, Dongsoo Kim<sup>1</sup>, Krishna Palle<sup>1</sup>, Tien-Min Miao<sup>1</sup>, Yandong Chen<sup>1</sup>, Canaan Hong<sup>1</sup>, Toan Bao<sup>1</sup>, Vitanshu Sharma<sup>1</sup>, Yuan Fong<sup>1</sup>, Kumudini Irkar<sup>1</sup>, Syed Hashmi<sup>1</sup>, Vinesh Sukumar<sup>1</sup>, Salman Kabir<sup>1</sup>, Gershon Rosenblum<sup>1</sup>, Yong Gao<sup>1</sup>, Kil-Ho Ahn<sup>2</sup>, Hyuk-Jin Ko<sup>2</sup>, Jeff Watson<sup>3</sup>, Chris Kenoyer<sup>3</sup>

Aptina<sup>1</sup>, LLC, San Jose, CA, USA

Aptina Korea Co.<sup>2</sup>, Seoul, Korea

Aptina<sup>3</sup>, LLC, Corvallis, OR, USA

Aptina, LLC, 3080 North First Street, San Jose, CA 95134

Tel: 408-660-2236, email: acho@aptina.com

#### Abstract

This paper presents the world's first CMOS image sensor with 16 focal planes (FPs). Each FP contains a 0.75MP (1000x750) array of 1.75µm 1x4 4-way shared active pixels together with its own analog signal chain and timing control. The sensor provides a total of 12M physical pixels and targets 12MP at 30FPS over a four-lane MIPI interface running at 768Mbps/lane. Each FP can be programmed with its own exposure time and gain, and each pixel array can use either a standard Bayer color filter or a monochromatic filter. The chip is expected to be used with an array lens and image software to realize a range of features, including super-resolution imaging, higher dynamic range imaging, and range estimation using inherent parallax, in a module with low Z-height.

## 1. Introduction

A conventional digital camera has a single image sensor with a Bayer color pattern and a single corresponding lens. To obtain superior color reproduction without Bayer color processing, camera systems that combine a single lens, a 2-way, 3-way, or 4-way prism, and multiple sensors can be used. With two cameras in a binocular stereo system, depth information can be extracted, and the approach can be extended to a trinocular system. Recently, a camera array has been used to capture high-resolution images by tiling multiple cameras paired with a complex lens [1]. However, such a camera array is prohibitively large and expensive because it tiles complete camera systems.

In contrast to the conventional camera, computational camera systems have recently become popular. They combine unconventional optics, image sensors, and image processing to produce new forms of visual information, including wide field-of-view images, high dynamic range images, multispectral images, and depth images. For example, a light-field camera, also called a plenoptic camera, uses a microlens array together with a main camera lens to capture 4D light field information about a scene [2]. However, a plenoptic camera sacrifices effective resolution because pixels are grouped to create each macropixel.

To overcome these difficulties, we can use arrays of image sensors and corresponding lenses to gather more image data and information in a single chip. This type of system, referred to as an Array Imager (AI), may be used to minimize camera thickness, extend depth of focus, increase output resolution through super-resolution processing, and capture depth and multispectral information from a single capture. Although these features are attractive, the AI camera also poses a number of difficult technical challenges. These include parallax artifacts caused by differences in perspective, resolution limits imposed by sampling, and lens manufacturing issues. In this paper, we focus on the image sensor development for 4x4 AI cameras: the architecture, the implementation, and early results.

## 2. Chip Architecture

The goal was to build the world's first CMOS image sensor with 16 (4x4) FPs, each with 1000x750 active pixels, for a total of 12Mpixels. During the architecture phase, we focused on the following key features and requirements.

For each FP, the key requirements are: 1) 30FPS, 2) independent gain and exposure time control, 3) different readout positions with same window size, 4) separate power switch control, and 5) separate calibration and trimming capability. For the 16 focal planes integration, key challenges are: 1) hierarchical design and physical implementation, 2) power and signal integrity challenge in array configuration, 3) matching consideration between focal planes and associated circuitry and performances, 4) large die size, 5) high power consumption, and 6) custom design overhead.



Figure 1: Sensor block diagram

The resulting sensor block diagram with 16 FPs is shown in Figure 1. All 16 FPs are controlled by a global digital block that sits outside the FPs and contains registers that control all the FPs at once. In addition, each FP can be programmed with its own exposure time and gain through local digital control logic. The sensor also employs power gating and clock gating circuits that can turn each FP on and off individually. Data from the 16 FPs are fed to 16 line buffers in the global digital block, one for each FP. The sensor has a 4-lane MIPI interface running at 768Mbps/lane. Preview, 720p, and 1080p modes can operate with only a subset of the FPs powered. These features enable power saving and improve fault tolerance. The sensor is expected to be used with an array lens rather than a conventional single lens.

## 2.1 Architecture of Focal Plane

Figure 2 shows an overview of the FP, which contains the sensor analog core and local digital control logic. The sensor analog core consists of a pixel array that uses 1.75µm 1x4 4-way shared pixels with pixel reset and row select. Pixels on four different rows share the same floating diffusion node. Each pixel column has its own pixel output line. Boosted voltages are used for the pixel transfer gate, reset gate, and row select gate.



Figure 2: Each focal plane block diagram

The active pixel array size is 1000x750, with four additional border pixels on each side. In addition, there are two extra rows for readout channel analog offset calibration. The mini array [3] is used in this design for area efficiency.

The column readout circuit is at one side of the pixel array. The column is laid out in a two-pixel pitch of 3.5µm. Thus, two mirrored columns are stacked together with a column decoder between them. Each column decoder unit controls two column switches.

A serial readout architecture is used in this design to minimize the sensor area. The pixel signals are sampled into column capacitors in parallel and then read out through two channels in series. Two columns are read out at a time with a 30MHz pixel clock. Each readout channel consists of fully differential switched-capacitor gain amplifiers and an analog-to-digital converter (ADC).
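The readout rate can be checked against the 30FPS requirement with a short calculation. The sketch below is only an estimate under stated assumptions: it takes the two readout channels to run at 15MS/s each (the ADC rate listed in Table 1), which gives an aggregate of 30Mpixel/s per FP; it assumes the four border pixels apply to both rows and columns and that the two calibration rows are read every frame; and it ignores blanking and column sampling time. The names used are hypothetical.

```python
# Readout-time budget for one focal plane (FP), ignoring blanking.
ACTIVE_COLS, ACTIVE_ROWS = 1000, 750   # active array, from the text
BORDER = 4                             # border pixels on each side
CAL_ROWS = 2                           # offset-calibration rows
CHANNELS = 2                           # readout channels per FP
ADC_RATE_HZ = 15e6                     # per-channel rate (assumption)


def fp_readout_budget():
    """Return (columns, rows, row_time_s, frame_time_s) for one FP."""
    cols = ACTIVE_COLS + 2 * BORDER
    rows = ACTIVE_ROWS + 2 * BORDER + CAL_ROWS
    pixel_rate_hz = CHANNELS * ADC_RATE_HZ   # aggregate pixel rate
    row_time_s = cols / pixel_rate_hz
    frame_time_s = rows * row_time_s
    return cols, rows, row_time_s, frame_time_s


cols, rows, row_time_s, frame_time_s = fp_readout_budget()
print(f"{cols} x {rows} pixels, row {row_time_s * 1e6:.1f} us, "
      f"frame {frame_time_s * 1e3:.2f} ms (budget {1e3 / 30:.2f} ms)")
```

Under these assumptions, the pure readout time of roughly 25.5ms leaves margin inside the 33.3ms frame period at 30FPS.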

The first programmable gain amplifier (PGA) provides a coarse analog gain of 1x, 2x, or 4x. The second PGA provides a fine analog gain of up to 4x. The two PGAs share one op-amp for power efficiency. When the first PGA is in the amplification phase, the second one is in the sampling phase. Conversely, when the second one is in the amplification phase, the first one is in the pre-charge phase for the next amplification.

The pipeline ADC has 10 stages. Each of the first nine stages has a 1.5-bit output, and the last stage has a 2-bit output. The internal ADC has 11-bit resolution after digital error correction. Half of the 11-bit range is reserved for channel offset, and the final ADC output is 10-bit. This ADC structure allows offset correction to be done entirely in the digital domain. As with the gain amplifiers, the shared op-amp technique is used between stages. While one stage is in the amplification phase, the next stage is in the sampling phase, and vice versa.
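The way nine 1.5-bit stages and a 2-bit final stage combine into an 11-bit code can be illustrated with a behavioral model. The sketch below is a generic textbook pipeline model, not the circuit described here: it assumes an input normalized to the reference (x in [-1, 1)), comparator thresholds near ±1/4, an ideal gain-of-2 residue amplifier, an ideal final flash, and one common offset applied to every comparator. The function name and parameters are hypothetical.

```python
import math


def pipeline_adc(x, comparator_offset=0.0, n_stages=9):
    """Convert x in [-1, 1) (in units of Vref) to an 11-bit code."""
    code = 0
    residue = x
    for _ in range(n_stages):
        # 1.5-bit sub-ADC: two comparators near -Vref/4 and +Vref/4.
        if residue >= 0.25 + comparator_offset:
            c = 2
        elif residue >= -0.25 + comparator_offset:
            c = 1
        else:
            c = 0
        # Residue amplifier: gain of 2 minus the sub-DAC level (-1, 0, +1).
        residue = 2.0 * residue - (c - 1)
        # Digital error correction: shift by one bit, add with overlap.
        code = (code << 1) + c
    # Final 2-bit flash (assumed ideal) resolves the last residue.
    flash = min(3, max(0, math.floor(2.0 * residue + 2.0)))
    return (code << 1) + flash


# Sweep all 2048 code centers with and without comparator offset.
for k in range(2048):
    x = (k + 0.5) / 1024 - 1.0
    assert pipeline_adc(x) == k
    assert pipeline_adc(x, comparator_offset=0.1) == k
```

The sweep shows the property that makes the 1.5-bit stage attractive: as long as the comparator offset stays below a quarter of the reference, the overlapped addition yields the same 11-bit code as the ideal converter.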

## 2.2 Architecture of Digital Block

Considerations during the architecture and design of the digital portion of the sensor included the required focal plane pitch, the tradeoff against area usage, and logic redundancy for better yield. Power, the limited area for routing between the FP logic and the global logic, and the required functionality in the focal planes were also considered. Figure 3 gives an overall view of the digital architecture.



The FP circuits were designed to be placed and repeated as a single instance. Each FP is identified by a hard-wired address in the global hierarchy. Registers in the FPs are decoded locally by comparing the hard-wired address against global "write index" and "read index" addresses. The register interface to the sensor follows the standard two-wire specification, while the data interface follows the MIPI CSI-2 standard. A 4-lane MIPI interface clocked at 768Mbps/lane allows frame rates of up to 30FPS for 16 FPs in compressed 8-bit format, or 24FPS for 16 FPs in native 10-bit pixel format. Depending on the usage scenario, any number of FPs between 0 and 16 can be enabled, which reduces power and lowers the rate on the data interface. The FPs can be disabled in three modes: (1) a destructive mode that removes all digital and analog power, apart from a small domain that maintains the register interface; (2) a non-destructive mode that powers down only the analog signal interface, providing a usable first frame after exiting the power-down state; and (3) a non-destructive mode that excludes the data from the MIPI stream and can be combined independently with either of the first two.
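The quoted frame rates follow from the raw link capacity. The sketch below computes an upper bound on the frame rate from the lane count, lane rate, and pixel format; it counts active pixels only and ignores CSI-2 packet overhead, blanking, and border pixels, so the achievable rate is somewhat lower. The function and constant names are hypothetical.

```python
LANES = 4
LANE_RATE_BPS = 768e6            # per-lane MIPI rate from the text
PIXELS_PER_FP = 1000 * 750       # active pixels per focal plane


def max_frame_rate(bits_per_pixel, active_focal_planes=16):
    """Upper bound on frames per second for the raw link capacity."""
    if not 1 <= active_focal_planes <= 16:
        raise ValueError("active_focal_planes must be between 1 and 16")
    bits_per_frame = active_focal_planes * PIXELS_PER_FP * bits_per_pixel
    return LANES * LANE_RATE_BPS / bits_per_frame


print(f"8-bit,  16 FPs: {max_frame_rate(8):.1f} frames/s")
print(f"10-bit, 16 FPs: {max_frame_rate(10):.1f} frames/s")
print(f"10-bit,  8 FPs: {max_frame_rate(10, 8):.1f} frames/s")
```

The bounds of 32.0 and 25.6 frames per second for the 8-bit and 10-bit formats are consistent with the stated 30FPS and 24FPS once protocol overhead is allowed for, and enabling fewer FPs relaxes the link budget proportionally.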

The sensor uses three primary asynchronous clock domains generated on chip: (1) one for the FPs, limited to 30MHz by the pixel timing requirements, (2) one for the interface logic, limited to 96MHz, and (3) one for the high-speed MIPI I/O, limited to 768MHz. The clocks are generated in the global logic domain and distributed to the FPs. For power reasons, the focal plane clock domain, being the slowest of the three, is used throughout the global register interface, focal plane domain, and addressing logic. One row of image data is sampled from each active FP on the same clock edge, clocked serially into 16 first-in, first-out (FIFO) buffers, and then sampled serially for readout in groups of 4 pixels by the interface logic. One FIFO per focal plane ensures better redundancy. Line buffers are kept in the global logic domain for easier routing.
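The three clock rates can be cross-checked against each other. The sketch below assumes, as an interpretation not stated in the text, that the interface logic moves one group of 4 pixels per 96MHz clock cycle; under that assumption it compares the interface throughput with the pixel rate needed for 16 FPs at 30FPS and with the MIPI link capacity. All names are hypothetical.

```python
INTERFACE_CLOCK_HZ = 96e6
PIXELS_PER_GROUP = 4             # one group per interface clock (assumption)
LANES = 4
LANE_RATE_BPS = 768e6
PIXELS_PER_FP = 1000 * 750


def interface_pixel_rate():
    """Pixels per second the interface logic can move."""
    return INTERFACE_CLOCK_HZ * PIXELS_PER_GROUP


def required_pixel_rate(active_focal_planes=16, frames_per_second=30):
    """Active pixels per second produced by the enabled focal planes."""
    return active_focal_planes * PIXELS_PER_FP * frames_per_second


def link_bits_per_interface_clock():
    """Bits the 4-lane link carries during one interface clock cycle."""
    return LANES * LANE_RATE_BPS / INTERFACE_CLOCK_HZ


print(f"interface: {interface_pixel_rate() / 1e6:.0f} Mpixel/s, "
      f"required: {required_pixel_rate() / 1e6:.0f} Mpixel/s, "
      f"link bits per interface clock: {link_bits_per_interface_clock():.0f}")
```

With these numbers the interface moves 384Mpixel/s against a requirement of 360Mpixel/s, and the link carries 32 bits per interface clock, which equals one group of four 8-bit pixels.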

The image window positions of each focal plane can also be individually programmed. MIPI data can be compressed from native 10-bit to 8-bit using truncation, with DPCM (Differential Pulse Code Modulation).

Correction logic for the focal planes is local to the FPs for better redundancy and area utilization, and consists of dark current and analog circuit offset subtraction. In addition, digital gain, pedestal offset, and pixel timing control logic are kept local to the FPs. Array addressing logic is split between global and local logic for better control and area utilization, and the two parts communicate over a custom serial protocol designed to minimize interconnects. Integration time (coarse and fine) and row addressing are kept local to each FP.

The global logic also contains read-only memory as well as one-time programmable memory (OTPM) with 16 register-selectable contexts to switch between different programmable operating modes. Two register address domains, one customer specific and one manufacturer specific, are maintained using a lookup table. Moreover, 17 scan chains, one for each FP and one for the global logic, as well as memory BIST (built-in self-test), ensure testability. Defective focal planes can be permanently excluded at manufacture using a special register in the OTPM domain.

## 3. Chip Implementation

The sensor was fabricated in a 130nm CMOS 2P 4M process and the die outline is shown in Figure 4.

The key challenges in physical implementation are the integration of the 16 FPs within a hierarchical design flow; power, clock skew, and signal integrity in the array configuration; and matching between FPs and their associated circuitry and performance. In addition, the large die size, high power consumption, and custom design overhead have to be considered.



Figure 4: Die outline

To get better power integrity, power analysis tools were run to estimate the power consumption of the auto-routed block. Figure 5 shows static and dynamic power analysis results to show power drop and current density at chip level.

Figure 6 shows initial 4x4 array image results from monochrome pixels with a prototype array lens, Figure 7 shows a color image from silicon with the array lens after array image processing under a 50lux light condition, and Table 1 lists the key parameters.





Figure 5: Power analysis (a) Static analysis (Worst 39mV drop) (b) Dynamic analysis (Worst 81mV drop)





Figure 6: Images from silicon with array lens (a) 4x4 array image from all 16 focal planes (b) Image from single focal plane.



Figure 7: Color Image from silicon with array lens after array image processing at 50lux light condition.

Table 1: Key parameters

| Parameter         | Value                   |
|-------------------|-------------------------|
| Process           | 130nm CMOS 2P 4M        |
| Pixel             | 1x4 4-way shared 1.75µm |
| Conversion Gain   | 97µV/e-                 |
| Linear Full Well  | 5300e-                  |
| ADC               | 15Ms/s@11-bit           |
| Read Noise@Dark   | 3.6e-@16x gain          |
| Power Consumption | 713mW@12M, 30FPS        |
| Die Size          | 10.8mm x 10.25mm        |
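Two commonly quoted figures can be derived from the conversion gain, linear full well, and read noise listed in Table 1. The sketch below computes the voltage swing at linear full well and a dynamic range estimate. These derived values are not reported in the paper; the dynamic range in particular combines a read noise measured at 16x gain with a full well that presumably applies at unity gain, so it is best read as an optimistic estimate. The names are hypothetical.

```python
import math

CONVERSION_GAIN_V_PER_E = 97e-6   # 97 uV per electron
LINEAR_FULL_WELL_E = 5300         # electrons
READ_NOISE_E = 3.6                # electrons, measured at 16x gain


def full_well_swing_v():
    """Signal swing at linear full well, in volts."""
    return CONVERSION_GAIN_V_PER_E * LINEAR_FULL_WELL_E


def dynamic_range_db():
    """Ratio of linear full well to read noise, in decibels."""
    return 20.0 * math.log10(LINEAR_FULL_WELL_E / READ_NOISE_E)


print(f"full-well swing: {full_well_swing_v() * 1e3:.0f} mV")
print(f"dynamic range estimate: {dynamic_range_db():.1f} dB")
```

Under these assumptions the swing is about 514mV and the estimate is a little over 63dB.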

## 4. Conclusions

This AI camera will not only provide the lowest Z-height, but will also provide other value-added features such as super-resolution imaging, depth sensing, refocusing, good mid- and low-light performance, and high dynamic range. A key benefit of the AI camera is that it enables a "software-based or reconfigurable camera", in which the end application can change simply by reconfiguring the arrays through a software upgrade in the end product. For example, in a smartphone, the application or user can trade off speed, resolution, dynamic range, depth measurement, and other parameters just by changing the software. The AI camera represents a disruptive technology and has strong potential to be one of the next big trends in the imaging world. We hope our work helps to extend this exciting imaging area and sends a strong message about the future of imaging with the AI camera.

## Acknowledgements

The authors would like to thank all of the staff and management at Aptina for their contribution to this design.

## References

[1] B. Wilburn *et al.*, "High Performance Imaging Using Large Camera Arrays," ACM Transactions on Graphics, 24(3):776, 2005.

[2] E. H. Adelson and J. Y. A. Wang, "Single Lens Stereo with a Plenoptic Camera," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2, pp. 99-106, February 1992.

[3] R. Johansson *et al.*, "A 1/13-inch 30fps VGA SOC CMOS Image Sensor with Shared Reset and Transfer Gate Pixel Control," IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 2011.