HK1173295A - Techniques for synchronizing audio and video data in an image signal processing system
- Publication number
- HK1173295A (HK13100520.4A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- image
- register
- pixel
- audio
- data
- Prior art date
Description
Technical Field
The present disclosure relates generally to digital imaging devices and, more particularly, to systems and methods for processing image data obtained with an image sensor of a digital imaging device.
Background
The background section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present technology that are described and/or claimed below. The following discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
In recent years, digital imaging devices have become increasingly popular, due at least in part to such devices becoming more and more affordable for the average consumer. Further, in addition to the many stand-alone digital cameras currently available on the market, it is not uncommon for a digital imaging device to be integrated as part of another electronic device, such as a desktop or notebook computer, a cellular telephone, or a portable media player.
To obtain image data, most digital imaging devices include an image sensor that provides a number of light detecting elements (e.g., photodetectors) configured to convert light detected by the image sensor into electrical signals. The image sensor may also include a color filter array that filters light captured by the image sensor to capture color information. Image data captured by the image sensor is then processed by an image processing pipeline, which may apply a variety of different image processing operations to the image data to generate a full color image that may be displayed on a display device, such as a monitor, for viewing.
While conventional image processing techniques generally aim to produce a viewable image that is both objectively and subjectively pleasing to a viewer, such conventional techniques may not adequately address errors and/or distortions introduced into the image data by the imaging device and/or the image sensor. For example, defective pixels on an image sensor, which may be due to manufacturing defects or operational failure, may fail to sense light levels accurately and, if uncorrected, may manifest as artifacts in the resulting processed image. In addition, light intensity fall-off at the edges of the image sensor, which may be due to imperfections in the manufacture of the lens, may adversely affect intensity measurements and may result in an image whose overall light intensity is non-uniform. The image processing pipeline may also perform one or more processes to sharpen the image. Conventional sharpening techniques, however, may not adequately account for existing noise in the image signal, or may be unable to distinguish noise from edges and textured areas in the image. In such cases, conventional sharpening techniques may actually increase the appearance of noise in the image, which is generally undesirable. In addition, various additional image processing steps may be performed, some of which rely on image statistics collected by a statistics collection engine.
Another image processing operation that may be applied to the image data captured by the image sensor is a demosaicing operation. Because a color filter array typically provides color data at only one wavelength per sensor pixel, a complete set of color data is generally interpolated for each color channel in order to reproduce a full color image (e.g., an RGB image). Conventional demosaicing techniques typically interpolate values for the missing color data in a horizontal or a vertical direction based upon some type of fixed threshold. However, such conventional demosaicing techniques may not adequately account for the location and direction of edges within the image, which may result in edge artifacts, such as aliasing, checkerboard artifacts, or rainbow artifacts, being introduced into the full color image, particularly along diagonal edges within the image.
Accordingly, when processing digital images obtained with a digital camera or other imaging device, various considerations should be addressed to improve the appearance of the resulting image. In particular, certain aspects of the present disclosure below address one or more of the deficiencies briefly mentioned above.
Disclosure of Invention
The following sets forth a summary of certain embodiments disclosed herein. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, the present disclosure may encompass a variety of aspects that may not be set forth below.
The present disclosure provides and illustrates various embodiments of image signal processing techniques. In particular, the disclosed embodiments may relate to the processing of image data using a back-end image processing unit, the arrangement and configuration of line buffers used to implement raw pixel processing logic, techniques for managing the movement of pixel data in the presence of an overflow (also referred to as overrun) condition, techniques for synchronizing video and audio data, and techniques relating to various pixel memory formats that may be used to store pixel data to and read pixel data from memory.
In terms of back-end processing, the disclosed embodiments provide an image signal processing system that includes a back-end pixel processing unit that receives pixel data after processing by at least one of a front-end pixel processing unit and a pixel processing pipeline. In some embodiments, the back-end processing unit receives luma/chroma image data and may be configured to apply face detection operations, local tone mapping, brightness, contrast, and color adjustments, as well as scaling. Further, the back-end processing unit may also include a back-end statistics unit that may collect frequency statistics. The frequency statistics may be provided to an encoder and may be used to determine the quantization parameters to be applied to an image frame.
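As a hedged illustration of how such frequency statistics might feed an encoder, the sketch below computes a simple gradient-based activity measure for each 16 × 16 block of luma data and maps it to a quantization parameter. The block size, the activity metric, the qp_for_block mapping, and all names are assumptions made for illustration only; the disclosure does not prescribe a particular statistic or scaling.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative only: a gradient-based "frequency" statistic for one
 * 16x16 luma block at block coordinates (bx, by). Higher values indicate
 * more high-frequency detail. Loops stop one pixel short of the block
 * edge so no neighbor access leaves the block. */
static int block_activity(const uint8_t *luma, int stride, int bx, int by)
{
    int act = 0;
    for (int y = 0; y < 15; y++) {
        for (int x = 0; x < 15; x++) {
            const uint8_t *p = luma + (by * 16 + y) * stride + (bx * 16 + x);
            act += abs(p[1] - p[0]);        /* horizontal gradient */
            act += abs(p[stride] - p[0]);   /* vertical gradient   */
        }
    }
    return act;
}

/* Map the statistic to a quantization parameter: busier blocks can
 * usually be quantized more coarsely. Base QP and scaling are arbitrary. */
static int qp_for_block(int activity)
{
    int qp = 22 + activity / 4096;
    return qp > 40 ? 40 : qp;
}
```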
Another aspect of the present disclosure relates to the implementation of a raw pixel processing unit using a set of line buffers. In one embodiment, the set of line buffers may include a first subset and a second subset. Various logic units of the raw pixel processing unit may be implemented in a shared manner using the first and second subsets of line buffers. For example, in one embodiment, defective pixel correction and detection logic may be implemented using the first subset of line buffers. The second subset of line buffers may be used to implement lens shading correction logic, gain, offset, and clamping logic, and demosaicing logic. Further, noise reduction may be implemented using at least a portion of each of the first and second subsets of line buffers.
Another aspect of the disclosure relates to an image signal processing system that includes overflow control logic that detects an overflow condition when a sensor input queue and/or a front-end processing unit receives back pressure from a downstream destination unit. The image signal processing system may also include a flash controller configured to enable a flash device prior to the start of a target image frame using a sensor timing signal. In one embodiment, the flash controller receives a delayed sensor timing signal and determines a flash enable start time by using the delayed sensor timing signal to identify a time corresponding to the end of the previous frame, adding a vertical blanking interval time, and then subtracting a first offset to compensate for the delay between the sensor timing signal and the delayed sensor timing signal. The flash controller then subtracts a second offset to determine the flash enable time, thus ensuring that the flash is enabled before the first pixel of the target frame is received.

Other aspects of the disclosure provide techniques related to audio-video synchronization. In one embodiment, a time code register provides a current timestamp when sampled. The value of the time code register may be incremented at regular intervals based upon a clock of the image signal processing system. At the start of a current frame acquired by the image sensor, the time code register is sampled, and a timestamp is stored into a timestamp register associated with the image sensor. The timestamp is then read from the timestamp register and written into a set of metadata associated with the current frame. The timestamp stored in the frame metadata may then be used to synchronize the current frame with a corresponding set of audio data.
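The following C sketch illustrates the two timing mechanisms just described. The register names, addresses, and the read_reg/write_reg helpers are hypothetical, introduced only so the sequence of operations can be shown; they do not reflect the register map of any particular implementation.

```c
#include <stdint.h>

/* Hypothetical register accessors and addresses, for illustration only. */
extern uint32_t read_reg(uint32_t addr);
extern void     write_reg(uint32_t addr, uint32_t value);

#define TIME_CODE_REG   0x0100u  /* free-running counter, incremented on the ISP clock */
#define SENSOR0_TS_REG  0x0104u  /* timestamp latched for image sensor 0               */

struct frame_metadata {
    uint32_t timestamp;          /* copied into the frame's metadata set */
    /* ... other per-frame metadata ... */
};

/* At the start of a frame from sensor 0 (e.g., on its VSYNC), sample the
 * time code register, latch the value in the sensor's timestamp register,
 * and write it into the frame metadata. Audio samples stamped against the
 * same time base can later be aligned with the frame using this value. */
void on_frame_start(struct frame_metadata *md)
{
    write_reg(SENSOR0_TS_REG, read_reg(TIME_CODE_REG));
    md->timestamp = read_reg(SENSOR0_TS_REG);
}

/* Flash enable time as described above: take the end of the previous frame
 * as seen in the delayed sensor timing signal, add the vertical blanking
 * interval, subtract a first offset (the delay between the sensor timing
 * signal and its delayed copy), then subtract a second offset so the flash
 * is on before the first pixel of the target frame arrives. All values are
 * assumed to be in the same time units. */
uint64_t flash_enable_time(uint64_t prev_frame_end_delayed,
                           uint64_t vblank_interval,
                           uint64_t offset1,
                           uint64_t offset2)
{
    return prev_frame_end_delayed + vblank_interval - offset1 - offset2;
}
```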
Another aspect of the disclosure provides a flexible memory input/output controller configured to support the saving and reading of multiple pixel and pixel memory formats. For example, the memory I/O controller may support storage and reading of raw image pixels of various bit precisions, such as 8 bits, 10 bits, 12 bits, 14 bits, and 16 bits. Pixel formats that are not byte aligned with memory (e.g., not a multiple of 8 bits) may be saved in a packed manner. The memory I/O controller may also support RGB pixel groups and YCC pixel groups in various formats.
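As one example of storing a non-byte-aligned format in a packed manner, the sketch below packs 10-bit raw pixels so that four pixels occupy five bytes, with the low-order two bits of each pixel gathered into a trailing byte. This particular layout is an assumption made for illustration; the memory I/O controller described above may use a different packing arrangement.

```c
#include <stdint.h>

/* Pack 10-bit raw pixels (stored one per uint16_t) into a byte stream:
 * four pixels -> five bytes. Bytes 0-3 hold the upper 8 bits of each
 * pixel; byte 4 holds the four 2-bit remainders. For brevity, n_pixels
 * is assumed to be a multiple of four. */
void pack_raw10(const uint16_t *src, uint8_t *dst, int n_pixels)
{
    for (int i = 0; i + 3 < n_pixels; i += 4) {
        dst[0] = (uint8_t)(src[i + 0] >> 2);
        dst[1] = (uint8_t)(src[i + 1] >> 2);
        dst[2] = (uint8_t)(src[i + 2] >> 2);
        dst[3] = (uint8_t)(src[i + 3] >> 2);
        dst[4] = (uint8_t)(((src[i + 0] & 0x3))      |
                           ((src[i + 1] & 0x3) << 2) |
                           ((src[i + 2] & 0x3) << 4) |
                           ((src[i + 3] & 0x3) << 6));
        dst += 5;
    }
}
```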
Various refinements of the features noted above may exist in relation to the various aspects of the present disclosure. Further features may also be incorporated into these various aspects as well. These refinements and additional features may exist individually or in any combination. For instance, various features discussed below in relation to one or more of the illustrated embodiments may be incorporated into any of the above-described aspects of the present disclosure, alone or in any combination. Again, the brief summary presented above is intended only to familiarize the reader with certain aspects and contexts of embodiments of the present disclosure without limitation to the claimed subject matter.
Drawings
This patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Various aspects of the disclosure may be better understood upon reading the following detailed description with reference to the drawings, in which:
FIG. 1 is a simplified block diagram depicting components of one example of an electronic device including an imaging device and image processing circuitry configured to implement one or more image processing techniques set forth in the present disclosure;
fig. 2 shows a graphical representation of a 2 x 2 block of pixels of a Bayer color filter matrix that may be implemented in the imaging device of fig. 1.
FIG. 3 is a perspective view of the electronic device of FIG. 1 in the form of a laptop computing device, in accordance with aspects of the present disclosure;
FIG. 4 is a front view of the electronic device of FIG. 1 in the form of a desktop computing device, in accordance with aspects of the present disclosure;
fig. 5 is a front view of the electronic device of fig. 1 in the form of a handheld portable electronic device, in accordance with aspects of the present disclosure;
FIG. 6 is a rear view of the electronic device shown in FIG. 5;
FIG. 7 is a block diagram illustrating one embodiment of the image processing circuit of FIG. 1 including front-end Image Signal Processing (ISP) logic and ISP pipeline (pipe) processing logic, in accordance with aspects of the present disclosure;
Fig. 8 is a block diagram illustrating another embodiment of the image processing circuit of fig. 1 comprising front-end Image Signal Processing (ISP) logic, ISP pipeline processing logic, and ISP back-end processing logic, in accordance with aspects of the present disclosure;
FIG. 9 is a flow chart describing a method of processing image data using the image processing circuit of FIG. 7 or FIG. 8, in accordance with various aspects of the present disclosure;
FIG. 10 is a more detailed block diagram representation of one embodiment of ISP front-end logic that may be implemented in FIG. 7 or FIG. 8 in accordance with aspects of the present disclosure;
FIG. 11 is a flow diagram representing a method for processing image data in the ISP front-end logic of FIG. 10 in accordance with one embodiment;
FIG. 12 is a block diagram illustrating the structure of double buffer registers and control registers that may be used to process image data in ISP front-end logic, in accordance with one embodiment;
FIGS. 13-15 are timing diagrams depicting different modes for triggering the processing of an image frame, in accordance with embodiments of the present technique;
FIG. 16 is a diagram illustrating a control register in greater detail, according to one embodiment;
FIG. 17 is a flow chart depicting a method for processing an image frame using a front-end pixel processing unit when the ISP front-end logic of FIG. 10 is operating in a single sensor mode;
FIG. 18 is a flow chart depicting a method for processing an image frame using the front-end pixel processing unit when the ISP front-end logic of FIG. 10 is operating in dual sensor mode;
FIG. 19 is a flow chart depicting a method for processing an image frame using a front-end pixel processing unit when the ISP front-end logic of FIG. 10 is operating in dual sensor mode;
FIG. 20 is a flow diagram depicting a method, according to one embodiment, in which both image sensors are active, but wherein the first image sensor is sending image frames to the front-end pixel processing unit while the second image sensor is sending image frames to the statistics processing unit, so that imaging statistics for the second sensor are immediately available when the second image sensor later begins sending image frames to the front-end pixel processing unit;
FIG. 21 is a graphical depiction of a linear memory addressing format that may be applied to a pixel format stored in a memory of the electronic device of FIG. 1, in accordance with various aspects of the present disclosure;
FIG. 22 is a graphical depiction of a partitioned (tiled) memory addressing format that may be applied to a pixel format stored in a memory of the electronic device of FIG. 1, in accordance with various aspects of the present disclosure;
FIG. 23 is a graphical depiction of various imaging regions defined within a source image frame captured by an image sensor in accordance with various aspects of the present disclosure;
FIG. 24 is a graphical depiction of a technique for processing overlapping vertical stripes of image frames using an ISP front-end processing unit;
FIG. 25 is a diagram depicting how byte swapping may be applied to incoming image pixel data received from memory using a swap code, in accordance with various aspects of the present disclosure;
FIGS. 26-29 illustrate examples of memory formats for raw image data that may be supported by the image processing circuitry of FIG. 7 or FIG. 8, in accordance with various embodiments of the present disclosure;
FIGS. 30-34 illustrate examples of memory formats for full color RGB image data that may be supported by the image processing circuit of FIG. 7 or FIG. 8, in accordance with various embodiments of the present disclosure;
FIGS. 35 and 36 illustrate examples of memory formats for luminance/chrominance image data (YUV/YC1C2) that may be supported by the image processing circuit of FIG. 7 or FIG. 8, in accordance with various embodiments of the present disclosure;
FIG. 37 illustrates an example of how frame locations in a memory in a linear addressing format may be determined, in accordance with various aspects of the present disclosure;
FIG. 38 illustrates an example of how frame locations in a memory in a block addressing format may be determined, in accordance with various aspects of the present disclosure;
FIG. 39 is a block diagram of the ISP circuit of FIG. 8 depicting how overflow handling occurs in accordance with one embodiment of the present disclosure;
FIG. 40 is a flow chart depicting a method for overflow handling when an overflow condition occurs while reading image pixel data from a picture memory, in accordance with various aspects of the present disclosure;
FIG. 41 is a flow chart describing a method for overflow handling when an overflow condition occurs while reading in image pixel data from an image sensor interface, according to one embodiment of the present disclosure;
FIG. 42 is a flow chart describing another method for overflow handling when an overflow condition occurs while reading in image pixel data from an image sensor interface, in accordance with another embodiment of the present disclosure;
FIG. 43 is a graphical depiction of an image (e.g., video) and corresponding audio data that may be captured and saved by the electronic device of FIG. 1;
FIG. 44 illustrates a set of registers that may be used to provide timestamps to synchronize the audio and video data of FIG. 43, according to one embodiment;
FIG. 45 is a simplified representation of an image frame that may be captured as part of the video data of FIG. 43 and illustrates how timestamp information is saved as part of the image frame metadata, in accordance with aspects of the present disclosure;
FIG. 46 is a flow diagram that describes a method for synchronizing image data with audio data using VSYNC signal based time stamps, according to one embodiment;
FIG. 47 is a block diagram of the ISP circuit of FIG. 8 depicting how flash timing control may be performed according to one embodiment of the present disclosure;
FIG. 48 is a flow diagram depicting a technique for determining flash enable and disable times according to one embodiment of the present disclosure;
FIG. 49 is a flow chart describing a method of determining a flash on time in accordance with the technique shown in FIG. 48;
FIG. 50 is a flow diagram depicting a method for updating image statistics using pre-flashes prior to acquiring an image scene with flashes, in accordance with various aspects of the present disclosure;
FIG. 51 is a block diagram providing a more detailed view of one embodiment of the ISP front-end pixel processing unit shown in the ISP front-end logic of FIG. 10, in accordance with various aspects of the present disclosure;
FIG. 52 is a process diagram illustrating how temporal filtering may be applied to image pixel data received by the ISP front-end pixel processing unit shown in FIG. 51, in accordance with one embodiment;
FIG. 53 illustrates a set of reference image pixels and a corresponding set of current image pixels that may be used to determine one or more parameters of the temporal filtering process shown in FIG. 52;
FIG. 54 is a flow diagram illustrating a process of applying temporal filtering to a current image pixel in a set of image data, according to one embodiment;
FIG. 55 is a flow diagram representing a technique to calculate motion delta values for use in temporal filtering of the current image pixels of FIG. 54, according to one embodiment;
FIG. 56 is a flow diagram illustrating another process for applying temporal filtering to a current image pixel in a set of image data, including using different gains for each color component of the image data, in accordance with another embodiment;
FIG. 57 is a process diagram illustrating a temporal filtering technique using separate motion tables and luma tables for each color component of the image pixel data received by the ISP front-end pixel processing unit shown in FIG. 51, in accordance with yet another embodiment;
FIG. 58 is a flow diagram illustrating a process for applying temporal filtering to a current image pixel in a set of image data using the motion table and brightness table shown in FIG. 57 in accordance with yet another embodiment;
FIG. 59 depicts a sample of full resolution raw image data that may be captured with an image sensor, in accordance with various aspects of the present disclosure;
FIG. 60 illustrates an image sensor that may be configured to apply binning to the full resolution raw image data of FIG. 59 to output samples of binned raw image data, in accordance with one embodiment of the present disclosure;
FIG. 61 depicts a sample of binned raw image data that may be provided by the image sensor of FIG. 60, in accordance with various aspects of the present disclosure;
FIG. 62 depicts the binned raw image data of FIG. 61 after being resampled by a binning compensation filter, in accordance with aspects of the present disclosure;
FIG. 63 depicts a binning compensation filter that may be implemented in the ISP front-end pixel processing unit of FIG. 51, in accordance with one embodiment;
FIG. 64 is a graphical depiction of various steps that may be applied by a differential analyzer to select center input pixels and index/phases for binning compensation filtering, in accordance with various aspects of the present disclosure;
FIG. 65 is a flow diagram illustrating a process for scaling image data using the binning compensation filter of FIG. 63, according to one embodiment;
FIG. 66 is a flow diagram illustrating a process for determining the current input source center pixel for the horizontal and vertical filtering performed by the binning compensation filter of FIG. 63, according to one embodiment;
FIG. 67 is a flow diagram illustrating a process for determining an index for selecting filtering coefficients for the horizontal and vertical filtering performed by the binning compensation filter of FIG. 63, in accordance with one embodiment;
FIG. 68 is a more detailed block diagram representation of one embodiment of a statistics processing unit that may be implemented in the ISP front-end processing logic shown in FIG. 10 in accordance with aspects of the present disclosure;
FIG. 69 illustrates various image frame boundary conditions that may be considered when applying detection and correction of defective pixels in the statistics processing of the statistics processing unit of FIG. 68, in accordance with various aspects of the present disclosure;
FIG. 70 is a flow diagram illustrating a process for defective pixel detection and correction during statistical information processing according to one embodiment;
FIG. 71 shows a three-dimensional profile depicting the relationship of light intensity to pixel location for a conventional lens of an imaging device;
FIG. 72 is a color map representing non-uniform light intensity (possibly due to lens shading irregularities) across an image;
FIG. 73 is an illustration of an original imaging frame including a lens shading correction area and a gain grid, in accordance with aspects of the present disclosure;
FIG. 74 illustrates interpolation of gain values for image pixels surrounded by four boundary grid gain points, in accordance with various aspects of the present disclosure;
FIG. 75 is a flow chart illustrating a process of determining an interpolation gain value applicable to an imaging pixel during a lens shading correction operation, in accordance with one embodiment of the present technique;
FIG. 76 is a three-dimensional profile depicting interpolation gain values that may be applied to an image exhibiting the light intensity characteristics shown in FIG. 71 when performing lens shading correction, in accordance with various aspects of the present disclosure;
FIG. 77 shows the color map of FIG. 72 after application of a lens shading correction operation exhibiting improved light intensity uniformity, in accordance with various aspects of the present disclosure;
FIG. 78 illustrates how the radial distance between the current pixel and the center of the image is calculated and used to determine the radial gain component of lens shading correction, in accordance with one embodiment;
FIG. 79 is a flow chart illustrating a process for using the radial gain and interpolation gain of the gain grid to determine the total gain applicable to an imaged pixel in a lens shading correction operation in accordance with one embodiment of the present technique;
FIG. 80 is a diagram showing a white region in a color space and a low color temperature axis and a high color temperature axis;
FIG. 81 is a table showing how white balance gains are set for various reference lighting conditions, according to one embodiment;
FIG. 82 is a block diagram representation of a statistics collection engine that may be implemented in ISP front-end processing logic according to one embodiment of the present disclosure;
FIG. 83 illustrates downsampling of raw Bayer RGB data, in accordance with aspects of the present disclosure;
FIG. 84 depicts a two-dimensional color histogram that may be collected using the statistics collection engine of FIG. 82, according to one embodiment;
FIG. 85 depicts zooming and panning within a two-dimensional color histogram;
FIG. 86 is a diagram illustrating in greater detail the logic for implementing the pixel filter of the statistics gathering engine, according to one embodiment;
FIG. 87 is a graphical depiction of how the position of a pixel within the C1-C2 color space may be evaluated based on pixel conditions defined for the pixel filter, according to one embodiment;
FIG. 88 is a graphical depiction of how the position of a pixel within the C1-C2 color space may be evaluated based on pixel conditions defined for a pixel filter, in accordance with another embodiment;
FIG. 89 is a graphical depiction of how the position of a pixel within the C1-C2 color space is evaluated based on pixel conditions defined for the pixel filter, in accordance with yet another embodiment;
FIG. 90 is a diagram showing how image sensor integration time may be determined to compensate for flicker, according to one embodiment;
FIG. 91 is a block diagram detailing logic that may be implemented in the statistics collection engine of FIG. 82 and configured to collect autofocus statistics, according to one embodiment;
FIG. 92 is a diagram depicting a technique for autofocus using coarse and fine autofocus score values, according to an embodiment;
FIG. 93 is a flowchart describing a process for autofocus using the coarse and fine autofocus score values, according to one embodiment;
FIGS. 94 and 95 show the decimation of raw Bayer data for obtaining white balance luminance values;
FIG. 96 illustrates a technique for autofocusing using the relative autofocus score values for each color component, according to an embodiment;
FIG. 97 is a more detailed diagram of the statistical information processing unit of FIG. 68 showing how Bayer RGB histogram data may be used to assist in black level compensation, according to one embodiment;
FIG. 98 is a block diagram representation of an embodiment of ISP pipeline processing logic of FIG. 7, in accordance with aspects of the present disclosure;
FIG. 99 is a diagram representing, in greater detail, an embodiment of a raw pixel processing block that may be implemented in the ISP pipeline processing logic of FIG. 98, in accordance with various aspects of the present disclosure;
FIG. 100 is a graph representing various image frame boundary conditions that may be considered when applying techniques for detecting and correcting defective pixels during processing by the raw pixel processing block shown in FIG. 99, in accordance with various aspects of the present disclosure;
FIGS. 101-103 are flow diagrams depicting various processes that may be performed in the raw pixel processing block of FIG. 99 to detect and correct defective pixels, according to one embodiment;
FIG. 104 is a representation of the positions of two green pixels in a 2 x 2 pixel block of a Bayer image sensor that may be interpolated when a green non-uniformity correction technique is applied in the processing of the raw pixel processing block of FIG. 99, in accordance with various aspects of the present disclosure;
FIG. 105 illustrates a set of pixels including a center pixel and associated horizontally adjacent pixels that may be used as part of a horizontal filtering process for noise reduction, in accordance with various aspects of the present disclosure;
FIG. 106 illustrates a set of pixels, including a center pixel and associated vertically neighboring pixels, that can be used as part of a vertical filtering process for noise reduction, in accordance with various aspects of the present disclosure;
FIG. 107 is a simplified flowchart describing how demosaicing is applied to the raw Bayer image pattern to produce a full color RGB image;
FIG. 108 depicts a set of pixels of a Bayer image pattern from which horizontal and vertical energy components may be derived during demosaicing of the Bayer image pattern to interpolate green color values, in accordance with one embodiment;
FIG. 109 illustrates a set of horizontal pixels to which filtering may be applied during demosaicing of a Bayer image pattern to determine a horizontal component of an interpolated green color value, in accordance with aspects of the present technique;
FIG. 110 illustrates a set of vertical pixels to which filtering may be applied during demosaicing of a Bayer image pattern to determine a vertical component of an interpolated green color value, in accordance with aspects of the present technique;
FIG. 111 illustrates various 3 × 3 blocks of pixels of a Bayer image pattern to which filtering may be applied during demosaicing to determine interpolated red and blue color values, in accordance with aspects of the present technique;
FIG. 112 & 115 provide a flow diagram describing various processes for interpolating green, red and blue color values during demosaicing of a Bayer image pattern, according to one embodiment;
FIG. 116 is a color diagram representing an original image scene that may be captured with an image sensor and processed in accordance with aspects of the demosaicing techniques disclosed herein;
FIG. 117 shows a color map of a Bayer image pattern of the image scene shown in FIG. 116;
FIG. 118 is a color map of an RGB image reconstructed using conventional demosaicing techniques, according to the Bayer image pattern of FIG. 117;
FIG. 119 is a color map of an RGB image reconstructed from the Bayer image pattern of FIG. 117, in accordance with aspects of the demosaicing technique disclosed herein;
FIGS. 120-123 depict the structure and arrangement of line buffers that may be used to implement the raw pixel processing block of FIG. 99, according to one embodiment;
FIG. 124 is a flowchart illustrating a method of processing raw pixel data using the line buffer structure shown in FIGS. 120-123, according to one embodiment;
FIG. 125 is a diagram representing more detail for one embodiment of RGB processing blocks that may be implemented in the ISP pipeline processing logic of FIG. 98, in accordance with various aspects of the present disclosure;
FIG. 126 is a diagram representing more details of one embodiment of a YCbCr processing block that may be implemented in the ISP pipeline processing logic of FIG. 98, in accordance with various aspects of the present disclosure;
FIG. 127 is a graphical depiction of active areas of luminance and chrominance defined within a source buffer utilizing a 1-plane format, in accordance with aspects of the present disclosure;
FIG. 128 is a graphical depiction of active regions of luminance and chrominance defined within a source buffer utilizing a 2-plane format in accordance with various aspects of the present disclosure;
FIG. 129 is a block diagram illustrating image sharpening logic that may be implemented in the YCbCr processing block as shown in FIG. 126, in accordance with one embodiment;
FIG. 130 is a block diagram illustrating edge enhancement logic that may be implemented in the YCbCr processing block as shown in FIG. 126, in accordance with one embodiment;
FIG. 131 is a diagram representing a chroma attenuation factor versus a sharpened luminance value in accordance with various aspects of the present disclosure;
FIG. 132 is a block diagram illustrating image brightness (brightness), contrast, and color (BCC) adjustment logic that may be implemented in the YCbCr processing block as shown in FIG. 126, in accordance with one embodiment;
fig. 133 represents a hue and saturation color wheel in a YCbCr color space defining various hue angle and saturation values that may be applied during color adjustment in the BCC adjustment logic of fig. 132;
FIG. 134 is a block diagram representing one embodiment of the ISP back-end processing logic of FIG. 8 that may be configured to perform various post-processing steps downstream of the ISP pipeline, in accordance with various aspects of the present disclosure;
FIG. 135 is a graphical representation showing a conventional global tone mapping technique;
FIG. 136 is a graphical representation showing another conventional global tone mapping technique;
FIG. 137 depicts how various regions of an image may be segmented in order to apply local tone mapping techniques, in accordance with various aspects of the present disclosure;
FIG. 138 graphically illustrates how conventional local tone mapping results in limited utilization of the output tone range;
FIG. 139 graphically illustrates a technique for local tone mapping, in accordance with an embodiment of the present disclosure;
FIG. 140 is a more detailed block diagram representing one embodiment of local tone mapping (LTM) logic that may be configured to implement the tone mapping process in the ISP back-end logic of FIG. 134, in accordance with various aspects of the present disclosure;
FIG. 141 is a flowchart illustrating a method of processing image data using the ISP back-end processing logic of FIG. 134, according to one embodiment;
FIG. 142 is a flow diagram representing a method of applying tone mapping using the LTM logic shown in FIG. 140, in accordance with one embodiment.
Detailed Description
One or more specific embodiments of the present disclosure will be described below. These illustrated embodiments are merely examples of the presently disclosed technology. In addition, in an effort to provide a concise description of these embodiments, all features of an actual implementation may not be described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
When introducing elements of various embodiments of the present disclosure, the articles "a," "an," and "the" are intended to mean that there are one or more of the elements. The terms "comprising," "including," and "having" are intended to be open-ended and mean that there may be additional elements other than the listed elements. Additionally, it should be understood that references to "one embodiment" or "an embodiment" of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features.
As described below, the present disclosure generally relates to techniques for processing image data obtained with one or more image sensing devices. In particular, certain aspects of the present disclosure may relate to techniques for detecting and correcting defective pixels, techniques for demosaicing an original image pattern, techniques for sharpening a luminance image using a multi-scale unsharp mask, and techniques for correcting lens shading irregularities by applying lens shading gain. Further, it should be appreciated that the presently disclosed technology is applicable to both still images and moving images (e.g., video), and may be used in any suitable type of imaging application, such as digital cameras, electronic devices with integrated digital cameras, security or video surveillance systems, medical imaging systems, and so forth.
With the above in mind, FIG. 1 is a block diagram illustrating an example of an electronic device 10 that may provide for the processing of image data using one or more of the image processing techniques briefly mentioned above. The electronic device 10 may be any type of electronic device configured to receive and process image data, such as data obtained using one or more image sensing components, for example a laptop or desktop computer, a mobile phone, a digital media player, and so forth. By way of example, the electronic device 10 may be a portable electronic device, such as a model of an iPod® or iPhone®, available from Apple Inc. of Cupertino, California. In addition, the electronic device 10 may be a desktop or laptop computer, such as a model of a MacBook®, MacBook® Pro, MacBook Air®, iMac®, Mac® mini, or Mac Pro®, available from Apple Inc. In other embodiments, the electronic device 10 may also be a model of an electronic device from another manufacturer that is capable of obtaining and processing image data.
Regardless of its form (e.g., portable or non-portable), it should be appreciated that the electronic device 10 may provide for the processing of image data using one or more of the image processing techniques briefly discussed above, which may include defective pixel correction and/or detection techniques, lens shading correction techniques, demosaicing techniques, or image sharpening techniques, among others. In some embodiments, the electronic device 10 may apply such image processing techniques to image data stored in a memory of the electronic device 10. In further embodiments, the electronic device 10 may include one or more imaging devices, such as an integrated or external digital camera, configured to acquire image data, which the electronic device 10 may then process using one or more of the above-described image processing techniques. Embodiments showing both portable and non-portable forms of the electronic device 10 are discussed further below with respect to FIGS. 3-6.
As shown in FIG. 1, electronic device 10 may include various internal and/or external components that contribute to the functionality of device 10. Those of ordinary skill in the art will recognize that the various functional blocks shown in fig. 1 may comprise hardware components (including circuitry), software components (including computer code stored on a computer-readable medium), or a combination of hardware and software components. For example, in the presently illustrated embodiment, the electronic device 10 may include input/output (I/O) ports 12, input structures 14, one or more processors 16, a storage device 18, a non-volatile storage device 20, an expansion card 22, a networking device 24, a power supply 26, and a display 28. In addition, the electronic device 10 may include one or more imaging devices 30 (such as a digital camera) and image processing circuitry 32. As described further below, the image processing circuitry 32 may be configured to implement one or more of the above-described image processing techniques in processing image data. It is recognized that the image data processed by the image processing circuitry 32 may be retrieved from the memory 18 and/or the non-volatile storage device 20 or may be obtained using the imaging device 30.
Before proceeding with the description, it should be appreciated that the system block diagram of the device 10 shown in FIG. 1 is intended to be a high-level control diagram depicting various components that may be included in such a device 10. That is, the connection lines between the individual components represented in FIG. 1 do not necessarily represent the paths or directions through which data flows or is transmitted between the various components of the device 10. Indeed, as discussed below, in some embodiments the depicted processor(s) 16 may include multiple processors, such as a main processor (e.g., a CPU) and dedicated image and/or video processors. In such embodiments, the processing of image data may be handled primarily by these dedicated processors, thus effectively offloading such tasks from the main processor (CPU).
With respect to each of the components illustrated in FIG. 1, the I/O ports 12 may include ports configured to connect to a variety of external devices, such as a power source, an audio input/output device (e.g., a headset or a microphone), or other electronic devices (such as handheld devices and/or computers, printers, projectors, external displays, modems, docking stations, and so forth). In one embodiment, an I/O port 12 may be configured to connect to an external imaging device, such as a digital camera, in order to obtain image data that may be processed using the image processing circuitry 32. The I/O ports 12 may support any suitable interface type, such as a universal serial bus (USB) port, a serial connection port, an IEEE-1394 (FireWire) port, an Ethernet or modem port, and/or an AC/DC power connection port.
In some embodiments, some I/O ports 12 may be configured to provide more than one function. For example, in one embodiment, the I/O port 12 may comprise an apple Inc. dedicated port that not only enables easier transfer of data between the electronic device 10 and an external source, but also enables the device 10 to be coupled with a charging interface (such as a power adapter for providing power from a wall outlet, or an interface cable configured to draw power from another electrical device, such as a desktop or laptop computer) in order to charge a power supply 26 (which may include one or more rechargeable batteries). Thus, the I/O port 12 may be configured to function as both a data transfer port and an AC/DC power connection port based on external components coupled to the device 10 via the I/O port 12.
The input structures 14 may provide user input or feedback to the processor 16. For example, input structures 14 may be configured to control one or more functions of electronic device 10, such as applications running on electronic device 10. For example, the input structures 14 may include buttons, sliders, switches, control pads, keys, knobs, scroll wheels, keyboards, mice, touch pads, and the like, or combinations thereof. In one embodiment, the input structures 14 allow a user to manipulate a Graphical User Interface (GUI) displayed on the device 10. Additionally, the input structure 14 may include a touch sensitive mechanism disposed in conjunction with the display 28. In such embodiments, the user may select or interact with the displayed interface member using a touch sensitive mechanism.
Input structures 14 may include various devices, circuits, and channels that provide user input or feedback to one or more processors 16. Such input structures 14 may be configured to control functions of device 10, applications running on device 10, and/or any interfaces or devices connected to electronic device 10 or used by electronic device 10. For example, the input structures 14 may allow a user to manipulate a displayed user interface or application interface. Examples of input structures 14 may include buttons, sliders, switches, control pads, keys, knobs, scroll wheels, keyboards, mice, touch pads, and the like.
In some embodiments, the input structure 14 and the display device 28 may be provided together, such as in the case of a "touch screen", so that a touch sensitive mechanism is provided in conjunction with the display 28. In such embodiments, the user may select or interact with the displayed interface element via the touch sensitive mechanism. In this way, the displayed interface may provide interactive functionality, allowing a user to manipulate the displayed interface by touching the display 28. For example, user interaction with the input structures 14, such as user or application interface interaction displayed on the display 28, may generate electrical signals representing user input. These input signals may be routed to one or more processors 16 via an appropriate channel, such as an input hub or data bus, for further processing.
In one embodiment, the input structures 14 may include an audio input device. For instance, the electronic device 10 may be equipped with one or more audio capture devices, such as one or more microphones. The audio capture device may be integrated with the electronic device 10, or may be an external device coupled to the electronic device 10, such as through the I/O ports 12. As discussed further below, the electronic device 10 may utilize both an audio input device and the imaging device 30 to capture sound and image data (e.g., video data), and may include logic configured to provide for the synchronization of the captured video and audio data.
In addition to processing various input signals received via the input structures 14, the processor 16 may control the general operation of the device 10. For instance, the processor 16 may provide the processing capability to execute an operating system, programs, user interfaces and application interfaces, and any other functions of the electronic device 10. The processor 16 may include one or more microprocessors, such as one or more "general purpose" microprocessors, one or more special-purpose microprocessors and/or application-specific microprocessors (ASICs), or a combination of such processing components. For example, the processor 16 may include one or more reduced instruction set (e.g., RISC) processors, as well as graphics processors (GPUs), video processors, audio processors, and/or related chip sets. As will be appreciated, the processor 16 may be coupled to one or more data buses for transferring data and instructions between the various components of the device 10. In some embodiments, the processor 16 may provide the processing capability to run an imaging application on the electronic device 10, such as Photo Booth® available from Apple Inc., or the "Camera" and/or "Photo" applications available on various models of the iPhone®, also from Apple Inc.
Instructions or data to be processed by processor 16 may be stored in a computer-readable medium, such as memory device 18. The memory device 18 may be provided as volatile memory, such as Random Access Memory (RAM), or non-volatile memory, such as Read Only Memory (ROM), or a combination of one or more RAM and ROM devices. The memory 18 may hold various information and may be used for various purposes. For example, the memory 18 may store firmware for the electronic device 10, such as a basic input/output system (BIOS), an operating system, various programs, applications, or any other routines that may be run on the electronic device 10, including user interface functions, processor functions, and so forth. Additionally, during operation of the electronic device 10, the memory 18 may be used for caching. For example, in one embodiment, memory 18 includes one or more frame buffers that buffer video data as it is output to display 28.
In addition to the memory device 18, the electronic device 10 may also include a non-volatile storage device 20 for persistently storing data and/or instructions. The non-volatile storage device 20 may include flash memory, a hard disk drive, or any other optical, magnetic, and/or solid-state storage medium, or some combination thereof. Thus, although depicted in FIG. 1 as a single device for clarity, it will be appreciated that the non-volatile storage 20 may include a combination of one or more of the above-listed storage devices operating in conjunction with the processor 16. Non-volatile memory 20 may be used to store firmware, data files, image data, software programs and applications, wireless connection information, personal information, user preferences, and any other suitable data. According to various aspects of the present disclosure, image data stored in the non-volatile storage device 20 and/or the memory arrangement 18 may be processed by the image processing circuitry 32 before being output on the display.
The embodiment illustrated in FIG. 1 also includes one or more card or expansion slots. The card slot may be configured to receive an expansion card 22, and the expansion card 22 may be used to add functionality to the electronic device 10, such as additional memory, I/O functionality, or networking capability. Such an expansion card 22 may be connected to the device by any kind of suitable connector and may be accessed externally or internally with respect to the housing of the electronic device 10. For example, in one embodiment, the expansion card 22 may be a flash memory card, such as a Secure Digital (SD) card, a mini or micro SD, compact flash card, or the like, or may be a PCMCIA device. Additionally, expansion card 22 may be a Subscriber Identity Module (SIM) card for use with embodiments of electronic device 10 that provide mobile telephone capabilities.
The electronic device 10 also includes a network device 24, which may be a network controller or a network interface card (NIC) that provides network connectivity over a wireless 802.11 standard, or any other suitable networking standard, such as a local area network (LAN), a wide area network (WAN) (e.g., an Enhanced Data Rates for GSM Evolution (EDGE) network or a 3G data network), or the Internet. In some embodiments, the network device 24 may provide for a connection to an online digital media content provider, such as the iTunes® music service available from Apple Inc.
The power supply 26 of the device 10 may include the capability to power the device 10 in both non-portable and portable settings. For example, in a portable setting, device 10 may include one or more batteries, such as lithium ion batteries, that provide power to device 10. The battery may be recharged by connecting the device 10 to an external power source, such as a wall outlet. In a non-portable setting, the power supply 26 may include a Power Supply Unit (PSU) configured to draw power from a wall outlet and distribute the power to various components of a non-portable electronic device, such as a desktop computing system.
The display 28 may be used to display various images generated by the device 10, such as a GUI of an operating system, or image data (including still images and video data) processed by the image processing circuitry 32, as further described below. As described above, the image data may include image data obtained with imaging device 30, or image data retrieved from memory 18 and/or non-volatile storage device 20. The display 28 may be any suitable type of display such as, for example, a Liquid Crystal Display (LCD), a plasma display, or an Organic Light Emitting Diode (OLED) display. Additionally, as described above, the display 28 may be provided in conjunction with the touch sensitive mechanism described above (e.g., a touch screen) that functions as part of the control interface for the electronic device 10.
The imaging device 30 shown diagrammatically may be provided in the form of a digital camera configured to obtain still images and moving images (e.g., video). The camera 30 may include a lens and one or more image sensors configured to capture light and convert the light into electrical signals. For example, the image sensor may include a CMOS image sensor (e.g., CMOS Active Pixel Sensor (APS)) or a CCD (charge coupled device) sensor. Typically, the image sensor in the camera 30 includes an integrated circuit having an array of pixels, where each pixel includes a photodetector that senses light. Those skilled in the art will recognize that the photodetectors in the imaging pixels typically detect the intensity of light captured through the camera lens. However, the photodetector itself is typically unable to detect the wavelength of the captured light, and thus unable to determine color information.
Accordingly, the image sensor also includes a Color Filter Array (CFA) that overlies or is disposed over the pixel array of the image sensor to capture color information. The color filter array may include an array of tiny color filters, each of which may overlap a respective pixel of the image sensor and filter the captured light by wavelength. Thus, when used in combination, the color filter array and photodetectors may provide information about the wavelength and intensity of light captured by the camera, which may be representative of a captured image.
In one embodiment, the color filter array may include a Bayer color filter array, which provides a filter pattern that is 50% green elements, 25% red elements, and 25% blue elements. For example, FIG. 2 shows a 2 × 2 pixel block of a Bayer CFA that includes two green elements (Gr and Gb), one red element (R), and one blue element (B). Thus, an image sensor utilizing a Bayer color filter array may provide information regarding the intensity of the light received by the camera 30 at green, red, and blue wavelengths, whereby each image pixel records only one of the three colors (RGB). This information, which may be referred to as "raw image data" or data in the "raw domain," may then be processed using one or more demosaicing techniques to convert the raw image data into a full color image, typically by interpolating a set of red, green, and blue values for each pixel. As described further below, such demosaicing techniques may be performed by the image processing circuitry 32.
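A minimal sketch of these two ideas follows: a helper that reports which color a Bayer site samples, and a bilinear interpolation of the missing green value at a red or blue site. The Gr-R / B-Gb phase of the pattern and the simple four-neighbor average are assumptions made for illustration; the demosaicing logic described later in this disclosure is considerably more sophisticated (e.g., edge-adaptive).

```c
#include <stdint.h>

/* Which color a Bayer CFA samples at pixel (x, y), assuming the 2x2 tile
 *     Gr R
 *     B  Gb
 * repeats across the sensor (the actual phase depends on the sensor). */
typedef enum { CH_GR, CH_R, CH_B, CH_GB } bayer_channel_t;

static bayer_channel_t bayer_channel(int x, int y)
{
    if ((y & 1) == 0)
        return (x & 1) ? CH_R : CH_GR;   /* even rows: Gr R Gr R ... */
    else
        return (x & 1) ? CH_GB : CH_B;   /* odd rows:  B Gb B Gb ... */
}

/* Bilinear estimate of the missing green value at a red or blue site:
 * the average of the four green neighbors. Image borders are not handled
 * here, so (x, y) must be an interior pixel. */
static uint16_t green_at_rb(const uint16_t *raw, int width, int x, int y)
{
    uint32_t sum = raw[(y - 1) * width + x] + raw[(y + 1) * width + x] +
                   raw[y * width + (x - 1)] + raw[y * width + (x + 1)];
    return (uint16_t)((sum + 2) / 4);
}
```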
As described above, the image processing circuitry 32 may provide various image processing steps such as defective pixel detection/correction, lens shading correction, demosaicing, image sharpening, noise reduction, gamma correction, image enhancement, color space transformation, image compression, chroma sub-sampling and image scaling operations, and so forth. In some embodiments, image processing circuitry 32 may include various subcomponents and/or discrete logic units that together comprise an image processing "pipeline" that performs each of the various image processing steps. These subcomponents may be implemented in hardware (e.g., a digital signal processor or ASIC) or software, or by a combination of hardware and software components. The various image processing operations that the image processing circuitry 32 may provide, particularly those related to defective pixel detection/correction, lens shading correction, demosaicing, and image sharpening, will be described in greater detail below.
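Purely as an illustration of the "pipeline" notion, the sketch below chains a few stage functions and applies them to a frame in order. The stage names, the stub bodies, and the in-place uint16_t frame representation are assumptions for this example; the actual ISP logic described in this disclosure implements such stages in dedicated hardware with its own data formats.

```c
#include <stdint.h>
#include <stddef.h>

/* A frame of pixel data processed in place by each stage. */
struct frame {
    uint16_t *pixels;
    int       width;
    int       height;
};

typedef void (*isp_stage_fn)(struct frame *f);

/* Stub stages standing in for some of the operations listed above. */
static void defective_pixel_correct(struct frame *f) { (void)f; /* ... */ }
static void lens_shading_correct(struct frame *f)    { (void)f; /* ... */ }
static void noise_reduce(struct frame *f)            { (void)f; /* ... */ }
static void sharpen(struct frame *f)                 { (void)f; /* ... */ }

/* The "pipeline" is simply an ordered list of stages applied in sequence. */
static const isp_stage_fn pipeline[] = {
    defective_pixel_correct,
    lens_shading_correct,
    noise_reduce,
    sharpen,
};

void run_pipeline(struct frame *f)
{
    for (size_t i = 0; i < sizeof(pipeline) / sizeof(pipeline[0]); i++)
        pipeline[i](f);
}
```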
Before proceeding with the description, it should be noted that while various embodiments of various image processing techniques described below may utilize a Bayer CFA, the presently disclosed techniques are not intended to be limited thereto. Indeed, those skilled in the art will recognize that the image processing techniques provided herein are suitable for any suitable type of color filter array, including RGBW filters, CYGM filters, and the like.
Referring back to the electronic device 10, FIGS. 3-6 illustrate various forms that the electronic device 10 may take. As mentioned above, the electronic device 10 may take the form of a computer, including computers that are generally portable (such as laptop, notebook, and tablet computers), as well as computers that are not generally portable (such as desktop computers, workstations, and/or servers), or other types of electronic devices, such as handheld portable electronic devices (e.g., digital media players or mobile telephones). In particular, fig. 3 and 4 depict the electronic device 10 in the form of a laptop computer 40 and a desktop computer 50, respectively. Fig. 5 and 6 show front and rear views, respectively, of an electronic device 10 in the form of a hand-portable device 60.
As shown in FIG. 3, the laptop computer 40 is depicted as including a housing 42, the display 28, the I/O ports 12, and the input structures 14. The input structures 14 may include a keyboard and a touchpad mouse integrated with the housing 42. In addition, the input structures 14 may include various other buttons and/or switches that may be used to interact with the computer 40, such as to power on or start the computer, to operate a GUI or an application running on the computer 40, and to adjust various other aspects relating to the operation of the computer 40 (e.g., sound volume, display brightness, etc.). The computer 40 may also include various I/O ports 12 that provide connectivity to additional devices, as discussed above, such as a FireWire® or USB port, a high-definition multimedia interface (HDMI) port, or any other type of port that is suitable for connecting to an external device. Additionally, as described above with respect to FIG. 1, the computer 40 may include network connectivity (e.g., network device 26), memory (e.g., memory 20), and storage capabilities (e.g., storage device 22).
Further, in the illustrated embodiment, the laptop computer 40 may include an integral imaging device 30 (e.g., a camera). In other embodiments, the laptop computer 40 may utilize an external camera (e.g., an external USB camera or "webcam") connected to one or more I/O ports 12 instead of or in addition to the integral camera 30. For example, the external camera may be a camera available from Apple Inc. The camera 30 (whether integral or external) may provide for the capture and recording of images. Such images may then be viewed by a user with an image viewing application, or may be used by other applications, including video conferencing applications and image editing/viewing applications available from Apple Inc. In some embodiments, the depicted laptop computer 40 may be a model of the MacBook family of computers available from Apple Inc. Additionally, in one embodiment, the computer 40 may be a portable tablet computing device, such as a tablet computer also available from Apple Inc.
Fig. 4 also illustrates an embodiment in which the electronic device 10 is provided in the form of a desktop computer 50. It will be appreciated that the desktop computer 50 may include many features generally similar to those provided by the laptop computer 40 shown in FIG. 3, but may have a generally larger overall form factor. As shown, the desktop computer 50 may be disposed in a housing 42, the housing 42 including the display 28, as well as various other components discussed above with respect to the block diagram shown in FIG. 1. Further, the desktop computer 50 may include an external keyboard and mouse (input structures 14) that may be coupled to the computer 50 through one or more I/O ports 12 (e.g., USB), or may communicate wirelessly (e.g., RF, Bluetooth, etc.) with the computer 50. The desktop computer 50 also includes an imaging device 30, which, as noted above, may be an integral or external camera. In some embodiments, the depicted desktop computer 50 may be a Mac mini or Mac Pro available from Apple Inc.
As further shown, the display 28 may be configured to generate various images that may be viewed by a user. For example, during operation of the computer 50, the display 28 may display a graphical user interface ("GUI") 52 that allows the user to interact with an operating system and/or applications running on the computer 50. The GUI 52 may include various layers, windows, screens, templates, or other graphical elements that may be displayed in whole or in part on the display 28. For example, in the depicted embodiment, the operating system GUI 52 may include a variety of graphical icons 54, each graphical icon 54 corresponding to an application that may be opened or executed when a user selection is detected (e.g., via keyboard/mouse or touch screen input). The icons 54 may be displayed in a docking station (dock) 56 or within one or more graphical window elements 58 displayed on the screen. In some embodiments, the selection of an icon 54 may lead to a hierarchical navigation process, whereby selection of the icon 54 leads to another screen or opens another graphical window that includes one or more additional icons or other GUI elements. For example, the operating system GUI 52 shown in FIG. 4 may be a version of the Mac OS operating system available from Apple Inc.
With continued reference to fig. 5 and 6, the electronic device 10 is further illustrated in the form of a portable handheld electronic device 60, which may be a model of a portable media player or a cellular telephone available from Apple Inc. In the depicted embodiment, the handheld device 60 includes a housing 42, the housing 42 protecting internal components from physical damage and shielding the internal components from electromagnetic interference. The housing 42 may be constructed of any suitable material, or combination of materials, such as plastic, metal, or a composite material, and may allow certain frequencies of electromagnetic radiation, such as wireless networking signals, to pass through to wireless communication circuitry (e.g., network device 24) disposed within the housing 42, as shown in fig. 5.
The housing 42 also includes various user input structures 14 through which a user may interact with the handheld device 60. For example, each input structure 14 may be configured to control one or more respective device functions when pressed or activated. For example, one or more input structures 14 may be configured to invoke a "home" screen or menu to be displayed, switch between sleep, wake-up, or power on/off modes, mute a ring tone for a cellular telephone application, increase or decrease a volume output, and so forth. It should be appreciated that the illustrated input structures 14 are merely illustrative and that the handheld device 60 may include many suitable user input structures in various forms, including buttons, switches, keys, knobs, scroll wheels, etc.
As shown in FIG. 5, handheld device 60 may include various I/O ports 12. For example, the depicted I/O ports 12 may include a dedicated connection port 12a for transmitting and receiving data files, or for charging the power supply 26, and an audio connection port 12b for connecting the device 60 to an audio output device (e.g., a headset or speakers). Further, in embodiments where the handheld device 60 provides mobile telephone functionality, the device 60 may include an I/O port 12c that receives a Subscriber Identity Module (SIM) card (e.g., expansion card 22).
The display device 28 (which may be an LCD, OLED, or any suitable type of display) may display various images generated by the handheld device 60. For example, the display 28 may display various system indicators 64 that provide feedback to the user regarding one or more states of the handheld device 60 (such as power status, signal strength, external device connection, etc.). The display may also display a GUI 52 that allows a user to interact with the device 60, as described above with reference to fig. 4. The GUI 52 may include graphical elements, such as icons 54, the icons 54 corresponding to various applications that may be opened or run when a user selection of the respective icon 54 is detected. For example, one of the icons 54 may represent a camera application 66 that may be used in conjunction with the camera 30 (represented by the dashed line in FIG. 5) to obtain an image. Referring briefly to FIG. 6, a rear view of the handheld electronic device 60 shown in FIG. 5 is illustrated, showing the camera 30 as being integral with the housing 42 and positioned on the back of the handheld device 60.
As described above, image data obtained with the camera 30 may be processed with the image processing circuitry 32, which image processing circuitry 32 may include hardware (e.g., disposed within the housing 42) and/or software stored on one or more memories (e.g., memory 18 or non-volatile storage device 20) of the device 60. Images obtained using the camera application 66 and camera 30 may be saved on the device 60 (e.g., in the storage device 20) and may be viewed at a later time using the photo viewing application 68.
The handheld device 60 may also include various audio input and output components. For example, the audio input/output components, represented by reference numeral 70, may include an input receiver, such as one or more microphones. For example, where the handheld device 60 includes cellular telephone functionality, the input receiver may be configured to receive user audio input, such as the user's voice. In addition, the audio input/output component 70 may include one or more output transmitters. Such an output transmitter may include one or more speakers that may be used to transmit audio signals to the user, such as during playback of music data using the media player application 72. Further, in embodiments where the handheld device 60 includes a cellular telephone application, an additional audio output transmitter 74 may be provided, as shown in FIG. 5. Similar to the output transmitter of audio input/output component 70, output transmitter 74 also includes one or more speakers configured to transmit audio signals to a user, such as voice data received during a telephone conversation. Thus, the audio input/output components 70 and 74 may work together to function as the audio receiving and transmitting components of the telephone.
Given some background regarding the various forms that the electronic device 10 may take, the following discussion will focus on the image processing circuitry 32 depicted in fig. 1. As described above, the image processing circuit 32 may be implemented with hardware and/or software components and may include various processing units that define an Image Signal Processing (ISP) pipeline. In particular, the following discussion will focus on various aspects of the image processing techniques set forth in this disclosure, particularly aspects related to defective pixel detection/correction techniques, lens shading correction techniques, demosaicing techniques, and image sharpening techniques.
Referring now to FIG. 7, a simplified high-level block diagram depicting several functional components that may be implemented as part of the image processing circuit 32 is illustrated, in accordance with one embodiment of the presently disclosed technology. In particular, FIG. 7 is intended to illustrate how image data flows through image processing circuitry 32 in accordance with at least one embodiment. To provide an overview of the image processing circuitry 32, a general description of how these functional components operate to process image data is provided herein with reference to FIG. 7, and a more detailed description of each illustrated functional component and their corresponding subcomponents is provided further below.
Referring to the illustrated embodiment, the image processing circuit 32 may include Image Signal Processing (ISP) front end processing logic 80, ISP pipeline processing logic 82, and control logic 84. The image data captured by the imaging device 30 may first be processed by the ISP front end logic 80 and analyzed to capture image statistics that may be used to determine one or more control parameters of the ISP pipe logic 82 and/or the imaging device 30. The ISP front end logic 80 may be configured to capture image data from the image sensor input signal. For example, as shown in FIG. 7, the imaging device 30 may include a camera having one or more lenses 88 and an image sensor 90. As described above, the image sensor 90 may include a color filter array (e.g., Bayer filters) such that light intensity and wavelength information captured with each imaging pixel of the image sensor 90 may be provided to provide a set of raw image data that may be processed by the ISP front end logic 80. For example, the output 92 of the imaging device 30 may be received by a sensor interface 94, and the sensor interface 94 may then provide raw image data 96 to the ISP front end logic 80 based on, for example, the sensor interface type. For example, the sensor interface 94 may utilize a Standard Mobile Imaging Architecture (SMIA) interface or other serial or parallel camera interface, or some combination thereof. In some embodiments, the ISP front end logic 80 may operate in its own clock domain and may provide an asynchronous interface to the sensor interface 94 to support image sensors of different size and timing requirements. In some embodiments, sensor interface 94 may include a subinterface on the sensor side (e.g., a sensor-side interface) and a subinterface on the ISP frontend side, which subinterfaces constitute sensor interface 94.
The raw image data 96 may be provided to the ISP front-end logic 80 and processed pixel-by-pixel in a variety of formats. For example, each image pixel may have a bit depth of 8, 10, 12, or 14 bits. Various examples of storage formats that represent how pixel data may be stored and addressed in memory are discussed in further detail below. The ISP front end logic 80 may perform one or more image processing operations on the raw image data 96 and collect statistical information about the image data 96. Image processing operations, as well as the collection of statistical data, may be performed with the same or different bit depth precision. For example, in one embodiment, the processing of raw image pixel data 96 may be performed with 14-bit accuracy. In such an embodiment, raw pixel data received by the ISP front-end logic 80 having a bit depth of less than 14 bits (e.g., 8 bits, 10 bits, 12 bits) may be upsampled to 14 bits for image processing. In another embodiment, the statistical processing may be performed with an 8-bit precision, so that the raw pixel data with higher bit depth may be downsampled to an 8-bit format for statistics. It is to be appreciated that downsampling to 8 bits may reduce hardware size (e.g., area) and also reduce processing/computational complexity with respect to the statistical data. In addition, the raw image data may be spatially averaged to make the statistical data more robust to noise.
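The bit-depth conversions described above can be illustrated with a small sketch. The shift-and-replicate scheme below is an assumption for illustration only; the disclosure does not specify how the up-sampling to 14 bits or the down-sampling to 8 bits is implemented.

```python
def upsample_to_14bit(pixel, src_bits):
    """Scale a raw pixel of src_bits depth to 14-bit precision.

    Left-shifting and replicating the high bits approximates
    value * (2**14 - 1) / (2**src_bits - 1) without a divide.
    """
    shift = 14 - src_bits
    if shift <= 0:
        return pixel
    return (pixel << shift) | (pixel >> (src_bits - shift))

def downsample_to_8bit(pixel, src_bits):
    """Reduce a raw pixel to 8-bit precision for statistics collection."""
    return pixel >> (src_bits - 8) if src_bits > 8 else pixel

# Example: a full-scale 10-bit sample processed at 14 bits, collected at 8 bits.
sample = 0x3FF
print(hex(upsample_to_14bit(sample, 10)))   # 0x3fff (full-scale 14-bit)
print(hex(downsample_to_8bit(sample, 10)))  # 0xff   (full-scale 8-bit)
```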
Further, as shown in fig. 7, the ISP front end logic 80 may also receive pixel data from the memory 108. For example, as indicated by reference numeral 98, raw pixel data may be sent from the sensor interface 94 to the memory 108. The raw pixel data residing in the memory 108 may then be provided to the ISP front end logic 80 for processing, as indicated by reference numeral 100. The memory 108 may be part of the memory device 18, the storage device 20, or may be a separate dedicated memory within the electronic device 10, and may include Direct Memory Access (DMA) features. Further, in some embodiments, the ISP front end logic 80 may operate within its own clock domain and may provide an asynchronous interface to the sensor interface 94 to support sensors of different sizes and with different timing requirements.
Upon receiving the raw image data 96 (from the sensor interface 94) or 100 (from the memory 108), the ISP front end logic 80 may perform one or more image processing operations, such as temporal filtering and/or binning compensation filtering. The processed image data may then be provided to the ISP pipeline logic 82 (output signal 109) for additional processing prior to being displayed (e.g., on display device 28) or may be sent to memory (output signal 110). The ISP pipe logic 82 receives the "front-end" processed data either directly from the ISP front end logic 80 or from the memory 108 (input signal 112), and may provide further processing of the image data in the raw domain as well as in the RGB and YCbCr color spaces. The image data processed by the ISP pipeline logic 82 may then be output to the display 28 (signal 114) for viewing by a user and/or may be further processed by a graphics engine or GPU. Additionally, the output of the ISP pipeline logic 82 may be sent to the memory 108 (signal 115) and the display 28 may read image data from the memory 108 (signal 116); in some embodiments, the memory 108 may be configured to implement one or more frame buffers. Further, in some implementations, the output of the ISP pipe logic 82 may be provided to a compression/decompression engine 118 (signal 117) for encoding/decoding image data. The encoded image data may be saved and later decompressed before being displayed on the display device 28 (signal 119). For example, the compression engine or "encoder" 118 may be a JPEG compression engine for encoding still images, or an h.264 compression engine for encoding video images, or some combination thereof, together with a corresponding decompression engine for decoding the image data. Additional information regarding image processing operations that may be provided in the ISP pipeline logic 82 will be discussed in more detail below with reference to fig. 98-133. Additionally, it should be noted that the ISP pipe logic 82 may also receive raw image data from the memory 108, as illustrated by the input signal 112.
The statistics 102 determined by the ISP front end logic 80 may be provided to the control logic unit 84. For example, the statistical data 102 may include image sensor statistics related to auto-exposure, auto-white balance, auto-focus, flicker detection, Black Level Compensation (BLC), lens shading correction, and so forth. The control logic 84 may include a processor and/or microcontroller configured to execute one or more routines (e.g., firmware) that may be configured to determine control parameters 104 of the imaging device 30 and control parameters 106 of the ISP pipe logic 82 based on the received statistical data 102. For example, the control parameters 104 may include sensor control parameters (e.g., gain, integration time for exposure control), camera flash control parameters, lens control parameters (e.g., focal length for focusing or zooming), or a combination of these parameters. The ISP control parameters 106 may include gain levels and Color Correction Matrices (CCMs) for automatic white balance and color adjustment (e.g., during RGB processing), as well as lens shading correction parameters, which may be determined from white point balance parameters, as described below. In some embodiments, in addition to analyzing statistics 102, control logic 84 may also analyze historical statistics, which may be stored on electronic device 10 (e.g., in memory 18 or storage device 20).
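As a rough illustration of how the control logic 84 might turn exposure-related statistics into sensor control parameters, the sketch below adjusts integration time and analog gain toward a target mean luminance. The target value, limits, and update rule are assumptions chosen for this example and are not taken from the disclosure.

```python
def update_exposure(mean_luma, integration_time, analog_gain,
                    target_luma=0.18, max_time=1 / 30.0, max_gain=8.0):
    """Simplified auto-exposure update based on a mean-luminance statistic.

    Scales total exposure so the measured mean luminance moves toward the
    target, preferring a longer integration time before raising analog gain.
    """
    if mean_luma <= 0.0:
        return integration_time, analog_gain
    correction = target_luma / mean_luma
    total = integration_time * analog_gain * correction      # desired total exposure
    new_time = min(total, max_time)                          # use integration time first
    new_gain = min(max(total / new_time, 1.0), max_gain)     # make up the rest with gain
    return new_time, new_gain

# Example: the frame is under-exposed (mean luminance 0.09 vs. target 0.18).
print(update_exposure(0.09, 1 / 60.0, 1.0))   # roughly doubles the exposure time
```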
Fig. 8 shows a block diagram depicting another embodiment of the image processing circuit 32, wherein like components are designated by like reference numerals. In general, the operation and function of the image processing circuit 32 of fig. 8 is similar to the image processing circuit 32 of fig. 7, except that the embodiment shown in fig. 8 further includes an ISP back-end processing logic unit 120, the ISP back-end processing logic unit 120 may be coupled downstream of the ISP pipeline 82 and provide additional post-processing steps.
In the illustrated embodiment, the ISP backend logic 120 may receive the output 114 from the ISP pipeline 82 and post-process the received data 114. Additionally, the ISP backend 120 may receive image data directly from the memory 108, as shown by input 124. As further described below with reference to fig. 134-142, an embodiment of the ISP backend logic 120 may provide dynamic range compression of image data (commonly referred to as "tone mapping"), brightness, contrast, and color adjustment, and scaling logic to scale the image data to a desired size or resolution (e.g., based on the resolution of the output display device). In addition, the ISP backend logic 120 may also include feature detection logic that detects certain features in the image data. For example, in one embodiment, the feature detection logic may include face detection logic configured to identify regions within the image data where faces and/or facial features are located and/or placed. The face detection data may be provided to a front-end statistical information processing unit as feedback data for determining auto white balance, auto focus, flicker, and auto exposure statistical information. For example, a statistical information processing unit (discussed in more detail below in fig. 68-97) in the ISP frontend 80 may be configured to select a window for statistical information processing based on the determined location of the face and/or facial features in the image data.
In some embodiments, the face detection data may also be provided to at least one of the local tone mapping processing logic, the ISP backend statistics unit, or the encoder/decoder unit 118 instead of or in addition to being fed back to the ISP front end statistics feedback control loop. As described further below, the face detection data provided to the back-end statistics unit may be used to control the quantization parameters. For example, when encoding or compressing output image data (e.g., in macroblocks), quantization may be reduced for regions of an image that have been determined to include faces and/or facial features, thereby improving the visual quality of the faces and facial features when the user displays and views the image.
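A simplified sketch of this idea is shown below: macroblocks overlapping a detected face rectangle are assigned a lower quantization parameter (QP), i.e., finer quantization. The function name, the QP offset, and the rectangle format are illustrative assumptions and are not specified by this disclosure.

```python
def macroblock_qp_map(frame_w, frame_h, base_qp, face_rects,
                      mb_size=16, face_qp_offset=-6, min_qp=1):
    """Return a per-macroblock QP map, lowering QP where a face was detected.

    face_rects is a list of (x, y, w, h) rectangles from the feature
    detection logic; a lower QP means finer quantization in those blocks.
    """
    mbs_x = (frame_w + mb_size - 1) // mb_size
    mbs_y = (frame_h + mb_size - 1) // mb_size
    qp_map = [[base_qp] * mbs_x for _ in range(mbs_y)]

    def overlaps(mbx, mby, rect):
        x, y, w, h = rect
        bx, by = mbx * mb_size, mby * mb_size
        return not (bx + mb_size <= x or x + w <= bx or
                    by + mb_size <= y or y + h <= by)

    for mby in range(mbs_y):
        for mbx in range(mbs_x):
            if any(overlaps(mbx, mby, r) for r in face_rects):
                qp_map[mby][mbx] = max(base_qp + face_qp_offset, min_qp)
    return qp_map

# Example: one face near the top-left of a 640x480 frame.
qp = macroblock_qp_map(640, 480, base_qp=28, face_rects=[(64, 48, 96, 96)])
print(qp[3][4], qp[20][30])   # 22 inside the face region, 28 elsewhere
```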
In other embodiments, the feature detection logic may also be configured to detect the location of corners of objects in the image frames. This data may be used to identify the location of features in successive image frames to determine an estimate of global motion between frames, which may be used to perform certain image processing operations, such as image registration. In one embodiment, the identification of corner features and the like may be particularly useful for algorithms that combine multiple image frames, such as certain High Dynamic Range (HDR) imaging algorithms, and certain panorama stitching algorithms.
Further, as shown in fig. 8, the image data processed by the ISP backend logic 120 may be output to the display device 28 (signal 126) for viewing by a user and/or may be further processed by a graphics engine or GPU. Additionally, the output of the ISP backend logic 120 may be sent to the memory 108 (signal 122) and the display 28 may read image data from the memory 108 (signal 116); in some embodiments, the memory 108 may be configured to implement one or more frame buffers. In the illustrated embodiment, the output of the ISP backend logic 120 may also be provided to a compression/decompression engine 118 (signal 117) to encode/decode image data for storage and subsequent playback, as generally described above in fig. 7. In other embodiments, the ISP subsystem 32 of fig. 8 may have the option of bypassing the ISP backend processing unit 120. In such an embodiment, if the back-end processing unit 120 is bypassed, the ISP subsystem 32 of FIG. 8 may operate in a manner similar to that shown in FIG. 7, i.e., the output of the ISP pipeline 82 is sent directly or indirectly to one or more of the memory 108, the encoder/decoder 118, or the display 28.
The image processing techniques described in the embodiments shown in fig. 7 and 8 may be generally summarized using the method 130 described in the flow chart in fig. 9. As shown, the method 130 begins at block 132 by receiving raw image data (e.g., Bayer pattern data) from an image sensor (e.g., 90) using a sensor interface. At block 134, the raw image data received at step 132 is processed using the ISP front end logic 80. As described above, the ISP front end logic 80 may be configured to apply temporal filtering and/or binning compensation filtering. The raw image data processed by the ISP front-end logic 80 may then be further processed by the ISP pipeline 82 at step 136, and the ISP pipeline 82 may perform various processing steps to demosaic the raw image data into full-color RGB data and further convert the RGB color data to a YUV or YC1C2 color space, where C1 and C2 represent different chroma difference colors, and where, in one embodiment, C1 and C2 may represent blue color difference (Cb) and red color difference (Cr) chroma.
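As an example of the RGB-to-YCbCr conversion referred to here, the sketch below uses the common full-range BT.601 coefficients; the disclosure itself does not fix the exact coefficients, so treat this as an assumption.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr; inputs and outputs nominally 0..255."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0   # blue-difference chroma (C1/Cb)
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0   # red-difference chroma  (C2/Cr)
    clamp = lambda v: min(max(v, 0.0), 255.0)
    return clamp(y), clamp(cb), clamp(cr)

print(rgb_to_ycbcr(255, 0, 0))   # pure red: low Cb, high Cr
```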
From step 136, the method 130 may continue to step 138 or to step 140. For example, in embodiments (FIG. 7) in which the output of the ISP pipeline 82 is provided to the display device 28, the method 130 continues to step 140, where the YC1C2 image data is displayed by the display device 28 (or the YC1C2 image data is sent from the ISP pipeline 82 to the memory 108). On the other hand, in embodiments (fig. 8) where the ISP pipeline 82 output is post-processed by the ISP backend unit 120, the method 130 may continue from step 136 to step 138, where the YC1C2 output of the ISP pipeline 82 is processed by the ISP backend processing logic 120 at step 138 and then displayed by the display device at step 140.
Due to the generally complex design of the image processing circuitry 32 shown here, it is beneficial to separate the discussion of the ISP front-end logic 80, ISP pipeline processing logic 82 (or ISP pipeline) and ISP back-end processing logic 120 into separate parts, as shown below. In particular, fig. 10-97 of the present application may relate to a discussion of various embodiments and aspects of the ISP frontend logic 80, fig. 98-133 of the present application may relate to a discussion of various embodiments and aspects of the ISP pipe processing logic 82, and fig. 134-142 may relate to a discussion of various embodiments and aspects of the ISP backend logic 120.
ISP front-end processing logic device
Fig. 10 is a block diagram illustrating in greater detail functional logic blocks that may be implemented in the ISP front end logic 80, in accordance with one embodiment. Based on the configuration of the imaging device 30 and/or the sensor interface 94, as described above in fig. 7, raw image data may be provided by one or more image sensors 90 to the ISP front end logic 80. In the described embodiment, raw image data may be provided to the ISP front end logic 80 by a first image Sensor 90a (Sensor0) and a second image Sensor 90b (Sensor 1). As described further below, each image sensor 90a and 90b may be configured to apply binning to full resolution image data in order to improve the signal-to-noise ratio of the image signal. For example, a binning technique such as 2 × 2 binning may be applied, which may interpolate "binned" raw image pixels based on 4 full resolution image pixels of the same color. In one embodiment, this results in 4 cumulative signal components associated with the binned pixels relative to a single noise component, thereby improving the signal-to-noise ratio of the image data, but reducing the overall resolution. Additionally, binning may also result in non-uniform or inconsistent spatial sampling of the image data, which may be corrected using binning compensation filtering, as described in more detail below.
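The sketch below illustrates one way 2 × 2 binning can be modeled in software: for each Bayer color phase, four same-color full-resolution samples are averaged into one binned sample, halving the resolution while preserving the Bayer layout. Accumulation versus averaging, and the exact sample grouping, are assumptions; actual sensor binning behavior may differ.

```python
import numpy as np

def bin_bayer_2x2(raw):
    """Model 2x2 binning of a Bayer mosaic: average four same-color samples."""
    h, w = raw.shape
    assert h % 4 == 0 and w % 4 == 0, "sketch expects dimensions divisible by 4"
    out = np.zeros((h // 2, w // 2), dtype=np.float64)
    for py in (0, 1):                 # the four Bayer phases (e.g., R, Gr, Gb, B)
        for px in (0, 1):
            plane = raw[py::2, px::2].astype(np.float64)        # one color plane
            binned = (plane[0::2, 0::2] + plane[0::2, 1::2] +
                      plane[1::2, 0::2] + plane[1::2, 1::2]) / 4.0
            out[py::2, px::2] = binned                           # keep the Bayer layout
    return out

full_res = np.random.randint(0, 1024, size=(8, 8)).astype(np.float64)
print(bin_bayer_2x2(full_res).shape)   # (4, 4): half resolution in each dimension
```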
As shown, image sensors 90a and 90b may provide raw image data as signals Sif0 and Sif1, respectively. Each image sensor 90a and 90b may generally be associated with a respective statistical information processing unit 142 (StatsPipe0) and 144 (StatsPipe1), and the statistical information processing units 142 (StatsPipe0) and 144 (StatsPipe1) may be configured to process the image data to determine one or more sets of statistical information (as shown by signals Stats0 and Stats1), including statistical information related to auto-exposure, auto-white balance, auto-focus, flicker detection, black level compensation, and lens shading correction, among others. In some embodiments, when only one of the sensors 90a or 90b is actively acquiring an image, the image data may be sent to both StatsPipe0 and StatsPipe1 if additional statistical information is needed. For example, if StatsPipe0 and StatsPipe1 are both available, statistics for one color space (e.g., RGB) may be collected using StatsPipe0 and statistics for another color space (e.g., YUV or YCbCr) may be collected using StatsPipe1. That is, the statistical information processing units 142 and 144 may work in parallel to collect multiple sets of statistical information for each frame of image data obtained by the active sensor.
In this embodiment, 5 asynchronous data sources are provided in the ISP headend 80. These include: (1) a direct input from the Sensor interface corresponding to Sensor0 (90a) (referred to as Sif0 or Sens0), (2) a direct input from the Sensor interface corresponding to Sensor1 (90b) (referred to as Sif1 or Sens1), (3) a Sensor0 data input from memory 108 (referred to as SifIn0 or Sens0DMA), which memory 108 may include a DMA interface, (4) a Sensor1 data input from memory 108 (referred to as SifIn1 or Sens1DMA), and (5) an input with frames of Sensor0 and Sensor1 data retrieved from memory 108 (referred to as FeProcIn or ProcInDMA). The ISP headend 80 may also include a plurality of destinations to which image data from the plurality of sources may be transmitted, where each destination may be a storage location in a memory (e.g., 108) or a processing unit. For example, in the present embodiment, the ISP head end 80 includes 6 destinations: (1) Sif0DMA to receive Sensor0 data in memory 108, (2) Sif1DMA to receive Sensor1 data in memory 108, (3) the first statistics processing unit 142 (StatsPipe0), (4) the second statistics processing unit 144 (StatsPipe1), (5) the front-end pixel processing unit (FEProc) 150, and (6) FeOut (or FEProcOut) to memory 108 or the ISP pipeline 82 (discussed in more detail below). In one embodiment, the ISP headend 80 may be configured such that only certain destinations are valid for a particular data source, as shown in table 1 below.
| SIf0DMA | SIf1DMA | StatsPipe0 | StatsPipe1 | FEProc | FEOut | |
| Sens0 | X | X | X | X | X | |
| Sens1 | X | X | X | X | X | |
| Sens0DMA | X | |||||
| Sens1DMA | X | |||||
| ProcInDMA | X | X |
Table 1 example of ISP headend valid destinations for each source
For example, according to table 1, source Sens0 (the Sensor interface of Sensor0) may be configured to provide data to destinations SIf0DMA (signal 154), StatsPipe0 (signal 156), StatsPipe1 (signal 158), FEProc (signal 160), or FEOut (signal 162). With regard to FEOut, in some cases source data may be provided to FEOut to bypass pixel processing by FEProc, such as for debug or test purposes. Additionally, source Sens1 (the Sensor interface of Sensor1) may be configured to provide data to destinations SIf1DMA (signal 164), StatsPipe0 (signal 166), StatsPipe1 (signal 168), FEProc (signal 170), or FEOut (signal 172); source Sens0DMA (Sensor0 data from memory 108) may be configured to provide data to StatsPipe0 (signal 174); source Sens1DMA (Sensor1 data from memory 108) may be configured to provide data to StatsPipe1 (signal 176); and source ProcInDMA (Sensor0 and Sensor1 data from memory 108) may be configured to provide data to FEProc (signal 178) and FEOut (signal 182).
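For illustration, Table 1 can be captured as a simple lookup that rejects invalid source-to-destination routings; the names mirror the table, while the helper itself is only an assumed sketch, not part of the disclosed hardware.

```python
# Valid destinations for each ISP front-end source, per Table 1.
VALID_DESTINATIONS = {
    "Sens0":     {"SIf0DMA", "StatsPipe0", "StatsPipe1", "FEProc", "FEOut"},
    "Sens1":     {"SIf1DMA", "StatsPipe0", "StatsPipe1", "FEProc", "FEOut"},
    "Sens0DMA":  {"StatsPipe0"},
    "Sens1DMA":  {"StatsPipe1"},
    "ProcInDMA": {"FEProc", "FEOut"},
}

def check_routing(source, destinations):
    """Raise if a source is programmed to target a destination it cannot reach."""
    invalid = set(destinations) - VALID_DESTINATIONS[source]
    if invalid:
        raise ValueError(f"{source} cannot target {sorted(invalid)}")

check_routing("Sens0", ["SIf0DMA", "StatsPipe0", "FEProc"])   # valid per Table 1
# check_routing("Sens0DMA", ["FEProc"])  # would raise: only StatsPipe0 is valid
```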
It should be noted that the presently illustrated embodiment is configured such that Sens0DMA (Sensor0 frames from memory 108) and Sens1DMA (Sensor1 frames from memory 108) are provided only to StatsPipe0 and StatsPipe1, respectively. This structure allows the ISP front end 80 to retain a certain number of previous frames (e.g., 5 frames) in memory. For example, because of the delay or lag between the time a user initiates a capture event with the image sensor (e.g., transitioning the image system from a preview mode to a capture or record mode, or even just turning on or initializing the image sensor) and the time the image scene is actually captured, not every frame that the user intended to capture may be captured and processed in substantially real time. Thus, by retaining a number of prior frames (e.g., from the preview phase) in memory 108, these prior frames can be processed after or alongside the frames actually captured in response to the capture event, thereby compensating for any such lag and providing a more complete set of image data.
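The frame-retention idea can be sketched as a small ring buffer that keeps the most recent preview frames so they remain available once a capture event arrives; the class name, frame representation, and depth below are illustrative assumptions.

```python
from collections import deque

class FrameHistory:
    """Keep the most recent N raw frames (e.g., from preview) in memory."""

    def __init__(self, depth=5):
        self.frames = deque(maxlen=depth)    # oldest frames drop off automatically

    def push(self, frame):
        self.frames.append(frame)

    def frames_since(self, capture_time):
        """Frames acquired at or after the user-initiated capture event."""
        return [f for f in self.frames if f["timestamp"] >= capture_time]

history = FrameHistory(depth=5)
for t in range(10):                           # preview frames arriving over time
    history.push({"timestamp": t, "data": None})
print(len(history.frames_since(7)))           # 3 frames are not lost to shutter lag
```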
With respect to the illustrated configuration of fig. 10, it should be noted that StatsPipe0 142 is configured to receive one of inputs 156 (from Sens0), 166 (from Sens1), and 174 (from Sens0DMA) as determined by selection logic 146, such as a multiplexer. Similarly, selection logic 148 may select one of signals 158, 176, and 168 to provide to StatsPipe1, and selection logic 152 may select one of signals 160, 170, and 178 to provide to FEProc. As described above, the statistical data (Stats0 and Stats1) may be provided to the control logic 84 in order to determine various control parameters that may be used to operate the imaging device 30 and/or the ISP pipeline processing logic 82. It will be appreciated that the selection logic blocks (146, 148 and 152) shown in fig. 10 may be provided by any suitable type of logic, such as a multiplexer that selects one of a plurality of input signals in response to a control signal.
Pixel processing unit (FEProc) 150 may be configured to perform various image processing operations on the raw image data on a pixel-by-pixel basis. As shown, FEProc 150, as a destination processing unit, may receive image data from the source Sens0 (signal 160), Sens1 (signal 170), or ProcInDMA (signal 178) through the selection logic 152. FEProc 150 may also receive and output various signals (e.g., Rin, Hin, Hout, and Yout, which may represent motion history and luminance data used during temporal filtering) when performing the pixel processing operations, which may include temporal filtering and binning compensation filtering, as described further below. The output 109 (FEProcOut) of the pixel processing unit 150 may then be forwarded to the ISP pipeline logic 82, such as via one or more first-in-first-out (FIFO) queues, or may be sent to the memory 108.
Further, as shown in FIG. 10, in addition to receiving signals 160, 170, and 178, selection logic 152 may also receive signals 180 and 184. Signal 180 may represent "pre-processed" raw image data from StatsPipe0 and signal 184 may represent "pre-processed" raw image data from StatsPipe1. As described below, each statistics processing unit may apply one or more pre-processing operations to the raw image data prior to collecting the statistics. In one embodiment, each statistical information processing unit may perform a certain degree of defective pixel detection/correction, lens shading correction, black level compensation, and inverse black level compensation. Thus, signals 180 and 184 may represent raw image data that has been processed using the preprocessing operations described above (as described in more detail below in FIG. 68). Thus, selection logic 152 gives ISP front-end processing logic 80 the flexibility to provide either raw image data that has not been pre-processed, from Sensor0 (signal 160) and Sensor1 (signal 170), or raw image data that has been pre-processed by StatsPipe0 (signal 180) and StatsPipe1 (signal 184). In addition, as shown by selection logic units 186 and 188, ISP front-end processing logic 80 also has the flexibility to write either raw image data that has not been pre-processed, from Sensor0 (signal 154) or Sensor1 (signal 164), or pre-processed raw image data from StatsPipe0 (signal 180) or StatsPipe1 (signal 184), into memory 108.
To control the operation of the ISP front end logic 80, a front end control unit 190 is provided. The control unit 190 may be configured to initialize and program registers (referred to herein as "execution (go) registers") for configuring and initiating processing of image frames, and to select one or more appropriate register sets for updating double buffered data registers. In some embodiments, control unit 190 may also provide performance monitoring logic that records clock cycles, memory latency, and quality of service (QOS) information. In addition, the control unit 190 may also control dynamic clock gating, which may be used to disable clocks with respect to one or more portions of the ISP headend 80 when there is not enough data in the input queue from the active sensor.
By utilizing the "execution registers" described above, the control unit 190 is able to control the updating of various parameters of each processing unit (e.g., Statspie 0, Statspie 1, and FEProc) and may interface with sensor interfaces to control the starting and stopping of the processing units. Typically, each front-end processing unit operates on a frame-by-frame basis. As described above (table 1), the input to the processing unit may come from the sensor interface (Sens0 or Sens1), or from the memory 108. In addition, the processing unit may utilize various parameters and configuration data held in corresponding data registers. In one embodiment, the data registers associated with each processing unit or destination may be divided into a plurality of blocks that form a group of register sets. In the embodiment of fig. 10, 7 register set groups may be defined in the ISP head end: SIf0, SIf1, StatSpipe0, StatSpipe1, ProcPipe, FEOut, and ProcIn. Each register block address space is replicated to provide two register sets. Only double buffered registers are instantiated in the second register bank. If the registers are not double buffered, the address in the second register set may be mapped to the address of the same register in the first register set.
For double buffered registers, the registers of one register set are activated and used by the processing unit, while the registers of the other register set are masked (shadowed). Control unit 190 may update the masked registers during the current frame interval while the hardware is using the activated registers. In a particular frame, the determination of which register set to use for a particular processing unit may be specified by a "NextBk" (next register set) field in the execution register corresponding to the source providing image data to that processing unit. Essentially, NextBk is a field that allows the control unit 190 to control which register set becomes active with respect to the trigger event of the next frame.
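A minimal software model of this double buffering is sketched below: software writes the shadow (inactive) bank while the hardware reads the active bank, and the bank named by NextBk becomes active at the next trigger. The class and method names are assumptions made for illustration.

```python
class DoubleBufferedRegisters:
    """Two register banks; the shadow bank can be written while the active bank is in use."""

    def __init__(self):
        self.banks = [dict(), dict()]   # Bank0 and Bank1
        self.active = 0                 # bank currently used by the hardware

    def write_shadow(self, name, value):
        self.banks[1 - self.active][name] = value   # software programs the inactive bank

    def read_active(self, name):
        return self.banks[self.active][name]        # hardware reads the active bank

    def trigger(self, next_bk):
        """On the frame trigger event, the bank selected by NextBk becomes active."""
        self.active = next_bk

regs = DoubleBufferedRegisters()
regs.write_shadow("lens_shading_gain", 1.2)   # program Bank1 while Bank0 is active
regs.trigger(next_bk=1)                       # the next frame uses Bank1
print(regs.read_active("lens_shading_gain"))  # 1.2
```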
Before discussing in detail the operation of the execution registers, FIG. 11 provides a general method 200 of processing image data on a frame-by-frame basis, in accordance with the techniques of the present invention. Beginning at step 202, a destination processing unit targeted by a data source (e.g., Sens0, Sens1, Sens0DMA, Sens1DMA, or ProcInDMA) enters an idle state. This may indicate that processing for the current frame is complete, and control unit 190 may then prepare to process the next frame. For example, at step 204, the programmable parameters of each destination processing unit are updated. This may include, for example, updating the NextBk field in the execution register corresponding to the source, and updating any parameters in the data register corresponding to the destination cell. Thereafter, at step 206, a triggering event may cause the destination unit to enter an operational state. Further, as shown at step 208, each destination unit targeted by the source completes its processing operations with respect to the current frame, and the method 200 may then return to step 202 for processing the next frame.
Fig. 12 depicts a block diagram showing two data register sets 210 and 212 that may be used by various destination units of an ISP headend. For example, Bank0(210) may include data registers 1-n (210a-210d) and Bank1(212) may include data registers 1-n (212a-212 d). As described above, the embodiment shown in fig. 10 may utilize a register set (Bank0) having 7 register set groups (e.g., SIf0, SIf1, StatsPipe0, StatsPipe1, ProcPipe, FEOut, and ProcIn). Thus, in such embodiments, the register block address space of each register is replicated to provide a second register set (Bank 1).
FIG. 12 also illustrates an execution register 214 corresponding to one of the multiple sources. As shown, the execution register 214 includes a "NextVld" field 216 and the "NextBk" field 218 described above. These fields may be programmed before starting processing of the current frame. In particular, NextVld indicates the destination to which data from the data source is to be sent. As described above, NextBk may select a corresponding data register from Bank0 or Bank1 for each target destination indicated by NextVld. Although not shown in FIG. 12, the execution register 214 may also include a preparation (issue) bit (referred to herein as an "execution bit") that may be set to prepare the execution register. When a trigger event 226 for the current frame is detected, NextVld and NextBk may be copied into the CurrVld field 222 and CurrBk field 224 of the corresponding current or "active" register 220. In one embodiment, one or more of the current registers 220 may be read-only registers that may be set by hardware while remaining inaccessible to software instructions within the ISP headend 80.
It should be understood that for each ISP headend source, a corresponding execution register may be provided. For purposes of this disclosure, the execution registers corresponding to the above-described sources Sens0, Sens1, Sens0DMA, Sens1DMA, and ProcInDMA may be referred to as Sens0Go, Sens1Go, Sens0DMAGo, Sens1DMAGo, and ProcInDMAGo, respectively. As described above, the control unit may control the ordering of frame processing within the ISP front end 80 using the execution registers. Each execution register contains a NextVld field and a NextBk field indicating, respectively, which destinations are valid and which register set (0 or 1) is to be used for the next frame. When the trigger event 226 for the next frame occurs, the NextVld and NextBk fields are copied to the corresponding active read-only register 220 indicating the current valid destinations and register set numbers, as shown above in fig. 12. Each source may be configured to operate asynchronously and be able to send data to any of its valid destinations. Furthermore, it should be appreciated that for each destination, typically only one source may be active during the current frame.
In terms of the preparation and triggering of the execution register 214, asserting the preparation bit or "execute bit" in the execution register 214 prepares the corresponding data source with the associated NextVld and NextBk fields. For triggering, there are various modes depending on whether the source input data is read from memory (e.g., Sens0DMA, Sens1DMA, or ProcInDMA) or from a sensor interface (e.g., Sens0 or Sens1). For example, if the input is from memory 108, then the preparation of the execute bit itself may serve as the triggering event, since the control unit 190 can control the time at which data is read from the memory 108. If the input is an image frame from the sensor interface, the trigger event may depend on the timing at which the corresponding execution register is prepared relative to the time at which data is received from the sensor interface. In accordance with the present embodiment, three different techniques for triggering timing based on sensor interface inputs are shown in FIGS. 13-15.
Referring first to fig. 13, a first scenario is illustrated in which a trigger occurs once all destinations targeted by a source transition from a busy or running state to an idle state. Here, the data signal VVALID (228) represents an image data signal from a source. Pulse 230 represents a current frame of image data, pulse 236 represents a next frame of image data, and interval 232 represents a vertical blanking interval (VBLANK)232 (e.g., representing a time difference between a last line of current frame 230 and next frame 236). The time difference between the rising and falling edges of pulse 230 represents frame interval 234. Thus, in FIG. 13, the source may be configured to trigger when all target destinations have completed processing operations for the current frame 230 and transition to an idle state. In this case, the source is prepared (e.g., by setting a prepare or "execute" bit) before the destination completes processing, so that once the target destination becomes idle, the source can trigger and begin processing of the next frame 236. During the vertical blanking interval 232, prior to arrival of sensor input data, the processing unit may be set up and configured for the next frame 236 using the register set specified by the execution register corresponding to the source. For example, the read buffer used by the FEProc 150 may be filled before the next frame 236 arrives. In this case, after the triggering event, the mask register corresponding to the active register set may be updated, allowing the full frame interval to set the double buffer register for the next frame (e.g., after frame 236).
FIG. 14 illustrates a second scenario in which a source is triggered by preparing the execution bit in the execution register corresponding to the source. In this "trigger-on-go" configuration, the destination units targeted by the source are already idle, and preparing the execute bit is the trigger event. This trigger mode may be used for registers that are not double buffered and are therefore updated during vertical blanking (e.g., as opposed to updating a double-buffered shadow register during the frame interval 234).
FIG. 15 illustrates a third trigger mode in which the data source is triggered when the beginning of the next frame, i.e., a rising VSYNC, is detected. It should be noted, however, that in this mode, if the execution register is prepared (by setting the execution bit) after the next frame 236 has already begun processing, the source will use the target destinations and register set corresponding to the previous frame, because the CurrVld and CurrBk fields are not updated before the destination begins processing. This leaves no vertical blanking interval for setting up the destination processing units and can potentially lead to dropped frames, especially when operating in a dual sensor mode. It should be noted, however, that if the image processing circuit 32 is operating in a single sensor mode that uses the same register set for each frame (e.g., the destination (NextVld) and register set (NextBk) do not change), then such a mode may still yield accurate operation.
Referring now to FIG. 16, the control register (or "execution register") 214 is illustrated in more detail. The execution register 214 includes a preparation "execute" bit 238, as well as the NextVld field 216 and the NextBk field 218. As described above, each source of the ISP headend 80 (e.g., Sens0, Sens1, Sens0DMA, Sens1DMA, or ProcInDMA) may have a corresponding execution register 214. In one embodiment, the execute bit 238 may be a single-bit field, and the execution register 214 may be prepared by setting the execute bit 238 to 1. The NextVld field 216 may contain a number of bits corresponding to the number of destinations of the ISP headend 80. For example, in the embodiment shown in fig. 10, the ISP head end includes 6 destinations: Sif0DMA, Sif1DMA, StatsPipe0, StatsPipe1, FEProc, and FEOut. Thus, the execution register 214 may include 6 bits in the NextVld field 216, with one bit corresponding to each destination, and a destination is targeted by setting its bit to 1. Similarly, the NextBk field 218 may contain a number of bits corresponding to the number of data registers in the ISP headend 80. For example, as described above, the embodiment of the ISP headend 80 shown in fig. 10 may include 7 data registers: SIf0, SIf1, StatsPipe0, StatsPipe1, ProcPipe, FEOut, and ProcIn. Thus, the NextBk field 218 may include 7 bits, with one bit corresponding to each data register, and the data register corresponding to Bank0 or Bank1 is selected by setting the respective bit value to 0 or 1. Thus, by utilizing the execution register 214, the source, when triggered, knows explicitly which destination units are to receive frame data and which register sets are to be used to configure the targeted destination units.
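For illustration, the fields of an execution register can be packed into a single word as sketched below. The bit positions (bit 0 for the execute bit, bits 1-6 for NextVld, bits 7-13 for NextBk) are assumptions chosen for the example; the disclosure does not define the physical layout. The usage example mirrors the single-sensor configuration of Table 2, in which Sens0Go targets SIf0DMA, StatsPipe0, and FEProc.

```python
DESTINATIONS = ["SIf0DMA", "SIf1DMA", "StatsPipe0", "StatsPipe1", "FEProc", "FEOut"]
REGISTER_GROUPS = ["SIf0", "SIf1", "StatsPipe0", "StatsPipe1", "ProcPipe", "FEOut", "ProcIn"]

def pack_go_register(go, next_vld, next_bk):
    """Pack the execute bit, NextVld (one bit per destination) and NextBk (one bit per group).

    Assumed layout: bit 0 = execute bit, bits 1-6 = NextVld, bits 7-13 = NextBk.
    """
    word = int(bool(go))
    for i, dest in enumerate(DESTINATIONS):
        if dest in next_vld:
            word |= 1 << (1 + i)
    for i, group in enumerate(REGISTER_GROUPS):
        if next_bk.get(group, 0):                  # 0 selects Bank0, 1 selects Bank1
            word |= 1 << (7 + i)
    return word

sens0_go = pack_go_register(
    go=1,
    next_vld={"SIf0DMA", "StatsPipe0", "FEProc"},
    next_bk={"ProcPipe": 0},                       # FEProc configured from Bank0's ProcPipe registers
)
print(f"0x{sens0_go:04X}")                         # 0x002B with the assumed layout
```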
Additionally, due to the dual sensor architecture supported by the ISP circuit 32, the ISP frontend may operate in a single sensor architecture mode (e.g., only one sensor is acquiring data) and a dual sensor architecture mode (e.g., both sensors are acquiring data). In a typical single sensor configuration, input data from a sensor interface (such as Sens0) is sent to StatsPipe0 (for statistics processing) and FEProc (for pixel processing). Alternatively, the sensor frame may be sent to memory (SIf0DMA) for later processing, as described above.
An example of how the NextVld fields corresponding to each source of the ISP headend 80 may be configured when operating in the single sensor mode is described below in table 2.
| SIf0DMA | SIf1DMA | StatsPipe0 | StatsPipe1 | FEProc | FEOut | |
| Sens0Go | 1 | X | 1 | 0 | 1 | 0 |
| Sens1Go | X | 0 | 0 | 0 | 0 | 0 |
| Sens0DMAGo | X | X | 0 | X | X | X |
| Sens1DMAGo | X | X | X | 0 | X | X |
| ProcInDMAGo | X | X | X | X | 0 | 0 |
Table 2-examples of NextVld for each source: single sensor mode
As described above with reference to table 1, the ISP front end 80 may be configured such that only some destinations are valid for a particular source. Thus, the destinations labeled with an "X" in table 2 are intended to indicate that the ISP headend 80 is not configured to allow a particular source to transmit frame data to that destination. For such destinations, the bits of the NextVld field of the particular data source corresponding to those destinations may always be 0. It should be understood, however, that this is merely one embodiment and that, in other embodiments, the ISP headend 80 may be configured to enable each source to target each available destination unit.
The configuration shown above in Table 2 represents a single-sensor mode in which only Sensor0 is providing frame data. For example, the Sens0Go register indicates SIf0DMA, StatsPipe0, and FEProc as destinations. Thus, when triggered, each frame of Sensor0 image data is sent to these three destinations. As described above, SIf0DMA may save the frame in memory 108 for later processing, StatsPipe0 applies statistics processing to determine various statistical data points, and FEProc processes the frame using, for example, temporal filtering and binning compensation filtering. Furthermore, in some configurations where additional statistics are desired (e.g., statistics in different color spaces), StatsPipe1 may also be enabled (with its corresponding NextVld bit set to 1) during the single-sensor mode. In such an embodiment, the Sensor0 frame data is sent to both StatsPipe0 and StatsPipe1. Furthermore, as shown in the present embodiment, during the single-sensor mode, only a single sensor interface (e.g., Sens0 or, alternatively, Sens1) is the only active source.
In view of the foregoing, fig. 17 provides a flow chart describing a method 240 of processing frame data in the ISP front end 80 when only a single Sensor (e.g., Sensor0) is active. Although the method 240 illustrates the processing of Sensor0 frame data by the FEProc 150, it should be understood that the processing may be applied to any other source and corresponding destination unit in the ISP headend 80. Beginning at step 242, Sensor0 begins acquiring image data and transmitting the captured frames to the ISP headend 80. The control unit 190 may initialize the execution register corresponding to Sens0 (the Sensor0 interface) to determine the target destinations (including FEProc) and which set of registers to use, as shown in step 244. Thereafter, decision logic 246 determines whether a source trigger event has occurred. As described above, the frame data input from the sensor interface may utilize different trigger modes (FIGS. 13-15). If a trigger event is not detected, the method 240 continues to wait for the trigger. Once the trigger occurs, the next frame becomes the current frame and is sent to FEProc (and other target destinations) for processing at step 248. FEProc may be configured with data parameters based on the corresponding data register (ProcPipe) specified in the NextBk field of the Sens0 execution register. After processing of the current frame is complete at step 250, the method 240 returns to step 244, where register programming is performed on Sens0 for the next frame.
When both Sensor0 and Sensor1 of the ISP headend 80 are active, statistics processing remains generally straightforward, because each sensor input can be processed by its corresponding statistics block, StatsPipe0 or StatsPipe1. However, since the illustrated embodiment of the ISP front end 80 provides only a single pixel processing unit (FEProc), FEProc may be configured to alternate between processing frames corresponding to Sensor0 input data and frames corresponding to Sensor1 input data. It will be appreciated that, in the illustrated embodiment, the image frames are read from memory to FEProc to avoid a situation in which image data from one sensor is processed in real time while image data from the other sensor is not. For example, as shown in table 3 below (which describes one possible configuration of the NextVld fields in the execution registers for each data source when the ISP front end 80 is operating in the dual sensor mode), the input data from each sensor is sent to memory (SIf0DMA and SIf1DMA) and to the corresponding statistics processing unit (StatsPipe0 and StatsPipe1).
| SIf0DMA | SIf1DMA | StatsPipe0 | StatsPipe1 | FEProc | FEOut | |
| Sens0Go | 1 | X | 1 | 0 | 0 | 0 |
| Sens1Go | X | 1 | 0 | 1 | 0 | 0 |
| Sens0DMAGo | X | X | 0 | X | X | X |
| Sens1DMAGo | X | X | X | 0 | X | X |
| ProcInDMAGo | X | X | X | X | 0 | 0 |
Table 3-examples of NextVld for each source: dual sensor mode
The sensor frames in memory are sent from the ProcInDMA source to FEProc such that the sensor frames alternate between Sensor0 and Sensor1 at a rate based on their corresponding frame rates. For example, if Sensor0 and Sensor1 both acquire image data at a rate of 30 frames per second (fps), their sensor frames may be interleaved in a 1-to-1 manner. If Sensor0 (30 fps) acquires image data at twice the rate of Sensor1 (15 fps), the interleaving may be 2-to-1. That is, two frames of Sensor0 data are read from memory for each frame of Sensor1 data.
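The frame-rate-proportional interleaving can be sketched as follows; the error-accumulation scheme is only one assumed way to produce the 1-to-1 and 2-to-1 read orders described above.

```python
from math import gcd

def interleave_order(fps0, fps1):
    """Yield the read order ('S0' / 'S1') for one repeating interleave period.

    For 30 fps and 30 fps this is 1-to-1; for 30 fps and 15 fps it is 2-to-1,
    i.e., two Sensor0 frames are read from memory for every Sensor1 frame.
    """
    g = gcd(fps0, fps1)
    n0, n1 = fps0 // g, fps1 // g
    order, err0, err1 = [], 0.0, 0.0
    for _ in range(n0 + n1):                 # spread the two streams evenly
        err0 += n0
        err1 += n1
        if err0 >= err1:
            order.append("S0"); err0 -= (n0 + n1)
        else:
            order.append("S1"); err1 -= (n0 + n1)
    return order

print(interleave_order(30, 30))   # ['S0', 'S1']            (1-to-1)
print(interleave_order(30, 15))   # ['S0', 'S1', 'S0']      (2-to-1)
```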
In view of the above, fig. 18 depicts a method 252 of processing frame data in an ISP headend 80 having two sensors that simultaneously acquire image data. At step 254, both Sensor0 and Sensor1 begin acquiring image frames. It is to be appreciated that Sensor0 and Sensor1 may acquire image frames with different frame rates, resolutions, and so forth. At step 256, the frames obtained from Sensor0 and Sensor1 are written into memory 108 (e.g., using the SIf0DMA and SIf1DMA destinations). The source ProcInDMA then alternately reads frame data from the memory 108, as shown in step 258. As described above, frames may alternate between Sensor0 data and Sensor1 data based on the frame rates at which the data is obtained. At step 260, the next frame from ProcInDMA is obtained. Thereafter, at step 262, the NextVld and NextBk fields of the execution register corresponding to the source (here, ProcInDMA) are programmed based on whether the next frame is Sensor0 data or Sensor1 data. Decision logic 264 then determines whether a source trigger event has occurred. As described above, data input from memory may be triggered by preparing the execute bit (e.g., the "trigger-on-go" mode). Thus, the trigger may occur once the execute bit of the execution register is set to 1. Once the trigger occurs, the next frame becomes the current frame and is sent to FEProc for processing at step 266. As described above, FEProc may be configured with data parameters based on the corresponding data register (ProcPipe) specified in the NextBk field of the ProcInDMA execution register. At step 268, after processing of the current frame is complete, the method 252 may return to step 260 and continue.
Another operational event that the ISP headend 80 is configured to handle is a configuration change during image processing. Such an event may occur, for example, when the ISP front end 80 transitions from a single-sensor configuration to a dual-sensor configuration, or from a dual-sensor configuration to a single-sensor configuration. As described above, the NextVld fields for certain sources may be different depending on whether one or both image sensors are active. Thus, when the sensor configuration is changed, the ISP head end control unit 190 may release all destination units before a new source targets them. This may avoid invalid configurations (e.g., assigning multiple data sources to one destination). In one embodiment, the release of the destination units may be achieved by setting the NextVld fields of all execution registers to 0, thereby disabling all destinations, and arming the execution (go) bit. After the destination units are released, the execution registers may be reconfigured according to the current sensor mode, and image processing may then continue.
A method 270 of switching between a single sensor configuration and a dual sensor configuration is shown in fig. 19, in accordance with one embodiment. Beginning at step 272, the next frame of image data from a particular source of the ISP front end 80 is identified. At step 274, the target destinations (NextVld) are programmed into the execution register corresponding to the source. Subsequently, at step 276, depending on the target destinations, NextBk is programmed to point to the correct data registers associated with the target destinations. Thereafter, decision logic 278 determines whether a source trigger event has occurred. Once the trigger occurs, the next frame is sent to and processed by the destination units specified by NextVld using the corresponding data registers specified by NextBk, as shown in step 280. Processing continues to step 282, where processing of the current frame is completed.
Decision logic 284 then determines whether the target destinations of the source have changed. As described above, the NextVld settings of the execution registers corresponding to Sens0 and Sens1 may vary based on whether one sensor or both sensors are active. For example, referring to Table 2, if only Sensor0 is active, then Sensor0 data is sent to SIf0DMA, StatsPipe0, and FEProc. However, referring to Table 3, if both Sensor0 and Sensor1 are active, then the Sensor0 data is not sent directly to FEProc. Instead, as described above, the Sensor0 and Sensor1 data are written into memory 108 and read out alternately to FEProc by the source ProcInDMA. Thus, if no target destination change is detected at decision logic 284, the control unit 190 concludes that the sensor configuration has not changed, and the method 270 returns to step 276, where the NextBk field of the source execution register is programmed to point to the correct data register for the next frame, and processing continues.
However, if a destination change is detected at decision logic 284, then the control unit 190 determines that a sensor configuration change has occurred. For example, this may represent switching from a single sensor mode to a dual sensor mode, or turning off all sensors. Accordingly, the method 270 proceeds to step 286, where all bits of the NextVld fields of all execution registers are set to 0, effectively preventing frames from being sent to any destination on the next trigger. Subsequently, at decision logic 288, a determination is made as to whether all of the destination units have transitioned to an idle state. If not, the method 270 waits at decision logic 288 until all destination units have completed their current operations. Subsequently, at decision logic 290, a determination is made as to whether image processing is to continue. For example, if the destination change represents both Sensor0 and Sensor1 being deactivated, then image processing ends at step 292. However, if it is determined that image processing is to continue, the method 270 returns to step 274, and the NextVld fields of the execution registers are programmed according to the current mode of operation (e.g., single sensor or dual sensor). As shown in FIG. 19, the steps 284-292 of clearing the execution registers and destination fields may be collectively referred to by the reference numeral 294.
Next, fig. 20 illustrates another embodiment by providing a flow chart of the operation of another dual sensor mode (method 296). Method 296 depicts the case in which one Sensor (e.g., Sensor0) is actively acquiring image data and sending image frames to FEProc 150 for processing, while also sending image frames to StatsPipe0 and/or memory 108 (SIf0DMA), while the other Sensor (e.g., Sensor1) is deactivated (e.g., turned off), as shown in step 298. Decision logic 300 then detects a condition in which, at the next frame, Sensor1 will become active and send image data to FEProc. If the condition is not met, the method 296 returns to step 298. However, if the condition is satisfied, the method 296 proceeds to perform act 294 (generally steps 284-292 of FIG. 19) to clear and reconfigure the destination fields of the sources. For example, at act 294, the NextVld field of the execution register associated with Sensor1 may be programmed to specify FEProc as a destination, along with StatsPipe1 and/or memory (SIf1DMA), while the NextVld field of the execution register associated with Sensor0 may be programmed to clear FEProc as a destination. In this embodiment, although frames captured by Sensor0 are not sent to FEProc in the next frame, Sensor0 may remain active and continue to send its image frames to StatsPipe0, as shown at step 302, while at step 304 Sensor1 captures data and sends it to FEProc for processing. Thus, both sensors, Sensor0 and Sensor1, may continue to operate in this "dual sensor" mode, although only image frames from one sensor are sent to FEProc for processing. For purposes of this example, a sensor that sends frames to FEProc for processing may be referred to as an "active sensor," a sensor that does not send frames to FEProc but still sends data to the statistics processing units may be referred to as a "semi-active sensor," and a sensor that does not acquire data at all may be referred to as an "inactive sensor."
One benefit of the above technique is that, because statistics continue to be acquired for the semi-active Sensor (Sensor0), the next time the semi-active sensor transitions to the active state and the currently active Sensor (Sensor1) transitions to the semi-active or inactive state, the semi-active sensor can begin acquiring data within one frame, since color balance and exposure parameters may already be available owing to the continued collection of image statistics. This technique may be referred to as "hot switching" of the image sensors, and it avoids the drawbacks associated with a "cold start" of an image sensor (e.g., starting with no statistical information available). Furthermore, to conserve power, and since each source is asynchronous (as described above), the semi-active sensor may operate at a reduced clock and/or frame rate while it is semi-active.
Before proceeding with a more detailed description of the statistics processing and pixel processing operations performed in the ISP front end logic 80 of fig. 10, it is believed that a brief introduction to several types of memory addressing formats that may be used in conjunction with the presently disclosed techniques, as well as definitions of the various ISP frame regions, will help to facilitate a better understanding of the present subject matter.
Referring now to fig. 21 and 22, a linear addressing mode and a tiled addressing mode, respectively, are illustrated; these may be applied to pixel data received from the image sensor 90 and stored in a memory (e.g., 108). The depicted embodiment may be based on a master interface block request size of 64 bytes. It should be understood that other embodiments may utilize different block request sizes (e.g., 32 bytes, 128 bytes, etc.). In the linear addressing mode shown in fig. 21, the image samples are located sequentially in memory. The term "linear stride" designates the distance, in bytes, between 2 adjacent vertical pixels. In this example, the starting base address of a plane is aligned to a 64-byte boundary, and the linear stride may be a multiple of 64 (depending on the block request size).
In the example of the tiled mode format, as shown in fig. 22, the image samples are first arranged sequentially in "tiles," which are then stored sequentially in memory. In the illustrated embodiment, each tile may be 256 bytes wide by 16 rows high. The term "tile stride" should be understood to represent the distance, in bytes, between 2 adjacent vertical tiles. In this example, with the starting base address of a plane in tiled mode aligned to a 4096-byte boundary (e.g., the size of a tile), the tile stride may be a multiple of 4096.
In view of the foregoing, various frame regions that may be defined within an image source frame are illustrated in fig. 23. The format of the source frames provided to the image processing circuit 32 may use the tiled or linear addressing modes described above, and may utilize pixel formats of 8, 10, 12, 14, or 16-bit precision. As shown in fig. 23, the image source frame 306 may include a sensor frame area 308, an original frame area 310, and an active area 312. The sensor frame 308 is typically the maximum frame size that the image sensor 90 can provide to the image processing circuitry 32. The original frame area 310 may be defined as the area of the sensor frame 308 that is sent to the ISP front-end processing logic 80. The active area 312 may be defined as a portion of the source frame 306, typically within the original frame area 310, that is processed for a particular image processing operation. In accordance with embodiments of the present technique, the active area 312 may be the same or may be different for different image processing operations.
In accordance with aspects of the present technique, the ISP front end logic 80 receives only the original frame 310. Thus, for purposes of this discussion, the total frame size for the ISP front-end processing logic 80 may be assumed to be the original frame size, as determined by the width 314 and height 316. In some embodiments, the offset from the boundary of the sensor frame 308 to the boundary of the original frame 310 may be determined and/or maintained by the control logic 84. For example, the control logic 84 may include firmware that determines the original frame area 310 based on input parameters, such as the x-offset 318 and the y-offset 320 specified relative to the sensor frame 308. Furthermore, in some cases, a processing unit within the ISP front-end logic 80 or the ISP pipe logic 82 may have a defined active area, such that pixels in the original frame but outside the active area 312 are not processed, i.e., remain unchanged. For example, an active area 312 for a particular processing unit, having a width 322 and a height 324, may be defined in terms of an x-offset 326 and a y-offset 328 relative to the original frame 310. Further, where an active area is not specifically defined, one embodiment of the image processing circuitry 32 may assume that the active area 312 is the same as the original frame 310 (e.g., the x-offset 326 and the y-offset 328 are both equal to 0). Thus, for image processing operations performed on the image data, boundary conditions may be defined relative to the boundaries of the original frame 310 or of the active area 312. Additionally, in some embodiments, a window (frame) may be specified by identifying starting and ending locations in memory, rather than a starting location and window size.
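For illustration, the following C sketch derives the region a processing unit operates on from the offsets described above, defaulting to the full original frame when no active area is specified; the structure and field names are hypothetical, and the offsets are assumed to lie within the original frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of the original frame and an optional active area. */
typedef struct {
    uint32_t width, height;          /* original frame width 314 and height 316 */
} OriginalFrame;

typedef struct {
    bool     specified;              /* false: assume active area == original frame        */
    uint32_t x_offset, y_offset;     /* offsets 326 and 328 relative to the original frame */
    uint32_t width, height;          /* active width 322 and height 324                    */
} ActiveArea;

/* Resolve the area a processing unit actually operates on. */
static ActiveArea resolve_active_area(const OriginalFrame *frame, const ActiveArea *requested)
{
    ActiveArea out = { true, 0, 0, frame->width, frame->height };
    if (requested && requested->specified) {
        out = *requested;
        /* Clamp so the active area never extends beyond the original frame. */
        if (out.x_offset + out.width  > frame->width)  out.width  = frame->width  - out.x_offset;
        if (out.y_offset + out.height > frame->height) out.height = frame->height - out.y_offset;
    }
    return out;
}
```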
In some embodiments, the ISP front end processing unit (FEProc) 80 may also support processing an image frame in overlapping vertical stripes, as shown in fig. 24. For example, the image processing in this example may take place in three passes, using a left stripe (Stripe0), a middle stripe (Stripe1), and a right stripe (Stripe2). This may allow the ISP front-end processing unit 80 to process a wider image in multiple passes without increasing the size of the line buffers. This technique may be referred to as "stripe addressing."
When an image frame is processed using multiple vertical stripes, the input frame is read with some overlap, providing enough overlapping filter context that there is little or no difference between processing the image in multiple passes and processing it in a single pass. For example, in this example, Stripe0, having a width SrcWidth0, and Stripe1, having a width SrcWidth1, partially overlap, as shown by overlap region 330. Similarly, Stripe1 also overlaps on the right with Stripe2, having a width SrcWidth2, as shown by overlap region 332. Here, the total span is the sum of the widths of the stripes (SrcWidth0, SrcWidth1, SrcWidth2) minus the widths (334, 336) of the overlap regions 330 and 332. When the image frame is written to memory (e.g., 108), an active output region is defined, and only data inside the active output region is written. As shown in FIG. 24, when writing to memory, each stripe is written according to the non-overlapping widths ActiveDst0, ActiveDst1, and ActiveDst2.
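A minimal sketch of the stripe arithmetic just described is shown below; the structure and function names are illustrative only.

```c
#include <stdint.h>

/* Hypothetical description of a three-stripe pass over a wide frame. */
typedef struct {
    uint32_t src_width[3];    /* SrcWidth0..2: read widths, including overlap           */
    uint32_t overlap[2];      /* widths 334 and 336 of overlap regions 330 and 332      */
    uint32_t active_dst[3];   /* ActiveDst0..2: non-overlapping widths written to memory */
} StripePlan;

/* Total span covered by the three reads, counting each overlap only once. */
static uint32_t stripe_total_span(const StripePlan *p)
{
    return p->src_width[0] + p->src_width[1] + p->src_width[2]
         - p->overlap[0] - p->overlap[1];
}

/* The written (non-overlapping) widths should add up to the same span. */
static uint32_t stripe_written_span(const StripePlan *p)
{
    return p->active_dst[0] + p->active_dst[1] + p->active_dst[2];
}
```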
As described above, the image processing circuit 32 may receive image data directly from a sensor interface (e.g., 94) or from memory 108 (e.g., DMA memory). Where input data is provided from memory, the image processing circuitry 32 and the ISP front-end processing logic 80 may be configured to provide byte swapping, wherein incoming pixel data from memory may be byte swapped prior to processing. In one embodiment, a swap code may be used to indicate whether adjacent double words, words, half words, or bytes of the input data from memory are swapped. For example, referring to FIG. 25, a 4-bit swap code may be used to perform byte swapping on a 16-byte set of data (bytes 0-15).
As shown, the swap code may include 4 bits, which may be referred to, from left to right, as bit3, bit2, bit1, and bit0. When all bits are set to 0, no byte swapping is performed, as indicated by reference numeral 338. When bit3 is set to 1, double words (e.g., 8 bytes) are swapped, as shown by reference numeral 340. For example, as shown in FIG. 25, the double word represented by bytes 0-7 is swapped with the double word represented by bytes 8-15. If bit2 is set to 1, word (e.g., 4-byte) swapping is performed, as indicated by reference numeral 342. In the illustrated example, this results in the word represented by bytes 8-11 being swapped with the word represented by bytes 12-15, and the word represented by bytes 0-3 being swapped with the word represented by bytes 4-7. Similarly, if bit1 is set to 1, half-word (e.g., 2-byte) swapping is performed (e.g., bytes 0-1 are swapped with bytes 2-3, etc.), as shown at reference numeral 344, and if bit0 is set to 1, byte swapping is performed, as shown at reference numeral 346.
In this embodiment, swapping may be performed by sequentially evaluating bit3, bit2, bit1, and bit0 of the swap code. For example, if bit3 and bit2 are both set to 1, a double word swap is performed first (bit3), followed by a word swap (bit2). Thus, as shown in fig. 25, when the swap code is set to "1111," the end result is that the input data is converted from little-endian format to big-endian format.
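The behavior of the swap code can be expressed directly in C. The sketch below assumes the 16-byte group is processed exactly as described above (bit3 double words, bit2 words, bit1 half words, bit0 bytes, evaluated in that order); the function names are illustrative only.

```c
#include <stdint.h>
#include <string.h>

/* Swap adjacent chunks of 'gran' bytes within a 16-byte group. */
static void swap_adjacent(uint8_t buf[16], int gran)
{
    uint8_t tmp[8];
    for (int i = 0; i < 16; i += 2 * gran) {
        memcpy(tmp, &buf[i], (size_t)gran);
        memmove(&buf[i], &buf[i + gran], (size_t)gran);
        memcpy(&buf[i + gran], tmp, (size_t)gran);
    }
}

/* Apply a 4-bit swap code to one 16-byte group, evaluating bit3 (double word),
 * bit2 (word), bit1 (half word), and then bit0 (byte). */
static void apply_swap_code(uint8_t buf[16], unsigned swap_code)
{
    if (swap_code & 0x8) swap_adjacent(buf, 8);  /* double word swap */
    if (swap_code & 0x4) swap_adjacent(buf, 4);  /* word swap        */
    if (swap_code & 0x2) swap_adjacent(buf, 2);  /* half word swap   */
    if (swap_code & 0x1) swap_adjacent(buf, 1);  /* byte swap        */
}
```

Applying the code "1111" to bytes 0-15 produces bytes 15-0, i.e., the little-endian to big-endian conversion noted above.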
Various memory formats for image pixel data that may be supported by the image processing circuit 32, for raw image data (e.g., Bayer RGB data), RGB color data, and YUV (YCC, luminance/chrominance) data, are discussed in further detail below, in accordance with certain disclosed embodiments. First, the formats of raw image pixels (e.g., Bayer data prior to demosaicing) in a destination/source frame that various embodiments of the image processing circuitry 32 can support are discussed. As described above, certain embodiments may support image pixel processing at 8, 10, 12, 14, and 16-bit precision. In the context of raw image data, the 8, 10, 12, 14, and 16-bit raw pixel formats may be referred to herein as the RAW8, RAW10, RAW12, RAW14, and RAW16 formats, respectively. Examples of how the RAW8, RAW10, RAW12, RAW14, and RAW16 formats may be stored in memory in an unpacked form are shown graphically in fig. 26. For raw image formats with a bit precision greater than 8 bits (and not a multiple of 8 bits), the pixel data may also be stored in a packed format. For example, fig. 27 shows an example of how RAW10 image pixels may be stored in memory. Similarly, fig. 28 and fig. 29 illustrate examples of how RAW12 and RAW14 image pixels may be stored in memory. As described further below, when image data is written to or read from memory, control registers associated with the sensor interface 94 may define the destination/source pixel format (whether the pixel is in a packed or unpacked format), the addressing format (e.g., linear or tiled), and the swap code. Thus, the manner in which the ISP processing circuitry 32 reads and interprets the pixel data may depend on the pixel format.
Image Signal Processing (ISP) circuitry 32 may also support certain formats of RGB color pixels in sensor interface source/destination frames (e.g., 310). For example, RGB image frames may be received from a sensor interface (e.g., in embodiments where the sensor interface includes on-board demosaicing logic) and saved into memory 108. In one embodiment, the ISP front end processing logic 80(FEProc) may bypass pixel and statistics processing when an RGB frame is received. For example, image processing circuitry 32 may support the following RGB pixel formats: RGB-565 and RGB-888. An example of how the RGB-565 pixel data is stored in memory is shown in fig. 30. As shown, the RGB-565 format may provide one plane of 5-bit red, 6-bit green, and 5-bit blue components interleaved in RGB order. Thus, a total of 16 bits may be used to represent an RGB-565 pixel (e.g., { R0, G0, B0} or { R1, G1, B1 }).
As shown in fig. 31, the RGB-888 format may include one plane of 8-bit red, green, and blue components interleaved in RGB order. In one embodiment, the ISP circuit 32 may also support an RGB-666 format, which generally provides one plane of 6-bit red, green, and blue components interleaved in RGB order. In such an embodiment, when the RGB-666 format is selected, the RGB-666 pixel data may be stored in memory using the RGB-888 format shown in FIG. 31, but with each pixel left-justified and the two least significant bits (LSBs) set to 0.
In some embodiments, the ISP circuitry 32 may also support RGB pixel formats that allow pixels to have an extended range and precision of floating point values. For example, in one embodiment, the ISP circuitry 32 may support the RGB pixel format shown in fig. 32, in which the red (R0), green (G0), and blue (B0) color components are represented as 8-bit values with a shared 8-bit exponent (E0). Thus, in such embodiments, the actual red (R′), green (G′), and blue (B′) values defined by R0, G0, B0, and E0 may be expressed as:
R′=R0[7:0]*2^E0[7:0]
G′=G0[7:0]*2^E0[7:0]
B′=B0[7:0]*2^E0[7:0]
this pixel format may be referred to as the RGBE format, which is sometimes also referred to as the Radiance image pixel format.
Fig. 33 and 34 illustrate additional RGB pixel formats that may be supported by the ISP circuit 32. In particular, FIG. 33 depicts a pixel format that can hold 9-bit red, green, and blue components with a 5-bit shared exponent. For example, the upper 8 bits [8:1] of each of the red, green, and blue pixels are stored in corresponding bytes in memory. An additional byte is used to hold the 5-bit exponent (e.g., E0[4:0]) and the least significant bit [0] of each of the red, green, and blue pixels. Thus, in such embodiments, the actual red (R′), green (G′), and blue (B′) values defined by R0, G0, B0, and E0 may be expressed as:
R′=R0[8:0]*2^E0[4:0]
G′=G0[8:0]*2^E0[4:0]
B′=B0[8:0]*2^E0[4:0]
In addition, the pixel format illustrated in FIG. 33 is also flexible in that it is compatible with the RGB-888 format shown in FIG. 31. For example, in some embodiments, the ISP processing circuitry 32 may process the full RGB values with the exponential component, or may process only the upper 8-bit portion [8:1] of each RGB color component in a manner similar to the RGB-888 format.
Fig. 34 depicts a pixel format that can hold 10-bit red, green, and blue components with a 2-bit shared exponent. For example, the upper 8 bits [9:2] of each of the red, green, and blue pixels are stored in corresponding bytes in memory. An additional byte is used to hold the 2-bit exponent (e.g., E0[1:0]) and the least significant 2 bits [1:0] of each of the red, green, and blue pixels. Thus, in such embodiments, the actual red (R′), green (G′), and blue (B′) values defined by R0, G0, B0, and E0 may be expressed as:
R′=R0[9:0]*2^E0[1:0]
G′=G0[9:0]*2^E0[1:0]
B′=B0[9:0]*2^E0[1:0]
in addition, similar to the pixel format shown in fig. 33, the pixel format illustrated in fig. 34 is also flexible because it is compatible with the RGB-888 format shown in fig. 31. For example, in some embodiments, the ISP processing circuitry 32 may process full RGB values having exponential components, or may process only the upper 8-bit portions of each RGB color component (e.g., [9:2]) in a manner similar to the RGB-888 format.
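For illustration, the decode common to the three shared-exponent formats described above (8-bit components with an 8-bit exponent, 9-bit components with a 5-bit exponent, and 10-bit components with a 2-bit exponent) can be sketched as follows; extracting the raw R0, G0, B0, and E0 fields from their memory layout is omitted, and the function name is hypothetical.

```c
#include <stdint.h>
#include <math.h>

/* Decode one shared-exponent RGB pixel into linear values per the formulas
 * above: R' = R0 * 2^E0, and likewise for G and B. The caller supplies the
 * already-extracted component and exponent fields for whichever variant
 * (8/8, 9/5, or 10/2 bits) is in use. */
static void decode_shared_exponent_rgb(uint32_t r0, uint32_t g0, uint32_t b0,
                                       uint32_t e0, double out[3])
{
    double scale = ldexp(1.0, (int)e0);   /* 2^E0 */
    out[0] = (double)r0 * scale;
    out[1] = (double)g0 * scale;
    out[2] = (double)b0 * scale;
}
```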
The ISP circuitry 32 may additionally support certain formats of YCbCr (YUV) luminance and chrominance pixels in the sensor interface source/destination frames (e.g., 310). For example, YCbCr image frames may be received from a sensor interface (e.g., in embodiments where the sensor interface includes on-board demosaicing logic and logic configured to convert RGB image data to the YCC color space) and saved to memory 108. In one embodiment, the ISP front end processing logic 80 may bypass pixel and statistics processing when YCbCr frames are received. For example, the image processing circuitry 32 may support the following YCbCr pixel formats: YCbCr-4:2:0 (8-bit, 2-plane); and YCbCr-4:2:2 (8-bit, 1-plane).
The YCbCr-4:2:0 (8-bit, 2-plane) pixel format may provide two separate image planes in memory, one for luminance pixels (Y) and one for chrominance pixels (Cb, Cr), where the chrominance plane interleaves the Cb and Cr pixel samples. In addition, the chrominance plane may be subsampled by 1/2 in both the horizontal (x) and vertical (y) directions. An example of how YCbCr-4:2:0 (8-bit, 2-plane) data is stored in memory is shown in fig. 35, which depicts a luminance plane 347 for storing luminance (Y) samples and a chrominance plane 348 for storing chrominance (Cb, Cr) samples. The YCbCr-4:2:2 (8-bit, 1-plane) format, represented in fig. 36, may include one image plane of interleaved luminance (Y) and chrominance (Cb, Cr) pixel samples, with the chrominance samples subsampled by 1/2 in the horizontal (x) direction. In some embodiments, the ISP circuitry 32 may also support a 10-bit YCbCr pixel format by saving the pixel samples to memory using the 8-bit format described above with rounding (e.g., the two least significant bits of the 10-bit data are rounded off). Further, it will be appreciated that YC1C2 values may also be stored using any of the RGB pixel formats discussed above in FIGS. 30-34, with the Y, C1, and C2 components each stored in a manner analogous to the R, G, and B components, respectively.
Referring back to the ISP front end processing logic 80 shown in fig. 10, various read and write channels to the memory 108 are provided. In one embodiment, the read/write channels may share a common data bus, which may be provided using an advanced microcontroller bus architecture, such as an Advanced Extensible Interface (AXI) bus, or any other suitable type of bus (AHB, ASB, APB, ATB, etc.). Depending on the image frame information (e.g., pixel format, addressing format, packing method), which may be determined with the aid of the control registers as described above, an address generation block, which may be implemented as part of the control logic 84, may be configured to provide address and burst size information to the bus interface. For example, the address calculation may depend on various parameters, such as whether the pixel data is packed, the pixel data format (e.g., RAW8, RAW10, RAW12, RAW14, RAW16, RGB, or YCbCr/YUV formats), whether a tiled or linear addressing format is used, the x-offset and y-offset of the image frame data relative to the memory array, and the frame width, height, and stride. Other parameters that may be used when calculating pixel addresses may include the minimum pixel unit value (MPU), an offset mask, the bytes per MPU value (BPPU), and the log2 of the MPU value (L2MPU). Table 4, shown below, illustrates these parameters for packed and unpacked pixel formats, according to one embodiment.
TABLE 4 Pixel Address calculation parameters (MPU, L2MPU, BPPU)
It should be understood that the MPU and BPPU settings allow the ISP circuitry 32 to determine the number of pixels that need to be read in order to access one pixel, even if not all of the read data is needed. That is, the MPU and BPPU settings allow the ISP circuitry 32 to read pixel data formats that are aligned with memory bytes (e.g., where a multiple of 8 bits (1 byte) is used to store a pixel value) as well as pixel data formats that are not aligned with memory bytes (e.g., where pixel values are stored using fewer or more than a multiple of 8 bits (1 byte), such as RAW10, RAW12, etc.).
Referring to fig. 37, an example of the location of an image frame 350 stored in memory according to linear addressing is illustrated, where each block represents 64 bytes (as described above in fig. 21). In one embodiment, the following pseudo code illustrates processing that may be implemented by control logic to identify the starting block and block width of a memory frame under linear addressing.
BlockWidth = LastBlockX - BlockOffsetX + 1; where:
BlockOffsetX = (((OffsetX >> L2MPU) * BPPU) >> 6)
LastBlockX = ((((OffsetX + Width - 1) >> L2MPU) * BPPU + BPPU - 1) >> 6)
BlockStart = OffsetY * Stride + BlockOffsetX
Here, Stride represents the frame stride, which is a multiple of 64 bytes. For example, in fig. 37, SrcStride and DstStride are equal to 4, meaning 4 blocks of 64 bytes. Referring to Table 4 above, the values of L2MPU and BPPU depend on the format of the pixels in the frame 350. As shown, once BlockOffsetX is known, BlockStart can be determined. BlockWidth may then be determined using BlockOffsetX and LastBlockX, which may be calculated using the L2MPU and BPPU values corresponding to the pixel format of the frame 350.
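The linear-addressing calculation above can be written as a small C helper. This is only a sketch: L2MPU and BPPU come from Table 4 for the pixel format in use, and, following the FIG. 37 example, the stride is assumed here to be expressed in 64-byte blocks so that BlockStart is a block index (a stride given in bytes would be divided by 64 first).

```c
#include <stdint.h>

/* Compute the starting 64-byte block and block width of a frame stored with
 * linear addressing, following the pseudocode above.
 *   offset_x, offset_y : frame offsets in pixels
 *   width              : frame width in pixels
 *   stride_blocks      : frame stride in 64-byte blocks (assumed unit, per FIG. 37)
 *   l2mpu, bppu        : log2 of the minimum pixel unit and bytes per MPU (Table 4)
 */
static void linear_block_params(uint32_t offset_x, uint32_t offset_y,
                                uint32_t width, uint32_t stride_blocks,
                                uint32_t l2mpu, uint32_t bppu,
                                uint32_t *block_start, uint32_t *block_width)
{
    uint32_t block_offset_x = ((offset_x >> l2mpu) * bppu) >> 6;
    uint32_t last_block_x   = (((offset_x + width - 1) >> l2mpu) * bppu + bppu - 1) >> 6;

    *block_width = last_block_x - block_offset_x + 1;
    *block_start = offset_y * stride_blocks + block_offset_x;
}
```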
A similar example in terms of block addressing is depicted in fig. 38, where a source image frame 352 is stored in memory and overlaps a portion of Tile0, Tile 1, Tile n, and Tile n + 1. In one embodiment, the following pseudo code illustrates a process that may be implemented by control logic to identify a starting block and block width of a memory frame under block addressing.
BlockWidth = LastBlockX - BlockOffsetX + 1; where:
BlockOffsetX = (((OffsetX >> L2MPU) * BPPU) >> 6)
LastBlockX = ((((OffsetX + Width - 1) >> L2MPU) * BPPU + BPPU - 1) >> 6)
BlockStart = (OffsetY >> 4) * (Stride >> 6) + (BlockOffsetX >> 2) * 64 + OffsetY[3:0] * 4 + BlockOffsetX[1:0]
In the above calculation, the expression "(OffsetY>>4)*(Stride>>6)" may represent the number of blocks needed to reach the row of tiles in which the image frame is located in memory. The expression "(BlockOffsetX>>2)*64" may represent the number of blocks by which the stored image frame is offset in the x-direction. The expression "OffsetY[3:0]*4" may represent the number of blocks needed to reach the row, within the tile, in which the start address of the image frame is located. Furthermore, the expression "BlockOffsetX[1:0]" may represent the number of blocks within that row needed to reach the x offset corresponding to the start address of the image frame. In addition, in the embodiment illustrated in fig. 38, the number of blocks per tile (BlocksPerTile) may be 64 blocks, and the number of bytes per block (BytesPerBlock) may be 64 bytes.
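An analogous sketch for tiled addressing is shown below, again only mirroring the arithmetic of the pseudocode above; the reading of Stride as the tile stride in bytes (a multiple of 4096) is an assumption based on the definitions given earlier.

```c
#include <stdint.h>

/* Compute the starting 64-byte block of a frame stored with tiled addressing
 * (tiles of 256 bytes by 16 rows), following the pseudocode above. */
static uint32_t tiled_block_start(uint32_t offset_x, uint32_t offset_y,
                                  uint32_t stride, uint32_t l2mpu, uint32_t bppu)
{
    uint32_t block_offset_x = ((offset_x >> l2mpu) * bppu) >> 6;

    return (offset_y >> 4) * (stride >> 6)   /* blocks to reach the row of tiles         */
         + (block_offset_x >> 2) * 64        /* blocks to reach the tile within the row  */
         + (offset_y & 0xF) * 4              /* blocks to reach the line within the tile */
         + (block_offset_x & 0x3);           /* block within that line                   */
}
```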
As shown above in table 4, for pixels stored in the RAW10, RAW12, and RAW14 packed formats, 4 pixels constitute a minimum pixel unit (MPU) of 5, 6, or 7 bytes (BPPU), respectively. For example, referring to the RAW10 pixel format example shown in fig. 27, an MPU of the 4 pixels P0-P3 includes 5 bytes, where the upper 8 bits of each of the pixels P0-P3 are stored in 4 corresponding bytes, and the lower 2 bits of each of the pixels are stored together in bits 0-7 of the 32-bit word at address 01h. Similarly, referring to FIG. 28, an MPU of 4 pixels P0-P3 using the RAW12 format includes 6 bytes, where the lower 4 bits of pixels P0 and P1 are stored in the byte corresponding to bits 16-23 of address 00h, and the lower 4 bits of pixels P2 and P3 are stored in the byte corresponding to bits 8-15 of address 01h. Fig. 29 shows that an MPU of 4 pixels P0-P3 using the RAW14 format includes 7 bytes, with 4 bytes used to hold the upper 8 bits of each pixel of the MPU and 3 bytes used to hold the lower 6 bits of each pixel of the MPU.
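For illustration, one RAW10 MPU can be unpacked as sketched below. The placement of the low-order bits within the fifth byte (P0 in bits 1:0, P1 in bits 3:2, and so on) is an assumption made for this sketch only and may differ from the actual layout shown in fig. 27.

```c
#include <stdint.h>

/* Unpack one packed RAW10 minimum pixel unit: 5 bytes holding 4 pixels.
 * Bytes 0-3 hold the upper 8 bits of P0-P3; byte 4 holds the lower 2 bits of
 * each pixel (bit placement within byte 4 is assumed for illustration). */
static void unpack_raw10_mpu(const uint8_t mpu[5], uint16_t pix[4])
{
    for (int i = 0; i < 4; ++i) {
        uint16_t hi = mpu[i];                                    /* upper 8 bits          */
        uint16_t lo = (uint16_t)((mpu[4] >> (2 * i)) & 0x3);     /* assumed low-bit slots */
        pix[i] = (uint16_t)((hi << 2) | lo);                     /* 10-bit pixel value    */
    }
}
```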
With these pixel formats, a partial MPU of fewer than 4 pixels may occur at the end of a frame line when the line width is not a multiple of 4 (e.g., when line width modulo 4 is not 0). When reading a partial MPU, the unused pixels may be ignored. Similarly, when writing a partial MPU to a destination frame, the unused pixels may be written with a value of 0. Furthermore, in some cases, the last MPU of a frame line may not be aligned to a 64-byte block boundary. In one embodiment, the bytes following the last MPU, up to the end of the last 64-byte block, are not written.
In accordance with embodiments of the present disclosure, the ISP processing circuitry 32 may also be configured to provide overflow handling. For example, an overflow condition (also referred to as "overrun") may occur in certain situations in which the ISP front-end processing logic 80 receives back pressure from its own internal processing units, from downstream processing units (e.g., the ISP pipeline 82 and/or the ISP back-end processing logic 120), or from the destination memory (e.g., the memory to which the image data is to be written). An overflow condition may occur when pixel data is being read in (e.g., from the sensor interface or from memory) faster than one or more processing blocks can process the data, or faster than the data can be written to a destination (e.g., memory 108).
As described further below, reading from and writing to memory may contribute to an overflow condition. However, since the input data is stored in memory, in the case of an overflow condition the ISP circuitry 32 may simply stall the reading of the input data until the overflow condition recovers. However, when image data is being read directly from the image sensor, the "live" data generally cannot be stalled, because the image sensor is typically acquiring image data in real time. For example, an image sensor (e.g., 90) may operate on a timing signal based on its own internal clock and may be configured to output image frames at a certain frame rate, such as 15 or 30 frames per second (fps). The sensor inputs to the ISP circuitry 32 and the memory 108 may therefore include input queues that buffer the incoming image data before it is processed (by the ISP circuitry 32) or written to memory (e.g., 108). Accordingly, an overflow condition may occur if image data is received at an input queue faster than it can be read out of the queue and processed or stored (e.g., written to memory). That is, if the buffer/queue is full, additional incoming pixels cannot be buffered and, depending on the overflow handling technique implemented, may be discarded.
Fig. 39 shows a block diagram of the ISP processing circuitry 32 and focuses on features of the control logic 84 that may provide overflow handling, according to one embodiment. As shown, image data associated with Sensor0 90a and Sensor1 90b may be read into the ISP front-end processing logic 80 (FEProc) from memory (using interfaces 174 and 176, respectively), or may be provided to the ISP front-end processing logic 80 directly from the respective sensor interfaces. In the latter case, incoming pixel data from the image sensors 90a and 90b may be passed to input queues 400 and 402, respectively, before being sent to the ISP front-end processing logic 80.
When an overflow condition occurs, the processing block (e.g., block 80, 82, or 120) or the memory (e.g., 108) in which the overflow occurred may provide a signal (as shown by signals 405, 407, and 408) that sets a bit in an interrupt request (IRQ) register 404. In this embodiment, the IRQ register 404 may be implemented as part of the control logic 84. In addition, separate IRQ registers 404 may be implemented for Sensor0 image data and for Sensor1 image data. Based on the values stored in the IRQ registers 404, the control logic 84 is able to determine which logic units within the ISP processing blocks 80, 82, 120, or within the memory 108, generated the overflow condition. The logic units may be referred to as "destination units," because they may constitute destinations to which pixel data is sent. Based on the overflow condition, the control logic 84 may also control (e.g., through firmware/software handling) which frames are dropped (e.g., either not written to memory or not output to a display for viewing).
Once an overflow condition is detected, the manner in which overflow handling is performed may depend on whether the ISP front end is reading pixel data from memory 108 or from the image sensor input queues (e.g., buffers) 400, 402, which in one embodiment may be first-in-first-out (FIFO) queues. In one embodiment, when input pixel data is read from memory 108 through, for example, an associated DMA interface (e.g., 174 or 176), the ISP front end will stall the reading of pixel data if it receives back pressure as a result of an overflow condition being detected (e.g., by the control logic 84 using the IRQ registers 404) from any downstream destination block, which may include the ISP pipeline 82, the ISP back-end processing logic 120, or the memory 108 in the case where the output of the ISP front-end logic 80 is written to memory 108. In this case, the control logic 84 may prevent overflow by stopping the reading of pixel data from memory 108 until the overflow condition recovers. For example, overflow recovery may be signaled when the downstream unit causing the overflow condition sets a corresponding bit in the IRQ register 404 indicating that overflow is no longer occurring. One embodiment of this process is generally illustrated by steps 412-420 of the method 410 of FIG. 40.
Although overflow conditions may generally be monitored at the sensor input queues, it should be understood that many additional queues may exist between the processing units of the ISP subsystem 32 (e.g., the internal units of the ISP front-end logic 80, the ISP pipeline 82, and the ISP back-end logic 120). Additionally, each internal unit of the ISP subsystem 32 may also include line buffers, which may also function as queues. Thus, all of the queues and line buffers of the ISP subsystem 32 may provide buffering. Accordingly, when the last processing block in a particular chain of processing blocks is full (e.g., its line buffers and any intermediate queues are full), back pressure may be applied to the preceding (e.g., upstream) processing block, and so on, such that the back pressure propagates up the chain of logic until it reaches the sensor interface, where overflow conditions may be monitored. Thus, when an overflow occurs at the sensor interface, it may mean that all downstream queues and line buffers are full.
As shown in fig. 40, the method 410 begins at block 412, where pixel data for the current frame is read from memory into the ISP front-end processing logic 80. Decision logic 414 then determines whether an overflow condition exists. As described above, this may be assessed by determining the state of bits in the IRQ registers 404. If no overflow condition is detected, the method 410 returns to step 412 and continues reading pixels of the current frame. If decision logic 414 detects an overflow condition, the method 410 stops reading pixels of the current frame from memory, as shown at block 416. Subsequently, at decision logic 418, a determination is made as to whether the overflow condition has recovered. If the overflow condition persists, the method 410 waits at decision logic 418 until the overflow condition recovers. If decision logic 418 indicates that the overflow condition has recovered, the method 410 proceeds to block 420 and resumes reading pixel data for the current frame from memory.
When an overflow condition occurs while input pixel data is being read in from the sensor interface, the interrupt may indicate which downstream unit (e.g., a processing block or destination memory) generated the overflow. In one embodiment, overflow handling may be provided according to two scenarios. In the first scenario, the overflow condition occurs during an image frame but recovers before the start of the next image frame. In this case, input pixels from the image sensor are discarded until the overflow condition recovers and space becomes available in the input queue corresponding to the image sensor. The control logic 84 may provide a counter 406 that keeps track of the number of discarded pixels and/or discarded frames. When the overflow condition recovers, the discarded pixels may be replaced with an undefined pixel value (e.g., all 1's (e.g., 11111111111111 for a 14-bit pixel value), all 0's, or a value programmed into a data register that defines what the undefined pixel value is), so that downstream processing may resume. In a further embodiment, the discarded pixels may be replaced with the previous non-overflowed pixel (e.g., the last "good" pixel read into the input buffer). This ensures that the correct number of pixels (e.g., a number of pixels corresponding to the number of pixels expected in a complete frame) is sent to the ISP front-end processing logic 80, thereby enabling the ISP front-end processing logic 80 to output the correct number of pixels for the frame that was being read in from the sensor input queue when the overflow occurred.
Although the correct number of pixels may be output by the ISP front end in the first scenario, depending on the number of pixels discarded and replaced during the overflow condition, a software process (e.g., firmware) that may be implemented as part of the control logic 84 may choose to drop (e.g., exclude) the image frame from being sent to the display and/or written to memory. Such a determination may be based on, for example, a comparison of the value of the discarded pixel counter 406 against an acceptable discarded-pixel threshold. For example, if an overflow condition occurs only briefly during the frame, such that only a relatively small number of pixels is discarded (e.g., replaced with undefined or dummy values; e.g., 10-20 pixels or fewer), then the control logic 84 may choose to display and/or store the image despite the small number of discarded pixels, even though the replacement pixels may appear very briefly as a minor artifact in the final image. Due to the small number of replacement pixels, however, such an artifact may go unnoticed by the user or may be only marginally perceptible. That is, the presence of any such artifacts due to the undefined pixels from the brief overflow condition does not significantly degrade the aesthetic quality of the image (e.g., any such degradation is extremely slight or negligible to the human eye).
In the second scenario, the overflow condition may persist into the beginning of the next image frame. In this case, the pixels of the current frame are discarded and counted in the same way as in the first scenario described above. However, if the overflow condition is still present when a VSYNC rising edge is detected (e.g., indicating the start of the next frame), the ISP front-end processing logic 80 may be configured to hold off the next frame, thereby dropping the entire next frame. In this scenario, the next frame and subsequent frames continue to be dropped until the overflow recovers. Once the overflow recovers, the previous current frame (e.g., the frame being read in when the overflow was first detected) may have its discarded pixels replaced with the undefined pixel value, thereby allowing the ISP front-end logic 80 to output the correct number of pixels for that frame. Thereafter, downstream processing may resume. With regard to dropped frames, the control logic 84 may also include a counter that counts the number of dropped frames. This data may be used to adjust the timing for audio-video synchronization. For example, for video captured at a frame rate of 30 fps, each frame has a duration of approximately 33 milliseconds. Thus, if 3 frames are dropped due to overflow, the control logic 84 may be configured to adjust the audio-video synchronization parameters to account for the approximately 99-millisecond (33 milliseconds x 3 frames) duration attributable to the dropped frames. For example, to compensate for the time attributable to the dropped frames, the control logic 84 may control the image output by repeating one or more previous frames.
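A minimal sketch of the timing adjustment just described is shown below; the function name and the use of microseconds are illustrative only.

```c
#include <stdint.h>

/* Duration (in microseconds) attributable to dropped frames, which may be used
 * to adjust the audio-video synchronization parameters. For 30 fps video,
 * 3 dropped frames give roughly 100,000 us, consistent with the approximately
 * 33 milliseconds x 3 frames example above. Assumes fps is nonzero. */
static uint64_t dropped_frame_duration_us(uint32_t dropped_frames, uint32_t fps)
{
    return (uint64_t)dropped_frames * 1000000u / fps;
}
```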
An embodiment of a method 430 representing the above-described situations, which may occur when pixel data is read from the sensor interface, is illustrated in FIG. 41. As shown, the method 430 begins at block 432, where pixel data for the current frame is read from the sensor into the ISP front-end processing logic 80. Decision logic 434 then determines whether an overflow condition exists. If there is no overflow, the method 430 continues reading in the pixels of the current frame (e.g., returns to block 432). If decision logic 434 determines that an overflow condition exists, the method 430 continues to block 436, where the next incoming pixel of the current frame is discarded. Decision logic 438 then determines whether the current frame has ended and the next frame has started. For example, in one embodiment, this may include detecting a rising edge of the VSYNC signal. If the sensor is still transmitting the current frame, the method 430 continues to decision logic 440, which determines whether the overflow condition originally detected at logic 434 is still present. If the overflow condition has not recovered, the method 430 proceeds to block 442, where the discarded pixel counter is incremented (e.g., to account for the incoming pixel discarded at block 436). The method then returns to block 436 and continues.
If it is detected at decision logic 438 that the current frame has ended and the sensor is sending the next frame (e.g., a VSYNC rising edge is detected), the method 430 proceeds to block 450, and all pixels of the next frame, and of subsequent frames, are discarded as long as the overflow condition remains (e.g., as represented by decision block 452). As described above, a separate counter may track the number of dropped frames, which may be used to adjust the audio-video synchronization parameters. If decision logic 452 indicates that the overflow condition has recovered, the discarded pixels of the initial frame in which the overflow condition first occurred are replaced with a number of undefined pixel values corresponding to the number of discarded pixels of that frame (as indicated by the discarded pixel counter). As described above, the undefined pixel value may be all 1's, all 0's, a replacement value programmed into a data register, or the value of the preceding pixel read before the overflow condition (e.g., the last pixel read before the overflow condition was detected). Accordingly, this allows the initial frame to be processed with the correct number of pixels, and downstream image processing, including writing the initial frame to memory, may continue at block 446. Also as described above, depending on the number of pixels dropped in the frame, the control logic 84 may choose to exclude or include the frame when outputting the video data (e.g., depending on whether the number of dropped pixels is greater than or less than an acceptable dropped-pixel threshold). It should be understood that overflow handling may be performed separately for each of the input queues 400 and 402 of the ISP subsystem 32.
Another embodiment of overflow handling that may be implemented in accordance with the present disclosure is shown in FIG. 42 by way of a flow chart depicting a method 460. Here, an overflow condition that occurs during the current frame but recovers before the current frame ends is handled in the same manner as shown in fig. 41, and those steps are therefore numbered with the same reference numerals 432-446. The difference between the method 460 of FIG. 42 and the method 430 of FIG. 41 lies in the overflow handling when the overflow condition continues into the next frame. For example, referring to decision logic 438, rather than dropping the next frame as in the method 430 of FIG. 41, when the overflow condition extends into the next frame, the method 460 instead implements block 462, in which the discarded pixel counter is cleared, the sensor input queue is cleared, and the control logic 84 is signaled to drop the partial current frame. By clearing the sensor input queue and the discarded pixel counter, the method 460 prepares for the next frame (which now becomes the current frame), and the method returns to block 432. Pixels of this current frame may then be read into the sensor input queue. If the overflow condition recovers before the input queue becomes full, downstream processing resumes. However, if the overflow condition persists, the method 460 continues from block 436 (e.g., pixels begin to be discarded until the overflow recovers or the next frame begins).
As described above, the electronic device 10 may also provide for the capture of audio data (e.g., via an audio capture device provided as one of the input structures 14) concurrently with image data (e.g., via the imaging device 30 having the image sensor 90). For example, as schematically illustrated in fig. 43, audio data 470 and image data 472 may represent audio and video data captured simultaneously by the electronic device. The audio data 470 may include audio samples 474 captured over time (t), and, similarly, the image data 472 may represent a series of image frames captured over time t. Each frame 476 of the image data 472 may represent a still image frame. Thus, when the still image frames are viewed continuously in chronological order (e.g., at a particular number of frames per second, such as 15 to 30 frames per second), a viewer perceives the appearance of a moving image, thereby providing video data. When the audio data 470 is acquired and represented as digital data, it may be stored as binary values representing samples (e.g., 474) of the amplitude of the audio signal at equal time intervals. Further, although the audio data 470 is shown in fig. 43 as having discrete portions 474, it should be understood that, in a practical implementation, the audio data may have a sampling rate high enough that the human ear perceives the audio data 470 as continuous sound.
During playback of the video data 472, the corresponding audio data 470 may also be played back, allowing a viewer not only to view the video data of the captured event, but also to hear sound corresponding to the captured event. Ideally, the video data 472 and the audio data 470 are played back synchronously. For example, if the audio sample 474a was originally captured at time t_A, then under ideal playback conditions the image frame originally captured at time t_A is output in synchronization with the audio sample 474a. However, if synchronization is not achieved, the viewer/listener may notice a delay or offset between the audio data and the video data. For example, suppose the audio sample 474a is output together with the image frame 476c, which was originally captured at time t_0 (earlier in time than t_A). In this case, the audio data 470 is "ahead of" the video data 472, and the user may experience a delay between hearing the audio sample from time t_A and seeing its expected corresponding video sample (the image frame 476a from time t_A), the delay being the difference between times t_A and t_0. Similarly, suppose instead that the audio sample 474a is output together with the image frame 476b, which was captured at time t_B (later in time than t_A). In this latter case, the audio data 470 is "behind" the video data 472, and the user may experience a delay between seeing the video sample (476a) from time t_A and hearing its corresponding audio sample from time t_A, the delay being the difference between times t_A and t_B. Delays of this type are sometimes referred to as "lip sync" error. It will be appreciated that the latter two scenarios may adversely affect the user experience. To achieve audio-video synchronization, a system is typically configured such that any compensation for a synchronization problem prioritizes audio over video; for example, if a synchronization problem exists, image frames may be dropped or repeated without altering the audio.
In some conventional systems, the audio data and video data are synchronized using a start frame interrupt (e.g., based on a VSYNC signal). When such an interrupt occurs (indicating the start of a new frame), the processor may execute an interrupt service routine to service the interrupt (e.g., clear a bit), with a timestamp associated with the frame corresponding to the time that the processor serviced the interrupt. It will be appreciated that there is typically some delay between the interrupt request and the time the processor services the interrupt. Thus, the time stamp associated with a particular image frame may reflect the time delay and thus may not actually represent the precise time at which the image frame actually begins. In addition, the latency may vary with processor load and bandwidth, which may further complicate the audio-video synchronization problem.
As described above, the ISP front-end logic 80 may operate in its own clock domain and provide an asynchronous interface to the sensor interface 94 to support sensors of different sizes and with different timing requirements. To provide for synchronization of audio data and video data, the ISP processing circuitry 32 may utilize the ISP front end clock to provide a counter that may be used to generate a timestamp that may be associated with a captured image frame. For example, referring to FIG. 44, in one embodiment, the four registers that may all be used to provide the time stamps (including timer configuration register 490, time code register 492, Sensor0 time code register 494, and Sensor1 time code register 496) operate at least in part according to the clock of the ISP front-end processing logic 80. In one embodiment, registers 490, 492, 494, and 496 may comprise 32-bit registers.
The timer configuration register 490 may be configured to provide a value NClk, which may be used in generating the timestamp codes. In one embodiment, NClk may be a 4-bit value from 0 to 15. Based on NClk, the value of the timer or counter indicating the current time code may be incremented by 1 every 2^NClk clock cycles (based on the ISP front-end clock domain). The current time code may be stored in the time code register 492, thereby providing a time code with 32-bit resolution. The time code register 492 may also be reset by the control logic 84.
Referring briefly to fig. 10, for each sensor interface input Sif0 and Sif1, when a rising edge is detected on the Vertical Synchronization (VSYNC) signal (or if a falling edge is detected, depending on how VSYNC is configured), the time code register 492 may be sampled, thereby indicating the start of a new frame (e.g., at the end of the vertical blanking interval). The time code corresponding to the VSYNC rising edge may be saved in a time code register 494 or 496, depending on the Sensor (Sensor0 or Sensor1) providing the image frame, providing a timestamp indicating the time at which capture of the current frame began. In some embodiments, the VSYNC signal from the sensor may have a programmed or programmable delay. For example, if the first pixel of a frame is delayed by n clock cycles, control logic 84 may be configured to compensate for the delay, such as by setting an offset in hardware or using software/firmware compensation. Thus, a timestamp may be generated from the VSYNC rising edge with the programmed delay added. In another embodiment, the falling edge of the VSYNC signal with a programmable delay may be utilized to determine a timestamp corresponding to the start of a frame.
In processing the current frame, control logic 84 reads a timestamp from a sensor time code register (494 or 496), which may be associated with the video image frame as a parameter in metadata associated with the image frame. This is more clearly shown in fig. 45, where fig. 45 provides an illustration of an image frame 476 and its associated metadata 498, which metadata 498 includes a timestamp 500 read from an appropriate time code register (e.g., register 494 for Sensor0 or register 496 for Sensor 1). In one embodiment, control logic 84 may then read the timestamp from the time code register when triggered by the start frame interrupt. Thus, each image frame captured by the ISP processing circuitry 32 may have an associated timestamp based on the VSYNC signal. Control circuitry or firmware that may be implemented as part of ISP control logic 84 or as part of a separate control unit of electronic device 10 may align or synchronize a corresponding set of audio data using image frame time stamps to achieve audio-video synchronization.
In some embodiments, device 10 may include an audio processor configured to be responsible for the processing of audio data (e.g., audio data 470). For example, the audio processor may be a stand-alone processing unit (e.g., part of the processor 16), or may be integrated with the main processor, or may be part of a system-on-a-chip processing device. In such embodiments, the audio processor and the image processing circuit 32, which may be controlled by a processor separate from the audio processor (e.g., part of the control logic 84), may operate according to separate clocks. For example, the clock may be generated using a separate Phase Locked Loop (PLL). Thus, for audio-video synchronization, the device 10 may need to be able to associate image timestamps with audio timestamps. In one embodiment, this association may be implemented with a host processor (e.g., CPU) of device 10. For example, the host processor may synchronize its own clock with the clock of the audio processor and the clock of the ISP circuit 32 to determine the difference between the clock of the audio processor and the clock of the ISP circuit 32. Once known, the difference can be used to correlate a timestamp of the audio data (e.g., 470) with an image frame timestamp of the image data (e.g., 472).
In one embodiment, the control logic 84 may also be configured to handle a wrap-around condition, such as when the maximum value of the 32-bit time code is reached and a further increment would require an additional bit (e.g., 33 bits) to provide an accurate value. To provide a simplified example, this type of wrap-around occurs on a four-digit counter when the value 9999 is incremented: limited to four digits, the counter returns 0000 rather than 10000. While the control logic 84 is able to reset the time code register 492, it is not advisable to reset the time code register 492 when a wrap-around condition occurs while a video session is still being captured. Thus, in this case, the control logic 84 may include logic, which in one embodiment may be implemented in software, configured to handle the wrap-around condition by generating a higher precision timestamp (e.g., 64 bits) based on the 32-bit register value. The software may generate the higher precision timestamps, which may be written into the image frame metadata, until the time code register 492 is reset. In one embodiment, the software may be configured to detect the wrap-around and to add the time difference resulting from the wrap-around to a higher resolution counter. For example, in one embodiment, when a wrap-around condition is detected for the 32-bit counter, the software may sum the maximum value of the 32-bit counter (to account for the wrap-around) and the current time value indicated by the 32-bit counter, and store the result in a higher resolution counter (e.g., greater than 32 bits). In such a case, the result in the high resolution counter may be written into the image metadata information until the 32-bit counter is reset.
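A minimal sketch of the software wrap-around handling described above is shown below, assuming the 32-bit time code is sampled often enough that at most one wrap-around occurs between samples; the names are illustrative only.

```c
#include <stdint.h>

/* Software extension of the 32-bit time code into a 64-bit timestamp. A
 * wrap-around is detected when a newly sampled code is smaller than the
 * previous one; the high part is then advanced by one full 32-bit period so
 * that the 64-bit timestamp keeps increasing until the register is reset. */
typedef struct {
    uint32_t last_code;   /* previously sampled 32-bit time code */
    uint64_t high;        /* accumulated full 32-bit periods     */
} TimeCode64;

static uint64_t extend_time_code(TimeCode64 *tc, uint32_t code32)
{
    if (code32 < tc->last_code)
        tc->high += (uint64_t)1 << 32;   /* wrap-around detected */
    tc->last_code = code32;
    return tc->high + code32;
}
```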
Fig. 46 depicts a method 510 that outlines the audio-video synchronization techniques discussed above. As shown, the method 510 begins at step 512, where pixel data is received from an image Sensor (e.g., Sensor0 or Sensor1). Thereafter, at decision logic 514, a determination is made as to whether the VSYNC signal indicates the start of a new frame. If a new frame is not detected, the method 510 returns to step 512 and continues receiving pixel data for the current image frame. If a new frame is detected at decision logic 514, the method 510 continues to step 516, where the time code register (e.g., register 492) is sampled to obtain the timestamp value corresponding to the rising (or falling) VSYNC edge detected at step 514. Subsequently, at step 518, the timestamp value is stored in the time code register (e.g., register 494 or 496) corresponding to the image sensor providing the input pixel data. Subsequently, at step 520, the timestamp is associated with the metadata of the new image frame, after which the timestamp information in the image frame metadata may be used for audio-video synchronization. For example, the electronic device 10 may be configured to provide audio-video synchronization by aligning the video data (using the timestamps of the individual frames) with the corresponding audio data in a manner that substantially minimizes any delay between the corresponding audio and video outputs. For example, as described above, the main processor of the device 10 may be utilized to determine how to correlate the audio timestamps with the video timestamps. In one embodiment, if the audio data is ahead of the video data, image frames may be dropped to allow the correct image frame to "catch up" with the audio data stream, and if the audio data is behind the video data, image frames may be repeated to allow the audio data to "catch up" with the video stream.
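For illustration, the drop-or-repeat decision described above can be sketched as follows, assuming the audio and video timestamps have already been converted to a common time base (e.g., by the host processor correlating the two clocks); the tolerance and names are illustrative only.

```c
#include <stdint.h>

typedef enum { FRAME_PRESENT, FRAME_DROP, FRAME_REPEAT } FrameAction;

/* Decide how to present a video frame relative to the audio clock. Both
 * timestamps are assumed to be in the same time base (e.g., microseconds). */
static FrameAction av_sync_action(int64_t video_ts_us, int64_t audio_ts_us,
                                  int64_t tolerance_us)
{
    int64_t diff = video_ts_us - audio_ts_us;
    if (diff < -tolerance_us) return FRAME_DROP;    /* audio ahead: drop frames to catch up */
    if (diff >  tolerance_us) return FRAME_REPEAT;  /* audio behind: repeat frames to wait  */
    return FRAME_PRESENT;
}
```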
Turning to FIGS. 47-50, the ISP processing logic or subsystem 32 may also be configured to provide flash (also referred to as "strobing") synchronization. For example, when a flash module is utilized, artificial illumination may be temporarily provided to help illuminate the image scene. The use of a flash is beneficial, for example, when capturing an image scene in low lighting conditions. The flash or strobe may be provided by any suitable light source, such as an LED flash device or a xenon flash device, for example.
In this embodiment, the ISP system 32 may include a flash controller configured to control the timing and/or time interval of flash module activation. It will be appreciated that it is generally desirable to control the timing and duration of flash module activation such that the flash interval begins before the first pixel of a target frame (e.g., an image frame to be captured) is captured and ends after the last pixel of the target frame is captured but before the next successive image frame is started. This helps to ensure that all pixels within the target frame are exposed to similar lighting conditions while the image scene is being captured.
Referring to fig. 47, a block diagram representation of a flash controller implemented as part of ISP subsystem 32 and configured to control flash module 552 is illustrated, in accordance with one embodiment of the present disclosure. In some embodiments, flash module 552 may include more than one strobe device. For example, in some embodiments, flash controller 550 may be configured to provide a pre-flash (e.g., for red-eye removal) followed by a main flash. The pre-flash event and the main flash event may be sequential and may be provided using the same or different strobe devices.
In the illustrated embodiment, the timing of flash module 552 may be controlled based on timing information provided from image sensors 90a and 90b. For example, the timing of the image sensor may be controlled using a rolling shutter technique, whereby integration time is controlled using a slit aperture that is scanned across the pixel array of the image sensor (e.g., 90a and 90b). Using sensor timing information (here represented by reference numeral 556) that may be provided to ISP subsystem 32 via sensor interfaces 94a and 94b (which may include sensor-side interface 548 and front-side interface 549, respectively), control logic 84 may provide appropriate control parameters 554 to flash controller 550, which flash controller 550 then uses to enable flash module 552. As described above, by utilizing the sensor timing information 556, the flash controller 550 may ensure that the flash module is enabled before the first pixel of the target frame is captured, remains enabled for the duration of the target frame, and is disabled after the last pixel of the target frame is captured and before the start of the next image frame (e.g., the next VSYNC rising edge). This process may be referred to as a "flash synchronization" or "strobe synchronization" technique, which is discussed further below.
Additionally, as shown in the embodiment of fig. 47, the control logic 84 may also utilize statistical data 558 from the ISP front end 80 to determine whether the current lighting conditions in the image scene corresponding to the target frame are suitable for use of the flash module. For example, the ISP subsystem 32 may utilize automatic exposure to attempt to maintain a target exposure level (e.g., light level) by adjusting the integration time and/or sensor gain. It is understood, however, that the integration time cannot be longer than the frame time. For example, for video data acquired at 30 fps, the duration of each frame is approximately 33 milliseconds. Thus, if the target exposure level cannot be reached with the maximum integration time, sensor gain may also be applied. However, if neither the adjustment of the integration time nor the sensor gain achieves the target exposure (e.g., if the light level is less than a target threshold), the flash controller may be configured to enable the flash module. Furthermore, in one embodiment, although the integration time may be extended up to the duration of a frame, in some embodiments the integration time may be further limited to avoid motion blur.
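As a rough sketch of this decision logic (the parameter names, the linear exposure model, and the limits are illustrative assumptions, not values from this disclosure):

```python
# Sketch: auto-exposure decision to enable the flash. The linear exposure
# model, parameter names, and limits are illustrative assumptions.

def should_enable_flash(scene_light_level, target_level, frame_time_s,
                        blur_limited_integration_s, max_sensor_gain):
    # Integration time cannot exceed the frame time (~33 ms at 30 fps) and may
    # be further limited to avoid motion blur.
    max_integration = min(frame_time_s, blur_limited_integration_s)
    # Best exposure achievable without the flash under a simple linear model.
    best_exposure = scene_light_level * max_integration * max_sensor_gain
    # If maximum integration time and gain still miss the target, enable flash.
    return best_exposure < target_level
```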
As described above, to ensure that the enabling of the flash illuminates the target frame for the entire duration of the target frame (e.g., turning the flash on before the first pixel of the target frame and turning the flash off after the last pixel of the target frame), ISP subsystem 32 may use sensor timing information 556 to determine when to enable/disable flash 552.
Fig. 48 graphically illustrates how the sensor timing signal from the image sensor 90 is used to control flash synchronization. For example, FIG. 48 shows a portion of an image sensor timing signal 556 that may be provided by one of the image sensors 90a or 90b. The logic high portions of signal 556 represent frame intervals. For example, the first FRAME (FRAME N) is denoted by reference numeral 570, and the second FRAME (FRAME N+1) is denoted by reference numeral 572. The actual time at which the first frame 570 begins is indicated by the rising edge of signal 556 at time t_VSYNC_ra0 (e.g., "r" indicates a rising edge and "a" indicates the "actual" aspect of timing signal 556), and the actual time at which the first frame 570 ends is indicated by the falling edge of signal 556 at time t_VSYNC_fa0 (e.g., "f" indicates a falling edge). Similarly, the actual time at which the second frame 572 begins is indicated by the rising edge of signal 556 at time t_VSYNC_ra1, and the actual time at which the second frame 572 ends is indicated by the falling edge of signal 556 at time t_VSYNC_fa1. The interval 574 between the first and second frames may be referred to as a blanking interval (e.g., vertical blanking), which may allow the image processing circuitry (e.g., ISP subsystem 32) to identify when image frames end and begin. It should be understood that the frame intervals and vertical blanking intervals shown in fig. 48 are not necessarily drawn to scale.
As shown in fig. 48, signal 556 may represent the actual timing from the perspective of image sensor 90. That is, signal 556 represents the timing at which the image sensor actually acquires the frame. However, when the sensor timing information is provided to downstream components of the image processing system 32, delays may be introduced into the sensor timing signal. For example, signal 576 represents the delayed timing signal (delayed by a first time delay 578) as seen from the perspective of sensor-side interface 548 of the interface logic 94 between sensor 90 and ISP front-end processing logic 80. Signal 580 may represent the delayed sensor timing signal from the perspective of front-side interface 549, which is shown in fig. 48 as being delayed by a second time delay 582 relative to the sensor-side interface timing signal 576 and by a third time delay 584 relative to the initial sensor timing signal 556, where the third time delay 584 is equal to the sum of the first time delay 578 and the second time delay 582. Then, when signal 580 from the front-end side 549 of interface 94 is provided to ISP front-end processing logic 80 (FEProc), additional delay may be applied, so that delayed signal 588 is seen from the perspective of the ISP front-end processing logic 80. Specifically, the signal 588 seen by the ISP front-end processing logic 80 is shown delayed by a fourth time delay 590 relative to the delayed signal 580 (the front-side timing signal) and by a fifth time delay 592 relative to the initial sensor timing signal 556, where the fifth time delay 592 is equal to the sum of the first time delay 578, the second time delay 582, and the fourth time delay 590.
To control the flash timing, the flash controller 550 may utilize the first available signal that is offset from the actual sensor timing signal 556 by the least amount of delay. Thus, in this embodiment, flash controller 550 may determine the flash timing parameters from the sensor timing signal 580, as seen from the perspective of the front-end side 549 of the sensor-to-ISP interface 94. Thus, the signal 596 used by the flash controller 550 in this example may be identical to the signal 580. As shown, the delayed signal 596 (delayed by time delay 584 relative to signal 556) includes the frame interval between times t_VSYNC_rd0 and t_VSYNC_fd0 associated with the first frame 570 (e.g., "d" indicates "delayed"), and the frame interval between times t_VSYNC_rd1 and t_VSYNC_fd1 associated with the second frame 572. As noted above, it is generally desirable to enable the flash before the start of the frame and to keep it on for the duration of the frame (e.g., disabling the flash after the last pixel of the frame), both to ensure that the image scene is illuminated for the entire frame and to account for any warm-up time required for the flash to reach full intensity (which may be on the order of microseconds (e.g., 100-800 microseconds) to milliseconds (e.g., 1-5 milliseconds)). However, because the signal 596 analyzed by the flash controller 550 is delayed relative to the actual timing signal 556, this delay is taken into account when determining the flash timing parameters.
For example, assume that the flash is to be enabled to illuminate the image scene of the second frame 572. The delayed rising edge at t_VSYNC_rd1 occurs after the actual rising edge at t_VSYNC_ra1. Thus, the flash controller 550 cannot simply use the delayed rising edge t_VSYNC_rd1 to determine the flash enable start time, because that edge occurs after the second frame 572 has already started (e.g., after t_VSYNC_ra1 of signal 556). Instead, in this embodiment, the flash controller 550 may determine the flash enable start time based on the end of the previous frame (here, the falling edge at time t_VSYNC_fd0). For example, the flash controller 550 may add the time interval 600 (which represents the vertical blanking interval 574) to the time t_VSYNC_fd0 to calculate a time corresponding to the delayed rising edge t_VSYNC_rd1 of frame 572. Because the delayed rising edge time t_VSYNC_rd1 occurs after the actual rising edge time t_VSYNC_ra1 (of signal 556), the time offset 598 (OffSet1), corresponding to the time delay 584 of signal 580, is then subtracted from the sum of time t_VSYNC_fd0 and the blanking interval time 600. This aligns the flash enable start time with the start of the second frame 572 at time t_VSYNC_ra1. However, as described above, depending on the type of flash device provided (e.g., xenon flash device, LED flash device, etc.), the flash module 552 may experience a warm-up time between when the flash module is enabled and when the flash device reaches its full luminosity. Thus, to account for this warm-up time, another offset 602 (OffSet2), which may be programmable or preset (e.g., using a control register), may be subtracted from the time corresponding to t_VSYNC_ra1. This moves the flash enable start time back to time 604, ensuring that the flash is enabled before the start of the frame 572 acquired by the image sensor (if the flash is required to illuminate the scene). This process for determining the flash enable time may be expressed by the following equation:
t_flash_start_frame1 = t_VSYNC_fd0 + t_vert_blank_int - t_OffSet1 - t_OffSet2
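A direct translation of this relationship, with all times expressed in the same arbitrary units, might be sketched as:

```python
# Sketch: flash enable start time per the equation above. All quantities are
# expressed in the same (arbitrary) time units.

def flash_enable_start_time(t_vsync_fd0, t_vert_blank_int, offset1, offset2):
    # End of the previous (delayed) frame plus the blanking interval lands at
    # the delayed rising edge of the target frame; OffSet1 removes the
    # interface delay, and OffSet2 starts the flash early enough to warm up.
    return t_vsync_fd0 + t_vert_blank_int - offset1 - offset2
```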
In the illustrated embodiment, the flash may be deactivated at time t_VSYNC_fd1 of the flash controller signal 596, provided that time t_VSYNC_fd1 occurs before the start of the FRAME following frame 572 (e.g., FRAME N+2, not shown in fig. 48), as indicated at time 605 on sensor timing signal 556. In other embodiments, the flash may be deactivated at a time after time t_VSYNC_fd1 of signal 596 but before the next frame begins (e.g., before the subsequent VSYNC rising edge on sensor timing signal 556 that indicates the start of FRAME N+2), or the flash may be deactivated within an interval 608 immediately preceding time t_VSYNC_fd1, where interval 608 is less than the amount of OffSet1 (598). It will be appreciated that this ensures that the flash remains on for the entire duration of the target frame (e.g., frame 572).
FIG. 49 depicts a process 618 for determining a flash enable start time on electronic device 10 in accordance with the embodiment shown in FIG. 48. Beginning at block 620, a sensor timing signal (e.g., 556) from an image sensor is obtained and provided to flash control logic (e.g., flash controller 550), which may be part of an image signal processing subsystem (e.g., 32) of the electronic device 10. The sensor timing signal is provided to the flash control logic, but is delayed relative to the original timing signal (e.g., 556). At block 622, the delay (e.g., delay 584) between the sensor timing signal and the delayed sensor timing signal (e.g., 596) is determined. Subsequently, a target frame (e.g., frame 572) requesting flash illumination is identified at block 624. To determine the time at which the flash module (e.g., 552) should be enabled to ensure that the flash is activated before the start of the target frame, process 618 then proceeds to block 626, where a first time (e.g., time t_VSYNC_fd0) corresponding to the end of the frame preceding the target frame, as indicated by the delayed timing signal, is determined. Thereafter, at block 628, the length of the blanking interval between frames is determined and added to the first time determined at block 626 to determine a second time. Then, as indicated at block 630, the delay determined at block 622 is subtracted from the second time to determine a third time. As described above, this sets the flash enable time to coincide with the actual start of the target frame, in accordance with the non-delayed sensor timing signal.
To ensure that the flash is active before the start of the target frame, an offset (e.g., 602, OffSet2) is subtracted from the third time, as shown at block 632, to determine the desired flash enable time. It will be appreciated that in some embodiments, the offset of block 632 may not only ensure that the flash is on prior to the target frame, but may also compensate for any warm-up time required for the flash to reach full luminosity after being initially enabled. At block 634, the flash 552 is enabled at the flash start time determined at block 632. As described above and shown in block 636, the flash remains on for the entire duration of the target frame and may be disabled after the end of the target frame, so that all pixels in the target frame experience similar lighting conditions. While the embodiments illustrated in fig. 48 and 49 above discuss the application of flash synchronization techniques with a single flash, it should be further understood that these flash synchronization techniques may also be applied to embodiments of devices having two or more flash devices (e.g., two LED flashes). For example, if more than one flash module is utilized, the above techniques may be applied to both flash modules, such that each flash module is enabled by the flash controller before the start of a frame and remains on for the duration of the frame (e.g., the multiple flash modules need not be enabled for the same frames).
The flash timing techniques described herein may be applied when obtaining an image with device 10. For example, in one embodiment, a pre-flash technique may be used during image acquisition. For example, when a camera or image acquisition application is activated on device 10, the application may operate in a "preview" mode. In preview mode, the image sensor (e.g., 90) may be obtaining multiple frames of image data that may be processed by the ISP subsystem 32 of the device 10 for preview (e.g., displayed on the display 28), although the frames may not actually be captured or saved until the user issues a capture request to cause the device 10 to enter "capture" mode. This may be accomplished, for example, by a user actuating a physical capture button on device 10 or a soft capture button, which may be implemented via software as part of a graphical user interface, displayed on a display of device 10, and responsive to user interface inputs (e.g., touch screen inputs).
Since the flash is not typically active during preview mode, in some cases, sudden activation of the flash and illumination of the image scene with the flash may significantly change certain image statistics of a particular scene, such as image statistics related to auto white balance statistics and the like, relative to the same image scene not illuminated with the flash. Thus, to improve the statistics for processing a desired target frame, in one embodiment, the pre-flash operation technique may include receiving a user request to capture an image frame requesting flash illumination, illuminating a first frame with a flash at a first time while device 10 is still in preview mode, and updating the statistics (e.g., auto white balance statistics) before the next frame begins. The device 10 may enter a capture mode and capture the next frame with updated statistics with flash enabled, thereby providing improved image/color accuracy.
FIG. 50 depicts a flow chart illustrating this process 640 in more detail. The process 640 begins at block 642, where a request to capture an image with the flash is received. At block 644, while device 10 is still in preview mode, the flash is enabled (e.g., the flash may be timed using the techniques shown in fig. 48 and 49) to illuminate a first frame. Subsequently, at block 646, image statistics, such as automatic white balance statistics, are updated based on the statistics obtained from the illuminated first frame. Thereafter, at block 648, device 10 may enter capture mode and acquire the next frame using the image statistics updated at block 646. For example, the updated image statistics may be used to determine white balance gains and/or a Color Correction Matrix (CCM) that may be used by firmware (e.g., control logic 84) to program the ISP pipeline 82. Thus, the ISP pipeline 82 may process the frame acquired at block 648 (e.g., the next frame) using one or more parameters determined based on the image statistics updated at block 646.
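The sequence of process 640 might be sketched as follows, with every helper name an assumed placeholder for device or ISP functionality:

```python
# Sketch of process 640: pre-flash statistics update followed by the capture.
# Every helper here is an assumed placeholder for device/ISP functionality.

def capture_with_preflash(enable_flash, get_preview_frame, update_statistics,
                          program_isp_pipeline, capture_frame):
    enable_flash()                          # block 644: flash on, still in preview
    lit_frame = get_preview_frame()         # first frame illuminated by the flash
    stats = update_statistics(lit_frame)    # block 646: e.g., AWB statistics
    program_isp_pipeline(stats)             # e.g., WB gains / CCM via firmware
    return capture_frame()                  # block 648: frame captured in capture mode
```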
In another embodiment, when capturing an image frame with the flash, color properties from a non-flash image scene (e.g., an image scene obtained or previewed without the flash) may be applied. It will be appreciated that a non-flash image scene generally exhibits better color properties relative to an image scene illuminated with the flash. The use of the flash, however, may reduce noise and increase brightness (e.g., in low lighting conditions) relative to the non-flash image, but may also result in some colors in the flash image appearing slightly faded relative to a non-flash image of the same scene. Thus, in one embodiment, to retain the low-noise and brightness benefits of the flash image while also partially retaining some of the color properties of the non-flash image, device 10 may be configured to analyze a first frame captured without the flash to obtain its color properties. Device 10 may then capture a second frame with the flash and apply a palette transfer technique to the flash image using the color properties of the non-flash image.
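This disclosure does not specify a particular palette transfer algorithm; purely as an illustration, one common approach is to match per-channel statistics of the flash image to those of the non-flash image, as in the following assumed sketch (images are assumed to be floating-point RGB normalized to [0, 1]):

```python
# Sketch: one possible color/palette transfer, matching per-channel mean and
# standard deviation of the flash image to the non-flash image. This specific
# method is an illustrative assumption, not the technique mandated by the text.
import numpy as np

def transfer_color_statistics(flash_img, no_flash_img):
    # flash_img, no_flash_img: float arrays of shape (H, W, 3), values in [0, 1].
    out = flash_img.copy()
    for c in range(3):
        src_mean, src_std = flash_img[..., c].mean(), flash_img[..., c].std()
        ref_mean, ref_std = no_flash_img[..., c].mean(), no_flash_img[..., c].std()
        scale = ref_std / max(src_std, 1e-6)
        out[..., c] = (flash_img[..., c] - src_mean) * scale + ref_mean
    return np.clip(out, 0.0, 1.0)
```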
In some embodiments, the device 10 configured to implement any of the flash/strobe techniques described above may be a model of a computing device available from Apple Inc. having an integral imaging device or an external imaging device. Further, the imaging/camera application may be a version of a camera or imaging application also available from Apple Inc.
With continued reference to FIG. 51, a more detailed diagram of the ISP front-end pixel processing logic 150 (previously discussed in FIG. 10) is illustrated, in accordance with one embodiment of the present technique. As shown, the ISP front-end pixel processing logic 150 includes a temporal filter 650 and a binning compensation filter 652. The temporal filter 650 may receive one of the input image signals Sif0, Sif1, FEProcIn, or a pre-processed image signal (e.g., 180, 184), and may operate on the raw image data before any additional processing is performed. For example, the temporal filter 650 may initially process the image data to reduce noise by averaging image frames in the temporal direction. The binning compensation filter 652, discussed in more detail below, may apply scaling and resampling to binned raw image data from the image sensors (e.g., 90a, 90b) to maintain a uniform spatial distribution of the image pixels.
The temporal filter 650 may be pixel adaptive according to motion and luminance characteristics. For example, when the pixel motion is large, the filtering strength may be reduced to avoid "smearing" or "artifacts" in the final processed image, while when little motion is detected, the filtering strength may be increased. In addition, the filtering strength may also be adjusted based on the luminance data (e.g., "luminance"). For example, as image brightness increases, filtering artifacts become more noticeable to the human eye. Thus, when the pixel has a higher luminance, the filtering strength can be further reduced.
When applying temporal filtering, the temporal filter 650 may receive reference pixel data (Rin) and motion history input data (Hin), which may be from a previously filtered frame or an original frame. With these parameters, the temporal filter 650 may provide motion history output data (Hout) and a filtered pixel output (Yout). The filtered pixel output Yout is then passed to a binning compensation filter 652, which binning compensation filter 652 may be configured to perform one or more scaling operations on the filtered pixel output data Yout to generate an output signal FEProcOut. The processed pixel data FEProcOut may then be forwarded to the ISP pipeline processing logic 82 as described above.
Referring to fig. 52, a process diagram depicting a temporal filtering process 654 that may be performed by the temporal filter shown in fig. 51 is illustrated in accordance with a first embodiment. The temporal filter 650 may comprise a 2-tap filter in which the filter coefficients are adaptively adjusted on a per-pixel basis based at least in part on motion and luminance data. For example, an input pixel x (t) (where the variable "t" represents a time value) may be compared to a reference pixel r (t-1) in a previous filtered frame or a previous initial frame to generate a motion index lookup in a motion history table (M)655, which motion history table (M)655 may contain filter coefficients. In addition, from the motion history input data h (t-1), a motion history output h (t) corresponding to the current input pixel x (t) can be determined.
The motion history output h(t) and the filter coefficient K may be determined from a motion delta d(j, i, t), where (j, i) represents the spatial location coordinates of the current pixel x(j, i, t). The motion delta d(j, i, t) can be calculated by determining the maximum of the 3 absolute deltas between the initial pixels and the reference pixels for 3 horizontally collocated pixels of the same color. For example, referring briefly to fig. 53, the spatial positions of the 3 collocated reference pixels 657, 658, and 659 corresponding to the initial input pixels 660, 661, and 662 are illustrated. In one embodiment, the motion delta may be calculated from these initial and reference pixels using the following equation:
d(j,i,t) = max3[abs(x(j,i-2,t) - r(j,i-2,t-1)),
abs(x(j,i,t) - r(j,i,t-1)),
abs(x(j,i+2,t) - r(j,i+2,t-1))] (1a)
A flow chart describing such a technique for determining a motion delta value is further illustrated in fig. 55 below. Further, it should be appreciated that the technique of calculating a motion delta value as shown above in equation 1a (and below in FIG. 55) is merely used to provide one embodiment of determining a motion delta value.
In other embodiments, an array of same color pixels may be evaluated to determine a motion delta value. For example, in addition to the 3 pixels referred to in equation 1a, one embodiment of determining the motion delta value may include an absolute delta between pixels of the same color two rows above the reference pixels 660, 661, and 662 and their corresponding collocated pixels (e.g., j-2; assuming Bayer pattern) and two rows below the reference pixels 660, 661, and 662 and their corresponding collocated pixels (e.g., j + 2; assuming Bayer pattern). For example, in one embodiment, the motion delta value may be expressed as follows:
d(j,i,t) = max9[abs(x(j,i-2,t) - r(j,i-2,t-1)),
abs(x(j,i,t) - r(j,i,t-1)),
abs(x(j,i+2,t) - r(j,i+2,t-1)),
abs(x(j-2,i-2,t) - r(j-2,i-2,t-1)),
abs(x(j-2,i,t) - r(j-2,i,t-1)),
abs(x(j-2,i+2,t) - r(j-2,i+2,t-1)),
abs(x(j+2,i-2,t) - r(j+2,i-2,t-1)),
abs(x(j+2,i,t) - r(j+2,i,t-1)),
abs(x(j+2,i+2,t) - r(j+2,i+2,t-1))] (1b)
Thus, in the embodiment described by equation 1b, the motion delta value may be determined by comparing the absolute deltas between same-colored pixels of a 3 × 3 array, with the current pixel (661) located at the center of the 3 × 3 array (e.g., effectively a 5 × 5 array for a Bayer color pattern if pixels of different colors are counted). It should be understood that any suitable 2-dimensional array of same-colored pixels with the current pixel (e.g., 661) located at the center of the array may be analyzed to determine the motion delta value, including, for example, an array with all pixels in the same row (e.g., equation 1a) or an array with all pixels in the same column. Further, while the motion delta value may be determined as the maximum of the absolute deltas (e.g., as shown in equations 1a and 1b), in other embodiments the motion delta value may instead be selected as the mean or median of the absolute deltas. Additionally, the foregoing techniques may also be applied to other types of color filter arrays (e.g., RGBW, CYGM, etc.) and are not intended to be exclusive to Bayer patterns.
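As an illustrative sketch (the array indexing, the same-color Bayer spacing of 2, and the omitted boundary handling are assumptions), the motion delta of equations 1a and 1b could be computed as:

```python
# Sketch: motion delta per equations 1a/1b, the maximum absolute difference
# between same-color pixels of the current frame x and the collocated
# reference frame r. Same-color Bayer neighbors are two rows/columns apart,
# hence the step of 2. Boundary handling is omitted.

def motion_delta(x, r, j, i, use_3x3=True):
    row_offsets = (-2, 0, 2) if use_3x3 else (0,)   # eq. 1b vs. eq. 1a
    deltas = []
    for dj in row_offsets:
        for di in (-2, 0, 2):
            deltas.append(abs(int(x[j + dj][i + di]) - int(r[j + dj][i + di])))
    return max(deltas)  # other embodiments may use the mean or median instead
```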
Referring back to fig. 52, once the motion delta value is determined, a motion index lookup may be calculated that may be used to select a filter coefficient K from motion table (M)655 by calculating the sum of the motion delta d (t) for the current pixel (e.g., at spatial location (j, i)) and the motion history input h (t-1). For example, the filter coefficient K may be determined as follows:
K=M[d(j,i,t)+h(j,i,t-1)] (2a)
additionally, the motion history output h (t) may be determined using the following equation:
h(j,i,t)=d(j,i,t)+(1-K)×h(j,i,t-1) (3a)
The luminance of the current input pixel x(t) may then be used to generate a luminance index lookup into the luminance table (L) 656. In one embodiment, the luminance table may contain attenuation factors between 0 and 1, and an attenuation factor may be selected according to the luminance index. The second filter coefficient K' can be calculated by multiplying the first filter coefficient K by the luminance attenuation factor, as shown in the following equation:
K′=K×L[x(j,i,t)] (4a)
the determined value of K' may then be used as a filter coefficient for the time domain filter 650. As described above, the time domain filter 650 may be a 2-tap filter. Additionally, time domain filter 650 may be configured as an Infinite Impulse Response (IIR) filter utilizing a previous filtered frame or as a Finite Impulse Response (FIR) filter utilizing a previous initial frame. Temporal filter 650 may calculate a filtered output pixel y (t) (yout) using the following equation, using current input pixel x (t), reference pixel r (t-1), and filter coefficient K':
y(j,i,t)=r(j,i,t-1)+K′(x(j,i,t)-r(j,i,t-1)) (5a)
As described above, the temporal filtering process 654 shown in fig. 52 may be performed on a pixel-by-pixel basis. In one embodiment, the same motion table M and luminance table L may be used for all color components (e.g., R, G and B). Additionally, some embodiments may provide a bypass mechanism in which temporal filtering may be bypassed, such as in response to a control signal from control logic 84. In addition, as described below with reference to fig. 57 and 58, an embodiment of temporal filter 650 may use a separate motion table and luma table for each color component of the image data.
Embodiments of the temporal filtering technique described with reference to fig. 52 and 53 may be better understood with reference to fig. 54, which is a flow chart illustrating a method 664 according to the above embodiments. The method 664 begins at step 665, where at step 665, the temporal filtering system 654 receives a current pixel x (t) located at a spatial location (j, i) of a current frame of image data. At step 666, a motion delta value d (t) for the current pixel x (t) is determined based at least in part on one or more collocated reference pixels (e.g., r (t-1)) of a previous frame of image data (e.g., an image frame immediately preceding the current frame). The technique of determining the motion delta value d (t) at step 666, described further below with reference to FIG. 55, may be performed in accordance with equation 1a, as described above.
Once the motion delta value d (t) for step 666 is obtained, a motion table look-up index may be determined, as shown in step 667, using the motion delta value d (t) and the motion history input value h (t-1) corresponding to the spatial location (j, i) of the previous frame. Additionally, although not shown, once the motion delta value d (t) is known, the motion history value h (t) corresponding to the current pixel x (t) may also be determined in step 667, for example, using equation 3a shown above. Thereafter, at step 668, a first filter coefficient K may be selected from the motion table 655 using the motion table look-up index of step 667. As described above, according to equation 2a, a motion table lookup index may be determined and the first filter coefficient K is selected from the motion table.
Subsequently, at step 669, an attenuation factor may be selected from brightness table 656. For example, the brightness table 656 may include attenuation factors between approximately 0 and 1, and an attenuation factor may be selected from the brightness table 656 by using the value of the current pixel x(t) as a lookup index. Once the attenuation factor is selected, a second filter coefficient K' may be determined at step 670 using the selected attenuation factor and the first filter coefficient K (from step 668), as shown in equation 4a above. Subsequently, at step 671, a temporal filter output value y(t) corresponding to the current input pixel x(t) is determined based on the second filter coefficient K' (from step 670), the value of the collocated reference pixel r(t-1), and the value of the input pixel x(t). For example, in one embodiment, the output value y(t) may be determined according to equation 5a, as described above.
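Putting equations 2a through 5a together, a single per-pixel iteration of this method might be sketched as follows (the table depths, index scaling, and clamping are assumptions, since the disclosure does not fix them):

```python
# Sketch: one per-pixel step of the temporal filter (equations 2a-5a).
# motion_table and luma_table stand in for tables 655 and 656; the index
# scaling and clamping are assumptions.

def temporal_filter_pixel(x, r, h_prev, d, motion_table, luma_table):
    # x: current pixel x(t); r: collocated reference pixel r(t-1);
    # h_prev: motion history input h(t-1); d: motion delta d(j,i,t).
    m_index = min(int(d + h_prev), len(motion_table) - 1)   # eq. 2a lookup
    K = motion_table[m_index]                                # first coefficient
    h_out = d + (1 - K) * h_prev                             # eq. 3a history out
    l_index = min(int(x), len(luma_table) - 1)               # luminance lookup
    K_prime = K * luma_table[l_index]                        # eq. 4a
    y = r + K_prime * (x - r)                                # eq. 5a output
    return y, h_out
```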
Referring to FIG. 55, step 666 of determining the motion delta value d (t) of method 664 is illustrated in greater detail, according to one embodiment. In particular, the determination of the motion increment value d (t) may generally correspond to the operation according to equation 1a described above. As shown, step 666 may include sub-steps 672-675. Beginning with sub-step 672, a set of 3 horizontally adjacent pixels is identified whose color value is the same as the current input pixel x (t). For example, according to the embodiment shown in fig. 53, the image data may include Bayer image data, and the 3 horizontally adjacent pixels may include a current input pixel x (t) (661), a second pixel 660 of the same color on the left side of the current input pixel 661, and a third pixel of the same color on the right side of the current input pixel 661.
Subsequently, at sub-step 673, the 3 collocated reference pixels 657, 658, and 659 in the previous frame corresponding to the selected set of 3 horizontally adjacent pixels 660, 661, and 662 are identified. Using the selected pixels 660, 661, and 662 and the 3 collocated reference pixels 657, 658, and 659, the absolute values of the differences between each of the three selected pixels 660, 661, and 662 and its corresponding collocated reference pixel 657, 658, and 659 are determined at sub-step 674. Subsequently, at sub-step 675, the maximum of the 3 difference values of sub-step 674 is selected as the motion delta value d(t) for the current input pixel x(t). As noted above, FIG. 55, which illustrates the motion delta value calculation technique shown in equation 1a, is intended to provide only one embodiment. In fact, as described above, any suitable 2-dimensional array of same-color pixels with the current pixel located at the center may be used to determine the motion delta value (e.g., equation 1b).
Another embodiment of a technique for applying temporal filtering to image data is further described below with reference to FIG. 56. For example, since the signal-to-noise ratios of the different color components of the image data may differ, a gain may be applied to the current pixel, such that the gain is applied before the motion value and luminance value are selected from motion table 655 and luminance table 656. By applying a respective gain that depends on the color, the signal-to-noise ratio may be more consistent between the different color components. For example, in an implementation using raw Bayer image data, the red and blue channels are typically more sensitive than the green (Gr and Gb) channels. Thus, by applying an appropriate color-dependent gain to each processed pixel, the signal-to-noise variation between the color components can generally be reduced, thereby reducing artifacts and the like, as well as increasing consistency between the different colors after automatic white balancing.
With this in mind, FIG. 56 is a flow chart describing a method 676 of applying temporal filtering to image data received by the front-end processing unit 150 in accordance with such an embodiment. Beginning at step 677, the temporal filtering system 654 receives a current pixel x(t) located at spatial position (j, i) of the current frame of image data. At step 678, a motion delta value d(t) for the current pixel x(t) is determined based at least in part on one or more collocated reference pixels (e.g., r(t-1)) of a previous frame of the image data (e.g., the image frame immediately preceding the current frame). Step 678 may be similar to step 666 of FIG. 54 and may utilize the operation represented in equation 1a above.
Subsequently, at step 679, a motion table look-up index may be determined using the motion delta value d (t), the motion history input value h (t-1) corresponding to the spatial position (j, i) of the previous frame (e.g., corresponding to the collocated reference pixel r (t-1)), and the gain associated with the color of the current pixel. Thereafter, at step 680, a first filter coefficient K may be selected from the motion table 655 by looking up the index using the motion table determined at step 679. For example, in one embodiment, the filter coefficient K and the motion table look-up index may be determined as follows:
K=M[gain[c]×(d(j,i,t)+h(j,i,t-1))], (2b)
where M represents a motion table and the gain [ c ] corresponds to the gain associated with the color of the current pixel. Additionally, although not shown in FIG. 56, it should be appreciated that the motion history output value h (t) for the current pixel may also be determined and used to apply temporal filtering to collocated pixels of a subsequent image frame (e.g., the next frame). In this embodiment, the motion history output h (t) of the current pixel x (t) can be determined by using the following formula:
h(j,i,t)=d(j,i,t)+K[h(j,i,t-1)-d(j,i,t)] (3b)
Subsequently, at step 681, an attenuation factor may be selected from the brightness table 656 using a brightness table lookup index determined based on the gain (gain[c]) associated with the color of the current pixel x(t). As described above, the attenuation factors stored in the brightness table may have a range of approximately 0 to 1. Thereafter, at step 682, a second filter coefficient K' may be calculated based on the attenuation factor (from step 681) and the first filter coefficient K (from step 680). For example, in one embodiment, the second filter coefficient K' and the brightness table lookup index may be determined as follows:
K′=K×L[gain[c]×x(j,i,t)] (4b)
Subsequently, in step 683, a temporal filter output value y (t) corresponding to the current input pixel x (t) is determined based on the second filter coefficient K' (from step 682), the value of the collocated reference pixel r (t-1), and the value of the input pixel x (t). For example, in one embodiment, the output value y (t) may be determined as follows:
y(j,i,t)=x(j,i,t)+K′(r(j,i,t-1)-x(j,i,t)) (5b)
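For comparison, the color-dependent-gain variant of equations 2b through 5b differs only in where the gain enters the lookups and in the forms of the history and output equations; a compact sketch, under the same assumptions as the previous sketch, is:

```python
# Sketch: temporal filter step with a color-dependent gain (equations 2b-5b).
# gain_c is the gain associated with the color of the current pixel.

def temporal_filter_pixel_gain(x, r, h_prev, d, gain_c,
                               motion_table, luma_table):
    m_index = min(int(gain_c * (d + h_prev)), len(motion_table) - 1)  # eq. 2b
    K = motion_table[m_index]
    h_out = d + K * (h_prev - d)                                      # eq. 3b
    l_index = min(int(gain_c * x), len(luma_table) - 1)               # eq. 4b lookup
    K_prime = K * luma_table[l_index]
    y = x + K_prime * (r - x)                                         # eq. 5b
    return y, h_out
```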
continuing to FIG. 57, another embodiment of the temporal filtering process 384 is depicted. Here, the temporal filtering process 384 may be implemented in a similar manner to the embodiment discussed in fig. 56, except that separate motion and luma tables are provided for the respective color components, instead of applying a color-dependent gain (e.g., gain c) to each input pixel and using a common motion and luma table. For example, as shown in fig. 57, the motion table 655 may include a motion table 655a corresponding to a first color, a motion table 655b corresponding to a second color, and a motion table 655c corresponding to an nth color, where n depends on the number of colors present in the original image data. Similarly, the brightness table 656 may include a brightness table 656a corresponding to a first color, a brightness table 656b corresponding to a second color, and a brightness table 656c corresponding to an nth color. Thus, in an embodiment where the raw image data is Bayer image data, 3 motion tables and luminance tables are provided for the red, blue and green components, respectively. As described below, the selection of the filter coefficient K and the attenuation factor may depend on the motion table and the brightness table selected for the current color (e.g., the color of the current input pixel).
Another embodiment illustrating temporal filtering with color-dependent motion tables and luminance tables is shown in fig. 58. It will be appreciated that the method 685 may employ various calculations and formulas similar to the embodiment shown in FIG. 54, but with specific motion tables and brightness tables selected for each color, or similar to the embodiment shown in FIG. 56, but with the selection of color-dependent motion tables and brightness tables instead of the use of color-dependent gain [ c ].
Beginning at step 686, the temporal filtering system 684 (FIG. 57) receives a current pixel x(t) located at spatial position (j, i) of the current frame of the image data. At step 687, a motion delta value d(t) for the current pixel x(t) is determined based at least in part on one or more collocated reference pixels (e.g., r(t-1)) of a previous frame of the image data (e.g., the image frame immediately preceding the current frame). Step 687 may be similar to step 666 of FIG. 54 and may utilize the operation represented in equation 1a above.
Then, at step 688, a motion table lookup index may be determined using the motion delta value d (t) and a motion history input value h (t-1) corresponding to a spatial position (j, i) of the previous frame (e.g., corresponding to the collocated reference pixel r (t-1)). Thereafter, at step 689, a first filter coefficient K may be selected from one of the available motion tables (e.g., 655a, 655b, 655c) based on the color of the current input pixel. For example, once the appropriate motion table is identified, the first filter coefficient K may be selected using the motion table look-up index determined in step 688.
After the first filter coefficient K is selected, the brightness table corresponding to the current color is selected, and an attenuation factor is selected from the selected brightness table according to the value of the current pixel x (t), as shown in step 690. Thereafter, at step 691, a second filter coefficient K' is determined based on the attenuation factor (from step 690) and the first filter coefficient K (step 689). Thereafter, in step 692, a temporal filter output value y (t) corresponding to the current input pixel x (t) is determined based on the second filter coefficient K' (from step 691), the value of the collocated reference pixel r (t-1), and the value of the input pixel x (t). Although the technique shown in fig. 58 may be more costly to implement (e.g., due to the memory required to save additional motion and brightness tables), in some cases it may lead to further improvement in artifacts and increased consistency between different colors after automatic white balancing.
According to other embodiments, the temporal filtering process provided by the temporal filter 650 may use a combination of color-dependent gains and color-specific motion and/or luma tables to apply temporal filtering to the input pixels. For example, in one such embodiment, a single motion table may be provided for all color components, and the motion table lookup index used to select the first filter coefficient (K) from the motion table may be determined based on the color-dependent gain (e.g., as shown in FIG. 56, steps 679 and 680), while the luma table lookup index may not have the color-dependent gain applied to it, but may be used to select the luma attenuation factor from one of multiple luma tables depending on the color of the current pixel (e.g., as shown in FIG. 58, step 690). Alternatively, in another embodiment, multiple motion tables may be provided, and the motion table lookup index (without a color-dependent gain applied) may be used to select the first filter coefficient (K) from the motion table corresponding to the color of the current input pixel (e.g., as shown in FIG. 58, step 689), while a single luma table may be provided for all color components, and the luma table lookup index used to select the luma attenuation factor may be determined based on the color-dependent gain (e.g., as shown in FIG. 56, steps 681 and 682). Further, in one embodiment in which a Bayer color filter array is utilized, one motion table and/or luma table may be provided for each of the red (R) and blue (B) color components, while a common motion table and/or luma table may be provided for both green color components (Gr and Gb).
The output of temporal filter 650 may then be sent to a Binning Compensation Filter (BCF)652, which BCF 652 may be configured to process the image pixels to compensate for the non-linear layout of the color samples (e.g., uneven spatial distribution) caused by binning of image sensors 90a or 90b, so that subsequent image processing operations (e.g., demosaicing, etc.) in ISP pipeline logic 82 that depend on the linear layout of the color samples can work correctly. For example, referring now to fig. 59, a full resolution sample 693 of Bayer image data is illustrated. This may represent full resolution samples of the raw image data captured by the image sensor 90a (or 90b) coupled to the ISP front-end processing logic 80.
It will be appreciated that, under certain image capture conditions, it may not be feasible to send the full resolution image data captured by the image sensor 90a to the ISP circuit 32 for processing. For example, when capturing image data so as to preserve a moving image that appears fluid to the human eye, a frame rate of at least approximately 30 frames per second may be desired. However, if the amount of pixel data contained in each frame of the full resolution samples, when sampled at 30 frames per second, exceeds the processing capability of the ISP circuit 32, binning may be applied by the image sensor 90a, in conjunction with binning compensation filtering, to reduce the resolution of the image signal while also improving the signal-to-noise ratio. For example, as described above, various binning techniques, such as 2 × 2 binning, may be applied to produce "binned" raw image pixels by averaging the values of surrounding pixels in the active area 312 of the original frame 310.
Referring to FIG. 60, an embodiment of the image sensor 90a that may be configured to bin the full resolution image data 693 of FIG. 59 to produce the corresponding binned raw image data 700 shown in FIG. 61 is illustrated. As shown, the image sensor 90a may capture the full resolution raw image data 693. The binning logic 699 may be configured to apply binning to the full resolution raw image data 693 to produce the binned raw image data 700, which may be provided to the ISP front-end processing logic 80 via the sensor interface 94a, which, as described above, may be an SMIA interface or any other suitable parallel or serial camera interface.
As diagrammatically illustrated in fig. 61, the binning logic 699 may apply 2 × 2 binning to the full resolution raw image data 693. For example, with respect to the binned image data 700, pixels 695, 696, 697, and 698 may form a Bayer pattern and may be determined by averaging the values of corresponding pixels of the full resolution raw image data 693. For example, referring to FIGS. 59 and 61, the binned Gr pixel 695 may be determined as the average or mean of the full-resolution Gr pixels 695a-695d. Similarly, the binned R pixel 696 may be determined as the average of the full resolution R pixels 696a-696d, the binned B pixel 697 may be determined as the average of the full resolution B pixels 697a-697d, and the binned Gb pixel 698 may be determined as the average of the full resolution Gb pixels 698a-698d. Thus, in this embodiment, 2 × 2 binning may provide a set of 4 full resolution pixels, including an upper left pixel (e.g., 695a), an upper right pixel (e.g., 695b), a lower left pixel (e.g., 695c), and a lower right pixel (e.g., 695d), which are averaged to derive a binned pixel located at the center of the square formed by the set of 4 full resolution pixels. Thus, the binned Bayer block 694 shown in FIG. 61 contains 4 "superpixels" that represent the 16 pixels contained in the Bayer blocks 694a-694d of FIG. 59.
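For illustration, a software sketch of such 2 × 2 Bayer binning might look as follows (the NumPy layout and floating-point averaging are assumptions; in practice this operation is performed by the binning logic 699 on the image sensor):

```python
# Sketch: 2 x 2 binning of Bayer raw data, averaging each group of 4 nearest
# same-color pixels into one binned pixel.
import numpy as np

def bin_bayer_2x2(raw):
    # raw: 2-D Bayer mosaic whose dimensions are multiples of 4.
    h, w = raw.shape
    out = np.empty((h // 2, w // 2), dtype=raw.dtype)
    for oy in (0, 1):        # row offset within the Bayer quad
        for ox in (0, 1):    # column offset within the Bayer quad
            plane = raw[oy::2, ox::2].astype(np.float32)   # one color component
            quads = (plane[0::2, 0::2] + plane[0::2, 1::2]
                     + plane[1::2, 0::2] + plane[1::2, 1::2]) / 4.0
            out[oy::2, ox::2] = quads.astype(raw.dtype)    # half-resolution mosaic
    return out
```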
In addition to reducing spatial resolution, binning brings the additional advantage of reducing noise in the image signal. For example, as long as the image sensor (e.g., 90a) is exposed to the optical signal, there may be some amount of noise associated with the image, such as photon noise. This noise, which may be random or systematic, may also come from multiple sources. Thus, the amount of information contained in an image captured by an image sensor may be represented by a signal-to-noise ratio. For example, each time an image is captured with the image sensor 90a and passed to a processing circuit, such as the ISP circuit 32, there may be a degree of noise in the pixel values, since the process of reading and transferring image data inherently introduces "readout noise" in the image signal. This "readout noise" can be random and is generally unavoidable. By using an average of 4 pixels, noise (e.g., photon noise) can be generally reduced, regardless of the source of the noise.
Thus, when the full resolution image data 693 of FIG. 59 is considered, each Bayer pattern (2 × 2 block) 694a-694d contains 4 pixels, each of which contains a signal component and a noise component. If each pixel in, for example, the Bayer block 694a is read out separately, then 4 signal components and 4 noise components are present. However, by applying binning, as shown in FIGS. 59 and 61, 4 pixels (e.g., 695a, 695b, 695c, 695d) may be represented by a single pixel (e.g., 695) in the binned image data, and the same area occupied by the 4 pixels in the full resolution image data 693 may be read out as a single pixel having only one instance of a noise component, thereby improving the signal-to-noise ratio.
Further, while the present embodiment describes the binning logic 699 of FIG. 60 as being configured to apply a 2 × 2 binning process, it should be understood that the binning logic 699 may be configured to apply any suitable type of binning process, such as 3 × 3 binning, vertical binning, horizontal binning, and the like. In some embodiments, the image sensor 90a may be configured to select between different binning modes during the image capture process. Additionally, in other embodiments, the image sensor 90a may also be configured to apply a technique that may be referred to as "skipping," in which, instead of averaging a sample of pixels, the logic 699 selects only certain pixels (e.g., every other pixel, every third pixel, etc.) from the full resolution data 693 to output to the ISP front end 80 for processing. Further, while only image sensor 90a is shown in FIG. 60, it is understood that image sensor 90b may be implemented in a similar manner.
In addition, one effect of the binning process is also depicted in FIG. 61, namely that the spatial samples of the binned pixels are not evenly spaced. In some systems, this spatial distortion can result in aliasing (e.g., jagged edges), which is generally undesirable. Furthermore, since certain image processing steps in the ISP pipeline logic 82 may depend on a linear layout of the color samples for proper operation, a binning compensation filter (BCF) 652 may be applied to resample and rearrange the binned pixels so that the binned pixels are spatially evenly distributed. That is, the BCF 652 essentially compensates for the non-uniform spatial distribution (e.g., as shown in fig. 61) by resampling the positions of the samples (e.g., pixels). For example, FIG. 62 illustrates a resampled portion of the binned image data 702 after processing by the BCF 652, where the Bayer block 703 contains the evenly distributed resampled pixels 704, 705, 706, and 707 corresponding, respectively, to the binned pixels 695, 696, 697, and 698 of the binned image data 700 of FIG. 61. Additionally, in an embodiment that uses skipping as described above (e.g., instead of binning), the spatial distortion shown in fig. 61 may not be present. In this case, the BCF 652 may function as a low pass filter to reduce artifacts (e.g., aliasing) that may result when skipping is employed by the image sensor 90a.
Fig. 63 shows a block diagram of the binning compensation filter 652 according to one embodiment. The BCF 652 may include binning compensation logic 708, which may process the binned pixels 700 to apply horizontal and vertical scaling using horizontal scaling logic 709 and vertical scaling logic 710, respectively, to resample and rearrange the binned pixels 700 so that they are arranged in a spatially even distribution, as shown in fig. 62. In one embodiment, the scaling operations performed by the BCF 652 may be performed using horizontal and vertical multi-tap polyphase filtering. For example, the filtering process may include selecting the appropriate pixels from the input source image data (e.g., the binned image data 700 provided by image sensor 90a), multiplying each of the selected pixels by a filter coefficient, and summing the resulting values to form an output pixel at the desired destination.
The selection of the pixels used in the scaling operations (which may include the center pixel and the surrounding neighboring pixels of the same color) may be determined using separate differential analyzers 711, one for vertical scaling and one for horizontal scaling. In the depicted embodiment, the differential analyzers 711 may be digital differential analyzers (DDAs) and may be configured to control the current output pixel position during the scaling operations in the vertical and horizontal directions. In the present embodiment, a first DDA (referred to as 711a) is used for all color components during horizontal scaling, and a second DDA (referred to as 711b) is used for all color components during vertical scaling. For example, the DDA 711 may be provided as a 32-bit data register containing a 2's-complement fixed-point number having 16 bits in the integer portion and 16 bits in the fractional portion. The 16-bit integer portion may be used to determine the current position of the output pixel. The fractional portion of the DDA 711 may be used to determine a current index or phase, which may be based on the between-pixel fractional position of the current DDA position (e.g., corresponding to the spatial position of the output pixel). The index or phase may be used to select an appropriate set of coefficients from a set of filter coefficient tables 712. Additionally, the filtering may be performed per color component using pixels of the same color. Thus, the filter coefficients may be selected based not only on the phase of the current DDA position, but also on the color of the current pixel. In one embodiment, 8 phases may be present between each input pixel and, thus, the vertical and horizontal scaling components may utilize coefficient tables of depth 8, such that the high-order 3 bits of the 16-bit fractional portion are used to express the current phase or index. Thus, it should be understood that the term "raw image" data or the like as used herein refers to multi-color image data that is acquired by a single sensor covered with a color filter array pattern (e.g., Bayer), which provides multiple color components in one plane. In another embodiment, a separate DDA may be used for each color component. For example, in such embodiments, the BCF 652 may extract the R, B, Gr, and Gb components from the raw image data and process each component as a separate plane.
In operation, horizontal and vertical scaling may include initializing the DDA 711 and performing the multi-tap polyphase filtering using the integer and fractional portions of the DDA 711. Although performed separately and with separate DDAs, the horizontal and vertical scaling operations are carried out in a similar manner. A step value or step size (DDAStepX for horizontal scaling and DDAStepY for vertical scaling) determines how much the DDA value (currDDA) is incremented after each output pixel is determined, and the multi-tap polyphase filtering is repeated using the next currDDA value. For example, if the step value is less than 1, the image is scaled up, and if the step value is greater than 1, the image is scaled down. If the step value is equal to 1, no scaling occurs. Further, it should be noted that the same or different step values may be used for horizontal and vertical scaling.
The BCF 652 generates output pixels in the same order in which the input pixels are received (e.g., using the Bayer pattern). In the present embodiment, the input pixels may be classified as even pixels or odd pixels based on their ordering. For example, referring to FIG. 64, a graphical depiction of input pixel positions (row 713) and corresponding output pixel positions based on various DDAStep values (rows 714-718) is illustrated. In this example, the depicted row represents a row of red (R) and green (Gr) pixels in the raw Bayer image data. For horizontal filtering purposes, the red pixel at position 0.0 in row 713 may be considered an even pixel, the green pixel at position 1.0 in row 713 may be considered an odd pixel, and so forth. For the output pixel positions, even and odd pixels may be determined based on the least significant bit in the fractional portion (lower 16 bits) of the DDA 711. For instance, assuming a DDAStep of 1.25, as shown in row 715, the least significant bit corresponds to bit 14 of the DDA, as this bit gives a resolution of 0.25. Thus, a red output pixel at the DDA position (currDDA) 0.0 may be considered an even pixel (the least significant bit, bit 14, is 0), a green output pixel at currDDA 1.0 may be considered an odd pixel (bit 14 is 1), and so forth. Further, although FIG. 64 is discussed with respect to filtering in the horizontal direction (using DDAStepX), it should be understood that the determination of even and odd input and output pixels may be applied in the same manner with respect to vertical filtering (using DDAStepY). In other embodiments, the DDA 711 may also be used to track positions of the input pixels (e.g., rather than tracking the desired output pixel positions). Further, it should be appreciated that DDAStepX and DDAStepY may be set to the same or different values. Further, assuming a Bayer pattern is used, it should be noted that the starting pixel used by the BCF 652 could be any one of a Gr, Gb, R, or B pixel depending, for instance, upon which pixel is located at a corner of the active area 312.
In view of the above, even/odd input pixels are used to generate even/odd output pixels, respectively. Given an output pixel position that alternates between even and odd positions, the center source input pixel position for filtering (referred to herein as "currPixel") is determined by rounding the DDA to the nearest even or odd input pixel position, for even or odd output pixel positions, respectively (according to DDAStepX). In an embodiment in which the DDA 711a is configured to use 16 bits to represent the integer and 16 bits to represent the fraction, currPixel may be determined for even and odd currDDA positions using equations 6a and 6b below.
For even output pixel positions, currPixel may be determined from bits [31:16] of:
(currDDA+1.0)&0xFFFE.0000 (6a)
For odd output pixel positions, currPixel may be determined from bits [31:16] of:
(currDDA)|0x0001.0000 (6b)
In essence, the above equations represent a rounding operation whereby even and odd output pixel positions, as determined by currDDA, are rounded to the nearest even and odd input pixel positions, respectively, to select currPixel.
In addition, a current index or phase (currIndex) may also be determined at each currDDA position. As described above, the index or phase value represents the inter-pixel fractional position of the output pixel position relative to the input pixel positions. For example, in one embodiment, 8 phases may be defined between each input pixel position. For instance, referring back to FIG. 64, between the first red input pixel at position 0.0 and the next red input pixel at position 2.0, 8 index values 0-7 are provided. Similarly, between the first green input pixel at position 1.0 and the next green input pixel at position 3.0, 8 index values 0-7 are provided. In one embodiment, currIndex may be determined for even and odd output pixel positions in accordance with equations 7a and 7b below, respectively:
For even output pixel positions, currIndex may be determined from bits [16:14] of:
(currDDA+0.125) (7a)
For odd output pixel positions, currIndex may be determined from bits [16:14] of:
(currDDA+1.125) (7b)
For odd positions, the additional 1-pixel displacement is equivalent to adding an offset of 4 to the coefficient index for odd output pixel positions, to account for the index offset between different color components with respect to the DDA 711.
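The following C sketch models equations 6a, 6b, 7a, and 7b directly on the 16.16 fixed-point currDDA value; the function names and the output_is_even flag are illustrative assumptions (the hardware derives parity from the low fractional bit of the DDA, as discussed above):

```c
#include <stdint.h>

/* currDDA values are 16.16 fixed point, as in the sketch above. */

/* Center source pixel (currPixel): round currDDA to the nearest even or
 * odd input pixel position and take bits [31:16] of the result. */
static inline int curr_pixel(int32_t curr_dda, int output_is_even)
{
    if (output_is_even)
        return (int)(((uint32_t)(curr_dda + 0x00010000) & 0xFFFE0000u) >> 16); /* (6a) */
    else
        return (int)((curr_dda | 0x00010000) >> 16);                           /* (6b) */
}

/* Phase index (currIndex): bits [16:14], i.e. the lowest integer bit and
 * the two highest fractional bits, of currDDA advanced by one index step
 * (0.125) for even positions, or by 1.125 for odd positions. */
static inline int curr_index(int32_t curr_dda, int output_is_even)
{
    int32_t shifted = curr_dda + (output_is_even ? 0x00002000 : 0x00012000);
    return (shifted >> 14) & 0x7;                                  /* (7a)/(7b) */
}
```

For example, at currDDA 1.5 treated as an odd output position, these expressions yield currPixel = 1 and currIndex = 2, matching the worked example below.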
Once currPixel and currIndex have been determined at a particular currDDA position, the filtering process may select one or more neighboring pixels of the same color based on currPixel (the selected center input pixel). By way of example, in an embodiment in which the horizontal scaling logic 709 includes a 5-tap polyphase filter and the vertical scaling logic 710 includes a 3-tap polyphase filter, two same-color pixels on each side of currPixel in the horizontal direction (e.g., -2, -1, 0, +1, +2) may be selected for horizontal filtering, and one same-color pixel on each side of currPixel in the vertical direction (e.g., -1, 0, +1) may be selected for vertical filtering. In addition, currIndex may be used as a selection index to select the appropriate filter coefficients from the filter coefficient tables 712 to apply to the selected pixels. For example, with the 5-tap horizontal/3-tap vertical filtering embodiment, 5 tables of depth 8 may be provided for horizontal filtering and 3 tables of depth 8 may be provided for vertical filtering. Although illustrated as part of the BCF 652, it should be understood that in some embodiments the filter coefficient tables 712 may be stored in a memory physically separate from the BCF 652, such as in the memory 108.
Before discussing the horizontal and vertical scaling operations in more detail, Table 5 shows examples of currPixel and currIndex values determined from various DDA positions using different DDAStep values (e.g., applicable to DDAStepX or DDAStepY).
Table 5: Binning Compensation Filter DDA examples of currPixel and currIndex calculations
For example, assume a DDA step size (DDAStep) of 1.5 is selected (row 716 of fig. 64), while the current DDA position (currDDA) begins at 0, indicating an even output pixel position. To determine currPixel, equation 6a can be applied as follows:
currDDA = 0.0 (even)
0000 0000 0000 0001.0000 0000 0000 0000(currDDA+1.0)
(AND)1111 1111 1111 1110.0000 0000 0000 0000(0xFFFE.0000)
=0000 0000 0000 0000.0000 0000 0000 0000
currPixel (determined as bits [31:16] of the result) = 0;
thus, at currDDA position 0.0 (row 716), the source input center pixel for filtering corresponds to the red input pixel at position 0.0 of row 713.
To determine currIndex at even currDDA 0.0, equation 7a can be applied as follows:
currDDA = 0.0 (even)
0000 0000 0000 0000.0000 0000 0000 0000(currDDA)
+0000 0000 0000 0000.0010 0000 0000 0000(0.125)
=0000 0000 0000 0000.0010 0000 0000 0000
currIndex (determined as bits [16:14] of the result = [000]) = 0;
Thus, at currDDA position 0.0 (row 716), a currIndex value of 0 may be used to select filter coefficients from the filter coefficient table 712.
Thus, filtering (which may be vertical or horizontal depending on whether DDAStep is in the X (horizontal) or Y (vertical) direction) may be applied based on the currPixel and currIndex values determined at currDDA 0.0; the DDA 711 is then incremented by DDAStep (1.5), and the next currPixel and currIndex values are determined. For example, at the next currDDA position 1.5 (an odd position), currPixel can be determined using equation 6b as follows:
currDDA = 1.5 (odd)
0000 0000 0000 0001.1000 0000 0000 0000(currDDA)
(OR)0000 0000 0000 0001.0000 0000 0000 0000(0x0001.0000)
=0000 0000 0000 0001.1000 0000 0000 0000
currPixel (determined as bits [31:16] of the result) = 1;
thus, at currDDA position 1.5 (row 716), the source input center pixel for filtering corresponds to the green input pixel at position 1.0 of row 713.
In addition, currIndex at odd currDDA 1.5 can be determined using equation 7b, as follows:
currDDA = 1.5 (odd)
0000 0000 0000 0001.1000 0000 0000 0000(currDDA)
+0000 0000 0000 0001.0010 0000 0000 0000(1.125)
=0000 0000 0000 0010.1010 0000 0000 0000
currIndex (determined as bits [16:14] of the result = [010]) = 2;
Thus, at currDDA position 1.5 (row 716), an appropriate filter coefficient may be selected from the filter coefficient table 712 using a currIndex value of 2. Filtering (which may be vertical or horizontal depending on whether DDAStep is in the X (horizontal) or Y (vertical) direction) may then be applied using these currPixel and currIndex values.
Thereafter, the DDA 711 is again incremented by DDAStep (1.5), resulting in a currDDA value of 3.0. Using equation 6a, the currPixel corresponding to currDDA 3.0 can be determined as follows:
currDDA = 3.0 (even)
0000 0000 0000 0100.0000 0000 0000 0000(currDDA+1.0)
(AND)1111 1111 1111 1110.0000 0000 0000 0000(0xFFFE.0000)
=0000 0000 0000 0100.0000 0000 0000 0000
currPixel (determined as bits [31:16] of the result) = 4;
thus, at currDDA position 3.0 (row 716), the source input center pixel for filtering corresponds to the red input pixel at position 4.0 of row 713.
Subsequently, the currIndex at even currDDA 3.0 can be determined using equation 7a, as follows:
currDDA = 3.0 (even)
0000 0000 0000 0011.0000 0000 0000 0000(currDDA)
+0000 0000 0000 0000.0010 0000 0000 0000(0.125)
=0000 0000 0000 0011.0010 0000 0000 0000
currIndex (determined as bits [16:14] of the result = [100]) = 4;
Thus, at currDDA position 3.0 (row 716), an appropriate filter coefficient may be selected from the filter coefficient table 712 using a currIndex value of 4. It will be appreciated that, for each output pixel, the DDA 711 may continue to be incremented by DDAStep, and filtering may be applied using the currPixel and currIndex determined for each currDDA value (the filtering may be vertical or horizontal, depending on whether DDAStep is in the X (horizontal) or Y (vertical) direction).
As described above, currIndex may be used as a selection index to select appropriate filter coefficients from the filter coefficient table 712 to apply to the selected pixels. The filtering process may include obtaining the source pixel values around the center pixel (currPixel), multiplying each selected pixel by the appropriate filter coefficient selected from the filter coefficient table 712 according to currIndex, and summing the results to obtain the value of the output pixel at the location corresponding to currDDA. Furthermore, since the present embodiment utilizes 8 phases between pixels of the same color, with the 5-tap horizontal/3-tap vertical filtering embodiment, 5 tables of depth 8 may be provided for horizontal filtering and 3 tables of depth 8 may be provided for vertical filtering. In one embodiment, each coefficient table entry includes a 16-bit 2's complement fixed-point number with 3 integer bits and 13 fractional bits.
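As a hedged illustration of this multiply-accumulate step, the C sketch below computes one horizontal 5-tap output sample; the coefficient table layout coef[tap][phase], the simple right-shift renormalization, and the omission of boundary replication and clipping are assumptions made for clarity rather than details taken from the description:

```c
#include <stdint.h>

/* One horizontal 5-tap output sample. The taps -2..+2 are same-color
 * neighbors, which in a Bayer row lie 2 pixels apart, and the phase
 * (currIndex) selects one of 8 coefficient sets per tap. Coefficients
 * are assumed to be 3.13 two's-complement fixed point. */
static int32_t bcf_horiz_5tap(const uint16_t *line, int curr_pixel_pos,
                              int curr_index, const int16_t coef[5][8])
{
    int32_t acc = 0;
    for (int tap = -2; tap <= 2; tap++) {
        int32_t px = line[curr_pixel_pos + 2 * tap];      /* same-color neighbor */
        acc += px * (int32_t)coef[tap + 2][curr_index];   /* 3.13 coefficient    */
    }
    /* Drop the 13 fractional coefficient bits to return to pixel precision;
     * rounding, clipping, and edge-pixel replication are omitted here. */
    return acc >> 13;
}
```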
Furthermore, assuming a Bayer image pattern, in one embodiment the vertical scaling component may include 4 separate 3-tap polyphase filters, one for each color component: Gr, R, B, and Gb. Each 3-tap filter may use the DDA 711 to control the stepping of the current center pixel and the index for the coefficients, as described above. Similarly, the horizontal scaling component may include 4 separate 5-tap polyphase filters, one for each color component: Gr, R, B, and Gb. Each 5-tap filter may use the DDA 711 to control the stepping (e.g., via DDAStep) of the current center pixel and the index for the coefficients. It should be appreciated, however, that in other embodiments fewer or more taps may be utilized by the horizontal and vertical scaling components (scalers).
For boundary cases, the pixels used in the horizontal and vertical filtering processes may depend on the relationship of the current DDA position (currDDA) relative to the frame boundary (e.g., the boundary defined by active area 312 in FIG. 23). For example, in horizontal filtering, if the currDDA position, when compared to the position of the center input pixel (SrcX) and the width of the frame (SrcWidth) (e.g., width 322 of active area 312 of FIG. 23), indicates that the DDA 711 is close to the boundary such that there are not enough pixels to perform the 5-tap filtering, the same-color input boundary pixels may be repeated. For example, if the selected center input pixel is at the left edge of the frame, the center pixel may be replicated twice for horizontal filtering. If the center input pixel is near the left edge of the frame such that only one pixel is available between the center input pixel and the left edge, then, for horizontal filtering, the one available pixel is replicated to provide two pixel values to the left of the center input pixel. Further, the horizontal scaling logic 709 may be configured such that the number of input pixels (including original pixels and replicated pixels) cannot exceed the input width. This can be expressed as follows:
StartX=(((DDAInitX+0x0001.0000)&0xFFFE.0000)>>16)
EndX=(((DDAInitX+DDAStepX*(BCFOutWidth-1))|0x0001.0000)>>16)
EndX-StartX<=SrcWidth-1
Where DDAInitX represents the initial position of the DDA 711, DDAStepX represents the DDA step value in the horizontal direction, and BCFOutWidth represents the width of the frame output by the BCF 652.
For vertical filtering, if the currDDA position, when compared to the position of the center input pixel (SrcY) and the height of the frame (SrcHeight) (e.g., the height of active area 312 of FIG. 23), indicates that the DDA 711 is close to the boundary such that there are not enough pixels to perform the 3-tap filtering, the input boundary pixels may be repeated. Further, the vertical scaling logic 710 may be configured such that the number of input pixels (including original pixels and replicated pixels) cannot exceed the input height. This can be expressed as follows:
StartY=(((DDAInitY+0x0001.0000)&0xFFFE.0000)>>16)
EndY=(((DDAInitY+DDAStepY*(BCFOutHeight-1))|0x0001.0000)>>16)
EndY-StartY<=SrcHeight-1
where DDAInitY represents the initial position of the DDA 711, DDAStepY represents the DDA step value in the vertical direction, and BCFOutHeight represents the height of the frame output by the BCF 652.
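A small C sketch of the horizontal boundary constraint above follows; the function name and the boolean return are assumptions, and the same form applies vertically with DDAInitY, DDAStepY, BCFOutHeight, and SrcHeight:

```c
#include <stdint.h>

/* Check that the span of input pixels touched by the horizontal scaler
 * (including replicated edge pixels) does not exceed the input width. */
static int bcf_horiz_span_fits(int32_t dda_init_x, int32_t dda_step_x,
                               int32_t bcf_out_width, int32_t src_width)
{
    int32_t start_x = (int32_t)(((uint32_t)(dda_init_x + 0x00010000)
                                 & 0xFFFE0000u) >> 16);
    int32_t end_x   = (int32_t)(((dda_init_x + dda_step_x * (bcf_out_width - 1))
                                 | 0x00010000) >> 16);
    return (end_x - start_x) <= (src_width - 1);
}
```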
Referring now to FIG. 65, a flow chart illustrating a method 720 of applying binning compensation filtering to image data received by the front-end pixel processing unit 150 is shown, in accordance with one embodiment. It should be appreciated that the method 720 illustrated in FIG. 65 is applicable to both vertical and horizontal scaling. Beginning at step 721, the DDA 711 is initialized and a DDA step value (which may correspond to DDAStepX for horizontal scaling and DDAStepY for vertical scaling) is determined. Subsequently, at step 722, the current DDA position (currDDA) is determined from DDAStep. As described above, currDDA may correspond to an output pixel location. Using currDDA, the method 720 may determine the center pixel (currPixel) of the input pixel data used for binning compensation filtering to determine the corresponding output value at currDDA, as shown at step 723. Subsequently, at step 724, an index (currIndex) corresponding to currDDA may be determined based on the inter-pixel fractional position of currDDA relative to the input pixels (e.g., row 713 of FIG. 64). For example, in an embodiment in which the DDA includes 16 integer bits and 16 fractional bits, currPixel may be determined according to equations 6a and 6b, and currIndex may be determined according to equations 7a and 7b, as described above. Although the 16-bit integer/16-bit fraction configuration is described herein as one example, it should be understood that other configurations of the DDA 711 may be utilized in accordance with the present techniques. By way of example, other embodiments of the DDA 711 may be configured to include a 12-bit integer portion and a 20-bit fractional portion, a 14-bit integer portion and an 18-bit fractional portion, and so forth.
Once currPixel and currIndex are determined, source pixels of the same color around currPixel can be selected for multi-tap filtering, as shown in step 725. For example, as described above, one embodiment may utilize 5-tap polyphase filtering in the horizontal direction (e.g., 2 same color pixels are selected on each side of currPixel) and 3-tap polyphase filtering in the vertical direction (e.g., 1 same color pixel is selected on each side of currPixel). Subsequently, at step 726, once the source pixel is selected, filter coefficients may be selected from the filter coefficient table 712 of BCF 652 based on currIndex.
Thereafter, at step 727, filtering may be applied to the source pixels to determine the value of the output pixel corresponding to the position represented by currDDA. For example, in one embodiment, the source pixels may be multiplied by their corresponding filter coefficients and the results summed to obtain the output pixel value. The direction in which filtering is applied at step 727 may be the vertical or horizontal direction, depending on whether DDAStep is in the X (horizontal) or Y (vertical) direction. Finally, at step 728, the DDA 711 is incremented by DDAStep, and the method 720 returns to step 722 to determine the next output pixel value using the binning compensation filtering technique discussed herein.
Referring to FIG. 66, step 723 of method 720 for determining currPixel is illustrated in greater detail, according to one embodiment. For example, step 723 may include sub-step 729 of determining whether the output pixel position corresponding to currDDA (from step 722) is even or odd. As described above, an even or odd output pixel may be determined from the least significant bit of currDDA based on DDAStep. For example, given a DDAStep of 1.25, a currDDA value of 1.25 may be determined to be odd, because the least significant bit (bit 14 of the fractional portion of the DDA 711) has a value of 1. For a currDDA value of 2.5, bit 14 is 0, indicating an even output pixel position.
At decision logic 730, a determination is made as to whether the output pixel position corresponding to currDDA is even or odd. If the output pixel is even, decision logic 730 proceeds to sub-step 731, where currPixel is determined by incrementing the currDDA value by 1 and rounding the result to the nearest even input pixel position, as shown above in equation 6a. If the output pixel is odd, decision logic 730 proceeds to sub-step 732, where currPixel is determined by rounding the currDDA value to the nearest odd input pixel position, as shown above in equation 6b. The currPixel value may then be applied at step 725 of method 720 to select source pixels for filtering, as described above.
Referring additionally to FIG. 67, step 724 of method 720 for determining currIndex is illustrated in greater detail, according to one embodiment. For example, step 724 may include sub-step 733 of determining whether the output pixel position corresponding to currDDA (from step 722) is even or odd. This determination may be made in a manner similar to step 729 of FIG. 66. At decision logic 734, a determination is made as to whether the output pixel position corresponding to currDDA is even or odd. If the output pixel is even, decision logic 734 proceeds to sub-step 735, where currIndex is determined by incrementing the currDDA value by one index step and determining currIndex from the lowest integer bit and the two highest fractional bits of the DDA 711. For example, in an embodiment in which 8 phases are provided between each same-color pixel and the DDA includes 16 integer bits and 16 fractional bits, one index step may correspond to 0.125, and currIndex may be determined from bits [16:14] of the currDDA value incremented by 0.125 (e.g., equation 7a). If the output pixel is odd, decision logic 734 may proceed to sub-step 736, where currIndex is determined by incrementing the currDDA value by one index step and one pixel shift, and determining currIndex from the lowest integer bit and the two highest fractional bits of the DDA 711. Thus, in an embodiment in which 8 phases are provided between each same-color pixel and the DDA includes 16 integer bits and 16 fractional bits, one index step may correspond to 0.125, one pixel shift may correspond to 1.0 (a shift of 8 index steps to the next same-color pixel), and currIndex may be determined from bits [16:14] of the currDDA value incremented by 1.125 (e.g., equation 7b).
Although the illustrated embodiment provides the BCF 652 as a component of the front-end pixel processing logic 150, other embodiments may incorporate the BCF 652 into the raw image data processing pipeline of the ISP pipeline 82 which, as further described below, may include defective pixel detection/correction logic, gain/offset/compensation blocks, noise reduction logic, lens shading correction logic, and demosaicing logic. Furthermore, in embodiments where the aforementioned defective pixel detection/correction logic, gain/offset/compensation blocks, noise reduction logic, and lens shading correction logic do not rely on a linear placement of pixels, the BCF 652 may be combined with the demosaicing logic to perform binning compensation filtering and rearrangement of the pixels prior to demosaicing, since demosaicing generally relies on a uniform spatial arrangement of pixels. For example, in one embodiment, the BCF 652 may be incorporated anywhere between the sensor input and the demosaicing logic, with temporal filtering and/or defective pixel detection/correction applied to the raw image data prior to binning compensation.
As described above, the output of the BCF 652, which may be the output FEProcOut (109) with spatially uniform distribution of image data (e.g., sample 702 of fig. 62), may be forwarded to the ISP pipeline processing logic 82 for additional processing. However, before moving the focus of the discussion to the ISP pipeline processing logic 82, the various functions provided by the statistics processing elements (e.g., 142 and 144) implemented in the ISP front-end logic 80 will first be described in more detail.
Referring back to the overview of the statistics processing units 142 and 144, these units may be configured to collect various statistics about the image sensors that capture and provide the raw image signals (Sif0 and Sif1), such as statistics relating to auto-exposure, auto-white balance, auto-focus, flicker detection, black level compensation, and lens shading correction, among others. In doing so, the statistics processing units 142 and 144 may first apply one or more image processing operations to their respective input signals Sif0 (from Sensor0) and Sif1 (from Sensor1).
For example, referring to FIG. 68, a block diagram of a statistics processing unit 142 associated with Sensor0(90a) is illustrated in greater detail according to one embodiment. As shown, the statistical information processing unit 142 may include the following functional blocks: defective pixel detection and correction logic 738, Black Level Compensation (BLC) logic 739, lens shading correction logic 740, inverted BLC logic 741, and statistics collection logic 742. These functional blocks will be discussed separately below. Further, it should be appreciated that statistical information processing unit 144 associated with Sensor1(90b) may be implemented in a similar manner.
First, the front-end defective pixel detection logic 738 receives the output of the selection logic 146 (e.g., Sif0 or SifIn0). It should be understood that a "defective pixel" refers to an imaging pixel within the image sensor 90 that fails to accurately sense light levels. Defective pixels may be attributable to a number of factors, and may include "noisy" (or leaky) pixels, "bright" pixels, and "dead" pixels. A noisy pixel generally appears brighter than a non-defective pixel given the same amount of light at the same spatial location, and may result from reset failures and/or high leakage. For example, a noisy pixel may exhibit higher than normal charge leakage relative to non-defective pixels, thereby appearing brighter than non-defective pixels. In addition, dead and bright pixels may be the result of impurities, such as dust or other trace materials, contaminating the image sensor during fabrication and/or assembly, which may cause certain defective pixels to be darker or brighter than non-defective pixels, or may cause a defective pixel to be fixed at a particular value regardless of the amount of light to which it is actually exposed. Additionally, dead and bright pixels may also result from circuit failures occurring during operation of the image sensor. By way of example, a bright pixel may appear as always being on (e.g., fully charged) and thus appears brighter, whereas a dead pixel appears as always being off.
The Defective Pixel Detection and Correction (DPDC) logic 738 in the ISP front-end logic 80 may correct defective pixels (e.g., replace defective pixel values) before they are considered in statistics collection (e.g., 742). In one embodiment, defective pixel correction is performed independently for each color component (e.g., R, B, Gr, and Gb of the Bayer pattern). In general, the front-end DPDC logic 738 may provide dynamic defect correction, in which the locations of defective pixels are determined automatically based on directional gradients computed using neighboring pixels of the same color. It will be appreciated that a defect is "dynamic" in the sense that the characterization of a pixel as defective at a given time may depend on the image data in the neighboring pixels. For example, a bright pixel that is always at maximum brightness may not be regarded as a defective pixel if its location is in an area of the current image dominated by brighter colors or white. Conversely, if the bright pixel is in a region of the current image dominated by black or darker colors, it may be identified as a defective pixel and thus corrected during processing by the DPDC logic 738.
DPDC logic 738 may utilize one or more horizontally adjacent pixels of the same color on each side of the current pixel to determine whether the current pixel is defective via a pixel-to-pixel gradient. If the current pixel is identified as defective, the value of the defective pixel may be replaced with the value of the horizontally adjacent pixel. For example, in one embodiment, 5 same color horizontal neighboring pixels within the original frame 310 (FIG. 23) are used, wherein the 5 horizontal neighboring pixels include the current pixel and 2 neighboring pixels on each side. Thus, as illustrated in fig. 69, for a given color component c and current pixel P, DPDC logic 738 may consider the horizontally adjacent pixels P0, P1, P2 and P3. It should be noted, however, that depending on the position of the current pixel P, pixels outside the original frame 310 are not considered when calculating the pixel-to-pixel gradient.
For example, as shown in FIG. 69, in the "left edge" case 743, the current pixel P is at the leftmost edge of the original frame 310, so the neighboring pixels P0 and P1 outside of the original frame 310 are not considered, leaving only the pixels P, P2, and P3 (N = 3). In the "left edge + 1" case 744, the current pixel P is one unit pixel away from the leftmost edge of the original frame 310, and thus pixel P0 is not considered, leaving only the pixels P1, P, P2, and P3 (N = 4). Furthermore, in the "centered" case 745, the pixels P0 and P1 to the left of the current pixel P and the pixels P2 and P3 to the right of the current pixel P are all within the original frame 310 boundary, so all of the neighboring pixels P0, P1, P2, and P3 are considered in calculating the pixel-to-pixel gradients (N = 5). In addition, similar cases 746 and 747 may be encountered as the rightmost edge of the original frame 310 is approached. For example, in the "right edge - 1" case 746, the current pixel P is one unit pixel away from the rightmost edge of the original frame 310, and thus pixel P3 is not considered (N = 4). Similarly, in the "right edge" case 747, the current pixel P is at the rightmost edge of the original frame 310, and thus neither of the two neighboring pixels P2 and P3 is considered (N = 3).
In the illustrated embodiment, for each neighboring pixel (k = 0 to 3) within the picture boundary (e.g., original frame 310), the pixel-to-pixel gradients may be calculated as follows:
Gk = abs(P - Pk), for 0 ≤ k ≤ 3 (only for k within the original frame) (8)
Once the pixel-to-pixel gradients have been determined, defective pixel detection may be performed by the DPDC logic 738 as follows. First, it is assumed that a pixel is defective if a certain number of its gradients Gk are at or below a particular threshold, denoted by the variable dprTh. Thus, for each pixel, a count (C) of the number of gradients for neighboring pixels inside the picture boundaries that are at or below the threshold dprTh is accumulated. For example, for each neighboring pixel within the original frame 310, the accumulated count C of the gradients Gk that are at or below the threshold dprTh may be computed as follows:

C = Σ (Gk ≤ dprTh), for 0 ≤ k ≤ 3 (only for k within the original frame) (9)
It will be appreciated that the threshold dprTh may vary with the color component. Subsequently, if it is determined that the cumulative count C is less than or equal to the maximum count (represented by the variable dprMaxC), then the pixel can be considered defective. The logic is represented as follows:
if (C ≦ dprMaxC), then the pixel is defective (10)
Defective pixels may be replaced using any of a variety of replacement conventions. For example, in one embodiment, a defective pixel may be replaced with the pixel P1 immediately to its left. In a boundary condition (e.g., P1 is outside of the original frame 310), the defective pixel may be replaced with the pixel P2 immediately to its right. Further, it should be appreciated that replacement values may be retained and carried forward for successive defective pixel detection operations. For example, referring to the set of horizontal pixels shown in FIG. 69, if P0 or P1 was previously identified as a defective pixel by the DPDC logic 738, its corresponding replacement value may be used for the defective pixel detection and replacement of the current pixel P.
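A hedged C sketch of this dynamic detection test for a single pixel is shown below; it assumes the neighbor values passed in are only those inside the raw frame, and the helper names and replacement function are illustrative rather than part of the described hardware:

```c
#include <stdlib.h>

/* Dynamic defect test for one pixel P given its same-color horizontal
 * neighbors that lie inside the raw frame. dpr_th and dpr_max_c are the
 * per-color threshold and maximum count described above. */
static int dpdc_is_defective(int p, const int *nbr, int n_nbr,
                             int dpr_th, int dpr_max_c)
{
    int count = 0;
    for (int k = 0; k < n_nbr; k++) {
        int grad = abs(p - nbr[k]);      /* Gk = abs(P - Pk)            (8) */
        if (grad <= dpr_th)
            count++;                     /* accumulate the count C      (9) */
    }
    return count <= dpr_max_c;           /* defective if C <= dprMaxC  (10) */
}

/* Replacement convention: use the neighbor immediately to the left (P1),
 * or the one immediately to the right (P2) when P1 lies outside the frame. */
static int dpdc_replacement(int p1, int p2, int p1_in_frame)
{
    return p1_in_frame ? p1 : p2;
}
```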
To summarize the defective pixel detection and correction techniques discussed above, a flow chart describing this process is provided in FIG. 70 as process 748. As shown, process 748 begins at step 749, where a current pixel (P) is received and a set of neighboring pixels is identified. In accordance with the embodiments described above, the neighboring pixels may include two horizontal pixels of the same color component on each side of the current pixel (e.g., P0, P1, P2, and P3). Subsequently, at step 750, horizontal pixel-to-pixel gradients are calculated for each neighboring pixel within the original frame 310, as shown above in equation 8. Thereafter, at step 751, a count C of the number of gradients that are less than or equal to the particular threshold dprTh is determined. As shown at decision logic 752, if C is less than or equal to dprMaxC, process 748 proceeds to step 753, where the current pixel is identified as defective. The defective pixel is then corrected at step 754 using a replacement value. Referring back to decision logic 752, if C is greater than dprMaxC, the process proceeds to step 755, where the current pixel is identified as not defective, and its value is not changed.
It should be noted that the defective pixel detection/correction technique applied during the ISP front-end statistics processing may be less robust than the defective pixel detection/correction performed in the ISP pipeline logic 82. For example, as described in more detail below, the defective pixel detection/correction performed in the ISP pipeline logic 82 may provide, in addition to dynamic defect correction, fixed defect correction, in which the locations of defective pixels are known in advance and loaded into one or more defect tables. In addition, dynamic defect correction in the ISP pipeline logic 82 may also consider pixel gradients in both the horizontal and vertical directions, and may provide for the detection/correction of speckle, as described below.
Returning to FIG. 68, the output of the DPDC logic 738 is then passed to the Black Level Compensation (BLC) logic 739. The BLC logic 739 may provide digital gain, offset, and clipping independently for each color component "c" (e.g., R, B, Gr, and Gb of the Bayer pattern) of the pixels used for statistics collection. For example, as expressed by the following operation, the input value for the current pixel is first offset by a signed value and then multiplied by a gain.
Y=(X+O[c])×G[c], (11)
Where X represents the input pixel value for a given color component c (e.g., R, B, Gr, or Gb), O[c] represents a signed 16-bit offset for the current color component c, and G[c] represents a gain value for the color component c. In one embodiment, the gain G[c] may be a 16-bit unsigned number with 2 integer bits and 14 fractional bits (e.g., a 2.14 fixed-point representation), and rounding may be applied to the gain G[c]. By way of example, the gain G[c] may have a range of between 0 and 4X (e.g., 4 times the input pixel value).
The computed value Y, which is signed, is then clipped to a minimum and maximum range, as shown in equation 12 below:
Y = (Y < min[c]) ? min[c] : ((Y > max[c]) ? max[c] : Y) (12)
the variables min [ c ] and max [ c ] may represent signed 16-bit "clipping values" for minimum and maximum output values, respectively. In one embodiment, BLC logic 739 may also be configured to keep a count of the number of pixels clipped above the maximum value and below the minimum value, respectively, by color component.
The output of the BLC logic 739 is then forwarded to the Lens Shading Correction (LSC) logic 740. The LSC logic 740 may be configured to apply an appropriate gain on a per-pixel basis to compensate for drop-offs in intensity, which are generally roughly proportional to the distance from the optical center of the lens 88 of the imaging device 30. It will be appreciated that such drop-offs in intensity may be a result of the geometric optics of the lens. By way of example, a lens having ideal optical properties may be modeled as the fourth power of the cosine of the angle of incidence, cos^4(θ), referred to as the cos^4 law. However, because lens manufacturing is not perfect, various irregularities in the lens may cause the optical properties to deviate from the assumed cos^4 model. For example, the thinner edges of the lens usually exhibit the most irregularities. Additionally, irregularities in lens shading patterns may also be the result of a microlens array within the image sensor not being perfectly aligned with the color filter array. Further, the infrared (IR) filter in some lenses may cause the drop-off to be illuminant-dependent and, thus, the lens shading gains may be adapted depending upon the light source detected.
Referring to FIG. 71, a three-dimensional profile 756 depicting light intensity versus pixel position for a typical lens is illustrated. As shown, the light intensity near the center 757 of the lens gradually drops off towards the corners or edges 758 of the lens. The lens shading irregularities depicted in FIG. 71 may be better illustrated by FIG. 72, which shows a color map of image 759 exhibiting drop-offs in light intensity towards the corners and edges. In particular, it should be noted that the light intensity at the approximate center of the image appears to be noticeably greater than the light intensity at the corners and/or edges of the image.
In accordance with embodiments of the present techniques, lens shading correction gains may be specified as a two-dimensional grid of gains per color channel (e.g., Gr, R, B, Gb for a Bayer filter). The gain grid points may be distributed at fixed horizontal and vertical intervals within the original frame 310 (FIG. 23). As discussed above in FIG. 23, the original frame 310 may include an active area 312 that defines an area on which processing is performed for a particular image processing operation. With regard to the lens shading correction operation, an active processing area, which may be referred to as the LSC region, is defined within the original frame area 310. As discussed below, the LSC region must be completely inside or at the gain grid boundaries, otherwise the results may be undefined.
For example, referring to FIG. 73, an LSC region 760 and a gain grid 761 that may be defined within the original frame 310 are shown. The LSC region 760 may have a width 762 and a height 763, and may be defined by an x-offset 764 and a y-offset 765 with respect to the boundary of the original frame 310. Grid offsets (e.g., grid x-offset 766 and grid y-offset 767) from the base 768 of the grid gains 761 to the first pixel 769 in the LSC region 760 are also provided. These offsets may be within the first grid interval for a given color component. Horizontal (x-direction) and vertical (y-direction) grid point intervals 770 and 771, respectively, may be specified independently for each color channel.
As discussed above, assuming the use of a Bayer color filter array, 4 color channels of grid gains (R, B, Gr, and Gb) may be defined. In one embodiment, a total of 4K (4096) grid points may be available, and for each color channel, a base address for the starting location of its grid gains may be provided, such as by using a pointer. Further, the horizontal (770) and vertical (771) grid point intervals may be defined in terms of pixels at the resolution of one color plane and, in certain embodiments, may provide for grid point intervals separated by a power of 2, such as by 8, 16, 32, 64, or 128, in the horizontal and vertical directions. It can be appreciated that by utilizing a power of 2, efficient implementation of gain interpolation using shift (e.g., division by a power of 2) and addition operations may be achieved. Using these parameters, the same gain values can be used even as the image sensor cropping region changes. For example, only a few parameters need to be updated to align the grid points to the cropped region (e.g., updating the grid offsets) instead of updating all grid gain values. By way of example only, this may be useful when cropping is used during digital zoom operations. Further, while the gain grid 761 shown in the embodiment of FIG. 73 is depicted as having generally equally spaced grid points, it should be understood that in other embodiments the grid points may not necessarily be equally spaced. For instance, in some embodiments, the grid points may be distributed unevenly (e.g., logarithmically), such that the grid points are less concentrated in the center of the LSC region 760, but more concentrated towards the corners of the LSC region 760, where lens shading distortion is typically more noticeable.
In accordance with the presently disclosed lens shading correction techniques, when the current pixel location is outside of the LSC region 760, no gain is applied (e.g., the pixel is passed unchanged). When the current pixel location is at a gain grid location, the gain value at that particular grid point may be used. However, when the current pixel location is between grid points, the gain may be interpolated using bilinear interpolation. An example of interpolating the gain for the pixel location "G" in FIG. 74 is provided below.
As shown in FIG. 74, the pixel G is between the grid points G0, G1, G2, and G3, which correspond to the top-left, top-right, bottom-left, and bottom-right gains, respectively, relative to the current pixel location G. The horizontal and vertical sizes of the grid interval are represented by X and Y, respectively. Additionally, ii and jj represent the horizontal and vertical pixel offsets, respectively, relative to the position of the top-left gain G0. Based upon these factors, the gain corresponding to the position G may be interpolated as given by equation 13a.
the terms in equation 13a above may then be combined to obtain the following expression:
in one embodiment, the interpolation method may be performed incrementally, rather than using multipliers at each pixel, thereby reducing computational complexity. For example, term (ii) (jj) may be implemented with an adder that may be initialized to 0 at position (0, 0) of gain grid 761 and incremented by the current row number each time the current column number increases by one pixel. As described above, since the values of X and Y can be selected to be powers of 2, gain interpolation can be accomplished with a simple shift operation. Thus, only a multiplier is required at grid point G0 (rather than at each pixel), and only an addition operation is required to determine the interpolation gain for the remaining pixels.
In certain embodiments, the interpolation of gains between the grid points may use 14-bit precision, and the grid gains may be unsigned 10-bit values with 2 integer bits and 8 fractional bits (e.g., a 2.8 fixed-point representation). Using this convention, the gain may have a range of between 0 and 4X, and the gain resolution between grid points may be 1/256.
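As a hedged sketch of the bilinear interpolation described above (consistent with equations 13a and 13b), the C code below blends the four bounding grid gains; the function signature, integer types, and the final division (which reduces to a shift when X and Y are powers of 2) are illustrative assumptions:

```c
#include <stdint.h>

/* Bilinear interpolation of lens shading gains between the four bounding
 * grid points: g0 (top-left), g1 (top-right), g2 (bottom-left), and
 * g3 (bottom-right). x and y are the horizontal and vertical grid
 * intervals; ii and jj are the pixel offsets from g0. Gains are assumed
 * to be unsigned 2.8 fixed-point values as described above. */
static uint32_t lsc_interp_gain(uint32_t g0, uint32_t g1,
                                uint32_t g2, uint32_t g3,
                                uint32_t x, uint32_t y,
                                uint32_t ii, uint32_t jj)
{
    uint64_t acc = (uint64_t)g0 * (x - ii) * (y - jj)
                 + (uint64_t)g1 * ii       * (y - jj)
                 + (uint64_t)g2 * (x - ii) * jj
                 + (uint64_t)g3 * ii       * jj;
    /* When x and y are powers of 2, this division becomes a right shift. */
    return (uint32_t)(acc / ((uint64_t)x * y));
}
```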
The lens shading correction technique is further illustrated by the process 772 shown in FIG. 75. As shown, process 772 begins at step 773, at which the position of the current pixel is determined relative to the boundaries of the LSC region 760 of FIG. 73. Next, decision logic 774 determines whether the current pixel position is within the LSC region 760. If the current pixel position is outside of the LSC region 760, process 772 continues to step 775, and no gain is applied to the current pixel (e.g., the pixel passes unchanged).
If the current pixel position is within the LSC region 760, process 772 continues to decision logic 776, at which it is further determined whether the current pixel position corresponds to a grid point within the gain grid 761. If the current pixel position corresponds to a grid point, then the gain value at that grid point is selected and applied to the current pixel, as shown at step 777. If the current pixel position does not correspond to a grid point, then process 772 continues to step 778, and a gain is interpolated based upon the bordering grid points (e.g., G0, G1, G2, and G3 of FIG. 74). For instance, the interpolated gain may be computed in accordance with equations 13a and 13b, as discussed above. Thereafter, process 772 ends at step 779, at which the interpolated gain from step 778 is applied to the current pixel.
As will be appreciated, process 772 may be repeated for each pixel of the image data. For instance, as shown in FIG. 76, a three-dimensional profile depicting the gains that may be applied to each pixel position within an LSC region (e.g., 760) is illustrated. As shown, the gain applied at the corners 780 of the image may generally be greater than the gain applied to the center 781 of the image due to the greater drop-off in light intensity at the corners (as shown in FIGS. 71 and 72). Using the presently described lens shading correction techniques, the appearance of light intensity drop-offs in the image may be reduced or substantially eliminated. For instance, FIG. 77 provides a color map of the image 759 from FIG. 72 after lens shading correction is applied. As shown, compared to the initial image of FIG. 72, the overall light intensity is generally more uniform across the image. Particularly, the light intensity at the approximate center of the image may be substantially equal to the light intensity values at the corners and/or edges of the image. Additionally, as mentioned above, the interpolated gain calculation (equations 13a and 13b) may, in some embodiments, be replaced with incremental additions between grid points by taking advantage of the sequential row and column incrementing structure. As will be appreciated, this may reduce computational complexity.
In further embodiments, in addition to using grid gains, a global gain per color component that is scaled as a function of the distance from the image center may be used. The center of the image may be provided as an input parameter, and may be estimated by analyzing the light intensity amplitude of each image pixel in a uniformly illuminated image. The radial distance between the identified center pixel and the current pixel may then be used to obtain a linearly scaled radial gain, Gr, as shown below:
Gr=Gp[c]×R, (14)
wherein Gp[c] represents a global gain parameter for each color component c (e.g., the R, B, Gr, and Gb components of the Bayer pattern), and wherein R represents the radial distance between the center pixel and the current pixel.
Referring to FIG. 78, which shows the LSC region 760 discussed above, the distance R may be calculated or estimated using several techniques. As shown, the pixel C corresponding to the image center may have the coordinates (x0, y0), and the current pixel G may have the coordinates (xG, yG). In one embodiment, the LSC logic 740 may calculate the distance R using the following equation:

R = sqrt((xG - x0)^2 + (yG - y0)^2) (15)

In another embodiment, a simpler estimation formula, shown below, may be utilized to obtain an estimated value for R.
R=α×max(abs(xG-x0),abs(yG-y0))+β×min(abs(xG-x0),abs(yG-y0)) (16)
In equation 16, the estimation coefficients α and β may be scaled to 8-bit values. By way of example only, in one embodiment, α may be approximately equal to 123/128 and β may be approximately equal to 51/128 to provide an estimated value for R. Using these coefficient values, the largest error may be approximately 4%, with a median error of approximately 1.3%. Thus, even though the estimation technique may be somewhat less accurate than utilizing the calculation technique in determining R (equation 15), the margin of error is low enough that the estimated values of R are suitable for determining radial gain components for the present lens shading correction techniques.
The radial gain Gr may then be multiplied by the interpolated grid gain value G (equations 13a and 13b) for the current pixel to determine a total gain that may be applied to the current pixel. The output pixel Y is obtained by multiplying the input pixel value X by the total gain, as shown below:
Y=(G×Gr×X) (17)
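A hedged C sketch combining equations 14, 16, and 17 is shown below; the function names, integer types, and the omission of fixed-point renormalization and clipping are assumptions made to keep the example short:

```c
#include <stdlib.h>
#include <stdint.h>

/* Estimated radial distance per equation 16, with alpha ~ 123/128 and
 * beta ~ 51/128 applied as integer multiplies followed by a /128 shift. */
static int32_t radial_distance_est(int32_t xg, int32_t yg,
                                   int32_t x0, int32_t y0)
{
    int32_t dx = abs(xg - x0), dy = abs(yg - y0);
    int32_t mx = dx > dy ? dx : dy;
    int32_t mn = dx > dy ? dy : dx;
    return (123 * mx + 51 * mn) >> 7;
}

/* Y = (G * Gr * X), with Gr = Gp[c] * R (equations 14 and 17).
 * Fixed-point renormalization and output clipping are omitted here. */
static uint32_t lsc_apply_total_gain(uint32_t x, uint32_t grid_gain_g,
                                     uint32_t gp_c, uint32_t r)
{
    uint32_t gr = gp_c * r;          /* radial gain component Gr  (14) */
    return grid_gain_g * gr * x;     /* output pixel Y            (17) */
}
```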
Thus, in accordance with the present techniques, lens shading correction may be performed using the interpolated gain only, or using both the interpolated gain and radial gain components. Alternatively, lens shading correction may also be accomplished using only the radial gain in conjunction with a radial grid table that compensates for errors in the radial approximation. For example, instead of a rectangular gain grid 761 as shown in FIG. 73, a radial gain grid having a plurality of grid points defining gains in the radial and angular directions may be provided. Thus, when determining the gain to apply to a pixel that does not align with one of the radial grid points within the LSC region 760, interpolation may be applied using the four grid points that enclose the pixel to determine an appropriate interpolated lens shading gain.
Referring to FIG. 79, the use of the interpolated and radial gain components in lens shading correction is illustrated by process 782. It should be noted that process 782 may include steps that are similar to the process 772 described above in FIG. 75; accordingly, such steps have been numbered with like reference numerals. Beginning at step 773, the current pixel is received and its position relative to the LSC region 760 is determined. Next, decision logic 774 determines whether the current pixel position is within the LSC region 760. If the current pixel position is outside of the LSC region 760, process 782 continues to step 775, and no gain is applied to the current pixel (e.g., the pixel passes unchanged). If the current pixel position is within the LSC region 760, then process 782 may continue simultaneously to step 783 and decision logic 776. Referring first to step 783, data identifying the center of the image is retrieved. As discussed above, determining the center of the image may include analyzing light intensity amplitudes of the pixels under uniform illumination. This may occur during calibration, for instance. Thus, it should be understood that step 783 does not necessarily encompass repeatedly calculating the center of the image for the processing of each pixel, but may refer to retrieving the data (e.g., coordinates) of a previously determined image center. Once the center of the image is identified, process 782 may continue to step 784, wherein the distance (R) between the image center and the current pixel position is determined. As discussed above, the value of R may be calculated (equation 15) or estimated (equation 16). Then, at step 785, a radial gain component Gr may be computed using the distance R and the global gain parameter corresponding to the color component of the current pixel (equation 14). The radial gain component Gr may be used to determine the total gain, as will be discussed in step 787 below.
Referring back to decision logic 776, a determination is made as to whether the current pixel position corresponds to a grid point within the gain grid 761. If the current pixel position corresponds to a grid point, then the gain value at that grid point is determined, as shown at step 786. If the current pixel position does not correspond to a grid point, then process 782 continues to step 778, and an interpolated gain is computed based upon the bordering grid points (e.g., G0, G1, G2, and G3 of FIG. 74). For instance, the interpolated gain may be computed in accordance with equations 13a and 13b, as discussed above. Next, at step 787, a total gain is determined based upon the radial gain determined at step 785, as well as one of the grid gains (step 786) or the interpolated gain (step 778), depending upon which branch decision logic 776 takes during process 782. The total gain is then applied to the current pixel, as shown at step 788. Again, it should be noted that, like process 772, process 782 may also be repeated for each pixel of the image data.
The use of the radial gain in conjunction with grid gains may offer various advantages. For instance, using a radial gain allows the use of a single common gain grid for all color components. This may greatly reduce the total storage space needed for storing separate gain grids for each color component. For instance, in a Bayer image sensor, using a single gain grid for each of the R, B, Gr, and Gb components may reduce the gain grid data by approximately 75%. As will be appreciated, this reduction in grid gain data may decrease implementation costs, as grid gain data tables may account for a significant portion of memory or chip area in image processing hardware. Further, depending upon the hardware implementation, the use of a single set of gain grid values may offer further advantages, such as reducing overall chip area (e.g., when the gain grid values are stored in an on-chip memory) and reducing memory bandwidth requirements (e.g., when the gain grid values are stored in an off-chip external memory).
Having thoroughly described the functionality of the lens shading correction logic 740 shown in FIG. 68, the output of the LSC logic 740 is subsequently forwarded to the Inverted Black Level Compensation (IBLC) logic 741. The IBLC logic 741 provides gain, offset, and clipping independently for each color component (e.g., R, B, Gr, and Gb), and generally performs the inverse function of the BLC logic 739. For instance, as shown by the following operation, the value of the input pixel is first multiplied by a gain and then offset by a signed value.
Y=(X×G[c])+O[c], (18)
Where X represents the input pixel value for a given color component c (e.g., R, B, Gr, or Gb), O[c] represents a signed 16-bit offset for the current color component c, and G[c] represents a gain value for the color component c. In one embodiment, the gain G[c] may have a range of between approximately 0 and 4X (4 times the input pixel value X). It should be noted that these variables may be the same variables discussed above in equation 11. The computed value Y may be clipped to a minimum and maximum range using, for example, equation 12. In one embodiment, the IBLC logic 741 may be configured to maintain, per color component, a count of the number of pixels that were clipped above the maximum value and below the minimum value, respectively.
Thereafter, the output of the IBLC logic 741 is received by the statistics collection block 742, which may provide for the collection of various statistical data points about the image sensor 90, such as those relating to auto-exposure (AE), auto-white balance (AWB), auto-focus (AF), flicker detection, and so forth. With this in mind, certain embodiments of the statistics collection block 742 and various aspects related thereto are described below with reference to FIGS. 80-97.
As will be appreciated, AWB, AE, and AF statistics may be used in the acquisition of images in digital still cameras as well as video cameras. For simplicity, the AWB, AE, and AF statistics may be collectively referred to herein as "3A statistics." In the embodiment of the ISP front-end logic illustrated in FIG. 68, the architecture for the statistics collection block 742 ("3A statistics logic") may be implemented in hardware, software, or a combination thereof. Further, control software or firmware may be utilized to analyze the statistics collected by the 3A statistics logic 742 and control various parameters of the lens (e.g., focal length), the sensor (e.g., analog gains, integration times), and the ISP pipeline 82 (e.g., digital gains, color correction matrix coefficients). In certain embodiments, the image processing circuitry 32 may be configured to provide flexibility in statistics collection to enable control software or firmware to implement various AWB, AE, and AF algorithms.
With regard to white balancing (AWB), the image sensor response at each pixel may depend on the illumination source, since the light source is reflected from objects in the image scene. Thus, each pixel value recorded in the image scene is related to the color temperature of the light source. For instance, FIG. 80 shows a graph 789 illustrating the color range of white areas under low and high color temperatures in the YCbCr color space. As shown, the x-axis and y-axis of the graph 789 represent the blue-difference chroma (Cb) and the red-difference chroma (Cr), respectively, of the YCbCr color space. The graph 789 also shows a low color temperature axis 790 and a high color temperature axis 791. The region 792 in which the axes 790 and 791 are positioned represents the color range of white areas under low and high color temperatures in the YCbCr color space. It should be understood, however, that the YCbCr color space is merely one example of a color space that may be used in conjunction with the auto white balance processing of the present embodiment. Other embodiments may utilize any suitable color space. For instance, in certain embodiments, other suitable color spaces may include a Lab (CIELab) color space (e.g., based on CIE 1976), a red/blue normalized color space (e.g., the R/(R+2G+B) and B/(R+2G+B) color spaces; the R/G and B/G color spaces; the Cb/Y and Cr/Y color spaces; and so forth). Accordingly, for the purposes of this disclosure, the axes of the color space used by the 3A statistics logic 742 may be referred to as C1 and C2 (as is the case in FIG. 80).
When a white object is illuminated under a low color temperature, it may appear reddish in the captured image. Conversely, a white object that is illuminated under a high color temperature may appear bluish in the captured image. The goal of white balancing is, therefore, to adjust the RGB values such that the image appears to the human eye as if it were taken under canonical (standard) light. Thus, in the context of imaging statistics relating to white balance, color information about white objects is collected to determine the color temperature of the light source. In general, white balance algorithms may include two main steps. First, the color temperature of the light source is estimated. Second, the estimated color temperature is used to adjust color gain values and/or determine/adjust the coefficients of a color correction matrix. Such gains may be a combination of analog and digital image sensor gains, as well as ISP digital gains.
For instance, in some embodiments, the imaging device 30 may be calibrated using multiple different reference illuminants. Accordingly, the white point of the current scene may be determined by selecting the color correction coefficients corresponding to the reference illuminant that most closely matches the illuminant of the current scene. By way of example only, one embodiment may calibrate the imaging device 30 using five reference illuminants, ranging from a low color temperature illuminant to a high color temperature illuminant. As shown in FIG. 81, one embodiment may define white balance gains using the following color correction profiles: Horizon (H) (simulating a color temperature of approximately 2300 degrees), Incandescent (A or IncA) (simulating a color temperature of approximately 2856 degrees), D50 (simulating a color temperature of approximately 5000 degrees), D65 (simulating a color temperature of approximately 6500 degrees), and D75 (simulating a color temperature of approximately 7500 degrees).
Depending on the illuminant of the current scene, white balance gains may be determined using the gains corresponding to the reference illuminant that most closely matches the current illuminant. For instance, if the statistics logic 742 (described in more detail below in FIG. 82) determines that the current illuminant approximately matches the reference middle color temperature illuminant D50, then white balance gains of approximately 1.37 and 1.23 may be applied to the red and blue color channels, respectively, while approximately no gain (1.0) is applied to the green channels (G0 and G1 for Bayer data). In some embodiments, if the current illuminant color temperature is in between two reference illuminants, white balance gains may be determined by interpolating the white balance gains between the two reference illuminants. Further, while the present example shows an imaging device being calibrated using the H, A, D50, D65, and D75 illuminants, it should be understood that any suitable type of illuminant may be used for camera calibration, such as TL84 or CWF (fluorescent reference illuminants), and so forth.
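As a hedged sketch of the interpolation between two bracketing reference illuminants mentioned above, the C code below linearly blends per-channel gains by color temperature; the structure, the linear weighting, and the example calibration values are illustrative assumptions (the reference gains themselves would come from calibration, e.g. the approximately 1.37 red and 1.23 blue gains cited for D50):

```c
/* One calibrated reference illuminant with its per-channel WB gains. */
typedef struct {
    float color_temp;                 /* e.g. 2856.0f for A, 5000.0f for D50 */
    float gain_r, gain_g, gain_b;
} wb_ref_t;

/* Linearly interpolate white balance gains between two reference
 * illuminants (lo, hi) that bracket the estimated scene color temperature. */
static void wb_interp_gains(const wb_ref_t *lo, const wb_ref_t *hi,
                            float scene_temp, float out_gains[3])
{
    float t = (scene_temp - lo->color_temp) /
              (hi->color_temp - lo->color_temp);   /* 0..1 between the refs */
    out_gains[0] = lo->gain_r + t * (hi->gain_r - lo->gain_r);
    out_gains[1] = lo->gain_g + t * (hi->gain_g - lo->gain_g);
    out_gains[2] = lo->gain_b + t * (hi->gain_b - lo->gain_b);
}
```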
As described further below, several statistics may be provided for AWB, including two-dimensional (2D) color histograms and RGB or YCC sums, to provide multiple programmable color ranges. For example, in one embodiment, the statistics logic 742 may provide a set of multiple pixel filters, of which a subset may be selected for AWB processing. In one embodiment, 8 sets of filters, each with different configurable parameters, may be provided, and 3 sets of color range filters may be selected from the set for gathering tile statistics, as well as for gathering statistics for each floating window. By way of example, a first selected filter may be configured to cover the current color temperature to obtain an accurate color estimate, a second selected filter may be configured to cover low color temperature regions, and a third selected filter may be configured to cover high color temperature regions. This particular configuration may enable the AWB algorithm to adjust the current color temperature region as the light source changes. Further, the 2D color histogram may be used to determine global and local illuminants and to determine various pixel filter thresholds for accumulating RGB values. Again, it should be understood that the selection of 3 pixel filters is meant to illustrate just one embodiment. In other embodiments, fewer or more pixel filters may be selected for the AWB statistics.
Furthermore, in addition to the 3 selected pixel filters, one additional pixel filter may be used for auto-exposure (AE), which generally refers to a process of adjusting pixel integration time and gains to control the luminance of the captured image. For example, auto-exposure may control the amount of light from the scene that is captured by the image sensor by setting the integration time. In some embodiments, tiles and floating windows of luminance statistics may be collected via the 3A statistics logic 742 and processed to determine integration and gain control parameters.
Further, auto-focus may refer to determining the optimal focal length of the lens in order to substantially optimize the focus of the image. In some embodiments, floating windows of high frequency statistics may be collected, and the focal length of the lens may be adjusted to bring the image into focus. As discussed further below, in one embodiment, the auto-focus adjustments may utilize coarse and fine adjustments based on one or more metrics, referred to as auto-focus scores (AF scores), to bring the image into focus. Further, in some embodiments, the AF statistics/scores may be determined for different colors, and the relative differences between the AF statistics/scores for each color channel may be used to determine the direction of focus.
Accordingly, these various types of statistics and the like can be determined and collected using the statistics collection block 742. As shown, the output STATS0 of the statistics gathering block 742 of the Sensor0 statistics processing unit 142 may be sent to the memory 108 and routed to the control logic 84, or alternatively, may be sent directly to the control logic 84. Further, it should be understood that the Sensor1 statistics processing unit 144 may also include a 3A statistics collection block providing a similar structure of statistics STATS1, as shown in fig. 10.
As described above, control logic 84 (which may be a dedicated processor in ISP subsystem 32 of device 10) may process the collected statistics to determine one or more control parameters for controlling imaging device 30 and/or image processing circuitry 32. For example, such control parameters may include parameters for operating the lens of the imaging sensor 90 (e.g., focus adjustment parameters), image sensor parameters (e.g., analog and/or digital gain, integration time), and ISP pipeline processing parameters (e.g., digital gain values, Color Correction Matrix (CCM) coefficients). Additionally, as described above, in some embodiments, statistical processing may be performed with 8-bit precision, and thus, raw pixel data with higher bit depth may be downsampled into an 8-bit format for statistics. As described above, downsampling to 8 bits (or any other low bit resolution) may reduce hardware size (e.g., area), also reduce processing complexity, and make the statistics more robust to noise (e.g., by utilizing spatial averaging of the image data).
With the above in mind, FIG. 82 is a block diagram depicting logic for implementing one embodiment of the 3A statistics logic 742. As shown, the 3A statistics logic 742 may receive a signal 793 representing Bayer RGB data which, as shown in fig. 68, may correspond to the output of the inverted BLC logic 741. The 3A statistics logic 742 may process the Bayer RGB data 793 to obtain various statistics 794, which may represent the output STATS0 of the 3A statistics logic 742, as shown in fig. 68, or alternatively the output STATS1 of the statistics logic associated with the Sensor1 statistics processing unit 144.
In the illustrated embodiment, to make the statistics more robust to noise, the logic 795 first averages the incoming Bayer RGB pixels 793. For example, the averaging may be performed over a window of 4 x 4 sensor pixels composed of four 2 x 2 Bayer quads (e.g., 2 x 2 blocks of pixels representing the Bayer pattern), and the averaged red (R), green (G), and blue (B) values in the 4 x 4 window may be calculated and converted to 8 bits, as described above. This process is described in more detail with reference to fig. 83, which shows a 4 x 4 window 796 of pixels formed as four 2 x 2 Bayer quads 797. With this arrangement, each color channel includes a 2 x 2 block of corresponding pixels within the window 796, and the pixels of the same color may be summed and averaged to produce an average color value for each color channel within the window 796. For example, the red pixels 799 may be averaged to obtain an average red value (R_AV) 803, and the blue pixels 800 may be averaged to obtain an average blue value (B_AV) 804 within the window 796. With regard to averaging of the green pixels, several techniques may be utilized, since the Bayer pattern has twice as many green samples as red or blue samples. In one embodiment, the average green value (G_AV) 802 may be obtained by averaging only the Gr pixels 798, only the Gb pixels 801, or all of the Gr and Gb pixels 798 and 801 together. In another embodiment, the Gr and Gb pixels 798 and 801 in each Bayer quad 797 may be averaged, and the green values of the individual Bayer quads 797 may be further averaged together to obtain G_AV 802. It will be appreciated that averaging the pixel values across multiple pixel blocks may reduce noise. Furthermore, it should be appreciated that the use of a 4 x 4 block as the window sample is intended to provide only one example; in other embodiments, any suitable block size may be utilized (e.g., 8 x 8, 16 x 16, 32 x 32, etc.).
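To make the downsampling step concrete, the following is a minimal sketch in Python of averaging one 4 x 4 window of an RGGB Bayer mosaic into 8-bit R, G, and B values; the function name, the array layout, the assumed 10-bit raw depth, and the choice of averaging all Gr and Gb pixels together are illustrative assumptions rather than part of the hardware description above.

import numpy as np

def downsample_bayer_window(raw, top, left):
    """Average one 4x4 window of an RGGB Bayer mosaic into (R_AV, G_AV, B_AV).

    raw        : 2D array of raw sensor values (assumed RGGB pattern)
    top, left  : window origin, assumed aligned to the Bayer quad grid
    """
    win = raw[top:top + 4, left:left + 4].astype(np.uint32)
    r  = win[0::2, 0::2]   # R  samples of each 2x2 quad
    gr = win[0::2, 1::2]   # Gr samples
    gb = win[1::2, 0::2]   # Gb samples
    b  = win[1::2, 1::2]   # B  samples
    r_av = int(r.mean())
    g_av = int((gr.sum() + gb.sum()) / (gr.size + gb.size))  # average Gr and Gb together
    b_av = int(b.mean())
    shift = 2              # convert to 8 bits, assuming 10-bit raw data
    return r_av >> shift, g_av >> shift, b_av >> shift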
The downsampled Bayer RGB values 806 are then input to color space transform logic units 807 and 808. Since some of the 3A statistics may depend on pixels after a color space transform has been applied, the color space transform (CSC) logic 807 and the CSC logic 808 may be configured to transform the downsampled Bayer RGB values 806 into one or more other color spaces. In one embodiment, the CSC logic 807 may provide a non-linear space transformation and the CSC logic 808 may provide a linear space transformation. Thus, the CSC logic units 807 and 808 may transform the raw image data from sensor Bayer RGB to another color space (e.g., sRGBlinear, sRGB, YCbCr, etc.) that is more ideal or suitable for performing white point estimation for white balance.
In this embodiment, the non-linear CSC logic 807 may be configured to perform a 3 x 3 matrix multiplication, followed by a non-linear mapping implemented as a lookup table, followed by another 3 x 3 matrix multiplication with an added offset. This allows the 3A statistics color space transform to replicate, for a given color temperature, the color processing of the RGB processing in the ISP pipeline 82 (e.g., applying white balance gains, applying a color correction matrix, applying RGB gamma adjustments, and performing a color space transform). It may also provide for conversion of the Bayer RGB values to a more color-consistent color space, such as CIELab, or any of the other color spaces discussed above (e.g., YCbCr, a red/blue normalized color space, etc.). Under some conditions, the Lab color space may be more suitable for white balance operations because the chromaticity is more linear with respect to luminance.
As shown in fig. 82, the output pixels of the Bayer RGB downscaled signal 806 are processed by a first 3 x 3 color correction matrix (3A_CCM) 809. In this embodiment, the 3A_CCM 809 may be configured to transform from the camera RGB color space (camRGB) to a linear sRGB calibration space (sRGBlinear). The following equations 19-21 provide a programmable color space transform that may be used in one embodiment:
sRlinear=max(0,min(255,(3A_CCM_00*R+3A_CCM_01*G+3A_CCM_02*B)));(19)
sGlinear=max(0,min(255,(3A_CCM_10*R+3A_CCM_11*G+3A_CCM_12*B)));(20)
sBlinear=max(0,min(255,(3A_CCM_20*R+3A_CCM_21*G+3A_CCM_22*B)));(21)
wherein 3A_CCM_00 - 3A_CCM_22 represent the signed coefficients of the matrix 809. Thus, each of the sRlinear, sGlinear, and sBlinear components of the sRGBlinear color space may be determined as follows: the sum of the red, green, and blue downsampled Bayer RGB values with the corresponding 3A_CCM coefficients applied is first determined, and this value is then clipped to 0 or 255 (the minimum and maximum pixel values for 8-bit pixel data) if it is less than 0 or exceeds 255. The resulting sRGBlinear values are represented in fig. 82 by reference numeral 810 as the output of the 3A_CCM 809. Additionally, the 3A statistics logic 742 may maintain a count of the number of clipped pixels for each of the sRlinear, sGlinear, and sBlinear components, as follows:
3A_CCM_R_clipcount_low: number of clipped sRlinear pixels less than 0
3A_CCM_R_clipcount_high: number of clipped sRlinear pixels greater than 255
3A_CCM_G_clipcount_low: number of clipped sGlinear pixels less than 0
3A_CCM_G_clipcount_high: number of clipped sGlinear pixels greater than 255
3A_CCM_B_clipcount_low: number of clipped sBlinear pixels less than 0
3A_CCM_B_clipcount_high: number of clipped sBlinear pixels greater than 255
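As an illustrative sketch of equations 19-21 and the clip counters listed above (Python; the function signature and the dictionary-based counter interface are assumptions made only for illustration, not the actual hardware or register interface):

def apply_3a_ccm(r, g, b, ccm, clip_counts):
    """Apply the 3x3 3A_CCM to one downsampled Bayer RGB sample (equations 19-21).

    ccm         : 3x3 list of signed coefficients (3A_CCM_00 .. 3A_CCM_22)
    clip_counts : dict tracking the low/high clip counts per output channel
    """
    out = []
    for row, name in zip(ccm, ("R", "G", "B")):
        val = row[0] * r + row[1] * g + row[2] * b
        if val < 0:
            clip_counts[f"3A_CCM_{name}_clipcount_low"] += 1
            val = 0
        elif val > 255:
            clip_counts[f"3A_CCM_{name}_clipcount_high"] += 1
            val = 255
        out.append(int(val))
    return tuple(out)  # (sRlinear, sGlinear, sBlinear)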
Thereafter, the sRGBlinear pixels 810 may be processed using the non-linear lookup table 811, resulting in sRGB pixels 812. The lookup table 811 may contain entries of 8-bit values, with each table entry value representing an output level. In one embodiment, the lookup table 811 may include 65 evenly distributed input entries, where a table index represents an input value in steps of 4. When an input value falls between two entries, the output value is linearly interpolated.
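A minimal sketch of how such a 65-entry lookup with linear interpolation could be evaluated (Python; the function name and rounding behavior are placeholders, assuming the input step size of 4 described above):

def lut_lookup(x, table):
    """Evaluate a 65-entry LUT with input step 4 and linear interpolation.

    x     : 8-bit sRGBlinear input sample (0..255)
    table : list of 65 8-bit output levels at inputs 0, 4, 8, ..., 256
    """
    idx = x >> 2                 # table index (step size 4)
    frac = x & 0x3               # position between adjacent entries
    if idx >= 64:
        return table[64]
    lo, hi = table[idx], table[idx + 1]
    return lo + ((hi - lo) * frac + 2) // 4   # linear interpolation, approximately rounded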
It will be appreciated that the sRGB color space may represent the color space of the final image produced by imaging device 30 (fig. 7) for a given white point, since white balance statistics collection is performed in the color space of the final image produced by the imaging device. In one embodiment, the white point may be determined by matching characteristics of the image scene and one or more reference illuminants according to, for example, a red-to-green and/or blue-to-green ratio. For example, one reference illuminant may be D65, a CIE standard illuminant that simulates daylight conditions. In addition to D65, calibration of the imaging device 30 may also be performed for other different reference illuminants, and the white balance determination process may include determining the current illuminant such that the process (e.g., color balance) may be adjusted for the current illuminant according to the corresponding calibration point. For example, in one embodiment, the imaging device 30 and 3A statistical logic 742 may be calibrated using a Cold White Fluorescent (CWF) reference illuminant, a TL84 reference illuminant (another fluorescent source), and an IncA (or a) reference illuminant that simulates incandescent lighting in addition to D65. In addition, as described above, in the camera calibration for the white balance process, various other illuminants (e.g., H, IncA, D50, D65, D75, etc.) corresponding to different color temperatures may also be used. Thus, by analyzing the image scene and determining which reference illuminant most closely matches the current illumination source, the white point may be determined.
Still referring to the non-linear CSC logic 807, the sRGB pixel output 812 of the lookup table 811 may be further processed with a second 3 x 3 color correction matrix 813 (referred to herein as a 3A _ CSC). In the described embodiment, the 3A CSC matrix 813 is shown as being configured to transform from the sRGB color space to the YCbCr color space, although it may also be configured to transform sRGB values to other color spaces. For example, the following programmable color space transforms (equations 22-27) may be used:
Y=3A_CSC_00*sR+3A_CSC_01*sG+3A_CSC_02*sB+3A_OffsetY; (22)
Y=max(3A_CSC_MIN_Y,min(3A_CSC_MAX_Y,Y)); (23)
C1=3A_CSC_10*sR+3A_CSC_11*sG+3A_CSC_12*sB+3A_OffsetC1; (24)
C1=max(3A_CSC_MIN_C1,min(3A_CSC_MAX_C1,C1)); (25)
C2=3A_CSC_20*sR+3A_CSC_21*sG+3A_CSC_22*sB+3A_OffsetC2; (26)
C2=max(3A_CSC_MIN_C2,min(3A_CSC_MAX_C2,C2)); (27)
wherein 3A _ CSC _00 to 3A _ CSC _22 represent signed coefficients of matrix 813, 3A _ OffsetY, 3A _ OffsetC1, and 3A _ OffsetC2 represent signed offsets, and C1 and C2 represent different colors (here, blue color difference chroma (Cb) and red color difference chroma (Cr)), respectively. It should be understood, however, that C1 and C2 may represent any suitable different chroma colors, not necessarily Cb and Cr colors.
As shown in equations 22-27, in determining each component of YCbCr, the appropriate coefficients of matrix 813 are applied to the sRGB values 812 and the result is added with the corresponding offset (e.g., equations 22, 24, and 26). In essence, this step is a 3 × 1 matrix multiplication step. The result of the matrix multiplication is then clipped between the maximum and minimum values (e.g., equations 23, 25, and 27). The associated minimum and maximum amplitude limits may be programmable and may depend, for example, on the particular imaging or video standard being used (e.g., bt.601 or bt.709).
The 3A statistic logic 742 also keeps a count of the number of respective clipped pixels of the Y, C1 and C2 components, as follows:
3A _ CSC _ Y _ clipcount _ low: number of clipped Y pixels less than 3A _ CSC _ MIN _ Y
3A _ CSC _ Y _ clipcount _ high: number of clipped Y pixels greater than 3A _ CSC _ MAX _ Y
3A _ CSC _ C1_ clipcount _ low: number of clipped C1 pixels less than 3A _ CSC _ MIN _ C1
3A _ CSC _ C1_ clipcount _ high: number of clipped C1 pixels greater than 3A _ CSC _ MAX _ C1
3A _ CSC _ C2_ clipcount _ low: number of clipped C2 pixels less than 3A _ CSC _ MIN _ C2
3A _ CSC _ C2_ clipcount _ high: number of clipped C2 pixels greater than 3A _ CSC _ MAX _ C2
The output pixels of the Bayer RGB downsampled signal 806 may also be provided to the linear color space transform logic 808, which may be configured to implement a camera color space transform. For example, the output pixels 806 of the Bayer RGB downsampling logic 795 may be processed with another 3 x 3 color transform matrix (3A_CSC2) 815 of the CSC logic 808 to transform from sensor RGB (camRGB) to a linearly white-balanced color space (camYC1C2), where C1 and C2 may correspond to Cb and Cr, respectively. In one embodiment, the chroma pixels may be scaled by luminance, which is beneficial in implementing a color filter that has improved color consistency and is robust to color shifts due to luminance changes. An example of how the camera color space transform may be performed using the 3 x 3 matrix 815 is provided below in equations 28-31:
camY=3A_CSC2_00*R+3A_CSC2_01*G+3A_CSC2_02*B+3A_Offset2Y; (28)
camY=max(3A_CSC2_MIN_Y,min(3A_CSC2_MAX_Y,camY)); (29)
camC1=(3A_CSC2_10*R+3A_CSC2_11*G+3A_CSC2_12*B); (30)
camC2=(3A_CSC2_20*R+3A_CSC2_21*G+3A_CSC2_22*B); (31)
where 3A_CSC2_00 - 3A_CSC2_22 represent the signed coefficients of the matrix 815, 3A_Offset2Y represents a signed offset for camY, and camC1 and camC2 represent different colors (here, blue difference chroma (Cb) and red difference chroma (Cr)), respectively. To determine camY, the corresponding coefficients of the matrix 815 are applied to the Bayer RGB values 806, and the result is added to 3A_Offset2Y, as shown in equation 28. The result is then clipped between the maximum and minimum values, as shown in equation 29. As described above, the clipping limits may be programmable.
At this point, camC1 and camC2 pixels of output 816 are signed. As described above, in some embodiments, the chroma pixels may be scaled. For example, one technique to achieve chroma scaling is represented below:
camC1=camC1*ChromaScale*255/(camY?camY:1); (32)
camC2=camC2*ChromaScale*255/(camY?camY:1); (33)
wherein ChromaScale represents a floating point scaling factor between 0 and 8. In equations 32 and 33, the expression (camY ? camY : 1) is intended to avoid a divide-by-zero condition. That is, if camY is equal to 0, the value of camY is set to 1. Further, in one embodiment, ChromaScale may be set to one of two possible values depending on the sign of camC1. For example, as shown below in equation 34, ChromaScale may be set to a first value (ChromaScale0) if camC1 is negative, or to a second value (ChromaScale1) otherwise:
ChromaScale = ChromaScale0, if (camC1 < 0) (34)
ChromaScale = ChromaScale1, otherwise
Thereafter, with the addition of the chroma offset, camC1 and camC2 chroma pixels are clipped, as shown below in equations 35 and 36, to produce corresponding unsigned pixel values:
camC1=max(3A_CSC2_MIN_C1,min(3A_CSC2_MAX_C1,(camC1+3A_Offset2C1))) (35)
camC2=max(3A_CSC2_MIN_C2,min(3A_CSC2_MAX_C2,(camC2+3A_Offset2C2))) (36)
where 3A_CSC2_00 - 3A_CSC2_22 are the signed coefficients of the matrix 815, and 3A_Offset2C1 and 3A_Offset2C2 are signed offsets. Further, the number of clipped pixels is counted for camY, camC1, and camC2, as follows:
3A_CSC2_Y_clipcount_low: number of clipped camY pixels less than 3A_CSC2_MIN_Y
3A_CSC2_Y_clipcount_high: number of clipped camY pixels greater than 3A_CSC2_MAX_Y
3A_CSC2_C1_clipcount_low: number of clipped camC1 pixels less than 3A_CSC2_MIN_C1
3A_CSC2_C1_clipcount_high: number of clipped camC1 pixels greater than 3A_CSC2_MAX_C1
3A_CSC2_C2_clipcount_low: number of clipped camC2 pixels less than 3A_CSC2_MIN_C2
3A_CSC2_C2_clipcount_high: number of clipped camC2 pixels greater than 3A_CSC2_MAX_C2
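The luminance-scaled chroma computation of equations 32-36 can be sketched as follows (Python; treating the ChromaScale0/ChromaScale1 values, offsets, and clipping limits as plain parameters is an assumption about how the programmable registers would be exposed):

def scale_and_clip_chroma(camY, camC1, camC2, chroma_scale0, chroma_scale1,
                          offset_c1, offset_c2, min_c1, max_c1, min_c2, max_c2):
    """Luminance-scale, offset, and clip camC1/camC2 (equations 32-36)."""
    y = camY if camY else 1                                  # (camY ? camY : 1), avoid divide by zero
    scale = chroma_scale0 if camC1 < 0 else chroma_scale1    # equation 34
    c1 = camC1 * scale * 255 / y                             # equation 32
    c2 = camC2 * scale * 255 / y                             # equation 33
    c1 = max(min_c1, min(max_c1, c1 + offset_c1))            # equation 35
    c2 = max(min_c2, min(max_c2, c2 + offset_c2))            # equation 36
    return int(c1), int(c2)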
Thus, in this embodiment, the non-linear and linear color space transform logic 807 and 808 may provide pixel data in various color spaces: sRGBlinear (signal 810), sRGB (signal 812), YCbCr (signal 814), and camYCbCr (signal 816). It should be appreciated that the coefficients of each of the transform matrices 809 (3A_CCM), 813 (3A_CSC), and 815 (3A_CSC2), as well as the values in the lookup table 811, may be independently set and programmed.
Still referring to fig. 82, chroma output pixels from a non-linear color space transform (YCbCr 814) or a camera color space transform (camYCbCr 816) may be used to generate a two-dimensional (2D) color histogram 817. Selection logic 818 and 819, which may be implemented as multiplexers as shown, or with any other suitable logic, may be configured to select between luma and chroma pixels from a non-linear color space transform or a camera color space transform. The selection logic 818 and 819 may operate in response to respective control signals, which may be provided by the main control logic 84 of the image processing circuit 32 (FIG. 7) and may be set via software, in one embodiment.
For the present example, it may be assumed that the selection logic 818 and 819 selects the YC1C2 color space transform (814), where the first component is luminance, and C1 and C2 are the first and second colors (e.g., Cb, Cr). A 2D histogram 817 in the C1-C2 color space is generated for one window. For example, the window may be specified with a column start and width, and a row start and height. In one embodiment, the window position and size may be set as a multiple of 4 pixels, and 32 x 32 bins may be used, for a total of 1024 bins. The bin boundaries may be at fixed intervals, and, in order to allow zooming and panning of the histogram collection in specific areas of the color space, a pixel scaling and offset may be defined.
The upper 5 bits of C1 and C2 (representing a total of 32 values) after shifting and scaling can be used to determine binning. The bin indices of C1 and C2 (here denoted as C1_ index and C2_ index) may be determined as follows:
C1_index=((C1-C1_offset)>>(3-C1_scale)) (37)
C2_index=((C2-C2_offset)>>(3-C2_scale)) (38)
Once the indices are determined, if the bin indices are in the range [0, 31], the corresponding bin of the color histogram is incremented by a Count value (which may have a value between 0 and 3 in one embodiment), as shown below in equation 39. In effect, this allows the color counts to be weighted according to luminance value (e.g., brighter pixels are weighted more heavily, rather than all pixels being weighted equally (e.g., by 1)).
if (C1_index >= 0 && C1_index <= 31 && C2_index >= 0 && C2_index <= 31) (39)
StatsCbCrHist[C2_index & 31][C1_index & 31] += Count;
Where Count is determined based on the selected luminance value (Y in this example). It will be appreciated that the steps represented by equations 37, 38, and 39 may be implemented by the bin update logic block 821. Further, in one embodiment, a plurality of luminance thresholds may be set to define a luminance section. For example, 4 luminance thresholds (Ythd 0-Ythd 3) may define 5 luminance bins, and the Count values of Count 0-4 are defined for each bin. The Count 0-Count 4 may be selected (e.g., with pixel condition logic 820) based on the luminance thresholds as shown below:
if(Y<=Ythd0) (40)
Count=Count0
else if(Y<=Ythd1)
Count=Count1
else if(Y<=Ythd2)
Count=Count2
else if(Y<=Ythd3)
Count=Count3
else
Count=Count4
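A compact sketch of the bin update described by equations 37-40 (Python; the threshold and count lists are illustrative stand-ins for the programmable Ythd0-Ythd3 and Count0-Count4 values):

def update_color_histogram(hist, Y, C1, C2, c1_offset, c1_scale, c2_offset, c2_scale,
                           y_thresholds, counts):
    """Update the 32x32 2D color histogram for one pixel (equations 37-40).

    hist         : 32x32 list of lists of bin totals
    y_thresholds : [Ythd0, Ythd1, Ythd2, Ythd3]
    counts       : [Count0, Count1, Count2, Count3, Count4]
    """
    c1_index = (C1 - c1_offset) >> (3 - c1_scale)   # equation 37
    c2_index = (C2 - c2_offset) >> (3 - c2_scale)   # equation 38
    if not (0 <= c1_index <= 31 and 0 <= c2_index <= 31):
        return
    count = counts[4]                               # luminance-dependent Count (equation 40)
    for i, thd in enumerate(y_thresholds):
        if Y <= thd:
            count = counts[i]
            break
    hist[c2_index & 31][c1_index & 31] += count     # equation 39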
In view of the above, fig. 84 illustrates the color histogram with the scale and offset set to 0 for both C1 and C2. The divisions in the CbCr space represent each of the 32 x 32 bins (1024 bins in total). Fig. 85 provides an example of zooming and panning within the 2D color histogram for additional precision, in which a small rectangular area 822 specifies the location of the 32 x 32 bins.
At the beginning of one frame of image data, the bin value is initialized to 0. For each pixel into the 2D color histogram 817, the bin corresponding to the matching C1C2 value is incremented by a determined Count value (Count 0-Count 4), which may be based on a luminance value, as described above. For each bin within the 2D histogram 817, the total pixel count is reported as part of the collected statistics (e.g., STATS 0). In one embodiment, the total pixel count for each bin may have a resolution of 22 bits, providing an allocation of internal memory equal to 1024 × 22 bits.
Referring back to FIG. 82, the Bayer RGB pixels (signal 806), sRGBlinear pixels (signal 810), sRGB pixels (signal 812), and YC1C2 (e.g., YCbCr) pixels (signal 814) are provided to a set of pixel filters 824a-c, which conditionally accumulate sums of RGB, sRGBlinear, sRGB, YC1C2, or camYC1C2 values based on camYC1C2 or YC1C2 pixel conditions defined for each pixel filter 824. That is, the Y, C1, and C2 values output from either the non-linear color space transform (YC1C2) or the camera color space transform (camYC1C2) are used to conditionally select RGB, sRGBlinear, sRGB, or YC1C2 values for accumulation. While the present embodiment describes the 3A statistics logic 742 as providing 8 pixel filters (PF0-PF7), it should be appreciated that any number of pixel filters may be provided.
FIG. 86 shows a functional logic diagram depicting an embodiment of a pixel filter, specifically PF0 (824a) and PF1 (824b) of FIG. 82. As shown, each pixel filter 824 includes selection logic 825 that receives the Bayer RGB pixels, the sRGBlinear pixels, the sRGB pixels, and either the YC1C2 or camYC1C2 pixels selected by further selection logic 826. For example, the selection logic 825 and 826 may be implemented using multiplexers or any other suitable logic. The selection logic 826 may select either YC1C2 or camYC1C2. The selection may be made in response to a control signal, which may be provided by the main control logic 84 of the image processing circuitry 32 (fig. 7) and/or set by software. The pixel filter 824 may then use logic 827 to evaluate the YC1C2 pixels (e.g., non-linear or camera) selected by the selection logic 826 against a pixel condition. Each pixel filter 824 may use the selection circuit 825 to select one of the Bayer RGB pixels, the sRGBlinear pixels, the sRGB pixels, and either YC1C2 or camYC1C2, depending on the output of the selection circuit 826.
Using the evaluation results, the pixels selected by the selection logic 825 may be accumulated (828). In one embodiment, the pixel condition may be defined using the thresholds C1_min, C1_max, C2_min, and C2_max, as shown in graph 789 of fig. 80. A pixel is included in the statistics if it satisfies the following conditions:
1.C1_min<=C1<=C1_max
2.C2_min<=C2<=C2_max
3.abs((C2_delta*C1)-(C1_delta*C2)+Offset)<distance_max
4.Ymin<=Y<=Ymax
Referring to the graph 829 of FIG. 87, in one embodiment, the point 830 represents the value (C2, C1) corresponding to the current YC1C2 pixel data selected by the logic 826. C1_delta may be determined as the difference between C1_1 and C1_0, and C2_delta may be determined as the difference between C2_1 and C2_0. As shown in fig. 87, the points (C1_0, C2_0) and (C1_1, C2_1) may define the minimum and maximum boundaries of C1 and C2. The Offset may be determined by multiplying C1_delta by the value 832 (C2_intercept) at which the line 831 intersects the C2 axis. Thus, assuming Y, C1, and C2 satisfy the minimum and maximum boundary conditions, the selected pixels (Bayer RGB, sRGBlinear, sRGB, and YC1C2/camYC1C2) are included in the accumulated sum if the distance 833 from the pixel to the line 831 is less than distance_max 834, where distance_max 834 may be the distance multiplied by a normalization factor:
distance_max=distance*sqrt(C1_delta^2+C2_delta^2)
In this embodiment, C1_delta and C2_delta may range from -255 to 255. Thus, distance_max 834 may be represented by 17 bits. The points (C1_0, C2_0) and (C1_1, C2_1), as well as the parameters for determining distance_max (e.g., the normalization factors), may be provided as part of the pixel condition logic 827 in each pixel filter 824. It will be appreciated that the pixel conditions 827 may be configurable/programmable.
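An illustrative sketch of the pixel condition test described above (Python; the parameter names mirror the thresholds in conditions 1-4 but are otherwise hypothetical, and the derivation of Offset from the two line endpoints is one consistent reading of condition 3 rather than a statement of the hardware implementation):

import math

def pixel_passes_condition(Y, C1, C2, p):
    """Evaluate pixel-filter conditions 1-4 for one (Y, C1, C2) sample.

    p is a dict with: C1_min, C1_max, C2_min, C2_max, Ymin, Ymax,
    C1_0, C2_0, C1_1, C2_1 (line endpoints), and 'distance'
    (the programmed maximum distance from the line).
    """
    if not (p["C1_min"] <= C1 <= p["C1_max"]):          # condition 1
        return False
    if not (p["C2_min"] <= C2 <= p["C2_max"]):          # condition 2
        return False
    if not (p["Ymin"] <= Y <= p["Ymax"]):               # condition 4
        return False
    c1_delta = p["C1_1"] - p["C1_0"]
    c2_delta = p["C2_1"] - p["C2_0"]
    offset = c1_delta * p["C2_0"] - c2_delta * p["C1_0"]        # C1_delta * C2_intercept
    distance_max = p["distance"] * math.sqrt(c1_delta ** 2 + c2_delta ** 2)
    return abs(c2_delta * C1 - c1_delta * C2 + offset) < distance_max   # condition 3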
Although the example shown in fig. 87 depicts a pixel condition based on two sets of points (C1_0, C2_0) and (C1_1, C2_1), in other embodiments some pixel filters may define more complex shapes and regions for determining the pixel conditions. For example, fig. 88 shows an embodiment in which the pixel filter 824 may define a five-sided polygon 835 using the points (C1_0, C2_0), (C1_1, C2_1), (C1_2, C2_2), (C1_3, C2_3), and (C1_4, C2_4). Each edge 836a-836e may define a line condition. However, unlike the case shown in fig. 87 (where the pixel may be on either side of the line 831 as long as distance_max is satisfied), the condition here may be that the pixel (C1, C2) must be located on the side of each of the lines 836a-836e such that it is enclosed by the polygon 835. Thus, the pixel (C1, C2) is counted when the intersection of the line conditions is satisfied. For example, in fig. 88, such an intersection is satisfied for the pixel 837a. However, the pixel 837b fails to satisfy the line condition of the line 836d and, therefore, is not included in the statistics when processed by a pixel filter configured in this manner.
In another embodiment, represented in fig. 89, the pixel condition may be determined based on overlapping shapes. For example, fig. 89 shows how the pixel filter 824 may have a pixel condition defined by two overlapping shapes, here rectangles 838a and 838b defined by the points (C1_0, C2_0), (C1_1, C2_1), (C1_2, C2_2), and (C1_3, C2_3), and the points (C1_4, C2_4), (C1_5, C2_5), (C1_6, C2_6), and (C1_7, C2_7), respectively. In this example, a pixel (C1, C2) may satisfy the line conditions defined by such a pixel filter by being enclosed within the region collectively bounded by the shapes 838a and 838b (e.g., by satisfying the line conditions of each line defining the two shapes). For example, in fig. 89, these conditions are satisfied for the pixel 839a. However, the pixel 839b fails to satisfy these conditions (particularly with respect to the line 840a of rectangle 838a and the line 840b of rectangle 838b), and is therefore not counted in the statistics when processed by a pixel filter configured in this manner.
For each pixel filter 824, qualifying pixels are identified based on the pixel condition defined by the logic 827, and for these qualifying pixels the 3A statistics engine 742 may collect the following statistics: a 32-bit sum, either (Rsum, Gsum, Bsum), or (sRlinear_sum, sGlinear_sum, sBlinear_sum), or (sRsum, sGsum, sBsum), or (Ysum, C1sum, C2sum), and a 24-bit pixel count Count, which may represent the number of pixels included in the statistics. In one embodiment, software may use the sums to generate an average within a tile or window.
When the logic 825 of a pixel filter 824 selects camYC1C2 pixels, the color thresholds may be applied to the scaled chroma values. For example, since the chroma intensity at the white point increases with luminance, the use of chroma scaled by luminance in the pixel filter 824 may in some cases provide results with improved consistency. For example, minimum and maximum luminance conditions may allow the filter to ignore dark and/or bright areas. If the pixel satisfies the YC1C2 pixel condition, the RGB, sRGBlinear, sRGB, or YC1C2 values are accumulated. The selection of pixel values by the selection logic 825 may depend on the kind of information needed. For example, for white balance, RGB or sRGBlinear pixels are typically selected. YCC or sRGB pixels are more suitable for detecting specific conditions, such as sky, grass, skin tones, and so forth.
In the present embodiment, 8 sets of pixel conditions may be defined, one associated with each of the pixel filters PF0-PF7 824. Some pixel conditions may be defined to carve out the region of the C1-C2 color space (fig. 80) in which the white point is likely to be located. This may be determined or estimated based on the current illuminant. The current white point may then be determined from the accumulated RGB sums based on the R/G and/or B/G ratios for white balance adjustment. In addition, some pixel conditions may be defined or adapted for scene analysis and classification. For example, some pixel filters 824 and windows/tiles may be used to detect conditions such as blue sky in the top portion of an image frame, or green grass in the bottom portion of an image frame. This information may also be used to adjust the white balance. Further, some pixel conditions may be defined or adapted to detect skin tones. For such filters, segmentation may be used to detect areas of the image frame that contain skin tones. By identifying these areas, the quality of skin tones may be improved, for example, by reducing the amount of noise filtering in the skin tone areas and/or decreasing the quantization applied by video compression in those areas.
The 3A statistics logic 742 may also provide for the collection of luminance data. For example, the luminance value camY resulting from the camera color space transform (camYC1C2) may be used to accumulate luminance sum statistics. In one embodiment, the 3A statistics logic 742 may collect the following luminance information:
Ysum: sum of camY
cond(Ysum): sum of camY that satisfies the condition Ymin <= camY < Ymax
Ycount1: count of pixels where camY < Ymin
Ycount2: count of pixels where camY >= Ymax
Here, Ycount1 may represent the number of underexposed pixels and Ycount2 may represent the number of overexposed pixels. This may be used to determine whether the image is overexposed or underexposed. For example, if the pixels are not saturated, the sum of camY (Ysum) may indicate the average luminance of the scene, which may be used to achieve a target AE exposure. For example, in one embodiment, the average luminance may be determined by dividing Ysum by the number of pixels. Further, by knowing the luminance/AE statistics for the tile statistics and window positions, AE metering may be performed. For example, depending on the image scene, it may be desirable to weight the AE statistics of a center window more heavily than those at the edges of the image, as may be the case for a portrait.
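A small sketch of how these luminance statistics could be accumulated and turned into an average for AE (Python; the data structure and names are purely illustrative and do not describe the hardware register layout):

def accumulate_luma_stats(camY_pixels, y_min, y_max):
    """Accumulate Ysum, cond(Ysum), Ycount1, and Ycount2 over a window of camY values."""
    stats = {"Ysum": 0, "cond_Ysum": 0, "Ycount1": 0, "Ycount2": 0, "pixels": 0}
    for y in camY_pixels:
        stats["Ysum"] += y
        stats["pixels"] += 1
        if y < y_min:
            stats["Ycount1"] += 1          # underexposed
        elif y >= y_max:
            stats["Ycount2"] += 1          # overexposed
        else:
            stats["cond_Ysum"] += y        # Ymin <= camY < Ymax
    return stats

# e.g., average scene luminance for AE:
# avg_luma = stats["Ysum"] / stats["pixels"]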
In the presently illustrated embodiment, the 3A statistics collection logic may be configured to collect statistics in tiles and windows. In the illustrated configuration, one window may be defined for the tile statistics 863. The window may be specified with a column start and width, and a row start and height. In one embodiment, the window position and size may be selected as a multiple of 4 pixels, and statistics are collected within this window in tiles of arbitrary size. For example, all of the tiles in the window may be selected such that they have the same size. The tile size may be set independently for the horizontal and vertical directions, and in one embodiment, a maximum limit on the number of horizontal tiles may be set (e.g., a limit of 128 horizontal tiles). Further, in one embodiment, the minimum tile size may be set to, for example, 8 pixels wide by 4 pixels high. Listed below are some examples of tiling configurations based on different video/imaging modes and standards, in order to obtain a window of 16 x 16 tiles:
VGA 640x480: tile interval 40x30 pixels
HD 1280x720: tile interval 80x45 pixels
HD 1920x1080: tile interval 120x68 pixels
5MP 2592x1944: tile interval 162x122 pixels
8MP 3280x2464: tile interval 205x154 pixels
For the present embodiment, 4 pixel filters may be selected from the 8 available pixel filters 824 (PF0-PF7) for the tile statistics 863. For each tile, the following statistics may be collected:
(Rsum0, Gsum0, Bsum0) or (sRlinear_sum0, sGlinear_sum0, sBlinear_sum0) or
(sRsum0, sGsum0, sBsum0) or (Ysum0, C1sum0, C2sum0), Count0
(Rsum1, Gsum1, Bsum1) or (sRlinear_sum1, sGlinear_sum1, sBlinear_sum1) or
(sRsum1, sGsum1, sBsum1) or (Ysum1, C1sum1, C2sum1), Count1
(Rsum2, Gsum2, Bsum2) or (sRlinear_sum2, sGlinear_sum2, sBlinear_sum2) or
(sRsum2, sGsum2, sBsum2) or (Ysum2, C1sum2, C2sum2), Count2
(Rsum3, Gsum3, Bsum3) or (sRlinear_sum3, sGlinear_sum3, sBlinear_sum3) or
(sRsum3, sGsum3, sBsum3) or (Ysum3, C1sum3, C2sum3), Count3, or
Ysum, cond(Ysum), Ycount1, Ycount2 (from camY)
Among the statistics listed above, Count0-Count3 represent the counts of pixels that satisfy the pixel conditions corresponding to the 4 selected pixel filters. For example, if the pixel filters PF0, PF1, PF5, and PF6 are selected as the 4 pixel filters for a particular tile or window, the expressions above may correspond to the Count values and sums of the pixel data (e.g., Bayer RGB, sRGBlinear, sRGB, YC1C2, or camYC1C2) selected (e.g., by the selection logic 825) for those filters. Additionally, the Count values may be used to normalize the statistics (e.g., by dividing the color sums by the corresponding Count values). As indicated above, depending at least in part on the type of statistics needed, the selected pixel filters 824 may be configured to select one of the Bayer RGB, sRGBlinear, or sRGB pixel data, or the YC1C2 (non-linear or camera color space transform, depending on the selection made by the logic 826) pixel data, and determine color sum statistics for the selected pixel data. Additionally, as described above, the luminance value camY derived from the camera color space transform (camYC1C2) is also collected for the luminance sum information used for auto-exposure (AE) statistics.
Additionally, the 3A statistics logic may also be configured to collect statistics 861 for multiple windows. For example, in one embodiment, up to 8 floating windows may be used, which may be any rectangular area having an integer multiple of 4 pixels in each dimension (e.g., height x width) up to a maximum size corresponding to the size of the image frame. However, the position of the window does not necessarily have to be limited to a multiple of 4 pixels. For example, the windows may overlap each other.
In this embodiment, 4 pixel filters 824 may be selected from the available 8 pixel filters (PF 0-PF 7) for each window. Statistics for each window may be collected in the same manner as described above for the partitions. Thus, for each window, the following statistical information 861 may be collected:
(Rsum0, Gsum0, Bsum0) or (sRlinear_sum0, sGlinear_sum0, sBlinear_sum0) or
(sRsum0, sGsum0, sBsum0) or (Ysum0, C1sum0, C2sum0), Count0
(Rsum1, Gsum1, Bsum1) or (sRlinear_sum1, sGlinear_sum1, sBlinear_sum1) or
(sRsum1, sGsum1, sBsum1) or (Ysum1, C1sum1, C2sum1), Count1
(Rsum2, Gsum2, Bsum2) or (sRlinear_sum2, sGlinear_sum2, sBlinear_sum2) or
(sRsum2, sGsum2, sBsum2) or (Ysum2, C1sum2, C2sum2), Count2
(Rsum3, Gsum3, Bsum3) or (sRlinear_sum3, sGlinear_sum3, sBlinear_sum3) or
(sRsum3, sGsum3, sBsum3) or (Ysum3, C1sum3, C2sum3), Count3, or
Ysum, cond(Ysum), Ycount1, Ycount2 (from camY)
In the statistics listed above, Count0-Count3 represent the counts of pixels satisfying the pixel conditions corresponding to the 4 pixel filters selected for a particular window. From the 8 available pixel filters PF0-PF7, the 4 active pixel filters may be selected independently for each window. In addition, each set of statistics may be collected using either a pixel filter or the camY luminance statistics. In one embodiment, the window statistics collected for AWB and AE may be mapped to one or more registers.
Still referring to fig. 82, the 3A statistics logic 742 may also be configured to obtain luminance row sum statistics 859 for one window using the luminance value camY of the camera color space transform. This information may be used to detect and compensate for flicker. Flicker is produced by the periodic variation of some fluorescent and incandescent light sources, typically caused by the AC power signal. For example, referring to FIG. 90, a graph is shown illustrating how variation of a light source causes flicker. Flicker detection may thus be used to detect the frequency of the AC power used for the light source (e.g., 50 Hz or 60 Hz). Once the frequency is known, flicker may be avoided by setting the integration time of the image sensor to an integer multiple of the flicker period.
To detect flicker, the camera luminance camY is accumulated over each row. Due to the downsampling of the Bayer data, each camY value may correspond to 4 rows of the original raw image data. Control logic and/or firmware may then perform a frequency analysis of the row averages or, more reliably, of the row average differences over successive frames to determine the frequency of the AC power associated with the particular light source. For example, with regard to fig. 90, the integration time of the image sensor may be based on the times t1, t2, t3, and t4 (e.g., such that integration occurs at times corresponding to when a light source exhibiting variation is generally at the same brightness level).
In one embodiment, a luminance row sum window may be specified, and the statistics 859 are reported for the pixels within that window. For example, for 1080p HD video capture, assuming a window 1024 pixels high, 256 luminance row sums are produced (e.g., one sum for every 4 rows due to the downsampling by the logic 795), and each accumulated value may be represented with 18 bits (e.g., 8-bit camY values with up to 1024 samples per row).
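A rough sketch of the row-sum accumulation and the kind of frequency analysis the firmware might perform on it (Python; the window handling and the use of an FFT over the row-average differences are assumptions made only for illustration, not a description of the actual control logic):

import numpy as np

def luma_row_sums(camY, row_start, row_height, col_start, col_width):
    """Accumulate one luminance row sum per camY row inside the specified window."""
    window = camY[row_start:row_start + row_height, col_start:col_start + col_width]
    return window.sum(axis=1)          # one accumulated sum per (downsampled) row

def dominant_flicker_frequency(row_sums_per_frame, row_period_s):
    """Estimate the flicker ripple frequency from row-average differences across frames."""
    diffs = np.diff(np.asarray(row_sums_per_frame, dtype=np.float64), axis=0).mean(axis=0)
    spectrum = np.abs(np.fft.rfft(diffs - diffs.mean()))
    freqs = np.fft.rfftfreq(diffs.size, d=row_period_s)
    return freqs[spectrum.argmax()]    # e.g., ~100 Hz or ~120 Hz ripple for 50/60 Hz mains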
The 3A statistics collection logic 742 of fig. 82 may also collect autofocus (AF) statistics 842 using autofocus statistics logic 841. Fig. 91 is a functional block diagram illustrating one embodiment of the AF statistics logic 841 in more detail. As shown, the AF statistics logic 841 may include a horizontal filter 843 and an edge detector 844 applied to the original Bayer RGB (not downsampled), two 3 x 3 filters 846 applied to a Y component derived from the Bayer pattern (BayerY), and two 3 x 3 filters 847 applied to camY. In general, the horizontal filter 843 provides fine resolution statistics for each color component, the 3 x 3 filters 846 may provide fine resolution statistics on BayerY (Bayer RGB with a 3 x 1 transform (logic 845) applied), and the 3 x 3 filters 847 may provide coarser two-dimensional statistics on camY (since camY is obtained from the downsampled Bayer RGB data, i.e., logic 815). Further, the logic 841 may include logic 852 for decimating the Bayer RGB data (e.g., 2 x 2 averaging, 4 x 4 averaging, etc.), and the decimated Bayer RGB data 853 may be filtered using a 3 x 3 filter 854 to produce a filtered output 855 of the decimated Bayer RGB data. The present embodiment provides statistics for 16 windows. At the raw frame boundaries, edge pixels are replicated for the filters of the AF statistics logic 841. The various components of the AF statistics logic 841 are described in more detail below.
First, the horizontal edge detection process includes applying the horizontal filter 843 to each color component (R, Gr, Gb, B), optionally followed by the edge detector 844 on each color component. Thus, depending on imaging conditions, this configuration allows the AF statistics logic 841 to be set up as a high-pass filter without edge detection (e.g., edge detector disabled) or, alternatively, as a low-pass filter followed by an edge detector (e.g., edge detector enabled). For example, in low lighting conditions the horizontal filter 843 may be more sensitive to noise, so the logic 841 may configure the horizontal filter as a low-pass filter followed by the enabled edge detector 844. As shown, a control signal 848 may enable or disable the edge detector 844. The statistics from the different color channels are used to determine the direction of focus to improve sharpness, since different colors may come into focus at different depths. In particular, the AF statistics logic 841 may provide techniques for implementing autofocus control using a combination of coarse and fine adjustments (e.g., of the focal length of the lens). Embodiments of such techniques are described in more detail below.
In one embodiment, the horizontal filter may be a 7-tap filter and may be defined as follows using equations 41 and 42:
out(i)=(af_horzfilt_coeff[0]*(in(i-3)+in(i+3))+af_horzfilt_coeff[1]*(in(i-2)+in(i+2))+af_horzfilt_coeff[2]*(in(i-1)+in(i+1))+af_horzfilt_coeff[3]*in(i))(41)
out(i)=max(-255,min(255,out(i))) (42)
Here, each coefficient af _ horzfilt _ coeff [0:3] is in the range [ -2, 2], and i represents the input pixel index of R, Gr, Gb or B. The filtered output out (i) may be clipped (equation 42) between a minimum and maximum of-255 and 255, respectively. The filter coefficients may be defined independently for each color component.
An optional edge detector 844 may follow the output of the horizontal filter 843. In one embodiment, the edge detector 844 may be defined as:
edge(i)=abs(-2*out(i-1)+2*out(i+1))+abs(-out(i-2)+out(i+2))(43)
edge(i)=max(0,min(255,edge(i))) (44)
thus, when enabled, the edge detector 844 may output a value based on two pixels on each side of the current input pixel i, as shown in equation 43. The result may be clipped to an 8-bit value between 0 and 255 as shown in equation 44.
Depending on whether an edge is detected, the final output of the pixel filter (e.g., filter 843 and detector 844) may be selected as the output of the horizontal filter 843, or the output of the edge detector 844. For example, as shown in equation 45, if an edge is detected, the output 849 of the edge detector 844 may be edge (i); or if no edge is detected, it may be the absolute value of the horizontal filter output out (i).
edge(i)=(af_horzfilt_edge_detected)?edge(i):abs(out(i))(45)
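A sketch of the horizontal filter and optional edge detector of equations 41-45 (Python; the edge replication at the row boundary, the function names, and the single-row interface are simplifications assumed here for illustration):

def af_horizontal_filter(row, coeff):
    """7-tap horizontal AF filter (equations 41-42) over one color-component row."""
    out = []
    n = len(row)
    for i in range(n):
        def px(k):                      # replicate edge pixels at the row boundary
            return row[min(max(k, 0), n - 1)]
        val = (coeff[0] * (px(i - 3) + px(i + 3)) +
               coeff[1] * (px(i - 2) + px(i + 2)) +
               coeff[2] * (px(i - 1) + px(i + 1)) +
               coeff[3] * px(i))
        out.append(max(-255, min(255, val)))
    return out

def af_edge_output(out, edge_detected):
    """Final pixel-filter output per equations 43-45."""
    n = len(out)
    result = []
    for i in range(n):
        def ox(k):
            return out[min(max(k, 0), n - 1)]
        if edge_detected:
            e = abs(-2 * ox(i - 1) + 2 * ox(i + 1)) + abs(-ox(i - 2) + ox(i + 2))
            result.append(max(0, min(255, e)))     # equations 43-44
        else:
            result.append(abs(out[i]))             # equation 45: |out(i)| when no edge detected
    return result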
For each window, the accumulated value edge_sum[R, Gr, Gb, B] may be selected as either (1) the sum of edge(j, i) over each pixel in the window, or (2) the maximum value of edge(i) across each line in the window, max(edge), summed over the lines in the window. Assuming a raw frame size of 4096 x 4096 pixels, the number of bits required to store the maximum value of edge_sum[R, Gr, Gb, B] is 30 bits (e.g., 8 bits per pixel, plus 22 bits for a window covering the entire raw image frame).
As described above, the 3 x 3 filters 847 for camY luminance include two programmable 3 x 3 filters, F0 and F1, applied to camY. The result of the filter 847 may be passed through either a square function or an absolute value function. The results are accumulated over a given AF window for both 3 x 3 filters F0 and F1, producing luminance edge values. In one embodiment, the luminance edge value at each camY pixel is defined as follows:
edgecamY_FX(j,i) = FX * camY (46)
= FX(0,0)*camY(j-1,i-1) + FX(0,1)*camY(j-1,i) + FX(0,2)*camY(j-1,i+1) +
FX(1,0)*camY(j,i-1) + FX(1,1)*camY(j,i) + FX(1,2)*camY(j,i+1) +
FX(2,0)*camY(j+1,i-1) + FX(2,1)*camY(j+1,i) + FX(2,2)*camY(j+1,i+1)
edgecamY_FX(j,i) = f(max(-255, min(255, edgecamY_FX(j,i)))) (47)
f(a) = a^2 or abs(a)
where FX represents the 3 x 3 programmable filters F0 and F1 with signed coefficients in the range [-4, 4]. The indices j and i represent pixel locations in the camY image. As described above, the filters on camY may provide coarse resolution statistics, since camY is derived using downsampled (e.g., 4 x 4 to 1) Bayer RGB data. For example, in one embodiment, the filters F0 and F1 may be set using the Scharr operator, which provides improved rotational symmetry compared to the Sobel operator, an example of which is shown below:
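The original document shows the example filter coefficients as a figure that is not reproduced here; for reference, the commonly cited 3 x 3 Scharr kernels for horizontal and vertical gradients are sketched below (Python lists; these standard values are an assumption in this context and, as written, exceed the [-4, 4] coefficient range stated above, so a scaled variant would presumably be used in practice):

# Standard Scharr operator kernels (horizontal- and vertical-gradient forms);
# illustrative only -- the actual F0/F1 coefficients are given in the document's figure.
scharr_f0 = [[-3,   0,  3],
             [-10,  0, 10],
             [-3,   0,  3]]
scharr_f1 = [[-3, -10, -3],
             [ 0,   0,  0],
             [ 3,  10,  3]]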
For each window, the accumulated values 850, edgecamY_FX_sum (where FX = F0 and F1), determined with the filters 847, may be selected as either (1) the sum of edgecamY_FX(j, i) over each pixel in the window, or (2) the maximum of edgecamY_FX(j) across each line in the window, summed over the lines in the window. In one embodiment, when f(a) is set to a^2 to provide "peakier" statistics with finer resolution, edgecamY_FX_sum may saturate as a 32-bit value. To avoid saturation, the maximum window size X*Y in raw frame pixels may be set such that it does not exceed a total of 1024 x 1024 pixels (e.g., X*Y <= 1048576 pixels). As mentioned above, f(a) may also be set to the absolute value to provide more linear statistics.
The AF 3 × 3 filters 846 for Bayer Y may be defined in a similar manner to the 3 × 3 filters in camY, but they are applied to luminance values Y generated from Bayer quadruples (2 × 2 pixels). First, the 8-bit Bayer RGB values are converted to Y with programmable coefficients in the range [0, 4] to produce white balance Y values, as shown in equation 48 below:
bayerY=max(0,min(255,bayerY_Coeff[0]*R+bayerY_Coeff[1]*(Gr+Gb)/2+bayerY_Coeff[2]*B)) (48)
similar to filter 847 for camY, the 3 x 3 filter 846 for bayerY luminance may include two programmable 3 x 3 filters F0 and F1 applied to bayerY. The result of the filter 846 becomes a square function or an absolute value function. The results are accumulated within a given AF window for two 3 × 3 filters F0 and F1, resulting in luminance edge values. In one embodiment, the luminance edge value at each bayer pixel is defined as follows:
edgebayerY_FX(j,i) = FX * bayerY (49)
= FX(0,0)*bayerY(j-1,i-1) + FX(0,1)*bayerY(j-1,i) + FX(0,2)*bayerY(j-1,i+1) +
FX(1,0)*bayerY(j,i-1) + FX(1,1)*bayerY(j,i) + FX(1,2)*bayerY(j,i+1) +
FX(2,0)*bayerY(j+1,i-1) + FX(2,1)*bayerY(j+1,i) + FX(2,2)*bayerY(j+1,i+1)
edgebayerY_FX(j,i) = f(max(-255, min(255, edgebayerY_FX(j,i)))) (50)
f(a) = a^2 or abs(a)
where FX represents the 3 x 3 programmable filters F0 and F1 with signed coefficients in the range [-4, 4]. The indices j and i represent pixel locations in the bayerY image. As described above, the filters on bayerY may provide fine resolution statistics, since the Bayer RGB signal received by the AF logic 841 is not decimated. For example, the filters F0 and F1 of the filter logic 846 may be set using one of the following filter structures:
For each window, the accumulated values 851, edgebayerY_FX_sum (where FX = F0 and F1), determined with the filters 846, may be selected as either (1) the sum of edgebayerY_FX(j, i) over each pixel in the window, or (2) the maximum of edgebayerY_FX(j) across each line in the window, summed over the lines in the window. Here, when f(a) is set to a^2, edgebayerY_FX_sum may saturate as a 32-bit value. Thus, to avoid saturation, the maximum window size X*Y in raw frame pixels should be set such that it does not exceed a total of 512 x 512 pixels (e.g., X*Y <= 262144). As described above, setting f(a) to a^2 provides peakier statistics, while setting f(a) to abs(a) provides more linear statistics.
As described above, AF statistics 842 are collected for 16 windows. The windows may be any rectangular area whose dimensions are each a multiple of 4 pixels. Since each filter logic 846 and 847 includes two filters, in some cases one filter may be used for normalization over 4 pixels, and the filters may be configured to filter in both the vertical and horizontal directions. Further, in some embodiments, the AF logic 841 may normalize the AF statistics by brightness. This may be accomplished by setting one or more of the filters of the logic blocks 846 and 847 as bypass filters. In some embodiments, the positions of the windows may be limited to multiples of 4 pixels, and the windows are allowed to overlap. For example, one window may be used to obtain normalization values, while another window may be used for additional statistics, such as variance, as described below. In one embodiment, the AF filters (e.g., 843, 846, 847) may not implement pixel replication at the edges of the image frame; therefore, in order for the AF filters to use all valid pixels, the AF windows may be set such that they are at least 4 pixels from the top edge of the frame, at least 8 pixels from the bottom edge of the frame, and at least 12 pixels from the left/right edges of the frame. In the illustrated embodiment, the following statistics may be collected and reported for each window:
32-bit edgeGr_sum for Gr
32-bit edgeR_sum for R
32-bit edgeB_sum for B
32-bit edgeGb_sum for Gb
32-bit edgebayerY_F0_sum for filter0 (F0) on bayerY
32-bit edgebayerY_F1_sum for filter1 (F1) on bayerY
32-bit edgecamY_F0_sum for filter0 (F0) on camY
32-bit edgecamY_F1_sum for filter1 (F1) on camY
In such an embodiment, the memory required to store the AF statistics 842 may be 16 (windows) x 8 (Gr, R, B, Gb, bayerY_F0, bayerY_F1, camY_F0, camY_F1) x 32 bits.
Thus, in one embodiment, the per-window accumulated value may be selected from: the output of the filter (which may be configured as the default setting), the input pixel, or the square of the input pixel. The selection may be made for each of the 16 AF windows and may apply to all 8 of the AF statistics (listed above) in a given window. This may be used to normalize the AF score between two overlapping windows, one of which is configured to collect the filter output and the other of which is configured to collect the input pixel sum. Additionally, to calculate the pixel variance over two overlapping windows, one window may be configured to collect the sum of the input pixels and the other to collect the sum of the squares of the input pixels, providing a variance that may be calculated as:
Variance = avg(pixel^2) - (avg(pixel))^2
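For instance, the variance calculation from the two overlapping windows could look like the following (Python; purely illustrative of the sums described above):

def window_variance(sum_pixels, sum_pixel_squares, pixel_count):
    """Compute the pixel variance from the two overlapping-window accumulations.

    sum_pixels        : sum of input pixels collected by the first window
    sum_pixel_squares : sum of squared input pixels collected by the second window
    pixel_count       : number of pixels in each window (same geometry assumed)
    """
    avg_pixel = sum_pixels / pixel_count
    avg_pixel_sq = sum_pixel_squares / pixel_count
    return avg_pixel_sq - avg_pixel ** 2     # Variance = avg(pixel^2) - (avg(pixel))^2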
Using the AF statistics, the ISP control logic 84 (fig. 7) may be configured to adjust the focal length of the lens of an imaging device (e.g., 30) using a series of focal length adjustments based on coarse and fine autofocus "scores" to bring the image into focus. As described above, the 3 x 3 filters 847 for camY may provide coarse statistics, while the horizontal filter 843 and edge detector 844 may provide comparatively finer statistics for each color component, and the 3 x 3 filters 846 may provide fine statistics on bayerY. Further, the 3 x 3 filter 854 on the decimated Bayer RGB signal 853 may provide coarse statistics for each color channel. As discussed further below, AF scores may be calculated based on the filter output values for a particular input signal (e.g., the sum of the filter outputs F0 and F1 for camY, bayerY, or the decimated Bayer RGB, or based on the horizontal filter/edge detector outputs, etc.).
Fig. 92 is a graph 856 depicting curves 858 and 860 representing a coarse AF score and a fine AF score, respectively. As shown, the coarse AF score based on coarse statistics has a more linear response within the focal length of the lens. Thus, at any focus position, lens movement may produce a change in the autofocus score, which may be used to detect whether an image is becoming in focus or out of focus. For example, an increase in the coarse AF score after a lens adjustment may indicate that the focal length is being adjusted in the correct direction (e.g., toward the optical focus position).
However, as the optical focus position is approached, the change in the coarse AF score for smaller lens adjustment steps may decrease, making it difficult to discern the correct direction of focus adjustment. For example, as shown on graph 856, the change in the coarse AF score between coarse positions (CP) CP1 and CP2, represented by delta C12, shows an increase in the coarse adjustment from CP1 to CP2. However, as shown, the change delta C34 in the coarse AF score from CP3 to CP4 (which passes through the optimal focus position (OFP)), while still increasing, is relatively small. It should be appreciated that the positions CP1-CP6 along the focal length L are not meant to necessarily correspond to the step sizes taken by the autofocus logic along the focal length. That is, there may be additional steps taken between each coarse position that are not shown. The illustrated positions CP1-CP6 are merely intended to show how the change in the coarse AF score may gradually decrease as the focus position approaches the OFP.
Once the approximate position of the OFP is determined (e.g., between CP3 and CP5 according to the coarse AF score shown in fig. 92), the fine AF score value represented by curve 860 may be evaluated to correct the focus position. For example, when the image is out of focus, the fine AF score is flatter so that a larger change in lens position does not result in a larger change in the fine AF score. However, when the focus position approaches the Optical Focus Position (OFP), the fine AF score changes sharply with a smaller position adjustment. Thus, by locating the peak or vertex 862 on the fine-tune AF score curve 860, the OFP can be determined for the current image scene. Thus, in summary, the coarse AF score can be used to determine a general vicinity of the optical focus position, while the fine AF score can be used to pinpoint a more precise location within that vicinity.
In one embodiment, the autofocus process may begin by obtaining coarse AF scores along the entire available focal length, beginning at position 0 and ending at position L (shown on graph 856), and determining the coarse AF score at various step positions (e.g., CP1-CP6). In one embodiment, once the focus position of the lens has reached position L, the position may be reset to 0 before the AF scores at the various focus positions are evaluated. This may be attributable, for example, to coil settling time of the mechanical elements controlling the focus position. In this embodiment, after resetting to position 0, the focus position may be adjusted toward position L to the position that first exhibits a negative change in the coarse AF score, here position CP5, which exhibits a negative change delta C45 relative to position CP4. From position CP5, the focus position may be adjusted back in the direction toward position 0 in smaller increments relative to the increments used in the coarse AF score adjustments (e.g., positions FP1, FP2, FP3, etc.), while searching for the peak 862 in the fine AF score curve 860. As described above, the focus position OFP corresponding to the peak 862 in the fine AF score curve 860 may be the optimal focus position for the current image scene.
It will be appreciated that the above-described technique of locating the optimal region and optimal position of focus may be referred to as "hill climbing," in the sense that the changes in the AF score curves 858 and 860 are analyzed to locate the OFP. Further, while the analysis of the coarse AF score (curve 858) and the fine AF score (curve 860) is shown using steps of the same size for the coarse score analysis (e.g., the distance between CP1 and CP2) and for the fine score analysis (e.g., the distance between FP1 and FP2), in some embodiments the step size may be varied based on the change in the score from one position to the next. For instance, in one embodiment, the step size between CP3 and CP4 may be reduced relative to the step size between CP1 and CP2 because the overall increment of the coarse AF score (ΔC34) is smaller than the increment from CP1 to CP2 (ΔC12).
A method 864 describing this process is illustrated in fig. 93. Beginning at block 865, a coarse AF score is determined for the image data at various steps along the focal length, from position 0 to position L (fig. 92). Thereafter, at block 866, the coarse AF scores are analyzed, and the coarse position that first exhibits a negative change in the coarse AF score is identified as the starting point for the fine AF score analysis. At block 867, the focus position is then stepped back toward the starting position 0 in smaller steps, and the fine AF score at each step is analyzed until a peak in the AF score curve (e.g., curve 860 of fig. 92) is found. At block 868, the focus position corresponding to the peak is set as the best focus position for the current image scene.
As described above, due to the mechanical coil settling time, the embodiment of the technique shown in fig. 93 may be adapted to initially obtain coarse AF scores along the entire focal length, rather than analyzing each coarse position one by one while searching for the best focus area. Other embodiments, in which coil settling time is less of a concern, may analyze the coarse AF score at each step one by one, rather than sweeping the entire focal length first.
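By way of illustration only, the coarse-to-fine "hill climbing" search described above may be sketched in C as follows. The callback interface (af_score_fn), the step sizes, and the termination conditions are assumptions made for illustration and are not taken from the disclosure; the actual AF logic 841 operates on the coarse and fine AF scores collected by the statistics hardware.

/* Coarse/fine "hill climbing" autofocus search (illustrative sketch).
 * af_score_fn is a hypothetical callback that moves the lens to focus
 * position `pos` and returns the AF score measured for that position. */
typedef double (*af_score_fn)(int pos);

/* Coarse pass: sweep from position 0 toward max_pos and return the first
 * position at which the coarse AF score decreases (e.g., CP5 in FIG. 92). */
static int coarse_search(af_score_fn coarse_score, int max_pos, int step)
{
    double prev = coarse_score(0);
    for (int pos = step; pos <= max_pos; pos += step) {
        double cur = coarse_score(pos);
        if (cur < prev)              /* first negative change in coarse score */
            return pos;
        prev = cur;
    }
    return max_pos;
}

/* Fine pass: from the coarse stopping point, step back toward position 0 in
 * smaller increments and return the position of the fine AF score peak. */
static int fine_search(af_score_fn fine_score, int start_pos, int step)
{
    int best_pos = start_pos;
    double best = fine_score(start_pos);
    for (int pos = start_pos - step; pos >= 0; pos -= step) {
        double cur = fine_score(pos);
        if (cur < best)              /* score falling again: peak was passed */
            break;
        best = cur;
        best_pos = pos;
    }
    return best_pos;                 /* approximate best focus position */
}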
In some embodiments, the AF score may be determined using white-balanced luminance values derived from Bayer RGB data. For example, the luminance value Y may be obtained by decimating a 2×2 Bayer quad by a factor of 2, as shown in fig. 94, or by decimating a 4×4 pixel block composed of four 2×2 Bayer quads by a factor of 4, as shown in fig. 95. In one embodiment, the AF score may be determined using gradients. In another embodiment, the AF score may be determined by applying a 3×3 transform using a Scharr operator, which provides rotational symmetry while minimizing the weighted mean squared angular error in the Fourier domain. For example, the calculation of a coarse AF score for camY using the common Scharr operator is shown below:
where in represents the decimated luminance Y value. In other embodiments, other 3 x 3 transforms may be utilized to calculate AF scores for coarse and fine statistics.
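The 3×3 Scharr transform referenced above is not reproduced in the preceding text; the sketch below therefore assumes the conventional Scharr kernels ([-3 0 3; -10 0 10; -3 0 3] and its transpose) applied to the decimated luminance plane `in`, with the coarse AF score accumulated as the sum of absolute filter responses. The coefficients, normalization, and accumulation actually used by the 3A statistics logic may differ.

#include <stdlib.h>

/* Coarse AF score over a decimated luminance plane `in` (width w, height h).
 * The conventional Scharr 3x3 kernels are assumed here; the transform used
 * by the hardware may employ different coefficients or normalization. */
static long long coarse_af_score(const int *in, int w, int h)
{
    static const int kx[3][3] = { { -3, 0, 3 }, { -10, 0, 10 }, { -3, 0, 3 } };
    static const int ky[3][3] = { { -3, -10, -3 }, { 0, 0, 0 }, { 3, 10, 3 } };
    long long score = 0;

    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            long long gx = 0, gy = 0;
            for (int j = -1; j <= 1; j++) {
                for (int i = -1; i <= 1; i++) {
                    int p = in[(y + j) * w + (x + i)];
                    gx += (long long)kx[j + 1][i + 1] * p;
                    gy += (long long)ky[j + 1][i + 1] * p;
                }
            }
            score += llabs(gx) + llabs(gy);      /* accumulate edge energy */
        }
    }
    return score;
}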
Autofocus adjustments may also be made differently for different color components, because the lens may affect different wavelengths of light differently, which is one reason for applying the horizontal filter 843 independently to each color component. Thus, autofocus may still be performed even when chromatic aberration is present in the lens. For example, because red and blue are typically focused at different positions or distances relative to green when chromatic aberration is present, the relative AF scores of the colors may be used to determine the direction of focus. This is better illustrated in fig. 96, which shows the best focus positions of a lens 870 for the blue, red, and green channels. As shown, the best focus positions for red, green, and blue are denoted by reference numerals R, G, and B, respectively, each corresponding to a different AF score at the current focus position 872. Generally, in such a configuration, it may be desirable to select the best focus position as the position corresponding to the best focus position of the green component, here position G (e.g., because the green component of Bayer RGB has twice as many samples as the red or blue component). Thus, for the best focus position, the green channel may be expected to exhibit the highest autofocus score. Accordingly, based on the positions of the best focus positions for each color relative to the current focus position (those closer to it exhibiting higher AF scores), AF logic 841 and associated control logic 84 can determine the focus direction from the relative AF scores of blue, green, and red. For example, if the blue channel has a higher AF score than the green channel (as shown in fig. 96), then the focus position is adjusted in the negative direction (toward the image sensor) without first having to analyze in the positive direction from the current position 872. In some embodiments, illuminant detection or analysis using correlated color temperatures (CCT) may be performed.
Furthermore, as described above, variance scores may also be used. For example, pixel sum and pixel sum-of-squares values may be accumulated over block sizes (e.g., 8 × 8 to 32 × 32 pixels) and used to derive a variance score (e.g., avg_pixel2 − (avg_pixel)^2). The variances may be summed to obtain a total variance score for each window. Smaller block sizes may be used to obtain finer variance scores, and larger block sizes may be used to obtain coarser variance scores.
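A minimal sketch of the block variance score described above is given below; the block size, data width, and buffer layout are illustrative assumptions.

#include <stdint.h>

/* Variance score for one block: accumulate the pixel sum and the sum of
 * squares, then compute avg_pixel2 - (avg_pixel)^2 as described above.
 * The block size (e.g., 8x8 to 32x32) and stride handling are assumptions. */
static uint64_t block_variance_score(const uint16_t *pix, int stride,
                                     int bx, int by, int bsize)
{
    uint64_t sum = 0, sum_sq = 0;
    uint64_t n = (uint64_t)bsize * bsize;

    for (int y = 0; y < bsize; y++) {
        for (int x = 0; x < bsize; x++) {
            uint64_t p = pix[(by + y) * stride + (bx + x)];
            sum    += p;
            sum_sq += p * p;
        }
    }

    uint64_t avg_pixel  = sum / n;        /* mean pixel value       */
    uint64_t avg_pixel2 = sum_sq / n;     /* mean of squared pixels */
    return avg_pixel2 - avg_pixel * avg_pixel;
}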
Referring to the 3A statistics logic 742 of fig. 82, the logic 742 may also be configured to collect component histograms 874 and 876. As will be appreciated, histograms may be used to analyze the pixel level distribution in an image. This may be useful for implementing certain functions, such as histogram equalization, where the histogram data is used to determine the histogram specification (histogram matching). By way of example, luma histograms may be used for AE (e.g., for adjusting/setting sensor integration times), and color histograms may be used for AWB. In the present embodiment, the histograms may be 256, 128, 64, or 32 bins for each color component (in which the upper 8, 7, 6, and 5 bits of the pixel are used to determine the bin, respectively), as specified by the bin size (BinSize). For instance, when the pixel data is 14 bits, an additional scale factor (between 0 and 6) and an offset may be specified to determine what range of the pixel data (e.g., which 8 bits) is collected for statistics purposes. The bin index may be obtained as follows:
idx = ((pixel - hist_offset) >> (6 - hist_scale))
In one embodiment, the color histogram bin is incremented only when the bin index is in the range [0, 2^ (8-BinSize) ]:
if(idx>=0&&idx<2^(8-BinSize))
StatsHist[idx]+=Count;
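Combining the bin index formula and the conditional update above, a sketch of the per-pixel histogram update might look as follows; the parameter names mirror the fields referenced in the text, and a count increment of 1 per pixel is assumed for illustration.

#include <stdint.h>

/* Per-pixel histogram update.  hist_offset, hist_scale, and bin_size mirror
 * the fields named in the text; the increment (Count) is taken as 1 per
 * pixel here for illustration. */
static void hist_update(uint32_t *stats_hist, int bin_size,
                        int hist_offset, int hist_scale, int pixel)
{
    int idx = (pixel - hist_offset) >> (6 - hist_scale);
    if (idx >= 0 && idx < (256 >> bin_size))   /* 2^(8 - BinSize) bins */
        stats_hist[idx] += 1;                  /* Count                */
}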
In the present embodiment, the statistics processing unit 142 may include two histogram units. The first histogram 874 (Hist0) may be configured to collect pixel data as part of the statistics collection after the 4 × 4 decimation. For Hist0, the components may be selected to be RGB, sRGBlinear, sRGB, or YC1C2 using the selection circuit 880. The second histogram 876 (Hist1) may be configured to collect pixel data before the statistics pipeline (before the defective pixel correction logic 738), as shown in more detail in fig. 97. For example, the raw Bayer RGB data (output from 146) may be decimated by logic 882 (to produce signal 878) by skipping pixels, as described further below. For the green channel, the color may be selected from Gr, Gb, or both Gr and Gb (with both Gr and Gb counts accumulated in the green bins).
To keep the histogram bin width the same between the two histograms, Hist1 may be configured to collect pixel data from every other Bayer quad. The start of the histogram window determines the first Bayer quad position at which the histogram begins accumulating. Starting from this position, every other Bayer quad is skipped in the horizontal and vertical directions for Hist1. The window start position may be any pixel position for Hist1, and thus the pixels skipped by the histogram calculation may be selected by changing the start window position. Hist1 may be used to collect data close to the black level (represented by 884 in fig. 97) to assist in dynamic black level compensation in block 739. Thus, although the histogram 876 is shown in FIG. 97 as being separate from the 3A statistics logic 742 for purposes of illustration, it should be appreciated that the histogram 876 may actually be part of the statistics written to memory, and may actually be physically located within the statistics processing unit 142.
In this embodiment, the red (R) and blue (B) bins may be 20 bits wide, and the green (G) bins 21 bits wide (green is larger to accommodate the accumulation of both Gr and Gb in Hist1). This allows a maximum picture size of 4160 × 3120 pixels (12 MP). The required internal memory size is 3 × 256 × 20(21) bits (3 color components, 256 bins).
As for the storage format, statistics of the AWB/AE window, AF window, 2D color histogram, and component histogram can be mapped to registers to allow early firmware access. In one embodiment, two memory pointers may be used to write the statistics into memory, one for the block statistics 863, one for the chroma row sum 859, followed by all other collected statistics. All statistics are written to an external memory, which may be a DMA memory. The memory address register may be double buffered so that a new location in the memory may be specified every frame.
Before proceeding with a detailed discussion of the ISP pipe logic 82 downstream of the ISP front end logic 80, it should be understood that the arrangement of the various functional logic blocks (e.g., logic blocks 738, 739, 740, 741 and 742) in the statistics processing units 142 and 144 and the various functional logic blocks (e.g., logic blocks 650 and 652) in the ISP front end pixel processing unit 150 is intended to illustrate only one embodiment of the present technology. Indeed, in other embodiments, the logic illustrated herein may be arranged in a different order, or may include other logic to implement other image processing functions not explicitly illustrated herein. Further, it is to be understood that image processing operations such as lens shading correction, defective pixel detection/correction, and black level compensation performed in the statistical information processing units (e.g., 142 and 144) are performed within the statistical information processing units in order to collect statistical data. Thus, the processing operations performed on the image data received by the statistics processing unit are not actually reflected in the image signal 109(FEProcOut) output from the ISP front-end pixel processing logic 150 and forwarded to the ISP pipeline processing logic 82.
Before continuing the description, it should also be noted that given sufficient processing time, and given the similarity of many processing requirements of the various operations described herein, the functional blocks illustrated herein may be reconfigured to perform image processing sequentially rather than in a pipeline. It will be appreciated that this may further reduce the overall hardware implementation cost, but may also increase the bandwidth to external memory (e.g., cache/hold intermediate results/data).
ISP pipeline ("pipe") processing logic
Having described the ISP front-end logic 80 in detail above, the discussion will now shift focus to the ISP pipeline processing logic 82. Generally, the function of the ISP pipe logic 82 is to receive raw image data, provided from the ISP front-end logic 80 or retrieved from the memory 108, and to perform further image processing operations before outputting the image data to the display device 28.
FIG. 98 is a block diagram representation of one embodiment of the ISP pipeline logic 82. As shown, the ISP pipeline logic 82 may include raw processing logic 900, RGB processing logic 902, and YCbCr processing logic 904. Raw processing logic 900 may perform various image processing operations such as defective pixel detection and correction, lens shading correction, demosaicing, as well as apply gains for automatic white balancing and/or set black levels, as described further below. As shown in this embodiment, the input signal 908 to the raw processing logic 900 may be the raw pixel output 109 (signal FEProcOut) from the ISP front-end logic 80, or the raw pixel data 112 from the memory 108, depending on the current configuration of the selection logic 906.
As a result of the demosaicing operation performed within the raw processing logic 900, the image signal output 910 may be in the RGB domain, and may subsequently be forwarded to the RGB processing logic 902. For example, as shown in fig. 98, the RGB processing logic 902 receives a signal 916, which may be either the output signal 910 or an RGB image signal 912 from the memory 108, depending on the current configuration of the selection logic 914. The RGB processing logic 902 may provide various RGB color adjustment operations, including color correction (e.g., using a color correction matrix), the application of color gains for automatic white balancing, and global tone mapping, as described further below. The RGB processing logic 902 may also provide a color space conversion of the RGB image data to the YCbCr (luma/chroma) color space. Thus, the image signal output 918 may be in the YCbCr domain, and may subsequently be forwarded to the YCbCr processing logic 904.
For example, as shown in fig. 98, the YCbCr processing logic 904 receives a signal 924, which may be either the output signal 918 from the RGB processing logic 902 or a YCbCr signal 920 from the memory 108, depending on the current configuration of the selection logic 922. As described in further detail below, the YCbCr processing logic 904 may provide image processing operations in the YCbCr color space, including scaling, chroma suppression, luma sharpening, brightness, contrast, and color (BCC) adjustment, YCbCr gamma mapping, chroma decimation, and so forth. The image signal output 926 of the YCbCr processing logic 904 may be sent to the memory 108, or may be output from the ISP pipeline processing logic 82 as the image signal 114 (FIG. 7). Subsequently, in accordance with the embodiment of the image processing circuitry 32 depicted in FIG. 7, the image signal 114 may be sent to the display device 28 (either directly or via the memory 108) for viewing by a user, or may be further processed using a compression engine (e.g., encoder 118), a CPU/GPU, a graphics engine, or the like. Additionally, in embodiments in which an ISP back-end unit 120 is included in the image processing circuitry 32 (e.g., FIG. 8), the image signal 114 may be sent to the ISP back-end processing logic 120 for additional downstream post-processing.
In accordance with embodiments of the present technique, the ISP pipe logic 82 may support the processing of raw pixel data in 8-bit, 10-bit, 12-bit, or 14-bit formats. For example, in one embodiment, 8-bit, 10-bit, or 12-bit input data may be converted to 14 bits at the input of the raw processing logic 900, and the raw processing and RGB processing operations may be performed with 14-bit precision. In such an embodiment, the 14-bit image data may be downsampled to 10 bits prior to the conversion of the RGB data to the YCbCr color space, so that the YCbCr processing (logic 904) may be performed with 10-bit precision.
To fully illustrate the various functions provided by the ISP pipeline processing logic 82, the raw processing logic 900, the RGB processing logic 902, and the YCbCr processing logic 904 are each described below in that order, beginning with the raw processing logic 900, along with the internal logic for performing the various image processing operations that may be implemented in each. For example, referring now to FIG. 99, a block diagram showing a more detailed view of one embodiment of the raw processing logic 900 is illustrated, in accordance with one embodiment of the present technique. As shown, the raw processing logic 900 includes gain, offset, and clamp (GOC) logic 930, defective pixel detection/correction (DPDC) logic 932, noise reduction logic 934, lens shading correction logic 936, GOC logic 938, and demosaicing logic 940. Furthermore, although the examples discussed below assume the use of a Bayer color filter array with the image sensor 90, it should be understood that other embodiments of the present technique may utilize different types of color filters.
The input signal 908, which may be the raw image signal, is first received by the gain, offset, and clamp (GOC) logic 930. The GOC logic 930 may provide functions similar to those of the BLC logic 739 of the statistics processing unit 142 of the ISP front-end logic 80, discussed above with reference to fig. 68, and may be implemented in a similar manner. For instance, the GOC logic 930 may provide digital gain, offsets, and clamping (clipping) independently for each color component R, B, Gr, and Gb of a Bayer image sensor. In particular, the GOC logic 930 may perform automatic white balancing or set the black level of the raw image data. Further, in some embodiments, the GOC logic 930 may also be used to correct or compensate for an offset between the Gr and Gb color components.
In operation, the input value for the current pixel is first offset by a signed value and multiplied by a gain. This operation may be performed using the formula shown in equation 11 above, wherein X represents the input pixel value for a given color component R, B, Gr, or Gb, O[c] represents a signed 16-bit offset for the current color component c, and G[c] represents a gain value for the color component c. The value of G[c] may be determined in advance during statistics processing (e.g., in the ISP front-end block 80). In one embodiment, the gain G[c] may be a 16-bit unsigned number with 2 integer bits and 14 fraction bits (e.g., a 2.14 floating-point representation), and the gain G[c] may be applied with rounding. By way of example, the gain G[c] may have a range of between 0 to 4X.
The computed pixel value Y of equation 11 (which includes the gain G[c] and the offset O[c]) is then clipped to a minimum and a maximum range in accordance with equation 12. As discussed above, the variables min[c] and max[c] may represent signed 16-bit "clipping values" for the minimum and maximum output values, respectively. In one embodiment, the GOC logic 930 may also be configured to maintain, for each color component, a count of the number of pixels that were clipped above and below the maximum and minimum ranges, respectively.
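A sketch of the gain, offset, and clamp step of equations 11 and 12 is shown below, assuming a 2.14 fixed-point gain applied with rounding and signed 16-bit offset and clip values as described; the structure layout and counter handling are illustrative.

#include <stdint.h>

/* Gain, offset, and clamp for one pixel of color component c, per equations
 * 11 and 12: Y = (X + O[c]) x G[c], then clip to [min[c], max[c]].  The gain
 * is assumed to be an unsigned 2.14 fixed-point value (range 0-4X) applied
 * with rounding; an arithmetic right shift is assumed for negative values. */
typedef struct {
    int16_t  offset;        /* O[c], signed 16-bit         */
    uint16_t gain;          /* G[c], 2.14 fixed point      */
    int16_t  min;           /* minimum clip value          */
    int16_t  max;           /* maximum clip value          */
    uint32_t clipped_low;   /* pixels clipped below min[c] */
    uint32_t clipped_high;  /* pixels clipped above max[c] */
} goc_params_t;

static int16_t goc_apply(goc_params_t *p, int32_t x)
{
    int32_t y = ((x + p->offset) * (int32_t)p->gain + (1 << 13)) >> 14;

    if (y < p->min) { p->clipped_low++;  y = p->min; }
    if (y > p->max) { p->clipped_high++; y = p->max; }
    return (int16_t)y;
}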
The output of the GOC logic 930 is then forwarded to the defective pixel detection and correction logic 932. As discussed above with reference to fig. 68 (DPDC logic 738), defective pixels may be attributable to a number of factors, and may include "hot" (or leaky) pixels, "stuck" pixels, and "dead" pixels, wherein a hot pixel exhibits higher than normal charge leakage relative to non-defective pixels and thus may appear brighter than non-defective pixels, a stuck pixel appears as always being on (e.g., fully charged) and thus appears brighter, and a dead pixel appears as always being off. Accordingly, it is desirable to have a pixel detection scheme that is robust enough to identify and address these different types of failure scenarios. Particularly, when compared to the front-end DPDC logic 738, which provides only dynamic defect detection/correction, the pipeline DPDC logic 932 may provide fixed or static defect detection/correction, dynamic defect detection/correction, as well as speckle (spot) removal.
According to embodiments of the presently disclosed inventive technique, defective pixel correction/detection performed by the DPDC logic 932 may be performed independently for each color component (e.g., R, B, Gr, and Gb), and may include various operations for detecting defective pixels and correcting the detected defective pixels. For example, in one embodiment, the defective pixel detection operation may detect static defects, dynamic defects, and detect speckle (which may refer to electrical interference or noise (e.g., photonic noise) that may be present in the imaging sensor). By analogy, speckle can appear on an image as a random-looking noise artifact, in a manner similar to that of electrostatic noise appearing on a display, such as a television display. Further, as described above, dynamic defect correction is considered dynamic in the sense that the characterization of a pixel that is considered defective at a given time may depend on the image data in neighboring pixels. For example, if the location of a bright spot is in an area of the current image that is dominated by bright white, then a bright spot that is always at maximum brightness may not be considered a defective pixel. Conversely, if the bright spot is in an area of the current image that is dominated by black or darker colors, then during processing by DPDC logic 932, the bright spot may be identified as a defective pixel and thus corrected accordingly.
In the case of static defect detection, the location of each pixel is compared with a static defect table, which may store data corresponding to the locations of pixels that are known to be defective. For example, in one embodiment, the DPDC logic 932 may monitor the detection of defective pixels (e.g., using a counter mechanism or register) and, if a particular pixel is observed to fail repeatedly, the location of that pixel is stored in the static defect table. Thus, during static defect detection, if the location of the current pixel is determined to be in the static defect table, the current pixel is identified as a defective pixel, and a replacement value is determined and temporarily stored. In one embodiment, the replacement value may be the value of the previous pixel (based on scan order) of the same color component. The replacement value may be used to correct the static defect during dynamic/speckle defect detection and correction, as discussed below. Additionally, if the previous pixel is outside of the original frame 310 (FIG. 23), then its value is not used, and the static defect may be corrected during the dynamic defect correction process. Further, due to memory considerations, the static defect table may store a finite number of location entries. For instance, in one embodiment, the static defect table may be implemented as a FIFO queue configured to store a total of 16 locations for every two lines of image data. The locations defined in the static defect table will, nevertheless, be corrected using a previous-pixel replacement value (rather than via the dynamic defect detection process discussed below). As mentioned above, embodiments of the present technique may also provide for updating the static defect table intermittently over time.
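A simplified sketch of the static defect lookup and replacement described above is given below; the flat-array table and helper names are illustrative, and the hardware may instead implement the table as the FIFO noted above.

#include <stdbool.h>
#include <stdint.h>

/* Static defect table entry: the location of a pixel known to be defective. */
typedef struct { uint16_t x, y; } defect_loc_t;

/* Return true if (x, y) is listed in the static defect table.  The table is
 * modeled here as a flat array; as noted above, the hardware may implement
 * it as a FIFO holding a limited number of locations per pair of lines. */
static bool is_static_defect(const defect_loc_t *table, int n, int x, int y)
{
    for (int i = 0; i < n; i++) {
        if (table[i].x == x && table[i].y == y)
            return true;
    }
    return false;
}

/* Replacement value for a static defect: the previous pixel of the same
 * color component in scan order (two columns to the left in a Bayer CFA).
 * Returns false when that pixel lies outside the frame; in that case the
 * defect is left to the dynamic defect correction path, as described above. */
static bool static_defect_replacement(const uint16_t *row, int x,
                                      uint16_t *replacement)
{
    if (x < 2)
        return false;
    *replacement = row[x - 2];
    return true;
}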
Embodiments may provide a static defect table implemented in on-chip memory or off-chip memory. It will be appreciated that using an on-chip implementation may increase the overall chip area/size, while using an off-chip implementation may decrease the chip area/size, but increase memory bandwidth requirements. Thus, it should be understood that the static defect table may be implemented on-chip or off-chip depending on the specific implementation requirements, i.e., the total number of pixels to be stored in the static defect table.
The dynamic defect and blob detection process may be time-shifted relative to the static defect detection process discussed above. For example, in one embodiment, the dynamic defect and blob detection process may begin after the static defect detection process has analyzed the pixels of two scan lines (e.g., rows). It will be appreciated that this allows static defects to be identified and corresponding replacement values determined for static defects before dynamic/blob detection is performed. For example, during the dynamic/blob detection process, if the current pixel was previously marked as a static defect, then the static defect is only corrected using the previously evaluated replacement values, rather than applying the dynamic/blob detection operation.
In the case of dynamic defect and blob detection, these processes may be performed sequentially or in parallel. Dynamic defect and blob detection and correction by DPDC logic 932 may rely on adaptive edge detection with pixel-to-pixel directional gradients. In one embodiment, DPDC logic 932 may choose to use the 8 immediate neighboring pixels of the current pixel having the same color component within the original frame 310 (fig. 23). In other words, the current pixel and its 8 immediate neighbors, P0, P1, P2, P3, P4, P5, P6, and P7, may constitute a 3 × 3 region, as shown below in fig. 100.
It should be noted, however, that depending on the position of the current pixel P, pixels outside the original frame 310 are not considered when calculating the pixel-to-pixel gradients. For example, in the "top-left" case 942 shown in fig. 100, the current pixel P is at the top-left corner of the original frame 310; thus, the neighboring pixels P0, P1, P2, P3, and P5 outside of the original frame 310 are not considered, leaving only the pixels P4, P6, and P7 (N = 3). In the "top" case 944, the current pixel P is at the top-most edge of the original frame 310; thus, the neighboring pixels P0, P1, and P2 outside of the original frame 310 are not considered, leaving only the pixels P3, P4, P5, P6, and P7 (N = 5). Next, in the "top-right" case 946, the current pixel P is at the top-right corner of the original frame 310; thus, the neighboring pixels P0, P1, P2, P4, and P7 outside of the original frame 310 are not considered, leaving only the pixels P3, P5, and P6 (N = 3). In the "left" case 948, the current pixel P is at the left-most edge of the original frame 310; thus, the neighboring pixels P0, P3, and P5 outside of the original frame 310 are not considered, leaving only the pixels P1, P2, P4, P6, and P7 (N = 5).
In the "center" case 950, all of the pixels P0-P7 lie within the original frame 310 and are thus used in determining the pixel-to-pixel gradients (N = 8). In the "right" case 952, the current pixel P is at the right-most edge of the original frame 310; thus, the neighboring pixels P2, P4, and P7 outside of the original frame 310 are not considered, leaving only the pixels P0, P1, P3, P5, and P6 (N = 5). Additionally, in the "bottom-left" case 954, the current pixel P is at the bottom-left corner of the original frame 310; thus, the neighboring pixels P0, P3, P5, P6, and P7 outside of the original frame 310 are not considered, leaving only the pixels P1, P2, and P4 (N = 3). In the "bottom" case 956, the current pixel P is at the bottom-most edge of the original frame 310; thus, the neighboring pixels P5, P6, and P7 outside of the original frame 310 are not considered, leaving only the pixels P0, P1, P2, P3, and P4 (N = 5). Finally, in the "bottom-right" case 958, the current pixel P is at the bottom-right corner of the original frame 310; thus, the neighboring pixels P2, P4, P5, P6, and P7 outside of the original frame 310 are not considered, leaving only the pixels P0, P1, and P3 (N = 3).
Thus, depending upon the position of the current pixel P, the number of pixels used in determining the pixel-to-pixel gradients may be 3, 5, or 8. In the illustrated embodiment, for each neighboring pixel (k = 0 to 7) within the picture boundary (e.g., the original frame 310), the pixel-to-pixel gradients may be calculated as follows:

Gk = abs(P − Pk), for 0 ≤ k ≤ 7 (only for k within the original frame)    (51)
In addition, the average gradient Gav may be calculated as the difference between the current pixel and the average Pav of its surrounding pixels, as shown by the following equations:

Pav = (Σ Pk) / N, wherein N = 3, 5, or 8 (depending on the pixel position)    (52a)

Gav = abs(P − Pav)    (52b)
The pixel-to-pixel gradient values (equation 51) may be used to determine a dynamic defect condition and the average values of neighboring pixels (equations 52a and 52b) may be used to identify a blob condition, as described further below.
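A sketch of equations 51, 52a, and 52b is given below; the neighbor-validity mask stands in for the edge and corner cases of FIG. 100, and the helper names are illustrative.

#include <stdlib.h>

/* Pixel-to-pixel gradients (eq. 51) and average gradient (eq. 52a/52b) for a
 * current pixel p and its up-to-8 same-color neighbors pk[].  valid[k] marks
 * neighbors inside the original frame (N = 3, 5, or 8 per FIG. 100); only
 * gk[] entries for valid neighbors are written. */
static int compute_gradients(int p, const int pk[8], const int valid[8],
                             int gk[8], int *g_av)
{
    long sum = 0;
    int n = 0;

    for (int k = 0; k < 8; k++) {
        if (!valid[k])
            continue;
        gk[k] = abs(p - pk[k]);      /* Gk = abs(P - Pk)                 (51) */
        sum += pk[k];
        n++;
    }
    /* N is at least 3 for any pixel inside the original frame (FIG. 100). */
    int p_av = (int)(sum / n);       /* Pav: average of in-frame neighbors (52a) */
    *g_av = abs(p - p_av);           /* Gav = abs(P - Pav)                 (52b) */
    return n;
}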
In one embodiment, the DPDC logic 932 may perform dynamic defect detection as follows. First, it is assumed that a pixel is defective if a certain number of its gradients Gk are at or below a particular threshold, denoted by the variable dynTh (the dynamic defect threshold). Thus, for each pixel, a count (C) of the number of gradients of neighboring pixels inside the picture boundary that are at or below the threshold dynTh is accumulated. The threshold dynTh may be a combination of a fixed threshold component and a dynamic threshold component that may depend on the "activity" presented by the surrounding pixels. For instance, in one embodiment, the dynamic threshold component of dynTh may be determined by calculating a high-frequency component value Phf based upon summing the absolute differences between the average pixel value Pav (equation 52a) and each neighboring pixel, as follows:

Phf = Σ abs(Pav − Pk), wherein N = 3, 5, or 8    (52c)
In the case where the pixel is located at an image corner (N = 3) or at an image edge (N = 5), Phf may be multiplied by 8/3 or 8/5, respectively. It can be appreciated that this ensures that the high-frequency component Phf is normalized based on 8 neighboring pixels (N = 8).
Once Phf is determined, the dynamic defect detection threshold dynTh may be computed as shown below:

dynTh = dynTh1 + (dynTh2 × Phf),    (53)

wherein dynTh1 represents the fixed threshold component, and wherein dynTh2 represents the dynamic threshold component and is a multiplier for Phf in equation 53. A different fixed threshold component dynTh1 may be provided for each color component, but for each pixel of the same color, dynTh1 is the same. By way of example only, dynTh1 may be set so that it is at least above the variance of the noise in the image.
The dynamic threshold component dynTh2 may be determined based on some characteristic of the image. For instance, in one embodiment, dynTh2 may be determined using stored empirical data regarding exposure and/or sensor integration time. The empirical data may be determined during calibration of the image sensor (e.g., 90), and may associate dynamic threshold component values that may be selected for dynTh2 with each of a number of data points. Thus, based upon the current exposure and/or sensor integration time value, which may be determined during statistics processing in the ISP front-end logic 80, dynTh2 may be determined by selecting the dynamic threshold component value from the stored empirical data that corresponds to the current exposure and/or sensor integration time value. Additionally, if the current exposure and/or sensor integration time value does not correspond directly to one of the empirical data points, then dynTh2 may be determined by interpolating the dynamic threshold component values associated with the data points between which the current exposure and/or sensor integration time value falls. Further, like the fixed threshold component dynTh1, the dynamic threshold component dynTh2 may have different values for each color component. Thus, the composite threshold dynTh may vary for each color component (e.g., R, B, Gr, Gb).
As described above, for each pixel, a count C of the number of gradients of neighboring pixels inside the picture boundary that are at or below the threshold dynTh is determined. For instance, for each neighboring pixel within the original frame 310, the accumulated count C of the gradients Gk that are at or below the threshold dynTh may be computed as follows:

C = Σ (Gk ≤ dynTh), for 0 ≤ k ≤ 7 (only for k within the original frame)    (54)
Subsequently, if it is determined that the accumulated count C is less than or equal to a maximum count, denoted by the variable dynMaxC, then the pixel may be considered a dynamic defect. In one embodiment, different values of dynMaxC may be provided for the cases where N = 3 (corner), N = 5 (edge), and N = 8. This logic is expressed as follows:
if (C ≦ dynMaxC), then the current pixel P is defective (55)
As described above, the locations of defective pixels may be stored in the static defect table. In some embodiments, the minimum gradient value (min(Gk)) calculated during dynamic defect detection for the current pixel may be stored and used to sort the defective pixels, such that a greater minimum gradient value indicates a more "severe" defect that should be corrected during pixel correction before less severe defects are corrected. In one embodiment, a pixel may need to be processed over multiple imaging frames before being stored in the static defect table, such as by filtering the locations of defective pixels over time. In the latter embodiment, the location of a defective pixel is stored in the static defect table only if the defect appears at the same location in a particular number of consecutive images. Further, in some embodiments, the static defect table may be configured to sort the stored defective pixel locations based upon the minimum gradient values. For instance, the highest minimum gradient value may indicate a defect of greater "severity." By ordering the locations in this manner, the priority of static defect correction may be set, such that the most severe or important defects are corrected first. Additionally, the static defect table may be updated over time to include newly detected static defects, and to order them accordingly based on their respective minimum gradient values.
Speckle detection, which may occur in parallel with the dynamic defect detection process described above, may be performed by determining whether the value Gav (equation 52b) is above a speckle detection threshold spkTh. Like the dynamic defect threshold dynTh, the speckle threshold spkTh may also include fixed and dynamic components, referred to as spkTh1 and spkTh2, respectively. In general, the fixed component spkTh1 and the dynamic component spkTh2 may be set more "aggressively" compared to the dynTh1 and dynTh2 values, in order to avoid falsely detecting speckle in areas of the image that may be more heavily textured, such as text, foliage, certain fabric patterns, and so forth. Accordingly, in one embodiment, the dynamic speckle threshold component spkTh2 may be increased for highly textured areas of the image, and decreased for "flatter" or more uniform areas. The speckle detection threshold spkTh may be computed as shown below:
spkTh = spkTh1 + (spkTh2 × Phf),    (56)
wherein spkTh1 represents the fixed threshold component, and wherein spkTh2 represents the dynamic threshold component. The detection of speckle may then be determined in accordance with the following expression:

if (Gav > spkTh), then the current pixel P is speckled.    (57)
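The threshold tests of equations 53-57 may be sketched as follows. The 8/N normalization of Phf follows the text; the fixed-point scaling of the dynamic threshold components (the right shift by 8) is an assumption made purely for illustration.

#include <stdbool.h>
#include <stdlib.h>

/* Dynamic defect and speckle tests (eq. 53-57).  gk[], valid[], p_av, g_av,
 * and n are as computed above; dyn_th1/dyn_th2, spk_th1/spk_th2, and
 * dyn_max_c correspond to the fixed/dynamic threshold components and the
 * maximum count described in the text.  The right shift by 8 models the
 * dynamic components as 0.8 fixed-point multipliers, an illustrative
 * assumption only. */
static void detect_defects(const int pk[8], const int valid[8],
                           const int gk[8], int n, int p_av, int g_av,
                           int dyn_th1, int dyn_th2, int dyn_max_c,
                           int spk_th1, int spk_th2,
                           bool *is_dynamic, bool *is_speckle)
{
    /* High-frequency component Phf: sum of |Pav - Pk| over in-frame
     * neighbors, normalized to 8 neighbors for edge/corner pixels. */
    long phf = 0;
    for (int k = 0; k < 8; k++) {
        if (valid[k])
            phf += labs((long)p_av - pk[k]);
    }
    phf = (phf * 8) / n;                               /* x 8/3 or 8/5 as needed */

    long dyn_th = dyn_th1 + ((dyn_th2 * phf) >> 8);    /* eq. 53 */
    long spk_th = spk_th1 + ((spk_th2 * phf) >> 8);    /* eq. 56 */

    int c = 0;                                         /* count of Gk <= dynTh   */
    for (int k = 0; k < 8; k++) {
        if (valid[k] && gk[k] <= dyn_th)
            c++;                                       /* eq. 54                 */
    }

    *is_dynamic = (c <= dyn_max_c);                    /* eq. 55 */
    *is_speckle = (g_av > spk_th);                     /* eq. 57 */
}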
Once defective pixels have been identified, the DPDC logic 932 may apply pixel correction operations depending on the type of defect detected. For instance, if the defective pixel was identified as a static defect, the pixel is replaced with the stored replacement value, as discussed above (e.g., the value of the previous pixel of the same color component). If the pixel was identified as either a dynamic defect or as speckle, then pixel correction may be performed as follows. First, gradients are computed as the sum of the absolute differences (e.g., the computation of Gk in equation 51) between the center pixel and the first and second neighboring pixels for four directions: a horizontal (h) direction, a vertical (v) direction, a diagonal-positive direction (dp), and a diagonal-negative direction (dn), as shown below:
Gh=G3+G4 (58)
Gv=G1+G6 (59)
Gdp=G2+G5 (60)
Gdn=G0+G7 (61)
Subsequently, a corrective pixel value PC may be determined via linear interpolation of the two neighboring pixels associated with whichever of the directional gradients Gh, Gv, Gdp, and Gdn has the smallest value. For instance, in one embodiment, the logic statement below may express the calculation of PC:

if (min == Gh)    PC = (P3 + P4) / 2;    (62)
else if (min == Gv)    PC = (P1 + P6) / 2;
else if (min == Gdp)    PC = (P2 + P5) / 2;
else if (min == Gdn)    PC = (P0 + P7) / 2;
the pixel correction technique implemented by DPDC logic 932 also provides exceptions under boundary conditions. For example, if one of the two neighboring pixels associated with the selected interpolation direction is outside the original frame, the values of the neighboring pixels within the original frame are replaced instead. Thus, with this technique, the corrected pixel value is equal to the value of the neighboring pixel within the original frame.
It should be noted that the defective pixel detection/correction techniques applied by the DPDC logic 932 during ISP pipe processing are more robust than the DPDC logic 738 in the ISP front-end logic 80. As discussed in the embodiments above, the DPDC logic 738 performs only dynamic defect detection and correction using neighboring pixels in the horizontal direction only, whereas the DPDC logic 932 provides for the detection and correction of static defects, dynamic defects, and speckle, using neighboring pixels in both the horizontal and vertical directions.
As will be appreciated, storing the locations of defective pixels using a static defect table may provide for temporal filtering of defective pixels with lower memory requirements. For instance, in contrast to many conventional techniques that store entire images and apply temporal filtering to identify static defects over time, embodiments of the present technique store only the locations of defective pixels, which may typically be done using only a fraction of the memory required to store an entire image frame. Further, as discussed above, the storing of a minimum gradient value (min(Gk)) allows for an efficient use of the static defect table by prioritizing the order in which the locations of defective pixels are corrected (e.g., beginning with those that will be most visible).
Additionally, the use of thresholds that include a dynamic component (e.g., dynTh2 and spkTh2) may help to reduce false defect detections, a problem often encountered in conventional image processing systems when processing highly textured areas of an image (e.g., text, foliage, certain fabric patterns, etc.). Further, the use of directional gradients (e.g., h, v, dp, dn) for pixel correction may reduce the appearance of visual artifacts if a false defect detection occurs. For instance, even in cases of false detection, filtering in the direction of the minimum gradient may result in a correction that still yields acceptable results in most cases. Additionally, the inclusion of the current pixel P in the gradient calculation may improve the accuracy of the gradient detection, particularly in the presence of noise.
The above-described defective pixel detection and correction techniques implemented by the DPDC logic 932 may be summarized by the series of flow charts provided in FIGS. 101-103. For instance, referring first to FIG. 101, a process 960 for detecting static defects is illustrated. Beginning initially at step 962, an input pixel P is received at a first time T0. Next, at step 964, the location of the pixel P is compared with the values stored in the static defect table. Decision logic 966 determines whether the location of the pixel P is found in the static defect table. If the location of P is in the static defect table, then the process 960 continues to step 968, wherein the pixel P is marked as a static defect and a replacement value is determined. As discussed above, the replacement value may be determined based upon the value of the previous pixel (in scan order) of the same color component. The process 960 then continues to step 970, at which the process 960 proceeds to the dynamic defect and speckle detection process 980 illustrated in FIG. 102. Additionally, if at decision logic 966 the location of the pixel P is determined not to be in the static defect table, then the process 960 proceeds to step 970 without performing step 968.
Continuing to FIG. 102, an input pixel P is received at time T1, as shown at step 982, for processing to determine whether a dynamic defect or speckle is present. The time T1 may represent a time shift with respect to the static defect detection process 960 of FIG. 101. As discussed above, the dynamic defect and speckle detection process may begin after the static defect detection process has analyzed two scan lines (e.g., rows) of pixels, thus allowing time for the identification of static defects and the determination of their respective replacement values before dynamic/speckle detection occurs.
Decision logic 984 determines whether the input pixel P was previously marked as a static defect (e.g., by step 968 of process 960). If P is marked as a static defect, then the process 980 may continue to the pixel correction process shown in FIG. 103 and may bypass the rest of the steps shown in FIG. 102. If decision logic 984 determines that the input pixel P is not a static defect, then the process continues to step 986, and neighboring pixels that may be used in the dynamic defect and speckle process are identified. For instance, in accordance with the embodiment discussed above and illustrated in FIG. 100, the neighboring pixels may include the 8 immediate neighbors of the pixel P (e.g., P0-P7), thus forming a 3×3 pixel area. Next, at step 988, pixel-to-pixel gradients are calculated with respect to each neighboring pixel within the original frame 310, as described in equation 51 above. Additionally, the average gradient (Gav) may be calculated as the difference between the current pixel and the average of its surrounding pixels, as shown in equations 52a and 52b.
The process 980 then branches to the dynamic defect detection step 990 and to the decision logic 998 for speckle detection. As noted above, dynamic defect detection and speckle detection may, in some embodiments, occur in parallel. At step 990, a count C of the number of gradients that are less than or equal to the threshold dynTh is determined. As described above, the threshold dynTh may include fixed and dynamic components and, in one embodiment, may be determined in accordance with equation 53 above. If C is less than or equal to a maximum count dynMaxC (decision logic 992), then the process 980 continues to step 996, and the current pixel is marked as being a dynamic defect. Thereafter, the process 980 may continue to the pixel correction process shown in FIG. 103, which is discussed below.
Returning to the branch after step 988 for speckle detection, decision logic 998 determines whether the average gradient Gav is greater than the speckle detection threshold spkTh, which may also include fixed and dynamic components. If Gav is greater than the threshold spkTh, then the pixel P is marked as containing speckle at step 1000, and thereafter the process 980 continues to FIG. 103 for the correction of the speckled pixel. Further, if the output of both of the decision logic blocks 992 and 998 is "no," this indicates that the pixel P does not contain a dynamic defect, speckle, or even a static defect (decision logic 984). Thus, when the outputs of decision logic 992 and 998 are both "no," the process 980 may conclude at step 994, whereby the pixel P is passed unchanged, since no defects (e.g., static defects, dynamic defects, or speckle) were detected.
Continuing to FIG. 103, a pixel correction process 1010 in accordance with the techniques described above is provided. At step 1012, an input pixel P is received from process 980 of FIG. 102. It should be noted that the pixel P may be received by process 1010 from decision logic 984 (static defect), or from steps 996 (dynamic defect) and 1000 (speckle). Decision logic 1014 then determines whether the pixel P is marked as a static defect. If the pixel P is a static defect, then the process 1010 continues and ends at step 1016, whereby the static defect is corrected using the replacement value determined at step 968 (FIG. 101).
If the pixel P is not identified as a static defect, then the process 1010 continues from decision logic 1014 to step 1018, and directional gradients are calculated. For instance, as discussed above with reference to equations 58-61, the gradients may be computed as the sum of the absolute differences between the center pixel and the first and second neighboring pixels for the 4 directions (h, v, dp, and dn). Next, at step 1020, the directional gradient having the smallest value is identified, and thereafter, decision logic 1022 assesses whether one of the two neighboring pixels associated with the minimum gradient is located outside of the image frame (e.g., original frame 310). If both neighboring pixels are within the image frame, then the process 1010 continues to step 1024, and a pixel correction value (PC) is determined by applying linear interpolation to the values of the two neighboring pixels, as illustrated by equation 62. Thereafter, the input pixel P may be corrected using the interpolated pixel correction value PC, as shown at step 1030.
Returning to decision logic 1022, if it is determined that one of the two neighboring pixels is located outside of the image frame (e.g., original frame 310), then instead of using the value of the outside pixel (Pout), the DPDC logic 932 may substitute the value of Pout with the value of the other neighboring pixel that is inside the image frame (Pin), as shown at step 1026. Thereafter, at step 1028, the pixel correction value PC is determined by interpolating the value of Pin and the substituted value of Pout. In other words, in this case, PC may be equivalent to the value of Pin. Concluding at step 1030, the pixel P is corrected using the value PC. Before continuing, it should be understood that the particular defective pixel detection and correction processes discussed herein with reference to the DPDC logic 932 are intended to reflect only one possible embodiment of the present technique. Indeed, depending on design and/or cost constraints, a number of variations are possible, and features may be added or removed such that the overall complexity and robustness of the defect detection/correction logic is between the simpler detection/correction logic 738 implemented in the ISP front-end block 80 and the defect detection/correction logic discussed here with reference to the DPDC logic 932.
Referring back to fig. 99, the corrected pixel data is output from DPDC logic 932 and subsequently received by noise reduction logic 934 for further processing. In one embodiment, noise reduction logic 934 may be configured to implement two-dimensional edge adaptive low-pass filtering to reduce noise in the image data while preserving detail and texture. The edge adaptive threshold may be set (e.g., by control logic 84) based on the current light level so that filtering can be enhanced under low lighting conditions. Furthermore, as briefly described above with respect to the determination of the dynTh and spkTh values, for a given sensor, the noise variance may be determined early so that the noise reduction threshold may be set just above the noise variance, thereby reducing noise without significantly affecting the texture and detail of the scene (e.g., avoiding/reducing false detections) during the noise reduction process. Assuming a Bayer color filter implementation, the noise reduction logic 934 may independently process each color component Gr, R, B, and Gb using a 7-tap horizontal filter and a 5-tap vertical filter that may be separable. In one embodiment, the noise reduction process may be performed as follows: the green components (Gb and Gr) are corrected for non-uniformity and then horizontally filtered and vertically filtered.
Green non-uniformity (GNU) is generally characterized by a slight brightness difference between the Gb and Gr pixels given a uniformly illuminated flat surface. Without correcting or compensating for this non-uniformity, certain artifacts, such as a "maze" artifact, may appear in the full-color image after demosaicing. The green non-uniformity correction process may include determining, for each green pixel in the raw Bayer image data, whether the absolute difference between a current green pixel (G1) and the green pixel to the lower right of the current pixel (G2) is less than a GNU correction threshold (gnuTh). FIG. 104 illustrates the locations of the G1 and G2 pixels in a 2×2 area of the Bayer pattern. As shown, the color of the pixels bordering G1 may depend upon whether the current green pixel is a Gb or Gr pixel. For instance, if G1 is Gr, then G2 is Gb, the pixel to the right of G1 is R (red), and the pixel below G1 is B (blue). Alternatively, if G1 is Gb, then G2 is Gr, and the pixel to the right of G1 is B, whereas the pixel below G1 is R. If the absolute difference between G1 and G2 is less than the GNU correction threshold, then the current green pixel G1 is replaced by the average of G1 and G2, as shown by the logic below:
if (abs(G1 − G2) ≤ gnuTh), then G1 = (G1 + G2) / 2
it will be appreciated that applying green disparity correction in this manner may help to avoid averaging G1 and G2 pixels across edges, thereby improving and/or preserving sharpness.
Horizontal filtering is applied after the green non-uniformity correction and, in one embodiment, may provide a 7-tap horizontal filter. The gradient across the edge of each filter tap is computed, and if it is above a horizontal edge threshold (horzTh), the filter tap is folded to the center pixel, as described below. In certain embodiments, the noise filtering may be edge-adaptive. For instance, the horizontal filter may be a finite impulse response (FIR) filter wherein the filter taps are used only if the difference between the center pixel and the pixel at the tap is smaller than a threshold that depends on the noise variance. The horizontal filter may process the image data independently for each color component (R, B, Gr, Gb), and may use unfiltered values as input values.
For example, FIG. 105 shows a graphical depiction of a group of horizontal pixels P0-P6 with the center tap placed at P3. From the pixels shown in fig. 105, the edge gradient of each filter tap can be calculated as follows:
Eh0=abs(P0-P1) (64)
Eh1=abs(P1-P2) (65)
Eh2=abs(P2-P3) (66)
Eh3=abs(P3-P4) (67)
Eh4=abs(P4-P5) (68)
Eh5=abs(P5-P6) (69)
Subsequently, using the formula shown in equation 70 below, the horizontal filter component may use the edge gradients Eh0-Eh5 to determine the horizontal filter output Phorz:
Phorz=C0×[(Eh2>horzTh[c])?P3:(Eh1>horzTh[c])?P2:(Eh0>horzTh[c])?P1:P0]+
C1×[(Eh2>horzTh[c])?P3:(Eh1>horzTh[c])?P2:P1]+
C2×[(Eh2>horzTh[c])?P3:P2]+
C3×P3+ (70)
C4×[(Eh3>horzTh[c])?P3:P4]+
C5×[(Eh3>horzTh[c])?P3:(Eh4>horzTh[c])?P4:P5]+
C6×[(Eh3>horzTh[c])?P3:(Eh4>horzTh[c])?P4:(Eh5>horzTh[c])?P5:P6],
wherein horzTh[c] is the horizontal edge threshold for each color component c (e.g., R, B, Gr, and Gb), and wherein C0-C6 are the filter tap coefficients corresponding to the pixels P0-P6, respectively. The horizontal filter output Phorz may be applied at the center pixel P3 location. In one embodiment, the filter tap coefficients C0-C6 may be 16-bit two's-complement values with 3 integer bits and 13 fraction bits (3.13 in floating point). Further, it should be noted that the filter tap coefficients C0-C6 need not necessarily be symmetrical with respect to the center pixel P3.
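A sketch of the edge-adaptive 7-tap horizontal filter of equations 64-70 is shown below, assuming 3.13 fixed-point coefficients as described; boundary replication of out-of-frame pixels is assumed to have been handled before this routine is called.

#include <stdlib.h>

/* Edge-adaptive 7-tap horizontal filter (eq. 64-70) for pixels P0..P6 of one
 * color component, center tap at P3.  c_tap[] holds the 3.13 fixed-point
 * coefficients C0..C6 and horz_th is the per-component edge threshold
 * horzTh[c]. */
static int horz_filter(const int p[7], const int c_tap[7], int horz_th)
{
    int eh[6];
    for (int i = 0; i < 6; i++)
        eh[i] = abs(p[i] - p[i + 1]);        /* Eh0..Eh5 (eq. 64-69) */

    /* Fold each tap back toward the center pixel P3 as soon as an edge
     * gradient along the way exceeds the threshold (eq. 70). */
    int t0 = (eh[2] > horz_th) ? p[3] : (eh[1] > horz_th) ? p[2]
                               : (eh[0] > horz_th) ? p[1] : p[0];
    int t1 = (eh[2] > horz_th) ? p[3] : (eh[1] > horz_th) ? p[2] : p[1];
    int t2 = (eh[2] > horz_th) ? p[3] : p[2];
    int t4 = (eh[3] > horz_th) ? p[3] : p[4];
    int t5 = (eh[3] > horz_th) ? p[3] : (eh[4] > horz_th) ? p[4] : p[5];
    int t6 = (eh[3] > horz_th) ? p[3] : (eh[4] > horz_th) ? p[4]
                               : (eh[5] > horz_th) ? p[5] : p[6];

    long long acc = (long long)c_tap[0] * t0 + (long long)c_tap[1] * t1 +
                    (long long)c_tap[2] * t2 + (long long)c_tap[3] * p[3] +
                    (long long)c_tap[4] * t4 + (long long)c_tap[5] * t5 +
                    (long long)c_tap[6] * t6;
    return (int)(acc >> 13);                 /* 3.13 fixed-point coefficients */
}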
The noise reduction logic 934 also applies vertical filtering following the green non-uniformity correction and horizontal filtering processes. In one embodiment, the vertical filter operation may provide a 5-tap filter, as shown in fig. 106, with the center tap of the vertical filter located at P2. The vertical filtering process may occur in a manner similar to the horizontal filtering process described above. For instance, the gradient across the edge of each filter tap is computed, and if it is above a vertical edge threshold (vertTh), the filter tap is folded to the center pixel P2. The vertical filter may process the image data independently for each color component (R, B, Gr, Gb), and may use unfiltered values as input values.
From the pixels shown in fig. 106, the vertical edge gradient for each filter tap can be calculated as follows:
Ev0=abs(P0-P1) (71)
Ev1=abs(P1-P2) (72)
Ev2=abs(P2-P3) (73)
Ev3=abs(P3-P4) (74)
Using the formula shown in equation 75 below, the vertical filter may then use the edge gradients Ev0-Ev3 to determine the vertical filter output Pvert:
Pvert = C0 × [(Ev1 > vertTh[c]) ? P2 : (Ev0 > vertTh[c]) ? P1 : P0] + C1 × [(Ev1 > vertTh[c]) ? P2 : P1] +
C2 × P2 +    (75)
C3 × [(Ev2 > vertTh[c]) ? P2 : P3] +
C4 × [(Ev2 > vertTh[c]) ? P2 : (Ev3 > vertTh[c]) ? P3 : P4],
wherein vertTh[c] is the vertical edge threshold for each color component c (e.g., R, B, Gr, and Gb), and wherein C0-C4 are the filter tap coefficients corresponding to the pixels P0-P4 of fig. 106, respectively. The vertical filter output Pvert may be applied at the center pixel P2 location. In one embodiment, the filter tap coefficients C0-C4 may be 16-bit two's-complement values with 3 integer bits and 13 fraction bits (3.13 in floating point). Further, it should be noted that the filter tap coefficients C0-C4 need not necessarily be symmetrical with respect to the center pixel P2.
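The 5-tap vertical filter of equations 71-75 follows the same pattern; a corresponding sketch under the same assumptions:

#include <stdlib.h>

/* Edge-adaptive 5-tap vertical filter (eq. 71-75), center tap at P2, with
 * the same 3.13 fixed-point coefficient convention as the horizontal case. */
static int vert_filter(const int p[5], const int c_tap[5], int vert_th)
{
    int ev[4];
    for (int i = 0; i < 4; i++)
        ev[i] = abs(p[i] - p[i + 1]);        /* Ev0..Ev3 (eq. 71-74) */

    int t0 = (ev[1] > vert_th) ? p[2] : (ev[0] > vert_th) ? p[1] : p[0];
    int t1 = (ev[1] > vert_th) ? p[2] : p[1];
    int t3 = (ev[2] > vert_th) ? p[2] : p[3];
    int t4 = (ev[2] > vert_th) ? p[2] : (ev[3] > vert_th) ? p[3] : p[4];

    long long acc = (long long)c_tap[0] * t0 + (long long)c_tap[1] * t1 +
                    (long long)c_tap[2] * p[2] + (long long)c_tap[3] * t3 +
                    (long long)c_tap[4] * t4;
    return (int)(acc >> 13);                 /* eq. 75 */
}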
Additionally, with regard to boundary conditions, when neighboring pixels are outside of the original frame 310 (fig. 23), the values of the out-of-bound pixels are replicated with the value of the same-colored pixel at the edge of the original frame. This convention may be implemented for both the horizontal and vertical filtering operations. By way of example, referring again to fig. 105, in the case of horizontal filtering, if the pixel P2 is an edge pixel at the left-most edge of the original frame, such that the pixels P0 and P1 are outside of the original frame, then the values of the pixels P0 and P1 are substituted with the value of the pixel P2 for horizontal filtering.
Referring back to the block diagram of the raw processing logic 900 shown in fig. 99, the output of the noise reduction logic 934 is then sent to the Lens Shading Correction (LSC) logic 936 for processing. As described above, the lens shading correction technique may include applying an appropriate gain on a pixel-by-pixel basis to compensate for the reduction in light intensity that may be a result of the geometric optics of the lens, imperfections in the manufacturing process, misalignment of the microlens array and the color filter array, and so forth. Furthermore, Infrared (IR) filters in some lenses may cause the intensity reduction to be dependent on the illuminant, and thus, the lens shading gain may be modified depending on the detected light source.
In the depicted embodiment, the LSC logic 936 of the ISP pipeline 82 may be implemented in a manner similar to the LSC logic 740 of the ISP front-end block 80 described above with reference to FIGS. 71-79, thereby providing substantially the same functionality. Thus, to avoid redundancy, it should be appreciated that the LSC logic 936 of the presently illustrated embodiment is configured to operate generally in the same manner as the LSC logic 740, and thus, the description of the lens shading correction techniques provided above is not repeated here. Briefly, however, it should be appreciated that LSC logic 936 may independently process each color component of the original pixel data stream to determine the gain applied to the current pixel. In accordance with the embodiments discussed above, the lens shading correction gain may be determined based on a set of defined gain grid points distributed within the imaging frame, where the spacing between the respective grid points is defined by a plurality of pixels (e.g., 8 pixels, 16 pixels, etc.). If the position of the current pixel corresponds to a grid point, a gain value associated with that grid point is applied to the current pixel. However, if the position of the current pixel is between grid points (e.g., G0, G1, G2, and G3 of fig. 74), the LSC gain value may be calculated using interpolation of the grid points between which the current pixel is located (equations 13a and 13 b). Process 772 of fig. 75 illustrates this process. Furthermore, as described above with reference to fig. 73, in some embodiments, the grid points may be unevenly distributed (e.g., logarithmically distributed) such that at the center of LSC region 760, the grid points are less concentrated, but become more concentrated toward the corners of LSC region 760 where typical lens shading distortion is more pronounced.
Additionally, LSC logic 936 may also apply the radial gain component using the grid gain values, as described above with reference to fig. 78 and 79. The radial gain component may be determined based on the distance of the center pixel from the center of the image (equations 14-16). As described above, utilizing radial gain allows a single common gain grid to be used for all color components, which may greatly reduce the total memory space required to hold the separate gain grids for each color component. The reduction of the grid gain data may reduce implementation costs, as the grid gain data table may occupy a significant portion of the memory or chip area in the image processing hardware.
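Although equations 13a, 13b, and 14-16 are presented earlier in the disclosure and are not repeated here, the bilinear interpolation of the four surrounding grid gains may be sketched generically as follows; the exact fixed-point formulation used by the LSC logic may differ.

/* Bilinear interpolation of the lens shading gain for a pixel located
 * between grid points G0 (top-left), G1 (top-right), G2 (bottom-left), and
 * G3 (bottom-right).  dx and dy are the pixel offsets inside the grid cell
 * and interval is the grid spacing in pixels.  Floating point is used here
 * only for clarity; the hardware operates in fixed point, and the exact
 * formulation of equations 13a/13b may differ. */
static double lsc_interp_gain(double g0, double g1, double g2, double g3,
                              int dx, int dy, int interval)
{
    double fx = (double)dx / interval;
    double fy = (double)dy / interval;
    double top    = g0 + fx * (g1 - g0);
    double bottom = g2 + fx * (g3 - g2);
    return top + fy * (bottom - top);
}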
Subsequently, referring back to the raw processing logic block diagram 900 of fig. 99, the output of the LSC logic 936 is then passed to a second gain, offset, and clamp (GOC) block 938. GOC logic 938 may be applied prior to demosaicing (performed by logic block 940) and may be used to automatically white balance the output of LSC logic 936. In the depicted embodiment, GOC logic 938 may be implemented in a similar manner as GOC logic 930 (and BLC logic 739). Thus, the input received by GOC logic 938 is first offset by a signed value and then multiplied by a gain, in accordance with equation 11 above. The resulting values are then clipped to a minimum and maximum range according to equation 12.
The output of GOC logic 938 is then forwarded to demosaic logic 940 for processing to produce a full color (RGB) image from the raw Bayer input data. It will be appreciated that the raw output of an image sensor using a color filter array (such as a Bayer filter) is "incomplete" in the sense that each pixel is filtered to obtain only a single color component. Thus, the data collected for a single pixel alone is not sufficient to determine color. Thus, a full color image can be generated from the raw Bayer data using a demosaicing technique by interpolating the missing color data for each pixel.
Referring now to fig. 107, fig. 107 illustrates a graphical process flow 692 that provides an overview of how demosaicing may be applied to a raw Bayer image pattern 1034 to produce a full color RGB image. As shown, a 4 x 4 portion 1036 of the raw Bayer image 1034 may include separate channels for each color component, including a green channel 1038, a red channel 1040, and a blue channel 1042. Since each imaging pixel in a Bayer sensor only obtains data for one color, the color data for each color channel 1038, 1040, and 1042 may be incomplete, as indicated by the "?" symbols. By applying the demosaicing technique 1044, the missing color samples for each channel may be interpolated. For example, as shown by reference numeral 1046, interpolated data G' may be used to fill in the missing samples on the green channel. Similarly, interpolated data R' may be used (in conjunction with the interpolated data G' 1046) to fill in the missing samples on the red channel 1048, and interpolated data B' may be used (in conjunction with the interpolated data G' 1046) to fill in the missing samples on the blue channel 1050. Thus, as a result of the demosaicing process, each color channel (R, G, and B) will have a complete set of color data that can then be used to reconstruct the full color RGB image 1052.
A demosaicing technique that may be implemented by demosaicing logic 940 is described below, according to one embodiment. On the green channel, missing color samples may be interpolated by using a low-pass directional filter for known green samples and a high-pass (or gradient) filter for adjacent color channels (e.g., red and blue). The missing color samples can be interpolated in a similar manner for the red and blue channels, but by applying a low pass filter to the known red or blue values and a high pass filter to the co-located (colocated) interpolated green values. Furthermore, in one embodiment, demosaicing of the green channel may utilize a 5 x 5 pixel block edge adaptive filter based on the initial Bayer color data. As described further below, the use of edge adaptive filters may provide continuous weighting based on the gradient of the horizontal and vertical filtered values, which reduces the occurrence of certain artifacts, such as aliasing, "checkerboard" artifacts, or "rainbow" artifacts, common in conventional demosaicing techniques.
During demosaicing of the green channel, the initial values of the green pixels (Gr and Gb pixels) of the Bayer image pattern are used. However, to obtain a complete set of data for the green channel, green pixel values may be interpolated over the red and blue pixels of the Bayer image pattern. According to the present technique, first, horizontal and vertical energy components, referred to as Eh and Ev, respectively, are calculated at the red and blue pixels from the above-described 5 × 5 pixel block. The values of Eh and Ev may be used to obtain edge-weighted filtered values for the horizontal and vertical filtering steps, as described further below.
For example, fig. 108 illustrates the calculation of the Eh and Ev values for a red pixel centered at position (j, i) in a 5 x 5 pixel block, where j corresponds to a row and i corresponds to a column. As shown, the calculation of Eh considers the middle three rows (j-1, j, j+1) of the 5 x 5 pixel block, and the calculation of Ev considers the middle three columns (i-1, i, i+1) of the 5 x 5 pixel block. To calculate Eh, the absolute value of the sum of each pixel in the red columns (i-2, i, i+2) multiplied by a corresponding coefficient (e.g., -1 for columns i-2 and i+2; 2 for column i) is added to the absolute value of the sum of each pixel in the blue columns (i-1, i+1) multiplied by a corresponding coefficient (e.g., -1 for column i-1; 1 for column i+1). To calculate Ev, the absolute value of the sum of each pixel in the red rows (j-2, j, j+2) multiplied by a corresponding coefficient (e.g., -1 for rows j-2 and j+2; 2 for row j) is added to the absolute value of the sum of each pixel in the blue rows (j-1, j+1) multiplied by a corresponding coefficient (e.g., -1 for row j-1; 1 for row j+1). The following equations 76 and 77 illustrate these calculations:
Eh = abs[2(P(j-1,i) + P(j,i) + P(j+1,i))                          (76)
        - (P(j-1,i-2) + P(j,i-2) + P(j+1,i-2))
        - (P(j-1,i+2) + P(j,i+2) + P(j+1,i+2))]
   + abs[(P(j-1,i-1) + P(j,i-1) + P(j+1,i-1))
        - (P(j-1,i+1) + P(j,i+1) + P(j+1,i+1))]

Ev = abs[2(P(j,i-1) + P(j,i) + P(j,i+1))                          (77)
        - (P(j-2,i-1) + P(j-2,i) + P(j-2,i+1))
        - (P(j+2,i-1) + P(j+2,i) + P(j+2,i+1))]
   + abs[(P(j-1,i-1) + P(j-1,i) + P(j-1,i+1))
        - (P(j+1,i-1) + P(j+1,i) + P(j+1,i+1))]
Thus, the total energy sum may be expressed as Eh + Ev. Further, while the example shown in fig. 108 illustrates the calculation of Eh and Ev with respect to a red center pixel at (j, i), it should be appreciated that the Eh and Ev values may be determined in a similar manner for blue center pixels.
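As a quick check of equations 76 and 77, the following Python sketch evaluates Eh and Ev on a 5 x 5 block; the row-major layout with P[2, 2] as the center pixel at (j, i) is an assumption made for illustration.

```python
import numpy as np

def energy_components(P):
    """Eh and Ev for a 5x5 block P whose center P[2, 2] is a red or blue pixel
    (equations 76 and 77). Rows index j-2..j+2, columns index i-2..i+2."""
    # Horizontal energy: middle three rows; same-color columns weighted 2/-1,
    # opposite-color columns weighted -1/+1.
    eh = (abs(2 * P[1:4, 2].sum() - P[1:4, 0].sum() - P[1:4, 4].sum()) +
          abs(P[1:4, 1].sum() - P[1:4, 3].sum()))
    # Vertical energy: middle three columns; same-color rows weighted 2/-1,
    # opposite-color rows weighted -1/+1.
    ev = (abs(2 * P[2, 1:4].sum() - P[0, 1:4].sum() - P[4, 1:4].sum()) +
          abs(P[1, 1:4].sum() - P[3, 1:4].sum()))
    return eh, ev

block = np.arange(25, dtype=float).reshape(5, 5)  # placeholder pixel data
print(energy_components(block))
```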
Horizontal and vertical filtering may then be applied to the Bayer pattern to obtain the horizontal and vertical filtered values Gh and Gv, which may represent interpolated green values in the horizontal and vertical directions, respectively. The filtered values Gh and Gv may be determined by applying low-pass filtering to the known neighboring green samples and by using the directional gradient of the adjacent color (R or B) to obtain a high-frequency signal at the location of the missing green sample. For example, referring to fig. 109, an example of determining the horizontal interpolation value Gh will now be illustrated.
As shown in fig. 109, in determining Gh, 5 horizontal pixels (R0, G1, R2, G3, and R4) of the red line 1060 of the Bayer image may be considered, where it is assumed that R2 is the center pixel at (j, i). The filter coefficients associated with each of these 5 pixels are denoted by reference numeral 1062. Thus, the interpolation for the green value of the center pixel R2 (referred to as G2') can be determined as follows:
The expression for G2' shown in equations 79 and 80 below may then be generated using various mathematical operations:
Thus, referring to FIG. 109 and equations 78-80 above, the general formula for horizontal interpolation of green values at (j, i) can be derived as follows:
in a similar manner to Gh, the vertical filter component Gv may be determined. For example, referring to fig. 110, in determining Gv, 5 vertical pixels (R0, G1, R2, G3, and R4) of red column 1064 of the Bayer image may be considered, where it is assumed that R2 is the center pixel at (j, i). By applying low-pass filtering to the known green samples along the vertical direction and high-pass filtering to the red channel, the following expression can be derived for Gv:
Although the examples discussed herein show interpolation of green values on red pixels, it should be understood that the expressions set forth in equations 81 and 82 may also be used in horizontal and vertical interpolation with respect to green values of blue pixels.
By weighting the horizontal and vertical filter outputs (Gh and Gv) with the energy components (Eh and Ev) discussed above, the final interpolated green value G' for the center pixel (j, i) can be determined, resulting in the following equation:
as described above, the energy components Eh and Ev may provide edge adaptive weighting of the horizontal and vertical filtered outputs Gh and Gv, which helps to reduce image artifacts, such as rainbow artifacts, aliasing artifacts, or checkerboard artifacts, in the reconstructed RGB image. In addition, demosaicing logic 940 may provide the option of bypassing the edge adaptive weighting feature by setting Eh and Ev to 1, respectively, such that Gh and Gv are both equally weighted.
In one embodiment, the horizontal and vertical weighting coefficients shown in equation 83 above may be quantized to reduce the precision of the weighting coefficients to a set of "coarse" values. For example, in one embodiment, the weighting coefficients may be quantized into 8 possible weight ratios: 1/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8, and 8/8. Other embodiments may quantize the weighting coefficients into 16 values (e.g., 1/16 to 16/16), 32 values (1/32 to 32/32), and so on. It will be appreciated that quantizing the weighting coefficients may reduce implementation complexity when determining and applying the weighting coefficients to the horizontal and vertical filter outputs, when compared to using full precision values (e.g., 32-bit floating point values).
In other embodiments, in addition to determining and applying weighting coefficients to the horizontal (Gh) and vertical (Gv) filtered values using the horizontal and vertical energy components, the presently disclosed techniques may also determine and utilize energy components along the diagonal positive and diagonal negative directions. For example, in such embodiments, filtering may also be applied along the diagonal positive and diagonal negative directions. The weighting of the filtered outputs may include selecting the two highest energy components and using the selected energy components to weight their respective filter outputs. For example, assuming that the two highest energy components correspond to the vertical direction and the diagonal positive direction, the vertical and diagonal positive energy components are used to weight the vertical and diagonal positive filter outputs to determine the interpolated green value (e.g., at a red or blue pixel location in the Bayer pattern).
Demosaicing of the red and blue channels may then be performed by interpolating red and blue values at the green pixels of the Bayer image pattern, interpolating red values at the blue pixels of the Bayer image pattern, and interpolating blue values at the red pixels of the Bayer image pattern. In accordance with the presently discussed techniques, missing red and blue pixel values may be interpolated using low pass filtering based on known neighboring red and blue pixels and high pass filtering based on co-located green pixel values, which may be initial or interpolated values (from the green channel demosaicing process discussed above) depending on the location of the current pixel. Thus, for such embodiments, it should be appreciated that the interpolation of the missing green values may be done first, such that when the missing red and blue samples are interpolated, there is a complete set of green values (initial values and interpolated values) available.
Referring to fig. 111, which illustrates the interpolation of red and blue pixel values, various 3 x 3 blocks of the Bayer image pattern to which red and blue demosaicing may be applied are shown, along with interpolated green values (denoted by G') that may have been obtained during demosaicing of the green channel. Referring first to pixel block 1070, the interpolated red value R'11 for the Gr pixel (G11) may be determined as follows:
where G'10 and G'12 represent interpolated green values, as indicated by reference numeral 1078. Similarly, the interpolated blue value B'11 for the Gr pixel (G11) may be determined as follows:
where G'01 and G'21 represent the interpolated green values (1078).
Next, referring to pixel block 1072, in which the center pixel is a Gb pixel (G11), the interpolated red value R'11 and blue value B'11 may be determined as shown in equations 86 and 87 below:
Further, referring to pixel block 1074, the interpolation of the red value at the blue pixel B11 may be determined as follows:
where G'00, G'02, G'11, G'20, and G'22 represent interpolated green values, as indicated by reference numeral 1080. Finally, the interpolation of the blue value at a red pixel, as represented by pixel block 1076, may be computed as follows:
while the embodiments discussed above rely on color differences (e.g., gradients) to determine the red and blue interpolated values, another embodiment may utilize a color ratio to provide interpolated red and blue values. For example, interpolated green values (blocks 1078 and 1080) may be used to obtain color ratios at the red and blue pixel locations of the Bayer image pattern, and linear interpolation of the color ratios may be used to determine interpolated color ratios for missing color samples. The green value, which may be an interpolated value or an initial value, may be multiplied by the interpolated color ratio to obtain a final interpolated color value. For example, interpolation of red and blue pixel values using a color ratio may be performed according to the following formulas, where formulas 90 and 91 represent interpolation of red and blue values with respect to a Gr pixel, formulas 92 and 93 represent interpolation of red and blue values with respect to a Gb pixel, formula 94 represents interpolation of a red value on a blue pixel, and formula 95 represents interpolation of a blue value on a red pixel:
(R'11 interpolated when G11 is a Gr pixel)    (90)
(B'11 interpolated when G11 is a Gr pixel)    (91)
(R'11 interpolated when G11 is a Gb pixel)    (92)
(B'11 interpolated when G11 is a Gb pixel)    (93)
(R'11 interpolated at the blue pixel B11)    (94)
(B'11 interpolated at the red pixel R11)    (95)
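Since equations 90-95 are not reproduced above, the following Python sketch only illustrates the general color-ratio idea for one case (red at a Gr pixel): neighboring red samples are converted to R/G' ratios using the interpolated green values at those positions, the ratios are linearly interpolated, and the result is scaled by the green value at the current pixel. The specific neighbor layout and names are assumptions.

```python
def interp_red_at_gr(g11, r10, r12, g_interp_10, g_interp_12):
    """Color-ratio sketch of red interpolation at a Gr pixel (the idea behind
    equation 90; the exact neighbor layout is assumed for illustration)."""
    ratio = 0.5 * (r10 / g_interp_10 + r12 / g_interp_12)   # interpolated R/G' ratio
    return g11 * ratio                                       # final interpolated red

# Example: green at the current pixel is 200, horizontal red neighbors are
# 90 and 110 with interpolated greens 180 and 220 at those positions.
print(interp_red_at_gr(200.0, 90.0, 110.0, 180.0, 220.0))
```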
Once the missing color samples are interpolated for each image pixel of the Bayer image pattern, a complete sample of the color values of the respective red, blue and green channels (e.g., 1046, 1048 and 1050 of fig. 107) may be combined to produce a full color RGB image. For example, referring to fig. 98 and 99, the output 910 of the raw pixel processing logic 900 may be an RGB image signal in 8, 10, 12, or 14 bit format.
Referring now to FIGS. 112-115, FIGS. 112-115 are various flow charts illustrating processes for demosaicing a raw Bayer image pattern in accordance with the disclosed embodiments. In particular, process 1082 of FIG. 112 describes the determination of which color components are to be interpolated for a given input pixel P. Depending on the determination by process 1082, one or more of process 1100 (FIG. 113) for interpolating a green value, process 1112 (FIG. 114) for interpolating a red value, or process 1124 (FIG. 115) for interpolating a blue value may be performed (e.g., by the demosaicing logic 940).
Starting at FIG. 112, process 1082 begins at step 1084 when an input pixel P is received. Decision logic 1086 determines the color of the input pixel. This may depend, for example, on the location of the pixels within the Bayer image pattern. Thus, if P is identified as a green pixel (e.g., Gr or Gb), then process 1082 proceeds to step 1088, thereby obtaining interpolated red and blue values for P. For example, this may include entering processes 1112 and 1124 of fig. 114 and 115, respectively. If P is identified as a red pixel, process 1082 proceeds to step 1090 to obtain interpolated green and blue values for P. This may include further performing the processes 1100 and 1124 of fig. 113 and 115, respectively. Otherwise, if P is identified as a blue pixel, process 1082 proceeds to step 1092 to obtain interpolated green and red values for P. This may include further performing processes 1100 and 1112 of fig. 113 and 114, respectively. The various processes 1100, 1112, and 1124 are further described below.
Illustrated in FIG. 113 is a process 1100 for determining an interpolated green value for the input pixel P, process 1100 including steps 1102-1110. At step 1102, the input pixel P is received (e.g., from process 1082). Subsequently, at step 1104, a set of neighboring pixels forming a 5 x 5 pixel block is identified, with P being the center of the 5 x 5 block. Thereafter, at step 1106, the pixel block is analyzed to determine the horizontal and vertical energy components. For example, the horizontal and vertical energy components may be determined according to equations 76 and 77 for calculating Eh and Ev, respectively. As described above, the energy components Eh and Ev may be used as weighting coefficients to provide edge adaptive filtering and, thus, to reduce the occurrence of certain demosaicing artifacts in the final image. At step 1108, low pass filtering and high pass filtering are applied along the horizontal and vertical directions to determine the horizontal and vertical filtered outputs. For example, the horizontal and vertical filtered outputs Gh and Gv may be calculated according to equations 81 and 82. Thereafter, process 1100 advances to step 1110, at which the interpolated green value G' is determined based on the values of Gh and Gv weighted with the energy components Eh and Ev, as shown in equation 83.
Referring next to process 1112 of FIG. 114, the interpolation of the red value begins at step 1114, where the input pixel P is received (e.g., from process 1082). At step 1116, a set of neighboring pixels forming a 3 x 3 pixel block is identified, with P being the center of the 3 x 3 block. Thereafter, at step 1118, low pass filtering is applied to the neighboring red pixels within the 3 x 3 block, and high pass filtering is applied to the co-located green neighboring values (step 1120), which may be the initial green values captured by the Bayer image sensor or interpolated values (e.g., values determined using process 1100 of fig. 113). The interpolated red value R' for P may be determined based on the low-pass filtered output and the high-pass filtered output, as shown at step 1122. Depending on the color of P, R' may be determined according to one of equations 84, 86, and 88.
In the case of interpolation of blue values, the process of fig. 115 may be applied. Steps 1126 and 1128 are generally the same as steps 1114 and 1116 of process 1112 (fig. 114). At step 1130, low pass filtering is applied to adjacent blue pixels within the 3 x 3 block of pixels, and at step 1132, high pass filtering is applied to co-located green adjacent values, which may be the initial green values captured by the Bayer image sensor, or interpolated values (e.g., interpolated values determined using process 1100 of fig. 113). The interpolated blue value B' for P may be determined based on the low-pass filtered output and the high-pass filtered output, as shown in step 1134. Depending on the color of P, B' can be determined according to one of equations 85, 87, and 89. In addition, as described above, the interpolation of the red and blue values can be determined using color differences (equations 84 to 89) or color ratios (equations 90 to 95). Also, it should be understood that the interpolation of the missing green values may be done first, so that when the missing red and blue samples are interpolated, there is a complete set of green values (original and interpolated values) available. For example, process 1100 of fig. 113 may be applied to interpolate all missing green samples before processes 1112 and 1124 of fig. 114 and 115, respectively, are performed.
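The dispatch structure of processes 1082, 1100, 1112, and 1124 can be summarized with the following Python sketch; the interpolation helpers are placeholders standing in for the per-channel processes and are not the actual filter implementations.

```python
def demosaic_pixel(p, color):
    """Dispatch sketch mirroring process 1082 of fig. 112: depending on the
    Bayer color of the input pixel P, interpolate the missing components.
    The helpers below stand in for processes 1100, 1112, and 1124 and simply
    return the input value as a placeholder."""
    def interpolate_green(p):  # placeholder for process 1100 (5x5 edge adaptive)
        return p
    def interpolate_red(p):    # placeholder for process 1112 (3x3 low/high pass)
        return p
    def interpolate_blue(p):   # placeholder for process 1124 (3x3 low/high pass)
        return p

    if color in ("Gr", "Gb"):                 # green pixel: need red and blue
        return p, interpolate_red(p), interpolate_blue(p)
    if color == "R":                          # red pixel: need green and blue
        return interpolate_green(p), p, interpolate_blue(p)
    if color == "B":                          # blue pixel: need green and red
        return interpolate_green(p), interpolate_red(p), p
    raise ValueError("unknown Bayer color")

print(demosaic_pixel(128, "R"))  # -> (G', R, B') for a red input pixel
```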
Referring to figs. 116-119, examples of colored drawings of an image processed by the raw pixel processing logic 900 in the ISP pipeline 82 are provided. Fig. 116 depicts an initial image scene 1140 that may be captured with the image sensor 90 of the imaging device 30. Fig. 117 shows a raw Bayer image 1142 representing the raw image data captured with the image sensor 90. As described above, conventional demosaicing techniques may not provide adaptive filtering based on the detection of edges (e.g., boundaries between regions of two or more colors) in the image data, which may undesirably produce artifacts in the final reconstructed full color RGB image. For example, fig. 118 represents an RGB image 1144 reconstructed using conventional demosaicing techniques, and may include artifacts such as a "checkerboard" artifact 1146 at the edge 1148. However, comparing image 1144 with RGB image 1150 of fig. 119, which is an example of an image reconstructed using the demosaicing techniques described above, it can be seen that the checkerboard artifact 1146 present in fig. 118 is not present, or at least its appearance is significantly reduced at the edge 1148. Thus, the images shown in figs. 116-119 are intended to illustrate at least one advantage that the demosaicing techniques disclosed herein have over conventional approaches.
In accordance with certain aspects of the image processing techniques disclosed herein, the various processing logic blocks in the ISP subsystem 32 may be implemented using a set of line buffers that may be configured to pass image data through the various processing logic blocks, as described above. For example, in one embodiment, the raw pixel processing logic 900 discussed above in FIG. 99 may be implemented using the arrangement of line buffers shown in FIGS. 120-123. In particular, FIG. 120 depicts the overall line buffer arrangement that may be used to implement the raw pixel processing logic 900, FIG. 121 depicts an enlarged view of a first subset of the line buffers, as shown within the bounded region 1162 of FIG. 120, FIG. 122 depicts an enlarged view of a vertical filter that may be part of the noise reduction logic 934, and FIG. 123 depicts an enlarged view of a second subset of the line buffers, as shown within the bounded region 1164 of FIG. 120.
As shown in FIG. 120, the raw pixel processing logic 900 may include a set of 10 line buffers, numbered 0-9 and labeled 1160a-1160j, respectively, as well as a row of logic 1160k that includes the image data input 908 (which may come from the image sensor or from memory) to the raw pixel processing logic 900. Thus, the logic shown in FIG. 120 may include 11 rows, 10 of which include line buffers (1160a-1160j). As described below, the logic units of the raw pixel processing logic 900, including the gain, offset, and clamp logic blocks 930 and 938 (referred to in FIG. 120 as GOC1 and GOC2, respectively), the defective pixel detection and correction (DPC) logic 932, the noise reduction logic 934 (shown in FIG. 120 as including green non-uniformity (GNU) correction logic 934a, a 7-tap horizontal filter 934b, and a 5-tap vertical filter 934c), the lens shading correction (LSC) logic 936, and the demosaicing (DEM) logic 940, may share the line buffers. For example, in the embodiment shown in FIG. 120, a lower subset of the line buffers, represented by line buffers 6-9 (1160g-1160j), may be shared among portions of the DPC logic 932 and the noise reduction logic 934, including the GNU logic 934a, the horizontal filter 934b, and a portion of the vertical filter 934c. An upper subset of the line buffers, represented by line buffers 0-5 (1160a-1160f), may be shared among a portion of the vertical filtering logic 934c, the lens shading correction logic 936, the gain, offset, and clamp logic 938, and the demosaicing logic 940.
To summarize the movement of image data through the line buffers, the raw image data 908, representing the output of the ISP front-end processing logic 80, is first received and processed by the GOC1 logic 930, where appropriate gain, offset, and clamp parameters are applied. The output of the GOC1 logic 930 is then provided to the DPC logic 932. As shown, the defective pixel detection and correction processing may occur over line buffers 6-9. A first output of the DPC logic 932 is provided to the green non-uniformity (GNU) correction logic 934a (of the noise reduction logic 934), which occurs on line buffer 9 (1160j). Thus, in the present embodiment, line buffer 9 (1160j) is shared between the DPC logic 932 and the GNU correction logic 934a.
Subsequently, the output (referred to as W8 in fig. 121) of the line buffer 9(1160j) is supplied to the input of the line buffer 8(1160 i). As shown, the line buffer 8 is shared between DPC logic 932 providing additional defective pixel detection and correction processing and horizontal filtering logic (934b) of noise reduction block 934. As shown in this embodiment, horizontal filter logic 934b may be a 7-tap filter represented by filter taps 1165 a-1165 g in fig. 121, and may be configured as a Finite Impulse Response (FIR) filter. As described above, in some embodiments, the noise filtering may be edge adaptive. For example, the horizontal filter may be a FIR filter, but the filter taps are only used if the difference between the center pixel and the pixels located at the taps is less than a threshold value that depends at least in part on the variance of the noise.
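The following Python sketch shows one way such an edge-adaptive horizontal FIR might behave: a tap contributes only if its sample is within a noise-dependent threshold of the center pixel, so strong edges are not smeared. The 7-tap coefficients, the threshold value, and the renormalization used here are illustrative assumptions, not the actual behavior of the horizontal filter 934b.

```python
def edge_adaptive_horizontal_filter(line, i, coeffs, threshold):
    """Sketch of an edge-adaptive 7-tap horizontal FIR along one line: a tap
    only contributes when the sample at that tap is within `threshold` of the
    center pixel. The re-normalization by the accepted coefficients is an
    illustrative assumption."""
    center = line[i]
    acc, weight = 0.0, 0.0
    for k, c in enumerate(coeffs):            # coeffs has 7 entries, taps -3..+3
        j = i + k - len(coeffs) // 2
        if 0 <= j < len(line) and abs(line[j] - center) < threshold:
            acc += c * line[j]
            weight += c
    return acc / weight if weight else center

row = [10, 12, 11, 13, 90, 12, 10, 11, 13]     # contains a strong edge (value 90)
coeffs = [1, 2, 3, 4, 3, 2, 1]                 # symmetric 7-tap kernel (illustrative)
print(edge_adaptive_horizontal_filter(row, 2, coeffs, threshold=20))
```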
The output 1163 (fig. 121) of the horizontal filtering logic 934b may be provided to inputs of the vertical filtering logic 934c (illustrated in more detail in fig. 122) and the line buffer 7(1160 h). In the illustrated embodiment, the line buffer 7 is configured to provide a delay (W) before providing its input W7 as input W6 to the line buffer 6(1160 g). As shown in fig. 121, line buffer 6 is shared between DPC logic 932 and noise reduction vertical filter 934 c.
Thereafter, referring concurrently to FIGS. 120, 122, and 123, the upper subset of line buffers, i.e., line buffers 0-5 (1160a-1160f), is shared among the noise reduction vertical filter 934c (shown in FIG. 122), the lens shading correction logic 936, the GOC2 logic 938, and the demosaicing logic 940. For example, the output of line buffer 5 (1160f), which provides a delay (w), is provided to line buffer 4 (1160e). Vertical filtering is performed in line buffer 4, and the output W3 of the portion of the vertical filter 934c in line buffer 4 is provided to line buffer 3 (1160d) and downstream to the respective portions of the lens shading correction logic 936, the GOC2 logic 938, and the demosaicing logic 940 that share line buffer 4. In the present embodiment, the vertical filter logic 934c may include 5 taps 1166a-1166e (FIG. 122), but may be configured to operate in both a partially recursive (infinite impulse response (IIR)) mode and a non-recursive (FIR) mode. For example, when all 5 taps are utilized, such that tap 1166c is the center tap, the vertical filter logic 934c operates in a partially recursive IIR mode. The present embodiment may also choose to utilize 3 of the 5 taps, i.e., taps 1166c-1166e (with tap 1166d as the center tap), to operate the vertical filter logic 934c in a non-recursive (FIR) mode. In one embodiment, the vertical filtering mode may be specified using a configuration register associated with the noise reduction logic 934.
Line buffer 3 then receives the W3 input signal and provides a delay (w) before outputting W2 to line buffer 2 (1160c) and downstream to the respective portions of the lens shading correction logic 936, the GOC2 logic 938, and the demosaicing logic 940 that share line buffer 3. As shown, line buffer 2 is also shared among the vertical filter 934c, the lens shading correction logic 936, the GOC2 logic 938, and the demosaicing logic 940, and provides output W1 to line buffer 1 (1160b). Similarly, line buffer 1 is also shared among the vertical filter 934c, the lens shading correction logic 936, the GOC2 logic 938, and the demosaicing logic 940, and provides output W0 to line buffer 0 (1160a). The output 910 of the demosaicing logic 940 may be provided downstream to the RGB processing logic 902 for additional processing, as described further below.
It should be appreciated that the illustrated embodiment arranges the line buffers in a shared manner, such that different processing units may utilize the shared line buffers simultaneously, which may significantly reduce the number of line buffers required to implement the raw processing logic 900. It will be appreciated that this may reduce the physical hardware area required to implement the image processing circuitry 32, thereby reducing overall design and manufacturing costs. For example, the presently illustrated technique of sharing line buffers between different processing components may, in some embodiments, reduce the number of line buffers required by more than 40-50% when compared to conventional embodiments that do not share line buffers. Furthermore, while the presently illustrated embodiment of the raw pixel processing logic 900 shown in FIG. 120 utilizes 10 line buffers, it should be understood that fewer or more line buffers may be utilized in other embodiments. That is, the embodiment shown in fig. 120 is merely intended to illustrate the principle of sharing line buffers among multiple processing units, and should not be construed as limiting the present technique to the raw pixel processing logic 900. Indeed, the aspects of the disclosure shown in fig. 120 may be implemented in any logic block of the ISP subsystem 32.
FIG. 124 is a flow chart illustrating a method 1167 for processing raw pixel data in accordance with the line buffer structure shown in FIGS. 120-123. Beginning at step 1168, the line buffers of the raw pixel processing logic 900 may receive raw pixel data (e.g., from the ISP front end 80, the memory 108, or both). At step 1169, a first set of gain, offset, and clamp (GOC1) parameters is applied to the raw pixel data. Next, at step 1170, defective pixel detection and correction is performed using a first subset of the line buffers (e.g., line buffers 6-9 in FIG. 120). Thereafter, at step 1171, green non-uniformity (GNU) correction is applied using at least one line buffer (e.g., line buffer 9) of the first subset of line buffers. Thereafter, as shown at step 1172, horizontal filtering for noise reduction is applied, again using at least one line buffer of the first subset. In the embodiment shown in fig. 120, the line buffers of the first subset used for GNU correction and for horizontal filtering may be different.
The method 1167 then proceeds to step 1173, where vertical filtering for noise reduction is applied using at least one line buffer of the first subset and at least a portion of a second subset of the line buffers of the raw pixel processing logic 900 (e.g., line buffers 0-5). For example, as described above, depending on the vertical filtering mode (e.g., recursive or non-recursive), either a portion or all of the second subset of line buffers may be used. Further, in one embodiment, the second subset may include the remaining line buffers not included in the first subset of line buffers of step 1170. At step 1174, lens shading correction is applied to the raw pixel data using the second subset of line buffers. Subsequently, at step 1175, a second set of gain, offset, and clamp (GOC2) parameters is applied using the second subset of line buffers, which is then also used for demosaicing the raw image data, as shown at step 1176. The demosaiced RGB color data is then sent downstream at step 1177 for additional processing by the RGB processing logic 902, as described in further detail below.
Referring back to fig. 98, having described in detail the operation of the raw pixel processing logic 900, which may output RGB image signals 910, the processing of the RGB image signals 910 by the RGB processing logic 902 will now be described. As shown, the RGB image signal 910 may be sent to the selection logic 914 and/or the memory 108. The RGB processing logic 902 may receive an input signal 916, which depending on the configuration of the selection logic 914, the input signal 916 may be RGB image data from the signal 910 or from the memory 108, as shown by signal 912. The RGB image data 916 may be processed by the RGB processing logic 902 to perform color adjustment operations, including color correction (e.g., using a color correction matrix), application of color gains for automatic white balancing, and global tone mapping, among others.
FIG. 125 is a more detailed diagram of one embodiment of the RGB processing logic 902. As shown, the RGB processing logic 902 includes gain, offset, and clamp (GOC) logic 1178, RGB color correction logic 1179, a second gain, offset, and clamp (GOC) logic block 1180, RGB gamma adjustment logic 1181, and color space transform logic 1182. The input signal 916 is first received by the gain, offset, and clamp (GOC) logic 1178. In the illustrated embodiment, the GOC logic 1178 may apply gains to perform automatic white balancing on one or more of the R, G, or B color channels before the color correction logic 1179 processes them.
The GOC logic 1178 is similar to the GOC logic 930 of the raw pixel processing logic 900, except that it processes the color components of the RGB domain rather than the R, B, Gr, and Gb components of the Bayer image data. In operation, the input value for the current pixel is first offset by a signed value O[c] and then multiplied by a gain G[c], as shown in equation 11 above, where c represents R, G, and B. As described above, the gain G[c] may be a 16-bit unsigned number having 2 integer bits and 14 fraction bits (e.g., 2.14 in floating-point notation), and the value of the gain G[c] may be determined beforehand during statistics processing (e.g., in the ISP front-end block 80). The calculated pixel value Y (in accordance with equation 11) is then clipped to a minimum and maximum range in accordance with equation 12. As described above, the variables min[c] and max[c] may represent signed 16-bit clipping values for the minimum and maximum output values, respectively. In one embodiment, the GOC logic 1178 may also be configured to maintain, for each color component R, G, and B, a count of the number of pixels that were clipped above the maximum value and below the minimum value, respectively.
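A minimal sketch of the offset/gain/clip sequence of equations 11 and 12, as applied by the GOC logic, is shown below; the fixed-point 2.14 gain is modeled with a plain float, and the clip range is an illustrative choice.

```python
def goc(pixel, offset, gain, min_val=0, max_val=1023):
    """Gain, offset, and clamp (equations 11-12): the input is offset by a
    signed value, multiplied by a gain, then clipped to [min_val, max_val].
    The 2.14 unsigned fixed-point gain is modeled here as a float."""
    y = (pixel + offset) * gain          # equation 11: offset, then gain
    return max(min_val, min(max_val, y)) # equation 12: clip to the output range

print(goc(pixel=500, offset=-16, gain=1.25))   # -> 605.0
```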
The output of the GOC logic 1178 is then forwarded to the color correction logic 1179. In accordance with the presently disclosed techniques, the color correction logic 1179 may be configured to apply color correction to the RGB image data using a color correction matrix (CCM). In one embodiment, the CCM may be a 3 x 3 RGB transform matrix, although matrices of other dimensions (e.g., 4 x 3, etc.) may be utilized in other embodiments. Thus, the process of performing color correction on an input pixel having R, G, and B components may be expressed as follows:
where R, G, and B represent the current red, green, and blue values of the input pixel, CCM00-CCM22 represent the coefficients of the color correction matrix, and R', G', and B' represent the corrected red, green, and blue values of the input pixel. Thus, the corrected color values may be calculated according to the following equations 97-99:
R′=(CCM00×R)+(CCM01×G)+(CCM02×B) (97)
G′=(CCM10×R)+(CCM11×G)+(CCM12×B) (98)
B′=(CCM20×R)+(CCM21×G)+(CCM22×B) (99)
As described above, the coefficients of the CCM (CCM00-CCM22) may be determined during statistics processing in the ISP front-end block 80. In one embodiment, the coefficients for a given color channel may be selected such that the sum of those coefficients (e.g., CCM00, CCM01, and CCM02 for red correction) equals 1, which helps to maintain brightness and color balance. Furthermore, the coefficients are typically selected such that a positive gain is applied to the color being corrected. For example, for red correction, the coefficient CCM00 may be greater than 1, while one or both of the coefficients CCM01 and CCM02 may be less than 1. Setting the coefficients in this manner may enhance the red (R) component in the final corrected R' value while subtracting some of the blue (B) and green (G) components. It will be appreciated that this may address the problem of color overlap that occurs during acquisition of the initial Bayer image, as a portion of the filtered light for a pixel of a particular color may "bleed" into neighboring pixels of a different color. In one embodiment, the coefficients of the CCM may be provided as 16-bit 2's complement values with 4 integer bits and 12 fraction bits (represented as 4.12 in floating-point notation). Additionally, the color correction logic 1179 may provide clipping if the calculated corrected color value exceeds a maximum value or falls below a minimum value.
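The following Python sketch applies a 3 x 3 CCM per equations 97-99 and clips the result; the example coefficients (rows summing to 1, with an on-diagonal gain greater than 1 and small negative off-diagonal terms) are illustrative only.

```python
import numpy as np

def apply_ccm(rgb, ccm):
    """Apply a 3x3 color correction matrix (equations 97-99) and clip the
    result; the clip range and example coefficients are illustrative."""
    corrected = ccm @ np.asarray(rgb, dtype=float)
    return np.clip(corrected, 0, 1023)

# Example CCM: each row sums to 1 to preserve brightness, with a >1 on-diagonal
# gain and small negative off-diagonal terms, as discussed above.
ccm = np.array([[ 1.20, -0.10, -0.10],
                [-0.05,  1.10, -0.05],
                [-0.08, -0.12,  1.20]])
print(apply_ccm([400, 380, 360], ccm))
```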
The output of the RGB color correction logic 1179 is then passed to another GOC logic block 1180. GOC logic block 1180 may be implemented in the same manner as GOC logic 1178 and, thus, a detailed description of the gain, offset, and clamp functions provided is not repeated here. In one embodiment, applying GOC logic 1180 after color correction may provide automatic white balancing of the image data based on the corrected color values, and may also adjust for sensor variations in the red-to-green and blue-to-green ratios.
The output of the GOC logic 1180 is then sent to the RGB gamma adjustment logic 1181 for further processing. For example, the RGB gamma adjustment logic 1181 may provide gamma correction, tone mapping, histogram matching, and so forth. In accordance with the disclosed embodiments, the gamma adjustment logic 1181 may provide a mapping of the input RGB values to corresponding output RGB values. For example, the gamma adjustment logic may provide a set of three lookup tables, one for each of the R, G, and B components. For example, each lookup table may be configured to store 256 entries of 10-bit values, each value representing an output level. The table entries may be evenly distributed over the range of input pixel values, so that when an input value falls between two entries, the output value may be linearly interpolated. In one embodiment, each of the three lookup tables for the R, G, and B components may be duplicated, so that the lookup tables are "double buffered" in memory, allowing one table to be used during processing while its copy is being updated. In view of the 10-bit output values discussed above, it should be noted that the 14-bit RGB image signal is effectively down-sampled to 10 bits as a result of the gamma correction process in this embodiment.
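The per-channel lookup with linear interpolation between entries described above might look like the following Python sketch; the 14-bit input range, the 256-entry gamma-2.2-style table, and the 10-bit outputs are assumptions for illustration.

```python
def gamma_lookup(value, table, in_max=16383):
    """Sketch of a per-channel gamma lookup: 256 evenly spaced entries over the
    input range, with linear interpolation between the two surrounding entries.
    The table contents and the 14-bit input range are illustrative."""
    pos = value * (len(table) - 1) / in_max      # fractional table position
    lo = int(pos)
    hi = min(lo + 1, len(table) - 1)
    frac = pos - lo
    return (1 - frac) * table[lo] + frac * table[hi]

# A simple gamma-2.2-like table with 256 entries mapping to 10-bit outputs.
table = [round(1023 * (i / 255) ** (1 / 2.2)) for i in range(256)]
print(gamma_lookup(8192, table))   # mid-scale 14-bit input
```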
The output of the gamma adjustment logic 1181 may be sent to the memory 108 and/or the color space transformation logic 1182. The color space transform (CSC) logic 1182 may be configured to convert the RGB output from the gamma adjustment logic 1181 into a YCbCr format, where Y represents a luminance component, Cb represents a blue color difference chrominance component, and Cr represents a red color difference chrominance component, and Y, Cb and Cr may each adopt a 10-bit format as a result of a bit-depth transform of RGB data from 14 bits to 10 bits during a gamma adjustment operation. As described above, in one embodiment, the RGB output of the gamma adjustment logic 1181 may be downsampled to 10 bits, which are then converted by the CSC logic 1182 to 10-bit YCbCr values, which may then be forwarded to YCbCr processing logic 904, described further below.
The transformation from the RGB domain into the YCbCr color space may be performed using a color space transformation matrix (CSCM). For example, in one embodiment, the CSCM may be a 3 x 3 transform matrix. The coefficients of the CSCM may be set according to known transform formulas such as the bt.601 and bt.709 standards. In addition, the CSCM coefficients may be flexible depending on the desired input and output ranges. Thus, in some embodiments, CSCM coefficients may be determined and programmed from data collected during statistical information processing in ISP front-end block 80.
The process of YCbCr color space transformation of RGB input pixels can be represented as follows:
where R, G and B represent the current red, green, and blue values of the input pixel in 10-bit form (e.g., processed by gamma adjustment logic 1181), CSCM 00-CSCM 22 represent the coefficients of a color space transform matrix, and Y, Cb and Cr represent the resulting luminance and chrominance components of the input pixel. Thus, the Y, Cb and Cr values can be calculated according to the following equations 101-103:
Y=(CSCM00×R)+(CSCM01×G)+(CSCM02×B) (101)
Cb=(CSCM10×R)+(CSCM11×G)+(CSCM12×B) (102)
Cr=(CSCM20×R)+(CSCM21×G)+(CSCM22×B) (103)
after the color space transform operation, the resulting YCbCr values may be output from CSC logic 1182 in the form of signal 918, which signal 918 may be processed by YCbCr processing logic 904, as described below.
In one embodiment, the coefficients of the CSCM may be 16-bit 2's complement numbers with 4 integer bits and 12 fraction bits (4.12). In another embodiment, the CSC logic 1182 may be further configured to apply an offset to each of the Y, Cb, and Cr values and to clip the resulting values to a minimum and a maximum value. For example, assuming the YCbCr values are in 10-bit format, the offset may be in the range of -512 to 512, and the minimum and maximum values may be 0 and 1023, respectively.
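A hedged sketch of the RGB-to-YCbCr step of equations 101-103, followed by the offset and clamping just described, is shown below; the BT.601-style coefficients and the chroma offsets are illustrative, as the actual CSCM coefficients would be programmed from front-end statistics.

```python
import numpy as np

def rgb_to_ycbcr(rgb, cscm, offsets=(0, 512, 512), min_val=0, max_val=1023):
    """RGB-to-YCbCr transform per equations 101-103, followed by per-component
    offsets and clamping. Coefficients and offsets here are illustrative."""
    ycc = np.asarray(cscm, dtype=float) @ np.asarray(rgb, dtype=float)
    ycc += np.asarray(offsets, dtype=float)      # center Cb/Cr for 10-bit data
    return np.clip(ycc, min_val, max_val)

cscm = np.array([[ 0.299,  0.587,  0.114],
                 [-0.169, -0.331,  0.500],
                 [ 0.500, -0.419, -0.081]])
print(rgb_to_ycbcr([800, 600, 400], cscm))
```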
Referring again to the block diagram of ISP pipeline logic 82 in fig. 98, YCbCr signal 918 may be sent to selection logic 922 and/or memory 108. YCbCr processing logic 904 may receive input signal 924, which input signal 924 may be YCbCr image data from signal 918 or from memory 108 (as shown by signal 920) depending on the configuration of selection logic 922. YCbCr image data 924 may then be processed by YCbCr processing logic 904 for luma sharpening, chroma suppression, chroma noise reduction, and luma, contrast, and color adjustment, among others. In addition, YCbCr processing logic 904 may provide gamma mapping and scaling of the processed image data along the horizontal and vertical directions.
Fig. 126 is a block diagram depicting an embodiment of YCbCr processing logic 904 in greater detail. As shown, YCbCr processing logic 904 includes image sharpening logic 1183, logic for adjusting brightness, contrast, and/or color 1184, YCbCr gamma adjustment logic 1185, chroma decimation logic 1186, and scaling logic 1187. YCbCr processing logic 904 may be configured to process pixel data in 4:4:4, 4:2:2, or 4:2:0 formats using 1-plane, 2-plane, or 3-plane memory structures. Further, in one embodiment, the YCbCr input signal 924 may provide luminance and chrominance information in the form of a 10-bit value.
It should be understood that the references to 1-plane, 2-plane, or 3-plane refer to the number of imaging planes utilized in picture memory. For example, in a 3-plane format, each of the Y, Cb, and Cr components may utilize a separate memory plane. In a 2-plane format, a first plane may be provided for the luma component (Y), and a second plane interleaving the Cb and Cr samples may be provided for the chroma components (Cb and Cr). In a 1-plane format, a single plane in memory is interleaved with luma and chroma samples. Further, with respect to the 4:4:4, 4:2:2, and 4:2:0 formats, it will be appreciated that the 4:4:4 format refers to a sampling format in which each of the three YCbCr components is sampled at the same rate. In the 4:2:2 format, the chroma components Cb and Cr are sub-sampled at half the sampling rate of the luma component Y, thereby halving the resolution of the chroma components Cb and Cr in the horizontal direction. Similarly, the 4:2:0 format sub-samples the chroma components Cb and Cr in both the vertical and horizontal directions.
The processing of the YCbCr information may be performed within an active source region defined in a source buffer, where the active source region contains "valid" pixel data. For example, referring to fig. 127, a source buffer 1188 is illustrated having an active source region 1189 defined therein. In the illustrated example, the source buffer may represent a 4:4:4 1-plane format of source pixels providing 10-bit values. The active source region 1189 may be defined separately for the luma (Y) samples and the chroma samples (Cb and Cr). Thus, it should be appreciated that the active source region 1189 may actually include multiple active source regions for the luma and chroma samples. The starting points of the luma and chroma active source regions 1189 may be determined based on offsets from the base address (0, 0) 1190 of the source buffer. For example, the starting position (Lm_X, Lm_Y) 1191 of the luma active source region may be defined by an x-offset 1193 and a y-offset 1196 with respect to the base address 1190. Similarly, the starting position (Ch_X, Ch_Y) 1192 of the chroma active source region may be defined by an x-offset 1194 and a y-offset 1198 with respect to the base address 1190. It should be noted that, in this example, the y-offsets 1196 and 1198 for luma and chroma, respectively, may be equal. Based on the starting position 1191, the luma active source region may be defined by a width 1195 and a height 1200, which may represent the number of luma samples in the x-direction and the y-direction, respectively. Additionally, based on the starting position 1192, the chroma active source region may be defined by a width 1202 and a height 1204, which represent the number of chroma samples in the x-direction and the y-direction, respectively.
Fig. 128 also provides an example of how the active source regions for luma samples and chroma samples may be determined in a 2-plane format. For example, as shown, luma activation source region 1189 may be defined in first source buffer 1188 (having base address 1190) in a region defined by width 1195 and height 1200 relative to starting position 1191. The chroma active source region 1208 may be defined in the second source buffer 1206 (with base address 1190) in the form of a region defined by a width 1202 and a height 1204 relative to a starting position 1192.
With the above in mind, and referring back to fig. 126, YCbCr signals 924 are first received by image sharpening logic 1183. Image sharpening logic 1183 may be configured to perform picture sharpening and edge enhancement processing to enhance texture and edge details in an image. It is appreciated that image sharpening may increase perceived image resolution. However, it is generally desirable that the existing noise in the image is not detected as texture and/or edges and is not amplified in the sharpening process.
In accordance with the present techniques, the image sharpening logic 1183 may perform image sharpening on the luma (Y) component of the YCbCr signal using a multi-scale unsharp mask filter. In one embodiment, two or more low-pass gaussian filters of different scale sizes may be provided. For example, in an embodiment providing two gaussian filters, the output of a first gaussian filter having a first radius (x) (e.g., a gaussian blur) is subtracted from the output of a second gaussian filter having a second radius (y), where x is greater than y, to produce an unsharp mask. Additional unsharp masks may also be obtained by subtracting the gaussian filter outputs from the Y input. In certain embodiments, the techniques may also provide an adaptive coring threshold comparison operation that may utilize the unsharp masks, such that, depending on the results of the comparisons, gain amounts may be added to a base image, which may be selected as either the initial Y input image or the output of one of the gaussian filters, to produce the final output.
Referring to FIG. 129, FIG. 129 is a block diagram illustrating exemplary logic 1210 for performing image sharpening in accordance with an embodiment of the presently disclosed technology. Logic 1210 represents a multi-scale unsharp filtering mask that may be applied to the input luminance image Yin. For example, as shown, Yin is received and processed by two low-pass gaussian filters 1212(G1) and 1214 (G2). In this example, filter 1212 may be a 3 × 3 filter and filter 1214 may be a 5 × 5 filter. However, it should be understood that in other embodiments, more than two gaussian filters may be used, including filters of different scales (e.g., 7 × 7, 9 × 9, etc.). It will be appreciated that due to the low pass filtering process, high frequency components, which typically correspond to noise, may be removed from the outputs of G1 and G2, resulting in "unsharpened" images (G1out and G2 out). As described below, using the unsharp input image as the base image allows noise reduction as part of the sharpening filter.
The 3 × 3 gaussian filter 1212 and the 5 × 5 gaussian filter 1214 may be defined as follows:
For example, in one embodiment, the values of gaussian filters G1 and G2 may be selected as follows:
from Yin, G1out, and G2out, 3 unsharp masks Sharp1, Sharp2, and Sharp3 may be generated. Sharp1 may be determined to subtract the unsharpened image G2out of gaussian filter 1214 from the unsharpened image G1out of gaussian filter 1212. Since Sharp1 is actually the difference between the two low pass filters, it can be referred to as a "mid-band" mask because in the G1out and G2out unsharpened images, the higher frequency noise components have been filtered out. In addition, by subtracting G2out from the input luminance image Yin, Sharp2 can be calculated, and by subtracting G1out from the input luminance image Yin, Sharp3 can be calculated. As described below, an adaptive thresholding scheme may be applied by using unsharp masks Sharp1, Sharp2, and Sharp 3.
Referring to the selection logic 1216, a base image may be selected based on the control signal UnsharpSel. In the illustrated embodiment, the base image may be either the input image Yin or one of the filtered outputs G1out or G2out. As will be appreciated, when the initial image has a high noise variance (e.g., almost as high as the signal variance), using the initial image Yin as the base image during sharpening may not sufficiently reduce the noise components during sharpening. Accordingly, when a particular threshold of noise content is detected in the input image, the selection logic 1216 may be adapted to select one of the low-pass filtered outputs G1out or G2out, from which high frequency content that may include noise has been reduced. In one embodiment, the value of the control signal UnsharpSel may be determined by analyzing statistical data acquired during statistics processing in the ISP front-end block 80 to determine the noise content of the image. For example, if the input image Yin has low noise content, such that apparent noise is unlikely to increase as a result of the sharpening process, the input image Yin may be selected as the base image (e.g., UnsharpSel = 0). If the input image Yin is determined to contain a noticeable level of noise, such that the sharpening process may amplify the noise, one of the filtered outputs G1out or G2out may be selected (e.g., UnsharpSel = 1 or 2, respectively). Thus, by applying an adaptive technique for selecting the base image, the logic 1210 essentially provides a noise reduction function.
Next, gains may be applied to one or more of the Sharp1, Sharp2, and Sharp3 masks in accordance with an adaptive coring threshold scheme, as described below. The unsharp values Sharp1, Sharp2, and Sharp3 may be compared against the respective thresholds SharpThd1, SharpThd2, and SharpThd3 (not necessarily respectively) using the comparator blocks 1218, 1220, and 1222. For example, the Sharp1 value is always compared against SharpThd1 at the comparator block 1218. With respect to the comparator block 1220, the threshold SharpThd2 may be compared against either Sharp1 or Sharp2, depending on the selection logic 1226. For example, the selection logic 1226 may select Sharp1 or Sharp2 based on the state of the control signal SharpCmp2 (e.g., SharpCmp2 = 1 selects Sharp1; SharpCmp2 = 0 selects Sharp2). For example, in one embodiment, the state of SharpCmp2 may be determined based on the noise variance/content of the input image (Yin).
In the illustrated embodiment, it is generally preferable to set the values of SharpCmp2 and SharpCmp3 to select Sharp1 unless the image data is detected to have a relatively low amount of noise. This is because Sharp1, which is the difference between the outputs of gaussian low pass filters G1 and G2, is generally less sensitive to noise, thereby helping to reduce the amount of change in the values of Sharp1, Sharp2, and Sharp3 due to noise level fluctuations in the "noisy" image data. For example, if the initial image has a high noise variance, some high frequency components may not be captured when a fixed threshold is utilized and thus may be amplified during the sharpening process. Thus, if the noise component of the input image is high, some noise component may be present in Sharp 2. In this case, the SharpCmp2 may be set to 1 to select the mid-band mask Sharp1, which, as described above, has a reduced high frequency content (because it is the difference of the two low pass filter outputs) and is thus less sensitive to noise.
It will be appreciated that a similar process may be applied to the selection of Sharp1 or Sharp3 by the selection logic 1224 under the control of SharpCmp3. In one embodiment, SharpCmp2 and SharpCmp3 may be set to 1 by default (e.g., to use Sharp1) and set to 0 only for input images identified as having generally low noise variance. This essentially provides an adaptive coring threshold scheme in which the selection of the comparison value (Sharp1, Sharp2, or Sharp3) adapts to the noise variance of the input image.
Based on the outputs of the comparator blocks 1218, 1220, and 1222, the sharpened output image Ysharp may be determined by applying gained unsharp masks to the base image (e.g., the base image selected via logic 1216). For example, referring first to the comparator block 1222, the threshold SharpThd3 is compared against the B input provided by the selection logic 1224, referred to herein as "SharpAbs," which may be equal to Sharp1 or Sharp3 depending on the state of SharpCmp3. If SharpAbs is greater than the threshold SharpThd3, the gain SharpAmt3 is applied to Sharp3, and the resulting value is added to the base image. If SharpAbs is less than the threshold SharpThd3, an attenuated gain Att3 may be applied. In one embodiment, the attenuated gain Att3 may be determined as follows:
where SharpAbs is either Sharp1 or Sharp3, as determined by the selection logic 1224. The selection of whether the full gain (SharpAmt3) or the attenuated gain (Att3) is summed with the base image is made by the selection logic 1228 based on the output of the comparator block 1222. It will be appreciated that the use of an attenuated gain may address situations in which SharpAbs is not greater than the threshold (e.g., SharpThd3), but the noise variance of the image is nonetheless close to the given threshold. This may help to reduce abrupt transitions between sharpened and unsharpened pixels. For example, if the image data were passed through without the attenuated gain in such a case, the resulting pixel could appear as a defective pixel (e.g., a bright spot).
Next, a similar process may be applied with respect to the comparator block 1220. For example, depending on the state of SharpCmp2, the selection logic 1226 may provide either Sharp1 or Sharp2 as the input to the comparator block 1220 to be compared against the threshold SharpThd2. Depending on the output of the comparator block 1220, either the gain SharpAmt2 or an attenuated gain Att2 based on SharpAmt2 is applied to Sharp2 and added to the output of the selection logic 1228 described above. It will be appreciated that the attenuated gain Att2 may be calculated in a manner similar to equation 104 above, except that the gain SharpAmt2 and the threshold SharpThd2 are applied with respect to SharpAbs, which may be selected as Sharp1 or Sharp2.
Thereafter, either the gain SharpAmt1 or an attenuated gain Att1 is applied to Sharp1, and the resulting value is summed with the output of the selection logic 1230 to produce the sharpened pixel output Ysharp (from selection logic 1232). The selection of whether to apply the gain SharpAmt1 or the attenuated gain Att1 may be determined based on the output of the comparator block 1218, which compares Sharp1 against the threshold SharpThd1. Again, the attenuated gain Att1 may be determined in a manner similar to equation 104 above, except that the gain SharpAmt1 and the threshold SharpThd1 are applied with respect to Sharp1. The resulting sharpened pixel values scaled using each of the three masks are summed with the input pixel Yin to produce the sharpened output Ysharp, which, in one embodiment, may be clipped to 10 bits (assuming YCbCr processing is performed at 10-bit precision).
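Putting the pieces together, the following Python sketch chains the three masks onto the selected base image using full or attenuated gains; the attenuation form gain x SharpAbs / SharpThd is an assumption standing in for equation 104, and the threshold values, gain values, and the use of each mask as its own comparison value are illustrative.

```python
def apply_mask(base, sharp_val, sharp_abs, threshold, gain):
    """One sharpening stage: if the comparison value exceeds the threshold, the
    full gain is applied to the mask; otherwise an attenuated gain is used.
    The attenuation form gain * |sharp_abs| / threshold is an assumption
    standing in for equation 104, which is not reproduced in the text."""
    if abs(sharp_abs) > threshold:
        applied_gain = gain
    else:
        applied_gain = gain * abs(sharp_abs) / threshold   # attenuated gain
    return base + applied_gain * sharp_val

def sharpen_pixel(base, sharp1, sharp2, sharp3,
                  thd=(8, 16, 32), amt=(0.5, 0.75, 1.0)):
    """Chain the three masks (Sharp3, then Sharp2, then Sharp1) onto the
    selected base image to obtain YSharp, clipped to 10 bits."""
    out = apply_mask(base, sharp3, sharp3, thd[2], amt[2])
    out = apply_mask(out, sharp2, sharp2, thd[1], amt[1])
    out = apply_mask(out, sharp1, sharp1, thd[0], amt[0])
    return max(0, min(1023, out))

print(sharpen_pixel(base=498, sharp1=6, sharp2=20, sharp3=40))
```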
It can be appreciated that the image sharpening techniques set forth in this disclosure may improve enhancement of texture and edges while also reducing noise in the output image when compared to conventional unsharp masking techniques. In particular, the techniques of the present invention are particularly well suited for applications in which images captured with, for example, CMOS image sensors exhibit poor signal-to-noise ratios, such as images obtained under low lighting conditions with low resolution cameras integrated into portable devices (e.g., mobile phones). For example, when the noise variance and the signal variance are comparable, it is difficult to sharpen using a fixed threshold because some noise components may be sharpened along with texture and edges. Thus, as described above, the techniques provided herein may filter noise from an input image by extracting features from unsharpened images (e.g., G1out and G2out) using a multi-scale gaussian filter to provide a sharpened image with reduced noise content.
Before proceeding, it is to be appreciated that the illustrated logic 1210 is intended to provide only one exemplary embodiment of the present technique. In other embodiments, additional or fewer features may be provided by image sharpening logic 1183. For example, in some embodiments, logic 1210 may simply pass the base value rather than applying an attenuated gain. Additionally, some embodiments may not include the selection logic blocks 1224, 1226, or 1216. For example, the comparator blocks 1220 and 1222 may simply receive the Sharp2 and Sharp3 values, respectively, rather than the selected outputs from the selection logic blocks 1224 and 1226. While such embodiments may not provide sharpening and/or noise reduction features as robust as the implementation shown in FIG. 129, it should be understood that such design choices may be the result of cost and/or business-related constraints.
In this embodiment, once the sharpened image output YSharp is obtained, the image sharpening logic 1183 may also provide edge enhancement and chroma suppression features. These additional features are discussed separately below. Referring first to FIG. 130, exemplary logic 1234 that performs edge enhancement, and that may be implemented downstream of the sharpening logic 1210 of FIG. 129, is illustrated in accordance with one embodiment. As shown, a Sobel filter 1236 processes the initial input value Yin for edge detection. The Sobel filter 1236 may determine a gradient value YEdge from a 3 × 3 pixel block (hereinafter referred to as "A") of the initial image, with Yin being the center pixel of the 3 × 3 block. In one embodiment, the Sobel filter 1236 may compute YEdge by convolving the initial image data to detect changes in the horizontal and vertical directions. This process is shown in equations 105 to 107 below.
Gx=Sx×A, (105)
Gy=Sy×A, (106)
YEdge=Gx×Gy, (107)
where Sx and Sy represent matrix operators for gradient edge-strength detection in the horizontal and vertical directions, respectively, and Gx and Gy represent gradient images containing the derivatives of horizontal and vertical changes, respectively. Thus, the output YEdge is determined as the product of Gx and Gy.
YEdge is then received by selection logic 1240 along with the mid-band Sharp1 mask discussed above in FIG. 129. In response to the control signal EdgeCmp, either Sharp1 or YEdge is compared to a threshold EdgeThd at comparator block 1238. For example, the state of EdgeCmp may be determined based on the noise content of the image, thereby providing an adaptive coring threshold scheme for edge detection and enhancement. The output of comparator block 1238 may then be provided to selection logic 1242, and either a full gain or an attenuated gain may be applied. For example, when the B input to comparator block 1238 (Sharp1 or YEdge) is above EdgeThd, YEdge is multiplied by the edge gain EdgeAmt to determine the amount of edge enhancement to be applied. If the B input at comparator block 1238 is less than EdgeThd, an attenuated edge gain AttEdge may be applied to avoid sharp transitions between the edge-enhanced pixel and the original pixel. It will be appreciated that AttEdge may be calculated in a manner similar to equation 104 above, but with EdgeAmt and EdgeThd applied to "SharpAbs" (which may be Sharp1 or YEdge, depending on the output of selection logic 1240). Thus, the edge pixel enhanced with either the gain (EdgeAmt) or the attenuated gain (AttEdge) may be added to YSharp (the output of logic 1210 of FIG. 129) to obtain the edge-enhanced output pixel Yout, which, in one embodiment, may be clipped to 10 bits (assuming YCbCr processing at 10-bit precision).
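As a rough illustration of the edge-enhancement path of FIG. 130, the sketch below computes a Sobel-based YEdge value for a 3×3 neighborhood and applies either the full edge gain or an attenuated gain before adding the result to YSharp. The specific Sobel kernels, the attenuation form, and the parameter values are assumptions used only for illustration.

```python
import numpy as np

# Conventional 3x3 Sobel operators; assumed forms of Sx and Sy.
SX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.int32)
SY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.int32)

def edge_enhance(a, y_sharp, sharp1, edge_thd=16, edge_amt=0.5,
                 edge_cmp_uses_sharp1=False):
    """Edge-enhance the center pixel of the 3x3 luma block `a` (eqs. 105-107).

    a       -- 3x3 numpy array of luma values around the current pixel Yin
    y_sharp -- sharpened luma output of logic 1210 for this pixel
    sharp1  -- mid-band Sharp1 mask value (alternative comparator input)
    """
    gx = int(np.sum(SX * a))          # equation 105
    gy = int(np.sum(SY * a))          # equation 106
    y_edge = gx * gy                  # equation 107: product of the gradients

    # Adaptive coring: EdgeCmp selects whether Sharp1 or YEdge is compared.
    b = abs(sharp1) if edge_cmp_uses_sharp1 else abs(y_edge)
    if b > edge_thd:
        gain = edge_amt
    else:
        # Assumed attenuation, analogous to equation 104.
        gain = (edge_amt * b) / edge_thd if edge_thd else 0.0

    y_out = y_sharp + gain * y_edge
    return max(0, min(1023, int(round(y_out))))  # clip to 10 bits
```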
With regard to the chroma suppression features provided by image sharpening logic 1183, such features may attenuate chroma at luma edges. Generally, chroma suppression may be performed by applying a chroma gain (attenuation factor) of less than 1 depending on the value (YSharp or Yout) obtained from the luma sharpening and/or edge enhancement steps described above. For example, FIG. 131 shows a graph 1250 that includes a curve 1252 representing the chroma gain that may be selected for a corresponding sharpened luminance value (YSharp). The data represented by graph 1250 may be implemented as a lookup table of YSharp values and corresponding chroma gains (attenuation factors) between 0 and 1, with the lookup table used to approximate the curve 1252. For YSharp values that fall between two entries in the lookup table, linear interpolation may be applied to the two attenuation factors corresponding to the YSharp values above and below the current YSharp value. Furthermore, in other embodiments, the input luminance value may also be selected as one of the Sharp1, Sharp2, or Sharp3 values determined by logic 1210, as shown above in FIG. 129, or the YEdge value determined by logic 1234, as shown in FIG. 130.
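The lookup-and-interpolate step for the chroma attenuation factor might look like the following minimal sketch; the table contents and size are placeholders, and only the interpolation mechanics mirror the description.

```python
def chroma_attenuation(y_sharp, lut, max_y=1023):
    """Return a chroma gain in [0, 1] for a sharpened luma value.

    lut -- attenuation factors sampled evenly over 0..max_y, approximating
           curve 1252 (the values used here are placeholders).
    """
    pos = (y_sharp / max_y) * (len(lut) - 1)   # fractional table position
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    frac = pos - lo
    # Linear interpolation between the entries just below and above YSharp.
    return lut[lo] * (1.0 - frac) + lut[hi] * frac

# Usage sketch: attenuate Cb/Cr near strong luma edges.
# lut = [1.0, 0.9, 0.6, 0.3, 0.2]        # illustrative shape only
# cb_out = cb_in * chroma_attenuation(y_sharp, lut)
```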
The output of the image sharpening logic 1183 (FIG. 126) is then processed by the brightness, contrast, and color (BCC) adjustment logic 1184. FIG. 132 is a functional block diagram illustrating an embodiment of the BCC adjustment logic 1184. As shown, the logic 1184 includes a brightness and contrast processing block 1262, a global hue control block 1264, and a saturation control block 1266. The presently illustrated embodiment provides processing of YCbCr data with 10-bit precision, although other embodiments may utilize different bit depths. The functionality of each of blocks 1262, 1264, and 1266 is discussed below.
Referring first to the brightness and contrast processing block 1262, an offset YOffset is first subtracted from the luminance (Y) data to set the black level to 0. This is done to ensure that the contrast adjustment does not alter the black level. Subsequently, the luminance value is multiplied by a contrast gain value to apply contrast control. For example, the contrast gain value may be a 12-bit unsigned value with 2 integer bits and 10 fractional bits, providing a contrast gain range of up to 4 times the pixel value. Brightness adjustment can then be achieved by adding (or subtracting) a brightness offset value to the luminance data. For example, the brightness offset in the present embodiment may be a 10-bit two's complement value in the range of -512 to +512. Further, it should be noted that the brightness adjustment is performed after the contrast adjustment in order to avoid varying the DC offset when changing the contrast. Finally, the initial YOffset is added back to the adjusted luminance data, thereby restoring the black level.
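The order of operations in the brightness and contrast block can be sketched as follows. The fixed-point widths follow the description (a 2.10 unsigned contrast gain and a 10-bit signed brightness offset), while the default YOffset and the example values are assumptions.

```python
def brightness_contrast(y, contrast_gain, brightness, y_offset=64):
    """Apply contrast then brightness to a 10-bit luma sample.

    contrast_gain -- unsigned 2.10 fixed point (0..4095), i.e. gain = value / 1024
    brightness    -- signed offset in the range -512..+512
    y_offset      -- YOffset used to pin the black level during the adjustment
    """
    v = y - y_offset                  # set the black level to 0
    v = (v * contrast_gain) >> 10     # contrast: up to 4x gain (2 int, 10 frac bits)
    v = v + brightness                # brightness after contrast keeps the DC offset stable
    v = v + y_offset                  # restore the original black level
    return max(0, min(1023, v))

# Example: modest contrast boost (1.25x) and a small brightness lift.
# print(brightness_contrast(500, contrast_gain=1280, brightness=16))
```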
Blocks 1264 and 1266 provide color adjustment based on the hue characteristics of the Cb and Cr data. As shown, an offset of 512 is first subtracted from the Cb and Cr data (assuming 10-bit processing) so that the range is centered about zero. The hue is then adjusted according to the following formulas:
Cbadj=Cb cos(θ)+Cr sin(θ), (108)
Cradj=Cr cos(θ)-Cb sin(θ), (109)
where Cbadj and Cradj represent the adjusted Cb and Cr values, and θ represents the hue angle, which can be calculated as follows:
The above operations are performed by the logic within the global hue control block 1264 and may be represented by the following matrix operation:
where Ka is cos (θ), Kb is sin (θ), and θ is defined in the above equation 110.
Saturation control may then be applied to the Cbadj and Cradj values, as shown by the saturation control block 1266. In the illustrated embodiment, saturation control is performed by applying a global saturation multiplier and a hue-based saturation multiplier to each of the Cb and Cr values. Hue-based saturation control can improve color reproduction. The hue of a color may be represented in the YCbCr color space, as shown by the color wheel graph 1270 in FIG. 133. It will be appreciated that the YCbCr hue and saturation color wheel 1270 may be obtained by shifting the corresponding color wheel in the HSV color space (hue, saturation, and value) by approximately 109°. As shown, the color wheel graph 1270 includes a saturation multiplier (S) in the range of 0 to 1, and circumferential values representing the hue angle θ in the range of 0 to 360°, as defined above. Each θ may represent a different color (e.g., 49° = magenta, 109° = red, 229° = green, etc.). By selecting an appropriate saturation multiplier S, the saturation of a color at a particular hue angle θ can be adjusted.
Referring back to FIG. 132, the hue angle θ (calculated in the global hue control block 1264) may be used as an index into the Cb saturation lookup table 1268 and the Cr saturation lookup table 1269. In one embodiment, the saturation lookup tables 1268 and 1269 may contain 256 saturation values evenly distributed over the 0°-360° hue range (e.g., the first lookup table entry is at 0° and the last entry is at 360°), and the saturation value S for a given pixel may be determined by linear interpolation of the lookup table saturation values just below and above the current hue angle θ. A final saturation value for each of the Cb and Cr components is obtained by multiplying a global saturation value (which may be a global constant for each of Cb and Cr) by the determined hue-based saturation value. Thus, by multiplying Cbadj and Cradj by their respective final saturation values, as shown in the hue-based saturation control block 1266, the final corrected Cb' and Cr' values can be determined.
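To make the interplay of the hue rotation (equations 108 and 109) and the hue-based saturation lookup concrete, a minimal Python sketch is given below. The use of arctan2(Cr, Cb) as the pixel hue angle is an assumption (equation 110 is not reproduced above), and the lookup table contents, angle convention, and fixed-point details are placeholders rather than the actual implementation of blocks 1264-1269.

```python
import math

def adjust_hue_saturation(cb, cr, hue_angle_deg, global_sat,
                          cb_sat_lut, cr_sat_lut):
    """Hue rotation (eqs. 108/109) followed by hue-based saturation control.

    cb, cr        -- chroma samples already re-centered about 0 (offset 512 removed)
    hue_angle_deg -- hue adjustment angle applied by block 1264
    global_sat    -- global saturation multiplier (constant per component)
    *_sat_lut     -- 256-entry tables covering 0..360 degrees (placeholder contents)
    """
    th = math.radians(hue_angle_deg)
    cb_adj = cb * math.cos(th) + cr * math.sin(th)   # equation 108
    cr_adj = cr * math.cos(th) - cb * math.sin(th)   # equation 109

    # Pixel hue angle used to index the saturation tables; arctan2(Cr, Cb) is
    # an assumption here since equation 110 is not reproduced in the text.
    pixel_hue = math.degrees(math.atan2(cr_adj, cb_adj)) % 360.0

    def lut_lookup(lut, angle):
        pos = angle / 360.0 * (len(lut) - 1)
        lo, frac = int(pos), pos - int(pos)
        hi = min(lo + 1, len(lut) - 1)
        return lut[lo] * (1.0 - frac) + lut[hi] * frac   # linear interpolation

    cb_final = cb_adj * global_sat * lut_lookup(cb_sat_lut, pixel_hue)
    cr_final = cr_adj * global_sat * lut_lookup(cr_sat_lut, pixel_hue)
    return cb_final, cr_final
```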
The output of the BCC logic 1184 is then passed to the YCbCr gamma adjustment logic 1185, as shown in FIG. 126. In one embodiment, the gamma adjustment logic 1185 may provide non-linear mapping functions for the Y, Cb, and Cr channels. For example, the input Y, Cb, and Cr values are mapped to corresponding output values. Again, assuming the YCbCr data is processed at 10 bits, an interpolated 10-bit 256-entry lookup table may be utilized. Three such lookup tables may be provided, one for each of the Y, Cb, and Cr channels. Each of the 256 input entries may be evenly distributed, and an output may be determined by linear interpolation of the output values mapped to the indices just above and below the current input index. In some embodiments, a non-interpolated lookup table having 1024 entries (for 10-bit data) may also be used, although such a lookup table may have significantly greater memory requirements. It will be appreciated that, by adjusting the lookup table output values, the YCbCr gamma adjustment function can also be used to achieve certain image filter effects, such as black and white, sepia (brown) tones, negatives, and so forth.
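As a rough illustration of the interpolated 256-entry gamma lookup described above, the following sketch builds a placeholder power-law table and applies it to a 10-bit sample. The actual table contents (gamma curves, black-and-white, sepia, or negative effects, and so on) are not specified by the text; the simple power function here is only an assumption.

```python
def build_gamma_lut(gamma=2.2, entries=256, max_out=1023):
    """Example 256-entry table encoding a simple power-law curve (placeholder)."""
    return [int(round(((i / (entries - 1)) ** (1.0 / gamma)) * max_out))
            for i in range(entries)]

def gamma_map(value, lut, max_in=1023):
    """Map a 10-bit input through the table with linear interpolation between entries."""
    pos = (value / max_in) * (len(lut) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    frac = pos - lo
    return int(round(lut[lo] * (1.0 - frac) + lut[hi] * frac))

# Separate tables would be used for the Y, Cb, and Cr channels.
# y_lut = build_gamma_lut()
# y_out = gamma_map(512, y_lut)
```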
Chroma decimation logic 1186 may then apply chroma decimation to the output of the gamma adjustment logic 1185. In one embodiment, the chroma decimation logic 1186 may be configured to perform horizontal decimation to convert the YCbCr data from a 4:4:4 format to a 4:2:2 format, in which the chroma (Cb and Cr) information is subsampled at half the rate of the luma data. For example, decimation may be achieved by applying a 7-tap low-pass filter, such as a half-band Lanczos filter, to a set of 7 horizontal pixels, as follows:
where in(i) represents the input pixel (Cb or Cr), and C0-C6 represent the filter coefficients of the 7-tap filter. Each input pixel has an independent filter coefficient (C0-C6) to allow flexible phase shifting of the chroma filtered samples.
Furthermore, in some cases, chroma decimation may also be performed without filtering. This can be useful when the source image was originally received in the 4:2:2 format but was up-sampled to the 4:4:4 format for YCbCr processing. In this case, the resulting decimated 4:2:2 image is identical to the original image.
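A minimal sketch of the horizontal 4:4:4-to-4:2:2 decimation described above is given below. The seven filter coefficients are placeholders (the text only indicates a 7-tap low-pass filter, for example a half-band Lanczos filter, with independent coefficients C0-C6), and the edge handling by pixel replication is an assumption.

```python
def decimate_chroma_422(chroma_row, coeffs):
    """Horizontally decimate one chroma row (Cb or Cr) from 4:4:4 to 4:2:2.

    chroma_row -- list of chroma samples for one line
    coeffs     -- seven filter coefficients C0..C6 (placeholder values)
    """
    assert len(coeffs) == 7
    out = []
    n = len(chroma_row)
    for i in range(0, n, 2):                      # keep every other chroma sample
        acc = 0.0
        for k, c in enumerate(coeffs):
            j = min(max(i + k - 3, 0), n - 1)     # replicate pixels at the edges
            acc += c * chroma_row[j]
        out.append(int(round(acc)))
    return out

# Example: symmetric placeholder taps approximating a half-band low-pass filter.
# taps = [-0.03125, 0.0, 0.28125, 0.5, 0.28125, 0.0, -0.03125]
# cb_422 = decimate_chroma_422(cb_444_row, taps)
```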
The YCbCr data output from the chrominance decimation logic 1186 may then be scaled by scaling logic 1187 before output from the YCbCr processing block 904. The function of scaling logic 1187 may be similar to the function of scaling logic 709, 710 in the binning compensation filter 652 of front-end pixel processing unit 150 described above with reference to fig. 59. For example, scaling logic 1187 may scale horizontally and vertically in two steps. In one embodiment, a 5-tap polyphase filter may be used for vertical scaling and a 9-tap polyphase filter may be used for horizontal scaling. The multi-tap polyphase filter may multiply a selected pixel from the source image by a weighting coefficient (e.g., a filter coefficient) and then sum the outputs to form the destination pixel. The selected pixel may be selected based on the current pixel position and the number of filter taps. For example, in the case of a vertical 5-tap filter, two neighboring pixels on each vertical side of the current pixel may be selected, and in the case of a horizontal 9-tap filter, four neighboring pixels on each horizontal side of the current pixel may be selected. The filter coefficients may be provided from a look-up table and may be determined by the current inter-pixel fractional position. Output 926 of scaling logic 1187 is then output from YCbCr processing block 904.
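The select-multiply-accumulate structure of the multi-tap polyphase filter described for scaling logic 1187 can be sketched in one dimension as follows; the number of phases, the default coefficients, and the edge handling are assumptions used only to make the structure concrete.

```python
def polyphase_resample_1d(src, scale, taps=5, phases=16, coeff_table=None):
    """Resample a 1-D line of pixels with a multi-tap polyphase filter.

    src         -- list of source pixel values
    scale       -- output/input size ratio (e.g. 0.5 halves the line)
    coeff_table -- coeff_table[phase] is a list of `taps` weights; if None,
                   a simple box filter is used purely for illustration.
    """
    if coeff_table is None:
        coeff_table = [[1.0 / taps] * taps for _ in range(phases)]
    out_len = int(len(src) * scale)
    half = taps // 2
    out = []
    for o in range(out_len):
        pos = o / scale                                 # position of this output pixel in src
        center = int(pos)
        phase = int((pos - center) * phases) % phases   # fractional position selects the phase
        acc = 0.0
        for t in range(taps):
            j = min(max(center + t - half, 0), len(src) - 1)
            acc += coeff_table[phase][t] * src[j]
        out.append(acc)
    return out

# Example: halve an eight-pixel line with the default 5-tap box coefficients.
# print(polyphase_resample_1d([10, 20, 30, 40, 50, 60, 70, 80], scale=0.5))
```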
Returning to FIG. 98, the processed output signal 926 may be sent to memory 108 or, according to the embodiment of the image processing circuitry 32 shown in FIG. 7, may be output as an image signal 114 from the ISP pipeline processing logic 82 to display hardware (e.g., display 28) for viewing by a user or to a compression engine (e.g., encoder 118). In some embodiments, the image signal 114 may also be processed and saved by a graphics processing unit and/or compression engine before being decompressed and provided to a display. In addition, one or more frame buffers may be provided to control the buffering of image data, particularly video image data, output to the display. Further, in embodiments where ISP backend processing logic 120 is provided (e.g., fig. 8), image signal 114 may be sent downstream for additional post-processing steps, as described in the following sections.
ISP back-end processing logic
Having described the ISP front-end logic 80 and the ISP pipeline 82 in detail above, the discussion will now turn to the ISP back-end processing logic 120 described above in FIG. 8. As noted above, the ISP back-end logic 120 generally operates to receive processed image data provided by the ISP pipeline 82, or from the memory 108 (signal 124), and to perform additional image post-processing operations before outputting the image data to the display device 28.
Figure 134 is a block diagram illustrating one embodiment of the ISP back-end logic 120. As shown, the ISP back-end logic 120 may include feature detection logic 2200, local tone mapping (LTM) logic 2202, brightness, contrast, and color adjustment logic 2204, scaling logic 2206, and a back-end statistics unit 2208. In one embodiment, the feature detection logic 2200 may comprise face detection logic and may be configured to identify the locations of faces/facial features in an image frame (represented here by reference numeral 2201). In other embodiments, the feature detection logic 2200 may also be configured to detect the locations of other kinds of features, such as the locations of corners of objects in the image frame. For example, this data may be used to identify the locations of features in successive image frames in order to determine an estimate of global motion between frames, which may then be used to perform certain image processing operations, such as image registration. In one embodiment, the identification of corner features and the like may be particularly beneficial in algorithms that combine multiple image frames, such as certain high dynamic range (HDR) imaging algorithms and certain panorama stitching algorithms.
For simplicity, in the following description, the feature detection logic 2200 will be referred to as face detection logic. It should be appreciated, however, that the logic 2200 is not intended to be limited to face detection logic only, and may be configured to detect other types of features instead of, or in addition to, facial features. For example, in one embodiment, the logic 2200 may detect corner features, as described above, in which case the output 2201 of the feature detection logic 2200 may include corner feature data.
The face detection logic 2200 may be configured to receive the YCC image data 114 provided by the ISP pipeline 82, or to receive a reduced-resolution image (represented by signal 2207) from the scaling logic 2206, and to detect the locations and positions of faces and/or facial features within the image frame corresponding to the selected image data. As shown in FIG. 134, the input to the face detection logic 2200 may be provided by a selection circuit 2196 that receives both the YCC image data 114 from the ISP pipeline 82 and the reduced-resolution image 2207 from the scaling logic 2206. A control signal provided by the ISP control logic 84 (e.g., a processor executing firmware) may determine which input is provided to the face detection logic 2200.
The detected position of the face/facial feature (signal 2201) may be provided as feedback data to one or more upstream processing units, as well as one or more downstream units. For example, data 2201 may represent the location within the current image frame where a face or facial feature appears. In some embodiments, data 2201 may include a reduced resolution transformed image, which may provide additional information for face detection. Further, in some embodiments, face detection logic 2200 may utilize a face detection algorithm, such as the Viola-Jones face/object detection algorithm, or may utilize any other algorithm, transformation, or pattern detection/matching technique suitable for detecting facial features in an image.
In the illustrated embodiment, the face detection data 2201 may be fed back to the control logic 84, and the control logic 84 may represent a processor executing firmware that controls the image processing circuit 32. In one embodiment, the control logic 84 may provide the data 2201 to a front-end statistics control loop (e.g., including the front-end statistics processing units (142 and 144) of the ISP front-end logic 80 of fig. 10), whereby the statistics processing unit 142 or 144 may use the feedback data 2201 to place the appropriate window and/or select particular tiles for auto white balance, auto exposure, and auto focus processing. It will be appreciated that improving the color and/or hue accuracy of regions of an image containing facial features may produce an image that is more aesthetically pleasing to a viewer. Data 2201 may also be provided to LTM logic 2202, back-end statistics unit 2208, and to encoder/decoder block 118, as described further below.
The LTM logic 2202 also receives the YCC image data 114 from the ISP pipeline 82. As described above, the LTM logic 2202 may be configured to apply tone mapping to the image data 114. It will be appreciated that tone mapping techniques may be used in image processing applications to map one set of pixel values to another. Where the input image and the output image have the same bit precision, tone mapping may not be necessary, although some embodiments may apply tone mapping without compression in order to improve contrast characteristics in the output image (e.g., to make bright areas appear darker and dark areas appear lighter). However, when the input image and the output image have different bit precisions, tone mapping may be applied to map the input image values to the corresponding values of the output range. For example, a scene may have a dynamic range of 25,000:1 or higher, while compression standards may allow a much lower range (e.g., 256:1) for display, and sometimes an even lower range (e.g., 100:1) for printing.
Thus, tone mapping is beneficial, for example, when image data represented at a precision of 10 bits or more is to be output in a lower-precision format (such as an 8-bit JPEG image). In addition, tone mapping may be particularly beneficial when applied to high dynamic range (HDR) images. In digital image processing, an HDR image may be produced by obtaining multiple images of a scene at different exposure levels and combining or compositing the images to produce an image with a higher dynamic range than can be obtained with a single exposure. Further, in some imaging systems, an image sensor (e.g., sensor 90a, 90b) may be configured to obtain an HDR image without the need to combine multiple images to produce a composite HDR image.
The LTM logic 2202 of the illustrated embodiment may utilize local tone mapping operators (e.g., spatially varying operators) that may be determined based on local features within the image frame. For example, the local tone mapping operators may be region-based and may vary locally according to the content within a particular region of the image frame. For example, the local tone mapping operators may be based on gradient domain HDR compression, photographic tone reproduction, or similar image processing techniques.
It will be appreciated that when applied to an image, local tone mapping techniques can generally produce an output image with improved contrast characteristics and appear more aesthetically pleasing to a viewer relative to images using global tone mapping. Fig. 135 and 136 illustrate some of the drawbacks associated with global tone mapping. For example, referring to fig. 135, a graph 2400 represents the tone mapping of an input image having an input range 2401 to an output range 2403. The range of tones in the input image is represented by curve 2402, where value 2404 represents the bright regions of the image and value 2406 represents the dark regions of the image.
For example, in one embodiment, the range 2401 of the input image may have 12-bit precision (0-4095) and may be mapped to an output range 2403 having 8-bit precision (0-255, e.g., a JPEG image). FIG. 135 represents a linear tone mapping process in which the curve 2402 is linearly mapped to the curve 2410. As shown, the result of the tone mapping process shown in FIG. 135 is that the range 2404, corresponding to the bright areas of the input image, is compressed to a smaller range 2412, and the range 2406, corresponding to the dark areas of the input image, is compressed to a smaller range 2414. The reduction in the tonal range of the dark regions (e.g., shadows) and the bright regions can adversely affect contrast and may appear aesthetically unsatisfactory to a viewer.
Referring to FIG. 136, one approach to solving the problems associated with the compression of the "bright" range 2404 (compressed to range 2412) and the "dark" range 2406 (compressed to range 2414) is to use a non-linear tone mapping technique, as shown in FIG. 136. For example, in fig. 136, a tone curve 2402 representing an input image is mapped with a non-linear "S" -shaped curve (or S-curve) 2422. As a result of the non-linear mapping, the bright portions of input range 2404 are mapped to the bright portions of output range 2424, and similarly, the dark portions of input range 2406 are mapped to the dark portions of output range 2426. As shown, the bright and dark ranges 2424 and 2426 of the output image of fig. 136 are larger than the bright and dark ranges 2412 and 2414 of the output image of fig. 135, thereby preserving more of the bright and dark content of the input image. However, due to the non-linear (e.g., S-curve) aspect of the mapping technique of fig. 136, the mid-range values 2428 of the output image may appear flatter, which may also be aesthetically unsatisfactory to viewers.
Thus, embodiments of the present disclosure may implement a local tone mapping technique that utilizes a local tone mapping operator to process discrete portions of a current image frame that may be divided into regions based on local features within the image, such as brightness characteristics. For example, as shown in fig. 137, a portion 2430 of an image frame received by the ISP backend logic 120 may include a bright area 2432 and a dark area 2434. For example, bright regions 2432 may represent bright regions of an image, such as the sky or the horizon, while dark regions may represent relatively darker regions of the image, such as the foreground or landscape. Local tone mapping may be applied to regions 2432 and 2434, respectively, to produce an output image that maintains more of the dynamic range of the input image relative to the global tone mapping technique described above, thereby improving local contrast and providing an output image that is more aesthetically pleasing to the viewer.
Examples of how local tone mapping may be implemented in the present embodiment are illustrated in FIGS. 138 and 139. Specifically, FIG. 138 depicts a conventional local tone mapping technique that may result in a limited output range in some cases, while FIG. 139 depicts an adaptive local tone mapping process, which may be implemented by LTM logic 2202, that utilizes the entire output range even when a portion of the input range is not used by the image frame.
Referring first to fig. 138, curve 2440 represents the application of local tone mapping to an input image of higher bit precision, resulting in an output image of lower bit precision. For example, in the illustrated example, the higher bit precision input image data may be 12-bit image data (having 4096 input values (e.g., values 0-4095)) represented by range 2442, which 12-bit image data is tone mapped to produce an 8-bit output (having 256 output values (e.g., 0-255)) represented by range 2444. It should be appreciated that the bit depths are merely intended to provide some examples and should in no way be construed as limiting the invention. For example, in other embodiments, the input image may be 8 bits, 10 bits, 14 bits, or 16 bits, etc., and the output image may have a bit depth greater or less than 8 bits of precision.
Here, assume that the region of the image to which local tone mapping is applied utilizes only a portion of the entire input dynamic range, such as range 2448 represented by values 0 through 1023. For example, these input values may correspond to the values of the shaded region 2434 shown in fig. 137. Fig. 138 shows a linear mapping of 4096 (12-bit) input values to 256 (8-bit) output values. Thus, while mapping from values of 0-4095 to values 0-255 of the output dynamic range 2444, the unused portion 2450 (values 1024-4095) of the entire input range 2442 is mapped to the portion 2454 (values 64-255) of the output range 2444, leaving only the output values 0-63 (portion 2452 of the output range 2444) available to represent the used portion 2448 (values 0-1023) of the input range. In other words, such linear local tone mapping techniques do not take into account whether unused values or ranges of values are mapped. This results in a portion (e.g., 2454) of the output values (e.g., 2444) being assigned to represent input values that are not actually present in the region (e.g., 2434) of the image frame for which the current local tone mapping operation (e.g., curve 2440) is being applied, thereby reducing the available output values (e.g., 2452) that can be used to represent input values present in the current region being processed (e.g., range 2448).
With the above in mind, FIG. 139 illustrates a local tone mapping technique implemented in accordance with an embodiment of the present disclosure. Here, prior to mapping the input range 2442 (e.g., 12 bits) to the output range 2444 (e.g., 8 bits), the LTM logic 2202 may be configured to first determine the utilized portion of the input range 2442. For example, assuming that the region is a substantially dark region, the input values corresponding to colors within that region may utilize only a sub-range of the entire range 2442, such as sub-range 2448 (e.g., values 0-1023). That is, the sub-range 2448 represents the actual dynamic range present in the particular region of the image frame being processed. Thus, since the values 1024-4095 (the unused sub-range 2450) are not utilized in this region, the utilized range 2448 can first be mapped and expanded to occupy the entire range 2442, as shown by the expansion process 2472. That is, because the values 1024-4095 are not utilized in the current region of the image being processed, they can be used to represent the utilized portion (e.g., values 0-1023). As a result, the utilized portion 2448 of the input range can be represented using approximately three times as many input values.
The expanded utilized input range (now spanning values 0-4095) may then be mapped to the output values 0-255 (output range 2444), as shown by process 2474. Thus, as shown in FIG. 139, as a result of first expanding the utilized range 2448 of input values to occupy the entire input range (0-4095), the utilized range 2448 can be represented using the entire output range 2444 (values 0-255), rather than only a portion of the output range as shown in FIG. 138.
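The range-expansion behavior of FIG. 139 can be illustrated with the short sketch below, which maps only the utilized sub-range of a region onto the full output range; estimating the utilized range with a simple min/max is an assumption, and the bit depths follow the 12-bit-to-8-bit example in the text.

```python
def local_tone_map_region(pixels, out_bits=8):
    """Map one region's pixels onto the full output range (FIG. 139 behavior).

    Rather than mapping the full 12-bit input range (0-4095) to 0-255 as in
    FIG. 138, only the sub-range actually utilized by this region is mapped,
    so a dark region keeps the entire 8-bit output range available.
    """
    out_max = (1 << out_bits) - 1
    lo, hi = min(pixels), max(pixels)      # utilized sub-range, e.g. 0-1023
    span = max(hi - lo, 1)
    return [int(round((p - lo) / span * out_max)) for p in pixels]

# Example: a dark region using only values 0-1023 of the 12-bit input range.
# print(local_tone_map_region([0, 256, 512, 1023]))   # -> [0, 64, 128, 255]
```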
Before continuing, it should be noted that although referred to as a local tone mapping block, in some cases, LTM logic 2202 may also be configured to implement global tone mapping. For example, where the image frame includes an image scene (e.g., a scene of the sky) that generally has uniform characteristics, the region to which tone mapping is applied may include the entire frame. That is, the same tone mapping operator may be applied to all pixels of the frame. Returning to fig. 134, LTM logic 2202 may also receive data 2201 from face detection logic 2200, and in some cases, may utilize the data to identify one or more regions within the current image frame to which tone mapping is applied. Thus, the end result of applying one or more of the above-described local tone mapping techniques may be an image that is more aesthetically pleasing to a viewer.
The output of the LTM logic 2202 may be provided to the brightness, contrast, and color adjustment (BCC) logic 2204. In the depicted embodiment, the BCC logic 2204 may be implemented substantially identically to the BCC logic 1184 of the YCbCr processing logic 904 of the ISP pipeline, as shown in FIG. 132, and may provide substantially similar functionality for brightness, contrast, hue, and/or saturation control. Thus, to avoid redundancy, the BCC logic 2204 of this embodiment is not described again here, but should be understood to be identical to the previously described BCC logic 1184 of FIG. 132.
Subsequently, the scaling logic 2206 may receive the output of the BCC logic 2204 and be configured to scale image data representing the current image frame. For example, when the actual size or resolution (e.g., in pixels) of an image frame differs from the expected or desired output size, scaling logic 2206 may scale the digital image accordingly to obtain an output image of the desired size or resolution. As shown, the output 126 of the scaling logic 2206 may be sent to the display device 28 for viewing by a user or to the memory 108. In addition, the output 126 may also be provided to the compression/decompression engine 118 for encoding/decoding image data. The encoded image data may be saved in a compressed format and subsequently decompressed before being displayed on the display 28.
Further, in some embodiments, the scaling logic 2206 may scale the image data at multiple resolutions. For example, when the desired output image resolution is 720p (1280 × 720 pixels), the scaling logic may scale the image frame accordingly to provide a 720p output image, and may also provide a lower-resolution image that may be used as a preview or thumbnail image. For example, an application running on the device, such as the "Photos" application (available from Apple Inc. on certain of its devices and computers), may allow the user to view a list of preview versions of video or still images saved on the electronic device 10. When a saved image or video is selected, the electronic device may display and/or play back the selected image or video at full resolution.
In the illustrated embodiment, the scaling logic 2206 may also provide information 2203 to the back-end statistics block 2208, which may utilize the scaled image information for back-end statistics processing. For example, in one embodiment, the back-end statistics logic 2208 may process the scaled image information 2203 to determine one or more parameters for modulating the quantization parameters (e.g., per-macroblock quantization parameters) associated with the encoder 118, which, in one embodiment, may be an H.264/JPEG encoder/decoder. For example, in one embodiment, the back-end statistics logic 2208 may analyze the image macroblock by macroblock to determine a frequency content parameter or score for each macroblock. For example, in some embodiments, the back-end statistics logic 2208 may determine the frequency score for each macroblock using techniques such as wavelet compression, the fast Fourier transform, or the discrete cosine transform (DCT). Using the frequency scores, the encoder 118 is able to modulate the quantization parameters to achieve a generally uniform image quality across the macroblocks that make up the image frame. For example, if the frequency content in a particular macroblock has a high variance, compression may be applied more aggressively to that macroblock. As shown in FIG. 134, the scaling logic 2206 may also provide the reduced-resolution image 2207 to the face detection logic 2200 via an input of the selection circuit 2196, which may be a multiplexer or some other suitable type of selection logic. Thus, the output 2198 of the selection circuit 2196 may be either the YCC input 114 from the ISP pipeline 82 or the downsampled YCC image 2207 from the scaling logic 2206.
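One possible way to compute a per-macroblock frequency score, here using a DCT (one of the transforms mentioned above), is sketched below; the particular scoring heuristic (sum of non-DC coefficient energy) is an assumption and not the actual back-end statistics algorithm.

```python
import numpy as np

def dct2(block):
    """Naive, unnormalized 2-D DCT-II of a square block (illustration only)."""
    n = block.shape[0]
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    return basis @ block @ basis.T

def macroblock_frequency_score(macroblock):
    """Score a 16x16 luma macroblock by the energy in its non-DC coefficients.

    A higher score indicates more high-frequency content, which back-end
    logic could use to let the encoder quantize that macroblock more
    aggressively.
    """
    coeffs = dct2(macroblock.astype(np.float64))
    ac_energy = np.abs(coeffs).sum() - abs(coeffs[0, 0])  # drop the DC term
    return ac_energy / macroblock.size

# Usage sketch: per-macroblock scores over a luma frame of height h and width w.
# scores = [macroblock_frequency_score(frame[y:y+16, x:x+16])
#           for y in range(0, h, 16) for x in range(0, w, 16)]
```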
In some embodiments, the back-end statistics logic 2208 and/or the encoder 118 may be configured to predict and detect scene changes. For example, the back-end statistics logic 2208 may be configured to obtain motion statistics. The encoder 118 may attempt to predict scene changes by comparing the motion statistics of the current frame, which may include certain metrics (e.g., luma), with those of the previous frame, as provided by the back-end statistics logic 2208. When the difference in the metrics is greater than a particular threshold, a scene change is predicted and the back-end statistics logic 2208 may signal the scene change. In some embodiments, weighted prediction may be used, since a fixed threshold may not always be ideal given the variety of images captured and processed by the device 10. Additionally, multiple thresholds may also be used depending on certain characteristics of the image data being processed.
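A highly simplified version of such a scene-change check is sketched below; the choice of mean luma as the compared metric and the fixed threshold are placeholders, whereas the text notes that weighted prediction or multiple thresholds may be used in practice.

```python
def detect_scene_change(prev_stats, curr_stats, threshold=24.0):
    """Flag a likely scene change by comparing per-frame statistics.

    prev_stats / curr_stats -- dicts of metrics from back-end statistics
    threshold               -- placeholder fixed threshold
    """
    diff = abs(curr_stats["mean_luma"] - prev_stats["mean_luma"])
    return diff > threshold

# print(detect_scene_change({"mean_luma": 80.0}, {"mean_luma": 150.0}))  # True
```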
As described above, the face detection data 2201 may also be provided to the back-end statistics logic 2208 and the encoder 118, as shown in fig. 134. Here, during back-end processing, the back-end statistics and/or encoder 118 may utilize the face detection data 2201 as well as the macroblock frequency information. For example, for macro blocks corresponding to the location of a face within an image frame determined using face detection data 2201, quantization may be reduced, thereby improving the visual appearance and overall quality of encoded faces or facial features present in an image displayed using display device 28.
Referring now to FIG. 140, a block diagram illustrating the LTM logic 2202 in greater detail is provided, in accordance with one embodiment. As shown, tone mapping is applied after the YC1C2 image data 114 from the ISP pipeline 82 has first been converted to a gamma-corrected linear RGB color space. For example, as shown in FIG. 140, logic 2208 first converts the YC1C2 (e.g., YCbCr) data to a non-linear sRGB color space. In this embodiment, the LTM logic 2202 may be configured to receive YCC image data having different sub-sampling characteristics. For example, the LTM logic 2202 may be configured to receive YCC 4:4:4 full data, YCC 4:2:2 chroma sub-sampled data, or YCC 4:2:0 chroma sub-sampled data, as shown by the inputs 114 to the selection logic 2205 (e.g., a multiplexer). For sub-sampled YCC image data formats, up-conversion logic 2209 may be applied to convert the sub-sampled YCC image data to the YCC 4:4:4 format before the YCC image data is converted to the sRGB color space by logic 2208.
The converted sRGB image data 2210 may then be converted by logic 2212 to the RGBlinear color space, which is a gamma-corrected linear space. The converted RGBlinear image data 2214 is then provided to the LTM logic 2216, which may be configured to identify regions of the image frame that share similar brightness (e.g., regions 2432 and 2434 of FIG. 137) and to apply local tone mapping to those regions. As shown in this embodiment, the LTM logic 2216 may also receive the parameters 2201 from the face detection logic 2200 (FIG. 134), which may indicate the locations and positions within the current image frame at which faces and/or facial features are present.
After local tone mapping has been applied to the RGBlinear data 2214, the processed RGBlinear image data 2220 is first converted back to the sRGB color space by logic 2222, and the sRGB image data 2224 is then converted back to the YC1C2 color space by logic 2226. Thus, the converted YC1C2 data 2228 (with tone mapping applied) may be output from the LTM logic 2202 and provided to the BCC logic 2204, as described above in FIG. 134. It should be appreciated that the conversion of the image data 114 into the various color spaces utilized within the ISP back-end LTM logic block 2202 may be accomplished using techniques similar to those described above in FIG. 125 for converting demosaiced RGB image data into the YC1C2 color space in the RGB processing logic 902 of the ISP pipeline 82. Further, in embodiments where the YCC data is up-converted (e.g., using logic 2209), the YC1C2 data may be down-converted (sub-sampled) by logic 2226. Additionally, in other embodiments, this sub-sampling/down-conversion may be performed by the scaling logic 2206 instead of logic 2226.
Although the present embodiment describes a conversion from the YCC color space to the sRGB color space, followed by conversion to the RGBlinear color space, other embodiments may utilize different color space conversions or may apply an approximate transformation function. That is, in some embodiments, conversion to an approximately linear color space may be sufficient for local tone mapping. Thus, the conversion logic of such embodiments may be at least partially simplified using approximate conversion functions (e.g., by eliminating the need for color space conversion lookup tables). In a further embodiment, local tone mapping may also be performed in a color space that better matches human perception, such as the Lab color space.
FIGS. 141 and 142 are flowcharts depicting methods for processing image data using the ISP back-end processing logic 120, in accordance with the disclosed embodiments. Referring first to FIG. 141, a method 2230 illustrating the processing of image data by the ISP back-end processing logic 120 is depicted. Beginning at step 2232, the method 2230 receives YCC image data from the ISP pipeline 82. For example, as described above, the received YCC image data may be in the YCbCr luma and chroma color space. Subsequently, the method 2230 may branch to steps 2234 and 2238. At step 2234, the received YCC image data may be processed to detect the locations/positions of faces and/or facial features within the current image frame. For example, referring to FIG. 134, this step may be performed using the face detection logic 2200, which may be configured to implement a face detection algorithm, such as Viola-Jones. Thereafter, at step 2236, the face detection data (e.g., data 2201) may be provided to the ISP control logic 84 as feedback to the ISP front-end statistics processing units 142 or 144, as well as to the LTM logic block 2202, the back-end statistics logic 2208, and the encoder/decoder logic 118, as shown in FIG. 134.
At step 2238, which may be performed at least partially concurrently with step 2234, the YCC image data received from the ISP pipeline 82 is processed to apply tone mapping. The method 2230 then proceeds to step 2240, whereby the YCC image data (e.g., 2228) is further processed for brightness, contrast, and color adjustments (e.g., using the BCC logic 2204). Subsequently, at step 2242, scaling is applied to the image data from step 2240 to scale the image data to one or more desired sizes or resolutions. Additionally, as described above, in some embodiments a color space conversion or sub-sampling may also be applied (e.g., in embodiments where the YCC data is up-sampled for local tone mapping) to produce an output image with the desired sampling. Finally, at step 2244, the scaled YCC image data may be displayed for viewing (e.g., using the display device 28), or may be saved to the memory 108 for later viewing.
FIG. 142 illustrates the tone mapping step 2238 of FIG. 141 in more detail. For example, step 2238 may begin at sub-step 2248, in which the YCC image data received at step 2232 is first converted to the sRGB color space. As described above and shown in FIG. 140, some embodiments may up-convert sub-sampled YCC image data prior to the conversion to the sRGB space. Thereafter, at sub-step 2250, the sRGB image data is converted to the gamma-corrected linear color space RGBlinear. Subsequently, at sub-step 2252, tone mapping is applied to the RGBlinear data by the tone mapping logic 2216 of the ISP back-end LTM logic block 2202. The tone-mapped image data of sub-step 2252 is then converted from the RGBlinear color space back to the sRGB color space, as shown at sub-step 2254. Thereafter, at sub-step 2256, the sRGB image data may be converted back to the YCC color space, and step 2238 of the method 2230 may continue to step 2240, discussed in FIG. 141. As noted above, the process 2238 shown in FIG. 142 is merely one way of applying color space conversions in a manner suitable for local tone mapping. In other embodiments, approximate linear conversions may be applied instead of the illustrated conversion steps.
It will be appreciated that the various image processing techniques described above in connection with defective pixel detection and correction, lens shading correction, demosaicing, and image sharpening, etc., are provided herein by way of example only. Thus, it should be understood that the present disclosure should not be construed as limited to only the examples provided above. Indeed, in other embodiments, the example logic described herein may be subject to many variations and/or additional features. Further, it should be understood that the techniques discussed above may be implemented in any suitable manner. For example, components of the image processing circuitry 32, and in particular the ISP front-end block 80 and the ISP pipeline block 82, may be implemented using hardware (e.g., suitably configured circuitry), software (e.g., via a computer program comprising executable code stored on one or more tangible computer-readable media), or by using a combination of hardware and software elements.
According to an aspect of the present invention, there is also provided a method of synchronizing audio and video on an electronic device, comprising: detecting a start of an image frame in a set of video data acquired by an image processing system of an electronic device using a digital image sensor; associating a timestamp value with metadata associated with the image frame, wherein the timestamp value corresponds to a beginning of the image frame; using the timestamp value, the image frame is aligned with a corresponding audio sample in a set of audio data obtained by an electronic device.
Wherein associating the timestamp value with metadata associated with the image frame comprises: reading a timestamp value from a first register configured to provide a periodically incremented current time value; writing the timestamp value to a second register associated with the digital image sensor; and saving the timestamp value from the second register as part of the metadata associated with the image frame.
The method further comprises the following steps: detecting when a current time value of the first register exceeds a maximum bit resolution of the first register; determining a higher resolution current time value having a bit resolution higher than the bit resolution of the first register if the current time value of the first register exceeds the maximum bit resolution of the first register; and writing the higher resolution current time value to metadata associated with the image frame until the first register is reset.
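A software-level sketch of this timestamping flow is given below. The register width, the roll-over handling, and the class and field names are illustrative assumptions rather than the actual register map or firmware of the device.

```python
class FrameTimestamper:
    """Associate a timestamp with each frame's metadata, handling roll-over.

    time_source -- callable returning a monotonically increasing time value
                   with more bits of resolution than the time-code register.
    """
    def __init__(self, time_source, register_bits=32):
        self.time_source = time_source
        self.mask = (1 << register_bits) - 1

    def read_time_code_register(self):
        # The first (time-code) register only holds the low-order bits.
        return self.time_source() & self.mask

    def on_frame_start(self, metadata):
        reg_value = self.read_time_code_register()     # first register value
        full_value = self.time_source()                # higher-resolution time
        if full_value > self.mask:
            # The counter has exceeded the register's maximum bit resolution;
            # record the higher-resolution value until the register is reset.
            metadata["timestamp"] = full_value
        else:
            metadata["timestamp"] = reg_value          # value saved via the
        return metadata                                # sensor timestamp register

# Usage sketch: frames and audio samples carrying comparable timestamps can
# then be aligned during playback.
# import time
# ts = FrameTimestamper(time_source=lambda: int(time.monotonic() * 90_000))
# frame_meta = ts.on_frame_start({})
```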
The specific embodiments described above are shown by way of example, and it should be understood that these embodiments may be susceptible to various modifications and alternative forms. It should be further understood that the claims are not intended to be limited to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
Claims (23)
1. A method of synchronizing audio and video data on an electronic device, comprising:
receiving, with an image signal processor, image frames of video data, the image frames obtained with a digital image sensor of an electronic device;
detecting a start of an image frame;
reading a current time stamp value from a time code register configured to provide a time stamp value corresponding to a current time when a start of an image frame is detected;
associating the timestamp value with a set of metadata associated with the image frame; and
synchronizing playback of the image frames with corresponding audio samples of a set of audio data obtained concurrently with the video data on the electronic device using the timestamp values associated with the set of metadata.
2. The method of claim 1, wherein associating the timestamp value with the set of metadata comprises saving the timestamp value in a timestamp register associated with the digital image sensor and writing the timestamp value saved in the timestamp register to the set of metadata.
3. The method of claim 1, wherein the timestamp value provided by the time code register is incremented at intervals defined by a particular number of clock cycles based on a clock domain of the image signal processor.
4. The method of claim 1, wherein a programmed delay is added to the timestamp value prior to associating the timestamp value with the set of metadata.
5. A system, comprising:
an image signal processor comprising an interface configured to receive a plurality of image frames obtained with a digital image sensor, wherein the plurality of image frames correspond to a set of video data;
an audio subsystem including an input device configured to receive audio data concurrently with image data obtained by a digital image sensor; and
audio-video synchronization logic configured to, for each image frame of a plurality of image frames, identify a start of the image frame, determine a timestamp value corresponding to the start of the image frame, and write the timestamp value in metadata associated with the image frame;
wherein the audio-video synchronization logic is configured to synchronize the plurality of image frames with the audio data by associating a timestamp value stored in metadata associated with each image frame of the plurality of image frames with a corresponding timestamp associated with the audio data.
6. The system of claim 5, wherein the audio-video synchronization logic comprises a time code register configured to provide the current time stamp value, wherein the time stamp value provided by the time code register is incremented at regular intervals.
7. The system of claim 6, wherein the image signal processor and the audio subsystem operate according to independent first and second clocks, respectively.
8. The system of claim 7, wherein the audio-video synchronization logic comprises a timer configuration register configured to hold a time configuration code, the regular intervals corresponding to a plurality of clock cycles of the first clock determined as a function of the time configuration code.
9. The system of claim 7, wherein the audio-video synchronization logic is configured to synchronize the first clock with a third clock of a main processor of the system, synchronize the second clock with the third clock of the main processor of the system, determine a time difference between the first clock and the second clock based on the synchronization of the first clock and the second clock with the third clock, and use the time difference to associate a timestamp value stored in the metadata for each of the plurality of image frames with a corresponding timestamp associated with the audio data.
10. The system of claim 6, wherein the audio-video synchronization logic is configured to synchronize the plurality of image frames with the audio data by adding or dropping image frames until timestamps corresponding to the image frames are aligned with timestamps corresponding to the audio data.
11. An electronic device, comprising:
means for detecting a start of an image frame in a set of video data acquired by an image processing system of an electronic device using a digital image sensor;
means for associating a timestamp value with metadata associated with the image frame, wherein the timestamp value corresponds to a beginning of the image frame;
means for aligning the image frame with a corresponding audio sample in a set of audio data obtained by an electronic device using the timestamp value.
12. The electronic device of claim 11, wherein the means for associating the timestamp value with metadata associated with the image frame comprises:
means for reading a timestamp value from a first register configured to provide a periodically incremented current time value;
means for writing the timestamp value to a second register associated with the digital image sensor; and
means for storing the timestamp value from the second register as part of metadata associated with the image frame.
13. The electronic device of claim 11, further comprising:
means for detecting when a current time value of the first register exceeds a maximum bit resolution of the first register;
means for determining a higher resolution current time value having a bit resolution higher than the bit resolution of the first register if the current time value of the first register exceeds the maximum bit resolution of the first register; and
means for writing a higher resolution current time value to metadata associated with the image frame until the first register is reset.
14. An electronic device, comprising:
a first image sensor;
a first image sensor interface;
an image signal processing subsystem configured to receive a first set of video data obtained with a first image sensor via a first sensor interface;
an audio processing subsystem configured to receive a first set of audio data obtained concurrently with a first set of video data; and
an audio-video synchronization circuit, the audio-video synchronization circuit comprising:
a time code register configured to provide a current time value;
a first timestamp register associated with the first image sensor; and
logic configured to detect, for each image frame of a first set of video data, a start of an image frame of the first set of video data, sample a time code register when the start of the image frame is detected, thereby obtaining a time stamp value corresponding to the start of the image frame, save the time stamp value in the first time stamp register, write the time stamp value of the first time stamp register in a set of metadata associated with the image frame, and synchronize the first set of video data and the first set of audio data by correlating the time stamp value in the metadata of each image frame of the first set of video data with a corresponding time stamp associated with the first set of audio data.
15. The electronic device of claim 14, further comprising:
a second image sensor;
a second image sensor interface, wherein the image signal processing subsystem is configured to receive video data obtained with a second image sensor via the second sensor interface, and the audio processing subsystem is configured to receive a second set of audio data obtained concurrently with the second set of video data; and
wherein the audio-video synchronization circuit comprises:
a second timestamp register associated with a second image sensor, and
logic configured to detect, for each image frame of a second set of video data, a start of an image frame of the second set of video data, sample a time code register when the start of the image frame is detected, thereby obtaining a time stamp value corresponding to the start of the image frame, save the time stamp value in the second time stamp register, write the time stamp value of the second time stamp register in a set of metadata associated with the image frame, and synchronize the second set of video data and the second set of audio data by correlating the time stamp value in the metadata of each image frame of the second set of video data with a time stamp associated with the second set of audio data.
16. The electronic device of claim 15, further comprising a display device configured to display the first and second sets of video data, wherein the display device comprises at least one of: an LCD display device, a plasma display device, an LED display device, an OLED display device, or some combination thereof.
17. The electronic device of claim 15, further comprising:
an audio input device configured to obtain first and second sets of audio data, wherein the audio input device comprises at least one microphone; and
an audio output device configured to play back the first and second sets of audio data.
18. The electronic device of claim 15, wherein the first and second image sensors comprise at least one of: a digital camera integral with the electronic device, an external digital camera coupled to the electronic device via an interface, or some combination thereof.
19. The electronic device of claim 14, comprising at least one of: a desktop computer, a laptop computer, a tablet computer, a mobile cellular telephone, a portable media player, or any combination thereof.
20. The electronic device of claim 14, wherein, if the first set of video data and the first set of audio data are misaligned, the audio-video synchronization circuit is configured to repeat or discard image frames to align the timestamps of the first set of video data with the timestamps associated with the first set of audio data.
21. A method of synchronizing audio and video on an electronic device, comprising:
detecting a start of an image frame in a set of video data acquired by an image processing system of an electronic device using a digital image sensor;
associating a timestamp value with metadata associated with the image frame, wherein the timestamp value corresponds to a beginning of the image frame;
using the timestamp value, the image frame is aligned with a corresponding audio sample in a set of audio data obtained by an electronic device.
22. The method of claim 21, wherein associating the timestamp value with metadata associated with the image frame comprises:
reading a timestamp value from a first register configured to provide a periodically incremented current time value;
writing the timestamp value to a second register associated with the digital image sensor; and
the timestamp value from the second register is saved as part of the metadata associated with the image frame.
23. The method of claim 21, further comprising:
detecting when a current time value of the first register exceeds a maximum bit resolution of the first register;
determining a higher resolution current time value having a bit resolution higher than the bit resolution of the first register if the current time value of the first register exceeds the maximum bit resolution of the first register; and
writing a higher resolution current time value to metadata associated with the image frame until the first register is reset.
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/895,299 | 2010-09-30 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1173295A true HK1173295A (en) | 2013-05-10 |
| HK1173295B HK1173295B (en) | 2018-01-19 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104902250B (en) | | Flash Synchronization Using Image Sensor Interface Timing Signals |
| CN102572316B (en) | | Overflow control techniques for image signal processing |
| CN102547162B (en) | | Method and system and electronic device for processing image data |
| CN102547301B (en) | | Use the system and method for image-signal processor image data processing |
| CN102572443B (en) | | For isochronous audio in image-signal processing system and the technology of video data |
| US8508621B2 (en) | | Image sensor data formats and memory addressing techniques for image signal processing |
| HK1173295A (en) | | Techniques for synchronizing audio and video data in an image signal processing system |
| HK1173295B (en) | | Techniques for synchronizing audio and video data in an image signal processing system |