
WO2024112375A1 - Adaptive face brightness adjustment for images and video - Google Patents


Info

Publication number
WO2024112375A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
correction
face
applying
brightness
Prior art date
Application number
PCT/US2023/032428
Other languages
French (fr)
Inventor
Paras Maharjan
Tsung-Wei Huang
Guan-Ming Su
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corporation
Priority to EP23786364.2A (EP4623404A1)
Publication of WO2024112375A1

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • G06T5/94Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20004Adaptive image processing
    • G06T2207/20012Locally adaptive
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • ADAPTIVE FACE BRIGHTNESS ADJUSTMENT FOR IMAGES AND VIDEO CROSS-REFERENCE TO RELATED APPLICATIONS [001] This application claims priority to U.S. Provisional Application No. 63/384,739, filed November 22, 2022, which is incorporated by reference in its entirety.
  • TECHNOLOGY [002] The present disclosure relates generally to image processing operations. More particularly, an embodiment of the present disclosure relates to brightness adjustment of images containing faces.
  • BACKGROUND [003] With the use of video conferencing on the rise, producing quality video images from non-professionally produced video (e.g. at home or in the office video conferencing through a web camera with less-than-ideal ambient lighting) is a concern.
  • An embodiment of the present invention is a method for adaptively applying brightness correction to an image, the method comprising: using a regression model to determine if image correction is needed on the image; determining likely face regions of the image; if the regression model determines that image correction is needed, applying brightness correction to the likely face regions of the image based on a tuning parameter from the regression model.
  • the method further comprises applying over-exposure correction to background regions of the image based on an over-exposed probability of the image, wherein background regions are all regions that are not the likely face regions.
  • a method may be computer-implemented in some embodiments.
  • the method may be implemented, at least in part, via a control system comprising one or more processors and one or more non-transitory storage media.
  • Some or all of the methods described herein may be performed by one or more devices according to instructions (e.g. software) stored on one or more non-transitory media.
  • Such non-transitory media may include memory devices such as those described herein, including but not limited to random access memory (RAM) devices, read-only memory (ROM) devices, etc.
  • the software may, for example, be executable by one or more components of a control system such as those disclosed herein.
  • the software may, for example, include instructions for performing one or more of the methods disclosed herein.
  • At least some aspects of the present disclosure may be implemented via an apparatus or apparatuses.
  • one or more devices may be configured for performing, at least in part, the methods disclosed herein.
  • an apparatus may include an interface system and a control system.
  • the interface system may include one or more network interfaces, one or more interfaces between the control system and memory system, one or more interfaces between the control system and another device and/or one or more external device interfaces.
  • the control system may include at least one of a general-purpose single- or multi-chip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. Accordingly, in some implementations the control system may include one or more processors and one or more non-transitory storage media operatively coupled to one or more processors. [0010] Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale.
  • FIG.1 illustrates an example flowchart of an embodiment of the method.
  • FIG.2 illustrates an example of a brute force method for finding a best (“ground truth”) tuning parameter.
  • FIG.3 illustrates an example flowchart of the regression model during testing.
  • FIG.4 illustrates an example pipeline for the brightness correction block.
  • FIG.5 illustrates an example pipeline for the computation of the brightness correction index.
  • FIG.6 illustrates an example flowchart of detecting an over-exposed region of an image.
  • FIG.7 illustrates an example flowchart for HSSL adaptive contrast correction.
  • FIGs.8A and 8B illustrate an example gamma curve (8A) and a flipped gamma curve (8B).
  • FIG.9 illustrates a simplified block diagram of an example hardware platform on which a computer or a computing device as described herein may be implemented.
  • FIG.10 illustrates an example sigmoid curve for a look up table for a color channel.
  • FIG.11 illustrates HSSL local reshaping of a sigmoid curve with fixed g values and varied m values.
  • FIG.12 illustrates HSSL local reshaping of a sigmoid curve with fixed m values and varied g values.
  • DETAILED DESCRIPTION [0024] In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, that the present disclosure may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present disclosure.
  • “brightness” refers to the intensity of the pixels in question. “Contrast” refers to the relative difference in brightness between different groups of pixels (e.g., regions of the image). “Over-exposure” refers to a level of brightness greater than expected for pixels in question, and “under-exposure” refers to a level of brightness less than expected for pixels in question. [0026] As used herein, “face” refers to a region of an image that has been detected to show a person. Typically, “face” refers mainly to the head region of a person, but it can include more depending on the detection method. “Background” refers to regions of the image that are not considered “face”.
  • "dark channel" refers to the pixelwise minimum value across the channels (e.g., R, G, or B) of the image in question.
  • "HSSL" refers to a hybrid shifted sigmoid function with linear mapping.
  • LUT refers to a look-up table in software/firmware.
  • a method for adaptively correcting the brightness of face regions and the over-exposure of background regions of an image can be accomplished by the use of a regression model for determining a brightness tuning parameter and a brightness/overexposure correction block to apply brightness correction to detected face regions of the image based on the tuning parameter.
  • the regression model can also determine if any correction is needed at all, bypassing the brightness/overexposure correction block if no correction is needed.
  • a contrast correction block is used to enhance the contrast of the face regions based on the tuning parameter, to improve the image after brightness correction.
  • FIG.1 shows an example flowchart for an embodiment of the method.
  • a frame/image (110) is input to a machine learning regression model (120).
  • the regression model (120) can first determine if brightness correction is needed (125) from the parameters of the input frame (110) (e.g., binary classification).
  • if contrast correction (e.g., HSSL) is being used by the system and the model (120) determines that brightness correction is not needed, the input frame (110) is directly corrected by a contrast correction block (140) and output (150) for further processing or viewing.
  • the regression model (120) determines a tuning parameter that reflects an estimated amount of correction needed.
  • the parameter is passed to a brightness correction block (130) that adjusts the brightness of face regions (detected) of the input image (110) based on the tuning parameter from the regression model (120).
  • the image is further corrected by a contrast correction block (140) that adjusts contrast levels of the face regions again based on the tuning parameter.
  • the image is then output (150) for further processing or viewing.
  • the regression model determines if the image needs the correction or not by predicting the brightness tuning parameter (β) for the image.
  • This brightness tuning parameter controls how much correction to apply to make the face brighter (if the face in the image is dark or under-exposed) or darker (if the face in the image is very bright or over-exposed).
  • the brightness tuning parameter is a value ranging from 0 to 1. “0” means the image does not need brightness correction (for normal exposure/normal lighting in the face region) and “1” means the image needs the maximum correction.
  • the values in between designate to what degree correction is needed (e.g., 0.5 might indicate 50% correction from maximum).
  • the regression model does not make an initial determination if the image needs correction, and just passes the tuning parameter to the correction blocks (which, if the parameter is 0, would not adjust the image).
  • using a face probability map (see e.g., PCT/US2022/038249, published as WO 2023/009469, incorporated by reference herein), histograms of the face and background can be computed separately.
  • a "face probability map" (P_face) is a map showing the probabilities (e.g., 0 to 1) of a pixel being in a face region of the image.
  • The background probability (P_background) is 1 − P_face.
  • these histograms can be used as a feature for classification for correction.
  • the histogram is not an accurate representation of the image, and the histogram can change drastically by changing the number of bins used, so in some embodiments, a weighted histogram is used. Weighted histograms can be computed from summing the face or background probabilities corresponding to each pixel intensity. In some embodiments, instead of a weighted histogram, weighted percentiles are used.
  • a face probability map can also be generated by other methods known in the art, such as deep-learning based face segmentation algorithms.
  • a regular histogram counts the total number of pixels at each intensity level. A weighted histogram, on the other hand, sums the weights associated with the pixels at each intensity level. To find the weighted histogram of the face region, the face probability map can be used as the weights.
  • Weighted Histogram: [0038] the algorithm to compute a weighted histogram of the Y channel can be as follows. [0039] STEP 1: Take the Y channel of the image (I_Y) and a face probability map (P_face) as input. [0040] STEP 2: Reshape both to 1-dimension and combine them together. [0041] STEP 3: Sort the combined array based on the ascending order of the pixel intensities: S = sortrows([I_Y, P_face]), where sortrows( ) is a function that sorts the data in ascending order based on the first column. [0042] STEP 4: For each pixel intensity of I_Y (e.g., 64 to 960 for a Y channel in 10-bit SMPTE (Society of Motion Picture and Television Engineers standard) range), the sum of the weights of the pixels at bin i is defined as w_i = Σ_{k=1..n_i} P_face,i(k), where n_i is the total number of pixels in bin i of the Y channel, and P_face,i(k) is the weight of the kth pixel in the ith bin.
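  • The steps above translate directly into a few lines of array code. The following is a minimal sketch (not code from the patent), assuming a 10-bit SMPTE-range Y channel (64-960) and one bin per intensity level; the explicit sort of STEP 3 is folded into the vectorized accumulation:

```python
import numpy as np

def weighted_histogram(I_Y: np.ndarray, P_face: np.ndarray,
                       lo: int = 64, hi: int = 960) -> np.ndarray:
    """Sum face probabilities per luma level instead of counting pixels."""
    y = I_Y.reshape(-1)                        # STEP 2: flatten to 1-D
    w = P_face.reshape(-1).astype(np.float64)
    idx = np.clip(y, lo, hi).astype(int) - lo
    # STEP 4: accumulate the weight of every pixel into its intensity bin
    return np.bincount(idx, weights=w, minlength=hi - lo + 1)
```

For the background histogram, pass 1 − P_face as the weights instead.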
  • Weighted Percentile: a weighted percentile can be computed similarly to the weighted histogram. Compute the cumulative distribution function (CDF) of the sorted face probabilities, normalize the CDF to range from 0 to 1, find the first occurrence of selected values between 0 and 1 (e.g., 0.1, 0.25, 0.5, 0.75, 0.9 for the 10th, 25th, 50th, 75th, and 90th percentiles), and map those indexes back to intensity values. In some embodiments, only the 50th percentile is used for the chroma channels.
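  • A hedged sketch of the weighted-percentile computation (function and parameter names are illustrative, not from the patent): sort intensities, build the cumulative sum of the weights, normalize it to [0, 1], and read off the first intensity whose normalized CDF reaches each requested percentile:

```python
import numpy as np

def weighted_percentiles(I_Y, P_face, qs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Weighted percentiles of the Y channel using face probabilities."""
    y = I_Y.reshape(-1).astype(np.float64)
    w = P_face.reshape(-1).astype(np.float64)
    order = np.argsort(y)                # sort by ascending intensity
    y, w = y[order], w[order]
    cdf = np.cumsum(w)
    cdf /= cdf[-1]                       # normalize CDF to [0, 1]
    # first occurrence where the CDF reaches each selected value
    return [float(y[np.searchsorted(cdf, q)]) for q in qs]
```

Replacing P_face with 1 − P_face gives the background percentiles.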
  • Other forms of face probability mapping are possible, such as skin detection algorithms.
  • the mean and standard deviation can be saved during training and used to test the model.
  • a classification model predicts if the face needs the brightness correction or not. Any classifier algorithm can be used – the examples herein use a support-vector machine (SVM) classifier.
  • the input to the classifier is the weighted percentile feature of the face and background as mentioned above.
  • An output of the classification model is a probability or confidence (a.k.a. score). For example, this could be a normalized distance from a hyperplane of the classifier (if using, for example, a support vector machine) or the equivalent, depending on the algorithm used.
  • a score greater than a threshold value (e.g., 0.5) means the face shows a normal lighting condition (no correction needed), and a score less than the threshold means the face is either under-exposed or over-exposed and needs correction. A score closer to 0 represents that the face needs a very strong correction and a score closer to the threshold represents that it needs less correction. Note that this is the reverse of how the eventual tuning parameter is scaled. Because the score implies how strongly to perform the brightness correction on the face, it is included as a feature for the regression model. [0057] To find the best tuning parameter for each training image used during training the regression model, a brute force method can be used, as sketched below. Other optimization algorithms known in the art, such as gradient descent, can also be used. [0058] FIG.2 shows an example of the brute force method.
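  • The brute-force search of FIG.2 can be written as the loop below. This is a rough illustration under stated assumptions: classifier, extract_features, and brightness_correct are placeholders for the trained SVM, the weighted-percentile feature extractor, and the brightness correction block, and the step size theta stands in for the unspecified "small value" θ:

```python
import numpy as np

def find_ground_truth_beta(frame, classifier, extract_features,
                           brightness_correct, theta=0.05):
    """Raise beta until the classifier judges the corrected face
    normally lit (output 1), or beta reaches its maximum of 1."""
    beta = 0.0
    while True:
        corrected = brightness_correct(frame, beta)
        feats = np.asarray(extract_features(corrected)).reshape(1, -1)
        if classifier.predict(feats)[0] == 1 or beta >= 1.0:
            return beta                  # save as the ground-truth parameter
        beta = min(beta + theta, 1.0)    # strengthen the correction and retry
```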
  • FIG.3 shows an example flowchart of the regression during testing.
  • the saved tuning parameter from the brute force method (above) is the “ground truth” value for the regression.
  • Features (315) are extracted from the input frame (310) and fed into the SVR (320) for testing.
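  • The testing-time flow of FIG.3, as a minimal sketch (svr, extract_features, brightness_correct, and hssl are hypothetical handles to the trained regressor and the two correction blocks, not names from the patent):

```python
import numpy as np

def process_frame(frame, svr, extract_features, brightness_correct, hssl):
    """Predict beta, clip it to [0, 1], then correct brightness and contrast."""
    feats = np.asarray(extract_features(frame)).reshape(1, -1)
    beta = float(np.clip(svr.predict(feats)[0], 0.0, 1.0))
    if beta > 0.0:                       # beta == 0: no brightness correction
        frame = brightness_correct(frame, beta)
    return hssl(frame, beta)             # contrast correction on the result
```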
  • FIG.4 shows an example pipeline for the brightness correction block. Faces in an input frame (410) are detected to generate a face probability map (415) of the image (410) from face bounding boxes (416). A brightness correction index is computed (420) (see e.g., FIG.5) from the probability map (415).
  • Luma correction (425) is applied based on the correction index (420) and look-up tables (426) of correction curves.
  • the ratio of the chroma channels is computed (430) and further corrections are applied to those channels (435) based on the ratio (saturation adjustment) to give the output frame (490) for viewing or further processing.
  • FIG.5 shows an example pipeline for the computation of the brightness correction index.
  • the input frame (510) from the face probability map has its dark channel computed (515). This can be seen in terms of a retinex decomposition, which assumes an image can be decomposed into illumination and reflectance.
  • However, the dark channel may not reflect the true intensity of the light reaching the surface. In some examples some colors are extremely saturated, such as [1023, 0, 0] (e.g., a bright red area), and therefore the dark channel becomes 0 (i.e., min(1023, 0, 0)). This will make the brightness correction module use the strongest correction curve and result in an over-brightening of the saturated region. In some embodiments, therefore, the generalized mean across the R, G, and B channels is used as the dark channel: I_dark channel = generalized_mean(I_RGB, p), where p is a non-zero real number (e.g., 0.5).
  • the generalized mean with exponent p of n positive real numbers x_1, ..., x_n is M_p = ((1/n) Σ_{i=1..n} x_i^p)^(1/p), where n is the number of values.
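  • A sketch of the generalized-mean dark channel (the exponent p = 0.5 follows the example in the text; everything else is illustrative). Unlike the plain pixelwise minimum, it stays above zero for saturated colors such as [1023, 0, 0]:

```python
import numpy as np

def dark_channel_generalized(img_rgb: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Pixelwise generalized mean across the R, G, B channels."""
    x = img_rgb.astype(np.float64)
    return np.mean(x ** p, axis=-1) ** (1.0 / p)
```

For [1023, 0, 0] and p = 0.5 this yields about 114 rather than 0, so the brightness correction no longer selects the strongest curve for the saturated region.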
  • the dark channel is then passed through the guided filter (520) to generate the illumination channel. This ensures that noise is removed by smoothing, which results in consistent illumination across the surfaces of neighboring pixels.
  • Illumination describes how much light reaches the surface.
  • This illumination can be used as a map to correct the brightness in the image.
  • the illumination map (520) can be computed by passing the dark channel (515) through a guided filter.
  • this can be computed as I_illumination = GUIDF(I_dark channel, Y), where GUIDF( ) is the guided filter and Y is the guided image. Based on the illumination map, a reshaping function (gamma curve) can be selected from the LUT.
  • the main idea here is to compute the average value in a local neighborhood without introducing artifacts near edges, such as halo artifacts, as the guided-filter sketch below illustrates.
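  • A minimal sketch of the illumination-map step using the guided filter from opencv-contrib-python (the radius and regularization eps are assumed values, not taken from the patent):

```python
import cv2
import numpy as np

def illumination_map(dark_channel: np.ndarray, y_channel: np.ndarray,
                     radius: int = 16, eps: float = 1e-3) -> np.ndarray:
    """Edge-preserving smoothing of the dark channel, guided by luma (Y)."""
    guide = y_channel.astype(np.float32)
    src = dark_channel.astype(np.float32)
    return cv2.ximgproc.guidedFilter(guide, src, radius, eps)
```

The guided filter averages over a local neighborhood while following the edges of the guide image, which is what suppresses halo artifacts here.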
  • the face illumination index (525) can be computed by applying the illumination map (520) to a face probability map (526).
  • a face probability map (face detection) can be performed by a number of known methods, typically using deep learning algorithms.
  • a fast method is one that finds the dominant color of a face using a bounding box. The probability of each pixel being a face region is checked and assigned a value (e.g., 0-1), forming a probability map. See e.g., PCT/US2022/038249 and Shifeng Zhang et al., "Faceboxes: A CPU real-time and accurate unconstrained face detector," Neurocomputing 364 (2019): 297-309.
  • a soft threshold map can be computed as I_th = (I_dark channel − T)_+, i.e., values below the threshold T are clipped to zero, with a soft transition above T.
  • the map is then smoothed as I_oe = GUIDF(I_th, Y), where Y is the Y channel of YCbCr used as the guided image.
  • the result can be further used to calculate an over-exposure index map (536), I_index,oe.
  • FIG.6 shows an example flowchart of detecting an over-exposed region of an image. A soft threshold (615) is applied to the dark channel (610), creating a map where all the pixels below the set threshold (T) are set to zero, with a soft transition for the pixels above the threshold (T).
  • N_th is 5% of the total number of pixels.
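  • The over-exposure detection of FIG.6 might look like the sketch below. The threshold T, the half-window delta, and the linear ramp are assumptions (the patent only specifies a soft transition around T and N_th = 5% of the pixels); intensities are taken as normalized to [0, 1]:

```python
import numpy as np

def overexposure_map(dark_channel: np.ndarray, T: float = 0.85,
                     delta: float = 0.05, frac: float = 0.05):
    """Soft-threshold the dark channel; return None if too few pixels
    exceed the threshold for the frame to count as over-exposed."""
    x = dark_channel.astype(np.float64)
    # 0 below (T - delta), 1 above (T + delta), linear ramp in between
    I_th = np.clip((x - (T - delta)) / (2.0 * delta), 0.0, 1.0)
    N_th = frac * x.size                 # count threshold: 5% of all pixels
    return I_th if I_th.sum() > N_th else None
```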
  • the image is further contrast corrected by an HSSL block (650) (see e.g., FIG.7).
  • FIG.7 shows an example flowchart for HSSL adaptive contrast correction. When correcting the brightness of an over/under-exposed image, the contrast tends to decrease in the resulting image.
  • β is the brightness tuning parameter and r is a user-defined parameter (710) that defines by what percentage the user wants to correct the contrast. In some embodiments, r is set to 0.3, which means the contrast in the face is enhanced by 30% of the default HSSL enhancement strength for non-face regions.
  • This contrast correction is applied to adjust the HSSL local reshaping index map (750) along with P_face, where P_face ranges from 0 to 1.
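  • The exact adjustment formula is garbled in this extraction; one plausible reading, shown purely as a hypothetical sketch, blends the default HSSL local-reshaping strength toward r times that strength inside faces, weighted by P_face and the tuning parameter β:

```python
import numpy as np

def adjust_hssl_index(default_index: np.ndarray, P_face: np.ndarray,
                      beta: float, r: float = 0.3) -> np.ndarray:
    """Scale the local-reshaping index toward r * default inside faces."""
    blend = P_face * beta                # 0 in background, up to beta in faces
    return blend * (r * default_index) + (1.0 - blend) * default_index
```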
  • a correction curve for LUTs is not limited to gamma curves.
  • Gamma correction is a nonlinear way to increase or decrease the intensity of the image.
  • a traditional gamma curve as shown in FIG.8A usually generates a washed-out color when applied to an image.
  • 90% (0.9) of the max intensity is used as the maximum value of the over-exposed pixels (Pix_max).
  • Other values between 0 and 1 can be used for other embodiments.
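  • A sketch of gamma-style correction LUTs (the flipped-curve formula and the use of Pix_max as an output cap are assumptions consistent with FIGs.8A/8B, not equations quoted from the patent); intensities are normalized to [0, 1]:

```python
import numpy as np

def gamma_luts(gamma: float, pix_max: float = 0.9, size: int = 1024):
    """Brightening gamma LUT plus a 'flipped' darkening variant, with
    over-exposed outputs limited to pix_max (e.g., 0.9 of full range)."""
    x = np.linspace(0.0, 1.0, size)
    brighten = x ** gamma                   # FIG.8A-style curve (gamma < 1)
    flipped = 1.0 - (1.0 - x) ** gamma      # FIG.8B-style flipped curve
    return brighten, np.minimum(flipped, pix_max)
```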
  • the ratio of change of the Y channel can be used as the amplification ratio for the chroma channels as well, where Y_new is the corrected Y channel. The saturation tuning parameter takes a value from 0.5 to 1.0; in some embodiments, the value is 0.5. This saturation tuning parameter can be manually tuned by the user as needed.
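  • Putting the last few items together as a hypothetical sketch (the centering around the 10-bit neutral chroma value of 512 and the exact use of the exponent are assumptions; the text only states that the Y-change ratio drives the chroma amplification and that the saturation tuning parameter lies in [0.5, 1.0]):

```python
import numpy as np

def adjust_chroma(cb, cr, y_old, y_new, alpha: float = 0.5,
                  neutral: float = 512.0):
    """Amplify chroma around the neutral point by the (tempered)
    ratio of corrected to original luma."""
    ratio = (np.maximum(y_new, 1e-6) / np.maximum(y_old, 1e-6)) ** alpha
    cb_new = (cb - neutral) * ratio + neutral
    cr_new = (cr - neutral) * ratio + neutral
    return cb_new, cr_new
```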
  • the HSSL LUT is built piecewise from a shifted sigmoid normalized by its maximum; the left side of the curve is constructed first, and the right side can be done similarly.
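  • The construction equations above are garbled in this extraction, so the following is only a generic shifted-sigmoid tone curve in the spirit of FIGs.10-12 (mid-point m and gain g as the two shape parameters), renormalized so that 0 maps to 0 and 1 maps to 1; the patent's actual HSSL LUT construction differs in detail:

```python
import numpy as np

def shifted_sigmoid_lut(m: float, g: float, size: int = 1024) -> np.ndarray:
    """Shifted sigmoid with mid-point m and gain g, renormalized to [0, 1]."""
    x = np.linspace(0.0, 1.0, size)
    s = 1.0 / (1.0 + np.exp(-g * (x - m)))       # shifted sigmoid
    s0 = 1.0 / (1.0 + np.exp(g * m))             # value at x = 0
    s1 = 1.0 / (1.0 + np.exp(-g * (1.0 - m)))    # value at x = 1
    return (s - s0) / (s1 - s0)                  # pin the endpoints
```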
  • a computing device such as a display device, a mobile device, a set-top box, a multimedia device, etc.
  • an apparatus comprises a processor and is configured to perform any of the foregoing methods.
  • a non-transitory computer readable storage medium storing software instructions, which when executed by one or more processors cause performance of any of the foregoing methods.
  • a computing device comprising one or more processors and one or more storage media storing a set of instructions which, when executed by the one or more processors, cause performance of any of the foregoing methods.
  • Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components.
  • the computer and/or IC may perform, control, or execute instructions relating to the adaptive perceptual quantization of images with enhanced dynamic range, such as those described herein.
  • the computer and/or IC may compute any of a variety of parameters or values that relate to the adaptive perceptual quantization processes described herein.
  • the image and video embodiments may be implemented in hardware, software, firmware and various combinations thereof.
  • Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the disclosure. For example, one or more processors in a display, an encoder, a set top box, a transcoder or the like may implement methods related to adaptive perceptual quantization of HDR images as described above by executing software instructions in a program memory accessible to the processors.
  • Embodiments of the invention may also be provided in the form of a program product.
  • the program product may comprise any non-transitory medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of an embodiment of the invention.
  • Program products according to embodiments of the invention may be in any of a wide variety of forms.
  • the program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like.
  • the computer-readable signals on the program product may optionally be compressed or encrypted.
  • the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
  • the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
  • FIG.9 is a block diagram that illustrates a computer system (900) upon which an embodiment of the invention may be implemented.
  • Computer system (900) includes a bus (902) or other communication mechanism for communicating information, and a hardware processor (904) coupled with bus (902) for processing information.
  • a hardware processor (904) may be, for example, a general-purpose microprocessor.
  • the computer system (900) also includes a main memory (906), such as a random-access memory (RAM) or other dynamic storage device, coupled to bus (902) for storing information and instructions to be executed by processor (904).
  • Main memory (906) also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor (904).
  • Computer system (900) further includes a read only memory (ROM) (908) or other static storage device coupled to bus (902) for storing static information and instructions for processor (904).
  • a storage device (910) such as a magnetic disk or optical disk, is provided and coupled to bus (902) for storing information and instructions.
  • Computer system (900) may be coupled via bus (902) to a display (912), such as a liquid crystal display (LCD) or light emitting diode display (LED), for displaying information to a computer user.
  • An input device (914), including alphanumeric and other keys, is coupled to bus (902) for communicating information and command selections to processor (904).
  • cursor control (916) such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor (904) and for controlling cursor movement on display (912).
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • Computer system (900) may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system (900) to be a special-purpose machine. According to one embodiment, the techniques as described herein are performed by computer system (900) in response to processor (904) executing one or more sequences of one or more instructions contained in main memory (906). Such instructions may be read into main memory (906) from another storage medium, such as storage device (910). Execution of the sequences of instructions contained in main memory (906) causes processor (904) to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
  • Non-volatile media includes, for example, optical or magnetic disks, such as storage device (910).
  • Volatile media includes dynamic memory, such as main memory (906).
  • Storage media includes, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.
  • Storage media is distinct from but may be used in conjunction with transmission media.
  • Transmission media participates in transferring information between storage media.
  • transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus (902).
  • Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
  • Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor (904) for execution.
  • the instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
  • a modem local to computer system (900) can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
  • An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus (902).
  • Bus (902) carries the data to main memory (906), from which processor (904) retrieves and executes the instructions.
  • Computer system (900) also includes a communication interface (918) coupled to bus (902).
  • Communication interface (918) provides a two-way data communication coupling to a network link (920) that is connected to a local network (922).
  • communication interface (918) may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
  • ISDN integrated services digital network
  • communication interface (918) may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
  • LAN local area network
  • Wireless links may also be implemented.
  • Network link (920) typically provides data communication through one or more networks to other data devices.
  • network link (920) may provide a connection through local network (922) to a host computer (924) or to data equipment operated by an Internet Service Provider (ISP).
  • The ISP in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the "Internet" (928).
  • Local network (922) and Internet (928) both use electrical, electromagnetic or optical signals that carry digital data streams.
  • the signals through the various networks and the signals on network link (920) and through communication interface (918), which carry the digital data to and from computer system (900), are example forms of transmission media.
  • Computer system (900) can send messages and receive data, including program code, through the network(s), network link (920) and communication interface (918).
  • a server (930) might transmit a requested code for an application program through Internet (928), ISP on the Internet, local network (922) and communication interface (918).
  • the received code may be executed by processor (904) as it is received, and/or stored in storage device (910), or other non-volatile storage for later execution.
  • EEE1. A method for adaptively applying brightness correction to an image, the method comprising: using a regression model to determine if image correction is needed on the image; determining likely face regions of the image; if the regression model determines that image correction is needed, applying brightness correction to the likely face regions of the image based on a tuning parameter from the regression model.
  • EEE2. The method of EEE1, further comprising applying over-exposure correction to background regions of the image based on an over-exposed probability of the image, wherein background regions are all regions that are not the likely face regions.
  • EEE3. The method of EEEs 1 or 2, wherein the determining likely face regions comprises using a face detection module and forming a face probability map of the image.
  • EEE4. The method of any of EEEs 1-3, further comprising applying contrast correction to the image after the applying brightness correction.
  • EEE5. The method of EEE4, wherein the applying contrast correction comprises a hybrid shifted sigmoid function with linear mapping.
  • EEE6. The method of EEE5, wherein the applying contrast correction is based on the tuning parameter and a user-defined parameter.
  • EEE7. The method of EEE4, wherein the applying contrast correction comprises applying a soft threshold to a dark channel of the image.
  • EEE8. The method of EEE7, wherein the applying a soft threshold comprises comparing a threshold map to a count threshold value.
  • EEE9. The method of EEE8, wherein the count threshold value is 5% of the total number of pixels of the image.
  • EEE10. The method of any of EEEs 7-9, wherein a guided image is used to smooth the dark channel.
  • EEE11. The method of any of EEEs 7-10, wherein the dark channel is determined by a generalized mean of channels of the image.
  • EEE12. The method of any of EEEs 1-11, wherein the regression model uses a saved ground truth value for the tuning parameter determined by a brute force method.
  • EEE13. The method of any of EEEs 1-12, wherein the brightness correction uses a look-up table of correction curves.
  • EEE14. The method of any of EEEs 1-13, wherein the applying brightness correction is performed on multiple contiguous images.
  • EEE15. An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 1-14.
  • EEE16. The apparatus of EEE15, wherein the apparatus is the same device as a device displaying the image.
  • EEE17. The apparatus of EEE16, wherein software for performing the method is a stand-alone software package separate from, but interacting with, software used to display the image.
  • EEE18. The apparatus of EEE15, wherein the apparatus comprises a video decoder and the method is performed in the video decoder.
  • EEE19. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing a method with one or more processors in accordance with any of the methods recited in EEEs 1-14.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A still or video image is brightness-adjusted by separately applying a regression model to find a brightness tuning parameter and then applying a brightness correction, using the tuning parameter, to detected faces in the image. Background over-exposure can also be corrected. The image can be further enhanced with a separate contrast correction. The regression model can also determine if any correction is needed for the image through binary classification.

Description

ADAPTIVE FACE BRIGHTNESS ADJUSTMENT FOR IMAGES AND VIDEO CROSS-REFERENCE TO RELATED APPLICATIONS [001] This application claims priority to U.S. Provisional Application No. 63/384,739, filed November 22, 2022, which is incorporated by reference in its entirety. TECHNOLOGY [002] The present disclosure relates generally to image processing operations. More particularly, an embodiment of the present disclosure relates to brightness adjustment of images containing faces. BACKGROUND [003] With the use of video conferencing on the rise, producing quality video images from non-professionally produced video (e.g., at-home or in-office video conferencing through a web camera with less-than-ideal ambient lighting) is a concern. If a person’s face isn’t well lit, then the usual solution is to increase the gamma of the video image. However, if the background of that person is well lit, then increasing the gamma can cause the background to be overexposed, which produces a poor image. Additionally, merely increasing the gamma on someone’s face does not always produce accurate or attractive results. [004] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated. SUMMARY [005] An embodiment of the present invention is a method for adaptively applying brightness correction to an image, the method comprising: using a regression model to determine if image correction is needed on the image; determining likely face regions of the image; if the regression model determines that image correction is needed, applying brightness correction to the likely face regions of the image based on a tuning parameter from the regression model. In some embodiments the method further comprises applying over-exposure correction to background regions of the image based on an over-exposed probability of the image, wherein background regions are all regions that are not the likely face regions. [006] A method may be computer-implemented in some embodiments. For example, the method may be implemented, at least in part, via a control system comprising one or more processors and one or more non-transitory storage media. [007] Some or all of the methods described herein may be performed by one or more devices according to instructions (e.g., software) stored on one or more non-transitory media. Such non-transitory media may include memory devices such as those described herein, including but not limited to random access memory (RAM) devices, read-only memory (ROM) devices, etc. [008] Accordingly, various innovative aspects of the subject matter described in this disclosure may be implemented in a non-transitory medium having software stored thereon. The software may, for example, be executable by one or more components of a control system such as those disclosed herein. The software may, for example, include instructions for performing one or more of the methods disclosed herein. [009] At least some aspects of the present disclosure may be implemented via an apparatus or apparatuses.
For example, one or more devices may be configured for performing, at least in part, the methods disclosed herein. In some implementations, an apparatus may include an interface system and a control system. The interface system may include one or more network interfaces, one or more interfaces between the control system and memory system, one or more interfaces between the control system and another device and/or one or more external device interfaces. The control system may include at least one of a general-purpose single- or multi-chip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. Accordingly, in some implementations the control system may include one or more processors and one or more non-transitory storage media operatively coupled to one or more processors. [0010] Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale. Like reference numbers and designations in the various drawings generally indicate like elements, but different reference numbers do not necessarily designate different elements between different drawings. BRIEF DESCRIPTION OF THE DRAWINGS [0011] An embodiment of the present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which: [0012] FIG.1 illustrates an example flowchart of an embodiment of the method. [0013] FIG.2 illustrates an example of a brute force method for finding a best (“ground truth”) tuning parameter. [0014] FIG.3 illustrates an example flowchart of the regression model during testing. [0015] FIG.4 illustrates an example pipeline for the brightness correction block. [0016] FIG.5 illustrates an example pipeline for the computation of the brightness correction index. [0017] FIG.6 illustrates an example flowchart of detecting an over-exposed region of an image. [0018] FIG.7 illustrates an example flowchart for HSSL adaptive contrast correction. [0019] FIGs.8A and 8B illustrate an example gamma curve (8A) and a flipped gamma curve (8B). [0020] FIG.9 illustrates a simplified block diagram of an example hardware platform on which a computer or a computing device as described herein may be implemented. [0021] FIG.10 illustrates an example sigmoid curve for a look up table for a color channel. [0022] FIG.11 illustrates HSSL local reshaping of a sigmoid curve with fixed g values and varied m values. [0023] FIG.12 illustrates HSSL local reshaping of a sigmoid curve with fixed m values and varied g values. DETAILED DESCRIPTION [0024] In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, that the present disclosure may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present disclosure. 
[0025] As used herein, “brightness” refers to the intensity of the pixels in question. “Contrast” refers to the relative difference in brightness between different groups of pixels (e.g., regions of the image). “Over-exposure” refers to a level of brightness greater than expected for pixels in question, and “under-exposure” refers to a level of brightness less than expected for pixels in question. [0026] As used herein, “face” refers to a region of an image that has been detected to show a person. Typically, “face” refers mainly to the head region of a person, but it can include more depending on the detection method. “Background” refers to regions of the image that are not considered “face”. [0027] As used herein “dark channel” refers to the pixelwise minimum value of a channel (e.g., R, G, or B) of the image in question. [0028] As used herein “HSSL” refers to a hybrid shifted sigmoid function with linear mapping. [0029] As used herein, “LUT” refers to a look-up table in software/firmware. [0030] A method for adaptively correcting the brightness of face regions and the over- exposure of background regions of an image (e.g., for video) can be accomplished by the use of a regression model for determining a brightness tuning parameter and a brightness/overexposure correction block to apply brightness correction to detected face regions of the image based on the tuning parameter. In some embodiments, the regression model can also determine if any correction is needed at all, bypassing the brightness/overexposure correction block if no correction is needed. In some embodiments, a contrast correction block is used to enhance the contrast of the face regions based on the tuning parameter, to improve the image after brightness correction. [0031] The method/system is particularly useful for video conferencing/streaming situations, where the lighting might be sub-optimal for presenting a uniform brightness and contrast between the person being viewed on the video and their background. Example platforms this can be used on are YouTube™, TikTok™, Twitch™, Zoom™, Teams™, BlueJeans™, etc. The process can be used on video or still images, used on streaming content or stored content, and applied through an add-on, an embedded feature, or a stand-alone system. The system can be implemented in a video decoder to enhance the input video stream. [0032] FIG.1 shows an example flowchart for an embodiment of the method. A frame/image (110) is input to a machine learning regression model (120). The regression model (120) can first determine if brightness correction is needed (125) from the parameters of the input frame (110) (e.g., binary classification). [0033] If contrast correction (e.g., HSSL) is being used by the system and the model (120) determines that brightness correction is not needed, the input frame (110) is directly corrected by a contrast correction block (140) and output (150) for further processing or viewing. If correction is needed, the regression model (120) determines a tuning parameter that reflects an estimated amount of correction needed. The parameter is passed to a brightness correction block (130) that adjusts the brightness of face regions (detected) of the input image (110) based on the tuning parameter from the regression model (120). The image is further corrected by a contrast correction block (140) that adjusts contrast levels of the face regions again based on the tuning parameter. The image is then output (150) for further processing or viewing. 
[0034] If contrast correction is not being used and the model (120) determines that brightness correction is not needed, the input frame (110) bypasses brightness correction (130) and is output (150) for further processing or viewing. [0035] In the first stage, the regression model determines if the image needs the correction or not by predicting the brightness tuning parameter (β) for the image. This brightness tuning parameter controls how much correction to apply to make the face brighter (if the face in the image is dark or under-exposed) or darker (if the face in the image is very bright or over-exposed). The brightness tuning parameter is a value ranging from 0 to 1. “0” means the image does not need brightness correction (for normal exposure/normal lighting in the face region) and “1” means the image needs the maximum correction. The values in between designate to what degree correction is needed (e.g., 0.5 might indicate 50% correction from maximum). In some embodiments, the regression model does not make an initial determination if the image needs correction, and just passes the tuning parameter to the correction blocks (which, if the parameter is 0, would not adjust the image). [0036] Using a face probability map (see e.g., PCT/US2022/038249, published as WO 2023/009469, incorporated by reference herein), histograms of the face and background can be computed separately. A “face probability map” (P_face) is a map showing the probabilities (e.g., 0 to 1) of a pixel being in a face region of the image. The background probability (P_background) is 1 − P_face. In some embodiments, these histograms can be used as a feature for classification for correction. However, in many cases, the histogram is not an accurate representation of the image, and the histogram can change drastically by changing the number of bins used, so in some embodiments, a weighted histogram is used. Weighted histograms can be computed by summing the face or background probabilities corresponding to each pixel intensity. In some embodiments, instead of a weighted histogram, weighted percentiles are used. A face probability map can also be generated by other methods known in the art, such as deep-learning based face segmentation algorithms. [0037] A regular histogram counts the total number of pixels at each intensity level. On the other hand, a weighted histogram sums the weights associated with the pixels at each intensity level. To find the weighted histogram of the face region, the face probability map can be used as the weights. Weighted Histogram [0038] The algorithm to compute a weighted histogram of the Y channel can be as follows: [0039] STEP 1: Take the Y channel of the image (I_Y) and a face probability map (P_face) as input. [0040] STEP 2: Reshape both to 1-dimension and combine them together. [0041] STEP 3: Sort the combined array based on the ascending order of the pixel intensities: S = sortrows([I_Y, P_face]), where sortrows( ) is a function that sorts the data in ascending order based on the first column. [0042] STEP 4: For each pixel intensity of I_Y (e.g., 64 to 960 for a Y channel in 10-bit SMPTE (Society of Motion Picture and Television Engineers standard) range), the sum of the weights of the pixels at bin i is defined as w_i = Σ_{k=1..n_i} P_face,i(k), where n_i is the total number of pixels in bin i of the Y channel, and P_face,i(k) is the weight of the kth pixel in the ith bin. Weighted Percentile [0043] A weighted percentile can be computed similarly to the weighted histogram. An algorithm to compute the percentile for the Y channel is shown below: [0044] STEP 1: Take the Y channel of the image (I_Y) and a face probability map (P_face) as input. [0045] STEP 2: Reshape both to 1-dimension (“flatten”) and combine them together. [0046] STEP 3: Sort the combined array based on the ascending order of the pixel intensities: S = sortrows([I_Y, P_face]). [0047] STEP 4: Compute the cumulative distribution function (CDF) of the sorted P_face from Step 3. [0048] STEP 5: Normalize the CDF to range from 0 to 1: CDF_norm = CDF / max(CDF). [0049] STEP 6: Find the first occurrence of CDF_norm for selected values between 0 and 1 (e.g., 0.1, 0.25, 0.5, 0.75 and 0.9, which gives the indexes of the 10th, 25th, 50th, 75th, and 90th percentiles respectively). [0050] STEP 7: Map the index value to the intensity value from S. [0051] To compute percentiles for the chroma channels I_Cb and I_Cr, use similar methods. In some embodiments, only the 50th percentile is used for the chroma channels. For weighted histograms of the background region, replace the face probability P_face by the background probability (1 − P_face) in the above steps. [0052] Other forms of face probability mapping are possible, such as skin detection algorithms. [0053] Concatenate the percentiles from the face and background and use them as features for the regression model: F = concat(percentiles_face, percentiles_background), F_n = N(F, μ, σ), where N( ) is a normalization function that normalizes the data to have mean 0 and standard deviation 1, and μ is the mean and σ is the standard deviation of all F. [0054] The mean and standard deviation can be saved during training and used to test the model. [0055] A classification model predicts if the face needs the brightness correction or not. Any classifier algorithm can be used – the examples herein use a support-vector machine (SVM) classifier. The input to the classifier is the weighted percentile feature of the face and background as mentioned above. [0056] An output of the classification model is a probability or confidence (a.k.a. score). For example, this could be a normalized distance from a hyperplane of the classifier (if using, for example, a support vector machine) or the equivalent, depending on the algorithm used. A score greater than a threshold value (e.g., 0.5) means the face shows a normal lighting condition (no correction needed), and a score less than the threshold means the face is either under-exposed or over-exposed and needs correction. A score closer to 0 represents that the face needs a very strong correction and a score closer to the threshold represents that it needs less correction. Note that this is the reverse of how the eventual tuning parameter is scaled. Because the score implies how strongly to perform the brightness correction on the face, it is included as a feature for the regression model. [0057] To find the best tuning parameter for each training image used during training the regression model, a brute force method can be used. Other optimization algorithms known in the art, such as gradient descent, can also be used. [0058] FIG.2 shows an example of the brute force method. First, from the input frame (210) extract the percentile features (215) and pass them through the classification model (220). Start the tuning parameter at zero (β = 0).
Based on the output of the classification model, update the tuning parameter (225) by some small value θ and compute the new brightness enhanced image using the brightness correction module (230) (see below). Use the classification model (220) to predict if face in the new enhanced output needs further correction or not. If it still needs correction, increase β by θ and apply the brightness enhancement again. Once the classification output for enhanced image is 1 or the max value of β is reached (e.g., β=1), stop the iteration and save the final value of β as the best tuning parameter (240). [0059] FIG.3 shows an example flowchart of the regression during testing. The saved tuning parameter from the brute force method (above) is the “ground truth” value for the regression. Features (315) are extracted from the input frame (310) and fed into the SVR (320) for testing. The brightness correction parameter β is predicted by the SVR and clipped to [0, 1] range. If β = 0, which means the face does not need the brightness correction, directly compute the HSSL (330) (see below) to correct contrast and generate the final output. If 0 < ^ ≤ 1 then we first correct the face brightness (325) using the value of β and then pass the resulting image to HSSL block (330) to generate the final output image (390). [0060] Normally, brightness correction is done by overall gamma correction, but this can result in uneven results if the face and background are not lit evenly (e.g., brightly lit face against a dark background, or vice versa). So, the brightness correction block herein corrects the face and background separately (e.g., increasing the brightness of the face while correcting the over-exposure of the background). [0061] FIG.4 shows an example pipeline for the brightness correction block. Faces in an input frame (410) are detected to generate a face probability map (415) of the image (410) from face bounding boxes (416). A brightness correction index is computed (420) (see e.g., FIG.5) from the probability map (415). Luma correction (425) is applied based on the correction index (420) and look-up tables (426) of correction curves. The ratio of the chroma channels is computed (430) and further corrections are applied to those channels (435) based on the ratio (saturation adjustment) to give the output frame (490) for viewing or further processing. [0062] FIG.5 shows an example pipeline for the computation of the brightness correction index. The input frame (510) from the face probability map has its dark channel computed (515). This can be seen in terms of a retinex decomposition, which assumes an image can be decomposed into illumination and reflectance. To estimate the illumination, we first calculate the dark channel, which is the pixelwise minimum value of the channels (e.g., R, G or B channel) of the image I: EF^)$ ^G^""^H = I70(EJ , EK , EL) [0063] However, the
Figure imgf000012_0001
the true intensity of the light reaching the surface. In some examples some colors are extremely saturated, such as [1023, 0, 0] (e.g. a bright red area), and therefore the dark channel becomes 0 (i.e. min(1023, 0, 0)). This will make the brightness correction module use the strongest correction curve and result in an over brightening of the saturated region. In some embodiments, therefore, the generalized mean across the R, G, and B channel is used as the dark channel prior to the illumination map estimation (525) EF^)$ ^G^""^H = ;5056287M5=_I520(EJKL , 4) where, p is a non-zero real number (e.g., 0.5), thereby avoiding over-correction of brightness. [0064] The generalized mean with exponent p of these positive real numbers is: " ^/U where n is the
Figure imgf000012_0002
[0065] The dark channel is then passed through the guided filter (520) to generate the illumination channel. This smoothing removes noise, resulting in consistent illumination across the surfaces of neighboring pixels. Illumination describes how much light reaches a surface, and it can be used as a map to correct the brightness in the image. The illumination map (520) can be computed by passing the dark channel (515) through a guided filter. The Y channel of the YCbCr color space, for example, can be used as the guide image to smoothen the dark channel:

I_illumination = GuiF(I_dark channel, Y)

where GuiF(·, Y) denotes guided filtering using Y as the guide image.
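For illustration only, a compact sketch of this guided filtering step, following the algorithm of He et al. referenced below, is given here; the radius and eps values are illustrative assumptions:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def guided_filter(guide, src, radius=8, eps=1e-3):
        """Edge-preserving smoothing of src under the guidance of guide
        (both float arrays in [0, 1] of the same shape)."""
        size = 2 * radius + 1
        mean_i = uniform_filter(guide, size)
        mean_p = uniform_filter(src, size)
        corr_ip = uniform_filter(guide * src, size)
        corr_ii = uniform_filter(guide * guide, size)
        var_i = corr_ii - mean_i * mean_i           # local variance of the guide
        cov_ip = corr_ip - mean_i * mean_p          # local covariance guide/source
        a = cov_ip / (var_i + eps)
        b = mean_p - a * mean_i
        return uniform_filter(a, size) * guide + uniform_filter(b, size)

    # illumination = guided_filter(Y, dark_channel)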
[0066] Based on the illumination map (520), one can select a reshaping function (gamma curve) from the LUT (540). The main idea here is to compute the average value in a local neighborhood without introducing artifacts near edges, such as halo artifacts. Hence, the guided image filter (520) is used to smooth the image while preserving edges. See e.g., Kaiming He et al., "Guided Image Filtering", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 6, June 2013. [0067] The face illumination index (525) can be computed by applying the illumination map (520) to a face probability map (526). A face probability map (face detection) can be generated by a number of known methods, typically using deep learning algorithms. A fast method is one that finds the dominant color of a face using a bounding box. The probability of each pixel being a face region is checked and assigned a value (e.g., 0-1), forming a probability map. See e.g., PCT/US2022/038249 and Shifeng Zhang et al., "Faceboxes: A CPU real-time and accurate unconstrained face detector", Neurocomputing 364 (2019): 297-309. [0068] When the lighting condition on the face is extremely low and the illumination is not consistent, some parts of the face are not detected by the face detection module, and hence the face probability map estimator also fails to detect these as part of the face. The probability value around such an area becomes very close to 0. This can lead to errors when applying the gamma correction, as areas in the dark region of the face that need a strong gamma to correct would be given a smaller gamma because of the small value of the face probability. [0069] To solve this issue for the computation of the face probability map, enhance the whole image using global gamma correction with a gamma value (e.g., 0.5). Using this brightened image for the face map estimation improves the accuracy of the face segmentation without introducing artifacts. It is noted that a gamma value above 0.5 does not add significant improvement to the face probability map estimation. The global gamma correction is only used for the face probability map generation to ensure better reconstruction. [0070] Parallel to the face illumination correction (525), over-exposure of the background is also corrected (536). First, a soft threshold (530) is applied to the dark channel (see e.g., FIG. 6) to create a soft threshold map I_th:

I_th = max(I_dark channel − T, 0)

where T is the intensity threshold.
Then it goes through smoothing (535) using a guided filter (similar to what was used for the illumination map (520)). This ensures that there are no holes or missing patches after soft thresholding. The smoothing can be computed as:

I_oe = GuiF(I_th, Y)

where I_oe is the over-exposure map and I_th is the output from soft thresholding of the dark channel; Y, the Y channel of YCbCr, is used as the guide image. The result can be further used to calculate an over-exposure index map (536), I_index,oe. [0071] Once the face illumination index (525) and over-exposure index (536) are computed, they can be used to apply brightness/over-exposure correction (550) to the image through the use of appropriate curves found in look-up tables (540) (e.g., one table for the face region, one table for the background region), producing the output frame (590) for viewing or further processing. [0072] FIG. 6 shows an example flowchart of detecting an over-exposed region of an image. Apply a soft threshold (615) to the dark channel (610). This creates a map where all the pixels below the set threshold (T) are set to zero, with a soft transition for the pixels above the threshold (T). Then sum up the threshold map (I_th) and compare (620) whether the sum exceeds the count threshold value (N_th). This count threshold (N_th) makes sure that there are enough pixels in the image that exceed the intensity threshold (T) for the image to be considered "over-exposed". If sum(I_th) > N_th, compute an over-exposure map (630), which is used in the brightness and over-exposure correction block (640). In some embodiments, N_th is 5% of the total number of pixels. In some embodiments, the image is further contrast corrected by an HSSL block (650) (see e.g., FIG. 7). [0073] Alternatively, one can use soft thresholding to get the over-exposed probability p_oe for the image. Define a parameter Δ_Nth as half of the size of the soft thresholding window such that:

p_oe = min( max( (sum(I_th) − (N_th − Δ_Nth)) / (2Δ_Nth), 0 ), 1 )

so that p_oe ramps linearly from 0 to 1 as sum(I_th) moves across the window [N_th − Δ_Nth, N_th + Δ_Nth].
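As a non-limiting sketch of this over-exposure detection path, the following Python fragment combines the soft threshold map with the soft decision on its sum; T, frac_nth and frac_delta are illustrative values rather than values mandated by the embodiments:

    import numpy as np

    def overexposed_probability(dark, T=0.9, frac_nth=0.05, frac_delta=0.01):
        """Soft-threshold the dark channel and map the mass of bright pixels
        to a probability p_oe in [0, 1]."""
        i_th = np.maximum(dark - T, 0.0)             # soft threshold map
        n_th = frac_nth * dark.size                  # count threshold N_th
        delta = frac_delta * dark.size               # half-window Delta_Nth
        p_oe = np.clip((i_th.sum() - (n_th - delta)) / (2.0 * delta), 0.0, 1.0)
        return p_oe, i_th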
[0074] Then, update EXG ← 4(^ × EXG. Because now EXG is scaled by 4(^, it can fall through the complete over exposure correction process no matter the image is heavily over exposed, slightly over exposed, or not over exposed. [0075] FIG.7 shows an example flowchart for HSSL adaptive contrast correction. When correcting the brightness of over/under exposed image, the contrast seems to decrease in the resulting image. To restore the lost contrast, pass the brightness restored image through HSSL with adaptive contrast correction (720) in the face region (740) of the image (730) based on the brightness tuning parameter as: ^/0362M3 ^/665137/0 = q × r where, q is the brightness tuning parameter and r is a user defined parameter (710) that defines by what percentage user want to correct the contrast. By default, r is set to 0.3 which means the contrast in the face is enhanced by 30% of the default HSSL enhancement strength for non-face region. This contrast correction is applied to adjust the HSSL local reshaping index map (750) along with the Pface where Pface ranges from 0 to 1. Applying the HSSL index map (750) by an HSSL local reshaping function (760) produces an output frame (770) for viewing or further processing. Look-Up Tables (LUTs) [0076] The easiest way to correct the brightness of the image is to use gamma correction. However, a correction curve for LUTs is not limited to gamma curves. One can also use a piece wise linear function, s-shaped curves, etc. as correction curves. Gamma correction is a nonlinear way to increase or decrease the intensity of the image. A traditional gamma curve as shown in FIG.8A usually generates a washed-out color when applied to an image. The gamma curve (810) bulges in the lower intensity pixel. Since the correction in the mid tone are less than that of pixel in lower intensity range the image look hazy or washed out. Therefore, using a flipped gamma (820) solves the issue when gamma value is less than 1, as shown in FIG.8B. [0077] The correction curve is defined by the following equation: Y = 1t 9/6 :67;ℎ370350M73u 1/665137/0 (v ≥ 1) 1) where c is
Figure imgf000015_0001
range 1 = [0: 1: (>yz − 1)]/>yz where, NCW is the number of of intensities used to represent the
Figure imgf000016_0001
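A minimal Python sketch of this correction curve, assuming the reconstruction of the flipped-gamma branch given above, may look as follows:

    import numpy as np

    def correction_curve(gamma, n_cw=1024):
        """Regular gamma for gamma >= 1 and flipped gamma for gamma < 1
        (the flipped branch remains valid even for gamma <= 0)."""
        c = np.arange(n_cw) / n_cw                   # code-word intensities
        if gamma >= 1:
            return np.power(c, gamma)                # darkens / adds contrast
        return 1.0 - np.power(1.0 - c, 2.0 - gamma)  # brightens without washing out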
[0078] A function that controls how the gamma values are spread along the pixel intensities can be defined. In some embodiments, it is preferable to have weaker gamma values (γ closer to 1) for the mid tones, and stronger gamma values for the highlights (γ much greater than 1) and the shadow region (γ much smaller than 1). In those cases, the relation between pixel intensity and gamma value can be defined using the following power function:

γ = 1 + (3|s − 0.5|)^w × sgn(s − 0.5)

where s is the pixel intensity in the illumination map, w is a fixed positive exponent, and sgn(·) is the sign function. The above equation shows that when the pixel intensity is at the center of the mid tone, i.e., s = 0.5, γ = 1. For computational efficiency, uniformly sample N_g points in pixel intensity from 1 to 0 to construct a list of gamma values and the corresponding modified gamma curves as a LUT, where N_g is an odd integer. In some embodiments, N_g = 1025. [0079] The details of constructing the gamma value list are described as follows. Because the above function is symmetric about s = 0.5, construct half of the gamma value list from the half intensity list and then flip and concatenate it to get the complete list:

s_list^half = linspace(0.5, 0, ⌈N_g/2⌉)

where ⌈·⌉ denotes the ceiling function and linspace(a, b, n) uniformly samples n points from a to b. Finally, construct the LUT for each gamma value in the list such that LUT_face(i, j) corresponds to the i-th gamma value and the j-th pixel intensity.
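By way of non-limiting illustration, the gamma list and the face LUT may be built as in the following sketch; for clarity the full list is sampled directly rather than constructing a half list and mirroring it, and the exponent w is an illustrative assumption:

    import numpy as np

    def build_face_lut(n_g=1025, n_cw=1024, w=1.5):
        """One modified gamma curve per sampled illumination value s,
        with gamma(s) = 1 + (3|s - 0.5|)^w * sgn(s - 0.5)."""
        s = np.linspace(1.0, 0.0, n_g)               # sampled illumination values
        gammas = 1.0 + (3.0 * np.abs(s - 0.5)) ** w * np.sign(s - 0.5)
        c = np.arange(n_cw) / n_cw                   # code-word intensities
        lut = np.empty((n_g, n_cw))
        for i, g in enumerate(gammas):
            if g >= 1.0:
                lut[i] = c ** g                      # regular gamma
            else:
                lut[i] = 1.0 - (1.0 - c) ** (2.0 - g)  # flipped gamma
        return lut                                   # LUT_face[i, j]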
[0080] To correct the over-exposure, use a clipped gamma curve at the high intensity levels. This limits the maximum intensity to the threshold value thres_oe. The new gamma curve for the over-exposed pixels is given by:

g_oe = thres_oe × c^γ

where γ is the value of gamma. In some embodiments, 90% (0.9) of the maximum intensity is used as the maximum value of the over-exposed pixels (Pix_max). Other values between 0 and 1 can be used in other embodiments. [0081] The threshold thres_oe and the over-exposure value s_oe in the over-exposure map I_oe are related using the following equation:

thres_oe = −0.1 × (s_oe − 1) + Pix_max

So, when s_oe = 1, thres_oe = Pix_max. For computational efficiency, uniformly sample ⌈N_g/2⌉ + 1 points in probability value from 1 to 0 to construct a list of threshold values and the clipped gamma curves as a LUT. [0082] The details of constructing the gamma value list are described as follows. For the gamma values, reuse the first half of γ_list from above:

s_oe,list = linspace(1, 0, ⌈N_g/2⌉ + 1)
thres_oe,list = −0.1 × (s_oe,list − 1) + 0.9
γ_oe,list = first half of γ_list

[0083] Finally, the LUT of g_oe is constructed from thres_oe,list and γ_oe,list. Construct the LUT for each threshold and gamma value in the list such that LUT_oe(i, j) corresponds to the i-th threshold and gamma values and the j-th pixel intensity.
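A corresponding non-limiting sketch for the over-exposure LUT follows; reusing the γ ≥ 1 half of the face gamma list is an illustrative reconstruction, and pix_max and w are illustrative values:

    import numpy as np

    def build_oe_lut(n_g=1025, n_cw=1024, pix_max=0.9, w=1.5):
        """Clipped gamma curves g_oe = thres_oe * c**gamma, one per sampled
        over-exposure value s_oe."""
        half = int(np.ceil(n_g / 2))
        s_oe = np.linspace(1.0, 0.0, half + 1)            # over-exposure values
        thres = -0.1 * (s_oe - 1.0) + pix_max             # per-curve clip level
        s = np.linspace(1.0, 0.5, half + 1)               # gamma >= 1 side only
        gammas = 1.0 + (3.0 * np.abs(s - 0.5)) ** w       # sgn is +1 on this side
        c = np.arange(n_cw) / n_cw
        return thres[:, None] * np.power(c[None, :], gammas[:, None])  # LUT_oe[i, j]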
Local Reshaping Functions

[0084] In the following sections, it is shown how to choose which mapping function to use as a local reshaping function (LRF) index map for the face and over-exposure (background) regions.

Algorithm for Face Index Map Generation

[0085] STEP 1: Compute the quantization step q = 1/N_g, where N_g is the number of gamma curves used to generate the LUT and q is the quantization step. [0086] STEP 2: Find the illumination map index for the entire image:

I_index,illum = (N_g + 1) − clip3( ⌈I_illumination / q⌉ + 1, 1, N_g )

where the clipping function is defined as:

clip3(x, a, b) = a if x < a; b if x > b; x otherwise

[0087] STEP 3: Compute the illumination map index for the face region only:

I_index,face = ⌈ ( I_index,illum − (1 + ⌈N_g/2⌉) ) × P_face × β ⌉ + (1 + ⌈N_g/2⌉)

where P_face is the face probability map and β is the brightness tuning parameter, so that indices outside the face fall back to the identity curve at the center of the list.
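For illustration only, these three steps may be sketched in Python as follows (function and variable names are illustrative assumptions):

    import numpy as np

    def face_index_map(illum, p_face, beta, n_g=1025):
        """Quantize the illumination map into 1-based curve indices, then
        pull indices toward the central identity curve where the face
        probability or beta is small."""
        q = 1.0 / n_g                                     # quantization step
        idx_illum = (n_g + 1) - np.clip(np.ceil(illum / q) + 1, 1, n_g)
        center = 1 + int(np.ceil(n_g / 2))                # identity-curve index
        idx_face = np.ceil((idx_illum - center) * p_face * beta) + center
        return idx_face.astype(int)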
Algorithm for Over-Exposure Index Map Generation

[0088] STEP 1: Compute I_oe (e.g., using the guided filter). [0089] STEP 2: Compute the probability of the over-exposed region as:

P_oe = (1 − P_face) × I_oe

[0090] STEP 3: Compute the over-exposure index map as:

I_index,oe = ( ⌈N_g/2⌉ + 1 ) − clip3( ⌈P_oe × ⌈N_g/2⌉⌉, 1, ⌈N_g/2⌉ )
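A matching non-limiting sketch of the over-exposure index map:

    import numpy as np

    def oe_index_map(i_oe, p_face, n_g=1025):
        """Background over-exposure probability mapped to 1-based indices
        into the clipped-gamma LUT."""
        half = int(np.ceil(n_g / 2))
        p_oe = (1.0 - p_face) * i_oe                      # exclude face pixels
        return ((half + 1) - np.clip(np.ceil(p_oe * half), 1, half)).astype(int)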
Local Reshaping Mechanism

[0091] The output of the Y channel can be calculated directly from the LUTs and index maps. First apply the LUT for face brightness correction and then apply the LUT for over-exposure correction. For the i-th pixel, the output Y channel is:

Y_new(i) = LUT_oe( I_index,oe(i), LUT_face( I_index,face(i), Y(i) ) )

[0092] Note that to use the LUT, quantize the pixel value by multiplying by N_CW and taking the floor.
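By way of non-limiting illustration, the two-stage LUT application for [0091] and the quantization noted above may be sketched as follows:

    import numpy as np

    def apply_luts(y, idx_face, idx_oe, lut_face, lut_oe, n_cw=1024):
        """Apply the face LUT first and the over-exposure LUT second,
        quantizing Y by multiplying by N_CW and taking the floor."""
        j = np.clip(np.floor(y * n_cw), 0, n_cw - 1).astype(int)
        y_face = lut_face[idx_face - 1, j]                # 1-based curve indices
        j2 = np.clip(np.floor(y_face * n_cw), 0, n_cw - 1).astype(int)
        return lut_oe[idx_oe - 1, j2]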
Saturation Adjustment

[0093] If only the brightness in the Y channel is adjusted, the image looks washed out. Therefore, one needs to correct the chroma channels to preserve the saturation. The idea is to increase the magnitude while not changing the direction in YCbCr color space. The ratio of change of the Y channel can be used as the amplification ratio for the chroma channels as well. The ratio is defined as:

ratio_YCbCr = (Y_new / Y)^α

where Y_new is the corrected Y channel, Y is the original Y channel, and α is the saturation tuning parameter. In some embodiments, the value of this saturation tuning parameter is from 0.5 to 1.0. In some embodiments, the value of α is 0.5. This saturation tuning parameter can be manually tuned by the user as needed. [0094] Then apply the ratio to update the chroma channels as:

Cb_new = ratio_YCbCr × (Cb − 0.5) + 0.5
Cr_new = ratio_YCbCr × (Cr − 0.5) + 0.5

Multiple-Frame Smoothing (Temporal Smoothing)

[0095] In some embodiments, brightness correction is performed for multiple frames contiguously (e.g., video). [0096] Because the face and background brightness corrections are mainly controlled by the brightness tuning parameter β and the over-exposed probability p_oe, respectively, if temporal smoothing is performed on these two parameters, the correction operations on contiguous frames will be similar. One efficient way to perform temporal smoothing is to use the exponential moving average (EMA), which is suitable for real-time applications such as video conferencing. For frame t, the EMA of β and p_oe can be calculated as:

β_EMA,(t) = α_EMA × β_(t) + (1 − α_EMA) × β_EMA,(t−1)
p_oe,EMA,(t) = α_EMA × p_oe,(t) + (1 − α_EMA) × p_oe,EMA,(t−1)

where 0 < α_EMA ≤ 1 is the smoothing factor, and β_(t) and p_oe,(t) are the brightness tuning parameter and over-exposed probability of frame t. For the first frame, one can simply initialize them as β_EMA,(1) = β_(1) and p_oe,EMA,(1) = p_oe,(1). Then β_EMA,(t) and p_oe,EMA,(t) can be used for the correction operations. In some embodiments, α_EMA = 0.01 for 30 fps videos.
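As a minimal sketch, the EMA update may be written as a single Python helper applied per frame to both parameters:

    def ema_update(current, ema_prev, alpha_ema=0.01):
        """One exponential-moving-average step."""
        return alpha_ema * current + (1.0 - alpha_ema) * ema_prev

    # frame 1: beta_ema, p_oe_ema = beta_1, p_oe_1
    # frame t: beta_ema = ema_update(beta_t, beta_ema)
    #          p_oe_ema = ema_update(p_oe_t, p_oe_ema)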
HSSL

[0097] A construction of a Hybrid Shifted Sigmoid with Linear LUT (HSSL) is explained below. For each color channel ch (such as Y, R, G, and B), build K (e.g., K = 4096) different LUTs, where each LUT is built by merging a shifted sigmoid function and a linear function. For each LUT, there are the following parameters:

c^(ch,k): center of the sigmoid curve
g_L^(ch,k): distance from the leftmost point of the sigmoid curve to the center of the sigmoid curve
g_R^(ch,k): distance from the rightmost point of the sigmoid curve to the center of the sigmoid curve
m_L^(ch,k): slope used in the left sigmoid curve
m_R^(ch,k): slope used in the right sigmoid curve
[0098] The sigmoid function used here is expressed as a function of the slope m:

sig(x, m) = 1 / (1 + e^(−m(x − 0.5)))

With different values of m, the maximum and minimum values of the function will be different. The maximum and minimum values can be used to normalize the function output to [0, 1]:

sig_L,max^(ch,k) = max{ sig(x, m_L^(ch,k)), ∀x ∈ [0, 1] }
sig_L,min^(ch,k) = min{ sig(x, m_L^(ch,k)), ∀x ∈ [0, 1] }
sig_R,max^(ch,k) = max{ sig(x, m_R^(ch,k)), ∀x ∈ [0, 1] }
sig_R,min^(ch,k) = min{ sig(x, m_R^(ch,k)), ∀x ∈ [0, 1] }

The normalized shifted sigmoid segments are:

s̃ig_L^(ch,k)(x) = ( sig(x, m_L^(ch,k)) − sig_L,min^(ch,k) ) / ( sig_L,max^(ch,k) − sig_L,min^(ch,k) ) × 2g_L^(ch,k) − g_L^(ch,k)
s̃ig_R^(ch,k)(x) = ( sig(x, m_R^(ch,k)) − sig_R,min^(ch,k) ) / ( sig_R,max^(ch,k) − sig_R,min^(ch,k) ) × 2g_R^(ch,k) − g_R^(ch,k)

and the candidate sigmoid segments, expressed in the image coordinate x, are:

LUT_L^(ch,k)(x) = s̃ig_L^(ch,k)( (x − c^(ch,k)) / (2g_L^(ch,k)) + 0.5 ) + c^(ch,k)
LUT_R^(ch,k)(x) = s̃ig_R^(ch,k)( (x − c^(ch,k)) / (2g_R^(ch,k)) + 0.5 ) + c^(ch,k)

[0099] For the k-th reshaping function, LUT^(ch,k)(x), set the function to the default 1:1 mapping (i.e., LUT^(ch,k)(x) = x). The following two regions will be replaced by the shifted scaled sigmoid function on the left and right side of the center c^(ch,k) = k/K. [00100] 1) For the region Ω_L^(ch,k) = { x ∈ [max(0, c^(ch,k) − g_L^(ch,k)), c^(ch,k)) }, replace LUT^(ch,k)(x) by LUT_L^(ch,k)(x). However, this segment might be shorter than g_L^(ch,k), so normalize the value again to avoid getting a negative value:

LUT_L,max^(ch,k) = max{ LUT_L^(ch,k)(x), ∀x ∈ Ω_L^(ch,k) }
LUT_L,min^(ch,k) = min{ LUT_L^(ch,k)(x), ∀x ∈ Ω_L^(ch,k) }

In some cases LUT_L,min^(ch,k) < 0. The renormalized left segment is:

L̂UT_L^(ch,k)(x) = ( LUT_L^(ch,k)(x) − LUT_L,min^(ch,k) ) / ( LUT_L,max^(ch,k) − LUT_L,min^(ch,k) ) × LUT_L,max^(ch,k)

The final left segment is:

LUT^(ch,k)(x) = L̂UT_L^(ch,k)(x) if LUT_L,min^(ch,k) < 0; LUT_L^(ch,k)(x) otherwise

The right side can be done similarly. [00101] 2) For the region Ω_R^(ch,k) = { x ∈ [c^(ch,k), min(1, c^(ch,k) + g_R^(ch,k))] }, replace LUT^(ch,k)(x) by LUT_R^(ch,k)(x). However, this segment might be shorter than g_R^(ch,k), so normalize the value again to avoid getting a value greater than 1:

LUT_R,max^(ch,k) = max{ LUT_R^(ch,k)(x), ∀x ∈ Ω_R^(ch,k) }
LUT_R,min^(ch,k) = min{ LUT_R^(ch,k)(x), ∀x ∈ Ω_R^(ch,k) }

In some cases LUT_R,max^(ch,k) > 1. The renormalized right segment is:

L̂UT_R^(ch,k)(x) = ( LUT_R^(ch,k)(x) − LUT_R,min^(ch,k) ) / ( LUT_R,max^(ch,k) − LUT_R,min^(ch,k) ) × (1 − LUT_R,min^(ch,k)) + LUT_R,min^(ch,k)

The final right segment is:

LUT^(ch,k)(x) = L̂UT_R^(ch,k)(x) if LUT_R,max^(ch,k) > 1; LUT_R^(ch,k)(x) otherwise
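For illustration only, the construction of one HSSL reshaping function may be sketched as follows; the sigmoid centering at x = 0.5 and the local-coordinate mapping follow the reconstruction above and are assumptions, not the definitive implementation:

    import numpy as np

    def sig(x, m):
        """Shifted sigmoid with slope m (assumed centered at x = 0.5)."""
        return 1.0 / (1.0 + np.exp(-m * (x - 0.5)))

    def hssl_curve(k, K=4096, g_l=0.3, g_r=0.3, m_l=5.0, m_r=5.0, n=1024):
        """Identity mapping with normalized sigmoid segments spliced in on
        either side of the center c = k / K, renormalized where a segment
        is truncated by the [0, 1] range."""
        x = np.arange(n) / (n - 1)
        lut = x.copy()                                     # default 1:1 mapping
        c = k / K
        grid = np.linspace(0.0, 1.0, 256)                  # for max/min normalization
        for side, g, m in (("L", g_l, m_l), ("R", g_r, m_r)):
            if g <= 0:
                continue
            s_all = sig(grid, m)
            s_min, s_max = s_all.min(), s_all.max()
            if s_max - s_min < 1e-9:                       # zero slope: stay linear
                continue
            if side == "L":
                mask = (x >= max(0.0, c - g)) & (x < c)
            else:
                mask = (x >= c) & (x <= min(1.0, c + g))
            u = (x[mask] - c) / (2.0 * g) + 0.5            # local coordinate
            seg = (sig(u, m) - s_min) / (s_max - s_min) * 2.0 * g - g + c
            if side == "L" and seg.size > 1 and seg.min() < 0:
                seg = (seg - seg.min()) / (seg.max() - seg.min()) * seg.max()
            if side == "R" and seg.size > 1 and seg.max() > 1:
                seg = (seg - seg.min()) / (seg.max() - seg.min()) * (1 - seg.min()) + seg.min()
            lut[mask] = seg
        return lut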
[00102] FIG. 11 shows local reshaping functions for k = 512, 2048, and 3584 with the settings g_L^(ch,k) = 0.3 and g_R^(ch,k) = 0.3, varying m_L^(ch,k) = {0, 5, 10} and m_R^(ch,k) = {0, 5, 10} (K = 4096). The larger the slopes m_L^(ch,k) and m_R^(ch,k) are, the stronger the contrast that can be provided. [00103] FIG. 12 shows local reshaping functions for k = 512, 2048, and 3584 with the settings m_L^(ch,k) = 5 and m_R^(ch,k) = 5, varying g_L^(ch,k) = {0, 0.2, 0.4} and g_R^(ch,k) = {0, 0.2, 0.4} (K = 4096). The larger g_L^(ch,k) and g_R^(ch,k) are, the greater the modification region that can be provided.

Physical Embodiments

[00104] In an embodiment, a computing device such as a display device, a mobile device, a set-top box, a multimedia device, etc., is configured to perform any of the foregoing methods. In an embodiment, an apparatus comprises a processor and is configured to perform any of the foregoing methods. In an embodiment, a non-transitory computer readable storage medium stores software instructions which, when executed by one or more processors, cause performance of any of the foregoing methods. [00105] In an embodiment, a computing device comprises one or more processors and one or more storage media storing a set of instructions which, when executed by the one or more processors, cause performance of any of the foregoing methods. [00106] Note that, although separate embodiments are discussed herein, any combination of embodiments and/or partial embodiments discussed herein may be combined to form further embodiments.

Example Computer System Implementation

[00107] Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components. The computer and/or IC may perform, control, or execute instructions relating to the adaptive perceptual quantization of images with enhanced dynamic range, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to the adaptive perceptual quantization processes described herein. The image and video embodiments may be implemented in hardware, software, firmware and various combinations thereof. [00108] Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the disclosure. For example, one or more processors in a display, an encoder, a set top box, a transcoder or the like may implement methods related to adaptive perceptual quantization of HDR images as described above by executing software instructions in a program memory accessible to the processors. Embodiments of the invention may also be provided in the form of a program product. The program product may comprise any non-transitory medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of an embodiment of the invention. Program products according to embodiments of the invention may be in any of a wide variety of forms.
The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted. [00109] Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a "means") should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated example embodiments of the invention. [00110] According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques. [00111] For example, FIG.9 is a block diagram that illustrates a computer system (900) upon which an embodiment of the invention may be implemented. Computer system (900) includes a bus (902) or other communication mechanism for communicating information, and a hardware processor (904) coupled with bus (902) for processing information. A hardware processor (904) may be, for example, a general-purpose microprocessor. [00112] The computer system (900) also includes a main memory (906), such as a random-access memory (RAM) or other dynamic storage device, coupled to bus (902) for storing information and instructions to be executed by processor (904). Main memory (906) also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor (904). Such instructions, when stored in non-transitory storage media accessible to processor (904), render computer system (900) into a special-purpose machine that is customized to perform the operations specified in the instructions. [00113] Computer system (900) further includes a read only memory (ROM) (908) or other static storage device coupled to bus (902) for storing static information and instructions for processor (904). A storage device (910), such as a magnetic disk or optical disk, is provided and coupled to bus (902) for storing information and instructions. [00114] Computer system (900) may be coupled via bus (902) to a display (912), such as a liquid crystal display (LCD) or light emitting diode display (LED), for displaying information to a computer user. 
An input device (914), including alphanumeric and other keys, is coupled to bus (902) for communicating information and command selections to processor (904). Another type of user input device is cursor control (916), such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor (904) and for controlling cursor movement on display (912). This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. [00115] Computer system (900) may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system (900) to be a special-purpose machine. According to one embodiment, the techniques as described herein are performed by computer system (900) in response to processor (904) executing one or more sequences of one or more instructions contained in main memory (906). Such instructions may be read into main memory (906) from another storage medium, such as storage device (910). Execution of the sequences of instructions contained in main memory (906) causes processor (904) to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions. [00116] The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device (910). Volatile media includes dynamic memory, such as main memory (906). Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge. [00117] Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus (902). Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications. [00118] Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor (904) for execution. For example, the instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system (900) can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus (902). Bus (902) carries the data to main memory (906), from which processor (904) retrieves and executes the instructions. 
The instructions received by main memory (906) may optionally be stored on storage device (910) either before or after execution by processor (904). [00119] Computer system (900) also includes a communication interface (918) coupled to bus (902). Communication interface (918) provides a two-way data communication coupling to a network link (920) that is connected to a local network (922). For example, communication interface (918) may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface (918) may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface (918) sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information. [00120] Network link (920) typically provides data communication through one or more networks to other data devices. For example, network link (920) may provide a connection through local network (922) to a host computer (924) or to data equipment operated by an Internet Service Provider (ISP). ISP in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” (928). Local network (922) and Internet (928) both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link (920) and through communication interface (918), which carry the digital data to and from computer system (900), are example forms of transmission media. [00121] Computer system (900) can send messages and receive data, including program code, through the network(s), network link (920) and communication interface (918). In the Internet example, a server (930) might transmit a requested code for an application program through Internet (928), ISP on the Internet, local network (922) and communication interface (918). [00122] The received code may be executed by processor (904) as it is received, and/or stored in storage device (910), or other non-volatile storage for later execution. Equivalents, Extensions, Alternatives and Miscellaneous [00123] In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what are claimed embodiments of the invention, and what is intended by the applicants to be claimed embodiments of the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. 
Enumerated Exemplary Embodiments [00124] The invention may be embodied in any of the forms described herein, including, but not limited to the following Enumerated Example Embodiments (EEEs) which describe structure, features, and functionality of some portions of embodiments of the present invention. [00125] EEE1: A method for adaptively applying brightness correction to an image, the method comprising: using a regression model to determine if image correction is needed on the image; determining likely face regions of the image; if the regression model determines that image correction is needed, applying brightness correction to the likely face regions of the image based on a tuning parameter from the regression model. [00126] EEE2. The method of EEE1, further comprising applying over-exposure correction to background regions of the image based on an over-exposed probability of the image, wherein background regions are all regions that are not the likely face regions. [00127] EEE3. The method of EEEs 1 or 2, wherein the determining likely face regions comprises using a face detection module and forming a face probability map of the image. [00128] EEE4. The method of any of EEEs 1-3, further comprising applying contrast correction to the image after the applying brightness correction. [00129] EEE5. The method of EEE4, wherein the applying contrast correction comprises a hybrid shifted sigmoid function with linear mapping. [00130] EEE6. The method of EEE5, wherein the applying contrast correction is based on the tuning parameter and a user-defined parameter. [00131] EEE7. The method of EEE4, wherein the applying contrast correction comprises applying a soft threshold to a dark channel of the image. [00132] EEE8. The method of EEE7, wherein the applying a soft threshold comprises comparing a threshold map to a count threshold value. [00133] EEE9. The method of EEE8, wherein the count threshold value is 5% of the total number of pixels of the image. [00134] EEE10. The method of any of EEEs 7-9, wherein a guided image is used to smooth the dark channel. [00135] EEE11. The method of any of EEEs 7-10, wherein the dark channel is determined by a generalized mean of channels of the image. [00136] EEE12. The method of any of EEEs 1-11, wherein the regression model uses a saved ground truth value for the tuning parameter determined by a brute force method. [00137] EEE13. The method of any of EEEs 1-12, wherein the brightness correction uses a look-up table of correction curves. [00138] EEE14. The method of any of EEEs 1-13, wherein the applying brightness correction is performed on multiple contiguous images. [00139] EEE15. An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 1-14. [00140] EEE16. The apparatus of EEE15, wherein the apparatus is the same device as a device displaying the image. [00141] EEE17. The apparatus of EEE16, wherein software for processing the image is a stand-alone software package separate from, but interacting with, software used to display the image. [00142] EEE18. The apparatus of EEE15, wherein the apparatus comprises a video decoder and the method is performed in the video decoder. [00143] EEE19. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing a method with one or more processors in accordance with any of the methods recited in EEEs 1-14.


CLAIMS What is claimed is: 1. A method for adaptively applying brightness correction to an image, the method comprising: using a regression model to determine if image correction is needed on the image; determining likely face regions of the image; if the regression model determines that image correction is needed, applying brightness correction to the likely face regions of the image based on a tuning parameter from the regression model.
2. The method of Claim 1, further comprising applying over-exposure correction to background regions of the image based on an over-exposed probability of the image, wherein background regions are all regions that are not the likely face regions.
3. The method of Claim 1 or 2, wherein the determining likely face regions comprises using a face detection module and forming a face probability map of the image.
4. The method of any of Claims 1-3, further comprising applying contrast correction to the image after the applying brightness correction.
5. The method of Claim 4, wherein the applying contrast correction comprises a hybrid shifted sigmoid function with linear mapping.
6. The method of Claim 5, wherein the applying contrast correction is based on the tuning parameter and a user-defined parameter.
7. The method of Claim 4, wherein the applying contrast correction comprises applying a soft threshold to a dark channel of the image.
8. The method of Claim 7, wherein the applying a soft threshold comprises comparing a threshold map to a count threshold value.
9. The method of Claim 8, wherein the count threshold value is 5% of the total number of pixels of the image.
10. The method of any of Claims 7-9, wherein a guided image is used to smooth the dark channel.
11. The method of any of Claims 7-10, wherein the dark channel is determined by a generalized mean of channels of the image.
12. The method of any of Claims 1-11, wherein the regression model uses a saved ground truth value for the tuning parameter determined by a brute force method.
13. The method of any of Claims 1-12, wherein the brightness correction uses a look-up table of correction curves.
14. The method of any of Claims 1-13, wherein the applying brightness correction is performed on multiple contiguous images.
15. An apparatus comprising a processor and configured to perform any one of the methods recited in Claims 1-14.
16. The apparatus of Claim 15, wherein the apparatus is the same device as a device displaying the image.
17. The apparatus of Claim 16, wherein software for processing the image is a stand-alone software package separate from, but interacting with, software used to display the image.
18. The apparatus of Claim 15, wherein the apparatus comprises a video decoder and the method is performed in the video decoder.
19. A non-transitory computer-readable storage medium having stored thereon computer- executable instructions for executing a method with one or more processors in accordance with any of the methods recited in Claims 1-14.
PCT/US2023/032428 2022-11-22 2023-09-11 Adaptive face brightness adjustment for images and video WO2024112375A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP23786364.2A EP4623404A1 (en) 2022-11-22 2023-09-11 Adaptive face brightness adjustment for images and video

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263384739P 2022-11-22 2022-11-22
US63/384,739 2022-11-22

Publications (1)

Publication Number Publication Date
WO2024112375A1 true WO2024112375A1 (en) 2024-05-30

Family

ID=88297053

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/032428 WO2024112375A1 (en) 2022-11-22 2023-09-11 Adaptive face brightness adjustment for images and video

Country Status (2)

Country Link
EP (1) EP4623404A1 (en)
WO (1) WO2024112375A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030012414A1 (en) * 2001-06-29 2003-01-16 Huitao Luo Automatic digital image enhancement
WO2023009469A1 (en) 2021-07-29 2023-02-02 Dolby Laboratories Licensing Corporation Face region detection and local reshaping enhancement

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAIMING HE ET AL.: "Guided Image Filtering", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 35, no. 6, June 2013 (2013-06-01)
SHIFENG ZHANG ET AL.: "Faceboxes: A CPU real-time and accurate unconstrained face detector", NEUROCOMPUTING, vol. 364, 2019, pages 297 - 309, XP085804663, DOI: 10.1016/j.neucom.2019.07.064
ZOHRA FATEMA TUZ: "Quality-Based Face Recognition System", MASTERS THESIS, 20 December 2017 (2017-12-20), pages 1 - 116, XP093104931, Retrieved from the Internet <URL:https://prism.ucalgary.ca/server/api/core/bitstreams/4a8e1059-1732-413c-b44b-d1e5f7eebe24/content> [retrieved on 20231123] *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119887595A (en) * 2025-03-26 2025-04-25 中科亿海微电子科技(苏州)有限公司 Acceleration device and method for calculating Gamma correction algorithm

Also Published As

Publication number Publication date
EP4623404A1 (en) 2025-10-01

Similar Documents

Publication Publication Date Title
US9230304B2 (en) Apparatus and method for enhancing image using color channel
US7885462B2 (en) Image processing method and system using gain controllable clipped histogram equalization
US7646931B2 (en) Automatic analysis and adjustment of digital images with exposure problems
US8150202B2 (en) Gaussian mixture model based illumination normalization for global enhancement
US9396526B2 (en) Method for improving image quality
US8526736B2 (en) Image processing apparatus for correcting luminance and method thereof
US20090317017A1 (en) Image characteristic oriented tone mapping for high dynamic range images
US20070036429A1 (en) Method, apparatus, and program for object detection in digital image
US20100329559A1 (en) Image processing apparatus and control method thereof
CN113643651B (en) Image enhancement method and device, computer equipment and storage medium
KR20110113390A (en) Apparatus and method for improving the visibility of color images
US8818086B2 (en) Method for improving the visual perception of a digital image
WO2024112375A1 (en) Adaptive face brightness adjustment for images and video
CN106534708A (en) Wide dynamic range image method
US20130287299A1 (en) Image processing apparatus
US8295596B1 (en) Adaptive histogram-based video contrast enhancement
US20060153446A1 (en) Black/white stretching system using R G B information in an image and method thereof
Jiang et al. Multiple templates auto exposure control based on luminance histogram for onboard camera
KR102207441B1 (en) The apparatus and method of HDR imaging generation
US20120170845A1 (en) Apparatus and method for improving image quality based on definition and chroma
JP2000102033A (en) Automatic gradation correction method
US7734060B2 (en) Method and apparatus for estimating noise determination criteria in an image sensor
CN117409263A (en) Unmanned aerial vehicle automatic image correction guiding landing method and computer storage medium
US8300970B2 (en) Method for video enhancement and computer device using the method
US20090060327A1 (en) Image and Video Enhancement Algorithms

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23786364

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2023786364

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2023786364

Country of ref document: EP

Effective date: 20250623