
HK1145111B - Image selection device and method for selecting image - Google Patents


Info

Publication number
HK1145111B
HK1145111B (application HK10111628.5A)
Authority
HK
Hong Kong
Application number
HK10111628.5A
Other languages
Chinese (zh)
Other versions
HK1145111A1 (en)
Inventor
松永和久
Original Assignee
卡西欧计算机株式会社 (Casio Computer Co., Ltd.)
Priority claimed from JP2009086197A (external priority; patent JP4748244B2)
Application filed by 卡西欧计算机株式会社 (Casio Computer Co., Ltd.)
Publication of HK1145111A1
Publication of HK1145111B

Description

Image selection device, image selection method, and computer readable medium
Technical Field
The present invention relates to an image selection device and an image selection method for selecting an arbitrary image from a plurality of images.
Background
Conventionally, digital cameras having a function of continuously shooting to generate a plurality of image data are known. In recent years, as the number of images that can be captured in a single continuous-shooting burst has increased, manually selecting among them has become very troublesome for the user. As a method for eliminating this trouble, an image processing system is known which selects a group photograph in which all the people have their eyes open (for example, see Patent Document 1).
[Patent Document 1] Japanese Patent Application Laid-Open No. 2007-88594
However, when the selection is made only by determining whether or not everyone's eyes are open in the group photograph, there is a possibility that no photograph satisfies the condition and zero photographs are selected.
Disclosure of Invention
Accordingly, an object of the present invention is to provide an image selection apparatus and an image selection method capable of appropriately and easily selecting an image.
According to one aspect of the present invention, there is provided an image selection apparatus comprising: an acquisition unit that acquires a plurality of captured images generated by continuously capturing images of at least one person as a subject; a face detection unit that detects human faces included in the plurality of captured images acquired by the acquisition unit; an eye detection unit that detects eyes from the faces detected by the face detection unit; a blink detection unit that detects a blink degree of each of the eyes detected by the eye detection unit; an evaluation unit that evaluates a state of the face based on the blink degree detected by the blink detection unit; and a specifying unit that specifies (selects), based on the evaluation made by the evaluation unit, at least one captured image to be recorded in a recording medium from among the plurality of captured images acquired by the acquisition unit.
According to another aspect of the present invention, there is provided an image selection method, comprising: a step of acquiring a plurality of captured images generated by continuously capturing images of at least one person as a subject; a step of performing face detection processing for detecting faces included in the plurality of captured images; a step of performing eye detection processing of detecting eyes from the face detected by the face detection processing; a step of performing blink detection processing for detecting the respective blink degrees of the eyes detected by the eye detection processing; a step of evaluating a state of the human face based on the blink degree detected by the blink detection processing; and a step of specifying at least one captured image recorded in a recording medium from among the plurality of captured images based on the evaluation.
Drawings
Fig. 1 is a block diagram showing a schematic configuration of an imaging apparatus to which an embodiment of the present invention is applied.
Fig. 2 is a flowchart showing an example of an operation in the image selection process performed by the imaging apparatus of fig. 1.
Fig. 3 is a flowchart showing the subsequent image selection process of fig. 2.
Fig. 4 is a flowchart showing an example of an operation related to the blink detection process in the image selection process of fig. 2.
Fig. 5 is a diagram schematically showing an image portion of the eyes of the subject in the blink detection processing of fig. 4.
Fig. 6 is a block diagram showing a schematic configuration of an imaging apparatus according to a modification.
Fig. 7 is a flowchart showing an example of an operation in the image selection process performed by the imaging apparatus of fig. 6.
Detailed Description
Fig. 1 is a block diagram showing a schematic configuration of an imaging apparatus 100 to which an embodiment of the present invention is applied.
The imaging apparatus 100 of the present embodiment detects a human face from a plurality of image frames generated by continuous shooting, calculates a blink evaluation value of eyes of the detected human face, evaluates the state of the human face based on the calculated blink evaluation value, and specifies one captured image stored in the recording medium 13 from the plurality of image frames based on the evaluation of the state of the human face.
Specifically, as shown in fig. 1, the imaging apparatus 100 includes: the image processing apparatus includes a lens unit 1, an electronic imaging unit 2, an imaging control unit 3, an image data generation unit 4, an image memory 5, a face detection unit 6, an eye detection unit 7, a smiling face detection unit 8, a blink detection unit 9, a shake detection unit 10, an image specification unit 11, a development unit 12, a recording medium 13, a display control unit 14, a display unit 15, an operation input unit 16, and a CPU 17.
The imaging control unit 3, the face detection unit 6, the eye detection unit 7, the smile detection unit 8, the blink detection unit 9, the shake detection unit 10, the image specification unit 11, the development unit 12, and the CPU17 are designed as, for example, a custom LSI 1A.
The lens unit 1 is composed of a plurality of lenses, and includes a zoom lens, a focus lens, and the like.
Although not shown in the drawings, the lens unit 1 may include a zoom drive unit that moves the zoom lens in the optical axis direction during image capturing of an object, a focus drive unit that moves the focus lens in the optical axis direction, and the like.
The electronic imaging unit 2 is formed of an image sensor such as a CCD (Charge Coupled Device) or a CMOS (Complementary Metal-Oxide Semiconductor), and converts an optical image transmitted through each lens of the lens unit 1 into a two-dimensional image signal.
The imaging control unit 3 includes a timing generator, a driver, and the like, although not shown. The imaging control unit 3 scans and drives the electronic imaging unit 2 by a timing generator or a driver, converts an optical image into a two-dimensional image signal by the electronic imaging unit 2 every predetermined period, sequentially reads image frames from an imaging area of the electronic imaging unit 2 on a 1-screen basis, and outputs the image frames to the image data generating unit 4.
The imaging control unit 3 performs adjustment control of conditions for imaging an object, such as AF (auto focus processing), AE (auto exposure processing), and AWB (auto white balance processing).
The imaging lens 1, the electronic imaging unit 2, and the imaging control unit 3 configured as described above function as imaging means that sequentially generate and acquire a plurality of (e.g., 20) image frames by continuously imaging the subject at a predetermined frame rate (e.g., 3 fps or 10 fps).
The image data generation unit 4 performs appropriate gain adjustment for each RGB color component on the analog signal of each image frame transferred from the electronic imaging unit 2, sample-holds the signal with a sample-hold circuit (not shown), and converts it into digital data with an A/D converter (not shown) to generate RAW image data. The image data generation unit 4 also reduces the luminance signal of the RAW image data at a predetermined magnification in both the horizontal and vertical directions, thereby generating low-resolution reduced-luminance image data.
The RAW image data and the reduced-luminance image data are DMA-transferred to the image memory 5 used as a buffer memory via a DMA controller not shown.
The image memory 5 is configured by, for example, a DRAM, and temporarily stores data processed by the face detection unit 6, the eye detection unit 7, the smiling face detection unit 8, the blink detection unit 9, the shake detection unit 10, the image specification unit 11, the CPU17, and the like.
The face detection unit 6 detects a human face from each of the reduced-luminance image data of the plurality of image frames by a predetermined face detection method. Specifically, the face detection unit 6 detects a face image region from each image frame based on the reduced-luminance image data temporarily stored in the image memory 5, and generates image information in the detected face image region as face contour (face frame) information. Since the face detection process is a well-known technique, a detailed description thereof will be omitted here.
In addition, for an image frame in which no face is detected in the face detection processing, face contour information is set based on the coordinates of the face image region detected most recently among the preceding and following image frames.
That is, since the shooting interval in continuous shooting is extremely short, when a face is detected in any one image frame, a human face can be assumed to be present in the frames immediately before and after it as well; the coordinates of the face image region in the frame in which the face was detected (for example, the coordinates of the four corners of the rectangular frame) are therefore reused to set face contour information in the frames in which no face was detected.
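The propagation of face-contour information to frames in which detection failed can be sketched as follows (an illustrative Python sketch; the function name, the rectangle representation, and the nearest-frame rule are assumptions, not part of the embodiment):

```python
def propagate_face_frames(face_frames):
    """Fill in missing face rectangles (None) using the nearest
    neighbouring frame in which a face was detected.

    `face_frames` has one entry per continuously shot frame:
    either a rectangle (x1, y1, x2, y2) or None when detection failed.
    """
    filled = list(face_frames)
    detected = [i for i, f in enumerate(filled) if f is not None]
    for i, f in enumerate(filled):
        if f is None and detected:
            # Borrow the rectangle from the closest frame with a detection,
            # relying on the very short interval between continuous shots.
            nearest = min(detected, key=lambda j: abs(j - i))
            filled[i] = filled[nearest]
    return filled
```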
The eye detection unit 7 detects the eyes of the human face based on the face contour information of each image frame generated by the face detection unit 6. Specifically, the eye detection unit 7 detects both the left and right eyes of all the persons in each image frame based on the face contour information of each image frame, and calculates their center coordinates. Since the eye detection process is a known technique, a detailed description thereof is omitted here.
Here, the eye detecting unit 7 constitutes an eye detecting means for detecting eyes from the face of a person detected by the face detecting unit 6.
The eye detection unit 7 includes a reliability calculation unit 7a and a validity determination unit 7b.
The reliability calculation unit 7a, as reliability calculation means, calculates the detection reliability of the eye detection performed by the eye detection unit 7. Specifically, the reliability calculation unit 7a lowers the detection reliability when, for example, the detected face contour is inappropriate or the face is turned to the side.
The validity determination unit 7b is validity determination means for determining the validity of a face whose eyes have been detected, based on whether or not the detection reliability calculated by the reliability calculation unit 7a is higher than a predetermined threshold. When the detection reliability is equal to or lower than the predetermined threshold, the validity determination unit 7b makes an NG determination and treats the face as invalid for eye detection, so that the face is not used in the blink detection processing or the smiling face detection processing described later. On the other hand, when the detection reliability is greater than the predetermined threshold, the validity determination unit 7b makes an OK determination and treats the face as valid for eye detection, so that the face is used in the subsequent blink detection processing and smiling face detection processing.
The smiling face detection unit 8 detects the degree of smiling of a face whose eyes have been detected, based on the eye position information detected by the eye detection unit 7. Specifically, for the reduced-luminance image data of all the image frames generated by continuous shooting, the smiling face detection unit 8 searches for the position of the mouth in the reduced-luminance image based on the coordinate information of the left and right eyes of each face determined to be valid by the validity determination unit 7b, and calculates a smile value from the degree of lift at the corners of the mouth.
Here, the smiling face detecting section 8 constitutes a smiling face detecting unit that detects a smile value (smile degree) of the face whose eyes are detected based on the position information of the eyes detected by the eye detecting section 7.
The blink detection unit 9 detects the degree of blinking of the eyes detected by the eye detection unit 7. Specifically, for the reduced-luminance image data of all the image frames generated by continuous shooting, the blink detection unit 9 sets a blink detection window W (see fig. 5(a)) in the reduced-luminance image based on the coordinate information of the left and right eyes of each face determined to be valid by the validity determination unit 7b, calculates an evaluation value for each column C (see fig. 5(b)) in the window, and takes the complement of the minimum column evaluation value as the blink evaluation value. That is, the blink detection unit 9 calculates the degree of openness of the eyes detected by the eye detection unit 7 as the blink evaluation value; a larger blink evaluation value indicates a more widely opened eye.
The specific processing content of the blink detection processing will be described later (see fig. 4).
Here, the blink detection unit 9 constitutes blink detection means for detecting a blink evaluation value (blink degree) of the eyes detected by the eye detection unit 7.
The blink detection unit 9 includes a smoothing unit 9a that smoothes the detected blink evaluation values between adjacent images. Specifically, when the imaging frame rate during continuous shooting is equal to or higher than a predetermined value (for example, 10 fps), the smoothing unit 9a smoothes each image frame's blink evaluation value by averaging it with the blink evaluation values of the immediately preceding and following image frames. That is, when the imaging frame rate (continuous-shooting speed) is equal to or higher than the predetermined value, the correlation between adjacent image frames is high; thus, when a frame with half-open eyes lies between a frame with fully open eyes and a frame with fully closed eyes, an intermediate value can be obtained for the half-open frame even if its raw blink evaluation value varies slightly.
Here, the smoothing unit 9a constitutes smoothing means for smoothing the detected blink evaluation value (blink degree) between adjacent images when the imaging frame rate is equal to or higher than a predetermined value.
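The neighbour-averaging step can be sketched as follows (illustrative Python; the function name and the 10 fps cutoff default are assumptions):

```python
def smooth_blink_values(values, frame_rate, min_rate=10):
    """Smooth per-frame blink evaluation values by averaging each frame
    with its immediate neighbours, but only when the continuous-shooting
    frame rate is high enough for adjacent frames to be well correlated."""
    if frame_rate < min_rate or len(values) < 3:
        return list(values)          # low frame rate: leave values as-is
    smoothed = [values[0]]           # endpoints have only one neighbour
    for i in range(1, len(values) - 1):
        smoothed.append((values[i - 1] + values[i] + values[i + 1]) / 3.0)
    smoothed.append(values[-1])
    return smoothed
```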
The blink detection unit 9 includes a blink correction unit 9b, and the blink correction unit 9b corrects the detected blink evaluation value based on the smile value detected by the smile detection unit 8. Specifically, the blink correction unit 9b determines whether or not the smile value is lower than a predetermined threshold value, and increases the blink evaluation value as follows when it is determined that the smile value is equal to or higher than the predetermined threshold value as a result of the determination.
blink evaluation value += k × (smile value − threshold)
Here, k is a predetermined constant.
That is, since a person's eyes often narrow when the person smiles, the blink evaluation value is corrected to a more appropriate value when the smile value is equal to or greater than the predetermined threshold, so that half-open eyes are evaluated as nearly fully open.
Here, the blink correction unit 9b constitutes blink correction means for correcting the detected blink evaluation value (blink degree) based on the smile value detected by the smile detection unit 8.
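The correction rule above can be sketched as follows (illustrative Python; the default threshold and the constant k are assumed example values, not values from the embodiment):

```python
def correct_blink_for_smile(blink_value, smile_value,
                            smile_threshold=0.5, k=0.4):
    """Raise the blink evaluation value for smiling faces, since eyes
    naturally narrow in a smile: blink += k * (smile - threshold),
    applied only when the smile value reaches the threshold."""
    if smile_value >= smile_threshold:
        blink_value += k * (smile_value - smile_threshold)
    return blink_value
```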
The blink detection unit 9 further includes a threshold value calculation unit 9c, and the threshold value calculation unit 9c calculates a threshold value for determining whether or not the eyes of each person are open, based on the detected blink degrees of the respective persons.
Here, since the blink evaluation value depends strongly on personal characteristics such as the size of the eyes and the thickness of the eyelashes, the threshold value calculation unit 9c sets a blink-determination threshold for each person. Further, since it is difficult to fix an absolute value that correctly discriminates the open-eye state, the threshold value calculation unit 9c sets, for each person, the blink evaluation value at a certain ratio from the top as the temporary threshold Th1, and makes an OK blink determination when the blink evaluation value is equal to or greater than the temporary threshold Th1.
It is preferable to set the blink-determination threshold as high as possible so that a half-open eye state is not determined as blink determination OK. However, if the threshold is made too strict, there may be no image frame in which all the people are judged to have their eyes open when many people appear in the image, as in a group photograph. Therefore, the threshold value calculation unit 9c changes the threshold according to the number of photographed persons. For example, so that a predetermined ratio (say, a ratio N) of the eye-detected image frames remains, the evaluation value at the top N^(1/number of persons) fraction for each person is set as the temporary threshold Th1. Specifically, when, for example, 3 people appear within the frame and images are to be selected by blink detection so that 20% of all the images finally remain, the temporary threshold Th1 for each person is set at the top 0.2^(1/3) ≈ 0.58, that is, roughly the top 60% of that person's blink evaluation values are determined as blink determination OK.
Note that when the temporary threshold Th1 is too close to or too far from the maximum blink evaluation value of the person in question, it is inappropriate as a threshold; the threshold value calculation unit 9c therefore calculates the true determination threshold Th2 by clipping the temporary threshold Th1 to upper and lower limits referenced to the maximum blink evaluation value, according to the following formula.
if(Th1>Bmax-Ofst1) Th2=Bmax-Ofst1;
else if(Th1<Bmax-Ofst2) Th2=Bmax-Ofst2;
else Th2=Th1;
Here, Bmax is the maximum value of the blink evaluation value for each person, Ofst1 is an upper limit clip offset amount, Ofst2 is a lower limit clip offset amount, Th1 is a temporary threshold value for blink determination, and Th2 is a true threshold value for blink determination.
In this way, the threshold value calculating unit 9c constitutes a threshold value calculating means that calculates a true threshold value Th2 for determining whether or not the eyes of each person are open, based on the detected blink evaluation value (blink degree) of each person.
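The per-person threshold calculation, including the clipping of Th1 into Th2, might look like the following sketch (illustrative Python; the offset values and the way the top-fraction value is indexed are assumptions):

```python
def blink_threshold(values, keep_ratio, num_people, ofst1=0.05, ofst2=0.25):
    """Compute a per-person blink-determination threshold Th2.

    Th1 is the blink value at the top keep_ratio ** (1 / num_people)
    fraction of this person's values, so that roughly `keep_ratio` of the
    frames survive when every person must pass. Th2 then clips Th1 into
    [Bmax - ofst2, Bmax - ofst1] relative to the person's maximum value.
    """
    ratio = keep_ratio ** (1.0 / num_people)      # per-person pass ratio
    ranked = sorted(values, reverse=True)
    # Index of the last value still inside the top `ratio` fraction.
    idx = max(0, min(len(ranked) - 1, int(ratio * len(ranked)) - 1))
    th1 = ranked[idx]
    bmax = max(values)
    th2 = min(max(th1, bmax - ofst2), bmax - ofst1)  # clip toward Bmax
    return th2
```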
The blink detection unit 9 includes a blink determination unit 9d that determines whether or not the eyes of each person are open, based on the reduced-luminance image data of the plurality of image frames generated by continuous shooting. Specifically, the blink determination unit 9d compares the blink evaluation value of each person in each image frame with the true blink-determination threshold Th2 calculated by the threshold value calculation unit 9c: when the blink evaluation value is equal to or greater than the true threshold Th2, it makes an OK determination (eyes open); when the blink evaluation value is smaller than the true threshold Th2, it makes an NG determination (eyes closed).
Here, the blink determination unit 9d constitutes blink determination means for determining whether or not the eyes of each person are open for a plurality of image frames generated by consecutive photographing based on the true threshold Th2 for blink determination calculated by the threshold calculation unit 9 c.
The shake detection unit 10 detects a shake evaluation value (shake amount) relative to the adjacent image frame for each of the plurality of image frames generated by continuous shooting. Specifically, when no human face is detected by the face detection processing in the reduced-luminance image data of the plurality of image frames generated by continuous shooting, or when a face is detected in a certain image frame but the number of faces valid for eye detection is 0, the shake detection unit 10 divides each image frame into blocks of a predetermined size, calculates for each block the difference from the block at the same position in the adjacent image frame, and sets the largest difference among all the blocks as the shake evaluation value of that image frame.
Here, the shake detection unit 10 constitutes shake detection means that, when the face detection unit 6 detects no human face, detects a shake evaluation value (shake amount) relative to the adjacent image frame for each of the plurality of image frames generated by continuous shooting.
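A block-difference shake evaluation of this kind can be sketched as follows (illustrative Python; the block size and the sum-of-absolute-differences metric are assumptions):

```python
def shake_evaluation(frame_a, frame_b, block=2):
    """Shake evaluation value between two adjacent frames: split each
    frame into block x block regions, take the sum of absolute pixel
    differences per region, and return the largest region difference.
    Frames are 2-D lists of luminance values of equal size."""
    h, w = len(frame_a), len(frame_a[0])
    worst = 0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            diff = sum(abs(frame_a[y][x] - frame_b[y][x])
                       for y in range(by, min(by + block, h))
                       for x in range(bx, min(bx + block, w)))
            worst = max(worst, diff)
    return worst
```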
The image specification unit 11 evaluates the state of the human face based on the blink evaluation value calculated by the blink detection unit 9, and specifies one captured image recorded in the recording medium 13 from among a plurality of RAW image data generated by continuous shooting based on the evaluation value. Specifically, the image specification unit 11 includes an image determination unit 11a, and the image determination unit 11a determines the number of closed-eye faces in the reduced-luminance image data of a plurality of image frames captured continuously.
The image determination unit 11a, as the 1 st image determination unit, determines whether or not there are a plurality of image frames with the smallest number of closed-eye faces among the reduced-luminance image data of the plurality of image frames generated by the continuous shooting, based on the determination result in the blink determination unit 9d (the 1 st determination process). In addition, the image determination unit 11a, as the 2 nd image determination means, determines whether or not the number of closed-eye faces is 0 when it is determined in the 1 st determination process that there are a plurality of image frames in which the number of closed-eye faces is the smallest (the 2 nd determination process).
The image specification unit 11 specifies one captured image to be recorded on the recording medium 13 based on the result of the determination by the image determination unit 11a. That is, when the image specification unit 11 determines in the 1st determination process that there is only one image frame with the smallest number of closed-eye faces, it specifies the RAW image data of that image frame as the one captured image recorded in the recording medium 13.
In addition, when the image specification unit 11 determines in the 2 nd determination process that the number of closed-eye faces is 0, among the plurality of image frames with the smallest number of closed-eye faces, one image frame with the highest blink evaluation value is specified based on the blink evaluation value of each person detected by the blink detection unit 9, and RAW image data in the image frame is specified as one captured image recorded in the recording medium 13. On the other hand, when it is determined in the 2 nd determination process that the number of closed-eye faces is not 0, the image specification unit 11 specifies, from among the plurality of image frames with the smallest number of closed-eye faces, one image frame with the highest smile value based on the smile value detected by the smile detection unit 8, and specifies RAW image data in the image frame as one captured image recorded in the recording medium 13.
When the face detection unit 6 does not detect a face from all the image frames or when the number of faces whose eyes are detected from a certain image frame is 0, the image specification unit 11 specifies the RAW image data in the image frame whose shake evaluation value detected by the shake detection unit 10 is the smallest as one captured image recorded in the recording medium 13.
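The decision order described above (fewest closed-eye faces first, then blink value or smile value as tie-breaker) can be sketched as follows (illustrative Python; the per-frame dictionary structure is an assumption):

```python
def select_best_frame(frames):
    """Pick the index of the frame to keep: fewest closed-eye faces
    first; among ties, highest blink value if nobody's eyes are closed,
    otherwise highest smile value. Each entry of `frames` is a dict
    with keys 'closed', 'blink', 'smile'."""
    min_closed = min(f['closed'] for f in frames)
    tied = [i for i, f in enumerate(frames) if f['closed'] == min_closed]
    if len(tied) == 1:
        return tied[0]                 # unique minimum: no tie-break
    if min_closed == 0:
        # Everyone's eyes open in the tied frames: widest-open eyes win.
        return max(tied, key=lambda i: frames[i]['blink'])
    # Some eyes still closed: fall back to the biggest smile.
    return max(tied, key=lambda i: frames[i]['smile'])
```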
In this way, the image specification unit 11 constitutes an evaluation unit that evaluates the state of the human face based on the blink evaluation value (blink degree) calculated by the blink detection unit 9. The image specification unit 11 constitutes a specification unit that specifies at least one captured image recorded on the recording medium 13 from among a plurality of RAW data generated by continuous shooting based on the evaluation of the face state.
The developing unit 12 performs color processing including pixel interpolation processing, gamma correction processing, and the like on the RAW image data specified by the image specifying unit 11 by a color processing circuit (not shown), and then generates a digital luminance signal Y and color difference signals Cb and Cr (YUV data).
The recording medium 13 is configured by, for example, a nonvolatile memory (flash memory) or the like, and stores image data for recording of a captured image encoded by a JPEG compression unit (not shown) of the development unit 12.
The display control unit 14 reads the image data for display temporarily stored in the image memory 5 and displays the image data on the display unit 15.
Specifically, the display control unit 14 includes a VRAM, a VRAM controller, a digital video encoder, and the like. The digital video encoder periodically reads out the luminance signal Y and the color difference signals Cb and Cr, which are read out from the image memory 5 and stored in a VRAM (not shown) under the control of the CPU17, from the VRAM via the VRAM controller, generates a video signal based on these data, and outputs the video signal to the display unit 15.
The display unit 15 is, for example, a liquid crystal display device, and displays an image or the like captured by the electronic imaging unit 2 as a display screen based on a video signal from the display control unit 14. Specifically, the display unit 15 displays a live view image or a recording-view (recording-view) image captured as a real captured image based on a plurality of image frames generated by capturing an image of a subject by the imaging lens 1, the electronic imaging unit 2, and the imaging control unit 3 in the imaging mode.
The operation input unit 16 is used to perform a predetermined operation of the imaging apparatus 100. Specifically, the operation input unit 16 includes a shutter button 16a for instructing to photograph a subject, a mode button 16b for instructing to select a photographing mode, a focus button (not shown) for instructing to adjust a focus amount, and the like, and outputs a predetermined operation signal to the CPU17 in accordance with an operation of these buttons.
The CPU17 controls each unit of the image pickup apparatus 100. Specifically, the CPU17 performs various control operations in accordance with various processing programs (not shown) for the image pickup apparatus 100.
Next, image selection processing in the image selection method performed by the imaging apparatus 100 will be described with reference to fig. 2 to 5.
Fig. 2 and 3 are flowcharts showing an example of an operation in the image selection process. Fig. 4 is a flowchart showing an example of an operation related to the blink detection process in the image selection process. Fig. 5(a) is a diagram schematically showing the blink detection window W and an image portion of the left and right eyes of the subject, and fig. 5(b) is an enlarged schematic diagram of the blink detection window W.
The image selection processing is processing executed when the user selects and instructs the automatic image selection mode from among a plurality of image capturing modes displayed on the menu screen based on a predetermined operation performed on the mode button 16b of the operation input unit 16.
As shown in fig. 2, when a continuous shooting instruction is input based on a predetermined operation performed by the user on the shutter button 16a of the operation input unit 16, the CPU17 first causes the imaging control unit 3 to adjust imaging conditions such as the focus position of the focus lens, the exposure conditions (shutter speed, aperture, magnification, etc.), and the white balance, and causes the electronic imaging unit 2 to continuously capture a predetermined number (for example, 20) of optical images of the subject at a predetermined imaging frame rate (for example, 10 fps) (step S1). Then, the CPU17 causes the image data generation section 4 to generate RAW image data and reduced-luminance image data for each image frame of the subject transferred from the electronic imaging section 2, and temporarily stores these image data in the image memory 5 (step S2).
Then, the CPU17 causes the face detection unit 6 to detect a human face from the reduced-luminance image data of each image frame by a predetermined face detection method, and generates image information in the detected face image region as face contour information (step S3).
Next, the CPU17 determines whether or not the number of faces detected in the face detection processing is 0, that is, whether or not no face was detected in any of the image frames (step S4). Here, if it is determined that the number of detected faces is not 0 (no in step S4), the CPU17 performs a process of assigning a personal ID to each face based on the face contour information detected from each image frame (step S5). A personal ID is assigned as follows: in the face contour information of adjacent image frames, if the distance between the centers of two face contours is within a predetermined ratio (for example, about 50%) of the size (for example, the width or height) of either face contour, the contours are determined to belong to the same person. For an image frame in which no face was detected in the face detection processing, the CPU17 acquires the face image region detected most recently among the preceding and following image frames and sets face contour information based on its coordinates, so that a personal ID is set for all the image frames.
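The same-person test used for personal-ID assignment can be sketched as follows (illustrative Python; using the face width as the size measure is an assumption):

```python
def same_person(face_a, face_b, max_ratio=0.5):
    """Decide whether two face rectangles in adjacent frames belong to
    the same person: their center distance must be within `max_ratio`
    of the face size. Rectangles are (x1, y1, x2, y2)."""
    ax = (face_a[0] + face_a[2]) / 2.0   # center of first face
    ay = (face_a[1] + face_a[3]) / 2.0
    bx = (face_b[0] + face_b[2]) / 2.0   # center of second face
    by = (face_b[1] + face_b[3]) / 2.0
    dist = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    size = face_a[2] - face_a[0]         # face width as size measure
    return dist <= max_ratio * size
```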
Next, the CPU17 causes the eye detecting unit 7 to detect the left and right eyes of the person based on the face contour information of each image frame generated by the face detecting unit 6 to calculate the center coordinates thereof, and causes the reliability calculating unit 7a of the eye detecting unit 7 to calculate the detection reliability in the eye detection (step S6).
Then, the CPU17 causes the validity determination unit 7b of the eye detection unit 7 to perform the following operations: the validity of the face with the eye detected at the reliability is determined based on whether or not the detection reliability calculated by the reliability calculating unit 7a is higher than a predetermined threshold (step S7). Specifically, the validity determination unit 7b sets the face in eye detection to be NG determination when the detection reliability is equal to or lower than a predetermined threshold value, and sets the face to be invalid for eye detection; on the other hand, when the detection reliability is greater than the predetermined threshold, the face in the eye detection is set as an OK determination, and the face in which the eye detection is effective is set.
Next, the CPU17 determines whether or not the number of faces for which the eyes are detected to be valid (the number of valid faces for which the eyes are detected) is 0, based on the determination result of the validity determination unit 7b (step S8).
Here, when it is determined that the number of faces for which eye detection is effective is 0 (yes in step S8), the CPU17 causes the shake detection unit 10 to perform a shake detection process in which each of the plurality of image frames is divided into blocks of predetermined regions, a difference value is calculated for each block between blocks at the same position in the adjacent image frame, and the largest difference value among all the blocks is set as the shake evaluation value of the image frame (step S9). When it is determined in step S4 that the face detection count is 0 (yes in step S4), the CPU17 likewise advances the process to step S9 to cause the shake detection unit 10 to perform the shake detection process.
Thereafter, the CPU17 causes the image specification unit 11 to specify the RAW image data in the image frame in which the shake evaluation value detected by the shake detection unit 10 is the smallest (step S10), and causes the development unit 12 to perform the development processing of the RAW image data specified by the image specification unit 11, encode the image data in the JPEG format, and store the image data in the recording medium 13 (step S11).
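The block-difference evaluation of step S9 can be sketched as follows in pure Python. Frames are assumed to be 2-D lists of luminance values, and the block size is illustrative; the patent only specifies "predetermined regions" and the sum-of-differences/maximum rule.

```python
def shake_evaluation(frame_a, frame_b, block=2):
    """Shake evaluation value between two adjacent frames: split the
    frames into block x block regions, sum the absolute pixel
    differences inside each block, and return the largest sum."""
    h, w = len(frame_a), len(frame_a[0])
    best = 0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            diff = 0
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    diff += abs(frame_a[y][x] - frame_b[y][x])
            best = max(best, diff)  # worst (largest) block wins
    return best
```

Identical frames yield 0; the frame whose worst block differs least from its neighbour would then be selected in step S10.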
On the other hand, if it is determined in step S8 that the number of faces for which eye detection is valid is not 0 (no in step S8), the CPU17 causes the blink detection unit 9 to perform blink detection processing (step S12).
Here, the blink detection process will be described in detail with reference to fig. 4, 5(a), and 5 (b).
As shown in fig. 4, the blink detection unit 9 calculates the average distance De between the left and right eyes of each person based on the coordinate information of the left and right eyes of the faces determined to be valid by the validity determination unit 7b (step S31).
Next, the blink detection unit 9 sets a blink detection window W in the reduced-luminance image data of each of the image frames (see fig. 5 a) (step S32). Here, the size Wlen of the blink detection window W is determined by multiplying the average distance De between both eyes by a coefficient Wratio according to the following equation, so that the window size becomes a predetermined ratio of De.
Wlen=De*Wratio
The center position of each blink detection window W is set to the coordinates of the corresponding eye detected by the eye detection unit 7.
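A minimal sketch of the window computation, assuming an axis-aligned square window returned as (left, top, right, bottom); the default Wratio value is illustrative and not taken from the patent.

```python
def blink_window(eye_cx, eye_cy, avg_eye_distance, wratio=0.4):
    """Square blink-detection window centred on a detected eye.
    Side length follows Wlen = De * Wratio from the text."""
    wlen = avg_eye_distance * wratio
    half = wlen / 2.0
    return (eye_cx - half, eye_cy - half, eye_cx + half, eye_cy + half)
```

Scaling the window by the inter-eye distance makes the crop size independent of how large the face appears in the frame.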
Next, the blink detection unit 9 calculates an evaluation value for each of the columns C (see fig. 5 b) obtained by dividing the set blink detection window W at predetermined intervals in the left-right direction (X-axis direction) (step S33). Specifically, the blink detection unit 9 sorts the pixel values in the vertical direction (Y direction) from low to high luminance for each column C, and calculates the average of the pixel values at a certain ratio from the top of that order as the evaluation value of the column. Thus, even if the dark iris region is reduced by blinking, a white region appears near the center of the iris due to reflected light, or the iris region is blurred by face shake, the evaluation value reflects the portion where the dark iris extends longest in the vertical direction, so the value of the column containing the longest dark-iris run can be obtained.
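The per-column evaluation of step S33 can be sketched as below. The window is assumed to be a 2-D list of luminance values (rows x columns), one pixel per column cell, and the averaging ratio is illustrative; the text only says "a certain ratio from the upper level" of the low-to-high sort.

```python
def column_evaluations(window, top_ratio=0.3):
    """Evaluation value per vertical column of the blink window:
    sort each column's pixels from darkest to brightest and average
    the darkest `top_ratio` fraction, so the value tracks how far
    the dark iris extends vertically (dark iris -> low value)."""
    n_rows = len(window)
    n_cols = len(window[0])
    k = max(1, int(n_rows * top_ratio))
    evals = []
    for c in range(n_cols):
        col = sorted(window[r][c] for r in range(n_rows))
        evals.append(sum(col[:k]) / k)  # average of the darkest k pixels
    return evals
```

The column with the minimum value (the darkest, longest iris run) then feeds steps S34 and S35, where its complement becomes the blink evaluation value.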
Then, the blink detection unit 9 specifies the smallest evaluation value from the evaluation values of all the columns C (step S34), and calculates the complement of the smallest evaluation value as the blink evaluation value so that the evaluation value is high when the eyes are open (step S35).
Next, the blink detection unit 9 determines whether or not the imaging frame rate at the time of continuous shooting is equal to or higher than a predetermined value (for example, 10fps) (step S36). When it is determined that the imaging frame rate is equal to or higher than the predetermined value (yes in step S36), the smoothing unit 9a of the blink detection unit 9 performs smoothing processing in which the blink evaluation value of each image frame is averaged with the blink evaluation values of the adjacent preceding and succeeding image frames (step S37). Thus, when there is an image frame in which the eyes are partially open between an image frame in which the eyes are fully open and an image frame in which the eyes are fully closed, an intermediate value can be obtained even if the blink evaluation value of that image frame deviates slightly.
On the other hand, when the imaging frame rate is smaller than the predetermined value (no in step S36), too much time elapses between adjacent image frames and the correlation between them is low, so the blink evaluation value calculated in step S35 is set as the final evaluation value without performing the smoothing processing.
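The smoothing of step S37 amounts to a three-tap moving average over the per-frame blink values of one person. A minimal sketch (function name and edge handling are assumptions; the text does not say how the first and last frames are treated, so they are left unchanged here):

```python
def smooth_blink_values(values):
    """Average each frame's blink evaluation value with its
    immediate neighbours; applied only when the capture frame
    rate is high enough for adjacent frames to be correlated."""
    if len(values) < 3:
        return list(values)
    out = [values[0]]  # endpoints kept as-is in this sketch
    for i in range(1, len(values) - 1):
        out.append((values[i - 1] + values[i] + values[i + 1]) / 3.0)
    out.append(values[-1])
    return out
```

A half-open-eye frame sandwiched between fully open and fully closed frames is thus pulled toward an intermediate value.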
This ends the blink detection process.
As shown in fig. 3, the CPU17 causes the smile detection unit 8 to find the position of the mouth within the reduced-luminance image based on the coordinate information of the left and right eyes of the human face, and to calculate the smile value from the degree of rise of the corners of the mouth (step S13).
Next, the CPU17 causes the blink correction unit 9b of the blink detection unit 9 to perform the following operations: it is determined whether or not the smile value detected by the smile detection unit 8 is equal to or greater than a predetermined threshold value, and if it is determined that the smile value is equal to or greater than the predetermined threshold value as a result of the determination, the blink evaluation value is increased according to the following formula to correct the blink evaluation value (step S14).
Blink evaluation value += k * (smile value - threshold)
Here, k is a predetermined constant.
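The correction of step S14 in sketch form; the threshold and the constant k are placeholders, since the patent gives neither value. The intent is that eyes narrowed by a smile should not be penalized as a blink.

```python
def correct_blink_value(blink_value, smile_value, threshold=50.0, k=0.2):
    """Raise the blink evaluation value for smiling faces:
    blink += k * (smile - threshold), applied only when the
    smile value reaches the threshold."""
    if smile_value >= threshold:
        blink_value += k * (smile_value - threshold)
    return blink_value
```

Faces below the smile threshold keep their blink value unchanged.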
Next, the CPU17 causes the threshold value calculation unit 9c of the blink detection unit 9 to set, for each person, the blink evaluation value at a certain ratio from the top as the temporary threshold value Th1 (step S15). Specifically, the threshold calculation unit 9c changes the threshold according to the number of photographed persons: so that, for example, a predetermined ratio N of the image frames remains among the image frames in which eyes were detected, the evaluation value at the upper ratio of N^(1/number of persons) for each person is set as the temporary threshold Th1.
Next, the threshold value calculation unit 9c performs clipping processing of the upper limit value and the lower limit value of the temporary threshold value Th1 according to the following formula, using the maximum value of the blink evaluation value as a reference, and thereby calculates a true threshold value Th2 for determination (step S16).
If(Th1>Bmax-Ofst1)Th2=Bmax-Ofst1;
else if(Th1<Bmax-Ofst2)Th2=Bmax-Ofst2;
else Th2=Th1;
Here, Bmax is the maximum value of the blink evaluation value for each person, Ofst1 is an upper limit clip offset amount, Ofst2 is a lower limit clip offset amount, Th1 is a temporary threshold value for blink determination, and Th2 is a true threshold value for blink determination.
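The clipping of step S16 as a sketch. It reads both clip bounds as offsets *below* the per-person maximum Bmax, consistent with Ofst1 being an upper-limit offset and Ofst2 a (larger) lower-limit offset; the concrete offsets in the test are illustrative.

```python
def clip_threshold(th1, bmax, ofst1, ofst2):
    """Clip the temporary threshold Th1 against the person's
    maximum blink value Bmax: never closer than Ofst1 to the
    maximum, and never more than Ofst2 below it."""
    if th1 > bmax - ofst1:
        return bmax - ofst1  # upper clip
    if th1 < bmax - ofst2:
        return bmax - ofst2  # lower clip
    return th1
```

This keeps the open-eye decision boundary in a sensible band relative to each person's own widest-open frame, regardless of outliers in Th1.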
Next, the CPU17 causes the blink determination unit 9d of the blink detection unit 9 to perform the following operations: the true threshold Th2 for blink determination calculated by the threshold calculation unit 9c is compared with the blink evaluation value for each individual image frame, and it is determined whether or not the blink evaluation value is equal to or greater than the true threshold Th2 (step S17). By this determination, when it is determined that the blink evaluation value is equal to or greater than the true threshold Th2, it is determined as blink determination OK (open eye), and when it is determined that the blink evaluation value is smaller than the true threshold Th2, it is determined as blink determination NG (closed eye).
Then, the CPU17 causes the image specification unit 11 to specify an image frame with the smallest number of closed-eye faces among the reduced-luminance image data of the plurality of image frames generated by the continuous shooting (step S18).
Next, the CPU17 causes the image determination section 11a of the image specification section 11 to determine whether or not there are a plurality of image frames with the smallest number of closed-eye faces (step S19). Here, if it is determined that there are not a plurality of image frames with the smallest number of closed-eye faces (no in step S19), the CPU17 causes the image specification section 11 to specify the RAW image data in that image frame as the one captured image recorded in the recording medium 13 (step S20).
Thereafter, the CPU17 proceeds to step S11 (see fig. 2), causes the developing unit 12 to perform a developing process of RAW image data in an image frame having the smallest number of closed-eye faces specified by the image specifying unit 11, encodes the image data in the JPEG format, and stores the encoded image data in the recording medium 13 (step S11).
On the other hand, if it is determined in step S19 that there are a plurality of image frames with the smallest number of closed-eye faces (yes in step S19), the CPU17 causes the image determination unit 11a to determine whether or not the number of closed-eye faces is 0 (step S21).
Here, when it is determined that the number of closed-eye faces is 0 (yes in step S21), the CPU17 causes the image specification unit 11 to perform the following operations: among the plurality of image frames with the smallest number of closed-eye faces, the image frame with the highest blink evaluation value is specified based on the blink evaluation value of each person detected by the blink detection section 9, and the RAW image data in the image frame is specified as the captured image recorded in the recording medium 13 (step S22).
Thereafter, the CPU17 proceeds to step S11 (see fig. 2), causes the developing unit 12 to perform a developing process of the RAW image data having the highest blink evaluation value specified by the image specifying unit 11, encodes the image data in the JPEG format, and stores the encoded image data in the recording medium 13 (step S11).
When it is determined in step S21 that the number of closed-eye faces is not 0 (no in step S21), the CPU17 causes the image specification unit 11 to specify, among a plurality of image frames with the smallest number of closed-eye faces, an image frame with the highest smile value based on the smile value detected by the smile detection unit 8, and to specify RAW image data in the image frame as one captured image recorded on the recording medium 13 (step S23).
Thereafter, the CPU17 shifts the process to step S11 (see fig. 2), and causes the developing unit 12 to perform a developing process of the RAW image data having the highest smile value specified by the image specifying unit 11, encode the image data in the JPEG format, and store the encoded image data in the recording medium 13 (step S11).
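The selection priority of steps S18-S23 can be condensed into one small function. The per-frame dict fields (`closed`, `blink`, `smile`) are an assumed layout; the tie-breaking order is exactly the one in the flowchart: fewest closed-eye faces first, then highest blink value when nobody's eyes are closed, otherwise highest smile value.

```python
def select_frame(frames):
    """Pick the index of the frame to record: fewest closed-eye
    faces; among ties, highest blink value if the minimum closed
    count is 0, otherwise highest smile value."""
    min_closed = min(f["closed"] for f in frames)
    tied = [i for i, f in enumerate(frames) if f["closed"] == min_closed]
    if len(tied) == 1:
        return tied[0]  # unique winner (step S20)
    key = "blink" if min_closed == 0 else "smile"
    return max(tied, key=lambda i: frames[i][key])
```

For example, with two all-eyes-open frames the blink value decides; with ties at one or more closed eyes, the smile value decides.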
As described above, according to the image pickup apparatus 100 of the present embodiment, the face detection unit 6 detects a human face from a plurality of image frames generated by continuous shooting, the eye detection unit 7 detects eyes from the detected human face, the blink detection unit 9 calculates a blink evaluation value of the detected eyes, the image specification unit 11 evaluates the state of the human face based on the blink evaluation value, and specifies one captured image recorded on the recording medium 13 from the plurality of image frames based on the evaluation of the state of the human face.
Specifically, based on the face contour information generated by the face detection unit 6, the eye detection unit 7 detects, for each person, the eyes of faces whose eye-detection reliability is higher than a predetermined threshold value, that is, faces for which eye detection is valid, and the image specification unit 11 determines the state of the face for each person. Furthermore, the blink detection unit 9 detects a blink evaluation value for each person, and the image specification unit 11 determines the state of each person's face based on the blink evaluation value after it is corrected using the smile value of the face detected from the position information of the eyes.
This makes it possible to comprehensively determine the state of the human face such as the blink degree or the smile degree for each person, to appropriately select the image recorded on the recording medium 13, and to easily select one image even if the number of continuously-photographed images increases.
In addition, in the blink detection process, the true threshold Th2 for determining whether or not the eyes of each person are open is calculated based on that person's blink evaluation values, and whether or not each person's eyes are open is determined for the plurality of image frames based on this true threshold Th2. Since the true threshold Th2 for blink determination is set from blink evaluation values that reflect the size, width, and ease of opening of the eyes, which vary from person to person, it is possible to appropriately determine, for the plurality of image frames, whether or not each person's eyes are open.
Furthermore, when it is determined whether or not there are a plurality of image frames having the smallest number of closed-eye faces among the plurality of image frames and it is determined that there are not a plurality of image frames having the smallest number of closed-eye faces, that is, when there is only one image frame, the image specification unit 11 specifies RAW image data in the image frames as one captured image recorded in the recording medium 13, and therefore it is possible to reliably select an image in which all the persons have no closed-eye.
Further, when it is determined that there are a plurality of image frames with the smallest number of closed-eye faces, it is determined whether or not the number of closed-eye faces is 0, and when it is determined that the number of closed-eye faces is 0, the image specification unit 11 specifies one image frame with the highest blink evaluation value based on the blink evaluation value of each person detected by the blink detection unit 9 among the plurality of image frames with the smallest number of closed-eye faces, and specifies RAW image data in the image frame as one captured image recorded in the recording medium 13, so that it is possible to reliably select an image with the highest blink evaluation value. On the other hand, when it is determined that the number of closed-eye faces is not 0, the image specification unit 11 specifies an image frame having the highest smile value based on the smile value detected by the smile detection unit 8 among a plurality of image frames having the smallest number of closed-eye faces, and specifies RAW image data in the image frame as a captured image recorded in the recording medium 13.
When a human face is not detected by the face detection unit 6 from any of the image frames, or when a human face is detected from some image frame but the number of faces for which eye detection is valid is 0, the image specification unit 11 specifies the RAW image data in the image frame having the smallest shake evaluation value detected by the shake detection unit 10 as the one captured image recorded in the recording medium 13; therefore, the image with the smallest amount of shake can be reliably selected.
This makes it possible to appropriately and easily select an image to be recorded on the recording medium 13 by comprehensively determining not only the state of a human face such as a blink degree or a smile degree but also image blur due to subject shake or hand shake.
Next, a modified example of the imaging apparatus 100 will be described.
(modification example)
The imaging apparatus 100 of this modification detects, for each image frame for which eye detection is effective, the amount of shake relative to the adjacent image, and, depending on whether or not there are a plurality of image frames whose shake amount is smaller than a predetermined value, either performs the blink detection process or designates the image frame with the smallest shake amount as the one captured image recorded on the recording medium 13.
Fig. 6 is a block diagram showing a schematic configuration of a modified imaging apparatus 100.
As shown in fig. 6, the shake detection unit 10 of the imaging apparatus 100 according to this modification includes a shake correction unit 10a that corrects the amount of shake detected in each image frame based on the position of the shake with respect to the position of the face.
Specifically, when a human face is detected from some image frame but the number of faces for which eye detection is valid is 0, the shake detection unit 10 calculates, for each image frame, a difference value between blocks at the same position in the adjacent image frame, and sets the largest difference value among all the blocks as the shake evaluation value (shake amount) of the image frame. In this case, the shake correction unit 10a corrects the shake evaluation value as follows: the shake evaluation value is decreased as the position of the block having the largest difference value becomes more distant from the position of the block where the face is located.
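The distance-weighted correction by the shake correction unit 10a could look like the sketch below. The linear falloff and its coefficient are assumptions; the patent only requires that the evaluation value decrease with the distance between the worst block and the face block.

```python
import math

def corrected_shake_value(raw_value, block_pos, face_pos, falloff=0.01):
    """Weight the shake evaluation value down with the distance
    between the block having the largest difference value and the
    block where the face is located (linear falloff, floored at 0)."""
    d = math.hypot(block_pos[0] - face_pos[0], block_pos[1] - face_pos[1])
    return max(0.0, raw_value * (1.0 - falloff * d))
```

Shake occurring right on the face keeps its full weight; the same amount of shake in a far corner of the frame counts much less.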
Here, the shake correction unit 10a constitutes shake correction means for correcting a shake evaluation value (shake amount) of each image frame detected by the shake detection unit 10, based on a shake position with respect to a position of the face.
The image determination unit 11a of the image specification unit 11, as the 3rd image determination means, determines whether or not there are a plurality of image frames whose shake evaluation values are smaller than a predetermined shake detection threshold value, when the number of faces for which eye detection is valid is 0 even though a human face is detected from some image frame.
When the image specification unit 11 determines that there are not a plurality of image frames having a shake evaluation value smaller than the predetermined shake detection threshold, it specifies the RAW image data in the image frame having the smallest shake evaluation value detected by the shake detection unit 10 as one captured image recorded in the recording medium 13.
Next, an image selection process performed by the imaging apparatus 100 according to the modification will be described with reference to fig. 7.
Fig. 7 is a flowchart showing an example of an operation in the image selection process. The processing after the blink detection processing (step S12) in fig. 7 is the same as the processing shown in the flowchart showing the sequence of the image selection processing shown in fig. 3, and detailed description thereof will be omitted.
As shown in fig. 7, when a continuous imaging instruction is input based on a predetermined operation of the shutter button 16a of the operation input unit 16 by the user, the CPU17 adjusts the predetermined imaging conditions and causes the electronic imaging unit 2 to perform continuous imaging for continuously imaging a predetermined number of optical images of the subject at a predetermined imaging frame rate, as in the above-described embodiment (step S1). Then, the CPU17 causes the image data generator 4 to generate RAW image data and reduced-luminance image data of each image frame of the subject transferred from the electronic image pickup unit 2, and temporarily store these image data in the image memory 5, as in the above-described embodiment (step S2).
Then, the CPU17 causes the face detection unit 6 to detect a human face from the reduced-luminance image data of each image frame, and generates image information in the detected face image region as face contour information (step S3), as in the above-described embodiment.
Next, the CPU17 determines whether or not the number of faces detected in the face detection processing is 0 (step S4), as in the above-described embodiment. Here, when it is determined that the number of detected faces is not 0 (no in step S4), the CPU17 assigns a personal ID to each face based on the face contour information detected from each image frame (step S5), and causes the eye detection unit 7 to detect the left and right eyes of the person based on the face contour information of each image frame generated by the face detection unit 6 to calculate the center coordinates thereof (step S6), as in the above-described embodiment.
Then, the CPU17 causes the validity determination unit 7b of the eye detection unit 7 to determine the validity of each face in which the eyes were detected (step S7), and determines whether or not the number of faces for which eye detection is valid (the number of valid eye-detection faces) is 0 based on the determination result of the validity determination unit 7b (step S8), as in the above-described embodiment.
If it is determined in step S8 that the number of faces for which eye detection is valid is not 0 (no in step S8), the CPU17 causes the shake detection unit 10 to perform a shake detection process, that is, each image frame relating to a face for which eye detection is effective is divided into blocks of predetermined regions, a difference value is calculated for each block between blocks at the same position in the adjacent image frame, and the largest difference value among all the blocks is set as the shake evaluation value of the image frame (step S41). At this time, the shake correction unit 10a of the shake detection unit 10 performs correction such that the shake evaluation value becomes lower as the position of the block having the largest difference value becomes more distant from the position of the block where the face is located.
Next, the CPU17 causes the image specification unit 11 to determine, for each image frame, whether or not the shake evaluation value is smaller than a predetermined shake detection threshold (step S42), and then to determine whether or not there are a plurality of image frames whose shake evaluation values are smaller than the threshold (step S43).
Here, if it is determined that there are not a plurality of image frames smaller than the predetermined shake detection threshold (no in step S43), the CPU17 proceeds to step S10 and causes the image specification unit 11 to specify the RAW image data in the image frame having the smallest shake evaluation value detected by the shake detection unit 10, as in the above-described embodiment (step S10). Thereafter, the CPU17 causes the developing unit 12 to perform the developing process of the RAW image data specified by the image specifying unit 11, encodes the image data in the JPEG format, and stores the encoded image data in the recording medium 13, as in the above-described embodiment (step S11).
On the other hand, when determining that there are a plurality of image frames having a shake evaluation value smaller than the predetermined shake detection threshold value (yes in step S43), the CPU17 causes the blink detection unit 9 to execute the blink detection process (step S12) as in the above-described embodiment.
The processing after the blink detection processing is the same as that in the above embodiment, and detailed description thereof is omitted.
Therefore, according to the imaging apparatus 100 of this modification, when it is determined whether or not the number of image frames having a shake evaluation value smaller than the predetermined shake detection threshold value is plural and it is determined that there are plural image frames having a shake evaluation value smaller than the predetermined shake detection threshold value, the blink detection processing is performed, so that it is possible to comprehensively determine the state of a human face such as the degree of blinking or the degree of smiling for each individual image frame having a small subject shake or hand shake, and to appropriately select an image to be recorded in the recording medium 13.
When it is determined that there are not a plurality of image frames having a shake evaluation value smaller than the predetermined shake detection threshold value, the RAW image data in the image frame having the smallest shake evaluation value detected by the shake detection unit 10 is specified as one captured image recorded in the recording medium 13, and therefore, the image having the smallest shake amount can be reliably selected.
In the shake detection process, the correction is performed so that the shake evaluation value decreases as the position of the block having the largest difference value is away from the block where the face is located, and therefore, it is possible to determine whether or not there are a plurality of image frames whose shake evaluation values are smaller than a predetermined shake detection threshold value after the determination, taking into account the distance between the position where the shake occurs and the position of the face.
That is, even for an image frame in which a slight shake occurs, when the position where the shake occurs is far from the face, the shake can be overlooked to some extent and the blink detection process can still be performed, so the image recorded in the recording medium 13 can be appropriately selected from a larger number of image frames.
In the above-described embodiment, one piece of image data is specified automatically in the image selection process, but the present invention is not limited to this; for example, the image frames obtained by continuous shooting may be evaluated according to the degree of blinking, the degree of smiling, the degree of shake, and the like, and then reordered from the highest evaluation to the lowest, so that the user can confirm and select a desired image frame.
Further, the image data of all the image frames may be reordered from the highest evaluation to the lowest and stored in the recording medium 13.
Further, the above-described image selection processing may be performed at the time of image reproduction. That is, the image frames in the continuous shooting recorded in the recording medium 13 may be evaluated in accordance with the degree of blinking, the degree of smiling, the degree of judder, and the like, and then re-recorded in the recording medium 13. Further, the image frames may be rearranged from high to low of the evaluation, and may be reproduced and displayed with a predetermined interval.
Further, the number of continuous shots may be adjusted according to the state of the face by performing face detection processing from the live view image. Specifically, when continuously photographing an object, the setting is performed such that: the face detection processing can be performed in a state where the shutter button 16a is half-pressed, and the number of continuous shots can be increased in accordance with the number of people whose faces are detected. That is, since the probability that all the eyes are open decreases as the number of persons who take a picture as an object increases, it is possible to reliably specify an image in which all the persons have their eyes open by increasing the number of continuous shots.
In the case of continuous shooting of the subject, the face detection process may be performed in the half-pressed state of the shutter button 16a, the detected face may be tracked, and the number of continuous shots may be increased in accordance with the magnitude of the face motion. That is, the larger the face motion, the higher the probability that subject shake will occur and the lower the probability that the face faces the front; therefore, by increasing the number of continuous shots, it is possible to reliably obtain an image free of subject shake or an image in which the face faces the front.
The image memory 5 may have, as the acquisition means, a ring buffer (not shown) capable of temporarily storing a predetermined number of image frames, temporarily storing a plurality of image frames generated by the electronic imaging unit 2, and automatically pressing the shutter when it is determined that all the eyes of the person are open in the blink detection processing. This makes it possible to reliably acquire images of all the people with their eyes open.
In the smile detection processing of the above embodiment, evaluation items such as the size of the face relative to the entire image and the position of the face (for example, at the center or the edge of the entire image) may be added in addition to the smile degree of the human face, and the smile value may be corrected in accordance with these items.
The recording medium 13 may have a person registration database (not shown) as person registration means, and face detection processing and assignment of person IDs may be performed using the database. When a person registered in the person registration database is detected in the face detection process, the image selection process may be performed in consideration of the state of the face of the person in preference to other persons. That is, when the face detected by the face detection unit 6 is a person registered in the person registration database, the image specification unit 11 may correct the evaluation value relating to the face. For example, by making the threshold value for determination of the blink detection processing or the smile detection processing of the person stricter and widening the threshold value of the other person, it is possible to specify an image in which at least the eyes of the person are open and smile.
In addition, when a face is detected from any one of the image frames at the time of assigning the person ID, the face detection processing may be performed from the other image frame using the detected face contour information as a template (template). This makes it possible to reliably detect a human face from all the image frames, and to improve the accuracy of the subsequent eye detection process, smile detection process, and blink detection process.
In the face detection process and the eye detection process, both the left and right eyes of a human face are detected from any one of the image frames, and when template matching is successfully performed between adjacent images using the peripheral regions of the eyes as templates, the operations of the face detection process and the eye detection process may be skipped.
In the image selection processing, when no image in which the eyes of all the persons are open is found, the image with the highest evaluation may be designated, and for each person whose eyes are closed in that image, the peripheral portion of the eyes may be cut out from another image frame in which that person's eyes are open and combined with the image with the highest evaluation.
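The compositing step reduces to a region copy between aligned frames, as in the sketch below; the function name, the box layout, and the assumption that the frames are already aligned (same subject position in both images) are illustrative:

```python
import numpy as np

def composite_eye_region(best_image, donor_image, eye_box):
    """Paste the open-eye region from a donor frame onto the best-rated frame.

    eye_box = (top, left, height, width) around the closed eye. Both images
    are assumed to be aligned arrays of the same shape.
    """
    out = best_image.copy()  # leave the original best frame untouched
    t, l, h, w = eye_box
    out[t:t + h, l:l + w] = donor_image[t:t + h, l:l + w]
    return out
```

A production version would blend the patch edges rather than hard-copy the rectangle, but the principle is the same.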
The configuration of the imaging apparatus 100 described in the above embodiment is merely an example and is not limited thereto. That is, although the imaging apparatus 100 is exemplified as the image selection apparatus, the image selection apparatus is not limited thereto. For example, continuous shooting may be performed by an imaging apparatus different from the imaging apparatus 100, and the image selection apparatus may be one that records only the image data transferred from that imaging apparatus and executes only the image selection processing.
In the above-described embodiment, the functions of the acquisition means, the face detection means, the eye detection means, the blink detection means, the evaluation means, and the specification means are realized by driving the electronic image pickup unit, the image pickup control unit 3, the face detection unit 6, the eye detection unit 7, the blink detection unit 9, and the image specification unit 11 under the control of the CPU 17; however, the present invention is not limited thereto, and these functions may instead be realized by the CPU 17 executing a predetermined program or the like.
That is, a program memory (not shown) stores in advance programs including an acquisition processing routine, a face detection processing routine, an evaluation processing routine, and a specification processing routine. The acquisition processing routine may cause the CPU 17 to acquire a plurality of captured images generated by continuously capturing images of a subject. The face detection processing routine may cause the CPU 17 to detect a human face from the plurality of captured images generated by continuous shooting. The eye detection processing routine may cause the CPU 17 to detect eyes from the human face detected in the face detection processing. The blink detection processing routine may cause the CPU 17 to detect the blink degree of the eyes detected in the eye detection processing. The evaluation processing routine may cause the CPU 17 to evaluate the state of the human face based on the blink degree detected in the blink detection processing. The specification processing routine may cause the CPU 17 to specify, based on the evaluation of the face state, at least one captured image to be recorded in the recording medium 13 from among the acquired plurality of captured images.
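The flow of these routines can be sketched as a single selection function; the callables standing in for the detection units and the 0.5 open/closed cutoff are assumptions for illustration, not values from the embodiment:

```python
def select_best_image(frames, detect_faces, blink_degree):
    """Pick the index of the frame whose faces score best.

    detect_faces(frame) -> list of face regions; blink_degree(face) -> float
    (higher = eyes more open). Frames with the fewest closed-eye faces win;
    ties are broken by the summed blink degree.
    """
    def evaluate(frame):
        degrees = [blink_degree(f) for f in detect_faces(frame)]
        closed = sum(1 for d in degrees if d < 0.5)  # assumed open/closed cutoff
        return (-closed, sum(degrees))               # fewer closed, then more open
    return max(range(len(frames)), key=lambda i: evaluate(frames[i]))
```

This mirrors the embodiment's ordering: the closed-eye count dominates, and the blink degree only decides among frames that tie on that count.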

Claims (18)

1. An image selection apparatus comprising:
an acquisition unit that acquires a plurality of captured images generated by continuously capturing images of a subject including at least one person;
a face detection unit that detects faces included in the plurality of captured images acquired by the acquisition unit;
an eye detection unit that detects eyes from the face of the person detected by the face detection unit;
a blink detection unit that detects a blink degree of each of the eyes detected by the eye detection unit; and
a specifying unit that specifies, from among the plurality of captured images acquired by the acquisition unit, a captured image whose blink degree detected by the blink detection unit is higher than those of the other captured images,
wherein a higher blink degree indicates that the eye is open wider.
2. The image selection apparatus according to claim 1, further comprising an evaluation unit that evaluates a state of the human face based on the blink degree detected by the blink detection unit;
the specifying unit specifies a captured image having a higher blink degree than other captured images from among the plurality of captured images acquired by the acquiring unit, based on the state of the face evaluated by the evaluating unit.
3. The image selection apparatus according to claim 1,
further comprising: a threshold value calculation unit that calculates, based on the blink degree of each person detected by the blink detection unit, a threshold value for determining whether or not that person's eyes are open; and
a blink determination unit that determines, for the plurality of captured images acquired by the acquisition unit, whether or not the eyes of each person are open based on the threshold value calculated by the threshold value calculation unit.
4. The image selection apparatus according to claim 3,
further comprising a 1st image determination unit that determines, based on a determination result by the blink determination unit, whether or not there are a plurality of captured images having the smallest number of closed-eye faces among the plurality of captured images acquired by the acquisition unit,
the specifying unit specifies the captured image when the 1st image determination unit determines that there are not a plurality of captured images having the smallest number of closed-eye faces.
5. The image selection apparatus according to claim 4,
further comprising a 2nd image determination unit that determines whether or not the number of closed-eye faces is 0 when the 1st image determination unit determines that there are a plurality of captured images having the smallest number of closed-eye faces,
the specifying unit specifies the captured image based on the blink degree of each person detected by the blink detection unit when the 2nd image determination unit determines that the number of closed-eye faces is 0.
6. The image selection apparatus according to claim 1,
further comprising: a smiling face detection unit that detects a smiling degree of a face in which the eyes are detected, based on position information of the eyes detected by the eye detection unit; and
a blink correction unit that corrects the degree of blinking detected by the blink detection unit based on the degree of smiling detected by the smiling face detection unit,
the specifying unit determines the state of the face based on the blink degree corrected by the blink correction unit.
7. The image selection apparatus according to claim 4,
further comprising: a smiling face detection unit that detects a smiling degree of a face in which the eyes are detected, based on position information of the eyes detected by the eye detection unit; and
a 2nd image determination unit that determines whether or not the number of closed-eye faces is 0 when the 1st image determination unit determines that there are a plurality of captured images having the smallest number of closed-eye faces,
the specifying unit specifies the captured image based on the smile degree detected by the smiling face detection unit when the 2nd image determination unit determines that the number of closed-eye faces is not 0.
8. The image selection apparatus according to claim 1,
further comprising: a reliability calculation unit that calculates reliability of eye detection by the eye detection unit; and
a validity determination unit that determines, based on whether or not the reliability calculated by the reliability calculation unit is equal to or less than a predetermined threshold, the validity of the face from which the eyes relating to that reliability were detected,
the specifying unit determines a state of the face determined to be valid by the validity determination unit.
9. The image selection apparatus according to claim 1,
further comprising a smoothing unit that smooths, between adjacent images, the blink degree detected by the blink detection unit when the imaging frame rate is equal to or higher than a predetermined value,
the specifying unit determines the state of the human face based on the blink degree smoothed by the smoothing unit.
10. The image selection apparatus according to claim 8,
further comprising: a shake detection unit that detects, for each of the plurality of captured images acquired by the acquisition unit whose face is determined to be valid by the validity determination unit, a shake amount with respect to an adjacent image; and
a 3rd image determination unit that determines whether or not there are a plurality of captured images whose shake amount detected by the shake detection unit is smaller than a predetermined value,
the blink detection unit detects the blink degree of the eyes detected by the eye detection unit when the 3rd image determination unit determines that there are a plurality of captured images whose shake amount is smaller than the predetermined value.
11. The image selection apparatus according to claim 10,
further comprising a shake correction unit that corrects the shake amount of each captured image detected by the shake detection unit based on the position of the shake relative to the position of the face,
the specifying unit specifies a captured image whose shake amount is smaller than the predetermined value based on the shake amount of each captured image corrected by the shake correction unit.
12. The image selection apparatus according to claim 10,
the specifying unit specifies the captured image with the smallest shake amount when the 3rd image determination unit determines that there are not a plurality of captured images whose shake amount is smaller than the predetermined value.
13. The image selection apparatus according to claim 2,
the plurality of captured images acquired by the acquisition unit are reordered based on the evaluation made by the evaluation unit.
14. The image selection apparatus according to claim 1,
the acquisition unit includes a ring buffer capable of temporarily storing a predetermined number of image frames.
15. The image selection apparatus according to claim 2,
further comprising a person registration unit that allows a user to register a person,
when the face detected by the face detection unit is a person registered in the person registration unit, the evaluation unit corrects the evaluation value relating to the face.
16. The image selection apparatus according to claim 1,
further comprising an imaging unit that captures the captured images,
the number of images to be continuously captured by the imaging unit is set based on the face detected by the face detection unit.
17. The image selection apparatus according to claim 1,
further comprising a shake detection unit that detects, for each of the plurality of captured images acquired by the acquisition unit, a shake amount with respect to an adjacent image when no human face is detected by the face detection unit,
the specifying unit specifies the captured image with the smallest shake amount detected by the shake detection unit.
18. An image selection method, comprising:
a step of acquiring a plurality of captured images generated by continuously capturing images of at least one person as a subject;
a step of performing face detection processing for detecting faces included in the plurality of captured images;
a step of performing eye detection processing of detecting eyes from the face detected by the face detection processing;
a step of performing blink detection processing for detecting the respective blink degrees of the eyes detected by the eye detection processing; and the number of the first and second groups,
a step of specifying, from among the plurality of captured images, a captured image whose blink degree detected by the blink detection processing is higher than those of the other captured images,
wherein a higher blink degree indicates that the eye is open wider.

Applications Claiming Priority (2)

- JP2009086197A (filed 2009-03-31): Image selection apparatus, image selection method, and program
- JP2009-086197 (priority date 2009-03-31)

Publications (2)

- HK1145111A1 (published 2011-04-01)
- HK1145111B (published 2013-09-19)
