WO2019061658A1 - Glasses positioning method, device, and storage medium - Google Patents
Glasses positioning method, device, and storage medium
- Publication number
- WO2019061658A1 (application PCT/CN2017/108756)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- glasses
- image
- sample
- classifier
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Definitions
- the present application relates to the field of computer vision processing technologies, and in particular, to a glasses positioning method, an electronic device, and a computer readable storage medium.
- in face recognition, face images of people wearing dark-framed glasses are highly similar to one another, so accurate face recognition cannot be performed.
- the method adopted in the industry is to remove the eyeglass region from the face image and then recognize the face image with the eyeglass region removed.
- the key to this method is how to accurately determine the area of the glasses in the face image.
- early glasses detection mainly adopted image processing and template matching methods, detecting the lower border and the nose bridge of the glasses from discontinuous changes in the gray values of the pixels, and then detecting the glasses through the edge information of the area between the two eyes;
- later glasses detection mainly uses the three-dimensional Hough transform method to detect the glasses.
- however, under varying lighting, the results obtained by image processing and the Hough method depend too heavily on image edges, so they are affected by noise; noise interference often makes it impossible to obtain feature points, or to obtain accurate ones, so the detection accuracy is relatively low.
- the present application provides a glasses positioning method, an electronic device, and a computer readable storage medium, the main purpose of which is to improve the accuracy of positioning glasses in a face image.
- the present application provides an electronic device, including: a memory, a processor, and an imaging device, wherein the memory includes a glasses positioning program, and the glasses positioning program is executed by the processor to implement the following steps:
- the position of the glasses in the real-time facial image is located by using a predetermined second classifier, and the positioning result is output.
- the present application further provides a method for positioning glasses, the method comprising:
- the position of the glasses in the real-time facial image is located by using a predetermined second classifier, and the positioning result is output.
- the present application further provides a computer readable storage medium including a glasses positioning program; when the glasses positioning program is executed by a processor, any step of the glasses positioning method described above is implemented.
- the glasses positioning method, the electronic device, and the computer readable storage medium provided by the present application first determine, through the first classifier, whether the face image includes glasses, and then input the face image including the glasses into the second classifier to determine the position of the glasses in the face image.
- the present application uses two classifiers to detect the eyeglass region in the face image and does not depend on image edges, thereby improving the precision and accuracy of glasses detection.
- FIG. 1 is a schematic diagram of hardware of a preferred embodiment of an electronic device of the present application.
- FIG. 2 is a schematic block diagram of a preferred embodiment of the glasses positioning program of FIG. 1;
- FIG. 3 is a flow chart of a preferred embodiment of a method for positioning glasses according to the present application.
- the application provides an electronic device 1.
- FIG. 1 is a hardware schematic diagram of a preferred embodiment of the electronic device of the present application.
- in this embodiment, the electronic device 1 may be a terminal device having a computing function, such as a server, a smart phone, a tablet computer, a portable computer, or a desktop computer.
- the electronic device 1 may be any such device on which the glasses positioning program is installed.
- the server may be a rack server, a blade server, a tower server, or a cabinet server.
- the electronic device 1 includes a memory 11, a processor 12, an imaging device 13, a network interface 14, and a communication bus 15.
- the memory 11 includes at least one type of readable storage medium.
- the at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, or an optical disk.
- the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
- the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card.
- the readable storage medium of the memory 11 is generally used to store the glasses positioning program 10 installed in the electronic device 1, the model files of the predetermined first classifier and second classifier, and various types of data.
- the memory 11 can also be used to temporarily store data that has been output or is about to be output.
- the processor 12, in some embodiments, may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip for running program code or processing data stored in the memory 11, for example executing the glasses positioning program 10.
- the camera device 13 may be part of the electronic device 1 or may be independent of the electronic device 1.
- in some embodiments, the electronic device 1 is a terminal device having a camera, such as a smartphone, a tablet computer, or a portable computer, and the camera device 13 is the camera of the electronic device 1.
- in other embodiments, the electronic device 1 may be a server, and the camera device 13 is independent of the electronic device 1 and connected to it via a network; for example, the camera device 13 is installed in a specific place, such as an office or a monitored area, captures real-time images of targets entering that place, and transmits the captured real-time images to the processor 12 through the network.
- the network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
- Communication bus 15 is used to implement connection communication between these components.
- FIG. 1 shows only the electronic device 1 with components 11-15, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
- optionally, the electronic device 1 may further include a user interface; the user interface may include an input unit such as a keyboard, and optionally may further include a standard wired interface and a wireless interface.
- optionally, the electronic device 1 may further include a display, which may also be referred to as a display screen or a display unit.
- in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an Organic Light-Emitting Diode (OLED) touch device, or the like.
- the display is used to display information processed in the electronic device 1 and to display a visual user interface.
- the electronic device 1 may further include a touch sensor.
- the area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area.
- the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like.
- the touch sensor includes not only a contact type touch sensor but also a proximity type touch sensor or the like.
- the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array.
- the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor.
- optionally, the display is stacked with the touch sensor to form a touch display screen, and the device detects user-triggered touch operations based on the touch display screen.
- optionally, the electronic device 1 may further include a radio frequency (RF) circuit, a sensor, an audio circuit, and the like, which are not described in detail here.
- in the device embodiment shown in FIG. 1, the glasses positioning program 10 is stored in the memory 11, which serves as a computer storage medium.
- when the processor 12 executes the glasses positioning program 10 stored in the memory 11, the following steps are implemented:
- the position of the glasses in the real-time facial image is located by using a predetermined second classifier, and the positioning result is output.
- when the camera device 13 captures a real-time image, it transmits the real-time image to the processor 12; the processor 12 receives the real-time image, obtains its size, and creates a grayscale image of the same size.
- the acquired color image is converted into the grayscale image, and a memory space is created at the same time; the histogram of the grayscale image is equalized, which reduces the amount of information in the grayscale image and speeds up detection; the training library is then loaded to detect the face in the image, and an object containing the face information is returned; the data on the location of the face is obtained and the number is recorded; finally, the face region is obtained and saved, which completes the face image extraction.
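The grayscale conversion and histogram equalization described above can be sketched in plain Python. This is only an illustration under stated assumptions: the patent does not name the conversion weights or the equalization formula, so the BT.601 luma weights and the standard cumulative-histogram mapping are used here, and the function names `rgb_to_gray` and `equalize_histogram` are hypothetical. Face detection itself (loading the training library) is not shown.

```python
def rgb_to_gray(image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values."""
    # Assumption: ITU-R BT.601 luma weights; the patent does not specify them.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in image]


def equalize_histogram(gray):
    """Spread the gray levels of an image over the full 0-255 range."""
    pixels = [value for row in gray for value in row]
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    # Cumulative count of pixels at or below each gray level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # single gray level: nothing to equalize
        return [row[:] for row in gray]
    # Lookup table mapping each old gray level to its equalized level.
    lut = [round((c - cdf_min) / (n - cdf_min) * 255) if c > 0 else 0 for c in cdf]
    return [[lut[value] for value in row] for row in gray]
```

For a low-contrast 2x2 picture such as `[[100, 110], [120, 130]]`, the four gray levels are spread evenly across the full range.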
- the face recognition algorithm for extracting the face image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, or the like.
- the face image extracted by the face recognition algorithm is input to a predetermined first classifier, and it is determined whether the face image includes glasses.
- the training step of the predetermined first classifier includes:
- a first proportion of the sample pictures is used as a training set, and a second proportion (for example, 50%) of the sample pictures is randomly extracted from the remainder of the first sample set as a verification set, that is, 25% of the sample pictures in the first sample set are used as the verification set.
- the convolutional neural network is trained with the training set to obtain the first classifier; to ensure the accuracy of the first classifier, the trained first classifier is verified with the verification set: if the accuracy is greater than or equal to the preset accuracy, the training ends; if the accuracy is less than the preset accuracy, the number of sample pictures in the sample set is increased and the above steps are re-executed.
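The split-train-verify loop above can be sketched as follows. This is a minimal sketch, not the patent's implementation: the 50% first proportion is inferred from the statement that a 50% second proportion yields 25% of the first sample set, the 0.95 preset accuracy and the `max_rounds` limit are illustrative assumptions, and `train` and `more_samples` are hypothetical callables standing in for the convolutional neural network training and for collecting additional pictures.

```python
import random


def split_samples(samples, first_ratio=0.5, second_ratio=0.5, seed=0):
    """Randomly pick a training set, then a verification set from the remainder."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * first_ratio)
    remainder = shuffled[n_train:]
    n_verify = int(len(remainder) * second_ratio)
    return shuffled[:n_train], remainder[:n_verify]


def accuracy(classifier, verification_set):
    """Fraction of (picture, label) pairs that the classifier labels correctly."""
    hits = sum(1 for picture, label in verification_set if classifier(picture) == label)
    return hits / len(verification_set)


def train_until_accurate(samples, train, more_samples, preset_accuracy=0.95, max_rounds=5):
    """Train, verify, and enlarge the sample set until the preset accuracy is met."""
    for _ in range(max_rounds):
        training_set, verification_set = split_samples(samples)
        classifier = train(training_set)
        if accuracy(classifier, verification_set) >= preset_accuracy:
            return classifier
        samples = samples + more_samples()  # accuracy too low: add pictures and retry
    raise RuntimeError("preset accuracy not reached")
```

With 100 sample pictures and the default ratios, the training set holds 50 pictures and the verification set 25, matching the example proportions in the text.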
- the training step of the predetermined first classifier further includes: performing preprocessing such as scaling, cropping, flipping, and/or twisting on the sample pictures in the first sample set, and training the convolutional neural network with the preprocessed sample pictures, which effectively improves the authenticity and accuracy of the model training.
- performing image preprocessing on each sample picture may include:
- each predetermined type of parameter (for example, color, brightness, and/or contrast) has a corresponding standard parameter value; for example, the standard parameter value corresponding to the color is a1, that corresponding to the brightness is a2, and that corresponding to the contrast is a3.
- each predetermined preset type parameter value of each second picture is adjusted to a corresponding standard parameter value, and a corresponding third picture is obtained, so as to eliminate the unclear picture caused by external conditions of the sample picture when shooting.
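- each third picture is warped according to a preset twist angle (for example, 30 degrees) to obtain a corresponding fourth picture, and each fourth picture is a training picture of the corresponding sample picture.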
- the function of the flip and twist operation is to simulate various forms of pictures in the actual business scene. Through these flip and twist operations, the size of the data set can be increased, thereby improving the authenticity and practicability of the model training.
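Two of the preprocessing operations above, adjusting brightness to a standard value and flipping, can be sketched on a grayscale picture stored as rows of 0-255 values. The sketch assumes that "adjusting to the standard parameter value" means shifting the mean brightness to that value, which the patent does not state; the default of 128 and all function names are hypothetical. The twist (warp) operation is not shown.

```python
def flip_horizontal(gray):
    """Mirror a grayscale picture (rows of 0-255 values) left to right."""
    return [row[::-1] for row in gray]


def adjust_brightness(gray, standard_brightness):
    """Shift pixel values so the mean brightness moves to the standard value."""
    pixels = [value for row in gray for value in row]
    offset = standard_brightness - sum(pixels) / len(pixels)
    # Clamp to the valid range; after clamping the mean is only approximately met.
    return [[min(255, max(0, round(value + offset))) for value in row] for row in gray]


def augment(gray, standard_brightness=128):
    """Return the brightness-normalized picture and its mirrored copy."""
    normalized = adjust_brightness(gray, standard_brightness)
    return [normalized, flip_horizontal(normalized)]
```

Each input picture yields two training pictures here, which is one way the flip operation enlarges the data set.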
- when the first classifier trained by the above steps determines that the face image includes glasses, the face image is input into a predetermined second classifier, which locates the glasses region in the face image and outputs the positioning result of the glasses in the face image. It can be understood that, if the determination result output by the first classifier is that the face image does not include a glasses region, the real-time image captured by the camera device 13 is reacquired and the subsequent steps are performed.
- the predetermined second classifier is obtained as follows: a preset number of sample pictures with glasses are prepared to form a second sample set; in other embodiments, the sample pictures marked "glasses" or "1" in the first sample set may be used instead.
- image preprocessing is performed on each sample picture. Specifically, the preprocessing step includes: converting each sample picture in the second sample set from a color image to a grayscale image, and then dividing the pixel value of each pixel in the grayscale image by 255, so that the pixel values are mapped from the range 0-255 to the range 0-1.
- the position of the glasses in each preprocessed sample picture is marked with a preset number of marked points; for example, eight feature points are marked on the eyeglass frame in each sample picture: three feature points are marked uniformly on each of the upper and lower frame edges, and one feature point is marked on each of the left and right frame edges.
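As a small illustration of the two steps above, the following sketch scales gray values to the 0-1 range and lays out eight points on an axis-aligned frame. The point layout is an assumption made for illustration: the patent only says three points are spread uniformly over each of the upper and lower edges and one is placed on each side edge, and in practice the points are marked on real sample pictures. `normalize_gray` and `eight_frame_points` are hypothetical names.

```python
def normalize_gray(gray):
    """Map 0-255 gray values to the 0-1 range by dividing by 255."""
    return [[value / 255 for value in row] for row in gray]


def eight_frame_points(left, top, right, bottom):
    """Place 3 points on the upper edge, 3 on the lower edge, and 1 on each side edge."""
    width = right - left
    # Assumed layout: quarter, half, and three-quarter positions along the edge.
    xs = [left + width * k / 4 for k in (1, 2, 3)]
    mid_y = (top + bottom) / 2
    upper = [(x, top) for x in xs]
    lower = [(x, bottom) for x in xs]
    return upper + lower + [(left, mid_y), (right, mid_y)]
```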
- the preset number of marked points representing the position of the glasses in each sample picture are combined into one vector; the vector of one sample picture is used as a reference vector, and the vectors of the remaining m-1 sample pictures are aligned with the reference vector to obtain a first average model for the position of the glasses.
- the first average model for the position of the glasses is subjected to dimensionality reduction by Principal Component Analysis (PCA) to obtain a second average model for the position of the glasses.
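The vector alignment and averaging described above can be sketched as follows. This is a simplified illustration, not the patent's procedure: alignment here matches only the centroid and overall size of each vector to the reference (rotation is ignored), the PCA step is not shown, and all function names are hypothetical.

```python
import math


def to_vector(points):
    """Concatenate (x, y) marked points into one flat vector [x1, y1, x2, y2, ...]."""
    return [coord for point in points for coord in point]


def _centroid_and_scale(vec):
    """Return the centroid and the root-sum-square spread of a flat point vector."""
    xs, ys = vec[0::2], vec[1::2]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    scale = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in zip(xs, ys)))
    return cx, cy, scale


def align_to_reference(vec, reference):
    """Translate and scale vec so its centroid and size match the reference."""
    cx, cy, s = _centroid_and_scale(vec)
    rx, ry, rs = _centroid_and_scale(reference)
    k = rs / s
    out = []
    for x, y in zip(vec[0::2], vec[1::2]):
        out += [rx + (x - cx) * k, ry + (y - cy) * k]
    return out


def average_model(vectors):
    """Align every vector to the first one and return the element-wise mean."""
    reference = vectors[0]
    aligned = [reference] + [align_to_reference(v, reference) for v in vectors[1:]]
    return [sum(column) / len(aligned) for column in zip(*aligned)]
```

A sample picture whose marked points form the same shape as the reference, only shifted and enlarged, contributes the reference shape itself to the average after alignment.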
- the local feature of each marker point is extracted from the second average model using a feature extraction algorithm, and the second average model for the position of the glasses and the local features of each of the marker points are used as the second classifier.
- the feature extraction algorithm is the SIFT (scale-invariant feature transform) algorithm: the SIFT algorithm extracts the local feature of each feature point from the second average model, selects one feature point as a reference feature point, and searches for feature points whose local features are the same as or similar to that of the reference feature point (e.g., the difference between the local features of the two feature points is within a preset range), proceeding in this way until all glasses feature points are found.
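The search for feature points whose local features fall within a preset range of the reference can be sketched as below. The sketch assumes the local features are already available as numeric descriptor lists and that "difference" means Euclidean distance; SIFT extraction itself is not implemented, and `feature_distance` and `find_similar_points` are hypothetical names.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two local feature descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_similar_points(reference_descriptor, candidates, preset_range):
    """Return candidate points whose descriptor lies within preset_range of the reference."""
    return [point for point, descriptor in candidates
            if feature_distance(reference_descriptor, descriptor) <= preset_range]
```

Here `candidates` is a list of `(point, descriptor)` pairs, and the threshold plays the role of the preset range mentioned in the text.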
- the feature extraction algorithm may also be the SURF (Speeded Up Robust Features) algorithm, the LBP (Local Binary Patterns) algorithm, the HOG (Histogram of Oriented Gradients) algorithm, or the like.
- the electronic device 1 proposed in this embodiment first determines, through the first classifier, whether the face image includes glasses, and then inputs the face image including the glasses into the second classifier to determine the position of the glasses in the face image.
- the precision and accuracy of the glasses detection are thereby improved.
- the glasses positioning program 10 can also be partitioned into one or more modules, one or more modules being stored in the memory 11 and executed by the processor 12 to complete the application.
- a module as referred to in this application refers to a series of computer program instructions that are capable of performing a particular function.
- FIG. 2 is a block diagram of the glasses positioning program 10 of FIG. 1.
- the glasses positioning program 10 can be divided into: an acquiring module 110, a determining module 120, and a positioning module 130.
- the functions or operating steps implemented by the modules 110-130 are similar to those described above and are not detailed here; by way of example:
- the acquiring module 110 is configured to acquire a real-time image captured by the camera device 13 and extract a real-time facial image from the real-time image by using a face recognition algorithm;
- the determining module 120 is configured to identify, by using a predetermined first classifier, whether the real-time facial image includes glasses, and output the recognition result;
- the positioning module 130 is configured to: when the recognition result is that the real-time facial image includes glasses, locate the glasses position in the real-time facial image by using a predetermined second classifier, and output the positioning result.
- the present application also provides a method for positioning glasses.
- FIG. 3 is a flowchart of the first embodiment of the glasses positioning method of the present application. The method can be performed by a device, and the device can be implemented by software and/or hardware.
- the glasses positioning method includes steps S10-S30:
- Step S10 acquiring a real-time image captured by the camera device, and extracting a real-time face image from the real-time image by using a face recognition algorithm;
- Step S20 using a predetermined first classifier to identify whether the real-time facial image includes glasses, and outputting the recognition result;
- Step S30 when the recognition result is that the real-time facial image includes glasses, using a predetermined second classifier to locate the position of the glasses in the real-time facial image, and outputting the positioning result.
- when the camera device captures a real-time image, it sends the real-time image to the processor; the processor receives the real-time image, obtains its size, and creates a grayscale image of the same size; the acquired color image is converted into the grayscale image, and a memory space is created at the same time; the histogram of the grayscale image is equalized, which reduces the amount of information in the grayscale image and speeds up detection; the training library is then loaded to detect the face in the image, and an object containing the face information is returned.
- the data on the location of the face is obtained and the number is recorded; finally, the face region is obtained and saved, which completes the face image extraction.
- the face recognition algorithm for extracting the face image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, or the like.
- the face image extracted by the face recognition algorithm is input to a predetermined first classifier, and it is determined whether the face image includes glasses.
- the training step of the predetermined first classifier includes:
- a first proportion of the sample pictures is used as a training set, and a second proportion (for example, 50%) of the sample pictures is randomly extracted from the remainder of the first sample set as a verification set, that is, 25% of the sample pictures in the first sample set are used as the verification set.
- the convolutional neural network is trained with the training set to obtain the first classifier; to ensure the accuracy of the first classifier, the trained first classifier is verified with the verification set: if the accuracy is greater than or equal to the preset accuracy, the training ends; if the accuracy is less than the preset accuracy, the number of sample pictures in the sample set is increased and the above steps are re-executed.
- the training step of the predetermined first classifier further includes: performing preprocessing such as scaling, cropping, flipping, and/or twisting on the sample pictures in the first sample set, and training the convolutional neural network with the preprocessed sample pictures.
- performing image preprocessing on each sample picture may include:
- each predetermined type of parameter (for example, color, brightness, and/or contrast) has a corresponding standard parameter value; for example, the standard parameter value corresponding to the color is a1, that corresponding to the brightness is a2, and that corresponding to the contrast is a3.
- each predetermined preset type parameter value of each second picture is adjusted to a corresponding standard parameter value, and a corresponding third picture is obtained, so as to eliminate the unclear picture caused by external conditions of the sample picture when shooting.
- each third picture is warped according to a preset twist angle (for example, 30 degrees), and a fourth picture corresponding to each third picture is obtained, and each fourth picture is a training picture of the corresponding sample picture.
- the function of the flip and twist operation is to simulate various forms of pictures in the actual business scene. Through these flip and twist operations, the size of the data set can be increased, thereby improving the authenticity and practicability of the model training.
- when the first classifier trained by the above steps determines that the face image includes glasses, the face image is input into a predetermined second classifier, which locates the glasses region in the face image and outputs the positioning result of the glasses in the face image. It can be understood that, if the determination result output by the first classifier is that the face image does not include a glasses region, the real-time image captured by the camera device 13 is reacquired and the subsequent steps are performed.
- the predetermined second classifier is obtained as follows: a preset number of sample pictures with glasses are prepared to form a second sample set; in other embodiments, the sample pictures marked "glasses" or "1" in the first sample set may be used instead.
- image preprocessing is performed on each sample picture. Specifically, the preprocessing step includes: converting each sample picture in the second sample set from a color image to a grayscale image, and then dividing the pixel value of each pixel in the grayscale image by 255, so that the pixel values are mapped from the range 0-255 to the range 0-1.
- the position of the glasses in each preprocessed sample picture is marked with a preset number of marked points; for example, eight feature points are marked on the eyeglass frame in each sample picture: three feature points are marked uniformly on each of the upper and lower frame edges, and one feature point is marked on each of the left and right frame edges.
- the preset number of marked points representing the position of the glasses in each sample picture are combined into one vector; the vector of one sample picture is used as a reference vector, and the vectors of the remaining m-1 sample pictures are aligned with the reference vector to obtain a first average model for the position of the glasses; PCA dimensionality reduction is performed on the first average model to obtain a second average model for the position of the glasses.
- a local feature of each marked point is extracted from the second average model using a feature extraction algorithm, and the second average model for the position of the glasses and the local feature of each marked point are used as the second classifier.
- the feature extraction algorithm is the SIFT algorithm: the SIFT algorithm extracts the local feature of each feature point from the second average model, selects one feature point as a reference feature point, and searches for feature points whose local features are the same as or similar to that of the reference feature point (for example, the difference between the local features of the two feature points is within a preset range), proceeding in this way until all glasses feature points are found.
- the feature extraction algorithm may also be a SURF algorithm, an LBP algorithm, an HOG algorithm, or the like.
- the first classifier is used to determine whether the face image includes glasses, and then the face image including the glasses is input to the second classifier to determine the position of the glasses in the face image.
- the embodiment of the present application further provides a computer readable storage medium including a glasses positioning program; when the glasses positioning program is executed by a processor, the following operations are implemented:
- the position of the glasses in the real-time facial image is located by using a predetermined second classifier, and the positioning result is output.
- the training process of the predetermined first classifier is as follows:
- the sample image after the classification mark is divided into a training set of a first ratio and a verification set of a second ratio;
- if the accuracy rate is greater than or equal to the preset accuracy rate, the training ends; if the accuracy rate is less than the preset accuracy rate, the number of sample pictures is increased and the training steps are re-executed.
- the obtaining process of the predetermined second classifier is as follows:
- a local feature of each marker point is extracted from the second average model, and a second average model for the position of the glasses and a local feature of each of the marker points are used as the second classifier.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
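A glasses positioning method, an electronic device (1), and a computer readable storage medium. The method includes: acquiring a real-time image captured by a camera device (13), and extracting a real-time facial image from the real-time image by using a face recognition algorithm (S10); identifying, by using a predetermined first classifier, whether the real-time facial image includes glasses, and outputting the recognition result (S20); and, when the recognition result is that the real-time facial image includes glasses, locating the position of the glasses in the real-time facial image by using a predetermined second classifier, and outputting the positioning result (S30). Two classifiers are used to detect the glasses region in the face image, which improves the precision and accuracy of glasses detection.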
Description
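Priority Claim
Under the Paris Convention, this application claims priority to Chinese patent application No. CN201710915085.X, filed on September 30, 2017 and entitled "Glasses positioning method, device, and storage medium", the entire content of which is incorporated herein by reference.
The present application relates to the field of computer vision processing technologies, and in particular to a glasses positioning method, an electronic device, and a computer readable storage medium.
In the field of face recognition, many people wear glasses, especially dark-framed glasses, so face images of people wearing dark-framed glasses are highly similar to one another during face recognition, and accurate face recognition cannot be performed. The method currently adopted in the industry is to first remove the glasses region from the face image and then recognize the face image with the glasses region removed. The key to this method, however, is how to accurately determine the glasses region in the face image.
Glasses detection is difficult because of factors such as the diversity of glasses shapes and image quality. For example, early glasses detection mainly adopted image processing and template matching methods, detecting the lower border and the nose bridge of the glasses from discontinuous changes in pixel gray values, and then detecting the glasses through the edge information of the area between the two eyes; later glasses detection mainly uses the three-dimensional Hough transform method. However, under varying lighting, the results obtained by image processing and the Hough method depend too heavily on image edges, so they are affected by noise; noise interference often makes it impossible to obtain feature points, or to obtain accurate ones, so the detection accuracy is relatively low.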
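Summary of the Invention
The present application provides a glasses positioning method, an electronic device, and a computer readable storage medium, the main purpose of which is to improve the accuracy of positioning glasses in a face image.
To achieve the above objective, the present application provides an electronic device including a memory, a processor, and a camera device, wherein the memory includes a glasses positioning program, and the glasses positioning program, when executed by the processor, implements the following steps:
acquiring a real-time image captured by the camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
identifying, by using a predetermined first classifier, whether the real-time facial image includes glasses, and outputting the recognition result; and
when the recognition result is that the real-time facial image includes glasses, locating the position of the glasses in the real-time facial image by using a predetermined second classifier, and outputting the positioning result.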
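The three steps above form a two-stage pipeline, which can be sketched as a single function. This is a minimal sketch under stated assumptions: `extract_face`, `first_classifier`, and `second_classifier` are hypothetical callables standing in for the face recognition algorithm and the two predetermined classifiers, and the dictionary result format is chosen only for illustration.

```python
def locate_glasses(real_time_image, extract_face, first_classifier, second_classifier):
    """Run the three steps: extract the face, check for glasses, then locate them."""
    face_image = extract_face(real_time_image)    # step 1: face extraction
    has_glasses = first_classifier(face_image)    # step 2: glasses or no glasses
    if not has_glasses:
        # The second classifier is only consulted for faces that wear glasses.
        return {"has_glasses": False, "position": None}
    position = second_classifier(face_image)      # step 3: glasses position
    return {"has_glasses": True, "position": position}
```

Gating the second classifier on the first one's result means the localization model only ever sees face images that contain glasses, which is the division of labor the application describes.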
此外,为实现上述目的,本申请还提供一种眼镜定位方法,该方法包括:
获取摄像装置拍摄到的一张实时图像,利用人脸识别算法从该实时图像中提取一张实时脸部图像;
利用预先确定的第一分类器识别该实时脸部图像中是否包含眼镜,并输出识别结果;及
当识别结果为该实时脸部图像中包含眼镜时,利用预先确定的第二分类器对该实时脸部图像中的眼镜位置进行定位,并输出定位结果。
此外,为实现上述目的,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质中包括眼镜定位程序,所述眼镜定位程序被处理器执行时,实现如上所述的眼镜定位方法中的任意步骤。
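The eyeglass positioning method, electronic device and computer-readable storage medium proposed in this application first use the first classifier to determine whether the face image contains glasses, and then input the face images that contain glasses into the second classifier to determine the position of the glasses in the face image. Because this application detects the glasses region with two classifiers and does not depend on image edges, it improves the precision and accuracy of glasses detection.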
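The two-stage flow summarized above can be sketched as a small control function. This is only an illustration under stated assumptions: the name `locate_glasses` and the three callables passed to it are hypothetical stand-ins for the face extractor, the first classifier and the second classifier, none of which are specified at code level in this application.

```python
from typing import Callable, Optional, Sequence, Tuple

Image = Sequence[Sequence[int]]   # grayscale image as rows of pixel values (assumption)
Point = Tuple[float, float]


def locate_glasses(
    frame: Image,
    extract_face: Callable[[Image], Optional[Image]],
    has_glasses: Callable[[Image], bool],
    locate: Callable[[Image], Sequence[Point]],
) -> Optional[Sequence[Point]]:
    """Run steps S10, S20 and S30 on one frame.

    Returns the located glasses points, or None when no face is found
    or the first classifier reports no glasses.
    """
    face = extract_face(frame)      # S10: extract the real-time face image
    if face is None:
        return None
    if not has_glasses(face):       # S20: first classifier, glasses present or not
        return None
    return locate(face)             # S30: second classifier, glasses position
```

A caller that receives `None` would simply fetch the next frame from the camera and try again, which matches the behaviour described for face images without glasses.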
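FIG. 1 is a hardware schematic diagram of a preferred embodiment of the electronic device of this application;
FIG. 2 is a module schematic diagram of a preferred embodiment of the eyeglass positioning program in FIG. 1;
FIG. 3 is a flowchart of a preferred embodiment of the eyeglass positioning method of this application.
The realization of the purpose, the functional features and the advantages of this application are further described below with reference to the embodiments and the accompanying drawings.
It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.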
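This application provides an electronic device 1. FIG. 1 is a hardware schematic diagram of a preferred embodiment of the electronic device of this application.
In this embodiment, the electronic device 1 may be a terminal device with computing capability on which the eyeglass positioning program is installed, such as a server, a smartphone, a tablet computer, a portable computer or a desktop computer. The server may be a rack server, a blade server, a tower server or a cabinet server.
The electronic device 1 includes a memory 11, a processor 12, a camera device 13, a network interface 14 and a communication bus 15.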
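The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk or an optical disc. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example the hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the electronic device 1.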
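In this embodiment, the readable storage medium of the memory 11 is typically used to store the eyeglass positioning program 10 installed on the electronic device 1, the model files of the predetermined first classifier and second classifier, and various kinds of data. The memory 11 may also be used to temporarily store data that has been output or is about to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor or another data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the eyeglass positioning program 10.
The camera device 13 may be part of the electronic device 1 or independent of it. In some embodiments, the electronic device 1 is a terminal device with a camera, such as a smartphone, a tablet computer or a portable computer, and the camera device 13 is the camera of the electronic device 1. In other embodiments, the electronic device 1 may be a server, and the camera device 13 is independent of the electronic device 1 and connected to it through a network. For example, the camera device 13 is installed at a specific site, such as an office or a monitored area, captures real-time images of targets entering that site, and transmits the captured real-time images to the processor 12 through the network.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
The communication bus 15 is used to connect these components so that they can communicate with one another.
FIG. 1 shows only the electronic device 1 with components 11-15, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.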
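Optionally, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface.
Optionally, the electronic device 1 may further include a display, which may also be called a display screen or display unit where appropriate. In some embodiments it may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display or an organic light-emitting diode (OLED) touch display. The display is used to show the information processed in the electronic device 1 and to show a visual user interface.
Optionally, the electronic device 1 may further include a touch sensor. The area that the touch sensor provides for the user's touch operations is called the touch area. The touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like. It may be a contact touch sensor or a proximity touch sensor. It may be a single sensor or, for example, a plurality of sensors arranged in an array.
In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, and the device detects touch operations triggered by the user through the touch display screen.
Optionally, the electronic device 1 may further include a radio frequency (RF) circuit, sensors, an audio circuit and so on, which are not described in detail here.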
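In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, stores the eyeglass positioning program 10. When the processor 12 executes the eyeglass positioning program 10 stored in the memory 11, the following steps are implemented:
acquiring a real-time image captured by the camera device 13, and extracting a real-time face image from the real-time image using a face recognition algorithm;
using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and
when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.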
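When the camera device 13 captures a real-time image, it sends the image to the processor 12. The processor 12 receives the real-time image and obtains its size, creates a grayscale image of the same size, converts the acquired color image into that grayscale image, and allocates a memory space. It then equalizes the histogram of the grayscale image, which reduces the amount of information in the grayscale image and speeds up detection, loads the training library, detects the faces in the picture, and returns an object containing the face information, from which it obtains the position data of each face and records the number of faces. Finally, it obtains the face region and saves it, which completes one pass of face image extraction. Specifically, the face recognition algorithm used to extract the face image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, and so on.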
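The grayscale conversion and histogram equalization mentioned above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this application: the function names `to_gray` and `equalize`, the nested-list image representation and the BT.601 luma weights are all assumptions chosen for the example, and the face detection step itself is not shown.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(rgb: List[List[RGB]]) -> List[List[int]]:
    """Convert RGB pixels to 8-bit gray values using the ITU-R BT.601 luma weights."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb
    ]


def equalize(gray: List[List[int]]) -> List[List[int]]:
    """Histogram-equalize an 8-bit grayscale image given as rows of ints."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one gray level present: nothing to stretch.
        return [row[:] for row in gray]

    # Map each gray level so that the output levels span 0..255.
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[value] for value in row] for row in gray]
```

After equalization the darkest gray level present maps to 0 and the brightest to 255, which evens out contrast between frames captured under different lighting before the face detector runs.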
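Next, the face image extracted by the face recognition algorithm is input into the predetermined first classifier to determine whether the face image contains glasses. The training steps of the predetermined first classifier include:
A certain number of face pictures with glasses and without glasses are prepared as sample pictures to form a first sample set, and each sample picture is labeled according to whether it contains glasses: sample pictures that contain glasses are labeled "with glasses" or "1", and sample pictures that do not are labeled "no glasses" or "0". A first proportion (for example, 50%) of the labeled first sample set is randomly selected as the training set, and a second proportion (for example, 50%) of the remaining sample pictures is randomly selected as the validation set, that is, 25% of the first sample set. A convolutional neural network is trained with the training set to obtain the first classifier. To ensure the accuracy of the first classifier, its accuracy is verified with the validation set: if the accuracy is greater than or equal to a preset accuracy, training ends; if the accuracy is less than the preset accuracy, the number of sample pictures in the sample set is increased and the above steps are performed again.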
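The sampling arithmetic and the retrain-until-accurate loop described above can be sketched as follows. The convolutional neural network itself is out of scope here: `train_fn` and `more_samples_fn` are hypothetical callables standing in for network training and for collecting additional sample pictures, and the `target` and `max_rounds` defaults are illustrative values, not values from this application.

```python
import random
from typing import Callable, Iterable, List, Sequence, Tuple

Sample = Tuple[str, int]              # (image path, label); 1 = glasses, 0 = no glasses
Predictor = Callable[[str], int]


def split_samples(
    samples: Sequence[Sample],
    train_ratio: float = 0.5,
    val_ratio_of_rest: float = 0.5,
    seed: int = 0,
) -> Tuple[List[Sample], List[Sample]]:
    """Randomly draw the training set, then the validation set from the remainder."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    rest = shuffled[n_train:]
    n_val = int(len(rest) * val_ratio_of_rest)   # 50% of the remaining 50% = 25% overall
    return shuffled[:n_train], rest[:n_val]


def accuracy(predict: Predictor, validation: Sequence[Sample]) -> float:
    """Fraction of validation samples whose predicted label matches the annotation."""
    if not validation:
        return 0.0
    correct = sum(1 for path, label in validation if predict(path) == label)
    return correct / len(validation)


def train_until_accurate(
    samples: Sequence[Sample],
    train_fn: Callable[[List[Sample]], Predictor],
    more_samples_fn: Callable[[], Iterable[Sample]],
    target: float = 0.95,
    max_rounds: int = 5,
) -> Predictor:
    """Train, validate, and enlarge the sample set until the target accuracy is met."""
    pool = list(samples)
    for _ in range(max_rounds):
        train_set, val_set = split_samples(pool)
        predict = train_fn(train_set)
        if accuracy(predict, val_set) >= target:
            return predict
        pool.extend(more_samples_fn())           # accuracy too low: add sample pictures
    raise RuntimeError("target accuracy not reached within max_rounds")
```

With 100 labeled pictures and the default ratios, `split_samples` yields 50 training pictures and 25 validation pictures, matching the 50% and 25% figures in the text.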
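It should be noted that the training steps of the predetermined first classifier further include preprocessing the sample pictures in the first sample set, for example by scaling, cropping, flipping and/or warping, and training the convolutional neural network with the preprocessed sample pictures, which effectively improves the realism and accuracy of model training.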
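For example, in one implementation, the picture preprocessing applied to each sample picture may include:
scaling the shorter side of each sample picture to a first preset size (for example, 640 pixels) to obtain a corresponding first picture, and randomly cropping a second picture of a second preset size from each first picture, for example a second picture of 256*256 pixels;
adjusting each second picture to standard parameter values to obtain a corresponding third picture, where a standard parameter value is defined for each predetermined parameter type such as color, brightness and/or contrast (for example, a1 for color, a2 for brightness and a3 for contrast) and the corresponding parameter values of each second picture are adjusted to those standard values, which eliminates the lack of clarity caused by external conditions when the sample pictures were taken and improves the effectiveness of model training;
flipping each third picture in preset directions (for example, horizontally and vertically) and warping each third picture by a preset warp angle (for example, 30 degrees) to obtain the corresponding fourth picture, which serves as the training picture for the corresponding sample picture. The flip and warp operations simulate the various forms that pictures take in real business scenarios and enlarge the dataset, which improves the realism and practicality of model training.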
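The geometric part of the preprocessing above (scaling the shorter side, random cropping, and flipping) can be sketched in pure Python. The helper names and the nested-list image representation are assumptions made for this example. The resampling itself, the adjustment to the standard values a1 to a3, and the 30-degree warp are deliberately left out, since this application does not specify how they are computed.

```python
import random
from typing import List, Tuple

Image = List[List[int]]   # rows of gray values (assumption for this sketch)


def scaled_size(width: int, height: int, short_side: int = 640) -> Tuple[int, int]:
    """Return the (width, height) after scaling so the shorter side equals short_side."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)


def random_crop(img: Image, size: int, rng: random.Random) -> Image:
    """Cut a size x size window at a random position inside the image."""
    height, width = len(img), len(img[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in img[top:top + size]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]
```

For a 1920 x 1080 sample picture, `scaled_size` gives 1138 x 640, from which a 256 x 256 window can then be cropped at random.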
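If the first classifier trained by the above steps determines that the face image contains glasses, the face image is input into the predetermined second classifier, which locates the glasses region in the face image and outputs the glasses positioning result for that face image. It can be understood that if the first classifier determines that the face image does not contain a glasses region, a new real-time image captured by the camera device 13 is acquired and the subsequent steps are performed again.
It should be noted that the predetermined second classifier is obtained as follows. A preset number of sample pictures "with glasses" are prepared to form a second sample set; in other embodiments, the sample pictures labeled "with glasses" or "1" in the first sample set may be used instead. To simplify subsequent calculation, each sample picture is preprocessed. Specifically, the preprocessing includes converting each sample picture in the second sample set from a color image to a grayscale image, and then dividing the pixel value of each pixel in the grayscale image by 255, so that the pixel values are normalized from the range 0-255 to the range 0-1. A preset number of marker points are then marked at the glasses position in each preprocessed sample picture. For example, 8 feature points are marked on the glasses frame in each sample picture: 3 evenly spaced feature points on each of the upper and lower rims, and 1 feature point on each of the left and right rims.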
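As a small illustration of the normalization and marking just described, the sketch below divides gray values by 255 and flattens the 8 marked points into one vector. The function names and the (x, y) tuple layout are assumptions for this example; the application does not prescribe a data layout.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize(gray: List[List[int]]) -> List[List[float]]:
    """Map 8-bit gray values from the range 0-255 to the range 0-1."""
    return [[value / 255 for value in row] for row in gray]


def landmarks_to_vector(points: Sequence[Point], expected: int = 8) -> List[float]:
    """Flatten the marked frame points into [x1, y1, x2, y2, ...]."""
    if len(points) != expected:
        raise ValueError("expected %d marker points, got %d" % (expected, len(points)))
    vector: List[float] = []
    for x, y in points:
        vector.extend((float(x), float(y)))
    return vector
```

With 8 marker points per picture, each sample contributes one 16-element vector to the shape model built in the next step.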
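Suppose the second sample set contains m sample pictures. The preset number of marker points representing the glasses position in each sample picture are combined into a vector. The vector of one sample picture is taken as the reference vector, and the vectors of the remaining m-1 sample pictures are aligned with the reference vector to obtain a first average model of the glasses position. The dimensionality of the first average model is then reduced by principal component analysis (PCA) to obtain a second average model of the glasses position. The alignment and dimensionality reduction are techniques well known to those skilled in the art and are not described here.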
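Because the text leaves the alignment unspecified, the following is only one possible reading: a least-squares similarity alignment (translation, scale and rotation) of each shape to the reference, followed by a point-wise mean. Treating each point as a complex number gives the optimal scale and rotation in closed form. Both the choice of this alignment and the use of the mean as the "first average model" are assumptions, and the PCA step is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _centered(points: Sequence[Point]) -> List[complex]:
    """Represent points as complex numbers with the centroid moved to the origin."""
    zs = [complex(x, y) for x, y in points]
    centroid = sum(zs) / len(zs)
    return [z - centroid for z in zs]


def align_to(reference: Sequence[Point], shape: Sequence[Point]) -> List[complex]:
    """Align shape to reference with the least-squares similarity transform."""
    ref = _centered(reference)
    cur = _centered(shape)
    denom = sum(abs(w) ** 2 for w in cur)
    if denom == 0:
        raise ValueError("degenerate shape: all points coincide")
    # The complex factor a encodes scale and rotation; it minimizes sum |z - a*w|^2.
    a = sum(w.conjugate() * z for w, z in zip(cur, ref)) / denom
    return [a * w for w in cur]


def mean_shape(shapes: Sequence[Sequence[Point]]) -> List[Point]:
    """Align every shape to the first one and average the aligned points."""
    reference = shapes[0]
    aligned = [_centered(reference)] + [align_to(reference, s) for s in shapes[1:]]
    count = len(aligned)
    result: List[Point] = []
    for i in range(len(reference)):
        total = sum(shape[i] for shape in aligned)
        result.append((total.real / count, total.imag / count))
    return result
```

A shape that is merely a shifted, rotated and rescaled copy of the reference aligns onto it exactly, so only genuine differences in frame shape remain in the averaged model.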
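A feature extraction algorithm is used to extract the local feature of each marker point from the second average model, for example a HOG feature, and the second average model of the glasses position together with the local feature of each marker point serves as the second classifier. In this embodiment, the feature extraction algorithm is the SIFT (scale-invariant feature transform) algorithm. The SIFT algorithm extracts the local feature of each feature point from the second average model, selects one feature point as the reference feature point, and searches for feature points whose local features are the same as or similar to that of the reference feature point (for example, the difference between the local features of the two feature points is within a preset range), and so on until all glasses feature points have been found. In other embodiments, the feature extraction algorithm may also be the SURF (Speeded Up Robust Features) algorithm, the LBP (Local Binary Patterns) algorithm, the HOG (Histogram of Oriented Gradients) algorithm, and so on.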
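The "difference within a preset range" test between local features can be sketched as a distance threshold on descriptor vectors. This is an assumption-laden illustration: the descriptors are taken to be plain lists of floats, Euclidean distance is chosen as the difference measure, and the names `descriptor_distance` and `find_similar` are hypothetical. Computing actual SIFT, SURF, LBP or HOG descriptors is not shown.

```python
import math
from typing import List, Sequence


def descriptor_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two local feature descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_similar(
    reference: Sequence[float],
    candidates: Sequence[Sequence[float]],
    max_distance: float,
) -> List[int]:
    """Return the indices of candidates whose descriptor lies within max_distance."""
    return [
        index
        for index, descriptor in enumerate(candidates)
        if descriptor_distance(reference, descriptor) <= max_distance
    ]
```

Here `max_distance` plays the role of the preset range: a smaller value makes the search stricter and returns fewer candidate feature points.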
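The electronic device 1 proposed in this embodiment first uses the first classifier to determine whether the face image contains glasses, and then inputs the face images that contain glasses into the second classifier to determine the position of the glasses in the face image. Using two classifiers to detect the glasses region in the face image improves the precision and accuracy of glasses detection.
In other embodiments, the eyeglass positioning program 10 may also be divided into one or more modules, which are stored in the memory 11 and executed by the processor 12 to implement this application. A module in this application is a series of computer program instruction segments capable of performing a specific function. FIG. 2 is a module schematic diagram of the eyeglass positioning program 10 in FIG. 1. The eyeglass positioning program 10 may be divided into an acquisition module 110, a judgment module 120 and a positioning module 130. The functions or operation steps implemented by modules 110-130 are similar to those described above and are not detailed again here; for example:
the acquisition module 110 is used to acquire a real-time image captured by the camera device 13 and to extract a real-time face image from the real-time image using a face recognition algorithm;
the judgment module 120 is used to identify, with a predetermined first classifier, whether the real-time face image contains glasses, and to output a recognition result; and
the positioning module 130 is used, when the recognition result is that the real-time face image contains glasses, to locate the position of the glasses in the real-time face image with a predetermined second classifier, and to output a positioning result.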
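In addition, this application further provides an eyeglass positioning method. FIG. 3 is a flowchart of a preferred embodiment of the eyeglass positioning method of this application. The method may be performed by a device, and the device may be implemented in software and/or hardware.
In this embodiment, the eyeglass positioning method includes steps S10-S30:
Step S10: acquiring a real-time image captured by a camera device, and extracting a real-time face image from the real-time image using a face recognition algorithm;
Step S20: using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and
Step S30: when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.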
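When the camera device captures a real-time image, it sends the image to the processor. The processor receives the real-time image and obtains its size, creates a grayscale image of the same size, converts the acquired color image into that grayscale image, and allocates a memory space. It then equalizes the histogram of the grayscale image, which reduces the amount of information in the grayscale image and speeds up detection, loads the training library, detects the faces in the picture, and returns an object containing the face information, from which it obtains the position data of each face and records the number of faces. Finally, it obtains the face region and saves it, which completes one pass of face image extraction. Specifically, the face recognition algorithm used to extract the face image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, and so on.
Next, the face image extracted by the face recognition algorithm is input into the predetermined first classifier to determine whether the face image contains glasses. The training steps of the predetermined first classifier include:
A certain number of face pictures with glasses and without glasses are prepared as sample pictures to form a first sample set, and each sample picture is labeled according to whether it contains glasses: sample pictures that contain glasses are labeled "with glasses" or "1", and sample pictures that do not are labeled "no glasses" or "0". A first proportion (for example, 50%) of the labeled first sample set is randomly selected as the training set, and a second proportion (for example, 50%) of the remaining sample pictures is randomly selected as the validation set, that is, 25% of the first sample set. A convolutional neural network is trained with the training set to obtain the first classifier. To ensure the accuracy of the first classifier, its accuracy is verified with the validation set: if the accuracy is greater than or equal to a preset accuracy, training ends; if the accuracy is less than the preset accuracy, the number of sample pictures in the sample set is increased and the above steps are performed again.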
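It should be noted that the training steps of the predetermined first classifier further include preprocessing the sample pictures in the first sample set, for example by scaling, cropping, flipping and/or warping, and training the convolutional neural network with the preprocessed sample pictures, which effectively improves the realism and accuracy of model training.
For example, in one implementation, the picture preprocessing applied to each sample picture may include:
scaling the shorter side of each sample picture to a first preset size (for example, 640 pixels) to obtain a corresponding first picture, and randomly cropping a second picture of a second preset size from each first picture, for example a second picture of 256*256 pixels;
adjusting each second picture to standard parameter values to obtain a corresponding third picture, where a standard parameter value is defined for each predetermined parameter type such as color, brightness and/or contrast (for example, a1 for color, a2 for brightness and a3 for contrast) and the corresponding parameter values of each second picture are adjusted to those standard values, which eliminates the lack of clarity caused by external conditions when the sample pictures were taken and improves the effectiveness of model training;
flipping each third picture in preset directions (for example, horizontally and vertically) and warping each third picture by a preset warp angle (for example, 30 degrees) to obtain the corresponding fourth picture, which serves as the training picture for the corresponding sample picture. The flip and warp operations simulate the various forms that pictures take in real business scenarios and enlarge the dataset, which improves the realism and practicality of model training.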
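If the first classifier trained by the above steps determines that the face image contains glasses, the face image is input into the predetermined second classifier, which locates the glasses region in the face image and outputs the glasses positioning result for that face image. It can be understood that if the first classifier determines that the face image does not contain a glasses region, a new real-time image captured by the camera device 13 is acquired and the subsequent steps are performed again.
It should be noted that the predetermined second classifier is obtained as follows. A preset number of sample pictures "with glasses" are prepared to form a second sample set; in other embodiments, the sample pictures labeled "with glasses" or "1" in the first sample set may be used instead. To simplify subsequent calculation, each sample picture is preprocessed. Specifically, the preprocessing includes converting each sample picture in the second sample set from a color image to a grayscale image, and then dividing the pixel value of each pixel in the grayscale image by 255, so that the pixel values are normalized from the range 0-255 to the range 0-1. A preset number of marker points are then marked at the glasses position in each preprocessed sample picture. For example, 8 feature points are marked on the glasses frame in each sample picture: 3 evenly spaced feature points on each of the upper and lower rims, and 1 feature point on each of the left and right rims.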
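Suppose the second sample set contains m sample pictures. The preset number of marker points representing the glasses position in each sample picture are combined into a vector. The vector of one sample picture is taken as the reference vector, and the vectors of the remaining m-1 sample pictures are aligned with the reference vector to obtain a first average model of the glasses position. PCA dimensionality reduction is then applied to the first average model to obtain a second average model of the glasses position.
A feature extraction algorithm is used to extract the local feature of each marker point from the second average model, for example a HOG feature, and the second average model of the glasses position together with the local feature of each marker point serves as the second classifier. The feature extraction algorithm is the SIFT algorithm. The SIFT algorithm extracts the local feature of each feature point from the second average model, selects one feature point as the reference feature point, and searches for feature points whose local features are the same as or similar to that of the reference feature point (for example, the difference between the local features of the two feature points is within a preset range), and so on until all glasses feature points have been found. In other embodiments, the feature extraction algorithm may also be the SURF algorithm, the LBP algorithm, the HOG algorithm, and so on.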
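The eyeglass positioning method proposed in this embodiment first uses the first classifier to determine whether the face image contains glasses, and then inputs the face images that contain glasses into the second classifier to determine the position of the glasses in the face image. Because the glasses region is detected with two classifiers and the detection does not depend on image edges, the precision and accuracy of glasses detection are improved.
In addition, an embodiment of this application further proposes a computer-readable storage medium containing an eyeglass positioning program which, when executed by a processor, implements the following operations:
acquiring a real-time image captured by a camera device, and extracting a real-time face image from the real-time image using a face recognition algorithm;
using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and
when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.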
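Preferably, the predetermined first classifier is trained as follows:
preparing sample pictures that contain glasses and sample pictures that do not, and labeling each sample picture according to whether it contains glasses;
dividing the labeled sample pictures into a training set of a first proportion and a validation set of a second proportion;
training a convolutional neural network with the training set to obtain the first classifier; and
verifying the accuracy of the trained first classifier with the validation set, wherein training ends if the accuracy is greater than or equal to a preset accuracy, and the number of sample pictures is increased and the training steps are performed again if the accuracy is less than the preset accuracy.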
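Preferably, the predetermined second classifier is obtained as follows:
preprocessing the sample pictures that contain glasses, and marking a preset number of marker points at the glasses position in each preprocessed sample picture;
combining the preset number of marker points representing the glasses position in each sample picture into a vector, taking the vector of one sample picture as the reference vector, and aligning the vectors of all other sample pictures with the reference vector to obtain a first average model of the glasses position;
performing dimensionality reduction on the first average model of the glasses position to obtain a second average model of the glasses position; and
extracting the local feature of each marker point from the second average model, and using the second average model of the glasses position together with the local feature of each marker point as the second classifier.
The specific implementation of the computer-readable storage medium of this application is substantially the same as that of the eyeglass positioning method described above and is not repeated here.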
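It should be noted that, in this document, the terms "include", "comprise" and any variants of them are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to that process, device, article or method. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article or method that includes it.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.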
Claims (20)
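- An eyeglass positioning method applied to an electronic device, characterized in that the method comprises: acquiring a real-time image captured by a camera device, and extracting a real-time face image from the real-time image using a face recognition algorithm; using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.
- The eyeglass positioning method according to claim 1, characterized in that the predetermined first classifier is trained as follows: preparing sample pictures that contain glasses and sample pictures that do not, and labeling each sample picture according to whether it contains glasses; dividing the labeled sample pictures into a training set of a first proportion and a validation set of a second proportion; training a convolutional neural network with the training set to obtain the first classifier; and verifying the accuracy of the trained first classifier with the validation set, wherein training ends if the accuracy is greater than or equal to a preset accuracy, and the number of sample pictures is increased and the training steps are performed again if the accuracy is less than the preset accuracy.
- The eyeglass positioning method according to claim 1, characterized in that the predetermined second classifier is obtained as follows: preprocessing the sample pictures that contain glasses, and marking a preset number of marker points at the glasses position in each preprocessed sample picture; combining the preset number of marker points representing the glasses position in each sample picture into a vector, taking the vector of one sample picture as a reference vector, and aligning the vectors of all other sample pictures with the reference vector to obtain a first average model of the glasses position; performing dimensionality reduction on the first average model of the glasses position to obtain a second average model of the glasses position; and extracting the local feature of each marker point from the second average model, and using the second average model of the glasses position together with the local feature of each marker point as the second classifier.
- The eyeglass positioning method according to claim 3, characterized in that the step of preprocessing each sample picture comprises: converting each sample picture into a grayscale image, reading the pixel value of each pixel in the grayscale image, and dividing each pixel value by 255 to normalize the pixel values of the grayscale image.
- The eyeglass positioning method according to claim 1, characterized in that the face recognition algorithm may be any of a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method and a neural network method.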
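- An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory storing an eyeglass positioning program which, when executed by the processor, implements the following steps: acquiring a real-time image captured by a camera device, and extracting a real-time face image from the real-time image using a face recognition algorithm; using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.
- The electronic device according to claim 6, characterized in that the predetermined first classifier is trained as follows: preparing sample pictures that contain glasses and sample pictures that do not, and labeling each sample picture according to whether it contains glasses; dividing the labeled sample pictures into a training set of a first proportion and a validation set of a second proportion; training a convolutional neural network with the training set to obtain the first classifier; and verifying the accuracy of the trained first classifier with the validation set, wherein training ends if the accuracy is greater than or equal to a preset accuracy, and the number of sample pictures is increased and the training steps are performed again if the accuracy is less than the preset accuracy.
- The electronic device according to claim 6, characterized in that the predetermined second classifier is obtained as follows: preprocessing the sample pictures that contain glasses, and marking a preset number of marker points at the glasses position in each preprocessed sample picture; combining the preset number of marker points representing the glasses position in each sample picture into a vector, taking the vector of one sample picture as a reference vector, and aligning the vectors of all other sample pictures with the reference vector to obtain a first average model of the glasses position; performing dimensionality reduction on the first average model of the glasses position to obtain a second average model of the glasses position; and extracting the local feature of each marker point from the second average model, and using the second average model of the glasses position together with the local feature of each marker point as the second classifier.
- The electronic device according to claim 8, characterized in that the step of preprocessing each sample picture comprises: converting each sample picture into a grayscale image, reading the pixel value of each pixel in the grayscale image, and dividing each pixel value by 255 to normalize the pixel values of the grayscale image.
- The electronic device according to claim 6, characterized in that the face recognition algorithm may be any of a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method and a neural network method.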
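- A computer-readable storage medium, characterized in that the computer-readable storage medium contains an eyeglass positioning program which, when executed by a processor, implements the following steps: acquiring a real-time image captured by a camera device, and extracting a real-time face image from the real-time image using a face recognition algorithm; using a predetermined first classifier to identify whether the real-time face image contains glasses, and outputting a recognition result; and when the recognition result is that the real-time face image contains glasses, using a predetermined second classifier to locate the position of the glasses in the real-time face image, and outputting a positioning result.
- The computer-readable storage medium according to claim 11, characterized in that the predetermined first classifier is trained as follows: preparing sample pictures that contain glasses and sample pictures that do not, and labeling each sample picture according to whether it contains glasses; dividing the labeled sample pictures into a training set of a first proportion and a validation set of a second proportion; training a convolutional neural network with the training set to obtain the first classifier; and verifying the accuracy of the trained first classifier with the validation set, wherein training ends if the accuracy is greater than or equal to a preset accuracy, and the number of sample pictures is increased and the training steps are performed again if the accuracy is less than the preset accuracy.
- The computer-readable storage medium according to claim 11, characterized in that the predetermined second classifier is obtained as follows: preprocessing the sample pictures that contain glasses, and marking a preset number of marker points at the glasses position in each preprocessed sample picture; combining the preset number of marker points representing the glasses position in each sample picture into a vector, taking the vector of one sample picture as a reference vector, and aligning the vectors of all other sample pictures with the reference vector to obtain a first average model of the glasses position; performing dimensionality reduction on the first average model of the glasses position to obtain a second average model of the glasses position; and extracting the local feature of each marker point from the second average model, and using the second average model of the glasses position together with the local feature of each marker point as the second classifier.
- The computer-readable storage medium according to claim 13, characterized in that the step of preprocessing each sample picture comprises: converting each sample picture into a grayscale image, reading the pixel value of each pixel in the grayscale image, and dividing each pixel value by 255 to normalize the pixel values of the grayscale image.
- The computer-readable storage medium according to claim 11, characterized in that the face recognition algorithm may be any of a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method and a neural network method.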
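- An eyeglass positioning program, characterized in that the eyeglass positioning program comprises: an acquisition module, used to acquire a real-time image captured by a camera device and to extract a real-time face image from the real-time image using a face recognition algorithm; a judgment module, used to identify, with a predetermined first classifier, whether the real-time face image contains glasses, and to output a recognition result; and a positioning module, used, when the recognition result is that the real-time face image contains glasses, to locate the position of the glasses in the real-time face image with a predetermined second classifier, and to output a positioning result.
- The eyeglass positioning program according to claim 16, characterized in that the predetermined first classifier is trained as follows: preparing sample pictures that contain glasses and sample pictures that do not, and labeling each sample picture according to whether it contains glasses; dividing the labeled sample pictures into a training set of a first proportion and a validation set of a second proportion; training a convolutional neural network with the training set to obtain the first classifier; and verifying the accuracy of the trained first classifier with the validation set, wherein training ends if the accuracy is greater than or equal to a preset accuracy, and the number of sample pictures is increased and the training steps are performed again if the accuracy is less than the preset accuracy.
- The eyeglass positioning program according to claim 16, characterized in that the predetermined second classifier is obtained as follows: preprocessing the sample pictures that contain glasses, and marking a preset number of marker points at the glasses position in each preprocessed sample picture; combining the preset number of marker points representing the glasses position in each sample picture into a vector, taking the vector of one sample picture as a reference vector, and aligning the vectors of all other sample pictures with the reference vector to obtain a first average model of the glasses position; performing dimensionality reduction on the first average model of the glasses position to obtain a second average model of the glasses position; and extracting the local feature of each marker point from the second average model, and using the second average model of the glasses position together with the local feature of each marker point as the second classifier.
- The eyeglass positioning program according to claim 18, characterized in that the step of preprocessing each sample picture comprises: converting each sample picture into a grayscale image, reading the pixel value of each pixel in the grayscale image, and dividing each pixel value by 255 to normalize the pixel values of the grayscale image.
- The eyeglass positioning program according to claim 16, characterized in that the face recognition algorithm may be any of a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method and a neural network method.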
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/337,938 US10635946B2 (en) | 2017-09-30 | 2017-10-31 | Eyeglass positioning method, apparatus and storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710915085.X | 2017-09-30 | ||
| CN201710915085.XA CN107808120B (zh) | 2017-09-30 | 2017-09-30 | 眼镜定位方法、装置及存储介质 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019061658A1 true WO2019061658A1 (zh) | 2019-04-04 |
Family
ID=61592052
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2017/108756 Ceased WO2019061658A1 (zh) | 2017-09-30 | 2017-10-31 | 眼镜定位方法、装置及存储介质 |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US10635946B2 (zh) |
| CN (1) | CN107808120B (zh) |
| WO (1) | WO2019061658A1 (zh) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112346862A (zh) * | 2020-10-27 | 2021-02-09 | 上海影创信息科技有限公司 | 分体式智能眼镜控制方法、系统及介质 |
| CN112926439A (zh) * | 2021-02-22 | 2021-06-08 | 深圳中科飞测科技股份有限公司 | 检测方法及装置、检测设备和存储介质 |
Families Citing this family (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108564035B (zh) * | 2018-04-13 | 2020-09-25 | 杭州睿琪软件有限公司 | 识别单据上记载的信息的方法及系统 |
| CN108830062B (zh) * | 2018-05-29 | 2022-10-04 | 浙江水科文化集团有限公司 | 人脸识别方法、移动终端及计算机可读存储介质 |
| CN109063604A (zh) * | 2018-07-16 | 2018-12-21 | 阿里巴巴集团控股有限公司 | 一种人脸识别方法及终端设备 |
| CN109345553B (zh) * | 2018-08-31 | 2020-11-06 | 厦门熵基科技有限公司 | 一种手掌及其关键点检测方法、装置和终端设备 |
| CN111382651A (zh) * | 2018-12-29 | 2020-07-07 | 杭州光启人工智能研究院 | 数据打标方法、计算机装置及计算机可读存储介质 |
| CN111814815B (zh) * | 2019-04-11 | 2023-08-22 | 浙江快奇控股有限公司 | 一种基于轻量级神经网络的眼镜放置状态的智能判别方法 |
| CN110334698A (zh) * | 2019-08-30 | 2019-10-15 | 上海聚虹光电科技有限公司 | 眼镜检测系统及方法 |
| CN111008569A (zh) * | 2019-11-08 | 2020-04-14 | 浙江工业大学 | 一种基于人脸语义特征约束卷积网络的眼镜检测方法 |
| CN112825115A (zh) * | 2019-11-20 | 2021-05-21 | 北京眼神智能科技有限公司 | 基于单目图像的眼镜检测方法、装置、存储介质及设备 |
| CN111474901B (zh) * | 2019-12-18 | 2021-04-23 | 山东合众智远信息技术有限公司 | 自动化电子设备联动系统及方法 |
| CN111429409A (zh) * | 2020-03-13 | 2020-07-17 | 深圳市雄帝科技股份有限公司 | 对图像中人物佩戴眼镜的识别方法、系统及其存储介质 |
| CN111881770B (zh) * | 2020-07-06 | 2024-05-31 | 上海序言泽网络科技有限公司 | 一种人脸识别方法及系统 |
| CN112101261B (zh) * | 2020-09-22 | 2023-12-26 | 北京百度网讯科技有限公司 | 人脸识别方法、装置、设备及存储介质 |
| CN112418138B (zh) * | 2020-12-04 | 2022-08-19 | 兰州大学 | 一种眼镜试戴系统 |
| US12277803B2 (en) | 2021-04-21 | 2025-04-15 | Assa Abloy Global Solutions Ab | Thermal based presentation attack detection for biometric systems |
| CN113449740A (zh) * | 2021-06-30 | 2021-09-28 | 上海宇仓智能仓储设备有限公司 | 移动货架的通道视觉检测方法、系统、设备和存储介质 |
| EP4224432A1 (en) * | 2022-02-04 | 2023-08-09 | Carl Zeiss Vision International GmbH | Device, system and method for spectacle frame identification |
| CN116797809A (zh) * | 2022-03-10 | 2023-09-22 | 北京沃东天骏信息技术有限公司 | 一种眼镜分类方法、装置、设备和存储介质 |
| CN116088676A (zh) * | 2022-12-21 | 2023-05-09 | 广州视享科技有限公司 | 人眼注视深度的获取方法、装置、电子设备以及存储介质 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103093215A (zh) * | 2013-02-01 | 2013-05-08 | 北京天诚盛业科技有限公司 | 人眼定位方法及装置 |
| CN104408426A (zh) * | 2014-11-27 | 2015-03-11 | 小米科技有限责任公司 | 人脸图像眼镜去除方法及装置 |
| CN105095841A (zh) * | 2014-05-22 | 2015-11-25 | 小米科技有限责任公司 | 生成眼镜的方法及装置 |
| US9367730B2 (en) * | 2007-01-09 | 2016-06-14 | S1 Corporation | Method and system for automated face detection and recognition |
| CN106407911A (zh) * | 2016-08-31 | 2017-02-15 | 乐视控股(北京)有限公司 | 基于图像的眼镜识别方法及装置 |
| CN106778453A (zh) * | 2015-11-25 | 2017-05-31 | 腾讯科技(深圳)有限公司 | 人脸图像中检测眼镜佩戴的方法及装置 |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6714665B1 (en) * | 1994-09-02 | 2004-03-30 | Sarnoff Corporation | Fully automated iris recognition system utilizing wide and narrow fields of view |
| US9129505B2 (en) * | 1995-06-07 | 2015-09-08 | American Vehicular Sciences Llc | Driver fatigue monitoring system and method |
| US9111147B2 (en) * | 2011-11-14 | 2015-08-18 | Massachusetts Institute Of Technology | Assisted video surveillance of persons-of-interest |
| US9230180B2 (en) * | 2013-01-18 | 2016-01-05 | GM Global Technology Operations LLC | Eyes-off-the-road classification with glasses classifier |
| CN103093210B (zh) * | 2013-01-24 | 2017-02-08 | 北京天诚盛业科技有限公司 | 人脸识别中眼镜的鉴别方法及装置 |
| US20180268458A1 (en) * | 2015-01-05 | 2018-09-20 | Valorbec Limited Partnership | Automated recommendation and virtualization systems and methods for e-commerce |
| KR102492318B1 (ko) * | 2015-09-18 | 2023-01-26 | 삼성전자주식회사 | 모델 학습 방법 및 장치, 및 데이터 인식 방법 |
| CN105205482B (zh) | 2015-11-03 | 2018-10-26 | 北京英梅吉科技有限公司 | 快速人脸特征识别及姿态估算方法 |
| CN105426963B (zh) | 2015-12-01 | 2017-12-26 | 北京天诚盛业科技有限公司 | 用于人脸识别的卷积神经网络的训练方法、装置及应用 |
| US9779492B1 (en) * | 2016-03-15 | 2017-10-03 | International Business Machines Corporation | Retinal image quality assessment, error identification and automatic quality correction |
| FR3053509B1 (fr) * | 2016-06-30 | 2019-08-16 | Fittingbox | Procede d’occultation d’un objet dans une image ou une video et procede de realite augmentee associe |
| EP3635626A1 (en) * | 2017-05-31 | 2020-04-15 | The Procter and Gamble Company | System and method for guiding a user to take a selfie |
-
2017
- 2017-09-30 CN CN201710915085.XA patent/CN107808120B/zh active Active
- 2017-10-31 US US16/337,938 patent/US10635946B2/en active Active
- 2017-10-31 WO PCT/CN2017/108756 patent/WO2019061658A1/zh not_active Ceased
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9367730B2 (en) * | 2007-01-09 | 2016-06-14 | S1 Corporation | Method and system for automated face detection and recognition |
| CN103093215A (zh) * | 2013-02-01 | 2013-05-08 | 北京天诚盛业科技有限公司 | 人眼定位方法及装置 |
| CN105095841A (zh) * | 2014-05-22 | 2015-11-25 | 小米科技有限责任公司 | 生成眼镜的方法及装置 |
| CN104408426A (zh) * | 2014-11-27 | 2015-03-11 | 小米科技有限责任公司 | 人脸图像眼镜去除方法及装置 |
| CN106778453A (zh) * | 2015-11-25 | 2017-05-31 | 腾讯科技(深圳)有限公司 | 人脸图像中检测眼镜佩戴的方法及装置 |
| CN106407911A (zh) * | 2016-08-31 | 2017-02-15 | 乐视控股(北京)有限公司 | 基于图像的眼镜识别方法及装置 |
Also Published As
| Publication number | Publication date |
|---|---|
| US10635946B2 (en) | 2020-04-28 |
| CN107808120B (zh) | 2018-08-31 |
| US20190362193A1 (en) | 2019-11-28 |
| CN107808120A (zh) | 2018-03-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2019061658A1 (zh) | 眼镜定位方法、装置及存储介质 | |
| CN112200136B (zh) | 证件真伪识别方法、装置、计算机可读介质及电子设备 | |
| US10671879B2 (en) | Feature density object classification, systems and methods | |
| US10133921B2 (en) | Methods and apparatus for capturing, processing, training, and detecting patterns using pattern recognition classifiers | |
| US10534957B2 (en) | Eyeball movement analysis method and device, and storage medium | |
| WO2019109526A1 (zh) | 人脸图像的年龄识别方法、装置及存储介质 | |
| KR102290392B1 (ko) | 얼굴 등록 방법 및 장치, 얼굴 인식 방법 및 장치 | |
| CN103914676B (zh) | 一种在人脸识别中使用的方法和装置 | |
| US9098888B1 (en) | Collaborative text detection and recognition | |
| WO2019033572A1 (zh) | 人脸遮挡检测方法、装置及存储介质 | |
| WO2019033571A1 (zh) | 面部特征点检测方法、装置及存储介质 | |
| CN106650740B (zh) | 一种车牌识别方法及终端 | |
| CN111626163B (zh) | 一种人脸活体检测方法、装置及计算机设备 | |
| WO2019169532A1 (zh) | 车牌识别方法及云系统 | |
| JP6351243B2 (ja) | 画像処理装置、画像処理方法 | |
| WO2016149944A1 (zh) | 用于识别人脸的方法、系统和计算机程序产品 | |
| WO2019033570A1 (zh) | 嘴唇动作分析方法、装置及存储介质 | |
| WO2019033567A1 (zh) | 眼球动作捕捉方法、装置及存储介质 | |
| Hartl et al. | Real-time detection and recognition of machine-readable zones with mobile devices. | |
| WO2019061659A1 (zh) | 人脸图像眼镜去除方法、装置及存储介质 | |
| CN115294557A (zh) | 图像处理方法、图像处理装置、电子设备及存储介质 | |
| CN109753981B (zh) | 一种图像识别的方法及装置 | |
| US20250349025A1 (en) | Automatic image cropping using a reference feature | |
| HK1247372A (zh) | 眼镜定位方法、装置及存储介质 | |
| HK1247372A1 (zh) | 眼鏡定位方法、裝置及存儲介質 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17927028; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25/09/2020) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17927028; Country of ref document: EP; Kind code of ref document: A1 |