
WO2019157690A1 - Automatic capture method and device, unmanned aerial vehicle, and storage medium - Google Patents

Automatic capture method and device, unmanned aerial vehicle, and storage medium

Info

Publication number
WO2019157690A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
classification
perform
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/076792
Other languages
English (en)
French (fr)
Inventor
李思晋
赵丛
张李亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd filed Critical SZ DJI Technology Co Ltd
Priority to CN201880028125.1A priority Critical patent/CN110574040A/zh
Priority to PCT/CN2018/076792 priority patent/WO2019157690A1/zh
Publication of WO2019157690A1 publication Critical patent/WO2019157690A1/zh
Priority to US16/994,092 priority patent/US20200371535A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0094Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots involving pointing a payload, e.g. camera, weapon, sensor, towards a fixed or moving target
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/12Target-seeking control
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B64AIRCRAFT; AVIATION; COSMONAUTICS
    • B64UUNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U2101/00UAVs specially adapted for particular uses or applications
    • B64U2101/30UAVs specially adapted for particular uses or applications for imaging, photography or videography
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Definitions

  • the present disclosure relates to the field of image processing, and in particular, to an automatic capture method and device, a drone, and a storage medium.
  • the current methods of photographing mainly include the following:
  • One way is the selfie: using a smartphone, tablet, or similar device to photograph oneself, possibly with a tool such as a selfie stick.
  • This method has significant limitations. On the one hand, it only suits occasions with relatively few people; on a group trip, a selfie cannot achieve the expected effect. On the other hand, the shooting angle cannot be adjusted flexibly during a selfie, and people's facial expressions and postures appear deliberate rather than natural.
  • Another way is to ask someone else to help: temporarily handing one's equipment to another person and asking them to shoot.
  • This method has the following shortcomings. On the one hand, one must seek help from others, which may be refused, and in sparsely populated places it can be difficult to find someone to help in time; on the other hand, the other person's shooting skill cannot be guaranteed, so the result is sometimes very poor, and when the photos are unsatisfactory it is often inconvenient to ask for several more shots.
  • A further way is to hire a professional photographer to accompany the whole trip. Although this guarantees the shooting effect and requires neither self-shooting nor help from others, it is costly for an individual and unsuitable for daily outings or tourism; it is generally reserved for special anniversaries of families in good economic circumstances.
  • an automatic capture method comprising: acquiring an image to be processed; preprocessing the image to be processed to obtain a preprocessing result; inputting the preprocessing result into a trained machine learning model for classification; and generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  • an automatic capture device includes: an image acquisition module, configured to acquire an image to be processed; a preprocessing module, configured to preprocess the image to be processed to obtain a preprocessing result; a classification module, configured to input the preprocessing result into a trained machine learning model for classification; and a control module, configured to generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  • a drone including: a body; an image pickup apparatus disposed on the body; and a processor configured to: acquire an image to be processed; preprocess the image to be processed to obtain a preprocessing result; input the preprocessing result into a trained machine learning model for classification; and generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  • a computer-readable storage medium having stored thereon a computer program that, when executed by a processor of a computer, causes the computer to perform an automatic capture method, the method including: acquiring an image to be processed; preprocessing the image to be processed to obtain a preprocessing result; inputting the preprocessing result into a trained machine learning model for classification; and generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  • the automatic capture method can conveniently capture natural and graceful pictures, actions, and scenes, recording the most natural moments of the journey. At the same time, the implementation cost of this automatic capture is relatively low.
  • the current image to be processed is preprocessed and the preprocessing result is classified by the trained machine learning model, so that a corresponding preset operation can be performed on the current image to be processed according to the classification result. Compared with the prior art, this not only realizes the function of automatic capture but also ensures the shooting effect of the automatically captured photos.
  • FIG. 1 is a flow chart of an automatic capture method according to an embodiment of the present disclosure.
  • FIG. 2 is a flowchart of step S120 of the automatic capture method according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an automatic capture device according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a drone according to an embodiment of the present disclosure.
  • embodiments of the present invention can be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or in a combination of hardware and software.
  • an automatic capture method, a drone, and a storage medium are proposed.
  • the principles and spirit of the present invention are explained in detail below with reference to a few representative embodiments of the invention.
  • FIG. 1 is a flow chart of a method for automatic capture according to an embodiment of the present disclosure. As shown in FIG. 1, the method of this embodiment includes the following steps S110-S140.
  • step S110 an image to be processed is acquired.
  • an image of the environment in which the user is located can be captured in real time by the imaging device of the smart device, and the image to be processed is acquired from the captured image.
  • the smart device may be a drone, and the image to be processed may be a frame image in a video captured by the drone.
  • the user can operate the drone to fly within the environment where the user is located, and control the drone to shoot the user in real time through the camera mounted on the drone, obtaining a video; any frame of the video can be extracted as the image to be processed.
  • the smart device may also be any one of a handheld gimbal, a vehicle, a watercraft, an autonomous driving vehicle, an intelligent robot, and the like, as long as it has an imaging device and can shoot while moving; these are not listed here one by one.
  • step S120 the image to be processed is preprocessed to obtain a preprocessing result.
  • step S120 may include step S1210.
  • step S1210 the image to be processed is subjected to scene understanding, and the scene classification result of the image to be processed is obtained.
  • the scene understanding may adopt a deep learning method, but the disclosure does not limit this. In other embodiments, other methods may also be used.
  • the obtained scene classification result may include any one of seaside, forest, city, indoor, desert, and the like, but is not limited thereto; other scenes such as a square may also be included.
  • multiple kinds of test pictures may be selected, each kind (which may itself include several pictures of the same kind) corresponding to one scene classification; the scene classification may include, for example, any one of seaside, forest, city, indoor, desert, and the like.
  • a network model including one or more scene classifications may be trained through deep learning, and the network model may include a convolution layer and a fully connected layer.
  • the features of the image to be processed may be extracted by the convolution layer and integrated by the fully connected layer, so that the features of the image to be processed are compared with the one or more scene classifications to determine the scene classification result of the image to be processed, for example, seaside.
  • step S120 may further include step S1220 and step S1230, where:
  • step S1220 object detection is performed on the image to be processed, and a target object in the image to be processed is obtained.
  • the target object may be, for example, a pedestrian in the image to be processed; in other embodiments, it may be another object, such as an animal.
  • in the following embodiments, the target object is taken to be a pedestrian by way of example.
  • the pedestrians in the image to be processed may be detected by a pedestrian detection algorithm; all pedestrians in the image are obtained and sent to a terminal device on which a corresponding application is installed, for example, a mobile phone or a tablet computer. Through the terminal device, the user can select, from all the pedestrians in the image to be processed, the pedestrian to be photographed, that is, the target object (the person who needs to be captured).
  • a pedestrian detection method based on a multi-layer network model may be used to identify all pedestrians in the image to be processed. Specifically, candidate pedestrian positions may be extracted by a multi-layer convolutional neural network, and a second-stage neural network then verifies all candidate positions, refines the predictions, and uses tracking boxes to associate pedestrian detections across multiple frames.
  • the user may receive, through the terminal device, the image to be processed with each person framed by a tracking box, and may select the tracking box of the person to be captured in order to determine the target object; the target object may be the same person as the user operating the terminal device, or a different person.
  • step S1230 the target object is tracked to obtain a tracking result.
  • the tracking result may include the position or the size of the target object in the image to be processed, or, of course, both at the same time.
  • the target object may be selected from the to-be-processed image and tracked in real time by comparing information of a previous frame or an initial frame of the image to be processed.
  • the position of each pedestrian in the image to be processed may be acquired first, and the image to be processed is then matched against the previous frame using a tracking algorithm; each pedestrian is framed by a tracking box whose position is updated in real time, thereby determining the pedestrian's position and size in real time. The position may be the pedestrian's coordinates in the image to be processed, and the size may be the area of the region the pedestrian occupies in the image.
  • step S1240 posture analysis is performed on the target object to obtain an action category of the target object.
  • the method of posture analysis may be a detection method based on morphological features, that is, a detector is trained based on each human joint, and then these joints are combined into a posture of the human body using a rule-based or optimization-based method.
  • the method of pose analysis may also be a regression method based on global information, that is, directly predicting the position (coordinates) of each joint point in the image, and determining the action category based on the calculated position classification of the joint point.
  • other methods can also be used for pose analysis, which are not listed here.
  • the action category of the target object may include any one of running, walking, jumping, and the like, but is not limited thereto; it may also include action categories such as bending, rolling, and swaying.
  • step S120 may further include step S1240.
  • step S1240 image quality analysis is performed on the image to be processed, and image quality of the image to be processed is obtained.
  • the image quality of the image to be processed may be analyzed using a full-reference evaluation algorithm based on peak signal-to-noise ratio and mean square error, or using other algorithms, to obtain the image quality of the image to be processed. The image quality may be expressed as a score, or as specific values of parameters that reflect image quality, such as sharpness.
  • step S130 the pre-processing result is input into the trained machine learning model for classification.
  • the pre-processing result may include a combination of any one or more of a scene classification result, a target object, a tracking result, an action category, and an image quality in the above-described embodiments.
  • the trained machine learning model may be a deep learning neural network model, obtained by training based on algorithms such as posture analysis, pedestrian detection, pedestrian tracking, and scene analysis, combined with preset evaluation criteria. Its formation may include, for example, establishing evaluation criteria, labeling samples according to the criteria, and training the model with a machine learning algorithm.
  • the evaluation criteria can be proposed by photography experts or photography enthusiasts.
  • experts of different photographic schools may also propose more fine-grained evaluation criteria, such as criteria suited to portrait shooting, criteria suited to natural scenery, criteria suited to a retro style, criteria suited to a fresh style, and so on.
  • the trained machine learning model may be a deep learning neural network model obtained by training based on algorithms such as posture analysis, pedestrian detection, pedestrian tracking, scene analysis, and image quality analysis, combined with preset evaluation criteria and the shooting parameters of the imaging device. Its formation may include establishing evaluation criteria, labeling samples according to the criteria, and training the model with a machine learning algorithm.
  • for example, given a photo, the photo can be labeled by analyzing its sharpness and obtaining the shooting parameters of the camera that took it, and then input into the machine learning model for training.
  • the trained model can then predict, according to the image quality of the image to be processed, whether the shooting parameters of the image capturing apparatus that captured it need to be adjusted.
  • the trained machine learning model may score the image to be processed according to the preprocessing result; the scoring may be based on one or more of the scene classification result, the target object, the tracking result, and the action category, and the obtained score is compared with a preset threshold to determine the classification of the image to be processed.
  • when the score of the image to be processed is higher than the threshold, it is classified into the first category; in that case, the corresponding image may be saved and sent to the user's terminal device. When the score of the image to be processed is lower than the threshold, the image may be deleted.
  • the image to be processed may be scored based on a single scene classification result. For example, when the scene classification result of the image to be processed is a beach, it may be classified into a first category, and the image to be processed is retained.
  • the image to be processed may be scored based on the tracking result of the target object. For example, when multiple target objects are to be captured and are detected to be near the middle of the image at the same time, it may be determined that they currently wish to take a group photo; the image may then be assigned to the first classification and retained. As another example, when the tracking result shows that the target object occupies more than 1/2 of the image (this value can be adjusted as appropriate), it may be determined that the target object currently wishes to take a photo and has deliberately moved to a position suitable for shooting relative to the drone; the image is then classified into the first classification and saved.
  • the image to be processed may also be scored based on a single action category. For example, when it is detected that the target object is jumping and the jump reaches a first preset height, for example 1 meter, the image is scored 10 points, placed in the first category, and retained; when the jump only reaches a second preset height, for example 50 centimeters, the image is scored 5 points, placed in the second category, and deleted.
  • the scoring may be performed based on the scene classification result and the target object of the pedestrian detection.
  • when the scene classification result and the target object match well, the image to be processed is considered to belong to the first category; when they do not match, the image is considered to belong to the second category. Whether the scene classification result and the target object match can be predicted by the machine learning model trained on a large number of labeled photos.
  • for example, in a seaside scene, when the target object and the sea are detected and there are no unrelated bystanders in the current frame, the image to be processed may be assigned to the first category and saved.
  • the image to be processed may be scored by jointly considering the scene classification result, the tracking result of the target object, and the action category of the target object. For example, when the scene classification result is grassland, the tracking result shows that the target object is near the middle of the image, the target object occupies more than 1/3 of the image, and the target object makes a scissors-hand gesture (or another common photo pose), the image may be determined to belong to the first category and saved.
  • when the scene classification result does not match the target object, or the position and/or size of the target object does not meet the shooting requirements, or the action category of the target object does not match the current scene classification result, the image to be processed is classified into the second category and deleted.
  • while scoring the image to be processed, the machine learning model may also classify the image according to image quality.
  • for example, when the image-quality score of the image to be processed is lower than a threshold, the image may be classified into a third classification; in that case the image quality is poor, and the machine learning model may generate a shooting adjustment parameter according to the image quality, so that the shooting parameters of the imaging device are adjusted accordingly to improve subsequent image quality.
  • the photographing adjustment parameter may include any one or more of an adjustment amount of an aperture of the imaging device, an exposure parameter, a distance of focusing, a contrast, and the like, and is not particularly limited herein.
  • the shooting adjustment parameter may further include an adjustment amount of a parameter such as a shooting angle of the drone, a shooting distance, and the like.
  • step S140 a control signal is generated and transmitted according to the classification, and the control signal is used to perform a corresponding preset operation on the to-be-processed image.
  • each of the above classifications may correspond to a control signal, and each control signal may correspond to a different preset operation.
  • the preset operation may include any one of a save operation, a delete operation, and a retake operation.
  • a first control signal may be generated, the first control signal being used to perform a save operation on the corresponding image to be processed, so that the image is stored and easy for the user to use.
  • a second control signal may be generated, the second control signal being used to perform a delete operation on the corresponding image to be processed.
  • a third control signal may be generated, the third control signal being used to obtain corresponding shooting adjustment parameters according to the corresponding image to be processed; a delete operation and a retake operation are then performed on the image. The retake operation may include: adjusting the shooting parameters of the camera device and/or the drone according to the shooting adjustment parameters, and acquiring another image to be processed with the adjusted drone and its on-board camera; that image can then be processed according to the above automatic capture method.
  • the above automatic capture method can be applied to any one of a drone, a handheld gimbal, a vehicle, a ship, an autonomous vehicle, an intelligent robot, and the like.
  • the automatic capture method of the embodiments of the present disclosure can conveniently capture natural, graceful pictures, actions, and scenes, recording the most natural moments of the journey, and the implementation cost of such automatic capture is relatively low. By preprocessing the current image to be processed and classifying the preprocessing result with the trained machine learning model, a corresponding preset operation can be performed on the image according to the classification result; compared with the prior art, this not only realizes the function of automatic capture but also ensures the shooting effect of the automatically captured photos.
  • FIG. 3 is a schematic diagram of an automatic capture device according to an embodiment of the present disclosure.
  • the automatic capture device 100 can include an image acquisition module 110, a pre-processing module 120, a classification module 130, and a control module 140, where:
  • the image acquisition module 110 can be configured to acquire an image to be processed.
  • the image acquisition module 110 may include a shooting unit 111, which may be used to acquire the image to be processed through a camera on a smart device.
  • the pre-processing module 120 is configured to perform pre-processing on the image to be processed to obtain a pre-processing result.
  • the pre-processing module 120 may include any one or a combination of the detection unit 121, the tracking unit 122, the posture analysis unit 123, the quality analysis unit 124, and the scene classification unit 125, where:
  • the detecting unit 121 is configured to perform object detection on the image to be processed to obtain a target object in the image to be processed.
  • the tracking unit 122 can be configured to track the target object to obtain a tracking result.
  • the tracking result may include a location and/or size of the target object in the image to be processed.
  • the gesture analysis unit 123 can be configured to perform posture analysis on the target object to obtain an action category of the target object.
  • the action category includes any one of running, walking, jumping, and the like.
  • the quality analysis unit 124 is configured to perform image quality analysis on the image to be processed to obtain an image quality of the image to be processed.
  • the scene classification unit 125 is configured to perform scene understanding on the image to be processed, and obtain a scene classification result of the image to be processed.
  • the scene classification result may include any one of a seaside, a forest, a city, an indoor, and a desert.
  • the classification module 130 can be configured to input the pre-processing results into a trained machine learning model for classification.
  • control module 140 is configured to generate and send a control signal according to the classification, where the control signal is used to perform a corresponding preset operation on the image to be processed.
  • control module 140 can include a saving unit 141 and a deleting unit 142, where:
  • the saving unit 141 is configured to perform a saving operation on the image to be processed when the classification is the first classification.
  • the deleting unit 142 is configured to perform a deleting operation on the image to be processed when the classification is the second classification.
  • control module 140 may further include an adjustment unit 143 and a retake unit 144, where:
  • the adjusting unit 143 is configured to obtain a corresponding shooting adjustment parameter according to the to-be-processed image when the classification is the third classification.
  • the re-shooting unit 144 is configured to perform a deletion operation on the image to be processed, and acquire another image to be processed according to the shooting adjustment parameter.
  • the photographing adjustment parameter may include any one or more of an aperture adjustment amount, an exposure parameter, a focus distance, a photographing angle, and the like.
  • the above automatic capture device can be applied to any one of a drone, a handheld gimbal, a vehicle, a ship, an autonomous vehicle, an intelligent robot, and the like.
  • the drone 30 may include: a body 302; an imaging device 304 disposed on the body; and a processor 306 configured to: acquire an image to be processed; preprocess the image to be processed to obtain a preprocessing result; input the preprocessing result into the trained machine learning model for classification; and generate and send a control signal according to the classification, the control signal being used to perform the corresponding preset operation on the image to be processed.
  • the processor 306 is further configured to perform a function of performing scene understanding on the image to be processed, and obtaining a scene classification result of the image to be processed.
  • the processor 306 is further configured to perform a function of performing object detection on the image to be processed, and obtaining a target object in the image to be processed.
  • the processor 306 is further configured to perform the function of tracking the target object to obtain a tracking result.
  • the processor 306 is further configured to perform a function of performing a pose analysis on the target object to obtain an action category of the target object.
  • in other application scenarios, the UAV can be replaced with any one of a handheld gimbal, a vehicle, a ship, an autonomous vehicle, an intelligent robot, and the like.
  • modules or units of equipment for action execution are mentioned in the detailed description above, such division is not mandatory. Indeed, in accordance with embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one of the modules or units described above may be further divided into multiple modules or units.
  • the components displayed as modules or units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative effort.
  • the present exemplary embodiment further provides a computer readable storage medium having stored thereon a computer program, the program being executable by the processor to implement the steps of the automatic capture method in any one of the above embodiments.
  • the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

An automatic capture method, comprising: acquiring an image to be processed (S110); preprocessing the image to be processed to obtain a preprocessing result (S120); inputting the preprocessing result into a trained machine learning model for classification (S130); and generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed (S140). The method not only realizes the function of automatic capture but also ensures the shooting effect of the automatically captured photos.

Description

Automatic capture method and device, unmanned aerial vehicle, and storage medium
Technical field
The present disclosure relates to the field of image processing, and in particular to an automatic capture method and device, an unmanned aerial vehicle, and a storage medium.
Background
The current photographing methods mainly include the following:
One way is the selfie: using a smartphone, tablet computer, or the like to pose for one's own photo, or using a tool such as a selfie stick to assist. This method has significant limitations. On the one hand, it only suits occasions with relatively few people; for a group trip, a selfie cannot achieve the expected effect. On the other hand, the shooting angle cannot be adjusted flexibly during a selfie, and people's facial expressions and postures appear deliberate and unnatural.
Another way is to ask someone else to help: temporarily handing one's equipment to another person and asking them to shoot. Its drawbacks are as follows. On the one hand, one must seek help from others, which may be refused, and in sparsely populated places it is difficult to find someone to help in time; on the other hand, the other person's shooting skill cannot be guaranteed, the results are sometimes very poor, and when the photos are unsatisfactory it is often inconvenient to ask for several more shots.
Moreover, both of the above methods are in most cases posed shots, with relatively monotonous poses and photos that are not natural enough.
A further way is to hire an accompanying professional photographer for the whole trip. Although this guarantees the shooting effect and requires neither self-shooting nor help from others, it is costly for an individual, unsuitable for daily outings or tourism, and generally reserved for special anniversaries of families in good economic circumstances.
Therefore, a new automatic capture method and device, unmanned aerial vehicle, and storage medium are needed.
It should be understood that the above general description is merely an exemplary explanation of the related art and is not an admission that it constitutes prior art to the present disclosure.
Summary
An object of the present disclosure is to provide an automatic capture method and device, an unmanned aerial vehicle, and a storage medium, which overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art.
Other features and advantages of the present disclosure will become apparent from the following detailed description, or may be learned in part through practice of the present disclosure.
According to a first aspect of the embodiments of the present disclosure, an automatic capture method is provided, the method comprising: acquiring an image to be processed; preprocessing the image to be processed to obtain a preprocessing result; inputting the preprocessing result into a trained machine learning model for classification; and generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
According to a second aspect of the embodiments of the present disclosure, an automatic capture device is provided, the device comprising: an image acquisition module, configured to acquire an image to be processed; a preprocessing module, configured to preprocess the image to be processed to obtain a preprocessing result; a classification module, configured to input the preprocessing result into a trained machine learning model for classification; and a control module, configured to generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
According to a third aspect of the embodiments of the present disclosure, an unmanned aerial vehicle is provided, comprising: a body; a camera disposed on the body; and a processor configured to: acquire an image to be processed; preprocess the image to be processed to obtain a preprocessing result; input the preprocessing result into a trained machine learning model for classification; and generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when run by a processor of a computer, the computer program causes the computer to execute an automatic capture method, the method comprising: acquiring an image to be processed; preprocessing the image to be processed to obtain a preprocessing result; inputting the preprocessing result into a trained machine learning model for classification; and generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
In one embodiment of the present disclosure, the automatic capture method conveniently captures natural, graceful pictures, actions, and scenes, recording the most natural moments of a journey. At the same time, the implementation cost of such automatic capture is relatively low.
In one embodiment of the present disclosure, the current image to be processed is preprocessed and the preprocessing result is classified by a trained machine learning model, so that a corresponding preset operation can be performed on the image according to the classification result. Compared with the prior art, this not only realizes the function of automatic capture but also ensures the shooting effect of the automatically captured photos.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
FIG. 1 is a flowchart of an automatic capture method according to an embodiment of the present disclosure.
FIG. 2 is a flowchart of step S120 of the automatic capture method according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of an automatic capture device according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of an unmanned aerial vehicle according to an embodiment of the present disclosure.
Detailed description
The principles and spirit of the present invention will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only so that those skilled in the art can better understand and implement the present invention, and not to limit its scope in any way; rather, they are provided so that this disclosure will be thorough and complete and will fully convey its scope to those skilled in the art.
Those skilled in the art know that embodiments of the present invention can be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or in a combination of hardware and software.
According to embodiments of the present invention, an automatic capture method, an unmanned aerial vehicle, and a storage medium are proposed. The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments.
FIG. 1 is a flowchart of an automatic capture method according to an embodiment of the present disclosure. As shown in FIG. 1, the method of this embodiment includes the following steps S110-S140.
In step S110, an image to be processed is acquired.
In this embodiment, an image of the environment in which the user is located can be captured in real time by the camera of a smart device, and the image to be processed is acquired from the captured images.
The smart device may be an unmanned aerial vehicle (UAV), and the image to be processed may be a frame of a video captured by the UAV. For example, the user may operate the UAV to fly within the user's environment and control it, through the camera mounted on it, to shoot the user in real time, obtaining a video; any frame of this video can be extracted as the image to be processed.
In other embodiments of the present disclosure, the smart device may also be any one of a handheld gimbal, a vehicle, a watercraft, an autonomous vehicle, an intelligent robot, and the like, as long as it has a camera and can shoot while moving; these are not listed here one by one.
In step S120, the image to be processed is preprocessed to obtain a preprocessing result.
In an embodiment, step S120 may include step S1210.
As shown in FIG. 2, in step S1210, scene understanding is performed on the image to be processed to obtain a scene classification result of the image to be processed.
Scene understanding may adopt a deep learning method, but the present disclosure is not limited to this; other methods may be used in other embodiments.
The obtained scene classification result may include any one of seaside, forest, city, indoor, desert, and the like, but is not limited thereto; other scenes such as a square may also be included.
For example, multiple kinds of test pictures may be selected, each kind (which may itself contain several pictures of the same kind) corresponding to one scene classification, such as seaside, forest, city, indoor, or desert. Based on these test pictures, a network model containing one or more scene classifications can be trained through deep learning; the network model may include a convolution layer and a fully connected layer.
The convolution layer extracts features of the image to be processed, and the fully connected layer integrates the extracted features, so that the features of the image are compared with the one or more scene classifications to determine the scene classification result of the image, for example, seaside.
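For illustration, a minimal Python sketch of the kind of convolution-plus-fully-connected scene classifier described above (assumptions: PyTorch is used, and the layer sizes and the five scene names are illustrative rather than taken from the patent):

```python
# Minimal sketch of a conv + fully-connected scene classifier
# (assumption: PyTorch; layer sizes and scene names are illustrative).
import torch
import torch.nn as nn

SCENES = ["seaside", "forest", "city", "indoor", "desert"]

class SceneClassifier(nn.Module):
    def __init__(self, num_classes: int = len(SCENES)):
        super().__init__()
        # Convolution layers extract features from the image to be processed.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # The fully connected layer integrates the features into class scores.
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

model = SceneClassifier()
logits = model(torch.randn(1, 3, 224, 224))   # one RGB frame
scene = SCENES[logits.argmax(dim=1).item()]   # e.g. "seaside"
```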
In an embodiment, step S120 may further include step S1220 and step S1230, wherein:
As shown in FIG. 2, in step S1220, object detection is performed on the image to be processed to obtain a target object in the image to be processed.
In embodiments of the present disclosure, the target object may be, for example, a pedestrian in the image to be processed; in other embodiments it may be another object, such as an animal. In the following embodiments, the target object is taken to be a pedestrian by way of example.
In an exemplary embodiment, the pedestrians in the image to be processed may be detected by a pedestrian detection algorithm; all pedestrians in the image are obtained and sent to a terminal device on which a corresponding application is installed, for example a mobile phone or a tablet computer. Through the terminal device, the user can select, from all the pedestrians in the image, the pedestrian to be photographed, that is, the target object (the person to be captured).
For example, a pedestrian detection method based on a multi-layer network model may be used to identify all pedestrians in the image to be processed: candidate pedestrian positions may be extracted by a multi-layer convolutional neural network, and a second-stage neural network then verifies all candidate positions, refines the predictions, and uses tracking boxes to associate pedestrian detections across multiple frames.
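The two-stage flow can be pictured structurally as follows; this is a hedged sketch in which `proposal_net` and `refine_net` merely stand in for the two network stages (their stub outputs are fabricated purely for illustration):

```python
# Structural sketch of a two-stage pedestrian detector (assumption:
# `proposal_net` and `refine_net` stand in for the two network stages;
# here they are stubbed with fixed outputs).
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2

def proposal_net(image) -> List[Box]:
    # Stage 1: a multi-layer CNN would propose candidate pedestrian boxes.
    return [(10, 20, 60, 180), (200, 40, 240, 160)]

def refine_net(image, box: Box) -> Tuple[Box, float]:
    # Stage 2: a second network verifies each candidate and refines it,
    # returning an adjusted box and a confidence score.
    x1, y1, x2, y2 = box
    return (x1 + 1, y1 + 1, x2 - 1, y2 - 1), 0.9

def detect_pedestrians(image, conf_thresh: float = 0.5) -> List[Box]:
    detections = []
    for cand in proposal_net(image):
        refined, score = refine_net(image, cand)
        if score >= conf_thresh:          # keep verified candidates only
            detections.append(refined)
    return detections
```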
The user may receive, through the terminal device, the image to be processed with each person framed by a tracking box, and may select the tracking box of the person to be captured, thereby determining the target object; the target object may be the same person as the user operating the terminal device, or a different person.
In step S1230, the target object is tracked to obtain a tracking result.
In an exemplary embodiment, the tracking result may include the position or the size of the target object in the image to be processed, or, of course, both at the same time.
In this embodiment, the target object may be selected from the image to be processed and tracked in real time by comparing information of a previous frame or the initial frame.
For example, the position of each pedestrian in the image to be processed may be acquired first, and the image may then be matched against the previous frame using a tracking algorithm; each pedestrian is framed by a tracking box whose position is updated in real time, thereby determining the pedestrian's position and size in real time. The position may be the pedestrian's coordinates in the image, and the size may be the area of the region the pedestrian occupies in the image.
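A minimal sketch of how a tracking box might be associated across frames by bounding-box overlap (an assumption: the patent names no specific tracking algorithm, so greedy IoU matching is used here purely for illustration):

```python
# Sketch of single-target tracking-box association across frames
# (assumption: greedy IoU matching; boxes are (x1, y1, x2, y2)).
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / (union + 1e-9)

def update_track(prev_box, detections, min_iou=0.3):
    """Match the previous frame's tracking box to the current detections."""
    best = max(detections, key=lambda d: iou(prev_box, d), default=None)
    if best is not None and iou(prev_box, best) >= min_iou:
        cx = (best[0] + best[2]) / 2                       # box centre x
        cy = (best[1] + best[3]) / 2                       # box centre y
        size = (best[2] - best[0]) * (best[3] - best[1])   # occupied area
        return best, (cx, cy), size
    return None, None, None                                # target lost
```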
In step S1240, posture analysis is performed on the target object to obtain an action category of the target object.
In embodiments of the present disclosure, the posture analysis method may be a detection method based on morphological features: a detector is trained for each human joint, and these joints are then combined into a human posture using a rule-based or optimization-based method. Alternatively, the posture analysis method may be a regression method based on global information: the position (coordinates) of each joint point in the image is predicted directly, and the action category is determined by classifying the computed joint positions. Other posture analysis methods may of course also be used, and are not listed here one by one.
The action category of the target object may include any one of running, walking, jumping, and the like, but is not limited thereto; it may also include action categories such as bending, rolling, and swaying.
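A toy sketch of the regression-based variant: predict joint coordinates, then classify the action from them (assumptions: the joint regressor is stubbed, and the geometric rules below are illustrative, not the patent's criteria):

```python
# Toy sketch: regress joint coordinates, then classify the action from
# joint geometry (assumption: `predict_joints` is a stub; the rules and
# thresholds are illustrative, not the patent's criteria).
import math

def predict_joints(image):
    # A global-information regression network would output (x, y) per joint.
    return {"head": (100, 40), "hip": (100, 120),
            "left_ankle": (90, 150), "right_ankle": (110, 150)}

def classify_action(joints, ground_y=200):
    ankles_y = (joints["left_ankle"][1] + joints["right_ankle"][1]) / 2
    if ground_y - ankles_y > 30:     # both feet well above the ground line
        return "jumping"
    torso = math.dist(joints["head"], joints["hip"])
    return "walking" if torso > 60 else "running"
```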
In an embodiment, step S120 may further include step S1240.
As shown in FIG. 2, in step S1240, image quality analysis is performed on the image to be processed to obtain the image quality of the image to be processed.
In this embodiment, the image quality of the image to be processed may be analyzed by a full-reference evaluation algorithm based on peak signal-to-noise ratio and mean square error, or by other algorithms. The resulting image quality may be expressed as a score, or as specific values of parameters that reflect image quality, such as sharpness.
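The full-reference MSE/PSNR metric mentioned above is standard; a short sketch (assuming NumPy arrays with pixel values in [0, 255]):

```python
# Full-reference image-quality metrics: mean square error and peak
# signal-to-noise ratio (assumption: NumPy arrays, 8-bit pixel range).
import numpy as np

def mse(ref: np.ndarray, img: np.ndarray) -> float:
    diff = ref.astype(np.float64) - img.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(ref: np.ndarray, img: np.ndarray, peak: float = 255.0) -> float:
    m = mse(ref, img)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```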
In step S130, the preprocessing result is input into a trained machine learning model for classification.
In an exemplary embodiment, the preprocessing result may include any one or a combination of the scene classification result, the target object, the tracking result, the action category, and the image quality of the above embodiments.
In one embodiment, the trained machine learning model may be a deep learning neural network model, obtained by training based on algorithms such as posture analysis, pedestrian detection, pedestrian tracking, and scene analysis, combined with preset evaluation criteria; its formation may include, for example, establishing the evaluation criteria, labeling samples according to the criteria, and training the model with a machine learning algorithm.
The evaluation criteria may be proposed by photography experts or photography enthusiasts. In this embodiment, experts of different photographic schools may also propose more fine-grained criteria of their schools, for example criteria suited to portrait shooting, criteria suited to natural scenery, criteria suited to a retro style, criteria suited to a fresh style, and so on.
In another embodiment, the trained machine learning model may be a deep learning neural network model obtained by training based on algorithms such as posture analysis, pedestrian detection, pedestrian tracking, scene analysis, and image quality analysis, combined with preset evaluation criteria and the shooting parameters of the camera; its formation may include establishing the evaluation criteria, labeling samples according to the criteria, and training the model with a machine learning algorithm.
For example, given a photo, the photo can be labeled by analyzing its sharpness and obtaining the shooting parameters of the camera that took it, and then input into the machine learning model for training; the correspondingly trained model can predict, from the image quality of the image to be processed, whether the shooting parameters of the camera that captured it need to be adjusted.
In this embodiment, the trained machine learning model may score the image to be processed according to the preprocessing result. The scoring may be based on one or more of the scene classification result, the target object, the tracking result, and the action category; the obtained score is compared with a preset threshold to determine the classification of the image.
For example, when the score of the image to be processed is higher than the threshold, it is classified into a first classification; in that case, the corresponding image may be saved and sent to the user's terminal device. When the score of the image is lower than the threshold, the image may be deleted.
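A minimal sketch of this threshold rule (assumption: the trained model is abstracted as an already-computed numeric score, and 7.0 is an illustrative threshold):

```python
# Sketch of the score-threshold rule (assumption: the trained model is
# reduced to a numeric score; the threshold value is illustrative).
FIRST, SECOND = "first", "second"

def classify_by_score(score: float, threshold: float = 7.0) -> str:
    # Above the threshold: first classification (save, send to the user);
    # below: second classification (delete).
    return FIRST if score > threshold else SECOND
```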
In one embodiment, the image to be processed may be scored based on a single scene classification result; for example, when the scene classification result is a beach, the image may be classified into the first classification and retained.
In yet another embodiment, the image to be processed may be scored based on the tracking result of the target object. For example, when multiple target objects are to be captured and are detected to be near the middle of the image at the same time, it may be determined that they currently wish to take a group photo; the image may then be assigned to the first classification and retained. As another example, when the tracking result shows that the target object occupies more than 1/2 of the image (this value can be adjusted as appropriate), it may be determined that the target object currently wishes to take a photo and has deliberately moved to a position suitable for shooting relative to the UAV; the image is then classified into the first classification and saved.
In another embodiment, the image may also be scored based on a single action category. For example, when it is detected that the target object is jumping and the jump reaches a first preset height, e.g., 1 meter, the image is scored 10 points, placed in the first classification, and retained; when the jump only reaches a second preset height, e.g., 50 centimeters, the image is scored 5 points, placed in the second classification, and deleted.
In another embodiment, scoring may jointly consider the scene classification result and the pedestrian-detection target object: when the scene classification result and the target object match well, the image is considered to belong to the first classification; when they do not match, the image is considered to belong to the second classification. Whether the scene classification result and the target object match can be predicted by the machine learning model trained on a large number of labeled photos.
For example, in a seaside scene, when the target object and the sea are detected and there are no unrelated bystanders (persons other than the intended subjects) in the current frame, the image may be assigned to the first classification and saved.
In a further embodiment, the image may be scored by jointly considering the scene classification result, the tracking result of the target object, and the action category of the target object. For example, when the scene classification result is grassland, the tracking result shows the target object near the middle of the image, the target object occupies more than 1/3 of the image, and the target object simultaneously makes a scissors-hand gesture (or another common photo pose), the image may be determined to belong to the first classification and saved.
In embodiments of the present disclosure, when the scene classification result does not match the target object, or the position and/or size of the target object does not meet the shooting requirements, or the action category of the target object does not match the current scene classification result, the image to be processed may be classified into the second classification and deleted.
In an exemplary embodiment, while scoring the image to be processed, the machine learning model may also classify the image according to image quality.
For example, when the image-quality score of the image is lower than a threshold, the image may be classified into a third classification; in that case the image quality is poor, and the machine learning model may generate shooting adjustment parameters according to the image quality, so that the camera's shooting parameters are adjusted accordingly to improve subsequent image quality.
The shooting adjustment parameters may include any one or more of the camera's aperture adjustment amount, exposure parameters, focus distance, contrast, and the like, without particular limitation here. In addition, the shooting adjustment parameters may also include adjustment amounts of parameters such as the UAV's shooting angle and shooting distance.
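As a hedged illustration of what such adjustment parameters could look like in code (the field names and the simple exposure rule below are assumptions, not the patent's specification):

```python
# Illustrative container for shooting adjustment parameters (assumption:
# field names and the toy brightness rule are ours, not the patent's).
from dataclasses import dataclass
from typing import Optional

@dataclass
class ShootingAdjustment:
    aperture_delta: float = 0.0              # aperture change (stops)
    exposure_delta: float = 0.0              # exposure compensation (EV)
    focus_distance: Optional[float] = None   # metres; None = unchanged
    contrast_delta: float = 0.0
    angle_delta: float = 0.0                 # UAV shooting angle (degrees)
    distance_delta: float = 0.0              # UAV shooting distance (metres)

def adjustment_from_quality(mean_brightness: float) -> ShootingAdjustment:
    # Toy rule: brighten underexposed frames, darken overexposed ones.
    if mean_brightness < 80:
        return ShootingAdjustment(exposure_delta=+1.0)
    if mean_brightness > 180:
        return ShootingAdjustment(exposure_delta=-1.0)
    return ShootingAdjustment()
```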
In step S140, a control signal is generated and sent according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
In embodiments of the present disclosure, each of the above classifications may correspond to one control signal, and each control signal may correspond to a different preset operation. The preset operation may include any one of a save operation, a delete operation, a retake operation, and the like.
For example, when an image to be processed falls into the first classification above, a first control signal may be generated, used to perform a save operation on the image so that it is stored for the user's convenience.
When an image to be processed falls into the second classification above, a second control signal may be generated, used to perform a delete operation on the image.
When an image to be processed falls into the third classification above, a third control signal may be generated, used to obtain the corresponding shooting adjustment parameters from the image; a delete operation and a retake operation are then performed on the image. The retake operation may include: adjusting the shooting parameters of the camera and/or the UAV according to the shooting adjustment parameters, and acquiring another image to be processed with the adjusted UAV and its on-board camera; that image can then be processed again according to the above automatic capture method.
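The mapping from classification to control signal can be pictured as a small dispatcher; the sketch below assumes `save`, `delete`, and `retake` callbacks supplied by the hosting platform rather than any real UAV or camera API:

```python
# Sketch of the classification-to-control-signal dispatch (assumption:
# `save`, `delete`, and `retake` are platform-supplied callbacks; no
# real UAV or camera API is invoked here).
def dispatch(classification: str, image, params, save, delete, retake) -> None:
    if classification == "first":
        save(image)        # first control signal: save operation
    elif classification == "second":
        delete(image)      # second control signal: delete operation
    elif classification == "third":
        delete(image)      # third control signal: delete, then
        retake(params)     # retake with the adjusted shooting parameters
```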
Understandably, the above automatic capture method can be applied to any one of an unmanned aerial vehicle, a handheld gimbal, a vehicle, a watercraft, an autonomous vehicle, an intelligent robot, and the like.
It should be noted that the above examples are merely preferred embodiments of steps S110-S140; the embodiments of the present disclosure are not limited thereto, and other implementations readily conceived by those skilled in the art based on the above disclosure also fall within the protection scope of the present disclosure.
The automatic capture method of the embodiments of the present disclosure conveniently captures natural, graceful pictures, actions, and scenes, recording the most natural moments of a journey, and the implementation cost of such automatic capture is relatively low. By preprocessing the current image to be processed and classifying the preprocessing result with a trained machine learning model, a corresponding preset operation can be performed on the image according to the classification result; compared with the prior art, this not only realizes the function of automatic capture but also ensures the shooting effect of the automatically captured photos.
It should be noted that although the steps of the method are described in a specific order in the drawings, this does not require or imply that the steps must be performed in that order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps. It is also easy to understand that these steps may be executed, for example, synchronously or asynchronously in multiple modules/processes/threads.
FIG. 3 is a schematic diagram of an automatic capture device according to an embodiment of the present disclosure. As shown in FIG. 3, the automatic capture device 100 may include an image acquisition module 110, a preprocessing module 120, a classification module 130, and a control module 140, wherein:
In an embodiment, the image acquisition module 110 may be used to acquire an image to be processed. For example, the image acquisition module 110 may include a shooting unit 111, which may be used to acquire the image to be processed through a camera on a smart device.
In an embodiment, the preprocessing module 120 may be used to preprocess the image to be processed to obtain a preprocessing result. For example, the preprocessing module 120 may include any one or a combination of a detection unit 121, a tracking unit 122, a posture analysis unit 123, a quality analysis unit 124, and a scene classification unit 125, wherein:
The detection unit 121 may be used to perform object detection on the image to be processed to obtain a target object in the image to be processed.
The tracking unit 122 may be used to track the target object to obtain a tracking result.
In an exemplary embodiment, the tracking result may include the position and/or size of the target object in the image to be processed.
The posture analysis unit 123 may be used to perform posture analysis on the target object to obtain an action category of the target object.
In an exemplary embodiment, the action category includes any one of running, walking, jumping, and the like.
The quality analysis unit 124 may be used to perform image quality analysis on the image to be processed to obtain the image quality of the image to be processed.
The scene classification unit 125 may be used to perform scene understanding on the image to be processed to obtain a scene classification result of the image to be processed.
In an exemplary embodiment, the scene classification result may include any one of seaside, forest, city, indoor, and desert.
In an embodiment, the classification module 130 may be used to input the preprocessing result into a trained machine learning model for classification.
In an embodiment, the control module 140 may be used to generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
For example, the control module 140 may include a saving unit 141 and a deleting unit 142, wherein:
The saving unit 141 may be used to perform a save operation on the image to be processed when the classification is the first classification.
The deleting unit 142 may be used to perform a delete operation on the image to be processed when the classification is the second classification.
In an exemplary embodiment, the control module 140 may further include an adjusting unit 143 and a retaking unit 144, wherein:
The adjusting unit 143 may be used to obtain corresponding shooting adjustment parameters according to the image to be processed when the classification is the third classification.
The retaking unit 144 may be used to perform a delete operation on the image to be processed and to acquire another image to be processed according to the shooting adjustment parameters.
In an exemplary embodiment, the shooting adjustment parameters may include any one or more of an aperture adjustment amount, exposure parameters, a focus distance, a shooting angle, and the like.
Understandably, the above automatic capture device can be applied to any one of an unmanned aerial vehicle, a handheld gimbal, a vehicle, a watercraft, an autonomous vehicle, an intelligent robot, and the like.
The specific principles and implementations of the automatic capture device provided by the embodiments of the present disclosure have been described in detail in the embodiments of the method, and are not elaborated here.
FIG. 4 is a schematic diagram of an unmanned aerial vehicle according to an embodiment of the present disclosure. As shown in FIG. 4, the UAV 30 may include: a body 302; a camera 304 disposed on the body; and a processor 306 configured to: acquire an image to be processed; preprocess the image to be processed to obtain a preprocessing result; input the preprocessing result into a trained machine learning model for classification; and generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
In an embodiment, the processor 306 is further configured to perform scene understanding on the image to be processed to obtain a scene classification result of the image to be processed.
In an embodiment, the processor 306 is further configured to perform object detection on the image to be processed to obtain a target object in the image to be processed.
In an embodiment, the processor 306 is further configured to track the target object to obtain a tracking result.
In an embodiment, the processor 306 is further configured to perform posture analysis on the target object to obtain an action category of the target object.
Understandably, in other application scenarios the above UAV may be replaced by any one of a handheld gimbal, a vehicle, a watercraft, an autonomous vehicle, an intelligent robot, and the like.
The specific principles and implementations of the UAV provided by the embodiments of the present disclosure have been described in detail in the embodiments of the method, and are not elaborated here.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units. Components shown as modules or units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative effort.
This exemplary embodiment further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program can implement the steps of the automatic capture method of any of the above embodiments. For the specific steps of the method, reference may be made to the detailed descriptions in the foregoing method embodiments, which are not repeated here. The computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Furthermore, the above drawings are merely schematic illustrations of the processing included in the methods of the exemplary embodiments of the present invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit its temporal order, and that the processing may be executed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the claims.

Claims (24)

  1. An automatic capture method, comprising:
    acquiring an image to be processed;
    preprocessing the image to be processed to obtain a preprocessing result;
    inputting the preprocessing result into a trained machine learning model for classification; and
    generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  2. The method according to claim 1, wherein acquiring the image to be processed comprises:
    acquiring the image to be processed by shooting with a camera on a smart device.
  3. The method according to claim 1, wherein preprocessing the image to be processed to obtain the preprocessing result comprises:
    performing scene understanding on the image to be processed to obtain a scene classification result of the image to be processed.
  4. The method according to claim 3, wherein the scene classification result includes any one of seaside, forest, city, indoor, and desert.
  5. The method according to claim 1, wherein preprocessing the image to be processed to obtain the preprocessing result comprises:
    performing object detection on the image to be processed to obtain a target object in the image to be processed.
  6. The method according to claim 5, wherein preprocessing the image to be processed to obtain the preprocessing result further comprises:
    tracking the target object to obtain a tracking result.
  7. The method according to claim 6, wherein the tracking result includes a position and/or a size of the target object in the image to be processed.
  8. The method according to claim 5 or 6, wherein preprocessing the image to be processed to obtain the preprocessing result further comprises:
    performing posture analysis on the target object to obtain an action category of the target object.
  9. The method according to claim 8, wherein the action category includes any one of running, walking, and jumping.
  10. The method according to claim 1, wherein preprocessing the image to be processed to obtain the preprocessing result comprises:
    performing image quality analysis on the image to be processed to obtain an image quality of the image to be processed.
  11. The method according to claim 1, wherein generating and sending the control signal according to the classification, the control signal being used to perform the corresponding preset operation on the image to be processed, comprises:
    performing a save operation on the image to be processed when the classification is a first classification; and
    performing a delete operation on the image to be processed when the classification is a second classification.
  12. The method according to claim 11, wherein generating and sending the control signal according to the classification, the control signal being used to perform the corresponding preset operation on the image to be processed, further comprises:
    obtaining corresponding shooting adjustment parameters according to the image to be processed when the classification is a third classification; and
    performing a delete operation on the image to be processed, and acquiring another image to be processed according to the shooting adjustment parameters.
  13. The method according to claim 12, wherein the shooting adjustment parameters include any one or more of an aperture adjustment amount, an exposure parameter, a focus distance, and a shooting angle.
  14. The method according to claim 1, wherein the automatic capture method is used for an unmanned aerial vehicle or a handheld gimbal.
  15. An automatic capture device, comprising:
    an image acquisition module, configured to acquire an image to be processed;
    a preprocessing module, configured to preprocess the image to be processed to obtain a preprocessing result;
    a classification module, configured to input the preprocessing result into a trained machine learning model for classification; and
    a control module, configured to generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  16. The device according to claim 15, wherein the preprocessing module comprises:
    a scene classification unit, configured to perform scene understanding on the image to be processed to obtain a scene classification result of the image to be processed.
  17. The device according to claim 15, wherein the preprocessing module comprises:
    a detection unit, configured to perform object detection on the image to be processed to obtain a target object in the image to be processed.
  18. The device according to claim 17, wherein the preprocessing module further comprises:
    a tracking unit, configured to track the target object to obtain a tracking result.
  19. The device according to claim 17 or 18, wherein the preprocessing module further comprises:
    a posture analysis unit, configured to perform posture analysis on the target object to obtain an action category of the target object.
  20. The device according to claim 15, wherein the preprocessing module comprises:
    a quality analysis unit, configured to perform image quality analysis on the image to be processed to obtain an image quality of the image to be processed.
  21. The device according to claim 15, wherein the control module comprises:
    a saving unit, configured to perform a save operation on the image to be processed when the classification is a first classification; and
    a deleting unit, configured to perform a delete operation on the image to be processed when the classification is a second classification.
  22. The device according to claim 21, wherein the control module further comprises:
    an adjusting unit, configured to obtain corresponding shooting adjustment parameters according to the image to be processed when the classification is a third classification; and
    a retaking unit, configured to perform a delete operation on the image to be processed and acquire another image to be processed according to the shooting adjustment parameters.
  23. An unmanned aerial vehicle, comprising:
    a body;
    a camera disposed on the body; and
    a processor configured to:
    acquire an image to be processed;
    preprocess the image to be processed to obtain a preprocessing result;
    input the preprocessing result into a trained machine learning model for classification; and
    generate and send a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
  24. A computer-readable storage medium having a computer program stored thereon, the computer program, when run by a processor of a computer, causing the computer to execute an automatic capture method, the method comprising:
    acquiring an image to be processed;
    preprocessing the image to be processed to obtain a preprocessing result;
    inputting the preprocessing result into a trained machine learning model for classification; and
    generating and sending a control signal according to the classification, the control signal being used to perform a corresponding preset operation on the image to be processed.
PCT/CN2018/076792 2018-02-14 2018-02-14 Automatic capture method and device, unmanned aerial vehicle and storage medium Ceased WO2019157690A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201880028125.1A 2018-02-14 2018-02-14 Automatic capture method and device, unmanned aerial vehicle and storage medium
PCT/CN2018/076792 2018-02-14 2018-02-14 Automatic capture method and device, unmanned aerial vehicle and storage medium
US16/994,092 US20200371535A1 (en) 2018-02-14 2020-08-14 Automatic image capturing method and device, unmanned aerial vehicle and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/076792 2018-02-14 2018-02-14 Automatic capture method and device, unmanned aerial vehicle and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/994,092 Continuation US20200371535A1 (en) 2018-02-14 2020-08-14 Automatic image capturing method and device, unmanned aerial vehicle and storage medium

Publications (1)

Publication Number Publication Date
WO2019157690A1 true WO2019157690A1 (zh) 2019-08-22

Family

ID=67619090

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/076792 Ceased WO2019157690A1 (zh) 2018-02-14 2018-02-14 自动抓拍方法及装置、无人机及存储介质

Country Status (3)

Country Link
US (1) US20200371535A1 (zh)
CN (1) CN110574040A (zh)
WO (1) WO2019157690A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112702521A (zh) * 2020-12-24 2021-04-23 广州极飞科技有限公司 Image capturing method and device, electronic device, computer-readable storage medium
CN113469250A (zh) * 2021-06-30 2021-10-01 阿波罗智联(北京)科技有限公司 Image capturing method, image classification model training method, device, and electronic device
CN114650356A (zh) * 2022-03-16 2022-06-21 思翼科技(深圳)有限公司 High-definition wireless digital image transmission system
US11445121B2 (en) 2020-12-29 2022-09-13 Industrial Technology Research Institute Movable photographing system and photography composition control method

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110908295A (zh) * 2019-12-31 2020-03-24 深圳市鸿运达电子科技有限公司 Multimedia device for smart homes based on the Internet of Things
CN113095141A (zh) * 2021-03-15 2021-07-09 南通大学 UAV vision learning system based on artificial intelligence
CN113095157A (zh) * 2021-03-23 2021-07-09 深圳市创乐慧科技有限公司 Image capturing method and device based on artificial intelligence, and related products
CN113824884B (zh) * 2021-10-20 2023-08-08 深圳市睿联技术股份有限公司 Shooting method and device, photographic equipment, and computer-readable storage medium
CN114531553B (zh) * 2022-02-11 2024-02-09 北京字跳网络技术有限公司 Method and device for generating special-effect video, electronic device, and storage medium
CN114782805B (zh) * 2022-03-29 2023-05-30 中国电子科技集团公司第五十四研究所 Human-in-the-loop hybrid-enhanced target recognition method for UAV patrol
CN114660605B (zh) * 2022-05-17 2022-12-27 湖南师范大学 Machine-learning SAR imaging processing method and device, and readable storage medium
CN115086607B (zh) * 2022-06-14 2024-12-24 国网山东省电力公司电力科学研究院 Electric power construction monitoring system, monitoring method, and computer device
CN115988313A (zh) * 2022-12-20 2023-04-18 成都云天励飞技术有限公司 Photographing assistance method and device, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512643A (zh) * 2016-01-06 2016-04-20 北京二郎神科技有限公司 Image acquisition method and device
CN106231173A (zh) * 2015-06-02 2016-12-14 Lg电子株式会社 Mobile terminal and control method thereof
CN106845549A (zh) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 Scene and target recognition method and device based on multi-task learning
US20170195591A1 (en) * 2016-01-05 2017-07-06 Nvidia Corporation Pre-processing for video noise reduction
CN107566907A (zh) * 2017-09-20 2018-01-09 广东欧珀移动通信有限公司 Video clipping method and device, storage medium, and terminal
CN107622281A (zh) * 2017-09-20 2018-01-23 广东欧珀移动通信有限公司 Image classification method and device, storage medium, and mobile terminal

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69635101T2 (de) * 1995-11-01 2006-06-01 Canon K.K. Method for extracting objects and image pickup apparatus using this method
US8429173B1 (en) * 2009-04-20 2013-04-23 Google Inc. Method, system, and computer readable medium for identifying result images based on an image query
US10311096B2 (en) * 2012-03-08 2019-06-04 Google Llc Online image analysis
TWI532361B (zh) * 2013-12-27 2016-05-01 國立臺灣科技大學 Automatic scene-finding shooting method and system thereof
WO2016168976A1 (en) * 2015-04-20 2016-10-27 SZ DJI Technology Co., Ltd. Imaging system
WO2017041304A1 (en) * 2015-09-11 2017-03-16 SZ DJI Technology Co., Ltd. Carrier for unmanned aerial vehicle
CN105512634B (zh) * 2015-12-14 2019-01-15 联想(北京)有限公司 Sensing device and electronic equipment
TWI557526B (zh) * 2015-12-18 2016-11-11 林其禹 Selfie drone system and execution method thereof
US9838641B1 (en) * 2015-12-30 2017-12-05 Google Llc Low power framework for processing, compressing, and transmitting images at a mobile image capture device
US9836484B1 (en) * 2015-12-30 2017-12-05 Google Llc Systems and methods that leverage deep learning to selectively store images at a mobile image capture device
US10225511B1 (en) * 2015-12-30 2019-03-05 Google Llc Low power framework for controlling image sensor mode in a mobile image capture device
US9609288B1 (en) * 2015-12-31 2017-03-28 Unmanned Innovation, Inc. Unmanned aerial vehicle rooftop inspection system
CN105554480B (zh) * 2016-03-01 2018-03-16 深圳市大疆创新科技有限公司 Method and device for controlling images captured by an unmanned aerial vehicle, user equipment, and unmanned aerial vehicle
CN105915801A (zh) * 2016-06-12 2016-08-31 北京光年无限科技有限公司 Self-learning method and device for improving snapshot effect
US10482621B2 (en) * 2016-08-01 2019-11-19 Cognex Corporation System and method for improved scoring of 3D poses and spurious point removal in 3D image data
US10530991B2 (en) * 2017-01-28 2020-01-07 Microsoft Technology Licensing, Llc Real-time semantic-aware camera exposure control
US11205283B2 (en) * 2017-02-16 2021-12-21 Qualcomm Incorporated Camera auto-calibration with gyroscope
CN107092926A (zh) * 2017-03-30 2017-08-25 哈尔滨工程大学 Service robot object recognition algorithm based on deep learning
CN107135352A (zh) * 2017-04-28 2017-09-05 北京小米移动软件有限公司 Method and device for sorting filter options
US10540589B2 (en) * 2017-10-24 2020-01-21 Deep North, Inc. Image quality assessment using similar scenes as reference
WO2019100219A1 (zh) * 2017-11-21 2019-05-31 深圳市大疆创新科技有限公司 Output image generation method, device, and unmanned aerial vehicle
JP6496955B1 (ja) * 2017-12-19 2019-04-10 SZ DJI Technology Co., Ltd. Control device, system, control method, and program
US10467526B1 (en) * 2018-01-17 2019-11-05 Amazon Technologies, Inc. Artificial intelligence system for image similarity analysis using optimized image pair selection and multi-scale convolutional neural networks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106231173A (zh) * 2015-06-02 2016-12-14 Lg电子株式会社 Mobile terminal and control method thereof
US20170195591A1 (en) * 2016-01-05 2017-07-06 Nvidia Corporation Pre-processing for video noise reduction
CN105512643A (zh) * 2016-01-06 2016-04-20 北京二郎神科技有限公司 Image acquisition method and device
CN106845549A (zh) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 Scene and target recognition method and device based on multi-task learning
CN107566907A (zh) * 2017-09-20 2018-01-09 广东欧珀移动通信有限公司 Video clipping method and device, storage medium, and terminal
CN107622281A (zh) * 2017-09-20 2018-01-23 广东欧珀移动通信有限公司 Image classification method and device, storage medium, and mobile terminal

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112702521A (zh) * 2020-12-24 2021-04-23 广州极飞科技有限公司 Image capturing method and device, electronic device, computer-readable storage medium
US11445121B2 (en) 2020-12-29 2022-09-13 Industrial Technology Research Institute Movable photographing system and photography composition control method
CN113469250A (zh) * 2021-06-30 2021-10-01 阿波罗智联(北京)科技有限公司 Image capturing method, image classification model training method, device, and electronic device
WO2023273035A1 (zh) * 2021-06-30 2023-01-05 阿波罗智联(北京)科技有限公司 Image capturing method, image classification model training method, device, and electronic device
CN114650356A (zh) * 2022-03-16 2022-06-21 思翼科技(深圳)有限公司 High-definition wireless digital image transmission system

Also Published As

Publication number Publication date
CN110574040A (zh) 2019-12-13
US20200371535A1 (en) 2020-11-26

Similar Documents

Publication Publication Date Title
WO2019157690A1 (zh) Automatic capture method and device, unmanned aerial vehicle, and storage medium
Kalra et al. Dronesurf: Benchmark dataset for drone-based face recognition
CN108229369B (zh) Image capturing method and device, storage medium, and electronic device
KR101363017B1 (ko) System and method for capturing and classifying facial images
CN101383000B (zh) Information processing device and information processing method
JP2020205637A (ja) Imaging device and control method therefor
US11176679B2 (en) Person segmentations for background replacements
US20120300092A1 (en) Automatically optimizing capture of images of one or more subjects
CN112702521B (zh) Image capturing method and device, electronic device, computer-readable storage medium
US10986287B2 (en) Capturing a photo using a signature motion of a mobile device
CN117813581A (zh) Multi-angle hand tracking
US20240414443A1 (en) Noise removal from surveillance camera image by means of ai-based object recognition
JP2009088687A (ja) Album creation device
CN109986553B (zh) Actively interacting robot, system, method, and storage device
CN112655021A (zh) Image processing method and device, electronic device, and storage medium
CN110581950B (zh) Camera, and system and method for selecting camera settings
CN115988313A (zh) Photographing assistance method and device, electronic device, and storage medium
JP6410427B2 (ja) Information processing device, information processing method, and program
JP2007088644A (ja) Imaging device and control method therefor, computer program, and storage medium
JP7483532B2 (ja) Keyword extraction device, keyword extraction method, and keyword extraction program
CN113837114A (zh) Method and system for collecting face video clips in a scenic area
WO2018192244A1 (zh) Shooting guidance method for an intelligent device
CN119012001A (zh) Auxiliary shooting method and device, electronic device, and storage medium
CN115457666B (zh) Method and system for recognizing the motion center of gravity of a living object, and computer-readable storage medium
JP2021110962A (ja) Search method and device in a search support system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18906476

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18906476

Country of ref document: EP

Kind code of ref document: A1