WO2016202111A1 - Audio output method and apparatus based on photographing - Google Patents
Audio output method and apparatus based on photographing
- Publication number
- WO2016202111A1 (PCT/CN2016/080941)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target object
- coordinate
- target
- deflection angle
- distance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
Definitions
- The present disclosure relates to the field of audio processing technologies, and in particular to a photographing-based audio output method and a photographing-based audio output device.
- When a photo is taken, a prompt sound is usually played through the speaker to tell the user and the photographed person the moment at which the photo is taken.
- However, the photographing sound may also reach other people who are not being photographed and disturb them.
- In view of the above problems, embodiments of the present disclosure are proposed to provide a photographing-based audio output method and a corresponding photographing-based audio output device that overcome, or at least partially solve, those problems.
- an audio output method based on photographing including:
- when a photographing instruction is received, driving the speaker to output audio according to the sound field angle and the deflection angle.
- The embodiments of the present disclosure further disclose a photographing-based audio output device, including:
- a target object determining module configured to determine one or more target objects in the preview image data when the camera captures the preview image data;
- a sound field angle calculation module configured to calculate a sound field angle relative to the speaker for the one or more target objects;
- a deflection angle calculation module configured to calculate a deflection angle relative to the speaker for the one or more target objects;
- an audio directional output module configured to drive the speaker to output audio according to the sound field angle and the deflection angle when a photographing instruction is received.
- The embodiments of the present disclosure calculate the sound field angle and the deflection angle from one or more target objects in the preview image data and, when photographing, output audio according to the sound field angle and the deflection angle, so that the photographing prompt sound is directed at the actual subject that the target object corresponds to.
- The photographed person can therefore hear the prompt sound despite environmental and other factors, which improves the success rate of photographing, reduces the chance of having to re-shoot, reduces wasted resources on the electronic device, improves photographing efficiency, and lowers the cost of photographing. In addition, people in areas outside the directional audio output generally do not hear the photographing sound, which reduces the impact on others.
- FIG. 1 is a flow chart showing the steps of an embodiment of a photographing-based audio output method according to the present disclosure;
- FIGS. 2A and 2B are diagrams showing an example scene of a sound field angle and a deflection angle of the present disclosure;
- FIGS. 3A and 3B are diagrams showing an example of calculation of a sound field angle of the present disclosure;
- FIGS. 4A to 4C are diagrams showing an example of calculation of a deflection angle of the present disclosure;
- FIG. 5 is a structural block diagram of an embodiment of a photographing-based audio output device of the present disclosure.
- Referring to FIG. 1, a flow chart of the steps of an embodiment of a photographing-based audio output method according to the present disclosure is shown. Specifically, the method may include the following steps:
- Step 101: when the camera collects preview image data, determine one or more target objects in the preview image data;
- The embodiments of the present disclosure may be applied to various electronic devices, such as a mobile phone, a tablet computer, a personal digital assistant, or a wearable device (such as glasses or a watch), and the embodiments of the present disclosure do not limit this.
- The operating system of the electronic device may be Android, iOS, Windows Phone, Windows, or the like, and generally supports operation of the camera and the speaker.
- The camera is a hardware component of the electronic device that can be used for taking photos and recording video. It can be front-mounted (facing the same direction as the screen of the electronic device) or rear-mounted (facing the opposite direction to the screen), and the embodiments of the present disclosure do not limit this.
- The image of the scene formed by the lens of the camera is projected onto the surface of the image sensor and converted into an electrical signal, which is then converted into a digital image signal by analog-to-digital (A/D) conversion.
- The digital image signal is compressed by a digital signal processing chip (DSP) or an encoding library and converted into a specific image file format. It is transmitted over a data bus to the Central Processing Unit (CPU) of the device for processing and can then be shown on the display of the electronic device.
- A preview is defined relative to photographing.
- Preview image data is image data presented to the user for adjustment and selection before a photograph is stored, and it is held in a cache.
- The camera captures a series of preview image data, that is, multiple frames of preview image data.
- The sound field angle and the deflection angle can be calculated continuously while the camera keeps collecting preview image data, until the user takes a photo.
- A selection control may also be provided, through which the user chooses whether audio is output directionally when photographing. The state of the selection control can be checked before one or more target objects are determined in the preview image data. If the state is directional audio output, the step of determining one or more target objects in the preview image data proceeds. If the state is non-directional audio output, that step can be skipped to reduce resource consumption of the electronic device.
- step 101 may include the following sub-steps:
- Sub-step S11: detect faces in the preview image data and determine one or more target objects.
- When the camera captures preview image data, it can perform autofocus through face detection.
- Face detection refers to locating the position and size of all faces in one frame of preview image data.
- An object that is successfully detected may be identified as a target object, and the photographed subject that the target object actually corresponds to may be a person.
- In another embodiment, step 101 may include the following sub-steps:
- Sub-step S11: when a focus operation instruction is received, determine that the one or more human faces in the preview image data corresponding to the focus operation instruction are the one or more target objects;
- The user can focus manually, triggering the focus operation instruction by tapping the preview image data, selecting a focus frame, or the like, and the camera can focus on the object selected by the user according to the instruction.
- The object corresponding to the focus operation instruction may be identified as a target object, and the photographed subject that the target object actually corresponds to may be a person, an animal, or a still life.
- Sub-step S12: when a cancel operation instruction is received, cancel the determined one or more target objects.
- The user can trigger the cancel operation instruction by tapping a target object or the like, thereby canceling that target object.
- the target object determined in the preview image data may be one, or may be multiple (ie, two or more), which is not limited by the embodiment of the present disclosure.
- Step 102: calculate a sound field angle relative to the speaker for the one or more target objects.
- The speaker can be hardware that outputs audio, such as a miniature piezoelectric film ultrasonic sensor.
- The angle that the audible range subtends at the speaker may be referred to as the sound field angle.
- The sound field angle R can be the angular range within which the audio can be heard when the speaker 201 outputs audio.
- step 102 may include the following sub-steps:
- Sub-step S21 measuring a target distance between the speaker and the one or more target objects
- the target distance refers to the linear distance between the speaker and the target object as a whole, and does not necessarily mean the linear distance between the speaker and a certain target object.
- the sub-step S21 may further include the following sub-steps:
- Sub-step S211 when the target object is one, acquiring a candidate distance between the camera and the target object as a target distance;
- Sub-step S212 when the target object is multiple, respectively acquiring a plurality of candidate distances between the camera and the plurality of target objects;
- The camera and the speaker are generally close to each other, so the candidate distance between the camera and the target object (the linear distance between the two) differs little from the candidate distance between the speaker and the target object (the linear distance between the two), and the difference is generally within an acceptable margin.
- The candidate distance between the camera and the target object can be calculated from the preview image data. Therefore, in this example, to avoid adding hardware, the candidate distance between the camera and the target object can stand in for the distance between the speaker and the target object.
- When there is one target object, the candidate distance between the camera and the target object can be set directly as the target distance between the speaker and the target object.
- When there are multiple target objects, the target distance between the speaker and the target objects can be calculated from the multiple candidate distances between the camera and the target objects, for example by taking the average of the candidate distances, selecting their maximum, or selecting their minimum, and the embodiments of the present disclosure do not limit this.
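The reduction of several candidate distances to one target distance can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `target_distance` and the strategy labels `"mean"`, `"max"`, and `"min"` are hypothetical names for the three options the text lists.

```python
from statistics import mean
from typing import Sequence


def target_distance(candidate_distances: Sequence[float], strategy: str = "mean") -> float:
    """Collapse per-object camera-to-target distances into one target distance.

    The strategy labels are hypothetical; distances may be in any consistent unit.
    """
    if not candidate_distances:
        raise ValueError("at least one candidate distance is required")
    if len(candidate_distances) == 1:
        # One target object: its candidate distance is used directly.
        return float(candidate_distances[0])
    reducers = {"mean": mean, "max": max, "min": min}
    if strategy not in reducers:
        raise ValueError(f"unknown strategy: {strategy}")
    return float(reducers[strategy](candidate_distances))
```

Choosing the maximum favors reaching the farthest subject, while the minimum favors the nearest one; the text leaves that choice open.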
- the candidate distance between the camera and the target object can be calculated by one or more of the following methods:
- A binocular or multi-view camera observes the same target object from different viewpoints, acquires two-dimensional images of the target object at different viewing angles, and uses the triangulation principle to calculate the positional deviation of image pixels (the parallax), from which the three-dimensional information of the target object is obtained.
- A monocular camera acquires successive two-dimensional images of the target object at different times or from different spatial positions, and the distance and other parameters of the target object are calculated from how the target object changes over time or space in the two-dimensional image sequence.
- Image-based ranging methods in monocular ranging include Depth from Focus (DFF) and Depth from Defocus (DFD).
- The depth-from-focus method captures a series of image data while adjusting the optical parameters, finds the sharpest image data in the series, and calculates the distance from the imaging parameters of that image data using the imaging principle of geometric optics.
- The depth-from-defocus method relies on the principle that the more an object is out of focus, the more blurred its image is.
- Two or three frames of image data captured under different optical parameters are used to determine the spread parameter of the defocus spread function, and depth is calculated from the relationship between that spread parameter and the distance of the target object.
- The above calculation manners are only examples. Other calculation manners may be used by those skilled in the art according to actual needs, and the embodiments of the present disclosure do not limit this.
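For the binocular case described above, the triangulation step is commonly written as Z = f·B/d for a rectified stereo pair with parallel optical axes. The sketch below assumes that standard pinhole model, which the text does not spell out; the function and parameter names are hypothetical.

```python
def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from its disparity in a rectified stereo pair.

    Assumes the standard relation Z = f * B / d, where f is the focal length
    in pixels, B the distance between the two camera centers in meters, and
    d the horizontal pixel offset (parallax) of the point between the views.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px
```

Under this model a larger parallax means a nearer object, which is why nearby subjects are measured more precisely than distant ones.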
- The candidate distance between the speaker and the target object can also be measured directly by an active ranging method, that is, by illuminating the target object with a beam such as a laser or with light that has a certain texture structure, and determining the distance of the object either by analyzing the texture deformation of the light reflected by the target object or by measuring the propagation time of the light. The embodiments of the present disclosure do not limit this.
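The propagation-time variant of active ranging reduces to halving the round-trip path of the light. The following sketch assumes the emitter and receiver sit at the same point and that the measured time covers the full round trip; the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # defined value of c in vacuum


def time_of_flight_distance(round_trip_seconds: float) -> float:
    """Distance to a reflecting object from the round-trip time of a light pulse.

    The pulse travels to the object and back, so the one-way distance is
    half of the total path length c * t.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0
```

At photographing distances of a few meters the round trip lasts only tens of nanoseconds, so this approach needs dedicated timing hardware.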
- Sub-step S22: acquire an audition range distance that matches the one or more target objects.
- When audio is output directionally, it can be heard within a certain range and generally cannot be heard outside that range; the extent of this range is called the audition range distance.
- That is, the audition range distance can be the distance spanned by the range within which the directionally output audio can be heard.
- The matching audition range distance can be set in advance according to the number of target objects; for example, the audition range distance is 35 cm for one target object, 45 cm for two target objects, and so on.
- Sub-step S23: calculate the sound field angle according to the target distance and the audition range distance.
- The sound field angle may be calculated from the target distance and the audition range distance according to a trigonometric relationship.
- An isosceles triangle is constructed with the target distance as its height and the audition range distance as its base, and the sound field angle is calculated according to the following trigonometric relationship:
- $\tan(R/2) = \dfrac{K/2}{L}$
- where R is the sound field angle, K is the audition range distance, and L is the target distance.
- With one target object, the candidate distance between the camera and the target object is measured as $L_0$, so the target distance between the speaker 301 and the target object is $L_0$, and the audition range distance for one target object is $K_0$. An isosceles triangle is constructed with the target distance $L_0$ as its height and the audition range distance $K_0$ as its base, and the sound field angle $R_0$ is calculated according to the trigonometric relationship $\tan(R_0/2) = \dfrac{K_0/2}{L_0}$.
- With three target objects, the candidate distances between the camera and the target objects are measured as $L_2$, $L_3$, and $L_4$, so the target distance between the speaker 301 and the target objects is $L_1 = (L_2 + L_3 + L_4)/3$.
- The audition range distance for three target objects is $K_1$.
- An isosceles triangle is constructed with the target distance $L_1$ as its height and the audition range distance $K_1$ as its base, and the sound field angle $R_1$ is calculated according to the trigonometric relationship $\tan(R_1/2) = \dfrac{K_1/2}{L_1}$.
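The isosceles-triangle construction can be turned into a small calculation. The sketch assumes that the sound field angle is the apex angle at the speaker, so that tan(R/2) = (K/2)/L; this follows from the stated geometry (height L, base K) rather than from an explicit formula in the extracted text, and the function name is hypothetical.

```python
import math


def sound_field_angle_deg(target_distance: float, audition_range: float) -> float:
    """Apex angle R, in degrees, of an isosceles triangle with height L and base K.

    Assumes tan(R / 2) = (K / 2) / L, with both lengths in the same unit.
    """
    if target_distance <= 0 or audition_range <= 0:
        raise ValueError("both distances must be positive")
    half_angle = math.atan2(audition_range / 2.0, target_distance)
    return math.degrees(2.0 * half_angle)


# Example call: the 35 cm audition range quoted for one target object,
# with a hypothetical target distance of 1 m.
angle = sound_field_angle_deg(target_distance=1.0, audition_range=0.35)
```

For a fixed audition range, the angle narrows as the subject moves away, so the beam has to be recomputed as the preview changes.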
- Step 103 Calculate a deflection angle with respect to the speaker for the one or more target objects
- the deflection angle S may be an angle at which the subject actually mapped by the target object deviates from the forward direction of the speaker 201.
- step 103 may include the following sub-steps:
- Sub-step S31: project the target object in the preview image data into a preset coordinate system, the coordinate system being constructed based on the position of the speaker;
- The angle by which the target object deviates from the speaker in the preview image data is generally the same as the angle by which the actual subject corresponding to the target object deviates from the speaker.
- The coordinate system can be constructed in advance based on the position of the speaker.
- The speaker usually sits directly behind the sound hole (the hole in the outer casing of the electronic device through which sound emitted by the speaker propagates). If the face of the electronic device on which the speaker is located (for example, the back side of the device) is taken as a projection surface, the position at which the speaker projects onto that surface often coincides with the sound hole. Therefore, the coordinate system can also be constructed directly from the sound hole.
- Projection is a method in which projection lines pass through an object (such as the speaker) onto a selected projection surface, producing a figure on that surface.
- A coordinate system, such as a Cartesian coordinate system, may be constructed with the speaker or the sound hole as its origin.
- The plane of the coordinate system is used as the projection plane, and the target object in the preview image data is projected into the coordinate system to calculate the deflection angle.
- Sub-step S32: in the coordinate system, calculate focus coordinates of the one or more target objects.
- The focus coordinates can be the coordinates of the focus point when the target object is focused.
- sub-step S32 may further include the following sub-steps:
- Sub-step S322: calculate the average value of the first coordinate and the second coordinate as the focus coordinates;
- the target object is an area, and the midpoint of the area can be used as the focus coordinate.
- Sub-step S324 respectively calculating an average value of the third coordinate and the fourth coordinate, and an average value of the fifth coordinate and the sixth coordinate as focus coordinates.
- the target object is an area, and the midpoint of the area can be used as the focus coordinate.
- the focus coordinates of the leftmost target object and the focus coordinates of the rightmost target object can be calculated, for a total of two focus coordinates.
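The midpoint rule for focus coordinates can be sketched as below. Boxes are given as (upper-left, lower-right) corner pairs already expressed in the speaker-based coordinate system. Picking the leftmost box by its upper-left X value and the rightmost by its lower-right X value is an assumption of this sketch, as are all names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[Point, Point]  # (upper-left corner, lower-right corner)


def focus_coordinate(box: Box) -> Point:
    """Midpoint of a target object's area: the average of its two corners."""
    (x1, y1), (x2, y2) = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def focus_coordinates(boxes: Sequence[Box]) -> List[Point]:
    """One focus coordinate for a single target, two (leftmost, rightmost) otherwise."""
    if not boxes:
        raise ValueError("at least one target object is required")
    if len(boxes) == 1:
        return [focus_coordinate(boxes[0])]
    leftmost = min(boxes, key=lambda b: b[0][0])   # smallest upper-left X (assumption)
    rightmost = max(boxes, key=lambda b: b[1][0])  # largest lower-right X (assumption)
    return [focus_coordinate(leftmost), focus_coordinate(rightmost)]
```

Target objects between the two extremes do not affect the result, which matches the text's use of only the leftmost and rightmost objects.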
- Sub-step S33: calculate the deflection angle using the focus coordinates.
- The focus coordinates can be used to calculate the deflection angle according to a trigonometric relationship.
- A right-angled triangle may be constructed with the X-axis value and the Y-axis value of the focus coordinates as its right-angle sides, and the deflection angle calculated according to the following trigonometric relationship:
- $\tan S = \dfrac{X_0}{Y_0}$
- where S is the deflection angle, $X_0$ is the X-axis value of the focus coordinates, and $Y_0$ is the Y-axis value of the focus coordinates.
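A minimal sketch of the deflection-angle calculation follows, assuming the relationship tan S = X₀/Y₀ that the worked examples later in the text use. Returning a signed angle (negative to the left of the speaker's forward direction) is an extension made here for convenience; the text works with unsigned angles. The function name is hypothetical.

```python
import math


def deflection_angle_deg(x0: float, y0: float) -> float:
    """Angle, in degrees, between the speaker's forward (Y) axis and the focus point.

    Assumes tan(S) = X0 / Y0 with the speaker at the origin. The sign follows
    the sign of X0; take abs() of the result for the unsigned angle.
    """
    if y0 <= 0:
        raise ValueError("the focus point must lie in front of the speaker (Y0 > 0)")
    return math.degrees(math.atan2(x0, y0))
```

Using `math.atan2` rather than dividing X₀ by Y₀ keeps the computation stable when the focus point lies far off to one side.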
- sub-step S33 may further include the following sub-steps:
- Sub-step S331: when there are multiple target objects, calculate a first candidate deflection angle using the focus coordinates of the leftmost target object;
- Sub-step S332: calculate a second candidate deflection angle using the focus coordinates of the rightmost target object;
- Sub-step S333: if the leftmost target object and the rightmost target object are located on either side of the speaker, set the first feature angle as the deflection angle;
- The first feature angle is half of the difference between the first candidate deflection angle and the second candidate deflection angle.
- The second feature angle is half of the sum of the first candidate deflection angle and the second candidate deflection angle.
- A right-angled triangle may be constructed with the X-axis value and the Y-axis value of each focus coordinate as its right-angle sides, and the first candidate deflection angle and the second candidate deflection angle calculated according to a trigonometric relationship. If the leftmost target object and the rightmost target object are located on either side of the forward direction of the speaker, the deflection angle is $S = (S_1 - S_2)/2$.
- If the leftmost target object and the rightmost target object are located on the same side of the forward direction of the speaker (both on the left or both on the right), the deflection angle is $S = (S_1 + S_2)/2$.
- where S is the deflection angle, $S_1$ is the first candidate deflection angle, and $S_2$ is the second candidate deflection angle.
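The two-case rule for combining the candidate angles can be sketched as follows. Inputs are treated as unsigned angle magnitudes, and the absolute value taken for the difference is an assumption of this sketch, since the text only says "half of a difference". The function and parameter names are hypothetical.

```python
def combined_deflection_angle(s1: float, s2: float, opposite_sides: bool) -> float:
    """Deflection angle for multiple targets from two unsigned candidate angles.

    s1: candidate angle of the leftmost target object.
    s2: candidate angle of the rightmost target object.
    opposite_sides: True when the two objects straddle the speaker's forward
    direction, False when both lie on the same side of it.
    """
    if s1 < 0 or s2 < 0:
        raise ValueError("candidate angles are expected as unsigned magnitudes")
    if opposite_sides:
        # First feature angle: half of the difference (absolute value assumed).
        return abs(s1 - s2) / 2.0
    # Second feature angle: half of the sum.
    return (s1 + s2) / 2.0
```

In both cases the result is the direction of the line bisecting the angle between the two outermost targets.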
- The XY coordinate system is constructed with the projection O of the speaker as the origin.
- If the upper-left corner of the target object is point $A(X_1, Y_1)$ and the lower-right corner is point $B(X_2, Y_2)$, then the focus coordinate is $U((X_1+X_2)/2, (Y_1+Y_2)/2)$.
- A right-angled triangle is constructed with the X-axis coordinate $(X_1+X_2)/2$ and the Y-axis coordinate $(Y_1+Y_2)/2$ of the focus coordinate U as its right-angle sides.
- The deflection angle $S_3$ is calculated according to the following trigonometric relationship:
- $\tan S_3 = \dfrac{(X_1+X_2)/2}{(Y_1+Y_2)/2}$;
- The XY coordinate system is constructed with the projection O of the speaker as the origin.
- The upper-left corner of the leftmost target object is point $C(X_3, Y_3)$ and its lower-right corner is point $D(X_4, Y_4)$, so its focus coordinate is $V((X_3+X_4)/2, (Y_3+Y_4)/2)$.
- The upper-left corner of the rightmost target object is point $E(X_5, Y_5)$ and its lower-right corner is point $F(X_6, Y_6)$, so its focus coordinate is $W((X_5+X_6)/2, (Y_5+Y_6)/2)$.
- Right-angled triangles are constructed with the X-axis coordinate $(X_3+X_4)/2$ and the Y-axis coordinate $(Y_3+Y_4)/2$ of the focus coordinate V as right-angle sides, and with the X-axis coordinate $(X_5+X_6)/2$ and the Y-axis coordinate $(Y_5+Y_6)/2$ of the focus coordinate W as right-angle sides. The first candidate deflection angle $S_5$ and the second candidate deflection angle $S_6$ are calculated according to the following trigonometric relationships:
- $\tan S_5 = \dfrac{(X_3+X_4)/2}{(Y_3+Y_4)/2}$;
- $\tan S_6 = \dfrac{(X_5+X_6)/2}{(Y_5+Y_6)/2}$;
- The XY coordinate system is constructed with the projection O of the speaker as the origin.
- There are three target objects.
- The upper-left corner of the leftmost target object is point $G(X_7, Y_7)$ and its lower-right corner is point $H(X_8, Y_8)$, so its focus coordinate is $M((X_7+X_8)/2, (Y_7+Y_8)/2)$.
- The upper-left corner of the rightmost target object is at $(X_9, Y_9)$ and its lower-right corner is point $J(X_{10}, Y_{10})$, so its focus coordinate is $N((X_9+X_{10})/2, (Y_9+Y_{10})/2)$.
- Right-angled triangles are constructed with the X-axis coordinate $(X_7+X_8)/2$ and the Y-axis coordinate $(Y_7+Y_8)/2$ of the focus coordinate M as right-angle sides, and with the X-axis coordinate $(X_9+X_{10})/2$ and the Y-axis coordinate $(Y_9+Y_{10})/2$ of the focus coordinate N as right-angle sides. The first candidate deflection angle $S_8$ and the second candidate deflection angle $S_9$ are calculated according to the following trigonometric relationships:
- $\tan S_8 = \dfrac{(X_7+X_8)/2}{(Y_7+Y_8)/2}$;
- $\tan S_9 = \dfrac{(X_9+X_{10})/2}{(Y_9+Y_{10})/2}$;
- Step 104: when a photographing instruction is received, drive the speaker to output audio according to the sound field angle and the deflection angle.
- The user can trigger the photographing instruction by tapping the photographing control, tapping the preview image data, or the like. The camera then performs the photographing process and, at the same time, the speaker is driven to output audio according to the sound field angle and the deflection angle. That is, the photographing sound is emitted toward the area where the actual subject corresponding to the target object is located, to alert the person being photographed that the picture is being taken.
- Directional audio output can be achieved by using the nonlinear propagation effects of ultrasound in air to produce highly directional audible sound (that is, audio orientation).
- The underlying principle concerns two plane waves propagating nonlinearly in an inhomogeneous medium.
- The ultrasonic transducer (one of the components of the loudspeaker) emits, through mechanical vibration, two ultrasonic waves of frequencies $f_1$ and $f_2$ into the air.
- As these two ultrasonic waves propagate in the air, a nonlinear interaction occurs, producing a complex sound wave that contains the fundamental frequencies $f_1$ and $f_2$, the sum frequency $f_1+f_2$, the difference frequency $f_1-f_2$, and harmonics of various orders.
- Because the acoustic attenuation coefficient is proportional to the square of the frequency, the higher-frequency ultrasonic signals $f_1$, $f_2$, $f_1+f_2$ and the harmonics are quickly absorbed by the air, leaving the difference-frequency signal $f_1-f_2$, which lies in the audio frequency range, to continue propagating in the air.
- Whether a sound wave is directional is closely related to the ratio of its wavelength to the size of the sound source.
- When the wavelength is much larger than the size of the sound source, the sound wave has essentially no directivity; when the wavelength is close to or far smaller than the size of the sound source, the sound wave becomes increasingly directional. Therefore, when the ultrasonic frequencies $f_1$ and $f_2$ are suitably selected, the difference-frequency signal $f_1-f_2$ can be made to fall in the audible range, so that audible sound is generated from ultrasound.
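Which of the mixing products remains audible can be checked with a short calculation. The sketch lists only the fundamentals and the sum and difference frequencies (harmonics are omitted), and it assumes the commonly quoted 20 Hz to 20 kHz range of human hearing; the 40 kHz and 41 kHz carrier values in the example are hypothetical and do not come from the text.

```python
from typing import Dict

AUDIBLE_RANGE_HZ = (20.0, 20_000.0)  # commonly quoted range of human hearing


def mixing_products(f1: float, f2: float) -> Dict[str, float]:
    """Fundamental, sum, and difference frequencies of two tones (harmonics omitted)."""
    return {"f1": f1, "f2": f2, "sum": f1 + f2, "difference": abs(f1 - f2)}


def audible_components(f1: float, f2: float) -> Dict[str, float]:
    """Subset of the mixing products that falls inside the audible range."""
    low, high = AUDIBLE_RANGE_HZ
    return {name: f for name, f in mixing_products(f1, f2).items() if low <= f <= high}


# Two hypothetical ultrasonic carriers 1 kHz apart: only the difference
# frequency lands in the audible band.
print(audible_components(40_000.0, 41_000.0))  # prints {'difference': 1000.0}
```

Shifting one carrier relative to the other moves the audible tone while both carriers stay ultrasonic.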
- An ultrasonic transducer (one of the components of the loudspeaker) emits a strongly modulated ultrasonic wave into the air. Along the main axis of propagation of the ultrasonic wave (such as the direction indicated by the sound field angle and the deflection angle), audio signals are continuously demodulated by nonlinear interaction, and these continuously demodulated audio waves accumulate and superimpose, which realizes an end-fire virtual array.
- This virtual sound source array is the so-called parametric acoustic array.
- The parametric acoustic array continuously strengthens the energy of the sound wave along the direction of propagation.
- In directions other than the main propagation axis, the superposition enhancement effect is weak, so the sound wave ultimately has strong directivity along the main propagation axis (such as the direction indicated by the sound field angle and the deflection angle).
- Referring to FIG. 5, a structural block diagram of an embodiment of a photographing-based audio output device according to the present disclosure is shown, which may specifically include the following modules:
- a target object determining module 501 configured to determine one or more target objects in the preview image data when the camera captures the preview image data;
- a sound field angle calculation module 502 configured to calculate a sound field angle relative to the speaker for the one or more target objects;
- a deflection angle calculation module 503 configured to calculate a deflection angle relative to the speaker for the one or more target objects;
- an audio directional output module 504 configured to drive the speaker to output audio according to the sound field angle and the deflection angle when a photographing instruction is received.
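One way to picture how the four modules cooperate is as pluggable callables wired into a small controller. This is only an architectural sketch: the class name, method names, and the bounding-box `Target` type are hypothetical, and each module is passed in as a function rather than implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Target = Tuple[float, float, float, float]  # hypothetical bounding box (x1, y1, x2, y2)


@dataclass
class AudioOutputDevice:
    determine_targets: Callable[[object], List[Target]]      # role of module 501
    sound_field_angle: Callable[[Sequence[Target]], float]   # role of module 502
    deflection_angle: Callable[[Sequence[Target]], float]    # role of module 503
    drive_speaker: Callable[[float, float], None]            # role of module 504
    _angles: Optional[Tuple[float, float]] = field(default=None, init=False)

    def on_preview_frame(self, frame: object) -> None:
        """Recompute both angles for every preview frame that contains targets."""
        targets = self.determine_targets(frame)
        if not targets:
            self._angles = None
            return
        self._angles = (self.sound_field_angle(targets), self.deflection_angle(targets))

    def on_photograph(self) -> bool:
        """Drive the speaker with the latest angles; report whether audio was directed."""
        if self._angles is None:
            return False
        self.drive_speaker(*self._angles)
        return True
```

Keeping the latest angles cached means the photographing instruction can be served immediately, in line with the continuous calculation during preview that the method embodiment describes.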
- The target object determining module 501 may include the following submodule:
- a first determining submodule configured to detect faces in the preview image data and determine one or more target objects.
- In another embodiment, the target object determining module 501 may include the following submodules:
- a second determining submodule configured to determine, when a focus operation instruction is received, that the one or more human faces in the preview image data corresponding to the focus operation instruction are the one or more target objects;
- a cancel submodule configured to cancel the determined one or more target objects when a cancel operation instruction is received.
- the sound field angle calculation module 502 may include the following sub-modules:
- a target distance measurement submodule for measuring a target distance between the speaker and the one or more target objects
- an audition range distance obtaining submodule configured to obtain an audition range distance matching the one or more target objects, where the audition range distance is the distance spanned by the range within which the audio can be heard;
- a first calculating submodule configured to calculate a sound field angle according to the target distance and the distance of the audition range.
- the target distance measurement submodule may further include the following submodules:
- a first obtaining submodule configured to acquire a candidate distance between the camera and the target object as the target distance when the target object is one;
- a second obtaining submodule configured to acquire a plurality of candidate distances between the camera and the plurality of target objects respectively when the target object is multiple;
- a second calculation submodule configured to calculate the target distance by using the plurality of candidate distances.
- the deflection angle calculation module 503 may include the following sub-modules:
- a projection submodule configured to project a target object in the preview image data into a preset coordinate system, where the coordinate system is constructed based on a position of the speaker;
- a focus coordinate calculation submodule configured to calculate focus coordinates of the one or more target objects in the coordinate system
- a third calculation sub-module for calculating a deflection angle using the focus coordinates.
- the focus coordinate calculation sub-module may further include the following sub-modules:
- a first search submodule configured to search for a first coordinate of the upper left corner of the target object and a second coordinate of a lower right corner when the target object is one;
- a fourth calculation submodule configured to calculate an average value of the first coordinate and the second coordinate as a focus coordinate
- a second search submodule configured to: when the target object is multiple, find the third coordinate of the upper left corner of the leftmost target object, the fourth coordinate of the lower right corner, and the upper left corner of the rightmost target object The fifth coordinate, the sixth coordinate of the lower right corner;
- a fifth calculating submodule configured to respectively calculate an average value of the third coordinate and the fourth coordinate, and an average value of the fifth coordinate and the sixth coordinate as a focus coordinate.
- the third calculating submodule may further include the following submodule:
- a first candidate deflection angle calculation submodule configured to calculate a first candidate deflection angle by using the focus coordinates of the leftmost target object when there are multiple target objects;
- a second candidate deflection angle calculation submodule configured to calculate a second candidate deflection angle by using the focus coordinates of the rightmost target object;
- a first setting submodule configured to set a first feature angle as the deflection angle when the leftmost target object and the rightmost target object are located on the two sides of the speaker;
- a second setting submodule configured to set a second feature angle as the deflection angle when the leftmost target object and the rightmost target object are located on the same side of the speaker;
- the first feature angle is half of the difference between the first candidate deflection angle and the second candidate deflection angle; the second feature angle is half of the sum of the first candidate deflection angle and the second candidate deflection angle.
- the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
- embodiments of the present disclosure can be provided as a method, an apparatus, or a computer program product.
- embodiments of the present disclosure can take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware.
- embodiments of the present disclosure may take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
- Embodiments of the present disclosure are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or FIG.
- These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart or in one or more blocks of the block diagram.
- the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device.
- the instruction device implements the functions specified in one or more blocks of the flowchart or in a flow or block of the flowchart.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Studio Devices (AREA)
Abstract
Description
Cross-reference to related applications
The present application claims priority to Chinese Patent Application No. 201510345291.2, filed in China on June 19, 2015, the entire contents of which are hereby incorporated by reference.
The present disclosure relates to the field of audio processing technologies, and in particular to a photographing-based audio output method and a photographing-based audio output apparatus.
With the development of technology, electronic devices, especially mobile devices such as mobile phones and tablet computers, are used more and more in people's work, study, and daily communication.
Most electronic devices such as mobile phones and tablet computers are equipped with a camera, which makes photographing an important application of electronic devices.
At present, when a photograph is taken, a prompt sound is usually emitted through the speaker to tell the user and the person being photographed when the photograph is taken.
However, due to factors such as a noisy environment, the person being photographed may not hear the prompt sound clearly. If that person misses the prompt sound and moves slightly, the photograph comes out blurred. This is a particular problem when photographing children: because their attention cannot be held well, it is difficult to photograph them.
If the photograph is blurred, it has to be retaken, which wastes the resources of the electronic device and makes photographing inefficient and costly.
Moreover, in scenes with many people, the prompt sound may reach other people who are not being photographed and disturb them.
Summary of the invention
In view of the above problems, embodiments of the present disclosure are proposed to provide a photographing-based audio output method and a corresponding photographing-based audio output apparatus that overcome the above problems or at least partially solve them.
To solve the above problems, an embodiment of the present disclosure discloses a photographing-based audio output method, including:
determining one or more target objects in preview image data when a camera captures the preview image data;
calculating a sound field angle relative to a speaker for the one or more target objects;
calculating a deflection angle relative to the speaker for the one or more target objects;
when a photographing instruction is received, driving the speaker to output audio directionally according to the sound field angle and the deflection angle.
An embodiment of the present disclosure further discloses a photographing-based audio output apparatus, including:
a target object determining module, configured to determine one or more target objects in preview image data when a camera captures the preview image data;
a sound field angle calculation module, configured to calculate a sound field angle relative to a speaker for the one or more target objects;
a deflection angle calculation module, configured to calculate a deflection angle relative to the speaker for the one or more target objects;
an audio directional output module, configured to drive the speaker to output audio directionally according to the sound field angle and the deflection angle when a photographing instruction is received.
Embodiments of the present disclosure have the following advantages. A sound field angle and a deflection angle are calculated for one or more target objects in the preview image data, and at the time of photographing the audio is output directionally according to that sound field angle and deflection angle, so that the prompt sound is directed at the photographed subject to which the target object actually maps. The subject can therefore hear the prompt sound clearly even in a noisy environment, which raises the success rate of photographing, lowers the chance of having to retake the photograph, reduces the waste of the electronic device's resources, improves photographing efficiency, and lowers its cost. In addition, the prompt sound is generally inaudible outside the region of directional output, which reduces the disturbance to other people.
FIG. 1 is a flowchart of the steps of an embodiment of a photographing-based audio output method of the present disclosure;
FIGS. 2A and 2B are example scenes showing a sound field angle and a deflection angle of the present disclosure;
FIGS. 3A and 3B are examples of calculating a sound field angle of the present disclosure;
FIGS. 4A to 4C are examples of calculating a deflection angle of the present disclosure;
FIG. 5 is a structural block diagram of an embodiment of a photographing-based audio output apparatus of the present disclosure.
To make the above objects, features, and advantages of the present disclosure more apparent and easier to understand, the present disclosure is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a photographing-based audio output method of the present disclosure is shown. The method may specifically include the following steps:
Step 101: when the camera captures preview image data, determine one or more target objects in the preview image data;
It should be noted that the embodiments of the present disclosure may be applied in various electronic devices, such as mobile phones, tablet computers, personal digital assistants, and wearable devices (such as glasses and watches); the embodiments of the present disclosure place no limitation on this.
The operating system of the electronic device may include Android, iOS, Windows Phone, Windows, and so on, and can generally support the operation of the camera and the speaker.
The camera is a hardware component of the electronic device that can be used to take photographs and record video. It may be front-facing (facing the same direction as the screen of the electronic device) or rear-facing (facing the opposite direction to the screen); the embodiments of the present disclosure place no limitation on this either.
In practice, the optical image of the scene formed by the camera lens is projected onto the surface of the image sensor and converted into an electrical signal, which becomes a digital image signal after A/D (analog-to-digital) conversion. A digital signal processing (DSP) chip or an encoding library compresses the digital image signal and converts it into a specific image file format, which is transmitted over the data bus to the central processing unit (CPU) of the mobile device for processing and can then be shown on the display of the electronic device.
It should be noted that preview is defined relative to photographing: the preview image data is image data provided to the user for adjustment and selection before a photograph is taken and stored, and it is held in a cache.
The camera captures a series of preview image data, that is, multiple frames of preview image data. In the embodiments of the present disclosure, the sound field angle and the deflection angle can be calculated continuously while the camera keeps capturing preview image data, until the user takes the photograph.
Of course, a selection control may also be provided through which the user chooses whether audio is output directionally when photographing. Before determining the one or more target objects in the preview image data, the embodiments of the present disclosure may check the state of the selection control. If directional audio output is selected, the step of determining the one or more target objects in the preview image data continues to be performed; if directional audio output is not selected, that step may be skipped to reduce the resource consumption of the electronic device.
In an optional embodiment of the present disclosure, step 101 may include the following sub-step:
Sub-step S11: detect faces in the preview image data and determine them as the one or more target objects.
In general, when the camera captures preview image data, it can focus automatically by means of face detection. Face detection here means locating the position and size of every face in one frame of preview image data.
In the embodiments of the present disclosure, an object that is successfully detected can then be identified as a target object, and the photographed subject to which the target object actually maps can be a person.
Further, taking the Android system as an example, Android provides two dedicated APIs (Application Program Interfaces), android.media.FaceDetector and android.media.FaceDetector.Face, which implement face detection on a bitmap.
In another optional embodiment of the present disclosure, step 101 may include the following sub-steps:
Sub-step S11: when a focus operation instruction is received, determine the one or more faces in the preview image data that correspond to the focus operation instruction as the one or more target objects;
In the embodiments of the present disclosure, the user can focus manually. Operations such as tapping the preview image data or selecting a focus frame trigger a focus operation instruction, and the camera can focus on the object selected by the user according to that instruction.
In the embodiments of the present disclosure, the object corresponding to the focus operation instruction can then be identified as the target object, and the photographed subject to which the target object actually maps can be a person, an animal, or a still object.
or,
Sub-step S12: when a cancel operation instruction is received, cancel the one or more target objects that have been determined.
When, for example, the person recognized by the camera's autofocus is not the intended subject, the user can trigger a cancel operation instruction, for instance by tapping the target object, to cancel that target object.
In a specific implementation, there may be one target object determined in the preview image data or several (that is, two or more); the embodiments of the present disclosure place no limitation on this.
Step 102: calculate a sound field angle relative to the speaker for the one or more target objects;
The speaker may be any hardware that outputs audio, such as a miniature piezoelectric thin-film ultrasonic transducer.
Suppose that, when the speaker outputs audio directionally, a photographed subject within a certain range can hear the audio while other people outside that range generally cannot. The angle that this range subtends at the speaker is called the sound field angle.
That is, as shown in FIGS. 2A and 2B, the sound field angle R may be the angular range within which the audio output by speaker 201 can be heard.
In an optional embodiment of the present disclosure, step 102 may include the following sub-steps:
Sub-step S21: measure a target distance between the speaker and the one or more target objects;
It should be noted that the target distance is the straight-line distance between the speaker and the target objects taken as a whole, not necessarily the straight-line distance between the speaker and one particular target object.
In an optional example of the embodiments of the present disclosure, sub-step S21 may further include the following sub-steps:
Sub-step S211: when there is one target object, obtain the candidate distance between the camera and the target object as the target distance;
or,
Sub-step S212: when there are multiple target objects, obtain the candidate distances between the camera and each of the target objects;
Sub-step S213: calculate the target distance from the multiple candidate distances.
Because the camera and the speaker are mounted in the same electronic device, the distance between them is generally small. The candidate distance between the camera and the target object (the straight-line distance between the two) therefore differs very little from the candidate distance between the speaker and the target object (the straight-line distance between the two), and the difference is generally within an acceptable range.
Moreover, the candidate distance between the camera and the target object can be calculated from the preview image data. In this example, to avoid adding extra hardware, the candidate distance between the camera and the target object can therefore be used in place of the candidate distance between the speaker and the target object.
When there is a single target object, the candidate distance between the camera and the target object can be used directly as the target distance between the speaker and the target object.
When there are multiple target objects, the target distance between the speaker and the target objects can be calculated from the candidate distances between the camera and each target object, for example by taking the average of the candidate distances, the maximum among them, or the minimum among them; the embodiments of the present disclosure place no limitation on this.
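By way of illustration only, the reduction of candidate distances to a single target distance described above can be sketched in Python as follows. The function name `target_distance` and its `strategy` parameter are hypothetical and are not taken from the disclosure, which names the average, maximum, and minimum only as possible choices.

```python
from statistics import mean


def target_distance(candidate_distances, strategy="mean"):
    """Reduce camera-to-object candidate distances to one target distance.

    `strategy` is a hypothetical parameter of this sketch: "mean", "max" or "min".
    """
    if not candidate_distances:
        raise ValueError("at least one candidate distance is required")
    if len(candidate_distances) == 1:
        # Single target object: the candidate distance is used directly.
        return candidate_distances[0]
    reducers = {"mean": mean, "max": max, "min": min}
    if strategy not in reducers:
        raise ValueError(f"unknown strategy: {strategy}")
    return reducers[strategy](candidate_distances)
```

All candidate distances are assumed to be expressed in the same unit; the result is in that unit as well.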
Further, the candidate distance between the camera and the target object can be calculated in one or more of the following ways:
1. Stereo vision.
Imitating human stereoscopic perception, a binocular or multi-lens camera observes the same target object from different viewpoints to obtain two-dimensional images of the target object at different viewing angles. The positional deviation of image pixels, that is, the parallax, is calculated by the principle of triangulation to obtain three-dimensional information about the target object.
2. Motion ranging.
A monocular camera acquires successive two-dimensional images of the target object at different times or from different spatial positions, and the distance and other parameters of the target object are calculated from how the target object changes over time or space in the two-dimensional image sequence.
3. Monocular ranging.
Image-processing-based methods for monocular ranging include depth from focus (DFF) and depth from defocus (DFD).
Depth from focus captures a series of image data while adjusting the optical parameters, finds the sharpest image data in the series, and calculates the distance from the capture parameters of that image data using the imaging principle of geometric optics.
Depth from defocus relies on the principle that the more an object is defocused, the more blurred its image is. Two or three frames of image data captured under different optical parameters are used to determine the spread parameter of the defocus point spread function, and depth is then calculated from the relationship between that spread parameter and the distance to the target object.
Of course, the above calculation methods are only examples. When implementing the embodiments of the present disclosure, other calculation methods may be set according to the actual situation, and those skilled in the art may also adopt other calculation methods as needed; the embodiments of the present disclosure place no limitation on this.
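Two of the ranging approaches listed above reduce to short closed-form relations, sketched below under stated assumptions. The stereo function assumes a rectified pinhole camera pair, for which depth equals focal length times baseline divided by disparity. The focus function assumes an ideal thin lens, 1/f = 1/u + 1/v, applied to the lens setting of the sharpest frame. Neither formula is spelled out in the disclosure, and the function and parameter names are hypothetical.

```python
def stereo_depth(focal_length_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two camera centres in metres
    disparity_px:    horizontal pixel offset of the object between the views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def focus_distance(focal_length_mm, image_distance_mm):
    """Object distance u in millimetres from the thin-lens equation 1/f = 1/u + 1/v.

    image_distance_mm is the lens-to-sensor distance v at which the
    object appears sharpest.
    """
    if image_distance_mm <= focal_length_mm:
        raise ValueError("image distance must exceed the focal length")
    return focal_length_mm * image_distance_mm / (image_distance_mm - focal_length_mm)
```

Either result could serve as a candidate distance between the camera and a target object.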
In addition to reusing the candidate distance between the camera and the target object, the candidate distance between the speaker and the target object can also be measured directly by active ranging. A beam such as a laser, or light with a certain texture structure, illuminates the target object, and the distance of the object is determined by analyzing the texture deformation of the light reflected from the target object or by measuring the propagation time of the light beam. The embodiments of the present disclosure place no limitation on this either.
Sub-step S22: obtain an audition range distance that matches the one or more target objects;
Suppose that, when audio is output directionally, the audio can be heard within a certain range and generally cannot be heard outside it. The extent of that range is called the audition range distance.
That is, the audition range distance may be the extent of the range within which the directionally output audio can be heard.
With the embodiments of the present disclosure, a matching audition range distance can be preset according to the target objects. For example, the audition range for one target object is 35 cm, the audition range for two target objects is 45 cm, and so on.
Of course, a suitable audition range distance can also be calculated from the focal length, the spacing of the target objects in the preview image data, and so on; the embodiments of the present disclosure place no limitation on this.
Sub-step S23: calculate the sound field angle from the target distance and the audition range distance.
In a specific implementation, the sound field angle can be calculated from the target distance and the audition range distance according to a trigonometric relationship.
In one embodiment, an isosceles triangle is constructed with the target distance as its height and the audition range distance as its base, and the sound field angle is calculated according to the following trigonometric relationship:
$$\tan\frac{R}{2} = \frac{K/2}{L}$$
where $R$ is the sound field angle, $K$ is the audition range distance, and $L$ is the target distance.
Of course, trigonometric relationships other than the tangent function can also be used to calculate the sound field angle; the embodiments of the present disclosure place no limitation on this.
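The tangent relationship above can be solved for the sound field angle as R = 2·arctan((K/2)/L). The following Python function is a minimal sketch of that calculation; the function name and the choice to return degrees are assumptions made for illustration.

```python
import math


def sound_field_angle(target_distance, audition_range_distance):
    """Return the sound field angle R in degrees from tan(R/2) = (K/2) / L.

    target_distance:         L, height of the isosceles triangle
    audition_range_distance: K, base of the isosceles triangle
    Both values must use the same unit.
    """
    if target_distance <= 0 or audition_range_distance <= 0:
        raise ValueError("distances must be positive")
    half_angle = math.atan((audition_range_distance / 2) / target_distance)
    return math.degrees(2 * half_angle)
```

For instance, when the audition range distance is twice the target distance, tan(R/2) equals 1 and the function returns an angle of 90 degrees (up to floating-point rounding).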
For example, as shown in FIG. 3A, when there is one target object, the candidate distance measured between the camera and the target object is $L_0$, so the target distance between speaker 301 and the target object can be taken as $L_0$. With an audition range distance of $K_0$ for one target object, an isosceles triangle is constructed with the target distance $L_0$ as its height and the audition range distance $K_0$ as its base, and the sound field angle $R_0$ is calculated according to the following trigonometric relationship:
$$\tan\frac{R_0}{2} = \frac{K_0/2}{L_0}$$
As another example, as shown in FIG. 3B, when there are three target objects, the candidate distances measured between the camera and the target objects are $L_2$, $L_3$, and $L_4$ respectively, so the target distance between speaker 301 and the target objects is $L_1 = (L_2 + L_3 + L_4)/3$. With an audition range distance of $K_1$ for three target objects, an isosceles triangle is constructed with the target distance $L_1$ as its height and the audition range distance $K_1$ as its base, and the sound field angle $R_1$ is calculated according to the following trigonometric relationship:
$$\tan\frac{R_1}{2} = \frac{K_1/2}{L_1}$$
Step 103: calculate a deflection angle relative to the speaker for the one or more target objects;
As shown in FIGS. 2A and 2B, the deflection angle S may be the angle by which the photographed subject to which the target object actually maps deviates from the forward direction of speaker 201.
In an optional embodiment of the present disclosure, step 103 may include the following sub-steps:
Sub-step S31: project the target object in the preview image into a preset coordinate system, the coordinate system being constructed based on the position of the speaker;
Because the preview image data captured by the camera is generally proportional to the actual scene, the angle by which the target object deviates from the speaker in the preview image data generally has the same value as the actual deflection angle, relative to the speaker, of the photographed subject to which the target object maps.
With the embodiments of the present disclosure, the coordinate system can be constructed in advance based on the position of the speaker.
It should be noted that the speaker and the sound outlet hole (a hole in the housing of the electronic device through which the audio emitted by the speaker propagates) usually face each other, and the projected position of the speaker on the electronic device (that is, the position obtained by projecting the speaker onto the back of the electronic device taken as the projection plane) often coincides with the sound outlet hole. The coordinate system can therefore also be constructed directly from the sound outlet hole.
Here, projection is the method of casting projection lines through an object (such as the speaker) onto a selected projection plane to obtain a figure on that plane.
In the embodiments of the present disclosure, a coordinate system, such as a rectangular coordinate system, can be constructed with the speaker or the sound outlet hole as its origin.
With the plane of the coordinate system as the projection plane, the target object in the preview image is projected into the coordinate system so that the deflection angle can be calculated.
Sub-step S32: in the coordinate system, calculate the focus coordinates of the one or more target objects;
The focus coordinate may be the coordinate of the focal point when a focus operation is performed on the target object.
In an optional example of the embodiments of the present disclosure, sub-step S32 may further include the following sub-steps:
Sub-step S321: when there is one target object, find the first coordinate of the upper left corner and the second coordinate of the lower right corner of the target object;
Sub-step S322: calculate the average value of the first coordinate and the second coordinate as the focus coordinate;
In this example, the target object is a region, so the midpoint of the region can be used as the focus coordinate.
or,
Sub-step S323: when there are multiple target objects, find the third coordinate of the upper left corner and the fourth coordinate of the lower right corner of the leftmost target object, and the fifth coordinate of the upper left corner and the sixth coordinate of the lower right corner of the rightmost target object;
Sub-step S324: calculate the average value of the third coordinate and the fourth coordinate, and the average value of the fifth coordinate and the sixth coordinate, as the focus coordinates.
In this example, each target object is a region, so the midpoint of the region can be used as the focus coordinate.
If there are multiple target objects, the focus coordinate of the leftmost target object and the focus coordinate of the rightmost target object can be calculated, giving two focus coordinates in total.
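The sub-steps above amount to taking the midpoint of each relevant bounding region. A minimal Python sketch follows; the function names are hypothetical, and choosing the leftmost and rightmost regions by the x value of their midpoints is an assumption of the sketch, since the disclosure does not say how they are identified.

```python
def box_center(top_left, bottom_right):
    """Midpoint of a region given its upper-left and lower-right corners."""
    (x1, y1), (x2, y2) = top_left, bottom_right
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def focus_coordinates(boxes):
    """Return the focus coordinates for one or more target regions.

    Each box is a (top_left, bottom_right) pair of (x, y) points in the
    speaker-based coordinate system. One box yields one focus coordinate;
    several boxes yield the focus coordinates of the leftmost and the
    rightmost box, in that order.
    """
    if not boxes:
        raise ValueError("at least one target object is required")
    centers = [box_center(tl, br) for tl, br in boxes]
    if len(centers) == 1:
        return centers
    # Assumption: "leftmost"/"rightmost" are decided by the midpoint's x value.
    leftmost = min(centers, key=lambda c: c[0])
    rightmost = max(centers, key=lambda c: c[0])
    return [leftmost, rightmost]
```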
Sub-step S33, calculating the deflection angle using the focus coordinates.
In a specific implementation, the deflection angle can be calculated from the focus coordinates according to a trigonometric relationship.
In an optional example of the embodiments of the present disclosure, when there is one target object, a right triangle may be constructed with the X-axis value and the Y-axis value of the focus coordinate as its legs, and the deflection angle may be calculated according to the following trigonometric relationship:
tan S = X_0 / Y_0
where S is the deflection angle, X_0 is the X-axis value of the focus coordinate, and Y_0 is the Y-axis value of the focus coordinate.
Of course, besides the tangent function, other trigonometric relationships may also be used to calculate the deflection angle, which is not limited by the embodiments of the present disclosure.
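As a rough sketch of the single-target relationship tan S = X_0 / Y_0, the angle can be recovered with an arctangent. The function name `deflection_angle_deg` is hypothetical. Using `math.atan2` instead of a plain division is a choice made here so that Y_0 = 0 does not raise an error; the sign convention (negative for left of the speaker's forward axis, positive for right) is likewise an assumption, and the result equals arctan(X_0 / Y_0) only when the target is in front of the speaker (Y_0 > 0).

```python
import math


def deflection_angle_deg(x0: float, y0: float) -> float:
    """Angle S, in degrees, between the speaker's forward (Y) axis and the focus coordinate."""
    if x0 == 0 and y0 == 0:
        raise ValueError("focus coordinate coincides with the speaker origin")
    # atan2(x0, y0) satisfies tan S = x0 / y0 for y0 > 0 and keeps the sign of x0
    # (assumed convention: negative = left of the forward axis, positive = right).
    return math.degrees(math.atan2(x0, y0))
```

For instance, a focus coordinate with equal X and Y values lies 45° off the forward axis, and one directly on the Y axis gives 0°.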
In another optional example of the embodiments of the present disclosure, sub-step S33 may further include the following sub-steps:
Sub-step S331, when there are multiple target objects, calculating a first candidate deflection angle using the focus coordinate of the leftmost target object;
Sub-step S332, calculating a second candidate deflection angle using the focus coordinate of the rightmost target object;
Sub-step S333, if the leftmost target object and the rightmost target object are located on opposite sides of the speaker, setting a first feature angle as the deflection angle;
wherein the first feature angle is half of the difference between the first candidate deflection angle and the second candidate deflection angle;
Sub-step S334, if the leftmost target object and the rightmost target object are located on the same side of the speaker, setting a second feature angle as the deflection angle;
wherein the second feature angle is half of the sum of the first candidate deflection angle and the second candidate deflection angle.
In this example, for each focus coordinate, a right triangle may be constructed with the X-axis value and the Y-axis value of the focus coordinate as its legs, and the first candidate deflection angle and the second candidate deflection angle may be calculated according to a trigonometric relationship. If the leftmost target object and the rightmost target object are located on opposite sides of the speaker's forward direction, the deflection angle is:
S = (S_1 - S_2) / 2
If the leftmost target object and the rightmost target object are located on the same side of the speaker's forward direction (for example, both on the left or both on the right), the deflection angle is:
S = (S_1 + S_2) / 2
where S is the deflection angle, S_1 is the first candidate deflection angle, and S_2 is the second candidate deflection angle.
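The two-case rule above can be illustrated with a short sketch. It assumes that both candidate angles are unsigned magnitudes in degrees measured from the speaker's forward direction, and that the caller already knows whether the two outermost targets lie on the same side. The function name `combined_deflection_angle` is hypothetical, and taking the absolute value of the difference is an assumption made here so that the result is a magnitude whichever candidate is larger.

```python
def combined_deflection_angle(s1: float, s2: float, same_side: bool) -> float:
    """Combine two unsigned candidate deflection angles (degrees) into one deflection angle."""
    if same_side:
        # Second feature angle: half of the sum of the candidates.
        return (s1 + s2) / 2.0
    # First feature angle: half of the difference of the candidates.
    # abs() is an assumption; the text writes the difference without fixing its sign.
    return abs(s1 - s2) / 2.0
```

With candidates of 30° and 50° on opposite sides, this yields 10°; with candidates of 50° and 30° on the same side, it yields 40°. In the opposite-side case the offset presumably points toward the side whose candidate angle is larger, which the sketch does not report.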
It should be noted that the above calculation
For example, as shown in FIG. 4A, an XY coordinate system is constructed with the projection O of the speaker as the origin. When there is one target object whose upper-left corner is point A(X_1, Y_1) and whose lower-right corner is point B(X_2, Y_2), the focus coordinate is U((X_1 + X_2)/2, (Y_1 + Y_2)/2). A right triangle is then constructed with the X-axis value (X_1 + X_2)/2 and the Y-axis value (Y_1 + Y_2)/2 of the focus coordinate U as its legs, and the deflection angle S_3 is calculated according to the following trigonometric relationship:
tan S_3 = ((X_1 + X_2)/2) / ((Y_1 + Y_2)/2);
As another example, as shown in FIG. 4B, an XY coordinate system is constructed with the projection O of the speaker as the origin. When there are three target objects, the upper-left corner of the leftmost target object is point C(X_3, Y_3) and its lower-right corner is point D(X_4, Y_4), so its focus coordinate is V((X_3 + X_4)/2, (Y_3 + Y_4)/2); the upper-left corner of the rightmost target object is point E(X_5, Y_5) and its lower-right corner is point F(X_6, Y_6), so its focus coordinate is W((X_5 + X_6)/2, (Y_5 + Y_6)/2);
Right triangles are then constructed with the X-axis value (X_3 + X_4)/2 and the Y-axis value (Y_3 + Y_4)/2 of the focus coordinate V as legs, and with the X-axis value (X_5 + X_6)/2 and the Y-axis value (Y_5 + Y_6)/2 of the focus coordinate W as legs, and the first candidate deflection angle S_5 and the second candidate deflection angle S_6 are calculated according to the following trigonometric relationships:
tan S_5 = ((X_3 + X_4)/2) / ((Y_3 + Y_4)/2);
tan S_6 = ((X_5 + X_6)/2) / ((Y_5 + Y_6)/2);
Assuming S_5 is 30° and S_6 is 50°, the deflection angle is S_4 = (S_6 - S_5)/2 = 10°, indicating that the three target objects as a whole are offset by 10° toward the right side of the speaker.
As another example, as shown in FIG. 4C, an XY coordinate system is constructed with the projection O of the speaker as the origin. When there are three target objects, the upper-left corner of the leftmost target object is point G(X_7, Y_7) and its lower-right corner is point H(X_8, Y_8), so its focus coordinate is M((X_7 + X_8)/2, (Y_7 + Y_8)/2); the upper-left corner of the rightmost target object is point I(X_9, Y_9) and its lower-right corner is point J(X_10, Y_10), so its focus coordinate is N((X_9 + X_10)/2, (Y_9 + Y_10)/2);
Right triangles are then constructed with the X-axis value (X_7 + X_8)/2 and the Y-axis value (Y_7 + Y_8)/2 of the focus coordinate M as legs, and with the X-axis value (X_9 + X_10)/2 and the Y-axis value (Y_9 + Y_10)/2 of the focus coordinate N as legs, and the first candidate deflection angle S_8 and the second candidate deflection angle S_9 are calculated according to the following trigonometric relationships:
tan S_8 = ((X_7 + X_8)/2) / ((Y_7 + Y_8)/2);
tan S_9 = ((X_9 + X_10)/2) / ((Y_9 + Y_10)/2);
Assuming S_8 is 50° and S_9 is 30°, the deflection angle is S_7 = (S_8 + S_9)/2 = 40°, indicating that the three target objects as a whole are offset by 40° toward the right side of the speaker.
Step 104, when a photographing instruction is received, driving the speaker to output audio directionally according to the sound field angle and the deflection angle.
In a specific implementation, the user can trigger the photographing instruction by tapping a shutter control, tapping on the preview image data, or a similar operation. The camera then takes the photograph while the speaker is driven to output audio directionally according to the sound field angle and the deflection angle, that is, to emit a shutter prompt sound toward the actual area where the target object is located, so as to inform the person being photographed that a photograph is being taken.
Directional audio output can be achieved by using the nonlinear propagation effect of ultrasonic waves in air to produce highly directional audible sound (that is, directional audio).
According to nonlinear acoustics theory, two plane waves propagate nonlinearly in an inhomogeneous medium. When two electrical signals with frequencies f_1 and f_2 are input to an ultrasonic transducer (one of the components of the speaker), the transducer emits two ultrasonic waves with frequencies f_1 and f_2 into the air through mechanical vibration. As these two ultrasonic waves propagate through the air they interact nonlinearly, generating a complex sound wave that contains the fundamental frequencies f_1 and f_2, their sum frequency f_1 + f_2, their difference frequency f_1 - f_2, and harmonics of various orders. Because the sound attenuation coefficient α is proportional to the square of the frequency, the higher-frequency ultrasonic components f_1, f_2, f_1 + f_2 and the harmonics are quickly absorbed by the air, leaving the difference-frequency signal f_1 - f_2, which lies in the audible range, to continue propagating through the air.
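The frequency bookkeeping in the paragraph above can be made concrete with a small sketch. The function names are hypothetical, the example frequencies of 40 kHz and 41 kHz are illustrative assumptions rather than values from the disclosure, and the 20 Hz to 20 kHz audible band is the commonly cited nominal range. The sketch only lists the first-order components and compares absorption under the stated α ∝ f² relationship; it does not model the nonlinear interaction itself.

```python
AUDIBLE_RANGE_HZ = (20.0, 20000.0)  # commonly cited nominal range of human hearing


def interaction_components(f1: float, f2: float) -> dict:
    """First-order components produced when two ultrasonic waves interact nonlinearly."""
    return {"f1": f1, "f2": f2, "sum": f1 + f2, "difference": abs(f1 - f2)}


def is_audible(frequency_hz: float) -> bool:
    """True if the frequency lies inside the nominal audible band."""
    low, high = AUDIBLE_RANGE_HZ
    return low <= frequency_hz <= high


def relative_absorption(frequency_hz: float, reference_hz: float) -> float:
    """Ratio of attenuation coefficients, assuming alpha is proportional to frequency squared."""
    return (frequency_hz / reference_hz) ** 2
```

With f_1 = 40 kHz and f_2 = 41 kHz, the difference frequency is 1 kHz, which is audible, while the 40 kHz carrier is absorbed roughly 1600 times more strongly than the 1 kHz component under the quadratic assumption.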
Whether a sound wave is directional depends closely on the ratio of its wavelength to the size of the sound source. When the wavelength is much larger than the source size, the sound wave has no directivity; as the wavelength approaches and then becomes much smaller than the source size, the sound wave becomes increasingly directional. Therefore, when the ultrasonic frequencies f_1 and f_2 are chosen appropriately, the difference-frequency signal f_1 - f_2 can be placed in the audible range, so that an audible wave is generated by means of ultrasonic waves.
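To see why an ultrasonic carrier is directional when an ordinary audible tone from the same transducer would not be, the wavelength-to-source-size ratio can be computed directly. In this sketch the speed of sound (about 343 m/s in air near 20 °C) is a standard approximation, while the 5 cm transducer size and the function names are assumptions chosen only for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees Celsius


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in metres of a sound wave in air at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def wavelength_to_source_ratio(frequency_hz: float, source_size_m: float) -> float:
    """Ratio of wavelength to source size; values well below 1 suggest strong directivity."""
    return wavelength_m(frequency_hz) / source_size_m
```

For an assumed 5 cm source, a 40 kHz wave has a wavelength of under 1 cm, giving a ratio well below 1, whereas a 1 kHz wave has a wavelength of about 34 cm, giving a ratio of nearly 7.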
Furthermore, in parametric acoustic array theory, the ultrasonic transducer (one of the components of the speaker) emits a strongly modulated ultrasonic wave into the air. As the ultrasonic wave travels along its main propagation axis (for example, the direction indicated by the sound field angle and the deflection angle), audio-frequency signals are continuously demodulated from it through nonlinear interaction, and these continuously demodulated audio waves accumulate and superimpose, thereby forming an end-fire virtual sound source array. This virtual sound source array is the so-called parametric acoustic array, which continuously reinforces the energy of the audio wave in the direction of propagation. Because ultrasonic waves are highly directional, this cumulative reinforcement is very weak away from the main propagation axis (for example, the direction indicated by the sound field angle and the deflection angle), which ultimately gives the audio wave strong directivity along the main propagation axis.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of action combinations. However, those skilled in the art should understand that the embodiments of the present disclosure are not limited by the described order of actions, because according to the embodiments of the present disclosure some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions involved are not necessarily required by the embodiments of the present disclosure.
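Referring to FIG. 5, a structural block diagram of an embodiment of a photographing-based audio output apparatus of the present disclosure is shown, which may specifically include the following modules:
a target object determining module 501, configured to determine one or more target objects in the preview image data when the camera captures preview image data;
a sound field angle calculation module 502, configured to calculate, for the one or more target objects, a sound field angle relative to the speaker;
a deflection angle calculation module 503, configured to calculate, for the one or more target objects, a deflection angle relative to the speaker;
an audio directional output module 504, configured to drive the speaker to output audio directionally according to the sound field angle and the deflection angle when a photographing instruction is received.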
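The data flow between modules 501 to 504 might be wired together roughly as follows. This is only a structural sketch under stated assumptions: the class `PhotoAudioOutputDevice`, its method names, and the four injected callables are all hypothetical placeholders, and the actual detection, angle calculation, and speaker-driving logic are deliberately left to the caller.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class PhotoAudioOutputDevice:
    # Each callable stands in for one module; all four are hypothetical placeholders.
    determine_targets: Callable[[Any], List[Any]]     # module 501
    sound_field_angle: Callable[[List[Any]], float]   # module 502
    deflection_angle: Callable[[List[Any]], float]    # module 503
    output_audio: Callable[[float, float], None]      # module 504
    _angles: Optional[Tuple[float, float]] = field(default=None, init=False)

    def on_preview(self, frame: Any) -> None:
        """Run modules 501 to 503 whenever the camera delivers preview image data."""
        targets = self.determine_targets(frame)
        if targets:
            self._angles = (self.sound_field_angle(targets), self.deflection_angle(targets))
        else:
            self._angles = None

    def on_photograph(self) -> bool:
        """Run module 504 on a photographing instruction; return True if audio was emitted."""
        if self._angles is None:
            return False
        self.output_audio(*self._angles)
        return True
```

Caching the two angles at preview time, as done here, is one possible design choice; it keeps the work done at the moment of the photographing instruction to a single call into the output module.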
In an optional embodiment of the present disclosure, the target object determining module 501 may include the following sub-module:
a first determining sub-module, configured to detect faces in the preview image data and determine them as the one or more target objects.
In an optional embodiment of the present disclosure, the target object determining module 501 may include the following sub-modules:
a second determining sub-module, configured to, when a focus operation instruction is received, determine one or more faces in the preview image data that correspond to the focus operation instruction as the one or more target objects;
or,
a cancelling sub-module, configured to cancel the one or more determined target objects when a cancel operation instruction is received.
In an optional embodiment of the present disclosure, the sound field angle calculation module 502 may include the following sub-modules:
a target distance measuring sub-module, configured to measure a target distance between the speaker and the one or more target objects;
an audition range distance obtaining sub-module, configured to obtain an audition range distance matching the one or more target objects, the audition range distance being the extent of the range within which the audio can be heard;
a first calculating sub-module, configured to calculate the sound field angle according to the target distance and the audition range distance.
In an optional example of the embodiments of the present disclosure, the target distance measuring sub-module may further include the following sub-modules:
a first obtaining sub-module, configured to, when there is one target object, obtain a candidate distance between the camera and the target object as the target distance;
or,
a second obtaining sub-module, configured to, when there are multiple target objects, obtain multiple candidate distances between the camera and the respective target objects;
a second calculating sub-module, configured to calculate the target distance using the multiple candidate distances.
In an optional embodiment of the present disclosure, the deflection angle calculation module 503 may include the following sub-modules:
a projection sub-module, configured to project the target objects in the preview image data into a preset coordinate system, the coordinate system being constructed based on the position of the speaker;
a focus coordinate calculating sub-module, configured to calculate, in the coordinate system, the focus coordinates of the one or more target objects;
a third calculating sub-module, configured to calculate the deflection angle using the focus coordinates.
In an optional example of the embodiments of the present disclosure, the focus coordinate calculating sub-module may further include the following sub-modules:
a first searching sub-module, configured to, when there is one target object, find a first coordinate at the upper-left corner and a second coordinate at the lower-right corner of the target object;
a fourth calculating sub-module, configured to calculate the average of the first coordinate and the second coordinate as the focus coordinate;
or,
a second searching sub-module, configured to, when there are multiple target objects, find a third coordinate at the upper-left corner and a fourth coordinate at the lower-right corner of the leftmost target object, and a fifth coordinate at the upper-left corner and a sixth coordinate at the lower-right corner of the rightmost target object;
a fifth calculating sub-module, configured to calculate the average of the third coordinate and the fourth coordinate, and the average of the fifth coordinate and the sixth coordinate, respectively, as the focus coordinates.
In an optional example of the embodiments of the present disclosure, the third calculating sub-module may further include the following sub-modules:
a first candidate deflection angle calculating sub-module, configured to, when there are multiple target objects, calculate a first candidate deflection angle using the focus coordinate of the leftmost target object;
a second candidate deflection angle calculating sub-module, configured to calculate a second candidate deflection angle using the focus coordinate of the rightmost target object;
a first setting sub-module, configured to set a first feature angle as the deflection angle when the leftmost target object and the rightmost target object are located on opposite sides of the speaker;
a second setting sub-module, configured to set a second feature angle as the deflection angle when the leftmost target object and the rightmost target object are located on the same side of the speaker;
wherein the first feature angle is half of the difference between the first candidate deflection angle and the second candidate deflection angle, and the second feature angle is half of the sum of the first candidate deflection angle and the second candidate deflection angle.
Since the apparatus embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The embodiments of the present disclosure are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, such that a series of operational steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although optional embodiments of the present disclosure have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the optional embodiments and all changes and modifications that fall within the scope of the embodiments of the present disclosure.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The photographing-based audio output method and the photographing-based audio output apparatus provided by the present disclosure have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present disclosure, and the description of the above embodiments is intended only to help in understanding the method of the present disclosure and its core idea. Meanwhile, those of ordinary skill in the art may, based on the idea of the present disclosure, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present disclosure.
Claims (16)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510345291.2A CN105827931B (en) | 2015-06-19 | 2015-06-19 | It is a kind of based on the audio-frequency inputting method and device taken pictures |
| CN201510345291.2 | 2015-06-19 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2016202111A1 (en) | 2016-12-22 |
Family
ID=56514385
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/080941 Ceased WO2016202111A1 (en) | 2015-06-19 | 2016-05-04 | Audio output method and apparatus based on photographing |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN105827931B (en) |
| WO (1) | WO2016202111A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112835021A (en) * | 2020-12-31 | 2021-05-25 | 杭州海康机器人技术有限公司 | Positioning method, device, system and computer readable storage medium |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107024990B (en) * | 2017-03-31 | 2019-08-20 | 维沃移动通信有限公司 | A kind of method and mobile terminal attracting children's self-timer |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100027832A1 (en) * | 2008-08-04 | 2010-02-04 | Seiko Epson Corporation | Audio output control device, audio output control method, and program |
| JP2011201406A (en) * | 2010-03-25 | 2011-10-13 | Denso It Laboratory Inc | Outer-vehicle sound providing device, outer-vehicle sound providing method, and program |
| CN102342131A (en) * | 2009-03-03 | 2012-02-01 | 松下电器产业株式会社 | Speaker with camera, signal processing device, and AV system |
| CN103139480A (en) * | 2013-02-28 | 2013-06-05 | 华为终端有限公司 | Image acquisition method and image acquisition device |
| CN103661163A (en) * | 2012-09-21 | 2014-03-26 | 索尼公司 | Mobile object and storage medium |
| CN104185116A (en) * | 2014-08-15 | 2014-12-03 | 南京琅声声学科技有限公司 | Automatic acoustic radiation mode determining method |
| CN104469491A (en) * | 2013-09-13 | 2015-03-25 | 索尼公司 | audio delivery method and audio delivery system |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2005295502A (en) * | 2004-03-09 | 2005-10-20 | Yoshito Suzuki | Electronic device with recording and imaging function, device with recording and imaging function, method of using the same, and microphone set used therefor |
| GB0616293D0 (en) * | 2006-08-16 | 2006-09-27 | Imp Innovations Ltd | Method of image processing |
| EP2564601A2 (en) * | 2010-04-26 | 2013-03-06 | Cambridge Mechatronics Limited | Loudspeakers with position tracking of a listener |
| CN102595275B (en) * | 2012-02-29 | 2014-12-03 | 长城汽车股份有限公司 | Vehicle loudspeaker system with adjustable sound field |
| CN102970484B (en) * | 2012-11-27 | 2016-02-24 | 惠州Tcl移动通信有限公司 | A kind of method of auditory tone cues and the electronic equipment based on the method when taking pictures |
| CN103337175A (en) * | 2013-06-22 | 2013-10-02 | 太仓博天网络科技有限公司 | Vehicle type recognition system based on real-time video steam |
- 2015-06-19: CN application CN201510345291.2A, patent CN105827931B (en), status: Active
- 2016-05-04: WO application PCT/CN2016/080941, publication WO2016202111A1 (en), status: Ceased
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112835021A (en) * | 2020-12-31 | 2021-05-25 | 杭州海康机器人技术有限公司 | Positioning method, device, system and computer readable storage medium |
| CN112835021B (en) * | 2020-12-31 | 2023-11-14 | 杭州海康威视数字技术股份有限公司 | Positioning method, device, system and computer-readable storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN105827931B (en) | 2019-04-12 |
| CN105827931A (en) | 2016-08-03 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16810852; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16810852; Country of ref document: EP; Kind code of ref document: A1 |