WO2018014730A1 - Method for adjusting parameters of camera, broadcast-directing camera, and broadcast-directing filming system - Google Patents
- Publication number
- WO2018014730A1 (PCT/CN2017/091863)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- camera
- video object
- target
- navigation
- target video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
Definitions
- the present invention relates to the field of image processing technologies, and in particular, to a camera parameter adjustment method, a navigation camera, and a system.
- FIG. 1 is a schematic diagram of a video conference.
- the conference room adopts a long elliptical conference table
- the participants' seats surround the conference table
- the participants include A and B
- the participants A and B sit opposite each other.
- Cameras C0 and C1 are arranged on both sides of the projection screen in front of A and B.
- the camera parameters are usually adjusted manually by remote control or other means to obtain a better shooting result.
- the manual adjustment method requires the operator to have certain camera expertise and the operation process is cumbersome, which makes the adjustment efficiency low, and the better shooting effect cannot be guaranteed in time.
- the camera can also be determined by sound source localization so as to adjust the camera's shooting effect.
- the sound source localization works by locating and tracking the participant who is speaking (i.e., the "speaker") while a camera captures a close-up of the speaker, tracking the speaker's face position and performing PTZ (Pan-Tilt-Zoom) adjustment of the lens so that the speaker's face is in the middle of the image.
- the sound source localization method only considers adjusting the front side of the speaker to the center of the image, and does not consider the effect of the image being taken, nor can it guarantee a better shooting effect.
- the embodiment of the invention provides a camera parameter adjustment method, a guide camera and a system, which can improve the efficiency of camera parameter adjustment and improve the camera shooting effect.
- an embodiment of the present invention provides a camera parameter adjustment method, where the method is applied to a navigation camera, including:
- acquiring a first three-dimensional coordinate of the target video object, where the first three-dimensional coordinate is the three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera;
- the first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
- the target camera may be a navigation camera or an ordinary PTZ camera, and the first coordinate system corresponding to the target camera may be a three-dimensional coordinate system established with the optical center of the target camera as the origin, or with any other reference object as the origin; the choice of three-dimensional coordinate system is not limited in the embodiment of the present invention.
- the target video object may be any one or more video objects in the shooting scene corresponding to the navigation camera system where the navigation camera is located.
- the acquiring the first three-dimensional coordinates of the target video object comprises:
- second three-dimensional coordinates transmitted by a binocular camera connected to the navigation camera are acquired, and the second three-dimensional coordinates are converted into the first three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the target camera.
- the second three-dimensional coordinates may be calculated from the two-dimensional coordinates of the target video object acquired respectively in the left and right views of the binocular camera and the internal and external parameters of the binocular camera.
- the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera
- the second coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as the origin.
- the two-dimensional coordinates may specifically be corresponding pixel coordinates of the target video object in the left view and the right view of the binocular camera.
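As a rough numeric sketch of the triangulation and conversion steps described above, the following assumes an ideal rectified stereo pair and a known relative pose between the binocular camera and the target camera (all function names, parameter values, and the simplified rectified-stereo model are illustrative, not the patent's exact method):

```python
import numpy as np

def triangulate_rectified(xl, xr, y, f, baseline, cx, cy):
    """Second 3-D coordinate of a feature in the binocular camera's frame.

    Assumes an ideal rectified stereo pair sharing focal length f (pixels),
    principal point (cx, cy), and a horizontal baseline (metres).
    """
    disparity = xl - xr           # horizontal pixel disparity between views
    Z = f * baseline / disparity  # depth from disparity
    X = (xl - cx) * Z / f
    Y = (y - cy) * Z / f
    return np.array([X, Y, Z])

def to_target_camera(p_binocular, R, t):
    """First 3-D coordinate: apply the pre-calibrated relation (R, t)
    between the binocular camera and the target camera."""
    return R @ p_binocular + t

# A feature seen at x=740 / x=690 in the left/right views:
p2 = triangulate_rectified(xl=740, xr=690, y=360, f=1000.0,
                           baseline=0.10, cx=640, cy=360)
# Target camera displaced 0.5 m to the right of the binocular rig:
p1 = to_target_camera(p2, R=np.eye(3), t=np.array([-0.5, 0.0, 0.0]))
```

With these illustrative numbers the disparity is 50 pixels, giving a depth of 2 m in the binocular frame and a shifted coordinate in the target camera's frame.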
- the determining a target video object that needs to be captured comprises:
- selecting, according to a preset navigation strategy, the target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located includes:
- the camera whose shooting effect parameter satisfies the preset guiding strategy is determined as the target camera for capturing the target video object.
- determining the target video object from the captured image acquired by the camera comprises:
- the video object is determined to be the target video object.
- the current camera is any camera, other than the binocular camera, that has a calibrated positional relationship with the navigation camera in the navigation camera system, and the third three-dimensional coordinate is the three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera.
- the shooting effect parameter includes any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting area of the target video object in a coordinate system corresponding to the current camera.
- the current camera is any camera other than the binocular camera in the navigation camera system.
- the eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera, the rotation angle being determined according to the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera. The smaller the rotation angle, the better the eye-to-eye effect.
- the occlusion relationship parameter and the scene object parameter may be determined by re-projecting the regions of the objects detected by the current camera onto the imaging plane of the current camera, according to the pre-calibrated positional relationship between the binocular camera and the current camera.
- the output image effect is better when there is no occlusion relationship (that is, the smaller the occlusion relationship parameter, the better); likewise, the smaller the area and number of scene objects, the better the output image effect, and conversely the worse.
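The selection of a target camera by shooting-effect parameters could be sketched as follows; the scoring rule (smallest rotation angle, preferring unoccluded views and fewer scene objects, with a hypothetical 45° cutoff) is one illustrative navigation strategy, not the one fixed by the patent:

```python
def select_target_camera(effects, max_angle_deg=45.0):
    """Pick the camera whose shooting-effect parameters best satisfy a
    simple (hypothetical) navigation strategy: reject cameras whose
    rotation angle exceeds max_angle_deg, then prefer unoccluded views,
    fewer scene objects, and finally the smallest rotation angle.

    `effects` maps camera id -> dict with keys
    'angle' (degrees), 'occluded' (bool), 'scene_objects' (count).
    """
    candidates = {cam: e for cam, e in effects.items()
                  if e['angle'] <= max_angle_deg}
    if not candidates:
        return None
    return min(candidates,
               key=lambda cam: (candidates[cam]['occluded'],
                                candidates[cam]['scene_objects'],
                                candidates[cam]['angle']))

effects = {
    'PTZ0': {'angle': 10.0, 'occluded': True,  'scene_objects': 0},
    'PTZ1': {'angle': 25.0, 'occluded': False, 'scene_objects': 1},
    'PTZ2': {'angle': 60.0, 'occluded': False, 'scene_objects': 0},
}
best = select_target_camera(effects)
```

Here PTZ2 is rejected by the angle cutoff and PTZ0 is occluded, so PTZ1 is chosen despite its larger angle.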
- an embodiment of the present invention further provides a navigation camera, including: a memory and a processor, wherein the processor is connected to the memory;
- the memory is used to store driver software
- the processor reads the driver software from the memory and, by running the driver software, performs some or all of the steps of the camera parameter adjustment method of the first aspect described above.
- the embodiment of the present invention further provides a parameter adjustment apparatus, including an object determining unit, a selecting unit, an acquiring unit, and a parameter adjusting unit, where the parameter adjustment apparatus implements some or all of the steps of the camera parameter adjustment method of the first aspect by using the foregoing units.
- an embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores a program, and the program includes some or all of the steps of the camera parameter adjustment method of the first aspect.
- the embodiment of the present invention further provides a navigation camera system, including a first camera and at least one second camera, where the first camera includes a navigation camera and a binocular camera; the navigation camera and the binocular camera, and the first camera and the second camera, are connected by a wired interface or a wireless interface;
- the guidance camera is configured to determine a target video object that needs to be captured, and select a target camera for capturing the target video object from a camera of the navigation camera system according to a preset navigation strategy;
- the binocular camera is configured to acquire a second three-dimensional coordinate of the target video object and transmit it to the navigation camera, where the second three-dimensional coordinate is the three-dimensional coordinate of the target video object in the second coordinate system corresponding to the binocular camera;
- the navigation camera is configured to receive the second three-dimensional coordinate transmitted by the binocular camera; convert the second three-dimensional coordinate into the first three-dimensional coordinate according to a pre-calibrated positional relationship between the binocular camera and the target camera; adjust the imaging parameters of the target camera to the imaging parameters corresponding to the first three-dimensional coordinate; and output the video image after the adjustment, where the first three-dimensional coordinate is the three-dimensional coordinate of the target video object in the first coordinate system corresponding to the target camera.
- the second camera may include a navigation camera and a binocular camera, in which case the target camera may be any of the navigation cameras in the navigation camera system; or the second camera may be an ordinary PTZ camera, in which case the target camera may be the navigation camera or an ordinary PTZ camera.
- the binocular camera can be placed on a preset guide bracket and connected to the guide camera via the guide bracket.
- the target camera for capturing the target video object is selected from the cameras of the navigation camera system according to a preset navigation strategy, and the three-dimensional coordinate of the target video object in the coordinate system corresponding to the target camera is obtained, so as to control the target camera to adjust its camera parameters according to that three-dimensional coordinate and output the video image after the adjustment. In this way, the navigation camera system can improve the accuracy of video object detection and tracking based on three-dimensional coordinate detection and the preset navigation strategy, improve the efficiency of camera parameter adjustment, and effectively improve the camera's shooting effect.
- FIG. 1 is a schematic diagram of a scene of a video conference
- FIG. 2 is a schematic flowchart of a method for adjusting a camera parameter according to an embodiment of the present invention
- FIG. 3a is a schematic diagram of a camera imaging model according to an embodiment of the present invention.
- FIG. 3b is a schematic diagram of a calibration scenario of a multi-camera according to an embodiment of the present invention.
- FIG. 3c is a schematic diagram of a three-dimensional positioning of a binocular camera according to an embodiment of the present invention.
- FIG. 3d is a schematic diagram of a PTZ camera rotation model according to an embodiment of the present invention.
- FIG. 4a is a schematic diagram of a video object matching scenario according to an embodiment of the present invention.
- Figure 4b is an image view of a set of video objects in Figure 4a;
- FIG. 5 is a schematic structural diagram of a parameter adjustment apparatus according to an embodiment of the present invention.
- FIG. 6 is a schematic structural diagram of a navigation camera system according to an embodiment of the present invention.
- FIG. 7 is a schematic structural diagram of a first camera according to an embodiment of the present invention.
- FIG. 8 is a schematic diagram of networking of a navigation camera system according to an embodiment of the present invention.
- FIG. 9 is a schematic structural diagram of a navigation camera according to an embodiment of the present invention.
- the navigation camera according to the embodiment of the present invention may specifically be a PTZ camera for performing the technical solution of the embodiment and can be connected to a binocular camera; the navigation camera can be applied to scenarios such as conferences or training, and the position and number of deployed navigation cameras can be varied according to the scenario.
- the binocular camera can be mounted on a guide bracket, that is, the guide camera can be coupled to the binocular camera via a guide bracket (referred to as a "bracket").
- the navigation camera is used for guiding shooting and tracking.
- a microphone can be mounted on the bracket, which can be used for sound source localization, sound source recognition and the like.
- the camera and the bracket may be separate or integrated, and a communication interface such as a serial interface may be used for communication between the camera and the bracket.
- the binocular camera can be used for video capture, video pre-processing, motion detection, face detection, humanoid detection, scene object detection, feature detection/matching, binocular camera calibration, multi-camera calibration, etc.
- the microphone can be used for audio collection.
- the navigation camera can be used for audio-video ("AV") object 3D positioning, AV object modeling, AV object tracking, motion/gesture recognition, navigation control, video switching/synthesis, and the like.
- the video capture includes synchronously collecting the video streams of the binocular camera and the navigation camera; the video pre-processing includes pre-processing the input binocular images, such as noise reduction and changing resolution and frame rate; motion detection includes detecting moving objects in the scene and separating them from the stationary background to obtain the moving object region; face detection includes detecting face target objects in the scene and outputting face detection information such as face position, area, and direction; humanoid detection includes detecting the human head and shoulder area in the scene and outputting detection information; scene object detection includes detecting objects other than people in the scene, such as a lamp, a window, or a conference table; feature detection/matching includes performing feature detection and matching on the detected moving object region, detecting characteristic objects (such as feature points) in one image, matching them in another image, and outputting the matched feature object information; binocular camera calibration includes calibrating the binocular camera to obtain its internal and external parameters for video image calculation.
- multi-camera calibration includes calibrating the relative positional relationships among the multiple navigation cameras to obtain their relative external parameters, which are used for positioning video objects in each camera's coordinate system.
- the audio collection includes synchronously acquiring the multi-channel audio data of the microphone;
- the audio pre-processing includes performing 3A processing on the input multi-channel audio data, where audio 3A processing includes acoustic echo cancellation (AEC), automatic noise suppression (ANS), and automatic gain control (AGC);
- sound source localization includes detecting input multi-channel audio data to find two-dimensional position information of the sounding object;
- sound source behavior recognition includes detecting and counting the voice behavior of the video object in the scene.
- the 3D positioning of the AV object includes obtaining the depth information of object features in the image from the internal and external parameters of the binocular camera and the parallax obtained by feature detection/matching, and, combined with the audio localization result, obtaining the three-dimensional position of the object feature in a single navigation camera's coordinate system;
- the position of a feature in the other navigation cameras' coordinate systems can be obtained from its position in a single navigation camera's coordinate system and the relative positional relationships of the navigation cameras;
- AV object modeling includes combining information such as sound source localization, face information, feature objects, and scene objects to construct a model of the AV object;
- the AV object tracking includes tracking a plurality of AV objects in the scene, and updating state information of the object;
- the motion/gesture recognition includes recognizing actions, gestures, and the like of the AV object, for example, identifying a subject's standing posture or a gesture action;
- the navigation control includes determining a navigation strategy in combination with the results of the motion/gesture recognition and the sound source behavior recognition, and outputting the camera control instruction corresponding to the navigation strategy, the video object and scene feature information, the video output strategy, and so on.
- the camera control command can be used to control the PTZ camera to perform PTZ operation, that is, pan, tilt, zoom operation, etc., and the video object and scene feature information can be used for information sharing between multiple guide cameras.
- the video output strategy can be used to control the output strategy of single or multiple camera video streams.
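A minimal sketch of how a camera control command might be derived from a target's three-dimensional coordinate: the pan/tilt angles centre the target, and the focal-length heuristic for zoom is purely illustrative (the coordinate convention assumed is x right, y down, z along the optical axis; function and parameter names are not from the patent):

```python
import math

def ptz_for_target(X, Y, Z, face_height_m=0.25, desired_frac=0.3,
                   sensor_height_mm=5.0):
    """Pan/tilt (degrees) that centre the point (X, Y, Z) given in the
    PTZ camera's own frame (x right, y down, z forward), plus an
    illustrative focal length (mm) so that an object of height
    face_height_m fills desired_frac of the frame."""
    pan = math.degrees(math.atan2(X, Z))
    tilt = math.degrees(math.atan2(-Y, math.hypot(X, Z)))
    dist = math.sqrt(X * X + Y * Y + Z * Z)
    # Similar triangles: focal / sensor_height = desired_frac * dist / object_height
    focal_mm = sensor_height_mm * desired_frac * dist / face_height_m
    return pan, tilt, focal_mm

# A speaker 1 m to the right of and 1 m in front of the camera:
pan, tilt, focal = ptz_for_target(1.0, 0.0, 1.0)
```

The pan comes out at 45° with zero tilt for this point, i.e. the camera turns right to centre the speaker.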
- the embodiment of the invention provides a camera parameter adjustment method, a guide camera and a system, which can improve the efficiency of camera parameter adjustment and improve the camera shooting effect. The details are explained below.
- FIG. 2 is a schematic flowchart of a camera parameter adjustment method according to an embodiment of the present invention. The method may specifically be applied to the above-mentioned navigation camera. As shown in FIG. 2, the camera parameter adjustment method in the embodiment of the present invention may include the following steps:
- Select, according to a preset navigation strategy, a target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located.
- the determining the target video object that needs to be captured may be specifically: acquiring a captured image transmitted by the binocular camera, the captured image includes at least one video object; and establishing a video object model including the at least one video object And determining a target video object from the at least one video object.
- selecting, according to a preset navigation strategy, the target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located may specifically be: determining the target video object separately in the captured images acquired by each camera of the navigation camera system and acquiring the shooting effect parameter of the target video object in each camera; and determining the camera whose shooting effect parameter meets the preset navigation strategy as the target camera for capturing the target video object.
- One or more navigation cameras can be deployed in the navigation camera system; that is, the system can be deployed as navigation camera + navigation camera, or as navigation camera + ordinary camera (such as an ordinary PTZ camera).
- the video object model may include all video objects in the shooting scene corresponding to the navigation camera system where the navigation camera is located. If the navigation camera system includes other navigation cameras, the captured images transmitted by the binocular cameras connected to those navigation cameras may also be received, and the video object model updated to obtain a model covering all video objects in the shooting scene.
- the target video object may be any one or more video objects in the shooting scene.
- the shooting effect parameter may include any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting area of the target video object in a coordinate system corresponding to the current camera.
- the current camera is any camera other than the binocular camera in the navigation camera system, that is, the current camera may be any of the guidance cameras or ordinary PTZ cameras in the guidance camera system.
- the eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera, and the rotation angle may be determined according to the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
- the rotation angle of the target video object relative to the coordinate system corresponding to the current camera may refer to the angle between the face or humanoid object corresponding to the target video object and the optical axis of the current camera (a navigation camera or an ordinary PTZ camera). The smaller the angle, the more frontal the face appears, i.e., the better the eye-to-eye effect and the better the output image effect.
- the occlusion relationship parameter and the scene object parameter may be determined by re-projecting the regions of the video objects and scene objects detected by the current camera onto the imaging plane of the current camera, according to the pre-calibrated positional relationship between the binocular camera and the current camera. Specifically, if the regions of two video objects overlap, depth information may be used to determine the occlusion relationship between them: the video object closer to the binocular camera occludes the farther one.
- the output image effect is better when there is no occlusion relationship (that is, the smaller the occlusion relationship parameter, the better).
- the scene objects indicated by the scene object parameter may include a lamp, a window, a table, and the like; the smaller the area and the number of scene objects in the shot, the better the output image effect, and conversely the worse.
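The depth-based occlusion test described above can be sketched as follows, assuming each video object has been re-projected to an axis-aligned bounding box on the current camera's imaging plane (the box format and function names are illustrative, not from the patent):

```python
def overlap(a, b):
    """Intersection area of two boxes (x1, y1, x2, y2) on the imaging plane."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def occlusion(box_a, depth_a, box_b, depth_b):
    """If the re-projected regions of two video objects overlap, the one
    closer to the binocular camera occludes the farther one.  Returns
    (occluder, occluded) or None when the regions are disjoint."""
    if overlap(box_a, box_b) == 0:
        return None
    return ('a', 'b') if depth_a < depth_b else ('b', 'a')

# Object a (2 m away) partly in front of object b (3 m away):
result = occlusion((100, 100, 200, 300), 2.0, (150, 120, 260, 310), 3.0)
```

Here the boxes overlap, so the nearer object a is reported as occluding object b.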
- the first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
- the target camera can be configured as the above-mentioned navigation camera or a normal PTZ camera.
- the first coordinate system corresponding to the target camera can be a three-dimensional coordinate system established with the optical center of the target camera as the origin, or with any other reference object as the origin; the choice of three-dimensional coordinate system is not limited in the embodiment of the present invention.
- the navigation camera can be connected to a preset binocular camera.
- the acquiring the first three-dimensional coordinates of the target video object may specifically be: acquiring second three-dimensional coordinates transmitted by the binocular camera connected to the navigation camera, and converting the second three-dimensional coordinates into the first three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the target camera.
- the second three-dimensional coordinates may be calculated from the two-dimensional coordinates of the target video object acquired respectively in the left view and the right view of the binocular camera and the internal and external parameters of the binocular camera.
- the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera
- the second coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as the origin.
- the two-dimensional coordinates may specifically be corresponding pixel coordinates of the target video object in the left view and the right view of the binocular camera.
- the positional relationship between the left and right cameras of the binocular camera, the positional relationship between the navigation camera and the binocular camera, and the positional relationships among the cameras of a multi-camera system can all be calibrated in advance.
- the parameters obtained by binocular camera calibration can be used to calculate the three-dimensional coordinates of a video object in the coordinate system corresponding to the binocular camera; the calibrated positional relationship between the navigation camera and the binocular camera can be used to calculate the three-dimensional coordinates of the video object in the navigation camera's coordinate system; and the calibrated positional parameters between the cameras of a multi-camera deployment can be used to calculate the three-dimensional coordinates of the video object in the coordinate system of each camera position, so as to facilitate coordinate conversion.
- the multi-camera deployment may be the above-mentioned navigation camera + navigation camera mode, or the navigation camera + ordinary PTZ camera mode.
- Each navigation camera can be referred to as a single camera position. When multiple cameras are used for cooperative shooting, a host position can be determined among them, and the remaining navigation cameras serve as slave positions.
- the binocular camera includes a left camera and a right camera.
- the image acquired by the left camera may be referred to as a left view
- the image acquired by the right camera may be referred to as a right view.
- the imaging (projection) model of a single camera can be described by the following formula:

  x = PX, where P = K [R | t]

- x is the pixel coordinate of a certain point in the scene (i.e., a video object, specifically a feature point corresponding to the video object) in the image coordinate system, a two-dimensional (homogeneous) coordinate;
- X is the (homogeneous) three-dimensional coordinate of that point in the world coordinate system;
- P is a 3×4 projection matrix;
- PX denotes the matrix product P × X.
- K is a 3×3 camera intrinsic matrix, which can be expressed as:

  K = [ f_x  s    c_x
        0    f_y  c_y
        0    0    1  ]

- f_x, f_y are the equivalent focal lengths in the x and y directions;
- c_x, c_y are the image coordinates of the optical center;
- s is the skew coefficient (nonzero when the sensor axes are not perpendicular to the optical axis; it is usually small and can be ignored during calibration).
- R and t are the camera external parameters, represented as a 3×3 rotation matrix and a 3×1 translation vector, respectively, as follows:

  R = [r_1 r_2 r_3],  t = (t_x, t_y, t_z)^T

- r_1, r_2, r_3 are the 3×1 column vectors of the rotation matrix.
- the model of camera image distortion can be described by the following formulas (with r^2 = x_d^2 + y_d^2):

  x_p = x_d(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x_d y_d + p_2 (r^2 + 2 x_d^2)
  y_p = y_d(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y_d^2) + 2 p_2 x_d y_d

- x_p, y_p are the corrected pixel positions;
- x_d, y_d are the pre-correction pixel positions;
- k_1, k_2, k_3 are the radial distortion coefficients;
- p_1, p_2 are the tangential distortion coefficients.
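The projection and distortion models above can be exercised numerically as follows; the intrinsic values are illustrative, and the distortion function follows the patent's stated convention of computing corrected positions from pre-correction ones (function names are assumptions, not from the patent):

```python
import numpy as np

def project(K, R, t, Xw):
    """Pinhole projection x = K [R | t] X for a 3-D scene point Xw."""
    Xc = R @ Xw + t      # world -> camera coordinates
    x = K @ Xc           # homogeneous image coordinates
    return x[:2] / x[2]  # pixel coordinates

def undistort_normalized(xd, yd, k1, k2, k3, p1, p2):
    """Brown model: corrected normalized coordinates from distorted ones."""
    r2 = xd * xd + yd * yd
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xp = xd * radial + 2 * p1 * xd * yd + p2 * (r2 + 2 * xd * xd)
    yp = yd * radial + p1 * (r2 + 2 * yd * yd) + 2 * p2 * xd * yd
    return xp, yp

# Illustrative intrinsics: f_x = f_y = 1000 px, principal point (640, 360):
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
x = project(K, np.eye(3), np.zeros(3), np.array([0.2, 0.0, 2.0]))
```

With zero distortion coefficients the correction is the identity, which is a quick sanity check of the formulas.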
- the positional relationship between the left and right cameras of the binocular camera and the positional relationship between a navigation camera (such as a PTZ camera) and its binocular camera are fixed, so these two calibrations can be completed before leaving the factory; that is, the data obtained from the two calibrations is stored as fixed internal and external parameter data.
- the calibration of the camera may adopt various schemes, such as Zhang's plane calibration method (also referred to as "Zhang's calibration method"), with the distortion parameters calculated by the Brown method, which is not described herein.
- the binocular camera calibration principle above shows that calibrating the positional relationships of multiple navigation cameras amounts to finding the relative external parameters between adjacent navigation cameras; from these, the external parameters between any two navigation cameras can be calculated, thereby obtaining the positional relationship between any two navigation cameras.
- When multiple navigation cameras are deployed, a large overlapping shooting area is required between adjacent cameras.
- the multiple camera positions are similar to a surround multi-camera system.
- the rotation matrix and translation vector of the i-th camera relative to the j-th camera are:

  R_{i,j} = R_{i,i-1} R_{i-1,i-2} ... R_{j+1,j}
  t_{i,j} = t_{i,i-1} + R_{i,i-1} t_{i-1,j}
- R_{i,i-1} R_{i-1,i-2} ... R_{j+1,j} denotes the matrix product R_{i,i-1} × R_{i-1,i-2} × ... × R_{j+1,j}. Since the positions of the cameras on the different guide brackets change with the actual deployment scenario, the positional relationship between multiple navigation cameras cannot be pre-calibrated before the device leaves the factory; instead, on-site calibration can be performed when the navigation cameras are deployed.
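The chained composition of relative poses can be sketched as follows, assuming the convention X_i = R_{i,i-1} X_{i-1} + t_{i,i-1} for adjacent cameras (function names and the example poses are illustrative):

```python
from functools import reduce
import numpy as np

def compose(pose_ik, pose_kj):
    """Chain relative poses: if X_i = R_ik X_k + t_ik and
    X_k = R_kj X_j + t_kj, then X_i = R_ij X_j + t_ij with
    R_ij = R_ik R_kj and t_ij = R_ik t_kj + t_ik."""
    (R_ik, t_ik), (R_kj, t_kj) = pose_ik, pose_kj
    return R_ik @ R_kj, R_ik @ t_kj + t_ik

def chain(adjacent):
    """Pose of camera i relative to camera j from the adjacent poses
    [(R_{j+1,j}, t_{j+1,j}), ..., (R_{i,i-1}, t_{i,i-1})]."""
    return reduce(lambda acc, p: compose(p, acc), adjacent)

# Camera 1 maps camera-0 coordinates with a 1 m x-shift; camera 2 is
# rotated 90 degrees about z relative to camera 1 with no translation:
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R20, t20 = chain([(np.eye(3), np.array([1.0, 0.0, 0.0])),
                  (Rz90, np.zeros(3))])
```

The result gives camera 0's coordinates expressed in camera 2's frame by multiplying the adjacent rotations in the same order as the formula above.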
- FIG. 3b is a schematic diagram of a calibration scenario of a multi-guide camera according to an embodiment of the present invention.
- a local area network ("LAN") or wireless fidelity (Wi-Fi) connection can be used between two adjacent navigation cameras or between a binocular camera and a navigation camera.
- the transmission protocol can use a variety of network protocols, such as the HyperText Transfer Protocol ("HTTP").
- a global ID number can be assigned in advance to each navigation camera and to the cameras (binocular cameras) connected to them. A camera is selected as the starting position, for example the leftmost or rightmost camera.
- the ID numbers of other cameras are incremented counterclockwise or clockwise.
- a group of cameras is then selected among all cameras to participate in the calibration; the selection principle may be to maximize the overlapping area between adjacent cameras.
- as shown in FIG. 3b, each position includes a navigation camera (denoted PTZ0, PTZ1, and PTZ2, respectively) and a binocular camera (each binocular camera includes a left and a right camera, denoted C0, C1, C2, C3, C4, C5).
- cameras with ID numbers C0, C2, and C4 are selected for calibration, and one of the navigation cameras is selected as the calibration computing device, such as the above-mentioned navigation camera at the host position.
- the external parameter calibration between two adjacent cameras can be performed from left to right or from right to left, obtaining the relative external parameters between each pair of cameras.
- a camera relative-position relationship table can be maintained in each navigation camera, as shown in Table 1 below. Each calibration adds or updates one entry in the table, and each entry is uniquely identified by its two camera ID numbers.
- the navigation camera acting as the calibration computing device can send the positional relationship table to all other navigation cameras for storage via the network. Further, from the position relationship table and the external parameters of each binocular camera (which can be calibrated at the factory), the positional relationship between any two cameras in the calibration scene can be calculated (including between binocular cameras, between a binocular camera and a PTZ camera, and between PTZ cameras).
- assuming the navigation camera D3 in FIG. 3b is the calibration computing device and the navigation cameras D1 and D2 are the cameras to be calibrated, the navigation camera D3 can be set as the calibration host position, the other cameras are set as slave positions, and the calibration is initiated by camera D3. Before calibration, it is necessary to ensure that the cameras are interconnected through the network, that the cameras to be calibrated can capture overlapping areas, and that a calibration template (such as a checkerboard template) is placed in the overlapping area.
- the navigation camera D3 starts the calibration process and sends an image acquisition command to the navigation camera D1.
- the acquisition command includes the ID number of the navigation camera D1 (i.e., D1) and the ID number of the binocular camera from which images are to be acquired (C4 or C5).
- after receiving the acquisition command, the navigation camera D1 captures an image of the calibration template and transmits the collected image data to the navigation camera D3. Similarly, the navigation camera D3 obtains the calibration template image taken by the binocular camera on the navigation camera D2. If the binocular camera to be calibrated is located on the navigation camera D3 itself, the navigation camera D3 can acquire its calibration template image directly. After obtaining the calibration template images of the cameras to be calibrated, the navigation camera D3 can perform checkerboard corner detection on the two images. If all checkerboard corner points can be detected in both images, the acquisition is considered successful; otherwise the two images are discarded and reacquired.
- in this way, a plurality of calibration template images of the two cameras to be calibrated are cyclically acquired and saved in the navigation camera D3.
- the navigation camera D3 can then perform the camera calibration; since the internal parameters of each camera have been calibrated at the factory, they can be used as initial input values for the calibration.
- the relative external parameters R and T between the two cameras are obtained, and the reprojection error is calculated and compared against a preset threshold. If the reprojection error is greater than the threshold, the calibration is considered to have failed; otherwise, the calibration is considered successful.
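The reprojection-error check can be sketched as follows; the function names and the 0.5-pixel threshold are illustrative assumptions, not values from the patent.

```python
import math

def rms_reprojection_error(detected, reprojected):
    """RMS pixel distance between detected corner positions and corners
    reprojected through the estimated camera parameters."""
    assert len(detected) == len(reprojected) and detected
    total = sum((dx - rx) ** 2 + (dy - ry) ** 2
                for (dx, dy), (rx, ry) in zip(detected, reprojected))
    return math.sqrt(total / len(detected))

def calibration_succeeded(detected, reprojected, threshold=0.5):
    """Accept the calibration only if the RMS error is below the
    threshold (0.5 px is an illustrative value, not from the patent)."""
    return rms_reprojection_error(detected, reprojected) < threshold
```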
- the positional relationship table can be updated based on the calculated relative position relationship of the camera and transmitted to other navigation cameras.
- in this way, a video object within the shooting range of the navigation cameras can be positioned and its three-dimensional position information obtained, so that an appropriate navigation camera position is determined according to the acquired three-dimensional position information, the parameters of that navigation camera are adjusted according to the navigation strategy corresponding to the three-dimensional position information, and the navigation camera is controlled to shoot the video object at the appropriate position.
- the positioning of the video object includes three-dimensional positioning by a binocular camera, positioning by a single camera such as a PTZ camera, and three-dimensional positioning across the cameras of a multi-position system.
- the stereoscopic image captured by the binocular camera can be used to calculate the depth position information of a certain observation point in the camera coordinate system, thereby determining the three-dimensional position information of the observation point.
- This method follows the same principle by which the human eye perceives depth, and is called binocular camera ranging.
- FIG. 3c provides a three-dimensional positioning schematic diagram of a binocular camera. The following describes the ranging principle of the binocular camera system, where P is an observation point in the world coordinate system that is imaged by both the left and right cameras.
- suppose the position of point P in the physical coordinate system of the left camera is (X_L, Y_L, Z_L), and the pixel coordinates of its imaging point in the left view are (x_l, y_l);
- its position in the physical coordinate system of the right camera is (X_R, Y_R, Z_R), and the pixel coordinates of its imaging point in the right view are (x_r, y_r); assume the relative external parameters of the left and right cameras are R, T;
- and the focal lengths of the left and right cameras are f_l and f_r, respectively.
- the relationship between the imaging model of the left and right cameras and the physical coordinate positions is as follows: x_l = f_l · X_L / Z_L, y_l = f_l · Y_L / Z_L; x_r = f_r · X_R / Z_R, y_r = f_r · Y_R / Z_R; and [X_R, Y_R, Z_R]^T = R · [X_L, Y_L, Z_L]^T + T.
- the values of x_l, y_l, x_r, y_r can be obtained by image matching, and f_l, f_r, R, T by binocular camera calibration, so the values of X_L, Y_L, Z_L and X_R, Y_R, Z_R can be calculated, thereby determining the three-dimensional coordinates of observation points in the scene in the coordinate system corresponding to the binocular camera.
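For the common special case of a rectified binocular pair (optical axes parallel, R = I, T a pure baseline along x, f_l = f_r = f), the equations above reduce to depth from disparity. A hedged sketch (the rectification assumption and the function name are ours, not the patent's):

```python
def triangulate_rectified(x_l, y_l, x_r, f, baseline):
    """Depth and 3-D position of point P in the left-camera frame for a
    rectified binocular pair (optical axes parallel, baseline along x).
    Pixel coordinates are taken relative to the principal point."""
    disparity = x_l - x_r
    if disparity <= 0:
        raise ValueError("point at infinity or mismatched correspondence")
    z = f * baseline / disparity        # Z_L = f * B / d
    x = x_l * z / f                     # X_L = x_l * Z_L / f
    y = y_l * z / f                     # Y_L = y_l * Z_L / f
    return x, y, z
```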
- FIG. 3 is a schematic diagram of a PTZ camera rotation model according to an embodiment of the present invention. As shown in the figure, since the PTZ camera is a zoom camera, it is necessary to obtain the functional relationship between the zoom factor Z and internal parameters such as the focal length and the distortion coefficients.
- a polynomial can be used to fit the relationship between the zoom factor Z and the focal lengths f_x, f_y, obtaining a relationship of the form f_x(Z) = a_0 + a_1·Z + a_2·Z^2 + ... (and similarly for f_y).
- at several sampled zoom factors, the camera internal parameters are obtained, the corresponding f_x, f_y and distortion coefficients are calculated, and the polynomial coefficients are fitted using the least squares method.
- Other internal parameters such as distortion coefficients can also be processed in a similar manner.
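The least-squares polynomial fit of focal length against zoom factor can be sketched as follows. This is a generic normal-equations implementation (degree and sample values are illustrative assumptions), not the patent's code.

```python
def polyfit(zs, fs, degree=2):
    """Least-squares fit f(Z) = a0 + a1*Z + ... of focal lengths measured
    at several zoom factors. Solves the normal equations (A^T A) c = A^T f
    by Gaussian elimination with partial pivoting; illustrative only."""
    n = degree + 1
    A = [[z ** j for j in range(n)] for z in zs]
    ata = [[sum(A[r][i] * A[r][j] for r in range(len(zs))) for j in range(n)]
           for i in range(n)]
    atf = [sum(A[r][i] * fs[r] for r in range(len(zs))) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atf[col], atf[piv] = atf[piv], atf[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            atf[r] -= factor * atf[col]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (atf[r] - s) / ata[r][r]
    return coeffs  # [a0, a1, a2, ...]
```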
- then, the pan and tilt adjustment amounts Δp and Δt can be calculated according to the Pan/Tilt model formula.
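The patent's exact Pan/Tilt model formula is not reproduced in this excerpt; a commonly used model computes the pan and tilt angles that aim the optical axis at a 3-D point in the PTZ camera's coordinate system. The axis convention below (z forward, x right, y down) is an assumption for illustration:

```python
import math

def pan_tilt_to_target(x, y, z):
    """Pan/tilt angles (degrees) aiming the optical axis at point
    (x, y, z) in the PTZ camera frame. A common geometric model, not
    necessarily the patent's exact formula."""
    pan = math.degrees(math.atan2(x, z))
    tilt = math.degrees(math.atan2(y, math.hypot(x, z)))
    return pan, tilt
```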
- the captured image sent by the binocular camera connected to other navigation cameras may be acquired, and the video object model is updated after the video object is matched.
- determining the target video object from the captured images obtained by the cameras may specifically be: converting the second three-dimensional coordinate into a third three-dimensional coordinate according to the pre-calibrated positional relationship between the binocular camera and the current camera; determining whether the overlap between the area of the target video object at the third three-dimensional coordinate and the area of the video object detected by the current camera at that video object's three-dimensional coordinate exceeds a preset area threshold; and, if it does, determining that video object as the target video object. That is, the video object is matched successfully.
- the third three-dimensional coordinate is the three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera, and the current camera is any camera in the navigation camera system other than the binocular camera.
- for example, in a multi-position scene, the current camera can be a binocular camera other than the binocular camera of the host position.
- the purpose of three-dimensional positioning of a multi-camera video object is, given the three-dimensional coordinates of the video object in the binocular coordinate system of one navigation camera, to calculate its three-dimensional coordinates in the coordinate system of another binocular camera or a PTZ camera.
- it is known that a certain observation point (i.e., a video object, specifically a certain feature point of the video object) has a coordinate vector X_1 in camera D1, and that the external parameters R_21, T_21 of camera D2 relative to camera D1 have been obtained through binocular camera calibration; the coordinate vector X_2 of the observation point in camera D2 can then be calculated as X_2 = R_21 · X_1 + T_21.
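The coordinate conversion X_2 = R_21 · X_1 + T_21 can be sketched directly (a minimal illustration, not the patent's code):

```python
def transform_point(r_21, t_21, x_1):
    """Coordinates X_2 of an observation point in camera D2, given its
    coordinates X_1 in camera D1 and the relative extrinsics (R_21, T_21)
    of D2 with respect to D1: X_2 = R_21 * X_1 + T_21."""
    return [sum(r_21[r][c] * x_1[c] for c in range(3)) + t_21[r]
            for r in range(3)]
```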
- FIG. 4a is a schematic diagram of an object matching scenario according to an embodiment of the present invention.
- Three-dimensional positioning of multi-camera video objects can be used to determine the correspondence of multiple video objects.
- three positions of cameras, D1, D2 and D3, are deployed in the scene.
- FIG. 4b shows a set of video objects in the images of FIG. 4a, namely the imaging of participant O_1 in the different views of the navigation cameras D1, D2 and D3.
- the binocular camera at position D1 detects the video object VO_11 using an algorithm such as face detection, and then uses the binocular-camera three-dimensional positioning algorithm to obtain the three-dimensional position of the object in the D1 binocular coordinate system.
- similarly, D2 and D3 detect the video objects VO_12 and VO_13 and calculate the three-dimensional coordinates of the object in the D2 and D3 binocular coordinate systems.
- the calibrated positional relationships between D1, D2 and D3 can be used to convert the three-dimensional position of the video object VO_11 from the D1 coordinate system into the D2 and D3 coordinate systems, and the coincident areas can then be detected.
- if the coincident area exceeds the threshold, VO_11, VO_12 and VO_13 are considered to be the same video object, and the video object is matched successfully. Further, if there are multiple video objects close to each other in the image, determining the correspondence of the video objects from the positional coincidence region alone may result in matching errors. Therefore, the image information of the video objects can be further combined, and the accuracy of determining the correspondence can be improved by a matching algorithm.
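The coincident-area test can be sketched with axis-aligned rectangles; the overlap-ratio measure and the 0.5 threshold are illustrative assumptions (the patent only specifies "a preset area threshold"):

```python
def overlap_ratio(a, b):
    """Overlap area of two axis-aligned rectangles (x, y, w, h), as a
    fraction of the smaller rectangle; used to decide whether two
    detections correspond to the same video object."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ow = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    oh = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    smaller = min(aw * ah, bw * bh)
    return (ow * oh) / smaller if smaller > 0 else 0.0

def same_object(a, b, threshold=0.5):
    """Threshold value is illustrative, not specified in the patent."""
    return overlap_ratio(a, b) > threshold
```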
- the matching algorithm may include a template matching algorithm and the like.
- for example, a template matching algorithm may take the two-dimensional image of a video object detected by the binocular camera of a certain position, such as the host position, as a known template, and match it against the video objects detected by the binocular cameras of the other positions,
- finding the object that matches the video object by algorithms such as squared-difference matching and correlation matching, thereby establishing the correspondence between the objects.
- The purpose of video object detection, tracking and scene modeling is to detect the video objects existing in the scene, and to track and identify these objects.
- Video objects include participants, as well as scene objects such as lights, windows, conference tables, and more.
- the system needs to cyclically process the image data input by the binocular cameras, including face detection and matching, human-shape detection and matching, moving-object detection and matching, scene-object detection and matching, etc., modeling the video objects and updating the model parameters, so as to model the entire shooting scene based on the detected object models.
- the resulting scene model can be used for subsequent object recognition and navigation strategy processing.
- face detection can be used to detect video objects at close distances, such as participants relatively close to the camera.
- face detection can obtain various parameters of the face video object, including the two-dimensional coordinates of the circumscribed rectangular area of the face, the coordinates of its center point, the area of the rectangle, the rotation angles of the face around the coordinate axes (representing the left-right deflection, pitch, and rotation of the face), and the positions of organs such as the eyes, nose, and mouth in the face.
- the video object needs to be tracked in the video frame sequence to establish a correspondence relationship between the video objects in the time domain.
- video object tracking algorithms include grayscale based template matching, MeanShift, CamShift, Kalman filtering algorithms and the like.
- video object matching can be applied to the binocular camera: the video object region detected in one camera image of the binocular camera is used to find the corresponding image region in the other camera image, so that feature matching and the calculation of three-dimensional coordinates can be performed within the matched region of the video object.
- the matching algorithm of the video object is similar to the tracking algorithm, and grayscale-based template matching and algorithms such as MeanShift can be used.
- a video object may be represented by its features, and commonly used features include feature points, image textures, histogram information, and the like.
- the feature detection and matching can be performed in the detected video object region, so that the three-dimensional position information of the video object, that is, the three-dimensional coordinates can be calculated according to the feature point information, and the video object can be tracked according to the texture information and the histogram information.
- the feature point is the main feature type, and the feature point detection algorithm includes Harris corner detection and SIFT feature point detection. Further, the feature matching is used to establish the correspondence relationship between the features of the same video object of the binocular camera.
- the feature points can be matched by using a matching algorithm such as FLANN algorithm and KLT optical flow method, and the image texture can be matched by using a gray template matching algorithm and the like.
- Histograms can be matched using algorithms such as histogram matching.
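Histogram matching can be sketched with normalized correlation, one common criterion (the specific measure is our assumption; the patent does not name one):

```python
import math

def histogram_correlation(h1, h2):
    """Normalized correlation between two histograms; returns a value in
    [-1, 1], where 1 means the histograms vary identically."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) *
                    sum((b - m2) ** 2 for b in h2))
    return num / den if den else 0.0
```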
- in this way, a plurality of video object models can be established in a single navigation camera coordinate system, and the model data can be updated through face, human-shape or motion detection and tracking algorithms.
- each video object model can be assigned a unique ID number, and the data in the model represents the attributes of the video object.
- the data in the model may include attributes such as an object ID, a circumscribed rectangle two-dimensional coordinate, a three-dimensional coordinate of the object feature point, a motion region texture data, a histogram data, and the like.
- video object model data can be exchanged between multiple navigation cameras through network communication.
- the above-mentioned multi-camera three-dimensional positioning and matching algorithms for video objects establish the correspondence between the video object models, thereby obtaining a model of the entire scene.
- the network communication protocol used during communication can be a standard protocol such as HTTP or a custom protocol, and the data of the video object model is formatted, packaged, and transmitted according to a format such as the eXtensible Markup Language ("XML") format.
- the scene model contains models of multiple video objects, reflecting the characteristics of the video objects and their distribution in three dimensions.
- the navigation camera needs to maintain the scene model, including adding and deleting object models and updating object model properties. For example, when a new participant appears in the scene and the binocular camera detects a new face or humanoid object, an object model is created and added to the object model set; when a participant leaves the scene, the object model is deleted; and when a participant's position changes, the parameters of the corresponding object model are updated. A navigation strategy is then developed based on the latest video object models, and the camera with the best position is selected for shooting.
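Scene-model maintenance (add on new detection, update on movement, delete on departure) can be sketched as follows; the class and attribute names are illustrative, not from the patent:

```python
class SceneModel:
    """Minimal sketch of scene-model maintenance: object models keyed by
    a unique ID, added when a new face/humanoid is detected, updated when
    the participant moves, deleted when the participant leaves."""

    def __init__(self):
        self.objects = {}          # object_id -> attribute dict
        self._next_id = 0

    def add(self, attrs):
        """Create an object model with a fresh unique ID."""
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = dict(attrs)
        return object_id

    def update(self, object_id, **attrs):
        """Update attributes, e.g. after the participant moves."""
        self.objects[object_id].update(attrs)

    def remove(self, object_id):
        """Delete the model when the participant leaves the scene."""
        del self.objects[object_id]
```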
- one or more guiding camera positions of the best shooting effect can be selected according to the preset guiding strategy.
- the camera with the better shooting effect is determined according to the eye-to-eye effect parameter, the occlusion relationship parameter, and the scene object parameters of the shooting area.
- the eye-to-eye effect is determined by the angle of the face/human-shape object relative to the optical axis of the PTZ camera: the smaller the angle, the more frontally the face is presented, and the better the eye-to-eye effect.
- the face/human-shape detection algorithm obtains the rotation angles (left-right deflection, pitch, and rotation) of the three-dimensional coordinate axes centered on the face/human shape with respect to the binocular camera coordinate system, and the aforementioned camera
- coordinate system conversion formulas convert the rotation angles of the face/human shape relative to the binocular camera into rotation angles relative to the PTZ camera.
- a PTZ camera priority queue with an eye-to-eye effect can be further established for each video object, and a camera with better eye-to-eye effect has a higher priority.
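Building the eye-to-eye priority queue then amounts to sorting the PTZ cameras by the magnitude of the face-to-optical-axis angle (a minimal sketch with an assumed data layout):

```python
def eye_to_eye_priority(cameras):
    """Order PTZ cameras by eye-to-eye quality for one video object:
    the smaller the angle between the face direction and the camera's
    optical axis, the higher the priority. `cameras` is a list of
    (camera_id, angle_degrees) pairs."""
    return [cam_id for cam_id, angle in
            sorted(cameras, key=lambda c: abs(c[1]))]
```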
- the area of the video object detected by a certain camera can be known from the projection equation of that camera, and the calibrated external parameters between the binocular camera and the PTZ camera of a single position, together with the external parameters between the binocular cameras of different positions, can be used to
- reproject the area onto the imaging plane of the PTZ camera of each position. If the regions of two video objects overlap, the depth information can be used to determine the occlusion relationship between the two objects: the video object closer to the binocular camera occludes the farther video object.
- a PTZ camera priority queue can be established for each video object with an occlusion relationship, and an unoccluded camera has a higher priority.
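The occlusion test described above (overlapping reprojected regions decided by depth) can be sketched as follows; the region and depth representations are our assumptions:

```python
def occluded(obj_region, obj_depth, others):
    """An object is occluded in a given camera view if another object's
    reprojected region overlaps its region and that object is closer to
    the camera. Regions are (x, y, w, h); depths are distances along the
    optical axis; `others` is a list of (region, depth) pairs."""
    def overlaps(a, b):
        return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and
                a[1] < b[1] + b[3] and b[1] < a[1] + a[3])
    return any(overlaps(obj_region, r) and d < obj_depth
               for r, d in others)
```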
- the system also detects other video objects (scene objects) of interest in the scene, such as lamps, windows, conference tables, and the like.
- the detection of these objects can employ algorithms based on image color and edge features, and the like.
- for a lamp tube, the Canny operator can be used to extract its edges and obtain its long straight-line feature, and the adjacent area can then be checked for an overexposed pixel area (lighting area); from these two features the lamp tube object can be detected and the coordinates of its circumscribed rectangle obtained. The detection of a window is similar to that of a lamp tube:
- its quadrilateral feature can be obtained by edge detection, and whether it is a window is then determined according to whether there is an overexposed pixel area of a certain size within the quadrilateral.
- the conference table can also be detected using edge features in the image.
- the navigation camera at the host position establishes a priority queue for the PTZ cameras of each position according to the acquired image effect parameters and the preset navigation strategy, and can thus determine the camera to be selected.
- one or more video objects that need to be photographed, that is, target video objects, may be determined, such as a talking video object determined from the sound source localization result, so as to capture a close-up of that object; or, when an AutoFrame strategy is required, Pan/Tilt can be adjusted to include all the video objects in the scene, Zoom adjusts the objects to a suitable size, and so on.
- from the PTZ camera priority queues built on the eye-to-eye effect parameters, occlusion relationship parameters, and scene object parameters, a comprehensive PTZ camera priority queue can be determined according to a certain navigation strategy.
- the navigation policy may be automatically calculated by the system or preset by the user, which is not limited by the embodiment of the present invention.
- for example, among the unoccluded PTZ cameras, the PTZ camera with the best eye-to-eye effect and the best scene object parameters in the image is selected as the target camera.
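One way to combine the queues into a final choice is sketched below; this simple policy (skip occluded cameras, then penalize views containing problematic scene objects) is an illustrative reading of the strategy, not the patent's exact algorithm:

```python
def select_target_camera(eye_queue, occluded_ids, scene_penalty):
    """Pick the highest-priority camera in the eye-to-eye queue that is
    not occluded, preferring cameras whose view contains fewer problematic
    scene objects (lamps, windows). `eye_queue` is ordered best-first;
    `scene_penalty` maps camera id -> penalty score."""
    candidates = [c for c in eye_queue if c not in occluded_ids]
    if not candidates:
        candidates = list(eye_queue)      # fall back if all are occluded
    return min(candidates,
               key=lambda c: (scene_penalty.get(c, 0), eye_queue.index(c)))
```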
- the host position can adjust the PTZ parameters of the selected PTZ camera according to the three-dimensional coordinates of the target video object to obtain the best possible image effect. For example, during voice tracking, when shooting a close-up of a participant, avoid shooting objects that affect the brightness of the image, such as a lamp or a window; when adjusting the Zoom size, avoid the effect of a large-area table object on the image's white balance; and so on.
- the host position (the navigation camera at the host position) can output the video image or the ID of the selected PTZ camera.
- for a single-position system, the host position can directly output the image of the selected camera; for a multi-position navigation camera system that outputs through a video matrix, the host position can output the ID of the selected PTZ camera to the video matrix through a communication interface (such as a serial port or a network port), and the camera image is switched by the video matrix.
- in the embodiment of the present invention, the target camera for capturing the target video object is selected from the cameras of the navigation camera system according to a preset navigation strategy, and the three-dimensional coordinates of the target video object in the coordinate system corresponding to the target camera are obtained, so that the target camera is controlled to adjust its imaging parameters according to those three-dimensional coordinates and to output the video image after the adjustment. Based on three-dimensional coordinate detection and the preset navigation strategy, the navigation camera system can thereby improve the accuracy of video object detection and tracking, improve the efficiency of camera parameter adjustment, and effectively improve the shooting effect.
- FIG. 5 is a schematic structural diagram of a parameter adjustment apparatus according to an embodiment of the present invention.
- the device in the embodiment of the present invention may be specifically configured in the above-mentioned navigation camera.
- the parameter adjustment device in the embodiment of the present invention may include an object determining unit 10, a selecting unit 20, an acquiring unit 30, and a parameter adjustment unit 40, wherein:
- the object determining unit 10 is configured to determine a target video object that needs to be photographed.
- the selecting unit 20 is configured to select, from a camera of the navigation camera system where the navigation camera is located, a target camera for capturing the target video object according to a preset navigation policy.
- the shooting effect parameter may include any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting area of the target video object in a coordinate system corresponding to the current camera.
- the current camera is any camera other than the binocular camera in the navigation camera system.
- the eye-to-eye effect parameter may include a rotation angle of the target video object with respect to the coordinate system corresponding to the current camera, and this rotation angle may be determined from the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
- the occlusion relationship parameter and the scene object parameter may be determined by reprojecting the detected areas of the video objects and scene objects onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera.
- the acquiring unit 30 is configured to acquire first three-dimensional coordinates of the target video object.
- the first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
- the target camera may be the above-mentioned navigation camera or an ordinary PTZ camera.
- the first coordinate system corresponding to the target camera may refer to a three-dimensional coordinate system established with the optical center of the target camera as the origin, or established with any other reference object as the origin; this is not limited in the embodiment of the present invention.
- the parameter adjustment unit 40 is configured to adjust an imaging parameter of the target camera to an imaging parameter corresponding to the first three-dimensional coordinate, and output a video image after adjusting the imaging parameter.
- the obtaining unit 30 may be specifically configured to:
- the second three-dimensional coordinates are converted into first three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the target camera.
- the second three-dimensional coordinates may be calculated from the two-dimensional coordinates of the target video object acquired in the left view and the right view of the binocular camera, together with the internal and external parameters of the binocular camera.
- the second three-dimensional coordinate is the three-dimensional coordinate of the target video object in the second coordinate system corresponding to the binocular camera, and the second coordinate system may be established with the optical center of the binocular camera as the origin.
- the two-dimensional coordinates may specifically be corresponding pixel coordinates of the target video object in the left view and the right view of the binocular camera.
- the object determining unit 10 may be specifically configured to:
- the selecting unit 20 can be specifically configured to:
- the camera whose shooting effect parameter satisfies the preset guiding strategy is determined as the target camera for capturing the target video object.
- the selecting unit 20 performs the determination of the target video object from the captured image acquired by the camera.
- the third three-dimensional coordinate is a three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera;
- the video object is determined to be the target video object.
- in the embodiment of the present invention, the target camera for capturing the target video object is selected from the cameras of the navigation camera system according to a preset navigation strategy, and the three-dimensional coordinates of the target video object in the coordinate system corresponding to the target camera are obtained, so that the target camera is controlled to adjust its imaging parameters according to those three-dimensional coordinates and to output the video image after the adjustment. Based on three-dimensional coordinate detection and the preset navigation strategy, the navigation camera system can thereby improve the accuracy of video object detection and tracking, improve the efficiency of camera parameter adjustment, and effectively improve the shooting effect.
- FIG. 6 is a schematic structural diagram of a navigation camera system according to an embodiment of the present invention.
- the navigation camera system of the embodiment of the present invention may include a first camera 1 and at least one second camera 2; the first camera 1 includes a navigation camera 11 and a binocular camera 12, and the navigation camera 11 and the binocular camera 12, as well as the first camera 1 and the second camera 2, can be connected through a wired interface or a wireless interface;
- the guidance camera 11 is configured to determine a target video object that needs to be captured, and select a target camera for capturing the target video object from a camera of the navigation camera system according to a preset navigation strategy;
- the binocular camera 12 is configured to acquire a second three-dimensional coordinate of the target video object, and transmit the second three-dimensional coordinate to the navigation camera 11; wherein the second three-dimensional coordinate is the target video The three-dimensional coordinates of the object in the second coordinate system corresponding to the binocular camera 12;
- the navigation camera 11 is configured to receive the second three-dimensional coordinates transmitted by the binocular camera 12; and according to a pre-calibrated positional relationship between the binocular camera 12 and the target camera, the second three-dimensional coordinates Converting to a first three-dimensional coordinate; adjusting an imaging parameter of the target camera to an imaging parameter corresponding to the first three-dimensional coordinate, and outputting a video image after adjusting the imaging parameter; wherein the first three-dimensional coordinate is the The three-dimensional coordinates of the target video object in the first coordinate system corresponding to the target camera.
- the second camera 2 may also include a navigation camera and a binocular camera, and the target camera may be any of the navigation cameras in the navigation camera system; or the second camera 2 is a normal PTZ camera.
- the target camera can be the guide camera or a normal PTZ camera.
- the binocular camera 12 can be disposed on a preset guide bracket and connected to the guide camera 11 through the guide bracket.
- FIG. 7 is a schematic structural diagram of a first camera provided by an embodiment of the present invention.
- the first camera includes a binocular camera and one or more navigation cameras. It is assumed in the embodiment of the present invention that the first camera is equipped with two navigation cameras for directed shooting and tracking, which can be connected, by wire or wirelessly, to the binocular camera through a guide bracket (referred to as the "bracket").
- the binocular camera is mounted on the bracket.
- a microphone can be mounted on the bracket, and the installed microphone can be in the form of an array.
- the microphones in the array can be used to realize sound source positioning, sound source identification and the like, and may include a horizontal microphone array and a vertical microphone array.
- the guide camera and the bracket may be separated or integrated, and a communication interface such as a serial interface may be used for communication between the guide camera and the bracket.
- the above-mentioned navigation camera and the guide bracket may be integrated into one guide device.
- the connection form of each device in the navigation camera system is not limited in the embodiment of the present invention.
- FIG. 8 is a schematic diagram of networking of a navigation camera system according to an embodiment of the present invention.
- the multi-camera networking includes: networking among multiple camera positions each equipped with a navigation camera and a guide bracket; networking of a position equipped with a navigation camera and a guide bracket with multiple ordinary PTZ cameras; networking of a position equipped with a navigation camera and a guide bracket with positions that have only a guide bracket (i.e., no PTZ camera); and networking of a position with multiple ordinary PTZ cameras without a guide bracket.
- the cameras of the positions can be interconnected via LAN or Wi-Fi to transmit control messages (such as camera switching messages) and audio/video data (such as video object model data).
- the control messages may be transmitted through an Internet Protocol (IP) based protocol, such as an IP camera protocol stack.
- calibration is required between the binocular cameras of the two camera positions.
- the switching policy of the video matrix may be controlled by any designated navigation camera in the scenario, for example a navigation camera acting as the host, or by a third-party device; this is not limited in the embodiment of the present invention.
- the video image output by the video matrix is encoded by the codec device and transmitted to the far end for the video conference.
- when the number of camera positions is small, the video data can be processed in cascade (the guide bracket supports video cascading); when the number is large, the video of multiple cameras is output to the video matrix, which switches or synthesizes one or more camera video sources.
- the bracket can provide an external video input/output interface, a LAN/Wi-Fi network port, and a serial interface.
- the video input interface is used to receive video input from other cameras; the video output interface is used to connect terminals or video matrix devices to output video images; the serial interface provides a control and debugging interface for the bracket.
- the LAN/Wi-Fi network port is used for cascading multiple camera positions and can transmit audio/video data and control data.
- in the networking of multiple navigation camera positions, the navigation cameras all have video object detection capability and the PTZ camera function; one navigation camera can serve as the host position, responsible for output position selection and PTZ camera control, while the other cameras serve as slave positions.
- in the networking of a navigation camera with a guide bracket plus multiple ordinary PTZ cameras, only the navigation camera has video object detection capability and is responsible for output position selection and PTZ camera control, and the ordinary cameras are used only as PTZ cameras. Since only the navigation camera has video object detection capability, there is no need to obtain the video object model data of other camera positions over the network or to perform the matching process of multi-camera video object models.
- FIG. 9 is a schematic structural diagram of a navigation camera according to an embodiment of the present invention, for performing the above camera parameter adjustment method.
- the navigation camera of the embodiment of the present invention includes a communication interface 300, a memory 200, and a processor 100, where the processor 100 is connected to the communication interface 300 and the memory 200, respectively.
- the memory 200 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
- the communication interface 300, the memory 200, and the processor 100 may be connected to each other through a bus, or may be connected in other ways; in this embodiment, a bus connection is taken as an example.
- the device structure shown in FIG. 9 does not constitute a limitation on the embodiments of the present invention; the device may include more or fewer components than those illustrated, a combination of certain components, or a different arrangement of components, where:
- the processor 100 is the control center of the device. It connects the various parts of the entire device through various interfaces and lines, and performs the various functions of the device and processes its data by running or executing the programs and/or units stored in the memory 200 and calling the driver software stored in the memory 200.
- the processor 100 may be composed of an integrated circuit ("IC"), for example, of a single packaged IC or of multiple packaged ICs having the same or different functions.
- the processor 100 may include only a central processing unit ("CPU"), or may be a combination of a CPU, a digital signal processor ("DSP"), a graphics processing unit ("GPU"), and various control chips.
- the CPU may have a single operation core, or may include multiple operation cores.
- Communication interface 300 can include a wired interface, a wireless interface, and the like.
- the memory 200 can be used to store driver software (or software programs) and units, and the processor 100 and the communication interface 300 perform various functional applications of the devices and implement data processing by calling the driver software and the units stored in the memory 200.
- the memory 200 mainly includes a program storage area and a data storage area, where the program storage area can store the driver software and the like required for at least one function, and the data storage area can store data generated during the parameter adjustment process, such as the three-dimensional coordinate information described above.
- the processor 100 reads the driver software from the memory 200 and, under the action of the driver software, performs the following steps: determining a target video object that needs to be captured; selecting, according to a preset navigation policy, a target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located; acquiring first three-dimensional coordinates of the target video object, where the first three-dimensional coordinates are the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera; and adjusting an imaging parameter of the target camera to an imaging parameter corresponding to the first three-dimensional coordinates, and outputting a video image captured with the adjusted imaging parameter.
- the processor 100 reads the driver software from the memory 200 and, in acquiring the first three-dimensional coordinates of the target video object under the action of the driver software, specifically performs the following steps: acquiring second three-dimensional coordinates transmitted by the binocular camera connected to the navigation camera, where the second three-dimensional coordinates are the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; and converting the second three-dimensional coordinates into the first three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the target camera.
- the processor 100 reads the driver software from the memory 200 and, in determining the target video object that needs to be captured under the action of the driver software, specifically performs the following steps: acquiring a captured image transmitted by the binocular camera, the captured image including at least one video object; and establishing a video object model including the at least one video object, and determining the target video object from the at least one video object.
- the processor 100 reads the driver software from the memory 200 and, in selecting, according to the preset navigation policy, the target camera for capturing the target video object from the cameras of the navigation camera system where the camera is located, specifically performs the following steps: determining the target video object from the captured images acquired by the cameras in the navigation camera system, and acquiring shooting effect parameters of the target video object for each camera; and determining a camera whose shooting effect parameters satisfy the preset navigation policy as the target camera for capturing the target video object.
- the processor 100 reads the driver software from the memory 200 and, in determining the target video object from the captured image acquired by a camera under the action of the driver software, specifically performs the following steps: converting the second three-dimensional coordinates into third three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the current camera, where the third three-dimensional coordinates are the three-dimensional coordinates of the target video object in a third coordinate system corresponding to the current camera; judging whether the overlap area between the region of the target video object under the third three-dimensional coordinates and the region of a video object detected by the current camera under that object's three-dimensional coordinates exceeds a preset area threshold; and if it does, determining that video object to be the target video object.
- the shooting effect parameter may include any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting area of the target video object in a coordinate system corresponding to the current camera.
- the current camera is any camera other than the binocular camera in the navigation camera system.
- the eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera, where the rotation angle is determined from the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
- the occlusion relationship parameter and the scene object parameter may be determined by re-projecting the regions of the scene objects detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera.
- the disclosed apparatus and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the units is only a logical function division; in actual implementation, there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
- the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
- the above software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the various embodiments of the present invention.
- the foregoing storage medium includes: a USB flash disk, a removable hard disk, a read-only memory ("ROM"), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code.
Description
This application claims priority to Chinese Patent Application No. 201610562671.6, filed with the Chinese Patent Office on July 18, 2016 and entitled "Camera Parameter Adjustment Method, Navigation Camera, and System", the entire contents of which are incorporated herein by reference.
The present invention relates to the field of image processing technologies, and in particular, to a camera parameter adjustment method, a navigation camera, and a system.
With the continuous development of image processing technology and the Internet, more and more scenarios require video conferencing, which brings great convenience to remote communication between users. Currently, when a video conference is held, multiple cameras often need to be deployed to obtain frontal images of the participants. For example, refer to FIG. 1, which is a schematic diagram of a video conference scenario. As shown in FIG. 1, the conference room uses a long elliptical conference table, the participants' seats surround the table, and the participants include A and B, who sit opposite each other. Cameras C0 and C1 are arranged on the two sides of the projection screen in front of A and B. For A, only camera C0 can capture A's frontal image, and camera C1 cannot; for B, only camera C1 can capture B's frontal image, and camera C0 cannot. It can be seen that multiple cameras are required to implement the video conference.
When multiple cameras are deployed for conference shooting, camera parameters are generally adjusted manually, by a remote control or other means, to obtain a good shooting effect. However, manual adjustment requires the operator to have certain camera expertise, and the operation process is cumbersome; the adjustment is therefore inefficient, and a good shooting effect cannot be guaranteed in time. Alternatively, the shooting camera can be determined, and its shooting effect adjusted, by sound source localization: the participant who is speaking (the "speaker") is located and tracked, and one camera captures a close-up of the speaker, tracking the speaker's face and performing pan-tilt-zoom (PTZ) adjustment of the lens so that the face lies in the middle region of the image. However, this sound source localization approach only considers moving the speaker's face to the center of the image; it does not consider the quality of the captured image and likewise cannot guarantee a good shooting effect.
Summary of the Invention
Embodiments of the present invention provide a camera parameter adjustment method, a navigation camera, and a system, which can improve the efficiency of camera parameter adjustment and improve the camera shooting effect.
In a first aspect, an embodiment of the present invention provides a camera parameter adjustment method, applied to a navigation camera, including:
determining a target video object that needs to be captured;
selecting, according to a preset navigation policy, a target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located;
acquiring first three-dimensional coordinates of the target video object, where the first three-dimensional coordinates are the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera; and
adjusting an imaging parameter of the target camera to an imaging parameter corresponding to the first three-dimensional coordinates, and outputting a video image captured with the adjusted imaging parameter.
The first three-dimensional coordinates may be the three-dimensional coordinates of the target video object in the first coordinate system corresponding to the target camera. The target camera may be a navigation camera or an ordinary PTZ camera, and the first coordinate system corresponding to the target camera may be a three-dimensional coordinate system whose origin is the optical center of the target camera, or a three-dimensional coordinate system established with any other reference object as the origin; this is not limited in the embodiments of the present invention.
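As an illustrative sketch (not part of the claimed method), the mapping from a target's first three-dimensional coordinates to pan and tilt angles for a PTZ camera can be computed with basic trigonometry. The function name and axis conventions below are assumptions for illustration: the origin is the camera's optical center, z points forward, x right, and y down.

```python
import math

def pan_tilt_from_point(x, y, z):
    """Compute pan and tilt angles (in degrees) that aim a PTZ camera at a
    3D point (x, y, z) given in the camera's own coordinate system
    (origin at the optical center, z forward, x right, y down)."""
    pan = math.degrees(math.atan2(x, z))                   # horizontal angle
    tilt = math.degrees(math.atan2(-y, math.hypot(x, z)))  # vertical angle
    return pan, tilt

# Example: a target 1 m to the right and 2 m in front of the camera, at lens height.
pan, tilt = pan_tilt_from_point(1.0, 0.0, 2.0)
```

The zoom parameter could be derived analogously from the distance `math.sqrt(x*x + y*y + z*z)` and the desired framing, but the exact mapping depends on the lens and is not specified here.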
The target video object may be any one or more video objects in the shooting scene corresponding to the navigation camera system where the navigation camera is located.
In some embodiments, the acquiring first three-dimensional coordinates of the target video object includes:
acquiring second three-dimensional coordinates transmitted by a binocular camera connected to the navigation camera, where the second three-dimensional coordinates are the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; and
converting the second three-dimensional coordinates into the first three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the target camera.
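A pre-calibrated positional relationship between two cameras is conventionally expressed as a rotation matrix R and a translation vector t, so the conversion from the second coordinates to the first coordinates is a rigid transform. The following minimal sketch (function name and calibration values are illustrative, not taken from the patent) applies such a transform without external libraries:

```python
def transform_point(R, t, p):
    """Map a point p from the binocular camera's coordinate system into the
    target camera's coordinate system using a pre-calibrated rotation matrix
    R (3x3, row-major nested lists) and translation vector t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# Illustrative calibration: the target camera is rotated 90 degrees about the
# vertical (y) axis relative to the binocular camera and offset 0.5 m in x.
R = [[0, 0, 1],
     [0, 1, 0],
     [-1, 0, 0]]
t = [0.5, 0.0, 0.0]

p2 = [1.0, 0.2, 3.0]            # second coordinates (binocular camera frame)
p1 = transform_point(R, t, p2)  # first coordinates (target camera frame)
```

In practice R and t would come from an offline multi-camera calibration step rather than being hard-coded.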
In some embodiments, the second three-dimensional coordinates may be calculated by the binocular camera from the two-dimensional coordinates of the video object in the left view and the right view of the binocular camera, acquired separately, together with the acquired intrinsic and extrinsic parameter data of the binocular camera.
The second three-dimensional coordinates are the three-dimensional coordinates of the target video object in the second coordinate system corresponding to the binocular camera; this second coordinate system may be a three-dimensional coordinate system whose origin is the optical center of the binocular camera, or a three-dimensional coordinate system established with any other reference object as the origin. The two-dimensional coordinates may specifically be the pixel coordinates of the target video object in the left view and the right view of the binocular camera.
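For a rectified stereo pair, the depth of a matched point follows from its disparity and the calibrated intrinsic parameters, which is one standard way a binocular camera can compute the second three-dimensional coordinates from left/right pixel coordinates. This is a sketch of the textbook formula Z = f·B/d; the intrinsic values in the example are illustrative assumptions, not calibration data from the patent:

```python
def triangulate(u_left, u_right, v, fx, fy, cx, cy, baseline):
    """Recover the 3D position of a matched point from its pixel coordinates
    in a rectified stereo pair.  fx, fy, cx, cy are intrinsic parameters;
    baseline is the distance between the two optical centers.  Depth follows
    from the disparity d = u_left - u_right as Z = fx * baseline / d."""
    d = u_left - u_right
    if d <= 0:
        raise ValueError("non-positive disparity: point at infinity or mismatched")
    z = fx * baseline / d
    x = (u_left - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# Illustrative intrinsics: 700 px focal length, principal point (640, 360),
# 12 cm baseline; a 35 px disparity then yields 2.4 m depth.
x, y, z = triangulate(675.0, 640.0, 360.0, 700.0, 700.0, 640.0, 360.0, 0.12)
```

Real implementations would first rectify the views and match features subpixel-accurately; only the final back-projection step is shown here.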
In some embodiments, the determining a target video object that needs to be captured includes:
acquiring a captured image transmitted by the binocular camera, the captured image including at least one video object; and
establishing a video object model including the at least one video object, and determining the target video object from the at least one video object.
The selecting, according to a preset navigation policy, a target camera for capturing the target video object from the cameras of the navigation camera system where the navigation camera is located includes:
determining the target video object from the captured images acquired by the cameras in the navigation camera system, and acquiring shooting effect parameters of the target video object for each camera; and
determining a camera whose shooting effect parameters satisfy the preset navigation policy as the target camera for capturing the target video object.
In some embodiments, the determining the target video object from the captured image acquired by a camera includes:
converting the second three-dimensional coordinates into third three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the current camera;
judging whether the overlap area between the region of the target video object under the third three-dimensional coordinates and the region of a video object detected by the current camera under that object's three-dimensional coordinates exceeds a preset area threshold; and
if it does, determining that video object to be the target video object.
The current camera is any camera in the navigation camera system, other than the binocular camera, whose positional relationship with the navigation camera has been calibrated; the third three-dimensional coordinates are the three-dimensional coordinates of the target video object in a third coordinate system corresponding to the current camera.
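The overlap test above can be sketched with axis-aligned bounding boxes: the target's region, mapped via the third three-dimensional coordinates into the current camera's view, is compared with each detected object's region, and a detection is accepted as the same object when the overlap area exceeds the preset threshold. The box coordinates and threshold below are illustrative values, not from the patent:

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def is_same_object(projected_box, detected_box, area_threshold):
    """Treat the detection as the target video object when the overlap between
    its region and the target's re-projected region exceeds the threshold."""
    return overlap_area(projected_box, detected_box) > area_threshold

# Re-projected target region vs. a detection from the current camera.
projected = (100, 100, 200, 220)   # target, mapped via the third coordinates
detected  = (120, 110, 210, 230)   # video object found by the current camera
matched = is_same_object(projected, detected, area_threshold=5000)
```

A relative measure such as intersection-over-union could equally serve as the matching criterion; the patent text specifies only that an overlap area is compared against a preset threshold.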
In some embodiments, the shooting effect parameters include any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting area of the target video object in the coordinate system corresponding to the current camera, where the current camera is any camera in the navigation camera system other than the binocular camera.
The eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera; this rotation angle is determined from the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera. The smaller the rotation angle, the better the eye-to-eye effect.
The occlusion relationship parameter and the scene object parameter may be determined by re-projecting the regions of the scene objects detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera. The output image is best when there is no occlusion (the smaller the occlusion relationship parameter, the better). The smaller the area and number of the scene objects, the better the output image; conversely, the larger they are, the worse the output image.
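One hedged way to realize such a navigation policy is to score each candidate camera from its shooting effect parameters and pick the lowest-cost one: a small rotation angle (better eye-to-eye effect), no occlusion, and little scene-object clutter. The scoring weights and data layout below are illustrative assumptions, not values from the patent:

```python
def pick_target_camera(candidates):
    """Select the camera whose shooting effect parameters best satisfy the
    policy.  candidates: list of dicts with keys 'name', 'rotation_deg',
    'occluded' (bool), and 'scene_object_area'."""
    def cost(c):
        s = abs(c["rotation_deg"])            # smaller angle -> better eye contact
        s += 1000.0 if c["occluded"] else 0.0  # occlusion heavily penalised
        s += 0.01 * c["scene_object_area"]     # clutter mildly penalised
        return s
    return min(candidates, key=cost)["name"]

cameras = [
    {"name": "C0", "rotation_deg": 5.0,  "occluded": False, "scene_object_area": 200},
    {"name": "C1", "rotation_deg": 2.0,  "occluded": True,  "scene_object_area": 50},
    {"name": "C2", "rotation_deg": 40.0, "occluded": False, "scene_object_area": 100},
]
best = pick_target_camera(cameras)  # C1 is most frontal but occluded, so C0 wins
```

The relative weighting of the three parameters is a policy choice; the patent only requires that the selected camera's parameters satisfy the preset navigation policy.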
In a second aspect, an embodiment of the present invention further provides a navigation camera, including a memory and a processor, the processor being connected to the memory, where:
the memory is configured to store driver software; and
the processor reads the driver software from the memory and, under the action of the driver software, performs some or all of the steps of the camera parameter adjustment method of the first aspect.
In a third aspect, an embodiment of the present invention further provides a parameter adjustment apparatus, including an object determining unit, a selecting unit, an acquiring unit, and a parameter adjusting unit, by means of which the apparatus implements some or all of the steps of the camera parameter adjustment method of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer storage medium storing a program that, when executed, performs some or all of the steps of the camera parameter adjustment method of the first aspect.
In a fifth aspect, an embodiment of the present invention further provides a navigation camera system, including a first camera and at least one second camera, where the first camera includes a navigation camera and a binocular camera, and the navigation camera and the binocular camera, as well as the first camera and the second camera, are connected through a wired or wireless interface, where:
the navigation camera is configured to determine a target video object that needs to be captured, and to select, according to a preset navigation policy, a target camera for capturing the target video object from the cameras of the navigation camera system;
the binocular camera is configured to acquire second three-dimensional coordinates of the target video object and transmit the second three-dimensional coordinates to the navigation camera, where the second three-dimensional coordinates are the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; and
the navigation camera is configured to receive the second three-dimensional coordinates transmitted by the binocular camera; convert the second three-dimensional coordinates into first three-dimensional coordinates according to a pre-calibrated positional relationship between the binocular camera and the target camera; adjust an imaging parameter of the target camera to an imaging parameter corresponding to the first three-dimensional coordinates; and output a video image captured with the adjusted imaging parameter, where the first three-dimensional coordinates are the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera.
In some embodiments, the second camera may include a navigation camera and a binocular camera, in which case the target camera may be any navigation camera in the navigation camera system; alternatively, the second camera may be an ordinary PTZ camera, in which case the target camera may be either a navigation camera or an ordinary PTZ camera.
In some embodiments, the binocular camera may be disposed on a preset guide bracket and connected to the navigation camera through the guide bracket.
Implementing the embodiments of the present invention has the following beneficial effects:
In the embodiments of the present invention, after the target video object that needs to be captured is determined, the target camera that captures the target video object with the best effect is selected from the cameras of the navigation camera system according to a preset navigation policy, and the three-dimensional coordinates of the target video object in the coordinate system corresponding to the target camera are acquired, so that the target camera is controlled to adjust its camera parameters according to those three-dimensional coordinates and to output the video image captured with the adjusted parameters. The navigation camera system can thus rely on three-dimensional coordinate detection and the preset navigation policy to improve the accuracy of video object detection and tracking, while improving the efficiency of camera parameter adjustment and effectively improving the camera's shooting effect.
In order to describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a schematic diagram of a video conference scenario;
FIG. 2 is a schematic flowchart of a camera parameter adjustment method according to an embodiment of the present invention;
FIG. 3a is a schematic diagram of a camera imaging model according to an embodiment of the present invention;
FIG. 3b is a schematic diagram of a multi-camera calibration scenario according to an embodiment of the present invention;
FIG. 3c is a schematic diagram of the three-dimensional positioning principle of a binocular camera according to an embodiment of the present invention;
FIG. 3d is a schematic diagram of a PTZ camera rotation model according to an embodiment of the present invention;
FIG. 4a is a schematic diagram of a video object matching scenario according to an embodiment of the present invention;
FIG. 4b is an imaging diagram of a group of video objects in FIG. 4a;
FIG. 5 is a schematic structural diagram of a parameter adjustment apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a navigation camera system according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a first camera according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of networking of a navigation camera system according to an embodiment of the present invention; and
FIG. 9 is a schematic structural diagram of a navigation camera according to an embodiment of the present invention.
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.
应理解,本发明实施例涉及的“第一”、“第二”和“第三”等是用于区别不同对象,而非用于描述特定顺序。此外,术语“包括”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出 的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。It should be understood that the "first", "second", and "third" and the like in the embodiments of the present invention are used to distinguish different objects, and are not used to describe a specific order. Moreover, the term "comprise" and any variants thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed Steps or units, but optionally also include steps or units not listed, or alternatively other steps or units inherent to such processes, methods, products or devices.
It should be understood that the broadcast-directing camera in the embodiments of the present invention may specifically be a PTZ camera configured to execute the technical solutions of the embodiments of the present invention, and may be connected to a binocular (stereo) camera. The broadcast-directing camera may be applied to scenarios such as conferences and training sessions, and the positions and the number of broadcast-directing cameras may be deployed according to the different scenarios.
In some embodiments, the binocular camera may be mounted on a directing bracket; that is, the broadcast-directing camera may be connected to the binocular camera through the directing bracket ("bracket" for short). The broadcast-directing camera is used for directed shooting and tracking. In addition, a microphone may be mounted on the bracket, and the microphone may be used to implement functions such as sound source localization and sound source recognition. The broadcast-directing camera and the bracket may be separate or integrated, and the broadcast-directing camera and the bracket may communicate through a control interface such as a serial interface.
In some embodiments, the binocular camera may be used for video capture, video preprocessing, motion detection, face detection, human-shape detection, scene object detection, feature detection/matching, binocular camera calibration, multi-camera calibration, and so on; the microphone may be used for audio capture, audio preprocessing, sound source localization, sound source behavior recognition, and so on; and the broadcast-directing camera may be used for audio/video ("AV" for short) object 3D localization, AV object modeling, AV object tracking, action/posture recognition, directing control, video switching/composition, and so on.
Video capture includes synchronously capturing the video streams of the binocular camera and the broadcast-directing camera. Video preprocessing includes preprocessing the input binocular images, for example, performing noise reduction and changing the resolution and frame rate. Motion detection includes detecting the moving objects in the scene and separating the moving objects from the static background to obtain the regions of the moving objects. Face detection includes detecting the face target objects in the scene and outputting face detection information, such as the position, region, and orientation of a face. Human-shape detection includes detecting the head-and-shoulder regions of human shapes in the scene and outputting the detection information. Scene object detection includes detecting objects in the scene other than people, such as lamps, windows, and conference tables. Feature detection/matching includes performing feature detection and matching on the detected moving-object regions: characteristic objects (such as feature points) are detected in one image and matched in the other image, and the matched feature object information is output. Binocular camera calibration includes calibrating the binocular camera to obtain its intrinsic and extrinsic parameters, which are used to calculate the three-dimensional coordinates of video objects in the video images. Multi-camera calibration includes calibrating the relative positional relationships between multiple broadcast-directing cameras to obtain their relative extrinsic parameter information, which is used to locate video objects in the coordinate systems of the multiple cameras.
Further, audio capture includes synchronously capturing multiple channels of audio data from the microphone. Audio preprocessing includes performing 3A processing on the input multi-channel audio data, where for audio the 3A processing includes acoustic echo cancellation (AEC), automatic gain control (AGC), and automatic noise suppression (ANS). Sound source localization includes detecting the input multi-channel audio data to find the two-dimensional position information of the sounding object. Sound source behavior recognition includes detecting and collecting statistics on the speech behavior of the video objects in the scene.
Further, AV object 3D localization includes obtaining the depth information of object features in the images according to the intrinsic and extrinsic parameters of the binocular camera and the disparity information obtained through feature detection/matching, and combining this with the audio localization result to obtain the three-dimensional position information of the object features in the coordinate system of a single broadcast-directing camera; according to the position of a feature in a single broadcast-directing camera's coordinate system and the relative positional relationships between the multiple broadcast-directing cameras, the position information of the feature in the coordinate systems of the other broadcast-directing cameras can be obtained. AV object modeling includes constructing models of the AV objects by combining information such as the sound source localization results, face information, feature objects, and scene objects. AV object tracking includes tracking multiple AV objects in the scene and updating the state information of the objects. Action/posture recognition includes recognizing actions, postures, and the like of the AV objects, for example, recognizing a standing posture or a gesture of an object. Directing control includes determining a directing strategy by combining the results of action/posture recognition and sound source behavior recognition; the broadcast-directing camera then outputs the control instructions corresponding to the directing strategy, the video object and scene feature information, and a video output strategy. The camera control instructions may be used to control a PTZ camera to perform PTZ operations, that is, pan, tilt, and zoom operations; the video object and scene feature information may be used for information sharing between multiple broadcast-directing cameras; and the video output strategy may be used to control the output strategy of the video streams of a single broadcast-directing camera or of multiple broadcast-directing cameras.
The embodiments of the present invention provide a camera parameter adjustment method, a broadcast-directing camera, and a system, which can improve the efficiency of camera parameter adjustment and improve the camera shooting effect. Detailed descriptions are provided below.
Further, refer to FIG. 2, which is a schematic flowchart of a camera parameter adjustment method according to an embodiment of the present invention. Specifically, the method of this embodiment of the present invention may be applied to the broadcast-directing camera described above. As shown in FIG. 2, the camera parameter adjustment method in this embodiment of the present invention may include the following steps:
101. Determine a target video object that needs to be shot.
102. Select, according to a preset directing strategy, a target camera for shooting the target video object from the cameras of the broadcast-directing filming system in which the broadcast-directing camera is located.
Optionally, the determining a target video object that needs to be shot may specifically be: acquiring a captured image transmitted by the binocular camera, where the captured image includes at least one video object; and establishing a video object model including the at least one video object, and determining the target video object from the at least one video object. Further, the selecting, according to a preset directing strategy, a target camera for shooting the target video object from the cameras of the broadcast-directing filming system may specifically be: determining the target video object in the captured images acquired by each camera of the broadcast-directing filming system, and acquiring the shooting effect parameters of the target video object for each camera; and determining a camera whose shooting effect parameters satisfy the preset directing strategy as the target camera for shooting the target video object. One or more broadcast-directing cameras may be deployed in the broadcast-directing filming system; that is, the system may be deployed as directing camera plus directing camera, or as directing camera plus ordinary camera (for example, an ordinary PTZ camera). Specifically, the video object model may include all the video objects in the shooting scene corresponding to the broadcast-directing filming system in which the broadcast-directing camera is located. If the other cameras in the broadcast-directing filming system also include broadcast-directing cameras, captured images sent by the binocular cameras connected to those other broadcast-directing cameras may also be received, and the video object model may be updated accordingly, so as to obtain a video object model covering all the video objects in the shooting scene. The target video object may be any one or more of the video objects in the shooting scene.
Optionally, the shooting effect parameters may include any one or more of an eye-to-eye effect parameter of the target video object in the coordinate system corresponding to a current camera, an occlusion relationship parameter, and a scene object parameter of the shooting region. The current camera is any camera in the broadcast-directing filming system other than the binocular camera; that is, the current camera may be any broadcast-directing camera or ordinary PTZ camera in the broadcast-directing filming system.
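The camera-selection step in step 102 (determining a camera whose shooting effect parameters satisfy the preset directing strategy as the target camera) can be sketched as follows. This is an illustrative sketch only: the field names, the scoring weights, and the rule that a lower score is better are assumptions, not the patent's actual strategy.

```python
# Hypothetical sketch: rank candidate cameras by the shooting effect
# parameters named in the text and pick the best-scoring one.
from dataclasses import dataclass

@dataclass
class ShootingEffect:
    rotation_angle_deg: float   # eye-to-eye effect: smaller angle is better
    occlusion: float            # occlusion relationship parameter: smaller is better
    scene_object_area: float    # lamps/windows/tables in frame: smaller is better

def score(effect: ShootingEffect) -> float:
    # Lower score = better shot; the weights are a hypothetical directing strategy.
    return (effect.rotation_angle_deg / 90.0
            + 2.0 * effect.occlusion
            + 0.5 * effect.scene_object_area)

def select_target_camera(effects: dict) -> str:
    return min(effects, key=lambda cam: score(effects[cam]))

effects = {
    "PTZ0": ShootingEffect(60.0, 0.3, 0.10),
    "PTZ1": ShootingEffect(15.0, 0.0, 0.05),
    "PTZ2": ShootingEffect(30.0, 0.4, 0.20),
}
best = select_target_camera(effects)
```

Any monotone combination of the three parameters would serve the same filtering role; the point is that each candidate camera is scored on the same parameters and the directing strategy picks the extremum.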
The eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera, and the rotation angle may be determined according to the rotation angle of the target video object in the second coordinate system (the coordinate system of the binocular camera, described below) and the pre-calibrated positional relationship between the binocular camera and the current camera. Specifically, the rotation angle of the target video object relative to the coordinate system corresponding to the current camera may refer to the angle between the face or human-shape object corresponding to the target video object and the optical axis of the current camera (a broadcast-directing camera or an ordinary PTZ camera). The smaller this angle, the more frontally the face is presented, that is, the better the eye-to-eye effect and the better the output image.
The occlusion relationship parameter and the scene object parameter may be determined by reprojecting the regions of the detected scene objects onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera. Specifically, if the regions of two video objects overlap, the depth information may be used to determine the occlusion relationship between the two objects: the video object closer to the binocular camera occludes the farther video object. The output image is better when there is no occlusion relationship (the smaller the occlusion relationship parameter, the better). The scene objects indicated by the scene object parameter may include lamps, windows, tables, and the like; the smaller the area and the number of such scene objects, the better the output image; conversely, the worse the output image.
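The occlusion determination described above (overlapping reprojected regions resolved by depth, with the nearer object occluding the farther one) can be sketched as below. The rectangle format (x, y, w, h) and the helper names are illustrative assumptions, not the patent's data structures.

```python
# Hedged sketch of the depth-based occlusion check between two video objects.

def overlap(a, b):
    # Axis-aligned rectangle intersection test; rectangles are (x, y, w, h).
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def occluder(region_a, depth_a, region_b, depth_b):
    """Return 'a', 'b', or None: which object (if any) occludes the other.
    The object with the smaller depth is closer to the binocular camera."""
    if not overlap(region_a, region_b):
        return None
    return "a" if depth_a < depth_b else "b"

# Object A at 2.0 m partially in front of object B at 3.5 m:
result = occluder((100, 80, 60, 120), 2.0, (140, 90, 60, 120), 3.5)
```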
103. Acquire first three-dimensional coordinates of the target video object.
The first three-dimensional coordinates may be the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera. The target camera may be configured as the broadcast-directing camera described above or as an ordinary PTZ camera; the first coordinate system corresponding to the target camera may be a three-dimensional coordinate system established with the optical center of the target camera as the origin, or a three-dimensional coordinate system established with any other reference as the origin, which is not limited in the embodiments of the present invention.
Optionally, the broadcast-directing camera may be connected to a preset binocular camera. In this case, the acquiring first three-dimensional coordinates of the target video object may specifically be: acquiring second three-dimensional coordinates transmitted by the binocular camera connected to the broadcast-directing camera; and converting the second three-dimensional coordinates into the first three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the target camera. Further optionally, the second three-dimensional coordinates may be calculated by the binocular camera from the separately acquired two-dimensional coordinates of the target video object in the left view and the right view of the binocular camera, together with the acquired intrinsic and extrinsic parameter data of the binocular camera. The second three-dimensional coordinates are the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; the second coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as the origin, or a three-dimensional coordinate system established with any other reference as the origin. The two-dimensional coordinates may specifically be the pixel coordinates of the target video object in the left view and the right view of the binocular camera.
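A minimal sketch of the coordinate conversion just described, under the assumption that the pre-calibrated positional relationship is expressed as a rotation R and translation t such that the first coordinates are R times the second coordinates plus t; all numeric values are made up for illustration.

```python
# Sketch: convert second 3D coordinates (binocular-camera frame) into
# first 3D coordinates (target-camera frame) via a rigid transform.
import numpy as np

def to_target_frame(p_binocular, R, t):
    """p_first = R @ p_second + t (assumed form of the calibrated relation)."""
    return R @ p_binocular + t

# Example: target-camera frame rotated 90 degrees about Z and shifted
# 1 m along X relative to the binocular-camera frame.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])
p_first = to_target_frame(np.array([0.5, 0.0, 2.0]), R, t)
```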
In a specific embodiment, the positional relationship between the two cameras of the binocular camera, the positional relationship between the broadcast-directing camera and the binocular camera, and the positional relationships between the cameras at the multiple camera positions in the broadcast-directing filming system may be calibrated in advance. The parameters obtained by calibrating the binocular camera system may be used to calculate the three-dimensional coordinates of a video object in the coordinate system corresponding to the binocular camera; the calibrated positional relationship between the broadcast-directing camera and the binocular camera may be used to calculate the three-dimensional coordinates of a video object in the coordinate system of the broadcast-directing camera; and the parameters obtained by calibrating the positional relationships between the cameras at multiple positions may be used, in a multi-position deployment scenario, to calculate the three-dimensional coordinates of a video object in the camera coordinate system of each position, so as to facilitate coordinate conversion. The cameras at the multiple positions may be deployed as directing camera plus directing camera, or as directing camera plus ordinary PTZ camera. Each broadcast-directing camera may be referred to as one camera position. When multiple broadcast-directing cameras cooperate in shooting, one of them may be determined as the master position and the rest as slave positions; a broadcast-directing camera serving as a slave position may register information such as its IP address with the master position, so that the master position can manage the multiple slave positions. Specifically, the calibration process is briefly described below. The binocular camera includes a left camera and a right camera; the image acquired by the left camera may be referred to as the left view, and the image acquired by the right camera may be referred to as the right view. The imaging (projection) model of a single camera can be described by the following formula:
x = PX = K[R|t]X
As shown in FIG. 3a, x is the pixel coordinates, in the image coordinate system, of a point in the scene (that is, a video object, and specifically a feature point corresponding to the video object); these are two-dimensional coordinates. X is the position coordinates of the point in the world coordinate system, and P is the 3×4 projection matrix (PX denotes the product P×X). K is the 3×3 camera intrinsic matrix, which can be expressed as:
K = | fx  s   cx |
    | 0   fy  cy |
    | 0   0   1  |
where fx and fy are the equivalent focal lengths in the x and y directions, cx and cy are the image coordinates of the optical center, and s is the skew coefficient (caused by the sensor not being exactly perpendicular to the optical axis; it is usually very small and can be ignored during calibration).
In addition, R and t are the camera extrinsic parameters, expressed as a 3×3 rotation matrix and a 3×1 translation vector, respectively, as follows:
R = [r1 r2 r3]
t = [t1 t2 t3]^T
where r1, r2, and r3 are the 3×1 column vectors of the rotation matrix.
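The projection model x = PX = K[R|t]X can be exercised numerically as below; all parameter values are illustrative, not calibration results.

```python
# Sketch of the single-camera projection model: build K from fx, fy, cx, cy,
# s, assemble P = K [R | t], and project a homogeneous world point.
import numpy as np

fx, fy, cx, cy, s = 800.0, 800.0, 640.0, 360.0, 0.0
K = np.array([[fx,  s, cx],
              [0., fy, cy],
              [0., 0., 1.]])
R = np.eye(3)                        # extrinsic rotation (world -> camera)
t = np.array([[0.0], [0.0], [0.0]])  # extrinsic translation

P = K @ np.hstack([R, t])            # 3x4 projection matrix

X = np.array([0.1, -0.05, 2.0, 1.0])  # homogeneous world point, 2 m away
x_h = P @ X
pixel = x_h[:2] / x_h[2]              # divide by depth to get pixel coordinates
```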
Due to factors such as the optical characteristics of the camera lens and the manufacture and installation of the image sensor, the image actually captured by the camera is not ideal and contains distortion; the image distortion can therefore be modeled in order to obtain an ideal image. Specifically, the camera image distortion model can be described by the following formulas:
xp = xd(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2p1·xd·yd + p2(r² + 2xd²)
yp = yd(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1(r² + 2yd²) + 2p2·xd·yd, where r² = xd² + yd²
where xp, yp is the pixel position after correction, xd, yd is the pixel position before correction, k1, k2, and k3 are the radial distortion coefficients, and p1, p2 are the tangential distortion coefficients.
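A sketch of the distortion model the text describes, with radial coefficients k1, k2, k3 and tangential coefficients p1, p2, mapping a pre-correction position (xd, yd) to a corrected position (xp, yp). The patent's own formula figure is not reproduced here, so the common Brown-Conrady form in normalized (unit-focal-length) coordinates is assumed, and the coefficient values are made up.

```python
# Hedged sketch of Brown-style distortion handling in normalized coordinates.

def correct(xd, yd, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    r2 = xd * xd + yd * yd
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xp = xd * radial + 2.0 * p1 * xd * yd + p2 * (r2 + 2.0 * xd * xd)
    yp = yd * radial + p1 * (r2 + 2.0 * yd * yd) + 2.0 * p2 * xd * yd
    return xp, yp

# With all coefficients zero the image is already ideal:
ident = correct(0.3, -0.2)
# A small positive k1 pushes points radially outward:
shifted = correct(0.3, -0.2, k1=0.1)
```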
Based on the above imaging model of a monocular camera, when the rotation matrices R1 and R2 and the translation vectors t1 and t2 that transform the world coordinate system into the left camera coordinate system and the right camera coordinate system are known, the relative extrinsic parameters between the two cameras of the binocular camera, including the rotation matrix R and the translation vector T, can be obtained:
R = R2·R1^T
T = t2 - R·t1
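Given world-to-left extrinsics (R1, t1) and world-to-right extrinsics (R2, t2), the relative extrinsics of the stereo pair can be computed as sketched below. The formulas R = R2·R1^T and T = t2 - R·t1 are the standard stereo relations, used here as an assumption since the patent's own formula is in a figure that is not reproduced; the sanity check verifies that a point expressed in both camera frames stays consistent.

```python
# Sketch: relative extrinsics between the left and right cameras.
import numpy as np

def relative_extrinsics(R1, t1, R2, t2):
    R = R2 @ R1.T
    T = t2 - R @ t1
    return R, T

rng = np.random.default_rng(0)
R1, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal basis
t1 = rng.normal(size=3)
R2, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t2 = rng.normal(size=3)

R, T = relative_extrinsics(R1, t1, R2, t2)

Xw = rng.normal(size=3)          # a world point
p_left = R1 @ Xw + t1            # same point in the left-camera frame
p_right = R2 @ Xw + t2           # same point in the right-camera frame
ok = np.allclose(R @ p_left + T, p_right)
```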
It should be understood that, in the embodiments of the present invention, the positional relationship between the two cameras of the binocular camera and the positional relationship between the broadcast-directing camera (for example, a PTZ camera) and the binocular camera are fixed, and these two calibrations can be completed before delivery from the factory; that is, the data obtained from these two calibrations, such as the intrinsic and extrinsic parameter data, is fixed. Optionally, in the embodiments of the present invention, the camera calibration may adopt various schemes, such as Zhang's plane-based calibration method ("Zhang's calibration method"), with the distortion parameters calculated using Brown's method; details are not described herein.
Further, it can be seen from the above binocular camera calibration principle that calibrating the positional relationships of cameras at multiple positions, such as multiple broadcast-directing cameras, essentially amounts to finding the relative extrinsic parameters between each pair of adjacent broadcast-directing cameras; the extrinsic parameters between any two broadcast-directing cameras are then calculated from the relative extrinsic parameters between adjacent ones, thereby obtaining the positional relationship between any two broadcast-directing cameras. When multiple broadcast-directing cameras are deployed, a large overlapping shooting region is required between every two adjacent cameras, and the multiple positions form a system similar to a surround multi-camera system. The rotation matrix and the translation vector of the i-th camera relative to the j-th camera are:
Ri,i-1 · Ri-1,i-2 · … · Rj+1,j
Ri,i-1 · Ri-1,i-2 · … · Rj+2,j+1 · Tj + Ri,i-1 · Ri-1,i-2 · … · Rj+3,j+2 · Tj+1 + … + Ri,i-1 · Ti-2 + Ti-1
where Ri,i-1 · Ri-1,i-2 · … · Rj+1,j denotes the matrix product Ri,i-1 × Ri-1,i-2 × … × Rj+1,j. Because the positions of the cameras used for localization on different directing brackets change according to the actual deployment scenario when the broadcast-directing cameras are deployed, the positional relationships between the multiple broadcast-directing cameras cannot be pre-calibrated before the devices leave the factory; instead, on-site calibration may be performed when the broadcast-directing cameras are deployed.
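The chained products above can be implemented by folding the pairwise relative extrinsics together. In this sketch, R_pairs[k] and T_pairs[k] are assumed to transform camera k's frame into camera (k+1)'s frame, matching the indexing of the formulas; the example values are made up.

```python
# Sketch: compose pairwise relative extrinsics into camera i relative to camera j.
import numpy as np

def chain(R_pairs, T_pairs, i, j):
    """R_pairs[k], T_pairs[k] map camera k's frame into camera k+1's frame.
    Returns (R_ij, T_ij) mapping camera j's frame into camera i's frame."""
    R = np.eye(3)
    T = np.zeros(3)
    for k in range(j, i):          # fold in j->j+1, then j+1->j+2, ...
        R = R_pairs[k] @ R
        T = R_pairs[k] @ T + T_pairs[k]
    return R, T

# Two 90-degree yaw steps with 1 m offsets compose into a 180-degree yaw:
Rz90 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
R_pairs = [Rz90, Rz90]
T_pairs = [np.array([1., 0., 0.]), np.array([1., 0., 0.])]
R02, T02 = chain(R_pairs, T_pairs, 2, 0)
```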
Further, it is assumed that the cameras in the broadcast-directing filming system are all broadcast-directing cameras, and that each broadcast-directing camera is connected to a binocular camera. Refer to FIG. 3b, which is a schematic diagram of a calibration scenario of multiple broadcast-directing cameras according to an embodiment of the present invention. As shown in FIG. 3b, during calibration, two adjacent broadcast-directing cameras, or a binocular camera and a broadcast-directing camera, may communicate through a Local Area Network ("LAN" for short) or a Wireless Fidelity ("Wi-Fi" for short) network, including transmitting calibration template images and calibration parameters; the transmission may use various network protocols, for example, the HyperText Transfer Protocol ("HTTP" for short). Specifically, a global ID number may be assigned in advance to each broadcast-directing camera and to each camera (binocular camera) connected to a broadcast-directing camera. For example, a camera may be selected as the starting position, such as the leftmost or rightmost camera, and the ID numbers of the other cameras increase counterclockwise or clockwise. A group of cameras is selected from all the cameras to participate in the calibration; the selection principle may be to maximize the overlapping region between adjacent cameras. As shown in FIG. 3b, it is assumed that three camera positions D1, D2, and D3 are deployed in the current shooting scene, and each position includes a broadcast-directing camera (denoted PTZ0, PTZ1, and PTZ2, respectively) and a binocular camera (each binocular camera includes a left camera and a right camera, denoted C0, C1, C2, C3, C4, and C5). It is assumed that the cameras with ID numbers C0, C2, and C4 are selected for calibration, and that one of the broadcast-directing cameras, such as the master-position broadcast-directing camera described above, is selected as the calibration computing device. Further, the extrinsic calibration between each pair of cameras may be performed from left to right or from right to left to obtain the relative extrinsic parameters between the two cameras. Optionally, each broadcast-directing camera may maintain a camera relative-position relationship table, as shown in Table 1 below. Each calibration adds or updates one entry in the table, and each entry is uniquely determined by its two camera ID numbers.
Table 1
After the calibration is completed, the broadcast-directing camera serving as the calibration computing device can send the position relationship table through the network to all the other broadcast-directing cameras for storage. Further, according to the position relationship table and the extrinsic parameters of the binocular cameras (which can be calibrated before delivery from the factory), the positional relationship between any two cameras in the calibration scenario (including between binocular cameras, between a binocular camera and a PTZ camera, and between PTZ cameras) can be calculated.
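Since the contents of Table 1 are not reproduced in the text, the sketch below only illustrates the stated behavior of the relative-position relationship table: each entry is uniquely keyed by its two camera ID numbers, and a calibration run adds or updates exactly one entry. The entry payload here is a placeholder, not the patent's actual record format.

```python
# Sketch: the per-camera relative-position relationship table.
table = {}

def update_entry(id_a, id_b, extrinsics):
    # Key on the sorted ID pair so (C0, C2) and (C2, C0) address the same entry.
    table[tuple(sorted((id_a, id_b)))] = extrinsics

def lookup(id_a, id_b):
    return table.get(tuple(sorted((id_a, id_b))))

update_entry("C0", "C2", {"R": "R_02", "T": "T_02"})
update_entry("C2", "C4", {"R": "R_24", "T": "T_24"})
update_entry("C2", "C0", {"R": "R_02_new", "T": "T_02_new"})  # update, not insert
```

After a calibration run, broadcasting this dictionary to the other broadcast-directing cameras keeps every position's copy of the table consistent.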
For example, assume that the broadcast-directing camera D3 in FIG. 3b is the calibration computing device, and that the broadcast-directing cameras D1 and D2 are the cameras whose positional relationship is to be calibrated. The broadcast-directing camera D3 may be set as the master position for calibration, the other cameras set as slave positions, and the calibration initiated by D3. Before calibration, it must be ensured that the cameras are interconnected through the network, that the cameras to be calibrated can capture an overlapping region, and that a calibration template (such as a checkerboard template) is present in the overlapping region. When calibration is needed, D3 starts the calibration process and sends an image acquisition command to the broadcast-directing camera D1; the acquisition command contains the ID number of D1 (that is, D1) and the ID number of the binocular camera to be captured (C4 or C5). After receiving the acquisition command, D1 captures an image of the calibration template and transmits the captured image data to D3. Similarly, D3 obtains the calibration template image captured by the binocular camera on the broadcast-directing camera D2. If a binocular camera to be calibrated is located on D3 itself, D3 can acquire its calibration template image directly. After obtaining the calibration template images of the cameras to be calibrated, D3 performs checkerboard corner detection on the calibration template images of the two cameras; if all the checkerboard corners can be detected in both images, the acquisition is successful; otherwise, the two images are discarded and the images are re-acquired. Further, by changing the position of the calibration template, multiple calibration template images of the two cameras to be calibrated may be acquired cyclically and saved in D3. When the required number of calibration template images is reached, D3 can perform the camera calibration; the intrinsic parameters of each camera have already been calibrated at the factory and can therefore serve as initial input values for the calibration. After the calibration is completed, the relative extrinsic parameters R and T between the two cameras are obtained, and it is checked whether the reprojection error is less than a preset threshold: if the reprojection error is greater than the threshold, the calibration has failed; otherwise, the calibration is successful. After D3 completes the calibration, it updates the position relationship table based on the calculated relative camera positions and sends the table to the other broadcast-directing cameras.
Further, after the positional relationship between the two cameras of the binocular camera, the positional relationship between the broadcast-directing camera and the binocular camera, and the positional relationships between the multiple broadcast-directing cameras have been calibrated, a video object within the shooting range of the broadcast-directing camera can be localized and its three-dimensional position information obtained, so as to determine a suitable broadcast-directing camera position according to the acquired three-dimensional position information, adjust the parameters of the broadcast-directing camera according to the directing strategy corresponding to the three-dimensional position information, and control the broadcast-directing camera to move to a suitable position to shoot the video object. The localization of a video object includes three-dimensional localization by the binocular camera, localization by a single broadcast-directing camera such as a PTZ camera, and three-dimensional localization between the cameras at multiple positions.
Specifically, in the three-dimensional positioning process of a binocular camera, the stereo image pair captured by the camera can be used to compute the depth of an observation point in the scene in the camera coordinate system, and thus the point's three-dimensional position. This works on the same principle as human binocular depth perception and is referred to as binocular camera ranging. FIG. 3c provides a schematic of three-dimensional positioning with a binocular camera; the ranging principle of the binocular camera system is briefly introduced below. P is an observation point in the world coordinate system, imaged by both the left and the right camera. Its position in the left camera's physical coordinate system is (X_L, Y_L, Z_L), and its imaged pixel position in the left view is (x_l, y_l); its position in the right camera's physical coordinate system is (X_R, Y_R, Z_R), and its imaged pixel position in the right view is (x_r, y_r). Let the relative extrinsic parameters of the left and right cameras be R and T, and let their focal lengths be f_l and f_r. According to the binocular camera model, the imaging models of the two cameras relate these image coordinates to the physical coordinates as follows:
From the above formulas, it can be derived that:
The values of x_l, y_l, x_r, y_r can be obtained by image matching, and f_l, f_r, R, T can be obtained by binocular camera calibration; therefore the values of (X_L, Y_L, Z_L) and (X_R, Y_R, Z_R) can be computed, determining the three-dimensional coordinates of the observation point in the coordinate system corresponding to the binocular camera.
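The triangulation described above can be illustrated with a minimal sketch. The patent's own formulas appear only as figures, so this assumes the common simplified case of a rectified stereo pair (identity rotation, baseline b along the X axis, equal focal lengths); the function name and parameters are illustrative, not the patent's exact model:

```python
import math

def triangulate_rectified(xl, yl, xr, f, b, cx, cy):
    """Estimate a point's 3D position in the left-camera frame from a
    rectified stereo pair (simplified case: R = identity, T = [-b, 0, 0],
    equal focal lengths). d = xl - xr is the disparity in pixels."""
    d = xl - xr
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    Z = f * b / d          # depth from similar triangles
    X = (xl - cx) * Z / f  # back-project the left-view pixel
    Y = (yl - cy) * Z / f
    return X, Y, Z

# a point at roughly 2 m depth with a 6 cm baseline and f = 600 px (made-up numbers)
X, Y, Z = triangulate_rectified(xl=350.0, yl=240.0, xr=332.0,
                                f=600.0, b=0.06, cx=320.0, cy=240.0)
```

In the general (non-rectified) case of the patent, the full extrinsics R and T between the two cameras enter the equations instead of a pure X-axis baseline.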
Further, in the three-dimensional positioning process of a navigation camera such as a PTZ camera, the basic problem is: given a target's physical coordinates in the PTZ camera's coordinate system, how to rotate the PTZ camera so that a chosen point on that target lands on a specific pixel position in the image. The target's physical coordinates in the PTZ camera's coordinate system can be computed from its three-dimensional position in the binocular camera's coordinate system, together with the calibrated positional relationship between the binocular camera and the PTZ camera. FIG. 3d is a schematic diagram of a PTZ camera rotation model provided by an embodiment of the present invention. As shown in FIG. 3d, suppose the desired image position of the target point P is (x_0, y_0), P's physical coordinate position is (X, Y, Z), and its current pixel position on the imaging plane is (x_c, y_c). The camera can then be rotated about the X axis and the Y axis so that P's pixel position coincides with the desired position; the rotation angles of the Pan and Tilt operations can be modeled by the following formulas:
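The Pan/Tilt formulas themselves are reproduced only as figures in the source, so the sketch below uses one plausible form of such a model, not the patent's exact equations: the pan and tilt deltas are the difference between the angle subtended by the physical target point and the angle corresponding to the desired pixel position. All names are hypothetical:

```python
import math

def pan_tilt_to_target(X, Y, Z, x0, y0, fx, fy, cx, cy):
    """One plausible Pan/Tilt model: rotate so that the ray to the physical
    point (X, Y, Z) lands on the desired pixel (x0, y0). Angles are in
    radians; fx, fy, cx, cy are intrinsics at the current zoom value."""
    # angle of the target point about the camera's Y axis (pan) and X axis (tilt)
    pan_point = math.atan2(X, Z)
    tilt_point = math.atan2(Y, Z)
    # angle corresponding to the desired pixel position on the image plane
    pan_pixel = math.atan2(x0 - cx, fx)
    tilt_pixel = math.atan2(y0 - cy, fy)
    return pan_point - pan_pixel, tilt_point - tilt_pixel

# center a point 1 m right and 2 m ahead in the image (illustrative values)
dp, dt = pan_tilt_to_target(X=1.0, Y=0.0, Z=2.0,
                            x0=320, y0=240, fx=600, fy=600, cx=320, cy=240)
```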
Since a PTZ camera is a zoom camera, the functional relationship between the zoom factor Z and intrinsic parameters such as the focal length and the distortion coefficients must be obtained. For example, a polynomial can be fitted to the relationship between the zoom factor Z and the focal lengths f_x, f_y, giving:
f_x = a_0 + a_1 Z + a_2 Z^2 + … + a_n Z^n
f_y = b_0 + b_1 Z + b_2 Z^2 + … + b_n Z^n
Specifically, the camera intrinsics are calibrated at different values of Z, the corresponding f_x, f_y and distortion coefficients are computed, and the polynomial coefficients are fitted by the least squares method. Other intrinsics such as the distortion coefficients can be handled similarly. Once the intrinsics at different Z values are known, the values of Δp and Δt can be computed from the Pan/Tilt model formulas.
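The least-squares fitting step can be sketched with the standard library alone (normal equations solved by Gaussian elimination); the sample zoom/focal-length values below are made up, and in practice each intrinsic parameter would be fitted against the calibrated samples at different zoom values Z:

```python
def polyfit_least_squares(zs, fs, degree):
    """Fit f = a0 + a1*Z + ... + an*Z^n by least squares (normal equations
    solved with plain Gaussian elimination; fine for the low degrees used here)."""
    n = degree + 1
    # build A^T A and A^T b for the Vandermonde system
    ata = [[sum(z ** (i + j) for z in zs) for j in range(n)] for i in range(n)]
    atb = [sum(f * z ** i for z, f in zip(zs, fs)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atb[col], atb[piv] = atb[piv], atb[col]
        for r in range(col + 1, n):
            m = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= m * ata[col][c]
            atb[r] -= m * atb[col]
    # back substitution
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (atb[r] - s) / ata[r][r]
    return coeffs  # [a0, a1, ..., an]

# calibrations at zoom factors 1..4 that happen to lie on f = 500 + 120*Z
coeffs = polyfit_least_squares([1, 2, 3, 4], [620, 740, 860, 980], degree=1)
```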
Further and optionally, in a multi-navigation-camera scene, captured images sent by binocular cameras connected to other navigation cameras may also be acquired, and the video object model updated after video object matching. Determining the target video object from a captured image obtained by a camera may then specifically proceed as follows: convert the second three-dimensional coordinates into third three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the current camera; determine whether the overlap between the region of the target video object at the third three-dimensional coordinates and the region of a video object detected by the current camera, at that object's three-dimensional coordinates, exceeds a preset area threshold; if it does, determine that video object to be the target video object, i.e. the video objects are matched successfully. Here the third three-dimensional coordinates are the coordinates of the target video object in the third coordinate system corresponding to the current camera, and the current camera is any camera in the navigation camera system other than the binocular camera. For example, in a multi-navigation-camera scene the current camera may be any binocular camera other than the binocular camera at the host position.
In a specific embodiment, the purpose of multi-position three-dimensional positioning of a video object is to compute, from the object's three-dimensional coordinates in the binocular coordinate system of one navigation camera, its three-dimensional coordinates in the binocular coordinate system of another navigation camera or in a PTZ camera's coordinate system. Given the coordinate vector X_1 of an observation point (i.e. a video object, in practice a particular feature point of the object) in camera D1, and the extrinsic parameters R_21, t_21 of camera D2 relative to camera D1 (obtained by binocular camera calibration), the point's coordinate vector X_2 in camera D2 can be computed as:
X_2 = R_21 X_1 + t_21
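This coordinate transfer can be sketched directly; the rotation and translation values below are made up for illustration:

```python
def transform_point(R, t, x):
    """Map a 3D point from camera D1's frame into camera D2's frame using the
    calibrated extrinsics (R, t) of D2 relative to D1: X2 = R @ X1 + t."""
    return [sum(R[i][j] * x[j] for j in range(3)) + t[i] for i in range(3)]

# a 90-degree rotation about the Y axis plus a 0.5 m shift along X (illustrative)
R = [[0, 0, 1],
     [0, 1, 0],
     [-1, 0, 0]]
t = [0.5, 0.0, 0.0]
x2 = transform_point(R, t, [1.0, 2.0, 3.0])
```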
Specifically, FIG. 4a is a schematic diagram of an object matching scenario provided by an embodiment of the present invention. Multi-position three-dimensional positioning of video objects can be used to determine correspondences among multiple video objects. As shown in FIG. 4a, three camera positions D1, D2 and D3 are deployed in the scene, which contains three participants O_1, O_2 and O_3. Further, FIG. 4b shows a set of video object images from FIG. 4a: participant O_1 as imaged by navigation cameras D1, D2 and D3 from their different viewpoints. For participant O_1, the binocular camera at position D1 detects video object VO_11 with an algorithm such as face detection, and then uses the binocular three-dimensional positioning algorithm to obtain the object's three-dimensional position in D1's binocular coordinate system. Likewise, D2 and D3 detect video objects VO_12 and VO_13 and compute the object's three-dimensional coordinates in the D2 and D3 binocular coordinate systems.
During multi-position three-dimensional positioning, the calibrated positional relationships among D1, D2 and D3 can be used to transform the three-dimensional position of video object VO_11 from the D1 coordinate system into the D2 and D3 coordinate systems, and the overlap of the regions is then checked. If the transformed three-dimensional position of VO_11 overlaps the positions of VO_12 and VO_13 by more than a certain area threshold, VO_11, VO_12 and VO_13 are considered to be the same video object, and the match succeeds. Further, if the image contains several video objects close to one another, determining correspondences purely from positional overlap may produce matching errors. The image content of the video objects can therefore also be used, with a matching algorithm, to improve the accuracy of the correspondences. Such matching algorithms include template matching and the like: for example, the two-dimensional image of a video object detected by the binocular camera at one position, such as the host position, is taken as a known template and matched one by one against the video objects detected by the binocular cameras at the other positions; the best-matching object is found with algorithms such as squared-difference matching or correlation matching, establishing the object correspondence.
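The overlap test can be sketched with two-dimensional axis-aligned rectangles standing in for the projected object regions (the patent compares regions at three-dimensional coordinates; the threshold value and function names here are illustrative):

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles given as
    (x_min, y_min, x_max, y_max) — a simplified stand-in for the region
    overlap test described above."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def same_object(region_projected, region_detected, area_threshold):
    """Declare a match when the projected region of one station's object and
    the region detected by another station overlap by more than the threshold."""
    return overlap_area(region_projected, region_detected) > area_threshold

# VO11 projected into another station's frame vs. that station's detection
matched = same_object((0, 0, 10, 10), (8, 8, 18, 18), area_threshold=2.0)
```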
Further, after the three-dimensional positioning of the binocular cameras and the PTZ cameras has been established, video object detection/tracking and scene modeling can be carried out. The purpose of video object detection/tracking is to construct and describe the video objects present in the scene and to track and identify them. Video objects include participant objects as well as scene objects such as lamp tubes, windows and conference tables. The system must cyclically process the image data from the binocular cameras, performing face detection and matching, human-figure detection and matching, moving-object detection and matching, scene-object detection and matching, and so on, building models of the video objects and updating the model parameters, so that the whole shooting scene is modeled from the detected object models. The resulting scene model can then be used for subsequent object recognition and guiding-strategy processing. Face detection can be used for nearby video objects, such as participants seated close to the camera; for distant regions, where faces are too small to detect reliably, human-figure or moving-object detection can be used instead. Face detection yields the parameters of a face video object, including the two-dimensional coordinates of the face's bounding rectangle, the center point coordinates, the rectangle area, the face's rotation angles about the coordinate axes (representing yaw, pitch and roll), and the positions of facial features such as the eyes, nose and mouth.
Further, after video objects are detected in each image frame, they must also be tracked across the video frame sequence so as to establish their correspondence in the time domain. Widely used video object tracking algorithms include grayscale-based template matching, MeanShift, CamShift and Kalman filtering. Video object matching can also be applied within a binocular camera: the video object region detected in one camera's image is used to find the corresponding region in the other camera's image, so that feature matching and three-dimensional coordinate computation can be performed within the matched region of the video object. Matching algorithms for video objects are similar to the tracking algorithms; grayscale-based template matching, MeanShift and the like can be used.
In the embodiments of the present invention, a video object can be represented by its features; commonly used features include feature points, image texture and histogram information. Feature detection and matching can be carried out within the detected video object region, so that the object's three-dimensional position information, i.e. its three-dimensional coordinates, can be computed from the feature point information, and the object can be tracked using the texture and histogram information. Feature points are the principal feature type; feature point detection algorithms include Harris corner detection and SIFT feature point detection. Further, feature matching is used to establish the correspondence between features of the same video object in the two views of a binocular camera: feature points can be matched with algorithms such as FLANN or the KLT optical flow method, image texture with algorithms such as grayscale template matching, and histograms with algorithms such as histogram matching. In summary, from the matched feature information and the binocular three-dimensional positioning algorithm described above, the three-dimensional coordinates of a video object's features in a single navigation camera's coordinate system can be computed, so that the object can be located and tracked in three-dimensional space.
Further, from the data produced by the video object detection and matching and the feature detection and matching algorithms, together with the computed three-dimensional positions of the video objects, models of multiple video objects can be established in a single navigation camera's coordinate system, and the model data can be updated by the face, human-figure and motion detection/tracking algorithms. Specifically, each video object model can be assigned a unique ID number, and the data in the model represents the object's attributes. For a moving-object model, for example, the data may include the object ID, the two-dimensional coordinates of its bounding rectangle, the three-dimensional coordinates of its feature points, the texture data of the motion region, histogram data and other attributes. When the moving object's position changes, its attributes are refreshed from the output of the detection and matching algorithms described above, but the object's ID remains unchanged. Face and human-figure object models are established in a similar way to moving-object models and are not described further here.
It should be understood that, in a multi-position application scenario, multiple navigation cameras can exchange video object model data over network communication. After a single navigation camera obtains the video object model data of the other navigation cameras, it can use the multi-camera three-dimensional positioning and video object matching algorithms described above to establish correspondences between the video object models, and thus derive a guiding strategy for the whole scene. The network communication protocol can be a standard protocol such as HTTP or a custom protocol; the video object model data is formatted, packed and transmitted in a defined format such as eXtensible Markup Language ("XML"). By matching and merging the video object models of multiple navigation cameras, a single navigation camera can build a model of the entire shooting scene. The scene model contains the models of multiple video objects and reflects their features and their distribution in three-dimensional space. The navigation camera must maintain the scene model, including adding and deleting object models and object model attributes. For example, when a new participant appears in the scene and the binocular camera detects a new face or human-figure object, an object model is created and added to the object model set; when a participant leaves the scene, that object's model is deleted; when a participant's position changes, the parameters of the corresponding object model are updated. A guiding strategy is then formulated from the latest video object models, and the camera at the best position is selected for shooting.
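As a purely illustrative sketch of the data exchange just described — the patent does not specify a schema, and every element and attribute name below is hypothetical — an XML payload carrying one video object model might look like:

```xml
<!-- Hypothetical payload: element and attribute names are illustrative,
     not taken from the patent -->
<videoObjectModel cameraId="D1">
  <object id="VO11" type="face">
    <boundingRect x="120" y="80" width="64" height="64"/>
    <position3d x="0.35" y="-0.10" z="2.40"/>
    <rotation yaw="12.5" pitch="-3.0" roll="0.8"/>
  </object>
</videoObjectModel>
```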
104. Adjust the imaging parameters of the target camera to the imaging parameters corresponding to the first three-dimensional coordinates, and output the video image captured with the adjusted imaging parameters.
In a specific embodiment, after the video object model covering all video objects has been established (or updated) and the target video object has been determined, one or more navigation camera positions with the best shooting effect can be selected according to the preset guiding strategy — for example, the camera with the better shooting effect can be determined from the eye-to-eye effect parameter, the occlusion relationship parameter, and the scene object parameters of the shooting area. Specifically, the eye-to-eye effect is determined from the angle between the face/human-figure object and the PTZ camera's optical axis: the smaller the angle, the more frontally the face is presented and the better the eye-to-eye effect. The face/human-figure detection algorithm yields the rotation angles (yaw, pitch and roll) of the coordinate axes centered on the face/figure relative to the binocular camera's coordinate system, and the inter-camera coordinate conversion formulas described above are used to convert the rotation angles of the face/figure relative to the binocular camera into rotation angles relative to the PTZ camera. This conversion requires the calibrated extrinsic parameters between the binocular camera and the PTZ camera, and between the binocular cameras at different positions, to determine the eye-to-eye effect parameter. A PTZ camera priority queue based on eye-to-eye effect can then be established for each video object, with cameras offering a better eye-to-eye effect given higher priority.
Further, when acquiring the occlusion relationships of video objects, the region of a video object detected by a navigation camera (for example, its bounding rectangle) is known, and from the camera projection equation, the calibrated extrinsics between a navigation camera's binocular camera and PTZ camera, and the extrinsics between the binocular cameras at different positions, this region can be re-projected onto the imaging plane of the PTZ camera at each position. If the regions of two video objects overlap, the depth information can be used to determine the occlusion relationship between them: the video object closer to the binocular camera occludes the farther one. A PTZ camera priority queue based on occlusion can thus be established for each video object, with unoccluded cameras given higher priority.
Further, besides detecting person-based video objects, the system also detects other video objects of interest in the scene (scene objects), such as lamp tubes, windows and conference tables. These objects can be detected with algorithms based on image color and edge features, among others. For lamp tube detection, for example, the Canny operator can first extract the tube's edges, yielding its long straight-line feature, and the adjacent region can then be checked for an overexposed pixel region (the light-emitting area); from these two features the lamp tube object can be detected and the coordinates of its bounding rectangle obtained. Window detection is similar: a quadrilateral feature is obtained by edge detection, and whether it is a window is judged by whether the quadrilateral contains an overexposed pixel region of sufficient area. Conference tables can likewise be detected from edge features in the image. When acquiring the scene object parameters, the region of a scene object detected by a navigation camera is known, and from the camera projection equation, the calibrated extrinsics between a navigation camera's binocular camera and PTZ camera, and the extrinsics between the binocular cameras at different positions, this region can be re-projected onto the imaging plane of the PTZ camera at each position. Objects such as lamp tubes and windows usually produce large overexposed areas, degrading the camera's automatic exposure and darkening the scene, while scene objects such as tables may present large red or yellow color regions that bias the camera's automatic white balance; these scene objects should be kept out of the image as far as possible. A PTZ camera priority queue can therefore be established according to whether scene objects detrimental to image quality would be captured, with cameras less likely to capture such objects given higher priority.
Further, a navigation camera such as the one at the host position builds priority queues for the PTZ cameras at each position from the acquired image effect parameters, combined with the preset guiding strategy, and can thereby determine the camera to select. Specifically, the one or more video objects to be shot — the target video object — can be determined in advance, for example a currently speaking video object identified from sound source localization results, so as to shoot a close-up of that object; or, with an AutoFrame strategy that takes all video objects as the target video object, Pan/Tilt can be adjusted to bring all video objects in the scene into the shooting range and Zoom adjusted so that the objects appear at a suitable size, and so on. For the target subject, the PTZ camera priority queues for the eye-to-eye effect parameter, the occlusion relationship parameter and the scene object parameters are combined, according to a given guiding strategy, into one overall PTZ camera priority queue. The guiding strategy can be computed automatically by the system or preset by the user; the embodiments of the present invention do not limit this.
For example, suppose priority is given to an unoccluded PTZ camera with the best eye-to-eye effect; if several cameras satisfy this condition, the PTZ camera with the best scene object parameters, i.e. the best image effect, is selected as the target camera for shooting. Once the PTZ camera has been selected, the host position can adjust the selected camera's PTZ parameters according to the three-dimensional coordinates of the target video object so as to obtain the best possible image. For example, during voice tracking, objects that affect image brightness, such as lamp tubes and windows, are avoided when shooting a participant close-up; with AutoFrame, the Zoom is adjusted so that large table surfaces that would disturb the image's white balance are kept out of frame; and so on.
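The example policy above can be sketched as a selection over per-camera scores; the field names and the tie-breaking order are assumptions for illustration only:

```python
def select_camera(cameras):
    """Pick a target camera following the example policy: prefer unoccluded
    cameras, then the smallest eye-to-eye angle (more frontal face), then
    the highest scene-object (image effect) score as a tie-breaker.
    Each camera is a dict with made-up field names."""
    unoccluded = [c for c in cameras if not c["occluded"]]
    candidates = unoccluded if unoccluded else cameras
    return min(candidates, key=lambda c: (c["eye_angle"], -c["scene_score"]))

cams = [
    {"id": "C1", "occluded": False, "eye_angle": 20.0, "scene_score": 0.7},
    {"id": "C2", "occluded": True,  "eye_angle": 5.0,  "scene_score": 0.9},
    {"id": "C3", "occluded": False, "eye_angle": 20.0, "scene_score": 0.9},
]
best = select_camera(cams)  # C2 is excluded despite its angle: it is occluded
```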
Further, the host position (the navigation camera at the host position) can output the selected PTZ camera's video image or its ID. Optionally, in a multi-navigation-camera system that supports video cascading, the host position can directly output the selected camera's image; in a multi-navigation-camera system that outputs through a video matrix, the host position can output the selected PTZ camera's ID to the video matrix through a communication interface (such as a serial port or a network port), and the video matrix performs the camera image switch.
In the embodiments of the present invention, after the target video object to be shot has been determined, the target camera that shoots that object best can be selected from the cameras of the navigation camera system according to the preset guiding strategy, and the target video object's three-dimensional coordinates in the coordinate system corresponding to that target camera can be obtained, so as to control the target camera to adjust its camera parameters according to those three-dimensional coordinates and output the video image captured with the adjusted imaging parameters. The navigation camera system can thus operate on three-dimensional coordinate detection and a preset guiding strategy, improving the accuracy of video object detection and tracking while raising the efficiency of camera parameter adjustment and effectively improving the camera's shooting results.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a parameter adjustment apparatus provided by an embodiment of the present invention. Specifically, the apparatus of this embodiment of the present invention may be arranged in the navigation camera described above. As shown in FIG. 5, the parameter adjustment apparatus of this embodiment may include an object determination unit 10, a selection unit 20, an acquisition unit 30 and a parameter adjustment unit 40. Among them:
The object determination unit 10 is configured to determine the target video object to be shot.
The selection unit 20 is configured to select, according to the preset guiding strategy, the target camera for shooting the target video object from the cameras of the navigation camera system in which the navigation camera is located.
Optionally, the shooting effect parameters may include any one or more of the target video object's eye-to-eye effect parameter in the coordinate system corresponding to the current camera, its occlusion relationship parameter, and the scene object parameters of the shooting area, where the current camera is any camera in the navigation camera system other than the binocular camera.
The eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera; this rotation angle may be determined from the target video object's rotation angle in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
其中,所述遮挡关系参数和所述场景对象参数可以是根据预先标定的所述双目摄像机和所述当前摄像机的位置关系,将所述当前摄像机检测到的场景对象的区域重投到所述当前摄像机的成像平面确定出的。The occlusion relationship parameter and the scene object parameter may be that the area of the scene object detected by the current camera is re-injected to the location according to a pre-calibrated positional relationship between the binocular camera and the current camera. The imaging plane of the current camera is determined.
The acquiring unit 30 is configured to acquire first three-dimensional coordinates of the target video object.

The first three-dimensional coordinates may be the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera. The target camera may be the navigation camera described above or an ordinary PTZ camera, and the first coordinate system corresponding to the target camera may be a three-dimensional coordinate system established with the optical center of the target camera as its origin, or with any other reference object as its origin; this is not limited in the embodiments of the present invention.
The parameter adjusting unit 40 is configured to adjust the imaging parameters of the target camera to imaging parameters corresponding to the first three-dimensional coordinates, and to output the video image captured with the adjusted imaging parameters.
Optionally, the acquiring unit 30 may be specifically configured to:

acquire second three-dimensional coordinates transmitted by the binocular camera connected to the navigation camera, the second three-dimensional coordinates being the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; and

convert the second three-dimensional coordinates into the first three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the target camera.
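The patent does not give a formula for this conversion, but converting a point between two pre-calibrated camera coordinate systems is conventionally a rigid transform. A minimal illustrative sketch, assuming the calibrated positional relationship is expressed as a rotation matrix `R` and a translation vector `t` (the function name and parameters are hypothetical, not from the patent):

```python
import numpy as np

def transform_point(R, t, p_binocular):
    """Map a 3D point from the binocular camera's coordinate system into
    the target camera's coordinate system using the pre-calibrated
    rotation R (3x3) and translation t (3,): p_target = R @ p + t."""
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float)
    p = np.asarray(p_binocular, dtype=float)
    return R @ p + t

# Identity rotation and a 1 m offset along x: a point 2 m in front of
# the binocular camera appears shifted by 1 m in the target frame.
p_target = transform_point(np.eye(3), [1.0, 0.0, 0.0], [0.0, 0.0, 2.0])
```

The same transform (with the appropriate calibrated pair of cameras) covers the second-to-first and second-to-third coordinate conversions described in this document.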
Further optionally, the second three-dimensional coordinates may be computed by the binocular camera from the two-dimensional coordinates of the target video object acquired in the left view and the right view of the binocular camera, together with the acquired intrinsic and extrinsic parameter data of the binocular camera. The second three-dimensional coordinates are the three-dimensional coordinates of the target video object in the second coordinate system corresponding to the binocular camera, which may be a three-dimensional coordinate system established with the optical center of the binocular camera as its origin, or with any other reference object as its origin. The two-dimensional coordinates may specifically be the pixel coordinates of the target video object in the left view and the right view of the binocular camera.
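The document states that the binocular camera computes the 3D point from matched left/right pixel coordinates and its intrinsic/extrinsic parameters but gives no equations. As an illustrative sketch only, the textbook rectified-stereo case reduces to depth-from-disparity (all names and the rectification assumption are ours, not the patent's):

```python
import numpy as np

def triangulate_rectified(u_left, u_right, v, fx, fy, cx, cy, baseline):
    """Recover a 3D point in the binocular camera's frame from a matched
    pixel pair in a rectified stereo rig. fx, fy, cx, cy are the shared
    intrinsics; baseline is the distance between the optical centers.
    In rectified images the match lies on the same row v in both views."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    z = fx * baseline / disparity          # depth from disparity
    x = (u_left - cx) * z / fx             # back-project through pinhole model
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# fx = 500 px, baseline = 0.1 m, disparity = 10 px  ->  depth = 5 m
p2 = triangulate_rectified(330.0, 320.0, 240.0, 500.0, 500.0, 320.0, 240.0, 0.1)
```

A production system would instead triangulate with the full calibrated projection matrices (e.g. an `cv2.triangulatePoints`-style routine), since real rigs are rarely perfectly rectified.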
Optionally, the object determining unit 10 may be specifically configured to:

acquire a captured image transmitted by the binocular camera, the captured image including at least one video object; and

establish a video object model including the at least one video object, and determine the target video object from the at least one video object.

The selecting unit 20 may be specifically configured to:

determine the target video object in the captured images acquired from each camera in the navigation camera system, and acquire the shooting effect parameters of the target video object at each camera; and

determine, as the target camera for capturing the target video object, the camera whose shooting effect parameters satisfy the preset navigation strategy.

Further optionally, the selecting unit 20 may determine the target video object in the captured image acquired from a camera in the following specific manner:

converting the second three-dimensional coordinates into third three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the current camera, where the current camera is any camera in the navigation camera system other than the binocular camera, and the third three-dimensional coordinates are the three-dimensional coordinates of the target video object in a third coordinate system corresponding to the current camera;

determining whether the overlapping area between the region of the target video object at the third three-dimensional coordinates and the region of a video object detected by the current camera at that video object's three-dimensional coordinates exceeds a preset area threshold; and

if it does, determining that video object to be the target video object.
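The overlap test above can be sketched as an axis-aligned intersection-area check. This is a minimal illustration under the assumption that each object region is summarized as a 2D bounding box `(x1, y1, x2, y2)` after projection; the patent does not specify the region representation, and the function names are hypothetical:

```python
def overlap_area(box_a, box_b):
    """Intersection area of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def is_same_object(projected_box, detected_box, area_threshold):
    """Treat the camera's detection as the target video object when the
    overlap between the reprojected target region and the detected
    region exceeds the preset area threshold."""
    return overlap_area(projected_box, detected_box) > area_threshold

# Two 10x10 boxes offset by 5 in x and y overlap in a 5x5 region (area 25):
match = is_same_object((0, 0, 10, 10), (5, 5, 15, 15), 20.0)
```

A ratio-based criterion (intersection over union) would make the threshold scale-invariant; the absolute-area form above simply mirrors the "overlapping area exceeds a preset area threshold" wording.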
In the embodiment of the present invention, after the target video object that needs to be captured is determined, the target camera that best captures that target video object is selected from the cameras of the navigation camera system according to a preset navigation strategy, and the three-dimensional coordinates of the target video object in the coordinate system corresponding to the target camera are acquired, so that the target camera is controlled to adjust its imaging parameters according to the three-dimensional coordinates of the target video object and to output the video image captured with the adjusted parameters. Based on three-dimensional coordinate detection and the preset navigation strategy, the navigation camera system can thereby improve the accuracy of video object detection and tracking, improve the efficiency of camera parameter adjustment, and effectively improve the shooting effect of the camera.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a navigation camera system according to an embodiment of the present invention. Specifically, the navigation camera system of this embodiment may include a first camera 1 and at least one second camera 2, the first camera 1 including a navigation camera 11 and a binocular camera 12. The navigation camera 11 and the binocular camera 12, as well as the first camera 1 and the second camera 2, may be connected through wired or wireless interfaces, where:

the navigation camera 11 is configured to determine a target video object that needs to be captured, and to select, according to a preset navigation strategy, a target camera for capturing the target video object from the cameras of the navigation camera system;

the binocular camera 12 is configured to acquire second three-dimensional coordinates of the target video object and transmit the second three-dimensional coordinates to the navigation camera 11, the second three-dimensional coordinates being the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera 12; and

the navigation camera 11 is configured to receive the second three-dimensional coordinates transmitted by the binocular camera 12; convert the second three-dimensional coordinates into first three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera 12 and the target camera; adjust the imaging parameters of the target camera to imaging parameters corresponding to the first three-dimensional coordinates; and output the video image captured with the adjusted imaging parameters, where the first three-dimensional coordinates are the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera.
Optionally, the second camera 2 may also include a navigation camera and a binocular camera, in which case the target camera may be any navigation camera in the navigation camera system; alternatively, the second camera 2 may be an ordinary PTZ camera, in which case the target camera may be the navigation camera or an ordinary PTZ camera. Further optionally, the binocular camera 12 may be mounted on a preset navigation bracket and connected to the navigation camera 11 through the navigation bracket.
Specifically, FIG. 7 is a schematic structural diagram of a first camera according to an embodiment of the present invention. The first camera includes a binocular camera and one or more navigation cameras. In this embodiment it is assumed that the first camera is equipped with two navigation cameras for directed shooting and tracking, which may be connected to the binocular camera, by wire or wirelessly, through a navigation bracket (the "bracket" for short). The binocular camera is mounted on the bracket. In addition, microphones may also be mounted on the bracket, and the mounted microphones may be in array form; a microphone array can be used for functions such as sound source localization and sound source identification, and may specifically include a horizontal microphone array and a vertical microphone array. Further, the navigation camera and the bracket may be separate or integrated, and a control interface such as a serial interface may be used for communication between the navigation camera and the bracket. In some embodiments, the navigation camera and the navigation bracket (including the binocular camera, the microphones, and so on) may also be integrated into a single directing device; the connection form of the devices in the navigation camera system is not limited in the embodiments of the present invention.
Further, FIG. 8 is a schematic networking diagram of a navigation camera system according to an embodiment of the present invention. As shown in FIG. 8, multiple camera positions can be networked. Multi-position networking includes networking among multiple positions each equipped with a navigation camera; a position equipped with a navigation camera and a navigation bracket plus multiple ordinary PTZ cameras; a position equipped with a navigation camera and a navigation bracket plus positions without PTZ cameras (that is, with only a navigation bracket); and a position without a PTZ camera plus multiple ordinary PTZ cameras (that is, without a navigation bracket). The cameras at the positions can be interconnected over LAN or Wi-Fi to transmit control messages, which include camera switching messages, audio and video data such as video object model data, and so on. Further optionally, the control messages may be transmitted over the Internet Protocol ("IP"), for example using an IP camera protocol stack. The binocular cameras of any two positions are required to have an overlapping shooting area.

When a navigation camera needs to perform multi-channel video output, it can be connected to the video matrix of the networking system in which that navigation camera is located, and the video matrix performs the switching and output. Optionally, the switching policy of the video matrix may be controlled by any designated navigation camera in the scene, such as the navigation camera serving as the master position, or by a third-party device; this is not limited in the embodiments of the present invention. After the video image output by the video matrix is encoded by a codec device, it can be transmitted to the far end to implement a video conference. Specifically, if the number of cameras in the network is small, the video data can be processed in cascade (the navigation bracket supports video cascading); if the number is large, the video of the cameras is output to the video matrix for processing, and the video matrix switches or composites one or more camera video sources. Further, the bracket may externally provide video input/output interfaces, a LAN/Wi-Fi network port, a serial interface, and so on. The video input interface receives input video from other cameras; the video output interface connects devices such as a terminal or a video matrix to output video images; the serial interface provides a control and debugging interface for the bracket; and the LAN/Wi-Fi network port is used for cascading multiple camera positions and can transmit audio and video data, control data, and so on.

Further, in a multi-navigation-camera networking scenario, each of the navigation cameras has video object detection capability and PTZ camera functionality; one of the navigation cameras can serve as the master position, responsible for output position selection and PTZ camera control, with the other cameras serving as slave positions. In the scenario of a position with a navigation camera and a navigation bracket plus multiple ordinary PTZ cameras, only one navigation camera has video object detection capability and is responsible for output position selection and PTZ camera control, while the ordinary cameras are used only as PTZ cameras. Since only the navigation camera has video object detection capability in that scenario, the video object model data of the slave positions is not obtained over the network, and no matching of multi-position video object models is performed.
Specifically, for the navigation camera and the binocular camera in the embodiments of the present invention, reference may be made to the related descriptions of the embodiments corresponding to FIG. 1 to FIG. 6 above; details are not repeated here.

Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a navigation camera according to an embodiment of the present invention, configured to perform the camera parameter adjustment method described above. Specifically, as shown in FIG. 9, the navigation camera of this embodiment includes a communication interface 300, a memory 200, and a processor 100, the processor 100 being connected to the communication interface 300 and the memory 200. The memory 200 may be a high-speed RAM memory, or a non-volatile memory such as at least one magnetic disk memory. The communication interface 300, the memory 200, and the processor 100 may be connected for data transfer through a bus, or in other manners; a bus connection is used for description in this embodiment. The device structure shown in FIG. 9 does not constitute a limitation on the embodiments of the present invention; the device may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Here:
The processor 100 is the control center of the device and connects the parts of the entire device through various interfaces and lines. It performs the functions of the device and processes data by running or executing the programs and/or units stored in the memory 200 and invoking the driver software stored in the memory 200. The processor 100 may consist of one or more integrated circuits ("ICs"); for example, it may consist of a single packaged IC, or of multiple packaged ICs, with the same or different functions, connected together. For example, the processor 100 may include only a central processing unit ("CPU"), or may be a combination of a CPU, a digital signal processor ("DSP"), a graphics processing unit ("GPU"), and various control chips. In the embodiments of the present invention, the CPU may have a single computing core or multiple computing cores.

The communication interface 300 may include a wired interface, a wireless interface, and the like.

The memory 200 may be configured to store the driver software (or software programs) and units. The processor 100 and the communication interface 300 perform the functional applications of the device and implement data processing by invoking the driver software and units stored in the memory 200. The memory 200 mainly includes a program storage area and a data storage area, where the program storage area may store the driver software required for at least one function, and the data storage area may store data produced during parameter adjustment, such as the three-dimensional coordinate information described above.
Specifically, the processor 100 reads the driver software from the memory 200 and, under the action of the driver software, performs the following:

determining a target video object that needs to be captured;

selecting, according to a preset navigation strategy, a target camera for capturing the target video object from the cameras of the navigation camera system in which the navigation camera is located;

acquiring first three-dimensional coordinates of the target video object, the first three-dimensional coordinates being the three-dimensional coordinates of the target video object in a first coordinate system corresponding to the target camera; and

adjusting the imaging parameters of the target camera to imaging parameters corresponding to the first three-dimensional coordinates, and outputting the video image captured with the adjusted imaging parameters.
Optionally, when the processor 100 reads the driver software from the memory 200 and, under its action, acquires the first three-dimensional coordinates of the target video object, it specifically performs the following steps:

acquiring, through the communication interface 300, second three-dimensional coordinates transmitted by the binocular camera connected to the navigation camera, the second three-dimensional coordinates being the three-dimensional coordinates of the target video object in a second coordinate system corresponding to the binocular camera; and

converting the second three-dimensional coordinates into the first three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the target camera.
Optionally, when the processor 100 reads the driver software from the memory 200 and, under its action, determines the target video object that needs to be captured, it specifically performs the following steps:

acquiring a captured image transmitted by the binocular camera, the captured image including at least one video object; and

establishing a video object model including the at least one video object, and determining the target video object from the at least one video object.

When the processor 100 reads the driver software from the memory 200 and, under its action, selects, according to the preset navigation strategy, the target camera for capturing the target video object from the cameras of the navigation camera system in which the navigation camera is located, it specifically performs the following steps:

determining the target video object in the captured images acquired from each camera in the navigation camera system, and acquiring the shooting effect parameters of the target video object at each camera; and

determining, as the target camera for capturing the target video object, the camera whose shooting effect parameters satisfy the preset navigation strategy.
Optionally, when the processor 100 reads the driver software from the memory 200 and, under its action, determines the target video object in a captured image acquired from a camera, it specifically performs the following steps:

converting the second three-dimensional coordinates into third three-dimensional coordinates according to the pre-calibrated positional relationship between the binocular camera and the current camera, where the current camera is any camera in the navigation camera system other than the binocular camera, and the third three-dimensional coordinates are the three-dimensional coordinates of the target video object in a third coordinate system corresponding to the current camera;

determining whether the overlapping area between the region of the target video object at the third three-dimensional coordinates and the region of a video object detected by the current camera at that video object's three-dimensional coordinates exceeds a preset area threshold; and

if it does, determining that video object to be the target video object.

Optionally, the shooting effect parameters may include any one or more of an eye-to-eye effect parameter of the target video object in the coordinate system corresponding to the current camera, an occlusion relationship parameter, and a scene object parameter of the shooting area, where the current camera is any camera in the navigation camera system other than the binocular camera.

The eye-to-eye effect parameter may include a rotation angle of the target video object relative to the coordinate system corresponding to the current camera, the rotation angle being determined from the rotation angle of the target video object in the second coordinate system together with the pre-calibrated positional relationship between the binocular camera and the current camera.
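Composing the object's rotation angle in the binocular frame with the calibrated rotation between the two cameras can be sketched, in the simplest planar (single-axis yaw) case, as an angle sum wrapped to (-pi, pi]. This is an illustrative simplification of the determination described above, not the patent's method; a full treatment would compose 3x3 rotation matrices:

```python
import math

def yaw_in_current_camera(yaw_in_binocular, relative_yaw):
    """Planar sketch: the object's facing angle in the current camera's
    frame is its yaw in the binocular frame composed with the calibrated
    yaw between the two cameras, wrapped back into (-pi, pi]."""
    a = yaw_in_binocular + relative_yaw
    return math.atan2(math.sin(a), math.cos(a))  # normalize the angle

# An object facing 135 deg in the binocular frame, seen from a camera
# rotated 90 deg relative to it, faces -135 deg in that camera's frame.
angle = yaw_in_current_camera(math.pi * 0.75, math.pi * 0.5)
```

A navigation strategy favoring eye-to-eye shots would then prefer the camera for which this angle is closest to zero (the object faces the lens most directly).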
The occlusion relationship parameter and the scene object parameter may be determined by reprojecting the region of each detected scene object onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division into units is merely a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.

The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory ("ROM"), a random access memory (RAM), a magnetic disk, or an optical disc.

Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the functional units described above is used as an example; in practical applications, the functions may be assigned to different functional units as needed, that is, the internal structure of the apparatus may be divided into different functional units to perform all or some of the functions described above. For the specific working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments; details are not repeated here.

Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (15)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610562671.6A CN106251334B (en) | 2016-07-18 | 2016-07-18 | A kind of camera parameter adjustment method, guide camera and system |
| CN201610562671.6 | 2016-07-18 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018014730A1 true WO2018014730A1 (en) | 2018-01-25 |
Family
ID=57613157
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2017/091863 Ceased WO2018014730A1 (en) | 2016-07-18 | 2017-07-05 | Method for adjusting parameters of camera, broadcast-directing camera, and broadcast-directing filming system |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN106251334B (en) |
| WO (1) | WO2018014730A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090303329A1 (en) * | 2008-06-06 | 2009-12-10 | Mitsunori Morisaki | Object image displaying system |
| CN101630406A (en) * | 2008-07-14 | 2010-01-20 | 深圳华为通信技术有限公司 | Camera calibration method and camera calibration device |
| CN102638672A (en) * | 2011-02-09 | 2012-08-15 | 宝利通公司 | Automatic video layout for multi-stream multi-site telepresence conferencing systems |
| CN102843540A (en) * | 2011-06-20 | 2012-12-26 | 宝利通公司 | Automatic camera selection for videoconference |
| CN106251334A (en) * | 2016-07-18 | 2016-12-21 | 华为技术有限公司 | Camera parameter adjustment method, broadcast-directing camera, and system |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104869365B (en) * | 2015-06-02 | 2018-12-18 | 阔地教育科技有限公司 | Mouse tracking method and device based on a live recording-and-broadcasting system |
| CN105049764B (en) * | 2015-06-17 | 2018-05-25 | 武汉智亿方科技有限公司 | Image tracking method and system for geography teaching based on multiple positioning cameras |
| CN105718862A (en) * | 2016-01-15 | 2016-06-29 | 北京市博汇科技股份有限公司 | Method, device and recording-broadcasting system for automatically tracking teacher via single camera |
- 2016-07-18: CN application CN201610562671.6A, granted as CN106251334B (status: Active)
- 2017-07-05: WO application PCT/CN2017/091863, published as WO2018014730A1 (status: Ceased)
Cited By (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112106110A (en) * | 2018-04-27 | 2020-12-18 | 上海趋视信息科技有限公司 | System and method for calibrating camera |
| US11468598B2 (en) | 2018-04-27 | 2022-10-11 | Shanghai Truthvision Information Technology Co., Ltd. | System and method for camera calibration |
| CN110969662A (en) * | 2018-09-28 | 2020-04-07 | 杭州海康威视数字技术股份有限公司 | Fisheye camera internal parameter calibration method, device, calibration device controller and system |
| CN110969662B (en) * | 2018-09-28 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Fisheye camera internal parameter calibration method, device, calibration device controller and system |
| CN111243029A (en) * | 2018-11-28 | 2020-06-05 | 驭势(上海)汽车科技有限公司 | Calibration method and device of vision sensor |
| CN111243029B (en) * | 2018-11-28 | 2023-06-23 | 驭势(上海)汽车科技有限公司 | Calibration method and device of vision sensor |
| CN111325790A (en) * | 2019-07-09 | 2020-06-23 | 杭州海康威视系统技术有限公司 | Target tracking method, device and system |
| CN111325790B (en) * | 2019-07-09 | 2024-02-20 | 杭州海康威视系统技术有限公司 | Target tracking method, device and system |
| CN112468680A (en) * | 2019-09-09 | 2021-03-09 | 上海御正文化传播有限公司 | Processing method of advertisement shooting site synthesis processing system |
| CN111080679A (en) * | 2020-01-02 | 2020-04-28 | 东南大学 | Method for dynamically tracking and positioning indoor personnel in large-scale place |
| CN112819770A (en) * | 2021-01-26 | 2021-05-18 | 中国人民解放军陆军军医大学第一附属医院 | Iodine contrast agent allergy monitoring method and system |
| CN113129376A (en) * | 2021-04-22 | 2021-07-16 | 青岛联合创智科技有限公司 | Checkerboard-based camera real-time positioning method |
| CN113587895A (en) * | 2021-07-30 | 2021-11-02 | 杭州三坛医疗科技有限公司 | Binocular distance measuring method and device |
| CN113610932A (en) * | 2021-08-20 | 2021-11-05 | 苏州智加科技有限公司 | Method and device for external parameter calibration of binocular camera |
| CN113610932B (en) * | 2021-08-20 | 2024-06-04 | 苏州智加科技有限公司 | Binocular camera external parameter calibration method and device |
| CN113838146A (en) * | 2021-09-26 | 2021-12-24 | 昆山丘钛光电科技有限公司 | Method and device for verifying calibration precision of camera module and method and device for testing camera module |
| CN114025107A (en) * | 2021-12-01 | 2022-02-08 | 北京七维视觉科技有限公司 | Image ghost shooting method and device, storage medium and fusion processor |
| CN114025107B (en) * | 2021-12-01 | 2023-12-01 | 北京七维视觉科技有限公司 | Image ghost shooting method, device, storage medium and fusion processor |
| CN116563381A (en) * | 2022-01-27 | 2023-08-08 | 北京小米移动软件有限公司 | Method, device and storage medium for determining camera assembly tolerance |
| CN114666457A (en) * | 2022-03-23 | 2022-06-24 | 华创高科(北京)技术有限公司 | Video and audio program broadcasting guide method, device, equipment, system and medium |
| CN117523431A (en) * | 2023-11-17 | 2024-02-06 | 中国科学技术大学 | Smoke and fire detection method and device, electronic device and storage medium |
| CN118368524A (en) * | 2024-06-17 | 2024-07-19 | 深圳市联合光学技术有限公司 | Multi-camera view field switching system and method thereof |
| WO2025260363A1 (en) * | 2024-06-21 | 2025-12-26 | 广州视源电子科技股份有限公司 | Method and apparatus for capturing close-up picture, and display device and storage medium |
| CN118967810A (en) * | 2024-07-30 | 2024-11-15 | 江苏濠汉信息技术有限公司 | A binocular vision hazard source ranging method and system in extreme environments |
| CN120088333A (en) * | 2025-01-03 | 2025-06-03 | 中国水利水电夹江水工机械有限公司 | A target positioning method and system based on machine vision |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106251334B (en) | 2019-03-01 |
| CN106251334A (en) | 2016-12-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2018014730A1 (en) | Method for adjusting parameters of camera, broadcast-directing camera, and broadcast-directing filming system | |
| WO2017215295A1 (en) | Camera parameter adjusting method, robotic camera, and system | |
| JP2024056955A (en) | Personalized HRTF with Optical Capture | |
| US12033355B2 (en) | Client/server distributed camera calibration | |
| US10694167B1 (en) | Camera array including camera modules | |
| US8749607B2 (en) | Face equalization in video conferencing | |
| US9832583B2 (en) | Enhancement of audio captured by multiple microphones at unspecified positions | |
| JP7179515B2 (en) | Apparatus, control method and program | |
| US20160191815A1 (en) | Camera array removing lens distortion | |
| CN109492506A (en) | Image processing method, device and system | |
| JP2019083402A (en) | Image processing apparatus, image processing system, image processing method, and program | |
| CN106161985B (en) | Implementation method of an immersive video conference | |
| KR20050084263A (en) | Method and apparatus for correcting a head pose in a video phone image | |
| WO2024119902A1 (en) | Image stitching method and apparatus | |
| JP5963006B2 (en) | Image conversion apparatus, camera, video system, image conversion method, and recording medium recording program | |
| JP2023502552A (en) | WEARABLE DEVICE, INTELLIGENT GUIDE METHOD AND APPARATUS, GUIDE SYSTEM, STORAGE MEDIUM | |
| CN108053376A (en) | Fisheye image correction method using semantic segmentation information to guide deep learning | |
| CN114374903B (en) | Sound pickup method and sound pickup apparatus | |
| TW201824178A (en) | Image processing method for immediately producing panoramic images | |
| JP2023167486A (en) | Image processing device, image processing method and program | |
| JP2018205008A (en) | Camera calibration apparatus and camera calibration method | |
| KR20250161550A (en) | Image-based reconstruction of 3D landmarks for use in generating personalized head-related transfer functions | |
| JP2019096926A (en) | Image processing system, image processing method and program | |
| CN118192083A (en) | Image processing method, head mounted display device and medium | |
| TW202334902A (en) | Systems and methods for image reprojection |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17830364; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17830364; Country of ref document: EP; Kind code of ref document: A1 |