Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
First, terms involved in the embodiments of the present application will be described:
IMU, short for inertial measurement unit. An IMU generally includes an acceleration sensor, an angular velocity sensor (gyroscope) and a magnetometer. In the present application, the terminal can determine the pose of the camera on the terminal from data acquired by the IMU component.
Sky segmentation divides an image frame into a sky region and a non-sky region, so as to determine the sky region.
Sky hemisphere: a hemispherical model with radius s, built with the camera that captures the image as its center; the value of s is usually set to twice the focal length of the camera. The position of a point on the sphere of the sky hemisphere can be uniquely determined by its longitude and latitude, similar to the way a point on a globe is uniquely determined by longitude and latitude.
As schematically shown in fig. 1, a sky hemisphere 24 is created with the camera 22 that captures the image as the center of the hemisphere model and with twice the focal length of the camera 22 as the radius of the hemisphere model.
Optical center: the center of the convex lens of the camera.
The world coordinate system is an absolute coordinate system, and is unchanged and unique once selected. In the present application, the world coordinate system is established by taking the camera position as the origin O, two perpendicular horizontal directions as the x axis and the y axis, and the vertically downward direction as the z axis. The present application does not limit the manner in which the world coordinate system is established.
Camera coordinate system: a coordinate system used for indicating the position of the photographed scene relative to the camera. In the present application, as an example, the camera coordinate system is established by taking the optical center (namely, the pinhole in the pinhole imaging model) as the origin O and taking the direction perpendicular to the plane of the optical center and pointing toward the photographed object as the z axis.
As schematically shown in fig. 2, fig. 2 shows the pinhole imaging model underlying camera imaging and the way in which the camera coordinate system is established in the present application. Light from the light source P propagates along a straight line, passes through the optical center O of the camera, i.e. the pinhole in the pinhole imaging model, and falls on the physical imaging plane at a focal length f from the camera lens, forming a point P'. The camera coordinate system is established by taking the camera optical center as the origin O, the plane in which the optical center lies as the xOy plane, and the direction perpendicular to that plane and pointing toward the photographed object as the z axis. Fig. 2 shows one possible way of establishing the camera coordinate system.
As shown in fig. 3, a user shoots a short video through the terminal. The picture obtained at the initial moment of the camera is shown as picture 10; elements such as buildings and trees exist in picture 10, and the blank area in picture 10 is the sky area. In response to a special effect function selected by the user, the terminal determines the sky area in the image through a sky segmentation method and displays a preset special effect in the sky area; the picture displayed by the terminal is shown as picture 11. In response to the user holding the camera and rotating it while shooting, the terminal continuously performs sky segmentation on the captured picture and displays the corresponding special effect in the sky area, as shown in picture 12. That is, after the sky area is determined by the sky segmentation method, the sky is replaced with a sky carrying the special effect, and as the camera pose changes continuously, the special-effect sky displayed by the terminal also changes continuously.
Fig. 4 is a schematic diagram of a computer device according to an exemplary embodiment of the present application. The computer device may be a terminal device or may be a part of a terminal device. The device includes a bus 101, a processor 102, a memory 103, an IMU assembly 104, and a camera assembly 105.
The processor 102 includes one or more processing cores, and the processor 102 executes various functional applications and information processing by running software programs and modules.
The memory 103 is connected to the processor 102 via the bus 101.
The memory 103 may be used to store at least one instruction that the processor 102 may use to execute to implement the various steps in the method embodiments described below.
Furthermore, the memory 103 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, including, but not limited to, a magnetic or optical disk, an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a static random access memory (SRAM), a read-only memory (ROM), a magnetic memory, a flash memory, and a programmable read-only memory (PROM).
The IMU assembly 104 may be operative to collect IMU data indicative of a change in pose. For example, in the present application, the IMU data acquired by the IMU assembly 104 is used to indicate changes in the pose of the camera in the terminal.
The camera assembly 105 is used to capture images or video.
In the related art, a conventional sky segmentation method based on a color threshold is used to analyze the current image and determine the sky area, and the obtained sky area is then replaced with a preset effect. However, when the color of the sky is close to that of the ground, misjudgment is likely to occur. For example, in an image captured on a rainy day, the color of the sky area is likely to be similar to the color of a cement floor, resulting in a decrease in the accuracy of sky segmentation. In the sky-hemisphere-based sky segmentation method of the present application, the pixel points belonging to the ground area of the image are not mapped onto the sky hemisphere centered on the camera, so that the ground area is prevented from being misjudged as a sky area of similar color.
Fig. 5 shows a flowchart of a sky segmentation method according to an exemplary embodiment of the present application, which is applied to a terminal. As shown in fig. 5, the method includes:
Step 220, acquiring a first image segmentation probability of a pixel point on an image;
the image may be a photograph taken by the terminal through the camera assembly, or any frame of picture in the video taken by the terminal through the camera assembly.
The first image segmentation probability is used to indicate a probability that a pixel on the image belongs to the sky.
Optionally, a deep learning network model is trained by collecting a sample set of pictures, and the trained deep learning network model is invoked to recognize the image, so as to obtain the first image segmentation probability of the pixel points on the image.
The present application does not limit the manner in which the image is acquired, nor the manner in which the first image segmentation probability of the pixel points on the image is obtained from the image.
For example, as shown in fig. 6, the terminal acquires the image 32, analyzes the image 32 by calling the deep learning network model, and outputs the probability that each pixel on the image 32 belongs to the sky, thereby obtaining the first image segmentation probability of each pixel on the image 32, as shown in the image 34. The numbers on the image 34 represent the probability that each pixel belongs to the sky region. For example, when the first image segmentation probability of a pixel is 1, the pixel belongs to the sky area with a probability of 100%, that is, the pixel certainly belongs to the sky area; when the first image segmentation probability of a pixel is 0.7, the pixel belongs to the sky area with a probability of 70%; and when the first image segmentation probability of a pixel is 0, the pixel belongs to the sky area with a probability of 0%, that is, the pixel certainly does not belong to the sky area.
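As an illustrative sketch only, the per-pixel probability map can be produced by any trained segmentation network; the Python code below assumes a hypothetical sky_model callable that maps an H x W x 3 image to an H x W probability map, and merely clips the output to the valid range. The names and interface are assumptions, not part of the claimed method.

import numpy as np

def first_segmentation_probability(image: np.ndarray, sky_model) -> np.ndarray:
    """Return an HxW map of first image segmentation probabilities in [0, 1].

    sky_model is a hypothetical trained network: it takes an HxWx3 image and
    returns an HxW array giving, for each pixel, the probability that the
    pixel belongs to the sky.
    """
    prob = np.asarray(sky_model(image), dtype=np.float32)
    return np.clip(prob, 0.0, 1.0)  # keep probabilities in a valid range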
Step 240, determining sky segmentation probability of a space point on the sky hemisphere according to the first image segmentation probability of the target pixel point based on the position mapping relation between the image and the sky hemisphere;
The sky hemisphere is a hemispherical model with a radius length s established centering on a camera for capturing images, and in general, the value of s is set to be twice the focal length f of the camera. The position of a point on the sphere of the sky hemisphere can be uniquely determined by longitude and latitude, similar to the method of uniquely determining a point on a globe by longitude and latitude.
The target pixel point is a pixel point for which a corresponding spatial point exists on the sky hemisphere. Three-dimensional spherical mapping is performed on the pixel points on the image, and a pixel point is determined as a target pixel point when the spatial point obtained by three-dimensional spherical mapping of that pixel point belongs to the sky hemisphere.
The longitude and latitude of the space point corresponding to the pixel point are determined by three-dimensional spherical mapping of the pixel point on the image, and the pixel point is determined to be a target pixel point when the latitude of the space point belongs to a sky hemisphere. For example, in the case where the latitude of the spatial point is greater than 0, the pixel point is determined as the target pixel point.
The position mapping relationship is used for indicating the one-to-one correspondence between target pixel points on the image and spatial points on the sky hemisphere. The process of determining the position of a spatial point on the sky hemisphere from the coordinates of the target pixel point on the image includes: determining the rotation pose of the camera according to the IMU; determining the position coordinates of the spatial point in the camera coordinate system according to the coordinates of the target pixel point; determining the position coordinates of the spatial point in the world coordinate system according to the rotation pose of the camera and the position coordinates of the spatial point in the camera coordinate system; and determining the longitude and latitude of the spatial point according to the position coordinates of the spatial point in the world coordinate system.
The sky segmentation probability is used to indicate the probability that a spatial point on the sky hemisphere belongs to the sky.
Illustratively, the first image segmentation probability of the target pixel point is directly determined as the sky segmentation probability of the spatial point corresponding to the target pixel point; or, based on the position mapping relationship between the target pixel point and the spatial point, the first image segmentation probability of the target pixel point and the sky segmentation probability of the spatial point at the historical moment are filtered, and the filtered result is determined as the sky segmentation probability of the spatial point at the current moment.
Step 260, determining a second image segmentation probability of the target pixel point according to the sky segmentation probability of the space point on the sky hemisphere based on the position mapping relation.
The target pixel points are pixel points for which corresponding spatial points exist on the sky hemisphere, and the remaining pixel points other than the target pixel points are pixel points for which no corresponding spatial points exist on the sky hemisphere, that is, the remaining pixel points are pixel points belonging to the ground area in the image.
The second image segmentation probability is used to indicate a probability that the target pixel belongs to the sky.
Based on the position mapping relationship in step 240, which indicates the correspondence between the target pixel points on the image and the spatial points on the sky hemisphere, the sky segmentation probability of the spatial points on the sky hemisphere is mapped back to the image, the second image segmentation probability of the target pixel points in the image is determined, and the second image segmentation probability of the remaining pixel points on the image other than the target pixel points is determined to be 0.
Optionally, after determining the sky-segmentation probability of the sky-hemisphere, the sky-segmentation probability buffered on the sky-hemisphere is updated, i.e. the sky-segmentation probability buffered at the historical moment is replaced with the latest determined sky-segmentation probability.
Optionally, a sky area in the image is determined based on the second image segmentation probability of the pixel points on the image, and the sky area is replaced with a preset sky element, where the sky element is at least one of a video, an animation effect, a text and a picture. For example, the sky area in the image is replaced with a preset sky, or with a sky showing a fireworks animation, and so on.
Optionally, the method of determining the sky area in the image according to the second image segmentation probability may be based on a preset threshold value. For example, in the case where the second image segmentation probability of a pixel is greater than 0.7, the pixel is determined to belong to the sky area. The present application does not limit the method of determining the sky area according to the second image segmentation probability.
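For illustration only, the thresholding and the replacement of the sky area can be sketched as follows; the function names, the default threshold of 0.7 and the simple per-pixel compositing are assumptions, not part of the claimed method.

import numpy as np

def sky_mask_from_probability(prob_map: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Binary mask: True where the second image segmentation probability exceeds the threshold."""
    return prob_map > threshold

def replace_sky(image: np.ndarray, prob_map: np.ndarray, sky_element: np.ndarray,
                threshold: float = 0.7) -> np.ndarray:
    """Replace pixels judged to be sky with the corresponding pixels of a preset sky element."""
    mask = sky_mask_from_probability(prob_map, threshold)
    out = image.copy()
    out[mask] = sky_element[mask]  # sky_element is assumed to have the same HxWx3 shape as image
    return out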
In summary, in the sky segmentation method provided by the embodiments of the present application, the first image segmentation probability of the pixel points on the image is acquired, the sky segmentation probability of the spatial points is determined based on the position mapping relationship between the target pixel points and the spatial points, and the second image segmentation probability of the target pixel points is then determined according to the sky segmentation probability of the spatial points, thereby determining the sky region on the image. By adopting the sky hemisphere model, the interference of the remaining pixel points other than the target pixel points, namely the pixel points belonging to the ground area, is eliminated, misjudgment of ground whose color is similar to that of the sky area is avoided, and the accuracy of sky segmentation is improved.
In the above embodiment, the sky is distinguished from the ground by combining the sky hemisphere model in the sky segmentation process. Based on the method, the sky segmentation probability can be filtered by combining the cached data of the historical moment on the sky hemisphere, so that the sky segmentation result is more accurate, the edge change of the sky region is also more stable, the phenomenon of shaking of the sky region in the video is avoided, and the accuracy of the sky segmentation is further improved.
Fig. 7 shows a flowchart of a sky segmentation method according to an exemplary embodiment of the present application, which is applied to a terminal. As shown in fig. 7, the method includes:
Step 420, obtaining a first image segmentation probability of a pixel point on an image;
For the method of obtaining the first image segmentation probability of the image, refer to step 220; details are not described here again.
Step 442, determining a pixel point as a target pixel point in the case where the spatial point obtained by three-dimensional spherical mapping of the pixel point on the image belongs to the sky hemisphere;
Specifically, the pixel point is determined as a target pixel point in the case where the latitude of the corresponding spatial point falls within the sky hemisphere.
Three-dimensional spherical mapping is performed on a pixel point on the image, and the longitude and latitude of the spatial point corresponding to the pixel point are determined through the following steps: determining the rotation pose of the camera according to the IMU; determining the position coordinates of the spatial point in the camera coordinate system according to the coordinates of the pixel point; determining the position coordinates of the spatial point in the world coordinate system according to the rotation pose of the camera and the position coordinates of the spatial point in the camera coordinate system; and determining the longitude and latitude of the spatial point according to the position coordinates of the spatial point in the world coordinate system.
Taking a pixel point p on the image as an example, fig. 8 shows the detailed procedure of determining the longitude and latitude of the spatial point P corresponding to the pixel point p based on the coordinates of the pixel point p:
Step 4421, determining the rotation pose of the camera according to the IMU;
The IMU pose R_imu is obtained from the IMU of the mobile phone and is used for indicating the pose of the IMU relative to the world coordinate system; R_imu is a matrix of 3 rows and 3 columns.
The rotation pose R_c of the camera is calculated by the following formula:
R_c = R_ic^(-1) * R_imu * R_ic
where R_ic is used for indicating the rotation pose of the camera coordinate system relative to the IMU coordinate system; because the positions and poses of the camera and the IMU in the terminal are fixed, R_ic is a known matrix of 3 rows and 3 columns. The result R_c obtained by the above formula indicates the pose of the camera, namely the pose of the camera at the current moment relative to the world coordinate system.
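A minimal numerical sketch of the pose composition above, assuming R_imu and R_ic are available as 3 x 3 numpy rotation matrices (their actual values come from the IMU and from the fixed camera-IMU calibration of the device):

import numpy as np

def camera_rotation_pose(R_imu: np.ndarray, R_ic: np.ndarray) -> np.ndarray:
    """Compute the camera rotation pose R_c = R_ic^(-1) * R_imu * R_ic.

    R_imu: 3x3 rotation of the IMU relative to the world coordinate system.
    R_ic:  3x3 rotation of the camera coordinate system relative to the IMU
           coordinate system (fixed by the hardware layout).
    """
    # For a rotation matrix, the inverse equals the transpose.
    return R_ic.T @ R_imu @ R_ic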
Step 4422, determining the position coordinates of the spatial point P in the camera coordinate system according to the coordinates of the pixel point p;
Taking any pixel point p on the image as an example, the coordinates of the point p on the image are (u_p, v_p).
The three-dimensional coordinates P_sphere_in_unit of the pixel point p on the normalized plane in the camera coordinate system are determined by the following formula, where the normalized plane refers to the plane at a unit distance in front of the camera:
P_sphere_in_unit = K^(-1) * (u_p, v_p, 1)^T
where K is the intrinsic matrix (Intrinsic Matrix) of the camera, a matrix of 3 rows and 3 columns, which can be expressed as
K = [ f  0  u
      0  f  v
      0  0  1 ]
where f is the focal length of the camera, u is the abscissa of the optical center on the image, v is the ordinate of the optical center on the image, and the units of f, u and v are pixels.
Note that u and v are the abscissa and ordinate of the image center point. For example, if the width of an image is Width pixels and the height is Height pixels, then, when the image is not distorted, Width = 2*u and Height = 2*v.
P_sphere_in_unit obtained by the above formula, i.e. the three-dimensional coordinates of p on the normalized plane in the camera coordinate system, is a matrix of 3 rows and 1 column, and its 3 elements are denoted P_sphere_in_unit_x, P_sphere_in_unit_y and P_sphere_in_unit_z respectively.
On this basis, the coordinates P_sphere_in_cam of the point p in the camera coordinate system are obtained by the following formula:
P_sphere_in_cam = s * P_sphere_in_unit / ||P_sphere_in_unit||
where s is the radius of the sky hemisphere and ||P_sphere_in_unit|| = sqrt(P_sphere_in_unit_x^2 + P_sphere_in_unit_y^2 + P_sphere_in_unit_z^2), so that the spatial point P lies on the sphere of radius s.
Step 4423, determining the position coordinates of the spatial point P in the world coordinate system according to the rotation pose of the camera and the position coordinates of the spatial point P in the camera coordinate system;
The position coordinates of the spatial point in the camera coordinate system are converted into position coordinates in the world coordinate system by the following formula:
P_sphere_in_w = R_c * P_sphere_in_cam
where R_c is the rotation pose of the camera.
Step 4424, determining the longitude and latitude of the space point P according to the position coordinates of the space point P in the world coordinate system.
P_sphere_in_w obtained in the previous step is a matrix of 3 rows and 1 column; its 3 elements are denoted P_x, P_y and P_z respectively, i.e. the position coordinates of the spatial point P in the world coordinate system are P_sphere_in_w = (P_x, P_y, P_z)^T.
The longitude P_longitude and the latitude P_latitude of the spatial point P are then calculated from P_x, P_y and P_z, where atan2 denotes the two-argument arctangent function and returns an azimuth angle in radians, asin denotes the arcsine function and returns an angle in radians, and s is the radius of the sky hemisphere.
Optionally, the longitude P_longitude and the latitude P_latitude of the spatial point P may be expressed in radians, in the range [-π, +π], or in degrees, in the range [-180°, +180°]. Radians and degrees can be converted into each other.
After the longitude and latitude of the spatial point corresponding to the pixel point are obtained, the pixel point is determined as a target pixel point when the latitude of the spatial point falls within the sky hemisphere. For example, a pixel point corresponding to a spatial point with a latitude greater than 0 is determined as a target pixel point.
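The following Python sketch walks through steps 4422 to 4424 for a single pixel. It is illustrative only: the exact longitude and latitude formula of the application is not reproduced above, so the sketch assumes a world coordinate system with the z axis pointing vertically downward (latitude computed as asin(-P_z / s)); the back-projection by K^(-1) and the normalization onto the sphere of radius s follow the description of the steps above.

import numpy as np

def pixel_to_lon_lat(u_p: float, v_p: float, K: np.ndarray, R_c: np.ndarray, s: float):
    """Map a pixel (u_p, v_p) to the longitude/latitude of its spatial point on the sky hemisphere."""
    # Step 4422: point on the normalized plane (unit distance in front of the camera).
    p_unit = np.linalg.inv(K) @ np.array([u_p, v_p, 1.0])
    # Scale the ray so that the point lies on the sphere of radius s (assumption).
    p_cam = s * p_unit / np.linalg.norm(p_unit)
    # Step 4423: rotate into the world coordinate system.
    p_w = R_c @ p_cam
    # Step 4424: longitude and latitude, assuming a z axis pointing vertically downward,
    # so that "up" corresponds to negative z (this sign convention is an assumption).
    lon = np.arctan2(p_w[1], p_w[0])
    lat = np.arcsin(-p_w[2] / s)
    return lon, lat  # radians

Under this sketch, a pixel whose latitude is greater than 0 would then be treated as a target pixel point, consistent with the example above.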
Step 444, determining the first image segmentation probability of the target pixel point as the first sky segmentation probability of the spatial point corresponding to the target pixel point;
The first sky segmentation probability refers to the probability that a spatial point on the sky hemisphere belongs to the sky, obtained by mapping from the corresponding target pixel point on the image.
Based on the position mapping relationship between the target pixel point and the spatial point determined in step 442, the first image segmentation probability of the target pixel point is determined as the first sky segmentation probability of the spatial point corresponding to the target pixel point.
Step 446, determining a third sky segmentation probability of the space point corresponding to the target pixel point by a filtering method according to the first sky segmentation probability and the second sky segmentation probability;
The second sky segmentation probability is used for indicating the probability that the spatial point belongs to the sky at the historical moment, and the third sky segmentation probability is used for indicating the probability that the spatial point belongs to the sky at the current moment.
The filtering method includes at least one of weighted filtering, Kalman filtering, mean filtering and median filtering.
Illustratively, weighted filtering is selected to filter the first sky segmentation probability and the second sky segmentation probability. A filter value a is preset, where a ranges from 0 to 1. The first sky segmentation probability obtained in step 444 is Pro_pic, and the second sky segmentation probability of the historical moment cached on the sky hemisphere is Pro_sphere. The third sky segmentation probability Pro_sphere_filter is obtained by weighted filtering of Pro_sphere and Pro_pic:
Pro_sphere_filter = a * Pro_sphere + (1 - a) * Pro_pic
Optionally, the magnitude of the filter value a may be adjusted according to the actual situation: the value of a is increased when the sky segmentation result at the historical moment is more reliable, and decreased when the sky segmentation result at the current moment is more reliable.
The present application does not limit the selection of the filtering method and the setting of the filter value.
Optionally, the second sky segmentation probability of each spatial point on the sky hemisphere is replaced with the third sky segmentation probability, so as to obtain and cache the sky hemisphere at the current moment, where the sky hemisphere at the current moment is used for the filtering calculation of the third sky segmentation probability at the next moment. That is, the third sky segmentation probability Pro_sphere_filter is used to replace the second sky segmentation probability Pro_sphere, and the sky hemisphere at the current moment is obtained and cached.
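A minimal sketch of the weighted filtering and the cache update, assuming the sky hemisphere cache is discretized into (latitude, longitude) bins stored as a 2-D numpy array; the bin layout and the default filter value a = 0.8 are assumptions for illustration only.

import numpy as np

def filter_and_update(pro_sphere_cache: np.ndarray, lat_idx: np.ndarray, lon_idx: np.ndarray,
                      pro_pic: np.ndarray, a: float = 0.8) -> np.ndarray:
    """Weighted filtering Pro_sphere_filter = a * Pro_sphere + (1 - a) * Pro_pic,
    then replace the cached second sky segmentation probability with the result.

    pro_sphere_cache: cached probabilities on the hemisphere, indexed by (lat, lon) bins.
    lat_idx, lon_idx: bin indices of the spatial points hit by the current frame.
    pro_pic:          first sky segmentation probabilities mapped from the image.
    """
    pro_sphere = pro_sphere_cache[lat_idx, lon_idx]        # second sky segmentation probability
    pro_filtered = a * pro_sphere + (1.0 - a) * pro_pic    # third sky segmentation probability
    pro_sphere_cache[lat_idx, lon_idx] = pro_filtered      # cache for the next moment
    return pro_filtered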
Step 460, determining a second image segmentation probability of the target pixel point according to a third sky segmentation probability of the space point on the sky hemisphere based on the position mapping relation.
The target pixel points are pixel points for which corresponding spatial points exist on the sky hemisphere, and the remaining pixel points other than the target pixel points are pixel points for which no corresponding spatial points exist on the sky hemisphere, that is, the remaining pixel points are pixel points belonging to the ground area in the image.
The second image segmentation probability is used to indicate a probability that the target pixel belongs to the sky.
After the filtering is completed on the sky hemisphere, the filtering result needs to be mapped back to the image for subsequent operations. Based on the position mapping relationship, the second image segmentation probability of the target pixel point is determined according to the third sky segmentation probability of the spatial point on the sky hemisphere, and the second image segmentation probability of the remaining pixel points on the image other than the target pixel points is determined to be 0.
The mapping from the position of a spatial point on the sky hemisphere to the coordinates of the target pixel point on the image is the inverse of the process in step 442: the coordinates of the spatial point in the world coordinate system are determined according to the longitude and latitude of the spatial point on the sky hemisphere; the position coordinates of the spatial point in the camera coordinate system are determined according to the position coordinates of the spatial point in the world coordinate system; and the coordinates of the target pixel point on the image are obtained according to the position coordinates of the spatial point in the camera coordinate system.
Taking a spatial point P on the sky hemisphere as an example, fig. 9 shows the steps of determining the coordinates of the target pixel point corresponding to the spatial point P on the image based on the longitude and latitude of the spatial point P:
Step 4601, determining the coordinates of the spatial point P in the world coordinate system according to the longitude and latitude of the spatial point P on the sky hemisphere;
The coordinates P_sphere_in_w of the spatial point P in the world coordinate system are determined from the longitude P_longitude and the latitude P_latitude of the point P, which is the inverse of the longitude and latitude calculation in step 4424.
Step 4602, determining the position coordinates of the spatial point P in the camera coordinate system according to the position coordinates of the spatial point P in the world coordinate system;
The position coordinates of the point P in the camera coordinate system are determined by the following formula based on the position coordinates P_sphere_in_w of the point P in the world coordinate system:
P_sphere_in_cam = R_c^(-1) * P_sphere_in_w
where R_c is the rotation pose of the camera.
P_sphere_in_cam is a matrix of 3 rows and 1 column, and its 3 elements are denoted P_sphere_in_cam_x, P_sphere_in_cam_y and P_sphere_in_cam_z respectively.
Step 4603, obtaining the coordinates of the target pixel point on the image according to the position coordinates of the spatial point P in the camera coordinate system.
First, the position coordinates P_sphere_in_cam of the spatial point P in the camera coordinate system are projected onto the normalized plane, namely the plane at a unit distance in front of the camera, to obtain the position coordinates P_sphere_in_unit of the spatial point P on the normalized plane:
P_sphere_in_unit = P_sphere_in_cam / P_sphere_in_cam_z
Then, the coordinates P_pic of the target pixel point of the spatial point P on the image are obtained based on the position coordinates P_sphere_in_unit of the spatial point P on the normalized plane and the camera intrinsic matrix K:
P_pic = K * P_sphere_in_unit
After the position coordinates of the target pixel point corresponding to the spatial point are obtained based on the position mapping relationship, the third sky segmentation probability of the spatial point is determined as the second image segmentation probability of the target pixel point.
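A sketch of the inverse mapping (steps 4601 to 4603), using the same axis and sign assumptions as the forward-mapping sketch above; it is illustrative only and assumes the point lies in front of the camera.

import numpy as np

def lon_lat_to_pixel(lon: float, lat: float, K: np.ndarray, R_c: np.ndarray, s: float):
    """Map a spatial point given by (longitude, latitude) on the sky hemisphere back to
    pixel coordinates, inverting the forward-mapping sketch."""
    # Step 4601: world coordinates on the sphere of radius s (z-down assumption).
    p_w = s * np.array([np.cos(lat) * np.cos(lon),
                        np.cos(lat) * np.sin(lon),
                        -np.sin(lat)])
    # Step 4602: back into the camera coordinate system; R_c^(-1) = R_c^T for a rotation matrix.
    p_cam = R_c.T @ p_w
    # Step 4603: project onto the normalized plane, then apply the intrinsic matrix K.
    # Assumes p_cam[2] > 0, i.e. the point is in front of the camera.
    p_unit = p_cam / p_cam[2]
    p_pic = K @ p_unit
    return p_pic[0], p_pic[1]  # pixel coordinates (u, v)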
Optionally, when mapping spatial points on the sky hemisphere to target pixel points on the image, the sky hemisphere may be divided at integer intervals of longitude and latitude, at intervals of 0.1 in longitude and latitude, and so on. The smaller the longitude and latitude interval selected for the division, the more spatial points are obtained by the division and the more pixel points are obtained by projection onto the image, so that the sky segmentation result on the image is more accurate; however, more memory resources are occupied at the same time. The present application does not limit the method of dividing the sky hemisphere.
Optionally, the method of determining the sky area in the image according to the second image segmentation probability may be based on a preset threshold value. For example, in the case where the second image segmentation probability of a pixel is greater than 0.8, the pixel is determined to belong to the sky area. The present application does not limit the method of determining the sky area according to the second image segmentation probability.
Optionally, after the second image segmentation probability of the image is determined, a sky area in the image is determined based on the second image segmentation probability of the pixel points, and the sky area is replaced with a preset sky element, where the sky element is at least one of a video, an animation effect, a text and a picture. For example, the sky area in the image is replaced with a preset sky, or with a sky showing a fireworks animation, and so on.
In summary, in the method provided by this embodiment, based on the first image segmentation probability of the pixel points on the acquired image and the position mapping relationship between the pixel points and the spatial points, the first image segmentation probability of the target pixel point is mapped to the spatial point on the sky hemisphere to obtain the first sky segmentation probability; the first sky segmentation probability is filtered based on the second sky segmentation probability of the spatial point at the historical moment on the sky hemisphere to obtain the third sky segmentation probability at the current moment; and the third sky segmentation probability is mapped back to the image to obtain the sky segmentation result of the image. In this sky segmentation method, because the sky hemisphere model is built, the pixel points belonging to the ground area in the image are not mapped onto the sky hemisphere, so that the interference of the ground area with sky segmentation is eliminated and the accuracy of sky segmentation is improved.
Moreover, the method provided by the embodiment fuses the sky segmentation result at the historical moment and the sky segmentation result at the current moment through filtering performed on the sky hemisphere, so that the sky segmentation result is more accurate. Further, under the condition that the sky is replaced by a preset sky element based on the sky segmentation result, the obtained sky area is also continuous and stable, the shaking phenomenon of the sky area is avoided, and the sky segmentation effect is improved.
In addition, in the method provided by this embodiment, the sky segmentation result is fused and cached on the sky hemisphere after it is obtained, so that an application program can extract the sky segmentation result directly from the cache of the sky hemisphere when it is needed, and the sky segmentation process can be performed asynchronously in the background, thereby avoiding stalls in the application program and ensuring its fluency.
The method provided by the present application is well suited to application programs that require sky segmentation. Taking sky special effects in short videos as an example, the method provided by the present application can effectively improve the accuracy and efficiency of sky special effects. First, the rotation pose of the camera relative to the ground is determined based on data provided by the IMU of the mobile phone, so that the ground area is excluded from the sky segmentation process by using the sky hemisphere model, and the ground area is prevented from being mistaken for a sky area. In addition, short-video shooting applications in the related art perform sky segmentation on each frame of the video picture, and inconsistent sky segmentation results across frames can cause the sky region in the video to jitter; the method provided by the present application combines the historical sky segmentation result and the current sky segmentation result, so that, on the one hand, the sky segmentation result is more accurate and, on the other hand, the edge of the sky region changes more smoothly and naturally. Further, shooting a short video occupies a large amount of the mobile phone's memory; if the sky segmentation operation is performed on every frame of image, the memory occupancy becomes too high, causing problems such as picture stuttering and high terminal power consumption. With the method provided by the present application, the sky segmentation probability is cached on the sky hemisphere, so that when an application program needs the sky segmentation result, the result can be obtained directly from the sky hemisphere without waiting for the sky segmentation process of the current frame to finish, and the sky segmentation process can be performed asynchronously in the background, thereby improving the smoothness of the video.
Fig. 8 is a block diagram of a sky-segmentation apparatus according to an exemplary embodiment of the present application, and as shown in fig. 8, the apparatus includes:
an obtaining module 720, configured to obtain a first image segmentation probability of a pixel point on an image, where the first image segmentation probability is used to indicate a probability that the pixel point belongs to the sky;
The sky determining module 740 is configured to determine, based on a position mapping relationship between the image and a sky hemisphere, a sky segmentation probability of a spatial point on the sky hemisphere according to the first image segmentation probability of a target pixel point, where the sky hemisphere is a hemisphere model built with the camera that captures the image as its center, and the target pixel point is a pixel point for which a corresponding spatial point exists on the sky hemisphere;
The image determining module 760 is configured to determine, based on the position mapping relationship, a second image segmentation probability of the target pixel according to a sky segmentation probability of a spatial point on the sky hemisphere, where the second image segmentation probability is used to indicate a probability that the target pixel belongs to the sky.
In one possible embodiment, the sky determining module 740 is configured to determine a pixel point as the target pixel point in a case where a spatial point obtained by performing three-dimensional spherical mapping on the pixel point on the image belongs to a sky hemisphere, and determine a sky segmentation probability of the spatial point corresponding to the target pixel point according to a first image segmentation probability of the target pixel point.
In one possible embodiment, the sky determining module 740 is configured to determine a longitude and a latitude of a spatial point corresponding to a pixel point by performing three-dimensional spherical mapping on the pixel point on the image, and determine the pixel point as the target pixel point if the latitude of the spatial point belongs to the sky hemisphere.
In one possible embodiment, the sky determining module 740 is configured to determine a first image segmentation probability of the target pixel point as the sky segmentation probability of the spatial point corresponding to the target pixel point.
In one possible embodiment, the sky determining module 740 is configured to determine the first image segmentation probability of the target pixel point as a first sky segmentation probability of the spatial point corresponding to the target pixel point, and determine a third sky segmentation probability of the spatial point corresponding to the target pixel point according to the first sky segmentation probability and a second sky segmentation probability by a filtering method, where the second sky segmentation probability is used for indicating the probability that the spatial point belongs to the sky at a historical moment, the third sky segmentation probability is used for indicating the probability that the spatial point belongs to the sky at the current moment, and the filtering method includes at least one of weighted filtering, Kalman filtering, mean filtering and median filtering.
In a possible embodiment, the sky determining module 740 is configured to replace the second sky-segmentation probability with the third sky-segmentation probability of each spatial point on the sky hemisphere, to obtain the sky hemisphere at the current time for buffering, where the sky hemisphere at the current time is used for performing filtering calculation on the third sky-segmentation probability at the next time.
In one possible embodiment, the sky determining module 740 is configured to determine a rotation pose of the camera according to the inertial measurement unit IMU, determine the position coordinates of the spatial point in the camera coordinate system according to the coordinates of the pixel point, determine the position coordinates of the spatial point in the world coordinate system according to the rotation pose of the camera and the position coordinates of the spatial point in the camera coordinate system, and determine the longitude and latitude of the spatial point according to the position coordinates of the spatial point in the world coordinate system.
In a possible embodiment, the image determining module 760 is configured to determine the second image segmentation probability of the remaining pixels on the image except for the target pixel point to be 0.
In one possible embodiment, the sky determining module 740 is further configured to determine a sky area in the image based on the second image segmentation probability of the pixel point, and replace the sky area with a preset sky element, where the sky element is at least one of a video, an animation effect, a text, and a picture.
It should be noted that, in the sky segmentation device provided in the above embodiment, only the division of the above functional modules is used as an example, in practical application, the above functional distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the sky segmentation device provided in the above embodiment belongs to the same concept as the sky segmentation method embodiment, and the specific implementation process is detailed in the method embodiment, which is not described herein again.
Fig. 9 shows a block diagram of an electronic device 2000 provided by an exemplary embodiment of the present application. The electronic device 2000 may be a portable mobile terminal such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The electronic device 2000 may also be referred to by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the electronic device 2000 includes a processor 2001 and a memory 2002.
The processor 2001 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 2001 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 2001 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 2001 may be integrated with a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed by the display screen. In some embodiments, the processor 2001 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 2002 may include one or more computer-readable storage media, which may be non-transitory. Memory 2002 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 2002 is used to store at least one instruction for execution by processor 2001 to implement the sky segmentation method provided by the method embodiments of the present application.
In some embodiments, the electronic device 2000 may also optionally include a peripheral interface 2003 and at least one peripheral. The processor 2001, memory 2002, and peripheral interface 2003 may be connected by a bus or signal line. The respective peripheral devices may be connected to the peripheral device interface 2003 through a bus, signal line, or circuit board. Specifically, the peripheral devices include at least one of radio frequency circuitry 2004, a display 2005, a camera assembly 2006, audio circuitry 2007, and a power supply 2008.
Peripheral interface 2003 may be used to connect I/O (Input/Output) related at least one peripheral device to processor 2001 and memory 2002. In some embodiments, processor 2001, memory 2002, and peripheral interface 2003 are integrated on the same chip or circuit board, and in some other embodiments, either or both of processor 2001, memory 2002, and peripheral interface 2003 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 2004 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 2004 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 2004 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuitry 2004 includes an antenna system, an RF transceiver, one or more amplifiers, tuners, oscillators, digital signal processors, codec chipsets, subscriber identity module cards, and so forth. The radio frequency circuitry 2004 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to, the world wide web, metropolitan area networks, intranets, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuitry 2004 may also include NFC (Near Field Communication) related circuitry, which is not limited by the present application.
The display 2005 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 2005 is a touch display, the display 2005 also has the ability to capture touch signals at or above the surface of the display 2005. The touch signal may be input to the processor 2001 as a control signal for processing. At this point, the display 2005 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 2005, disposed on the front panel of the electronic device 2000; in other embodiments, there may be at least two displays 2005, disposed on different surfaces of the electronic device 2000 or in a folded design; and in other embodiments, the display 2005 may be a flexible display, disposed on a curved surface or a folded surface of the electronic device 2000. Furthermore, the display 2005 may be arranged in an irregular, non-rectangular pattern, i.e., a special-shaped screen. The display 2005 can be made of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) or other materials.
The camera assembly 2006 is used to capture images or video. Optionally, the camera assembly 2006 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and virtual reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 2006 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm light flash and a cold light flash, and can be used for light compensation under different color temperatures.
Audio circuitry 2007 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 2001 for processing, or inputting the electric signals to the radio frequency circuit 2004 for voice communication. For purposes of stereo acquisition or noise reduction, the microphone may be multiple and separately disposed at different locations of the electronic device 2000. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is then used to convert electrical signals from the processor 2001 or the radio frequency circuit 2004 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, audio circuit 2007 may also include a headphone jack.
The power supply 2008 is used to power the various components in the electronic device 2000. The power source 2008 may be alternating current, direct current, disposable battery, or rechargeable battery. When power supply 2008 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the electronic device 2000 further includes one or more sensors 2009. The one or more sensors 2009 include, but are not limited to, acceleration sensor 2010, gyroscope sensor 2011, pressure sensor 2012, optical sensor 2013, and proximity sensor 2014.
The acceleration sensor 2010 may detect the magnitudes of accelerations on three coordinate axes of a coordinate system established with the electronic device 2000. For example, the acceleration sensor 2010 may be used to detect components of gravitational acceleration in three coordinate axes. Processor 2001 may control display 2005 to display a user interface in either a landscape view or a portrait view based on gravitational acceleration signals acquired by acceleration sensor 2010. Acceleration sensor 2010 may also be used for gathering motion data for a game or user.
The gyro sensor 2011 may detect a body direction and a rotation angle of the electronic device 2000, and the gyro sensor 2011 may cooperate with the acceleration sensor 2010 to collect a 3D motion of the user on the electronic device 2000. The processor 2001 can realize functions such as motion sensing (e.g., changing a UI according to a tilting operation by a user), image stabilization at the time of photographing, game control, and inertial navigation, based on data acquired by the gyro sensor 2011.
The pressure sensor 2012 may be disposed at a side frame of the electronic device 2000 and/or below the display 2005. When the pressure sensor 2012 is disposed on a side frame of the electronic device 2000, a grip signal of the electronic device 2000 by a user may be detected, and the processor 2001 performs a left-right hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 2012. When the pressure sensor 2012 is disposed below the display 2005, control of the operability control on the UI interface is achieved by the processor 2001 in accordance with a user's pressure manipulation of the display 2005. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The optical sensor 2013 is used to collect the ambient light intensity. In one embodiment, processor 2001 may control the display brightness of display 2005 based on the intensity of ambient light collected by optical sensor 2013. Specifically, the display luminance of the display screen 2005 is turned up when the ambient light intensity is high, and the display luminance of the display screen 2005 is turned down when the ambient light intensity is low. In another embodiment, the processor 2001 may also dynamically adjust the shooting parameters of the camera assembly 2006 based on the ambient light intensity collected by the optical sensor 2013.
The proximity sensor 2014, also referred to as a distance sensor, is typically disposed on a front panel of the electronic device 2000. The proximity sensor 2014 is used to collect the distance between the user and the front of the electronic device 2000. In one embodiment, the processor 2001 controls the display 2005 to switch from the on-screen state to the off-screen state when the proximity sensor 2014 detects a gradual decrease in the distance between the user and the front of the electronic device 2000, and the processor 2001 controls the display 2005 to switch from the off-screen state to the on-screen state when the proximity sensor 2014 detects a gradual increase in the distance between the user and the front of the electronic device 2000.
Those skilled in the art will appreciate that the structure shown in fig. 9 is not limiting of the electronic device 2000 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
Embodiments of the present application also provide a computer readable storage medium having at least one instruction, at least one program, a code set, or an instruction set stored thereon, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the sky-segmentation method provided by the above method embodiments.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. A processor of a computer device reads the computer instructions from a computer-readable storage medium, the processor executing the computer instructions, causing the computer device to perform the sky segmentation method according to any of the embodiments described above.
Alternatively, the computer readable storage medium may include a read-only memory (ROM), a random access memory (RAM), a solid state drive (SSD), an optical disk, or the like. The random access memory may include a resistive random access memory (ReRAM) and a dynamic random access memory (DRAM), among others. The foregoing embodiment numbers of the present application are merely for description, and do not represent the advantages or disadvantages of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. The foregoing description covers only preferred embodiments of the present application and is not intended to limit the application; the scope of protection of the application is defined by the appended claims.