Disclosure of Invention
In order to solve the above technical problem, an embodiment of the present application provides a method for determining an event map, where the method includes:
sampling a video stream to obtain a picture sequence; the pictures in the picture sequence are provided with time parameters;
determining a reference picture with a first time parameter and a target picture with a second time parameter from the picture sequence; the reference picture and the target picture are two adjacent pictures in the picture sequence, and the first time parameter is earlier than the second time parameter;
identifying a set of reference objects from the reference picture and a set of target objects from the target picture;
determining state change information of the target object in the target object set on a second time parameter according to the reference object in the reference object set;
and determining an event map according to the target object, the second time parameter and the state change information.
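The adjacent-picture comparison described above can be sketched in code. This is a minimal illustration only; the `Frame` type, the dictionary-of-regions representation, and the state labels are assumptions made for the sketch, not structures defined by the application:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """One sampled picture: its time parameter plus the recognized object set."""
    time: float
    objects: dict  # object name -> region (any comparable representation)

def state_changes(reference: Frame, target: Frame) -> list:
    """Compare two adjacent pictures and emit (object, time, state) records."""
    events = []
    for name, region in target.objects.items():
        if name not in reference.objects:
            # present in the target picture but not the reference picture
            events.append((name, target.time, "appears"))
        elif reference.objects[name] != region:
            # present in both, but its region has changed
            events.append((name, target.time, "position changed"))
    for name in reference.objects:
        if name not in target.objects:
            # present in the reference picture but gone from the target picture
            events.append((name, target.time, "disappears"))
    return events
```

The records produced here correspond to the (target object, second time parameter, state change information) triples from which the event map is built.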
Further, the pictures in the above-mentioned picture sequence are taken by the same camera based on a fixed view angle.
Further, determining the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set, including:
determining that a first target object is located in a first area in a target picture;
determining a second region corresponding to the first region in the reference picture;
and if the second area does not have the first reference object matched with the first target object, determining the state change information of the first target object on the second time parameter as the first state change information.
Further, determining the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set, including:
determining that the second target object is located in a third area in the target picture;
determining a fourth region from the reference picture; a second reference object matching the second target object exists in the fourth area;
determining a first relative position relationship value of a second target object relative to the target picture and determining a second relative position relationship value of a second reference object relative to the reference picture;
determining a position relation matching value based on the first relative position relation value and the second relative position relation value;
and determining the state change information of the second target object on the second time parameter as second state change information based on the position relation matching value.
Further, determining the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set, including:
determining that the third reference object is located in a fifth region in the reference picture;
determining a sixth area corresponding to the fifth area in the target picture;
and if the sixth area does not have a third target object matched with the third reference object, determining the state change information of the third reference object on the second time parameter as third state change information.
Further, determining that the state change information of the second target object on the second time parameter is the second state change information based on the position relation matching value, including:
if the position relation matching value is the first relation matching value, determining the state change information of the second target object on the second time parameter as the first sub-state information.
Further, determining that the state change information of the second target object on the second time parameter is the second state change information based on the position relation matching value, including:
if the position relation matching value is the second relation matching value, determining the state change information of the second target object on the second time parameter as the second sub-state information.
Correspondingly, the embodiment of the application also provides a device for determining the event map, which comprises:
the sampling module is used for sampling the video stream to obtain a picture sequence; the pictures in the picture sequence are provided with time parameters;
the first determining module is used for determining, from the picture sequence, a reference picture with a first time parameter and a target picture with a second time parameter; the reference picture and the target picture are two adjacent pictures in the picture sequence, and the first time parameter is earlier than the second time parameter;
the identification module is used for identifying a reference object set from a reference picture and identifying a target object set from a target picture;
the second determining module is used for determining the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set;
and the third determining module is used for determining the event map according to the target object, the second time parameter and the state change information.
Accordingly, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the method for determining an event map.
Accordingly, an embodiment of the present application further provides a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the above method for determining an event map.
The embodiment of the application has the following beneficial effects:
the embodiments of the application disclose a method, a device, an electronic device and a storage medium for determining an event map. The determining method includes: sampling a video stream to obtain a picture sequence, where the pictures in the picture sequence carry time parameters; determining, from the picture sequence, a reference picture with a first time parameter and a target picture with a second time parameter, where the reference picture and the target picture are two adjacent pictures in the picture sequence and the first time parameter is earlier than the second time parameter; identifying a reference object set from the reference picture and a target object set from the target picture; determining state change information of the target objects in the target object set on the second time parameter according to the reference objects in the reference object set; and determining the event map according to the target object, the second time parameter and the state change information. Based on the embodiments of the application, scene understanding is performed on a dynamic scene in a video stream through image recognition, and the objects whose state changes between different sampling frames, together with the corresponding temporal or logical state change information, are determined. This reduces the difficulty of storing dynamic scene understanding information, simplifies the process of determining object state changes, and assists a computer in further reasoning and decision-making according to those changes.
Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first", "second", "third", "fourth", "fifth" and "sixth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first", "second", "third", "fourth", "fifth" and "sixth" may explicitly or implicitly include one or more of the features. Moreover, the terms "first," "second," "third," "fourth," "fifth," and "sixth," etc., are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus.
Please refer to fig. 1, which is a schematic diagram of an application environment according to an embodiment of the present application. The environment includes a server 101 and a terminal 102 connected through a wireless link. The terminal 102 may be a desktop computer, a notebook computer, a mobile phone, a tablet computer, or another device that can be loaded with a program for determining a state change of an object. The server 101 samples the video stream sent by the terminal to obtain a picture sequence, where the pictures in the picture sequence carry time parameters; determines from the picture sequence a reference picture with a first time parameter and a target picture with a second time parameter, where the reference picture and the target picture are two adjacent pictures in the picture sequence and the first time parameter is earlier than the second time parameter; identifies a reference object set from the reference picture and a target object set from the target picture; and determines the state change information of the target objects in the target object set on the second time parameter according to the reference objects in the reference object set.
Specific embodiments of a method for determining an event map according to the present application are described below, and fig. 2 is a schematic flow chart of a method for determining an event map according to the present application. The present application provides the method operation steps shown in the embodiments or the flow chart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is only one of many possible orders of execution and does not represent the only order of execution; in actual execution, the steps may be performed sequentially or in parallel as in the embodiments or the methods shown in the figures (e.g., in the context of parallel processors or multi-threaded processing). Specifically, as shown in fig. 2, the method includes:
s201: sampling a video stream to obtain a picture sequence; pictures in a picture sequence carry a temporal parameter.
In the embodiment of the application, the server samples the video stream sent by the terminal according to a preset sampling rule to obtain a picture sequence, where the preset sampling rule selects a corresponding sampling frequency parameter value according to the degree of dynamic change of the scene in the video stream. It should be understood that, under different requirements, the sampling frequency parameter value selected for the same degree of scene dynamics in a video stream is not fixed. The video stream may be video data acquired by a laser radar, a depth camera, a binocular camera, or another sensing device capable of extracting depth information, and the format of the picture sequence may be any one of a depth image, a point cloud, a mesh, or a three-dimensional data model.
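As an illustration of the preset sampling rule, the sketch below derives the time parameters of the sampled pictures from a sampling frequency parameter value; the function name and the uniform-sampling assumption are hypothetical, not prescribed by the application:

```python
def sampling_times(duration_s: float, sampling_hz: float) -> list:
    """Time parameters attached to the pictures sampled from the stream.

    A higher sampling_hz would be chosen for a more dynamic scene,
    per the preset sampling rule described above."""
    count = int(duration_s * sampling_hz)
    return [i / sampling_hz for i in range(count)]
```

For example, sampling two seconds of video at 2 Hz yields the time parameters 0.0, 0.5, 1.0 and 1.5.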
In an optional implementation manner, the server samples two minutes of the video stream at a given sampling frequency parameter value to obtain a picture sequence in a point cloud data format, and names the pictures according to a preset naming rule. The naming rule may number the pictures in sampling order, i.e. picture 1, picture 2, …, picture n, or may name them by sampling time, e.g. a picture sampled on 2019-10-14 at 00:00; both naming modes ensure that each picture in the picture sequence has a unique name. The server stores the named picture sequence in a database in increasing or decreasing order and establishes a sample-name.text file.
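The two naming modes can be sketched as follows; the exact timestamp format is an assumption, since the original example is only partially legible:

```python
from datetime import datetime

def sequential_name(index: int) -> str:
    """Name pictures in sampling order: picture 1, picture 2, ..., picture n."""
    return f"picture {index}"

def timestamp_name(sampled_at: datetime) -> str:
    """Name pictures by sampling time; this format is a guess at the
    partially legible original example, but any injective format works."""
    return f"picture {sampled_at.strftime('%Y%m%d-%H%M')}"
```

Either scheme preserves the naming uniqueness that the database ordering relies on.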
S203: determining a reference picture with a first time parameter and a target picture with a second time parameter from the picture sequence; the reference picture and the target picture are two adjacent pictures in the picture sequence, and the first time parameter is earlier than the second time parameter.
In the embodiment of the application, from the picture sequence obtained by sampling, the server determines a reference picture with a first time parameter and a target picture with a second time parameter, where the reference picture and the target picture carry distinguishing information.
S205: a set of reference objects is identified from the reference picture, and a set of target objects is identified from the target picture.
In the embodiment of the application, a server performs sampling according to a video stream sent by a terminal to obtain a picture sequence, extracts an object in the picture sequence according to an image recognition technology, recognizes a reference object set from a reference picture, and recognizes a target object set from a target picture.
In an optional implementation mode, the pictures in sample-name.text are read at the preset frequency, the object set in each picture is extracted based on the VoteNet algorithm, and the objects in the picture and their corresponding semantic-level categories are determined through object detection and semantic segmentation.
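A hedged sketch of assembling the recognized objects and their semantic-level categories into an object set; `detections` stands in for the raw output of a detector such as VoteNet, whose actual output format differs:

```python
def build_object_set(detections):
    """Group detector output (category label, bounding region) pairs into an
    object set keyed by semantic-level category."""
    object_set = {}
    for label, region in detections:
        object_set.setdefault(label, []).append(region)
    return object_set
```

The same routine produces the reference object set from the reference picture and the target object set from the target picture.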
S207: and determining the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set.
In the embodiment of the present application, the pictures in the picture sequence are taken by the same camera based on a fixed viewing angle. And the server determines a target object corresponding to the reference object in the target object set according to the reference object in the reference object set, and determines the state change information of the target object on the second time parameter.
S209: and determining an event map according to the target object, the second time parameter and the state change information.
In the embodiment of the present application, the server extracts the first target object, the second time parameter, and the state change information as structured information, i.e. <"first target object", "second time parameter", "state change information">.
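Storing the extracted structured information in a simple in-memory event map can be sketched as follows; the dictionary-of-lists representation is an assumption, not the application's storage format:

```python
def add_event(event_map: dict, obj: str, time_param: str, state: str) -> None:
    """Insert one <object, time parameter, state change> triple, keyed by
    object so each object's state history can be read in time order."""
    event_map.setdefault(obj, []).append((time_param, state))
```

Keying by object keeps the temporal sequence of each object's state changes directly readable, which is what a time-sequence event map needs.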
In the embodiment of the present application, the event map may include a time sequence event map, a logic event map, and a cause and effect event map, but is not limited to the above.
By adopting the method for determining the event map provided by the embodiment of the application, scene understanding is carried out on a dynamic scene in a video stream through an image recognition technology, and the object with state change in different sampling frames and the corresponding time sequence or logic state change information are determined. The difficulty of storing dynamic scene understanding information can be reduced, the process of determining the change of the object state can be simplified, and a computer can be assisted to further process reasoning and decision according to the change of the object state.
Based on the method for determining the event map shown in fig. 2, there are various ways to determine the state change information of a target object in the target object set on the second time parameter; three alternative embodiments are specifically described below.
In an alternative embodiment, the server determines that the first target object is located in a first region in the target picture, and determines a second region corresponding to the first region in the reference picture. The server judges whether a first reference object matching the first target object exists in the second region; if not, the state change information of the first target object on the second time parameter is determined to be the first state change information. The first state change information may specifically be that the first target object "appears" at the second time parameter.
Specifically, fig. 3a and fig. 3b show a schematic structural diagram of a specific implementation of determining the state change information of a target object in the target object set on the second time parameter according to an embodiment of the present application. The server samples the video stream sent by the terminal according to a preset sampling rule to obtain a picture sequence, and determines from the picture sequence a reference picture a with a first time parameter t1 and a target picture b with a second time parameter t2, where the reference picture a and the target picture b are two adjacent pictures in the picture sequence and the first time parameter t1 is earlier than the second time parameter t2. Through image recognition, the server identifies a reference object set from the reference picture a and a target object set from the target picture b, where the target object set includes a first target object m1. The server determines the first region in which the first target object m1 is located in the target picture b, as indicated by the solid-line box in the figure, and accordingly determines a second region corresponding to the first region in the reference picture a, as indicated by the dashed-line box in the figure. If no first reference object matching the first target object m1 exists in the second region, the state change information of the first target object m1 on the second time parameter t2 is determined to be "appears". The server extracts the first target object m1, the second time parameter t2, and the state change information "appears" as structured information, i.e. <"first target object m1", "second time parameter t2", "appears">.
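The region-matching test behind the "appears" determination can be sketched with an intersection-over-union check; the box representation, the IoU criterion, and the threshold are assumptions, made workable by the fixed camera view (the second region shares the first region's coordinates):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def appears(first_region, reference_regions, threshold=0.5):
    """True when no reference object sufficiently overlaps the region the
    first target object occupies, i.e. the object 'appears' at t2."""
    return all(iou(first_region, r) < threshold for r in reference_regions)
```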
In another alternative embodiment, the server determines that the second target object is located in a third region in the target picture, and determines a fourth region from the reference picture; a second reference object matching the second target object is present in the fourth area. The server determines a first relative position relation value of the second target object relative to the target picture, determines a second relative position relation value of the second reference object relative to the reference picture, and determines a position relation matching value and determines state change information of the second target object on a second time parameter as second state change information based on the first relative position relation value and the second relative position relation value.
The second state change information includes first sub-state information and second sub-state information, and the position relation matching value includes a first relation matching value and a second relation matching value. The first sub-status information and the second sub-status information may respectively correspond to "position of the second target object is changed" on the second time parameter and "position of the second target object is not changed" on the second time parameter, or the first sub-status information and the second sub-status information may respectively correspond to "position of the second target object is not changed" on the second time parameter and "position of the second target object is changed" on the second time parameter.
For the case in which the server determines the state change information of a target object in the target object set on the second time parameter according to a reference object in the reference object set, a specific implementation is introduced below in which the server determines, based on the position relation matching value, that the state change information of the second target object on the second time parameter is the second state change information.
The server judges whether the determined position relation matching value is the first relation matching value; if so, the state change information of the second target object on the second time parameter is determined to be the first sub-state information. Otherwise, the position relation matching value is the second relation matching value, and the state change information of the second target object on the second time parameter is determined to be the second sub-state information.
Specifically, as shown in fig. 4a and 4b, a schematic structural diagram of a specific implementation of determining state change information of a target object in a target object set on a second time parameter according to an embodiment of the present application is provided.
In the embodiment of the present application, an alternative application scenario is parking. The server samples the video stream sent by the terminal according to a preset sampling rule to obtain a picture sequence, and determines from the picture sequence a reference picture c with a first time parameter t1 and a target picture d with a second time parameter t2, where the reference picture c and the target picture d are two adjacent pictures in the picture sequence and the first time parameter t1 is earlier than the second time parameter t2. Through image recognition, the server identifies a reference object set from the reference picture c and a target object set from the target picture d, where the target object set includes a second target object m2 and the reference object set includes a second reference object n2. The server determines the third region in which the second target object m2 is located in the target picture d, as indicated by the solid-line box in the figure, and accordingly determines from the reference picture c a fourth region in which a second reference object n2 matching the second target object m2 exists, as indicated by the dashed-line box in the figure. The server determines a first relative position relation value of the second target object m2 relative to the target picture d, determines a second relative position relation value of the second reference object n2 relative to the reference picture c, and, based on the first relative position relation value and the second relative position relation value, determines a position relation matching value and determines the state change of the second target object m2 on the second time parameter t2 to be "drives away from the parking lot".
Here, the first relative position relation value refers to the direction and distance of one side edge of the second target object m2 relative to the target picture d, and the second relative position relation value refers to the direction and distance of the same side edge of the second reference object n2 relative to the reference picture c. The position relation matching value may be taken from the set {"0", "1"}, or represented in other manners. When the matching value is "0", it corresponds to the first sub-state information: the first relative position relation value and the second relative position relation value are equal, that is, the position of the second target object m2 in the target picture d relative to the second reference object n2 in the reference picture c has undergone no state change, or the second target object m2 and the second reference object n2 are not a subject requiring study. When the matching value is "1", it corresponds to the second sub-state information: the first relative position relation value and the second relative position relation value are not equal, that is, the position of the second target object m2 in the target picture d relative to the second reference object n2 in the reference picture c has undergone a state change, specifically the state "drives away from the parking lot". The server extracts the second target object m2, the second time parameter t2, and the state change information "drives away from the parking lot" as structured information, i.e. <"second target object m2", "second time parameter t2", "drives away from the parking lot">.
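The relative position relation values and the {"0", "1"} matching value can be sketched as follows; reducing the "direction and distance" to a single normalised horizontal distance is a simplifying assumption made for the sketch:

```python
def relative_position_value(box, picture_width):
    """Normalised distance of the object's left side from the picture's left
    edge -- one simplified realisation of a relative position relation value."""
    return box[0] / picture_width

def relation_match_value(target_value, reference_value, tol=1e-6):
    """'0' when the two values agree (first sub-state: no position change),
    '1' when they differ (second sub-state: the position has changed)."""
    return "0" if abs(target_value - reference_value) <= tol else "1"
```

Comparing normalised values rather than raw pixel coordinates keeps the test valid even if the two pictures were stored at different resolutions.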
In another alternative embodiment, the server determines that the third reference object is located in a fifth region in the reference picture, and determines a sixth region corresponding to the fifth region in the target picture. The server judges whether a third target object matched with the third reference object exists in the sixth area, and if the third target object matched with the third reference object does not exist in the sixth area, the state change information of the third reference object on the second time parameter is determined to be third state change information. The third state change information may specifically be that the third reference object "disappears" on the second time parameter.
Specifically, fig. 5a and fig. 5b show a schematic structural diagram of a specific implementation of determining the state change information of a target object in the target object set on the second time parameter according to an embodiment of the present application. The server samples the video stream sent by the terminal according to a preset sampling rule to obtain a picture sequence, and determines from the picture sequence a reference picture e with a first time parameter t1 and a target picture f with a second time parameter t2, where the reference picture e and the target picture f are two adjacent pictures in the picture sequence and the first time parameter t1 is earlier than the second time parameter t2. Through image recognition, the server identifies a reference object set from the reference picture e and a target object set from the target picture f, where the reference object set includes a third reference object n3. The server determines the fifth region in which the third reference object n3 is located in the reference picture e, and accordingly determines a sixth region corresponding to the fifth region in the target picture f, as indicated by the solid-line box in the figure. If no third target object matching the third reference object n3 exists in the sixth region, the state change information of the third reference object n3 on the second time parameter t2 is determined to be "disappears". The server extracts the third reference object n3, the second time parameter t2, and the state change information "disappears" as structured information, i.e. <"third reference object n3", "second time parameter t2", "disappears">.
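Symmetrically to the "appears" case, the "disappears" determination can be sketched as follows; the box representation and the overlap test are assumptions, relying again on the fixed view angle so that the sixth area shares the fifth area's coordinates:

```python
def overlaps(a, b):
    """Whether two axis-aligned boxes (x1, y1, x2, y2) intersect at all."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def disappears(fifth_region, target_regions):
    """True when the sixth area (same coordinates as the fifth, since the
    camera view is fixed) contains no matching target object."""
    return not any(overlaps(fifth_region, t) for t in target_regions)
```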
Fig. 6 is a schematic structural diagram of the apparatus for determining an event map provided in the embodiment of the present application, and as shown in fig. 6, the apparatus includes:
the sampling module 601 samples the video stream to obtain a picture sequence; the pictures in the picture sequence are provided with time parameters;
the first determining module 603 determines a reference picture with a first time parameter and a target picture with a second time parameter from the picture sequence; the reference picture and the target picture are two adjacent pictures in the picture sequence, and the first time parameter is earlier than the second time parameter;
the identification module 605 identifies a set of reference objects from the reference picture and a set of target objects from the target picture;
the second determining module 607 determines the state change information of the target object in the target object set on the second time parameter according to the reference object in the reference object set;
the third determining module 609 determines an event map according to the target object, the second time parameter and the state change information.
The device embodiments and the method embodiments of the present application are based on the same application concept.
The present application further provides an electronic device, which may be disposed in a server and includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions related to implementing the method for determining an event map in the method embodiments, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded from the memory and executed by the processor to implement the method for determining an event map.
A storage medium may be disposed in the server to store at least one instruction, at least one program, a set of codes, or a set of instructions related to implementing a method for determining an event map in the method embodiments, where the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for determining an event map.
Optionally, in this embodiment, the storage medium may be located in at least one of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash disk, a read-only memory (ROM), a removable hard disk, a magnetic disk, an optical disc, and other media that can store program code.
As can be seen from the above embodiments of the method, apparatus, electronic device and storage medium for determining an event map provided by the present application, the method includes: sampling a video stream to obtain a picture sequence, where the pictures in the picture sequence carry time parameters; determining, from the picture sequence, a reference picture with a first time parameter and a target picture with a second time parameter, where the reference picture and the target picture are two adjacent pictures in the picture sequence and the first time parameter is earlier than the second time parameter; identifying a reference object set from the reference picture and a target object set from the target picture; determining the state change information of the target objects in the target object set on the second time parameter according to the reference objects in the reference object set; and determining the event map according to the target object, the second time parameter and the state change information. Based on the embodiments of the application, scene understanding is performed on a dynamic scene in a video stream through image recognition, and the objects whose state changes between different sampling frames, together with the corresponding temporal or logical state change information, are determined. This reduces the difficulty of storing dynamic scene understanding information, simplifies the process of determining object state changes, and assists a computer in further reasoning and decision-making according to those changes.
It should be noted that the foregoing ordering of the embodiments of the present application is for description only and does not indicate that one embodiment is superior to another; the specific embodiments are described in this specification, and other embodiments are also within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that of the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown to achieve the desired results; in some embodiments, multitasking and parallel processing are also possible or may be advantageous.
All the embodiments in this specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the device embodiment is substantially similar to the method embodiment, its description is relatively simple, and the relevant points can be found in the corresponding parts of the method embodiment.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.