
CN120916006A - Interactive video playing method, player, video platform, computer program product and storage medium - Google Patents

Interactive video playing method, player, video platform, computer program product and storage medium

Info

Publication number
CN120916006A
Authority
CN
China
Prior art keywords
video
state machine
interactive
configuration file
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510996917.XA
Other languages
Chinese (zh)
Inventor
刘杰
鹍鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Interactive Entertainment Shanghai Technology Co ltd
Original Assignee
Interactive Entertainment Shanghai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Interactive Entertainment Shanghai Technology Co ltd filed Critical Interactive Entertainment Shanghai Technology Co ltd
Priority to CN202510996917.XA priority Critical patent/CN120916006A/en
Publication of CN120916006A publication Critical patent/CN120916006A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/242Synchronization processes, e.g. processing of PCR [Program Clock References]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract


This invention provides a method for playing interactive videos, a player, a video platform, a computer program product, and a storage medium. The method includes: loading a main configuration file and parsing the main configuration file; loading a scene configuration file according to scene configuration file information and parsing the scene configuration file; constructing a video state machine based on video mapping, scene information, and variables; starting a timeline, obtaining the current physical video progress in conjunction with the video state machine, and mapping it to a relative time point on the timeline; monitoring whether the playback progress matches a preset interactive control point in the timeline; when matching, executing a corresponding preset action based on the video state machine; or, when matching and satisfying the timeline time point trigger condition or the logical condition of the variables in the video state machine, executing a corresponding preset action based on the video state machine. Using the solution provided by this invention improves cross-platform adaptability and interactive flexibility, and reduces customized development costs.

Description

Interactive video playing method, player, video platform, computer program product and storage medium
Technical Field
The invention relates to the technical field of interactive video live broadcasting, and in particular to an interactive video playing method, player, video platform, computer program product, and storage medium.
Background
As Internet video content has grown richer, users are no longer satisfied with passive watching; their demands for interactivity and personalization keep increasing. Interactive video, as a novel video form, allows users to influence the direction of the content through their choices, which greatly improves user participation.
At present, interactive video implementations mostly depend on platform-specific coding formats and playing logic and lack a universal, flexibly configurable playing framework, so cross-platform application is limited.
Therefore, how to provide a universal and flexibly configurable interactive video playing method that supports cross-platform compatibility and diversified interactive scenes while reducing development and maintenance costs has become a pressing technical problem.
Disclosure of Invention
In view of the above, embodiments of the present invention provide an interactive video playing method, a player, a video platform, a computer program product, and a storage medium, which solve the prior-art problems of poor cross-platform compatibility, insufficient support for diversified interactive scenes, and high development and maintenance costs caused by interactive video's dependence on platform-specific logic, significantly improve cross-platform adaptation capability and interactive flexibility, and reduce customized development costs.
In a first aspect, an embodiment of the present invention provides an interactive video playing method, which includes: loading a main configuration file and parsing it to obtain video mapping, scene configuration file information, and metadata, wherein the video mapping maps a logical video identity (ID) to a physical video ID and a corresponding video segment; loading the scene configuration file according to the scene configuration file information and parsing it to obtain scene information, a timeline, conditions, actions, and variables; constructing a video state machine based on the video mapping, the scene information, and the variables; starting the timeline, obtaining the currently played physical video progress in conjunction with the video state machine, and mapping it to a relative time point on the timeline; monitoring whether the playing progress matches a preset interaction control point in the timeline; and, on a match, executing a corresponding preset action based on the video state machine, or, on a match that also satisfies a timeline time-point trigger condition or a logical condition on the variables in the video state machine, executing the corresponding preset action based on the video state machine.
It can be understood that, after the scene configuration file is loaded according to the scene configuration file information, information such as components, styles, or animations can also be obtained by parsing it.
With this scheme, the hierarchical parsing of the main configuration file and the scene configuration file separates the core logic of the interactive video (such as video mapping and timeline control) from presentation-layer details (such as components and styles), so that the interaction rules do not depend on the underlying interfaces of a specific platform. Through the correspondence between logical IDs and physical IDs, the video mapping allows the same interaction scenario to adapt to different video sources (such as video files with different resolutions and formats). The timeline, conditions, actions, and other elements parsed from the scene configuration file combine with the video state machine to form a unified interaction control center: timed triggering (such as popping up an option when playback reaches a certain moment) is realized through timeline monitoring, and dynamic response (such as jumping to a scenario branch according to user selection) is realized through variable logic, covering diversified interaction scenes. The scheme breaks the prior-art limitation of tightly binding interaction logic to a platform: the same set of configuration files can be reused in different environments such as browsers and native applications without repeated development, improving cross-platform adaptation; meanwhile, the preset control points and the condition-action mechanism reduce the amount of customized code, lowering development and maintenance costs and making the creation and iteration of interactive videos more efficient and flexible.
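As an illustrative sketch only (not the patent's actual implementation), the load-parse-map flow of the first aspect can be expressed as follows; all function and field names other than those shown in the configuration samples later in this document (`parse_master`, `to_timeline_point`) are hypothetical:

```python
# Hypothetical sketch of the first-aspect flow: parse the main configuration,
# then map a physical playback position to a relative timeline moment.
MASTER = {
    "videoMapping": {"intro": {"physicalId": "video_001", "startTime": 0, "endTime": 60}},
    "sceneConfig": "scenes/main_scene.json",
    "metadata": {"title": "demo", "version": "1.0"},
}

def parse_master(cfg):
    # Step 1: split the main configuration file into its three parts.
    return cfg["videoMapping"], cfg["sceneConfig"], cfg["metadata"]

def to_timeline_point(physical_id, physical_progress, video_mapping):
    # Map the currently played physical video progress to a relative time
    # point on the timeline (here: seconds elapsed inside the mapped segment).
    for logical_id, seg in video_mapping.items():
        if seg["physicalId"] == physical_id:
            return logical_id, physical_progress - seg["startTime"]
    return None, None

mapping, scene_path, meta = parse_master(MASTER)
logical, t = to_timeline_point("video_001", 12.5, mapping)
```

Under these assumptions, playing "video_001" at 12.5 s resolves to the logical segment "intro" at relative moment 12.5 on the timeline.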
It can be appreciated that, in some possible implementations, when no preset interactive control point is matched or the related condition is not met, the current playing state may be maintained and the playing progress may continue to be monitored.
With reference to the first aspect, in a first possible implementation manner of the first aspect, constructing the video state machine includes: using the physical video IDs and corresponding video segments in the video mapping as state nodes, using the variables as state parameters, and using the jump rules in the scene information as state transition conditions to construct the video state machine.
This scheme defines the construction logic of the video state machine: physical video IDs and video segments serve as state nodes, variables serve as state parameters, and jump rules serve as transition conditions, forming a structured state management system. Compared with prior-art state control logic (such as storing the playing state and the user selection separately), the scheme uses a finite state machine to collaboratively manage video playback, user interaction, and variable state, ensuring the consistency and traceability of state transitions. For example, when a user selects an option, the variable update synchronously triggers a node jump in the state machine, avoiding playback anomalies caused by unsynchronized state (such as misalignment between the video jump and the User Interface (UI) display) and improving the stability and user experience of the interactive video.
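The node/parameter/transition structure described above can be sketched as a minimal finite state machine; this is a hypothetical illustration under assumed names (`VideoStateMachine`, `try_transition`), not the claimed implementation:

```python
# Minimal sketch: video segments as state nodes, variables as state
# parameters, jump rules as transition conditions. All names hypothetical.
class VideoStateMachine:
    def __init__(self, video_mapping, variables, jump_rules):
        self.nodes = video_mapping          # logical id -> segment description
        self.vars = dict(variables)         # state parameters
        self.rules = jump_rules             # (src, condition, dst) jump rules
        self.current = next(iter(video_mapping))

    def try_transition(self):
        # Fire the first jump rule whose condition holds for the current vars.
        for src, cond, dst in self.rules:
            if src == self.current and cond(self.vars):
                self.current = dst
                return True
        return False

sm = VideoStateMachine(
    {"intro": {"physicalId": "video_001"}, "choice_a": {"physicalId": "video_002"}},
    {"score": 0},
    [("intro", lambda v: v["score"] >= 10, "choice_a")],
)
sm.vars["score"] = 10   # a user selection updates a state parameter...
sm.try_transition()     # ...and synchronously triggers the node jump
```

Because the variable update and the node jump go through the same state machine, the playback state and the interaction state cannot drift apart, which is the consistency property the paragraph above describes.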
With reference to the first aspect, in a second possible implementation manner of the first aspect, matching the preset interaction control point in the timeline includes: matching the preset interaction control point in the timeline based on a timestamp after the physical video progress is mapped to a relative time point on the timeline, or matching the interaction control point based on the current values of the variables and the variable logic conditions among the conditions.
This scheme refines the trigger logic of interaction control points through two matching modes (timestamp matching and variable-logic matching), covering the two core classes of interaction scenes: timing-driven and state-driven. Timestamp matching guarantees precise interactions when playback reaches a specific moment (such as a timed popup), while variable-logic matching supports dynamic interactions based on user behavior such as selection and input (such as showing different scenario branches according to user choices), solving the prior-art problems of a single trigger mode and poor support for complex scenes. The flexible combination of the two matching modes lets interactive video satisfy both the timed-interaction needs of linear narratives and the branch-selection needs of nonlinear narratives, improving the diversity and precision of interaction scenes.
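The two matching modes can be sketched as two small predicates over the control-point list; the tolerance value and the `condition` field are illustrative assumptions, not specified by the patent:

```python
# Sketch of the two matching modes. The 0.25 s tolerance and the "condition"
# field are assumptions for illustration.
def match_by_timestamp(relative_t, control_points, tolerance=0.25):
    # Timing-driven: fire when the mapped timeline moment reaches a preset point.
    return [cp for cp in control_points
            if "timePoint" in cp and abs(relative_t - cp["timePoint"]) <= tolerance]

def match_by_variables(variables, control_points):
    # State-driven: fire when a control point's variable-logic condition holds.
    return [cp for cp in control_points
            if "condition" in cp and cp["condition"](variables)]

points = [
    {"timePoint": 10, "controlPoint": "show_question_1"},
    {"controlPoint": "show_ending_a", "condition": lambda v: v.get("choice") == "a"},
]
timed = match_by_timestamp(10.1, points)       # linear, timed interaction
branch = match_by_variables({"choice": "a"}, points)  # nonlinear branch choice
```

A real player would evaluate both predicates on every progress tick, which is how one monitoring loop covers both timing-driven and state-driven scenes.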
With reference to the first aspect, in a third possible implementation manner of the first aspect, the preset actions include at least one of: a video control operation that controls video playing, pausing, or jumping through the video state machine; a UI operation that updates the display state of an interface element; and a variable update operation that modifies a variable value in the video state machine.
This scheme defines three core classes of preset actions (video control, UI operation, and variable update) and the logic for combining at least one of them, providing a modular action execution mechanism for interactive video. Compared with prior-art designs in which actions are strongly coupled to specific functions (for example, play control and UI update bound in the same function), the scheme allows action types to be chosen flexibly according to scene requirements: a simple interaction scene may execute only a UI operation (such as showing a prompt), while a complex scene may combine a variable update with a video jump (such as updating a score after a user selection and then jumping to the corresponding ending). This modular design lowers the threshold for reusing action logic; developers need not repeatedly write basic operation code and can realize diversified interactions by configuring action combinations, improving development efficiency and reducing code redundancy.
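The modular action mechanism can be sketched as a small dispatcher over a list of atomic operations; the operation vocabulary (`"video"`, `"ui"`, `"var"`) is an assumption chosen to mirror the three action classes above:

```python
# Hypothetical action dispatcher: a preset action is a freely combinable list
# of atomic operations. Operation names are illustrative, not from the patent.
def execute_actions(actions, state):
    executed = []
    for act in actions:
        if act["type"] == "video":       # play / pause / jump via the state machine
            state["video"] = act["op"]
        elif act["type"] == "ui":        # update an interface element's display state
            state["ui"][act["target"]] = act["value"]
        elif act["type"] == "var":       # modify a variable value in the state machine
            state["vars"][act["name"]] = act["value"]
        executed.append(act["type"])
    return executed

state = {"video": "playing", "ui": {}, "vars": {"score": 0}}
# A complex scene combines a variable update with a video jump:
done = execute_actions(
    [{"type": "var", "name": "score", "value": 10},
     {"type": "video", "op": "seek:choice_a"}],
    state,
)
```

A simple scene would pass a single `"ui"` operation instead; the dispatcher itself never changes, which is the reuse property the paragraph describes.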
With reference to the first aspect, in a fourth possible implementation manner of the first aspect, the UI operation for updating the display state of an interface element includes: triggering, through the video state machine, a front-end rendering engine to update the display state, position, size, transparency, or text content of an interface component.
This scheme refines the concrete implementation of UI operations: the video state machine triggers the front-end rendering engine to update the display state, style, or content of interface components, decoupling UI interaction from video playing logic. Compared with prior-art designs that deeply bind UI control to the video player (such as UI plug-ins that depend on a specific player), the scheme controls interface elements of different front-end frameworks (such as React and Vue) through a unified interface, ensuring cross-platform consistency of UI interaction. For example, the same set of UI update instructions can be realized through Document Object Model (DOM) operations in a browser and through native control rendering in a native application, solving the problem of high UI adaptation costs across platforms; at the same time, explicitly defining the update dimensions (display state, position, size, transparency, and so on) improves the fine-grained control of UI interaction.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the method further includes: updating a variable value in the video state machine in response to a user input event, and triggering the condition check and action execution associated with that variable.
This scheme defines a dual mechanism for the variable update operation (in-memory modification plus persistence/synchronization), ensuring both the real-time responsiveness and the continuity of the user interaction state. The video state machine modifies the in-memory variable directly, so variable updates and action triggers respond immediately (for example, a value updated after a user selection is displayed at once), while a local storage or remote synchronization mechanism solves the prior-art problem of interaction state being easily lost (such as a user's selection record disappearing after a page refresh). For example, after a user pauses the video on a mobile device, the variable values can be synchronized to the cloud, and the state is restored automatically when playback continues on a PC, improving cross-device consistency while providing a data basis for subsequent analysis (such as user behavior tracking).
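The dual mechanism can be sketched with an in-memory store that mirrors each write to a stand-in "remote" dictionary and re-checks the conditions watching that variable; `VariableStore` and its methods are hypothetical names, and the `remote` dict merely simulates local storage or cloud sync:

```python
# Sketch of the dual mechanism: immediate in-memory update plus simulated
# persistence, with condition re-check on every write. All names hypothetical.
class VariableStore:
    def __init__(self):
        self.vars = {}        # in-memory state parameters
        self.remote = {}      # stand-in for local storage / cloud sync
        self.watchers = []    # (name, condition, action) triples

    def watch(self, name, cond, action):
        self.watchers.append((name, cond, action))

    def set(self, name, value):
        self.vars[name] = value       # immediate in-memory modification
        self.remote[name] = value     # persist / synchronize the value
        for n, cond, action in self.watchers:
            if n == name and cond(value):   # re-check associated conditions
                action()                    # execute the associated action

fired = []
store = VariableStore()
store.watch("score", lambda v: v >= 10, lambda: fired.append("show_bonus"))
store.set("score", 10)   # update responds immediately AND survives in "remote"
```

Restoring state on another device would amount to seeding `vars` from `remote` at startup, which is the cross-device continuity the paragraph describes.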
In a second aspect, an embodiment of the present invention provides an interactive video player, which includes a loading and parsing module, a state machine engine, a timeline synchronization module, a monitoring module, and an interaction control module. The loading and parsing module is configured to load a main configuration file and parse it to obtain video mapping, scene configuration file information, and metadata, wherein the video mapping maps a logical video ID to a physical video ID and a corresponding video segment, and to load the scene configuration file according to the scene configuration file information and parse it to obtain scene information, a timeline, conditions, actions, and variables. The state machine engine is configured to construct a video state machine based on the video mapping, the scene information, and the variables. The timeline synchronization module is configured to start the timeline, obtain the currently played physical video progress in conjunction with the video state machine, and map it to a relative time point on the timeline. The monitoring module is configured to monitor whether the playing progress matches a preset interaction control point in the timeline. The interaction control module is configured to execute a corresponding preset action based on the video state machine when the preset interaction control point is matched, or when it is matched and a timeline time-point trigger condition or a logical condition on the variables in the video state machine is satisfied.
It can be understood that, after the scene configuration file is loaded according to the scene configuration file information, information such as components, styles, or animations can also be obtained by parsing it.
In a third aspect, an embodiment of the present invention provides an interactive video platform, including: the interactive video player according to the second aspect or any possible implementation manner of the second aspect; a content management system for generating, storing, and managing the main configuration file and scene configuration files of an interactive video; and a data synchronization module for synchronizing user interaction data with the variable values in the video state machine.
In a fourth aspect, an embodiment of the present invention provides a computer program product, including a computer program which, when executed by a processor, implements the steps of the interactive video playing method according to the first aspect or any possible implementation manner of the first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of the interactive video playing method according to the first aspect or any possible implementation manner of the first aspect.
It can be understood that the technical effects obtained by the interactive video player according to the second aspect, the interactive video platform according to the third aspect, the computer program product according to the fourth aspect, and the computer readable storage medium according to the fifth aspect are similar to those obtained by the corresponding technical means in the interactive video playing method according to the first aspect, and are not repeated herein.
Drawings
Fig. 1 is a flowchart of an interactive video playing method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an interactive video player according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an interactive video player according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an implementation process of an interactive video playing method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The terminology used in the following embodiments of the application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of one or more of the listed items.
The terms "first," "second," and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of the application, unless otherwise indicated, "a plurality" means two or more.
With the development of Internet video content, interactive video lets users influence the direction of the content through their choices, improving participation, and has therefore become a favored new video form. However, current interactive video implementations depend on platform-specific coding formats and playing logic and lack a universal, flexibly configurable playing framework, so interactive video is severely limited in cross-platform application and can hardly migrate smoothly between different operating systems, terminal devices, or application platforms.
This platform dependence not only causes poor cross-platform compatibility, making it difficult for the same interactive video to maintain a consistent playing effect and interactive experience across multiple platforms, but also provides insufficient support for diversified interactive scenes and cannot meet interaction requirements in different scenarios. Meanwhile, the customized development needed to adapt to different platforms and scenes greatly increases development and maintenance costs, creating barriers to the popularization of related technologies and becoming a technical bottleneck restricting the further development of interactive video.
In view of this, an embodiment of the invention provides an interactive video playing method that constructs a video state machine by loading and parsing a main configuration file and a scene configuration file, and builds a universal, flexible playing system through steps such as executing corresponding actions on a timeline. This scheme effectively solves the series of problems caused by interactive video's dependence on platform-specific logic in the prior art, improves cross-platform adaptation capability and interaction flexibility, reduces customized development costs, and clears technical barriers to the wide application of interactive video.
Referring to Fig. 1, which is a flowchart of an interactive video playing method according to an embodiment of the present invention, the method includes steps S101 to S106.
S101, loading a main configuration file, and parsing the main configuration file to obtain video mapping, scene configuration file information, and metadata, wherein the video mapping is used for mapping a logical video ID to a physical video ID and a corresponding video segment.
It should be noted that the configuration files are used to describe the interaction logic of the interactive video. To describe as much of the business logic as possible through configuration and thus reduce code development, a complete set of description primitives (Primitives) needs to be defined. Meanwhile, to be compatible with the configuration standards of each video platform, the configuration description should be a superset of those standards, so that each platform can be flexibly adapted.
The media asset form of an interactive video is not limited to video; it may include audio, computer graphics (Computer Graphics, CG), pictures, text, and even augmented reality (Augmented Reality, AR) over objects in the real physical world. Each interaction between the user and the interactive video can be abstracted into an interactive Scene, whose carrier can be video, CG, pictures, and so on. Each scene corresponds to a separate scene configuration file. The interactive video loads a main configuration file when it first runs; this file contains description information for the entire interactive video.
The design principles for the interaction logic description primitives are atomization and orthogonalization: atomization means that a description primitive cannot be further subdivided and is the basic operation unit of interaction; orthogonalization means that description primitives are independent of one another, with no overlapping definitions. By combining these primitives, the entire interaction logic can be described.
It should be noted that the configuration files are divided into a main configuration file and scene configuration files, where the main configuration file (Master Configuration, MC) mainly comprises the following modules:
Video Mapping, which maps a logical video ID to the physical video ID actually stored in the background and the video segment corresponding to that physical video ID, establishing a mapping relation between logical video IDs and physical video resources. A physical video resource can be further described by a resource configuration file (Resource Configuration), which may include the media type (such as MP4, WebM, or JPEG for audio, video, and pictures), multi-format links (such as 1080p/720p/4K), and quality parameters, enabling a software development kit (Software Development Kit, SDK) to dynamically select the optimal version based on environmental factors such as network bandwidth and device performance.
Scene profile information (Scene Info), which records the information of all scene configuration files required by the entire project, including paths, identifiers, and so on, for subsequent loading.
Metadata, which is descriptive data about the project, including the global properties of the interactive video (such as title, author, and version number) and the project-level variables declared by the variable configuration (Var Configuration) (such as user score and selection state); these variables participate as state parameters in the construction of the subsequent video state machine.
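The dynamic version selection mentioned under Video Mapping can be sketched as follows; the resource fields (`quality`, `minBandwidthMbps`, `url`) and the thresholds are illustrative assumptions, since the patent does not fix a concrete resource schema:

```python
# Hypothetical sketch: an SDK picking the optimal resource version from a
# resource configuration based on bandwidth and a device quality cap.
RESOURCES = [
    {"quality": "4k",    "minBandwidthMbps": 25, "url": "v_4k.mp4"},
    {"quality": "1080p", "minBandwidthMbps": 8,  "url": "v_1080p.mp4"},
    {"quality": "720p",  "minBandwidthMbps": 3,  "url": "v_720p.mp4"},
]

def select_version(resources, bandwidth_mbps, max_quality="4k"):
    order = ["720p", "1080p", "4k"]
    cap = order.index(max_quality)
    # Keep versions the device allows AND the bandwidth can sustain,
    # then take the highest remaining quality.
    candidates = [r for r in resources
                  if order.index(r["quality"]) <= cap
                  and bandwidth_mbps >= r["minBandwidthMbps"]]
    return max(candidates, key=lambda r: order.index(r["quality"]))

best = select_version(RESOURCES, bandwidth_mbps=10)
```

With 10 Mbps of bandwidth, the sketch selects the 1080p version: 4K is too demanding and 720p is not the best sustainable choice.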
When the interactive video is played, the main configuration file (for example, in JSON or XML format) is loaded first and then parsed to obtain the video mapping, scene configuration file information, and metadata, where the video mapping maps a logical video ID to a physical video ID and a corresponding video segment.
The video mapping establishes the mapping relation between logical video IDs and physical video resources, for example mapping "intro_scene" to the actually stored "video/intro_1080p.mp4". The scene configuration file information records the path or identifier of each scene configuration file for subsequent loading. The metadata contains the global properties of the interactive video (such as title, author, and version number).
For example, the main configuration file may include the following content:
json
{
  "videoMapping": {
    "intro": { "physicalId": "video_001", "startTime": 0, "endTime": 60 },
    "choice_a": { "physicalId": "video_002", "startTime": 0, "endTime": 30 }
  },
  "sceneConfig": "scenes/main_scene.json",
  "metadata": { "title": "interactive adventure game", "version": "1.0" }
}
This JSON is the main configuration file of an interactive video; after parsing, it contains three core parts. First, "videoMapping" defines the correspondence between logical video IDs and physical video resources: for example, "intro" (a logical identifier, understood as the opening segment) corresponds to the actually stored "video_001" (a physical identifier) with a playing range of 0 to 60 seconds, and "choice_a" (the option-A scenario) corresponds to "video_002" with a playing range of 0 to 30 seconds. Second, "sceneConfig" specifies the path "scenes/main_scene.json" of the scene configuration file for subsequent loading of more detailed interaction logic. Third, "metadata" records the video's metadata, such as the title "interactive adventure game" and the version number "1.0". Through this structured configuration, the logic layer and the physical resource layer of the interactive video are decoupled, laying a foundation for cross-platform reuse and flexible management.
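Parsing this main configuration file can be sketched with the standard JSON parser; the field names follow the sample above, while the `resolve` helper is a hypothetical name:

```python
# Sketch: parse the sample main configuration and resolve a logical video ID
# to its physical resource and segment range. "resolve" is a hypothetical helper.
import json

MASTER_JSON = '''
{
  "videoMapping": {
    "intro":    { "physicalId": "video_001", "startTime": 0, "endTime": 60 },
    "choice_a": { "physicalId": "video_002", "startTime": 0, "endTime": 30 }
  },
  "sceneConfig": "scenes/main_scene.json",
  "metadata": { "title": "interactive adventure game", "version": "1.0" }
}
'''

def resolve(logical_id, master):
    # Look up the logical ID in videoMapping and return the physical resource
    # plus its (startTime, endTime) playing range.
    seg = master["videoMapping"][logical_id]
    return seg["physicalId"], (seg["startTime"], seg["endTime"])

master = json.loads(MASTER_JSON)
phys, rng = resolve("intro", master)
```

This is the decoupling point the paragraph describes: the player only ever asks for "intro" or "choice_a", and the mapping decides which physical file and segment to play.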
S102, loading the scene configuration file according to the scene configuration file information, and parsing the scene configuration file to obtain scene information, a timeline, conditions, actions, and variables.
It can be understood that, after the scene configuration file is loaded according to the scene configuration file information, information such as components, styles, or animations can also be obtained by parsing.
The information obtained by parsing the scene configuration file is not limited to scene information, timeline, conditions, actions, and variables; it may also include components, styles, and animations. A given configuration file may contain more or fewer of these items, and when an item is missing, its parsed value may be a preset value, such as a null value.
In implementation, the scene configuration file can be loaded according to the path information in the master configuration file, and the interaction logic elements in it parsed to obtain the following main categories of information.
The Scene information (Scene) describes the resource IDs, video mapping relationship, and duration required by the current scene, and defines the organizational structure of the video clips (such as chapters and branching relationships). Combined with predefined material processing rules in the build configuration (Build Configuration), such as video splicing order and transcoding parameters, it facilitates the seamless connection of multi-segment video.
For example, one scene configuration file is defined as follows.
{
"scenes": [
{"id":"scene_1","videoId":"intro","nextScenes": ["scene_2","scene_3"] }
],
"timeline": [
{"timePoint": 10,"controlPoint":"show_question_1"},
{"timePoint": 30,"controlPoint":"check_answer"}
],
"variables": {"score": 0,"choice": null },
"actions": [
{"id":"show_question_1","type":"show_ui","target":"question_component"},
{"id":"check_answer","type":"jump","condition":"choice ==='A'","targetScene":"scene_2"}
]
}
After parsing, it can be seen that, in the interactive video corresponding to this example, the scene configuration file defines the video playing flow and the interaction logic. It comprises four core parts. (1) Scene structure: the current scene "scene_1" is associated with the video "intro" (corresponding to a physical video in the master configuration), and two possible subsequent scenes "scene_2" and "scene_3" are set, suggesting branching storylines. (2) Timeline: the "show_question_1" control point is triggered when the video plays to the 10th second (e.g., displaying a multiple-choice question), and the "check_answer" control point is triggered at the 30th second (e.g., checking the answer and deciding the jump). (3) Variable system: dynamic data is maintained, with an initial score of 0 and the user selection initially null. (4) Action rules: the "show_question_1" action displays a User Interface (UI) component "question_component", and the "check_answer" action jumps to "scene_2" when the variable "choice" equals "A"; otherwise "scene_3" may be entered by default. Interaction logic is thus realized through time triggering and condition judgment: if the user selects A, scene 2 is entered, otherwise scene 3, providing support for nonlinear narrative.
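The branch decision at the "check_answer" control point can be sketched as follows (a hedged illustration; the variable shape follows the "variables" block above, and the function name is an assumption):

```typescript
// Variables as defined in the scene configuration: an initial score and
// a user choice that starts out null.
type Variables = { score: number; choice: string | null };

// "check_answer": jump to "scene_2" when `choice` equals "A",
// otherwise fall through to the default branch "scene_3".
function nextScene(vars: Variables): string {
  return vars.choice === "A" ? "scene_2" : "scene_3";
}

console.log(nextScene({ score: 0, choice: "A" }));  // "scene_2"
console.log(nextScene({ score: 0, choice: null })); // "scene_3"
```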
The Timeline records the timestamps of the interaction control points; it is a collection of time points (e.g., the 10th second, the 30th second, etc.). A time point describes an operation to be performed at a specific moment, such as showing or hiding a UI element, or jumping within the video. Specifically, the timeline may be described in seconds as floating-point numbers, with the start of the current scene as time 0.
In some possible implementations, a user uploads segmented videos through an authoring engine system (Altstory Creative Engine, ACE) and links them into an interactive video by specifying the logical relationships between the segments. For the player, however, it is not always possible to preload multiple videos simultaneously, so, subject to device and system limitations, several segments need to be merged into one video; in that case, each segmented video is assigned a position interval within the merged video. On the other hand, to provide suitable audio/video encoding for different devices under different network environments, the same video may exist in multiple encoding and container formats. Both cases require mapping a logical video onto an actual physical video.
For example, a mapping configuration file of an interactive video is as follows, where VVID is a logical video identifier and PVID is an actual physical video identifier, and the "begin" and "end" key values give the start and end time positions, within the physical video, of the interval corresponding to the logical video.
{
"videoMapping": {
"vvid001": {
"pvid":"pvid001",
"begin":"0.000",
"end":"15.500"
},
"vvid002": {
"pvid":"pvid001",
"begin":"15.500",
"end":"31.750"
},
"vvid003": {
"pvid":"pvid002",
"begin":"0.000",
"end":"22.220"
},
"vvid004": {
"pvid":"pvid002",
"begin":"22.220",
"end":"33.000"
},
"vvid005": {
"pvid":"pvid001",
"begin":"31.750",
"end":"40.000"
}
}
}
It can be seen that the problems of segment merging and multi-format adaptation can be solved by the correspondence between logical video IDs (VVID) and physical video IDs (PVID). First, logical linking of segmented videos is realized: although "vvid001" and "vvid002" are two logical videos, they map to different intervals (0-15.5 seconds and 15.5-31.75 seconds) of the same physical video "pvid001", so the player can play them as a seamless splice according to this configuration. Second, multi-version encoding adaptation is supported: for example, "vvid003" and "vvid004" map to "pvid002", which may represent different encoding formats of the same content (e.g., H.264/H.265), allowing the SDK to choose according to the network and device environment. Third, cross mapping is allowed: "vvid005" returns to a later interval (31.75-40 seconds) of "pvid001", reflecting the flexible correspondence between logical videos and physical storage. This mapping mechanism not only meets the logical requirements of nonlinear narrative in interactive video, but also addresses device compatibility and network adaptability, ensuring that the same set of interaction logic can adapt to diverse playing environments.
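Under the assumption that the mapping table is parsed as shown, resolving a logical playback position to a physical one can be sketched like this (illustrative names, not the actual SDK API):

```typescript
// Two entries from the example mapping above; `begin`/`end` are stored
// as strings in the configuration, so they are parsed to numbers here.
interface Mapping { pvid: string; begin: string; end: string; }
const videoMapping: Record<string, Mapping> = {
  vvid001: { pvid: "pvid001", begin: "0.000", end: "15.500" },
  vvid002: { pvid: "pvid001", begin: "15.500", end: "31.750" },
};

// Resolve a logical video plus an offset within it to the physical
// video and the absolute position the player should seek to.
function resolve(vvid: string, offset: number): { pvid: string; time: number } {
  const m = videoMapping[vvid];
  return { pvid: m.pvid, time: parseFloat(m.begin) + offset };
}

console.log(resolve("vvid002", 5)); // 5 s into vvid002 = 20.5 s of pvid001
```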
In one example, the scenario information includes information such as an ID of the current scenario, a set of required resource IDs, a corresponding VVID, an effective duration, and the like, as follows.
{
"sceneID":"scene002",
"resID": [
"res006",
"res007",
"res008"
],
"vvid": [
{
"id":"vvid002",
"baseTime":"10.000"
}
],
"duration":"16.250"
}
In this scene example, the configuration rules for the scene information in the interactive video are defined. The scene information includes a unique scene identifier "sceneID" (e.g., "scene002"), a set of resource IDs required by the scene (e.g., "res006", "res007", "res008", specifying the audio/video, picture, and other resources to load), the associated logical video IDs (VVID) with their corresponding reference times (baseTime), and the effective duration of the scene (e.g., "16.250" seconds). Here, "baseTime" in vvid indicates the starting point of the current scene within the associated logical video (e.g., "10.000" seconds, i.e., the scene starts from the 10th second of the logical video); baseTime is "0.000" seconds if the scene starts from the beginning of the logical video. This configuration defines the time association and resource dependencies between the scene and the logical video, providing a basis for precise triggering of timeline operations and on-demand loading of resources.
The time line configuration and the time point mapping rule of the interactive video are as follows:
{
"timeline": {
"0.000": [
{
"type":"action",
"id":"act001"
},
{
"type":"condition",
"id":"cond001"
},
{
"type":"condition",
"id":"cond002"
}
],
"16.250": [
{
"type":"action",
"id":"act002"
}
]
}
}
It can be seen that, since the timeline takes the start of the current scene as time 0, time points are recorded in seconds as floating-point numbers, and each time point is associated with the operations (such as actions and condition judgments) to be executed. From the example above, at "0.000" seconds the action "act001" is executed together with the condition judgments "cond001" and "cond002", and at "16.250" seconds the action "act002" is executed. The mapping logic of the time points is as follows: taking the start of the logical video (vvid) associated with the current scene as the reference, the timeline time point is added to the logical video's starting reference time (baseTime) to obtain the corresponding vvid time point, and then the start time (begin) of the vvid within the physical video (pvid) is added, finally mapping to the actual playback time of the physical video. For example, the act002 action corresponds to baseTime + 16.250 of vvid002, i.e., 10.000 + 16.250 = 26.250 seconds, and the corresponding pvid001 time point is vvid002.begin + baseTime + 16.250, i.e., 15.500 + 10.000 + 16.250 = 41.750 seconds. This achieves a precise correlation between timeline operations and physical video playback progress, ensuring that interaction events are triggered at the correct moment.
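The mapping arithmetic described above can be captured in a small helper (the function name is an assumption); it reproduces the act002 example, 15.500 + 10.000 + 16.250 = 41.750 seconds:

```typescript
// A timeline point is first shifted by the scene's baseTime into
// logical-video (vvid) time, then by the vvid's `begin` offset into
// physical-video (pvid) time.
function toPhysicalTime(timelinePoint: number, baseTime: number, vvidBegin: number): number {
  return vvidBegin + baseTime + timelinePoint;
}

// act002 at timeline 16.250, scene baseTime 10.000, vvid002 beginning
// at 15.500 inside pvid001:
console.log(toPhysicalTime(16.25, 10.0, 15.5)); // 41.75 seconds
```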
It should be noted that, the interactive video system may also implement a rich interactive experience through collaborative configuration of components, variables, styles and animations. The components are basic elements that make up the interface, including buttons, text boxes, pictures, etc., for direct interaction with the user (e.g., selecting a question button, progress bar). Variables are used to store and track system status (e.g., user score, selection result), supporting dynamic logic decisions (e.g., "if score >80 then display rewards interface"). The style adopts a grammar similar to CSS, defines the visual representation (such as color, font and position) of the component, and can realize unified style or responsive adjustment of the interface (such as adapting to the mobile phone/tablet layout). The animation adds dynamic effects (such as fade-in and fade-out and sliding switching) to the component, so that the smooth sense of interaction (such as the gradual animation when options appear) is improved.
For example, in an educational interactive video, when the video has played for 15 seconds, the system uses variables to judge whether the user has completed the pre-test (variable value isTestCompleted = true). If so, a "continue learning" button (component) is displayed, styled as a green rounded-corner button (style) with a slide-in-from-bottom animation effect; if not, a red prompt text box (component) is displayed, prompting "please complete the test first" (variable-driven content). This combination mechanism allows the interactive video to adjust the interface presentation in real time according to user behavior, achieving a personalized viewing experience.
S103, constructing a video state machine based on the video mapping, the scene information and the variables.
Constructing the video state machine is the core step in realizing the logic control of the interactive video; by integrating the video mapping, scene information, and variable system, it abstracts the interactive video into a precisely controllable state transition model. The basic unit of the state machine is a state node, defined by a physical video ID and its time interval (such as "video_001 (0-60 seconds)"); each node represents a specific playing segment. The state parameters are the dynamic data in the variable system (such as user selections and scores), used for condition judgment during state transitions. The state transition conditions are defined based on the jump rules in the scene configuration (such as "user selects A → jump to scene_2"), which determine the trigger logic for state transitions. This model tightly couples video playback, user interaction, and state management, so that the system can dynamically adjust the playing flow according to user behavior.
For example, in an interactive video with a branching storyline, the system listens for changes to the variable "choice" while the user views the "intro" video (0-60 seconds of the physical video "video_001"). If the user selects option A during playback (i.e., the variable "choice" is assigned "A"), the state machine triggers the transition logic and jumps from the current state to "video_002" (0-30 seconds) corresponding to "scene_2". This mechanism allows the interactive video to switch flexibly among multiple predefined video clips according to the user's real-time selections, achieving a nonlinear narrative interactive experience, while ensuring that all state transitions are traceable and predictable, improving the stability and maintainability of the system.
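A minimal sketch of such a state machine, under the assumption of the node and transition shapes described above (the class and method names are illustrative, not the platform's actual API):

```typescript
// State nodes are physical video segments; transitions fire when a
// variable change satisfies a condition on the current node.
interface StateNode { pvid: string; start: number; end: number; }
interface Transition { when: (vars: Record<string, unknown>) => boolean; to: string; }

class VideoStateMachine {
  constructor(
    private nodes: Record<string, StateNode>,
    private transitions: Record<string, Transition[]>,
    public current: string,
  ) {}

  // Re-evaluate the current node's transition conditions after a
  // variable changes; return the (possibly new) segment to play.
  onVariableChange(vars: Record<string, unknown>): StateNode {
    for (const t of this.transitions[this.current] ?? []) {
      if (t.when(vars)) { this.current = t.to; break; }
    }
    return this.nodes[this.current];
  }
}

const fsm = new VideoStateMachine(
  { intro: { pvid: "video_001", start: 0, end: 60 },
    scene_2: { pvid: "video_002", start: 0, end: 30 } },
  { intro: [{ when: v => v["choice"] === "A", to: "scene_2" }] },
  "intro",
);
console.log(fsm.onVariableChange({ choice: "A" }).pvid); // "video_002"
```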
S104, starting the timeline, obtaining the currently playing physical video progress in conjunction with the video state machine, and mapping the physical video progress to a relative time point on the timeline.
The core of the step is to start a time line module and establish a mapping relation between the physical video playing progress and the time line relative moment point. When the physical video is played, the system can convert the actual playing progress (such as 15 th second) into the relative moment point in the time line in real time, so as to ensure that the preset interaction control point on the time line can be accurately triggered. The process needs to process the splicing logic of video clips, for example, when the playing is switched from one physical video clip to another, the time offset needs to be calculated through the video mapping relation, and the consistency of the time line when the video clips are played is ensured.
For example, if a "show prompt text" control point is set in the timeline at the 10th second, then when the physical video plays to the corresponding moment (a start offset may be superimposed due to clip splicing), the system maps the progress to the 10th second of the timeline, thereby triggering the text display action; similarly, if the video plays to the 30th second and the timeline defines a "hide button" control point there, that operation is activated by the same mapping logic. This real-time mapping mechanism ensures synchronization between interaction events and video content, and is a key step in achieving precise interaction.
S105, monitoring whether the playing progress is matched with an interaction control point preset in a time line.
This step achieves precise matching of the interaction control points by continuously monitoring the video playing progress, ensuring that the interaction logic is triggered in sync with the video content. Two matching modes are involved. Timestamp matching directly compares the current playing time with the preset control point times in the timeline (such as the 10th and 30th seconds), triggering the corresponding operation when they coincide. Variable logic matching, combined with the variable system, checks whether the current variable values satisfy specific conditions (such as score ≥ 60, or the selection result being B) as an additional basis for triggering interaction. The two modes can be combined, ensuring deterministic time-driven operations while supporting dynamic logic judgment based on user behavior.
For example, in a quiz interactive video, when playback reaches the 15th second (timestamp matching), the system triggers the "display question" control point and pops up a multiple-choice interface; after the user selects option B, the variable "choice" is assigned "B"; when playback reaches the 30th second (timestamp matching), the system confirms that the variable condition is met and triggers the operation "jump to the explanation video corresponding to option B". This matching mechanism not only ensures that interaction events occur at the correct time points, but also dynamically adjusts the subsequent flow according to the user's real-time selections, realizing a flexible interactive experience.
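The two matching modes can be sketched together as follows (the tolerance value and all names are assumptions for illustration):

```typescript
// A control point fires on a timestamp match, optionally gated by a
// variable logic condition.
interface ControlPoint {
  time: number;
  condition?: (vars: Record<string, unknown>) => boolean;
  actionId: string;
}

function matchControlPoints(
  points: ControlPoint[],
  playhead: number,
  vars: Record<string, unknown>,
  epsilon = 0.05, // tolerance for timeupdate granularity (assumption)
): string[] {
  return points
    .filter(p => Math.abs(p.time - playhead) <= epsilon) // timestamp match
    .filter(p => !p.condition || p.condition(vars))       // variable logic match
    .map(p => p.actionId);
}

const points: ControlPoint[] = [
  { time: 15, actionId: "show_question" },
  { time: 30, condition: v => v["choice"] === "B", actionId: "jump_to_b" },
];
console.log(matchControlPoints(points, 30, { choice: "B" })); // ["jump_to_b"]
```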
S106, when a preset interaction control point is matched, executing a corresponding preset action based on the video state machine; or, when a preset interaction control point is matched and a timeline moment trigger condition or a logic condition on variables in the video state machine is satisfied, executing the corresponding preset action based on the video state machine.
The core of this step is to execute preset actions based on the video state machine when an interaction control point is matched and its trigger condition is satisfied. The actions comprise three key operation types: video control (such as pausing, playing, jumping to a specified timestamp, or switching video clips); UI operations (such as showing/hiding buttons, updating text content, or adjusting component styles); and variable updates (such as recording user selections or increasing/decreasing scores). The three types of operations can also act in concert, not only responding to user interaction but also driving dynamic changes in the video flow, ensuring complete execution of the interaction logic.
For example, after the user clicks the "accept task" button at a storyline branch point, the system first performs a variable update, setting "taskAccepted" to "true"; it then performs a UI operation, hiding the option button group and displaying the task detail panel (styled as a semi-transparent floating window); then, when the video plays to the 45th second, the state machine triggers a video control operation according to the preset rule "taskAccepted === 'true'", jumping to the physical video segment corresponding to task execution (the 12.300-58.750 second interval of pvid005). This action execution mechanism ensures seamless connection between interaction logic and video content through the state machine's precise judgment of variable states; meanwhile, because the interaction logic is separated from the underlying rendering, a uniform interactive experience is maintained whether the Web side updates the interface via the DOM or a native application calls native controls, significantly improving cross-platform adaptation efficiency.
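The three action families can be sketched as a single dispatch (the union shape and handler names are assumptions; real video and UI side effects are replaced by a log for illustration):

```typescript
// The three action types named in the text: video control, UI
// operation, and variable update.
type Action =
  | { type: "video"; op: "play" | "pause" | "seek"; time?: number }
  | { type: "ui"; op: "show" | "hide"; target: string }
  | { type: "variable"; name: string; value: unknown };

function execute(action: Action, vars: Record<string, unknown>, log: string[]): void {
  switch (action.type) {
    case "video":    log.push(`video:${action.op}@${action.time ?? "-"}`); break;
    case "ui":       log.push(`ui:${action.op}:${action.target}`); break;
    case "variable": vars[action.name] = action.value; log.push(`var:${action.name}`); break;
  }
}

// Replaying the "accept task" example from above:
const vars: Record<string, unknown> = {};
const log: string[] = [];
execute({ type: "variable", name: "taskAccepted", value: true }, vars, log);
execute({ type: "ui", op: "hide", target: "option_buttons" }, vars, log);
execute({ type: "video", op: "seek", time: 12.3 }, vars, log);
console.log(vars["taskAccepted"]); // true
```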
With the solution provided by this embodiment, by constructing a complete set of configuration description primitives (embodied as atomic, orthogonal definitions of each core element of the interaction logic, including video mapping, scene information, timelines, variables, actions, components, styles, animations, and so on; the primitives are independent and indivisible, and their combinations can cover all interaction logic scenarios), the interaction logic is decoupled from video playback, significantly reducing cross-platform adaptation cost and development complexity. The core advantages are as follows. Through the layered design of the master configuration file and scene configuration files, and the cooperation of the primitives with the video state machine, flexible splicing of segmented videos and multi-format encoding adaptation are supported (meeting the requirements of different devices and network environments), and nonlinear narrative and personalized interaction (such as user selections driving storyline branches) can be realized based on time triggering and condition judgment. Meanwhile, because the interaction logic is defined by configuration files rather than code, and the configuration is compatible with the standards of various platforms, the same interaction logic can be reused across Web, mobile, and other platforms with only the underlying rendering adjusted, effectively avoiding customized development for different video platforms, greatly reducing the development cost of cross-platform applications, and breaking through the platform limitations of interactive video. In addition, based on the complete primitive system, the solution supports diverse media forms such as audio, CG, pictures, text, and AR, can flexibly handle rich interaction scenarios, and provides efficient and flexible technical support for diverse interactive experiences (such as education and entertainment games).
In some possible implementations, constructing the video state machine includes constructing it with the physical video IDs and corresponding video segments in the video mapping as state nodes, the variables as state parameters, and the jump rules in the scene information as state transition conditions. This defines the core elements of video state machine construction: a state node consists of a physical video ID and its time interval (for example, 0-15.5 seconds of "pvid001") and represents a specific playing segment; a state parameter is dynamic data in the variable system (for example, "score = 80", "choice = 'A'"); and a state transition condition is based on a jump rule in the scene configuration (for example, jump to scene_2 if choice === 'A'). Together, these three parts form the closed-loop logic of node-parameter-condition, enabling the state machine to dynamically switch playing segments according to real-time variable values.
For example, in a knowledge-quiz interactive video, a state node covers a 20-40 second physical video interval (corresponding to the first question-answering segment), the state parameter is "answer = null", and the transition condition is "jump to a 0-30 second physical video interval (the reward segment) if answer === 'correct'". When the user selects the correct answer, the variable "answer" is updated to "correct", the state machine triggers the transition, and the reward clip is played.
By defining the structural elements of the state machine, this scheme ensures the predictability and logical consistency of state transitions, so that complex branching storylines can be precisely controlled; at the same time, binding physical video clips, variables, and jump rules together achieves deep coordination between video playback and interaction logic, providing stable technical support for nonlinear narrative.
In some possible implementations, the UI operation that updates the display state of an interface element includes triggering, via the video state machine, a front-end rendering engine to update the display state, position, size, transparency, or text content of the interface component. This refines the specific content of UI operations, making explicit that a UI update must trigger the front-end rendering engine through the video state machine, adjusting a component's visual attributes (such as raising button transparency from 0 to 1 for a fade-in effect), layout parameters (such as moving a text box from the left side to the center of the screen), or content (such as updating "Time remaining: 10 seconds" to "Time remaining: 5 seconds").
For example, in a live-streaming interactive scenario, when the variable "viewerCount" exceeds 1000, the state machine triggers the rendering engine to perform UI operations: enlarging the "popularity medal" component to 1.2 times its size, moving it to the upper-right corner of the screen, and updating its text content to a "popularity surging" prompt.
By standardizing the triggering mode and adjustment dimensions of UI operations, this scheme ensures that interface updates remain logically synchronized with the video state machine, avoiding interaction confusion caused by UI-state mismatch, while supporting fine-grained visual adjustments that improve the smoothness and polish of the user experience.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an interactive video player according to an embodiment of the present invention. The interactive video player includes: a loading and parsing module, configured to load a master configuration file and parse it to obtain a video mapping, scene configuration file information, and metadata, where the video mapping maps a logical video ID to a physical video ID and a corresponding video segment, and further configured to load the scene configuration file according to the scene configuration file information and parse it to obtain scene information, a timeline, conditions, actions, and variables; a state machine engine, configured to construct a video state machine based on the video mapping, the scene information, and the variables; a timeline synchronization module, configured to start the timeline, obtain the currently playing physical video progress in conjunction with the video state machine, and map the physical video progress to a relative time point on the timeline; a monitoring module, configured to determine whether the playing progress matches a preset interaction control point in the timeline; and an interaction control module, configured to execute a corresponding preset action based on the video state machine when a preset interaction control point is matched, or when a preset interaction control point is matched and a timeline moment trigger condition or a logic condition on variables in the video state machine is satisfied.
It can be understood that, after the scene configuration file is loaded according to the scene configuration file information, information such as components, styles, or animations can also be obtained by parsing.
In some possible implementations, the state machine engine is specifically configured to construct a video state machine by using a physical video ID and a corresponding video segment in the video map as state nodes, variables as state parameters, and jump rules in the scene information as state transition conditions.
In some possible implementations, the interactive control module is specifically configured to match a preset interactive control point in the timeline based on a timestamp after mapping the physical video progress to a relative time point in the timeline, or match the interactive control point based on a current value of a variable and a variable logic condition in a condition.
In some possible implementations, the preset actions include at least one of a video control operation to control video play, pause, or skip through a video state machine, a UI operation to update the display state of the interface element, and a variable update operation to modify the variable value in the video state machine.
In some possible implementations, in terms of UI operations for updating the display state of the interface element, the interactive control module is specifically configured to trigger, by the video state machine, the front-end rendering engine to update the display state, the position, the size, the transparency, or the text content of the interface component.
In some possible implementations, the interactive control module is further configured to update a variable value in the video state machine in response to a user input event and trigger execution of a conditional checksum action associated with the variable.
It should be noted that, for the working details of each module, reference may be made to the corresponding description in the foregoing method embodiment, and details are not repeated here.
The embodiment of the invention also provides an interactive video platform, which comprises the interactive video player according to any embodiment, a content management system for generating, storing and managing a main configuration file and a scene configuration file of the interactive video, and a data synchronization module for synchronizing user interaction data with variable values in a video state machine.
In implementation, the modules in the interactive video platform may be further refined, as shown in fig. 3. The flow of the interactive video player is described in connection with the modules in fig. 3, which are roughly divided into three stages.
(1) Resource acquisition and configuration resolution phase
At this stage, the "content delivery network (Content Delivery Network, CDN) file" module is responsible for providing configuration files and static files, which are processed by the "configuration acquisition and verification" module to ensure the accuracy and integrity of the files, and then parsed by the "parse configuration" module. The configuration obtained in this process contains key information such as the correspondence between logical video IDs (VVID) and actual physical video IDs (PVID) in the "video mapping" module, and the jump rules in the "scene information" module. Meanwhile, the "CDN streaming media" module pulls media resources, including videos and configuration files, which enter the "resource preloading (video, configuration files)" module to be preloaded; the video resources correspond to the state node materials needed to construct the video state machine subsequently, namely the video segments corresponding to specific physical video IDs.
(2) Logic build and run phase
Based on the parsed configuration, the "construct video state machine" module builds the state machine, taking the actual physical video IDs and corresponding video segments in the "video mapping" module as state nodes, the variables in the "variables" module as state parameters, and the jump rules in the "scene information" module as state transition conditions. The "start timeline, monitor Video timeupdate, time-point matching strategy" module then begins work; matching of the preset interaction control points in the timeline proceeds in two ways: after mapping the physical video progress to a relative time point on the timeline, matching by timestamp in the "time-point matching strategy" module, or matching the current variable values against the configured variable logic conditions through the "variables" module. Once an interaction control point is matched, the "action execution" module executes the preset actions, including: video play, pause, or jump operations realized by the "video control operation" module; interface element display-state updates completed by the "UI operation" module, such as triggering the front-end rendering engine to change a component's display state, position, size, and so on; and modifications of variable values in the video state machine performed by the "variable update operation" module. When a user input event occurs, the "variables" module updates the variable values, then triggers the condition checks associated with them, and the corresponding actions are performed by the "action execution" module.
(3) Scene management and resource release phase
When a scene needs to be exited, the "log out scene (destroy the current sub-scene and release its resources on exit)" module destroys the current sub-scene and releases its resources. During interactive video playback, as scenes switch, information such as the state nodes in the "video state machine" module and the variables in the "variables" module changes accordingly. Releasing resources properly keeps playback fluent and ensures that the interaction logic runs smoothly; for example, variables are updated correctly in the new scene, condition judgments proceed normally, and no logic errors arise from lingering resource occupation.
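The teardown step above can be sketched as a sub-scene object whose `destroy` detaches listeners before dropping state, so that stale callbacks cannot mutate variables after the scene is gone. The `SubScene` class and its members are illustrative assumptions, not named in the patent.

```typescript
// Minimal sketch of the "log out scene" step: destroying the current
// sub-scene detaches its listeners and clears state-machine data.
class SubScene {
  private listeners: Array<() => void> = [];  // detach callbacks to run on destroy
  vars: Record<string, number> = {};          // state-machine variables
  stateNodes: string[] = [];                  // physical video IDs held as nodes
  destroyed = false;

  // Register the function that removes a listener (e.g. the timeupdate hook).
  addListener(off: () => void) { this.listeners.push(off); }

  destroy() {
    // Detach every registered listener before dropping references, so no
    // callback can fire against a half-released scene.
    this.listeners.forEach(off => off());
    this.listeners = [];
    this.stateNodes = [];
    this.vars = {};
    this.destroyed = true;
  }
}
```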
Fig. 4 is a schematic diagram of the execution process of the interactive video playing method according to an embodiment of the present invention, where a "Scene" interacts with the "Execution" module through an "exports API" (exported application program interface). Within the "Execution" module, "ExecManager" initializes scenes (through the init(Array&lt;scene&gt;) method) and performs operations related to a specific scene ID and events (through the exec(sceneId, events) method). This corresponds, during interactive video playback, to the initialization of the different scenes and the execution of operations in response to the various events within those scenes.
"ConditionsManager" is responsible for managing conditions, and can add conditions (addCond (scene) method), process condition mapping (condMap (type) method) and the like, and similarly in the interactive video logic, various condition judgment rules are set and managed, so that video playing can be ensured to carry out logic circulation according to the set conditions. The 'ActionsManager' focuses on management actions, such as adding actions (addAction (scene) method) and processing action mapping (actionMap (actionKey) method), and corresponds to actual operation execution in the interactive video, video play control, interface element display hiding and other operations, and through the cooperation of the parts, the complete logic flow of condition judgment and action execution in the interactive video is realized.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device 500 includes a memory 501 and a processor 502; the memory 501 stores one or more computer-executable instructions, and the processor 502 is configured to execute them. When the computer-executable instructions are executed by the processor 502, the steps of any of the interactive video playing methods described in the foregoing method embodiments are implemented.
An embodiment of the invention further provides a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements any embodiment of the foregoing interactive video playing method.
An embodiment of the invention further provides a computer program product comprising a computer program; when the computer program is executed by a processor, it implements any embodiment of the foregoing interactive video playing method.
An embodiment of the invention further provides a chip system applied to an electronic device. The chip system comprises one or more processors, which are used for calling computer instructions to cause the electronic device to execute any of the methods in the foregoing interactive video playing method embodiments.
It may be appreciated that, for the advantages achieved by the interactive video player, the interactive video platform, the electronic device, the computer storage medium, the computer program product, the chip system, and the like, reference may be made to the advantages described in the method embodiments, which are not repeated here.
In the above embodiments, the methods may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, fiber optic, digital subscriber line (Digital Subscriber Line, DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., digital versatile disc (Digital Versatile Disc, DVD)), or a semiconductor medium (e.g., solid state disk (Solid State Disk, SSD)), etc.
The above embodiments are not intended to limit the present application, and any modifications, equivalent substitutions, improvements, etc. within the technical scope of the present application should be included in the scope of the present application.

Claims (10)

1. A playing method of an interactive video, characterized by comprising the following steps:
loading a main configuration file and parsing the main configuration file to obtain a video mapping, scene configuration file information, and metadata, wherein the video mapping is used for mapping a logical video ID to a physical video ID and a corresponding video segment;
loading a scene configuration file according to the scene configuration file information, and parsing the scene configuration file to obtain scene information, a timeline, conditions, actions, and variables;
constructing a video state machine;
starting the timeline, obtaining the physical video progress of the current playback in combination with the video state machine, and mapping the physical video progress to a relative moment on the timeline;
monitoring whether the playing progress matches an interaction control point preset in the timeline; and
when the preset interaction control point is matched, executing a corresponding preset action based on the video state machine, or, when the preset interaction control point is matched and a moment trigger condition of the timeline or a logic condition of a variable in the video state machine is satisfied, executing a corresponding preset action based on the video state machine.
2. The method for playing an interactive video according to claim 1, wherein said constructing a video state machine comprises:
constructing the video state machine by taking the physical video IDs and corresponding video segments in the video mapping as state nodes, the variables as state parameters, and the jump rules in the scene information as state transition conditions.
3. The method for playing an interactive video according to claim 1, wherein said matching the preset interactive control point in the timeline comprises:
after mapping the physical video progress to a relative moment on the timeline, matching the interaction control point preset in the timeline on the basis of the timestamp; or
matching the interaction control point against a variable logic condition in the conditions based on the current value of the variable.
4. The method of claim 1, wherein the preset actions include at least one of: a video control operation that controls video play, pause, or jump through the video state machine; a UI operation that updates the display state of an interface element; and a variable update operation that modifies a variable value in the video state machine.
5. The method for playing an interactive video according to claim 4, wherein the UI operation for updating the display state of the interface element comprises:
triggering a front-end rendering engine through the video state machine to update the display state, position, size, transparency, or text content of an interface component.
6. The method for playing an interactive video according to any one of claims 1 to 5, further comprising:
in response to a user input event, updating a variable value in the video state machine, and triggering the condition check and action execution associated with the variable.
7. A player for interactive video, comprising:
The loading and analyzing module is used for loading a main configuration file and analyzing the main configuration file to obtain video mapping, scene configuration file information and metadata, wherein the video mapping is used for mapping a logic video ID into a physical video ID and a corresponding video segment;
a state machine engine for constructing a video state machine based on the video map, scene information, and variables;
The time line synchronization module is used for starting the time line, combining the video state machine to acquire the currently played physical video progress, and mapping the physical video progress into the relative moment point of the time line;
the monitoring module is used for monitoring whether the playing progress is matched with an interaction control point preset in the time line;
and the interaction control module is used for executing a corresponding preset action based on the video state machine when the preset interaction control point is matched, or executing a corresponding preset action based on the video state machine when the preset interaction control point is matched and the moment trigger condition of the timeline or the logic condition of the variable in the video state machine is satisfied.
8. An interactive video platform, comprising:
the interactive video player of claim 7;
The content management system is used for generating, storing and managing a main configuration file and a scene configuration file of the interactive video;
and the data synchronization module is used for synchronizing the user interaction data with the variable values in the video state machine.
9. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method for playing an interactive video according to any one of claims 1 to 6.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, realizes the steps of the method of playing interactive video according to any of claims 1 to 6.
CN202510996917.XA 2025-07-18 2025-07-18 Interactive video playing method, player, video platform, computer program product and storage medium Pending CN120916006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510996917.XA CN120916006A (en) 2025-07-18 2025-07-18 Interactive video playing method, player, video platform, computer program product and storage medium

Publications (1)

Publication Number Publication Date
CN120916006A true CN120916006A (en) 2025-11-07

Family

ID=97556138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510996917.XA Pending CN120916006A (en) 2025-07-18 2025-07-18 Interactive video playing method, player, video platform, computer program product and storage medium

Country Status (1)

Country Link
CN (1) CN120916006A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination