CN118400546A - Live broadcast explanation method, computing device and readable storage medium - Google Patents
Live broadcast explanation method, computing device and readable storage medium
- Publication number
- CN118400546A CN118400546A CN202410459057.1A CN202410459057A CN118400546A CN 118400546 A CN118400546 A CN 118400546A CN 202410459057 A CN202410459057 A CN 202410459057A CN 118400546 A CN118400546 A CN 118400546A
- Authority
- CN
- China
- Prior art keywords
- image
- environment
- real
- character
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 53
- 230000015654 memory Effects 0.000 claims description 17
- 230000009471 action Effects 0.000 claims description 11
- 230000000007 visual effect Effects 0.000 claims description 10
- 238000007781 pre-processing Methods 0.000 claims description 8
- 238000012545 processing Methods 0.000 claims description 7
- 230000006870 function Effects 0.000 claims description 4
- 238000012216 screening Methods 0.000 claims description 4
- 230000001133 acceleration Effects 0.000 claims description 3
- 230000004927 fusion Effects 0.000 abstract description 7
- 230000000694 effects Effects 0.000 description 11
- 238000004891 communication Methods 0.000 description 8
- 238000009877 rendering Methods 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 6
- 238000010586 diagram Methods 0.000 description 5
- 230000003287 optical effect Effects 0.000 description 3
- 230000008447 perception Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 238000012937 correction Methods 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 230000007723 transport mechanism Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000005034 decoration Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 239000010985 leather Substances 0.000 description 1
- 230000001795 light effect Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000005055 memory storage Effects 0.000 description 1
- 230000000704 physical effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 230000003997 social interaction Effects 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
- 239000002023 wood Substances 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
- H04N21/4788—Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g. 3D video
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention relates to the field of computers, and in particular discloses a live explanation method, a computing device, and a readable storage medium. The method comprises the following steps: acquiring a character image and an environment image, wherein the character image is an image of a real character in a real environment, and the environment image is an image of a virtual environment that includes a vehicle model; acquiring attribute information of each pixel in the character image, wherein the attribute information comprises at least coordinate information and a depth value; determining, based on the coordinate information, the position in the environment image corresponding to each pixel of the character image; and, for each corresponding position, determining the occlusion relationship between the character image and the environment image at that position according to the corresponding depth values, and occluding the corresponding pixels of the character image or the environment image using the determined occlusion relationship, so as to map the character image onto the environment image and enable the real character to explain the vehicle model live in the virtual environment. The live explanation method has low cost, a high degree of fusion, and strong interactivity.
Description
Technical Field
The present invention relates to the field of computers, and in particular to a live explanation method, a computing device, and a readable storage medium.
Background
As users demand richer live-streaming experiences and enterprises demand stronger live-streaming capabilities, MR (Mixed Reality) live broadcasting has become an application trend. However, although MR live broadcasting schemes were proposed long ago, the schemes currently on the market are unsatisfactory in both cost and degree of scene fusion. How to obtain more realistic scene-fusion effects at low cost has therefore become a pressing problem for the live broadcast industry.
Disclosure of Invention
To this end, the present invention provides a live explanation method, a computing device, and a readable storage medium in an effort to solve, or at least alleviate, the problems identified above.
According to one aspect of the present invention, there is provided a live explanation method adapted to be executed in a computing device, the method comprising: acquiring a character image and an environment image, wherein the character image is an image of a real character in a real environment, and the environment image is an image of a virtual environment that includes a vehicle model; acquiring attribute information of each pixel in the character image, wherein the attribute information comprises at least coordinate information and a depth value; determining, based on the coordinate information, the position in the environment image corresponding to each pixel of the character image; and, for each corresponding position, determining the occlusion relationship between the character image and the environment image at that position according to the corresponding depth values, and occluding the corresponding pixels of the character image or the environment image using the determined occlusion relationship, so as to map the character image onto the environment image and enable the real character to explain the vehicle model live in the virtual environment.
Optionally, the live explanation method according to the present invention further includes: setting a first reference object and a second reference object having the same attributes in the real environment and the virtual environment, respectively; and adjusting the viewing angle of the virtual environment according to the image difference between the first reference object and the second reference object, so that images of the real environment and images of the virtual environment are acquired from the same viewing angle.
Optionally, the live explanation method according to the present invention further includes: displaying the mapped character image and environment image to the real character, so that the real character can control the vehicle model based on the displayed content.
Optionally, the live explanation method according to the present invention further includes: when a control instruction generated by the real character is detected, controlling the vehicle model to respond accordingly.
Optionally, in the live explanation method according to the present invention, the control instructions include at least action instructions and voice instructions.
Optionally, the live explanation method according to the present invention further includes: preprocessing the control instruction.
Optionally, in the live explanation method according to the invention: if the control instruction is an action instruction, the preprocessing includes analyzing the pose, gestures, and spurious actions of the real character; if the control instruction is a voice instruction, the preprocessing includes band recognition, tone screening, and noise removal on the real character's voice.
Optionally, in the live explanation method according to the present invention, controlling the vehicle model to respond accordingly includes: determining the control content of the control instruction, wherein the control content includes at least one of vehicle start-up, vehicle driving, vehicle shut-down, vehicle acceleration, and component display; and enabling the corresponding function of the vehicle model based on the control content.
According to yet another aspect of the present invention, there is provided a computing device comprising: at least one processor; and a memory storing program instructions, wherein the program instructions are adapted to be executed by the at least one processor and include instructions for performing the live explanation method according to the invention.
According to yet another aspect of the present invention, there is provided a readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the live explanation method according to the present invention.
In summary, the invention provides a new live explanation method that uses the pixel attributes of a real character's image in a real environment to map the character image onto a virtual environment image, so that the real character is fused into a virtual scene to explain a vehicle live, with low cost and a high degree of fusion. In addition, the real character can interact with the vehicle in the virtual scene through voice, gestures, and the like, which improves the viewing experience of audience users.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which set forth the various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to fall within the scope of the claimed subject matter. The above, as well as additional objects, features, and advantages of the present disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings. Like reference numerals generally refer to like parts or elements throughout the present disclosure.
FIG. 1 illustrates a block diagram of a computing device 100, according to one embodiment of the invention;
FIG. 2 illustrates a flow diagram of a live explanation method 200 in accordance with one embodiment of the invention;
FIG. 3 illustrates a schematic diagram of an effect map of a character image to an ambient image according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Recording against a large LED screen is currently the mainstream technical scheme for virtual live broadcasting. Specifically, a vertical LED wall of tens or hundreds of square meters is first erected, with a floor-mounted LED screen laid in front of it, forming a virtual content playback space composed of LEDs. The virtual content is then rendered to the LED screens in different directions through the split-screen rendering technology of UE (Unreal Engine). The anchor then broadcasts live inside this large-screen space, where special props may be placed on the floor LED platform to provide auxiliary occlusion. Finally, a camera records the anchor together with the content displayed on the LED screens, achieving the effect of the anchor broadcasting live in a virtual world.
While effective, this solution is too expensive, for example in the cost of building the virtual shooting site and of renting equipment, and is well beyond the budget of an ordinary live-streaming team. Moreover, its occlusion relationships must be realized with various auxiliary props, which not only complicates scene arrangement but also degrades the fusion between the anchor and the virtual content.
In addition, this approach is highly dependent on the foreign disguise system, which makes the equipment relatively difficult to introduce and maintain and runs counter to the current trend toward domestic technology.
In view of the above, the invention provides a new live explanation method to solve the technical problems in the prior art. The live explanation method of the present invention may be implemented in a computing device. In particular, the computing device may be implemented as a server, such as an application server or a web server, but may also be a desktop computer, a notebook computer, a processor chip, a tablet computer, or the like.
Fig. 1 illustrates a block diagram of the physical components (i.e., hardware) of a computing device 100. In a basic configuration, computing device 100 includes at least one processing unit 102 and system memory 104. According to one aspect, the processing unit 102 may be implemented as a processor, depending on the configuration and type of computing device. The system memory 104 includes, but is not limited to, volatile storage (e.g., random access memory), non-volatile storage (e.g., read-only memory), flash memory, or any combination of such memories. According to one aspect, the system memory 104 includes an operating system 105 and a program module 106; the program module 106 includes a live explanation module 120, which is configured to perform the live explanation method 200 of the invention.
According to one aspect, operating system 105 is suitable, for example, for controlling the operation of computing device 100. Further, examples are practiced in connection with a graphics library, other operating systems, or any other application program and are not limited to any particular application or system. This basic configuration is illustrated in fig. 1 by those components within dashed line 108. According to one aspect, computing device 100 has additional features or functionality. For example, according to one aspect, computing device 100 includes additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in fig. 1 by removable storage 109 and non-removable storage 110.
As set forth hereinabove, according to one aspect, program modules are stored in the system memory 104. According to one aspect, the program modules may include one or more applications; the invention does not limit the type of application. For example, applications may include: email and contacts applications, word processing applications, spreadsheet applications, database applications, slide show applications, drawing or computer-aided design applications, web browser applications, etc.
According to one aspect, the examples may be practiced in a circuit comprising discrete electronic components, a packaged or integrated electronic chip containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic components or a microprocessor. For example, examples may be practiced via a system on a chip (SOC) in which each or many of the components shown in fig. 1 may be integrated on a single integrated circuit. According to one aspect, such SOC devices may include one or more processing units, graphics units, communication units, system virtualization units, and various application functions, all of which are integrated (or "burned") onto a chip substrate as a single integrated circuit. When operating via an SOC, the functionality described herein may be operated via dedicated logic integrated with other components of computing device 100 on a single integrated circuit (chip). Embodiments of the invention may also be practiced using other techniques capable of performing logical operations (e.g., AND, OR, and NOT), including but not limited to mechanical, optical, fluidic, and quantum techniques. In addition, embodiments of the invention may be practiced within a general purpose computer or in any other circuit or system.
According to one aspect, the computing device 100 may also have one or more input devices 112, such as a keyboard, mouse, pen, voice input device, touch input device, and the like. Output device(s) 114 such as a display, speakers, printer, etc. may also be included. The foregoing devices are examples and other devices may also be used. Computing device 100 may include one or more communication connections 116 that allow communication with other computing devices 118. Examples of suitable communication connections 116 include, but are not limited to: RF transmitter, receiver and/or transceiver circuitry; universal Serial Bus (USB), parallel and/or serial ports.
The term computer readable media as used herein includes computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information (e.g., computer readable instructions, data structures, or program modules). System memory 104, removable storage 109, and non-removable storage 110 are all examples of computer storage media (i.e., memory storage). Computer storage media may include Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture that can be used to store information and that can be accessed by computing device 100. According to one aspect, any such computer storage media may be part of computing device 100. Computer storage media does not include a carrier wave or other propagated data signal.
According to one aspect, communication media is embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal (e.g., carrier wave or other transport mechanism) and includes any information delivery media. According to one aspect, the term "modulated data signal" describes a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio Frequency (RF), infrared, and other wireless media.
Fig. 2 illustrates a flow diagram of a live explanation method 200 according to one embodiment of the invention, the method 200 being adapted for execution in a computing device (e.g., the computing device 100 shown in fig. 1). Specifically, the method 200 maps the image of a real character onto a virtual environment image using the pixel attributes of the character image captured in the real environment, so that the real character is fused into the virtual scene to broadcast the vehicle live.
As shown in fig. 2, the live explanation method 200 of the present invention begins at step 210. In step 210, a character image and an environment image are acquired, wherein the character image is an image of a real character in a real environment, and the environment image is an image of a virtual environment that includes a vehicle model.
The real environment, i.e., the real shooting site, may be, for example, a green-screen shooting space (the green-screen lighting needs to be uniform). In some embodiments, a camera (further, a depth camera) may be disposed in the shooting site to capture images of the real character located there. In particular, when the camera is aimed at the real character, it is necessary to ensure that the green screen completely fills the camera frame. Regarding the character images captured by the camera, according to one embodiment of the present invention, they may be transmitted to the computing device over an HDMI (High-Definition Multimedia Interface) cable, thereby obtaining the character image. In addition, in some embodiments, a large return-feed display screen may also be provided, and the live character picture may be sent to that screen.
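As a rough sketch of this acquisition step (assuming the HDMI feed appears to the computing device as an OpenCV-compatible video device; the device index and resolution below are illustrative, not specified by the patent):

```python
import cv2

# Open the device that receives the camera picture. An HDMI capture card
# typically shows up as an ordinary video device; index 0 is an assumption.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

ok, frame = cap.read()  # one BGR frame containing the real character
if not ok:
    raise RuntimeError("failed to read a frame from the capture device")
cap.release()
```

Green-screen keying of the character from the frame is not shown here; it can be done with any standard chroma-key step before the compositing described below.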
A virtual environment is a computer-generated digital environment, that is, a simulation of the real world or an entirely new virtual world created with computer technology, in which users can interact, explore, and experience. The virtual environment may contain virtual terrain, virtual characters, virtual objects, and associated rules, physical properties, and social interactions.
The virtual environment created in this embodiment is mainly provided with a vehicle model, more specifically a 3D automobile model. The virtual environment may be generated based on pre-stored model data corresponding to multiple vehicle types, where the model data includes 3D automobile model data and 3D scene data. The vehicle models are built strictly to standard rules: each vehicle model has over 40 million triangles on average, the model package size exceeds 1.5 GB, the physical dimensions reproduce the actual vehicle strictly at 1:1 scale in meters, and the rendering fidelity of interior materials such as leather, wood grain, plastic, and screens is kept consistent with the actual vehicle. Further, when the 3D scene is built, lighting effects such as ambient lights, volumetric light, halos, and flowing (chase) lights can be used to create a realistic atmosphere, so that the resulting virtual environment image delivers a striking experience to the user.
In addition, the live explanation method 200 of this embodiment may be executed in an MR live broadcast system built on a computing device; when the MR live broadcast system starts rendering, the camera feed (i.e., the picture of the real environment) is fed into the MR live broadcast system so that it can be fused into the virtual environment.
If the real environment and the virtual environment differ in position, viewing angle, and so on, the perspective relationship between the image captured by the camera in the real environment and the virtual environment will be incorrect; therefore, the viewing angles of the virtual environment and the real environment need to be corrected against each other.
According to one embodiment of the present invention, correction can be performed as follows. First, a first reference object and a second reference object having the same attributes are placed in the real environment and the virtual environment, respectively (for example, a cube with a side length of 10 cm may be used as the reference object). Then, the viewing angle of the virtual environment is adjusted according to the image difference between the first and second reference objects, so that images of the real environment and images of the virtual environment are acquired from the same viewing angle. Specifically, the correction works by establishing the coordinate relationship between the real environment and the virtual environment: the first reference object is placed in the real environment, the second reference object is dragged to the corresponding position in the virtual environment, images of both reference objects are acquired simultaneously, and the acquisition viewing angle of the virtual environment is adjusted continuously according to the positional difference between the two reference objects until their images are identical (i.e., the cube picture output by the camera and the picture output by the 3D software match in size and angle). At that point, the acquisition viewing angles of the real environment and the virtual environment are determined to be the same. For convenience of description, the computer-simulated counterpart of the first reference object is referred to in this embodiment as the second reference object.
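A hedged sketch of this adjustment loop might look as follows, assuming the 3D software exposes a virtual camera object and a way to render and measure the second reference object (the camera attributes, `render_box`, and the step sizes are hypothetical, not part of the patent):

```python
import numpy as np

def bbox_error(real_box, virtual_box):
    """Per-component difference between the reference object's bounding
    boxes (x, y, w, h) in the real image and in the virtual image."""
    return np.asarray(real_box, dtype=float) - np.asarray(virtual_box, dtype=float)

def calibrate_view(camera, real_box, render_box, step=0.01, tol=1.0):
    """Adjust the virtual camera until the second reference object's image
    matches the first reference object's image. `camera` (with pan/tilt/zoom
    attributes) and `render_box(camera)` stand in for the 3D software's API."""
    for _ in range(1000):
        err = bbox_error(real_box, render_box(camera))
        if np.abs(err).max() < tol:   # size and angle now agree
            return camera
        dx, dy, dw, _ = err
        camera.pan += step * dx       # horizontal misalignment
        camera.tilt += step * dy      # vertical misalignment
        camera.zoom += step * dw      # apparent-size (distance) mismatch
    raise RuntimeError("viewing-angle calibration did not converge")
```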
Then, step 220 is performed to acquire attribute information of each pixel in the character image, where the attribute information includes at least coordinate information and a depth value. The depth value is related to the distance between each pixel and the camera; in some embodiments, the depth value of a pixel may be the relative distance between that pixel and the camera. Regarding how depth values are obtained, according to one embodiment of the present invention, the image of the real character may be shot with a depth camera, and the depth value of each pixel may then be read from the captured image. In addition, in some embodiments, a reference depth value may be set in advance, so that after the depth values are read from the captured image, each pixel's depth value can be corrected against the preset reference depth value according to the pixel's brightness and/or size.
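As a minimal sketch of this optional correction, assuming a brightness-weighted blend toward the reference depth (the patent names brightness and/or size as correction cues but gives no formula, so the rule, `reference_depth`, and `gain` below are illustrative assumptions):

```python
import numpy as np

def correct_depth(depth, brightness, reference_depth=2.0, gain=0.1):
    """Correct per-pixel depth values toward a preset reference depth,
    trusting darker (less well-lit) pixels less. `depth` and `brightness`
    are HxW arrays; size-based correction is omitted from this sketch."""
    weight = gain * (1.0 - brightness / 255.0)  # more correction for dark pixels
    return (1.0 - weight) * depth + weight * reference_depth
```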
In addition, according to an embodiment of the present invention, attribute information of each pixel in the environment image may also be acquired. In 3D rendering, a depth value represents the distance of a pixel from the camera in the 3D world: the greater the depth value, the farther from the camera.
During rendering, the computer generates a depth map, a single-channel image that represents the distance of objects from the camera: brighter areas are closer to the camera, darker areas farther away. With this map, when an object is rendered, the depth value of the pixel about to be output is compared with the value already in the depth buffer; if a pixel closer to the camera has already been written at that position, the pixel to be rendered is discarded; otherwise, it overwrites the original. This operation is called a "depth test".
Whether a given pixel occludes, or is occluded by, another can therefore be determined from its depth value, as follows.
After the attribute information of the pixels in the character image and the environment image is obtained, step 230 is performed to determine, based on the coordinate information, the position in the environment image corresponding to each pixel of the character image.
As established in step 210, the character image and the environment image are acquired from the same viewing angle, which means their coordinates correspond one to one: the top-left pixel of the character image has the same coordinates as the top-left pixel of the environment image, the pixel next to it corresponds likewise, and so on. In this way, the position in the environment image corresponding to each pixel of the character image is determined.
Finally, in step 240, for each corresponding position, the occlusion relationship between the character image and the environment image at that position is determined according to the corresponding depth values (i.e., the depth value of the pixel at the corresponding position in the environment image), and the corresponding pixels of the character image or the environment image are occluded using the determined occlusion relationship, so as to map the character image onto the environment image and let the real character explain the vehicle model live in the virtual environment. In other words, when an object in the environment image (e.g., the vehicle model) and the character image occlude each other, real occlusion is simulated according to the depth values of the two images.
As described above, the depth value of any pixel in the character image is related to its distance from the camera. Therefore, for any pixel in the character image, its occlusion relationship with the corresponding pixel in the environment image can be determined as follows. For the position in the environment image corresponding to each pixel of the character image, the depth value of the environment pixel at that position is read from the depth buffer (for convenience of description, the first depth value) and compared with the depth value of the character pixel (the second depth value). If the second depth value is greater than the first depth value, the character pixel is determined to be occluded by the environment pixel at the corresponding position; if the second depth value is less than or equal to the first depth value, the character pixel is determined to occlude the environment pixel at the corresponding position.
That is, for any pixel A in the character image, when determining the occlusion relationship with its corresponding pixel B in the environment image, the depth value of pixel A is compared with the depth value of pixel B recorded in the depth buffer. If the depth value of A is greater than that of B, pixel A is occluded by pixel B; if the depth value of A is less than or equal to that of B, pixel B is occluded by pixel A (i.e., pixel A of the character image occludes pixel B of the environment image).
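The pixel-level logic of steps 230 and 240 can be condensed into a short sketch (NumPy is an implementation choice here, not something the patent specifies; the array names and the green-screen mask are assumptions):

```python
import numpy as np

def composite(char_rgb, char_depth, char_mask, env_rgb, env_depth):
    """Map the character image onto the environment image pixel by pixel.
    char_rgb   : HxWx3 character image (already keyed from the green screen)
    char_depth : HxW   per-pixel character depth (the "second depth value")
    char_mask  : HxW   True where a character pixel exists
    env_rgb    : HxWx3 rendered virtual environment image
    env_depth  : HxW   environment depth buffer (the "first depth value")
    The two images share coordinates one to one because their acquisition
    viewing angles were made identical in step 210."""
    # A character pixel occludes the environment pixel at the same position
    # when its depth is less than or equal to the first depth value.
    in_front = char_mask & (char_depth <= env_depth)
    out = env_rgb.copy()
    out[in_front] = char_rgb[in_front]  # character occludes environment
    # Everywhere else the environment (e.g., the vehicle model) occludes
    # the character, so the environment pixel is kept unchanged.
    return out
```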
For example, when a real character (hereinafter, the "anchor") in the real environment is simulated as sitting in the vehicle's cab, the character image may show a sitting posture; the front area of the vehicle model then needs to occlude the legs and part of the waist in the character image, while the character image needs to occlude the rear-seat area of the cab. The depth value of the vehicle glass can also be adjusted so that the character image shows through the glass on the live large screen, as can be seen in fig. 3. In addition, natural interaction between the anchor and the virtual environment can be achieved with green props; for example, a green chair placed at the corresponding position can support the anchor, so that the anchor appears to be actually sitting in the virtual environment, achieving a more realistic visual effect.
Because occlusion is handled pixel by pixel in this way, the occlusion realized by this embodiment yields better fusion.
In some embodiments, the method 200 further includes presenting the mapped character image and environment image to the real character, so that the real character controls the vehicle model based on what is presented. Showing the fused scene picture to the anchor (the real character) in the real environment makes it easy for the anchor to control the vehicle model based on the positional relationship between himself and the environment image. For example, when the anchor stands at a vehicle door, opening the door of the vehicle model can be triggered by miming a door-opening motion.
Specifically, when a control instruction generated by the real character (such as opening the door described above) is detected, the vehicle model is controlled to respond accordingly. The control instructions of this embodiment include at least action instructions and voice instructions. Action instructions include opening the door, turning the vehicle key (corresponding to starting the vehicle), engaging a gear, and so on. Voice instructions mean that the anchor directly utters phrases such as "open the door", "start the vehicle", or "engage a gear".
In some embodiments, the responsive function of the vehicle model may be triggered by determining the control content of the control instruction, where the control content includes at least one of vehicle start-up, vehicle driving, vehicle shut-down, vehicle acceleration, and component display. Through the association between control instructions and the vehicle model, the anchor can precisely direct the virtual environment by voice and gesture, so that the car in the virtual environment can drive freely, open and close, and even be shown in exploded view. In addition, through a pop-up information window the anchor can individually display the car's components and the performance of its three-electric system (battery, motor, and electronic control).
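A minimal sketch of this instruction-to-function mapping might look as follows (the content labels and the `vehicle` methods are hypothetical illustrations, not an API defined by the patent):

```python
def respond(control_content: str, vehicle) -> None:
    """Enable the vehicle-model function matching the recognized content."""
    actions = {
        "vehicle_start": vehicle.start,
        "vehicle_drive": vehicle.drive,
        "vehicle_shutdown": vehicle.shut_down,
        "vehicle_accelerate": vehicle.accelerate,
        "component_display": vehicle.show_component,  # pop-up info window
    }
    handler = actions.get(control_content)
    if handler is not None:
        handler()  # turn on the corresponding function of the vehicle model
```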
In some embodiments, control instructions may also need to be preprocessed.
Specifically, if the control instruction is an action instruction, the preprocessing includes analyzing the pose, gestures, and spurious actions of the real character. The camera captures images of the anchor's gestures and transmits them to the gesture recognition subsystem, which selects different instructions according to the gesture, thereby controlling the virtual space accurately. The gesture recognition subsystem includes modules such as gesture analysis and spurious-action analysis, which together produce an accurate gesture recognition result.
If the control instruction is a voice instruction, the preprocessing includes band recognition, tone screening, and noise removal on the real character's voice. A microphone captures the anchor's voice input, and the system's internal voice perception subsystem recognizes the speech and selects the corresponding instruction to operate the virtual space. The voice perception subsystem consists of a band recognition module, a tone screening module, and a denoising module, which together achieve a relatively accurate perception result.
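A rough sketch of such a pipeline, under the assumption that the three modules can be approximated by simple signal-processing steps (the patent does not disclose their algorithms; the cutoff frequencies and gate threshold are illustrative):

```python
import numpy as np

def band_recognition(signal, rate, low=300.0, high=3400.0):
    """Keep only the typical speech band via a simple FFT mask."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def denoise(signal, gate=0.02):
    """Crude noise removal: gate out samples far below the peak level."""
    out = signal.copy()
    out[np.abs(out) < gate * np.abs(out).max()] = 0.0
    return out

def preprocess_voice(signal, rate):
    """Pipeline sketched from the embodiment: band recognition, then tone
    screening (stubbed, since no algorithm is given), then denoising."""
    filtered = band_recognition(signal, rate)
    # Tone screening would isolate the anchor's voice among speakers here.
    return denoise(filtered)
```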
Of course, more dazzling effects (such as vehicle drift) can be created on demand; the application is not limited in this respect. In some embodiments, lighting effects may also be added to the virtual environment. In addition, to further improve the realism of the virtual world, lighting devices that illuminate the vehicle model display may be set in the virtual environment (i.e., the environment image further includes corresponding lighting hardware models, and the 3D scene data includes lighting hardware model data), creating a real-world sense of depth and an immersive effect. Additionally, in some embodiments, the HDRP render pipeline may be used to render the vehicle model from the model data. During rendering, material balls (materials) can be configured according to the attribute values in the model data (different attribute values select different materials), so that one or more of reflection, color, and refraction effects are realized in the image of the vehicle model.
In addition, the anchor can watch the live bullet-screen comments on the large return-feed display screen and interact with the audience participating in the live broadcast.
In summary, the invention provides a new live broadcast method that uses the pixel attributes of a real character's image in a real environment to map the character image onto a virtual environment image, so that the real character is fused into a virtual scene to explain a vehicle live, with low cost and a high degree of fusion. In addition, the real character can interact with the vehicle in the virtual scene through voice, gestures, and the like, which improves the viewing experience of audience users.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions of the methods and apparatus of the present invention, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, U-drives, floppy diskettes, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code executing on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store the program code; the processor is configured to execute the live explanation method of the invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media comprise readable storage media and communication media. The readable storage medium stores information such as computer readable instructions, data structures, program modules, or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with examples of the invention. The structure required to construct such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided to disclose enablement and the best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into a plurality of sub-modules.
Unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
Claims (10)
1. A live explanation method adapted to be executed in a computing device, the method comprising:
acquiring a character image and an environment image, wherein the character image is a real character image in a real environment, and the environment image is a virtual environment image comprising a vehicle model;
acquiring attribute information of each pixel in the character image, wherein the attribute information comprises at least coordinate information and a depth value;
determining, based on the coordinate information, the position in the environment image corresponding to each pixel of the character image;
and, for each corresponding position, determining the occlusion relationship between the character image and the environment image at that position according to the corresponding depth values, and occluding the corresponding pixels of the character image or the environment image using the determined occlusion relationship, so as to map the character image onto the environment image and enable the real character to explain the vehicle model live in the virtual environment.
2. The method of claim 1, further comprising:
setting a first reference object and a second reference object having the same attributes in the real environment and the virtual environment, respectively;
and adjusting the viewing angle of the virtual environment according to the image difference between the first reference object and the second reference object, so that images of the real environment and images of the virtual environment are acquired from the same viewing angle.
3. The method of claim 1 or 2, further comprising:
and displaying the mapped character image and the environment image to the real character, so that the real character can control the vehicle model based on the displayed content.
4. A method as in any of claims 1-3, further comprising:
and when a control instruction generated by the real character is detected, controlling the vehicle model to respond accordingly.
5. The method of claim 4, wherein the control instructions include at least action instructions and voice instructions.
6. The method of claim 5, further comprising:
and preprocessing the control instruction.
7. The method of claim 6, wherein:
if the control instruction is an action instruction, the preprocessing includes analyzing the pose, gestures, and spurious actions of the real character;
if the control instruction is a voice instruction, the preprocessing includes band recognition, tone screening, and noise removal on the real character's voice.
8. The method of any of claims 4-7, wherein controlling the vehicle model to respond accordingly comprises:
determining the control content of the control instruction, wherein the control content includes at least one of vehicle start-up, vehicle driving, vehicle shut-down, vehicle acceleration, and component display;
and enabling the corresponding function of the vehicle model based on the control content.
9. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are adapted to be executed by the at least one processor, the program instructions comprising instructions for performing the method of any of claims 1-8.
10. A readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform the method of any of claims 1-8.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410459057.1A | 2024-04-16 | 2024-04-16 | Live broadcast explanation method, computing device and readable storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410459057.1A | 2024-04-16 | 2024-04-16 | Live broadcast explanation method, computing device and readable storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118400546A | 2024-07-26 |
Family
ID=92006552
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410459057.1A (Pending) | Live broadcast explanation method, computing device and readable storage medium | 2024-04-16 | 2024-04-16 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118400546A (en) |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP3798801B1 (en) | Image processing method, apparatus and storage medium for selection of virtual content and rendering of a corresponding virtual entry | |
| US11335379B2 (en) | Video processing method, device and electronic equipment | |
| CN109345556B (en) | Neural network foreground separation for mixed reality | |
| US10636220B2 (en) | Methods and systems for generating a merged reality scene based on a real-world object and a virtual object | |
| US10891796B2 (en) | Systems and methods for augmented reality applications | |
| US10186087B2 (en) | Occluding augmented reality objects | |
| CN107306332B (en) | Occlusive direct view augmented reality system, computing device and method | |
| CN116897326A (en) | Hand-locked rendering of virtual objects in artificial reality | |
| US20110084983A1 (en) | Systems and Methods for Interaction With a Virtual Environment | |
| EP2458554A1 (en) | Motion-based tracking | |
| CN111968215A (en) | Volume light rendering method and device, electronic equipment and storage medium | |
| TW202316373A (en) | Systems and methods for recognizability of objects in a multi-layer display | |
| CN117957579A (en) | Environment model with surfaces and volumes per surface | |
| KR20110104686A (en) | Marker Size Based Interaction Method and Augmented Reality System for Implementing the Same | |
| US12475656B2 (en) | Space and content matching for augmented and mixed reality | |
| CN116863107A (en) | Augmented reality providing method, apparatus, and non-transitory computer readable medium | |
| US9483873B2 (en) | Easy selection threshold | |
| CN111968214B (en) | Volumetric cloud rendering method, device, electronic device and storage medium | |
| US11961190B2 (en) | Content distribution system, content distribution method, and content distribution program | |
| US20250225720A1 (en) | Image processing apparatus, object data generation apparatus, image processing method, object data generation method, and object model data structure | |
| KR102419290B1 (en) | Method and Apparatus for synthesizing 3-dimensional virtual object to video data | |
| CN111240630B (en) | Multi-screen control method and device for augmented reality, computer equipment and storage medium | |
| CN118400546A (en) | Live broadcast explanation method, computing device and readable storage medium | |
| CN109461203B (en) | Gesture three-dimensional image generation method and device, computer equipment and storage medium | |
| US8970626B2 (en) | System, method, and computer program product for adding computer-generated scene elements into a representation of a real-world scene, using path tracing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||