
CN115373550A - Method, system and chip for acquiring interaction information - Google Patents

Method, system and chip for acquiring interaction information

Info

Publication number
CN115373550A
Authority
CN
China
Prior art keywords
input
character
frame image
cloud application
application server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211303445.8A
Other languages
Chinese (zh)
Other versions
CN115373550B (en)
Inventor
王嘉诚
张少仲
张栩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongcheng Hualong Computer Technology Co Ltd
Original Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongcheng Hualong Computer Technology Co Ltd filed Critical Zhongcheng Hualong Computer Technology Co Ltd
Priority to CN202211303445.8A
Publication of CN115373550A
Application granted
Publication of CN115373550B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 - Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 - Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 - Character input methods
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/18 - Extraction of features or characteristics of the image
    • G06V30/1801 - Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method, a system and a chip for acquiring interaction information, relating to the technical field of information transmission. In the method, a cloud application server identifies whether decoded input information is in a normal state or an abnormal state. If it is in the abnormal state, the server retrieves a cached first frame image, calculates the difference between the first frame image and a second frame image, judges whether the input content is a text element or a character-picture element, marks the element accordingly, and uploads the marked element to a background server through an input content socket. By calculating the difference between the sent and received image frames, the cloud application server obtains the outline of the input information, judges the type of the input information by artificial intelligence, and extracts, stores and uploads the input content to the background server in the corresponding manner, so the background server can correctly display all types of input information. This solves the accuracy problem of user interaction information transmission and improves the user experience of cloud applications.

Description

Method, system and chip for obtaining interaction information
Technical Field
The invention belongs to the technical field of information transmission, and particularly relates to a method, a system and a chip for acquiring interactive information.
Background
Under the existing cloud application network architecture, a background server provides global data storage and processing while the cloud application server and the client interact with each other. Part of the data that would otherwise be processed and stored by the client is handled by the cloud application server: the client captures user operations and sends them over the network to the cloud application server for interaction, and receives audio and video streams from the cloud application server for decoding and display.
However, when interaction information is transmitted, the input methods installed locally on clients are diverse, and input methods from different operating systems and development sources have many compatibility problems. If the cloud application server uploads the input information sent by the client directly to the background server, the background server often decodes and displays garbled text. In addition, many input methods ship personalized character pictures or emoticon packs, and many support user-defined character pictures or emoticons; the background server usually has no way to obtain these images normally, or cannot display them correctly because their layout and ordering are disordered. These problems reduce the accuracy of user interaction information transmission and degrade the user experience of cloud applications.
Disclosure of Invention
In view of the above-mentioned drawbacks in the prior art, the present invention provides a method for acquiring interaction information, where the method includes:
the client captures the user's local operations and, when an input method trigger operation is detected, sends an input request to the cloud application server;
the cloud application server receives the input request, sends a first frame image data packet carrying an input label to the client, and caches the first frame image; the client receives and decodes the first frame image to obtain the input label, then renders a dialog box structure in a foreground layer and invokes the local input method;
when the client detects that local input is finished and submission is triggered, the client saves its current screen image as a second frame image, packages the second frame image and the input information into a second frame image data packet, and transmits the packet to the cloud application server;
the cloud application server decodes the second frame image data packet to obtain the decoded input information and the second frame image;
the cloud application server identifies whether the decoded input information is in a normal state or an abnormal state;
if the input information is in a normal state, the decoded input information is uploaded to a background server through an input content socket;
if it is in an abnormal state, the first frame image cached in the cloud application server is retrieved, the difference between the first frame image and the second frame image is calculated, and the input content is judged to be a text element or a character-picture element: input text is determined through character recognition, while a character picture or emoticon is represented in array form; the element is marked, and the marked text element or character-picture element is uploaded to the background server through the input content socket;
the background server receives the data from the input content socket, decodes it to obtain an input content field, and updates the interface display content field with the input content field.
The cloud application server identifying whether the decoded input information is in a normal state or an abnormal state comprises:
extracting feature information from the decoded input information, inputting the feature information into a semantic understanding deep neural network model, and judging from the model output whether the input information is in a normal state;
when the output of the semantic understanding deep neural network model indicates an abnormal state, the cloud application server reads the cached first frame image, calculates the difference between the first frame image and the decoded second frame image, inputs the difference into a text input anomaly deep neural network model, and judges the text input anomaly type from that model's output;
the text input anomaly types include garbled text, character pictures, and emoticons.
When the text input anomaly type is garbled text, the input text is determined through character recognition;
when the text input anomaly type is a character picture or an emoticon, the character picture or emoticon is represented in array form.
At least two data fields are set in the first frame image data packet, carrying the first frame image data and the input label respectively, and the packet header indicates the size of each field. When the client decodes the header of the first frame image data packet and finds an input label field of non-zero size, this indicates an instruction to obtain input information, and the client renders a dialog box structure in the foreground layer and invokes the local input method.
The data field of the first frame image data packet that carries the input label further contains an input style label preset by the user, and the client renders the dialog box structure in the foreground layer according to the input style label.
The cloud application server receives and decodes the second frame image data packet sent by the client. The second frame image data packet contains at least two data fields, carrying the second frame image and the input information respectively; neither data field may be empty, otherwise the cloud application server instructs the client to retransmit the second frame image data packet.
Uploading the marked text elements or character-picture elements to the background server through the input content socket comprises:
labeling the different elements according to the order in which the user entered them.
During the current service period, when multiple consecutive returned results are normal and the compatibility condition is met, the cloud application server changes the input instruction setting and sends an instruction telling the client to stop sending input information in second frame image data packets and to send it in input information data packets instead.
The invention also provides a system for acquiring interaction information based on the above method. The system comprises a client, a cloud application server and a background server, where the client executes the parts of the method corresponding to the client, the cloud application server executes the parts corresponding to the cloud application server, and the background server executes the parts corresponding to the background server.
The invention also provides a chip for acquiring interaction information based on the above system, the chip being used to execute program code in a computer-readable storage medium to implement the data processing method of the client, or to execute program code in a computer-readable storage medium to implement the data processing method of the cloud application server, or to execute program code in a computer-readable storage medium to implement the data processing method of the background server.
Compared with the prior art, the client uploads the image frame at the moment the text is sent together with the input text information, so the cloud application server can quickly obtain the outline of the input information by calculating the difference between the sent and received image frames, judge the type of the input information by artificial intelligence, and extract, store and upload the input content to the background server in the corresponding manner. By implementing the invention, the background server can correctly display all types of input information, which solves the accuracy problem of user interaction information transmission and improves the user experience of cloud applications.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar or corresponding parts:
FIG. 1 is a flow chart illustrating a method of obtaining interaction information according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a system for acquiring interaction information according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings. All other embodiments obtained by a person skilled in the art, based on the embodiments of the present invention and without creative effort, fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the description of the invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "plural" typically means at least two.
It should be understood that the term "and/or" as used herein merely describes an association between objects, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter objects are in an "or" relationship.
The word "if" as used herein may be interpreted as "when" or "once" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrases "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined" or "in response to determining" or "when (a stated condition or event) is detected" or "in response to detecting (a stated condition or event)", depending on the context.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the article or apparatus comprising the element.
An alternative embodiment of the present invention is described in detail below with reference to the drawings.
Embodiment 1
As shown in Fig. 1, the invention discloses a method for acquiring interaction information, the method comprising:
the client captures the user's local operations and, when an input method trigger operation is detected, sends an input request to the cloud application server;
the cloud application server receives the input request, sends a first frame image data packet carrying an input label to the client, and caches the first frame image; the client receives and decodes the first frame image to obtain the input label, then renders a dialog box structure in a foreground layer and invokes the local input method;
when the client detects that local input is finished and submission is triggered, the client saves its current screen image as a second frame image, packages the second frame image and the input information into a second frame image data packet, and transmits the packet to the cloud application server;
the cloud application server decodes the second frame image data packet to obtain the decoded input information and the second frame image;
the cloud application server identifies whether the decoded input information is in a normal state or an abnormal state;
if the input information is in a normal state, the decoded input information is uploaded to a background server through an input content socket;
if it is in an abnormal state, the first frame image cached in the cloud application server is retrieved, the difference between the first frame image and the second frame image is calculated, and the input content is judged to be a text element or a character-picture element: input text is determined through character recognition, while a character picture or emoticon is represented in array form; the element is marked, and the marked text element or character-picture element is uploaded to the background server through the input content socket;
the background server receives the data from the input content socket, decodes it to obtain an input content field, and updates the interface display content field with the input content field.
In general, a user performs no other operations while inputting information. After the client receives the first frame image, nothing on the client screen changes except the input frame and the input content, so the difference between the first frame image and the second frame image yields the outline information of the input. The input type corresponding to that outline is then judged: text is recognized by OCR, and the OCR result is transmitted by the cloud application server to the background server in an encoding supported by both the cloud application server and the background server; when the content is judged to be a character picture or an emoticon, it can be stored in array form, for example as an RGB matrix, as the sketch below illustrates.
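The following Python sketch illustrates this step: it computes the frame difference, crops the changed region, and either runs OCR on it or keeps it as a raw RGB array. This is a minimal sketch, assuming OpenCV and the pytesseract OCR wrapper are available; the binarization threshold and the OCR language setting are illustrative choices, not values fixed by the invention.

```python
import cv2
import numpy as np
import pytesseract  # assumed OCR engine; any OCR library could stand in


def extract_input_content(first_frame, second_frame, is_text):
    # Nothing on screen changes between the frames except the input,
    # so the difference image outlines exactly what the user entered.
    diff = cv2.absdiff(first_frame, second_frame)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)  # 25: assumed threshold
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # no visible input change
    region = second_frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    if is_text:
        # Garbled-text case: recover the characters with OCR.
        return pytesseract.image_to_string(region, lang="chi_sim+eng")
    # Character-picture / emoticon case: keep the raw RGB matrix.
    return np.asarray(region, dtype=np.uint8)
```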
If the user's background changes during input, for example because other dynamic elements are displayed, the difference between the first frame image and the second frame image is still computed, and the outline of the input dialog box is found through feature point comparison. The cloud application server stores the input style label preset by the user together with the corresponding dialog box template style, so the dialog box area can be located quickly from the dialog outline and the template style, and that area is then analyzed to determine the input type, as the template-matching sketch below illustrates.
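Locating the dialog box against a stored template could look like the following sketch. The grayscale conversion and the 0.6 confidence floor are assumptions for illustration; the patent does not fix a particular matching method.

```python
import cv2


def locate_dialog_box(second_frame, dialog_template, min_score=0.6):
    # Match the user's stored dialog-box template against the frame to
    # find the input area even when other screen content has changed.
    gray = cv2.cvtColor(second_frame, cv2.COLOR_BGR2GRAY)
    tmpl = cv2.cvtColor(dialog_template, cv2.COLOR_BGR2GRAY)
    scores = cv2.matchTemplate(gray, tmpl, cv2.TM_CCOEFF_NORMED)
    _, best, _, top_left = cv2.minMaxLoc(scores)
    if best < min_score:
        return None  # template not found with enough confidence
    h, w = tmpl.shape
    return (top_left[0], top_left[1], w, h)  # x, y, width, height of the dialog area
```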
In one embodiment, the input type includes at least one of text, characters, pictures and emoticons, or a combination of two or more of them.
In one embodiment, the cloud application server identifying whether the decoded input information is in a normal state or an abnormal state includes:
extracting feature information from the decoded input information, inputting the feature information into a semantic understanding deep neural network model, and judging from the model output whether the input information is in a normal state;
when the output of the semantic understanding deep neural network model indicates an abnormal state, the cloud application server reads the cached first frame image, calculates the difference between the first frame image and the decoded second frame image, inputs the difference into a text input anomaly deep neural network model, and judges the text input anomaly type from that model's output;
the text input anomaly types include garbled text, character pictures, and emoticons.
Features are extracted from the input information; the feature information may be word2vec (w2v) features. In the offline stage, features are extracted from labeled interaction information and used to train the input anomaly deep neural network model. In the online stage, features extracted from the received input information are fed into the model, and its output determines whether the input is normal input information or an anomaly such as garbled text without any semantic meaning or a character picture. A minimal sketch of the online step follows.
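The sketch below assumes PyTorch; the sentence vector is a plain average of word2vec token vectors, and the two-layer network, the 300-dimension embedding and the two-class output are illustrative stand-ins for the semantic understanding model, not details fixed by the patent.

```python
import torch
import torch.nn as nn


class SemanticAnomalyNet(nn.Module):
    # Stand-in for the semantic understanding deep neural network:
    # a w2v sentence vector in, logits for {normal, abnormal} out.
    def __init__(self, w2v_dim=300):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(w2v_dim, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, x):
        return self.net(x)


def sentence_vector(tokens, w2v, dim=300):
    # Average the word2vec vectors (torch tensors) of known tokens.
    vecs = [w2v[t] for t in tokens if t in w2v]
    return torch.stack(vecs).mean(dim=0) if vecs else torch.zeros(dim)


model = SemanticAnomalyNet()
feat = sentence_vector(["hello", "world"], w2v={})  # empty table -> zero vector
is_normal = model(feat.unsqueeze(0)).argmax(dim=1).item() == 0
```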
In addition, the semantic information can be analyzed, and content with inappropriate expressions can be filtered out as needed, making the network environment healthier.
In the offline stage, image frame differences covering the various input types and abnormal conditions are labeled and used as training data to obtain the text input anomaly deep neural network model; in the online stage, the image frame difference computed in real time is fed in as input, and the anomaly type is read from the model output, as sketched below.
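The offline stage could be trained along the following lines. The small CNN, the Adam optimizer and the three-class labeling are assumptions chosen to match the anomaly types named above; the patent does not prescribe an architecture.

```python
import torch
import torch.nn as nn

# Labels follow the three abnormal types named in the description.
LABELS = {0: "garbled text", 1: "character picture", 2: "emoticon"}

# Small CNN over frame-difference images; architecture is illustrative.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, len(LABELS)),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()


def train_step(diff_batch, label_batch):
    # diff_batch: (N, 3, H, W) labeled frame differences; label_batch: (N,)
    optimizer.zero_grad()
    loss = loss_fn(model(diff_batch), label_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```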
A problem often encountered when displaying character pictures is that, although they consist of characters and can be transmitted as text, display interfaces differ greatly. A character picture composed locally with a specific meaning may be completely deformed, like garbled text, when the background server synchronizes it to the interaction information of other clients, or it may not match the display size; because the number of characters per line differs, it may require multi-page display or end up with disordered layout, hindering the user's reading and understanding of the interaction content.
In one embodiment, when the text input anomaly type is garbled text, the input text is determined through character recognition;
when the text input anomaly type is a character picture or an emoticon, the character picture or emoticon is represented in array form.
In a certain embodiment, at least two data fields are set in the first frame image data packet, carrying the first frame image data and the input label respectively, and the packet header indicates the size of each field. When the client decodes the header of the first frame image data packet and finds an input label field of non-zero size, this indicates an instruction to obtain input information, and the client renders a dialog box structure in the foreground layer and invokes the local input method; a possible wire layout is sketched below.
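In the sketch, a fixed header carries the two field sizes, and a non-zero input label size is the client's cue to render the dialog box. The big-endian two-uint32 header is an assumption; the patent does not fix the header encoding.

```python
import struct

HEADER = struct.Struct("!II")  # sizes of the two data fields (assumed layout)


def pack_first_frame(frame_bytes: bytes, input_label: bytes) -> bytes:
    # Header first, then the two variable-length fields back to back.
    return HEADER.pack(len(frame_bytes), len(input_label)) + frame_bytes + input_label


def unpack_first_frame(packet: bytes):
    frame_len, label_len = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    frame = body[:frame_len]
    label = body[frame_len:frame_len + label_len]
    # A non-zero label size signals the client to render the dialog box
    # structure and invoke the local input method.
    return frame, label, label_len > 0
```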
In a certain embodiment, the data field of the first frame image data packet that carries the input label further contains an input style label preset by the user, and the client renders the dialog box structure in the foreground layer according to the input style label.
The dialog box structure may be personalized by the user, for example with a custom outline, color and font, and the cloud application server stores the corresponding dialog box template determined by the user's selection.
In a certain embodiment, the cloud application server receives and decodes the second frame image data packet sent by the client. The second frame image data packet contains at least two data fields, carrying the second frame image and the input information respectively; neither data field may be empty, otherwise the cloud application server instructs the client to retransmit the second frame image data packet.
Requiring that neither data field of the second frame image data packet be empty ensures that the scheme of the invention runs normally and produces a correct result, as the validation sketch below illustrates.
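Server-side validation might look like this sketch, reusing the same assumed two-field header layout as above; the retransmit instruction string is hypothetical.

```python
import struct

HEADER = struct.Struct("!II")  # same assumed layout as the first frame packet


def decode_second_frame(packet: bytes):
    frame_len, input_len = HEADER.unpack_from(packet)
    if frame_len == 0 or input_len == 0:
        # An empty field would break the difference computation, so the
        # server asks the client to resend (hypothetical control code).
        return None, None, "RETRANSMIT"
    body = packet[HEADER.size:]
    return body[:frame_len], body[frame_len:frame_len + input_len], "OK"
```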
In one embodiment, uploading the marked text elements or character-picture elements to the background server through the input content socket includes:
labeling the different elements according to the order in which the user entered them.
A socket is a communication mechanism; either a stream socket or a datagram socket can be used to report the input content. When datagram sockets are used, the individual packets must be numbered so that the receiving end can restore the correct order of the input content from the numbers, as in the sketch below.
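A minimal sketch of numbered datagrams, assuming a 4-byte big-endian sequence prefix; the chunk size and the loopback address are illustrative.

```python
import socket

CHUNK = 1024  # assumed payload size per datagram


def send_input_content(payload: bytes, addr=("127.0.0.1", 9999)):
    # Prefix each datagram with a sequence number so the backend can
    # restore the original order of the input content.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for offset in range(0, len(payload), CHUNK):
        seq = offset // CHUNK
        sock.sendto(seq.to_bytes(4, "big") + payload[offset:offset + CHUNK], addr)
    sock.close()


def reassemble(datagrams):
    # Receiver side: sort by sequence number, then strip the prefixes.
    ordered = sorted(datagrams, key=lambda d: int.from_bytes(d[:4], "big"))
    return b"".join(d[4:] for d in ordered)
```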
In a certain embodiment, during the current service period, when multiple consecutive returned results are normal and the compatibility condition is met, the cloud application server changes the input instruction setting and sends an instruction telling the client to stop sending input information in second frame image data packets and to send it in input information data packets instead.
When returning a second frame image data packet, the client can also report the type of input method invoked locally. The cloud server queries the input method's attribute values in its database to determine whether the input method includes a user-defined character picture function or other incompatible character picture or text functions; if not, the compatibility condition is judged to be met, as the sketch below illustrates.
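The compatibility check and the switch away from image-frame packets could be expressed as below; the attribute table, the flag name, the five-result threshold and the instruction codes are all hypothetical placeholders.

```python
# Hypothetical attribute table: input method name -> capability flags.
INPUT_METHOD_DB = {
    "ime_plain": {"custom_char_pictures": False},
    "ime_fancy": {"custom_char_pictures": True},
}


def is_compatible(ime_name: str) -> bool:
    attrs = INPUT_METHOD_DB.get(ime_name)
    # Unknown input methods stay on the image-frame path.
    return attrs is not None and not attrs["custom_char_pictures"]


def choose_input_path(recent_ok, ime_name, threshold=5):
    # After `threshold` consecutive normal results from a compatible input
    # method, tell the client to drop the second frame images and send
    # plain input information packets instead.
    if len(recent_ok) >= threshold and all(recent_ok) and is_compatible(ime_name):
        return "USE_INPUT_INFO_PACKET"
    return "USE_SECOND_FRAME_PACKET"
```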
By implementing the method and system, the client uploads the image frame at the moment the text is sent together with the input text information, so the cloud application server can quickly obtain the outline of the input information by calculating the difference between the sent and received image frames, judge the type of the input information by artificial intelligence, and extract, store and upload the input content to the background server in the corresponding manner. The background server can thus correctly display all types of input information, which solves the accuracy problem of user interaction information transmission and improves the user experience of cloud applications.
Embodiment 2
As shown in Fig. 2, the invention discloses a system for acquiring interaction information based on the above method. The system comprises a client, a cloud application server and a background server, where the client executes the parts of the method corresponding to the client, the cloud application server executes the parts corresponding to the cloud application server, and the background server executes the parts corresponding to the background server.
By implementing the method and system, the client uploads the image frame at the moment the text is sent together with the input text information, so the cloud application server can quickly obtain the outline of the input information by calculating the difference between the sent and received image frames, judge the type of the input information by artificial intelligence, and extract, store and upload the input content to the background server in the corresponding manner. The background server can thus correctly display all types of input information, which solves the accuracy problem of user interaction information transmission and improves the user experience of cloud applications.
Embodiment 3
The invention also provides a chip for acquiring interaction information based on the above system, the chip being used to execute program code in a computer-readable storage medium to implement the data processing method of the client, or to execute program code in a computer-readable storage medium to implement the data processing method of the cloud application server, or to execute program code in a computer-readable storage medium to implement the data processing method of the background server.
It should be noted that the computer readable medium of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of an element does not in some cases constitute a limitation on the element itself.
The foregoing describes preferred embodiments of the present invention. It is intended not to limit the invention but to include all modifications, substitutions and alterations falling within its spirit and scope as defined by the appended claims.

Claims (10)

1. A method of acquiring interaction information, the method comprising:
a client capturing a user's local operations and, when an input method trigger operation is detected, sending an input request to a cloud application server;
the cloud application server receiving the input request, sending a first frame image data packet carrying an input label to the client, and caching the first frame image; the client receiving and decoding the first frame image to obtain the input label, then rendering a dialog box structure in a foreground layer and invoking a local input method;
when the client detects that local input is finished and submission is triggered, the client saving its current screen image as a second frame image, packaging the second frame image and the input information into a second frame image data packet, and transmitting the packet to the cloud application server;
the cloud application server decoding the second frame image data packet to obtain the decoded input information and the second frame image;
the cloud application server identifying whether the decoded input information is in a normal state or an abnormal state;
if the input information is in a normal state, uploading the decoded input information to a background server through an input content socket;
if it is in an abnormal state, retrieving the first frame image cached in the cloud application server, calculating the difference between the first frame image and the second frame image, judging whether the input content is a text element or a character-picture element, determining input text through character recognition or representing a character picture or emoticon in array form, marking the element, and uploading the marked text element or character-picture element to the background server through the input content socket;
the background server receiving the data from the input content socket, decoding it to obtain an input content field, and updating the interface display content field with the input content field.
2. The method of claim 1, wherein the cloud application server identifying whether the decoded input information is in a normal state or an abnormal state comprises:
extracting feature information from the decoded input information, inputting the feature information into a semantic understanding deep neural network model, and judging from the model output whether the input information is in a normal state;
when the output of the semantic understanding deep neural network model indicates an abnormal state, the cloud application server reading the cached first frame image, calculating the difference between the first frame image and the decoded second frame image, inputting the difference into a text input anomaly deep neural network model, and judging the text input anomaly type from that model's output;
the text input anomaly types comprising garbled text, character pictures and emoticons.
3. The method of claim 2, wherein
when the text input anomaly type is garbled text, the input text is determined through character recognition;
and when the text input anomaly type is a character picture or an emoticon, the character picture or emoticon is represented in array form.
4. The method according to claim 1, wherein at least two data fields are set in the first frame image data packet, carrying the first frame image data and the input label respectively, and the packet header indicates the size of each field; when the client decodes the header of the first frame image data packet and finds an input label field of non-zero size, this indicates an instruction to obtain input information, and the client renders a dialog box structure in the foreground layer and invokes the local input method.
5. The method of claim 4, wherein the data field of the first frame image data packet that carries the input label further comprises an input style label preset by the user, and the client renders the dialog box structure in the foreground layer according to the input style label.
6. The method of claim 1, wherein the cloud application server receives and decodes a second frame image data packet sent by the client, the second frame image data packet comprising at least two data fields carrying the second frame image and the input information respectively; neither data field may be empty, otherwise the cloud application server instructs the client to retransmit the second frame image data packet.
7. The method of claim 1, wherein uploading the marked text elements or character-picture elements to the background server through the input content socket comprises:
labeling the different elements according to the order in which the user entered them.
8. The method of claim 1, wherein,
during the current service period, when multiple consecutive returned results are normal and the compatibility condition is met, the cloud application server changes the input instruction setting and sends an instruction telling the client to stop sending input information in second frame image data packets and to send it in input information data packets instead.
9. A system for acquiring interaction information, the system comprising a client, a cloud application server and a background server, wherein the client is configured to execute the parts of the method of any one of claims 1 to 8 corresponding to the client, the cloud application server is configured to execute the parts corresponding to the cloud application server, and the background server is configured to execute the parts corresponding to the background server.
10. A chip for acquiring interaction information, configured to execute program code in a computer-readable storage medium to implement the method of the client in any one of claims 1 to 8, or to execute program code in a computer-readable storage medium to implement the method of the cloud application server in any one of claims 1 to 8, or to execute program code in a computer-readable storage medium to implement the method of the background server in any one of claims 1 to 8.
CN202211303445.8A, filed 2022-10-24 (priority 2022-10-24): Method, system and chip for obtaining interaction information. Granted as CN115373550B; status: Active.

Priority Applications (1)

Application Number: CN202211303445.8A (granted as CN115373550B)
Priority Date: 2022-10-24
Filing Date: 2022-10-24
Title: Method, system and chip for obtaining interaction information

Applications Claiming Priority (1)

Application Number: CN202211303445.8A (granted as CN115373550B)
Priority Date: 2022-10-24
Filing Date: 2022-10-24
Title: Method, system and chip for obtaining interaction information

Publications (2)

Publication Number Publication Date
CN115373550A (application publication): 2022-11-22
CN115373550B (granted publication): 2022-12-20

Family

ID=84073979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211303445.8A Active CN115373550B (en) 2022-10-24 2022-10-24 Method, system and chip for obtaining interaction information

Country Status (1)

Country Link
CN (1) CN115373550B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090313536A1 (en) * 2008-06-11 2009-12-17 Microsoft Corporation Dynamically Providing Relevant Browser Content
CN108806355A (en) * 2018-04-26 2018-11-13 浙江工业大学 A kind of calligraphy and painting art interactive education system
CN109309844A (en) * 2017-07-26 2019-02-05 腾讯科技(深圳)有限公司 Video platform word treatment method, videoconference client and server
CN109889907A (en) * 2019-04-08 2019-06-14 北京东方国信科技股份有限公司 A kind of display methods and device of the video OSD based on HTML5
CN113794903A (en) * 2021-09-16 2021-12-14 广州虎牙科技有限公司 Video image processing method and device and server


Also Published As

Publication number Publication date
CN115373550B (en) 2022-12-20

Similar Documents

Publication Publication Date Title
US11954455B2 (en) Method for translating words in a picture, electronic device, and storage medium
CN109947512B (en) Text adaptive display method, device, server and storage medium
US20170169822A1 (en) Dialog text summarization device and method
CN111738041A (en) Video segmentation method, device, equipment and medium
CN112995749A (en) Method, device and equipment for processing video subtitles and storage medium
CN111741329B (en) Video processing method, device, equipment and storage medium
KR102002024B1 (en) Method for processing labeling of object and object management server
CN109816023B (en) Method and device for generating picture label model
CN113255377A (en) Translation method, translation device, electronic equipment and storage medium
CN113705300A (en) Method, device and equipment for acquiring phonetic-to-text training corpus and storage medium
CN111813929A (en) Information processing method, device and electronic equipment
CN110970011A (en) Picture processing method, device and equipment and computer readable storage medium
US20170325003A1 (en) A video signal caption system and method for advertising
CN110379406B (en) Voice comment conversion method, system, medium and electronic device
CN114697762B (en) Processing method, processing device, terminal equipment and medium
CN116645673A (en) Keyword labeling method and device for video content, electronic equipment and medium
CN116962755A (en) Subtitle generation method, related equipment and storage medium
CN111291575A (en) Text processing method and device, electronic equipment and storage medium
CN114286181B (en) Video optimization method and device, electronic equipment and storage medium
CN115373550B (en) Method, system and chip for obtaining interaction information
CN109947526B (en) Method and apparatus for outputting information
CN111914850B (en) Picture feature extraction method, device, server and medium
CN113743060A (en) Customizable report generation method and device, computer equipment and storage medium
CN112578916B (en) Information processing method and system
CN110706309B (en) Method and device for generating fishbone map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant