
CN106547753B - Information analysis method and electronic equipment - Google Patents


Info

Publication number
CN106547753B
Authority
CN
China
Prior art keywords
vocabulary information
attribute
rule template
information
vocabulary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510591872.4A
Other languages
Chinese (zh)
Other versions
CN106547753A (en)
Inventor
冯晓冰
王飞
徐浩
许振兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201510591872.4A
Publication of CN106547753A
Application granted
Publication of CN106547753B
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an information analysis method and electronic equipment, wherein the method comprises the following steps: acquiring at least two vocabulary information from the text information; acquiring attribute sets of the at least two vocabulary information; selecting at least one rule template based on the at least two vocabulary information; and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.

Description

Information analysis method and electronic equipment
Technical Field
The present invention relates to information analysis technologies in the field of information processing, and in particular, to an information analysis method and an electronic device.
Background
In the prior art, sentences are analyzed by identifying rule templates and domain named entities, without using the attributes of the domain named entities or the relationships between those attributes. In a complete sentence, the named entities generally bear certain relationships to one another, so the current technical scheme cannot obtain an accurate sentence meaning.
Disclosure of Invention
In view of the above, embodiments of the present invention provide an information analysis method and an electronic device, which can at least solve the above problems in the prior art.
The embodiment of the invention provides an information analysis method, which comprises the following steps:
acquiring at least two vocabulary information from the text information;
acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
selecting at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.
An embodiment of the present invention further provides an electronic device, including:
the text information processing unit is used for acquiring at least two vocabulary information from the text information;
the attribute acquisition unit is used for acquiring an attribute set of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
the template acquisition unit is used for selecting at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
and the analysis unit is used for acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, a rule part in the at least one rule template and an attribute value constraint relational expression.
An embodiment of the present invention further provides an electronic device, including:
a processor and a memory for storing a computer program capable of running on the processor; wherein,
the processor is configured to execute the information analysis method when the computer program is run.
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are configured to, when executed by a processor, perform:
acquiring at least two vocabulary information from the text information;
acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
selecting at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.
In the above solution, the computer-executable instructions are further configured to, when executed by the processor, perform:
setting a vocabulary information base, wherein the vocabulary information base comprises at least one vocabulary information, together with the category name, attribute type and attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name;
and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
In the above solution, the computer-executable instructions are further configured to, when executed by the processor, perform:
respectively acquiring category names of the at least two vocabulary information based on the vocabulary information base;
determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information;
and acquiring at least one rule template from the rule template library based on the domain name.
In the above solution, the computer-executable instructions are further configured to, when executed by the processor, perform:
extracting rule templates from the at least one rule template one by one;
when the attribute types in the attribute sets of the at least two pieces of vocabulary information are matched with the extracted rule part of the rule template, acquiring the attribute constraint relation of the at least two pieces of vocabulary information by using at least one attribute value in the attribute sets of the at least two pieces of vocabulary information and the extracted attribute value constraint relational expression in the rule template;
and determining the meaning corresponding to the text information based on the attribute constraint relation of the at least two pieces of vocabulary information.
In the above solution, the computer-executable instructions are further configured to, when executed by the processor, perform:
if the attribute constraint relationship of the at least two vocabulary information represents that the attribute constraint relationship is effective, storing the rule template and the vocabulary information to a matching list;
processing the rule template and the vocabulary information stored in the matching list, and selecting a target rule template and the vocabulary information corresponding to the target rule template;
and determining the meaning corresponding to the text information based on the selected target rule template and the corresponding vocabulary information.
The information analysis method and the electronic device provided by the embodiments of the present invention extract vocabulary information from text information, acquire the attribute values of the vocabulary information, and acquire a rule template for the vocabulary information; the meaning corresponding to the text information is then obtained by using the attribute values of the vocabulary information together with the rule part and the attribute value constraint relational expression in the rule template. The meaning of the text information can therefore be analyzed in combination with the attribute values of the vocabulary information, which addresses the limited semantic analysis accuracy of the prior art, where no constraint relation analysis is performed on attribute values.
Drawings
FIG. 1 is a first schematic flow chart of an information analysis method according to an embodiment of the present invention;
FIG. 2 is a second schematic flow chart of an information analysis method according to an embodiment of the present invention;
FIG. 3 is an example of vocabulary information with its attribute types and attribute values according to an embodiment of the present invention;
FIG. 4 is a first schematic view of the component structure of an electronic device according to an embodiment of the present invention;
FIG. 5 is a second schematic view of the component structure of an electronic device according to an embodiment of the present invention;
FIG. 6 is a third schematic view of the component structure of an electronic device according to an embodiment of the present invention;
FIG. 7 is a hardware implementation structure diagram of an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
An embodiment of the present invention provides an information analysis method, as shown in fig. 1, the method includes:
step 101: acquiring at least two vocabulary information from the text information;
step 102: acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
step 103: selecting at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
step 104: and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.
Here, the text information in step 101 may be acquired in a manner that: acquiring text information input by a user;
or, acquiring voice information through an audio acquisition unit and converting the acquired voice information into text information; wherein the audio acquisition unit may be a microphone (MIC). The voice information may be converted into text information by using preset voice-to-text conversion software.
The obtaining of the at least two vocabulary information from the text information may be to separate the text information according to a preset separation manner to obtain the at least two vocabulary information;
wherein, the preset separation mode can be as follows: and matching a preset word bank with the text information, and selecting the vocabulary matched with the word bank as finally acquired vocabulary information.
Preferably, if the text information converted from the voice information input by the user has a plurality of separation modes, the separation modes can be selected one by one for division, the text information is divided according to the selected separation mode, and the subsequent processing flow is completed.
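The enumeration of candidate separation modes can be sketched as follows. This is a minimal illustration only: it assumes that every character of the text must be covered by a word from the preset word bank, which the description does not require, and the function name `segmentations` and the toy lexicon are hypothetical.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words that all occur in `lexicon`."""
    if not text:
        return [[]]  # one way to split the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this head word and split the remainder recursively.
            for tail in segmentations(text[end:], lexicon):
                results.append([head] + tail)
    return results


# Hypothetical word bank; each candidate list is one separation mode.
lexicon = {"ab", "abc", "c", "d", "cd"}
candidates = segmentations("abcd", lexicon)
```

Each entry of `candidates` could then be passed, one by one, through the subsequent processing flow.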
Preferably, before executing step 101, the present embodiment may further set parameters such as a domain name, a category name of vocabulary information, and an attribute type, and specifically, the method further includes:
setting a vocabulary information base, wherein the vocabulary information base comprises at least one vocabulary information, together with the category name, attribute type and attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name;
and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
The manner of acquiring the attribute sets of the at least two vocabulary information may include:
extracting vocabulary information to be matched from the at least two vocabulary information respectively;
comparing the vocabulary information to be matched with all vocabulary information in a preset vocabulary information base, selecting the matched vocabulary information, and taking the category name, the attribute type and the attribute value of the matched vocabulary information as the category name, the attribute type and the attribute value of the vocabulary information to be matched.
Example:
Xiao Zhang:
[Figure: table listing the attribute types and attribute values of this vocabulary information]
Here, "Xiao Zhang" is the vocabulary information, and its attribute set has three attribute types: birthday, song, and girlfriend. Each of the three attribute types has at least one attribute value; for example, the attribute type song has three attribute values: "A", "B", and "C".
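One possible in-memory shape for such a vocabulary information base entry is sketched below. The class `VocabEntry`, the function `attribute_set`, and the entry name "Xiao Zhang" are illustrative assumptions; only the three attribute types and the song values "A", "B", "C" come from the example, and the remaining values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class VocabEntry:
    word: str       # the vocabulary information itself
    category: str   # category name
    attributes: dict = field(default_factory=dict)  # attribute type -> values


# Values written "<unspecified>" are placeholders, not data from the text.
VOCAB_BASE = [
    VocabEntry("Xiao Zhang", "person name", {
        "birthday": ["<unspecified>"],
        "song": ["A", "B", "C"],
        "girlfriend": ["<unspecified>"],
    }),
]


def attribute_set(word, base=VOCAB_BASE):
    """Return (category, attributes) for every base entry matching `word`."""
    return [(e.category, e.attributes) for e in base if e.word == word]


matches = attribute_set("Xiao Zhang")
```

A word that is absent from the base simply yields an empty list, so the caller can skip it.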
The rule template in this embodiment includes: a rule part composed of at least attribute types, and an attribute value constraint relational expression.
The attribute value constraint relational expression may be a relational expression obtained by combining various terms defined in the rule section by using at least one preset expression keyword.
Wherein, the expression keyword may be as shown in table 1:
[Table image: expression keywords]
TABLE 1
In Table 1, the function hold_word_func (abbreviated hw) obtains the hole word at position relatePos relative to the idx-th regular item, where relatePos = -1 denotes the left side and relatePos = 1 denotes the right side. Note that idx is numbered over all regular items. For example, hw(2, -1) denotes the left hole word of the 2nd regular item.
The function item_word_func (abbreviated iw) obtains the original word or the concept of the idx-th regular item. Note that idx is numbered over all regular items. For example, iw(1, word) obtains the original word of the 1st regular item.
The function NE_attribute_key_func (abbreviated nak) obtains all attribute keys of the idx-th NE (named entity). Note that idx is numbered over NEs only. For example, nak(1) obtains all attribute keys of the 1st NE.
The function ne_atterb_value_func (abbreviated nav) obtains the attribute value of the idx-th NE whose key is key. Note that idx is numbered over NEs only.
For example, nav(1, singer) obtains the attribute value corresponding to singer for the 1st NE.
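The three lookup functions iw, nak and nav can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the `Item` class and the explicit `items` argument are hypothetical, the sample data mirror the example of this embodiment with illustrative names, empty lists stand in for values the text does not give, and hw is omitted because the text does not define how hole words are stored.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    word: str             # original word from the text
    concept: str          # category / concept label
    is_ne: bool = False   # whether the item is a named entity (NE)
    attributes: dict = field(default_factory=dict)


def iw(items, idx, kind="word"):
    """idx-th regular item (1-based, counted over all items): word or concept."""
    item = items[idx - 1]
    return item.word if kind == "word" else item.concept


def _ne(items, idx):
    """idx-th named entity (1-based, counted over NE items only)."""
    return [it for it in items if it.is_ne][idx - 1]


def nak(items, idx):
    """All attribute keys of the idx-th NE."""
    return list(_ne(items, idx).attributes)


def nav(items, idx, key):
    """Attribute values stored under `key` for the idx-th NE (empty if absent)."""
    return _ne(items, idx).attributes.get(key, [])


items = [
    Item("Xiao Zhang", "singer", True,
         {"birthday": [], "song": ["A", "B", "C"], "girlfriend": []}),
    Item("girlfriend", "song title", True, {"singer": ["Wang", "Li"]}),
]
```

With these items, `nav(items, 2, "singer")` returns the singer values of the second NE, and `nak(items, 1)` returns the attribute keys of the first.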
Correspondingly, in this embodiment, the selecting at least one rule template based on the at least two pieces of vocabulary information includes:
respectively acquiring category names of the at least two vocabulary information based on the vocabulary information base;
determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information;
and acquiring at least one rule template from the rule template library based on the domain name.
After at least two pieces of vocabulary information are obtained, the category names corresponding to the vocabulary information are selected from the vocabulary information base respectively, and the corresponding domain names are then selected from the vocabulary information base.
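The chain from vocabulary information to category names, domain names and rule templates can be sketched with plain dictionaries. All table contents below (`CATEGORY_OF`, `DOMAIN_OF`, `RULE_LIBRARY`) and the function name `select_templates` are hypothetical illustrations; in particular, the mapping of categories to domains is an assumption.

```python
# Hypothetical vocabulary information base: word -> category names.
CATEGORY_OF = {
    "Xiao Zhang": ["singer", "person name"],
    "girlfriend": ["song title", "general vocabulary"],
}
# Hypothetical category name -> domain name mapping.
DOMAIN_OF = {
    "singer": "music",
    "song title": "music",
    "person name": "question and answer",
    "general vocabulary": "question and answer",
}
# Hypothetical rule template library: domain name -> rule templates.
RULE_LIBRARY = {
    "music": ["rule template I"],
    "question and answer": ["rule template II"],
}


def select_templates(words):
    """Map words to categories, categories to domains, domains to templates."""
    domains = []
    for word in words:
        for category in CATEGORY_OF.get(word, []):
            domain = DOMAIN_OF[category]
            if domain not in domains:  # keep first-seen order, no duplicates
                domains.append(domain)
    return {d: RULE_LIBRARY.get(d, []) for d in domains}


selected = select_templates(["Xiao Zhang", "girlfriend"])
```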
In addition, in this embodiment, every attribute value constraint relational expression is a truth-valued expression. Each expression must be delimited by "{" and "}". Expression connectors: &, |. An expression is composed of: operation element + operator + operation element. Operation elements: character strings, functions, character string arrays. Operators: =, !=, in, nin.
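A small evaluator for such truth-valued expressions might look like the sketch below. It works on operands that have already been resolved to strings or string arrays, and it represents expressions as Python tuples and dicts instead of parsing the brace syntax. The operator set, the connector symbols, and the rule that a string array "equals" a string when it contains it are assumptions made for illustration.

```python
def compare(left, op, right):
    """Apply one operator to two already-resolved operation elements."""
    if op == "=":
        # Assumption: a string array equals a string when it contains it.
        return right in left if isinstance(left, list) else left == right
    if op == "!=":
        return not compare(left, "=", right)
    if op == "in":
        return left in right
    if op == "nin":
        return left not in right
    raise ValueError(f"unknown operator: {op}")


def evaluate(expr):
    """Evaluate a leaf (left, op, right) or a {"connector", "parts"} node."""
    if isinstance(expr, dict):
        results = [evaluate(part) for part in expr["parts"]]
        return all(results) if expr["connector"] == "&" else any(results)
    left, op, right = expr
    return compare(left, op, right)


# "singer" is not an attribute key, OR the singer values contain the word.
music = {"connector": "|", "parts": [
    ("singer", "nin", ["singer"]),
    (["Wang", "Li"], "=", "Xiao Zhang"),
]}
# The word "girlfriend" is one of the attribute keys.
question = ("girlfriend", "in", ["birthday", "song", "girlfriend"])
```

Here `evaluate(music)` is false because both parts fail, while `evaluate(question)` is true.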
Therefore, with the above scheme, vocabulary information can be extracted from the text information, the attribute values of the vocabulary information can be acquired, and a rule template for the vocabulary information can be acquired; the meaning corresponding to the text information is then obtained by using the attribute values of the vocabulary information together with the rule part and the attribute value constraint relational expression in the rule template. The meaning of the text information can therefore be analyzed in combination with the attribute values of the vocabulary information, which addresses the limited semantic analysis accuracy of the prior art, where no constraint relation analysis is performed on attribute values.
An embodiment of the present invention provides an information analysis method, as shown in fig. 1, the method includes:
step 101: acquiring at least two vocabulary information from the text information;
step 102: acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
step 103: selecting at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
step 104: and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.
Here, the obtaining of the at least two vocabulary information from the text information may be to separate the text information according to a preset separation manner to obtain the at least two vocabulary information;
wherein, the preset separation mode can be as follows: and matching a preset word bank with the text information, and selecting the vocabulary matched with the word bank as finally acquired vocabulary information.
Preferably, if the text information has a plurality of separation modes, the separation modes can be selected one by one for division, the text information is divided according to the selected separation mode, and the subsequent processing flow is completed.
Preferably, before executing step 101, the present embodiment may further set parameters such as a domain name, a category name of vocabulary information, and an attribute type, and specifically, the method further includes:
setting a vocabulary information base, wherein the vocabulary information base comprises at least one vocabulary information, together with the category name, attribute type and attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name;
and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
The manner of acquiring the attribute sets of the at least two vocabulary information may include:
extracting vocabulary information to be matched from the at least two vocabulary information respectively;
comparing the vocabulary information to be matched with all vocabulary information in a preset vocabulary information base, selecting the matched vocabulary information, and taking the category name, the attribute type and the attribute value of the matched vocabulary information as the category name, the attribute type and the attribute value of the vocabulary information to be matched.
The rule template in this embodiment includes: a rule part composed of at least attribute types, and an attribute value constraint relational expression.
The attribute value constraint relational expression may be a relational expression obtained by combining various terms defined in the rule section by using at least one preset expression keyword. Wherein, the expression key words can be shown in table 1.
Correspondingly, in this embodiment, the selecting at least one rule template based on the at least two pieces of vocabulary information includes:
respectively acquiring category names of the at least two vocabulary information based on the vocabulary information base;
determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information;
and acquiring at least one rule template from the rule template library based on the domain name.
After at least two pieces of vocabulary information are obtained, the category names corresponding to the vocabulary information are selected from the vocabulary information base respectively, and the corresponding domain names are then selected from the vocabulary information base.
Preferably, in this embodiment, the obtaining the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two pieces of vocabulary information, and the rule part and the attribute value constraint relational expression in the at least one rule template includes:
extracting rule templates from the at least one rule template one by one;
judging whether the attribute types in the attribute sets of the at least two vocabulary information are matched with the extracted rule part of the rule template or not;
if the two are matched, acquiring the attribute constraint relation of the at least two vocabulary information by using at least one attribute value in the attribute set of the at least two vocabulary information and the extracted attribute value constraint relation expression in the rule template;
and determining the meaning corresponding to the text information based on the attribute constraint relation of the at least two pieces of vocabulary information.
In this embodiment, the attribute constraint relationship between the two pieces of vocabulary information may include: valid and invalid.
Further, the determining the meaning corresponding to the text information based on the attribute constraint relationship of the at least two vocabulary information includes:
if the attribute constraint relationship of the at least two vocabulary information represents that the attribute constraint relationship is effective, storing the rule template and the vocabulary information to a matching list;
processing the rule template and the vocabulary information stored in the matching list, and selecting a target rule template and the vocabulary information corresponding to the target rule template;
and determining the meaning corresponding to the text information based on the selected target rule template and the corresponding vocabulary information.
The processing of the rule template and the vocabulary information stored in the matching list may be: when the matching list comprises a plurality of rule templates and vocabulary information, the association degree of the rule templates and the text information can be further analyzed, and one with the highest association degree is selected as a target rule template and the corresponding vocabulary information; the relevancy can be the relevancy of the vocabulary information and the text information contained in the records of the matching list;
or, selecting one record containing the most vocabulary information as the target rule template and the corresponding vocabulary information.
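The second selection strategy, choosing the record that contains the most vocabulary information, can be sketched in a few lines. The record layout (a pair of template name and word list) and the function name `pick_target` are assumptions for illustration; on a tie, `max` keeps the earliest record.

```python
def pick_target(match_list):
    """Return the (template, words) record with the most words, or None."""
    if not match_list:
        return None
    return max(match_list, key=lambda record: len(record[1]))


# Hypothetical matching list with two candidate records.
match_list = [
    ("template X", ["w1", "w2"]),
    ("template Y", ["w1", "w2", "w3"]),
]
target = pick_target(match_list)
```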
The scheme provided by this embodiment is explained as follows. The input of the algorithm in this example is a sentence text string; the output is a rule template and vocabulary information. Preparation: establish a vocabulary information base and a rule template library. The algorithm flow, as shown in fig. 2, includes:
step 21: acquiring attribute information of each vocabulary based on the vocabulary information of the text information, and selecting one domain from a plurality of domains;
step 22: acquiring a rule template of the current domain; if the rule template is empty, returning to step 21 to continue with the next vertical domain; otherwise, executing step 23;
step 23: judging whether the vocabulary information matches the rule part of the rule template; if so, jumping to step 24; otherwise, returning to step 22 to continue with the next rule template;
step 24: judging whether the attribute value constraint relational expression of the rule template returns true; if so, considering the constraint relation valid and storing the rule template and the vocabulary information into a matching list; then continuing with the next rule template;
step 25: performing disambiguation on the rule templates in the matching list, outputting the most suitable rule template and vocabulary information, and determining the meaning of the text information based on them.
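Steps 21 to 25 can be sketched as a pair of nested loops. This is a simplified illustration, not the patented implementation: rule parts are compared as plain category lists (connecting words are ignored), constraints are ordinary Python callables, disambiguation picks the record with the most words, and all names and sample data are hypothetical (empty lists stand in for unspecified attribute values).

```python
def analyze(tagged_by_domain, rule_library):
    """Return the best (template name, tagged words) record, or None."""
    match_list = []
    for domain, tagged in tagged_by_domain.items():        # step 21
        categories = [category for _, category, _ in tagged]
        for template in rule_library.get(domain, []):      # step 22
            if categories != template["rule_part"]:        # step 23
                continue
            if template["constraint"](tagged):             # step 24
                match_list.append((template["name"], tagged))
    if not match_list:
        return None
    # Step 25 (simplified disambiguation): keep the record with most words.
    return max(match_list, key=lambda record: len(record[1]))


zhang = {"birthday": [], "song": ["A", "B", "C"], "girlfriend": []}
tagged_by_domain = {
    "music": [("Xiao Zhang", "singer", zhang),
              ("girlfriend", "song title", {"singer": ["Wang", "Li"]})],
    "question and answer": [("Xiao Zhang", "person name", zhang),
                            ("girlfriend", "general vocabulary", {})],
}
rule_library = {
    "music": [{
        "name": "rule template I",
        "rule_part": ["singer", "song title"],
        # Second word has no singer attribute, or its singers include word 1.
        "constraint": lambda t: "singer" not in t[1][2]
                                or t[0][0] in t[1][2]["singer"],
    }],
    "question and answer": [{
        "name": "rule template II",
        "rule_part": ["person name", "general vocabulary"],
        # Second word is one of the attribute keys of the first word.
        "constraint": lambda t: t[1][0] in t[0][2],
    }],
}
result = analyze(tagged_by_domain, rule_library)
```

With this sample data only the question and answer template passes its constraint, so it is the record returned.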
An example based on the above description is provided below. Assume that the input voice information is converted into the text information "Xiao Zhang's girlfriend". As shown in fig. 3, the text information is divided into two pieces of vocabulary information, namely "Xiao Zhang" and "girlfriend". "Xiao Zhang" is the first vocabulary information, and its attribute set has three attribute types: birthday, song, and girlfriend. Each of the three attribute types has at least one attribute value; for example, the attribute type song has three attribute values: "A", "B", and "C". "Girlfriend" is the second vocabulary information; its category names are song title and general vocabulary, and its attribute set has the attribute type singer, whose attribute values include "Wang" and "Li".
Supposing that the domain name can be determined to comprise a music domain and a question and answer domain based on the first vocabulary information; the music field corresponds to a rule template I, and the question and answer field corresponds to a rule template II;
Rule template I is shown in Table 2:
Rule: [singer/NE] + of + [song title/NE]
Attribute value constraint relational expression: {{singer nin nak(2)} | {nav(2, singer) = iw(1, word)}}
TABLE 2
Rule template II is shown in Table 3:
Rule: [person name/NE] + of + [general vocabulary]
Attribute value constraint relational expression: {{iw(2, word) in nak(1)}}
TABLE 3
The algorithm analysis process is as follows:
Music domain processing:
Vocabulary information: [Xiao Zhang/singer] [of/] [girlfriend/song title]
The information of [Xiao Zhang/singer] and the information of [girlfriend/song title] are obtained.
For rule template I: [singer/NE] + of + [song title/NE]
Attribute constraint expression: {{singer nin nak(2)} | {nav(2, singer) = iw(1, word)}}
The engine interprets this expression as: the second NE vocabulary has no singer attribute, or the singer attribute value of the second NE vocabulary is equal to the first NE vocabulary.
In this example, the singer attribute value of the second NE vocabulary [girlfriend/song title] is "Wang; Li", which does not contain the first NE vocabulary "Xiao Zhang"; the constraint expression therefore returns false, and the rule template does not take effect.
Question and answer domain processing:
Question and answer domain word segmentation: [Xiao Zhang/person name] [of/] [girlfriend/general vocabulary]
The information of [Xiao Zhang] is obtained.
For rule template II: [person name/NE] + of + [general vocabulary] {{iw(2, word) in nak(1)}}
The attribute constraint expression engine interprets this expression as: the second word [girlfriend] is in the attribute list of the first NE vocabulary, so the constraint expression is valid. The rule and the word segmentation are saved.
Algorithm output: [person name/NE] + of + [general vocabulary]: Xiao Zhang/person name, of/, girlfriend/general vocabulary.
Therefore, with the above scheme, vocabulary information can be extracted from the text information, the attribute values of the vocabulary information can be acquired, and a rule template for the vocabulary information can be acquired; the meaning corresponding to the text information is then obtained by using the attribute values of the vocabulary information together with the rule part and the attribute value constraint relational expression in the rule template. The meaning of the text information can therefore be analyzed in combination with the attribute values of the vocabulary information, which addresses the limited semantic analysis accuracy of the prior art, where no constraint relation analysis is performed on attribute values.
An embodiment of the present invention provides an electronic device, as shown in fig. 4, including:
a text information processing unit 41 configured to acquire at least two vocabulary information from the text information;
an attribute obtaining unit 42, configured to obtain an attribute set of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
a template obtaining unit 43, configured to select at least one rule template based on the at least two vocabulary information; wherein, the rule template comprises: a rule part at least composed of attribute types and an attribute value constraint relational expression;
the analysis unit 44 is configured to obtain the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two pieces of vocabulary information, and a rule portion and an attribute value constraint relational expression in the at least one rule template.
Preferably, in addition to the units shown in fig. 4, the electronic device provided in this embodiment may further include, as shown in fig. 5, an audio acquisition unit 45, configured to acquire voice information, convert the voice information into text information, and send the text information to the text information processing unit 41. The audio acquisition unit may be a Microphone (MIC). The voice information may be converted into text information by preset voice-to-text conversion software.
The at least two vocabulary information may be acquired from the text information by separating the text information according to a preset separation mode. The preset separation mode may be: matching a preset word bank against the text information, and taking the words that match the word bank as the finally acquired vocabulary information.
Preferably, if the text information converted from the voice information input by the user can be separated in several ways, the separation modes may be selected one by one; for each selected mode, the text information is divided accordingly and the subsequent processing flow is carried out.
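As an illustration of matching a word bank against text, the sketch below uses greedy forward maximum matching. The embodiment does not state which matching strategy is used, so this technique is substituted in as an assumption; the function name `segment` and the sample word bank are likewise hypothetical.

```python
def segment(text, lexicon):
    """Greedy forward maximum matching.

    At each position, take the longest word-bank entry that matches;
    characters that start no entry are skipped.
    """
    words = []
    max_len = max((len(entry) for entry in lexicon), default=0)
    pos = 0
    while pos < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in lexicon:
                words.append(candidate)
                pos += length
                break
        else:
            pos += 1  # nothing in the word bank starts here
    return words


# Hypothetical word bank built from the question-and-answer example.
word_bank = {"Small piece", "of", "girlfriend"}
result = segment("Small piece of girlfriend", word_bank)
```

Other strategies (backward matching, or enumerating every possible division and trying each in turn) would fit the description equally well; the greedy variant is shown only because it is short.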
Preferably, as shown in fig. 6, in addition to the structures shown in fig. 4 and fig. 5, the electronic device further includes a storage unit 46, configured to: set a vocabulary information base, which comprises at least one vocabulary information together with the category name, attribute type and attribute value corresponding to that vocabulary information; set a corresponding domain name for each category name; and set a rule template library, which comprises at least one rule template corresponding to the domain name.
The attribute acquiring unit 42 is configured to: extract the vocabulary information to be matched from the at least two vocabulary information respectively; compare the vocabulary information to be matched with all vocabulary information in the preset vocabulary information base and select the matched vocabulary information; and take the category name, the attribute type and the attribute value of the matched vocabulary information as the category name, the attribute type and the attribute value of the vocabulary information to be matched.
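The lookup performed by the attribute acquiring unit can be sketched as below. The `VocabularyEntry` class, the list-based base, and the function name are hypothetical; the two "girlfriend" entries and the singer value "queen; prune" are taken from the examples in the description.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyEntry:
    word: str
    category: str
    # attribute type -> attribute value
    attributes: dict = field(default_factory=dict)


# Hypothetical vocabulary information base. The same text may appear
# under more than one category name.
VOCABULARY_BASE = [
    VocabularyEntry("girlfriend", "song title", {"singer": ["queen", "prune"]}),
    VocabularyEntry("girlfriend", "general vocabulary"),
]


def get_attribute_sets(word, base=VOCABULARY_BASE):
    """Return every base entry whose text equals the word to be matched."""
    return [entry for entry in base if entry.word == word]


matches = get_attribute_sets("girlfriend")
```

Returning all matching entries rather than the first one keeps both readings of an ambiguous word available, so that the later constraint check can decide between them.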
The rule template in this embodiment includes: a rule part composed of at least attribute types, and an attribute value constraint relational expression.
The attribute value constraint relational expression may be a relational expression obtained by combining the words defined in the rule part using at least one preset expression keyword. The expression keywords are shown in table 1.
Correspondingly, in this embodiment, the template obtaining unit 43 is configured to: obtain the category names of the at least two pieces of vocabulary information respectively, based on the vocabulary information base; determine the domain name corresponding to the at least two vocabulary information based on their category names; and acquire at least one rule template from the rule template library based on the domain name.
That is, after at least two pieces of vocabulary information are obtained, the category name corresponding to each piece of vocabulary information is selected from the vocabulary information base, and the corresponding domain name is then selected from the vocabulary information base as well.
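The chain from category name to domain name to rule templates can be sketched with two lookup tables. All table contents, the domain labels, and the template identifiers below are hypothetical placeholders chosen for illustration.

```python
# Hypothetical mapping from category name to domain name.
CATEGORY_TO_DOMAIN = {
    "song title": "music",
    "singer": "music",
    "name": "question and answer",
}

# Hypothetical rule template library, keyed by domain name.
# Identifiers stand in for real rule templates.
TEMPLATES_BY_DOMAIN = {
    "music": ["music_rule_1", "music_rule_2"],
    "question and answer": ["qa_rule_1"],
}


def select_templates(category_names):
    """Collect the rule templates of every domain reached from the categories."""
    domains = []
    for category in category_names:
        domain = CATEGORY_TO_DOMAIN.get(category)
        if domain is not None and domain not in domains:
            domains.append(domain)

    templates = []
    for domain in domains:
        templates.extend(TEMPLATES_BY_DOMAIN.get(domain, []))
    return templates


qa_templates = select_templates(["name", "general vocabulary"])
```

Categories with no domain entry (here "general vocabulary") simply contribute nothing, so the templates are determined by the categories that do map to a domain.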
In addition, in the present embodiment, every attribute value constraint relational expression is a truth-value expression. An expression must be delimited by "{" and "}". Expression connectors: &, |. An expression has the form: operand + operator + operand. Operands: character strings, functions, character string arrays. Operators: ==, !=, in, nin.
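A truth-value expression of the form operand, operator, operand can be evaluated with a small table of operators, as sketched below. The operator set used here (==, !=, in, nin), the left-to-right handling of the & and | connectors, and all function names are assumptions made for illustration.

```python
# Assumed operator set: equality, inequality, membership, non-membership.
OPERATORS = {
    "==": lambda left, right: left == right,
    "!=": lambda left, right: left != right,
    "in": lambda left, right: left in right,
    "nin": lambda left, right: left not in right,
}


def evaluate_term(left, operator, right):
    """Evaluate one 'operand operator operand' term to True or False."""
    return OPERATORS[operator](left, right)


def evaluate_expression(terms, connectors):
    """Evaluate terms joined by '&' or '|', left to right, without precedence.

    terms: list of (left, operator, right) tuples
    connectors: list of '&' / '|' placed between consecutive terms
    """
    result = evaluate_term(*terms[0])
    for connector, term in zip(connectors, terms[1:]):
        value = evaluate_term(*term)
        if connector == "&":
            result = result and value
        elif connector == "|":
            result = result or value
        else:
            raise ValueError(f"unknown connector: {connector}")
    return result


# Song-title example from the description: the singer value "queen; prune"
# does not contain the first NE word, so the term is false.
singers = ["queen", "prune"]
song_constraint = evaluate_term("booklet", "in", singers)
```

Operands are passed in already resolved; in a fuller engine, functions appearing as operands would first be replaced by the strings or string arrays they return.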
Preferably, in this embodiment, the analysis unit is configured to extract rule templates from the at least one rule template one by one; judging whether the attribute types in the attribute sets of the at least two vocabulary information are matched with the extracted rule part of the rule template or not; if the two are matched, acquiring the attribute constraint relation of the at least two vocabulary information by using at least one attribute value in the attribute set of the at least two vocabulary information and the extracted attribute value constraint relation expression in the rule template; and determining the meaning corresponding to the text information based on the attribute constraint relation of the at least two pieces of vocabulary information.
In this embodiment, the attribute constraint relationship between the two pieces of vocabulary information may include: valid and invalid.
Further, the analysis unit is configured to: store the rule template and the vocabulary information in a matching list if the attribute constraint relationship of the at least two pieces of vocabulary information is valid; process the rule templates and the vocabulary information stored in the matching list, and select a target rule template and the vocabulary information corresponding to the target rule template; and determine the meaning corresponding to the text information based on the selected target rule template and the corresponding vocabulary information.
The rule templates and the vocabulary information stored in the matching list may be processed as follows: when the matching list contains a plurality of rule templates and vocabulary information, the association degree between each rule template and the text information is further analyzed, and the record with the highest association degree is selected as the target rule template and the corresponding vocabulary information; the association degree may be the association degree between the vocabulary information contained in a record of the matching list and the text information;
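The two-stage check (rule part first, constraint second) and the collection of valid records can be sketched as follows. The `Word` and `RuleTemplate` classes, the representation of the rule part as required attribute types per slot, and the template names are hypothetical; the singer value comes from the song-title example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Word:
    text: str
    attributes: dict  # attribute type -> attribute value


@dataclass
class RuleTemplate:
    name: str
    rule_part: list        # per slot: set of attribute types the word must have
    constraint: Callable   # takes the list of words, returns True (valid) or False


def rule_part_matches(words, template):
    """Check that each word carries the attribute types its slot requires."""
    if len(words) != len(template.rule_part):
        return False
    return all(
        required <= set(word.attributes)
        for word, required in zip(words, template.rule_part)
    )


def build_matching_list(words, templates):
    """Keep (template, words) records whose rule part matches and whose constraint is valid."""
    matching_list = []
    for template in templates:
        if rule_part_matches(words, template) and template.constraint(words):
            matching_list.append((template, words))
    return matching_list


singer_of_song = RuleTemplate(
    "singer_of_song",
    [set(), {"singer"}],
    lambda ws: ws[0].text in ws[1].attributes["singer"],
)
needs_album = RuleTemplate("needs_album", [set(), {"album"}], lambda ws: True)

song = Word("girlfriend", {"singer": ["queen", "prune"]})
valid = build_matching_list([Word("queen", {}), song], [singer_of_song, needs_album])
invalid = build_matching_list([Word("booklet", {}), song], [singer_of_song, needs_album])
```

The constraint is only evaluated after the rule part matches, so a constraint may safely read attributes that the rule part has already required.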
or, selecting one record containing the most vocabulary information as the target rule template and the corresponding vocabulary information.
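Both selection strategies can be sketched over a list of records. The record layout (template name, list of words) and the use of character coverage as the association degree are assumptions; the description does not define how the association degree is computed.

```python
def coverage(words, text):
    """Assumed association degree: share of the text's characters covered by the words."""
    if not text:
        return 0.0
    covered = sum(len(word) for word in words if word in text)
    return covered / len(text)


def select_by_association(matching_list, text):
    """Pick the record whose words have the highest association degree with the text."""
    if not matching_list:
        return None
    return max(matching_list, key=lambda record: coverage(record[1], text))


def select_by_word_count(matching_list):
    """Pick the record that contains the most vocabulary information."""
    if not matching_list:
        return None
    return max(matching_list, key=lambda record: len(record[1]))


# Hypothetical matching list: (rule template name, matched words).
records = [
    ("rule_a", ["girlfriend"]),
    ("rule_b", ["queen", "girlfriend"]),
]
target = select_by_word_count(records)
```

With ties, `max` returns the first record in list order, so the order in which rule templates were tried acts as the tie-breaker in this sketch.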
Thus, with the above scheme, vocabulary information can be extracted from the text information, and both the attribute values of the vocabulary information and the rule templates for the vocabulary information can be obtained; the meaning corresponding to the text information is then obtained by using the attribute values of the vocabulary information together with the rule part and the attribute value constraint relational expression in the rule template. In this way, the meaning of the voice information can be analyzed in combination with the attribute values of the vocabulary information, which solves the prior-art problem that semantic analysis accuracy cannot be improved because no constraint relation analysis is performed on the attribute values.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may be separately used as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
This embodiment provides specific hardware based on the above device embodiment. As shown in fig. 7, the apparatus includes a processor 72, a storage medium 74, and at least one external communication interface 71; the processor 72, the storage medium 74, and the external communication interface 71 are all connected by a bus 73. The processor 72 may be a microprocessor, a central processing unit, a digital signal processor, a programmable logic array, or another electronic component with processing functions. The storage medium stores computer executable code.
The hardware may be a server. The processor, when executing the computer executable code, is capable of at least: acquiring at least two vocabulary information from the text information; acquiring attribute sets of the at least two vocabulary information; selecting at least one rule template based on the at least two vocabulary information; and acquiring the meaning corresponding to the text information by using at least one attribute value in the attribute set of the at least two vocabulary information, and a rule part and an attribute value constraint relational expression in the at least one rule template.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. An information analysis method, characterized in that the method comprises:
acquiring at least two vocabulary information from the text information;
acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
selecting at least one rule template based on the at least two vocabulary information; wherein the rule template comprises: a rule part composed at least of attribute types, and an attribute value constraint relational expression, wherein the attribute value constraint relational expression is obtained by combining the words defined in the rule part using at least one preset expression keyword;
performing the following processing for each rule template of the at least one rule template to obtain a matching list storing a plurality of rule templates and corresponding vocabulary information: when the attribute types in the attribute sets of the at least two pieces of vocabulary information are matched with the rule part of the rule template, acquiring the attribute constraint relation of the at least two pieces of vocabulary information by using at least one attribute value in the attribute sets of the at least two pieces of vocabulary information and the extracted attribute value constraint relation expression in the rule template; when the attribute constraint relation of the at least two vocabulary information is valid, storing the rule template and the vocabulary information to a matching list;
selecting a rule template with the highest association degree with the text information or a rule template with the most corresponding vocabulary information from the matching list as a target rule template, and acquiring the vocabulary information corresponding to the target rule template;
and determining the meaning corresponding to the text information based on the target rule template and the vocabulary information corresponding to the target rule template.
2. The method of claim 1, further comprising:
setting a vocabulary information base, wherein the vocabulary information base comprises at least one vocabulary information and comprises a category name, an attribute type and an attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name;
and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
3. The method of claim 2, wherein selecting at least one rule template based on the at least two lexical information comprises:
respectively acquiring category names of the at least two vocabulary information based on the vocabulary information base;
determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information;
and acquiring at least one rule template from the rule template library based on the domain name.
4. An electronic device, comprising:
the text information processing unit is used for acquiring at least two vocabulary information from the text information;
the attribute acquisition unit is used for acquiring an attribute set of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
the template acquisition unit is used for selecting at least one rule template based on the at least two vocabulary information; wherein the rule template comprises: a rule part composed at least of attribute types, and an attribute value constraint relational expression, wherein the attribute value constraint relational expression is obtained by combining the words defined in the rule part using at least one preset expression keyword;
an analysis unit, configured to perform the following processing for each rule template of the at least one rule template to obtain a matching list storing a plurality of rule templates and corresponding vocabulary information: when the attribute types in the attribute sets of the at least two pieces of vocabulary information are matched with the rule part of the rule template, acquiring the attribute constraint relation of the at least two pieces of vocabulary information by using at least one attribute value in the attribute sets of the at least two pieces of vocabulary information and the extracted attribute value constraint relation expression in the rule template; when the attribute constraint relation of the at least two vocabulary information is valid, storing the rule template and the vocabulary information to a matching list;
selecting a rule template with the highest association degree with the text information or a rule template with the most corresponding vocabulary information from the matching list as a target rule template, and acquiring the vocabulary information corresponding to the target rule template;
and determining the meaning corresponding to the text information based on the target rule template and the vocabulary information corresponding to the target rule template.
5. The electronic device of claim 4, further comprising:
the storage unit is used for setting a vocabulary information base, and the vocabulary information base comprises at least one vocabulary information and comprises a category name, an attribute type and an attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name; and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
6. The electronic device according to claim 5, wherein the template obtaining unit is configured to obtain category names of the at least two pieces of vocabulary information, respectively, based on the vocabulary information base in the storage unit; determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information; and acquiring at least one rule template from the rule template library based on the domain name.
7. An electronic device, comprising:
a processor and a memory for storing a computer program capable of running on the processor; wherein,
the processor, when executing the computer program, is configured to execute the information analysis method according to any one of claims 1 to 3.
8. A computer-readable storage medium having stored thereon computer-executable instructions for execution by a processor to perform:
acquiring at least two vocabulary information from the text information;
acquiring attribute sets of the at least two vocabulary information; the attribute set comprises at least one attribute type, and each attribute type comprises an attribute value;
selecting at least one rule template based on the at least two vocabulary information; wherein the rule template comprises: a rule part composed at least of attribute types, and an attribute value constraint relational expression, wherein the attribute value constraint relational expression is obtained by combining the words defined in the rule part using at least one preset expression keyword;
performing the following processing for each rule template of the at least one rule template to obtain a matching list storing a plurality of rule templates and corresponding vocabulary information: when the attribute types in the attribute sets of the at least two pieces of vocabulary information are matched with the rule part of the rule template, acquiring the attribute constraint relation of the at least two pieces of vocabulary information by using at least one attribute value in the attribute sets of the at least two pieces of vocabulary information and the extracted attribute value constraint relation expression in the rule template; when the attribute constraint relation of the at least two vocabulary information is valid, storing the rule template and the vocabulary information to a matching list;
selecting a rule template with the highest association degree with the text information or a rule template with the most corresponding vocabulary information from the matching list as a target rule template, and acquiring the vocabulary information corresponding to the target rule template;
and determining the meaning corresponding to the text information based on the target rule template and the vocabulary information corresponding to the target rule template.
9. The computer-readable storage medium of claim 8, wherein the computer-executable instructions are further configured to, when executed by the processor, perform:
setting a vocabulary information base, wherein the vocabulary information base comprises at least one vocabulary information and comprises a category name, an attribute type and an attribute value corresponding to the vocabulary information; setting a corresponding domain name for each category name;
and setting a rule template library, wherein the rule template library comprises at least one rule template corresponding to the domain name.
10. The computer-readable storage medium of claim 9, wherein the computer-executable instructions are further configured to, when executed by the processor, perform:
respectively acquiring category names of the at least two vocabulary information based on the vocabulary information base;
determining a domain name corresponding to the at least two vocabulary information based on the category names of the at least two vocabulary information;
and acquiring at least one rule template from the rule template library based on the domain name.
CN201510591872.4A 2015-09-16 2015-09-16 Information analysis method and electronic equipment Active CN106547753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510591872.4A CN106547753B (en) 2015-09-16 2015-09-16 Information analysis method and electronic equipment

Publications (2)

Publication Number Publication Date
CN106547753A CN106547753A (en) 2017-03-29
CN106547753B true CN106547753B (en) 2021-12-10

Family

ID=58362722

Country Status (1)

Country Link
CN (1) CN106547753B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant