
CN109740157B - Method and device for determining label of working individual and computer storage medium


Info

Publication number
CN109740157B
Authority
CN
China
Prior art keywords
verb
text data
label
working
verbs
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811637972.6A
Other languages
Chinese (zh)
Other versions
CN109740157A (en)
Inventor
陈凤杰
任真
黄扬
单若诚
周星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Xiaoai Robot Technology Co ltd
Original Assignee
Guizhou Xiaoai Robot Technology Co ltd


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Machine Translation (AREA)

Abstract

A method, apparatus, and computer storage medium for determining a label of a working individual, the method comprising: acquiring the content of text data; extracting working individual names, proper nouns, and verbs from the text data, where working individuals include staff and departments; determining verbs matched with the proper nouns as labels; and matching each label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. With this scheme, the abilities of working individuals can be determined more objectively and with higher accuracy.

Description

Method and device for determining label of working individual and computer storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a method and apparatus for determining a label of a working individual, and a computer storage medium.
Background
Today, as various enterprises are developing more and more rapidly, the number of individuals working within the enterprise is increasing, and thus, in order to better manage the enterprise, it is important to know the abilities of various employees within the enterprise.
In the prior art, since the manager of an enterprise cannot personally interact with all employees in the enterprise, the ability of each individual is generally summarized by each employee's direct superior.
However, when abilities are summarized only manually, the determination is often too subjective and insufficiently objective and accurate, which makes it difficult to manage the enterprise better through knowledge of each employee's abilities.
Disclosure of Invention
The technical problem solved by the application is that the determination of a working individual's ability is too subjective, insufficiently objective, and inaccurate.
In order to solve the above technical problem, an embodiment of the present application provides a method for determining a label of a working individual, including: acquiring the content of text data; extracting working individual names, proper nouns, and verbs from the text data, where working individuals include staff and departments; determining verbs matched with the proper nouns as labels; and matching each label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
Optionally, acquiring the content of the text data includes: converting the text data into an html file format; and extracting the content of the text data in the html file format by means of a crawler.
Optionally, after extracting the name, proper noun and verb of the working person in the text data, the method further includes: constructing a work individual name word stock and a proper noun word stock according to the work individual names and proper nouns in the text data; and dividing the text data according to the name word stock of the working individual and the proper noun word stock, and filtering stop words in the text data after dividing the words.
Optionally, after extracting the name, proper noun and verb of the working person in the text data, the method further includes: constructing a verb word library according to verbs in the text data; screening out effective verbs in the verb word stock; and counting word frequency of the effective verbs in the verb word stock in the text data.
Optionally, screening out the valid verbs in the verb word stock includes: screening out the valid verbs in the verb word stock through a semantic analysis algorithm or a preset verb screening word stock.
Optionally, after the counting the word frequency of the valid verbs in the verb word stock in the text data, the method further includes: and splitting or merging the effective verbs in the text data according to the word frequency.
Optionally, determining the verb matched with the proper noun as the label includes: if a sentence in the text data contains a verb that has an association relationship with a proper noun in the same sentence, determining that the verb is a label matched with the proper noun.
Optionally, the matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data includes: in a certain sentence in the text data, if the names of the working individuals are in parallel relation, matching the labels in the sentence to all the names of the working individuals in the sentence; and if the names of the working individuals are mutually independent, matching the label in the sentence to the name of the working individual closest to the label in word number.
The application also provides a label determining device of the working individual, which comprises: the device comprises an acquisition unit, an extraction unit, a label determination unit and a label matching unit, wherein: the acquisition unit is used for acquiring the content of the text data; the extraction unit is used for extracting the name, proper noun and verb of the working person in the text data; the work individual includes: staff and departments; the label determining unit is used for determining verbs matched with proper nouns as labels; the label matching unit is used for matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
The application also provides a computer-readable storage medium storing computer instructions, wherein the computer-readable storage medium is a non-volatile storage medium or a non-transitory storage medium, and the computer instructions, when run, execute the steps of any of the above methods for determining a label of a working individual.
The application also provides an electronic device comprising a memory and a processor, wherein the memory stores computer instructions, and the processor, when running the computer instructions, executes the steps of any of the above methods for determining a label of a working individual.
Compared with the prior art, the technical scheme of the embodiment of the application has the following beneficial effects:
The content of text data is acquired; working individual names, proper nouns, and verbs are extracted from the text data; verbs matched with proper nouns are determined as labels; and each label is matched to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. In this scheme, only verbs that can be matched with proper nouns are used as labels, which filters out some invalid verbs and improves the accuracy of the selected labels. The positions of the labels and the working individual names in the text data constrain the matching process, which helps ensure that labels are matched to the correct working individuals. As a result, the abilities of working individuals are determined more objectively and with higher accuracy.
Drawings
FIG. 1 is a flowchart of a method for determining a label of a working individual according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an apparatus for determining a label of a working individual according to an embodiment of the present application.
Detailed Description
In the prior art, when the number of people in an enterprise is large, the manager of the enterprise may not be able to personally communicate with all employees, so the ability of each working individual is generally summarized by each employee's direct superior.
However, when abilities are summarized only manually, the determination is often too subjective and insufficiently objective and accurate, which makes it difficult to manage the enterprise better through knowledge of each employee's abilities.
In the embodiment of the application, the content of text data is acquired; working individual names, proper nouns, and verbs are extracted from the text data; verbs matched with proper nouns are determined as labels; and each label is matched to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. With this scheme, the abilities of working individuals are determined more objectively and with higher accuracy.
In order to make the above objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures are described in detail below.
Referring to FIG. 1, a method for determining a label of a working individual according to an embodiment of the present application includes the following specific steps:
Step S101, the content of the text data is acquired.
In implementations, the text data obtained may be text data within the enterprise that relates to the staff and the work content of the department to which the staff belongs.
In the embodiment of the application, the text data can be converted into an html file format, and the content of the text data in the html file format can then be extracted by means of a crawler.
In implementations, converting the text data into the html file format makes it convenient to extract the content with electronic equipment such as a computer, which effectively improves processing speed when the volume of text data is large.
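As an illustration of step S101, the following is a minimal sketch of extracting visible text from text data that has already been converted to HTML. It assumes Python's standard `html.parser` module as a stand-in for the crawler; the class name `TextExtractor`, the function `extract_content`, and the sample HTML are hypothetical and do not come from the application.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_content(html: str) -> list[str]:
    """Return the text chunks found in an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample document standing in for converted enterprise text data.
sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>The first department maintains the information system.</p>"
    "<p>The second department designs the product appearance.</p></body></html>"
)
paragraphs = extract_content(sample_html)
```

Each returned chunk corresponds to one text node, which gives later steps a paragraph-level unit to work with.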
Step S102, extracting working individual names, proper nouns, and verbs from the text data, where working individuals include staff and departments.
In a specific implementation, the proper noun may be an application name, a device name, or a keyword of a task within an enterprise. Such as an information system, an analog circuit, or a product appearance. The setting can be performed by a user according to the application scene.
In a specific implementation, the working individual names, proper nouns, and verbs in the text data can be extracted according to a word sense analysis algorithm and a corresponding word database.
In a specific implementation, by setting the word senses of the word objects to be extracted by the word sense analysis algorithm, the corresponding working individual names, proper nouns, and verbs can be extracted from the text data.
In a specific implementation, the word sense analysis algorithm may be the jieba word segmentation algorithm, or a corresponding word sense analysis algorithm may be set by the user according to different application scenarios.
In a specific implementation, the text data is searched, using a preset word database, for working individual names, proper nouns, and verbs that match entries in the word database.
In a specific implementation, the word database may be set by a user according to an application scenario.
In the embodiment of the application, after the names, proper nouns and verbs of the working individuals in the text data are extracted, a word stock of the names of the working individuals and a word stock of the proper nouns are constructed according to the names of the working individuals and the proper nouns in the text data; and dividing the text data according to the name word stock of the working individual and the proper noun word stock, and filtering stop words in the text data after dividing the words.
In a specific implementation, stop words are words that carry no actual meaning in the text data, e.g., interjections such as "oh". Filtering out the stop words before further data processing can effectively improve processing efficiency.
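The word segmentation and stop-word filtering described above can be sketched as follows. This is only an illustration on English text: it uses a greedy longest-match over whitespace-separated words as a stand-in for a real segmenter such as jieba, and the function name `segment`, the lexicon contents, and the stop-word list are all assumptions.

```python
def segment(text: str, lexicon: set[str], stop_words: set[str], max_len: int = 4) -> list[str]:
    """Split text into tokens, keeping multi-word lexicon entries whole.

    Greedy longest-match over words; a simplified stand-in for a real
    word segmentation algorithm, not the algorithm of the application.
    """
    words = text.lower().replace(",", " ").replace(".", " ").split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first so lexicon entries stay whole.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if n == 1 or phrase in lexicon:
                tokens.append(phrase)
                i += n
                break
    # Drop stop words after segmentation, as the text describes.
    return [t for t in tokens if t not in stop_words]


# Hypothetical word stocks built from names and proper nouns found in the text.
name_lexicon = {"first department", "second department"}
proper_noun_lexicon = {"information system", "product appearance"}
stop_words = {"the", "a", "of", "oh"}

tokens = segment(
    "Oh, the first department maintains the information system.",
    name_lexicon | proper_noun_lexicon,
    stop_words,
)
```

Segmenting with the two word stocks keeps names such as "first department" as single tokens, which the later position-based matching relies on.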
In the embodiment of the application, after the name, proper noun and verb of the working person in the text data are extracted, a verb word stock is constructed according to the verb in the text data; screening out effective verbs in the verb word stock; and counting word frequency of the effective verbs in the verb word stock in the text data.
In a specific implementation, verbs serve as the labels of working individuals, so constructing the verb word stock makes it easier to subsequently determine which verbs are used as labels.
In implementations, not all verbs in the text data can be used as labels, because labels describe the ability a working individual shows at work. For example, the verb "write" in the sentence "write text" cannot be used as a label. Screening out the valid verbs in the verb word stock means selecting the verbs that can indicate the working ability of a working individual.
In the embodiment of the application, the valid verbs in the verb word stock are screened out by a word sense analysis algorithm or a preset verb screening word stock.
In specific implementation, when a word sense analysis algorithm is used, a corresponding valid verb can be selected by setting a word sense of the verb capable of being used as a label; or the corresponding verb can be deleted by setting word senses of verbs which cannot be used as labels, so that the purpose of screening out valid verbs is achieved.
In implementation, when using the verb screening word stock, a corresponding valid verb can be selected by inputting a verb capable of being used as a label in the verb screening word stock; or the corresponding verbs can be deleted by inputting verbs which cannot be used as labels in the verb screening word stock, so that the purpose of screening the valid verbs is achieved.
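The verb screening word stock and the word frequency count can be sketched as below. The sketch assumes the screening word stock is given either as an allow-list or as a block-list of verbs; the function names and sample tokens are hypothetical.

```python
from collections import Counter


def screen_verbs(verb_lexicon, allowed=None, blocked=None):
    """Return the valid verbs: keep only `allowed` verbs and drop `blocked` ones."""
    valid = set(verb_lexicon)
    if allowed is not None:
        valid &= set(allowed)   # selection by verbs that can serve as labels
    if blocked is not None:
        valid -= set(blocked)   # deletion of verbs that cannot serve as labels
    return valid


def count_verb_frequency(tokens, valid_verbs):
    """Count how often each valid verb occurs in the segmented text."""
    return Counter(t for t in tokens if t in valid_verbs)


# Hypothetical segmented text and verb word stock.
tokens = [
    "first department", "maintain", "information system",
    "second department", "design", "product appearance",
    "third department", "maintain", "write", "text",
]
verb_lexicon = {"maintain", "design", "write"}

valid_verbs = screen_verbs(verb_lexicon, blocked={"write"})
frequency = count_verb_frequency(tokens, valid_verbs)
```

Either list alone is enough to screen the word stock; passing both applies the allow-list first and then removes blocked verbs.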
In the embodiment of the application, the effective verbs in the text data are split or combined according to the word frequency.
In implementations, some verbs consist of two or more verbs; for example, a compound verb "weave-braid" consists of the verb "weave" and the verb "braid". Therefore, when verbs are selected, the word frequency can be used to decide whether to split the compound, so that "weave" and "braid" are each used as a separate verb, or to merge them, so that "weave-braid" is used as one verb.
In a specific implementation, a word frequency threshold can be set by the user according to the actual application scenario, and a verb whose word frequency is higher than the threshold is treated as a verb in its own right. For example, when the word frequency of the verb "weave" is higher than the threshold, "weave" alone may be regarded as a verb; when the word frequency of the verb "braid" is higher than the threshold, "braid" alone may be regarded as a verb. When both word frequencies are higher than the threshold, the compound verb is compared with its components: if the word frequency of the compound verb "weave-braid" is greater than that of a component verb, the compound is regarded as a single verb; if it is smaller, the component verb is regarded as a verb on its own.
Step S103, determining the verb matched with the proper noun as a label.
In the embodiment of the application, if a verb having an association relationship with a proper noun in a sentence exists in a certain sentence in the text data, the verb is determined to be a label matched with the proper noun.
In implementations, the matching of proper nouns and verbs is constrained by the meaning of the proper noun itself: a verb that matches a proper noun should be associated with it. For example, if the proper noun is "information system" and the verb is "maintenance", "maintenance" and "information system" may be regarded as associated, and the verb "maintenance" may be used as a label. If the proper noun is "product appearance" and the verb is "maintenance", "maintenance" and "product appearance" may be regarded as not associated; "product appearance" and the verb "design" may be regarded as associated, and the verb "design" may be used as a label.
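Step S103 can be illustrated with a small sketch. The application does not say how the association between a proper noun and a verb is established, so the sketch simply assumes a hand-written association table; the table contents beyond the two examples in the text, and the name `determine_labels`, are hypothetical.

```python
# Assumed association table: proper noun -> verbs regarded as associated with it.
ASSOCIATIONS = {
    "information system": {"maintenance"},
    "product appearance": {"design"},
}


def determine_labels(sentence_tokens, associations):
    """Return the verbs in one sentence that are associated with a proper noun in it."""
    proper_nouns = [t for t in sentence_tokens if t in associations]
    labels = []
    for token in sentence_tokens:
        associated = any(token in associations[noun] for noun in proper_nouns)
        if associated and token not in labels:
            labels.append(token)
    return labels


labels_a = determine_labels(
    ["first department", "maintenance", "information system"], ASSOCIATIONS
)
labels_b = determine_labels(
    ["first department", "maintenance", "product appearance"], ASSOCIATIONS
)
labels_c = determine_labels(
    ["second department", "design", "product appearance"], ASSOCIATIONS
)
```

In the second sentence "maintenance" is not associated with "product appearance", so no label is produced, which mirrors the example in the text.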
Step S104, matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
In implementations, the location of words in text data can be determined by regular expressions.
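A minimal sketch of locating words with regular expressions follows, using Python's standard `re` module. The function name `find_positions` and the sample sentence are assumptions, and the `\b` word-boundary anchors suit the English sample only; text without spaces between words would need a different pattern.

```python
import re


def find_positions(text, terms):
    """Map each term to the character offsets at which it occurs in text."""
    positions = {}
    for term in terms:
        # re.escape keeps any punctuation inside a term from being read as regex syntax.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        positions[term] = [match.start() for match in pattern.finditer(text)]
    return positions


text = "The second department completes the design under the first department."
positions = find_positions(text, ["first department", "second department", "design"])
```

The offsets can then be compared to decide which working individual name lies closest to a label.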
In a specific implementation, matching may be limited to a paragraph: the label is matched with the working individual name in the paragraph where the label is located.
In a specific implementation, matching may instead be limited to a sentence: the label is matched with the working individual name in the sentence where the label is located.
In the embodiment of the application, in a certain sentence in the text data, if the names of the working individuals are in parallel relation, the labels in the sentence are matched with all the names of the working individuals in the sentence; and if the names of the working individuals are mutually independent, matching the label in the sentence to the name of the working individual closest to the label in word number.
For example, in the sentence "the first department and the second department perform product appearance design together", the working individual names "first department" and "second department" are in a parallel relationship, so the label "design" is matched to both "first department" and "second department". In the sentence "the second department completes the appearance design of the product under the guidance of the first department", the working individual names "first department" and "second department" are mutually independent; the number of words between the label "design" and the name "second department" is smaller than that between the label "design" and the name "first department", so the label "design" is matched to "second department".
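The two matching rules of step S104 can be sketched as follows for a single segmented sentence. How a parallel relationship is detected is not specified in the application; the sketch assumes names are parallel when adjacent names are separated only by a coordinating word such as "and". The function name `match_labels` and the token lists are hypothetical.

```python
def match_labels(tokens, names, labels, parallel_markers=("and",)):
    """Match each label in a sentence to working individual names.

    Parallel names all receive the label; otherwise the label goes to the
    name with the fewest tokens between it and the label.
    """
    name_idx = [i for i, t in enumerate(tokens) if t in names]
    label_idx = [i for i, t in enumerate(tokens) if t in labels]
    if not name_idx:
        return {}
    # Assumption: names are parallel when only a coordinating word separates them.
    parallel = len(name_idx) > 1 and all(
        b - a == 2 and tokens[a + 1] in parallel_markers
        for a, b in zip(name_idx, name_idx[1:])
    )
    result = {}
    for li in label_idx:
        if parallel:
            targets = name_idx
        else:
            targets = [min(name_idx, key=lambda ni: abs(ni - li))]
        for ni in targets:
            result.setdefault(tokens[ni], []).append(tokens[li])
    return result


names = {"first department", "second department"}
labels = {"design"}

parallel_case = match_labels(
    ["first department", "and", "second department", "perform",
     "product appearance", "design", "together"],
    names, labels,
)
independent_case = match_labels(
    ["second department", "completes", "product appearance", "design",
     "under", "guidance", "of", "first department"],
    names, labels,
)
```

Distances are measured in tokens rather than characters, which corresponds to the "closest in word number" wording of the text.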
Referring to FIG. 2, a label determining apparatus 20 for a working individual according to an embodiment of the present application specifically includes: an acquisition unit 201, an extraction unit 202, a tag determination unit 203, and a tag matching unit 204, wherein:
the acquisition unit 201 may be configured to acquire the content of text data;
the extraction unit 202 may be configured to extract working individual names, proper nouns, and verbs from the text data, where working individuals include staff and departments;
the tag determination unit 203 may be configured to determine, as a tag, a verb that matches the proper noun;
the tag matching unit 204 may be configured to match the tag to a corresponding working individual name according to a position of the working individual name in the text data and a position of the tag in the text data.
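The four-unit structure can be pictured as a small pipeline object. This is only a structural sketch: the class name `LabelDeterminingDevice`, the callable signatures, and the stub units used in the example are assumptions, not the apparatus of the application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class LabelDeterminingDevice:
    """Chains the four units of the apparatus in the order they are described."""

    acquire: Callable[[Any], Any]        # acquisition unit 201
    extract: Callable[[Any], Any]        # extraction unit 202
    determine: Callable[[Any], Any]      # tag determination unit 203
    match: Callable[[Any, Any], Any]     # tag matching unit 204

    def run(self, source):
        content = self.acquire(source)
        extracted = self.extract(content)
        labels = self.determine(extracted)
        return self.match(extracted, labels)


# Trivial stub units, used only to show how the units are wired together.
device = LabelDeterminingDevice(
    acquire=lambda source: source.split(),
    extract=lambda content: [w for w in content if w != "the"],
    determine=lambda extracted: [w for w in extracted if w == "design"],
    match=lambda extracted, labels: {extracted[0]: labels},
)
result = device.run("dept2 completes the design")
```

Keeping each unit as a separate callable reflects the text's point that the units can be software modules, hardware modules, or separate devices.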
In a specific implementation, the proper noun may be an application name, a device name, or a keyword of a task within an enterprise. Such as an information system, an analog circuit, or a product appearance. The setting can be performed by a user according to the application scene.
In a specific implementation, the working individual names, proper nouns, and verbs in the text data can be extracted according to a word sense analysis algorithm and a corresponding word database.
In a specific implementation, by setting the word senses of the word objects to be extracted by the word sense analysis algorithm, the corresponding working individual names, proper nouns, and verbs can be extracted from the text data.
In a specific implementation, the word sense analysis algorithm may be the jieba word segmentation algorithm, or a corresponding word sense analysis algorithm may be set by the user according to different application scenarios.
In a specific implementation, the text data is searched, using a preset word database, for working individual names, proper nouns, and verbs that match entries in the word database.
In a specific implementation, the word database may be set by a user according to an application scenario.
In implementations, only verbs that can be matched with proper nouns are used as labels, which filters out some invalid verbs and improves the accuracy of the selected labels. The positions of the labels and the working individual names in the text data constrain the matching process, which helps ensure that labels are matched to the correct working individuals. As a result, the abilities of working individuals are determined more objectively and with higher accuracy.
In the embodiment of the present application, the acquisition unit 201 further includes a content acquisition subunit, which may be configured to convert the text data into an html file format and to extract the content of the text data in the html file format by means of a crawler.
In a specific implementation, converting the text data into the html file format makes it convenient to extract the content with electronic equipment such as a computer, which effectively improves processing speed when the volume of text data is large.
In an embodiment of the present application, the extracting unit 202 further includes: the construction subunit can also be used for constructing a work individual name word stock and a proper noun word stock according to the work individual names and proper nouns in the text data; the word segmentation subunit may be further configured to segment the text data according to the name lexicon of the working individual and the proper noun lexicon, and filter out stop words in the text data after the word segmentation.
In the implementation, the stop words are removed, so that the efficiency of data processing can be effectively improved.
In the embodiment of the present application, the extracting unit 202 further includes a statistics subunit, and the constructing subunit may be further configured to construct a verb word library according to verbs in the text data; screening out effective verbs in the verb word stock; the statistics subunit may be further configured to count a word frequency of the valid verbs in the verb word stock in the text data.
In a specific implementation, verbs serve as the labels of working individuals, so constructing the verb word stock makes it easier to subsequently determine which verbs are used as labels.
In implementations, not all verbs in the text data can be used as labels, because labels describe the ability a working individual shows at work. For example, the verb "write" in the sentence "write text" cannot be used as a label. Screening out the valid verbs in the verb word stock means selecting the verbs that can indicate the working ability of a working individual.
In the embodiment of the present application, the extracting unit 202 further includes a screening subunit, where the screening subunit may be further configured to screen, by using a semantic analysis algorithm or a preset verb screening word library, valid verbs in the verb word library.
In specific implementation, when a word sense analysis algorithm is used, a corresponding valid verb can be selected by setting a word sense of the verb capable of being used as a label; or the corresponding verb can be deleted by setting word senses of verbs which cannot be used as labels, so that the purpose of screening out valid verbs is achieved.
In implementation, when using the verb screening word stock, a corresponding valid verb can be selected by inputting a verb capable of being used as a label in the verb screening word stock; or the corresponding verbs can be deleted by inputting verbs which cannot be used as labels in the verb screening word stock, so that the purpose of screening the valid verbs is achieved.
In specific implementation, verbs capable of being used as labels are screened out to be effective verbs, and a part of verbs which cannot be used for describing the capability of the working individuals are filtered out, so that the accuracy of the capability determination of the working individuals is improved.
In an embodiment of the present application, the extracting unit 202 further includes: the word processing subunit is further configured to split or merge valid verbs in the text data according to the word frequency.
In implementations, some verbs consist of two or more verbs; for example, a compound verb "weave-braid" consists of the verb "weave" and the verb "braid". Therefore, when verbs are selected, the word frequency can be used to decide whether to split the compound, so that "weave" and "braid" are each used as a separate verb, or to merge them, so that "weave-braid" is used as one verb.
In a specific implementation, a word frequency threshold can be set by the user according to the actual application scenario, and a verb whose word frequency is higher than the threshold is treated as a verb in its own right. For example, when the word frequency of the verb "weave" is higher than the threshold, "weave" alone may be regarded as a verb; when the word frequency of the verb "braid" is higher than the threshold, "braid" alone may be regarded as a verb. When both word frequencies are higher than the threshold, the compound verb is compared with its components: if the word frequency of the compound verb "weave-braid" is greater than that of a component verb, the compound is regarded as a single verb; if it is smaller, the component verb is regarded as a verb on its own.
In the embodiment of the present application, the tag determining unit 203 may be further configured to determine, in a certain sentence in the text data, that the verb is a tag that matches the proper noun if there is a verb that has an association relationship with the proper noun in the sentence.
In implementations, the matching of proper nouns and verbs is constrained by the meaning of the proper noun itself: a verb that matches a proper noun should be associated with it, which improves the accuracy with which the ability of a working individual is determined.
For example, if the proper noun is "information system" and the verb is "maintenance", "maintenance" and "information system" may be regarded as associated, and the verb "maintenance" may be used as a label. If the proper noun is "product appearance" and the verb is "maintenance", "maintenance" and "product appearance" may be regarded as not associated; "product appearance" and the verb "design" may be regarded as associated, and the verb "design" may be used as a label.
In the embodiment of the present application, the tag matching unit 204 may be further configured to match, in a sentence in the text data, tags in the sentence to all the names of the working individuals in the sentence if the names of the working individuals are in a parallel relationship; and if the names of the working individuals are mutually independent, matching the label in the sentence to the name of the working individual closest to the label in word number.
In implementations, the location of words in text data can be determined by regular expressions.
In a specific implementation, matching may be limited to a paragraph: the label is matched with the working individual name in the paragraph where the label is located.
In a specific implementation, matching may instead be limited to a sentence: the label is matched with the working individual name in the sentence where the label is located.
For example, in the sentence "the first department and the second department perform product appearance design together", the working individual names "first department" and "second department" are in a parallel relationship, so the label "design" is matched to both "first department" and "second department". In the sentence "the second department completes the appearance design of the product under the guidance of the first department", the working individual names "first department" and "second department" are mutually independent; the number of words between the label "design" and the name "second department" is smaller than that between the label "design" and the name "first department", so the label "design" is matched to "second department".
It should be noted that, the tag determining apparatus of the working individual according to the embodiment of the present application may be integrated into the electronic device as one software module and/or hardware module, in other words, the electronic device may include the tag determining apparatus of the working individual. For example, the label determining means of the work individual may be a software module in the operating system of the electronic device or may be an application developed for it; of course, the label determining means of the working person may equally well be one of a number of hardware modules of the electronic device.
In another embodiment of the application, the label determining apparatus for a work individual and the electronic device may also be separate devices (e.g. servers), in which case the apparatus may be connected to the electronic device via a wired and/or wireless network and transmit interaction information in an agreed data format.
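The application does not specify the agreed data format. Purely as an assumption, it could be a JSON message such as the one sketched below; the field names `sentence_id`, `label` and `work_individuals` and both function names are hypothetical.

```python
import json

def encode_result(sentence_id, label, names):
    """Serialize one matching result as a JSON message (hypothetical schema)."""
    message = {
        "sentence_id": sentence_id,
        "label": label,
        "work_individuals": list(names),
    }
    # ensure_ascii=False keeps non-ASCII names readable in the payload.
    return json.dumps(message, ensure_ascii=False)

def decode_result(payload):
    """Parse a message produced by encode_result back into a dict."""
    return json.loads(payload)

payload = encode_result(1, "design", ["Department A", "Department B"])
print(decode_result(payload))
```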
In an electronic device provided in an embodiment of the present application, the electronic device includes: one or more processors and memory; and computer program instructions stored in the memory, which when executed by the processor, cause the processor to perform the method of determining a label of a working individual as in any of the embodiments described above.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform the desired functions.
The memory may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory, and the like. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by a processor to perform the steps of the method of determining a label of a work individual in the various embodiments of the application described above and/or other desired functions. Information such as light intensity, compensation light intensity, position of the filter, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device may further include: input devices and output devices, which are interconnected by a bus system and/or other forms of connection mechanisms.
The output device may output various information to the outside, and may include, for example, a display, a speaker, a printer, and a communication network together with the remote output devices connected to it.
In addition, the electronic device may include any other suitable components depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method of determining a tag of a working person as in any of the embodiments described above.
The computer program product may include program code for performing the operations of embodiments of the present application, written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium on which computer program instructions are stored, the instructions, when executed by a processor, causing the processor to perform the steps of the method of determining a label of a work individual according to the various embodiments described in the method portion of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Although the present application is disclosed above, it is not limited thereto. Those skilled in the art may make various changes and modifications without departing from the spirit and scope of the application, and the scope of protection shall therefore be defined by the appended claims.

Claims (10)

1. A method of determining a label of a work individual, comprising:
acquiring the content of text data;
extracting work individual names, proper nouns and verbs from the text data; the work individuals include: staff and departments; the proper nouns are keywords of application system names, equipment names or work tasks in enterprises;
determining verbs matched with the proper nouns as labels;
in a certain sentence in the text data, if the names of the working individuals are in parallel relation, matching the labels in the sentence to all the names of the working individuals in the sentence; and if the names of the working individuals are mutually independent, matching the label in the sentence to the name of the working individual closest to the label in word number.
2. The method for determining the label of the work individual according to claim 1, wherein the acquiring the content of the text data includes:
converting the text data into an html file format;
and extracting the content in the text data in the html file format in a crawler mode.
3. The method of claim 1, further comprising, after said extracting the work individual name, proper noun, and verb in the text data:
constructing a work individual name word stock and a proper noun word stock according to the work individual names and proper nouns in the text data;
and segmenting the text data into words according to the work individual name word stock and the proper noun word stock, and filtering out stop words from the segmented text data.
4. The method of claim 1, further comprising, after said extracting the work individual name, proper noun, and verb in the text data:
constructing a verb word stock according to verbs in the text data;
screening out valid verbs in the verb word stock;
and counting the word frequency, in the text data, of the valid verbs in the verb word stock.
5. The method for determining the label of the work individual according to claim 4, wherein said screening out valid verbs in said verb word stock includes:
screening out the valid verbs in the verb word stock through a semantic analysis algorithm or a preset verb screening word stock.
6. The method of claim 4, further comprising, after said counting word frequencies of valid verbs in said verb word stock in said text data:
and splitting or merging the valid verbs in the text data according to the word frequency.
7. The method for determining the label of the work individual according to claim 1, wherein the step of determining the verb matching the proper noun as the label includes:
if a verb having an association relationship with a proper noun in a sentence exists in a certain sentence in the text data, determining that the verb is a label matched with the proper noun.
8. A label determining apparatus for a work individual, comprising an acquisition unit, an extraction unit, a label determination unit and a label matching unit, wherein:
the acquisition unit is used for acquiring the content of the text data;
the extraction unit is used for extracting work individual names, proper nouns and verbs from the text data; the work individuals include: staff and departments; the proper nouns are keywords of application system names, equipment names or work tasks in enterprises;
the label determining unit is used for determining verbs matched with proper nouns as labels;
the label matching unit is used for, in a certain sentence in the text data, matching the label in the sentence to all the work individual names in the sentence if the work individual names are in a parallel relation; and matching the label in the sentence to the work individual name closest to the label in word count if the work individual names are mutually independent.
9. A computer readable storage medium having stored thereon computer instructions, the computer readable storage medium being a non-volatile storage medium or a non-transitory storage medium, characterized in that the computer instructions when run perform the steps of the method of determining a label of a working individual according to any of claims 1-7.
10. An electronic device comprising a memory and a processor, the memory having stored thereon computer instructions, characterized in that the processor, when the computer instructions are run, performs the steps of the method for determining a label of a working person according to any of claims 1-7.
CN201811637972.6A 2018-12-29 2018-12-29 Method and device for determining label of working individual and computer storage medium Active CN109740157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811637972.6A CN109740157B (en) 2018-12-29 2018-12-29 Method and device for determining label of working individual and computer storage medium

Publications (2)

Publication Number Publication Date
CN109740157A CN109740157A (en) 2019-05-10
CN109740157B (en) 2023-08-18

Family

ID=66362296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811637972.6A Active CN109740157B (en) 2018-12-29 2018-12-29 Method and device for determining label of working individual and computer storage medium

Country Status (1)

Country Link
CN (1) CN109740157B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2484439A1 (en) * 2003-10-10 2005-04-10 Daniel Nicholas Crow Conceptualization of job candidate information
US8214346B2 (en) * 2008-06-27 2012-07-03 Cbs Interactive Inc. Personalization engine for classifying unstructured documents
CN101833555B (en) * 2009-03-12 2016-05-04 富士通株式会社 Information extracting method and device
CN106776571A (en) * 2016-12-27 2017-05-31 北京奇虎科技有限公司 The generation method and device of a kind of label
CN107480200B (en) * 2017-07-17 2020-10-23 深圳先进技术研究院 Word labeling method, device, server and storage medium based on word labels
CN108288229B (en) * 2018-03-02 2022-03-15 北京邮电大学 A method for building user portraits
CN108959575B (en) * 2018-07-06 2019-09-24 北京神州泰岳软件股份有限公司 A kind of enterprise's incidence relation information mining method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant