CN108628863B - Information acquisition method and device - Google Patents

Info

Publication number
CN108628863B
Authority
CN
China
Prior art keywords
financial
preset
information
proportion
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710153107.3A
Other languages
Chinese (zh)
Other versions
CN108628863A (en
Inventor
杨兴
杨晓静
武熠阳
赵鑫
王江丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing duxiaoman Youyang Technology Co.,Ltd.
Original Assignee
Shanghai Youyang New Media Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Youyang New Media Information Technology Co ltd
Priority to CN201710153107.3A
Publication of CN108628863A
Application granted
Publication of CN108628863B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The application discloses an information acquisition method and device. One embodiment of the method comprises: acquiring information of an entity object corresponding to a financial object, and extracting keywords from the information; inputting the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects; and generating, based on the output result, indication information indicating whether a preset financial event will occur to the financial object. Because the logistic regression model needs only keywords extracted from information such as news, and does not depend on financial data, whether the preset financial event will occur to the financial object can be predicted and the prediction result provided in a timely manner.

Description

Information acquisition method and device
Technical Field
The present application relates to the field of computers, and in particular, to the field of data analysis, and more particularly, to a method and apparatus for acquiring information.
Background
Predicting whether a financial object (e.g., a bond) will have a financial event (e.g., a default event) is the most critical step in making an investment decision about the financial object. At present, the commonly adopted approach is to predict whether the financial event will occur to the financial object from the financial data of the entity object corresponding to the financial object.
However, because financial report data lags behind actual events, the prediction of whether the financial event will occur to the financial object cannot be obtained in time, which in turn affects investment decisions about the financial object.
Disclosure of Invention
The application provides an information acquisition method and an information acquisition device to solve the technical problems described in the Background section above.
In a first aspect, the present application provides an information acquisition method, including: acquiring information of an entity object corresponding to a financial object, and extracting keywords from the information; inputting the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects, and the characteristic information comprises: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object; and generating, based on the output result, indication information indicating whether the preset financial event will occur to the financial object.
In a second aspect, the present application provides an information acquisition apparatus, comprising: an acquisition unit configured to acquire information of an entity object corresponding to a financial object and extract keywords from the information; a prediction unit configured to input the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects, and the characteristic information comprises: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object; and a generating unit configured to generate, based on the output result, indication information indicating whether the preset financial event will occur to the financial object.
According to the information acquisition method and the information acquisition device, information of the entity object corresponding to the financial object is acquired, and keywords are extracted from the information; the keywords are input into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects; and indication information indicating whether the preset financial event will occur to the financial object is generated based on the output result. Because the logistic regression model needs only keywords extracted from information such as news, and does not depend on financial data, whether the preset financial event will occur to the financial object can be predicted and the prediction result provided in a timely manner.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram that may be applied to the information acquisition method or apparatus of the present application;
FIG. 2 shows a flow diagram of one embodiment of an information acquisition method according to the present application;
FIG. 3 illustrates an exemplary flow chart for constructing a pre-set logistic regression model;
FIG. 4 shows a schematic structural diagram of one embodiment of an information acquisition apparatus according to the present application;
FIG. 5 is a schematic structural diagram of a server suitable for implementing the information acquisition method according to the embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture to which the information acquisition method or apparatus of the present application can be applied.
As shown in fig. 1, the system architecture may include a server 101, a network 102, and a server 103. Network 102 provides the medium for a transmission link between server 101 and server 103. Server 103 may be a server that provides network resources such as financial news. The server 101 may employ a web crawler to obtain network resources from the server 103, such as news about companies that issue bonds.
Referring to fig. 2, a flow chart of an embodiment of an information acquisition method according to the present application is shown. The method may be performed by a server, such as server 101 in fig. 1, and accordingly, the apparatus may be provided in a server, such as server 101 in fig. 1. The method comprises the following steps:
Step 201, obtaining information of an entity object corresponding to the financial object, and extracting keywords from the information.
In this embodiment, to predict whether the preset financial event will occur to the financial object, information of the entity object corresponding to the financial object may first be obtained. For example, suppose the financial object is a bond, the preset financial event is a default event, the entity object corresponding to the financial object is the company issuing the bond, and the information is news about that company. To predict whether the default event will occur to the bond, news about the company issuing the bond can be obtained first.
In this embodiment, after the information is acquired, keywords may be extracted from it, and the keywords may be associated with the operating status of the entity object corresponding to the financial object. Taking as an example the case in which the financial object is a bond and the entity object is the company that issues the bond, news about the company includes words related to its operating status. For example, if a news item about the progress of the company's investment projects reports that one project is progressing slowly, keywords such as the name of the project, "progress", and "slow" can be extracted.
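The extraction in step 201 can be illustrated with a minimal Python sketch. The patent does not specify an extraction algorithm, so the sketch makes several assumptions: it simply matches tokens against a hypothetical vocabulary `PRESET_KEYWORDS`, it uses English text, and it tokenizes with a regular expression (Chinese news would need a proper word segmenter). The function name `extract_keywords` is also hypothetical.

```python
import re

# Hypothetical vocabulary of preset financial event keywords (an assumption;
# the patent obtains such keywords by cluster analysis of news text).
PRESET_KEYWORDS = {"slow", "delay", "lawsuit", "overdue"}

def extract_keywords(news_text, preset_keywords=PRESET_KEYWORDS):
    """Return the preset keywords that occur in the news text, sorted."""
    # Lowercase the text and keep runs of letters as tokens.
    tokens = set(re.findall(r"[a-z]+", news_text.lower()))
    return sorted(tokens & set(preset_keywords))

news = "The project is progressing at a slow pace and payment is overdue."
keywords = extract_keywords(news)
print(keywords)
```

The extracted list is what step 202 would pass to the model.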
Step 202, inputting the keywords into a preset logistic regression model to obtain an output result.
In this embodiment, after the keywords are extracted in step 201, a preset logistic regression model may be used to predict whether the financial object will have a preset financial event based on the extracted keywords. For example, the financial object is a bond, the preset financial event is a default event, and whether the default event occurs to the bond can be predicted according to the extracted keywords.
Taking as an example the case in which the financial object is a bond and the preset financial event is a default event, characteristic information of a plurality of bonds can be acquired in advance. The characteristic information of a bond comprises: annotation information indicating whether a default event has occurred to the bond, and preset financial event keywords in the information of the entity object corresponding to the bond, such as news about the company issuing the bond. The logistic regression model can be trained in advance on the characteristic information of the plurality of bonds to obtain the preset logistic regression model. After training, the preset logistic regression model has a weight, i.e., a regression coefficient, for each of the preset financial event keywords. The weight of each preset financial event keyword indicates how important that keyword is for judging whether a default event will occur to the bond.
In some optional implementation manners of this embodiment, information of an entity object corresponding to a financial object in which a preset financial event occurs may be obtained in advance; dividing the information into a plurality of information sentences, and segmenting the information sentences to obtain a plurality of words; and carrying out cluster analysis on the plurality of words to obtain preset financial event keywords.
Taking as an example the case in which the financial object is a bond and the preset financial event is a default event, news about the companies issuing bonds to which a default event has occurred can be obtained in advance for a certain period of time, for example, three years. The news is divided into a plurality of sentences, and the sentences are segmented to obtain a plurality of words. Cluster analysis can then be performed on the plurality of words to obtain the preset financial event keywords associated with the default event.
In this embodiment, after the keywords extracted in step 201 are input into the preset logistic regression model, the preset logistic regression model obtains an output result according to the weight of the preset financial event keywords matched with the extracted keywords, that is, the regression coefficient. The output of the pre-set logistic regression model may be indicative of a probability that the financial object will have the pre-set financial event.
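As a hedged illustration of how such an output could be computed, the sketch below applies the standard logistic function to the sum of the regression coefficients of the matched keywords. The coefficient values, the intercept, and the names `WEIGHTS`, `INTERCEPT`, and `predict_probability` are assumptions made for the example; the patent does not give concrete values or state whether an intercept is used.

```python
import math

# Hypothetical regression coefficients for preset financial event keywords.
WEIGHTS = {"overdue": 1.8, "slow": 0.6, "lawsuit": 1.2}
INTERCEPT = -2.0  # hypothetical bias term (an assumption)

def predict_probability(keywords, weights=WEIGHTS, intercept=INTERCEPT):
    """Sigmoid of the intercept plus the weights of the matched keywords."""
    # Keywords that match no preset financial event keyword contribute nothing.
    z = intercept + sum(weights[k] for k in set(keywords) if k in weights)
    return 1.0 / (1.0 + math.exp(-z))

p = predict_probability(["overdue", "slow", "unrelated"])
print(round(p, 3))
```

Each keyword is treated here as a binary feature (present or absent); a count-based or weighted feature would be an equally plausible reading of the text.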
In some optional implementations of this embodiment, the preset logistic regression model may be pre-constructed in the following manner: the logistic regression model may be first constructed, the feature information of the plurality of financial objects is acquired, and the plurality of feature information is divided into feature information for training and feature information for verification. The plurality of financial objects comprise financial objects which have a preset financial event and financial objects which have not a preset financial event. The characteristic information of the financial object which has undergone the preset financial event comprises annotation information indicating that the financial object has undergone the preset financial event and a preset financial event keyword in the information of the entity object corresponding to the financial object. The characteristic information of the financial object without the preset financial event comprises annotation information indicating that the financial object has not the preset financial event and a preset financial event keyword in the information of the entity object corresponding to the financial object.
When the logistic regression model is trained using the characteristic information for training, the label information in the characteristic information may be used as the value of the dependent variable. For example, label information of 1 in the characteristic information of a financial object indicates that the preset financial event has occurred, and label information of 0 indicates that the preset financial event has not occurred. The preset financial event keywords in the characteristic information are used as the values of the independent variables, and the logistic regression model is trained to obtain the trained logistic regression model. Each preset financial event keyword in the characteristic information used for training corresponds to a regression coefficient, and the regression coefficient may represent how important the keyword is for determining whether the preset financial event will occur to the financial object.
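The training step can be sketched as follows. This is only one possible realization: the patent does not name an optimization method, so plain batch gradient descent on the log-loss is an assumption, as are the toy samples, the binary presence encoding of keywords, and all function and variable names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(samples, vocabulary, epochs=2000, lr=0.5):
    """samples: list of (set of keywords, label 0 or 1).
    Returns (dict of regression coefficients, intercept)."""
    weights = {k: 0.0 for k in vocabulary}
    intercept = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = {k: 0.0 for k in vocabulary}
        grad_b = 0.0
        for keywords, label in samples:
            z = intercept + sum(weights[k] for k in keywords if k in weights)
            error = sigmoid(z) - label  # gradient of the log-loss w.r.t. z
            grad_b += error
            for k in keywords:
                if k in grad_w:
                    grad_w[k] += error
        # Batch gradient descent update, averaged over the samples.
        intercept -= lr * grad_b / n
        for k in vocabulary:
            weights[k] -= lr * grad_w[k] / n
    return weights, intercept

# Toy training data (hypothetical): label 1 means the event occurred.
vocabulary = ["overdue", "lawsuit", "growth"]
training = [
    ({"overdue", "lawsuit"}, 1),
    ({"overdue"}, 1),
    ({"growth"}, 0),
    (set(), 0),
]
weights, intercept = train_logistic_regression(training, vocabulary)
```

In this toy data "overdue" appears only in positive samples, so its learned coefficient is positive, which matches the text's reading of a coefficient as the keyword's importance.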
Then, a plurality of regression results indicating whether the financial object will have the preset financial event can be obtained by inputting the preset financial event keywords in the characteristic information of each financial object for verification into the trained logistic regression model. The characteristic information of the financial object used for verification comprises the characteristic information of the financial object which has not occurred with the preset financial event and the characteristic information of the financial object which has occurred with the preset financial event. The feature information of the financial object in which the preset financial event has occurred among the feature information of the financial object used for verification may be referred to as first feature information, and the feature information of the financial object in which the preset financial event has not occurred among the feature information of the financial object used for verification may be referred to as second feature information.
After the multiple regression results are obtained, a first proportion may be calculated: the ratio of the number of pieces of first characteristic information whose regression result is consistent with the label information to the total number of pieces of first characteristic information. In other words, the trained logistic regression model is used to predict, for each financial object corresponding to the characteristic information for verification to which the preset financial event has occurred, whether the preset financial event will occur; the first proportion is the ratio of the number of these financial objects for which the regression result is that the event will occur to the total number of these financial objects.
Likewise, a second proportion may be calculated: the ratio of the number of pieces of second characteristic information whose regression result is consistent with the label information to the total number of pieces of second characteristic information. In other words, the trained logistic regression model is used to predict, for each financial object corresponding to the characteristic information for verification to which the preset financial event has not occurred, whether the preset financial event will occur; the second proportion is the ratio of the number of these financial objects for which the regression result is that the event will not occur to the total number of these financial objects.
After the first proportion and the second proportion are calculated, it can be judged whether they meet a preset condition. The preset condition comprises: the first proportion and the second proportion are both greater than a proportion threshold. When the preset condition is met, the trained logistic regression model can be used as the preset logistic regression model.
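The verification check can be sketched in a few lines of Python. The first proportion corresponds to what is commonly called the true positive rate and the second to the true negative rate. The function names, the toy labels and predictions, and the threshold values are assumptions for illustration, and the sketch assumes the verification data contains at least one sample of each class.

```python
def validation_proportions(predictions, labels):
    """Return (first proportion, second proportion).
    predictions and labels are lists of 0/1 values of equal length."""
    positives = [p for p, y in zip(predictions, labels) if y == 1]
    negatives = [p for p, y in zip(predictions, labels) if y == 0]
    # Share of event-occurred samples that the model also predicts as 1.
    first = sum(1 for p in positives if p == 1) / len(positives)
    # Share of event-not-occurred samples that the model also predicts as 0.
    second = sum(1 for p in negatives if p == 0) / len(negatives)
    return first, second

def meets_preset_condition(first, second, proportion_threshold):
    """Both proportions must exceed the proportion threshold."""
    return first > proportion_threshold and second > proportion_threshold

labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1]
first, second = validation_proportions(predictions, labels)
print(first, second)  # 0.75 0.8
```

Requiring both proportions to pass, rather than overall accuracy, keeps a model from being accepted simply because one class dominates the verification data.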
When the calculated first proportion and the second proportion do not meet the preset conditions, Bayesian analysis can be carried out on the regression results, and the regression coefficient of each preset financial event keyword is adjusted until the preset conditions are met; and taking the logistic regression model after adjusting the regression coefficient of the preset financial event key words as a preset logistic regression model.
Referring to FIG. 3, an exemplary flow chart for constructing a pre-set logistic regression model is shown.
In the present application, cluster analysis can first be performed on negative samples to obtain the financial event keywords. For example, where the financial object is a bond, the negative samples are three years of news about companies whose bonds have had a default event, and the news is segmented to obtain a plurality of words appearing in it. Cluster analysis can be performed on the plurality of words to obtain a plurality of financial event keywords.
Next, a logistic regression model is established based on the financial event keywords, and the model outputs a probability value between 0 and 1. In the logistic regression model based on the financial event keywords, each financial event keyword corresponds to a regression coefficient, and the regression coefficient can represent how important the keyword is for judging whether the preset financial event will occur to the financial object. To determine whether the preset financial event will occur to the current financial object, the keywords associated with the operating status of the entity object, extracted from the information of the entity object corresponding to the current financial object, may be input into the logistic regression model, and the logistic regression model outputs a probability that the preset financial event will occur to the current financial object.
Finally, Bayesian analysis is performed on the regression results and the regression coefficients are adjusted. When the output of the established logistic regression model is not satisfactory, Bayesian analysis can be performed on the regression results and the regression coefficients of the preset financial event keywords adjusted.
Step 203, generating indication information indicating whether the preset financial event will occur to the financial object based on the output result.
In this embodiment, after the keyword of the financial object is input to the preset logistic regression model in step 202 to obtain the output result, the indication information indicating whether the financial object will have the preset financial event may be generated according to the output result. For example, the output result of the preset logistic regression model is a probability indicating that the financial object will have a preset financial event, and according to the probability, it may be determined whether the financial object will have the preset financial event, and indication information indicating whether the financial object will have the preset financial event may be generated.
In some optional implementations of this embodiment, when the output result of the preset logistic regression model is a probability that the preset financial event will occur to the financial object, and the probability is greater than a probability threshold, indication information indicating that the preset financial event will occur to the financial object may be generated. When the probability output by the preset logistic regression model is smaller than the probability threshold, indication information indicating that the preset financial event will not occur to the financial object can be generated.
Taking a financial object as a bond and a preset financial event as a default event as an example, when the probability output by the preset logistic regression model is greater than a probability threshold, indicating information indicating that the bond will have the default event can be generated. When the probability output by the preset logistic regression model is smaller than the probability threshold value, indication information indicating that the bond does not have default events can be generated.
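The thresholding in step 203 can be sketched as below. The threshold value of 0.5, the function name `generate_indication`, and the dictionary layout of the indication information are assumptions. The text does not say what happens when the probability equals the threshold; this sketch treats that case as "will not occur".

```python
def generate_indication(probability, probability_threshold=0.5):
    """Map the model's output probability to indication information."""
    # Strictly greater than the threshold means the event is expected.
    will_occur = probability > probability_threshold
    return {"probability": probability, "preset_event_expected": will_occur}

print(generate_indication(0.8))
print(generate_indication(0.2))
```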
Referring to fig. 4, a schematic structural diagram of an embodiment of an information acquisition apparatus according to the present application is shown. The information acquisition apparatus includes: an acquisition unit 401, a prediction unit 402, and a generating unit 403. The acquisition unit 401 is configured to acquire information of an entity object corresponding to a financial object and extract keywords from the information; the prediction unit 402 is configured to input the keywords into a preset logistic regression model to obtain an output result, where the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects, and the characteristic information includes: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object; the generating unit 403 is configured to generate, based on the output result, indication information indicating whether the preset financial event will occur to the financial object.
In some optional implementations of this embodiment, the generating unit 403 includes: an indication information generating subunit (not shown) configured to generate, when the output result is a probability indicating that the financial object will have a preset financial event, indication information indicating that the financial object will have the preset financial event when the probability is greater than a probability threshold; and when the probability is smaller than the probability threshold value, generating indicating information indicating that the preset financial event does not occur to the financial object.
In some optional implementations of this embodiment, the information acquisition apparatus further includes: a first model generation unit (not shown) configured to construct a logistic regression model; acquire characteristic information of a plurality of financial objects, and divide the characteristic information into characteristic information for training and characteristic information for verification; train the logistic regression model using the characteristic information for training to obtain a trained logistic regression model, wherein each preset financial event keyword in the characteristic information for training corresponds to one regression coefficient; determine, in the characteristic information for verification, a first quantity of first characteristic information containing label information indicating that the preset financial event has occurred to the financial object and a second quantity of second characteristic information containing label information indicating that the preset financial event has not occurred to the financial object; input the preset financial event keywords in each piece of characteristic information for verification into the trained logistic regression model to obtain a plurality of regression results indicating whether the preset financial event will occur to the financial objects; calculate a first proportion and a second proportion, wherein the first proportion is the ratio of the number of pieces of first characteristic information whose regression result is consistent with the label information to the first quantity, and the second proportion is the ratio of the number of pieces of second characteristic information whose regression result is consistent with the label information to the second quantity; judge whether the first proportion and the second proportion meet a preset condition, the preset condition comprising: the first proportion and the second proportion are both greater than a proportion threshold; and, when the first proportion and the second proportion meet the preset condition, take the trained logistic regression model as the preset logistic regression model; a second model generation unit (not shown) configured to perform Bayesian analysis on the regression results when the first proportion and the second proportion do not meet the preset condition, adjust the regression coefficient of each preset financial event keyword until the preset condition is met, and take the logistic regression model with the adjusted regression coefficients as the preset logistic regression model; and a keyword obtaining unit (not shown) configured to obtain information of an entity object corresponding to a financial object to which the preset financial event has occurred, divide the information into a plurality of sentences, segment the sentences to obtain a plurality of words, and perform cluster analysis on the plurality of words to obtain the preset financial event keywords.
The application also provides a server, which may comprise the information acquisition apparatus described with reference to fig. 4. The server may be configured with one or more processors and a memory for storing one or more programs, wherein the one or more programs may include instructions for performing the operations described in steps 201 to 203 above. The one or more programs, when executed by the one or more processors, cause the one or more processors to perform the operations described in steps 201 to 203 above.
Fig. 5 shows a schematic structural diagram of a server suitable for implementing the information acquisition method according to the embodiment of the present application.
As shown in fig. 5, the server includes a Central Processing Unit (CPU) 501, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input section 506; an output section 507; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.
The processes described in the above-described respective steps in the present application may be implemented as a computer program. The computer program may be carried on a computer readable medium, the computer program comprising instructions for carrying out the method illustrated in the flow chart. The computer program can be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511.
The present application also provides a computer readable medium, which may be included in a server, or may exist independently without being assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquire information of an entity object corresponding to a financial object, and extract keywords from the information; input the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects, and the characteristic information comprises: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object; and generate, based on the output result, indication information indicating whether the preset financial event will occur to the financial object.
It should be noted that the computer readable medium can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the present application. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. An information acquisition method, characterized in that the method comprises:
acquiring information of an entity object corresponding to a financial object, and extracting keywords in the information;
inputting the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on characteristic information of a plurality of financial objects, and the characteristic information comprises: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object;
generating indication information indicating whether a preset financial event occurs to the financial object based on the output result;
before obtaining the information of the entity object corresponding to the financial object, the method further comprises:
constructing a logistic regression model;
acquiring characteristic information of a plurality of financial objects, and dividing the characteristic information into characteristic information for training and characteristic information for verification;
training the logistic regression model by using the feature information for training to obtain the trained logistic regression model, wherein each preset financial event keyword in the feature information for training corresponds to one regression coefficient;
inputting the preset financial event keywords in each piece of feature information for verification into the trained logistic regression model to obtain a plurality of regression results, each indicating whether the preset financial event will occur to the corresponding financial object;
calculating a first proportion and a second proportion, wherein the first proportion is the ratio of the number of pieces of first characteristic information whose regression result is consistent with the label information to a first number, the second proportion is the ratio of the number of pieces of second characteristic information whose regression result is consistent with the label information to a second number, the first characteristic information is the characteristic information for verification of financial objects to which the preset financial event has occurred, the second characteristic information is the characteristic information for verification of financial objects to which the preset financial event has not occurred, the first number is the number of all pieces of first characteristic information in the characteristic information for verification, and the second number is the number of all pieces of second characteristic information in the characteristic information for verification;
and in response to the first proportion and the second proportion meeting a preset condition, taking the trained logistic regression model as a preset logistic regression model, wherein the preset condition comprises: the first proportion and the second proportion are both greater than a proportion threshold.
2. The method of claim 1, wherein the output result indicates a probability that the preset financial event will occur to the financial object; and
generating indication information indicating whether the preset financial event will occur to the financial object based on the output result comprises:
when the probability is greater than a probability threshold, generating indication information indicating that the preset financial event will occur to the financial object;
and when the probability is less than the probability threshold, generating indication information indicating that the preset financial event will not occur to the financial object.
3. The method of claim 2, wherein before acquiring the information of the entity object corresponding to the financial object, the method further comprises:
determining the first number and the second number;
and judging whether the first proportion and the second proportion meet preset conditions.
4. The method of claim 3, further comprising:
when the first proportion and the second proportion do not meet the preset conditions, carrying out Bayesian analysis on the regression results, and adjusting the regression coefficient of each preset financial event keyword until the preset conditions are met;
and taking the logistic regression model after adjusting the regression coefficient of the preset financial event key words as a preset logistic regression model.
5. The method of claim 4, wherein before acquiring the information of the entity object corresponding to the financial object, the method further comprises:
acquiring information of an entity object corresponding to a financial object in which a preset financial event occurs;
dividing the information into a plurality of information sentences, and performing word segmentation on the information sentences to obtain a plurality of words;
and carrying out cluster analysis on the plurality of words to obtain preset financial event keywords.
6. An information acquisition apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire information of an entity object corresponding to a financial object and extract keywords from the information;
a prediction unit configured to input the keywords into a preset logistic regression model to obtain an output result, wherein the preset logistic regression model is generated in advance by training on feature information of a plurality of financial objects, and the feature information includes: label information indicating whether the preset financial event has occurred to the financial object, and preset financial event keywords in the information of the entity object corresponding to the financial object;
a generating unit configured to generate indication information indicating whether a preset financial event may occur to the financial object based on the output result;
a first model generation unit configured to: construct a logistic regression model; acquire characteristic information of a plurality of financial objects, and divide the characteristic information into characteristic information for training and characteristic information for verification; train the logistic regression model using the feature information for training to obtain the trained logistic regression model, wherein each preset financial event keyword in the feature information for training corresponds to one regression coefficient; input the preset financial event keywords in each piece of feature information for verification into the trained logistic regression model to obtain a plurality of regression results, each indicating whether the preset financial event will occur to the corresponding financial object; calculate a first proportion and a second proportion, wherein the first proportion is the ratio of the number of pieces of first characteristic information whose regression result is consistent with the label information to a first number, the second proportion is the ratio of the number of pieces of second characteristic information whose regression result is consistent with the label information to a second number, the first characteristic information is the characteristic information for verification of financial objects to which the preset financial event has occurred, the second characteristic information is the characteristic information for verification of financial objects to which the preset financial event has not occurred, the first number is the number of all pieces of first characteristic information in the characteristic information for verification, and the second number is the number of all pieces of second characteristic information in the characteristic information for verification; and in response to the first proportion and the second proportion meeting a preset condition, take the trained logistic regression model as the preset logistic regression model, wherein the preset condition comprises: the first proportion and the second proportion are both greater than a proportion threshold.
7. The apparatus of claim 6, wherein the generating unit comprises:
an indication information generation subunit configured to generate indication information indicating that the preset financial event will occur to the financial object when the output result indicates that the probability that the preset financial event will occur to the financial object is greater than a probability threshold; and to generate indication information indicating that the preset financial event will not occur to the financial object when the probability is less than the probability threshold.
8. The apparatus according to claim 7, wherein the first model generation unit is further configured to determine the first number and the second number; judging whether the first proportion and the second proportion meet preset conditions or not;
the device further comprises:
a second model generation unit configured to perform Bayesian analysis on the regression results when the first proportion and the second proportion do not meet the preset condition, and adjust the regression coefficient of each preset financial event keyword until the preset condition is met; and to take the logistic regression model with the adjusted regression coefficients of the preset financial event keywords as the preset logistic regression model;
a keyword acquisition unit configured to acquire information of an entity object corresponding to a financial object to which the preset financial event has occurred; divide the information into a plurality of information sentences, and perform word segmentation on the information sentences to obtain a plurality of words; and perform cluster analysis on the plurality of words to obtain the preset financial event keywords.
9. A server, comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method recited in any of claims 1-5.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
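The verification step recited in claim 1 can be illustrated with a short Python sketch. The first proportion as defined there is what is commonly called sensitivity (true positive rate), and the second proportion is specificity (true negative rate). The function names, the boolean encoding of regression results and labels, and the example threshold are assumptions made for illustration, not part of the claims.

```python
def validation_proportions(regression_results, labels):
    """Compute the first and second proportions on the verification set.

    regression_results[i] is True if the model predicts the preset event for
    object i; labels[i] is True if the label information says it occurred.
    """
    pairs = list(zip(regression_results, labels))
    first_number = sum(1 for _, label in pairs if label)       # objects with the event
    second_number = sum(1 for _, label in pairs if not label)  # objects without the event
    first_correct = sum(1 for result, label in pairs if label and result)
    second_correct = sum(1 for result, label in pairs if not label and not result)
    # Guard against empty classes; treating them as 0.0 is an assumption of this sketch.
    first = first_correct / first_number if first_number else 0.0
    second = second_correct / second_number if second_number else 0.0
    return first, second


def meets_preset_condition(first, second, proportion_threshold):
    """Preset condition: both proportions exceed the proportion threshold."""
    return first > proportion_threshold and second > proportion_threshold


if __name__ == "__main__":
    labels = [True, True, True, True, False, False, False, False, False]
    results = [True, True, True, False, False, False, False, False, True]
    first, second = validation_proportions(results, labels)
    print(first, second, meets_preset_condition(first, second, 0.7))
```

Requiring both proportions to exceed the threshold, rather than overall accuracy alone, prevents a model from being accepted when it performs well only on the larger class.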
CN201710153107.3A 2017-03-15 2017-03-15 Information acquisition method and device Active CN108628863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710153107.3A CN108628863B (en) 2017-03-15 2017-03-15 Information acquisition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710153107.3A CN108628863B (en) 2017-03-15 2017-03-15 Information acquisition method and device

Publications (2)

Publication Number Publication Date
CN108628863A CN108628863A (en) 2018-10-09
CN108628863B true CN108628863B (en) 2021-07-20

Family

ID=63687386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710153107.3A Active CN108628863B (en) 2017-03-15 2017-03-15 Information acquisition method and device

Country Status (1)

Country Link
CN (1) CN108628863B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033284A (en) * 2019-03-13 2019-07-19 平安城市建设科技(深圳)有限公司 Source of houses verification method, apparatus, equipment and storage medium
CN111786802B (en) * 2019-04-03 2023-07-04 北京嘀嘀无限科技发展有限公司 Event detection method and device
CN112989165B (en) * 2021-03-26 2022-07-01 浙江有数数智科技有限公司 Method for calculating public opinion entity relevance
CN113111635A (en) * 2021-04-19 2021-07-13 中国工商银行股份有限公司 Report form comparison method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103176983A (en) * 2011-12-20 2013-06-26 中国科学院计算机网络信息中心 Event warning method based on Internet information
CN105528652A (en) * 2015-12-03 2016-04-27 北京金山安全软件有限公司 Method and terminal for establishing prediction model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7389265B2 (en) * 2001-01-30 2008-06-17 Goldman Sachs & Co. Systems and methods for automated political risk management
CN102135967B (en) * 2010-01-27 2013-06-05 华为技术有限公司 Webpage keywords extracting method, device and system
CN106372961A (en) * 2016-08-23 2017-02-01 北京小米移动软件有限公司 Commodity recommendation method and device

Also Published As

Publication number Publication date
CN108628863A (en) 2018-10-09

Similar Documents

Publication Publication Date Title
CN110555451B (en) Information identification method and device
CN111144937A (en) Advertisement material determination method, device, equipment and storage medium
CN108628863B (en) Information acquisition method and device
CN111061877A (en) Text theme extraction method and device
CN110543946A (en) method and apparatus for training a model
CN113780367A (en) Classification model training and data classification method and device, and electronic equipment
US10733537B2 (en) Ensemble based labeling
CN111383100A (en) Risk model-based full life cycle management and control method and device
CN114298050A (en) Model training method, entity relation extraction method, device, medium and equipment
CN116756041A (en) Code defect prediction and positioning method and device, storage medium and computer equipment
CN111178687A (en) Financial risk classification method and device and electronic equipment
CN116932919A (en) Information pushing method, device, electronic equipment and computer readable medium
CN113052635A (en) Population attribute label prediction method, system, computer device and storage medium
CN111582649B (en) Risk assessment method and device based on user APP single-heat coding and electronic equipment
CN111582645B (en) APP risk assessment method and device based on factoring machine and electronic equipment
CN111191677B (en) User characteristic data generation method and device and electronic equipment
CN115345600A (en) RPA flow generation method and device
CN110704614B (en) Information processing method and device for predicting user group type in application
CN117478434B (en) Edge node network traffic data processing method, device, equipment and media
CN113111167B (en) Method and device for extracting warning text received vehicle model based on deep learning model
CN119106750A (en) Task processing method based on large model, device, equipment and medium
CN109933926B (en) Method and device for predicting flight reliability
CN113780996A (en) Post data detection method, model training method and device and electronic equipment
CN110674497B (en) Malicious program similarity calculation method and device
CN110119433B (en) Method and apparatus for predicting gender

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191122

Address after: 201210 room j1328, floor 3, building 8, No. 55, Huiyuan Road, Jiading District, Shanghai

Applicant after: SHANGHAI YOUYANG NEW MEDIA INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 100085 Beijing, Haidian District, No. ten on the ground floor, No. 10 Baidu building, layer three

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181009

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

Assignor: SHANGHAI YOUYANG NEW MEDIA INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2020990000201

Denomination of invention: Network service description information acquisition method and network service description information acquisition device

License type: Exclusive License

Record date: 20200420

GR01 Patent grant
CP03 Change of name, title or address

Address after: 401120 b7-7-2, Yuxing Plaza, No.5, Huangyang Road, Yubei District, Chongqing

Patentee after: Chongqing duxiaoman Youyang Technology Co.,Ltd.

Address before: 201210 room j1328, 3 / F, building 8, 55 Huiyuan Road, Jiading District, Shanghai

Patentee before: SHANGHAI YOUYANG NEW MEDIA INFORMATION TECHNOLOGY Co.,Ltd.