CN119167063A - Data prediction method, large model training method, device and electronic equipment - Google Patents
Data prediction method, large model training method, device and electronic equipment
- Publication number
- CN119167063A (application CN202411283026.1A)
- Authority
- CN
- China
- Prior art keywords
- information
- feature data
- sample
- data prediction
- news
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Business, Economics & Management (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Finance (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Accounting & Taxation (AREA)
- Economics (AREA)
- Computing Systems (AREA)
- Technology Law (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Marketing (AREA)
- Computational Linguistics (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The disclosure provides a data prediction method, a large model training method, a device, and an electronic device, relating to the technical field of artificial intelligence, and in particular to deep learning, data mining, data analysis, big data processing, and the like. The method comprises: obtaining a plurality of information summaries related to a target object, wherein the information summaries have different information categories and characterize the performance and/or affected degree of the target object in different dimensions; and obtaining, by using a target large model and based on the information summaries, a feature data prediction result for the target object and interpretation information for that result.
Description
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to deep learning, data mining, data analysis, big data processing, and the like, and specifically to a data prediction method, a large model training method, a device, and an electronic device.
Background
With the sustained and steady growth of the economy, financial markets have become the core of modern economic systems and attract the attention of numerous investors. However, because the market environment can change suddenly, financial markets may undergo severe fluctuations, which makes market participants eager for an efficient and accurate way to predict some of the feature data associated with the financial market in order to optimize their investment decisions.
Disclosure of Invention
The disclosure provides a data prediction method, a large model training method, a device and electronic equipment.
According to a first aspect of the present disclosure, there is provided a data prediction method, comprising:
obtaining a plurality of information summaries related to a target object, wherein the plurality of information summaries have different information categories and characterize the performance and/or affected degree of the target object in different dimensions; and
obtaining, by using a target large model and based on the plurality of information summaries, a feature data prediction result for the target object and interpretation information for the feature data prediction result.
According to a second aspect of the present disclosure, there is provided a large model training method comprising:
obtaining a plurality of information summary samples related to a target object sample, wherein the plurality of information summary samples have different information categories and characterize the performance and/or affected degree of the target object sample in different dimensions;
obtaining, by using an initial large model and based on the plurality of information summary samples, a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample; and
training the initial large model based on the feature data prediction result sample and the interpretation information sample to obtain a target large model.
According to a third aspect of the present disclosure, there is provided a data prediction apparatus comprising:
a summary information acquisition unit, configured to acquire a plurality of information summaries related to a target object, wherein the plurality of information summaries have different information categories and characterize the performance and/or affected degree of the target object in different dimensions; and
a feature data prediction unit, configured to obtain, by using a target large model and based on the plurality of information summaries, a feature data prediction result for the target object and interpretation information for the feature data prediction result.
According to a fourth aspect of the present disclosure, there is provided a large model training apparatus comprising:
a first sample acquisition unit, configured to acquire a plurality of information summary samples related to a target object sample, wherein the plurality of information summary samples have different information categories and characterize the performance and/or affected degree of the target object sample in different dimensions;
a second sample acquisition unit, configured to obtain, by using an initial large model and based on the plurality of information summary samples, a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample; and
a large model training unit, configured to train the initial large model based on the feature data prediction result sample and the interpretation information sample to obtain a target large model.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising:
At least one processor;
A memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the methods provided by the embodiments of the present disclosure.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform a method provided according to an embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to embodiments of the present disclosure.
By adopting the method and the device, the accuracy and the usability of the characteristic data prediction result can be improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flowchart of a data prediction method according to an embodiment of the disclosure;
FIG. 2 is a flowchart illustrating a news digest acquisition process according to an embodiment of the present disclosure;
FIG. 3 is an explanatory diagram of a feature data dynamic digest acquisition process according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a macro information digest acquisition process according to an embodiment of the disclosure;
FIG. 5 is a flowchart illustrating a fundamentals digest acquisition process according to an embodiment of the present disclosure;
FIG. 6 is an overall flowchart of a data prediction method according to an embodiment of the disclosure;
FIG. 7 is an auxiliary explanatory diagram of the overall flow of a data prediction method according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a scenario of a data prediction method according to an embodiment of the present disclosure;
FIG. 9 is a flowchart of a large model training method provided in an embodiment of the disclosure;
FIG. 10 is an overall flowchart of a large model training method according to an embodiment of the disclosure;
FIG. 11 is an auxiliary illustration of the overall flow of a large model training method provided by embodiments of the present disclosure;
FIG. 12 is a schematic diagram of a scenario of a large model training method provided by an embodiment of the present disclosure;
FIG. 13 is a schematic block diagram of a data prediction apparatus according to an embodiment of the present disclosure;
FIG. 14 is a schematic block diagram of a large model training apparatus provided in an embodiment of the present disclosure;
FIG. 15 is a schematic block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As described in the background, with the sustained and robust growth of the economy, financial markets have become the core of modern economic systems, attracting the attention of numerous investors. However, because the market environment can change abruptly, financial markets may suffer severe fluctuations, which makes market participants eager for an efficient and accurate way to predict some of the feature data related to the financial market in order to optimize their investment decisions.
Taking the stock market as an example, some market participants currently understand and analyze market information to obtain a stock price trend prediction for a target stock, and make investment decisions accordingly. However, retail investors, with limited ability to understand and analyze market information, a tendency toward behavioral biases, and a lack of risk management skills, easily miss promising investment opportunities and take on undue risk. Similarly, small and medium-sized asset and wealth management companies find it difficult to analyze market information in depth because of limited resources and narrow coverage. Even large professional asset and wealth management companies suffer from poor communication and divergent incentives caused by their sheer organizational scale, which impairs their ability to understand and analyze market information and ultimately affects the accuracy of stock price trend predictions for the target stock.
In seeking to solve the above problems, researchers have explored a solution for predicting stock price trends using deep learning techniques, which mainly includes:
(1) Method for predicting stock price trend based on text analysis
In early studies on stock price trend prediction based on text analysis, researchers were mainly concerned with how to evaluate the effectiveness of different text representations (e.g., bags of words, noun phrases, named entities) using a support vector machine (Support Vector Machine, SVM). Later, these "shallow" features were gradually replaced by structured data that forms events in the shape of (Action, Object) tuples and serves as input to a neural network model for stock price trend prediction. Still later, as artificial intelligence technology developed, researchers began to capture key information directly from pre-trained text embeddings using attention-based neural network models, improving model performance while accounting for the noisy and diverse nature of textual information.
(2) Method for realizing stock price trend prediction by using existing large language model
The studies of Zaremba and Demir reveal the great potential of ChatGPT in the financial field, especially for tasks based on natural language processing techniques, such as sentiment analysis of public opinion news; these tasks show a significant correlation between natural language processing targets and stock market dynamics. In addition, Lopez-Lira and Tang verified the accuracy of ChatGPT in sentiment analysis of public opinion news and emphasized the positive correlation between the stock price trend predictions (score values) generated by ChatGPT and subsequent stock returns.
(3) Method for predicting stock price trend by using large language model in vertical field
Among related studies that use large language models for financial tasks, BloombergGPT stands out: a 50-billion-parameter financial large language model (i.e., a vertical-domain large language model) trained on existing financial corpora, which performs named entity recognition and sentiment analysis of public opinion news and, on that basis, stock price trend prediction.
However, the inventor has found that the above scheme for predicting stock price trend by using deep learning technique still has the following disadvantages:
(1) Single information source
In the prior art, when an existing large language model is used to predict price trends, usually only a single type of text data (e.g., public opinion news) is considered as the prediction basis; other types of text data (e.g., price dynamics data, macroeconomic research reports, fundamentals information) and data of other modalities (e.g., image data, knowledge graphs) are not considered. As a result, the large language model cannot produce a sufficiently accurate price trend prediction.
(2) Difficulties exist in processing social text and related public opinion
Social text and related public opinion increase the difficulty of predicting stock price trends with existing large language models. In particular, breaking news, crisis events, and government reports may have a large impact on stock prices, whereas unverified opinions, minor rumors, and ambiguous comments typically do not. It is difficult for existing large language models to accurately weigh this confounding social text and related public opinion and to derive a maximum likelihood estimate that ensures the accuracy of the stock price trend prediction.
(3) The accumulated error influences the accuracy of the stock price trend prediction result
Predicting stock price trends with existing large language models typically involves multiple prediction stages. If a low-quality result is produced in an earlier stage, it affects the prediction of the next stage; errors gradually accumulate and ultimately degrade the accuracy of the stock price trend prediction.
(4) Insufficient interpretation ability
When an existing large language model is used to predict stock price trends, usually only the prediction itself is obtained, without corresponding interpretation information. This may reduce the user's confidence in the prediction and thereby reduce its usability.
In summary, current schemes that predict price trends with deep learning techniques can neither obtain sufficiently accurate price trend predictions nor ensure the usability of those predictions.
In view of the above problems, embodiments of the present disclosure provide a data prediction method that may be applied to an electronic device. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device. In the following, a data prediction method provided by the embodiment of the present disclosure will be described with reference to a flowchart shown in fig. 1. It should be noted that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in other orders.
Step S101, a plurality of information summaries related to the target object are acquired.
The target object may be any financial investment object (e.g., stock, fund, futures, etc.) that requires prediction of characteristic data. Here, the feature data may be price trends of the target object.
Furthermore, in the disclosed embodiments, the plurality of information summaries have different information categories and characterize the performance and/or affected degree of the target object in different dimensions. For example, the plurality of information summaries may include at least one of a news digest, a feature data dynamic digest, a macro information digest, and a fundamentals digest.
The news digest may be summary information obtained by processing public opinion news related to the target object, representing the affected degree of the target object in the public opinion news dimension. The feature data dynamic digest may be summary information obtained by processing price dynamics data of the target object, representing its market performance. The macro information digest may be summary information obtained by processing macroeconomic market data, representing the affected degree of the target object in the macroeconomic dimension. The fundamentals digest may be summary information obtained by processing fundamentals data related to the target object, representing its affected degree in the fundamentals dimension.
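The four digest categories above can be modeled as a small data structure. The following is a minimal sketch, not part of the patent; the class names, ticker, and digest texts are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class SummaryCategory(Enum):
    NEWS = "news"                          # public opinion news digest
    FEATURE_DYNAMICS = "feature_dynamics"  # price/market dynamics digest
    MACRO = "macro"                        # macroeconomic digest
    FUNDAMENTALS = "fundamentals"          # fundamentals digest

@dataclass
class InformationSummary:
    category: SummaryCategory  # the information category of this digest
    target_object: str         # e.g., a stock ticker (hypothetical)
    content: str               # digest text produced by a summarizer

# Illustrative input for the target large model in step S102.
summaries = [
    InformationSummary(SummaryCategory.NEWS, "ACME", "Earnings beat expectations."),
    InformationSummary(SummaryCategory.MACRO, "ACME", "Central bank held rates."),
]
```

Each digest carries its own category, so a downstream prompt builder can label the different dimensions explicitly for the model.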
Step S102, obtaining a characteristic data prediction result of the target object and interpretation information aiming at the characteristic data prediction result by utilizing the target large model based on a plurality of information abstracts.
The characteristic data prediction result may be a price trend prediction result of the target object.
In an example, answer guide information may be acquired, and the answer guide information and the plurality of information summaries may be input into the target large model, so that the target large model processes the plurality of information summaries according to the answer guide information to obtain a feature data prediction result for the target object and interpretation information for that result. The answer guide information is a first prompt (Prompt), which indicates how the target large model should obtain the feature data prediction result and the corresponding interpretation information based on the plurality of information summaries.
In addition, in the embodiment of the present disclosure, the target large model is a target signal generator, and the target signal generator may be a large language model. Here, the large language model may be a pre-trained, parameter-fine-tuned neural network model that stores general language knowledge, world knowledge, domain expertise, and the like inside the model in the form of parameters. In particular, the large language model may be an autoregressive generative model with a Transformer architecture.
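Step S102 could be sketched as follows. This is a hedged illustration, not the patented implementation: `build_prompt` and the prompt wording are assumptions, and `call_large_model` is a deterministic stub standing in for the target large model's real inference API.

```python
def build_prompt(guide: str, summaries: list[str]) -> str:
    # Concatenate the answer-guide information (the first Prompt) with the
    # information summaries, as described for step S102. Format is assumed.
    parts = [guide, "--- Information summaries ---"]
    parts += [f"[{i + 1}] {s}" for i, s in enumerate(summaries)]
    parts.append("Return: (1) a price-trend prediction; (2) an explanation.")
    return "\n".join(parts)

def call_large_model(prompt: str) -> dict:
    # Placeholder for the target signal generator; a real system would call
    # an LLM here. Returns a fixed result so the sketch is self-contained.
    return {"prediction": "up", "explanation": "Positive news digest."}

prompt = build_prompt(
    "Predict the next-period price trend of the target stock from the digests below.",
    ["News digest: earnings beat expectations.", "Macro digest: rates unchanged."],
)
result = call_large_model(prompt)
```

The key point mirrored from the text is that the model is asked for both the prediction and its interpretation in a single guided pass.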
With the data prediction method provided by the embodiments of the present disclosure, a plurality of information summaries that relate to the target object and have different information categories can be obtained for the performance and/or affected degree of the target object in different dimensions, and the target large model can then produce a feature data prediction result for the target object and interpretation information for that result based on the plurality of information summaries. On the one hand, the target large model can comprehensively understand the performance and affected degree of the target object from multiple dimensions, which provides a relatively complete prediction basis, reduces the bias of a single information source, and improves the accuracy of the feature data prediction result. On the other hand, the target large model obtains not only the prediction but also interpretation information for it, which improves the interpretability of the prediction, strengthens the user's trust in it, and thus improves its usability.
Public opinion news about the target entity (i.e., the economic organization that issued the target object), including announcements, reports, analyst opinions, and research results, may have an immeasurable short-term or long-term impact on market sentiment and stock prices. Therefore, in the embodiments of the present disclosure, a news summarizer may be used to process (e.g., collect and condense) the public opinion news related to the target object to obtain a daily news digest and/or a current-period news digest (e.g., a weekly news digest, a monthly news digest, or a period digest of another length) related to the target object. Accordingly, in embodiments of the present disclosure, the plurality of information summaries may include a news digest, and the news digest may include a daily news digest and/or a current-period news digest.
In some alternative embodiments, in the case that the news digest includes a daily news digest, step S101, that is, "obtaining a plurality of information digests related to the target object" may include:
Acquiring N groups of daily public opinion news related to a target object, wherein the N groups of daily public opinion news are in one-to-one correspondence with N independent dates in the current period, and N is more than or equal to 2 and is an integer;
and using each group of daily public opinion news in the N groups as the public opinion news to be processed, and processing the public opinion news to be processed with a first news summarizer to obtain a daily news digest corresponding to the public opinion news to be processed.
The public opinion news may be in text form or picture form, which is not limited by the embodiment of the disclosure.
In one example, N groups of daily public opinion news related to the target object may be obtained using an application programming interface (Application Programming Interface, API) that provides news about stock markets and financial resources, and each group may be preprocessed to exclude news text unrelated to the target object (e.g., news text with clickbait headlines) and to ensure that each group has a text format suitable as input for the first news summarizer.
After the N groups of daily public opinion news are obtained, each group may be used as the public opinion news to be processed and input into the first news summarizer, so that the first news summarizer identifies and extracts organization names, entity names (e.g., person names, product names), and key events (e.g., earnings reports, mergers and acquisitions) in the public opinion news to be processed, understands the context behind the news (e.g., event causes, organizational responses), and finally outputs the daily news digest corresponding to the public opinion news to be processed in a preset output format, which is stored in a centralized repository.
The first news summarizer may be a large language model. Here, the large language model may be a pre-trained neural network model that stores general language knowledge, world knowledge, domain expertise, and the like inside the model in the form of parameters. In particular, the large language model may be an autoregressive generative model with a Transformer architecture.
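The daily-digest pipeline described above (prompt the summarizer, collect the digest into a repository keyed by date) could look roughly like this. It is a sketch under stated assumptions: the prompt template, the date key, and the `llm=None` fallback are all illustrative, and a real implementation would pass an actual LLM callable.

```python
def summarize_daily_news(news_items: list[str], llm=None) -> str:
    # Hypothetical prompt for the first news summarizer (an LLM): extract
    # organizations, entities and key events, then produce a daily digest.
    template = (
        "Extract organization names, entity names and key events from the "
        "news below, explain the context, and output a daily digest:\n{news}"
    )
    prompt = template.format(news="\n".join(news_items))
    if llm is None:
        # Deterministic stub so this sketch runs without a model.
        return f"Daily digest covering {len(news_items)} news item(s)."
    return llm(prompt)

# The "centralized repository" from the text, modeled as a dict keyed by date.
repository: dict[str, str] = {}
repository["2024-09-12"] = summarize_daily_news(
    ["ACME reports record quarterly revenue.", "ACME announces merger talks."]
)
```

Preprocessing (dropping unrelated or clickbait items) would happen before `summarize_daily_news` is called, as the example in the text describes.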
Further, in the case where the news digest further includes the news digest of the current period, step S101, that is, "acquiring a plurality of information digests related to the target object" may further include:
obtaining a daily news abstract set based on N daily news abstracts corresponding to N groups of daily public opinion news one by one;
acquiring a historical period news abstract corresponding to a historical period;
and processing the daily news digest set based on the historical-period news digest by using a second news summarizer to obtain the current-period news digest corresponding to the current period.
The time length of the current period may be one week, in which case the current period and the historical period may be two consecutive weeks; or the time length of the current period may be one month, in which case they may be two consecutive months. The second news summarizer may be a large language model, and the second news summarizer and the first news summarizer may be the same large language model.
In an example, after the N daily news digests are obtained, they may be concatenated to obtain the daily news digest set; the historical-period news digest corresponding to the historical period is acquired; and the second news summarizer then processes the daily news digest set based on the historical-period news digest to obtain the current-period news digest corresponding to the current period. The current-period news digest may cover, among other things, the target entity's monthly market position, sales performance, legal and strategic actions, innovation and sustainable development, and regulatory challenges.
To further clarify how the daily news digest and the current-period news digest are obtained, the process is described below using mathematical formulas in conjunction with FIG. 2:
DN_{S,i} = Summarize_1(N_{S,i}), 1 ≤ i ≤ N
PN_{S,t} = Summarize_2(PN_{S,t-1}, N_{S,τ})
where N_{S,i} denotes the daily public opinion news corresponding to the i-th independent date in the current period; DN_{S,i} denotes the result of the first news summarizer processing the public opinion news to be processed, i.e., the daily news digest corresponding to the i-th independent date in the current period; N_{S,τ} denotes the daily news digest set; PN_{S,t-1} denotes the historical-period news digest; and PN_{S,t} denotes the result of the second news summarizer processing the daily news digest set, i.e., the current-period news digest.
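The two formulas amount to a simple recursion: summarize each day, then fold the daily digests into the previous period's digest. A minimal sketch with deterministic stub summarizers (the stubs and their output strings are assumptions, not the patented models):

```python
def summarize1(daily_news: list[str]) -> str:
    # Stub for the first news summarizer: DN_i = Summarize1(N_i).
    return f"digest({len(daily_news)} items)"

def summarize2(previous_period_digest: str, daily_digests: list[str]) -> str:
    # Stub for the second news summarizer:
    # PN_t = Summarize2(PN_{t-1}, N_tau).
    return (f"period digest from {len(daily_digests)} daily digests, "
            f"prior: {previous_period_digest}")

# N = 3 groups of daily public opinion news for the current period.
n_days = [["news a", "news b"], ["news c"], []]
daily_digests = [summarize1(d) for d in n_days]        # the set N_tau
pn_t = summarize2("last period digest", daily_digests) # current-period digest
```

The recursion means each period's digest carries forward condensed history, which is what lets the weekly or monthly digest stay bounded in length while still reflecting earlier events.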
In this way, in the embodiments of the present disclosure, after the N groups of daily public opinion news related to the target object are obtained, each group is used as the public opinion news to be processed and processed by the first news summarizer to obtain the corresponding daily news digest. This reduces the time cost of manually understanding and analyzing the N groups of daily public opinion news and reduces manual processing errors, thereby improving the reliability of the daily news digest. Likewise, a daily news digest set may be obtained from the N daily news digests corresponding one-to-one to the N groups of daily public opinion news, the historical-period news digest corresponding to the historical period may be acquired, and the second news summarizer may process the daily news digest set based on the historical-period news digest to obtain the current-period news digest corresponding to the current period. This reduces the time cost of manually understanding and analyzing the daily news digest set and reduces manual processing errors, thereby improving the reliability of the current-period news digest.
In some alternative embodiments, the plurality of information digests includes a feature data dynamic summary. Based on this, step S101, that is, "acquiring a plurality of information digests related to the target object", may include:
Acquiring first market data of a target object;
acquiring second market data of a first target number of reference objects similar to the target object;
And obtaining the feature data dynamic summary based on the first market data of the target object, the second market data of the first target number of reference objects, and the specific reference index by using the feature data dynamic summarizer.
Taking the target object as a stock for which feature data prediction is required as an example, the first market data may include the yield, volatility, Sharpe ratio, maximum drawdown, various moving averages, relative strength index, on-balance volume indicator, money flow index, Bollinger Band indicator, and the like of the target object within a preset period. Here, the preset period may be a historical period of one month, one year, or another length of time.
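A few of the first-market-data indicators above can be computed directly from a closing-price series. The sketch below (illustrative sample prices, population volatility, and a zero risk-free rate are all assumptions) shows the period returns, volatility, Sharpe ratio, and maximum drawdown:

```python
import math

def daily_returns(prices: list[float]) -> list[float]:
    """Simple period-over-period returns from a closing-price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def volatility(returns: list[float]) -> float:
    """Population standard deviation of the returns."""
    mean = sum(returns) / len(returns)
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))

def sharpe_ratio(returns: list[float], risk_free: float = 0.0) -> float:
    """Mean excess return per unit of volatility (0 if volatility is 0)."""
    vol = volatility(returns)
    mean = sum(returns) / len(returns)
    return (mean - risk_free) / vol if vol else 0.0

def max_drawdown(prices: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst

prices = [100.0, 104.0, 102.0, 108.0, 105.0]
rets = daily_returns(prices)
```

In practice the same metrics would be computed over the preset period for the target object and each reference object, producing the inputs Metrics S,t of the dynamic summarizer.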
In addition, in the embodiment of the disclosure, the first target number may be set according to actual application requirements, for example, may be set to 3. The first target number of reference objects may be financial investment objects issued by a first target number of economic organizations with high similarity to the target entity, selected by a field expert in the industry to which the target entity belongs based on industry expertise (Know-How). The second market data may include, for example, the yield, volatility, Sharpe ratio, maximum drawdown, various moving averages, relative strength index, on-balance volume indicator, money flow index, Bollinger Band indicator, and the like of each reference object.
After obtaining the first market data of the target object and the second market data of the first target number of reference objects, the feature data dynamic summary may be obtained by using the feature data dynamic summarizer according to the first extraction instruction, based on the first market data of the target object, the second market data of the first target number of reference objects, and the specific reference index (e.g., when both the target object and the reference objects are stocks, the specific reference index may be the CSI 1000 index).
The first extraction instruction may include a digest extraction requirement for each of a plurality of sub-class feature data indicators included in the feature data dynamic summary. Similarly, when the target object and the reference objects are stocks, the plurality of sub-class feature data indicators may include the yield, volatility, Sharpe ratio, maximum drawdown, current and future market sentiment, a conclusion, profit driving factors, future financial prospects, and the like.
In order to further clarify the process of obtaining the dynamic summary of the feature data, the process of obtaining the dynamic summary of the feature data will be described below with reference to fig. 3 by using a mathematical formula:
Wherein Metrics S,t is used to characterize the first market data of the target object; the second market data of the j 1-th reference object among the first target number (i.e., n 1) of reference objects, with 1 ≤ j 1 ≤ n 1, is characterized analogously; and P S,t is used to characterize the processing result obtained by the feature data dynamic summarizer processing the first market data of the target object, the second market data of the first target number of reference objects, and the specific reference index, namely the feature data dynamic summary.
In this way, in the embodiment of the disclosure, the first market data of the target object may be acquired, the second market data of the first target number of reference objects similar to the target object may be acquired, and the feature data dynamic summarizer may be utilized to obtain the feature data dynamic summary based on the first market data of the target object, the second market data of the first target number of reference objects, and the specific reference index. In this way, not only the time cost required for manually understanding and analyzing the first market data of the target object and for understanding and analyzing the second market data of the first target number of reference objects can be reduced, but also the reliability of the dynamic summary of the feature data can be improved because the first market data, the second market data of the first target number of reference objects, and the specific reference index can relatively comprehensively represent the market performance of the target object.
Further, in-depth macroeconomic analysis is critical to making intelligent investment decisions and efficient capital allocation: it can provide key insights into overall economic health and performance, helping investors obtain critical information in a timely manner, better understand the current economic environment, and make informed investment decisions. Thus, in embodiments of the present disclosure, the plurality of information digests may also include a macro information digest. Based on this, in some alternative embodiments, step S101, that is, "obtaining a plurality of information digests related to the target object", may include:
Processing a second target number of macro research reports by using a first macro summarizer to obtain a second target number of macro research digests in one-to-one correspondence with the second target number of macro research reports;
processing the macro index data by using a second macro summarizer to obtain a macro index digest;
and obtaining the macro information digest based on the second target number of macro research digests and the macro index digest by using a third macro summarizer.
For the macro research digest, in one example, after a second target number of macro research reports (e.g., macroeconomic research reports) are obtained, each of the second target number of macro research reports may be taken as a research report to be processed, and the research report to be processed may be processed with the first macro summarizer according to the second extraction instruction to obtain the macro research digest corresponding to the research report to be processed.
The second target number may be set according to actual application requirements, which is not limited in this embodiment of the disclosure. The first macro summarizer may be a large language model, and the second extraction instruction is third Prompt information, which is used to instruct the first macro summarizer how to process the research report to be processed to obtain the macro research digest corresponding to it. The macro research digest may include digest information obtained by analyzing economic growth, inflation, the employment market, monetary policy, fiscal policy, international trade and exchange rates, the real estate market, the overseas macro environment, the macro environment and its outlook, and the asset and industry categories that are expected to rise or fall.
For the macro index digest, in an example, after macro index data (e.g., macro economic index data) is obtained, the macro index data may be processed with a second macro summarizer according to a third extraction instruction to obtain the macro index digest.
The macro index data may include the total amount of money in circulation (M0), M0 plus the demand deposits of individuals and enterprises (M1), M1 plus the time deposits of individuals and enterprises (M2), the year-on-year growth rates of these aggregates, the expected liquidity index, currency exchange rates, and the like. The second macro summarizer may be a large language model, and the second macro summarizer and the first macro summarizer may be the same large language model. The third extraction instruction is fourth Prompt information, which is used to instruct the second macro summarizer how to process the macro index data to obtain the macro index digest; for example, the third extraction instruction may include the macro index data of a reference historical period (for example, last month) and of the current period (for example, this month). The macro index digest may include digest information obtained by analyzing economic growth, inflation, the employment market, monetary policy, fiscal policy, international trade, exchange rates, and the like.
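The year-on-year growth rate of a monetary aggregate mentioned above is a simple ratio; a minimal sketch (the figures below are illustrative, not real statistics):

```python
def yoy_growth(current: float, same_period_last_year: float) -> float:
    """Year-on-year growth rate of a macro aggregate such as M0, M1, or M2."""
    return (current - same_period_last_year) / same_period_last_year

# Illustrative figures only: an aggregate growing from 280 to 301 over a year.
g = yoy_growth(301.0, 280.0)  # 0.075, i.e. 7.5% year-on-year
```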
For the macro information digest, in an example, after the second target number of macro research digests and the macro index digest are obtained, the macro information digest may be obtained based on them by using the third macro summarizer according to a fourth extraction instruction.
The third macro summarizer may be a large language model, and the third macro summarizer, the first macro summarizer, and the second macro summarizer may be the same large language model. The fourth extraction instruction is fifth Prompt information, which is used to instruct the third macro summarizer how to obtain the macro information digest based on the second target number of macro research digests and the macro index digest. The macro information digest may include digest information obtained by analyzing economic growth, inflation, the employment market, monetary policy, fiscal policy, international trade and exchange rates, the assets expected to rise or fall, and the like.
In order to further clarify the process of obtaining the macro information abstract, the process of obtaining the macro information abstract will be described below using mathematical formulas with reference to fig. 4:
Wherein the j 2-th macroeconomic report among the second target number (i.e., n 2) of macro research reports is characterized with 1 ≤ j 2 ≤ n 2; the processing result obtained by the first macro summarizer processing the j 2-th macroeconomic report is the macro research digest corresponding to the j 2-th macroeconomic report; Report t-1 is used to characterize the macroeconomic indexes of the historical period, and Report t is used to characterize the macroeconomic indexes of the current period; the processing result obtained by the second macro summarizer processing the macro index data is the macro index digest; and M t is used to characterize the processing result obtained by the third macro summarizer processing the second target number of macro research digests and the macro index digest, namely the macro information digest.
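The three-summarizer macro pipeline can be sketched as below. This is a hypothetical illustration: `call_llm` and `macro_information_digest` are stand-in names, and the stub returns a placeholder rather than a real model output.

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real large-language-model call.
    return f"[digest of {len(prompt)} chars]"

def macro_information_digest(research_reports: list[str],
                             index_prev: dict, index_curr: dict) -> str:
    # First macro summarizer: one research digest per macro research report.
    research_digests = [call_llm("Summarize macro report:\n" + r) for r in research_reports]
    # Second macro summarizer: digest the indicator data, prior vs current period.
    index_digest = call_llm(f"Compare macro indicators: {index_prev} vs {index_curr}")
    # Third macro summarizer: fuse research digests and index digest into M_t.
    return call_llm("Fuse into macro information digest:\n"
                    + "\n".join(research_digests) + "\n" + index_digest)

m_t = macro_information_digest(["[report 1]", "[report 2]"],
                               {"M2_yoy": 0.081}, {"M2_yoy": 0.075})
```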
In this way, in the embodiment of the disclosure, the first macro summarizer may be used to process the second target number of macro research reports to obtain a second target number of macro research digests in one-to-one correspondence with them, the second macro summarizer may be used to process the macro index data to obtain a macro index digest, and the third macro summarizer may be used to obtain the macro information digest based on the second target number of macro research digests and the macro index digest. Therefore, the time cost of manually understanding and analyzing the second target number of macro research reports and the macro index data can be reduced, and meanwhile, because the second target number of macro research reports and the macro index data can relatively comprehensively represent the degree to which the target object is affected in the dimension of the macro market economy, the reliability of the macro information digest can be improved.
In some alternative embodiments, the plurality of information digests includes a basic face digest. Based on this, step S101, that is, "acquiring a plurality of information digests related to the target object", may include:
Processing a management-layer analysis result of a target entity by using a first basic face summarizer to obtain a management-layer basic face digest, wherein the target entity is the economic organization that issues the target object;
processing the analyst research reports related to the target entity by using a second basic face summarizer to obtain an analyst basic face digest;
and obtaining the basic face digest based on the management-layer basic face digest, the analyst basic face digest, and specific data of the target entity by using a third basic face summarizer.
For the management-layer basic face digest, in an example, the latest and second-latest management-layer analysis results of the target entity may be obtained, and after the management-layer analysis results are obtained, they are processed by using the first basic face summarizer according to a fifth extraction instruction to obtain the management-layer basic face digest.
The management-layer analysis result may be the analysis result obtained when the management layer of the target entity discusses and analyzes the current health condition and future development track of the target entity. The first basic face summarizer may be a large language model, and the fifth extraction instruction is sixth Prompt information, which is used to instruct the first basic face summarizer how to process the management-layer analysis result to obtain the management-layer basic face digest. The management-layer basic face digest may include digest information obtained by analyzing the emotional tone of the management layer, the potential risks and opportunities for the target entity expressed by the management layer, and the profit driving factors for the target entity expressed by the management layer.
For the analyst basic face digest, in an example, after the first analyst research report on the target entity is obtained, the first analyst research report may independently serve as the analyst research report related to the target entity, and it may be processed by the second basic face summarizer according to a sixth extraction instruction to obtain the analyst basic face digest.
The first analyst research report may be the analysis result obtained when financial analysts discuss and analyze the current health condition and future development track of the target entity. The second basic face summarizer may be a large language model, and the second basic face summarizer and the first basic face summarizer may be the same large language model. The sixth extraction instruction is seventh Prompt information, which is used to instruct the second basic face summarizer how to process the analyst research report related to the target entity to obtain the analyst basic face digest. The analyst basic face digest may include digest information obtained by analyzing the emotional tone of the financial analysts, the potential risks and opportunities for the target entity expressed by the financial analysts, the profit driving factors for the target entity expressed by the financial analysts, and the like.
For the analyst basic face digest, in another example, a first analyst research report on the target entity may be obtained, and second analyst research reports on a third target number of reference entities similar to the target entity (e.g., reference economic organizations, which may specifically be listed companies) may be obtained; together they serve as the analyst research reports related to the target entity, and may be processed by using the second basic face summarizer according to a seventh extraction instruction to obtain the analyst basic face digest.
The first analyst research report may be the analysis result obtained when financial analysts discuss and analyze the current health condition and future development track of the target entity. The third target number may be set according to actual application requirements, for example, may be set to 5. The second analyst research reports may include the analysis results obtained when financial analysts discuss and analyze the health condition and development track of the reference entities in a historical period (for example, last month) and the current period (for example, this month); that is, the second analyst research reports may include analyst research reports of the historical period and of the current period. The second basic face summarizer may be a large language model, and the second basic face summarizer and the first basic face summarizer may be the same large language model. The seventh extraction instruction is eighth Prompt information, which is used to instruct the second basic face summarizer how to process the analyst research reports related to the target entity to obtain the analyst basic face digest. The analyst basic face digest may include digest information obtained by analyzing the emotional tone of the financial analysts, the potential risks and opportunities for the target entity expressed by the financial analysts, the profit driving factors for the target entity expressed by the financial analysts, and the like.
For the basic face digest, in an example, after the management-layer basic face digest, the analyst basic face digest, and the specific data of the target entity (for example, the structured financial data of the target entity, which may be in Excel format) are obtained, they may be processed by using the third basic face summarizer according to an eighth extraction instruction to obtain the basic face digest.
The third basic face summarizer may be a large language model, and the third basic face summarizer, the first basic face summarizer, and the second basic face summarizer may be the same large language model. The eighth extraction instruction is ninth Prompt information, which is used to instruct the third basic face summarizer how to process the management-layer basic face digest, the analyst basic face digest, and the specific data of the target entity to obtain the basic face digest. The basic face digest may include the evaluation level, profitability, revenue growth, debt repayment capability, corporate governance capability, the emotional tone of the management layer, the potential risks and opportunities for the target entity expressed by the management layer, the profit driving factors for the target entity expressed by the management layer, the future financial prospects of the target entity expressed by the management layer, the emotional tone of the financial analysts, the potential risks and opportunities for the target entity expressed by the financial analysts, the profit driving factors for the target entity expressed by the financial analysts, the future financial prospects of the target entity expressed by the financial analysts, and the like.
Further, in the embodiments of the present disclosure, the basic face abstract may be a plain text format document, for example, a text document written using the plain text format Markdown.
In order to further clarify the basic face abstract acquisition process, the basic face abstract acquisition process will be described below using mathematical formulas with reference to fig. 5:
Wherein the management-layer analysis result of the target entity is processed by the first basic face summarizer, and the processing result is the management-layer basic face digest; the analyst research reports related to the target entity are processed by the second basic face summarizer, and the processing result is the analyst basic face digest; and the management-layer basic face digest, the analyst basic face digest, and the specific data of the target entity are processed by the third basic face summarizer, and the processing result is F S, which is specifically used to characterize the basic face digest.
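The three-summarizer composition producing F S can be sketched as below. This is a hedged illustration only: `call_llm` and `basic_face_digest` are stand-in names, and the stub returns placeholder strings instead of real model outputs.

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real large-language-model call.
    return f"[digest of {len(prompt)} chars]"

def basic_face_digest(mgmt_analysis: str, analyst_reports: list[str],
                      financial_data: str) -> str:
    # First basic face summarizer: digest the management-layer analysis result.
    mgmt = call_llm("Summarize management analysis:\n" + mgmt_analysis)
    # Second basic face summarizer: digest the analyst research reports.
    analyst = call_llm("Summarize analyst reports:\n" + "\n".join(analyst_reports))
    # Third basic face summarizer: fuse both digests with the structured financials.
    return call_llm("Fuse into basic face digest:\n"
                    + mgmt + "\n" + analyst + "\n" + financial_data)

f_s = basic_face_digest("[earnings-call analysis]",
                        ["[analyst report 1]", "[analyst report 2]"],
                        "[structured financial data]")
```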
In this way, in the embodiment of the disclosure, the first basic face summarizer may be used to process the management-layer analysis result of the target entity to obtain the management-layer basic face digest, the second basic face summarizer may be used to process the analyst research reports related to the target entity to obtain the analyst basic face digest, and the third basic face summarizer may be used to obtain the basic face digest based on the management-layer basic face digest, the analyst basic face digest, and the specific data of the target entity. Therefore, not only can the time cost required to manually understand and analyze the management-layer analysis result of the target entity and the analyst research reports related to the target entity be reduced, but also, because the management-layer analysis result and the analyst research reports can relatively comprehensively represent the degree to which the target object is affected in the dimension of the basic face data, the reliability of the basic face digest can be improved.
In some optional embodiments, step S102, that is, "obtaining the feature data prediction result of the target object based on the plurality of information summaries, and the interpretation information for the feature data prediction result" may include:
Acquiring answer guide information;
obtaining a chained logic reasoning step based on the answer guiding information;
Based on the plurality of information abstracts, according to the chained logic reasoning step, the characteristic data prediction result of the target object and the interpretation information aiming at the characteristic data prediction result are obtained.
The answer guide information is tenth Prompt information, which is used for indicating how the target large model obtains a feature data prediction result of the target object based on the plurality of information abstracts, and interpretation information for the feature data prediction result.
For example, the answer guiding information is used to instruct the target large model to define its role as an expert financial analyst and to adopt a chain-of-thought (Chain-of-Thought, CoT) reasoning mode to obtain, based on the plurality of information digests, the feature data prediction result of the target object and the interpretation information for the feature data prediction result; that is, the feature data prediction result of the target object and the interpretation information for it are obtained according to the chained logical reasoning step based on the plurality of information digests. Wherein the chained logical reasoning step may include a plurality of reasoning steps, and the plurality of reasoning steps may have a fixed order of execution. Based on this, it can be appreciated that in the embodiment of the disclosure, the target large model may perform the plurality of reasoning steps in the fixed execution order based on the plurality of information digests to obtain the feature data prediction result of the target object and the interpretation information for the feature data prediction result.
In addition, in the embodiment of the disclosure, besides the answer guiding information, answer format suggestions, answer examples, answer constraint conditions, and the like may also be provided to the target large model, so as to specify the data structure, arrangement order, and content restrictions of the feature data prediction result of the target object and/or the interpretation information for it, making them more standardized. Based on this, in an example, after the answer guiding information is obtained and the chained logical reasoning step is obtained based on the answer guiding information, the feature data prediction result of the target object and the interpretation information for the feature data prediction result may be obtained according to the chained logical reasoning step, the answer format suggestions, the answer examples, and the answer constraint conditions, based on the plurality of information digests.
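Assembling the answer guiding information, a format suggestion, and the information digests into one prompt could look like the following. The wording of the guidance, the JSON format hint, and the function name are all illustrative assumptions, not the disclosure's actual Prompt information.

```python
def build_prediction_prompt(digests: dict[str, str]) -> str:
    """Assemble answer guidance, a format suggestion, and digests into one prompt."""
    guidance = ("You are an expert financial analyst. Reason step by step:\n"
                "1. Assess the public-opinion impact from the news digests.\n"
                "2. Assess market performance from the feature data dynamic digest.\n"
                "3. Weigh both and give a prediction score in [0, 10] with reasons.")
    format_hint = 'Answer as JSON: {"score": <0-10>, "explanation": "..."}'
    body = "\n".join(f"## {name}\n{text}" for name, text in digests.items())
    return guidance + "\n" + format_hint + "\n" + body

prompt = build_prediction_prompt({
    "news digest": "[PN_S,t placeholder]",
    "feature data dynamic digest": "[P_S,t placeholder]",
})
```

The fixed numbered steps in the guidance correspond to the fixed execution order of the chained logical reasoning steps described above.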
Further, taking the target object as any financial investment object requiring feature data prediction, and taking the feature data prediction result as a price trend prediction result of the target object as an example, in the embodiment of the present disclosure, the feature data prediction result of the target object may be represented by a prediction score in a first numerical interval [0,10]. Specifically, when the prediction score is lower than 5, the feature data prediction result indicates that the target object will fall, and the lower the prediction score, the larger the indicated fall; when the prediction score is equal to 5, the feature data prediction result is neutral, that is, neither a fall nor a rise; and when the prediction score is higher than 5, the feature data prediction result indicates a rise, and the higher the prediction score, the larger the indicated rise. In addition, in the embodiment of the present disclosure, the interpretation information for the feature data prediction result may be a clear explanation of the reasoning behind the feature data prediction result.
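The score-to-direction convention above can be captured in a small function (the function name is illustrative):

```python
def interpret_score(score: float) -> str:
    """Map a prediction score in [0, 10] to the predicted price direction."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie in [0, 10]")
    if score < 5:
        return "fall"     # the lower the score, the larger the expected fall
    if score > 5:
        return "rise"     # the higher the score, the larger the expected rise
    return "neutral"      # exactly 5: neither rise nor fall expected
```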
Through the above method, in the embodiment of the disclosure, after the target large model obtains the answer guiding information and obtains the chained logical reasoning step based on it, the feature data prediction result of the target object and the interpretation information for the feature data prediction result are obtained according to the chained logical reasoning step based on the plurality of information digests. The chained logical reasoning step can provide clear reasoning guidance for the target large model, reducing the occurrence rate of reasoning errors and thus improving the accuracy of the feature data prediction result; meanwhile, the interpretation information for the feature data prediction result enhances the user's trust in the result, improving its usability.
In some alternative embodiments, the plurality of information summaries comprises a news summary and a feature data dynamic summary, and the step of chained logical reasoning comprises a first step and a second step. Based on this, the "obtaining the feature data prediction result of the target object according to the chained logical inference step based on the plurality of information digests" and the interpretation information for the feature data prediction result "in step S102 may further include:
Executing a first step to obtain a first reasoning result based on the news abstract;
executing a second step to obtain a second reasoning result based on the feature data dynamic abstract;
And obtaining a first characteristic data prediction result of the target object in a short-term future period based on the first reasoning result and the second reasoning result, and interpretation information aiming at the first characteristic data prediction result.
The first step may be to evaluate, based on the daily news digests and the current-period news digest related to the target object, the degree to which the target object is affected in the dimension of public opinion news, as the first reasoning result; the second step may be to compare the feature data dynamic digest of the target object with the price dynamics of the reference objects (i.e., the financial investment objects similar to the target object) and the price dynamics of each financial investment object in the whole financial market, respectively, and evaluate the market performance of the target object, as the second reasoning result.
After the first reasoning result and the second reasoning result are obtained, the first reasoning result and the second reasoning result can be weighted by utilizing the target big model, the first characteristic data prediction result of the target object in a short-term future period is summarized in a reasoning mode, and the interpretation information of the first characteristic data prediction result is obtained. Wherein the short-term future period may be a future week, a future month, etc.
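The weighted fusion of the two reasoning results can be sketched numerically. The function name and the weights below are illustrative assumptions; the disclosure leaves the weighting to the target large model.

```python
def combine_scores(news_score: float, market_score: float,
                   news_weight: float = 0.5) -> float:
    """Weighted fusion of the first (news) and second (market) reasoning scores
    into one short-term prediction score; the weights are illustrative only."""
    assert 0.0 <= news_weight <= 1.0
    return news_weight * news_score + (1.0 - news_weight) * market_score

combined = combine_scores(6.0, 8.0, news_weight=0.25)  # 0.25*6 + 0.75*8 = 7.5
```

The same pattern applies to the long-term prediction, where the third (macro) and fourth (basic face) reasoning results are weighted instead.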
Further, taking the target object as any financial investment object requiring feature data prediction, and taking the feature data prediction result as a price trend prediction result of the target object as an example, in the embodiment of the present disclosure, the first feature data prediction result of the target object in the short-term future period may be represented by a prediction score in a second numerical interval [0,10]. Specifically, when the prediction score is lower than 5, the first feature data prediction result indicates a fall, and the lower the prediction score, the larger the indicated fall; when the prediction score is equal to 5, the first feature data prediction result is neutral, that is, neither a fall nor a rise; and when the prediction score is higher than 5, the first feature data prediction result indicates a rise, and the higher the prediction score, the larger the indicated rise. In addition, in the embodiment of the present disclosure, the interpretation information for the first feature data prediction result may be a clear explanation of the reasoning behind the first feature data prediction result.
Finally, in the embodiment of the present disclosure, the output result of the target large model may be characterized as:
Wherein N S,τ is used to characterize the daily news digest set in the news digest, PN S,t is used to characterize the current-period news digest in the news digest, and P S,t is used to characterize the feature data dynamic digest; the target large model obtains, based on the news digest and the feature data dynamic digest, the first feature data prediction result of the target object in the short-term future period and the interpretation information for the first feature data prediction result, each of which is characterized by a corresponding term in the output.
In this way, in the embodiment of the disclosure, the first step may be performed to obtain the first reasoning result based on the news digest, the second step may be performed to obtain the second reasoning result based on the feature data dynamic digest, and the first feature data prediction result of the target object in the short-term future period, together with the interpretation information for it, may be obtained based on the first reasoning result and the second reasoning result. In this process, the chained first and second steps can provide clearer reasoning guidance for the target large model and help reduce the occurrence rate of reasoning errors, thereby improving the accuracy of the first feature data prediction result; meanwhile, the interpretation information for the first feature data prediction result improves its interpretability, which increases the user's trust in the result and thus its usability.
In some alternative embodiments, the plurality of information digests includes a macro information digest and a basic face digest, and the chained logical reasoning step includes a third step and a fourth step. Based on this, "obtaining the feature data prediction result of the target object and the interpretation information for the feature data prediction result according to the chained logical reasoning step based on the plurality of information digests" in step S102 may further include:
Executing a third step to obtain a third reasoning result based on the macroscopic information abstract;
executing a fourth step to obtain a fourth reasoning result based on the basic face abstract;
And obtaining a second characteristic data prediction result of the target object in a long-term future period based on the third reasoning result and the fourth reasoning result, and interpretation information aiming at the second characteristic data prediction result.
The third step may be to evaluate, based on the macro information digest, the degree to which the target object is affected in the macro market economy dimension (e.g., to evaluate the influence of the broad macroeconomic pattern on the target object), as the third inference result; the fourth step may be to analyze, based on the basic face digest, the degree to which the target object is affected in the fundamental data dimension (e.g., to analyze the financial condition and future prospects of the target entity), as the fourth inference result.
After the third inference result and the fourth inference result are obtained, the target large model may be used to weight the third inference result and the fourth inference result, summarize by reasoning the second feature data prediction result of the target object in the long-term future period, and obtain the interpretation information for the second feature data prediction result. Here, the time length of the long-term future period is greater than that of the short-term future period; in particular, the long-term future period may be one year, several years, etc. in the future.
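The weighting of the two inference results can be illustrated with a minimal numerical sketch. The disclosure leaves the weighting to the target large model itself, so the scores and weights below are invented purely for illustration:

```python
def aggregate(scores: dict, weights: dict) -> float:
    # Weighted combination of per-dimension inference scores into a single
    # long-term prediction score (weights are normalized, so they need
    # not sum to 1).
    total_w = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total_w

# Example: macro inference scored 6.0, fundamentals inference scored 8.0.
long_term_score = aggregate({"macro": 6.0, "fundamentals": 8.0},
                            {"macro": 0.4, "fundamentals": 0.6})
```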
Further, taking the target object as any financial investment object to be subjected to feature data prediction, and taking the feature data prediction result as a price trend prediction result of the target object as an example, in the embodiment of the present disclosure, the second feature data prediction result of the target object in the long-term future period may be represented by a prediction score in a third numerical interval [0,10]. Specifically, when the prediction score is lower than 5, the second feature data prediction result indicates a fall, and the lower the prediction score, the larger the indicated fall; when the prediction score is equal to 5, the second feature data prediction result is neutral, that is, neither a fall nor a rise; and when the prediction score is higher than 5, the second feature data prediction result indicates a rise, and the higher the prediction score, the larger the indicated rise. In addition, in the embodiment of the present disclosure, the interpretation information for the second feature data prediction result may be a clear explanation of the reasoning behind the second feature data prediction result.
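The [0,10] score-to-trend mapping described above can be sketched as a small helper. The interval endpoints and the neutral midpoint of 5 follow the text; the function and label names ("fall" for a downward trend, "rise" for an upward trend) are illustrative assumptions:

```python
def trend_label(score: float) -> str:
    # Map a prediction score in [0, 10] to a trend label: below 5 indicates
    # a fall, exactly 5 is neutral, above 5 indicates a rise; distance from
    # 5 reflects the expected magnitude.
    if not 0.0 <= score <= 10.0:
        raise ValueError("prediction score must lie in [0, 10]")
    if score < 5.0:
        return "fall"
    if score > 5.0:
        return "rise"
    return "neutral"
```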
Finally, in the embodiment of the present disclosure, the output result of the target large model may be characterized as:
(ŷ^l_{S,t}, e_{S,t}) = LLM(M_t, F_S)

wherein M_t is used for representing the macro information digest, and F_S is used for representing the basic face digest; LLM(·) is used for representing that the target large model obtains, based on the macro information digest and the basic face digest, the second feature data prediction result of the target object in the long-term future period and the interpretation information for the second feature data prediction result. Here, the second feature data prediction result is characterized by ŷ^l_{S,t}, and the interpretation information for the second feature data prediction result is characterized by e_{S,t}.
In the above manner, in the embodiment of the present disclosure, the third step may be performed to obtain the third inference result based on the macro information digest, the fourth step may be performed to obtain the fourth inference result based on the basic face digest, and the second feature data prediction result of the target object in the long-term future period, together with the interpretation information for that result, may be obtained based on the third inference result and the fourth inference result. In this process, the chained third step and fourth step provide clearer reasoning guidance for the target large model, which helps to reduce the occurrence rate of reasoning errors and thereby improves the accuracy of the second feature data prediction result. Meanwhile, the interpretation information improves the interpretability of the second feature data prediction result, which can enhance the user's trust in the result and thus its usability.
In some alternative embodiments, the plurality of information digests includes a news digest, a feature data dynamic digest, a macro information digest and a basic face digest, and the chained logical reasoning step includes a first step, a second step, a third step and a fourth step. Based on this, "obtaining the feature data prediction result of the target object and the interpretation information for the feature data prediction result according to the chained logical reasoning step based on the plurality of information digests" in step S102 may further include:
Executing a first step to obtain a first reasoning result based on the news abstract;
executing a second step to obtain a second reasoning result based on the feature data dynamic abstract;
Executing a third step to obtain a third reasoning result based on the macroscopic information abstract;
executing a fourth step to obtain a fourth reasoning result based on the basic face abstract;
and obtaining a first characteristic data prediction result of the target object in a short-term future period, a second characteristic data prediction result of the target object in a long-term future period and interpretation information aiming at the first characteristic data prediction result and the second characteristic data prediction result based on the first inference result, the second inference result, the third inference result and the fourth inference result.
The first step may be to evaluate, in the public opinion news dimension, the degree of influence of the daily news digests and the current-period news digest on the target object, as the first inference result. The second step may be to compare the feature data dynamic digest of the target object with the price dynamics of the reference objects (i.e., financial investment objects similar to the target object) and with the price dynamics of each financial investment object in the overall financial market, respectively, and to evaluate the market performance of the target object, as the second inference result. The third step may be to evaluate, based on the macro information digest, the degree to which the target object is affected in the macro market economy dimension (e.g., to evaluate the broad macroeconomic pattern and its influence on the target object), as the third inference result. The fourth step may be to analyze, based on the basic face digest, the degree to which the target object is affected in the fundamental data dimension (e.g., to analyze the financial condition and future prospects of the target entity), as the fourth inference result.
After the first inference result, the second inference result, the third inference result and the fourth inference result are obtained, the target large model may be used to weight the four inference results, summarize by reasoning the first feature data prediction result of the target object in the short-term future period (specifically, by combining the first inference result and the second inference result) and the second feature data prediction result of the target object in the long-term future period (specifically, by combining the third inference result and the fourth inference result), and obtain the interpretation information for the first feature data prediction result and the second feature data prediction result. Here, the short-term future period may be one week, one month, etc. in the future, and the time length of the long-term future period is greater than that of the short-term future period; in particular, the long-term future period may be one year, several years, etc. in the future.
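A minimal sketch of this grouping, treating each inference result as a numeric score. This is an assumption made for illustration (the disclosure combines textual reasoning results inside the target large model), and the equal weights are likewise an illustrative choice:

```python
def combine(r1: float, r2: float, r3: float, r4: float) -> dict:
    # Short-term prediction from the public-opinion (r1) and
    # market-performance (r2) inference results; long-term prediction from
    # the macro (r3) and fundamentals (r4) inference results.
    return {"short_term": (r1 + r2) / 2.0, "long_term": (r3 + r4) / 2.0}

pred = combine(4.0, 6.0, 7.0, 9.0)
```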
In the embodiment of the disclosure, the output result of the target large model may be characterized as:
(ŷ^s_{S,t}, ŷ^l_{S,t}, e_{S,t}) = LLM(N_{S,τ}, PN_{S,t}, P_{S,t}, M_t, F_S)

wherein N_{S,τ} is used for representing the daily news digest set in the news digest, PN_{S,t} is used for representing the current-period news digest in the news digest, P_{S,t} is used for representing the feature data dynamic digest, M_t is used for representing the macro information digest, and F_S is used for representing the basic face digest; LLM(·) is used for representing that the target large model obtains, based on the first inference result, the second inference result, the third inference result and the fourth inference result, the first feature data prediction result of the target object in the short-term future period, the second feature data prediction result of the target object in the long-term future period, and the interpretation information for the first feature data prediction result and the second feature data prediction result. Here, the first feature data prediction result is characterized by ŷ^s_{S,t}, the second feature data prediction result is characterized by ŷ^l_{S,t}, and the interpretation information for the first feature data prediction result and the second feature data prediction result is characterized by e_{S,t}.
In this way, in the embodiment of the present disclosure, the chained first, second, third and fourth steps can provide clearer reasoning guidance for the target large model, which helps to reduce the occurrence rate of reasoning errors and thereby improves the accuracy of the first feature data prediction result and the second feature data prediction result. Meanwhile, the interpretation information for the two prediction results improves their interpretability, which can increase the user's trust in them and thus their usability.
A complete flow of the data prediction method provided by an embodiment of the present disclosure will be described below with reference to fig. 6 and 7.
Step S601, obtaining N groups of daily public opinion news related to the target object; taking each group of daily public opinion news in the N groups as public opinion news to be processed, and processing the public opinion news to be processed by using a first news summarizer to obtain the daily news digest corresponding to that public opinion news; obtaining a historical-period news digest corresponding to a historical period; and processing the daily news digest set by using a second news summarizer based on the historical-period news digest to obtain the current-period news digest corresponding to the current period. Here, the N groups of daily public opinion news are in one-to-one correspondence with N distinct dates in the current period, and N is an integer greater than or equal to 2.
The first news summarizer and the second news summarizer may be the same news summarizer, and the news summarizer may be a large language model.
In addition, in the embodiment of the present disclosure, for the specific description of step S601, reference may be made to the related description of step S101, which is not repeated here.
Step S602, obtaining first market data of the target object, obtaining second market data of a first target number of reference objects similar to the target object, and obtaining the feature data dynamic digest by using a feature data dynamic summarizer based on the first market data of the target object, the second market data of the first target number of reference objects, and a specific reference index.
Wherein the feature data dynamic summarizer may be a large language model.
In addition, in the embodiment of the present disclosure, for the specific description of step S602, reference may be made to the related description of step S101, which is not repeated here.
Step S603, processing a second target number of macro research reports by using a first macro summarizer to obtain a second target number of macro research digests corresponding to the macro research reports, processing macro index data by using a second macro summarizer to obtain a macro index digest, and obtaining the macro information digest by using a third macro summarizer based on the second target number of macro research digests and the macro index digest.
The first macro summarizer, the second macro summarizer and the third macro summarizer may be the same macro summarizer, and the macro summarizer may be a large language model.
In addition, in the embodiment of the present disclosure, for the specific description of step S603, reference may be made to the related description of step S101, which is not repeated here.
Step S604, processing a management-layer analysis result of the target entity by using a first basic face summarizer to obtain a management-layer basic face digest, wherein the target entity is the economic organization issuing the target object; processing an analyst report related to the target entity by using a second basic face summarizer to obtain an analyst basic face digest; and obtaining the basic face digest by using a third basic face summarizer based on the management-layer basic face digest, the analyst basic face digest, and specific data of the target entity.
The first basic face summarizer, the second basic face summarizer and the third basic face summarizer may be the same basic face summarizer, and the basic face summarizer may be a large language model.
In addition, in the embodiment of the present disclosure, for the specific description of step S604, reference may be made to the related description of step S101, which is not repeated here.
Step S605, obtaining answer guide information, obtaining the chained logical reasoning step based on the answer guide information, and obtaining, by using the target large model, the feature data prediction result of the target object and the interpretation information for the feature data prediction result based on the plurality of information digests according to the chained logical reasoning step.
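The assembly of the answer guide information and the information digests into a single model input can be sketched as follows. The prompt wording, section labels, and digest names are all illustrative assumptions, not the disclosed prompt format:

```python
def build_prompt(answer_guide: str, digests: dict) -> str:
    # Concatenate the answer guide information with each named information
    # digest into one prompt for the target large model.
    parts = [answer_guide]
    for name, text in digests.items():
        parts.append(f"[{name}]\n{text}")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Reason step by step, then output the prediction and its explanation.",
    {"news": "...", "feature_dynamics": "...", "macro": "...", "fundamentals": "..."},
)
```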
In some alternative embodiments, the plurality of information digests includes a news digest, a feature data dynamic digest, a macro information digest and a basic face digest, and the chained logical reasoning step includes a first step, a second step, a third step and a fourth step. Based on this, "obtaining the feature data prediction result of the target object and the interpretation information for the feature data prediction result according to the chained logical reasoning step based on the plurality of information digests" in step S605 may further include:
Executing a first step to obtain a first reasoning result based on the news abstract;
executing a second step to obtain a second reasoning result based on the feature data dynamic abstract;
Executing a third step to obtain a third reasoning result based on the macroscopic information abstract;
executing a fourth step to obtain a fourth reasoning result based on the basic face abstract;
and obtaining a first characteristic data prediction result of the target object in a short-term future period, a second characteristic data prediction result of the target object in a long-term future period and interpretation information aiming at the first characteristic data prediction result and the second characteristic data prediction result based on the first inference result, the second inference result, the third inference result and the fourth inference result.
In addition, in the embodiment of the present disclosure, for the specific description of step S605, reference may be made to the related description of step S102, which is not repeated here.
It may be understood that, in the embodiment of the present disclosure, an Exchange-of-Thought (EoT) framework is introduced: four independent large language models, namely the news summarizer, the feature data dynamic summarizer, the macro summarizer and the basic face summarizer, form a star network topology with the target large model as the central node, which collects and summarizes the output results of the four summarizers. This solves the problem that information interaction among the news summarizer, the feature data dynamic summarizer, the macro summarizer and the basic face summarizer is difficult, and improves the reasoning capability of the target large model, so that when the target large model is used to obtain the feature data prediction result of the target object and the interpretation information for the feature data prediction result, both the accuracy of the feature data prediction result and its interpretability are further improved.
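The star topology described above can be sketched as a tiny orchestration class. The summarizers and the central model are replaced here by placeholder callables, so the class, names, and wiring are illustrative assumptions rather than the disclosed system:

```python
class StarEoT:
    # Central node (the target large model) collects the outputs of four
    # peripheral summarizer nodes arranged in a star network topology.
    def __init__(self, center, peripherals):
        self.center = center            # callable: list of digests -> result
        self.peripherals = peripherals  # dict: name -> callable: input -> digest

    def run(self, inputs: dict) -> str:
        digests = [fn(inputs[name]) for name, fn in self.peripherals.items()]
        return self.center(digests)

eot = StarEoT(
    center=lambda ds: " | ".join(ds),   # placeholder central aggregation
    peripherals={
        "news": lambda x: f"news:{x}",
        "feature": lambda x: f"feat:{x}",
        "macro": lambda x: f"macro:{x}",
        "fundamentals": lambda x: f"fund:{x}",
    },
)
out = eot.run({"news": "a", "feature": "b", "macro": "c", "fundamentals": "d"})
```

The design point is that the peripheral summarizers never talk to each other; all information flows through the central node.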
Fig. 8 is a schematic view of a scenario of a data prediction method according to an embodiment of the disclosure.
As described above, the data prediction method provided by the embodiment of the present disclosure is applied to an electronic device. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device.
The electronic device is used for:
obtaining a plurality of information summaries related to the target object, wherein the plurality of information summaries have different information categories for representing the performance and/or affected degree of the target object in different dimensions;
and obtaining a characteristic data prediction result of the target object and interpretation information aiming at the characteristic data prediction result based on the plurality of information abstracts by using the target large model.
It should be noted that, in the embodiment of the present disclosure, the schematic view of the scenario shown in fig. 8 is merely illustrative and not restrictive, and those skilled in the art may make various obvious changes and/or substitutions based on the example of fig. 8, and the obtained technical solution still falls within the scope of the embodiment of the present disclosure.
The embodiment of the disclosure provides a large model training method which can be applied to electronic equipment. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device. In the following, a description will be given of a large model training method provided in the embodiment of the present disclosure with reference to a flowchart shown in fig. 9. It should be noted that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in other orders.
In step S901, a plurality of information summary samples related to a target object sample are acquired.
The target object sample may be any financial investment object (e.g., a stock, a fund, or a futures contract) requiring feature data trend prediction, where the feature data may be the price trend of the target object sample.
Furthermore, in the disclosed embodiments, the plurality of information summary samples have different information categories for characterizing the performance and/or the affected extent of the target object sample in different dimensions. For example, the plurality of information summary samples may include at least one of a news summary sample, a feature data dynamic summary sample, a macro information summary sample, and a basic face summary sample.
The news summary sample may be a summary information sample obtained by processing public opinion news related to the target object sample, and is used for representing the degree to which the target object sample is affected in the public opinion news dimension. The feature data dynamic summary sample may be a summary information sample obtained by processing stock price dynamic data of the target object sample, and is used for representing the market performance of the target object sample. The macro information summary sample may be a summary information sample obtained by processing macro market economic data, and is used for representing the degree to which the target object sample is affected in the macro market economy dimension. The basic face summary sample may be a summary information sample obtained by processing fundamental data related to the target object sample, and is used for representing the degree to which the target object sample is affected in the fundamental data dimension.
Step S902, obtaining a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample based on the plurality of information summary samples by using the initial large model.
The characteristic data prediction result sample may be a price trend prediction result sample of the target object.
In an example, an answer guide information sample may be obtained, and the answer guide information sample and the plurality of information summary samples may be input into the initial large model, so that the initial large model processes the plurality of information summary samples according to the answer guide information sample to obtain a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample. Here, the answer guide information sample is the eleventh prompt information, which is used to instruct the initial large model how to obtain, based on the plurality of information summary samples, the feature data prediction result sample for the target object sample and the interpretation information sample for the feature data prediction result sample.
Furthermore, in the embodiment of the present disclosure, the initial large model is an initial signal generator, and the initial signal generator may be a large language model. Here, the large language model may be a pre-trained neural network model having general language knowledge, world knowledge, domain expertise, etc., stored in the form of parameters inside the model. In particular, the initial large model may be an autoregressive generative model with a Transformer architecture.
Step S903, training the initial large model based on the feature data prediction result sample and the interpretation information sample to obtain a target large model.
In an example, real feature data for the target object sample may be obtained, and the initial large model may be made to perform reflection based on the feature data prediction result sample, the interpretation information sample, and the real feature data, so as to train the initial large model and obtain the target large model.
By adopting the large model training method provided by the embodiment of the present disclosure, a plurality of information summary samples related to the target object sample can be obtained; the initial large model is used to obtain, based on the plurality of information summary samples, a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample; and the initial large model is then trained, that is, its parameters are fine-tuned, to obtain the target large model. In this way, the performance of the target large model can be improved, so that when the target large model is used to obtain the feature data prediction result of the target object and the interpretation information for the feature data prediction result, the accuracy of the feature data prediction result and its interpretability can be further improved.
In some optional embodiments, step S903, that is, "training the initial large model based on the feature data prediction result sample and the interpretation information sample, to obtain the target large model" may include:
Acquiring real characteristic data aiming at a target object sample;
Generating a reflection basis comprising the plurality of information summary samples, the feature data prediction result sample, the interpretation information sample, and the real feature data;
obtaining, by using a reflection model, a current reflection result for the feature data prediction result sample and the interpretation information sample based on the reflection basis;
and training the initial large model based on the current reflection result to obtain the target large model.
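The steps above can be sketched as one round of reflection-driven training. Every callable here (`predict`, `reflect`, `update`) is a hypothetical stand-in for the corresponding model or fine-tuning routine, and the field names are invented for illustration:

```python
def reflection_training_step(predict, reflect, update, digest_samples, y_true):
    # One round of reflection-driven training: forward pass, build the
    # reflection basis, obtain the current reflection result, and use it
    # to update the model being trained.
    pred, expl = predict(digest_samples)
    basis = {"digests": digest_samples, "prediction": pred,
             "explanation": expl, "real_feature_data": y_true}
    reflection = reflect(basis)
    update(reflection)
    return reflection

log = []
reflection = reflection_training_step(
    predict=lambda d: ("rise", "positive news flow"),
    reflect=lambda b: f"prediction '{b['prediction']}' vs truth '{b['real_feature_data']}'",
    update=log.append,   # placeholder for a parameter-update routine
    digest_samples={"news": "..."},
    y_true="fall",
)
```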
The information summary samples may include at least one of a news summary sample, a feature data dynamic summary sample, a macro information summary sample and a basic face summary sample, the feature data prediction result sample may include a first feature data prediction result sample of the target object sample in a short-term future period and a second feature data prediction result sample of the target object sample in a long-term future period, and the interpretation information sample may be a clear interpretation of reasoning behind the first feature data prediction result sample and the second feature data prediction result sample. Here, the short-term future period may be one week, one month, several months, etc., and the long-term future period may have a longer time length than the short-term future period, and in particular, the long-term future period may be one year, several years, etc.
In an example, "generating an answer that includes a plurality of information summary samples, feature data predictor samples, interpretation information samples, and real feature data" may include:
Obtaining, by using a target scoring model, a scoring result for the feature data prediction result sample and the interpretation information sample, and a scoring interpretation for the scoring result;
generating a reflection basis comprising the plurality of information summary samples, the feature data prediction result sample, the interpretation information sample, the scoring result, the scoring interpretation, and the real feature data.
Here, the target scoring model may be, for example, a Multi-Layer Perceptron (MLP).
In the embodiment of the present disclosure, the consistency between the feature data prediction result sample and the real feature data can be evaluated by using the target scoring model to obtain the scoring result, and the scoring interpretation for the scoring result can be obtained by combining the interpretation information sample, so that a reflection basis comprising the plurality of information summary samples, the feature data prediction result sample, the interpretation information sample, the scoring result, the scoring interpretation, and the real feature data is then generated.
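Assembling the reflection basis amounts to bundling the six items above into one record. A minimal sketch, with field names invented for illustration:

```python
def build_reflection_basis(digests, pred, expl, score, score_expl, y_true):
    # Bundle the digest samples, the prediction and its explanation, the
    # scoring result and its interpretation, and the real feature data
    # into one reflection-basis record for the reflection model.
    return {
        "digest_samples": digests,
        "prediction_sample": pred,
        "explanation_sample": expl,
        "scoring_result": score,
        "scoring_explanation": score_expl,
        "real_feature_data": y_true,
    }

basis = build_reflection_basis(
    {"news": "..."}, "rise", "positive news flow", 7.5, "close to truth", "rise",
)
```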
Further, in a specific example, the initial scoring model may be trained by a model fine-tuning manner (e.g., Prompt Tuning) to obtain the target scoring model, thereby improving the performance of the target scoring model, further improving the accuracy of the scoring result for the feature data prediction result sample and the interpretation information sample, and improving the accuracy of the scoring interpretation for the scoring result. The process may specifically include:
The method comprises the steps of obtaining a first sample pair, wherein the first sample pair comprises a first characteristic data prediction sample, a first interpretation information sample and first real characteristic data corresponding to the first characteristic data prediction sample, and the similarity between the first characteristic data prediction sample and the first real characteristic data is larger than or equal to a first preset similarity threshold value;
The method comprises the steps of obtaining a second sample pair, wherein the second sample pair comprises a second characteristic data prediction sample, a second interpretation information sample and second real characteristic data corresponding to the second characteristic data prediction sample, the similarity between the second characteristic data prediction sample and the second real characteristic data is smaller than a second similarity threshold value, and the second similarity threshold value is smaller than a first preset similarity threshold value;
Training the initial scoring model based on the first sample pair and the second sample pair to obtain the target scoring model.
The first sample pair is a positive sample pair, the first similarity threshold may be set according to practical application requirements, for example, may be set to 0.95, the second sample pair is a negative sample pair, and the second similarity threshold may be set according to practical application requirements, for example, may be set to 0.05. Further, in the disclosed embodiments, the first and second pairs of samples may be collected during training of the initial large model and stored in the comparison dataset.
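The threshold-based construction of positive and negative pairs can be sketched as a small classifier over a similarity value. The thresholds 0.95 and 0.05 follow the text; the treatment of in-between similarities (discarding them) is an assumption, since the text does not specify it:

```python
def classify_pair(similarity: float, hi: float = 0.95, lo: float = 0.05) -> str:
    # Positive pair when the prediction sample closely matches the real
    # feature data, negative pair when it clearly does not. Ambiguous
    # pairs in between are discarded here (an assumption).
    if similarity >= hi:
        return "positive"
    if similarity < lo:
        return "negative"
    return "discard"
```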
After the first sample pair (including the first feature data prediction sample and the first interpretation information sample) is obtained, the initial scoring model may learn how to obtain a first scoring result sample for the first sample pair based on the similarity between the first feature data prediction sample and the first real feature data (a higher first scoring result sample indicates a more accurate first feature data prediction sample). Specifically, it can learn how to evaluate the consistency between the first feature data prediction sample in the first sample pair and the first real feature data to obtain the first scoring result sample, and how to obtain a first scoring interpretation sample for the first scoring result sample by combining the first interpretation information sample.

Likewise, after the second sample pair (including the second feature data prediction sample and the second interpretation information sample) is obtained, the initial scoring model may learn how to obtain a second scoring result sample for the second sample pair based on the similarity between the second feature data prediction sample and the second real feature data (a higher second scoring result sample indicates a more accurate second feature data prediction sample). Specifically, it can learn how to evaluate the consistency between the second feature data prediction sample in the second sample pair and the second real feature data to obtain the second scoring result sample, and how to obtain a second scoring interpretation sample for the second scoring result sample by combining the second interpretation information sample.
It should be noted that, in the embodiment of the present disclosure, the model parameters of the initial scoring model may be adjusted with the objective of maximizing the sum of the first scoring result sample and the second scoring result sample, thereby training the initial scoring model to obtain the target scoring model.
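A toy numeric sketch of this parameter adjustment follows. The scorer form (a sigmoid over the similarity), the reading of the combined objective (rewarding a high score on the positive pair and a low score on the negative pair), and the finite-difference gradient ascent are all illustrative assumptions; the actual scoring model in the disclosure is a trained large model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(w, b, sim):
    # consistency score in (0, 1) from prediction/ground-truth similarity
    return sigmoid(w * sim + b)

def objective(w, b, pos_sim, neg_sim):
    # high score on the positive pair, low score on the negative pair
    return score(w, b, pos_sim) + (1.0 - score(w, b, neg_sim))

def ascent_step(w, b, pos_sim, neg_sim, lr=0.5, eps=1e-5):
    # one gradient-ascent step via central finite differences
    gw = (objective(w + eps, b, pos_sim, neg_sim)
          - objective(w - eps, b, pos_sim, neg_sim)) / (2 * eps)
    gb = (objective(w, b + eps, pos_sim, neg_sim)
          - objective(w, b - eps, pos_sim, neg_sim)) / (2 * eps)
    return w + lr * gw, b + lr * gb

w, b = 0.0, 0.0
before = objective(w, b, pos_sim=0.97, neg_sim=0.02)
w, b = ascent_step(w, b, 0.97, 0.02)
after = objective(w, b, 0.97, 0.02)
```

One ascent step increases the combined objective, mirroring the training target of maximizing the scoring result samples.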
It should be further noted that, in the embodiment of the present disclosure, in the case that the reflection basis includes the plurality of information summary samples, the feature data prediction result samples, the interpretation information samples, the scoring result, the scoring interpretation, and the real feature data, "obtaining the current reflection result for the feature data prediction result samples and the interpretation information samples based on the reflection basis by using the reflection model" may be characterized as follows:
ref_t = Reflect(N_{S,τ}', PN_{S,t}', P_{S,t}', M_t', F_S', ŷ_s, ŷ_l, E, SR, SE, y), wherein N_{S,τ}' is used for characterizing the daily news digest set sample in the news digest samples; PN_{S,t}' is used for characterizing the current-period news digest sample in the news digest samples; P_{S,t}' is used for characterizing the feature data dynamic digest sample; M_t' is used for characterizing the macro information digest sample; F_S' is used for characterizing the basic surface digest sample; ŷ_s is used for characterizing the first feature data prediction result sample; ŷ_l is used for characterizing the second feature data prediction result sample; E is used for characterizing the interpretation information sample for the first feature data prediction result sample and the second feature data prediction result sample; SR is used for characterizing the scoring result; SE is used for characterizing the scoring interpretation; y is used for characterizing the real feature data of the target object sample; Reflect is used for characterizing the reflection model; and ref_t is used for characterizing the current reflection result for the feature data prediction result samples and the interpretation information samples.
Further, in the embodiments of the present disclosure, the reflection model may be a large language model whose model parameters are frozen. In a specific example, the initial large model may be trained to obtain the target large model, so as to improve the performance of the target large model, so that when the feature data prediction result of the target object and the interpretation information for the feature data prediction result are obtained by using the target large model, the accuracy of the feature data prediction result can be further improved, and the interpretability of the feature data prediction result can be further improved. The process specifically may include:
Acquiring a historical reflection result;
Obtaining long-term reflection data based on the current reflection result and the historical reflection result;
Training the initial large model based on the long-term reflection data to obtain the target large model.
The long-term reflection data may be a long-term memory obtained by splicing the current reflection result and the historical reflection result, which may be characterized as:
ref_long = [ref_hist; ref_t], wherein ref_t is used for characterizing the current reflection result; ref_hist is used for characterizing the historical reflection result; and ref_long is used for characterizing the long-term reflection data.
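A minimal sketch of this splicing, assuming the reflection (thinking-back) results are plain strings and the separator is an arbitrary choice:

```python
# Sketch: assemble long-term reflection data by splicing the current
# reflection result onto the historical reflection results.

def build_long_term_reflection(history, current, sep="\n"):
    """Concatenate historical and current reflection results into one
    long-term memory string fed to the next training round."""
    return sep.join(history + [current]) if history else current

history = [
    "round 1: prediction overweighted macro news",
    "round 2: ignored sector momentum",
]
current = "round 3: short-term horizon too optimistic"
memory = build_long_term_reflection(history, current)
```

Each training round appends its reflection result, so the memory grows with the number of parameter iterations.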
After the long-term reflection data is obtained, the model parameters of the initial large model may be adjusted based on the long-term reflection data, so as to train the initial large model and obtain the target large model. This process can also be regarded as the next round of parameter iteration of the initial large model, and may be characterized as follows:
(ŷ_s', ŷ_l', E') = LLM(N_{S,τ}', PN_{S,t}', P_{S,t}', M_t', F_S', ref_long), wherein N_{S,τ}' is used for characterizing the daily news digest set sample in the news digest samples at the next round of parameter iteration; PN_{S,t}' is used for characterizing the current-period news digest sample in the news digest samples at the next round of parameter iteration; P_{S,t}' is used for characterizing the feature data dynamic digest sample at the next round of parameter iteration; M_t' is used for characterizing the macro information digest sample at the next round of parameter iteration; F_S' is used for characterizing the basic surface digest sample at the next round of parameter iteration; ref_long is used for characterizing the long-term reflection data; ŷ_s' is used for characterizing the first feature data prediction result sample of the target object sample over the short-term future period obtained at the next round of parameter iteration; ŷ_l' is used for characterizing the second feature data prediction result sample of the target object sample over the long-term future period; and E' is used for characterizing the interpretation information sample for the first and second feature data prediction result samples.
Through the above method, in the embodiment of the present disclosure, the real feature data for the target object sample can be acquired, a reflection basis including the plurality of information summary samples, the feature data prediction result samples, the interpretation information samples, and the real feature data can be generated, the current reflection result for the feature data prediction result samples and the interpretation information samples can be obtained based on the reflection basis by using the reflection model, and finally the initial large model can be trained based on the current reflection result to obtain the target large model. That is, in the embodiments of the present disclosure, the initial large model can improve its own predictive power through self-reflection. Specifically, by connecting the initial large model and the reflection model, an automated agent model containing two neural network models is deployed, which can be improved iteratively and cyclically through verbal self-reflection, i.e., parameter fine-tuning of the initial large model. Meanwhile, when fine-tuning the parameters of the initial large model, manual feedback is not used; instead, the improvement is carried out iteratively and cyclically based on the reflection results (including the current reflection result and the historical reflection results) generated by the automated agent model, which further improves the learning effect of the initial large model.
A complete flow of the large model training method provided by embodiments of the present disclosure will be described below with reference to fig. 10 and 11.
In step S1001, a plurality of information summary samples related to the target object sample are acquired.
The specific description of step S1001 may be referred to the related description in step S901, which is not described herein.
Step S1002, obtaining a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample based on the plurality of information summary samples by using the initial large model.
The specific description of step S1002 may be referred to the related description in step S902, which is not described herein.
Step S1003, using the target scoring model, obtaining scoring results for the feature data prediction result samples and the interpretation information samples, and scoring interpretation for the scoring results.
The specific description of step S1003 may be referred to the related description in step S903, which is not repeated here.
Step S1004, generating a reflection basis including the plurality of information summary samples, the feature data prediction result samples, the interpretation information samples, and the real feature data.
The specific description of step S1004 may be referred to the related description in step S903, which is not repeated herein.
Step S1005, obtaining a current reflection result for the feature data prediction result sample and the interpretation information sample based on the reflection basis by using the reflection model.
The specific description of step S1005 may be referred to the related description in step S903, which is not described herein.
Step S1006, acquiring a historical reflection result, and obtaining long-term reflection data based on the current reflection result and the historical reflection result.
For a specific description of step S1006, reference may be made to the description of step S903, which is not repeated herein.
Step S1007, training the initial large model based on the long-term reflection data to obtain the target large model.
The specific description of step S1007 may be referred to the description of step S903, which is not described herein.
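The steps S1001 to S1007 above can be outlined as a single training loop. Every component below is a stub standing in for the corresponding large model, and all function names are illustrative placeholders, not APIs from the disclosure:

```python
# Outline of steps S1001-S1007 as one iterative training loop with
# stubbed components (the real ones are large language models).

def get_information_summaries(target_sample):                   # S1001
    return ["news digest", "feature dynamics digest",
            "macro digest", "fundamentals digest"]

def fine_tune(model, long_term_reflection):                     # S1007
    return model  # placeholder: real code would update model parameters

def train_large_model(model, scoring_model, reflection_model,
                      target_sample, real_feature_data, rounds=3):
    history = []                                # historical reflection results
    for _ in range(rounds):
        summaries = get_information_summaries(target_sample)           # S1001
        prediction, interpretation = model(summaries)                  # S1002
        score, score_expl = scoring_model(prediction, interpretation)  # S1003
        basis = (summaries, prediction, interpretation,
                 score, score_expl, real_feature_data)                 # S1004
        current = reflection_model(basis)                              # S1005
        history.append(current)                                        # S1006
        model = fine_tune(model, "\n".join(history))                   # S1007
    return model, history

trained, history = train_large_model(
    lambda s: ("up 2%", "earnings beat"),
    lambda p, i: (0.8, "consistent with real data"),
    lambda basis: "reflection: short-term horizon too optimistic",
    "target sample", "up 2.1%")
```

Each round regenerates the reflection basis from the latest prediction, so the long-term reflection memory grows by one entry per parameter iteration.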
Fig. 12 is a schematic view of a scenario of a large model training method according to an embodiment of the disclosure.
As described above, the large model training method provided by the embodiment of the present disclosure is applied to an electronic device. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device.
The electronic device is used for:
obtaining a plurality of information abstract samples related to a target object sample, wherein the information abstract samples have different information categories and are used for representing the performance and/or affected degree of the target object sample in different dimensions;
Obtaining a characteristic data prediction result sample aiming at a target object sample and an interpretation information sample aiming at the characteristic data prediction result sample based on a plurality of information abstract samples by utilizing an initial large model;
And training the initial large model based on the characteristic data prediction result sample and the interpretation information sample to obtain a target large model.
It should be noted that, in the embodiment of the present disclosure, the schematic view of the scenario shown in fig. 12 is merely illustrative and not restrictive, and those skilled in the art may make various obvious changes and/or substitutions based on the example of fig. 12, and the obtained technical solutions still fall within the scope of the embodiment of the present disclosure.
In order to better implement the data prediction method, the embodiment of the disclosure also provides a data prediction device, which can be integrated in an electronic device. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device. Hereinafter, a data prediction apparatus 1300 provided by the disclosed embodiment will be described with reference to a schematic block diagram shown in fig. 13.
The data prediction apparatus 1300 includes:
A summary information obtaining unit 1301, configured to obtain a plurality of information summaries related to the target object, where the plurality of information summaries have different information categories, so as to characterize the performance and/or the affected degree of the target object in different dimensions;
The feature data prediction unit 1302 is configured to obtain a feature data prediction result of the target object based on the plurality of information summaries by using the target large model, and interpretation information for the feature data prediction result.
In some alternative embodiments, the feature data prediction unit 1302 is configured to:
Acquiring answer guide information;
obtaining a chained logic reasoning step based on the answer guiding information;
Based on the plurality of information abstracts, according to the chained logic reasoning step, the characteristic data prediction result of the target object and the interpretation information aiming at the characteristic data prediction result are obtained.
In some alternative embodiments, the plurality of information summaries comprises a news summary and a feature data dynamic summary, the step of chained logical reasoning comprises a first step and a second step, and the feature data prediction unit 1302 is configured to:
Executing a first step to obtain a first reasoning result based on the news abstract;
executing a second step to obtain a second reasoning result based on the feature data dynamic abstract;
And obtaining a first characteristic data prediction result of the target object in a short-term future period based on the first reasoning result and the second reasoning result, and interpretation information aiming at the first characteristic data prediction result.
In some alternative embodiments, the plurality of information summaries includes a macro information summary and a basic surface summary, the chained logical reasoning step includes a third step and a fourth step, and the feature data prediction unit 1302 is configured to:
Executing a third step to obtain a third reasoning result based on the macroscopic information abstract;
executing a fourth step to obtain a fourth reasoning result based on the basic face abstract;
And obtaining a second characteristic data prediction result of the target object in a long-term future period based on the third reasoning result and the fourth reasoning result, and interpretation information aiming at the second characteristic data prediction result.
In some alternative embodiments, the plurality of information summaries includes a news summary, a feature data dynamic summary, a macro information summary, and a basic surface summary, the chained logical reasoning step includes a first step, a second step, a third step, and a fourth step, and the feature data prediction unit 1302 is configured to:
Executing a first step to obtain a first reasoning result based on the news abstract;
executing a second step to obtain a second reasoning result based on the feature data dynamic abstract;
Executing a third step to obtain a third reasoning result based on the macroscopic information abstract;
executing a fourth step to obtain a fourth reasoning result based on the basic face abstract;
and obtaining a first characteristic data prediction result of the target object in a short-term future period, a second characteristic data prediction result of the target object in a long-term future period and interpretation information aiming at the first characteristic data prediction result and the second characteristic data prediction result based on the first inference result, the second inference result, the third inference result and the fourth inference result.
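One plausible way to realize the four-step chained logical reasoning is to assemble the four summaries into a stepwise instruction for the large model. The prompt wording below is an assumption for illustration only:

```python
# Sketch: build a four-step chained reasoning instruction from the news
# summary, feature data dynamic summary, macro information summary, and
# basic surface (fundamentals) summary.

def build_chained_prompt(news, dynamics, macro, fundamentals):
    steps = [
        ("Step 1: infer short-term sentiment from the news summary", news),
        ("Step 2: infer the recent trend from the feature data dynamic summary", dynamics),
        ("Step 3: infer the macro backdrop from the macro information summary", macro),
        ("Step 4: infer long-term quality from the fundamentals summary", fundamentals),
    ]
    body = "\n".join(f"{title}:\n{text}" for title, text in steps)
    return (body + "\nCombine steps 1-2 into a short-term prediction and "
            "steps 3-4 into a long-term prediction, then explain both.")

prompt = build_chained_prompt("positive coverage", "upward drift",
                              "rates easing", "stable margins")
```

The resulting prompt is what the target large model would receive; the model's answer then carries the first and second feature data prediction results plus the interpretation information.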
In some alternative embodiments, the plurality of information summaries include news summaries, and the news summaries include daily news summaries, and the summary information acquiring unit 1301 is configured to:
Acquiring N groups of daily public opinion news related to a target object, wherein the N groups of daily public opinion news are in one-to-one correspondence with N independent dates in the current period, and N is more than or equal to 2 and is an integer;
And taking each group of daily public opinion news in the N groups of daily public opinion news as the public opinion news to be processed, and processing the public opinion news to be processed by using a first news summarizer to obtain a daily news summary corresponding to the public opinion news to be processed.
In some optional embodiments, the news digest further includes a news digest of the current period, and the digest information acquiring unit 1301 is further configured to:
obtaining a daily news abstract set based on N daily news abstracts corresponding to N groups of daily public opinion news one by one;
acquiring a historical period news abstract corresponding to a historical period;
and processing the daily news digest set based on the news digest of the historical time period by using a second news summarizer to obtain the news digest of the current time period corresponding to the current time period.
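The two-level summarization above (per-day digests first, then a current-period digest conditioned on the historical-period digest) can be sketched with stub summarizers. Both stubs are trivial placeholders; real code would call the first and second news summarizer models:

```python
# Sketch of the two-stage news digest hierarchy: N daily digests are
# produced per date, then merged with the historical-period digest
# into the current-period digest.

def first_news_summarizer(daily_news):
    # stub: keep the first headline of the day
    return daily_news[0]

def second_news_summarizer(daily_digest_set, historical_digest):
    # stub: splice historical context with the aggregated daily digests
    return historical_digest + " | " + "; ".join(daily_digest_set)

daily_news_groups = [
    ["firm wins contract", "sector rally"],   # date 1
    ["guidance raised"],                      # date 2 (N = 2, N >= 2 holds)
]
daily_digests = [first_news_summarizer(group) for group in daily_news_groups]
period_digest = second_news_summarizer(daily_digests, "last period: stable")
```

The one-to-one correspondence between the N groups of daily news and the N daily digests is preserved by the list comprehension.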
In some alternative embodiments, the plurality of information summaries include feature data dynamic summaries, and summary information acquisition unit 1301 is configured to:
Acquiring first market data of a target object;
acquiring second market data of a first target number of reference objects similar to the target object;
And obtaining the feature data dynamic summary based on the first market data of the target object, the second market data of the first target number of reference objects, and a specific reference index by using a feature data dynamic summarizer.
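A sketch of preparing the inputs for the feature data dynamic summary follows, using period return and relative strength against the reference objects as assumed examples of a specific reference index (the disclosure does not name the index):

```python
# Sketch: derive reference indices from the first market data (target
# object) and the second market data (similar reference objects), to be
# fed to the feature data dynamic summarizer.

def period_return(prices):
    return (prices[-1] - prices[0]) / prices[0]

def dynamic_summary_inputs(target_prices, reference_prices_list):
    target_ret = period_return(target_prices)
    ref_rets = [period_return(p) for p in reference_prices_list]
    avg_ref = sum(ref_rets) / len(ref_rets)
    return {
        "target_return": target_ret,
        "reference_avg_return": avg_ref,
        "relative_strength": target_ret - avg_ref,  # assumed reference index
    }

inputs = dynamic_summary_inputs([100, 104], [[50, 51], [200, 202]])
```

The resulting dictionary would be serialized into the summarizer's prompt alongside the raw market data.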
In some alternative embodiments, the plurality of information summaries includes a macro information summary, and the summary information acquisition unit 1301 is configured to:
processing the second target number of macroscopic research reports by using a large language model to obtain second target number of macroscopic research summaries which are in one-to-one correspondence with the second target number of macroscopic research reports;
processing the macro index data by using a large language model to obtain a macro index abstract;
And obtaining the macro information abstract based on the macro research abstract and the macro index abstract of the second target number by using the large language model.
In some alternative embodiments, the plurality of information summaries includes a basic face summary, and summary information acquisition unit 1301 is configured to:
Processing the management layer analysis result of the target entity by using a large language model to obtain a management layer basic surface abstract, wherein the target entity is an economic organization issuing a target object;
Processing the analyst research report related to the target entity by using a large language model to obtain an analyst basic face abstract;
and obtaining the basic surface abstract based on the management layer basic surface abstract, the analyst basic surface abstract and specific data of the target entity by using a large language model.
In some alternative embodiments, summary information acquiring unit 1301 is configured to:
acquiring a first analyst report of a target entity;
Acquiring second analyst reports of a third target number of reference entities similar to the target entity;
and processing the first analyst report of the target entity and the second analyst reports of the third target number of reference entities as the analyst reports related to the target entity to obtain the analyst basic face abstract.
Descriptions of specific functions and examples of each unit of the data prediction apparatus 1300 in the embodiment of the present disclosure may refer to the related descriptions of corresponding steps in the foregoing data prediction method embodiment, which are not repeated herein.
In order to better implement the large model training method, the embodiment of the disclosure also provides a large model training device, which can be integrated in the electronic equipment. The electronic device may be a server or a terminal device. Here, the terminal device may be a workstation, a mainframe computer, a conventional computer (e.g., desktop computer, notebook computer, vehicle-mounted computer, etc.), a personal digital assistant, or other similar computing device. In the following, a large model training apparatus 1400 provided by the disclosed embodiment will be described with reference to a schematic block diagram shown in fig. 14.
Large model training apparatus 1400, comprising:
A first sample acquiring unit 1401, configured to acquire a plurality of information abstract samples related to the target object sample, where the plurality of information abstract samples have different information categories, so as to characterize the performance and/or the affected degree of the target object sample in different dimensions;
A second sample acquiring unit 1402 configured to obtain a feature data prediction result sample for the target object sample and an interpretation information sample for the feature data prediction result sample based on the plurality of information summary samples using the initial large model;
the large model training unit 1403 is configured to train the initial large model based on the feature data prediction result sample and the interpretation information sample, so as to obtain a target large model.
In some alternative embodiments, large model training unit 1403 is used to:
Acquiring real characteristic data aiming at a target object sample;
Generating a reflection basis comprising the plurality of information summary samples, the feature data prediction result sample, the interpretation information sample, and the real feature data;
obtaining a current reflection result for the feature data prediction result sample and the interpretation information sample based on the reflection basis by using a reflection model;
And training the initial large model based on the current reflection result to obtain the target large model.
In some alternative embodiments, large model training unit 1403 is used to:
Obtaining, by using a target scoring model, a scoring result for the feature data prediction result sample and the interpretation information sample, and a scoring interpretation for the scoring result;
generating a reflection basis comprising the plurality of information summary samples, the feature data prediction result sample, the interpretation information sample, the scoring result, the scoring interpretation, and the real feature data.
In some alternative embodiments, large model training apparatus 1400 further comprises a scoring large model training unit for:
The method comprises the steps of obtaining a first sample pair, wherein the first sample pair comprises a first characteristic data prediction sample, a first interpretation information sample and first real characteristic data corresponding to the first characteristic data prediction sample, and the similarity between the first characteristic data prediction sample and the first real characteristic data is larger than or equal to a first preset similarity threshold value;
The method comprises the steps of obtaining a second sample pair, wherein the second sample pair comprises a second characteristic data prediction sample, a second interpretation information sample and second real characteristic data corresponding to the second characteristic data prediction sample, the similarity between the second characteristic data prediction sample and the second real characteristic data is smaller than a second similarity threshold value, and the second similarity threshold value is smaller than a first preset similarity threshold value;
Training the initial scoring model based on the first sample pair and the second sample pair to obtain the target scoring model.
In some alternative embodiments, large model training unit 1403 is used to:
Acquiring a historical reflection result;
Obtaining long-term reflection data based on the current reflection result and the historical reflection result;
Training the initial large model based on the long-term reflection data to obtain the target large model.
For descriptions of specific functions and examples of each unit of the large model training apparatus 1400 in the embodiment of the disclosure, reference may be made to the related descriptions of corresponding steps in the foregoing large model training method embodiment, which are not repeated herein. In the technical solution of the present disclosure, the acquisition, storage, and application of the user personal information involved all conform to the provisions of relevant laws and regulations, and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 15 illustrates a schematic block diagram of an example electronic device 1500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 15, the apparatus 1500 includes a computing unit 1501, which can perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 1502 or a computer program loaded from a storage unit 1508 into a random access Memory (Random Access Memory, RAM) 1503. In the RAM 1503, various programs and data required for the operation of the device 1500 may also be stored. The computing unit 1501, the ROM 1502, and the RAM 1503 are connected to each other through a bus 1504. An Input/Output (I/O) interface 1505 is also connected to bus 1504.
Various components in the device 1500 are connected to the I/O interface 1505, including an input unit 1506, e.g., a keyboard, mouse, etc., an output unit 1507, e.g., various types of displays, speakers, etc., a storage unit 1508, e.g., magnetic disk, optical disk, etc., and a communication unit 1509, e.g., a network card, modem, wireless communication transceiver, etc. The communication unit 1509 allows the device 1500 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1501 include, but are not limited to, a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), various specialized artificial intelligence (Artificial Intelligence, AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (Digital Signal Processor, DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1501 performs the various methods and processes described above, for example, the data prediction method and/or the large model training method. For example, in some embodiments, the data prediction method and/or the large model training method may be implemented as a computer software program tangibly embodied on a machine-readable medium, e.g., the storage unit 1508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1500 via the ROM 1502 and/or the communication unit 1509. When the computer program is loaded into the RAM 1503 and executed by the computing unit 1501, one or more steps of the data prediction method and/or the large model training method described above may be performed. Alternatively, in other embodiments, the computing unit 1501 may be configured to perform the data prediction method and/or the large model training method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above can be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (Field Programmable Gate Array, FPGA), application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), application-specific standard products (Application Specific Standard Product, ASSP), systems on chip (System On Chip, SOC), complex programmable logic devices (Complex Programmable Logic Device, CPLD), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special or general purpose programmable processor, operable to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a cathode ray tube (Cathode Ray Tube, CRT) display or a liquid crystal display (Liquid Crystal Display, LCD)) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the data prediction method and/or the large model training method.
The disclosed embodiments also provide a computer program product comprising a computer program which, when executed by a processor, implements the data prediction method and/or the large model training method.
It should be appreciated that the various forms of flow shown above may be used, with steps reordered, added, or deleted. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved; no limitation is imposed herein. Moreover, in this disclosure, relational terms such as "first," "second," "third," and the like are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, "plurality" in the present disclosure may be understood as at least two.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. that are within the principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (35)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411283026.1A CN119167063A (en) | 2024-09-12 | 2024-09-12 | Data prediction method, large model training method, device and electronic equipment |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411283026.1A CN119167063A (en) | 2024-09-12 | 2024-09-12 | Data prediction method, large model training method, device and electronic equipment |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119167063A true CN119167063A (en) | 2024-12-20 |
Family
ID=93892300
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411283026.1A Pending CN119167063A (en) | 2024-09-12 | 2024-09-12 | Data prediction method, large model training method, device and electronic equipment |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119167063A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120561579A (en) * | 2025-05-09 | 2025-08-29 | 北京睿科伦智能科技有限公司 | A dual-level optimization method for AI training data based on large models |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Kothandapani | Applications of robotic process automation in quantitative risk assessment in financial institutions | |
| Gao | The use of machine learning combined with data mining technology in financial risk prevention | |
| Veganzones et al. | Corporate failure prediction models in the twenty-first century: a review | |
| Zhao et al. | Financial distress prediction by combining sentiment tone features | |
| EP4268172A1 (en) | Artificial intelligence financial analysis and reporting platform | |
| CN113469818B (en) | Investment risk early warning method and device, electronic equipment and computer readable medium | |
| Biswas et al. | Automated credit assessment framework using ETL process and machine learning | |
| Grant et al. | The double‐edged sword of global integration: Robustness, fragility, and contagion in the international firm network | |
| Wang et al. | Credit debt default risk assessment based on the XGBoost algorithm: An empirical study from China | |
| CN116596662A (en) | Risk early warning method and device based on enterprise public opinion information, electronic equipment and medium | |
| KR102519878B1 (en) | Apparatus, method and recording medium storing commands for providing artificial-intelligence-based risk management solution in credit exposure business of financial institution | |
| CN112862182A (en) | Investment prediction method and device, electronic equipment and storage medium | |
| Jammazi et al. | Estimating and forecasting portfolio’s Value-at-Risk with wavelet-based extreme value theory: Evidence from crude oil prices and US exchange rates | |
| CN110348995A (en) | A kind of credit risk control method, apparatus and electronic equipment based on risk attribution | |
| CN119167063A (en) | Data prediction method, large model training method, device and electronic equipment | |
| Sun et al. | Accounting earnings and economic growth, trends, and challenges: a bibliometric approach | |
| Mustafin et al. | Evaluation of the choice of borrower rating groups | |
| CN119963341A (en) | A method, system, device and medium for intelligent risk assessment and decision-making assistance of investment activities | |
| Daying et al. | Discovering variation financial performance of ESG scoring through big data analytics | |
| CN116341723A (en) | Stock trend prediction method, system, device and medium based on deep learning and multi-source data fusion | |
| Piven | Analysis of financial reports in companies using machine learning | |
| Rudnichenko et al. | Intelligent System for Processing and Forecasting Financial Assets and Risks | |
| Hao et al. | An Evaluation Study on Investment Efficiency: A Predictive Machine Learning Approach | |
| Kulachinskaya et al. | Artificial neural network model for managing bank capital | |
| CN120632061B (en) | An intelligent analysis method and system for enterprise digital transformation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||