Detailed Description
In the description of the present invention, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like indicate orientations or positional relationships based on those shown in the drawings. They are used merely for convenience in describing the present invention and simplifying the description, do not indicate or imply that the devices or elements referred to must have a specific orientation or be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present invention, "a plurality" means two or more, unless explicitly defined otherwise.
In the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected, mechanically connected, electrically connected, directly connected, indirectly connected via an intervening medium, or in communication between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
The embodiments of the present disclosure address the following technical problem: when an existing low-code development platform faces a complex business application scenario, a developer is still required to perform a certain amount of manual coding or to configure advanced logic, so that the application depth and efficiency of the low-code development platform in complex business scenarios are limited. To solve this problem, the embodiments of the disclosure provide a low-code development platform and a data processing method thereof.
As shown in fig. 1, an embodiment of the present disclosure provides a low-code development platform, which includes a low-code base module, an MCP service module, a RAG service module, a natural language processing module, and a core generation logic arrangement module, where the low-code base module, the RAG service module, the natural language processing module, and the core generation logic arrangement module are all connected to the MCP service module, and the core generation logic arrangement module is also connected to the natural language processing module.
For the disclosed embodiments, low-code may be a technique or methodology for rapidly developing an application through a visual interface and a small amount of manual coding. Its core aims are to lower the development threshold and improve efficiency, so that non-professional developers can participate in application construction while the enterprise's need for agile iteration is met;
The low-code basic module can be used to provide a visual development function, a man-machine interaction entry, and a metadata interface. The visual development function can provide conventional drag-and-drop visual development so that a user can quickly construct an application's basic interface and flows through simple operations. The man-machine interaction entry serves as the entry point for collecting and integrating requirements, where the user can input application requirements in natural language form. The metadata interface can be a pull-and-submit interface that exposes the metadata definition information of the underlying components, so that other modules can conveniently acquire and update metadata.
The MCP service module may act as an implementation server for the Model Context Protocol (MCP), with the core objective of solving the fragmentation problem when the natural language processing module is integrated with external resources (e.g., databases, APIs, local files, etc.). The module has dynamic adaptability, flexibility and uniformity, can automatically adapt to corresponding interaction modes according to different external resource types, and provides uniform context information for the natural language processing module;
The MCP service module may be developed based on the MCP open standard protocol, defining a unified interface specification. When the natural language processing module needs to access external resources, a request is sent to the MCP service module, the module dynamically calls a corresponding resource adapter according to the resource identification and the operation type in the request, interacts with the external resources, and packages acquired data into unified context information to be returned to the natural language processing module;
The MCP service module serves as a core processing node of the MCP protocol, and can be used for receiving MCP protocol requests from other modules and converting the MCP protocol requests into specific operation instructions of external resources. Interaction with various external resources can include invoking a vector database for knowledge creation, retrieval, update operations, internet information query, file system reading and writing, metadata submission with low code platform base, pulling, and the like.
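The dispatch behavior described above can be illustrated with a minimal Python sketch: an MCP-style request carries a resource identifier and an operation type, the service routes it to a matching resource adapter, and the adapter's result is packaged as unified context information. All class, field, and adapter names here are illustrative assumptions, not the platform's actual API.

```python
# Minimal sketch of MCP-style request dispatch to resource adapters.
class MCPService:
    def __init__(self):
        self._adapters = {}  # resource type -> adapter callable

    def register_adapter(self, resource_type, adapter):
        self._adapters[resource_type] = adapter

    def handle_request(self, request):
        adapter = self._adapters.get(request["resource"])
        if adapter is None:
            return {"status": "error", "context": None,
                    "message": f"no adapter for {request['resource']}"}
        data = adapter(request["operation"], request.get("payload"))
        # Package the adapter's result as unified context information.
        return {"status": "ok", "context": data, "resource": request["resource"]}


# Illustrative adapters for two external resource types.
def vector_db_adapter(operation, payload):
    return {"op": operation, "hits": []}          # stand-in for a real retrieval

def file_system_adapter(operation, payload):
    return {"op": operation, "content": "stub"}   # stand-in for a real file read

mcp = MCPService()
mcp.register_adapter("vector_db", vector_db_adapter)
mcp.register_adapter("file_system", file_system_adapter)

resp = mcp.handle_request({"resource": "vector_db", "operation": "retrieve"})
```

Registering one adapter per external resource type is what gives the module the dynamic adaptability described above: new resource types can be supported without changing the dispatch logic.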
Retrieval-augmented generation (Retrieval-Augmented Generation, RAG) is a technique that, when the natural language processing module answers questions or generates text, retrieves related information from a large number of documents and then answers or generates text based on the retrieved information, thereby improving the quality of the answers. The developer does not need to retrain the whole large model for each specific task; by merely attaching a related knowledge base, additional information input can be provided to the model, so that the accuracy of answers is improved;
The RAG service module can be used to integrate business knowledge with the metadata of the low-code platform's underlying components to construct a cross knowledge base. The knowledge base covers rich business rules, technical documents, example code, and other information, and provides additional information input for the natural language processing module. When the natural language processing module generates text or answers a question, information related to the current task is retrieved from the cross knowledge base, so that the quality and accuracy of the answer are improved. Real-time updating and maintenance of the knowledge base are supported to ensure the timeliness and accuracy of the knowledge.
The natural language processing module integrates a deep-learning-based large language model (Large Language Model, LLM), which can be trained on massive text data, can understand, generate, and reason over natural language text, and is widely applied to tasks such as text creation, translation, and question answering. Advanced large language models have powerful natural language understanding, generation, and reasoning capabilities.
The core generation logic arrangement module can be used for being responsible for overall scheduling and arrangement of the whole application generation process, coordinating the workflow among the modules and ensuring smooth progress of application development. The core generation logic arrangement module can be responsible for calling a natural language processing module, an MCP service module and a RAG service module based on an MCP protocol to carry out interactive communication so as to realize information transmission and sharing.
For the embodiment of the disclosure, the core generation logic arrangement module may be configured to receive an original demand text of a user, send a division instruction and a RAG search request to the MCP service module, receive a first search result, integrate the first search result and the original demand text to obtain first integrated data, and send a data processing instruction to the natural language processing module;
The MCP service module is used for receiving the dividing instruction, and performing text division on the original demand text based on the dividing instruction to obtain a plurality of text blocks;
the MCP service module is also used for receiving an RAG retrieval request, sending a vector conversion instruction to an embedded model of the RAG service module based on the RAG retrieval request, sending a retrieval instruction to a vector database of the RAG service module based on the RAG retrieval request, and sending an execution instruction to the low-code basic module;
The embedded model of the RAG service module is used for receiving a vector conversion instruction, and carrying out vector conversion on a plurality of text blocks based on the vector conversion instruction to obtain first vector data corresponding to the text blocks;
The vector database of the RAG service module is used for receiving a search instruction, searching first vector data based on the search instruction, and obtaining a first search result;
The natural language processing module is used for receiving the data processing instruction, performing task decomposition on the first integrated data based on the data processing instruction to obtain a plurality of subtasks, and performing content generation processing on the subtasks to obtain metadata information corresponding to each subtask;
The low code base module is used for receiving the execution instruction and executing corresponding configuration on the metadata information based on the execution instruction.
For the disclosed embodiments, as shown in fig. 2, a specific workflow may be as follows:
S1, a user submits demand information (namely, the original demand text) in natural language, and the core generation logic arrangement module receives the user's original demand text.
S2, the core generation logic arrangement module calls the MCP service module to chunk the content of the original demand text, in preparation for the RAG query.
S3, the MCP service module invokes its internal content-chunking MCP service, divides the original demand text to obtain a plurality of text blocks, and returns the segmented content.
S4, the core generation logic arrangement module requests the MCP service module to perform RAG retrieval.
S5, the MCP service module calls the embedded model in the RAG service module to vectorize the input blocks (namely, the plurality of text blocks); the embedded model receives the vector conversion instruction, performs the operation, converts the text blocks into vector data, and returns the vector data to the MCP service module.
S6, the MCP service module then calls the vector database in the RAG service module to retrieve vector data, and the RAG service module returns the retrieved vector data and the corresponding text data to the core logic arrangement module.
S7, after the core logic arrangement module integrates the returned information with the original demand text, it calls the natural language processing module to perform content generation, wherein the original demand text is decomposed into task requirement descriptions for generating low-code metadata, namely, the task decomposition of the original requirement.
S8, content is generated for each subtask to obtain low-code metadata (namely, the metadata information); the MCP service module sends an execution instruction to the low-code basic module, which receives the execution instruction and performs the corresponding configuration on the metadata information based on the execution instruction so that it takes effect immediately, thereby completing development of the application.
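The S1–S8 flow can be sketched end to end in Python. Every function below is a simplified stand-in (an assumption for illustration, not the platform's actual service): a real deployment would call the embedded model, the vector database, and an LLM instead of these toys.

```python
# Illustrative end-to-end sketch of the S1-S8 workflow with stub components.
def chunk(text):                                  # S2-S3: content chunking
    return [s.strip() for s in text.split(",") if s.strip()]

def embed(blocks):                                # S5: vectorization (toy)
    return [[float(len(b))] for b in blocks]

def retrieve(vectors):                            # S6: vector-database lookup
    return [f"knowledge for block of length {int(v[0])}" for v in vectors]

def decompose(integrated):                        # S7: LLM task decomposition
    return [f"subtask: {blk}" for blk in integrated["blocks"]]

def generate_metadata(subtask):                   # S8: LLM content generation
    return {"task": subtask, "fields": [], "layout": "default"}

def run_pipeline(demand_text):
    blocks = chunk(demand_text)
    hits = retrieve(embed(blocks))
    # Integrate retrieval results with the original demand text.
    integrated = {"demand": demand_text, "blocks": blocks, "knowledge": hits}
    return [generate_metadata(t) for t in decompose(integrated)]

metadata = run_pipeline("goods inbound, goods outbound, stocktaking")
```

The point of the sketch is the ordering: chunking and retrieval happen before the LLM is invoked, so the model always generates metadata against knowledge-enriched input rather than the raw demand text alone.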
For the disclosed embodiments, the various modules may be deployed and initialized prior to low code development to ensure that the core generation logic orchestration module, the MCP service module, the RAG service module, the natural language processing module, and the low code base module are able to communicate and operate normally. Meanwhile, the vector database of the RAG service module is subjected to knowledge filling, and common information such as business knowledge, technical documents and the like can be stored in the vector database.
The user may input the application requirements through a man-machine interface (i.e., man-machine interaction portal in fig. 1) provided by the platform, for example, "develop an inventory management system including commodity warehousing, ex-warehouse, inventory checking and report generation functions". And after the core generation logic programming module receives the original demand text of the user, starting a subsequent processing flow.
The core generation logic orchestration module may send a partitioning instruction to the MCP service module, where the partitioning instruction may be used to instruct the MCP service module to perform text partitioning on the original demand text, resulting in a plurality of text blocks. The MCP service module may then send vector conversion instructions to the embedded model of the RAG service module, which may be used to instruct the embedded model to convert the plurality of text blocks into vector data.
The MCP service module may send a search instruction to a vector database of the RAG service module, the search instruction may be used to instruct the vector database to search for vector data and return a first search result. The core generation logic arrangement module can receive the first search result and integrate the first search result with the original demand text to obtain first integrated data.
The core generation logic programming module sends a data processing instruction to the natural language processing module, wherein the data processing instruction can be used for instructing the natural language processing module to conduct task decomposition on the first integrated data to obtain a plurality of subtasks. For example, the inventory management system requirements are decomposed into subtasks such as "commodity warehouse module development", "commodity warehouse-out module development", "inventory module development", and "report generation module development". Content generation is then performed for each subtask, resulting in low code metadata such as form field definition, page layout, business logic, etc.
The MCP service module may send an execution instruction to the low code base module, where the execution instruction may be used to instruct the low code base module to automatically generate relevant components and functions of the application based on the metadata information and perform a corresponding configuration, thereby completing development of the inventory management system.
After development is completed, the application can be comprehensively tested to check whether the functions meet the requirements of users, whether the interface is friendly, whether the performance is stable and the like. If problems are found, the metadata can be adjusted and optimized through the metadata interface of the low-code base module, and the metadata is submitted again for updating the application.
Through the implementation mode, the low-code development platform disclosed by the invention can efficiently and accurately process the original requirements of users, generate low-code application meeting business requirements, effectively reduce development threshold and improve development efficiency and quality.
For the embodiments of the present disclosure, for S8 in a specific workflow, the core generation logic orchestration module may recursively process each subtask, as shown in fig. 2, the specific workflow may be as follows:
And S8-1, when the core logic orchestration module processes each subtask, it can send a request to the MCP service module to request RAG retrieval of the subtask requirement information. Because each subtask has a finer granularity, its requirement information can generally be retrieved directly as a whole without further chunking. For example, if the subtask is "create a merchandise display page containing a merchandise picture, name, price, and profile," the description is used as the retrieval input.
And S8-2, after the MCP service module receives the request of the core logic arrangement module, an embedded model in the RAG service module can be called, specifically, a vector conversion instruction can be sent to the embedded model of the RAG service module, the embedded model can convert the input subtasks from natural language description to second vector data based on the vector conversion instruction, and the vector represents semantic information capable of capturing the subtasks so as to carry out efficient matching in a vector database.
And S8-3, the MCP service module may continue to call the vector database in the RAG service module; specifically, it may send a search instruction to the vector database of the RAG service module, where the search instruction may be used to instruct the vector database to search the second vector data and return the second search result. The vector database stores a large amount of knowledge information, including business rules, technical documents, common design patterns, and the like. The knowledge vector most similar to the input subtask vector can be found by calculating the similarity between vectors, and the corresponding vector data and text data are returned. The RAG service module returns the search results to the core logic arrangement module.
And S8-4, the core logic arrangement module can integrate the returned vector data, text data, and subtask requirements, so as to associate and fuse the retrieved related knowledge information with the subtask requirements to form a more complete and accurate requirement description. Then, the core logic orchestration module can call a suitable large language model in the natural language processing module to perform content generation according to the integrated information, at which point mainly low-code metadata definition information is generated. The metadata definition information may be used to describe attributes of the application component to which the subtask corresponds, such as form field types, page layout, and business logic rules.
And S8-5, after the core logic orchestration module generates the metadata definition information, it can send a submission request to the metadata-submitting service in the MCP service module so as to submit the metadata information to the low-code basic module, whereupon the metadata definition information takes effect immediately.
And S8-6, after receiving the request for submitting the metadata of the core logic orchestration module, the MCP service module can call the internal submitting metadata MCP service. The service can accurately submit metadata information to the low-code base module. The low-code basic module can automatically generate application components and functions corresponding to the subtasks according to the submitted metadata information, and the development of the subtasks is completed.
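The per-subtask steps S8-4 to S8-6 (integrate retrieved knowledge, generate metadata definition information, submit it so it takes effect) can be sketched as follows. The class, field names, and metadata shape are illustrative assumptions, and the LLM call is replaced by a stub.

```python
# Sketch of S8-4..S8-6: integrate, generate metadata, submit to the base module.
def integrate(subtask, knowledge):
    # Fuse retrieved knowledge with the subtask requirement (S8-4).
    return {"requirement": subtask, "context": knowledge}

def generate_metadata(enriched):
    # Stand-in for the LLM call: emit a metadata definition describing the
    # application component (form fields, layout, business rules).
    return {
        "component": "form",
        "fields": [{"name": "title", "type": "text"}],
        "layout": "single-column",
        "rules": enriched["context"],
    }

class LowCodeBase:
    """Toy stand-in for the low-code basic module's metadata service."""
    def __init__(self):
        self.applied = []

    def submit_metadata(self, metadata):
        self.applied.append(metadata)   # configuration takes effect immediately
        return True

base = LowCodeBase()
enriched = integrate("create a merchandise display page", ["form design spec"])
ok = base.submit_metadata(generate_metadata(enriched))
```

Because submission appends an already-complete metadata definition, the base module can apply each subtask's configuration independently, which is what allows the orchestration module to process subtasks recursively one at a time.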
For the embodiments of the present disclosure, for future iterations of application, the core logic orchestration module may actively initiate a flow of knowledge base information update to ensure that the knowledge base can keep pace with the development changes of the application, as shown in fig. 2, the specific workflow may be as follows:
And S9-1, the core logic orchestration module can send a request to the MCP service module to request to pull low-code metadata information, wherein the metadata information comprises key information such as component definition, business logic, page layout and the like of an application and is an important data source for updating a knowledge base.
And S9-2, after the MCP service module receives the request of the core logic arrangement module, the MCP service module can call the internal pull metadata MCP service. The service may pull the required metadata information from the low code base module and return it to the core logic orchestration module.
And S9-3, the MCP service module can call internal content blocking MCP service to carry out blocking processing on the pulled metadata information. Because metadata information can be complex and huge, the partitioning processing is beneficial to subsequent vectorization operation and knowledge base updating, and processing efficiency and accuracy are improved.
And S9-4, the MCP service module can call an embedded model in the RAG service module to carry out vectorization processing on the segmented metadata information. The embedded model can be used for converting each metadata block into vector data which can be processed by a computer, and the vector data can capture semantic information of the metadata, so that the metadata can be stored and retrieved in a vector database conveniently.
And S9-5, the MCP service module can call a vector database in the RAG service module, and the information of the whole knowledge base is updated by using the vectorized metadata information. The vector database can store new vector data and corresponding metadata information into the knowledge base to replace or supplement original knowledge content, so that the knowledge base can reflect the latest state of the application.
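The S9-1 to S9-5 update flow amounts to a pull, chunk, vectorize, upsert pipeline, sketched below. A plain dictionary stands in for the vector database, and the length-based embedding is a toy assumption; real deployments would use the embedded model and vector database of the RAG service module.

```python
# Sketch of the S9-1..S9-5 knowledge base update flow.
def pull_metadata():
    # S9-1/S9-2: pull metadata information from the low-code basic module.
    return "component: order form; logic: validate quantity; layout: two-column"

def chunk_metadata(text):
    # S9-3: chunk the (potentially large) metadata for later vectorization.
    return [part.strip() for part in text.split(";")]

def vectorize(block):
    # S9-4: toy embedding standing in for the embedded model.
    return [float(len(block))]

def update_knowledge_base(kb, text):
    # S9-5: upsert each vectorized chunk, replacing or supplementing entries
    # so the knowledge base reflects the application's latest state.
    for block in chunk_metadata(text):
        kb[block] = vectorize(block)
    return kb

kb = {}
update_knowledge_base(kb, pull_metadata())
```

Keying the store by chunk text makes re-running the update idempotent: unchanged chunks overwrite themselves, while edited metadata produces new entries.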
For the disclosed embodiments, as shown in fig. 1, the overall workflow may be as follows:
The user inputs the application requirement (namely the original requirement text of the user) in a natural language form through the man-machine interaction entrance of the low-code basic module, for example, "develop a client management system comprising client information input, query and statistical analysis functions".
After the core generation logic orchestration module receives the original demand text input by the user, the core generation logic orchestration module may call the MCP service module to perform content blocking on the original demand text to obtain a plurality of text blocks, and may call a RAG knowledge base retrieval function (i.e., the knowledge base retrieval function in fig. 1) in the MCP service module through the MCP protocol, so that the MCP service module retrieves knowledge related to the original demand text of the user from a cross knowledge base (i.e., the vector database in fig. 1) of the RAG service module.
Specifically, the MCP service module invokes an embedded model in the RAG service module to vectorize the input blocks (i.e., the plurality of text blocks), and the embedded model receives the vector conversion instruction and performs an operation to convert the plurality of text blocks into vector data and returns the vector data to the MCP service module. The MCP service module continues to call a vector database in the RAG service module to search vector data, and the RAG service module returns the searched vector data and the corresponding text data to the core logic arrangement module.
The core generation logic arrangement module can integrate the search result with the original demand text, then call a model A in the natural language processing module to conduct subtask decomposition, call a model B in the natural language processing module to conduct subtask demand content generation to obtain table unit data information corresponding to the subtasks, and call a model C in the natural language processing module to conduct subtask demand content generation to obtain report metadata information corresponding to the subtasks. For example, the customer management system requirements are broken down into subtasks such as "customer information entry module development", "customer information query module development", "statistical analysis module development", and the like.
The core generation logic orchestration module may recursively process each subtask. For each subtask, the MCP service module can be called again to search the RAG knowledge base in the subtask field.
The MCP service module may retrieve knowledge related to the subtasks, such as form design specifications entered by the customer information, SQL statement templates of the query function, etc., from the cross knowledge base (i.e., vector database in fig. 1) of the RAG service module and return to the core generation logic orchestration module.
The core generation logic orchestration module may invoke a natural language processing module (LLM), combine the retrieved knowledge, and generate metadata information corresponding to each subtask, such as form field definitions, page layout, business logic, and the like.
The core generation logic orchestration module may invoke the MCP service in the MCP service module that is responsible for the low code metadata operations, submitting the generated metadata information to the metadata services of the low code base module.
The low-code basic module can automatically generate related components and functions of the application according to the submitted metadata information to complete development of the application, wherein the low-code basic module can contain components such as forms, processes, pages, reports and the like, and convenient interaction and basic function support are provided for users.
After development is completed, the application may be tested to check whether the function meets the user's requirements. If problems are found, the metadata can be adjusted and optimized through the metadata interface of the low-code base module, and the metadata is submitted again for updating the application.
For future iterations, the core logic orchestration module may proactively initiate a knowledge base information update flow.
The core logic orchestration module may send a request to the MCP service module to request to pull low code metadata information. The MCP service module can call the internal metadata pulling MCP service after receiving the request of the core logic arrangement module so as to pull the required metadata information from the low-code basic module and return the required metadata information to the core logic arrangement module.
The MCP service module can call internal content block MCP service to carry out block processing on the pulled metadata information. The MCP service module can call an embedded model in the RAG service module to carry out vectorization processing on the segmented metadata information. The MCP service module may call a vector database in the RAG service module to store the new vector data and corresponding metadata information into a knowledge base of the vector database to complete the knowledge base update.
Through the implementation mode, the low-code development platform disclosed by the invention can effectively integrate business knowledge and technical knowledge, reduce the development difficulty of complex business application and improve the development efficiency and quality.
As shown in fig. 3, an embodiment of the present disclosure provides a data processing method of a low-code development platform, which may be applied to the low-code development platform and controlled by a core generation logic arrangement module in the low-code development platform, wherein the data processing method of the low-code development platform may include:
Step 101, receiving an original demand text of a user, and performing text division on the original demand text to obtain a plurality of text blocks.
For the embodiment of the disclosure, the system can receive the original demand text input by the user through the human-computer interaction interface, and the original demand text can describe the user's functional requirements for the application in natural language form, for example, "develop an employee attendance management system, including the functions of attendance record entry, leave approval, and attendance statistics report generation".
The MCP service module may perform text partitioning on the received original demand text, splitting it into a plurality of text blocks. The text division can be performed in a semantic, sentence or paragraph mode, so that each text block can be processed independently. For example, the requirement text of the attendance management system is divided into a plurality of text blocks such as "attendance record input related requirement", "leave examination and approval related requirement", "attendance statistics report generation related requirement", and the like.
For the embodiment of the present disclosure, text partitioning is performed on an original demand text to obtain a plurality of text blocks, which may specifically include:
dynamically analyzing and processing the semantic structure of the original demand text to obtain a semantic structure analysis result;
determining a partition boundary based on a semantic structure analysis result according to a preset semantic integrity rule;
And carrying out text division on the original required text according to the division boundary to obtain a plurality of text blocks.
The dynamic analysis and processing can adopt natural language processing technology, such as syntactic analysis, semantic role labeling and other methods, so as to deeply understand the semantic relation among all components in the text. For example, main predicate structures, modifier relationships, logical connector words, etc. in sentences are analyzed to build a semantic structure framework of text.
The semantic integrity rule comprises at least one of integrity judgment based on a grammar structure, consistency judgment based on a semantic topic and boundary recognition based on domain knowledge.
The partition boundary can be determined according to grammar structure information such as sentence boundary, clause relation and the like based on the integrity judgment of the grammar structure. For example, a complete sentence is taken as one text block, or a complex sentence is split into multiple text blocks according to the boundaries of clauses.
The consistency judgment based on the semantic topics can divide the text according to the consistency of the semantic topics such as the functional module, the operation step and the like. For example, in a demand text describing one e-commerce system, texts on different functional modules of user registration, merchandise display, order processing, and the like are divided into different text blocks, respectively.
Domain knowledge based boundary identification may utilize domain knowledge, such as specific terms or logic units in low code development, to identify partition boundaries. For example, when a specific term of the low code development field such as "form design" and "workflow configuration" appears in the text, the semantic unit where it is located is divided into a separate text block.
Each divided text block should have relatively complete semantic content, so that subsequent vectorization, retrieval, metadata generation and other operations are facilitated.
After determining the dividing boundary according to the semantic integrity rule, a context association mark can be added for each text block, wherein the mark can comprise at least one of an identifier of a leading text block used for indicating the front-order dependency relationship of the current text block, an identifier of a trailing text block used for indicating the subsequent association relationship of the current text block, and a position index in the original requirement used for recording the starting and ending positions of the current text block in the original requirement text.
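A minimal sketch of boundary-based chunking with the context association marks described above, using sentence-final punctuation as a simplified stand-in for the semantic-integrity rules; the field names (`prev_id`, `next_id`, `span`) are illustrative assumptions.

```python
# Sketch: chunk text at sentence boundaries and attach context marks.
import re

def chunk_with_context(text):
    # Grammar-structure integrity, simplified: split on sentence boundaries.
    sentences = [s for s in re.findall(r"[^.!?]+[.!?]?", text) if s.strip()]
    blocks, pos = [], 0
    for i, sentence in enumerate(sentences):
        start = text.index(sentence, pos)
        end = start + len(sentence)
        blocks.append({
            "id": i,
            "text": sentence.strip(),
            "prev_id": i - 1 if i > 0 else None,   # leading-block identifier
            "next_id": None,                        # trailing block, filled below
            "span": (start, end),                   # position index in original
        })
        pos = end
    for j in range(len(blocks) - 1):
        blocks[j]["next_id"] = blocks[j + 1]["id"]
    return blocks

blocks = chunk_with_context("Record attendance. Approve leave. Build reports.")
```

The `span` field preserves each block's start and end positions in the original demand text, so later stages can map generated metadata back to the exact requirement sentence it came from.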
Through this implementation, the method can accurately and efficiently divide the original demand text into text blocks with clear semantics and context association, providing strong support for application development on a low-code development platform and improving development efficiency and quality.
Step 102, performing vector conversion on the text blocks to obtain first vector data corresponding to the text blocks, and searching the first vector data in a vector database to obtain a first search result.
For the embodiment of the disclosure, vector conversion may be performed on the divided text blocks by an embedding model in the RAG service module, converting the text information into a numerical form that a computer can process and obtaining the first vector data corresponding to the text blocks. The embedding model may be the embedding layer of a model such as BERT or GPT, so that each text block is mapped to one vector in a high-dimensional vector space.
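A minimal sketch of the text-to-vector interface follows. A toy hashed bag-of-words embedding stands in for the embedding layer of a model such as BERT or GPT; the dimension `DIM` and the hashing scheme are illustrative assumptions, chosen only to show that each text block maps to one fixed-length vector:

```python
import hashlib
import math

DIM = 64  # toy embedding dimension (real models use hundreds of dimensions)

def embed(text):
    """Toy embedding: hash each token into one of DIM buckets, count tokens
    per bucket, and L2-normalise. Illustrates the text -> vector interface
    only; it does not capture semantics the way a trained model would."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```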
The first vector data may be retrieved against a vector database of the RAG service module to obtain a first retrieval result. The vector database stores a large amount of knowledge and information, represented in the form of vectors. Through vector retrieval, the knowledge vectors most similar to the input text block vector can be found, and the relevant knowledge information thereby obtained. For example, when handling the requirements of an attendance management system, the vector database may return knowledge about the attendance record data structure, the leave approval process design, the attendance statistics algorithm, and the like.
For the embodiment of the present disclosure, retrieving the first vector data in the vector database to obtain the first retrieval result may specifically include:
Determining first similar vector data with the similarity degree larger than a preset similarity degree threshold value from a vector database;
And determining the text data corresponding to the first similar vector data as a first search result.
For the disclosed embodiments, in the vector database, the degree of similarity between each stored vector and the first vector data may be calculated. The similarity calculation may be performed by various methods, such as cosine similarity or Euclidean distance. Taking cosine similarity as an example, for two vectors A and B the cosine similarity is calculated as:

similarity(A, B) = (A · B) / (|A| × |B|)

where similarity(A, B) represents the cosine similarity of vector A and vector B, A · B represents the dot product of vector A and vector B, and |A| and |B| represent the moduli of vector A and vector B, respectively.
According to the calculated similarities, the first similar vector data whose similarity is greater than the preset similarity threshold can be determined, wherein the preset similarity threshold may be set according to the specific application scenario and requirements. For example, the threshold may be set higher in scenarios requiring high accuracy of the search result, and lowered appropriately in scenarios requiring high recall.
The first similar vector data and the text data corresponding to it may be determined as the first search result. In addition to vector data, the vector database stores the corresponding text data in association with it, and the text data contains the specific information related to each vector. By returning the vectors together with the corresponding text data, more information can be provided for subsequent processing.
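The threshold-based retrieval described above can be sketched as follows, implementing the cosine similarity formula directly; the in-memory list of (vector, text) pairs standing in for the vector database is an assumption for illustration:

```python
import math

def cosine_similarity(a, b):
    """similarity(A, B) = (A · B) / (|A| × |B|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, database, threshold=0.75):
    """Return the text of every stored entry whose similarity to the query
    vector exceeds the preset threshold (the 'first retrieval result').
    `database` is a list of (vector, text) pairs."""
    return [text for vec, text in database
            if cosine_similarity(query_vec, vec) > threshold]
```

Raising `threshold` favours precision of the retrieval result; lowering it favours recall, matching the trade-off noted above.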
Step 103, integrating the first search result and the original demand text to obtain first integrated data, performing task decomposition on the first integrated data to obtain a plurality of subtasks, and performing content generation processing on the subtasks to obtain metadata information corresponding to each subtask so as to execute corresponding configuration based on the metadata information.
For the embodiment of the disclosure, the core logic arrangement module may be used to integrate the first search result and the original demand text to obtain first integrated data. The integration process can correlate and fuse the retrieved related knowledge information with the original demand text so as to more fully understand the user demand.
The first integrated data may be task-decomposed using a corresponding model in the natural language processing module, split into a plurality of subtasks, where each subtask corresponds to a particular functional module or component in the application development. For example, the requirements of the attendance management system are decomposed into subtasks such as 'attendance record input module development', 'leave examination and approval module development', 'attendance statistics report generation module development'.
The content generation processing can be performed on the plurality of sub-tasks obtained through decomposition by utilizing a corresponding model in the natural language processing module, so that metadata information corresponding to each sub-task is obtained, wherein the metadata information describes related attributes of the sub-tasks, such as form field definition, page layout, business logic and the like. For example, for "attendance record entry module," metadata information may include field definitions of attendance time, attendance place, employee name, etc., as well as layout design of the entry page, etc.
The low code basic module can be utilized to carry out corresponding configuration on the low code platform according to the generated metadata information, and related components and functions of the application are automatically generated to finish the development of the application.
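As a hedged sketch of what the content generation step might emit for the attendance example, the following hard-codes metadata a model would otherwise generate; every field name, layout value and logic string is an illustrative assumption, not part of the claimed method:

```python
def generate_metadata(subtask):
    """Rule-based stand-in for model-driven content generation: map a
    subtask name to the metadata the low-code platform consumes (form
    field definitions, page layout, business logic)."""
    catalog = {
        "attendance record entry": {
            "fields": ["employee_name", "attendance_time", "attendance_place"],
            "layout": "single-column form",
            "logic": "validate time, then insert a record",
        },
        "leave approval": {
            "fields": ["applicant", "leave_type", "start_date", "end_date"],
            "layout": "two-step wizard",
            "logic": "route request to manager for approval",
        },
    }
    return catalog.get(subtask)

def configure_platform(subtasks):
    """Collect metadata for every subtask; the low-code basic module would
    apply each entry to generate the corresponding components."""
    return {task: generate_metadata(task) for task in subtasks}
```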
For the embodiment of the present disclosure, performing content generation processing on a plurality of subtasks to obtain metadata information corresponding to each subtask may specifically include:
Vector conversion is carried out on the plurality of subtasks to obtain second vector data corresponding to the plurality of subtasks;
Retrieving the second vector data in the vector database to obtain a second retrieval result;
Integrating the second search result and the plurality of subtasks to obtain second integrated data;
And performing content generation processing on the second integrated data to obtain metadata information.
In particular, the embedding model in the natural language processing module may be utilized to perform vector conversion on the plurality of subtasks, converting the natural language description of each subtask into a vector form (i.e., second vector data) that the computer can process. For example, subtask 1 is "develop a commodity inventory management function, including inventory quantity display, inventory increase and decrease operations and inventory early warning", and subtask 2 is "implement a commodity sales statistics function and generate sales reports by day, week and month". The natural language processing module may use a pre-trained embedding model to convert the two subtasks into vector data respectively, obtaining the second vector data.
The RAG service module may retrieve in the vector database after receiving the second vector data. By calculating the similarity between vectors, the knowledge vector most similar to the input subtask vector is found. For sub-task 1, knowledge about the data structure design of inventory management, database transaction processing of inventory operation, algorithm of inventory pre-warning, etc. can be retrieved, for sub-task 2, knowledge about the statistical method of sales data, templates and specifications for report generation, etc. can be retrieved, and these knowledge vectors and corresponding text data are returned as a second retrieval result. The vector database stores a large amount of information such as business knowledge, technical documents, development cases and the like, and the information is also expressed in a vector form.
The core generation logic orchestration module may correlate and fuse the second search result with the plurality of subtasks to form a more comprehensive and richer data set. For example, the retrieved inventory management data structure design information is combined with the description of the subtask 1 to specify the fields (commodity name, inventory number, upper and lower limits of inventory, etc.) required for inventory number display, and the sales data statistics method is combined with the description of the subtask 2 to determine the dimensions (day, week, month) and indexes (sales amount, sales number, etc.) of the statistics report. And obtaining second integration data after integration.
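The fusion of a subtask with its retrieval result might be sketched as simple record assembly; the prompt-style layout of the record is an assumption, since real systems may use structured templates instead:

```python
def integrate(subtask_text, retrieval_results):
    """Correlate and fuse a subtask description with its retrieved knowledge
    snippets into one record (the 'second integrated data'), which the
    content generation step then consumes."""
    knowledge = "\n".join(f"- {snippet}" for snippet in retrieval_results)
    return f"Subtask: {subtask_text}\nRelevant knowledge:\n{knowledge}"
```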
The natural language processing module may perform content generation processing on the second integrated data. Metadata information corresponding to each subtask can be generated according to the integrated information. For sub-task 1, the generated metadata information may include layout of inventory quantity display pages, button design and logic of inventory increase and decrease operations, threshold setting of inventory pre-warning, etc., and for sub-task 2, the generated metadata information may include field definition of sales statistics report, time period setting of report generation, chart type of report presentation, etc.
For the embodiment of the present disclosure, retrieving the second vector data in the vector database to obtain the second retrieval result may specifically include:
Determining second similar vector data with the similarity degree larger than a preset similarity degree threshold value from a vector database;
And determining the text data corresponding to the second similar vector data as a second search result.
For the disclosed embodiments, in the vector database, for each vector data stored, the degree of similarity between it and the second vector data may be calculated. The similarity calculation may be performed by various methods, such as cosine similarity, euclidean distance, and the like.
The second similar vector data whose similarity is greater than the preset similarity threshold is determined according to the calculated similarities. The preset similarity threshold may be flexibly set according to the specific application scenario and requirements. Taking a threshold of 0.75 as an example: the vector database stores a plurality of vector data related to development of a client management system; if, after comparison, 5 vector data are found whose similarity with the second vector data is greater than 0.75, these 5 vector data are used as the second similar vector data.
The determined second similar vector data and its corresponding text data can be returned as the second search result, wherein the corresponding text data may comprise development case descriptions of existing client information management systems, such as system function module division, database design ideas and interface layout schemes. This information provides important reference and knowledge support for subsequent low-code development.
The generated metadata information can be applied to the low-code development platform to automatically generate the application components and functions corresponding to the subtasks. The generated application is then comprehensively tested, checking whether the functions meet the subtask requirements, whether the interface is user-friendly, whether the performance is stable, and the like. If a problem is found, the metadata information can be adjusted in time and the application components and functions regenerated until the requirements are met.
Through the implementation mode, the subtask metadata generation method can accurately and efficiently generate the metadata information corresponding to each subtask, provides a solid foundation for low-code application development, and improves development efficiency and quality.
For the embodiments of the present disclosure, database updates may be performed on a vector database using metadata information, which may specifically include:
The metadata information is subjected to chunking processing, and vector conversion is performed on the chunked metadata information to obtain third vector data corresponding to the chunked metadata information;
Metadata information and third vector data are added to the vector database to update the vector database.
For the disclosed embodiments, the metadata information is partitioned into a plurality of smaller, relatively independent chunks of metadata information. The basis for chunking may be determined according to the structure, content or business logic of the metadata information. For example, if the metadata information describes a complex system, chunking may follow the different modules or functions of the system; if the metadata information is a longer document, chunking may follow paragraphs or chapters. The chunking process helps to improve the efficiency and accuracy of the subsequent vector conversion, and also makes the metadata information easier to manage and maintain.
Vector conversion is then performed on the chunked metadata information to obtain the third vector data corresponding to it. The vector conversion may use an embedding model in the natural language processing module, such as Word2Vec, GloVe or BERT, to convert the metadata information in text form into high-dimensional vectors that capture its semantic features. For example, when using the BERT model, each word or sub-word in a chunk is mapped to a vector representation, and a vector representation of the entire chunk (i.e., the third vector data) is then obtained by a particular aggregation operation (e.g., average pooling or max pooling).
Metadata information and third vector data may be added to the vector database to update the vector database. In the adding process, the association relation between the metadata information and the third vector data can be established, so that the complete metadata information can be conveniently obtained in the subsequent retrieving and using processes. Meanwhile, the vector database can be index-optimized according to the requirement, so that the retrieval efficiency is improved.
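A minimal sketch of this update flow: chunking on blank lines, aggregating per-token vectors by average pooling as described above, and storing each vector in association with its text. The in-memory `VectorDB` class and the blank-line chunking rule are illustrative assumptions:

```python
def chunk_metadata(metadata_text):
    """Chunk a metadata document into paragraph-sized pieces (here: split on
    blank lines) before vector conversion."""
    return [p.strip() for p in metadata_text.split("\n\n") if p.strip()]

def mean_pool(token_vectors):
    """Aggregate per-token vectors into one chunk vector by average pooling,
    one of the aggregation operations mentioned for BERT-style embeddings."""
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / len(token_vectors)
            for i in range(dim)]

class VectorDB:
    """Minimal in-memory stand-in for the vector database: each entry keeps
    the chunk's vector and its text in association, so retrieval can return
    the complete metadata information later."""
    def __init__(self):
        self.entries = []

    def add(self, vector, text):
        self.entries.append({"id": len(self.entries),
                             "vector": vector,
                             "text": text})
```

A production vector database would additionally maintain an index (e.g., an approximate nearest-neighbour structure) for retrieval efficiency, as the text notes.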
In summary, compared with the prior art, the data processing method for a low-code development platform provided by the disclosure receives an original demand text of a user and performs text division on it to obtain a plurality of text blocks; performs vector conversion on the plurality of text blocks to obtain first vector data corresponding to them, and retrieves the first vector data in a vector database to obtain a first retrieval result; integrates the first retrieval result and the original demand text to obtain first integrated data, performs task decomposition on the first integrated data to obtain a plurality of subtasks, and performs content generation processing on the plurality of subtasks to obtain metadata information corresponding to each subtask, so as to execute the corresponding configuration based on the metadata information.
The scheme of this disclosure allows business personnel to enter requirements in natural language (i.e., the original demand text) without having to master programming or complex configuration syntax. The system automatically divides, converts and retrieves against the input text, then decomposes it into a plurality of subtasks and generates metadata information corresponding to each subtask, so that business personnel need not manually write code or configure logic, reducing the learning and practice difficulty. Through natural language processing, the system can directly understand the demand text of business personnel, avoiding translation errors between business requirements and technical implementation. The system retrieves knowledge related to the demand text in the vector database and integrates the retrieval results with the original demand, further ensuring accurate understanding of the business requirements and thereby improving the application depth and efficiency of the low-code platform in complex business scenarios.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises that element.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented in a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices. They may alternatively be implemented in program code executable by computing devices, so that they may be stored in a memory device for execution by computing devices, and in some cases the steps shown or described may be performed in a different order than that shown or described. They may also be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software. The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention; various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.